MCPServer
Overview
The MCP is the core of the Archivematica system. It controls the various micro-services in the system. Configuration and processing information are held in the database. The user monitors and controls the MCP via the dashboard. The MCP maintains a log of all completed work.
The MCP uses Gearman to hand out work. The MCP clients are relatively "dumb": they are Gearman worker implementations that tell the Gearman server which tasks they can perform and then wait for the server to assign them a task.
The Archivematica system relies on the client and server having access to the same directory in order to process commands. On a distributed system, this is done through the shared directory.
Basic configuration is described in MCP Basic Configuration (deprecated).
Server And Database
The MCP has watched directories, which are linked to Job Chains. Each Job Chain is designed to carry out a function. The function is broken down into manageable pieces, which are called Job Chain Links. Each of these links performs a task. As in previous versions of the MCP, these tasks may be configured to run once, or once for each file in a directory.
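The relationship between chains, links and tasks can be sketched roughly as follows. This is a minimal illustration in Python 3, not the MCP's actual code: the class names `ChainLink` and `JobChain`, the `per_file` flag and the `plan_tasks` helper are all hypothetical, and the real configuration is held in the database.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChainLink:
    """One step of a chain (hypothetical model, not the MCP schema)."""
    name: str
    per_file: bool = False  # False: run once; True: run once per file


@dataclass
class JobChain:
    """An ordered list of links that together carry out one function."""
    name: str
    links: List[ChainLink] = field(default_factory=list)


def plan_tasks(chain: JobChain, files: List[str]) -> List[str]:
    """Return the tasks a chain would create for the given file names."""
    tasks: List[str] = []
    for link in chain.links:
        if link.per_file:
            # One task for each file in the directory.
            tasks.extend("%s:%s" % (link.name, f) for f in files)
        else:
            # A single task for the directory as a whole.
            tasks.append(link.name)
    return tasks
```

For example, a chain with a once-only `verify` link followed by a per-file `scan` link, applied to a directory holding two files, produces three tasks: one `verify` and two `scan` tasks.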
One major change is that the MCP is no longer as linear as it once was. Decision points allow the user to select the next microservice chain to process, based on what is available at that point. This allows alternative, yet similar, workflows to co-exist in the Archivematica MCP system.
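A decision point can be pictured as a link that, instead of running a task, offers a set of next chains and waits for a choice. The sketch below is written in Python 3 and uses hypothetical names throughout (`DecisionPoint`, `available`, `choose`, and the chain names in the usage note); it does not reflect the actual database tables.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DecisionPoint:
    """A point where the user picks the next chain (hypothetical model)."""
    prompt: str
    choices: Dict[str, str]  # label shown to the user -> name of next chain

    def available(self) -> List[str]:
        """Labels the user may pick from at this point."""
        return sorted(self.choices)

    def choose(self, label: str) -> str:
        """Return the name of the chain to process next."""
        if label not in self.choices:
            raise ValueError("not available at this point: %r" % label)
        return self.choices[label]
```

Under these assumptions, a point built as `DecisionPoint("Create DIP?", {"Create DIP": "createDIP", "Skip DIP": "storeAIP"})` would let one workflow build a DIP and another skip it, while both share the links that come before the decision.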
Job Chains
Job Chain Links
Decision Point
Regular Job
Tasks
MCP Modules
The MCP modules are configured in the database, with the following schema.
This may be a little out of date. Note: it was generated using MySQL Workbench (sudo apt-get install mysql-workbench).
Client
Clients connect to the specified Gearman server and provide a list of the modules they support. When the MCP informs the Gearman server of a task that a client supports, and the Gearman server assigns the job to that client, the client processes the job and returns the results to the Gearman server, which in turn returns them to the MCP.
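The round trip described above can be modelled in a few lines. This is a toy, in-memory stand-in written in Python 3 for illustration only: it does not speak the Gearman protocol or use the python-gearman library, and `ToyJobServer`, `register` and `submit` are hypothetical names.

```python
from typing import Callable, Dict, List


class ToyJobServer:
    """In-memory stand-in for a job server (not the Gearman API)."""

    def __init__(self) -> None:
        # task name -> handlers registered by clients that support the task
        self._workers: Dict[str, List[Callable[[str], str]]] = {}
        # task name -> how many jobs have been assigned so far
        self._assigned: Dict[str, int] = {}

    def register(self, task: str, handler: Callable[[str], str]) -> None:
        """Called by a client to announce that it supports `task`."""
        self._workers.setdefault(task, []).append(handler)

    def submit(self, task: str, payload: str) -> str:
        """Called by the MCP: assign the job to a client, return its result."""
        handlers = self._workers.get(task)
        if not handlers:
            # A real job server would typically queue the job until a
            # worker appears; this toy model simply raises.
            raise LookupError("no client supports task %r" % task)
        # Rotate through the clients that support this task.
        count = self._assigned.get(task, 0)
        self._assigned[task] = count + 1
        return handlers[count % len(handlers)](payload)
```

The point of the model is that the MCP never addresses a client directly. It only names a task, and the server picks a client that registered for it.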
Client on Windows
There has been some consideration of getting an MCP client to run in the Microsoft Windows environment. This would be advantageous for normalizing in a Windows environment. Some testing has been done to this end. See issue 372.
Debugging
Debugging the MCP can be difficult. Logs can be large, and they are placed in the /tmp/ directory, so they are automatically removed on reboot.
Parsing Logs
Here are some commands to help parse logs:
grep "DEBUG type=\"archivematicaMCP\"" -v /tmp/archivematicaMCPServer* -h > /tmp/archivematicaOutput.txt
This removes the periodic debug messages and writes the remaining lines to /tmp/archivematicaOutput.txt.
grep "Traceback (most recent call last):" /tmp/archivematicaOutput.txt -n
grep -i EXCEPTION /tmp/archivematicaMCPServer-* -n
The -n option prepends the line number to each matching line.
sed -n '302092,+50'p /tmp/archivematicaMCPServer-*
This prints line 302092 and the 50 lines that follow it. It is useful for looking at the sections of the log that contain exceptions, which can be found with the commands above.
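The search and the context print can also be combined in one pass with a short script. The following is a sketch, assuming Python 3 and a plain-text log; the function name `traceback_contexts` and the default of 50 lines of context are illustrative choices, not part of Archivematica.

```python
from typing import Iterable, List, Tuple

MARKER = "Traceback (most recent call last):"


def traceback_contexts(
    lines: Iterable[str], context: int = 50
) -> List[Tuple[int, List[str]]]:
    """Return (line_number, lines) for each traceback marker found.

    line_number is 1-based, as with grep -n. The returned lines start at
    the marker and include up to `context` lines after it.
    """
    all_lines = [line.rstrip("\n") for line in lines]
    found = []
    for index, line in enumerate(all_lines):
        if MARKER in line:
            found.append((index + 1, all_lines[index:index + context + 1]))
    return found
```

To use it, open a log file and pass the file object as `lines`, then print each returned line number and its block of context. Note that it reads the whole log into memory, which may matter for very large logs.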
Debugging tools
In extreme cases, you can set up your dev environment so that you log in as the archivematica user and use Eclipse with PyDev in debug mode to run the MCP.
What clients are connected
python -c '
import gearman
admin = gearman.admin_client.GearmanAdminClient(host_list=["127.0.0.1"])
for client in admin.get_workers():
    if client["client_id"] != "-":  # exclude server task connections
        print client["client_id"], client["ip"]
for stat in admin.get_status():
    if stat["running"] != 0 or stat["queued"] != 0:
        print stat
'
Watching activity
tail /tmp/archivematicaMCP* -f
watch mysql -u root MCP --execute "\"SELECT * FROM Tasks WHERE endTime = 0;\""
Turning on printing of all SQL queries
sudo nano /usr/lib/archivematica/archivematicaCommon/databaseInterface.py
- http://code.google.com/p/archivematica/source/browse/tags/release-0.8-alpha/src/archivematicaCommon/lib/databaseInterface.py
- edit lines 34 and 73
- "printSQL = False" -> "printSQL = True"
- " print printSQL" -> " print sql"
This will cause Archivematica to print all of the queries it issues to the database.
Change Log
0.8
- Switched to database configuration.
- Allows for alternative workflows (i.e. don't create DIP)
- On start, the MCP server will try to match any existing directories in the watched directories to a processing directory/SIP.
0.7.1
- Work was done on microservices to make the system more stable.
- A config to set the underlying protocol max length was added.
0.7
- Work was done on microservices to make the system more stable.
0.6.2
- MCP was released.