System Architecture

From Archivematica
Revision as of 13:08, 24 September 2012 by Joseph (talk | contribs) (This is s dump from my BCIT practicum proposal. Edited to remove BCIT related stuff.)


System Diagram

Shared Directory

To make the files being processed accessible to all MCP Client machines, a shared directory is required. Future revisions of the Archivematica project may use a distributed file system to accomplish this.

MCP Client

The MCP client will read its configs, and inform the server through the protocol which task types it will support (based on information held in the configuration files). The server can then assign tasks of supported types to that client. The client will run the task as a sub-process, identified as a “Running Task” in the diagram above. The client will inform the server of the exit code and output (standard out and standard error) of the process.
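The run-and-report step described above can be sketched in Python as follows. This is only an illustration, not the project's actual client code: the `TaskResult` structure and the `run_task` function are hypothetical names introduced here, and the sketch leaves out how the result is sent back over the MCP Protocol.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class TaskResult:
    """What the client would report to the server (hypothetical structure)."""
    exit_code: int
    stdout: str
    stderr: str


def run_task(command: list[str]) -> TaskResult:
    """Run one task as a sub-process and capture its exit code and output."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return TaskResult(completed.returncode, completed.stdout, completed.stderr)


# Example: run a trivial task using the current Python interpreter.
result = run_task([sys.executable, "-c", "print('task finished')"])
print(result.exit_code, result.stdout.strip())
```

Capturing standard out and standard error separately keeps the two streams distinct, which matches the description of both being reported to the server.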

MCP Client Configs

These configs specify where to connect to the server, the maximum number of tasks/processes the client can run simultaneously, and which types of tasks the client can run.
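As an illustration of the three kinds of settings listed above, the following Python sketch parses an INI-style client config. The file layout, the section name, every key name, the port number, and the task type names are assumptions made for this example; the real config format may differ.

```python
import configparser

# Hypothetical config text: section, keys, and values are illustrative only.
EXAMPLE_CONFIG = """
[MCPClient]
server_host = localhost
server_port = 9000
max_simultaneous_tasks = 4
supported_task_types = checksum, characterize, normalize
"""


def load_client_config(text):
    """Parse the client config into the values the client needs at start-up."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["MCPClient"]
    task_types = [t.strip() for t in section["supported_task_types"].split(",")]
    return {
        "server_address": (section["server_host"], section.getint("server_port")),
        "max_tasks": section.getint("max_simultaneous_tasks"),
        "task_types": [t for t in task_types if t],
    }


config = load_client_config(EXAMPLE_CONFIG)
print(config)
```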

Running Task

A running task is a process launched by the client. It takes in information via arguments, like invoking a command on the command line. It can directly interact with files via the shared directory. Tasks will also connect to the database to store generated metadata during processing.

MCP Protocol

This is how the MCP Client and Server communicate with one another. Clients inform the server of the types of tasks they support, and the server assigns jobs through this mechanism.
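The exchange described above (clients announce their supported task types, the server assigns work accordingly) can be modelled in memory as below. This sketch says nothing about the wire format, which is not specified here; `Dispatcher`, `ClientRecord`, and the method names are hypothetical, and the capacity check assumes the server also knows each client's maximum number of simultaneous tasks.

```python
from dataclasses import dataclass


@dataclass
class ClientRecord:
    """Server-side view of one registered client (hypothetical structure)."""
    name: str
    task_types: set
    max_tasks: int
    running: int = 0


class Dispatcher:
    """Tracks registered clients and picks one for each new task."""

    def __init__(self):
        self.clients = {}

    def register(self, name, task_types, max_tasks):
        """Record the task types a client has announced it supports."""
        self.clients[name] = ClientRecord(name, set(task_types), max_tasks)

    def assign(self, task_type):
        """Return the name of a client with spare capacity, or None."""
        for client in self.clients.values():
            if task_type in client.task_types and client.running < client.max_tasks:
                client.running += 1
                return client.name
        return None

    def task_finished(self, name):
        """Free one slot on the client once it reports a result."""
        self.clients[name].running -= 1


dispatcher = Dispatcher()
dispatcher.register("client-a", ["checksum"], max_tasks=1)
print(dispatcher.assign("checksum"))
```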

MCP Server Configs

The server configs contain information about which directories to watch, which task types to create when watched events are triggered, what arguments to give those tasks, what template to create the tasks from (i.e. one task for each file in the SIP/Transfer, or one single task), and what to do when a task completes successfully or unsuccessfully.
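The two task templates mentioned above (one task per file, or one single task) can be sketched as a small Python function. The function name, the dictionary layout of a task, and the `%path%` placeholder are assumptions introduced for this example and are not taken from the actual config format.

```python
from pathlib import Path


def create_tasks(sip_dir, task_type, arguments, per_file):
    """Expand one config entry into tasks for a SIP/Transfer directory.

    `arguments` is a string containing the hypothetical placeholder %path%.
    """
    sip = Path(sip_dir)
    if not per_file:
        # Single-task template: one task for the whole SIP/Transfer.
        targets = [sip]
    else:
        # Per-file template: one task for every file found below the directory.
        targets = [p for p in sorted(sip.rglob("*")) if p.is_file()]
    return [
        {"type": task_type, "arguments": arguments.replace("%path%", str(target))}
        for target in targets
    ]
```

Called with `per_file=True`, the function yields as many tasks as there are files; with `per_file=False` it always yields exactly one.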

MCP Server

The MCP server reads and reacts to its configurations: it watches directories for the appropriate events, creates the proper tasks, logs the task output, and requests user intervention when required via the MCP RPC Protocol.

MCP Database

The functions performed by the MCP database have expanded from simple logging and reporting to holding configurations and live information on the items being processed.

Dashboard

The dashboard is the web-based GUI for the system. There, the user can approve jobs, view the output of tasks, and get a select number of reports. The dashboard gets the information it displays mainly from the database, but also directly from the MCP through the MCP RPC Protocol.


MCP RPC Protocol

The MCP remote procedure call protocol is used by user-facing clients to control workflow in the system. Only the Dashboard and a command line client have been developed, but in principle other clients could be developed. The protocol is used to retrieve the workflow decisions that require user intervention and to apply the choices the user makes. These decisions include choices such as whether to transcode files for preservation and access, for preservation only, or for access only.
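The two operations described above (listing decisions that await the user, then applying the user's choice) can be modelled as below. This is an in-memory sketch only: it does not show the RPC transport, and the class name, method names, and job identifier are hypothetical.

```python
class PendingDecisions:
    """Holds workflow decisions that wait for a user's choice."""

    def __init__(self):
        self._pending = {}  # job id -> list of allowed choices

    def ask(self, job_id, choices):
        """Server side: record that a job needs user intervention."""
        self._pending[job_id] = list(choices)

    def list_pending(self):
        """Client side: fetch the decisions awaiting the user."""
        return {job_id: list(choices) for job_id, choices in self._pending.items()}

    def resolve(self, job_id, choice):
        """Client side: apply the user's selection and clear the decision."""
        if choice not in self._pending[job_id]:
            raise ValueError(f"{choice!r} is not a valid choice for job {job_id!r}")
        del self._pending[job_id]
        return choice


decisions = PendingDecisions()
decisions.ask("job-1", ["preservation and access", "preservation only", "access only"])
print(decisions.list_pending())
```

A Dashboard or command line client would call the equivalents of `list_pending` and `resolve`, while the server would call `ask` when a workflow step needs approval.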