Difference between revisions of "MCPServer"

From Archivematica
(Added schema to display structure of configurations.)
(Starting work on describing the new MCP)
Line 22: Line 22:
  
 
==Overview==
 
The MCP is the Archivematica [[micro-services]] tool to control flow in the Archivematica system. It "knows" what things need to be done, who can do them, what is currently being processed across the distributed system, and is responsible for distributing the work. The user controls and monitors the MCP via the [[ dashboard ]]. The MCP maintains a log of all completed work.
+
The MCP is the core of the Archivematica system. It controls the various [[micro-services]] in the Archivematica system. Configuration and processing information are held in the database. The user monitors and controls the MCP via the [[ dashboard ]]. The MCP maintains a log of all completed work.
  
The MCP is a client server based architecture. The clients are relatively "dumb". They inform the server what tasks they can perform, and wait for the server to assign them a task.
+
The MCP uses [http://gearman.org gearman] to distribute work. The MCP clients are relatively "dumb": they are gearman worker implementations that inform the gearman server which tasks they can perform and wait for the server to assign them a task.
  
The system relies on client and server having access to the same directory, to process the commands.
+
Archivematica relies on the client and server having access to the same directory in order to process commands. On a distributed system, this is provided by the shared directory.
  
Basic configuration can be seen here [[MCP Basic Configuration]]
+
Basic configuration is described in [[MCP Basic Configuration]] (deprecated).
  
==Server==
 
The server is the core of the MCP. It uses a set of modules, created at run time based on configurations (mcpModulesConfigs). These configs specify a directory to watch, and a series of commands to execute on anything placed in the directory.
 
  
Once a folder is placed in a watched directory, a Job object is created. The Job has a UUID, and is an instance of a given module. If the job requires user approval, it waits until it receives it; otherwise it proceeds. The first thing an approved job will do is move the Event folder to the associated processing directory. The job has a number of steps, and each step creates and executes the associated command from the config. If the job step is to execute on each file, it creates a task for each file, filtered by the specified filters. If the step is not to execute on each file, it creates a single task to execute on the directory.
 
  
Failure is defined as one or more exec Tasks or verification Tasks returned non zero.
+
==Server And Database==
 +
The MCP has watched directories, which are linked to Job Chains. Each Job Chain is designed to carry out a function. The function is broken down into manageable pieces, which are called Job Chain Links. Each of these links performs a task. As in previous versions of the MCP, these tasks may be configured to run once, or once for each file in a directory.
  
Each task, like a job, has a UUID. Tasks are assigned to clients to perform.  
+
One fundamental change is that the MCP is no longer as linear as it once was. Decision points allow the user to select the next micro-service chain to process, based on what is available at that point. This allows alternative, yet similar, workflows to co-exist in the Archivematica MCP system.
The clients inform the server of the tasks they can perform and the max number of threads they can handle.
 
When a client completes a task, it will inform the server.
 
  
When the server is notified that a task is complete, it checks if that was the last task in the job step, and if it is, proceeds on to the next job step, until the job is done.
+
===Job Chains===
When the job is done, the contents of the job processing folder are moved to their next location.
 
  
Each folder is separated in the processing folder by a sub folder named the job UUID.
+
===Job Chain Links===
This allows for the Event folder to be renamed or otherwise manipulated.
+
====Decision Point====
 +
====Regular Job====
  
===Server Implementation===
+
===Tasks===
As stated above, the MCP watches directories specified in the modules, and has a set of commands to issue on items placed in those folders. Those commands will not be performed by the MCP server itself, but rather delegated to a client.
 
  
 
===mcp Modules===
 
Line 55: Line 50:
  
 
[[File:MCP_configuration_database_schema.png]]
 
 +
  
 
==Client==
 
Clients connect to the MCP and provide a list of modules they support. The MCP can then assign tasks, of supported types, to that client. The client reports the outcome of the task back to the MCP, once the task is completed.
+
Clients connect to the specified gearman server and provide a list of modules they support. When the MCP informs the gearman server of a Task that the client supports, and the gearman server assigns the job to the client, the client processes the Job and returns the results to the gearman server, which in turn returns them to the MCP.
  
 
===Client on Windows===
 
 
There has been some consideration of getting an MCP client to run in the Microsoft Windows environment. This would be advantageous for [[normalizing in a windows environment|normalizing in a Windows environment]]. Some testing has been done to this end. See issue 372.
 
 +
 +
=Change Log=
 +
==0.8==
 +
* Switched to database configuration.
 +
* Allows for alternative workflows (e.g. don't create a DIP).
 +
* On start, the MCP server will try to match any existing directories in the watched directories to a processing directory/SIP.
 +
 +
==0.7.1==
 +
* Work was done on microservices to make the system more stable.
 +
* A config to set the underlying protocol max length was added.
 +
 +
==0.7==
 +
* Work was done on microservices to make the system more stable.
 +
 +
==0.6.2==
 +
* MCP was released.
 +
  
 
[[Category:Development documentation]]
 

Revision as of 13:07, 1 September 2011




Overview

The MCP is the core of the Archivematica system. It controls the various micro-services in the Archivematica system. Configuration and processing information are held in the database. The user monitors and controls the MCP via the dashboard. The MCP maintains a log of all completed work.

The MCP uses gearman to distribute work. The MCP clients are relatively "dumb": they are gearman worker implementations that inform the gearman server which tasks they can perform and wait for the server to assign them a task.

Archivematica relies on the client and server having access to the same directory in order to process commands. On a distributed system, this is provided by the shared directory.

Basic configuration is described in MCP Basic Configuration (deprecated).


Server And Database

The MCP has watched directories, which are linked to Job Chains. Each Job Chain is designed to carry out a function. The function is broken down into manageable pieces, which are called Job Chain Links. Each of these links performs a task. As in previous versions of the MCP, these tasks may be configured to run once, or once for each file in a directory.
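The "run once, or once for each file" choice can be illustrated with a small sketch. This is not the MCP's actual code: the `Task` structure, the `expand_link` function, the per-task UUID and the command names are all assumptions made up for illustration.

```python
import os
import tempfile
import uuid
from dataclasses import dataclass


@dataclass
class Task:
    """One unit of work to hand to a client (hypothetical structure)."""
    task_uuid: str
    command: str
    target: str


def expand_link(command, directory, per_file):
    """Return the tasks for one Job Chain Link.

    per_file=False: a single task whose target is the directory itself.
    per_file=True: one task per regular file found under the directory.
    """
    if not per_file:
        return [Task(str(uuid.uuid4()), command, directory)]
    tasks = []
    for root, dirs, files in os.walk(directory):
        dirs.sort()  # visit sub-directories in a stable order
        for name in sorted(files):
            tasks.append(Task(str(uuid.uuid4()), command, os.path.join(root, name)))
    return tasks


def demo():
    """Build a throwaway directory with two files and expand two links."""
    with tempfile.TemporaryDirectory() as sip:
        for name in ("a.txt", "b.txt"):
            with open(os.path.join(sip, name), "w") as handle:
                handle.write("example")
        # Command names below are invented placeholders.
        once = expand_link("verifyChecksums", sip, per_file=False)
        each = expand_link("identifyFormat", sip, per_file=True)
        return len(once), [os.path.basename(task.target) for task in each]
```

Calling `demo()` yields one task for the directory-level link and one task per file for the per-file link.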

One fundamental change is that the MCP is no longer as linear as it once was. Decision points allow the user to select the next micro-service chain to process, based on what is available at that point. This allows alternative, yet similar, workflows to co-exist in the Archivematica MCP system.
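A minimal sketch of how a decision point might branch between chains follows. The chain names, the `Decision` class and the `run_chain` function are hypothetical and do not reflect the real database schema; the `choose` callback stands in for the selection the user makes in the dashboard.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """A decision point: the user picks which chain runs next."""
    prompt: str
    options: dict  # label shown to the user -> name of the next chain


# Hypothetical chains; every name here is invented for illustration.
CHAINS = {
    "ingest": [
        "verify",
        "normalize",
        Decision("Create DIP?", {"yes": "with_dip", "no": "aip_only"}),
    ],
    "with_dip": ["build_dip", "store_aip"],
    "aip_only": ["store_aip"],
}


def run_chain(chains, start, choose):
    """Walk chains from `start` and return the regular jobs in run order.

    `choose(prompt, labels)` must return one of the offered labels.
    """
    executed = []
    current = start
    while current is not None:
        following = None
        for link in chains[current]:
            if isinstance(link, Decision):
                label = choose(link.prompt, sorted(link.options))
                following = link.options[label]
                break
            executed.append(link)
        current = following
    return executed
```

Answering "no" at the decision point runs `verify`, `normalize` and `store_aip`; answering "yes" inserts `build_dip` before `store_aip`. Both workflows share the same opening links, which is the sense in which they are alternative yet similar.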

Job Chains

Job Chain Links

Decision Point

Regular Job

Tasks

mcp Modules

The mcp Modules are configured in the database, with the following schema.


[Figure: MCP configuration database schema]


Client

Clients connect to the specified gearman server and provide a list of modules they support. When the MCP informs the gearman server of a Task that the client supports, and the gearman server assigns the job to the client, the client processes the Job and returns the results to the gearman server, which in turn returns them to the MCP.
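The round trip described above can be modelled in memory as a rough sketch. This does not use a gearman library or the gearman protocol: `ToyJobServer`, `make_client` and the module names are invented stand-ins, and a real job server would also handle queuing, multiple workers and network transport.

```python
from collections import defaultdict


class ToyJobServer:
    """In-memory stand-in for a gearman job server (not the real protocol)."""

    def __init__(self):
        self._workers = defaultdict(list)  # task name -> registered handlers

    def register(self, task_name, handler):
        """A worker announces that it can perform `task_name`."""
        self._workers[task_name].append(handler)

    def submit(self, task_name, payload):
        """The MCP submits a job; the first capable worker runs it.

        The handler's return value is passed straight back to the caller,
        mirroring results flowing from client to server to MCP.
        """
        handlers = self._workers.get(task_name)
        if not handlers:
            raise LookupError("no worker supports %r" % task_name)
        return handlers[0](payload)


def make_client(server, modules):
    """Register each supported module (name -> function) with the server."""
    for name, function in modules.items():
        server.register(name, function)


server = ToyJobServer()
# Module names are placeholders for whatever a real client supports.
make_client(server, {"reverse": lambda data: data[::-1], "upper": str.upper})
result = server.submit("upper", "sip-001")
```

Here the client only declares what it can do; the decision about which job it receives stays with the server, which is what keeps the clients "dumb".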

Client on Windows

There has been some consideration of getting an MCP client to run in the Microsoft Windows environment. This would be advantageous for normalizing in a Windows environment. Some testing has been done to this end. See issue 372.

Change Log

0.8

  • Switched to database configuration.
  • Allows for alternative workflows (e.g. don't create a DIP).
  • On start, the MCP server will try to match any existing directories in the watched directories to a processing directory/SIP.

0.7.1

  • Work was done on microservices to make the system more stable.
  • A config to set the underlying protocol max length was added.

0.7

  • Work was done on microservices to make the system more stable.

0.6.2

  • MCP was released.