MCPServer


Revision as of 12:59, 6 October 2010



Design

This page proposes a new feature and reviews design options

Development

This page describes a feature that's in development

Documentation

This page documents an implemented feature

Overview

The MCP is the Archivematica micro-services tool that controls flow in the Archivematica system. It "knows" what needs to be done, who can do it, and what is currently being processed across the distributed system, and it is responsible for distributing the work. The user controls and monitors the MCP via the dashboard. The MCP maintains a log of all completed work.

The MCP uses a client-server architecture. The clients are relatively "dumb": they inform the server what tasks they can perform, then wait for the server to assign them a task.

The system relies on the client and server having access to the same directory in order to process the commands.

Server

The server is the core of the MCP. It uses a set of modules, created at run time based on configurations (mcpModulesConfigs). These configs specify a directory to watch, and a series of commands to execute on anything placed in the directory.
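Watching a directory could be sketched as a simple polling check. This is an illustration only: the text does not say how the MCP detects new folders, so polling with `os.listdir` is an assumption, and the name `new_entries` is hypothetical.

```python
import os


def new_entries(watch_dir, seen):
    """Return entries that appeared in watch_dir since the last call.

    `seen` is a set owned by the caller; it is updated in place so that
    each entry is reported only once.
    """
    current = set(os.listdir(watch_dir))
    fresh = sorted(current - seen)
    seen.update(current)
    return fresh
```

A server loop would call `new_entries` periodically for each module's watch directory and create a Job for every name returned.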

Once a folder is placed in a watched directory, a Job object is created. The Job has a UUID and is an instance of a given module. If the job requires user approval, it waits until it receives it; otherwise it proceeds. The first thing an approved job does is move the Event folder to the associated processing directory. The job has a number of steps; each step creates and executes the associated command from the config. If the job step is to execute on each file, it creates a task for each file that passes the specified filters. If the step is not to execute on each file, it creates a single task to execute on the directory.
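The expansion of a job step into tasks can be sketched roughly as follows. This is a minimal illustration, not the MCP's actual code: the names `Step`, `Task`, and `create_tasks` are hypothetical, and only a file-name suffix filter is shown, with suffix matching being an assumption about how the filter works.

```python
import os
import uuid
from dataclasses import dataclass, field


@dataclass
class Task:
    """One unit of work handed to a client (hypothetical structure)."""
    target: str
    uuid: str = field(default_factory=lambda: str(uuid.uuid4()))


@dataclass
class Step:
    """One job step, built from a command in the module config (hypothetical)."""
    execute_on_each_file: bool
    filter_file_end: str = ""  # assumed: keep files whose name ends with this


def create_tasks(step, job_dir):
    """Create one task per matching file, or a single task for the directory."""
    if not step.execute_on_each_file:
        return [Task(target=job_dir)]
    tasks = []
    for root, _dirs, files in os.walk(job_dir):
        for name in sorted(files):
            # An empty filter matches every file name.
            if name.endswith(step.filter_file_end):
                tasks.append(Task(target=os.path.join(root, name)))
    return tasks
```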

Failure is defined as one or more exec tasks or verification tasks returning a non-zero value.
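Expressed as code, the failure rule might look like the sketch below. The function name `job_failed` and the idea of passing the return codes as two lists are assumptions made for illustration.

```python
def job_failed(exec_return_codes, verification_return_codes):
    """Return True if any exec or verification task returned non-zero."""
    return any(code != 0 for code in exec_return_codes) or any(
        code != 0 for code in verification_return_codes
    )
```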

Each task, like a job, has a UUID. Tasks are assigned to clients to perform. The clients inform the server of the tasks they can perform and the max number of threads they can handle. When a client completes a task, it will inform the server.
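A rough sketch of how the server might pick a client for a task, given the capabilities and thread limits the clients report. The `Client` class and the `assign` and `complete` functions are hypothetical names, and the "first client with a free thread" policy is an assumption; the text does not specify how the server chooses among eligible clients.

```python
from dataclasses import dataclass


@dataclass
class Client:
    """What the server knows about a registered client (hypothetical)."""
    name: str
    supported: set      # commands the client reported it can perform
    max_threads: int    # maximum number of concurrent tasks
    running: int = 0    # tasks currently assigned


def assign(task_command, clients):
    """Return the first client that supports the command and has a free thread."""
    for client in clients:
        if task_command in client.supported and client.running < client.max_threads:
            client.running += 1
            return client
    return None  # no capacity; the task would have to wait (assumed behaviour)


def complete(client):
    """Called when a client reports a finished task; frees one thread slot."""
    client.running -= 1
```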

When the server is notified that a task is complete, it checks if that was the last task in the job step, and if it is, proceeds on to the next job step, until the job is done. When the job is done, the contents of the job processing folder are moved to their next location.
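The step-advancing logic can be illustrated with a small tracker. This is a simplified sketch under stated assumptions: the class `JobProgress` is hypothetical, and it takes the task UUIDs for every step up front, whereas the text suggests tasks are created as each step starts.

```python
class JobProgress:
    """Tracks outstanding tasks per job step (hypothetical helper)."""

    def __init__(self, tasks_per_step):
        # tasks_per_step: one collection of task UUIDs per job step, in order.
        self.pending = [set(step) for step in tasks_per_step]
        self.current = 0

    def done(self):
        return self.current >= len(self.pending)

    def task_completed(self, task_uuid):
        """Record a finished task; return True once the whole job is done."""
        if self.done():
            return True
        self.pending[self.current].discard(task_uuid)
        # Move past every step that has no outstanding tasks left.
        while not self.done() and not self.pending[self.current]:
            self.current += 1
        return self.done()
```

When `task_completed` returns `True`, the server would move the contents of the job processing folder to their next location.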

Within the processing folder, each folder is kept in its own sub folder named after the job UUID. This allows the Event folder to be renamed or otherwise manipulated.
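The per-job sub folder layout could be produced along these lines. The function `move_to_processing` is a hypothetical name, and the exact layout (`<processing dir>/<job UUID>/<folder name>`) is inferred from the description above rather than taken from the MCP source.

```python
import os
import shutil
import uuid


def move_to_processing(folder, processing_dir, job_uuid=None):
    """Move `folder` into processing_dir/<job UUID>/ and return its new path."""
    job_uuid = job_uuid or str(uuid.uuid4())
    job_dir = os.path.join(processing_dir, job_uuid)
    os.makedirs(job_dir)
    name = os.path.basename(os.path.normpath(folder))
    destination = os.path.join(job_dir, name)
    shutil.move(folder, destination)
    return job_uuid, destination
```

Because the job is located by its UUID directory, the folder inside it can be renamed without the server losing track of the job.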

Server Implementation

As stated above, the MCP watches directories specified in the modules, and has a set of commands to issue on items placed in those folders. Those commands will not be performed by the MCP server itself, but rather delegated to a client.

mcp Modules

The mcp Modules are XML based and contain a number of fields:

<module>

Wait for user approval before creating and assigning tasks.

<requiresUserApproval>Yes/No</requiresUserApproval>

Description to give the user when asking for approval of the Job (not implemented yet).

<descriptionForApproval></descriptionForApproval>
<notificationStarted></notificationStarted>
<notificationCompletedWithoutErrors></notificationCompletedWithoutErrors>
<notificationCompletedWithErrors></notificationCompletedWithErrors>


<directories>

Directory to watch for folders moved to.

 <watchDirectory>%watchDirectoryPath%appraiseSIP</watchDirectory>

Standard directory to move to while the clients are performing tasks. Defined in archivematica.conf.

 <processingDirectory>%processingDirectory%</processingDirectory>

The output directory is determined by the return value of the commands, and the corresponding config folder. Chaining output folders and watch directories allows for flow through the system.

 <successDirectory>%watchDirectoryPath%...</successDirectory>
 <failureDirectory>%watchDirectoryPath%failed/</failureDirectory>
</directories>


<commands> 

A command to execute on each file or folder, the details of which are described below in 'mcp Modules Command'. The exe command is the command to run on the Event.

<exeCommand> 

The verification command is used if the return value of the exeCommand is not reliable enough to determine the result of the command (e.g. a virus scan that exits zero and only logs that there is a virus).

<verificationCommand> 

Cleanup allows for some post processing before the Event folder is moved to the success or fail directory.

<cleanupSuccessfulCommand> 
<cleanupUnsuccessfulCommand>
</commands> 

</module>
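As an illustration, a module config of this shape could be read with Python's standard `xml.etree.ElementTree`. This is a sketch only: the function names `expand` and `load_module` are hypothetical, the sample XML is a trimmed-down module using only fields shown above, and the replacement values are made-up stand-ins for whatever the real configuration supplies.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample module, using only fields described above.
MODULE_XML = """
<module>
  <requiresUserApproval>Yes</requiresUserApproval>
  <directories>
    <watchDirectory>%watchDirectoryPath%appraiseSIP</watchDirectory>
    <processingDirectory>%processingDirectory%</processingDirectory>
    <failureDirectory>%watchDirectoryPath%failed/</failureDirectory>
  </directories>
</module>
"""

# Hypothetical values for illustration; real values come from configuration.
REPLACEMENTS = {
    "%watchDirectoryPath%": "/tmp/watched/",
    "%processingDirectory%": "/tmp/processing/",
}


def expand(text):
    """Substitute the %...% placeholders with configured paths."""
    for placeholder, value in REPLACEMENTS.items():
        text = text.replace(placeholder, value)
    return text


def load_module(xml_text):
    """Parse a module config into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    dirs = root.find("directories")
    approval = root.findtext("requiresUserApproval", "")
    return {
        "requires_approval": approval.strip().lower() == "yes",
        "watch": expand(dirs.findtext("watchDirectory", "")),
        "processing": expand(dirs.findtext("processingDirectory", "")),
        "failure": expand(dirs.findtext("failureDirectory", "")),
    }
```

Pointing one module's success directory at another module's watch directory is what chains modules together into a flow.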


mcp Modules Command

Each command consists of a number of parts.

<command>
<descriptionWhileExecuting> </descriptionWhileExecuting>

Set to Yes to skip this command and not execute it at all.

<skip>Yes/No</skip>

Filter what files/folder the command will operate on.

<filterFileEnd></filterFileEnd>
<filterFileStart></filterFileStart>
<filterSubDir></filterSubDir>

If the output of the command is all going to the same file, multiple threads may try to write to the same file simultaneously and cause a collision. Set this option to stop that occurrence.

<requiresOutputLock>No</requiresOutputLock>

Not used - placeholder

<standardIn></standardIn>

File to write standard out to.

<standardOut></standardOut>

File to write standard error to.

<standardError></standardError>
<failureNotification> </failureNotification>

Define the command the client is to execute. This must match an entry in the client's supported modules, defined in archivematicaClientConfig.

<execute> </execute>

The arguments to give the command.

<arguments> </arguments>

Does this command execute once on the SIP, or once for each file?

<executeOnEachFile>yes/no</executeOnEachFile>
</command>
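The three filter fields might be applied to candidate files as in the sketch below. The semantics are assumptions, since the page does not define them: `filterFileEnd` is treated as a file-name suffix, `filterFileStart` as a file-name prefix, and `filterSubDir` as a leading sub directory of the path relative to the folder being processed. The function name `matches_filters` is hypothetical.

```python
import os


def matches_filters(relative_path, file_end="", file_start="", sub_dir=""):
    """Return True if the path passes every non-empty filter.

    `relative_path` uses forward slashes and is relative to the folder
    being processed. Empty filters are ignored.
    """
    name = os.path.basename(relative_path)
    if file_end and not name.endswith(file_end):
        return False
    if file_start and not name.startswith(file_start):
        return False
    if sub_dir:
        prefix = sub_dir.rstrip("/") + "/"
        if not relative_path.startswith(prefix):
            return False
    return True
```

With `executeOnEachFile` set to yes, a task would be created only for files where this check returns `True`.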

Client

Client Requirements

Client Implementation