Creating Custom Workflows 0.8 alpha

From Archivematica

Read First

This page describes how to edit workflows in the Archivematica system.

Overview

The MCP operates on a set of 'MicroService Chains' defined in the MCP database. Each chain has a starting link, and each link has a default next link. The MCP continues to process along a chain until it reaches a next chain link of NULL. Note that chains can have branches, which process a different set of commands.
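
The traversal described above can be sketched in a few lines of Python. This is only an illustration, not the MCP's own code: the dictionary stands in for the MicroServiceChainLinks table, the pks are made up, and branching is ignored so that each link has exactly one successor.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory stand-in for MicroServiceChainLinks:
# link pk -> pk of the next link, or None (SQL NULL) at the end of the chain.
NEXT_LINK: Dict[int, Optional[int]] = {10: 11, 11: 12, 12: None}


def walk_chain(starting_link: int, next_link: Dict[int, Optional[int]]) -> List[int]:
    """Return the link pks visited, in order, beginning at starting_link."""
    visited: List[int] = []
    current: Optional[int] = starting_link
    while current is not None:
        # Guard added for this sketch only: refuse to follow a chain that loops.
        if current in visited:
            raise ValueError(f"chain loops back to link {current}")
        visited.append(current)
        current = next_link[current]
    return visited
```

Calling walk_chain(10, NEXT_LINK) visits links 10, 11 and 12 and stops when the next link is None.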

MicroServiceChainLinks Fields

  1. currentTask
    FK TasksConfigs pk
    The task that operates at this stage in the chain.
  2. defaultNextChainLink
    If the exit code of this job is not defined in MicroServiceChainLinksExitCodes, go to this chain link next.
    The remaining fields are only used for advanced or not yet implemented features.
  3. defaultPlaySound
  4. microserviceGroup
  5. reloadFileList
  6. defaultExitMessage

TasksConfigs Fields

  1. taskType
    The task type is highly important.
    select * from TaskTypes;
    +----+---------------------------------+
    | pk | description                     |
    +----+---------------------------------+
    |  0 | one instance                    |
    |  1 | for each file                   |
    |  2 | get user choice to proceed with |
    |  3 | assign magic link               |
    |  4 | goto magic link                 |
    +----+---------------------------------+
  2. taskTypePKReference
    Used in combination with the taskType. The type allows the MCP code to map to a table, and the pkReference identifies which entry in that table to look at.
  3. description
    A text description that appears in the dashboard.
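
As a rough illustration of how taskType and taskTypePKReference work together, the sketch below maps a task type to a table name and returns the pair to look up. The TaskTypes rows are copied from the listing above. The table mapping is an assumption: it only covers types 0 and 1, which the patches further down this page point at StandardTasksConfigs, and it deliberately refuses to guess for the other types.

```python
from typing import Dict, Tuple

# Rows of TaskTypes, copied from the listing on this page.
TASK_TYPES: Dict[int, str] = {
    0: "one instance",
    1: "for each file",
    2: "get user choice to proceed with",
    3: "assign magic link",
    4: "goto magic link",
}

# Assumption for this sketch: only the mapping visible in the patches on this
# page is listed. The tables behind the other task types are not guessed at.
TABLE_FOR_TASK_TYPE: Dict[int, str] = {
    0: "StandardTasksConfigs",
    1: "StandardTasksConfigs",
}


def resolve_task(task_type: int, pk_reference: int) -> Tuple[str, int]:
    """Return (table name, pk) describing where the task's settings live."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"unknown taskType {task_type}")
    table = TABLE_FOR_TASK_TYPE.get(task_type)
    if table is None:
        raise NotImplementedError(
            f"taskType {task_type} ({TASK_TYPES[task_type]}) is not mapped in this sketch"
        )
    return table, pk_reference
```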

StandardTasksConfigs Fields

  1. File search filters
    filterFileEnd - useful for looking for extensions.
    filterFileStart
    filterSubDir - sub directory to operate on within the unit location
  2. requiresOutputLock
    Boolean. Used when logging to files: set it when a number of tasks write to the same file.
    This has more historical significance for Archivematica than future use.
  3. standardOutputFile
  4. standardErrorFile
  5. execute
    Linked to archivematicaClientModules.
    The client maps it to its executable, and runs it as though on the command line with the arguments given below.
    The client can map these to anything callable at the command line.
    Whatever is called must return without human intervention, or the system will hang!
  6. arguments
    Arguments given to the executable.
    Some variables are replaced when the task is created. See the replacement dictionaries in getReplacementDic().
    Note: transfers use the word SIP instead of transfer, to simplify workflow migration from previous revisions of Archivematica.
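
The file search filters and the argument replacement can be pictured with the following sketch. It is not the MCP's implementation: the matching rules (suffix, prefix, and "anywhere below the sub directory") are assumptions, and the replacement dictionary is passed in by hand rather than coming from getReplacementDic(). The variable names %SIPDirectory% and %SIPUUID% are taken from the patches on this page.

```python
from typing import Dict, Optional


def matches_filters(
    relative_path: str,
    filter_file_end: Optional[str] = None,
    filter_file_start: Optional[str] = None,
    filter_sub_dir: Optional[str] = None,
) -> bool:
    """Decide whether a file (path relative to the unit, '/' separated) passes the filters.

    A filter of None means "do not filter on this", mirroring NULL in the table.
    """
    directory, _, name = relative_path.rpartition("/")
    if filter_sub_dir is not None:
        sub = filter_sub_dir.strip("/")
        # Assumption: files in nested directories below the sub directory also match.
        if directory != sub and not directory.startswith(sub + "/"):
            return False
    if filter_file_start is not None and not name.startswith(filter_file_start):
        return False
    if filter_file_end is not None and not name.endswith(filter_file_end):
        return False
    return True


def substitute(arguments: str, replacements: Dict[str, str]) -> str:
    """Replace each %variable% in the arguments string with its value."""
    for variable, value in replacements.items():
        arguments = arguments.replace(variable, value)
    return arguments
```

With filterFileEnd '.bmp' and filterSubDir 'objects', as in the example patch, objects/a/pic.bmp passes the filters while metadata/pic.bmp and objects/pic.jpg do not.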

Workflow decision tools

This section defines the tools available to select the chain or next chain link to process.

Watched Directories

Watched directories are monitored for directories or files placed in them. When one is placed in a watched directory, the corresponding MicroService Chain is started.

WatchedDirectories Fields

  1. watchedDirectoryPath
    The path to the directory. Starts with variable '%watchDirectoryPath%', which is replaced by the MCP with the location of the watched directories.
  2. chain
    The pk of the MicroServiceChains entry to start processing.
  3. onlyActOnDirectories
    Always true for Archivematica 0.8.
    Reserved for future or expanded use of the MCP, to allow watching of individual files.
  4. expectedType
    Tells the MCP what type of unit to expect.
    There are two main unit types: SIPs and Transfers.
    The MCP will try to match the directory to an existing unit, or create a unit to represent the directory.

Restrictions

  • A watched directory cannot contain another watched directory.
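
Before adding a new watched directory it may be worth checking the restriction above against the paths already configured. The helper below is a hypothetical check, not something shipped with Archivematica; it assumes the paths have already had '%watchDirectoryPath%' expanded to absolute POSIX paths.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Tuple


def find_nested(watched: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (outer, inner) pairs where one watched directory lies inside another."""
    paths = [PurePosixPath(p) for p in watched]
    nested: List[Tuple[str, str]] = []
    for outer in paths:
        for inner in paths:
            # inner.parents lists every ancestor directory of inner.
            if inner != outer and outer in inner.parents:
                nested.append((str(outer), str(inner)))
    return nested
```

An empty result means no watched directory contains another one.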

MicroServiceChainLinksExitCodes

This mechanism provides an alternative to the default next chain link defined in the MicroServiceChainLink. The default next link usually handles the error condition, and exit code 0 is then defined to go to the next link in the chain.

There are special circumstances where you may want the exit code to change the direction of processing. See the Archivematica 0.8 release's use of exit codes 179 and 0 in the checkForAccessDirectory microservice for an example.
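
The lookup described here can be sketched as follows. The dictionaries are stand-ins for MicroServiceChainLinksExitCodes and for the defaultNextChainLink column, and all pks are invented; exit code 179 is used only as an example of a second, non-zero code that selects a different branch.

```python
from typing import Dict, Optional, Tuple

# (microServiceChainLink, exitCode) -> nextMicroServiceChainLink
ExitCodeTable = Dict[Tuple[int, int], Optional[int]]


def next_link(
    link: int,
    exit_code: int,
    exit_codes: ExitCodeTable,
    default_next: Dict[int, Optional[int]],
) -> Optional[int]:
    """Pick the next link: an exit-code entry wins, otherwise the link's default."""
    key = (link, exit_code)
    # Test membership rather than the value: an entry may map to None (NULL),
    # which means "the chain ends here", not "no entry defined".
    if key in exit_codes:
        return exit_codes[key]
    return default_next.get(link)


# Hypothetical data: link 20 continues to 21 on success, branches to 30 on
# exit code 179, and falls back to link 99 for any other exit code.
EXIT_CODES: ExitCodeTable = {(20, 0): 21, (20, 179): 30, (21, 0): None}
DEFAULT_NEXT: Dict[int, Optional[int]] = {20: 99, 21: 99}
```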

Choices

Before the 0.8 release of Archivematica, the MCP could only approve a microservice. In 0.8, choices have become their own steps/microservices.

Choices link to a microservice chain, and are defined in MicroServiceChainChoice. The concept is that the user selects a path to follow, and the paths are defined in the microservice chains.
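
A small sketch of that idea: given the choice rows for a link, list the chains a user may pick, then return the starting link of the chain they picked. The column names follow MicroServiceChainChoice and MicroServiceChains as used in the patches on this page; the pks and the 'Reject' description are invented for illustration.

```python
from typing import Dict, List, Tuple

# (choiceAvailableAtLink, chainAvailable) rows, as in MicroServiceChainChoice.
CHOICES: List[Tuple[int, int]] = [(50, 1), (50, 2), (50, 3)]

# chain pk -> (startingLink, description), as in MicroServiceChains.
CHAINS: Dict[int, Tuple[int, str]] = {
    1: (100, "Continue processing normally"),
    2: (200, "Remove .bmp files"),
    3: (300, "Reject"),
}


def chains_available_at(link_pk: int, choices: List[Tuple[int, int]]) -> List[int]:
    """Return the chain pks offered to the user at the given link."""
    return [chain for link, chain in choices if link == link_pk]


def starting_link_for(
    link_pk: int,
    chosen_chain: int,
    choices: List[Tuple[int, int]],
    chains: Dict[int, Tuple[int, str]],
) -> int:
    """Return the link where processing resumes once the user has chosen a chain."""
    if chosen_chain not in chains_available_at(link_pk, choices):
        raise ValueError(f"chain {chosen_chain} is not offered at link {link_pk}")
    return chains[chosen_chain][0]
```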

Magic Chain Links

Magic chain links get the next chain link from the unit the job is operating on. I think they have the potential to be very useful when generating unit tests. In Archivematica there are two key task types: (3, 'assign magic link') and (4, 'goto magic link').

A key advantage of magic links is that they allow two or more workflows to share the same watched directory. The items within that watched directory have a flag set that says which link they should go to next.
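
One way to picture the pair of task types is shown below. The Unit class and both functions are hypothetical; how the MCP actually stores the flag on a unit is not described on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Unit:
    """Hypothetical stand-in for a transfer or SIP tracked by the MCP."""
    uuid: str
    magic_link: Optional[int] = None


def assign_magic_link(unit: Unit, link_pk: int) -> None:
    """Task type 3: remember on the unit which link it should resume at."""
    unit.magic_link = link_pk


def goto_magic_link(unit: Unit) -> int:
    """Task type 4: read the next link back from the unit."""
    if unit.magic_link is None:
        raise LookupError(f"unit {unit.uuid} has no magic link assigned")
    return unit.magic_link


# Two workflows leave their units in the same watched directory. Each unit
# carries its own resume point, so one shared chain can send them separate ways.
a = Unit("unit-a")
b = Unit("unit-b")
assign_magic_link(a, 301)
assign_magic_link(b, 402)
```

Here goto_magic_link(a) and goto_magic_link(b) return different links even though both units went through the same 'goto magic link' step.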

Creating your first workflow

Make sure you read the section "Read First" (above) first.

The mock situation

A collection of JPEGs was poorly normalized to BMP for preservation. We want to remove the BMPs from a transfer, then process it as a standard transfer (the files will later be normalized to uncompressed TIFF by Archivematica).

Creating a chain

Open the SQL file in an editor and turn on SQL syntax highlighting, for example:

gedit /usr/share/archivematica/mysql
(view -> highlight mode -> source -> SQL)

I find it easier to work backwards from the end. The chronological order is:

  1. Watched directory watched
  2. Set permissions
  3. Move to processing directory
  4. Remove .bmp files
  5. Move to regular processing watched directory

So the reverse order, in which the patch below is written, is:

  1. Move to regular processing watched directory
  2. Remove .bmp files
  3. Move to processing directory
  4. Set permissions
  5. Watched directory watched
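
Working backwards is convenient because every link row stores the pk of the link that follows it, so the following link has to exist before the current one can point at it. The sketch below mimics the way the patch carries @NextMicroServiceChainLink from one insert to the next; the pks are simply counted up here, whereas the real ones come from LAST_INSERT_ID(). The step names follow the order used in the patch.

```python
from typing import List, Optional, Tuple

# (pk, description, next link pk)
LinkRow = Tuple[int, str, Optional[int]]


def build_chain(steps_in_order: List[str]) -> Tuple[Optional[int], List[LinkRow]]:
    """Create link rows last step first and return (starting link pk, rows)."""
    rows: List[LinkRow] = []
    next_pk: Optional[int] = None  # the final step has no successor (NULL)
    for pk, description in enumerate(reversed(steps_in_order), start=1):
        rows.append((pk, description, next_pk))
        next_pk = pk  # plays the role of @NextMicroServiceChainLink
    return next_pk, rows


STEPS = [
    "Set permissions",
    "Move to processing directory",
    "Remove .bmp files",
    "Move to regular processing watched directory",
]
starting_link, link_rows = build_chain(STEPS)
```

The last row created is the first step to run, which is why the patch finishes by passing the most recent link to MicroServiceChains as startingLink.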

video

  • Paths may be different
  • Patch from the video:
Index: src/MCPServer/share/mysql
===================================================================
--- src/MCPServer/share/mysql	(revision 2328)
+++ src/MCPServer/share/mysql	(working copy)
@@ -2774,12 +2774,62 @@
 
 
 
+-- Move to regular processing watched directory. --
+INSERT INTO StandardTasksConfigs (filterFileEnd, filterFileStart, filterSubDir, requiresOutputLock, standardOutputFile, standardErrorFile, execute, arguments)
+    VALUES
+    (NULL, NULL, NULL, FALSE, NULL, NULL, 'moveTransfer_v0.0', '"%SIPDirectory%" "%sharedPath%watchedDirectories/activeTransfers/standardTransfer/." "%SIPUUID%" "%sharedPath%" "%SIPUUID%" "%sharedPath%"');
+INSERT INTO TasksConfigs (taskType, taskTypePKReference, description)
+    VALUES
+    (0,      LAST_INSERT_ID(), 'Move to standard transfer directory');
+INSERT INTO MicroServiceChainLinks (microserviceGroup, currentTask, defaultNextChainLink)    
+    VALUES (@microserviceGroup, LAST_INSERT_ID(), NULL);
+set @MicroServiceChainLink = LAST_INSERT_ID();
+INSERT INTO MicroServiceChainLinksExitCodes (microServiceChainLink, exitCode, nextMicroServiceChainLink) 
+    VALUES (@MicroServiceChainLink, 0, NULL);
+set @NextMicroServiceChainLink = @MicroServiceChainLink;
 
 
+-- Remove .bmp files. --
+INSERT INTO StandardTasksConfigs (filterFileEnd, filterFileStart, filterSubDir, requiresOutputLock, standardOutputFile, standardErrorFile, execute, arguments)
+    VALUES
+    ('.bmp', NULL, 'objects', TRUE, NULL, NULL, 'remove_v0.0',  '"%relativeLocation%"');
+SET @AssignfileUUIDstoobjects = LAST_INSERT_ID();
+INSERT INTO TasksConfigs (taskType, taskTypePKReference, description)
+    VALUES
+    (1,      @AssignfileUUIDstoobjects, 'Remove .bmp files');
+INSERT INTO MicroServiceChainLinks (microserviceGroup, currentTask, defaultNextChainLink)     
+    VALUES (@microserviceGroup, LAST_INSERT_ID(), @defaultNextChainLink);
+set @MicroServiceChainLink = LAST_INSERT_ID();
+INSERT INTO MicroServiceChainLinksExitCodes (microServiceChainLink, exitCode, nextMicroServiceChainLink) 
+    VALUES (@MicroServiceChainLink, 0, @NextMicroServiceChainLink);
+set @NextMicroServiceChainLink = @MicroServiceChainLink;
 
+-- Move to processing directory --
+-- move to processing directory --
+INSERT INTO MicroServiceChainLinks (microserviceGroup, currentTask, defaultNextChainLink)     
+    VALUES (@microserviceGroup, @moveToProcessingDirectoryTaskConfig, @defaultNextChainLink);
+set @MicroServiceChainLink = LAST_INSERT_ID();
+INSERT INTO MicroServiceChainLinksExitCodes (microServiceChainLink, exitCode, nextMicroServiceChainLink) 
+    VALUES (@MicroServiceChainLink, 0, @NextMicroServiceChainLink);
+set @NextMicroServiceChainLink = @MicroServiceChainLink;
 
+-- Set permissions --
+INSERT INTO MicroServiceChainLinks (microserviceGroup, currentTask, defaultNextChainLink)     
+    VALUES (@microserviceGroup, @setFilePermissionsTaskConfig, @defaultNextChainLink);
+set @MicroServiceChainLink = LAST_INSERT_ID();
+INSERT INTO MicroServiceChainLinksExitCodes (microServiceChainLink, exitCode, nextMicroServiceChainLink) 
+    VALUES (@MicroServiceChainLink, 0, @NextMicroServiceChainLink);
+set @NextMicroServiceChainLink = @MicroServiceChainLink;
 
+/*
+Watched directory watched
+/var/archivematica/sharedDirectory/watchedDirectories/example1
+*/
+INSERT INTO MicroServiceChains (startingLink, description) VALUES (@MicroServiceChainLink,  'Remove .bmp\'s before processing');
+set @MicroServiceChain = LAST_INSERT_ID();
 
+INSERT INTO WatchedDirectories (watchedDirectoryPath, chain, expectedType)
+    VALUES ('%watchDirectoryPath%example1', @MicroServiceChain, @expectedTypeTransfer);
 
 
 
@@ -2803,6 +2853,11 @@
 
 
 
+
+
+
+
+
 -- DSPACE TRANSER --
 -- transfer processing complete --
 SET @microserviceGroup  = 'Complete transfer';

Using choices

Continuing the example, this time using choices:

Index: src/MCPServer/share/mysql
===================================================================
--- src/MCPServer/share/mysql	(revision 2328)
+++ src/MCPServer/share/mysql	(working copy)
@@ -2657,6 +2657,87 @@
     VALUES (@MicroServiceChainLink, 0, @NextMicroServiceChainLink);
 set @NextMicroServiceChainLink = @MicroServiceChainLink;
 
+
+-- move to processing directory --
+INSERT INTO MicroServiceChainLinks (microserviceGroup, currentTask, defaultNextChainLink)     
+    VALUES (@microserviceGroup, @moveToProcessingDirectoryTaskConfig, @defaultNextChainLink);
+set @MicroServiceChainLink = LAST_INSERT_ID();
+INSERT INTO MicroServiceChainLinksExitCodes (microServiceChainLink, exitCode, nextMicroServiceChainLink) 
+    VALUES (@MicroServiceChainLink, 0, @NextMicroServiceChainLink);
+
+INSERT INTO MicroServiceChains (startingLink, description) VALUES (@MicroServiceChainLink,  'Continue processing normally');
+set @ContinueProcessingNormallyMicroServiceChain = LAST_INSERT_ID();
+
+-- Remove .bmp files. --
+INSERT INTO StandardTasksConfigs (filterFileEnd, filterFileStart, filterSubDir, requiresOutputLock, standardOutputFile, standardErrorFile, execute, arguments)
+    VALUES
+    ('.bmp', NULL, 'objects', TRUE, NULL, NULL, 'remove_v0.0',  '"%relativeLocation%"');
+SET @AssignfileUUIDstoobjects = LAST_INSERT_ID();
+INSERT INTO TasksConfigs (taskType, taskTypePKReference, description)
+    VALUES
+    (1,      @AssignfileUUIDstoobjects, 'Remove .bmp files');
+INSERT INTO MicroServiceChainLinks (microserviceGroup, currentTask, defaultNextChainLink)     
+    VALUES (@microserviceGroup, LAST_INSERT_ID(), @defaultNextChainLink);
+set @MicroServiceChainLink = LAST_INSERT_ID();
+INSERT INTO MicroServiceChainLinksExitCodes (microServiceChainLink, exitCode, nextMicroServiceChainLink) 
+    VALUES (@MicroServiceChainLink, 0, @NextMicroServiceChainLink);
+set @NextMicroServiceChainLink = @MicroServiceChainLink;
+
+-- move to processing directory --
+INSERT INTO MicroServiceChainLinks (microserviceGroup, currentTask, defaultNextChainLink)     
+    VALUES (@microserviceGroup, @moveToProcessingDirectoryTaskConfig, @defaultNextChainLink);
+set @MicroServiceChainLink = LAST_INSERT_ID();
+INSERT INTO MicroServiceChainLinksExitCodes (microServiceChainLink, exitCode, nextMicroServiceChainLink) 
+    VALUES (@MicroServiceChainLink, 0, @NextMicroServiceChainLink);
+set @NextMicroServiceChainLink = @MicroServiceChainLink;
+
+INSERT INTO MicroServiceChains (startingLink, description) VALUES (@MicroServiceChainLink,  'Remove .bmp files');
+set @RemoveBMPfilesFirstMicroServiceChain = LAST_INSERT_ID();
+
+
+INSERT INTO StandardTasksConfigs (filterFileEnd, filterFileStart, filterSubDir, requiresOutputLock, standardOutputFile, standardErrorFile, execute, arguments)
+    VALUES
+    (NULL, NULL, NULL, FALSE, NULL, NULL, '', '');
+INSERT INTO TasksConfigs (taskType, taskTypePKReference, description)
+    VALUES
+    (2,      LAST_INSERT_ID(), 'Workflow decision - remove .bmp files');
+INSERT INTO MicroServiceChainLinks (microserviceGroup, currentTask, defaultNextChainLink)    
+    VALUES (@microserviceGroup, LAST_INSERT_ID(), NULL);
+set @MicroServiceChainLink = LAST_INSERT_ID();
+INSERT INTO MicroServiceChainChoice (choiceAvailableAtLink, chainAvailable)
+    VALUES                
+    (@MicroServiceChainLink, @ContinueProcessingNormallyMicroServiceChain);
+INSERT INTO MicroServiceChainChoice (choiceAvailableAtLink, chainAvailable)
+    VALUES                
+    (@MicroServiceChainLink, @RemoveBMPfilesFirstMicroServiceChain);
+INSERT INTO MicroServiceChainChoice (choiceAvailableAtLink, chainAvailable)
+    VALUES                
+    (@MicroServiceChainLink, @rejectSIPMicroServiceChain);
+set @NextMicroServiceChainLink = @MicroServiceChainLink;
+
+INSERT INTO MicroServiceChains (startingLink, description) VALUES (@MicroServiceChainLink,  'create remove .bmp files?');
+set @MicroServiceChain = LAST_INSERT_ID();
+
+INSERT INTO WatchedDirectories (watchedDirectoryPath, chain, expectedType)
+    VALUES ('%watchDirectoryPath%workFlowDecisions/removeBMPFiles/', @MicroServiceChain, @expectedTypeTransfer);
+
+
+INSERT INTO StandardTasksConfigs (filterFileEnd, filterFileStart, filterSubDir, requiresOutputLock, standardOutputFile, standardErrorFile, execute, arguments)
+    VALUES
+    (NULL, NULL, NULL, FALSE, NULL, NULL, 'moveTransfer_v0.0', '"%SIPDirectory%" "%sharedPath%watchedDirectories/workFlowDecisions/removeBMPFiles/." "%SIPUUID%" "%sharedPath%" "%SIPUUID%" "%sharedPath%"');
+Set @MovetoworkFlowDecisionsquarantineSIPdirectory = LAST_INSERT_ID();
+INSERT INTO TasksConfigs (taskType, taskTypePKReference, description)
+    VALUES
+    (0,      @MovetoworkFlowDecisionsquarantineSIPdirectory, 'Move to workFlowDecisions-removeBMPFiles directory');
+INSERT INTO MicroServiceChainLinks (microserviceGroup, currentTask, defaultNextChainLink)    
+    VALUES (@microserviceGroup, LAST_INSERT_ID(), NULL);
+set @MicroServiceChainLink = LAST_INSERT_ID();
+INSERT INTO MicroServiceChainLinksExitCodes (microServiceChainLink, exitCode, nextMicroServiceChainLink) 
+    VALUES (@MicroServiceChainLink, 0, NULL);
+set @NextMicroServiceChainLink = @MicroServiceChainLink;
+
+
+
 SET @microserviceGroup  = 'Include default Transfer processingMCP.xml';
 INSERT INTO StandardTasksConfigs (filterFileEnd, filterFileStart, filterSubDir, requiresOutputLock, standardOutputFile, standardErrorFile, execute, arguments)
     VALUES
@@ -2802,7 +2883,6 @@
 
 
 
-
 -- DSPACE TRANSER --
 -- transfer processing complete --
 SET @microserviceGroup  = 'Complete transfer';

Using magic links