Workflows

From Archivematica
Revision as of 12:00, 25 November 2010 by Evelyn McLellan (talk | contribs)

This page documents the default Archivematica workflows, which are preconfigured for each installation. Alternative workflows can be defined using the MCP configuration files.

1. Primary (default) Archivematica workflow

The following steps represent a successful run of the Archivematica workflow (i.e. one without any errors). This "sunny day" workflow is derived from the original Archivematica requirements and the findings of pre-1.0 pilot projects.

| Event (mostly in order of occurrence) | Description | PREMIS event name |
|---|---|---|
| Assign SIP UUID | This happens immediately after the SIP is dropped into Archivematica. | receive SIP |
| Verify SIP compliance | The SIP should have four directories: objects, logs/fileMeta, logs, metadata. We don't currently know when/how these directories are created. A pass/fail notification appears to the user. It does not form a PREMIS event. | n/a |
| Assign file UUIDs/Generate checksums | Note that checksums are generated even if the objects arrive with checksums. See next step. | identifier assignment; message digest calculation |
| Verify checksum | If the objects arrive with checksums, they are verified at this point, but the checksums are not retained once they have been verified. The checksums generated in the Assign file UUIDs/Generate checksums step are kept instead. | fixity check |
| Appraise SIP for submission | This is a yes/no decision. If no, the SIP is deleted. If yes, the archivist can decide to keep the entire SIP or delete some of the files. | n/a |
| Scan for deleted files | This step identifies any files that were deleted during the Appraise SIP for submission step. A log of deleted files is saved to the logs directory. No PREMIS event is generated. | n/a |
| Start quarantine | | start quarantine |
| End quarantine | | end quarantine |
| Unpackage | A PREMIS event is generated only for files that are unpackaged. | unpackage |
| Assign file UUIDs/Generate checksums for unpackaged files | | identifier assignment; message digest calculation |
| Name sanitize | A PREMIS event is generated only for those files that have had their names sanitized. | remove prohibited characters |
| Virus scan | Output is pass or fail. A PREMIS event is generated only for those files that pass the scan. | virus check |
| File format identification | | format identification |
| File format validation | | format validation |
| Appraise SIP for preservation | This appraisal step is based on format identification and validation information provided by FITS tools. This is a yes/no decision. If no, the SIP is deleted. If yes, the archivist can decide to keep the entire SIP or delete some of the files. | n/a |
| Scan for deleted files | This step identifies any files that were deleted during the Appraise SIP for preservation step. A log of deleted files is saved to the logs directory. No PREMIS event is generated. | n/a |
| Normalization | | normalization |
| Assign file UUIDs/Generate checksums for normalized files | | identifier assignment; message digest calculation |
| Generation of access copy | | n/a |
| Review DIP | | n/a |
| Upload DIP to access system | | n/a |
| Store AIP | | n/a |
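
To make a few of the steps above more concrete, here is a minimal Python sketch of the Verify SIP compliance, Assign file UUIDs/Generate checksums, Verify checksum and Scan for deleted files steps. It is an illustration only and is not taken from the Archivematica code base. All function names are hypothetical, and the use of SHA-256 is an assumption, since this page does not say which checksum algorithm is used.

```python
import hashlib
import uuid
from pathlib import Path

# Directory names taken from the "Verify SIP compliance" row of the table.
REQUIRED_DIRS = ("objects", "logs", "logs/fileMeta", "metadata")


def verify_sip_compliance(sip_root: Path) -> list[str]:
    """Return the required directories that are missing (empty list means pass)."""
    return [d for d in REQUIRED_DIRS if not (sip_root / d).is_dir()]


def sha256_of(path: Path) -> str:
    """Hash a file in chunks. SHA-256 is an assumption, not a documented choice."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def assign_uuids_and_checksums(sip_root: Path) -> dict[str, dict[str, str]]:
    """Give every file under objects/ a new UUID and a freshly computed checksum."""
    objects = sip_root / "objects"
    return {
        path.relative_to(objects).as_posix(): {
            "uuid": str(uuid.uuid4()),
            "checksum": sha256_of(path),
        }
        for path in sorted(objects.rglob("*"))
        if path.is_file()
    }


def verify_supplied_checksums(
    manifest: dict[str, dict[str, str]], supplied: dict[str, str]
) -> list[str]:
    """Return the paths whose supplied checksum does not match the generated one.

    The supplied values are only compared, never stored, mirroring the note
    that incoming checksums are not retained after verification.
    """
    return [
        rel_path
        for rel_path, expected in supplied.items()
        if rel_path not in manifest
        or manifest[rel_path]["checksum"] != expected.lower()
    ]


def scan_for_deleted_files(
    manifest: dict[str, dict[str, str]], sip_root: Path
) -> list[str]:
    """List files recorded before appraisal that no longer exist under objects/."""
    objects = sip_root / "objects"
    return sorted(p for p in manifest if not (objects / p).is_file())
```

In this sketch the manifest returned by `assign_uuids_and_checksums` is reused twice: once to check any checksums that arrived with the objects, and again after an appraisal step to work out which files the archivist deleted. Writing that list to the logs directory and recording PREMIS events are left out.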