Difference between revisions of "Micro-services"

Revision as of 19:50, 16 February 2011

The Archivematica micro-services are based on the project's use case and workflow analysis of the ISO-OAIS functional model. Each service operates on a conceptual entity that is equivalent to one of the OAIS information packages:

  • Submission Information Package (SIP)
  • Archival Information Package (AIP)
  • Dissemination Information Package (DIP)

These packages are moved from one service to the next using the Unix pipeline design pattern implemented with a combination of Bash and Python scripts. Each service is provided by one or more of the open-source software utilities and applications bundled in the Archivematica system.
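
The hand-off described above can be illustrated with a minimal Python sketch. It is not Archivematica's actual code: the package is modelled as an in-memory dictionary rather than a directory on disk, and the names `verify_structure`, `sanitize_names` and `run_pipeline` are hypothetical, chosen only for illustration.

```python
from typing import Any, Callable, Dict, List

# Assumption: a package is a plain dict here; in Archivematica it is a folder on disk.
Package = Dict[str, Any]
MicroService = Callable[[Package], Package]


def verify_structure(package: Package) -> Package:
    """Hypothetical service: reject a package that has no 'files' entry."""
    if "files" not in package:
        raise ValueError("package has no 'files' entry")
    return package


def sanitize_names(package: Package) -> Package:
    """Hypothetical service: replace spaces in file names with underscores."""
    files = [name.replace(" ", "_") for name in package["files"]]
    return {**package, "files": files}


def run_pipeline(package: Package, services: List[MicroService]) -> Package:
    """Pass the package through each service in order, recording what ran."""
    history: List[str] = []
    for service in services:
        package = service(package)  # an exception here halts the pipeline
        history.append(service.__name__)
    return {**package, "history": history}


sip = {"name": "example-sip", "files": ["annual report.pdf", "photo 1.jpg"]}
result = run_pipeline(sip, [verify_structure, sanitize_names])
print(result["files"])
print(result["history"])
```

Each service takes a package and returns a package, so services can be reordered or added without changing the runner, which is the property the pipeline pattern is meant to provide.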


Archivematica Micro-services

Micro-service | Description
Create SIP backup | Archivematica automatically creates a backup of the entire SIP as soon as it is ingested.
Verify SIP compliance | This micro-service verifies that the SIP conforms to the folder structure required for processing in Archivematica.
Assign file UUID and checksums | Each file in the SIP is assigned a universally unique identifier (UUID) and a SHA-1 checksum.
Verify metadata directory checksums | If the SIP contained a checksum.md5 file on ingest, Archivematica checks it to confirm that none of the files were deleted or altered on ingest.
Create DC | A Dublin Core XML template is added to the metadata folder in the SIP. The user can fill in fields as desired. The elements map to fields in ICA-AtoM when the DIP is uploaded.
Appraise SIP for submission | The archivist reviews the SIP, if desired, to confirm that it complies with any submission agreements. The archivist can delete unwanted files at this point; Archivematica will keep a log of the deleted files.
Quarantine | The SIP is placed in quarantine for a pre-set period of time. The archivist can move the SIP out of quarantine before the pre-set time has expired, if desired.
Extract packages | Files are extracted from any .zip files or other packages; each extracted file is assigned a universally unique identifier and a SHA-1 checksum.
Sanitize file and directory names | Prohibited characters, such as spaces or ampersands, are removed from file and folder names and replaced with underscores.
Scan for viruses | ClamAV scans all files. If a virus or other malware is found, the SIP is placed in a folder called SIPerrors and all processing on the SIP is stopped.
Characterize and extract metadata | File formats are identified and the files are validated against external format specifications. Technical metadata is extracted from each file.
Appraise SIP for preservation | The archivist appraises the contents of the SIP, if desired, and deletes unwanted files. Archivematica will keep a log of the deleted files. In future releases of Archivematica, appraisal will be assisted by summary technical information on file formats, validation status and the presence of characteristics that might affect preservation.
Normalize | Archivematica creates a preservation copy and an access copy of each file. For more on normalization, see Media type preservation plans.
Compile METS file | Archivematica compiles a METS file with a complete set of PREMIS metadata for each ingested file. The technical metadata that were extracted during the "Characterize and extract metadata" micro-service are placed in the PREMIS objectCharacteristicsExtension element.
Create AIP checksum | A checksum for all the contents of the AIP is generated.
Prepare AIP | The AIP is packaged using the Library of Congress BagIt specification.
Store AIP | The archivist reviews the AIP, if desired, and approves it for archival storage. The AIP is moved into the AIPsStore folder, which is linked to the institution's storage system.
Generate DIP | The access copies that were created during the "Normalize" micro-service are placed in a DIP folder and the METS file is added to the DIP.
Upload DIP | The archivist reviews the DIP, if desired, and removes any access copies that cannot be sent to the public access system due to copyright, security or other issues. The archivist then approves the DIP for upload and the DIP is uploaded into the public access system (in Archivematica, the default access system is the open-source archival description tool ICA-AtoM). A backup copy of the DIP, including files that were deleted, is sent to the DIPbackups folder.
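
Two of the steps in the table, "Assign file UUID and checksums" and "Sanitize file and directory names", lend themselves to a short illustration. The Python sketch below is an assumption-laden example, not the scripts shipped with Archivematica: file contents are held in memory instead of being read from disk, the function names are hypothetical, and the set of allowed characters is a guess, since the table only names spaces and ampersands as prohibited.

```python
import hashlib
import re
import uuid
from typing import Dict


def sha1_of_bytes(data: bytes) -> str:
    """Return the SHA-1 checksum of the given content as a hex string."""
    return hashlib.sha1(data).hexdigest()


def assign_identifiers(files: Dict[str, bytes]) -> Dict[str, Dict[str, str]]:
    """Give every file a freshly generated UUID and a SHA-1 checksum."""
    return {
        name: {"uuid": str(uuid.uuid4()), "sha1": sha1_of_bytes(content)}
        for name, content in files.items()
    }


def sanitize_name(name: str) -> str:
    """Replace each character outside an assumed whitelist with an underscore.

    Assumption: only letters, digits, dot, hyphen and underscore are allowed.
    """
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)


# Hypothetical SIP contents, keyed by file name.
sip_files = {"Q&A notes.txt": b"example content", "empty file.dat": b""}

identifiers = assign_identifiers(sip_files)
clean_names = [sanitize_name(name) for name in sip_files]
print(clean_names)
```

A real implementation would read each file from disk in chunks rather than holding it in memory, and would need to handle two names that collide after sanitization; neither concern is addressed in this sketch.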