Ingest (0.5)

Setting up shared folders

In order to work through all of the steps in the tables below, you will need to set up two shared folders in Archivematica.

  • The purpose of shared folders is to allow you to place digital objects into a folder on your host machine and have the objects automatically appear in a folder in Archivematica, and vice versa.
  • The two folders in Archivematica which need to be set up as shared folders are /home/demo/ingestSIP and /home/demo/storeAIP.
    • /home/demo/ingestSIP is used to ingest SIPs from the host machine into Archivematica.
    • /home/demo/storeAIP is used to drop AIPs into a folder in Archivematica and have them appear on the host machine.
  • Recommended names for the folders on the host machine are sendSIP and archivalstorage.
  • For instructions on setting up shared folders, please go to Virtual appliance instructions.
  • For testing purposes you can avoid setting up shared folders and simply use the test files found in /home/demo/testFiles/. However, you will not be moving SIPs into Archivematica or moving stored AIPs out of it.


Activity diagram 1 Receive SIP

Archivematica UML Activity diagram AD1 Receive SIP

Workflow diagram step Description Activity diagram references
Producer places SIP in shared folder on host machine
  • Place a folder of digital files into the shared ingest folder on the host machine.
  • Note that the SIP does not need to be prepared in any way prior to ingest: you do not need to prepare it as a METS file or otherwise process it. A simple folder with one or more files in it is fine.
SIP appears in shared folder in Archivematica
  • SIP will appear in /home/demo/ingestSIP/.
  • To navigate to this folder, click Places > Home folder.
1.4 Receive SIP from Producer (UC-1.1)
Archivist copies SIP from shared folder to SIP receipt folder
  • Copy SIP from /home/demo/ingestSIP/ to /home/demo/receiveSIP/.
  • /home/demo/ingestSIP acts as a backup SIP copy. If anything goes wrong during the ingest process, this backup copy can be retrieved and processed.
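
The copy step above is normally done by hand in the file browser. For readers who prefer to script it, the following is a minimal sketch in Python, assuming the folder locations named in this section. The function name copy_sip and the SIP name passed to it are hypothetical and are not part of Archivematica.

```python
import shutil
from pathlib import Path

# Folder locations taken from the steps above.
INGEST_DIR = Path("/home/demo/ingestSIP")
RECEIVE_DIR = Path("/home/demo/receiveSIP")


def copy_sip(sip_name, ingest_dir=INGEST_DIR, receive_dir=RECEIVE_DIR):
    """Copy one SIP folder into the receipt folder, leaving the original as a backup."""
    source = Path(ingest_dir) / sip_name
    target = Path(receive_dir) / sip_name
    if not source.is_dir():
        raise FileNotFoundError(f"SIP folder not found: {source}")
    if target.exists():
        raise FileExistsError(f"SIP already received: {target}")
    # copytree copies the folder recursively and does not touch the source,
    # so the copy in ingestSIP remains available as the backup.
    shutil.copytree(source, target)
    return target
```

Calling copy_sip("mySIP") would copy /home/demo/ingestSIP/mySIP to /home/demo/receiveSIP/mySIP. The sketch deliberately copies rather than moves, because the original in ingestSIP is meant to stay behind as the backup.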

Activity diagram 2 Audit SIP

Archivematica UML Activity diagram AD2 Audit SIP

Workflow diagram step Description Activity diagram references
Archivist moves SIP from SIP receipt folder into quarantine
  • Drag the SIP from /home/demo/receiveSIP/ and drop it into /home/demo/quarantine.
  • Note that you must drag and drop, not copy and paste, in order to trigger the quarantine process.
2.1 Quarantine SIP
SIP is quarantined for 2 minutes
  • In a production system, SIPs would normally be quarantined for a set period of time (for example, four weeks), to allow anti-virus software to be updated with the latest virus profiles.
  • A lock should appear on the SIP folder in quarantine. The archivist will not be able to read or modify the files during this time.
2.1 Quarantine SIP
SIP is scanned for malware
  • At the end of the quarantine period, ClamAV will automatically scan the files for viruses and other malware.
2.2 Check SIP for malware
Infected files are sent to possiblevirii folder
  • Infected files will appear in /home/demo/possiblevirii/. If this occurs, do not take any further steps in the ingest process. Inform the Producer that infected files have been found. It is recommended at this point to delete all SIP copies and request that the Producer take steps to review the causes of the problem and eventually resubmit a malware-free SIP.

2.4 Audit SIP for compliance
2.5 Assess SIP deficiencies
2.6 Notify Producer of SIP rejection
2.8 Destroy SIP copies

Virus-checker report is sent to accessions folder
  • A report on ClamAV's virus scan will appear automatically at /home/demo/accessionreports/virus.log.
2.4 Audit SIP for compliance
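
The virus-checker report can also be read programmatically. ClamAV's clamscan marks each infected file with a line of the form "path: signature FOUND". The sketch below assumes that virus.log contains lines in that form, which this page does not confirm for release 0.5. The function name infected_files is hypothetical.

```python
def infected_files(log_text):
    """Return (path, signature) pairs for every log line ending in ' FOUND'."""
    marker = " FOUND"
    hits = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line.endswith(marker):
            continue
        # Split on the last ': ' so that colons inside the path are preserved.
        path, _, signature = line[: -len(marker)].rpartition(": ")
        if path:
            hits.append((path, signature))
    return hits
```

An empty result from infected_files(open("/home/demo/accessionreports/virus.log").read()) would suggest that no infections were reported, provided the log really uses the assumed format.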

Activity diagram 3 Accept SIP for Ingest

Archivematica UML Activity diagram AD3 Accept SIP for Ingest

Workflow diagram step Description Activity diagram references
SIP contents are identified and validated using FITS
  • FITS (File Information Tool Set) is automatically launched once the quarantine period has ended and the files have been scanned for viruses.
  • FITS incorporates format identification and validation tools such as DROID, JHOVE and the New Zealand Metadata Extractor. It compares the results of each tool and extracts a set of identification, validation and technical metadata. For more information on the FITS tool, see http://code.google.com/p/fits/

3.3 Identify formats (UC-1.2, step 3)
3.4 Validate formats (UC-1.2, step 3)
3.5 Extract metadata (UC-1.2, step 3)

Identification/validation reports are sent to accessions folder
  • The FITS report will appear in /home/demo/accessionreports/. The report appears as a folder named with a 10-digit number; inside the folder is a report for each file in the SIP.
  • Note that each report contains an MD5 checksum for the file.

3.3 Identify formats (UC-1.2, step 3)
3.4 Validate formats (UC-1.2, step 3)
3.5 Extract metadata (UC-1.2, step 3)
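
The MD5 checksum recorded in each FITS report can be used to confirm that a file has not changed since it was accessioned. The following is a minimal sketch using Python's standard hashlib module. The function names are hypothetical, and the checksum value is assumed to have been copied by hand from the report, since the report layout is not described here.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reported_md5(path, reported_md5):
    """Compare a file's MD5 with a value copied from its report (case-insensitive)."""
    return md5_of_file(path) == reported_md5.strip().lower()
```

A result of False from matches_reported_md5 means the file's current content differs from what was recorded in the report.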

Accession log is sent to accessions folder
  • A report on the accession process will appear automatically at /home/demo/accessionreports/accession.log.
  • For each file in the SIP, the accession log will state "Accession of /tmp/accession-[FITS folder number]/[SIP number]/[filename] completed successfully."
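
For a SIP with many files, checking the accession log by eye is tedious. The sketch below looks for the success message quoted above and reports which expected filenames have no matching line. It assumes each message sits on a single log line and matches files by name only, so two files with the same name in different subfolders cannot be told apart. The function names are hypothetical.

```python
import re

# Matches the success message quoted above and captures the file path.
SUCCESS_LINE = re.compile(r"Accession of (?P<path>\S.*?) completed successfully\.")


def accessioned_names(log_text):
    """Return the set of filenames that have a success line in the log."""
    names = set()
    for match in SUCCESS_LINE.finditer(log_text):
        # Keep only the last path component, i.e. the filename.
        names.add(match.group("path").rsplit("/", 1)[-1])
    return names


def missing_from_log(log_text, expected_filenames):
    """Return the expected filenames that have no success line, sorted."""
    done = accessioned_names(log_text)
    return sorted(name for name in expected_filenames if name not in done)
```

An empty list from missing_from_log means every expected file has a success entry.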

Activity diagram 4 Generate AIP

Archivematica UML Activity diagram AD4 Generate AIP

Workflow diagram step Description Activity diagram references
SIP is moved to AIP preparation folder
  • At the end of the quarantine process, Archivematica automatically drops the SIP into /home/demo/prepareAIP/.
4.2 Add content information to AIP
Archivist normalizes files
  • From Archivematica's Linux desktop, open Xena.
  • Click Add Directory.
  • Select /home/demo/prepareAIP/[SIP]/.
  • In Tools > Xena 4.2.1 Preferences > Xena destination directory, enter /home/demo/prepareAIP/[SIP].
  • In Tools > Xena 4.2.1 Preferences > Xena log file, enter /home/demo/accessionreports/xena_log.
  • Click OK to close Xena 4.2.1 Preferences.
  • Click Normalise.
  • Wait for the normalization process to complete (a pop-up dialogue box will open to indicate that it has finished).
  • Click OK to close the pop-up window.
  • Close Xena.
4.3 Transform content information (UC-1.3, step 9)
Normalized files are saved to AIP preparation folder
  • In the SIP, look for files with the extension .xena. These are normalized versions of the original files.
  • To view representations of normalized files, open the Xena Viewer from Archivematica's Linux desktop.
4.3 Transform content information (UC-1.3, step 9)
Normalization log is saved to accessions folder
  • A log file showing all the actions taken by Xena will appear: /home/demo/accessionreports/xena_log.0.
Archivist moves PDI from accessions folder to SIP in AIP preparation folder
  • In Archivematica, all the contents of /home/demo/accessionreports/ that relate to the SIP are considered PDI (Preservation Description Information).
    • Cut these contents and paste them to /home/demo/prepareAIP/[SIP].
4.5 Add PDI to AIP
Archivist moves SIP to AIP generation folder
  • Drag the SIP from /home/demo/prepareAIP/ and drop it into /home/demo/generateAIP/.
  • Note that you must drag and drop, not copy and paste, in order to trigger the AIP generation process.
SIP content and PDI are zipped into AIP
UC-1.3, step 10
AIP is moved to AIP receipt folder
  • The bagging process automatically moves the AIP to /home/demo/receiveAIP/.
    • To view the AIP, double-click it. When it opens in a separate window, double-click it again; this will allow you to view (but not modify or delete) the contents of the zipped bag.
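
The contents of the zipped bag can also be listed without opening it in the file browser. The sketch below uses Python's standard zipfile module in read-only mode, assuming the AIP is an ordinary zip archive, as the description above suggests. The function name list_aip_contents is hypothetical.

```python
import zipfile


def list_aip_contents(aip_path):
    """Return the member names of a zipped AIP after a basic integrity check."""
    # Mode "r" opens the archive read-only, so nothing in the AIP is modified.
    with zipfile.ZipFile(aip_path, "r") as archive:
        # testzip() returns the name of the first member that fails its CRC
        # check, or None if every member reads back correctly.
        bad_member = archive.testzip()
        if bad_member is not None:
            raise ValueError(f"Corrupt member in AIP: {bad_member}")
        return archive.namelist()
```

The returned list is in the order the members were added to the archive.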

Activity diagram 5 Transfer AIP to Archival Storage

Archivematica UML Activity diagram AD5 Transfer AIP to Archival Storage

Workflow diagram step Description Activity diagram references
Archivist copies AIP to archival storage folder
  • Copy the AIP from /home/demo/receiveAIP/ to /home/demo/storeAIP/.
5.2 Transfer AIP to archival storage (UC-1.5)
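
When copying the AIP to archival storage by script rather than by hand, it is worth confirming that the stored copy is identical to the original. The sketch below copies the file and then compares the two byte for byte. The function name store_aip is hypothetical, the default folder is the one named in this section, and the AIP is assumed to be a single file.

```python
import filecmp
import shutil
from pathlib import Path


def store_aip(aip_path, store_dir="/home/demo/storeAIP"):
    """Copy an AIP file into the storage folder and verify the copy."""
    source = Path(aip_path)
    target = Path(store_dir) / source.name
    if target.exists():
        raise FileExistsError(f"AIP already stored: {target}")
    # copy2 copies the file content and also tries to preserve timestamps.
    shutil.copy2(source, target)
    # shallow=False forces a comparison of file contents, not just metadata.
    if not filecmp.cmp(source, target, shallow=False):
        raise IOError(f"Stored copy differs from original: {target}")
    return target
```

The original in /home/demo/receiveAIP/ is left in place, matching the copy (not move) instruction in the step above.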

Go to Archival Storage (0.5)
