Difference between revisions of "Ingest (0.5)"

From Archivematica
(Undo revision 640 by Evelyn McLellan (Talk))

Revision as of 16:36, 19 November 2009



AD1 Receive SIP

File:Archivematica AD1 ReceiveSIP v1.pdf


Workflow diagram step | Description | UML diagram references
Producer places SIP in shared folder on host machine
  • The purpose of shared folders is to allow the Producer to drop SIPs into a folder on their host machine or network and have the SIPs automatically appear in a folder in Archivematica, and vice versa.
    • For instructions on setting up shared folders, please go to Virtual appliance instructions.
    • Neither the folder name nor any of the filenames should include spaces or special characters. Underscores are ok.
    • Make the shared folder in Archivematica /home/demo/receiveSIP/.
  • For testing purposes you can avoid setting up shared folders and simply use the test files found in /home/demo/testFiles/.
SIP appears in shared folder in Archivematica
  • SIP will appear in /home/demo/receiveSIP/.
  • If using files from /home/demo/testFiles/, copy a folder from /testFiles/ into /home/demo/receiveSIP/.
Archivist copies SIP from shared folder to SIP backup folder
  • Copy the SIP in /home/demo/receiveSIP/ and paste it into /home/demo/receiveSIPbackup/. If anything goes wrong during the ingest process, this backup copy can be retrieved and processed.
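
The naming rule and the backup step above can be sketched in Python. This is only an illustration, not part of Archivematica: the function names are hypothetical, and the set of "safe" characters (ASCII letters, digits, underscores, hyphens and dots) is an assumption, since the text only rules out spaces and special characters.

```python
import re
import shutil
from pathlib import Path

# Assumption: a "safe" name uses only ASCII letters, digits, underscores,
# hyphens and dots. The documentation only forbids spaces and special characters.
SAFE_NAME = re.compile(r"^[A-Za-z0-9_.-]+$")


def find_unsafe_names(sip_dir):
    """Return the SIP folder and any paths inside it whose names are unsafe."""
    sip_dir = Path(sip_dir)
    candidates = [sip_dir, *sorted(sip_dir.rglob("*"))]
    return [p for p in candidates if not SAFE_NAME.match(p.name)]


def backup_sip(sip_dir, backup_root):
    """Copy the SIP folder into backup_root and return the path of the copy."""
    sip_dir = Path(sip_dir)
    unsafe = find_unsafe_names(sip_dir)
    if unsafe:
        raise ValueError(f"Rename before ingest: {[str(p) for p in unsafe]}")
    target = Path(backup_root) / sip_dir.name
    # copytree raises FileExistsError if a backup with this name already exists.
    shutil.copytree(sip_dir, target)
    return target
```

For example, backup_sip("/home/demo/receiveSIP/my_sip", "/home/demo/receiveSIPbackup") would create /home/demo/receiveSIPbackup/my_sip, where my_sip is a placeholder SIP name.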

AD2 Audit SIP

File:Archivematica AD2 AuditSIP v5.pdf


Workflow diagram step | Description | UML diagram references
Archivist moves SIP from shared folder into quarantine
  • Drag the SIP from /home/demo/receiveSIP/ to /home/demo/quarantine/.
SIP is quarantined for 2 minutes
  • In a production system, SIPs would normally be quarantined for a set period of time (for example, four weeks), to allow anti-virus software to be updated with the latest virus profiles.
  • A lock should appear on the SIP folder in quarantine. The archivist will not be able to read or modify the files during this time.
SIP is scanned for malware
  • At the end of the quarantine period, ClamAV will automatically scan the files for viruses and other malware.
Infected files are sent to possiblevirii folder
  • Infected files will appear in /home/demo/possiblevirii/. If this occurs, do not take any further steps in the ingest process. Inform the Producer that infected files have been found. It is recommended at this point to delete all SIP copies and request that the Producer take steps to review the causes of the problem and eventually resubmit a malware-free SIP.
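
The quarantine timer and the malware scan can be sketched as follows. This is a hedged illustration, not Archivematica's actual mechanism, which this page does not show. The function names are hypothetical, and the use of clamscan's --move and --log options to reproduce the possiblevirii folder and the virus log is an assumption.

```python
import subprocess
from datetime import datetime, timedelta

# The demo quarantine period described above; production systems would use
# a much longer period (for example, four weeks).
DEMO_QUARANTINE = timedelta(minutes=2)


def quarantine_expired(started_at, period=DEMO_QUARANTINE, now=None):
    """Return True once the quarantine period has fully elapsed."""
    now = now or datetime.now()
    return now - started_at >= period


def build_clamscan_command(sip_dir, infected_dir, log_file):
    """Build a clamscan call that moves infected files out of the SIP."""
    return [
        "clamscan",
        "--recursive",             # descend into the SIP's subfolders
        "--infected",              # only report infected files
        f"--move={infected_dir}",  # assumption: infected files are moved here
        f"--log={log_file}",       # assumption: scan report is written here
        str(sip_dir),
    ]


def scan_sip(sip_dir, infected_dir, log_file):
    """Run the scan and return True if the SIP is clean."""
    result = subprocess.run(build_clamscan_command(sip_dir, infected_dir, log_file))
    # clamscan exits with 0 when no virus is found and 1 when one is found;
    # any other exit code indicates an error.
    if result.returncode not in (0, 1):
        raise RuntimeError(f"clamscan failed with exit code {result.returncode}")
    return result.returncode == 0
```

If scan_sip returns False, the ingest should stop, as described in the last step above.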

AD3 Accept SIP for Ingest

File:Archivematica AD3 AcceptSIPforIngest v4.pdf


Workflow diagram step | Description | UML diagram references
SIP contents are identified and validated using FITS
  • FITS (File Information Tool Set) is automatically launched once the quarantine period has ended and the files have been scanned for viruses.
  • FITS incorporates format identification and validation tools such as DROID, JHOVE and the New Zealand Metadata Extractor. It compares the results of each tool and extracts a set of identification, validation and technical metadata. For more information on the FITS tool, see http://code.google.com/p/fits/
Identification/validation reports are sent to accessions folder
  • The FITS report will appear in /home/demo/accessionreports/. The report appears as a folder named with a 10-digit number; inside the folder is a report for each file in the SIP.
  • Note that each report contains an MD5 checksum for the file.
Virus-checker report is sent to accessions folder
  • A report on ClamAV's virus scan will appear automatically: /home/demo/accessionreports/virus.log.
Accession log is sent to accessions folder
  • A report on the accession process will appear automatically: /home/demo/accessionreports/accession.log.
  • For each file in the SIP, the accession log will state "Accession of /tmp/accession-[FITS folder number]/[SIP number]/[filename] completed successfully."
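
The MD5 checksums in the FITS reports and the success lines in the accession log can be checked with a short Python sketch. The function names are hypothetical, and the log pattern is an assumption based only on the message quoted above; the layout of the FITS report itself is not described here, so the sketch does not try to parse it.

```python
import hashlib
import re

# Assumption: each successful accession is logged on one line in the form
# quoted in the documentation above.
SUCCESS_LINE = re.compile(r"Accession of (?P<path>.+) completed successfully\.")


def md5_checksum(path, chunk_size=65536):
    """Compute a file's MD5 checksum, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def accessioned_paths(log_text):
    """Return the file paths that the accession log reports as successful."""
    return [match.group("path") for match in SUCCESS_LINE.finditer(log_text)]
```

Comparing md5_checksum(path) with the checksum recorded in a file's report shows whether the file has changed since accession, and comparing accessioned_paths(log_text) with the list of files in the SIP shows whether any file is missing from the log.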

AD4 Generate AIP

File:Archivematica AD4 GenerateAIP v2.pdf


Workflow diagram step | Description | UML diagram references
SIP is moved to AIP preparation folder
  • At the end of the quarantine process, Archivematica automatically drops the SIP into /home/demo/prepareAIP/.
Archivist normalizes files
  • From Archivematica's Linux desktop, open Xena.
  • Click Add Directory.
  • Select /home/demo/prepareAIP/[SIP]/.
  • In Tools > Xena 4.2.1 Preferences > Xena destination directory, enter /home/demo/prepareAIP/[SIP].
  • In Tools > Xena 4.2.1 Preferences > Xena log file, enter /home/demo/accessionreports/xena_log.
  • Click OK to close Xena 4.2.1 Preferences.
  • Click Normalise.
  • Wait for the normalization process to complete (a pop-up dialogue box will open to indicate that it has finished).
  • Click OK to close the pop-up window.
  • Close Xena.
Normalized files are saved to AIP preparation folder
  • In the SIP, look for files with the extension .xena. These are normalized versions of the original files.
Normalization log is saved to accessions folder
  • A log file showing all the actions taken by Xena will appear: /home/demo/accessionreports/xena_log.0.
Archivist moves PDI in accessions folder to SIP in AIP preparation folder
  • In Archivematica, all the contents of /home/demo/accessionreports/ relating to the SIP are considered PDI (Preservation Description Information).
    • Copy these contents to /home/demo/prepareAIP/[SIP].
Archivist moves SIP to AIP generation folder
  • Drag the SIP from /home/demo/prepareAIP/ and drop it into /home/demo/generateAIP/.
SIP content and PDI are zipped into AIP
AIP is moved to AIP receipt folder
  • The bagging process automatically moves the AIP to /home/demo/receiveAIP/.
    • To view the AIP, double-click it. When it opens in a separate window, double-click it again; this will allow you to view (but not modify or delete) the contents of the zipped bag.
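
The manual checks in this section can be sketched in Python. This is a minimal illustration with hypothetical function names. It assumes that everything in the accession reports folder relates to the SIP being processed, and that the AIP is an ordinary zip file as the text above suggests; it does not reproduce the bagging process itself.

```python
import shutil
import zipfile
from pathlib import Path


def normalized_files(sip_dir):
    """List the .xena files that Xena wrote into the SIP."""
    return sorted(Path(sip_dir).rglob("*.xena"))


def copy_pdi(reports_dir, sip_dir):
    """Copy the accession reports (PDI) into the SIP folder.

    Assumption: every item in reports_dir belongs to this SIP.
    Requires Python 3.8 or later for dirs_exist_ok.
    """
    shutil.copytree(reports_dir, sip_dir, dirs_exist_ok=True)


def aip_contents(aip_path):
    """Return the names of the files inside the zipped AIP without changing it."""
    with zipfile.ZipFile(aip_path) as aip:  # opened read-only by default
        return aip.namelist()
```

An empty result from normalized_files after running Xena would suggest that normalization did not write to the expected destination directory.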

AD5 Transfer AIP to Archival Storage

File:Archivematica AD5 TransferAIPtoArchivalStorage v2.pdf


Workflow step | Description | UML diagram references
Archivist moves AIP to archival storage folder
  • Drag the AIP from /home/demo/receiveAIP/ and drop it into /home/demo/storeAIP/.
AIP is copied to shared folder on host machine
  • This assumes that /home/demo/storeAIP has been set up as a shared folder.
    • The purpose of shared folders is to allow the Producer to drop SIPs into a folder on their host machine or network and have the SIPs automatically appear in a folder in Archivematica, and vice versa.
    • For instructions on setting up shared folders, please go to Virtual appliance instructions.
    • Neither the folder name nor any of the filenames should include spaces or special characters. Underscores are ok.
    • For testing purposes you can choose not to set up the shared folder, since it doesn't affect any activity in Archivematica.
Archivist deletes SIP from SIP backup folder
  • Delete the SIP from /home/demo/receiveSIPbackup/.
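
The last two manual steps can be sketched in Python. The function names are hypothetical, and the check that the stored AIP exists before the backup is deleted is an added safeguard, not something this documentation requires.

```python
import shutil
from pathlib import Path


def store_aip(aip_path, store_dir):
    """Move the AIP into the archival storage folder and return its new path."""
    aip_path = Path(aip_path)
    store_dir = Path(store_dir)
    store_dir.mkdir(parents=True, exist_ok=True)
    target = store_dir / aip_path.name
    if target.exists():
        raise FileExistsError(f"{target} is already in archival storage")
    shutil.move(str(aip_path), str(target))
    return target


def delete_sip_backup(backup_sip_dir, stored_aip):
    """Delete the SIP backup, but only once the AIP is confirmed in storage."""
    if not Path(stored_aip).is_file():
        raise FileNotFoundError(f"{stored_aip} not found; keeping the SIP backup")
    shutil.rmtree(backup_sip_dir)
```

Deleting the backup only after the AIP is in /home/demo/storeAIP/ keeps a recoverable copy of the SIP available until the ingest has finished.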