Improvements/Islandora

From Archivematica

Revision as of 19:32, 17 March 2016

Sections of this page have been copied and adapted from the Islandora Foundation's Archidora documentation under a Creative Commons Attribution-Share Alike 3.0 Unported License. Each of these sections is followed by a hyperlink to the original content in the Islandora wiki. We are grateful to the Islandora Foundation for both writing and sharing this documentation.

Synopsis

Archidora is a module that integrates the digital preservation functionality of Archivematica with Islandora. It was developed by Artefactual Systems and Discovery Garden, sponsored by the University of Saskatchewan Library.

User story

The goal of the Archidora module is to let Islandora users seamlessly preserve content ingested into Islandora by passing it through Archivematica's suite of digital preservation micro-services, which create preservation copies of that content for long-term storage. The Islandora user ingesting the content should not be required to mediate the transfer to Archivematica in any way. Upon completion of the transfer and ingest into Archivematica, a notification is sent back to Islandora indicating that the storage was successful.

Archidora-1.png

Status

The Archidora module was developed in 2014 and has been deployed at the University of Saskatchewan Library since 2015. Testing is ongoing.

The code is currently hosted on GitHub by the Islandora Foundation, but is not being actively maintained.

Analysis

Building on the basic workflow described in the [User story], above, the following detailed workflow was developed to describe how content is ingested from Islandora into Archivematica and stored.

  1. Content Upload
    1. Content is ingested into Islandora
    2. Drupal cron and Fedora content validation are used to trigger content upload to Archivematica
    3. One upload is created per Fedora Object
    4. Metadata includes an AIP ID generated either from user input or based on the collection metadata
    5. METS.xml is posted to the Archivematica URI
  2. Create Transfer
    1. Archivematica parses and validates the METS.xml
    2. If the METS is valid, Archivematica sends either a 201 (Created) or 412 (Precondition failed) response to Islandora; Islandora saves the EM-IRI from the response and uses it to notify the user if the transfer was successful
    3. Archivematica identifies which files to request from Islandora
    4. The transfer object is created
    5. The file URIs are passed to Archivematica for asynchronous processing
  3. Collect Files
    1. Files are retrieved from the Fedora REST API using HTTP GET and added to the transfer; if there is an error, an error response is returned
    2. Checksums are confirmed
  4. Transfer and Ingest
    1. Files are moved to the watchedDirectory
    2. Transfer and Ingest are completed in Archivematica, either manually or automatically
    3. A blank HTTP POST is sent to SE-IRI; if the HTTP POST is false, the transfer or ingest is still in progress
  5. Check Archivematica status
    1. Islandora sends an HTTP GET to the statement IRI to request the status of the transfer
  6. Status Response
    1. If the ingest is successful, Archivematica sends either a 201 (Created) or 412 (Precondition failed) response to Islandora, letting Islandora know that the last object in the AIP has been uploaded
  7. Content deleted from Islandora
    1. The Hi-Res datastream is deleted from Islandora (configurable), preserving only the access copy/ies of the content
    2. Islandora sends an HTTP POST to EM-IRI to indicate that the content has been deleted
  8. Log Islandora deletion
    1. Archivematica marks objects as deleted from the access system
    2. Search index is updated
    3. Archivematica sends either a 200 (OK) or 400 (error) response to Islandora
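
The status codes named in the workflow above can be summarized in a small lookup. The following is a minimal Python sketch, not code from Archidora or Archivematica: the names `Outcome`, `DEPOSIT_CODES`, `DELETION_CODES`, and `interpret_response` are hypothetical, and only the status codes and their meanings are taken from the steps listed above.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical labels for the responses named in the workflow."""
    CREATED = "created"
    PRECONDITION_FAILED = "precondition failed"
    DELETION_LOGGED = "deletion logged"
    DELETION_ERROR = "deletion error"
    UNEXPECTED = "unexpected"


# Codes Archivematica returns to Islandora in steps 2.2 and 6.1.
DEPOSIT_CODES = {
    201: Outcome.CREATED,
    412: Outcome.PRECONDITION_FAILED,
}

# Codes Archivematica returns to Islandora in step 8.3.
DELETION_CODES = {
    200: Outcome.DELETION_LOGGED,
    400: Outcome.DELETION_ERROR,
}


def interpret_response(step: str, status_code: int) -> Outcome:
    """Map a status code to an outcome for the given workflow step.

    `step` is either "deposit" (steps 2.2 and 6.1) or "deletion" (step 8.3).
    Codes the workflow does not mention are reported as UNEXPECTED.
    """
    if step == "deposit":
        table = DEPOSIT_CODES
    elif step == "deletion":
        table = DELETION_CODES
    else:
        raise ValueError(f"unknown workflow step: {step!r}")
    return table.get(status_code, Outcome.UNEXPECTED)
```

A caller on the Islandora side might, for example, treat `Outcome.CREATED` as the signal that it is safe to offer the high-resolution datastream for deletion, and treat anything else as a reason to retry or report an error. That policy is an assumption for illustration and is not specified by the workflow.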

Download

- Islandora Archidora documentation

Install

Installation and testing are similar to those for any Drupal module. Please see Installing the Islandora Enhancement Modules for details.

- Islandora Archidora documentation

Configure

In the Archivematica Storage Service:

  • Create a Space with access protocol FEDORA via SWORD2.
  • Create a Location within that Space (purpose = FEDORA deposits)
  • Enter the Fedora URL, username and password.
  • See Archivematica Storage Service documentation for more details.

In Islandora:

  • Enable cron.
  • Configure Archidora at admin/islandora/archidora.
    • Archivematica Storage Service Base URL - normally http://archivematica-url:8000
    • Deposit Location - will be configured automatically once storage service URL is entered
    • Archivematica User - Archivematica dashboard user to be used for Islandora integration (not storage service)
    • Archivematica API Key - API key for the Archivematica dashboard user listed above
  • Archivematica may also be configured to call back to Islandora to delete the high-res "OBJ" datastreams. This is set up in the Storage Service under Administration > Service callbacks:
    • URI: http://islandora-base-url/islandora/object/<source_id>/archidora/{Islandora API}/delete
      • Where {Islandora API} is the "Islandora Archivematica integration API key" listed/generated on the Archidora admin screen
    • Event: post-store
    • Method: post
    • Expected status: 200
  • Note: the OBJ datastreams are not deleted automatically, but rather are listed at the collection level (or compound object level) on the Manage | Archivematica tab. They can be deleted individually or in bulk.
  • Collection-level configuration:
    • Check off "Don't Archive Children" to stop objects from being sent to Archivematica for a particular collection.

- Islandora Archidora documentation
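
The service callback settings listed above can be assembled programmatically before being entered in the Storage Service. The following is a minimal Python sketch under stated assumptions: the function name `build_delete_callback` and the example host and key are hypothetical, and the sketch assumes that `<source_id>` is left in the URI as a literal placeholder, as it appears in the URI shown above. Only the URI pattern, event, method, and expected status come from this page.

```python
from urllib.parse import quote

# URI pattern from the configuration list above; <source_id> is kept
# literally (assumption: it is a placeholder filled in at callback time).
CALLBACK_TEMPLATE = "{base}/islandora/object/<source_id>/archidora/{api_key}/delete"


def build_delete_callback(islandora_base_url: str, api_key: str) -> dict:
    """Return the values to enter under Service callbacks.

    `islandora_base_url` is the Islandora site root and `api_key` is the
    "Islandora Archivematica integration API key" from the Archidora
    admin screen.
    """
    base = islandora_base_url.rstrip("/")  # avoid a double slash in the URI
    uri = CALLBACK_TEMPLATE.format(base=base, api_key=quote(api_key, safe=""))
    return {
        "uri": uri,
        "event": "post-store",
        "method": "post",
        "expected_status": 200,
    }


if __name__ == "__main__":
    # Hypothetical host and key, for illustration only.
    settings = build_delete_callback("http://islandora.example.org/", "abc123")
    for name, value in settings.items():
        print(f"{name}: {value}")
```

Percent-encoding the key with `quote` is a defensive choice for keys that contain characters that are not URL-safe; a key made up only of letters and digits passes through unchanged.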


Scope

Archidora is still considered a beta feature. As such, further development is likely required to bring it to stability.

Other proposed improvements include the following:

  • Ingest PREMIS from Islandora into Archivematica
  • Ingest other metadata (e.g., DDI)
  • Ingest Bags
  • Format Policy Registry Integration
  • Asynchronous derivative generation
  • Integrate with checksum checker
  • Provide more information back to Islandora (e.g., AIP URL)
  • Support other workflows (e.g., upload from Archivematica to Islandora)
  • Support Fedora for AIP Storage
  • Improve reporting/logging in Archidora

No estimates have been prepared for the above.

Interest

Please feel free to add your or your organization's name and any comments to this section if you have an interest in improving this module.

Artefactual would like to see active development on Archidora. We are able to do the development work, for a fee. We are also willing to assist others to complete all or part of the work required.

Interested in talking to us about sponsoring development of Archidora? Get in touch with Artefactual at info@artefactual.org or with Discovery Garden at info@discoverygarden.ca.

Interested in contributing code to the Archidora project? Create a pull request!