Transfer backlog requirements
Handle cross-pipeline backlog
Summary: Transfers placed in backlog should be startable as SIPs (through the Appraisal tab, SIP arrange, or the Backlog tab) on a different pipeline from the one that sent them to backlog.
Problem: Much of the information needed during Ingest is stored only in the database. If a Transfer is put into backlog and ingest is started on a different pipeline, that information is not available.
Proposed fix: The Transfer METS file should be improved to contain all the information needed, so that it can be parsed back into the database at the start of ingest.
Currently, backlogged transfer data lives in four places: the pipeline database, the Elasticsearch transfers index, the Storage Service database, and the transfer METS file. These should be consolidated into a single canonical location from which the other sources can be rebuilt.
The METS file is treated as the canonical metadata store in the AIP and is a good choice for the canonical metadata store for a transfer in backlog. Since all of this information is needed in ingest to build the AIP METS file, we know that all of it can be stored in a METS file. We already parse the AIP METS file into the database on full reingest, so there are precedents and examples for doing this.
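As a rough illustration of parsing METS content back into database-ready records, the sketch below pulls PREMIS events out of a METS document into plain dictionaries. It is not the existing reingest code: the function names, the dictionary keys and the sample document are hypothetical, and the PREMIS namespace assumes PREMIS version 2.

```python
import xml.etree.ElementTree as ET

# The METS namespace is standard; the PREMIS version is an assumption.
NS = {
    "mets": "http://www.loc.gov/METS/",
    "premis": "info:lc/xmlns/premis-v2",
}


def _text(element, path):
    """Return the stripped text found at `path`, or "" if it is missing."""
    found = element.find(path, NS)
    return (found.text or "").strip() if found is not None else ""


def parse_premis_events(mets_xml):
    """Extract every PREMIS event in a METS document as a list of dicts.

    The dict keys are hypothetical and do not mirror real table columns.
    """
    root = ET.fromstring(mets_xml)
    events = []
    for event in root.iterfind(".//premis:event", NS):
        events.append({
            "uuid": _text(event, "premis:eventIdentifier/premis:eventIdentifierValue"),
            "type": _text(event, "premis:eventType"),
            "datetime": _text(event, "premis:eventDateTime"),
            "agent": _text(
                event,
                "premis:linkingAgentIdentifier/premis:linkingAgentIdentifierValue",
            ),
        })
    return events


# Minimal hypothetical transfer METS containing a single event.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:premis="info:lc/xmlns/premis-v2">
  <mets:amdSec>
    <mets:digiprovMD ID="digiprovMD_1">
      <mets:mdWrap MDTYPE="PREMIS:EVENT">
        <mets:xmlData>
          <premis:event>
            <premis:eventIdentifier>
              <premis:eventIdentifierType>UUID</premis:eventIdentifierType>
              <premis:eventIdentifierValue>35cbe00d-d661-4174-b11a-e203f5608008</premis:eventIdentifierValue>
            </premis:eventIdentifier>
            <premis:eventType>registration</premis:eventType>
            <premis:eventDateTime>2012-03-14</premis:eventDateTime>
          </premis:event>
        </mets:xmlData>
      </mets:mdWrap>
    </mets:digiprovMD>
  </mets:amdSec>
</mets:mets>"""

events = parse_premis_events(SAMPLE_METS)
```

Each dictionary could then be handed to whatever code writes the Event and Agent rows. Rights statements, file identification and characterization output would need similar extractors.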
Tables needed include:
- RightsStatement & related
- Events_agents (link between Event & Agent)
- Agents (older version of AM?)
- FilesIdentifiedIDs (file ID info)
- FilesIDs (more file ID info)
- main_fpcommandoutput (characterization)
- Jobs or Tasks may also be required for status checking?
This also needs to handle transfers put in backlog before the transfer METS was updated. Where the original pipeline still exists, this should be straightforward, as the data is all in its database. Where the pipeline no longer exists, the options are to fill in truncated, stub or default data, or to require that the transfer be re-run through transfer.
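The fallback order for older backlogged transfers can be sketched as a small decision function. All names here are hypothetical, and the two boolean inputs stand in for checks that would have to be implemented against the real METS file and pipeline.

```python
from enum import Enum


class MetadataSource(Enum):
    """Hypothetical labels for where ingest metadata would come from."""
    TRANSFER_METS = "parse the transfer METS"
    PIPELINE_DATABASE = "read the original pipeline database"
    STUB_OR_RERUN = "use stub/default data or re-run the transfer"


def choose_metadata_source(mets_is_complete, pipeline_exists):
    """Pick a metadata source for a backlogged transfer.

    Prefer the canonical METS file, then the original pipeline database,
    and only then fall back to stub data or a re-run.
    """
    if mets_is_complete:
        return MetadataSource.TRANSFER_METS
    if pipeline_exists:
        return MetadataSource.PIPELINE_DATABASE
    return MetadataSource.STUB_OR_RERUN
```

For example, a transfer backlogged before the METS update on a pipeline that has since been decommissioned would fall through to `MetadataSource.STUB_OR_RERUN`.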
Added in Archivematica 1.0
Transfer Backlog Management
- Related issues: Issue 951, Issue 1220, Issue 1141, Issue 1225, Issue 1257
Requirements for transfer backlog search
- Add ability to search transfer backlog and send one or more transfers to Ingest
- Add ability to download and/or view files/transfers (via right click)
- Search the following fields: Any field, transfer name, file name, accession number, PUID, Mimetype, Date - Ingest
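One way the search fields listed above could map onto an Elasticsearch request is sketched below. It only builds the query body as a dictionary and uses no client library. The index field names (`transfer_name`, `file_name`, `accession_number`, `puid`, `mimetype`, `ingest_date`) are assumptions, not the actual mapping.

```python
def build_backlog_query(any_field=None, transfer_name=None, file_name=None,
                        accession_number=None, puid=None, mimetype=None,
                        ingest_date_from=None, ingest_date_to=None):
    """Build an Elasticsearch query body for a transfer backlog search.

    Field names are hypothetical; every supplied criterion must match.
    """
    must = []

    # "Any field" becomes a free-text query across all indexed fields.
    if any_field:
        must.append({"query_string": {"query": any_field}})

    # Field-specific criteria become one match clause each.
    fields = {
        "transfer_name": transfer_name,
        "file_name": file_name,
        "accession_number": accession_number,
        "puid": puid,
        "mimetype": mimetype,
    }
    for field, value in fields.items():
        if value:
            must.append({"match": {field: value}})

    # The ingest date is searched as an optional inclusive range.
    date_range = {}
    if ingest_date_from:
        date_range["gte"] = ingest_date_from
    if ingest_date_to:
        date_range["lte"] = ingest_date_to
    if date_range:
        must.append({"range": {"ingest_date": date_range}})

    # With no criteria at all, return everything in the backlog.
    if not must:
        return {"query": {"match_all": {}}}
    return {"query": {"bool": {"must": must}}}


query = build_backlog_query(transfer_name="photos", puid="fmt/43")
```

Combining the clauses under `bool`/`must` treats the criteria as a logical AND. Whether the search form should AND or OR its fields is a design decision this page does not settle.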
Mockup of transfer backlog search
- Administration - allow MCP access to media or storage where transfer is located
- Assign accession number to transfer
- Remove transfer backup from workflow - no longer a processing configuration option
- Add Send transfer to backlog microservice
- Add Search transfer backlog tab from Ingest in Dashboard
- Add ability to download and/or view transfers and files from Search tab
- Add ability to send transfers from backlog search to Ingest/Create SIP (checkboxes, send button)
- see workflow diagrams below
0.9 Transfer workflow
- grey steps are automated, white are manual
Administration Tab in Dashboard
- Assign permission and access to the MCPServer to copy from transfer media (hard drives, optical media, USB, etc.) or network location.
- Assign transfer backlog locations (configuration is done outside of AM)
- Assign source directories
- Define transfer types
- Assign report locations (post-1.0)
- Set AIP storage location
- Set DIP upload location
- PREMIS Event = Registration
<event>
  <eventIdentifier>
    <eventIdentifierType>UUID</eventIdentifierType>
    <eventIdentifierValue>35cbe00d-d661-4174-b11a-e203f5608008</eventIdentifierValue>
  </eventIdentifier>
  <eventType>registration</eventType>
  <eventDateTime>2012-03-14</eventDateTime>
  <eventDetail></eventDetail>
  <eventOutcomeInformation>
    <eventOutcome></eventOutcome>
    <eventOutcomeDetail>
      <eventOutcomeDetailNote>accession#2012-029</eventOutcomeDetailNote>
    </eventOutcomeDetail>
  </eventOutcomeInformation>
  <linkingAgentIdentifier>
    <linkingAgentIdentifierType>archivist</linkingAgentIdentifierType>
    <linkingAgentIdentifierValue>Courtney Mumma</linkingAgentIdentifierValue>
  </linkingAgentIdentifier>
</event>
- Manually input metadata in template on dashboard (See File_Browser_Requirements) : accession number
- Agent is the archivist logged in at the time of the accession (post-1.0; for 1.0 this will still be the repository)
- Event name is "registration" (to be added to PREMIS events master list should we decide to implement)
- Also see Issue 787 on the Archivematica issues list
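As a sketch of how the registration event above might be generated from the accession number entered on the dashboard, the function below builds the same element structure with the standard library. The function name and its parameters are hypothetical, and the elements are left without a namespace to match the example.

```python
import uuid
import xml.etree.ElementTree as ET


def build_registration_event(accession_number, agent_name, event_date,
                             event_uuid=None):
    """Build a PREMIS-style "registration" event element.

    Mirrors the structure of the example event; a UUID is generated
    when none is supplied.
    """
    event = ET.Element("event")

    identifier = ET.SubElement(event, "eventIdentifier")
    ET.SubElement(identifier, "eventIdentifierType").text = "UUID"
    ET.SubElement(identifier, "eventIdentifierValue").text = (
        event_uuid or str(uuid.uuid4())
    )

    ET.SubElement(event, "eventType").text = "registration"
    ET.SubElement(event, "eventDateTime").text = event_date
    ET.SubElement(event, "eventDetail")

    # The accession number is recorded in the outcome detail note.
    outcome_info = ET.SubElement(event, "eventOutcomeInformation")
    ET.SubElement(outcome_info, "eventOutcome")
    detail = ET.SubElement(outcome_info, "eventOutcomeDetail")
    ET.SubElement(detail, "eventOutcomeDetailNote").text = (
        f"accession#{accession_number}"
    )

    # The linking agent is the archivist performing the accession.
    agent = ET.SubElement(event, "linkingAgentIdentifier")
    ET.SubElement(agent, "linkingAgentIdentifierType").text = "archivist"
    ET.SubElement(agent, "linkingAgentIdentifierValue").text = agent_name

    return event


event = build_registration_event(
    "2012-029", "Courtney Mumma", "2012-03-14",
    event_uuid="35cbe00d-d661-4174-b11a-e203f5608008",
)
xml_text = ET.tostring(event, encoding="unicode")
```

For 1.0, where the agent is still the repository, the caller would pass the repository name as `agent_name` and the agent type would need to change accordingly.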
Microservices Completed Before Move to Backlog
- All transfer microservices
- Indexing: See Transfer_and_SIP_creation#Transfer_indexing_requirements_0.9_and_beyond
Handling of Submission Documentation
- Normalized with objects in AIP (0.8)
- Upload submission documentation with transfer in transfer tab - Issue 1255
Search transfers from Archival Storage
New sponsored development planned for Archivematica 1.6 or later will allow users to manage the transfer backlog through the Archival Storage tab, as outlined in general workflows described in these diagrams:
As outlined above, users will be able to:
- Search transfers from archival storage tab
- Download copies of transfers or selected files from archival storage tab
- Request deletion of transfers from archival storage tab
Transfer search user stories
As an archivist, I need to find transfers by searching...
- by the name of the transfer
- by the date the transfer was stored in backlog
- by names of files within the transfer
Mockups: Version 1
Search transfers from Archival Storage:
- Click on "Show transfers" to search AIPs as well as Transfers
- A new column in the table indicates whether a package is a Transfer or an AIP.
- To trigger transfer deletion, click on red "remove" icon (same functionality as AIP deletion)
Search files from transfers in Archival Storage:
- Clicking on both "Show files" and "Show transfers" before searching will load a preview of files from the transfer backlog.
- The right column shows the UUID of the package and indicates whether the file is from a Transfer or an AIP.
Mockups: Version 2
In this version, toggling between searching for AIPs/Transfers is done through a tab at the top. This makes the development significantly less complicated, as we would not need to combine the Elasticsearch indexes for transfer METS and AIP METS.
Files within transfer search: