Transfer backlog requirements
This page is no longer being maintained and may contain inaccurate information. Please see the Archivematica documentation (https://www.archivematica.org/docs/latest/) for up-to-date information.
Category: Feature requirements
Latest revision as of 16:27, 11 February 2020
Proposed improvements
Handle cross-pipeline backlog
March 2017
Summary: It should be possible to start a backlogged transfer as a SIP (through the Appraisal tab, SIP arrange, or Backlog tab) on a pipeline other than the one that placed it in backlog.
Problem: Much of the information needed during Ingest is stored only in the pipeline database. If a Transfer is put into backlog and ingest is started on a different pipeline, that information is not available.
Proposed fix: The Transfer METS file should be improved to contain all information needed, which could be parsed back into the database at the start of ingest.
Currently, backlogged transfer data lives in four places: the pipeline database, the Elasticsearch transfers index, the storage service database, and the transfer METS file. This should be consolidated into a single canonical location from which the other sources can be rebuilt.
The METS file is treated as the canonical metadata store in the AIP, which makes it a good choice for the canonical metadata store for a transfer in backlog. Since all of the information needed in ingest is destined for the AIP METS file, we know it can all be stored in a METS file. We already parse the AIP METS file into the database on full reingest, so there is precedent, and there are examples, for doing this.
Tables needed include:
- Transfers
- Files
- DublinCore
- RightsStatement & related
- Events
- Events_agents (link between Event & Agent)
- Agents (older version of AM?)
- FilesIdentifiedIDs (file ID info)
- FilesIDs (more file ID info)
- main_fpcommandoutput (characterization)
- Jobs or Tasks may also be required for status checking?
This also needs to handle transfers put in backlog before the transfer METS was updated. Backlogged transfers whose pipeline still exists should be straightforward to handle, as the data is all in the database. For backlogged transfers whose pipeline no longer exists, truncated, stub, or default data could be used, or the transfer could be required to be re-run through transfer processing.
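The proposed fix relies on reading metadata back out of the transfer METS at the start of ingest. The following is a minimal sketch of that idea for the Events table only, not the actual Archivematica reingest parser. It assumes the events are wrapped as PREMIS 2 elements inside the METS; the sample document, the function name `events_from_mets`, and the row keys are hypothetical, and a real implementation would write to the database rather than return dictionaries.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for METS and PREMIS 2.
NS = {
    "mets": "http://www.loc.gov/METS/",
    "premis": "info:lc/xmlns/premis-v2",
}

# Hypothetical, heavily trimmed transfer METS holding a single event.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                      xmlns:premis="info:lc/xmlns/premis-v2">
  <mets:amdSec ID="amdSec_1">
    <mets:digiprovMD ID="digiprovMD_1">
      <mets:mdWrap MDTYPE="PREMIS:EVENT">
        <mets:xmlData>
          <premis:event>
            <premis:eventIdentifier>
              <premis:eventIdentifierType>UUID</premis:eventIdentifierType>
              <premis:eventIdentifierValue>35cbe00d-d661-4174-b11a-e203f5608008</premis:eventIdentifierValue>
            </premis:eventIdentifier>
            <premis:eventType>registration</premis:eventType>
            <premis:eventDateTime>2012-03-14</premis:eventDateTime>
          </premis:event>
        </mets:xmlData>
      </mets:mdWrap>
    </mets:digiprovMD>
  </mets:amdSec>
</mets:mets>"""


def events_from_mets(mets_xml):
    """Return one dict per PREMIS event found anywhere in a METS document."""
    root = ET.fromstring(mets_xml)
    rows = []
    for event in root.iterfind(".//premis:event", NS):
        rows.append({
            "uuid": event.findtext(
                "premis:eventIdentifier/premis:eventIdentifierValue",
                namespaces=NS,
            ),
            "type": event.findtext("premis:eventType", namespaces=NS),
            "date": event.findtext("premis:eventDateTime", namespaces=NS),
        })
    return rows


# Rows that a pipeline could insert into its own Events table.
ROWS = events_from_mets(SAMPLE_METS)
```

The same pattern would need to be repeated for the other tables listed above, and a transfer backlogged before the METS was extended would simply yield fewer rows, which is where stub or default data would come in.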
Original requirements
Added in Archivematica 1.0
Transfer Backlog Management
- Related issues: Issue 951, Issue 1220, Issue 1141, Issue 1225, Issue 1257
Requirements for transfer backlog search
- Add ability to search transfer backlog and send one or more transfers to Ingest
- Add ability to download and/or view files/transfers (via right click)
- Search the following fields: Any field, transfer name, file name, accession number, PUID, Mimetype, Date - Ingest
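As a rough illustration of how the search fields above might map onto an Elasticsearch query body, the sketch below builds a standard `bool` query from the labels a user fills in. The index field names in `FIELD_MAP` and `DATE_FIELD` are hypothetical, as is the function name; the real transfer index mapping may differ.

```python
# Hypothetical index field names; the real transfer index mapping may differ.
FIELD_MAP = {
    "transfer name": "transfer_name",
    "file name": "filename",
    "accession number": "accession_id",
    "PUID": "puid",
    "Mimetype": "mimetype",
}
DATE_FIELD = "ingest_date"  # hypothetical name for the "Date - Ingest" field


def build_backlog_query(terms, date_range=None):
    """Build an Elasticsearch query body from search-form input.

    terms maps a field label from the requirements list to the search text.
    date_range is an optional (start, end) pair of ISO date strings.
    """
    clauses = []
    for label, text in terms.items():
        if label == "Any field":
            # query_string with no explicit fields searches the default fields.
            clauses.append({"query_string": {"query": text}})
        else:
            clauses.append({"match": {FIELD_MAP[label]: text}})
    if date_range is not None:
        start, end = date_range
        clauses.append({"range": {DATE_FIELD: {"gte": start, "lte": end}}})
    # Every clause must match for a transfer to be returned.
    return {"query": {"bool": {"must": clauses}}}


QUERY = build_backlog_query(
    {"transfer name": "donor files", "PUID": "fmt/43"},
    date_range=("2012-01-01", "2012-12-31"),
)
```

Combining the clauses with `must` gives AND semantics; using `should` instead would give OR semantics, which is a design choice the requirements do not settle.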
Mockup of transfer backlog search
[Figure: 1.0_TransferBacklogSearch.png]
[Figure: 1.0_TransBacklogSearchResults.png]
Transfer Workflow
- Administration - allow MCP access to media or storage where transfer is located
- Assign accession number to transfer
- Remove transfer backup from workflow - no longer a processing configuration option
- Add Send transfer to backlog microservice
- Add Search transfer backlog tab from Ingest in Dashboard
- Add ability to download and/or view transfers and files from Search tab
- Add ability to send transfers from backlog search to Ingest/Create SIP (checkboxes, send button)
- see workflow diagrams below
0.9 Transfer workflow
- grey steps are automated, white are manual
Administration Tab in Dashboard
- Assign permission and access to the MCPServer to copy from transfer media (hard drives, optical media, USB, etc.) or network location.
- Assign transfer backlog locations (configuration is done outside of AM)
- Assign source directories
- Define transfer types
- Assign report locations (post-1.0)
- Set AIP storage location
- Set DIP upload location
Accession metadata
- PREMIS Event = Registration
  <event>
    <eventIdentifier>
      <eventIdentifierType>UUID</eventIdentifierType>
      <eventIdentifierValue>35cbe00d-d661-4174-b11a-e203f5608008</eventIdentifierValue>
    </eventIdentifier>
    <eventType>registration</eventType>
    <eventDateTime>2012-03-14</eventDateTime>
    <eventDetail></eventDetail>
    <eventOutcomeInformation>
      <eventOutcome></eventOutcome>
      <eventOutcomeDetail>
        <eventOutcomeDetailNote>accession#2012-029</eventOutcomeDetailNote>
      </eventOutcomeDetail>
    </eventOutcomeInformation>
    <linkingAgentIdentifier>
      <linkingAgentIdentifierType>archivist</linkingAgentIdentifierType>
      <linkingAgentIdentifierValue>Courtney Mumma</linkingAgentIdentifierValue>
    </linkingAgentIdentifier>
  </event>
- Manually input metadata in template on dashboard (See File_Browser_Requirements) : accession number
- Agent is the archivist logged in at the time doing the accession (post-1.0, for 1.0 this will still be repository)
- Event name is "registration" (to be added to PREMIS events master list should we decide to implement)
- UUID
- Also see Issue 787 on the Archivematica issues list
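To make the shape of the registration event concrete, here is a minimal sketch that builds the same element structure as the example above from an accession number and an agent. The function name and parameters are hypothetical and this is not Archivematica's own event-writing code; it uses no namespaces, matching the example as written.

```python
import uuid
import xml.etree.ElementTree as ET


def build_registration_event(accession_number, agent_type, agent_value,
                             date, event_uuid=None):
    """Build a PREMIS-style registration event like the example above."""
    event = ET.Element("event")

    identifier = ET.SubElement(event, "eventIdentifier")
    ET.SubElement(identifier, "eventIdentifierType").text = "UUID"
    # Generate a fresh UUID unless the caller supplies one.
    ET.SubElement(identifier, "eventIdentifierValue").text = (
        event_uuid or str(uuid.uuid4())
    )

    ET.SubElement(event, "eventType").text = "registration"
    ET.SubElement(event, "eventDateTime").text = date
    ET.SubElement(event, "eventDetail")  # left empty, as in the example

    outcome_info = ET.SubElement(event, "eventOutcomeInformation")
    ET.SubElement(outcome_info, "eventOutcome")  # left empty, as in the example
    detail = ET.SubElement(outcome_info, "eventOutcomeDetail")
    # The accession number is carried in the outcome detail note.
    ET.SubElement(detail, "eventOutcomeDetailNote").text = (
        "accession#" + accession_number
    )

    agent = ET.SubElement(event, "linkingAgentIdentifier")
    ET.SubElement(agent, "linkingAgentIdentifierType").text = agent_type
    ET.SubElement(agent, "linkingAgentIdentifierValue").text = agent_value
    return event


EVENT = build_registration_event(
    "2012-029", "archivist", "Courtney Mumma", "2012-03-14",
    event_uuid="35cbe00d-d661-4174-b11a-e203f5608008",
)
EVENT_XML = ET.tostring(EVENT, encoding="unicode")
```

For 1.0, where the agent is still the repository rather than the logged-in archivist, only the `agent_type` and `agent_value` arguments would change.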
Microservices Completed Before Move to Backlog
- All transfer microservices
- Indexing: See Transfer_and_SIP_creation#Transfer_indexing_requirements_0.9_and_beyond
Handling of Submission Documentation
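- TAPER (http://sites.tufts.edu/dca/about-us/research-initiatives/taper-tufts-accessioning-program-for-electronic-records/)?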
- Normalized with objects in AIP (0.8)
- Upload submission documentation with transfer in transfer tab - Issue 1255
Search transfers from Archival Storage
New sponsored development planned for Archivematica 1.6 or later will allow users to manage the transfer backlog through the Archival Storage tab, following the general workflows shown in this diagram:
[Figure: Transfer_management_workflows.png]
As outlined above, users will be able to:
- Search transfers from the Archival Storage tab
- Download copies of transfers or selected files from the Archival Storage tab
- Request deletion of transfers from the Archival Storage tab
Transfer search user stories
As an archivist, I need to find transfers by searching...
- by the name of the transfer
- by the date the transfer was stored in backlog
- by names of files within the transfer
- by....?
Mockups: Version 1
Search transfers from Archival Storage:
[Figure: Archival_Storage_Transfer_Search.png]
Notes:
- Click on "Show transfers" to search AIPs as well as Transfers
- A new column in the table indicates whether a package is a Transfer or an AIP.
- To trigger transfer deletion, click on red "remove" icon (same functionality as AIP deletion)
Search files from transfers in Archival Storage:
[Figure: Archival_Storage_Transfer_file_search.png]
Notes:
- Selecting both "Show files" and "Show transfers" before searching will load a preview of files from the transfer backlog.
- The right column shows the UUID of the package and indicates whether the file is from a Transfer or an AIP.
Mockups: Version 2
In this version, users toggle between searching AIPs and searching Transfers with a tab at the top. This makes development significantly less complicated, as we would not need to combine the Elasticsearch indexes for transfer METS and AIP METS.
Transfer search:
[Figure: Transfer_search_v2.png]
Files within transfer search:
[Figure: Transfer_search_files_v2.png]