Difference between revisions of "AIP re-ingest"

From Archivematica
 
(25 intermediate revisions by 4 users not shown)

Latest revision as of 16:14, 11 February 2020


This page is no longer being maintained and may contain inaccurate information. Please see the Archivematica documentation for up-to-date information.

This page documents requirements for retrieving an Archivematica AIP from archival storage and re-ingesting it for further processing.

Use cases

  • Updating DC metadata (version 1.5)
  • Updating rights metadata (version 1.5)
  • Adding new DC metadata files (version 1.5)
  • Adding new submission documentation (unsponsored)
  • Adding new digital objects (unsponsored)
  • Deleting original digital objects (unsponsored)
  • Redoing format identification/validation/characterization (version 1.6)
  • Redoing metadata extraction (version 1.6)
  • Redoing preservation normalization (version 1.6)
  • Generating new DIP (version 1.5)
  • Running OCR on re-ingest (version 1.6)
  • Rerunning other micro-services (Examine Contents, Transfer Structure Report) (unsponsored)
  • Extracting packages and then deleting the packages (unsponsored)
  • Sending AIP to backlog and re-arranging/selecting from contents (unsponsored)

Full re-ingest

To be supported in version 1.6. Sends an AIP to the beginning of transfer so that all micro-services are run on the AIP again; the AIP can also be re-normalized for preservation and access if desired.

AIP re-ingest Preservation and metadata.png

Metadata re-ingest

Supported in version 1.5. Sends an AIP to the beginning of ingest to allow the user to update metadata.

AIP re-ingest workflows - Metadata only.png

Partial re-ingest

Supported in version 1.5. Sends an AIP to the beginning of ingest to allow the user to update metadata and normalize for access.

AIP re-ingest objects with metadata.png
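
The three re-ingest types differ in where the AIP re-enters the pipeline and in which normalization steps become available. The sketch below encodes that comparison as data. The class names, labels and the helper function are hypothetical and are not taken from Archivematica; only the entry points and capabilities come from the descriptions above.

```python
from dataclasses import dataclass
from enum import Enum


class EntryPoint(Enum):
    TRANSFER = "beginning of transfer"
    INGEST = "beginning of ingest"


@dataclass(frozen=True)
class ReingestType:
    label: str
    entry_point: EntryPoint
    access_normalization: bool
    preservation_normalization: bool


# Hypothetical labels; the capabilities follow the three descriptions above.
FULL = ReingestType("full", EntryPoint.TRANSFER, True, True)
METADATA = ReingestType("metadata", EntryPoint.INGEST, False, False)
PARTIAL = ReingestType("partial", EntryPoint.INGEST, True, False)


def smallest_reingest_type(need_access_normalization, need_preservation_normalization):
    """Pick the least invasive re-ingest type that covers the requested work."""
    if need_preservation_normalization:
        return FULL
    if need_access_normalization:
        return PARTIAL
    return METADATA
```

Under these assumptions, a request that only needs a new DIP maps to the partial type, while anything that involves preservation normalization needs a full re-ingest.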

New micro-services

New micro-services for all workflows

  • Retrieve AIP from archival storage
  • Place AIP in active transfers for processing
  • Extract AIP contents and run BagIt checks
  • Approve AIP re-ingest (new approval micro-service)
  • Re-use existing file UUIDs
  • Identify new metadata files
  • Validate schemas in new metadata files
  • Update METS file
  • Replace AIP
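
As an illustration of the "run BagIt checks" step, the following minimal sketch verifies the payload checksums listed in a bag's manifest once the AIP has been extracted. It assumes the standard BagIt layout with a `manifest-sha256.txt` file at the top of the bag; the function name is hypothetical, and this is not Archivematica's own implementation.

```python
import hashlib
from pathlib import Path


def verify_bag_manifest(bag_dir, algorithm="sha256"):
    """Return the payload paths that are missing or whose checksum does not match."""
    bag = Path(bag_dir)
    manifest = bag / f"manifest-{algorithm}.txt"
    failures = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Each manifest line is "<checksum> <path relative to the bag root>".
        expected, rel_path = line.split(maxsplit=1)
        target = bag / rel_path
        if not target.is_file():
            failures.append(rel_path)
            continue
        # Reads the whole file into memory; large payloads would need chunked hashing.
        digest = hashlib.new(algorithm, target.read_bytes()).hexdigest()
        if digest != expected.lower():
            failures.append(rel_path)
    return failures
```

An empty result means every file listed in the manifest is present and unchanged. The sketch does not check tag files or detect payload files that are absent from the manifest.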

METS versioning

  • Versioning will be captured via METS file updates.
  • METS file updates will be handled through the STATUS, CREATED and GROUPID attributes in the various METS sections. See for example Element <dmdSec> at http://www.loc.gov/standards/mets/docs/mets.v1-9.html.
  • This means that there will only ever be one AIP METS file, but it will contain both superseded and current metadata, along with versioning information for all updates.

dmdSec

  • The first dmdSec created will be marked as status="original", and updated/revised dmdSecs will be marked as status="updated". The timestamp (created="[date/time]") is also updated.
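
The dmdSec rule can be sketched with Python's standard `xml.etree.ElementTree` module. The function name and the ID values are hypothetical. The attributes are written in upper case (STATUS, CREATED) because that is how the METS schema spells them. The sketch only creates the empty wrapper element and leaves the descriptive metadata itself out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

METS_NS = "http://www.loc.gov/METS/"
DMDSEC = f"{{{METS_NS}}}dmdSec"
METSHDR = f"{{{METS_NS}}}metsHdr"


def add_updated_dmdsec(mets_root, new_id, created=None):
    """Append a dmdSec marked STATUS="updated"; earlier dmdSecs are left untouched."""
    if created is None:
        created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    existing = mets_root.findall(DMDSEC)
    new = ET.Element(DMDSEC, {"ID": new_id, "CREATED": created, "STATUS": "updated"})
    if existing:
        # Keep all dmdSecs together, with the newest one last.
        position = list(mets_root).index(existing[-1]) + 1
    else:
        header = mets_root.find(METSHDR)
        position = list(mets_root).index(header) + 1 if header is not None else 0
    mets_root.insert(position, new)
    return new
```

Because the original dmdSec is never modified, a reader of the METS file can still see the metadata as it was first ingested.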

amdSec

techMD

  • After re-ingest, the original techMD will retain its PREMIS:Object metadata and will be marked status="superseded".
  • The next techMD will be marked status="current" and will contain the PREMIS:Object metadata generated on re-ingest, as well as all PREMIS events from each ingest and re-ingest. This shows the complete set of actions taken upon the objects since they came into the repository.
  • The timestamp (created="[date/time]") is also updated.
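
A minimal sketch of the techMD status change, again using `xml.etree.ElementTree`. The function name and IDs are hypothetical, and the PREMIS content that would sit inside each techMD is omitted; the sketch only covers the STATUS and CREATED bookkeeping described above.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
TECHMD = f"{{{METS_NS}}}techMD"


def supersede_techmd(amdsec, new_id, created):
    """Mark every existing techMD as superseded and add a new current one after them."""
    existing = amdsec.findall(TECHMD)
    for tech in existing:
        tech.set("STATUS", "superseded")
    new = ET.Element(TECHMD, {"ID": new_id, "CREATED": created, "STATUS": "current"})
    # techMD elements come first inside an amdSec, so insert right after the last one.
    position = list(amdsec).index(existing[-1]) + 1 if existing else 0
    amdsec.insert(position, new)
    return new
```

Calling the function once per re-ingest leaves exactly one techMD marked as current, with all earlier versions kept in place as superseded.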

rightsMD

  • After re-ingest, the original rightsMD will have the status="superseded" and the revised rightsMD will have the status="current".
  • The timestamp (created="[date/time]") is also updated.
  • Altering either the rights statement or the act(s) will result in a new rightsMD.
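
From the reading side, a consumer of the METS file needs to skip superseded rights statements. The sketch below shows one way to do that. It assumes that a rightsMD with no STATUS attribute (for example in an AIP that has never been re-ingested) should be treated as live; that assumption, the function name and the sample IDs are not taken from this page.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def live_sections(amdsec, tag="rightsMD"):
    """Return the sections of the given type that are not marked as superseded."""
    return [
        section
        for section in amdsec.findall(f"{{{METS_NS}}}{tag}")
        if section.get("STATUS") != "superseded"
    ]


sample = ET.fromstring(
    f'<amdSec xmlns="{METS_NS}">'
    '<rightsMD ID="rightsMD_1" STATUS="superseded"/>'
    '<rightsMD ID="rightsMD_2" STATUS="current"/>'
    "</amdSec>"
)
current_ids = [section.get("ID") for section in live_sections(sample)]
```

The same helper works for techMD by passing `tag="techMD"`.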

fileSec

The fileSec does not indicate its status, but should reflect the files currently in the AIP. For example, if normalization is not performed on the first ingest, but is performed on re-ingest, the fileSec in the re-ingested AIP will include preservation derivatives.
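
One way to check this after a re-ingest is to group the file IDs in the fileSec by the USE attribute of their fileGrp. The sketch below assumes that originals and preservation derivatives are kept in fileGrp elements labelled "original" and "preservation"; those labels, the function name and the sample IDs are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

METS_NS = "http://www.loc.gov/METS/"


def file_ids_by_use(mets_root):
    """Map each fileGrp USE value to the IDs of the files listed in that group."""
    groups = defaultdict(list)
    for group in mets_root.iter(f"{{{METS_NS}}}fileGrp"):
        use = group.get("USE", "")
        for file_el in group.findall(f"{{{METS_NS}}}file"):
            groups[use].append(file_el.get("ID"))
    return dict(groups)


sample = ET.fromstring(
    f'<mets xmlns="{METS_NS}"><fileSec>'
    '<fileGrp USE="original"><file ID="file-1"/></fileGrp>'
    '<fileGrp USE="preservation"><file ID="file-2"/></fileGrp>'
    "</fileSec></mets>"
)
by_use = file_ids_by_use(sample)
```

Comparing the result before and after re-ingest would show whether preservation derivatives were added.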