Bag ingest
Latest revision as of 15:41, 11 February 2020
This page is no longer being maintained and may contain inaccurate information. Please see the Archivematica documentation for up-to-date information.
Feature description
Archivematica accepts transfers packaged in accordance with the BagIt specification.
Requirements
- All standard BagIt checks are run: verifyvalid, checkpayloadoxum, verifycomplete, verifypayloadmanifests, verifytagmanifests.
- Archivematica differentiates between mandatory and optional bag elements, so a bag that lacks optional elements does not fail the verification micro-service.
- The BagIt checks generate log files that are added to the logs directory of the transfer.
- The BagIt file manifest (manifest-sha512.txt) is placed in the metadata directory of the transfer.
- The other BagIt files (bag-info.txt, bagit.txt, tagmanifest-md5.txt) are placed in a logs/BagIt directory.
- No new PREMIS events are required. The BagIt checks are recorded as a fixity check in PREMIS.
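As an illustration of one of the checks listed above, the following is a minimal sketch of a Payload-Oxum comparison that treats the element as optional. It is not Archivematica's implementation; the function names `compute_payload_oxum` and `check_payload_oxum` and the "pass"/"fail"/"skipped" result strings are assumptions made for this example. It relies only on the BagIt convention that Payload-Oxum is the total payload size in octets, a period, and the number of payload files under data/.

```python
import os


def compute_payload_oxum(bag_dir):
    """Return the 'octets.files' string for everything under <bag_dir>/data."""
    total_bytes = 0
    file_count = 0
    payload_dir = os.path.join(bag_dir, "data")
    for root, _dirs, files in os.walk(payload_dir):
        for name in files:
            total_bytes += os.path.getsize(os.path.join(root, name))
            file_count += 1
    return f"{total_bytes}.{file_count}"


def check_payload_oxum(bag_dir, declared_oxum):
    """Compare a declared Payload-Oxum value with the payload on disk.

    Payload-Oxum is an optional bag-info.txt element, so a missing value
    (None) is reported as "skipped" (hypothetical label) and not as a failure.
    """
    if declared_oxum is None:
        return "skipped"
    actual = compute_payload_oxum(bag_dir)
    return "pass" if actual == declared_oxum.strip() else "fail"
```

Returning a distinct "skipped" result keeps a missing optional element from being confused with a failed check in the logs.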
Workflow
In this workflow diagram, the white ovals are manual steps and the grey ovals are automated steps.
[Figure: BagIt.png, BagIt workflow diagram]
Parse and index contents of bag-info.txt
- Enhancements being developed in 2015
Parse bag-info.txt contents to AIP METS file
- Labels in the bag-info.txt file are serialized as XML in a METS sourceMD section, linked to the objects directory of the AIP.
- Sample bag-info.txt (from https://tools.ietf.org/html/draft-kunze-bagit-10):
    Source-Organization: Spengler University
    Organization-Address: 1400 Elm St., Cupertino, California, 95014
    Contact-Name: Edna Janssen
    Contact-Phone: +1 408-555-1212
    Contact-Email: ej@spengler.edu
    External-Description: Uncompressed greyscale TIFF images from the Yoshimuri papers colle...
    Bagging-Date: 2008-01-15
    External-Identifier: spengler_yoshimuri_001
    Bag-Size: 260 GB
    Payload-Oxum: 279164409832.1198
    Bag-Group-Identifier: spengler_yoshimuri
    Bag-Count: 1 of 15
    Internal-Sender-Identifier: /storage/images/yoshimuri
    Internal-Sender-Description: Uncompressed greyscale TIFFs created from microfilm and are...
- Sample AIP METS file result:
    <mets:amdSec ID="amdSec_14">
      <mets:sourceMD ID="sourceMD_1">
        <mets:mdWrap MDTYPE="OTHER" OTHERMDTYPE="BagIt">
          <mets:xmlData>
            <transfer_metadata>
              <Source-Organization>Spengler University</Source-Organization>
              <Organization-Address>1400 Elm St., Cupertino, California, 95014</Organization-Address>
              <Contact-Name>Edna Janssen</Contact-Name>
              <Contact-Phone>+1 408-555-1212</Contact-Phone>
              <Contact-Email>ej@spengler.edu</Contact-Email>
              <External-Description> Uncompressed greyscale TIFF images from the Yoshimuri papers colle...</External-Description>
              <Bagging-Date>2008-01-15</Bagging-Date>
              <External-Identifier>spengler_yoshimuri_001</External-Identifier>
              <Bag-Size>260 GB</Bag-Size>
              <Payload-Oxum>279164409832.1198</Payload-Oxum>
              <Bag-Group-Identifier>spengler_yoshimuri</Bag-Group-Identifier>
              <Bag-Count>1 of 15</Bag-Count>
              <Internal-Sender-Identifier>/storage/images/yoshimuri</Internal-Sender-Identifier>
              <Internal-Sender-Description>Uncompressed greyscale TIFFs created from microfilm and are...</Internal-Sender-Description>
            </transfer_metadata>
          </mets:xmlData>
        </mets:mdWrap>
      </mets:sourceMD>
    </mets:amdSec>
- When BagIt labels contain characters that are not valid in XML element names, processing continues, but an error message is printed and the labels with invalid content are skipped.
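The mapping from bag-info.txt labels to child elements of transfer_metadata, including the skip-on-invalid-label behaviour, can be sketched roughly as follows. This is an illustration under stated assumptions and not the code Archivematica uses: `parse_bag_info`, `build_transfer_metadata`, and the `XML_NAME` pattern are hypothetical names, and the pattern is a simplified ASCII-only approximation of what XML allows in an element name. The sketch also omits the surrounding mets:sourceMD wrapper.

```python
import re
import sys
import xml.etree.ElementTree as ET

# Simplified, ASCII-only approximation of a valid XML element name (assumption).
XML_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def parse_bag_info(text):
    """Parse bag-info.txt content into a list of (label, value) pairs."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and pairs:
            # Continuation line: fold it into the previous value.
            label, value = pairs[-1]
            pairs[-1] = (label, value + " " + line.strip())
            continue
        label, sep, value = line.partition(":")
        if sep:
            pairs.append((label.strip(), value.strip()))
    return pairs


def build_transfer_metadata(pairs):
    """Build a <transfer_metadata> element, skipping labels that are not valid names."""
    root = ET.Element("transfer_metadata")
    for label, value in pairs:
        if not XML_NAME.fullmatch(label):
            # Report the problem but keep processing the remaining labels.
            print(f"Skipping invalid XML label: {label!r}", file=sys.stderr)
            continue
        ET.SubElement(root, label).text = value
    return root
```

A list of pairs is used instead of a dictionary because bag-info.txt may repeat a label, and each occurrence should produce its own element.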
Search contents in archival storage tab
- Add keyword field "Transfer metadata" to the drop-down menu in search. This searches all the contents of the <transfer_metadata> container in the METS file (as indexed in Elasticsearch).
- Add keyword field "Transfer metadata (other)" to the drop-down menu in search. This allows users to search individual fields in the <transfer_metadata> container.
  - When the user selects "Transfer metadata (other)", a separate box appears in which the user enters the label of the specific field to be searched.
- Add the ability to search date ranges.
  - To search on a date range in <transfer_metadata> or one of its sub-fields, the user enters two dates in ISO date format separated by a colon. For example, "2015-01-03:2015-04-14".
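The colon-separated date range input described above could be handled along these lines. This is a hedged sketch only: `parse_date_range` and `build_range_query` are hypothetical helper names, the field name in the usage note is an assumption about how the index might be laid out, and the returned dictionary simply follows the general shape of an Elasticsearch range query.

```python
from datetime import date


def parse_date_range(text):
    """Split 'YYYY-MM-DD:YYYY-MM-DD' into a (start, end) pair of dates."""
    start_text, sep, end_text = text.partition(":")
    if not sep:
        raise ValueError("expected two ISO dates separated by a colon")
    # date.fromisoformat raises ValueError on malformed input.
    start = date.fromisoformat(start_text.strip())
    end = date.fromisoformat(end_text.strip())
    if start > end:
        raise ValueError("start date is after end date")
    return start, end


def build_range_query(field, text):
    """Build a range query body (inclusive bounds) for one field."""
    start, end = parse_date_range(text)
    return {
        "range": {
            field: {"gte": start.isoformat(), "lte": end.isoformat()}
        }
    }
```

For example, `build_range_query("transfer_metadata.Bagging-Date", "2015-01-03:2015-04-14")` would produce a query body limited to that field, where the field path is purely illustrative.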