Difference between revisions of "Improvements/aipreadme"
(Adding Self Describing AIP suggestion)
Revision as of 11:08, 18 June 2017
User story
As a repository manager, I would like AIPs to be as self-describing as possible, so that future users with little or no knowledge of Archivematica, or of what an AIP is, will be able to understand the structure and contents of the AIPs I produce now.
Status
2017-06-01 - New Proposal
Interest
If you'd like to get involved in this development, please feel free to contribute to this wiki page or start a discussion on our user forum.
Analysis:
Currently, Archivematica AIPs are structured as a Bag (https://tools.ietf.org/html/draft-kunze-bagit-14) and contain a METS file, which describes the contents of the AIP. Details about the Archivematica AIP structure are here: https://www.archivematica.org/en/docs/archivematica-1.6/user-manual/archival-storage/aip-structure/
METS files are machine-readable, but they are not a human-friendly format.
Adding a human-readable index or description to an AIP would improve the chances of a future user understanding its structure.
Archivematica structures AIPs in a specific way, but that structure is not documented within the AIP itself. Adding explicit documentation about the structure would help users understand it and test that AIPs are valid.
There is a similar proposal outlined here: https://github.com/UTS-eResearch/datacrate
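As an illustration of what "testing that an AIP is valid" could look like once the structure is written down, here is a minimal Python sketch that checks an unpacked AIP for the pieces mentioned on this page. The function name `check_aip_layout` is hypothetical, and the sketch assumes the METS file sits directly in data/ with a name beginning with "METS"; the linked AIP structure documentation is the authority on the actual layout.

```python
from pathlib import Path


def check_aip_layout(aip_root):
    """Return a list of human-readable problems found in an unpacked AIP."""
    root = Path(aip_root)
    problems = []

    # BagIt requires a bag declaration and at least one payload manifest.
    if not (root / "bagit.txt").is_file():
        problems.append("missing bagit.txt")
    if not list(root.glob("manifest-*.txt")):
        problems.append("missing payload manifest (manifest-<algorithm>.txt)")

    # BagIt also requires a payload directory named data/.
    data = root / "data"
    if not data.is_dir():
        problems.append("missing data/ directory")
        return problems

    # Assumption: the METS file is directly in data/ and is named METS*.xml.
    if not list(data.glob("METS*.xml")):
        problems.append("no METS file found in data/")
    # The objects directory holds the originals and preservation versions.
    if not (data / "objects").is_dir():
        problems.append("missing data/objects/ directory")
    return problems


if __name__ == "__main__":
    import sys

    for problem in check_aip_layout(sys.argv[1]):
        print(problem)
```

This only checks that the expected files and directories exist. It does not verify checksums or parse the METS file, which a full validation would need to do.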
Use case: Add a README to each AIP
In the data/ directory (beside the METS file), add a README.html or README.md file. This is intended to be the first file opened by a person trying to examine an AIP.
The README file would include:
- some boilerplate text, describing what an AIP is
- links to the Archivematica documentation, to METS documentation, to PREMIS docs, etc.
- a link to the METS file
- optionally a link to a CATALOG.html file, that includes more detailed information about the contents of the AIP.
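The list above could be turned into a generated README.md along the following lines. This is a sketch only, not how Archivematica does or would do it: the function names, the boilerplate wording, and the choice of documentation URLs are all assumptions made for illustration.

```python
from pathlib import Path

# Illustrative links; a real implementation would choose and maintain its own.
DOC_LINKS = {
    "Archivematica documentation": "https://www.archivematica.org/en/docs/",
    "METS": "https://www.loc.gov/standards/mets/",
    "PREMIS": "https://www.loc.gov/standards/premis/",
}

# Placeholder boilerplate; the sample README text on this page would go here.
BOILERPLATE = (
    "This readme file describes the basic structure of an AIP "
    "generated by Archivematica."
)


def build_readme(mets_filename, has_catalog=False):
    """Return README.md text linking to the METS file and reference docs."""
    lines = ["# About this AIP", "", BOILERPLATE, "", "## Further reading", ""]
    for label, url in DOC_LINKS.items():
        lines.append(f"- [{label}]({url})")
    lines += ["", "## Contents", "", f"- [METS file]({mets_filename})"]
    if has_catalog:
        lines.append("- [Detailed catalog](CATALOG.html)")
    return "\n".join(lines) + "\n"


def write_readme(aip_root):
    """Write data/README.md beside the METS file of an unpacked AIP."""
    data = Path(aip_root) / "data"
    # Assumption: the METS file is directly in data/ and is named METS*.xml.
    mets_files = sorted(data.glob("METS*.xml"))
    if not mets_files:
        raise FileNotFoundError(f"no METS file found in {data}")
    has_catalog = (data / "CATALOG.html").is_file()
    readme = data / "README.md"
    readme.write_text(build_readme(mets_files[0].name, has_catalog), encoding="utf-8")
    return readme
```

Because the README lives in the Bag's data/ directory, it is part of the payload and must be listed in the payload manifest. It would therefore need to be written before the manifests are generated, or the manifests would need to be regenerated afterwards.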
Sample README file text
This readme file describes the basic structure of an AIP generated by Archivematica.
Acronyms
- AIP = Archival Information Package
- METS = Metadata Encoding and Transmission Standard
- PDI = Preservation Description Information
- PREMIS = Preservation Metadata: Implementation Strategies
- OAIS = Open Archival Information System
- UUID = Universally Unique Identifier
What is Archivematica?
Archivematica is an open-source suite of tools designed to ingest diverse digital content and prepare AIPs for long-term storage. Once an AIP is generated it is not dependent on Archivematica for retrieval, and can be opened using any standard file browser. The concept of an AIP is derived from the ISO 14721:2012 Reference Model for an Open Archival Information System (OAIS), which defines it as “[a]n Information Package, consisting of the Content Information and the associated Preservation Description Information (PDI), which is preserved within an OAIS.”
Content Information
In an Archivematica AIP, the Content Information consists primarily of the originally ingested digital objects and any preservation versions of the objects created to mitigate the risk of format obsolescence over time. The preservation copies typically have the same filenames as the original objects, but with different file extensions and with UUIDs appended to the filename. For example, for an original file named BBhelmet.ai the preservation version may be named BBhelmet-e3a3988d-8149-49ea-adc5-c255fb68d4f9.pdf.
The originally ingested digital objects and any preservation versions are located in the objects directory of the AIP. The objects directory will contain nested subdirectories if these were included in the original transfer or added during SIP arrangement. It also includes a submissiondocumentation folder and may include a metadata folder. The submissiondocumentation folder contains documentation such as donor agreements and transfer forms, if included in the original transfer, as well as a METS file that records the contents of the original transfer(s) from which the AIP was created. The metadata folder holds any metadata files included in the original transfer and any OCR text files generated during processing.
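The naming convention described in the sample text (original name, then a hyphen and a UUID, then a new extension) can be recognised mechanically. The Python sketch below is a heuristic based only on that description; the function names are hypothetical, and an original file whose own name happens to end in a UUID would be misread. The AIP's METS file, which describes the contents, remains the better source for how files relate to each other.

```python
import re
from pathlib import PurePosixPath

# A filename stem that ends with "-<uuid>", e.g. "BBhelmet-e3a3988d-...".
UUID_SUFFIX = re.compile(
    r"^(?P<stem>.+)-(?P<uuid>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}"
    r"-[0-9a-f]{4}-[0-9a-f]{12})$",
    re.IGNORECASE,
)


def split_preservation_name(filename):
    """Return (original_stem, uuid, extension), or None if no UUID suffix."""
    path = PurePosixPath(filename)
    match = UUID_SUFFIX.match(path.stem)
    if match is None:
        return None
    return match.group("stem"), match.group("uuid"), path.suffix


def pair_with_originals(filenames):
    """Map each original filename to a preservation copy with the same stem."""
    originals = {
        PurePosixPath(name).stem: name
        for name in filenames
        if split_preservation_name(name) is None
    }
    pairs = {}
    for name in filenames:
        parts = split_preservation_name(name)
        if parts is not None and parts[0] in originals:
            pairs[originals[parts[0]]] = name
    return pairs
```

For the example in the text, `pair_with_originals` would pair BBhelmet.ai with BBhelmet-e3a3988d-8149-49ea-adc5-c255fb68d4f9.pdf when both names are passed in.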
Use case: Create and Use a Bag Profile
https://github.com/ruebot/bagit-profiles
Archivematica could define a Bag profile and reference it in the AIPs it produces. This would make AIPs easier for software to identify and validate.
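To make the idea concrete, the Python sketch below builds a small profile document and the bag-info.txt line that would reference it. The key names follow the bagit-profiles specification linked above as understood at the time of writing and should be verified against it. Every value (the organization name, version, manifest algorithm, and the example.org URL in the usage note) is a hypothetical placeholder, not an actual Archivematica profile.

```python
import json


def build_profile(identifier_url):
    """Return a minimal BagIt profile as a dict (illustrative values only)."""
    return {
        "BagIt-Profile-Info": {
            # URL where the profile itself would be published.
            "BagIt-Profile-Identifier": identifier_url,
            "Source-Organization": "Example Repository",
            "External-Description": "Hypothetical profile for Archivematica AIPs",
            "Version": "0.1",
        },
        # Tags that must appear in bag-info.txt for a bag to conform.
        "Bag-Info": {
            "Source-Organization": {"required": True},
        },
        # Assumed choices; a real profile would match what the AIPs contain.
        "Manifests-Required": ["sha256"],
        "Accept-BagIt-Version": ["0.97"],
    }


def bag_info_line(identifier_url):
    """Return the bag-info.txt line that declares conformance to a profile."""
    return f"BagIt-Profile-Identifier: {identifier_url}"


if __name__ == "__main__":
    url = "https://example.org/profiles/archivematica-aip.json"
    print(json.dumps(build_profile(url), indent=2))
    print(bag_info_line(url))
```

A tool receiving an AIP could then read the BagIt-Profile-Identifier tag from bag-info.txt, fetch the profile, and check the Bag against it without any Archivematica-specific knowledge.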