Metadata elements
Revision as of 18:04, 31 August 2010 by Evelyn McLellan
This page identifies a minimum set of metadata elements designed to ensure authenticity and interoperability of preserved objects and to facilitate their retrieval.
This process involves:
1. Using the InterPARES Chain of Preservation (COP) model and the COP/PREMIS crosswalk to identify required elements for objects preserved by Archivematica
2. Analyzing existing metadata in the Archivematica AIP log files and METS.xml file in order to map them to METS and PREMIS elements
3. Comparing the results of steps 1 and 2 in order to determine what gaps exist in Archivematica
4. Filling in the gaps, e.g. by modifying the workflow to produce and/or capture missing elements
5. Structuring the required elements into the Repository eXchange Package (RXP) specification
6. Determining what metadata belongs in the DIP(s)
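The gap comparison described in the list above amounts to a set difference between the required elements and the elements already captured. The following is a minimal sketch of that idea only. Both element sets are illustrative placeholders, not the project's actual lists, and `find_gaps` is a hypothetical helper name.

```python
# Illustrative placeholder sets; NOT the project's actual element lists.
required = {
    "objectIdentifierValue",
    "messageDigest",
    "messageDigestAlgorithm",
    "originalName",
    "eventDateTime",
}
captured = {
    "objectIdentifierValue",
    "messageDigest",
    "originalName",
}


def find_gaps(required, captured):
    """Return the required elements that are not yet captured, sorted by name."""
    return sorted(required - captured)


gaps = find_gaps(required, captured)
print(gaps)
```

Keeping both lists as sets means the comparison can be rerun unchanged whenever the workflow is modified to capture another element.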
Map of Archivematica 0.6 metadata to PREMIS elements
Source: /data/logs/MD5checksum.txt | |||||
Process: Produced when quarantine period expires. Provides checksums for each object in the SIP. Note that if zipped files are present, a checksum is generated for the zipped file and not for each object within it. | |||||
Description | PREMIS entity | PREMIS semantic unit | PREMIS semantic component | Sample value(s) | |
---|---|---|---|---|---|
Checksum | Object | 1.5.2 fixity | 1.5.2.2 messageDigest | 326e0206ae83f815e4be5f28464f6ac6 | |
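As an illustration of the checksum step described above, the following sketch computes an MD5 digest for every file under a directory. It is not Archivematica's own script: the function names and the temporary directory standing in for a SIP are assumptions made for the example. A zipped file is hashed as a single file, matching the note above.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_manifest(sip_dir):
    """Map each file path (relative to sip_dir) to its MD5 digest.

    A zipped file is hashed as one file; its members are not opened.
    """
    root = Path(sip_dir)
    return {
        path.relative_to(root).as_posix(): md5_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


# Usage on a throwaway directory standing in for a SIP (hypothetical content).
with tempfile.TemporaryDirectory() as sip_dir:
    (Path(sip_dir) / "Syllabus_final.doc").write_bytes(b"example content")
    manifest = checksum_manifest(sip_dir)

print(manifest)
```

Each digest in the resulting manifest would populate 1.5.2.2 messageDigest for the corresponding object.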
Source: /data/logs/filenameCleanup.log | |||||
Process: Produced when quarantine period expires, prior to unpacking of any zipped files. If prohibited characters were present in filenames, provides crosswalk between original and "cleaned up" filenames. | |||||
Description | PREMIS entity | PREMIS semantic unit | PREMIS semantic component | Sample value(s) | |
Original filename | Object | 1.6 originalName | none | Syllabus final.doc | |
Cleaned-up filename | Event | 2.5.2 eventOutcomeDetail | 2.5.2.1 eventOutcomeDetailNote | Syllabus_final.doc | |
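The filename crosswalk described above can be sketched as follows. The set of prohibited characters is an assumption made for the example (anything other than ASCII letters, digits, ".", "_" and "-"); the actual rules used by Archivematica may differ. The function names are hypothetical.

```python
import re

# Assumption: every character outside this allowed set counts as prohibited.
# Archivematica's actual rule set may differ.
PROHIBITED = re.compile(r"[^A-Za-z0-9._-]")


def clean_filename(name):
    """Replace each prohibited character with an underscore."""
    return PROHIBITED.sub("_", name)


def cleanup_crosswalk(names):
    """Return {original: cleaned} for the names that were changed."""
    return {
        name: clean_filename(name)
        for name in names
        if clean_filename(name) != name
    }


crosswalk = cleanup_crosswalk(["Syllabus final.doc", "notes.txt"])
print(crosswalk)  # {'Syllabus final.doc': 'Syllabus_final.doc'}
```

Only changed names appear in the crosswalk: the key corresponds to 1.6 originalName and the value to the eventOutcomeDetailNote entry.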
Source: /data/logs/fileUUIDs.log | |||||
Process: Produced after prohibited characters are removed from filenames and any zipped files have been unpacked. Provides a crosswalk between cleaned-up filenames and UUIDs. | |||||
Description | PREMIS entity | PREMIS semantic unit | PREMIS semantic component | Sample value(s) | |
Universally unique identifier (UUID) | Object | 1.1 objectIdentifier | 1.1.2 objectIdentifierValue | 270bd067-0483-4c5f-bdec-f2cbd6e651aa | |
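A crosswalk between cleaned-up filenames and UUIDs, as described above, might be built as in the sketch below. It assumes random (version 4) UUIDs, which is consistent with the form of the sample value but is not stated on this page. `assign_uuids` is a hypothetical helper name.

```python
import uuid


def assign_uuids(filenames):
    """Return {filename: uuid_string}, with a new random UUID per file."""
    # Assumption: version 4 (random) UUIDs are used.
    return {name: str(uuid.uuid4()) for name in filenames}


mapping = assign_uuids(["Syllabus_final.doc", "notes.txt"])
for name, identifier in mapping.items():
    print(name, identifier)
```

Each value would be recorded as 1.1.2 objectIdentifierValue for the object it identifies.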
Source: /data/logs/FITS-[UUID]-[SIP].xml (FITS output reports) | |||||
Process: Produced when the FITS tool identifies and validates formats and extracts technical metadata | |||||
FITS element | PREMIS entity | PREMIS semantic unit | PREMIS semantic component | Sample value(s) | |
format | Object | 1.5.4.1 formatDesignation | 1.5.4.1.1 formatName | | |
version | Object | 1.5.4.1 formatDesignation | 1.5.4.1.2 formatVersion | 6.0 | |
externalIdentifier | Object | 1.5.4.2 formatRegistry | 1.5.4.2.2 formatRegistryKey | fmt/10 | |
Size | Object | 1.5 objectCharacteristics | 1.5.3 size | 125968 | |
ImageWidth | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 2464 | |
ImageHeight | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 3248 | |
BitsPerSample | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 1 | |
SamplesPerPixel | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 3 | |
XResolution | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 300 | |
YResolution | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 300 | |
duration | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 0:2:26:16 | |
bitDepth | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 16 | |
sampleRate | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 48000.0 | |
channels | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 2 | |
aes:channelAssignment | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | | |
VideoFrameRate | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | | |
AspectRatio | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 1:1 | |
PageCount | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 16 | |
WordCount | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 876 | |
Paragraphs | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 19 | |
Slides | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 27 | |
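The FITS mapping above can be expressed as a lookup table. The sketch below assumes the FITS values have already been extracted into a plain dictionary keyed by element name. It does not parse FITS XML, and `map_fits_values` is a hypothetical helper. Elements not listed in the table are skipped.

```python
# Lookup built from the FITS table on this page.
FORMAT_ELEMENTS = {
    "format": ("1.5.4.1 formatDesignation", "1.5.4.1.1 formatName"),
    "version": ("1.5.4.1 formatDesignation", "1.5.4.1.2 formatVersion"),
    "externalIdentifier": ("1.5.4.2 formatRegistry", "1.5.4.2.2 formatRegistryKey"),
    "Size": ("1.5 objectCharacteristics", "1.5.3 size"),
}
SIGNIFICANT_PROPERTIES = {
    "ImageWidth", "ImageHeight", "BitsPerSample", "SamplesPerPixel",
    "XResolution", "YResolution", "duration", "bitDepth", "sampleRate",
    "channels", "aes:channelAssignment", "VideoFrameRate", "AspectRatio",
    "PageCount", "WordCount", "Paragraphs", "Slides",
}
SIGNIFICANT_TARGET = ("1.4 significantProperties", "1.4.2 significantPropertiesValue")


def map_fits_values(values):
    """Return (element, semantic unit, semantic component, value) rows.

    `values` is a dict of already-extracted FITS values (an assumption of
    this sketch). Elements missing from the table are ignored.
    """
    rows = []
    for element, value in values.items():
        if element in FORMAT_ELEMENTS:
            unit, component = FORMAT_ELEMENTS[element]
        elif element in SIGNIFICANT_PROPERTIES:
            unit, component = SIGNIFICANT_TARGET
        else:
            continue
        rows.append((element, unit, component, value))
    return rows


for row in map_fits_values({"version": "6.0", "ImageWidth": "2464"}):
    print(row)
```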
Source: /data/logs/normalization.log | |||||
Process: Produced during normalization to preservation and access formats | |||||
Description | PREMIS entity | PREMIS semantic unit | PREMIS semantic component | Sample value(s) | |
Name of normalization tool | Agent | 3.2 agentName | none | FFmpeg version SVN-r19352-4:0.5+svn20090706-2ubuntu2.2 | |
Event description | Event | 2.2 eventType | none | Normalizing | |
Processing status | Event | 2.5 eventOutcomeInformation | 2.5.1 eventOutcome | Processing completed | |
Normalization result | Event | 2.5.2 eventOutcomeDetail | 2.5.2.1 eventOutcomeDetailNote | | |
Source: /data/logs/MD5checksum.txtprepareAIP_check.log | |||||
Process: Produced after the file normalization process. Checks that checksums for files in the SIP have not changed during normalization. | |||||
Description | PREMIS entity | PREMIS semantic unit | PREMIS semantic component | Sample value(s) | |
Pass/fail notification | Event | 2.5 eventOutcomeInformation | 2.5.1 eventOutcome | | |
Source: /data/logs/AIP.MD5checksum.txt | |||||
Process: Produced during the BagIt process. Provides checksums for the AIP and for each original and normalized file in the AIP. | |||||
Description | PREMIS entity | PREMIS semantic unit | PREMIS semantic component | Sample value(s) | |
AIP checksum | Object | 1.5.2 fixity | 1.5.2.2 messageDigest | 12b86e038bf0bddd5aba110c35f288b8 | |
File checksum | Object | 1.5.2 fixity | 1.5.2.2 messageDigest | 326e0206ae83f815e4be5f28464f6ac6 | |
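The post-normalization check described above compares checksums recorded earlier with checksums computed later. A minimal sketch of that comparison follows. The "pass" and "fail" strings are assumptions, since this page gives no sample value for the notification, and `verify_checksums` is a hypothetical helper. The digests are the sample values shown on this page.

```python
def verify_checksums(expected, actual):
    """Compare two {filename: digest} manifests.

    Returns ("pass", []) if every expected file has the same digest in
    `actual`, otherwise ("fail", [names of missing or changed files]).
    """
    failed = sorted(
        name for name, digest in expected.items()
        if actual.get(name) != digest
    )
    return ("fail" if failed else "pass"), failed


before = {"Syllabus_final.doc": "326e0206ae83f815e4be5f28464f6ac6"}
unchanged = dict(before)
changed = {"Syllabus_final.doc": "12b86e038bf0bddd5aba110c35f288b8"}

print(verify_checksums(before, unchanged))  # ('pass', [])
print(verify_checksums(before, changed))    # ('fail', ['Syllabus_final.doc'])
```

The first element of the result corresponds to 2.5.1 eventOutcome; the list of failed files could be recorded as outcome detail.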