Metadata elements

From Archivematica
Revision as of 11:23, 31 August 2010 by Evelyn McLellan

This page identifies a minimum set of metadata elements designed to ensure authenticity and interoperability of preserved objects and to facilitate their retrieval.

Identifying these elements involves:

  1. Using the CoP model and the CoP/PREMIS crosswalk to identify required elements for objects preserved by Archivematica
  2. Analyzing existing metadata in the Archivematica AIP log files and METS.xml file in order to map it to METS and PREMIS elements
  3. Comparing the results of steps 1 and 2 in order to determine what gaps exist in Archivematica
  4. Filling in the gaps, e.g. by modifying the workflow to produce and/or capture missing elements
  5. Structuring the required elements into the Repository eXchange Package (RXP) specification
  6. Determining what metadata belongs in the DIP(s)
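
The comparison in step 3 amounts to a set difference between the required elements and the elements already captured. The following is a minimal sketch of that idea; both element sets are hypothetical placeholders chosen for illustration, not the actual output of steps 1 and 2.

```python
# Hypothetical element sets for illustration only. The real lists would come
# from the CoP/PREMIS crosswalk (step 1) and the log/METS analysis (step 2).
required = {
    "objectIdentifierValue",
    "messageDigest",
    "messageDigestAlgorithm",
    "formatName",
    "originalName",
    "eventDateTime",
}
captured = {
    "objectIdentifierValue",
    "messageDigest",
    "formatName",
    "originalName",
}


def find_gaps(required_elements, captured_elements):
    """Return the required elements that are not yet captured, sorted by name."""
    return sorted(set(required_elements) - set(captured_elements))


gaps = find_gaps(required, captured)
print(gaps)
```

Each element reported as a gap would then be a candidate for a workflow change in step 4.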

Map of Archivematica 0.6 metadata to PREMIS elements

Source: /data/logs/MD5checksum.txt
Process: Produced when quarantine period expires. Provides checksums for each object in the SIP. Note that if zipped files are present, a checksum is generated for the zipped file and not for each object within it.
Description | PREMIS entity | PREMIS semantic unit | PREMIS semantic component | Sample value(s)
Checksum | Object | 1.5.2 fixity | 1.5.2.2 messageDigest | 326e0206ae83f815e4be5f28464f6ac6
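
As an illustration of how such a checksum list could be produced, here is a minimal sketch that computes an MD5 digest for every file under a directory. The function names `md5_of_file` and `checksum_manifest` are hypothetical and are not taken from Archivematica. A zipped file is treated like any other file, so it receives one checksum for the archive as a whole.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_manifest(sip_dir):
    """Map each file's path (relative to sip_dir) to its MD5 digest."""
    root = Path(sip_dir)
    return {
        path.relative_to(root).as_posix(): md5_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Each digest in the returned dictionary corresponds to a PREMIS messageDigest value for one object.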
Source: /data/logs/filenameCleanup.log
Process: Produced when quarantine period expires, prior to unpacking of any zipped files. If prohibited characters were present in filenames, provides crosswalk between original and "cleaned up" filenames.
Description | PREMIS entity | PREMIS semantic unit | PREMIS semantic component | Sample value(s)
Original filename | Object | 1.6 originalName | none | Syllabus final.doc
Cleaned-up filename | Event | 2.5.2 eventOutcomeDetail | 2.5.2.1 eventOutcomeDetailNote | Syllabus_final.doc
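
The crosswalk between original and cleaned-up filenames can be sketched as follows. The set of permitted characters used here (ASCII letters, digits, period, underscore and hyphen) is an assumption made for illustration; the page does not state which characters Archivematica actually prohibits.

```python
import re

# Assumption: anything outside this whitelist counts as a prohibited character.
PROHIBITED = re.compile(r"[^A-Za-z0-9._-]")


def clean_filename(name):
    """Replace each prohibited character with an underscore."""
    return PROHIBITED.sub("_", name)


def cleanup_crosswalk(names):
    """Return (original, cleaned) pairs for the names that were changed."""
    pairs = []
    for name in names:
        cleaned = clean_filename(name)
        if cleaned != name:
            pairs.append((name, cleaned))
    return pairs
```

Under this assumption, `clean_filename("Syllabus final.doc")` yields `Syllabus_final.doc`, matching the sample values in the table. The first item of each pair would populate originalName and the second the eventOutcomeDetailNote.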
Source: /data/logs/fileUUIDs.log
Process: Produced after prohibited characters are removed from filenames and any zipped files have been unpacked. Provides a crosswalk between cleaned-up filenames and UUIDs.
Description | PREMIS entity | PREMIS semantic unit | PREMIS semantic component | Sample value(s)
Universally unique identifier (UUID) | Object | 1.1 objectIdentifier | 1.1.2 objectIdentifierValue | 270bd067-0483-4c5f-bdec-f2cbd6e651aa
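
A crosswalk of this kind can be sketched with Python's standard `uuid` module. The function name `assign_uuids` is hypothetical, and the use of random (version 4) UUIDs is an assumption based on the layout of the sample value above rather than a documented Archivematica behaviour.

```python
import uuid


def assign_uuids(filenames):
    """Map each cleaned-up filename to a newly generated random UUID string."""
    return {name: str(uuid.uuid4()) for name in filenames}


crosswalk = assign_uuids(["Syllabus_final.doc", "sample.wav"])
for filename, identifier in crosswalk.items():
    print(identifier, filename)
```

Each generated string would serve as the objectIdentifierValue for the corresponding object.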
Source: /data/logs/FITS-[UUID]-[SIP].xml (FITS output reports)
Process: Produced when the FITS tool identifies and validates formats and extracts technical metadata.
FITS element | PREMIS entity | PREMIS semantic unit | PREMIS semantic component | Sample value(s)
format | Object | 1.5.4.1 formatDesignation | 1.5.4.1.1 formatName |
  • Tagged Image File Format
  • Waveform Audio
  • Microsoft Powerpoint Presentation
version | Object | 1.5.4.1 formatDesignation | 1.5.4.1.2 formatVersion | 6.0
externalIdentifier | Object | 1.5.4.2 formatRegistry | 1.5.4.2.2 formatRegistryKey | fmt/10
Size | Object | 1.5 objectCharacteristics | 1.5.3 size | 125968
ImageWidth | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 2464
ImageHeight | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 3248
BitsPerSample | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 1
SamplesPerPixel | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 3
XResolution | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 300
YResolution | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 300
duration | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 0:2:26:16
bitDepth | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 16
sampleRate | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 48000.0
channels | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 2
aes:channelAssignment | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue |
  • channelNum="0" mapLocation="LEFT"
  • channelNum="1" mapLocation="RIGHT"
PageCount | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 16
WordCount | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 876
Paragraphs | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 19
Slides | Object | 1.4 significantProperties | 1.4.2 significantPropertiesValue | 27
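
The mapping in this table can be expressed as a lookup structure. The sketch below is one possible encoding, assuming the FITS values have already been extracted into a dictionary keyed by element name; the function name `map_fits_values` and the example key `compression` are hypothetical. Elements with no entry in the table are reported separately rather than silently dropped.

```python
SIGNIFICANT = ("1.4 significantProperties", "1.4.2 significantPropertiesValue")

# FITS element name -> (PREMIS semantic unit, PREMIS semantic component),
# transcribed from the table above.
FITS_TO_PREMIS = {
    "format": ("1.5.4.1 formatDesignation", "1.5.4.1.1 formatName"),
    "version": ("1.5.4.1 formatDesignation", "1.5.4.1.2 formatVersion"),
    "externalIdentifier": ("1.5.4.2 formatRegistry", "1.5.4.2.2 formatRegistryKey"),
    "Size": ("1.5 objectCharacteristics", "1.5.3 size"),
}
for _name in (
    "ImageWidth", "ImageHeight", "BitsPerSample", "SamplesPerPixel",
    "XResolution", "YResolution", "duration", "bitDepth", "sampleRate",
    "channels", "aes:channelAssignment", "PageCount", "WordCount",
    "Paragraphs", "Slides",
):
    FITS_TO_PREMIS[_name] = SIGNIFICANT


def map_fits_values(values):
    """Split FITS name/value pairs into PREMIS-mapped records and unmapped names."""
    mapped, unmapped = [], []
    for element, value in values.items():
        target = FITS_TO_PREMIS.get(element)
        if target is None:
            unmapped.append(element)
        else:
            unit, component = target
            mapped.append((element, unit, component, value))
    return mapped, unmapped
```

Calling `map_fits_values({"format": "Waveform Audio", "sampleRate": "48000.0"})` returns one record per element, each carrying the PREMIS semantic unit and component from the table.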
Source: /data/logs/normalization.log
Process: Produced during normalization to preservation and access formats.
Description | PREMIS entity | PREMIS semantic unit | PREMIS semantic component | Sample value(s)
Name of normalization tool | Agent | 3.2 agentName | none | FFmpeg version SVN-r19352-4:0.5+svn20090706-2ubuntu2.2
Event description | Event | 2.2 eventType | none | Normalizing
Processing status | Event | 2.5 eventOutcomeInformation | 2.5.1 eventOutcome | Processing completed
Normalization result | Event | 2.5.2 eventOutcomeDetail | 2.5.2.1 eventOutcomeDetailNote |
  • Already in preservation format. No need to normalize.
  • No default normalization tool defined.
  • Output #0, wav, to '/tmp/MultimediaSIP-9ece5881-640e-4bdc-9863-4ff50046a0bd/objects/sample.wav': Stream #0.0: Audio: pcm_s16le, 8000 Hz, stereo, s16, 256 kb/s
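
To show how the Event rows above nest, the following sketch builds a PREMIS-style event element with Python's standard `xml.etree.ElementTree`. It is a simplified illustration: the function name `build_event` is hypothetical, the PREMIS namespace is omitted, and the eventIdentifier and eventDateTime units that a schema-valid PREMIS event also requires are left out because the log mapping above does not cover them.

```python
import xml.etree.ElementTree as ET


def build_event(event_type, outcome, detail_note):
    """Build a simplified, non-namespaced PREMIS-style event element."""
    event = ET.Element("event")
    ET.SubElement(event, "eventType").text = event_type
    info = ET.SubElement(event, "eventOutcomeInformation")
    ET.SubElement(info, "eventOutcome").text = outcome
    detail = ET.SubElement(info, "eventOutcomeDetail")
    ET.SubElement(detail, "eventOutcomeDetailNote").text = detail_note
    return event


event = build_event(
    "Normalizing",
    "Processing completed",
    "Already in preservation format. No need to normalize.",
)
print(ET.tostring(event, encoding="unicode"))
```

The nesting mirrors the numbering in the table: eventOutcome (2.5.1) and eventOutcomeDetail (2.5.2) both sit inside eventOutcomeInformation (2.5), and eventOutcomeDetailNote (2.5.2.1) sits inside eventOutcomeDetail.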
Source: /data/logs/MD5checksum.txtprepareAIP_check.log
Process: Produced after file normalization process. Checks that checksums for files in the SIP have not changed during normalization.
Description | PREMIS entity | PREMIS semantic unit | PREMIS semantic component | Sample value(s)
Pass/fail notification | Event | 2.5 eventOutcomeInformation | 2.5.1 eventOutcome |
  • PASSED
  • FAILED
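
A check of this kind can be sketched as a comparison between previously recorded checksums and freshly computed ones. The function name `check_manifest` and the manifest structure (a dictionary from relative path to MD5 digest) are assumptions for illustration; the page does not describe how Archivematica implements the check. For brevity the sketch reads each file into memory at once, which would need chunked reading for large files.

```python
import hashlib
from pathlib import Path


def check_manifest(root, manifest):
    """Return 'PASSED' if every listed file still has its recorded MD5, else 'FAILED'."""
    for relative_path, expected in manifest.items():
        target = Path(root) / relative_path
        if not target.is_file():
            return "FAILED"  # a missing file counts as a failure
        actual = hashlib.md5(target.read_bytes()).hexdigest()
        if actual != expected.lower():
            return "FAILED"
    return "PASSED"
```

The returned string corresponds to the eventOutcome sample values in the table.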
Source: /data/logs/AIP.MD5checksum.txt
Process: Produced during BagIt process. Provides checksums for the AIP and for each original and normalized file in the AIP.
Description | PREMIS entity | PREMIS semantic unit | PREMIS semantic component | Sample value(s)
AIP checksum | Object | 1.5.2 fixity | 1.5.2.2 messageDigest | 12b86e038bf0bddd5aba110c35f288b8
File checksum | Object | 1.5.2 fixity | 1.5.2.2 messageDigest | 326e0206ae83f815e4be5f28464f6ac6