METS

From Archivematica
Revision as of 16:05, 6 October 2011 by Evelyn

Requirements

  • Every METS file will have the same basic generic structure, whatever kind of transfer the AIP was derived from. Certain types of transfers, such as those from DSpace, will add requirements on top of that basic structure.

Generic transfer

This section shows a sample METS structure for an ingested SIP containing the following:

  • /objects
    • LAND2.BMP
    • lion.svg
    • /More images
      • MARBLES.TGA

<fileSec>

The fileSec is broken into two fileGrps, one for original files and one for preservation copies:

  • <fileGrp USE="original">
  • <fileGrp USE="preservation">

Example:

  • <fileGrp USE="original">
    • <file ID="LAND2.BMP-[UUID]" GROUPID="G1" ADMID="digiprov-LAND2.BMP-[UUID]"><FLocat xlink:href="objects/LAND2.BMP" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/></file>
  • <fileGrp USE="preservation">
    • <file ID="LAND2-[UUID].tif-[UUID]" GROUPID="G1" ADMID="LAND2-[UUID].tif-[UUID]"><FLocat xlink:href="objects/LAND2-[UUID].tif" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/></file>

Note the GROUPID="G1": the original file and its normalized version share the same GROUPID value, which links them. Also note that the objects in the submissionDocumentation folder are treated in the same way as ingested objects.
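The GROUPID link can be followed programmatically. The sketch below is only an illustration, not Archivematica code: it parses a small fileSec fragment with Python's standard xml.etree.ElementTree and pairs each original with its preservation copy by GROUPID. The sample IDs (UUID1, UUID2, UUID3) are placeholders standing in for [UUID], and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for METS and XLink.
NS = {"mets": "http://www.loc.gov/METS/", "xlink": "http://www.w3.org/1999/xlink"}

# Sample fragment modelled on the example above; UUID1..UUID3 are placeholders.
SAMPLE = """\
<fileSec xmlns="http://www.loc.gov/METS/" xmlns:xlink="http://www.w3.org/1999/xlink">
  <fileGrp USE="original">
    <file ID="LAND2.BMP-UUID1" GROUPID="G1" ADMID="digiprov-LAND2.BMP-UUID1">
      <FLocat xlink:href="objects/LAND2.BMP" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
    </file>
  </fileGrp>
  <fileGrp USE="preservation">
    <file ID="LAND2-UUID2.tif-UUID3" GROUPID="G1" ADMID="LAND2-UUID2.tif-UUID3">
      <FLocat xlink:href="objects/LAND2-UUID2.tif" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
    </file>
  </fileGrp>
</fileSec>
"""


def hrefs_by_group(file_sec, use):
    """Map GROUPID -> xlink:href for every file in the fileGrp with the given USE."""
    result = {}
    for file_el in file_sec.findall(f"mets:fileGrp[@USE='{use}']/mets:file", NS):
        flocat = file_el.find("mets:FLocat", NS)
        result[file_el.get("GROUPID")] = flocat.get(f"{{{NS['xlink']}}}href")
    return result


def pair_originals_with_preservation(file_sec):
    """Return {original href: preservation href or None}, joined on GROUPID."""
    originals = hrefs_by_group(file_sec, "original")
    preservation = hrefs_by_group(file_sec, "preservation")
    return {href: preservation.get(gid) for gid, href in originals.items()}


file_sec = ET.fromstring(SAMPLE)
pairs = pair_originals_with_preservation(file_sec)
print(pairs)
```

An original with no normalized version maps to None, so the same lookup also shows which files have no preservation copy.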



Generic fileSec.png

<structMap>

The structMap section of the Archivematica METS file is designed to capture the directory structure of the AIP. Its TYPE is therefore physical (rather than logical) and it is grouped into divisions by directory, as follows:

  • AIP
    • /objects
      • /directory1
      • /directory2 etc.
      • /submissionDocumentation
Generic structMap.png


Note the DMDID="AIP-description" in the objects directory div. This links the Dublin Core metadata to the contents of the objects directory.
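As a rough illustration of how a physical structMap can mirror a directory tree, the following sketch nests one div per directory and attaches the Dublin Core reference to the objects div. It is written under assumptions: the TYPE="directory" and LABEL values on each div, the use of the bare file name as FILEID, and the function name build_struct_map are all hypothetical and not taken from the Archivematica code. The METS schema spells the descriptive-metadata attribute on div as DMDID, which is what the sketch uses.

```python
import xml.etree.ElementTree as ET


def build_struct_map(aip_name, paths):
    """Build a physical structMap whose <div> nesting mirrors the directory tree.

    paths: POSIX-style file paths relative to the AIP root.
    """
    struct_map = ET.Element("structMap", {"TYPE": "physical"})
    root_div = ET.SubElement(struct_map, "div", {"TYPE": "directory", "LABEL": aip_name})
    dirs = {(): root_div}  # directory path (as a tuple of names) -> its <div>
    for path in sorted(paths):
        *parents, filename = path.split("/")
        current = ()
        for name in parents:
            parent_div = dirs[current]
            current = current + (name,)
            if current not in dirs:
                attrs = {"TYPE": "directory", "LABEL": name}
                if current == ("objects",):
                    # Link the AIP-level Dublin Core metadata to /objects.
                    attrs["DMDID"] = "AIP-description"
                dirs[current] = ET.SubElement(parent_div, "div", attrs)
        # Assumption: a real FILEID would be the <file> ID from the fileSec.
        ET.SubElement(dirs[current], "fptr", {"FILEID": filename})
    return struct_map


struct_map = build_struct_map(
    "AIP",
    ["objects/LAND2.BMP", "objects/lion.svg", "objects/More images/MARBLES.TGA"],
)
print(ET.tostring(struct_map, encoding="unicode"))
```

Keying the dictionary on the full directory path rather than the directory name keeps two same-named subdirectories in different parents from being merged into one div.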


DSpace transfer

A typical DSpace transfer will contain objects and licenses, plus OCR text files if the objects are scanned PDF files. In this example, the PDF files are scanned articles, the files without extensions are licenses, and the .txt files contain the OCR text for the articles:

  • /objects
    • /Item@249-2700
      • bitstream_8262.pdf
      • bitstream_8263
      • bitstream_42698.txt
    • /Item@249-2701
      • bitstream_8264.pdf
      • bitstream_8265
      • bitstream_42699.txt

<fileSec>

The fileSec is broken into four fileGrps as follows:

  • <fileGrp USE="original">
  • <fileGrp USE="preservation">
  • <fileGrp USE="text/ocr">
  • <fileGrp USE="license">
DSpace fileSec.png
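The naming convention in the example transfer suggests a simple way to sort files into fileGrps. The sketch below is a minimal illustration that assumes the convention holds exactly as described for this example (no extension means license, .txt means OCR text, anything else is an original object); a real DSpace export may need a more reliable signal than the file extension. The function names are hypothetical, and the preservation group is left empty on the assumption that it is filled later, when normalized copies are created.

```python
from pathlib import PurePosixPath


def classify_bitstream(path):
    """Return the fileGrp USE value for one file, based on its extension only."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix == "":
        return "license"
    if suffix == ".txt":
        return "text/ocr"
    return "original"


def group_by_use(paths):
    """Sort paths into the four fileGrp USE values used for DSpace transfers."""
    groups = {"original": [], "preservation": [], "text/ocr": [], "license": []}
    for path in paths:
        groups[classify_bitstream(path)].append(path)
    return groups


groups = group_by_use([
    "objects/Item@249-2700/bitstream_8262.pdf",
    "objects/Item@249-2700/bitstream_8263",
    "objects/Item@249-2700/bitstream_42698.txt",
])
for use, members in groups.items():
    print(use, members)
```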


<structMap>

DSpace structMap.png