Difference between revisions of "TRIM exports"

From Archivematica
 
(42 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 
[[Main Page]] > [[Development]] > [[:Category:Development documentation|Development documentation]] > TRIM exports
 
[[Main Page]] > [[Development]] > [[:Category:Development documentation|Development documentation]] > TRIM exports
 +
 +
<div style="padding: 10px 10px; border: 1px solid black; background-color: #F79086;">This page is no longer being maintained and may contain inaccurate information. Please see the [https://www.archivematica.org/docs/latest/ Archivematica documentation] for up-to-date information.</div><p>
  
 
This page documents ingest of TRIM exports based on requirements for VanDocs ingest at City of Vancouver Archives.
 
This page documents ingest of TRIM exports based on requirements for VanDocs ingest at City of Vancouver Archives.
  
</br>
+
[[Category:Feature requirements]]
  
 
==TRIM export contents==
 
==TRIM export contents==
Line 26: Line 28:
 
===Parsing contents to the SIP===
 
===Parsing contents to the SIP===
  
*Each container becomes a single transfer OR each transfer is broken into one SIP per container  
+
*Each transfer is broken into one SIP per container  
 
*manifest.txt is copied to metadata/submissionDocumentation/
 
*manifest.txt is copied to metadata/submissionDocumentation/
 
*Location.xml is copied to metadata/
 
*Location.xml is copied to metadata/
Line 43: Line 45:
  
 
The contents of the transfer must be verified against the manifest.txt file during the "Verify transfer compliance" micro-service.
 
The contents of the transfer must be verified against the manifest.txt file during the "Verify transfer compliance" micro-service.
 +
Associated PREMIS event: manifest check. See below for details.
 +
 +
==Manifest check==
 +
{| border="1" cellpadding="10" cellspacing="0" width=90%
 +
|-
 +
|- style="background-color:#cccccc;"
 +
!style="width:20%"|'''Semantic unit'''
 +
!style="width:20%"|'''Semantic component'''
 +
!style="width:20%"|'''Sample value(s)'''
 +
!style="width:20%"|'''Notes'''
 +
|-
 +
|eventIdentifier
 +
|eventIdentifierType
 +
|UUID
 +
|
 +
|-
 +
|eventIdentifier
 +
|eventIdentifierValue
 +
|21h50321-6d7b-3855-89ag-a8b0fhc1f256
 +
|
 +
|-
 +
|eventType
 +
|none
 +
|manifest check
 +
|
 +
|-
 +
|eventDateTime
 +
|none
 +
|2011-08-01T09:08:46-01:00
 +
|
 +
|-
 +
|eventDetail
 +
|none
 +
|
 +
|
 +
|-
 +
|eventOutcomeInformation
 +
|eventOutcome
 +
|{pass; fail}
 +
|
 +
|-
 +
|eventOutcomeDetail
 +
|eventOutcomeDetailNote
 +
|
 +
|
 +
|-
 +
|linkingAgentIdentifier
 +
|linkingAgentIdentifierType
 +
|preservation system
 +
|
 +
|-
 +
|linkingAgentIdentifier
 +
|linkingAgentIdentifierValue
 +
|Archivematica-1.0
 +
|
 +
|-
 +
|}
 +
 +
<br>
  
 
===Verifying checksums===
 
===Verifying checksums===
Line 55: Line 116:
  
 
These checksums must be verified during the "Verify transfer checksums" micro-service.
 
These checksums must be verified during the "Verify transfer checksums" micro-service.
 +
Associated PREMIS event: fixity check
  
 
</br>
 
</br>
 +
 +
==Fixity check==
 +
 +
{| border="1" cellpadding="10" cellspacing="0" width=90%
 +
|-
 +
|- style="background-color:#cccccc;"
 +
!style="width:20%"|'''Semantic unit'''
 +
!style="width:20%"|'''Semantic component'''
 +
!style="width:20%"|'''Sample value(s)'''
 +
!style="width:20%"|'''Notes'''
 +
|-
 +
|eventIdentifier
 +
|eventIdentifierType
 +
|UUID
 +
|
 +
|-
 +
|eventIdentifier
 +
|eventIdentifierValue
 +
|73f87321-6d7b-3855-89ag-a8b0fhc1f256
 +
|
 +
|-
 +
|eventType
 +
|none
 +
|fixity check
 +
|
 +
|-
 +
|eventDateTime
 +
|none
 +
|2010-08-01T09:08:46-01:00
 +
|
 +
|-
 +
|eventDetail
 +
|none
 +
|program="MD5Deep"; version="3.6"
 +
|
 +
|-
 +
|eventOutcomeInformation
 +
|eventOutcome
 +
|{pass; fail}
 +
|
 +
|-
 +
|eventOutcomeDetail
 +
|eventOutcomeDetailNote
 +
|
 +
|
 +
|-
 +
|linkingAgentIdentifier
 +
|linkingAgentIdentifierType
 +
|preservation system
 +
|
 +
|-
 +
|linkingAgentIdentifier
 +
|linkingAgentIdentifierValue
 +
|Archivematica-1.0
 +
|
 +
|-
 +
|}
 +
 +
<br>
  
 
==The AIP METS file==
 
==The AIP METS file==
Line 62: Line 183:
 
===dmdSecs===
 
===dmdSecs===
  
*Each container will have one dmdSec consisting of Dublin Core metadata derived from the TRIM export metadata
+
*Each container will have one dmdSec consisting of Dublin Core metadata derived from the TRIM export metadata (''ContainerMetadata.xml'')
*Each file will have one dmdSec consisting of Dublin Core or EAD metadata derived from the TRIM export metadata
+
*Each file will have one dmdSec consisting of Dublin Core metadata derived from the TRIM export metadata (eg ''DOC_2012_000100_Metadata.xml'')
  
 
</br>
 
</br>
Line 77: Line 198:
 
!'''TRIM element'''
 
!'''TRIM element'''
 
!'''DC element'''
 
!'''DC element'''
!'''EAD element'''
+
!'''RAD/AtoM element'''
!'''RAD element'''
 
 
!'''Comments'''
 
!'''Comments'''
 
|-
 
|-
|<TypedTitle>
+
|<TitleFreeTextPart>
|<dc.title>
+
|<dcterms:title>
|<unittitle>
 
 
|'''Title proper'''
 
|'''Title proper'''
 
|
 
|
 
|-
 
|-
|n/a
+
|<Department>
|n/a
+
|<dcterms:creator>
|<c> Level attribute
+
|'''Name'''
|'''Level of description'''
 
|Level of description will be obtained from METS StructMap div TYPE
 
|-
 
|<HomeLocation>
 
|<dc.creator>
 
|<origination>
 
|'''n/a'''
 
 
|AtoM adds a Name field linked to the Date(s) of creation field
 
|AtoM adds a Name field linked to the Date(s) of creation field
 
|-
 
|-
 
|<DateModified>
 
|<DateModified>
|<dc.date>
+
|<dcterms:date>
|<unitdate>
 
 
|'''Date(s) of creation'''
 
|'''Date(s) of creation'''
 
|Date range based on earliest and latest DateModified in document metadata
 
|Date range based on earliest and latest DateModified in document metadata
 +
|-
 +
|<OPR>
 +
|<dcterms:provenance>
 +
|'''Immediate source of acquisition'''
 +
|
 +
|-
 +
|<RecordNumber>
 +
|<dc:identifier>
 +
|'''Identifier'''
 +
|Only the numbers to the right of the slash in this field are used - eg 04-4000/0000070 --> 0000070
 
|-
 
|-
 
|n/a
 
|n/a
|<dcterms.extent>
+
|<dcterms:extent>
|<physdesc> and subelement <extent>
 
 
|'''Physical description'''
 
|'''Physical description'''
|Count of documents in the SIP plus fixed text
+
|Count of documents in the SIP plus fixed text: "digital objects"
 
|-
 
|-
|<Notes>
 
 
|n/a
 
|n/a
|<note>
+
|n/a
|'''General note'''
+
|'''Level of description'''
|
+
|Level of description taken from METS structMap div TYPE
 
|-
 
|-
|<RecordNumber>
+
|<FullClassificationNumber>
|<dc.identifier>
+
|<dcterms:isPartOf>
|<unitid>
+
|n/a
|'''n/a'''
+
|Field does not map to RAD but is used along with <OPR> to determine DIP upload destination
|AtoM adds an identifier field to archival descriptions
 
 
|-
 
|-
 
|}
 
|}
Line 132: Line 250:
 
|-
 
|-
 
!'''TRIM'''
 
!'''TRIM'''
!'''RAD'''
+
!'''AtoM'''
 
|-
 
|-
|PCI Compliance
+
|'''<TitleFreeTextPart>''' PCI Compliance
 
|'''Title proper''': PCI Compliance
 
|'''Title proper''': PCI Compliance
 
|-
 
|-
|n/a
+
|'''<Department>''' IT Strategy, Business Relationships and Projects - IT
|'''Level of description''': File
 
|-
 
|IT Strategy, Business Relationships and Projects - IT
 
 
|'''Name''': IT Strategy, Business Relationships and Projects - IT
 
|'''Name''': IT Strategy, Business Relationships and Projects - IT
 
|-
 
|-
|2010-03-01T18:20:15-08:00 / 2012-05-01T19:26:23-08:00
+
|'''<DateModified>''' 2010-03-01T18:20:15-08:00 / 2012-05-01T19:26:23-08:00
 
|'''Date(s) of creation''': 2010-03-01 - 2012-05-01
 
|'''Date(s) of creation''': 2010-03-01 - 2012-05-01
 +
|-
 +
|'''<OPR>''' IT Business Strategies
 +
|'''Immediate source of acquisition''': IT Business Strategies
 +
|-
 +
|'''<RecordNumber>''' 04-4000/0000070
 +
|'''Identifier''': 0000070
 
|-
 
|-
 
|n/a
 
|n/a
 
|'''Physical description''': 184 digital objects
 
|'''Physical description''': 184 digital objects
 
|-
 
|-
|Note about this container
 
|'''General note''': Note about this container
 
 
|-
 
|-
|04-4000/0000070
+
|n/a
|'''Reference code''': CA CVA [series number]-0000070
+
|'''Level of description''': File
 +
|-
 +
|'''<FullClassificationNumber>'''04-4000-20
 +
|
 
|-
 
|-
 
|}
 
|}
Line 165: Line 287:
 
!'''TRIM element'''
 
!'''TRIM element'''
 
!'''DC element'''
 
!'''DC element'''
!'''EAD element'''
+
!'''RAD/AtoM element'''
!'''RAD element'''
 
 
!'''Comments'''
 
!'''Comments'''
 
|-
 
|-
|<TypedTitle>
+
|<TitleFreeTextPart>
 
|<dc:title>
 
|<dc:title>
|<unittitle>
 
 
|'''Title proper'''
 
|'''Title proper'''
 
|
 
|
|-
 
|n/a
 
|n/a
 
|<c> Level attribute
 
|'''Level of description'''
 
|Level of description will be obtained from METS StructMap div TYPE
 
 
|-
 
|-
 
|<DateModified>
 
|<DateModified>
 
|<dc:date>
 
|<dc:date>
|<unitdate>
 
 
|'''Date(s) of creation'''
 
|'''Date(s) of creation'''
 
|
 
|
 
|-
 
|-
|<Notes>
+
|<RecordNumber>
|n/a
+
|<dc:identifier>
|<note>
+
|'''Identifier'''
|'''General note'''
 
 
|
 
|
 
|-
 
|-
|<RecordNumber>
+
|n/a
|<dc:identifier>
+
|n/a
|<unitid>
+
|'''Level of description'''
|'''n/a'''
+
|Level of description will be obtained from METS StructMap div TYPE
|AtoM adds an identifier field to archival descriptions
 
 
|-
 
|-
 
|}
 
|}
Line 209: Line 320:
 
|-
 
|-
 
!'''TRIM'''
 
!'''TRIM'''
!'''RAD'''
+
!'''AtoM'''
 
|-
 
|-
|MCPP Project Report
+
|'''<TitleFreeTextPart>''' MCPP Project Report
 
|'''Title proper''': MCPP Project Report
 
|'''Title proper''': MCPP Project Report
 
|-
 
|-
|n/a
+
|'''<DateModified>''' 2010-03-01T18:20:15-08:00
|'''Level of description''': Item
 
|-
 
|2010-03-01T18:20:15-08:00
 
 
|'''Date(s) of creation''': 2010-03-01
 
|'''Date(s) of creation''': 2010-03-01
 
|-
 
|-
|Note about this document
+
|'''<RecordNumber>''' DOC/2010/000100
|'''General note''': Note about this document
+
|'''Identifier''': DOC/2010/000100
 
|-
 
|-
|DOC/2010/000100
 
|'''Reference code''': CA CVA [series number]-0000070-DOC/2010/000100
 
 
|-
 
|-
 +
|n/a
 +
|'''Level of description''': Item
 
|}
 
|}
  
Line 234: Line 342:
 
*Each container will have an amdSec consisting of:
 
*Each container will have an amdSec consisting of:
 
**A digiprovMD with an xlink reference to metadata/ContainerMetadata.xml
 
**A digiprovMD with an xlink reference to metadata/ContainerMetadata.xml
**A digiprovMD with an xlink reference to metadata/Location.xml
 
  
 
</br>
 
</br>
Line 243: Line 350:
  
 
*Each file will have an amdSec consisting of:
 
*Each file will have an amdSec consisting of:
 +
**A rightsMD populated with PREMIS rights (see '''Flagging closed AIPs''', below)
 
**A digiprovMD with an xlink reference to the relevant document metadata xml file
 
**A digiprovMD with an xlink reference to the relevant document metadata xml file
 
**A techMD and digiprovMDs generated by Archivematica during processing
 
**A techMD and digiprovMDs generated by Archivematica during processing
Line 254: Line 362:
 
===fileSec and structMaps===
 
===fileSec and structMaps===
 
*Each METS file will have two structMaps, the Archivematica default structMap and a logical structMap for hierarchically arranging the container into a file and its child items
 
*Each METS file will have two structMaps, the Archivematica default structMap and a logical structMap for hierarchically arranging the container into a file and its child items
*The container and file div TYPE elements will map to the RAD Level of description field in AtoM
+
*The container and file div TYPE elements in the logical structMap will map to the RAD Level of description field in AtoM
 
*The structMap contains the links between containers and files and their relevant dmdSecs
 
*The structMap contains the links between containers and files and their relevant dmdSecs
 
*The structMap also contains the link between the container and its amdSec
 
*The structMap also contains the link between the container and its amdSec
Line 265: Line 373:
 
</br>
 
</br>
  
==Flagging closed/open AIPs==
+
==Flagging closed AIPs==
  
*The container metadata file (ContainerMetadata.xml) has two fields whose values will be used to populate the PREMIS rights entity in the SIP, DateClosed and RetentionSchedule. Examples are:
+
*The container metadata file (ContainerMetadata.xml) has two fields whose values will be used to populate the PREMIS rights entity in the SIP (in the METS <rightsMD> element), DateClosed and RetentionSchedule. Examples are:
 
**<DateClosed>2012-08-17T16:13:31-08:00</DateClosed>
 
**<DateClosed>2012-08-17T16:13:31-08:00</DateClosed>
 
**<RetentionSchedule>EV2.3.A</RetentionSchedule>
 
**<RetentionSchedule>EV2.3.A</RetentionSchedule>
Line 279: Line 387:
 
[[File:VanDocs_rights.png|680px|thumb|center|]]
 
[[File:VanDocs_rights.png|680px|thumb|center|]]
  
[[Category:Development documentation]]
+
==DIP upload==
 +
 
 +
*Upon DIP upload to AtoM, the container will become a file-level description, with level of description populated by the structMap div label for the container ("file"). Each object in the DIP will become a child level with the level of description populated by the structMap div label for the object ("item").
 +
*Descriptive metadata in RAD will be populated by the appropriate dmdSec for each container and object (see container and document metadata mapping, above).

Latest revision as of 17:28, 11 February 2020


This page is no longer being maintained and may contain inaccurate information. Please see the Archivematica documentation for up-to-date information.

This page documents ingest of TRIM exports based on requirements for VanDocs ingest at City of Vancouver Archives.

TRIM export contents

A TRIM export consists of:

  • 1 or more containers
  • A manifest of the transfer (manifest.txt)
  • XML schema documentation for all xml files in the transfer (container, location and document xml metadata)
  • Location metadata (Location.xml)
  • Container metadata (ContainerMetadata.xml)
  • Document metadata (eg DOC_2012_000100_Metadata.xml)
  • Documents (eg DOC_2012_000100.docx)


VanDocs1g.png


Processing a TRIM export

Parsing contents to the SIP

  • Each transfer is broken into one SIP per container
  • manifest.txt is copied to metadata/submissionDocumentation/
  • Location.xml is copied to metadata/
  • All schema documentation is copied to metadata/
  • The relevant ContainerMetadata.xml is copied to metadata/
  • The relevant document metadata files are copied to metadata/
  • All documents are copied to objects/


A SIP generated from a TRIM export
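
The routing rules listed above can be sketched as a small function. This is only an illustration, not Archivematica's implementation: the name `sip_destination` is hypothetical, and the sketch assumes that every metadata and schema file can be recognised by an `.xml` or `.xsd` extension, which this page does not state.

```python
from pathlib import PurePosixPath


def sip_destination(filename: str) -> str:
    """Return the SIP-relative path for one file from a TRIM container."""
    name = PurePosixPath(filename).name
    if name == "manifest.txt":
        return f"metadata/submissionDocumentation/{name}"
    # Assumption: Location.xml, ContainerMetadata.xml, document metadata
    # files and schema documentation are all identifiable by extension.
    if name.lower().endswith((".xml", ".xsd")):
        return f"metadata/{name}"
    # Everything else is treated as a document.
    return f"objects/{name}"


for example in ["manifest.txt", "Location.xml",
                "DOC_2012_000100_Metadata.xml", "DOC_2012_000100.docx"]:
    print(example, "->", sip_destination(example))
```

A document that is itself an XML file would be misrouted by the extension test, so a real implementation would more likely pair each document with its metadata file by name.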


Verifying manifest

The contents of the transfer must be verified against the manifest.txt file during the "Verify transfer compliance" micro-service. Associated PREMIS event: manifest check. See below for details.

Manifest check

{| border="1" cellpadding="10" cellspacing="0" width=90%
|- style="background-color:#cccccc;"
! Semantic unit !! Semantic component !! Sample value(s) !! Notes
|-
| eventIdentifier || eventIdentifierType || UUID ||
|-
| eventIdentifier || eventIdentifierValue || 21h50321-6d7b-3855-89ag-a8b0fhc1f256 ||
|-
| eventType || none || manifest check ||
|-
| eventDateTime || none || 2011-08-01T09:08:46-01:00 ||
|-
| eventDetail || none || ||
|-
| eventOutcomeInformation || eventOutcome || {pass; fail} ||
|-
| eventOutcomeDetail || eventOutcomeDetailNote || ||
|-
| linkingAgentIdentifier || linkingAgentIdentifierType || preservation system ||
|-
| linkingAgentIdentifier || linkingAgentIdentifierValue || Archivematica-1.0 ||
|}
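
As a rough sketch of how the manifest check could produce the event values listed above: the function below compares the paths named in the manifest with the paths actually found and returns the event fields as a dictionary. The function name, the dictionary layout and the wording of the outcome note are assumptions made for illustration. The format of manifest.txt is not described on this page, so the caller is assumed to have already parsed it into a list of relative paths.

```python
import uuid
from datetime import datetime, timezone


def manifest_check_event(listed, present):
    """Compare manifest entries with the files found and describe the result."""
    listed, present = set(listed), set(present)
    missing = sorted(listed - present)
    unexpected = sorted(present - listed)

    notes = []
    if missing:
        notes.append("missing: " + ", ".join(missing))
    if unexpected:
        notes.append("not in manifest: " + ", ".join(unexpected))

    return {
        "eventIdentifierType": "UUID",
        "eventIdentifierValue": str(uuid.uuid4()),
        "eventType": "manifest check",
        "eventDateTime": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "eventOutcome": "fail" if notes else "pass",
        "eventOutcomeDetailNote": "; ".join(notes),
        "linkingAgentIdentifierType": "preservation system",
        "linkingAgentIdentifierValue": "Archivematica-1.0",
    }


event = manifest_check_event(["a.docx", "b.docx"], ["a.docx"])
print(event["eventOutcome"], "-", event["eventOutcomeDetailNote"])
```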


Verifying checksums

Each document metadata file contains an md5 checksum for the document:


Checksumg.png


These checksums must be verified during the "Verify transfer checksums" micro-service. Associated PREMIS event: fixity check
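
A minimal sketch of the comparison itself, assuming the checksum has already been read out of the document metadata file as a hexadecimal string. The fixity event on this page records MD5Deep as the program; the sketch substitutes Python's standard `hashlib` module purely for illustration, and the function name `md5_matches` is hypothetical.

```python
import hashlib


def md5_matches(path, expected_md5, chunk_size=1 << 20):
    """Return True if the file's MD5 digest equals the expected hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large documents are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.strip().lower()
```

The result would map to the eventOutcome value of the fixity check event: pass when the digests agree, fail otherwise.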


Fixity check

{| border="1" cellpadding="10" cellspacing="0" width=90%
|- style="background-color:#cccccc;"
! Semantic unit !! Semantic component !! Sample value(s) !! Notes
|-
| eventIdentifier || eventIdentifierType || UUID ||
|-
| eventIdentifier || eventIdentifierValue || 73f87321-6d7b-3855-89ag-a8b0fhc1f256 ||
|-
| eventType || none || fixity check ||
|-
| eventDateTime || none || 2010-08-01T09:08:46-01:00 ||
|-
| eventDetail || none || program="MD5Deep"; version="3.6" ||
|-
| eventOutcomeInformation || eventOutcome || {pass; fail} ||
|-
| eventOutcomeDetail || eventOutcomeDetailNote || ||
|-
| linkingAgentIdentifier || linkingAgentIdentifierType || preservation system ||
|-
| linkingAgentIdentifier || linkingAgentIdentifierValue || Archivematica-1.0 ||
|}


The AIP METS file

dmdSecs

  • Each container will have one dmdSec consisting of Dublin Core metadata derived from the TRIM export metadata (ContainerMetadata.xml)
  • Each file will have one dmdSec consisting of Dublin Core metadata derived from the TRIM export metadata (eg DOC_2012_000100_Metadata.xml)


DmdSecsg.png


Container metadata mapping

{| border="1" cellpadding="10" cellspacing="0" width=90%
|- style="background-color:#cccccc;"
! TRIM element !! DC element !! RAD/AtoM element !! Comments
|-
| <TitleFreeTextPart> || <dcterms:title> || Title proper ||
|-
| <Department> || <dcterms:creator> || Name || AtoM adds a Name field linked to the Date(s) of creation field
|-
| <DateModified> || <dcterms:date> || Date(s) of creation || Date range based on earliest and latest DateModified in document metadata
|-
| <OPR> || <dcterms:provenance> || Immediate source of acquisition ||
|-
| <RecordNumber> || <dc:identifier> || Identifier || Only the numbers to the right of the slash in this field are used - eg 04-4000/0000070 --> 0000070
|-
| n/a || <dcterms:extent> || Physical description || Count of documents in the SIP plus fixed text: "digital objects"
|-
| n/a || n/a || Level of description || Level of description taken from METS structMap div TYPE
|-
| <FullClassificationNumber> || <dcterms:isPartOf> || n/a || Field does not map to RAD but is used along with <OPR> to determine DIP upload destination
|}


Sample container description

{| border="1" cellpadding="10" cellspacing="0" width=90%
|- style="background-color:#cccccc;"
! TRIM !! AtoM
|-
| <TitleFreeTextPart> PCI Compliance || Title proper: PCI Compliance
|-
| <Department> IT Strategy, Business Relationships and Projects - IT || Name: IT Strategy, Business Relationships and Projects - IT
|-
| <DateModified> 2010-03-01T18:20:15-08:00 / 2012-05-01T19:26:23-08:00 || Date(s) of creation: 2010-03-01 - 2012-05-01
|-
| <OPR> IT Business Strategies || Immediate source of acquisition: IT Business Strategies
|-
| <RecordNumber> 04-4000/0000070 || Identifier: 0000070
|-
| n/a || Physical description: 184 digital objects
|-
| n/a || Level of description: File
|-
| <FullClassificationNumber> 04-4000-20 ||
|}
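
The derived values in the container mapping (the identifier, the date range and the extent) can be illustrated with a short sketch. The function name and signature are hypothetical. The " - " separator in the date range copies the sample AtoM display above and is an assumption about serialization, and the sketch assumes one DateModified value per document, so the number of dates stands in for the document count.

```python
from datetime import datetime


def container_dublin_core(title, department, opr, record_number,
                          classification, document_dates):
    """Build the container's Dublin Core values from TRIM metadata fields."""
    dates = [datetime.fromisoformat(value).date() for value in document_dates]
    earliest, latest = min(dates), max(dates)
    if earliest == latest:
        date_range = earliest.isoformat()
    else:
        date_range = f"{earliest.isoformat()} - {latest.isoformat()}"
    return {
        "dcterms:title": title,
        "dcterms:creator": department,
        "dcterms:date": date_range,
        "dcterms:provenance": opr,
        # Only the part to the right of the slash is kept.
        "dc:identifier": record_number.rpartition("/")[2],
        "dcterms:extent": f"{len(dates)} digital objects",
        "dcterms:isPartOf": classification,
    }


record = container_dublin_core(
    "PCI Compliance",
    "IT Strategy, Business Relationships and Projects - IT",
    "IT Business Strategies",
    "04-4000/0000070",
    "04-4000-20",
    ["2010-03-01T18:20:15-08:00", "2012-05-01T19:26:23-08:00"],
)
print(record["dc:identifier"], "|", record["dcterms:date"])
```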


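Document metadata mapping

{| border="1" cellpadding="10" cellspacing="0" width=90%
|- style="background-color:#cccccc;"
! TRIM element !! DC element !! RAD/AtoM element !! Comments
|-
| <TitleFreeTextPart> || <dc:title> || Title proper ||
|-
| <DateModified> || <dc:date> || Date(s) of creation ||
|-
| <RecordNumber> || <dc:identifier> || Identifier ||
|-
| n/a || n/a || Level of description || Level of description will be obtained from METS StructMap div TYPE
|}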



Sample document description

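{| border="1" cellpadding="10" cellspacing="0" width=90%
|- style="background-color:#cccccc;"
! TRIM !! AtoM
|-
| <TitleFreeTextPart> MCPP Project Report || Title proper: MCPP Project Report
|-
| <DateModified> 2010-03-01T18:20:15-08:00 || Date(s) of creation: 2010-03-01
|-
| <RecordNumber> DOC/2010/000100 || Identifier: DOC/2010/000100
|-
| n/a || Level of description: Item
|}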



amdSecs

  • Each container will have an amdSec consisting of:
    • A digiprovMD with an xlink reference to metadata/ContainerMetadata.xml


Sample amdSec for a container


  • Each file will have an amdSec consisting of:
    • A rightsMD populated with PREMIS rights (see Flagging closed AIPs, below)
    • A digiprovMD with an xlink reference to the relevant document metadata xml file
    • A techMD and digiprovMDs generated by Archivematica during processing


Sample amdSec for a file


fileSec and structMaps

  • Each METS file will have two structMaps, the Archivematica default structMap and a logical structMap for hierarchically arranging the container into a file and its child items
  • The container and file div TYPE elements in the logical structMap will map to the RAD Level of description field in AtoM
  • The structMap contains the links between containers and files and their relevant dmdSecs
  • The structMap also contains the link between the container and its amdSec
  • The files are linked to their amdSecs in the fileSec


StructMapg.png
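
The logical structMap described above might be assembled along these lines, using Python's standard `xml.etree.ElementTree`. This is a sketch under assumptions rather than the markup Archivematica emits: the function name, the identifier values (`dmdSec_1`, `amdSec_1`, `file-0001`) and the choice of LABEL text are all hypothetical. Only the TYPE values "file" and "item" and the DMDID/ADMID links come from this page.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def logical_struct_map(container_label, container_dmdid, container_admid, items):
    """Build a logical structMap: one 'file' div holding one 'item' div per object.

    items is a list of (label, dmdid, file_id) tuples.
    """
    struct_map = ET.Element(f"{{{METS_NS}}}structMap", {"TYPE": "logical"})
    container = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {
        "TYPE": "file",
        "LABEL": container_label,
        "DMDID": container_dmdid,   # link to the container's dmdSec
        "ADMID": container_admid,   # link to the container's amdSec
    })
    for label, dmdid, file_id in items:
        item = ET.SubElement(container, f"{{{METS_NS}}}div", {
            "TYPE": "item",
            "LABEL": label,
            "DMDID": dmdid,         # link to the document's dmdSec
        })
        # The file's own amdSec is linked from the fileSec, not from here.
        ET.SubElement(item, f"{{{METS_NS}}}fptr", {"FILEID": file_id})
    return struct_map


tree = logical_struct_map(
    "PCI Compliance", "dmdSec_1", "amdSec_1",
    [("MCPP Project Report", "dmdSec_2", "file-0001")],
)
print(ET.tostring(tree, encoding="unicode"))
```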


Flagging closed AIPs

  • The container metadata file (ContainerMetadata.xml) has two fields whose values will be used to populate the PREMIS rights entity in the SIP (in the METS <rightsMD> element), DateClosed and RetentionSchedule. Examples are:
    • <DateClosed>2012-08-17T16:13:31-08:00</DateClosed>
    • <RetentionSchedule>EV2.3.A</RetentionSchedule>
  • The DateClosed field will be used to populate the termOfRestriction startDate in the PREMIS rights entity
  • The DateClosed and RetentionSchedule fields will be used to calculate the termOfRestriction endDate in the PREMIS rights entity. For the examples provided above, Archivematica would calculate 5 years from the end of 2012-08-17 and then to the end of the calendar year, for a result of 2017-12-31.
  • The closure period would also be captured as a standardized free text entry in the rightsGrantedNote field of the PREMIS rights entity, for example: Closed until 2017-12-31.
  • Other PREMIS fields would be auto-populated for every VanDocs ingest as shown in the screenshot below.


VanDocs rights.png
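
The date arithmetic in the worked example can be sketched as follows. The page does not say how a RetentionSchedule code translates into a number of years, so the lookup table here is hypothetical and holds only the 5-year period implied by the example. The function name is likewise an assumption.

```python
from datetime import date, datetime

# Hypothetical lookup; only the value implied by the example on this page.
RETENTION_YEARS = {"EV2.3.A": 5}


def term_of_restriction(date_closed, retention_schedule):
    """Return (startDate, endDate) for the PREMIS termOfRestriction."""
    closed = datetime.fromisoformat(date_closed).date()
    years = RETENTION_YEARS[retention_schedule]
    # Add the retention period, then run to the end of that calendar year.
    end = date(closed.year + years, 12, 31)
    return closed.isoformat(), end.isoformat()


print(term_of_restriction("2012-08-17T16:13:31-08:00", "EV2.3.A"))
```

Because the end date is always 31 December, adding the period to the year alone is enough and no day-level arithmetic is needed.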

DIP upload

  • Upon DIP upload to AtoM, the container will become a file-level description, with level of description populated by the structMap div label for the container ("file"). Each object in the DIP will become a child level with the level of description populated by the structMap div label for the object ("item").
  • Descriptive metadata in RAD will be populated by the appropriate dmdSec for each container and object (see container and document metadata mapping, above).