Difference between revisions of "DSpace exports"

From Archivematica
Line 45: Line 45:
 
Requirements:
 
Requirements:
 
*Map the elements of the DSpace AIPs to the Archivematica AIP
 
*Map the elements of the DSpace AIPs to the Archivematica AIP
*Structure the Archivematica mets.xml file to point to the relevant sections of the DSpace mets.xml files
+
*Structure the Archivematica mets.xml file to point to the DSpace mets.xml files
 
*Index the metadata in all the xml files
 
*Index the metadata in all the xml files
  
Line 51: Line 51:
 
*The digital objects get placed in the objects directory
 
*The digital objects get placed in the objects directory
 
*The license.txt files get placed in the metadata/submissiondocumentation directory; the text is parsed to the <rights> container in the PREMIS metadata
 
*The license.txt files get placed in the metadata/submissiondocumentation directory; the text is parsed to the <rights> container in the PREMIS metadata
*The mets.xml files get placed in the metadata/submissionDocumentation directory
+
*The mets.xml files get placed in the metadata/submissionDocumentation directory...hmm, why not put them in the metadata directory?
  
 
== Structure the Archivematica mets.xml file ==
 
== Structure the Archivematica mets.xml file ==
Line 57: Line 57:
 
|-
 
|-
 
|- style="background-color:#cccccc;"
 
|- style="background-color:#cccccc;"
!style="width:20%"|'''Archivematica mets file'''
+
!style="width:25%"|'''Archivematica mets file'''
!style="width:20%"|'''DSpace mets file'''
+
!style="width:75%"|'''Description/notes'''
!style="width:60%"|'''Description/notes'''
 
 
|-
 
|-
 
|<dmdSec>
 
|<dmdSec>
|n/a
+
|DC metadata added during transfer/ingest; SIP-level only
|DC metadata added during transfer/ingest; collection-level only
 
|-
 
|<dmdSec>
 
|<dmdSec><mdWrap MDTYPE="MODS"> in collection-level mets.xml.
 
|Use mdRef to point to the MODS metadata in the collection-level mets.xml file.
 
|-
 
|<dmdSec>
 
|<dmdSec><mdWrap MDTYPE="MODS"> in item-level mets.xml
 
|Use mdRef to point to MODS metadata in the mets.xml file. Repeat this <dmdSec> for each object in the AIP
 
 
|-
 
|-
 
|<amdSec>
 
|<amdSec>
|n/a
 
 
|PREMIS metadata
 
|PREMIS metadata
|-
 
|<amdSec>
 
|<amdSec><rightsMD>
 
|rightsMDs are scattered all over the DSpace mets.xml files. I don't really know how to map them.
 
 
|-
 
|-
 
|<fileSec>
 
|<fileSec>
|n/a
+
|Lists all the files in the objects directory of the AIP
|Lists all the files in the AIP
 
 
|-
 
|-
|<structMap
+
|<structMap>
|
+
|Groups the contents in the objects directory of the AIP to reflect the folder structure of the AIP; links each object to its license.txt file; links each object to its mets.xml file
|
 
 
|}
 
|}
  
 
[[Category:Development documentation]]
 
[[Category:Development documentation]]

Revision as of 17:44, 12 September 2011


This page analyzes the structure of DSpace exports from an uncustomized (i.e. out of the box) DSpace installation.

Collection export

The following command (from the DSpace user documentation) was used to export a two-item collection with the handle 123456789-6:

./dspace packager -d -a -t AIP -e <user name> -i 123456789-6 calamy.zip

This results in the export of three zipped packages: one for the collection and one for each of the items:

  • calamy.zip
  • ITEM@123456789-7.zip
  • ITEM@123456789-8.zip
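
As a rough illustration of how the exported packages could be inspected, the sketch below lists the members of every zip file in a directory using Python's standard zipfile module. The function name list_package_contents and the idea of keeping all three packages in one directory are assumptions made for this example; they are not part of DSpace or Archivematica.

```python
import zipfile
from pathlib import Path


def list_package_contents(export_dir):
    """Return {zip file name: sorted member names} for every .zip in export_dir."""
    contents = {}
    for package in sorted(Path(export_dir).glob("*.zip")):
        # Open each package read-only and record the names of its members.
        with zipfile.ZipFile(package) as archive:
            contents[package.name] = sorted(archive.namelist())
    return contents
```

Calling list_package_contents on the directory that holds calamy.zip and the two ITEM@ packages would return one entry per package, which is a text equivalent of the screenshot.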

The extracted contents of each zipped file are shown in this screenshot:

Export.png

Collection-level mets.xml file

The mets.xml file for the collection is structured as follows:

  • <mets ID="DSpace_COLLECTION_123456789-6" OBJID="hdl:123456789/6" TYPE="DSpace COLLECTION" PROFILE="http://www.dspace.org/schema/aip/mets_aip_1_0.xsd" xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd">
  • <metsHdr>
  • <dmdSec> (contains MODS metadata for collection-level description)
  • <dmdSec> (contains DSpace Intermediate Metadata (DIM) for collection-level description; all mapped to dc; some overlap with MODS metadata)
  • <amdSec> (contains information on DSpace users and groups associated with the collection)
  • <fileSec> (references the collection's logo, if there is one)
  • <structMap> (links the collection to its logo, if there is one, plus its two child items)
  • <structMap> (links the collection to the DSpace Community)
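
The outline above can be checked against an actual export with a few lines of Python. This is a minimal sketch that counts the direct children of the <mets> root by element name; the function name summarize_mets is hypothetical, and the code assumes the file is well-formed XML.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_mets(xml_text):
    """Count the direct children of the <mets> root by local element name."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for child in root:
        # ElementTree renders namespaced tags as "{namespace}localname";
        # keep only the local name so the result reads like the outline.
        local_name = child.tag.split("}", 1)[-1]
        counts[local_name] += 1
    return dict(counts)
```

For a file with the structure listed above, the result would show one metsHdr, two dmdSec, one amdSec, one fileSec and two structMap entries.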

Item-level mets.xml file

The mets.xml file for each item is structured as follows:

  • <metsHdr>
  • <dmdSec_1> (contains MODS metadata for item)
  • <dmdSec_2> (contains DIM metadata for item; all mapped to dc; some overlap with MODS metadata)
  • <amdSec> (contains rights metadata)
  • <amdSec> (contains rights metadata)
  • <amdSec> (contains PREMIS object metadata; rights metadata; DIM metadata for the item)
  • <amdSec> (contains rights metadata)
  • <amdSec> (contains PREMIS object metadata; rights metadata; DIM metadata for the license)
  • <fileSec> (links the item to the license)
  • <structMap> (links the bitstream to the logical object)
  • <structMap> (links the item to the collection)
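
To tell the MODS and DIM descriptive sections apart programmatically, one option is to read the MDTYPE attribute of each <mdWrap>. The sketch below assumes that non-standard schemas such as DIM are recorded as MDTYPE="OTHER" with the schema name in OTHERMDTYPE, which is the general METS convention; whether a given DSpace export follows it should be verified against the actual file. The function name descriptive_metadata_types is hypothetical.

```python
import xml.etree.ElementTree as ET

# METS namespace, as given in the schemaLocation of the exported mets.xml.
METS = "{http://www.loc.gov/METS/}"


def descriptive_metadata_types(xml_text):
    """Return a list of (dmdSec ID, metadata type) pairs found in a METS file."""
    root = ET.fromstring(xml_text)
    result = []
    for dmd in root.iter(METS + "dmdSec"):
        wrap = dmd.find(METS + "mdWrap")
        if wrap is None:
            continue  # e.g. a dmdSec that uses mdRef instead of mdWrap
        md_type = wrap.get("MDTYPE")
        if md_type == "OTHER":
            # Assumption: the real schema name is carried in OTHERMDTYPE.
            md_type = wrap.get("OTHERMDTYPE", md_type)
        result.append((dmd.get("ID"), md_type))
    return result
```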

Parsing a DSpace collection export in Archivematica

Requirements:

  • Map the elements of the DSpace AIPs to the Archivematica AIP
  • Structure the Archivematica mets.xml file to point to the DSpace mets.xml files
  • Index the metadata in all the xml files

Map the elements of the DSpace AIPs to the Archivematica AIP

  • The digital objects are placed in the objects directory
  • The license.txt files are placed in the metadata/submissionDocumentation directory; the text is parsed to the <rights> container in the PREMIS metadata
  • The mets.xml files are placed in the metadata/submissionDocumentation directory (open question: why not place them directly in the metadata directory?)
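
The mapping above can be sketched as a small routing function. This is an illustration only, not Archivematica's implementation: the function name aip_destination is hypothetical, and the per-package subdirectory under submissionDocumentation is an assumption added here so that the mets.xml and license.txt files from different packages do not overwrite each other.

```python
from pathlib import PurePosixPath

# File names that the mapping treats as submission documentation.
SUBMISSION_DOCS = {"mets.xml", "license.txt"}


def aip_destination(package_name, file_name):
    """Return the relative AIP path for one file from a DSpace export package."""
    name = PurePosixPath(file_name).name
    if name in SUBMISSION_DOCS:
        # Assumption: one subdirectory per package to avoid name collisions.
        target = PurePosixPath("metadata/submissionDocumentation") / package_name / name
    else:
        # Everything else is treated as a digital object.
        target = PurePosixPath("objects") / name
    return str(target)
```

For example, aip_destination("ITEM@123456789-7", "license.txt") yields a path under metadata/submissionDocumentation, while any other file name is routed to objects.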

Structure the Archivematica mets.xml file

Archivematica mets file | Description/notes
<dmdSec> | DC metadata added during transfer/ingest; SIP-level only
<amdSec> | PREMIS metadata
<fileSec> | Lists all the files in the objects directory of the AIP
<structMap> | Groups the contents in the objects directory of the AIP to reflect the folder structure of the AIP; links each object to its license.txt file; links each object to its mets.xml file
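
One possible way to express the <structMap> grouping is a <div> per object whose <fptr> children point at the object, its license.txt and its mets.xml. The sketch below builds such a fragment with Python's ElementTree. The TYPE and LABEL values, the file IDs and the function name build_struct_map are all assumptions for illustration; the page does not specify how Archivematica actually encodes these links.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def build_struct_map(objects):
    """Build a METS structMap from {object label: [file IDs]}.

    Each list is expected to hold the FILEID of the object itself plus the
    FILEIDs of its license.txt and mets.xml (hypothetical IDs).
    """
    struct_map = ET.Element(f"{{{METS_NS}}}structMap", TYPE="physical")
    objects_div = ET.SubElement(
        struct_map, f"{{{METS_NS}}}div", TYPE="directory", LABEL="objects"
    )
    for label, file_ids in objects.items():
        # One div per object groups the object with its related files.
        item_div = ET.SubElement(
            objects_div, f"{{{METS_NS}}}div", TYPE="item", LABEL=label
        )
        for file_id in file_ids:
            ET.SubElement(item_div, f"{{{METS_NS}}}fptr", FILEID=file_id)
    return struct_map
```

The returned element can be serialized with ET.tostring(struct_map, encoding="unicode") and compared against the structure intended for the Archivematica mets.xml file.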