DSpace exports
Revision as of 14:59, 19 September 2011
This page analyzes the structure of DSpace exports from an uncustomized (i.e. out of the box) DSpace installation.
Collection export
The following command (from the DSpace user documentation) was used to export a two-item collection with the handle 123456789-6:
./dspace packager -d -a -t AIP -e <user name> -i 123456789-6 calamy.zip
This results in the export of three zipped packages: one for the collection and one for each of the items:
- calamy.zip
- ITEM@123456789-7.zip
- ITEM@123456789-8.zip
The extracted contents of each zipped file are shown in this screenshot.
- The bitstream in the collection-level directory (calamy) is a logo added to the collection description in DSpace.
- The text file bitstreams in the other directories are licenses. Their filenames do not identify them as license files, so Archivematica will need to recognize license files from each object's METS file (i.e. from <fileSec>).
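Since the license bitstreams cannot be recognized by filename, one way to find them is to read the <fileSec> of each item's mets.xml. The following is a minimal sketch, not Archivematica's actual implementation. It assumes that DSpace marks license bitstreams with a USE attribute of "LICENSE" or "CC-LICENSE" on the enclosing <fileGrp>; that attribute value, the function name and the sample document are assumptions that should be checked against a real export.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def find_license_files(mets_xml, license_uses=("LICENSE", "CC-LICENSE")):
    """Return the xlink:href of every file in a fileGrp whose USE is a license use.

    The USE values are an assumption about how DSpace labels license bundles.
    """
    root = ET.fromstring(mets_xml)
    found = []
    for grp in root.iter(f"{{{METS_NS}}}fileGrp"):
        if grp.get("USE") not in license_uses:
            continue
        for flocat in grp.iter(f"{{{METS_NS}}}FLocat"):
            href = flocat.get(f"{{{XLINK_NS}}}href")
            if href:
                found.append(href)
    return found


# Hypothetical, heavily trimmed item-level METS used only to exercise the function.
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <fileSec>
    <fileGrp USE="ORIGINAL">
      <file ID="f1"><FLocat LOCTYPE="URL" xlink:href="bitstream_1.pdf"/></file>
    </fileGrp>
    <fileGrp USE="LICENSE">
      <file ID="f2"><FLocat LOCTYPE="URL" xlink:href="bitstream_2.txt"/></file>
    </fileGrp>
  </fileSec>
</mets>"""

license_files = find_license_files(SAMPLE)
```

Keying on the <fileGrp> rather than on the filename or MIME type means an ordinary text file deposited as content is not mistaken for a license.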
Collection-level mets.xml file
The mets.xml file for the collection is structured as follows:
- <mets ID="DSpace_COLLECTION_123456789-6" OBJID="hdl:123456789/6" TYPE="DSpace COLLECTION" PROFILE="http://www.dspace.org/schema/aip/mets_aip_1_0.xsd" xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd">
- <metsHdr>
- <dmdSec> (contains MODS metadata for collection-level description)
- <dmdSec> (contains DSpace Intermediate Metadata (DIM) for collection-level description; all mapped to dc; some overlap with MODS metadata)
- <amdSec> (contains information on DSpace users and groups associated with the collection)
- <fileSec> (references the collection's logo, if there is one)
- <structMap> (links the collection to its logo, if there is one, plus its two child items)
- <structMap> (links the collection to the DSpace Community)
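To check an export against the outline above, the top-level sections of a mets.xml file can be counted programmatically. This is a rough sketch using only the Python standard library; the function name and the trimmed sample document are illustrative assumptions, and a real export will carry far more content inside each section.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_mets(mets_xml):
    """Return the root TYPE and OBJID plus a count of each top-level METS section."""
    root = ET.fromstring(mets_xml)
    # Strip the "{namespace}" prefix that ElementTree puts on every tag.
    sections = Counter(child.tag.split("}")[-1] for child in root)
    return {
        "type": root.get("TYPE"),
        "objid": root.get("OBJID"),
        "sections": dict(sections),
    }


# Hypothetical skeleton mirroring the collection-level outline; section bodies omitted.
SAMPLE_COLLECTION = """<mets xmlns="http://www.loc.gov/METS/"
      ID="DSpace_COLLECTION_123456789-6" OBJID="hdl:123456789/6"
      TYPE="DSpace COLLECTION">
  <metsHdr/>
  <dmdSec ID="dmd_1"/>
  <dmdSec ID="dmd_2"/>
  <amdSec ID="amd_1"/>
  <fileSec/>
  <structMap/>
  <structMap/>
</mets>"""

summary = summarize_mets(SAMPLE_COLLECTION)
```

The same function can be run on an item-level mets.xml file to compare its section counts with the collection-level file.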
Item-level mets.xml file
- <metsHdr>
- <dmdSec_1> (contains MODS metadata for item)
- <dmdSec_2> (contains DIM metadata for item; all mapped to dc; some overlap with MODS metadata)
- <amdSec> (contains rights metadata)
- <amdSec> (contains rights metadata)
- <amdSec> (contains PREMIS object metadata; rights metadata; DIM metadata for the item)
- <amdSec> (contains rights metadata)
- <amdSec> (contains PREMIS object metadata; rights metadata; DIM metadata for the license)
- <fileSec> (lists the item and its license)
- <structMap> (links the bitstream to the logical object)
- <structMap> (links the item to the collection)
Parsing a DSpace collection export in Archivematica
Requirements:
- Map the elements of the DSpace AIPs to the Archivematica AIP
- Structure the Archivematica mets.xml file to point to the DSpace mets.xml files
- Index the metadata in all the xml files
Map the elements of the DSpace AIPs to the Archivematica AIP
- The digital objects get placed in the objects directory
- The license files get placed in the metadata/submissionDocumentation directory; their text is parsed into the <rights> container in the PREMIS metadata. See PREMIS metadata: rights#License-based
- The mets.xml files get placed in the metadata/submissionDocumentation directory. Open question: why not put them in the metadata directory instead?
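The mapping above can be sketched as a small routing function that decides where each extracted file should land in the Archivematica AIP. This is an illustration under stated assumptions, not the actual Archivematica code: the function name is hypothetical, the set of license filenames is assumed to come from each item's <fileSec>, and nesting the documentation under a per-package subdirectory is an assumption added here so that the three mets.xml files do not overwrite one another.

```python
from pathlib import PurePosixPath

SUBMISSION_DOCS = PurePosixPath("metadata/submissionDocumentation")
OBJECTS = PurePosixPath("objects")


def aip_destination(package, filename, license_files):
    """Return the relative AIP path for one file extracted from a DSpace package.

    package       -- name of the extracted package, e.g. "ITEM@123456789-7"
    filename      -- file name inside that package
    license_files -- names identified as licenses (assumed to come from <fileSec>)
    """
    name = PurePosixPath(filename).name
    if name == "mets.xml" or name in license_files:
        # Per-package subdirectory is an assumption to avoid name collisions.
        return str(SUBMISSION_DOCS / package / name)
    return str(OBJECTS / name)


# Hypothetical file names used only to show the routing.
licenses = {"bitstream_2.txt"}
mets_dest = aip_destination("ITEM@123456789-7", "mets.xml", licenses)
license_dest = aip_destination("ITEM@123456789-7", "bitstream_2.txt", licenses)
object_dest = aip_destination("ITEM@123456789-7", "bitstream_1.pdf", licenses)
```

If the mets.xml files were moved to the metadata directory instead, only the SUBMISSION_DOCS constant and the mets.xml branch would need to change.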
Structure the Archivematica mets.xml file
| METS file section | Description/notes |
|---|---|
| <dmdSec> | DC metadata added during transfer/ingest; SIP-level only |
| <amdSec> | PREMIS metadata |
| <fileSec> | Lists all the files in the objects directory of the AIP |
| <structMap> | Groups the contents in the objects directory of the AIP to reflect the folder structure of the AIP |
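The four sections in the table can be assembled as an empty skeleton to make the intended shape concrete. The sketch below is an assumption-laden illustration rather than the METS file Archivematica actually writes: the ID values, the USE and TYPE attribute values and the function name are all invented for the example, the <dmdSec> and <amdSec> are left empty, and the <structMap> is flattened to a single objects directory instead of mirroring nested folders.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets_skeleton(object_paths):
    """Build a bare METS tree: dmdSec, amdSec, fileSec and one flat structMap."""
    mets = ET.Element(q("mets"))
    ET.SubElement(mets, q("dmdSec"), ID="dmdSec_1")   # DC metadata would go here
    ET.SubElement(mets, q("amdSec"), ID="amdSec_1")   # PREMIS metadata would go here
    file_sec = ET.SubElement(mets, q("fileSec"))
    file_grp = ET.SubElement(file_sec, q("fileGrp"), USE="original")
    struct_map = ET.SubElement(mets, q("structMap"), TYPE="physical")
    objects_div = ET.SubElement(struct_map, q("div"), TYPE="directory", LABEL="objects")
    for index, path in enumerate(sorted(object_paths), start=1):
        file_id = f"file_{index}"
        file_el = ET.SubElement(file_grp, q("file"), ID=file_id)
        ET.SubElement(
            file_el,
            q("FLocat"),
            {"LOCTYPE": "OTHER", "OTHERLOCTYPE": "SYSTEM", f"{{{XLINK_NS}}}href": path},
        )
        # Each file gets a div in the structMap that points back to its fileSec entry.
        item_div = ET.SubElement(objects_div, q("div"), TYPE="item", LABEL=path.rsplit("/", 1)[-1])
        ET.SubElement(item_div, q("fptr"), FILEID=file_id)
    return mets


# Hypothetical object paths used only to exercise the builder.
skeleton = build_mets_skeleton(["objects/bitstream_1.pdf", "objects/bitstream_3.pdf"])
skeleton_xml = ET.tostring(skeleton, encoding="unicode")
```

A fuller version would nest one <div> per subdirectory so that the <structMap> reflects the folder structure described in the table.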
Question: how do we link the object to the DSpace METS file? Give the METS file a UUID and make the link in the PREMIS relationships field?