Difference between revisions of "DSpace exports"

From Archivematica

Revision as of 11:56, 22 September 2011

This page analyzes the structure of a DSpace collection export from an uncustomized (i.e. out of the box) DSpace installation. See also draft workflow for transferring and ingesting DSpace exports.

The following command (from the DSpace user documentation) was used to export a two-item collection with the handle 123456789-6:

./dspace packager -d -a -t AIP -e <user name> -i 123456789-6 calamy.zip

This produces three zipped packages, one for the collection and one for each of the two items:

  • calamy.zip
  • ITEM@123456789-7.zip
  • ITEM@123456789-8.zip

The extracted contents of each zipped file are shown in this screenshot:

Export.png

Item-level METS files

Link to object

  • The mets.xml file is linked to the object by the handle of the original zipped file:
MetsID.png

Licenses

The text file bitstreams in the two item-level directories are licenses. Note that they are not identified as license files by their filenames, so Archivematica will need to recognize license files from each object's METS file (i.e. from the fileSec). Here is an example of the fileSec showing the object to be preserved (bitstream_12.png) and its license file (bitstream_13):

FileSec.png

Archivematica should move the license file to the metadata/submissionDocumentation directory; the text can be parsed to the rights entity in the PREMIS metadata. See PREMIS metadata: rights#License-based.
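
As a rough illustration of how license files might be recognized from the fileSec, the following Python sketch groups bitstreams by the USE attribute of their fileGrp. It assumes that the DSpace METS file marks the license bitstream with a fileGrp whose USE value is "LICENSE"; that value, the function name files_by_use and the sample XML are assumptions for illustration and should be checked against a real export.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical, trimmed-down fileSec modelled on the example above.
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <fileSec>
    <fileGrp USE="ORIGINAL">
      <file ID="bitstream_12">
        <FLocat LOCTYPE="URL" xlink:href="bitstream_12.png"/>
      </file>
    </fileGrp>
    <fileGrp USE="LICENSE">
      <file ID="bitstream_13">
        <FLocat LOCTYPE="URL" xlink:href="bitstream_13"/>
      </file>
    </fileGrp>
  </fileSec>
</mets>"""


def files_by_use(mets_xml):
    """Group FLocat hrefs in a METS document by their fileGrp USE value."""
    root = ET.fromstring(mets_xml)
    grouped = {}
    for grp in root.iter("{%s}fileGrp" % METS_NS):
        use = grp.get("USE", "")
        for flocat in grp.iter("{%s}FLocat" % METS_NS):
            href = flocat.get("{%s}href" % XLINK_NS)
            if href:
                grouped.setdefault(use, []).append(href)
    return grouped


if __name__ == "__main__":
    for use, hrefs in files_by_use(SAMPLE).items():
        print(use, hrefs)
```

Keying on the fileGrp rather than on the filename matches the observation above that the filenames alone do not identify licenses.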

RightsMD

Each object also has an amdSec containing rightsMD data (populated automatically according to DSpace configuration settings):

Rights.png

This metadata can be added to the PREMIS rights entity in the rightsExtension field. See PREMIS metadata: rights#From_DSpace_METS.

Descriptive metadata

  • Each object has two dmdSecs: MODS and DSpace Intermediate Metadata (DIM).
    • The DIM metadata is not intended for use outside of DSpace: according to the DSpace website, "[DIM] is used by XsltCrosswalk. It is called the Intermediate format because it is intended solely as an intermediate stage in XML-translation-based crosswalks. To reiterate, This is an INTERMEDIATE format, it is NOT for exporting or harvesting metadata!"
  • What should we do with the MODS metadata?
    • Leave it in the DSpace METS file and just link the object to its METS file?
    • Add an <mdRef> to the Archivematica METS file to link each object to its MODS metadata?
    • Add the MODS metadata to the Archivematica METS file as <mdWrap>?
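
To make the mdRef option more concrete, here is a minimal Python sketch that builds a dmdSec containing an mdRef pointing at the MODS section of a DSpace METS file. Nothing here is settled design: the function name mods_mdref, the LOCTYPE choice, the XPTR syntax, the path metadata/mets.xml and the ID dmd_2 are all hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Use readable prefixes when the element is serialized.
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def mods_mdref(dmd_id, dspace_mets_href, mods_dmdsec_id):
    """Build a dmdSec whose mdRef points into an external DSpace METS file.

    dmd_id           ID for the new dmdSec in the Archivematica METS file
    dspace_mets_href location of the DSpace mets.xml (hypothetical path)
    mods_dmdsec_id   ID of the MODS dmdSec inside the DSpace METS file
    """
    dmd_sec = ET.Element("{%s}dmdSec" % METS_NS, {"ID": dmd_id})
    ET.SubElement(
        dmd_sec,
        "{%s}mdRef" % METS_NS,
        {
            "LOCTYPE": "URL",  # assumption: a relative URL is acceptable
            "MDTYPE": "MODS",
            "{%s}href" % XLINK_NS: dspace_mets_href,
            # assumption: an XPointer by ID is enough to locate the MODS
            "XPTR": "xpointer(id('%s'))" % mods_dmdsec_id,
        },
    )
    return dmd_sec


if __name__ == "__main__":
    element = mods_mdref("dmdSec_1", "metadata/mets.xml", "dmd_2")
    print(ET.tostring(element, encoding="unicode"))
```

The mdWrap option would instead copy the MODS element itself into the Archivematica METS file, which duplicates data but removes the dependency on the DSpace file.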

Checksums

Each object and license has an MD5 checksum recorded in the fileSec.

FileSec.png

Archivematica should verify these checksums after transfer.
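
A minimal sketch of such a verification step, in Python, might look like the following. It assumes that each file element in the fileSec carries CHECKSUM and CHECKSUMTYPE attributes and that its FLocat href is a path relative to the directory holding mets.xml; the function names md5_of and verify_checksums are made up for this example.

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def md5_of(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(item_dir):
    """Map each MD5-checksummed href in mets.xml to True (match) or False."""
    item_dir = Path(item_dir)
    root = ET.parse(item_dir / "mets.xml").getroot()
    results = {}
    for file_el in root.iter("{%s}file" % METS_NS):
        expected = file_el.get("CHECKSUM")
        if not expected or file_el.get("CHECKSUMTYPE") != "MD5":
            continue  # only MD5 checksums are handled in this sketch
        flocat = file_el.find("{%s}FLocat" % METS_NS)
        if flocat is None:
            continue
        href = flocat.get("{%s}href" % XLINK_NS)
        if not href:
            continue
        target = item_dir / href
        # A missing file counts as a failed verification.
        results[href] = target.is_file() and md5_of(target) == expected.lower()
    return results
```

Calling verify_checksums on an extracted item directory returns a dictionary that can be checked for any False values before the transfer is accepted.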

Collection-level METS files

The collection-level METS file contains MODS and DIM metadata for the collection; the MODS metadata should be linked from or added to the Archivematica METS file.

Parsing a DSpace collection export in Archivematica

Requirements:

  • Map the elements of the DSpace AIPs to the Archivematica AIP
    • Keep the object in /objects
    • Move the license file to /metadata/submissionDocumentation
    • Move the DSpace mets.xml file to /metadata
  • Structure the Archivematica mets.xml file to point to the DSpace mets.xml files
    • Question: how do we link the object to the DSpace METS file? Give the METS file a UUID and make the link in the PREMIS relationships container?
  • Index the metadata in all the XML files
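
The directory mapping in the first requirement could be prototyped roughly as follows. This Python sketch copies one extracted DSpace item into a new directory with the layout described above. It assumes that license bitstreams sit in a fileGrp with USE="LICENSE" and that each item gets its own target directory (otherwise the mets.xml files of different items would collide); the names license_hrefs and map_item are hypothetical. It does not address the linking or indexing requirements.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def license_hrefs(mets_path):
    """Collect hrefs of files listed under fileGrp USE="LICENSE" (assumed value)."""
    root = ET.parse(mets_path).getroot()
    hrefs = set()
    for grp in root.iter("{%s}fileGrp" % METS_NS):
        if grp.get("USE") != "LICENSE":
            continue
        for flocat in grp.iter("{%s}FLocat" % METS_NS):
            href = flocat.get("{%s}href" % XLINK_NS)
            if href:
                hrefs.add(href)
    return hrefs


def map_item(item_dir, transfer_dir):
    """Copy one extracted DSpace item into objects/ and metadata/ directories."""
    item_dir, transfer_dir = Path(item_dir), Path(transfer_dir)
    objects = transfer_dir / "objects"
    metadata = transfer_dir / "metadata"
    submission = metadata / "submissionDocumentation"
    for directory in (objects, submission):
        directory.mkdir(parents=True, exist_ok=True)

    licenses = license_hrefs(item_dir / "mets.xml")
    placed = {}
    for source in sorted(item_dir.iterdir()):
        if not source.is_file():
            continue
        if source.name == "mets.xml":
            dest = metadata      # DSpace METS file goes to /metadata
        elif source.name in licenses:
            dest = submission    # license goes to submissionDocumentation
        else:
            dest = objects       # everything else is treated as an object
        shutil.copy2(source, dest / source.name)
        placed[source.name] = (dest / source.name).relative_to(transfer_dir).as_posix()
    return placed
```

Copying rather than moving keeps the original export untouched, which makes it easier to re-run the mapping while the requirements are still being worked out.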