Difference between revisions of "UM Transfer metadata import 1.2"

 
Line 12:
  #The user adds a csv file to the metadata folder for the transfer entitled ''metadata.csv'' ('''figure 1''')
  #*The first row of the csv file consists of field names. '''Field names must not include spaces.''' ('''figure 2''')
- #*Dublin Core field names must contain the "dc" element in the name, eg "dc.title"
+ #*Simple Dublin Core field names must contain the "dc" element in the name, eg "dc.title". The dc namespace element should only be included if it is in our template (see [[METS#dmdSec]]). All 'other' metadata, even if it's Dublin Core, must be indicated without a namespace (so just 'medium' instead of 'dc.medium', for example). The simple DC that aligns with our template will be added into the main dmdSec of the METS.xml, and the rest goes into a second dmdSec as type "other".
  #*Each subsequent row contains the complete set of field values for a single directory or file
  #*For multi-value fields (such as dc.subject), the entire column is repeated and each column contains a single value
Line 19:
  #*Note that filenames can be duplicates of filenames in other subdirectories if desired. For example, the name "page01.jp2" can occur in multiple subdirectories.
  #At the generate METS micro-service, Archivematica parses the metadata in ''metadata.csv'' to the METS file, as follows:
- #*All Dublin Core elements are used to generate a dmdSec for each directory or file with MDTYPE="DC"
+ #*All simple Dublin Core elements are used to generate a dmdSec for each directory or file with MDTYPE="DC"
- #*All non-Dublin Core elements are used to generate a dmdSec for each directory or file with MDTYPE="OTHER" OTHERMDTYPE="CUSTOM"
+ #*All non-simple Dublin Core elements are used to generate a dmdSec for each directory or file with MDTYPE="OTHER" OTHERMDTYPE="CUSTOM"
  #*The dmdSecs are linked to their directories or files in the structMap
  </div>

Revision as of 11:05, 18 December 2014


General overview

This page documents the workflow and METS content for lower-level metadata import, i.e. metadata attached to individual subdirectories and files within a SIP.

Workflow

  1. For simple objects, the user places files in the objects directory, with or without intervening subdirectories. The imported metadata are attached to each object.
  2. For compound objects, the user creates one or more subdirectories in the objects directory, each containing the files that form a compound object. The imported metadata are attached to each subdirectory.
    • The subdirectory names must not contain spaces or other forbidden characters.
  3. The user adds a CSV file named metadata.csv to the metadata folder of the transfer (figure 1).
    • The first row of the CSV file consists of field names. Field names must not include spaces (figure 2).
    • Simple Dublin Core field names must include the "dc" namespace prefix, e.g. "dc.title". Use the prefix only for elements that appear in our template (see METS#dmdSec). All other metadata, even if it is Dublin Core, must be given without a namespace (for example, "medium" rather than "dc.medium"). Simple DC fields that match the template are written to the main dmdSec of the METS.xml; everything else goes into a second dmdSec of type "other".
    • Each subsequent row contains the complete set of field values for a single directory or file.
    • For multi-value fields (such as dc.subject), the entire column is repeated, and each copy of the column holds a single value.
    • If the metadata are for simple objects, the CSV file must contain a "filename" column listing the path and filename of each object, e.g. "objects/BrocktonOval.jp2".
    • If the metadata are for compound objects, the CSV file must contain a "parts" column listing the names of the directories containing the items that form each compound object, e.g. "objects/Jan021964".
    • Filenames may duplicate filenames in other subdirectories. For example, the name "page01.jp2" can occur in multiple subdirectories.
  4. At the generate METS micro-service, Archivematica writes the metadata from metadata.csv into the METS file, as follows:
    • All simple Dublin Core elements are used to generate a dmdSec with MDTYPE="DC" for each directory or file.
    • All other elements (everything that is not simple Dublin Core) are used to generate a dmdSec with MDTYPE="OTHER" OTHERMDTYPE="CUSTOM" for each directory or file.
    • The dmdSecs are linked to their directories or files in the structMap.
Figure 1 Metadata folder in transfer directory contains metadata.csv file
Figure 2 Example csv file contents
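
The split between the two dmdSecs described in step 4 can be sketched in Python. This is only an illustration, not Archivematica's own code: the function name split_row is hypothetical, and the sketch assumes that every field name starting with "dc." belongs to the simple Dublin Core template, which simplifies the template rule stated above.

```python
import csv
import io

# Columns that identify the target file or directory rather than metadata.
KEY_COLUMNS = {"filename", "parts"}


def split_row(header, row):
    """Partition one CSV row into simple-DC fields and all other fields.

    Assumption: a name starting with "dc." is treated as simple Dublin Core.
    Values are collected in lists so that repeated columns (for example two
    dc.subject columns) keep every value in column order. Empty cells are
    skipped.
    """
    dc_fields, other_fields = {}, {}
    for name, value in zip(header, row):
        if name in KEY_COLUMNS or not value:
            continue
        target = dc_fields if name.startswith("dc.") else other_fields
        target.setdefault(name, []).append(value)
    return dc_fields, other_fields


sample = io.StringIO(
    "filename,dc.title,dc.subject,dc.subject,repository\n"
    "objects/BrocktonOval.jp2,Stanley Park in December,"
    "Vancouver (B.C.)--Parks,Landscapes,New Caledonia Public Library\n"
)
reader = csv.reader(sample)
header = next(reader)
first_row = next(reader)
dc_fields, other_fields = split_row(header, first_row)
```

Here dc_fields would feed the MDTYPE="DC" dmdSec and other_fields the MDTYPE="OTHER" one; both dc.subject values end up under the same key.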


Simple objects

This section provides CSV file and METS file examples for simple objects, i.e. individual files that are not pages in a compound object such as a book or a newspaper issue.

CSV file

Sample headings and values

filename | dc.title | dcterms.issued | dc.publisher | dc.contributor | dc.subject | dc.subject | dc.date | dc.description | notes | dcterms.isPartOf | repository | dc.rights | project_website | dc.format
objects/BrocktonOval.jp2 | Stanley Park in December | 1996-01-17 | Riley Studios, Vancouver BC | Don Langfield, photographer | Vancouver (B.C.)--Parks | Landscapes | 1992/12/04 | Image shows Brockton Oval after light snowfall | Originally part of series entitled "Winter in Vancouver" | Riley Studios collection | New Caledonia Public Library | Copyright held by Riley Studios | http://www.ncpl/donlangfieldphotographs.ca | image/jp2
objects/QE Park sunset.jp2 | Sunset in Queen Elizabeth Park | | Riley Studios, Vancouver BC | Don Langfield, photographer | Vancouver (B.C.)--Parks | | 1994/07/13 | | | Riley Studios collection | New Caledonia Public Library | Copyright held by Riley Studios | http://www.ncpl/donlangfieldphotographs.ca | image/jp2
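
A file with a repeated column like the sample above can be produced with Python's standard csv module. The following is a minimal sketch: the helper name build_metadata_csv is hypothetical and only a subset of the sample columns is used. It uses csv.writer rather than csv.DictWriter because a dictionary cannot hold two dc.subject keys.

```python
import csv
import io

HEADER = ["filename", "dc.title", "dc.publisher", "dc.subject", "dc.subject", "dc.date"]

ROWS = [
    ["objects/BrocktonOval.jp2", "Stanley Park in December",
     "Riley Studios, Vancouver BC", "Vancouver (B.C.)--Parks", "Landscapes",
     "1992/12/04"],
    # The second dc.subject cell is left empty for this object.
    ["objects/QE Park sunset.jp2", "Sunset in Queen Elizabeth Park",
     "Riley Studios, Vancouver BC", "Vancouver (B.C.)--Parks", "",
     "1994/07/13"],
]


def build_metadata_csv(header, rows):
    """Return CSV text for the given header and rows.

    Field names containing spaces are rejected, and every row must supply
    one cell per header column so that values stay aligned.
    """
    for name in header:
        if " " in name:
            raise ValueError(f"field name contains a space: {name!r}")
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        if len(row) != len(header):
            raise ValueError("row length does not match header length")
        writer.writerow(row)
    return buffer.getvalue()


csv_text = build_metadata_csv(HEADER, ROWS)
```

The writer quotes values that contain commas, such as the publisher, so they survive a round trip through csv.reader. Writing csv_text to metadata/metadata.csv in the transfer is left out of the sketch.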


METS file

[Images: Mets 1g.png, Mets 2g.png, Mets 3g.png]
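
For orientation, the general shape of the two dmdSecs and their structMap link can be sketched with Python's xml.etree.ElementTree. This is an assumption-laden illustration, not a reproduction of Archivematica's output: the function make_dmdsec, the dmdSec IDs and the LABEL value are hypothetical. In the METS schema the MDTYPE and OTHERMDTYPE attributes sit on the mdWrap element inside the dmdSec, and a structMap div refers to its dmdSecs through a space-separated DMDID attribute.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("dc", DC_NS)


def make_dmdsec(dmd_id, fields, is_dc):
    """Build one dmdSec element from a {field name: [values]} mapping."""
    dmdsec = ET.Element(f"{{{METS_NS}}}dmdSec", {"ID": dmd_id})
    if is_dc:
        attrs = {"MDTYPE": "DC"}
    else:
        attrs = {"MDTYPE": "OTHER", "OTHERMDTYPE": "CUSTOM"}
    wrap = ET.SubElement(dmdsec, f"{{{METS_NS}}}mdWrap", attrs)
    xml_data = ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")
    for name, values in fields.items():
        # "dc.title" becomes <dc:title>; other names are used unchanged.
        tag = f"{{{DC_NS}}}{name.split('.', 1)[1]}" if is_dc else name
        for value in values:
            ET.SubElement(xml_data, tag).text = value
    return dmdsec


dc_sec = make_dmdsec("dmdSec_1", {"dc.title": ["Stanley Park in December"]}, True)
other_sec = make_dmdsec("dmdSec_2", {"repository": ["New Caledonia Public Library"]}, False)

# The structMap div for the file points at both dmdSecs (hypothetical IDs).
file_div = ET.Element(
    f"{{{METS_NS}}}div",
    {"LABEL": "BrocktonOval.jp2", "DMDID": "dmdSec_1 dmdSec_2"},
)

xml_text = ET.tostring(dc_sec, encoding="unicode")
```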

Compound objects

This section provides CSV file and METS file examples for compound objects, i.e. multi-page digital objects such as newspapers and books.

CSV file

Sample headings and values

parts | dc.title | alternative_title | dc.publisher | dates_of_publication | dc.subject | dc.date | dc.description | frequency | dc.language | forms_part_of | repository | project_website | digital_file_format
objects/Jan021964 | Coast News, January 02, 1964 | Sunshine Coast News | Fred Cruice | 1945-1995 | Gibsons (B.C.)--Newspapers | 1964/01/02 | Serving the Growing Sunshine Coast | Weekly | English | British Columbia Historical Newspapers collection | Sunshine Coast Museum and Archives | http://historicalnewspapers.library.ubc.ca | image/jp2
objects/Jan091964 | Coast News, January 09, 1964 | Sunshine Coast News | Fred Cruice | 1945-1995 | Gibsons (B.C.)--Newspapers | 1964/01/09 | Serving the Growing Sunshine Coast | Weekly | English | British Columbia Historical Newspapers collection | Sunshine Coast Museum and Archives | http://historicalnewspapers.library.ubc.ca | image/jp2
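
Before starting a transfer, it can help to check which kind of CSV has been prepared and whether the directory names follow the no-spaces rule. The sketch below is a hypothetical pre-flight check, not part of Archivematica. It assumes that a file contains exactly one of the "filename" and "parts" columns, which the workflow above implies but does not state, and it checks only for spaces because the other forbidden characters are not listed here.

```python
import csv
import io


def find_target_column(header):
    """Return "filename" (simple objects) or "parts" (compound objects)."""
    has_filename = "filename" in header
    has_parts = "parts" in header
    if has_filename == has_parts:
        raise ValueError("expected exactly one of 'filename' or 'parts'")
    return "filename" if has_filename else "parts"


def index_by_target(csv_text):
    """Map each target path to its full row, checking directory names."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    column = find_target_column(header)
    position = header.index(column)
    indexed = {}
    for row in reader:
        target = row[position]
        # Compound-object directory names must not contain spaces.
        if column == "parts" and " " in target:
            raise ValueError(f"directory name contains a space: {target!r}")
        indexed[target] = row
    return column, indexed


compound_csv = (
    "parts,dc.title,frequency\n"
    'objects/Jan021964,"Coast News, January 02, 1964",Weekly\n'
    'objects/Jan091964,"Coast News, January 09, 1964",Weekly\n'
)
column, rows_by_directory = index_by_target(compound_csv)
```

Each key of rows_by_directory names one compound object, so the metadata in that row applies to the directory as a whole rather than to the page files inside it.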



METS file

[Images: Mets 4g.png, Mets 5g.png, Mets 6g.png]