UM Transfer metadata import 1.2
Latest revision as of 11:06, 18 December 2014
General overview
This page documents the workflow and METS content for lower-level metadata import - i.e. metadata to be attached to subdirectories and files within a SIP.
Workflow
- For simple objects, the user places files in the objects directory, with or without intervening subdirectories. The imported metadata are attached to each object.
- For compound objects, the user creates one or more subdirectories in the objects directory, each containing the files that form a compound object. The imported metadata are attached to each subdirectory.
  - The subdirectory names must not contain spaces or other forbidden characters.
- The user adds a csv file named metadata.csv to the metadata folder for the transfer (figure 1).
  - The first row of the csv file consists of field names. Field names must not include spaces (figure 2).
  - Simple Dublin Core field names must include the "dc" prefix, e.g. "dc.title". Use the dc prefix only for elements that appear in our template (see METS#dmdSec). All other metadata, even if it is Dublin Core, must be given without a namespace prefix (for example, "medium" rather than "dc.medium"). The simple DC fields that match our template are added to the main dmdSec of the METS.xml; the rest go into a second dmdSec of type "other".
  - Each subsequent row contains the complete set of field values for a single directory or file.
  - For multi-value fields (such as dc.subject), the column is repeated and each copy of the column contains a single value.
  - If the metadata are for simple objects, the csv file must contain a "filename" column listing the path and filename of each object, e.g. "objects/BrocktonOval.jp2".
  - If the metadata are for compound objects, the csv file must contain a "parts" column listing the names of the directories containing the items that form each compound object, e.g. "objects/Jan021964".
  - Note that filenames can duplicate filenames in other subdirectories. For example, the name "page01.jp2" can occur in multiple subdirectories.
- At the generate METS micro-service, Archivematica parses the metadata in metadata.csv into the METS file, as follows:
  - All simple Dublin Core elements are used to generate a dmdSec for each directory or file with MDTYPE="DC".
  - All non-simple Dublin Core and other metadata elements are used to generate a dmdSec for each directory or file with MDTYPE="OTHER" OTHERMDTYPE="CUSTOM".
  - The dmdSecs are linked to their directories or files in the structMap.
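The split between simple Dublin Core fields and other fields can be pictured with a short Python sketch. It is only an illustration of the rules above, not Archivematica's own parser: the function name split_row is hypothetical, and treating every column that starts with "dc." as simple Dublin Core is a simplifying assumption (the real check is against the template).

```python
import csv
import io
from collections import defaultdict


def split_row(header, row):
    """Group one CSV row into simple-DC fields and other fields.

    Repeated column names (e.g. two dc.subject columns) accumulate into
    a list of values; empty cells are skipped.
    """
    dc, other = defaultdict(list), defaultdict(list)
    for name, value in zip(header, row):
        # The key column identifies the object; it is not metadata itself.
        if name in ("filename", "parts") or not value.strip():
            continue
        # Assumption: a "dc." prefix marks a simple Dublin Core field.
        target = dc if name.startswith("dc.") else other
        target[name].append(value.strip())
    return dict(dc), dict(other)


sample = io.StringIO(
    "filename,dc.title,dc.subject,dc.subject,notes\n"
    "objects/BrocktonOval.jp2,Stanley Park in December,"
    "Vancouver (B.C.)--Parks,Landscapes,Part of a series\n"
)
reader = csv.reader(sample)  # csv.reader keeps repeated column names
header = next(reader)
for row in reader:
    dc_fields, other_fields = split_row(header, row)
    print(row[0], dc_fields, other_fields)
```

The sketch uses csv.reader rather than csv.DictReader because a dictionary keyed on column name would keep only one of the repeated dc.subject values.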
Simple objects
This section provides csv file and METS file examples for simple objects - i.e. individual files that are not pages in a compound object such as a book or a newspaper issue.
CSV file
Sample headings and values
filename | dc.title | dcterms.issued | dc.publisher | dc.contributor | dc.subject | dc.subject | dc.date | dc.description | notes | dcterms.isPartOf | repository | dc.rights | project_website | dc.format |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
objects/BrocktonOval.jp2 | Stanley Park in December | 1996-01-17 | Riley Studios, Vancouver BC | Don Langfield, photographer | Vancouver (B.C.)--Parks | Landscapes | 1992/12/04 | Image shows Brockton Oval after light snowfall | Originally part of series entitled "Winter in Vancouver" | Riley Studios collection | New Caledonia Public Library | Copyright held by Riley Studios | http://www.ncpl/donlangfieldphotographs.ca | image/jp2 |
objects/QE Park sunset.jp2 | Sunset in Queen Elizabeth Park | | Riley Studios, Vancouver BC | Don Langfield, photographer | Vancouver (B.C.)--Parks | | 1994/07/13 | | | Riley Studios collection | New Caledonia Public Library | Copyright held by Riley Studios | http://www.ncpl/donlangfieldphotographs.ca | image/jp2 |
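A file with this shape could be produced with Python's standard csv module, as in the minimal sketch below. It uses a shortened set of columns from the sample, and the helper name render_csv is hypothetical. The point it illustrates is that a repeated column such as dc.subject is written as two separate header cells, and a row with fewer values leaves the extra cell empty rather than dropping it.

```python
import csv
import io

# Shortened version of the sample table: dc.subject appears twice.
header = ["filename", "dc.title", "dc.subject", "dc.subject", "dc.date"]
rows = [
    ["objects/BrocktonOval.jp2", "Stanley Park in December",
     "Vancouver (B.C.)--Parks", "Landscapes", "1992/12/04"],
    # Only one subject here, so the second dc.subject cell stays empty.
    ["objects/QE Park sunset.jp2", "Sunset in Queen Elizabeth Park",
     "Vancouver (B.C.)--Parks", "", "1994/07/13"],
]


def render_csv(header, rows):
    """Return CSV text, checking that every row has one cell per column."""
    for row in rows:
        if len(row) != len(header):
            raise ValueError(f"expected {len(header)} cells, got {len(row)}")
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


csv_text = render_csv(header, rows)
print(csv_text)
```

To create the actual file, the same text could be written to metadata.csv in the transfer's metadata folder.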
METS file
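As a rough illustration of the two dmdSec types described in the workflow, the following Python sketch builds one dmdSec with MDTYPE="DC" and one with MDTYPE="OTHER" OTHERMDTYPE="CUSTOM". The MDTYPE values come from this page, and the namespace URIs are the standard METS and Dublin Core ones. Everything else is an assumption made for the example: the function name build_dmdsec, the dmdSec IDs, and the mdWrap/xmlData nesting, which follows general METS practice and may differ from what Archivematica actually writes.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("dc", DC_NS)


def build_dmdsec(dmd_id, fields, simple_dc):
    """Build a dmdSec element from {field name: [values]} (illustrative only)."""
    dmdsec = ET.Element(f"{{{METS_NS}}}dmdSec", {"ID": dmd_id})
    if simple_dc:
        attrs = {"MDTYPE": "DC"}
    else:
        attrs = {"MDTYPE": "OTHER", "OTHERMDTYPE": "CUSTOM"}
    mdwrap = ET.SubElement(dmdsec, f"{{{METS_NS}}}mdWrap", attrs)
    xmldata = ET.SubElement(mdwrap, f"{{{METS_NS}}}xmlData")
    for name, values in fields.items():
        for value in values:
            if simple_dc:
                # "dc.title" becomes a <dc:title> element.
                tag = f"{{{DC_NS}}}{name.split('.', 1)[1]}"
            else:
                # Other fields keep their column name, without a namespace.
                tag = name
            ET.SubElement(xmldata, tag).text = value
    return dmdsec


dc_sec = build_dmdsec(
    "dmdSec_1",  # hypothetical ID
    {"dc.title": ["Stanley Park in December"],
     "dc.subject": ["Vancouver (B.C.)--Parks", "Landscapes"]},
    simple_dc=True,
)
other_sec = build_dmdsec(
    "dmdSec_2",  # hypothetical ID
    {"repository": ["New Caledonia Public Library"]},
    simple_dc=False,
)
print(ET.tostring(dc_sec, encoding="unicode"))
print(ET.tostring(other_sec, encoding="unicode"))
```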
Compound objects
This section provides csv file and METS file examples for compound objects - i.e. multi-page digital objects such as newspapers and books.
CSV file
Sample headings and values
parts | dc.title | alternative_title | dc.publisher | dates_of_publication | dc.subject | dc.date | dc.description | frequency | dc.language | forms_part_of | repository | project_website | digital_file_format |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
objects/Jan021964 | Coast News, January 02, 1964 | Sunshine Coast News | Fred Cruice | 1945-1995 | Gibsons (B.C.)--Newspapers | 1964/01/02 | Serving the Growing Sunshine Coast | Weekly | English | British Columbia Historical Newspapers collection | Sunshine Coast Museum and Archives | http://historicalnewspapers.library.ubc.ca | image/jp2 |
objects/Jan091964 | Coast News, January 09, 1964 | Sunshine Coast News | Fred Cruice | 1945-1995 | Gibsons (B.C.)--Newspapers | 1964/01/09 | Serving the Growing Sunshine Coast | Weekly | English | British Columbia Historical Newspapers collection | Sunshine Coast Museum and Archives | http://historicalnewspapers.library.ubc.ca | image/jp2 |
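Before starting a transfer, a file like this could be checked against the rules stated in the workflow. The sketch below is a hypothetical pre-flight check, not part of Archivematica: the function name check_metadata_csv is invented for the example, and it tests only for spaces because the page does not list the other forbidden characters.

```python
import csv
import io


def check_metadata_csv(text):
    """Return a list of problems found in metadata.csv text (empty if none)."""
    problems = []
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    header, data = rows[0], rows[1:]
    # Rule: field names must not include spaces.
    for name in header:
        if " " in name:
            problems.append(f"field name contains a space: {name!r}")
    # Rule: a "filename" (simple) or "parts" (compound) column is required.
    keys = [name for name in ("filename", "parts") if name in header]
    if not keys:
        problems.append('missing a "filename" or "parts" column')
        return problems
    key_index = header.index(keys[0])
    for line_no, row in enumerate(data, start=2):
        if len(row) != len(header):
            problems.append(
                f"row {line_no}: expected {len(header)} cells, found {len(row)}"
            )
            continue
        # Rule: subdirectory names for compound objects must not contain spaces.
        path = row[key_index]
        if keys[0] == "parts" and " " in path:
            problems.append(
                f"row {line_no}: directory name contains a space: {path!r}"
            )
    return problems


good = 'parts,dc.title,frequency\nobjects/Jan021964,"Coast News, January 02, 1964",Weekly\n'
bad = "parts,dc.title,alternative title\nobjects/Jan 02 1964,Coast News,Sunshine Coast News\n"
print(check_metadata_csv(good))
print(check_metadata_csv(bad))
```

Note that values containing commas, such as the newspaper titles in the sample, need to be quoted in the csv file, as in the good example here.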
METS file