Metadata import

Latest revision as of 17:26, 11 February 2020


This page is no longer being maintained and may contain inaccurate information. Please see the Archivematica documentation for up-to-date information.

This page documents the workflow and METS content for lower-level metadata import - i.e. metadata to be attached to subdirectories and files within a SIP.

Workflow

  1. For simple objects, the user places files in the objects directory, with or without intervening subdirectories. The imported metadata are attached to each object.
  2. For compound objects, the user creates one or more subdirectories in the objects directory, each containing the files that form a compound object. The imported metadata are attached to each subdirectory.
    • The subdirectory names must not contain spaces or other forbidden characters.
  3. The user adds a csv file named metadata.csv to the transfer's metadata folder.
    • The first row of the csv file consists of field names.
    • Dublin Core field names must contain "dc" in the name, e.g. "dc.title".
    • Each subsequent row contains the complete set of field values for a single directory or file.
    • For multi-value fields (such as dc.subject), the column is repeated once per value, and each copy of the column holds a single value.
    • If the metadata are for simple objects, the csv file must contain a "filename" column listing the file path and filename of each object, e.g. "objects/BrocktonOval.jp2".
    • If the metadata are for compound objects, the csv file must contain a "parts" column listing the names of the directories containing the items that form the compound object, e.g. "objects/Jan021964".
    • Filenames may duplicate filenames in other subdirectories. For example, the name "page01.jp2" can occur in multiple subdirectories.
  4. At the generate METS micro-service, Archivematica writes the metadata from metadata.csv into the METS file, as follows:
    • All Dublin Core elements are used to generate a dmdSec for each directory or file with MDTYPE="DC"
    • All non-Dublin Core elements are used to generate a dmdSec for each directory or file with MDTYPE="OTHER" OTHERMDTYPE="CUSTOM"
    • The dmdSecs are linked to their directories or files in the structMap
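
The csv rules in step 3 can be sketched in Python as follows. This is a minimal illustration, not Archivematica's actual parser: the function name read_metadata_rows and the shape of its return value are hypothetical, and two behaviours are assumptions made for the sketch: a column counts as Dublin Core when its name begins with "dc." or "dcterms.", and empty cells are skipped. Because a multi-value field repeats its column, the sketch uses csv.reader rather than csv.DictReader, which would keep only the last value for a repeated header.

```python
import csv
import io


def read_metadata_rows(csv_text):
    """Parse metadata.csv text into one record per file or directory.

    Hypothetical helper: returns a list of (path, dc_fields, other_fields)
    tuples, where each *_fields dict maps a field name to a list of values.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    # Exactly one of "filename" (simple objects) or "parts" (compound
    # objects) is expected to identify what each row describes.
    key_columns = [name for name in header if name in ("filename", "parts")]
    if len(key_columns) != 1:
        raise ValueError('expected exactly one "filename" or "parts" column')
    key_index = header.index(key_columns[0])

    records = []
    for row in data:
        dc_fields, other_fields = {}, {}
        for index, (name, value) in enumerate(zip(header, row)):
            if index == key_index or value == "":
                continue  # assumption: empty cells produce no metadata
            # Assumption: the prefix before the first dot marks Dublin Core.
            is_dc = name.split(".")[0] in ("dc", "dcterms")
            target = dc_fields if is_dc else other_fields
            # Repeated columns (e.g. dc.subject) accumulate into one list.
            target.setdefault(name, []).append(value)
        records.append((row[key_index], dc_fields, other_fields))
    return records
```

Each returned tuple keeps the Dublin Core and non-Dublin Core fields apart, mirroring the split into MDTYPE="DC" and MDTYPE="OTHER" dmdSecs described in step 4.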


Simple objects

This section provides csv file and METS file examples for simple objects - i.e. individual files that are not pages in a compound object such as a book or a newspaper issue.

CSV file

Sample headings and values

filename | dc.title | dcterms.issued | dc.publisher | dc.contributor | dc.subject | dc.subject | dc.date | dc.description | notes | dcterms.isPartOf | repository | dc.rights | project_website | dc.format
objects/BrocktonOval.jp2 | Stanley Park in December | 1996-01-17 | Riley Studios, Vancouver BC | Don Langfield, photographer | Vancouver (B.C.)--Parks | Landscapes | 1992/12/04 | Image shows Brockton Oval after light snowfall | Originally part of series entitled "Winter in Vancouver" | Riley Studios collection | New Caledonia Public Library | Copyright held by Riley Studios | http://www.ncpl/donlangfieldphotographs.ca | image/jp2
objects/QE Park sunset.jp2 | Sunset in Queen Elizabeth Park | | Riley Studios, Vancouver BC | Don Langfield, photographer | Vancouver (B.C.)--Parks | | 1994/07/13 | | | Riley Studios collection | New Caledonia Public Library | Copyright held by Riley Studios | http://www.ncpl/donlangfieldphotographs.ca | image/jp2
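
To make the mapping from a csv row to METS more concrete, the following Python sketch builds one dmdSec from a set of fields. It is an illustration under stated assumptions, not the code Archivematica runs: build_dmdsec is a hypothetical helper, the dmdSec / mdWrap / xmlData nesting is the standard METS wrapper, and the choice to emit each Dublin Core field as an element in the dc or dcterms namespace, and each custom field as an un-namespaced element named after its column, is an assumption that may differ from the METS output pictured on this page.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DC_NS = "http://purl.org/dc/elements/1.1/"
DCTERMS_NS = "http://purl.org/dc/terms/"

# Readable prefixes when the tree is serialized.
ET.register_namespace("mets", METS_NS)
ET.register_namespace("dc", DC_NS)
ET.register_namespace("dcterms", DCTERMS_NS)


def build_dmdsec(dmd_id, fields, custom=False):
    """Build a dmdSec element from {field name: [values]} (hypothetical helper)."""
    dmdsec = ET.Element(f"{{{METS_NS}}}dmdSec", {"ID": dmd_id})
    if custom:
        attrs = {"MDTYPE": "OTHER", "OTHERMDTYPE": "CUSTOM"}
    else:
        attrs = {"MDTYPE": "DC"}
    md_wrap = ET.SubElement(dmdsec, f"{{{METS_NS}}}mdWrap", attrs)
    xml_data = ET.SubElement(md_wrap, f"{{{METS_NS}}}xmlData")
    for name, values in fields.items():
        for value in values:
            if custom:
                # Assumption: the column name is used directly as the tag,
                # so it must be a valid XML name (no spaces).
                tag = name
            else:
                # "dc.title" -> dc:title, "dcterms.issued" -> dcterms:issued
                prefix, _, term = name.partition(".")
                namespace = DCTERMS_NS if prefix == "dcterms" else DC_NS
                tag = f"{{{namespace}}}{term}"
            ET.SubElement(xml_data, tag).text = value
    return dmdsec


dc_section = build_dmdsec(
    "dmdSec_1",
    {"dc.title": ["Stanley Park in December"], "dcterms.issued": ["1996-01-17"]},
)
custom_section = build_dmdsec(
    "dmdSec_2", {"repository": ["New Caledonia Public Library"]}, custom=True
)
print(ET.tostring(dc_section, encoding="unicode"))
print(ET.tostring(custom_section, encoding="unicode"))
```

Using the column name as the element name for custom fields is one reason a sketch like this needs field names without spaces, such as project_website in the sample above.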


METS file

[Image: Mets 1g.png]
[Image: Mets 2g.png]
[Image: Mets 3g.png]

Compound objects

This section provides csv file and METS file examples for compound objects - i.e. multi-page digital objects such as newspapers and books.

CSV file

Sample headings and values

parts | dc.title | alternative_title | dc.publisher | dates_of_publication | dc.subject | dc.date | dc.description | frequency | dc.language | forms_part_of | repository | project_website | digital_file_format
objects/Jan021964 | Coast News, January 02, 1964 | Sunshine Coast News | Fred Cruice | 1945-1995 | Gibsons (B.C.)--Newspapers | 1964/01/02 | Serving the Growing Sunshine Coast | Weekly | English | British Columbia Historical Newspapers collection | Sunshine Coast Museum and Archives | http://historicalnewspapers.library.ubc.ca | image/jp2
objects/Jan091964 | Coast News, January 09, 1964 | Sunshine Coast News | Fred Cruice | 1945-1995 | Gibsons (B.C.)--Newspapers | 1964/01/09 | Serving the Growing Sunshine Coast | Weekly | English | British Columbia Historical Newspapers collection | Sunshine Coast Museum and Archives | http://historicalnewspapers.library.ubc.ca | image/jp2
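
For compound objects, the metadata are attached to the directory rather than to individual files, so the link from the structMap goes from a directory-level div to the dmdSecs. The Python sketch below shows one way this could look. It is hedged throughout: build_struct_map is a hypothetical helper, the TYPE and LABEL attribute values are assumptions, and the name check covers only spaces, not the other forbidden characters mentioned in the workflow. The DMDID attribute itself is standard METS and holds a space-separated list of dmdSec IDs.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def build_struct_map(dmd_ids_by_path):
    """Build a structMap whose directory divs reference dmdSec IDs.

    Hypothetical helper. dmd_ids_by_path maps a "parts" value such as
    "objects/Jan021964" to the list of dmdSec IDs generated for that row.
    """
    struct_map = ET.Element(f"{{{METS_NS}}}structMap", {"TYPE": "physical"})
    objects_div = ET.SubElement(
        struct_map,
        f"{{{METS_NS}}}div",
        {"TYPE": "Directory", "LABEL": "objects"},  # assumed attribute values
    )
    for path, dmd_ids in sorted(dmd_ids_by_path.items()):
        label = path.split("/")[-1]
        # Partial check only: the workflow also forbids other characters.
        if " " in label:
            raise ValueError(f"subdirectory name contains a space: {label!r}")
        ET.SubElement(
            objects_div,
            f"{{{METS_NS}}}div",
            {"TYPE": "Directory", "LABEL": label, "DMDID": " ".join(dmd_ids)},
        )
    return struct_map


struct_map = build_struct_map(
    {
        "objects/Jan021964": ["dmdSec_1", "dmdSec_2"],
        "objects/Jan091964": ["dmdSec_3", "dmdSec_4"],
    }
)
print(ET.tostring(struct_map, encoding="unicode"))
```

Here each directory references two dmdSecs, on the assumption that one holds its Dublin Core fields and the other its custom fields.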



METS file

[Image: Mets 4g.png]
[Image: Mets 5g.png]
[Image: Mets 6g.png]