Difference between revisions of "Dataset preservation"
Revision as of 12:19, 8 January 2013
Workflow
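- Metadata validation: Archivematica should include a micro-service that validates metadata on ingest, using a tool such as xmllint. Sample validation command: xmllint --noout --schema instance.xsd metadata/CCRI-CDN-Census1911V20110628.xml, where instance.xsd stands for a local copy of the DDI 3.1 schema (namespace ddi:instance:3_1). The --schema option of xmllint expects a schema file, not a namespace URI.
- Normalization: Some datasets may require manual normalization; see https://projects.artefactual.com/issues/1499.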
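The validation step above could be wrapped roughly as follows. This is a minimal sketch, not Archivematica's actual micro-service: the function names and the schema path are hypothetical, and it assumes xmllint is installed and that a local copy of the relevant XSD is available.

```python
import subprocess


def build_xmllint_command(schema_path, metadata_path):
    """Return the argument list for validating one metadata file against an XSD."""
    # --noout suppresses the echo of the parsed document;
    # --schema names the XSD file to validate against.
    return ["xmllint", "--noout", "--schema", str(schema_path), str(metadata_path)]


def validate_metadata(schema_path, metadata_path):
    """Run xmllint and return (is_valid, messages).

    Assumes xmllint is on the PATH. A non-zero exit status is treated
    as a validation or parsing failure.
    """
    result = subprocess.run(
        build_xmllint_command(schema_path, metadata_path),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr.strip()
```

A caller would invoke validate_metadata("instance.xsd", "metadata/CCRI-CDN-Census1911V20110628.xml") and record the returned messages when the first value is False.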
Metadata
METS and DDI/FGDC
- DDI is the Data Documentation Initiative, a metadata specification for the social and behavioral sciences; see http://www.ddialliance.org/.
- FGDC is the Federal Geographic Data Committee Metadata Standard [FGDC-STD-001-1998]; see http://www.fgdc.gov/metadata/csdgm/.
- DDI and FGDC are considered descriptive metadata in METS, so they belong in a dmdSec. From http://www.loc.gov/standards/mets/METSOverview.v2.html: "Valid values for the MDTYPE element [in mdSec] include...DDI (Data Documentation Initiative), FGDC (Federal Geographic Data Committee Metadata Standard [FGDC-STD-001-1998])."
- In the Archivematica METS file, a DDI or FGDC file could be referenced from the dmdSec using mdRef, for example: <mdRef LABEL="CCRI-CDN-Census1911V20110628.xml-73b93b28-be1b-433f-861e-03bc321dfe7e" xlink:href="metadata/CCRI-CDN-Census1911V20110628.xml" MDTYPE="DDI" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>.
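As an illustration of the mdRef above, the following sketch builds the element inside a dmdSec (the METS element for descriptive metadata) with Python's standard library. The function name and the dmdSec ID value are hypothetical, and this is not a description of how Archivematica itself generates its METS file.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def build_dmdsec(dmd_id, label, href, mdtype):
    """Return a mets:dmdSec element holding one mets:mdRef to an external file."""
    # Register prefixes so that serialized output uses mets: and xlink:.
    ET.register_namespace("mets", METS_NS)
    ET.register_namespace("xlink", XLINK_NS)

    dmdsec = ET.Element(f"{{{METS_NS}}}dmdSec", {"ID": dmd_id})
    ET.SubElement(
        dmdsec,
        f"{{{METS_NS}}}mdRef",
        {
            "LABEL": label,
            f"{{{XLINK_NS}}}href": href,
            "MDTYPE": mdtype,
            # Location attributes copied from the example in the text.
            "LOCTYPE": "OTHER",
            "OTHERLOCTYPE": "SYSTEM",
        },
    )
    return dmdsec


if __name__ == "__main__":
    section = build_dmdsec(
        "dmdSec_1",  # hypothetical ID
        "CCRI-CDN-Census1911V20110628.xml-73b93b28-be1b-433f-861e-03bc321dfe7e",
        "metadata/CCRI-CDN-Census1911V20110628.xml",
        "DDI",
    )
    print(ET.tostring(section, encoding="unicode"))
```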
METS and other metadata standards
- Other metadata standards that could be used for ingested datasets include:
  - North American Profile (NAP) of ISO 19115, for geospatial metadata: http://www.fgdc.gov/metadata/geospatial-metadata-standards
  - SDMX, for aggregate data: http://sdmx.org/?page_id=10
  - EML, the Ecological Metadata Language: http://knb.ecoinformatics.org/software/eml/eml-2.1.1/index.html
- If these standards are used, the mdRef in the METS file would need to set MDTYPE to OTHER and name the standard in OTHERMDTYPE, for example: <mdRef LABEL="CCRI-CDN-Census1911V20110628.xml-73b93b28-be1b-433f-861e-03bc321dfe7e" xlink:href="metadata/CCRI-CDN-Census1911V20110628.xml" MDTYPE="OTHER" OTHERMDTYPE="SDMX" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>.
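The choice between MDTYPE and OTHERMDTYPE described above can be expressed as a small helper. This is a sketch under a stated assumption: the set of directly supported values is limited here to the two this page names (DDI and FGDC), whereas the METS schema defines a longer list. The function and constant names are hypothetical.

```python
# Assumption: only the MDTYPE values named on this page are listed here.
# The METS schema defines more; extend this set as needed.
NATIVE_MDTYPES = {"DDI", "FGDC"}


def mdtype_attributes(standard):
    """Return the mdRef type attributes for a given metadata standard."""
    if standard in NATIVE_MDTYPES:
        return {"MDTYPE": standard}
    # Any other standard is declared as OTHER and named in OTHERMDTYPE.
    return {"MDTYPE": "OTHER", "OTHERMDTYPE": standard}
```

For example, mdtype_attributes("SDMX") yields the MDTYPE="OTHER" and OTHERMDTYPE="SDMX" pair used in the mdRef above, while mdtype_attributes("DDI") yields only MDTYPE="DDI".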