Difference between revisions of "Format policies"

From Archivematica
Line 1: Line 1:
 
[[Main Page]] > [[Documentation]] > Format-specific preservation issues
 
[[Main Page]] > [[Documentation]] > Format-specific preservation issues
 +
 +
 +
 +
Normalization is one preservation strategy that can be implemented using Archivematica.
 +
 +
Archivematica uses the National Archives of Australia's Xena (XML Electronic Normalising for Archives) software to convert certain file formats to XML-based formats on ingest. A full list of the file types Xena can normalize is available at [http://xena.sourceforge.net/help.php?page=normformats.html http://xena.sourceforge.net/help.php?page=normformats.html].
 +
 +
However, some formats cannot be normalized by Xena. Also, it may be desirable to adopt format-specific preservation plans for certain types of files based on institutional requirements and preferences.
 +
 +
Follow the links below for further discussion of these issues for each file format.
 +
  
  
Line 6: Line 17:
 
|- style="background-color:#cccccc;"
 
|- style="background-color:#cccccc;"
 
!style="width:20%"|'''Format'''
 
!style="width:20%"|'''Format'''
!style="width:35%"|'''Preservation action plan'''
+
!style="width:35%"|'''File extension(s)'''
!style="width:45%"|'''Issues and links'''
+
!style="width:45%"|'''Xena support'''
 
|-
 
|-
|DOC (Microsoft Office Word)
 
 
|
 
|
*Normalization to OpenDocument Format (ODF) using [http://xena.sourceforge.net/ Xena]
+
*[[Microsoft Office Word]]
*Possible normalization to Portable Document Format/Archival (PDF/A).
+
* a.k.a. MS-WORD
 +
|.doc
 +
|supported
 +
|-
 
|
 
|
*See [http://www.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=detailReport&id=690 PRONOM: Microsoft Word for Windows Document] for information about the format.
+
*[[Nikon Electronic Format]]
*For more information on the ODF format see [http://www.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=detailReport&id=778 PRONOM: OpenDocument Format].
+
* a.k.a. Nikon Image Format
*For more information on the PDF/A format see [http://www.digitalpreservation.gov/formats/fdd/fdd000125.shtml Library of Congress Sustainability of Digital Formats: PDF/A-1].
+
|.nef
*A list of tools used to convert files to PDF format is available at [http://www.cogniview.com/convert-pdf-to-excel/post/pdf-editing-creation-50-open-sourcefree-alternatives-to-adobe-acrobat/ Codswallop technology + productivity blog].
+
|not supported
 
|-
 
|-
|NEF (Nikon Electronic Format, also known as Nikon Image Format)
 
|Possible normalization to Adobe Digital Negative (DNG) format
 
 
|
 
|
*NEF is a raw (i.e. unprocessed) image format produced by Nikon cameras.
+
*[[Tagged Image File Format]]
*Xena incorrectly guesses the file type as TIFF and normalizes it as it would a TIFF, with the result that a very large file (e.g. 70 MB) is normalized to a very small one (e.g. 250 KB) with considerable loss of data. Xena also removes embedded data added during image editing, converting the data to a separate XMP file. This means that any edits added by the creator are removed.
+
* a.k.a. TIFF
*NEF is a proprietary format with an unpublished specification.
 
*NEF files can be very large - 70 MB or more.
 
*PRONOM provides only outline information for NEF files: see [http://www.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=detailReport&id=927 PRONOM: Nikon Digital SLR Camera Raw Image File].
 
*Much more detailed information on NEF files is available at [http://www.digitalpreservation.gov/formats/fdd/fdd000241.shtml Library of Congress Sustainability of Digital Formats: Camera Raw Formats].
 
*Library of Congress states that [http://www.digitalpreservation.gov/formats/fdd/fdd000188.shtml Adobe Digital Negative (DNG), Version 1.1] may be emerging as a standard preservation format for raw camera images. Adobe provides a free (but not open-source) [http://www.adobe.com/products/dng/ Adobe DNG Converter] designed for downloading to Windows or Mac computers.
 
|-
 
|TIF or TIFF (Tagged Image File Format)
 
|Normalization using [http://xena.sourceforge.net/ Xena] to Portable Network Graphics (PNG) format with embedded metadata stored in XML.
 
 
|
 
|
*Normalization of TIF 6.0 files may not be necessary, since it is considered to be a stable, preservation-friendly format.  
+
*.tiff
*For more information on TIF files see [http://www.digitalpreservation.gov/formats/fdd/fdd000022.shtml Library of Congress Sustainability of Digital Formats: TIFF, Revision 6.0], [http://www.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=detailReport&id=612 PRONOM: Tagged Image File Format 6] and [http://www.fcla.edu/digitalArchive/formatInfo.htm Florida Digital Archive TIFF 6.0 Action Plan Background Report].
+
*.tif
*For more information on PNG files see [http://www.digitalpreservation.gov/formats/fdd/fdd000153.shtml Library of Congress Sustainability of Digital Formats: PNG, Portable Network Graphics].
+
|supported
 
|-
 
|-
 
|}<br />
 
|}<br />

Revision as of 17:26, 7 December 2009



Normalization is one preservation strategy that can be implemented using Archivematica.

Archivematica uses the National Archives of Australia's Xena (XML Electronic Normalising for Archives) software to convert certain file formats to XML-based formats on ingest. A full list of the file types Xena can normalize is available at http://xena.sourceforge.net/help.php?page=normformats.html.

However, some formats cannot be normalized by Xena. Also, it may be desirable to adopt format-specific preservation plans for certain types of files based on institutional requirements and preferences.

Follow the links below for further discussion of these issues for each file format.


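Format                                                 File extension(s)    Xena support
Microsoft Office Word (a.k.a. MS-WORD)                 .doc                 supported
Nikon Electronic Format (a.k.a. Nikon Image Format)    .nef                 not supported
Tagged Image File Format (a.k.a. TIFF)                 .tiff, .tif          supported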
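
The table above can be read as a simple lookup from file extension to Xena support. The following is a minimal Python sketch of that idea, assuming only the three formats listed on this page. The names XENA_SUPPORT and xena_support are hypothetical and are not part of Archivematica or Xena. The sketch also keys on the file extension alone, which is a simplification and says nothing about how either tool actually identifies formats.

```python
from pathlib import Path

# Hypothetical lookup transcribed from the table on this page.
# It is an illustration only, not Archivematica's or Xena's data model.
XENA_SUPPORT = {
    ".doc": True,    # Microsoft Office Word
    ".nef": False,   # Nikon Electronic Format
    ".tif": True,    # Tagged Image File Format
    ".tiff": True,   # Tagged Image File Format
}


def xena_support(filename):
    """Return 'supported', 'not supported', or 'unknown' for a file name.

    The decision is based on the file extension only (case-insensitive).
    """
    ext = Path(filename).suffix.lower()
    if ext not in XENA_SUPPORT:
        return "unknown"
    return "supported" if XENA_SUPPORT[ext] else "not supported"


if __name__ == "__main__":
    for name in ("report.DOC", "photo.nef", "scan.tiff", "notes.txt"):
        print(name, "->", xena_support(name))
```

Extensions that do not appear in the table are reported as "unknown" rather than guessed, so a format with no entry is flagged for a separate preservation decision.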