Difference between revisions of "Format policies"

Revision as of 11:10, 31 August 2015

Format Policy Registry (FPR)

Archivematica manages format policies both locally and externally through a Format Policy Registry (FPR). The external registry runs on a server hosted by Artefactual and holds our default policies for normalization, extraction and format identification. The local FPR, available in the Preservation planning tab of the user dashboard, can be customized by the local user. To learn about the FPR, please see Format Policy Registry. To read about some of the broader goals of the FPR, see FPR Requirements.

Migration and emulation

Archivematica maintains the original format of all ingested files to support migration and emulation preservation strategies.

Normalization

Archivematica's primary preservation strategy is to normalize files to preservation and access formats upon ingest. Archivematica's preservation formats are all open standards. Additionally, the choice of preservation format is based on community best practices, availability of open-source normalization tools, and an analysis of the significant characteristics for each media type. The choice of access formats is based on the ubiquity of viewers for the file format.

Follow the link for each file format for further information about the open-source normalization tools and settings that have been tested and integrated into Archivematica to make the format conversions.

Format policies

  • Format policies indicate which tool to run when normalizing for a given purpose (access or preservation) once a specific file identification tool has identified a specific file format. They are analogous to virus definitions, which must be updated periodically in an Archivematica installation to keep the virus-scanning micro-service effective, and to the software security updates downloaded at the operating system level to keep the host machine secure.

| Media type | File formats | Preservation format(s) | Access format(s) | Normalization tool |
|---|---|---|---|---|
| Audio | AC3, AIFF, MP3, WAV, WMA | WAVE (LPCM) | MP3 | FFmpeg |
| Email | PST | MBOX | MBOX | readpst |
| Email | Maildir** | Original format | MBOX | md2mb.py |
| Office Open XML | DOCX, PPTX, XLSX | Original format | PDF for PPTX | Tool search in progress |
| Plain text | TXT | Original format | Original format | None |
| Portable Document Format | PDF | PDF/A | Original format | Ghostscript |
| Presentation files | PPT | Original format | PDF | Tool search in progress |
| Raster images | BMP, GIF, JPG, JP2*, PCT, PNG*, PSD, TIFF, TGA | Uncompressed TIFF | JPEG | ImageMagick |
| Raw camera files/Digital Negative format** | 3FR, ARW, CR2, CRW, DCR, DNG, ERF, KDC, MRW, NEF, ORF, PEF, RAF, RAW, X3F | Original format | JPEG | ImageMagick/UFRaw |
| Spreadsheets | XLS | Original format | Original format | None |
| Vector images | AI, EPS, SVG | SVG | PDF | Inkscape |
| Video | AVI, FLV, MOV, MPEG-1, MPEG-2, MPEG-4, SWF, WMV | FFV1/LPCM in MKV | MP4 | FFmpeg |
| Word processing files | DOC, WPD, RTF | Original format | PDF | Tool search in progress |

  • (*) PNG and JPEG2000 are not normalized to a preservation format
  • (**) in development
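
The policy table can be read as a lookup from an identified file format and a normalization purpose to a target format. The following is a minimal sketch of that idea only, not Archivematica's actual FPR data model: the names `Policy`, `POLICIES` and `target_format` are hypothetical, only a few rows are transcribed, and keying on a file extension is a simplification of what a file identification tool would report.

```python
from typing import NamedTuple, Optional

ORIGINAL = "Original format"  # the table's marker for "keep the file as ingested"


class Policy(NamedTuple):
    """One row of the policy table (hypothetical structure)."""
    preservation: str
    access: str
    tool: Optional[str]  # None where the table lists no normalization tool


# A few rows transcribed from the table; keys are upper-case format names.
POLICIES = {
    "WAV": Policy("WAVE (LPCM)", "MP3", "FFmpeg"),
    "PST": Policy("MBOX", "MBOX", "readpst"),
    "PDF": Policy("PDF/A", ORIGINAL, "Ghostscript"),
    "TXT": Policy(ORIGINAL, ORIGINAL, None),
    "SVG": Policy("SVG", "PDF", "Inkscape"),
    "AVI": Policy("FFV1/LPCM in MKV", "MP4", "FFmpeg"),
}


def target_format(file_format: str, purpose: str) -> Optional[str]:
    """Return the target format for a purpose, or None if no policy exists."""
    if purpose not in ("preservation", "access"):
        raise ValueError(f"unknown purpose: {purpose!r}")
    # Normalize input such as ".pdf" or "pdf" to the key form "PDF".
    policy = POLICIES.get(file_format.upper().lstrip("."))
    if policy is None:
        return None
    return getattr(policy, purpose)


print(target_format("pdf", "preservation"))
print(target_format(".wav", "access"))
```

Returning `None` for an unknown format keeps "no policy defined" distinct from "keep the original format", which the table expresses explicitly.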


See also


There is currently no default format policy for websites, but the research and assessment work we have done with our clients may be of interest for developing one.