Format policies
Revision as of 12:49, 9 March 2010
Migration and emulation
Archivematica maintains the original format of all ingested files to support migration and emulation preservation strategies.
Normalization
Archivematica's primary preservation strategy is to normalize files to preservation and access formats upon ingest. Access formats are chosen for the ubiquity of viewers for the file format. Archivematica's preservation formats are all open standards; beyond that, the choice of preservation format is based on community best practices, the availability of open-source normalization tools, and an analysis of the significant properties for each media type.
Media type preservation plans
Media type | Preservation format(s) | Access format(s) | Normalization tool | Comments |
---|---|---|---|---|
Audio | LPCM/WAVE | OGG, MP3 | FFmpeg | We may also wish to consider FLAC as a preservation format for audio files. It is less well-known in the archival community but is a fully lossless, openly-specified, non-proprietary audio format. |
Presentation files | Open Document Format; PDF/A | PDF or PDF/A | Xena or OpenOffice Impress | Xena may be preferable, since it appears to produce a more accurate representation of the original. |
Raster images | TIFF, JPEG2000 or PNG | PNG | ImageMagick | Note: does not include raw camera files. Since TIFF, JPEG2000 and PNG are all good formats for preservation, we could leave any files in those formats as they are (as long as they are uncompressed or losslessly compressed). However, we could normalize other formats, such as JPEG, GIF and BMP, to one of the preservation formats. |
Raw camera files | DNG | TIFF or PNG | DigiKam DNG Converter | |
Spreadsheets | Open Document Format | | OpenOffice Calc | |
Vector images | SVG | | | |
Video | Motion JPEG2000/MXF or MPEG-2/MXF | OGG, FLV | FFmpeg | Motion JPEG2000 is the emerging preferred standard for video files but it is hard to find a tool for Linux that converts to that codec. MPEG-2 is an accepted standard, however, which is in use by a number of institutions. |
Word processing files | Open Document Format; PDF/A | PDF or PDF/A | OpenOffice Writer | PDF/A normalization of MS Word files is somewhat problematic because best results are achieved from within the native application - i.e. MS Office running in MS Windows. Archivematica does not support either Windows or MS Office since these are proprietary software packages. |
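
The plans in the table can be read as a lookup from media type to target formats and tools. The following is a minimal sketch of that reading in Python, not Archivematica's actual implementation: the names `Plan`, `PLANS`, `needs_preservation_copy` and `audio_to_wave_command` are hypothetical, and the 16-bit sample format in the FFmpeg command is an assumption, since the table specifies only LPCM/WAVE.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Plan:
    """One row of the media type preservation plan table (hypothetical type)."""
    preservation: Tuple[str, ...]
    access: Tuple[str, ...]
    tools: Tuple[str, ...]


# Transcribed from the table; empty tuples mark cells the table leaves blank.
PLANS = {
    "Audio": Plan(("LPCM/WAVE",), ("OGG", "MP3"), ("FFmpeg",)),
    "Presentation files": Plan(("Open Document Format", "PDF/A"),
                               ("PDF", "PDF/A"),
                               ("Xena", "OpenOffice Impress")),
    "Raster images": Plan(("TIFF", "JPEG2000", "PNG"), ("PNG",), ("ImageMagick",)),
    "Raw camera files": Plan(("DNG",), ("TIFF", "PNG"), ("DigiKam DNG Converter",)),
    "Spreadsheets": Plan(("Open Document Format",), (), ("OpenOffice Calc",)),
    "Vector images": Plan(("SVG",), (), ()),
    "Video": Plan(("Motion JPEG2000/MXF", "MPEG-2/MXF"), ("OGG", "FLV"), ("FFmpeg",)),
    "Word processing files": Plan(("Open Document Format", "PDF/A"),
                                  ("PDF", "PDF/A"),
                                  ("OpenOffice Writer",)),
}


def needs_preservation_copy(media_type: str, current_format: str) -> bool:
    """Return True when the file is not already in a listed preservation format.

    This compares format names only. The raster image comment also requires
    files to be uncompressed or losslessly compressed; that check is not
    modelled here.
    """
    listed = {name.upper() for name in PLANS[media_type].preservation}
    return current_format.upper() not in listed


def audio_to_wave_command(source: str, target: str) -> List[str]:
    """Build (but do not run) an FFmpeg command that writes LPCM audio.

    Assumption: 16-bit little-endian PCM; the table does not state a bit depth.
    """
    return ["ffmpeg", "-i", source, "-c:a", "pcm_s16le", target]


if __name__ == "__main__":
    print(needs_preservation_copy("Raster images", "JPEG"))
    print(audio_to_wave_command("interview.mp3", "interview.wav"))
```

Keeping the table as data rather than branching logic means a change of plan, such as adopting FLAC for audio, would touch one entry and leave the lookup code unchanged.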