Word processing files
Latest revision as of 13:37, 26 November 2013
Significant characteristics of word processing files
Preservation Format
- WordPerfect (WPD): normalize to Open Document Format
- DOC: keep in original format (starting in Archivematica 0.8)
- RTF: keep as RTF
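The preservation rules above can be read as a small lookup table. The following is a minimal sketch of that reading, not Archivematica's actual implementation; the names `PRESERVATION_POLICY` and `preservation_target` are hypothetical and exist only for illustration.

```python
from pathlib import Path

# Hypothetical policy table mirroring the rules listed above.
# A value of None means "keep the file in its original format".
PRESERVATION_POLICY = {
    ".wpd": "odt",  # WordPerfect: normalize to Open Document Format
    ".doc": None,   # DOC: keep original (starting in Archivematica 0.8)
    ".rtf": None,   # RTF: keep as RTF
}


def preservation_target(filename):
    """Return the target extension for preservation, or None to keep the original.

    Raises KeyError for extensions this sketch does not cover.
    """
    suffix = Path(filename).suffix.lower()
    if suffix not in PRESERVATION_POLICY:
        raise KeyError(f"no preservation rule for {suffix!r}")
    return PRESERVATION_POLICY[suffix]
```

Matching on the lower-cased extension keeps the sketch short; a real workflow would more likely rely on format identification than on file names.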
Access Format
PDF/A
Normalization tool
Unoconv/OpenOffice Writer (struck through in the page source)
Tool search in progress
Comments
- Unoconv is used as a command-line tool to drive OpenOffice, which performs the conversions to both ODF and PDF/A
- The files are converted to ODT, the Open Document Format file type for text documents (extension .odt)
- OOXML and DOC files are left in their original format owing to their ubiquity and ongoing support by Microsoft
- Normalization to Portable Document Format/Archival (PDF/A) may be an acceptable preservation strategy in addition to normalization to ODF using unoconv and OpenOffice.
  - For more information on the PDF/A format see Library of Congress Sustainability of Digital Formats: PDF/A-1 (http://www.digitalpreservation.gov/formats/fdd/fdd000125.shtml).
  - PDF/A normalization of MS Word files is somewhat problematic because the best results are achieved from within the native application, i.e. MS Office running on MS Windows. Archivematica does not support either Windows or MS Office since these are proprietary software packages.
- Rich Text Format is still heavily used and can be opened in a large number of software programs. Although it is proprietary, it has a published, freely available format specification. In contrast, we'll continue to normalize WordPerfect files to .odt because there is no published specification for the WordPerfect format.
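As a rough illustration of the unoconv step described in the comments, the sketch below only builds the command line; it does not run it. It assumes unoconv's `-f` (output format) option and its `-e` (export filter option) flag, and it assumes that `SelectPdfVersion=1` asks the OpenOffice PDF export filter for PDF/A-1. None of this is confirmed by this page, and the helper name `unoconv_command` is hypothetical.

```python
def unoconv_command(source, target_format, pdfa=False):
    """Build an unoconv argument list for one conversion (sketch only).

    Assumptions: "-f" selects the output format, and
    "-e SelectPdfVersion=1" requests PDF/A-1 from the PDF export filter.
    """
    cmd = ["unoconv", "-f", target_format]
    if pdfa:
        if target_format != "pdf":
            raise ValueError("PDF/A output requires target_format='pdf'")
        cmd += ["-e", "SelectPdfVersion=1"]
    cmd.append(source)
    return cmd


# Preservation copy of a WordPerfect file, then a PDF/A access copy.
preservation_cmd = unoconv_command("report.wpd", "odt")
access_cmd = unoconv_command("report.wpd", "pdf", pdfa=True)
```

The resulting lists could be passed to `subprocess.run(cmd, check=True)` on a machine where unoconv and OpenOffice are installed.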