Difference between revisions of "Word processing files"

From Archivematica

Revision as of 10:23, 27 October 2011



Significant characteristics of word processing files

Preservation Format

  • Open Document Format (for RTF and WPD files)
  • Original format (for DOC files, starting in Archivematica 0.8)

Access Format

PDF/A

Normalization tool

Unoconv/OpenOffice Writer

Comments

  • Unoconv is a command-line tool that drives OpenOffice, which performs the conversions to both ODF and PDF/A
  • Files normalized to ODF are saved as ODT, the Open Document Format extension for text documents used by OpenOffice Writer
  • OOXML and DOC files are left in their original format owing to their ubiquity and ongoing support by Microsoft
  • Normalization to Portable Document Format/Archival (PDF/A) may be an acceptable preservation strategy in addition to normalization to ODF using unoconv and OpenOffice.
    • For more information on the PDF/A format see Library of Congress Sustainability of Digital Formats: PDF/A-1.
    • PDF/A normalization of MS Word files is somewhat problematic because the best results are achieved from within the native application, i.e. MS Office running on MS Windows. Archivematica supports neither Windows nor MS Office, since both are proprietary software packages.
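
The plan described on this page can be sketched in Python as a small function that lists the unoconv command lines for a given file. This is an illustration only, not Archivematica's actual normalization code. The function name, the extension sets and the use of file extensions (rather than proper format identification) are assumptions made for the example. The `-f` (output format) and `-e` (export filter option) flags and the `SelectPdfVersion=1` setting for PDF/A-1 are given as recalled from the unoconv documentation and should be checked against the installed version.

```python
from pathlib import Path
from typing import List

# Assumption: policy keyed on file extension, inferred from the lists on this page.
NORMALIZE_TO_ODT = {".rtf", ".wpd"}   # preservation copy is ODT
KEEP_ORIGINAL = {".doc", ".docx"}     # original is kept for preservation


def unoconv_commands(source: str) -> List[List[str]]:
    """Return the unoconv command lines this sketch would run for `source`."""
    suffix = Path(source).suffix.lower()
    if suffix not in NORMALIZE_TO_ODT | KEEP_ORIGINAL:
        return []  # not covered by this word processing plan

    commands = []
    if suffix in NORMALIZE_TO_ODT:
        # Preservation copy: convert to ODT.
        commands.append(["unoconv", "-f", "odt", source])
    # Access copy: PDF, asking the export filter for PDF/A-1 (assumed option).
    commands.append(["unoconv", "-f", "pdf", "-e", "SelectPdfVersion=1", source])
    return commands


if __name__ == "__main__":
    for name in ("report.rtf", "letter.doc"):
        for command in unoconv_commands(name):
            print(" ".join(command))
```

Each returned list could be passed to `subprocess.run` on a machine where unoconv and OpenOffice are installed. The sketch only builds the commands, so it can be inspected without either tool present.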