Difference between revisions of "Word processing files"

From Archivematica
Line 5:
 ==Preservation Format==
-Open Document Format; PDF/A
+Open Document Format

 ==Access Format==
-PDF
+PDF/A

 ==Normalization tool==
-Xena or OpenOffice Writer
+Unoconv and OpenOffice Writer

 ==Comments==
-*Normalization to Portable Document Format/Archival (PDF/A) may be an acceptable preservation strategy in addition to normalization to ODF using Xena.
+*Unoconv is used as a command-line tool to open OpenOffice, which performs conversions to both ODF and PDF/A
+*The files are converted to ODT, which is the OpenOffice extension for Open Document Format
+*OOXML files are left in their original format
+*Normalization to Portable Document Format/Archival (PDF/A) may be an acceptable preservation strategy in addition to normalization to ODF using unoconv and OpenOffice.
 **For more information on the PDF/A format see [http://www.digitalpreservation.gov/formats/fdd/fdd000125.shtml Library of Congress Sustainability of Digital Formats: PDF/A-1].
-**A list of tools used to convert files to PDF format is available at [http://www.cogniview.com/convert-pdf-to-excel/post/pdf-editing-creation-50-open-sourcefree-alternatives-to-adobe-acrobat/ Codswallop technology + productivity blog].
 **PDF/A normalization of MS Word files is somewhat problematic because best results are achieved from within the native application - i.e. MS Office running in MS Windows. Archivematica does not support either Windows or MS Office since these are proprietary software packages.
-*For ODF conversion, Xena may be preferable since it also normalizes embedded objects such as image files and spreadsheets.

 __NOTOC__

Revision as of 14:29, 7 May 2010



Significant characteristics of word processing files

Preservation Format

Open Document Format

Access Format

PDF/A

Normalization tool

Unoconv and OpenOffice Writer

Comments

  • Unoconv is used as a command-line tool to open OpenOffice, which performs conversions to both ODF and PDF/A
  • The files are converted to ODT, which is the OpenOffice extension for Open Document Format
  • OOXML files are left in their original format
  • Normalization to Portable Document Format/Archival (PDF/A) may be an acceptable preservation strategy in addition to normalization to ODF using unoconv and OpenOffice.
    • For more information on the PDF/A format see Library of Congress Sustainability of Digital Formats: PDF/A-1.
    • PDF/A normalization of MS Word files is somewhat problematic because best results are achieved from within the native application - i.e. MS Office running in MS Windows. Archivematica does not support either Windows or MS Office since these are proprietary software packages.
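
The normalization steps described in the comments above could be sketched roughly as follows. This is only an illustration, not Archivematica's actual normalization code: the function names and the list of OOXML extensions are hypothetical, and the `-e SelectPdfVersion=1` export option for requesting PDF/A-1 is an assumption about the installed unoconv and OpenOffice versions, so it should be checked against `unoconv --help` before use.

```python
import subprocess
from pathlib import Path

# Assumption: extensions treated as OOXML, which are left in their original format.
OOXML_EXTENSIONS = {".docx", ".docm", ".dotx"}


def build_unoconv_commands(source):
    """Return the unoconv command lines for one file, or [] for OOXML files."""
    path = Path(source)
    if path.suffix.lower() in OOXML_EXTENSIONS:
        return []  # OOXML files are not normalized
    return [
        # Preservation copy: Open Document Format text (.odt)
        ["unoconv", "-f", "odt", str(path)],
        # Access copy: PDF/A (the export option is an assumption, see lead-in)
        ["unoconv", "-f", "pdf", "-e", "SelectPdfVersion=1", str(path)],
    ]


def normalize(source):
    """Run each conversion; requires unoconv and OpenOffice to be installed."""
    for command in build_unoconv_commands(source):
        subprocess.run(command, check=True)
```

Building the command lists separately from running them makes it possible to inspect what would be executed, for example `build_unoconv_commands("letter.doc")`, without having unoconv installed.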