DOC to PDF/A-1a using OpenOffice

From Archivematica

  • The test file was MSWord_test_document.doc. The normalized version is available as MSWord_test_document.pdf.
  • OpenOffice version used was 3.1.1.
  • The converted file was verified as fully conforming to the PDF/A-1a standard: it was opened in Adobe Acrobat Pro 9 and analyzed with the Preflight tool.
  • Note several problems:
    • JPEG image compression had to be used; without it, the file size of the normalized version was over 20 MB because the document contains a photographic image.
    • PDF/A-1 does not allow transparency. One of the images in the original includes transparency, which was converted to black during normalization, dramatically changing the look of the image.
    • The indentation of Heading 1 text changed wherever a heading ran to more than one line, because OpenOffice indents headings and MS Word does not.
    • The table of contents was rendered in plain black text in MS Word but as underlined blue text in the normalized version because of display differences between OpenOffice and MS Word.
  • Some of the problems listed arise because the document is not being normalized from within its native application, MS Word. When the document is normalized from within MS Word (using Adobe PDFMaker), the result is closer in appearance to the original document. See File:MSWord test document 1.pdf to compare that version with the one produced by OpenOffice Writer.
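
The page does not record how the export itself was run. As a rough sketch only, the snippet below builds a headless command line for a comparable conversion. It assumes a recent LibreOffice release (the JSON filter-option syntax is not available in OpenOffice 3.1.1), an executable named `soffice` on the PATH, and the PDF export option names `SelectPdfVersion`, `UseTaggedPDF`, `UseLosslessCompression` and `Quality`. The function names are hypothetical, and any output would still need to be checked with a PDF/A validator.

```python
import json
import shlex


def pdfa_filter_options(jpeg_quality=70):
    """Return assumed PDF export options for a PDF/A-1 export with lossy JPEG images."""
    return {
        # Assumption: 1 selects PDF/A-1 in the PDF export filter.
        "SelectPdfVersion": {"type": "long", "value": "1"},
        # Tagged structure, which level "a" conformance requires.
        "UseTaggedPDF": {"type": "boolean", "value": "true"},
        # Use JPEG rather than lossless image compression.
        "UseLosslessCompression": {"type": "boolean", "value": "false"},
        # JPEG quality setting; 70 mirrors the 70% figure in the note on this page.
        "Quality": {"type": "long", "value": str(jpeg_quality)},
    }


def build_convert_command(source, outdir=".", jpeg_quality=70, binary="soffice"):
    """Build (but do not run) the argument list for a headless conversion."""
    options = json.dumps(pdfa_filter_options(jpeg_quality), separators=(",", ":"))
    return [
        binary,
        "--headless",
        "--convert-to", "pdf:writer_pdf_Export:" + options,
        "--outdir", outdir,
        source,
    ]


if __name__ == "__main__":
    # Print a shell-quoted command for inspection; nothing is executed here.
    print(shlex.join(build_convert_command("MSWord_test_document.doc")))
```

The command is only assembled, not executed, so it can be inspected or logged before being passed to `subprocess.run`. A scripted export of this kind would not avoid the transparency, heading and table-of-contents differences listed above, since those come from rendering the document outside MS Word.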


Property | Original | Normalized
File size | 1,460,736 bytes | 1,445,354 bytes (see note)
PageCount | 7 | 7
WordCount | 1320
CharacterCount | 6630; 7824 with spaces
ParagraphCount | 126
LineCount | 200
TableCount
GraphicsCount | HasPictures true
Language | U.S. English | en-CA
Fonts | Times New Roman, Arial, Verdana, Comic Sans MS and variants (e.g. TimesNewRomanPS-BoldMT)
Features | hasOutline yes; hasAnnotations yes; hasForms no; isTagged yes/no conflict

Note: 70% JPEG image compression was used; otherwise the file size was over 20 MB.
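
To put the two measured file sizes in proportion, the short calculation below works out the relative change from the original to the normalized file. It is only an illustration using the byte counts from the table; the function name is hypothetical.

```python
ORIGINAL_BYTES = 1_460_736    # original .doc, from the table
NORMALIZED_BYTES = 1_445_354  # normalized PDF/A with JPEG compression, from the table


def percent_change(before, after):
    """Relative size change in percent; negative means the file got smaller."""
    return (after - before) / before * 100.0


change = percent_change(ORIGINAL_BYTES, NORMALIZED_BYTES)
direction = "smaller" if change < 0 else "larger"
print(f"Normalized file is {abs(change):.2f}% {direction} than the original")
```

With these figures the normalized file comes out roughly 1% smaller than the original.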
