Difference between revisions of "PDF to PDF/A using Ghostscript"

From Archivematica
Latest revision as of 18:31, 23 November 2010


File 1 (primarily text)

  • File used: "A checklist for documenting PREMIS-METS decisions in a PREMIS profile", May 2010, Sally Vermaaten, OCLC, http://www.loc.gov/standards/premis/premis_mets_checklist.pdf
  • Normalized with Ghostscript 8.71 using the following command: gs -dPDFA -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=premis_mets_checklist_PDFA.pdf premis_mets_checklist.pdf
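
The command above can be wrapped in a small shell function so that the same flags are applied to any input file. This is only a sketch: the function names pdfa_name and normalize_to_pdfa are hypothetical and do not come from the original page, and the output naming simply follows the <name>_PDFA.pdf pattern used in the commands here. It assumes gs is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: derive the output name used on this page,
# e.g. premis_mets_checklist.pdf -> premis_mets_checklist_PDFA.pdf
pdfa_name() {
    printf '%s_PDFA.pdf\n' "${1%.pdf}"
}

# Hypothetical wrapper around the gs command quoted above.
#   -dPDFA             ask pdfwrite to produce PDF/A output
#   -dBATCH            exit when the input has been processed
#   -dNOPAUSE          do not prompt between pages
#   -sDEVICE=pdfwrite  use the PDF output device
#   -sOutputFile=...   path of the normalized file
normalize_to_pdfa() {
    input=$1
    output=$(pdfa_name "$input")
    gs -dPDFA \
       -dBATCH \
       -dNOPAUSE \
       -sDEVICE=pdfwrite \
       -sOutputFile="$output" \
       "$input"
}
```

Calling normalize_to_pdfa premis_mets_checklist.pdf would then run the same command as the one listed for this file.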



Property     Original                                          Normalized
PDF version  PDF 1.5                                           PDF/A-1b
File size    318,500 bytes                                     974,071 bytes
Page count   14                                                14
Fonts        • 10 embedded subsets                             • 10 embedded subsets
             • 2 non-embedded subsets: Arial and Arial Italic  • Substituted Helvetica for Arial
                                                               • Substituted Helvetica Oblique for Arial Italic
Features     • Annotations: yes                                • Annotations: yes
             • Forms: no                                       • Forms: no
             • Outline: no                                     • Outline: no


File 2 (text, graphics, colours, images)



Property     Original              Normalized
PDF version  PDF 1.4               PDF/A-1b
File size    1,713,072 bytes       5,337,321 bytes
Page count   32                    32
Fonts        7 embedded subsets    7 embedded subsets
Features     • Annotations: no     • Annotations: no
             • Forms: yes          • Forms: no
             • Outline: no         • Outline: no


File 3 (fillable form)

  • File used: "Understanding Canada: Canadian Studies Application, Faculty Research Program", http://www.iccs-ciec.ca/pages/z_pdfs/FEP_FRP/FRPEnForm.pdf
  • Normalized with Ghostscript 8.71 using the following command: gs -dPDFA -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=FRPEnForm_PDFA.pdf FRPEnForm.pdf
  • Note that the version normalized to PDF/A is no longer usable as a fillable form.



Property     Original                Normalized
PDF version  PDF 1.6                 PDF/A-1b
File size    153,196 bytes           61,175 bytes
Page count   5                       5
Fonts        5 non-embedded fonts    • 5 embedded subsets
                                     • Substituted font Helvetica-Bold for Arial, Bold
                                     • Substituted font Helvetica for Arial
                                     • Substituted font Helvetica-Oblique for Arial, Italic
Features     • Annotations: yes      • Annotations: no
             • Forms: yes            • Forms: no
             • Outline: yes          • Outline: yes
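

The file sizes reported for the three test files can be compared as a ratio of normalized size to original size. The sketch below does that arithmetic with awk. The helper name size_ratio is hypothetical, and the byte counts are copied from the File size rows on this page.

```shell
#!/bin/sh
# Hypothetical helper: size_ratio ORIGINAL_BYTES NORMALIZED_BYTES
# Prints normalized/original, rounded to two decimal places.
size_ratio() {
    awk -v o="$1" -v n="$2" 'BEGIN { printf "%.2f\n", n / o }'
}

# Byte counts as reported in the File size rows.
echo "File 1: $(size_ratio 318500 974071)"    # 3.06
echo "File 2: $(size_ratio 1713072 5337321)"  # 3.12
echo "File 3: $(size_ratio 153196 61175)"     # 0.40
```

On these figures, Files 1 and 2 roughly tripled in size after normalization, while File 3 (the form) shrank to about 40% of its original size.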