PDF to PDF/A using Ghostscript

Latest revision as of 19:31, 23 November 2010


== File 1 (primarily text) ==

*File used was ''A checklist for documenting PREMIS-METS decisions in a PREMIS profile'', May 2010, Sally Vermaaten, OCLC, http://www.loc.gov/standards/premis/premis_mets_checklist.pdf
*Normalized with Ghostscript 8.71 using the following command: <code>gs -dPDFA -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=premis_mets_checklist_PDFA.pdf premis_mets_checklist.pdf</code>
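
The command above can be wrapped in a small shell function so the same flags are applied to each test file. This is only a minimal sketch and was not part of the original test: the function names <code>pdfa_name</code> and <code>to_pdfa</code> are hypothetical, it assumes <code>gs</code> is on the PATH and that input names end in <code>.pdf</code>, and it uses only the flags already listed on this page.

```shell
#!/bin/sh
# Hypothetical helper: derive the output name used on this page,
# e.g. FRPEnForm.pdf -> FRPEnForm_PDFA.pdf
pdfa_name() {
    printf '%s_PDFA.pdf\n' "${1%.pdf}"
}

# Hypothetical helper: normalize one PDF with the flags from the command above.
# Assumes the gs executable is installed and on the PATH.
to_pdfa() {
    gs -dPDFA -dBATCH -dNOPAUSE -sDEVICE=pdfwrite \
       -sOutputFile="$(pdfa_name "$1")" "$1"
}
```

After sourcing these definitions, <code>to_pdfa premis_mets_checklist.pdf</code> would run the same command as shown above and write <code>premis_mets_checklist_PDFA.pdf</code> to the current directory.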



{| border="1" cellpadding="10" cellspacing="0" width=90%
|- style="background-color:#cccccc;"
!style="width:20%"|'''Property'''
!style="width:20%"|'''Original'''
!style="width:20%"|'''Normalized'''
|-
|PDF version
|PDF 1.5
|PDF/A-1b
|-
|File size
|318,500 bytes
|974,071 bytes
|-
|PageCount
|14
|14
|-
|Fonts
|
*10 embedded subsets
*2 non-embedded subsets: Arial and Arial Italic
|
*10 embedded subsets
*Substituted Helvetica for Arial
*Substituted Helvetica Oblique for Arial Italic
|-
|Features
|
*Annotations: yes
*Forms: no
*Outline: no
|
*Annotations: yes
*Forms: no
*Outline: no
|}


== File 2 (text, graphics, colours, images) ==

*File used was IFPI Digital Music Report 2010, http://www.ifpi.org/content/library/DMR2010.pdf



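{| border="1" cellpadding="10" cellspacing="0" width=90%
|- style="background-color:#cccccc;"
!style="width:20%"|'''Property'''
!style="width:20%"|'''Original'''
!style="width:20%"|'''Normalized'''
|-
|PDF version
|PDF 1.4
|PDF/A-1b
|-
|File size
|1,713,072 bytes
|5,337,321 bytes
|-
|PageCount
|32
|32
|-
|Fonts
|7 embedded subsets
|7 embedded subsets
|-
|Features
|
*Annotations: no
*Forms: yes
*Outline: no
|
*Annotations: no
*Forms: no
*Outline: no
|}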


== File 3 (fillable form) ==

*File used was ''Understanding Canada: Canadian Studies Application, Faculty Research Program'', http://www.iccs-ciec.ca/pages/z_pdfs/FEP_FRP/FRPEnForm.pdf
*Normalized with Ghostscript 8.71 using the following command: <code>gs -dPDFA -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=FRPEnForm_PDFA.pdf FRPEnForm.pdf</code>
*Note that the version normalized to PDF/A is no longer usable as a fillable form



{| border="1" cellpadding="10" cellspacing="0" width=90%
|- style="background-color:#cccccc;"
!style="width:20%"|'''Property'''
!style="width:20%"|'''Original'''
!style="width:20%"|'''Normalized'''
|-
|PDF version
|PDF 1.6
|PDF/A-1b
|-
|File size
|153,196 bytes
|61,175 bytes
|-
|PageCount
|5
|5
|-
|Fonts
|5 non-embedded fonts
|
*5 embedded subsets
*Substituted font Helvetica-Bold for Arial, Bold
*Substituted font Helvetica for Arial
*Substituted font Helvetica-Oblique for Arial, Italic
|-
|Features
|
*Annotations: yes
*Forms: yes
*Outline: yes
|
*Annotations: no
*Forms: no
*Outline: yes
|}
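
The file-size rows in the three tables can be compared as a ratio of normalized size to original size. The following is a small sketch, assuming a POSIX shell with <code>awk</code> available; <code>size_ratio</code> is a hypothetical helper name, and the byte counts are copied from the tables above.

```shell
#!/bin/sh
# Hypothetical helper: print normalized size / original size to two decimals.
size_ratio() {
    awk -v orig="$1" -v norm="$2" 'BEGIN { printf "%.2f\n", norm / orig }'
}

size_ratio 318500 974071     # File 1: prints 3.06
size_ratio 1713072 5337321   # File 2: prints 3.12
size_ratio 153196 61175      # File 3: prints 0.40
```

By this measure, Files 1 and 2 roughly tripled in size after normalization, while File 3 shrank to about 40% of its original size.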