Significant characteristics of word processing files

*"[T]he essential characteristics of a word processing document may include the textual content; formatting such as bolded text, font type and size; layout; bulleting; colour and embedded graphics." [http://www.naa.gov.au/Images/An-approach-Green-Paper_tcm2-888.pdf An Approach to the Preservation of Digital Records, National Archives of Australia, 2002]

Revision as of 18:13, 13 February 2013


{| class="wikitable"
! Semantic unit !! Description !! Obligation !! Characteristic !! Note
|-
| PageCount || Total number of pages in the document || Mandatory || Structure ||
|-
| WordCount || Total number of words in the document || Optional || Structure || This element is included in this schema because it can be valuable for evaluating the completeness of the content after transformations. Caution must be used with this element, however, because tools and applications that can determine the number of words in a document do not always use the same algorithm for determining this value.
|-
| CharacterCount || Total number of characters in the document || Optional || Structure || See note for WordCount, above
|-
| ParagraphCount || Total number of paragraphs in the document || Optional || Structure || See note for WordCount, above
|-
| LineCount || Total number of lines in the document || Optional || Structure || See note for WordCount, above
|-
| TableCount || Total number of tables in the document || Optional || Structure || See note for WordCount, above
|-
| GraphicsCount || Total number of graphics in the document || Optional || Structure || See note for WordCount, above
|-
| Language || A language identifier specifying the natural language used in the document || Optional || Content ||
|-
| Fonts (FontName, isEmbedded) || A list of fonts used in the document; an indication of whether or not a font is embedded in a document || Mandatory || Content, Appearance || This element allows a repository to store the names of all fonts used in a document. Some repositories may choose to store only the non-embedded fonts. It is recommended that repositories record at least the non-embedded fonts to assist in identifying the documents with potential long-term preservation risks.
|-
| Features
| Additional document features as follows: hasLayers, hasTransparency, hasOutline, hasForms, hasAnnotations
| Optional
|
* hasLayers: appearance
* hasTransparency: appearance
* hasOutline: behaviour, appearance
* hasForms: content
* hasAnnotations: content
|
|}
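
The semantic units above could be held in a simple record, sketched here in Python. This is only an illustration under stated assumptions: the class names, field names, and helper functions are hypothetical and are not part of Archivematica or of any published schema. The sketch shows two points made in the notes: listing the non-embedded fonts that may signal a long-term preservation risk, and how two plausible word-counting rules can give different values for the same text.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Font:
    """One entry of the Fonts semantic unit (hypothetical field names)."""
    font_name: str     # FontName
    is_embedded: bool  # isEmbedded


@dataclass
class WordProcessingCharacteristics:
    """Hypothetical record for the semantic units in the table above."""
    page_count: int                        # PageCount (mandatory)
    fonts: List[Font]                      # Fonts (mandatory)
    word_count: Optional[int] = None       # WordCount (optional)
    character_count: Optional[int] = None  # CharacterCount (optional)
    paragraph_count: Optional[int] = None  # ParagraphCount (optional)
    language: Optional[str] = None         # Language (optional)

    def non_embedded_fonts(self) -> List[str]:
        """Names of fonts that are not embedded in the document."""
        return [f.font_name for f in self.fonts if not f.is_embedded]


def count_words_whitespace(text: str) -> int:
    """Count words as runs of non-whitespace characters."""
    return len(text.split())


def count_words_alnum(text: str) -> int:
    """Count words as runs of ASCII letters and digits (hyphens split words)."""
    return len(re.findall(r"[A-Za-z0-9]+", text))


record = WordProcessingCharacteristics(
    page_count=3,
    fonts=[Font("Arial", False), Font("Gentium", True)],
    language="en",
)
sample = "A well-known e-mail address"
print(record.non_embedded_fonts())     # ['Arial']
print(count_words_whitespace(sample))  # 4
print(count_words_alnum(sample))       # 6
```

Because the two counting rules disagree on hyphenated words, a WordCount recorded before a transformation is best compared only against a count produced by the same tool afterwards.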