Significant characteristics of word processing files

<div style="padding: 10px 10px; border: 1px solid black; background-color: #F79086;">This page is no longer being maintained and may contain inaccurate information. Please see the [https://www.archivematica.org/docs/latest/ Archivematica documentation] for up-to-date information.</div>
  
 
 
*"[T]he essential characteristics of a word processing document may include the textual content; formatting such as bolded text, font type and size; layout; bulleting; colour and embedded graphics." [http://www.naa.gov.au/Images/An-approach-Green-Paper_tcm2-888.pdf An Approach to the Preservation of Digital Records, National Archives of Australia, 2002]
*[http://fclaweb.fcla.edu/uploads/Lydia%20Motyka/FDA_documentation/documentMD.pdf Document Metadata: Document Technical Metadata for Digital Preservation, Florida Digital Archive and Harvard University Library, 2009]:  this document suggests technical metadata for textual records which "can be used to verify the result of document transformations, ensuring the properties of the original document are preserved and properly transformed to the new document format." The metadata in this table are adapted from that source:
  
  

Latest revision as of 16:35, 11 February 2020



{| class="wikitable"
! Semantic unit !! Description !! Obligation !! Characteristic !! Note
|-
| PageCount || Total number of pages in the document || Mandatory || Structure ||
|-
| WordCount || Total number of words in the document || Optional || Structure || This element is included in this schema because it can be valuable for evaluating the completeness of the content after transformations. Caution must be used with this element, however, because tools and applications that can determine the number of words in a document do not always use the same algorithm for determining this value.
|-
| CharacterCount || Total number of characters in the document || Optional || Structure || See note for WordCount, above
|-
| ParagraphCount || Total number of paragraphs in the document || Optional || Structure || See note for WordCount, above
|-
| LineCount || Total number of lines in the document || Optional || Structure || See note for WordCount, above
|-
| TableCount || Total number of tables in the document || Optional || Structure || See note for WordCount, above
|-
| GraphicsCount || Total number of graphics in the document || Optional || Structure || See note for WordCount, above
|-
| Language || A language identifier specifying the natural language used in the document || Optional || Content ||
|-
| Fonts (FontName, isEmbedded) || A list of fonts used in the document; an indication of whether or not a font is embedded in a document || Mandatory || Content, Appearance || This element allows a repository to store the names of all fonts used in a document. Some repositories may choose to store only the non-embedded fonts. It is recommended that repositories record at least the non-embedded fonts to assist in identifying the documents with potential long-term preservation risks.
|-
| Features
| Additional document features as follows: hasLayers, hasTransparency, hasOutline, hasForms, hasAnnotations
| Optional
|
* hasLayers: appearance
* hasTransparency: appearance
* hasOutline: behaviour, appearance
* hasForms: content
* hasAnnotations: content
|
|}
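
As a rough illustration of how the semantic units in the table above could be used to check a document transformation, the following Python sketch compares metadata recorded for an original file and for its converted copy. It is a minimal sketch under stated assumptions, not part of Archivematica or of the Florida Digital Archive schema: the function names, the <code>Report</code> class and the dictionary layout are all hypothetical. It assumes the metadata has already been extracted by some other tool. A PageCount mismatch is treated as an error because that unit is mandatory, while mismatches in the optional counts are only reported as warnings, since different tools may count words, lines and so on differently.

```python
from dataclasses import dataclass, field

# Obligation levels follow the table above. All names below are
# hypothetical and exist only for this illustration.
MANDATORY_COUNTS = ("PageCount",)
OPTIONAL_COUNTS = (
    "WordCount", "CharacterCount", "ParagraphCount",
    "LineCount", "TableCount", "GraphicsCount",
)


@dataclass
class Report:
    """Collects hard failures and softer discrepancies separately."""
    errors: list = field(default_factory=list)
    warnings: list = field(default_factory=list)


def compare_metadata(original: dict, converted: dict) -> Report:
    """Compare count-type semantic units of two metadata dictionaries."""
    report = Report()
    for unit in MANDATORY_COUNTS:
        if unit not in original or unit not in converted:
            report.errors.append(f"{unit}: missing mandatory value")
        elif original[unit] != converted[unit]:
            report.errors.append(
                f"{unit}: original={original[unit]}, converted={converted[unit]}"
            )
    for unit in OPTIONAL_COUNTS:
        # Optional units are compared only when both sides recorded them.
        if unit in original and unit in converted:
            if original[unit] != converted[unit]:
                report.warnings.append(
                    f"{unit}: original={original[unit]}, converted={converted[unit]}"
                )
    return report


def non_embedded_fonts(fonts: list) -> list:
    """Return the sorted names of fonts that are not embedded."""
    return sorted(f["FontName"] for f in fonts if not f["isEmbedded"])
```

For example, calling <code>compare_metadata({"PageCount": 3, "WordCount": 1200}, {"PageCount": 3, "WordCount": 1187})</code> would produce no errors and one warning, and <code>non_embedded_fonts</code> lists the fonts that the table recommends recording at a minimum.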