Archivematica - User contributions [en]

Archivematica 1.8 and Storage Service 0.13 release notes

2018-09-28T23:36:42Z

Peter: /* Fixed */

[[Main_Page|Home]] > [[Release_Notes|Release Notes]] > Major release notes template

'''Work in progress'''

==Supported environments==

Link to installation instructions.

Specify supported environments.

Make special note of any changes to the supported environments.

==Added==

Describe new features.

===New feature 1===

This is a description of this amazing feature! Here's why it's a net benefit to the project and the community. Also included are any special notes, like if it's a beta feature.

This work was sponsored by some amazing institution. Thank you!

* Documentation: link
* Pull requests: link

===New feature 2===

Here is a description of this amazing feature! Here's why it's a net benefit to the project and the community, and here is how it will impact your workflow. Also included are any special notes, like if it's a beta feature.

This work was sponsored by some amazing institution. Thank you!

* Documentation: link
* Pull requests: link
* Feature files: link

==Changed==

Describe enhancements or major fixes.

===Enhancement 1===

We fixed this issue. Here's why it's a net benefit to the project and the community, and here is how it will impact your workflow. Also included are any special notes, like if it's a beta feature.

This work was sponsored by some amazing institution. Thank you!

* Documentation: link
* Pull requests: link
* Feature files: link

==Fixed==

List bug fixes, each with a link to its GitHub issue.

* Bugfix 1: Failures on filenames with backticks and other 'silly' characters https://github.com/archivematica/Issues/issues/16
* Bugfix 2: AIP re-ingest fails. https://github.com/archivematica/Issues/issues/42
* Bugfix 3: PREMIS events from previous transfers are re-appearing https://github.com/archivematica/Issues/issues/43
* Bugfix 4: Metadata reingest fails when dc:type is null https://github.com/artefactual/archivematica/issues/1132
* Bugfix 5: Use 7-zip without compression (Copy) mode https://github.com/archivematica/Issues/issues/46
* Bugfix 6: Metadata added before "Approve Transfer" disappears https://github.com/archivematica/Issues/issues/140
* Bugfix 7: Generate AIP METS fails for bag SIPs if bag-info.txt has multiple instances of the same label https://github.com/archivematica/Issues/issues/173

==Upgraded tools and dependencies==

List any tools and dependencies that have been upgraded.

* Tool has been updated to version X.

==End of life dependencies==

List any dependencies that have reached end of life since the last release, as well as a note on the plan going forward.

Make note of any risks, and how users can mitigate them.

Archivematica 1.8 and Storage Service 0.13 release notes

2018-09-28T21:51:46Z

Peter: /* Fixed */


Archivematica 1.8 and Storage Service 0.13 release notes

2018-09-28T18:32:02Z

Peter:


Archivematica 1.8 and Storage Service 0.13 release notes

2018-09-28T17:44:54Z

Peter:


Archivematica 1.8 and Storage Service 0.13 release notes

2018-09-28T17:44:30Z

Peter:


Archivematica 1.8 and Storage Service 0.13 release notes

2018-09-28T17:43:39Z

Peter: /* Fixed */


Archivematica 1.8 and Storage Service 0.13 release notes

2018-09-27T23:50:17Z

Peter: /* Fixed */


Archivematica 1.8 and Storage Service 0.13 release notes

2018-09-27T21:04:49Z

Peter: /* Fixed */


Improvements/AIP Packaging

2018-08-14T19:37:34Z

Peter: /* Use case: Oxford Common File Layout */

== User story ==

As a repository manager, I require flexibility in how AIPs are packaged so they can be stored as one or more physical entities.

== Status ==

Analysis is ongoing.

== Interest ==

If you'd like to get involved in this development, please feel free to contribute to this wiki page or start a discussion on our [https://groups.google.com/forum/#!forum/archivematica user forum].

== Analysis: ==

Currently, Archivematica can only package a single AIP as a single bag. This bag can be stored as a folder (referred to as 'uncompressed' in the Archivematica UI) or as a single file (by default a 7zip file).

This has limitations in some repository environments and does not allow archivists/repository managers flexibility in how AIPs are stored and accessed. For example, some storage systems have a maximum file size limitation, which an individual AIP may exceed. In other cases, an organisation may have a requirement to encrypt all content at rest.

=== Use case: AIP split into multiple parts ===

An AIP is split into multiple pieces (zipped packages, loose files, binary chunks) for storage and retrieval purposes. There needs to be a way to record metadata that indicates the existence and locations of all the parts, to record a PREMIS event for each transformation that was applied to the AIP, and to record details about how to reverse each transformation.

==== AIP Splitting scenarios ====

Archivematica already creates pointer files, which are METS files that describe an AIP. Pointer files record a PREMIS event when an AIP is compressed, for example. They can also be used to record metadata about AIP splitting.

Scenario 1: Simple splitting

An AIP is stored as a bag, and the bag is then turned into a .7z file. The .7z file is then split into multiple parts. This can be done with the unix split command ([http://man7.org/linux/man-pages/man1/split.1.html split man page]), with the -v argument to 7z ([https://sevenzip.osdn.jp/chm/cmdline/switches/volume.htm 7z volumes]) or by some other method. The result would look like:

 .
 └── AIP1 (folder)
     ├── AIP1.7z.001 (binary chunk)
     ├── AIP1.7z.002 (binary chunk)
     ├── AIP1.7z.003 (binary chunk)
     └── pointer.xml (xml file)

The pointer file would contain metadata outlining how to put the three parts back together into a single .7z file and unpack it. The result of this would be the original bag containing the AIP.
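
The split-and-rejoin round trip in Scenario 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the chunk naming simply mirrors the <code>.001</code>, <code>.002</code> suffixes shown above, and it is not how Archivematica or the Storage Service implements splitting. Comparing checksums before and after is one way the pointer file's reassembly instructions could be verified.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(block)
    return digest.hexdigest()


def split_file(source: Path, chunk_size: int) -> list[Path]:
    """Write source as numbered chunks (.001, .002, ...) next to it."""
    parts = []
    with source.open("rb") as handle:
        index = 1
        while True:
            # Each chunk is held in memory once; keep chunk_size modest.
            data = handle.read(chunk_size)
            if not data:
                break
            part = source.with_name(f"{source.name}.{index:03d}")
            part.write_bytes(data)
            parts.append(part)
            index += 1
    return parts


def join_parts(parts: list[Path], target: Path) -> None:
    """Concatenate the chunks, in name order, back into a single file."""
    # Name order equals chunk order only while the suffix stays 3 digits.
    with target.open("wb") as out:
        for part in sorted(parts):
            out.write(part.read_bytes())
```

A caller would record <code>sha256_of(source)</code> before splitting and compare it with the digest of the rejoined file after retrieval.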

Scenario 2: Splitting into a Bag

One problem with the first scenario is that the AIP1 folder is not structured according to any standard. Some storage systems may have a requirement to store content in bags. To satisfy this, this 2nd scenario adds an additional step - create a bag to hold the chunks:

 .
 └── AIP1 (folder)
     ├── bag-info.txt
     ├── bagit.txt
     ├── data
     │   ├── AIP1.7z.001 (binary chunk)
     │   ├── AIP1.7z.002 (binary chunk)
     │   ├── AIP1.7z.003 (binary chunk)
     │   └── pointer.xml
     ├── manifest-md5.txt
     ├── manifest-sha256.txt
     ├── tagmanifest-md5.txt
     └── tagmanifest-sha256.txt

In this scenario, two bags are involved: an outer bag, which holds the chunks (parts 1 to 3) and the pointer file, and the original inner bag containing the AIP, which is recovered once the chunks are stitched back together and unpacked.
The outer bag is useful because it allows checksum/integrity checking in a standards-compliant manner (by validating the bag). It also allows metadata about the entire AIP to be recorded in bag-info.txt, for example to conform to a storage system's requirement to use Bag Profiles.
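
As a rough illustration of the integrity check an outer bag provides, the sketch below writes a minimal bag around some chunks and then re-verifies the payload checksums. It is an assumption-laden example, not Archivematica code: the function names are hypothetical, BagIt version 0.97 is assumed, only a SHA-256 payload manifest is written, and bag-info.txt and the tag manifests shown in the tree above are omitted. In practice a maintained BagIt library would normally be used instead.

```python
import hashlib
from pathlib import Path


def make_bag(bag_dir: Path, payload: dict[str, bytes]) -> None:
    """Create a minimal bag: bagit.txt, data/ payload, SHA-256 manifest."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    manifest_lines = []
    for name in sorted(payload):
        (data_dir / name).write_bytes(payload[name])
        checksum = hashlib.sha256(payload[name]).hexdigest()
        manifest_lines.append(f"{checksum}  data/{name}")
    # Bag declaration; version 0.97 is an assumption made for this sketch.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )


def validate_bag(bag_dir: Path) -> list[str]:
    """Return payload paths that are missing or fail their checksum."""
    failures = []
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        checksum, relative_path = line.split(maxsplit=1)
        path = bag_dir / relative_path
        if not path.is_file():
            failures.append(relative_path)
        elif hashlib.sha256(path.read_bytes()).hexdigest() != checksum:
            failures.append(relative_path)
    return failures
```

An empty list from <code>validate_bag</code> means every chunk and the pointer file still match the manifest, so the chunks can safely be stitched back together.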

Scenario 3: Splitting into many Bags

This scenario is a bit more complicated than Scenario 2. The only advantage it brings is the ability to further transform each bag (e.g. compress, encrypt). This might be a requirement if using an object storage system, where it is desirable to store each bag as a single file. This is not possible in Scenario 2 without exceeding the maximum file size of the storage system.

 .
 └── AIP1 (folder)
     ├── AIP1.001 (folder)
     │   ├── bag-info.txt
     │   ├── bagit.txt
     │   ├── data
     │   │   └── AIP1.7z.001
     │   ├── manifest-md5.txt
     │   ├── manifest-sha256.txt
     │   ├── tagmanifest-md5.txt
     │   └── tagmanifest-sha256.txt
     ├── AIP1.002 (folder)
     │   ├── bag-info.txt
     │   ├── bagit.txt
     │   ├── data
     │   │   └── AIP1.7z.002
     │   ├── manifest-md5.txt
     │   ├── manifest-sha256.txt
     │   ├── tagmanifest-md5.txt
     │   └── tagmanifest-sha256.txt
     ├── AIP1.003 (folder)
     │   ├── bag-info.txt
     │   ├── bagit.txt
     │   ├── data
     │   │   └── AIP1.7z.003
     │   ├── manifest-md5.txt
     │   ├── manifest-sha256.txt
     │   ├── tagmanifest-md5.txt
     │   └── tagmanifest-sha256.txt
     └── pointer.xml

=== Use case: Encryption ===

An AIP should be encrypted before storing, independent of where it is stored. The AIP pointer file needs to track the information required to decrypt the AIP on retrieval.

=== Use case: Oxford Common File Layout ===

[https://ocfl.io/ https://ocfl.io/]

[[Category:Development documentation]]

Improvements/AIP Packaging

2018-08-14T19:13:46Z

Peter: /* Analysis: */

Dataset preservation

2013-06-24T22:10:26Z

Peter:

=Workflow=
*'''Composition of AIPs''': Large datasets may be divided into multiple transfers prior to ingest, so that one dataset ultimately consists of a number of AIPs. See '''Hierarchical AIC/AIP structure''', below.
**''note:'' a related, follow-up Archivematica requirement is to break up large files (e.g. video) that exceed a configurable maximum file size into multiple AIPs also tracked by an AIC
*'''Metadata ingest''': Metadata will be created outside of Archivematica prior to ingest, and may be referenced from the dmdSec of the AIP METS file as an xlink reference. See '''Metadata''', below.
*'''Normalization''': Some types of data files may require manual normalization: see https://projects.artefactual.com/issues/1499.


=Metadata=

==METS and DDI/FGDC==

*DDI is Data Documentation Initiative, a metadata specification for the social and behavioral sciences; see http://www.ddialliance.org/.
*FGDC is Federal Geographic Data Committee Metadata Standard [FGDC-STD-001-1998]; see http://www.fgdc.gov/metadata/csdgm/
*DDI and FGDC are considered descriptive metadata (dmdSec) in METS. From http://www.loc.gov/standards/mets/METSOverview.v2.html: "Valid values for the MDTYPE element [in dmdSec] include...DDI (Data Documentation Initiative), FGDC (Federal Geographic Data Committee Metadata Standard [FGDC-STD-001-1998])."
**In the Archivematica METS file, a DDI or FGDC file could be referenced from the dmdSec using mdRef, for example as follows: ''<mdRef LABEL="CCRI-CDN-Census1911V20110628.xml-73b93b28-be1b-433f-861e-03bc321dfe7e" xlink:href="metadata/CCRI-CDN-Census1911V20110628.xml" MDTYPE="DDI" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>''.


==METS and other metadata standards==

*Other metadata standards that could be used for ingested datasets include:
**North American Profile (NAP) of ISO 19115, for geospatial metadata: http://www.fgdc.gov/metadata/geospatial-metadata-standards
**SDMX for aggregate data: http://sdmx.org/?page_id=10
**EML, the Ecological Metadata Language: http://knb.ecoinformatics.org/software/eml/eml-2.1.1/index.html
*If these standards are used, the mdRef in the METS file would need to use OTHER as MDTYPE, for example: ''<mdRef LABEL="CCRI-CDN-Census1911V20110628.xml-73b93b28-be1b-433f-861e-03bc321dfe7e" xlink:href="metadata/CCRI-CDN-Census1911V20110628.xml" MDTYPE="OTHER" OTHERMDTYPE="SDMX" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>''
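
The choice between MDTYPE and OTHERMDTYPE described above can be sketched with Python's standard library. This is a hypothetical helper, not part of Archivematica: the function name and the <code>KNOWN_MDTYPES</code> set are assumptions made for illustration, and the set lists only the two values discussed on this page rather than the full METS vocabulary.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Only the MDTYPE values discussed on this page; METS defines more.
KNOWN_MDTYPES = {"DDI", "FGDC"}


def build_md_ref(label: str, href: str, standard: str) -> ET.Element:
    """Build an mdRef element pointing at a metadata file in the AIP."""
    attributes = {
        "LABEL": label,
        f"{{{XLINK_NS}}}href": href,
        "LOCTYPE": "OTHER",
        "OTHERLOCTYPE": "SYSTEM",
    }
    if standard in KNOWN_MDTYPES:
        attributes["MDTYPE"] = standard
    else:
        # Standards without their own MDTYPE value are named in OTHERMDTYPE.
        attributes["MDTYPE"] = "OTHER"
        attributes["OTHERMDTYPE"] = standard
    return ET.Element(f"{{{METS_NS}}}mdRef", attributes)
```

For example, <code>build_md_ref(label, "metadata/CCRI-CDN-Census1911V20110628.xml", "SDMX")</code> yields an element with MDTYPE="OTHER" and OTHERMDTYPE="SDMX", while passing "DDI" sets MDTYPE="DDI" and omits OTHERMDTYPE.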


=Hierarchical AIC/AIP structure=

*Because datasets can be large and heterogeneous, one "dataset" may be broken into multiple AIPs. In such cases, the multiple AIPs can be intellectually combined into one AIC, or Archival Information Collection, defined by the OAIS reference model as "[a]n Archival Information Package whose Content Information is an aggregation of other Archival Information Packages." (OAIS 1-9).
**The AIC consists of a METS file containing a fileSec and a logical structMap listing all child AIPs (Note that this is based on '''Option 1''' under '''Possible AIC/AIP designs''', below).
**In storage, a pointer.xml file gives storage and compression information for each AIC and AIP.
*This diagram shows a storage area with standalone AIPs, an AIC with child AIPs, and related pointer.xml files.


[[File:AIC_AIP_storage.png|600px|thumb|center|Archival storage area containing pointer files, AICs and AIPs]]


==Possible AIC/AIP designs==

===Option 1 (preferred)===



[[File:AIP1G.png|680px|thumb|center]]



'''Description''': An AIC consisting of only a fileSec and structMap; AIPs consisting of data files and metadata for those data files; an AIP consisting of project/program-level (i.e. dataset) metadata and documentation.



'''Workflow''':
#User creates X number of AIPs and puts them in archival storage
#*One of these AIPs consists only of metadata and documentation about the program/project as a whole
#*The AIPs must have one or more common metadata elements that allow them to be identified as being related
#User searches for AIPs in archival storage tab (using the common metadata element in the AIPs in the search query)
#Once search results are retrieved, user clicks "Create AIC" button
#AIC is created, containing only a METS structMap listing all AIPs
#Over time, user can add new AIPs and re-create the AIC at any time; the new AIC will either replace or update the old one
#Over time, if needed the user either updates the existing documentation AIP or adds new documentation AIPs (i.e. there can be more than one documentation AIP per dataset)
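
The search and "Create AIC" steps of this workflow can be sketched as follows. Everything here is an assumption made for illustration: the record layout (dictionaries with a <code>uuid</code> key), the function names, and the TYPE and LABEL values on the div elements are hypothetical and do not reproduce the sample AIC METS file shown below.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def find_related(aips: list[dict], field: str, value: str) -> list[dict]:
    """Select the AIPs that share the common metadata element."""
    return [aip for aip in aips if aip.get(field) == value]


def build_aic_struct_map(aic_label: str, aips: list[dict]) -> ET.Element:
    """Build a logical structMap with one child div per AIP."""
    struct_map = ET.Element(f"{{{METS_NS}}}structMap", {"TYPE": "logical"})
    aic_div = ET.SubElement(
        struct_map,
        f"{{{METS_NS}}}div",
        {"TYPE": "AIC", "LABEL": aic_label},  # hypothetical TYPE value
    )
    # Sort by UUID so that re-creating the AIC gives a stable listing.
    for aip in sorted(aips, key=lambda item: item["uuid"]):
        ET.SubElement(
            aic_div,
            f"{{{METS_NS}}}div",
            {"TYPE": "AIP", "LABEL": aip["uuid"]},  # hypothetical TYPE value
        )
    return struct_map
```

Because the structMap is rebuilt from the current search results each time, adding a new AIP later only requires running the same two functions again, which matches the "re-create the AIC at any time" step.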



'''Pros''':
*Don't have to duplicate program/project-level documentation in each AIP
*Simple workflow for creating AIC
*Easy to add new AIPs
*If program/project documentation needs updating, only one AIP has to be re-processed, or user can add new documentation AIP(s)



'''Cons''':
*There is only a one-way link between the AIC and child AIPs - i.e. the AIC has a structMap listing all child AIPs, but there is nothing in a child AIP to indicate that it belongs to a given AIC.



'''Sample AIC METS file'''



[[File:METS_AIC_AIP.png|700px|thumb|center|]]



'''Sample pointer.xml file'''



[[File:pointer6G.png|700px|thumb|center|]]
[[File:pointer7G.png|700px|thumb|center|]]



===Option 2===



[[File:AIP2G.png|680px|thumb|center]]



'''Description''': An AIC consisting of a METS structMap and project/program-level (i.e. dataset) metadata and documentation; content AIPs consisting of data files and metadata about the data files. AIPs have information in the METS files (in the structMap?) linking them to the parent AIC.



'''Workflow''':
To be determined - probably a dashboard tab with a GUI to allow users to arrange existing AIPs into an AIC



'''Pros''':
*Don't have to duplicate program/project-level documentation in each AIP
*AIPs have a link up to the AIC, so if an AIP is orphaned the relationship to the AIC can easily be reconstructed
*If program/project-level metadata and documentation needs to be updated, only the AIC needs to be re-processed



'''Cons''':
*Workflow to create this structure may be complex
*No obvious mechanism for adding new AIPs over time



===Option 3===



[[File:AIP3G.png|680px|thumb|center]]



'''Description''': An AIC with a unique identifier consisting of project/program-level (i.e. dataset) metadata and documentation only (no structMap); AIPs consisting of data files, metadata for those data files, and the same identifier as the AIC. The relationship between the AIC and AIPs in this scenario is inferred from the matching identifiers.



'''Workflow''':
#User creates an AIC consisting of project/program-level (i.e. dataset) metadata and documentation
#*The AIC contains an identifier that distinguishes it from other AICs
#User creates AIPs consisting of data files and metadata for those data files
#*User includes the AIC identifier in each AIP
#Over time, if needed the user can add more AIPs with the same identifier



'''Pros''':
*Don't have to duplicate program/project-level documentation in each AIP
*Simple workflow
*Minimal development requirements, just new metadata field for identifier added to transfer tab, corresponding entry in AIC/AIP METS files and ability to search by AIC identifier in archival storage tab
*If program/project-level metadata and documentation needs to be updated, only the AIC needs to be re-processed
*Easy to add more AIPs to the same AIC over time



'''Cons''':
*No structMap in the AIC means that there is no single source of information about how many AIPs are in the AIC



===Option 4===



[[File:AIP4G.png|680px|thumb|center]]



'''Description''': No AIC; project/program-level metadata and documentation duplicated in all AIPs; links between AIPs belonging to one dataset inferred from metadata only



'''Workflow''':
User creates any number of AIPs with complete copies of the project/program-level (i.e. dataset) metadata and documentation in each AIP



'''Pros''':
*Minimal Archivematica development required, just ensuring that matching metadata elements are parsed to the AIP METS files or otherwise made available to ElasticSearch index
*Easy to add new AIPs over time



'''Cons''':
*User has to maintain copies of project/program-level metadata and documentation outside of Archivematica so they can be added to each AIP
*Updating the project/program-level metadata and documentation would require re-processing the AIPs
*Relationships between AIPs would have to be inferred from matching metadata elements alone; if an AIP were lost, there would be no list of AIPs belonging to the dataset which would reveal the loss



[[Category:Development documentation]]

Dataset preservation

2013-06-24T22:04:56Z

Peter:

=Workflow=
*'''Composition of AIPs''': Large datasets may be divided into multiple transfers prior to ingest, so that one dataset ultimately consists of a number of AIPs. See '''Hierarchical AIC/AIP structure''', below.
** Note: a related Archivematica requirement is to break up large files (e.g. video) that exceed a configurable maximum file size into multiple AIPs also tracked by an AIC
*'''Metadata ingest''': Metadata will be created outside of Archivematica prior to ingest, and may be referenced from the dmdSec of the AIP METS file as an xlink reference. See '''Metadata''', below.
*'''Normalization''': Some types of data files may require manual normalization: see https://projects.artefactual.com/issues/1499.


=Metadata=

==METS and DDI/FGDC==

*DDI is Data Documentation Initiative, a metadata specification for the social and behavioral sciences; see http://www.ddialliance.org/.
*FGDC is Federal Geographic Data Committee Metadata Standard [FGDC-STD-001-1998]; see http://www.fgdc.gov/metadata/csdgm/
*DDI and FGDC are considered descriptive metadata (dmdSec) in METS. From http://www.loc.gov/standards/mets/METSOverview.v2.html: "Valid values for the MDTYPE element [in dmdSec] include...DDI (Data Documentation Initiative), FGDC (Federal Geographic Data Committee Metadata Standard [FGDC-STD-001-1998]."
**In the Archivematica METS file, a DDI or FGDC file could be referenced from the dmdSec using mdRef, for example as follows: ''<mdRef LABEL="CCRI-CDN-Census1911V20110628.xml-73b93b28-be1b-433f-861e-03bc321dfe7e" xlink:href="metadata/CCRI-CDN-Census1911V20110628.xml" MDTYPE="DDI" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>''.


==METS and other metadata standards==

*Other metadata standards that could be used for ingested datasets include:
**North American Profile (NAP) of ISO 19115, for geospatial metadata: http://www.fgdc.gov/metadata/geospatial-metadata-standards
**SDMX for aggregate data: http://sdmx.org/?page_id=10
**EML, the Ecological Metadata Language: http://knb.ecoinformatics.org/software/eml/eml-2.1.1/index.html
*If these standards are used, the mdRef in the METS file would need to use OTHER as MDTYPE, for example: ''<mdRef LABEL="CCRI-CDN-Census1911V20110628.xml-73b93b28-be1b-433f-861e-03bc321dfe7e" xlink:href="metadata/CCRI-CDN-Census1911V20110628.xml" MDTYPE="OTHER" OTHERMDTYPE="SDMX" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>''
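The two mdRef variants above differ only in how the metadata standard is declared. A minimal sketch of that rule, assuming Python's standard <code>xml.etree.ElementTree</code> module: the function name <code>build_mdref</code>, the shortened label and file name, and the two-entry <code>NATIVE_MDTYPES</code> set are hypothetical and only cover the values named on this page (the METS schema defines more). This is an illustration, not Archivematica's actual METS-writing code.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Assumption: only the two MDTYPE values discussed on this page.
NATIVE_MDTYPES = {"DDI", "FGDC"}


def build_mdref(label, href, standard):
    """Return a mets:mdRef element that points at an external metadata file."""
    attrib = {
        "LABEL": label,
        "{%s}href" % XLINK_NS: href,
        "LOCTYPE": "OTHER",
        "OTHERLOCTYPE": "SYSTEM",
    }
    if standard in NATIVE_MDTYPES:
        # The standard has its own MDTYPE value.
        attrib["MDTYPE"] = standard
    else:
        # Any other standard is declared through OTHERMDTYPE.
        attrib["MDTYPE"] = "OTHER"
        attrib["OTHERMDTYPE"] = standard
    return ET.Element("{%s}mdRef" % METS_NS, attrib)


# Hypothetical label and path, shortened for readability.
ddi_ref = build_mdref("census.xml", "metadata/census.xml", "DDI")
sdmx_ref = build_mdref("census.xml", "metadata/census.xml", "SDMX")
```

Keeping the decision in one place means a new standard only needs to be added to the set once METS assigns it its own MDTYPE value.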


=Hierarchical AIC/AIP structure=

*Because datasets can be large and heterogeneous, one "dataset" may be broken into multiple AIPs. In such cases, the multiple AIPs can be intellectually combined into one AIC, or Archival Information Collection, defined by the OAIS reference model as "[a]n Archival Information Package whose Content Information is an aggregation of other Archival Information Packages." (OAIS 1-9).
**The AIC consists of a METS file containing a fileSec and a logical structMap listing all child AIPs (Note that this is based on '''Option 1''' under '''Possible AIC/AIP designs''', below).
**In storage, a pointer.xml file gives storage and compression information for each AIC and AIP.
*This diagram shows a storage area with standalone AIPs, an AIC with child AIPs, and related pointer.xml files.


[[File:AIC_AIP_storage.png|600px|thumb|center|Archival storage area containing pointer files, AICs and AIPs]]


==Possible AIC/AIP designs==

===Option 1 (preferred)===



[[File:AIP1G.png|680px|thumb|center]]



'''Description''': An AIC consisting of only a fileSec and structMap; AIPs consisting of data files and metadata for those data files; an AIP consisting of project/program-level (i.e. dataset) metadata and documentation.



'''Workflow''':
#User creates X number of AIPs and puts them in archival storage
#*One of these AIPs consists only of metadata and documentation about the program/project as a whole
#*The AIPs must have one or more common metadata elements that allow them to be identified as being related
#User searches for AIPs in archival storage tab (using the common metadata element in the AIPs in the search query)
#Once search results are retrieved, user clicks "Create AIC" button
#AIC is created, containing only a METS structMap listing all AIPs
#Over time, user can add new AIPs and re-create the AIC at any time; the new AIC will either replace or update the old one
#Over time, if needed the user either updates the existing documentation AIP or adds new documentation AIPs (i.e. there can be more than one documentation AIP per dataset)
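The "Create AIC" step in the workflow above can be sketched as follows, assuming Python's standard <code>xml.etree.ElementTree</code> module. The function name <code>build_aic_structmap</code>, the <code>TYPE</code> values on the div elements, the use of <code>LABEL</code> to carry each AIP identifier, and the sample identifiers are all assumptions made for illustration; the actual layout is the one shown in the sample AIC METS file.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def build_aic_structmap(aic_label, aip_ids):
    """Return a logical structMap with one child div per AIP in the AIC."""
    struct_map = ET.Element("{%s}structMap" % METS_NS, {"TYPE": "logical"})
    aic_div = ET.SubElement(
        struct_map,
        "{%s}div" % METS_NS,
        {"TYPE": "Archival Information Collection", "LABEL": aic_label},
    )
    for aip_id in aip_ids:
        ET.SubElement(
            aic_div,
            "{%s}div" % METS_NS,
            {"TYPE": "Archival Information Package", "LABEL": aip_id},
        )
    return struct_map


# Hypothetical search results from the archival storage tab.
found = ["aip-0001", "aip-0002"]
first_aic = build_aic_structmap("Census dataset", found)

# Later a new AIP is stored and the AIC is re-created from a fresh search.
second_aic = build_aic_structmap("Census dataset", found + ["aip-0003"])
```

Because the structMap is rebuilt from the search results each time, adding an AIP never requires touching the existing child AIPs, which is also why the link stays one-way.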



'''Pros''':
*Don't have to duplicate program/project-level documentation in each AIP
*Simple workflow for creating AIC
*Easy to add new AIPs
*If program/project documentation needs updating, only one AIP has to be re-processed, or user can add new documentation AIP(s)



'''Cons''':
*There is only a one-way link between the AIC and child AIPs - i.e. the AIC has a structMap listing all child AIPs, but there is nothing in a child AIP to indicate that it belongs to a given AIC.



'''Sample AIC METS file'''



[[File:METS_AIC_AIP.png|700px|thumb|center|]]



'''Sample pointer.xml file'''



[[File:pointer6G.png|700px|thumb|center|]]
[[File:pointer7G.png|700px|thumb|center|]]



===Option 2===



[[File:AIP2G.png|680px|thumb|center]]



'''Description''': An AIC consisting of a METS structMap and project/program-level (i.e. dataset) metadata and documentation; content AIPs consisting of data files and metadata about the data files. AIPs have information in the METS files (in the structMap?) linking them to the parent AIC.



'''Workflow''':
To be determined - probably a dashboard tab with a GUI that allows users to arrange existing AIPs into an AIC



'''Pros''':
*Don't have to duplicate program/project-level documentation in each AIP
*AIPs have a link up to the AIC, so if an AIP is orphaned the relationship to the AIC can easily be reconstructed
*If program/project-level metadata and documentation needs to be updated, only the AIC needs to be re-processed



'''Cons''':
*Workflow to create this structure may be complex
*No obvious mechanism for adding new AIPs over time



===Option 3===



[[File:AIP3G.png|680px|thumb|center]]



'''Description''': An AIC with a unique identifier consisting of project/program-level (i.e. dataset) metadata and documentation only (no structMap); AIPs consisting of data files, metadata for those data files, and the same identifier as the AIC. The relationship between the AIC and AIPs in this scenario is inferred from the matching identifiers.



'''Workflow''':
#User creates an AIC consisting of project/program-level (i.e. dataset) metadata and documentation
#*The AIC contains an identifier that distinguishes it from other AICs
#User creates AIPs consisting of data files and metadata for those data files
#*User includes the AIC identifier in each AIP
#Over time, if needed the user can add more AIPs with the same identifier
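Since this option has no structMap, membership can only be inferred by collecting the AIPs that carry the same identifier. A minimal sketch of that inference in Python follows; the record layout (<code>uuid</code> and <code>aic_identifier</code> keys) and the sample values are hypothetical, standing in for whatever the archival storage search would return.

```python
from collections import defaultdict


def group_aips_by_aic(aips):
    """Map each AIC identifier to the list of AIP UUIDs that carry it."""
    groups = defaultdict(list)
    for aip in aips:
        groups[aip["aic_identifier"]].append(aip["uuid"])
    return dict(groups)


# Hypothetical records for AIPs currently found in storage.
stored = [
    {"uuid": "aip-0001", "aic_identifier": "census-1911"},
    {"uuid": "aip-0002", "aic_identifier": "census-1911"},
    {"uuid": "aip-0003", "aic_identifier": "survey-2012"},
]
members = group_aips_by_aic(stored)
```

The result only reflects the AIPs that are present at search time, so a missing AIP would simply not appear in its group.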



'''Pros''':
*Don't have to duplicate program/project-level documentation in each AIP
*Simple workflow
*Minimal development requirements, just new metadata field for identifier added to transfer tab, corresponding entry in AIC/AIP METS files and ability to search by AIC identifier in archival storage tab
*If program/project-level metadata and documentation needs to be updated, only the AIC needs to be re-processed
*Easy to add more AIPs to the same AIC over time



'''Cons''':
*No structMap in the AIC means that there is no single source of information about how many AIPs are in the AIC



===Option 4===



[[File:AIP4G.png|680px|thumb|center]]



'''Description''': No AIC; project/program-level metadata and documentation duplicated in all AIPs; links between AIPs belonging to one dataset inferred from metadata only



'''Workflow''':
User creates any number of AIPs with complete copies of the project/program-level (i.e. dataset) metadata and documentation in each AIP



'''Pros''':
*Minimal Archivematica development required, just ensuring that matching metadata elements are parsed to the AIP METS files or otherwise made available to ElasticSearch index
*Easy to add new AIPs over time



'''Cons''':
*User has to maintain copies of project/program-level metadata and documentation outside of Archivematica so they can be added to each AIP
*Updating the project/program-level metadata and documentation would require re-processing the AIPs
*Relationships between AIPs would have to be inferred from matching metadata elements alone; if an AIP were lost, there would be no list of AIPs belonging to the dataset which would reveal the loss



[[Category:Development documentation]]

Dataset preservation

2013-06-24T22:03:24Z

Peter:

=Workflow=
*'''Composition of AIPs''': Large datasets may be divided into multiple transfers prior to ingest, so that one dataset ultimately consists of a number of AIPs. See '''Hierarchical AIC/AIP structure''', below.
** Note: a related Archivematica requirement is to break up large files (e.g. video) that exceed a configurable maximum file size into multiple AIPs tracked by an AIC
*'''Metadata ingest''': Metadata will be created outside of Archivematica prior to ingest, and may be referenced from the dmdSec of the AIP METS file as an xlink reference. See '''Metadata''', below.
*'''Normalization''': Some types of data files may require manual normalization: see https://projects.artefactual.com/issues/1499.


=Metadata=

==METS and DDI/FGDC==

*DDI is Data Documentation Initiative, a metadata specification for the social and behavioral sciences; see http://www.ddialliance.org/.
*FGDC is Federal Geographic Data Committee Metadata Standard [FGDC-STD-001-1998]; see http://www.fgdc.gov/metadata/csdgm/
*DDI and FGDC are considered descriptive metadata (dmdSec) in METS. From http://www.loc.gov/standards/mets/METSOverview.v2.html: "Valid values for the MDTYPE element [in dmdSec] include...DDI (Data Documentation Initiative), FGDC (Federal Geographic Data Committee Metadata Standard [FGDC-STD-001-1998]."
**In the Archivematica METS file, a DDI or FGDC file could be referenced from the dmdSec using mdRef, for example as follows: ''<mdRef LABEL="CCRI-CDN-Census1911V20110628.xml-73b93b28-be1b-433f-861e-03bc321dfe7e" xlink:href="metadata/CCRI-CDN-Census1911V20110628.xml" MDTYPE="DDI" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>''.


==METS and other metadata standards==

*Other metadata standards that could be used for ingested datasets include:
**North American Profile (NAP) of ISO 19115, for geospatial metadata: http://www.fgdc.gov/metadata/geospatial-metadata-standards
**SDMX for aggregate data: http://sdmx.org/?page_id=10
**EML, the Ecological Metadata Language: http://knb.ecoinformatics.org/software/eml/eml-2.1.1/index.html
*If these standards are used, the mdRef in the METS file would need to use OTHER as MDTYPE, for example: ''<mdRef LABEL="CCRI-CDN-Census1911V20110628.xml-73b93b28-be1b-433f-861e-03bc321dfe7e" xlink:href="metadata/CCRI-CDN-Census1911V20110628.xml" MDTYPE="OTHER" OTHERMDTYPE="SDMX" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>''


=Hierarchical AIC/AIP structure=

*Because datasets can be large and heterogeneous, one "dataset" may be broken into multiple AIPs. In such cases, the multiple AIPs can be intellectually combined into one AIC, or Archival Information Collection, defined by the OAIS reference model as "[a]n Archival Information Package whose Content Information is an aggregation of other Archival Information Packages." (OAIS 1-9).
**The AIC consists of a METS file containing a fileSec and a logical structMap listing all child AIPs (Note that this is based on '''Option 1''' under '''Possible AIC/AIP designs''', below).
**In storage, a pointer.xml file gives storage and compression information for each AIC and AIP.
*This diagram shows a storage area with standalone AIPs, an AIC with child AIPs, and related pointer.xml files.


[[File:AIC_AIP_storage.png|600px|thumb|center|Archival storage area containing pointer files, AICs and AIPs]]


==Possible AIC/AIP designs==

===Option 1 (preferred)===



[[File:AIP1G.png|680px|thumb|center]]



'''Description''': An AIC consisting of only a fileSec and structMap; AIPs consisting of data files and metadata for those data files; an AIP consisting of project/program-level (i.e. dataset) metadata and documentation.



'''Workflow''':
#User creates X number of AIPs and puts them in archival storage
#*One of these AIPs consists only of metadata and documentation about the program/project as a whole
#*The AIPs must have one or more common metadata elements that allow them to be identified as being related
#User searches for AIPs in archival storage tab (using the common metadata element in the AIPs in the search query)
#Once search results are retrieved, user clicks "Create AIC" button
#AIC is created, containing only a METS structMap listing all AIPs
#Over time, user can add new AIPs and re-create the AIC at any time; the new AIC will either replace or update the old one
#Over time, if needed the user either updates the existing documentation AIP or adds new documentation AIPs (i.e. there can be more than one documentation AIP per dataset)



'''Pros''':
*Don't have to duplicate program/project-level documentation in each AIP
*Simple workflow for creating AIC
*Easy to add new AIPs
*If program/project documentation needs updating, only one AIP has to be re-processed, or user can add new documentation AIP(s)



'''Cons''':
*There is only a one-way link between the AIC and child AIPs - i.e. the AIC has a structMap listing all child AIPs, but there is nothing in a child AIP to indicate that it belongs to a given AIC.



'''Sample AIC METS file'''



[[File:METS_AIC_AIP.png|700px|thumb|center|]]



'''Sample pointer.xml file'''



[[File:pointer6G.png|700px|thumb|center|]]
[[File:pointer7G.png|700px|thumb|center|]]



===Option 2===



[[File:AIP2G.png|680px|thumb|center]]



'''Description''': An AIC consisting of a METS structMap and project/program-level (i.e. dataset) metadata and documentation; content AIPs consisting of data files and metadata about the data files. AIPs have information in the METS files (in the structMap?) linking them to the parent AIC.



'''Workflow''':
To be determined - probably a dashboard tab with a GUI that allows users to arrange existing AIPs into an AIC



'''Pros''':
*Don't have to duplicate program/project-level documentation in each AIP
*AIPs have a link up to the AIC, so if an AIP is orphaned the relationship to the AIC can easily be reconstructed
*If program/project-level metadata and documentation needs to be updated, only the AIC needs to be re-processed



'''Cons''':
*Workflow to create this structure may be complex
*No obvious mechanism for adding new AIPs over time



===Option 3===



[[File:AIP3G.png|680px|thumb|center]]



'''Description''': An AIC with a unique identifier consisting of project/program-level (i.e. dataset) metadata and documentation only (no structMap); AIPs consisting of data files, metadata for those data files, and the same identifier as the AIC. The relationship between the AIC and AIPs in this scenario is inferred from the matching identifiers.



'''Workflow''':
#User creates an AIC consisting of project/program-level (i.e. dataset) metadata and documentation
#*The AIC contains an identifier that distinguishes it from other AICs
#User creates AIPs consisting of data files and metadata for those data files
#*User includes the AIC identifier in each AIP
#Over time, if needed the user can add more AIPs with the same identifier



'''Pros''':
*Don't have to duplicate program/project-level documentation in each AIP
*Simple workflow
*Minimal development requirements, just new metadata field for identifier added to transfer tab, corresponding entry in AIC/AIP METS files and ability to search by AIC identifier in archival storage tab
*If program/project-level metadata and documentation needs to be updated, only the AIC needs to be re-processed
*Easy to add more AIPs to the same AIC over time



'''Cons''':
*No structMap in the AIC means that there is no single source of information about how many AIPs are in the AIC



===Option 4===



[[File:AIP4G.png|680px|thumb|center]]



'''Description''': No AIC; project/program-level metadata and documentation duplicated in all AIPs; links between AIPs belonging to one dataset inferred from metadata only



'''Workflow''':
User creates any number of AIPs with complete copies of the project/program-level (i.e. dataset) metadata and documentation in each AIP



'''Pros''':
*Minimal Archivematica development required, just ensuring that matching metadata elements are parsed to the AIP METS files or otherwise made available to ElasticSearch index
*Easy to add new AIPs over time



'''Cons''':
*User has to maintain copies of project/program-level metadata and documentation outside of Archivematica so they can be added to each AIP
*Updating the project/program-level metadata and documentation would require re-processing the AIPs
*Relationships between AIPs would have to be inferred from matching metadata elements alone; if an AIP were lost, there would be no list of AIPs belonging to the dataset which would reveal the loss



[[Category:Development documentation]]

Dataset preservation

2013-06-24T22:02:35Z

Peter:

=Workflow=
*'''Composition of AIPs''': Large datasets may be divided into multiple transfers prior to ingest, so that one dataset ultimately consists of a number of AIPs. See '''Hierarchical AIC/AIP structure''', below.
** Note: a related Archivematica 1.1 requirement is to break up large files that exceed a configurable maximum file size into multiple AIPs tracked by an AIC
*'''Metadata ingest''': Metadata will be created outside of Archivematica prior to ingest, and may be referenced from the dmdSec of the AIP METS file as an xlink reference. See '''Metadata''', below.
*'''Normalization''': Some types of data files may require manual normalization: see https://projects.artefactual.com/issues/1499.


=Metadata=

==METS and DDI/FGDC==

*DDI is Data Documentation Initiative, a metadata specification for the social and behavioral sciences; see http://www.ddialliance.org/.
*FGDC is Federal Geographic Data Committee Metadata Standard [FGDC-STD-001-1998]; see http://www.fgdc.gov/metadata/csdgm/
*DDI and FGDC are considered descriptive metadata (dmdSec) in METS. From http://www.loc.gov/standards/mets/METSOverview.v2.html: "Valid values for the MDTYPE element [in dmdSec] include...DDI (Data Documentation Initiative), FGDC (Federal Geographic Data Committee Metadata Standard [FGDC-STD-001-1998]."
**In the Archivematica METS file, a DDI or FGDC file could be referenced from the dmdSec using mdRef, for example as follows: ''<mdRef LABEL="CCRI-CDN-Census1911V20110628.xml-73b93b28-be1b-433f-861e-03bc321dfe7e" xlink:href="metadata/CCRI-CDN-Census1911V20110628.xml" MDTYPE="DDI" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>''.


==METS and other metadata standards==

*Other metadata standards that could be used for ingested datasets include:
**North American Profile (NAP) of ISO 19115, for geospatial metadata: http://www.fgdc.gov/metadata/geospatial-metadata-standards
**SDMX for aggregate data: http://sdmx.org/?page_id=10
**EML, the Ecological Metadata Language: http://knb.ecoinformatics.org/software/eml/eml-2.1.1/index.html
*If these standards are used, the mdRef in the METS file would need to use OTHER as MDTYPE, for example: ''<mdRef LABEL="CCRI-CDN-Census1911V20110628.xml-73b93b28-be1b-433f-861e-03bc321dfe7e" xlink:href="metadata/CCRI-CDN-Census1911V20110628.xml" MDTYPE="OTHER" OTHERMDTYPE="SDMX" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>''


=Hierarchical AIC/AIP structure=

*Because datasets can be large and heterogeneous, one "dataset" may be broken into multiple AIPs. In such cases, the multiple AIPs can be intellectually combined into one AIC, or Archival Information Collection, defined by the OAIS reference model as "[a]n Archival Information Package whose Content Information is an aggregation of other Archival Information Packages." (OAIS 1-9).
**The AIC consists of a METS file containing a fileSec and a logical structMap listing all child AIPs (Note that this is based on '''Option 1''' under '''Possible AIC/AIP designs''', below).
**In storage, a pointer.xml file gives storage and compression information for each AIC and AIP.
*This diagram shows a storage area with standalone AIPs, an AIC with child AIPs, and related pointer.xml files.


[[File:AIC_AIP_storage.png|600px|thumb|center|Archival storage area containing pointer files, AICs and AIPs]]


==Possible AIC/AIP designs==

===Option 1 (preferred)===



[[File:AIP1G.png|680px|thumb|center]]



'''Description''': An AIC consisting of only a fileSec and structMap; AIPs consisting of data files and metadata for those data files; an AIP consisting of project/program-level (i.e. dataset) metadata and documentation.



'''Workflow''':
#User creates X number of AIPs and puts them in archival storage
#*One of these AIPs consists only of metadata and documentation about the program/project as a whole
#*The AIPs must have one or more common metadata elements that allow them to be identified as being related
#User searches for AIPs in archival storage tab (using the common metadata element in the AIPs in the search query)
#Once search results are retrieved, user clicks "Create AIC" button
#AIC is created, containing only a METS structMap listing all AIPs
#Over time, user can add new AIPs and re-create the AIC at any time; the new AIC will either replace or update the old one
#Over time, if needed the user either updates the existing documentation AIP or adds new documentation AIPs (i.e. there can be more than one documentation AIP per dataset)



'''Pros''':
*Don't have to duplicate program/project-level documentation in each AIP
*Simple workflow for creating AIC
*Easy to add new AIPs
*If program/project documentation needs updating, only one AIP has to be re-processed, or user can add new documentation AIP(s)



'''Cons''':
*There is only a one-way link between the AIC and child AIPs - i.e. the AIC has a structMap listing all child AIPs, but there is nothing in a child AIP to indicate that it belongs to a given AIC.



'''Sample AIC METS file'''



[[File:METS_AIC_AIP.png|700px|thumb|center|]]



'''Sample pointer.xml file'''



[[File:pointer6G.png|700px|thumb|center|]]
[[File:pointer7G.png|700px|thumb|center|]]



===Option 2===



[[File:AIP2G.png|680px|thumb|center]]



'''Description''': An AIC consisting of a METS structMap and project/program-level (i.e. dataset) metadata and documentation; content AIPs consisting of data files and metadata about the data files. AIPs have information in the METS files (in the structMap?) linking them to the parent AIC.



'''Workflow''':
To be determined - probably a dashboard tab with a GUI that allows users to arrange existing AIPs into an AIC



'''Pros''':
*Don't have to duplicate program/project-level documentation in each AIP
*AIPs have a link up to the AIC, so if an AIP is orphaned the relationship to the AIC can easily be reconstructed
*If program/project-level metadata and documentation needs to be updated, only the AIC needs to be re-processed



'''Cons''':
*Workflow to create this structure may be complex
*No obvious mechanism for adding new AIPs over time



===Option 3===



[[File:AIP3G.png|680px|thumb|center]]



'''Description''': An AIC with a unique identifier consisting of project/program-level (i.e. dataset) metadata and documentation only (no structMap); AIPs consisting of data files, metadata for those data files, and the same identifier as the AIC. The relationship between the AIC and AIPs in this scenario is inferred from the matching identifiers.



'''Workflow''':
#User creates an AIC consisting of project/program-level (i.e. dataset) metadata and documentation
#*The AIC contains an identifier that distinguishes it from other AICs
#User creates AIPs consisting of data files and metadata for those data files
#*User includes the AIC identifier in each AIP
#Over time, if needed the user can add more AIPs with the same identifier



'''Pros''':
*Don't have to duplicate program/project-level documentation in each AIP
*Simple workflow
*Minimal development requirements, just new metadata field for identifier added to transfer tab, corresponding entry in AIC/AIP METS files and ability to search by AIC identifier in archival storage tab
*If program/project-level metadata and documentation needs to be updated, only the AIC needs to be re-processed
*Easy to add more AIPs to the same AIC over time



'''Cons''':
*No structMap in the AIC means that there is no single source of information about how many AIPs are in the AIC



===Option 4===



[[File:AIP4G.png|680px|thumb|center]]



'''Description''': No AIC; project/program-level metadata and documentation duplicated in all AIPs; links between AIPs belonging to one dataset inferred from metadata only



'''Workflow''':
User creates any number of AIPs with complete copies of the project/program-level (i.e. dataset) metadata and documentation in each AIP



'''Pros''':
*Minimal Archivematica development required, just ensuring that matching metadata elements are parsed to the AIP METS files or otherwise made available to ElasticSearch index
*Easy to add new AIPs over time



'''Cons''':
*User has to maintain copies of project/program-level metadata and documentation outside of Archivematica so they can be added to each AIP
*Updating the project/program-level metadata and documentation would require re-processing the AIPs
*Relationships between AIPs would have to be inferred from matching metadata elements alone; if an AIP were lost, there would be no list of AIPs belonging to the dataset which would reveal the loss



[[Category:Development documentation]]

Overview

2013-05-09T00:23:42Z

Peter:


{|style="width:95%;" border="0"
|-valign="top"
|style="width: 70%; padding: 0.5em 1em 1em; color: rgb(0, 0, 0);"|

==Open Source OAIS==
Archivematica provides an integrated suite of free and open-source tools that allows users to process digital objects from [[Micro-services#Archivematica_Micro-services|ingest to archival storage and access]] in [[Requirements|compliance]] with the [http://en.wikipedia.org/wiki/Open_Archival_Information_System ISO-OAIS] functional model and other [[wikipedia:Digital preservation|digital preservation]] standards and best practices. All of the Archivematica code and documentation is released under AGPL and Creative Commons open-source licenses.

==Micro-Services design pattern==
Archivematica implements a [http://www.cdlib.org/services/uc3/curation/ micro-service] approach to digital preservation. The Archivematica micro-services are granular system tasks that operate on a conceptual entity equivalent to an OAIS information package: Submission Information Package (SIP), Archival Information Package (AIP) or Dissemination Information Package (DIP). The physical structure of an information package includes files, checksums, logs, submission documentation, XML metadata, etc.

These information packages are processed using a series of micro-services. Micro-services are provided by a combination of Archivematica Python scripts and one or more of the free, open-source [[External tools|software tools]] bundled in the Archivematica system. Each micro-service results in a success or error state and the information package is processed accordingly by the next micro-service. There are a variety of mechanisms used to connect the various micro-services together into complex, custom workflows. Micro-services can be distributed to processing clusters for highly scalable configurations.
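The idea of chaining micro-services, each ending in a success or error state, can be sketched in a few lines of Python. Everything here is hypothetical: the function names, the dictionary used to stand in for an information package, and the choice to stop at the first error are assumptions for illustration and do not describe how Archivematica actually links its micro-services.

```python
def run_micro_services(package, services):
    """Run each (name, callable) pair in order; stop at the first error."""
    for name, service in services:
        try:
            service(package)
        except Exception as err:
            # Record the error state and halt the chain.
            package["log"].append((name, "error", str(err)))
            return False
        package["log"].append((name, "success", ""))
    return True


# Hypothetical micro-services; the names are illustrative only.
def check_not_empty(package):
    if not package["files"]:
        raise ValueError("package contains no files")


def assign_identifiers(package):
    package["ids"] = {name: index for index, name in enumerate(package["files"])}


chain = [
    ("check_not_empty", check_not_empty),
    ("assign_identifiers", assign_identifiers),
]

sip = {"files": ["a.tif", "b.tif"], "log": []}
ok = run_micro_services(sip, chain)
```

Each service only sees the package and reports through an exception or a normal return, so services can be reordered or swapped without changing the runner.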

==Dashboard==
The web dashboard allows users to process, monitor and control the Archivematica workflow processes. It is developed using the Python-based Django MVC framework. The Dashboard provides a multi-user interface that reports on the status of system events and makes it simpler to control and trigger specific micro-services. This interface allows users to easily add or edit metadata, coordinate AIP and DIP storage and provide preservation planning information. Notifications include error reports, monitoring of MCP tasks and manual approvals in the workflow. In coming releases, the dashboard will support a transfer backlog linked to accession data as well as indexing, analysis, arrangement and minimal description of transfer(s) into SIP(s). An administration area allows users to manage storage locations, configuration of micro-services, alteration of preservation plans and user access levels.

==Single install==
Using the latest in virtualization technology, each release of the Archivematica system packages a customized Xubuntu environment as a [http://en.wikipedia.org/wiki/Virtual_appliance virtual appliance], making it possible to run on top of any consumer-grade hardware and operating system. This means the entire [[External tools|suite of digital preservation tools]] is now available from one simple installation. Archivematica can also be installed directly on dedicated hardware via its own Ubuntu repository. Its client/server processing architecture allows it to be deployed in multi-node, distributed processing configurations to support large-scale, resource-intensive production environments.

==Format policies==
Archivematica maintains the original format of all ingested files to support migration and emulation strategies. However, the primary preservation strategy is to normalize files to preservation and access formats upon ingest. Archivematica groups file formats into [[Media_type_preservation_plans|format policies]] (e.g. text, audio, video, raster image, vector image, etc.). Archivematica's preservation formats must all be open standards. Additionally, the choice of formats is based on community best practices, availability of free and open-source normalization tools, and an analysis of the significant characteristics for each media type. The choice of access formats is based largely on the ubiquity of web-based viewers for the file format.

For the 1.0 production release, Archivematica format policies will be moved to a structured, online format policy registry ([[Format_policy_registry_requirements|FPR]]) that brings together format identification information with significant characteristic analysis, risk assessments and normalization tool information to arrive at default preservation format and access format policies for Archivematica. The goal is to make this registry interoperable with [http://www.nationalarchives.gov.uk/PRONOM/Default.aspx PRONOM], the [http://corereg.arts.gla.ac.uk/PlanetsCoreRegistry/welcome.html Planets Core Registry] and/or the forthcoming [http://www.udfr.org/ Universal Digital Format Registry] (UDFR). Archivematica installations will use the registry to update their local, default policies and notify users if there has been a change in the risk status or migration options for these formats, allowing them to trigger a migration process using the available normalization tools. Users are free to determine their own format preservation policies, whether based on alternate institutional policies or developed through the use of a formal preservation policy tool like Plato. The system is configured to make it easy to add new normalization tools and customize local format policies.

==From Transfer to SIP to AIP and DIP==
The primary function of Archivematica is to process digital transfers (accessioned digital objects), turn them into SIPs, apply format policies and create high-quality, repository-independent Archival Information Packages (AIPs) using [http://www.loc.gov/standards/mets/ METS], [http://www.loc.gov/standards/premis/ PREMIS] and [https://confluence.ucop.edu/download/attachments/16744580/BagItSpec.pdf?version=1 BagIt]. Archivematica is bundled with ICA-AtoM but is designed to upload Dissemination Information Packages (DIPs), containing descriptive metadata and web-ready access copies, to any access system (e.g. DSpace or CONTENTdm).

==Lowering the barriers to best-practice digital preservation==
The goal of the Archivematica project is to give archivists and librarians with limited technical and financial capacity the tools, methodology and confidence to begin preserving digital information today. The project has conducted a thorough [[OAIS Use Cases|OAIS use case]] and process analysis to synthesize the specific, [[UML Activity Diagrams|concrete steps]] that must be carried out to comply with the OAIS functional model from Ingest to Access. Through deployment experiences and user feedback, the project has expanded beyond OAIS to address analysis and arrangement of transferred digital objects into SIPs and to allow for archival appraisal at multiple decision points. Wherever possible, these requirements are assigned to software tools within the Archivematica system. If it is not possible to automate these steps in the current system iteration, they are [[Documentation|documented]] and incorporated into a manual procedure to be carried out by the end user. This ensures that the entire set of preservation requirements is being carried out, even in the early, pre-1.0 system releases. In short, the system is conceptualized as an integrated whole of technology, people and procedures, not just a set of software tools. For institutions that want technical assistance to install and customize Archivematica, optional [http://artefactual.com/archivematica.html technical support services] are provided by Artefactual Systems.

All of the software, documentation and development infrastructure are available free of charge and released under AGPL and Creative Commons licenses to give users the freedom to study, adapt and re-distribute these resources as best suits them. Rather than spend precious funding on proprietary software licenses that restrict these freedoms, the Archivematica project encourages memory institutions tackling the challenges of digital preservation to pool their financial and technical resources in projects like Archivematica to maximize their long-term investments for the benefit of their colleagues, users and professional community as a whole.

|style="padding: 0.5em 1em 1em; color: rgb(0, 0, 0);"|

[[File:OAIS.png|thumb|left|300px|OAIS reference model]]

[[Image:CreateSIPs-10.png|300px|thumb|left|In Dashboard: A transfer that has completed micro-service jobs in the transfer workflow, ready to be packaged into a SIP or stored in backlog]]

[[File:FprShow-10.png|300px|thumb|left|Format Policy Registry (FPR) in Preservation Planning tab of the dashboard]]
[[Image:NormalizeMS-10.png|300px|left|thumb|In Dashboard: A SIP ready for normalization in the Ingest tab]]

[[File:AMarch.png|300px|left|thumb|Archivematica Ingest infrastructure overview]]

|}

__NOTOC__

Release Notes

2013-05-07T18:38:45Z

Peter:

[[Main Page]] > [[Software]] > Release Notes

* [[Archivematica_0.10-beta_Release_Notes|Archivematica 0.10 Release Notes]] (Current)
* [[Archivematica 0.9 Release Notes]]
* [[Archivematica 0.8 Release Notes]]
* [[Archivematica 0.7.1 Release Notes]]
* [[Archivematica 0.7 Release Notes]]
* [[Archivematica 0.6 Release Notes]]

Release Notes

2013-05-07T18:38:29Z

Peter:

[[Main Page]] > [[Software]] > Release Notes

* [[Archivematica_0.10-beta_Release_Notes|Archivematica 0.10 Release Notes]] (Current)
* [[Archivematica 0.9 Release Notes]]
* [[Archivematica 0.8 Release Notes]]
* [[Archivematica 0.7.1 Release Notes]]
* [[Archivematica 0.7 Release Notes]]
* [[Archivematica 0.6 Release Notes]]

Register-0.10-beta

2013-05-07T18:36:46Z

Peter: /* Registration */

[[Main Page]] > [[Documentation]] > [[User Manual]] > [[User_manual_0.10|User manual 0.10]] > Register

= Registration =

When you first install Archivematica 0.10-beta, you will be asked to register repository information ('''figure 1''') in order to set your PREMIS agent and to get updates from the [[Administrator_manual_0.10#Format_Policy_Registry_.28FPR.29|Format Policy Registry (FPR)]] server ('''figure 2''').

* Have your preferred organization name and identifier ready, which will be your PREMIS agent and PREMIS agent identifier. You can make changes to the PREMIS agent later in the Administration tab of the dashboard.
* Select a username for your first administrative user. This user is also a PREMIS agent. You can make changes to this user later in the Administration tab of the dashboard.
* Enter first name, last name, e-mail (this is used by the system to send error reports in some configurations) and password.
</div>

<div class="clearfix">

</div>
[[Image:0.10-registration.png|700px|center|thumb|'''Figure 1''' Register and sign in]]

[[Image:0.10-registration2.png|700px|center|thumb|'''Figure 2''' Format policy registry update]]
</div>

<div class="clearfix">

</div>

* To begin processing digital objects, proceed to the [[UM_transfer|Transfer]] section of the user manual.

Installation

2013-05-07T18:36:31Z

Peter: /* Registration */

===Technical Requirements===

Archivematica is capable of running on almost any hardware supported by Ubuntu 12.04. However, processing large collections will require better hardware. The minimum requirements listed here will work for demonstration and training purposes, or for processing smaller collections.

Archivematica can be installed on a single machine, or across many machines to spread the processing workload.

==Minimum Requirements==
* '''Processor''': Dual Core+ CPU
* '''Memory''': 1GB+
* '''Disk space''': 7GB plus the disk space required for the collection

==Recommended Minimum Requirements==
* '''Processor''': dual core i3 2nd generation CPU or better
* '''Memory''': 2GB+
* '''Disk space''': 10GB plus the disk space required for the collection

==Firewall requirements==
When installing Archivematica on multiple machines, all the machines must be able to reach each other on the following ports:
* http, mysqld, gearman, nfs, ssh
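
One way to confirm the firewall rules is to try opening a TCP connection to each service from every machine. The sketch below is only an illustration: it assumes the services listen on their conventional default ports (80 for HTTP, 3306 for MySQL, 4730 for Gearman, 2049 for NFS, 22 for SSH), which may differ in your installation, and the function names are our own.

```python
import socket

# Conventional default ports. This is an assumption: adjust the values
# to match the ports actually used in your installation.
DEFAULT_PORTS = {"http": 80, "mysqld": 3306, "gearman": 4730, "nfs": 2049, "ssh": 22}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_host(host, ports=DEFAULT_PORTS):
    """Map each service name to whether its port is reachable on the host."""
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

Calling <code>check_host("node2.example.org")</code> (a placeholder host name) from each machine in turn returns a dictionary showing which services could not be reached from that machine.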

==Installation==
* [[Install-0.10-beta|Install Release 0.10-beta]]
* [[Install|previous releases]]

==Registration==
When you first install Archivematica 0.10-beta, you will be asked to register repository information in order to set your PREMIS agent and to get updates from the [[Administrator_manual_0.10#Format_Policy_Registry_.28FPR.29|Format Policy Registry (FPR)]].
* [[Register-0.10-beta|Register Release 0.10-beta]]

Community

2013-04-12T18:08:52Z

Peter:

[[Main Page]] > [[Community|Community]]

==Interaction==

* [https://groups.google.com/forum/?fromgroups#!forum/archivematica Discussion list]
* [http://twitter.com/#!/archivematica Twitter account]
* [https://projects.artefactual.com/projects/archivematica Issues list]

==Implementations==

We know there are many more of you out there (at least 30). If you don't see your organization here and would like to be included as part of the visible Archivematica community of implementers, request an account to edit the wiki yourself or email ''courtney[at]artefactual[dot]com'' with your information and she will post it here for you.

* [http://www.library.ualberta.ca/ University of Alberta Libraries]
* [http://diginit.library.ubc.ca/ University of British Columbia Library]
* [http://www.computerhistory.org Computer History Museum], Mountain View, CA, USA. Contact: [[User:Heathermarie]]
* [http://rockarch.org/ Rockefeller Archive Center]
* [http://www.sfu.ca/archives/ Simon Fraser University Archives and Records Management]
* [http://vancouver.ca/ctyclerk/archives/ City of Vancouver Archives]

Presentations

2013-03-14T22:01:22Z

Peter: reformat links for readability

[[Main Page]] > [[Documentation]] > Presentations

= Archivematica 0.9 =

* Tutorial Workshop 0.9-beta ([[:File:Tutorial_Workshop_0-9.odp|Open Document]]) / ([[:File:Tutorial_Workshop_0-9.pdf|PDF]])
* UNESCO Memory of the World The Archivematica project: Meeting digital continuity's technical challenges ([[:File:2012-09-26-Mumma-UNESCOMoW.odp|Open Document]]) / ([[:File:2012-09-26-Mumma-UNESCOMoW-lowres.pdf|PDF]])
* iPres 2012 Toronto - The Community-Driven Evolution of the Archivematica Project ([[:File:2012-10-04-iPres-Toronto-VanGarderen-Mumma.odp|Open Document]]) / ([[:File:2012-10-04-iPres-Toronto-VanGarderen-Mummalowres.pdf|PDF]])

= Previous releases =

*

Community

2013-03-04T18:31:18Z

Peter: /* Implementations */

[[Main Page]] > [[Community|Community]]

==Interaction==

* [https://groups.google.com/forum/?fromgroups#!forum/archivematica Discussion list]
* [http://twitter.com/#!/archivematica Twitter account]
* [https://projects.artefactual.com/projects/archivematica Issues list]

==Implementations==

We know there are many more of you out there. If you don't see your organization here and would like to be included as part of the visible Archivematica community of implementers, request an account to edit the wiki yourself or email ''courtney[at]artefactual[dot]com'' with your information and she will post it here for you.

* [http://www.library.ualberta.ca/ University of Alberta Libraries]
* [http://diginit.library.ubc.ca/ University of British Columbia Library]
* [http://www.computerhistory.org Computer History Museum], Mountain View, CA, USA. Contact: [[User:Heathermarie]]
* [http://rockarch.org/ Rockefeller Archive Center]
* [http://www.sfu.ca/archives/ Simon Fraser University Archives and Records Management]
* [http://vancouver.ca/ctyclerk/archives/ City of Vancouver Archives]

Community

2013-03-04T18:30:50Z

Peter: /* Implementations */

[[Main Page]] > [[Community|Community]]

==Interaction==

* [https://groups.google.com/forum/?fromgroups#!forum/archivematica Discussion list]
* [http://twitter.com/#!/archivematica Twitter account]
* [https://projects.artefactual.com/projects/archivematica Issues list]

==Implementations==

We know there are many more of you out there. If you don't see your organization here and would like to be included as part of the visible Archivematica community of implementers, request an account to edit the wiki yourself or email ''courtney[at]artefactual[dot]com'' with your information and she will post it here for you.

* [http://www.library.ualberta.ca/ University of Alberta Libraries]
* [http://diginit.library.ubc.ca/ University of British Columbia Library]
* [http://www.computerhistory.org Computer History Museum], Mountain View, CA, USA. Contact: [[User:Heathermarie]]
* [http://rockarch.org/ Rockefeller Archive Center]
* [http://www.sfu.ca/archives/ Simon Fraser University Archives and Records Management]
* [http://vancouver.ca/ctyclerk/archives/ City of Vancouver Archives]

Scalability testing

2013-01-11T17:54:28Z

Peter: /* Current Plans */

[[Main Page]] > [[Development roadmap]] > Scalability testing

= Objectives =

'''1. set up a dedicated testing environment'''

The testing environment will start with 5 virtual machines set up in a hosted environment, where hardware resources can be scaled up and down between tests. It is expected that the test environment will be ready to use by January 15th.
[[Test Environment Documentation]]

'''2. develop a new initial repeatable test suite'''

Initial tests will focus on two main areas: file I/O and the performance of individual micro-services. Data will be collected from external monitoring tools as well as from internal instrumentation.

External monitoring will be done with two open-source packages, munin and collectd. This will provide data at the operating system level.
Internal instrumentation already exists within the Archivematica source code, where each step in the process has a start time and end time recorded in the local database. This instrumentation will be extended and refined during the buildout of the test suite. The data collected will be used to identify which specific micro-services, and which steps within those micro-services, are taking the longest time to complete.
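
As a rough illustration of how recorded start and end times could be used, the sketch below totals the elapsed time per micro-service and lists the slowest first. The table and column names (<code>jobs</code>, <code>microservice</code>, <code>step</code>, <code>start_time</code>, <code>end_time</code>) and the sample rows are hypothetical and do not reflect the actual Archivematica schema; an in-memory SQLite database stands in for the local database.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with made-up timing rows (times in seconds)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (microservice TEXT, step TEXT, start_time REAL, end_time REAL)"
    )
    rows = [
        ("Normalize", "transcode", 0.0, 40.0),
        ("Normalize", "verify", 40.0, 45.0),
        ("Characterize", "extract metadata", 45.0, 55.0),
        ("Assign checksums", "hash files", 55.0, 58.0),
    ]
    conn.executemany("INSERT INTO jobs VALUES (?, ?, ?, ?)", rows)
    return conn


def slowest_microservices(conn):
    """Return (microservice, total_seconds) tuples, slowest first."""
    query = """
        SELECT microservice, SUM(end_time - start_time) AS total_seconds
        FROM jobs
        GROUP BY microservice
        ORDER BY total_seconds DESC
    """
    return conn.execute(query).fetchall()
```

Grouping by <code>step</code> as well as <code>microservice</code> would give the finer-grained view of which steps within a micro-service dominate.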

'''3. document a full matrix of test parameters'''

The Archivematica workflow can vary considerably depending on the use case. Artefactual will document all testing efforts on this wiki, building out a matrix of test cases. For example, we expect that adding storage subsystem capacity will allow for linear growth in scalability (add more disks and everything should go faster). This will be one of the first 'columns' in our test matrix: tests are repeated with the same workload while the capacity (maximum I/O operations per second) of the storage subsystem is changed between tests.
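
A test matrix of this kind can be generated mechanically. The following sketch is only an illustration: the parameter names and values are invented, and the real matrix will be documented on this wiki as testing proceeds.

```python
from itertools import product

# Hypothetical parameters: one fixed workload, three storage capacities.
PARAMETERS = {
    "workload": ["10,000 small documents"],
    "storage_iops": [500, 1000, 2000],
}


def build_test_matrix(parameters):
    """Return one dict per test case, covering every combination of values."""
    names = list(parameters)
    combinations = product(*(parameters[name] for name in names))
    return [dict(zip(names, combo)) for combo in combinations]


if __name__ == "__main__":
    for case in build_test_matrix(PARAMETERS):
        print(case)
```

Adding another key to <code>PARAMETERS</code> (for example a number of processing machines) adds another 'column', and the number of test cases grows multiplicatively.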

Initial tests will focus on the four primary stages in the Archivematica workflow: Transfer, Ingest, creation of SIP and creation of AIP. Additional steps are required both before Transfer and after creation of the AIP; however, these steps do not necessarily involve the use of Archivematica code. For example, moving digital objects to a shared folder that Archivematica can access is a prerequisite of the Transfer stage, and can take a considerable amount of time. We will document best practices for how to complete that work after initial scalability testing is complete.

'''4. repeat test suite at customer sites'''

The two initial customer sites have been identified by Archivematica, and tests will be repeated at both sites.

= Test Structure =

Scalability testing is done using a scripted workload, in which all decision points that are normally left to the archivist to make in the dashboard are instead automated through a configuration file. This allows for repeatable test cases. Example test scripts will be posted here over the coming weeks.
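
Until the actual test scripts are posted, the idea can be sketched as follows. This is a hypothetical illustration, not the format Archivematica uses: the section name, the decision point names and the answers are all invented for the example.

```python
import configparser

# Hypothetical configuration: each key is a decision point, each value
# is the answer an archivist would otherwise choose in the dashboard.
SAMPLE_CONFIG = """
[decisions]
create_sip = yes
normalize = preservation and access
store_aip = yes
"""


def load_decisions(text):
    """Parse the configuration text into a dict of decision point -> answer."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["decisions"])


def resolve(decision_point, decisions):
    """Return the scripted answer, failing loudly if the script does not cover it."""
    if decision_point not in decisions:
        raise KeyError(f"no scripted answer for decision point {decision_point!r}")
    return decisions[decision_point]
```

Failing on an unscripted decision point, rather than falling back to a default, keeps a test run from silently taking a different path through the workflow than the one being measured.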

= Test File Sets =
[http://archivematica.org/downloads/docZips/ Test Documents]

= Archived Scalability Test Results =
Historical test results are available at [[Archived Scalability Test Results]].

Scalability testing

2013-01-11T17:54:03Z

Peter:

[[Main Page]] > [[Development roadmap]] > Scalability testing

= Current Plans =

'''* set up a dedicated testing environment'''

The testing environment will start with 5 virtual machines set up in a hosted environment, where hardware resources can be scaled up and down between tests. It is expected that the test environment will be ready to use by January 15th.
[[Test Environment Documentation]]

'''* develop a new initial repeatable test suite'''

Initial tests will focus on two main areas: file I/O and the performance of individual micro-services. Data will be collected from external monitoring tools as well as from internal instrumentation.

External monitoring will be done with two open-source packages, munin and collectd. This will provide data at the operating system level.
Internal instrumentation already exists within the Archivematica source code, where each step in the process has a start time and end time recorded in the local database. This instrumentation will be extended and refined during the buildout of the test suite. The data collected will be used to identify which specific micro-services, and which steps within those micro-services, are taking the longest time to complete.

'''* document a full matrix of test parameters'''

The Archivematica workflow can vary considerably depending on the use case. Artefactual will document all testing efforts on this wiki, building out a matrix of test cases. For example, we expect that adding storage subsystem capacity will allow for linear growth in scalability (add more disks and everything should go faster). This will be one of the first 'columns' in our test matrix: tests are repeated with the same workload while the capacity (maximum I/O operations per second) of the storage subsystem is changed between tests.

Initial tests will focus on the four primary stages in the Archivematica workflow: Transfer, Ingest, creation of SIP and creation of AIP. Additional steps are required both before Transfer and after creation of the AIP; however, these steps do not necessarily involve the use of Archivematica code. For example, moving digital objects to a shared folder that Archivematica can access is a prerequisite of the Transfer stage, and can take a considerable amount of time. We will document best practices for how to complete that work after initial scalability testing is complete.

'''* repeat test suite at customer sites'''

The two initial customer sites have been identified by Archivematica, and tests will be repeated at both sites.

= Test Structure =

Scalability testing is done using a scripted workload, in which all decision points that are normally left to the archivist to make in the dashboard are instead automated through a configuration file. This allows for repeatable test cases. Example test scripts will be posted here over the coming weeks.

= Test File Sets =
[http://archivematica.org/downloads/docZips/ Test Documents]

= Archived Scalability Test Results =
Historical test results are available at [[Archived Scalability Test Results]].

Development

2012-10-30T18:52:11Z

Peter: /* Developer Resources */

[[Main Page]] > Development

This page lists the resources available for project contributors:

==Project Communication==
* [http://groups.google.ca/group/archivematica Discussion list]
*[http://code.google.com/p/archivematica/issues/list Issues list] ([http://groups.google.com/group/archivematica-issues auto-updates])
* [[Chat room]]
* [[:Category:meetings|Weekly project meeting]]
* Archivematica.org wiki: [[Special:UserLogin|create an account]] to correct and add content on this wiki

==Developer Resources==
* [https://github.com/artefactual/archivematica Code repository]
* [[Development environment]]
* [[Contribute code]]
**[[Patches]]
**[[Contributor Agreement]]
**[[License]]
**[[Trademark]]
* [[:Category:Development documentation|Development documentation]]
* [[Development_roadmap:_Archivematica_1.0|Development Roadmap]]
* [[Creating Custom Workflows]]

==Project Management==
Archivematica software development, release management, and community support are managed by [http://artefactual.com Artefactual Systems] in collaboration with its contract clients and a growing network of Archivematica users and service partners.

[[Category:Development documentation]]

Development

2012-10-30T18:51:49Z

Peter: /* Developer Resources */

[[Main Page]] > Development

This page lists the resources available for project contributors:

==Project Communication==
* [http://groups.google.ca/group/archivematica Discussion list]
*[http://code.google.com/p/archivematica/issues/list Issues list] ([http://groups.google.com/group/archivematica-issues auto-updates])
* [[Chat room]]
* [[:Category:meetings|Weekly project meeting]]
* Archivematica.org wiki: [[Special:UserLogin|create an account]] to correct and add content on this wiki

==Developer Resources==
* [https://github.com/artefactual/archivematica Code repository]
* [[Development environment]]
* [[Contribute code]]
**[[Patches]]
**[[Contributor Agreement]]
**[[License]]
**[[Trademark]]
* [[:Category:Development documentation|Development documentation]]
* [[Development_roadmap:_Archivematica_1.0]]
* [[Creating Custom Workflows]]

==Project Management==
Archivematica software development, release management, and community support are managed by [http://artefactual.com Artefactual Systems] in collaboration with its contract clients and a growing network of Archivematica users and service partners.

[[Category:Development documentation]]

Format policy registry requirements

2012-10-23T17:05:58Z

Peter: /* Description */

[[Documentation]] > [[Requirements]] > Format policy registry requirements

== Description ==

* The Archivematica project team has recognized the need for a better way to manage preservation plans, i.e. business rules and tool commands for format transcoding. Since these are either implemented or altered by the institution running an Archivematica instance, these rules are referred to as policies. Format policies will change as community standards, practices and tools evolve. A format policy indicates the actions, tools and settings to apply to a file of a particular file format (e.g. conversion to preservation format, conversion to access format).

* Until now, the Archivematica project has managed this information on the [[Media_type_preservation_plans|archivematica.org/preservation]] wiki page.

* The Format Policy Registry (FPR) will manage this information in a structured format (SQL/JSON).
** APIs with other serializations may be added (e.g. XML, RDF)

* It will be hosted at archivematica.org/fpr/

* The FPR will also provide valuable online statistics about default format policy adoption as well as customizations amongst Archivematica users and will interface with other online registries (such as PRONOM and UDFR) to monitor and evaluate community-wide best practices.

* The FPR stores structured information about normalization format policies for preservation and access. These policies identify preferred preservation and access formats by media type. The choice of access formats is based on the ubiquity of viewers for the file format. Archivematica's preservation formats are all open standards; additionally, the choice of preservation format is based on community best practices, availability of open-source normalization tools, and an analysis of the significant characteristics for each media type.

* These default format policies can all be changed or enhanced by individual Archivematica implementers.

* Subscription to the FPR will allow the Archivematica project to notify users when new or updated preservation and access plans become available, allowing them to make better decisions about normalization and migration strategies for specific format types within their collections. It will also allow them to trigger migration processes as new tools and knowledge become available.

* One of the other primary goals of the FPR is to aggregate empirical information about institutional format policies to better identify community best practices. The FPR will provide a practical, community-based approach to OAIS preservation and access planning, allowing the Archivematica community of users to monitor and evaluate format policies as they are adopted, adapted and supplemented by real-world practitioners. The FPR APIs will be designed to share this information with the Archivematica user base as well as with other interested communities and projects.

==Early prototype==

*An early FPR prototype (called "Formatica") was developed by Heather Bowden, then Carolina Digital Curation Doctoral Fellow at the School of Information and Library Science in the University of North Carolina at Chapel Hill.

[[File:Formatica.png|border|450px|Early FPR prototype originally called Formatica]]

= Requirements =

[[File:FPR overview Oct 2012.png|border|900px|FPR overview Oct 2012]]

* provide an authenticated web-based interface for creation and maintenance of policies
* provide a read-only RESTful Web API for accessing policies in JSON format
* provide an API for monitoring new and updated policies
* integrate with PRONOM to retrieve PUIDs
* model format policies so that they can be stored in an SQL database (MySQL? PostgreSQL? SQLite?) on both client and server
* develop iteratively with an emphasis on getting working code in front of users as quickly as possible to make them part of the design process (see #fileidhack)
* developer [[Format_policy_registry|notes]]
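
To make the requirements above more concrete, the sketch below models a format policy as a flat record, serializes it to JSON and filters for policies changed since a given time. It is an assumption-laden illustration rather than the FPR design: the field names, the command template and the sample values are all hypothetical and are not taken from PRONOM or from the FPR itself.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FormatPolicy:
    """One rule: what to do with files of a given format (hypothetical fields)."""
    puid: str            # format identifier, e.g. as retrieved from PRONOM
    purpose: str         # "preservation" or "access"
    tool: str            # normalization tool to run
    command: str         # command template with placeholder arguments
    target_format: str   # format produced by the command
    last_modified: str   # ISO 8601 UTC timestamp of the last policy change


# Illustrative sample data only.
SAMPLE_POLICIES = [
    FormatPolicy("fmt/43", "preservation", "convert",
                 "convert %inputFile% %outputFile%.tif", "TIFF",
                 "2012-09-15T12:00:00Z"),
    FormatPolicy("fmt/43", "access", "convert",
                 "convert %inputFile% %outputFile%.jpg", "JPEG",
                 "2012-10-20T09:30:00Z"),
]


def to_json(policies):
    """Serialize policies as a JSON array, as a read-only API might return them."""
    return json.dumps([asdict(policy) for policy in policies], indent=2, sort_keys=True)


def updated_since(policies, timestamp):
    """Return policies changed after the timestamp.

    Timestamps in the same ISO 8601 UTC layout compare correctly as strings.
    """
    return [policy for policy in policies if policy.last_modified > timestamp]
```

Because each policy is a flat record, the same fields map directly onto one row of an SQL table on either the client or the server.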

== Use Cases ==

== Data Model ==

== Workflow ==

== GUI ==

== API ==

[[Category:Development documentation]]

Format policy registry requirements

2012-10-23T00:29:09Z

Peter: /* Requirements */

[[Documentation]] > [[Requirements]] > Format policy registry requirements

== Description ==

* The Archivematica project team has recognized the need for a way to manage format conversion preservation plans, referred to by the project as format policies, which will change as formats and community standards evolve. A format policy indicates the actions, tools and settings to apply to a particular file format.

* Until now, the Archivematica project has managed this information on the [[Media_type_preservation_plans|archivematica.org/preservation]] wiki page.

* The Format Policy Registry (FPR) will manage this information in a structured format (SQL/JSON).
** APIs with other serializations may be added (e.g. XML, RDF)

* It will be hosted at archivematica.org/fpr/

* The FPR will also provide valuable online statistics about default format policy adoption as well as customizations amongst Archivematica users and will interface with other online registries (such as PRONOM and UDFR) to monitor and evaluate community-wide best practices.

* The FPR stores structured information about normalization format policies for preservation and access. These policies identify preferred preservation and access formats by media type. The choice of access formats is based on the ubiquity of viewers for the file format. Archivematica's preservation formats are all open standards; additionally, the choice of preservation format is based on community best practices, availability of open-source normalization tools, and an analysis of the significant characteristics for each media type.

* These default format policies can all be changed or enhanced by individual Archivematica implementers.

* Subscription to the FPR will allow the Archivematica project to notify users when new or updated preservation and access plans become available, allowing them to make better decisions about normalization and migration strategies for specific format types within their collections. It will also allow them to trigger migration processes as new tools and knowledge become available.

* One of the other primary goals of the FPR is to aggregate empirical information about institutional format policies to better identify community best practices. The FPR will provide a practical, community-based approach to OAIS preservation and access planning, allowing the Archivematica community of users to monitor and evaluate format policies as they are adopted, adapted and supplemented by real-world practitioners. The FPR APIs will be designed to share this information with the Archivematica user base as well as with other interested communities and projects.

==Early prototype==

*An early FPR prototype (called "Formatica") was developed by Heather Bowden, then Carolina Digital Curation Doctoral Fellow at the School of Information and Library Science in the University of North Carolina at Chapel Hill.

[[File:Formatica.png|border|450px|Early FPR prototype originally called Formatica]]

= Requirements =

[[File:FPR overview Oct 2012.png|border|900px|FPR overview Oct 2012]]

* provide an authenticated web-based interface for creation and maintenance of policies
* provide a read-only RESTful Web API for accessing policies in JSON format
* provide an API for monitoring new and updated policies
* integrate with PRONOM to retrieve PUIDs
* model format policies so that they can be stored in an SQL database (MySQL? PostgreSQL? SQLite?) on both client and server
* develop iteratively with an emphasis on getting working code in front of users as quickly as possible to make them part of the design process (see #fileidhack)
* developer [[Format_policy_registry|notes]]

== Use Cases ==

== Data Model ==

== Workflow ==

== GUI ==

== API ==

[[Category:Development documentation]]

Format policy registry requirements

2012-10-23T00:21:44Z

Peter: /* Requirements */

[[Documentation]] > [[Requirements]] > Format policy registry requirements

== Description ==

* The Archivematica project team has recognized the need for a way to manage format conversion preservation plans, referred to by the project as format policies, which will change as formats and community standards evolve. A format policy indicates the actions, tools and settings to apply to a particular file format.

* Until now, the Archivematica project has managed this information on the [[Media_type_preservation_plans|archivematica.org/preservation]] wiki page.

* The Format Policy Registry (FPR) will manage this information in a structured format (SQL/JSON).
** APIs with other serializations may be added (e.g. XML, RDF)

* It will be hosted at archivematica.org/fpr/

* The FPR will also provide valuable online statistics about default format policy adoption as well as customizations amongst Archivematica users and will interface with other online registries (such as PRONOM and UDFR) to monitor and evaluate community-wide best practices.

* The FPR stores structured information about normalization format policies for preservation and access. These policies identify preferred preservation and access formats by media type. The choice of access formats is based on the ubiquity of viewers for the file format. Archivematica's preservation formats are all open standards; additionally, the choice of preservation format is based on community best practices, availability of open-source normalization tools, and an analysis of the significant characteristics for each media type.

* These default format policies can all be changed or enhanced by individual Archivematica implementers.

* Subscription to the FPR will allow the Archivematica project to notify users when new or updated preservation and access plans become available, allowing them to make better decisions about normalization and migration strategies for specific format types within their collections. It will also allow them to trigger migration processes as new tools and knowledge become available.

* One of the other primary goals of the FPR is to aggregate empirical information about institutional format policies to better identify community best practices. The FPR will provide a practical, community-based approach to OAIS preservation and access planning, allowing the Archivematica community of users to monitor and evaluate format policies as they are adopted, adapted and supplemented by real-world practitioners. The FPR APIs will be designed to share this information with the Archivematica user base as well as with other interested communities and projects.

==Early prototype==

*An early FPR prototype (called "Formatica") was developed by Heather Bowden, then Carolina Digital Curation Doctoral Fellow at the School of Information and Library Science in the University of North Carolina at Chapel Hill.

[[File:Formatica.png|border|450px|Early FPR prototype originally called Formatica]]

= Requirements =

[[File:FPR overview Oct 2012.png|border|900px|FPR overview Oct 2012]]

* provide an authenticated web-based interface for creation and maintenance of policies
* provide a read-only RESTful Web API for accessing policies in JSON format
* provide an API for monitoring new and updated policies
* integrate with PRONOM to retrieve PUIDs
* model format policies so that they can be stored in an SQL database (MySQL, PostgreSQL, SQLite) on both client and server
* develop iteratively with an emphasis on getting working code in front of users as quickly as possible to make them part of the design process (see #fileidhack)
* developer [[Format_policy_registry|notes]]

== Use Cases ==

== Data Model ==

== Workflow ==

== GUI ==

== API ==

[[Category:Development documentation]]

Format policy registry requirements

2012-10-23T00:16:57Z

Peter: /* Requirements */

[[Documentation]] > [[Requirements]] > Format policy registry requirements

== Description ==

* The Archivematica project team has recognized the need for a way to manage format conversion preservation plans, referred to by the project as format policies, which will change as formats and community standards evolve. A format policy indicates the actions, tools and settings to apply to a particular file format.

* Until now, the Archivematica project has managed this information on the [[Media_type_preservation_plans|archivematica.org/preservation]] wiki page.

* The Format Policy Registry (FPR) will manage this information in a structured format (SQL/JSON).
** APIs with other serializations may be added (e.g. XML, RDF)

* It will be hosted at archivematica.org/fpr/

* The FPR will also provide valuable online statistics about default format policy adoption as well as customizations amongst Archivematica users and will interface with other online registries (such as PRONOM and UDFR) to monitor and evaluate community-wide best practices.

* The FPR stores structured information about normalization format policies for preservation and access. These policies identify preferred preservation and access formats by media type. The choice of access formats is based on the ubiquity of viewers for the file format. Archivematica's preservation formats are all open standards; additionally, the choice of preservation format is based on community best practices, availability of open-source normalization tools, and an analysis of the significant characteristics for each media type.

* These default format policies can all be changed or enhanced by individual Archivematica implementers.

* Subscription to the FPR will allow the Archivematica project to notify users when new or updated preservation and access plans become available, allowing them to make better decisions about normalization and migration strategies for specific format types within their collections. It will also allow them to trigger migration processes as new tools and knowledge become available.

*One of the other primary goals of the FPR is to aggregate empirical information about institutional format policies to better identify community best practices. The FPR will provide a practical, community-based approach to OAIS preservation and access planning, allowing the Archivematica community of users to monitor and evaluate format policies as they are adopted, adapted and supplemented by real-world practitioners. The FPR APIs will be designed to share this information with the Archivematica user base as well as with other interested communities and projects.

==Early prototype==

*An early FPR prototype (called "Formatica") was developed by Heather Bowden, then Carolina Digital Curation Doctoral Fellow at the School of Information and Library Science in the University of North Carolina at Chapel Hill.

[[File:Formatica.png|border|450px|Early FPR prototype originally called Formatica]]

= Requirements =

[[File:FPR overview Oct 2012.png|border|900px|FPR overview Oct 2012]]

* provide an authenticated Web-based interface for creation and maintenance of policies
* provide a read-only RESTful Web API for accessing policies in JSON format
* provide an API for monitoring new and updated policies
* integrate with PRONOM to retrieve PUIDs
* model format policies so that they can be stored in an SQL database (MySQL, PostgreSQL, SQLite) on both client and server
* develop iteratively with an emphasis on getting working code in front of users as quickly as possible to make them part of the design process (see #fileidhack)

== Use Cases ==

== Data Model ==

== Workflow ==

== GUI ==

== API ==

[[Category:Development documentation]]

Format policy registry requirements

2012-10-23T00:12:29Z

Peter: /* Description */

[[Documentation]] > [[Requirements]] > Format policy registry requirements

== Description ==

* The Archivematica project team has recognized the need for a way to manage format conversion preservation plans, referred to by the project as format policies, which will change as formats and community standards evolve. A format policy indicates the actions, tools and settings to apply to a particular file format.

* Until now, the Archivematica project has managed this information on the [[Media_type_preservation_plans|archivematica.org/preservation]] wiki page.

* The Format Policy Registry (FPR) will manage this information in a structured format (SQL/JSON).
** APIs with other serializations may be added (e.g. XML, RDF)

* It will be hosted at archivematica.org/fpr/

* The FPR will also provide valuable online statistics about default format policy adoption as well as customizations amongst Archivematica users and will interface with other online registries (such as PRONOM and UDFR) to monitor and evaluate community-wide best practices.

* The FPR stores structured information about normalization format policies for preservation and access. These policies identify preferred preservation and access formats by media type. The choice of access formats is based on the ubiquity of viewers for the file format. Archivematica's preservation formats are all open standards; additionally, the choice of preservation format is based on community best practices, availability of open-source normalization tools, and an analysis of the significant characteristics for each media type.

* These default format policies can all be changed or enhanced by individual Archivematica implementers.

* Subscription to the FPR will allow the Archivematica project to notify users when new or updated preservation and access plans become available, allowing them to make better decisions about normalization and migration strategies for specific format types within their collections. It will also allow them to trigger migration processes as new tools and knowledge become available.

*One of the other primary goals of the FPR is to aggregate empirical information about institutional format policies to better identify community best practices. The FPR will provide a practical, community-based approach to OAIS preservation and access planning, allowing the Archivematica community of users to monitor and evaluate format policies as they are adopted, adapted and supplemented by real-world practitioners. The FPR APIs will be designed to share this information with the Archivematica user base as well as with other interested communities and projects.

==Early prototype==

*An early FPR prototype (called "Formatica") was developed by Heather Bowden, then Carolina Digital Curation Doctoral Fellow at the School of Information and Library Science in the University of North Carolina at Chapel Hill.

[[File:Formatica.png|border|450px|Early FPR prototype originally called Formatica]]

= Requirements =

[[File:FPR overview Oct 2012.png|border|900px|FPR overview Oct 2012]]

== Use Cases ==

== Data Model ==

== Workflow ==

== GUI ==

== API ==

[[Category:Development documentation]]

Format policy registry requirements

2012-10-23T00:10:52Z

Peter: /* Description */

[[Documentation]] > [[Requirements]] > Format policy registry requirements

== Description ==

* The Archivematica project team has recognized the need for a way to manage format conversion preservation plans, referred to by the project as format policies, which will change as formats and community standards evolve. A format policy indicates the actions, tools and settings to apply to a particular file format.

* Until now, the Archivematica project has managed this information on the [[Media_type_preservation_plans|archivematica.org/preservation]] wiki page.

* The Format Policy Registry (FPR) will manage this information in a structured format.

* It will be hosted at archivematica.org/fpr/

* The FPR will also provide valuable online statistics about default format policy adoption as well as customizations amongst Archivematica users and will interface with other online registries (such as PRONOM and UDFR) to monitor and evaluate community-wide best practices.

* The FPR stores structured information about normalization format policies for preservation and access. These policies identify preferred preservation and access formats by media type. The choice of access formats is based on the ubiquity of viewers for the file format. Archivematica's preservation formats are all open standards; additionally, the choice of preservation format is based on community best practices, availability of open-source normalization tools, and an analysis of the significant characteristics for each media type.

* These default format policies can all be changed or enhanced by individual Archivematica implementers.

* Subscription to the FPR will allow the Archivematica project to notify users when new or updated preservation and access plans become available, allowing them to make better decisions about normalization and migration strategies for specific format types within their collections. It will also allow them to trigger migration processes as new tools and knowledge become available.

*One of the other primary goals of the FPR is to aggregate empirical information about institutional format policies to better identify community best practices. The FPR will provide a practical, community-based approach to OAIS preservation and access planning, allowing the Archivematica community of users to monitor and evaluate format policies as they are adopted, adapted and supplemented by real-world practitioners. The FPR APIs will be designed to share this information with the Archivematica user base as well as with other interested communities and projects.

==Early prototype==

*An early FPR prototype (called "Formatica") was developed by Heather Bowden, then Carolina Digital Curation Doctoral Fellow at the School of Information and Library Science in the University of North Carolina at Chapel Hill.

[[File:Formatica.png|border|450px|Early FPR prototype originally called Formatica]]

= Requirements =

[[File:FPR overview Oct 2012.png|border|900px|FPR overview Oct 2012]]

== Use Cases ==

== Data Model ==

== Workflow ==

== GUI ==

== API ==

[[Category:Development documentation]]

Format policy registry requirements

2012-10-23T00:08:39Z

Peter: /* Description */

[[Documentation]] > [[Requirements]] > Format policy registry requirements

== Description ==

* The Archivematica project team has recognized the need for a way to manage format conversion preservation plans, referred to by the project as format policies, which will change as formats and community standards evolve. A format policy indicates the actions, tools and settings to apply to a particular file format.

* Until now, the Archivematica project has managed this information on the [http://archivematica.org/preservation archivematica.org/preservation] wiki page.

* The Format Policy Registry (FPR) will manage this information in a structured format.

* It will be hosted at archivematica.org/fpr/

* The FPR will also provide valuable online statistics about default format policy adoption as well as customizations amongst Archivematica users and will interface with other online registries (such as PRONOM and UDFR) to monitor and evaluate community-wide best practices.

* The FPR stores structured information about normalization format policies for preservation and access. These policies identify preferred preservation and access formats by media type. The choice of access formats is based on the ubiquity of viewers for the file format. Archivematica's preservation formats are all open standards; additionally, the choice of preservation format is based on community best practices, availability of open-source normalization tools, and an analysis of the significant characteristics for each media type.

* These default format policies can all be changed or enhanced by individual Archivematica implementers.

* Subscription to the FPR will allow the Archivematica project to notify users when new or updated preservation and access plans become available, allowing them to make better decisions about normalization and migration strategies for specific format types within their collections. It will also allow them to trigger migration processes as new tools and knowledge become available.

*One of the other primary goals of the FPR is to aggregate empirical information about institutional format policies to better identify community best practices. The FPR will provide a practical, community-based approach to OAIS preservation and access planning, allowing the Archivematica community of users to monitor and evaluate format policies as they are adopted, adapted and supplemented by real-world practitioners. The FPR APIs will be designed to share this information with the Archivematica user base as well as with other interested communities and projects.

==Early prototype==

*An early FPR prototype (called "Formatica") was developed by Heather Bowden, then Carolina Digital Curation Doctoral Fellow at the School of Information and Library Science in the University of North Carolina at Chapel Hill.

[[File:Formatica.png|border|450px|Early FPR prototype originally called Formatica]]

= Requirements =

[[File:FPR overview Oct 2012.png|border|900px|FPR overview Oct 2012]]

== Use Cases ==

== Data Model ==

== Workflow ==

== GUI ==

== API ==

[[Category:Development documentation]]

Format policy registry requirements

2012-10-23T00:05:44Z

Peter:

[[Documentation]] > [[Requirements]] > Format policy registry requirements

== Description ==

* The Archivematica project team has recognized the need for a way to manage format conversion preservation plans, referred to by the project as format policies, which will change as formats and community standards evolve. A format policy indicates the actions, tools and settings to apply to a particular file format. The Format Policy Registry (FPR) will provide valuable online statistics about default format policy adoption as well as customizations amongst Archivematica users and will interface with other online registries (such as PRONOM and UDFR) to monitor and evaluate community-wide best practices. It will be hosted at archivematica.org/fpr.

* The FPR stores structured information about normalization format policies for preservation and access. These policies identify preferred preservation and access formats by media type. The choice of access formats is based on the ubiquity of viewers for the file format. Archivematica's preservation formats are all open standards; additionally, the choice of preservation format is based on community best practices, availability of open-source normalization tools, and an analysis of the significant characteristics for each media type.

* These default format policies can all be changed or enhanced by individual Archivematica implementers.

* Subscription to the FPR will allow the Archivematica project to notify users when new or updated preservation and access plans become available, allowing them to make better decisions about normalization and migration strategies for specific format types within their collections. It will also allow them to trigger migration processes as new tools and knowledge become available.

*One of the other primary goals of the FPR is to aggregate empirical information about institutional format policies to better identify community best practices. The FPR will provide a practical, community-based approach to OAIS preservation and access planning, allowing the Archivematica community of users to monitor and evaluate format policies as they are adopted, adapted and supplemented by real-world practitioners. The FPR APIs will be designed to share this information with the Archivematica user base as well as with other interested communities and projects.

==Early prototype==

*An early FPR prototype (called "Formatica") was developed by Heather Bowden, then Carolina Digital Curation Doctoral Fellow at the School of Information and Library Science in the University of North Carolina at Chapel Hill.

[[File:Formatica.png|border|450px|Early FPR prototype originally called Formatica]]

= Requirements =

[[File:FPR overview Oct 2012.png|border|900px|FPR overview Oct 2012]]

== Use Cases ==

== Data Model ==

== Workflow ==

== GUI ==

== API ==

[[Category:Development documentation]]

File:FPR overview Oct 2012.png

2012-10-23T00:04:27Z

Peter:

Format policy registry requirements

2012-10-23T00:00:48Z

Peter:

[[Documentation]] > [[Requirements]] > Format policy registry requirements

== Description ==

* The Archivematica project team has recognized the need for a way to manage format conversion preservation plans, referred to by the project as format policies, which will change as formats and community standards evolve. A format policy indicates the actions, tools and settings to apply to a particular file format. The Format Policy Registry (FPR) will provide valuable online statistics about default format policy adoption as well as customizations amongst Archivematica users and will interface with other online registries (such as PRONOM and UDFR) to monitor and evaluate community-wide best practices. It will be hosted at archivematica.org/fpr.

* The FPR stores structured information about normalization format policies for preservation and access. These policies identify preferred preservation and access formats by media type. The choice of access formats is based on the ubiquity of viewers for the file format. Archivematica's preservation formats are all open standards; additionally, the choice of preservation format is based on community best practices, availability of open-source normalization tools, and an analysis of the significant characteristics for each media type.

* These default format policies can all be changed or enhanced by individual Archivematica implementers.

* Subscription to the FPR will allow the Archivematica project to notify users when new or updated preservation and access plans become available, allowing them to make better decisions about normalization and migration strategies for specific format types within their collections. It will also allow them to trigger migration processes as new tools and knowledge become available.

*One of the other primary goals of the FPR is to aggregate empirical information about institutional format policies to better identify community best practices. The FPR will provide a practical, community-based approach to OAIS preservation and access planning, allowing the Archivematica community of users to monitor and evaluate format policies as they are adopted, adapted and supplemented by real-world practitioners. The FPR APIs will be designed to share this information with the Archivematica user base as well as with other interested communities and projects.

=Early prototype=

*An early FPR prototype (called "Formatica") was developed by Heather Bowden, then Carolina Digital Curation Doctoral Fellow at the School of Information and Library Science in the University of North Carolina at Chapel Hill.

[[File:Formatica.png|border|900px|Early FPR prototype originally called Formatica]]

== Requirements ==

== Use Cases ==

== Workflow ==

[[Category:Development documentation]]

Development

2012-09-20T16:41:49Z

Peter:

[[Main Page]] > Development

This page lists the resources available for project contributors:

==Project Communication==
* [http://groups.google.ca/group/archivematica Discussion list]
*[http://code.google.com/p/archivematica/issues/list Issues list] ([http://groups.google.com/group/archivematica-issues auto-updates])
* [[Chat room]]
* [[:Category:meetings|Weekly project meeting]]
* Archivematica.org wiki: [[Special:UserLogin|create an account]] to correct and add content on this wiki

==Developer Resources==
* [https://github.com/artefactual/archivematica Code repository]
* [[Development environment]]
* [[Contribute code]]
**[[Patches]]
**[[Contributor Agreement]]
**[[License]]
**[[Trademark]]
* [[:Category:Development documentation|Development documentation]]
* [[Development roadmap]]
* [[Creating Custom Workflows]]

==Project Management==
Archivematica software development, release management, and community support are managed by [http://artefactual.com Artefactual Systems] in collaboration with its contract clients and a growing network of Archivematica users and service partners.

Elasticsearch Development

2012-09-06T22:58:58Z

Peter:

Archivematica 0.9+ stores AIP file information, such as METS data, using [http://www.elasticsearch.org/ Elasticsearch]. This data can be searched from the Archival Storage area of the dashboard or accessed programmatically. For Elasticsearch administration information, such as how to delete an Elasticsearch index, please see the [[Administrator_manual_0.9#Elasticsearch|administrator manual]].

=Programmatic Access to indexed AIP data=

To access indexed AIP data using a custom script or application, find an Elasticsearch interface library for the programming language you've chosen to use. In Archivematica we use Python with the [https://github.com/aparo/pyes/ pyes] library. In our developer documentation, we'll outline the use of pyes to access AIP data, but any programming language/interface library, such as PHP and [https://github.com/ruflin/Elastica/ Elastica], should work.

==Connecting to Elasticsearch==

On this page we'll run through an example of interfacing with Elasticsearch data using a Python script that leverages the pyes library.

The first step, when using pyes, is to require the module. The following code imports pyes functionality on a system on which Archivematica is installed.

<pre>
import sys
sys.path.append("/home/demo/archivematica/src/archivematicaCommon/lib/externals")
from pyes import *
</pre>

Next you'll want to create a connection to Elasticsearch.

<pre>
conn = ES('127.0.0.1:9200')
</pre>

==Full text searching==

Once connected to Elasticsearch, you can perform searches. Below is the code needed to do a "wildcard" search for all AIP files indexed by Elasticsearch and retrieve the first 20 items. Instead of doing a "wildcard" search you could also supply keywords, such as a certain AIP UUID.

<pre>
start_page = 1
items_per_page = 20

q = StringQuery('*')

try:
    # 'start' is a zero-based offset, so skip one full page per preceding page
    results = conn.search_raw(
        query=q,
        indices='aips',
        type='aip',
        start=(start_page - 1) * items_per_page,
        size=items_per_page
    )
except:
    print 'Query error.'
</pre>
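
To fetch pages beyond the first, the offset has to advance by a whole page each time. The following is a minimal sketch, assuming <code>start</code> is treated as a zero-based result offset; <code>page_offset</code> is a hypothetical helper and is not part of pyes.

```python
def page_offset(page, items_per_page):
    """Convert a one-based page number into a zero-based result offset."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return (page - 1) * items_per_page


# Offset of the first item on page 3 when showing 20 items per page.
start = page_offset(3, 20)
```

The returned value can be passed as the <code>start</code> argument of the search call, with <code>size</code> left at the number of items per page.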

==Querying for specific data==

While the "StringQuery" query type is good for broad searches, you may want to narrow a search down to a specific field of data to reduce false positives. Below is an example that uses "TermQuery" to find documents in which a specific field matches a given value. Because Elasticsearch stores term values in lowercase by default, the term value you search for must also be lowercase.

<pre>
import sys
sys.path.append("/usr/lib/archivematica/archivematicaCommon/externals")
import pyes

conn = pyes.ES('127.0.0.1:9200')

q = pyes.TermQuery("METS.amdSec.ns0:amdSec_list.@ID", "amdsec_8")

try:
    results = conn.search_raw(query=q, indices='aips')
except:
    print 'Query failed.'
</pre>

==Displaying search results==

Now that you've performed a couple of searches, you can display some results. The logic below cycles through each hit in a result set, each hit representing an AIP file, and prints the UUID of the AIP the file belongs to, the Elasticsearch document ID corresponding to the indexed file data, and the path of the file within the AIP.

<pre>
if results:
    document_ids = []
    for item in results.hits.hits:
        aip = item._source
        print 'AIP ID: ' + aip['AIPUUID'] + ' / Document ID: ' + item._id
        print 'Filepath: ' + aip['filePath']
        print
        document_ids.append(item._id)
</pre>

==Fetching specific documents==

If you want to get Elasticsearch data for a specific AIP file, you can use the Elasticsearch document ID. The code above populates the <code>document_ids</code> list; the code below uses it to retrieve individual documents and extract a specific item of data from each one.

<pre>
for document_id in document_ids:
    # Fetch the full document from the 'aips' index (type 'aip') by its ID
    data = conn.get('aips', 'aip', document_id)

    format = data['METS']['amdSec']['ns0:amdSec_list'][0]['ns0:techMD_list'][0]['ns0:mdWrap_list'][0]['ns0:xmlData_list'][0]['ns1:object_list'][0]['ns1:objectCharacteristics_list'][0]['ns1:format_list'][0]['ns1:formatDesignation_list'][0]['ns1:formatName']

    print 'Format for document ID ' + document_id + ' is ' + format
</pre>
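
Long chains of keys and list indexes like the one above raise an error as soon as one level is absent from a document. The following is a minimal sketch of a more forgiving lookup; <code>dig</code> is a hypothetical helper, and the sample dictionary only stands in for a document returned by Elasticsearch.

```python
def dig(data, path, default=None):
    """Follow a sequence of dict keys and list indexes, returning
    default if any step along the way is missing."""
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Stand-in for a document returned by Elasticsearch (sample data only).
sample = {'METS': {'amdSec': {'ns0:amdSec_list': [{'@ID': 'amdSec_1'}]}}}

# An existing path returns the stored value.
amd_id = dig(sample, ['METS', 'amdSec', 'ns0:amdSec_list', 0, '@ID'])

# A path with a missing level falls back to the default.
missing = dig(sample, ['METS', 'amdSec', 'ns0:techMD_list', 0], 'unknown')
```

Keeping the path as a list also makes it easier to reuse the same lookup for several fields.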

==Augmenting documents==

To add additional data to an Elasticsearch document, you'll need the document ID. The following code shows an Elasticsearch query being used to find a document and update it with additional data. Note that the name of the data field being added, "__public", is prefixed with two underscores. This practice prevents the accidental overwriting of system or Archivematica-specific data. System data is prefixed with a single underscore.

<pre>
import sys
sys.path.append("/usr/lib/archivematica/archivematicaCommon/externals")
import pyes

conn = pyes.ES('127.0.0.1:9200')

q = pyes.TermQuery("METS.amdSec.ns0:amdSec_list.@ID", "amdsec_8")

results = conn.search_raw(query=q, indices='aips')

try:
    if results:
        for item in results.hits.hits:
            print 'Updating ID: ' + item['_id']

            document = item['_source']
            document['__public'] = 'yes'
            conn.index(document, 'aips', 'aip', item['_id'])
except:
    print 'Query failed.'
</pre>
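
The double-underscore naming convention can be enforced in one place rather than typed by hand for each field. The following is a minimal sketch; <code>add_custom_fields</code> and <code>CUSTOM_PREFIX</code> are hypothetical names, and the sample document contains made-up values.

```python
CUSTOM_PREFIX = '__'


def add_custom_fields(document, fields):
    """Return a copy of document with each field stored under a
    double-underscore name, refusing to replace an existing key."""
    updated = dict(document)
    for name, value in fields.items():
        key = CUSTOM_PREFIX + name.lstrip('_')
        if key in updated:
            raise KeyError('field already present: ' + key)
        updated[key] = value
    return updated


# Sample document with made-up values.
document = {'AIPUUID': 'example-uuid', 'filePath': 'objects/example.txt'}
augmented = add_custom_fields(document, {'public': 'yes'})
```

The augmented copy is what would then be re-indexed under the original document ID; the input dictionary is left unchanged.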

User Manual

2012-09-05T20:31:34Z

Peter: /* Archivematica 0.9 */

[[Main Page]] > [[Documentation]] > User Manual

= Archivematica 0.9 =

* [[User manual 0.9|User manual]]
* [[Media:Tutorial-09.pdf|Tutorial]]
* [[Archivematica_0.9_Release_Notes|Release Notes]]

= Previous user manuals =

* [[User manual 0.8|Release 0.8 user manual]], [[Media:Tutorial-08.pdf|0.8 Tutorial pdf]]
* [[:File:ArchivematicaDocs071.pdf|Release 0.7.1 user instructions]]
* [[:File:ArchivematicaDocs07.pdf|Release 0.7 user instructions]] ([[:File:ArchivematicaDocs07.pdf|English]], [[:File:ArchivematicaDocs07.es.pdf|Spanish]])
* [[:File:Archivematica-0.6-WorkflowInstructions-v3.pdf|Release 0.6 Documentation]]
* [[Release 0.5 Documentation]]
* [[Release 0.4 Documentation]]
* [[Release 0.3.5 Documentation]]
* [[Release 0.3 Documentation]]
* [[Release 0.2 Documentation]]
* [[Release 0.1 Documentation]]

Archivematica 0.9 Release Notes

2012-09-05T20:28:23Z

Peter:

[[Main Page]] > [[Software]] > [[Release Notes]] > Archivematica 0.9 Release Notes

Release: August 29, 2012 | [[Install-0.9-beta|Download]] | [https://www.youtube.com/watch?v=GWmNfuO1ofw&feature=player_embedded Screencast] | [https://archivematica.org/roadmap Roadmap]

== New features ==

* Update to Ubuntu 12.04 LTS as the base operating system
* Web browser dashboard interface replacing most of the file browser functionality
* DIP upload to CONTENTdm
* Indexing and search of all AIP metadata using [http://www.elasticsearch.org/ ElasticSearch]
* Rights module update to PREMIS 2.2
* Email handling improvements and prototype ingest of maildir
* Ability to create user accounts
* Automatic restructuring of transfer for compliance
* In dashboard, grouped jobs into micro-services
* Ability to ingest Library of Congress BagIt format
* Nightly backup of MCP MySQL database
* Scalability enhancements: see [[Scalability testing]].

== Bug fixes and enhancements ==
* Issue 185 Merge multiple layers in image files into single jpeg access copies
* Issue 304 Transcoding with Open Office fails periodically
* Issue 575 Client can configure their timezone to offset the date/time in the dashboard.
* Issue 673 during reinstall archivematica-mcp-client fails
* Issue 694 The archivematica VM's should include a timesync mechanism
* Issue 722 Add Administration tab to configure workflows
* Issue 777 Browser periodically fails to refresh when running a micro-service
* Issue 860 Rights granted restriction is a repeatable field.
* Issue 865 Archivematica freezes if transfer directory name has apostrophe
* Issue 869 Omitting termOfGrant startDate in rights causes generate METS.xml micro-service to fail
* Issue 872 In rights list page for SIP, column on right hand side is confusing
* Issue 875 Inconsistent normalization failure on pdf in submissionDocumentation
* Issue 885 Three locations of apache.default
* Issue 886 Make overriding the default assigned threads by core count configurable.
* Issue 887 Make Approval steps different colour
* Issue 892 Uploaded objects should have filename as title
* Issue 894 Microservices failing to connect to the mysql database.
* Issue 897 Integrate Transcoder into MCP
* Issue 902 Remove mac icon files automatically on ingest
* Issue 903 Ensure latest version of tutorial is included in demo/Docs
* Issue 906 When access normalization fails, a copy of the original file should be placed in the DIP
* Issue 910 Remove hidden files during transfer
* Issue 913 Description doesn't match command.
* Issue 918 Choosing "No normalization" results in failure at Prepare AIP
* Issue 927 Make compression a processing decision option
* Issue 932 Make DIP upload destination a selectable or configurable option
* Issue 934 When micro-service fails but transfer or SIP continues processing, icon shows fail at the end instead of success
* Issue 935 Rejected transfers or SIPs have icon showing that processing was completed
* Issue 937 Order structMap contents alphabetically as default
* Issue 939 Enclose fptr elements in divs in METS structMap
* Issue 943 Mysql connection issues.
* Issue 944 Give option to restructure for compliance when failing compliance.
* Issue 950 Make action items larger
* Issue 955 Generate thumbnails
* Issue 958 Improve user manual instructions for error handling
* Issue 962 Ingest maildir backups and convert to mbox for access
* Issue 969 Dashboard search functionality
* Issue 972 Replace isPartOf with Relation in DC template
* Issue 976 Replace pyinotify watched directories, with something that compares list of files.
* Issue 977 Add user-supplied structMap to AIP METS file
* Issue 978 During DSpace transfer processing user asked to approve load of non-existent file_labels.csv
* Issue 980 Check tasks, microservices and dropdown menus for naming clarity and consistency
* Issue 983 Replace -vpre normal in mp4 normalization command with new preset
* Issue 984 Access normalization fails in digitization workflow when filenames have periods in them
* Issue 985 Use ffmpeg.org version of ffmpeg instead of avconv
* Issue 986 Consolidate technical documentation into an administrator's manual
* Issue 991 Make sure blank value doesn't generate NaN in task popup data fields
* Issue 992 Add View METS and View AIP option at Store AIP task
* Issue 993 Add View normalization report and View normalized files option at Approve normalization task
* Issue 995 Swap click behavior of SIP row and magnifying glass icon
* Issue 998 Log MCP normalization output
* Issue 1001 Make user selectable replacement dic append, not replace.
* Issue 1004 Eliminate side info panels from dashboard and home page
* Issue 1009 Include empty directories in BAG.
* Issue 1010 Resolve: two "CREATE TABLE StandardTasksConfigs"
* Issue 1011 Add default Archivematica structMap label to distinguish from user-supplied structMap
* Issue 1021 Make Archival Storage Tab load from db
* <strike>Issue 1025</strike> Test date fields with dates before 1970
* Issue 1026 Make defaultProcessingMCP.xml configurable in the administration tab.
* Issue 1035 Line up micro-service names
* Issue 1036 Change DSpace transfer folder name and micro-service
* Issue 1040 When ingested file is already in an access format, the file is not added to the DIP
* Issue 1042 Remove default normalization to .odt for .rtf files
* Issue 1044 Remove "None microservice"s
* Issue 1046 Office doc normalization failing on x32 installs
* Issue 1057 DC file not added to METS
* Issue 1081 Fix numerical indicators on the dashboard so they are on proper tab.
* Issue 1082 Verify file id classifications of preservation or access formats.

Scalability testing

2012-07-30T19:05:51Z

Peter: /* Test results */

[[Main Page]] > [[Development roadmap]] > Scalability testing

= Test File Sets =
[http://archivematica.org/downloads/docZips/ Test Documents]

= Test design =

Maximums to test for:
*Max number of SIPs - 10
*Max number of files in SIP - 10,000
*Max size of individual file - 30 GiB
*Max size of SIP - 100 GiB

Baseline amounts:
* number of SIPs - 1
* number of files in SIP - 10
* size of individual file - 1 MiB
* size of SIP - 100 MiB

{| class="wikitable"
|-
! Test
! No. of SIPs
! No. of files in SIP
! Max size of individual file
! Max size of SIP
|-
| 1. Baseline Test
| 1
| 10
| 1 MiB
| 100 MiB
|-
| 2. No. of SIPs
| '''10'''
| 10
| 1 MiB
| 100 MiB
|-
| 3. No. of files
| 1
| '''10,000'''
| 1 MiB
| 100 MiB
|-
| 4. Max file size
| 1
| 10
| '''30 GiB'''
| 100 MiB
|-
| 5. Max SIP size
| 1
| 10
| 1 MiB
| '''100 GiB'''
|-
| ...
|
|
|
|
|}

*Other tests: combination of maximums
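
The design above varies one parameter at a time from the baseline to its maximum. A minimal sketch of how that matrix, and the combined-maximum cases, could be generated follows. The dictionary keys and function names are illustrative assumptions, not part of any Archivematica tool.

```python
from itertools import combinations

# Values copied from the baseline and maximum lists above.
# The key names are hypothetical labels chosen for this sketch.
BASELINE = {"sips": 1, "files_per_sip": 10, "file_size": "1 MiB", "sip_size": "100 MiB"}
MAXIMUMS = {"sips": 10, "files_per_sip": 10000, "file_size": "30 GiB", "sip_size": "100 GiB"}


def build_test_matrix(baseline, maximums):
    """Return the baseline test plus one test per parameter raised to its maximum."""
    tests = [("Baseline", dict(baseline))]
    for name, maximum in maximums.items():
        case = dict(baseline)
        case[name] = maximum
        tests.append((f"Max {name}", case))
    return tests


def combined_maximum_tests(baseline, maximums):
    """Yield every test that raises two or more parameters to their maximums."""
    names = list(maximums)
    for size in range(2, len(names) + 1):
        for group in combinations(names, size):
            case = dict(baseline)
            for name in group:
                case[name] = maximums[name]
            yield group, case


for number, (label, case) in enumerate(build_test_matrix(BASELINE, MAXIMUMS), start=1):
    print(number, label, case)
```

Generating the cases from two dictionaries keeps the table and the "combination of maximums" tests in sync if a maximum is revised later.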

= CVA tests =

System setup:

*Bare-metal install, 1 processor
*2 cores
*4 GB RAM, 9 GB swap
*Xubuntu

Note: excludes store AIP and upload DIP micro-services except where noted

{| border="1" cellpadding="10" cellspacing="0" width=90%
|-
!Test date
!No. transfers/SIPs
!No. files
!Total file size
!Largest file size
!AIP size
!Total time
!Comments
|-
|2011/11/10
|1/1
|1,000
|12.1 GB
|60 MB
|
|
|
*Failed at prepareAIP due to max Bag size: <strike>Issue 785</strike>
*Failed at uploadDIP due to max post size limit in ica-atom (8M).
|-
|2011/11/10
|1/1
|1
|2.7 GB
|2.7 GB
|
|
|Failed at prepareAIP due to max Bag size: <strike>Issue 785</strike>
|-
|2011/11/18
|1/1
|1,000
|12.1 GB
|60 MB
|7.2 GB
|4 hrs 30 mins
|Access normalization only
|-
|2011/12/02
|2/2
|1,998
|13 GB
|21 MB
|
|
|Access normalization only
|-
|2011/12/11
|1/1
|1,000
|6.51 GB
|21 MB
|3.5 GB
|
|Access normalization only
|-
|2011/12/11
|2/2
|1,996
|13.8 GB
|27 MB
|7.2 GB
|
|Access normalization only
|-
|2011/12/13
|3/3
|2,974
|18.6 GB
|20 MB
|10.3 GB
|3 hrs 19 mins
|Access normalization only
|-
|2011/12/14
|4/4
|3,993
|24.6 GB
|22 MB
|13.2 GB
|3 hrs 16 mins
|Access normalization only
|-
|2011/12/15
|4/4
|3,982
|43 GB
|12 MB
|15 GB
|3 hrs 30 mins
|Access normalization only
|-
|2011/12/15
|6/6
|5,113
|34.1 GB
|38 MB
|19.8 GB
|4 hrs 2 mins
|Access normalization only
|-
|2012/01/04
|6/6
|5,845
|42.4 GB
|33 MB
|24 GB
|3 hrs 52 mins
|Access normalization only
|-
|2012/01/05
|3/3
|2,957
|20.9 GB
|45 MB
|13.6 GB
|4 hrs
|Access normalization only
|-
|2012/01/05
|6/6
|'''5,947'''
|33 GB
|52 MB
|19.2 GB
|4 hrs 47 mins
|Access normalization only
|-
|2012/01/12
|6/6
|4,847
|38.5 GB
|58 MB
|23.2 GB
|4 hrs 43 mins
|Access normalization only
|-
|2012/01/13
|6/6
|5,912
|101.6 GB
|175 MB
|63.8 GB
|'''8 hrs 53 mins'''
|Access normalization only
|-
|2012/01/17
|1/1
|1
|1.4 GB
|1.4 GB
|0.6 GB
|25 mins
|Access normalization only
|-
|2012/01/17
|5/5
|23
|19.7 GB
|2.1 GB
|19 GB
|4 hrs 1 min
|Access normalization only
|-
|2012/01/18
|2/2
|2
|3.8 GB
|2.1 GB
|3.7 GB
|1 hr 11 mins
|Access normalization only
|-
|2012/01/20
|6/6
|14
|6.1 GB
|1.3 GB
|5.9 GB
|48 mins
|Access normalization only
|-
|2012/02/07
|5/5
|5
|56.7 GB
|'''25.4 GB'''
|55.5 GB
|4 hrs 51 mins
|No normalization
|-
|2012/02/08
|5/5
|10
|'''124.4 GB'''
|23.8 GB
|'''122.2 GB'''
|8 hrs 21 mins
|No normalization
|-
|2012/02
|1/1
|1044
|7.5 GB
|12.4 MB
|32.8 GB
|>16 hrs
|Preservation and access normalization
|-
|2012/02
|1/1
|104
|611.6 MB
|7.1 MB
|2.58 GB
|<2 hrs
|Preservation and access normalization
|-
|2012/02
|1/1
|2125
|47.1 GB
|35.9 MB
|46.2 GB
|>24 hrs
|Preservation and access normalization
|-
|2012/03
|1/1
|1654
|7.9 GB
|11.7 MB
|37.7 GB
|>16 hrs
|Preservation and access normalization
|-
|2012/03
|1/1
|1195
|5.7 GB
|9.9 MB
|26.8 GB
|>12 hrs
|Preservation and access normalization
|-
|2012/03/22
|1/1
|
|11.0 GB
|246.3 MB
| GB
|
|Preservation and access normalization
|-
|2012/03/22
|1/1
|
|6.7 GB
|9.7 MB
| GB
|
|Preservation and access normalization
|-
|2012/03/26
|1/1
|
|6.6 GB
|14.3 MB
| GB
|
|Preservation and access normalization
|-
|2012/03
|1/1
|
|18.1 GB
|11.7 MB
|
|
|Preservation and access normalization
|-
|}
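
To compare rows of the table above, it can help to convert each "Total time" value to minutes and derive a throughput figure. The sketch below does that for three rows copied from the table. The helper names are hypothetical, and rows whose time is a bound (such as ">16 hrs") are deliberately left out because they do not give an exact duration.

```python
import re


def parse_duration_minutes(text):
    """Convert strings such as '4 hrs 30 mins', '1 hr 11 mins' or '25 mins' to minutes."""
    hours = re.search(r"(\d+)\s*hrs?", text)
    minutes = re.search(r"(\d+)\s*mins?", text)
    if not hours and not minutes:
        raise ValueError(f"unrecognised duration: {text!r}")
    total = int(hours.group(1)) * 60 if hours else 0
    total += int(minutes.group(1)) if minutes else 0
    return total


def throughput_gb_per_hour(total_gb, duration_text):
    """Return gigabytes processed per hour for one test run."""
    return total_gb / (parse_duration_minutes(duration_text) / 60)


# (test date, total file size in GB, total time) taken from the table above.
RUNS = [
    ("2011/11/18", 12.1, "4 hrs 30 mins"),
    ("2012/01/13", 101.6, "8 hrs 53 mins"),
    ("2012/02/08", 124.4, "8 hrs 21 mins"),
]

for date, size_gb, duration in RUNS:
    print(f"{date}: {throughput_gb_per_hour(size_gb, duration):.2f} GB/hour")
```

Throughput alone does not account for the differences in normalization settings noted in the Comments column, so it is best compared only between rows with the same comment.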


= Multi-processor testing =

== Problem statement ==

*Does the amount of processing time decrease for each additional processing station added?
*If yes, by how much?
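
One way to quantify "by how much" is to express each run as a speedup relative to the single-client run, and as an efficiency per client. The sketch below assumes total processing time is the measured quantity. The timings in it are hypothetical placeholders, not measured results.

```python
def speedup(single_client_seconds, multi_client_seconds):
    """Return how many times faster the multi-client run was than the single-client run."""
    return single_client_seconds / multi_client_seconds


def efficiency(single_client_seconds, multi_client_seconds, clients):
    """Return speedup per client; 1.0 would mean perfectly linear scaling."""
    return speedup(single_client_seconds, multi_client_seconds) / clients


# Hypothetical placeholder timings in seconds, keyed by number of clients.
# Replace these with values extracted from the timing views.
EXAMPLE_TIMINGS = {1: 3600.0, 2: 2000.0, 4: 1200.0}

baseline = EXAMPLE_TIMINGS[1]
for clients, seconds in sorted(EXAMPLE_TIMINGS.items()):
    print(
        f"{clients} client(s): speedup {speedup(baseline, seconds):.2f}, "
        f"efficiency {efficiency(baseline, seconds, clients):.2f}"
    )
```

An efficiency that falls as clients are added would suggest a shared bottleneck, such as the database or the shared directory, rather than a lack of processing capacity.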

== Constants and variables ==

Constants:
*RAM amount
*RAM speed
*Disk size
*CPU frequency

Variables:
*Number of clients
*Number of transfer(s)
*Size of transfer(s)
*Number of file(s)

The ideal test network consists of 6+ nodes, each with a dual-core processor, 2 GB+ of memory, and 6 GB+ of disk space. Due to limited disk capacity, current tests are running with 5 nodes.

== Testing data ==

*All testing data will be preserved for analysis. Select data will be reported on this wiki.

== Network setup ==
{| class="wikitable"
|-
! HOSTNAME
! Processor
! Memory
! Disk/s Size
! IP
! Filesystem
! Services
! Network Connection Speed
! Ram speed/timing
! Shared directory disk write speed
! Shared directory disk read speed
|-
| test01server
| 4 x 500 MHz
| 2048 MB
| 6 GB + 35 GB
| 10.10.0.1
| ext4
| MCPServer,MySQL,NFS,MCPClient
|-
| test01client01
| 2 x 500 MHz
| 1024 MB
| 6 GB
| 10.10.0.11
| ext4,NFS
| MCPClient
|-
| test01client02
| 2 x 500 MHz
| 1024 MB
| 6 GB
| 10.10.0.12
| ext4,NFS
| MCPClient
|-
| test01client03
| 2 x 500 MHz
| 1024 MB
| 6 GB
| 10.10.0.13
| ext4,NFS
| MCPClient
|-
| test01client04
| 2 x 500 MHz
| 1024 MB
| 6 GB
| 10.10.0.14
| ext4,NFS
| MCPClient
|}

== Testing metrics ==
Our results are derived from running 000.zip through the Archivematica pipeline and then extracting the MySQL timing views from the database. This gives us a clearer picture of client productivity.

Two scripts are used to extract testing data from the database:
* [http://archivematica.googlecode.com/svn/trunk/src/testingTools/distributedTesting/automatedDistributedTestingReports.sh automatedDistributedTestingReports.sh]
* [http://archivematica.googlecode.com/svn/trunk/src/testingTools/distributedTesting/automatedDistributedTestingProcessingMachineInformationGathering.sh automatedDistributedTestingProcessingMachineInformationGathering.sh]

After you have run your test data through Archivematica, run both scripts:
<pre>
./automatedDistributedTestingReports.sh
./automatedDistributedTestingProcessingMachineInformationGathering.sh
</pre>

You will receive a fileset similar to this:
<pre>
2012.05.02-11.52.12_server_jobDurationsView.html
2012.05.02-11.52.12_server_MCP_DUMP.sql
2012.05.02-11.52.12_server_mysql_status.log
2012.05.02-11.52.12_server_netstat_summary.log
2012.05.02-11.52.12_server_PDI_by_unit.html
2012.05.02-11.52.12_server_processingDurationInformation.html
server_2012.05.02-11.52.05_cpuinfo.log
server_2012.05.02-11.52.05_free.log
server_2012.05.02-11.52.05_IP.log
</pre>
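
The report names in the listing above follow two layouts: timestamp first for the database reports and host first for the machine information logs. A minimal sketch for grouping such files by host and capture time is shown below. The parser and its field names are assumptions based only on that listing, and it assumes hostnames contain no underscore.

```python
import re
from datetime import datetime
from typing import NamedTuple

# Timestamp layout seen in the listing, e.g. 2012.05.02-11.52.12
TIMESTAMP = r"\d{4}\.\d{2}\.\d{2}-\d{2}\.\d{2}\.\d{2}"

# Two layouts appear in the listing: timestamp_host_report.ext and host_timestamp_report.ext
PATTERNS = [
    re.compile(rf"^(?P<timestamp>{TIMESTAMP})_(?P<host>[^_]+)_(?P<report>.+)\.(?P<extension>[^.]+)$"),
    re.compile(rf"^(?P<host>[^_]+)_(?P<timestamp>{TIMESTAMP})_(?P<report>.+)\.(?P<extension>[^.]+)$"),
]


class ReportName(NamedTuple):
    host: str
    timestamp: datetime
    report: str
    extension: str


def parse_report_name(filename):
    """Split a report filename into host, capture time, report name and extension."""
    for pattern in PATTERNS:
        match = pattern.match(filename)
        if match:
            return ReportName(
                host=match.group("host"),
                timestamp=datetime.strptime(match.group("timestamp"), "%Y.%m.%d-%H.%M.%S"),
                report=match.group("report"),
                extension=match.group("extension"),
            )
    raise ValueError(f"unrecognised report filename: {filename!r}")


for name in ("2012.05.02-11.52.12_server_MCP_DUMP.sql", "server_2012.05.02-11.52.05_cpuinfo.log"):
    print(parse_report_name(name))
```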

== Test results ==

*RAM amount =
*RAM speed =
*Disk size =
*CPU frequency =


*Number of transfers =
*Total number of files =
*Total transfer size =


{| class="wikitable"
|-
! No. of processors
! Total processing time
! Longest job
! Second longest job
! Third longest job
|-
| 1
|
|
|
|
|-
| 2
|
|
|
|
|-
| 6
|
|
|
|
|-
|}

{| width="100%" border="0" style="margin: 20px 0;" class="youtube"
| <youtube>lOZ-Kcw4DQs</youtube>
|}

Scalability testing

2012-07-30T19:05:29Z

Peter: /* Test results */

Installation

2012-07-05T22:00:00Z

Peter:

==Technical Requirements==
* Workstation: min processor?, min RAM?, min storage?
* Enabled ports: http, mysqld, gearman, nfs, ssh
* VMplayer:
**Vbox version?
**VMware version?
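
A quick way to confirm that the listed services are reachable is to attempt a TCP connection to each one. The sketch below assumes the conventional default port for each service (80, 3306, 4730, 2049, 22); a given installation may use different ports, and the function names are illustrative only.

```python
import socket

# Conventional default ports; these are assumptions and may differ per installation.
DEFAULT_PORTS = {"http": 80, "mysqld": 3306, "gearman": 4730, "nfs": 2049, "ssh": 22}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(host, ports=DEFAULT_PORTS):
    """Print one line per service showing whether its port accepts connections."""
    for service, port in ports.items():
        state = "open" if is_port_open(host, port) else "closed or filtered"
        print(f"{service:8} {port:5} {state}")


if __name__ == "__main__":
    report("127.0.0.1")
```

A successful connection only shows that something is listening on the port, not that the expected service is the one answering.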

==Installation==
* [[Install-0.9-beta|Install Release 0.9-beta]]

* [[Install|previous releases]]

Installation

2012-07-05T21:59:27Z

Peter:

==Technical Requirements==
* Workstation
* Enabled ports: http, mysqld, gearman, nfs, ssh
* VMplayer:
**Vbox version?
**VMware version?

==Installation==
* [[Install-0.9-beta|Install Release 0.9-beta]]

* [[Install|previous releases]]

Installation

2012-07-05T21:58:32Z

Peter:

==Technical Requirements==
* Workstation
* Enabled ports
* VMplayer:
**Vbox version?
**VMware version?

==Installation==
* [[Install-0.9-beta|Install Release 0.9-beta]]

* [[Install|previous releases]]

User manual 0.8

2012-07-05T21:57:54Z

Peter:

{| style="width:95%;" border="0"
|-valign="top"
| style="width:500px; border: 1px solid rgb(198, 201, 255); padding: 0.5em 1em 1em;" |

== [[Installation|Installation]] ==

== [[UM transfer|Transfer]] ==

== [[UM ingest|Ingest]] ==

*Ingesting [[UM digitization output|Digitization output]]

*Ingesting [[UM DSpace exports|DSpace exports]]

== [[UM archival storage|Archival storage]] ==

== [[UM access|Access]] ==

| style="width: 500px; border: 1px solid rgb(198, 201, 255); padding: 0.5em 1em 1em;" |

== [[UM preservation planning|Preservation planning]] ==

== [[UM error handling|Error handling]] ==

== [[UM glossary|Glossary]] ==

== Questions? ==

Please post them to Archivematica [http://groups.google.com/group/archivematica?hl=en discussion group]

__NOEDITSECTION__
__NOTOC__

Installation

2012-07-05T21:56:33Z

Peter:

[[Technical Requirements]]
* Workstation
* Enabled ports
* VMplayer:
**Vbox version?
**VMware version?

* [[Install-0.9-beta|Install Release 0.9-beta]]

* [[Install|previous releases]]

User manual 0.8

2012-07-05T21:56:10Z

Peter:

{| style="width:95%;" border="0"
|-valign="top"
| style="width:500px; border: 1px solid rgb(198, 201, 255); padding: 0.5em 1em 1em;" |

== [[Install|Installation]] ==

== [[UM transfer|Transfer]] ==

== [[UM ingest|Ingest]] ==

*Ingesting [[UM digitization output|Digitization output]]

*Ingesting [[UM DSpace exports|DSpace exports]]

== [[UM archival storage|Archival storage]] ==

== [[UM access|Access]] ==

| style="width: 500px; border: 1px solid rgb(198, 201, 255); padding: 0.5em 1em 1em;" |

== [[UM preservation planning|Preservation planning]] ==

== [[UM error handling|Error handling]] ==

== [[UM glossary|Glossary]] ==

== Questions? ==

Please post them to Archivematica [http://groups.google.com/group/archivematica?hl=en discussion group]

__NOEDITSECTION__
__NOTOC__

User manual 0.8

2012-07-05T21:55:56Z

Peter:

{| style="width:95%;" border="0"
|-valign="top"
| style="width:500px; border: 1px solid rgb(198, 201, 255); padding: 0.5em 1em 1em;" |

== [[Install|Installation]] ==

== [[UM transfer|Transfer]] ==

== [[UM ingest|Ingest]] ==

*Ingesting [[UM digitization output|Digitization output]]

*Ingesting [[UM DSpace exports|DSpace exports]]

== [[UM archival storage|Archival storage]] ==

== [[UM access|Access]] ==

| style="width: 500px; border: 1px solid rgb(198, 201, 255); padding: 0.5em 1em 1em;" |

== [[UM preservation planning|Preservation planning]] ==

== [[UM error handling|Error handling]] ==

== [[UM glossary|Glossary]] ==

== Questions? ==

Please post them to Archivematica [http://groups.google.com/group/archivematica?hl=en discussion group]

__NOEDITSECTION__
__NOTOC__