Archivematica 1.7 and Storage Service 0.11 release notes
Released MM DD, 2018.
Archivematica 1.7/Storage Service 0.11 has several new features, as well as enhancements to existing features, bug fixes and updated tools. Below you'll find a short description of each feature as well as links to the relevant documentation and code changes. Thank you to everyone who has sponsored the work that is included in this release - your dedication to making Archivematica better is appreciated by the whole community!
This release also includes several bug fixes, especially related to packaging. We have also been concentrating on improving the overall Archivematica documentation.
Translation hooks have been added to the Archivematica user interface, the Storage Service, the documentation, and the Archivematica website. This work will support the translation of those resources into many languages through Artefactual's current localization platform, Transifex. Note that translation hooks for Archivematica workflow components (microservice names, job names, and drop-down options) will be added in Archivematica 1.8.
This work was sponsored by the Canadian Council of Archives. Thank you!
NOTE: this work prepares Archivematica for localization; however, minimal translation has been completed. The interface will default to English at the present time, but can be changed to another language in the Settings menu.
- Documentation and translator's guide: Translations
- Pull requests: PR 159, Appraisal tab PR 151, Transfer browser PR 12, PR 506
This feature allows users to connect their Archivematica pipeline to GPG-encrypted AIP Storage and Transfer Backlog spaces. AIPs and transfers in backlog can also be encrypted. An AIP or transfer stored in an encrypted location is encrypted at rest; when downloaded, an encrypted AIP is decrypted for use.
This work was sponsored by the Simon Fraser University Archives. Thank you!
- Documentation: AIP Encryption
- Pull requests: SS PR 198, PR 616, Acceptance tests repo PR 12, Acceptance tests repo PR 19, Ansible role PR 109, SS PR 241, PR 738, Acceptance tests repo PR 22, METS Reader-Writer PR 27
- Feature files: AIP encryption feature file, AIP encryption mirror location feature file
Shibboleth and LDAP integration
Archivematica and the Storage Service can now be deployed to use LDAP or Shibboleth authentication.
This integration allows users to use MediaConch to check the conformance of .mkv files (originals and derivatives) against the Matroska specification. It also checks the validity of media files against user-provided policies.
This work was sponsored by the PREFORMA Project. Thank you!
- Documentation: Format Policy Registry - Validation
- Example workflow: MediaConch workflow
- Pull requests: PR 557, Format Policy Registry PR 35, Acceptance tests PR 13, Ansible role PR 114, Sample data PR 2, Artefactual Labs Archivematica MediaConch policy check wrapper
- Feature files: Transfer tab MKV conformance, Ingest tab MKV conformance, Transfer policy check, Ingest policy check
Handle Server integration
Using this feature, Archivematica can be configured to make requests to a Handle System HTTP API so that files, directories and entire AIPs can be assigned persistent identifiers (PIDs) and derived persistent URLs (PURLs) in the METS file. This work was sponsored by the International Institute of Social History. Thank you!
- Documentation: Handle Server configuration
- Pull requests: PR 690, Acceptance tests PR 15
- Feature files: PID-binding feature
Assign UUIDs to directories and empty directories
Related to Handle Server integration, above, this feature gives users the option of assigning UUIDs to directories and/or empty directories within the AIP METS file. This adds two new microservices: Assign UUIDs to directories and Document empty directories. You may assign UUIDs to directories and document empty directories even if you do not have a Handle Server configured and are not binding PIDs. This work was sponsored by the International Institute of Social History. Thank you!
- Documentation: Assign UUIDs to directories, Document empty directories
- Pull requests: PR 690, PR 767, PR 833
- Feature Files: UUIDs for directories
This feature supports deployment of Archivematica in indexless mode, with Elasticsearch disabled. Users who don't require Archivematica's indexing features can save the compute resources that indexing, which can be an intensive task, would otherwise consume. Note that disabling the Elasticsearch index means you cannot make use of the Backlog, Appraisal, or Archival Storage tabs.
This work was sponsored by the Columbia University Libraries. Thank you!
A README file in HTML format has been added to the AIP. It contains a brief explanation of what the AIP is, how it was created and how it is structured. See the contents of the README here. This feature was sponsored by the Denver Art Museum. Thank you!
File modification dates
This feature adds a Store file modification job to the Characterize and Extract metadata microservice and adds the metadata to the Elasticsearch index. This allows users to view the last modified date of the files in the Appraisal tab. This feature was sponsored by the Bentley Historical Library at University of Michigan. Thank you!
What has changed:
Antivirus (AV) scanning using ClamAV can now be configured up to ClamAV's maximum thresholds. Previously, Archivematica was limited to ClamAV's default limits of around 25 MB. It was discovered that installing with the default limits resulted in false-positive PREMIS events indicating that files over 25 MB had been scanned. In these instances there is still a possibility that a malicious payload exists, although the chance is small because viruses are typically attached to smaller files intended to be sent and downloaded via email. Archivematica will no longer record a PREMIS event when a file cannot be scanned because of the antivirus limits, or for any other reason suggesting that AV scanning did not complete successfully. Please see the Archivematica 1.7 documentation on the effects of different configuration settings. ClamAV currently has an upper limit of 2 GB per file, which is configured as the default in the Archivematica 1.7 release.
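The rule described above can be sketched roughly as follows. This is an illustration only, not Archivematica's actual code: the function name, its parameters, and the reading of "2 GB" as 2 * 1024**3 bytes are all assumptions made for the example.

```python
# Illustrative sketch only; not Archivematica's actual implementation.

# Assumption: "2 GB" is interpreted here as 2 * 1024**3 bytes.
ASSUMED_MAX_FILE_SIZE = 2 * 1024**3


def should_record_scan_event(file_size, scan_completed,
                             max_file_size=ASSUMED_MAX_FILE_SIZE):
    """Return True only when recording a virus-scan PREMIS event is justified.

    file_size      -- size of the file in bytes
    scan_completed -- True if the scanner reported a completed scan
    max_file_size  -- hypothetical configured scanner limit in bytes
    """
    # A file larger than the scanner's limit is never actually scanned,
    # so no event should claim that it was.
    if file_size > max_file_size:
        return False
    # Any other failure to complete the scan also suppresses the event.
    return bool(scan_completed)
```

Under the old default limit of roughly 25 MB, a 30 MB file would fall into the first branch, which is the case that previously produced a misleading event.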
What does this mean:
If you are concerned about AIPs created pre-Archivematica 1.7 containing viruses, you can use AIP reingest to re-run antivirus. However, in the majority of cases this will not be necessary. Factors to consider include:
- What is the source of the material? Do you have reason to mistrust this source?
- What were the transfer protocols before the material was ingested by Archivematica? Did antivirus scanning occur with another tool during that process?
- Does your current storage system include periodic enterprise-wide virus scanning?
Archivematica decoupled from the FPR server
The FPR was originally created to manage preservation plans, i.e. business rules and tool commands for format-based preservation events. The FPR server’s purpose has been to house the default rules and commands while allowing institutions to make local alterations as desired.
Going forward, we have made the decision to remove Archivematica’s dependency on the FPR server for the following reasons:
- The FPR rules on the FPR server are out of date. When new rules are added to the Preservation Planning tab in Archivematica, they aren't always being copied back to the FPR server.
- Currently, new Archivematica installations ping the FPR server, which records the IP address of the remote server. This data capture isn't useful and it isn’t necessary.
Removing the FPR server as a dependency should not change how Archivematica is used compared with releases prior to 1.7.
The longer-term goal is to build a new FPR server, one that is completely decoupled from Archivematica, which would serve up format policy data to other applications using an open API. We would propose that this future registry should not rely on a single vendor for maintenance.
Pull requests: PR 971
DIP upload and storage workflow improvements
This work clarifies the sequence of the Upload DIP and Store DIP jobs on the Ingest tab. The processing configuration settings have also been updated so that almost every decision point can be automated (the exception is Upload DIP, which requires data entry).
Dashboard API whitelist mechanism
Two changes have been made to the API whitelist functionality:
- The default API whitelist setting is now empty.
- If the API whitelist setting is empty the user can still authenticate against the API using the key. The whitelist is only activated when at least one IP address is listed.
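The whitelist behaviour described above can be sketched as below. This is a minimal illustration, not the Dashboard's actual code: the function and parameter names are hypothetical, the whitelist is assumed to be stored as a whitespace-separated string of IP addresses, and it is assumed that a valid API key is always required, with the whitelist only narrowing which addresses may use it.

```python
# Illustrative sketch only; names and storage format are assumptions.

def is_request_allowed(client_ip, key_is_valid, whitelist_setting=""):
    """Decide whether an API request should be accepted.

    client_ip         -- IP address of the caller, as a string
    key_is_valid      -- True if the supplied API key authenticated
    whitelist_setting -- hypothetical whitespace-separated list of IPs
    """
    # Assumption: a valid API key is required in every case.
    if not key_is_valid:
        return False
    allowed_ips = whitelist_setting.split()
    # An empty whitelist is inactive: key authentication alone suffices.
    if not allowed_ips:
        return True
    # Once at least one address is listed, the caller must be on the list.
    return client_ip in allowed_ips
```

With the new empty default, `is_request_allowed("203.0.113.5", True)` accepts the request on the strength of the key alone; listing a single address switches the whitelist on for every caller.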
Default processing configuration
As a result of several features in this release, the processing configuration options have changed substantially, growing from 19 configurable decision points to 27. The new decision points include:
- Assign UUIDs to directories (related to Handle Server integration)
- Perform policy checks on originals (related to MediaConch integration)
- Perform policy checks on preservation derivatives (related to MediaConch integration)
- Perform policy checks on access derivatives (related to MediaConch integration)
- Bind PIDs (related to Handle Server integration)
- Document empty directories (related to Handle Server integration)
- Upload DIP (related to DIP upload and storage workflow improvements)
- Store DIP (related to DIP upload and storage workflow improvements)
We've also changed the default configuration, leaving more decision points set at "None", which will prompt the user to make a manual decision as a transfer is being moved through the Archivematica workflow. The purpose of these changes is to enable better testing: we think it's important that users see as many decision points in real time as possible while they are testing the system.
We've also set the compression level to 1 - fastest mode, which also facilitates testing as it wraps up AIP and DIP storage more quickly; however, it does mean that packages will not be as compressed as they could be. If you have limited space on your test machine, we recommend either deleting packages on a regular basis or changing the compression level to a higher number so that packages are smaller.
Note that changes to the default configuration should not override your local configuration during an upgrade or migration.
- Storage Service timeouts increased from 5 seconds to 120 seconds (https://github.com/artefactual/archivematica/commit/c24183fd)
- Problem: editing notes in appraisal tab can delete data from ArchivesSpace - https://github.com/artefactual/archivematica/issues/713
- Failure during METS creation due to recursion limit (sponsored by International Institute of Social History) - https://github.com/artefactual/archivematica/commit/0881d587
- Original AIP METS still in filegrp:SubmissionDocumentation in METS - https://projects.artefactual.com/issues/10829