Archivematica 1.7 and Storage Service 0.11 release notes

Revision as of 15:33, 1 May 2018 by Sromkey


Released May 1, 2018.

Archivematica 1.7/Storage Service 0.11 has several new features, as well as enhancements to existing features, bug fixes and updated tools. Below you'll find a short description of each feature as well as links to the relevant documentation and code changes. Thank you to everyone who has sponsored the work that is included in this release - your dedication to making Archivematica better is appreciated by the whole community!

This release also includes several bug fixes, especially related to packaging. We have also been concentrating on improving the overall Archivematica documentation.

Supported environments

Installation instructions are available here.

Archivematica can be installed using packages or Ansible scripts in either CentOS/Red Hat or Ubuntu environments. It can also be installed using Docker. At this time, installation instructions are provided for officially tested and supported installation environments:


Manual install of OS packages on Ubuntu (14.04 and 16.04) is documented but not officially supported.

Installing Archivematica using Docker is not officially supported for production deployments. However, it is the preferred development environment for those who work on Archivematica's code.

If you are upgrading from a previous version of Archivematica, please see the upgrading instructions.

Added

Internationalization/localization

Translation hooks have been added to the Archivematica user interface, the Storage Service, the documentation, and the Archivematica website. This work will support the translation of those resources into many languages through Artefactual's current localization platform, Transifex. Note that translation hooks for Archivematica workflow components (microservice names, job names, and drop-down options) will be added in Archivematica 1.8.

This work was sponsored by the Canadian Council of Archives. Thank you!

NOTE: this work prepares Archivematica for localization; however, minimal translation has been completed so far. The interface currently defaults to English, but it can be changed to another language in the Settings menu.

AIP encryption

This feature allows users to connect their Archivematica pipeline to GPG-encrypted AIP Storage and Transfer Backlog spaces. AIPs and transfers in backlog can also be encrypted. An AIP or transfer stored in an encrypted location is encrypted at rest; when downloaded, an encrypted AIP is decrypted for use.

This work was sponsored by the Simon Fraser University Archives. Thank you!

Note that there is an open issue with storing uncompressed AIPs in encrypted locations.

Shibboleth and LDAP integration

Archivematica and the Storage Service can now be deployed to use LDAP or Shibboleth authentication.

This work was sponsored by Jisc, MoMA, and the International Institute of Social History - a truly international effort. Thank you!

MediaConch integration

This integration allows users to use MediaConch to check the conformance of .mkv files (originals and derivatives) against the Matroska specification. It also checks the validity of media files against user-provided policies.

This work was sponsored by the PREFORMA Project. Thank you!

Handle Server integration

Using this feature, Archivematica can be configured to make requests to a Handle System HTTP API so that files, directories and entire AIPs can be assigned persistent identifiers (PIDs) and derived persistent URLs (PURLs) in the METS file.

This work was sponsored by the International Institute of Social History. Thank you!

Assign UUIDs to directories and empty directories

Related to Handle Server integration, above, this feature gives users the option of assigning UUIDs to directories and/or empty directories within the AIP METS file. This adds two new microservices: Assign UUIDs to directories and Document empty directories. You may assign UUIDs to directories and document empty directories even if you do not have a Handle server configured and are not binding PIDs.

This work was sponsored by the International Institute of Social History. Thank you!
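The idea of giving every directory its own identifier and flagging the empty ones can be illustrated with a short sketch. This is not Archivematica's implementation: the function name `assign_directory_uuids` and the shape of the returned records are hypothetical, chosen only to show the concept.

```python
import os
import uuid


def assign_directory_uuids(root):
    """Map each directory under root to a fresh UUID and an 'empty' flag.

    Hypothetical helper for illustration only. Keys are paths relative to
    root; the root directory itself is not included.
    """
    records = {}
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            continue  # skip the top-level directory itself
        relative_path = os.path.relpath(dirpath, root)
        records[relative_path] = {
            # A random (version 4) UUID, one per directory.
            "uuid": str(uuid.uuid4()),
            # A directory counts as empty when it holds no files and no subdirectories.
            "empty": not dirnames and not filenames,
        }
    return records
```

Calling `assign_directory_uuids("/path/to/transfer")` would return one record per subdirectory, which a later step could write into a structural map.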

Indexless Archivematica

This feature supports deployment of Archivematica in indexless mode, which disables Elasticsearch. Users who don't require Archivematica's indexing features can therefore save the compute resources that indexing, often an intensive task, would otherwise consume. Note that disabling the Elasticsearch index means you cannot make use of the Backlog, Appraisal, or Archival Storage tabs.

This work was sponsored by the Columbia University Libraries. Thank you!

README file

A README file in HTML format has been added to the AIP. It contains a brief explanation of what the AIP is, how it was created and how it is structured. See the contents of the README here.

This feature was sponsored by the Denver Art Museum. Thank you!

File modification dates

This feature adds a Store file modification job to the Characterize and Extract metadata microservice and adds the metadata to the Elasticsearch index. This allows users to view the last modified date of the files in the Appraisal tab.

This feature was sponsored by the Bentley Historical Library at University of Michigan. Thank you!
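As a rough illustration of the kind of value involved, the following sketch reads a file's last-modified timestamp from the filesystem and formats it as an ISO 8601 string in UTC. The function name `last_modified_iso` is hypothetical, and the format Archivematica actually stores is not specified here.

```python
import os
from datetime import datetime, timezone


def last_modified_iso(path):
    """Return the file's last-modified time as an ISO 8601 UTC string.

    Hypothetical helper for illustration; it is not part of Archivematica.
    """
    mtime = os.stat(path).st_mtime  # seconds since the epoch
    return datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat()
```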

Changed

Anti-virus changes

What has changed:

Antivirus (AV) scanning using ClamAV can now be configured up to ClamAV's maximum thresholds; previously, Archivematica was limited to ClamAV's default limits of around 25 MB. It was discovered that installing with the default limits resulted in false-positive PREMIS events: the events indicated that files over 25 MB had been scanned when they had not. In these instances there is still a possibility that a malicious payload exists, albeit a small one, because viruses are typically attached to smaller files intended to be sent via email. Archivematica will no longer record a PREMIS event when a file cannot be scanned because of the antivirus limits, or when anything else suggests that AV scanning did not complete successfully. Please see the Archivematica 1.7 documentation on the effects of different configuration settings. ClamAV currently has an upper limit of 2 GB per file, which is configured as the default in the Archivematica 1.7 release.
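The recording rule described above can be sketched as a small decision function. This is a simplified model, not Archivematica's code: the name `should_record_scan_event`, the `MAX_SCAN_BYTES` constant and the interpretation of 2 GB as 2 * 1024**3 bytes are all assumptions made for illustration.

```python
# Assumed ceiling for illustration: 2 GB read as 2 * 1024**3 bytes.
MAX_SCAN_BYTES = 2 * 1024 ** 3


def should_record_scan_event(file_size, scan_completed, max_scan_bytes=MAX_SCAN_BYTES):
    """Return True only when a virus-scan event can be recorded truthfully.

    Hypothetical helper: an event is recorded only if the file fits within
    the configured scanning limit and the scanner reported completion.
    """
    if file_size > max_scan_bytes:
        return False  # the scanner could not have covered the whole file
    return bool(scan_completed)
```

Under the older 25 MB default, `should_record_scan_event(30 * 1024 ** 2, True, max_scan_bytes=25 * 1024 ** 2)` returns `False`, which corresponds to the case where earlier releases recorded a misleading event.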


What does this mean?

If you are concerned about AIPs created pre-Archivematica 1.7 containing viruses, you can use AIP reingest to re-run antivirus. However, in the majority of cases this will not be necessary. Factors to consider include:

  • what is the source of the material? Do you have reason to mistrust this source?
  • what were the transfer protocols before the material was ingested by Archivematica? Did antivirus scanning occur with another tool during that process?
  • does your current storage system include periodic enterprise-wide virus scanning?

Archivematica decoupled from the FPR server

The FPR was originally created to manage preservation plans, i.e. business rules and tool commands for format-based preservation events. The FPR server’s purpose has been to house the default rules and commands while allowing institutions to make local alterations as desired.

Going forward, we have made the decision to remove Archivematica’s dependency on the FPR server for the following reasons:

  • The FPR rules on the FPR server are out of date. When new rules are added to the Preservation Planning tab in Archivematica, they aren't always being copied back to the FPR server.
  • Currently, new Archivematica installations ping the FPR server, which records the IP address of the remote server. This data capture isn't useful and it isn’t necessary.

Removing the FPR server as a dependency should not change how Archivematica is used compared with releases before 1.7.

The longer-term goal is to build a new FPR server, one that is completely decoupled from Archivematica, which would serve up format policy data to other applications using an open API. We would propose that this future registry should not rely on a single vendor for maintenance.

Pull requests: PR 971

DIP upload and storage workflow improvements

This work clarifies the sequence of the Upload DIP and Store DIP jobs on the Ingest tab. The processing configuration settings have also been updated so that almost every decision point can be automated (the exception is Upload DIP, which requires data entry).

This work was sponsored by MoMA, the MIT Libraries, and the University of York. Thank you!

Dashboard API whitelist mechanism

Two changes have been made to the API whitelist functionality:

  • The default API whitelist setting is now empty.
  • If the API whitelist setting is empty the user can still authenticate against the API using the key. The whitelist is only activated when at least one IP address is listed.
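One possible reading of these two rules is sketched below. It assumes that the API key is always required and that a non-empty whitelist additionally restricts which client addresses are accepted; that interpretation, the function name `is_request_allowed` and its parameters are assumptions, not the Dashboard's actual code.

```python
import ipaddress


def is_request_allowed(client_ip, api_key, expected_key, whitelist):
    """Decide whether an API request should be accepted.

    Hypothetical model of the behaviour described above:
    - the API key must match;
    - an empty whitelist imposes no IP restriction;
    - a non-empty whitelist requires the client IP to be listed.
    """
    if api_key != expected_key:
        return False
    if not whitelist:
        return True  # whitelist inactive: key authentication alone is enough
    client = ipaddress.ip_address(client_ip)
    return any(client == ipaddress.ip_address(entry) for entry in whitelist)
```

With an empty whitelist, `is_request_allowed("10.0.0.5", "k", "k", [])` is `True`; once `["10.0.0.5"]` is listed, a request from `10.0.0.6` with the same key is rejected.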

Default processing configuration

As a result of several features in this release, the processing configuration options have changed substantially, growing from 19 configurable decision points to 27. The new decision points include:

  • Assign UUIDs to directories (related to Handle Server integration)
  • Perform policy checks on originals (related to MediaConch integration)
  • Perform policy checks on preservation derivatives (related to MediaConch integration)
  • Perform policy checks on access derivatives (related to MediaConch integration)
  • Bind PIDs (related to Handle Server integration)
  • Document empty directories (related to Handle Server integration)
  • Upload DIP (related to DIP upload and storage workflow improvements)
  • Store DIP (related to DIP upload and storage workflow improvements)

We've also changed the default configuration, leaving more decision points set at "None", which will prompt the user to make a manual decision as a transfer moves through the Archivematica workflow. The purpose of these changes is to enable better testing: we think it's important that users see as many decision points in real time as possible while they are testing the system.

We've also set the compression level to 1 - fastest mode, which also facilitates testing as it wraps up AIP and DIP storage more quickly; however, it does mean that packages will not be as compressed as they could be. If you have limited space on your test machine, we recommend either deleting packages on a regular basis or changing the compression level to a higher number so that packages are smaller.
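The speed-versus-size trade-off between compression levels can be explored with a small experiment. The sketch below uses Python's `zlib` module as a stand-in, which is not the packaging tool Archivematica uses for AIPs; the function name `compressed_sizes` is hypothetical, and results will vary with the data.

```python
import zlib


def compressed_sizes(data, levels=(1, 9)):
    """Return {level: compressed size in bytes} for each zlib level.

    Illustration only: level 1 favours speed, level 9 favours size.
    """
    return {level: len(zlib.compress(data, level)) for level in levels}


# Sample payload; real packages will compress very differently.
sample = b"Archivematica test package payload\n" * 5000
sizes = compressed_sizes(sample)
print(len(sample), sizes)
```

Running this against representative files from your own transfers gives a more useful picture than synthetic data when deciding whether a higher level is worth the extra processing time.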

Note that changes to the default configuration should not override your local configuration during an upgrade or migration.

Miscellaneous changes

Fixed

Upgraded tools and dependencies

  • PRONOM updated to version 92
  • METS updated to version 1.11
  • Fido updated to 1.3.7
  • METS reader-writer updated to 0.2.0
  • AgentArchives updated to 0.3.0
  • Siegfried updated to 1.6.7

End of life dependencies

Several dependencies in Archivematica have reached end of life and resources did not allow for updates in 1.7. In future releases we will address these dependencies:

  • Django 1.8
  • Elasticsearch 1.7
  • Percona 5.5

To mitigate the risks associated with using these dependencies, you might consider:

  • Installing Archivematica behind a firewall
  • Using Archivematica without Elasticsearch (turn indexing off)