Format policy registry requirements
- The Archivematica project team has recognized the need for a way to manage format conversion preservation plans, referred to by the project as format policies, which will change as formats and community standards evolve. A format policy indicates the actions, tools and settings to apply to a particular file format.
- Until now, the Archivematica project has managed this information on the archivematica.org/preservation wiki page.
- The Format Policy Registry (FPR) will manage this information in a structured format (SQL/JSON).
- APIs with other serializations may be added (e.g. XML, RDF)
- The FPR will be hosted at archivematica.org/fpr/
- The FPR will also provide valuable online statistics about default format policy adoption as well as customizations amongst Archivematica users and will interface with other online registries (such as PRONOM and UDFR) to monitor and evaluate community-wide best practices.
- The FPR stores structured information about normalization format policies for preservation and access. These policies identify preferred preservation and access formats by media type. The choice of access formats is based on the ubiquity of viewers for the file format. Archivematica's preservation formats are all open standards; additionally, the choice of preservation format is based on community best practices, availability of open-source normalization tools, and an analysis of the significant characteristics for each media type.
- These default format policies can all be changed or enhanced by individual Archivematica implementers.
- Subscription to the FPR will allow the Archivematica project to notify users when new or updated preservation and access plans become available, allowing them to make better decisions about normalization and migration strategies for specific format types within their collections. It will also allow them to trigger migration processes as new tools and knowledge become available.
- One of the other primary goals of the FPR is to aggregate empirical information about institutional format policies to better identify community best practices. The FPR will provide a practical, community-based approach to OAIS preservation and access planning, allowing the Archivematica community of users to monitor and evaluate format policies as they are adopted, adapted and supplemented by real-world practitioners. The FPR APIs will be designed to share this information with the Archivematica user base as well as with other interested communities and projects.
- An early FPR prototype (called "Formatica") was developed by Heather Bowden, then Carolina Digital Curation Doctoral Fellow at the School of Information and Library Science in the University of North Carolina at Chapel Hill.
- provide an authenticated Web-based interface for creation and maintenance of policies
- provide a read-only RESTful Web API for accessing policies in JSON format
- provide an API for monitoring new and updated policies
- integrate with PRONOM to retrieve PUIDs
- model format policies so that they can be stored in a SQL database (MySQL, PostgreSQL, SQLite) on both client and server
- develop iteratively with an emphasis on getting working code in front of users as quickly as possible to make them part of the design process (see #fileidhack)
- developer notes
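
The modelling, JSON and monitoring requirements above can be illustrated with a minimal sketch. It is not the FPR data model: the table name, column names, function names and sample values are all assumptions made for illustration, and SQLite is used only because it ships with Python. The sketch stores a policy as a row that ties a PUID to a purpose, a tool and its settings, serializes matching rows as JSON, and answers a "what changed since" query of the kind a monitoring API might need.

```python
import json
import sqlite3

# Hypothetical schema: names are illustrative, not the actual FPR model.
SCHEMA = """
CREATE TABLE format_policy (
    id            INTEGER PRIMARY KEY,
    puid          TEXT NOT NULL,  -- PRONOM unique identifier of the source format
    purpose       TEXT NOT NULL,  -- 'preservation' or 'access'
    tool          TEXT NOT NULL,  -- tool to run for this policy
    settings      TEXT NOT NULL,  -- settings passed to the tool
    last_modified TEXT NOT NULL   -- ISO 8601 UTC timestamp
);
"""

COLUMNS = "puid, purpose, tool, settings, last_modified"


def open_registry():
    """Create an in-memory registry with the illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # rows can then be converted to dicts
    conn.executescript(SCHEMA)
    return conn


def add_policy(conn, puid, purpose, tool, settings, last_modified):
    """Insert one format policy row."""
    conn.execute(
        f"INSERT INTO format_policy ({COLUMNS}) VALUES (?, ?, ?, ?, ?)",
        (puid, purpose, tool, settings, last_modified),
    )
    conn.commit()


def policies_as_json(conn, puid):
    """Return all policies for a PUID as a JSON array (read-only access)."""
    rows = conn.execute(
        f"SELECT {COLUMNS} FROM format_policy WHERE puid = ? ORDER BY id",
        (puid,),
    ).fetchall()
    return json.dumps([dict(row) for row in rows])


def policies_updated_since(conn, timestamp):
    """Return policies modified after the given ISO 8601 UTC timestamp.

    String comparison is sufficient here because all timestamps are
    assumed to share the same fixed-width ISO 8601 format.
    """
    rows = conn.execute(
        f"SELECT {COLUMNS} FROM format_policy "
        "WHERE last_modified > ? ORDER BY last_modified",
        (timestamp,),
    ).fetchall()
    return [dict(row) for row in rows]


if __name__ == "__main__":
    conn = open_registry()
    # Sample values are placeholders, not real FPR policies.
    add_policy(conn, "fmt/43", "preservation", "example-tool",
               "--to tiff", "2012-01-01T00:00:00Z")
    add_policy(conn, "fmt/43", "access", "example-tool",
               "--to jpeg", "2012-06-01T00:00:00Z")
    print(policies_as_json(conn, "fmt/43"))
    print(policies_updated_since(conn, "2012-03-01T00:00:00Z"))
```

Keeping the policy as a flat row with plain text columns is one way to keep the same model portable across MySQL, PostgreSQL and SQLite; a real design would likely split tools and settings into their own tables.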