Difference between revisions of "Development roadmap: Archivematica"

Latest revision as of 17:53, 18 June 2019

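This roadmap page is no longer being maintained. We are now tracking the Archivematica roadmap in a public Trello board (https://trello.com/b/aB72IgiX/archivematica-roadmap). Please subscribe to the Archivematica Google Group (https://groups.google.com/forum/#!forum/archivematica) for release- and roadmap-related announcements.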

Archivematica development roadmap

This roadmap describes what Artefactual is working on for the Archivematica system. Sponsored work (development of features and enhancements funded by our development partners) is prioritized. Our wish list also includes enhancements and features that we would like to see or that the community has shown interest in; however, without development resources allocated to us or contributions from developers outside of Artefactual, we cannot guarantee their inclusion.

Reflecting the bounty business model for open source development, each feature is developed in partnership with an institution or group of institutions with unique workflow needs. Despite our best efforts to keep features as generic as possible, some extra development may be necessary for a feature to function well in your own environment. Please see the Archivematica services offered on Artefactual's website to find out more about how to become a development partner, get training and support, or take advantage of installation services.

We will issue public releases incrementally upon completion and testing of the sponsored features and enhancements listed below. All features are subject to code review and QA, either of which may cause a feature to be pushed to a future release.

Artefactual Labs

  • For other cool stuff we're working on, see Artefactual Labs (https://github.com/artefactual-labs)

Features by release number

1.7.1

See the 1.7.1 Milestone on our Waffleboard (https://waffle.io/artefactual/archivematica?search=cul:%20phase%202&milestone=1.7.1) for more information, or our 1.7.1 release notes for specific pull requests.

  • Sponsored (Columbia University Library) Performance enhancements
    • Change MCPClient to stop sending tool outputs through the job scheduler
    • Change MCPServer to require only return code from client tasks
  • Fix encrypted AIP storage spaces on CentOS/Ansible installs (0.11.1)
  • Fix storage for encrypted uncompressed AIPs (0.11.1)
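
The two performance changes listed above amount to keeping tool output out of the scheduler's message path and reporting only an exit status. A minimal sketch of that idea, assuming a hypothetical `run_task` helper (this is not Archivematica's actual MCPClient code): the tool's output goes to a local log file and only the return code is handed back.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_task(command, log_path):
    """Run a tool, write its output to log_path, and return only the exit code.

    Hypothetical helper: the name and signature are illustrative assumptions.
    """
    with open(log_path, "wb") as log:
        # stdout and stderr both go to the log file, never back to the caller.
        completed = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return completed.returncode


if __name__ == "__main__":
    log_file = Path(tempfile.mkdtemp()) / "task.log"
    exit_code = run_task([sys.executable, "-c", "print('tool output')"], log_file)
    print("return code:", exit_code)
```

The caller only ever sees a small integer, so the size of a tool's output no longer affects the traffic between client and server in this sketch.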

1.7/0.11

  • Sponsored (PREFORMA/MediaArea) MediaConch integration for audio-visual format verification
  • Sponsored (Rockefeller Archive Center) Importing object-level PREMIS rights via CSV
    • Using a CSV file included with the transfer, users can apply PREMIS rights to individual objects, rather than having all of the objects in the transfer inherit the rights applied to the SIP via the Add Rights metadata form
  • Sponsored (Canadian Council of Archives) Enable internationalization and localization of Archivematica
    • This will allow the Archivematica interface to be translated into languages other than English, using a translation system or application
  • Sponsored (Canadian Centre for Architecture) Change METS encoding to UTF-8
  • Sponsored (Canadian Centre for Architecture) Use default access rule if normal rule errors
  • Sponsored (Bentley Historical Library) Allow zip as archive format for AIPs stored in DSpace
  • Sponsored (Museum of Modern Art New York) AIP migration
    • This allows AIPs to be moved from one location to another via a new API endpoint. The API takes two arguments: 1) UUID of an existing package (AIP or DIP or transfer) and 2) the UUID of a Location.
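
As an illustration of the two-argument AIP migration call described above, here is a minimal sketch that builds (but does not send) such a request. The URL pattern, the `location_uuid` field name and the `build_move_request` helper are assumptions made for this example; check the Storage Service API documentation for the real endpoint.

```python
import json
import urllib.request


def build_move_request(base_url, package_uuid, location_uuid):
    """Build a request asking the Storage Service to move a package.

    Assumed for illustration: the '/api/v2/file/<uuid>/move/' path and the
    'location_uuid' field name.
    """
    url = f"{base_url.rstrip('/')}/api/v2/file/{package_uuid}/move/"
    body = json.dumps({"location_uuid": location_uuid}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_move_request(
        "http://localhost:8000",                 # hypothetical Storage Service URL
        "11111111-1111-1111-1111-111111111111",  # UUID of the package to move
        "22222222-2222-2222-2222-222222222222",  # UUID of the destination Location
    )
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; a real deployment would also need to supply authentication.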


Proposed/in development/experimental

These features are works in progress or have experimental/proof of concept status.

See also: Improvements

  • Sponsored (Simon Fraser University Archives) WARC file ingest
    • Analyze WARC header information and prepare metadata mapping to Archivematica AIP METS file
    • Add Archivematica micro-services to parse WARC header information to Archivematica METS file
  • Sponsored (Ontario Council of University Libraries) Dataverse integration (proof of concept)
    • The scope of sponsored work is a proof of concept model for integration of Dataverse with Archivematica. As design/development progresses, we will update the development roadmap accordingly.
  • Sponsored (University of York/University of Hull) Automated DIP generation workflow
    • Change workflow so that the ‘upload DIP’ choice can be preconfigured
    • Update AIP reingest workflow to allow uncompressed AIPs to be reingested. (DONE, version 1.6/0.10)
    • Enhance the callback functionality in the Storage Service to notify third-party apps when a DIP is ready to be used.
  • Sponsored (University of York/University of Hull) METS parsing tools
    • Develop the public-facing API of the REST service and define the API to return answers as JSON-LD or another linked data format
    • Develop a Python METSReader library that would live behind the REST service
    • Write documentation for the REST service
  • Sponsored (University of York/University of Hull) Generic search REST API (proof-of-concept)
    • Develop the public-facing API of the REST service: a read-only API providing a small number of endpoints to answer basic questions about the number of files in storage, their formats, date of ingest, etc.
    • Develop functionality in the Archivematica Storage Service to implement this API
    • Write documentation for the REST API
  • Sponsored (University of York/University of Hull) Enhance PRONOM integration
    • Allow a user to manually assign PRONOM IDs to non-identified files; record manual selection in the AIP METS file
    • Provide report of non-identified files in a SIP or AIP, with access to the file identification tool output
    • Provide direct access to the PRONOM submission form from within Archivematica.
  • Sponsored (University of York/University of Hull) Automation tools documentation
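
For the WARC file ingest item above, the first step (reading WARC header information so it can be mapped to metadata) can be sketched as follows. This is only an illustration under simplifying assumptions: the `parse_warc_header` name is hypothetical, the sample record is invented, and folded (multi-line) header values are not handled.

```python
def parse_warc_header(raw):
    """Parse the header block of one WARC record into (version, fields).

    Hypothetical helper: assumes CRLF line endings and no folded header lines.
    """
    header_block = raw.split(b"\r\n\r\n", 1)[0]
    lines = header_block.decode("utf-8").split("\r\n")
    version = lines[0]
    fields = {}
    for line in lines[1:]:
        # Split on the first colon only, so values such as dates stay intact.
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return version, fields


# Invented sample record header, for illustration only.
SAMPLE = (
    b"WARC/1.0\r\n"
    b"WARC-Type: warcinfo\r\n"
    b"WARC-Date: 2019-06-18T17:53:00Z\r\n"
    b"WARC-Record-ID: <urn:uuid:00000000-0000-0000-0000-000000000000>\r\n"
    b"Content-Length: 0\r\n"
    b"\r\n"
)

if __name__ == "__main__":
    version, fields = parse_warc_header(SAMPLE)
    print(version, fields["WARC-Type"], fields["WARC-Date"])
```

The resulting dictionary is the kind of structure a micro-service could then map onto elements of the AIP METS file; that mapping itself is not shown here.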

Fixity app

  • Sponsored (Simon Fraser University Archives) Better end-user documentation for the fixity app.

Wish list

This section describes enhancements and features that the Archivematica community would like to see researched and implemented; however, without development resources allocated to us or contributions from developers outside of Artefactual, we cannot guarantee their inclusion in an upcoming release. For unsponsored features, tasks and bugs without assigned releases, see also the list of unsponsored and unscheduled fixes, features and tasks (http://bit.ly/1eW9yRs).

Dashboard

  • User interface
    • Upload submission documentation during transfer upload #1910
    • Administrative dashboard interface for system monitoring, including status, restart services, maintenance of backups, tools for restoring, automatic indexing of ElasticSearch index
    • Indicator in dashboard of decision made at decision points
    • Indicator that Archivematica is currently processing
    • Status indicator to show current status of transfer/job
    • Reconsider icons and access to the Add Metadata / Rights templates (currently the icon matches the ‘report’ icon and it’s unclear when the ‘right’ time to add metadata is) and the Reminder: add metadata micro-service
    • Access tab, Archival Storage tab, Preservation Planning tab should have description of purpose of tab
    • Treat each tab as its own web application
    • Administrative access to Storage Service from Access tab
    • Task cogs containing no information should have a short descriptive indicator of why there is none (e.g., no tool output available)
    • Ability to choose a fallback identification tool when the selected tool fails
    • Hide AtoM user password in the user interface
  • SIP arrangement (see also #6791)
    • Visualization of transfer contents - #1578, Transfer and SIP creation#File visualization reporting page
    • Clean up of transfer backlog once arrangement is complete - in dashboard Admin? in Ingest?
    • Increase icon size and fix 'jumpiness' of content indicators
    • Include tooltips for buttons
    • Consider name change of 'originals' pane to 'transfer backlog search results' or the like
    • Create delete package request from Transfer backlog
  • Deposit tool
    • Configure transfer in GUI rather than in local filesystem for complex workflows (e.g., adding metadata files, checksums or manually normalized content)
    • Upload submission documentation (see above, may consider in Transfer dashboard tab)
    • Provide download link to METS file in AIP review that doesn't involve opening in browser (for larger METS files which time out)
  • Email ingest workflow
    • Improvements to e-mail ingest workflow (maildir)
  • AIP Reingest
    • Include option to run microservices on previously normalized files

Metadata

  • Capture PREMIS from external systems
  • Field validation in PREMIS rights templates - #1519
  • METS refactoring and METS generation improvements
    • Develop a standalone Python METS reader/writer application, distributed both separately from and integrated with Archivematica
    • In progress, see METS Reader & Writer (https://github.com/artefactual-labs/mets-reader-writer)
  • Change encoding of the METS file to UTF-8
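
To make the UTF-8 item above concrete, here is a minimal sketch of writing a METS document with an explicit UTF-8 declaration using Python's standard library. It does not use the METS Reader & Writer project mentioned above; the `write_mets_utf8` helper and the sample label are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def write_mets_utf8(path, label):
    """Write a bare METS root element to path, encoded as UTF-8.

    Hypothetical helper: a real METS file would carry many more sections.
    """
    ET.register_namespace("mets", METS_NS)
    root = ET.Element(f"{{{METS_NS}}}mets", {"LABEL": label})
    # encoding="UTF-8" keeps non-ASCII characters as UTF-8 bytes rather than
    # numeric character references; xml_declaration=True records the encoding.
    ET.ElementTree(root).write(path, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    write_mets_utf8("METS.example.xml", "Fonds d'archives québécois")
```

Reading the file back with `ET.parse` should return the label unchanged, which is a quick way to confirm that no characters were lost on the way out.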

Format Policy Registry

  • Format Policy Registry (FPR) public site UI
  • Ability to send local format policy changes to the FPR public site #5074

Storage Service

  • Move some/all DIP upload responsibilities to SS
  • Move Index AIP micro-service to SS
  • Automated deletion of content in transfer source once a successful AIP has been created and stored
  • Ability to send AIPs/DIPs to duplicate locations
  • Re-index transfer backlog, AIPs and DIPs - ElasticSearch re-indexing
  • Ability to select multiple packages from SS to download at once
  • Persistent data about stored AIPs and DIPs
  • DIP generation/upload info logged to pointer file
  • SAMBA plugin for Storage API
  • Move ElasticSearch to SS
  • Research management of processing space, so a transfer cannot be run if it's too big for the allotted space
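
The last item above (refusing a transfer that is too big for the allotted processing space) could be approached along these lines. This is a rough sketch, not a design decision: the function names and the `headroom` multiplier (extra room for derivatives created during processing) are assumptions.

```python
import os
import shutil


def transfer_size(path):
    """Return the total size in bytes of all files under path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


def fits_in_processing_space(transfer_path, processing_path, headroom=2.0):
    """Return True if the transfer, scaled by headroom, fits in the free space.

    headroom is a hypothetical multiplier, not an Archivematica setting.
    """
    needed = transfer_size(transfer_path) * headroom
    return needed <= shutil.disk_usage(processing_path).free


if __name__ == "__main__":
    print(fits_in_processing_space(".", "."))
```

A check like this would run before the transfer starts, so that an oversized transfer is rejected up front instead of failing partway through processing.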

Integration

  • AtoM - Send PREMIS rights metadata with DIP
  • Hydra (Ingest, AIP storage, API plugin)
  • DSpace (Ingest, DIP upload)
  • BitCurator integration: packages, bulk extractor reporting, how much functionality/data can be integrated/re-used prior to Archivematica ingest #1869

Fixity app

  • Add flag specifying number of AIPs to check simultaneously
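
The requested flag could behave roughly like the `max_workers` parameter in the sketch below. This is a local stand-in, not the fixity app's code: it hashes files directly, whereas the fixity app works through the Storage Service, and the `check_fixity` name and its arguments are assumptions.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(expected, max_workers=2):
    """Check several packages at once.

    expected maps a file path to its recorded SHA-256 digest; the result maps
    each path to True (digest matches) or False. max_workers caps how many
    checks run simultaneously, playing the role of the proposed flag.
    """
    paths = list(expected)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        actual = list(pool.map(sha256_of, paths))
    return {path: digest == expected[path] for path, digest in zip(paths, actual)}
```

Calling `check_fixity(recorded_digests, max_workers=4)` would verify up to four packages at a time; on a command line the same value could be exposed as an integer option.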