Metrics requirements

Revision as of 14:40, 23 March 2017 by Hbecker (talk | contribs) (Move to feature requirements category)


  • The Archivematica team is gathering metrics requirements to support digital repository management. The question we would like to answer is: what kinds of metrics would be useful to Archivematica users (e.g. performance and usage statistics, AIP/DIP monitoring, file format statistics)?
  • Our goal is to identify the foundational elements that future reporting features in Archivematica will need. We will prioritize the metrics and include as many as available funding and time allow in upcoming releases.


Use Cases

The following are some use cases generated by the Archivematica team and contributed by Archivematica users.

  • Output reports on the following:
  1. number of file format X ingested per month
  2. total disk size of ingests added per year
  3. number of failed normalizations per format X
  4. name of person who processed an AIP
  5. inventory of shares & mount points (e.g. SIP Source, /var/archivematica, AIP Destination, DIP Destination), including disk space (total, available, used & quota) and availability
  6. total count of file formats
  7. total disk usage per file format
  8. growth: monthly report showing growth by file size and by file format. Could also draw on descriptive metadata to show "collection growth"
  9. AIP Monitoring: Bit errors detected, bit errors corrected
  10. AIP average, median & max directory depth and directory breadth
  11. File size distribution, file size distribution per format
  12. Preservation and access normalization monitoring: Display not just file format but also version/specification
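
As an illustration of how reports 1, 2 and 7 above could be computed, here is a minimal Python sketch. It assumes each ingested file is available as a simple record with a format, a size and an ingest date; the IngestedFile type and the function names are hypothetical and are not part of Archivematica.

```python
from collections import Counter, defaultdict
from datetime import date
from typing import Iterable, NamedTuple


class IngestedFile(NamedTuple):
    """Hypothetical record for one ingested file (not an Archivematica type)."""
    file_format: str
    size_bytes: int
    ingested_on: date


def files_per_format_per_month(files: Iterable[IngestedFile]) -> Counter:
    """Report 1: number of files keyed by (format, 'YYYY-MM')."""
    return Counter(
        (f.file_format, f.ingested_on.strftime("%Y-%m")) for f in files
    )


def bytes_ingested_per_year(files: Iterable[IngestedFile]) -> dict:
    """Report 2: total size in bytes of ingested files, keyed by year."""
    totals = defaultdict(int)
    for f in files:
        totals[f.ingested_on.year] += f.size_bytes
    return dict(totals)


def bytes_per_format(files: Iterable[IngestedFile]) -> dict:
    """Report 7: total disk usage in bytes, keyed by file format."""
    totals = defaultdict(int)
    for f in files:
        totals[f.file_format] += f.size_bytes
    return dict(totals)


# Made-up sample data, for illustration only.
sample_files = [
    IngestedFile("PDF", 1000, date(2017, 1, 5)),
    IngestedFile("PDF", 2000, date(2017, 1, 20)),
    IngestedFile("TIFF", 5000, date(2017, 2, 1)),
    IngestedFile("PDF", 500, date(2016, 12, 31)),
]

if __name__ == "__main__":
    print(files_per_format_per_month(sample_files))
    print(bytes_ingested_per_year(sample_files))
    print(bytes_per_format(sample_files))
```

All three reports reduce to grouping the same per-file records by different keys, which suggests that capturing format, size and ingest date per file would cover several of the use cases at once.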


Data Model

  • Logged-in user (should be captured as PREMIS agent)
  • UUID of the Archivematica instance (should be captured as PREMIS agent)
  • Possibly also environment data: which machines Archivematica ran on, which versions of the tools were installed (already in PREMIS events), and which version of Archivematica was used (already in the software agent)

For AIPs:

  • AIP name - In METS structMap:
  • AIP UUID - In METS structMap:
  • AIP storage location - Not in METS file
  • Date placed in storage - Not in METS file
  • Date updated (when we have AIP versioning) - will be in METS header: <metsHdr CREATEDATE="2013-05-09T15:00:00" LASTMODDATE="2014-02-09T21:00:00">
  • AIP size - Not in METS file
  • Related DIP - Not in METS file
  • DIP storage location - Not in METS file
  • DIP size - Not in METS file
  • Date DIP uploaded - Not in METS file
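
For the dates that would be recorded in the METS header, a reporting tool could read them directly from the METS file. The following is a minimal sketch using Python's standard library, assuming the file uses the standard METS namespace and ISO 8601 timestamps like those in the metsHdr example in the list above. The function name read_mets_header_dates and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Standard METS namespace; the prefix used inside a given file may differ.
METS_NS = {"mets": "http://www.loc.gov/METS/"}


def read_mets_header_dates(mets_xml: str):
    """Return (created, last_modified) from <metsHdr>, or None where absent."""
    root = ET.fromstring(mets_xml)
    header = root.find("mets:metsHdr", METS_NS)
    if header is None:
        return None, None

    def parse(value):
        # Attribute values are expected to look like "2013-05-09T15:00:00".
        return datetime.fromisoformat(value) if value else None

    return parse(header.get("CREATEDATE")), parse(header.get("LASTMODDATE"))


# Made-up METS fragment, reduced to the header only.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:metsHdr CREATEDATE="2013-05-09T15:00:00"
                LASTMODDATE="2014-02-09T21:00:00"/>
</mets:mets>"""

if __name__ == "__main__":
    created, modified = read_mets_header_dates(SAMPLE_METS)
    print("created:", created)
    print("last modified:", modified)
```

The fields marked "Not in METS file" cannot be recovered this way and would have to come from another source, such as the storage layer.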

For transfer backups:

  • Transfer name
  • Transfer UUID
  • Transfer size - Not in METS file
  • Transfer storage location - Not in METS file
  • Date placed in storage - Not in METS file
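
The two field lists above overlap considerably, so one possible way to model them is a shared base record with an AIP-specific extension. The sketch below uses Python dataclasses; all class and field names are hypothetical and only mirror the bullets above, they are not an existing Archivematica schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
from uuid import UUID


@dataclass
class StoredPackage:
    """Fields common to AIPs and transfer backups (hypothetical names)."""
    name: str
    uuid: UUID
    size_bytes: int
    storage_location: str
    stored_at: datetime  # "Date placed in storage"


@dataclass
class TransferBackupRecord(StoredPackage):
    """A transfer backup needs only the shared fields."""


@dataclass
class AIPRecord(StoredPackage):
    """An AIP adds versioning and related-DIP information."""
    updated_at: Optional[datetime] = None  # once AIP versioning exists
    related_dip_uuid: Optional[UUID] = None
    dip_storage_location: Optional[str] = None
    dip_size_bytes: Optional[int] = None
    dip_uploaded_at: Optional[datetime] = None


if __name__ == "__main__":
    # Made-up values, for illustration only.
    aip = AIPRecord(
        name="example-aip",
        uuid=UUID("12345678-1234-5678-1234-567812345678"),
        size_bytes=1_048_576,
        storage_location="/example/aip-store",
        stored_at=datetime(2013, 5, 9, 15, 0, 0),
    )
    print(aip)
```

The DIP fields are optional here because an AIP may not have a related DIP.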