Performance Improvements
Revision as of 14:52, 8 June 2018
Make indexing configurable
Make search an optional feature in Archivematica so that it can be run with Elasticsearch turned off.
- The installation methods have been tested, documented, and released in Archivematica 1.7.0
Make capture output configurable
The next area selected for performance improvements was reducing processing time by changing how output streams are handled. In this phase, the automatic writing of standard output and standard error to the database was made configurable. When output capture is turned off, only a non-zero exit code (an error) is returned.
- This option has been tested, documented for all deployment methods, and released in Archivematica 1.7.1
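The idea of configurable output capture can be illustrated with a minimal sketch. This is not Archivematica's own code: the helper name `run_task` and its `capture_output` parameter are hypothetical, and the sketch only shows the general pattern of discarding the streams and keeping the exit code when capture is disabled.

```python
import subprocess
import sys


def run_task(command, capture_output=True):
    """Run a command and return (exit_code, stdout, stderr).

    Hypothetical helper for illustration only. When capture_output is
    False, both streams are discarded and returned as None, so the
    caller has only the exit code to inspect.
    """
    if capture_output:
        # Collect both streams as text so they could be stored later.
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        return result.returncode, result.stdout, result.stderr

    # Capture disabled: send both streams to the null device.
    result = subprocess.run(
        command,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode, None, None


if __name__ == "__main__":
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(run_task(failing, capture_output=False))
```

With capture disabled, a failing task is still detectable through its non-zero exit code, but there is no stored output to diagnose it with, which is the trade-off this option exposes.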
Make performance metrics accessible via the API
Whenever a preservation task is performed, Archivematica records its timing information (start and end times) in the MySQL database. Columbia University Library wants to be able to measure the processing time (performance) of Archivematica packages and their component microservices so that they can identify bottlenecks, estimate package processing times, and make informed decisions about their configuration.
The problem is that the relevant timing information is not exposed via Archivematica’s API endpoints and is only partially exposed via its GUI. In addition, since this information is internal (not exposed via a public API), it is subject to change and users are therefore wary of building features or implementing workflows that make use of it.
We are therefore implementing an API endpoint that returns processing performance details for a specified package (i.e., a transfer or an AIP) divided by microservice group. This endpoint will return the following data:
Phase of processing (transfer or ingest)
Microservice group
CPU time
Number of tasks
Duration (wall time)
- This feature will be released in the next minor release of Archivematica
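As a rough sketch of how per-task timing records could be rolled up into the fields listed above, the following groups tasks by phase and microservice group. The record keys (`phase`, `group`, `start`, `end`, `cpu_seconds`), the function name, and the sample data are all hypothetical, and the sketch assumes that wall-time duration means the span from the earliest task start to the latest task end within a group; the actual endpoint may define these differently.

```python
from datetime import datetime


def summarize_tasks(tasks):
    """Aggregate task records by (phase, microservice group).

    Each task is a dict with hypothetical keys: phase, group,
    start (datetime), end (datetime), cpu_seconds (float).
    """
    groups = {}
    for task in tasks:
        key = (task["phase"], task["group"])
        entry = groups.get(key)
        if entry is None:
            groups[key] = {
                "cpu_time": task["cpu_seconds"],
                "number_of_tasks": 1,
                "start": task["start"],
                "end": task["end"],
            }
        else:
            entry["cpu_time"] += task["cpu_seconds"]
            entry["number_of_tasks"] += 1
            entry["start"] = min(entry["start"], task["start"])
            entry["end"] = max(entry["end"], task["end"])

    # Assumption: duration is earliest start to latest end, in seconds.
    return [
        {
            "phase": phase,
            "microservice_group": group,
            "cpu_time": entry["cpu_time"],
            "number_of_tasks": entry["number_of_tasks"],
            "duration": (entry["end"] - entry["start"]).total_seconds(),
        }
        for (phase, group), entry in groups.items()
    ]


if __name__ == "__main__":
    # Illustrative sample data, not real measurements.
    sample = [
        {"phase": "Transfer", "group": "Verify transfer compliance",
         "start": datetime(2018, 6, 8, 10, 0, 0),
         "end": datetime(2018, 6, 8, 10, 0, 5), "cpu_seconds": 1.5},
        {"phase": "Transfer", "group": "Verify transfer compliance",
         "start": datetime(2018, 6, 8, 10, 0, 3),
         "end": datetime(2018, 6, 8, 10, 0, 12), "cpu_seconds": 2.0},
        {"phase": "Ingest", "group": "Normalize",
         "start": datetime(2018, 6, 8, 10, 1, 0),
         "end": datetime(2018, 6, 8, 10, 1, 30), "cpu_seconds": 12.0},
    ]
    for row in summarize_tasks(sample):
        print(row)
```

Note that CPU time and wall time can diverge in either direction: tasks that overlap in time can accumulate more CPU time than the elapsed duration, while tasks that wait on I/O can show the opposite.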