Storage Service API

Revision as of 15:06, 13 March 2017 by Hbecker (talk | contribs) (Initial structure & info)


The Storage Service API provides programmatic access for moving files between the storage areas that the Storage Service can reach.

The API is written using TastyPie.

Improvement Note: TastyPie is less well supported than Django REST Framework, both in terms of docs & community. We should look at replacing TastyPie with DRF. See also

Endpoints require authentication with a username and API key. These can be submitted as GET parameters (e.g. ?username=test&api_key=e6282adabed84e39ffe451f8bf6ff1a67c1fc9f2) or as a header (e.g. Authorization: ApiKey test:e6282adabed84e39ffe451f8bf6ff1a67c1fc9f2).
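The two authentication styles can be sketched with the Python standard library. This is a minimal illustration, not the project's own client: the host `http://localhost:8000` is an assumption, and the username and key are the sample values from the text above. The request object is built but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed values for illustration; substitute a real host, user and key.
BASE_URL = "http://localhost:8000"
USERNAME = "test"
API_KEY = "e6282adabed84e39ffe451f8bf6ff1a67c1fc9f2"


def auth_header(username, api_key):
    """Return the Authorization header in the 'ApiKey user:key' form."""
    return {"Authorization": "ApiKey {}:{}".format(username, api_key)}


def auth_query(username, api_key):
    """Return the same credentials encoded as GET parameters."""
    return urlencode({"username": username, "api_key": api_key})


# Header style: credentials travel in the Authorization header.
header_request = Request(
    BASE_URL + "/api/v2/pipeline/", headers=auth_header(USERNAME, API_KEY)
)

# Query-string style: credentials are appended to the URL.
query_url = BASE_URL + "/api/v2/pipeline/?" + auth_query(USERNAME, API_KEY)
```

The header style keeps the key out of URLs, which tend to end up in server logs.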

Endpoints return JSON. If there's an error, they will return a 4xx or 5xx HTTP error code and a JSON body {'error': True, 'message': 'message describing error'} (Is this true?)

Pipeline

Get all pipelines

  • URL: /api/v2/pipeline/
  • Verb: GET

Create new pipeline

  • URL: /api/v2/pipeline/
  • Verb: POST
  • Parameters: JSON body
    • create_default_locations: If True, will associate default Locations with the newly created pipeline
    • shared_path: If default locations are created, create the processing location at this path in the local filesystem

If the 'Pipelines disabled on creation' setting is set, the pipeline will be disabled by default, and will not respond to queries.
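A pipeline-creation request might be assembled as follows. This sketch sends only the two parameters documented above; a real deployment may require further fields that this page does not list. The host, credentials, and shared path used in the example call are assumptions.

```python
import json
from urllib.request import Request


def build_create_pipeline_request(base_url, username, api_key,
                                  shared_path, create_default_locations=True):
    """Build (but do not send) a POST request that creates a pipeline."""
    body = {
        "create_default_locations": create_default_locations,
        "shared_path": shared_path,
    }
    return Request(
        base_url.rstrip("/") + "/api/v2/pipeline/",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "ApiKey {}:{}".format(username, api_key),
        },
        method="POST",
    )


# Hypothetical host, credentials and path, for illustration only.
pipeline_request = build_create_pipeline_request(
    "http://localhost:8000", "test", "example-key", "/var/shared/"
)
```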

Get pipeline details

  • URL: /api/v2/pipeline/<UUID>/
  • Verb: GET

Space

Improvement Note: Is there no way to create Spaces in the API?

Get all spaces

  • URL: /api/v2/space/
  • Verb: GET

Get space details

  • URL: /api/v2/space/<UUID>/
  • Verb: GET

Browse space path

  • URL: /api/v2/space/<UUID>/browse/
  • Verb: GET
  • Parameters: Query string parameters
    • path: Path inside the Space to look
  • Response: JSON
    • entries: List of entries in `path`, files or directories
    • directories: List of directories in `path`. Subset of `entries`.
Version 1: Returns paths as strings

Version 2: Returns all paths base64 encoded
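Because version 2 returns base64-encoded paths, a client has to decode them before display. The sketch below assumes the decoded bytes are UTF-8 text, which this page does not guarantee, and the sample response is fabricated for the example rather than captured from a real server.

```python
import base64


def decode_browse_response(response):
    """Decode the base64 'entries' and 'directories' lists of a browse response."""
    def decode(items):
        # Assumption: paths are UTF-8 once decoded from base64.
        return [base64.b64decode(item).decode("utf-8") for item in items]

    return {
        "entries": decode(response["entries"]),
        "directories": decode(response["directories"]),
    }


def encode_path(path):
    """Helper used only to build the sample response below."""
    return base64.b64encode(path.encode("utf-8")).decode("ascii")


# Made-up sample shaped like the documented response.
sample_response = {
    "entries": [encode_path("dir1"), encode_path("file.txt")],
    "directories": [encode_path("dir1")],
}
decoded = decode_browse_response(sample_response)
```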

Location

Improvement Note: Is there no way to create Locations in the API?

Get all locations

  • URL: /api/v2/location/
  • Verb: GET

Get location details

  • URL: /api/v2/location/<UUID>/
  • Verb: GET

Move files to this location

  • URL: /api/v2/location/<UUID>/
  • Verb: POST
  • Parameters: JSON body
    • origin_location: URI of the Location the files should be moved from
    • pipeline: URI of the pipeline. Both Locations must be associated with this pipeline.
    • files: List of dicts containing source and destination. The source and destination are the paths of the files to be moved, each relative to its respective Location.

Intended for creating Transfers, SIPs, and other cases where files need to be moved but not tracked by the Storage Service.
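The JSON body for a move request could be built like this. The function name, the placeholder UUIDs, and the file paths are hypothetical; only the three keys and the source/destination structure come from the parameter list above.

```python
import json


def build_move_files_body(origin_location_uri, pipeline_uri, files):
    """Serialize a move request.

    `files` is an iterable of (source, destination) pairs, each path
    relative to its own Location.
    """
    return json.dumps({
        "origin_location": origin_location_uri,
        "pipeline": pipeline_uri,
        "files": [
            {"source": source, "destination": destination}
            for source, destination in files
        ],
    })


# Placeholder URIs and paths, for illustration only.
move_body = build_move_files_body(
    "/api/v2/location/ORIGIN-UUID/",
    "/api/v2/pipeline/PIPELINE-UUID/",
    [("transfers/a.txt", "incoming/a.txt")],
)
```

The body would be POSTed to the detail URL of the destination Location.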

Browse location path

  • URL: /api/v2/location/<UUID>/browse/
  • Verb: GET
  • Parameters: Query string parameters
    • path: Path inside the Location to look
  • Response: JSON
    • entries: List of entries in `path`, files or directories
    • directories: List of directories in `path`. Subset of `entries`.
Version 1: Returns paths as strings

Version 2: Returns all paths base64 encoded

SWORD collection

  • URL: /api/v2/location/<UUID>/sword/collection/
  • Verb: GET, POST

See Sword API for details.


Package

Get all packages

  • URL: /api/v2/file/
  • Verb: GET

Create new package

  • URL: /api/v2/file/
  • Verb: POST

Get package details

  • URL: /api/v2/file/<UUID>/
  • Verb: GET

? detail PUT

  • URL: /api/v2/file/<UUID>/
  • Verb: PUT

? detail PATCH

  • URL: /api/v2/file/<UUID>/
  • Verb: PATCH

Used to update the reingest status.

Delete AIP request

  • URL: /api/v2/file/<UUID>/delete_aip/
  • Verb: POST
  • Parameters: JSON body
    • event_reason: Reason for deleting the AIP
    • pipeline: URI of the pipeline the delete request is from
    • user_id: User ID requesting the deletion. This is the ID of the user on the pipeline, and must be an integer greater than 0.
    • user_email: Email of the user requesting the deletion.

Recover AIP request

  • URL: /api/v2/file/<UUID>/recover_aip/
  • Verb: POST
  • Parameters: JSON body
    • event_reason: Reason for recovering the AIP
    • pipeline: URI of the pipeline the recovery request is from
    • user_id: User ID requesting the recovery. This is the ID of the user on the pipeline, and must be an integer greater than 0.
    • user_email: Email of the user requesting the recovery.
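The delete and recover requests take the same four parameters, so one helper can build either body. This is a sketch with a hypothetical function name; the `user_id` check mirrors the "integer greater than 0" rule stated above, and the example values are made up.

```python
import json


def build_aip_request_body(event_reason, pipeline_uri, user_id, user_email):
    """Serialize the JSON body for a delete_aip or recover_aip request."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(user_id, bool) or not isinstance(user_id, int) or user_id <= 0:
        raise ValueError("user_id must be an integer greater than 0")
    return json.dumps({
        "event_reason": event_reason,
        "pipeline": pipeline_uri,
        "user_id": user_id,
        "user_email": user_email,
    })


# Made-up values, for illustration only.
delete_body = build_aip_request_body(
    "Duplicate of another AIP",
    "/api/v2/pipeline/PIPELINE-UUID/",
    1,
    "archivist@example.org",
)
```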

Download single file

  • URL: /api/v2/file/<UUID>/extract_file/
  • Verb: GET, HEAD
  • Parameters: Query string parameters
    • relative_path_to_file: Path to the file to download, relative to the package path.
  • Response: Stream of the requested file

Returns a single file from the Package. If the package is compressed, it downloads the whole AIP and extracts it.

This responds to HEAD because AtoM uses HEAD to check for the existence of a file.

Improvement Note: HEAD and GET should not perform the same functions. HEAD should be updated to not return the file, and to only check for existence. Currently, the Storage Service has no way to check whether a file exists except by downloading and extracting the AIP.

If the package is in Arkivum, the package may not actually be available. This endpoint checks if the package is locally available. If it is, it is returned as normal. If not, it returns 202 and emails the administrator about the attempted access.
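A client for this endpoint needs to URL-encode the relative path and to treat 202 differently from a normal success. The sketch below builds the URL and classifies a status code; the function names and the three labels are hypothetical, and the 200 case assumes a normal success is reported with that code.

```python
from urllib.parse import urlencode


def extract_file_url(base_url, package_uuid, relative_path):
    """Build the extract_file URL with the path passed as a query parameter."""
    query = urlencode({"relative_path_to_file": relative_path})
    return "{}/api/v2/file/{}/extract_file/?{}".format(
        base_url.rstrip("/"), package_uuid, query
    )


def classify_status(status_code):
    """Map an HTTP status code to a coarse outcome label."""
    if status_code == 200:  # assumed code for a normal success
        return "available"
    if status_code == 202:  # package is not locally available
        return "not locally available"
    return "error"


# Hypothetical host, placeholder UUID and path, for illustration only.
url = extract_file_url(
    "http://localhost:8000", "PACKAGE-UUID", "data/objects/a b.txt"
)
```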

Download package

  • URL: /api/v2/file/<UUID>/download/
  • URL: /api/v2/file/<UUID>/download/<chunk number>/ (for LOCKSS harvesting)
  • Verb: GET, HEAD
  • Parameters: None
  • Response: Stream of the package

Returns the entire package as a single file. If the AIP is uncompressed, it is combined into one file using `tar`.

If the download URL has a chunk number, it will attempt to serve the LOCKSS chunk specified for that package. If the package is not in LOCKSS, it will return the whole package.

This responds to HEAD because AtoM uses HEAD to check for the existence of a file.

Improvement Note: HEAD and GET should not perform the same functions. HEAD should be updated to not return the file, and to only check for existence.

If the package is in Arkivum, the package may not actually be available. This endpoint checks if the package is locally available. If it is, it is returned as normal. If not, it returns 202 and emails the administrator about the attempted access.

Get pointer file

  • URL: /api/v2/file/<UUID>/pointer_file/
  • Verb: GET
  • Parameters: None
  • Response: Stream of the pointer file.

Check fixity

  • URL: /api/v2/file/<UUID>/check_fixity/
  • Verb: GET
  • Parameters: Query string parameters
    • force_local: If true, download and run fixity on the AIP locally, instead of using the Space-provided fixity if available.
  • Response: JSON
    • success: True if the verification succeeded, False if the verification failed, None if the scan could not start
    • message: Human-readable string explaining the report; it will be empty for successful scans.
    • failures: List of 0 or more errors
    • timestamp: ISO-formatted string with the datetime of the last fixity check. If the check was performed by an external system, this will be provided by that system. If not provided, or on error, it will be None.
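The three possible values of `success` call for three different reactions on the client side. A minimal sketch of interpreting the response, assuming the Python values True, False and None arrive as JSON `true`, `false` and `null`; the function name and the sample payload are made up.

```python
import json


def summarize_fixity(report):
    """Turn a check_fixity response into a one-line summary."""
    success = report.get("success")
    if success is True:
        return "passed"
    if success is False:
        failures = report.get("failures") or []
        return "failed: {} failure(s)".format(len(failures))
    # success is None: the scan could not start.
    return "not run: {}".format(report.get("message", ""))


# Fabricated sample payload shaped like the documented response.
sample = json.loads(
    '{"success": null, "message": "Scan could not start",'
    ' "failures": [], "timestamp": null}'
)
summary = summarize_fixity(sample)
```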

AIP storage callback request

  • URL: /api/v2/file/<UUID>/send_callback/post_store/
  • Verb: GET

Request to call any Callbacks configured to run post-storage for this AIP.

Improvement Note: This only works on locally available AIPs (AIPs stored in Spaces that are available via a UNIX filesystem layer).

Get file information for package

  • URL: /api/v2/file/<UUID>/contents/
  • Verb: GET
  • Response: JSON
    • success: True
    • package: UUID of the package
    • files: List of dictionaries with file information. Each dictionary has:
      • source_id: UUID of the file to index
      • name: Relative path of the file inside the package
      • source_package: UUID of the SIP this file is from
      • checksum: Checksum of the file, or an empty string
      • accessionid: Accession number, or an empty string
      • origin: UUID of the Archivematica dashboard this is from

Returns metadata about every file within the package.

Update file information for package

  • URL: /api/v2/file/<UUID>/contents/
  • Verb: PUT
  • Parameters: JSON list of dictionaries with information on the files to be added. Each dict must have the following attributes:
    • relative_path: Relative path of the file inside the package
    • fileuuid: UUID of the file to index
    • accessionid: Accession number, or an empty string
    • sipuuid: UUID of the SIP this file is from
    • origin: UUID of the Archivematica dashboard this is from

Adds a set of files to a package.
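Since every dict in the PUT body must carry all five attributes, it can help to check them before sending. This is a sketch with a hypothetical helper name and placeholder values; the list of required keys is taken from the parameter list above.

```python
import json

REQUIRED_KEYS = {"relative_path", "fileuuid", "accessionid", "sipuuid", "origin"}


def build_contents_body(files):
    """Validate and serialize the list of file dicts for a contents PUT."""
    for index, entry in enumerate(files):
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            raise ValueError(
                "file {} is missing: {}".format(index, ", ".join(sorted(missing)))
            )
    return json.dumps(files)


# Placeholder identifiers, for illustration only.
contents_body = build_contents_body([{
    "relative_path": "data/objects/report.pdf",
    "fileuuid": "FILE-UUID",
    "accessionid": "",
    "sipuuid": "SIP-UUID",
    "origin": "DASHBOARD-UUID",
}])
```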

Delete file information for package

  • URL: /api/v2/file/<UUID>/contents/
  • Verb: DELETE

Removes all file records associated with this package.

Query file information on packages

  • URL: /api/v2/file/metadata/
  • Verb: GET, POST
  • Parameters: Query string parameters. Must have at least one, but not all are required
    • relative_path: Relative path of the file inside the package
    • fileuuid: UUID of the file
    • accessionid: Accession number
    • sipuuid: UUID of the SIP this file is from
  • Response: JSON. List of dicts with file information about the files that match the query.
    • accessionid: Accession number, or an empty string
    • file_extension: File extension
    • filename: Name of the file, sans path.
    • relative_path: Relative path of the file inside the package
    • fileuuid: UUID of the file to index
    • sipuuid: UUID of the SIP this file is from
    • origin: UUID of the Archivematica dashboard this is from
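The "at least one parameter" rule for the metadata query can be enforced when the URL is built. The following sketch uses a hypothetical function name and host; the accepted parameter names are the four listed above.

```python
from urllib.parse import urlencode

ALLOWED_PARAMETERS = ("relative_path", "fileuuid", "accessionid", "sipuuid")


def metadata_query_url(base_url, **criteria):
    """Build the file metadata query URL from one or more search criteria."""
    unknown = set(criteria) - set(ALLOWED_PARAMETERS)
    if unknown:
        raise ValueError("unknown parameters: {}".format(", ".join(sorted(unknown))))
    if not criteria:
        raise ValueError("at least one query parameter is required")
    # Sort so the resulting URL is stable regardless of argument order.
    query = urlencode(sorted(criteria.items()))
    return "{}/api/v2/file/metadata/?{}".format(base_url.rstrip("/"), query)


# Hypothetical host and placeholder identifiers, for illustration only.
metadata_url = metadata_query_url(
    "http://localhost:8000", sipuuid="SIP-UUID", accessionid="2017-001"
)
```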

Reingest AIP

  • URL: /api/v2/file/<UUID>/reingest/
  • Verb: POST
  • Parameters: JSON body
    • pipeline: UUID of the pipeline to reingest on
    • reingest_type: Type of reingest to start. One of METADATA_ONLY (metadata-only reingest), OBJECTS (partial reingest), FULL (full reingest)
    • processing_config: Optional. Name of the processing configuration to use on full reingest
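A reingest body might be built as below, rejecting any `reingest_type` outside the three documented values and including `processing_config` only when it is given. The function name and the example values are hypothetical.

```python
import json

REINGEST_TYPES = ("METADATA_ONLY", "OBJECTS", "FULL")


def build_reingest_body(pipeline_uuid, reingest_type, processing_config=None):
    """Serialize the JSON body for a reingest request."""
    if reingest_type not in REINGEST_TYPES:
        raise ValueError(
            "reingest_type must be one of: {}".format(", ".join(REINGEST_TYPES))
        )
    body = {"pipeline": pipeline_uuid, "reingest_type": reingest_type}
    # processing_config is optional, so it is left out unless provided.
    if processing_config is not None:
        body["processing_config"] = processing_config
    return json.dumps(body)


# Placeholder UUID and configuration name, for illustration only.
reingest_body = build_reingest_body("PIPELINE-UUID", "FULL", "default")
```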

SWORD endpoints

  • URL: /api/v2/file/<UUID>/sword/
  • URL: /api/v2/file/<UUID>/sword/media/
  • URL: /api/v2/file/<UUID>/sword/state/

See Sword API for details.