Difference between revisions of "Sword API"

From Archivematica
 
(123 intermediate revisions by 5 users not shown)
Line 1: Line 1:
 +
[[Main Page]] > [[Documentation]] > [[Requirements]] > Sword API
 +
 +
<div style="padding: 10px 10px; border: 1px solid black; background-color: #F79086;">This page is no longer being maintained and may contain inaccurate information. Please see the [https://www.archivematica.org/docs/latest/ Archivematica documentation] for up-to-date information.</div><p>
 +
 +
[[Category:Feature requirements]]
  
 
== Overview ==
 
== Overview ==
  
One of the 1.1 release features is a sponsored project to integrate Archivematica with Islandora.  This integration will be accomplished by creating a Sword 2.0 API.  Islandora development work will add functionality to Islandora to use this API to deposit digital objects into Archivematica.
+
One of the 1.1 release features is a sponsored project to integrate Archivematica with Islandora.  This integration will be accomplished by creating a [http://swordapp.org SWORD] 2.0 API for the Archivematica [https://www.archivematica.org/wiki/Administrator_manual_1.0#Storage_service storage service].  Islandora development work will add functionality to Islandora to use this API to deposit digital objects into Archivematica.
 +
 
 +
As described [[Overview|elsewhere]] on this wiki, ''"the primary function of Archivematica is to process digital transfers (accessioned digital objects), turn them into SIPs, apply format policies and create high-quality, repository-independent Archival Information Packages (AIP)"''.
 +
 
 +
The Archivematica Sword API allows 3rd party applications to automate the process of creating Transfers. 
 +
#Create Transfer - set the name and other metadata about a Transfer
 +
#Populate Transfer - add/edit/update digital objects in a Transfer, and associated metadata
 +
#Finalize Transfer - indicates that the Transfer is ready to start processing.
 +
 
 +
After content has been deposited, if users have access to the deposit directory they can also manually manipulate the deposit directory aside from any manipulation using the API.
 +
 
 +
Once a Transfer has been created, populated and finalized, Archivematica will begin processing that Transfer.
 +
 
 +
== Configuration ==
 +
 
 +
The SWORD API is part of the Archivematica storage service REST API. The storage service also communicates with the Archivematica dashboard, however, to notify it that a transfer created from a SWORD deposit has been approved for processing.
 +
 
 +
To use the SWORD API you must install the Archivematica dashboard 1.0 branch and the Archivematica storage service dev/issue-5980-islandora branch.
 +
 
 +
Dashboard configuration:
 +
 
 +
# In dashboard REST API administration, add the IP of the storage service server to IP whitelist
 +
# In dashboard user administration, take note of username and API key for a dashboard user (example: user "demo" and API key "4e5f32ab2aefd3543e1b19a2de554dd65f90108a")
 +
# In dashboard general administration, take note of the dashboard's UUID
 +
 
 +
Storage service configuration:
 +
 
 +
# In the storage service web interface's Pipelines tab, find the dashboard pipeline (using dashboard's UUID noted during dashboard configuration)
 +
# Enter the dashboard's IP address into the "Remote name" field and the user and API key noted earlier into the "Api username" and "Api key" fields. Click the button to submit these changes.
 +
 
 +
After configuration you'll need to find the space UUID of the SWORD server. To do so, in the web interface's Spaces tab, click "View Details and Locations" for the SWORD server. The UUID will be shown at the top of the screen. The UUID is used to figure out the initial API URL to access (for example: "http://localhost:8000/api/v1/space/c0bee7c8-3e9b-41e3-8600-ee9b2c475da2/sword/collection/").
 +
 
 +
== Example session ==
 +
 
 +
Below is an example session using curl to manipulate the API. In the example a deposit is created, a file is added to it, and it is finalized.
 +
 
 +
NOTE: The SWORD API is in the process of being moved to the storage service. These URLs may not work.
 +
 
 +
<pre># create new deposit, providing a METS file specifying digital object URLs to download in the background
 +
curl -v -H "In-Progress: true" --data-binary @mets.xml --request POST http://localhost:8000/api/v1/location/c0bee7c8-3e9b-41e3-8600-ee9b2c475da2/sword/collection/
 +
# response XML includes endpoints for adding additional files, etc.
 +
 
 +
# add a single file to the deposit
 +
curl -v -H "Content-Disposition: attachment; filename=cat.jpg" --request POST \
 +
    --data-binary "@cat.jpg" \
 +
    http://127.0.0.1:8000/api/v1/file/9c8b4ac0-0407-4360-a10d-af6c62a48b69/sword/media/
 +
 
 +
# start another background download of resources specified in a METS
 +
curl -v -H "In-Progress: true" -H "Packaging: METS" --data-binary @mets.xml --request POST http://127.0.0.1:8000/api/v1/file/9c8b4ac0-0407-4360-a10d-af6c62a48b69/sword/media/
 +
 
 +
# finalize transfer and approve processing
 +
curl -v -H "In-Progress: false" --request POST http://127.0.0.1:8000/api/v1/file/149cc29d-6472-4bcf-bee8-f8223bf60580/sword/
 +
</pre>
 +
 
 +
You might, at some point, want to list the transfers that have been started. The following curl command will do so.
 +
 
 +
<pre>curl -v http://localhost:8000/api/v1/location/c0bee7c8-3e9b-41e3-8600-ee9b2c475da2/sword/collection/</pre>
 +
 
 +
If you're working on a transfer and want to list what files are included in it, the following curl command will list them.
 +
 
 +
<pre># list files in transfer
 +
curl -v http://127.0.0.1:8000/api/v1/file/9c8b4ac0-0407-4360-a10d-af6c62a48b69/sword/media/</pre>
 +
 
 +
If you've started a deposit, but want to discard it, you can delete the deposit. The following curl command shows an example.
 +
 
 +
<pre>curl -v -XDELETE http://127.0.0.1:8000/api/v1/file/149cc29d-6472-4bcf-bee8-f8223bf60580/sword/</pre>
 +
 
 +
== Single step deposit ==
 +
 
 +
If you wanted to deposit a transfer then immediately finalize it, after background downloading is complete, you could use something like the following curl command.
 +
 
 +
<pre>curl -v -H "In-Progress: false" --data-binary @mets.xml --request POST http://localhost:8000/api/v1/location/c0bee7c8-3e9b-41e3-8600-ee9b2c475da2/sword/collection/</pre>
  
 
== Endpoints ==
 
== Endpoints ==
  
=== Create a new transfer ===
+
=== Service document ===
* HTTP POST to http://localhost/api/v2/transfer/
 
* used to start a new transfer in Archivematica
 
  
Each Transfer will have a set of endpoints available after creation.
+
The SWORD service document provides information about the SWORD provider's capacities and lists SWORD collections. Archivematica currently includes one SWORD collection to which SWORD deposits can be made: transfers.
  
=== Transfer Details ===
+
The SWORD service document endpoint is located at /api/v1/sword/.
* used to get basic info about a Transfer with HTTP GET
 
* can update basic info using HTTP PUT
 
* can allow deletion with HTTP DELETE
 
* follows the form: [archivematica hostname]/api/v2/transfer/[uuid of transfer]/
 
* example: http://localhost/api/v2/transfer/1225c695-cfb8-4ebb-aaaa-80da344efa6a/
 
  
=== EM-IRI ===
+
Service document example:
* Edit-Media IRI
+
<pre>
* used to add new objects to an existing transfer
+
<service xmlns:dcterms="http://purl.org/dc/terms/"
* follows the form: [archivematica hostname]/api/v2/transfer/edit/[uuid of transfer]/
+
  xmlns:sword="http://purl.org/net/sword/terms/"
* example: http://localhost/api/v2/transfer/1225c695-cfb8-4ebb-aaaa-80da344efa6a/edit-media/
+
  xmlns:atom="http://www.w3.org/2005/Atom"
 +
  xmlns="http://www.w3.org/2007/app">
  
An HTTP POST to the em-iri for a transfer should include an
+
  <sword:version>2.0</sword:version>
  
An HTTP DELETE to the em-iri will remove all digital objects from the Transfer.  This is a valid operation only while the Transfer is being assembled.  Once the Transfer has been finalized, attempting to DELETE will return an error.
+
  <workspace>
 +
    <atom:title>Archivematica storage service</atom:title>
  
=== Edit-IRI ===
+
   
The client can replace the metadata of a resource by performing an HTTP PUT of a new Atom Entry on the Edit-IRI.
+
    <collection href="http://192.168.1.231:8000/api/v1/space/8f24ef5f-19d5-4b77-8a38-b30d034b11e7/sword/collection/">
 +
      <atom:title>Collection</atom:title>
 +
      <accept>*/*</accept>
 +
      <accept alternate="multipart-related">*/*</accept>
 +
      <sword:mediation>false</sword:mediation>
 +
    </collection>
 +
   
 +
  </workspace>
 +
</service>
 +
</pre>
  
This would be used to update metadata about a transfer, such as the transfer name.
+
=== List existing transfers ===
 +
* HTTP Get to the Collection IRI  (defined by default as: /api/v1/space/[space UUID]/sword/collection/)
 +
* Transfers are listed as an Atom feed
 +
* Each Atom feed entry contains an "atom:link" element whose href attribute contains the URL needed to get details about the transfer
 +
* optional filters via get parameters (not yet implemented)
  
* example: http://localhost/api/v2/transfer/1225c695-cfb8-4ebb-aaaa-80da344efa6a/edit/
+
Example response:
 +
<pre>
 +
<atom:feed xmlns="http://www.w3.org/2005/Atom">
 +
  <atom:id>http://192.168.1.231:8000/api/v1/space/8f24ef5f-19d5-4b77-8a38-b30d034b11e7/sword/collection/</atom:id>
 +
  <atom:title type="text">Deposits</atom:title>
 +
  <atom:link href="http://192.168.1.231:8000/api/v1/space/8f24ef5f-19d5-4b77-8a38-b30d034b11e7/sword/collection/" rel="self">
 +
  </atom:link>
 +
    <atom:entry>
 +
    <atom:id>http://192.168.1.231:8000/api/v1/location/88e70cd2-9258-4a84-8938-16e74be032e6/sword/</atom:id>
 +
    <atom:title type="text">Cinderella</atom:title>
 +
    <atom:link href="http://192.168.1.231:8000/api/v1/location/88e70cd2-9258-4a84-8938-16e74be032e6/sword/" rel="self">
 +
    </atom:link>
 +
  </atom:entry>
 +
</atom:feed>
 +
</pre>
  
=== SE-IRI ===
+
=== Create a new transfer ===
<link rel="http://purl.org/net/sword/terms/add" href="http://localhost/api/v2/transfer/add/1225c695-cfb8-4ebb-aaaa-80da344efa6a/" />
+
* used to start a new transfer in Archivematica
 
+
* HTTP POST of an Atom Entry Document to the Collection IRI (defined by default as: /api/v1/space/[space UUID]/sword/collection/)
=== State-IRI ===
 
* used to retrieve status of transfer
 
* implemented as rdf document
 
* example: http://localhost/api/v2/transfer/1225c695-cfb8-4ebb-aaaa-80da344efa6a/status/
 
  
== Rough notes ==
+
'''Possible HTTP Response Codes'''
 +
* HTTP 200 OK - transfer already exists (not yet implemented)
 +
* HTTP 201 Created - transfer has been accepted
 +
* HTTP 412 Precondition Failed - required metadata missing or invalid
  
Step 1) Add Content
+
Valid requests will receive a Sword Deposit Receipt in the body of the response.
  
POST a single Fedora Object to Col-IRI
+
'''Required HTTP Headers'''
 +
The HTTP POST must include certain specific http headers. 
  
http://localhost/api/v2/transfer/create/
+
Required by http 1.1 specification:
 +
* Host: Must be set to archivematica host name.
 +
* Content-Length: Must be set to the length of the atom entry document.
 +
* Content-Type: Must be set to "application/atom+xml;type=entry".
  
described here:
+
Required by Archivematica. 
http://swordapp.github.io/SWORDv2-Profile/SWORDProfile.html#protocoloperations_creatingresource_entry
+
* Authorization: Must include the api key and username assigned by Archivematica.
  
You post will look like the sample below (content in [] should be replaced with appropriate values). Instead of dcterms embedded in the atom entry document, you would embed the fedora mets file for this object.
+
Required by SWORD 2.0 protocol:
 +
* In-Progress: Must be set to "true"
  
The <id> tag inside the <entry> should contain the AIP id (generated from user input or based on the collection).
+
Optional in SWORD 2.0 protocol:
 +
* On-Behalf-Of: not implemented by Archivematica
 +
* Slug: not implemented by Archivematica
  
To add additional content to the same AIP, you can either POST another
+
'''Required in Body'''
 +
The Body of the request must be a valid METS Document.  The required fields in an Atom Entry Document are:
 +
* LABEL: used as the transfer name
 +
* id: used as an accession id (not yet implemented)
 +
* author: used as the source of acquisition (not yet implemented)
 +
* summary: not implemented yet.  Could be used in the future as a description for the Transfer
 +
* updated: should be set to the current timestamp. 
  
 +
'''Example'''
 +
Example HTTP POST request:
 
<pre>
 
<pre>
POST http://localhost/api/v2/transfer/create/ HTTP/1.1
+
POST /api/v1/space/96606387-cc70-4b09-b422-a7220606488d/sword/collection/ HTTP/1.1
 
Host: localhost
 
Host: localhost
Authorization: Basic ZGFmZnk6c2VjZXJldA==
+
Authorization: Archivematica-API api_key="XXXXXXXXXXXXXXXXXXXX", username="timh"
Content-Length: [content length]
+
Content-Length: 213
 
Content-Type: application/atom+xml;type=entry
 
Content-Type: application/atom+xml;type=entry
 
In-Progress: true
 
In-Progress: true
On-Behalf-Of: [archivematica-user]
 
Slug: [suggested identifier]
 
 
<?xml version="1.0"?>
 
<entry xmlns="http://www.w3.org/2005/Atom"
 
        xmlns:dcterms="http://purl.org/dc/terms/">
 
    <title>Title</title>
 
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
 
    <updated>2005-10-07T17:17:08Z</updated>
 
    <author><name>Contributor</name></author>
 
    <summary type="text">The abstract</summary>
 
 
    <!-- some embedded metadata -->
 
    <dcterms:abstract>The abstract</dcterms:abstract>
 
    <dcterms:accessRights>Access Rights</dcterms:accessRights>
 
    <dcterms:alternative>Alternative Title</dcterms:alternative>
 
    <dcterms:available>Date Available</dcterms:available>
 
    <dcterms:bibliographicCitation>Bibliographic Citation</dcterms:bibliographicCitation>
 
    <dcterms:contributor>Contributor</dcterms:contributor>
 
    <dcterms:description>Description</dcterms:description>
 
    <dcterms:hasPart>Has Part</dcterms:hasPart>
 
    <dcterms:hasVersion>Has Version</dcterms:hasVersion>
 
    <dcterms:identifier>Identifier</dcterms:identifier>
 
    <dcterms:isPartOf>Is Part Of</dcterms:isPartOf>
 
    <dcterms:publisher>Publisher</dcterms:publisher>
 
    <dcterms:references>References</dcterms:references>
 
    <dcterms:rightsHolder>Rights Holder</dcterms:rightsHolder>
 
    <dcterms:source>Source</dcterms:source>
 
    <dcterms:title>Title</dcterms:title>
 
    <dcterms:type>Type</dcterms:type>
 
 
</entry>
 
  
 +
<?xml version="1.0" encoding="UTF-8"?>
 +
<METS:mets EXT_VERSION="1.1" OBJID="islandora:72" LABEL="Cinderella"
 +
  xmlns:METS="http://www.loc.gov/METS/"
 +
  xmlns:xlink="http://www.w3.org/1999/xlink"
 +
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 +
  xsi:schemaLocation="http://www.loc.gov/METS/ http://www.fedora.info/definitions/1/0/mets-fedora-ext1-1.xsd">
 +
    <METS:fileSec>
 +
        <METS:fileGrp ID="DATASTREAMS">
 +
            <METS:fileGrp ID="OBJ">
 +
                <METS:file ID="OBJ.0">
 +
                    <METS:FLocat xlink:title="Cinderella_(1865).pdf" LOCTYPE="URL"
 +
                      CHECKSUM="fad0d2f982f317a15804d2e5c4b9efcc" CHECKSUMTYPE="MD5"
 +
                      xlink:href="http://localhost:8080/fedora/get/islandora:72/OBJ/2013-08-12T14:46:43.454Z" />
 +
                </METS:file>
 +
            </METS:fileGrp>
 +
        </METS:fileGrp>
 +
    </METS:fileSec>
 +
</METS:mets>
 
</pre>
 
</pre>
  
The response back will be in the following form:
+
Example Response:
 
<pre>
 
<pre>
 
201 Created
 
201 Created
Location: http://localhost/api/v2/transfer/1225c695-cfb8-4ebb-aaaa-80da344efa6a
+
Location: http://localhost/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/
  
 
<entry xmlns="http://www.w3.org/2005/Atom"
 
<entry xmlns="http://www.w3.org/2005/Atom"
Line 112: Line 210:
 
         xmlns:dcterms="http://purl.org/dc/terms/">
 
         xmlns:dcterms="http://purl.org/dc/terms/">
  
     <title>My Deposit</title>
+
     <title>Cinderella</title>
     <id>info:something:1</id>
+
     <id>91d4f258-0050-4914-9d0c-6a74815358d2</id>
     <updated>2008-08-18T14:27:08Z</updated>
+
     <updated></updated>
    <summary type="text">A summary</summary>
+
     <generator uri="https://www.archivematica.org" version="1.0"/>
     <generator uri="http://www.myrepository.ac.uk/sword-plugin" version="1.0"/>
 
 
 
    <!-- the item's metadata -->
 
    <dcterms:abstract>The abstract</dcterms:abstract>
 
    <dcterms:accessRights>Access Rights</dcterms:accessRights>
 
    <dcterms:alternative>Alternative Title</dcterms:alternative>
 
    <dcterms:available>Date Available</dcterms:available>
 
    <dcterms:bibliographicCitation>Bibliographic Citation</dcterms:bibliographicCitation>
 
    <dcterms:contributor>Contributor</dcterms:contributor>
 
    <dcterms:description>Description</dcterms:description>
 
    <dcterms:hasPart>Has Part</dcterms:hasPart>
 
    <dcterms:hasVersion>Has Version</dcterms:hasVersion>
 
    <dcterms:identifier>Identifier</dcterms:identifier>
 
    <dcterms:isPartOf>Is Part Of</dcterms:isPartOf>
 
    <dcterms:publisher>Publisher</dcterms:publisher>
 
    <dcterms:references>References</dcterms:references>
 
    <dcterms:rightsHolder>Rights Holder</dcterms:rightsHolder>
 
    <dcterms:source>Source</dcterms:source>
 
    <dcterms:title>Title</dcterms:title>
 
    <dcterms:type>Type</dcterms:type>
 
 
 
    <!-- EM-IRI -->
 
    <link rel="edit-media" href="http://localhost/api/v2/transfer/edit/1225c695-cfb8-4ebb-aaaa-80da344efa6a/" />
 
 
 
    <!-- Edit-IRI -->
 
    <link rel="edit" href="http://localhost/api/v2/transfer/edit/1225c695-cfb8-4ebb-aaaa-80da344efa6a/" />
 
 
 
    <!-- SE-IRI -->
 
    <link rel="http://purl.org/net/sword/terms/add" href="http://localhost/api/v2/transfer/add/1225c695-cfb8-4ebb-aaaa-80da344efa6a/" />
 
 
 
    <!-- State-IRI -->
 
    <link rel="http://purl.org/net/sword/terms/statement"
 
            type="application/atom+xml;type=feed"
 
            href="http://localhost/api/v2/transfer/feed/1225c695-cfb8-4ebb-aaaa-80da344efa6a/" />
 
    <link rel="http://purl.org/net/sword/terms/statement"
 
            type="application/rdf+xml"
 
            href="http://localhost/api/v2/transfer/rdf/1225c695-cfb8-4ebb-aaaa-80da344efa6a/" />
 
  
 +
    <summary>A deposit was started.</summary>
 +
    <sword:treatment>A deposit directory was created and, optionally, asynchronous download of digital objects.</sword:treatment>
  
 +
    <link rel="alternate" href="http://www.swordserver.ac.uk/col1/mydeposit.html"/>
 +
    <link rel="edit-media" href="http://localhost/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/media/"/>
 +
    <link rel="edit" href="http://localhost/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/" />
 +
    <link rel="http://purl.org/net/sword/terms/add" href="http://192.168.1.76:8000/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/" />
 +
    <link rel="http://purl.org/net/sword/terms/statement" type="application/atom+xml;type=feed"
 +
      href="http://localhost/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/state/" />
 
</entry>
 
</entry>
 
 
</pre>
 
</pre>
  
To finalize an AIP
+
=== Add Files to a Transfer ===
  
blank HTTP POST to the SE-IRI for the AIP:
+
Post file data:
in this example, the POST would look like:
+
* It is possible to also POST the actual file to a deposit location's EM-IRI (/api/v1/location/[location UUID]/sword/media/)
 +
* The file should be posted as the request body and the filename specified using the Content-Disposition header (as per RFC 6266): for example "Content-Disposition: Attachment; filename=dog.jpg"
 +
* An MD5 checksum for an uploaded file can be provided via the Content-MD5 header (as per RFC 1544): for example "Content-MD5:  Q2hlY2sgSW50ZWdyaXR5IQ=="
 +
* If only an Atom Entry Document is POSTed, Archivematica will look for URIs in the file StructMap section of the METS file, and attempt to GET each file listed, to include in the Transfer. (not yet implemented)
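
The upload headers described in the list above can be sketched in Python. This is a minimal illustration using only the standard library, not the storage service's own code; the function names `content_md5` and `upload_headers` are hypothetical. It assumes the Content-MD5 value is the base64 encoding of the raw 16-byte MD5 digest, as the RFC specifies.

```python
import base64
import hashlib


def content_md5(data: bytes) -> str:
    """Return the Content-MD5 value: base64 of the raw MD5 digest of the body."""
    digest = hashlib.md5(data).digest()  # 16 raw bytes, not the hex string
    return base64.b64encode(digest).decode("ascii")


def upload_headers(filename: str, data: bytes) -> dict:
    """Build the headers for POSTing one file to a deposit's EM-IRI.

    Hypothetical helper: the header names come from the list above,
    the function itself is only an illustration.
    """
    return {
        "Content-Disposition": f"attachment; filename={filename}",
        "Content-MD5": content_md5(data),
    }


if __name__ == "__main__":
    print(upload_headers("dog.jpg", b"example file content"))
```

Computing the checksum from the same bytes that are sent as the request body gives the server a value it can compare against what it actually received.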
  
 +
=== Finalize a Transfer ===
 +
* POST with no body to the SE-IRI
 +
* set the In-Progress HTTP header to false
 +
* This will tell Archivematica that no further content will be added or removed from the Transfer. 
 +
* Archivematica will finish fetching any files that were added previously, and once they have all been downloaded, the Transfer will start through the Archivematica pipeline.
  
 +
'''Example'''
 
<pre>
 
<pre>
POST http://localhost/api/v2/transfer/add/1225c695-cfb8-4ebb-aaaa-80da344efa6a/ HTTP/1.1
+
POST http://localhost/api/v1/location/55ce8053-113a-4954-8aa3-fc9771da0bda/sword/ HTTP/1.1
 
Host: localhost
 
Host: localhost
Authorization: Basic ZGFmZnk6c2VjZXJldA==
+
Authorization: Archivematica-API api_key="XXXXXXXXXXXXXXXXXXXX", username="timh"
Content-Length: [content length]
+
Content-Length: 0
 
Content-Type: application/atom+xml;type=entry
 
Content-Type: application/atom+xml;type=entry
 
In-Progress: false
 
In-Progress: false
 
 
</pre>
 
</pre>
  
response will be HTTP 200/OK or 400/Error
+
The response will be HTTP 200/OK or 400/Error (400 if the Transfer was already finalized). If the transfer doesn't exist the response will be HTTP 404.
 
 
  
 +
=== Get Status of Transfer ===
 
To check Status:
 
To check Status:
  
Line 183: Line 259:
 
in this example:
 
in this example:
  
GET http://localhost/api/v2/transfer/feed/1225c695-cfb8-4ebb-aaaa-80da344efa6a/ HTTP/1.1
+
GET http://localhost/api/v1/location/09224ddb-4c7d-404c-9218-2f2e1b5a599e/sword/state/ HTTP/1.1
  
 
response will include:
 
response will include:
<pre>
+
<pre><atom:feed xmlns:sword="http://purl.org/net/sword/terms/"
<sword:state href="http://localhost/api/v2/transfer/feed/1225c695-cfb8-4ebb-aaaa-80da344efa6a/">
+
            xmlns:atom="http://www.w3.org/2005/Atom">
     <sword:stateDescription>The item has passed through the workflow and is now archived</sword:stateDescription>
+
 
</sword:state>
+
     <atom:category scheme="http://purl.org/net/sword/terms/state"
</pre>
+
        term="failed"
 +
        label="State">
 +
            Deposit initiation: Failed
 +
    </atom:category>
 +
 
 +
</atom:feed></pre>
 
the list of possible descriptions is not finalized.
 
the list of possible descriptions is not finalized.
 +
 +
The state term value will either be "processing" (asynchronous deposit still working), "complete" (ready to be finalized), or "failed" (asynchronous deposit encountered an error).
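
A client can read the state term out of the statement feed with a few lines of Python. The sketch below uses only the standard library and assumes the response has the shape of the example above, with the state carried in an `atom:category` element whose scheme is `http://purl.org/net/sword/terms/state`. The function name `deposit_state` is hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
STATE_SCHEME = "http://purl.org/net/sword/terms/state"

# Sample statement modelled on the example response above.
SAMPLE_STATEMENT = """<atom:feed xmlns:sword="http://purl.org/net/sword/terms/"
           xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:category scheme="http://purl.org/net/sword/terms/state"
      term="failed" label="State">
    Deposit initiation: Failed
  </atom:category>
</atom:feed>"""


def deposit_state(statement_xml: str):
    """Return the state term ("processing", "complete" or "failed"), or None."""
    root = ET.fromstring(statement_xml)
    for category in root.iter(f"{{{ATOM_NS}}}category"):
        # Only the category using the SWORD state scheme carries the state.
        if category.get("scheme") == STATE_SCHEME:
            return category.get("term")
    return None


if __name__ == "__main__":
    state = deposit_state(SAMPLE_STATEMENT)
    print(state, "- ready to finalize" if state == "complete" else "- not ready")
```

A client would typically fetch the statement repeatedly while the term is "processing" and finalize the transfer only once it reads "complete".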
 +
 +
== Additional Details ==
 +
 +
=== Transfer Details ===
 +
* redirect to Edit-IRI
 +
* used to get basic info about a Transfer with HTTP GET
 +
* can update basic info using HTTP PUT (not yet implemented)
 +
* can allow deletion with HTTP DELETE
 +
* follows the form: /api/v1/location/[deposit location UUID]/sword/
 +
* example: http://192.168.1.231:8000/api/v1/location/09224ddb-4c7d-404c-9218-2f2e1b5a599e/sword/
 +
 +
=== EM-IRI ===
 +
* Edit-Media IRI
 +
* used to add new objects to an existing transfer
 +
* follows the form: [archivematica hostname]/api/v2/transfer/[uuid of transfer]/media
 +
* example: /api/v1/location/[deposit location UUID]/sword/media/
 +
 +
An HTTP POST to the em-iri for a transfer should include a single file as the body. If the HTTP "Packaging" header is set to "METS" then METS XML can be sent as the body, specifying a list of digital object URLs to download in the background.
 +
 +
An HTTP GET should return a list of files in the transfer
 +
 +
An HTTP DELETE to the em-iri will remove all digital objects from the Transfer.  This is a valid operation only while the Transfer is being assembled.  Once the Transfer has been finalized, attempting to DELETE will return an error.
 +
 +
=== Edit-IRI ===
 +
The client can replace the metadata of a resource by performing an HTTP PUT of a new Atom Entry on the Edit-IRI. (not yet implemented)
 +
 +
This would be used to update metadata about a transfer, such as the transfer name.
 +
 +
* example: http://localhost/api/v2/transfer/1225c695-cfb8-4ebb-aaaa-80da344efa6a/edit
 +
 +
=== SE-IRI ===
 +
* example: http://localhost/api/v2/transfer/add/1225c695-cfb8-4ebb-aaaa-80da344efa6a (not yet implemented)
 +
 +
=== State-IRI ===
 +
* used to retrieve status of transfer
 +
* implemented as Atom document
 +
* example: /api/v1/location/[deposit location UUID]/sword/state/
 +
* should be able to subscribe like RSS feed
 +
 +
=== Service document ===

Latest revision as of 17:27, 11 February 2020


This page is no longer being maintained and may contain inaccurate information. Please see the Archivematica documentation for up-to-date information.

Overview

One of the 1.1 release features is a sponsored project to integrate Archivematica with Islandora. This integration will be accomplished by creating a SWORD 2.0 API for the Archivematica storage service. Islandora development work will add functionality to Islandora to use this API to deposit digital objects into Archivematica.

As described elsewhere on this wiki, "the primary function of Archivematica is to process digital transfers (accessioned digital objects), turn them into SIPs, apply format policies and create high-quality, repository-independent Archival Information Packages (AIP)".

The Archivematica SWORD API allows third-party applications to automate the process of creating Transfers.

  1. Create Transfer - set the name and other metadata about a Transfer
  2. Populate Transfer - add/edit/update digital objects in a Transfer, and associated metadata
  3. Finalize Transfer - indicates that the Transfer is ready to start processing.

After content has been deposited, users who have access to the deposit directory can also modify its contents manually, independently of any changes made through the API.

Once a Transfer has been created, populated and finalized, Archivematica will begin processing that Transfer.

Configuration

The SWORD API is part of the Archivematica storage service REST API. The storage service also communicates with the Archivematica dashboard, however, to notify it that a transfer created from a SWORD deposit has been approved for processing.

To use the SWORD API you must install the Archivematica dashboard 1.0 branch and the Archivematica storage service dev/issue-5980-islandora branch.

Dashboard configuration:

  1. In dashboard REST API administration, add the IP of the storage service server to IP whitelist
  2. In dashboard user administration, take note of username and API key for a dashboard user (example: user "demo" and API key "4e5f32ab2aefd3543e1b19a2de554dd65f90108a")
  3. In dashboard general administration, take note of the dashboard's UUID

Storage service configuration:

  1. In the storage service web interface's Pipelines tab, find the dashboard pipeline (using dashboard's UUID noted during dashboard configuration)
  2. Enter the dashboard's IP address into the "Remote name" field and the user and API key noted earlier into the "Api username" and "Api key" fields. Click the button to submit these changes.

After configuration you'll need to find the space UUID of the SWORD server. To do so, in the web interface's Spaces tab, click "View Details and Locations" for the SWORD server. The UUID will be shown at the top of the screen. The UUID is used to figure out the initial API URL to access (for example: "http://localhost:8000/api/v1/space/c0bee7c8-3e9b-41e3-8600-ee9b2c475da2/sword/collection/").
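
As a rough illustration of how the space UUID maps to the initial API URL, the Python sketch below builds the collection URL from a base URL and a UUID. The function name `collection_url` is hypothetical, and the path layout is taken from the example URL above rather than from the storage service source.

```python
from uuid import UUID


def collection_url(base_url: str, space_uuid: str) -> str:
    """Return the SWORD collection URL for a storage service space.

    Assumes the /api/v1/space/<uuid>/sword/collection/ layout shown in
    the example URL above.
    """
    # UUID() raises ValueError for malformed input and normalises the
    # text form to lowercase with hyphens.
    canonical = str(UUID(space_uuid))
    return f"{base_url.rstrip('/')}/api/v1/space/{canonical}/sword/collection/"


if __name__ == "__main__":
    print(collection_url("http://localhost:8000",
                         "c0bee7c8-3e9b-41e3-8600-ee9b2c475da2"))
```

Validating the UUID before building the URL catches copy-and-paste mistakes from the Spaces tab early, before any request is sent.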

Example session

Below is an example session using curl to manipulate the API. In the example a deposit is created, a file is added to it, and it is finalized.

NOTE: The SWORD API is in the process of being moved to the storage service. These URLs may not work.

# create new deposit, providing a METS file specifying digital object URLs to download in the background
curl -v -H "In-Progress: true" --data-binary @mets.xml --request POST http://localhost:8000/api/v1/location/c0bee7c8-3e9b-41e3-8600-ee9b2c475da2/sword/collection/
# response XML includes endpoints for adding additional files, etc.

# add a single file to the deposit
curl -v -H "Content-Disposition: attachment; filename=cat.jpg" --request POST \
    --data-binary "@cat.jpg" \
    http://127.0.0.1:8000/api/v1/file/9c8b4ac0-0407-4360-a10d-af6c62a48b69/sword/media/

# start another background download of resources specified in a METS
curl -v -H "In-Progress: true" -H "Packaging: METS" --data-binary @mets.xml --request POST http://127.0.0.1:8000/api/v1/file/9c8b4ac0-0407-4360-a10d-af6c62a48b69/sword/media/

# finalize transfer and approve processing
curl -v -H "In-Progress: false" --request POST http://127.0.0.1:8000/api/v1/file/149cc29d-6472-4bcf-bee8-f8223bf60580/sword/
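
The same three steps can be sketched in Python with the standard library. The code below only builds `urllib.request.Request` objects and does not send them. The helper names are hypothetical, and the URLs are passed in as parameters because the deposit and media URLs come from the response to the first request.

```python
from urllib.request import Request


def create_deposit(collection_url: str, mets_xml: bytes,
                   in_progress: bool = True) -> Request:
    """POST a METS document to the collection URL to start a deposit."""
    headers = {"In-Progress": "true" if in_progress else "false"}
    return Request(collection_url, data=mets_xml, headers=headers, method="POST")


def add_file(media_url: str, filename: str, content: bytes) -> Request:
    """POST a single file to the deposit's media URL."""
    headers = {"Content-Disposition": f"attachment; filename={filename}"}
    return Request(media_url, data=content, headers=headers, method="POST")


def finalize(deposit_url: str) -> Request:
    """POST an empty body with In-Progress: false to finalize the deposit."""
    return Request(deposit_url, data=b"",
                   headers={"In-Progress": "false"}, method="POST")


if __name__ == "__main__":
    req = finalize("http://127.0.0.1:8000/api/v1/file/"
                   "149cc29d-6472-4bcf-bee8-f8223bf60580/sword/")
    # To actually send it: urllib.request.urlopen(req)
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes each step easy to inspect or log before anything reaches the server.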

You might, at some point, want to list the transfers that have been started. The following curl command will do so.

curl -v http://localhost:8000/api/v1/location/c0bee7c8-3e9b-41e3-8600-ee9b2c475da2/sword/collection/

If you're working on a transfer and want to list what files are included in it, the following curl command will list them.

# list files in transfer
curl -v http://127.0.0.1:8000/api/v1/file/9c8b4ac0-0407-4360-a10d-af6c62a48b69/sword/media/

If you've started a deposit, but want to discard it, you can delete the deposit. The following curl command shows an example.

curl -v -XDELETE http://127.0.0.1:8000/api/v1/file/149cc29d-6472-4bcf-bee8-f8223bf60580/sword/

Single step deposit

To deposit a transfer and have it finalized as soon as background downloading is complete, set the In-Progress header to false in the initial request, as in the following curl command.

curl -v -H "In-Progress: false" --data-binary @mets.xml --request POST http://localhost:8000/api/v1/location/c0bee7c8-3e9b-41e3-8600-ee9b2c475da2/sword/collection/

Endpoints

Service document

The SWORD service document provides information about the SWORD provider's capabilities and lists SWORD collections. Archivematica currently includes one SWORD collection to which SWORD deposits can be made: transfers.

The SWORD service document endpoint is located at /api/v1/sword/.

Service document example:

<service xmlns:dcterms="http://purl.org/dc/terms/"
  xmlns:sword="http://purl.org/net/sword/terms/"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns="http://www.w3.org/2007/app">

  <sword:version>2.0</sword:version>

  <workspace>
    <atom:title>Archivematica storage service</atom:title>

    
    <collection href="http://192.168.1.231:8000/api/v1/space/8f24ef5f-19d5-4b77-8a38-b30d034b11e7/sword/collection/">
      <atom:title>Collection</atom:title>
      <accept>*/*</accept>
      <accept alternate="multipart-related">*/*</accept>
      <sword:mediation>false</sword:mediation>
    </collection>
    
  </workspace>
</service>
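
A client might discover the collection URL by parsing a service document like the one above. The Python sketch below uses only the standard library; the function name `list_collections` is hypothetical, and the sample is a trimmed copy of the example document.

```python
import xml.etree.ElementTree as ET

APP_NS = "http://www.w3.org/2007/app"
ATOM_NS = "http://www.w3.org/2005/Atom"

# Trimmed copy of the example service document above.
SAMPLE_SERVICE = """<service xmlns:sword="http://purl.org/net/sword/terms/"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns="http://www.w3.org/2007/app">
  <sword:version>2.0</sword:version>
  <workspace>
    <atom:title>Archivematica storage service</atom:title>
    <collection href="http://192.168.1.231:8000/api/v1/space/8f24ef5f-19d5-4b77-8a38-b30d034b11e7/sword/collection/">
      <atom:title>Collection</atom:title>
      <accept>*/*</accept>
    </collection>
  </workspace>
</service>"""


def list_collections(service_xml: str) -> list:
    """Return (title, href) pairs for every collection in a service document."""
    root = ET.fromstring(service_xml)
    found = []
    # <collection> lives in the AtomPub namespace, its title in the Atom one.
    for coll in root.iter(f"{{{APP_NS}}}collection"):
        title = coll.findtext(f"{{{ATOM_NS}}}title", default="")
        found.append((title, coll.get("href", "")))
    return found


if __name__ == "__main__":
    for title, href in list_collections(SAMPLE_SERVICE):
        print(title, href)
```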

List existing transfers

  • HTTP GET to the Collection IRI (defined by default as: /api/v1/space/[space UUID]/sword/collection/)
  • Transfers are listed as an Atom feed
  • Each Atom feed entry contains an "atom:link" element whose href attribute contains the URL needed to get details about the transfer
  • optional filters via GET parameters (not yet implemented)

Example response:

<atom:feed xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:id>http://192.168.1.231:8000/api/v1/space/8f24ef5f-19d5-4b77-8a38-b30d034b11e7/sword/collection/</atom:id>
  <atom:title type="text">Deposits</atom:title>
  <atom:link href="http://192.168.1.231:8000/api/v1/space/8f24ef5f-19d5-4b77-8a38-b30d034b11e7/sword/collection/" rel="self">
  </atom:link>
  <atom:entry>
    <atom:id>http://192.168.1.231:8000/api/v1/location/88e70cd2-9258-4a84-8938-16e74be032e6/sword/</atom:id>
    <atom:title type="text">Cinderella</atom:title>
    <atom:link href="http://192.168.1.231:8000/api/v1/location/88e70cd2-9258-4a84-8938-16e74be032e6/sword/" rel="self">
    </atom:link>
  </atom:entry>
</atom:feed>
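
A short sketch of how a client might pull the transfer names and detail URLs out of such a feed. It assumes the feed binds the atom prefix to the Atom namespace (http://www.w3.org/2005/Atom) and has already been retrieved as a string; list_transfers is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def list_transfers(feed_xml):
    """Return a (title, detail_url) pair for each entry in the collection feed."""
    root = ET.fromstring(feed_xml)
    transfers = []
    for entry in root.iterfind("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        link = entry.find("atom:link", ATOM)
        href = link.get("href") if link is not None else None
        transfers.append((title, href))
    return transfers


# Sample feed modelled on the example response.
FEED = """<atom:feed xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:title type="text">Deposits</atom:title>
  <atom:entry>
    <atom:id>http://192.168.1.231:8000/api/v1/location/88e70cd2-9258-4a84-8938-16e74be032e6/sword/</atom:id>
    <atom:title type="text">Cinderella</atom:title>
    <atom:link href="http://192.168.1.231:8000/api/v1/location/88e70cd2-9258-4a84-8938-16e74be032e6/sword/" rel="self"/>
  </atom:entry>
</atom:feed>"""

transfers = list_transfers(FEED)
print(transfers)
```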

Create a new transfer

  • used to start a new transfer in Archivematica
  • HTTP POST of an Atom Entry Document to the Collection IRI (defined by default as: /api/v1/space/[space UUID]/sword/collection/)

Possible HTTP Response Codes

  • HTTP 200 OK - transfer already exists (not yet implemented)
  • HTTP 201 Created - transfer has been accepted
  • HTTP 412 Precondition Failed - required metadata missing or invalid

Valid requests will receive a SWORD Deposit Receipt in the body of the response.

Required HTTP Headers

The HTTP POST must include certain specific HTTP headers.

Required by the HTTP/1.1 specification:

  • Host: Must be set to the Archivematica host name.
  • Content-Length: Must be set to the length of the Atom Entry Document.
  • Content-Type: Must be set to "application/atom+xml;type=entry".

Required by Archivematica:

  • Authorization: Must include the api key and username assigned by Archivematica.

Required by SWORD 2.0 protocol:

  • In-Progress: Must be set to "true"

Optional in SWORD 2.0 protocol:

  • On-Behalf-Of: not implemented by Archivematica
  • Slug: not implemented by Archivematica

Required in Body

The body of the request must be a valid METS document. The required fields in an Atom Entry Document are:

  • LABEL: used as the transfer name
  • id: used as an accession id (not yet implemented)
  • author: used as the source of acquisition (not yet implemented)
  • summary: not implemented yet. Could be used in the future as a description for the Transfer
  • updated: should be set to the current timestamp.

Example

Example HTTP POST request:

POST /api/v1/space/96606387-cc70-4b09-b422-a7220606488d/sword/collection/ HTTP/1.1
Host: localhost
Authorization: Archivematica-API api_key="XXXXXXXXXXXXXXXXXXXX", username="timh"
Content-Length: 213
Content-Type: application/atom+xml;type=entry
In-Progress: true

<?xml version="1.0" encoding="UTF-8"?>
<METS:mets EXT_VERSION="1.1" OBJID="islandora:72" LABEL="Cinderella"
  xmlns:METS="http://www.loc.gov/METS/"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.loc.gov/METS/ http://www.fedora.info/definitions/1/0/mets-fedora-ext1-1.xsd">
    <METS:fileSec>
        <METS:fileGrp ID="DATASTREAMS">
            <METS:fileGrp ID="OBJ">
                <METS:file ID="OBJ.0">
                    <METS:FLocat xlink:title="Cinderella_(1865).pdf" LOCTYPE="URL"
                      CHECKSUM="fad0d2f982f317a15804d2e5c4b9efcc" CHECKSUMTYPE="MD5" 
                      xlink:href="http://localhost:8080/fedora/get/islandora:72/OBJ/2013-08-12T14:46:43.454Z" />
                </METS:file>
            </METS:fileGrp>
        </METS:fileGrp>
    </METS:fileSec>
</METS:mets>
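
The request above could be assembled with Python's standard library roughly as follows. This is only a sketch: build_create_transfer_request is a hypothetical helper, the METS body is a placeholder, and the request is constructed but not sent (urllib adds the Host header itself when the request is opened).

```python
import urllib.request


def build_create_transfer_request(collection_iri, mets_xml, username, api_key):
    """Build (but do not send) the POST that starts a new transfer."""
    body = mets_xml.encode("utf-8")
    headers = {
        # Authorization format copied from the example request.
        "Authorization": 'Archivematica-API api_key="%s", username="%s"'
        % (api_key, username),
        "Content-Type": "application/atom+xml;type=entry",
        "Content-Length": str(len(body)),  # length of the encoded body in bytes
        "In-Progress": "true",
    }
    return urllib.request.Request(
        collection_iri, data=body, headers=headers, method="POST"
    )


# Placeholder METS document; a real one would list the files as in the example.
mets = '<METS:mets xmlns:METS="http://www.loc.gov/METS/" LABEL="Cinderella"/>'
req = build_create_transfer_request(
    "http://localhost/api/v1/space/96606387-cc70-4b09-b422-a7220606488d/sword/collection/",
    mets,
    "timh",
    "XXXXXXXXXXXXXXXXXXXX",
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req), which raises HTTPError on 4xx responses.
```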

Example Response:

201 Created
Location: http://localhost/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/

<entry xmlns="http://www.w3.org/2005/Atom"
        xmlns:sword="http://purl.org/net/sword/"
        xmlns:dcterms="http://purl.org/dc/terms/">

    <title>Cinderella</title>
    <id>91d4f258-0050-4914-9d0c-6a74815358d2</id>
    <updated></updated>
    <generator uri="https://www.archivematica.org" version="1.0"/>

    <summary>A deposit was started.</summary>
    <sword:treatment>A deposit directory was created and, optionally, asynchronous download of digital objects.</sword:treatment>

    <link rel="alternate" href="http://www.swordserver.ac.uk/col1/mydeposit.html"/>
    <link rel="edit-media" href="http://localhost/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/media/"/>
    <link rel="edit" href="http://localhost/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/" />
    <link rel="http://purl.org/net/sword/terms/add" href="http://192.168.1.76:8000/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/" />
    <link rel="http://purl.org/net/sword/terms/statement" type="application/atom+xml;type=feed"
      href="http://localhost/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/state/" />
</entry>
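
The link elements in the deposit receipt give the client the IRIs it needs for later steps. A minimal sketch of extracting them by their rel value, assuming the receipt uses the Atom namespace as its default namespace as in the example; receipt_links is a hypothetical name.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def receipt_links(receipt_xml):
    """Map each link's rel attribute to its href in a deposit receipt."""
    root = ET.fromstring(receipt_xml)
    return {link.get("rel"): link.get("href") for link in root.iter(ATOM_NS + "link")}


# Sample receipt trimmed from the example response.
RECEIPT = """<entry xmlns="http://www.w3.org/2005/Atom"
        xmlns:sword="http://purl.org/net/sword/"
        xmlns:dcterms="http://purl.org/dc/terms/">
    <title>Cinderella</title>
    <id>91d4f258-0050-4914-9d0c-6a74815358d2</id>
    <link rel="edit-media" href="http://localhost/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/media/"/>
    <link rel="edit" href="http://localhost/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/" />
    <link rel="http://purl.org/net/sword/terms/statement" type="application/atom+xml;type=feed"
      href="http://localhost/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/state/" />
</entry>"""

links = receipt_links(RECEIPT)
print(links["edit-media"])
```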

Add Files to a Transfer

Post file data:

  • It is also possible to POST the actual file to a deposit location's EM-IRI (/api/v1/location/[location UUID]/sword/media/)
  • The file should be posted as the request body and the filename specified using the Content-Disposition header (as per RFC 6266): for example "Content-Disposition: Attachment; filename=dog.jpg"
  • An MD5 checksum for an uploaded file can be provided via the Content-MD5 header (as per RFC 1544): for example "Content-MD5: Q2hlY2sgSW50ZWdyaXR5IQ=="
  • If only an Atom Entry Document is POSTed, Archivematica will look for URIs in the StructMap section of the METS file and attempt to GET each file listed, to include in the Transfer. (not yet implemented)
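
A sketch of the file upload described in the points above. The Content-MD5 value is the base64 encoding of the raw 16-byte MD5 digest, not of its hexadecimal form. The helper name build_upload_request is hypothetical, the filename is assumed to be a simple token with no spaces or quotes, and the request is built without being sent.

```python
import base64
import hashlib
import urllib.request


def build_upload_request(em_iri, filename, data, username, api_key):
    """Build (but do not send) a POST of one file to a deposit's EM-IRI."""
    # Content-MD5 carries the base64-encoded binary digest of the body.
    md5_b64 = base64.b64encode(hashlib.md5(data).digest()).decode("ascii")
    headers = {
        "Authorization": 'Archivematica-API api_key="%s", username="%s"'
        % (api_key, username),
        "Content-Disposition": "Attachment; filename=%s" % filename,
        "Content-MD5": md5_b64,
        "Content-Length": str(len(data)),
    }
    return urllib.request.Request(em_iri, data=data, headers=headers, method="POST")


payload = b"example file contents"
upload = build_upload_request(
    "http://localhost/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/media/",
    "dog.jpg",
    payload,
    "timh",
    "XXXXXXXXXXXXXXXXXXXX",
)
print(dict(upload.header_items()))
```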

Finalize a Transfer

  • POST with no body to the SE-IRI.
  • Set the In-Progress HTTP header to false.
  • This tells Archivematica that no further content will be added to or removed from the Transfer.
  • Archivematica will finish fetching any files that were added previously; once they have all been downloaded, the Transfer will start through the Archivematica pipeline.

Example

POST http://localhost/api/v1/location/55ce8053-113a-4954-8aa3-fc9771da0bda/sword/ HTTP/1.1
Host: localhost
Authorization: Archivematica-API api_key="XXXXXXXXXXXXXXXXXXXX", username="timh"
Content-Length: 0
Content-Type: application/atom+xml;type=entry
In-Progress: false

The response will be HTTP 200/OK or 400/Error (400 if the Transfer was already finalized). If the transfer doesn't exist the response will be HTTP 404.
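
As an illustration, the finalize request and the three documented outcomes could be handled along these lines. Both function names are hypothetical, and the wording of the status descriptions is this sketch's own, not text returned by Archivematica.

```python
import urllib.request


def build_finalize_request(se_iri, username, api_key):
    """Build (but do not send) the empty-bodied POST that finalizes a transfer."""
    headers = {
        "Authorization": 'Archivematica-API api_key="%s", username="%s"'
        % (api_key, username),
        "Content-Length": "0",
        "Content-Type": "application/atom+xml;type=entry",
        "In-Progress": "false",
    }
    return urllib.request.Request(se_iri, data=b"", headers=headers, method="POST")


# Status codes documented for the finalize call.
FINALIZE_OUTCOMES = {
    200: "transfer finalized",
    400: "error, for example the transfer was already finalized",
    404: "transfer not found",
}


def describe_finalize_status(code):
    """Translate an HTTP status code from the finalize call into a short message."""
    return FINALIZE_OUTCOMES.get(code, "unexpected status %d" % code)


finalize = build_finalize_request(
    "http://localhost/api/v1/location/55ce8053-113a-4954-8aa3-fc9771da0bda/sword/",
    "timh",
    "XXXXXXXXXXXXXXXXXXXX",
)
print(finalize.get_method(), describe_finalize_status(200))
```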

Get Status of Transfer

To check the status of a transfer, GET its State-IRI. For example:

GET http://localhost/api/v1/location/09224ddb-4c7d-404c-9218-2f2e1b5a599e/sword/state/ HTTP/1.1

The response will include:

<atom:feed xmlns:sword="http://purl.org/net/sword/terms/" 
            xmlns:atom="http://www.w3.org/2005/Atom">

    <atom:category scheme="http://purl.org/net/sword/terms/state"
        term="failed"
        label="State">
            Deposit initiation: Failed
    </atom:category>

</atom:feed>

The list of possible descriptions is not finalized.

The state term value will either be "processing" (asynchronous deposit still working), "complete" (ready to be finalized), or "failed" (asynchronous deposit encountered an error).
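
A client polling the State-IRI only needs the term attribute of the state category. A minimal sketch of reading it, assuming a response shaped like the example; deposit_state is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
STATE_SCHEME = "http://purl.org/net/sword/terms/state"


def deposit_state(statement_xml):
    """Return (term, description) from the state category of a statement feed."""
    root = ET.fromstring(statement_xml)
    for category in root.iter(ATOM + "category"):
        if category.get("scheme") == STATE_SCHEME:
            return category.get("term"), (category.text or "").strip()
    return None, ""


# Sample statement copied from the example response.
STATEMENT = """<atom:feed xmlns:sword="http://purl.org/net/sword/terms/"
            xmlns:atom="http://www.w3.org/2005/Atom">
    <atom:category scheme="http://purl.org/net/sword/terms/state"
        term="failed"
        label="State">
            Deposit initiation: Failed
    </atom:category>
</atom:feed>"""

term, description = deposit_state(STATEMENT)
print(term, description)
```

The term can then be compared against "processing", "complete", and "failed" to decide whether to keep polling, finalize, or report an error.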

Additional Details

Transfer Details

EM-IRI

  • Edit-Media IRI
  • used to add new objects to an existing transfer
  • follows the form: [Archivematica hostname]/api/v1/location/[deposit location UUID]/sword/media/
  • example: http://localhost/api/v1/location/91d4f258-0050-4914-9d0c-6a74815358d2/sword/media/

An HTTP POST to the EM-IRI for a transfer should include a single file as the body. If the HTTP "Packaging" header is set to "METS", then METS XML can be sent as the body, specifying a list of digital object URLs to download in the background.

An HTTP GET should return a list of files in the transfer.

An HTTP DELETE to the EM-IRI will remove all digital objects from the Transfer. This is a valid operation only while the Transfer is being assembled. Once the Transfer has been finalized, attempting to DELETE will return an error.

Edit-IRI

The client can replace the metadata of a resource by performing an HTTP PUT of a new Atom Entry on the Edit-IRI. (not yet implemented)

This would be used to update metadata about a transfer, such as the transfer name.

SE-IRI

State-IRI

  • used to retrieve the status of a transfer
  • implemented as an Atom document
  • example: /api/v1/location/[deposit location UUID]/sword/state/
  • clients should be able to subscribe to it like an RSS feed

Service document