Difference between revisions of "Administrator manual 0.10"

Revision as of 16:52, 7 May 2013

This manual covers administrator-specific instructions for Archivematica. It will also provide help for using forms in the Administration tab of the Archivematica dashboard and the administrator capabilities in the Format Policy Registry (FPR), which you will find in the Preservation planning tab of the dashboard.

For end-user instructions, please see the user manual.

Installation

Upgrading

  • Currently, Archivematica does not support upgrading from one version to the next. A re-install is required.

Dashboard administration tab

The Archivematica administration pages, under the Administration tab of the dashboard, allow you to configure application components and manage users.

Transfer source directories

Archivematica allows you to start transfers using the operating system's file browser or via a web interface. Source files for transfers, however, can't be uploaded using the web interface: they must exist on volumes accessible to the Archivematica server.

When starting a transfer you're required to select one or more directories of files to add to the transfer. To speed up the process of selecting directories, Archivematica allows you to specify "source directories". A source directory is a directory in which files and directories likely to be added to a transfer are present.

To add a source directory, while on the transfer source directories page of the Administration tab in the dashboard, simply click the folder icon to expand the starting directory and navigate the interface until you find a directory you'd like to select as a source directory. Once you've found a suitable source directory, click the "Add" button to the right of the directory and it will be added.

To remove a source directory, simply click the "Remove" button to the right of it in the source directory path list.

AIP storage directories

AIP storage directories are directories in which completed AIPs are stored. Storage directories can be specified in a manner similar to source directories.

To add a storage directory, while on the storage directories page of the Administration tab in the dashboard, simply click the folder icon to expand the starting directory and navigate the interface until you find a directory you'd like to select as a storage directory. Once you've found a suitable storage directory, click the "Add" button to the right of the directory and it will be added.

To remove a storage directory, simply click the "Remove" button to the right of it in the storage directory path list.

AtoM DIP upload

Archivematica can upload DIPs directly to an AtoM website so the contents can be accessed online. The AtoM DIP upload configuration page is where you specify the details of the AtoM installation you'd like the DIPs uploaded to (and, if using Rsync to transfer the DIP files, Rsync transfer details).

The parameters that you'll most likely want to set are url, email, and password. These parameters, respectively, specify the destination AtoM website's URL, the email address used to log in to the website, and the password used to log in to the website.

AtoM DIP upload can also use Rsync as a transfer mechanism. Rsync is an open source utility for efficiently transferring files. The rsync-target parameter is used to specify an Rsync-style target host/directory pairing, "foobar.com:~/dips/" for example. The rsync-command parameter is used to specify rsync connection options, "ssh -p 22222 -l user" for example. If you are using the rsync option, please see AtoM server configuration below.

To set parameters for AtoM DIP upload, edit the values in the "Command arguments" field, preserving the format in which they are specified, then click "Save".

Note that in AtoM, the sword plugin (Admin --> qtSwordPlugin) and job scheduling (Admin --> Settings --> Job scheduling) must both be enabled in order for AtoM to receive uploaded DIPs.

AtoM server configuration

This server configuration step is only necessary when using the rsync option described above in the AtoM DIP upload section. It allows Archivematica to log in to the AtoM server without a password.

To enable sending DIPs from Archivematica to the AtoM server:

Generate SSH keys for the Archivematica user. Leave the passphrase field blank.

 $ sudo -i -u archivematica
 $ cd ~
 $ ssh-keygen

Copy the contents of /var/lib/archivematica/.ssh/id_rsa.pub somewhere handy; you will need it later.

Next, configure the AtoM server so Archivematica can send DIPs using SSH/rsync. Create a user called archivematica and assign that user a restricted shell with access only to rsync:

 $ sudo apt-get install rssh
 $ sudo useradd -d /home/archivematica -m -s /usr/bin/rssh archivematica
 $ sudo passwd -l archivematica
 $ sudo vim /etc/rssh.conf # Make sure that allowrsync is uncommented!

Add the SSH key generated earlier:

 $ sudo mkdir /home/archivematica/.ssh
 $ sudo chmod 700 /home/archivematica/.ssh/
 $ sudo vim /home/archivematica/.ssh/authorized_keys # Paste here the contents of id_rsa.pub
 $ sudo chown -R archivematica:archivematica /home/archivematica

In Archivematica, go to the Administration > Upload DIP page in the dashboard and update --rsync-target accordingly.
These are the parameters passed to the upload-qubit microservice.

Generic parameters:

--url="http://atom-hostname/index.php" \
--email="demo@example.com" \
--password="demo" \
--uuid="%SIPUUID%" \
--rsync-target="archivematica@atom-hostname:/tmp" \
--debug

CONTENTdm DIP upload

Archivematica can also upload DIPs to CONTENTdm websites. Multiple CONTENTdm destinations may be configured. For each possible CONTENTdm DIP upload destination, you'll specify a brief description and configuration parameters appropriate for the destination. Possible parameters include %ContentdmServer%, %ContentdmUser%, and %ContentdmGroup%.

When changing parameters for a CONTENTdm DIP upload destination, simply change the values, preserving the format in which they are specified. To add an upload destination, fill in the form at the bottom of the page with the appropriate values. When you've completed your changes, click the "Save" button.

Processing configuration

When processing a SIP or transfer, you may want to automate some of the workflow choices. Choices can be preconfigured by putting a 'processingMCP.xml' file into the root directory of a SIP/transfer.

If a SIP or transfer is submitted with a 'processingMCP.xml' file, processing decisions will be made with the included file.

The XML file format is:

<processingMCP>
  <preconfiguredChoices>
    <preconfiguredChoice>
      <appliesTo>Workflow decision - create transfer backup</appliesTo>
      <goToChain>Do not backup transfer</goToChain>
    </preconfiguredChoice>
    <preconfiguredChoice>
      <appliesTo>Workflow decision - send transfer to quarantine</appliesTo>
      <goToChain>Skip quarantine</goToChain>
    </preconfiguredChoice>
    <preconfiguredChoice>
      <appliesTo>Remove from quarantine</appliesTo>
      <goToChain>Unquarantine</goToChain>
      <delay unitCtime="yes">50</delay>
    </preconfiguredChoice>
  </preconfiguredChoices>
</processingMCP>

Where appliesTo is the name of the job presented in the dashboard, and goToChain is the desired selection. Note: these are case sensitive. The default processingMCP.xml file is located at '/var/archivematica/sharedDirectory/sharedMicroServiceTasksConfigs/processingMCPConfigs/defaultProcessingMCP.xml'.
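The file format above can also be produced by a script. The following is a minimal sketch in Python 3 using only the standard library; the helper name build_processing_mcp is hypothetical, and the job and chain names are simply the ones from the example above. It is an illustration of the format, not a tool shipped with Archivematica.

```python
import xml.etree.ElementTree as ET


def build_processing_mcp(choices):
    """Build processingMCP XML from (appliesTo, goToChain, delay) tuples.

    delay is None, or a value written as <delay unitCtime="yes">.
    The function name and tuple layout are illustrative assumptions.
    """
    root = ET.Element("processingMCP")
    container = ET.SubElement(root, "preconfiguredChoices")
    for applies_to, go_to_chain, delay in choices:
        choice = ET.SubElement(container, "preconfiguredChoice")
        # Both values are case sensitive and must match the dashboard text.
        ET.SubElement(choice, "appliesTo").text = applies_to
        ET.SubElement(choice, "goToChain").text = go_to_chain
        if delay is not None:
            delay_el = ET.SubElement(choice, "delay", unitCtime="yes")
            delay_el.text = str(delay)
    return ET.tostring(root, encoding="unicode")


xml_text = build_processing_mcp([
    ("Workflow decision - create transfer backup", "Do not backup transfer", None),
    ("Workflow decision - send transfer to quarantine", "Skip quarantine", None),
    ("Remove from quarantine", "Unquarantine", 50),
])
print(xml_text)
```

The resulting text would be saved as 'processingMCP.xml' in the root directory of the SIP or transfer.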

The processing configuration administration page of the dashboard provides you with an easy form to configure the default 'processingMCP.xml' that's added to a SIP or transfer if it doesn't already contain one. When you change the options using the web interface the necessary XML will be written behind the scenes.

Processing configuration form in Administration tab of the dashboard
  • For the approval (yes/no) steps, the user ticks the box on the left-hand side to make a choice. If the box is not ticked, the approval step will appear in the dashboard.
  • For the other steps, if no actions are selected, the choices appear in the dashboard.
  • You can select whether or not to send transfers to quarantine (yes/no) and decide how long you'd like them to stay there.
  • You can approve normalization, sending the AIP to storage, and uploading the DIP without interrupting the workflow in the dashboard.
  • You can pre-select which format identification tool to base your normalization upon.
  • You can choose to send a transfer to backlog or to create a SIP every time.
  • You can select between lzma and bzip algorithms for AIP compression.
  • For select compression level, the options are as follows:
    • 9 - ultra compression
    • 7 - maximum compression
    • 3 - fast compression mode
    • 1 - fastest mode
    • 0 - copy mode
  • You can select one archival storage location where you will consistently send your AIPs.

PREMIS agent

The PREMIS agent name and code can be set via the administration interface.

REST API

In addition to automation using the processingMCP.xml file, Archivematica includes a REST API for automating transfer approval. Using this API, you can create a custom script that copies a transfer to the appropriate directory then uses the curl command, or some other means, to let Archivematica know that the copy is complete.

API keys

Use of the REST API requires the use of API keys. An API key is associated with a specific user. To generate an API key for a user:

  1. Browse to /administration/accounts/list/
  2. Click the "Edit" button for the user you'd like to generate an API key for
  3. Click the "Regenerate API key" checkbox
  4. Click "Save"

After generating an API key, you can click the "Edit" button for the user and you should see the API key.

IP whitelist

In addition to creating API keys, you'll need to add the IP of any computer making REST requests to the REST API whitelist. The IP whitelist can be edited in the administration interface at /administration/api/.

Approving a transfer

The REST API can be used to approve a transfer. The transfer must first be copied into the appropriate watch directory. To determine the location of the appropriate watch directory, first figure out where the shared directory is from the sharedDirectory value of /etc/archivematica/MCPServer/serverConfig.conf. Within that directory is a subdirectory activeTransfers. In this subdirectory are watch directories for the various transfer types.
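As a rough illustration of the lookup described above, the following Python 3 sketch reads a sharedDirectory value and derives the activeTransfers path. It assumes the configuration file is INI-style; the [MCPServer] section name and the sample path are assumptions made for the example, so the sketch searches every section rather than relying on one name.

```python
import configparser
import os


def find_shared_directory(config_text):
    """Return the sharedDirectory value from INI-style text, or None."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case sensitive
    parser.read_string(config_text)
    for section in parser.sections():
        if parser.has_option(section, "sharedDirectory"):
            return parser.get(section, "sharedDirectory")
    return None


def active_transfers_directory(shared_directory):
    """Return the activeTransfers subdirectory of the shared directory."""
    return os.path.join(shared_directory, "activeTransfers")


# Sample text standing in for /etc/archivematica/MCPServer/serverConfig.conf.
# The section name and path below are assumptions for illustration only.
sample_config = """
[MCPServer]
sharedDirectory = /var/archivematica/sharedDirectory/
"""

shared = find_shared_directory(sample_config)
print(active_transfers_directory(shared))
```

On a real server, the text would be read from the configuration file itself and the watch directories for each transfer type listed from the resulting path.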

When using the REST API to approve a transfer, if a transfer type isn't specified, the transfer will be deemed a standard transfer.

HTTP Method: POST

URL: /api/transfer/approve

Parameters:

directory: directory name of the transfer

type (optional): transfer type [standard|dspace|unzipped bag|zipped bag]

api_key: an API key

username: the username associated with the API key

Example curl command:

   curl --data "username=rick&api_key=f12d6b323872b3cef0b71be64eddd52f87b851a6&type=standard&directory=MyTransfer" http://127.0.0.1/api/transfer/approve

Example result:

   {"message": "Approval successful."}
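The same request can be made from a script instead of curl. This is a minimal sketch in Python 3 using only the standard library; the function names are hypothetical, and the host, username and API key are the example values from the curl command above. The request is built but not sent, so the sketch can be read without a running dashboard.

```python
import json
import urllib.parse
import urllib.request


def build_approve_request(base_url, username, api_key, directory,
                          transfer_type="standard"):
    """Build (but do not send) the POST request for /api/transfer/approve."""
    body = urllib.parse.urlencode({
        "username": username,
        "api_key": api_key,
        "type": transfer_type,
        "directory": directory,
    }).encode("ascii")
    url = base_url.rstrip("/") + "/api/transfer/approve"
    return urllib.request.Request(url, data=body, method="POST")


def send_request(request):
    """Send the request and return the decoded JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_approve_request(
    "http://127.0.0.1",
    "rick",
    "f12d6b323872b3cef0b71be64eddd52f87b851a6",
    "MyTransfer",
)
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))

# Against a running dashboard, the request would be sent with:
# print(send_request(request)["message"])
```

Remember that the machine running the script must be on the IP whitelist and the transfer must already be in the watch directory.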

Listing unapproved transfers

The REST API can be used to get a list of unapproved transfers. Each transfer's directory name and type are returned.

HTTP Method: GET

URL: /api/transfer/unapproved

Parameters:

api_key: an API key

username: the username associated with the API key

Example curl command:

   curl "http://127.0.0.1/api/transfer/unapproved?username=rick&api_key=f12d6b323872b3cef0b71be64eddd52f87b851a6"

Example result:

   {
       "message": "Fetched unapproved transfers successfully.",
       "results": [{
               "directory": "MyTransfer",
               "type": "standard"
           }
       ]
   }
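A script that polls for unapproved transfers needs to build the query URL and pick the directory names out of the reply. The following Python 3 sketch shows one way to do both; the function names are hypothetical, and the sample reply is the example result shown above rather than live output.

```python
import json
import urllib.parse


def unapproved_url(base_url, username, api_key):
    """Return the GET URL for /api/transfer/unapproved."""
    query = urllib.parse.urlencode({"username": username, "api_key": api_key})
    return base_url.rstrip("/") + "/api/transfer/unapproved?" + query


def transfer_directories(reply_text, transfer_type=None):
    """Return directory names from a reply, optionally filtered by type."""
    reply = json.loads(reply_text)
    return [
        item["directory"]
        for item in reply.get("results", [])
        if transfer_type is None or item.get("type") == transfer_type
    ]


# The example result from above, used here in place of a live reply.
sample_reply = """
{
    "message": "Fetched unapproved transfers successfully.",
    "results": [{"directory": "MyTransfer", "type": "standard"}]
}
"""

url = unapproved_url("http://127.0.0.1", "rick",
                     "f12d6b323872b3cef0b71be64eddd52f87b851a6")
print(url)
print(transfer_directories(sample_reply))
```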

Users

The dashboard provides a simple cookie-based user authentication system using the Django authentication framework. Access to the dashboard is limited only to logged-in users and a login page will be shown when the user is not recognized. If the application can't find any user in the database, the user creation page will be shown instead, allowing the creation of an administrator account.

Users can also be created, modified and deleted from the Administration tab. Here you can manage which users have access to Archivematica and what level of access they have. Standard users are able to access all sections of the interface except for the administration section.

You can add a new user to the system by clicking the "Add new" button on the user administration page. These users won't have administrator capability. By adding a user you provide a way to access Archivematica using a username/password combination. Should you need to change a user's username or password, you can do so by clicking the "Edit" button, corresponding to the user, on the administration page. Should you need to revoke a user's access, you can click the corresponding "Delete" button.

CLI creation of administrative users

If you need an additional administrator user, one can be created via the command-line after navigating to the src/dashboard/src directory in the source tree.

   python manage.py createsuperuser --settings='settings.common'

CLI password resetting

If you've forgotten the password for your administrator user, or any other user, you can change it via the command-line.

   python manage.py changepassword <username> --settings='settings.common'

Security

Archivematica uses PBKDF2 as the default algorithm to store passwords. This should be sufficient for most users: it's quite secure, requiring massive amounts of computing time to break. However, other algorithms could be used as the following document explains: How Django stores passwords.
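To give a sense of why PBKDF2 is costly to attack, the sketch below derives and checks a password hash with Python 3's standard library. It is an illustration only, not the dashboard's code: the function names, the iteration count and the salt length are assumptions, and the output merely follows the general algorithm$iterations$salt$hash layout that Django uses for stored passwords.

```python
import base64
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=10000):
    """Return 'pbkdf2_sha256$<iterations>$<salt>$<base64 digest>'.

    The iteration count is an illustrative default, not a recommendation.
    """
    if salt is None:
        # 9 random bytes encode to 12 base64 characters with no padding.
        salt = base64.b64encode(os.urandom(9)).decode("ascii")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt.encode("ascii"), iterations
    )
    encoded_digest = base64.b64encode(digest).decode("ascii")
    return "pbkdf2_sha256${}${}${}".format(iterations, salt, encoded_digest)


def check_password(password, encoded):
    """Re-derive the hash with the stored salt and iterations, then compare."""
    _algorithm, iterations, salt, _digest = encoded.split("$", 3)
    candidate = hash_password(password, salt, int(iterations))
    # Constant-time comparison avoids leaking where the strings differ.
    return hmac.compare_digest(candidate, encoded)


stored = hash_password("correct horse")
print(stored.split("$")[:2])
print(check_password("correct horse", stored))
```

Each guess an attacker makes has to repeat all of the iterations, which is where the computing cost comes from.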

Our plan is to extend this functionality in the future adding groups and granular permissions support.

Dashboard preservation planning tab

Format Policy Registry (FPR)

  • The Format Policy Registry (FPR) is how Archivematica manages preservation planning using format policies. A format policy indicates the actions, tools and settings to apply to a file of a particular file format (e.g. conversion to preservation format, conversion to access format). Format policies will change as community standards, practices and tools evolve.
  • Hosted at http://fpr.archivematica.org, the FPR stores structured information about normalization format policies for preservation and access. These policies identify preferred preservation and access formats. These default format policies can all be changed or enhanced locally by individual Archivematica implementers. For information about how default format policies were selected, see the analysis of significant characteristics and tools here: Format policies
  • Subscription to the FPR will allow the Archivematica project to notify users when new or updated preservation and access plans become available, allowing them to make better decisions about normalization and migration strategies for specific format types within their collections. It will also allow them to trigger migration processes as new tools and knowledge become available.

Customization and automation

  • Workflow processing decisions can be made in the processingMCP.xml file. See here.
  • Workflows are currently created at the development level.
    Some resources available
  • Normalization commands can be viewed in the preservation planning tab.
  • Normalization paths and commands are currently editable under the preservation planning tab in the dashboard.

Elasticsearch

Archivematica can index data about files contained in AIPs, and this data can be accessed programmatically for various applications.

If, for whatever reason, you need to delete an ElasticSearch index please see ElasticSearch Administration.

If, for whatever reason, you need to delete an Elasticsearch index programmatically, this can be done with pyes using the following code.

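import sys

# Make the copy of pyes bundled with Archivematica importable
sys.path.append("/home/demo/archivematica/src/archivematicaCommon/lib/externals")
from pyes import ES

# Connect to the local Elasticsearch server
conn = ES('127.0.0.1:9200')

try:
    # Delete the index that holds AIP data
    conn.delete_index('aips')
except Exception:
    print("Error deleting index or index already deleted.")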

Data backup

In Archivematica there are three types of data you'll likely want to back up:

  • Filesystem (particularly your storage directories)
  • MySQL
  • ElasticSearch

MySQL is used to store short-term processing data. You can back up the MySQL database by using the following command:

mysqldump -u <your username> -p<your password> -c MCP > <filename of backup>
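If you want to run the backup from a script, for example on a schedule, the command above can be wrapped as follows. This is a minimal Python 3 sketch; the function names and the sample credentials are placeholders, and only the mysqldump arguments shown above are used.

```python
import subprocess


def mysqldump_command(user, password, database="MCP"):
    """Return the argument list equivalent to the command shown above."""
    return ["mysqldump", "-u", user, "-p" + password, "-c", database]


def backup_database(user, password, backup_path):
    """Run mysqldump and write its output to backup_path."""
    with open(backup_path, "wb") as backup_file:
        subprocess.run(
            mysqldump_command(user, password),
            stdout=backup_file,
            check=True,  # raise CalledProcessError if mysqldump fails
        )


# Placeholder credentials, shown only to illustrate the resulting command.
print(mysqldump_command("archivematica", "example-password"))
```

As with the command itself, a password passed as an argument can be seen by other users of the same machine while the dump is running.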

ElasticSearch is used to store long-term data. Instructions and scripts for backing up and restoring ElasticSearch are available here.

Security

Once you've set up Archivematica it's a good practice, for the sake of security, to change the default passwords.

MySQL

You should create a new MySQL user or change the password of the default "archivematica" MySQL user. To change the password of the default user, enter the following into the command-line:

$ mysql -u root -p<your MySQL root password> -D mysql \
   -e "SET PASSWORD FOR 'archivematica'@'localhost' = PASSWORD('<new password>'); \
   FLUSH PRIVILEGES;"

Once you've done this you can change Archivematica's MySQL database access credentials by editing these two files:

  • /etc/archivematica/archivematicaCommon/dbsettings (change the user and password settings)
  • /usr/share/archivematica/dashboard/settings/common.py (change the USER and PASSWORD settings in the DATABASES section)

Archivematica does not presently support secured MySQL communication so MySQL should be run locally or on a secure, isolated network. See issue 1645.

AtoM

In addition to changing the MySQL credentials, if you've also installed AtoM you'll want to set the password for it as well. Note that after changing your AtoM credentials you should update the credentials on the AtoM DIP upload administration page as well.

Gearman

Archivematica relies on the Gearman server for queuing work that needs to be done. Gearman currently doesn't support secured connections, so Gearman should be run locally or on a secure, isolated network. See issue 1345.

Questions

If you run into any difficulties while administering Archivematica, please check out our FAQ and, if that doesn't help you, contact us using the Archivematica discussion group.

Frequently asked questions

Discussion group