Difference between revisions of "Administrator manual 0.10"
Revision as of 13:49, 12 June 2013
This manual covers administrator-specific instructions for Archivematica. It will also provide help for using forms in the Administration tab of the Archivematica dashboard and the administrator capabilities in the Format Policy Registry (FPR), which you will find in the Preservation planning tab of the dashboard.
For end-user instructions, please see the user manual.
Installation
Upgrading
- Currently, Archivematica does not support upgrading from one version to the next. A re-install is required.
Dashboard administration tab
The Archivematica administration pages, under the Administration tab of the dashboard, allow you to configure application components and manage users.
Transfer source directories
Archivematica allows you to start transfers using the operating system's file browser or via a web interface. Source files for transfers, however, can't be uploaded using the web interface: they must exist on volumes accessible to the Archivematica server.
When starting a transfer you're required to select one or more directories of files to add to the transfer. To speed up the process of selecting directories, Archivematica allows you to specify "source directories". A source directory is a directory in which files and directories likely to be added to a transfer are present.
To add a source directory, while on the transfer source directories page of the Administration tab in the dashboard, simply click the folder icon to expand the starting directory and navigate the interface until you find a directory you'd like to select as a source directory. Once you've found a suitable source directory, click the "Add" button to the right of the directory and it will be added.
To remove a source directory, simply click the "Remove" button to the right of it in the source directory path list.
AIP storage directories
AIP storage directories are directories in which completed AIPs are stored. Storage directories can be specified in a manner similar to source directories.
To add a storage directory, while on the storage directories page of the Administration tab in the dashboard, simply click the folder icon to expand the starting directory and navigate the interface until you find a directory you'd like to select as a storage directory. Once you've found a suitable storage directory, click the "Add" button to the right of the directory and it will be added.
To remove a storage directory, simply click the "Remove" button to the right of it in the storage directory path list.
AtoM DIP upload
Archivematica can upload DIPs directly to an AtoM website so the contents can be accessed online. The AtoM DIP upload configuration page is where you specify the details of the AtoM installation you'd like the DIPs uploaded to (and, if using Rsync to transfer the DIP files, Rsync transfer details).
The parameters that you'll most likely want to set are url, email, and password. These parameters, respectively, specify the destination AtoM website's URL, the email address used to log in to the website, and the password used to log in to the website.
AtoM DIP upload can also use Rsync as a transfer mechanism. Rsync is an open source utility for efficiently transferring files. The rsync-target parameter is used to specify an Rsync-style target host/directory pairing, "foobar.com:~/dips/" for example. The rsync-command parameter is used to specify rsync connection options, "ssh -p 22222 -l user" for example. If you are using the rsync option, please see AtoM server configuration below.
To set any parameters for AtoM DIP upload, change the values in the "Command arguments" field, preserving the existing format they're specified in, then click "Save".
Note that in AtoM, the sword plugin (Admin --> qtSwordPlugin) and job scheduling (Admin --> Settings --> Job scheduling) must both be enabled in order for AtoM to receive uploaded DIPs.
AtoM server configuration
This server configuration step is necessary only if you are using the rsync option described above in the AtoM DIP upload section. It allows Archivematica to log in to the AtoM server without a password.
To enable sending DIPs from Archivematica to the AtoM server:
Generate SSH keys for the Archivematica user. Leave the passphrase field blank.
$ sudo -i -u archivematica
$ cd ~
$ ssh-keygen
Copy the contents of /var/lib/archivematica/.ssh/id_rsa.pub somewhere handy; you will need it later.
Next, configure the AtoM server so Archivematica can send the DIPs using SSH/rsync. To do so, create a user called archivematica and assign that user a restricted shell with access only to rsync:
$ sudo apt-get install rssh
$ sudo useradd -d /home/archivematica -m -s /usr/bin/rssh archivematica
$ sudo passwd -l archivematica
$ sudo vim /etc/rssh.conf   # Make sure that allowrsync is uncommented!
Add the SSH key generated earlier:
$ sudo mkdir /home/archivematica/.ssh
$ sudo chmod 700 /home/archivematica/.ssh/
$ sudo vim /home/archivematica/.ssh/authorized_keys   # Paste here the contents of id_rsa.pub
$ sudo chown -R archivematica:archivematica /home/archivematica
In Archivematica, make sure that you update the --rsync-target parameter accordingly.
These are the parameters that we are passing to the upload-qubit microservice.
Go to the Administration > Upload DIP page in the dashboard.
Generic parameters:
--url="http://atom-hostname/index.php" \
--email="demo@example.com" \
--password="demo" \
--uuid="%SIPUUID%" \
--rsync-target="archivematica@atom-hostname:/tmp" \
--debug
CONTENTdm DIP upload
Archivematica can also upload DIPs to CONTENTdm instances. Multiple CONTENTdm destinations may be configured.
For each possible CONTENTdm DIP upload destination, you'll specify a brief description and configuration parameters appropriate for the destination. Parameters include %ContentdmServer% (full path to the CONTENTdm API, including the leading 'http://' or 'https://', for example http://example.com:81/dmwebservices/index.php), %ContentdmUser%, and %ContentdmGroup% (Linux user and group on the CONTENTdm server, not a CONTENTdm username).
When changing parameters for a CONTENTdm DIP upload destination, simply change the values, preserving the existing format they're specified in. To add an upload destination, fill in the form at the bottom of the page with the appropriate values. When you've completed your changes, click the "Save" button.
Processing configuration
When processing a SIP or transfer, you may want to automate some of the workflow choices. Choices can be preconfigured by putting a 'processingMCP.xml' file into the root directory of a SIP/transfer.
If a SIP or transfer is submitted with a 'processingMCP.xml' file, processing decisions will be made with the included file.
The XML file format is:
<processingMCP>
  <preconfiguredChoices>
    <preconfiguredChoice>
      <appliesTo>Workflow decision - create transfer backup</appliesTo>
      <goToChain>Do not backup transfer</goToChain>
    </preconfiguredChoice>
    <preconfiguredChoice>
      <appliesTo>Workflow decision - send transfer to quarantine</appliesTo>
      <goToChain>Skip quarantine</goToChain>
    </preconfiguredChoice>
    <preconfiguredChoice>
      <appliesTo>Remove from quarantine</appliesTo>
      <goToChain>Unquarantine</goToChain>
      <delay unitCtime="yes">50</delay>
    </preconfiguredChoice>
  </preconfiguredChoices>
</processingMCP>
Where appliesTo is the name of the job presented in the dashboard, and goToChain is the desired selection. Note: these are case sensitive. The default processingMCP.xml file is located at '/var/archivematica/sharedDirectory/sharedMicroServiceTasksConfigs/processingMCPConfigs/defaultProcessingMCP.xml'.
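If you generate these files from a script, the structure above can be produced with a standard XML library. The following is a minimal Python sketch, not part of Archivematica: the function name build_processing_mcp and its tuple-based input are assumptions made for illustration, and the appliesTo and goToChain strings are copied from the sample above.

```python
import xml.etree.ElementTree as ET


def build_processing_mcp(choices):
    """Build processingMCP XML from (appliesTo, goToChain, delay) tuples.

    delay is None or a number. When given, it is written as a
    <delay unitCtime="yes"> element, mirroring the sample file.
    """
    root = ET.Element("processingMCP")
    container = ET.SubElement(root, "preconfiguredChoices")
    for applies_to, go_to_chain, delay in choices:
        choice = ET.SubElement(container, "preconfiguredChoice")
        # Both values are case sensitive and must match the dashboard text.
        ET.SubElement(choice, "appliesTo").text = applies_to
        ET.SubElement(choice, "goToChain").text = go_to_chain
        if delay is not None:
            delay_el = ET.SubElement(choice, "delay", unitCtime="yes")
            delay_el.text = str(delay)
    return ET.tostring(root, encoding="unicode")


xml_text = build_processing_mcp([
    ("Workflow decision - create transfer backup", "Do not backup transfer", None),
    ("Workflow decision - send transfer to quarantine", "Skip quarantine", None),
    ("Remove from quarantine", "Unquarantine", 50),
])
print(xml_text)
```

The resulting text would be saved as 'processingMCP.xml' in the root directory of the SIP or transfer.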
The processing configuration administration page of the dashboard provides you with an easy form to configure the default 'processingMCP.xml' that's added to a SIP or transfer if it doesn't already contain one. When you change the options using the web interface the necessary XML will be written behind the scenes.
- For the approval (yes/no) steps, the user ticks the box on the left-hand side to make a choice. If the box is not ticked, the approval step will appear in the dashboard.
- For the other steps, if no actions are selected, the choices appear in the dashboard.
- You can select whether or not to send transfers to quarantine (yes/no) and decide how long you'd like them to stay there.
- You can approve normalization, sending the AIP to storage, and uploading the DIP without interrupting the workflow in the dashboard.
- You can pre-select which format identification tool to base your normalization upon.
- You can choose to send a transfer to backlog or to create a SIP every time.
- You can select between lzma and bzip algorithms for AIP compression.
- For the compression level setting, the options are as follows:
- 9 - ultra compression
- 7 - maximum compression
- 3 - fast compression mode
- 1 - fastest mode
- 0 - copy mode
- You can select one archival storage location where you will consistently send your AIPs.
PREMIS agent
The PREMIS agent name and code can be set via the administration interface.
REST API
In addition to automation using the processingMCP.xml file, Archivematica includes a REST API for automating transfer approval. Using this API, you can create a custom script that copies a transfer to the appropriate directory then uses the curl command, or some other means, to let Archivematica know that the copy is complete.
API keys
Use of the REST API requires the use of API keys. An API key is associated with a specific user. To generate an API key for a user:
- Browse to /administration/accounts/list/
- Click the "Edit" button for the user you'd like to generate an API key for
- Click the "Regenerate API key" checkbox
- Click "Save"
After generating an API key, you can click the "Edit" button for the user and you should see the API key.
IP whitelist
In addition to creating API keys, you'll need to add the IP of any computer making REST requests to the REST API whitelist. The IP whitelist can be edited in the administration interface at /administration/api/.
Approving a transfer
The REST API can be used to approve a transfer. The transfer must first be copied into the appropriate watch directory. To determine the location of the appropriate watch directory, first find the shared directory from the sharedDirectory value in /etc/archivematica/MCPServer/serverConfig.conf. Within that directory is a subdirectory named activeTransfers. In this subdirectory are watch directories for the various transfer types.
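A script can derive the activeTransfers path from the configuration file rather than hard-coding it. The sketch below is an illustration only: it assumes serverConfig.conf is an INI-style file, and because the section name is not stated here, it searches every section for the sharedDirectory option. The sample text and the function name active_transfers_dir are hypothetical.

```python
import configparser
import posixpath


def active_transfers_dir(config_text):
    """Return the activeTransfers path derived from serverConfig.conf text."""
    # Interpolation is disabled so literal '%' characters are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # The section name is an unknown here, so check every section.
    for section in parser.sections():
        if parser.has_option(section, "sharedDirectory"):
            shared = parser.get(section, "sharedDirectory")
            return posixpath.join(shared, "activeTransfers")
    raise KeyError("sharedDirectory not found in configuration")


# Hypothetical file content used only for illustration.
sample = "[MCPServer]\nsharedDirectory = /var/archivematica/sharedDirectory/\n"
print(active_transfers_dir(sample))
```

On a real server you would pass in the text read from /etc/archivematica/MCPServer/serverConfig.conf, then list the returned directory to see the watch directory for each transfer type.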
When using the REST API to approve a transfer, if a transfer type isn't specified, the transfer will be deemed a standard transfer.
HTTP Method: POST
URL: /api/transfer/approve
Parameters:
directory: directory name of the transfer
type (optional): transfer type [standard|dspace|unzipped bag|zipped bag]
api_key: an API key
username: the username associated with the API key
Example curl command:
curl --data "username=rick&api_key=f12d6b323872b3cef0b71be64eddd52f87b851a6&type=standard&directory=MyTransfer" http://127.0.0.1/api/transfer/approve
Example result:
{"message": "Approval successful."}
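The same request can be made from a script without curl. This is a minimal Python sketch using only the standard library; the function names are made up for illustration, and "YOUR_API_KEY" is a placeholder. The request is built separately from being sent, so it can be inspected first.

```python
import json
import urllib.parse
import urllib.request


def build_approve_request(base_url, username, api_key, directory,
                          transfer_type="standard"):
    """Build (but do not send) the POST request for /api/transfer/approve."""
    body = urllib.parse.urlencode({
        "username": username,
        "api_key": api_key,
        "type": transfer_type,
        "directory": directory,
    }).encode("ascii")
    url = base_url.rstrip("/") + "/api/transfer/approve"
    return urllib.request.Request(url, data=body, method="POST")


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_approve_request(
    "http://127.0.0.1", "rick", "YOUR_API_KEY", "MyTransfer")
print(request.get_method(), request.full_url)
# To submit the approval against a running dashboard:
# print(send(request)["message"])
```

The machine running the script must be on the IP whitelist, and the transfer must already be in the watch directory.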
Listing unapproved transfers
The REST API can be used to get a list of unapproved transfers. Each transfer's directory name and type are returned.
Method: GET
URL: /api/transfer/unapproved
Parameters:
api_key: an API key
username: the username associated with the API key
Example curl command:
curl "http://127.0.0.1/api/transfer/unapproved?username=rick&api_key=f12d6b323872b3cef0b71be64eddd52f87b851a6"
Example result:
{ "message": "Fetched unapproved transfers successfully.", "results": [{ "directory": "MyTransfer", "type": "standard" } ] }
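A script that polls for unapproved transfers needs to build the query URL and read the "results" list. The following Python sketch shows one way to do that, assuming the response has the shape of the example result above; the function names and the "YOUR_API_KEY" placeholder are illustrative.

```python
import json
import urllib.parse


def unapproved_url(base_url, username, api_key):
    """Return the GET URL for /api/transfer/unapproved."""
    query = urllib.parse.urlencode({"username": username, "api_key": api_key})
    return base_url.rstrip("/") + "/api/transfer/unapproved?" + query


def transfers_by_type(response_text):
    """Group transfer directory names from the JSON response by type."""
    grouped = {}
    for item in json.loads(response_text)["results"]:
        grouped.setdefault(item["type"], []).append(item["directory"])
    return grouped


# Response text copied from the example result above.
sample_response = (
    '{"message": "Fetched unapproved transfers successfully.", '
    '"results": [{"directory": "MyTransfer", "type": "standard"}]}'
)
url = unapproved_url("http://127.0.0.1", "rick", "YOUR_API_KEY")
print(url)
print(transfers_by_type(sample_response))
```

Each directory and type pair returned this way can then be passed to the approval call.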
Users
The dashboard provides a simple cookie-based user authentication system using the Django authentication framework. Access to the dashboard is limited only to logged-in users and a login page will be shown when the user is not recognized. If the application can't find any user in the database, the user creation page will be shown instead, allowing the creation of an administrator account.
Users can also be created, modified and deleted from the Administration tab. Here you can manage which users have access to Archivematica and what level of access they have. Standard users are able to access all sections of the interface except for the administration section.
You can add a new user to the system by clicking the "Add new" button on the user administration page. These users won't have administrator capability. By adding a user you provide a way to access Archivematica using a username/password combination. Should you need to change a user's username or password, you can do so by clicking the "Edit" button, corresponding to the user, on the administration page. Should you need to revoke a user's access, you can click the corresponding "Delete" button.
CLI creation of administrative users
If you need an additional administrator user, one can be created via the command-line after navigating to the src/dashboard/src directory in the source tree.
python manage.py createsuperuser --settings='settings.common'
CLI password resetting
If you've forgotten the password for your administrator user, or any other user, you can change it via the command-line.
python manage.py changepassword <username> --settings='settings.common'
Security
Archivematica uses PBKDF2 as the default algorithm to store passwords. This should be sufficient for most users: it's quite secure, requiring massive amounts of computing time to break. However, other algorithms could be used as the following document explains: How Django stores passwords.
Our plan is to extend this functionality in the future adding groups and granular permissions support.
Dashboard preservation planning tab
Format Policy Registry (FPR)
- The Format Policy Registry (FPR) is how Archivematica manages preservation planning using format policies. A format policy indicates the actions, tools and settings to apply to a file of a particular file format (e.g. conversion to preservation format, conversion to access format). Format policies will change as community standards, practices and tools evolve.
- Hosted at fpr.archivematica.org, the FPR stores structured information about normalization format policies for preservation and access. These policies identify preferred preservation and access formats. These default format policies can all be changed or enhanced locally by individual Archivematica implementers. For information about how default format policies were selected, see the analysis of significant characteristics and tools here: Format policies
- Subscription to the FPR will allow the Archivematica project to notify users when new or updated preservation and access plans become available, allowing them to make better decisions about normalization and migration strategies for specific format types within their collections. It will also allow them to trigger migration processes as new tools and knowledge become available.
Customization and automation
- Workflow processing decisions can be made in the processingMCP.xml file. See here.
- Workflows are currently created at the development level.
- Some resources available
- Normalization commands can be viewed in the preservation planning tab.
- Normalization paths and commands are currently editable under the preservation planning tab in the dashboard.
Elasticsearch
Archivematica can index data about the files contained in AIPs, and this data can be accessed programmatically for various applications.
If, for whatever reason, you need to delete an Elasticsearch index, please see ElasticSearch Administration.
To delete an Elasticsearch index programmatically, you can use pyes with the following code.
import sys

# Add Archivematica's externals directory so pyes can be imported.
sys.path.append("/home/demo/archivematica/src/archivematicaCommon/lib/externals")

from pyes import ES

conn = ES('127.0.0.1:9200')
try:
    conn.delete_index('aips')
except Exception:
    print("Error deleting index or index already deleted.")
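If pyes is not available, the same deletion can be requested through Elasticsearch's HTTP interface, which removes an index in response to a DELETE request on the index URL. The sketch below is an alternative to the pyes approach, not something Archivematica ships: it assumes the same host, port and index name as the code above, and the function name is hypothetical. The destructive call is left commented out.

```python
import urllib.request


def build_delete_index_request(index_name, host="127.0.0.1", port=9200):
    """Build (but do not send) an HTTP DELETE request for an index."""
    url = "http://{}:{}/{}".format(host, port, index_name)
    return urllib.request.Request(url, method="DELETE")


request = build_delete_index_request("aips")
print(request.get_method(), request.full_url)
# Deleting an index cannot be undone. To send the request, uncomment:
# urllib.request.urlopen(request)
```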
Rebuilding the AIP index
To rebuild the ElasticSearch AIP index enter the following to find the location of the rebuilding script:
locate rebuild-elasticsearch-aip-index-from-files
Copy the location of the script then enter the following to perform the rebuild (substituting "/your/script/location/rebuild-elasticsearch-aip-index-from-files" with the location of the script):
/your/script/location/rebuild-elasticsearch-aip-index-from-files <location of your AIP store>
Data backup
In Archivematica there are three types of data you'll likely want to back up:
- Filesystem (particularly your storage directories)
- MySQL
- ElasticSearch
MySQL is used to store short-term processing data. You can back up the MySQL database by using the following command:
mysqldump -u <your username> -p<your password> -c MCP > <filename of backup>
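For scheduled backups it can help to generate a dated filename and assemble the command in a script. The following Python sketch is illustrative only: the function name and the filename pattern are assumptions. It passes -p without a value so that mysqldump prompts for the password instead of exposing it on the command line.

```python
import datetime


def build_mysqldump_command(user, database="MCP", when=None):
    """Return (argv, output_filename) for a dated dump of the database."""
    when = when or datetime.date.today()
    filename = "{}-{}.sql".format(database, when.isoformat())
    # "-p" with no value makes mysqldump prompt for the password;
    # "-c" writes complete INSERT statements, as in the command above.
    argv = ["mysqldump", "-u", user, "-p", "-c", database]
    return argv, filename


argv, filename = build_mysqldump_command(
    "archivematica", when=datetime.date(2013, 6, 12))
print(" ".join(argv), ">", filename)
# To run the backup:
# import subprocess
# with open(filename, "wb") as out:
#     subprocess.run(argv, stdout=out, check=True)
```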
ElasticSearch is used to store long-term data. Instructions and scripts for backing up and restoring ElasticSearch are available here.
Security
Once you've set up Archivematica it's a good practice, for the sake of security, to change the default passwords.
MySQL
You should create a new MySQL user or change the password of the default "archivematica" MySQL user. To change the password of the default user, enter the following into the command-line:
$ mysql -u root -p<your MySQL root password> -D mysql \
    -e "SET PASSWORD FOR 'archivematica'@'localhost' = PASSWORD('<new password>'); \
    FLUSH PRIVILEGES;"
Once you've done this you can change Archivematica's MySQL database access credentials by editing these two files:
- /etc/archivematica/archivematicaCommon/dbsettings (change the user and password settings)
- /usr/share/archivematica/dashboard/settings/common.py (change the USER and PASSWORD settings in the DATABASES section)
Archivematica does not presently support secured MySQL communication so MySQL should be run locally or on a secure, isolated network. See issue 1645.
AtoM
In addition to changing the MySQL credentials, if you've also installed AtoM you'll want to set the password for it as well. Note that after changing your AtoM credentials you should update the credentials on the AtoM DIP upload administration page as well.
Gearman
Archivematica relies on the Gearman server for queuing work that needs to be done. Gearman currently doesn't support secured connections so Gearman should be run locally or on a secure, isolated network. See issue 1345.
Questions
If you run into any difficulties while administering Archivematica, please check out our FAQ and, if that doesn't help you, contact us using the Archivematica discussion group.
Frequently asked questions
Discussion group
- Discussion group for questions not covered by the FAQ