This page is no longer being maintained and may contain inaccurate information. Please see the Archivematica documentation (https://www.archivematica.org/docs/latest/) for up-to-date information.
Latest revision as of 16:51, 11 February 2020
This wiki page describes getting started with Archivematica as a developer. For user and administrative manuals, please see http://www.archivematica.org.
Vital Stats
- Language: Python (primarily)
- License: AGPL
- VCS: git
- Major libraries: Django, gearman (Python API)
- Contribution guidelines
Projects
Archivematica consists of several projects working together.
- Archivematica: Main repository containing the user-facing dashboard, the MCPServer task manager, and the client scripts for the MCPClient
- Storage Service: Responsible for moving files to Archivematica for processing, and from Archivematica into storage
There are also several smaller repositories that support Archivematica in various ways. In general, you will not need these to develop on Archivematica.
- Development tools: Scripts to help with development. E.g. restarting services, workflow analysis
- FPR tools: All the tools, commands and rules used to populate the FPR database. Changes to the FPR should be submitted here.
- Archivematica Documentation: Documentation found at https://www.archivematica.org/en/docs/ for Archivematica
- Storage Service Documentation: Documentation found at https://www.archivematica.org/en/docs/ for the Storage Service
- Automation Tools: Scripts used to automate processing material through Archivematica
- Deployment: Ansible scripts for deploying and configuring Archivematica
- Deployment-Archivematica: Ansible playbook for Archivematica package install.
- Deployment-Archivematica-dev: Ansible playbook for Archivematica github install.
- Fixity checker: Command-line tool that assists in checking fixity for AIPs stored in Archivematica Storage Service instances.
- METS reader/writer: Library to create and parse METS files.
- agentarchives: Clients to retrieve, add, and modify records from archival management systems.
- Sample data: Data to test and show off Archivematica's processing
- History: Contains the pre-git history of Archivematica. Useful for checking the origins of code.
Installation
The recommended way to install Archivematica for development is with Ansible and Vagrant.
Alternatively, you can try our environment based on Docker Compose - see https://github.com/artefactual-labs/am/tree/master/compose.
Ansible & Vagrant
The following instructions detail how to install and run Archivematica from source on a virtual machine.
- Install VirtualBox, Vagrant, and Ansible with the following commands:
  sudo apt-get install virtualbox vagrant
  (this is the command for Ubuntu; on macOS or a different Linux distribution, it may be slightly different).
  - Note: Vagrant must be at least version 1.5 (it can also be downloaded from https://www.vagrantup.com/downloads.html). Check your version with vagrant --version.
  sudo pip install -U ansible
- Check out the deployment repo:
  git clone https://github.com/artefactual/deploy-pub.git
- Download the Ansible roles:
  cd deploy-pub/playbooks/archivematica
  ansible-galaxy install -f -p roles/ -r requirements.yml
- (Optional) Change the branch by opening the file vars-singlenode.yml and modifying the following:
  archivematica_src_am_version: "branch-name"
  archivematica_src_ss_version: "branch-name"
- Create the virtual machine and provision it:
  vagrant up
  (it takes a while!)
- You can now log in to your virtual machine:
  vagrant ssh
- You can now access the following services in a web browser:
  - Archivematica: http://192.168.168.192
  - Archivematica Storage Service: http://192.168.168.192:8000
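The minimum Vagrant version mentioned in the first step can also be checked programmatically. The following is a minimal sketch and not part of the deployment repository: the function names are hypothetical, and it assumes that `vagrant --version` prints a line of the form `Vagrant X.Y.Z`.

```python
import re
import subprocess

MINIMUM_VAGRANT = (1, 5)  # minimum version stated in the instructions above


def parse_vagrant_version(output):
    """Extract a (major, minor, patch) tuple from `vagrant --version` output.

    Assumes output such as "Vagrant 2.2.6"; returns None if nothing matches.
    """
    match = re.search(r"Vagrant (\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def vagrant_is_recent_enough(output, minimum=MINIMUM_VAGRANT):
    """Return True if the reported version is at least `minimum`."""
    version = parse_vagrant_version(output)
    return version is not None and version[:len(minimum)] >= minimum


def installed_vagrant_output():
    """Run `vagrant --version` and return its output (needs Vagrant on PATH)."""
    return subprocess.check_output(["vagrant", "--version"]).decode("utf-8")
```

For example, `vagrant_is_recent_enough(installed_vagrant_output())` could be evaluated before running `vagrant up`.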
You may also wish to do the following.
- Provisioning (via Ansible) can be re-run with Vagrant to update the code on the server (for example, if new features are added):
  vagrant provision
- To re-deploy a new branch to the same VM, update the branch variables in vars-singlenode.yml:
  archivematica_src_am_version: "branch-name"
  archivematica_src_ss_version: "branch-name"
  - This will probably require resetting the Archivematica installation as well, which can be done by adding variables to vars-singlenode.yml:
    archivematica_src_reset_am_all: "true" resets the Archivematica database, clears Elasticsearch, and clears the shared directories.
    archivematica_src_reset_ss_db: "true" resets the Storage Service database.
  - For more variables to control deployment, see the README (https://github.com/artefactual-labs/ansible-role-archivematica-src/blob/master/README.md).
- See also the FAQ below
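Editing the branch and reset variables can be scripted. The sketch below is an illustration only: `set_role_variable` is a hypothetical helper, it edits the file contents as plain text rather than parsing YAML, and the branch name `dev/my-branch` in the usage note is made up.

```python
import re


def set_role_variable(text, name, value):
    """Return `text` with the top-level line `name: "value"` set.

    An existing line for `name` (commented out or not) is replaced;
    otherwise the assignment is appended at the end of the text.
    """
    new_line = '{}: "{}"'.format(name, value)
    # Match an optional leading "#" and spaces, then the variable name.
    pattern = re.compile(
        r"^#?[ \t]*{}:.*$".format(re.escape(name)), re.MULTILINE
    )
    if pattern.search(text):
        # A function replacement avoids backslash interpretation in `value`.
        return pattern.sub(lambda _match: new_line, text, count=1)
    if text and not text.endswith("\n"):
        text += "\n"
    return text + new_line + "\n"
```

Reading vars-singlenode.yml, calling `set_role_variable(contents, "archivematica_src_am_version", "dev/my-branch")`, and writing the result back would switch the Archivematica branch before the next `vagrant provision`.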
Alternative Vagrant projects
Community-provided alternatives have also been developed.
- https://github.com/emltech/eml-archivematica-vagrant
- https://github.com/statsbiblioteket/archivematica-vagrant
Tests
Archivematica and the related projects have a small but growing test suite. We use py.test to run our tests; it should be listed as a requirement in the development/local requirements file.
To run the tests, go to the repository root and run py.test.
See below for project-specific setup or changes to running the tests.
Archivematica
Before running Archivematica tests, set the following environment variable. Archivematica does not currently have a virtualenv that needs to be activated.
    #!/usr/bin/fish
    set -xg PYTHONPATH /usr/share/archivematica/dashboard/:/usr/lib/archivematica/archivematicaCommon/

    #!/usr/bin/bash
    export PYTHONPATH=$PYTHONPATH:/usr/share/archivematica/dashboard/:/usr/lib/archivematica/archivematicaCommon/
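The same PYTHONPATH setup can be expressed in Python when the tests are launched from a script. This is a minimal sketch; `archivematica_test_env` is a hypothetical helper, and the two paths are the ones from the shell snippets above.

```python
import os

# Paths taken from the shell snippets above.
ARCHIVEMATICA_PATHS = [
    "/usr/share/archivematica/dashboard/",
    "/usr/lib/archivematica/archivematicaCommon/",
]


def archivematica_test_env(base_env=None):
    """Return a copy of the environment with PYTHONPATH extended for tests."""
    env = dict(os.environ if base_env is None else base_env)
    # Keep any entries already present and drop empty ones.
    existing = [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p]
    env["PYTHONPATH"] = os.pathsep.join(existing + ARCHIVEMATICA_PATHS)
    return env
```

The result could be passed as the `env` argument of `subprocess.call(["py.test"], env=archivematica_test_env())` from the repository root.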
Storage Service
Before running Storage Service tests, activate the virtualenv and set the following environment variables. The tests should be run from the storage_service directory, which is the directory that contains manage.py. You may need to install requirements/test.txt to get the testing dependencies.
    #!/usr/bin/fish
    set -xg PYTHONPATH (pwd)/storage_service  # This directory contains manage.py
    set -xg DJANGO_SETTINGS_MODULE storage_service.settings.test
    set -xg DJANGO_SECRET_KEY 'ADDKEY'

    #!/usr/bin/bash
    export PYTHONPATH=$(pwd)/storage_service  # This directory contains manage.py
    export DJANGO_SETTINGS_MODULE=storage_service.settings.test
    export DJANGO_SECRET_KEY='ADDKEY'
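The 'ADDKEY' value above is a placeholder for a secret key. As a sketch only, the following hypothetical helper builds the same three variables and generates a throwaway key when none is set. It assumes Python 3.6 or later (for the `secrets` module) and that `repo_root` is the directory that `pwd` returns in the shell snippets.

```python
import os
import secrets


def storage_service_test_env(repo_root, base_env=None):
    """Return an environment suitable for running the Storage Service tests."""
    env = dict(os.environ if base_env is None else base_env)
    # Same three variables as in the shell snippets above.
    env["PYTHONPATH"] = os.path.join(repo_root, "storage_service")
    env["DJANGO_SETTINGS_MODULE"] = "storage_service.settings.test"
    # Keep an existing key if one is set; otherwise generate a random one.
    env.setdefault("DJANGO_SECRET_KEY", secrets.token_urlsafe(50))
    return env
```

A random key is sufficient for a test run because nothing needs to persist between runs.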
FAQ
How do I restart everything in Archivematica?
A default install using Ansible also installs the devtools. Run am restart-services to restart all services related to Archivematica and the Storage Service.
How do I restart just the dashboard?
sudo /etc/init.d/apache2 restart or sudo service apache2 restart
How do I restart the Storage Service?
sudo service uwsgi restart
How do I restart nginx?
sudo service uwsgi restart
sudo service nginx restart
How do I update or reset an Ansible install?
To update an install, re-run vagrant provision. If you only want to run some of the Ansible tasks, you can use Ansible's tags, for example: env ANSIBLE_ARGS="--tags=amsrc-ss-code" vagrant provision. More tags are documented in the Ansible role's repository (https://github.com/artefactual-labs/ansible-role-archivematica-src).
To reset an install (delete all existing data, as in a fresh install), you can use the Ansible role variables. For example, env ANSIBLE_ARGS="--extra-vars=archivematica_src_reset_ss_db=true" vagrant provision will reset the Storage Service database. More role variables are documented in the same repository.
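The ANSIBLE_ARGS values used in these commands can be assembled in a script. The helpers below are hypothetical and only build strings and an environment; they do not run Vagrant. Passing several role variables in a single `--extra-vars` value is an assumption that has not been tested against this playbook.

```python
import os


def tags_argument(tags):
    """Build `--tags=a,b` from a list of Ansible tag names."""
    return "--tags=" + ",".join(tags)


def extra_vars_argument(variables):
    """Build `--extra-vars=key=value ...` from a dict of role variables."""
    pairs = [
        "{}={}".format(key, value) for key, value in sorted(variables.items())
    ]
    return "--extra-vars=" + " ".join(pairs)


def provision_env(ansible_args, base_env=None):
    """Return an environment for `vagrant provision` with ANSIBLE_ARGS set."""
    env = dict(os.environ if base_env is None else base_env)
    env["ANSIBLE_ARGS"] = ansible_args
    return env
```

For example, `subprocess.call(["vagrant", "provision"], env=provision_env(tags_argument(["amsrc-ss-code"])))` would correspond to the first command above.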
The other way to control a deployment is to modify the Vagrantfile and vars-singlenode.yml files directly. Tags can be provided in the Vagrantfile. For example:
    # ... more above
    # Ansible provisioning
    config.vm.provision :ansible do |ansible|
      ansible.playbook = "./singlenode.yml"
      ansible.host_key_checking = false
      # ansible.verbose = "v"
      ansible.extra_vars = {
        "archivematica_src_dir" => "/srv",
        "archivematica_src_environment_type" => "development",
      }
      ansible.raw_arguments = ENV['ANSIBLE_ARGS']
      ansible.tags = ['amsrc-pipeline']
    end
    # ... more below
Role variables can be modified in vars-singlenode.yml. Default values are found in the Archivematica role (in roles/archivematica-src/defaults/main.yml). For example:
    ---

    # archivematica-src role

    # What to install
    archivematica_src_install_devtools: "yes"
    archivematica_src_install_automationtools: "yes"
    # archivematica_src_install_appraisaltab: "yes"

    # SS django environment variables
    archivematica_src_ss_env_django_setings_module: "storage_service.settings.local"

    # Branches
    archivematica_src_am_version: "qa/1.x"
    archivematica_src_ss_version: "qa/0.x"
    # archivematica_src_devtools_version: "master"
    # archivematica_src_automationtools_version: "master"

    # Reset
    # archivematica_src_reset_mcpdb: "true"
    # archivematica_src_reset_shareddir: "true"
    # archivematica_src_reset_es: "true"
    archivematica_src_reset_am_all: "true"
    archivematica_src_reset_ss_db: "true"
How do I activate the Storage Service virtualenv?
source /usr/share/python/archivematica-storage-service/bin/activate
The location of the virtualenv is configured as part of the Ansible install; the default is /usr/share/python/archivematica-storage-service. Sourcing the activate script should change the prompt to display (archivematica-storage-service)vagrant@am-local. Note that the activate script will not work if you are running fish; you may want to look into virtualfish (http://virtualfish.readthedocs.io/en/latest/).
You will also want to set environment variables as described in the storage service testing section.
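Besides looking at the prompt, activation can be confirmed from within Python. This is a small sketch with hypothetical function names; it assumes the default virtualenv location given above.

```python
import sys

# Default virtualenv location given above.
EXPECTED_PREFIX = "/usr/share/python/archivematica-storage-service"


def in_virtualenv():
    """Return True if the interpreter runs inside a virtualenv or venv."""
    # Classic virtualenv sets sys.real_prefix; venv makes base_prefix differ.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


def in_storage_service_virtualenv(prefix=None, expected=EXPECTED_PREFIX):
    """Return True if `prefix` (default: sys.prefix) is the expected location."""
    prefix = sys.prefix if prefix is None else prefix
    return prefix.rstrip("/") == expected.rstrip("/")
```

Calling `in_storage_service_virtualenv()` in a test setup step would catch a run started with the system interpreter by mistake.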
Documentation
Developer-facing documentation can be found in the Development documentation category. Notable pages include:
- MCPServer
- MCPClient
- Storage Service
- Storage Service API
- Archivematica API