Revision as of 12:29, 11 January 2013

Current Plans

January 11, 2013 - Artefactual is committing resources towards Scalability Testing for Archivematica over the coming weeks and months.

The main goals of this effort are:

set up a dedicated testing environment

The testing environment will start with 5 virtual machines set up in a hosted environment, where hardware resources can be scaled up and down between tests. It is expected that the test environment will be ready to use by January 15th.

develop a new initial repeatable test suite

Initial tests will focus on two main areas: file I/O and the performance of individual micro-services. Data will be collected from external monitoring tools as well as from internal instrumentation.

External monitoring will be done with two open source packages, Munin and collectd, which provide data at the operating system level. Internal instrumentation already exists within the Archivematica source code: each step in the process has a start time and an end time recorded in the local database. This instrumentation will be extended and refined during the buildout of the test suite. The data collected will be used to identify which micro-services, and which steps within those micro-services, take the longest to complete.
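The ranking of slow steps described above can be sketched in a few lines of Python. This is only an illustration: it assumes the timing records have already been exported from the database as (micro-service, step, start, end) tuples. The record layout, the function name `slowest_steps`, and the sample values are hypothetical and do not reflect the actual Archivematica schema.

```python
from datetime import datetime


def slowest_steps(records, top=3):
    """Return the `top` longest steps as (micro_service, step, seconds)."""
    durations = [
        (service, step, (end - start).total_seconds())
        for service, step, start, end in records
    ]
    # Longest duration first.
    return sorted(durations, key=lambda row: row[2], reverse=True)[:top]


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' timestamp."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


# Hypothetical sample records, not real measurements.
records = [
    ("micro-service A", "step 1", ts("2013-01-15 10:00:00"), ts("2013-01-15 10:00:30")),
    ("micro-service A", "step 2", ts("2013-01-15 10:00:30"), ts("2013-01-15 10:10:30")),
    ("micro-service B", "step 1", ts("2013-01-15 10:10:30"), ts("2013-01-15 10:12:00")),
]

for service, step, seconds in slowest_steps(records, top=2):
    print(f"{service} / {step}: {seconds:.0f} s")
```

Sorting by elapsed time rather than by start time puts the candidates for optimization at the top of the report, whichever micro-service they belong to.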

document a full matrix of test parameters

Archivematica workflows can vary considerably depending on the use case. Artefactual will document all testing efforts on this wiki, building out a matrix of test cases. For example, we expect that adding storage subsystem capacity will allow throughput to grow linearly (add more disks and everything should go faster). This will be one of the first 'columns' in our test matrix: tests will be repeated with the same workload while the capacity of the storage subsystem (maximum I/O operations per second) is changed between tests.

Initial tests will focus on the four primary stages in the Archivematica workflow: Transfer, Ingest, creation of SIP, and creation of AIP. Additional steps are required both before Transfer and after creation of the AIP; however, these steps do not necessarily involve the use of Archivematica code. For example, moving digital objects to a shared folder that Archivematica can access is a prerequisite of the Transfer stage and can take a considerable amount of time. We will document best practices for completing that work after initial scalability testing is complete.

repeat test suite at customer sites

The two initial customer sites have been identified by Archivematica, and tests will be repeated at both sites.

Test Structure

Scalability testing is done using a scripted workload: all decision points that are normally left to the archivist to make using the Dashboard are instead automated through a configuration file. This allows for repeatable test cases. Example test scripts will be posted here over the coming weeks.
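As a rough sketch of the idea, the snippet below answers each decision point from a configuration file instead of prompting a person. The JSON format, the key names (`normalization`, `store_aip`, `upload_dip`), and the helper functions are assumptions made for illustration; they are not the actual Archivematica configuration format.

```python
import json

# Hypothetical configuration: decision point name -> pre-selected choice.
EXAMPLE_CONFIG = """
{
    "normalization": "access",
    "store_aip": false,
    "upload_dip": false
}
"""


class MissingDecision(KeyError):
    """Raised when the workload reaches a decision the config does not cover."""


def load_decisions(text):
    """Parse the configuration text into a dict of decisions."""
    return json.loads(text)


def decide(decisions, decision_point):
    """Return the pre-recorded choice instead of prompting an archivist."""
    try:
        return decisions[decision_point]
    except KeyError:
        raise MissingDecision(decision_point) from None


decisions = load_decisions(EXAMPLE_CONFIG)
print(decide(decisions, "normalization"))
```

Raising an error on an unanswered decision point, rather than falling back to a default, keeps a test run from silently diverging from the scripted workload.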

Test File Sets

Test Documents

Test design

Maximums to test for:

Max number of SIPs - 10
Max number of files in SIP - 10,000
Max size of individual file - 30 GiB
Max size of SIP - 100 GiB

Baseline amounts:

number of SIPs - 1
number of files in SIP - 10
size of individual file - 1 MiB
size of SIP - 100 MiB

Test	No. of SIPs	No. of files in SIP	Max size of individual file	Max size of SIP
1. Baseline Test	1	10	1 MiB	100 MiB
2. No. of SIPs	10	10	1 MiB	100 MiB
3. No. of files	1	10,000	1 MiB	100 MiB
4. Max file size	1	10	30 GiB	100 MiB
5. Max SIP size	1	10	1 MiB	100 GiB
...

Other tests: combination of maximums
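The test matrix above follows a one-factor-at-a-time pattern: a baseline row, then one row per parameter raised to its maximum. The sketch below generates those rows, plus the pairwise combinations of maximums, from the baseline and maximum values listed earlier. The dictionary keys and function names are hypothetical.

```python
from itertools import combinations

# Values copied from the baseline and maximum lists above.
BASELINE = {"sips": 1, "files_per_sip": 10,
            "max_file_size": "1 MiB", "sip_size": "100 MiB"}
MAXIMUMS = {"sips": 10, "files_per_sip": 10_000,
            "max_file_size": "30 GiB", "sip_size": "100 GiB"}


def single_factor_tests(baseline, maximums):
    """Baseline row, then one row per parameter raised to its maximum."""
    rows = [dict(baseline)]
    for name, value in maximums.items():
        rows.append({**baseline, name: value})
    return rows


def combined_tests(baseline, maximums, size=2):
    """Rows in which `size` parameters are raised to their maximums together."""
    rows = []
    for names in combinations(maximums, size):
        row = dict(baseline)
        for name in names:
            row[name] = maximums[name]
        rows.append(row)
    return rows


for number, row in enumerate(single_factor_tests(BASELINE, MAXIMUMS), start=1):
    print(number, row)
print(len(combined_tests(BASELINE, MAXIMUMS)), "pairwise combinations")
```

Generating the rows from two small dictionaries means that changing a maximum updates every test that depends on it.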

CVA tests

System setup:

Bare-metal install, 1 processor
2 cores
4 GB RAM, 9 GB swap
Xubuntu

Note: excludes store AIP and upload DIP micro-services except where noted

Test date	No. transfers/SIPs	No. files	Total file size	Largest file size	AIP size	Total time	Comments
2011/11/10	1/1	1,000	12.1 GB	60 MB			Failed at prepareAIP due to max Bag size: ~~Issue 785~~; also failed at uploadDIP due to max post size limit in ica-atom (8M).
2011/11/10	1/1	1	2.7 GB	2.7 GB			Failed at prepareAIP due to max Bag size: ~~Issue 785~~
2011/11/18	1/1	1,000	12.1 GB	60 MB	7.2 GB	4 hrs 30 mins	Access normalization only
2011/12/02	2/2	1,998	13 GB	21 MB			Access normalization only
2011/12/11	1/1	1,000	6.51 GB	21 MB	3.5 GB		Access normalization only
2011/12/11	2/2	1,996	13.8 GB	27 MB	7.2 GB		Access normalization only
2011/12/13	3/3	2,974	18.6 GB	20 MB	10.3 GB	3 hrs 19 mins	Access normalization only
2011/12/14	4/4	3,993	24.6 GB	22 MB	13.2 GB	3 hrs 16 mins	Access normalization only
2011/12/15	4/4	3,982	43 GB	12 MB	15 GB	3 hrs 30 mins	Access normalization only
2011/12/15	6/6	5,113	34.1 GB	38 MB	19.8 GB	4 hrs 2 mins	Access normalization only
2012/01/04	6/6	5,845	42.4 GB	33 MB	24 GB	3 hrs 52 mins	Access normalization only
2012/01/05	3/3	2,957	20.9 GB	45 MB	13.6 GB	4 hrs	Access normalization only
2012/01/05	6/6	5,947	33 GB	52 MB	19.2 GB	4 hrs 47 mins	Access normalization only
2012/01/12	6/6	4,847	38.5 GB	58 MB	23.2 GB	4 hrs 43 mins	Access normalization only
2012/01/13	6/6	5,912	101.6 GB	175 MB	63.8 GB	8 hrs 53 mins	Access normalization only
2012/01/17	1/1	1	1.4 GB	1.4 GB	0.6 GB	25 mins	Access normalization only
2012/01/17	5/5	23	19.7 GB	2.1 GB	19 GB	4 hrs 1 min	Access normalization only
2012/01/18	2/2	2	3.8 GB	2.1 GB	3.7 GB	1 hr 11 mins	Access normalization only
2012/01/20	6/6	14	6.1 GB	1.3 GB	5.9 GB	48 mins	Access normalization only
2012/02/07	5/5	5	56.7 GB	25.4 GB	55.5 GB	4 hrs 51 mins	No normalization
2012/02/08	5/5	10	124.4 GB	23.8 GB	122.2 GB	8 hrs 21 mins	No normalization
2012/02	1/1	1044	7.5 GB	12.4 MB	32.8 GB	>16 hrs	Preservation and access normalization
2012/02	1/1	104	611.6 MB	7.1 MB	2.58 GB	<2 hrs	Preservation and access normalization
2012/02	1/1	2125	47.1 GB	35.9 MB	46.2 GB	>24 hrs	Preservation and access normalization
2012/03	1/1	1654	7.9 GB	11.7 MB	37.7 GB	>16 hrs	Preservation and access normalization
2012/03	1/1	1195	5.7 GB	9.9 MB	26.8 GB	>12 hrs	Preservation and access normalization
2012/03/22	1/1		11.0 GB	246.3 MB	GB		Preservation and access normalization
2012/03/22	1/1		6.7 GB	9.7 MB	GB		Preservation and access normalization
2012/03/26	1/1		6.6 GB	14.3 MB	GB		Preservation and access normalization
2012/03	1/1		18.1 GB	11.7 MB			Preservation and access normalization
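One way to compare runs of different sizes is to convert each row into an average throughput. The sketch below does this for three rows copied from the table above. The function name and the choice of GB per hour as the unit are assumptions made for illustration, not a metric defined on this page.

```python
def throughput_gb_per_hour(total_gb, hours, minutes=0):
    """Average volume processed per hour of wall-clock time."""
    elapsed_hours = hours + minutes / 60
    return total_gb / elapsed_hours


# (test date, total file size in GB, hours, minutes), copied from the table above.
runs = [
    ("2011/12/13", 18.6, 3, 19),
    ("2012/01/13", 101.6, 8, 53),
    ("2012/02/08", 124.4, 8, 21),
]

rates = {date: throughput_gb_per_hour(gb, h, m) for date, gb, h, m in runs}
for date, rate in rates.items():
    print(f"{date}: {rate:.1f} GB/hour")
```

Rows with different normalization settings or different numbers of transfers are not directly comparable, so a figure like this is best read alongside the Comments column.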

Multi-processor testing

Problem statement

Does the amount of processing time decrease for each additional processing station added?
If yes, by how much?
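A common way to quantify the answer is to compute speedup (time with one processing station divided by time with N stations) and parallel efficiency (speedup divided by N). The sketch below assumes these standard definitions; the page does not say which measure will be used, and the timings shown are hypothetical placeholders, not measured results.

```python
def speedup(single_station_seconds, multi_station_seconds):
    """How many times faster the multi-station run is than the single-station run."""
    return single_station_seconds / multi_station_seconds


def efficiency(single_station_seconds, multi_station_seconds, stations):
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    return speedup(single_station_seconds, multi_station_seconds) / stations


# Hypothetical timings in seconds, keyed by number of processing stations.
timings = {1: 3600.0, 2: 2000.0, 6: 900.0}

for stations, seconds in timings.items():
    s = speedup(timings[1], seconds)
    e = efficiency(timings[1], seconds, stations)
    print(f"{stations} station(s): speedup {s:.2f}, efficiency {e:.2f}")
```

An efficiency well below 1.0 would suggest that something other than processing power, such as shared storage or the network, is limiting the gain from each added station.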

Constants and variables

Constants:

RAM amount
RAM speed
Disk size
CPU frequency

Variables:

Number of clients
Number of transfers
Size of transfers
Number of files

The ideal network for testing consists of 6 or more nodes, each with a dual-core processor, at least 2 GB of memory, and at least 6 GB of disk space. Due to limited disk capacity, current tests are running with 5 nodes.

Testing data

All testing data will be preserved for analysis. Selected data will be reported on this wiki.

Network setup

Hostname	Processor	Memory	Disk size(s)	IP	Filesystem	Services
test01server	4 x 500 MHz	2048 MB	6 GB + 35 GB	10.10.0.1	ext4	MCPServer, MySQL, NFS, MCPClient
test01client01	2 x 500 MHz	1024 MB	6 GB	10.10.0.11	ext4, NFS	MCPClient
test01client02	2 x 500 MHz	1024 MB	6 GB	10.10.0.12	ext4, NFS	MCPClient
test01client03	2 x 500 MHz	1024 MB	6 GB	10.10.0.13	ext4, NFS	MCPClient
test01client04	2 x 500 MHz	1024 MB	6 GB	10.10.0.14	ext4, NFS	MCPClient

Testing metrics

Our results are derived from running 000.zip through the Archivematica pipeline using the standard processing configuration settings, and then extracting the MySQL timing views from the database. This gives us a clearer picture of how productive each client is.

Two scripts are used to extract testing data from the database. Run them after your test data has been processed by Archivematica:

./automatedDistributedTestingReports.sh
./automatedDistributedTestingProcessingMachineInformationGathering.sh

You will receive a set of files similar to this:

2012.05.02-11.52.12_server_jobDurationsView.html
2012.05.02-11.52.12_server_MCP_DUMP.sql
2012.05.02-11.52.12_server_mysql_status.log
2012.05.02-11.52.12_server_netstat_summary.log
2012.05.02-11.52.12_server_PDI_by_unit.html
2012.05.02-11.52.12_server_processingDurationInformation.html
server_2012.05.02-11.52.05_cpuinfo.log
server_2012.05.02-11.52.05_free.log
server_2012.05.02-11.52.05_IP.log

Test results

RAM amount =
RAM speed =
Disk size =
CPU frequency =

Number of transfers =
Total number of files =
Total transfer size =

No. of processors	Total processing time	Longest job	Second longest job	Third longest job
1
2
6
