Bottlenecks
A bottleneck is a point of congestion in a system, typically a place where a resource is limited, at which the workflow is prone to slow down.
- For more information on bottlenecks, see Wikipedia.
Processing Power
Archivematica uses its distributed, multiprocessing MCP system to mitigate the processing-power bottleneck that traditionally limits a processing system. However, this places greater importance on two other bottlenecks: network and disk activity.
Network
In Archivematica processing, networking comes into play for two key reasons:
- distributing tasks
- central file store access
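One rough way to see whether access to the central file store is the limiting factor is to time a sequential read from it. The sketch below is only an illustration: the function name `measure_read_throughput` and its parameters are hypothetical and are not part of Archivematica. It works on any path, whether local or network-mounted. Repeated runs on the same file may be served from the operating system's cache and report inflated rates.

```python
import time


def measure_read_throughput(path, chunk_size=1024 * 1024):
    """Read a file sequentially and return (bytes_read, seconds, MiB_per_second).

    Hypothetical helper for illustration; not an Archivematica API.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero-length interval on very small files.
    rate = (total / (1024 * 1024)) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, rate
```

Running this against the same file on local disk and on the shared store gives a first comparison of the two paths.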
Disk/Hard drive
RAID
RAID (redundant array of inexpensive disks) is a way of distributing the load of a file system across a set of drives. There are various RAID levels, each offering a different degree of redundancy.
- For more information on RAIDs, see Wikipedia.
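To make the idea of striping with redundancy concrete, here is a minimal sketch of XOR parity, the technique behind parity-based RAID levels. It is a simplified model that assumes a dedicated parity block per stripe; real implementations differ in layout and work at the block-device level. All function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_with_parity(data, n_data_drives, block_size):
    """Split data into stripes of one block per data drive plus one parity block."""
    stripes = []
    stripe_bytes = n_data_drives * block_size
    for start in range(0, len(data), stripe_bytes):
        # Pad the final stripe with zero bytes so every block has the same length.
        chunk = data[start:start + stripe_bytes].ljust(stripe_bytes, b"\0")
        blocks = [chunk[i * block_size:(i + 1) * block_size]
                  for i in range(n_data_drives)]
        stripes.append(blocks + [xor_blocks(blocks)])
    return stripes


def rebuild_block(stripe, lost_index):
    """Recover one missing block by XOR-ing the surviving blocks of the stripe."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)
```

Because the XOR of every block in a stripe, parity included, is zero, any single lost block equals the XOR of the remaining ones. This is why such a layout survives the failure of one drive but not two.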
Distributed File System
Distributed file systems are arguably a subset of RAIDs. They are spread over multiple machines to form a single file system. This has the potential to lighten the network load for processing.
We are looking at using a distributed file system with Archivematica. See Issue 669.
Ceph
Ceph is a distributed file system that is currently (July 2011) in alpha development. A beta 1.0 release is scheduled for 08/21/2011; see their roadmap.