Difference between revisions of "Bottlenecks"
Revision as of 10:51, 21 July 2011
A bottleneck is a point of congestion in a system, typically a place of limited resources, where the workflow is prone to slowing down.
- For more information on bottlenecks, see Wikipedia.
Processing Power
Archivematica uses its distributed, multiprocessing MCP system to mitigate the traditional processing-power bottleneck. However, this places greater importance on two other bottlenecks: network and disk activity.
Network
In Archivematica processing, networking comes into play for two key reasons:
- distributing tasks
- central file store accessed over the network
Distributing tasks and collecting their results generates fairly light network traffic, but a congested network will still hurt system performance by slowing task assignment and the return of results.
We are currently investigating distributed file systems to avoid some of the delay of accessing files remotely. See below.
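As a rough illustration of why task messages are light while remote file access is not, the sketch below estimates how long a transfer takes on a partly congested link. The function name, the link speed, and the message and file sizes are all illustrative assumptions, not Archivematica measurements or APIs.

```python
def transfer_seconds(size_bytes: float, link_bits_per_s: float,
                     utilisation: float = 0.0) -> float:
    """Estimate the time to move size_bytes over a link.

    utilisation is the fraction of the link already consumed by other
    traffic (0.0 = idle link). All values are hypothetical inputs.
    """
    if not 0.0 <= utilisation < 1.0:
        raise ValueError("utilisation must be in [0.0, 1.0)")
    available = link_bits_per_s * (1.0 - utilisation)
    return size_bytes * 8 / available


# Hypothetical sizes: a small task message versus a large file
# fetched from the central file store, on an assumed 1 Gbit/s link.
GIGABIT = 1e9
task_message = 1_000          # bytes (assumed)
large_file = 1_000_000_000    # bytes (assumed)

for label, size in (("task message", task_message), ("large file", large_file)):
    idle = transfer_seconds(size, GIGABIT)
    busy = transfer_seconds(size, GIGABIT, utilisation=0.5)
    print(f"{label}: {idle:.6f} s idle, {busy:.6f} s at 50% congestion")
```

Under these assumptions congestion slows both kinds of traffic by the same factor, but the absolute delay only becomes noticeable for the file transfers.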
Disk/Hard drive
RAID
RAID (redundant array of inexpensive disks) is a way of distributing the load of a file system across a set of drives. There are various forms of RAID, each with a different level of redundancy.
- For more information on RAID, see Wikipedia.
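The load-distribution idea can be sketched with simple striping, as in RAID 0, where consecutive logical blocks are placed on consecutive drives in round-robin order. This is a minimal sketch with hypothetical function names; it ignores stripe size, parity, and mirroring, so it shows only the distribution and none of the redundancy that other RAID levels provide.

```python
from typing import List, Tuple


def stripe_location(logical_block: int, num_drives: int) -> Tuple[int, int]:
    """Map a logical block number to (drive index, block offset on that drive)
    using round-robin striping with one block per stripe unit."""
    if num_drives < 1:
        raise ValueError("num_drives must be at least 1")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    return logical_block % num_drives, logical_block // num_drives


def blocks_per_drive(total_blocks: int, num_drives: int) -> List[int]:
    """Count how many of the first total_blocks logical blocks land on each drive."""
    counts = [0] * num_drives
    for block in range(total_blocks):
        drive, _ = stripe_location(block, num_drives)
        counts[drive] += 1
    return counts


# A sequential read of 10 blocks over 4 drives touches every drive,
# so no single drive has to serve the whole request.
print(blocks_per_drive(10, 4))
```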
Distributed File System
Distributed file systems are arguably a subset of RAID. They are spread over multiple machines to form a single file system, which has the potential to lighten the network load for processing.
We are looking at using a distributed file system with Archivematica. See Issue 669.
Ceph
Ceph is a distributed file system that is currently (July 2011) in alpha development. A 1.0 beta release is scheduled for 21 August 2011; see their roadmap.