Difference between revisions of "Bottlenecks"

From Archivematica

Revision as of 17:13, 20 July 2011

A bottleneck refers to a point of congestion in a system, typically a place of limited resources, where workflow is prone to slow.

  • For more information on bottlenecks, see Wikipedia.

Processing Power

Archivematica uses its distributed, multiprocessing MCP system to mitigate the traditional processing-power bottleneck. However, this places greater importance on two other bottlenecks: network and disk activity.
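To illustrate how relieving one bottleneck shifts the limit to another, here is a rough back-of-the-envelope sketch. It is not part of Archivematica; the function names and all throughput figures are hypothetical and chosen only for illustration.

```python
from typing import Dict


def estimate_stage_seconds(file_mb: float,
                           cpu_mb_per_s: float,
                           network_mb_per_s: float,
                           disk_mb_per_s: float) -> Dict[str, float]:
    """Estimate how long each stage needs to handle a file of the given size.

    All rates are assumed, idealised throughputs in MB/s.
    """
    return {
        "processing": file_mb / cpu_mb_per_s,
        "network": file_mb / network_mb_per_s,
        "disk": file_mb / disk_mb_per_s,
    }


def slowest_stage(stage_seconds: Dict[str, float]) -> str:
    """Return the name of the stage that takes longest, i.e. the bottleneck."""
    return max(stage_seconds, key=stage_seconds.get)


# Hypothetical numbers: a 1000 MB file, with processing spread over many workers.
estimate = estimate_stage_seconds(1000, cpu_mb_per_s=200,
                                  network_mb_per_s=100, disk_mb_per_s=50)
print(slowest_stage(estimate))
```

With these assumed figures, processing is no longer the slowest stage, so the network and disk dominate the total time.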

Network

In Archivematica processing, networking comes into play for two key reasons:

  • distributing tasks
  • central file store access

Disk/Hard drive

RAID

RAID (redundant array of inexpensive disks) is a way of distributing the load of a file system across a set of drives. There are various RAID levels, which offer different degrees of redundancy.

  • For more information on RAID, see Wikipedia (http://en.wikipedia.org/wiki/RAID).
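As a minimal sketch of how load can be spread over drives, the following models simple block striping with no redundancy (the scheme usually called RAID 0). The function names are hypothetical, and real RAID implementations stripe in larger chunks and, at other levels, add mirroring or parity.

```python
from typing import List, Tuple


def stripe_location(block: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    if block < 0:
        raise ValueError("block must be non-negative")
    # Consecutive blocks go to consecutive disks, wrapping around.
    return block % num_disks, block // num_disks


def blocks_per_disk(total_blocks: int, num_disks: int) -> List[int]:
    """Count how many of the first total_blocks logical blocks land on each disk."""
    counts = [0] * num_disks
    for block in range(total_blocks):
        disk, _ = stripe_location(block, num_disks)
        counts[disk] += 1
    return counts


print(stripe_location(5, 4))
print(blocks_per_disk(10, 4))
```

Because consecutive blocks land on different drives, a large sequential read or write is shared among all drives instead of queuing on one.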

Distributed File System

Distributed file systems are arguably an extension of the RAID concept: data is distributed over multiple machines, which together form a single file system. This has the potential to lighten the network load for processing.

We are looking at using a distributed file system with Archivematica. See Issue 669 (http://code.google.com/p/archivematica/issues/detail?id=669).

Ceph

Ceph is a distributed file system that is currently (July 2011) in alpha development. A 1.0 beta release is scheduled for 21 August 2011; see the Ceph roadmap (http://tracker.newdream.net/projects/ceph/roadmap).