Difference between revisions of "AIP structure"

From Archivematica
 
(34 intermediate revisions by 4 users not shown)
Line 1: Line 1:
 
[[Main Page]] > [[Development]] > [[:Category:Development documentation|Development documentation]] > AIP structure
 
  
This page documents the structure of the AIP which is the product of Archivematica 0.8 alpha.
+
<div style="padding: 10px 10px; border: 1px solid black; background-color: #F79086;">This page is no longer being maintained and may contain inaccurate information. Please see the [https://www.archivematica.org/docs/latest/ Archivematica documentation] for up-to-date information. </div> <p>
 +
 
 +
This page documents the structure of the AIP produced by Archivematica.
  
 
==Name==
 
Line 10: Line 12:
 
#a UUID assigned during SIP formation
 
  
'''example''': Images-aebbfc44-9f2e-4351-bcfb-bb80d4914112
+
'''example''': Pictures_of_my_cat-aebbfc44-9f2e-4351-bcfb-bb80d4914112
  
"Images" is the name assigned by the user and "aebbfc44-9f2e-4351-bcfb-bb80d4914112" is the UUID generated during SIP formation.
+
"Pictures_of_my_cat" is the name assigned by the user and "aebbfc44-9f2e-4351-bcfb-bb80d4914112" is the UUID generated during SIP formation.
 +
 
 +
</br>
  
 
==Directory Structure==
 
  
[[Image:AIPtopdir.png|400px|right|thumb|'''Figure 1'''  AIP directory - top level]]
+
[[Image:ZippedAIP-10.png|500px|right|thumb|'''Figure 1'''  AIP directory - top level]]
 +
 
 +
* The AIP is zipped in the AIPsStore. The AIP directories are broken down into UUID quad directories* for efficient storage and retrieval. (*UUID quad directories: Some file systems limit the number of items allowed in a directory, Archivematica uses a directory tree structure to store AIPs. The tree is based on the AIP UUIDs. The UUID is broken down into manageable 4 character pieces, or "UUID quads", each quad representing a directory. The first four characters (UUID quad) of the AIP UUID will compose a sub directory of the AIP storage. The second UUID quad will be the name of a sub directory of the first, and so on and so forth, until the last four characters (last UUID Quad) create the leaf of the AIP store directory tree, and the AIP with that UUID resides in that directory.)('''figure 1''')
  
BagIt documentation
+
</br>
  
*The AIP is packaged in accordance with the [http://www.digitalpreservation.gov/documents/bagitspec.pdf| Library of Congress Bagit specification] (PDF, 84KB) In '''Figure 1''', the BagIt files are bag-info.txt, bagit.txt, manifest-sha512.txt and tagmanifest-md5.txt.
+
===BagIt documentation===
  
 +
*The AIP is packaged in accordance with the [http://www.digitalpreservation.gov/documents/bagitspec.pdf| Library of Congress Bagit specification] (PDF, 84KB) In '''figure 2''', the BagIt files are bag-info.txt, bagit.txt, manifest-sha512.txt and tagmanifest-md5.txt.
 +
[[File:BagSpec-10.png|500px|thumb|'''Figure 2''']]
 
</div>
 
</div>
  
Line 27: Line 35:
  
 
</div>
 
</div>
 +
*The following describes the contents of the AIP once extracted.
  
Data
+
</br>
 
 
[[Image:AIPdatadirectory.png|400px|right|thumb|'''Figure 2'''  AIP data directory]]
 
The '''data''' directory consists of the [[METS|METS file]] for the AIP and three folders: logs, metadata and objects.(See '''Figure 2''')
 
  
</div>
+
===Data===
  
<div class="clearfix">
+
[[Image:AIPdatadirectory-10.png|600px|right|thumb|'''Figure 3'''  AIP data directory]]
 +
*The '''data''' directory consists of the [[METS|METS file]] for the AIP and three folders: logs, objects. and thumbnails. (See '''figure 3''')
  
 
</div>
 
</div>
  
*Logs: /data/logs contains normalization log, malware scan log, and submission documentation extraction log generated during SIP creation. (See '''Figure 3''')
 
  
[[Image:Data_logs.png|600px|thumb|'''Figure 3'''  Logs folder content in Data]]
 
</div>
 
  
 
<div class="clearfix">
 
<div class="clearfix">
  
 
</div>
 
</div>
*Metadata: /data/metadata contains a folder containing logs from each transfer that makes up the SIP. The METS file for each transfer's original order is contained within the log file for the transfer (see bottom of page at [[METS|METS for Archivematica transfer]]).(See '''Figure 4''' ) More information about the contents of the "transfer" file(s) and screenshots in Transfer section below.
+
*[[METS|METS file]]: /data/METS.uuid.xml contains the full PREMIS implementation (see [[PREMIS metadata: original files|PREMIS metadata for original file]], [[PREMIS metadata: normalized files|PREMIS metadata: normalized files]], [[PREMIS metadata: events|PREMIS metadata: events]], and [[PREMIS metadata: rights|PREMIS metadata: rights]] The role of the METS file is to link original objects to their preservation copies and to their descriptions and submission documentation, as well as to link PREMIS metadata to the objects in the AIP.
[[Image:Data_metadata.png|600px|thumb|'''Figure 4'''  Metadata folder content in Data]]
+
 
 
</div>
 
</div>
  
Line 54: Line 58:
  
 
</div>
 
</div>
*Objects: /data/objects contains original objects, normalized objects and submission documentation. If there were any lower level directories within the SIP, that directory structure is maintained. (See '''Figure 5''' )
 
  
[[Image:Data_objects.png|600px|thumb|'''Figure 5''' Objects folder content in Data]]
+
*Logs: /data/logs contains the /transfers directory, normalization log, malware scan log, and the extraction log (from unpackaging packages) generated during SIP creation. (See '''figure 4''')
</div>
+
**The /transfers directory contains the logs from processing that occurred to each transfer which is part of the SIP in the transfer workflow in the dashboard.
 
 
<div class="clearfix">
 
 
 
</div>
 
*[[METS|METS file]]: /data/METS.uuid.xml contains the full PREMIS implementation (see [[PREMIS metadata: original files|PREMIS metadata for original file]], [[PREMIS metadata: normalized files|PREMIS metadata: normalized files]], [[PREMIS metadata: events|PREMIS metadata: events]], and [[PREMIS metadata: rights|PREMIS metadata: rights]] The role of the METS file is to link original objects to their preservation copies and to their descriptions and submission documentation, as well as to link PREMIS metadata to the objects in the AIP.
 
  
 +
[[Image:DataLogs-10.png|600px|thumb|'''Figure 4'''  Logs folder content in Data]]
 
</div>
 
</div>
  
Line 70: Line 69:
 
</div>
 
</div>
  
==The Transfers folder==
+
*Objects: /data/objects contains original objects, normalized objects, /metadata and /submissionDocumentation. If there were any lower level directories within the SIP, that directory structure is maintained. (See '''Figure 5''' )
 
+
**/metadata contains /transfers, which contains any metadata which may have been imported with the transfers
In the AIP, /data/metadata/transfers contains information about each transfer included in the formation of the SIP. At the top level of the transfer folder, you will see each transfer. '''Figure 6''' shows the contents of the transfers folder from an AIP derived from a SIP made up of only one transfer.
+
**/submissionDocumentation contains submission documentation for each transfer which is part of the SIP and each transfer's METS.xml file. The structmap for the transfer is the closest approximation of original order for the transfer.
  
 
+
[[Image:DataObjects-10.png|600px|thumb|'''Figure 5'''  Objects folder content in Data]]
[[Image:Data_metadata_transfers.png|600px|right|thumb|'''Figure 6'''  Transfer(s) within the transfer folder. Note: This is a SIP made from one transfer, but it could contain many transfers that make up a SIP.]]
 
 
</div>
 
</div>
  
 
<div class="clearfix">
 
<div class="clearfix">
  
</div>
 
Once you open up a single transfer, you will see logs and metadata folders for that transfer, /data/metadata/transfers/TransferX/logs and /data/metadata/transfers/metadata. (See '''Figure 7''')
 
[[Image:InsideTransfer.png|600px|right|thumb|'''Figure 7'''  Contents of a single transfer. Note: there could be one or many transfers that make up a SIP]]
 
 
</div>
 
</div>
  
<div class="clearfix">
+
*Thumbnails: /data/thumbnails contains any thumbnails generated for viewing in the AIP search interface of the dashboard.
  
</div>
 
The logs folder: /data/metadata/transfers/TransferX/logs contains the malware scan log, metadata extraction log, filename cleanup log, file UUID log, and the METS file that reflects the original order of the transfer before it underwent any arrangement and appraisal actions contributing to the formations of the SIP. (See '''Figure 8''')
 
[[Image:InsideTransferLogs.png|600px|thumb|'''Figure 8'''  Contents of the logs folder within a single transfer]]
 
</div>
 
  
<div class="clearfix">
 
  
</div>
 
The metadata folder: /data/metadata/transfers/TransferX/metadata contains a folder with any submission documentation submitted with the transfer. These files could be donor agreements, transfer forms, copyright agreements and any correspondence or other documentation relating to the transfer. Note that at the top level, within the objects folder, there are copies of this documentation in the submissionDocumentation folder. If this were an AIP made up of SIPs composed of multiple transfers, the objects/submissionDocumentation folder would contain copies of the submission documentation from each of those transfers. Copies of the submission documentation contained in the AIP at the objects/submissionDocumentation level has been normalized for preservation. (See '''Figure 9''')
 
[[Image:InsideTransfersubdoc.png|600px|thumb|'''Figure 9'''  Submission Documentation folder for a single transfer]]
 
 
[[Category:Development documentation]]
 

Latest revision as of 16:40, 11 February 2020


This page is no longer being maintained and may contain inaccurate information. Please see the Archivematica documentation for up-to-date information.

This page documents the structure of the AIP produced by Archivematica.

Name

The AIP name is composed of the following:

  1. Either the name of the original transfer (if no new name was assigned to the SIP upon formation) or the name of the SIP or SIPs created from the transfer, and
  2. a UUID assigned during SIP formation

example: Pictures_of_my_cat-aebbfc44-9f2e-4351-bcfb-bb80d4914112
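
A name of this form can be split back into its two parts. The following is a minimal sketch, assuming the name always ends with a hyphen followed by a UUID in the canonical 8-4-4-4-12 hexadecimal form; the function name split_aip_name is hypothetical and is not part of Archivematica.

```python
import re
import uuid

# Assumption: an AIP name is "<name>-<UUID>", with the UUID written in the
# canonical 8-4-4-4-12 hexadecimal form used in the example on this page.
_AIP_NAME = re.compile(
    r"^(?P<name>.+)-"
    r"(?P<uuid>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})$"
)


def split_aip_name(aip_name):
    """Return (name, uuid.UUID) for an AIP name, or raise ValueError."""
    match = _AIP_NAME.match(aip_name)
    if match is None:
        raise ValueError(f"not a '<name>-<UUID>' AIP name: {aip_name!r}")
    return match.group("name"), uuid.UUID(match.group("uuid"))


name, aip_uuid = split_aip_name(
    "Pictures_of_my_cat-aebbfc44-9f2e-4351-bcfb-bb80d4914112"
)
print(name)      # Pictures_of_my_cat
print(aip_uuid)  # aebbfc44-9f2e-4351-bcfb-bb80d4914112
```

Anchoring the UUID pattern at the end of the string means a user-assigned name that itself contains hyphens is still split correctly.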

"Pictures_of_my_cat" is the name assigned by the user and "aebbfc44-9f2e-4351-bcfb-bb80d4914112" is the UUID generated during SIP formation.


Directory Structure

Figure 1 AIP directory - top level
  • The AIP is zipped in the AIPsStore. The AIP directories are broken down into UUID quad directories for efficient storage and retrieval (figure 1). UUID quad directories: because some file systems limit the number of items allowed in a directory, Archivematica stores AIPs in a directory tree based on the AIP UUIDs. Each UUID is broken down into manageable four-character pieces, or "UUID quads", and each quad names a directory. The first quad of the AIP UUID names a subdirectory of the AIP store, the second quad names a subdirectory of the first, and so on, until the last quad names the leaf directory of the tree, where the AIP with that UUID resides.
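
The quad layout can be illustrated with a short sketch. It assumes the hyphens in the UUID are dropped before the 32 hexadecimal characters are cut into eight four-character quads (the page does not state how hyphens are handled); the function name uuid_quad_path is hypothetical.

```python
import uuid
from pathlib import PurePosixPath


def uuid_quad_path(aip_uuid):
    """Return the relative directory path built from the quads of a UUID."""
    # uuid.UUID validates the input; .hex gives 32 hex characters with no
    # hyphens. Assumption: hyphens are not part of any quad.
    hex_digits = uuid.UUID(aip_uuid).hex
    quads = [hex_digits[i:i + 4] for i in range(0, len(hex_digits), 4)]
    return PurePosixPath(*quads)


print(uuid_quad_path("aebbfc44-9f2e-4351-bcfb-bb80d4914112"))
# aebb/fc44/9f2e/4351/bcfb/bb80/d491/4112
```

Under this assumption the example AIP would sit in the leaf directory 4112, eight levels below the root of the AIP store.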


BagIt documentation

  • The AIP is packaged in accordance with the Library of Congress BagIt specification (PDF, 84KB). In figure 2, the BagIt files are bag-info.txt, bagit.txt, manifest-sha512.txt and tagmanifest-md5.txt.
Figure 2
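
As an illustration of what the payload manifest is for, the sketch below recomputes the SHA-512 checksum of every file listed in manifest-sha512.txt and reports the ones that are missing or do not match. It uses only the Python standard library and assumes each manifest line has the form "checksum, whitespace, path relative to the bag root"; the function name verify_payload_manifest is hypothetical, and this is not how Archivematica itself validates bags.

```python
import hashlib
from pathlib import Path


def verify_payload_manifest(bag_dir, manifest_name="manifest-sha512.txt",
                            algorithm="sha512"):
    """Return the manifest paths that are missing or fail their checksum."""
    bag_dir = Path(bag_dir)
    failures = []
    manifest = (bag_dir / manifest_name).read_text(encoding="utf-8")
    for line in manifest.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Assumption: "<hex digest><whitespace><relative path>" per line.
        expected, rel_path = line.split(maxsplit=1)
        rel_path = rel_path.strip()
        target = bag_dir / rel_path
        if not target.is_file():
            failures.append(rel_path)
            continue
        digest = hashlib.new(algorithm)
        with open(target, "rb") as handle:
            # Read in 1 MiB chunks so large objects are not loaded at once.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected.lower():
            failures.append(rel_path)
    return failures
```

An empty list means every listed payload file is present and matches its recorded checksum. The sketch does not check tagmanifest-md5.txt or detect files that exist on disk but are absent from the manifest.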
  • The following describes the contents of the AIP once extracted.


Data

Figure 3 AIP data directory
  • The data directory consists of the METS file for the AIP and three folders: logs, objects, and thumbnails. (See figure 3)


  • Logs: /data/logs contains the /transfers directory, the normalization log, the malware scan log, and the extraction log (from unpacking packages) generated during SIP creation. (See figure 4)
    • The /transfers directory contains the logs from the processing applied to each transfer in the SIP during the transfer workflow in the dashboard.
Figure 4 Logs folder content in Data
  • Objects: /data/objects contains original objects, normalized objects, /metadata and /submissionDocumentation. If there were any lower-level directories within the SIP, that directory structure is maintained. (See figure 5)
    • /metadata contains /transfers, which contains any metadata imported with the transfers.
    • /submissionDocumentation contains the submission documentation for each transfer in the SIP, along with each transfer's METS.xml file. The structMap for the transfer is the closest approximation of the transfer's original order.
Figure 5 Objects folder content in Data
  • Thumbnails: /data/thumbnails contains any thumbnails generated for viewing in the AIP search interface of the dashboard.
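
The layout described on this page can be checked on an extracted AIP with a short sketch. It assumes the four BagIt files sit at the top level, that data/ holds the logs, objects and thumbnails folders, and that the METS file matches the pattern METS.*.xml; whether every AIP always contains all of these entries is an assumption, and the function name missing_aip_entries is hypothetical.

```python
from pathlib import Path

# Entries named on this page; treating all of them as required is an assumption.
BAGIT_FILES = ("bag-info.txt", "bagit.txt",
               "manifest-sha512.txt", "tagmanifest-md5.txt")
DATA_DIRS = ("logs", "objects", "thumbnails")


def missing_aip_entries(aip_dir):
    """Return the expected files and folders not found in an extracted AIP."""
    aip_dir = Path(aip_dir)
    data_dir = aip_dir / "data"
    missing = [name for name in BAGIT_FILES if not (aip_dir / name).is_file()]
    missing += [f"data/{name}" for name in DATA_DIRS
                if not (data_dir / name).is_dir()]
    # The METS file is named METS.<uuid>.xml, so match it by pattern.
    if not (data_dir.is_dir() and any(data_dir.glob("METS.*.xml"))):
        missing.append("data/METS.<uuid>.xml")
    return missing
```

Calling missing_aip_entries on the root of an extracted AIP returns an empty list when every entry is present.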