Option 1 (preferred)
Description: An AIC consisting of only a fileSec and structMap; AIPs consisting of data files and metadata for those data files; an AIP consisting of project/program-level (i.e. dataset) metadata and documentation.
Workflow:
- User creates X number of AIPs and puts them in archival storage
- One of these AIPs consists only of metadata and documentation about the program/project as a whole
- The AIPs must have one or more common metadata elements that allow them to be identified as being related
- User searches for AIPs in archival storage tab (using the common metadata element in the AIPs in the search query)
- Once search results are retrieved, user clicks "Create AIC" button
- AIC is created, containing only a METS structMap listing all AIPs
- Over time, user can add new AIPs and re-create the AIC at any time; the new AIC will either replace or update the old one
- Over time, if needed, the user either updates the existing documentation AIP or adds new documentation AIPs (i.e. there can be more than one documentation AIP per dataset)
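The "Create AIC" step above can be sketched as follows. This is a minimal illustration only, not Archivematica's implementation: the function name `build_aic_mets`, the `aip-` ID prefix, the `TYPE` and `USE` values, and the use of a bare UUID as the file location are all assumptions made for the example. It shows an AIC METS document containing only a fileSec and a structMap, one entry per AIP returned by the search.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_aic_mets(aic_id, aip_uuids):
    """Return a METS element whose fileSec and structMap list each AIP once.

    Hypothetical helper: element layout and attribute values are
    illustrative assumptions, not the project's actual AIC METS profile.
    """
    mets = ET.Element(f"{{{METS_NS}}}mets", {"OBJID": aic_id})
    file_sec = ET.SubElement(mets, f"{{{METS_NS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", {"USE": "AIP"})
    struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap", {"TYPE": "logical"})
    root_div = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"TYPE": "AIC"})

    # De-duplicate and sort so that re-creating the AIC is deterministic.
    for aip_uuid in sorted(set(aip_uuids)):
        file_id = f"aip-{aip_uuid}"
        file_el = ET.SubElement(file_grp, f"{{{METS_NS}}}file", {"ID": file_id})
        ET.SubElement(
            file_el,
            f"{{{METS_NS}}}FLocat",
            {"LOCTYPE": "OTHER", "OTHERLOCTYPE": "UUID", f"{{{XLINK_NS}}}href": aip_uuid},
        )
        div = ET.SubElement(root_div, f"{{{METS_NS}}}div", {"TYPE": "AIP"})
        ET.SubElement(div, f"{{{METS_NS}}}fptr", {"FILEID": file_id})
    return mets


if __name__ == "__main__":
    # Placeholder identifiers standing in for archival storage search results.
    search_results = ["aip-uuid-2", "aip-uuid-1"]
    print(ET.tostring(build_aic_mets("example-aic", search_results), encoding="unicode"))
```

Because the structMap is derived entirely from the search results, adding an AIP later only requires re-running the search and regenerating the AIC, which matches the "re-create the AIC at any time" step.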
Pros:
- Don't have to duplicate program/project-level documentation in each AIP
- Simple workflow for creating AIC
- Easy to add new AIPs
- If program/project documentation needs updating, only one AIP has to be re-processed, or user can add new documentation AIP(s)
Cons:
- There is only a one-way link between the AIC and child AIPs - i.e. the AIC has a structMap listing all child AIPs, but there is nothing in a child AIP to indicate that it belongs to a given AIC.
Sample AIC METS file
Sample pointer.xml file
Option 1A (supplied by U of A)
Description:
Workflow:
- User selects type 'AIC' on Archivematica's Transfer page
- User indicates the status of this AIC: is this a new AIC?
  - Yes: the user creates a new AIC with a new system-generated or user-supplied AIC#
  - No: the AIC already exists and the user wants to add new AIPs or update the AIC. The user should be able to find this AIC by AIC# or by other search terms
- Once a new AIC has been created OR an existing one selected, a loop starts, during which new AIPs are prepared and added to this AIC. Each AIP contains a reference to the AIC through AIC#, as part of the system generated metadata. The loop ends when the user finishes adding AIPs
- User selects "Generate AIC structMap" option and it will generate a "structMap" for this collection. This structMap also has a reference to its corresponding AIC through AIC#. If possible, store it inside the AIC, otherwise as a separate structMap AIP
- System prepares the whole package for preservation (AIC/AIPs/structMap) and sends it to archival storage
- Over time, user can add new AIPs. Generate AIC structMap step can either replace structMap or update the old one.
- Over time, the user can add/update the existing documentation in an AIC by updating the AIC's AIP
Pros:
- Don't have to duplicate program/project-level documentation in each AIP
- AIPs and AIC are linked through AIC number
Cons:
- The process of creating an AIC first, then adding AIPs and having the system automatically add the AIC UUID to the AIP, would be complex and could not easily be built on existing code for generating AIP METS files.
- "System prepares the whole package for preservation (AIC/AIPs/structMap) and sends it to archival storage" - this does not match Archivematica's basic processing pipeline design, which generates AIPs and places them individually in archival storage. In reference to point 1, above, there would be no obvious mechanism for finding the UUID of the AIC and adding it to an AIP if the AIC has not already been placed in archival storage.
- "Over time, user can add new AIPs. Generate AIC structMap step can either replace structMap or update the old one" - it is not clear where or how this step would occur, since this would require searching for an AIC and its constituent AIPs in archival storage, then generating a new structMap from stored AIPs plus the AIP that is currently being created (and this process would need to be run separately for each new AIP).
- Updating the documentation AIP would be problematic if the documentation AIP doubled as the AIC. There is currently no mechanism for AIP versioning, and it is likely that "versioning" will actually mean creating replacement AIPs. In this case, updating the documentation would result in the creation of a new AIC with a new UUID, which would mean that all the AIC UUIDs in the child AIPs would become obsolete.
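The last concern can be illustrated with a small sketch. It assumes, as the text suggests, that "versioning" means creating a replacement package with a fresh UUID; the function `replace_aic` and the dictionary keys `uuid` and `aic_uuid` are hypothetical names used only for this example.

```python
import uuid


def replace_aic(children, old_aic_uuid):
    """Simulate replacing an AIC and report which child AIPs now point at a dead UUID.

    Assumption: the replacement AIC receives a new UUID, and child AIPs
    already in archival storage are not rewritten.
    """
    new_aic_uuid = str(uuid.uuid4())
    stale_children = [c["uuid"] for c in children if c["aic_uuid"] == old_aic_uuid]
    return new_aic_uuid, stale_children


if __name__ == "__main__":
    old = str(uuid.uuid4())
    children = [
        {"uuid": "child-1", "aic_uuid": old},
        {"uuid": "child-2", "aic_uuid": old},
    ]
    new, stale = replace_aic(children, old)
    # Every existing child still references the old AIC UUID.
    print(new != old, stale)
```

Each documentation update would therefore leave every previously stored child AIP holding an obsolete reference unless those AIPs were re-processed as well.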
Option 2
Description: An AIC consisting of a METS structMap and project/program-level (i.e. dataset) metadata and documentation; content AIPs consisting of data files and metadata about the data files. AIPs have information in the METS files (in the structMap?) linking them to the parent AIC.
Workflow:
To be determined - probably a dashboard tab with a GUI to allow users to arrange existing AIPs into an AIC
Pros:
- Don't have to duplicate program/project-level documentation in each AIP
- AIPs have a link up to the AIC, so if an AIP is orphaned the relationship to the AIC can easily be reconstructed
- If program/project-level metadata and documentation needs to be updated, only the AIC needs to be re-processed
Cons:
- Workflow to create this structure may be complex
- No obvious mechanism for adding new AIPs over time
Option 3
Description: An AIC with a unique identifier consisting of project/program-level (i.e. dataset) metadata and documentation only (no structMap); AIPs consisting of data files, metadata for those data files, and the same identifier as the AIC. The relationship between the AIC and AIPs in this scenario is inferred from the matching identifiers.
Workflow:
- User creates an AIC consisting of project/program-level (i.e. dataset) metadata and documentation
- The AIC contains an identifier that distinguishes it from other AICs
- User creates AIPs consisting of data files and metadata for those data files
- User includes the AIC identifier in each AIP
- Over time, if needed the user can add more AIPs with the same identifier
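A minimal sketch of how the relationship could be inferred from matching identifiers is shown below. The record layout (dictionaries with `uuid` and `aic_id` keys) and the function name `group_by_aic` are assumptions made for illustration; in practice the identifier would come from the METS files or the search index.

```python
from collections import defaultdict


def group_by_aic(aips):
    """Group AIP UUIDs by the AIC identifier each AIP carries.

    AIPs without an identifier are collected under the key None.
    """
    groups = defaultdict(list)
    for aip in aips:
        groups[aip.get("aic_id")].append(aip["uuid"])
    return dict(groups)


if __name__ == "__main__":
    stored = [
        {"uuid": "aip-1", "aic_id": "AIC-0001"},
        {"uuid": "aip-2", "aic_id": "AIC-0001"},
        {"uuid": "aip-3", "aic_id": "AIC-0002"},
    ]
    print(group_by_aic(stored))
```

Adding an AIP to a collection later needs no change to the AIC itself: the new AIP simply carries the same identifier and appears in the group the next time the query runs.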
Pros:
- Don't have to duplicate program/project-level documentation in each AIP
- Simple workflow
- Minimal development requirements: a new metadata field for the identifier on the transfer tab, a corresponding entry in the AIC/AIP METS files, and the ability to search by AIC identifier in the archival storage tab
- If program/project-level metadata and documentation needs to be updated, only the AIC needs to be re-processed
- Easy to add more AIPs to the same AIC over time
Cons:
- No structMap in the AIC means that there is no single source of information about how many AIPs are in the AIC
Option 4
Description: No AIC; project/program-level metadata and documentation duplicated in all AIPs; links between AIPs belonging to one dataset inferred from metadata only
Workflow:
User creates any number of AIPs with complete copies of the project/program-level (i.e. dataset) metadata and documentation in each AIP
Pros:
- Minimal Archivematica development required: only ensuring that matching metadata elements are written to the AIP METS files or otherwise made available to the Elasticsearch index
- Easy to add new AIPs over time
Cons:
- User has to maintain copies of project/program-level metadata and documentation outside of Archivematica so they can be added to each AIP
- Updating the project/program-level metadata and documentation would require re-processing the AIPs
- Relationships between AIPs would have to be inferred from matching metadata elements alone; if an AIP were lost, there would be no list of AIPs belonging to the dataset which would reveal the loss
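The difference between inferring membership and checking it against a list can be sketched as follows. The names `infer_members` and `find_missing` and the record keys `uuid` and `dataset_id` are hypothetical; the example only illustrates the reasoning in the last point.

```python
def infer_members(stored_aips, dataset_id):
    """Metadata-only inference: membership is whatever currently matches."""
    return sorted(a["uuid"] for a in stored_aips if a["dataset_id"] == dataset_id)


def find_missing(listed_uuids, stored_aips):
    """Compare an independent list of expected AIPs against what is stored."""
    stored = {a["uuid"] for a in stored_aips}
    return sorted(set(listed_uuids) - stored)


if __name__ == "__main__":
    # Storage after one AIP ("aip-2") has been lost.
    stored = [
        {"uuid": "aip-1", "dataset_id": "DS-1"},
        {"uuid": "aip-3", "dataset_id": "DS-1"},
    ]
    # Inference alone returns a plausible but incomplete set, with no sign of loss.
    print(infer_members(stored, "DS-1"))
    # An independent list (such as an AIC structMap) exposes the gap.
    print(find_missing(["aip-1", "aip-2", "aip-3"], stored))
```

Without a stored list of expected AIPs, the first function is the only check available, and it cannot distinguish a complete dataset from one that has silently lost a member.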