Difference between revisions of "File Browser Requirements"

From Archivematica

Revision as of 12:48, 13 March 2012

This page describes requirements for moving the entire Archivematica interface to the web dashboard, including file browser functionality for transfer and SIP configuration, review of normalization output, and review of AIPs and DIPs prior to upload to storage and access. (mockups forthcoming)

Front Page of Browser

  • Archivist login: the archivist's name will be attached to all processing actions.
  • Upload transfer (browse box)
  • Create structured directory (create three directories: logs, metadata, objects)

-OR-

  • Restructure transfer for processing (put all contents into the objects directory and create empty logs and metadata directories)
  • Run checksum on transfer and place the checksums in a file called checksum.md5 in the metadata directory (checkbox). This is in case the transfer hasn't arrived with a checksum already.
  • Assign transfer backup location (what is it backing up? at which stage?). This could be a staging area, in case many transfers will become one SIP or more time is needed before processing begins.
  • Assign transfer type (drop-down, user configurable: Generic, DSpace export, digitization output (Issue 713), VanDocs export, BagIt package (Issue 593))
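
The restructuring and checksum steps above could be sketched roughly as follows. This is only an illustration in Python, not Archivematica's implementation: the function names restructure_transfer and write_checksums are hypothetical, the md5sum-style line format with paths relative to the transfer root is an assumption, and the sketch assumes the transfer does not already contain an objects directory.

```python
import hashlib
import shutil
from pathlib import Path


def restructure_transfer(transfer_dir):
    """Move all existing contents into objects/ and add empty logs/ and metadata/."""
    root = Path(transfer_dir)
    existing = list(root.iterdir())  # snapshot taken before new directories exist
    objects = root / "objects"
    objects.mkdir()  # assumption: raises if the transfer is already structured
    for entry in existing:
        shutil.move(str(entry), str(objects / entry.name))
    for name in ("logs", "metadata"):
        (root / name).mkdir()
    return root


def write_checksums(transfer_dir):
    """Write an MD5 line for every file under objects/ to metadata/checksum.md5."""
    root = Path(transfer_dir)
    lines = []
    for path in sorted(p for p in (root / "objects").rglob("*") if p.is_file()):
        digest = hashlib.md5()
        with path.open("rb") as handle:
            # Read in chunks so large objects are not loaded into memory at once.
            for chunk in iter(lambda: handle.read(65536), b""):
                digest.update(chunk)
        relative = path.relative_to(root).as_posix()
        lines.append(f"{digest.hexdigest()}  {relative}")
    manifest = root / "metadata" / "checksum.md5"
    manifest.write_text("\n".join(lines) + "\n")
    return manifest
```

In the dashboard, the checkbox would decide whether write_checksums runs at all, so that a checksum file delivered with the transfer is not overwritten.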

More to come on this...

  • full-text indexing (Tika/Lucene/ElasticSearch) on recognizable text files as well as data visualization (ElasticSearch/Protovis) to assist with:
       identifying keywords for security classification and appraisal
       identifying duplicates and close matches (using hex values?)
       high-level sorting and grouping of batches of objects into SIPs (incl. tracking difference between 'original order' and new physical/logical arrangement)
   
  • Archivematica-compliant SIP creation, including:
       log of original directory structure, diff to new structure (using METS <structMap> to represent both)
       use METS <structMap> (or physical directory arrangement of SIP?) to represent archival arrangement for rebuild in access system (see Issue 380)
  • Change permissions of directory contents recursively (or should this be integrated in every step to avoid problems?)
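
Two of the items above, duplicate identification and recursive permission changes, can be illustrated with a short Python sketch. It is a rough outline under stated assumptions, not a design decision: it substitutes SHA-256 digests for the "hex values" mentioned above, so it finds exact duplicates only (close matches would need a different technique), and the function names and the default modes 0o755 and 0o644 are hypothetical choices.

```python
import hashlib
import os
import stat
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Group files under root by SHA-256 digest; return groups with 2+ members."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # Whole-file read keeps the sketch short; chunked reads suit large files.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


def set_permissions(root, dir_mode=0o755, file_mode=0o644):
    """Apply dir_mode to every directory and file_mode to every file under root."""
    for current, _dirs, files in os.walk(root):
        os.chmod(current, dir_mode)
        for name in files:
            os.chmod(os.path.join(current, name), file_mode)
```

Whether set_permissions is a separate step or is called at the start of every processing step is the open question raised in the last item above.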

TRANSFER TAB

indexing

  • keyword & pattern matching for privacy/security sensitive information (e.g. social insurance numbers, credit card numbers, 'private', 'confidential')
  • list of PDFs that have not been OCR'ed
  • list of password protected / encrypted files
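
As a rough illustration of the keyword and pattern matching item, the Python sketch below scans extracted text for the two example keywords, for nine-digit social insurance numbers, and for 13 to 19 digit card numbers. The regular expressions and the names scan_text and luhn_valid are assumptions made for this example; the Luhn check is used only to cut down on false positives and does not prove that a number is real.

```python
import re

# Illustrative patterns only; a real deployment would make these user configurable.
KEYWORDS = re.compile(r"\b(private|confidential)\b", re.IGNORECASE)
SIN = re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{3}\b")
CARD = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def scan_text(text):
    """Return (kind, matched_text) pairs for keywords, SINs and card numbers."""
    hits = [("keyword", m.group(0)) for m in KEYWORDS.finditer(text)]
    for kind, pattern in (("sin", SIN), ("card", CARD)):
        for match in pattern.finditer(text):
            digits = re.sub(r"\D", "", match.group(0))
            if luhn_valid(digits):
                hits.append((kind, match.group(0)))
    return hits
```

The hits could feed the transfer tab as a per-file list, leaving the decision about each flagged file to the archivist.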