Difference between revisions of "Digital forensics image ingest"

Latest revision as of 16:16, 11 February 2020

This page is no longer being maintained and may contain inaccurate information. Please see the Archivematica documentation (https://www.archivematica.org/docs/latest/) for up-to-date information.

Related issues: #5265

NOTE: Wherever possible, use BitCurator packages for forensics tools.

The current status of implementation can be found at: Forensic imaging steps for 1.1

Forensics image transfer type

  • Archivematica transfer type: forensic image
    • One or more images make up a transfer
    • Repository makes image using outside imaging software prior to ingest
    • Some metadata from ingest process will be included, first from FTK Imager, but later from other tools like Guymager (see metadata requirements below)
  • Image types to base development on (more analysis needed): raw sector images (dd, bin, ISO); E01, AFF, AD1; ISO images with CUE files that contain track information; and STREAM images (Kryoflux STREAM, a representation of non-decoded raw magnetic flux transitions acquired using a Kryoflux). These formats are sponsored; support for the formats listed at http://www.forensicswiki.org/wiki/Forensic_file_formats is desirable in future releases.
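
As a rough illustration of how files in a transfer could be sorted into the image families listed above, the following Python sketch groups files by extension. The extension mapping and the function name classify_image are assumptions made for this example, not Archivematica behaviour; a real workflow would identify formats by content rather than by file name, and Kryoflux STREAM sets are not covered here.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the image families named above.
# The extensions are assumptions for illustration only.
IMAGE_FAMILIES = {
    ".dd": "raw sector image",
    ".bin": "raw sector image",
    ".iso": "raw sector image",
    ".e01": "E01",
    ".aff": "AFF",
    ".ad1": "AD1",
}


def classify_image(filename: str) -> str:
    """Return the image family for a file name, or 'unknown'."""
    suffix = Path(filename).suffix.lower()
    return IMAGE_FAMILIES.get(suffix, "unknown")
```

For example, classify_image("floppy.dd") yields "raw sector image", while a file with an unlisted extension yields "unknown" and would need manual review.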

Forensics image transfer workflow

[Figure: ArchivematicaForensicImageIngest.png]
[Figure: ArchivematicaForensicImageIngest(2).png]

Detail

  • User images external media outside the Archivematica workflow
  • User uploads image(s) into the Archivematica transfer tab of the dashboard by browsing to the appropriate transfer source directory and selecting a directory containing their image(s)
  • User enters transfer name and accession number
  • User selects the metadata (MD) entry template for entering MD about the imaging process
    • User enters MD (see MD requirements below)
    • User saves MD and starts transfer processes
  • User selects Start transfer to begin Archivematica transfer processing
  • Fiwalk with Fido or BitCurator fiwalk package completes the Characterize and extract metadata micro-service
  • Archivematica runs the Bulk Extractor tool (http://www.forensicswiki.org/wiki/Bulk_extractor) in the Examine contents micro-service and indexes the output, so that it can support reporting and visualization in the transfer backlog search for SIP creation and/or the AIP advanced search, allowing for minimal description
  • Transfer micro-services complete
  • At Create SIP from Transfer micro-service, user selects one of two options:
    • If the user is an archivist/curator ready to process the image through to storage and/or access, choose Create single SIP and continue processing
    • If the user is uploading multiple images as part of one accession, for processing by an archivist/curator later, choose Send to backlog
      • In the second scenario, once all images from an accession are in the backlog, user alerts archivist/curator that the accession is ready for further processing
    • Archivist searches for the accession in the transfer backlog, selects the appropriate transfers, and selects Create SIP
  • In ingest tab, user approves SIP creation
  • In ingest tab, prior to normalization, there is a decision point at Extract packages micro-service - User selects from drop-down: Extract objects from image, Do not extract objects from image, Reject
    • If user chooses not to extract objects, then skip micro-service decision about tool output to base normalization on, choose normalization for preservation only or no normalization, and continue standard micro-services to store AIP.
    • If user chooses to extract objects, Archivematica runs FITS on the extracted contents. The user continues standard workflow, choosing any of the normalization options (including manual normalization) and continues processing to storage and/or access.
  • EXCEPTIONS:
    • In the case of AD1 images, user should be able to choose to extract the objects from the AD1 image before transfer. Archivematica should recognize an AD1 image and issue an alert/warning. If the user chooses to proceed with the transfer/ingest then the AD1 file just gets stored without any normalization or metadata extraction.
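
The decision point at the Extract packages micro-service and the AD1 exception can be sketched as a small Python function. This is only a reading of the steps above under stated assumptions: the names ExtractChoice and plan_ingest and the step strings are hypothetical and do not correspond to actual Archivematica code.

```python
from enum import Enum
from typing import List


class ExtractChoice(Enum):
    # The three drop-down options described in the workflow above.
    EXTRACT = "Extract objects from image"
    DO_NOT_EXTRACT = "Do not extract objects from image"
    REJECT = "Reject"


def plan_ingest(choice: ExtractChoice, is_ad1: bool = False) -> List[str]:
    """Return the remaining ingest steps for a forensic image (hypothetical)."""
    if choice is ExtractChoice.REJECT:
        return []
    if is_ad1:
        # Exception: an AD1 image that proceeds to ingest is stored as-is.
        return ["store AD1 image without normalization or metadata extraction"]
    if choice is ExtractChoice.DO_NOT_EXTRACT:
        return [
            "choose normalization for preservation only, or no normalization",
            "store AIP",
        ]
    return [
        "run FITS on extracted contents",
        "choose any normalization option (including manual normalization)",
        "continue processing to storage and/or access",
    ]
```

Checking for rejection first keeps the AD1 exception from overriding an explicit Reject; whether that ordering matches the intended behaviour is an assumption.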

Metadata requirements

See associated issues #6093 and #6123.

When the user selects Forensic image transfer type, each image uploaded as part of the transfer will include a metadata form icon that, if selected, will open a form in another browser tab. There, the user will enter some or all of the MD indicated below in the Template for manual data entry list.

  • Template for manual data entry
    • accession number - recorded in transfer upload in dashboard
    • media number - manual
    • label text - manual (long text field)
    • media manufacturer - manual
    • serial number - manual
    • media format - manual, could be controlled value list - There is no support for taxonomy creation and management in Archivematica, this would require new sponsored feature development (Evelyn)
    • media density - manual, could be controlled value list - See above
    • source filesystem
    • imaging interface - manual, could be controlled value list - See above
    • examiner - AUTOPOPULATED based on Archivematica user (PREMIS agent) - This can't be auto-populated since the event is undertaken prior to ingest and the examiner may be different from the logged-in Archivematica user (Evelyn)
    • image format - manual, could be controlled value list - See above
    • imaging software - manual, could be controlled value list - See above
    • notes about the imaging process - manual (long text field)
  • Import from imaging tool FTK or fiwalk/sleuthkit
    • imaging date (FTK or other imaging tool output)
    • imaging success - Yes, Yes with errors (FTK or other imaging tool output)
    • image fixity (FTK or other imaging tool output)
    • source filesystem (fiwalk)
    • accession data about extent (fiwalk)
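
One way to picture the two lists above is as a single record per image, with every field optional because the user may enter only some of the metadata. The Python sketch below is an assumption-laden illustration: the class name ImagingMetadata, the field names, and the filled helper are hypothetical and are not taken from the Archivematica data model.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class ImagingMetadata:
    # Fields from the template for manual data entry.
    accession_number: Optional[str] = None
    media_number: Optional[str] = None
    label_text: Optional[str] = None
    media_manufacturer: Optional[str] = None
    serial_number: Optional[str] = None
    media_format: Optional[str] = None
    media_density: Optional[str] = None
    source_filesystem: Optional[str] = None
    imaging_interface: Optional[str] = None
    examiner: Optional[str] = None
    image_format: Optional[str] = None
    imaging_software: Optional[str] = None
    imaging_notes: Optional[str] = None
    # Fields imported from imaging tool or fiwalk output.
    imaging_date: Optional[str] = None
    imaging_success: Optional[str] = None
    image_fixity: Optional[str] = None
    extent: Optional[str] = None

    def filled(self) -> Dict[str, str]:
        """Return only the fields that have been given a value."""
        return {k: v for k, v in asdict(self).items() if v is not None}
```

A record created with only an accession number and a media format would report just those two fields from filled(), which mirrors the "some or all" wording of the requirement.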


This will be a standard Archivematica AIP with METS modified to include forensic disk image metadata linked to transfer components, as follows:

DigForImMETS.png

DFXML was considered for the content entered into the template but was judged inappropriate, because that content describes the act of imaging rather than the tool output from the process.

The following table contains suggestions for parsing into descriptive systems.

{| border="1" cellpadding="10" cellspacing="0" width="100%"
! element
! description
! DACS (2013)
! ISAD(G)
! EAD
! PREMIS 2.2
|-
|media number
|repository-specific alphanumeric designation assigned to individual physical media/carrier
|2.1.3 local identifier - At the highest level of a multilevel description or in a single level description, provide a unique identifier for the materials being described in accordance with the institution’s administrative control system. Optionally, devise unique identifiers at lower levels of a multilevel description.
|3.1.1 - Reference codes
|<unitid>
|
|-
|label text
|textual transcription
|7.1.2 Record, as needed, information not accommodated by any of the defined elements of description.
|3.6.1 Note
|<odd>, <note>
|
|-
|media manufacturer
|
|7.1.4 - If the materials being described are in electronic form, give details of any migration or logical reformatting since its transfer to archival custody. Indicate the location of any relevant documentation. Information regarding digitization is provided in the Existence and Location of Copies Element (6.2).
|3.6.1 Note
|<odd>, <note>
|
|-
|serial number
|when applicable to external media
|7.1.4 or 7.1.6 If appropriate at the file or item level of description, make a note of any important numbers borne by the unit being described.
|3.6.1 Note
|<odd>, <note>
|
|-
|media format
|a controlled value list (e.g. 3.5" floppy, 5.25" floppy, CD-R, etc.)
|7.1.4
|3.6.1 Note
|<odd>, <note>
|
|-
|media density
|a controlled value list (e.g. single density, double density, quad density, high density)
|7.1.4
|3.6.1 Note
|<odd>, <note>
|
|-
|source filesystem
|a controlled value list (e.g. HFS, FAT, etc.) with the ability to add terms
|7.1.4
|3.6.1 Note
|<odd>, <note>
|
|-
|notes about the imaging process
|textual field to describe more detail about the imaging process
|7.1.4
|3.6.1 Note
|<odd>, <note>
|
|-
|imaging interface
|a controlled value list (e.g. Catweasel, Firewire, USB, IDE, etc.)
|
|
|
|2.2 - Event: Image capture
|-
|examiner
|the person doing the imaging
|
|
|
|2.2 - Agent
|-
|imaging date
|date of imaging
|
|
|
|2.2 - Event: Image capture
|-
|imaging success
|e.g. Yes / Yes, with errors
|
|
|
|2.2 - Event: Image capture
|-
|image format
|a controlled value list (e.g. AFF3, dd/sector image, AD1, etc.) with the ability to add terms
|
|
|
|2.2 - Object
|-
|imaging software
|a controlled value list (e.g. FTK Imager 3.1.0.1514, Kryoflux, DTC 2.00 beta 9, etc.) with the ability to add terms
|
|
|
|2.2 - Agent
|-
|image fixity
|type(s) and value(s) from FTK csv output
|
|
|
|2.2 - Object
|}
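
To make the EAD suggestions in the table more concrete, the following Python sketch turns a template element into an EAD fragment. It is a minimal illustration under assumptions: the dictionary EAD_TARGETS and the function to_ead are hypothetical, <odd> is chosen here where the table offers both <odd> and <note>, and elements that the table maps only to PREMIS are deliberately left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup built from the EAD suggestions in the table above.
EAD_TARGETS = {
    "media number": "unitid",
    "label text": "odd",
    "media manufacturer": "odd",
    "serial number": "odd",
    "media format": "odd",
    "media density": "odd",
    "source filesystem": "odd",
    "notes about the imaging process": "odd",
}


def to_ead(element: str, value: str) -> str:
    """Serialize one template element as an EAD fragment (illustrative only)."""
    tag = EAD_TARGETS[element]  # raises KeyError for PREMIS-only elements
    node = ET.Element(tag)
    if tag == "unitid":
        node.text = value
    else:
        # Notes carry the element name so the value stays interpretable.
        paragraph = ET.SubElement(node, "p")
        paragraph.text = f"{element}: {value}"
    return ET.tostring(node, encoding="unicode")
```

A descriptive system could call to_ead once per filled-in field and place the resulting fragments in the appropriate component of the finding aid.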


Forensic image transfer tools

fiwalk

  • Characterize and extract metadata micro-service
  • Use Mark Matienzo's GitHub version, which includes FIDO for format identification, since fiwalk's own format identification relies on libmagic (unsatisfactory for our purposes)

Sample fiwalk XML output:


<?xml version='1.0' encoding='ISO-8859-1'?>
<fiwalk xmloutputversion='0.2'>
  <metadata 
  xmlns='http://example.org/myapp/' 
  xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' 
  xmlns:dc='http://purl.org/dc/elements/1.1/'>
    <dc:type>Disk Image</dc:type>
  </metadata>
  <creator>
    <program>fiwalk</program>
    <version>0.5.7</version>
    <os>Darwin</os>
    <library name="tsk" version="3.0.1"></library>
    <library name="afflib" version="3.5.2"></library>
    <command_line>fiwalk -x /dev/disk2</command_line>
  </creator>
  <source>
    <imagefile>/dev/disk2</imagefile>
  </source>
<!-- fs start: 512 -->
  <volume offset='512'>
    <Partition_Offset>512</Partition_Offset>
    <block_size>512</block_size>
    <ftype>2</ftype>
    <ftype_str>fat12</ftype_str>
    <block_count>5062</block_count>
    <first_block>0</first_block>
    <last_block>5061</last_block>
    <fileobject>
      <filename>README.txt</filename>
      <id>2</id>
      <filesize>43</filesize>
      <partition>1</partition>
      <alloc>1</alloc>
      <used>1</used>
      <inode>6</inode>
      <type>1</type>
      <mode>511</mode>
      <nlink>1</nlink>
      <uid>0</uid>
      <gid>0</gid>
      <mtime>1258916904</mtime>
      <atime>1258876800</atime>
      <crtime>1258916900</crtime>
      <byte_runs>
       <run file_offset='0' fs_offset='37376' img_offset='37888' len='43'/>
      </byte_runs>
      <hashdigest type='md5'>2bbe5c3b554b14ff710a0a2e77ce8c4d</hashdigest>
      <hashdigest type='sha1'>b3ccdbe2db1c568e817c25bf516e3bf976a1dea6</hashdigest>
    </fileobject>
  </volume>
<!-- end of volume -->
<!-- clock: 0 -->
  <runstats>
    <user_seconds>0</user_seconds>
    <system_seconds>0</system_seconds>
    <maxrss>1814528</maxrss>
    <reclaims>546</reclaims>
    <faults>1</faults>
    <swaps>0</swaps>
    <inputs>56</inputs>
    <outputs>0</outputs>
    <stop_time>Sun Nov 22 11:08:36 2009</stop_time>
  </runstats>
</fiwalk>
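
Output shaped like the sample above could be read with Python's standard xml.etree.ElementTree module, as in the sketch below. The function name list_fileobjects and the returned dictionary layout are assumptions for illustration; the sketch also assumes un-namespaced fileobject elements, as in this sample, whereas other DFXML versions may declare a namespace.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def list_fileobjects(xml_bytes: bytes) -> List[Dict[str, object]]:
    """Collect name, size, and hashes for each fileobject in fiwalk output."""
    root = ET.fromstring(xml_bytes)
    files = []
    for fileobject in root.iter("fileobject"):
        # Each hashdigest element carries its algorithm in the 'type' attribute.
        hashes = {
            digest.get("type"): digest.text
            for digest in fileobject.findall("hashdigest")
        }
        files.append(
            {
                "filename": fileobject.findtext("filename"),
                "filesize": int(fileobject.findtext("filesize")),
                "hashes": hashes,
            }
        )
    return files
```

Passing the raw bytes of the file lets the parser honour the encoding named in the XML declaration. Values such as the per-file hashes could then be indexed or compared against fixity information recorded elsewhere.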

Bulk Extractor