Difference between revisions of "Normalizing based on FITS output"
Latest revision as of 15:16, 5 August 2011
This table shows the DROID, fileUtility and exifTool output for each file extension for which Archivematica has preservation and access plans.
Media type | Extension | DROID identification | fileUtility output | exifTool output | Notes |
---|---|---|---|---|---|
Audio | AC3 | (not identified) | format: ATSC A/52 aka AC-3 aka Dolby Digital stream; mimetype: application/octet-stream | Unknown file type | |
Audio | AIF | Name: Audio Interchange File Format; PUID: x-fmt/135; MimeType: audio/x-aiff | format: IFF data, AIFF audio; mimetype: audio/x-aiff | FileType: AIFF; MIMEType: audio/aiff | |
Audio | MP3 | Name: MPEG 1/2 Audio Layer 3; PUID: fmt/134; MimeType: audio/mpeg | format: audio file with ID3 version 2.3.0, contains: MPEG ADTS, Layer III; mimetype: audio/mpeg | FileType: MP3; MIMEType: audio/mpeg | |
Audio | WAV | Name: Waveform Audio; PUID: fmt/6; MimeType: audio/x-wav | format: RIFF (little-endian) data, WAVE audio, Microsoft PCM; mimetype: audio/x-wav | FileType: WAV; MIMEType: audio/x-wav | |
Audio | WMA | Name: Advanced Systems Format; PUID: fmt/131; MimeType: application/vnd.ms-asf | format: Microsoft ASF; mimetype: video/x-ms-asf | FileType: WMA; MIMEType: audio/x-ms-wma | DROID and FileUtility misidentify WMA as video |
Audio summary | | | | | Can use DROID output for most but not all audio files (AC3 and WMA are not reliably identified) |
Office Open XML | DOCX | Name: Microsoft Office Open XML; PUID: fmt/189; MimeType: (none) | format: Zip archive data; mimetype: application/zip | FileType: ZIP; MIMEType: application/zip | |
Office Open XML | PPTX | Name: Microsoft Office Open XML; PUID: fmt/189; MimeType: (none) | format: Zip archive data; mimetype: application/zip | FileType: ZIP; MIMEType: application/zip | |
Office Open XML | XLSX | Name: Microsoft Office Open XML; PUID: fmt/189; MimeType: (none) | format: Zip archive data; mimetype: application/zip | FileType: ZIP; MIMEType: application/zip | |
Office Open XML summary | | | | | FITS can't distinguish between word processing, spreadsheet and presentation files; must use file extensions |
Portable Document Format | | Name: Acrobat PDF 1.4 - Portable Document Format; PUID: fmt/18; MimeType: application/pdf | format: PDF; mimetype: application/pdf | FileType: PDF; MIMEType: application/pdf | |
Portable Document Format/Archival | | | format: PDF document; mimetype: application/pdf | FileType: PDF; MIMEType: application/pdf | |
Portable Document Format summary | | | | | Base normalization on DROID output, which is very useful for distinguishing between PDF and PDF/A |
Presentation | PPT | Name: Microsoft Powerpoint Presentation; PUID: fmt/126; MimeType: application/vnd.ms-powerpoint | format: Microsoft Office Document; mimetype: application/octet-stream | FileType: PPT; MIMEType: application/vnd.ms-powerpoint | Exiftool identifies as FileType FPX, MIMEType image/vnd.fpx if file extension is missing |
Presentation | ODP | Name: ZIP format; PUID: x-fmt/263; MimeType: application/zip | format: OpenDocument Presentation; mimetype: application/octet-stream | FileType: ZIP; MIMEType: application/zip | Use fileUtility format or file extension |
Presentation summary | | | | | Use DROID for PPT, fileUtility output or file extension for ODP |
Raster image | BMP | Name: Windows Bitmap; PUID: fmt/116; MimeType: image/bmp | format: PC Bitmap, Windows 3.x format; mimetype: image/x-ms-bmp | FileType: BMP; MIMEType: image/bmp | |
Raster image | GIF | Name: Graphics Interchange Format; PUID: fmt/4; MimeType: image/gif | format: GIF image data, version 89a; mimetype: image/gif | FileType: GIF; MIMEType: image/gif | |
Raster image | JPG | Name: JPEG File Interchange Format; PUID: fmt/43; MimeType: image/jpeg | format: JPEG image data, JFIF standard 1.01; mimetype: image/jpeg | FileType: JPEG; MIMEType: image/jpeg | |
Raster image | JP2 | Name: JPEG2000; PUID: x-fmt/392; MimeType: image/jp2 | format: JPEG 2000 image data; mimetype: application/octet-stream | FileType: JP2; MIMEType: image/jp2 | |
Raster image | PCT | Name: Macintosh PICT Image; PUID: x-fmt/80; MimeType: (none) | format: data; mimetype: application/octet-stream | FileType: PICT; MIMEType: image/pict | DROID doesn't recognize format if file extension is missing |
Raster image | PNG | Name: Portable Network Graphics; PUID: fmt/11; MimeType: image/png | format: PNG image; mimetype: image/png | FileType: PNG; MIMEType: image/png | |
Raster image | PSD | Name: Adobe Photoshop; PUID: x-fmt/92; MimeType: (none) | format: Adobe Photoshop Image; mimetype: image/vnd.adobe.photoshop | FileType: PSD; MIMEType: application/photoshop | |
Raster image | TIF | Name: Tagged Image File Format; PUID: fmt/7, fmt/8, fmt/9, fmt/10; MimeType: image/tiff | format: TIFF image data, big-endian image; mimetype: image/tiff | FileType: TIFF; MIMEType: image/tiff | |
Raster image | TGA | Name: Truevision Graphics Adapter; PUID: x-fmt/367; MimeType: (none) | format: Targa image data; mimetype: application/octet-stream | Error: Unknown file type | DROID doesn't recognize format if file extension is missing |
Raster image summary | | | | | DROID output seems reliable for raster images |
Raw camera image | 3FR | | | | |
Raw camera image | ARW | Name: Tagged Image File Format; PUID: fmt/7, fmt/8, fmt/9, fmt/10; MimeType: image/tiff | format: TIFF image data, little-endian; mimetype: image/tiff | FileType: ARW; MIMEType: image/x-raw | Exiftool identifies as FileType TIFF, MIMEType TIFF if file extension is missing |
Raw camera image | CR2 | Name: Tagged Image File Format; PUID: fmt/7, fmt/8, fmt/9, fmt/10; MimeType: image/tiff | format: TIFF image data, little-endian; mimetype: image/tiff | FileType: CR2; MIMEType: image/x-raw | |
Raw camera image | CRW | (not identified) | format: data; mimetype: application/octet-stream | FileType: CRW; MIMEType: image/x-raw | |
Raw camera image | DCR | Name: Tagged Image File Format; PUID: fmt/7, fmt/8, fmt/9, fmt/10; MimeType: image/tiff | format: TIFF image data, big-endian; mimetype: image/tiff | FileType: DCR; MIMEType: image/x-raw | Exiftool identifies as FileType TIFF, MIMEType TIFF if file extension is missing |
Raw camera image | DNG | Name: Exchangeable Image File Format (Uncompressed); PUID: x-fmt/387; MimeType: image/tiff | (no output) | FileType: DNG; MIMEType: image/x-raw | |
Raw camera image | ERF | Name: Tagged Image File Format; PUID: fmt/7, fmt/8, fmt/9, fmt/10; MimeType: image/tiff | format: TIFF image data, big-endian; mimetype: image/tiff | FileType: ERF; MIMEType: image/x-raw | Exiftool identifies as FileType TIFF, MIMEType TIFF if file extension is missing |
Raw camera image | KDC | Name: Tagged Image File Format; PUID: fmt/7, fmt/8, fmt/9, fmt/10; MimeType: image/tiff | format: TIFF image data, big-endian; mimetype: image/tiff | FileType: KDC; MIMEType: image/x-raw | Exiftool identifies as FileType TIFF, MIMEType TIFF if file extension is missing |
Raw camera image | MRW | (not identified) | format: Minolta Dimage camera raw image data; mimetype: application/octet-stream | FileType: MRW; MIMEType: image/x-raw | |
Raw camera image | NEF | Name: Tagged Image File Format; PUID: fmt/7, fmt/8, fmt/9, fmt/10; MimeType: image/tiff | format: TIFF image data, big-endian; mimetype: image/tiff | FileType: NEF; MIMEType: image/x-raw | Exiftool identifies as FileType TIFF, MIMEType TIFF if file extension is missing |
Raw camera image | ORF | (not identified) | format: Olympus ORF raw image data, little-endian; mimetype: image/x-olympus-orf | FileType: ORF; MIMEType: image/x-raw | Exiftool identifies as FileType TIFF, MIMEType TIFF if file extension is missing |
Raw camera image | PEF | Name: Tagged Image File Format; PUID: fmt/7, fmt/8, fmt/9, fmt/10; MimeType: image/tiff | format: TIFF image data, big-endian; mimetype: image/tiff | FileType: PEF; MIMEType: image/x-raw | Exiftool identifies as FileType TIFF, MIMEType TIFF if file extension is missing |
Raw camera image | RAF | (not identified) | format: data; mimetype: application/octet-stream | FileType: RAF; MIMEType: image/x-raw | |
Raw camera image | RAW | | | | |
Raw camera image | X3F | (not identified) | format: Foveon X3F raw image data, version 2.1; mimetype: image/x-x3f | FileType: X3F; MIMEType: image/x-raw | |
Raw camera image summary | | | | | Must use file extensions for raw camera images |
Spreadsheet | XLS | Name: OLE2 Compound Document Format; PUID: fmt/111; MimeType: (none) | format: Microsoft Office Document; mimetype: application/octet-stream | FileType: XLS; MIMEType: application/vnd.ms-excel | Exiftool identifies as FileType: FPX; MimeType: image/vnd.fpx if file extension is missing |
Spreadsheet | ODS | Name: ZIP Format; PUID: x-fmt/263; MimeType: application/zip | format: OpenDocument Spreadsheet; mimetype: application/octet-stream | FileType: ZIP; MIMEType: application/zip | |
Spreadsheet summary | | | | | Must use file extensions for spreadsheets |
Vector image | AI | Name: Acrobat PDF 1.5 - Portable Document Format; PUID: fmt/19; MimeType: application/pdf | format: PDF document; mimetype: application/pdf | FileType: PDF; MIMEType: application/pdf | DROID gives IdentificationWarning: Possible file extension mismatch |
Vector image | EPS | Name: Encapsulated PostScript File Format; PUID: fmt/124; MimeType: application/postscript | format: DOS EPS Binary File Postscript; mimetype: application/octet-stream | FileType: EPS; MIMEType: application/postscript | |
Vector image | SVG | Name: Scalable Vector Graphics; PUID: fmt/91; MimeType: image/svg+xml | format: SVG Scalable Vector Graphics Image; mimetype: image/svg+xml | FileType: SVG; MIMEType: image/svg+xml | |
Vector image summary | | | | | Can use DROID output for EPS and SVG; must use file extension for AI. DROID misidentifies AI as PDF. |
Video | AVI | Name: Audio/Video Interleaved Format; PUID: fmt/5; MimeType: video/x-msvideo | format: RIFF (little-endian) data, AVI; mimetype: video/x-msvideo | FileType: AVI; MIMEType: video/avi | |
Video | FLV | Name: Macromedia FLV; PUID: x-fmt/382; MimeType: video/x-flv | format: Macromedia Flash Video; mimetype: video/x-flv | FileType: FLV; MIMEType: video/x-flv | |
Video | M2V | Name: PUID: MimeType: | format: mimetype: | FileType: MIMEType: | |
Video | MOV | Name: Quicktime; PUID: x-fmt/384; MimeType: video/quicktime | format: Apple QuickTime movie; mimetype: video/quicktime | FileType: MOV; MIMEType: video/quicktime | |
Video | MPG | Name: MPEG-1 Video Format; PUID: x-fmt/385; MimeType: video/mpeg | format: MPEG sequence, v1, system multiplex; mimetype: application/octet-stream | FileType: MPEG; MIMEType: video/mpeg | |
Video | MP4 | Name: MPEG-4 Media File; PUID: fmt/199 | format: ISO Media, MPEG v4 system, version 2; mimetype: video/mp4 | FileType: MP4; MIMEType: video/mp4 | |
Video | SWF | Name: Macromedia Flash; PUID: fmt/107; MimeType: | format: Macromedia Flash data; mimetype: application/x-shockwave-flash | FileType: SWF; MIMEType: application/x-shockwave-flash | |
Video | WMV | Name: Advanced Systems Format; PUID: fmt/131; MimeType: application/vnd.ms-asf | format: Microsoft ASF; mimetype: video/x-ms-asf | FileType: WMV; MIMEType: video/x-ms-wmv | DROID and fileUtility report only the generic ASF container, so WMV cannot be distinguished from WMA |
Video summary | | | | | DROID output appears to be reliable for video, except for WMV |
Word processing | DOC | Name: OLE2 Compound Document Format; PUID: fmt/111; MimeType: (not identified) | format: Microsoft Office Document Microsoft Word Document; mimetype: application/msword | FileType: DOC; MIMEType: application/msword | If file extension is missing, ExifTool identifies as FileType FPX, MIMEType image/vnd.fpx |
Word processing | ODT | Name: ZIP format; PUID: x-fmt/263; MimeType: application/zip | format: OpenDocument Text; mimetype: application/vnd.oasis.opendocument.text | FileType: ZIP; MIMEType: application/zip | Use fileUtility output or file extension |
Word processing | RTF | Name: Rich Text Format; PUID: fmt/50, fmt/51; MimeType: application/rtf, text/rtf | format: Plain text; mimetype: text/rtf | Error: Unknown file type | If file extension is missing, DROID PUIDs are fmt/45, fmt/46, fmt/47, fmt/48, fmt/49 |
Word processing | WPD | Name: WordPerfect for Windows Document; PUID: x-fmt/203; MimeType: (not identified) | format: (Corel/WP); mimetype: application/octet-stream | Error: Unknown file type | If file extension is missing, DROID doesn't recognize this format. This is bad news because many old files with custom file extensions such as LTR or MEM are WordPerfect files |
Word processing summary | | | | | Use DROID output for RTF files only; for others, use file extension |
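
The summary rows above amount to a small decision table: for each media type, which identification source to trust when choosing a normalization path. The following is a minimal Python sketch of that lookup, not Archivematica's actual implementation. The names `DEFAULT_SOURCE`, `EXCEPTIONS` and `identification_source` are hypothetical. The table does not say what to use for AC3, WMA or WMV when DROID is unreliable, so falling back to the file extension for those three is an assumption.

```python
# Default identification source per media type, transcribed from the
# summary rows of the table. Only the extensions listed in the table
# are covered; anything else is outside the scope of this sketch.
DEFAULT_SOURCE = {
    "Audio": "droid",
    "Office Open XML": "extension",
    "Portable Document Format": "droid",
    "Presentation": "droid",
    "Raster image": "droid",
    "Raw camera image": "extension",
    "Spreadsheet": "extension",
    "Vector image": "droid",
    "Video": "droid",
    "Word processing": "extension",
}

# Per-extension exceptions named in the summary rows.
EXCEPTIONS = {
    ("Audio", "AC3"): "extension",           # assumption: table only says DROID is unreliable
    ("Audio", "WMA"): "extension",           # assumption: table only says DROID is unreliable
    ("Presentation", "ODP"): "fileutility",  # file extension is an equally valid choice
    ("Vector image", "AI"): "extension",     # DROID misidentifies AI as PDF
    ("Video", "WMV"): "extension",           # assumption: table only says DROID is unreliable
    ("Word processing", "RTF"): "droid",
}


def identification_source(media_type: str, extension: str) -> str:
    """Return 'droid', 'fileutility' or 'extension' for a table entry."""
    if media_type not in DEFAULT_SOURCE:
        raise KeyError(f"media type not covered by the table: {media_type!r}")
    ext = extension.lstrip(".").upper()
    return EXCEPTIONS.get((media_type, ext), DEFAULT_SOURCE[media_type])


if __name__ == "__main__":
    print(identification_source("Vector image", ".ai"))    # extension
    print(identification_source("Word processing", "rtf"))  # droid
```

Keeping the exceptions in a separate dictionary means a change to one summary row touches one entry, and the media types that rely entirely on file extensions stay visible at a glance.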