Transcoder
Transcode: convert (language or information) from one form of coded representation to another. [source: Oxford English Dictionary]
Overview
The transcoder is developed by Artefactual to normalize files and generate access copies in the Archivematica system. In earlier versions it was called the normalizer. It tries to identify each file's type from its file extension or other metadata, then looks for configured actions that match the identified type. It performs those actions and exits with a zero status if it believes they completed successfully.
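The identify-then-match flow described above can be sketched in shell. This is only an illustration: the function names `file_extension` and `action_for`, the action names, and the extension list are all hypothetical and are not taken from the transcoder or its format policies.

```bash
# Hypothetical sketch; none of these names come from the transcoder itself.

# Print the lower-cased extension of a file name, or fail if there is none.
file_extension() {
  name="${1##*/}"            # strip any leading directories
  case "$name" in
    *.*) printf '%s\n' "${name##*.}" | tr '[:upper:]' '[:lower:]' ;;
    *)   return 1 ;;
  esac
}

# Map an extension to a placeholder action name (illustrative values only).
action_for() {
  case "$1" in
    tif|tiff|bmp) echo "convert-image" ;;
    wav|avi)      echo "convert-av" ;;
    *)            return 1 ;;   # no configured action for this type
  esac
}

ext="$(file_extension "scans/Page01.TIF")" && action="$(action_for "$ext")"
echo "extension=$ext action=$action"
```

Both helpers return a non-zero status when nothing matches, so a caller can tell "no action configured" apart from success by checking the exit status.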
Development
To manage the complexity of automating the link between file identification and actions, a database-backed implementation of the transcoder is currently being built to replace the existing XML-based one.
Configuration
Configuration files are located in the /etc/transcoder/ directory.
transcoderConfig.conf
transcoderConfig.conf is the primary transcoder configuration file. It is a bash script which defines the variables used in the various file format policy XML files; it primarily contains paths to conversion tools and standard file names.
Variables are stored as standard bash shell script variables. Variables can be added or edited using any text editor; any new variables added become available for use in format policy XML files. They use the format:
variableName="variable contents"
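As a minimal sketch of adding a variable, the fragment below defines a made-up `exampleToolPath` and `exampleToolCommand`; neither is one of the transcoder's defaults, and the tool path and `--quiet` flag are placeholders. Because the file is ordinary bash, there must be no spaces around `=`, and values containing spaces need quotes.

```bash
# Hypothetical additions to transcoderConfig.conf; these names are
# illustrative and are not variables the transcoder ships with.
exampleToolPath="/usr/local/bin/exampletool "

# A later variable can reuse an earlier one through ${...} expansion.
exampleToolCommand="${exampleToolPath}--quiet"
```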
Default variables:
Variable | Description | Default value |
formatPoliciesPath | Directory containing format policy XML files | /etc/transcoder/archivematicaFormatPolicies/ |
transcoderScriptsDir | Directory containing transcoder normalization scripts | /usr/lib/transcoder/transcoderScripts/ |
convertPath | Path to ImageMagick for image conversion. Requires a space at the end. | /usr/bin/convert |
ffmpegPath | Path to ffmpeg for audio and video. Requires a space at the end. | /usr/bin/ffmpeg |
theoraPath | Path to ffmpeg2theora script to create Ogg Theora and Vorbis files. Currently unused. Requires a space at the end. | /usr/bin/ffmpeg2theora |
unoconvPath | Path to unoconv binary for converting document files. Currently unused. Requires a space at the end. | /usr/bin/unoconv |
unoconvAlternatePath | Path to unoconv launcher script for converting document files. Requires a space at the end. | /usr/lib/transcoder/transcoderScripts/unoconvAlternative.sh |
DublinCore | File name for Dublin Core metadata | dublincore.xml |
MD5FileName | File name containing SIP MD5 checksum | MD5checksum.txt |
fileUUIDHumanReadable | Log file containing unique IDs for items within a SIP | FileUUIDs.log |
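The sketch below shows one plausible reason for the "Requires a space at the end" notes. It assumes, without confirmation from the transcoder's own code, that a tool path is concatenated directly with its arguments, so the trailing space is what separates the two. The file names `input.tif` and `output.jpg` are placeholders, and a throwaway config file stands in for /etc/transcoder/transcoderConfig.conf.

```bash
# Write a temporary config so the demo does not touch /etc/transcoder/.
config="$(mktemp)"
cat > "$config" <<'EOF'
convertPath="/usr/bin/convert "
ffmpegPath="/usr/bin/ffmpeg "
EOF

# Load the variables into the current shell.
. "$config"

# Assumed usage: the path is joined directly to its arguments, so the
# trailing space keeps the tool name and first argument apart.
command="${convertPath}input.tif output.jpg"
echo "$command"

rm -f "$config"
```

Without the trailing space in `convertPath`, the same concatenation would produce `/usr/bin/convertinput.tif output.jpg`, which is not a valid command.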