Difference between revisions of "Email"
Revision as of 12:50, 26 July 2012
Email preservation planning is currently under development. See also Email preservation.
Significant characteristics of email
Preservation Format
- Options:
- CERP Project E-Mail Account Schema
- mbox (implemented in 0.7)
- maildir (implemented in 0.9)
Access Format
- Options:
- mbox (implemented in 0.7)
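Because maildir is listed as a preservation option and mbox as the access format, one conceivable workflow is to derive an mbox access copy from a maildir preservation copy. The following is a minimal sketch of that idea using Python's standard `mailbox` module. It is an assumption for illustration only, not a description of how Archivematica performs the conversion, and the function name `maildir_to_mbox` and its parameters are hypothetical.

```python
import mailbox


def maildir_to_mbox(maildir_path, mbox_path):
    """Copy every message from a Maildir into a single mbox file.

    Hypothetical helper for illustration; returns the number of
    messages copied.
    """
    # create=False: fail if the source Maildir does not exist.
    source = mailbox.Maildir(maildir_path, factory=None, create=False)
    # The target mbox file is created if it is missing.
    target = mailbox.mbox(mbox_path)
    count = 0
    try:
        target.lock()  # guard against concurrent writers
        for message in source:
            # Headers and body are copied as-is; the mailbox module
            # writes a default "From " separator line for each message.
            target.add(message)
            count += 1
        target.flush()
    finally:
        target.close()  # also releases the lock if it is held
        source.close()
    return count
```

A call such as `maildir_to_mbox("transfer/Maildir", "access/mail.mbox")` (both paths are placeholders) would append the messages to the target file and report how many were copied.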
Attachments
- These should be normalized according to the media type preservation plan for each attachment file format. Each attachment must remain linked to the email message it was sent with.
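One way to keep that link explicit is to record each attachment against the Message-ID of its parent message before the attachment is normalized separately. The sketch below uses Python's standard `email` package. The function names `build_sample` and `list_attachments` and the tuple layout they return are illustrative assumptions, not part of Archivematica.

```python
import email
from email import policy
from email.message import EmailMessage


def build_sample():
    """Build a small message with one attachment (sample data only)."""
    msg = EmailMessage()
    msg["Message-ID"] = "<sample-1@example.org>"
    msg["Subject"] = "Report"
    msg.set_content("See attached.")
    msg.add_attachment(
        b"%PDF-1.4 sample",
        maintype="application",
        subtype="pdf",
        filename="report.pdf",
    )
    return msg.as_bytes()


def list_attachments(raw_bytes):
    """Return (message_id, [(filename, content_type, size_in_bytes), ...]).

    The Message-ID is the key that ties every attachment back to the
    message it came from.
    """
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    message_id = str(msg["Message-ID"])
    attachments = []
    for part in msg.iter_attachments():
        # decode=True undoes the base64 or quoted-printable transfer encoding.
        payload = part.get_payload(decode=True) or b""
        attachments.append(
            (part.get_filename(), part.get_content_type(), len(payload))
        )
    return message_id, attachments
```

Storing the returned Message-ID alongside each normalized attachment would let the attachment be traced back to its message even after the two are stored in different formats.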
Normalization tool
- Options:
- OfflineImap
- readpst (implemented in 0.7)
- Aperture
- Tika has an mbox extractor
- PEDALS project email extractor (MS-Windows)
- libpst
- aid4mail (proprietary license, MS-Windows)
- libpff (Not Tested)
- CERP's Email Preservation Parser
Conversion test results
- PST to MBOX using readpst
- PST to Email Account XML Schema using CERP Email Parser
- Gmail to Maildir using OfflineImap
- Zimbra to Maildir using OfflineImap
Comments
- The PEDALS (Persistent Digital Archives and Library System) project has produced an open-source email extractor that converts .pst files to XML. However, the tool runs on Windows only, so users would need to extract the email outside Archivematica and submit the extracted messages as the SIP. For more information, see Library of Congress News and Events at http://www.digitalpreservation.gov/news/2010/20100924news_article_pedals_email_tool.html.
- mbox might be an acceptable preservation format for email. An mbox file is an aggregation of email messages stored one after another as plain text in a single file.
- The Bodleian Libraries at the University of Oxford use mbox as a preservation format for mailboxes. See http://www.dpconline.org/component/docman/doc_download/640-emailthomasjul2011.
- A detailed report on testing conversion of email from proprietary to open formats is available at http://www.significantproperties.org.uk/email-testingreport.html. The report includes information about testing conversions from PST to mbox using readpst.
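To make the comment about mbox being a plain-text aggregation concrete, the sketch below shows a two-message mbox as text and reads it back with Python's standard `mailbox` module. The sample addresses, the `SAMPLE_MBOX` constant, and the `summarize_mbox` function are invented for illustration.

```python
import mailbox

# Each message starts with a "From " separator line, followed by the
# headers, a blank line, and the body. Sample data only.
SAMPLE_MBOX = (
    "From alice@example.org Thu Jul 26 12:50:00 2012\n"
    "From: alice@example.org\n"
    "Subject: First\n"
    "\n"
    "Hello\n"
    "\n"
    "From bob@example.org Thu Jul 26 12:51:00 2012\n"
    "From: bob@example.org\n"
    "Subject: Second\n"
    "\n"
    "Hi\n"
)


def summarize_mbox(path):
    """Return a list of (sender, subject) pairs, one per message."""
    box = mailbox.mbox(path, create=False)
    try:
        return [(str(m["From"]), str(m["Subject"])) for m in box]
    finally:
        box.close()
```

Writing `SAMPLE_MBOX` to a file and passing its path to `summarize_mbox` yields one pair per message, which shows that the individual messages can be recovered from the single aggregated file.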