Meeting 20120516
Artefactual Systems, Internal Archivematica Dev Mtg, 2012-05-16
Development
- Joseph worked on transcoder integration into the MCP, which adds sub-workflows so that the MCP can react to failed microservices
- Team members held a meeting to discuss email processing
- Courtney is working on transfer workflow diagrams and will create mockups for user management
- Mike has been working on making the DIP destination selectable: added a UI for it
- Evelyn fixed issue 983 (changing video access normalization commands)
- Mike made some minor UI changes to the dashboard
Deployment
- VMs will be used for the UNESCO workshop
Testing
- Austin has been researching different email conversion paths
- Austin will continue working on scalability testing reports
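One possible mbox to maildir path, offered only as a minimal sketch: Python's standard-library `mailbox` module can read an mbox file and write a Maildir without an IMAP server. This is not the script the team used or settled on; the function name `mbox_to_maildir` and both path arguments are hypothetical, and the sketch assumes the mbox was already produced by an earlier step such as readpst.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a Maildir.

    Hypothetical helper for illustration only. maildir_path should either
    not exist yet (it is then created with tmp/new/cur) or already be a
    valid Maildir. Returns the number of messages copied.
    """
    source = mailbox.mbox(mbox_path, create=False)
    target = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    try:
        for message in source:
            # Messages that are not MaildirMessage instances land in "new".
            target.add(message)
            copied += 1
    finally:
        target.close()
        source.close()
    return copied
```

A call such as `mbox_to_maildir("account.mbox", "account_maildir")` would return the number of messages copied. Whether flags and folder structure survive well enough for preservation purposes would still need testing against real PST exports.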
Documentation
Chat log
<peterVG> hey courtneyHome heard you're able to stay in OhCanada for a while longer
* ARTi (~austin@70.71.0.109) has joined #openarchives
<berwin221> I still need to do some cleanup serrounding that, and I've been talking with mike (mcantelon) about how to display this in the dashboard
* arrti has quit (Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org)
<courtneyHome> peterVG: Yup! I am so very happy
<epmclellan> berwin221 nice
<peterVG> yay!
<epmclellan> courtneyHome was a little...nervous, shall we say
<mcantelon> Nice!
<peterVG> sorry berwin22 nice work on transcoder integration and sub-jobs
<djjuhasz> courtneyHome: we're very happy you get to stay too :)
<mcantelon> courtneyHome: Yay!
<epmclellan> we had a meeting yesterday to discuss email processing, it likes we're going to try to convert RAC's PST files to maildir prior to ingest
<epmclellan> http://code.google.com/p/archivematica/issues/detail?id=962
<peterVG> not just RAC
<berwin221> congrats courtney
<epmclellan> if we can standardize on maildir ingest for 0.9 that will be a really useful feature
<epmclellan> yes
<peterVG> yes
<epmclellan> peterVG: right, I think many institutions would want to do that too
<mcantelon> Mike has been working on making the DIP destination selectable. I added a UI for it this week and will move to testing it.
<peterVG> epmclellan: & austin will work on maildir capture/conversion tools/instructions
<epmclellan> yup, should be interesting
<courtneyHome> i'm finishing up the new transfer workflow diagrams today
<peterVG> berwin22 can now focus on just maildir as the expected input format for emailTransfers
<ARTi> epmclellan, peterVG: ive been investigating some of the conversion paths... if we are using mbox we still may want to go readpst > mbox > maildir
<peterVG> ARTi: fair enough
<epmclellan> interesting
<berwin221> that's a bonus for me
<epmclellan> didn't think of mbox to maildir
<ARTi> dosnt look like there is anything as hardcore as readpst, and readpst only goes to mbox afaik
<peterVG> berwin22 yes, as discussed yesterday, that gives berwin22 the mbox to use later on down the pipeline
<epmclellan> yeah but mbox is really flexible
<peterVG> ARTi: how would you do mbox -> maildir
<peterVG> using offlineIMAP?
<ARTi> peterVG: we were using some scripts epmclellan found at sfu for the task
<peterVG> does it require loading into an IMAP server?
<ARTi> mbox2maildir.py er something,
* ARTi looks
<epmclellan> hmm, not mbox to maildir... the other way around
<peterVG> okay cool
<epmclellan> md2mb.py
<peterVG> okay too bad
<epmclellan> aid4mail converts pst to to eml
<epmclellan> eml is the message format used in maildir
<epmclellan> I'll check it out
<berwin221> I found this: http://www.ianlewis.org/en/parsing-email-attachments-python
<peterVG> i think one (probably more intensive/cumbersome) route would be to install an IMAP server (as a microservice/bundled tool?), load various email account formats into it, then suck out maildir using OfflineIMAP
<ARTi> Im pretty sure there is some stuff that goes mbox>maildir
<berwin221> In combination with some of the python libraries, we may be able to do maildir attachment extraction in python
<peterVG> let's discuss further outside mtg
<epmclellan> berwin221 that would be great
<ARTi> k
<epmclellan> I fixed http://code.google.com/p/archivematica/issues/detail?id=983
<peterVG> fyi our two paying archivematica clients for next 4 months are UBC & RAC so we're prioritizing their deliverables
<epmclellan> changing video access normalization commands
<epmclellan> peterVG: got it
<peterVG> that includes this email preservation plan work for RAC (hopefully can satisify pro bono promises made to SFU in process)
<courtneyHome> mcantelon berwin22 and i are meeting up next week to do a dev roadmap/issue checkin - will help to prioritize
<peterVG> courtneyHome: okay great
<mcantelon> Beware: Mike made a few tweaks to the dashboard UI: making it so you click on the bar to see microservices and click the magnifying glass to see metadata... also got rid of the redundant panel icon.z
<peterVG> the other RAC req that we need to highlight is basic user/group permissions for dashboard
<peterVG> other than that both clients (+UN) will need the more detailed external media transfer guidelines courtneyHome is working on (prioritizing after conference paper?) now
<peterVG> lastly, we also need to start prototyping AT->Archiveamtica->XTF integration. can delay dev on it until post 9.0 but will need to work over next two months on identifying strategy / requirements
<courtneyHome> peterVG: you'll have those transfer guides to take next week
<epmclellan> re user management, http://code.google.com/p/archivematica/issues/detail?id=922, I reassigned to courtney to do requirements
<peterVG> I would like ARTi to create a RAC dev VM = Archivematica trunk VM (as per Joseph's snazy instructoins) plus Archivist Toolkit and XTF installation
<peterVG> courtneyHome: who's our favourite American?
<peterVG> ummm sorry ARTi but for employment purposes you count as a Canadian
<ARTi> :D
<epmclellan> JessicaB: is American too, I think
<peterVG> see above ^
<peterVG> ARTi: we can discuss RAC dev vm later today,
<berwin221> We're all on the American continent... they're just the United States of America
<peterVG> but just fyi, ARTi priorities now are: complete scalability testing reports (MCP, MCP + 2, MCP + 6), email tools, RAC vm
<ARTi> great.
<courtneyHome> i can get those user mgt requirements done next week after i finish the ipres paper
<ARTi> Ill get the testing info into the wiki, we have a start on the email tools, and will build vm
<peterVG> hey courtneyHome did you see that HillelA got hired for the RAC digital archivist post?
<berwin221> I just had a closer look at the "Parsing email with attachments in python" link above, I don't think it will work for our needs.
<epmclellan> ARTi: FYI I'll be in NY next week at UN
<peterVG> courtneyHome: thx. we signed up Sevein for that piece of dashboard dev
<peterVG> we agreed that first cut we're just worried about having an admin user group & everybody else group. Admin gets to change configs
<berwin221> ... maybe though. I think it's worth taking a day to poke really hard.
<courtneyHome> cool
<epmclellan> hoping to have the user who is logged in be the PREMIS agent
<peterVG> maybe later we have an additional 'security'(?) group, only those users get to see transfers/sips where we have matches on security keywords/regexs
<peterVG> epmclellan: right ^ that too
<epmclellan> those are the minimum requirements for 0.9
<epmclellan> RAC would be satisfied with that
<peterVG> epmclellan: yes, user->PREMIS agent + admin user
<epmclellan> yup
<epmclellan> easy-peasy
<peterVG> make it so
<berwin221> hoping to have the user who is logged in be the PREMIS agent - is this a 0.9 requirement?
<peterVG> berwin22 yes
<epmclellan> berwin221 yes
<epmclellan> it's in the issue, I think
<berwin221> that's a lot of work
<epmclellan> yes, just checked, it's in the issue
<djjuhasz> berwin221: I tried to fight for a Pan-Americas definition of "America" at one point. I'm afraid it's a lost cause as it's become synonymous with the USA
<berwin221> issue#?
<epmclellan> http://code.google.com/p/archivematica/issues/detail?id=922
<peterVG> epmclellan: that's probably two seperate reqs/issues
<peterVG> likely why berwin22 didn't catch it earlier
<epmclellan> ok, I'll clarify
<epmclellan> and add a separate issue
<peterVG> also 922 priority should be high
<epmclellan> right
<peterVG> berwin22 courtneyHome mcantelon can discuss reqs/time/issues for PREMIS agent requirement at next week's release mtg
<courtneyHome> most def
<peterVG> okay that's time?
<mcantelon> Any deployment news?
<epmclellan> docs or testing?
<epmclellan> ARTi: has done quite a bit of testing
<mcantelon> Cool
<epmclellan> 6 nodes are faster than one!
<ARTi> tests have looked good, we did get a significant increase w/ more processors. But it seems to hit a point where more processors dont get us anywhere. At some point Id like to start testing with distributed filesystems
<ARTi> 2 nodes == much better than 1
<ARTi> but 6 are only marginally better
<epmclellan> that's great!
<epmclellan> good to know
<courtneyHome> dunno if this is deployment - but we're having a user group at SAA and tutorial workshops at UNESCO in Sept
<epmclellan> hmm, that's more like "outreach"
<courtneyHome> new meeting notes section?
<berwin221> ARTi: yeah, I'm curious what type of python/MCP, gearman, or disk limitiations might exist
<epmclellan> courtneyHome: sure, why not?
<peterVG> arti interesting
<peterVG> i agree that distributed filesystem testing is next step (after your other priorities discussed above)
<ARTi> I received a nice lengthy email from ohi about their disk infrastructure and how we can go about utilizing more disks.. they are teh awesome
<ARTi> peterVG: sounds good
<peterVG> ARTi: glad to hear OHI worked out, sounds like the right service for us
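
Note on the maildir attachment extraction idea raised in the log: a rough sketch of what it might look like using only Python's standard `mailbox` and `email` modules. It is not the approach from the linked article, nor anything the team agreed on. The function name `extract_attachments`, the output file naming scheme, and both path arguments are assumptions made for illustration.

```python
import mailbox
import os


def extract_attachments(maildir_path, output_dir):
    """Write every named attachment found in a Maildir into output_dir.

    Hypothetical helper for illustration only. Returns the list of
    file paths that were written.
    """
    os.makedirs(output_dir, exist_ok=True)
    saved = []
    box = mailbox.Maildir(maildir_path, create=False)
    try:
        for key, message in box.items():
            for index, part in enumerate(message.walk()):
                filename = part.get_filename()
                # Skip container parts and parts without a filename.
                if part.is_multipart() or not filename:
                    continue
                payload = part.get_payload(decode=True)
                if payload is None:
                    continue
                # Prefix with the message key and part index to avoid name
                # collisions; basename() drops any directory components.
                safe_name = "%s_%d_%s" % (key, index, os.path.basename(filename))
                target = os.path.join(output_dir, safe_name)
                with open(target, "wb") as handle:
                    handle.write(payload)
                saved.append(target)
    finally:
        box.close()
    return saved
```

Only parts that carry a filename are treated as attachments here; inline images or unnamed parts would need a different rule, which is one of the things a day of poking at real mail would have to settle.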