Meeting 20110525

From Archivematica

Development

  • Joseph tried to get ingest of bags working, with support for checksum files other than MD5 on ingest.
  • He did a small tweak to the microservice for normalization, so the tasks have different names for generating normalized and access copies.
  • Joseph got the UUID quads in the AIP storage script working, see r1392.
  • He added verification of PREMIS checksums.
  • He removed the create AIP checksum step, since BagIt already creates a checksum; the checksum to use can now be specified in the microservice config file (.xml).
  • He is trying to find a reasonable access copy for MBOX/PST files. In a first test he found a tool that is easy to use and implement, but he is not sure about its licensing: http://simile.mit.edu/wiki/Email_RDFizer
  • Jesús has been working on issue 464 (almost finished, waiting for feedback!) and other 0.7.1 Dashboard priorities like the Preservation planning report.
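
The notes do not spell out how the UUID quads are laid out, so the following is only a minimal sketch of the general idea, assuming the AIP's UUID is split into eight four-character groups that become nested directories. The function name `uuid_quad_path` and the storage root are hypothetical and are not taken from r1392.

```python
import posixpath
import uuid


def uuid_quad_path(root, aip_uuid):
    """Return a nested directory path built from 4-character UUID groups.

    Hypothetical helper: the real storage script (r1392) may differ.
    """
    # uuid.UUID validates the string; .hex gives 32 hex digits without dashes.
    hex_digits = uuid.UUID(aip_uuid).hex
    # Split the 32 digits into eight "quads" of four characters each.
    quads = [hex_digits[i:i + 4] for i in range(0, len(hex_digits), 4)]
    return posixpath.join(root, *quads)


if __name__ == "__main__":
    print(uuid_quad_path("/var/aipstore", "12345678-1234-5678-1234-567812345678"))
```

Spreading AIPs across nested directories this way keeps any single directory from accumulating a very large number of entries, which is the usual motivation for this kind of layout.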

Deployment

  • Austin worked at UBC last Friday. Because of their firewalling, he ended up setting up sshfs (FUSE). It was tricky, but should be straightforward in future installs. He documented the mount script on the private wiki page: Archivematica-UBC_Library

Testing

  • Austin tested the first season of True Blood, about 6 GB of Xvid/AVI files. It took about a day to finish processing, and even though ffmpeg appears to have worked on all files, normalization is reported as failed.
  • Right now he is processing an 11 GB SIP with documentaries. It is also failing at the upload to Qubit, as expected.

Documentation

Chat log

(10:31:08 AM) David Juhasz: archivematica?
(10:31:11 AM) David Juhasz: anyone?
(10:31:15 AM) David Juhasz: bueller?
(10:31:16 AM) Sevein: last week notes - http://archivematica.org/wiki/index.php?title=Meeting_20110518
(10:31:22 AM) Sevein: this week - http://archivematica.org/wiki/index.php?title=Meeting_20110525
(10:31:46 AM) ***David Juhasz waits for berwin22's dev dump
(10:31:47 AM) David Juhasz: :)
(10:31:51 AM) MJ Suhonos: /snooze 30m
(10:31:57 AM) David Juhasz: lol
(10:32:06 AM) MJ Suhonos: :P
(10:32:17 AM) berwin22: hmm...
(10:32:26 AM) berwin22: let me check and see what I've been up to
(10:32:42 AM) berwin22: Oh, I tried to get some injest of bag working!
(10:32:54 AM) berwin22: not sure if that will make it into the 0.7.1 release
(10:33:27 AM) David Juhasz: nice
(10:33:52 AM) David Juhasz: Peter, Evelyn and Austin(?) are in NY right now at Rockefeller, I think?
(10:34:05 AM) berwin22: supporting more than just md5 checksum files on injest
(10:34:22 AM) Sevein: I have been working on issues 464 (http://code.google.com/p/archivematica/issues/detail?id=464), finished but need testing, and other 0.7.1 priorities like the Preservation planning report.
(10:34:29 AM) Austin: David Juhasz: nope Im here
(10:34:40 AM) Austin: not much on dev, just testing and deployment
(10:34:47 AM) berwin22: small tweak to the microservice for normalization, so the tasks have different names for generating normalized and access copies
(10:34:53 AM) Austin: evelyn and peter are there
(10:35:18 AM) Austin: not much from me* even
(10:35:24 AM) berwin22: got the uuid quads in the AIP storage script working
(10:35:29 AM) Austin: berwin22: nice!
(10:35:38 AM) Austin: you know what r?
(10:35:50 AM) berwin22: added verify premis checksums
(10:36:23 AM) berwin22: uuid quads: http://code.google.com/p/archivematica/source/detail?r=1392
(10:36:40 AM) berwin22: Austin I didn't really use your script, sorry
(10:36:52 AM) Austin: nice, made it into yesterdays packages
(10:36:55 AM) berwin22: It didn't mesh well with otherings
(10:36:56 AM) Austin: np
(10:38:28 AM) berwin22: removed create AIP checksum, as bagit creates a checksum, and can now specify what they would like to use, in the microservice config file (.xml)
(10:40:17 AM) berwin22: I'm wondering what we should do about the errors when uploading files to with no extension to qubit
(10:41:03 AM) berwin22: ^ that's item 1 I'd like to discuss
(10:41:26 AM) Sevein: I could try to prepare a patch for qubit 1.0.9 (actually we don't have yet a solution for that issue in next releases)
(10:42:06 AM) MJ Suhonos: sorry, can someone tell me what is being uploaded to qubit?  this is a digital object package?
(10:42:41 AM) berwin22: Item 2... I'm trying to find a reasonable access copy for MBOX/PST files. I found this, and I done my first test: easy to use and implement. Not sure about licensing.
http://simile.mit.edu/wiki/Email_RDFizer
(10:43:06 AM) berwin22: Item1 needs to be addressed, because readpst outputs files with no extension.
(10:43:07 AM) Sevein: MJ: Archivematica makes use of a python script to upload digital objects to Qubit, it's an python client emulating a browser (cookies, auth, etc...), no API yet, so it's just a workaround.
(10:43:34 AM) berwin22: I'd rather correct our workflow then apply a hack to the tool
(10:43:38 AM) MJ Suhonos: Sevein:  ah, ok.  
(10:44:30 AM) berwin22: I could consider ommiting files without extensions from the DIP, but I'd need to talk with peter and evelyn about that
(10:44:33 AM) Sevein: MJ: but qubit itself does not support well file uploads without extension, qubit can't guess what kind of file is (we should be able to discover it calling GNU file, using mime-magic or whatever....)
(10:45:17 AM) Sevein: berwin22: the file isn't being uploaded because qubit crashes, right?
(10:45:34 AM) Sevein: berwin22: I think that supporting those files an unknown type would be ok and the fastest solution
(10:45:34 AM) berwin22: Sevein: correct, error code 500 I believe
(10:45:56 AM) Sevein: instead of trying to guess the format, which could be introduced later
(10:46:15 AM) Sevein: let me try it today
(10:46:48 AM) berwin22: OK, I should create an issue on it and assign it to you.
(10:47:05 AM) berwin22: I think we should start marking things as critical that need to be in the 0.7.1 release
(10:47:15 AM) Sevein: this one is ok?
(10:47:16 AM) Sevein: http://code.google.com/p/archivematica/issues/detail?id=355
(10:47:25 AM) berwin22: Does anyone else recall if that was what peter intended?
(10:47:29 AM) Sevein: no idea
(10:47:42 AM) berwin22: That's it
(10:48:09 AM) Sevein: ok 12 mins left
(10:48:19 AM) Sevein: Austin: you said you have been working on deployment and testing?
(10:49:14 AM) Austin: yes, friday was at ubc.. because of there firewalling I ended up setting up sshfs
(10:49:24 AM) Austin: was tricky, but should be straight forward in future installs
(10:49:37 AM) Austin: added mount script here http://amos.artefactual.com/wiki/Archivematica-UBC_Library
(10:51:02 AM) Austin: also, I tested the first season of trueblood, about 6gb of xvid/avi files.  took about a day to finish processing, and even though it looks like ffmpeg worked on all files it shows a fail on normalization.
(10:51:18 AM) Austin: right now Im processing a 11gb sip with documentaries
(10:51:44 AM) Austin: these are also failing for upload qubit of course ;]
(10:52:16 AM) berwin22: Austin: fyi there are newer version of read pst, and an issue that requires the updated version; I assigned that to you this morning
(10:52:34 AM) Sevein: ok
(10:52:44 AM) Austin: berwin22: cheers, Ill go take a look
(10:52:46 AM) Sevein: notes so far - http://archivematica.org/wiki/index.php?title=Meeting_20110525
(10:53:09 AM) Sevein: that's all? doc news?
(10:53:36 AM) Austin: nothing from me
(10:53:41 AM) Sevein: nice script Austin :)
(10:53:50 AM) Sevein: that's using the new service system of ubuntu?
(10:54:01 AM) Sevein: I don't know nothing about init stuff from the last 5 years hehehe
(10:54:14 AM) berwin22: not entirely documentation, but a neat discussion topic I'd like to point out:
https://groups.google.com/group/archivematica/browse_thread/thread/216a30f277593706?hl=en
(10:54:15 AM) Sevein: but I know they have changed everything, parallel start of services, etc...
(10:54:45 AM) Austin: yes, its using upstart http://upstart.ubuntu.com/
(10:54:50 AM) berwin22: *writing new microservices*
(10:54:52 AM) Sevein: beautiful!
(10:54:57 AM) David Juhasz: berwin22 can you capture that discussion on the wiki?
(10:54:58 AM) Austin: berwin22: 
(10:55:02 AM) Austin: that was a great post btw!
(10:55:08 AM) David Juhasz: it sounds like it's useful documentation
(10:55:09 AM) Austin: ++
(10:55:22 AM) berwin22: I have an issue to do so
(10:55:25 AM) David Juhasz: ah, great :)
(10:55:33 AM) berwin22: going to wait till after code freeze to work on doc stuff
(10:55:38 AM) David Juhasz: heh
(10:55:48 AM) David Juhasz: isn't that always the way>?
(10:56:05 AM) berwin22: keeps me busy, or something like that
(10:57:52 AM) Sevein: ok
(10:57:55 AM) Sevein: we're done?
(10:58:09 AM) Sevein: notes - http://archivematica.org/wiki/index.php?title=Meeting_20110525
(10:58:18 AM) David Juhasz: sweet