Meeting 20120411

From Archivematica
Revision as of 12:10, 11 April 2012 by Evelyn (talk | contribs)

Development

  • Joseph started working on the office normalization failure, then handed it off to Austin: http://code.google.com/p/archivematica/issues/detail?id=961
    • Issue 961: 12.04 + DocumentConverter.py normalization failures. Joseph would like to give this high priority. There have been some major changes to unoconv, and he'd like to give it another try. It would also be nice to finally fix Issue 304: Transcoding with Open Office fails periodically.

  • Austin is trying to track down a similar issue, and will also post issues to LibreOffice and Launchpad


Deployment

Documentation

  • Austin will be responsible for documenting multi-node processor testing

Testing

  • Multi-node processor testing: Joseph is not seeing database drops/disconnects in archivematica 0.9 dev on precise.
    • He's holding back on testing till we get a better idea of what metrics we want to keep.
    • There's a limit to how much can be tested in the office; the limit is the disk, so we will have to upgrade
    • However, it sounds like our ciaus test environment is so far confirming that our multi-node MySQL drops problem has disappeared after the full 12.04 upgrades
    • For testing, we should probably compare results for individual micro-services as well as total ingest-to-AIP/DIP creation times
    • Austin will create the test net on openhosting, maybe in an afternoon/evening
  • Austin will take over the testing process & docs. He will speak to Joseph about all the variations to test for, e.g. specific micro-services vs total pipeline time, bulk set of sample data vs testing for specific format types
    • All of the testing needs to be ordered in a systematic way, e.g. separate tests that keep everything the same except for a change in one factor, i.e. sample data, micro-service, number of nodes
    • After we have all those stats (hopefully with positive results, i.e. adding more processors boosts performance) we can add a distributed FS as a new test variable, e.g. Ceph
      • FS is a variable as well for the initial set of baseline tests - EXT4+nfs right now
    • Having a look at the MySQL queries Joseph created for testing will also give a sense of some other variables we may want to include in the analysis
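
The "change one factor at a time" approach described above could be sketched roughly as follows. This is only an illustration, not an agreed test plan: the factor names, the baseline values, the micro-service labels and the helper functions are all hypothetical, and the speedup/efficiency formulas are the usual textbook definitions (single-node time divided by multi-node time, and that ratio divided by the number of nodes).

```python
# Sketch of a one-factor-at-a-time test plan. All names and values below are
# hypothetical placeholders, not settings taken from the actual test environment.

BASELINE = {
    "sample_data": "bulk_sample_set",
    "micro_service": "total_pipeline",
    "nodes": 1,
    "filesystem": "ext4+nfs",
}

# Alternative values to try for each factor (assumed for illustration).
VARIATIONS = {
    "sample_data": ["office_documents_only"],
    "micro_service": ["normalization", "characterization"],
    "nodes": [3, 6],
}


def one_factor_runs(baseline, variations):
    """Return the baseline run plus runs that differ from it in exactly one factor."""
    runs = [dict(baseline)]
    for factor, values in variations.items():
        for value in values:
            run = dict(baseline)
            run[factor] = value
            runs.append(run)
    return runs


def speedup(single_node_seconds, multi_node_seconds):
    """How many times faster the multi-node run is than the single-node run."""
    return single_node_seconds / multi_node_seconds


def efficiency(single_node_seconds, multi_node_seconds, nodes):
    """Speedup per node; a value near 1.0 indicates near-linear horizontal scaling."""
    return speedup(single_node_seconds, multi_node_seconds) / nodes


if __name__ == "__main__":
    for run in one_factor_runs(BASELINE, VARIATIONS):
        print(run)
```

Comparing `efficiency` for the 3-node and 6-node runs is one way to check whether doubling the nodes gives a proportional improvement; the timings themselves would come from the transfer and SIP processing-time queries.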

Chat log

(10:38:44 AM) epmclellan: dev news?
(10:39:13 AM) berwin22: Started working on the office normalization not working, handed off to Austin:
(10:39:13 AM) berwin22: http://code.google.com/p/archivematica/issues/detail?id=961
(10:39:13 AM) berwin22: Issue 961:	12.04 + DocumentConverter.py normalization failures.
(10:39:13 AM) berwin22: I'd like to give this high priority. There have been some major changes to unoconv, and I'd like to give it a shot again. I may be dreaming, but I'd love to put this issue to bed:
(10:39:13 AM) berwin22: Issue 304:	Transcoding with Open Office fails periodically
(10:39:52 AM) peterVG: berwin22: worth another try
(10:40:06 AM) berwin22: well it's 100% broken atm
(10:40:12 AM) peterVG: oh.
(10:40:17 AM) ARTi: berwin22: nice,  Im trying to track some similar issue down.. but will post issues to libre office and launchpad as well
(10:40:30 AM) ARTi left the room (quit: Read error: Connection reset by peer).
(10:40:43 AM) peterVG: how are we doing on multi-node processor testing
(10:40:49 AM) peterVG: i still consider that our no.1 priority
(10:41:06 AM) berwin22: Testing: not seeing database drops/disconnects in archivematica 0.9 dev on precise.
(10:41:06 AM) berwin22: http://archdist1.local/ (for those of you in office)
(10:41:06 AM) berwin22: I'm holding back on testing till we get a better idea of what metrics we want to keep.
(10:42:16 AM) peterVG: okay so first there's multi-node testing, ensuring that we are getting all nodes contributing to reducing total processing time vs. using a single MCP processor
(10:42:44 AM) peterVG: in theory, that's supposed to scale horizontally, ie. we reduce processing time when we add more processing nodes
(10:43:03 AM) peterVG: I realize that ciaus might not be the best place to test this as its faking a multi-node environment
(10:43:05 AM) berwin22: I think there is a limit to that, that I won't be able to test in office:
(10:43:13 AM) berwin22: the limit is the disk
(10:43:17 AM) artii: network here keeps going down., on phone
(10:43:21 AM) peterVG: ^ right, understood
(10:43:27 AM) ARTi [~austin@24.207.112.199] entered the room.
(10:43:44 AM) ARTi: tried to setup openhosting, but not enough resources in test acount
(10:43:46 AM) peterVG: artii re-posting piece after you dropped out, important for you
(10:43:48 AM) peterVG: (10:40:45 AM) peterVG: how are we doing on multi-node processor testing
(10:43:48 AM) peterVG: (10:40:51 AM) peterVG: i still consider that our no.1 priority
(10:43:48 AM) peterVG: (10:41:07 AM) berwin22: Testing: not seeing database drops/disconnects in archivematica 0.9 dev on precise.
(10:43:48 AM) peterVG: (10:41:07 AM) berwin22: http://archdist1.local/ (for those of you in office)
(10:43:48 AM) peterVG: (10:41:07 AM) berwin22: I'm holding back on testing till we get a better idea of what metrics we want to keep.
(10:43:48 AM) peterVG: (10:42:17 AM) peterVG: okay so first there's multi-node testing, ensuring that we are getting all nodes contributing to reducing total processing time vs. using a single MCP processor
(10:43:50 AM) peterVG: (10:42:45 AM) peterVG: in theory, that's supposed to scale horizontally, ie. we reduce processing time when we add more processing nodes
(10:43:50 AM) peterVG: (10:43:04 AM) peterVG: I realize that ciaus might not be the best place to test this as its faking a multi-node environment
(10:43:52 AM) peterVG: (10:43:07 AM) berwin22: I think there is a limit to that, that I won't be able to test in office:
(10:43:52 AM) peterVG: (10:43:14 AM) berwin22: the limit is the disk
(10:43:54 AM) peterVG: (10:43:18 AM) artii: network here keeps going down., on phone
(10:43:54 AM) peterVG: (10:43:23 AM) peterVG: ^ right, understood
(10:43:57 AM) ARTi: but I think its a great infrastructure for testing, maybe berwin and I could share it
(10:44:15 AM) peterVG: okay, we'll have to upgrade then
(10:44:31 AM) peterVG: there's a limit to testing multinode performance improvement theory on ciaus
(10:44:46 AM) berwin22: yes
(10:45:32 AM) peterVG: however, it sounds like our ciaus test environment is confirming thus far that our multi-node mysql drops problem has disappeared after full 12.04 upgrades?
(10:45:35 AM) berwin22: so are we only interested in total processing time of standard sips -from the sample set
(10:46:03 AM) berwin22: I haven't seen any since
(10:46:10 AM) peterVG: berwin22: well, in the end yes but I realize that there's a big variation in processor demands per micro-service
(10:46:50 AM) peterVG: so we should probably be comparing results for individual micro-services as well as total ingest-to-aip/dip creation times
(10:47:23 AM) peterVG: epmclellan:  or courtney can one of you pls take ownership of this project from a documentation point of view
(10:47:33 AM) peterVG: i don't want to keep repeating this discussion week after week
(10:47:38 AM) berwin22: I think certain files take a long time too: ie. xml files take forever in fits
(10:47:57 AM) epmclellan: peterVG: it would be better for whoever is doing the testing to do the documentation
(10:48:08 AM) epmclellan: i.e. Joseph and/or Austin
(10:48:29 AM) peterVG: berwin22: good point
(10:48:34 AM) peterVG: epmclellan: okay
(10:49:41 AM) peterVG: okay, so what's next on the testing then
(10:49:42 AM) ARTi left the room (quit: Remote host closed the connection).
(10:50:00 AM) peterVG: artii: to upgrade openhosting account, and get multi-node processing ready there
(10:50:03 AM) berwin22: generate queries to extrapolate the desired information
(10:50:17 AM) berwin22: how much time do you want to devote to this?
(10:50:56 AM) peterVG: as much as it takes to get some comprehensive results
(10:51:35 AM) peterVG: multinode scalability can't be just a theory anymore, we need to know whether its capable of working as we expect it to
(10:51:42 AM) ARTi [~austin@24.207.112.199] entered the room.
(10:52:08 AM) ARTi: peterVG: it wont take me long to create the test net on openhosting, maybe in a afternoon/evening 
(10:52:14 AM) ARTi: everything is pretty standard
(10:52:18 AM) peterVG: okay, good
(10:52:39 AM) peterVG: could you then please take over the testing process & docs
(10:52:48 AM) peterVG: speak to berwin22 about all the variations to test for
(10:52:56 AM) peterVG: e.g. specific micro-services vs total pipeline time
(10:53:06 AM) peterVG: bulk set of sample data vs testing for specific format types
(10:53:33 AM) ARTi: ok
(10:53:39 AM) peterVG: epmclellan: this is critical for production 1.0, i'd like you to be involved as well pls
(10:53:54 AM) epmclellan: ok
(10:53:57 AM) berwin22: null hypothesis:
(10:53:57 AM) berwin22: adding up to 6 nodes improves processing time on http://archivematica.org/downloads/docZips/
(10:54:09 AM) peterVG: can you coordinate with ARTi, i.e. touch base on getting test ready, reviewing/providing feedback on testing results
(10:54:12 AM) berwin22: agreed?
(10:54:16 AM) epmclellan: will do
(10:54:28 AM) ARTi: berwin22: great
(10:55:08 AM) peterVG: ARTi the other variable to test for is the number of processors, i.e. does adding 3 nodes give the same ratio of performance improvement as 6 nodes, e.g. horizontal scalability
(10:55:12 AM) epmclellan: ARTi we can chat later today
(10:55:31 AM) ARTi: peterVG: ok
(10:55:32 AM) courtney: epmclellan: i'd like to be involved in that chat
(10:55:38 AM) epmclellan: ok
(10:55:55 AM) peterVG: all of this needs to be ordered in a systematic way, e.g. separate tests that keep everything the same except a change in one factor, ie. sample data, microservice, number of nodes
(10:56:49 AM) epmclellan: getting all of this into the meeting notes to help with documentation
(10:57:01 AM) peterVG: ARTi after we have all those stats (hopefully with positive results, ie. adding more processors boosts performance) we can add distributed FS as a new test variable, eg CEPH
(10:57:47 AM) ARTi: ok
(10:57:58 AM) peterVG: so make note of FS as a variable as well for the initial set of baseline tests (presumably EXT4 right now?)
(10:58:11 AM) ARTi: yes, ext4+nfs
(10:58:40 AM) peterVG: berwin22: are there other factors to test for? mysql insert/read response times/timeouts?
(10:59:45 AM) berwin22: thinking
(10:59:56 AM) peterVG: ARTi and epmclellan having a look at the mysql queries berwin22 created for testing will also give a sense of some other variables we may want to include in analysis
(11:00:06 AM) berwin22: I don't see what those would give us.
(11:00:31 AM) berwin22: If we weren't seeing a performance improvement, then we might want to start looking at those things
(11:00:36 AM) peterVG: what type of info are those queries returning?
(11:02:10 AM) berwin22: transfer time processing
(11:02:10 AM) berwin22: sip time processing
(11:02:10 AM) berwin22: I'm assuming the ones I was working on recently.
(11:02:10 AM) berwin22: Before, we only had time in system, but had no way to tell, which sips/transfers were taking longer, if multiple transfers were processing at once.
(11:02:40 AM) peterVG: okay, so would it be useful to capture a snapshot (eg print to pdf) of these queries for each test?
(11:03:02 AM) peterVG: or save as csv so we can put all into a spreadsheet?
(11:03:08 AM) berwin22: yes, I'd say xml.
(11:03:18 AM) berwin22: xml/html
(11:03:23 AM) berwin22: use the -H 
(11:03:53 AM) peterVG: epmclellan:  and ARTi can you please have a look at berwin22's query and figure out where/how to incorporate into test results and global analysis, sounds like a useful way to generate metrics
(11:04:11 AM) epmclellan: ok
(11:04:27 AM) epmclellan: I'll need to be brought up to speed on a lot of this
(11:04:47 AM) peterVG: okay, pls discuss with berwin22 and ARTi as needed
(11:04:51 AM) epmclellan: will do
(11:04:59 AM) peterVG: we need to start making some solid progress on this work this week
(11:05:09 AM) epmclellan: let's chat this afternoon, guys
(11:05:14 AM) peterVG: our window to take any dramatic corrective action (hopefully not necessary) is closing
(11:05:18 AM) berwin22: not here this aft
(11:05:28 AM) epmclellan: tomorrow then
(11:05:39 AM) berwin22: from the data I've seen, it's working
(11:05:54 AM) berwin22: but it's not scientific
(11:06:06 AM) ARTi: Ill be on all day tomorrow as well
(11:06:11 AM) peterVG: okay cool
(11:06:19 AM) epmclellan: ok, I'll be available in the chatroom all day
(11:06:31 AM) peterVG: sorry to dominate mtg with this topic but was critical
(11:06:40 AM) epmclellan: np
(11:06:44 AM) peterVG: i guess we're out of time, Mark is on stand-by for mtg
(11:06:48 AM) epmclellan: yes, he's here
(11:07:00 AM) peterVG: anything critical to address before next week's mtg?
(11:07:11 AM) courtney: we've tackled the BagIt issue, I think
(11:07:20 AM) epmclellan: no
(11:07:21 AM) peterVG: berwin22: sounds like you've got an idea of the high priority issues you want to work on
(11:07:33 AM) epmclellan: we have a few last issues to deal with re bagit ingest
(11:07:36 AM) peterVG: plus berwin22 you're taking some time off to move?
(11:07:42 AM) berwin22: Some good active discussion on:
(11:07:42 AM) berwin22: http://www.archivematica.org/wiki/index.php?title=Bag_ingest
(11:07:42 AM) berwin22: http://code.google.com/p/archivematica/issues/detail?id=593
(11:07:42 AM) berwin22: Issue 593: Add ability to ingest bagit SIPs.
(11:07:51 AM) berwin22: yes