Meeting 20120411
Development
- Joseph started working on the broken office normalization and handed it off to Austin: http://code.google.com/p/archivematica/issues/detail?id=961
- Issue 961: 12.04 + DocumentConverter.py normalization failures. Joseph would like to give this high priority. There have been some major changes to unoconv, and he'd like to give it another shot. It would also be nice to fix Issue 304: Transcoding with Open Office fails periodically.
- Austin is trying to track down a similar issue, and will also post issues to LibreOffice and Launchpad.
Deployment
Documentation
- Austin will be responsible for documenting multi-node processor testing
Testing
- Multi-node processor testing: Joseph is not seeing database drops/disconnects in archivematica 0.9 dev on precise.
- He's holding back on testing till we get a better idea of what metrics we want to keep.
- There's a limit to how much can be tested in the office; the limit is the disk, so we will have to upgrade
- However, it sounds like our ciaus test environment is confirming thus far that our multi-node MySQL drops problem has disappeared after full 12.04 upgrades
- for testing, we should probably be comparing results for individual micro-services as well as total ingest-to-aip/dip creation times
- Austin will create the test net on openhosting, maybe in an afternoon/evening
- Austin will take over the testing process & docs; he will speak to Joseph about all the variations to test for, e.g. specific micro-services vs total pipeline time, bulk set of sample data vs testing for specific format types
- All of the testing needs to be ordered in a systematic way, e.g. separate tests that keep everything the same except a change in one factor, i.e. sample data, micro-service, or number of nodes
- After we have all those stats (hopefully with positive results, i.e. adding more processors boosts performance) we can add a distributed FS as a new test variable, e.g. Ceph
- The FS is a variable as well for the initial set of baseline tests (ext4+NFS right now)
- having a look at the mysql queries Joseph created for testing will also give a sense of some other variables we may want to include in analysis
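A rough sketch of how the "change one factor at a time" ordering above could be written down as a test plan, together with the speedup ratio needed to compare 3 nodes against 6. Everything named here is an assumption for illustration: the factor names, the alternative levels, and the helper functions are hypothetical and are not part of Archivematica or of Joseph's MySQL queries. The filesystem is held at the ext4+NFS baseline, as in the notes.

```python
# Hypothetical baseline run; the values are placeholders, not agreed settings.
BASELINE = {
    "sample_data": "docZips",         # bulk sample set mentioned in the chat
    "microservice": "full_pipeline",  # total ingest-to-AIP/DIP time
    "nodes": 1,                       # single MCP processor
    "filesystem": "ext4+nfs",         # held constant for the baseline tests
}

# Hypothetical alternative levels for each factor we want to vary.
LEVELS = {
    "sample_data": ["single_format_set"],
    "microservice": ["normalization", "fits"],
    "nodes": [3, 6],
}


def one_factor_at_a_time(baseline, levels):
    """Return the baseline run plus one run per alternative level.

    Each extra run differs from the baseline in exactly one factor.
    """
    runs = [dict(baseline)]
    for factor, values in levels.items():
        for value in values:
            if value == baseline[factor]:
                continue  # identical to the baseline, nothing to compare
            run = dict(baseline)
            run[factor] = value
            runs.append(run)
    return runs


def speedup(single_node_seconds, multi_node_seconds):
    """Ratio of single-node time to multi-node time (above 1 means faster)."""
    if multi_node_seconds <= 0:
        raise ValueError("multi_node_seconds must be positive")
    return single_node_seconds / multi_node_seconds


def parallel_efficiency(single_node_seconds, multi_node_seconds, nodes):
    """Speedup divided by node count; 1.0 would be ideal horizontal scaling."""
    return speedup(single_node_seconds, multi_node_seconds) / nodes


if __name__ == "__main__":
    for run in one_factor_at_a_time(BASELINE, LEVELS):
        print(run)
```

Comparing `parallel_efficiency` for the 3-node and 6-node runs is one way to answer whether 3 nodes give the same ratio of improvement as 6; the timings themselves would have to come from real test runs.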
Chat log
(10:38:44 AM) epmclellan: dev news?
(10:39:13 AM) berwin22: Started working on the office normalization not working, handed off to Austin:
(10:39:13 AM) berwin22: http://code.google.com/p/archivematica/issues/detail?id=961
(10:39:13 AM) berwin22: Issue 961: 12.04 + DocumentConverter.py normalization failures.
(10:39:13 AM) berwin22: I'd like to give this high priority. There have been some major changes to unoconv, and I'd like to give it a shot again. I may be dreaming, but I'd love to put this issue to bed:
(10:39:13 AM) berwin22: Issue 304: Transcoding with Open Office fails periodically
(10:39:52 AM) peterVG: berwin22: worth another try
(10:40:06 AM) berwin22: well it's 100% broken atm
(10:40:12 AM) peterVG: oh.
(10:40:17 AM) ARTi: berwin22: nice, Im trying to track some similar issue down.. but will post issues to libre office and launchpad as well
(10:40:30 AM) ARTi left the room (quit: Read error: Connection reset by peer).
(10:40:43 AM) peterVG: how are we doing on multi-node processor testing
(10:40:49 AM) peterVG: i still consider that our no.1 priority
(10:41:06 AM) berwin22: Testing: not seeing database drops/disconnects in archivematica 0.9 dev on precise.
(10:41:06 AM) berwin22: http://archdist1.local/ (for those of you in office)
(10:41:06 AM) berwin22: I'm holding back on testing till we get a better idea of what metrics we want to keep.
(10:42:16 AM) peterVG: okay so first there's multi-node testing, ensuring that we are getting all nodes contributing to reducing total processing time vs. using a single MCP processor
(10:42:44 AM) peterVG: in theory, that's supposed to scale horizontally, ie. we reduce processing time when we add more processing nodes
(10:43:03 AM) peterVG: I realize that ciaus might not be the best place to test this as its faking a multi-node environment
(10:43:05 AM) berwin22: I think there is a limit to that, that I won't be able to test in office:
(10:43:13 AM) berwin22: the limit is the disk
(10:43:17 AM) artii: network here keeps going down., on phone
(10:43:21 AM) peterVG: ^ right, understood
(10:43:27 AM) ARTi [~austin@24.207.112.199] entered the room.
(10:43:44 AM) ARTi: tried to setup openhosting, but not enough resources in test acount
(10:43:46 AM) peterVG: artii re-posting piece after you dropped out, important for you
(10:43:48 AM) peterVG: (10:40:45 AM) peterVG: how are we doing on multi-node processor testing
(10:43:48 AM) peterVG: (10:40:51 AM) peterVG: i still consider that our no.1 priority
(10:43:48 AM) peterVG: (10:41:07 AM) berwin22: Testing: not seeing database drops/disconnects in archivematica 0.9 dev on precise.
(10:43:48 AM) peterVG: (10:41:07 AM) berwin22: http://archdist1.local/ (for those of you in office)
(10:43:48 AM) peterVG: (10:41:07 AM) berwin22: I'm holding back on testing till we get a better idea of what metrics we want to keep.
(10:43:48 AM) peterVG: (10:42:17 AM) peterVG: okay so first there's multi-node testing, ensuring that we are getting all nodes contributing to reducing total processing time vs. using a single MCP processor
(10:43:50 AM) peterVG: (10:42:45 AM) peterVG: in theory, that's supposed to scale horizontally, ie. we reduce processing time when we add more processing nodes
(10:43:50 AM) peterVG: (10:43:04 AM) peterVG: I realize that ciaus might not be the best place to test this as its faking a multi-node environment
(10:43:52 AM) peterVG: (10:43:07 AM) berwin22: I think there is a limit to that, that I won't be able to test in office:
(10:43:52 AM) peterVG: (10:43:14 AM) berwin22: the limit is the disk
(10:43:54 AM) peterVG: (10:43:18 AM) artii: network here keeps going down., on phone
(10:43:54 AM) peterVG: (10:43:23 AM) peterVG: ^ right, understood
(10:43:57 AM) ARTi: but I think its a great infrastructure for testing, maybe berwin and I could share it
(10:44:15 AM) peterVG: okay, we'll have to upgrade then
(10:44:31 AM) peterVG: there's a limit to testing multinode performance improvement theory on ciaus
(10:44:46 AM) berwin22: yes
(10:45:32 AM) peterVG: however, it sounds like our ciaus test environment is confirming thus far that our multi-node mysql drops problem has dissappeared after full 12.04 upgrades?
(10:45:35 AM) berwin22: so are we only interested in total processing time of standard sips -from the sample set
(10:46:03 AM) berwin22: I haven't seen any since
(10:46:10 AM) peterVG: berwin22: well, in the end yes but I realize that there's a big variation in processor demands per micro-service
(10:46:50 AM) peterVG: so we should probably be comparing results for individual micro-services as well as total ingest-to-aip/dip creation times
(10:47:23 AM) peterVG: epmclellan: or courtney can one of you pls take ownership of this project from a documentation point of view
(10:47:33 AM) peterVG: i don't want to keep repeating this discussion week after week
(10:47:38 AM) berwin22: I think certain files take a long time too: ie. xml files take forever in fits
(10:47:57 AM) epmclellan: peterVG: it would be better for whoever is doing the testing to do the documentation
(10:48:08 AM) epmclellan: i.e. Joseph and/or Austin
(10:48:29 AM) peterVG: berwin22: good point
(10:48:34 AM) peterVG: epmclellan: okay
(10:49:41 AM) peterVG: okay, so what's next on the testing then
(10:49:42 AM) ARTi left the room (quit: Remote host closed the connection).
(10:50:00 AM) peterVG: artii: to upgrade openhosting account, and get multi-node processing ready there
(10:50:03 AM) berwin22: generate queries to extrapolate the desired information
(10:50:17 AM) berwin22: how much time do you want to devote to this?
(10:50:56 AM) peterVG: as much as it takes to get some comprehensive results
(10:51:35 AM) peterVG: multinode scalability can't be just a theory anymore, we need to know whether its capable of working as we expect it to
(10:51:42 AM) ARTi [~austin@24.207.112.199] entered the room.
(10:52:08 AM) ARTi: peterVG: it wont take me long to create the test net on openhosting, maybe in a afternoon/evening
(10:52:14 AM) ARTi: everything is pretty standard
(10:52:18 AM) peterVG: okay, good
(10:52:39 AM) peterVG: could you then please take over the testing process & docs
(10:52:48 AM) peterVG: speak to berwin22 about all the variations to test for
(10:52:56 AM) peterVG: e.g. specific micro-services vs total pipeline time
(10:53:06 AM) peterVG: bulk set of sample data vs testing for specific format types
(10:53:33 AM) ARTi: ok
(10:53:39 AM) peterVG: epmclellan: this is critical for production 1.0, i'd like you to be involved as well pls
(10:53:54 AM) epmclellan: ok
(10:53:57 AM) berwin22: null hypothesis:
(10:53:57 AM) berwin22: adding up to 6 nodes improves processing time on http://archivematica.org/downloads/docZips/
(10:54:09 AM) peterVG: can you coordinate with ARTi, i.e. touch base on getting test ready, reviewing/providing feedback on testing results
(10:54:12 AM) berwin22: agreed?
(10:54:16 AM) epmclellan: will do
(10:54:28 AM) ARTi: berwin22: great ARTi artii
(10:55:08 AM) peterVG: ARTi the other variable to test for is the number of processors, i.e. does adding 3 nodes give the same ratio of performance improvement as 6 nodes, e.g. horizontal scalability
(10:55:12 AM) epmclellan: ARTi we can chat later today
(10:55:31 AM) ARTi: peterVG: ok
(10:55:32 AM) courtney: epmclellan: i'd like to be involved in that chat
(10:55:38 AM) epmclellan: ok
(10:55:55 AM) peterVG: all of this needs to be ordered in a systematic way, e.g. seperate tests that keep everything the same except a change in one factor, ie. sample data, microservice, number of nodes
(10:56:49 AM) epmclellan: getting all of this into the meeting notes to help with documentation
(10:57:01 AM) peterVG: ARTi after we have all those stats (hopefully with positive results, ie. adding more processors boosts performance) we can add distributed FS as a new test variable, eg CEPH
(10:57:47 AM) ARTi: ok
(10:57:58 AM) peterVG: so make note of FS as a variable as well for the initial set of baseline tests (presumably EXT4 right now?)
(10:58:11 AM) ARTi: yes, ext4+nfs
(10:58:40 AM) peterVG: berwin22: are there other factors to test for? mysql insert/read response times/timeouts?
(10:59:45 AM) berwin22: thinking
(10:59:56 AM) peterVG: ARTi and epmclellan having a look at the mysql queries berwin22 created for testing will also give a sense of some other variables we may want to include in analysis
(11:00:06 AM) berwin22: I don't see what those would give us.
(11:00:31 AM) berwin22: If we weren't seeing a performance improvement, then we might want to start looking at those t hings
(11:00:36 AM) peterVG: what type of info are those queries returning?
(11:02:10 AM) berwin22: transfer time processing
(11:02:10 AM) berwin22: sip time processing
(11:02:10 AM) berwin22: I'm assuming the ones I was working on recently.
(11:02:10 AM) berwin22: Before, we only had time in system, but had no way to tell, which sips/transfers were taking longer, if multiple transfers were processing at once.
(11:02:40 AM) peterVG: okay, so would it be useful to capture a snapshot (eg print to pdf) of these queries for each test?
(11:03:02 AM) peterVG: or save as csv so we can put all into a spreadsheet?
(11:03:08 AM) berwin22: yes, I'd say xml.
(11:03:18 AM) berwin22: xml/html
(11:03:23 AM) berwin22: use the -H
(11:03:53 AM) peterVG: epmclellan: and ARTi can you please have a look at berwin22's query and figure out where/how to incorporate into test results and global analysis, sounds like a useful way to generate metrics
(11:04:11 AM) epmclellan: ok
(11:04:27 AM) epmclellan: I'll need to be brought up to speed on a lot of this
(11:04:47 AM) peterVG: okay, pls discuss with berwin22 and ARTi as needed
(11:04:51 AM) epmclellan: will do
(11:04:59 AM) peterVG: we need to start making some solid progress on this work this week
(11:05:09 AM) epmclellan: let's chat this afternoon, guys
(11:05:14 AM) peterVG: our window to take any dramatic corrective action (hopefully not necessary) is closing
(11:05:18 AM) berwin22: not here this aft
(11:05:28 AM) epmclellan: tomorrow then
(11:05:39 AM) berwin22: from the data I've seen, it's working
(11:05:54 AM) berwin22: but it's not scientific
(11:06:06 AM) ARTi: Ill be on all day tomorrow as well
(11:06:11 AM) peterVG: okay cool
(11:06:19 AM) epmclellan: ok, I'll be available in the chatroom all day
(11:06:31 AM) peterVG: sorry to dominate mtg with this topic but was critical
(11:06:40 AM) epmclellan: np
(11:06:44 AM) peterVG: i guess we're out of time, Mark is on stand-by for mtg
(11:06:48 AM) epmclellan: yes, he's here
(11:07:00 AM) peterVG: anything critical to address before next week's mtg?
(11:07:11 AM) courtney: we've tackled the BagIt issue, I think
(11:07:20 AM) epmclellan: no
(11:07:21 AM) peterVG: berwin22: sounds like you've got an idea of the high priority issues you want to work on
(11:07:33 AM) epmclellan: we have a few last issues to deal with re bagit ingest
(11:07:36 AM) peterVG: plus berwin22 you're taking some time off to move?
(11:07:42 AM) berwin22: Some good active discussion on:
(11:07:42 AM) berwin22: http://www.archivematica.org/wiki/index.php?title=Bag_ingest
(11:07:42 AM) berwin22: http://code.google.com/p/archivematica/issues/detail?id=593
(11:07:42 AM) berwin22: Issue 593: Add ability to ingest bagit SIPs.
(11:07:51 AM) berwin22: yes