Meeting 20120314

Latest revision as of 12:14, 14 March 2012

Artefactual Systems, Internal Archivematica Dev Mtg, 2012-03-14

Development

  • Mike has begun to group micro services - Issue 320
  • Joseph started work on selectable AIP storage location
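  • Mark Jordan is working on DIP upload to CONTENTdm
      • Evelyn has been discussing requirements with him
      • Joseph is implementing structmap alphabetical ordering
          • Our alpha sorting requirement should use original (pre-sanitized) filenames, sort on UTF-8 chars, and respect numbers/lower-upper case. PyICU sounds like it might deal with Unicode sorting... http://stackoverflow.com/questions/1097908/how-do-i-sort-unicode-strings-alphabetically-in-python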
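
In the chat log, Joseph describes the selectable AIP storage location as an extra selection step whose result is stored as a replacement dictionary and passed down the chain. A minimal sketch of that idea follows. Every name in it (the `%AIPsStore%` and `%AIPFile%` placeholders, the paths, the helper functions) is a hypothetical chosen for illustration, not taken from the Archivematica code, and the location list is hard-coded here, whereas the notes say it comes from the database.

```python
# Hypothetical storage locations; per the meeting notes the real list
# would be read from the database rather than hard-coded.
AIP_STORAGE_LOCATIONS = [
    "/var/archivematica/AIPsStore",
    "/mnt/nas/AIPsStore",
]


def choose_location(locations, index):
    """Return the location picked at the (hypothetical) selection step."""
    return locations[index]


def apply_replacements(template, replacements):
    """Substitute each placeholder in a command template with its value."""
    for placeholder, value in replacements.items():
        template = template.replace(placeholder, value)
    return template


# The selection is recorded once in a replacement dictionary...
replacements = {
    "%AIPsStore%": choose_location(AIP_STORAGE_LOCATIONS, 1),
    "%AIPFile%": "transfer-1234.7z",
}

# ...and any later step in the chain can reuse the same dictionary.
command = apply_replacements("cp %AIPFile% %AIPsStore%/", replacements)
print(command)  # cp transfer-1234.7z /mnt/nas/AIPsStore/
```

Keeping the choice in one dictionary means later steps do not need to know how the destination was selected, only which placeholder to expand.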

Deployment

Testing

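Documentation

  • Courtney is working on the web-based transfer interface: http://archivematica.org/wiki/index.php?title=File_Browser_Requirements#START_TRANSFER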

chat log

(10:46:07 AM) epmclellan1: meeting time? I can take notes
(10:46:07 AM) ARTi left the room (quit: Read error: Connection reset by peer).
(10:46:37 AM) peterVG: back
(10:46:43 AM) epmclellan1: we just lost Austin but we can still start with dev
(10:47:21 AM) courtney: are mockups dev?
(10:47:21 AM) mcantelon: I've started working on grouping jobs, in transfers, by microservice.
(10:47:28 AM) epmclellan1: great!
(10:47:43 AM) courtney: super 
(10:47:43 AM) ARTi [~austin@24.207.112.199] entered the room.
(10:47:51 AM) epmclellan1: hi ARTi
(10:47:55 AM) epmclellan1: we've just started with dev
(10:47:58 AM) courtney: epmclellan1: didn't you start grouping microservices somewhere on the wiki?
(10:48:17 AM) epmclellan1: yes, it's linked from the issue I think...
(10:48:33 AM) ARTi: yes.. dunno if the internet is unstable here, I haven't seen anything since pool table
(10:48:54 AM) epmclellan1: micro-services grouping issue is http://code.google.com/p/archivematica/issues/detail?id=320
(10:49:03 AM) epmclellan1: includes mock-up and list of micro-services
(10:49:29 AM) mcantelon: I *think* in the database they are already grouped, so it's just a matter of exposing that in the interface.
berwin22 is now known as berwin221
(10:49:56 AM) epmclellan1: berwin221 is that correct? ^
(10:49:56 AM) berwin221: yes
(10:50:16 AM) berwin221: it's by no means complete, but a start
(10:50:17 AM) courtney: how much will the changes we're making in transfer backup impact the microservices grouping
(10:50:24 AM) berwin221: and should give mcantelon something to work with
(10:50:38 AM) epmclellan1: courtney: not too much, I think
(10:50:45 AM) mcantelon: Yeah, it shouldn't be too much longer until I have something to show. 
(10:50:47 AM) epmclellan1: we're moving whole micro-services, not just individual tasks
(10:50:52 AM) courtney: ok
(10:51:27 AM) courtney: we should make a firm decision about which microservices are absolutely necessary for transfer backup - i'm writing up requirements today
(10:51:33 AM) epmclellan1: I like courtney's start transfer mockup
(10:51:40 AM) courtney: : )
(10:51:42 AM) epmclellan1: yes, we can talk about that after the meeting
(10:51:44 AM) peterVG: yes, nicely done
(10:52:01 AM) courtney: it's going to change significantly today - and i'm adding several more
(10:52:05 AM) epmclellan1: getting a lot of new ideas about how archivists can handle everything from accession forward
(10:52:22 AM) epmclellan1: no other system will do anything like this
(10:52:32 AM) courtney: eliminating the need for archives to do preliminary backup actions
(10:52:38 AM) courtney: which are currently haphazard at best
(10:52:51 AM) epmclellan1: and allowing them to get a better handle on their backlog
(10:52:58 AM) epmclellan1: in terms of understanding what's in it
(10:53:20 AM) epmclellan1: berwin221 dev news?
(10:53:38 AM) berwin221: dev:Work on sort of structmap:
(10:53:38 AM) berwin221: Did the default sort, and it appears to be by the binary representation of letters, so case then alphabetic
(10:54:09 AM) epmclellan1: how does it handle numbers?
(10:54:16 AM) epmclellan1: image001, image002 etc
(10:54:18 AM) peterVG: berwin221 "so case then alphabetic"?
(10:54:55 AM) peterVG: berwin221 sorry don't fully understand what the implications are
(10:55:20 AM) berwin221: dev:Work on selectable AIP storage location.
(10:55:20 AM) berwin221: Making another selection step, to pick the destination, from a specified list in the database.
(10:55:20 AM) berwin221: The selection is stored in a variable, as a replacement dict, passed down the chain.
(10:56:02 AM) berwin221: numbers get sorted in the alphabetic step
(10:56:22 AM) epmclellan1: ok
(10:56:29 AM) berwin221: http://www.asciitable.com/
(10:57:12 AM) epmclellan1: so for digitization output where e.g. one file equals a page, user needs to use naming, numbering and capitalization conventions
(10:57:20 AM) epmclellan1: which seems reasonable
(10:57:58 AM) peterVG: berwin221 does that mean 'S' will get positioned before 'r' ?
(10:58:13 AM) berwin221: yes
(10:58:24 AM) epmclellan1: is there a way around that?
(10:58:24 AM) peterVG: that's not desirable though is it?
(10:58:42 AM) ARTi: notes so far http://archivematica.org/wiki/index.php?title=Meeting_20120314
(10:58:57 AM) epmclellan1: thanks for taking notes, ARTi
(10:59:00 AM) ARTi: np
(10:59:52 AM) berwin221: is there a way around that? time and money
(11:00:02 AM) berwin221: I've only started looking at the issue
(11:00:22 AM) peterVG: berwin221 did you talk to Mike about it?
(11:00:37 AM) berwin221: no
(11:00:44 AM) mcantelon: Not sure on the problem surface, but maybe there's a way to hack in natural sorting? http://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort
(11:01:06 AM) peterVG: multi-lingual alpha sorting is complex, so good to run by other devs for suggestions
(11:01:29 AM) berwin221: it's not multilingual
(11:01:31 AM) peterVG: I think we also need to be clearer on the requirement then
(11:01:43 AM) berwin221: at that point, we've stripped the unicode
(11:01:43 AM) ARTi: mcantelon: cool
(11:02:01 AM) mcantelon: Multi-lingual sorting seems like it could be complex (presuming different cultures have different ways of sorting)...
(11:02:10 AM) peterVG: so we wouldn't be able to sort any files coming in using Unicode chars?
(11:02:21 AM) peterVG: e.g. anything non-ASCII?
(11:02:26 AM) epmclellan1: We would need to sort by original name
(11:02:30 AM) mjsuhonos: i've used the unicode decimal value as a sort index, but only for the first character or few
(11:02:36 AM) epmclellan1: instead of sanitized name
(11:02:41 AM) epmclellan1: would that be possible?
(11:02:59 AM) mjsuhonos: that will cause sorting to align with the UTF-8 mapping, but don't know if that will be cultural
(11:03:25 AM) peterVG: mjsuhonos: does it also put capitalized letters before lower-case or does the Unicode decimal value respect this order?
(11:03:50 AM) mjsuhonos: it just follows the unicode planes.  IIRC, upper-case characters are all mapped together
(11:03:55 AM) mjsuhonos: aabbccAABBCC
(11:04:13 AM) peterVG: hmm, so not true natural language sorting
(11:04:16 AM) ARTi left the room (quit: Read error: Connection reset by peer).
(11:04:20 AM) peterVG: pipedream?
(11:04:23 AM) mjsuhonos: no, it's glyph sorting.
(11:04:28 AM) mjsuhonos: pipedream for sure.
(11:04:46 AM) mjsuhonos: "natural order sorting" requires normalization and maybe even transliteration.  black magic at best
(11:04:54 AM) peterVG: okay, let's establish then what is actually possible with existing libraries available to us
(11:05:07 AM) epmclellan1: makes me wonder, if the objects are supposed to form eg the pages of a book, whether the user should have some means of ordering them during ingest
(11:05:09 AM) peterVG: let's continue in separate thread, post-meeting?
(11:05:12 AM) epmclellan1: ok
(11:05:46 AM) epmclellan1: any more dev?
(11:06:00 AM) peterVG: berwin221 can you please initiate on archivematica@artefactual.com or public list? (and include mjsuhonos)
(11:06:24 AM) berwin221: k
(11:06:26 AM) peterVG: thx
(11:06:39 AM) peterVG: we've lost Austin again?
(11:06:54 AM) epmclellan1: looks like it
(11:06:59 AM) mcantelon: PyICU sounds like it might deal with Unicode sorting... http://stackoverflow.com/questions/1097908/how-do-i-sort-unicode-strings-alphabetically-in-python
(11:07:01 AM) epmclellan1: I can finish minutes
(11:07:01 AM) peterVG: what's the ETA on 12.04 port and multi-processor VM testing
(11:07:22 AM) peterVG: that's most urgent task for him now as per last week's dev mtg
(11:07:24 AM) Sevein: 12.04 end of April
(11:07:38 AM) Sevein: well, the Ubuntu release I meant
(11:07:39 AM) peterVG: Sevein: we've started porting to 12.04beta
(11:07:46 AM) Sevein: yup, I know
(11:07:51 AM) peterVG: just wondering on ETA for completion of our package updates
(11:07:55 AM) ARTi [~austin@24.207.112.199] entered the room.
(11:08:13 AM) ARTi: bleh nets.. did I miss anything to add to notes?
(11:08:15 AM) peterVG: so that we can start multi-processor node testing
(11:08:40 AM) epmclellan1: ARTi: I'll finish notes, I'll have the whole chat log
(11:08:45 AM) peterVG: ARTi: we were talking about multilingual/UTF8 alpha sorting
(11:08:49 AM) ARTi: epmclellan1: cheers
(11:08:59 AM) peterVG: then I had a question about status of work on 12.04 porting
(11:09:27 AM) ARTi: I haven't looked at it from last week, but it's mostly done if I recall
(11:09:36 AM) peterVG: as per last week's dev meeting that is your most urgent task now, followed by multiprocessor/node testing once 12.04beta porting is completed
(11:09:55 AM) ARTi: yep, on it
(11:10:15 AM) peterVG: MarkJ is pretty much handling all of the ContentDM task so you're off hook for that
(11:11:04 AM) ARTi: cool
(11:11:34 AM) epmclellan1: re contentDM and ordering etc, I've emailed UBC Library to get more details about requirements
(11:11:47 AM) epmclellan1: their requirements may be fairly simple
(11:13:15 AM) peterVG: okay, but just to reiterate, our alpha sorting requirement should use original (pre-sanitized) filenames, sort on UTF-8 chars, and respect numbers/lower-upper case
(11:13:28 AM) epmclellan1: yes
(11:13:39 AM) epmclellan1: that's the minimum
(11:13:51 AM) peterVG: sounds like mcantelon's link above is good place to start for further investigation into how much of this is possible with existing libraries
(11:14:07 AM) epmclellan1: need to know from UBC if they have logical structure requirements beyond alphanumeric sorting
(11:14:16 AM) epmclellan1: hopefully not
(11:14:26 AM) peterVG: epmclellan1: right, related but two separate issues
(11:14:45 AM) peterVG: can someone pls update the alpha sorting issue with this updated ^ discussion 
(11:14:45 AM) epmclellan1: well, it will dictate how we structure the structMap in METS
(11:14:52 AM) epmclellan1: I can update the issue
(11:15:03 AM) peterVG: yes, but that is a separate issue from getting alpha sorting working
(11:15:07 AM) epmclellan1: right
(11:15:37 AM) peterVG: okay, that's time eh?
(11:15:51 AM) epmclellan1: k
(11:19:43 AM) epmclellan1: alpha sorting issue updated: http://code.google.com/p/archivematica/issues/detail?id=937
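
The sorting behaviour discussed above ('S' landing before 'r', and the image001/image002 question) can be reproduced in a few lines of Python. The sketch below first shows the default code-point sort, then a simple case-insensitive "natural" sort key. `natural_key` is an illustrative name, not existing Archivematica code, and it only approximates the stated requirement: it does not do locale-aware collation, which is what the PyICU link in the log is about.

```python
import re
import unicodedata

# ASCII digit runs are compared as numbers rather than character by character.
_DIGIT_RUN = re.compile(r"([0-9]+)")


def natural_key(name):
    """Sort key: case-insensitive text chunks, digit runs compared as integers."""
    # Normalize so composed and decomposed forms of a character compare equal.
    normalized = unicodedata.normalize("NFC", name)
    key = []
    # re.split with a capturing group alternates text (even index) and digits (odd index).
    for index, chunk in enumerate(_DIGIT_RUN.split(normalized)):
        if index % 2 == 1:
            key.append((1, int(chunk), ""))
        else:
            key.append((0, 0, chunk.casefold()))
    # Fall back to the original name so ties are broken deterministically.
    return (key, name)


names = ["image10.tif", "Image2.tif", "image1.tif", "S.tif", "r.tif"]

# Default sort compares code points, so every upper-case letter precedes lower-case.
print(sorted(names))
# ['Image2.tif', 'S.tif', 'image1.tif', 'image10.tif', 'r.tif']

# The natural key ignores case and orders image1 < Image2 < image10.
print(sorted(names, key=natural_key))
# ['image1.tif', 'Image2.tif', 'image10.tif', 'r.tif', 'S.tif']
```

The key would be applied to the original (pre-sanitized) filenames, as the requirement states. Accented or non-Latin characters are still ordered by code point after case folding, so this is a starting point for investigation, not a full answer to multilingual sorting.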