Meeting 20120314
Artefactual Systems, Internal Archivematica Dev Mtg, 2012-03-14
Development
- Mike has begun to group micro services - Issue 320
- Joseph started work on selectable AIP storage location
- Mark Jordan is working on DIP upload to CONTENTdm
- Evelyn has been discussing requirements with him
- Joseph is implementing structmap alphabetical ordering
- our alpha sorting requirement should use original (pre-sanitized) filenames, sort on UTF-8 chars, and respect numbers/lower-upper case. PyICU sounds like it might deal with Unicode sorting... http://stackoverflow.com/questions/1097908/how-do-i-sort-unicode-strings-alphabetically-in-python
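A minimal sketch of the sorting requirement above, assuming Python 3 and only the standard library. The function name `natural_key` and the sample filenames are hypothetical and are not taken from the Archivematica code base. It sorts on the original (pre-sanitized) names, compares digit runs by numeric value, and ignores case. Non-ASCII letters are still ordered by code point, so language-aware collation would need a dedicated library such as PyICU, as the linked discussion suggests.

```python
import re
import unicodedata

# Capturing group so that re.split keeps the digit runs in the result.
_DIGIT_RUN = re.compile(r"(\d+)")


def natural_key(original_name):
    """Hypothetical sort key: case-insensitive, digit runs compared as numbers."""
    # Normalize so that composed and decomposed forms of a character compare equal.
    text = unicodedata.normalize("NFC", original_name)
    parts = _DIGIT_RUN.split(text)
    # Even indices hold text, odd indices hold digit runs, so keys built from
    # different names always compare like with like at each position.
    key = [int(p) if i % 2 else p.casefold() for i, p in enumerate(parts)]
    # Fall back to the raw name so that names differing only in case sort stably.
    return key, original_name


# Illustrative filenames only.
names = ["image10.tif", "Sample.tif", "image2.tif", "readme.txt", "Image1.tif"]
ordered = sorted(names, key=natural_key)
print(ordered)
# ['Image1.tif', 'image2.tif', 'image10.tif', 'readme.txt', 'Sample.tif']
```

The key is computed from the original filename only, so the sanitized name can still be used for storage while the structMap order follows what the user supplied.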
Deployment
Testing
Documentation
- Courtney is working on the web-based transfer interface: http://archivematica.org/wiki/index.php?title=File_Browser_Requirements#START_TRANSFER
Chat log
(10:46:07 AM) epmclellan1: meeting time? I can take notes
(10:46:07 AM) ARTi left the room (quit: Read error: Connection reset by peer).
(10:46:37 AM) peterVG: back
(10:46:43 AM) epmclellan1: we just lost Autsin but we can still start with dev
(10:47:21 AM) courtney: are mockups dev?
(10:47:21 AM) mcantelon: I've started working on grouping jobs, in transfers, by microservice.
(10:47:28 AM) epmclellan1: great!
(10:47:43 AM) courtney: super
(10:47:43 AM) ARTi [~austin@24.207.112.199] entered the room.
(10:47:51 AM) epmclellan1: hi ARTi
(10:47:55 AM) epmclellan1: we've just started with dev
(10:47:58 AM) courtney: epmclellan1: didn't you start grouping microservices somewhere on the wiki?
(10:48:17 AM) epmclellan1: yes, it's linked from the issue I think...
(10:48:33 AM) ARTi: yes.. dunno if the internet is unstable here, I havnt seen anything since pool table
(10:48:54 AM) epmclellan1: micro-services grouping issue is http://code.google.com/p/archivematica/issues/detail?id=320
(10:49:03 AM) epmclellan1: includes mock-up and list of micro-services
(10:49:29 AM) mcantelon: I *think* in the database there are already grouped, so it's just a matter of exposing that in the interface. berwin22 berwin221
(10:49:56 AM) epmclellan1: berwin221 is that correct? ^
(10:49:56 AM) berwin221: yes
(10:50:16 AM) berwin221: it's by no means complete, but a start
(10:50:17 AM) courtney: how much will the changes we're making in transfer backup impact the microservices grouping
(10:50:24 AM) berwin221: and should give mcantelon something to work with
(10:50:38 AM) epmclellan1: courtney: not too much, I think
(10:50:45 AM) mcantelon: Yeah, it shouldn't be too much longer until I have something to show.
(10:50:47 AM) epmclellan1: we're moving whole micro-services, not just individual tasks
(10:50:52 AM) courtney: ok
(10:51:27 AM) courtney: we should make a firm decision about which microservices are absolutely necessary for transfer backup - i'm writing up requirements today
(10:51:33 AM) epmclellan1: I like courtney's start transfer mockup
(10:51:40 AM) courtney: : )
(10:51:42 AM) epmclellan1: yes, we can talk about that after the meeting
(10:51:44 AM) peterVG: yes, nicely done
(10:52:01 AM) courtney: it's going to change significantly today - and i'm adding several more
(10:52:05 AM) epmclellan1: getting a lot of new ideas about how archivists can handle everything from accession forward
(10:52:22 AM) epmclellan1: no other system will do anything like this
(10:52:32 AM) courtney: eliminating the need for archives to do preliminary backup actions
(10:52:38 AM) courtney: which are currently haphazard at best
(10:52:51 AM) epmclellan1: and allowing them to get a better handle on their backlog
(10:52:58 AM) epmclellan1: in terms of understanding what's in it
(10:53:20 AM) epmclellan1: berwin221 dev news?
(10:53:38 AM) berwin221: dev:Work on sort of structmap:
(10:53:38 AM) berwin221: Did the default sort, and it appears to be by the binary representation of letters, so case then alphabetic
(10:54:09 AM) epmclellan1: how does it handle numbers?
(10:54:16 AM) epmclellan1: image001, image002 etc
(10:54:18 AM) peterVG: berwin221 "so case then alphabetic"?
(10:54:55 AM) peterVG: berwin221 sorry don't fully understand what the implications are
(10:55:20 AM) berwin221: dev:Work on selectable AIP storage location.
(10:55:20 AM) berwin221: Making another selection step, to pick the destination, from a specified list in the database.
(10:55:20 AM) berwin221: The selection is stored in a variable, as a replacement dic, passed down the chain.
(10:56:02 AM) berwin221: numbers get sorted in the alphabetic step
(10:56:22 AM) epmclellan1: ok
(10:56:29 AM) berwin221: http://www.asciitable.com/
(10:57:12 AM) epmclellan1: so for digitization output where eg one file equals a page, user needs to use naming, numberin and capitalization conventions
(10:57:20 AM) epmclellan1: which seems reasonable
(10:57:58 AM) peterVG: berwin221 does that mean 'S' will get positioned before 'r' ?
(10:58:13 AM) berwin221: yes
(10:58:24 AM) epmclellan1: is there a way around that?
(10:58:24 AM) peterVG: that's not desirable though is it?
(10:58:42 AM) ARTi: notes so far http://archivematica.org/wiki/index.php?title=Meeting_20120314
(10:58:57 AM) epmclellan1: thanks for taking notes, ARTi
(10:59:00 AM) ARTi: np
(10:59:52 AM) berwin221: is there a way around that? time and money
(11:00:02 AM) berwin221: I've only started looking at the issue
(11:00:22 AM) peterVG: berwin221 did you talk to Mike about it?
(11:00:37 AM) berwin221: no
(11:00:44 AM) mcantelon: Not sure on the problem surface, but maybe there's a way to hack in natural sorting? http://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort
(11:01:06 AM) peterVG: multi-lingual alpha sorting is complex, so good to run by other devs for suggestions
(11:01:29 AM) berwin221: it's not multilingual
(11:01:31 AM) peterVG: I think we also need to be clearer on the requirement then
(11:01:43 AM) berwin221: at that point, we've stripped the unicode
(11:01:43 AM) ARTi: mcantelon: cool
(11:02:01 AM) mcantelon: Multi-lingual sorting seems like it could be complex (presuming different culturing have different ways of sorting)...
(11:02:10 AM) peterVG: so we wouldn't be able to sort any files coming in using Unicode chars?
(11:02:21 AM) peterVG: e.g. anything non-ASCII?
(11:02:26 AM) epmclellan1: We would need to sort by original name
(11:02:30 AM) mjsuhonos: i've used the unicode decimal value as a sort index, but only for the first character or few
(11:02:36 AM) epmclellan1: instead of sanitized name
(11:02:41 AM) epmclellan1: would that be possible?
(11:02:59 AM) mjsuhonos: that will cause sorting to align with the UTF-8 mapping, but don't know if that will be cultural
(11:03:25 AM) peterVG: mjsuhonos: does it also put capitalized letters before lower-case or does the Unicode decimal value respect this order?
(11:03:50 AM) mjsuhonos: it just follows the unicode planes. IIRC, upper-case characters are all mapped together
(11:03:55 AM) mjsuhonos: aabbccAABBCC
(11:04:13 AM) peterVG: hmm, so not true natural language sorting
(11:04:16 AM) ARTi left the room (quit: Read error: Connection reset by peer).
(11:04:20 AM) peterVG: pipedream?
(11:04:23 AM) mjsuhonos: no, it's glyph sorting.
(11:04:28 AM) mjsuhonos: pipedream for sure.
(11:04:46 AM) mjsuhonos: "natural order sorting" requires normalization and maybe even transliteration. blag magic at best
(11:04:54 AM) peterVG: okay, let's establish then what is actually possible with existing libraries available to us
(11:05:07 AM) epmclellan1: makes me wonder, if the objects are supposed to form eg the pages of a book, whether the user should have some means of ordering them during ingest
(11:05:09 AM) peterVG: let's continue in seperate thread, post-meeting?
(11:05:12 AM) epmclellan1: ok
(11:05:46 AM) epmclellan1: any more dev?
(11:06:00 AM) peterVG: berwin221 can you please initiate on archivematica@artefactual.com or public list? (and include mjsuhonos)
(11:06:24 AM) berwin221: k
(11:06:26 AM) peterVG: thx
(11:06:39 AM) peterVG: we've lost Austin again?
(11:06:54 AM) epmclellan1: looks like it
(11:06:59 AM) mcantelon: PyICU sounds like it might deal with Unicode sorting... http://stackoverflow.com/questions/1097908/how-do-i-sort-unicode-strings-alphabetically-in-python
(11:07:01 AM) epmclellan1: I can finish minutes
(11:07:01 AM) peterVG: what's the ETA on 12.04 port and multi-processor VM testing
(11:07:22 AM) peterVG: that's most urgent task for him now as per last week's dev mtg
(11:07:24 AM) Sevein: 12.04 end of April
(11:07:38 AM) Sevein: well, the Ubuntu release I meant
(11:07:39 AM) peterVG: Sevein: we've started porting to 12.04beta
(11:07:46 AM) Sevein: yup, I know
(11:07:51 AM) peterVG: just wondering on ETA for completion of our package updates
(11:07:55 AM) ARTi [~austin@24.207.112.199] entered the room.
(11:08:13 AM) ARTi: bleh nets.. did I miss anything to add to notes?
(11:08:15 AM) peterVG: so that we can start multi-processor node testing
(11:08:40 AM) epmclellan1: ARTi: I'll finish notes, I'll have the whole chat log
(11:08:45 AM) peterVG: ARTi: we were talking about multilingual/UTF8 alpha sorting
(11:08:49 AM) ARTi: epmclellan1: cheers
(11:08:59 AM) peterVG: then I had a question about status of work on 12.04 porting
(11:09:27 AM) ARTi: I havnt looked at it from last week, but its mostly done if I recall
(11:09:36 AM) peterVG: as per last week's dev meeting that is your most urgent task now, followed by multiprocesser/node testing once 12.04beta porting is completed
(11:09:55 AM) ARTi: yep, on it
(11:10:15 AM) peterVG: MarkJ is pretty much handling all of the ContentDM task so you're off hook for that
(11:11:04 AM) ARTi: cool
(11:11:34 AM) epmclellan1: re contentDM and ordering etc, I've emailed UBC Library to get more details about requirements
(11:11:47 AM) epmclellan1: their requirements may be fairly simple
(11:13:15 AM) peterVG: okay, but just to reiterate, our alpha sorting requirement should use original (pre-sanitized) filenames, sort on UTF-8 chars, and respect numbers/lower-upper case
(11:13:28 AM) epmclellan1: yes
(11:13:39 AM) epmclellan1: that's the minimum
(11:13:51 AM) peterVG: sounds like mcantelon's link above is good place to start for further investigation into how much of this is possible with existing libraries
(11:14:07 AM) epmclellan1: need to know from UBC if they have logical structure requirements beyond alphanumeric sorting
(11:14:16 AM) epmclellan1: hopefully not
(11:14:26 AM) peterVG: epmclellan1: right, related but two seperate issues
(11:14:45 AM) peterVG: can someone pls update the alpha sorting issue with this updated ^ discussion
(11:14:45 AM) epmclellan1: well, it will dictate how we structure the structMap in METS
(11:14:52 AM) epmclellan1: I can update the issue
(11:15:03 AM) peterVG: yes, but that is a seperate issue from getting alpha sorting working
(11:15:07 AM) epmclellan1: right
(11:15:37 AM) peterVG: okay, that's time eh?
(11:15:51 AM) epmclellan1: k
(11:19:43 AM) epmclellan1: alpha sorting issue updated: http://code.google.com/p/archivematica/issues/detail?id=937
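
The default ordering berwin221 describes in the log ("case then alphabetic", with 'S' placed before 'r') can be reproduced with a short sketch, assuming Python 3 and its built-in string comparison by code point. The filenames are illustrative only and do not come from the meeting.

```python
# Illustrative filenames only; not taken from any real transfer.
names = ["readme.txt", "Sample.tif", "image10.tif", "image2.tif", "2012.tif"]

# Built-in sorting compares strings code point by code point.
default_order = sorted(names)

# Code points explain the result: digits < upper-case < lower-case.
code_points = {ch: ord(ch) for ch in "2Sr"}

print(default_order)
# ['2012.tif', 'Sample.tif', 'image10.tif', 'image2.tif', 'readme.txt']
print(code_points)
# {'2': 50, 'S': 83, 'r': 114}
```

Note that "image10.tif" lands before "image2.tif" because digits are compared one character at a time, which is why zero-padded names such as image001 and image002 sort as expected under this default.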