[11:42:02] --- arnt has joined
[11:43:14] --- ams has joined
[14:48:18] --- Dave Cridland has joined
[14:48:28] <Dave Cridland> Hey Arnt.
[14:55:11] <arnt> hi
[14:55:40] <Dave Cridland> Must be pretty late for you.
[14:55:55] <arnt> ten at night. pretty late - the little darling has kept us up at night.
[14:56:20] <arnt> should I resemble a zombie, that's why.
[14:56:38] <Dave Cridland> Don't worry, it doesn't last. My four year old son sleeps most nights through...
[14:57:45] --- lisa has joined
[14:57:47] <arnt> you're in wales now?
[14:57:52] <Dave Cridland> Yep.
[14:57:53] <lisa> Hi guys
[14:57:57] <Dave Cridland> Hi Lisa.
[14:58:02] <arnt> hi.
[14:58:04] <lisa> No Cyrus? :(
[14:58:17] <lisa> (not that it's not nice to see you Dave and Arnt but Cyrus has slides to virtually present)
[14:58:34] <Dave Cridland> Not yet.
[14:58:35] <arnt> fyi, the third person present is my honoured partner, abhijit menon-sen.
[14:58:48] <Dave Cridland> Ah, that's where I've seen that TLA before.
[14:59:04] <arnt> you'll meet someday.
[14:59:25] <Dave Cridland> Lisa: I'm looking forward to reviewing the LDAP ABNF, although I'm not quite sure why that's on the agenda.
[15:00:52] <Dave Cridland> Is there meant to be sound yet?
[15:01:13] <arnt> sound? I haven't even set this computer up for sound.
[15:01:22] <arnt> <- sound-hater
[15:02:11] --- resnick has joined
[15:02:18] <Dave Cridland> Hi Pete.
[15:02:28] --- cnewman has joined
[15:02:39] --- Randy has joined
[15:02:40] <resnick> howdy
[15:03:06] --- alexeymelnikov has joined
[15:03:38] <cnewman> I volunteered to scribe
[15:03:38] --- kmurchison has joined
[15:03:41] <resnick> Anybody listening to the audio feed?
[15:03:50] <Dave Cridland> Yes, and I can hear Lisa.
[15:04:49] <cnewman> Agenda bash
[15:05:02] <cnewman> LISTEXT - Barry Leiba / Comparator / Collator / Condstore update / Annotate
[15:05:19] <cnewman> Any additions?
[15:05:19] --- cyrus_daboo has joined
[15:05:33] --- pguenther has joined
[15:05:53] <cnewman> Moved collator and comparator to end of agenda
[15:05:59] <arnt> are all you people staying away, or jabbering from the actual meeting?
[15:06:13] <arnt> you don't have infants. you should go to vancouver.
[15:06:40] <Dave Cridland> I have an infant. Although one that can walk as of a week ago.
[15:06:54] <cnewman> Mention that draft-melnikov-imap-ext-.. is in IETF wide last call and needs review and comments.
[15:07:18] <arnt> which one is that? the IMAP ABNF?
[15:07:18] <cnewman> LISTEXT - Barry speaking
[15:07:43] <resnick> arnt: Yes, IMAPABNF
[15:07:43] <cnewman> [yes, that's the IMAP ABNF draft]
[15:08:00] <cnewman> LISTEXT - will be another draft, but no need for another WG last call based on comments of 14
[15:08:25] <resnick> Philip volunteers, Chris volunteers, to do a last review of listext
[15:08:50] <arnt> arnt reviewed. if I didn't send mail, I was happy.
[15:08:53] <cnewman> Chris and Philip volunteered to review LISTEXT draft within the next two weeks
[15:09:00] <cnewman> Moving on... CONDSTORE
[15:09:03] <Dave Cridland> I'll probably take a look, too.
[15:10:30] <Dave Cridland> arnt: There's a -15 minor update coming that Barry mentioned.
[15:11:06] <cnewman> new CONDSTORE draft has been posted by Alexey.
[15:11:10] <cyrus_daboo> lisa's mike seems to be cutting in and out a bit
[15:11:26] <cnewman> Lisa/Chair: believe this is a re-factoring of the document, not a new rev so claim is no new last call is needed.
[15:11:32] <resnick> fixed.
[15:12:12] <cnewman> Scott/AD: believes removing a normative reference may be a substantive change, so need to review changes
[15:12:31] <cnewman> Lisa: just an ABNF reference so we could remove reference.
[15:12:42] <cnewman> Scott: perhaps we can do without a last call then.
[15:13:02] <cnewman> Alexey: doesn't change on-the-wire bits.
[15:13:36] <cnewman> Pete/co-chair: what should we do procedurally for Scott?
[15:13:54] <cnewman> Scott: run list of edits by AD
[15:14:12] <cnewman> Action item: chairs will send a review of CONDSTORE changes to Scott and the list
[15:15:48] <lisa> We discovered that turning on one mike turned off another. We won't use the clipon mike and we should stop making voices cut out :)
[15:15:53] <cnewman> Philip: If we remove normative reference to annotate and annotate is changed to remove flags then there could be an issue with flags in condstore
[15:16:04] <lisa> Philip's point is giving me a sinking feeling here.
[15:16:45] <cnewman> Philip is asking for feedback from Arnt on this issue.
[15:16:45] --- tonyhansen has joined
[15:16:56] <lisa> Arnt, do you have an opinion about the desirability of removing flags from annotate given that it affects condstore?
[15:16:59] <Dave Cridland> Arnt has no audio...
[15:17:06] <cnewman> Alexey: doesn't want to change CONDSTORE abnf
[15:17:19] <cyrus_daboo> My opinion is remove flags from annotate.
[15:17:21] <lisa> Condstore is past IETF and in the RFC queue so that change to Annotate would throw Condstore back to us.
[15:17:30] --- Glenn Parsons has joined
[15:17:53] <cnewman> Alexey: need to reserve the flags namespace in ANNOTATE for use by CONDSTORE
[15:17:56] --- tonyhansen has left
[15:17:59] <arnt> I believe that removing flags is clearly desirable. for me, and I think for a lot of people, flags is the single biggest feature in annotate in terms of lines of code and number of tests.
[15:18:22] <cyrus_daboo> Right - we can always add flags back if there is sufficient demand and valid use cases and shared vs private dependency can be fixed in IMAP itself.
[15:18:24] <resnick> What does that mean for condstore?
[15:18:26] <arnt> I don't understand the impact on condstore.
[15:19:07] <Dave Cridland> Arnt: Neither did I, so I looked - it's the MODSEQ SEARCH key.
[15:19:13] <resnick> Alexey: We want the two documents to be compatible. One document has to say that there's a namespace reserved.
[15:19:24] <lisa> Alexey thinks he has a fix to avoid impacting condstore...
[15:19:30] <cnewman> We want the documents (CONDSTORE, ANNOTATE) compatible. CONDSTORE needs a mechanism to provide flag updates because that's a primary function of it. It would be nice if CONDSTORE used syntax compatible with ANNOTATE.
[15:20:17] <cyrus_daboo> I do not see any risk to CONDSTORE.
[15:20:19] <cnewman> Lisa: any objections to removing flags stuff from ANNOTATE to simplify it, even if there's a risk of setting back CONDSTORE?
[15:20:20] <arnt> I don't see that as necessarily dependent on the /flags/* mirror. accidentally, perhaps.
[15:20:40] <Dave Cridland> No objections from me.
[15:20:52] <cnewman> Lisa: Do we have consensus on removing flags from ANNOTATE?
[15:20:55] <Dave Cridland> *hum*
[15:22:03] <Dave Cridland> Alexey's asking whether we reserve the flag namespace or drop flags entirely from ANNOTATE. Lisa's saying we reserve the /flags/ namespace in ANNOTATE.
[15:22:03] <cnewman> Alexey: What does it mean to remove flags from ANNOTATE? If ANNOTATE says nothing about flags, then CONDSTORE has to go through another last call.
[15:22:55] <cnewman> Randy: originally it was intended to provide a single mechanism which tries to rationalize some of the issues flags had (shared vs. personal).
[15:23:00] <lisa> We believe we hear consensus.
[15:23:11] <cyrus_daboo> Flags lost a lot of their attractiveness once we removed the ability to store them via annotate.
[15:23:19] <cnewman> Lisa: Hearing rough consensus to remove flags from ANNOTATE with worrying about the effect on CONDSTORE, but still simplify CONDSTORE significantly.
[15:23:47] <cnewman> change last "CONDSTORE" to "ANNOTATE" in what I just typed...
[15:24:29] <cnewman> Philip: need separate way to represent flags for MODSEQUENCE search criteria.
[15:24:31] <arnt> mucho happy humming from munich.
[15:25:22] <Dave Cridland> Question for the panel - has anyone actually implemented CONDSTORE?
[15:25:29] <cnewman> Philip: If the reserving of the annotation namespace for flags is an objectionable wart, then the alternative would be a special MODSEQUENCE ABNF for CONDSTORE flags.
[15:26:00] <cnewman> Alexey: kinda implemented CONDSTORE (previous employer)
[15:26:08] <arnt> dave: I'm about ten lines from an implementation. it's amazingly easy (but a little costly at runtime).
[15:26:10] <Dave Cridland> Yes, Messaging Direct.
[15:26:17] <cnewman> Pete/Chair: thinks he knows where we're going.
[15:26:21] <arnt> strike "a little"
[15:26:46] <cnewman> Pete/Chair: moving on to ANNOTATE document
[15:27:06] <Dave Cridland> Arnt: It's nice and easy for clients. :-)
[15:27:13] <pguenther> arnt: so the /flags/ stuff wasn't as annoying in CONDSTORE?
[15:27:36] <cnewman> Slide: Cyrus' list of changes
[15:27:50] <Dave Cridland> Where are the slides?
[15:27:51] <cnewman> * Change 'string' formal syntax to 'list-mail' and 'astring' for entry/attribute names
[15:27:55] <resnick> Should have been sent to the list.
[15:28:05] <cnewman> * updated examples to match new astring syntax
[15:28:10] <resnick> URL coming.
[15:28:14] <arnt> pguenther: in condstore it's just syntax to generate. in annotate the /flags/* requires special cases.
[15:28:20] <cnewman> slides at: ietf.webdav.org/imapext
[15:28:20] <Dave Cridland> (Ah, I looked at othe onsite thing...)
[15:28:36] --- tonyhansen has joined
[15:28:37] <cyrus_daboo> http://ietf.webdav.org/imapext/presentations/IMAPEXT-meeting-ietf64.ppt
[15:28:39] <cnewman> Slides: ease of implementation issues
[15:28:46] <Dave Cridland> Ta.
[15:29:18] <pguenther> yeah, I guess it's simple enough when restricted to the SEARCH MODSEQ case
[15:29:20] <cnewman> More ideas suggested since Paris
[15:29:26] <cnewman> * Hard-code content-type?
[15:29:55] <cyrus_daboo> FYI The first three points on the suggested changes really boil down to trying to simplify the set of attributes to as small as possible i.e. just value and size.
[15:29:55] <cnewman> * Remove content-language? if OK by IESG
[15:30:16] <cnewman> * Remove display-name?
[15:30:45] <Dave Cridland> Cyrus: Exactly.
[15:30:47] <cnewman> Pete makes frowns at suggestion to remove content-language
[15:30:58] <arnt> (in case it's not clear, I'm in favour of all the removes. don't really care much about language.)
[15:31:23] <cyrus_daboo> Phil: certainly we should dump display-name to remove ambiguity of content-lang plus some other issues.
[15:31:35] <cnewman> Philip: complex due to need to overwrite content-language at same time attribute value is updated
[15:31:50] <arnt> if language stays, we really should make STORE ANNOTATION clear the entry. don't want an old content-language to stay if value is overwritten, etc.
[15:32:55] <cnewman> Lisa: experience from webdav is that language tags are never used. But personal vote is to keep it to see if it will succeed.
[15:33:01] <cyrus_daboo> Keeping it does raise the problem of keeping it consistent with the value as per Arnt's comment.
[15:33:07] <Dave Cridland> Arguably, though, the right thing is Plane 14 or MLSF.
[15:33:32] <Dave Cridland> Content-Language is a hack, a stop-gap, to make up for the lack of common usage of Plane 14 or MLSF.
[15:34:06] <cyrus_daboo> What does IESG have to say about Plane 14 / MLSF
[15:34:21] <cnewman> Phil: didn't want to make attribute names UTF-8, don't want stringprep. So add display-name when displaying the attribute name to the user. But language of display-name unclear.
[15:34:35] <Dave Cridland> MLSF was dropped slightly post-ACAP in favour of Unicode pushing Plane 14, which, as I understand it, never happened.
[15:35:02] <cnewman> [note: the Unicode consortium has deprecated Plane 14, and will scream bloody murder if anyone attempts to use MLSF, just FYI]
[15:35:04] <arnt> the unicode people didn't seem to have a warm fuzzy feeling about plane 14.
[15:35:20] <Dave Cridland> MLSF was Chris's idea and draft, so I'd guess he'd know more. Mark Crispin is the resident expert on the necessity for language tagging, IIRC.
[15:35:36] <cnewman> Lisa: other responses?
[15:35:49] <cyrus_daboo> Can we discuss the plane 14 issue?
[15:35:54] <cnewman> Lisa: who wants to keep content-language? (a few hands)
[15:36:01] <cnewman> Lisa: anyone who wants to remove it?
[15:36:02] <arnt> whose hands? implementers?
[15:36:25] <cyrus_daboo> Can it be used to replace content-lang?
[15:36:30] <arnt> I'll happily keep it if anyone stands up to say "I will implement it".
[15:36:35] <cnewman> Pete: leave plane 14 issue to the list.
[15:36:56] <cnewman> Phil: question: implement it with removal?
[15:37:12] <cnewman> Pete: let's take this issue back to the list.
[15:37:20] <cyrus_daboo> We need to ask the client folks about this as they are the ones that have to actually set the value.
[15:37:53] <cyrus_daboo> I never bothered doing anything with content-lang even in MIME headers let alone annotations, but maybe I was just being bad.
[15:37:59] <cnewman> Phil: content-language has traction at the MIME and web world? Do we have reason to believe it won't work here?
[15:38:09] <Dave Cridland> I've never seen the Content-Language header actually set in an email, I have to admit. (In response to Phil's comments)
[15:38:26] <cnewman> Lisa: ANNOTATE fixes some of the ambiguities that Webdav had which ANNOTATIONS can avoid (e.g. the clearing issue, multi-values, etc).
[15:38:29] <resnick> David: How many Chinese or Japanese e-mails do you get?
[15:38:36] <arnt> dave: I've seen a few, but they don't correspond very well to the language actually used.
[15:39:05] <cnewman> Lisa: moving on to display-name. Removing it?
[15:39:06] <Dave Cridland> resnick: Hundreds, with pictures I don't like to be displaying near my wife and kids. ;-)
[15:39:32] <cnewman> Lisa: let it be chosen through out-of-band mechanisms via specifications or user-interface designers?
[15:39:39] <resnick> David: Something tells me that those senders are not too concerned about how their text is displayed. :)
[15:39:43] <cnewman> Lisa: Any objections to removing display-name? (silence)
[15:39:46] <cyrus_daboo> So Lisa is proposing having this as an IANA registry attribute?
[15:39:54] <Dave Cridland> No objections to removing display name from me.
[15:40:00] <arnt> display-name is a PERFECT candidate for translators. translate /altsubject as whatever.
[15:40:13] <cnewman> Alexey: Only matters if you have a client which can browse annotations and can show unrecognized ones.
[15:40:51] <Dave Cridland> The kind of client likely to be able to browse random annotations is not going to be used by anyone not willing to go Google for what the annotation means.
[15:41:01] <cnewman> Alexey: Don't think we should worry about unrecognized case.
[15:41:15] <Randy> Arnt, what are you saying about translators?
[15:41:30] <cnewman> Phil: If we remove content-type, then clients are already configured to know what type the attributes are. Display-name should go if and only if content-type goes.
[15:41:31] <arnt> display-name has the wrong granularity.
[15:41:52] <Dave Cridland> Phil's right.
[15:41:54] <cnewman> Lisa: This does weaken the ability of IMAP clients to browse unfamiliar annotations and display them.
[15:41:59] <arnt> it presumes that the person reading the annotation and the writer have the same language. what's really correct is that display-name should follow the reader's clients' general language.
[15:42:26] <arnt> so the reader's client should translate /altsubject in the same way that it translates the text on the "save" button.
[15:42:31] <cnewman> Alexey: Suppose we have generic LDAP browser. How many will localize Street name, for example. haven't seen any.
[15:43:14] <cnewman> Lisa: webdav analogy again: there are clients that take properties like modtime and such and represent them in Thai, but only the standardized properties.
[15:43:23] <cnewman> Pete: also Apple's LDAP client for the common ones it knows.
[15:43:44] <cnewman> Lisa: most webdav clients don't allow you to browse arbitrary properties.
[15:43:48] <cnewman> Pete: same with most LDAP clients.
[15:43:54] <arnt> I like lisa. she keeps injecting reality.
[15:44:04] <cnewman> Lisa: any objections to removing display-name? (no objections)
[15:44:22] <cnewman> Moving on... limit searching to value only?
[15:44:42] <cyrus_daboo> Yes - just three attributes.
[15:44:44] <cnewman> Phil: we're down to three attributes: value, size and maybe content-language.
[15:44:59] <cnewman> Lisa: any objection to limiting search to cover values only? (no objections)
[15:44:59] <Dave Cridland> Did we actually discuss removing content-type, yet?
[15:45:18] <pguenther> it's gone
[15:45:18] <resnick> Yes, quickly.
[15:45:18] <arnt> WITH PRESENT SYNTAX it makes sense to restrict searching to value only.
[15:45:26] <Dave Cridland> No objections from me to removing wildcarded entry names in SEARCH, nor restricting SEARCH to values only.
[15:45:31] <cnewman> Next item: remove % and * in SEARCH? In FETCH?
[15:45:35] <cyrus_daboo> Sorry - I think the % * was for attributes not entries, I think. Wasn't that what you wanted Arnt?
[15:45:40] <cnewman> Alexey: think we should leave this
[15:45:55] <cnewman> Lisa: feeling a bit uncomfortable for client implementors
[15:45:59] <arnt> removing % and * from attributes was what I wanted.
[15:46:10] <cnewman> Alexey: Useful for a group of related attribute names.
[15:46:11] <arnt> I'm happy having */% for entries.
[15:46:15] <Dave Cridland> Gah. Yes, quite so.
[15:46:26] <cyrus_daboo> If we also remove vendor. attributes then there really is no need for % * in attributes anywhere.
[15:46:32] <cnewman> Lisa: can you fetch all annotations?
[15:46:44] <cnewman> Alexey: yes.
[15:46:52] <Dave Cridland> Yes, we need %* in, erm, annotation entries.
[15:48:22] <cnewman> Phil: pulling them from the attribute name makes sense now. But need them in entries.
[15:48:29] <Dave Cridland> We need to be able to SEARCH all comments, for instance.
[15:48:32] <cnewman> Lisa: Keep wildcards in SEARCH and FETCH?
[15:48:38] <cyrus_daboo> So the proposal is get rid of % * for attributes only, leave them for entries (in FETCH/SEARCH)
[15:48:43] <Dave Cridland> Yes, for annotation entry names.
[15:48:47] <arnt> agreed
[15:49:11] <cnewman> Lisa: can search with % * and values and entries, but not on attribute names?
[15:49:43] <cnewman> Phil: clarify % and * are only in the entry part.
[15:49:56] <cyrus_daboo> What I intend to do for SEARCH syntax is to remove the ability to specify the attribute at all as now you can only search value. So SEARCH will only list entry names to be searched and those can be wildcarded.
[15:49:58] <pguenther> not magic in value (they're substrings not patterns)
[15:50:07] <cnewman> Lisa: remove % and * in SEARCH attributes? Yes In FETCH? No.
[15:50:23] <Dave Cridland> Cyrus: Including a wildcard in a non-existent syntax element being hard. :-)
[15:50:46] <arnt> please also in fetch. if we want, the attribute name '*' can mean 'all attributes'.
[15:51:03] <cyrus_daboo> Yes to FETCH.
[15:51:10] <cnewman> Phil: would this also affect CONDSTORE?
[15:51:17] <Dave Cridland> Arnt: Well, "*.shared" might be useful.
[15:51:19] <cnewman> Lisa: Phil is trying to stir up trouble again.
[15:51:25] <arnt> but fetching attributes '*e*' is really a pain, and quite useless. fetching 'value' and 'all attributes' makes sense. fetching '*e*' not.
[15:51:42] <Dave Cridland> Arnt: Absolutely.
[15:51:44] <cnewman> Lisa: punt to list with language from Cyrus
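The wildcard agreement above (% and * stay in entry names, are dropped from attribute names, and are not magic inside values) can be sketched roughly as follows. This is only an illustration under assumptions: it assumes the wildcards behave like the IMAP LIST wildcards with "/" as the hierarchy separator ("*" matches anything, "%" matches anything except "/"), and the function names and sample entry names are hypothetical, not taken from the draft.

```python
import re


def entry_pattern(spec: str) -> re.Pattern:
    """Translate a wildcarded entry specifier into a compiled regex.

    Assumption: '*' matches any run of characters (including '/'),
    '%' matches any run of characters except '/'.
    """
    parts = []
    for ch in spec:
        if ch == "*":
            parts.append(".*")
        elif ch == "%":
            parts.append("[^/]*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def match_entries(spec: str, entries: list) -> list:
    """Return the entry names matched by a wildcarded specifier."""
    pat = entry_pattern(spec)
    return [e for e in entries if pat.fullmatch(e)]


def value_contains(value: str, needle: str) -> bool:
    """Value search is a plain substring test; '%' and '*' are literal here."""
    return needle in value


# Hypothetical entry names, for illustration only.
ENTRIES = ["/comment", "/altsubject", "/vendor/foo", "/vendor/foo/bar"]

top_level = match_entries("/%", ENTRIES)          # one hierarchy level only
everything = match_entries("/*", ENTRIES)         # all entries
one_below_vendor = match_entries("/vendor/%", ENTRIES)
```

With these sample names, "/%" selects only the single-level entries, "/*" selects everything, and a value such as "50% off" is found by searching for the literal substring "%".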
[15:52:25] <cnewman> next slide: other issues
[15:52:58] <cnewman> remove vendor.* attributes - list consensus
[15:53:12] <cyrus_daboo> I have one last point - a big one
[15:53:16] <cyrus_daboo> one other issue - one driving force behind annotations in the first place was vpim/lemonade wanting per-body part flags. I do not see that listed anymore in lemonade profiles, so do we really need per-body part annotations?
[15:53:18] <cnewman> Anyone want to raise any issues with what's on the slides?
[15:53:40] <arnt> no issues with the slides.
[15:53:58] <Dave Cridland> Cyrus: Yes, they might be handy. I don't think they're a really big problem, are they?
[15:54:00] <arnt> I do hear what cyrus said, and I confess that that issue has been at the back of my mind.
[15:54:01] <cnewman> Lisa: this could teach the lemonade guys to come to the IMAP meetings...
[15:54:12] <cnewman> Lisa: waves at Glenn in the back..
[15:54:13] <Dave Cridland> Glenn is here...
[15:54:14] <arnt> HOWEVER:
[15:54:48] <arnt> per-part flags aren't a mirror of other imap functionality, so doesn't have the problems I (and peter coates) ran into.
[15:54:54] <cnewman> Alexey: if you want to forward a particular body part of a message and want to change body part. I think it's useful.
[15:55:00] <Dave Cridland> Gah... I just thought of a problem.
[15:55:12] <arnt> all that's needed for per-part flags is an IANA registration, even if per-part flags aren't in annotate proper.
[15:55:23] <cnewman> Mark: Has change to ANNOTATE from 'string' to 'astring' been done to ANNOTATEMORE?
[15:55:29] <cnewman> Alexey: yes, that should be kept in sync.
[15:55:30] <cyrus_daboo> I am updating ANNOTATEMORE and will make Mark's change, but that will break existing implementations.
[15:55:33] <Dave Cridland> Big use for body-part level annotations would be marking body parts with flags, like $forwarded - I think Alexey mentioned this. But flags are no longer in annotate...
[15:56:10] <cyrus_daboo> Does lemonade still want to use body-part flags?
[15:56:24] <cnewman> Pete: think lemonade still has pretty significant use for marking body parts with annotations?
[15:56:26] --- NFreed has joined
[15:56:28] <cyrus_daboo> And should it be in one of their profiles?
[15:56:31] <cnewman> Pete: templating idea.
[15:57:25] <cyrus_daboo> The question is whether per-body part annotations should be in the base ANNOTATE document, or whether it can be defined later when lemonade actually wants to use it?
[15:57:26] <cnewman> Pete: issue with sent items, went to keep them and what to do with them.
[15:57:35] <arnt> FYI, I am not opposed to per-part flags, only to the bits that mirror STORE FLAGS etc.
[15:58:28] <cnewman> Pete: if we don't put it in the initial spec, is it going to be a pain in the ass to add later?
[15:58:34] <Dave Cridland> Pete: You can also do that URL storage scenario quite effectively with cunningly constructed content-ids.
[15:59:01] <cnewman> Alexey: issue with ANNOTATEMORE vs. ANNOTATE: much more state to store per message, so more of a deployment consideration.
[15:59:25] <cnewman> Pete: the word "cunningly" is noted and taken very seriously.
[15:59:46] <cyrus_daboo> So I guess I am not hearing consensus on this - discuss on the list?
[15:59:50] <cnewman> Pete: don't have a strong opinion, but think the lemonade people were more or less expecting it.
[16:00:07] <cnewman> Glenn: yes, lemonade was expecting this.
[16:00:42] <resnick> Chris: we should keep it in, but keep it lightweight.
[16:00:57] <cyrus_daboo> OK, I'm done now.
[16:01:06] <cnewman> Pete/Chair: try to leave it in lightweight, but take to lemonade for feedback.
[16:01:24] <cnewman> Lisa: Comparator topic
[16:01:48] <resnick> Back to IMAPABNF
[16:01:49] <cnewman> Lisa: IMAP ABNF first
[16:01:55] <resnick> ;)
[16:02:14] <cnewman> Alexey: Mark Crispin has comments. Barry gave some input. Will be another revision.
[16:02:37] <cnewman> Alexey: Main concern is not to change other drafts which already use syntax, like don't break CATENATE.
[16:02:58] <cnewman> Alexey: Please review document draft-melnikov-imap-ext-abnf-05
[16:03:04] <cnewman> hopefully will be done very soon.
[16:03:23] <cnewman> Lisa: do we have anything else?
[16:03:28] <Dave Cridland> Arnt: Do you have anything to say about collate?
[16:03:29] <cyrus_daboo> A lot of stuff is being hung up on i18n work (e.g. SIEVE drafts) we do need to make progress on this asap.
[16:03:30] <cnewman> Phil: collate is the application protocol collation registry, is an independent submission.
[16:03:44] <arnt> I can say a little about collation.
[16:03:54] <lisa> Go ahead and type Arnt
[16:03:59] <arnt> philip dug up what seemed a small issue, but it ended up being pretty big ;)
[16:04:08] <lisa> if there's anything Philip misses we can fill in from you
[16:04:12] <Dave Cridland> For which I apologise for being contrary.
[16:04:14] <cnewman> Phil: jointly by Arnt (with current token) with past input from Martin and Chris
[16:04:17] <arnt> do collators collate streams of octets or of characters?
[16:04:20] <cnewman> Phil: contention: operating on octets or characters.
[16:04:34] <cnewman> Phil: repercussions on IMAP i18n and on Sieve.
[16:04:44] <lisa> Controlled contrariness can be amazingly healthy in standards discussions :)
[16:04:51] <arnt> so, at the end of it, I wrote text (too late for the cutoff):
[16:04:56] <Dave Cridland> And it does have serious effect on ACAP. Which probably only means me, but I still think it's important.
[16:05:08] <cnewman> Pete: what reason could there be for collation to collate on octets?
[16:05:24] <Dave Cridland> Pete: Because they *do*. ACAP works on octets, not characters.
[16:05:31] <cnewman> Phil: because the original work (ACAP) operated on the octets of the UTF-8 characters
[16:05:34] <arnt> the collation is specified on octets, and as specified, each collator can form characters any way it wants. (usually by decoding utf-8, but...)
[16:05:50] <arnt> any particular implementation can deviate, and e.g. hand unicode characters to most octets.
[16:05:52] <arnt> important:
[16:06:20] <cnewman> Phil: if you try to switch over to characters then i;octet is even more magical than before.
[16:06:39] <cnewman> Pete: i;octet strikes Pete as treat each octet as a character and collate on it as a character.
[16:06:43] <NFreed> I agree with Pete here.
[16:06:51] <arnt> the client should not be able to detect any difference. if the client can detect that the server-collator interface is not using octets, the server is being naughty.
[16:07:33] <arnt> finally: for servers that don't operate on the data as 8-bit anythings, i;octet may not be implementable.
[16:07:35] <cnewman> Phil: it sounds like people have implemented everything as working on octets.
[16:07:56] <Dave Cridland> Arnt: No, you encode character strings as UTF-8, if that's all you have.
[16:08:00] <arnt> I had to look. AFAICT, everyone uses octets on disk and on the network, but some code uses UCS-2 in RAM.
[16:08:08] <cnewman> Phil: on the other list we wanted to stick to octet definition to avoid changing current implementations.
[16:09:02] <arnt> it makes sense to allow servers to pass UCS-2 (or UCS-4) to some collators in RAM, so I made that clear in the text. but that's implementation. the specification is in terms of octets, and each octet can say "I use UTF-8".
[16:09:22] <arnt> I also had to actually define "client" and "server". wasn't easy.
[16:09:24] <cnewman> Pete: Obviously i;octet continues to operate on octets. en;ascii-casemap operates on octets because it's lucky.
[16:09:42] <cnewman> Pete: but if you're going to write a UCA comparator it can't operate on octets.
[16:09:46] <arnt> I'm done speechifying.
[16:10:01] <Dave Cridland> Pete: Why is that unacceptable?
[16:10:32] <cnewman> Phil/Pete: complex discussion which I can't transcribe effectively...
[16:11:06] <cnewman> Phil: Other inputs are fine, but canonicalize to UTF-8 before applying the comparator.
[16:11:25] <cnewman> Mark: there's more to that too. You have to do that as well if the source is UTF-8 because of normalization.
[16:11:27] <Dave Cridland> Pete: If the implementation operates in characters (ie, UCS-2 or something) then it passes its character strings into a character string API, and the comparator *behaves* as if it's encoded to UTF-8 and decoded again as required. Obviously it's not actually going to do that for "i;basic" stuff.
[16:11:54] <arnt> ned: operating on octets isn't a good option, it's the only option left by the protocols that transmit (TCP) byte streams. it's unsound to assume that an IMAP SEARCH command always uses unicode, e.g.
[16:12:28] <cnewman> Mark: the point is that the translation step is not a no-op if the source is UTF-8. You can assume your code to translate say 8859-* will generate properly normalized UTF-8 to begin with, but the step is a lot more complex when the source is UTF-8.
[16:12:38] <cnewman> Alexey: the canonicalization is part of the comparator.
[16:12:48] <arnt> agree with alexey
[16:12:54] <cnewman> Phil: i;octet should be able to distinguish UTF-8 text which are canonically equivalent.
[16:13:11] <NFreed> The ability to tell the difference between a composed form and a decomposed form is actually useful in some cases.
[16:13:14] <cnewman> Phil: if you want to not tell them apart, then use i;basic
[16:13:46] <cnewman> Phil: putting all the canonicalization except the charset conversion in the comparator simplifies.
[16:14:32] <cnewman> Pete: separate the collation itself and the implementation of the collation.
[16:14:39] <arnt> if the charset conversion is outside the collator, the code outside the collator has to know about all charsets used.
[16:15:10] <Dave Cridland> Pete - the implementation can accept character strings. The *definition* says you encode to UTF-8.
[16:15:11] <cnewman> Pete: it would be dumb to do the UTF-8 conversion when the data is UCS-4.
[16:15:52] <arnt> the IMPLEMENTATION can always be efficient, my text says, AS LONG AS a client cannot tell the difference.
[16:15:58] <Dave Cridland> Pete - As long as the semantics of the comparator (or collator) remains *as if* the UTF-8 encoding step was done, then it doesn't matter whether it was really done or not.
[16:16:20] <Dave Cridland> (Or what Arnt sez)
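The octets-versus-characters points above can be sketched in a few lines. This is a rough illustration, not text from any draft: the function names are hypothetical, and NFC is picked only as an example of a normalizing step (the log does not say which normalization a comparator would apply). It shows an octet comparison telling composed and decomposed forms apart, a normalizing comparison treating them as equal, and a character-based implementation giving the same answer "as if" the strings had been encoded to UTF-8.

```python
import unicodedata


def octet_compare(a: bytes, b: bytes) -> int:
    """Order two octet strings bytewise: -1, 0 or 1."""
    return (a > b) - (a < b)


def normalizing_compare(a: bytes, b: bytes) -> int:
    """Decode UTF-8, normalize (NFC here, as an assumption), then compare octets."""
    na = unicodedata.normalize("NFC", a.decode("utf-8")).encode("utf-8")
    nb = unicodedata.normalize("NFC", b.decode("utf-8")).encode("utf-8")
    return octet_compare(na, nb)


def octet_compare_chars(a: str, b: str) -> int:
    """Compare in-memory character strings without encoding them.

    Python orders str by code point, and UTF-8 preserves code point
    order, so the result matches octet_compare on the UTF-8 encodings.
    """
    return (a > b) - (a < b)


composed = "\u00e9".encode("utf-8")      # e-acute as one code point
decomposed = "e\u0301".encode("utf-8")   # 'e' plus combining acute accent

octet_result = octet_compare(composed, decomposed)             # nonzero: distinguishable
normalized_result = normalizing_compare(composed, decomposed)  # 0: canonically equivalent

# "As if" check: the character shortcut agrees with the octet definition.
pairs = [("a", "b"), ("\uff5e", "\U00010000"), ("z", "\u00e9")]
as_if_holds = all(
    octet_compare_chars(x, y) == octet_compare(x.encode("utf-8"), y.encode("utf-8"))
    for x, y in pairs
)
```

The shortcut works here because code point order and UTF-8 octet order coincide. An implementation that holds UTF-16 code units in RAM would need more care, since code unit order differs from code point order for characters outside the BMP (U+FF5E versus U+10000 is one such pair).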
[16:17:01] <Dave Cridland> (And while we're at it, am I allowed to go into my customary rant about the renaming of i;ascii-casemap?)
[16:17:39] <arnt> please don't, it's hard to tell what's being said in the room from what's being said on jabber.
[16:18:06] <resnick> What good is it for the comparator to be on octets instead of characters?
[16:18:56] <Dave Cridland> Exactly!
[16:19:04] <arnt> it keeps the parsing of octets into characters within the domain of the collator. the general framework need not concern itself with character sets.
[16:19:20] <NFreed> Am I the only one who gets unlabelled crap in header fields?
[16:19:24] <Dave Cridland> Pete said, "You mean you're doing this so you can feed CRAP into it?"
[16:19:40] <Dave Cridland> And I said "Exactly!", because that's a damn handy thing to do sometimes.
[16:19:58] <arnt> if you want to sort on gb18030 (I can't imagine why) you just register a collator for it. you don't have to extend the general framework.
[16:20:37] <NFreed> Yes
[16:20:59] <arnt> more to the point: if you sort on characters, the bit outside the collator has to behave differently for some collators than for others. I think that's architecturally unclean.
[16:21:04] <NFreed> Um. no.
[16:21:17] <NFreed> Assuming a particular charset doesn't work in general.
[16:21:35] <arnt> violent agreement from me.
[16:21:42] <NFreed> When the sample size is too small to figure out the charset.
[16:21:54] <NFreed> Like when it is any iso-8859-x, now determine x.
[16:21:58] <NFreed> OK.
[16:22:10] <arnt> uh, wait, not sure I understand what you mean.
[16:22:32] <NFreed> I guess. It would be a weird charset, one where every byte
[16:22:43] <Dave Cridland> Pete: No, you're not, you're assuming that parts of it might look enough like that charset.
[16:22:49] <NFreed> mapped to a corresponding character with the same value.
[16:23:25] <arnt> the collator should not be heuristic.
[16:23:35] --- Jim Martin has joined
[16:23:56] <Dave Cridland> Arnt: Pete's discussing inserting crap into comparators.
[16:24:11] <NFreed> OK... That assumes we have a set of knobs for adding labels to unlabelled crap.
[16:24:14] <arnt> the collator's task is to collate, not guess character sets. if you tell it "sort those things" it should sort them, and treat the error cases as error cases.
[16:24:50] <NFreed> We have a set of knobs like this, but they're done as part of message delivery, not as part of IMAP SORT.
[16:25:01] <arnt> (ned: we too)
[16:25:17] <pguenther> chris: wrote spec as concrete description because he's a concrete kind of guy
[16:25:28] <pguenther> "have chunks of octets, need to do something"
[16:26:08] --- Barry Leiba has joined
[16:26:22] <NFreed> I think I like the proposal Chris is making.
[16:26:22] <arnt> is anyone in the room still not on jabber? ;)
[16:27:00] <pguenther> chris: when you register a comparator, you should be able to state both what it does on octet data and what it does on characters
[16:27:30] <Dave Cridland> I disagree with Chris, to be honest - I think it's trivial to map characters to octets, but it's sometimes impossible to map octets to characters. So we use that, and the simple fact that all comparators operate on UTF-8 safely, right now.
[16:27:35] <arnt> four sort operations? char/char, char/octet, octet/char, octet/octet?
[16:27:43] <arnt> four searches?
[16:28:16] <NFreed> This can all get very weird when you have something like: To: random-8bit-crap <a@b>, =?utf-8?q?legit-encoded-utf-8?= <c@d>
[16:28:38] <Dave Cridland> Ned: I entirely agree.
[16:28:54] <NFreed> Yes. Unicode certainly maintains the separation.
[16:29:14] <Dave Cridland> Ned: But it's simple if you input the random-8bit-crap and the legit-encoded-utf-8 together.
[16:29:45] <lisa> I'll take over jabber scribing now
[16:29:59] <NFreed> BTW, "normalizing" simplified to traditional or vice versa is HUGELY nontrivial. I've heard you need something like 400,000 rules and it still stinks.
[16:29:59] <arnt> lisa: thanks. chris seems too active right now.
[16:30:10] <lisa> Ted: The problem with unicode isn't that they collapsed Chinese code into single characters, it's that they used single code-points for same characters regardless of language of origin.
[16:30:16] <NFreed> Dave: I agree. It's simple, but weird.
[16:30:33] <lisa> Ted: You may have a character which has two different code points depending on which language.
[16:30:44] <arnt> please let's not discuss unification.
[16:30:51] <lisa> Ted: The problem you run into, is if you have a mapping table for semantic equivalence, rather than based on glyph, this is very complicated.
[16:30:58] <arnt> for collation, we have three pieces of data, and they're all non-negotiable.
[16:31:17] <lisa> Ted: You need to be very clear in Comparator, are you comparing glyph to glyph, or character meaning to character meaning.
[16:31:28] <lisa> Pete: Are there collations we would desire, where converting into UTF8 would be lossy?
[16:31:35] <lisa> Mark: I'm not aware of any such conversions.
[16:31:55] <lisa> Pete: Because unicode does contain the unified and simplified forms of dual forms, you would not get a loss for collation purposes going to UTF8.
[16:31:57] <NFreed> The goal in Unicode is to be able to accept translation from any charset without loss.
[16:32:10] <lisa> Chris: The stack was unicode centric, based on IETF policy.
[16:32:11] <NFreed> But then there's Klingon...
[16:32:46] <lisa> Mark: There is a charset (not a character set) called ISO-2022-JP-2, which allows you to shift into any other character set; if you do that for any characters which aren't unified, you lose the language context.
[16:32:47] <arnt> Q: is it acceptable if the collator is tied HARD to "unicode and i;octet"?
[16:32:52] <Barry Leiba> What does the Klingon charset look like?
[16:33:01] <lisa> Mark: Using character set to determine language was seen to be a terrible idea and this idea went nowhere.
[16:33:12] <NFreed> Barry: It's ugly. What else would you expect?
[16:33:20] <lisa> Pete: You might want to collate Japanese in a different place from Chinese, and you want a language tag for that purpose.
[16:33:42] <NFreed> Seriously, there is a Klingon charset, there was a proposal to add it to Unicode, it was rejected.
[16:33:43] <lisa> Mark: These are literally trivial font differences.
[16:33:59] <NFreed> So there is an example of a charset that doesn't map to Unicode. Dunno how serious it is, but...
[16:34:02] <lisa> Ted: Luckily Ned is not able to jump up and discuss glyph differences because it's a large problem :)
[16:34:12] <lisa> Ted: Is designing around the lossy bug important?
[16:34:39] <lisa> Ted: Until somebody tells us it's important, the conversion to UTF8 is not lossy in the known cases, we do have a Unicode consortium liaison and we could check... but is it of practical concern?
[16:34:50] <Barry Leiba> Seriously, I suppose that if the Unicode people said "no", it's reasonable for us to say "no" also.
[16:35:01] <NFreed> Personally, I direct all angry Klingons to the Unicode Technical Committee...
[16:35:14] <lisa> Pete: The unicode comparator that we're pulling this out of is on characters, but for programmatic purposes. Morally this is equivalent to based on octets.
[16:35:31] <lisa> Pete: Is that a correct read?
[16:35:34] <NFreed> Pete: Sounds right.
[16:35:34] <lisa> Philip: thinks so
[16:35:35] <arnt> ned: one of unicode's stated goals is to encompass 20th century printed literature. apparently there isn't much in klingon.
[16:35:52] <lisa> Chris: The W3C is very interested in the spec and they operate on characters, not octets, at their level.
[16:36:02] <lisa> Chris: We, at the protocol layer, operate on octets.
[16:36:04] <Dave Cridland> Arnt: Although I seem to recall that they claim Shakespeare is better in the original Klingon.
[16:36:28] <arnt> if we operate on character, i;octet is damned hard to specify, and i;octet is effectively running code with which we must be compatible.
[16:36:31] <cyrus_daboo> Klingon reference: http://en.wikipedia.org/wiki/Klingon_language
[16:37:02] <lisa> Chris: I know if my input is octets, I don't know if it's characters, in the general case of coding. That's why it was written that way.
[16:37:27] <lisa> Chris: This is why the way it's written is more precise. However we need to nod to W3C to let them know they can operate on characters and it won't cause much problem.
[16:37:29] <Dave Cridland> (Of course, I can't type Shakespeare and then compare it properly using en;ascii-casemap, because Shakespeare, like many British English writers, uses diacritical marks unavailable in ASCII.)
[16:38:02] <arnt> the only real problem W3C will run into is: i;octet becomes unimplementable.
[16:38:30] <lisa> Pete: There's got to be a way to explain to people in this doc that the net effect is of collating characters, it's just that the collation documentation specifies in terms of encoding and sorts among octets.
[16:38:48] <lisa> Pete and Philip: When given characters, you're effectively sorting characters.
[16:38:54] <Dave Cridland> Arnt: Curiously, no - "i;octet" is easy in terms of characters, because UTF-8 collates identically with UCS-4, IIRC.
[16:39:01] <lisa> Philip: Sorting characters was a goal of UTF8.
[16:39:16] <arnt> dave: I'll send you mail with cases where that's untrue ;)
[16:39:34] <cnewman> If the input is characters, to apply i;octet, convert them to UTF-8 and apply i;octet. This is also the same as applying a simple ordering to uncanonicalized UCS-4 (that's a nice characteristic of UTF-8).
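[Editorial sketch, not part of the log] The property Chris describes can be checked with a minimal Python example, assuming the input is well-formed Unicode text (no lone surrogates). The function names `octet_sort` and `codepoint_sort` are hypothetical and only model the idea of i;octet applied to UTF-8; they are not taken from any draft or library. It does not cover Arnt's binary-input case, where the octets are not characters at all.

```python
def octet_sort(strings):
    """Sort by raw UTF-8 octets, roughly what i;octet sees after encoding."""
    return sorted(strings, key=lambda s: s.encode("utf-8"))


def codepoint_sort(strings):
    """Sort by Unicode code point (Python compares str values this way)."""
    return sorted(strings)


# Mix of 1-, 2-, 3- and 4-byte UTF-8 sequences, including U+FFFF next to a
# supplementary-plane character, where UTF-16 code unit order would differ.
samples = ["z", "\u00e9", "a", "\u20ac", "\U0001F600", "\uffff", "Z"]

same_order = octet_sort(samples) == codepoint_sort(samples)
print(same_order)
```

For well-formed text the two orderings agree, which is why converting characters to UTF-8 and then applying an octet comparison behaves like a plain code point comparison.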
[16:39:35] <lisa> Philip: If you have stuff which isn't quite character-ish, it doesn't fall over and die.
[16:40:05] <arnt> chris: 'a SEARCH BODY GIF89A'.
[16:40:17] <lisa> Philip: The current draft is a little inconsistent, it talks of using i;octet on characters... I think we can massage it forward, have a clear paragraph... this is all on octets, it effectively operates on characters, we specify it this way to be robust.
[16:40:43] <Dave Cridland> Arnt: Ah, yes, but that's in an octet setting.
[16:40:55] <cnewman> arnt: that's for the case where the input is binary. We care about that case, but the W3C doesn't.
[16:41:25] <lisa> Looks like we're closing down.
[16:41:30] <lisa> Pete gets it. Only mildly nauseated.
[16:41:35] <Dave Cridland> Although we've not yet mentioned SIEVE's matching.
[16:41:50] <Dave Cridland> Which is out of scope for imapext, but while we're talking about comparators...
[16:41:58] <cnewman> Phil: Sieve's matching is the other meeting...
[16:41:59] <lisa> SIEVE is the other meeting.
[16:42:00] --- Jim Martin has left: Disconnected
[16:42:14] --- Barry Leiba has left
[16:42:18] <NFreed> TTFN
[16:42:25] --- NFreed has left
[16:42:27] --- cnewman has left
[16:42:29] <arnt> lisa:
[16:42:55] <arnt> tell them that if I'm an editor again, I'll come, infant or none. if necessary I'll take a red-eye flight, attend and go right back.
[16:43:40] --- cyrus_daboo has left
[16:43:46] <Dave Cridland> Arnt: Yes, it's amazing the difference between attending and plain jabber. The audio is pretty useful, but amazingly quiet.
[16:44:13] <arnt> now I'll just have to get this last called before next IETF, because I damned well don't want to apply for a US visa.
[16:44:23] <Dave Cridland> Heh.
[16:44:27] <arnt> the application procedure is painful.
[16:44:34] <Dave Cridland> More so for me.
[16:44:43] <arnt> you think?
[16:45:07] <Dave Cridland> My mum's childhood friend was Arafat's sister, and she was born in Palestine. Go figure.
[16:45:08] --- resnick has left
[16:45:33] <arnt> oh.
[16:45:38] <lisa> Yikes
[16:45:45] <lisa> that must indeed be a pain in the ass.
[16:46:14] <Dave Cridland> No, it's okay. For instance, the current Royal family of Jordan have probably mostly forgotten her.
[16:46:23] <arnt> well. good night.
[16:46:31] <lisa> night
[16:46:31] <Dave Cridland> :-)
[16:46:33] <Dave Cridland> Night night.
[16:46:34] <arnt> (unless there's more?)
[16:46:43] <Dave Cridland> Not really. How's the little one?
[16:47:12] <arnt> very well, thank you. but straining her parents.
[16:47:27] <Dave Cridland> My grandmother says it gets easier when they hit 50.
[16:47:29] <arnt> she's very playful and happy all night at the moment.
[16:47:37] <arnt> last week you said 40.
[16:47:42] <Dave Cridland> It got worse.
[16:48:03] <arnt> oh well. see you the IETF after next, then.
[16:48:09] <arnt> now I want to catch what sleep I can.
[16:48:24] <Dave Cridland> Yeah. See you tomorrow for Lemonade or Sieve?
[16:50:55] --- Dave Cridland has left
[16:51:15] --- kmurchison has left
[16:51:37] --- alexeymelnikov has left
[16:51:39] --- arnt has left
[16:53:15] --- arnt has joined
[16:54:06] --- pguenther has left
[16:54:11] <arnt> not sure I'll be able to attend.
[16:56:06] --- tonyhansen has left
[16:56:16] --- arnt has left
[17:02:23] --- lisa has left
[17:12:02] --- Randy has left
[19:10:43] --- Glenn Parsons has left: Disconnected
[19:46:34] --- Glenn Parsons has joined
[19:46:43] --- Glenn Parsons has left
[21:42:57] --- ams has left