[15:28:09] yone joins the room
[15:39:06] David Cooper joins the room
[15:45:40] hildjj joins the room
[15:45:57] Starting.
[15:45:59] stpeter joins the room
[15:46:06] WG deliverables
[15:46:23] http://www.ietf.org/proceedings/80/slides/precis-1.pdf
[15:46:36] hildjj: thanks for the links
[15:46:59] stpeter is now known as Peter Saint-André
[15:47:02] yoiwa joins the room
[15:47:44] resnick joins the room
[15:48:02] http://tools.ietf.org/html/draft-hoffman-rfc3536bis-01
[15:48:07] marc.blanchet.qc joins the room
[15:48:19] http://tools.ietf.org/html/draft-saintandre-xmpp-i18n-03
[15:48:27] chair slides presented
[15:48:31] martin.thomson joins the room
[15:48:34] http://tools.ietf.org/html/draft-blanchet-precis-framework-00
[15:48:34] lef joins the room
[15:48:42] Andrew Sullivan: on problem statement draft.
[15:48:44] no slides
[15:48:55] http://tools.ietf.org/html/draft-ietf-precis-problem-statement-02
[15:49:15] (andrew talking about the last one now)
[15:49:20] Klensin joins the room
[15:51:02] now peter presenting.
[15:51:22] http://www.ietf.org/proceedings/80/slides/precis-0.pdf
[15:51:52] Slide 1
[15:52:48] slide 2
[15:53:23] josephyee joins the room
[15:53:33] Slide 3
[15:54:00] tlyu joins the room
[15:54:01] The recap doesn't seem to be necessary - lots of people think they have a good handle on Unicode
[15:54:13] Andrew joins the room
[15:54:19] but we're going to do a fast version of it - seems like the chairs wanted it.
[15:54:27] Recap (2)
[15:54:57] slide 5
[15:55:30] slide 6
[15:56:02] normalization (slide 7)
[15:56:18] slide 8
[15:56:27] Florian Zeitz joins the room
[15:56:52] slide 9
[15:57:26] slide 10
[15:57:30] slide 11
[15:58:36] slide 12 - string classes
[15:59:49] nameything[:wordything]@domaineything/stringything
[16:00:24] slide 13
[16:01:34] slide 14 - nameythings
[16:02:37] sm joins the room
[16:03:18] that was joe hildebrand, this is pete resnick
[16:03:49] I know off the top of my head: they are used in typography.
What is stored is the question that the WG needs to address.
[16:04:06] Mic: they are "used" all the time.
[16:04:15] john: got it
[16:04:50] Mic: Since everything that is canonically equivalent is also compatibility equivalent this rule would also forbid everything that has a canonical equivalent. Is that intentional?
[16:04:55] alexey.melnikov joins the room
[16:05:05] ack
[16:07:11] hsalgado joins the room
[16:07:53] Florian: just a matter of writing the rule correctly. The way it seems to be written on the slide isn't the correct wording, but getting that right isn't hard.
[16:08:16] @klensin/florian: +1. Just written poorly.
[16:09:14] ftr: joe wants to avoid dealing with the confusables
[16:09:17] slide 15
[16:09:34] I think what he meant is forbid "Compatibility Decomposable Characters". but as long as we all know what it is supposed to mean...
[16:09:44] Florian Zeitz: yes
[16:10:13] Jacky Yao (Health Yao) joins the room
[16:10:51] Mic: for the width issue, "side of bus" is not really a problem although users may need to get used to conventions. The very similar "take business card and put it through a scanner" (or, as Tony says, QR codes) problem _is_ an issue. The other compatibility equivalent problem is that some compatibility equivalents are more equivalent than others: Full-width/half-width are pretty easy, mathematical fonts are slightly less easy, some ideographic equivalents may not be.
[16:13:26] Mic: The thing that makes PRECIS more difficult in this area than IDNA is that sensible people might want to be much more restrictive about identifiers but a lot less restrictive about passphrases, where entropy and confusability may be advantages.
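The wording point raised at the mic (a rule phrased in terms of compatibility equivalence also catches everything with a canonical equivalent) can be illustrated with a minimal sketch. It assumes Python's standard unicodedata module; the helper names has_compat_decomposition and changed_by_nfkd are hypothetical and only for illustration. It looks at each code point's own decomposition mapping and is not a proposed PRECIS rule.

```python
import unicodedata

def has_compat_decomposition(ch: str) -> bool:
    """True if ch itself carries a compatibility (tagged) decomposition mapping."""
    # unicodedata.decomposition() returns e.g. "<wide> 0041" for a compatibility
    # mapping and "0065 0301" (no tag) for a canonical one.
    return unicodedata.decomposition(ch).startswith("<")

def changed_by_nfkd(ch: str) -> bool:
    """True if NFKD alters ch; this is true for canonical decompositions too."""
    return unicodedata.normalize("NFKD", ch) != ch

samples = {
    "fullwidth A": "\uff21",  # compatibility mapping to "A"
    "e-acute": "\u00e9",      # canonical mapping to e + combining acute
    "plain A": "A",           # no decomposition at all
}
for label, ch in samples.items():
    print(label, has_compat_decomposition(ch), changed_by_nfkd(ch))
```

A rule written as "forbid anything NFKD changes" would reject the e-acute as well as the fullwidth A, whereas a rule written as "forbid characters with a compatibility decomposition" rejects only the latter.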
[16:13:42] ack
[16:14:24] But "Anne O'Reilly" has always been a problem, as are lots of folks with spaces in their family names
[16:14:24] tony.l.hansen joins the room
[16:15:36] @Joe: that is exactly why a string that only I can figure out how to type makes a really good passphrase :-)
[16:16:15] Klensin: your keyboard might be different to mine :)
[16:16:30] Klensin: +1
[16:17:07] @martin: Exactly. And, again, that can be an advantage for passphrases as long as you don't try to borrow my computer and keyboard.
[16:17:26] that's an argument for me to remap my password
[16:17:31] s/password/keyboard/
[16:17:50] you should have seen the chairs training session: finding the @ symbol on a british keyboard was apparently a challenge
[16:18:38] we're on slide 16 incidentally
[16:19:55] Mic: There is a long version of this answer, but "case folding" in the general case is not meaningful without locale. See if there are any Turkish-speakers/writers in the room.
[16:20:29] andrew has your point
[16:22:16] there's a table of eigenthings coming in the deck.
[16:23:20] excuse me: eigenthingies.
[16:24:10] jhutz@jis.mit.edu/owl joins the room
[16:24:34] Mic: Server knowing butkis about locale is _exactly_ the point.
[16:24:59] woolf joins the room
[16:25:20] hildjj leaves the room: Replaced by new connection.
[16:25:30] hildjj joins the room
[16:25:54] Pete, there really isn't much difference between dot in IDNA and apostrophe in XML
[16:26:06] mic?
[16:26:12] or are you just agreeing?
[16:26:25] @jck: ack +1
[16:26:33] your option - it is a bit more than agreement, but I don't know if it is worth mic time.
[16:27:13] @David: and that is the only fully-general solution... but lots of folks don't like it.
[16:30:55] I think that's something we're going to disagree on.
[16:32:22] Mic: would people _please_ distinguish between "lower casing" and "case folding".
They are different and the Unicode spec (not just a few nasty people and examples) _strongly_ advises against use of the latter for mapping (as distinct from what, in our vocabulary, is server-side matching/comparison)
[16:33:58] I'm not sure about anyone else. I'm talking about _comparison_, which means case folding
[16:36:26] Mic: Jeff: but, any time someone talks about doing something in the input method and then transmitting, they are mapping, not comparing. Pure comparison requires that one send whatever one has to the server and the server figures out what matches. And, yes, Joe is right.
[16:36:27] I will concede that filesystems are an extreme case
[16:38:00] I think there may be enough CPU in the world, but I don't want to buy the either. On the other hand, server-side matching is exactly what we do in the DNS (and several other places)
[16:38:16] woolf leaves the room
[16:38:50] @Klensin: well, yes, but we have ruled out with IDNA2008 a lot of the hard cases.
[16:38:52] slide 17
[16:38:55] resnick comments that he objected to 5893 from the get-go, so everyone looking at me about bidi is probably a lost cause.
[16:39:07] Mic: as long as you understand that "left to right or right to left but not both" excludes the use of digits in RtoL strings.
[16:39:38] John: that's been discussed in XMPP too
[16:40:02] This is a really deep rat hole and my main point is that simple rules almost always either exclude something someone thinks is reasonable or fail
[16:40:40] Aren't there some rules for bidi-insensitive codepoints, like there are for case-insensitive letters in some languages?
[16:40:57] Heh. Very true
[16:41:10] Good definition of "stringything".. I like it. As long as one never even thinks about matching or collation.
[16:41:55] Klensin: in XMPP, they're never permuted or entered at two different places, so equality comparison isn't hard.
[16:42:22] @jeff: When you say "bidi insensitive codepoints", do you mean "direction neutral"?
[16:42:45] Yes, that's the phrase I was looking for
[16:43:09] joe: I don't think that this was the comparison that is in question
[16:43:21] And the only problem with direction neutral beasties is that they *display* differently in RtoL vs. LtoR environments.
[16:43:36] (With mixed other chars, that is)
[16:43:37] Jeff: recommendation: re bidi, go get a beer, consume it, go read either the Unicode bidi spec or the IDNA2008 equivalent, have another beer. Then try to convince yourself that you understood what you read. If you have the reaction I'd predict, you will end up with European digits and nothing else.
[16:43:38] oh, and stringythings are never collated.
[16:44:04] slide 18
[16:44:11] Klensin: i'll try that *again*. :)
[16:45:02] Yeah, so if you want things you do comparison or collation on, you need a variant of nameything. Whereas if you want something more freeform, you can use a variation of stringything. I think we end up wanting both of those to take "options" like whether spaces are allowed, or whether some form of case-mapping is expected.
[16:45:13] Sorry folks, but that "bucket" model works if and only if the same keyboard, OS, and IME are always used. Otherwise, sooner or later, you will try to type the thing you have always typed before and find that you are using a keyboard/IME that runs NFD on the input stream.
[16:45:35] and you find yourself locked out
[16:45:42] jhutz: yes
[16:45:58] @Klensin: not true
[16:46:14] mic anyone?
[16:46:18] I'm really worried about wordythings; we need to have things that can be used for "passwords" that are converted fairly reliably by clients into a canonical form that can be used as input to cryptographic functions.
[16:46:20] This was one of Dave's central points: most of the time, what people _think_ is a bucket1 is actually bucket2
[16:46:39] martin, what do you want mic'd?
[16:46:42] i.e.
you don't compare directly, you compare after a well-known convention
[16:46:53] woolf joins the room
[16:46:53] which would include (e.g.) NFD/NFC
[16:46:55] The convention is put "mic:" (in whatever case you like ;) ) in front of anything you want said at the mic.
[16:47:41] I'm just looking for guidance: the 'mic' prefix was dropped at some point
[16:48:08] I use my powers of good to reimpose the convention.
[16:48:35] I'd be more likely to use my powers for awesome
[16:49:37] (I'm deliberately not using "mic" when I don't know if the comment is important enough --or different enough from what has already been said-- to justify mic time. It is ok if someone decides to read them, but ok if not too) If you go to "bucket", you either have to be able to control the OS at what we used to call the TTY input level, or you can't match. Andrew: right... a different version of my comment is that bucket2 doesn't compare. Otherwise the server has to at least normalize before comparison and then David and I start pricing CPUs.
[16:50:48] @Klensin: yes, exactly true. I wasn't saying it didn't suck. I'm just trying to put the tiny shards in the right order on the floor before we start getting out the glue.
[16:53:18] And, if the client follows the 3.2-based spec in its operations/hashes/comparisons, and the server follows 6.0, the user is going to either fail to get into his account, or is generally going to end up immersed in sticky brown stuff.
[16:53:52] slide 20
[16:53:53] we've had that bug when someone used a newer codepoint that was supposed to be prohibited since it was in the unused space in 3.2
[16:54:02] woolf leaves the room
[16:54:22] some of the unicode libs include *both* 3.2 and the latest version for just this case.
[16:55:01] Mic: General rule: client-side mappings are going to work iff all clients are using the same version of Unicode and code points that are unassigned in that version are utterly prohibited.
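The version dependence of "unassigned" can be sketched as follows. This assumes Python, whose standard unicodedata module happens to ship a frozen Unicode 3.2 database (unicodedata.ucd_3_2_0) alongside the current one; the helper name unassigned_in and the sample string are hypothetical. It is an illustration of the mismatch described above, not any protocol's actual check.

```python
import unicodedata

# Frozen Unicode 3.2 database that Python ships next to the current one.
old = unicodedata.ucd_3_2_0

def unassigned_in(db, s: str) -> list[str]:
    """Return the code points of s that db reports as unassigned (category Cn)."""
    return [f"U+{ord(c):04X}" for c in s if db.category(c) == "Cn"]

# U+1E9E LATIN CAPITAL LETTER SHARP S was assigned after Unicode 3.2.
s = "stra\u1e9ee"

# The same string passes under the current database and fails under 3.2.
print(unicodedata.unidata_version, unassigned_in(unicodedata, s))
print(old.unidata_version, unassigned_in(old, s))
```

A client on the newer database would accept and map this string, while a server on 3.2 would see a prohibited unassigned code point, which is the client/server mismatch discussed above.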
[16:55:57] that can be stated a little more exactly as a prohibition on use of code points that are unassigned in a given version. Without that, even an NFD rule will fail
[16:56:10] nod.
[16:56:16] i think the room is getting it.
[16:56:38] "unassigned" is a version-dependent assertion.
[16:56:54] Dave Thaler joins the room
[16:57:33] Klensin: absolutely agree.
[16:58:06] you can't predict when someone will take an identifier from one protocol and try to use it in a different protocol...
[16:58:28] Note too that something predicated on "Characters requiring compatibility decomposition" is also a version-dependent assertion.
[16:59:25] Klensin: yes. how often do those change in practice, though?
[16:59:44] every six months or a year lately.
[16:59:49] sihg
[16:59:51] sigh
[17:00:10] s/sihg/bleetch/
[17:03:40] @hildjj: The typical Unicode answer, in fairness, is that most of the changes at this point apply to really obscure and/or historical/archaic scripts. The Katy O'Kelly example is unlikely to arise as a version difference unless someone discovers a living, first-language, speaker of, eg., proto-Hittite.
[17:04:15] woolf joins the room
[17:06:10] Mic: On the other hand, what we tell Katy in email is that she can perfectly have "Katy O'Kelley"@example.com as an email address... but that there are enough broken and email-insensitive systems out there that she should really expect it to work for everyone who might want to send her mail.
[17:06:34] got it
[17:06:46] did you mean NOT work
[17:06:52] ɹəlɐɥʇ əʌɐp joins the room
[17:07:01] Adium's syntax highlighting only caught the part starting at "O'Kelley"
[17:07:03] yes, sorry. she should not expect it to work
[17:07:39] klensin: we had a *bunch* of wall street customers that have ' in their names.
[17:07:54] email addresses, even
[17:07:54] Yes, Pete, that got it. And the rule in 821 is different from the rule in 822, too.
[17:08:07] Dave Thaler leaves the room
[17:08:08] and we had to escape them in a hurry to make a sale.
[17:08:13] Which means that Katy may get it delivered but she may not be able to read it.
[17:09:30] 1123/2821/5321 say "if you try that, it may not work, but it is perfectly ok to try it" :-)
[17:10:01] heh
[17:10:04] nice - and it might work for a while then inexplicably fail
[17:10:22] s/inexplicably/explicably/
[17:10:45] Florian Zeitz leaves the room: Disconnected: Replaced by new connection
[17:10:47] Florian Zeitz joins the room
[17:11:17] Florian Zeitz leaves the room: Disconnected: connection closed
[17:11:27] Florian Zeitz joins the room
[17:11:49] Note that this happens all over the place. john+bogus@example.com is absolutely valid under everything starting with 821 and going through to 5322, but don't try using it in a mailto that passes through a web client. Sometimes it will work. Sometimes it will turn into something really creative.
[17:11:53] "xenonomicon"
[17:13:22] fullwidth vs. halfwidth is shorthand for "are there any codepoints with compatibility mappings that we can't prohibit?"
[17:14:13] i was expecting more push-back on the NFD choice.
[17:14:15] @Joe: Unfortunately, the answer to the question, if you get into trying to manage personal names, is "yes".
[17:14:25] yeah. unfortunate.
[17:15:21] Because NFD(NFC(string)) == NFD(NFC(NFD(string))), NFD may not be the right choice, but it is a workable one.
[17:15:56] Put a "K" in there, or case-folding, and bad things happen
[17:16:06] K loses data. i hates it.
[17:16:38] can I groan instead of hum?
[17:16:45] yes. which way?
[17:16:45] humming on this being the right direction : positive
[17:16:46] Yes; add a 'K' or case folding, and it becomes unstable.
[17:16:51] Mic: groan. loudly
[17:17:08] Oh, actually, I think the compat forms are stable alone, just not when combined with case folding.
[17:18:11] They would be stable if Unicode stopped fine-tuning various rules with different versions.
There are edge cases despite the stability rules (that are then defined to not exist). Maybe one doesn't care
[17:19:13] hildjj leaves the room: Disconnected.
[17:19:43] <ɹəlɐɥʇ əʌɐp> that's exactly what I did as a reviewer of one of the RFCs with a ticket
[17:20:01] <ɹəlɐɥʇ əʌɐp> and yes, it worked well.
[17:20:12] <ɹəlɐɥʇ əʌɐp> (pete's suggestion)
[17:21:10] No, I mean they're not stable in that given Q(x)=NFKC(toCaseFold(NFD(x))) there are strings for which Q(Q(x)) != Q(x). See the paragraph on stability on page 171 of the current spec
[17:21:29] hildjj joins the room
[17:22:27] Mic: IMO, the prerequisite to any agreement-determining hum in this area is asking everyone in the room who is sure that they understood the details of the presentation and discussion to put up their hands. And then take a picture and put it in the minutes along with the hum conclusion. And I'm serious -- we've done it in EAI and the record may be important.
[17:23:36] You're talking about the "is this the right direction" hum?
[17:23:43] Yep
[17:24:23] hildjj leaves the room: Disconnected.
[17:24:31] Jacky Yao (Health Yao) leaves the room
[17:24:33] all done
[17:24:36] lef leaves the room
[17:24:40] Andrew leaves the room
[17:24:45] resnick leaves the room
[17:24:47] martin.thomson leaves the room
[17:24:54] yoiwa leaves the room
[17:24:57] Florian Zeitz leaves the room: offline
[17:25:22] sm leaves the room
[17:25:26] josephyee leaves the room
[17:25:58] tlyu leaves the room
[17:27:49] hildjj joins the room
[17:28:37] hildjj leaves the room: Disconnected.
[17:28:56] marc.blanchet.qc leaves the room
[17:31:27] hildjj joins the room
[17:31:38] tony.l.hansen leaves the room
[17:32:04] hildjj leaves the room: Disconnected.
[17:32:51] David Cooper leaves the room
[17:34:07] hsalgado leaves the room
[17:34:34] woolf leaves the room
[17:34:37] alexey.melnikov leaves the room
[17:34:56] Klensin leaves the room
[17:36:29] hildjj joins the room
[17:36:41] hildjj leaves the room
[17:39:08] martin.thomson joins the room
[17:41:48] martin.thomson leaves the room
[17:43:00] marc.blanchet.qc joins the room
[17:43:11] marc.blanchet.qc leaves the room
[17:44:07] ɹəlɐɥʇ əʌɐp leaves the room
[17:44:54] Peter Saint-André leaves the room: Disconnected: connection closed
[17:49:06] yone leaves the room: The computer is going to sleep
[17:49:10] Jacky Yao (Health Yao) joins the room
[17:53:35] yone joins the room
[17:53:46] yone leaves the room
[17:56:15] lef.mutualauth joins the room
[18:03:55] Jacky Yao (Health Yao) leaves the room
[18:06:54] martin.thomson joins the room
[18:24:37] martin.thomson leaves the room
[18:32:11] lef.mutualauth leaves the room
[20:23:50] Peter Saint-André joins the room
[20:46:34] Jacky Yao (Health Yao) joins the room
[21:15:35] Peter Saint-André leaves the room: Disconnected: connection closed
[21:18:30] Jacky Yao (Health Yao) leaves the room
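Three claims from the session can be checked with a short sketch: that lower casing and case folding differ, that the canonical forms round-trip, and that Q(x)=NFKC(toCaseFold(NFD(x))) is not idempotent. It assumes Python's str.casefold() as a stand-in for toCaseFold and the standard unicodedata module for normalization; the helper names nfc, nfd and q and the sample strings are chosen here for illustration and do not come from the log.

```python
import unicodedata

def nfc(s: str) -> str:
    return unicodedata.normalize("NFC", s)

def nfd(s: str) -> str:
    return unicodedata.normalize("NFD", s)

def q(s: str) -> str:
    """Q(x) = NFKC(toCaseFold(NFD(x))), with str.casefold() standing in for toCaseFold."""
    return unicodedata.normalize("NFKC", nfd(s).casefold())

# Lower casing and case folding are different operations.
word = "Stra\u00dfe"
print(word.lower(), word.casefold())  # the sharp s survives lower() but folds to "ss"

# Canonical forms round-trip: NFD(NFC(s)) == NFD(NFC(NFD(s))) == NFD(s).
name = "Ame\u0301lie"  # e followed by a combining acute accent
print(nfd(nfc(name)) == nfd(nfc(nfd(name))) == nfd(name))

# Adding K and case folding breaks idempotence: U+2122 TRADE MARK SIGN has no
# case folding, but NFKC turns it into "TM", which folds on the next pass.
x = "\u2122"
print(q(x), q(q(x)))
```

The trade mark sign is only one convenient instance; any character whose compatibility decomposition yields cased letters behaves the same way under this composition.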