[15:45:24] --- sleinen has joined
[16:53:28] --- ggm has joined
[17:15:22] --- SharonChisholm has joined
[17:16:51] <SharonChisholm> meeting slides : http://ipfix.doit.wisc.edu/ietf60
[17:20:08] --- aen has joined
[17:20:29] <ggm> ipfix kicks off. Dave Plonka, Nevil Brownlee in the chair(s)
[17:21:32] <ggm> administrivia
[17:22:08] <ggm> Requirements, Eval drafts in process with RFC editor
[17:22:24] <ggm> David Moore, CAIDA/UCSD, 'building a better netflow'
[17:22:44] <ggm> New drafts: IPFIX impl notes, IPFIX over TCP
[17:23:45] <ggm> move to current drafts, proto/model/info/arch/applicability
[17:24:08] <ggm> significant list of open issues to be discussed, not covered onlist
[17:24:58] <ggm> first up, david moore. .
[17:25:01] <ggm> paper to go to SIGCOMM 2004
[17:26:00] <ggm> not meaning any specific vendor. changes are metering related but might reflect back on protocol. not definitive solution. discussion points, improvements
[17:26:29] <ggm> sampling pros/cons. good effects on load, usage/bw vs less accurate, cannot estimate non-TCP flowcounts.
[17:26:41] <ggm> how to pick the right sampling rate? choice depends on traffic mix.
[17:27:01] <ggm> looked at 4 problems in flow export.
[17:27:47] <ggm> how to tune. how much reporting b/w, depends on traffic rate, have to manually tune. people seem to use 'buckets' of time. flow timeouts eg nothing in 60, expire, try to turn back into bins. both ops and research. why do we keep doing that?
[17:27:54] --- ggm has left
[17:28:43] --- sakai has joined
[17:28:59] --- ggm has joined
[17:29:09] <ggm> (sorry) dropped out.
[17:29:22] <ggm> will look at different sampling rates.
[17:29:36] <ggm> need way to say 'in this bin, this is the rate' -makes it easier to have it binned.
[17:29:48] <ggm> if time bin is of appropriate size, can even reconstruct flow timeout data
[17:30:09] <ggm> slides about how to operate with bins. some use day-long, some use 5min bins
[17:30:47] <ggm> how does it fit with IPFIX? isn't it a metering issue? protocol requires timeout based flows. 'if you see a reset or fin' .. but it's not normative (is that intentional?) not well matched to somebody not doing that kind of model.
[17:31:14] <ggm> there is text about exporting long lived flows. could slip it into a long lived flow exception. what this means isn't well defined.
[17:31:36] <ggm> how do you export? what do you do? I'm going to export 10,000 flows, but all in the same timebin. do I have to put that in every flow rec?
[17:31:49] <ggm> can I say 'this bin with the next 10000 records' -how to do it efficiently?
[17:32:51] <ggm> take whole bin, to export what you want. don't block on this, don't have to export immediately
[17:33:27] <ggm> nobody knows how to set params. no relationship into memory etc.
[17:33:48] <ggm> what we want to do, is adapt sampling rate to traffic. keep as high as possible. but, make sure the results are valid.
[17:33:56] <ggm> never overload the CPU, and never run out of memory.
[17:34:39] <ggm> with multiple sample rates, unless I deal with it, packet and bytecounts become useless.
[17:35:02] <ggm> to decrease sample rate eg cut in half, easiest way to simulate is throw away half. (pretend)
[17:35:46] <ggm> the probability to keep each pkt is 1 in 4. there are lots of more efficient algs, techniques which work well for bytecounts.
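The "throw away half" idea noted above can be illustrated with a naive sketch. This is not the algorithm from the talk (which is described as using simple integer arithmetic and no random numbers); it is the plain simulation being referred to, where each already-sampled packet is kept with some probability. The names `thin_counts` and `estimate_packets`, the flow table, and the rates are all hypothetical.

```python
import random


def thin_counts(flow_packets, keep_prob, rng):
    """Simulate a lower sampling rate on an existing flow table.

    Each already-sampled packet is kept independently with probability
    keep_prob, so a table built at rate p behaves as if it had been
    built at rate p * keep_prob. Flows left with zero packets are dropped.
    """
    thinned = {}
    for key, count in flow_packets.items():
        kept = sum(1 for _ in range(count) if rng.random() < keep_prob)
        if kept:
            thinned[key] = kept
    return thinned


def estimate_packets(sampled_count, sampling_prob):
    """Scale a sampled packet count back up to an estimate of the true count."""
    return sampled_count / sampling_prob


rng = random.Random(60)                               # fixed seed, illustration only
table = {"flow-a": 4000, "flow-b": 40, "flow-c": 1}   # assumed sampled at 1 in 2
halved = thin_counts(table, 0.5, rng)                 # now effectively 1 in 4
estimate_a = estimate_packets(halved["flow-a"], 0.25)
```

With the original table sampled at 1 in 2, `flow-a` stands for roughly 8000 real packets, and the rescaled estimate after thinning should stay close to that. Small flows such as `flow-c` may vanish from the table entirely, which is why this step cannot be undone later.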
[17:36:11] <ggm> Q what do I do when I need to change sampling number. can't go other direction, threw it away. one advantage of bins is that you can crank it up in the bin size.
[17:36:17] --- sakai has left
[17:36:37] <ggm> limiting can be done in parallel, is very efficient. simple integer arith, no random numbers generated.
[17:37:28] <ggm> eg compute hash, 1 entry is 3.4us, but to renormalize is 1.5us, so can do two for every entry. vendor configures initial sample rate high enough for cpu to keep up with minimum sized packets
[17:37:43] --- rgaglian has joined
[17:38:03] <ggm> what happens under DoS. show graphs of CPU load
[17:38:26] --- sakai has joined
[17:38:41] <ggm> all sampling is probabilistic sampling per packet. when I say "1 in 20" that's a probability, not every 20th packet.
[17:38:57] <ggm> Q in paper, can you prove effect of sampling down is same as initially selecting different probability
[17:39:42] <ggm> A yes we do probabilistic only, but we did show this in paper. for non-probabilistic, may be different. may get different flows in output depending on when it changes. aggregates when rescaled get right estimates
[17:41:29] <ggm> so graph, shows timeline of effect of DoS. the netflow data, isn't affected by the DoS load. its constant, peaks, adjusts how it does things, [more like phase-shifting within its parameters]
[17:41:50] <ggm> another graph showing adjustment to sampling rate in time as a step-graph.
[17:42:13] <ggm> Q when rate is decreased can normalize. but when increased, how to estimate it,
[17:42:27] <ggm> A we never increase. the next timebin is done. no flows across increase
[17:43:06] <ggm> Hannah Have different sampling rates per bin. accuracy is different per bin. need to report each time changes, so user knows. need to know.
[17:43:26] <ggm> <points to Q about needing to report, option or flag?>
[17:44:30] <ggm> instead of letting user pick sampling rate, doesn't make sense to user, doesn't relate to costs in memory or bandwidth or filesize, -give tuning knob: how many records to report back.
[17:44:47] <ggm> more meaningful than sampling rate.
[17:45:21] <ggm> Q don't dispute have crude model of distribution, can calc stuff based on that.
[17:45:39] <ggm> A characteristic is DoS with source ports, dst ports spoofed, 2**32 bits of flowstate, can't fit in memory.
[17:46:10] <ggm> Q but you're sampling that. But, you can't tell anything about the accuracy of the resource usage just based on sample rate. with some assumption, you can.
[17:46:32] <ggm> A but it varies. what we try to do with all our work is 'graceful degradation' so monitor doesn't stop working in adverse conditions
[17:46:54] <ggm> Dave (chair) move on. if agree want to do this, then how in IPFIX
[17:47:26] <ggm> thing which matters: want different M size, (number to get) eg "first 1000 entries that give me a good overview."
[17:47:34] <ggm> then say "give me some other stuff, at lower priority"
[17:47:59] <ggm> reliable transport, in general, may be able to share memory for flows, from previous time bin, with memory needed for reTx.
[17:48:13] <ggm> S/Hannah/Tanya?/
[17:48:30] <ggm> section 8: option or flow rec?
[17:48:42] <ggm> how to handle exporting the same flow multiple times
[17:49:26] <ggm> problem: estimating aggregates of non-TCP flows.
[17:50:40] <ggm> cannot do packetsamp to do correct flowcounts. . consider case, one has 2pkt per flow, other has 1 per flow. sample rate same for both, will undercount flows (or over) no way to know
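A small worked calculation of the point above, as a sketch under stated assumptions: packets are sampled independently with probability `p`, and a flow is counted if at least one of its packets is sampled. The function names and the figures (1000 flows, 1 in 20) are hypothetical.

```python
def p_flow_seen(packets_in_flow, p):
    """Probability that at least one packet of the flow is sampled."""
    return 1.0 - (1.0 - p) ** packets_in_flow


def expected_seen_flows(n_flows, packets_in_flow, p):
    """Expected number of distinct flows that show up in the sampled data."""
    return n_flows * p_flow_seen(packets_in_flow, p)


p = 0.05  # "1 in 20", as a per-packet probability

# Two traffic mixes with the same true flow count but different flow lengths.
seen_one_pkt = expected_seen_flows(1000, 1, p)
seen_two_pkt = expected_seen_flows(1000, 2, p)

# Naively scaling the seen-flow count by 1/p gives different answers
# for the same true flow count of 1000.
naive_one_pkt = seen_one_pkt / p
naive_two_pkt = seen_two_pkt / p
```

Under these assumptions the naive estimate is about 1000 for the one-packet mix and about 1950 for the two-packet mix, although both mixes contain 1000 flows. Without knowing the packets-per-flow distribution there is no single correct scale factor.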
[17:51:01] <ggm> goal is to do unbiased accurate flow counts. for any post aggregation work.
[17:52:06] <ggm> use an extension to hardware. other ways, but h/w is easy to do. keep hash of flow keys. good hash. scale flow counts by depth, halve memory when you need to (filling up) get good, statistically useful random sample of flows, so can do things on them. scaled
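One way to read "keep hash of flow keys, scale flow counts by depth, halve memory when filling up" is the sketch below. It is a guess at the general technique, not the design from the talk: a flow is kept only if the low `depth` bits of its hash are zero, and when the table overflows the depth is increased, which discards about half the entries. The class name `FlowSampler`, the use of SHA-256, and the capacity are assumptions made for illustration.

```python
import hashlib


class FlowSampler:
    """Keeps a bounded, hash-selected sample of flow keys."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.depth = 0      # a flow is kept if its hash has `depth` low zero bits
        self.flows = set()

    @staticmethod
    def _hash(key):
        # Any well-mixed hash would do; SHA-256 is used here for convenience.
        digest = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def _selected(self, key):
        return self._hash(key) & ((1 << self.depth) - 1) == 0

    def observe(self, key):
        """Record one packet of the flow identified by key."""
        if not self._selected(key):
            return
        self.flows.add(key)
        # On overflow, tighten the selection; this drops about half the flows.
        while len(self.flows) > self.capacity:
            self.depth += 1
            self.flows = {k for k in self.flows if self._selected(k)}

    def estimated_flow_count(self):
        """Scale the number of kept flows by the selection depth."""
        return len(self.flows) * (1 << self.depth)


sampler = FlowSampler(capacity=256)
for i in range(20000):
    key = f"10.0.{i // 256}.{i % 256}:{i}"   # hypothetical flow key
    sampler.observe(key)
    sampler.observe(key)                      # repeat packets do not change the count
```

Because selection depends only on the flow key and not on how many packets the flow has, every flow has the same chance of being in the sample, which is what makes the scaled count usable as a flow-count estimate.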
[17:52:20] <ggm> relationship to IPFIX. SCTP-PR. use priority levels
[17:52:35] <ggm> some issues on how to export
[17:53:12] <ggm> pointer to a TR in CAIDA
[17:55:23] <ggm> conclusions.
[17:55:35] <ggm> adaptive netflow improves netflow.
[17:55:50] <ggm> binned measurement makes it better to analyse
[17:56:01] <ggm> h/w changes not required, but useful (flow counts/samples)
[17:56:16] <ggm> Nevil. comments, on IPFIX aspects. now good time to talk about
[17:56:40] <ggm> Luca Deri, Implementing IPFIX -presented by Arne Oslebo
[17:56:45] <ggm> (uninett)
[17:56:51] <ggm> best to send issues/Q to ML.
[17:57:23] <ggm> required to implement IPFIX using NetFlow v9. started with NFv9 since leading impl, reasonable to believe will be based on it.
[17:57:37] <ggm> (at least for early days)
[17:58:22] <ggm> Vendor Specific Exts, defined using PEN (IANA enterprise number) -and numeric field. templates/flows have different format compared to IETF fields. had to add additional logic in IPFIX to handle both, not a big issue but complicates debug a bit
[17:58:44] <ggm> Suggestion, IETF should unify format for IETF and Vendor-Specific
[17:58:55] <ggm> <floor> fixed in latest version of the draft
[17:59:03] <ggm> SCTP support.
[17:59:14] <ggm> shifting from connectionless to connection-oriented.
[17:59:24] <ggm> very little coding for this per se.
[17:59:33] <ggm> reduce template traffic. send at beginning/end only
[18:00:06] <ggm> major code changes in the probe to support mult. collectors. necessary to resend the templates per connection. (when necessary) and keep track of connections
[18:00:16] <ggm> issues
[18:01:24] <ggm> Flows are ack'd immediately. increasing network traffic. amplifies problem in some situations. table of replay of random sample of packets. number needed to export 50,000. SCTP is 50% more expensive in packets, but 3% more in bytes.
[18:01:43] <ggm> <floor> no, this demonstrates what is real: it's not 'immediate ack' it's the cost shown in the table.
[18:01:55] <ggm> SCTP only supported on a few platforms
[18:02:10] <ggm> <floor> ok, its supported on many platforms. BSD, Linux, Windows, HP, Solaris.
[18:02:12] <ggm> not native
[18:02:26] <ggm> <floor> agreed.
[18:02:58] <ggm> Plonka. please, he's presenting on somebody else's report, AND this is an IMPLEMENTORS report. it's wonderful. he is not attacking anyone's code, just commenting
[18:03:18] <ggm> robustness is unclear. how does it behave under attack, in production attack.
[18:03:27] <ggm> IPFIX evaluation.
[18:03:43] <ggm> standard netflow, (good and bad) little innovation after 10 years of flow based measurements.
[18:03:59] <ggm> Major limitations: no dynamic templates. people will use 'super templates'
[18:04:10] <ggm> <floor> nothing says must be static or dynamic.
[18:04:28] <ggm> I asked him about this, having lots of templates is overhead in the app.
[18:04:39] <ggm> Plonka dynamic template means have lifetime
[18:04:51] <ggm> I think he means having fields optional, avoid having to resend.
[18:05:07] <ggm> can define flows, but not flow headers.
[18:05:21] <ggm> still based on the concept of flow-packet even with SCTP.
[18:05:45] <ggm> one way proto (probe->collector) no support for config, monitoring, error reporting (eg via SNMP, like sFlow does)
[18:06:30] <ggm> http://www.ntop.org/ to see code. written I-D about experience
[18:07:06] <ggm> Dave Plonka. the way i took it, there was a separate codepath to do Vendor codepaths vs IETF, what was the change to cause it to rely on the same codepath.
[18:08:21] <ggm> tradeoff between bandwidth and code complexity.
[18:08:32] <ggm> if any Vendor Specific, then all have to be ..
[18:08:43] --- ggm has left
[18:09:35] --- ggm has joined
[18:09:44] <ggm> (sorry dropped out again)
[18:09:59] <ggm> Nevil. not comments on IPFIX as much as experiences, things to think about
[18:10:06] <ggm> Simon Leinen. implementing IPFIX on TCP
[18:11:04] <ggm> collector connects to exporter, exporter starts exporting flowsets
[18:11:37] <ggm> issues. should exporter initiate the connection? Benoit says yes.
[18:13:44] <ggm> redundancy. exporter and multiple collectors, find natural collectors connect to exporter when ready, Benoit has this assumption exporter, looks at it from the router PoV exporter has to know this.
[18:13:59] <ggm> Benoit: report on router just for IPFIX, report to look at if hacker. if connect to it, can see stuff.
[18:14:05] <ggm> Simon 'security by obscurity'
[18:14:14] <ggm> Simon I wrote well known port configurable
[18:14:25] <ggm> seems to make configuration more complex
[18:14:44] --- yjs has joined
[18:15:36] <ggm> security: current draft has use of IPSEC, shouldn't really be long, transport mapping draft, not well aligned with text in protocol document, contradictory. will have to modify that, if you decide to do
[18:15:43] <ggm> IPFIX over TCP, natural to do TLS
[18:16:10] <ggm> <floor> take text and fold into protocol doc. have model. sections for protocols look same except for differences, can we adopt that approach?
[18:16:18] <ggm> Simon hopefully yes. focussing on the basic problems.
[18:17:04] <ggm> if we do TLS, do we need some kind of negotiation? not sure necessary. exporter/collector should have agreement, if both configured, just do TLS handshake
[18:18:56] <ggm> Nevil assumption for some time, is one-way
[18:19:52] <ggm> In longer term, 2 way communications and more sharing between exporter and collector, but concerned to get things through in V1 without further delays.
[18:20:02] --- yjs has left
[18:20:07] <ggm> Happy to see text go into drafts? want to do another revision or put text straight across?
[18:20:26] <ggm> Simon think suggestions, have to align to structure of current transport patterns, other changes,
[18:20:36] <ggm> Nevil suggest you do that, publish again, lets see real discussion on the list
[18:21:12] <ggm> merge straight into protocol document. why make draft?
[18:21:29] <ggm> Nevil its already there., easiest way to get info out there. not intended to live very long.
[18:21:42] <ggm> Simon concerned leaving it separate draft will slow down proto draft?
[18:21:57] <ggm> DaveP we've got somebody to do it. if he stops, take the text. we've got active participants
[18:22:55] <ggm> DaveP how many had chance to read drafts. [5-7 read them]
[18:23:23] <ggm> Next up. going over the drafts. Nevil has overview of documents. then will go to Benoit , Arch, Applicability
[18:23:31] <ggm> Nevil -Overview
[18:23:42] <ggm> explains what each document is for
[18:26:24] <ggm> protocol document seems most important. get benoit to walk us through it.
[18:26:53] <ggm> Benoit.
[18:27:00] <ggm> version 4 of protocol spec draft.
[18:27:09] <ggm> start with closed issues.
[18:27:37] <ggm> do we need IETF specific exclusive template, FlowSet format? resolution, reserve 0 and 1.
[18:28:00] <ggm> closed proto issue 5-11 -transport. -new sections on SCTP and UDP
[18:28:33] <ggm> PROTO-13 -how to distinguish IETF field bits. 0 means just type/length, set 1, then type, length and ent. number
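A minimal sketch of the PROTO-13 resolution as noted above: a leading bit of 0 means the field is just type and length, a leading bit of 1 means type, length and enterprise number follow. The log gives only that ordering; the field widths used here (1-bit flag, 15-bit type, 16-bit length, 32-bit enterprise number, network byte order) are an assumption, as are the function names and example values.

```python
import struct

ENTERPRISE_BIT = 0x8000  # assumed: top bit of the 16-bit type field


def encode_field_specifier(element_id, length, enterprise_number=None):
    """Pack one template field; adds the enterprise number only when given."""
    if not 0 <= element_id < 0x8000:
        raise ValueError("element_id must fit in 15 bits")
    if enterprise_number is None:
        return struct.pack("!HH", element_id, length)
    return struct.pack("!HHI", element_id | ENTERPRISE_BIT, length,
                       enterprise_number)


def decode_field_specifier(buf, offset=0):
    """Return (element_id, length, enterprise_number or None, next_offset)."""
    raw_id, length = struct.unpack_from("!HH", buf, offset)
    offset += 4
    enterprise_number = None
    if raw_id & ENTERPRISE_BIT:
        (enterprise_number,) = struct.unpack_from("!I", buf, offset)
        offset += 4
    return raw_id & 0x7FFF, length, enterprise_number, offset


ietf_field = encode_field_specifier(8, 4)          # hypothetical IETF field
vendor_field = encode_field_specifier(100, 2, 9)   # hypothetical vendor field, PEN 9
decoded = decode_field_specifier(ietf_field + vendor_field, offset=4)
```

With a single flag bit, a collector can walk a template that mixes IETF and vendor-specific fields through one code path, which is the unification asked for in the implementation report earlier in the session.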
[18:29:20] <ggm> PROTO-14. padding issue. inserted text, MAY padding, to get to aligned boundary. condition is, must be shorter than one flow data record. (otherstuff too)
[18:29:31] <ggm> Dave not entirely happy with text
[18:29:37] <ggm> DaveP want to disambiguate
[18:29:45] <ggm> Benoit
[18:30:54] <ggm> IPFIX and SAMPLING. proto specs say 'support samples, method out of scope' -been through draft, trying to sort out issues, didn't know where to put this. so proposal is to refine flow definition to include sampling mechanism.
[18:31:36] <ggm> PROTO3 -ip encaps packets. definition kept unchanged.
[18:31:56] <ggm> PROTO-22 exporter ID. (ip addr of exporter) nothing changed in protocol draft.
[18:32:01] <ggm> OPEN ISSUE
[18:32:14] <ggm> ml comment: unhappy with FlowSet name.
[18:32:20] <ggm> choices:
[18:32:24] <ggm> leave as-is
[18:32:27] <ggm> Record Set
[18:32:29] <ggm> Record Array
[18:32:32] <ggm> Record Collection
[18:32:34] <ggm> Record List
[18:32:35] <ggm> Set
[18:32:42] <ggm> No consensus.
[18:32:57] <ggm> proposal: change FlowSet to Set.
[18:33:07] <ggm> Template Set, Option Template Set, Data Set
[18:34:00] <ggm> PROTO-16-20 -scope -> proposal
[18:34:48] <ggm> want to limit use of discrete registry in IANA, to specify elements/scopes. make them the same. much easier for collector.
[18:36:37] <ggm> want text to specify ordering issues, to make the orders follow, treating scopes as logical AND
[18:36:46] <ggm> part proposed to ML. but some is new.
[18:36:51] <ggm> comes from PSAMP
[18:37:08] <ggm> PROTO-26. need IANA section
[18:37:38] <ggm> been reading 2434 document on IANA considerations.
[18:37:57] <ggm> while waiting for IANA function, run in the list, both IPFIX and PSAMP.
[18:38:16] <ggm> when IANA process in place, then it's to IANA, FCFS, number is just unique, important part is description
[18:38:46] <ggm> IPFIX/PSAMP chairs, AD, should designate expert review. to check the specs.
[18:38:59] <ggm> for new FlowSet -"spec required" ie a new RFC. 'its substantial'
[18:40:06] <ggm> Nevil. new informational element, ask IANA. don't ask which value it is.
[18:40:12] <ggm> <floor> thats what he said.
[18:40:29] <ggm> Nevil oh.. ok. small review group, that's fine. IANA can't do that themselves
[18:41:00] <ggm> conceivably PSAMP/IPFIX might want to define elems, good idea to designate ranges for groups rather than having them all mixed together.
[18:41:10] <ggm> <floor> no, everything else is FCFS, lets do that.
[18:41:15] <ggm> <floor> was stewart btw
[18:41:27] <ggm> Benoit I agree with Stewart
[18:43:40] <ggm> issue PROTO-21. meter process stats.
[18:43:59] <ggm> maurizio posted to ML with proposal for required fields. no feedback.
[18:44:13] <ggm> <floor> lets deal with this on ML.
[18:44:27] <ggm> Nevil. these do seem to fit nicely with option data records. inf. elem descriptions
[18:46:03] <ggm> Stewart. many management ways to say whats lost. MIB for example
[18:46:16] <ggm> DaveP simpler to use the option sets in the proto
[18:46:32] <ggm> Nevil nowadays, if make protocol, expected to make a Mib.
[18:46:48] <ggm> starting by putting things to be reported on into the information model seems like a good start
[18:47:11] --- SharonChisholm has left: Disconnected
[18:47:15] <ggm> PROTO23
[18:47:23] <ggm> time sync proposal
[18:47:28] <ggm> Finalize details
[18:47:47] <ggm> Benoit -discussion last time, what resolution, micro or millisec
[18:48:06] <ggm> PROTO-5-11 -tcp section -see Simons work
[18:48:49] <ggm> PROTO-25 still not right. structure around SCTP, then document variances in UDP section
[18:49:05] <ggm> PROTO-30 -review requirements, make sure not missed when I-RFC
[18:49:08] <ggm> NEW ISSUES
[18:50:05] <ggm> reference to NetFlow v 9. limit text to simpler version to say 0 and 1 are not used for historical purposes.
[18:50:41] <ggm> DaveP good point. but historical purposes a bit odd. -some people think their NetFlow V9 and use it as-is.
[18:50:54] <ggm> Stewart change one word are to MUST and you've got it.
[18:51:01] <ggm> Exporter is not in the terminology section
[18:51:37] <ggm> but IPFIX device, and node are. proposal is to define exporter, keep device, remove 'node'
[18:54:12] <ggm> INFO model
[18:54:22] <ggm> no structural changes. has converged
[18:54:31] <ggm> version is intermediate. 'construction site'
[18:56:35] <ggm> variable field problems.
[18:56:44] <ggm> some stuff changes over the life of a flow. how to report?
[19:03:52] <ggm> fields type assignment. -start from field 128. hand to IANA. legacy fields are Netflow V9 1-127
[19:03:53] --- rgaglian has left: Disconnected
[19:05:28] <ggm> NetFlow V9 compat. some things not sure how to decide.
[19:06:49] <ggm> if very strict, decision is not ok,. but practical
[19:06:53] <ggm> Nevil Wait for list.
[19:07:01] <ggm> Packet Sampling fields.
[19:07:08] --- yjs has joined
[19:07:36] <ggm> put to PSAMP wg.
[19:07:50] <ggm> flow sampling not covered yet. probably not ready to consider it yet
[19:08:20] <ggm> Stewart. concern about doing some here, some PSAMP. need to be careful how to run number alloc mechanism.
[19:08:28] <ggm> DaveP have person in both groups. Juergen!
[19:08:39] <ggm> Juergen can have joint process.
[19:10:28] <ggm> Arch draft is informational so need to remove normative language
[19:10:53] <ggm> Flow aggregates. optional, doesn't now seem such a good idea. proposing to take out.
[19:11:30] <ggm> selective export deleted,
[19:11:46] <ggm> flow expiration, comment added. effect of inactivity timer==0 (one packet flows)
[19:13:54] <ggm> arch-01 changed field to information element, adding text to explain arch-02 issues.
[19:13:59] <ggm> Stewart seems complex
[19:14:02] <ggm> Nevil lets go to list
[19:14:53] <ggm> arch-04 consistent terminology between drafts
[19:15:18] <ggm> doesn't mean have to be identical. can eg be more discursive or terse, depending
[19:17:05] <ggm> arch 08. want overview of information model.
[19:17:18] <ggm> arch 09 out of date text, needs reworking
[19:18:17] <ggm> arch 10 no mention of transport protocols in arch. will add text
[19:18:28] <ggm> arch 11, IANA considerations, will deal with it.
[19:19:05] --- aen has left
[19:19:51] --- sakai has left
[19:20:05] <ggm> arch 12. security considerations light.
[19:20:10] <ggm> what next?
[19:21:03] <ggm> applicability statement. Tanja has new version. little bit of restructuring, only very few comments. thought of having more scenarios. few people from group issued per-packet draft to do QoS measurement
[19:21:10] <ggm> if interested, please comment
[19:21:24] <ggm> Nevil once we have new revisions, of other 3 drafts, come back and have closer look
[19:21:56] <ggm> DaveP two things to mention. Applicability. goal of document, cohere the others together. want to revisit, do we have the mainstream thing there. want to make sure mainstream is visible
[19:22:18] <ggm> Security push. we need somebody with experience of TLS or SCTP or threats, we need help. interested for both proto and arch doc. seeds, but need to flesh out
[19:25:04] --- yjs has left
[19:41:42] --- sleinen has left: Disconnected
[20:54:46] --- ggm has left: Disconnected
[21:16:24] --- SharonChisholm has joined
[21:16:40] --- SharonChisholm has left
[21:31:19] --- sleinen has joined
[23:40:15] --- sleinen has left: Disconnected