TAAD - A Tool for Traffic Analysis and Automatic
Diagnosis
Kathy L. Benninger
NLANR/Pittsburgh Supercomputing Center
NLANR/PSC http://www.ncne.nlanr.net/TCP/TAAD
Outline
• Context for development of TAAD
• Characteristics of the tool
• Performance model
• Output description and interpretation
• OCXmon
• Practical considerations
Context
• TAAD is being developed by the NLANR network research group based at the Pittsburgh Supercomputing Center
• NCNE Pittsburgh GigaPoP based at PSC
• Coexistence of the NLANR group and the NCNE Pittsburgh GigaPoP provides ample opportunity for development and testing.
Context (cont’d)
• Need for tool to support NLANR/PSC’s TCP Trace-based Performance Diagnosis Flowchart
– Analysis of heavily aggregated traffic
– Automatic problem detection and partial diagnosis
• Availability of OCXmon data collection
Tool Characteristics
• Searches aggregate traffic for mistuned microflows
• Tool for GigaPoP operators
• Examines traffic from GigaPoP viewpoint, but detects end-system problems
Tool Characteristics (cont’d)
• Uses model developed in “The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm” [Mathis, Semke, Mahdavi, Ott, CCR July 1997]
• Compares actual TCP performance to performance predicted by the Model
Tool Characteristics (cont’d)
• Diagnosis of bulk flows
• Does not pinpoint why performance is poor
• Evolving...
Macroscopic Performance Model
$$\mathrm{Rate} = \frac{MSS}{RTT} \cdot \frac{C}{\sqrt{p}}$$
• Rate = Estimated data rate (bytes/second)
• MSS = Maximum Segment Size (bytes)
• RTT = Round Trip Time (seconds)
• p = Segment loss rate (probability)
• C = Proportionality constant (typically 0.7)
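As a rough illustration, the model can be evaluated directly in Python. This is a minimal sketch, not TAAD code: the function name `model_rate` and the example inputs (1460-byte MSS, 70 ms RTT, 0.1% loss) are assumptions chosen for illustration, and C defaults to the 0.7 quoted on this slide.

```python
import math


def model_rate(mss_bytes, rtt_seconds, loss_rate, c=0.7):
    """Estimated data rate in bytes/second: (MSS / RTT) * (C / sqrt(p))."""
    if mss_bytes <= 0 or rtt_seconds <= 0:
        raise ValueError("MSS and RTT must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss rate p must be in (0, 1]")
    return (mss_bytes / rtt_seconds) * (c / math.sqrt(loss_rate))


# Illustrative values only (assumed, not taken from the slides).
rate = model_rate(mss_bytes=1460, rtt_seconds=0.070, loss_rate=0.001)
print(f"predicted rate: {rate:.0f} bytes/s ({rate * 8 / 1e6:.2f} Mbit/s)")
```

Because the rate scales as 1/sqrt(p), cutting the loss rate by a factor of four only doubles the predicted rate, while halving the RTT doubles it directly.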
TAAD Calculation
$$\mathrm{GainRatio} = \frac{\mathrm{RatePredictedByModel}}{\mathrm{MeasuredRate}}$$
Model used by TAAD
$$\mathrm{GainRatio} = \frac{C / \sqrt{p}}{\mathrm{MeasuredRate} \cdot (RTT / MSS)}$$
• GainRatio = Indicates potential performance improvement
• p = Analogous to loss rate, but derived from number of packets successfully delivered between recovery events
• MeasuredRate = Data rate (bytes/second)
• RTT = Round Trip Time (seconds)
• MSS = Maximum Segment Size (bytes)
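The calculation can be sketched in Python as follows. This is an illustration under stated assumptions, not TAAD's implementation: the function names are hypothetical, and the slides do not say exactly how p is derived, so `p_from_recovery_events` simply assumes one loss-equivalent per recovery event.

```python
import math


def p_from_recovery_events(segments_delivered, recovery_events):
    """Assumed estimate of p: one loss-equivalent per recovery event."""
    if segments_delivered <= 0 or recovery_events <= 0:
        raise ValueError("counts must be positive")
    return recovery_events / segments_delivered


def gain_ratio(measured_rate, rtt_seconds, mss_bytes, p, c=0.7):
    """GainRatio = (C / sqrt(p)) / (MeasuredRate * RTT / MSS)."""
    if measured_rate <= 0 or rtt_seconds <= 0 or mss_bytes <= 0 or p <= 0:
        raise ValueError("all inputs must be positive")
    # Measured rate expressed in segments per round trip.
    scaled_rate = measured_rate * rtt_seconds / mss_bytes
    return (c / math.sqrt(p)) / scaled_rate


# Illustrative values only (assumed, not taken from the slides).
p = p_from_recovery_events(segments_delivered=2000, recovery_events=2)
ratio = gain_ratio(measured_rate=100_000, rtt_seconds=0.070, mss_bytes=1460, p=p)
print(f"p = {p:.4f}, GainRatio = {ratio:.1f}")
```

Scaling the measured rate by RTT/MSS turns it into segments per round trip, which is the same unit as C/sqrt(p), so the ratio is dimensionless.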
TAAD Output Fields
• Source addresses and ports
• Destination addresses and ports
• Start time and duration of flow
• Counts of packets and bytes
• GainRatio and OpportunitySize
TAAD Output Interpretation
• If GainRatio
– is ~ 1, flow performance is close to Model
– is > 1, indicates a non-IP bottleneck
– is >> 1, invites tuning to improve performance
– is < 1 means cheating!
TAAD Output Interpretation (cont’d)
• OpportunitySize is GainRatio scaled by number of packets
– Indicates how much data could have been transmitted in the same amount of time on a properly tuned connection
– Output flows are sorted by OpportunitySize
– Flows with largest OpportunitySize offer largest payoff with tuning
Sample Output
# for if 0 vp:vc 1:153
# unknown_encaps: 0
# not_ipv4: 0
# pkts: 32434
# bytes: 18128703
# first: 0.9297266
# latest: 3.69090628
# oprtunsz src dst sport dport start_time duration pkts bytes gainratio
1244.2 0.177.0.0 0.4.0.0 1023 1383 0.00178352 3.669245243 339 476124 3.7
1180.0 0.25.0.0 0.4.0.0 20 1037 0.93062580 2.754259825 415 584200 2.8
558.0 0.58.0.0 0.2.0.0 41415 25 0.93210152 2.620216131 217 305052 2.6
454.9 0.228.0.0 0.2.0.0 119 2101 0.00483488 3.684525251 133 199500 3.4
404.0 3.59.0.0 0.4.0.0 7919 5501 0.25671660 3.203558207 187 229234 2.2
370.0 1.206.0.0 0.4.0.0 80 4586 0.06984540 3.618816853 199 297544 1.9
352.4 2.60.0.0 0.2.0.0 80 1170 0.10803892 3.455293179 174 218624 2.0
295.8 0.23.0.0 0.4.0.0 3474 1393 0.36288916 3.281085014 113 157084 2.6
267.2 2.157.0.0 0.2.0.0 80 4252 0.14957588 3.521914005 103 143855 2.6
241.8 0.228.0.0 0.30.0.0 80 1547 0.35120440 3.325309753 126 189000 1.9
208.5 0.23.0.0 0.4.0.0 1275 6699 0.00540136 3.400367737 106 142896 2.0
187.3 0.212.0.0 0.4.0.0 1986 20 0.00383024 3.671326876 103 140748 1.8
94.0 0.23.0.0 0.4.0.0 20 1422 0.25331492 3.378539562 103 57204 0.9
# end data for if 0 vp:vc 1:153
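Output in this shape is straightforward to post-process. The sketch below is a hypothetical reader, not part of TAAD: it assumes one flow per line with the ten whitespace-separated fields named in the header comment, skips lines starting with `#`, and uses an arbitrary GainRatio cut-off of 2.0 to pick tuning candidates. The three embedded rows are copied from the sample above.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    oprtunsz: float
    src: str
    dst: str
    sport: int
    dport: int
    start_time: float
    duration: float
    pkts: int
    nbytes: int
    gainratio: float


def parse_taad(text):
    """Parse flow rows; '#' comment lines and malformed rows are skipped."""
    flows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#") or len(fields) != 10:
            continue
        flows.append(Flow(
            float(fields[0]), fields[1], fields[2],
            int(fields[3]), int(fields[4]),
            float(fields[5]), float(fields[6]),
            int(fields[7]), int(fields[8]), float(fields[9]),
        ))
    return flows


SAMPLE = """\
# oprtunsz src dst sport dport start_time duration pkts bytes gainratio
1244.2 0.177.0.0 0.4.0.0 1023 1383 0.00178352 3.669245243 339 476124 3.7
1180.0 0.25.0.0 0.4.0.0 20 1037 0.93062580 2.754259825 415 584200 2.8
94.0 0.23.0.0 0.4.0.0 20 1422 0.25331492 3.378539562 103 57204 0.9
"""

flows = sorted(parse_taad(SAMPLE), key=lambda f: f.oprtunsz, reverse=True)
# Hypothetical cut-off: treat GainRatio >= 2.0 as worth a closer look.
candidates = [f for f in flows if f.gainratio >= 2.0]
for f in candidates:
    print(f"{f.src}:{f.sport} -> {f.dst}:{f.dport}  "
          f"oprtunsz={f.oprtunsz}  gainratio={f.gainratio}")
```

In the sample rows, oprtunsz is close to gainratio multiplied by pkts (for example 3.7 × 339 ≈ 1254 against a reported 1244.2), which matches the description of OpportunitySize as GainRatio scaled by the number of packets; the small difference is consistent with gainratio being rounded for display.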
OC3mon
• Available through development efforts of
– NLANR/MOAT project at SDSC
– MCI’s OCXmon activity
– CAIDA’s CoralReef software suite
• Passive network monitoring tool
OC3mon (cont’d)
• Data format
– Trace files collected in Coral .crl format
– Analysis output of TAAD is ASCII
• Collects packet headers
• Does not collect payload
Operation
• Five minute trace on one or two interfaces
• New trace capture begins while previous five minutes of data is analyzed
• Data volume (per interface, mid-day)
– Capture .crl file ~ 40MB/minute
– Analysis output file size ~ 25K/minute
Operational Issues
• Data Policy
– Amount of data
– Security and privacy
– Legal liability
• Run time
– ATM card(s) devoted to continuous capture
– Recommend dedicated machine
Resource requirement
• Currently running on one Intel 450MHz CPU
– CPU ~2% load during trace capture
– CPU ~75-80% load during analysis (and continued trace)
– Wall-clock time for analysis is < 1 minute for a 5 minute trace capture (~200MB trace file)
• 6GB disk sufficient for summary data
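A back-of-the-envelope check of these figures, using the data volumes quoted on the Operation slide. The sketch assumes decimal units, reads "25K" as 25 kilobytes, considers a single interface, and treats the mid-day rate as sustained around the clock, which overstates the daily volume.

```python
# Figures quoted in the slides (per interface, mid-day).
CAPTURE_MB_PER_MIN = 40    # .crl capture file
ANALYSIS_KB_PER_MIN = 25   # TAAD analysis output ("25K" assumed to mean KB)
TRACE_MINUTES = 5
DISK_GB = 6

# Size of one five-minute trace file.
trace_file_mb = CAPTURE_MB_PER_MIN * TRACE_MINUTES

# Summary (analysis output) volume per day, assuming the mid-day rate
# holds for all 24 hours and 1 MB = 1000 KB.
summary_mb_per_day = ANALYSIS_KB_PER_MIN * 60 * 24 / 1000

# How long the disk lasts if only summary data is retained.
days_of_summaries = DISK_GB * 1000 / summary_mb_per_day

print(f"trace file: {trace_file_mb} MB")
print(f"summary data: {summary_mb_per_day:.0f} MB/day")
print(f"{DISK_GB} GB holds about {days_of_summaries:.0f} days of summaries")
```

The 200 MB result agrees with the trace file size quoted above, and under these assumptions a 6 GB disk holds several months of summary output for one interface, provided raw traces are discarded after analysis.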
Future
• Verification and release
• Adaptation for use with other trace tools
• Additional tools to create a TAAD toolset
Conclusion
• TAAD is intended to help meet the need for a tool to automate the analysis and diagnosis of aggregated bulk flows.
• The analysis and diagnosis is based on comparing modeled and actual performance
• Output is intended to be a pointer for where to direct tuning efforts for maximum benefit
References
• Macroscopic paper
– http://www.psc.edu/networking/papers/model_ccr97.ps
• TCP Tuning
– http://www.ncne.nlanr.net/TCP/
• TAAD
– http://www.ncne.nlanr.net/TCP/TAAD
• CoralReef
– http://www.caida.org/Tools/CoralReef/