Post on 21-Dec-2015
Characterizing and Predicting TCP Throughput on the Wide Area Network
Dong Lu, Yi Qiao, Peter Dinda, Fabian Bustamante
Department of Computer Science, Northwestern University
http://plab.cs.northwestern.edu
2
Overview
• Algorithm for predicting the TCP throughput as function of flow size
• Minimal active probing
• Dynamic probe rate adjustment
• Explaining flow size / throughput correlation
• Explaining why simple active probing fails
Large scale empirical study
3
Outline
• Why TCP throughput prediction?
• Particulars of study
• Flow size / TCP throughput correlation
• Issues with simple benchmarking
• DualPats algorithm
• Stability and dynamic rate adjustment
4
Goal
A library call
BW = PredictTransfer(src,dst,numbytes);
Expected Time = numbytes/BW;
Ideally, we want a confidence interval:
(BWLow,BWHigh) = PredictTransfer(src,dst,numbytes,p);
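The arithmetic behind this call can be sketched as follows. This is a minimal illustration only: the helper names `expected_time` and `expected_time_range` are hypothetical and not part of any library described in the talk, and bandwidth is assumed to be in bytes per second. Note that the high end of the bandwidth interval gives the low end of the time interval.

```python
def expected_time(numbytes, bw):
    """Expected transfer time in seconds, given predicted bandwidth in bytes/sec."""
    if bw <= 0:
        raise ValueError("predicted bandwidth must be positive")
    return numbytes / bw


def expected_time_range(numbytes, bw_low, bw_high):
    """Map a bandwidth interval (bw_low, bw_high) to a time interval.

    Higher bandwidth means shorter time, so the bounds swap.
    """
    if bw_low > bw_high:
        raise ValueError("bw_low must not exceed bw_high")
    return expected_time(numbytes, bw_high), expected_time(numbytes, bw_low)


# Hypothetical numbers: a 40 MB transfer with a predicted interval of 1 to 4 MB/s.
fastest, slowest = expected_time_range(40_000_000, 1_000_000, 4_000_000)
print(f"expected time between {fastest:.1f} s and {slowest:.1f} s")
```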
5
Available Bandwidth
• Maximum rate a path can offer a flow without slowing other flows
  – pathchar, cprobe, nettimer, delphi, IGI, pathchirp, pathload …
• Available bandwidth can differ significantly from TCP throughput
• Not real time, takes at least tens of seconds to run
6
Simple TCP Benchmarking
• Benchmark paths with a single small probe
  – BW = ProbeSize/Time
  – Widely used: Network Weather Service (NWS) and others (Remos benchmarking collector)
• Not accurate for large transfers on the current high speed Internet
  – Numerous papers show this and attempt to fix it
7
Fixing Simple TCP Benchmarking
• Logs [Sundharshan]: correlate real transfer measurements with benchmarking measurements
  • Recent transfers needed
  • Similar size transfers needed
  • Measurements at application chosen times
• CDF-matching [Swany]: correlate CDF of real transfer measurements with CDF of benchmarking measurements
  • Recent transfers still needed
  • Measurements at application chosen times
8
Analysis of TCP
• Extensive research on TCP throughput modeling in networking community
• Really intended to build better TCPs
• Difficult to use models online because of hard-to-measure parameters
  • Future loss rate and RTT
• Note: we measure goodput
9
Our Measurement Study
• PlanetLab and additional machines
  – Located all over the world
• Measurements of throughput
  – Wide open socket buffers (1-3 MB)
  – Simple ttcp-like client/server
  – scp
  – GridFTP
• Four separate sets of measurements
10
Distribution Set
• For analysis of TCP throughput stability and distributions
• 60 randomly chosen paths among PlanetLab machines
• 1.6 million transfers (client/server)
  – 100 KB, 200 KB, 400 KB, … 10 MB flows
  – 3000 consecutive transfers per path + flow size
11
Correlation Set
• For studying correlation between throughput and flow size, initial testing of algorithm
• 60 randomly chosen paths among PlanetLab machines
• 2.4 million transfers, 270 thousand runs, client/server
  – 100 KB, 200 KB, 400 KB, … 10 MB flows
  – Run = sweep flow size for path
12
Verification Set
• Test algorithm
• 30 randomly chosen paths among PlanetLab machines and others
• 4800 transfers, 300 runs, scp and GridFTP
  – 5 KB to 1 GB flows
  – Run = sweep flow size for path
13
Online Evaluation Set
• Test online algorithm
• 50 randomly chosen paths among PlanetLab machines and others
• 14000 transfers, scp and GridFTP
  – 40 MB or 160 MB file, randomly chosen size
  – 10 days
15
Why Does The Correlation Exist?
• Slow start and user effects [Zhang]
• Extant flows
• Non-negligible startup overheads
  – Control messages in scp and GridFTP
• Residual slow start effect
  – SACK results in slow convergence to equilibrium
16
Why Simple Benchmarking Fails
[Figure: transfer time (sec) versus file size (KB), file sizes from 0 to 35000 KB]
Probes are too small
Need more than one probe to capture correlation
17
Our Approach
[Figure: transfer time (sec) versus file size (KB), file sizes from 0 to 35000 KB]
Two consecutive probes, both larger than the noise region
18
Our Approach
• Two consecutive probes are integrated into a single probe
  – 400 KB, 800 KB in single 800 KB probe
[Figure: timeline with marks at 0, T1, and T2, labeled "Probe one" and "Probe two"]
19
Our Approach
Model, where x is the flow size, T is the transfer time, and TP is the throughput:
$$T = A + Bx$$
$$TP = \frac{x}{T} = \frac{x}{A + Bx}$$
Solve for A and B
Predict throughput for some other transfer
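The two steps on this slide can be sketched in a few lines, assuming the model is the linear relation T = A + B·x between flow size x and transfer time T, solved exactly from two probe points. The function names and the probe timings below are illustrative assumptions, not measurements from the study.

```python
def fit_linear_model(x1, t1, x2, t2):
    """Solve T = A + B*x from two (flow size, transfer time) points.

    Sizes are in bytes, times in seconds. A is a fixed startup cost,
    and 1/B is the throughput the path approaches for very large flows.
    """
    if x1 == x2:
        raise ValueError("the two probe sizes must differ")
    b = (t2 - t1) / (x2 - x1)
    a = t1 - b * x1
    return a, b


def predict_throughput(a, b, x):
    """Predicted throughput TP = x / (A + B*x), in bytes per second."""
    return x / (a + b * x)


# Hypothetical probe: the 400 KB mark is reached at 0.5 s, the 800 KB mark at 0.75 s.
a, b = fit_linear_model(400_000, 0.5, 800_000, 0.75)

# A single-probe estimate (BW = ProbeSize/Time) versus the model, for a 40 MB flow.
naive = 800_000 / 0.75
modeled = predict_throughput(a, b, 40_000_000)
print(f"single-probe estimate: {naive:.0f} B/s, model prediction: {modeled:.0f} B/s")
```

With these made-up numbers the single-probe estimate is noticeably lower than the model's prediction for the large flow, because the fixed cost A weighs heavily on a small probe and much less on a large transfer.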
20
Model Fit is Excellent
Correlation Set
Low and normally distributed relative errors at all flow sizes
21
Stability
• How long does the TCP throughput function remain stable?
  – How frequently should we probe the path?
• What’s the distribution of throughput around the function (i.e., the error)?
24
Online DualPats Algorithm
• Fetch probe sequence for destination
  – Start probing process if no data exists
• Project probe sequence ahead
  – 20 point moving average over values with current sampling interval
• Apply model using projected data
• Return result
  – Confidence interval computed using normality assumptions
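The projection and confidence-interval steps might look like the sketch below. It assumes the probe sequence is a list of per-probe numeric values (for example, throughput estimates), that "projecting ahead" means taking the mean of the last 20 values, and that the interval is mean ± z·(sample standard deviation) under a normal model. The slides do not spell out these details, so the function names and the exact interval formula are assumptions.

```python
from statistics import NormalDist, mean, stdev


def project_ahead(values, window=20):
    """Project the next value as the moving average of the last `window` values."""
    if not values:
        raise ValueError("no probe data yet; start the probing process first")
    return mean(values[-window:])


def confidence_interval(values, p=0.95, window=20):
    """Two-sided interval around the projection, assuming normally distributed values."""
    recent = values[-window:]
    if len(recent) < 2:
        raise ValueError("need at least two values to estimate spread")
    center = mean(recent)
    z = NormalDist().inv_cdf(0.5 + p / 2)  # e.g. about 1.96 for p = 0.95
    half_width = z * stdev(recent)
    return center - half_width, center + half_width


# Hypothetical probe history (arbitrary units).
history = [float(v) for v in range(1, 31)]
low, high = confidence_interval(history, p=0.95)
print(project_ahead(history), low, high)
```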
25
Dynamic Sampling Rate
• Adjust sampling interval to correspond to the path’s stable intervals
• Limit rate (20 to 1200 seconds)
• Additive increase / additive decrease of the sampling interval, based on the difference between the last two probes
< 5% => increase interval
> 15% => decrease interval
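A minimal sketch of this adjustment rule follows. The 5% and 15% thresholds and the 20 to 1200 second limits come from the slide; the step size of 20 seconds, the definition of "difference" as a relative change against the earlier probe, and the function name are assumptions made for illustration.

```python
MIN_INTERVAL_S = 20      # lower limit from the slide
MAX_INTERVAL_S = 1200    # upper limit from the slide


def adjust_interval(interval_s, prev_value, last_value, step_s=20):
    """Additively adjust the sampling interval from the last two probe values.

    step_s is an assumed step size; the slides do not state one.
    """
    if prev_value == 0:
        raise ValueError("previous probe value must be nonzero")
    rel_diff = abs(last_value - prev_value) / abs(prev_value)
    if rel_diff < 0.05:
        interval_s += step_s      # path looks stable: probe less often
    elif rel_diff > 0.15:
        interval_s -= step_s      # path is changing: probe more often
    # Between 5% and 15% the interval is left unchanged.
    return max(MIN_INTERVAL_S, min(MAX_INTERVAL_S, interval_s))


# Hypothetical probe values: a 2% change lengthens a 60 s interval to 80 s.
print(adjust_interval(60, 100.0, 102.0))
```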
26
Finding Sufficiently Large Probe Size
• Default values: 400 KB / 800 KB
• Upper bound
• Additive increase until prediction errors are less than the threshold, all with the same sign.
27
Evaluation
[Figure: Online Evaluation Set. Cumulative distribution P[mean error < X] from 0 to 1, plotted against relative error from -0.4 to 0.4, with curves for mean relative error and mean abs(relative error)]
• Slight conservative bias
• >90% of predictions have <35% error
28
Conclusions
• Algorithm for predicting the TCP throughput as function of flow size
• Minimal active probing
• Dynamic probe rate adjustment
• Explaining flow size / throughput correlation
• Explaining why simple active probing fails
Large scale empirical study
29
For More Info
• Prescience Lab
  – http://plab.cs.northwestern.edu
• Aqua Lab
  – http://aqualab.cs.northwestern.edu
• D. Lu, Y. Qiao, P. Dinda, and F. Bustamante, Modeling and Taming Parallel TCP on the Wide Area Network, IPDPS 2005.
• Y. Qiao, J. Skicewicz, P. Dinda, An Empirical Study of the Multiscale Predictability of Network Traffic, HPDC 2004.