1
Yi Qiao Jason Skicewicz Peter A. Dinda
Prescience Laboratory
Department of Computer Science
Northwestern University
Evanston, IL 60201
An Empirical Study of the Multiscale Predictability of Network Traffic
2
Talk in a Nutshell
In-depth trace-based study of predictability of link bandwidth at different resolutions
– Binning and wavelet approximations
• Generalizations very difficult to make
• Aggregation often helps
• Predictability does not monotonically increase with decreasing resolution
• Predictability largely independent of mechanism
• Simple models sufficient
3
Outline
• Motivation and Related Work – MTTA
• Traces
• Binning Approximations and Wavelet Approximations
• Results
• Conclusions
4
Background
• Why study predictability of network traffic?
– Adaptive applications
– Congestion control
– Admission control
– Network management
• Eventual goal
– Providing application-level network traffic queries to adaptive applications
• Fine-grain app, e.g., immersive audio
• Coarse-grain app, e.g., scientific app on grids
5
Message Transfer Time Advisor
Target API
(conf_lower, conf_upper, conf_expected) =
    MTTA::PredictTransferTime(src_ip_address,
                              dest_ip_address,
                              message_size,
                              transport_protocol,
                              conf_level);
Application Query (to MTTA)
Time for transferring a 10 MB message, confidence level = 0.95?
Query Answer (from MTTA)
Expected transfer time is 50 seconds, confidence interval is [45.9, 54.1] seconds
• Our contributions here
– Predicting aggregate background traffic
– Dealing with a wide range of time resolutions
6
Our Approach
[Diagram labels: Network, Sensor, High-Resolution Bandwidth Signal, Predictor, High-Resolution Prediction, Low-Resolution Prediction, MTTA, Resolution Selection, Application Query, Query Answer, App]
7
Multiresolution Views of Resource Signals
• Two Different Approaches
– Binning
• Commonly used by existing network measurement tools
– Wavelets
• N-level streaming wavelet transform yielding detail signals and approximation signals
• Wavelet domain enables many useful analyses
8
Questions For This Study
• What is the nature of predictability of network resource signals?
• How does predictability depend on resolution?
• What predictive models should be used?
• What are the implications for the MTTA?
9
Tools And Data
• RPS: Resource Prediction System Toolkit for Distributed Systems (publicly available from us)
• Tsunami: Wavelet Toolkit for Distributed Systems (publicly available from us)
• NLANR Trace Archive (publicly accessible)
• Internet Traffic Archive (publicly accessible)
10
Relevant Previous Work
• Groschwitz, et al.: ARIMA models to predict long-term NSFNET traffic growth
• Basu, et al.: Modeling of FDDI, Ethernet LAN, and NSFNET entry/exit point traffic
• Leland, et al.: Self-similarity of Ethernet traffic
• Wolski, et al.: Network Weather Service
• Sang and Li: Multi-step prediction of network traffic using ARMA and MMPP
– Both aggregation and smoothing increase predictability
– Our finding: predictability often does not increase monotonically with smoothing
12
Trace Classification and Analysis
[Diagram labels: Large number and high variety of traces, Time-series, Histogram, PSD, ACF, Classification Scheme, Conclusions]
• Repeated the analysis for a wide range of resolutions
Y. Qiao and P. Dinda, Network Traffic Analysis, Classification, and Prediction, Technical Report NWU-CS-02-11, Department of Computer Science, Northwestern University, January, 2003.
13
Traces
Name      Number of Raw Traces  Classes  Studied  Duration     Range of Resolutions
NLANR     180                   12       39       90 s         1, 2, 4, …, 1024 ms
AUCKLAND  34                    8        34       1 d          0.125, 0.25, …, 1024 s
BC        4                     N/A      4        1 h, 1 d     7.8125 ms to 16 s
Totals    218                   N/A      77       90 s to 1 d  1 ms to 1024 s
15
Binning Approximations
• Methodology
– Commonly used by existing network measurement tools
– Averages over N non-overlapping, power-of-two bins
[Figure: the signal at bin sizes of 1 s, 8 s, 128 s, and 1024 s, in order of increasing bin size]
16
Wavelet Approximations
• Parameterized by a wavelet basis function
– Equivalent to binning approach when using the Haar wavelet
• Methodology
– N-level streaming wavelet transform
– The D8 wavelet was used for our study
[Figure: approximation signals at Level 0, Level 1, and Level 2, in order of increasing approximation level]
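The stated equivalence between Haar wavelet approximations and binning can be checked with a small sketch. It assumes an orthonormal Haar filter (pairwise sums divided by the square root of 2) applied in batch, not the streaming transform or the D8 basis used in the study; the function names are hypothetical, and "level" here simply counts the number of averaging steps, which may not match the level numbering in the figure.

```python
import math


def haar_approximation(signal, level):
    """Apply `level` steps of the orthonormal Haar low-pass filter."""
    approx = list(signal)
    for _ in range(level):
        # Pair up neighbours; an unpaired trailing sample is dropped.
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2)
                  for i in range(0, len(approx) - 1, 2)]
    return approx


def haar_as_bin_averages(signal, level):
    """Rescale Haar approximation coefficients into bin averages.

    After `level` steps each coefficient equals 2**(level/2) times the
    mean of 2**level consecutive samples, so dividing by that factor
    recovers the binned signal.
    """
    scale = math.sqrt(2) ** level
    return [a / scale for a in haar_approximation(signal, level)]


fine = [1, 3, 5, 7, 2, 4, 6, 8]   # made-up samples, for illustration only
coarse = haar_as_bin_averages(fine, 2)
print([round(c, 6) for c in coarse])  # [4.0, 5.0], the averages over bins of 4
```

With a longer basis such as D8, each approximation coefficient is a weighted combination of more than two neighbours, so the result is a smoother low-pass view instead of a plain bin average.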
20
One-step Ahead Predictions
[Figure: a one-step-ahead prediction made from "now", shown for a high-resolution signal and for a low-resolution signal]
Lower Resolution => Longer Interval Into Future
21
Predictability Ratio
• Predictability ratio = variance of error signal over variance of resource signal = $\sigma_e^2 / \sigma^2$
– Fraction of the “surprise” in the signal left after prediction
• The smaller the ratio, the better the predictability
Example:
Resource signal = [1 4 10 9]
Prediction = [2 3 9 10]
Error signal = [1 -1 -1 1]
$\sigma^2 = 18$, $\sigma_e^2 = 1.33$
Predictability ratio = 1.33/18 = 0.07389
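The worked example can be reproduced with a short sketch. It assumes the sample variance (n - 1 denominator), which matches the variances of 18 and 1.33 in the example, and defines the error as prediction minus signal, which matches the error signal shown; the function name `predictability_ratio` is hypothetical.

```python
from statistics import variance


def predictability_ratio(signal, prediction):
    """Variance of the prediction error divided by variance of the signal."""
    errors = [float(p - s) for s, p in zip(signal, prediction)]
    return variance(errors) / variance([float(s) for s in signal])


signal = [1, 4, 10, 9]
prediction = [2, 3, 9, 10]
ratio = predictability_ratio(signal, prediction)
print(round(ratio, 4))  # 0.0741
```

The unrounded variances give about 0.0741; dividing the rounded error variance 1.33 by 18 gives the slightly smaller 0.07389.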
22
Wide Range of Prediction Models
• Simple Models
– MEAN: long-term mean of signal
– LAST: last observed value as prediction
– BM(32): average over a history window of optimal size
• Box-Jenkins Models
– AR(8), AR(32): pure autoregressive
– MA(8): pure moving average
– ARMA(4,4): autoregressive moving average
– ARIMA(4,1,4), ARIMA(4,2,4): integrated ARMA
• Long-range dependence model
– ARFIMA(4,-1,4): “fractionally integrated” ARMA
• Nonlinear model
– MANAGED AR(32): TAR variant
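The simple models in the list above are easy to sketch as one-step-ahead predictors. This is an illustration under stated assumptions, not the RPS code: MEAN is taken as the mean of all history seen so far, and the windowed mean uses a fixed window, whereas BM(32) as described selects the window size; all function names are hypothetical.

```python
def one_step_predictions(signal, predictor, warmup=1):
    """Predict signal[t] from signal[:t] for every t >= warmup."""
    return [predictor(signal[:t]) for t in range(warmup, len(signal))]


def last(history):
    """LAST: the most recent observation is the prediction."""
    return history[-1]


def mean(history):
    """MEAN: average of everything observed so far (assumed definition)."""
    return sum(history) / len(history)


def window_mean(window):
    """Fixed-window mean; a simplification of BM, which picks the window size."""
    def predict(history):
        recent = history[-window:]
        return sum(recent) / len(recent)
    return predict


signal = [2.0, 4.0, 6.0, 8.0]   # made-up samples, for illustration only
last_preds = one_step_predictions(signal, last)
mean_preds = one_step_predictions(signal, mean)
win_preds = one_step_predictions(signal, window_mean(2))
print(last_preds)  # [2.0, 4.0, 6.0]
print(mean_preds)  # [2.0, 3.0, 4.0]
print(win_preds)   # [2.0, 3.0, 5.0]
```

Each list lines up with `signal[1:]`, so the error signal for a predictability ratio is the element-wise difference between the predictions and `signal[1:]`.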
23
Binning Study on NLANR Traces
– Generally unpredictable
– Predictability worse at coarser granularities
[Plot, log scale: curves labeled LAST, BM(32), and "With AR Comp"]
24
Binning Study on BC Traces
– Weak predictability
– Predictability not always monotonically increasing with smoothing
[Plot: curves labeled LAST, MA(8), and "With AR Comp"]
25
Results for AUCKLAND Traces
• General predictability of traces
• How predictability changes with different resolutions
• Relative performance of different predictive models
3 different behaviors for binning study, and 4 different behaviors for wavelet study
26
AUCKLAND Behavior 1 - Binning
– 14 of 34 traces
– Predictability converges to a high level with increasing bin size
– Commensurate with conclusions from earlier papers
[Plot: curves labeled MA(8), LAST, BM(8), and "With AR Comp"]
27
AUCKLAND Behavior 1 - Wavelet
– 7 of the 34 traces
– Generally shows a monotonic relationship with approximation level, except for outliers
– Relatively uncommon behavior
[Plot: curves labeled LAST, MA(8), and "With AR Comp"]
28
AUCKLAND Behavior 2 - Binning
– 15 of 34 traces
– Presence of a sweet spot: an optimal bin size that maximizes predictability
– Contradicts earlier work
[Plot: curves labeled MA(8), LAST, BM(8), and "With AR Comp"; the sweet spot is marked at the point of maximum predictability]
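One way to look for such a sweet spot is to sweep the bin sizes and compare predictability ratios. The sketch below uses only the LAST predictor and a synthetic signal generated for illustration; it is not one of the AUCKLAND traces, and the function names are hypothetical. No particular outcome is claimed for this synthetic data.

```python
import random
from statistics import variance


def bin_signal(signal, bin_size):
    """Average consecutive, non-overlapping groups of bin_size samples."""
    usable = len(signal) - len(signal) % bin_size
    return [sum(signal[i:i + bin_size]) / bin_size
            for i in range(0, usable, bin_size)]


def last_ratio(signal):
    """Predictability ratio of the LAST predictor on one signal."""
    # Error at step t is prediction (previous value) minus actual value.
    errors = [signal[t - 1] - signal[t] for t in range(1, len(signal))]
    return variance(errors) / variance(signal[1:])


def sweep_bin_sizes(signal, bin_sizes):
    """Return {bin_size: predictability ratio} for each candidate bin size."""
    return {b: last_ratio(bin_signal(signal, b)) for b in bin_sizes}


# Synthetic stand-in: a slow square wave plus Gaussian noise.
random.seed(0)
synthetic = [10 + 5 * ((t // 64) % 2) + random.gauss(0, 3) for t in range(4096)]

ratios = sweep_bin_sizes(synthetic, [1, 2, 4, 8, 16, 32, 64])
sweet_spot = min(ratios, key=ratios.get)  # smallest ratio = most predictable
for size, ratio in ratios.items():
    print(size, round(ratio, 3))
print("most predictable bin size:", sweet_spot)
```

A ratio that first falls and then rises again as the bin size grows corresponds to the sweet-spot behavior; a ratio that keeps falling corresponds to Behavior 1.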
29
AUCKLAND Behavior 2 - Wavelet
– 13 of the 34 AUCKLAND traces
– A sweet spot at a particular scale
– Contradicts earlier work
[Plot: curves labeled MA(8), LAST, and "With AR Comp"; the sweet spot is marked at the point of maximum predictability]
30
AUCKLAND Behavior 3 - Binning
– 11 of the 34 traces
– Non-monotonic relationship between scale and predictability
– Predictability weaker than behaviors 1 and 2
[Plot: curves labeled LAST, BM(8), MA(8), and "With AR Comp"]
31
AUCKLAND Behavior 3 - Wavelet
– Uncommon, 5 of 34 traces
– Multiple peaks and valleys at different approximations
– Predictability not as strong as the earlier two classes
[Plot: curves labeled LAST, MA(8), and "With AR Comp"]
32
AUCKLAND Behavior 4 - Wavelet
– 3 of the 34 traces
– Predictability ratio plateaus, then the signal becomes more predictable at the coarsest resolutions
– Behavior did not occur in binning study
[Plot: curves labeled LAST, MA(8), and "With AR Comp"]
33
Conclusions
In-depth trace-based study of predictability of link bandwidth at different resolutions
– Binning and wavelet approximations
• Generalizations very difficult to make
• Aggregation often helps
• Predictability does not monotonically increase with decreasing resolution
• Predictability largely independent of mechanism
• Simple models sufficient
34
Implications for Message Transfer Time Advisor (MTTA)
• Online multiscale prediction system to support MTTA is feasible
– Likely to be more accurate for WAN traffic
• Often a natural time scale for prediction
– Adaptation likely best here
• Prediction system must itself adapt to changing network behavior
35
Current and Future Work
• Wide-area TCP throughput characterization and prediction
– D. Lu, Y. Qiao, P. Dinda, and F. Bustamante, Characterizing and Predicting TCP Throughput on the Wide Area Network, Technical Report NWU-CS-04-34, Department of Computer Science, Northwestern University, April, 2004.
• Wide-area parallel TCP throughput modeling and prediction
– D. Lu, Y. Qiao, P. Dinda, and F. Bustamante, Modeling and Taming Parallel TCP on the Wide Area Network, Technical Report NWU-CS-04-35, May, 2004.
• Tsunami Wavelet Toolkit
– J. Skicewicz and P. Dinda, Tsunami: A Wavelet Toolkit for Distributed Systems, Technical Report NWU-CS-03-16, Department of Computer Science, Northwestern University, November, 2003.
36
For More Information
• Prescience Lab
– http://plab.cs.northwestern.edu
• Tsunami and RPS Available for Download
– http://rps.cs.northwestern.edu
• Contact
– [email protected]
37
AUCKLAND Behavior 1 - Binning
– 14 of 34 traces
– Predictability converges to a high level with increasing bin size
– Commensurate with conclusions from earlier papers
38
AUCKLAND Behavior 1 - Wavelet
– 7 of the 34 traces
– Generally shows a monotonic relationship with approximation level, except for outliers
– Relatively uncommon behavior
39
AUCKLAND Behavior 2 - Binning
– 15 of 34 traces
– Presence of a sweet spot, an optimal bin size that maximizes predictability
– Contradicts the conclusions of earlier work
40
AUCKLAND Behavior 2 - Wavelet
– 13 of the 34 AUCKLAND traces
– A sweet spot at a particular approximation scale for maximum predictability
– Contradicts earlier work
41
AUCKLAND Behavior 3 - Binning
– Uncommon, 5 of 34 traces
– Multiple peaks and valleys at different bin sizes
– Predictability not as strong as the earlier two classes
42
AUCKLAND Behavior 3 - Wavelet
– 11 of the 34 traces
– Non-monotonic relationship between the approximation scale and the predictability
– Predictability weaker than class 1