network traffic modeling mark crovella boston university computer science

89
Network Traffic Modeling Mark Crovella Boston University Computer Science

Post on 18-Dec-2015

217 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Network Traffic Modeling Mark Crovella Boston University Computer Science

Network Traffic Modeling

Mark Crovella

Boston University Computer Science

Page 2: Network Traffic Modeling Mark Crovella Boston University Computer Science

Outline of Day

• 9:00 – 10:45 Lecture 1, including break• 10:45 – 12:00 Exercise Set 1• 12:00 – 13:30 Lunch• 13:30 – 15:15 Lecture 2, including

break• 15:15 – 17:00 Exercise Set 2

Page 3: Network Traffic Modeling Mark Crovella Boston University Computer Science

The Big Picture

• There are two main uses for Traffic Modeling:– Performance Analysis

• Concerned with questions such as delay, throughput, packet loss.

– Network Engineering and Management:• Concerned with questions such as capacity

planning, traffic engineering, anomaly detection.

• Some principal differences are that of timescale and stationarity.

Page 4: Network Traffic Modeling Mark Crovella Boston University Computer Science

Relevant Timescales

1 hour 1 day 1 week1 usec 1 sec

Network Engineering effects happen on long timescales

from an hour to months

Performance effects happen on short timescales

from nanoseconds up to an hour

Page 5: Network Traffic Modeling Mark Crovella Boston University Computer Science

Stationarity, informally

• “A stationary process has the property that the mean, variance and autocorrelation structure do not change over time.”

• Informally: “we mean a flat looking series, without trend, constant variance over time, a constant autocorrelation structure over time and no periodic fluctuations (seasonality).”

NIST/SEMATECH e-Handbook of Statistical Methods

http://www.itl.nist.gov/div898/handbook/

Page 6: Network Traffic Modeling Mark Crovella Boston University Computer Science

The 1-Hour / Stationarity Connection

• Nonstationarity in traffic is primarily a result of varying human behavior over time

• The biggest trend is diurnal• This trend can usually be ignored up to timescales

of about an hour, especially in the “busy hour”

Page 7: Network Traffic Modeling Mark Crovella Boston University Computer Science

Outline

• Morning: Performance Evaluation– Part 0: Stationary assumption – Part 1: Models of fine-timescale behavior– Part 2: Traffic patterns seen in practice

• Afternoon: Network Engineering– Models of long-timescale behavior– Part 1: Single Link– Part 2: Multiple Links

Page 8: Network Traffic Modeling Mark Crovella Boston University Computer Science

Morning Part 1:Traffic Models for Performance Evaluation

• Goal: Develop models useful for– Queueing analysis

• eg, G/G/1 queues

– Other analysis• eg, traffic shaping

– Simulation• eg, router or network simulation

Page 9: Network Traffic Modeling Mark Crovella Boston University Computer Science

A Reasonable Approach

• Fully characterizing a stochastic process can be impossible – Potentially infinite set of properties to capture– Some properties can be very hard to estimate

• A reasonable approach is to concentrate on two particular properties:

marginal distribution and autocorrelation

Page 10: Network Traffic Modeling Mark Crovella Boston University Computer Science

Marginals and Autocorrelation

Characterizing a process in terms of these two properties gives you – a good approximate understanding of the

process, – without involving a lot of work, – or requiring complicated models, – or requiring estimation of too many

parameters.

… Hopefully!

Page 11: Network Traffic Modeling Mark Crovella Boston University Computer Science

Marginals

Given a stochastic process X={Xi}, we are interested in the distribution of any Xi:

i.e., f(x) = P(Xi=x)

Since we assume X is stationary, it doesn’t matter which Xi we pick.

Estimated using a histogram:

Page 12: Network Traffic Modeling Mark Crovella Boston University Computer Science

Histograms and CDFs

• A Histogram is often a poor estimate of the pdf f(x) because it involves binning the data

• The CDF F(x) = P[Xi <= x] will have a point for each distinct data value; can be much more accurate

Page 13: Network Traffic Modeling Mark Crovella Boston University Computer Science

Modeling the Marginals• We can form a compact summary

of a pdf f(x) if we find that it is well described by a standard distribution – eg– Gaussian (Normal)– Exponential– Poisson– Pareto– Etc

• Statistical methods exist for – asking whether a dataset is well

described by a particular distribution– Estimating the relevant parameters

Page 14: Network Traffic Modeling Mark Crovella Boston University Computer Science

Distributional Tails

• A particularly important part of a distribution is the (upper) tail

• P[X>x]• Large values

dominate statistics and performance

• “Shape” of tail critically important

Page 15: Network Traffic Modeling Mark Crovella Boston University Computer Science

Light Tails, Heavy Tails

• Light – Exponential or faster decline

• Heavy – Slower than any exponential

f2(x) = x-2

f1(x) = 2 exp(-2(x-1))

Page 16: Network Traffic Modeling Mark Crovella Boston University Computer Science

Examining Tails

• Best done using log-log complementary CDFs

• Plot log(1-F(x)) vs log(x)

1-F2(x)

1-F1(x)

Page 17: Network Traffic Modeling Mark Crovella Boston University Computer Science

Heavy Tails Arrive

pre-1985: Scattered measurements note high variability in computer systems workloads

1985 – 1992: Detailed measurements note “long” distributional tails– File sizes– Process lifetimes

1993 – 1998: Attention focuses specifically on (approximately) polynomial tail shape: “heavy tails”

post-1998: Heavy tails used in standard models

Page 18: Network Traffic Modeling Mark Crovella Boston University Computer Science

Power Tails, Mathematically

We say that a random variable X is power tailed if:

where a ~ b means .1lim)(

)( xb

xa

x

Focusing on polynomial shape allowsParsimonious description

Capture of variability in parameter

20 ~][ xxXP

Page 19: Network Traffic Modeling Mark Crovella Boston University Computer Science

A Fundamental Shift in Viewpoint• Traditional modeling methods have focused

on distributions with “light” tails– Tails that decline exponentially fast (or faster)– Arbitrarily large observations are vanishingly rare

• Heavy tailed models behave quite differently– Arbitrarily large observations have non-negligible

probability– Large observations, although rare, can dominate a

system’s performance characteristics

Page 20: Network Traffic Modeling Mark Crovella Boston University Computer Science

Heavy Tails are Surprisingly Common

• Sizes of data objects in computer systems– Files stored on Web servers– Data objects/flow lengths traveling through the

Internet– Files stored in general-purpose Unix filesystems– I/O traces of filesystem, disk, and tape activity

• Process/Job lifetimes• Node degree in certain graphs

– Inter-domain and router structure of the Internet– Connectivity of WWW pages

• Zipf’s Law

Page 21: Network Traffic Modeling Mark Crovella Boston University Computer Science

Evidence: Web File Sizes

Barford et al., World Wide Web, 1999

Page 22: Network Traffic Modeling Mark Crovella Boston University Computer Science

Evidence: Process Lifetimes

Harchol-Balter and Downey, ACM TOCS,

1997

Page 23: Network Traffic Modeling Mark Crovella Boston University Computer Science

The Bad News

• Workload metrics following heavy tailed distributions are extremely variable

• For example, for power tails:– When 2, distribution has infinite variance– When 1, distribution has infinite mean

• In practice, empirical moments are slow to converge – or nonconvergent

• To characterize system performance, either:– Attention must shift to distribution itself, or – Attention must be paid to timescale of

analysis

Page 24: Network Traffic Modeling Mark Crovella Boston University Computer Science

Heavy Tails in Practice

Power tailswith=0.8

Large observations dominate statistics (e.g., sample mean)

Page 25: Network Traffic Modeling Mark Crovella Boston University Computer Science

Autocorrelation

• Once we have characterized the marginals, we know a lot about the process.

• In fact, if the process consisted of i.i.d. samples, we would be done.

• However, most traffic has the property that its measurements are not independent.

• Lack of independence usually results in autocorrelation

• Autocorrelation is the tendency for two measurements to both be greater than, or less than, the mean at the same time.

Page 26: Network Traffic Modeling Mark Crovella Boston University Computer Science

Autocorrelation

Page 27: Network Traffic Modeling Mark Crovella Boston University Computer Science

Measuring Autocorrelation

Autocorrelation Function (ACF) (assumes stationarity):

R(k) = Cov(Xn,Xn+k) =

E[Xn Xn+k] – E2[X0]

Page 28: Network Traffic Modeling Mark Crovella Boston University Computer Science

ACF of i.i.d. samples

Page 29: Network Traffic Modeling Mark Crovella Boston University Computer Science

How Does Autocorrelation Arise?

Client Server

Internet(TCP/IP)

Request

Response

Network traffic is the superposition of flows

“click”

Page 30: Network Traffic Modeling Mark Crovella Boston University Computer Science

Why Flows?: Sources appear to be ON/OFF

P1:

P2:

P3:

•••

{ {ON OFF

Page 31: Network Traffic Modeling Mark Crovella Boston University Computer Science

Superposition of ON/OFF sources Autocorrelation

P1:P2:P3:

Page 32: Network Traffic Modeling Mark Crovella Boston University Computer Science

Morning Part 2: Traffic Patterns Seen in Practice

Traffic patterns on a link are strongly affected by two factors:– amount of multiplexing on the link

• Essentially – how many flows are sharing the link?

– Where flows are bottlenecked• Is each flow’s bottleneck on, or off the link?• Do all bottlenecks have similar rate?

Page 33: Network Traffic Modeling Mark Crovella Boston University Computer Science

Low Multiplexed Traffic

• Marginals: highly variable

• Autocorrelation: low

Page 34: Network Traffic Modeling Mark Crovella Boston University Computer Science

Highly MultiplexedTraffic

Page 35: Network Traffic Modeling Mark Crovella Boston University Computer Science

High Multiplexed, Bottlenecked Traffic

• Marginals: tending to Gaussian

• Autocorrelation: high

Page 36: Network Traffic Modeling Mark Crovella Boston University Computer Science

Highly Mutiplexed, Mixed-Bottlenecks

dec-pkt-1

(Internet Traffic Archive)

Page 37: Network Traffic Modeling Mark Crovella Boston University Computer Science

Alpha and Beta Traffic

• ON/OFF model revisited:High variability in connection rates (RTTs)

Low rate = beta High rate = alpha

fractional Gaussian noise stable Levy noise

+

=

+

+

=Rice U., SPIN Group

Page 38: Network Traffic Modeling Mark Crovella Boston University Computer Science

Long Range Dependence

R[k] ~ k-a 0 < a < 1R[k] ~ a-k a > 1

H=1-a/2

Page 39: Network Traffic Modeling Mark Crovella Boston University Computer Science

Correlation and Scaling

• Long range dependence affects how variability scales with timescale

• Take a traffic timeseries Xn, sum it over

blocks of size m– This is equivalent to observing the original

process on a longer timescale

• How do the mean and std dev change?– Mean will always grow in proportion to m– For i.i.d. data, the std dev will grow in

proportion to sqrt(m)– So, for i.i.d. data, the process is “smoother”

at longer timescale

Page 40: Network Traffic Modeling Mark Crovella Boston University Computer Science

Self-similarity: unusual scaling of variability

• Exact self-similarity of a zero-mean, stationary process Xn

• H: Hurst parameter 1/2 < H < 1

• H = 1/2 for i.i.d. Xn

• LRD leads to (at least) asymptotic s.s.

0,allfor 1)1(

tmtm

mtii

Hd

t XmX N

Page 41: Network Traffic Modeling Mark Crovella Boston University Computer Science

Self Similarity in Practice

H=0.95

H=0.50

10ms 1s 100s

Page 42: Network Traffic Modeling Mark Crovella Boston University Computer Science

The Great Wave (Hokusai)

Page 43: Network Traffic Modeling Mark Crovella Boston University Computer Science

How Does Self-Similarity Arise?

Self-similarity Autocorrelation Flows

Autocorrelation declines like a power law Distribution of flow lengths has power law tail

Page 44: Network Traffic Modeling Mark Crovella Boston University Computer Science

Power Tailed ON/OFF sources Self-Similarity

P1:P2:P3:

{ {ON OFF

Page 45: Network Traffic Modeling Mark Crovella Boston University Computer Science

Measuring Scaling Properties• In principle, one can

simply aggregate Xn

over varying sizes of m, and plot resulting variance as a function of m

• Linear behavior on a log-log plot gives an estimate of H (or a).

• Slope > -1 indicates LRD

WARNING: this method is very sensitive to violation of assumptions!

Page 46: Network Traffic Modeling Mark Crovella Boston University Computer Science

Better: Wavelet-based estimation

Veitch and Abry

Page 47: Network Traffic Modeling Mark Crovella Boston University Computer Science

Optional Material: PerformanceImplications of Self-Similarity

Page 48: Network Traffic Modeling Mark Crovella Boston University Computer Science

Performance implications of S.S.

• Asymptotic queue length distribution (G/G/1)– For SRD Traffic:

– For LRD Traffic:

– Severe - but, realistic?

)exp(~][ 1xcxQP

)exp(~][ 222

HxcxQP

Page 49: Network Traffic Modeling Mark Crovella Boston University Computer Science

Evaluating Self-Similarity

• Queueing Models like these are open systems– delay does not feed back to source

• TCP dynamics are not being considered– packet losses cause TCP to slow down

• Better approach:– Closed network, detailed modeling of TCP

dynamics– self-similarity traffic generated “naturally”

Page 50: Network Traffic Modeling Mark Crovella Boston University Computer Science

Simulating Self-Similar Traffic

• Simulated network with multiple clients, servers

• Clients alternative between requests, idle times

• Files drawn from heavy-tailed distribution– Vary a to vary self-similarity of traffic

• Each request is simulated at packet level, including detailed TCP flow control

• Compare with UDP (no flow control) as an example of an open system

Page 51: Network Traffic Modeling Mark Crovella Boston University Computer Science

Traffic Characteristics

• Self-similarity varies smoothly as function of a

Page 52: Network Traffic Modeling Mark Crovella Boston University Computer Science

Performance Effects:Packet Loss

Open Loop: UDP Closed Loop: TCP

Page 53: Network Traffic Modeling Mark Crovella Boston University Computer Science

Performance Effects:TCP Throughput

Page 54: Network Traffic Modeling Mark Crovella Boston University Computer Science

Performance Effects:Buffer Queue Lengths

Open Loop: UDP Closed Loop: TCP

Page 55: Network Traffic Modeling Mark Crovella Boston University Computer Science

Severity of Packet Delay

Page 56: Network Traffic Modeling Mark Crovella Boston University Computer Science

Performance Implications

• Self-similarity is present in Web traffic– The internet’s most popular application

• For the Web, the causes of s.s. can be traced to the heavy-tailed distribution of Web file sizes– Caching doesn’t seem to affect things much– Multimedia tends to increase tail weight of

Web files– But, even text files alone appear to be

heavy-tailed

Page 57: Network Traffic Modeling Mark Crovella Boston University Computer Science

Performance Implications (continued)

• Knowing how s.s. arises allows us to recreate it “naturally” in simulation

• Effects of s.s. in simulated TCP networks– Packet loss not as high as open-loop models

might suggest– Throughput not a problem– Packet delays are the big problem

Page 58: Network Traffic Modeling Mark Crovella Boston University Computer Science

Morning Lab Exercises

• For each dataset, explore its marginals– Histograms– CDFs, CCDFs,– Log-log CCDFs to look at tails

• For each dataset, explore its correlations– ACFs– Logscale diagrams– Compare to scrambled dataset

• Study performance– Simple queueing– Compare to scrambled dataset

Page 59: Network Traffic Modeling Mark Crovella Boston University Computer Science

Afternoon: Network Engineering

• Moving from stationary domain to nonstationary

• Goal: Traffic models that are useful for– capacity planning– traffic engineering– anomaly / attack detection

• Two main variants– Looking at traffic on a single link at a time– Looking at traffic on multiple or all links in a

network simultaneously

Page 60: Network Traffic Modeling Mark Crovella Boston University Computer Science

Part 1: Single Link Analysis

• The general modeling framework that is most often used is “signal + noise”

• Sometimes interested in the signal, sometimes the noise…

• So, signal processing techniques are common– Frequency Domain / Spectral Analysis

• Generally based on FFT

– Time-Frequency Analysis• Generally based on wavelets

Page 61: Network Traffic Modeling Mark Crovella Boston University Computer Science

A Typical Trace

Notable features:periodicitynoisiness“spikes”

Page 62: Network Traffic Modeling Mark Crovella Boston University Computer Science

Capacity Planning

• Here, mainly interested in the “signal”• Want to predict long-term trends• What do we need to remove?

– Noise– Periodicity

K. Papagiannaki et al., Infocom 2003

Page 63: Network Traffic Modeling Mark Crovella Boston University Computer Science

Periodicity: Spectral Analysis

Page 64: Network Traffic Modeling Mark Crovella Boston University Computer Science

Denoising with Wavelets

Page 65: Network Traffic Modeling Mark Crovella Boston University Computer Science

Capacity Planning via Forecasting

Page 66: Network Traffic Modeling Mark Crovella Boston University Computer Science

Anomaly detection

• Goal: models that are useful in detecting– Network equipment failures– Network misconfigurations– Flash crowds– Attacks– Network Abuse

Page 67: Network Traffic Modeling Mark Crovella Boston University Computer Science

Traffic

High Freq.

Med Freq.

Low Freq.

P. Barford et al., Internet Measurement Workshop 2002

Misconfiguration detection via Wavelets

Page 68: Network Traffic Modeling Mark Crovella Boston University Computer Science

Flash Crowd Detection

Long Term Change in Mean Traffic(8 weeks)

Page 69: Network Traffic Modeling Mark Crovella Boston University Computer Science

Detecting DoS attacks

Page 70: Network Traffic Modeling Mark Crovella Boston University Computer Science

Afternoon Part 2: Multiple Links

• How to analyze traffic from multiple links?– Clearly, could treat as a collection of single

links, and proceed as before– But, want more: to detect trends and patterns

across multiple links

• Observation: multiple links share common underlying patterns– Diurnal variation should be similar across links– Many anomalies will span multiple links

• Problem is one of pattern extraction in high dimension – Dimension is number of links

Page 71: Network Traffic Modeling Mark Crovella Boston University Computer Science

Example Link Traces from a Single Network

Some have visible structure, some less so…

Page 72: Network Traffic Modeling Mark Crovella Boston University Computer Science

High Dimensionality: A General Strategy

• Look for a low-dimensional representation preserving the most important features of data

• Often, a high-dimensional structure may explainable in terms of a small number of independent variables

• Commonly used tool: Principal Component Analysis (PCA)

Page 73: Network Traffic Modeling Mark Crovella Boston University Computer Science

Principal Component Analysis

For any given dataset, PCA finds a new coordinate system that maps maximum variability in the data to a minimum number of coordinates

New axes are called Principal Axes or Components

Page 74: Network Traffic Modeling Mark Crovella Boston University Computer Science

Correlation in Space, not Time

# Links

tim

e

Traditional Frequency-Domain Analysis

Page 75: Network Traffic Modeling Mark Crovella Boston University Computer Science

PCA on Link Traffic

Single Link

# Links

time

# Links

time

Eigenlink

U: Eigenlinkmatrix(Columns areOrthonormal)

X: OD flowmatrix

# L

inks

# Links

V: Principalmatrix(Orthogonal)

PC

XV=U X=UVT

Page 76: Network Traffic Modeling Mark Crovella Boston University Computer Science

Singular Value Decomposition

X=UVT is the singular value decomposition of X

Page 77: Network Traffic Modeling Mark Crovella Boston University Computer Science

PCA on Link Traffic (2)

Each link is weighted sum of all eigenlinks

Singular values indicate the energy attributable to a principal component

;

Page 78: Network Traffic Modeling Mark Crovella Boston University Computer Science

Low Intrinsic Dimensionalityof Link Data

A plot of the singular values reveals how much energy is captured by each PC

Sharp elbow indicates that most of the energy captured by 5-10 singular values, for all datasets

Page 79: Network Traffic Modeling Mark Crovella Boston University Computer Science

Implications of Low Instrinsic Dimensionality

• Apparently, we can reconstruct X with high accuracy, keeping only a few columns of U

• A form of lossy data compression• Even more, a way of extracting the

most significant part of X, automatically– “signal + noise”?

X’=U’VT

Page 80: Network Traffic Modeling Mark Crovella Boston University Computer Science

Approximating With Top 5 Eigenlinks

Page 81: Network Traffic Modeling Mark Crovella Boston University Computer Science

Approximating With Top 5 Eigenlinks

Page 82: Network Traffic Modeling Mark Crovella Boston University Computer Science

Approximating With Top 5 Eigenlinks

Page 83: Network Traffic Modeling Mark Crovella Boston University Computer Science

A Link, Reconstructed

Link traffic

Components 1-5

Components 6-10

All the 100+ rest

Page 84: Network Traffic Modeling Mark Crovella Boston University Computer Science

Anomaly Detection

Single Link Approach: Use wavelets to detrend each flow in isolation.[Barford:IMW02]

Multi Link approach:Detrend all links simultaneously by choosing only certain principal components.

X = X’ + X’’

Page 85: Network Traffic Modeling Mark Crovella Boston University Computer Science

PCA based anomaly detection

L2 norm of entiretraffic vector X

L2 norm of residualvector X’’

Page 86: Network Traffic Modeling Mark Crovella Boston University Computer Science

Traffic Forecasting

Single-Link approach: Treat each flow timeseries independently. Use wavelets to extract trends. Build timeseries forecasting models on trends. [Papagiannaki:INFOCOM03]

Multi-Link approach:Build forecasting models on most significant eigenlinks as trends. Allows simultaneous examination and forecasting for entire ensemble of links.

X’=U’VT

Page 87: Network Traffic Modeling Mark Crovella Boston University Computer Science

Conclusions

• Whew!

• Bibliography on handout

• Traffic analysis methods vary considerably depending on:– Question being asked– Timescale– Stationarity

Page 88: Network Traffic Modeling Mark Crovella Boston University Computer Science

Conclusions

1 hour 1 day 1 week1 usec 1 sec

Network Engineering

•Signal + Noise•Single-link: Frequency domain analysis•Multi-link: Exploit Spatial correlation

Performance Evaluation

-Marginals-Watch out for heavy tails

-Correlation (in time)-Watch out for LRD / Self Similarity

Page 89: Network Traffic Modeling Mark Crovella Boston University Computer Science

Afternoon Lab Exercises

• For each dataset,– Perform PCA – Assess the variance in each component– Reconstruct using small number of

components

• Time / Interest permitting,– Analyze some of the single link timeseries

using wavelets (matlab wavelet toolbox)