1 in-network pca and anomaly detection ling huang* xuanlong nguyen* minos garofalakis § michael...

22
1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel Research {hling, xuanlong, jordan, adj}@cs.berkeley.edu {minos.garofalakis, [email protected]

Post on 19-Dec-2015

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

1

In-Network PCA andAnomaly Detection

Ling Huang* XuanLong Nguyen* Minos Garofalakis§

Michael Jordan* Anthony Joseph* Nina Taft§

*UC Berkeley §Intel Research

{hling, xuanlong, jordan, adj}@cs.berkeley.edu {minos.garofalakis, [email protected]

Page 2: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

2

Detection of Network-wide Anomalies A volume anomaly is a sudden change in

an Origin-Destination flow (i.e., point to point traffic)

Given only link traffic measurements, efficiently diagnose the volume anomalies

H1

H2

The backbone network

Regional network 1Regional network 2

Page 3: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

3

An Illustration

Observed network link flow = aggregate of application-level flows

Anomalies in (unobserved) application-level flow

Finding anomalies in high-dimensional, noisy data is difficult !

Page 4: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

4

The PCA Method An approach to separate normal from anomalous

traffic Normal Subspace : space spanned by the top

k principal components Anomalous Subspace : space spanned by the

remaining principal components Then, decompose traffic on all links by projecting

onto and to obtain:

Traffic vector of all links at a particular point in time

Normal trafficvector

Residual trafficvector

Page 5: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

5

A Geometric Illustration

In general, anomalous traffic results in a large value of , where

yCyyCy ababnono

aby

y

aby

noy

Page 6: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

6

Detection Illustration

Value ofover time

at anomalous time points clearly stand out2

abyC

over time(residual part)

Value of

αQ

Page 7: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

7

Y

Y

n

m

Operation center

The Centralized Algorithm

Y/mY A onPCA T

The Network

Eigen values

αQ

Threshold

Eigen vectors

abC

Projection

Data matrix Y (m x n) n links

m time points (data)

Detection procedure Raise a flag if

αab QyC Periodically (e.g. once a week)

Page 8: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

8

Scalability Issues of Centralized Approach As the number of monitoring devices grow (up to

hundreds or thousands network data features) central processing site overloaded certain networks do not overprovision inter-site connectivity

When anomalies occur on smaller time scales (down to second or sub-second scales) “periodic push” has to be applied on second or sub-second

scales the volume of data transmitted through network would

explode

Page 9: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

9

Our Distributed Approach A communication-efficient framework that

detects anomalies at desired accuracy level minimizes data communication cost

A distributed protocol for data processing local monitors decide when to update data to coordinator coordinator makes global decision and feedback to monitors

An algorithmic frame guide the tradeoff simple algorithm for determining filtering parameters given

desired detection accuracy stochastic matrix perturbation theory quantify

how a perturbed data matrix impacts its eigen structures how perturbed eigen structures impacts the detection accuracy

Page 10: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

10

Our In-Network Detection Framework

Anomaly

User inputs

Originalmonitoredtime series

Processedtime series

Distr. Monitors

Coordinator

n ,,1

Page 11: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

11

The Protocol At Monitors Each monitor updates information to

coordinator if its incoming signal

where (filtering slacks) are adaptively computed by the coordinator

can be based on any prediction model built by on its data at an update time e.g., the average of last 5 signal values observed

locally at

iM

iii tRt )()(Y *

)(Y ti

n ,,1

iM

iM

)( *tRi*t

Page 12: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

12

The coordinator makes a new row

where

The Protocol At The Coordinator

otherwise ,)(Rimonitor from

updates getting if ,)(Y

)(Y*t

t

t

i

i

i

)(Y)(Yy 1 tt n

If any element in is updated Update Compute new

Perform detection usingαab QyC

y

αab Q and C

Y

Page 13: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

13

The Tradeoff The bigger the , the less communication,

but the more the detection error Need an algorithm to related to

detection accuracy

αab QyC

αab QyC

Difference?

n ,,1

Eigen Vectors

Eigen Values

i

Raw data y in blue, data available for detection in red

y

Page 14: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

14

Parameter Design and Error Control (I) Given upper bound of false alarm , determine the

monitor slacks ’s

Perturbation analysis: from deviation of false alarm to monitor slacks

Page 15: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

15

Let and are eigenvalues of the covariance matrices and

Define the perturbation matrix

Define the eigen error

From matrix perturbation theory, we have

So the key point is to estimate in terms of slaks ’s

Parameter Design and Error Control (II)

Page 16: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

16

Let and Standard assumptions on the filtering error

matrix W:

Eigen Error Monitor Slacks ’s (I)

Page 17: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

17

Eigen-Error Monitor Slacks ’s (II)

Where: , n is number of monitors and m is the number of data points.

Page 18: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

18

Detection Error Eigen-Error (I) Basic idea: study how eigen error impacts

detection error With full data, false alarm rate is

With approximate data, we only have perturbed version

Given eigen error, we can compute the false alarm rate (though not in closed-form solution) Inverse dependency: given desired false alarm rate, we can

determine tolerable eigen error by fast binary search

Page 19: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

19

Detection Error Eigen-Error (I) Consider normalized random variable

For approximate data, we only obtain

Let denote an upper bound on The deviation of false alarm rate can be

approximate as

The upper bound of false alarm rate is

X

Page 20: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

20

Evaluation Given a tolerable deviation of false alarm

rate, we can determine system parameters Using system parameters, we can evaluate

the actual detection accuracy using simulation

Experiment setup Abilene backbone network data Traffic matrices of size 1008 X 41 Set uniform slack for all monitors

Page 21: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

21

Results

Monitor slacks, communication cost and detection error

Page 22: 1 In-Network PCA and Anomaly Detection Ling Huang* XuanLong Nguyen* Minos Garofalakis § Michael Jordan* Anthony Joseph* Nina Taft § *UC Berkeley § Intel

22

Results (II)