
Page 1: Pegasus: Precision Hunting for Icebergs and Anomalies in Network Flows

Pegasus: Precision Hunting for Icebergs and Anomalies in Network Flows

Sriharsha Gangam (1), Puneet Sharma (2), Sonia Fahmy (1)

(1) Purdue University, (2) HP Labs

This research has been sponsored in part by GENI project 1723, and by Hewlett-Packard

Page 2

Passive Flow Monitoring

• Detect network congestion, attacks, faults, and anomalies; support traffic engineering and accounting

• Observe and collect traffic summaries

• e.g., InMon Traffic Sentinel [InMon] uses sFlow; Cisco’s NetFlow is used in ISPs

[Figure: network devices (e.g., switches); monitoring data; collection and analysis]

[InMon] http://inmon.com

Page 3

Passive Flow Monitoring - Challenges

• Large overhead to collect and analyze fine-grained flow data

• Increasing link speeds, network size, and traffic
o Limited CPU and memory resources at the routers
o Millions of flows in ISP networks

• Current techniques?
o NetFlow sampling rate in ISPs ~ 1 in 100 (Internet2)
o sFlow packet sampling rate ~ 1 in 2000
o Application-dependent sketches
o Fine-grained information is lost

Page 4

Will More Resources Help?

• Commercial co-located compute and storage
o HP ONE Blades
o Cisco SRE Modules

• Example configuration
o 2.20 GHz Core Duo processor
o 4 GB RAM, 250 GB HD
o 2x10 Gbps duplex bandwidth to switch

• Storage and analysis of fine-grained flow statistics
o Distributed monitoring applications

Page 5

Design Space

[Figure: design space with axes Network Overhead, Additional Compute & Storage, and Accuracy. Labels: Ideal Solution; Naïve Solution; Impractical; Current Solutions: Sampling and Sketching; Our Goals: Pegasus - Accurate & low overhead monitoring]

Page 6

Key Class of Applications

• Network bottlenecks
o Top traffic destinations, sources, and links

• Suspicious port scanning activity
o Sources that connect to more than 10% of hosts within time T

• DDoS attack detection
o Destinations with a large number of connections or a large amount of traffic

Page 7

Global Iceberg Detection

• Items with aggregate count exceeding a threshold (S × θ)
o Global heavy hitters

• Observations at any single switch/router may not be significant or interesting
o E.g., DDoS attack

[Figure: network devices (e.g., switches) report monitoring data. Question: which items contribute > 1% (θ) of the traffic? Per-device counts shown:]

Device 1: h1 = 20, h2 = 10, h4 = 50, …
Device 2: h2 = 60, h3 = 15, h4 = 20, …
Device 3: h3 = 50, h5 = 10, h6 = 30, …
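The iceberg definition on this slide can be illustrated with a minimal Python sketch. It assumes that S is the total count over all items and devices, which the slide does not state explicitly, and it uses only the nine counts visible in the figure (the "…" entries are left out). The function name `global_icebergs` and the value θ = 0.25 are chosen purely for illustration; the slide itself uses 1%.

```python
from collections import Counter

def global_icebergs(device_counts, theta):
    """Return (icebergs, threshold) for per-device item counts.

    Assumption: S is the sum of all counts across all devices, and an
    item is an iceberg when its aggregate count exceeds S * theta.
    """
    totals = Counter()
    for counts in device_counts:
        totals.update(counts)          # add this device's counts per item
    threshold = sum(totals.values()) * theta
    icebergs = {item: n for item, n in totals.items() if n > threshold}
    return icebergs, threshold

# Counts visible in the slide's figure (one dict per device).
devices = [
    {"h1": 20, "h2": 10, "h4": 50},
    {"h2": 60, "h3": 15, "h4": 20},
    {"h3": 50, "h5": 10, "h6": 30},
]

icebergs, threshold = global_icebergs(devices, theta=0.25)
print(threshold)   # 66.25
print(icebergs)    # {'h2': 70, 'h4': 70}
```

With these numbers no single device sees a count above 66.25, yet h2 and h4 exceed it in aggregate, which is the situation the second bullet describes. This is the naïve computation that requires every count to reach one place.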

Page 8

Online Iceberg Detection with Pegasus

• Reduce communication overhead
o Additional compute and storage

• Precisely detect all global icebergs
o Zero false positives and false negatives

• Feedback-based iterative approach

[Figure labels: High precision; Iterative solution]

Page 9

Comparison of Different Approaches

[Figure: network devices (e.g., switches) act as monitors and report to collection and analysis (aggregator). Example per-monitor counts: (i1 = 20, i2 = 10, i4 = 50, …), (i2 = 60, i3 = 15, i4 = 20, …), (i3 = 50, i5 = 10, i6 = 30, …)]

• Naïve approach: prohibitively large monitoring data

• Sampling and sketching: lossy summary, with false positives and false negatives

• Pegasus: lossy summary (sketch-sets) plus fine-grained data on demand, with no false positives or false negatives

Page 10

1-D Sketch-set Representation

• Sketch-set: summary representation of a collection of flows; supports set operations

Example: destination IPs receiving more than 200 packets

Flows (Destination IP, Packet Count):

128.41.10.10, 15
128.41.10.20, 20
128.41.10.30, 15
128.41.10.40, 30
128.41.10.50, 25
128.41.10.110, 110
128.41.10.150, 100
128.41.10.210, 300

Coarse Sketch-set Generation (figure labels: α, β)

Coarse-grained sketch-sets (startIP, endIP, minPkt, maxPkt):

128.41.10.10, 128.41.10.50, 15, 30
128.41.10.110, 128.41.10.150, 100, 110
128.41.10.210, 128.41.10.210, 300, 300
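One way to picture coarse sketch-set generation is the greedy grouping below. This is only a sketch under stated assumptions: the slide does not say how α and β are used, so the two parameters here, `max_gap` (largest allowed distance between consecutive IPs in a set) and `max_spread` (largest allowed maxPkt - minPkt), are hypothetical stand-ins, and the values 40 and 15 were picked because they reproduce the three sketch-sets shown on the slide.

```python
from ipaddress import IPv4Address

def coarse_sketch_sets(flows, max_gap, max_spread):
    """Greedily group (ip, packets) flows into (startIP, endIP, minPkt, maxPkt).

    max_gap and max_spread are hypothetical parameters for this sketch,
    not necessarily the alpha and beta of the original system.
    """
    sets = []
    # Sort by numeric IP so that neighbouring addresses are adjacent.
    for ip, pkts in sorted((int(IPv4Address(ip)), p) for ip, p in flows):
        if sets:
            start, end, lo, hi = sets[-1]
            new_lo, new_hi = min(lo, pkts), max(hi, pkts)
            if ip - end <= max_gap and new_hi - new_lo <= max_spread:
                sets[-1] = (start, ip, new_lo, new_hi)   # extend current set
                continue
        sets.append((ip, ip, pkts, pkts))                # start a new set
    return [(str(IPv4Address(s)), str(IPv4Address(e)), lo, hi)
            for s, e, lo, hi in sets]

flows = [
    ("128.41.10.10", 15), ("128.41.10.20", 20), ("128.41.10.30", 15),
    ("128.41.10.40", 30), ("128.41.10.50", 25), ("128.41.10.110", 110),
    ("128.41.10.150", 100), ("128.41.10.210", 300),
]

for sketch_set in coarse_sketch_sets(flows, max_gap=40, max_spread=15):
    print(sketch_set)
```

Eight flow records collapse into three tuples, which is where the communication saving comes from; the price is that each tuple only bounds the per-IP count between minPkt and maxPkt.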

Page 11

Example

Coarse-grained sketch-sets (startIP, endIP, minPkt, maxPkt) sent by Monitor 1 and Monitor 2, one group per monitor:

128.41.10.35, 128.41.10.70, 10, 35
128.41.10.100, 128.41.10.120, 90, 130

128.41.10.10, 128.41.10.50, 15, 30
128.41.10.110, 128.41.10.150, 100, 110
128.41.10.210, 128.41.10.210, 300, 300

Aggregator: disjoint sketch-sets, obtained by INTERSECTION and SUBTRACTION

Non-icebergs:

128.41.10.10, 128.41.10.34, 15, 30
128.41.10.35, 128.41.10.50, 10, 65
128.41.10.51, 128.41.10.70, 10, 35
128.41.10.100, 128.41.10.109, 90, 130
128.41.10.121, 128.41.10.150, 100, 110

Query monitors (uncertain):

128.41.10.110, 128.41.10.120, 90, 240

Iceberg:

128.41.10.210, 128.41.10.210, 300, 300
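The aggregator step in this example can be sketched as follows. The combination rule is inferred from the numbers on the slide rather than stated there: where ranges overlap, the lower bound is the smallest of the overlapping minPkt values and the upper bound is the sum of the maxPkt values (for instance 10 and 35 + 30 = 65 for 128.41.10.35 to 128.41.10.50). The threshold of 200 comes from the "more than 200 packets" example; the function names are hypothetical, and the sketch assumes that sketch-sets from the same monitor do not overlap.

```python
from ipaddress import IPv4Address

def to_int(ip):
    return int(IPv4Address(ip))

def disjoint_sketch_sets(sketch_sets):
    """Split overlapping (startIP, endIP, minPkt, maxPkt) ranges into disjoint ones."""
    ivals = [(to_int(s), to_int(e), lo, hi) for s, e, lo, hi in sketch_sets]
    # Every start and every end + 1 is a point where coverage can change.
    cuts = sorted({s for s, _, _, _ in ivals} | {e + 1 for _, e, _, _ in ivals})
    out = []
    for a, b in zip(cuts, cuts[1:]):
        cover = [(lo, hi) for s, e, lo, hi in ivals if s <= a and b - 1 <= e]
        if cover:  # skip address gaps that no sketch-set covers
            out.append((str(IPv4Address(a)), str(IPv4Address(b - 1)),
                        min(lo for lo, _ in cover),    # inferred lower bound
                        sum(hi for _, hi in cover)))   # inferred upper bound
    return out

def classify(sketch_set, threshold):
    """Label a disjoint sketch-set against the iceberg threshold."""
    _, _, lo, hi = sketch_set
    if lo > threshold:
        return "iceberg"
    if hi <= threshold:
        return "non-iceberg"
    return "uncertain"   # bounds straddle the threshold: query the monitors

received = [
    ("128.41.10.35", "128.41.10.70", 10, 35),
    ("128.41.10.100", "128.41.10.120", 90, 130),
    ("128.41.10.10", "128.41.10.50", 15, 30),
    ("128.41.10.110", "128.41.10.150", 100, 110),
    ("128.41.10.210", "128.41.10.210", 300, 300),
]

for ss in disjoint_sketch_sets(received):
    print(ss, classify(ss, threshold=200))
```

Only the range 128.41.10.110 to 128.41.10.120 (bounds 90 and 240) comes out as uncertain, so it is the only range the aggregator needs to ask the monitors about.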

Page 12

Example… Query Response

Aggregator query: (128.41.10.110, 128.41.10.120)

Monitor 1 and Monitor 2 each look up the relevant flows:

128.41.10.110, 90
128.41.10.120, 130

128.41.10.110, 110

Each monitor then generates sketch-sets at finer granularity:

128.41.10.110, 128.41.10.110, 90, 90
128.41.10.120, 128.41.10.120, 130, 130

128.41.10.110, 128.41.10.110, 110, 110

Page 13

Example… Query Response

Aggregator query: (128.41.10.110, 128.41.10.120)

Fine-grained sketch-sets returned by Monitor 1 and Monitor 2:

128.41.10.110, 128.41.10.110, 90, 90
128.41.10.120, 128.41.10.120, 130, 130

128.41.10.110, 128.41.10.110, 110, 110

Aggregator: combined fine-grained sketch-sets, classified as icebergs or non-icebergs:

128.41.10.110, 128.41.10.110, 200, 200
128.41.10.120, 128.41.10.120, 130, 130
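The query round in this example can be sketched as below. It is a simplified illustration, not the system's actual protocol: each monitor is modeled as a plain dictionary of exact per-IP packet counts, it answers a range query with exact counts (equivalent to sketch-sets whose minPkt equals maxPkt), and the aggregator sums the answers. The names `answer_query` and `resolve`, the strict "more than 200" comparison, and the assignment of flows to `monitor_a` and `monitor_b` are assumptions.

```python
from collections import Counter
from ipaddress import IPv4Address

def answer_query(local_flows, start, end):
    """Monitor side: return exact counts for local flows inside [start, end]."""
    lo, hi = int(IPv4Address(start)), int(IPv4Address(end))
    return {ip: pkts for ip, pkts in local_flows.items()
            if lo <= int(IPv4Address(ip)) <= hi}

def resolve(monitors, start, end, threshold):
    """Aggregator side: sum the monitors' answers and classify each IP."""
    totals = Counter()
    for flows in monitors:
        totals.update(answer_query(flows, start, end))
    return {ip: (n, "iceberg" if n > threshold else "non-iceberg")
            for ip, n in totals.items()}

# Only the flows shown on the slides are included here.
monitor_a = {"128.41.10.110": 90, "128.41.10.120": 130}
monitor_b = {"128.41.10.110": 110, "128.41.10.150": 100, "128.41.10.210": 300}

result = resolve([monitor_a, monitor_b],
                 "128.41.10.110", "128.41.10.120", threshold=200)
print(result)
```

The sums match the aggregator's values on the slide: 200 for 128.41.10.110 and 130 for 128.41.10.120. Under a strict "more than 200" reading neither is an iceberg, so the uncertain range from the previous page is resolved after a single round of feedback.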

Page 14

Evaluation Methodology

• Abilene trace
o NetFlow records: 11 sites with 1 in 100 sampling for 5 min
o Add small flows to revert sampling
  • (90% of flows contribute to 20% of traffic, ~ 758K unique flow records)
o Trace is used in [Huang11]

• Enterprise network sFlow trace
o sFlow records: 249 switches, 1 in 2000 sampling for a week
o Revert sampling by adding flows

• PlanetLab’s outgoing traffic
o NetFlow records generated at each PlanetLab host

[Huang11] G. Huang, A. Lall, C. Chuah, and J. Xu. Uncovering global icebergs in distributed streams: Results and implications. J. Netw. Syst. Manage., 19:84–110, March 2011

Page 15

Comparison with Sample-sketch

• Sends sampled monitoring data and sketches to the aggregator for iceberg detection

• Uses two main parameters
o Sampling interval
o Sketch threshold

• Difficult to choose the parameters

• Can have false positives and false negatives

[Huang11] G. Huang, A. Lall, C. Chuah, and J. Xu. Uncovering global icebergs in distributed streams: Results and implications. J. Netw. Syst. Manage., 19:84–110, March 2011

Page 16

Abilene Trace

For the 5 min trace, θ = 0.08:
- Naïve solution: ≈ 7.63 MB
- Pegasus: ≈ 8 KB
- Sample-Sketch: ≈ 36 KB

Pegasus has lower communication overhead

[Figure: plots against θ; axis annotation: larger is better]

Page 17

Monitoring Outgoing PlanetLab Traffic

• Example of end-host monitoring system

• Detect accidental attacks and anomalies originating from PlanetLab

• Existing monitoring service: PlanetFlow
o Decouples collection from analysis
o Collects 1 TB of data every month [PF] (naïve approach)

[Figure: PlanetLab nodes, monitors, and an aggregator; NetFlow records generated from outgoing traffic]

[PF] http://www.cs.princeton.edu/~sapanb/planetflow2/

Page 18

Pegasus PlanetLab Service

• PlanetLab’s outgoing traffic
o NetFlow records of ~250 PlanetLab nodes
o Online global iceberg detection service

• Global iceberg detection for
o Flow identifier: Destination IP, Source Port, Destination Port
o Flow size: Packet count

Page 19

Pegasus PlanetLab Service

• 15-hour deployment - Pegasus: 403 MB, Naïve: 2.26 GB

• Most outbound traffic to other PlanetLab hosts

• 1-day outgoing traffic:

• CoDNS and CoDeeN don’t produce many icebergs

Source Port Icebergs      Destination Port Icebergs
3 (CompressNET)           3 (CompressNET)
8 (unassigned)            0 (Reserved)
22 (SSH)                  53 (DNS)
80 (HTTP)                 80 (HTTP), 443 (HTTPS)

Page 20

Conclusions

• Pegasus: A distributed measurement system
o Commercial co-located compute and storage devices
o Low network overhead
o High accuracy

• Adaptive aggregation for global iceberg detection
o Iterative feedback solution

• Experiments from real traces and PlanetLab deployment
o Low overhead without false positives or false negatives

Page 21

Thank you

Questions?

Page 22

Anomaly Examples

• Based on traffic features [Kind09]

[Kind09] Histogram-Based Traffic Anomaly Detection, In IEEE Trans. on Netwk. Service Management

Page 23

Related Work

• Threshold Algorithm (TA) [Fagin03]
o Large number of iterations

• Three phase uniform threshold (TPUT) [Cao04]
o Accounting for data distributions [Yu05]

• Filtering based continuous monitoring algorithms [Babcock03] [Keralapura06] [Olston03]
o Send update to aggregator when local arithmetic constraints fail

[Yu05] Efficient processing of distributed top-k queries. In Proc. of DEXA, 2005

[Cao04] Efficient Top-K Query Calculation in Distributed Networks. In Proc. of PODC, 2004

[Fagin03] Optimal aggregation algorithms for middleware. Jour. of Comp. and Sys. Sciences, 2003

[Babcock03] Distributed top-k monitoring. In Proc. of SIGMOD, 2003

[Keralapura06] Communication-efficient distributed monitoring of thresholded counts. In Proc. of SIGMOD, 2006

[Olston03] Adaptive filters for continuous queries over distributed data streams. In Proc. of SIGMOD, 2003

Page 24

Sketch-set Granularity - G

• High granularity ⇒ More precise, more expensive representation

• Granularity definition: maxSize – minSize

• Used to determine if more flows should be combined in a sketch-set

• Used to send finer granularity during monitor response (for convergence)

Page 25

Iterative Feedback Algorithm

Page 26

Abilene Trace

β has little influence on the communication cost

Page 27

Enterprise Network sFlow Trace

[Figure: axis annotation: larger is better]

All except one parameter pair (green) have false positives and false negatives

Page 28

Scalability with Number of Monitors

Page 29

Scalability with Number of Monitors

[Figure: two panels, sFlow trace and Abilene trace; axis annotation: larger is better]