improving signature matching using binary decision diagrams liu yang, rezwana karim, vinod ganapathy...

41
Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim , Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Upload: elfrieda-day

Post on 24-Dec-2015

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Improving Signature Matching using Binary Decision Diagrams

Liu Yang, Rezwana Karim, Vinod GanapathyRutgers University

Randy SmithSandia National Labs

Page 2: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Signature matching in IDS

• Find instances of network packets that match attack signatures

2

signatures Sig: /.*evil.*/

innocent

12evil34

Page 3: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Matching regular expressions• Signatures represented as regular expressions

(RE) • Finite Automata

– Represent and operate regular expressions

• Time Efficiency– Throughput must cope with Gbps link speeds

• Space Efficiency– Must fit in main memory of NIDS

3

Time-Space Tradeoff in Finite Automata

Page 4: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Time/Space tradeoff

4

DFAs

NFAs

Processing time

NFA-OBDD

Space efficient DFA

>= 4 orders of magnitude DFA :

Deterministic Finite Automaton

NFA : Non-Deterministic Finite Automaton

IDEAL

Page 5: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Contributions of our paper• NFA-OBDD: New data structure that offers fast

regular expression matching with space consumption comparable to that of NFAs

– Up to 1645x faster than NFAs with comparable memory consumption

– Speed is competitive with DFA variants• DFA runs out of memory for our signature sets

– Outperforms or is competitive with PCRE 5

Key Idea: Boolean encoding of NFA operation

Page 6: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Trends and challenges

6

Signature set size increased 5x in the last 5 years

ChallengePerform fast matching with low memory consumption

Page 7: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Combining DFAs

7

/.*ab.*cd/ /.*ef.*gh/ /.*ab.*cd | .*ef.*gh/

Picture courtesy : [Smith et al. Oakland’08]

Multiplicative increase in number of states

Page 8: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Combining NFAs

8

MN

/.*ab.*cd/ /.*ef.*gh/ /.*ab.*cd | .*ef.*gh/

Additive increase in number of states

Page 9: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

NFAs more compact than DFAs

9

Real Snort signatures have large counter values

Signature/.*1[0|1] {3} /

NFA

DFA

Page 10: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

10

Signature 1: /.*ab.*.{15}cd/Signature 2: /.*ef.*.{10}gh/

DFA

NFA

Page 11: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Outline

Problem DefinitionOur ContributionNFA-OBDD model and operationImplementationEvaluationRelated WorkConclusion

11

Page 12: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

NFA-OBDDs: Main idea

• Why are NFAs slow?– NFA frontiers may contain multiple states– Each frontier state must be processed for each

symbol in the input.

• Idea: Represent and operate NFA frontiers symbolically using Boolean functions– Entire frontier can be modified using a single

Boolean formula– Use ordered binary decision diagrams

(OBDDs) to represent Boolean formulae

Page 13: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Encoding NFA transition functions

• An NFA for (0|1)*1, ∑ = {0,1}

• Transition function

13

x i y f(x,i,y)

0 0 0

0 0 1

0 1 0

0 1 1

1 0 0

1 0 1

1 1 0

1 1 1

0

1

1

1

1

0

0

1

•Frontier: Set of current states• Size: O(n); n= # of states

•Set membership function-Disjunction of binary values of member states

Page 14: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

• Determine new frontier after processing input:

Next set of states =

• Checking acceptance:

NFA operation

14

SAT(Set_of_ Accept_States(x) ᴧ Frontier(x))

Transition_Function(x,i,y) ᴧ Frontier(x) ᴧ Input_Symbol(i)

Map y→x Ǝx,i

)

(

Page 15: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Ordered binary Decision Diagram (OBDD) [Bryant 86]

• Compact representation of Boolean function

15The BDD of f(x,i,y) with

order i < x < y

x i y f(x,i,y)

0 0 0 1

0 0 1 0

0 1 0 1

0 1 1 1

1 0 0 1

1 0 1 0

1 1 0 0

1 1 1 1

Page 16: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

NFA-OBDD• NFA-OBDD: NFA representation and

operation using OBDDs

• OBDD Representation of – Transitions– Frontiers

• Current set of states

– Input symbols– Set of accepting states– Set of start states

16

Page 17: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Space efficiency of NFA-OBDDs

• NFA-OBDD construction:– Uses same combination algorithm as NFAs– OBDD data structure itself utilizes the

redundancy of the binary function table

17

Rapid growth of signature set has little impact on NFA-OBDD space consumption

(unlike DFAs)

Page 18: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Outline

Problem DefinitionOur ContributionNFA-OBDD model and operationImplementationEvaluationRelated WorkConclusion

18

Page 19: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Experimental apparatus

19

• C++ and CUDD package for OBDDs

Page 20: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Regular expression sets

• Snort HTTP signature set– 1503 regular expressions from March 2007– 2612 regular expressions from October 2009

• Snort FTP signature set– 98 regular expressions from October 2009

• Extracted regular expressions from pcre and uricontent fields of signatures

20

Page 21: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Traffic traces• HTTP traces

– 33 traces– Size: 5.1MB –1.24 GB– One week period in Aug 2009 from Web

server of the CS department at Rutgers

• FTP Traces– 2 FTP traces– Size: 19.4MB, 24.7 MB– Two week period in March 2010 from FTP

server of the CS department at Rutgers21

Page 22: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Experimental results

• For 1503 REs from HTTP Signatures

22 *Intel Core2 Duo E7500, 2.93GHz; Linux-2.6; 2GB RAM*

1645x

9-26x

10x

Page 23: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Experimental results

• For 2612 REs from HTTP signatures

23

MDFA ran out of memory for 2612 REs

Page 24: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Experimental results

• For 98 RE from FTP signatures

24

MDFA ran out of memory for 98 REs

Page 25: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Multibyte matching

• Matches k>1 input symbol in a single step

• Also possible with NFA-OBDDs– Use OBDDs to represent k-step transitive

closure of NFA transition function– See paper for details.

• Brief summary of experimental results– 2-stride NFA-OBDD doubles the throughput– Outperforms 2-stride NFA by 3 orders of

magnitude25

Page 26: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Related work

• Multiple DFAs [Yu et al., ANCS’06]

• Extended finite automata [Smith et al., Oakland’08, SIGCOMM’08]

• D2FA [Kumar et al., SIGCOMM’06]

• Hybrid finite automata [Becchi et al., ANCS’08]

• Multibyte speculative matching [Luchaup et al., RAID’09]

• Many more – see paper for details

26

Page 27: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Conclusion

• NFA-OBDDs– Outperform NFAs by three orders of magnitude

• Up to 1645× in the best case

• Retain space efficiency of NFAs

– Outperform or competitive with the PCRE package

– Competitive with variants of DFAs but drastically less memory-intensive

27

Page 28: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Improving Signature Matching using Binary Decision Diagrams

Thank You

Liu Yang - [email protected]

Rezwana Karim - [email protected]

Vinod Ganapathy - [email protected]

Randy Smith - [email protected]

Page 29: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

BDD var ordering

29

Page 30: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

30

Page 31: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

NFA-OBDD vs NFA

NFA-OBDD

• Time Efficient– Process a set of states in one step

• Space Efficient– Use the same combination algorithm

31

Page 32: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

NFA-OBDD operation

Temp] = OBDD[Transition Function]

ᴧ OBDD[Frontier]

ᴧ OBDD[Input Symbol]

OBDD[Next Set of States] =

Map x→y (Ǝx,i OBDD[Temp])

Checking Acceptance :

OBDD[Set of Accept States]

ᴧ OBDD[Frontier]32

Page 33: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Solution for Mutibyte Matching

33

Page 34: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Experimental results

• 2 Stride 1400 HTTP

34

Page 35: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Experimental results

• 2 Stride 1400 HTTP signatures zoomed

35

Page 36: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Experimental Results

• 2 stride ftp 95 signatures

36

Page 37: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

37

Page 38: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

Matching multiple input symbol

38

Page 39: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

39

accept sig

Page 40: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

accept sig accept sig accept sig

accept sig

accept sig

accept sigaccept sig

accept sig

Page 41: Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs

NFA is compact

41

Recent signatures have large counter with value up to 500

Signature/.*1[0|1] {3}/