![Page 1: 1 Efficient Rule Matching for Large Scale Systems Packet Classification – A Case Study Alok Tongaonkar Stony Brook University TexPoint fonts used in EMF](https://reader035.vdocument.in/reader035/viewer/2022070410/56649f155503460f94c2b2d0/html5/thumbnails/1.jpg)
Efficient Rule Matching for Large Scale Systems
Packet Classification – A Case Study
Alok Tongaonkar
Stony Brook University
Rule Based Systems
- Applications in security: intrusion detection systems, firewalls, access control systems
- Policy is specified in terms of a database of rules
- Enforcement involves identifying the applicable rule(s)
Fundamental Operation
- Given an input p with attributes {p1, p2, ..., pk}, identify the rules Ri from {R1, R2, ..., Rn} that match p
- Ri: condition -> action
- e.g. R1: dhost == PLUTO && dport == HTTP && content: “Bad command” -> DENY

Challenge
- Rule matching algorithms do not scale well, either in space or in time
Matching Algorithms
n: no. of rules, k: no. of attributes
- Linear Search
  - Match one rule at a time
  - Space efficient: O(n*k)
  - Matching time grows with the number of rules: O(n)
- Table-based Search
  - Columns correspond to attributes
  - Rows correspond to rules
  - Wastes space when many rules specify “*” for many attributes: O(n*k)
  - Efficient matching in hardware/multiprocessor: match different attributes in parallel and combine results
  - In a uniprocessor environment, matching time is O(n)
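As a rough illustration of linear search, the sketch below checks one rule at a time. The rule encoding (a name plus a dictionary of attribute/value pairs, where an omitted attribute acts like "*") and the sample rules R2 and R3 are assumptions made for this example, not something the slides specify.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: a rule is (name, {attribute: required value}).
# An attribute that is absent from the dictionary behaves like "*".
Rule = Tuple[str, Dict[str, object]]

def linear_search(rules: List[Rule], packet: Dict[str, object]) -> List[str]:
    """Return the names of all rules whose every attribute test holds."""
    matched = []
    for name, condition in rules:            # visit the n rules one by one
        if all(packet.get(attr) == value     # up to k attribute tests per rule
               for attr, value in condition.items()):
            matched.append(name)
    return matched

rules = [
    ("R1", {"dhost": "PLUTO", "dport": "HTTP"}),
    ("R2", {"dport": "HTTP"}),
    ("R3", {"dhost": "MARS"}),
]
print(linear_search(rules, {"dhost": "PLUTO", "dport": "HTTP"}))  # ['R1', 'R2']
```

Every packet pays for a pass over all rules, which is the scaling problem the later slides try to avoid.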
Matching Algorithms contd.
- Decision Tree (trie-like structure)
  - Each node corresponds to a test on an attribute
  - Matching time: O(k)
    - No. of attributes is an order of magnitude smaller than no. of rules
  - Size: can be exponential in n
  - Minimization of a decision tree is an NP-complete problem!

Goal
- Develop efficient techniques for rule matching that scale to support thousands of rules
Outline
- Problem Formulation
- Techniques
  - Minimize duplication
  - Benign non-determinism
  - Polynomial bound
  - Utility
- Results
Packet Classification
- A mechanism that
  - inspects network packets
  - determines how to process a packet based on the values of header fields and the payload
- Applications
  - Firewalls: identify the highest priority matching rule
  - Intrusion Detection Systems
    - Use unordered rules
    - Identify all matching rules
  - Network Monitoring: determine whether a packet satisfies any of the conditions
Objective
- Promote sharing of tests
  - Not restricted to equality tests: we need to support inequalities, disequalities, and bit-masking operations
- Flexibility to support diverse applications
  - Ordered (firewalls) and unordered (intrusion detection) rule sets
  - Packet-filtering (network monitoring)
Problem Formulation
Tests involve a variable x and one or two constants (denoted by c).
- Equality tests: x == c
  - tcp_sport == 80
- Equality tests with bitmasks: x & c1 == c
  - tcp_flags & 0x03 == 0x03
- Disequality tests: x != c
  - tcp_sport != 80
- Disequality tests with bitmasks: x & c1 != c
  - tcp_flags & 0x03 != 0x03
- Inequality tests: x <= c
  - tcp_dport <= 1024
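The test forms listed above can be captured in one small data type. The sketch below is only one possible encoding; the names `FieldTest` and `holds` and the dictionary representation of a packet are assumptions introduced for illustration.

```python
import operator
from typing import Dict, NamedTuple, Optional

class FieldTest(NamedTuple):
    """Hypothetical encoding of one test: field, operator, constant, optional mask."""
    field: str
    op: str                     # "==", "!=", or "<="
    const: int
    mask: Optional[int] = None  # c1 in "x & c1 == c"; None means no bitmask

OPS = {"==": operator.eq, "!=": operator.ne, "<=": operator.le}

def holds(test: FieldTest, packet: Dict[str, int]) -> bool:
    """Evaluate a single test against a packet given as {field: value}."""
    value = packet[test.field]
    if test.mask is not None:
        value &= test.mask      # apply the mask first, i.e. (x & c1) op c
    return OPS[test.op](value, test.const)

packet = {"tcp_sport": 80, "tcp_dport": 443, "tcp_flags": 0x13}
print(holds(FieldTest("tcp_sport", "==", 80), packet))              # True
print(holds(FieldTest("tcp_flags", "==", 0x03, mask=0x03), packet)) # True
print(holds(FieldTest("tcp_sport", "!=", 80), packet))              # False
print(holds(FieldTest("tcp_dport", "<=", 1024), packet))            # True
```

The mask is applied before the comparison because the slide's notation `x & c1 == c` is meant as `(x & c1) == c`.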
Rules and priorities
- A rule R is a conjunction of tests
  - (dport == 22) && (sport <= 1024) && (flags & 0xb == 0x3)
- A set of rules may be partially ordered by a priority relation
  - The priority of R is denoted as Pri(R).
- A rule R matches a packet p if:
  - the packet satisfies R, i.e., R(p) is true
  - the packet does not satisfy any rule that has higher priority than R
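The matching definition above can be written down directly. This sketch assumes integer priorities where a larger number means higher priority, which is a simplification of the partial order the slide allows; rules with equal numbers are treated as incomparable. The rules R2 and R3 are made up for the example.

```python
from typing import Callable, Dict, List, Tuple

Packet = Dict[str, int]
# Hypothetical encoding: (name, Pri(R), condition R(p)).
Rule = Tuple[str, int, Callable[[Packet], bool]]

def matching_rules(rules: List[Rule], packet: Packet) -> List[str]:
    """Rules that the packet satisfies and that no higher-priority
    satisfied rule overrides."""
    satisfied = [(name, pri) for name, pri, cond in rules if cond(packet)]
    return [name for name, pri in satisfied
            if not any(other > pri for _, other in satisfied)]

rules = [
    ("R1", 2, lambda p: p["dport"] == 22 and p["sport"] <= 1024
                        and (p["flags"] & 0xb) == 0x3),
    ("R2", 1, lambda p: p["sport"] <= 1024),
    ("R3", 2, lambda p: p["dport"] == 22),
]
print(matching_rules(rules, {"dport": 22, "sport": 80, "flags": 0x3}))  # ['R1', 'R3']
print(matching_rules(rules, {"dport": 80, "sport": 80, "flags": 0x0}))  # ['R2']
```

Giving every rule the same priority yields the unordered (intrusion detection) behavior, while distinct priorities yield the firewall behavior of a single highest-priority match.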
Decision Tree for Packet Classification

R1: (icmp_type == ECHO)
R2: (icmp_type == ECHO_REPLY) && (ttl == 1)
R3: (ttl == 1)

[Figure: decision tree; each node carries the rule set shown on the slide]
- root: {R1, R2, R3}
  - icmp_type == ECHO: {R1, R3}
    - ttl == 1: {R1, R3}
    - ttl != 1: {R1}
  - icmp_type == ECHO_REPLY: {R2, R3}
    - ttl == 1: {R2, R3}
    - ttl != 1: {}
  - icmp_type != ECHO && icmp_type != ECHO_REPLY: {R3}
    - ttl == 1: {R3}
    - ttl != 1: {}
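Exponential Blowup

R1: x == 1, R2: x == 2, R3: x == 3, R4: x == 4
R5: y == 1, R6: y == 2, R7: y == 3, R8: y == 4

[Figure: decision tree that tests x (edges 1, 2, 3, 4, else) and then tests y (edges 1, 2, 3, 4, else) under each x edge; sample leaves: {R1, R5}, {R1, R6}, {R2, R5}, {R2, R6}]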
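To make the blowup concrete, the sketch below enumerates the leaves of a single tree that tests each variable in turn, with one edge per constant plus an "else" edge. The function name and the dictionary input are assumptions for illustration.

```python
from itertools import product
from typing import Dict, List, Tuple

def combined_tree_leaves(values_by_var: Dict[str, List[int]]) -> List[Tuple]:
    """Enumerate leaves of a tree that tests every variable on every path.
    Each test has one outgoing edge per constant plus an "else" edge."""
    edges = [list(values) + ["else"] for values in values_by_var.values()]
    return list(product(*edges))   # one leaf per combination of edges

leaves = combined_tree_leaves({"x": [1, 2, 3, 4], "y": [1, 2, 3, 4]})
print(len(leaves))  # 25
```

The leaf `(1, 1)` corresponds to {R1, R5} and `(2, 2)` to {R2, R6}. Each additional variable with four constants multiplies the leaf count by five, so the tree grows exponentially in the number of such variables.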
Decision Tree Construction
- Decompose and reorder tests to increase sharing of tests among rules

R1: x == 5
R2: x & 0x03 != 1

[Figure: decision trees built from the tests x == 5 / x != 5 and x & 0x03 == 1 / x & 0x03 != 1, with nodes labeled {R1}, {R2}, {R1, R2}, and {}]
Condition Factorization
- Decomposing rules into combinations of more primitive tests
- Similar to factorization of integers
- Based on the residue operation, which is analogous to integer division

Residue
- We want to determine if there is a match for a rule C1
- We have so far tested a condition C2
- A residue captures the additional tests that need to be performed at this point to verify C1
Residue Operation
The residue C1/C2 is another condition C3 such that:
1. C2 ∧ C3 ⇒ C1
2. C1 ∧ C2 ⇒ C3

Examples
- C1: x ∈ [1, 20], C2: x ∈ [15, 25], C3: x <= 20
- C1: x ∈ [1, 20], C2: x == 15, C3: true
- C1: x ∈ [1, 20], C2: x == 35, C3: false
- C1: x ∈ [1, 20], C2: y == 15, C3: x ∈ [1, 20]
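The four examples can be reproduced with a small residue function restricted to interval conditions. This is a sketch under stated assumptions: the `(variable, lo, hi)` encoding and the use of `None` for a bound that no longer needs checking are choices made here, and a residue is not unique in general, so this returns just one valid C3.

```python
from typing import Optional, Tuple, Union

# Hypothetical encoding: (variable, lo, hi) means lo <= variable <= hi.
Interval = Tuple[str, int, int]
# In a result, None marks a side that no longer needs to be tested.
Residue = Union[bool, Tuple[str, Optional[int], Optional[int]]]

def residue(c1: Interval, c2: Interval) -> Residue:
    """Compute one condition C3 = C1/C2 for interval conditions."""
    var1, lo1, hi1 = c1
    var2, lo2, hi2 = c2
    if var1 != var2:
        return c1                      # C2 says nothing about C1's variable
    if hi2 < lo1 or lo2 > hi1:
        return False                   # C1 and C2 cannot both hold
    lo = lo1 if lo2 < lo1 else None    # lower bound already implied by C2?
    hi = hi1 if hi2 > hi1 else None    # upper bound already implied by C2?
    if lo is None and hi is None:
        return True                    # C2 alone implies C1
    return (var1, lo, hi)

c1 = ("x", 1, 20)
print(residue(c1, ("x", 15, 25)))  # ('x', None, 20), i.e. x <= 20
print(residue(c1, ("x", 15, 15)))  # True
print(residue(c1, ("x", 35, 35)))  # False
print(residue(c1, ("y", 15, 15)))  # ('x', 1, 20)
```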
Computing Residue on Tests
Build Algorithm
- Recursive procedure
- Takes a node s as its first parameter
- Builds the sub-tree that is rooted at s
- It takes two other parameters:
  - Candidate Set (Cs): rules that haven’t completed a match, but for which future matches can’t be ruled out either
  - Match Set (Ms): all rules for which a match can be announced at s
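A much simplified version of such a recursive build can be sketched as follows. It is not the algorithm from the talk: it handles equality tests only, ignores priorities, and picks the alphabetically first pending variable instead of using the selection heuristics discussed later. The names `Node`, `build`, and `classify`, and the numeric values for ECHO and ECHO_REPLY, are assumptions for illustration.

```python
from typing import Dict, FrozenSet, Optional

Rule = Dict[str, int]  # hypothetical encoding: variable -> required value

class Node:
    def __init__(self, matches):
        self.matches: FrozenSet[str] = frozenset(matches)  # Ms at this node
        self.var: Optional[str] = None          # variable tested; None at a leaf
        self.children: Dict[int, "Node"] = {}   # one edge per constant
        self.default: Optional["Node"] = None   # the "else" edge

def build(candidates: Dict[str, Rule], matches=frozenset()) -> Node:
    """Build the sub-tree for candidate set Cs and match set Ms."""
    # Rules with no remaining tests have completed a match.
    done = {name for name, tests in candidates.items() if not tests}
    node = Node(set(matches) | done)
    pending = {n: t for n, t in candidates.items() if t}
    if not pending:
        return node
    # Assumption: test the alphabetically first pending variable.
    var = min(v for tests in pending.values() for v in tests)
    node.var = var
    values = sorted({t[var] for t in pending.values() if var in t})
    for value in values + [None]:               # None stands for "else"
        cs = {}
        for name, tests in pending.items():
            if var not in tests:
                cs[name] = tests                # unaffected by this test
            elif tests[var] == value:
                # keep the residue: the tests still to be checked
                cs[name] = {v: c for v, c in tests.items() if v != var}
            # otherwise the rule is ruled out on this edge
        child = build(cs, node.matches)
        if value is None:
            node.default = child
        else:
            node.children[value] = child
    return node

def classify(root: Node, packet: Dict[str, int]) -> FrozenSet[str]:
    """Walk the tree and return the match set at the leaf reached."""
    node = root
    while node.var is not None:
        node = node.children.get(packet.get(node.var), node.default)
    return node.matches

ECHO, ECHO_REPLY = 8, 0  # ICMP type numbers, used as sample constants
tree = build({
    "R1": {"icmp_type": ECHO},
    "R2": {"icmp_type": ECHO_REPLY, "ttl": 1},
    "R3": {"ttl": 1},
})
print(sorted(classify(tree, {"icmp_type": ECHO, "ttl": 1})))  # ['R1', 'R3']
```

Each recursive call receives the rules that can still match (Cs) and the rules already matched (Ms), mirroring the two extra parameters described on the slide.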
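Minimize Duplication

R1: x == 1 && y == 1
R2: x == 2 && y == 2
R3: y == 3

[Figure: decision tree that tests x first]
- x == 1: test y
  - y == 1: {R1}
  - y == 3: {R3}
  - else: {}
- x == 2: test y
  - y == 2: {R2}
  - y == 3: {R3}
  - else: {}
- else: test y
  - y == 3: {R3}
  - else: {}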
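Minimize Duplication contd.

R1: x == 1 && y == 1
R2: x == 2 && y == 2
R3: y == 3

[Figure: decision tree that tests y first]
- y == 1: test x
  - x == 1: {R1}
  - else: {}
- y == 2: test x
  - x == 2: {R2}
  - else: {}
- y == 3: {R3}
- else: {}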
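The effect of the test order on duplication can be checked by counting leaves for both orders. The sketch below assumes equality tests only and a fixed variable order supplied by the caller; `count_leaves` is a hypothetical helper, not part of the presented system.

```python
from typing import Dict, List

Rule = Dict[str, int]  # hypothetical encoding: variable -> required value

def count_leaves(rules: List[Rule], order: List[str]) -> int:
    """Leaves of the decision tree that tests variables in the given order."""
    rules = [r for r in rules if r]          # keep rules with pending tests
    if not rules or not order:
        return 1
    var, rest = order[0], order[1:]
    if all(var not in r for r in rules):
        return count_leaves(rules, rest)     # no pending rule needs this test
    total = 0
    values = sorted({r[var] for r in rules if var in r})
    for value in values + [None]:            # None is the "else" edge
        kept = [{v: c for v, c in r.items() if v != var}
                for r in rules if var not in r or r[var] == value]
        total += count_leaves(kept, rest)
    return total

rules = [{"x": 1, "y": 1}, {"x": 2, "y": 2}, {"y": 3}]
print(count_leaves(rules, ["x", "y"]))  # 8
print(count_leaves(rules, ["y", "x"]))  # 6
```

Testing x first copies the `y == 3` test for R3 into every x branch, while testing y first checks it once.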
Benign Non-determinism
- Two rules R1 and R2 are said to be independent of each other if they do not have a common test
- Build separate trees for each independent set
- Match packets against each tree: non-determinism without incurring any performance penalties
- If R1 and R2 are independent, a packet may match R1, R2, both, or neither.
  - Number of nodes of the tree for R1 is k1, for R2 is k2.
  - Number of states of the tree for R1 ∪ R2 is k1 * k2.
  - Combined number of nodes of the independent trees for R1 and R2 is k1 + k2.
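Exponential Blowup

R1: x == 1, R2: x == 2, R3: x == 3, R4: x == 4
R5: y == 1, R6: y == 2, R7: y == 3, R8: y == 4

[Figure: the single tree that tests x and then y (edges 1, 2, 3, 4, else; sample leaves {R1, R5}, {R1, R6}, {R2, R5}, {R2, R6}) shown next to two independent trees, one testing x (sample leaves {R1}, {R2}) and one testing y (sample leaves {R5}, {R6})]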
Ensuring Polynomial Bounds
- Breadth of a tree is a function of the breadth of its sub-trees
- Select a polynomial bound to satisfy at each node
- Pick tests that satisfy the bound
- If no test satisfies it, pick a test that comes closest to satisfying this constraint and make some outgoing edges nondeterministic
Improving Matching Time
- Utility: how much a test goes towards checking a rule
  - Based on the notion of assigning costs to tests and rules
  - Compare the cost of a rule with the combined cost of a test and the residue of the rule w.r.t. the test

Select strategy (size reduction is more important than matching time)
1. Pick a discriminating test when available
   - Pick the test with higher utility
2. Examine opportunities for benign non-determinism
3. Pick tests that satisfy the polynomial bound
Tree Size
[Figure: no. of nodes (0 to 70000) versus no. of rules (0 to 300); series: Condition Factorization, Snort NG]
Matching Time
[Figure: matching time (per packet) in ns (0 to 90) versus no. of rules (0 to 300); series: Condition Factorization, Snort NG, Snort 2]
Summary
- Developed a new technique for fast packet classification
  - Flexible: supports diverse applications in a uniform framework
  - Promotes sharing of tests
- Developed novel techniques for generating packet classification trees that
  - Have polynomial size
  - Have virtually constant matching time
- Demonstrated the gains from our technique for intrusion detection systems and firewalls