ENTS689L: Packet Processing and Switching

Classification Engines

Vahid Tabatabaee

Fall 2007

Post on 21-Dec-2015




References

Pankaj Gupta, “Lookups and Classification,” lecture notes for the course EE384Y: Packet Switch Architecture (Prof. Nick McKeown, Stanford University), available online at http://www.stanford.edu/class/ee384y/

Pankaj Gupta and Nick McKeown, “Algorithms for Packet Classification,” IEEE Network, March 2001.

Panos C. Lekkas, Network Processors: Architectures, Protocols, and Platforms, McGraw-Hill.


Two General Classification Problems

Look-up and Classification:

It is mainly used in simple packet routing and switching contexts.

It consists of identifying the correct output port, channel, or interface to which the packet should be forwarded.

This decision is based on the destination address.

Deep Packet Classification:

A packet must be distinguished among several others.

It is based on several internal bit fields of variable length or format.


Deep Packet Classification

Distinguished: Different processing awaits each packet after it is singled out. These different types of processing correspond to flows.

Several: Multiple rules are applied simultaneously.

Internal: The bits may be buried deeper inside the packet and are not conveniently located at a fixed position in the header.

Variable length or format: The fields are not as straightforward as 32-bit addresses; they can represent ranges of values and can be of variable length, such as Uniform Resource Locators (URLs).


Algorithms and Data Structures to Support Lookup and Forwarding


Binary Search Trees

In computer science, a binary search tree (BST) is a binary tree with the following properties:

Each node has a value.

A total order is defined on these values.

The left subtree of a node contains only values less than the node's value.

The right subtree of a node contains only values greater than or equal to the node's value.

Lookup time is O(log N) for a balanced tree of N entries, independent of the address length.

Source: Wikipedia, the free encyclopedia.


Binary Search Tries

In computer science, a trie, or prefix tree, is an ordered tree data structure used to store an associative array whose keys are strings.

Looking up keys is faster: looking up a key of length m takes O(m) time in the worst case, independent of the table size.

Tries can require less space when they contain a large number of short strings, because the keys are not stored explicitly and nodes are shared between keys. Worst-case storage is O(NW).

Tries help with longest-prefix matching, where we wish to efficiently find the key sharing the longest possible prefix with a given key.

Source: Wikipedia, the free encyclopedia.

[Figure: example trie of string keys; the surviving label text reads "tennistent".]
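A minimal Python sketch of the trie properties listed above: lookup cost depends on the key length m, not on the number of stored keys, and the same walk gives longest-prefix matching. The class names and the sample keys ("ten", "tennis", "tent") are illustrative assumptions, not taken from the slide.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # next character -> child node
        self.is_key = False   # True if a stored key ends at this node
        self.value = None     # payload associated with that key


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, value):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.is_key = True
        node.value = value

    def lookup(self, key):
        """Exact match in at most len(key) steps, whatever the table size."""
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.value if node.is_key else None

    def longest_prefix(self, query):
        """Return the longest stored key that is a prefix of query, or None."""
        node = self.root
        best = "" if node.is_key else None
        for i, ch in enumerate(query):
            node = node.children.get(ch)
            if node is None:
                break
            if node.is_key:
                best = query[:i + 1]
        return best


# Hypothetical sample keys, chosen only for illustration.
trie = Trie()
for number, word in enumerate(["ten", "tennis", "tent"]):
    trie.insert(word, number)
```

The shared path t-e-n is stored once for all three keys, which is the node sharing the slide refers to.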


Tries for Exact Matches in Ethernet Switches

We do not need to chase one bit at a time.

We can trade memory for search time.

A pointer value of 0 means no children.

Storage is O(NW), where N is the number of entries and W is their width.

[Figure: 16-ary search trie. Each node holds 16 (4-bit value, pointer) entries, from 0000 to 1111; the example paths spell the keys 000011110000 and 111111111111.]

Source: http://www.stanford.edu/class/ee384x/
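A minimal sketch of a 16-ary trie for exact matching, consuming 4 bits per level. It assumes 48-bit keys (Ethernet MAC addresses) held as Python integers; `new_node`, `insert`, `lookup`, and the port value are hypothetical names for illustration. `None` stands in for the "pointer 0" of the slide.

```python
STRIDE = 4               # bits consumed per level
DEGREE = 1 << STRIDE     # 16 entries per node
WIDTH = 48               # assumed key width in bits (Ethernet MAC address)


def new_node():
    return [None] * DEGREE   # None plays the role of pointer 0 (no child)


def digits(addr):
    """Return the 4-bit digits of a WIDTH-bit integer, most significant first."""
    return [(addr >> shift) & (DEGREE - 1)
            for shift in range(WIDTH - STRIDE, -1, -STRIDE)]


def insert(root, addr, port):
    node = root
    path = digits(addr)
    for d in path[:-1]:
        if node[d] is None:
            node[d] = new_node()
        node = node[d]
    node[path[-1]] = port    # the last level stores the result, not a pointer


def lookup(root, addr):
    node = root
    path = digits(addr)
    for d in path[:-1]:
        node = node[d]
        if node is None:     # pointer 0: the address is not in the table
            return None
    return node[path[-1]]


root = new_node()
insert(root, 0x001122334455, 3)   # hypothetical MAC address -> output port 3
```

Each lookup touches at most 48 / 4 = 12 nodes, compared with 48 for a one-bit-at-a-time trie, at the cost of 16 pointers per node.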


Trade off between speed and memory size

As the degree increases, more and more pointers are 0 (wasted).

Degree of tree  # Mem references  # Nodes (x10^6)  Total memory (Mbytes)  Fraction wasted (%)
2               48                1.09             4.3                    49
4               24                0.53             4.3                    73
8               16                0.35             5.6                    86
16              12                0.25             8.3                    93
64              8                 0.17             21                     98
256             6                 0.12             64                     99.5

Table produced from 2^15 randomly generated 48-bit addresses.

Source: http://www.stanford.edu/class/ee384x/
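The "# Mem references" column follows from the key width and the tree degree: a node of degree 2^k consumes k bits, so a 48-bit key needs 48 / k node visits. A short sketch of that arithmetic (the function name is an illustrative assumption):

```python
WIDTH = 48   # key width in bits, as in the table


def memory_references(degree, width=WIDTH):
    """Worst-case node visits for a trie whose nodes have `degree` children."""
    stride = degree.bit_length() - 1   # log2(degree) for a power of two
    return -(-width // stride)         # ceiling division


for degree in (2, 4, 8, 16, 64, 256):
    print(degree, memory_references(degree))
```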


Tries for Longest Prefix Match

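Prefix  Pattern  Next hop
P1      111*     H1
P2      10*      H2
P3      1010*    H3
P4      10101    H4

[Figure: binary trie for the prefixes above, with nodes A through H and edges labeled 0 or 1; the nodes where P1 through P4 end are marked.]

Trie node: next-hop-ptr (if prefix), left-ptr, right-ptr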


Tries for Longest Prefix Match

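Lookup 10111

[Figure: the same binary trie and prefix table (P1 = 111*, P2 = 10*, P3 = 1010*, P4 = 10101), with the search path for 10111 marked.]

The walk follows bits 1, 0, 1 and then finds no child for the next bit 1, so the longest matching prefix is P2 (10*) with next hop H2.

Trie node: next-hop-ptr (if prefix), left-ptr, right-ptr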
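The lookup above can be sketched with a one-bit-at-a-time trie whose nodes carry the three fields shown on the slide. This is a minimal sketch, assuming prefixes and addresses are given as bit strings; the function names are illustrative, and the next hop "H5" used for P5 = 1110* in the usage note is hypothetical because the slides do not name one.

```python
class Node:
    def __init__(self):
        self.next_hop = None        # next-hop-ptr, set only if a prefix ends here
        self.child = [None, None]   # left-ptr (bit 0), right-ptr (bit 1)


def insert(root, prefix, next_hop):
    """Add a prefix such as "1010"; takes at most W steps for a W-bit prefix."""
    node = root
    for bit in prefix:
        i = int(bit)
        if node.child[i] is None:
            node.child[i] = Node()
        node = node.child[i]
    node.next_hop = next_hop


def lookup(root, addr):
    """Longest-prefix match: remember the last next hop seen on the path."""
    node, best = root, root.next_hop
    for bit in addr:
        node = node.child[int(bit)]
        if node is None:
            break
        if node.next_hop is not None:
            best = node.next_hop
    return best


root = Node()
for prefix, hop in [("111", "H1"), ("10", "H2"), ("1010", "H3"), ("10101", "H4")]:
    insert(root, prefix, hop)

print(lookup(root, "10111"))   # walks 1, 0, 1, then stops: best match is 10*
```

Adding a prefix later, for example `insert(root, "1110", "H5")`, creates one new node below the node for 111* and leaves the rest of the trie untouched, which is why the update cost is O(W).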


Tries for Longest Prefix Match

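Add P5 = 1110*

[Figure: the same binary trie and prefix table (P1 = 111*, P2 = 10*, P3 = 1010*, P4 = 10101) after the insertion; a new node I, reached by a 0 edge from the node where P1 (111*) ends, stores P5.]

Trie node: next-hop-ptr (if prefix), left-ptr, right-ptr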


Radix Trie

For W-bit prefixes and N routes:

Lookup complexity: O(W)

Storage complexity: O(NW)

Update complexity: O(W)

Advantages:

Simplicity

Extensible to wider fields and larger tables

Disadvantages:

Waste of memory

Slow worst-case lookup


Leaf Pushing Technique

Leaf pushing reduces the amount of information stored in each table entry.

The best match information is pushed to the leaf nodes. Each table entry contains either a pointer or next hop information.

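[Figure: leaf-pushed trie with internal nodes A, B, C, D, E, G; the leaf entries hold P2, P3, P4, P2, and P1.]

Trie node: left-ptr or next-hop, right-ptr or next-hop

Prefix  Pattern  Next hop
P1      111*     H1
P2      10*      H2
P3      1010*    H3
P4      10101    H4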


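Leaf Pushing Technique

[Figure: the leaf-pushed trie (internal nodes A, B, C, D, E, G; leaves P2, P3, P4, P2, P1) shown next to the original binary trie (nodes A through H) for the same prefixes P1 through P4.]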
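A minimal sketch of leaf pushing, under the assumption that the leaf-pushed trie is built in one pass from an ordinary binary trie. Internal nodes become two-entry lists (left, right) and each entry is either another list (a pointer) or a next hop, matching the "pointer or next hop" entry format; `leaf_push` and the other names are illustrative, and `None` marks a leaf with no matching prefix.

```python
class Node:
    def __init__(self):
        self.next_hop = None
        self.child = [None, None]


def insert(root, prefix, next_hop):
    node = root
    for bit in prefix:
        i = int(bit)
        if node.child[i] is None:
            node.child[i] = Node()
        node = node.child[i]
    node.next_hop = next_hop


def leaf_push(node, inherited=None):
    """Return a leaf-pushed copy of the subtree rooted at node."""
    if node is None:
        return inherited                  # missing child becomes a leaf
    best = node.next_hop if node.next_hop is not None else inherited
    if node.child[0] is None and node.child[1] is None:
        return best                       # original leaf keeps its own hop
    return [leaf_push(node.child[0], best), leaf_push(node.child[1], best)]


def lookup(entry, addr):
    """Walk until an entry is a next hop; no per-node best match is tracked."""
    for bit in addr:
        if not isinstance(entry, list):
            break
        entry = entry[int(bit)]
    return None if isinstance(entry, list) else entry


root = Node()
for prefix, hop in [("111", "H1"), ("10", "H2"), ("1010", "H3"), ("10101", "H4")]:
    insert(root, prefix, hop)
pushed = leaf_push(root)
```

For the four example prefixes this yields six internal nodes and the leaves H2, H3, H4, H2, H1 (plus empty leaves), and `lookup(pushed, "10111")` ends at the H2 leaf under 1011. Rebuilding after adding a short prefix such as 1* means re-running the push over everything below that node, which is the update cost the next slide points out.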


Incremental Rebuilding with Leaf Pushing

Information changes at a node close to the root can potentially change a large number of leaves.

Example: add P5 = 1*

[Figure: the leaf-pushed trie (internal nodes A, B, C, D, E, G) before and after adding P5 = 1*; after the update, P5 appears in the leaf next to P1.]


Multi-bit Tries

Faster search, larger memory.

Binary trie: depth = W, degree = 2, stride = 1 bit

Multi-ary trie: depth = W/k, degree = 2^k, stride = k bits


Prefix Expansion with Multi-bit Tries

If stride = k bits, prefixes whose length is not a multiple of k need to be expanded.

Maximum number of expanded prefixes corresponding to one non-expanded prefix = 2^(k-1)

E.g., k = 2:

Prefix  Expanded prefixes
0*      00*, 01*
11*     11*
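Prefix expansion can be sketched in a few lines of Python. The function names and the route-table format (a dict from bit-string prefix to next hop) are assumptions for illustration. When an expanded prefix collides with a longer original prefix, the longer one must win, so the table is filled from shortest to longest.

```python
def expand(prefix, stride):
    """Expand a bit-string prefix to the next multiple of `stride` bits."""
    pad = -len(prefix) % stride           # missing bits, between 0 and stride-1
    if pad == 0:
        return [prefix]
    return [prefix + format(i, "0{}b".format(pad)) for i in range(1 << pad)]


def expand_table(routes, stride):
    """Expand every route; longer original prefixes overwrite shorter ones."""
    table = {}
    for prefix, hop in sorted(routes.items(), key=lambda item: len(item[0])):
        for expanded in expand(prefix, stride):
            table[expanded] = hop
    return table


print(expand("0", 2))    # the slide's example: 0* becomes 00* and 01*
print(expand("11", 2))   # already a multiple of 2 bits: unchanged
```

The worst case is a prefix one bit past a stride boundary: it is missing k-1 bits and expands into 2^(k-1) prefixes.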


Example 4-ary Trie

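Lookup 10111

[Figure: four-ary trie (stride 2) with nodes A through H. P2 and P3 are stored unchanged; P1 (111*) is expanded into P11 and P12 (1110*, 1111*), and P4 (10101) into P41 and P42 (101010, 101011). Edges are labeled with 2-bit values.]

A four-ary trie node: next-hop-ptr (if prefix), ptr00, ptr01, ptr10, ptr11

Prefix  Pattern  Next hop
P1      111*     H1
P2      10*      H2
P3      1010*    H3
P4      10101    H4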
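A minimal sketch of a fixed-stride four-ary trie with the node layout named above (one next-hop field plus ptr00 to ptr11). Prefixes are expanded to a multiple of the stride on insertion. The names are illustrative, and padding an odd-width address with zeros is an assumption of this sketch that is only safe when no prefix is longer than the address.

```python
STRIDE = 2                                    # k bits per level (four-ary trie)


class Node:
    def __init__(self):
        self.next_hop = None                  # next-hop-ptr (if prefix)
        self.length = -1                      # length of the original prefix stored here
        self.child = [None] * (1 << STRIDE)   # ptr00, ptr01, ptr10, ptr11


def expand(prefix):
    """Pad a bit-string prefix to a multiple of STRIDE in every possible way."""
    pad = -len(prefix) % STRIDE
    if pad == 0:
        return [prefix]
    return [prefix + format(i, "0{}b".format(pad)) for i in range(1 << pad)]


def insert(root, prefix, next_hop):
    for expanded in expand(prefix):
        node = root
        for i in range(0, len(expanded), STRIDE):
            index = int(expanded[i:i + STRIDE], 2)
            if node.child[index] is None:
                node.child[index] = Node()
            node = node.child[index]
        if len(prefix) >= node.length:        # a longer original prefix wins
            node.next_hop, node.length = next_hop, len(prefix)


def lookup(root, addr):
    addr += "0" * (-len(addr) % STRIDE)       # assumption: zero-pad odd-width addresses
    node, best = root, root.next_hop
    for i in range(0, len(addr), STRIDE):
        node = node.child[int(addr[i:i + STRIDE], 2)]
        if node is None:
            break
        if node.next_hop is not None:
            best = node.next_hop
    return best


root = Node()
for prefix, hop in [("111", "H1"), ("10", "H2"), ("1010", "H3"), ("10101", "H4")]:
    insert(root, prefix, hop)

print(lookup(root, "10111"))   # two node visits instead of four: 10 matches, 11 has no child
```

The example builds eight nodes, the same count as the figure, but a lookup visits at most three of them for a 5-bit address.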


Memory expansion in Multi-bit Tries

Replication of next-hop pointers (more leaf nodes).

Greater number of unused (null) pointers in a node (2^k children, not only 2).

Time ~ W/k

Storage ~ (NW/k) * 2^(k-1)


Generalization: Different Strides at different levels.

16-8-8 split

4-10-10-8 split

24-8 split

21-3-8 split


Deep Packet Classification

Checking Multiple Fields


Motivation: Desire for Additional Services

[Figure: network diagram with ISP1, ISP2, ISP3, a NAP, router E1, and interfaces X, Y, and Z.]

Service: Example

Differentiated service: Ensure that traffic from ISP2 is given higher priority than traffic from ISP3.

Packet filtering: Deny all web traffic from ISP3 at interface X.

Policy-based routing: Ensure that all web traffic from ISP2 is sent via interface Z.

Other examples: accounting and billing, rate limiting, etc.


Special Processing Requires Identification of Flows

All packets of a flow obey a pre-defined rule and are processed similarly by the router

E.g. a flow = (src-IP-address, dst-IP-address), or a flow = (dst-IP-prefix, protocol) etc.

Router needs to identify the flow of every incoming packet and then perform appropriate special processing based on negotiated service agreements


Flow-aware Router: Basic Architectural Components

Control: routing, resource reservation, admission control, SLAs

Datapath (per-packet processing): routing lookup, packet classification, special processing, switching, scheduling


Multi-field Packet Classification

Packet Classification: Find the action associated with the highest priority rule matching an incoming packet header.

Rule    Field 1 (L3-DA)  Field 2 (L3-SA)  …  Field k (L4-PROT)  Action
Rule 1  5.3.40.0/21      2.13.8.11/32     …  UDP                A1
Rule 2  5.168.3.0/24     152.133.0.0/16   …  TCP                A2
…       …                …                …  …                  …
Rule N  5.168.0.0/16     152.0.0.0/8      …  ANY                AN

Example: packet (L3-DA = 5.168.3.32, L3-SA = 152.133.171.71, …, L4-PROT = TCP)


Example 4D Classifier

Rule  L3-DA (address/mask)            L3-SA (address/mask)             L4-DP        L4-PROT  Action
R1    152.163.190.69/255.255.255.255  152.163.80.11/255.255.255.255    *            *        Deny
R2    152.168.3.0/255.255.255.0       152.163.200.157/255.255.255.255  eq www       udp      Deny
R3    152.168.3.0/255.255.255.0       152.163.200.157/255.255.255.255  range 20-21  udp      Permit
R4    152.168.3.0/255.255.255.0       152.163.200.157/255.255.255.255  eq www       tcp      Deny
R5    *                               *                                *            *        Deny


Example Classification Results

Pkt Hdr  L3-DA           L3-SA            L4-DP  L4-PROT  Rule, Action
P1       152.163.190.69  152.163.80.11    www    tcp      R1, Deny
P2       152.168.3.21    152.163.200.157  www    udp      R2, Deny
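The two results can be reproduced with a first-match scan over the example 4D classifier, using Python's standard `ipaddress` module. This is a minimal sketch: it assumes rules are listed in decreasing priority, that "www" means port 80, and that the address/mask pairs translate to the CIDR networks shown in the code; the tuple layout and function name are illustrative.

```python
import ipaddress

WWW = 80             # assumed numeric value of the "www" port keyword
ANY_PORT = (0, 65535)

# (rule, destination network, source network, (port low, port high), protocol, action)
# A protocol of None stands for the wildcard "*".
RULES = [
    ("R1", "152.163.190.69/32", "152.163.80.11/32", ANY_PORT, None, "Deny"),
    ("R2", "152.168.3.0/24", "152.163.200.157/32", (WWW, WWW), "udp", "Deny"),
    ("R3", "152.168.3.0/24", "152.163.200.157/32", (20, 21), "udp", "Permit"),
    ("R4", "152.168.3.0/24", "152.163.200.157/32", (WWW, WWW), "tcp", "Deny"),
    ("R5", "0.0.0.0/0", "0.0.0.0/0", ANY_PORT, None, "Deny"),
]


def classify(dst, src, dport, proto):
    """Return (rule, action) of the first, i.e. highest-priority, matching rule."""
    d = ipaddress.ip_address(dst)
    s = ipaddress.ip_address(src)
    for name, dnet, snet, (low, high), rule_proto, action in RULES:
        if (d in ipaddress.ip_network(dnet)
                and s in ipaddress.ip_network(snet)
                and low <= dport <= high
                and rule_proto in (None, proto)):
            return name, action
    return None


print(classify("152.163.190.69", "152.163.80.11", WWW, "tcp"))    # packet P1
print(classify("152.168.3.21", "152.163.200.157", WWW, "udp"))    # packet P2
```

The scan checks every field of every rule in turn, so its cost grows linearly with the number of rules.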


Geometric Interpretation

[Figure: rules R1 through R7 drawn as rectangles in the plane spanned by Dimension 1 and Dimension 2, with packets P1 and P2 shown as points.]

Packet classification problem: Find the highest priority rectangle containing an incoming point


Metrics for Classification Algorithms

Speed

Storage requirements

Ability to handle large classifiers

Low preprocessing time

Update time

Scalability in the number of header fields

Flexibility in rule specification


Linear Search

Keep rules in a linked list.

O(N) storage, O(N) lookup time, O(1) update complexity


TCAMs (Recap)

Advantages

Extensible to multiple fields

Fast: 6-8 ns today (133-150 million searches per second), going to 250 Msps

Simple to understand and use

Disadvantages

Inflexible: range-to-prefix blowup

Power: ~15-20 W @ 100 Msps

Cost: $200-$250 for ~2 MByte

Density: largest available in 2006 is ~2 MB, i.e., 128K x 128 (can be cascaded)

Tough memory soft-error problem
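The range-to-prefix blowup arises because a TCAM entry matches a prefix (ternary pattern), so an arbitrary port range has to be split into several entries. A minimal sketch of the usual greedy split, assuming 16-bit port fields; the function name is illustrative.

```python
def range_to_prefixes(low, high, width=16):
    """Split the inclusive range [low, high] into a minimal list of prefixes."""
    prefixes = []
    while low <= high:
        # Largest power-of-two block that is aligned at `low` ...
        size = (low & -low) if low else (1 << width)
        # ... shrunk until it no longer overshoots `high`.
        while size > high - low + 1:
            size //= 2
        fixed = width - (size.bit_length() - 1)       # number of fixed leading bits
        bits = format(low, "0{}b".format(width))[:fixed]
        prefixes.append(bits + "*" if fixed < width else bits)
        low += size
    return prefixes


print(range_to_prefixes(20, 21))            # the "range 20-21" rule needs one entry
print(len(range_to_prefixes(1, 65534)))     # a worst case for 16-bit fields
```

A W-bit range can need up to 2W - 2 prefixes, and a rule with ranges in two fields needs the product of the two counts, which is where the TCAM entries go.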


Example Classifier

Rule  Destination Address  Source Address
R1    0*                   10*
R2    0*                   01*
R3    0*                   1*
R4    00*                  1*
R5    00*                  11*
R6    10*                  1*
R7    *                    00*


Hierarchical Tries

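O(NW) memory, O(W^2) lookup

Rule  DA   SA
R1    0*   10*
R2    0*   01*
R3    0*   1*
R4    00*  1*
R5    00*  11*
R6    10*  1*
R7    *    00*

Search (000,010)

[Figure: hierarchical trie. A trie on dimension DA whose prefix nodes point to tries on dimension SA holding rules R1 through R7; the search path for (000,010) is marked.]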
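A minimal sketch of a two-dimensional hierarchical trie for the example classifier. It assumes that rules are listed in decreasing priority (R1 highest), which the slides do not state, and all names are illustrative. Each DA node where some rule's DA prefix ends points to an SA trie holding just those rules.

```python
RULES = [("R1", "0", "10"), ("R2", "0", "01"), ("R3", "0", "1"),
         ("R4", "00", "1"), ("R5", "00", "11"), ("R6", "10", "1"),
         ("R7", "", "00")]          # (name, DA prefix, SA prefix); "" means *


class Node:
    def __init__(self):
        self.child = {}             # "0" or "1" -> Node
        self.rules = []             # priorities of rules ending here (SA tries)
        self.next_dim = None        # root of the SA trie (DA trie nodes only)


def descend(root, prefix):
    """Return the node for prefix, creating the path if needed."""
    node = root
    for bit in prefix:
        node = node.child.setdefault(bit, Node())
    return node


def walk(root, bits):
    """Yield every existing node on the path spelled by bits, root first."""
    node = root
    yield node
    for bit in bits:
        node = node.child.get(bit)
        if node is None:
            return
        yield node


def build(rules):
    da_root = Node()
    for priority, (name, da, sa) in enumerate(rules):
        da_node = descend(da_root, da)
        if da_node.next_dim is None:
            da_node.next_dim = Node()
        descend(da_node.next_dim, sa).rules.append(priority)
    return da_root


def classify(da_root, da, sa):
    matches = []
    for da_node in walk(da_root, da):              # every matching DA prefix
        if da_node.next_dim is not None:
            for sa_node in walk(da_node.next_dim, sa):
                matches.extend(sa_node.rules)
    return RULES[min(matches)][0] if matches else None


trie = build(RULES)
print(classify(trie, "000", "010"))   # searches the SA tries of *, 0* and 00*
```

Each rule is stored once, but a query walks one SA trie per matching DA prefix, which gives the O(W^2) lookup bound.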


Set-pruning Tries [Tsuchiya, Sri98]

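Reduced query time obtained by replicating rules to eliminate traversals.

O(N^2 W) memory, O(2W) lookup

Rule  DA   SA
R1    0*   10*
R2    0*   01*
R3    0*   1*
R4    00*  1*
R5    00*  11*
R6    10*  1*
R7    *    00*

[Figure: set-pruning trie. Each prefix node of the DA trie points to an SA trie that also holds copies of the rules from shorter matching DA prefixes; for example, R7 is replicated into every SA trie.]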


Set-pruning Tries [Tsuchiya, Sri98]

Reduced query time obtained by replicating rules to eliminate traversals.

O(N^2 W) memory, O(2W) lookup

Search (000,010)

[Figure: the same set-pruning trie and rule table (R1 through R7) as on the previous slide, with the search path for (000,010) marked.]
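A minimal sketch of a two-dimensional set-pruning trie for the example classifier, assuming rules are listed in decreasing priority (R1 highest); all names are illustrative. Every rule is copied into the SA trie of each DA node that its DA prefix covers, so a query only searches the SA trie of the longest matching DA prefix.

```python
RULES = [("R1", "0", "10"), ("R2", "0", "01"), ("R3", "0", "1"),
         ("R4", "00", "1"), ("R5", "00", "11"), ("R6", "10", "1"),
         ("R7", "", "00")]          # (name, DA prefix, SA prefix); "" means *


class Node:
    def __init__(self):
        self.child = {}             # "0" or "1" -> Node
        self.rules = []             # priorities of rules ending here (SA tries)
        self.next_dim = None        # root of the SA trie (DA trie nodes only)


def descend(root, prefix):
    node = root
    for bit in prefix:
        node = node.child.setdefault(bit, Node())
    return node


def walk(root, bits):
    node = root
    yield node
    for bit in bits:
        node = node.child.get(bit)
        if node is None:
            return
        yield node


def build(rules):
    da_root = Node()
    for name, da, sa in rules:                  # pass 1: create the DA trie
        descend(da_root, da).next_dim = Node()

    def fill(node, prefix):                     # pass 2: replicate rules downward
        if node.next_dim is not None:
            for priority, (name, da, sa) in enumerate(rules):
                if prefix.startswith(da):       # rule's DA prefix covers this node
                    descend(node.next_dim, sa).rules.append(priority)
        for bit, child in node.child.items():
            fill(child, prefix + bit)

    fill(da_root, "")
    return da_root


def classify(da_root, da, sa):
    sa_root = None
    for da_node in walk(da_root, da):
        if da_node.next_dim is not None:
            sa_root = da_node.next_dim          # keep the longest matching DA prefix
    if sa_root is None:
        return None
    matches = [p for sa_node in walk(sa_root, sa) for p in sa_node.rules]
    return RULES[min(matches)][0] if matches else None


trie = build(RULES)
print(classify(trie, "000", "010"))   # one DA walk, then one SA walk
```

For these seven rules the SA tries end up holding 13 rule copies instead of 7, which illustrates the memory cost paid for the shorter lookup.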


Recursive Flow Classification

It looks at classification as mapping S bits onto T bits.

S bits are the concatenation of all fields.

T bits represent the classification outcomes.

It breaks down the mapping task into multiple stages.

At each stage, one set of values is mapped to a smaller set.


RFC Algorithm

1. In the first phase, fields of the packet header are split up into multiple chunks that are used to index into multiple memories in parallel. The contents of each memory are chosen so that the result of the lookup is narrower than the index.

2. In subsequent phases, memories are indexed using the results from earlier phases.

3. In the final phase, the memory yields the action d
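A toy sketch of the phase structure described above, not the full RFC algorithm. It assumes two 3-bit fields and prefix rules (the DA/SA pairs of a seven-rule example), with rules listed in decreasing priority; real RFC works on 16-bit chunks of the header and may use more than two phases. The first phase maps each field value to an equivalence-class ID (all values matched by the same set of rules share an ID), and the final phase indexes a table with the pair of IDs.

```python
W = 3                                # assumed field width in bits (toy size)
RULES = [("R1", "0", "10"), ("R2", "0", "01"), ("R3", "0", "1"),
         ("R4", "00", "1"), ("R5", "00", "11"), ("R6", "10", "1"),
         ("R7", "", "00")]           # (name, DA prefix, SA prefix); "" means *


def matches(prefix, value):
    return format(value, "0{}b".format(W)).startswith(prefix)


def phase0(field):
    """Build one first-phase memory: field value -> equivalence-class ID."""
    class_ids, table, rule_sets = {}, [], []
    for value in range(1 << W):
        rule_set = frozenset(i for i, rule in enumerate(RULES)
                             if matches(rule[field], value))
        if rule_set not in class_ids:
            class_ids[rule_set] = len(rule_sets)
            rule_sets.append(rule_set)
        table.append(class_ids[rule_set])
    return table, rule_sets


DA_TABLE, DA_SETS = phase0(1)        # indexed by the 3-bit DA value
SA_TABLE, SA_SETS = phase0(2)        # indexed by the 3-bit SA value

# Final-phase memory: indexed by the two class IDs, yields the best rule.
FINAL = [[min(d & s, default=None) for s in SA_SETS] for d in DA_SETS]


def classify(da, sa):
    rule = FINAL[DA_TABLE[da]][SA_TABLE[sa]]
    return None if rule is None else RULES[rule][0]


print(classify(0b000, 0b010))
```

Here each 3-bit field collapses to 4 classes (2 bits), so the final table has 16 entries instead of the 64 that a direct lookup on both fields would need. That is the "result narrower than the index" property of step 1, on a very small scale.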


RFC Performance

RFC is shown to perform classification at 31.25 Mpps using a three-stage pipeline.

It requires two 4 Mb SRAMs and four banks of 64 Mb SDRAM running at a 125 MHz clock rate.

It is estimated to support 15,000 rules at 10 Gbps.
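The packet rate and the line rate quoted here are consistent with each other if packets are assumed to be 40 bytes long (a common minimum-size figure, but an assumption here: the slide does not state the packet size):

```latex
\[
\frac{10 \times 10^{9}\ \text{bit/s}}{40\ \text{bytes} \times 8\ \text{bit/byte}}
  = 31.25 \times 10^{6}\ \text{packets/s}
\]
```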


Classification: What’s Used Out There?

Majority of hardware platforms: TCAMs

High performance, cost, power; deterministic worst case

Some others: modifications of RFC

Low speed, low cost

DRAM-based, heuristic

Works well in software platforms

Some others: HyperCuts/HiCuts

Others: nothing, linear search, simulated parallel search, etc.


Lookup: What’s Used Out There?

Overwhelming majority of routers: modifications of multi-bit tries (h/w optimized trie algorithms)

DRAM (sometimes SRAM) based, large number of routes (>0.25M)

Parallelism required for speed/storage becomes an issue

Others: mostly TCAM based

Allows sharing the same TCAM for both lookup and classification


Packet Classification: References

F. Baboescu and G. Varghese, “Scalable packet classification,” Proc. Sigcomm 2001.

[Lak98] T.V. Lakshman and D. Stiliadis, “High speed policy based packet forwarding using efficient multi-dimensional range matching,” Sigcomm 1998, pp. 191-202.

K. Lakshminarayanan, A. Rangarajan and S. Venkatachary, “Algorithms for advanced packet classification with Ternary CAMs,” Sigcomm 2005.

[Sri98] V. Srinivasan, S. Suri, G. Varghese and M. Waldvogel, “Fast and scalable layer 4 switching,” Sigcomm 1998, pp. 203-214. [Grid-of-tries, crossproducting]

V. Srinivasan, G. Varghese and S. Suri, “Fast packet classification using tuple space search,” Sigcomm 1999, pp. 135-146.

P. Gupta and N. McKeown, “Packet classification using hierarchical intelligent cuttings,” Hot Interconnects VII, 1999.

[Gupta99] P. Gupta and N. McKeown, “Packet classification on multiple fields,” Sigcomm 1999, pp. 147-160. [RFC]

P. Gupta, “Algorithms for routing lookups and packet classification,” PhD Thesis, Ch. 1 and 4, Dec 2000, available at http://yuba.stanford.edu/~pankaj/phd.html [Background and introduction to Classification]

P. Gupta and N. McKeown, “Algorithms for packet classification,” IEEE Network, March/April 2001, vol. 15, no. 2, pp. 24-32.

S. Singh, F. Baboescu, G. Varghese and J. Wang, “Packet classification using multidimensional cutting,” Proc. ACM Sigcomm 2003. [HyperCuts]

S. Iyer, R.R. Kompella and A. Shelat, “ClassiPI: An architecture for fast and flexible packet classification,” IEEE Network, March/April 2001, vol. 15, no. 2, pp. 33-41.

TCAM vendors: netlogicmicro.com, idt.com