Fast Lookup for Dynamic Packet Filtering in FPGA
Reporter: Hsuan-Ju Li
2014/09/18
Design and Diagnostics of Electronic Circuits & Systems (DDECS), 17th International Symposium on, April 2014
Lukáš Kekely, Martin Žádník, Jiří Matoušek, Jan Kořenek
2
Outline
Introduction
Related Work
Design And Architecture
Evaluation And Results
Conclusion
4
Introduction
Software applications of safety- and security-critical embedded systems are often divided into several self-contained functions.
Segregation between individual system partitions and functions is used to confine error propagation.
Soft processors are one order of magnitude slower in terms of operating frequency than hard-wired devices.
5
Introduction (cont.)
Current FPGA families provide wide and fast memory attachments, mostly implemented as hard macros that are faster than configurable logic.
There is a performance gap between soft processors and the memory attachment.
The proposed architecture combines:
The specific needs of partitioned software.
The flexibility of reconfigurable hardware.
6
Introduction (cont.)
Multiple self-contained systems on a single platform FPGA.
The available memory bandwidth is shared among the systems in a predictable and scalable way.
The main building blocks of the proposed architecture are secure bus bridges, which are used to form a segregated hierarchy of memory busses.
7
Introduction (cont.)
With secure bus bridges, it is possible to use soft processors for safety- and security-critical functions.
High assurance levels can be reached with far less effort.
9
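Related Work
Cuckoo hash function
Two hash functions: h(x) and h'(x)
[Figure: keys x = {a, b, c} mapped to slots h(a), h(b), h(c) in one table and h'(a), h'(b), h'(c) in the other]
Example:
h(6) = 6 mod 11 = 6
h'(6) = floor(6/11) mod 11 = 0
x = {20, 50, 53, 75}
[Figure: two hash tables with slots indexed 0, 1, 2, 3, ...]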
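The example above can be traced with a small software sketch. It assumes the two hash functions generalize to h(x) = x mod 11 and h'(x) = floor(x/11) mod 11, as the worked value for x = 6 suggests; the function names, the `max_kicks` limit and the two-list table layout are illustrative choices, not taken from the slides.

```python
TABLE_SIZE = 11  # assumed from the "mod 11" in the slide example


def h(x):
    """First hash function, assumed to be x mod 11."""
    return x % TABLE_SIZE


def h_prime(x):
    """Second hash function, assumed to be floor(x / 11) mod 11."""
    return (x // TABLE_SIZE) % TABLE_SIZE


def cuckoo_insert(tables, key, max_kicks=16):
    """Insert key, evicting occupants back and forth between the two tables.

    Returns False if no free slot was found within max_kicks evictions
    (a real design would then rehash or use extra storage).
    """
    hashes = (h, h_prime)
    side = 0
    for _ in range(max_kicks):
        slot = hashes[side](key)
        evicted = tables[side][slot]
        tables[side][slot] = key
        if evicted is None:
            return True
        key = evicted        # the evicted key must move to its other table
        side = 1 - side
    return False


def cuckoo_lookup(tables, key):
    """A key can only be in one of its two candidate slots."""
    return tables[0][h(key)] == key or tables[1][h_prime(key)] == key


tables = [[None] * TABLE_SIZE, [None] * TABLE_SIZE]
for k in (20, 50, 53, 75):
    cuckoo_insert(tables, k)
```

With these assumptions, inserting 20, 50, 53 and 75 in that order leaves 50 and 75 in the first table (slots 6 and 9) and 20 and 53 in the second (slots 1 and 4), because 20, 53 and 75 all collide on slot 9 of the first table.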
11
Design And Architecture
A. Lookup engine interface and functionality
B. Cuckoo hash lookup engine
C. Binary search tree lookup engine
13
Design And Architecture (cont.)
Lookup engine interface and functionality
[Figure: lookup engine block and its interface; parameters: key width, data width, maximum capacity; representation in bits]
14
Design And Architecture (cont.)
Lookup engine interface and functionality
Lookup procedure
3 basic groups:
Input
Output
Configuration
15
Design And Architecture (cont.)
Lookup engine interface and functionality
Lookup procedure
3 basic groups:
[Figure: lookup engine with input keys, lookup results (outputs) and configuration; the results carry arbitrary data (routing decision, key identification) and 1 bit of information (found / invalid); label: every clock cycle]
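The three interface groups can be illustrated with a behavioural software model. This is only a sketch under stated assumptions: the class name, the `"add"` / `"remove"` request names and the use of a Python dictionary as storage are hypothetical and say nothing about how the hardware engine stores its records.

```python
from typing import Dict, Optional, Tuple


class LookupEngineModel:
    """Behavioural model: keys in, (found, data) out, separate configuration."""

    def __init__(self, key_width: int, data_width: int, max_capacity: int):
        self.key_width = key_width        # key width in bits
        self.data_width = data_width      # data width in bits
        self.max_capacity = max_capacity  # maximum number of stored keys
        self._records: Dict[int, int] = {}

    def configure(self, op: str, key: int, data: int = 0) -> bool:
        """Configuration group: hypothetical 'add' and 'remove' requests."""
        if not 0 <= key < (1 << self.key_width):
            return False
        if op == "add":
            if not 0 <= data < (1 << self.data_width):
                return False
            if key not in self._records and len(self._records) >= self.max_capacity:
                return False
            self._records[key] = data
            return True
        if op == "remove":
            return self._records.pop(key, None) is not None
        return False

    def lookup(self, key: int) -> Tuple[bool, Optional[int]]:
        """Input group: one key. Output group: 1-bit found flag plus data."""
        if key in self._records:
            return True, self._records[key]
        return False, None


engine = LookupEngineModel(key_width=8, data_width=4, max_capacity=2)
engine.configure("add", 5, 3)
result = engine.lookup(5)
```

The model answers one key per call, which stands in for the hardware accepting one key per clock cycle.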
17
Design And Architecture (cont.)
Cuckoo hash lookup engine
18
Design And Architecture (cont.)
Cuckoo hash lookup engine
Parallel computing
CRC implementation
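As a rough illustration of CRC-based hashing, the sketch below derives two table indices from one key with two different CRCs from the Python standard library. The slides do not say which CRC polynomials the engine uses, so the choice of `zlib.crc32` and `binascii.crc_hqx` is an assumption, and the function name is hypothetical. In hardware the hashes would be computed in parallel, whereas this code evaluates them one after another.

```python
import binascii
import zlib


def hash_indices(key: bytes, table_size: int) -> tuple:
    """Return one index per hash table, each derived from a different CRC."""
    index_a = zlib.crc32(key) % table_size                # CRC-32
    index_b = binascii.crc_hqx(key, 0xFFFF) % table_size  # 16-bit CRC-CCITT
    return (index_a, index_b)


# Example key: four bytes, such as an IPv4 address used as a filter key.
indices = hash_indices(bytes([10, 0, 0, 1]), table_size=1024)
```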
19
Design And Architecture (cont.)
Cuckoo hash lookup engine
Reading records
Record: key value and data
Records are read from the hash tables in memory or from the outside register
20
Design And Architecture (cont.)
Cuckoo hash lookup engine
The keys of the read records are compared for equality with the searched key
At most one comparison is successful
The engine outputs the data associated with the matching key and sets the found flag
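The comparison step can be sketched as follows, assuming each record is a (key, data) pair and that an empty slot is represented by `None`; the function name and the error raised on a double match are illustrative choices.

```python
from typing import Any, List, Optional, Tuple

Record = Tuple[int, Any]  # (key value, data)


def match_records(searched_key: int,
                  records: List[Optional[Record]]) -> Tuple[bool, Any]:
    """Compare every record that was read with the searched key.

    `records` holds one candidate per hash table plus, optionally, the
    record held in the outside register. Returns (found flag, data).
    """
    hits = [rec for rec in records
            if rec is not None and rec[0] == searched_key]
    if len(hits) > 1:
        # A key stored twice would violate "at most one comparison successful".
        raise ValueError("key stored in more than one place")
    if hits:
        return True, hits[0][1]
    return False, None


found, data = match_records(7, [(3, "drop"), None, (7, "forward")])
```

In hardware all comparisons happen at the same time; the list comprehension here only models their combined result.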
21
Design And Architecture (cont.)
Cuckoo hash lookup engine
The key set is updated based on the requests received
22
Design And Architecture (cont.)
Cuckoo hash lookup engine
The controller can evict records from the hash tables on-the-fly, preserving the set of active keys
Reconfiguration cycle
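One way to picture eviction that preserves the set of active keys is a model in which a single extra register always holds the record currently being moved, so every active key is in a table or in that register at all times. The sketch below is an assumption about how such a controller could behave, not the authors' implementation; the class name, method names and `max_cycles` limit are hypothetical.

```python
class CuckooWithRegister:
    """Cuckoo tables plus one register that holds the record in transit."""

    def __init__(self, hash_fns, table_size):
        self.hash_fns = hash_fns
        self.tables = [[None] * table_size for _ in hash_fns]
        self.register = None

    def lookup(self, key):
        """Check one slot per table and the register."""
        candidates = [tbl[fn(key)] for tbl, fn in zip(self.tables, self.hash_fns)]
        candidates.append(self.register)
        for rec in candidates:
            if rec is not None and rec[0] == key:
                return True, rec[1]
        return False, None

    def reconfiguration_cycle(self, side):
        """Swap the register with the slot its record hashes to in one table."""
        rec = self.register
        slot = self.hash_fns[side](rec[0])
        self.register, self.tables[side][slot] = self.tables[side][slot], rec
        return self.register is None  # True once nothing is left in transit

    def insert(self, key, data, max_cycles=32):
        if self.register is not None or self.lookup(key)[0]:
            return False
        self.register = (key, data)
        side = 0
        for _ in range(max_cycles):
            if self.reconfiguration_cycle(side):
                return True
            side = (side + 1) % len(self.tables)
        return False  # last evicted record stays in the register, still found


model = CuckooWithRegister([lambda k: k % 11, lambda k: (k // 11) % 11], 11)
for k in (20, 50, 53, 75):
    model.insert(k, "rule-%d" % k)
```

Because a swap never drops a record, a lookup issued between two reconfiguration cycles still finds every key that was active before the insertion started.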
23
Design And Architecture (cont.)
Cuckoo hash lookup engine
C_cuckoo = d × t + 1
d: the number of hash tables (hash functions) used
t: the size of an individual table
1: the additional reconfiguration register
25
Design And Architecture (cont.)
Binary search tree lookup engine
Each tree level is a pipeline stage
[Figure: one stage with its piece of memory and a comparator; inputs: address of a node, searched key]
26
Design And Architecture (cont.)
Binary search tree lookup engine
Containing the data associated with the key
[Figure: one stage with its piece of memory and a comparator; inputs: address of a node, searched key]
27
Design And Architecture (cont.)
Binary search tree lookup engine
Atomic operations
Result corrected according to a register
[Figure: one stage with its piece of memory and a comparator; inputs: address of a node, searched key]
28
Design And Architecture (cont.)
Binary search tree lookup engine
The capacity of the BST-based engine can be configured by the number of BST levels l.
C_bst = 2^l − 1
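The level-per-stage search can be modelled in software as below. The sketch assumes a complete binary search tree stored as one array per level, with unused nodes left empty and treated as larger than any key; the function names and this padding scheme are illustrative assumptions rather than details from the slides.

```python
def build_stages(records, levels):
    """Place sorted (key, data) records into one array per tree level.

    Level k holds 2**k nodes, so `levels` levels hold 2**levels - 1 records.
    """
    capacity = 2 ** levels - 1
    if len(records) > capacity:
        raise ValueError("more records than 2^l - 1")
    ordered = sorted(records.items()) + [None] * (capacity - len(records))
    stages = []
    for k in range(levels):
        step = 2 ** (levels - 1 - k)
        stages.append([ordered[(2 * j + 1) * step - 1] for j in range(2 ** k)])
    return stages


def bst_lookup(stages, key):
    """Walk one node per level; each iteration models one pipeline stage."""
    addr = 0                      # address of the node within the current level
    found, data = False, None
    for stage in stages:
        node = stage[addr]
        if node is not None and node[0] == key:
            found, data = True, node[1]
        go_right = node is not None and key > node[0]
        addr = 2 * addr + (1 if go_right else 0)  # address for the next level
    return found, data


stages = build_stages({10: "a", 20: "b", 30: "c", 40: "d", 50: "e"}, levels=3)
hit = bst_lookup(stages, 30)
miss = bst_lookup(stages, 35)
```

Every key passes through all l stages regardless of where it matches, which is what lets a pipelined implementation accept a new key each cycle.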
29
Design And Architecture (cont.)
A. Lookup engine interface and functionality
B. Cuckoo hash lookup engine
C. Binary search tree lookup engine
D. Top-level lookup engine
30
Design And Architecture (cont.)
Top-level lookup engine
The cuckoo and BST engines run in parallel
Both results are stored in FIFOs
[Figure: cuckoo engine and BST engine side by side; labels: stash, FIFO]
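A minimal sketch of the top level, assuming each engine is available as a function returning a (found, data) pair: both engines are queried with the same key, their results are queued in FIFOs, and the two heads are merged. The merge rule (a hit from either engine wins, cuckoo first) and all names are assumptions made for illustration.

```python
from collections import deque


def top_level_lookup(key, cuckoo_lookup, bst_lookup, cuckoo_fifo, bst_fifo):
    """Query both engines, buffer their results, then merge the FIFO heads."""
    cuckoo_fifo.append(cuckoo_lookup(key))
    bst_fifo.append(bst_lookup(key))
    # The engines may have different latencies in hardware; the FIFOs let
    # the results for the same key be paired up again.
    c_found, c_data = cuckoo_fifo.popleft()
    b_found, b_data = bst_fifo.popleft()
    if c_found:
        return True, c_data
    if b_found:
        return True, b_data
    return False, None


# Stand-in engines backed by dictionaries, for demonstration only.
cuckoo_store = {1: "allow"}
bst_store = {2: "deny"}


def from_store(store):
    return lambda k: (k in store, store.get(k))


fifo_a, fifo_b = deque(), deque()
answer = top_level_lookup(2, from_store(cuckoo_store), from_store(bst_store),
                          fifo_a, fifo_b)
```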
31
Design And Architecture (cont.)
Top-level lookup engine
The maximum capacity of the cuckoo hash with stash lookup engine can be defined from d and t of the cuckoo hash and the stash size s:
C_total = d × t + 1 + s
33
Evaluation And Results
Memory utilization can be computed in two basic ways:
U_cuckoo = (n − m) / C_cuckoo
U_total = n / C_total
n: the total number of successfully inserted keys before the memory became full
m: the number of keys that reside in the stash
The stash can always be filled up to 100% of its capacity, so m = s can always be used.
The values of n must be acquired from the test runs.
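The capacity and utilization formulas translate directly into code. The numbers in the example call (d = 2, t = 1024, s = 64, n = 1900) are made up purely to show the arithmetic and are not results from the paper; the function names are illustrative.

```python
def c_cuckoo(d, t):
    """C_cuckoo = d * t + 1: d tables of size t plus the reconfiguration register."""
    return d * t + 1


def c_total(d, t, s):
    """C_total = d * t + 1 + s: cuckoo capacity plus a stash of size s."""
    return c_cuckoo(d, t) + s


def u_cuckoo(n, m, d, t):
    """U_cuckoo = (n - m) / C_cuckoo: utilization of the cuckoo part only."""
    return (n - m) / c_cuckoo(d, t)


def u_total(n, d, t, s):
    """U_total = n / C_total: utilization of the whole engine."""
    return n / c_total(d, t, s)


# Hypothetical values: n would come from a test run, and m = s is assumed
# because the stash can always be filled completely.
d, t, s, n = 2, 1024, 64, 1900
util_cuckoo = u_cuckoo(n, s, d, t)
util_total = u_total(n, d, t, s)
```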
34
Evaluation And Results
Evaluate the relation between the achievable memory utilization of the cuckoo hash and the stash size used, for different parameters.
The memory utilization plotted in the graphs is U_cuckoo, and the size of the stash (s) is plotted as a portion of t.
35
Evaluation And Results (cont.)
40
Conclusion
The proposed architecture combines the cuckoo hash engine with the BST engine, with a focus on parallel implementation in FPGA.
41
THANK YOU