IP Routing Processing with Graphic Processors

Author: Shuai Mu, Xinya Zhang, Nairen Zhang, Jiaxin Lu, Yangdong Steve Deng, Shu Zhang

Publisher: IEEE Conference on DATE, pp. 93-98, 2010

Presenter: Ye-Zhi Chen

Date: 2011/8/24




Introduction

Internet traffic will grow at an accelerating rate, so routers have to deliver correspondingly greater processing capacity.

Throughput and Programmability

Hardware – higher throughput, but lower programmability

Software – higher programmability, but lower throughput

Network processors (NPs) – held back by the lack of mature programming models and software development tools, and by incompatible architectures

GPU – offers high-performance computing, and its programming is accessible through CUDA


CUDA

CUDA program – composed of code running on both the CPU and the GPU.

Kernel – a function called by the CPU but executed on the GPU.

Block – a group of threads; threads inside a block can exchange data through shared memory and synchronize with one another.


Network Intrusion Detection

Signature matching – checks, at line rate, whether network payloads contain pre-supplied signatures. 60%

Two Algorithms

Bloom filter –

1. hash table

2. space-efficient data structure

3. Errors – hash conflicts
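A minimal Python sketch of how a Bloom filter could prefilter payload windows during signature matching. The names `BloomFilter` and `scan_payload`, the filter size, and the salted SHA-256 hash family are illustrative assumptions, not details taken from the paper. The exact-set check after each Bloom hit is added here to discard the false positives that hash conflicts can cause.

```python
import hashlib


class BloomFilter:
    """Bit vector with k hash functions (hypothetical helper for illustration).

    Membership tests can yield false positives but never false negatives.
    """

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Assumption: derive k bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def scan_payload(payload, signatures):
    """Return (offset, signature) pairs found in payload."""
    bloom = BloomFilter()
    for sig in signatures:
        bloom.add(sig)
    exact = set(signatures)                      # used to confirm Bloom hits
    lengths = sorted({len(sig) for sig in signatures})
    hits = []
    for start in range(len(payload)):
        for n in lengths:
            window = payload[start:start + n]
            # Cheap Bloom test first; the exact lookup removes false positives.
            if len(window) == n and window in bloom and window in exact:
                hits.append((start, window))
    return hits
```

The Bloom test rejects most windows with a few bit probes, so the exact comparison only runs on the rare candidates that pass it.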

Aho-Corasick (AC) –

1. DFA
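The Aho-Corasick automaton can be flattened into a DFA whose transition table needs exactly one lookup per input byte. The sketch below shows one common way to build such a table in Python; `build_dfa` and `search` are hypothetical names, and the dense 256-column table is an assumption made for clarity rather than the layout used in the paper.

```python
from collections import deque


def build_dfa(patterns):
    """Return (table, out): table[s][b] is the next state for byte b,
    out[s] lists the patterns that end when state s is reached."""
    table, out = [[0] * 256], [[]]           # state 0 is the root
    for pat in patterns:                      # 1) insert every pattern into a trie
        s = 0
        for b in pat:
            if table[s][b] == 0:              # 0 means "no edge yet"
                table.append([0] * 256)
                out.append([])
                table[s][b] = len(table) - 1
            s = table[s][b]
        out[s].append(pat)

    fail = [0] * len(table)                   # 2) breadth-first failure links
    queue = deque(s for s in table[0] if s)
    while queue:
        s = queue.popleft()
        out[s] = out[s] + out[fail[s]]        # inherit matches ending at the suffix
        for b in range(256):
            t = table[s][b]
            if t:                             # real trie edge
                fail[t] = table[fail[s]][b]
                queue.append(t)
            else:                             # missing edge: copy the fallback move
                table[s][b] = table[fail[s]][b]
    return table, out


def search(payload, table, out):
    """Scan payload once; return (start_offset, pattern) for every match."""
    s, hits = 0, []
    for i, b in enumerate(payload):
        s = table[s][b]                       # one table lookup per input byte
        hits.extend((i - len(p) + 1, p) for p in out[s])
    return hits
```

Because scanning reduces to repeated table lookups with no backtracking, the per-byte cost is constant regardless of how many signatures are loaded.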


Implementation

Input


Implementation

Transfer packets:

Individually: the simplest way, but time-consuming

Batch: combine many small transfers into a larger one

Page-locked memory: host memory that can be mapped into the address space of the device
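The batching option can be pictured as packing many packets into one contiguous buffer plus an offset table, so that a single copy replaces many small ones. This is a host-side sketch only: `pack_batch` and `unpack` are hypothetical names, and the actual host-to-device copy is not shown.

```python
def pack_batch(packets):
    """Concatenate packets into one buffer and record where each one starts.

    offsets has len(packets) + 1 entries; packet i occupies
    buffer[offsets[i]:offsets[i + 1]].
    """
    offsets = [0]
    for pkt in packets:
        offsets.append(offsets[-1] + len(pkt))
    return b"".join(packets), offsets


def unpack(buffer, offsets, i):
    """Recover packet i from the batched buffer (what a worker would read)."""
    return buffer[offsets[i]:offsets[i + 1]]
```

With this layout, the buffer and the offset table are the only two objects that need to be transferred, however many packets the batch holds.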

Store the Bloom vector and the transition table in the GPU's texture memory.

Divide each packet into smaller chunks; every two neighboring chunks overlap by a length equal to that of the longest match text.


Routing Table Lookup

Longest prefix match (LPM)

Radix tree

Portable Routing Table

trie
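Longest prefix match over a trie can be sketched as follows. This is an uncompressed binary trie for IPv4 only, not the radix tree named on the slide; `Node`, `insert`, `lookup`, and the next-hop labels are hypothetical and chosen for illustration.

```python
import ipaddress


class Node:
    """One trie node: two children (bit 0 / bit 1) and an optional next hop."""
    __slots__ = ("children", "next_hop")

    def __init__(self):
        self.children = [None, None]
        self.next_hop = None


def insert(root, prefix, next_hop):
    """Add a route such as '10.1.0.0/16' to the trie (IPv4 assumed)."""
    net = ipaddress.ip_network(prefix)
    bits = int(net.network_address)
    node = root
    for i in range(net.prefixlen):             # walk the prefix bits, MSB first
        bit = (bits >> (31 - i)) & 1
        if node.children[bit] is None:
            node.children[bit] = Node()
        node = node.children[bit]
    node.next_hop = next_hop


def lookup(root, address):
    """Return the next hop of the longest prefix covering address, or None."""
    bits = int(ipaddress.ip_address(address))
    node, best = root, root.next_hop           # root holds the default route, if any
    for i in range(32):
        node = node.children[(bits >> (31 - i)) & 1]
        if node is None:
            break
        if node.next_hop is not None:          # a longer matching prefix was found
            best = node.next_hop
    return best
```

A lookup walks at most 32 nodes and remembers the last next hop it passed, which is the longest matching prefix. A radix tree shortens the walk by collapsing chains of single-child nodes.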


Result

CPU: 0.6 Gbit/s

GPU:

• Kernel only: 19 Gbit/s

• Transfer: 3.4 Gbit/s

• Page-locked: 17 Gbit/s


Result

CPU: 0.6 Gbit/s

GPU:

• Kernel only: 3.6 Gbit/s

• Transfer: 2.3 Gbit/s

• Page-locked: 3.2 Gbit/s


Result

The GPU performance of the DFA improves rapidly and approaches a peak throughput of 9.2 Gbit/s, more than 15 times that of the CPU.
