FD.io Vector Packet Processing


Page 1: FD.IO Vector Packet Processing


Overview

Kirill Tsym, Next Generation Enforcement team

FD.IO VECTOR PACKET PROCESSING

Page 2: FD.IO Vector Packet Processing


CHECK POINT SOFTWARE TECHNOLOGIES

The largest pure-play security vendor in the world

Protecting more than 100,000 companies with millions of users worldwide

$1.63B annual revenues in 2015

Over 4,300 employees

Partners in over 95 countries

Page 3: FD.IO Vector Packet Processing


Lecture agenda

Linux networking stack vs. user space networking initiatives
– Why user space networking? Why so many projects around it?

Introduction to FD.io and VPP
– Architecture, vectors, graph, etc.

VPP data path
– Typical graphs
– Examples of supported topologies

VPP Threads and scheduling

Single and Multicore support

Supported topologies

Page 4: FD.IO Vector Packet Processing


LINUX KERNEL STACK


Page 5: FD.IO Vector Packet Processing


Linux kernel data path

[Diagram: applications in user space above the kernel TCP/IP stack and forwarding path; NIC1 and NIC2 at the HW layer handle Rx/Tx]

Design goals, or why is the stack in the kernel?
– Linux is designed as an Internet Host (RFC 1122), an "End-System" OS
– It needs to service multiple applications
– Separate user applications from sensitive kernel code
– Make applications as simple as possible
– The kernel keeps direct access to the HW drivers

Cost
– Not optimized for forwarding
– Every change requires a new kernel version
– The code is too generic
– The networking stack today is a huge part of the kernel

[Diagram: kernel data path layers: drivers (L1/L2), forwarding (L3), sockets (L4/L5), applications (L7); a packet either passes through or takes the application path]

Reference: Kernel Data Path

Page 6: FD.IO Vector Packet Processing


Linux stack: the whole picture

Reference: Network_data_flow_through_kernel

Page 7: FD.IO Vector Packet Processing


Linux stack packet processing

Packets are processed in the kernel one by one
– A lot of code is involved in processing each packet
– The processing path is monolithic; it is impossible to change it or load new stack modules
– Instruction cache optimization is impossible in this model
– There are techniques to hijack kernel routines or define hooks, but no simple, standard way to replace tcp_input(), for example

skb processing is not cache optimized
– The sk_buff struct includes too much information
– Ideally all needed sk_buff's would be loaded into cache before processing
– But an skb neither fits in a cache line nor is laid out contiguously
– As a result there is no data cache optimization, and usually a lot of cache misses

Every change requires a new kernel version
– Upstreaming a new protocol takes a very long time
– Standardization moves much faster than implementation

Page 8: FD.IO Vector Packet Processing


USER SPACE NETWORKING PROJECTS


Page 9: FD.IO Vector Packet Processing


Netmap

[Diagram: the application uses the netmap API against netmap rings mapped to the NIC rings, bypassing the Linux networking stack in the kernel]

Pros
– BSD, Linux and Windows ports
– Good scalability
– Data path is detached from the host stack
– Widely adopted

Cons
– No networking stack
– Routing is done in the host stack, which slows down initial processing

Performance

Packet forwarding      Mpps
FreeBSD bridging       0.690
Netmap + libpcap       7.5
Netmap                 14.88

Reference: netmap - the fast packet I/O framework

A minimal receive loop is sketched below.
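To make the netmap model concrete, here is a minimal receive loop sketched against the public netmap_user.h wrapper API (a hedged sketch: error handling is omitted and the interface name is illustrative):

#define NETMAP_WITH_LIBS
#include <net/netmap_user.h>
#include <poll.h>

int main(void)
{
    /* Open the interface in netmap mode; its rings are mapped
       into this process, bypassing the kernel stack. */
    struct nm_desc *d = nm_open("netmap:eth0", NULL, 0, NULL);
    struct pollfd pfd = { .fd = NETMAP_FD(d), .events = POLLIN };
    struct nm_pkthdr hdr;

    for (;;) {
        poll(&pfd, 1, -1);              /* block until packets arrive */
        unsigned char *buf;
        while ((buf = nm_nextpkt(d, &hdr)) != NULL) {
            /* Process hdr.len bytes at buf, zero-copy from the ring. */
        }
    }
    /* not reached */
    nm_close(d);
    return 0;
}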

Page 10: FD.IO Vector Packet Processing


DPDK / Forwarding engine

[Diagram: fast path, where packets flow NIC1 to DPDK in user space to NIC2; slow path, where packets needing routing decisions go up through the Kernel Networking Interface to the Linux networking stack and back]

Pros
– Kernel independent
– All packet processing done in user space
– The DPDK fast path is optimized for cache usage and a minimal instruction count

Cons
– No networking stack
– No routing stack
– Packets must be sent to the kernel for routing decisions
– Doesn't perform well in scaling tests
– No external API
– No integration with management
– Out-of-tree drivers

A fast-path loop sketch follows below.
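The fast path in the diagram is essentially a user-space polling loop. A hedged sketch of such a loop using the standard DPDK burst API (EAL and port initialization omitted; port ids and burst size are illustrative):

#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

/* Forward every burst received on rx_port out of tx_port,
   entirely in user space. */
static void forward_loop(uint16_t rx_port, uint16_t tx_port)
{
    struct rte_mbuf *bufs[BURST_SIZE];

    for (;;) {
        /* Poll the RX ring: grab up to BURST_SIZE packets at once. */
        uint16_t nb_rx = rte_eth_rx_burst(rx_port, 0, bufs, BURST_SIZE);
        if (nb_rx == 0)
            continue;

        uint16_t nb_tx = rte_eth_tx_burst(tx_port, 0, bufs, nb_rx);

        /* Drop whatever the TX ring could not accept. */
        for (uint16_t i = nb_tx; i < nb_rx; i++)
            rte_pktmbuf_free(bufs[i]);
    }
}

Anything this loop cannot handle, such as a packet needing a routing decision, has to be punted to the kernel, which is exactly the slow path shown above.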

Page 11: FD.IO Vector Packet Processing


OpenFastPath

– A BSD networking stack on top of DPDK and ODP
– OpenDataPlane (ODP) is a cross-platform, open source data plane API for SoC networking
– Supported by Nokia, ARM, Cavium and ENEA
– Includes optimized IP, UDP and TCP stacks
– Routes and MACs are kept in sync with Linux through Netlink

Page 12: FD.IO Vector Packet Processing


Other projects

OpenSwitch
– NOS with main component: DPDK-based Open vSwitch
– Various management and CLI daemons
– Routing decisions made by the Linux kernel (ouch!)
– REST API
– Good for inter-VM communications

OpenOnload
– A user-level network stack from Solarflare
– Depends on Solarflare NICs (ouch!)

IO Visor
– XDP, or eXpress Data Path
– Not user space networking!
– Tries to bring performance into the existing kernel with BPF
– No need for 3rd-party code
– Allows the option of busy polling
– No need to allocate large pages
– No need for dedicated CPUs

Page 13: FD.IO Vector Packet Processing


FD.IO

Page 14: FD.IO Vector Packet Processing


FD.io project overview

• FD.io is a Linux Foundation project
– A collection of several projects based on the Data Plane Development Kit (DPDK)
– Distributed under the Apache license
– The key project, Vector Packet Processing (VPP), was donated by Cisco
– A proprietary version of VPP runs in the Cisco CRS-1 router
– The open sourced VPP version has no tool chain, OS, etc.
– VPP is about 300K lines of code
– Major contributor: Cisco Chief Technology and Architecture Office team

• Three main components
– Management Agent
– Packet Processing
– IO

• VPP roadmap
– The first release (16.06, June 2016) includes 14 Mpps single-core L3 performance
– The 16.09 release includes integration with containers and orchestration
– The 17.01 release will include dpdk-16.11, DPDK CryptoDev, enhanced NAT, etc.

Page 15: FD.IO Vector Packet Processing


VPP ideas

• CPU cycle budget
– 14 Mpps on a 3.5 GHz CPU = a budget of 250 cycles per packet
– A memory access takes ~67 ns, the cost of fetching one cache line (64 bytes), or ~134 CPU cycles
– A cache miss is unacceptable

• Solution
– Perform all processing with a minimum of code
– Process more than one packet at a time
– Grab all available packets from the Rx ring on every cycle
– Perform each atomic task in a dedicated node

• VPP optimization techniques
– Branch prediction hints
– Use of vector instructions (SSE, AVX)
– Prefetching: do not prefetch too much, to keep the cache warm
– Speculation: guess the packet destination instead of doing a full lookup
– Dual loops (see the sketch below)
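To illustrate the dual-loop technique, a simplified sketch in plain C (hedged: real VPP nodes use the vlib buffer macros and also prefetch buffer metadata; process_packet() is a hypothetical helper):

static void dual_loop(char **pkts, int n_left)
{
    while (n_left >= 4) {
        /* Prefetch packets 2 and 3 while packets 0 and 1 are processed,
           so their cache lines are warm when their turn comes. */
        __builtin_prefetch(pkts[2]);
        __builtin_prefetch(pkts[3]);

        process_packet(pkts[0]);
        process_packet(pkts[1]);

        pkts += 2;
        n_left -= 2;
    }

    /* Single-loop tail for the remaining packets. */
    while (n_left > 0) {
        process_packet(pkts[0]);
        pkts += 1;
        n_left -= 1;
    }
}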

Page 16: FD.IO Vector Packet Processing


VPP architecture

[Diagram: DPDK, the VPP IP stack and VPP plugins all in user space over NIC1/NIC2; the fast path covers VPP I/O tasks (polling logic + L2), L3 tasks and user-defined tasks]

VPP Pros
– Kernel independent
– All packet processing done in user space
– DPDK based (or netmap, virtio, host, etc.)
– Includes a full-scale L2/L3 networking stack
– Routing decisions made by VPP
– Also includes a bridge implementation
– Good plugins framework
– Integrated with external management: Honeycomb

Cons
– Young project
– First stable release ~06/16
– Many open areas:
– OpenStack integration / Neutron
– Lack of transport layer integration
– Control plane API & stack

But what about L4/L7?
– The TLDK project

Page 17: FD.IO Vector Packet Processing


Performance

– VPP data plane throughput is not impacted by a large IPv4 FIB size
– OVS-DPDK data plane throughput is heavily impacted by the IPv4 FIB size
– VPP and OVS-DPDK tested on a Haswell x86 platform with E5-2698v3 2x16C 2.3 GHz (Ubuntu 14.04 trusty)

Reference: FD.io intro

Page 18: FD.IO Vector Packet Processing


TLDK

VPP TLDK application layer (project)

[Diagram: DPDK and VPP in user space over NIC1/NIC2; on top of VPP, TLDK serves a purpose-built TLDK application, a BSD socket layer for socket applications, and an LD_PRELOAD socket layer for native Linux applications]

TLDK Application Layer
– Uses the TLDK library to process TCP and UDP packets

Purpose-Built Application
– Uses the TLDK API directly (as a VPP node)
– Provides the highest performance

BSD Socket Layer
– A standard BSD socket layer for applications that use sockets by design
– Lower performance, but good compatibility

LD_PRELOAD Socket Layer
– Used to allow a native Linux binary application to be ported into the system
– Allows an existing application to work without any change (see the sketch below)
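A hedged sketch of the LD_PRELOAD layer's core trick: a shared object interposes on socket() and redirects TCP/UDP sockets to the user-space stack, while everything else falls through to libc. The tldk_socket() entry point is hypothetical; only the interposition pattern is the point:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <sys/socket.h>

/* Hypothetical entry point into the user-space TCP/UDP stack. */
extern int tldk_socket(int domain, int type, int protocol);

int socket(int domain, int type, int protocol)
{
    /* Redirect IPv4 TCP/UDP sockets to the user-space stack... */
    if (domain == AF_INET && (type == SOCK_STREAM || type == SOCK_DGRAM))
        return tldk_socket(domain, type, protocol);

    /* ...and hand everything else to the real libc socket(). */
    int (*real_socket)(int, int, int) =
        (int (*)(int, int, int)) dlsym(RTLD_NEXT, "socket");
    return real_socket(domain, type, protocol);
}

/* Build: cc -shared -fPIC preload.c -o libpreload.so -ldl
   Run:   LD_PRELOAD=./libpreload.so ./native_linux_app       */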

Page 19: FD.IO Vector Packet Processing


VPP Nodes and Graph

[Diagram: six nodes (Node 1..Node 6) connected in a directed graph; a vector of packets flows along the edges]

– Processing is divided per node
– A node works on a vector of packets
– Nodes are connected into a graph
– The graph can be changed dynamically

A minimal node declaration is sketched below.
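For concreteness, a minimal sketch of how a node is declared with the public vlib API (hedged: names and next nodes are illustrative, and a real node function would walk the packet vector rather than pass it through):

#include <vlib/vlib.h>

/* The node function receives a frame, i.e. a vector of buffer
   indices, and returns how many packets it handled. */
static uword my_node_fn(vlib_main_t *vm, vlib_node_runtime_t *node,
                        vlib_frame_t *frame)
{
    /* A real node processes frame->n_vectors buffers here and
       enqueues each one to one of its next nodes. */
    return frame->n_vectors;
}

/* Registration links the node into the graph; next_nodes names the
   edges it can enqueue packets to. */
VLIB_REGISTER_NODE(my_node) = {
    .function = my_node_fn,
    .name = "my-node",
    .vector_size = sizeof(u32),
    .type = VLIB_NODE_TYPE_INTERNAL,
    .n_next_nodes = 1,
    .next_nodes = {
        [0] = "error-drop",
    },
};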

Page 20: FD.IO Vector Packet Processing


DATA PATH

Page 21: FD.IO Vector Packet Processing


Data path - ping

• Full zero copy
• Data always resides in huge pages memory
• The vector is passed from graph node to node during processing

Node graph (running on core 0):
dpdk-input → ethernet-input → ipv4-input → ipv4-local → ipv4-icmp-input → ipv4-icmp-echo-request → ipv4-rewrite-local → GigabitEthernet-output → GigabitEthernet-tx

– The NIC places packet data directly into huge pages memory
– The VPP vector (of packet pointers) is created while the input device node runs

A packet-trace example follows below.
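On a live system this path can be observed with the standard VPP packet tracer, which records the nodes each packet visits:

vpp# trace add dpdk-input 10
vpp# show trace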

Page 22: FD.IO Vector Packet Processing


Vector processing - split example

input-device (DPDK) → ethernet-input, where the input vector is split:
– output vector A → ipv4-input
– output vector B → ipv6-input
Both continue to GigabitEthernet-output → GigabitEthernet-tx

– In the transmit queue, packets are reordered
– The next node is called twice by the threads scheduler (once per output vector)

Page 23: FD.IO Vector Packet Processing


Vector processing - cloning example

dpdk-input → ethernet-input → ipv4-input → ipv4-frag → GigabitEthernet-output → GigabitEthernet-tx

– ipv4-frag turns the input vector into an output vector with 2x the packets
– The max vector size is 256; if the output vector is full, two vectors are created for the transmit queue

Page 24: FD.IO Vector Packet Processing


Rx features example: IPsec flow

[Diagram: dpdk-input → ethernet-input → ipv4-input; inbound IPsec traffic takes ipsec-if-input → esp-decrypt → ipv4-local, while outbound traffic takes ipsec-if-output → esp-encrypt → ipv4-rewrite-local → GigabitEthernet-output → GigabitEthernet-tx]

– The ipsec-if node is dynamically registered to receive IPsec traffic using Rx Features when the interface comes up
– Done through a rewrite adjacency

Page 25: FD.IO Vector Packet Processing


THREADS AND SCHEDULING

Page 26: FD.IO Vector Packet Processing


Threads scheduling

One VPP scheduling cycle (see the pseudocode after this list):

1. PRE-INPUT
– Purpose: Linux input and system control
– Examples: unix_epoll_input, dhcp-client, management stack interface

2. INPUT
– Purpose: packet input
– Examples: dpdk_io_input, dpdk_input, tuntap_rx

3. INTERRUPTS
– Purpose: run suspended processes
– Example: expired timers

4. PENDING NODES DISPATCH
– Purpose: process all vectors that need additional processing after changes
– Runs on: worker threads and main

5. INTERNAL NODES DISPATCH
– Purpose: process all pending vectors on the VPP graph
– Runs on: worker threads and main
– This is where the main work happens: L2/L3 stack processing and Tx
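Put together, one cycle of the loop looks roughly like this (hedged pseudocode; the function names are illustrative, not the actual vlib internals):

/* One VPP scheduling cycle, as described above. */
for (;;) {
    run_pre_input_nodes();          /* e.g. unix_epoll_input          */
    run_input_nodes();              /* e.g. dpdk_input fills vectors  */
    resume_expired_processes();     /* timers, suspended processes    */
    dispatch_pending_nodes();       /* vectors needing further work   */
    dispatch_internal_nodes();      /* graph processing: L2/L3 + Tx   */
}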

Page 27: FD.IO Vector Packet Processing


Threads zoom-in

vpp# show run
Time 9.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
             Name                 State        Calls   Vectors  Suspends   Clocks  Vectors/Call
admin-up-down-process           event wait         0         0         1   6.52e3          0.00
api-rx-from-ring                active             0         0         6   1.04e5          0.00
cdp-process                     any wait           0         0         1   1.10e5          0.00
cnat-db-scanner                 any wait           0         0         1   5.34e3          0.00
dhcp-client-process             any wait           0         0         1   6.58e3          0.00
dpdk-process                    any wait           0         0         3   2.73e6          0.00
flow-report-process             any wait           0         0         1   6.19e3          0.00
gmon-process                    time wait          0         0         2   5.36e8          0.00
ip6-icmp-neighbor-discovery-ev  any wait           0         0        10   1.81e4          0.00
startup-config-process          done               1         0         1   2.64e5          0.00
unix-cli-stdin                  event wait         0         0         1   3.05e9          0.00
unix-epoll-input                polling     24811921         0         0   9.48e2          0.00
vhost-user-process              any wait           0         0         1   3.24e4          0.00
vpe-link-state-process          event wait         0         0         1   7.10e3          0.00
vpe-oam-process                 any wait           0         0         5   1.37e4          0.00
vpe-route-resolver-process      any wait           0         0         1   9.52e3          0.00
vpp# exit
# ps -elf | grep vpp
4 R root 20566     1 92 80 0 - 535432 -      16:10 ?      00:00:27 vpp -c /etc/vpp/startup.conf
0 S root 20582  1960  0 80 0 -   4293 pipe_w 16:10 pts/34 00:00:00 grep --color=auto vpp
#

Page 28: FD.IO Vector Packet Processing


SINGLE AND MULTICORE MODES

Page 29: FD.IO Vector Packet Processing


VPP threading modes

• Single-threaded
– Both control and the forwarding engine run on a single thread

• Multi-threaded with workers only
– Control runs on the main thread (API, CLI)
– Forwarding is performed by one or more worker threads

• Multi-threaded with IO and workers
– Control on the main thread (API, CLI)
– An IO thread handles input and dispatches to worker threads
– Worker threads do the actual work, including interface Tx
– RSS is in use

• Multi-threaded with main and IO on a single thread
– Workers separated by core

The mode is configured in startup.conf; see the example below.

[Diagram: per-mode core layouts showing which cores carry the control, IO and worker (Rx/Tx) roles across cores 0..3]
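The threading mode is selected in /etc/vpp/startup.conf (the file visible in the ps output earlier). A hedged example of the cpu stanza for the workers-only mode, with illustrative core numbers:

cpu {
    main-core 0             # control: API, CLI
    corelist-workers 1-3    # forwarding worker threads
}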

Page 30: FD.IO Vector Packet Processing


SUPPORTED TOPOLOGIES

Page 31: FD.IO Vector Packet Processing


Router and Switch for namespaces

Reference

Page 32: FD.IO Vector Packet Processing


QUESTIONS?

Page 33: FD.IO Vector Packet Processing


VPP Capabilities

• Why VPP?
– The Linux kernel is good, but moves too slowly because of backward compatibility
– Standardization today moves faster than implementations
– The main reason for VPP's speed: optimal usage of the instruction cache
– Do not trash the cache with packet-by-packet processing as in the standard IP stack
– Separation of data plane and control plane: VPP is a pure data plane

• Main ideas
– Separation of data plane and control plane
– API generation; bindings available for Java, C and Python
– OpenStack integration
– Neutron ML2 driver
– OPNFV / ODL-GBP / ODL-SFC (service chaining: firewalls, NAT, QoS)

• Containers
– VPP can run in the host, connecting containers to each other
– Or VPP can run inside containers and they talk to each other

Page 34: FD.IO Vector Packet Processing


Connection between various layers

[Diagram: dpdk-input → ethernet-input → ip-input → udp-local, plus a plugin node attached at the driver level; arrows distinguish callback registration from data flow]

– udp-local is attached with ip4_register_protocol() for the UDP protocol
– ip-input is attached with ethernet_register_input_type() for the IPv4 ethertype
– A plugin attaches with vnet_hw_interface_rx_redirect_to_node(), defined in plugin code
– The next node is hardcoded in dpdk-input/handoff-dispatch

These calls are sketched below.
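A hedged sketch of those attachment calls as they would appear in node init code (the function names are taken from the slide and the VPP sources; exact signatures may differ between releases):

/* L2 -> L3: deliver frames with the IPv4 ethertype to ip4-input. */
ethernet_register_input_type(vm, ETHERNET_TYPE_IP4,
                             ip4_input_node.index);

/* L3 -> L4: deliver IP protocol 17 (UDP) to the UDP input node. */
ip4_register_protocol(IP_PROTOCOL_UDP, udp4_input_node.index);

/* Plugin: redirect all rx traffic of one interface into a plugin node. */
vnet_hw_interface_rx_redirect_to_node(vnm, hw_if_index, my_node.index);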

Page 35: FD.IO Vector Packet Processing


Output attachment point

ipv4-input → ipv4-lookup → ipv4-rewrite-transit

VPP Adjacency: a mechanism to add and rewrite the next node dynamically after the routing lookup. Available nodes:
– miss
– drop
– punt
– local
– rewrite
– classify
– map
– map_t
– sixrd
– hop_by_hop
*A possible place for a POSTROUTING hook

VPP Rx Features: a mechanism to add and rewrite the next node dynamically after ipv4-input. Available nodes:
– input acl (*prerouting)
– source check rx
– source check any
– ipsec
– vpath
– lookup
*Currently impossible to do this from plugins (see the note below)

L3 nodes → various L4 nodes → various post-routing nodes
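For reference, later VPP releases lifted the plugin restriction noted above through the feature-arc API. A hedged sketch of what such an rx-feature registration looks like there (names from later VPP sources, not from the release this deck describes):

#include <vnet/feature/feature.h>

/* Register a node as an rx feature on the ip4-unicast arc, so it
   runs after ipv4-input and before ip4-lookup. */
VNET_FEATURE_INIT(my_ip4_feature, static) = {
    .arc_name = "ip4-unicast",
    .node_name = "my-node",              /* node registered elsewhere */
    .runs_before = VNET_FEATURES("ip4-lookup"),
};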