©2016 Open-NFP 1
SmartNIC Programming Models
Johann Tönsing 2016-11-09
©2016 Open-NFP 2
Agenda
• SmartNIC hardware
• Pre-programmed vs. custom (C and/or P4) firmware
• Programming models / offload models
• Switching on NIC, with SR-IOV / virtio data delivery
• SmartNIC performance + TCO
• Silicon and datapath software architectures
• Example code
©2016 Open-NFP 3
Agilio™ CX SmartNIC Hardware
• Optimized for standard server based cloud data centers
• Low Profile Half Length PCIe form factor, power < 25W
• Based on Netronome’s NFP-4xxx silicon (72 C programmable cores, 8 threads each)
• 2GB DRAM for lookup tables / state tables (millions of entries)
• Dataplane fully implemented in software
Port options shown: 1x 40GbE, 2x 40GbE, 2x 10GbE, 2x 25GbE
Also available: Agilio™ LX 2x40G / 1x100G with dual PCIe interfaces, 120 cores, 8GB DRAM…
©2016 Open-NFP 4
SmartNIC Firmware: Pre-programmed or Custom
• SmartNIC with dynamically downloadable firmware
Pre-programmed firmware:
• OVS / vRouter / eBPF+XDP datapath on host can be accelerated by SmartNIC
• Firmware / drivers supplied by Netronome
• Diagram labels: OpenStack, ONOS, ODL; vRouter, OVS; Linux, BSD; DPDK; eBPF
Custom firmware:
• Firmware can be developed in P4 and/or C (app.P4, app.C)
• Custom dataplane
• Integrated Development Environment: Edit - Compile - Debug (HW, simulator), with Editor, Compiler, Debugger, Run-Time
Hybrid:
• “Sandbox / plugin” concept
• Example: C plugin embedded in P4
©2016 Open-NFP 5
Programming Models
(Diagram: four models, each drawn across CPU User Mode, CPU Kernel, and NIC w. NFP.)
Offload / Acceleration (transparent - drop in replacement)
• OVS / OpenFlow: Match, Action, Tunnels
• Contrail vRouter: Match, Action, Tunnels
• Conntrack (Firewall)
• eBPF / XDP
• ...
• Diagram labels: VM / App; DataPath (Match, Act) in CPU Kernel; DataPath (Match, Act) on NIC w. NFP; offload, fallback
CPU Apps Calling APIs (Compatible / Open Sourced vs. Vendor Extension)
• DPDK Poll Mode Driver
• eBPF / XDP APIs
• Flow APIs - Match / Act / Tunnel
• Load Balancing APIs
• Crypto APIs
• ...
• Diagram labels: App, API; DataPath; DataPath (Match, Act) on NIC w. NFP
Flexible Datapath Abstraction (OpenFlow 2.x, P4, PIF, eBPF...)
• Protocol agnostic flexible parsing
• Arbitrary arrangement of matching tables
• Matching without tables
• State storage / retrieval
• Complex actions
• Event handling
• Diagram labels: App, API; DataPath; Data Path stages Parse, Match, State, Act, Match on NIC w. NFP
Hybrid: Datapath Extensions in CPU / NFP (in C, P4 / PIF, ...)
• Custom tunnel
• Custom action
• Custom matching
• ...
• Diagram labels: App, API; DataPath; DataPath (Match, Act), App on NIC w. NFP
©2016 Open-NFP 6
Traditional Model: SR-IOV
Diagram labels: Virtual Machines (Apps, netdev or DPDK); x86 Userspace; OVS; Kernel DP Match/Act; PCIe; SR-IOV; Traditional NIC with SR-IOV Traffic Director
1 Configuration from cloud management system (Nova, Neutron)
2 Hardware configuration (SR-IOV Configuration)
• Low expressiveness (MAC/VLAN based traffic directing)
• High performance
• Poor manageability (no VM migration)
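The "MAC/VLAN based traffic directing" that limits SR-IOV can be pictured as a single exact-match lookup from (destination MAC, VLAN) to a virtual function. The following is a minimal sketch of that idea only; the struct, the function name `sriov_steer`, and the `VF_NONE` marker are illustrative assumptions, not a real NIC or driver API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VF_NONE (-1)   /* hypothetical "no matching VF" result */

/* One steering rule: frames with this destination MAC and VLAN go to vf. */
struct vf_rule {
    uint8_t  dst_mac[6];
    uint16_t vlan_id;
    int      vf;
};

/* Exact match on (MAC, VLAN) only. A linear scan keeps the sketch short;
 * real hardware would use a filter table. Nothing else in the packet
 * (IP addresses, ports, tunnels) can influence the decision. */
int sriov_steer(const struct vf_rule *rules, size_t n,
                const uint8_t dst_mac[6], uint16_t vlan_id)
{
    for (size_t i = 0; i < n; i++) {
        if (rules[i].vlan_id == vlan_id &&
            memcmp(rules[i].dst_mac, dst_mac, 6) == 0)
            return rules[i].vf;
    }
    return VF_NONE;
}
```

Because the key is fixed to two fields, general match/action rules, tunnel termination, and stateful firewalling cannot be expressed in this model.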
©2016 Open-NFP 7
Traditional Model: Host Runs Datapath
Diagram labels: Virtual Machines (Apps, netdev or DPDK); host - guest channel (virtio or other equivalent); x86 Userspace with Control Agent and CLI; x86 Kernel with Datapath (P4, Open vSwitch, vRouter…): Matching, Execute Action; OVS Kernel DP Match/Act; PCIe; Traditional NIC
1 Configuration via network protocol or CLI (Nova, Neutron)
2 Userspace agent populates table entries
• High expressiveness (software)
• Consumes many cores and/or exhibits low performance
©2016 Open-NFP 8
Offload Model: Agilio™ OVS Acceleration
Diagram labels: Virtual Machines (Apps, netdev or DPDK) attached via SR-IOV / VirtIO VFs; x86 Userspace with Open vSwitch Subsystem (OVS Agent, OVS CLI, Callable API, OpenFlow); x86 Kernel with OVS Kernel DP Match/Act, Conn track, Execute Action, DP Ext.; PCIe; Agilio Intelligent Server Adapter (NIC) with Open vSwitch Datapath: Self Learning Exact Match Flow Tracker (Hit / Miss), OVS Kernel DP Match/Act (Miss), Conn track (FTP, SIP), Execute Action (e.g. Entunnel, Deliver to VM, Send to Port), Datapath Extension or Plugin (P4 / C in Sandbox)
1 Configuration via controller, CLI, or Callable API (Nova, Neutron)
2 OVS userspace agent populates kernel cache
3 Offload datapath: copy match tables, sync stats
4 Flow tracking: per-microflow state learning
5 Offload connection tracking: synchronize state
6 Datapath extension software
Best of all worlds
- Performance of SR-IOV
- Flexibility of virtio (VM migration)
- Performance and CPU core saving of switching on SmartNIC
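The hit / miss paths on this slide can be read as a three-level lookup: an exact-match flow tracker first, then the wildcard match table copied down from the host, and finally a fallback to the host on a miss. The sketch below models only that control flow. All type and function names (`nic_datapath`, `nic_classify`, and so on) are hypothetical, the hash is arbitrary, and none of this is taken from the Agilio firmware.

```c
#include <stddef.h>
#include <stdint.h>

enum verdict { TO_HOST = 0, FORWARD = 1, DROP = 2 };

struct flow_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

/* Wildcard rule: a key field takes part in the match only where mask bits are set. */
struct wc_rule { struct flow_key value, mask; enum verdict action; };

#define CACHE_SLOTS 256

struct cache_entry { int valid; struct flow_key key; enum verdict action; };

struct nic_datapath {
    struct cache_entry cache[CACHE_SLOTS];   /* exact-match flow tracker */
    const struct wc_rule *rules;             /* table copied from the host */
    size_t n_rules;
    unsigned cache_hits, wildcard_hits, host_misses;
};

static int key_eq(const struct flow_key *a, const struct flow_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->proto == b->proto;
}

static unsigned key_hash(const struct flow_key *k)
{
    uint32_t h = k->src_ip * 2654435761u;
    h ^= k->dst_ip * 2246822519u;
    h ^= ((uint32_t)k->src_port << 16) | k->dst_port;
    h ^= k->proto;
    return h % CACHE_SLOTS;
}

static int wc_match(const struct wc_rule *r, const struct flow_key *k)
{
    const struct flow_key *m = &r->mask, *v = &r->value;
    return (k->src_ip & m->src_ip) == (v->src_ip & m->src_ip) &&
           (k->dst_ip & m->dst_ip) == (v->dst_ip & m->dst_ip) &&
           (k->src_port & m->src_port) == (v->src_port & m->src_port) &&
           (k->dst_port & m->dst_port) == (v->dst_port & m->dst_port) &&
           (k->proto & m->proto) == (v->proto & m->proto);
}

enum verdict nic_classify(struct nic_datapath *dp, const struct flow_key *k)
{
    struct cache_entry *e = &dp->cache[key_hash(k)];

    /* Flow tracker hit: reuse the cached action. */
    if (e->valid && key_eq(&e->key, k)) {
        dp->cache_hits++;
        return e->action;
    }
    /* Flow tracker miss: scan the wildcard table and learn the microflow. */
    for (size_t i = 0; i < dp->n_rules; i++) {
        if (wc_match(&dp->rules[i], k)) {
            e->valid = 1;
            e->key = *k;
            e->action = dp->rules[i].action;
            dp->wildcard_hits++;
            return e->action;
        }
    }
    /* Miss in both: hand the packet to the host datapath. */
    dp->host_misses++;
    return TO_HOST;
}
```

The first packet of a flow pays for the wildcard scan; later packets of the same flow are resolved by the exact-match entry alone, which is the point of per-microflow learning.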
©2016 Open-NFP 9
Example: Throughput vs. Number of Rules
[Bar chart. Y axis: Millions of Packets per Second (ticks 5 to 30). Series: 100, 1000, 10000, and 64000 Wildcard Rules. Groups: OVS in Kernel Space (12 CPU Cores), OVS in User Space on DPDK (12 CPU Cores), OVS Offloaded to Agilio™ CX (1 CPU Core).]
5X Throughput Improvement + 90% CPU Savings
L2/L3 Forwarding to 8 VMs with 64K Flows
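The headline figures can be sanity-checked with simple arithmetic: going from 12 host cores to 1 frees 11/12 of them, roughly 92%, which is in line with the quoted 90%. The sketch below is illustrative only; the function names are made up, and the per-core figure assumes the 5X improvement is measured against a 12-core software baseline, which the slide does not state explicitly.

```c
/* Fraction of host cores freed when the datapath needs cores_after
 * instead of cores_before. */
double core_savings(double cores_before, double cores_after)
{
    return 1.0 - cores_after / cores_before;
}

/* Gain in throughput per host core, given an overall throughput ratio
 * (assumption: the ratio compares the two configurations directly). */
double per_core_gain(double throughput_ratio,
                     double cores_before, double cores_after)
{
    return throughput_ratio * cores_before / cores_after;
}
```

With the slide's numbers, `core_savings(12, 1)` is about 0.917, and under the stated assumption `per_core_gain(5, 12, 1)` is 60.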
©2016 Open-NFP 10
Tested Scenario: Server CPU Core Allocation
[Diagram comparing server core allocation: Unaccelerated OVS (Kernel / User Mode) vs. Agilio™ OVS]
Typical result: replace 3-6 racks with 1 rack!
Benefits for your use case: https://www.netronome.com/products/ovs/roi-calculator/
©2016 Open-NFP 11
P4 Datapath on SmartNIC
Diagram labels: Virtual Machines (Apps, netdev or DPDK) attached via SR-IOV / virtio VFs; x86 Server with P4 Control, P4 Control Agent, CLI, (Nova, Neutron), annotation "or successor"; PCIe; Agilio™ SmartNIC with P4 Datapath: P4 Matching, Execute Action (e.g. Entunnel, Deliver to VM, Send to Port)
1 Configuration via control protocol or CLI
2 Agent populates tables in SmartNIC datapath
©2016 Open-NFP 12
Traditional vs. SmartNIC Accelerated Networking
Forwarding / Virtual Switching Technology: SR-IOV vs. Intelligent Datapath (P4 / vRouter / Open vSwitch…)
Traditional Approach (Unaccelerated)
• SR-IOV: limited expressiveness to direct traffic to VMs (no support for general match/action rules, tunnel termination, stateful firewalling); high throughput; no VM migration support (difficult to manage)
• Intelligent Datapath: high expressiveness - match/action, tunnels, stateless/stateful firewalling etc.; limited throughput; high CPU utilization (e.g. 50% of cores)
Fully Programmable SmartNIC Accelerated Approach
• High expressiveness - match/action, tunnels, stateless/stateful firewalling etc. and SR-IOV based data delivery to VMs
• High throughput
• Virtio supporting VM migration (facilitating cloud optimization and upgrading)
• Higher throughput (~5x higher)
• Lower CPU utilization (~10x lower)
©2016 Open-NFP 13
Flow Processor Silicon Architecture
Network Flow Processor 4xxx (used on Agilio-CX SmartNICs)
▪ Highly parallel multithreaded architecture (8 threads / core) for high throughput
▪ Purpose built Flow Processing Cores (72) maximize flexibility
▪ H/W accelerators further maximize efficiency (throughput/watt)
Fully software defined feature set - examples:
▪ Network and PCIe SR-IOV / VirtIO RX/TX with stateless offloads
▪ Flexible tunneling support (e.g. VXLAN, GRE, VLAN, MPLS, NSH)
▪ Flexible Match/Action processing - many packet fields / protocols
▪ Highly scalable and fine grained security policies
External DDR3 accommodates millions of flows / rules
Convenient programmability using P4 and C
PCIe Gen3 x8
Ethernet 10G, 40G, …
Up to 8GB DRAM
©2016 Open-NFP 14
SmartNIC Datapath “Worker” Software Architecture
• Load balancer distributes each packet to next available thread for optimum throughput
• Hardware assisted reordering ensures packet order is maintained
• Flow tracker statefully learns / tracks millions of sessions
• Matching performed using DRAM-backed tables - capacity millions of entries
• Actions efficiently performed in on-chip memory
Diagram: net or PCIe -> Load Balance -> pool of worker threads on flow processing cores -> Re-order -> net or PCIe
Each worker runs to completion: Flow Tracker (learn microflows, cache action) and P4 Datapath (Parse, Match, Act) with C Plugin
Diagram legend: Ring / Work Queue (multi producer / consumer)
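Handing each packet to the next free thread means packets can finish out of order, so a re-order stage has to release them in arrival order. The slide attributes this to hardware assistance on the NFP; the sketch below is only a software model of the idea, with sequence numbers assigned at load-balance time and a small release window. The names (`reorder_complete`, `reorder_release`, `WINDOW`) are illustrative assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define WINDOW 8   /* power of two, so slot indexing survives sequence wrap */

struct reorder {
    uint32_t next_seq;        /* next sequence number allowed to leave */
    int      ready[WINDOW];   /* 1 if the packet parked in this slot is done */
    int      pkt[WINDOW];     /* packet handle parked in this slot */
};

/* A worker reports that the packet with sequence number seq is finished.
 * Returns 0 if seq falls outside the current window. */
int reorder_complete(struct reorder *r, uint32_t seq, int pkt)
{
    if (seq - r->next_seq >= WINDOW)
        return 0;
    r->ready[seq % WINDOW] = 1;
    r->pkt[seq % WINDOW] = pkt;
    return 1;
}

/* Release the longest run of finished packets starting at next_seq.
 * Returns how many handles were written to out. */
size_t reorder_release(struct reorder *r, int *out, size_t max)
{
    size_t n = 0;
    while (n < max && r->ready[r->next_seq % WINDOW]) {
        size_t slot = r->next_seq % WINDOW;
        out[n++] = r->pkt[slot];
        r->ready[slot] = 0;
        r->next_seq++;
    }
    return n;
}
```

If workers finish sequence numbers 2 and 1 before 0, nothing is released until 0 completes, at which point all three leave in order.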
©2016 Open-NFP 15
Datapath Distributed over Microengines
• Worker uses ring to forward packets to other microengine(s)
Diagram: net or PCIe -> Load Balance -> pool of worker threads on microengines -> Re-order -> net or PCIe
Each worker runs to completion: Flow Tracker (learn microflows, cache action) and C / P4 Datapath (Parse, Match, Act)
Two “Coprocessor” microengine blocks run C Code, each shown with a Crypto block
Diagram legend: Ring / Work Queue (multi producer / consumer)
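The ring that a worker uses to pass a packet to another microengine behaves like a bounded FIFO of packet descriptors. A minimal single-threaded sketch follows. It deliberately omits the synchronization a real multi producer / consumer ring needs, and the names (`ring_put`, `ring_get`, `RING_SIZE`) are illustrative assumptions rather than an NFP API.

```c
#include <stdint.h>

#define RING_SIZE 4   /* must be a power of two */

struct ring {
    uint32_t head;              /* count of descriptors taken so far */
    uint32_t tail;              /* count of descriptors put so far */
    uint32_t slot[RING_SIZE];   /* packet descriptors / handles */
};

/* Producer side (e.g. a worker): returns 0 when the ring is full. */
int ring_put(struct ring *r, uint32_t desc)
{
    if (r->tail - r->head == RING_SIZE)
        return 0;
    r->slot[r->tail++ % RING_SIZE] = desc;
    return 1;
}

/* Consumer side (e.g. a coprocessor microengine): returns 0 when empty. */
int ring_get(struct ring *r, uint32_t *desc)
{
    if (r->tail == r->head)
        return 0;
    *desc = r->slot[r->head++ % RING_SIZE];
    return 1;
}
```

A full ring gives the producer a natural back-pressure signal: it can retry, drop, or process the packet locally instead.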
©2016 Open-NFP 16
Example: P4 “main” implementing a simple NIC
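// Ethernet header: destination MAC, source MAC, EtherType
header_type eth_hdr {
    fields {
        dst : 48;
        src : 48;
        etype : 16;
    }
}
header eth_hdr eth;

// Parsing starts here and extracts only the Ethernet header
parser start {
    return eth_parse;
}

parser eth_parse {
    extract(eth);
    return ingress;
}

// Drop the packet
action drop_act() {
    drop();
}

// Send the packet to the given egress port
action fwd_act(port) {
    modify_field(standard_metadata.egress_spec, port);
}

// Choose an action per ingress port; entries are installed at run time
table in_tbl {
    reads {
        standard_metadata.ingress_port : exact;
    }
    actions {
        fwd_act;
        drop_act;
    }
}

control ingress {
    apply(in_tbl);
}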
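To make the run-time behavior of `in_tbl` concrete, here is a small C model of the same logic: a control agent installs one entry per ingress port, and the datapath applies `fwd_act` or `drop_act` on a hit. This is a sketch of the semantics, not generated code and not the SDK's run-time interface. Names such as `in_tbl_add`, `pkt_meta`, and `MAX_PORTS` are assumptions, and a miss is modelled as "no action".

```c
#include <stdint.h>

#define MAX_PORTS 16   /* hypothetical number of ingress ports */

enum act_id { ACT_NONE = 0, ACT_FWD, ACT_DROP };

/* One table entry: which action to run and, for fwd_act, its port argument. */
struct in_tbl_entry { enum act_id act; uint16_t port; };

/* Exact match on a small integer key reduces to array indexing. */
struct in_tbl { struct in_tbl_entry by_ingress[MAX_PORTS]; };

/* Stand-in for the standard_metadata fields the P4 program touches. */
struct pkt_meta { uint16_t ingress_port; uint16_t egress_spec; int dropped; };

/* Control-plane side: install or overwrite the entry for one ingress port. */
int in_tbl_add(struct in_tbl *t, uint16_t ingress_port,
               enum act_id act, uint16_t port)
{
    if (ingress_port >= MAX_PORTS)
        return 0;
    t->by_ingress[ingress_port].act = act;
    t->by_ingress[ingress_port].port = port;
    return 1;
}

/* Datapath side: the equivalent of "control ingress { apply(in_tbl); }". */
void ingress(const struct in_tbl *t, struct pkt_meta *m)
{
    if (m->ingress_port >= MAX_PORTS)
        return;                          /* treated as a table miss */
    const struct in_tbl_entry *e = &t->by_ingress[m->ingress_port];
    switch (e->act) {
    case ACT_FWD:  m->egress_spec = e->port; break;   /* fwd_act(port) */
    case ACT_DROP: m->dropped = 1;           break;   /* drop_act() */
    case ACT_NONE:                           break;   /* miss: no action */
    }
}
```

Installing "port 0 forwards to port 1" and "port 1 forwards to port 0" turns the two ports into a simple wire, which is one way to read the slide's "simple NIC".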
©2016 Open-NFP 17
Example: C Code
©2016 Open-NFP 18
Example of Fully Customized Datapath (P4 / C)
Diagram labels:
• P4 / C Development Environment: Edit - Debug
• Server (x86 - Linux): Control App (populate tables, display statistics), Agent, Run-Time Interface
• Virtual Machine 1: VNF Kernel Mode (C), netdev
• Virtual Machine 2: VNF User Mode (C), DPDK
• PCIe
• Agilio™ SmartNIC: P4 Datapath with C, run to completion; Match Protocol (TCP, Other); Meter; Security µVNF (C); Timestamp µVNF and Latency Stats µVNF (C), each shown twice; RT I/F Helper
Concepts:
• P4 and C running on SmartNIC implements datapath - e.g. defines protocols, match / action behavior
• Datapath steers traffic to VNFs running on x86 server and on SmartNIC
©2016 Open-NFP 19
Next Steps
• Use Agilio™ SmartNICs with existing dataplanes
  • Use Agilio™ OVS (with / without Conntrack)
  • Use Agilio™ Contrail vRouter
  • Use Agilio™ eBPF/XDP
• Program Agilio™ SmartNICs (following sessions - SDK, P4, OVS, eBPF/XDP)
  • Program using P4, C, eBPF/XDP…
• Participate in open source and standards evolution: openstack.org, openvswitch.org, opencontrail.org, p4.org, iovisor.org, open-nfp.org, opennetworking.org, opensourcesdn.org, opnfv.org, linuxfoundation.org
  • Examples: P4 / OpenFlow callable run-time API (cross-body effort starting), acceleration APIs
Increase flexibility, improve performance, free up server resources!
© 2016 NETRONOME
More information: netronome.com and open-nfp.org
Thank You!
20