![Page 1: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/1.jpg)
Linux Kongress, 2010-09-23 in Nürnberg
Robert Olsson, Uppsala University
Olof Hagsand, KTH
Control and forwarding plane separation on an open-source router
![Page 2: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/2.jpg)
More than 10 years in production at Uppsala University
- 2 * XEON E5630, TYAN 7025, 4 * 10g ixgbe SFP+ LR/SR
- Full Internet routing via EBGP/IBGP
- Local BGP peering in Uppsala
- IPv4/IPv6, OSPF
- Now at 10g towards ISP, SFP+ 850 nm, 1310 nm
[Network diagram labels: Stockholm (twice), ISP/SUNET AS1653, AS 2834, UU 1, UU 2, L green, L red, DMZ, internal UUNet]
![Page 3: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/3.jpg)
Motivation
Separate the control plane from the forwarding plane, à la IETF ForCES
- Control plane: sshd, bgp, stats, etc. on CPU core 0
- Forwarding plane: bulk forwarding on core1, ..., coreN
This leads to robustness of service against overload, DoS attacks, etc.
Enabled by:
- multicore CPUs
- NIC hw classifiers
- fast buses (QPI / PCI-E gen2)
![Page 4: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/4.jpg)
Control-plane separation on a multi-core
[Figure: incoming traffic enters a classifier in the router. Control traffic goes to the Control Element CE (core0); forwarding traffic goes to the Forwarding Elements FE1 (core1), FE2 (core2), ..., FEN (coreN); outgoing traffic leaves the router.]
![Page 5: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/5.jpg)
Hi-End Hardware
2 x XEON E5630
TYAN S7025 motherboard
Intel 82599
![Page 6: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/6.jpg)
Block hardware structure
[Figure: CPU0 and CPU1 (both quad-core) with DDR3 memory, linked over QPI to each other and to two Tylersburg IOHs. PCI-E Gen.2 slots on the IOHs: four x16, one x8, one x4. More I/O devices attach via ESI.]
![Page 7: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/7.jpg)
Hi-End Hardware/Latency
![Page 8: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/8.jpg)
Hardware - NIC
Intel 10g board Chipset 82599 with SFP+
Open chip specs. Thanks Intel!
![Page 9: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/9.jpg)
Classification in the Intel 82599
The classification in the Intel 82599 consists of several steps, each of which is programmable. These include:
- RSS (receive-side scaling): hashing of headers and load-balancing
- N-tuples: explicit packet header matches
- Flow-director: implicit matching of individual flows
![Page 10: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/10.jpg)
Routing daemons
- Packet forwarding is done in the Linux kernel
- Routing protocols run in user-space daemons
- Currently tested: versions of Quagga
- BGP and OSPF, both IPv4 and IPv6
- Cisco API
![Page 11: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/11.jpg)
Experiment 1: flow separation, external source
[Diagram: bulk data flows source → router → sink; a separate host exchanges TCP transactions with the router.]
- Bulk forwarding of data from source to sink (10 Gb/s, mixed flows and packet lengths)
- Netperf's TCP transactions emulate control data from a separate host
- Study the latency of the TCP transactions
![Page 12: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/12.jpg)
N-tuple or Flowdirector
ethtool -K eth0 ntuple on
ethtool -U eth0 flow-type tcp4 src-ip 0x0a0a0a01 src-ip-mask 0xFFFFFFFF dst-ip 0 dst-ip-mask 0 src-port 0 src-port-mask 0 dst-port 0 dst-port-mask 0 vlan 0 vlan-mask 0 user-def 0 user-def-mask 0 action 0
ethtool -u eth0
N-tuple is supported by the Sun niu and Intel ixgbe drivers. Actions are: 1) queue 2) drop
But we were lazy and patched ixgbe for ssh and BGP to use CPU0
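The source address in the filter above is given as a raw 32-bit hex value. As a reading aid, here is a minimal sketch that prints such a value as a dotted quad. The helper name `ipv4_hex_to_dotted` is ours and not part of ethtool.

```c
#include <stdint.h>
#include <stdio.h>

/* Format a 32-bit IPv4 value, written the way it appears on the
 * command line (most significant byte first), as dotted-quad text.
 * Returns the number of characters written, as snprintf does. */
int ipv4_hex_to_dotted(uint32_t addr, char *buf, size_t len)
{
    return snprintf(buf, len, "%u.%u.%u.%u",
                    (unsigned)((addr >> 24) & 0xff),
                    (unsigned)((addr >> 16) & 0xff),
                    (unsigned)((addr >> 8) & 0xff),
                    (unsigned)(addr & 0xff));
}
```

With this, 0x0a0a0a01 reads as 10.10.10.1.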
![Page 13: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/13.jpg)
N-tuple or Flowdirector
Even more lazy... we found the flowdirector was implicitly programmed by outgoing flows. So both incoming and outgoing would use the same queue.
So if we set CPU affinity for BGP, sshd, etc., we can avoid the N-tuple filters
Example: taskset -c 0 /usr/bin/sshd
Neat....
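A daemon can also pin itself instead of being started through taskset. The following is a minimal sketch, assuming Linux with glibc. The function name `pin_self_to_core` is ours; `sched_setaffinity` is the standard system call wrapper.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>

/* Pin the calling thread (and any children it forks afterwards) to a
 * single CPU core, similar to starting it with "taskset -c <core>".
 * Returns 0 on success, -1 on error. */
int pin_self_to_core(int core)
{
    cpu_set_t set;

    if (core < 0 || core >= CPU_SETSIZE)
        return -1;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    /* pid 0 selects the calling thread */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Calling `pin_self_to_core(0)` early in `main()` would place a control daemon on core 0, which is the core the deck reserves for control traffic.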
![Page 14: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/14.jpg)
RSS is still using CPU0
So core 0 gets both our “selected traffic” and the bulk traffic from RSS
We just want RSS to use the “other” CPUs
![Page 15: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/15.jpg)
Patching RSS
Just a one-liner...
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 1b1419c..08bbd85 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -2379,10 +2379,10 @@ static void ixgbe_configure_rx(struct ixgbe_adapter *adapter)
     mrqc = ixgbe_setup_mrqc(adapter);
 
     if (adapter->flags & IXGBE_FLAG_RSS_ENABLED) {
-        /* Fill out redirection table */
-        for (i = 0, j = 0; i < 128; i++, j++) {
+        /* Fill out redirection table but skip index 0 */
+        for (i = 0, j = 1; i < 128; i++, j++) {
             if (j == adapter->ring_feature[RING_F_RSS].indices)
-                j = 0;
+                j = 1;
             /* reta = 4-byte sliding window of
              * 0x00..(indices-1)(indices-1)00..etc. */
             reta = (reta << 8) | (j * 0x11);
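To see the effect of the changed loop without driver context, here is a stand-alone sketch of the same fill logic. It stores one queue number per table slot instead of packing register words, and the function name and return convention are ours, not the driver's.

```c
#include <stdint.h>

#define RETA_ENTRIES 128  /* table size used in the patched loop */

/* Fill a redirection table so that hashed traffic is spread over
 * queues 1..nqueues-1 and never lands on queue 0.
 * Returns 0 on success, -1 if the queue count is unusable. */
int fill_reta_skip_queue0(uint8_t reta[RETA_ENTRIES], unsigned nqueues)
{
    unsigned i, j;

    if (nqueues < 2 || nqueues > 256)
        return -1;
    for (i = 0, j = 1; i < RETA_ENTRIES; i++, j++) {
        if (j == nqueues)
            j = 1;              /* wrap to 1 instead of 0 */
        reta[i] = (uint8_t)j;
    }
    return 0;
}
```

With 8 queues the 128 slots are shared by queues 1 to 7, giving 18 or 19 slots each, so the load stays roughly even across the forwarding cores.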
![Page 16: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/16.jpg)
Patching RSS
CPU-core             0       1       2       3       4       5       6       7
Number of packets    0  196830  200860  186922  191866  186876  190106  190412
No traffic to CPU core 0; RSS still gives fairness among the other cores
![Page 17: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/17.jpg)
Transaction performance: netperf TCP_RR
On the “router”: taskset -c 0 netserver
![Page 18: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/18.jpg)
Don't let forwarded packets program the flowdirector
A new one-liner patch....
@@ -5555,6 +5555,11 @@ static void ixgbe_atr(struct ixgbe_adapter *adapter, struct sk_buff *skb,
     u32 src_ipv4_addr, dst_ipv4_addr;
     u8 l4type = 0;
 
+    if (!skb->sk) {
+        /* ignore non-local traffic */
+        return;
+    }
+
     /* check if we're UDP or TCP */
     if (iph->protocol == IPPROTO_TCP) {
         th = tcp_hdr(skb);
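The idea behind this patch can be modelled in a few lines of user-space C. This is a simplified sketch, not the hardware's behaviour: the table size, the struct layouts and the function names are hypothetical, and a single `flow_hash` stands in for whatever the NIC computes for both directions of a flow.

```c
#include <stddef.h>
#include <stdint.h>

#define FDIR_SLOTS 64   /* hypothetical table size for the sketch */

struct pkt {
    uint32_t flow_hash;   /* stand-in for the hash of the flow */
    const void *sk;       /* owning local socket, NULL if forwarded */
};

struct fdir {
    uint32_t hash[FDIR_SLOTS];
    int queue[FDIR_SLOTS];
    size_t used;
};

/* TX side: remember which queue a locally generated flow uses.
 * Forwarded packets (no socket) are ignored, as in the patch. */
void fdir_learn_on_tx(struct fdir *t, const struct pkt *p, int tx_queue)
{
    size_t i;

    if (p->sk == NULL)
        return;
    for (i = 0; i < t->used; i++) {
        if (t->hash[i] == p->flow_hash) {
            t->queue[i] = tx_queue;
            return;
        }
    }
    if (t->used < FDIR_SLOTS) {
        t->hash[t->used] = p->flow_hash;
        t->queue[t->used] = tx_queue;
        t->used++;
    }
}

/* RX side: a learned flow goes to its recorded queue; everything
 * else falls back to the queue chosen by RSS. */
int fdir_rx_queue(const struct fdir *t, uint32_t flow_hash, int rss_queue)
{
    size_t i;

    for (i = 0; i < t->used; i++)
        if (t->hash[i] == flow_hash)
            return t->queue[i];
    return rss_queue;
}
```

In this model only flows that originate from a local socket occupy table entries, so bulk forwarded traffic cannot fill the table and always takes the RSS path.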
![Page 19: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/19.jpg)
Instrumenting the flow-director
ethtool -S eth0 | grep fdir
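For scripted monitoring, the counters printed by this command can be picked out of its `name: value` lines. A minimal sketch follows; the helper name `parse_counter` is ours, and it assumes each counter is on its own line.

```c
#include <stdio.h>
#include <string.h>

/* Parse one line of "ethtool -S" output of the form "name: value".
 * Returns 1 and stores the value if the line carries the requested
 * counter, 0 otherwise. */
int parse_counter(const char *line, const char *name,
                  unsigned long long *value)
{
    char key[64];
    unsigned long long v;

    if (sscanf(line, " %63[^:]: %llu", key, &v) != 2)
        return 0;
    if (strcmp(key, name) != 0)
        return 0;
    *value = v;
    return 1;
}
```

Taking the difference of `fdir_match` between two snapshots then gives the number of packets matched by the flow director in that interval.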
![Page 20: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/20.jpg)
Flow-director stats/1
fdir_maxlen: 0
fdir_maxhash: 0
fdir_free: 8191
fdir_coll: 0
fdir_match: 195
fdir_miss: 573632813    <--- Bulk forwarded data from RSS
fdir_ustat_add: 1       <--- Old ssh session
fdir_ustat_remove: 0
fdir_fstat_add: 6
fdir_fstat_remove: 0
ustat = user stats, fstat = failed stats
![Page 21: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/21.jpg)
Flow-director stats/2
fdir_maxlen: 0
fdir_maxhash: 0
fdir_free: 8190
fdir_coll: 0
fdir_match: 196
fdir_miss: 630653401
fdir_ustat_add: 2       <--- New ssh session
fdir_ustat_remove: 0
fdir_fstat_add: 6
fdir_fstat_remove: 0
![Page 22: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/22.jpg)
Flow-director stats/3
fdir_maxlen: 0
fdir_maxhash: 0
fdir_free: 8190
fdir_coll: 0
fdir_match: 206         <--- ssh packets are matched
fdir_miss: 645067311
fdir_ustat_add: 2
fdir_ustat_remove: 0
fdir_fstat_add: 6
fdir_fstat_remove: 0
![Page 23: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/23.jpg)
Flow-director stats/4
fdir_maxlen: 0
fdir_maxhash: 0
fdir_free: 32768        <--- Now increased to 32k
fdir_coll: 0
fdir_match: 0
fdir_miss: 196502463
fdir_ustat_add: 0
fdir_ustat_remove: 0
fdir_fstat_add: 0
fdir_fstat_remove: 0
![Page 24: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/24.jpg)
Flow-director stats/5
fdir_maxlen: 0
fdir_maxhash: 0
fdir_free: 32764
fdir_coll: 0
fdir_match: 948         <--- netperf TCP_RR
fdir_miss: 529004675
fdir_ustat_add: 4
fdir_ustat_remove: 0
fdir_fstat_add: 44
fdir_fstat_remove: 0
![Page 25: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/25.jpg)
Transaction latency using flow separation
![Page 26: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/26.jpg)
Experiment 1 results
- Baseline (no background traffic) gives 30000 transactions per second
- With background traffic spread by RSS over all cores, transaction latency increases, reducing the rate to ~5000 transactions per second
- The RSS patch (don't forward traffic on core 0) brings the transaction latency back to (almost) the baseline level
- In all cases the control traffic is bound to core 0
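The transaction rates translate directly into latency if we assume netperf's default TCP_RR behaviour of keeping one transaction outstanding at a time (the slides do not state the netperf options used). Under that assumption the mean time per transaction is the inverse of the rate R:

```latex
t_{\text{txn}} \approx \frac{1}{R}, \qquad
\frac{1}{30000\ \text{s}^{-1}} \approx 33\ \mu\text{s}, \qquad
\frac{1}{5000\ \text{s}^{-1}} = 200\ \mu\text{s}
```

So the unpatched RSS case corresponds to roughly a sixfold increase in per-transaction time.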
![Page 27: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/27.jpg)
Experiment 2: flow separation, in-line traffic
[Diagram: source → router → sink; bulk data and TCP transactions arrive on the same incoming interface.]
- Inline control within bulk data (on the same incoming interface)
- Study the latency of the TCP transactions
- Work in progress
![Page 28: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/28.jpg)
Results in-line
[Charts: transaction latency without and with the RSS patch (“Vanilla” vs. “With Separation”). Left: flow mix and 64-byte packets, axis 0 to 6000. Right: zoom-in on 64-byte packets, axis 0 to 18.]
![Page 29: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/29.jpg)
Classifier small-packet problem
It seems we drop a lot of packets before they are classified
DCB (Data Center Bridging) has a lot of features to prioritize different types of traffic, but only for IEEE 802.1Q
VMDq2 was suggested by Peter Waskiewicz Jr at Intel
![Page 30: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/30.jpg)
Experiment 3: Transmit limits
Investigate hardware limits by transmitting as much as possible from all cores simultaneously.
[Figure: the block hardware structure again. CPU0 and CPU1 (both quad-core) with DDR3 memory, linked over QPI to two Tylersburg IOHs with PCI-E Gen.2 slots (four x16, one x8, one x4); more I/O devices via ESI.]
![Page 31: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/31.jpg)
pktgen/setup
Interface   eth0  eth1  eth2  eth3  eth4  eth5  eth6  eth7  eth8  eth9
CPU-core       0     1     2     3     4     5     6     7    12    13
Mem node       0     0     0     0     1     1     1     1     1     1
eth4, eth5 on x4 slot
![Page 32: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/32.jpg)
Setup
[Figure: CPU0 (quad-core) with memory node 0 and CPU1 (quad-core) with memory node 1, linked over QPI to the two Tylersburg IOHs; interfaces eth0 to eth9 attach to the IOHs.]
![Page 33: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/33.jpg)
TX with 10 * 10g ports: 93 Gb/s, “optimal”
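One way to read the 93 Gb/s figure against the nominal 100 Gb/s is to look at the x4 slot that carries eth4 and eth5. The arithmetic below uses standard PCI-E Gen.2 figures (5 GT/s per lane, 8b/10b encoding) and assumes the other slots are not a bottleneck. Treating the x4 slot as the limiting factor is our interpretation, not something the slides state.

```latex
\begin{aligned}
B_{\text{lane}} &= 5\ \text{GT/s} \times \tfrac{8}{10} = 4\ \text{Gb/s} \\
B_{\times 4} &= 4 \times 4\ \text{Gb/s} = 16\ \text{Gb/s} < 2 \times 10\ \text{Gb/s} \\
B_{\text{max}} &\le 8 \times 10\ \text{Gb/s} + 16\ \text{Gb/s} = 96\ \text{Gb/s}
\end{aligned}
```

The bound is before PCI-E protocol overhead, so a measured 93 Gb/s is close to what this configuration could deliver.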
![Page 34: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/34.jpg)
Conclusions
We have shown traffic separation in a high-end multi-core PC with classifier NICs by assigning one CPU core as the control core and the others as forwarding cores. Our method:
- Interrupt affinity to bind control traffic to core 0
- Modified RSS to spread forwarding traffic over all cores except core 0
- Modified the flow-director implementation slightly, letting only local (control) traffic populate the flowdir table
There are remaining issues with packet drops in inline separation.
We have shown 93 Gb/s simplex transmission bandwidth on a fully equipped PC platform.
![Page 35: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/35.jpg)
That's all
Questions?
![Page 36: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/36.jpg)
Rwanda example
![Page 37: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/37.jpg)
Lagos next
![Page 38: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/38.jpg)
Low-Power Development
Some ideas
Power consumption: SuperMicro X7SPA @ 16.5 Volt with picoPSU
Watt    Test
-------------------
 1.98   Power-Off
13.53   Idle
14.35   1 Core
15.51   2 Core
15.84   3 Core
16.50   4 Core
Routing performance: about 500,000 packets/sec in an optimal setup.
![Page 39: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/39.jpg)
Example: herjulf.se
14 Watt by 55 Ah battery
bifrost/USB + low-power disk
![Page 40: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/40.jpg)
Running on battery
![Page 41: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/41.jpg)
SuperCapacitors
![Page 42: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/42.jpg)
DOM - Optical Monitoring
Optical modules can support optical link monitoring: RX and TX power, temperatures, alarms, etc.
Newly added support to Bifrost/Linux
![Page 43: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/43.jpg)
DOM
ethtool -D eth3
IntCalbr: Avr RXPower: RATE_SELECT: Wavelength: 1310 nm
Temp: 25.5 C
Vcc: 3.28 V
TxBias: 20.5 mA
TXpwr: -3.4 dBm ( 0.46 mW)
RXpwr: -15.9 dBm ( 0.03 mW)
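The two power readings give each value in both dBm and mW, related by the standard conversion below. The milliwatt figures only agree with negative dBm values (-3.4 and -15.9), so any missing minus sign is an extraction loss, like the option dashes elsewhere in this transcript.

```latex
P_{\text{mW}} = 10^{P_{\text{dBm}}/10}, \qquad
10^{-3.4/10} \approx 0.46\ \text{mW}, \qquad
10^{-15.9/10} \approx 0.026\ \text{mW} \approx 0.03\ \text{mW}
```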
![Page 44: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/44.jpg)
1
Linux Kongress2010-09-23 in Nürnberg
Robert Olsson, Uppsala UniversityOlof Hagsand, KTH
Control and forwarding plane separation
on an opensource router
![Page 45: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/45.jpg)
2
More than 10 year in productionat Uppsala University
Stockholm Stockholm
2 * XEON 5630TYAN 70254 *10g ixgbe sfp+ LR/SR
Full Internet routingvia EBGP/IBGP
DMZ
AS 2834
UU 1 UU 2
Interneral UUNet
L green L red
ISP/SUNET AS1653
Local BGP peeringIn Uppsala
IPv4/IPv6OSPF
Now at 10g towards ISP, SFP+ 850 nm, 1310nm
![Page 46: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/46.jpg)
3
Motivation Separate controlplane from forwarding plane
A la IETF FORCES Controlplane: sshd, bgp, stats, etc on CPU core 0 Forwardingplane: Bulk forwarding on
core1,..,coreN This leads to robustness of service against overload
and DOS attacks, etc Enabled by:
multicore CPUsNIC hw classifiersFast Buses (QPI/PCIE gen2)
![Page 47: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/47.jpg)
4
CE (core0)
FE1(core1)
Router
ControlElement
ForwardingElements
Incoming traffic
Classifier FE2(core2)
FEN(coreN)
...
Control traffic
Forwarding traffic
Outgoing traffic
Control-plane separation on a multi-core
![Page 48: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/48.jpg)
5
Hi-End Hardware
XEON 2 x E5630 TYAN S7025 Motherboard
Intel 82599
![Page 49: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/49.jpg)
6
Block hardware structure
CPU0Quad-core
CPU1Quad-core
QPIDDR3
DDR3
DDR3 DDR3
DDR3
DDR3
IOHTylersburg
IOHTylersburg
PCI-E Gen.2 x16
PCI-E Gen.2 x16
PCI-E Gen.2 x4
PCI-E Gen.2 x16
PCI-E Gen.2 x16
PCI-E Gen.2 x8
More I/O devices
ESI
QPI
QPI QPI
![Page 50: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/50.jpg)
7
Hi-End Hardware/Latency
![Page 51: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/51.jpg)
8
Hardware - NIC
Intel 10g board Chipset 82599 with SFP+
Open chip specs. Thanks Intel!
![Page 52: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/52.jpg)
9
Classification in the Intel 82599
The classification in the Intel 82599 consists of several steps, each is programmable. This includes:- RSS (Receiver-side scaling): hashing of headers and load-balancing- N-tuples: explicit packet header matches- Flow-director: implicit matching of individual flows.
![Page 53: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/53.jpg)
10
Routing daemons
Packet forwarding is done in Linux kernelRouting protocols is run in user-space
daemons
Currently tested versions of quaggaBgp, OSPF both IPv4, iPv6Cisco API
![Page 54: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/54.jpg)
11
source router sink
host
Experiment 1:flow separation external source
- Bulk forwarding data from source to sink (10Gb/s mixed packet lengths): mixed flow and packet lengths- Netperf's TCP transactions emulated control data from a separate host - Study latency of TCP transactions
Bulk data
TCP transactions
![Page 55: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/55.jpg)
12
N-tuple or Flowdirector
ethtool K eth0 ntuple on
ethtool U eth0 flowtype tcp4 srcip 0x0a0a0a01 srcipmask 0xFFFFFFFF dstip 0 dstipmask 0 srcport 0 srcportmask 0 dstport 0 dstportmask 0 vlan 0 vlanmask 0 userdef 0 userdefmask 0 action 0
ethtool u eth0
Ntuple is supported by SUN Niu and Intel ixgbe driver. Actions are: 1) queue 2) drop
But we were lazy and patched ixgbe for ssh and BGP to use CPU0
![Page 56: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/56.jpg)
13
N-tuple or Flowdirector
Even more lazy... we found the flowdirector was implicitly programmed by outgoing flows. So both incoming and outgoing would use the same queue.
So if we set affinity for BGP, sshd etc we could avoid the Ntuple filters
Example: taskset c 0 /usr/bin/sshd
Neat....
![Page 57: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/57.jpg)
14
RSS is still using CPU0
So we both got our “selected traffic”Plus the bulk traffic from RSS
We just want RSS to use “other” CPU's
![Page 58: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/58.jpg)
15
Patching RSS
Just a oneliner...
diff git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.cindex 1b1419c..08bbd85 100644 a/drivers/net/ixgbe/ixgbe_main.c+++ b/drivers/net/ixgbe/ixgbe_main.c@@ 2379,10 +2379,10 @@ static void ixgbe_configure_rx(struct ixgbe_adapter *adapter) mrqc = ixgbe_setup_mrqc(adapter); if (adapter>flags & IXGBE_FLAG_RSS_ENABLED) { /* Fill out redirection table */ for (i = 0, j = 0; i < 128; i++, j++) {+ /* Fill out redirection table but skip index 0 */+ for (i = 0, j = 1; i < 128; i++, j++) { if (j == adapter>ring_feature[RING_F_RSS].indices) j = 0;+ j = 1; /* reta = 4byte sliding window of * 0x00..(indices1)(indices1)00..etc. */ reta = (reta << 8) | (j * 0x11);
![Page 59: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/59.jpg)
16
Patching RSS
CPU-core
0 1 2 3 4 5 6 7
Number of packets
0 196830 200860 186922 191866 186876 190106 190412
No traffic to CPU core 0 still RSS gives fairness between other cores
![Page 60: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/60.jpg)
17
Transaction Performancenetperf TCP_RR
On “router” taskset -c 0 netserver
![Page 61: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/61.jpg)
18
Don't let forwarded packetsprogram the flowdirector
A new one-liner patch....
@@ 5555,6 +5555,11 @@ static void ixgbe_atr(struct ixgbe_adapter *adapter, struct sk_buff *skb, u32 src_ipv4_addr, dst_ipv4_addr; u8 l4type = 0; + if(!skb>sk) {+ /* ignore nonlocal traffic */+ return;+ }+ /* check if we're UDP or TCP */ if (iph>protocol == IPPROTO_TCP) { th = tcp_hdr(skb);
![Page 62: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/62.jpg)
19
Instrumenting the flow-director
ethtool -S eth0 | grep fdir
![Page 63: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/63.jpg)
20
Flow-director stats/1
fdir_maxlen: 0
fdir_maxhash: 0
fdir_free: 8191
fdir_coll: 0
fdir_match: 195
fdir_miss: 573632813 <--- Bulk forwarded data from RSS
fdir_ustat_add: 1 <--- Old ssh session
fdir_ustat_remove: 0
fdir_fstat_add: 6
fdir_fstat_remove: 0
fdir_maxlen: 0
ustat → user stats; fstat → failed stats
![Page 64: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/64.jpg)
21
Flow-director stats/2
fdir_maxhash: 0
fdir_free: 8190
fdir_coll: 0
fdir_match: 196
fdir_miss: 630653401
fdir_ustat_add: 2 <--- New ssh session
fdir_ustat_remove: 0
fdir_fstat_add: 6
fdir_fstat_remove: 0
![Page 65: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/65.jpg)
22
Flow-director stats/3
fdir_maxlen: 0
fdir_maxhash: 0
fdir_free: 8190
fdir_coll: 0
fdir_match: 206 <--- ssh packets are matched
fdir_miss: 645067311
fdir_ustat_add: 2
fdir_ustat_remove: 0
fdir_fstat_add: 6
fdir_fstat_remove: 0
![Page 66: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/66.jpg)
23
Flow-director stats/4
fdir_maxlen: 0
fdir_maxhash: 0
fdir_free: 32768 <-- Now increased to 32k
fdir_coll: 0
fdir_match: 0
fdir_miss: 196502463
fdir_ustat_add: 0
fdir_ustat_remove: 0
fdir_fstat_add: 0
fdir_fstat_remove: 0
![Page 67: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/67.jpg)
24
Flow-director stats/5
fdir_maxlen: 0
fdir_maxhash: 0
fdir_free: 32764
fdir_coll: 0
fdir_match: 948 <-- netperf TCP_RR
fdir_miss: 529004675
fdir_ustat_add: 4
fdir_ustat_remove: 0
fdir_fstat_add: 44
fdir_fstat_remove: 0
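The counter movements between snapshots can be checked mechanically. Below is a minimal sketch that parses `name: value` lines of the kind shown in the stats slides and reports which counters changed between stats/1 and stats/3. `parse_stats` and `delta` are hypothetical helper names, and the snapshot strings carry only a subset of the counters copied from the slides.

```python
def parse_stats(text):
    """Parse 'name: value' lines, ignoring trailing '<---' annotations."""
    stats = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue
        stats[name.strip()] = int(rest.split()[0])
    return stats

def delta(before, after):
    """Return only the counters whose value changed between two snapshots."""
    return {k: after[k] - before[k]
            for k in after if k in before and after[k] != before[k]}

# Subset of counters from Flow-director stats/1 and stats/3.
SNAP1 = """\
fdir_free: 8191
fdir_match: 195
fdir_miss: 573632813 <--- Bulk forwarded data from RSS
fdir_ustat_add: 1 <--- Old ssh session
"""
SNAP3 = """\
fdir_free: 8190
fdir_match: 206 <--- ssh packets are matched
fdir_miss: 645067311
fdir_ustat_add: 2
"""

changes = delta(parse_stats(SNAP1), parse_stats(SNAP3))
print(changes)
```

The new ssh session consumes one filter entry (`fdir_free` drops by 1, `fdir_ustat_add` rises by 1) and 11 packets hit that filter, while the bulk forwarded traffic only moves `fdir_miss`.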
![Page 68: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/68.jpg)
25
Transaction latency using flow separation
![Page 69: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/69.jpg)
26
Experiment 1 results
Baseline (no background traffic) gives 30000 transactions per second.
With background traffic and RSS over all cores, transaction latency increases, reducing the rate to ~5000 transactions per second.
The RSS patch (don't forward traffic on core 0) brings the transaction latency back to (almost) the baseline level.
In all cases the control traffic is bound to core 0.
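The transaction rates translate directly into mean round-trip times, assuming the TCP_RR test keeps a single transaction outstanding at a time (the netperf default; burst mode would break this relation). `mean_rtt_us` is a hypothetical helper name.

```python
def mean_rtt_us(transactions_per_second):
    """Mean request/response time in microseconds with one outstanding transaction."""
    return 1e6 / transactions_per_second

baseline = mean_rtt_us(30000)   # no background traffic
loaded = mean_rtt_us(5000)      # background traffic, RSS over all cores
slowdown = loaded / baseline

print(f"baseline {baseline:.1f} us, loaded {loaded:.1f} us, slowdown {slowdown:.1f}x")
```

Under that assumption the baseline is about 33 us per transaction and the overloaded case about 200 us, roughly a sixfold increase.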
![Page 70: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/70.jpg)
27
Experiment 2: Flow separation, in-line traffic
[Diagram: source → router → sink, carrying bulk data and TCP transactions]
- Inline control within bulk data (on same incoming interface)
- Study latency of TCP transactions
- Work in progress
![Page 71: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/71.jpg)
28
Results in-line
[Figure: Transaction latency without/with the RSS patch, series "Vanilla" and "With Separation". Panel 1: flow mix and 64 byte packets (axis 0 to 6000). Panel 2: zoom in of 64 byte packets (axis 0 to 18).]
![Page 72: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/72.jpg)
29
Classifier small packet problem
It seems we drop a lot of packets before they are classified
DCB (Data Center Bridging) has a lot of features to prioritize different types of traffic, but only for IEEE 802.1Q
VMDq2 suggested by Peter Waskiewicz Jr at Intel
![Page 73: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/73.jpg)
30
Experiment 3: Transmit limits
[Diagram: two quad-core CPUs (CPU0, CPU1), each with three DDR3 memory channels; QPI links connect the CPUs and two Tylersburg IOHs; the IOHs provide PCI-E Gen.2 slots (four x16, one x8, one x4) plus ESI to more I/O devices]
Investigate hardware limits by transmitting as much as possible from all cores simultaneously.
![Page 74: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/74.jpg)
31
pktgen/setup
Interface:  eth0 eth1 eth2 eth3 eth4 eth5 eth6 eth7 eth8 eth9
CPU core:   0    1    2    3    4    5    6    7    12   13
Mem node:   0    0    0    0    1    1    1    1    1    1
eth4, eth5 on x4 slot
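A rough bandwidth budget shows why the x4 slot is worth noting. The sketch below uses the standard PCIe Gen 2 figures (5 GT/s per lane, 8b/10b coding) and ignores TLP/DLLP protocol overhead, so real usable throughput is lower. Whether this slot actually limits the total transmit rate is not stated in the slides; that reading is an assumption.

```python
GEN2_GT_PER_LANE = 5.0         # PCIe Gen 2 signalling rate, GT/s per lane
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b line coding

def pcie_gen2_gbps(lanes):
    """Raw per-direction data rate in Gb/s, before protocol overhead."""
    return lanes * GEN2_GT_PER_LANE * ENCODING_EFFICIENCY

for lanes in (4, 8, 16):
    print(f"x{lanes}: {pcie_gen2_gbps(lanes):.0f} Gb/s")

x4_budget = pcie_gen2_gbps(4)
demand = 2 * 10.0              # eth4 + eth5 transmitting at 10g line rate
print(f"x4 budget {x4_budget:.0f} Gb/s vs demand {demand:.0f} Gb/s")
```

Two 10g ports behind an x4 link ask for 20 Gb/s against a raw budget of 16 Gb/s, while x8 and x16 slots have headroom.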
![Page 75: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/75.jpg)
32
Setup
[Diagram: CPU0 and CPU1 (quad-core) with memory node 0 and memory node 1, connected by QPI to two Tylersburg IOHs; interfaces eth0 to eth9 attached to the IOHs]
![Page 76: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/76.jpg)
33
TX with 10 * 10g ports: 93 Gb/s “Optimal”
![Page 77: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/77.jpg)
34
Conclusions
We have shown traffic separation in a high-end multicore PC with classifier NICs by assigning one CPU core as control and the others as forwarding cores. Our method:
- Interrupt affinity to bind control traffic to core 0
- Modified RSS to spread forwarding traffic over all cores except core 0
- Modified the flow-director implementation slightly by only letting local (control) traffic populate the flowdir table
There are remaining issues with packet drops in inline separation.
We have shown 93 Gb/s simplex transmission bandwidth on a fully equipped PC platform.
![Page 78: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/78.jpg)
35
That's all
Questions?
![Page 79: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/79.jpg)
36
Rwanda example
![Page 80: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/80.jpg)
37
Lagos next
![Page 81: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/81.jpg)
38
Low-Power Development: Some ideas
Power consumption: SuperMicro X7SPA @ 16.5 Volt with picoPSU
Watt   Test
-------------------
1.98   Power-Off
13.53  Idle
14.35  1 Core
15.51  2 Core
15.84  3 Core
16.50  4 Core
Routing performance: about 500,000 packets/sec in an optimal setup.
![Page 82: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/82.jpg)
39
Example herjulf.se: 14 Watt by 55 Ah battery
bifrost/USB + lowpower disk
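A back-of-the-envelope runtime estimate for such a setup is sketched below. The slides give the load (14 W) and the capacity (55 Ah) but not the battery voltage, so the 12 V nominal value and the usable-capacity fraction are assumptions for illustration only.

```python
def runtime_hours(capacity_ah, battery_volts, load_watts, usable_fraction=1.0):
    """Hours of operation from stored energy (Wh) divided by a constant load (W)."""
    return capacity_ah * battery_volts * usable_fraction / load_watts

# 12 V nominal is an assumption; the slides only state 55 Ah and 14 W.
full = runtime_hours(55, 12.0, 14)
half = runtime_hours(55, 12.0, 14, usable_fraction=0.5)  # assumed 50% usable depth

print(f"full capacity: {full:.1f} h, half capacity: {half:.1f} h")
```

Under these assumptions the router would run for roughly two days on a full charge, or about one day if only half the capacity is used.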
![Page 83: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/83.jpg)
40
Running on battery
![Page 84: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/84.jpg)
41
SuperCapacitors
![Page 85: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/85.jpg)
42
DOM - Optical Monitoring
Optical modules can support optical link monitoring: RX and TX power, temperatures, alarms, etc.
Newly added support to Bifrost/Linux
![Page 86: Control and forwarding plane separation on an … · Control and forwarding plane separation ... Bgp, OSPF both IPv4, ... Control and forwarding plane separation on an opensource](https://reader031.vdocument.in/reader031/viewer/2022022603/5b5c828e7f8b9ac8618c6693/html5/thumbnails/86.jpg)
43
DOM
ethtool -D eth3
IntCalbr: Avr RXPower: RATE_SELECT: Wavelength: 1310 nm
Temp: 25.5 C
Vcc: 3.28 V
TxBias: 20.5 mA
TXpwr: -3.4 dBm ( 0.46 mW)
RXpwr: -15.9 dBm ( 0.03 mW)
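Optical power in dBm is 10·log10 of the power in mW, so sub-milliwatt readings such as 0.46 mW and 0.03 mW correspond to negative dBm values. A minimal conversion sketch, with `mw_to_dbm` and `dbm_to_mw` as hypothetical helper names:

```python
import math

def mw_to_dbm(mw):
    """Convert optical power from milliwatts to dBm."""
    return 10 * math.log10(mw)

def dbm_to_mw(dbm):
    """Convert optical power from dBm to milliwatts."""
    return 10 ** (dbm / 10)

tx_dbm = mw_to_dbm(0.46)    # TX power from the readout, in mW
rx_mw = dbm_to_mw(-15.9)    # RX power expressed in dBm

print(f"0.46 mW = {tx_dbm:.1f} dBm")
print(f"-15.9 dBm = {rx_mw:.2f} mW")
```

The difference between the two levels, about 12.5 dB, is the TX-to-RX gap reported by this one module; it is not a measured link loss, since the received light comes from the far-end transmitter.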