snabb, a toolkit for building user-space network functions (es.nog 20)
TRANSCRIPT
SNABB
A TOOLKIT FOR BUILDING USER-SPACE NETWORKFUNCTIONS
ABOUT IGALIA
Consultancy specialized in open-sourceBase in Coruña but distributed all over the world (>60 people working from 15different countries)Contributors to projects such as WebKit, Chromium V8, etcOther areas: Graphics, Multimedia, Networking
https://www.igalia.com/networking
AGENDA
What is Snabb?How it works?Catalog and programsUse case: lwAFTR
WHAT IS SNABB?
Snabb is a toolkit for developing high-performance network functions in user-space
WHAT IS A NETWORK FUNCTION?
A program that manipulates traffic dataBasic operations: read, forward, drop, modify, create...Combining these primitives we can build any network function
EXAMPLES
Firewall: read incoming packets, compare to table of rules and execute anaction(forward or drop)NAT: read incoming packets, modify headers and forward packetTunelling: read incoming packets, create a new packet, embed packet into newone and send it
WHY SNABB?
Increasing improvement of commodity hardware: 10Gbps NICs at veryaffordable pricesHigh-performance equipment is still very expensiveIdea: build an analog high-performance router using commodity hardware
WHY SNABB?
What software to put into this hardware?Common intuition: LinuxDrawback: Linux is not suitable for high-performance networking
WHY NOT LINUX?
General-purpose operating systemAn OS abstracts hw resources to offer high-level interfaces: filesystems,processes, sockets...Our network function will be divided into two lands: user-space and kernel-spaceColorary: processing a packet has an inheritent cost => the cost of the OS
HIGH-PERFORMANCE NETWORKING
NIC: 10GbpsAvg Packet-size: 550-bytePPS: 2272727,271 packet every 440ns ((1/2272727,27)*10^9)CPU: 2,5 Ghz1100 cycles to process one packet (2,5 cycles/sec * 440 ns)
HIGH-PERFORMANCE NETWORKING
Packet-size: 64-byte: 51 ns per packet; 128 cycles per packetLock/Unlock: 16ns; Cache-miss: 32 nsSource: Jonathan Corbet's Small packet size => More packets per second => worseFaster CPU => better
"Improving Linux networking performance"
USER-SPACE DRIVER
Do a kernel by-pass and manage the hardware directly from user-space:Tell Linux not to manage the PCI device (unbind)Do a mmap of the registers of the PCI device into addressable memoryWhenever we read/write the addressable memory, we're actually poking theregisters of the NICFollow the NIC's datasheet to implement operations such as initialize,receive, transmit, etc
USER-SPACE NETWORKING
Snabb is not an isolated case of user-space networking:Snabb (2012)DPDK (2012)VPP/fd.io (2016)
DPDK (Data-plane Development Kit, Intel)VPP (Vector Packet Processing, Cisco)
RING-BUFFER
Very important to avoid packet drops
INSIDE SNABB
SNABB
Project started by Luke GorrieUser-space networking benefit: freedom of programming languageSnabb is mostly written in LuaNetwork functions are also written in LuaFast to run, fast to developSnabb means fast in Swedish :)
ABOUT LUA
Started in 1993 at University of Rio de Janeiro (PUC Rio)Very similar to JavaScript, easy to learnVery small and compact, it's generally embeded in other systemsUse cases: microcontrollers (NodeMCU), videogames (Grim Fandango), IA(Torch7)
ABOUT LUAJIT
Just-in-time compiler for LuaExtremely fast virtual machine!!Very good integration with C thanks to FFI (Foreign Function Interface)
FFI: EXAMPLE
ffi.cdef[[
void syslog(int priority, const char*format, ...);
]]
ffi.C.syslog(2, "error:...");
local ether_header_t = ffi.typeof [[
/* All values in network byte order. */
struct {
uint8_t dhost[6];
uint8_t shost[6];
uint16_t type;
} __attribute__((packed))
]]
SNABB IN A NUTSHELL
A snabb program is an app graphApps are conected together via linksSnabb processes the program in units called breadths
NF: APP GRAPH
BREADTHS
A breadth has two steps:inhale a batch of packets into the graphprocess those packets
To inhale, the method pull of the apps is executed (if defined)To process, the method push of the apps is executed (if defined)
# Pull function of included Intel 82599 driver
function Intel82599:pull ()
for i = 1, engine.pull_npackets do
if not self.dev:can_receive() then break end
local pkt = self.dev:receive()
link.transmit(self.output.tx, pkt)
end
end
# Push function of included PcapFilter
function PcapFilter:push ()
while not link.empty(self.input.rx) do
local p = link.receive(self.input.rx)
if self.accept_fn(p.data, p.length) then
link.transmit(self.output.tx, p)
else
packet.free(p)
end
end
end
PACKET PROCESSING
Normally only one app of the app graph introduces packets into the graphThe method push gives an opportunity to every app to do something with apacket
APP GRAPH DEFINITION
local c = config.new()
-- App definition.
config.add(c, "nic", Intel82599, {
pci = "0000:04:00.0"
})
config.add(c, "filter", PcapFilter, "src port 80")
config.add(c, "writer", Pcap.PcapWriter, "output.pcap")
-- Link definition.
config.link(c, "nic.tx -> filter.input")
config.link(c, "filter.output -> writer.input")
engine.configure(c)
engine.main({duration=1})
PACKETS
struct packet {
uint16_t length;
unsigned char data[10*1024];
};
LINKS
struct link {
struct packet *packets[1024];
// the next element to be read
int read;
// the next element to be written
int write;
};
SNABB: APP CATALOG AND PROGRAMS
INVENTARY
apps: software components that developers combine together to build networkfunctionsprograms: complete network functions
APPS I/O
Intel i210/i350/82599/XL710Mellanox Connectx-4/5Virtio host y guestUNIX socketLinux: tap and "raw" (e.g: eth0)Pcap files
APPS L2
Flooding and learning bridgeVLAN insert/removeARP/NDP
APPS L3
IPv4/v6 fragmentation and reassemblyIPv4/v6 splitterICMPv4/v6 echo responderControl-plane delegation (nh_fwd)
APPS L4
IPsec ESPLightweight 4-over-6 AFTRKeyed IPv6 Tunnel
APPS MONITORING
IPFix capturer and exporterL7 monitor/filtering (libndpi)Pcap expressions filter (with own backend for code generation)
APPS TESTING
Lots of load generators: loadgen, packetblaster, loadbench...
USE CASE: LWAFTR
CONTEXT
2012-2014: Several RIRs run out of IPv4 public addresses2008: IPv6 adoption starts to peak upStill big dependency on IPv4: services, websites, programs, etc
SOLUTIONS
Carrier-Grade NAT: temporal solution for IPv4 address exhaustion problemDeployment of Dual-Stack networks (IPv4 e IPv6)Dual-Stack implies increasing complexity and costs (maintenance of twoseparated networks)Dual-Stack Lite (IPv6-only network which also offers IPv4 connectivity relyingon CGN)Lightweight 4over6: iteration over Dual-Stack
LIGHTWEIGHT 4OVER6
LW4O6 - GOALS
RFC7596 fully complaint (lwAFTR part)Performance: 2MPPS; 550-byte (packet-size); Binding-table: 1M subscribers.No packet drops
LW4O6 - DEVELOPMENT
Version 1:PrototypeBasic functionality (encapsulating/decapsulating)Small binding-table (own format)Development of tools to measure performance
LW4O6 - DEVELOPMENT
Version 2Production qualityFully standard compliantBig binding-table: 1M subscribers (still customized format but much closerto standard)Add support for other necessary protocols: ARP, NDP, fragmentation,reassembly, pingTons of optimizations (use of AVX instructions to speed up lookups)
LW4O6 - DEVELOPMENT
Version 3:Added YANG support to SnabbSupport binding-table format according to standardSupport of execution as leader/worker (leader: control-plane/worker: data-plane)
LW4O6 - DEVELOPMENT
Version 4:Multiprocess (one leader, multiple workers)Improvement of the Intel 10Gbps driver (added support for RSS, ReceivedSide Scaling)Added alarms support according to latest draft
LIGHTWEIGHT 4OVER6 - TALKS
Juniper's vMX Lightweight 4over6 VNFCharla: Kostas Zordabelos's A real-world scale network VF using Snabb for lw4o6Charla:
Juniper Tech Club, Marzo 2017
SDN Meetup, Abril 2017
OTHER PROGRAMS
PROGRAM: PACKET BLASTER
Generally useful tool: fill TX buffer of NIC with packets and transmit them overand over againMeasures received traffic tooEasily saturates 10Gbps links
snabb packetblaster replay packets.pcap 82:00.1
PROGRAM: SNABBWALL
L7 firewall that optionally uses nDPICollaboration betwen Igalia and NLnet FoundationLanded upstream in 2017Website: http://snabbwall.org
PROGRAM: IPFIX
NETFLOW collector and exporter (v9 and IPFIX)Line-rate speed on a single core. Further improvement: parallel processing viaRSSLanded upstream very recently
PROGRAM: L2VPN
L2VPN over IPv6 (developed by Alexander Gall from SWITCH)Pending to land upstream; used in productionIdeal Snabb use case: programmer/operator builds bespoke tool
PROGRAM: YOUR VNF
Snabb upstream open to include new network functionsRepository will grow as people will build new thingsIgalia can build one for you
LAST NOTES ABOUT PERFORMANCE
CONSIDERATIONS
Isolcpus: Prevents the kernel to take a CPU to schedule processesDishable HyperThreadingUse HugePages (2MB) (Linux default is 4Kb)Do not neglect NUMA when launching programsMake use of SIMD instructions (AVX, AVX2) to speed up computations(checksum)Keep an eye on regressions: profile often
SUMMARY
Toolkit for developing high-performance network functions in user-spaceSnabb provides apps which can be combined together forming a graph (networkfunction)Snabb provides programs, complete network functions ready to useSnabb provides libraries, to easy the development of new network functionsCompletely written in Lua: easy to extendFast: kernel-by pass + high-level language + fast VM (LuaJIT)
THANKS!
Email: [email protected]: @diepg
$ git clone https://github.com/snabbco/snabb.git
$ cd snabb
$ make