TRANSCRIPT
Building Fast, Flexible Virtual Networks on Commodity Hardware
Nick Feamster, Georgia Tech
Trellis: A Platform for Building Flexible, Fast Virtual Networks on Commodity Hardware. Mundada, Bhatia, Motiwala, Valancius, Muhlbauer, Bavier, Feamster, Rexford, Peterson. ROADS 2008.
Building a Fast, Virtualized Data Plane with Programmable Hardware. Anwer, Feamster. (In submission.)
Concurrent Architectures are Better than One (“Cabo”)
• Infrastructure providers: maintain the physical infrastructure needed to build networks
• Service providers: build networks from "slices" of physical infrastructure leased from one or more infrastructure providers
The same entity may sometimes play both roles.
Network Virtualization: Characteristics
Sharing
• Multiple logical routers on a single platform
• Resource isolation in CPU, memory, bandwidth, forwarding tables, …

Customizability
• Customizable routing and forwarding software
• General-purpose CPUs for the control plane
• Network processors and FPGAs for the data plane
Requirements
• Scalable sharing (to support many networks)
• Performance (to support real traffic, users)
• Flexibility (to support custom network services)
• Isolation (to protect networks from each other)
VINI
[Figure: VINI virtual topology — virtual nodes running BGP over the shared physical infrastructure]
• Prototype, deploy, and evaluate new network architectures
– Carry real traffic for real users
– More controlled conditions than PlanetLab
• Extend PlanetLab with per-slice Layer 2 virtual networks
– Support research at Layer 3 and above
PL-VINI
• Abstractions
– Virtual hosts connected by virtual point-to-point links
– Per-virtual-host routing table and interfaces
• Drawbacks
– Poor performance: 50 kpps aggregate, 200 Mb/s TCP throughput
– Customization difficult
[Figure: PL-VINI node — control plane: XORP (routing protocols) in a PlanetLab VM; data plane: Click packet forward engine with UML switch element, tunnel table, filters, and UDP tunnels over interfaces eth0–eth3]
Trellis
• Same abstractions as PL-VINI
– Virtual hosts and links
– Push performance and ease of use
• Full network-stack virtualization
– Run XORP, Quagga in a slice
– Support the data plane in the kernel
• Approach native Linux kernel performance (15x PL-VINI)
• Be an "early adopter" of new Linux virtualization work
[Figure: Trellis node — a Trellis virtual host (application, virtual NICs, kernel FIB) spanning user and kernel space, connected through the Trellis substrate via a bridge, shaper, and EGRE tunnel on each virtual link]
Virtual Hosts
• Use container-based virtualization
– Xen, VMware: poor scalability and performance
• Option #1: Linux VServer
– Containers without network virtualization
– PlanetLab slices share a single IP address and port space
• Option #2: OpenVZ
– Mature container-based approach
– Roughly equivalent to VServer
– Has full network virtualization
Network Containers for Linux
• Create multiple copies of the TCP/IP stack
• Per-network container:
– Kernel IPv4 and IPv6 routing tables
– Physical or virtual interfaces
– iptables, traffic shaping, sysctl.net variables
• Trellis: marry VServer + NetNS (sketched below)
– Be an early adopter of the new interfaces
– Otherwise stay close to PlanetLab
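Trellis predates the mainline tooling, but the per-container stack described above survives in Linux as network namespaces. A minimal sketch of the idea with today's iproute2 (requires root; the namespace name is illustrative — Trellis itself merged the NetNS patches with VServer):

```python
import subprocess

def sh(cmd: str) -> None:
    """Run a shell command, raising if it fails."""
    subprocess.run(cmd, shell=True, check=True)

# Create an isolated copy of the network stack: the namespace gets
# its own interfaces, IPv4/IPv6 routing tables, iptables rules, and
# sysctl.net state, as the slide describes.
sh("ip netns add ve1")

# Each command below reads or writes ve1's private state only.
sh("ip netns exec ve1 ip link set lo up")
sh("ip netns exec ve1 ip route show")                 # empty, independent table
sh("ip netns exec ve1 sysctl net.ipv4.ip_forward=1")  # per-container sysctl
```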
Virtual Links: EGRE Tunnels
• Virtual Ethernet links
• Make minimal assumptions about the physical network between Trellis nodes
• Trellis: tunnel Ethernet over GRE over IP (sketched below)
– Already a standard, but no Linux implementation
• Other approaches:
– VLANs, MPLS, other network circuits or tunnels
– These fit into our framework
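Mainline Linux has since gained exactly this device type, gretap (Ethernet over GRE over IP). A sketch of creating one end of such a virtual link, assuming a modern kernel and iproute2; the endpoint addresses, device name, and GRE key are illustrative:

```python
import subprocess

def sh(cmd: str) -> None:
    subprocess.run(cmd, shell=True, check=True)

# 'gretap' encapsulates whole Ethernet frames in GRE over IP, so the
# resulting device behaves like one end of a virtual Ethernet link.
LOCAL, REMOTE = "192.0.2.1", "192.0.2.2"   # illustrative tunnel endpoints
sh(f"ip link add egre0 type gretap local {LOCAL} remote {REMOTE} key 42")
sh("ip link set egre0 up")
```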
[Figure: Trellis node with virtual links — virtual host (application, virtual NICs, kernel FIB) attached to EGRE tunnels in the Trellis substrate]
Tunnel Termination
• Where should the EGRE tunnel interface terminate?
• Inside the container: better performance
• Outside the container: more flexibility
– Transparently change the implementation
– Process and shape traffic between container and tunnel
– User cannot manipulate the tunnel or shapers
• Trellis: terminate the tunnel outside the container (sketched below)
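A sketch of that placement in today's terms, continuing the earlier snippets: the tunnel (egre0) and the shaper live in the root namespace, and the container sees only a veth-backed virtual NIC. Device names and the shaping rate are illustrative:

```python
import subprocess

def sh(cmd: str) -> None:
    subprocess.run(cmd, shell=True, check=True)

# The slice user only sees veth0 inside the namespace; the tunnel and
# shaper stay in the root namespace, under substrate control, so the
# implementation can change without the container noticing.
sh("ip link add veth-root type veth peer name veth0")
sh("ip link set veth0 netns ve1")
sh("ip netns exec ve1 ip link set veth0 up")

# Shape traffic between the container and the tunnel.
sh("tc qdisc add dev veth-root root tbf rate 100mbit burst 32kbit latency 50ms")
sh("ip link set veth-root up")
```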
Glue: Bridging
• How to connect virtual hosts to EGRE tunnels?
– Connecting two Ethernet interfaces
• Linux software bridge
– Ethernet bridge semantics
– Supports point-to-multipoint (P2M) links
– Relatively poor performance
• Common case: point-to-point (P2P) links
• Trellis:
– Use the Linux bridge for P2M links
– New, optimized "shortbridge" module for P2P links (see the sketch below)
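The shortbridge module is Trellis-specific and not in mainline Linux, so this sketch wires up only the common bridge case, attaching the container-facing veth and the EGRE tunnel from the earlier snippets to a standard Linux software bridge:

```python
import subprocess

def sh(cmd: str) -> None:
    subprocess.run(cmd, shell=True, check=True)

# Standard Linux software bridge: attach the container-facing veth
# and the EGRE tunnel so frames flow between virtual NIC and link.
sh("ip link add br-ve1 type bridge")
sh("ip link set veth-root master br-ve1")   # toward the virtual host
sh("ip link set egre0 master br-ve1")       # toward the virtual link
sh("ip link set br-ve1 up")
```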
[Figure: Trellis node with bridging — virtual host connected to EGRE tunnels through bridge/shortbridge and shaper elements in the Trellis substrate]
IPv4 Packet Forwarding
Trellis achieves 2/3 of native performance, 10x faster than PL-VINI.
[Figure: Forwarding rate (kpps) for PL-VINI, Xen, Trellis (bridge), Trellis (shortbridge), and native Linux]
Virtualized Data Plane in Hardware
• Software provides flexibility, but poor performance and often inadequate isolation
• Idea: forward packets exclusively in hardware
– Platform: OpenVZ over NetFPGA
– Challenge: share common functions, while isolating functions that are specific to each virtual network
Accelerating the Data Plane
• Virtual environments in OpenVZ
• Interface to the NetFPGA based on the Stanford reference router
Control Plane
• Virtual environments
– Virtualize the control plane by running multiple virtual environments on the host (same as in Trellis)
– Routing table updates pass through a security daemon (sketched below)
– Only the root user updates the VMAC-VE table
• Hardware access control
– The VMAC-VE table and VE-ID control access to the hardware
• Control register
– Used to multiplex each VE to the appropriate hardware
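A hypothetical sketch of the admission check such a security daemon performs; the update format, names, and mapping below are assumptions for illustration, not the authors' actual interface:

```python
from dataclasses import dataclass

@dataclass
class RouteUpdate:
    ve_id: int      # virtual environment issuing the update
    table_id: int   # hardware forwarding table the update targets
    prefix: str
    next_hop: str

# VMAC-VE table (hypothetical form): which hardware table each VE may
# modify. Only the root user may change this mapping.
VMAC_VE_TABLE = {1: 1, 2: 2}

def authorize(update: RouteUpdate) -> bool:
    """Pass the update to hardware only if the VE owns the target table."""
    return VMAC_VE_TABLE.get(update.ve_id) == update.table_id

assert authorize(RouteUpdate(1, 1, "10.0.0.0/8", "10.0.1.1"))
assert not authorize(RouteUpdate(1, 2, "10.0.0.0/8", "10.0.1.1"))
```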
Virtual Forwarding Table Mapping
[Figure: mapping from virtual environments to their hardware forwarding tables]
Share Common Functions
• Common functions
– Packet decoding
– Calculating checksums
– Decrementing TTLs
– Input arbitration
• VE-specific functions (modeled in the sketch below)
– FIB
– IP lookup table
– ARP table
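An illustrative software model of this split (the structures are hypothetical stand-ins for the hardware tables): the shared steps run identically for every VE, while the IP lookup consults the VE's private FIB, so the same prefix can resolve differently per virtual network:

```python
import ipaddress

# Per-VE state: each virtual environment owns a private FIB.
VE_FIBS = {
    1: {ipaddress.ip_network("10.0.0.0/8"): "port0"},
    2: {ipaddress.ip_network("10.0.0.0/8"): "port1"},
}

def forward(ve_id: int, dst: str, ttl: int) -> str | None:
    # Shared functions: run once, identically, for every VE.
    ttl -= 1                        # decrement TTL
    if ttl <= 0:                    # (checksum update omitted)
        return None
    # VE-specific function: longest-prefix match in this VE's FIB.
    addr = ipaddress.ip_address(dst)
    matches = [net for net in VE_FIBS[ve_id] if addr in net]
    best = max(matches, key=lambda n: n.prefixlen, default=None)
    return VE_FIBS[ve_id][best] if best is not None else None

print(forward(1, "10.1.2.3", 64))   # -> port0
print(forward(2, "10.1.2.3", 64))   # -> port1
```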
Forwarding Performance
[Figure: forwarding performance of the hardware data plane]
Efficiency
• 53K logic cells
• 202 units of block RAM
Sharing common elements saves up to 75% of hardware resources compared with independent physical routers (illustrated below).
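Back-of-the-envelope arithmetic behind a claim of this shape (the function and VE count are illustrative, not the paper's accounting): if a fraction `shared` of each router's logic is common and built once, hosting n virtual routers costs shared + n·(1 − shared) router-equivalents instead of n:

```python
# Relative hardware saved by building the shared fraction once
# instead of once per virtual router.
def savings(n: int, shared: float) -> float:
    independent = n                          # n full, separate routers
    virtualized = shared + n * (1 - shared)  # shared part built once
    return 1 - virtualized / independent

# With 4 VEs, fully shared logic gives the 1 - 1/4 = 75% upper bound.
for frac in (0.5, 0.9, 1.0):
    print(f"shared={frac:.0%}: savings with 4 VEs = {savings(4, frac):.0%}")
```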
Conclusion
• Virtualization allows physical hardware to be shared among many virtual networks
• Tradeoffs: sharing, performance, and isolation
• Two approaches
– Trellis: kernel-level packet forwarding (10x packet-forwarding rate improvement vs. PL-VINI)
– NetFPGA-based forwarding for virtual networks (same forwarding rate as a NetFPGA-based router, with 75% improvement in hardware resource utilization)