Washington University in St. Louis
Diversifying the Network Edge
Fred Kuhns
Fred Kuhns - 04/18/23
Host and LAN Support for Network Diversification
• Motivation:
– Address network ossification: it is difficult to field new protocols or technologies that address limitations in current data networks
– Create a common substrate layer over which new networking protocols, services and technologies may be deployed
– The common substrate layer provides virtualized links, routers and end systems
– Key issue: how to realize virtualization with isolation at the network edge (LAN and end system)
Introduction - Diversified networking at the edge
• Isolating vNet traffic in the LAN
– Define the substrate packet format and protocol
– Reserve a portion of LAN bandwidth for vNet traffic
– Determine topology and available bandwidth
– Establish substrate links between end systems and substrate routers
– Establish virtual links between virtual end systems and associated virtual routers
– Mechanisms for realizing bandwidth reservation
• Isolating vNet traffic in the end system
– Supporting the common substrate layer: managing the network resource
  • network interface access control
  • bandwidth allocation and enforcement
  • delivering to neighbor
  • management and control interface (accounting, configuration)
– OS extensions to support new networking protocol instantiation and isolation
  • specifying and enforcing isolation
  • maintaining kernel integrity
  • required safety and liveness properties of protocols
  • mechanisms to guard against ill-behaved vNet protocol instances, whether due to unsafe behavior (bugs, malicious code) or excessive resource use
  • mechanisms for protocol developers to use for enforcing safety/security
  • optimizing performance
  • software development environment for protocols (TBD)
– User versus kernel space protocol implementation and necessary kernel support
Context: Network Diversification (vNets)
[Figure: network diagram with labeled elements: substrate router, virtual router, substrate link, virtual link, virtual end-system]
Concepts
• Intranet versus Internet:
– intranet (no routing): use existing model and protocols
– internet (routing): use diversified networking model
• Diversified Networking Model:
– multiple networks coexist within a common infrastructure (virtual networks or vNets)
– each distinct network instance operates as though it has dedicated resources (non-interfering)
– vNet-specific routers (virtual routers) are interconnected through simplex, point-to-point links (virtual links)
– a common substrate layer delivers vNet packets to the neighbor (provides a simple wire-like service)
• Current model:
– Dominant networking protocol: IP
– Shared, heterogeneous physical networks (ATM, Ethernet, Frame Relay, wireless, packet over SONET, etc.)
– Links interconnecting packet switches
– Interconnection links may be tunneled (Link Virtualization) through intermediate devices: ATM, Packet over SONET (or PPP-over-X), MPLS
• Challenges at the network edge:
– partition LAN into virtual links and access routers
– end-system support for virtual networks
– isolation mechanisms for virtualized resources
– bind virtualized resources to network instances
Terminology
• Network Diversification:
– Virtual Network (vNet): distinct vNets coexist within a common physical network
– Diversification layer: common substrate layer; provides isolation and point-to-point link services
– A vNet is composed of one or more virtual routers (VR) interconnected by virtual links. Virtual routers and links are direct analogues of their physical counterparts … Network resources are virtualized.
– An end-system implements vNet protocols and provides connectivity services within a virtualized network protocol environment (virtual end-system). The virtual end-system provides mechanisms for protocol implementation, resource control and isolation.
• Diversification layer provides two levels of abstraction (i.e. two core services):
– Substrate: encapsulates existing layer 1 and layer 2 technologies and provides a single, consistent framework for implementing virtualized links and routers.
  • substrate link: abstraction that behaves like a point-to-point connection between communicating end points. Provides isolation services to different virtual networks using a common substrate link.
  • substrate router: a physical device which forwards network traffic based on its vNet membership. Provides sharing and isolation services to disparate vNets and hosts virtual routers.
– Virtual: framework providing a simple model and set of interfaces for implementing virtual networks. The model defines virtual routers, end-systems and links. The goal is for virtual links and routers to behave like their physical counterparts.
  • virtual link: simulates the behavior of a dedicated point-to-point link interconnecting virtual end points (virtual routers and/or virtual end systems). A virtual link is implemented by one or more substrate links.
  • virtual router: implements a particular vNet's routing logic. The underlying substrate router provides the necessary isolation and resource management functions.
Related Work
• Current virtualization efforts on the end system are driven by the desire to support many concurrently running, non-interfering, secure server applications
– The goal is to completely isolate applications running on a common hardware platform. It appears to each application as though it is running on a dedicated platform (hardware and operating system).
– The framework enforces resource constraints and access controls
– In this model the isolation is complete and transparent
– Each operational environment appears as a complete end system with an independent operating system instance
– However, this is too coarse-grained for our purposes, where we want to support multiple networks per OS instance
– Mention VMware, Xen, Denali
• Network protocol extension or composition:
– x-kernel, SPIN
• Protocol development environment and patterns
– ???
• Extensible Operating Systems (loading extensions into kernel), see next slide
Related Work: Extensible Operating Systems
• Issues:
– safety, liveness, performance
• Techniques:
– Safe execution environment/virtual machines: Java, KaffeOS, packet filters
– Language based (type safety): OKE, mobile code (STP), SPIN
– Proofs: proof carrying code (PCC)
– Software Fault Isolation (SFI): VINO
– Hardware Fault Isolation (HFI): kernel plugins, Denali, Xen, Exokernel, Palladium, Nooks. See VMM next page.
• We focus on two approaches:
– kernel extension to support a simple interpreted environment (packet filtering) with protocols implemented in user space
– sandbox for in-kernel protocol implementations using a type-safe language and run-time support. In the spirit of OKE and mobile code (with concepts from OKE
Modeling the LAN Environment
• The effort to provide a simple, common infrastructure layer for creating new or specialized networks has parallels in operating system and middleware research. Both attempt to offer two key services [1]:
– Resource management: time and space sharing (multiplex resources); synchronization and deadlock handling (buffers, link access, link BW, non-preempted transmission of packet); accounting and status
– User friendliness: convenient and consistent operational environment (see the many RFCs); error detection and handling; protection and security; fault tolerance and failure recovery
• A core technique is to export an extended, virtualized machine providing the illusion of dedicated resources (though the level of abstraction and degree of virtualization differ between systems)
– extended machine: abstraction to deal with complexity
– virtual machine: controlled sharing
• Define an administrative entity to represent clients (of the service).
– For example, operating systems define processes to represent resource ownership and protection domains. An IP network may define flows, or flow aggregates, to represent an abstract client to which resources (buffers and bandwidth) are assigned.
[1] Singhal, Shivaratri, Advanced Concepts in Operating Systems, McGraw-Hill, 1994
LAN Virtualization
• Goal: enable unrelated entities, vNets, to transparently share a common set of underlying resources, similar to how processes transparently share the underlying computer platform.
• Abstract resources (to create the extended Net): links, routers, end-systems
• Virtualization (make the virtual resources behave as though they were real, physical devices):
– End-system: network subsystem interface, protocol implementation, device interface for point-to-point links
– LAN: links, switches and buffers
– WAN/MAN: LANs, packet switches (beyond the scope of this presentation)
• We would like to virtualize LAN resources such that registered vNets and local traffic are isolated.
• As an example we consider an Ethernet LAN. We can realize this with Ethernet and IEEE standards 802.1P/Q (VLANs and priorities):
– star topology
– tree topology
– layered tree (with priorities)
• If a virtual link must pass through an existing IP router (the vNet router is not directly attached to the same LAN) then tunnels may be used: IPIP, GRE, MPLS, etc.
Simulates Star Topology for Substrate Links
Internetworking over a diversified network
Substrate function with Ethernet:
• Substrate links: use VLANs to provide the equivalent of a virtualized “wire” connecting an end-system to a specific substrate router.
• Sharing and isolation:
- All vNet traffic uses assigned VLANs
- Use priority queuing (802.1P/Q)
- All intranet traffic uses lower priority queues
• Resource management:
- LAN: use admission control (static or dynamic) to provide bandwidth guarantees to vNet traffic
- End system: the substrate layer on the end-system enforces per-VLAN and per-vNet bandwidth constraints
• Virtual links: in this simple example there is exactly one virtual link for each substrate link.
• Each host-to-substrate-router connection is assigned a distinct VLAN, so N hosts implies N VLANs on the Ethernet.
• Alternative: define one VLAN tree for each protocol suite (i.e. vNet).
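The star scheme rests on the 802.1Q tag, which carries both the VLAN ID and the priority bits used for queuing. The following is a minimal sketch in Python of building that tag and of assigning one VLAN per host. The VLAN numbering (starting at 100), the host names and the two priority values are assumptions chosen for illustration; the slides do not specify them.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q-tagged frame


def vlan_tag(vid: int, pcp: int, dei: int = 0) -> bytes:
    """Return the 4-byte 802.1Q tag: TPID, then PCP (3 bits), DEI (1 bit), VID (12 bits)."""
    if not 1 <= vid <= 4094:
        raise ValueError("VLAN ID must be in 1..4094")
    if not 0 <= pcp <= 7:
        raise ValueError("priority must be in 0..7")
    tci = (pcp << 13) | ((dei & 1) << 12) | vid
    return struct.pack("!HH", TPID_8021Q, tci)


def star_vlans(hosts, base_vid=100):
    """Assign one VLAN per host (hypothetical numbering): N hosts -> N VLANs."""
    return {host: base_vid + i for i, host in enumerate(hosts)}


VNET_PCP = 5   # assumed priority for vNet traffic (higher-priority queue)
LOCAL_PCP = 0  # assumed priority for intranet traffic (lower-priority queue)

vlans = star_vlans(["hostA", "hostB", "hostC"])
vnet_tag = vlan_tag(vlans["hostB"], VNET_PCP)
```

With N hosts the mapping consumes N of the 4094 usable VLAN IDs per substrate router, which is one reason the slides consider a shared VLAN tree as an alternative.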
[Figure: star topology on a switched LAN; hosts use distinct VLANs (VLANX1, VLANX2, … VLANXN) to reach VR1 of vNetX]
Traffic isolation with priority aware substrate
[Figure: Ethernet hub with High and Low priority TX queues; vNet traffic goes to High, otherwise Low. Labeled flows: vNet traffic (internet); local traffic (intranet); local control/management and legacy internet traffic; all vNet traffic toward VR1 of vNetX.]
Substrate Link as a VLAN Tree
[Figure: end-systems on an Ethernet switched LAN joined by a single VLAN (VLANX)]
Internetworking over a diversified network
Substrate function with Ethernet:
• Substrate links: the VLAN creates a tree interconnecting all end-systems to the substrate router. The substrate end-point then uses the VLAN tag and source/destination address to realize the logical point-to-point substrate link.
• Sharing and isolation:
- No change from the substrate star topology. The only difference is the shared VLAN domain. The scheme provides traffic isolation.
• Resource management:
- Same
• Virtual links: Same.
Multiple Substrate Links
[Figure: Ethernet switched LAN carrying three VLANs: VLANdgram, VLANmed, VLANhigh]
Internetworking over a diversified network
Substrate function with Ethernet:
• Substrate links: three VLAN trees are used for all virtual net traffic to/from a substrate router:
- Low priority: default for best-effort traffic
- Medium priority: for virtual nets with soft performance requirements (average bandwidth)
- High priority: for isochronous or low-delay, interactive applications
• Sharing and isolation: see above.
• Resource management: see above.
• Virtual links: same.
Multiple vNets per Host
[Figure: host with substrate interfaces on VLAN1, VLAN2 and VLAN3, each labeled with a VLI]
The full model:
• Substrate link: connects an end-system to a substrate router. Virtualization of a physical cable or wire. A packet enters one end, exits the other and is opaque within.
- Simplex or duplex?
• Substrate interface: end-system abstraction
- Ethernet: <interface, VLAN, dst_addr>
- Tunnel: MPLS, IP, IPsec, L2TPv3, GRE, AToM
- Layer 2: ATM, others?
• Virtual link: logical interconnection (virtual wire) of adjacent vNet nodes.
- Point-to-point, simplex or duplex?
• Virtual interface: end-system abstraction representing one end of a virtual link. The substrate defines the mechanism for multiplexing onto a common substrate link, for example a virtual link identifier (VLI) in a substrate header.
- Simplex or duplex?
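To make the multiplexing idea concrete, here is a small Python sketch of a substrate header that carries a VLI, together with the demultiplexing step at the receiving end. The slides do not define the header format, so the layout used here (a 16-bit VLI followed by a 16-bit payload length) and the function names are purely hypothetical.

```python
import struct

# Assumed layout, not taken from the slides: 16-bit VLI, 16-bit payload length.
SUBSTRATE_HDR = struct.Struct("!HH")


def encapsulate(vli: int, payload: bytes) -> bytes:
    """Prefix a vNet packet with the (hypothetical) substrate header."""
    return SUBSTRATE_HDR.pack(vli, len(payload)) + payload


def decapsulate(frame: bytes):
    """Split a substrate frame into (VLI, vNet packet)."""
    vli, length = SUBSTRATE_HDR.unpack_from(frame)
    payload = frame[SUBSTRATE_HDR.size:SUBSTRATE_HDR.size + length]
    if len(payload) != length:
        raise ValueError("truncated substrate payload")
    return vli, payload


def demux(frame: bytes, virtual_interfaces: dict):
    """Hand the payload to the virtual interface registered for its VLI."""
    vli, payload = decapsulate(frame)
    handler = virtual_interfaces.get(vli)
    if handler is None:
        return None  # unknown VLI on this substrate link: drop
    return handler(payload)
```

The payload stays opaque to the substrate: only the VLI is inspected, which matches the "wire-like" service the substrate link is meant to provide.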
[Figure: end-systems on an Ethernet LAN, each with a substrate interface (ether addr/vlan) and virtual interfaces (VLIs) associated with VR1. Annotation: the VLAN tag and destination address identify the substrate router; the VLI tag is used to route the packet.]
[Figure: end-system substrate interfaces (ether addr/vlan) on an Ethernet LAN, substrate links SL1, SL2 and SL3, substrate routers SR1, SR2 and SR3 hosting virtual routers (VR) for vNet1, vNet2 and vNet3; virtual interfaces are labeled with VLIs]
[Figure: vNet1, vNet2 and vNet3 across substrate routers SR1 through SR6, each hosting virtual routers (VR); substrate and virtual interfaces labeled with VLIs]
Multiple next hop VRs
[Figure: Host A, a member of vNetX and vNetY, on an Ethernet switched LAN; VLANA1, VLANA2 and VLANA3 lead to VR1, VR2 and VR3 of vNetX]
Multiple Next Hop Virtual Routers:
• Substrate link: one per (end-system, substrate router) pair.
• Substrate interface: three substrate interfaces:
SI1 = <eth0, VLANXA1, enetAddrSR1>
SI2 = <eth0, VLANXA2, enetAddrSR2>
SI3 = <eth0, VLANXA3, enetAddrSR3>
• Virtual link: logical point-to-point connection between a virtual end-system and its access virtual router. Since we model a point-to-point link there is no need for link addresses.
• Virtual interface: representation of a virtual link on the end-system. The substrate assigns each virtual link a virtual link identifier (VLI) that is scoped to its substrate link.
VI1 = <SI1, VLI1>
VI2 = <SI1, VLI2>
VI3 = <SI2, VLI1>
VI4 = <SI3, VLI1>
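The tuples above map directly onto small record types. The sketch below, in Python, encodes the three substrate interfaces and four virtual interfaces from this slide; the type and field names are hypothetical, and the values are the symbolic names used on the slide rather than real addresses.

```python
from typing import NamedTuple


class SubstrateInterface(NamedTuple):
    device: str       # local network device, e.g. eth0
    vlan: str         # VLAN assigned to this host/substrate-router pair
    router_addr: str  # Ethernet address of the substrate router


class VirtualInterface(NamedTuple):
    substrate: SubstrateInterface
    vli: int          # virtual link identifier, scoped to the substrate link


SI1 = SubstrateInterface("eth0", "VLANXA1", "enetAddrSR1")
SI2 = SubstrateInterface("eth0", "VLANXA2", "enetAddrSR2")
SI3 = SubstrateInterface("eth0", "VLANXA3", "enetAddrSR3")

VI1 = VirtualInterface(SI1, 1)
VI2 = VirtualInterface(SI1, 2)
VI3 = VirtualInterface(SI2, 1)
VI4 = VirtualInterface(SI3, 1)
ALL_VIS = [VI1, VI2, VI3, VI4]


def links_per_substrate(vis):
    """Count how many virtual links are multiplexed onto each substrate link."""
    counts = {}
    for vi in vis:
        counts[vi.substrate] = counts.get(vi.substrate, 0) + 1
    return counts


def is_consistent(vis):
    """A (substrate interface, VLI) pair must identify at most one virtual link."""
    return len(set(vis)) == len(vis)
```

Note that VLI 1 appears three times without conflict, because a VLI only needs to be unique within its substrate link.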
[Figure labels: Host A (enetAddrA); substrate routers 1, 2 and 3 (enetAddrSR1, enetAddrSR2, enetAddrSR3); vNetY VR1; virtual links labeled VLI1, VLI2, VLI1, VLI1]
TCP/IP as an Example Protocol
Substrate Interface:
– Directly connected: destination IP address + ARP = enet addr
– Gateway: (Gateway's IP + ARP = enet addr) + VLAN
Virtual Interface:
– Directly connected: not used; the model applies only to internetworking
– Gateway: VLI assigned by substrate. How is this integrated into the current ARP/route interface?

destination prefix | gateway (router address) | virtual interface | substrate interface | ll_info
192.168.12.0/24 | 0.0.0.0 | (not used) | eth0 | ARP
* (default) | 192.168.12.254 | vint0 | (eth0,VLAN,ethDst) | VLI

[Figure: vNet framework on the end-system with vNet Protocol = IP. Labeled elements: vint0, VLANX, eth0 (standard Ethernet interface, Ethernet device), Ethernet LAN, direct connect path, and the path tagged with VLAN and VLI to Substrate Router SR1 (Ethernet dest. addr).]
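The routing table on this slide can be read as an ordinary longest-prefix-match table with two extra columns. The Python sketch below encodes the two rows and looks up a destination. It assumes that the directly connected route has no virtual interface (as the slide text states that the virtual interface is not used for direct connections) and represents the "*" default as 0.0.0.0/0; the `Route` type and `lookup` function are hypothetical names.

```python
import ipaddress
from typing import NamedTuple, Optional


class Route(NamedTuple):
    prefix: ipaddress.IPv4Network
    gateway: Optional[str]     # None for directly connected destinations
    virtual_if: Optional[str]  # None when the virtual interface is not used
    substrate_if: str
    ll_info: str               # how the link-level destination is resolved


ROUTES = [
    Route(ipaddress.ip_network("192.168.12.0/24"), None, None, "eth0", "ARP"),
    Route(ipaddress.ip_network("0.0.0.0/0"), "192.168.12.254", "vint0",
          "(eth0,VLAN,ethDst)", "VLI"),
]


def lookup(dst: str) -> Route:
    """Return the matching route with the longest prefix."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in ROUTES if addr in r.prefix]
    return max(matches, key=lambda r: r.prefix.prefixlen)
```

Local (intranet) destinations resolve through eth0 and ARP as today, while everything else is sent through vint0, where the substrate adds the VLAN tag and VLI.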
Using Tunnels for the substrate layer
• Need to look into the various tunneling approaches/protocols. How can we leverage these?
– MPLS and MPLS VPNs
– Generic Routing Encapsulation (GRE): RFC 2784
– Point-to-Point Tunneling Protocol (PPTP)
– Secure VPN
– Any Transport over MPLS (AToM)
– IP tunnel
– IPsec VPNs
– Layer 2 Tunneling Protocol version 3 (L2TPv3)
  • version 3 is a draft standard
  • RFC 2661: Layer 2 Tunneling Protocol
– 802.1Q Tunneling: Cisco 802.1Q-in-Q VLAN Extension Services
• What about MPLS over IP tunnels: what was done there?
Supporting Diversified Networking on the End System
• vNet framework
– substrate layer design and implementation on end system. Policies.
– integration with existing networking subsystem and isolation mechanisms
– packet processing and forwarding rules for both substrate and diversified networking layer. Includes address resolution rules and techniques.
– how do we coordinate substrate and vNet link establishment? VLAN label assignments, substrate router address (IP? Ethernet?), VLI assignments?
– establishing links, assigning identifiers and integrating with existing network infrastructure/tables
– controlling bandwidth allocations and link access
– supporting the common substrate layer: managing the network resource
– what accounting functions are needed?
– what control interface is exported?
• OS extensions to support new networking protocol instantiation and isolation
– to what degree do we “protect” the kernel?
  • buggy code? malicious code? The more protection, the greater the performance hit.
– specifying and enforcing isolation
  • performance: interface access and bandwidth; CPU; buffers (buffer hoarding)
  • kernel integrity: corrupt data structures, exceptions, unauthorized access, improper interface use, other safety issues
  • integrity of other vNet protocol instances: a vNet instance may be able to corrupt another module but not the kernel
– do we attempt to monitor network traffic to ensure one vNet instance is not masquerading as another? Or other types of abuse?
– techniques to require/enforce safety and liveness properties of protocols, or to detect violations (prevent or recover)
  • type-safe compiler and run-time checks
  • hardware fault isolation
  • software fault isolation
  • cross our fingers
– optimizing performance
– for user space protocols use a safe execution environment for interpreting packet filters
– for kernel space protocols use ???
• Software development environment for protocols
– utility libraries and wrappers
– patterns and OO models
– compositional techniques or even automation
Background: Traditional Commodity OS Environments
• Traditional general purpose operating system
– Process model (resource ownership and execution context): associates programs with resource usage (allocation, scheduling, access control, synchronization) and accounting (historical data).
– Isolation and accounting fall on this process boundary (or possibly thread)
– The process model as implemented is not good at capturing resource usage resulting from hidden scheduling (the kernel performing work for a process asynchronously, such as when network packets are received)
– Likewise, the trust model assumes the OS kernel is trustworthy, which may not be true for dynamically extensible systems
– The virtualization and scheduling of the CPU and memory is well developed (out of necessity); however, managing I/O access and bandwidth is a more recent concern
• With the increasing importance of networking and multimedia, new techniques have been developed to manage I/O access and bandwidth
– Network transmit bandwidth is typically managed with packet classifiers (map packet to flow or flow aggregate) and queuing disciplines. This allocation and accounting model differs from the process-centric model.
– Disk I/O scheduling shifted from simply optimizing overall throughput to ensuring time-critical operations complete on time.
– For the majority of desktop systems network bandwidth is not a limiting factor (1 Gbps interfaces are common on new systems). Rather, memory and disks remain the critical performance bottleneck.
– Much research and design has been directed at managing either per-process or per-flow (or flow aggregate) I/O usage. Neither is the correct approach for this effort, where we want per-vNet resource management.
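One way to picture per-vNet bandwidth management is to key a conventional rate limiter by vNet rather than by process or flow. The Python sketch below uses a token bucket per vNet; the token bucket is a standard technique chosen here for illustration, not something the slides prescribe, and the class names, rates and burst sizes are assumptions. Time is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Classic token bucket: tokens are bytes, refilled at a fixed rate."""

    def __init__(self, rate_bps: float, burst_bytes: int):
        self.rate = rate_bps / 8.0       # refill rate in bytes per second
        self.burst = float(burst_bytes)  # bucket depth in bytes
        self.tokens = float(burst_bytes)
        self.last = 0.0                  # time of the previous update (seconds)

    def allow(self, nbytes: int, now: float) -> bool:
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


class VNetShaper:
    """Transmit admission keyed by vNet instead of by process or flow."""

    def __init__(self):
        self.buckets = {}

    def add_vnet(self, vnet: str, rate_bps: float, burst_bytes: int) -> None:
        self.buckets[vnet] = TokenBucket(rate_bps, burst_bytes)

    def admit(self, vnet: str, nbytes: int, now: float) -> bool:
        bucket = self.buckets.get(vnet)
        # Unregistered vNets get no bandwidth at all.
        return bucket is not None and bucket.allow(nbytes, now)


shaper = VNetShaper()
shaper.add_vnet("vNet1", rate_bps=8000, burst_bytes=1500)  # 1000 bytes/s
```

A real substrate layer would queue rather than simply reject packets that exceed the allocation; the sketch only shows where the per-vNet accounting boundary sits.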
OS Kernel Block Diagram
[Figure: block diagram of a traditional OS kernel between User Space (Applications) and Hardware. Labeled components: Socket Interface; TCP module (TCP1, TCP2, … TCPn), UDP, RAW IP, IP routes; core ethernet; File Interface ops, FS management, buffer cache, open files, device independent I/O, Basic I/O Interface; tasks, task management, scheduler; clock handler, process accounting, scheduling, time management; SW int (AST), callout Q, TCP poll; Interrupt Processing and AST Processing; hardware independent layer; hardware dependent layer with configuration (registers, MMU (TLB, cache, VM), bus and peripherals), System Exception handlers, HW interrupt/Exception, OS ISR demux, callback, timer, uart, ethernet device driver with tx queue, rx queue and qdisc (TC/AST); util.]
End-System Support for Network Diversification
• What needs to change?
• Process model (applications and programs need not change): No
– the process model is sufficient for application isolation
• Trust model (is the network subsystem trusted?): Yes
– the current trust model is not adequate: we need to dynamically load/unload new protocols which may not be trusted. Even user space applications will require mechanisms in the kernel to ensure non-interference
• Resource management for the network subsystem: Yes
– The network subsystem's degree of isolation is no longer adequate. vNet protocols must be separately contained, isolated, identifiable, preemptable and cancelable.
– Network and processor usage accounting is not adequate. We need to keep track of per-vNet resource usage and constraints. Asynchronous network events (aka hidden scheduling) must be properly accounted for and scheduled (on a per-vNet basis).
• User friendliness (for the network subsystem - vNets): Yes
– Provide simple mechanisms for adding and removing vNet protocol instances
– Convenient environment for implementing, testing and debugging new protocols
– Support per-vNet protection boundaries
– Mechanisms for implementing different security policies both within a given vNet and between different vNets
– Ensure the system as a whole is not adversely impacted by faulty or poorly implemented protocols
Virtual End System
• Comments and assumptions
– assume that the creation/deletion of vNets is infrequent
– an application may open connections on one or more different vNets
– unrelated applications must be able to engage in IPC using any available mechanism (pipes, shared memory, TCP/IP, etc.)
– continue support for IP. In fact, IP can be considered the least common denominator network instance. We could use the existing IP network for control to establish and/or manage vNets.
– support both user and kernel space protocol instances
– provide isolation and resource guarantees on a per-vNet basis
– poorly behaved protocol instances (for a given vNet) will be detected, stopped and expelled from an end system. Applications using this protocol stack will be informed via a socket error return value.
– intra-vNet, implementers should have the mechanisms to support QoS and security. What are they?
– simple mechanism for adding new protocols/vNets
Block Diagram
[Figure: block diagram with labeled components: application, proto mux/demux, TCP/IP protocol stack, vNet Framework (vNet1, vNet2, vNet3), vNet mux/demux, network device]
User or kernel Space protocols?
• Each has pros and cons
• User space protocols:
– easier to implement and debug
– easier to introduce new protocols (not tightly dependent on the socket layer knowing about the new protocol)
– easier to isolate and protect protocols and apps from each other (leverage process model)
• Kernel level protocols:
– easier to integrate into existing framework (simplifies support for system interface functions like select/poll)
– simplifies intra-protocol security and protection (since the protocol runs within the trusted kernel)
– simplifies (well, more direct) kernel demultiplexing to the correct protocol context (endpoint)
– increased efficiency
User Space Protocol Implementation
• Uncommon outside the high-performance community, which wants zero-copy and specialized demux keys.
• Problems: asynchronous processing, life cycle, authentication and demultiplexing to endpoints
– latency in delivering packets (e.g. acks) to user space
– increased overhead in per-packet processing before a drop/keep decision is made
– processing received acks
– timeouts and retransmissions
– establishing connections and security: snooping, masquerading
– supporting select and poll
– protocols where a connection may outlive the process (TCP's TIME_WAIT)
– global routing and address resolution tables
– global connection tables
  • need to know what other ports are being used (locally)
  • accepting/rejecting new connections
User-space protocols: Global Issues
• Routing: direct packets to/from the correct endpoint/interface
– How is traffic demultiplexed and sent to the correct endpoint/process?
  • In-kernel filters
– Where are the routing tables and how are they maintained?
  • route fixed when the connection is established, or located in shared memory
• Control: I use IPv4 as an example
– Address resolution protocols/tables?
– Other control protocols. For example ICMP, IGRP, others?
– Where are the routing protocols implemented?
• Management:
– Must manage a protocol's namespace (for example, port numbers in IPv4).
– Common programming technique: allow the protocol instance to select the local address part
  • specify port = 0 and addr = 0, then the implementation will assign correct values
– Passive connect model?
  • In IPv4 a server listens on a port (host:port:proto) for a connection request. To establish a connection a unique (to the end-system) port number is assigned and a new socket allocated.
– Socket-oriented system calls must be supported. On UNIX we must support non-blocking I/O with select and poll.
– Connection lifetime may outlast the process.
  • For example TCP TIME_WAIT, or simply waiting for a final ack or resending if no ack is received.
• Security: we must provide sufficient mechanisms for protocol developers
– implementations must be able to guard against masquerading and eavesdropping
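The "port = 0" convention and the non-blocking select requirement mentioned above can be seen with the standard Python socket API on an existing IPv4 stack. This is only an illustration of the behavior a user-space vNet protocol would have to reproduce, not part of the proposed framework; the loopback address is used instead of addr = 0 to keep the demo local, and the helper name is hypothetical.

```python
import select
import socket


def open_ephemeral_listener(host: str = "127.0.0.1") -> socket.socket:
    """Open a TCP listener and let the stack choose the local port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))     # port 0: the implementation assigns a free port
    sock.listen(1)
    sock.setblocking(False)  # non-blocking, for use with select/poll
    return sock


listener = open_ephemeral_listener()
addr, port = listener.getsockname()  # the port actually assigned

# With a zero timeout select returns immediately; no connection is pending,
# so the listener is not reported as readable.
readable, _, _ = select.select([listener], [], [], 0)
```

For a user-space protocol, something outside the application (the control daemon in the later slides) has to own this port namespace so that two processes are never handed the same local endpoint.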
User Space: Configurations
• Given these global issues there are two likely configurations:
– all traffic passes through a common protocol daemon in user space
– a control daemon implements a basic set of control functions while a user library implements the majority of data path functions
– prior work has shown the latter approach to be superior
• Having all traffic pass through a common protocol daemon implies at least one extra copy operation (kernel -> daemon -> user process)
• A better solution is for a daemon to insert relatively simple packet filters in the kernel for established connections; the filters direct incoming packets to endpoints and check outgoing packets from endpoints.
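The filter-based configuration can be sketched as a table that the control daemon populates and the kernel demultiplexer consults. The Python model below is a user-level illustration only: the filter fields (VLI, local endpoint, optional remote endpoint), the socket identifiers and the class names are assumptions, since the slides do not define a filter format.

```python
from typing import NamedTuple, Optional


class ConnFilter(NamedTuple):
    vli: int                    # virtual link the packet arrived on
    local_ep: bytes             # protocol-specific local endpoint (e.g. a port)
    remote_ep: Optional[bytes]  # None acts as a wildcard (listening endpoint)


class FilterTable:
    """Connection filters inserted by the control daemon, indexed by VLI."""

    def __init__(self):
        self._by_vli = {}

    def insert(self, flt: ConnFilter, socket_id: str) -> None:
        self._by_vli.setdefault(flt.vli, []).append((flt, socket_id))

    def remove(self, socket_id: str) -> None:
        for vli in list(self._by_vli):
            kept = [(f, s) for f, s in self._by_vli[vli] if s != socket_id]
            if kept:
                self._by_vli[vli] = kept
            else:
                del self._by_vli[vli]

    def demux(self, vli: int, local_ep: bytes, remote_ep: bytes):
        """Return the socket for a packet, preferring an exact match."""
        wildcard = None
        for flt, sock in self._by_vli.get(vli, []):
            if flt.local_ep != local_ep:
                continue
            if flt.remote_ep == remote_ep:
                return sock
            if flt.remote_ep is None:
                wildcard = sock
        return wildcard  # None means no filter matched: drop the packet
```

In this model the daemon would hold the wildcard (listening) filter and insert an exact filter once a connection is established, so data packets bypass the daemon and avoid the extra copy.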
User-Space: Passive Open
[Figure components: application, vnetX protocol library, vnetX control daemon (namespace, lifecycle, connections), socket layer, connection filters, vnet demux, ethernet]
0. listen/accept (passive open)
1. connection request (in)
2. ack (out)
3. insert incoming and outgoing filters for vnetX connection
4. new connection
5. data, established connections
• compare against connection specific outgoing filter
• use VLI to access incoming filters and use to demux to filter set and/or socket
• data copy
User-Space: Active Open
[Figure components: application, vnetX protocol library, vnetX control daemon (namespace, lifecycle, connections), socket layer, connection filters, vnet demux, ethernet]
0. connect
1. connection request (out)
2. ack (in)
3. insert incoming and outgoing filters for vnetX connection
4. new connection
5. data, established connections
• compare against connection specific outgoing filter
• use VLI to access incoming filters and use to demux to filter set and/or socket
• data copy
User-Space: Datagram (Connectionless)
[Figure components: application, vnetX protocol library, vnetX control daemon (namespace, lifecycle, connections), socket layer, connection filters, vnet demux, ethernet]
0. open(any)
1. insert incoming and outgoing filters for vnetX connection
2. new connection (local address)
3. data, established connections
• daemon fills in the local address and binds to the socket. No restrictions on destination
• compare against “connection” specific outgoing filter
• use VLI to access incoming filters and use to demux to socket. In this case only the local part is used.
• data copy
User-Space: Datagram (Connectionless)
[Figure components: application, vnetX protocol library, vnetX control daemon (namespace, lifecycle, connections), socket layer, connection filters, vnet demux, ethernet]
0. open(local and remote addr)
1. insert incoming and outgoing filters for vnetX connection
2. new connection (local and remote)
3. data, established connections
• daemon fills in both local and destination addresses. Destination restricted
• compare against “connection” specific outgoing filter
• use VLI to access incoming filters and use to demux to socket
• data copy
User-Space: App exits
[Figure components: application, vnetX protocol library, vnetX control daemon (namespace, lifecycle, connections), socket layer, connection filters, vnet demux, ethernet]
1. connection close (out)
2. ack (in/out)
3. remove filters
• drop
• TCP enters TIME_WAIT after close
Considerations For Kernel Extensions• Identified areas where modules may impact system behavior
– software bugs (implementation errors) which may result in kernel or another vNet protocol stack to becoming corrupted.
• dereference invalid pointer: corrupt kernel memory, cause exception (invalid address), read invalid data• incorrect parameter usage• indexing beyond end of an array• incorrect locking protocol or deadlock• overflowing stack (large local variables, recursion etc)• memory management errors: using freed memory, memory leaks, incorrect allocation sizes• not checking return values
– design errors leading to kernel corruption• misuse of kernel interfaces• improper control processing• improper data output
– performance/efficiency errors: use too many resources (buffers, I/O bandwidth, CPU cycles, locks, time)• adversely impacts kernel and application processes• adversely impacts other vNet protocol stacks• adversely impacts network traffic (remote hosts or network devices)
– security or protection violation either compromising confidentiality or altering data• unauthorized read/write of kernel/user data• unauthorized use or resource (invalid packets set on network)• unauthorized read/write on another vNet protocol stack environment
• Possible isolation mechanisms:
  – static and dynamic enforcement of kernel module (interface) access restrictions
  – bounded (deterministic or limited) resource use
    • buffers: common buffer pool but with thresholds on the number that can be in use at any one time. Easy for tx; what about receive (do we drop packets)?
    • bandwidth
    • locks??
    • other resources?
    • hard/soft bounds? deterministic or statistical?
  – ???
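The buffer bullet above (a common pool with a threshold on how many buffers may be in use at once) can be illustrated with a small sketch. It assumes a hard, deterministic per-vNet limit and answers the open receive question with a drop policy purely for illustration; the slides leave that question open. The names `BoundedBufferPool`, `receive`, and `stats` are hypothetical.

```python
class BoundedBufferPool:
    """Common buffer pool with a per-vNet cap on buffers in use at one time."""

    def __init__(self, total, per_vnet_limit):
        self.free = total            # buffers left in the shared pool
        self.limit = per_vnet_limit  # hard bound per vNet (assumption)
        self.in_use = {}             # vnet id -> buffers currently held

    def alloc(self, vnet):
        held = self.in_use.get(vnet, 0)
        if self.free == 0 or held >= self.limit:
            # Caller decides what refusal means: defer the send (tx)
            # or drop the packet (rx).
            return False
        self.in_use[vnet] = held + 1
        self.free -= 1
        return True

    def release(self, vnet):
        if self.in_use.get(vnet, 0) == 0:
            raise ValueError("vnet holds no buffers")
        self.in_use[vnet] -= 1
        self.free += 1


def receive(pool, vnet, stats):
    """Receive path under a drop policy: count a drop when alloc is refused."""
    if pool.alloc(vnet):
        return True
    stats[vnet] = stats.get(vnet, 0) + 1
    return False
```

With a limit of two buffers per vNet, a third packet for the same vNet is refused and counted as a drop, while another vNet can still allocate from the shared pool. A soft or statistical bound would replace the fixed `limit` check with a different admission test.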
Pushing protocols into the Kernel
• Positives:
  – All the issues associated with user-space protocols simply go away. Global tables and lifetime of the kernel
  – Performance, efficiency, existing code base
  – Enhances intra-protocol security
  – Simplifies integration with existing network I/O subsystems and interfaces
• Negatives:
  – Isolation: more difficult to isolate the system from protocol instances. Inter-protocol isolation is difficult.
  – Security: proving trust/security is more difficult
  – Implementation and debugging are more difficult in the kernel
Our Approach
• ???
Kernel-Space Protocols
[Figure: kernel-space protocol architecture. User space: Application(s). Socket Interface with PF_VNET and PF_INET; endpoints vnet:ep, tcp:port, udp:port, raw IP; vnet Socket I/O Interface and vnet ops; vnet Proto state tables (/dev/vnet, /dev/protoX); TCP/IP (TCP1, TCP2, … TCPn, UDP, RAW IP, IP routes, route to interface); File Interface and I/O Interface (open files, FS management, buffer cache, ops); vnet Demux; VLAN; eth device driver (eth0); Hardware: ethernet; HW interrupt/exception, SW interrupt.]
Rework!
User Space Protocols
• Chandramohan A. Thekkath, Thu D. Nguyen, Evelyn Moy, Edward D. Lazowska, Implementing network protocols at user level, IEEE/ACM Transactions on Networking (TON), v.1 n.5, p.554-565, Oct. 1993
• Chris Maeda, Brian Bershad, Protocol Service Decomposition for High-Performance Networking, Proceedings of the 14th ACM Symposium on Operating Systems Principles, pages 244-255, December 1993
• Aled Edwards, Steve Muir, Experiences implementing a high performance TCP in user-space, Proceedings of the conference on Applications, technologies, architectures, and protocols for computer communication, p.196-205, 1995
• Kieran Mansley, Engineering a User-Level TCP for the CLAN Network, Proceedings of the ACM SIGCOMM workshop on Network-I/O convergence: experience, lessons, implications, pages 228-236, 2003
Extensible protocol frameworks in the kernel
• Parveen Patel, Andrew Whitaker, David Wetherall, Jay Lepreau, Tim Stack, Upgrading Transport Protocols using Untrusted Mobile Code, Proceedings of the 19th ACM Symposium on Operating Systems Principles, Pages 1-14, October 2003.
• Herbert Bos, Bart Samwel, Safe Kernel Programming in the OKE, Proceedings of the fifth IEEE Conference on Open Architectures and Network Programming, June 2002
• Marc Fiuczynski, Brian Bershad, An Extensible Protocol Architecture for Application-Specific Networking, Proceedings of the Winter USENIX Technical Conference, pages 55-64, January, 1996
• Norman Hutchinson, Larry Peterson, The x-kernel: An Architecture for Implementing Network Protocols, IEEE Transactions on Software Engineering, 17(1):64-76, January 1991
Isolation Services
• Marko Zec, Implementing a Clonable Network Stack in the FreeBSD Kernel, Proceedings of USENIX Technical Conference, pages 137-150, June 9-14, 2003
• P. H. Kamp, R. N. M. Watson, Jails: Confining the omnipotent root, Proceedings of the 2nd International SANE Conference, May 2000
• A Bavier, M Bowman, B Chun, D Culler, S Karlin, S Muir, L Peterson, T Roscoe, T Spalink, M Wawrzoniak, Operating System Support for Planetary-Scale Network Services, Proceedings of the 1st USENIX Symposium on Networked Systems Design and Implementation, pages 253-266, March 2004
• G Back, W Hsieh, J. Lepreau, Processes in KaffeOS: Isolation, Resource Management, and Sharing in Java, Proceedings of the 4th Symposium on Operating Systems Design and Implementation, pages 333-346, October 2000
• R Wahbe, S Lucco, T Anderson, S Graham, Efficient Software-Based Fault Isolation, Proceedings of the 14th Symposium on Operating Systems Principles, pages 203-216, December 5-8, 1993
VMM
• P Barham, B Dragovic, K Fraser, S Hand, T Harris, A Ho, R Neugebauer, I Pratt, A Warfield, Xen and the Art of Virtualization, Proceedings of the 19th Symposium on Operating System Principles, pages 164-177, October 19-22, 2003
• A Whitaker, M Shaw, S Gribble, Scale and Performance in the Denali Isolation Kernel, Proceedings of the 5th Symposium on Operating Systems Design and Implementation, pages 195-210, December 9-11, 2002