Using VINI to Test New Network Protocols
Murtaza Motiwala, Georgia Tech
Andy Bavier, Princeton University
Nick Feamster, Georgia Tech
Santosh Vempala, Georgia Tech
2
“The research agenda in measurement must change to consider measurement solutions which enlist the cooperation of routers. The need is so urgent that the deployment...can be finessed by cooperation between a few key ISPs. There is a rich vein of technical problems, hitherto considered only from an active measurement perspective, for which there can be new and effective...solutions.”
—Varghese and Estan, The Measurement Manifesto
3
Accountability and Availability
• Accountability: Detecting and locating the cause of performance degradations
– Proposal: In-band path diagnosis (Orchid)
– Need: Carry network traffic with modified packet formats, routers with packet marking capabilities
• Availability: Maintaining reachability to Internet destinations in the face of failing components
– Proposal: Path splicing
– Need: Support for running multiple routing protocols in parallel, modified packet formats, etc.
4
Data-Plane Accountability
• Mechanisms to detect and locate sources (and causes) of bad behavior
• Causes may be benign or malicious
– Congestion
– Faulty links
– Denial of service attack
• Recourse to avoid faulty or malicious elements
– Scalable network support for path diversity
5
One Mechanism: Out-of-Band
• Approach: Send additional probe traffic to capture network conditions
– Ping, traceroute, pathchar, etc.
• Problem: Measured performance may not reflect conditions experienced by data traffic
– May not capture transient faults
– Probes may be treated differently
– Introduces additional probe traffic, which may affect observed performance
6
Alternative: In-Band Path Diagnosis
• Store information about network diagnostics in the packet itself.
• Advantage: Diagnostic information reflects the conditions actually experienced by data traffic.
• Challenges
– Lost data packets mean lost diagnostics
– Distinguishing loss and reordering
– Recovering diagnostic information (from the receiver)
– Packet marking and storage requirements
7
Data-Plane Accountability
• Problem: Network elements drop packets, fail, and otherwise give rise to poor performance
• One Solution: In-Band Path Diagnosis
• Routers keep track of number of packets seen per flow
• Each router stamps each packet with current flow counter value
• If current counter value does not equal router’s expected packet count for that flow, router marks packet
[Figure: High-level overview. Packet layout: IP header, new shim header, transport header.]
8
Detailed Operation
• Suppose R2 and R3 have each lost one packet
• Next packet: R2 sees “gap” in counter value
– Marks packet with its ID, updates flow counter value
• Subsequent packets contain marks for packets further downstream
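The counter-and-mark behavior described on these two slides can be sketched in a few lines. This is a minimal illustration and not the Click implementation: the `Packet`, `Router`, and `send` names are hypothetical, the shim header is modeled as a single counter plus a list of marks, and one packet is allowed to carry several marks at once (the slides suggest marks may instead be spread over subsequent packets).

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    flow: str
    counter: int                                 # shim field stamped by the previous hop
    marks: list = field(default_factory=list)    # (router_id, packets_missing)


class Router:
    """Hypothetical router keeping per-flow counters (assumed design)."""

    def __init__(self, router_id):
        self.router_id = router_id
        self.expected = {}    # flow -> counter value expected in the next packet
        self.forwarded = {}   # flow -> number of packets this router has seen

    def process(self, pkt):
        expected = self.expected.get(pkt.flow, 1)
        if pkt.counter != expected:
            # Gap between the upstream stamp and our expectation: mark the packet.
            pkt.marks.append((self.router_id, pkt.counter - expected))
        # Update the flow counter so the same gap is not reported again.
        self.expected[pkt.flow] = pkt.counter + 1
        # Stamp the packet with this router's own count for the next hop.
        self.forwarded[pkt.flow] = self.forwarded.get(pkt.flow, 0) + 1
        pkt.counter = self.forwarded[pkt.flow]
        return pkt


def send(path, seq, drop_before=None):
    """Send packet number `seq` along `path`; drop it just before `drop_before`."""
    pkt = Packet(flow="f1", counter=seq)
    for router in path:
        if router.router_id == drop_before:
            return None   # packet (and any marks it carried) is lost
        router.process(pkt)
    return pkt
```

With a path R1, R2, R3, dropping one packet before R3 and the next before R2 makes the following delivered packet arrive with one mark from each of R2 and R3. Because each router stamps its own count, a loss is attributed only to the first router downstream of it.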
9
Implementation and Evaluation
• Implementation in Click
– Two main elements: ModifyIng, ModifyPkt
• Deployment on PL-VINI
– Evaluation under direct packet drops and induced routing instability
10
Some Recent Feedback
“…the entire approach completely disregards the cost of implementation on routers. … The authors must demonstrate that what they are proposing is feasible at e.g., 40Gbps if it is going to be implemented on the fast path…”
11
Path Splicing: Main Idea
• Step 1: Run multiple instances of the routing protocol, each with slightly perturbed versions of the configuration
• Step 2: Allow traffic to switch between instances at any node in the protocol
Compute multiple forwarding trees per destination.
Allow packets to switch slices midstream.
Feamster, Motiwala, and Vempala, Path Splicing with Network Slicing
12
Perturbations
• Goal: Each instance provides different paths
• Mechanism: Each edge is given a weight that is a slightly perturbed version of the original weight
– Two schemes: Uniform and degree-based
[Figure: “Base” graph and perturbed graph between s and t, with edge weights shown before and after perturbation.]
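A rough sketch of how perturbed slices could be computed follows. The slides do not give the perturbation formulas, so both schemes here are assumptions for illustration only: "uniform" inflates each weight by a random factor from a fixed range, and "degree" scales that range by the degrees of the edge's endpoints. The function names and the `strength` parameter are hypothetical.

```python
import heapq
import random


def perturb(graph, rng, scheme="uniform", strength=0.5):
    """Return a copy of an undirected graph (node -> {neighbor: weight})
    with each edge weight slightly increased. Formulas are assumed."""
    max_deg = max(len(nbrs) for nbrs in graph.values())
    out = {u: {} for u in graph}
    for u in graph:
        for v, w in graph[u].items():
            if v in out[u]:
                continue  # edge already handled from the other endpoint
            scale = strength
            if scheme == "degree":
                # Assumed rule: edges between high-degree nodes move more.
                scale = strength * (len(graph[u]) + len(graph[v])) / (2 * max_deg)
            new_w = w * (1 + rng.uniform(0, scale))
            out[u][v] = new_w
            out[v][u] = new_w
    return out


def next_hops(graph, dst):
    """Dijkstra from dst; returns node -> next hop on a shortest path to dst."""
    dist = {dst: 0.0}
    hop = {}
    heap = [(0.0, dst)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                hop[v] = u
                heapq.heappush(heap, (nd, v))
    return hop


def build_slices(graph, dst, k, seed=0, scheme="uniform"):
    """Slice 0 uses the original weights; the other k-1 use perturbed weights."""
    rng = random.Random(seed)
    slices = [next_hops(graph, dst)]
    for _ in range(k - 1):
        slices.append(next_hops(perturb(graph, rng, scheme), dst))
    return slices
```

Each slice is a forwarding tree toward the destination, so a packet that stays in one slice always reaches it; the perturbations only change which tree is chosen.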
13
Network Slicing
• Goal: Allow multiple instances to co-exist
• Mechanism: Virtual forwarding tables
[Figure: Slice 1 and Slice 2 over nodes s, a, b, c, t, each slice with its own forwarding table (dst, next-hop).]
14
Path Splicing in Practice
• Packet has shim header with routing bits
• Routers use lg(k) bits to index forwarding tables
– Shift bits after inspection
– Incremental deployment is trivial
– Persistent loops cannot occur
• To access different (or multiple) paths, end systems simply change the forwarding bits
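The bit-indexed forwarding described above might look like the following sketch. The table layout (one next-hop map per slice for a single destination), the `forward` function, and the choice to fall back to slice 0 once the bits are used up are all assumptions made for illustration.

```python
def forward(slices, src, dst, bits, max_hops=32):
    """Forward a packet from src to dst using splicing bits.

    slices: list of k dicts, each mapping node -> next hop toward dst.
    bits:   integer holding the shim header's forwarding bits.
    Returns the list of nodes visited, or None if forwarding fails.
    """
    k = len(slices)
    width = max(1, (k - 1).bit_length())   # about lg(k) bits per hop
    mask = (1 << width) - 1
    path = [src]
    node = src
    while node != dst:
        if len(path) > max_hops:
            return None
        index = (bits & mask) % k          # low-order bits select the slice
        bits >>= width                     # shift bits after inspection
        node = slices[index].get(node)
        if node is None:
            return None                    # no entry in the chosen slice
        path.append(node)
    return path
```

For example, with two slices an end system can pick a different path simply by changing the bits. Once the bits are exhausted the packet stays in slice 0, which is one way to see why loops cannot persist in this sketch.

```python
slice0 = {"s": "a", "a": "t", "b": "t"}
slice1 = {"s": "b", "b": "t", "a": "b"}
forward([slice0, slice1], "s", "t", 0b0)    # ['s', 'a', 't']
forward([slice0, slice1], "s", "t", 0b1)    # ['s', 'b', 't']
forward([slice0, slice1], "s", "t", 0b10)   # ['s', 'a', 'b', 't']
```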
15
Design and Implementation
• Click and Quagga on PL-VINI
[Figure: Node design showing control plane, daemon, forwarding table, and classifier components.]
16
Challenges
• Can end hosts react quickly enough to recover?
– How does the end system find the alternate path?
• How does splicing perform for other topologies?
• Deployment Paths
– VINI
– Overlay
– Wireless
17
More Feedback
“What ramifications does the proposed technique have on state-of-the-art router hardware?...As the routing method is supposed to use in the routers, some traditional metrics (e.g. the influence on throughput or latency) should be used to compare the performance…”
“…the entire approach completely disregards the cost of implementation on routers. … The authors must demonstrate that what they are proposing is feasible at e.g., 40Gbps if it is going to be implemented on the fast path…”
18
Questions
• What amount of “realism” should a testbed like VINI provide?
• How to convince
– Researchers
– Vendors
– …
• Might VINI be a deployment platform, rather than simply a testing platform?
19
20
Internet Routing Lacks Accountability
• Control Plane: Messages can be falsified
– Misconfiguration: AS 7007, ConEdison route leak
– Malice: Spammers stealing address space
• Data Plane: Data traffic is not guaranteed to travel where the routing protocol indicates
– Paths may not perform well
– Even if a faulty path could be located, no recourse
This talk: Detecting and isolating faulty elements and nodes.
Some discussion about recourse.
21
Design Considerations
• Localization granularity: With what precision should a fault be located?
– From within a few ASes to actual network element
• Statistics granularity: With what precision should statistics be captured?
– From coarse, per-flow statistics to per-packet statistics
• Storage: How much state should be stored, and where should it be stored?
– In the router vs. in the packet
22
Design Considerations (cont.)
• Modifications to packet format: Modify packet format, or squeeze data into existing headers?
• Robustness to malice: Should the scheme be robust in the face of malice?
– Off-path: Hosts or routers off of the data path try to disrupt communication
– On-path: Malicious hosts or routers on-path may lie
23
Analysis of Accuracy
• Partially accurate: Faulty element identified, but not the correct number of lost packets
– Example: Counter overflow
• Misleading: Network fault is attributed to the incorrect network element
– Example: Packets containing information about packet loss are also lost
• No information: No information reported
24
Multipath: Promise and Problems
• Bad: If any link fails on both paths, s is disconnected from t
• Want: End systems remain connected unless the underlying graph is disconnected
25
Reliability Approaches that of Underlying Graph
• GEANT (Real) and Sprint (Rocketfuel) topologies
• 1,000 trials
• p indicates probability edge was removed from base graph
Reliability approaches optimal
Average stretch is only 1.3
GEANT topology, degree-based perturbations
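The trial setup on this slide can be approximated with a small Monte Carlo sketch. It is not the evaluation code behind the reported results: the function names are hypothetical, and "spliced" is modeled simply as the subset of edges that appear in at least one slice. In each trial every edge of the base graph is removed with probability p, and s-t connectivity is checked both in the surviving graph and in the surviving spliced edges.

```python
import random
from collections import deque


def connected(edges, s, t):
    """Breadth-first search: is t reachable from s over undirected edges?"""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False


def estimate_reliability(all_edges, spliced_edges, s, t, p, trials=1000, seed=0):
    """Return (fraction of trials connected in the graph,
               fraction of trials connected using only spliced edges)."""
    rng = random.Random(seed)
    spliced = {frozenset(e) for e in spliced_edges}
    ok_graph = ok_spliced = 0
    for _ in range(trials):
        # Each edge fails independently with probability p.
        alive = [e for e in all_edges if rng.random() >= p]
        ok_graph += connected(alive, s, t)
        ok_spliced += connected(
            [e for e in alive if frozenset(e) in spliced], s, t
        )
    return ok_graph / trials, ok_spliced / trials
```

Both estimates use the same failure draws, so the spliced value can never exceed the graph value; the gap between them indicates how far splicing is from the reliability of the underlying graph.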
26
Summary and Question
• Network virtualization to “cheat” on scalability tradeoffs
– Path diversity vs. scalability
– Efficiency vs. scalability
– Convergence vs. scalability
• What are the common abstractions, functions, etc. that the substrate should provide?
– Slicing
– Nesting
– “Knobs” for granularity control
– …?