path splicing nick feamster georgia tech joint work with murtaza motiwala, santosh vempala, megan...
TRANSCRIPT
![Page 1: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/1.jpg)
Path Splicing
Nick FeamsterGeorgia Tech
Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore
![Page 2: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/2.jpg)
2
Internet Availability
“It is not difficult to create a list of desired characteristics for a new Internet. Deciding how to design and deploy a network that achieves these goals is much harder. … It should be:
1. Robust and available. The network should be as robust, fault-tolerant and available as the wire-line telephone network is today.
2. …
• E911 service• Air traffic control• …
Stanford University Clean-Slate Design for the Internet:
OK for email and the Web, but what about:
![Page 3: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/3.jpg)
3
Work to do…• Various studies (Paxson, Andersen, etc.) show the
Internet is at about 2.5 “nines”• More “critical” (or at least availability-centric) applications
on the Internet• At the same time, the Internet is
getting more difficult to debug– Scale, complexity, disconnection, etc.
![Page 4: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/4.jpg)
8
Threats to Availability
• Natural disasters• Physical failures (node, link)• Router software bugs• Misconfiguration• Mis-coordination• Denial-of-service (DoS) attacks• Changes in traffic patterns (e.g., flash crowd)• …
![Page 5: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/5.jpg)
9
Idea: Backup/Multipath
• For intradomain routing– IP and MPLS fast re-route– Packet deflections [Yang 2006]– ECMP, NotVia, Loop-Free Alternates [Cisco]
• For interdomain routing– MIRO [Rexford 2006]
• Problem– Scale: Protecting against arbitrary failures requires
storing lots of state, exchanging lots of messages– Control: End systems can’t signal when they think a
path has “failed”
![Page 6: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/6.jpg)
10
Backup Paths: Promise and Problems
• Bad: If any link fails on both paths, s is disconnected from t
• Want: End systems remain connected unless the underlying graph has a cut
ts
![Page 7: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/7.jpg)
11
Path Splicing: Main Idea
• Step 1 (Generate slices): Run multiple instances of the routing protocol, each with slightly perturbed versions of the configuration
• Step 2 (Splice end-to-end paths): Allow traffic to switch between instances at any node in the protocol
ts
Compute multiple forwarding trees per destination.Allow packets to switch slices midstream.
![Page 8: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/8.jpg)
12
Outline
• Path Splicing for Intradomain Routing– Generating slices– Constructing paths– Forwarding– Recovery
• Evaluation– Reliability and recovery– Stretch– Effects on traffic
• Path Splicing for Interdomain Routing• Ongoing: Prototype and Deployment Paths
![Page 9: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/9.jpg)
13
Generating Slices
• Goal: Each instance provides different paths• Mechanism: Each edge is given a weight that is
a slightly perturbed version of the original weight– Two schemes: Uniform and degree-based
ts
3
3
3
“Base” Graph
ts
3.5
4
5 1.5
1.5
1.25
Perturbed Graph
![Page 10: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/10.jpg)
14
How to Perturb the Link Weights?
• Uniform: Perturbation is a function of the initial weight of the link
• Degree-based: Perturbation is a linear function of the degrees of the incident nodes– Intuition: Deflect traffic away from nodes where traffic
might tend to pass through by default
![Page 11: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/11.jpg)
15
Constructing Paths
• Goal: Allow multiple instances to co-exist• Mechanism: Virtual forwarding tables
a
t
c
s b
t a
t c
Slice 1
Slice 2
dst next-hop
![Page 12: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/12.jpg)
16
Forwarding Traffic
• Packet has shim header with forwarding bits
• Routers use lg(k) bits to index forwarding tables– Shift bits after inspection
• To access different (or multiple) paths, end systems simply change the forwarding bits– Incremental deployment is trivial– Persistent loops cannot occur
• Various optimizations are possible
![Page 13: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/13.jpg)
17
Forwarding: Putting It Together
• End system sets forwarding bits in packet header– Forwarding bits specify slice to be used at any hop
• Router examines/shifts bits, and forwards
ts
![Page 14: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/14.jpg)
18
Recovery Mechanisms
• End-system recovery– Switch slices at every hop with probability 0.5
• Network-based recovery– Router switches to a random slice if next hop is
unreachable– Continue for a fixed number of hops until
destination is reached
18
![Page 15: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/15.jpg)
19
Availability Evaluation: Two Aspects
• Reliability: Connectivity in the routing tables should approach the that of the underlying graph– If two nodes s and t remain connected in the
underlying graph, there is some sequence of hops in the routing tables that will result in traffic
• Recovery: In case of failure (i.e., link or node removal), nodes should quickly be able to discover a new path
![Page 16: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/16.jpg)
20
Availability Evaluation
• A definition for reliability
• Does path splicing improve reliability?– How close can splicing get to the best possible
reliability (i.e., that of the underlying graph)?
• Can path splicing enable fast recovery?– Can end systems (or intermediate nodes) find
alternate paths fast enough?
![Page 17: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/17.jpg)
21
Reliability Definition
• Reliability: the probability that, upon failing each edge with probability p, the graph remains connected
• Reliability curve: the fraction of source-destination pairs that remain connected for various link failure probabilities p
• The underlying graph has an underlying reliability (and reliability curve)– Goal: Reliability of routing system should approach that of the underlying graph.
![Page 18: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/18.jpg)
22
Reliability Curve: Illustration
Probability of link failure (p)
Fraction of source-dest pairs disconnected
Better reliability
More edges available to end systems -> Better reliability
![Page 19: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/19.jpg)
23
Experimental Setup
• Evaluation on two topologies– GEANT (Real) and Sprint (Rocketfuel)
• Compute base graph by taking the union of k perturbed graphs
• Remove an edge from the base graph with probability p
• Compute number of pairs that could reach one another (average over 1,000 trials)
![Page 20: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/20.jpg)
24
Reliability Approaches Optimal• Sprint (Rocketfuel) topology• 1,000 trials• p indicates probability edge was removed from base graph
Reliability approaches optimal
Average stretch is only 1.3
Sprint topology,degree-based perturbations
![Page 21: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/21.jpg)
25
Simple Recovery Strategies Work Well
• Which paths can be recovered within 5 trials?– Sequential trials: 5 round-trip times– …but trials could also be made in parallel
Recovery approaches maximum possible
Adding a few more slices improves recovery beyond best possible reliability with fewer slices.
![Page 22: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/22.jpg)
26
Significant Novelty for Modest Stretch
• Novelty: difference in nodes in a perturbed shortest path from the original shortest path
Example
s d
Novelty: 1 – (1/3) = 2/3
Fraction of edges on short path shared with long path
![Page 23: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/23.jpg)
27
Summary: Splicing Can Improve Availability
• Reliability: Connectivity in the routing tables should approach the that of the underlying graph– Approach: Overlay trees generated using random link-
weight perturbations. Allow traffic to switch between them– Result: Splicing ~ 10 trees achieves near-optimal reliability
• Recovery: In case of failure, nodes should quickly be able to discover a new path– Approach: End nodes randomly select new bits– Result: Recovery within 5 trials approaches best possible.
![Page 24: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/24.jpg)
28
Does Splicing Create Loops?
• Persistent loops are avoidable– In the simple scheme, path bits are exhausted from
the header– Never switching back to the same
• Transient loops can still be a problem because they increase end-to-end delay (“stretch”)– Longer end-to-end paths– Wasted capacity– Two-hop loops do occur (around 1 in 100 trials for
k=2, more for higher values of k), but can be avoided with the mechanisms above
![Page 25: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/25.jpg)
29
Interactions with Traffic
Maximum utilization unaffected
![Page 26: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/26.jpg)
30
Path Splicing for Interdomain Routing• Observation: Many routers already learn multiple
alternate routes to each destination.• Idea: Use the bits to index into these alternate routes at
an AS’s ingress and egress routers.
• Storing multiple entries per prefix • Indexing into them based on packet headers• Selecting the “best” k routes for each destination
Required new functionality
ddefault
alternate
Splice paths at ingress and egress routers
![Page 27: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/27.jpg)
31
Experimental Setup
• 2,500-node policy-annotated AS graph• Use C-BGP to compute routes on base graph• Remove each inter-AS edge with probability p• Test connectivity between a random subset of
AS pairs• Compute base reliability without policy
restrictions
![Page 28: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/28.jpg)
32
Interdomain Splicing: Reliability
2-slice deployment approaches best possible
![Page 29: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/29.jpg)
33
Incremental Deployment
Partial deployment provides some gains
![Page 30: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/30.jpg)
34
Ongoing Work
• Software implementation– Click Element– PlanetLab/VINI deployment
• Extension to Cisco Multi-Topology Routing– IETF draft in-progress
![Page 31: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/31.jpg)
35
Open Questions and Ongoing Work
• How does splicing interact with traffic engineering? Sources controlling traffic?
• What are the best mechanisms for generating slices and recovering paths?
• Can splicing eliminate dynamic routing?
![Page 32: Path Splicing Nick Feamster Georgia Tech Joint work with Murtaza Motiwala, Santosh Vempala, Megan Elmore](https://reader036.vdocument.in/reader036/viewer/2022062618/5514991b550346b2598b573a/html5/thumbnails/32.jpg)
36
Conclusion• Simple: Forwarding bits provide access to
different paths through the network
• Scalable: Exponential increase in available paths, linear increase in state
• Stable: Fast recovery does not require fast routing protocols
http://www.cc.gatech.edu/~feamster/papers/splicing-hotnets.pdf