Transient BGP Loops
Do they matter, and what can be done about them?
Nate Kushman
MIT/Akamai
Srikanth Kandula, Dina Katabi and John Wroclawski
What causes: “Transient BGP Loops”
[Animated diagram across seven slides. Nodes: MIT, Bob, Joe, AT&T, Sprint. Labels appearing during the animation: "Maintenance", "Withdraw MIT", "Routing Loop".]
How common are: “Transient Inter-domain Routing Loops”
• Sprint Study (IMC 2003, IMW 2002):
  – Looked at packet traces from the Sprint backbone
  – Up to 90% of the observed packet-loss was caused by routing loops
  – 60-100% of the loops attributable to BGP
Routing Loop Damage
• Our Study:
  – 20 vantage points with BGP feeds
  – 2 Months
  – 70,000 unique prefixes
  – Pinged once every 2 minutes
  – Trace-routed once every 30 minutes
  – TTL Exceeded responses to detect loops
  – Additional pings and traceroutes when loops detected
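A rough illustration of how TTL Exceeded responses can reveal a loop: each traceroute probe draws a TTL Exceeded response from the router where it expired, so a router address that reappears after a different router in between suggests the packet is circling. The sketch below is a minimal Python version of that check. The function name `find_loop`, the hop-list format and the sample addresses are assumptions for illustration, not the tooling used in the study.

```python
from typing import List, Optional, Tuple

def find_loop(hops: List[Optional[str]]) -> Optional[Tuple[int, int]]:
    """Return (earlier_index, repeat_index) for the first router that
    reappears after at least one different router, or None if no loop
    is visible.

    hops[i] is the address that answered the probe sent with TTL i+1,
    or None when the probe got no response (hypothetical input format).
    """
    last_seen = {}  # address -> index of its most recent appearance
    for i, addr in enumerate(hops):
        if addr is None:
            continue
        if addr in last_seen:
            between = hops[last_seen[addr] + 1 : i]
            # The same router answering consecutive probes is not a loop;
            # require a different router in between.
            if any(h is not None and h != addr for h in between):
                return last_seen[addr], i
        last_seen[addr] = i
    return None

# Sample hop lists using documentation-range addresses (made up).
looping = ["10.0.0.1", "192.0.2.1", "198.51.100.1", "192.0.2.1", "198.51.100.1"]
clean = ["10.0.0.1", "192.0.2.1", None, "203.0.113.9"]

print(find_loop(looping))  # (1, 3)
print(find_loop(clean))    # None
```

A check like this only flags a candidate loop; following up with additional pings and traceroutes, as the study did, helps separate real loops from measurement noise.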
Routing Loop Damage
10-15% of updates cause routing loops
Collateral Damage
[Animated diagram across three slides. Nodes: AS A, AS B, AS C, AS D, AS E, AS F. The final slide adds an "X" and the label "Collateral Damage".]
Collateral Damage
Prefixes sharing a loopy link see 19% loss
[Plot: Percentage of Packet Loss (y-axis, 0 to 20) against 100 second windows around sharing a loopy link (x-axis, -1000 to 1000).]
What should be done?
We should prevent forwarding loops
A loop occurs because:
One AS pushes a route update to its data plane, but other ASes, not yet aware of the change, still send packets along the old route
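The mechanism can be sketched with per-AS next-hop tables. This is a minimal Python illustration, not a BGP implementation. The topology is an assumption loosely based on the diagram's node names: Joe and AT&T initially reach MIT through Bob, and Sprint is assumed to have its own path to MIT. The helper `forward` is hypothetical.

```python
from typing import Dict, List, Tuple

def forward(next_hop: Dict[str, str], src: str, dst: str) -> Tuple[List[str], str]:
    """Follow per-AS next hops from src toward dst.

    Returns the ASes visited and one of "delivered", "loop", "dropped".
    """
    path = [src]
    node = src
    while node != dst:
        nxt = next_hop.get(node)
        if nxt is None:
            return path, "dropped"      # no route at this AS
        path.append(nxt)
        if nxt in path[:-1]:
            return path, "loop"         # packet came back to an AS it already visited
        node = nxt
    return path, "delivered"

# Assumed next hops toward MIT before the maintenance event.
before = {"Bob": "MIT", "Joe": "Bob", "AT&T": "Joe", "Sprint": "MIT"}

# Joe has heard the withdrawal and moved to AT&T, but AT&T has not yet
# heard of Joe's change and still forwards through Joe.
during = {"Joe": "AT&T", "AT&T": "Joe", "Sprint": "MIT"}

# AT&T has converged onto the Sprint path.
after = {"Joe": "AT&T", "AT&T": "Sprint", "Sprint": "MIT"}

print(forward(before, "AT&T", "MIT"))  # (['AT&T', 'Joe', 'Bob', 'MIT'], 'delivered')
print(forward(during, "AT&T", "MIT"))  # (['AT&T', 'Joe', 'AT&T'], 'loop')
print(forward(after, "Joe", "MIT"))    # (['Joe', 'AT&T', 'Sprint', 'MIT'], 'delivered')
```

The loop exists only in the middle state, where the two forwarding tables disagree about who has moved; that is what makes it transient.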
How can we avoid Routing Loops?
[Animated diagram across nine slides. Nodes: MIT, Bob, Joe, AT&T, Sprint. Labels appearing during the animation: "Maintenance", "Withdraw MIT". The callouts and the "Suspension" slide below appear in this order.]
AT&T still thinks Joe is routing through Bob
What if: AT&T knew about Joe’s change before making its own?
Suspension
• Continue to route traffic
• Tell control system not to propagate the route
What if: Joe sends its update before changing its forwarding table?
And also waits for an Ack from AT&T before updating its forwarding table?
Then we can be sure that AT&T knows about the path change before it happens and will not use the path
Instead, AT&T will move immediately to the Sprint path and the loop is avoided.
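The effect of the ordering can be sketched by replaying forwarding-table changes one at a time and checking for a loop after each. This is a minimal Python illustration under an assumed topology (Joe and AT&T initially reach MIT through Bob, Sprint has its own path, and the old path keeps carrying traffic until Joe switches). The helpers `has_loop` and `run` are hypothetical and this is not the protocol as specified by the authors.

```python
from typing import Dict, List

def has_loop(next_hop: Dict[str, str], src: str, dst: str) -> bool:
    """True if following next hops from src revisits an AS before reaching dst."""
    seen = {src}
    node = src
    while node != dst:
        node = next_hop.get(node)
        if node is None:
            return False  # packet is dropped, not looping
        if node in seen:
            return True
        seen.add(node)
    return False

def run(steps: List[Dict[str, str]]) -> List[bool]:
    """Apply forwarding-table changes in order; report loop state after each."""
    # Assumed starting next hops toward MIT (illustrative topology).
    fib = {"Bob": "MIT", "Joe": "Bob", "AT&T": "Joe", "Sprint": "MIT"}
    results = []
    for change in steps:
        fib.update(change)
        results.append(has_loop(fib, "AT&T", "MIT") or has_loop(fib, "Joe", "MIT"))
    return results

# Ordering 1: Joe changes its forwarding table first; AT&T reacts later.
plain = run([{"Joe": "AT&T"}, {"AT&T": "Sprint"}])

# Ordering 2: Joe sends its update first and keeps forwarding via Bob;
# AT&T moves to Sprint and acks; only then does Joe switch.
acked = run([{"AT&T": "Sprint"}, {"Joe": "AT&T"}])

print(plain)  # [True, False]
print(acked)  # [False, False]
```

In the second ordering no intermediate state contains a loop, because Joe never points at AT&T while AT&T still points at Joe.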
More Generally
• We have proven:
  – Loops are prevented in the general case
  – Convergence properties similar to normal BGP
• All sorts of good proofs and stuff:
  – http://nms.lcs.mit.edu/~nkushman/
Your feedback
• Clearly:
  – Planned Maintenance events
    • 20% of update events caused by planned maintenance
  – Link up events
• What about?
  – Unplanned Link down events
  – Trade-off between loss on current path and collateral damage
In Short
• Routing loops cause significant performance problems
• Even prefixes with no BGP updates are significantly affected by loops
• A simple change to BGP can avoid all routing loops
Questions?