interdomain routing and the border gateway protocol (bgp) cl oct 27, 2004 timothy g. griffin intel...
TRANSCRIPT
Interdomain Routing and The Interdomain Routing and The Border Gateway Protocol Border Gateway Protocol
(BGP)(BGP)
CLOct 27, 2004
Timothy G. Griffin Intel Research,
Cambridge UK
Architecture of Dynamic Routing
AS 1
AS 2
EGP (= BGP)
EGP = Exterior Gateway Protocol
IGP = Interior Gateway Protocol
Metric based: OSPF, IS-IS, RIP, EIGRP (cisco)
Policy based: BGP
The Routing Domain of BGP is the entire Internet
IGP
IGP
• Topology information is flooded within the routing domain
• Best end-to-end paths are computed locally at each router.
• Best end-to-end paths determine next-hops.
• Based on minimizing some notion of distance
• Works only if policy is shared and uniform
• Examples: OSPF, IS-IS
• Each router knows little about network topology
• Only best next-hops are chosen by each router for each destination network.
• Best end-to-end paths result from composition of all next-hop choices
• Does not require any notion of distance
• Does not require uniform policies at all routers
• Examples: RIP, BGP
Link State Vectoring
Technology of Distributed Routing
The Gang of Four
Link State Vectoring
EGP
IGP
BGP
RIPIS-IS
OSPF
How do you connect to the Internet?
Physical connectivity isjust the beginning of thestory….
Partial View of www.cl.cam.ac.uk (128.232.0.20) Neighborhood
AS 786 ja.net(UKERNA)
AS 1239 Sprint
AS 4373 Online Computer Library Center
Originates > 180 prefixes, Including 128.232.0.0/16
AS 3356Level 3
AS 6461AboveNet
AS 1213 HEAnet(Irish academic and research)
AS 7 UK Defense Research Agency
AS 5459 LINX
AS 702 UUNET
AS 20965 GEANT
How Many ASNs are there today?
Thanks to Geoff Huston. http://bgp.potaroo.net on October 24, 2003
15,981
How Many ASNs are there today?
Thanks to Geoff Huston. http://bgp.potaroo.net on October 26, 2004
18,217
12,940origin only (notransit)
AS Numbers (ASNs)
ASNs are 16 bit values.64512 through 65535 are “private”
• Genuity: 1 • MIT: 3• JANET: 786• UC San Diego: 7377• AT&T: 7018, 6341, 5074, … • UUNET: 701, 702, 284, 12199, …• Sprint: 1239, 1240, 6211, 6242, …• …
ASNs represent units of routing policy
Currently over 15,000 in use.
Autonomous Routing Domains Don’t Always Need BGP or an ASN
Qwest
Yale University
Nail up default routes 0.0.0.0/0pointing to Qwest
Nail up routes 130.132.0.0/16pointing to Yale
130.132.0.0/16
Static routing is the most common way of connecting anautonomous routing domain to the Internet. This helps explain why BGP is a mystery to many …
ASNs Can Be “Shared” (RFC 2270)
AS 701UUNet
ASN 7046 is assigned to UUNet. It is used byCustomers single homed to UUNet, but needing BGP for some reason (load balancing, etc..) [RFC 2270]
AS 7046Crestar Bank
AS 7046 NJIT
AS 7046HoodCollege
128.235.0.0/16
Autonomous Routing Domain != Autonomous System (AS)
• Most ARDs have no ASN (statically routed at Internet edge)
• Some unrelated ARDs share the same ASN (RFC 2270)
• Some ARDs are implemented with multiple ASNs (example: Worldcom)
ASes are an implementation detail of Interdomain routing
How many prefixes today?
Thanks to Geoff Huston. http://bgp.potaroo.net on October 24, 2003
154,894
Note: numbersactually dependspoint of view…
29%
23%
Address space covered
How many prefixes today?
Thanks to Geoff Huston. http://bgp.potaroo.net on October 26, 2004
179,903
Note: numbersactually dependspoint of view…
31%
23%
Address space covered
15
Policy-Based vs. Distance-Based Routing?
ISP1
ISP2
ISP3
Cust1
Cust2Cust3
Host 1
Host 2
Minimizing “hop count” can violate commercial relationships thatconstrain inter-domain routing.
YES
NO
16
Why not minimize “AS hop count”?
Regional ISP1
Regional ISP2
Regional ISP3
Cust1Cust3 Cust2
National ISP1
National ISP2
YES
NO
Shortest path routing is not compatible with commercial relations
Customers and Providers
Customer pays provider for access to the Internet
provider
customer
IP trafficprovider customer
The “Peering” Relationship
peer peer
customerprovider
Peers provide transit between their respective customers
Peers do not provide transit between peers
Peers (often) do not exchange $$$trafficallowed
traffic NOTallowed
Peering Provides Shortcuts
Peering also allows connectivity betweenthe customers of “Tier 1” providers.
peer peer
customerprovider
Peering Wars
• Reduces upstream transit costs
• Can increase end-to-end performance
• May be the only way to connect your customers to some part of the Internet (“Tier 1”)
• You would rather have customers
• Peers are usually your competition
• Peering relationships may require periodic renegotiation
Peering struggles are by far the most contentious issues in the ISP world!
Peering agreements are often confidential.
Peer Don’t Peer
The Border Gateway Protocol (BGP)
BGP = RFC 1771
+ “optional” extensionsRFC 1997 (communities) RFC 2439 (damping) RFC 2796 (reflection) RFC3065 (confederation) …
+ routing policy configurationlanguages (vendor-specific)
+ Current Best Practices in management of Interdomain Routing
BGP was not DESIGNED. It EVOLVED.
22
BGP Route Processing
Best Route Selection
Apply Import Policies
Best Route Table
Apply Export Policies
Install forwardingEntries for bestRoutes.
ReceiveBGPUpdates
BestRoutes
TransmitBGP Updates
Apply Policy =filter routes & tweak attributes
Based onAttributeValues
IP Forwarding Table
Apply Policy =filter routes & tweak attributes
Open ended programming.Constrained only by vendor configuration language
BGP Attributes
Value Code Reference----- --------------------------------- --------- 1 ORIGIN [RFC1771] 2 AS_PATH [RFC1771] 3 NEXT_HOP [RFC1771] 4 MULTI_EXIT_DISC [RFC1771] 5 LOCAL_PREF [RFC1771] 6 ATOMIC_AGGREGATE [RFC1771] 7 AGGREGATOR [RFC1771] 8 COMMUNITY [RFC1997] 9 ORIGINATOR_ID [RFC2796] 10 CLUSTER_LIST [RFC2796] 11 DPA [Chen] 12 ADVERTISER [RFC1863] 13 RCID_PATH / CLUSTER_ID [RFC1863] 14 MP_REACH_NLRI [RFC2283] 15 MP_UNREACH_NLRI [RFC2283] 16 EXTENDED COMMUNITIES [Rosen] ... 255 reserved for development
From IANA: http://www.iana.org/assignments/bgp-parameters
Mostimportantattributes
Not all attributesneed to be present inevery announcement
24
ASPATH Attribute
AS7018135.207.0.0/16AS Path = 6341
AS 1239Sprint
AS 1755Ebone
AT&T
AS 3549Global Crossing
135.207.0.0/16AS Path = 7018 6341
135.207.0.0/16AS Path = 3549 7018 6341
AS 6341
135.207.0.0/16
AT&T Research
Prefix Originated
AS 12654RIPE NCCRIS project
AS 1129Global Access
135.207.0.0/16AS Path = 7018 6341
135.207.0.0/16AS Path = 1239 7018 6341
135.207.0.0/16AS Path = 1755 1239 7018 6341
135.207.0.0/16AS Path = 1129 1755 1239 7018 6341
In fairness: could you do this “right” and still scale?
Exporting internalstate would dramatically increase global instability and amount of routingstate
Shorter Doesn’t Always Mean Shorter
AS 4
AS 3
AS 2
AS 1
Mr. BGP says that path 4 1 is better than path 3 2 1
Duh!
Routing Example 1
Thanks to Han Zheng
Routing Example 2
Thanks to Han Zheng
Tweak Tweak Tweak (TE)
• For inbound traffic– Filter outbound routes– Tweak attributes on
outbound routes in the hope of influencing your neighbor’s best route selection
• For outbound traffic– Filter inbound routes– Tweak attributes on
inbound routes to influence best route selection
outboundroutes
inboundroutes
inboundtraffic
outboundtraffic
In general, an AS has morecontrol over outbound traffic
29
Implementing Backup Links with Local Preference (Outbound Traffic)
Forces outbound traffic to take primary link, unless link is down.
AS 1
primary link backup link
Set Local Pref = 100for all routes from AS 1 AS 65000
Set Local Pref = 50for all routes from AS 1
30
Multihomed Backups (Outbound Traffic)
Forces outbound traffic to take primary link, unless link is down.
AS 1
primary link backup link
Set Local Pref = 100for all routes from AS 1
AS 2
Set Local Pref = 50for all routes from AS 3
AS 3provider provider
31
Shedding Inbound Traffic with ASPATH Prepending
Prepending will (usually) force inbound traffic from AS 1to take primary linkAS 1
192.0.2.0/24ASPATH = 2 2 2
customerAS 2
provider
192.0.2.0/24
backupprimary
192.0.2.0/24ASPATH = 2
Yes, this is a Glorious Hack …
32
… But Padding Does Not Always Work
AS 1
192.0.2.0/24ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2
customerAS 2
provider
192.0.2.0/24
192.0.2.0/24ASPATH = 2
AS 3provider
AS 3 will sendtraffic on “backup”link because it prefers customer routes and localpreference is considered before ASPATH length!
Padding in this way is oftenused as a form of loadbalancing
backupprimary
33
COMMUNITY Attribute to the Rescue!
AS 1
customerAS 2
provider
192.0.2.0/24
192.0.2.0/24ASPATH = 2
AS 3provider
backupprimary
192.0.2.0/24ASPATH = 2 COMMUNITY = 3:70
Customer import policy at AS 3:If 3:90 in COMMUNITY then set local preference to 90If 3:80 in COMMUNITY then set local preference to 80If 3:70 in COMMUNITY then set local preference to 70
AS 3: normal customer local pref is 100,peer local pref is 90
BGP Wedgies ---- Bad Policy BGP Wedgies ---- Bad Policy Interactions that Cannot be DebuggedInteractions that Cannot be Debugged
[email protected]://www.cambridge.intel-research.net/~tgriffin/
What is a BGP Wedgie?
• BGP policies make sense locally• Interaction of local policies allows
multiple stable routings• Some routings are consistent with
intended policies, and some are not
– If an unintended routing is installed (BGP is “wedged”), then manual intervention is needed to change to an intended routing
• When an unintended routing is installed, no single group of network operators has enough knowledge to debug the problem
¾ wedgie
fullwedgie
¾ Wedgie Example
• AS 1 implements backup link by sending AS 2 a “depref me” community.
• AS 2 implements this community so that the resulting local pref is below that of routes from it’s upstream provider (AS 3 routes)
AS 1
AS 2
AS 3 AS 4
customer
provider
peer peer
provider
customer
customer
providerbackup link
primary link
And the Routings are…
AS 1
AS 2
AS 3 AS 4
Intended Routing
AS 1
AS 2
AS 3 AS 4
Unintended RoutingNote: This is easy to reach from the intended routing just by “bouncing”the BGP session on the primary link.
Note: this would be the ONLY routing if AS2 translated its “depref me” community to a “depref me” community of AS 3
Recovery
AS 1
AS 2
AS 3 AS 4
AS 1
AS 2
AS 3 AS 4
AS 1
AS 2
AS 3 AS 4
Bring down AS 1-2 session Bring it back up!
• Requires manual intervention• Can be done in AS 1 or AS 2
Load Balancing Example
primary link for prefix P1backup link for prefix P2
AS 1
AS 2
AS 3 AS 4
provider
peer peer
provider
customer
AS 5customer
primary link for prefix P2backup link for prefix P1
• Recovery for prefix P1 may cause a BGP wedgie for prefix P2 …
AS 1
AS 2
AS 3 AS 4
customer
provider
peer peer
provider
customer
customer
provider
primary link
Full Wedgie Example
AS 5
backup links
• AS 1 implements backup links by sending AS 2 and AS 3 a “depref me” communities.
• AS 2 implements its community so that the resulting local pref is below that of its upstream providers and it’s peers (AS 3 and AS 5 routes)
• AS 5 implements its community so that the resulting local pref is below its peers (AS 2) but above that of its providers (AS 3)
customer
peer peer
And the Routings are…
AS 1
AS 2
AS 3 AS 4
AS 5
AS 1
AS 2
AS 3 AS 4
AS 5
Intended Routing Unintended Routing
Recovery??
AS 1
AS 2
AS 3 AS 4
AS 5
AS 1
AS 2
AS 3 AS 4
AS 5
Bring down AS 1-2 session
Bring up AS 1-2 session
Recovery
AS 1
AS 2
AS 3 AS 4
AS 5
AS 1
AS 2
AS 3 AS 4
AS 5
Bring down AS 1-2 sessionAND AS 1-5 session
AS 1
AS 2
AS 3 AS 4
AS 5
Try telling AS 5 that it hasto reset a BGP session that isnot associated with a BEST route!
Bring up AS 1-2 sessionAND AS 1-5 session
Larry Speaks
http://www.larrysface.com/
Is this any way to run an Internet?
References
• [VGE1996, VGE2000] Persistent Route Oscillations in Inter-Domain Routing. Kannan Varadhan, Ramesh Govindan, and Deborah Estrin. Computer Networks, Jan. 2000. (Also USC Tech Report, Feb. 1996)
• [GW1999] An Analysis of BGP Convergence Properties. Timothy G. Griffin, Gordon Wilfong. SIGCOMM 1999
• [GSW1999] Policy Disputes in Path Vector Protocols. Timothy G. Griffin, F. Bruce Shepherd, Gordon Wilfong. ICNP 1999
• [GW2001] A Safe Path Vector Protocol. Timothy G. Griffin, Gordon Wilfong. INFOCOM 2001
• [GR2000] Stable Internet Routing without Global Coordination. Lixin Gao, Jennifer Rexford. SIGMETRICS 2000
• [GGR2001] Inherently safe backup routing with BGP. Lixin Gao, Timothy G. Griffin, Jennifer Rexford. INFOCOM 2001
– [GW2002a] On the Correctness of IBGP Configurations. Griffin and Wilfong.SIGCOMM 2002.
– [GW2002b] An Analysis of the MED oscillation Problem. Griffin and Wilfong. ICNP 2002.
Pointers
• Interdomain routing links– http://www.cambridge.intel-research.net/~tgriffin/interdomain/
• These slides– http://www.cambridge.intel-research.net/~tgriffin/talks_tutorials/CL_2031024.ppt