DD2491, p1 2008
Route Reflectors,
Confederations,
4-byte AS numbers.
etc.
Olof Hagsand KTH/CSC
DD2491 p1 2008
Documentation
• Route Reflectors
– Practical BGP pages 135 – 153
• Note Fig 4.12 does not match text
– RFC 4456
• Confederations
– Practical BGP pages 153 – 160
– RFC 5065
Motivation
• Scalability problems with iBGP full mesh
– When your network has grown and a full mesh is no longer feasible
– A full mesh needs n*(n-1)/2 sessions, where n = the number of iBGP-speaking routers
– With 200 routers in the network, that gives 19900 BGP sessions
• Two solutions
– Route reflectors
– Confederations
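The session count above can be checked with a few lines of Python. This is a minimal sketch: the function names are made up for illustration, and the route reflector design (RRs fully meshed with each other, every other router a client of every RR) is an assumption, not something the slides prescribe.

```python
def full_mesh_sessions(n: int) -> int:
    """iBGP sessions needed for a full mesh of n routers: n*(n-1)/2."""
    return n * (n - 1) // 2


def route_reflector_sessions(n: int, reflectors: int) -> int:
    """Sessions when the RRs are fully meshed with each other and every
    remaining router is a client of every RR (assumed design)."""
    clients = n - reflectors
    return full_mesh_sessions(reflectors) + clients * reflectors


if __name__ == "__main__":
    print(full_mesh_sessions(200))           # 19900, the figure from the slide
    print(route_reflector_sessions(200, 2))  # 1 + 198 * 2 = 397
```

With two redundant route reflectors, the same 200 routers would need 397 sessions instead of 19900 under these assumptions.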
Route reflectors, clients and clusters
• Route reflection is concerned with distributing routes within an AS,
not the actual routing.
• The route reflectors have to be configured to reflect routes to route reflector clients.
• The clients do not know they are clients and are configured as normal iBGP peers.
• A set of RRs and clients is referred to as a cluster.
• Only the best route to a destination is sent from an RR to a client
• To avoid loops, an RR setup should always follow the physical topology
Route reflector example
[Figure: one cluster containing a route reflector and two clients. Routers A, B, C, D and E are connected by iBGP sessions; external peers E1, E2 and E3 are attached via eBGP.]
Route reflectors
• A route reflector reflects iBGP routing information
– From clients to iBGP peers and other clients
– From iBGP peers to clients
– Never from iBGP peers to iBGP peers (as before)
– Should not change the attributes
• NEXT_HOP
• AS_PATH
• LOCAL_PREF
• MED
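The reflection rules listed above can be written down as a small decision function. This is only a sketch: `PeerType` and `should_advertise` are hypothetical names, and the function ignores details such as best-path selection and not sending a route back to the peer it came from.

```python
from enum import Enum


class PeerType(Enum):
    EBGP = "ebgp"
    CLIENT = "client"          # iBGP peer configured as an RR client
    NON_CLIENT = "non-client"  # ordinary iBGP peer


def should_advertise(learned_from: PeerType, send_to: PeerType) -> bool:
    """Would a route reflector pass a route on, given where it was learned?"""
    if learned_from is PeerType.EBGP or send_to is PeerType.EBGP:
        # Routes to or from eBGP peers follow the ordinary BGP rules.
        return True
    if learned_from is PeerType.CLIENT:
        # From a client: reflect to other clients and to non-client iBGP peers.
        return True
    # From a non-client iBGP peer: reflect to clients only,
    # never to another non-client iBGP peer.
    return send_to is PeerType.CLIENT


if __name__ == "__main__":
    for src in (PeerType.CLIENT, PeerType.NON_CLIENT):
        for dst in (PeerType.CLIENT, PeerType.NON_CLIENT):
            print(src.value, "->", dst.value, should_advertise(src, dst))
```

The only iBGP combination that returns False is non-client to non-client, which is the ordinary iBGP rule that the full mesh was built around.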
Exercise
• Using the previous figure:
– Trace a route from E1, E2 and E3 respectively
– Suppose the same route comes from both E1 and E2, how does it
propagate throughout the AS?
– How does traffic flow within the AS, for example transit traffic between E2 and E3?
Path attributes
• Two new attributes are added by the RR when a route is reflected
– CLUSTER_LIST
• The RR adds its cluster ID to the cluster list (like a path)
– ORIGINATOR_ID
• The first RR adds the router ID of the peer it learned the route from
– Both are optional and non-transitive (they are not propagated over eBGP)
• Cluster ID
– 32 bits, written in dotted decimal notation in JunOS
– Does not have to be a routed address
– Usually the RR's router ID is used, but it can be configured to something else
(see cluster ID with multiple RRs)
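As an illustration of how the two attributes are used, the sketch below stamps a route when it is reflected and then applies the loop checks on reception. The `Route` class, the function names and the router IDs are all hypothetical; only the behaviour (prepend the cluster ID, set ORIGINATOR_ID once, reject a route carrying your own cluster ID or router ID) follows the slide and RFC 4456.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    originator_id: Optional[str] = None
    cluster_list: Tuple[str, ...] = ()


def reflect(route: Route, cluster_id: str, learned_from_router_id: str) -> Route:
    """Return the route as an RR would reflect it."""
    # ORIGINATOR_ID is set by the first RR only and left untouched afterwards.
    originator = route.originator_id or learned_from_router_id
    # The RR prepends its own cluster ID, like an AS being added to a path.
    return Route(route.prefix, originator, (cluster_id,) + route.cluster_list)


def accept(route: Route, my_router_id: str, my_cluster_id: Optional[str] = None) -> bool:
    """Loop checks applied by the receiver of a reflected route."""
    if route.originator_id == my_router_id:
        return False  # the route came back to its originator
    if my_cluster_id is not None and my_cluster_id in route.cluster_list:
        return False  # the route already passed through this cluster
    return True


if __name__ == "__main__":
    from_client = Route("192.168.1.0/24")
    reflected = reflect(from_client, cluster_id="1.1.1.1", learned_from_router_id="4.4.4.4")
    print(reflected)
    print(accept(reflected, my_router_id="2.2.2.2", my_cluster_id="1.1.1.1"))
    print(accept(reflected, my_router_id="3.3.3.3"))
```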
Route reflector configuration in JunOS
protocols {
    bgp {
        group INTERNALRR {
            type internal;
            local-address 192.168.1.1;
            cluster 192.168.1.1;
            neighbor 192.168.1.2;
        }
    }
}
Route reflector client configuration
protocols {
    bgp {
        group INTERNAL {
            type internal;
            local-address 192.168.1.2;
            neighbor 192.168.1.1;
        }
    }
}
Route reflectors
• For redundancy, you can have more than one route reflector
in a cluster.
– Should the Cluster ID be the same on the RRs in a cluster?
Cluster with multiple RRs
[Figure: RT1 and RT2 are route reflectors, both configured with cluster ID 1.1.1.1; RT3 and RT4 are clients. A prefix update for 192.168.1.0/24 arrives from AS2.]
Route reflectors
• In the example both RRs add the same Cluster ID
• This will result in
– RT4 gets a prefix from an eBGP peer and sends the update to its iBGP peers (RT1 and RT2)
– RT1 and RT2 add cluster ID 1.1.1.1 to the CLUSTER_LIST and set ORIGINATOR_ID to RT4's router ID.
– RT1 and RT2 reflect the update to all iBGP peers and RR clients (in JunOS RT4 will not get the update back)
– RT1 receives an update from RT2 that carries RT1's own cluster ID in the CLUSTER_LIST (the same CLUSTER_LIST and ORIGINATOR_ID information as it already had) and therefore drops it
Route reflectors
• RT3 receives updates from both RT1 and RT2 with the same
information in CLUSTER_LIST and ORIGINATOR_ID. RT3 will
install one of the updates and drop the other.
• What will happen if a router RTx (which uses RT2 to reach the prefix 192.168.1.0/24) sends packets to the destination in AS2 while the iBGP session between RT2 and RT4 is down?
Route reflectors
• If instead RT1 added the Cluster ID 1.1.1.1 and RT2 the
Cluster ID 2.2.2.2
– RT2 would still have valid information on where to forward the packets
– But we would have duplicated paths
– We would be using additional memory and processor overhead due
to the duplicated paths.
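The trade-off between a shared cluster ID and distinct cluster IDs can be made concrete with a toy model of what RT2 ends up holding for 192.168.1.0/24. This is a sketch under simplifying assumptions: the function name is invented, only the CLUSTER_LIST check is modelled, and RT1 is assumed to always have its own session to RT4.

```python
from typing import List


def rt2_paths(rt1_cluster_id: str, rt2_cluster_id: str,
              rt2_rt4_session_up: bool = True) -> List[str]:
    """Paths RT2 holds for the prefix announced by client RT4."""
    paths = []
    if rt2_rt4_session_up:
        paths.append("direct from client RT4")
    # RT1 reflects the route to RT2 with RT1's cluster ID in the CLUSTER_LIST.
    cluster_list_from_rt1 = [rt1_cluster_id]
    # RT2 drops the reflected copy if it finds its own cluster ID in the list.
    if rt2_cluster_id not in cluster_list_from_rt1:
        paths.append("reflected by RT1")
    return paths


if __name__ == "__main__":
    print(rt2_paths("1.1.1.1", "1.1.1.1"))         # one path, no duplicates
    print(rt2_paths("1.1.1.1", "2.2.2.2"))         # two paths, more memory
    print(rt2_paths("1.1.1.1", "1.1.1.1", False))  # no path left on RT2
    print(rt2_paths("1.1.1.1", "2.2.2.2", False))  # backup path via RT1
```

With the same cluster ID, RT2 keeps a single copy but has nothing left when its session to RT4 fails; with different cluster IDs it pays for a duplicate path and keeps a backup.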
Route reflectors
• To further scale a network using RRs
– You can use nested RRs
• An RR client can be an RR for another cluster
– The cluster ID must at least be unique per cluster within the AS
– An RR can be the RR for more than one cluster
• Design considerations
– Always follow the physical topology
Route reflectors
• It is possible to have all the clients within a cluster in an iBGP full mesh
– If they are fully meshed, the RR does not need to reflect updates between clients
– “set protocols bgp group groupname no-client-reflect”
Why follow physical topology?
• If you don't, you may get routing loops.
• B will prefer D and C will prefer A => routing loop!
– (This replaces Practical BGP Fig 4.12)
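[Figure: routers A and D (top row) and B and C (bottom row); two routers are labelled RR and two are labelled Client; prefix 10.1.1.0/24.]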
Confederations
• Another way of solving the iBGP full-mesh problem
• The idea behind confederations is to take one large AS and divide it into several smaller ones
– Non-members of the confederation see one AS; members of the confederation are divided into sub-ASs
– One IGP must usually be run across the whole confederation to support connectivity
– LOCAL_PREF and NEXT_HOP are preserved through the confederation
Example
[Figure: confederation example showing AS65100, AS65200, AS65300, AS65400 and AS65500. Labels: sub-AS; confederation ID == global AS.]
See Practical BGP p. 154
Mechanism
• Two new segment types of the AS_PATH are added (apart from AS_SEQUENCE and AS_SET):
– AS_CONFED_SET
– AS_CONFED_SEQUENCE
• BGP speakers add sub-AS numbers to these within the confederation
Sub AS numbers
• AS confederation identifier = the external AS number
• AS member number = the confederation sub-AS number
• Design considerations: when configuring confederations, use private AS numbers (64512 – 65535) for the sub-ASs
– Some implementations of confederations have been known to leak the member sub-AS numbers to their eBGP peers
– What happens if you use public AS numbers that belong to someone else?
Announcing Rules
• iBGP (within a sub-AS) behaves as normal
• BGP peering between sub-ASs (sometimes called eiBGP):
– Prepend the sub-AS (AS member number) to the AS_PATH
• When a BGP update leaves the confederation
– Remove the prepended sub-AS information from the AS_PATH.
• Differences between eBGP and eiBGP
– LOCAL_PREF is preserved through the confederation
– NEXT_HOP is also preserved
• You have to know whether to speak eiBGP or eBGP to your neighbor:
– If you share the AS confederation identifier -> eiBGP
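The announcing rules can be sketched as three small functions operating on an AS_PATH modelled as a list of typed segments. Everything here is illustrative: the function names, the segment representation and the AS numbers in the demo (64500 as confederation identifier, 64510 as an external origin) are assumptions, with 65100 and 65200 borrowed from the example figure as sub-ASs.

```python
from typing import List, Set, Tuple

Segment = Tuple[str, List[int]]  # (segment type, AS numbers)


def session_type(my_member_as: int, peer_as: int, confed_members: Set[int]) -> str:
    """Decide which flavour of BGP to speak to a neighbor."""
    if peer_as == my_member_as:
        return "iBGP"
    if peer_as in confed_members:
        return "eiBGP"  # peer is another sub-AS of the same confederation
    return "eBGP"


def to_confed_peer(path: List[Segment], my_member_as: int) -> List[Segment]:
    """Prepend the sub-AS number in an AS_CONFED_SEQUENCE segment."""
    if path and path[0][0] == "AS_CONFED_SEQUENCE":
        return [("AS_CONFED_SEQUENCE", [my_member_as] + path[0][1])] + path[1:]
    return [("AS_CONFED_SEQUENCE", [my_member_as])] + path


def to_external_peer(path: List[Segment], confed_id: int) -> List[Segment]:
    """Strip confederation segments and prepend the confederation identifier."""
    public = [(kind, list(asns)) for kind, asns in path
              if not kind.startswith("AS_CONFED")]
    if public and public[0][0] == "AS_SEQUENCE":
        return [("AS_SEQUENCE", [confed_id] + public[0][1])] + public[1:]
    return [("AS_SEQUENCE", [confed_id])] + public


if __name__ == "__main__":
    learned = [("AS_SEQUENCE", [64510])]         # route from an external AS
    inside = to_confed_peer(learned, 65100)      # sub-AS 65100 -> sub-AS 65200
    inside = to_confed_peer(inside, 65200)       # sub-AS 65200 -> next sub-AS
    print(inside)
    print(to_external_peer(inside, 64500))       # leaving the confederation
```

Outside the confederation only the confederation identifier is visible, which is why leaked sub-AS numbers are treated as an implementation bug.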
Sub-hierarchies
• You cannot make sub-hierarchies using confederations
• You can use route reflection within a sub-AS
– And even nested route reflector hierarchies
Four-byte AS numbers (RFC 4893)
• 2-byte ASNs are quickly running out
• 4-byte ASNs have been standardized re-using the AS_PATH
and a migration technique using a special 2-byte ASN:
23456.
• The migration technique maps all 4-byte ASNs to 23456
when NEW speakers (that implement RFC 4893) talk to OLD
speakers (those that don't implement RFC 4893)
Four-byte AS numbers
• Where does BGP carry the AS number?
– In the OPEN message (My Autonomous System)
– In the AS_PATH attribute
– In the AGGREGATOR attribute
– In Communities attributes
• A NEW speaker announces 4-byte AS support as a capability
– The capability also carries the speaker's own 4-byte ASN.
• NEW speakers use the AS_PATH attribute for 4-byte ASNs
New attributes
• Two new attributes (optional transitive)
– AS4_PATH
– AS4_AGGREGATOR
• Only used to “tunnel” 4-byte AS information over non 4-byte
AS clouds.
NEW speaking with OLD
• NEW converts all 4-byte ASNs in AS_PATH to 23456
• NEW creates the attribute AS4_PATH to “tunnel” the 4-byte
AS-path to other NEW speakers.
• When NEW receives a route from OLD with an AS4_PATH attribute, it constructs a new AS_PATH, replacing every 23456 with the corresponding 4-byte ASN from AS4_PATH.
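The NEW/OLD conversion can be sketched as two functions: one that builds what a NEW speaker sends to an OLD speaker, and one that rebuilds the 4-byte path on the far side. This is a simplified model under stated assumptions: paths are flat lists of ASNs (AS_SEQUENCE only), the function names are invented, and the reconstruction follows the RFC 4893 rule of taking the leading part of AS_PATH and appending AS4_PATH.

```python
from typing import List, Optional, Tuple

AS_TRANS = 23456  # the reserved 2-byte placeholder ASN


def encode_for_old(path: List[int]) -> Tuple[List[int], Optional[List[int]]]:
    """What a NEW speaker sends to an OLD speaker: (AS_PATH, AS4_PATH)."""
    if all(asn <= 0xFFFF for asn in path):
        return list(path), None  # nothing to tunnel
    as_path = [asn if asn <= 0xFFFF else AS_TRANS for asn in path]
    return as_path, list(path)


def reconstruct(as_path: List[int], as4_path: Optional[List[int]]) -> List[int]:
    """What a NEW speaker rebuilds when the route comes back from OLD."""
    if as4_path is None or len(as_path) < len(as4_path):
        return list(as_path)  # AS4_PATH missing or unusable
    # ASNs prepended by OLD speakers sit at the front of AS_PATH only.
    lead = len(as_path) - len(as4_path)
    return list(as_path[:lead]) + list(as4_path)


if __name__ == "__main__":
    # AS70000 (NEW) has prepended itself to a path through AS532 and AS90000.
    as_path, as4_path = encode_for_old([70000, 532, 90000])
    print(as_path, as4_path)
    # An OLD speaker, here AS50, prepends its own 2-byte ASN and passes
    # AS4_PATH along untouched because the attribute is optional transitive.
    as_path = [50] + as_path
    print(reconstruct(as_path, as4_path))
```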
Example
• How can loop detection work?
• Because a speaker only checks the path for its own AS number, and that is never AS_TRANS (23456).
[Figure: three ASes (AS70000, AS100, AS50) in a chain of NEW, OLD and NEW speakers. Path annotations: AS_PATH: 532 90000; AS_PATH: 23456 532 23456 with AS4_PATH: 70000 532 90000; AS_PATH: 50 70000 532 90000.]
BGP Multipath
• By default, the BGP best-path selection algorithm selects a single route, so load balancing from a single router to a single prefix is not possible
– unless it is done “outside” BGP, for example using loopback peering
• BGP multipath enables load balancing between “equal” paths (equal up to the step that compares router IDs)
• Limited JunOS functionality
– set protocols bgp group extern multipath [multiple-as]
• By default equal-cost multipath
– The book also describes unequal-cost multipath on Cisco
Practical BGP: pages 261–269
BGP Multipath example
• Internal (eg from A) vs external (from D) load balancing
• Equal cost vs unequal cost multipath (links between B-D and C-D have
different bandwidth).
[Figure: router A at the top, routers B and C in the middle, router D at the bottom.]
Graceful restart
• When a BGP peering goes down (e.g. the TCP connection is reset or closed, or the hold timer expires), a BGP speaker stops forwarding to that peer
– I.e., if the control plane fails, it is taken as a sign that the data plane has failed
• The other peers switch their routes over to other peers
• Then, when the router comes back up (BGP peering established), they may switch back to the restarted router
• But most hardware routers have separate forwarding and control planes
• So forwarding can in principle continue without the control plane
– At least for a limited period of time, before the forwarding data becomes outdated
• Graceful restart:
– Tell your peers: I am going down but I will be back, please continue forwarding to me until I am back.
Practical BGP: pages 269–276
Route Flap Damping (RFC 2439)
• A route that changes (withdrawal, announcement or attribute change) leads to a new UPDATE message.
• A route that changes too often causes route flaps that can ripple through the Internet
• The cause can be a bad link that goes up and down, or an unstable routing state
– Some scenarios (clusters of routers) can even magnify flapping
• Only a small percentage of the Internet routes cause the majority of the flaps
• Flaps cause many UPDATE messages, which cause recomputation of FIBs, which in turn changes traffic patterns.
Practical BGP: pages 238–241
Route flap dampening
• The idea is to use the history of a route to predict its future behavior
• Add a penalty every time a route changes (e.g. 1000)
• Decay the penalty exponentially using a half-life (e.g. 15 minutes)
• Add two thresholds:
– Suppress/cutoff threshold, above which the route is no longer announced (e.g. 3000)
– Reuse threshold, below which the route is announced again (e.g. 750)
• Route flap dampening is most effective if everyone uses it towards the edges. Why?
• But it is usually installed towards upstream providers, to protect against unstable remote prefixes.
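The penalty mechanism can be simulated in a few lines. This is a minimal sketch using the example values from the slide (penalty 1000, half-life 15 minutes, suppress at 3000, reuse at 750); the class name and its interface are invented for illustration, time is measured in minutes, and real implementations add details such as a maximum suppress time.

```python
class DampedRoute:
    """Toy model of route flap dampening for a single route."""

    def __init__(self, penalty_per_flap: float = 1000.0, half_life_min: float = 15.0,
                 suppress: float = 3000.0, reuse: float = 750.0) -> None:
        self.penalty_per_flap = penalty_per_flap
        self.half_life_min = half_life_min
        self.suppress = suppress
        self.reuse = reuse
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now: float) -> None:
        # Exponential decay: the penalty halves every half_life_min minutes.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life_min)
        self.last_update = now
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False  # below the reuse threshold: announce again

    def flap(self, now: float) -> None:
        """Record one route change at time `now` (minutes)."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty > self.suppress:
            self.suppressed = True  # above the suppress threshold: stop announcing

    def is_suppressed(self, now: float) -> bool:
        self._decay(now)
        return self.suppressed


if __name__ == "__main__":
    route = DampedRoute()
    for minute in (0, 1, 2, 3):  # four flaps, one minute apart
        route.flap(minute)
        print(minute, round(route.penalty), route.suppressed)
    for minute in (33, 48):
        print(minute, route.is_suppressed(minute), round(route.penalty))
```

In this run the fourth flap pushes the penalty over 3000, and the route stays suppressed for a bit more than two half-lives until the penalty has decayed below 750.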
Homework
The home assignment should be done individually and should cover an advanced BGP topic, preferably a new proposal. You should write in your own words, and the intended audience of your report should be a fellow student in this course. That is, it should be written for an audience of BGP experts! The report should also include a personal reflection on the proposal.
Topics can be selected from sections 7 or 9 of the course book, along with the extra RFCs listed in the RFC list. You may select a topic that is not in the book but is covered by the RFCs. Examples of subjects (from section 7 of the book) are: BGP custom decision process, blocking denial of service, outbound route filter, etc. There are also some drafts and other sources of information if you wish to dig deeper.
NOTE: Before you start the homework, contact the course responsible with your choice of subject.
The paper should be around four A4 pages and written in English or Swedish.
Deadline: Oct 21, 2008.
Examples (from ietf.org idr)
# RFC 1772: Application of the Border Gateway Protocol in the Internet
# RFC 1997: BGP Communities Attribute
# RFC 1998: An Application of the BGP Community Attribute in Multi-home Routing
# RFC 2042: Registering New BGP Attribute Types
# RFC 2270: Using a Dedicated AS for Sites Homed to a Single Provider
# RFC 2385: Protection of BGP Sessions via the TCP MD5 Signature Option
# RFC 3562: Key Management Considerations for the TCP MD5 Signature Option
# RFC 2439: BGP Route Flap Damping
# RFC 2545: Use of BGP-4 Multiprotocol Extensions for IPv6
# RFC 2918: Route Refresh Capability for BGP-4
# RFC 3107: Carrying Label Information in BGP-4
# RFC 3221: Commentary on Inter-Domain Routing
Examples (from ietf.org idr)
# RFC 3345: Border Gateway Protocol (BGP) Persistent Route Oscillation
# RFC 3392: Capabilities Advertisement with BGP-4
# RFC 3882: Configuring BGP to Block Denial-of-Service Attacks
# RFC 4272: BGP Security Vulnerabilities Analysis
# RFC 4273: Definitions of Managed Objects for BGP-4.
# RFC 4274: BGP-4 Protocol Analysis.
# RFC 4275: BGP-4 MIB Implementation Survey.
# RFC 4276: BGP-4 Implementation Report.
# RFC 4277: Experience with the BGP-4 Protocol.
# RFC 4360: BGP Extended Communities Attribute.
# RFC 4364: BGP/MPLS IP Virtual Private Networks (VPNs)
# RFC 4451: BGP MULTI_EXIT_DISC (MED) Considerations.
Examples (from ietf.org idr)
# RFC 4456: BGP Route Reflection: An Alternative to Full Mesh Internal BGP (IBGP).
# RFC 4664: Framework for Layer 2 Virtual Private Networks (L2VPNs).
# RFC 4724: Graceful Restart Mechanism for BGP.
# RFC 4760: Multiprotocol Extensions for BGP-4.
# RFC 4893: BGP Support for Four-octet AS Number Space
# RFC 5004: Avoid BGP Best Path Transitions from One External to Another.
# RFC 5065: Autonomous System Confederations for BGP.
# RFC 5291: Outbound Route Filtering Capability for BGP-4
# RFC 5292: Address-Prefix-Based Outbound Route Filter for BGP-4