Multi-topology protection: promises and problems
G. Apostolopoulos
Institute of Computer Science, Foundation for Research and Technology - Hellas (FORTH)
MT protection: promises and problems. Simula Research Lab, Oslo, April 20 2007
Basic concept of MT protection
Based on the IETF-proposed MT extensions to IGPs
Routers have multiple routing tables
Need to pick a routing table for each incoming packet:
Different addresses
Various types of packet marking
Use MT to repair failures
When a link or node fails, affected traffic is locally switched to a pre-computed “backup” topology
Each destination in the FIB has a backup next-hop that is activated when a local link fails
Traffic reaches the destination over the backup topology without loops
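The forwarding behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not the talk's implementation: the names `FibEntry`, `Packet`, `forward`, the `backup_fib` table and the use of topology id 0 for the primary topology are all assumptions made for the example.

```python
from dataclasses import dataclass

PRIMARY = 0  # assumed id of the normal (no-failure) topology


@dataclass
class FibEntry:
    next_hop: str          # next-hop in the primary topology
    out_link: str          # local link used to reach that next-hop
    backup_next_hop: str   # pre-computed next-hop in the backup topology
    backup_topology: int   # topology id marked on repaired packets


@dataclass
class Packet:
    dest: str
    topology: int = PRIMARY  # the packet marking


def forward(fib, backup_fib, failed_links, pkt):
    """Return (next_hop, packet) for one forwarding decision at this router."""
    if pkt.topology != PRIMARY:
        # Already repaired upstream: stay on the marked backup topology.
        return backup_fib[(pkt.topology, pkt.dest)], pkt
    entry = fib[pkt.dest]
    if entry.out_link in failed_links:
        # Local repair: mark the packet and activate the backup next-hop.
        pkt.topology = entry.backup_topology
        return entry.backup_next_hop, pkt
    return entry.next_hop, pkt
```

A router holding `fib = {"d": FibEntry("r2", "l12", "r3", 1)}` sends traffic for `d` to `r2` while link `l12` is up, and to `r3` with marking 1 once `l12` is in `failed_links`. Downstream routers see the marking and keep the packet on the backup topology.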
MT protection
[Figure: primary and backup topologies between source s and destination d. Traffic is marked to send it to the backup topology, and reaches the destination over the backup topology.]
Advantages of MT
Fast local repair of failure
Can repair all possible failures (single link or node)
Multiple failures can be detected and addressed
No need to distinguish between link and node failures
ECMP, SRLG, LAN failures, and multi-homed prefixes can be handled easily
No need for tunneling, but packets must be marked instead
Can optimize how traffic is routed after the failure by manipulating link weights on the backup topologies
Failure may not last long but even so traffic impact is undesirable
But there are issues
Basic operation and optimization have been worked out
Rough overview of some remaining issues:
How to differentiate traffic: premium versus regular and best-effort (BE)
How to use MT in a real network: multiple areas, inter-AS, hot-potato routing with BGP transit traffic
How to return to normal after the failure is repaired
Operational issues: How complex is it to configure? How expensive is it to monitor and troubleshoot? Incremental deployment?
How to optimize link weights: What to optimize for? Need to know the traffic matrix
Traffic differentiation
Premium and regular/BE traffic
I may be willing to preempt non-premium traffic to make sure “premium” traffic is still ok
Standard practice with existing CSPF-FRR architectures
Different topologies for each traffic class
One of the envisioned uses of MT anyway
Scalability?
Traffic optimization goals may be different now
Have to consider the interaction between the traffic types
Minimize effect on premium traffic
Do not starve BE traffic
…
How to return to normal after repair?
Failure is repaired and the IGP is re-converging
How to switch traffic back to the initial (no-failure) topology without micro-loops?
How to avoid micro-loops in general, after each IGP convergence event?
Solution: use a “fixed” topology, the same as the primary topology
Continue routing traffic over the fixed/backup topology
Let IGP converge in the primary topology
After “convergence is complete”, switch all traffic to the primary topology
Converge in a separate topology
[Figure: fixed, backup, and primary topologies between s and d. Traffic is switched after the IGP has converged.]
How to tell when IGP has converged
Use a “convergence” timer in IGP
Start it when a change that will require IGP re-convergence is detected
All traffic is forwarded over fixed/backup topologies
Must move traffic from primary to fixed
During convergence, new routes are installed in the primary topology
After the timer expires (after the IGP has converged):
Switch all fixed and backup traffic to the primary topology
Since no topology is in flux no micro-loops will occur from switching topologies
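The timer-driven switch can be sketched as a small state holder. This is an illustration under stated assumptions: the class name `ConvergenceSwitch`, the `hold_time` parameter (an assumed upper bound on IGP convergence time) and the explicit `now` timestamps are all hypothetical, and restarting the timer on every new change is a design choice of this sketch rather than something the talk specifies.

```python
class ConvergenceSwitch:
    """Tracks which topology a router forwards on around an IGP convergence event."""

    def __init__(self, hold_time):
        self.hold_time = hold_time   # assumed bound on IGP convergence, in seconds
        self.deadline = None         # None means no convergence in progress
        self.forwarding = "primary"

    def on_topology_change(self, now):
        # A change that requires re-convergence was detected: move traffic
        # off the primary topology and (re)start the convergence timer.
        self.forwarding = "fixed"
        self.deadline = now + self.hold_time

    def tick(self, now):
        # Once the timer expires, the primary topology is assumed stable,
        # so traffic can be switched back to it.
        if self.deadline is not None and now >= self.deadline:
            self.deadline = None
            self.forwarding = "primary"
        return self.forwarding
```

With `hold_time=5.0`, a change at time 1.0 followed by another at 4.0 keeps traffic on the fixed topology until time 9.0, so the switch back happens only after the last change has had the full hold time to converge.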
MT with multiple areas
Some of the destinations may be summary routes coming from outside areas
Need to map these summary routes to backup topologies in order to compute the backup next-hop for them
IGP can do this mapping
Link and non-ABR node failures:
Backup topologies for each area cover failures inside the area
No need to coordinate with other areas
Need to unmark the repaired packets when they leave the area
Remote area does not know about local backup topologies
MT with areas
ABR failures:
Failure affects two areas
Need a primary and a backup ABR for each summary route
Simple case: primary and backup ABR connect to the same area
Can handle with local backup topology
Unmark the packet when it leaves the area
Hard case: primary and backup ABR connect to different areas
Need to coordinate backup topologies among these areas, else the packet may loop
Multi-area MT example
Route-1
Unmark the packet
Packet will reach dest without issue
Multi-area MT example
Route-2
Unmark the packet
Packet will not reach dest: needs coordination
Other reasons for looking at all areas together
SRLGs may be different in each area
E.g. area 1 cannot use ABR 2 as backup due to SRLG constraints in area 2
May be necessary if I want to optimize routing
Backup topologies for different areas will have to coordinate their link weights for most effective routing after a failure
But it may be too expensive to optimize such a large topology
Inter-AS traffic
Cover failures of border routers and peering links
Peering links do not belong to the IGP
Need extensions to let the IGP know about these links
Stub links
Stub (potentially multi-homed) ISP and outgoing traffic
Similar to the area problem
IGP can compute the backup topologies
Can compute a few, independent of the number of BGP prefixes
Need to map BGP prefixes to these topologies to compute their backup next-hops
Should not have to import all BGP routes into IGP
Repaired packets need to be unmarked as they leave the AS
Inter-AS operation
[Figure: inter-AS operation, with labels "Prefix" and "Special node".]
BGP-IGP interactions
How to map the BGP routes to the backup topologies and compute the backup next-hops for the BGP routes
Backup topologies are computed by IGP
Prefix reachability is controlled by BGP policy decisions
One approach:
BGP will have to tell the RIB which two border routers can be used for reaching a prefix
BGP must have a concept of a “backup” border router for each prefix
IGP will tell RIB about the backup topologies
RIB will compute the backup next-hop for BGP routes on their way to the FIB
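One way to picture this division of labour is a RIB-side function that joins what BGP and the IGP each supply. The sketch below is hypothetical throughout: the function name, the four input tables and the returned dictionary layout are assumptions made for illustration, and it covers only the case where the primary border router itself fails.

```python
def resolve_bgp_route(prefix, bgp_exits, igp_primary, igp_backup, protects):
    """Combine BGP and IGP information into one FIB entry for a BGP prefix.

    bgp_exits:   prefix -> (primary_border_router, backup_border_router), from BGP
    igp_primary: router -> next-hop in the primary topology, from the IGP
    igp_backup:  (topology_id, router) -> next-hop in that backup topology, from the IGP
    protects:    failed_router -> id of the backup topology that avoids it, from the IGP
    """
    primary_br, backup_br = bgp_exits[prefix]
    # Pick the backup topology that routes around the primary border router.
    topo = protects[primary_br]
    return {
        "prefix": prefix,
        "next_hop": igp_primary[primary_br],
        "backup_topology": topo,
        # Reach the backup border router over that backup topology.
        "backup_next_hop": igp_backup[(topo, backup_br)],
    }
```

Note that the IGP tables are keyed by router, not by prefix, so the number of backup topologies stays independent of the number of BGP prefixes and no BGP routes need to be imported into the IGP.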
MT with hot-potato BGP traffic
Problem: changes in the IGP weights or topology can cause massive shifts in transit BGP traffic
MT can help:
By avoiding micro-loops during IGP convergence
By creating a BGP forwarding topology that is engineered and protected with MT and insensitive to some of the changes in the IGP layer
This topology can be applied to only selected transit BGP prefixes
Optimization of traffic routing after failures becomes quite useful now
Other concerns
What is the administrative overhead of MT?
What is the performance overhead of MT?
Storage?
IGP signaling?
Administrative overhead
Need to manage multiple IGP topologies
OSS tools will need to be extended
If backup topologies are optimized then I need to manage multiple sets of IGP link weights
Quite a bit of effort
But done by automated offline tools anyway
Troubleshooting and monitoring
Over which topology is this prefix routed?
What is the connectivity status of topology T?
All tools (ping, traceroute) may have to be upgraded depending on the topology de-multiplexing method
Does not look too good but compare:
With a full mesh of statically configured and optimized LSPs for TE
With statically configured FRR tunnels
Incremental deployment is tricky!
May not be able to guarantee protection from all failures if not supported by all nodes
So how bad is scalability?
Simulations show that:
Can repair failures with 3-4 backup topologies
Can optimize routing after failure with 6-8 topologies
How much does each topology cost?
SPF computation
One SPF per topology, few topologies so not an issue
IGP signaling
No extra cost, single adjacency for all topologies
IGP RIB space
Separate routing tables for each topology
Can share next-hops
System RIB space
Separate routing table for each topology
Only for IGP routes
BGP routes will not have to be replicated
Can share next-hops
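The cost items above (one SPF per topology, separate tables, shared next-hops) can be made concrete with a small sketch. It is an illustration only: `spf_next_hops` and `build_tables` are hypothetical names, each topology is modelled as a plain weighted adjacency map, and the backup topology in the usage note avoids a link simply by omitting it, which is an assumption of the example rather than the talk's construction.

```python
import heapq


def spf_next_hops(graph, source):
    """Dijkstra from source; returns dest -> first hop on a shortest path.

    graph: node -> {neighbor: link weight}
    """
    dist = {source: 0}
    first_hop = {}
    done = set()
    heap = [(0, source, source)]
    while heap:
        d, node, hop = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node != source:
            first_hop[node] = hop
        for nbr, w in graph.get(node, {}).items():
            nd = d + w
            if nbr not in done and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                # Leaving the source, the first hop is the neighbor itself.
                heapq.heappush(heap, (nd, nbr, nbr if node == source else hop))
    return first_hop


def build_tables(topologies, source):
    """Run one SPF per topology; tables store indexes into a shared next-hop list."""
    shared = []   # one entry per distinct next-hop, shared by all tables
    index = {}    # next-hop -> position in shared
    tables = {}
    for topo_id, graph in topologies.items():
        table = {}
        for dest, hop in spf_next_hops(graph, source).items():
            if hop not in index:
                index[hop] = len(shared)
                shared.append(hop)
            table[dest] = index[hop]
        tables[topo_id] = table
    return tables, shared
```

For a four-node network where the primary topology reaches everything via neighbor `a` and the backup topology omits link s-a, `build_tables` returns two per-topology tables but only two shared next-hop entries, which is the storage saving the slide points at.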
Example FIB structure
[Figure: a topology hash and a prefix lookup select entries in a primary table and a backup table, which point into shared next-hop structures. The example shows 3 topologies and 4 next-hops for ECMP, with sample address 23.45.6.2.]
MT with MPLS
Use MPLS labels as de-multiplexers
Build an MPLS forwarding plane for each topology using LDP
For VPNs/BGP-free cores
Simple LDP extensions
Essentially MT-LDP
No need to encapsulate traffic on a failure
Simpler than RSVP-TE/FRR, with less signaling overhead
Configuration overhead is not clear though
Depends a lot on the OSS tools used
MT with multicast
Has become interesting with the advent of IPTV etc.
IETF discusses methods to extend LDP for P2MP LSPs and MPLS-FRR for P2MP LSPs
MT protection can be easily extended to be used there using P2MP MPLS labels and P2MP extensions to LDP
And we can still optimize traffic after failure
Dynamic TE
Traffic matrices can change significantly
DDoS attacks, diurnal patterns, failures
Adjust routing
Does not have to happen extremely fast
Ideally this should happen automatically
CSPF-MPLS has ways to cope with this
Traffic flows inside LSPs
May need a full mesh though
If a link gets overloaded, can try to shift some LSPs away from it
Doing this automatically can lead to oscillations
MT dynamic TE
When a link is overloaded, traffic needs to be shifted away from it
Option A: Create topology T1, re-optimize weights in T1 and shift all traffic to T1
Needs a coordinated switch, large impact on the network
Better when change in traffic patterns is permanent
Option B: Shift only some (S,D) pairs to T1
No need for coordinated switch and smaller impact to network
Better for temporary changes in the traffic patterns
Optimization problem:
What traffic do I send into T1?
What link weights do I use for T1?
Can I do something that is adaptive/feedback based?
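As a way to make the Option B question concrete, here is one naive greedy heuristic for choosing which (S,D) pairs to move. It is not from the talk, which leaves the choice open as an optimization problem. The function name, the `target_utilization` parameter and the largest-first ordering are assumptions, and the sketch simply assumes that T1 routes the shifted pairs away from the overloaded link without checking it.

```python
def pick_pairs_to_shift(demands_on_link, capacity, target_utilization=0.8):
    """Choose (S, D) pairs to move to T1 until the link drops to the target load.

    demands_on_link: (source, dest) -> traffic volume currently crossing the link
    capacity:        link capacity, in the same unit as the volumes
    Returns the set of pairs to re-mark into T1, taking the largest demands first.
    """
    load = sum(demands_on_link.values())
    limit = capacity * target_utilization
    shifted = set()
    for pair, volume in sorted(demands_on_link.items(), key=lambda kv: -kv[1]):
        if load <= limit:
            break
        shifted.add(pair)
        load -= volume
    return shifted
```

Only the ingress routers of the selected pairs need to change their marking, which is why this option needs no coordinated network-wide switch. An adaptive variant could re-run the selection on congestion feedback, at the risk of the oscillations mentioned for automatic LSP shifting.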
Immediate work items
MT with multiple traffic classes
New optimization constraints (interaction among traffic types)
P2MP protection
Optimization issues
What are the right optimization goals?
Combine P2MP and P2P backup to reach the optimal solution
Dynamic TE with MT
Optimization for both the (S,D) pairs and the link weights
Adaptive based on congestion feedback?
Other interesting items
MT in Ethernet networks
There have been some related proposals already
Use MT routing to handle rapid changes in links like those in a wireless network
Link fading could be considered a partial link failure
Deal with inaccurate traffic matrices
Solutions that adapt to changing traffic matrices
Algorithms that find good routings even when given inaccurate traffic matrices
There has been significant work on the traffic matrix estimation and inaccuracy problem