capacity requirements for network recovery from node failure with dynamic path restoration

Gangxiang Shen and Wayne D. Grover, TRLabs and University of Alberta, Edmonton, AB, Canada (presented by Jennifer Yates, AT&T Research). OFC 2003, Tuesday March 25 2003, Atlanta, Georgia.


Page 1: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Gangxiang Shen and Wayne D. Grover
TRLabs and University of Alberta
Edmonton, AB, Canada
(presented by Jennifer Yates, AT&T Research)

OFC 2003, Tuesday March 25 2003, Atlanta, Georgia

Page 2: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Gangxiang Shen & Wayne D. Grover OFC ‘03 Atlanta 2

Outline

• Path Restoration & Node Recovery

• Research Questions

• Design Models

• Results

• Summary of Findings

Page 3: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Background on Path Restoration

• Long recognized that a path restoration mechanism will “handle node failure as well as span failures.”

• But this is a statement about the mechanism, not an assurance of adequate spare capacity to permit the mechanism to realize full node recovery.

Question: How much (extra) spare capacity is needed for node recovery via a path restoration mechanism?

Page 4: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Some Background and Points about Node Recovery

• We consider dynamic adaptive path restoration (not preplanned backup path protection).

• What is the actual aim in node recovery?
– It cannot be the same as in restoration of a span failure, because...
– Demands that source / sink at a failed node cannot be restored by network re-routing.
– Seek to restore 100% of transiting flows through a failed node.

• Aside (observations about node survivability in general):
– In a sense, it is already “too late” when we rely on network re-routing in response to a node failure.
– Good backup power, fire protection, security, and software are the primary strategies for node survivability.
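The distinction between terminating and transiting demands can be sketched in a few lines of Python. This is only an illustration: the function name `classify_demands`, the node labels, and the three example paths are hypothetical and do not come from the slides.

```python
from typing import Dict, List

Path = List[str]  # ordered node names along a working path (illustrative type)


def classify_demands(paths: Dict[str, Path], failed_node: str) -> Dict[str, List[str]]:
    """Sort demands into terminating, transiting, and unaffected for one node failure."""
    result: Dict[str, List[str]] = {"terminating": [], "transiting": [], "unaffected": []}
    for name, path in paths.items():
        if failed_node in (path[0], path[-1]):
            # Sourced or sunk at the failed node: cannot be restored by re-routing.
            result["terminating"].append(name)
        elif failed_node in path:
            # Passes through the failed node: the target of node recovery.
            result["transiting"].append(name)
        else:
            result["unaffected"].append(name)
    return result


# Hypothetical working paths; node "C" fails.
paths = {
    "red": ["A", "B", "C"],    # ends at C
    "green": ["D", "C", "E"],  # passes through C
    "blue": ["A", "D"],        # does not touch C
}
classes = classify_demands(paths, failed_node="C")
print(classes)
```

Only the demands in the "transiting" list enter the restoration problem for that node failure.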

Page 5: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Initial Appreciations about Node Recovery

Capacity design to support node recovery has two opposing complexions:

1. It is equivalent to 2 to perhaps 6 simultaneous span failures, depending on node degree. This suggests a lot of extra spare capacity may be needed.

On the other hand,

2. Demands terminating at the failed node “disappear from the problem.” This suggests node failure problems may not be quite as difficult as they seem, especially if “stub release” applies to the unrestorable demands.

Page 6: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Concept of Stub Release

• Stub release (SR) refers to reuse of capacity on surviving portions of failed paths in the overall restoration effort.

• It is an option under dynamic path restoration.

• SR makes the overall response failure-specific and more efficient than using only fully disjoint predefined backup paths.

• However, it requires fault isolation to the respective span (or opaque segment).
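As a rough illustration of the bookkeeping behind stub release, the sketch below counts the capacity units freed on the surviving spans of every working path that crosses a failed span. The helper names (`spans_of`, `stub_released_capacity`), the topology, and the demand sizes are all assumptions made for this example.

```python
from collections import Counter
from typing import Dict, FrozenSet, List

Span = FrozenSet[str]  # an undirected span, identified by its two end nodes


def spans_of(path: List[str]) -> List[Span]:
    """Spans traversed by a path given as an ordered node list."""
    return [frozenset(pair) for pair in zip(path, path[1:])]


def stub_released_capacity(paths: Dict[str, List[str]],
                           demand_units: Dict[str, int],
                           failed_span: Span) -> Counter:
    """Capacity units released on surviving spans of paths that cross failed_span."""
    released: Counter = Counter()
    for name, path in paths.items():
        spans = spans_of(path)
        if failed_span not in spans:
            continue  # path is unaffected, so nothing is released
        for span in spans:
            if span != failed_span:
                released[span] += demand_units[name]
    return released


# Hypothetical working paths and demand sizes; span B-C fails.
paths = {"red": ["A", "B", "C", "D"], "green": ["E", "B", "C"], "blue": ["A", "E"]}
demand_units = {"red": 2, "green": 3, "blue": 1}
released = stub_released_capacity(paths, demand_units, frozenset({"B", "C"}))
print(dict(released))
```

The released units behave like extra spare capacity for this specific failure only, which is why the failed span must be identified before the stubs can be reused.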

Page 7: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Illustrating concept of Stub release

[Figure: pre-failure demands and a span failure]

Page 8: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Illustrating concept of Stub release

[Figure: possible restoration / protection with strictly disjoint backup paths (no stub release)]

Page 9: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Concept of Stub Release

[Figure: possible restoration with stub release. Failure-specific re-use of surviving path segments (for same or other demands)]

Page 10: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Network Recovery from Node Failure

• With a node failure, not only are terminating demands “not part of the restoration problem,”

• But in addition, with stub release such failed paths may contribute useful extra “spare” capacity network-wide.

Page 11: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Illustrating Issues in Recovery from Node Failure

[Figure: pre-failure demands and node failure]

Page 12: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Illustrating Issues in Recovery from Node Failure

(a) Recovery from node failure without stub release:
(1) Red demand is not included in the restoration effort
(2) Green demand has to take a fully disjoint path

Page 13: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Illustrating Issues in Recovery from Node Failure

(b) Recovery from node failure with stub release:
(1) Red demand is (again) not included in the restoration effort
(2) Surviving segments of red demand are released as equivalent-to-spare capacity
(3) Green demand can take a shorter replacement path
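The red/green contrast in cases (a) and (b) can be reproduced on a toy network. The sketch below is a hypothetical example, not the network drawn on the slides: the topology, the capacities, and the helper `shortest_route` (a plain breadth-first search over spans that still have a free unit) are assumptions chosen so that stub release shortens the green demand's replacement path.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple


def shortest_route(capacity: Dict[Tuple[str, str], int], src: str, dst: str,
                   failed_node: str) -> Optional[List[str]]:
    """Fewest-hop route over spans with at least one free unit, avoiding failed_node."""
    adj: Dict[str, List[str]] = {}
    for (a, b), units in capacity.items():
        if units > 0 and failed_node not in (a, b):
            adj.setdefault(a, []).append(b)
            adj.setdefault(b, []).append(a)
    prev: Dict[str, Optional[str]] = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            route = []
            while node is not None:
                route.append(node)
                node = prev[node]
            return route[::-1]
        for nxt in adj.get(node, []):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical spare capacity per span. Node X fails.
# Green demand: R -> X -> Q (transits X). Red demand: R -> P -> X (terminates at X).
spare = {("R", "A"): 1, ("A", "B"): 1, ("B", "Q"): 1, ("R", "P"): 0,
         ("P", "Q"): 1, ("R", "X"): 0, ("P", "X"): 0, ("X", "Q"): 0}
released = {("R", "P"): 1}  # red's surviving segment, freed by stub release
with_sr = {span: units + released.get(span, 0) for span, units in spare.items()}

route_no_sr = shortest_route(spare, "R", "Q", failed_node="X")
route_sr = shortest_route(with_sr, "R", "Q", failed_node="X")
print(route_no_sr, route_sr)
```

Without stub release the green demand must use the three-hop route through A and B; with the red demand's stub on R-P released, a two-hop route through P becomes available.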

Page 14: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Specific Research Questions

What are the maximum levels of node recovery that can be achieved with no more spare capacity than required for 100% span restoration? Call this the “Intrinsic Node Recovery” level.

How much additional spare capacity is required to guarantee both 100% node and span restoration, compared to span restorability only?

How does capacity depend on the mix of services in a multiple Quality of Protection (multi-QoP) context? Consider a mix of span-failure survivable (Rs) and “node plus span” failure protected (Rs+n) service assurances.
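To make the "node recovery level" concrete, here is a minimal sketch of one way such a figure could be computed. The slides do not spell out the exact averaging used, so this is an assumption: restorability is taken as the total transiting demand restored over all single-node failures, divided by the total transiting demand affected. The function name and the numbers are illustrative only.

```python
from typing import Dict


def node_restorability(transiting: Dict[str, int], restored: Dict[str, int]) -> float:
    """Network-wide node restorability under the assumed ratio-of-sums definition.

    transiting[n]: units of transiting demand affected when node n fails.
    restored[n]:   units of that demand that can be re-routed.
    """
    total = sum(transiting.values())
    if total == 0:
        return 1.0  # nothing transits any node, so nothing is lost
    return sum(restored[n] for n in transiting) / total


# Hypothetical per-node figures.
transiting = {"A": 10, "B": 20, "C": 0, "D": 10}
restored = {"A": 10, "B": 15, "C": 0, "D": 9}
r_n = node_restorability(transiting, restored)
print(round(r_n, 4))
```

An alternative would be to average the per-node ratios, which weights every node equally rather than every demand unit.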

Page 15: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Optimization Models to Study these Questions

1. Design for 100% span and node failure restoration
-- minimizes total spare capacity cost while guaranteeing 100% span failure restoration and 100% transiting flow node failure restoration

2. Design to support multi-QoP
-- extends the first model to accept a mix of: (1) best-efforts only (R0) class, (2) “Rs” class, and (3) “Rs+n” class services

3. Maximal node recovery under spare capacity budget
-- accepts a budget total limit on spare capacity
-- asserts 100% span restorability (required for feasibility)
-- maximizes the node failure restorability given the total spare capacity limit

Stub release option: each model has versions with and without stub release.
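For orientation, the first model can be written as a generic arc-path spare capacity problem. The formulation below is a sketch under assumed notation, not the authors' exact model: $S$ is the set of spans, $N$ the set of nodes, $F = S \cup N$ the failure scenarios, $D_i$ the demands to be restored under failure $i$ (transiting demands only when $i$ is a node), $X_i^r$ the units of demand $r$ lost under failure $i$, $P_i^r$ the eligible restoration routes that survive failure $i$, $\delta_j^{r,p}$ an indicator that route $p$ uses span $j$, $\rho_{i,j}$ the working capacity on span $j$ held by paths that fail under $i$, and $\sigma \in \{0,1\}$ a switch for stub release.

```latex
\begin{align}
\min \quad & \sum_{j \in S} c_j \, s_j \\
\text{s.t.} \quad
& \sum_{p \in P_i^r} f_i^{r,p} = X_i^r
  && \forall i \in F,\ \forall r \in D_i \\
& \sum_{r \in D_i} \sum_{p \in P_i^r} \delta_j^{r,p} \, f_i^{r,p}
  \le s_j + \sigma \, \rho_{i,j}
  && \forall i \in F,\ \forall j \in S \text{ surviving } i \\
& f_i^{r,p} \ge 0, \quad s_j \ge 0
\end{align}
```

Setting $\sigma = 0$ gives the version without stub release. Because demands terminating at a failed node are excluded from $D_i$ but still contribute to $\rho_{i,j}$ when $\sigma = 1$, the two effects discussed earlier both appear in the capacity constraint.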

Page 16: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Test Case Results

• Five test networks
• Uniform random 1..20 demands on each O-D pair -> lots of transiting flows
• Costs proportional to distances

Networks                            ARPA2    NSFNET   SmallNet   Cost239   Level3

Intrinsic node recovery
  No stub release                   91.35%   99.84%   85.30%     78.89%    95.59%
  Stub release                      88.43%   99.60%   89.40%     82.85%    99.998%

Redundancy increase (Rs+n vs. Rs)
  No stub release                   10.0%    0.02%    2.6%       3.4%      4.1%
  Stub release                      9.7%     0.03%    2.8%       1.4%      0.1%

Total cost inc. (Rs+n vs. Rs)
  No stub release                   5.2%     0.01%    1.7%       2.4%      2.0%
  Stub release                      5.3%     0.02%    1.9%       1.0%      0.001%

Intrinsic node recovery: Rn of networks designed only for Rs = 1 (very high on average).

Redundancy increase: added % spare capacity to strictly assure both Rs = 1 and Rn = 1 (very little on average).

Page 17: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Results (2)

[Figure: Node restorability versus total budget allowance for spare capacity (relative to Rs=1 design). x-axis: relative increase in total spare capacity (1 to 1.12). y-axis: average node failure restorability (0.85 to 0.99). Curves: ARPA2 (no stub release), ARPA2 (stub release), SmallNet (no stub release), SmallNet (stub release).]

In the prior table and here we see that the SR cases approach Rn = 1 more slowly than the non-SR cases. The reason is that the non-SR designs for span restorability only had more spare capacity to begin with.

Page 18: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Results (3)

[Figure: Spare capacity increase required to support different percentages of (Rs+n) services. x-axis: percent of R(s,n) restorable demand node pairs (0 to 100). y-axis: relative increase in spare capacity (1 to 1.12). Curves: ARPA2, NSFNET, and SmallNet, each with and without stub release.]

-> Depending on network, 30 to 60% of services could be given “Rs+n” service assurance with no extra capacity.

Page 19: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Summary of Findings

• Very high levels of node recovery are intrinsically feasible with path restoration in networks designed nominally for only span restorability.

• High levels of premium (“node and span failure assured resilience”) service guarantees can be supported without any penalty in terms of added spare capacity.

• Stub release is an important advantage of dynamic adaptive path-restorable networks for achieving the highest overall availability when node failures, multiple span failures, and combined node/span failures are considered.

Page 20: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

End

(Thanks Jennifer ! )

Page 21: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

An Example: NSFNET

Spare capacity optimized for single span failures

Page 22: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Preliminary Look at Counteracting Effects (NSFNet)

This can give an a priori indication of which node failures may have the most / least severe effects.

[Figure: demand matrix, annotated with the demand “relief” and the spare capacity loss from each node outage]
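The two counteracting quantities can be tallied per node with very little code. The sketch below rests on an assumed reading of the slide labels: demand "relief" is taken as the total demand that terminates at the failed node, and spare capacity loss as the spare capacity on spans incident to it. The function name and all numbers are hypothetical.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]


def node_failure_indicators(demands: Dict[Pair, int], spare: Dict[Pair, int],
                            node: str) -> Tuple[int, int]:
    """Return (demand relief, spare capacity loss) for the outage of `node`.

    Assumed definitions: relief sums demands with an end at `node`;
    loss sums spare capacity on spans incident to `node`.
    """
    relief = sum(units for (a, b), units in demands.items() if node in (a, b))
    loss = sum(units for (a, b), units in spare.items() if node in (a, b))
    return relief, loss


# Hypothetical demand matrix (per O-D pair) and spare capacity (per span).
demands = {("A", "B"): 5, ("A", "C"): 3, ("B", "C"): 7, ("B", "D"): 2}
spare = {("A", "B"): 4, ("B", "C"): 6, ("C", "D"): 5, ("A", "D"): 3}
indicators = {n: node_failure_indicators(demands, spare, n) for n in "ABCD"}
for n, (relief, loss) in indicators.items():
    print(n, "relief:", relief, "spare loss:", loss)
```

A node with large relief and small spare loss is a likely mild case; the reverse suggests a failure that is harder to recover from.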

Page 23: Capacity Requirements for Network Recovery from Node Failure with Dynamic Path Restoration

Understanding why Node Recovery takes so little extra capacity

[Figure, two panels: path restoration of the span failure; node failure affecting the same two demands]

• Green path still needs restoration

• Red path is (necessarily) abandoned