
Quantitative Comparison of End-to-End Availability of Service Paths in Ring and Mesh-Restorable Networks

Matthieu Clouqueur and Wayne D. Grover

National Fiber Optic Engineers Conference - NFOEC 2003

Orlando, Florida, USA

Outline

• Motivation

• Goals of Study

• Survivability Scheme Models

• Description of Test Cases

• Sample Results

• Discussion of Results

• Conclusion


Motivation

• Rings are often associated with high availability because they provide high restoration speed.

• Mesh is often thought to provide lower availability because of its lower capacity redundancy and slower restoration.

We ask:

• Which really provides higher availability: ring or mesh?

• How do ring and mesh compare in service path availability and in the range of availability levels they can offer?

• In mesh, what would be the effects of assigning priorities to selected services?


Goals and Methodology

• Provide a true “apples-to-apples” comparison of end-to-end service path availability in ring and mesh:

– Ring and mesh are compared on identical facilities graphs serving identical demands

– Efficient capacity designs are used for both architectures

– Exact survivability mechanisms are emulated

– Both architectures experience identical failure sequences

• Comparison is based on:

– Average path unavailability (versus path length)

– Average number of outages experienced per year per service path (versus path length)

– Statistical frequency of total path outage times per year
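The first two comparison metrics can be sketched as follows. This is a minimal illustration, not the authors' tooling: the record layout `(hops, total_outage_hours, outage_count)` and the function name `metrics_by_hops` are assumptions made for the example.

```python
from collections import defaultdict

HOURS_PER_YEAR = 8760.0


def metrics_by_hops(records, years):
    """Average unavailability and outages per year, grouped by path length.

    records: iterable of (hops, total_outage_hours, outage_count), one entry
             per service path, accumulated over `years` simulated years
             (hypothetical layout chosen for this sketch).
    """
    groups = defaultdict(list)
    for hops, hours, count in records:
        groups[hops].append((hours, count))

    result = {}
    for hops, rows in groups.items():
        n_paths = len(rows)
        total_hours = sum(h for h, _ in rows)
        total_outages = sum(c for _, c in rows)
        result[hops] = {
            # fraction of time an average path of this length is down
            "unavailability": total_hours / (n_paths * years * HOURS_PER_YEAR),
            # mean number of outage events per path per year
            "outages_per_year": total_outages / (n_paths * years),
        }
    return result
```

Grouping by hop count mirrors the "versus path length" presentation of the results; the third metric (frequency of total outage time per year) would need per-year totals rather than the aggregated totals used here.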


Simulation Numerical Details

• Statistics of Failures:

– Mean time between failures (MTBF): 1 year on each span

– Negative exponential distribution (Poisson process)

• Statistics of Repair:

– Mean time to repair (MTTR): 12 hours

– Negative exponential distribution
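A failure and repair sequence with these statistics could be generated for one span roughly as below. This is a sketch under stated assumptions: the up-time between a repair and the next failure is taken to be exponential with mean MTBF, and the names `simulate_span` and `steady_state_unavailability` are invented for the example.

```python
import random

HOURS_PER_YEAR = 8760.0
MTBF_H = 1.0 * HOURS_PER_YEAR  # 1 year on each span (from the slide)
MTTR_H = 12.0                  # 12 hours (from the slide)


def simulate_span(years, rng):
    """Return a list of (fail_time, repair_time) intervals, in hours."""
    horizon = years * HOURS_PER_YEAR
    t = 0.0
    outages = []
    while True:
        # exponential up-time (assumption: mean up-time equals MTBF)
        t += rng.expovariate(1.0 / MTBF_H)
        if t >= horizon:
            break
        repair = rng.expovariate(1.0 / MTTR_H)  # exponential repair time
        outages.append((t, min(t + repair, horizon)))
        t += repair
    return outages


def steady_state_unavailability():
    """Long-run fraction of time a single span is down under this model."""
    return MTTR_H / (MTBF_H + MTTR_H)
```

Under this model a single span is down about 0.14 % of the time (12 / 8772), which is why simultaneous failures of two spans are rare enough to need many simulated years before the statistics settle.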


Main Assumptions of the Availability Analysis

• Previous work shows that what dominates service unavailability is:

– Reconfiguration times (to single failures)

– Single node-failures

– Dual span-failures

– Triple failures

• The present analysis is based on the effects of dual span-failures

– The contribution of reconfiguration times is not taken into account.

• The analysis is based on restorability investigations performed on a per path basis for each particular failure scenario occurring in the simulation


Modelling Ring Survivability

• The Bi-directional Line Switched Ring (BLSR) model is assumed:

[Figure: BLSR with nodes 1 to 5, showing working channels, protection channels and loop-back switching; a surviving path and a failed path are marked.]

Note: A dual fibre cut affecting a ring does not cause all service paths on the ring to experience outage.

Detailed inspection is performed on a per path basis to identify paths affected by the dual failure.
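The per-path inspection on a ring can be illustrated with a deliberately simplified rule: a path is treated as surviving if at least one of the two arcs between its end nodes contains no failed span. This is an assumption made for the sketch, not necessarily the exact BLSR emulation used in the study, and the function names are hypothetical.

```python
def arc_spans(n, i, j):
    """Span indices crossed going clockwise from ring position i to j.

    Span k is taken to connect ring positions k and (k + 1) % n.
    """
    spans = []
    k = i
    while k != j:
        spans.append(k)
        k = (k + 1) % n
    return spans


def path_survives(ring, a, b, failed_spans):
    """True if nodes a and b still have an intact arc between them.

    ring: list of node labels in ring order
    failed_spans: set of failed span indices
    Simplifying assumption: end nodes left in the same ring segment can
    still communicate; end nodes in different segments cannot.
    """
    n = len(ring)
    i, j = ring.index(a), ring.index(b)
    clockwise = set(arc_spans(n, i, j))
    counter_clockwise = set(range(n)) - clockwise
    return not (clockwise & failed_spans) or not (counter_clockwise & failed_spans)
```

For a five-node ring `[1, 2, 3, 4, 5]` with spans 1-2 and 3-4 cut (`{0, 2}`), the path 2 to 3 survives while the path 1 to 3 does not, which matches the note that a dual cut does not take down every path on the ring.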


Modelling Mesh Survivability

• Adaptive mesh restoration behaviour is assumed:

– Restoration paths for a failed span are dynamically searched within remaining spare capacity (not according to pre-plan)

Note: Restoration of a failed span includes an effort to restore any spare capacity used on that span

A restoration path affected by a second failure may survive

[Figure: example mesh network in which a failed backup path is itself restored.]

Restoration to a first failure can be based on a pre-plan.

Second failure response is adaptive
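Adaptive span restoration within remaining spare capacity can be sketched as a repeated shortest-path search. This is a greedy illustration under assumed data structures (spare capacity keyed by `frozenset` span, one capacity unit per path); it is not the restoration algorithm used in the study and does not guarantee a maximum-flow result.

```python
from collections import deque


def shortest_path(spare, src, dst, unusable):
    """Breadth-first search over spans that still have spare capacity."""
    adj = {}
    for span, cap in spare.items():
        if cap > 0 and span not in unusable:
            a, b = tuple(span)
            adj.setdefault(a, []).append(b)
            adj.setdefault(b, []).append(a)
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


def restore_span(spare, failed_span, working, down=()):
    """Return how many of `working` units on failed_span can be rerouted.

    spare: dict mapping frozenset({u, v}) to spare units on that span
    down:  spans already failed (for example, by an earlier failure)
    """
    u, v = failed_span
    unusable = set(down) | {frozenset(failed_span)}
    remaining = dict(spare)  # work on a copy; the caller's dict is untouched
    restored = 0
    while restored < working:
        path = shortest_path(remaining, u, v, unusable)
        if path is None:
            break
        for a, b in zip(path, path[1:]):
            remaining[frozenset((a, b))] -= 1  # consume one spare unit
        restored += 1
    return restored
```

Passing the first failed span in `down` when handling a second failure gives the adaptive behaviour described above: the search simply avoids whatever is currently unavailable instead of following a fixed pre-plan.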


Mesh Restoration with Priorities

• Capacity designs used are identical to normal single failure restorable designs (no additional capacity)

• A certain percentage of demands is given “Priority” status on each origin/destination node pair

– The Priority service paths are considered first for restoration and will therefore have a higher dual-failure restorability and therefore a higher availability

• Three service mixes: 10/90, 30/70, 50/50 (% high P. / % low P.)

Questions:

– How much is the availability of Priority services improved?

– How much is the availability of non-Priority services degraded?
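The "Priority paths are considered first" rule can be illustrated with a small allocation sketch. The dictionary layout of the affected paths and the function name are assumptions for the example; the study's actual restoration emulation is not shown in the slides.

```python
def allocate_restoration(affected, available):
    """Pick which affected paths are restored when capacity is limited.

    affected:  list of dicts with keys "id" and "priority" (bool)
    available: number of restoration paths that could be found
    Priority paths are served first; order within a class is preserved.
    """
    # sorted() is stable, and False sorts before True, so priority paths
    # (key False) come first without reordering paths within a class
    ordered = sorted(affected, key=lambda p: not p["priority"])
    return [p["id"] for p in ordered[:available]]
```

With three affected paths, one of them Priority, and only two restoration paths available, the Priority path is always restored and the shortfall falls on a non-Priority path. No capacity is added, which is why the questions above concern how the same restorability is redistributed between the two classes.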


Three test cases

[Figure: the test network topologies, one with 32 nodes and 45 edges and one with 25 nodes (N01 to N25) and 50 edges.]

Test case    Topology              Type of Demand
net32-A      32 nodes, 45 edges    Hubbed Demand Matrix
net32-B      32 nodes, 45 edges    Gravity-based Demand Matrix
25n50s1      25 nodes, 50 edges    Gravity-based Demand Matrix


Study Details

• Mesh:

– Working paths are routed on shortest path

– Minimal spare capacity placed by Integer Linear Programming Optimization*

– Average working path length: net32-A: 5.5; net32-B: 2.2; 25n50s1: 2.8

• Ring:

– Ring designs using efficient methods developed in PhD work by D. Morley**

– # of rings in design: net32-A & net32-B: 8 OC-48; 25n50s1: 19 OC-48

– Average number of spans/ring: net32-A & net32-B: 11; 25n50s1: 9.4

– Working paths routed by shortest ring-constrained routing

* J. Doucette, W. D. Grover, “Influence of modularity and economy-of-scale effects on design of mesh-restorable DWDM networks,” IEEE JSAC, vol. 18, no. 10, October 2000, pp. 1912-1923.

** G. D. Morley, Analysis and Design of Ring-Based Networks, Ph.D. Thesis, University of Alberta, Spring 2001.

and G. D. Morley, W. D. Grover, “Tabu search optimization of optical ring transport networks,” in Proceedings of IEEE GLOBECOM 2001, November 2001, vol. 4, pp. 2160-2164.


Statistical Considerations

• Results for each test case are based on a series of 1000 one-year simulations

• Total of dual (or higher order) failures arising over 1000 trials:

– net32-A & net32-B: 2619

– 25n50s1: 3180

• Average number of outage events per path on which the availability results are based:

– Ring: 64.7

– Mesh: 45.9

• This was shown to give good confidence levels on results:

– E.g. for the 25n50s1 test case, over 10 separate series of 1000 one-year trials, the standard deviation of the average unavailability Uave was calculated to be 2.7 % of Uave


Comparative Results of Path Unavailability

[Figure: path unavailability (0 to 2.0E-04) versus number of hops in path (0 to 10) for net32-A, net32-B and 25n50s1, ring and mesh; the average working path length of each test case is marked.]

Path unavailability is significantly higher in the ring architecture, especially for longer paths (up to a factor of 2 in the worst case).

~ 26 % chance of outage in a given year (worst case)

~ 17 % chance of outage in a given year (worst case)


Comparing Distributions of Outage Times

Results for test case 25n50s1

[Figure: two histograms of frequency (0.0% to 0.6%) versus total outage per year (0 to 70 hours), one for 25n50s1 Mesh and one for 25n50s1 Ring. Annotations: median ~ 6 hours on both histograms; 90th percentile 15 hours and 13.5 hours.]

Ring: 93.7 % of paths experience no outage in a year; worst case: 72 hours (Prob = 2.5 x 10^-4 %)

Mesh: 95.4 % of paths experience no outage in a year; worst case: 48 hours (Prob = 6 x 10^-5 %)


Effects of Priorities in Mesh

[Figure: path unavailability (0 to 1.0E-04) versus number of hops in path (0 to 7) for Ring, Mesh, Mesh Low P. (50/50, 70/30, 90/10) and Mesh High P. (50/50, 70/30, 90/10).]

The effect of prioritizing in mesh is a considerable reduction of path unavailability for the Priority class.

The non-Priority service class suffers only a small degradation of service availability.

The availability of non-Priority services remains comparable to or better than in ring.

Results for test case 25n50s1

A Priority path in mesh can have an unavailability more than five times lower than in ring.


Additional Insights from the Study

• The effect of priorities is a significant reduction of the probability of paths experiencing outage in a given year:

– E.g. for 6-hop paths in the 25n50s1 test case, the probability of experiencing outage in a given year drops from 12% to 2% if the path is included in the 10% priority class (10/90 scheme).

• Experimental results show that the advantage of mesh over ring is greater in the highly connected topology

– This is especially true for results of prioritized mesh and confirms the understanding that mesh benefits greatly from high network diversity

• Under the 10/90 service mix, Priority services achieve the lowest unavailability while the availability of non-Priority services remains virtually unchanged


Conclusion

• Despite its lower capacity requirements, the mesh architecture achieves better availability than ring

– The key is mesh’s better ability to withstand dual-failure states even with less redundancy than rings.

• With prioritization of services, high priority services can be offered very low unavailability (3 to 5 times less than with rings) while non-Priority services still enjoy comparable or better availability than with rings.

Concluding comment:

What matters most for high availability is not fast restoration to single failures but the ability to provide high restorability to dual failures. This minimizes the probability of MTTR-scale outages on service paths.
