

Computer Networks 56 (2012) 3150–3162

Contents lists available at SciVerse ScienceDirect

Computer Networks

journal homepage: www.elsevier.com/locate/comnet

Trading availability among shared-protected dynamic connections in WDM networks

Diego Lucerna a, Massimo Tornatore b,⇑, Biswanath Mukherjee c, Achille Pattavina b

a Huawei Technologies Italia S.r.l., Via Lorenteggio 257, 20152 Milano (MI), Italy
b Department of Electronics and Information, Politecnico di Milano, Via Ponzio 34-35, 20121 Milan, Italy
c Department of Computer Science, University of California, Davis, CA 95616, USA

Article info

Article history: Received 14 October 2011; Received in revised form 17 April 2012; Accepted 18 April 2012; Available online 3 May 2012

Keywords: Optical network; WDM; Elastic availability; Holding time; Shared protection; SLA Violation Risk

1389-1286/$ - see front matter © 2012 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.comnet.2012.04.021

⇑ Corresponding author. E-mail addresses: [email protected] (D. Lucerna), [email protected] (M. Tornatore), [email protected] (B. Mukherjee), [email protected] (A. Pattavina).

Abstract

Novel automated management systems for optical WDM networks promise to allow customers asking for a connection (i.e., a bandwidth service) to specify on demand the terms of the Service Level Agreement (SLA) to be guaranteed by the Network Operator (NO). In this work, we exploit the knowledge, among the other Service Level Specifications (SLS), of the holding time and of the availability target of the connections to operate shared-path protection more effectively.

In the proposed approach, we monitor the actual downtime experienced by each connection and, when the network state changes (typically because of a fault occurrence, or a connection departure or arrival), we estimate an updated availability target for each connection based on our knowledge of all the predictable network-state changes, i.e., the future connection departures. Since some connections will be ahead of the availability target stipulated in their SLA (credit), while others will be behind it (debit), we propose a mechanism that allows us to "trade" availability "credits" and "debits" by increasing or decreasing the shareability level of the backup capacity. Our approach allows the availability provided to living connections to be managed flexibly during their holding times.

The quality of the provided service is evaluated in terms of availability as well as the probability of violating the availability target stipulated in the SLA (also called SLA Violation Risk), a recently proposed metric that has been shown to guarantee higher customer satisfaction than the classical statistical availability. For a typical wavelength-convertible US nationwide network, our approach obtains significant savings in Blocking Probability (BP), while reducing the penalties due to SLA violations. We also demonstrate analytically that the proposed scheme can be highly beneficial if the monitored metric is the SLA Violation Risk instead of the availability.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

In optical WDM networks, survivability mechanisms are needed to prevent the failure of a network element from causing significant revenue losses for customers that run their services over the bandwidth provided by the optical paths. These revenue losses are then reclaimed in the form of penalties to be paid by the Network Operator (NO). Different protection mechanisms to ensure survivability in WDM networks have been proposed [1]; among them, Shared-Path Protection (SPP) is one of the most widely adopted options because of its resource efficiency [2].

Recently, many new applications have been emerging that require large bandwidth over relatively short and predictable periods of time, e.g., video distribution of important sport or social events, or massive data transfers for backup, storage or e-science purposes. Network technology and the bandwidth market are developing to provide the flexible platform that these new applications are asking for. In particular, new architectures and routines for user-controlled on-demand optical circuit provisioning [3], typically based on automatic or web-based interfaces at the management plane (MP) [1], will enable the on-line specification of the Service Level Agreement (SLA) terms to be guaranteed (at different price ranges) by the NO. In other words, users may be able to specify the QoS terms [3] of their connection requests, e.g., the availability target (AT) or the holding time.

In particular, the NO should be able to guarantee with high probability that the stipulated AT is respected, in order to avoid penalties; at the same time, the NO aims at increasing its profit, i.e., it wants to maximize the number of connections (or the bandwidth) provisioned. However, with SPP and dynamic traffic, the NO must carefully monitor the actual availability provided to the customer. In fact, whenever a new SPP connection is routed, the NO must verify not only that the AT of the incoming connection is satisfied, but also that the ATs of the existing connections are still respected despite the increased sharing of backup resources. A typical solution to avoid penalties is to employ an availability-guaranteed provisioning approach [4,2]: in this case, the NO provisions a connection only if the network can provide a path whose long-term theoretical availability (which can be evaluated a priori) is equal to or larger than the AT.

In this work, we present a novel availability-guaranteed provisioning method that dynamically manages the availability provided to SPP-protected connections during their holding time. Our approach (i) leverages the information about connection departures (future departures imply decreased sharing and, in turn, more availability) and (ii) is able to dynamically "trade" availability from connections ahead of their AT to connections behind their AT. Note that the proposed scheme does not reprovision backup resources (and, consequently, it avoids the additional control overhead required by backup reprovisioning); it only operates on the sharing of the backup resources.

When a new connection has to be routed, the target availability of each living connection in the network is re-estimated by considering the experienced downtime and the remaining holding time. For example, if a connection has not been affected by failures, the NO can be considered in "credit" of availability with respect to the customer, and the availability requirement of the connection can be suitably decreased by the NO, as long as the original AT is still respected. On the other hand, if a connection has undergone an outage period and is getting close to its maximum acceptable downtime, then the NO can be considered in "debit" of availability with respect to the customer, and the current availability requirement of the connection should be increased by the NO in an attempt to meet the originally stipulated AT. In brief, the proposed approach dynamically "transfers" availability from connections in credit of availability to connections in debit of availability by allowing more or less sharing of backup resources, and it helps the NO to meet connection SLAs. Although the proposed methodology can be applied to any SPP algorithm, in the following we show the effectiveness of our approach through simulation experiments using the Availability-Guaranteed Provisioning (AGP) algorithm presented in [2]. Since our proposed method requires knowledge of the connection holding time, we will refer to it as the Holding-Time-Aware (HTA) method.

Furthermore, recent studies [5–7] have shown that using the theoretical long-term availability to evaluate the quality of a connection provided over a short period of time (e.g., a period comparable with the average failure interval) may not be enough to characterize the quality of provisioning in terms of SLA satisfaction (more details are given in Section 2). So, in this paper, we also discuss analytically how to extend the proposed approach to the case where the monitored metric is the SLA Violation Risk (i.e., the probability that, given a certain availability target and a certain statistical availability associated with the path, the offered connection does not satisfy the availability target) when our trading-availability method is adopted. Some preliminary results are also provided.

The rest of the paper is organized as follows. Section 2 overviews some background work on availability-guaranteed SPP strategies. In Section 3 we formally state the availability-guaranteed SPP problem and briefly describe an existing solution for availability-guaranteed SPP, called AGP. In Section 4 we present our new HTA method to dynamically trade availability among SPP connections. Section 5 discusses how to extend the HTA approach to include the new SLA Violation Risk metric. In Section 6 we compare and evaluate, by means of simulations, our HTA methodology against the basic AGP approach. In Section 7, we draw our conclusions. In Appendix A, a rigorous approach to evaluate the availability in an SPP network scenario is presented.

2. Prior work

This paper provides novel contributions to two complementary, but distinct, lines of research in the field of shared-path protection: (1) how to route availability-guaranteed shared-path-protected connections and (2) how to evaluate the SLA Violation Risk, or interval availability, for availability-guaranteed SPP routing.

2.1. Availability-guaranteed SPP

We start by considering the problem of dynamic routing with Shared-Path Protection (SPP). To enable dynamic provisioning of SPP connections, a network-control element (e.g., the Path Computational Element, PCE) needs to compute two link-disjoint paths, a dedicated working path and a shared backup path, for each incoming connection request, based on the current network state. SPP routing algorithms are usually based on two-step approaches (e.g., [8]), which compute the working and the backup path separately, using shortest-path or K-shortest-path algorithms [9] that minimize the total link costs. Generally, link costs are assigned according to optimized metrics such as fiber distance, hop count, and link load.

In [8,10], meaningful link-cost assignment approaches have been proposed to increase the sharing of resources that are already reserved by the backup paths of other working connections, instead of reserving new resources. Ref. [11] shows that the backup capacity for SPP can be decreased by exploiting the connection holding-time information. However, these approaches do not take into consideration that incoming connections may have different availability requirements.

Availability-aware routing has been extensively investigated. In the following we comment only on those works where availability awareness is coupled with shared-path protection [4,2,12–17]. The works in [2,12] propose new routing algorithms which support service differentiation under static and dynamic traffic conditions, respectively. The primary objective is to route connections that comply with their target availability. A secondary objective is to minimize resource usage. In [18], it is shown that, if the cost of a link is defined as a function of its availability, finding a shortest path traversing these links becomes equivalent to finding the Most-Reliable Path (MRP). The authors of [4] look at the case of SPP with node-disjointness. The work in [13] also investigates SPP with guaranteed availability requirements in a dynamic environment and uses a matrix-based approach for availability analysis. To the best of our knowledge, Ref. [17] is the first work where connection availability and connection holding time are jointly considered. However, neither the outage history of connections nor the connection holding-time information is exploited there to dynamically manage the availability targets during the lifetime of the connections. Both these aspects are addressed in this paper.

2.2. SLA Violation Risk

As a second aspect of the overview, consider that, in order to apply availability-aware routing, one has to rely on rigorous analytical approaches to evaluate the long-term theoretical availability of an SPP connection. Different availability analysis methods can be found in the literature [18–22]. An exhaustive comparison of these approaches can be found in [23]. While the analytical estimation of the availability of an SPP connection is a mature topic, recent literature has shown that the theoretical long-term availability is not enough to evaluate the quality of a connection provided over a short period of time (e.g., a period comparable with the average failure interval) in terms of SLA satisfaction.

More specifically, different works have referred to the concept of SLA Violation Risk, i.e., the probability that, given a certain availability target AT and a certain theoretical long-term availability associated with the path, the offered connection does not satisfy the AT. The authors in [24,5] were the first to propose quantifying the uncertainty of optical-layer provisioning based on service settings and failure profiles. In [24], the probability of SLA violation is examined based on simulation. By running a large number of connections in a given network, the ratio of SLA violations over the total admitted connections is determined, which essentially corresponds to our SLA Violation Risk at a statistical level. However, when the network setting changes, the simulation must be re-run and hence generality is reduced. In [5], the authors examine a safety factor to guarantee the SLA Violation Risk, focusing mainly on the randomness of the number of failures in dedicated backup systems. Refs. [25,7] provide an analysis to compute the probability of SLA violation, considering both the number of failures and the failure repair time as random variables for the case of a single (unprotected) path, and provide a routing algorithm that minimizes the probability of SLA violation. In [26], the authors use service continuity (i.e., the probability of obtaining an uninterrupted service) other than availability for some classes of service. In [27], the authors show that, by using Markov models, accurate estimates for the interval availability distribution (a metric that is strictly related to SLA Violation Risk) can be achieved, and they derive analytical approximations to the interval availability distribution. Finally, Ref. [6] provides a formulation for interval availability also for the dedicated-path-protected case.
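The simulation-based view of SLA Violation Risk described above can be illustrated with a minimal Monte Carlo sketch for a single unprotected path. It is not the method of any cited work: the exponential failure and repair times, the function names, and the parameters `mttf` and `mttr` are all assumptions made only for illustration.

```python
import random

def downtime_over_interval(holding_time, mttf, mttr, rng):
    """Total downtime of a path that alternates between up and down periods.

    Assumption: times to failure and repair times are exponential with
    means mttf and mttr (hypothetical parameters, same time unit).
    """
    t, down = 0.0, 0.0
    while True:
        t += rng.expovariate(1.0 / mttf)       # up period until the next failure
        if t >= holding_time:
            return down
        repair = rng.expovariate(1.0 / mttr)   # duration of the outage
        down += min(repair, holding_time - t)  # clip the outage at the end of the holding time
        t += repair
        if t >= holding_time:
            return down

def sla_violation_risk(at, holding_time, mttf, mttr, runs=10_000, seed=0):
    """Fraction of simulated connections whose downtime exceeds (1 - AT) * holding time."""
    rng = random.Random(seed)
    allowed = (1.0 - at) * holding_time
    violations = sum(
        downtime_over_interval(holding_time, mttf, mttr, rng) > allowed
        for _ in range(runs)
    )
    return violations / runs
```

As noted in the text for simulation-based estimates in general, any change of the settings (here `at`, `holding_time`, `mttf` or `mttr`) requires the estimate to be recomputed.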

2.3. Elastic QoS

The authors of [28] introduce the concept of Elastic QoS for a connection, as the possibility to vary the QoS target of a connection (e.g., in terms of the number of protection paths for that connection) according to the network state (mainly according to congestion). In our case we show how this elasticity can also be achieved in the case of shared backup resources, by intelligently granting more or less sharing over the backup resources.

In conclusion, for the first time, to the best of our knowledge, we apply the concepts of SLA Violation Risk and QoS elasticity to SPP. Our new method for availability trading allows us to manage availability elastically, and we investigate how this elasticity affects the SLA-violation-risk properties.

3. Notation and problem statement

3.1. SPP provisioning problem

We first define the notation and formally state the SPP routing problem. A network is represented as a weighted, directed graph $G = (V, E, C, W_e)$, where $V$ is the set of nodes, $E$ is the set of unidirectional fibers (referred to as links), $C: E \to \mathbb{R}^+$ is a function that maps the elements of $E$ to positive real numbers representing the link costs, and $W_e: E \to \mathbb{Z}^+$ specifies the number of wavelengths on a generic link $e$ (where $\mathbb{Z}^+$ denotes the set of positive integers).

We use $W_e^f$ to denote the number of free wavelengths on link $e \in E$. We denote the set of existing lightpaths in the network at any time by $L = \{(l_w^i, l_b^i, t_a^i, t_h^i)\}$, where the quadruple $(l_w^i, l_b^i, t_a^i, t_h^i)$ specifies the working path, the backup path, the arrival time and the holding time of the $i$th lightpath.

We associate a link vector [8] with each link in the network, to identify the sharing potential between backup paths. The link vector $\nu_e$ for link $e$ can be represented as an integer set, $\{\nu_e^{e'} \mid \forall e' \in E,\ 0 \le \nu_e^{e'} \le W_{e'}\}$, where $\nu_e^{e'}$ specifies the number of working lightpaths that traverse link $e'$ and are protected by link $e$ (i.e., their corresponding backup lightpaths traverse link $e$). Through such a simple data structure, the link vector captures the necessary information on the sharing potential offered by each link. The number of wavelengths which need to be reserved for backup lightpaths on link $e$ is thus $\nu_e^* = \max_{e' \in E} \{\nu_e^{e'}\}$. Therefore, using the link vector, we can simply reserve $\nu_e^*$ wavelengths on link $e$ as backup wavelengths.
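The link vector can be sketched as a nested counter. The following is a minimal illustration under stated assumptions: links are identified by arbitrary hashable ids, each connection is given as a pair of link lists, and the function names are hypothetical.

```python
from collections import defaultdict

def build_link_vectors(connections):
    """Return nu, where nu[e][e_prime] counts the working lightpaths that
    traverse link e_prime and whose backup lightpath traverses link e.

    connections: iterable of (working_links, backup_links) pairs.
    """
    nu = defaultdict(lambda: defaultdict(int))
    for working, backup in connections:
        for e in backup:
            for e_prime in working:
                nu[e][e_prime] += 1
    return nu

def backup_wavelengths(nu, e):
    """nu_e^* = max over e' of nu_e^{e'} (zero if link e protects nothing)."""
    return max(nu[e].values(), default=0)
```

With two connections whose working paths are link disjoint and whose backup paths both cross one common link, that link needs a single backup wavelength; if the two working paths shared a link, it would need two.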

Based on the information contained in $\nu_e$, an SPP procedure has to find two Shared-Risk-Link-Group (SRLG)-disjoint paths for the incoming request $(l_w, l_b, t_a, t_h)$, so that:

(C.1) the working and backup lightpaths, $l_w$ and $l_b$, are link disjoint;

(C.2) $l_w$ and $l_w^i$ do not utilize the same wavelength on any common link they traverse;

(C.3) $l_w$ does not share any wavelength with $l_b^i$ on any common link they traverse;

(C.4) $l_b$ and $l_b^i$ can share a wavelength on a common link only if $l_w$ and $l_w^i$ are link disjoint.

In Section 3.3, we describe how this SPP provisioning process can be upgraded to enforce availability targets. In Appendix A, a rigorous approach to evaluate the availability of an SPP connection is shown.
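Conditions C.1 and C.4 reduce to link-disjointness tests, which can be sketched as follows. Paths are assumed to be lists of directed links written as `(u, v)` tuples, and the function names are hypothetical; the wavelength-level conditions C.2 and C.3 are not modeled here.

```python
def link_disjoint(path_a, path_b):
    """True if the two paths have no link in common."""
    return set(path_a).isdisjoint(path_b)

def satisfies_c1(working, backup):
    """C.1: the working and backup lightpaths of one connection are link disjoint."""
    return link_disjoint(working, backup)

def may_share_backup_wavelength(new_working, existing_working):
    """C.4: two backup lightpaths may share a wavelength on a common link
    only if their working lightpaths are link disjoint."""
    return link_disjoint(new_working, existing_working)
```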

3.2. Availability-guaranteed SPP provisioning problem

Now we extend the formal problem statement to availability-guaranteed SPP provisioning. Let us redefine the set of the existing connections as $L = \{(l_w^i, l_b^i, t_a^i, t_h^i, AT^i, A^i)\}$, where the sextuple specifies the working path, the backup path, the arrival time, the holding time, the stipulated availability target specified in the SLA, and the long-term theoretical availability provided to a connection. Similarly, $l = (t_a, t_h, AT)$ defines the arrival time, the holding time and the stipulated availability target specified in the SLA of a new incoming connection. Unlike in the traditional provisioning approach, the working path $l_w$ and the backup path $l_b$ of the incoming connection must satisfy two additional conditions with respect to the other existing connections in $L$:

(C.5) the availability target $AT$ of the incoming connection must be satisfied ($AT \le A$);

(C.6) the availability targets $AT^i$ of the existing connections must be satisfied ($AT^i \le A^i$).

Note that the provisioning of the new connection may reduce the availability of other existing connections due to the increased sharing of backup capacity. So, if the SLA of one of the existing connections is violated due to the increased sharing, then the incoming connection is blocked (even if its own availability meets the requirement).
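The admission rule implied by C.5 and C.6 can be sketched as a simple predicate. This is a minimal illustration: the function name and argument layout are hypothetical, and the availabilities are assumed to have been computed elsewhere.

```python
def admit(new_at, new_availability, affected):
    """Accept the incoming connection only if C.5 and C.6 both hold.

    affected: iterable of (at_i, recomputed_a_i) pairs, one for each existing
    connection whose availability changes because of the additional sharing.
    """
    if new_availability < new_at:                      # C.5 violated
        return False
    return all(a_i >= at_i for at_i, a_i in affected)  # C.6 for every affected connection
```

The predicate makes explicit that a newcomer whose own availability is sufficient is still blocked as soon as one affected connection drops below its target.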

3.3. Availability-guaranteed SPP provisioning algorithm

Several algorithms have been proposed to solve the availability-guaranteed SPP provisioning problem. In Algorithm 1, we describe a baseline approach, called AGP, for availability-guaranteed SPP dynamic provisioning. It is a modified version of the algorithm in [2], and we will use it as the holding-time-agnostic counterpart of our approach. In AGP, a connection can be either unprotected or shared-path-protected, such that its SLA requirement is met and network resource usage is minimized. The formulas used in this paper to calculate the long-term availability of the connections are reported in the Appendix. The routing algorithm follows a two-step procedure. First, the MRP is computed as the working path. If the SLA target is not met, then the connection is also provided with a shared backup path. To compute the backup path, a new cost function $C_e = -\log(A_e \cdot a_e)$ is applied to each link $e$, so that the path with minimum cost is the path with maximum availability. Here $A_e$ is the availability of link $e$, while $a_e$ represents the availability product of the links which, in case of a double failure, contend for the backup resources of link $e$ with the links belonging to the working path. If the backup wavelengths on a link $e$ can be shared by the incoming connection, $a_e$ is evaluated considering only the $\nu_e^*$ backup wavelengths already existing on link $e$ before the new arrival. Otherwise, a link is usable but not sharable if none of the $\nu_e^*$ existing backup wavelengths can be shared on it but there is at least one free wavelength on the link. In this latter case, $a_e$ is evaluated considering link $e$ as having $\nu_e^* + 1$ backup wavelengths. $a_e$ is rigorously defined in Eq. (2), where $W_e^f$ denotes the number of free wavelengths on link $e \in E$.
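The reason a logarithmic cost turns the most-available path into a minimum-cost path can be written out in one step. The derivation below is a sketch that treats the product of the per-link terms $A_e a_e$ along a candidate path $p$ as the quantity to be maximized:

```latex
\[
\min_{p} \sum_{e \in p} -\log\left(A_e a_e\right)
  \;=\; \min_{p} \left[ -\log \prod_{e \in p} A_e a_e \right]
  \;\Longleftrightarrow\;
  \max_{p} \prod_{e \in p} A_e a_e ,
\]
```

since the logarithm is monotonically increasing. Because $0 < A_e a_e \le 1$, every link cost is non-negative, so standard shortest-path algorithms can be applied to these costs.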

Algorithm 1. Availability-Guaranteed Provisioning (AGP)

Input: $G = (V, E, C, W_e)$, $\nu = \{\nu_e \mid e \in E\}$, the set of existing connections $L = \{(l_w^i, l_b^i, t_a^i, t_h^i, AT^i, A^i)\}$, an incoming connection $l = (t_a, t_h, AT)$

Output: A working path $l_w$ or a pair of working/backup paths $(l_w, l_b)$ for the incoming connection, with overall long-term availability $A$ satisfying constraints C.1-C.6.

1. Compute the MRP for the incoming connection request, as the working path $l_w$. If $A_{l_w} \ge AT$, then $A \ge AT$; go to step 4. Block the incoming connection if path $l_w$ is not found.

2. Compute the backup path $l_b$ with minimal cost according to the following cost function:

$$
C_e =
\begin{cases}
\infty & \text{if } W_e^f = 0 \,\wedge\, \exists e' \in l_w \mid \nu_e^{e'} = \nu_e^* \\
-\log(A_e a_e) & \text{otherwise}
\end{cases}
\qquad (1)
$$

Compute $A$ (see Eq. (A.5)). If $A < AT$ (i.e., C.5 is violated), or if path $l_b$ is not found, block the incoming connection request.

3. Re-compute the availabilities of all the connections in $L$ which share backup resources with $l_b$. If there is any connection $i \in L$ whose re-computed availability does not meet its $AT^i$ requirement (i.e., C.6 is violated), block the incoming connection request.

4. The connection is accepted and the path $l_w$, or the path pair $l_w$ and $l_b$, is set up.


$$
a_e =
\begin{cases}
\displaystyle\prod_{\forall e'' \notin \{l_w \cup e\} \,\mid\, \exists e' \in l_w \,\wedge\, D_e^{(e',e'')} \ge \nu_e^*} A_{e''} & \text{if } \forall e' \in l_w \mid \nu_e^{e'} < \nu_e^* \\[2ex]
\displaystyle\prod_{\forall e'' \notin \{l_w \cup e\} \,\mid\, \exists e' \in l_w \,\wedge\, D_e^{(e',e'')} > \nu_e^*} A_{e''} & \text{if } W_e^f \ne 0 \,\wedge\, \exists e' \in l_w \mid \nu_e^{e'} = \nu_e^*
\end{cases}
\qquad (2)
$$

In the next two sections, we first provide in Section 4 a new version of the AGP algorithm which is able to manage "availability trading" among dynamic connections. Then, in Section 5, we show mathematically how the same concept of trading, initially applied to the availability metric in Section 4, can be extended to the more practical concept of SLA Violation Risk, which has recently been investigated in various works in the availability field.
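Step 1 of Algorithm 1 can be sketched with a standard Dijkstra search over $-\log$ costs. This is a minimal illustration, not the authors' implementation: links are assumed to be given as a dictionary from `(u, v)` pairs to availabilities in (0, 1], wavelength occupancy is ignored, and the function name is hypothetical.

```python
import heapq
import math

def most_reliable_path(links, src, dst):
    """Return (node_list, availability) of the most reliable path, or (None, 0.0).

    links: dict mapping a directed link (u, v) to its availability in (0, 1].
    """
    adj = {}
    for (u, v), a in links.items():
        adj.setdefault(u, []).append((v, -math.log(a)))  # non-negative cost
    dist, prev, done = {src: 0.0}, {}, set()
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, cost in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, math.inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None, 0.0
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return path, math.exp(-dist[dst])  # product of the link availabilities
```

If the returned availability already meets the target, the connection can be left unprotected; otherwise a backup path would be searched with the costs of Eq. (1), which this sketch does not cover.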

Fig. 1. In this simple network example, traditional availability evaluationfails to route the second connection, while the availability targetredefinition succeeds.

4. Availability-guaranteed SPP with availability trading

Let us now discuss the new concept of SPP with availability trading and the upgrades to AGP that are needed to obtain an algorithm for the new approach.

In the proposed algorithm, whenever a new connection is offered to the network, we exploit three crucial pieces of information: (i) the outage history of each existing connection (i.e., whether it has already been subject to outages), (ii) how long the new connection is going to remain in the network, and (iii) how long the other existing connections are going to remain in the network. Hence, we will refer to this algorithm as the Holding-Time-Aware (HTA) algorithm.

The basic idea is that, with respect to a specific connection, the NO may pass (with respect to its customers) from situations of availability credit to conditions of availability debit, and vice versa, as long as the overall SLA target is guaranteed. There are four possible availability "transactions" that can be managed by the NO:

1. NO credit: in an availability-guaranteed provisioning approach, the NO is initially always in credit with its customers. In fact, a connection is accepted only if the statistical estimate of the availability provided in the routing phase is larger than the availability target AT. However, topological constraints and the link-availability granularity of the network force the NO to provide its customer with a long-term availability level that is higher than the target that has been paid for.

2. Reducing NO credit: the NO has a means to provide an actual availability level closer to the stipulated value by decreasing the availability targets of living connections. For example, if a connection has not been affected by failures, at the next network change (typically the next connection arrival) its current availability requirement can be decreased with respect to the initial AT.

3. NO debit: two main causes can lead to an NO availability debit: an outage affecting the connection and putting its SLA at risk of violation, or a voluntary decision by the NO to provide an initial availability level lower than AT. In both cases, the NO takes on an availability debit, which can be extinguished by providing a service with larger statistical availability in the future (when other sharing connections leave the network).

4. Paying off NO debit: a service interruption, if not suitably managed, may lead to an SLA violation. To avoid this situation, when the outage has ended the NO may extinguish its debit by reducing the sharing degree of the backup resources along the backup path of the affected connection. This can be obtained by not sharing these backup resources with the backup paths of new incoming connections. Similarly, the NO may go into debit with an incoming call as a voluntary action to accept more connections. In this case, the NO goes into debit only if it can somehow be guaranteed that the required availability will be provided in the future. In conclusion, an availability debit should be acceptable only if it is recoverable by the maximum amount of availability that can be supplied in the future.
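The bookkeeping behind these transactions can be illustrated with a small sketch. It rests on one loudly stated assumption that is not taken from the text: the re-estimated target is chosen so that the time-weighted average of the uptime already delivered and the availability still to be delivered equals the stipulated AT over the whole holding time. The function names are hypothetical, and the formula actually used by the HTA method may differ.

```python
def allowed_downtime(at, holding_time):
    """Maximum downtime compatible with target `at` over the whole holding time."""
    return (1.0 - at) * holding_time

def reestimated_target(at, holding_time, elapsed, downtime):
    """Availability to deliver over the remaining time (assumed time-weighted balance)."""
    remaining = holding_time - elapsed
    if remaining <= 0:
        raise ValueError("connection has no remaining holding time")
    uptime_so_far = elapsed - downtime
    return (at * holding_time - uptime_so_far) / remaining

def no_position(at, holding_time, elapsed, downtime):
    """Classify the NO's position with respect to one connection."""
    target = reestimated_target(at, holding_time, elapsed, downtime)
    if target > 1.0:   # equivalent to downtime > allowed_downtime(at, holding_time)
        return "unrecoverable"
    return "credit" if target < at else "debit"
```

Under this assumption, a connection with target 0.99 and holding time 20 that has seen no outage after 10 time units only needs about 0.98 over its remaining lifetime, which is the kind of slack that "reducing NO credit" exploits.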

4.1. NO credit

Let us refer to the following example to show how an NO can exploit the knowledge of the connection holding time to take advantage of an availability credit. In Fig. 1, we show the state of a network (consider, e.g., $\forall e \in E: W_e = 8$) at an instant $t_c = t_a^2 = 10$, when connection $r_2$ has to be provisioned between nodes E and F with $t_h^2 = 30$. A connection $r_1$ has already been routed in the network between nodes A and B at the instant $t_a^1 = 0$, and it is characterized by a holding time $t_h^1 = 20$. Both connections require an availability target $AT = AT^1 = AT^2 = 0.99$.

In accordance with the AGP approach, at time $t_a^1 = 0$ the route of the working path of connection $r_1$ was fixed along the MRP (link A–B); as a second step, since the availability provided by the working path alone was less than the availability target $AT^1$, a backup path was routed through nodes A–C–D–B (dashed line in the figure) using the link cost assignment in Eq. (1). Connection $r_1$ was accepted because $A^1 = 0.99152 > AT^1 = 0.99$. In this situation we say that the NO is in availability credit with respect to the customer.

At time t2a ¼ 10, the AGP approach fixes the route of the

working path of r2 connection along the MRP (link E–F);


Fig. 2. Time evolution on link C–D of the network in Fig. 1.

1 Note that, in order to pay back its debit, the NO usually needs to supply a future availability lower than the maximum future suppliable availability derived by Eq. (4). As a matter of fact, the debit may be dynamically compensated by iterative credit reductions if a connection has not experienced any outages.

D. Lucerna et al. / Computer Networks 56 (2012) 3150–3162 3155

as a second step, since the availability $A_2$ provided with only the working path is less than the availability target $AT_2$, a backup path is routed on nodes E–C–D–F (dashed line in figure) utilizing the link cost assignment in Eq. (1). Since $r_1$ and $r_2$ share a wavelength on link C–D along their backup paths, connection $r_2$ is accepted if and only if conditions C.5 and C.6 are both satisfied, i.e. the availability target of connection $r_2$ is respected and the availability target of connection $r_1$ is still guaranteed. We recall that backup sharing reduces the availability of an SPP connection. In this case, with the traditional approaches, both conditions are violated, because $A_1 = A_2 = 0.98945 < AT_1 = AT_2$. So, request $r_2$ must either be refused or be served with dedicated protection, which induces high resource consumption.

However, in our HTA method, we may notice that connection $r_1$ has not been affected by failures during its previous lifetime, from time 0 to time 10. Then, a new availability target $\widetilde{AT}_1$ for $r_1$ can be set, taking into account that from $t_a^1$ to $t_c$ the connection has been served with "previous" availability $A_p^1 = 1$ and that the connection will remain in the network from $t_c$ to $t_a^1 + t_h^1$. In general, the availability target can be redefined as:

$$\widetilde{AT}_i = \frac{AT_i \, t_h^i + A_p^i \left( t_a^i - t_c \right)}{t_a^i + t_h^i - t_c} \qquad (3)$$

and the new target $\widetilde{AT}_i$ can be substituted for the previous target $AT_i$ that we were using during the check of condition C.6 in Algorithm 1. In this specific case, the availability target of connection $r_1$ is reduced, i.e. the NO's availability credit is decreased, given that $r_1$ has not been subject to any service outage. As a result, $\widetilde{AT}_1 = 0.98 < A_1 = 0.98945$ and condition C.6 is now respected. However, connection $r_2$ still cannot be accepted because condition C.5 is still violated.
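The arithmetic of this example can be checked with a few lines of Python. The sketch below only illustrates Eq. (3); the function name `redefined_target` and its argument names are hypothetical and are not part of Algorithm 1.

```python
def redefined_target(at, t_a, t_h, t_c, a_prev):
    """Eq. (3): availability target still owed over the residual lifetime.

    at     -- stipulated availability target AT_i
    t_a    -- arrival time, t_h -- holding time, t_c -- current time
    a_prev -- availability actually experienced in [t_a, t_c]
    """
    residual = t_a + t_h - t_c
    if residual <= 0:
        raise ValueError("the connection has already left the network")
    return (at * t_h + a_prev * (t_a - t_c)) / residual


# Connection r1 of the example: no outage between time 0 and time 10.
new_target_r1 = redefined_target(at=0.99, t_a=0, t_h=20, t_c=10, a_prev=1.0)
print(round(new_target_r1, 5))      # 0.98

# Condition C.6 for r1 when its backup wavelength is shared with r2.
print(new_target_r1 < 0.98945)      # True

# Illustrative debit case (a_prev is an assumed value): an outage in the
# elapsed interval raises the target for the residual lifetime.
print(round(redefined_target(0.99, 0, 20, 10, a_prev=0.985), 5))  # 0.995
```

The last call shows the opposite direction of the same formula: when the experienced availability is below the target, the redefined target for the residual lifetime becomes more demanding.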

4.2. NO debit

The availability redefinition reported in Eq. (3) may also be utilized when the NO goes into availability debit due to network failures. In this case, applying availability redefinition allows us to increase the availability target, achieving a more conservative treatment of the connection and, in turn, reducing the availability debit. The NO will still accept new connections, but it will prevent their backup paths from sharing backup resources with existing connections which have experienced an outage.

As mentioned before, an NO's availability debit may be induced voluntarily to accept more connections in the network; in the following we provide an example of a voluntarily induced availability debit. In order to follow the time evolution of the network links' state, we introduce the new symbols $\nu_e(\Delta\tau_k)$, $\nu_e^{e'}(\Delta\tau_k)$, $z_e^{(e',e'')}(\Delta\tau_k)$ and $\Delta_e^{(e',e'')}(\Delta\tau_k)$, which express the values of $\nu_e$, $\nu_e^{e'}$, $z_e^{(e',e'')}$ and $\Delta_e^{(e',e'')}$, respectively, in the time interval $\Delta\tau_k$.

Let us expressly define $\Delta\tau_k$ first. According to connection holding times, the $t_h$'s can be ordered so that $t_a^i + t_h^i \le t_a^{i+1} + t_h^{i+1}$, $i = 1, 2, \ldots, |L|$. As a consequence, $\tau = \{\tau_0, \ldots, \tau_{|L|}\} = \{0,\ t_a^1 + t_h^1,\ t_a^2 + t_h^2,\ \ldots,\ t_a^{|L|} + t_h^{|L|}\}$ will indicate the departure events and $\Delta\tau_k = \tau_k - \tau_{k-1}$ expresses the time interval between two departures. $\nu_e^{e'}(\Delta\tau_k)$, $z_e^{(e',e'')}(\Delta\tau_k)$ and $\Delta_e^{(e',e'')}(\Delta\tau_k)$ will be updated according to the kth connection departure. In other words, we have divided the time into a series of intervals $\Delta\tau$ which express the distance between two departures. In Fig. 2 we focus on the departure events on link C–D of the network in Fig. 1, assuming that connection $r_2$ has also been provisioned:

• $\Delta\tau_1$ (from time 10 to time 20): the backup paths of connection $r_1$ and connection $r_2$ share a wavelength on link C–D. During this time interval the provided availability $A_1(\Delta\tau_1) = A_2(\Delta\tau_1)$ is low and equal to 0.98945.

• $\Delta\tau_2$ (from time 20 to time 40): $r_2$ has a dedicated resource on link C–D because connection $r_1$ has left the network. In this time interval the suppliable availability $A_2(\Delta\tau_2)$ is equal to 0.99152.

As examined in the previous section, during $\Delta\tau_1$ the NO can reduce its availability credit with connection $r_1$. Moreover, in the same time interval, the NO can voluntarily go into debit with $r_2$, because the debit will be paid off during the residual lifetime of $r_2$, i.e., $\Delta\tau_2$. In this way, the NO will be able to guarantee an overall availability of $\widetilde{A}_2 = 0.99083$ and connection $r_2$ will also be accepted.

More generally, the state of a link can vary in time, passing, e.g., from shared to dedicated, and the availability provided to the connection could consequently change. Each of these availability contributions $A_i(\Delta\tau_k)$ can then be weighted proportionally over each time interval according to the following equation:

$$\widetilde{A}_i = \frac{\sum_{\Delta\tau_k} A_i(\Delta\tau_k) \cdot \Delta\tau_k}{t_a^i + t_h^i - t_c} \qquad (4)$$

where the new $\widetilde{A}_i$ can substitute the previous $A_i$ in conditions C.5 and C.6 of Algorithm 1. In other words, $\widetilde{A}_i$ expresses the maximum suppliable availability for the connection $r_i$ if its backup path will not be shared with any other future incoming connection. This condition can be easily enforced by preventing the backup paths of future incoming connections from sharing the backup capacity of connection $r_i$.¹
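As a numerical check of Eq. (4), the following sketch recomputes the overall availability of connection $r_2$ from the two intervals of Fig. 2. The helper name `max_suppliable_availability` and the representation of the intervals as (duration, availability) pairs are assumptions made only for this illustration.

```python
def max_suppliable_availability(intervals, t_a, t_h, t_c):
    """Eq. (4): time-weighted availability over the residual lifetime.

    intervals -- list of (duration, availability) pairs, one per interval
                 between consecutive departures, covering [t_c, t_a + t_h]
    """
    residual = t_a + t_h - t_c
    covered = sum(duration for duration, _ in intervals)
    if abs(covered - residual) > 1e-9:
        raise ValueError("intervals must cover the residual lifetime exactly")
    weighted = sum(duration * avail for duration, avail in intervals)
    return weighted / residual


# Connection r2 (arrival 10, holding time 30), evaluated at its arrival:
# shared backup wavelength for 10 time units, dedicated for the next 20.
a2 = max_suppliable_availability(
    [(10, 0.98945), (20, 0.99152)], t_a=10, t_h=30, t_c=10
)
print(round(a2, 5))   # 0.99083
print(a2 >= 0.99)     # True: the target of r2 can be met over its lifetime
```

Under these numbers the debit taken during the first interval is recovered in the second one, which is the situation in which the text accepts $r_2$.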


Fig. 3. Flow chart of the AGP and HTA algorithms.


Exploiting this holding-time-aware (HTA) approach [11], the NO is able to accept more connections into the network and violates fewer SLA availability targets by dynamically adapting them to the network state evolution. In Fig. 3 a conceptual description of the main steps in HTA is reported. Note that, without the SLA redefinition and the maximum future availability estimation, the approach reduces from HTA to AGP.

It is worth noting at this point that our approach does not involve reprovisioning of backup capacity: the NO simply manages the sharing of backup resources in a more flexible manner. Finally, our approach does not re-evaluate the availability status of all existing connections in the network; it is applied only to the incoming connection and to the connections that share backup resources with it. For common network scenarios, this significantly limits the computational complexity.²

5. SLA Violation Risk in case of availability trading

As often mentioned throughout the paper, availability-guaranteed provisioning in an optical WDM network must typically satisfy the condition that the theoretical long-term availability A provided to a connection is greater than or equal to the stipulated availability target SLA. However, due to the stochastic nature of network failures, even if A ≥ SLA, over a limited time period there is a non-negligible probability that the actual provided availability turns out to be less than the availability target, and so the stipulated contracts are usually at risk. Different works have referred to the concept of SLA-Violation Risk (SLA-VR), i.e., the probability that, given a certain availability target SLA and a certain theoretical long-term availability A associated to a connection, the provisioned path fails to satisfy the SLA target (see e.g., [6,7]).

2 We remark that the optimality of our method is not discussed here, since our method is applied in a dynamic context, where more emphasis is devoted to scalable and distributed approaches.

In this section, we investigate an analytical approach that allows us to apply the proposed HTA trading approach to the SLA-Violation Risk (SLA-VR) instead of the long-term availability. Note that, to the best of our knowledge, this is the first time the concept of SLA Violation Risk is applied in the context of shared-path protection (closed-form analytical formulations have been provided for unprotected connections and dedicated-path-protected connections, e.g., in [6,7,5]).

Let us consider a connection i defined by the sextuple $\{ l_w^i, l_b^i, t_a^i, t_h^i, AT_i, A_i \}$, as in Section 3.3. If we assume we know the actual availability $AA_i$ (i.e., $AA_i$ here does not represent the long-term availability, but the actual experienced availability), we can easily calculate the "Provided DownTime" ($PDT_i$) and the Stipulated "maximum allowable DownTime" ($SDT_i$) as:

$$PDT_i = (1 - AA_i) \cdot t_h^i, \qquad (5)$$

$$SDT_i = (1 - AT_i) \cdot t_h^i. \qquad (6)$$

While $SDT_i$ is the maximum allowable downtime that the customer can experience before the NO must pay a penalty, $PDT_i$ is the downtime that the customer actually experiences. Therefore, the contract concerning the connection i is violated when the effective provided availability $AA_i$ is lower than the stipulated availability target $SLA_i$, or alternatively when $PDT_i > SDT_i$.
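Eqs. (5) and (6) translate directly into a downtime-budget check. The snippet below is a minimal sketch with hypothetical function names; the measured availability values used in the two calls are assumed for illustration and are not taken from the simulations.

```python
def stipulated_downtime(at, t_h):
    """Eq. (6): maximum allowable downtime SDT_i over the holding time."""
    return (1.0 - at) * t_h


def provided_downtime(aa, t_h):
    """Eq. (5): downtime PDT_i actually experienced over the holding time."""
    return (1.0 - aa) * t_h


def sla_violated(aa, at, t_h):
    """The contract is violated when PDT_i > SDT_i."""
    return provided_downtime(aa, t_h) > stipulated_downtime(at, t_h)


# Holding time 20 and target 0.99, as for connection r1 in Section 4.1.
print(sla_violated(aa=0.985, at=0.99, t_h=20))  # True: downtime exceeds the budget
print(sla_violated(aa=0.995, at=0.99, t_h=20))  # False: downtime within the budget
```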

However, the actual availability $AA_i$ (and, consequently, the value of $PDT_i$) is not known a priori, since it depends on the specific occurrence of randomly-distributed network failures. It follows that the SLA-VR can only be probabilistically defined and evaluated as:

$$SLA\text{-}VR_i = \Pr(PDT_i > SDT_i), \qquad (7)$$

which is the probability that the stipulated contract will be violated. So, in order to consider SLA-VR instead of long-term availability, we must substitute conditions C.5 and C.6 in Section 3 with the following two conditions to be satisfied by a new incoming connection:

(C.5bis) The availability target A of the incoming connection must be guaranteed with an SLA-VR lower than or equal to a Prefixed Risk Probability (PRP).

(C.6bis) The availability target $A_i$ of the existing connections must be guaranteed with an SLA-VR$_i$ lower than or equal to a Prefixed Risk Probability (PRP).


3 With an average connection holding time of 15 days, the resulting MTTR is about 12 h.


In the following, extending the availability redefinition technique presented in Section 4, we define a new methodology to trade availability credits and debits, under the new constraint that the Network Operator (NO) provisions a connection only if its SLA-VR ≤ PRP.

To compute the SLA-VR$_i$ for a generic connection i, we have to estimate (i) its long-term availability $A_i$ and (ii) the Mean Time To Repair $MTTR_i$ of that connection. Both estimations depend on the protection scheme adopted. $A_i$ is computed according to Eq. (A.5) (see Appendix). As for $MTTR_i$, we consider here the common assumption that the failure rate $\lambda_e$ is much smaller than the repair rate $\mu_e = 1/MTTR_e$ on a generic link e; thus, we can approximate $MTTR_i$ for an unprotected connection i or for a protected connection i by $MTTR_e$ and $MTTR_e/2$, respectively. These assumptions have been discussed and validated in [5] for dedicated path protection, and we do not expect the choice of shared protection to significantly influence the MTTR. Therefore the connection failure rate $\lambda_i$ can be derived as (formula obtained by inverting Eq. (A.1) in the Appendix):

$$\lambda_i = \frac{1 - A_i}{MTTR_i - (1 - A_i) \cdot MTTR_i}. \qquad (8)$$

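Now, the SLA-VR$_i$ calculation reported in Eq. (7) can be expressed as the following probability:

$$SLA\text{-}VR_i = \sum_{\{x \,\mid\, (1 - AT_i) \cdot t_h^i \,<\, x \cdot MTTR_i\}} \frac{e^{-\lambda_i t_h^i} \left( \lambda_i t_h^i \right)^x}{x!} \qquad (9)$$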

Note that the value of MTTR in this analysis is assumed to be constant, so the set of values $\{x \mid (1 - AT_i) \cdot t_h^i < x \cdot MTTR_i\}$ can be easily evaluated a priori.
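Eqs. (8) and (9) amount to the upper tail of a Poisson distribution with mean $\lambda_i t_h^i$, starting from the smallest number of failures whose cumulative repair time exceeds the downtime budget. The sketch below evaluates this tail for the connection parameters later used for Fig. 7a (two-year holding time, 12-hour MTTR, $A_i = 0.99897$, $AT_i = 0.9963$). The function names are hypothetical, and a year is assumed to last 365 days.

```python
import math


def failure_rate(a, mttr):
    """Eq. (8): connection failure rate from availability and MTTR."""
    return (1.0 - a) / (mttr - (1.0 - a) * mttr)


def sla_violation_risk(at, a, t_h, mttr):
    """Eq. (9): probability that the number of failures x during the
    holding time satisfies (1 - AT) * t_h < x * MTTR."""
    mean_failures = failure_rate(a, mttr) * t_h
    budget = (1.0 - at) * t_h
    # Smallest integer x with x * MTTR strictly larger than the budget.
    x_min = math.floor(budget / mttr) + 1
    # Poisson probability of at most x_min - 1 failures.
    cdf = sum(
        math.exp(-mean_failures) * mean_failures ** x / math.factorial(x)
        for x in range(x_min)
    )
    return 1.0 - cdf


HOURS_PER_YEAR = 365 * 24  # assumption: non-leap years

risk = sla_violation_risk(at=0.9963, a=0.99897,
                          t_h=2 * HOURS_PER_YEAR, mttr=12.0)
print(f"{100 * risk:.2f}%")   # about 0.45%
```

With these inputs six or more failures are needed to exceed the budget, and the resulting risk of roughly 0.45% agrees with the value quoted for the beginning of the holding time in Section 6.5.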

Once the formula in Eq. (7) for SLA-VR has been devised, in order to extend the SPP availability-trading mechanism proposed in Section 4 to SLA-VR, the next step is to modify the computation of the NO credit reduction and the NO debit payment, so that not only is the target availability $SLA_i$ redefined, but the SLA-VR$_i$ is also re-evaluated step by step. In the next two subsections we show how.

5.1. NO credit with SLA Violation Risk

Referring to the analytical expression in Section 4.1, at each instant $t_c$ (in which a network state change occurs) we can evaluate the SLA-VR$_i$ for a generic connection i as:

$$\widetilde{SLA\text{-}VR}_i = \sum_{x} \frac{e^{-\lambda_i \left( t_a^i + t_h^i - t_c \right)} \left( \lambda_i \left( t_a^i + t_h^i - t_c \right) \right)^x}{x!} \qquad (10)$$

where the sum runs over $\{x \mid (1 - \widetilde{AT}_i) \cdot (t_a^i + t_h^i - t_c) < x \cdot MTTR_i\}$, and the new pair of availability target and SLA Violation Risk $\{\widetilde{AT}_i, \widetilde{SLA\text{-}VR}_i\}$ can substitute the previous one $\{AT_i, SLA\text{-}VR_i\}$ during the check of condition C.6bis.
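A possible way to evaluate Eq. (10) is sketched below: the target is first redefined with Eq. (3) and the Poisson tail is then computed over the residual lifetime only. The function names, the 365-day year and the handling of an unreachable target (risk set to 1) are assumptions of this illustration, not part of the original formulation.

```python
import math


def poisson_tail(mean, x_min):
    """Probability of at least x_min events for a Poisson variable."""
    cdf = sum(math.exp(-mean) * mean ** x / math.factorial(x)
              for x in range(x_min))
    return 1.0 - cdf


def residual_violation_risk(at, a, t_a, t_h, t_c, mttr, a_prev=1.0):
    """Eq. (10): SLA-VR over [t_c, t_a + t_h] with the target of Eq. (3)."""
    residual = t_a + t_h - t_c
    lam = (1.0 - a) / (mttr - (1.0 - a) * mttr)                 # Eq. (8)
    new_at = (at * t_h + a_prev * (t_a - t_c)) / residual       # Eq. (3)
    budget = (1.0 - new_at) * residual
    if budget < 0:
        # Assumption: a redefined target above 1 cannot be met at all.
        return 1.0
    x_min = math.floor(budget / mttr) + 1
    return poisson_tail(lam * residual, x_min)


T_H = 2 * 365 * 24  # two years in hours, assuming 365-day years
params = dict(at=0.9963, a=0.99897, t_a=0.0, t_h=T_H, mttr=12.0)

# No failure experienced so far (a_prev = 1) at 0%, 50% and 90% of the
# holding time.
for fraction in (0.0, 0.5, 0.9):
    r = residual_violation_risk(t_c=fraction * T_H, **params)
    print(f"{fraction:.0%} expired: risk = {r:.2e}")
```

When no failure has occurred, the whole downtime budget remains available while the residual exposure shrinks, so the computed risk decreases as the holding time expires.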

5.2. NO debit with SLA Violation Risk

Referring to the analytical expression in Section 4.2, we introduce the new symbols $\lambda_i(\Delta\tau_k)$ and $MTTR_i(\Delta\tau_k)$ that express the values of the connection failure rate and the connection Mean Time To Repair, respectively, in the time interval $\Delta\tau_k$. Each of these $\lambda_i(\Delta\tau_k)$ and $MTTR_i(\Delta\tau_k)$ can then be weighted proportionally over each time interval according to the following equations:

$$\lambda_i = \frac{\sum_{\Delta\tau_k} \lambda_i(\Delta\tau_k) \cdot \Delta\tau_k}{t_a^i + t_h^i - t_c}, \qquad (11)$$

$$MTTR_i = \frac{\sum_{\Delta\tau_k} MTTR_i(\Delta\tau_k) \cdot \Delta\tau_k}{t_a^i + t_h^i - t_c}. \qquad (12)$$

Finally, for the NO-debit calculation, we substitute $\lambda_i$ and $MTTR_i$ as computed in Eqs. (11) and (12) in the SLA-VR$_i$ formula given in Eq. (10). The new pair of effective provided availability and SLA Violation Risk $\{\widetilde{A}_i, \widetilde{SLA\text{-}VR}_i\}$ can substitute the previous pair $\{A_i, SLA\text{-}VR_i\}$ during the check of conditions C.5bis and C.6bis. In other words, $\widetilde{A}_i$ expresses the maximum supplied availability for the connection i with a probability $1 - \widetilde{SLA\text{-}VR}_i$ under the assumption that its backup path will not be shared by any other future incoming connection.

6. Illustrative numerical examples

We now quantitatively evaluate the performance of the two approaches: (1) AGP and (2) HTA, the holding-time-aware provisioning approach with availability trading. We simulate a dynamic network environment with the assumptions that the connection-arrival process is Poisson and the connection-holding time follows a negative exponential distribution. The average connection-holding time is normalized to unity. For the illustrative results shown here, in every experiment, $10^5$ connection requests are simulated. All the plotted values have a 95% confidence interval not larger than 5% of the plotted value. Requests are uniformly distributed among all node pairs; availability requirements of the requests are uniformly distributed over the three classes {0.99, 0.999, 0.9999}, denoted as C1, C2, C3, respectively. The example network topology with 32 wavelengths per fiber is shown in Fig. 4. In order to generate the failures in our simulations, the MTTR is considered constant and normalized to 0.032,³ while MTBF follows a Poisson distribution that guarantees a link availability value of 0.999.
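The traffic model described above can be reproduced with a short generator. The following is only a sketch of such a workload and not the simulator used for the reported results: the function name, the tuple layout, the seed and the node count of 24 (read from the node labels of Fig. 4) are assumptions.

```python
import random

AVAILABILITY_CLASSES = (0.99, 0.999, 0.9999)  # classes C1, C2, C3


def generate_requests(n, arrival_rate, num_nodes=24, seed=0):
    """Generate n connection requests as
    (arrival_time, holding_time, source, destination, availability_target)."""
    rng = random.Random(seed)
    now = 0.0
    requests = []
    for _ in range(n):
        # Poisson arrivals: exponential inter-arrival times.
        now += rng.expovariate(arrival_rate)
        # Negative-exponential holding time with mean normalized to 1.
        holding = rng.expovariate(1.0)
        # Uniform choice of a pair of distinct nodes.
        src, dst = rng.sample(range(1, num_nodes + 1), 2)
        target = rng.choice(AVAILABILITY_CLASSES)
        requests.append((now, holding, src, dst, target))
    return requests


requests = generate_requests(n=1000, arrival_rate=100.0)
print(len(requests))   # 1000
```

Since the mean holding time is one, the arrival rate coincides with the offered load in Erlangs, which is how the loads in Table 1 can be read.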

We employ four performance metrics: Blocking Probability, SLA Success Ratio, Resource Distribution and Availability Gap.

6.1. Blocking probability

The Blocking Probability (BP) is the ratio of the blocked connections to the connections offered to the network. Exploiting the HTA approach, the NO is able to accept more connections into the network because, by periodically redefining the availability target of a connection, more effective backup sharing is allowed in the network. Table 1 compares the BP achieved by HTA and AGP. For the sake of completeness, we also consider our previous approach reported in [29], which is similar to HTA but only considers the "credit" case. The approach in [29] outperforms AGP, especially for high loads: e.g., at around 100 Erlangs, BP decreases from 17.7% to 12.3%. The HTA approach further reduces the BP with respect to the approach in [29]: at around 100 Erlangs, BP decreases from 12.3% to 7.2%. Connection blocking may be due to four different causes: lack of resources, violation of condition C.5, violation of condition C.6, and violation of both conditions C.5 and C.6. Fig. 5a and b show the impact of the various contributions to BP in the AGP and HTA approach, respectively, for increasing traffic load. Note that in the AGP approach, the inability to guarantee the SLA availability target to existing connections (violation of condition C.6) is the main cause of connection blocking. As shown in Fig. 5b, HTA outperforms AGP because it drastically reduces the blocking due to violation of condition C.6.

Table 1. Blocking probability, BP (%): AGP vs. HTA vs. [29].

| Arrival rate | 20 | 40 | 60 | 80 | 100 |
|---|---|---|---|---|---|
| AGP | 0.526 | 4.38 | 9.0411 | 14.092 | 17.677 |
| [29] | 0.250 | 2.567 | 5.745 | 9.034 | 12.270 |
| HTA | 0.151 | 1.028 | 2.6190 | 4.9650 | 7.2391 |

Fig. 4. A carrier's US nationwide backbone network topology (nodes labeled 1 to 24).

6.2. SLA success ratio

In Fig. 6a we show the ratio of the connections which violate the SLA target to the connections accepted into the network. Our numerical results demonstrate that the number of violations is much lower when HTA is used. For the sake of conciseness, we report only the case of SLA Class 3, i.e., the class of connections which always require a backup path. Note that this metric may not result in a totally fair comparison, because different schemes may accommodate different requests, while this measure does not differentiate one request from another.

6.3. Resource distribution

In order to investigate the fairness of these approaches, we now evaluate the resource distribution among different classes in AGP and HTA. Connections can be provisioned in two different ways: unprotected or shared-protected. We observe that all the connection requests in SLA Class 1 are unprotected, and all the connection requests in SLA Class 3 are shared-path protected. The proportion of shared-path-protected connections increases with the increase in traffic load. In Fig. 6b it can be seen that, with the AGP approach, the percentage of unprotected connections for SLA Class 2 grows as the network load increases: AGP encourages the routing of connections which do not require a backup path and blocks the other connections, i.e. connections that cannot meet their SLA target with only a provisioned working path. On the contrary, using the HTA approach, the percentage of unprotected connections is almost constant across different traffic loads. This comes from the fact that HTA is able to better assign the backup sharing during connection holding times and thus it protects a higher number of connections.

6.4. Availability gap

The relevant BP decrement shown in Table 1 can also be explained by looking at the reduction of the gap between the stipulated SLAs and the actual value of the availability A provided to connections in the AGP and HTA scenarios. In Table 2, we compare the average stipulated SLA (which is the NO objective and would be reached if all the connections i in the network were provided exactly with $A_i = SLA_i$) with the actual average availability provided by HTA ($A_{HTA}$) and AGP ($A_{AGP}$). It can be seen that the values for $A_{HTA}$ are much closer to the SLA target than those of $A_{AGP}$.

This means that HTA is able to give connections a level of service in terms of availability which is closer to that required by the customers, freeing backup capacity to be used for other connections. A second important aspect is that the degree of sharing of backup resources, for each connection, varies during the holding time of the connection itself for both AGP and HTA. In the case of HTA, the accurate redefinition of SLA targets allows us to provide connections with a fairer "amount of availability".

6.5. The SLA Violation Risk for SPP-protected connections

As presented in the rest of the paper, in traditional SPP schemes the provisioning of a working lightpath and a backup lightpath for an incoming connection is constrained by the conditions C.1–C.6. If the SLA Violation Risk concept is utilized instead of long-term availability, then conditions C.5bis and C.6bis have to be respected. Unfortunately, in an availability-guaranteed SPP scheme where an availability-trading technique is not adopted (such as AGP), the condition C.6bis may become very stringent, because the SLA-VR tends to grow significantly during the connection lifetime. In fact, intuitively, the shorter the residual holding time of an existing connection, the larger its SLA-VR becomes. So the "older" connections (with a short residual holding time) may lead to the blocking of new incoming connections, given that the NO is not able to guarantee, with a Prefixed Risk Probability PRP, the stipulated target availability SLA of these "older" connections. In other words, an SLA-VR-guaranteed provisioning scheme could be hardly applicable due to the increase of the value of SLA-VR for connections, unless some mechanism for availability trading is applied, as proposed in this paper.

Fig. 5. Blocking probability for AGP (a) and HTA (b).

Fig. 6. Percentage of violated SLA targets for SLA Class 3 (a) and unprotected connections percentage for SLA Class 2 (b): AGP vs. HTA.

Table 2. Actual availability supplied by AGP and HTA, compared to the target SLA.

| Arrival rate | 20 | 40 | 60 | 80 | 100 |
|---|---|---|---|---|---|
| Average stipulated SLA | 0.9963 | 0.9963 | 0.9963 | 0.9963 | 0.9963 |
| AGP | 0.99897 | 0.99896 | 0.99895 | 0.99895 | 0.99894 |
| HTA | 0.99882 | 0.99870 | 0.99862 | 0.99854 | 0.99848 |

To numerically support this consideration, in Fig. 7a we consider the variation of the value of SLA-VR over the entire holding time ($t_h^i = 2$ years, $MTTR_i = 12$ h) of a connection i having a long-term availability $A_i = 0.99897$ and a stipulated availability $AT_i = 0.9963$ (values taken from Table 2). As the percentage of expired holding time increases, the SLA-VR$_i$ tends to increase up to unacceptable values: at the beginning of its holding time, the connection has a violation risk of about 0.45%; at about 80% of the total holding time, the violation risk reaches about 24%. The sawtooth profile of the SLA-VR is related to the allowed number of failures x in Eq. (9): every time the value of x increases by one unit, the SLA-VR has a peak, which becomes very high close to the end of the holding time.

Fig. 7. SLA Violation Risk for increasing percentage of the expired holding time without (a) and with (b) availability trading and SLA target redefinition.

Let us now see how the value of SLA-VR depends on the percentage of expired holding time if we apply the availability-trading mechanism and we redefine the target $SLA_i$. In Fig. 7b we consider the SLA-VR$_i$ evolution for the same connection i with a fixed supplied availability $A_i = 0.99897$ and an initial stipulated availability $SLA_i = 0.9963$. For the sake of simplicity, we assume that the connection i is not affected by any failure during its holding time. The starting value of SLA-VR$_i$ is equal to 0.45%, as in the previous case; by applying the availability trading and the SLA target redefinition, the SLA-VR$_i$ now decreases very rapidly. For this reason, the NO would be able to reduce the provided availability $A_i$, while still guaranteeing an acceptable SLA-VR$_i$ ≤ PRP during the entire connection lifetime. The SLA target redefinition in SLA-VR scenarios leads to similar benefits also in the NO debit case. For the sake of conciseness, we do not report other results here.

7. Conclusion

In this paper we have proposed a new methodology for availability-guaranteed Shared Path Protection that allows us to "trade" availability "credits" and "debits" among various connections in a dynamic environment, by increasing or decreasing the shareability level of the shared backup capacity. We have shown that our approach allows us to obtain relevant improvements on blocking probability and reduces SLA target violations, by appropriately updating the availability targets according to changes in the sharing degree of the backup resources due to new connection arrivals or to connection departures. We have also analytically demonstrated how our approach can be applied directly to the SLA Violation Risk metric, instead of using the availability metric.

Acknowledgments

This work has been partially funded by the Italian Education Ministry within the Project BESOS (Bandwidth efficiency and Energy Saving by sub-lambda Optical Switching). Preliminary versions of the content of this paper were presented at the conferences OFC'08 and DRCN'09.

Appendix A. Availability evaluation for an SPP connection

We provide in this appendix the analytical formulation to evaluate the availability of an SPP connection.

In general, availability is the probability of a repairable system to be in an operating state. Failures and down states occur, but maintenance or repair actions always return the system to an operating state. The basic equation for the availability of a system with constant failure rate $\lambda$ and repair rate $\mu$ is:

$$A = \frac{MTBF}{MTBF + MTTR} = \frac{\mu}{\mu + \lambda}, \qquad (A.1)$$

where A is the availability, $MTBF = 1/\lambda$ is the mean time between two consecutive failures, and $MTTR = 1/\mu$ is the mean time to repair [19]. According to [30] we consider failure-immune nodes and we focus only on link failures. Then, let $A_e$ denote the availability of the link e.

The availability of the working and the backup paths can be individually computed. Let $l_w$ and $l_b$ denote the set of links used by the working path and the backup path, respectively. Then, the availabilities of the paths are given by the following equations:

$$A_{l_w} = \prod_{e \in l_w} A_e, \qquad (A.2)$$

$$A_{l_b} = \prod_{e \in l_b} A_e. \qquad (A.3)$$

Since SPP provides 100% restorability on single failures, let us consider the effects of double failures on a connection to evaluate the availability [31]: the two additional parameters $z_e^{(e',e'')}$ and $\Delta_e^{(e',e'')}$ will allow us to identify the links that cause a resource conflict with links belonging to the working path under the assumption that (i) two concurrent failures are affecting the network and (ii) one of these two failures is affecting the working path. $z_e^{(e',e'')}$ represents the number of working paths that cross the two links e' and e'', and whose backup path contains e. The parameter $\Delta_e^{(e',e'')}$ denotes the number of wavelengths that would be required on link e in order to fully restore the traffic on link e even if e' and e'' fail:

$$\Delta_e^{(e',e'')} = \nu_e^{e'} + \nu_e^{e''} - z_e^{(e',e'')}. \qquad (A.4)$$



In other words, if $\Delta_e^{(e',e'')}$ is larger than $\nu_e^*$, and if e' and e'' fail, then some demands cannot be restored on e, because of insufficient backup bandwidth. Now, for each connection, we can define $S_c = \{ e'' \mid \exists \{ e' \in l_w,\ e \in l_b \} \wedge \Delta_e^{(e',e'')} > \nu_e^* \}$ as the set of links e'' whose failure causes a conflict on link e if two concurrent failures occur on e' and e''. $S_c$ is formed by a series of links which are all disjoint from the backup and the working path of the connection.

In summary, if $A_{S_c} = \prod_{e \in S_c} A_e$, the following equation can be applied to evaluate the availability of an SPP connection:

$$A \approx A_{l_w} + A_{l_b} A_{S_c} - A_{l_w} A_{l_b} A_{S_c}. \qquad (A.5)$$
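Eqs. (A.2), (A.3) and (A.5) can be combined in a small helper, sketched below under the assumption that the link availabilities and the conflict set $S_c$ are already known. The function names are hypothetical, and the link availability of 0.999 is the value used for the simulations of Section 6, applied here to an arbitrary one-hop working path and three-hop backup path.

```python
import math


def path_availability(link_availabilities):
    """Eqs. (A.2)/(A.3): product of the availabilities of the path links."""
    return math.prod(link_availabilities)


def spp_availability(working, backup, conflict=()):
    """Eq. (A.5): availability of a shared-path-protected connection.

    working, backup -- link availabilities along the two paths
    conflict        -- availabilities of the links in the conflict set S_c
                       (empty when the backup capacity is not shared)
    """
    a_w = path_availability(working)
    a_b = path_availability(backup)
    a_sc = math.prod(conflict)   # an empty product equals 1
    return a_w + a_b * a_sc - a_w * a_b * a_sc


LINK = 0.999
dedicated = spp_availability([LINK], [LINK] * 3)
shared = spp_availability([LINK], [LINK] * 3, conflict=[LINK])
print(dedicated > shared)   # True: sharing lowers the availability
```

With an empty conflict set the expression reduces to the usual parallel combination of working and backup paths, so the helper also covers the dedicated case reached after the sharing connection departs.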

References

[1] C. Pinart, G.J. Giralt, On managing optical services in future control-plane-enabled IP/WDM networks, Journal of Lightwave Technology23 (2005) 2868–2876.

[2] L. Song, J. Zhang, B. Mukherjee, Dynamic provisioning withavailability guarantee for differentiated services in survivable meshnetworks, IEEE Journal on Selected Areas in Communications 25(2007) 35–43.

[3] A. Iselt, A. KirstSdter, R. Chahine, The role of ASON and GMPLS for thebandwidth trading market, in: Proc. 1st International Conference onE-business and Telecomm. Networks (ICETE2004), 2004.

[4] Z. Pandi, M. Tacca, A. Fumagalli, L. Wosinska, Dynamic provisioningof availability-constrained optical circuits in the presence of opticalnode failures, IEEE/OSA Journal of Lightwave Technology 24 (2007)3268–3279.

[5] Z. Ling, W.D. Grover, A Theory for Setting the ‘‘Safety Margin’’ onAvailability Guarantees in an SLA, DRCN 2005, October 2005.

[6] D. Mello, H. Waldman, G.S. Quiterio, Interval availability estimationfor protected connections in optical networks, Computer Networks55 (2011) 193–204.

[7] C.M.M. Xia, M. Tornatore, B. Mukherjee, Risk-aware provisioning foroptical WDM Mesh Networks, IEEE/ACM Transactions onNetworking 19 (2011) 921–931.

[8] Canhui Ou, Jing Zang, Hui Zang, L.H. Sahasrabuddhe, BiswanathMukherjee, New and improved approaches for shared-pathprotection in WDM mesh networks, IEEE Journal of LightwaveTechnology 22 (2004) 1223–1232.

[9] J.Y. Yen, Finding K shortest loopless paths in a network, ManagementScience (1971) 712–716.

[10] G. Li, D. Wang, C. Kalmanek, R. Doverspike, Efficient distributed pathselection for shared restoration connections, in: Proc. IEEEINFOCOM, June 2002, pp. 140–149.

[11] M. Tornatore, C. (Sam) Ou, J. Zhang, A. Pattavina, B. Mukherjee,PHOTO: an efficient shared-path protection strategy based onconnection-holding-time awareness, IEEE/OSA Journal onLightwave Technology 23 (2005) 3138–3146.

[12] J. Zhang, K. Zhu, H. Zang, B. Mukherjee, A new provisioningframework to provide availability-guaranteed service in WDMmesh networks, in: Proc. IEEE ICC’03, May 2003, pp. 1484–1488.

[13] D.A.A. Mello, J.U. Pelegrini, R.P. Ribeiro, D.A. Schupke, H. Waldman,Dynamic provisioning of shared-backup path protected connectionswith guaranteed availability requirements, International Workshopon Guaranteed Optical Service Provisioning (GOSP), October 2005.

[14] Y. Li, Q. Qiu, L. Li, Availability-aware routing in optical networks withprimary-backup sharing, in: Proc. High Performance Switching andRouting (HPSR’08), 2008.

[15] Q. Guo, P.-H. Ho, A. Haque, H.T. Mouftah, Availability-constrainedshared backup path protection (SBPP) for GMPLS-based sparecapacity reconfiguration, in: Proc. ICC’07, 2007.

[16] B. Kantarci, H.T. Mouftah, S. Oktug, Adaptive schemes fordifferentiated availability-aware connection provisioning in opticaltransport networks, OSA Journal of Optical Networking 27 (2009)4595–4602.

[17] C. Cavdar, L. Song, M. Tornatore, B. Mukherjee, Holding-time-awareand availability-guaranteed connection provisioning in opticalWDM mesh networks, in: International Symposium High CapacityOptical Networks and Enabling Technologies, HONET, November2007.

[18] J. Zhang, K. Zhu, H. Zang, N.S. Matloff, B. Mukherjee, Availability-aware provisioning strategies for differentiated protection services

in wavelength-convertible WDM mesh networks, IEEE/ACMTransactions on Networking 15 (2007) 1177–1190.

[19] M. Tornatore, G. Maier, A. Pattavina, Availability design of opticaltransport network, IEEE Journal on Selected Areas inCommunications 23 (2005) 1520–1532.

[20] D.A.A. Mello, D.A. Schupke, H. Waldman, A matrix-based analytical approach to connection unavailability estimation in shared backup path protection, IEEE Communications Letters 9 (2005) 35–43.

[21] L. Zhou, M. Held, U. Sennhauser, Connection availability analysis of shared backup path-protected mesh networks, IEEE/OSA Journal of Lightwave Technology 25 (2007) 1111–1119.

[22] J. Segovia, E. Calle, P. Villa, J. Marzo, J. Tapolcai, Topology-focused availability analysis of basic protection schemes in optical transport networks, OSA Journal of Optical Networking 7 (2008) 351–364.

[23] M. Tornatore, D. Lucerna, B. Mukherjee, A. Pattavina, Multilayer protection with availability guarantees in WDM optical networks, Journal of System and Network Management (Springer) 20 (2012) 34–55.

[24] R. Clemente, M. Bartoli, M. Bossi, G. D’Orazio, G. Cosmo, Risk management in availability SLA, in: Proc. of Conference on Design of Reliable Communication Network (DRCN05), Ischia, Italy, 2005.

[25] M. Xia, J. Choi, T. Wang, Risk Assessment in SLA-Based WDM Backbone Networks, OFC 2009, March 2009.

[26] P. Cholda, A. Mykkeltveit, B. Helvik, A. Jajszczyk, Continuity-based resilient communication, in: Proc. of Conference on Design of Reliable Communication Network (DRCN09), Washington, DC, USA, 2009.

[27] D. Mello, G.S. Quiterio, H. Waldman, D.A. Schupke, Specification of SLA survivability requirements for optical path protected connections, in: Proc. of Optical Fiber Communication Conference (OFC’06), 2006.

[28] K. Jong, K. Shin, Performance evaluation of dependable real-time communication with elastic QoS, in: Proc. of International Conference on Dependable Systems and Networks (DSN), 2001, pp. 295–303.

[29] M. Tornatore, D. Lucerna, L. Song, B. Mukherjee, A. Pattavina, Dynamic SLA Redefinition for Shared-Path-Protected Connections with Known Duration, OFC 2008, February 2008.

[30] L. Jereb, T. Jakab, F. Unghvary, Availability analysis of multi-layer optical networks, Optical Networks Magazine (2002) 84–94.

[31] J. Doucette, M. Clouqueur, W.D. Grover, On the availability and capacity requirements of shared backup path-protected mesh networks, Optical Networks Magazine 4 (2003) 29–44.

Diego Lucerna received a Ph.D. in Information Engineering in 2011 from Politecnico di Milano. His research interests include switching technologies, network equipment, telematic applications and management of public and private networks. He is currently employed at Huawei Technologies Italia as a "Customer Support Optical Engineer" for WDM, SDH and microwave systems.

Massimo Tornatore (S’03-M’07) received the Ph.D. degree in information engineering from the Polytechnic University of Milan, Milan, Italy, in May 2006. He is currently an Assistant Professor with the Department of Electronics and Information, Polytechnic University of Milan. From 2007 to 2009, he was a Post-Doctoral Researcher with the University of California, Davis, where he is still collaborating as a Visiting Researcher. He is coauthor of more than 80 conference and journal papers. His research interests include design, energy efficiency, traffic grooming in optical networks, and group communication security.


Biswanath Mukherjee is a Distinguished Professor at University of California, Davis, where he has been since 1987, and served as Chairman of the Department of Computer Science during 1997 to 2000. He received the B.Tech. (Hons) degree from Indian Institute of Technology, Kharagpur, in 1980, and the Ph.D. degree from University of Washington, Seattle, in 1987. He served as Technical Program Co-Chair of the Optical Fiber Communications (OFC) Conference 2009, and General Co-Chair of OFC 2011. He served as the Technical Program Chair of the IEEE INFOCOM ’96 conference. He is Editor of Springer’s Optical Networks Book Series. He serves or has served on the editorial boards of eight journals, most notably IEEE/ACM Transactions on Networking and IEEE Network. He was the Founding Steering Committee Chair of the IEEE Advanced Networks and Telecom Systems (ANTS) Conference, and served as General Co-Chair of ANTS in 2007 and 2008. He is co-winner of the Optical Networking Symposium Best Paper Awards at the IEEE Globecom 2007 and IEEE Globecom 2008 conferences. He won the 2004 UC Davis Distinguished Graduate Mentoring Award. He also won the 2009 UC Davis College of Engineering Outstanding Senior Faculty Award. To date, he has supervised to completion the Ph.D. Dissertations of 48 students, and he is currently supervising approximately 15 Ph.D. students and research scholars. He is author of the textbook "Optical WDM Networks" published by Springer in January 2006. He served a 5-year term as a Founding Member of the Board of Directors of IPLocks, Inc., a Silicon Valley startup company. He has served on the Technical Advisory Board of a number of startup companies in networking, most recently Teknovus (acquired by Broadcom), Intelligent Fiber Optic Systems, and LookAhead Decisions Inc. (LDI). He is a Fellow of the IEEE.

Achille Pattavina received his Dr. Eng. degree in Electronic Engineering from the University La Sapienza of Rome, Rome, Italy, in 1977. He was with the University La Sapienza of Rome until 1991, when he moved to the Politecnico di Milano, Milan, Italy, where he is now a Full Professor. He has been the author of more than 100 papers in the area of communications networks published in leading international journals and conference proceedings. He has authored two books, Switching Theory, Architectures and Performance in Broadband ATM Networks (Wiley, 1998) and Communication Networks (McGraw-Hill, 1st edn. 2002, 2nd edn. 2007, in Italian). He has been engaged in many research activities, including European Union funded projects. His current research interests are in the areas of optical switching and networking, traffic modeling and multi-layer network design. Dr. Pattavina has been Guest or Co-Guest Editor of special issues on switching architectures in IEEE and non-IEEE journals. He has been an Editor for switching architecture performance for IEEE Transactions on Communications since 1994, and Editor-in-Chief of European Transactions on Telecommunications since 2001.