
Backup Reprovisioning After Shared Risk Link Group (SRLG) Failures in WDM Mesh Networks

Xu Shao, Yong Kee Yeo, Yuebin Bai, Jian Chen, Luying Zhou, and Lek Heng Ngoh

Abstract—In this paper, we study the impact of shared risk link group (SRLG) failures on shared-path protection by examining the percentage of connections that are vulnerable after SRLG failures, investigate the benefits of backup reprovisioning after SRLG failures, and evaluate different policies for backup reprovisioning. Compared with single-link failures, SRLG failures leave many more connections unprotected and vulnerable to the next failures and make the network topology much sparser. The major challenge of backup reprovisioning after SRLG failures is how to find SRLG-disjoint backup paths for those unprotected connections with a recovery ratio that is as high as possible within reasonable computational complexity. We are motivated to consider three reprovisioning policies by considering different sequences of reprovisioning according to the degree of SRLG constraints. The first policy is to reprovision backup paths for connections whose working paths traverse more SRLGs first (Policy I), and the second policy is to reprovision backup paths for connections whose working paths traverse fewer SRLGs first (Policy II). The third policy is to do backup reprovisioning randomly, i.e., we pick up an unprotected working path randomly (random reprovisioning). Extensive simulation results show that 1) SRLG failures will leave more connections unprotected compared with single-link failures, and the percentage of connections left vulnerable tends to be proportional to the SRLG size; 2) the network performance based on the first reprovisioning policy always performs best in recovery ratio; and 3) the network performance based on Policy II even underperforms the random reprovisioning. These results can be explained by the fact that connections whose working paths traverse fewer SRLGs are more flexible in finding SRLG-disjoint backup paths, and thus priority given to connections whose working paths traverse more SRLGs in Policy I can significantly improve the recovery ratio.

Index Terms—Shared-path protection; Backup reprovisioning; Shared risk link group (SRLG); Survivability; WDM networks.

I. INTRODUCTION

Failures in a wavelength-division multiplexing (WDM) system will cause severe service loss. Survivability is the capability of a network to maintain service continuity in the presence of faults within the network. Generally there are two types of fault management techniques: protection and restoration. Protection refers to a proactive procedure in which spare capacity is reserved in network planning and connection establishment. Restoration is a reactive procedure in which the spare capacity is discovered dynamically to restore the affected services; that is, the resources used for recovery are not reserved in advance. Compared with protection, restoration is much more capacity efficient, but usually has a longer recovery time [1]. In a wavelength-routed WDM network, a connection between a source node and a destination node is referred to as a lightpath. The lightpath used to carry traffic during normal operation is known as a working lightpath. The connection is switched over a backup lightpath after the working lightpath is affected by a failure. Protection can be implemented on a per-link basis, a per-path basis, or a per-segment basis. In path protection, a backup lightpath is chosen based on the source and destination of the request [1,2]. Backup reprovisioning tries to use restoration to further enhance the survivability of protection by reprovisioning backup paths for connections that become unprotected and vulnerable to the next failures immediately after link or node failures.

Manuscript received January 15, 2010; revised May 27, 2010; accepted June 2, 2010; published July 27, 2010 (Doc. ID 122832).
Xu Shao (e-mail: [email protected]), Yong Kee Yeo, Luying Zhou, and Lek Heng Ngoh are with the Institute for Infocomm Research, 1 Fusionopolis Way, #21-01 Connexis, Singapore 138632.
Yuebin Bai is with Beihang University, Beijing, China 100191.
Jian Chen is with Nanjing University of Posts and Telecommunications, Nanjing, China 210003.
Digital Object Identifier 10.1364/JOCN.2.000587
Shao et al. VOL. 2, NO. 8/AUGUST 2010/J. OPT. COMMUN. NETW. 587
1943-0620/10/080587-13/$15.00 © 2010 Optical Society of America


A. Shared-Path Protection and Impact of SRLG Failures

Shared risk link group (SRLG) refers to situations where links in a network share a common physical attribute, for example, cable, conduit, duct, node, or substructure [3–10]. Illustrative examples of an optical layer and a physical layer of optical networks with SRLGs are shown in Figs. 1(a) and 1(b), respectively, where the optical layer topology and the physical layer topology are heterogeneous. Links in a SRLG have a shared possibility to fail simultaneously. A link is referred to as the SRLG mate of another one if both of them belong to the same SRLG. A single SRLG failure will cause failures of all the links in the same SRLG. The notable characteristic of SRLG is that a given link could belong to more than one SRLG. AT&T’s experience indicates that a link may belong to more than 100 SRLGs, each corresponding to a separate fiber group [3]. The size of an SRLG refers to the number of links the SRLG contains, which is used to describe the extent of compromise [3]. A physical network may contain many SRLGs. Each SRLG represents a type of compromise [3]. For example, in Fig. 1(b), the 14-node NSFNet topology network has 4 SRLGs. SRLG3 = {B–G, E–F}, and its size is 2. Link B–G belongs to four SRLGs. B–C is the SRLG mate of B–G since both are in SRLG1.

In shared-path protection against SRLG failures, the backup lightpath and the working lightpath must satisfy SRLG-disjoint constraints, i.e., the working path and its backup path cannot go through the same SRLG. If two working paths go through the same SRLG, their backup paths cannot share wavelengths [5–8]. Figure 1(c) shows two connections with path protection to provide 100% SRLG failure protection. Figure 1(f) exemplifies backup sharing when two working paths are SRLG-disjoint. Path protection with SRLG constraints may have a severe trap problem [5,6], in which the backup path cannot be found due to physical or algorithm constraints. Traps can be classified into avoidable traps and real traps. As a result, the blocking probability of shared-path protection with SRLGs becomes much higher compared with the scenario without SRLGs.
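The SRLG-disjoint constraint and the backup-sharing rule can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `risk_groups`, `srlg_disjoint`, and `backups_may_share` are hypothetical, and only the memberships of links B–G, B–C, and E–F follow the Fig. 1(b) description above; link A–D is an invented placeholder.

```python
from typing import Dict, Iterable, Set, Tuple

Link = Tuple[str, str]  # unidirectional link (from_node, to_node)

# Memberships of B-G, B-C and E-F follow the text; A-D is a hypothetical
# link that belongs to no SRLG.
SRLG_OF: Dict[Link, Set[str]] = {
    ("B", "G"): {"SRLG1", "SRLG2", "SRLG3", "SRLG4"},
    ("B", "C"): {"SRLG1"},
    ("E", "F"): {"SRLG3"},
    ("A", "D"): set(),
}


def risk_groups(path: Iterable[Link], srlg_of: Dict[Link, Set[str]]) -> Set[object]:
    """Collect every SRLG a path traverses, plus the links themselves.

    Treating each link as its own singleton group makes link-disjointness
    a special case of SRLG-disjointness.
    """
    groups: Set[object] = set()
    for link in path:
        groups.add(link)
        groups |= srlg_of.get(link, set())
    return groups


def srlg_disjoint(path_a: Iterable[Link], path_b: Iterable[Link],
                  srlg_of: Dict[Link, Set[str]]) -> bool:
    """True if the two paths share no link and no SRLG."""
    return risk_groups(path_a, srlg_of).isdisjoint(risk_groups(path_b, srlg_of))


def backups_may_share(working_a: Iterable[Link], working_b: Iterable[Link],
                      srlg_of: Dict[Link, Set[str]]) -> bool:
    """Two backups may share a wavelength only if their working paths are SRLG-disjoint."""
    return srlg_disjoint(working_a, working_b, srlg_of)
```

With this data, a backup over B–C cannot protect a working path over B–G, because both links are in SRLG1.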

B. Related Work and Motivation

Related research in the literature includes simultaneous multiple-link failures, especially double-link failures [11–13], SRLG failures [5–10,14], near-simultaneous link failures [15,16], reprovisioning after single-link failures, enhancement of survivability of single-link failure protection, backup path reoptimization after link-state updates [17,18], and so on. As networks grow in size and complexity, both the likelihood and impact of double-link failures increase.

Fig. 1. (Color online) Illustrative example of SRLGs in the 14-node NSFNet topology, impact of SRLG failures on shared-path protection, and backup reprovisioning.

Double-link failures may occur simultaneously (double-link failures due to SRLG failures and arbitrary simultaneous double-link failures [11,12,19]) or one by one (near-simultaneous failures) [20,21]. One-by-one link failures assume that the second link failures occur after the first link failures are recovered through an algorithm but before they are physically repaired [15,16]. The authors of [3] reported that fiber cuts occur at a rate of 4.39 cuts/1000 sheath miles/year and that it takes around 12 hours to fix a fiber failure. Therefore, reconfiguration [14] or reprovisioning of backup paths after failures or link state changes can significantly improve network survivability. Backup reprovisioning is an important and active area of research and has been studied widely in the context of single-link failures. The authors in [21] are motivated by the need for considering the network vulnerability subject to double-link failures when planning a shared path for 100% single-link failure protection for each connection request. Furthermore, the authors in [22,23] studied the percentage of vulnerable connections after single-link failures, restorability, and the recovery ratio of dedicated-path protection and shared-path protection. Unlike restoration, in which service will be disrupted, reprovisioning further enhances the working path after it becomes unprotected. Recent work in [13,15–17,24] presented the idea of backup reprovisioning for connections that become unprotected or vulnerable to the next failures immediately after single-link failures. The authors in [18,25] further investigate the benefits of backup reprovisioning after network state updates including 1) the arrival of new connection requests, 2) the departure of existing connections, 3) network failures, and 4) the repair of a failed network component. The p-cycle reconfiguration for enhanced dual-failure restorability was presented in [13]. The simulation results in [15] confirmed that backup sharing makes the backup reprovisioning problem even worse by leaving more connections vulnerable to the next failures. The authors in [26] analytically evaluate how global backup reprovisioning (GBR) computational time may impact the connection provisioning process and propose algorithms to significantly decrease the computational time while still achieving very high resource efficiency. In addition to backup reprovisioning after single-link failures, the authors in [27] propose to assign each connection one working path, one backup path, and multiple subbackup paths to improve the survivability for multilink failures in WDM mesh networks. Compared with shared-path reprovisioning, the proposed scheme can significantly improve survivable performance for multilink failures, but it is less capacity efficient due to the use of multiple subbackup paths for the connection.

Apart from survivability enhancement, reconfiguration of connections including working paths and backup paths can improve network performance [22,28], especially for dynamic traffic. As the network and the traffic evolve, the lightpaths of the existing demands become suboptimal. If both the working path and the backup path can be reoptimized, substantially larger bandwidth savings can be achieved [28]. The authors in [29] proved that the problem of backup reprovisioning for all the lightpaths requiring shared-path protection under a current network state is NP-complete. Therefore, for the reprovisioning algorithm, it is necessary to balance between capacity efficiency and execution speed [20,26,29–31].

Backup reprovisioning is a reactive method, while double or multiple failure protection is a proactive method. In addition to backup reprovisioning to address near-simultaneous failures, another rival technology is to use double- or multiple-link failure protection, which can also address near-simultaneous link failures. Its benefit is that there is no need to do backup reprovisioning after the single-link failures. To achieve 100% protection against arbitrary simultaneous double-link failures with shared-path protection, two link-disjoint backup paths need to be precomputed for the working path [11,12]. However, generally, path protection against simultaneous double-link failures is not as capacity efficient as the scenario against single-link failures.

Compared with single-link failures, SRLG failures will leave many more connections vulnerable to the next failures. For example, Fig. 1(c) shows two connections with path protection. Figure 1(d) shows that SRLG3 failures will lead to the failures of links E–F and B–G, and meanwhile, WP1 and BP2 will be affected correspondingly. In view of the severe impact of SRLG failures, in this study, we are motivated by the needs of backup reprovisioning after SRLG failures. We argue that the existing backup reprovisioning methods after single-link failures may no longer be efficient due to different patterns and the number of failures caused by SRLG failures. Backup reprovisioning after SRLG failures differs from backup reprovisioning after single-link failures as follows:

1) Single-link failures only cause the connections going through the link to fail. In contrast, single SRLG failures will cause all the connections going through the SRLG to fail. As a result, depending on the size of the SRLG, generally, a greater percentage of connections needs to be reprovisioned with backup paths after SRLG failures.

2) SRLG failures make the network topology sparser than single-link failures do, particularly when the failed SRLG contains more links. In a sparser network, it is more challenging to find a new backup path satisfying SRLG-disjoint constraints.

3) Due to the complexity of SRLG constraints and more connections that need to be provided with backup paths, backup reprovisioning after SRLG failures is an even more complex problem and requires more efficient algorithms.

In summary, backup reprovisioning after SRLG failures is an effective and capacity-efficient way to prevent disruption of service by the next failures after SRLG failures. We are motivated to study backup reprovisioning after SRLG failures [32]. We argue that the existing backup reprovisioning methods after single-link failures are not suitable for backup reprovisioning after SRLG failures. Reprovisioning after SRLG failures is more challenging and complex than reprovisioning after single-link failures. For the reprovisioning algorithm, it is essential to balance among the recovery ratio, capacity efficiency, and computational complexity. The method proposed in [27] may not work well in the scenario of SRLGs due to its capacity inefficiency. Although much previous work has concentrated on SRLG failure protection and backup reprovisioning after single-link failures, to the best of our knowledge, no effort has gone into the detailed study of backup reprovisioning after SRLG failures.

Our contributions are as follows: 1) We investigate the percentage of vulnerable connections after SRLG failures under various network, SRLG, and traffic parameters; 2) we formulate the percentage of vulnerable connections after SRLG failures when using DPP; 3) we present and evaluate several backup reprovisioning schemes after SRLG failures; 4) we define the backup path availability index (BPAI) according to the number of SRLGs that the working path goes through, which can be used to choose the sequence of backup reprovisioning; and 5) we present extensive experimental results to show that the network performance based on the policy of giving priority to working paths with large BPAI always performs best in recovery ratio.

C. Organization of This Paper

The rest of this paper is organized as follows. Section II first states the problem, describes backup reprovisioning procedures, and then presents policies and respective heuristics for backup reprovisioning after SRLG failures. Section III presents extensive simulation results on the performance metrics of the percentage of vulnerable connections after SRLG failures and the recovery ratio with various network and traffic parameters. Section IV concludes the paper.

II. BACKUP REPROVISIONING AFTER SRLG FAILURES

In this section, we formally state the problem, and then present policies and heuristic algorithms for backup reprovisioning after SRLG failures.

A. Problem Statements

We first define the notations and then formally state the problem. A network is represented as a weighted bidirectional graph $G=(V,E)$, where $V=\{n\}$ is the set of network nodes and $E=\{l_{i,j}=1\}$ is the set of physical links. Hence, the number of links in the network will be $\sum_{i,j} l_{i,j}$. Let $l_{i,j,\lambda}$ be the $\lambda$th wavelength of $l_{i,j}$. In this study, we assume that the weight on each link is 1 and the number of wavelength channels per fiber in each direction is $W$. The wavelength is assumed to be unconvertible.

Let SRLGs $=\{SRLG_1; SRLG_2; \ldots; SRLG_k; \ldots; SRLG_M\}$, where $SRLG_k=\{l_{i,j}\}$ is the set of links belonging to the same SRLG and $M$ represents the total number of SRLGs in the network. Let $S$ denote the size of an SRLG; for simplification, in this study we always assume all the SRLGs in the network have the same size. Thus we use $(S \times M)$ to denote a network that has $M$ SRLGs and each group contains $S$ unidirectional links.¹ The formation of SRLGs is very flexible; for example, $l_{i,j}$ may belong to several SRLGs, and $l_{i,j}$ and $l_{j,i}$ may not belong to the same SRLG. Let $B_{i,j}$ be the number of SRLGs that $l_{i,j}$ belongs to.

We analyze backup reprovisioning after SRLG failures with static traffic (offline traffic) [8] first, in which the problem is relatively easier to be formulated than that with dynamic traffic (online traffic) [6,7]. The $k$th connection request from source node $s$ to destination node $d$ is represented as $c_{s,d,k}$. In this study, every connection needs to be protected with shared-path protection against SRLG failures. Let $P_w^{s,d,k}$ and $P_b^{s,d,k}$ denote the working path and the backup path of $c_{s,d,k}$, respectively. Let $P_{w,i,j,\lambda}^{s,d,k}=1$ if the $\lambda$th wavelength of the link between node $i$ and node $j$ is used by working path $P_w^{s,d,k}$, else $P_{w,i,j,\lambda}^{s,d,k}=0$. Similarly, let $P_{b,i,j,\lambda}^{s,d,k}=1$ if the $\lambda$th wavelength of the link between node $i$ and node $j$ is used by backup path $P_b^{s,d,k}$, else $P_{b,i,j,\lambda}^{s,d,k}=0$. Let $b$ be the blocking ratio, assuming that not every connection request can be accommodated due to resource and SRLG constraints. Thus the number of connections that have been accepted successfully can be calculated as $b \cdot \sum_{s,d,k} c_{s,d,k}$. Hence, the average length of the working path is $\sum_{s,d,k,i,j,\lambda} P_{w,i,j,\lambda}^{s,d,k} / (b \cdot \sum_{s,d,k} c_{s,d,k})$ and the average length of the backup path is $\sum_{s,d,k,i,j,\lambda} P_{b,i,j,\lambda}^{s,d,k} / (b \cdot \sum_{s,d,k} c_{s,d,k})$.

¹ It is notable that a real-world backbone network has different sized SRLGs. So far, there are no reports on how the size of an SRLG is distributed. It is difficult to set a reasonable range rather than a fixed size. Therefore, the fixed size of the SRLG defined in this paper can be generally treated as the average size of SRLGs.


In addition to the common constraints from the routing and wavelength assignment (RWA) problems [1,2] in a wavelength-continuity WDM mesh network, assuming that $P_w^{s,d,k}$ and $P_w^{s',d',k'}$ denote the working paths for two different connection requests and $P_b^{s,d,k}$ and $P_b^{s',d',k'}$ denote the respective backup paths, the working paths and backup paths of shared-path protection against SRLG failures are further subject to the following constraints:

C.1 $P_w^{s,d,k}$ and $P_b^{s,d,k}$ are SRLG-disjoint (note that link-disjoint is a special case of SRLG-disjoint, where the SRLG only contains one link).

C.2 $P_w^{s,d,k}$ and $P_w^{s',d',k'}$ cannot share any wavelength.

C.3 $P_w^{s,d,k}$ cannot share any wavelength with $P_b^{s',d',k'}$.

C.4 $P_b^{s,d,k}$ and $P_b^{s',d',k'}$ can share a wavelength on a common link if and only if $P_w^{s,d,k}$ and $P_w^{s',d',k'}$ are SRLG-disjoint.

After the first SRLG failures, due to SRLG failure protection, services will not be disrupted, but some of them become unprotected. The scenarios of connections vulnerable to the next possible failures after SRLG failures are as follows:

1) If SRLG failures affect the backup path of a connection, its working path will be vulnerable. That is, after $l_{i,j}$ fails due to the failures of the SRLG that it belongs to, if $\exists\, s,d,k$: $P_{b,i,j,\lambda}^{s,d,k}=1$, then $P_w^{s,d,k}$ will be vulnerable. An example is that the backup path of connection 2 is affected by failures of SRLG3, and its working path is unprotected, as shown in Fig. 1(d).

2) If SRLG failures affect the working path of a connection, its backup path will become the new working path. The new working path will be vulnerable. That is, after $l_{i,j}$ fails due to the failures of the SRLG that it belongs to, if $\exists\, s,d,k$: $P_{w,i,j,\lambda}^{s,d,k}=1$, then $P_b^{s,d,k}$ will be vulnerable. An example is that the working path of connection 1 is affected by failures of SRLG3, and its working path now (previous backup path) is unprotected, as shown in Fig. 1(d).

3) If connection B is sharing backup wavelengths with connection A and SRLG failures affect the working path of connection A, both the previous backup path of connection A and the working path of connection B will be vulnerable. This is because connection A’s switching to the backup path will block the backup path of connection B, leaving connection B’s working path unprotected. That is, after $l_{i,j}$ fails due to the failures of the SRLG that it belongs to, if $\exists\, s,d,k$: $P_{w,i,j,\lambda}^{s,d,k}=1$, then $P_b^{s,d,k}$ is vulnerable. Meanwhile, if $\exists\, s',d',k',i',j',\lambda'$: $P_{b,i',j',\lambda'}^{s,d,k}=1$ and $P_{b,i',j',\lambda'}^{s',d',k'}=1$, then $P_w^{s',d',k'}$ will be vulnerable. The examples of this scenario are shown in Figs. 1(f) and 1(g).
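The three scenarios can be traced with a small Python sketch. It is an illustration under assumptions rather than the authors' procedure: the `Connection` class, the function name, and the example node and wavelength values in the usage note are hypothetical, and a path is modeled simply as a list of (i, j, wavelength) channels.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

Link = Tuple[str, str]
Channel = Tuple[str, str, int]  # (node i, node j, wavelength index)


@dataclass
class Connection:
    name: str
    working: List[Channel]
    backup: List[Channel]


def vulnerable_after_failure(conns: Iterable[Connection],
                             failed_links: Set[Link]) -> Set[str]:
    """Return the names of connections left unprotected by the failed links."""
    conns = list(conns)

    def hit(path: List[Channel]) -> bool:
        return any((i, j) in failed_links for i, j, _ in path)

    vulnerable: Set[str] = set()
    now_in_use: Set[Channel] = set()  # backup channels taken over by switched traffic
    for c in conns:
        if hit(c.backup):       # scenario 1: backup path lost
            vulnerable.add(c.name)
        if hit(c.working):      # scenario 2: traffic switches to the backup path
            vulnerable.add(c.name)
            now_in_use |= set(c.backup)
    for c in conns:             # scenario 3: a shared backup channel is now occupied
        if c.name not in vulnerable and now_in_use & set(c.backup):
            vulnerable.add(c.name)
    return vulnerable
```

For instance, if connection A works over E–F and shares one backup channel with connection B, a failure of SRLG3 = {B–G, E–F} marks both A and B as vulnerable, together with any connection whose backup path uses B–G.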

To measure the impact of SRLG failures and the performance of backup reprovisioning after SRLG failures, we focus on the following two performance metrics:

1) Percentage of vulnerable connections: This is to measure the percentage of vulnerable connections after SRLG failures and is given by

$$\text{Percentage of vulnerable connections} = \frac{\text{the number of unprotected connections}}{\text{the total number of connections in the network}} \quad (1)$$

2) Recovery ratio: This is to measure the percentage of connections that have been affected by SRLG failures but have been successfully reprovisioned with backup paths. It is given by

$$\text{Recovery ratio} = \frac{\text{the number of connections that have been successfully reprovisioned with backup paths}}{\text{the total number of unprotected connections}} \quad (2)$$

The problem of backup reprovisioning after SRLG failures can be formally stated as follows: Given the current network state and traffic distribution, reprovision those connections that become unprotected after SRLG failures with a recovery ratio that is as high as possible. Meanwhile, the trade-off to capacity efficiency should be minimal. Since the working path always carries traffic, we focus on the scenario in which only backup paths are reprovisioned, without reconfiguring existing working paths. In fact, the problem of backup reprovisioning for all the lightpaths requiring shared-path protection under a current network state after SRLG failures is NP-complete, because a special case of this problem, namely backup reprovisioning for all the lightpaths requiring shared-path protection under a current network state after single-link failures, has been proved to be NP-complete [29]. Since ILPs are too time and space intensive, we focus on efficient heuristics to solve this problem.
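The two performance metrics of Eqs. (1) and (2) reduce to simple ratios. The following Python sketch uses hypothetical function names and returns fractions in [0, 1] rather than percentages; returning 1.0 when no connection is unprotected is an assumption made here, not something the paper specifies.

```python
def percentage_vulnerable(num_unprotected: int, num_connections: int) -> float:
    """Eq. (1): unprotected connections over all connections in the network."""
    if num_connections <= 0:
        raise ValueError("the network must carry at least one connection")
    return num_unprotected / num_connections


def recovery_ratio(num_reprovisioned: int, num_unprotected: int) -> float:
    """Eq. (2): successfully reprovisioned connections over unprotected ones."""
    if num_unprotected == 0:
        return 1.0  # assumption: nothing needed reprovisioning
    return num_reprovisioned / num_unprotected
```

As an example with made-up counts, 30 unprotected connections out of 200 give a vulnerable fraction of 0.15, and reprovisioning 24 of those 30 gives a recovery ratio of 0.8.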

B. Analysis of the Impact of SRLG Failures

As dedicated-path protection (DPP) does not share backup paths with others, the working path and backup path play the same role. Therefore, the percentage of vulnerable connections after SRLG failures will be


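$$v_{DPP} = \frac{S}{\sum_{i,j} l_{i,j}} \cdot \frac{\sum_{s,d,k,i,j} P_{w,i,j,\lambda}^{s,d,k}}{b \cdot \sum_{s,d,k} c_{s,d,k}} + \frac{S}{\sum_{i,j} l_{i,j}} \cdot \frac{\sum_{s,d,k,i,j} P_{b,i,j,\lambda}^{s,d,k}}{b \cdot \sum_{s,d,k} c_{s,d,k}} = \frac{S \cdot \sum_{s,d,k,i,j} \left( P_{b,i,j,\lambda}^{s,d,k} + P_{w,i,j,\lambda}^{s,d,k} \right)}{b \cdot \sum_{s,d,k,i,j} c_{s,d,k} \cdot l_{i,j}}. \quad (3)$$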

From Eq. (3), we can see that in DPP the percentage of vulnerable connections after SRLG failures is directly proportional to the size of the SRLG. Specifically, single-link failure is a special case of SRLG failures when $S=1$. Therefore, with DPP the percentage of vulnerable connections after single-link failures will be

$$v_{DPP} = \frac{\sum_{s,d,k,i,j} \left( P_{b,i,j,\lambda}^{s,d,k} + P_{w,i,j,\lambda}^{s,d,k} \right)}{\sum_{s,d,k,i,j} \left( b \cdot c_{s,d,k} \cdot l_{i,j} \right)}. \quad (4)$$

As discussed earlier, in shared-path protection, failures of the working path will not only leave its backup path unprotected but also other working paths that are sharing backups with it. Compared with DPP, SPP is more vulnerable to SRLG failures. Therefore, with SPP the percentage of vulnerable connections after single-link failures will be

$$v_{SPP} \ge v_{DPP}. \quad (5)$$
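To make the proportionality in Eq. (3) concrete, the following Python sketch evaluates it in the simplified form $v_{DPP} = (S / L)\,(\bar{w} + \bar{b})$, where $L$ is the number of links and $\bar{w}$, $\bar{b}$ are the average working and backup path lengths defined earlier. This is one reading of Eq. (3), and the function name and the numbers in the usage note are hypothetical.

```python
def v_dpp(srlg_size: int, num_links: int,
          avg_working_len: float, avg_backup_len: float) -> float:
    """Estimate the fraction of DPP connections left vulnerable by one SRLG failure.

    srlg_size / num_links is the share of links taken down by the failure;
    multiplying by the average number of links per working and backup path
    gives the expected share of connections that lose one of their paths.
    """
    if num_links <= 0:
        raise ValueError("num_links must be positive")
    return srlg_size / num_links * (avg_working_len + avg_backup_len)
```

With assumed values of 42 unidirectional links and average path lengths of 2.1 and 3.5 hops, an SRLG of size 1 gives about 0.13 and an SRLG of size 2 gives about 0.27, i.e., doubling S doubles the estimate. The expression is not capped at 1, so it is only meaningful while the failed SRLG is small relative to the network.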

C. Recovery and Backup Reprovisioning Procedures

Immediately after SRLG failures are detected, signaling will be sent to inform the source and destination node of every connection going through the failed SRLG. The recovery and backup reprovisioning procedures can be divided into two distinct steps: recovery from SRLG failures and backup path reprovisioning. Recovery from SRLG failures comprises two scenarios:

1) If the backup path is affected by SRLG failures, the remaining resources used for the backup path will be released as long as they have not been shared with others.

2) If the working path is affected by SRLG failures, the remaining resources used for the working path will be fully released, and its backup path will become the new working path. All the backup paths sharing wavelengths with the connection will be blocked, and thus all the resources used by them will be released as long as they have not been shared with others.

The authors in [1] studied the recovery time of shared-path protection against single-link failures. Generally, the frequency of link state updating will affect the recovery time. With SRLGs, since more connections need to be recovered and more coordination needs to be done, generally it takes more time than the recovery from single-link failures. After SRLG failure recovery, traffic distribution stabilizes again and network topology becomes sparser. After that, the procedure of reprovisioning backup paths for those unprotected connections will start.

Backup reprovisioning is based on the following assumptions: Working path rerouting is not allowed, since the working path is carrying traffic, and rerouting it would discontinue services, which is undesirable. What we can do is merely provision the backup path for the unprotected working path. Note that sometimes it is impossible to find the backup path for a working path due to physical or algorithm constraints. Therefore, it is essential to increase the recovery ratio with more efficient algorithms. However, because some unprotected paths are carrying critical applications, the reprovisioning time is essential to prevent them from being affected by other possible SRLG failures. Basically, the reprovisioning time is mainly decided by 1) the time taken to calculate new backup path routes and 2) the interval between sending out signaling and the physical establishment of new backup paths. As the latter is pretty much decided by the speed of hardware and the control plane, the former becomes more essential, and it can potentially be expedited by efficient heuristics. To summarize, backup reprovisioning needs to make trade-offs between efficiency in recovery ratio and computational complexity.

. Policies and Heuristics for Backup Reprovisioning

In the heuristic algorithm of backup reprovisioning, the sequence of choosing from unprotected connections is important, since the network resource will decrease with more iterations of the algorithm. For this reason, we are motivated to design the heuristic algorithm according to the amount of network resources and SRLG constraints in the backup reprovisioning. An instinctive perception is that it is more difficult to find backup paths for connections going through more SRLGs. Therefore, in this study, we define the backup path availability index (BPAI) according to the number of SRLGs that it goes through; that is, the BPAI is defined as the total number of SRLGs that the unprotected connection belongs to. For example in Fig. 1(c), the BPAI of the working path of connection 1, i.e., the path ending at node G, is 4, since it goes through 4 SRLGs: SRLG1, SRLG2, SRLG3, and SRLG4.
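The BPAI definition above can be illustrated with a minimal sketch. The link-to-SRLG mapping, the node names, and the function name `bpai` are hypothetical and chosen only for illustration; they are not taken from Fig. 1(c).

```python
from typing import Dict, List, Set, Tuple

Link = Tuple[str, str]


def bpai(working_path: List[str], srlgs_of_link: Dict[Link, Set[str]]) -> int:
    """Count the distinct SRLGs traversed by a working path."""
    traversed: Set[str] = set()
    for u, v in zip(working_path, working_path[1:]):
        # Links are treated as undirected, so both orientations are looked up.
        traversed |= srlgs_of_link.get((u, v), set())
        traversed |= srlgs_of_link.get((v, u), set())
    return len(traversed)


# Hypothetical four-hop working path whose links fall into four SRLGs in total.
srlgs_of_link = {
    ("A", "B"): {"SRLG1"},
    ("B", "C"): {"SRLG1", "SRLG2"},
    ("C", "D"): {"SRLG3"},
    ("D", "G"): {"SRLG4"},
}
path = ["A", "B", "C", "D", "G"]
print(bpai(path, srlgs_of_link))  # 4
```

An SRLG shared by several links of the same path is counted once, which matches the reading of the BPAI as the number of SRLGs the connection belongs to.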

Shao et al. VOL. 2, NO. 8 /AUGUST 2010/J. OPT. COMMUN. NETW. 593

The network topology will become sparser after SRLG failures. Meanwhile, some resources will be released due to the failures of some connections. Therefore, before running backup reprovisioning heuristics after SRLG failures, we need to modify traffic distribution and prune the network topology from G to G′ as follows:

1) Eliminate all the links from the graph belonging to the failed SRLG.

2) Release all the wavelengths used by failed paths as well as backup wavelengths blocked due to sharing of connections.

3) If a connection’s working path fails, its backup path will become the working path.
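The three pruning steps can be sketched as follows. This is a simplified illustration under stated assumptions: the `Connection` record and the function `apply_srlg_failure` are hypothetical, per-wavelength accounting is omitted, and the release of backups that are blocked because they shared wavelengths with an activated backup is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

Link = Tuple[str, str]


@dataclass
class Connection:
    name: str
    working: List[Link]
    backup: Optional[List[Link]]  # None means the connection is unprotected


def apply_srlg_failure(
    srlgs_of_link: Dict[Link, Set[str]],
    connections: List[Connection],
    failed_srlg: str,
) -> Tuple[Dict[Link, Set[str]], List[Connection]]:
    """Return the pruned topology G' and the surviving connections."""
    failed_links = {l for l, groups in srlgs_of_link.items() if failed_srlg in groups}
    # 1) Eliminate every link that belongs to the failed SRLG.
    pruned = {l: g for l, g in srlgs_of_link.items() if l not in failed_links}

    survivors: List[Connection] = []
    for c in connections:
        working_hit = any(l in failed_links for l in c.working)
        backup_hit = c.backup is not None and any(l in failed_links for l in c.backup)
        if working_hit and (c.backup is None or backup_hit):
            # Guard only: cannot happen with SRLG-disjoint protection in place.
            continue
        if working_hit:
            # 3) The backup path becomes the working path; no backup remains.
            survivors.append(Connection(c.name, c.backup, None))
        elif backup_hit:
            # 2) The failed backup is released; the working path is unprotected.
            survivors.append(Connection(c.name, c.working, None))
        else:
            survivors.append(c)
    return pruned, survivors
```

Connections returned with `backup` set to `None` are the unprotected connections that the reprovisioning heuristic then has to serve.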

We are motivated to consider two reprovisioning policies representing two extremes of the BPAI. The first policy is to reprovision backup paths for connections whose working paths traverse more SRLGs first (Policy I), i.e., the highest BPAI connections first. The second policy is to reprovision backups for connections whose working paths traverse fewer SRLGs first (Policy II), i.e., the lowest BPAI connections first. As a comparison to the two reprovisioning policies, the third policy is to do backup reprovisioning randomly, i.e., we pick an unprotected working path randomly (random reprovisioning or Policy III) without considering the BPAI. Table I shows an illustrative example comparing the three proposed reprovisioning policies. The first row in Table I shows the connections that become unprotected after SRLG failures and their BPAI; c22(2) means that connection 22 is unprotected and its BPAI is 2. Random reprovisioning does backup reprovisioning randomly regardless of the BPAI, as shown in the random reprovisioning row of Table I. Policy I and Policy II do backup reprovisioning in descending or ascending order of the BPAI, as shown in the Policy I and Policy II rows of Table I, respectively.
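As an illustration, the Policy I and Policy II orderings of Table I can be reproduced with a simple sort on the BPAI. The function name `reprovisioning_order`, the policy labels, and the `seed` parameter are assumptions made for this sketch; ties are broken by input order, which the text does not specify.

```python
import random
from typing import List, Tuple

Unprotected = Tuple[str, int]  # (connection name, BPAI)


def reprovisioning_order(conns: List[Unprotected], policy: str, seed: int = 0) -> List[Unprotected]:
    """Return the order in which unprotected connections are reprovisioned."""
    if policy == "I":        # highest BPAI first
        return sorted(conns, key=lambda c: c[1], reverse=True)
    if policy == "II":       # lowest BPAI first
        return sorted(conns, key=lambda c: c[1])
    if policy == "random":   # BPAI ignored
        shuffled = list(conns)
        random.Random(seed).shuffle(shuffled)
        return shuffled
    raise ValueError(f"unknown policy: {policy}")


table_i = [("c22", 2), ("c12", 3), ("c40", 6), ("c25", 8), ("c7", 9)]
print([name for name, _ in reprovisioning_order(table_i, "I")])
# ['c7', 'c25', 'c40', 'c12', 'c22']
print([name for name, _ in reprovisioning_order(table_i, "II")])
# ['c22', 'c12', 'c40', 'c25', 'c7']
```

The random policy returns some permutation of the same connections; the particular permutation shown in Table I is only one possible outcome.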

TABLE I
ILLUSTRATIVE EXAMPLE COMPARING THREE REPROVISIONING POLICIES (a)

Unprotected connections    c22(2)   c12(3)   c40(6)   c25(8)   c7(9)
Reprovisioning sequence    1        2        3        4        5
Policy I                   c7(9)    c25(8)   c40(6)   c12(3)   c22(2)
Policy II                  c22(2)   c12(3)   c40(6)   c25(8)   c7(9)
Random reprovisioning      c40(6)   c22(2)   c7(9)    c12(3)   c25(8)

(a) $c_k$(BPAI) denotes that the kth connection is unprotected.

Despite the differences among various reprovisioning policies, to have a fair comparison, we compare them based on the same framework; that is, dynamic routing, working path first [5,7], and First-Fit to choose wavelengths. The complexity of finding the shortest path with a disjoint path is analyzed in [33]. Dynamic routing means that the route of a new connection request will be selected by using the current network states. Unlike static routing where a route is always fixed, dynamic routing can reduce the blocking probability and improve capacity efficiency. The heuristic algorithm of backup reprovisioning after SRLG failures with shared-path protection considering various reprovisioning policies is as follows:

Algorithm: Backup reprovisioning after SRLG failures

Step 1) Sort unprotected connections into a list according to Policy I, Policy II, or random reprovisioning, depending on which policy is used.

Step 2) Choose the working path $P_w^{s,d,k}$ accordingly from the list. To compute the backup path, prune G′ to G″ by eliminating:

Step 2.1) All the links (including all the channels in these links) the working path $P_w^{s,d,k}$ traverses and the SRLG mates of these links.

Step 2.2) All the channels used by all the established working paths.

Step 2.3) If there is a working path $P_w^{s',d',k'}$ that traverses the same SRLG as the current working path $P_w^{s,d,k}$, remove all the channels used by $P_b^{s',d',k'}$.

Step 3) Compute the shortest paths on each wavelength layer from the pruned graph G″ in Step 2). Compare the shortest paths from each wavelength layer and choose the shortest one as $P_b^{s,d,k}$. If some of the shortest paths from different wavelength layers are equal, use the First-Fit algorithm to break the tie. If no backup path can be found, count this backup reprovisioning as an unsuccessful backup reprovisioning.

Step 4) Go back to Step 2) for more iterations.
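Step 3 can be sketched as below. The sketch assumes that the pruned graph G″ is given as one set of free undirected links per wavelength and that every link has unit cost, so a breadth-first search stands in for Dijkstra's algorithm; the function names are hypothetical. First-Fit is interpreted here as taking the lowest-indexed wavelength among equally short candidates.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

Link = Tuple[str, str]


def shortest_hops(links: Set[Link], src: str, dst: str) -> Optional[List[str]]:
    """Breadth-first search for a minimum-hop path over undirected links."""
    adjacency: Dict[str, List[str]] = {}
    for u, v in sorted(links):
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    parent: Dict[str, Optional[str]] = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path: List[str] = []
            cursor: Optional[str] = node
            while cursor is not None:
                path.append(cursor)
                cursor = parent[cursor]
            return path[::-1]
        for nxt in adjacency.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def choose_backup(free_links_per_wavelength: List[Set[Link]], src: str, dst: str):
    """Pick the shortest path over all wavelength layers, First-Fit on ties."""
    best = None  # (wavelength index, path)
    for w, links in enumerate(free_links_per_wavelength):
        path = shortest_hops(links, src, dst)
        # Strict '<' keeps the lowest-indexed wavelength when lengths tie.
        if path is not None and (best is None or len(path) < len(best[1])):
            best = (w, path)
    return best  # None means this reprovisioning attempt is unsuccessful
```

For example, with three layers where only wavelengths 1 and 2 offer a direct link A to D, `choose_backup` returns wavelength 1 with the path `["A", "D"]`.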

If Dijkstra’s algorithm is used for the shortest-path computation, the computational complexity for the shortest-path computation will be $O(E + N \log N)$. If there are no SRLGs, and only backup reprovisioning for single-link failures is considered, the computational complexity will be $O(W \cdot (E + N \log N))$. When we consider SRLG failures, SRLG failures will add a computational cost of $O(S \cdot M)$. Thus the computational complexity of the heuristic for backup reprovisioning after SRLG failures is $O(W \cdot S \cdot M \cdot (E + N \log N))$.

III. PERFORMANCE EVALUATION

In this section, we study the impact of SRLG failures and the interrelationship between SRLG parameters and the percentage of vulnerable connections after SRLG failures. We do not consider survivable ability for connections (SAC) [27] because our method can always provide 100% survivability against single SRLG failures. We evaluate and compare the performance of the various reprovisioning policies under different network topologies and wavelength and SRLG parameters by focusing on the percentage of vulnerable connections and the recovery ratio.

Note that the definition of the recovery ratio in this study is different from the definition in some of the literature, which is defined as the ratio of connections recovering from link or SRLG failures [1,23]. The recovery ratio in this study defined in Eq. (1) refers to the recovery ratio from unprotected connections rather than the recovery ratio from a disruption of services. In our scenario, because we use 100% SRLG failure protection in advance, it can always achieve 100% recovery from single SRLG failures. However, as discussed earlier, some connections may be vulnerable to other possible SRLG failures. Backup reprovisioning is an endeavor to reduce vulnerability.

We first use the 14-node NSFNet topology, as shown in Fig. 1, and each fiber is assumed to have 8 wavelengths or 16 wavelengths in each direction in the simulation. Then we use a 28-node network, as shown in Fig. 2, and each fiber is assumed to have 4 wavelengths or 8 wavelengths in each direction in the simulation. For dynamic traffic, the arrival of traffic to the network follows a Poisson distribution with a rate of $\lambda$ connection requests per unit time and a connection-holding time that is exponentially distributed with a mean value of $\mu$. An arrival request is equally likely to originate from and be destined to any node in the network. The network load is given by $\lambda/\mu$. SRLGs are randomly selected, and one fiber may belong to several SRLGs. Single SRLG failures are randomly generated, and all the SRLGs have the same failure probability. We use shared-path protection to satisfy each connection request. If we are not able to find a working path and a backup path satisfying SRLG-disjoint constraints for a connection request, the connection request will be rejected. The blocking probability, defined as the number of rejected connection requests against the total number of connection requests under online traffic (dynamic traffic), is calculated by simulating at least 20,000 connection requests. Due to the capacity inefficiency of dedicated-path protection, our focus in the simulation is on shared-path protection.
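The traffic model described above can be sketched in a few lines. This is an illustrative generator only, not the simulator used in the study: the function names and the parameters `arrival_rate` and `mean_holding_time` are assumptions, and routing, protection, and SRLG failures are left out.

```python
import random
from typing import List, Tuple

# (arrival time, departure time, source node, destination node)
Request = Tuple[float, float, int, int]


def generate_requests(n_requests: int, n_nodes: int, arrival_rate: float,
                      mean_holding_time: float, seed: int = 0) -> List[Request]:
    """Draw connection requests with Poisson arrivals and exponential holding times."""
    rng = random.Random(seed)
    now = 0.0
    requests: List[Request] = []
    for _ in range(n_requests):
        # A Poisson arrival process has exponentially distributed gaps between arrivals.
        now += rng.expovariate(arrival_rate)
        holding = rng.expovariate(1.0 / mean_holding_time)
        # Every ordered pair of distinct nodes is equally likely.
        src, dst = rng.sample(range(n_nodes), 2)
        requests.append((now, now + holding, src, dst))
    return requests


def blocking_probability(n_rejected: int, n_requests: int) -> float:
    """Rejected connection requests divided by all connection requests."""
    return n_rejected / n_requests
```

A call such as `generate_requests(20000, 14, arrival_rate=2.0, mean_holding_time=5.0)` yields a request trace for a 14-node topology; the rate and holding-time values here are placeholders, not the settings used for the reported results.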

The simulation program is written in MATLAB and is run on a computer with a 3.2 GHz CPU and 1 GB of RAM. It usually takes about 48 hours to simulate 50,000 connections at one load on the NSFNet topology with 8 wavelengths, with backup reprovisioning performed after each arrival or departure event. It will take much longer when the network becomes larger and when the blocking probability becomes higher.

Since the percentage of vulnerable connections and the recovery ratio are all related to the network load, we will present the simulation results versus different network loads. Since SRLG failures and connection request arrivals/departures are independent, to calculate the percentage of vulnerable connections at each load, in the simulation we assume there is an SRLG failure event following every arrival and departure event. After SRLG failures, backup paths are reprovisioned with the proposed three reprovisioning policies. The final results of the average recovery ratio shown below are the average of the recovery ratio at different times. Note that in the simulation, when the next request comes, we will use the network states before SRLG failures. The calculation of the blocking probability will not consider SRLG failures. In other words, the blocking probability is the blocking probability at a certain load without SRLG failures. We use the same assumption as that in [15,16] where a new failure occurs before a previous failure is repaired. Therefore, we do not consider the failure rate (normalized in FIT) and average failure repair time in the simulation.

A. Percentage of Vulnerable Connections After SRLG Failures

To examine the percentage of vulnerable connections after SRLG failures, we conducted extensive simulations under various network and SRLG parameters. Figures 3 and 4 plot the percentage of vulnerable connections after SRLG failures at different loads, wavelengths, SRLG sizes, and number of SRLGs under the NSFNet topology and the 28-node WDM network topology, respectively. The scenario of single-link failures is a special case of SRLG failures where the size of the SRLG is 1, i.e., one link in each SRLG. From the simulation results in Figs. 3 and 4, we can compare the percentage of vulnerable connections after SRLG and single-link failures. Our results are consistent with our expectations. First, SRLG failures will cause more connections to be unprotected compared with single-link failures. Second, with the increase in SRLG size, SRLG failures will also cause more connections to be unprotected. The percentage of vulnerable connections tends to be proportional to the SRLG size. For example in Fig. 3, the percentage of vulnerable connections when the size of the SRLG is 4 is approximately twice that when the size of the SRLG is 2. This is largely in line with our analysis in Eq. (1). Last, if we compare the results in Figs. 3 and 4, we can observe that with the same SRLG and wavelength parameters, SRLG failures will leave more connections unprotected in a smaller network. For example, when SRLG = (4 × 1) and the wavelength number is 8, NSFNet has 40% to 45% vulnerable connections due to SRLG failures, whereas the 28-node network topology sees less than 35% vulnerable connections. This can be explained by the fact that SRLG failures will affect a larger percentage of links in a smaller network than in a bigger network when they have the same size SRLG.

Fig. 2. 28-node WDM network topology.

As for the interrelationship between the percentage of vulnerable connections after SRLG failures and network load, Figs. 3 and 4 show that network load does not have any significant impact on the percentage of vulnerable connections. This is probably because the percentage of connections going through any SRLG does not increase with the network load.

Simulation results in Figs. 3 and 4 also show that the number of SRLGs has little impact on the percentage of vulnerable connections after SRLG failures. For example, in Fig. 4, if we compare the results of SRLG = (4 × 1), W = 4, and SRLG = (4 × 10), W = 4, we can find that the results of the two scenarios are very close even though the difference of the SRLG number is 8 times. This is intrinsic, since only the size of the SRLG will cause more links to fail simultaneously and the number of SRLGs does not matter.

Last, simulation results from Figs. 3 and 4 also demonstrate that the increase in the number of wavelengths per fiber does not lead to a significant increase in the percentage of vulnerable connections after SRLG failures. In fact, Figs. 3 and 4 show different trends of the percentage of vulnerable connections after SRLG failures with the increase in the number of wavelengths per fiber. We argue that the increase in the number of wavelengths per fiber can accommodate more connections, but the percentage of connections going through a certain SRLG remains constant. Therefore, there is no direct relationship between link capacity (the number of wavelengths per fiber) and the percentage of connections traversing a certain SRLG. The phenomenon can also be understood in this way. The increase in link capacity is equivalent to the decrease in network load and vice versa. We have shown and explained that network load has little impact on the percentage of vulnerable connections, and hence, link capacity also has little impact on the percentage of vulnerable connections.

Fig. 3. (Color online) Percentage of vulnerable connections after SRLG failures (NSFNet).

In conclusion, the simulation results shown in Figs. 3 and 4 validate our analysis in Eq. (1). The percentage of vulnerable connections after SRLG failures grows almost linearly with the size of SRLGs. The network load and the wavelengths per fiber do not have any significant impact on the percentage of vulnerable connections. SRLG failures, especially failures of large SRLGs, will leave more connections unprotected, compared with the scenario of single-link failures. For this reason, efficient backup reprovisioning after SRLG failures becomes more meaningful and necessary, which will be discussed in Subsection III.B. Although no simulation was done to compare our proposed method with SMR [27], it is intuitive that our proposed method is more capacity efficient in general, and the method in [27] can provide more survivability against arbitrary multilink failures.

B. Comparison of Various Backup Reprovisioning Policies

We quantitatively compare the performance of reprovisioning Policy I, reprovisioning Policy II, and random reprovisioning under different network topologies, loads, and SRLG parameters. Figures 5(b) and 6(b) present a comparison of the performance of reprovisioning Policy I, reprovisioning Policy II, and random reprovisioning at different levels of blocking probability as shown in Figs. 5(a) and 6(a) and with the NSFNet topology and 28-node WDM network topology, respectively. Although the simulation results in the last subsection do not show a significant correlation between network load and the percentage of vulnerable connections after SRLG failures, we believe that the network load has a great impact on the performance of backup reprovisioning. It is intuitive that a busier network with a higher load will cause more difficulties in backup reprovisioning. For this reason, we will investigate backup reprovisioning and compare reprovisioning policies at different loads and blocking probabilities. In other words, to have a fair comparison, the recovery ratios should be studied under different loads as well as blocking probabilities.

Fig. 4. (Color online) Percentage of vulnerable connections after SRLG failures (28-node WDM network topology).

Figure 3 has shown that with NSFNet, if the SRLG size is 3, the percentage of vulnerable connections after SRLG failures ranges between 35% and 40%. Figure 5(a) shows the blocking probability versus network load when the number of wavelengths per fiber is 8 and SRLG = (3 × 6) with the NSFNet topology. From the simulation results in Fig. 5(b), we can see that, among the three policies, Policy I always has the highest recovery ratio and Policy II always has the lowest recovery ratio. The performance of random reprovisioning is between Policy I and Policy II. This can be explained by connections traversing fewer SRLGs having more flexibility to find SRLG-disjoint backup paths, so it is necessary to give priority to connections traversing more SRLGs. It is more difficult to find backup paths for those unprotected connections traversing more SRLGs due to stronger SRLG constraints. Clearly, at the beginning of the reprovisioning heuristic, there are more network resources than at the end of the reprovisioning heuristic. Policy I gives higher priority to connections with stronger SRLG constraints when network resources are still relatively abundant at the beginning of reprovisioning. In this way, the recovery ratio is improved with Policy I. On the contrary, Policy II satisfies the connections for which it is relatively easier to find backup paths at the beginning of the heuristic, which leads to more difficulties in satisfying connections with stronger SRLG constraints toward the end of the heuristic when network resources become even scarcer. For these reasons, Policy I always outperforms Policy II in reprovisioning ratio, and Policy I performs even better with higher network load or high blocking probability.

Another observation is that the recovery ratio decreases with the increase in the network load. In other words, if we combine Figs. 5(a) and 5(b), we can observe that the recovery ratio decreases with the increase in blocking probability. This is obvious, since spare capacity in a network decreases with the increase in network load, and with less spare capacity, it becomes more difficult for backup reprovisioning to satisfy SRLG-disjoint constraints in reprovisioning backup paths.

Figure 4 has shown that with the 28-node WDM network topology, if SRLG = (4 × 10), the percentage of vulnerable connections after SRLG failures ranges between 22% and 28%. Figure 6(a) shows the blocking probability versus network load when the number of wavelengths per fiber is 4 and SRLG = (4 × 10) with the 28-node WDM network topology. Figure 6(b) presents the simulation results on the recovery ratios of various reprovisioning policies. From the simulation results in Fig. 6, we can draw the same conclusion as that from the simulation results in Fig. 5.

To study the performance of various reprovisioning policies at different network sizes, we compare them at the same or similar blocking probability. For example, Fig. 5(a) shows that the blocking probability when the load is 20 is around 26% in NSFNet, and Fig. 6(a) shows that the blocking probability when the load is 12 is also around 26% in the 28-node topology, so we can compare the reprovisioning ratio when the load is 20 in NSFNet and 12 in the 28-node topology. The reprovisioning ratios with Policy I are 82% and 95.8%, respectively, when the load is 20 in NSFNet and 12 in the 28-node topology. Therefore, in a larger network, reprovisioning has more chances to succeed. This can be explained because, in a larger network, there are more opportunities to find SRLG-disjoint paths for unprotected connections.

Fig. 5. (Color online) Comparison of reprovisioning policies under the NSFNet topology (wavelengths per fiber = 8 and SRLG = (3 × 6)).

To examine how the average SRLG size will affect the performance if the SRLG size is not fixed, we also simulated SRLGs with different sizes rather than a fixed size. Due to the length constraints of this paper, we cannot show these results. Generally, simulation results have shown the same trend as we have observed in Figs. 3–6. Therefore, the fixed size of the SRLG defined in this paper can be generally treated as the average size of SRLGs.

IV. CONCLUSIONS AND FUTURE WORK

Compared with single-link failures, SRLG failures leave many more connections that are unprotected and vulnerable to the next failures. The major challenge in backup reprovisioning after SRLG failures is how to find SRLG-disjoint backup paths for those unprotected connections with a recovery ratio that is as high as possible within reasonable computational complexity. The performance of backup reprovisioning is mainly affected by the available network resources as well as the number of SRLGs that an unprotected working path traverses. We argued that the existing backup reprovisioning methods after single-link failures are not suitable for backup reprovisioning after SRLG failures. Therefore, we defined BPAI to measure the strength of the SRLG constraints for a working path to find an SRLG-disjoint backup path. We were motivated to examine three reprovisioning policies according to different methods in choosing the sequence of unprotected connections.

Fig. 6. (Color online) Comparison of reprovisioning policies under the 28-node WDM network topology (wavelengths per fiber = 4 and SRLG = (4 × 10)).

Extensive simulation results have shown that SRLG failures will leave more connections unprotected compared with single-link failures and validated our analysis that the percentage of vulnerable connections tends to be proportional to the SRLG size. Furthermore, simulation results have shown that the network performance based on reprovisioning Policy I always performs best in terms of the recovery ratio. This can be explained by connections whose working paths traverse fewer SRLGs being more flexible in finding SRLG-disjoint backup paths, so it is necessary to give priority to connections whose working paths traverse more SRLGs, as Policy I does. Future work includes studying the problem of how to support quality of protection (QoP) classes of surviving SRLG failures by differentiated QoP backup path reprovisioning.

ACRONYMS

Dedicated-path protection          DPP
Shared-path protection             SPP
Working path                       WP
Backup path                        BP
Shared risk link group             SRLG
Backup path availability index     BPAI

ACKNOWLEDGMENT

A short, summarized version of this paper was presented at the IEEE/OSA Optical Fiber Communication Conference (OFC), March 2007, Anaheim, California, USA.

REFERENCES

[1] S. Ramamurthy, L. Sahasrabuddhe, and B. Mukherjee, “Survivable WDM mesh networks,” J. Lightwave Technol., vol. 21, pp. 870–883, Apr. 2003.

[2] Y. Xiong, D. Xu, and C. Qiao, “Achieving fast and bandwidth-efficient shared-path protection,” J. Lightwave Technol., vol. 21, pp. 365–371, Feb. 2003.

[3] J. Strand, A. Chiu, and R. Tkach, “Issues for routing in the optical layer,” IEEE Commun. Mag., vol. 39, pp. 81–87, Feb. 2001.

[4] D. Papadimitriou, F. Poppe, J. Jones, S. Venkatachalam, S. Dharanikota, R. Jain, R. Hartani, and D. Griffith, “Inference of shared risk link groups,” IETF Internet Draft, Nov. 2001, http://tools.ietf.org/html/draft-many-inference-srlg-00.

[5] D. Xu, Y. Xiong, C. Qiao, and G. Li, “Failure protection in layered networks with shared risk link groups,” IEEE Network, vol. 18, pp. 36–41, May–June 2004.

[6] D. Xu, Y. Xiong, and C. Qiao, “A new PROMISE algorithm in networks with shared risk link groups,” in Proc. IEEE GLOBECOM, 2003, vol. 5, pp. 2536–2540.

[7] X. Shao, L. Zhou, X. Cheng, W. Zheng, and Y. Wang, “Best effort shared risk link group (SRLG) failure protection in WDM networks,” in Proc. IEEE ICC, Beijing, China, 2008, pp. 5150–5154.

[8] H. Zang, C. Ou, and B. Mukherjee, “Path-protection routing and wavelength assignment (RWA) in WDM mesh networks under duct-layer constraints,” IEEE/ACM Trans. Netw., vol. 11, pp. 248–258, Apr. 2003.

[9] L. Shen, X. Yang, and B. Ramamurthy, “Shared risk link group (SRLG)-diverse path provisioning under hybrid service level agreements in wavelength-routed optical mesh networks,” IEEE/ACM Trans. Netw., vol. 13, no. 4, pp. 918–931, Aug. 2005.

[10] E. Oki, N. Matsuura, K. Shiomoto, and N. Yamanaka, “A disjoint path selection scheme with shared risk link groups in GMPLS networks,” IEEE Commun. Lett., vol. 6, no. 9, pp. 406–408, Sept. 2002.

[11] W. He and A. K. Somani, “Path-based protection for surviving double-link failures in mesh-restorable optical networks,” in Proc. IEEE GLOBECOM, 2003, vol. 5, pp. 2558–2563.

[12] W. He, M. Sridharan, and A. K. Somani, “Capacity optimization for surviving double-link failures in mesh-restorable optical networks,” Photonic Network Commun., vol. 9, pp. 99–111, Jan. 2005.

[13] A. Schupke, W. D. Grover, and M. Clouqueur, “Strategies for enhanced dual failure restorability with static or reconfigurable p-cycle networks,” in Proc. IEEE ICC, 2004, vol. 3, pp. 1628–1633.

[14] A. Haque, P. Ho, R. Boutaba, and J. Ho, “Group shared protection (GSP): a scalable solution for spare capacity reconfiguration in mesh WDM networks,” in Proc. IEEE GLOBECOM, 2004, vol. 3, pp. 2029–2035.

[15] J. Zhang, K. Zhu, and B. Mukherjee, “A comprehensive study on backup reprovisioning to remedy the effect of multiple-link failures in WDM mesh networks,” in Proc. IEEE ICC, 2004, vol. 3, pp. 1654–1658.

[16] J. Zhang, K. Zhu, and B. Mukherjee, “Backup reprovisioning to remedy the effect of multiple link failures in WDM mesh networks,” IEEE J. Sel. Areas Commun., vol. 24, no. 8, pp. 57–67, Aug. 2006.

[17] L. Song, J. Zhang, and B. Mukherjee, “Backup reprovisioning after network-state updates in survivable mesh networks,” in Optical Fiber Communication Conf. and the Nat. Fiber Optic Engineers Conf., 2006, paper JThB63.

[18] L. Song, J. Zhang, and B. Mukherjee, “A comprehensive study on backup-bandwidth reprovisioning after network-state updates in survivable telecom mesh networks,” IEEE/ACM Trans. Netw., vol. 16, no. 6, pp. 1366–1377, 2008.

[19] X. Cheng, X. Shao, Y. Wang, and Y. K. Yeo, “Differentiated resilient protection against multiple-link failures in survivable optical networks,” in Optical Fiber Communication Conf. and the Nat. Fiber Optic Engineers Conf., San Diego, CA, 2008, paper JWA100.

[20] W. Ni, X. Zheng, C. Zhu, Y. Guo, Y. Li, and H. Zhang, “An improved approach for online backup reprovisioning against double near-simultaneous link failures in survivable WDM mesh networks,” in Proc. IEEE GLOBECOM, 2007, pp. 2304–2309.

[21] X. Shao, L. Zhou, T. Y. Chai, C. V. Saradhi, and Y. Wang, “Improving vulnerability of shared-path protection subject to double-link failures,” in Optical Fiber Communication Conf. and the Nat. Fiber Optic Engineers Conf., Anaheim, CA, 2006, paper JThB62.

[22] D. Schupke and R. Prinz, “Performance of path protection and rerouting for WDM networks subject to dual failures,” in Optical Fiber Communication Conf., 2003, vol. 1, pp. 209–210.

[23] S. Kim and S. Lumetta, “Evaluation of protection reconfiguration for multiple failures in WDM mesh networks,” in Optical Fiber Communication Conf., 2003, pp. 28–29.

[24] P. Ho, J. Tapolcai, and A. Haque, “Spare capacity reprovisioning for shared backup path protection in dynamic generalized multi-protocol label switched networks,” IEEE Trans. Reliab., vol. 57, no. 4, pp. 551–563, 2008.

[25] C. Assi, W. Huo, A. Shami, and N. Ghani, “On the benefits of lightpath re-provisioning in optical mesh networks,” in Proc. IEEE ICC, 2005, vol. 3, pp. 1746–1750.

[26] D. Lucerna, M. Tornatore, and A. Pattavina, “On the benefits of a fast heuristic for backup reprovisioning in WDM networks,” in Proc. IEEE GLOBECOM, 2008, pp. 1–5.

[27] L. Guo, X. Wang, and L. Li, “Improving survivability for multi-link failures with reprovisioning in WDM mesh networks,” Photonic Network Commun., vol. 14, no. 3, pp. 265–271, Dec. 2007.

[28] E. Bouillet, J.-F. Labourdette, R. Ramamurthy, and S. Chaudhuri, “Lightpath re-optimization in mesh optical networks,” IEEE/ACM Trans. Netw., vol. 13, no. 2, pp. 437–447, Apr. 2005.

[29] M. Tornatore, D. Lucerna, and A. Pattavina, “Improving efficiency of backup reprovisioning in WDM networks,” in Proc. IEEE INFOCOM, 2008, pp. 196–200.

[30] W. Ni, E. Patzak, M. Schlosser, Y. Ye, and H. Zhang, “On operating shared-path-protected WDM networks non-revertively by using backup path reprovisioning,” in Optical Fiber Communication Conf. and the Nat. Fiber Optic Engineers Conf., 2010, paper OWH4.

[31] W. Ni, X. Zheng, C. Zhu, Y. Li, Y. Guo, and H. Zhang, “Achieving resource-efficient survivable provisioning in service differentiated WDM mesh networks,” J. Lightwave Technol., vol. 26, no. 16, pp. 2831–2839, Aug. 2008.

[32] X. Shao, L. Zhou, and Y. Wang, “Backup reprovisioning after shared risk link group (SRLG) failures in survivable WDM mesh networks,” in Optical Fiber Communication Conf. and the Nat. Fiber Optic Engineers Conf., Anaheim, CA, 2007, paper OThJ4.

[33] D. Xu, Y. Chen, Y. Xiong, C. Qiao, and X. He, “On the complexity of and algorithms for finding the shortest path with a disjoint counterpart,” IEEE/ACM Trans. Netw., vol. 14, pp. 147–158, Feb. 2006.

Xu Shao received the Ph.D. degree in communication and information systems from Beijing University of Posts and Telecommunications, Beijing, China, in 2002, and the M.S. degree and B.S. degree from the Xidian University, Xi’an, China, in 1999 and 1996, respectively. In 1999, he worked as an Engineer at Huawei Technologies Ltd., Shenzhen, China. Since 2002, he has been a Senior Research Fellow at the Institute for Infocomm Research in Singapore. His research interests include optical networks, wireless networks, and service-oriented architecture. He has published more than 50 journal and conference papers. He has been on the program committees of many conferences, and he is also a member of Singapore’s Telecommunication Standards Technical Committee.


Yong-Kee Yeo received his bachelor’s degree in electrical and electronic engineering (with highest honours) from the National University of Singapore in 1999 and a Ph.D. in electrical and computer engineering from Georgia Institute of Technology in 2007. From 1999 to 2002, he worked at A*STAR’s Institute of Microelectronics, where he was involved in the electromagnetic modeling of signal interconnects and power distribution networks for multi-GHz microprocessors. He is now a Principal Investigator at A*STAR’s Institute for Infocomm Research, where he has led a number of projects in WDM-PON, Ethernet-over-WDM, and optical packet switching. Dr. Yeo’s current research interests include large port-count optical switch fabrics with nanosecond switching speed and optical delay buffers. He has filed more than 10 patents in the USA and Singapore and authored more than 30 publications in the field of optical fiber communications. Dr. Yeo is a recipient of A*STAR’s National Science Scholarship, and he is also a member of Singapore’s Telecommunication Standards Technical Committee.

Yuebin Bai received his Ph.D. degree in computer science from Xi’an Jiaotong University, Xi’an, China, in 2001. From 2001 to 2003, he was engaged in postdoctoral research in the College of Science and Technology at Nihon University, Tokyo, Japan. In 2003, he joined the faculty of Beihang University, Beijing, China, where he is currently an Associate Professor in the School of Computer Science and Engineering. He has published about 40 research papers in key international conferences and journals. He is the inventor of about 15 pending Chinese invention patents. His current research interests include wireless networks, pervasive computing, and virtualization.

Jian Chen received the B.S., M.S., and Ph.D. degrees in electronic engineering from Southeast University, Nanjing, China, in 1988, 1990, and 1994, respectively. From 1994 to 1999, he was an Associate Professor at the Department of Communications, Nanjing University of Posts and Telecommunications (NUPT). From 1999 to 2001, he was with the Department of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea. In 2002, he was a Member of the Technical Staff at the Institute for Communication Research, National University of Singapore (NUS). Since 2003, he has been a research scientist in the RF and Optical Department of the Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), Singapore. He is currently rejoining NUPT as the Director of the Photonics Technology Research Institute (PTRI). His research interests include coherent optical communication, visible light communication, and optical access networks.

Luying Zhou received the B.S. and M.S. degrees in automatic control in 1982 and 1985, respectively, from South China University of Technology, Guangzhou, China, and the Ph.D. degree in systems engineering in 1990 from Xi’an Jiaotong University, Xi’an, China. He is currently a Scientist at the Institute for Infocomm Research, Singapore. He serves on the technical program committee of various international conferences and workshops and is a reviewer for many journals. His research interests include high-speed networks, WDM optical networks, wireless networks, optical access networks, and grid networking.

Lek Heng Ngoh is a Senior Research Scientist at the Agency for Science, Technology and Research (A*STAR), Institute for Infocomm Research (I2R), and an Adjunct Associate Professor at the School of Computer Engineering, Nanyang Technological University. His research interests include broadband multimedia communications, multimedia services, network protocols, and wireless sensor networks. He has published more than 120 international journal and conference research papers and co-authored four patents in these fields. Dr. Ngoh has managed and played leadership roles in several national and international advanced broadband network infrastructure initiatives, involving broadband network technologies such as ATM, gigabit IP-over-fiber, and pure optical networking. He was the deputy director and a member of the executive committee of Singapore’s next-generation Internet project, where his team conducted pioneering trials of broadband applications, including tele-education between universities in Singapore and their counterparts in Asia-Pacific, Canada, the USA, and Europe. These trials subsequently led to the development and implementation of distance-learning broadband systems between Singapore and the USA offering joint postgraduate degree programs. He was further involved in a number of early telemedicine trials between hospitals in Singapore and those in Japan, South Korea, and Taiwan, as well as in the development of a secure teleophthalmology system for a government hospital in Singapore.

