Distributed Mesh Span Restoration

E E 681 - Module 20

W. D. Grover

TRLabs & University of Alberta

© Wayne D. Grover 2002, 2003


Centralized vs. Distributed Network Configuration

• Network configuration:
  – Establishment of working paths (provisioning)
  – Restoration
  – Traffic adaptation

• Centralized approach:
  – A central command point with knowledge of the complete network state (full network map, link states, available capacity, …) makes routing decisions and sends commands to the nodes.

• Distributed approach (self-organizing):
  – Nodes apply simple rules autonomously and asynchronously to handle the problems of demand routing, restoration, or traffic adaptation.



• Advantages of centralized approach:
  – Complete knowledge of the network state makes it possible to find an optimal network configuration

• Drawbacks of centralized approach:
  – Diverse telemetry network required for collection of information about network state
  – Constant verification of database integrity is needed
  – Very slow, due to the time needed for command download and validation
  – Can easily lead to the

Centralized control is unable to achieve the 2 second restoration time widely recognized as a target for the transport network.



• Advantages of a (distributed) self-organizing solution:
  – Simple (avoids the “software mountain” problem)
  – Speed
  – Accuracy
  – Robustness
  – Resource usage efficiency

• Drawback:
  – Not necessarily optimal

The rest of the lecture presents a self-organizing solution to the problem of span restoration: the self-healing network (SHN) protocol.


Objectives of the self-healing protocol

• Find the maximum number of replacement paths between the end-nodes of a failed span with the least possible amount of spare capacity

(For the problem of span restoration)

The corresponding mathematical problem is single-commodity maximum flow:

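\[
\text{Maximize } k = \sum_{(j,c) \in E} x_{j,c}
\]

subject to

\[
\sum_{(i,z) \in E} x_{i,z} - \sum_{(z,i) \in E} x_{z,i} = 0 \qquad \forall z \in N \setminus \{a, c\}
\]

\[
\sum_{(a,j) \in E} x_{a,j} = k
\]

\[
x_{i,j} \le s_{i,j} \qquad \forall (i,j) \in E
\]

\[
x_{i,j} \ge 0, \quad x_{a,c} = 0 \qquad \forall (i,j) \in E
\]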

Remark: A solution to this formulation does not tell us what cross connections should be made in the nodes
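To make the target concrete, the maximum flow k can be computed centrally with a textbook augmenting-path method. The sketch below uses Edmonds-Karp, which is not the SHN protocol itself but gives the reference value that a distributed scheme would be compared against. The four-node network, the helper names, and the reading of s as spare capacity per direction are assumptions made for illustration.

```python
from collections import deque

def max_flow(spare, a, c):
    """Edmonds-Karp maximum flow from a to c (assumes a != c).

    spare maps a directed pair (i, j) to its spare capacity s_ij.
    The failed span (a, c) is simply left out of `spare`.
    """
    # Residual capacities, with a reverse arc for every arc.
    residual = {}
    for (i, j), s in spare.items():
        residual.setdefault(i, {}).setdefault(j, 0)
        residual.setdefault(j, {}).setdefault(i, 0)
        residual[i][j] += s

    k = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {a: None}
        queue = deque([a])
        while queue and c not in parent:
            u = queue.popleft()
            for v, cap in residual.get(u, {}).items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if c not in parent:
            return k  # no augmenting path left: k is maximal

        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        v = c
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = c
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        k += bottleneck

def both_directions(spans):
    """Model each span as two directed arcs of equal spare capacity."""
    spare = {}
    for (i, j), s in spans.items():
        spare[(i, j)] = s
        spare[(j, i)] = s
    return spare

# Hypothetical network: span a-c has failed and is omitted.
spans = {("a", "u"): 2, ("u", "c"): 1, ("a", "v"): 1, ("v", "c"): 2, ("u", "v"): 1}
k = max_flow(both_directions(spans), "a", "c")
print(k)
```

As the remark says, the value k alone does not say which cross connections to make; the protocol has to discover the paths themselves.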


Objectives of the self-healing protocol (2)

• Achieve fast restoration (below 2 sec)

• Avoid the complexity of centralized restoration and the “software mountain” phenomenon


Distributed Restoration: what about flooding?

• Flooding is a simple solution to find restoration routes

However …

• Flooding does not help decide the number of restoration paths to establish on each route

• The formation of paths must be simultaneously coordinated to achieve maximum flow

• Flooding however can be used for the determination of minimum delay routes or broadcast of topology updates

(Remember the trap topology)
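What flooding does deliver can be sketched as a breadth-first expansion: each node learns its minimum hop distance from the origin, but nothing about how many link-disjoint paths could be formed at once. The topology and the function name below are hypothetical, and breadth-first search is used as an idealized model of flooding with equal link delays.

```python
from collections import deque

def flood_hops(adjacency, origin):
    """Return the minimum hop count from origin to every reachable node."""
    hops = {origin: 0}
    frontier = deque([origin])
    while frontier:
        node = frontier.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in hops:  # first arrival is the shortest
                hops[neighbour] = hops[node] + 1
                frontier.append(neighbour)
    return hops

# Hypothetical topology around a failed span Q-P.
adjacency = {
    "Q": ["T1", "T2"],
    "T1": ["Q", "T2", "P"],
    "T2": ["Q", "T1", "T3"],
    "T3": ["T2", "P"],
    "P": ["T1", "T3"],
}
hops = flood_hops(adjacency, "Q")
print(hops["P"])
```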


SHN Protocol: Node Interactions using Statelets

Statelets: (NID, Sender, Chooser, Index, Repeat count)

• NID: Node identification
• Sender, Chooser: Pair of nodes at the two ends of the failed span (how the sender and chooser roles are assigned is explained later)
• Index: Unique number assigned to each statelet emitted by the sender
• Repeat count: Number of hops since the statelet was emitted by the sender

• Statelets are attributes of the transmission links
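A statelet can be pictured as a small immutable record. The sketch below is one possible encoding, not the on-the-wire format: the field names, the use of 0 for "no node", and the boolean `reverse` flag (standing for the "R" suffix that marks complement statelets on later slides) are assumptions.

```python
from dataclasses import dataclass
from typing import Union

NodeId = Union[str, int]  # 0 stands for "no node", as in the null statelet

@dataclass(frozen=True)
class Statelet:
    nid: NodeId            # node that applied this statelet to the link
    sender: NodeId = 0     # sender end-node of the failed span
    chooser: NodeId = 0    # chooser end-node of the failed span
    index: int = 0         # unique number assigned by the sender
    repeat: int = 0        # hops since emission by the sender
    reverse: bool = False  # assumption: models the "R" (complement) marker

    def is_null(self):
        """True for the pre-failure statelet (NID,0,0,0,0)."""
        return self.sender == 0 and self.chooser == 0 and self.index == 0

def null_statelet(node):
    """The (null) statelet a node applies to its links before any failure."""
    return Statelet(nid=node)

print(null_statelet("Q"))
print(Statelet("Q", "Q", "P", 1, 1))
```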


SHN Protocol: Different node states

• Nodes can only be in a finite number of different states:
  – Pre-failure state
  – Sender state
  – Chooser state
  – “Tandem node” state

• The transition between these states depends on changes in the incoming statelets (events)
  – A change in an incoming statelet is called a receive statelet (RS) event

The SHN protocol is an event-driven finite state machine (FSM)
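The four node states can be written as a small transition table. This is only a sketch of the state structure: the event names are hypothetical labels, and the assumption is that a node leaves the pre-failure state either by detecting the failure itself (becoming sender or chooser) or by receiving a non-null statelet (becoming a tandem node).

```python
from enum import Enum, auto

class NodeState(Enum):
    PRE_FAILURE = auto()
    SENDER = auto()
    CHOOSER = auto()
    TANDEM = auto()

# Hypothetical event names; only transitions out of PRE_FAILURE are modelled.
TRANSITIONS = {
    (NodeState.PRE_FAILURE, "alarm_as_sender"): NodeState.SENDER,
    (NodeState.PRE_FAILURE, "alarm_as_chooser"): NodeState.CHOOSER,
    (NodeState.PRE_FAILURE, "receive_statelet"): NodeState.TANDEM,
}

def step(state, event):
    """Return the next state; unknown (state, event) pairs leave it unchanged."""
    return TRANSITIONS.get((state, event), state)

print(step(NodeState.PRE_FAILURE, "receive_statelet"))
```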


SHN Protocol: Pre-failure Node State

[Figure: Node Q in the pre-failure state, with (null) statelets on its spare links (s).]

(null) = (Q,0,0,0,0)

In pre-failure state, nodes send null statelets on all working and spare links


SHN Protocol: Activation

[Figure: A span cut on the span between nodes Q and P. Node Q, previously in the pre-failure state, performs state determination (RS.NID = P, state = sender) and emits statelets (Q,Q,P,1,1) through (Q,Q,P,8,1) on some of its spare links (s); other links still carry (null) statelets. Working links are marked w.]

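A span failure is detected first by the end-nodes of the span. One of the two nodes becomes the “Sender node”.

Statelets are sent on the available spare links, up to a maximum of min(w, s_i) on each span i, where w is the number of failed working links and s_i is the number of spare links on span i.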
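The activation step can be sketched as follows. The tuple layout follows the (NID, Sender, Chooser, index, repeat count) notation of the slides; the function name, the span labels, and the spare counts are hypothetical, and w is read as the number of working links to be restored.

```python
def sender_statelets(sender, chooser, w, spare_per_span):
    """Emit at most min(w, s_i) statelets on each span, each with a unique index."""
    emitted = {}
    index = 0
    for span, s_i in spare_per_span.items():
        emitted[span] = []
        for _ in range(min(w, s_i)):
            index += 1
            # The sender's own ID goes in the NID field; repeat count starts at 1.
            emitted[span].append((sender, sender, chooser, index, 1))
    return emitted

# Hypothetical sender Q with w = 3 and four surviving spans.
emitted = sender_statelets("Q", "P", 3, {"Q-T1": 3, "Q-T2": 1, "Q-T3": 1, "Q-T4": 4})
for span, statelets in emitted.items():
    print(span, statelets)
```

With these numbers eight statelets are emitted, indices 1 to 8, and the span with four spare links carries only three of them.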



SHN Protocol: Basic view of Tandem Nodes role (Complete tandem node rules to be explained later)

[Figure: Node Ti, initially in the pre-failure state with (null) statelets on its spare links (s), receives statelets (Q,Q,P,1,1), (Q,Q,P,2,1) and (Q,Q,P,3,1).]

Node T1 in Tandem node state:
• Selective re-broadcast
• Update of statelets

[Figure: T1 re-broadcasts the incoming statelets on its other spare links as (T1,Q,P,1,2), (T1,Q,P,2,2) and (T1,Q,P,3,2). For example, (Q,Q,P,1,1) becomes (T1,Q,P,1,2).]

• NID: Q replaced by the tandem node ID
• Repeat count increased by 1
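The statelet update at a tandem node amounts to two field changes, sketched below with a named tuple. The type and function names are illustrative, not part of the protocol definition.

```python
from typing import NamedTuple

class Statelet(NamedTuple):
    nid: str
    sender: str
    chooser: str
    index: int
    repeat: int

def rebroadcast(incoming, tandem_id):
    """Copy a statelet for re-broadcast: new NID, repeat count plus one."""
    return incoming._replace(nid=tandem_id, repeat=incoming.repeat + 1)

outgoing = rebroadcast(Statelet("Q", "Q", "P", 1, 1), "T1")
print(outgoing)
```

Sender, chooser, and index are carried through unchanged, so every node along the way can tell which index tree a statelet belongs to.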


SHN Protocol: Initiation of Reverse Linking

[Figure: Node P in the chooser state. A forward statelet (Ti,Q,P,2,n) arrives on one of its spare links (s), and P responds with the complement statelet (P,Q,P,2,1)R.]

Node P receives its first statelet (for index 2 in the example):
• A complementary statelet is sent

SHN Protocol: Reverse Linking process

[Figure: Node T1 during reverse linking. Statelets shown include the forward statelets (Q,Q,P,1,1), (Q,Q,P,2,1), (Q,Q,P,3,1), (T1,Q,P,1,2) and (T1,Q,P,2,2), and the reverse-linking statelets (T,Q,P,2,n)R and (T2,Q,P,2,n-1)R.]

Request for local cross connection

Node T1 receives reverse-linking statelet and copies it to the port going to the precursor


More details on SHN: Sender-Chooser Arbitration

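[Figure: Nodes P and Q, at the two ends of the failed span, each read the NID last received on that span (RS.NID = Q at node P, RS.NID = P at node Q) and consult the same node ranking table.]

Node   Rank
A      1
B      2
C      3
D      4
E      5
…

Node P: “How am I ranked compared to Q?” … “I will be the chooser node”

Node Q: “How am I ranked compared to P?” … “I will be the sender node”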

Other possibility: The node with the lowest number of surviving spare links becomes the sender (to minimize the volume of statelets generated)
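Both arbitration options can be sketched in a few lines. The slides do not state which end of the ranking wins; the sketch assumes the node with the higher rank number becomes the sender, which matches the example where Q is the sender and P the chooser under an alphabetical ranking. The function names and the tie-break in the second variant are also assumptions.

```python
# Alphabetical ranking as in the table: A = 1, B = 2, ...
rank = {chr(ord("A") + i): i + 1 for i in range(26)}

def role_by_rank(me, other, rank):
    """Assumption: the end-node with the higher rank number becomes the sender."""
    return "sender" if rank[me] > rank[other] else "chooser"

def role_by_spares(my_spares, other_spares, me, other, rank):
    """Alternative: fewer surviving spare links means sender.

    Assumes each end-node knows the other's spare count; ties fall back to rank.
    """
    if my_spares != other_spares:
        return "sender" if my_spares < other_spares else "chooser"
    return role_by_rank(me, other, rank)

print(role_by_rank("Q", "P", rank), role_by_rank("P", "Q", rank))
```

Whatever rule is used, both end-nodes must reach opposite conclusions without exchanging any message, which is why the rule depends only on information both already hold.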


More details on SHN: Tandem Nodes Rules

1) Keep a list of the ports where precursor statelets are presently found, and sort the statelets by:
  – increasing repeat count
  – increasing number of the port where they appear

2) Replace precursors by better ones when better ones appear

3) Try as much as possible to re-broadcast statelets to all other spans

3a) When full re-broadcast is not possible, consider statelets in order of repeat count, starting with the lowest values.

4) When a complement statelet is received, it is copied to the port of the precursor, all re-broadcast of forward flooding statelets for the corresponding index is stopped, and any subsequent appearance of a reverse linking statelet with that index is ignored


Tandem Node Rules: Index ranking

Keep list of ports where precursor statelets are presently found and sort statelets by:

• increasing repeat count
• increasing number of the port where they appear

[Figure: a tandem node with incoming precursor statelets of repeat counts r = 3, r = 1, r = 6, r = 5 and r = 2, each labelled with its rank in the list (1 … n).]
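The ranking rule is a two-key sort. The sketch below uses the repeat counts from the figure (3, 1, 6, 5, 2); the index numbers and port assignments are made up for the example.

```python
def rank_precursors(precursors):
    """Return indices best first: lowest repeat count, then lowest port number.

    precursors maps index -> (port, repeat).
    """
    return sorted(precursors, key=lambda idx: (precursors[idx][1], precursors[idx][0]))

# Hypothetical indices 11..15 arriving on ports 1..5.
precursors = {11: (1, 3), 12: (2, 1), 13: (3, 6), 14: (4, 5), 15: (5, 2)}
ranking = rank_precursors(precursors)
print(ranking)
```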


Tandem Node Rules: Selective re-broadcast

Try as much as possible to re-broadcast statelets to all other spans.

When full re-broadcast is not possible, consider statelets in order of repeat count, starting with the lowest values.

[Figure: a tandem node with five precursor statelets (r = 3, r = 1, r = 6, r = 5, r = 2), numbered 1 to 5.]
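A simplified model of selective re-broadcast: visit the indices in rank order and give each one outgoing link on every span other than its precursor's span, for as long as free links remain. The data layout, the span names, and the use of the span name as tie-break are assumptions; the real protocol also revises this pattern continuously as statelets change.

```python
def selective_rebroadcast(precursors, free_links):
    """Allocate outgoing links to indices, lowest repeat count first.

    precursors: index -> (precursor span, repeat count)
    free_links: span -> number of outgoing spare links available
    Returns index -> list of spans on which the index is re-broadcast.
    """
    remaining = dict(free_links)
    plan = {}
    order = sorted(precursors, key=lambda i: (precursors[i][1], precursors[i][0]))
    for index in order:
        in_span = precursors[index][0]
        plan[index] = []
        for span in sorted(remaining):
            # Never re-broadcast back onto the span the precursor arrived on.
            if span != in_span and remaining[span] > 0:
                remaining[span] -= 1
                plan[index].append(span)
    return plan

# Hypothetical tandem node with four spans and three competing indices.
precursors = {1: ("A", 3), 2: ("B", 1), 3: ("C", 6)}
free_links = {"A": 1, "B": 2, "C": 1, "D": 2}
plan = selective_rebroadcast(precursors, free_links)
print(plan)
```

Index 2 (lowest repeat count) gets a full re-broadcast, while index 3 (highest repeat count) is left with a single span.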

Tandem Node Rules: Reverse Linking

[Figure: Node T1 during reverse linking for index 2. The complement statelet (T,Q,P,2,n)R arrives at T1, and (T2,Q,P,2,n-1)R also appears in the figure; the forward statelets (Q,Q,P,1,1), (Q,Q,P,2,1), (Q,Q,P,3,1), (T1,Q,P,1,2), (T1,Q,P,2,2) and (T1,Q,P,3,2) are also shown. The two linked ports change from spare (s) to working (w).]

Node T1 obeys rule 4 for complement statelets:

• The complement statelet is sent to the precursor
• Cross connection is requested
• Re-broadcast for that index is stopped
• The status of both ports is set to “working”
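The four actions of rule 4 can be sketched as one event handler. The class, its attributes, and the port numbers are hypothetical bookkeeping chosen for the example; only the sequence of actions comes from the rule.

```python
class TandemNode:
    def __init__(self, name):
        self.name = name
        self.precursor_port = {}     # index -> port holding the precursor
        self.rebroadcast_ports = {}  # index -> set of ports used for re-broadcast
        self.linked = set()          # indices already reverse-linked here
        self.cross_connects = []     # requested (port, port) cross connections
        self.port_status = {}        # port -> "working" once linked

    def on_complement(self, index, arrival_port):
        """Apply rule 4; return (precursor port, freed ports) or None if ignored."""
        if index in self.linked or index not in self.precursor_port:
            return None  # later complements for a linked index are ignored
        precursor = self.precursor_port[index]
        # Request the local cross connection between the two ports.
        self.cross_connects.append((arrival_port, precursor))
        # Stop re-broadcast of this index; the other ports become free again.
        freed = self.rebroadcast_ports.pop(index, set()) - {arrival_port}
        self.port_status[arrival_port] = "working"
        self.port_status[precursor] = "working"
        self.linked.add(index)
        # The caller copies the complement statelet to the precursor port.
        return precursor, sorted(freed)

t1 = TandemNode("T1")
t1.precursor_port[2] = 7
t1.rebroadcast_ports[2] = {3, 4, 5}
first = t1.on_complement(2, 4)
second = t1.on_complement(2, 5)
print(first, second)
```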


SHN Protocol: Frequent low-level effects

• The application of rule 3 and the effects of reverse linking result in several frequent low-level effects:
  – The precursor location for an index shifts
  – A new index appears at the node
  – A precursor disappears
  – Links are freed for more re-broadcast after reverse linking

• After any of these events the rebroadcast pattern is revised to follow rule 3


SHN Protocol: Frequent low-level effects (2)

• The precursor location for an index shifts

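[Figure: a tandem node whose precursor for index i has repeat count r, re-broadcast on three links with repeat r+1. A statelet for index i with repeat r-1 then arrives on another port; the precursor shifts to that port and the three re-broadcast statelets are updated to repeat r.]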
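The precursor shift is rule 2 in action, and it can be sketched as a single comparison. Treating "better" as "strictly lower repeat count" is an assumption, as are the function name and the example numbers.

```python
def update_precursor(precursors, index, port, repeat):
    """Record (port, repeat) as the precursor for index if it is new or better.

    precursors maps index -> (port, repeat). Returns True when the precursor
    changed, which means the re-broadcast pattern has to be revised.
    """
    current = precursors.get(index)
    if current is None or repeat < current[1]:
        precursors[index] = (port, repeat)
        return True
    return False

precursors = {7: (1, 4)}                        # index 7 first seen on port 1, repeat 4
shifted = update_precursor(precursors, 7, 3, 3)  # a shorter route appears on port 3
ignored = update_precursor(precursors, 7, 2, 5)  # a longer route is not a precursor
print(precursors, shifted, ignored)
```

After a shift, the re-broadcast copies of that index go out with the new, lower repeat count plus one.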


SHN Protocol: Frequent low-level effects (3)

• A new index appears at a node

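[Figure: the tandem node with precursor statelets r = 3, r = 1, r = 6, r = 5 and r = 2 (ranks 1 to 5) receives a new index with r = 4. The new statelet takes rank 4, and the statelets previously ranked 4 and 5 move to ranks 5 and 6.]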

Index 6 is not re-broadcast anymore


SHN Protocol: High Level Behaviour

• At the high level what we see is:

  – Index trees expanding
  – Some trees are stopped because no re-broadcast is possible
  – Some trees reach the chooser node
  – Reverse linking makes successful trees collapse
  – Freed capacity allows revision of re-broadcast patterns

• Eventually:
  – 100% restoration is achieved
  – or … no reverse linking events occur and the sender suspends statelet flooding after some time (time-out)


SHN Protocol: Performances

• Finding the maximum number of restoration paths
  – In no test case, in over 15 test networks derived from the real world, did the SHN process yield fewer than the maximum feasible number of paths in the given network

• Achieving fast restoration
  – Speed of restoration depends on the implementation
  – But on-demand (adaptive) “restoration” in less than 2 seconds is easily realized
  – or, can be used for Distributed Preplanning: pre-failure self-development of fast-acting protection pre-plans (~100 msec or less reaction upon failure)

[Figure: Results of complete restoration times for one of the implementations tested in [1]]


Self-Organizing Networks: other applications

• “Capacity scavenging” for:
  – Automated service path provisioning (“broad-band dial-up”)
  – Network audit (advance detection of restorability limitations and/or locations where capacity will soon be exhausted)
  – Improved restorability to complete node failures

• “Distributed Pre-planning”

• For more details, see:
  [1] W. D. Grover, “Self-organizing broad-band transport networks,” Proceedings of the IEEE, vol. 85, no. 10, October 1997.
  [2] W. D. Grover, “Distributed Restoration of the Transport Network,” Chapter 11 in Telecommunications Network Management into the 21st Century: Techniques, Standards, Technologies and Applications, S. Aidarous and T. Plevyak (editors), IEEE Press, 1994, pp. 337-417.
