CS 294-8 Self-Stabilizing Systems http://www.cs.berkeley.edu/~yelick/294

Post on 30-Jan-2016


CS294, Yelick Self Stabilizing, p1

CS 294-8
Self-Stabilizing Systems
http://www.cs.berkeley.edu/~yelick/294

CS294, Yelick Self Stabilizing, p2

Administrivia
• No seminar or class Thursday
• Sign up for project meetings tomorrow, Thursday, or Tuesday (?)
• Poster session Wednesday, 12/13, in the Woz, 2-5pm
• Final papers due Friday, 12/15

CS294, Yelick Self Stabilizing, p3

(Self) Stabilization
• History: Idea introduced by Dijkstra in 1973/74. Popularized by Lamport in 1983.
• Idea of stabilization: ability of a system to converge in a finite number of steps from arbitrary states to a desired state

CS294, Yelick Self Stabilizing, p5

Stabilization
• Motivation: fault tolerance
– Especially transient faults
– Also useful for others (crashes, Byzantine)
• Where: Stabilization ideas appear in physics, control theory, mathematical analysis, and systems science

CS294, Yelick Self Stabilizing, p6

Stabilization
• Definition: Let P be a state predicate of a system S. S is stabilizing to P iff it satisfies the following:
– Closure: P is closed in S: any computation that starts in a state in P leads only to states that are in P
– Convergence: every computation of S has a finite prefix such that every state following it is in P
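As a hedged illustration of the two conditions, the sketch below checks closure and convergence for a small finite system given by an explicit successor function. The names `is_closed`, `converges`, `successors`, and `in_p` are hypothetical, and treating a deadlock outside P as a failure to converge is an assumption that the slide does not state.

```python
def is_closed(states, successors, in_p):
    """Closure: every step taken from a state in P lands in P again."""
    return all(in_p(t) for s in states if in_p(s) for t in successors(s))


def converges(states, successors, in_p):
    """Convergence for a finite system: no computation stays outside P forever.

    Assumption: this holds iff the states outside P contain no deadlock
    and no cycle, so every path must eventually enter P.
    """
    bad = {s for s in states if not in_p(s)}
    done = set()  # states outside P already shown to lead into P

    def leads_to_p(s, path):
        if s in path:          # found a cycle that never enters P
            return False
        if s in done:
            return True
        nxt = list(successors(s))
        if not nxt:            # stuck outside P
            return False
        ok = all(leads_to_p(t, path | {s}) for t in nxt if t in bad)
        if ok:
            done.add(s)
        return ok

    return all(leads_to_p(s, frozenset()) for s in bad)


# Toy systems over small integers; P is "state == 0".
in_p = lambda s: s == 0
count_down = lambda s: [max(s - 1, 0)]        # 3 -> 2 -> 1 -> 0 -> 0
flip = lambda s: [0] if s == 0 else [3 - s]   # 1 <-> 2 forever

print(is_closed(range(4), count_down, in_p), converges(range(4), count_down, in_p))
print(is_closed(range(3), flip, in_p), converges(range(3), flip, in_p))
```

The `count_down` system is stabilizing to P, while `flip` is closed but never converges from states 1 and 2.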

CS294, Yelick Self Stabilizing, p7

Stabilization (Refined)

CS294, Yelick Self Stabilizing, p8

Practical Issues
• Stabilizing protocols allow for
– Corrupted state
– Initialization errors
– Not corruption of code
• Applications
– Routing, Scheduling, Resource Allocation

CS294, Yelick Self Stabilizing, p9

Dijkstra’s Model
• The concurrency model is unrealistic, but useful for illustration
– Processors are organized in a sparse, connected graph
– At each “step” a processor looks at its own state and its neighbors’ states, and changes its own state
– A “demon” selects the processor to execute (fairly)

CS294, Yelick Self Stabilizing, p10

Impossibility of Stabilization
• If the processors are truly identical (symmetric), then stabilization is impossible
– Consider an N processor system, with N a non-prime, say N = 2m
– Consider an initial state that is cyclically symmetric, e.g.,
• s, t, s, t, s, t
• pi’s state is s for i even, and t for i odd
– Then the scheduling “demon” can schedule all even processors (which will all move to s’) and then all odd (move to t’), so the state stays symmetric and no progress will be made
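The scheduling argument above can be sketched as a short simulation. This is only an illustration under stated assumptions: `adversarial_round`, `is_two_periodic`, and the example `rule` are hypothetical names, and `rule` stands in for any deterministic program that every processor runs identically on (left neighbor, own state, right neighbor).

```python
def adversarial_round(states, rule):
    """One round of the demon's schedule on an even-sized ring.

    Processors run one at a time: first every even index, then every odd one.
    """
    n = len(states)
    ring = list(states)
    for parity in (0, 1):
        for i in range(parity, n, 2):
            # ring[i - 1] wraps around to the last processor when i == 0.
            ring[i] = rule(ring[i - 1], ring[i], ring[(i + 1) % n])
    return ring


def is_two_periodic(ring):
    """True if the ring still looks like s, t, s, t, ..."""
    return all(ring[i] == ring[i % 2] for i in range(len(ring)))


# Hypothetical rule; any deterministic function of the three values would do.
rule = lambda left, me, right: (left + 2 * me + right + 1) % 7

ring = [0, 1] * 3          # s, t, s, t, s, t with s = 0, t = 1
for _ in range(10):
    ring = adversarial_round(ring, rule)
    print(ring, is_two_periodic(ring))
```

While the even processors move, their odd neighbors have not changed, so each even processor sees the same local view and reaches the same new state; the same holds for the odd phase.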

CS294, Yelick Self Stabilizing, p11

Implications of Impossibility
• How important is this result?
• Burns and Pachl show that with a prime number of processors, self-stabilization with symmetric processors is possible
• More importantly, how realistic is the symmetry assumption?

CS294, Yelick Self Stabilizing, p12

Mutual Exclusion on a Line
• The following simple example is a solution to the mutual exclusion problem:
– n processors are connected in a line
– Each talks to 2 neighbors (1 on the ends)
• State:
– Each process has 2 variables
• up: token is above if true, below if false
• x: a bit used for token passing

CS294, Yelick Self Stabilizing, p13

Token Passing on a Line
• Top (Process n-1)
    x = 0  up = false
    x = 0  up = false
    x = 0  up = false
    x = 1  up = true
    x = 1  up = true
    . . .
    x = 1  up = true
• Bottom (Process 0)
• Logical token:
– Token is at one of the 2 procs where up differs
– If x’s differ, upper proc; if same, lower proc

CS294, Yelick Self Stabilizing, p14

Token Passing Program
• Bottom-move-up[0]
– if x[0] = x[1] and up[1] = false then x[0] := ~x[0]
• Top-move-down[n-1]
– if x[n-2] != x[n-1] then x[n-1] := x[n-2]
• Middle-move-up[i]
– if x[i] != x[i-1] then {x[i] := x[i-1]; up[i] := true}
• Middle-move-down[i]
– if (x[i] = x[i+1] and up[i] = true and up[i+1] = false) then up[i] := false
• This is Dijkstra’s second, “4-state” algorithm
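A minimal simulation sketch of the four rules follows. It makes several assumptions that are not on the slide: the names `enabled_actions`, `fire`, and `run` are hypothetical; the demon is modelled as a seeded random choice among enabled actions; up[0] is held at true and up[n-1] at false; and the Middle-move-down guard requires up[i+1] = false, as in Dijkstra's original formulation.

```python
import random


def enabled_actions(x, up):
    """Return (rule name, process index) for every rule whose guard holds."""
    n = len(x)
    acts = []
    if x[0] == x[1] and not up[1]:
        acts.append(("bottom-move-up", 0))
    if x[n - 2] != x[n - 1]:
        acts.append(("top-move-down", n - 1))
    for i in range(1, n - 1):
        if x[i] != x[i - 1]:
            acts.append(("middle-move-up", i))
        if x[i] == x[i + 1] and up[i] and not up[i + 1]:
            acts.append(("middle-move-down", i))
    return acts


def fire(x, up, action):
    """Apply one rule in place."""
    name, i = action
    if name == "bottom-move-up":
        x[0] = 1 - x[0]
    elif name == "top-move-down":
        x[i] = x[i - 1]
    elif name == "middle-move-up":
        x[i] = x[i - 1]
        up[i] = True
    else:  # middle-move-down
        up[i] = False


def run(x, up, steps, rng):
    """Run `steps` moves; return how many actions were enabled before each."""
    x, up = list(x), list(up)
    up[0], up[-1] = True, False   # assumption: the end processes' up bits are fixed
    trace = []
    for _ in range(steps):
        acts = enabled_actions(x, up)
        trace.append(len(acts))
        fire(x, up, rng.choice(acts))
    return trace


# Good state from the slides: exactly one action (the token) at every step.
print(run([1, 1, 1, 0, 0, 0], [True] * 3 + [False] * 3, 12, random.Random(0)))
# Corrupted state: several actions may be enabled at first.
print(run([0, 1, 0, 1, 1, 0], [True, False, True, False, True, False], 40, random.Random(0)))
```

Counting enabled actions is a convenient proxy for "number of tokens": in a correct state the count is 1, and a stabilizing run should settle at 1 and stay there.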

CS294, Yelick Self Stabilizing, p15

Token Passing Up
• Top (Process n-1)
    x = 0  up = false
    x = 0  up = false
    x = 0  up = false
    x = 1  up = true
    x = 1  up = true
    . . .
    x = 1  up = true
• Bottom (Process 0)
• Rule applied: if x[i] != x[i-1] then {x[i] := x[i-1]; up[i] := true}
• Process i after the move: x = 1 up = true

CS294, Yelick Self Stabilizing, p16

Token Passing Down
• Top (Process n-1)
    x = 0  up = false
    x = 1  up = true
    x = 1  up = true
    x = 1  up = true
    x = 1  up = true
    . . .
    x = 1  up = true
• Bottom (Process 0)
• Rule applied: if x[n-2] != x[n-1] then x[n-1] := x[n-2]
• Process n-1 after the move: x = 1

CS294, Yelick Self Stabilizing, p17

Proof Idea for Correct States
• If the initialization of states is correct
– One can divide the processor line in two parts based on “up”
• Two processors, i and i-1, sit at the boundary
• All processors above i have the same x value as x[i]; all below i-1 the same as x[i-1]
– An action in the program is enabled only when the token is held
– Only 1 action is enabled (and only 1 process holds the token at any given time)
• The above can be checked by examining the predicates on the rules

CS294, Yelick Self Stabilizing, p18

Locally Checkable Properties
• In any good state, the following hold:
– If up[i-1] = up[i], then x[i-1] = x[i]
– If up[i] = true then up[i-1] = true
• These are enough to show that only 1 processor is enabled
• These are locally checkable
– a local set (pair) of processors can detect an incorrect state
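The two predicates above can be checked pair by pair, as in this small sketch. The function name `violated_pairs` is hypothetical; the example states follow the line drawn on the earlier slides, listed from process 0 (bottom) to process n-1 (top).

```python
def violated_pairs(x, up):
    """Return each index i whose pair (i-1, i) breaks a local predicate."""
    bad = []
    for i in range(1, len(x)):
        same_up_but_x_differs = up[i - 1] == up[i] and x[i - 1] != x[i]
        up_true_above_false = up[i] and not up[i - 1]
        if same_up_but_x_differs or up_true_above_false:
            bad.append(i)
    return bad


good_x = [1, 1, 1, 0, 0, 0]
good_up = [True, True, True, False, False, False]
print(violated_pairs(good_x, good_up))                      # no pair complains

corrupt_x = [1, 1, 1, 0, 1, 0]                              # one x bit flipped
print(violated_pairs(corrupt_x, good_up))                   # pairs around index 4

corrupt_up = [True, False, True, False, False, False]       # one up bit flipped
print(violated_pairs(good_x, corrupt_up))
```

Each check reads only two adjacent processes, which is what makes the property locally checkable.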

CS294, Yelick Self Stabilizing, p19

General Stabilization Technique 1

• Varghese proposed local checking and correction as a general technique
• Turn local checks into local correction
– Consider processors as tree (line is special case)
– Consider I-1 to be I’s parent
– For each node I (I != 0), add Correction action: check the local predicate between I and its parent, correct I’s state if necessary
– Correction affects only child, not parent
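A rough sketch of a correction action for the line protocol is given below. The specific repair (clear up[I], then copy the parent's x) is an assumption chosen for illustration, not necessarily the correction Varghese uses, and `pair_ok`, `correct`, and `sweep` are hypothetical names.

```python
def pair_ok(x, up, i):
    """Local predicate between node i and its parent i-1."""
    if up[i] and not up[i - 1]:
        return False
    if up[i - 1] == up[i] and x[i - 1] != x[i]:
        return False
    return True


def correct(x, up, i):
    """Repair node i against its parent; the parent's state is never written."""
    if up[i] and not up[i - 1]:
        up[i] = False
    if up[i - 1] == up[i] and x[i] != x[i - 1]:
        x[i] = x[i - 1]


def sweep(x, up):
    """Apply the correction action once at each non-root node, root outward."""
    x, up = list(x), list(up)
    for i in range(1, len(x)):
        if not pair_ok(x, up, i):
            correct(x, up, i)
    return x, up


x, up = sweep([0, 1, 0, 1, 1, 0], [True, False, True, False, True, False])
print(x, up, all(pair_ok(x, up, i) for i in range(1, len(x))))
```

Because a correction writes only the child, fixing pair (I-1, I) cannot break the already-checked pair (I-2, I-1), so one pass from the root outward leaves every local predicate satisfied.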

CS294, Yelick Self Stabilizing, p20

Practical Issues
• Dijkstra’s algorithm works without the explicit correction step
• For more complex protocols, correction is used
• Although Dijkstra’s algorithm is self-stabilizing, it goes through states where mutual exclusion is not guaranteed

CS294, Yelick Self Stabilizing, p21

Token Passing on Ring
• Processor 0 and n-1 are neighbors
• Initially, count = 0, except for processor 0 where count = 1
• Zero-move
– If count[0] = count[n-1] then count[0] := (count[0]+1) mod (n+1)
• Other-move
– If count[i] != count[i-1] then count[i] := count[i-1]
• Note: this is Dijkstra’s first, k-state algorithm
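The two moves can be simulated directly. The sketch below is an illustration under stated assumptions: `enabled`, `move`, and `run_ring` are hypothetical names, and the demon is modelled as a seeded random choice among the enabled processors.

```python
import random


def enabled(count):
    """Processors whose guard holds, i.e. the current token holders."""
    n = len(count)
    procs = [0] if count[0] == count[n - 1] else []
    procs += [i for i in range(1, n) if count[i] != count[i - 1]]
    return procs


def move(count, i):
    """Zero-move for processor 0, Other-move for everyone else."""
    if i == 0:
        count[0] = (count[0] + 1) % (len(count) + 1)
    else:
        count[i] = count[i - 1]


def run_ring(count, steps, rng):
    """Run `steps` moves; return the number of token holders before each."""
    count = list(count)
    trace = []
    for _ in range(steps):
        procs = enabled(count)
        trace.append(len(procs))
        move(count, rng.choice(procs))
    return trace


# Initial state from the slide: a single token circulates.
print(run_ring([1, 0, 0, 0], 12, random.Random(0)))
# Arbitrary counters: several tokens at first, one after stabilization.
print(run_ring([3, 0, 2, 4], 40, random.Random(0)))
```

If no Other-move is enabled then all counters are equal, which enables the Zero-move, so some processor can always step.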

CS294, Yelick Self Stabilizing, p22

Token Ring Execution

[Figure: ring of processors labeled with counter values x and x-1; p0 and the token position are marked]

Good States:
• For I = 1…n-1, either count[I-1] = count[I] or count[I-1] = count[I]+1
• Either count[0] = count[n-1] or count[0] = count[n-1]+1

CS294, Yelick Self Stabilizing, p23

Proof Idea
• The following can be shown
– In any execution, P0 will eventually increment its counter (because moves by all other processors decrease the number of counter values)
– In any execution, P0 will eventually reach a “fresh” counter value
– Any state in which P0 has a fresh counter value m is eventually followed by a state in which all processes have m

CS294, Yelick Self Stabilizing, p24

General Stabilization Technique 2

• Varghese proposes counter flushing as a general technique for stabilization
– Start with some sender (P0) sending messages to the others in rounds
– Make stabilizing by numbering messages with counters (max ctr > N)
– Sender must eventually get a “fresh” value

CS294, Yelick Self Stabilizing, p25

Compilers and Stabilization
• Two useful properties for compilers (according to Schneider):
– Self-stabilizing source code should produce self-stabilizing object code
– Compiler should produce a self-stabilizing version of our program even if the source code is not

CS294, Yelick Self Stabilizing, p26

Compilers Con’t
• Fundamental difference between symmetric and asymmetric rings
• Self-stabilization is “unstable” across architectures
• There is a class of programs for which a compiler can be written to “force” stabilization

CS294, Yelick Self Stabilizing, p27

Summary
• Self-stabilizing algorithms
– Overlooked for 10 years
– Revived in distributed algorithms community
– Algorithms for: MST, Communication, …
• Relevance to practice
– Tolerating transient faults is important
– Do these ideas appear in real systems?
• See http://www.cs.uiowa.edu/ftp/selfstab/bibliography/stabib.html