
CS294, Yelick Consensus revisited, p1 CS 294-8 Consensus Revisited yelick/294.

Date post: 13-Dec-2015
CS294, Yelick Consensus revisited, p1

CS 294-8 Consensus Revisited

http://www.cs.berkeley.edu/~yelick/294


Agenda

• Consensus overview
• Classic impossibility proof by FLP:
  – Impossibility of consensus in shared memory with n-1 failures
  – Impossibility of consensus in shared memory with 1 failure
  – Impossibility of consensus with message passing
• What does this mean in practice?
• Administrivia


Models

• Failures:
  – Link failures
  – Processor crash failures
  – Byzantine processor failures

• Timing:
  – Synchronous: lock-step algorithms
  – Asynchronous: unbounded delay
  – Partially synchronous: bounds on message delay or processor speed differences


The Consensus Problem

• In general, the consensus problem is to get all non-faulty processors to agree on something:
  – To commit a transaction
  – Which processors are “up”
  – Which version of a file to use

• Abstract problem: Every processor has an input
  – Termination: Eventually every non-faulty processor must decide on a value.
  – Agreement: All non-faulty decisions must be the same.
  – Validity: If all inputs are the same, then the non-faulty decision must be that input.
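The three conditions can be checked mechanically on the outcome of one finished run. The sketch below is only an illustration: the function name `check_consensus` and the dictionary layout are assumptions, and "termination" here can only mean "every non-faulty processor had decided by the time the run was inspected".

```python
from typing import Dict, Optional, Set


def check_consensus(inputs: Dict[int, int],
                    decisions: Dict[int, Optional[int]],
                    faulty: Set[int]) -> Dict[str, bool]:
    """Check termination, agreement and validity on one recorded run.

    inputs:    processor id -> input value
    decisions: processor id -> decided value, or None if undecided
    faulty:    ids of processors that crashed (hypothetical bookkeeping)
    """
    correct = [p for p in inputs if p not in faulty]
    decided = [decisions.get(p) for p in correct]

    # Termination: every non-faulty processor has decided.
    termination = all(d is not None for d in decided)

    # Agreement: the non-faulty decisions contain at most one distinct value.
    values = {d for d in decided if d is not None}
    agreement = len(values) <= 1

    # Validity: if all inputs are equal, any decision must be that input.
    distinct_inputs = set(inputs.values())
    validity = len(distinct_inputs) != 1 or values <= distinct_inputs

    return {"termination": termination,
            "agreement": agreement,
            "validity": validity}
```

For example, with inputs `{0: 1, 1: 1, 2: 1}`, processor 2 crashed, and processors 0 and 1 both deciding 1, all three conditions hold.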


Impossibility of Asynchronous Consensus

Proof outline:
1. Show impossible in shared memory with n-1 faults (wait-free consensus).
2. Step 1 implies there is no 2-processor algorithm resilient to 1 fault.
3. Show impossible in shared memory with 1 fault, by reduction.
4. Show impossible in message passing systems, by reduction.

Original result by Fischer/Lynch/Paterson. This proof presentation is due to Welch.


Step 1: Impossibility of Wait-Free Consensus

• An algorithm for n processors is wait-free if it can tolerate n-1 crashed processors.

• Theorem 1: There is no wait-free consensus algorithm in an asynchronous shared memory system.

• Proof plan: By contradiction. Classify configurations C according to how many different decisions are reachable:
  – Bivalent: both 0 and 1 are reachable
  – Univalent: only one output is reachable (0-valent or 1-valent)

Three lemmas lead to the result.

[Figure: configuration C with the reachable decisions 0 0 1 0 1 1 drawn below it]
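The bivalent/univalent classification can be illustrated on a small, explicitly given execution tree. In this sketch a leaf is a decision (0 or 1) and an internal node is a tuple of child configurations; the tree shape used in the example is an assumption made for illustration, not the one drawn on the slide.

```python
def reachable_decisions(node):
    """Return the set of decisions reachable from a configuration.

    A leaf is an int (the decided value); an internal node is a
    non-empty tuple of successor configurations.
    """
    if isinstance(node, int):
        return frozenset({node})
    if not node:
        raise ValueError("internal nodes must have at least one successor")
    out = frozenset()
    for child in node:
        out |= reachable_decisions(child)
    return out


def valency(node):
    """Classify a configuration as 'bivalent', '0-valent' or '1-valent'."""
    decisions = reachable_decisions(node)
    if decisions == {0, 1}:
        return "bivalent"
    return f"{next(iter(decisions))}-valent"


# Hypothetical tree whose leaves, left to right, are 0 0 1 0 1 1.
C = ((0, 0), (1, (0, 1)), 1)
```

Here `valency(C)` is `"bivalent"`, while the leftmost subtree `C[0]` is `"0-valent"`.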


Impossibility of Wait-Free Consensus (con’t)

• Lemma 1: There is an initial configuration that is bivalent.

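• Proof: Assume all initial configurations are univalent. Build a chain of initial configurations from 00000 to 11111 in which neighbors differ in a single input. By validity, 00000 is 0-valent and 11111 is 1-valent, so somewhere in the chain a 0-valent configuration sits next to a 1-valent one:

  00000      …   xx0xx      xx1xx      …   11111
  0-valent       0-valent   1-valent       1-valent

But these two adjacent configurations, call them C and C’, differ only in 1 input, that of processor i. Consider executions in which i fails immediately: no other processor can tell C from C’, so since C produces 0, so does C’, a contradiction.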
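The chain argument can be traced in a few lines. The sketch below builds the chain 00…0, 10…0, 110…0, …, 11…1 and finds the first adjacent pair whose valencies differ. The `majority` function is a purely hypothetical stand-in for "the value this univalent configuration leads to"; no real protocol is being modeled.

```python
def input_chain(n):
    """Initial configurations from all-0 to all-1; neighbors differ in one input."""
    return [tuple(1 if j < i else 0 for j in range(n)) for i in range(n + 1)]


def first_flip(chain, valency_of):
    """Return (C, C', i): the first adjacent pair with different valency,
    and the index i of the single input in which they differ."""
    for a, b in zip(chain, chain[1:]):
        if valency_of(a) != valency_of(b):
            i = next(k for k in range(len(a)) if a[k] != b[k])
            return a, b, i
    return None


def majority(config):
    """Hypothetical valency assignment, used only to drive the example."""
    return 1 if sum(config) > len(config) / 2 else 0
```

With `n = 5` and the `majority` stand-in, the flip happens between `(1, 1, 0, 0, 0)` and `(1, 1, 1, 0, 0)`, which differ only in the input of processor 2. If that processor crashes before taking a step, the remaining processors see identical inputs in both configurations, which is the contradiction used in the proof.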


Impossibility of Wait-Free Consensus (con’t)

• Lemma 2: If C1 and C2 are univalent and C1 and C2 are equivalent at pi then C1 and C2 have the same valency.

• Proof: Suppose C1 is v-valent.
  – Since the algorithm is wait-free (i.e., all other processors could stop), there is a schedule σ in which only pi takes steps and that causes pi to decide v.
  – Since pi cannot tell the difference between C1 and C2, if σ is applied to C2, pi also decides v there.
  – Thus C2 is also v-valent.


Impossibility of Wait-Free Consensus (con’t)

• Lemma 3: If C is bivalent, then at least one processor is not critical, i.e., it can take a step and keep the system bivalent.

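• Proof: By cases:
  – Suppose in contradiction that all processors are critical. Then there exist processors pi and pj such that a step by pi leads to a 0-valent configuration and a step by pj leads to a 1-valent configuration.

[Figure: bivalent configuration C (0/1); pi’s step leads to a 0-valent configuration, pj’s step to a 1-valent configuration]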


Impossibility of Wait-Free Consensus (con’t)

• Case 1: pi and pj access different registers or read the same register

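• But these operations commute: pi’s step followed by pj’s step yields the same configuration as pj’s step followed by pi’s step, yet the first must be 0-valent and the second 1-valent => a contradiction.

[Figure: from bivalent C (0/1), pi’s step leads to a 0-valent configuration and pj’s step to a 1-valent one; applying the other step to each leads to the same configuration, marked “??”]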
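The commutation claim in Case 1 can be checked on a toy model of shared memory. In this sketch a configuration is a pair (registers, local states), and an operation is a tuple `("read", proc, reg)` or `("write", proc, reg, value)`; the model and all names are assumptions for illustration only.

```python
import copy


def step(config, op):
    """Apply one read or write and return the resulting configuration."""
    regs, local = copy.deepcopy(config)
    kind, proc, reg = op[0], op[1], op[2]
    if kind == "read":
        local[proc] = regs[reg]      # the reader remembers what it saw
    else:
        regs[reg] = op[3]            # the writer overwrites the register
    return regs, local


def commute(config, a, b):
    """True if applying a then b gives the same configuration as b then a."""
    return step(step(config, a), b) == step(step(config, b), a)


C = ({"R1": 0, "R2": 0}, {"pi": None, "pj": None})
```

Operations on different registers commute, as do two reads of the same register, whereas a write by pi and a read by pj on the same register do not (the value pj reads depends on the order).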


Impossibility of Wait-Free Consensus (con’t)

• Case 2: pi writes to and pj reads from the same register

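• Let C+i be the configuration after executing pi and C+j+i be the configuration after executing pj then pi. C+i is 0-valent and C+j+i is 1-valent, yet C+i is equivalent to C+j+i from pi’s perspective, contradicting Lemma 2.

[Figure: from bivalent C (0/1), “pi writes to R” leads to a 0-valent configuration; “pj reads from R” leads to a 1-valent configuration, followed by “pi writes to R”; the resulting pair is marked “??”]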
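The equivalence "from pi's perspective" can be made concrete on a toy model: pi's view is taken to be the register contents plus pi's own local state. Everything here (the `step` and `view` helpers, the per-processor step counter) is an assumption made for the sketch.

```python
import copy


def step(config, op):
    """Apply one read or write; each processor also counts its own steps."""
    regs, local = copy.deepcopy(config)
    kind, proc, reg = op[:3]
    if kind == "read":
        local[proc]["last_read"] = regs[reg]
    else:
        regs[reg] = op[3]
    local[proc]["steps"] += 1
    return regs, local


def view(config, proc):
    """What a solo run of proc can depend on: registers and its own state."""
    regs, local = config
    return regs, local[proc]


C = ({"R": 0},
     {"pi": {"last_read": None, "steps": 0},
      "pj": {"last_read": None, "steps": 0}})

write_i = ("write", "pi", "R", 7)
read_j = ("read", "pj", "R")

C_i = step(C, write_i)                  # C+i
C_j_i = step(step(C, read_j), write_i)  # C+j+i
```

The two configurations differ (pj has taken a step in one of them), but `view(C_i, "pi") == view(C_j_i, "pi")`, so any pi-only schedule behaves identically from both. The same check passes if pj's step is a write to R instead of a read, which is the situation in Case 3.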


Impossibility of Wait-Free Consensus (con’t)

• Case 3: pi and pj write to the same register

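• As in case 2, we can “run” the completion of the left-hand execution after pj’s write. Since pi overwrites R, pi cannot tell whether pj wrote first, so both executions result in 0, even though the right-hand one passes through a 1-valent configuration: a contradiction.

[Figure: from bivalent C (0/1), “pi writes to R” leads to a 0-valent configuration; “pj writes to R” leads to a 1-valent configuration, followed by “pi writes to R”; the resulting pair is marked “??”]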


Impossibility of Wait-Free Consensus (con’t)

• Theorem 1: There is no wait-free consensus algorithm in an asynchronous shared memory system.

• Proof: Construct an execution in which all configurations are bivalent.
  1. Start with the bivalent initial configuration from Lemma 1.
  2. Use Lemma 3 to get the next bivalent configuration.
  3. Repeat step 2 infinitely.
  No processor ever decides in this execution, contradicting termination.


Impossibility of Single Failure Consensus

Even if the ratio of faulty processors is very low, consensus cannot be solved in asynchronous shared memory.

Proof outline:
1. Assume there exists an algorithm A for n processors and 1 failure.
2. Use A as a subroutine to design an algorithm A’ for 2 processors and 1 failure.
3. The previous result shows A’ cannot exist.
4. Thus A does not exist.


Impossibility of Single Failure Consensus (con’t)

Proof assumptions: for processors q0, …, qn-1

1. Each qi has a single register Ri which it writes and others read

2. Code of each qi alternates reads and writes, beginning with a read

3. Each write step of each qi writes qi’s entire current state into Ri

All of these are without loss of generality.


Impossibility of Single Failure Consensus (con’t)

Idea of algorithm A’ for p0 and p1:

1. Each pi goes through the qj’s in round-robin order, trying to simulate their steps. Steps are grouped into pairs: a read and the following write.

2. When pi begins the simulation of qj, it uses its own input as the input for qj. If pi ever simulates a decision step by qj, it decides the same thing.

3. How do p0 and p1 keep their simulations consistent? They need to “agree” on the value of each qj’s local state after each pair of steps by qj.


Impossibility of Single Failure Consensus (con’t)

For qj’s kth pair, p0 and p1 each have a flag variable:

1. Assume qj’s k-1st pair has been computed.

2. pi calculates its suggestion for qj’s state after the kth pair (see later slides)

3. pi checks if p1-i (the other processor) has made a suggestion for this state of qj

4. If not then pi sets its flag to 1

5. If so, then pi sets its flag to 0


Impossibility of Single Failure Consensus (con’t)

Note order of operations:

p0:
1. Write suggest0
2. Read suggest1
3. Write flag0 (1 if suggest1 empty, 0 otherwise)

p1:
1. Write suggest1
2. Read suggest0
3. Write flag1 (1 if suggest0 empty, 0 otherwise)

So two 0 flags are possible, but not two 1’s.
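The claim about flag values can be checked by brute force over every interleaving of the two three-step sequences. The sketch assumes each processor writes its own suggestion, then reads the other processor's suggestion, then writes its flag; the suggestion contents (`"s0"`, `"s1"`) are placeholders.

```python
from itertools import combinations


def run(schedule):
    """Run one interleaving. schedule is a tuple of six processor ids
    (three 0s and three 1s) giving the order in which steps are taken."""
    suggest = [None, None]     # shared: suggest0, suggest1
    flag = [None, None]        # shared: flag0, flag1
    seen_other = [None, None]  # local: what each processor read
    pc = [0, 0]                # local: next step of each processor
    for i in schedule:
        if pc[i] == 0:
            suggest[i] = f"s{i}"              # 1. write own suggestion
        elif pc[i] == 1:
            seen_other[i] = suggest[1 - i]    # 2. read the other's suggestion
        else:
            flag[i] = 1 if seen_other[i] is None else 0   # 3. write flag
        pc[i] += 1
    return tuple(flag)


def all_outcomes():
    """Collect the final (flag0, flag1) pairs over all 20 interleavings."""
    outcomes = set()
    for slots in combinations(range(6), 3):
        schedule = tuple(0 if k in slots else 1 for k in range(6))
        outcomes.add(run(schedule))
    return outcomes
```

Under these assumptions the reachable final flag pairs are (1, 0), (0, 1) and (0, 0); the pair (1, 1) never occurs, because each processor writes its suggestion before it reads the other's.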


Impossibility of Single Failure Consensus (con’t)

Interpretation of flags:
1. If pi’s flag is 1, then pi is the winner.
2. If both are 0, then consider p0 the winner.
3. If one is 0 and the other is not yet set, the winner is not yet determined.
4. If neither is set, the winner is not yet determined.
5. Not possible for both to be 1.

In cases 1 and 2, the kth pair is said to be computed; otherwise not.
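The five rules translate directly into a small decision function. In this sketch an unset flag is represented by `None`, which is a representation chosen for the example.

```python
from typing import Optional


def winner(flag0: Optional[int], flag1: Optional[int]) -> Optional[int]:
    """Return 0 or 1 (index of the winning processor) if the pair is
    computed, or None if the winner is not yet determined."""
    if flag0 == 1 and flag1 == 1:
        # Rule 5: the protocol never produces two 1 flags.
        raise ValueError("both flags set to 1 is not a reachable state")
    if flag0 == 1:
        return 0          # rule 1 with i = 0
    if flag1 == 1:
        return 1          # rule 1 with i = 1
    if flag0 == 0 and flag1 == 0:
        return 0          # rule 2: tie goes to p0
    return None           # rules 3 and 4: not yet determined
```

The pair counts as computed exactly when `winner(...)` returns 0 or 1.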


Impossibility of Single Failure Consensus (con’t)

How does pi calculate its suggestion for qj’s state after qj’s kth pair?

• pi gets qj’s state after its k-1st pair:
  – If k-1 = 0, then use qj’s initial state with pi’s input
  – Otherwise get the suggestion of the winner for qj’s k-1st pair.

• Consult qj’s state (just obtained) to determine which qr’s register is to be read in its kth pair

• Get the current value of qr’s register by finding the largest m such that qr’s mth pair has been computed, and get the winning suggestion

• Apply qj’s transition function to get the value of qj’s state after its kth pair
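These steps can be written as one function over hypothetical helpers. In the sketch, `computed` maps `(processor index, pair number)` to the winning suggestion for that pair, and `initial_state`, `register_to_read` and `transition` stand in for the simulated algorithm A; all of these names and signatures are assumptions. Because each write stores the writer's entire state, the value of qr's register is taken to be qr's latest computed state, or `None` if qr has no computed pair yet.

```python
def suggest_state(j, k, my_input, computed,
                  initial_state, register_to_read, transition):
    """Compute a suggestion for q_j's state after its k-th read/write pair."""
    # State of q_j after its (k-1)-st pair.
    if k == 1:
        prev = initial_state(j, my_input)
    else:
        prev = computed[(j, k - 1)]

    # Which processor's register does q_j read in its k-th pair?
    r = register_to_read(j, prev)

    # Current value of q_r's register: winning suggestion of its latest
    # computed pair (None models a register that was never written).
    done = [m for (q, m) in computed if q == r]
    value = computed[(r, max(done))] if done else None

    # New state of q_j after reading that value and writing its state.
    return transition(j, prev, value)
```

A toy instantiation: with `initial_state = lambda j, x: ("init", x)`, `register_to_read = lambda j, s: (j + 1) % 3` and `transition = lambda j, s, v: ("saw", v)`, calling `suggest_state(0, 1, 1, {(1, 1): "a", (1, 2): "b"}, ...)` reads q1's latest computed state `"b"`.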


Impossibility of Single Failure Consensus (con’t)

Each execution of A’ (by p’s) simulates an execution of A (by q’s). If pi observes a qj making a decision, then it makes the same decision.

• If the simulated execution is “admissible” (by failure assumption on q’s) then it satisfies:
  – Termination: eventually all q’s decide
  – Agreement: all q’s agree
  – Validity: If all q’s have input v, then the decision is v.

• So A’ would be a correct consensus algorithm


Impossibility of Single Failure Consensus (con’t)

Why is the simulated execution admissible? We need to show that at least n-1 processors take an infinite number of steps in it.

How can a simulation of qj be blocked? If p0 or p1 crashes during its simulation of qj’s kth pair, e.g.:
  – p0 writes a suggestion then crashes
  – p1 sees p0’s suggestion and writes 0 to its flag
  – p0’s flag remains unset forever
  – So qj’s kth pair is never computed

• But the crash of 1 pi can only block the simulation of 1 qj. In the example, p1 would continue simulating all other q’s.


Impossibility of Consensus in Message Passing

• Assume there exists an n-processor consensus algorithm A for message passing with 1 fault

• Use A as a subroutine to design A’ for shared memory

• Previous results show A’ cannot exist

• So A cannot exist

• Idea of A’: Simulate message channels with read/write registers. Then run A on top of these channels to get A’.
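One simple way to picture a channel built from a read/write register is to let the sender keep the whole message history in a register that only it writes, while the receiver remembers how many messages it has already consumed. This is a sketch under that assumption; the class name `RegisterChannel` is hypothetical, and the lecture's actual construction may differ.

```python
class RegisterChannel:
    """One-directional channel backed by a single-writer register."""

    def __init__(self):
        self._register = ()   # shared register: written only by the sender
        self._consumed = 0    # receiver-local: messages already delivered

    def send(self, msg):
        """Sender: one write that replaces the register with the old
        history (which the sender knows locally) plus the new message."""
        self._register = self._register + (msg,)

    def receive(self):
        """Receiver: one read of the register, then deliver whatever is
        new since the last read (possibly nothing)."""
        history = self._register
        new = history[self._consumed:]
        self._consumed = len(history)
        return list(new)
```

A receive that returns an empty list corresponds to a message that has not yet arrived, so arbitrary (asynchronous) delay is still modeled: the receiver simply has not read the register recently enough.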


Implications and Limitations of the Result

• FLP says deterministic consensus is impossible in an asynchronous environment if even one processor can fail.
  – All of the proofs are about liveness, not safety
    • Castro/Liskov rely on this
  – Explains “window of vulnerability” in practice:
    • Interval of time in which a fault can cause the entire system to wait indefinitely
  – Do you care about liveness or response time (soft real-time guarantees)?

• From a theoretical perspective, one can also “get around” this result by:
  – Using randomization (algorithm due to Ben-Or tolerates <= 1/3 faulty processors)
  – Using RMW registers, rather than just R/W
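To see why a read-modify-write register changes the picture, consider compare-and-swap as the RMW operation: the first processor to swap its input in wins, and everyone decides the stored value after a single step. The sketch below is an illustration, not the lecture's construction; the `CASRegister` class is hypothetical, a `threading.Lock` stands in for hardware atomicity, and inputs are assumed never to be `None`.

```python
import threading


class CASRegister:
    """A register with an atomic compare-and-swap (one kind of RMW)."""

    def __init__(self):
        self._value = None
        self._lock = threading.Lock()   # models atomicity of the RMW step

    def compare_and_swap(self, expected, new):
        """Atomically: if the value equals expected, store new.
        Always return the value held before the operation."""
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new
            return old


def decide(register, my_input):
    """One RMW step per processor: the first caller installs its input,
    every later caller adopts the installed value."""
    old = register.compare_and_swap(None, my_input)
    return my_input if old is None else old
```

Each processor finishes after one operation regardless of how many others crash, so the algorithm is wait-free; agreement and validity follow because exactly one input is ever installed.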


Overview of Results on Consensus

• Let f be the maximum number of faulty processors.

• The following are tight bounds for synchronous message passing:

                          Crash         Byzantine
  Number of rounds        f+1           f+1
  Number of processors    >= f+1        >= 3f+1
  Message size            Polynomial    Polynomial

• Partially synchronous case is not as well studied.
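As a quick arithmetic aid, the round and processor bounds from the table can be evaluated for a given f. The function name and return layout are assumptions; the formulas are taken directly from the table.

```python
def synchronous_bounds(f, byzantine=False):
    """Bounds for synchronous message passing with at most f faults,
    as listed in the table: f+1 rounds; at least f+1 processors for
    crash faults and at least 3f+1 for Byzantine faults."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return {
        "rounds": f + 1,
        "min_processors": 3 * f + 1 if byzantine else f + 1,
    }
```

For instance, tolerating f = 2 Byzantine processors takes 3 rounds and at least 7 processors.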


Administrivia

• If you’re doing a project and haven’t met with me in the last 3 weeks, let me know asap.

• Final project deadlines:
  – Poster session Dec 13 in pm (with 262)
  – Final papers due Dec 15

• Papers online for next week by Thursday

