
CS294, Yelick Consensus revisited, p1

CS 294-8: Consensus Revisited

http://www.cs.berkeley.edu/~yelick/294


Agenda

• Consensus overview
• Classic impossibility proof by FLP:
– Impossibility of consensus in shared memory with n-1 failures
– Impossibility of consensus in shared memory with 1 failure
– Impossibility of consensus with message passing
• What does this mean in practice?
• Administrivia


Models

• Failures:
– Link failures
– Processor crash failures
– Byzantine processor failures
• Timing:
– Synchronous: lock-step algorithms
– Asynchronous: unbounded delay
– Partially synchronous: bounds on message delay or processor speed differences


The Consensus Problem

• In general, the consensus problem is to get all non-faulty processors to agree on something:
– To commit a transaction
– Which processors are “up”
– Which version of a file to use
• Abstract problem: Every processor has an input
– Termination: Eventually every non-faulty processor must decide on a value.
– Agreement: All non-faulty decisions must be the same.
– Validity: If all inputs are the same, then the non-faulty decision must be that input.
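
The three conditions can be checked mechanically on the record of one finished run. The sketch below is only an illustration: the function name `check_consensus` and the dictionary-based run record are assumptions made for this example, not part of the lecture.

```python
from typing import Dict, Optional, Set


def check_consensus(inputs: Dict[int, int],
                    decisions: Dict[int, Optional[int]],
                    faulty: Set[int]) -> Dict[str, bool]:
    """Check termination, agreement and validity for one recorded run.

    inputs:    processor id -> input value
    decisions: processor id -> decided value, or None if it never decided
    faulty:    ids of processors that failed in this run
    (All three structures are hypothetical, chosen for illustration.)
    """
    correct = [p for p in inputs if p not in faulty]
    decided = [decisions.get(p) for p in correct]

    # Termination: every non-faulty processor decided something.
    termination = all(d is not None for d in decided)

    # Agreement: the non-faulty decisions contain at most one value.
    values = {d for d in decided if d is not None}
    agreement = len(values) <= 1

    # Validity: only constrains runs in which all inputs are equal.
    if len(set(inputs.values())) == 1:
        common = next(iter(inputs.values()))
        validity = values <= {common}
    else:
        validity = True

    return {"termination": termination,
            "agreement": agreement,
            "validity": validity}
```

For example, a run in which processor 2 crashes and the others decide their common input satisfies all three properties, while a run in which two correct processors decide different values fails agreement.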


Impossibility of Asynchronous Consensus

Proof outline:
1. Show impossible in shared memory with n-1 faults (wait-free consensus).
2. Step 1 implies there is no 2-processor algorithm resilient to 1 fault.
3. Show impossible in shared memory with 1 fault, by reduction.
4. Show impossible in message passing systems, by reduction.
Original result by Fischer/Lynch/Paterson. This proof presentation is due to Welch.


Step 1: Impossibility of Wait-Free Consensus

• An algorithm for n processors is wait-free if it can tolerate n-1 crashed processors

• Theorem 1: There is no wait-free consensus algorithm in an asynchronous shared memory system.

• Proof plan: By contradiction. Classify configurations C according to how many different decisions are reachable:
– Bivalent: both 0 and 1 are reachable
– Univalent: only one output is reachable (0-valent or 1-valent)
• Three lemmas lead to the result

[Figure: a configuration C from which the decisions 0 0 1 0 1 1 are reachable.]
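
As a concrete reading of the classification, valency can be computed on a small, explicitly given configuration graph by collecting the decisions reachable from a configuration. This is a toy sketch: the graph, the configuration names and the helper names `reachable_decisions` and `valency` are all made up for illustration.

```python
from typing import Dict, List, Set


def reachable_decisions(graph: Dict[str, List[str]],
                        decided: Dict[str, int],
                        start: str) -> Set[int]:
    """Collect every decision value reachable from `start` (depth-first)."""
    seen: Set[str] = set()
    stack = [start]
    found: Set[int] = set()
    while stack:
        config = stack.pop()
        if config in seen:
            continue
        seen.add(config)
        if config in decided:
            found.add(decided[config])
        stack.extend(graph.get(config, []))
    return found


def valency(graph: Dict[str, List[str]],
            decided: Dict[str, int],
            config: str) -> str:
    """Classify a configuration by the set of reachable decisions."""
    values = reachable_decisions(graph, decided, config)
    if values == {0, 1}:
        return "bivalent"
    if values == {0}:
        return "0-valent"
    if values == {1}:
        return "1-valent"
    return "no decision reachable"


# Hypothetical toy graph: C branches into L and R.
toy_graph = {"C": ["L", "R"], "L": ["L0a", "L0b"], "R": ["R0", "R1"]}
toy_decided = {"L0a": 0, "L0b": 0, "R0": 0, "R1": 1}
```

In the toy graph, `C` and `R` are bivalent while `L` is 0-valent, so a step from `C` to `L` is the kind of step that commits the system to a decision.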


Impossibility of Wait-Free Consensus (cont’d)

• Lemma 1: There is an initial configuration that is bivalent.

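• Proof: Assume all initial configurations are univalent. By validity, the all-0 initial configuration is 0-valent and the all-1 initial configuration is 1-valent. Build a chain of initial configurations between them in which neighbors differ in one input:

[Figure: chain of initial configurations 00000, …, xx0xx, xx1xx, …, 11111; the left part of the chain is 0-valent and the right part is 1-valent.]

Somewhere in the chain a 0-valent configuration C and a 1-valent configuration C’ are adjacent, and C and C’ differ only in 1 input, that of processor i. Consider executions in which i fails immediately: the other processors cannot tell C from C’, so since C produces 0, so does C’, a contradiction.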
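
The chain argument is a discrete "intermediate value" search: if the two ends of the chain have different valencies, some adjacent pair must flip. A minimal sketch, assuming a hypothetical `valency_of` function that labels each initial input vector (here a made-up majority rule stands in for a real algorithm):

```python
from typing import Callable, List, Optional, Tuple

Inputs = Tuple[int, ...]


def input_chain(n: int) -> List[Inputs]:
    """Initial input vectors 00...0, 10...0, 110...0, ..., 11...1.

    Adjacent vectors differ in exactly one position.
    """
    return [tuple(1 if j < k else 0 for j in range(n)) for k in range(n + 1)]


def find_flip(chain: List[Inputs],
              valency_of: Callable[[Inputs], int]
              ) -> Optional[Tuple[Inputs, Inputs, int]]:
    """Return the first adjacent pair with different valencies and the
    index i of the single input in which they differ."""
    for left, right in zip(chain, chain[1:]):
        if valency_of(left) != valency_of(right):
            i = next(j for j in range(len(left)) if left[j] != right[j])
            return left, right, i
    return None


def majority_valency(inputs: Inputs) -> int:
    """Hypothetical labeling: 1-valent iff a strict majority of inputs is 1."""
    return int(2 * sum(inputs) > len(inputs))


flip = find_flip(input_chain(5), majority_valency)
```

With five processors and the majority labeling, the flip is found between `(1, 1, 0, 0, 0)` and `(1, 1, 1, 0, 0)`, which differ only in the input of processor 2. That processor is the one the proof crashes immediately.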



• Lemma 2: If C1 and C2 are univalent and C1 and C2 are equivalent at pi, then C1 and C2 have the same valency.
• Proof: Suppose C1 is v-valent.
– Since the algorithm is wait-free (i.e., all other processors could stop), there is a schedule from C1 in which only pi takes steps and which causes pi to decide v.
– Since pi cannot tell the difference between C1 and C2, if the same schedule is applied to C2, pi also decides v there.
– Thus C2 is also v-valent.



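• Lemma 3: If C is bivalent, then at least one processor is not critical, i.e., it can take a step and keep the system bivalent.
• Proof: By contradiction, using a case analysis.
– Suppose, for contradiction, that all processors are critical. Then there exist processors pi and pj whose steps from C lead to univalent configurations of opposite valency; say pi’s step leads to a 0-valent configuration and pj’s step to a 1-valent one.

[Figure: bivalent configuration C (0/1) with a pi step and a pj step leading to univalent configurations of opposite valency.]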



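• Case 1: pi and pj access different registers or read the same register
• But these operations commute: pi’s step followed by pj’s step yields the same configuration as pj’s step followed by pi’s step. That configuration would have to be both 0-valent and 1-valent => a contradiction.

[Figure: from bivalent C (0/1), the pi step and the pj step are taken in either order and meet in a single configuration marked “??”.]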
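
The commutativity claim in Case 1 can be checked on a toy model of a configuration. The model below is an assumption made for illustration: a configuration is the register contents plus, for each processor, the last value it read; the names `make_config` and `step` are hypothetical.

```python
def make_config(regs, local):
    """Freeze register contents and local states into a comparable value."""
    return (tuple(sorted(regs.items())), tuple(sorted(local.items())))


def step(config, op):
    """Apply one operation to a configuration and return the new one.

    op is ("read", proc, reg) or ("write", proc, reg, value).
    A read stores the register value in the reader's local state.
    """
    regs, local = dict(config[0]), dict(config[1])
    if op[0] == "read":
        local[op[1]] = regs[op[2]]
    else:
        regs[op[2]] = op[3]
    return make_config(regs, local)


C = make_config({"R": 0, "S": 0}, {"pi": 0, "pj": 0})

# pi and pj access different registers: the order does not matter.
a, b = ("write", "pi", "R", 1), ("read", "pj", "S")
different_regs_commute = step(step(C, a), b) == step(step(C, b), a)

# pi and pj both read the same register: the order does not matter.
c, d = ("read", "pi", "R"), ("read", "pj", "R")
two_reads_commute = step(step(C, c), d) == step(step(C, d), c)

# Contrast: a write and a read of the same register do not commute,
# which is why Cases 2 and 3 need a separate argument.
e, f = ("write", "pi", "R", 1), ("read", "pj", "R")
write_read_commute = step(step(C, e), f) == step(step(C, f), e)
```

The first two comparisons evaluate to `True` and the last to `False`, matching the split between Case 1 and the remaining cases.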



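• Case 2: pi writes to and pj reads from the same register R
• Let C+i be the configuration after executing pi and C+j+i be the configuration after executing pj then pi. pj’s read changes only pj’s local state, so C+i is equivalent to C+j+i from pi’s perspective. But the two configurations have different valencies, contradicting Lemma 2.

[Figure: from bivalent C (0/1), one branch is “pi writes to R”; the other branch is “pj reads from R” followed by “pi writes to R”, ending in “??”.]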



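• Case 3: pi and pj write to the same register R
• As in case 2, we can “run” the completion of the left-hand execution (the one that starts with pi’s write) after pj’s write followed by pi’s write. Since pi overwrites R, pi cannot tell the two configurations apart, so both executions result in 0, even though the execution that starts with pj’s write should have the opposite valency.

[Figure: from bivalent C (0/1), one branch is “pi writes to R”; the other branch is “pj writes to R” followed by “pi writes to R”, ending in “??”.]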
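
Cases 2 and 3 both rest on pi being unable to distinguish C+i from C+j+i. The sketch below checks this on a toy model in which "equivalent at p" is taken to mean the same register contents and the same local state of p; that definition, the model and the names `run` and `equivalent_at` are assumptions made for this example.

```python
def run(ops, regs, local):
    """Apply a sequence of operations to copies of the given state.

    Each op is ("read", proc, reg) or ("write", proc, reg, value).
    """
    regs, local = dict(regs), dict(local)
    for op in ops:
        if op[0] == "read":
            local[op[1]] = regs[op[2]]
        else:
            regs[op[2]] = op[3]
    return regs, local


def equivalent_at(proc, a, b):
    """Same shared registers and same local state of `proc`."""
    return a[0] == b[0] and a[1][proc] == b[1][proc]


regs0 = {"R": 0}
local0 = {"pi": None, "pj": None}
pi_write = ("write", "pi", "R", 1)

# C+i: only pi's write has happened.
c_i = run([pi_write], regs0, local0)

# Case 2: pj reads R first, then pi writes R.
c_j_i_read = run([("read", "pj", "R"), pi_write], regs0, local0)
case2_equivalent = equivalent_at("pi", c_i, c_j_i_read)
case2_identical = c_i == c_j_i_read  # pj's local state differs

# Case 3: pj writes R first, then pi overwrites it.
c_j_i_write = run([("write", "pj", "R", 2), pi_write], regs0, local0)
case3_equivalent = equivalent_at("pi", c_i, c_j_i_write)
```

In Case 2 the two configurations differ only in pj's local state, and in Case 3 pj's write leaves no trace at all once pi has overwritten R; in both cases a solo run by pi must decide the same value from either configuration.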



• Theorem 1: There is no wait-free consensus algorithm in an asynchronous shared memory system.

• Proof: Construct an execution in which all configurations are bivalent.
1. Start with the bivalent initial configuration from Lemma 1.
2. Use Lemma 3 to get the next bivalent configuration.
3. Repeat step 2 infinitely often.
No processor ever decides in this execution, which contradicts termination.


Impossibility of Single Failure Consensus

Even if the ratio of faulty processors is very low, consensus cannot be solved in asynchronous shared memory

Proof outline:
1. Assume there exists an algorithm A for n processors and 1 failure.
2. Use A as a subroutine to design an algorithm A’ for 2 processors and 1 failure.
3. The previous result shows A’ cannot exist.
4. Thus A does not exist.


Impossibility of Single Failure Consensus (cont’d)

Proof assumptions, for processors q0, …, qn-1:

1. Each qi has a single register Ri, which it writes and the others read.

2. The code of each qi alternates reads and writes, beginning with a read.

3. Each write step of each qi writes qi’s entire current state into Ri.

All of these are without loss of generality.



Idea of algorithm A’ for p0 and p1:

1. Each pi goes through the qj’s in round-robin order, trying to simulate their steps. Steps are grouped into pairs: a read and the following write.

2. When pi begins the simulation of qj, it uses its own input as the input for qj. If pi ever simulates a decision step by qj, it decides the same thing.

3. How do p0 and p1 keep their simulations consistent? They need to “agree” on the value of each qj’s local state after each pair of steps by qj.



For qj’s kth pair, p0 and p1 each have a flag variable:

1. Assume qj’s (k-1)st pair has been computed.

2. pi calculates its suggestion for qj’s state after the kth pair (see later slides) and writes it.

3. pi checks whether the other processor, p1-i, has made a suggestion for this state of qj.

4. If not, then pi sets its flag to 1.

5. If so, then pi sets its flag to 0.



Note the order of operations:

p0:
1. Write suggest0
2. Read suggest1
3. Write flag0 (1 if suggest1 is empty, 0 otherwise)

p1:
1. Write suggest1
2. Read suggest0
3. Write flag1 (1 if suggest0 is empty, 0 otherwise)

So two 0 flags are possible, but not two 1’s.
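
The claim about the flags can be confirmed by brute force, since each processor takes only three steps (write own suggestion, read the other's suggestion, write own flag). The sketch below enumerates all interleavings of the two step sequences; the function names and the string used as a suggestion are illustrative assumptions.

```python
from itertools import combinations


def run(schedule):
    """Simulate one interleaving and return (flag0, flag1).

    schedule is a sequence of 0s and 1s naming which processor takes its
    next step; each processor appears exactly three times.
    """
    suggest = [None, None]    # suggest0, suggest1 (None means empty)
    flag = [None, None]       # flag0, flag1
    saw_other = [None, None]  # local result of each processor's read
    pc = [0, 0]               # next step of each processor
    for i in schedule:
        other = 1 - i
        if pc[i] == 0:        # step 1: write own suggestion
            suggest[i] = f"state-suggested-by-p{i}"
        elif pc[i] == 1:      # step 2: read the other's suggestion
            saw_other[i] = suggest[other] is not None
        else:                 # step 3: write own flag
            flag[i] = 0 if saw_other[i] else 1
        pc[i] += 1
    return tuple(flag)


def all_schedules():
    """Yield every interleaving of three p0 steps and three p1 steps."""
    for slots in combinations(range(6), 3):
        yield tuple(0 if t in slots else 1 for t in range(6))


outcomes = {run(s) for s in all_schedules()}
```

Over all 20 interleavings the only outcomes are `(1, 0)`, `(0, 1)` and `(0, 0)`. Two 1 flags would require each processor to read before the other writes, which the write-then-read order rules out.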



Interpretation of flags:
1. If pi’s flag is 1, then pi is the winner.
2. If both are 0, then consider p0 the winner.
3. If one is 0 and the other is not yet set, the winner is not yet determined.
4. If neither is set, the winner is not yet determined.
5. It is not possible for both to be 1.

In cases 1 and 2, the kth pair is said to be computed; otherwise not.
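
The five cases translate directly into a small decision function. This is a sketch with an assumed encoding: an unset flag is represented by `None`, and the function name `winner` is made up for the example.

```python
from typing import Optional


def winner(flag0: Optional[int], flag1: Optional[int]) -> Optional[int]:
    """Return 0 or 1 for the winning processor, or None if the winner is
    not yet determined (i.e. the pair is not yet computed).

    An unset flag is encoded as None (an assumption of this sketch).
    """
    if flag0 == 1 and flag1 == 1:
        # Case 5: ruled out by the write-then-read order of operations.
        raise ValueError("both flags are 1, which cannot happen")
    if flag0 == 1:
        return 0              # case 1 with i = 0
    if flag1 == 1:
        return 1              # case 1 with i = 1
    if flag0 == 0 and flag1 == 0:
        return 0              # case 2: tie broken in favour of p0
    return None               # cases 3 and 4: not yet determined
```

A pair counts as computed exactly when `winner` returns 0 or 1, and the winner's suggestion is then the agreed state of qj after that pair.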



How does pi calculate its suggestion for qj’s state after qj’s kth pair?

• pi gets qj’s state after its (k-1)st pair:
– If k-1 = 0, then use qj’s initial state with pi’s input
– Otherwise get the suggestion of the winner for qj’s (k-1)st pair.
• Consult qj’s state (just obtained) to determine which qr’s register is to be read in its kth pair
• Get the current value of qr’s register by finding the largest m such that qr’s mth pair has been computed, and take the winning suggestion for that pair
• Apply qj’s transition function to get the value of qj’s state after its kth pair
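
The four bullets above can be read as one function. The following is a rough sketch under several assumptions that are not in the slides: `computed[j]` is a list holding the winning suggestions for qj's computed pairs in order, a register that has never been written reads as `None`, and the toy `init_state`, `reg_to_read` and `transition` functions stand in for the unknown algorithm A.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Tuple[int, int, int]  # toy state: (processor index, value, pairs done)


def suggest(j: int, k: int, my_input: int,
            computed: Dict[int, List[State]],
            init_state: Callable[[int, int], State],
            reg_to_read: Callable[[State], int],
            transition: Callable[[State, Optional[State]], State]) -> State:
    """Compute a suggestion for q_j's state after its k-th pair (k >= 1)."""
    # State after the (k-1)st pair: the initial state, or the winning suggestion.
    prev = init_state(j, my_input) if k == 1 else computed[j][k - 2]
    # Which q_r's register does q_j read in its k-th pair?
    r = reg_to_read(prev)
    # Current value of R_r: winning suggestion of q_r's latest computed pair.
    reg_value = computed[r][-1] if computed[r] else None
    # Apply q_j's transition function (the read followed by the write).
    return transition(prev, reg_value)


# Toy stand-ins for algorithm A on n = 3 simulated processors (hypothetical).
def toy_init(j: int, inp: int) -> State:
    return (j, inp, 0)


def toy_reg_to_read(state: State) -> int:
    return (state[0] + 1) % 3


def toy_transition(state: State, reg: Optional[State]) -> State:
    seen = state[1] if reg is None else max(state[1], reg[1])
    return (state[0], seen, state[2] + 1)


toy_computed: Dict[int, List[State]] = {0: [], 1: [(1, 1, 1)], 2: []}
first = suggest(0, 1, 0, toy_computed, toy_init, toy_reg_to_read, toy_transition)
```

In the toy run, q0 starts with input 0, reads q1's register (whose latest computed state carries value 1) and ends its first pair in state `(0, 1, 1)`.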



Each execution of A’ (by the p’s) simulates an execution of A (by the q’s). If pi observes a qj making a decision, then it makes the same decision.

• If the simulated execution is “admissible” (by the failure assumption on the q’s), then it satisfies:
– Termination: eventually all non-faulty q’s decide
– Agreement: all q’s agree
– Validity: if all q’s have input v, then the decision is v.
• So A’ would be a correct algorithm



Why is the simulated execution admissible? We need to show that at least n-1 processors take an infinite number of steps in it.

How can a simulation of qj be blocked? If p0 or p1 crashes during its simulation of qj’s kth pair, e.g.:
– p0 writes a suggestion, then crashes
– p1 sees p0’s suggestion and writes 0 to its flag
– p0’s flag remains unset forever
– So qj’s kth pair is never computed

• But the crash of one pi can only block the simulation of one qj. In the example, p1 would continue simulating all other q’s, so the blocked qj looks like the single crashed processor that A tolerates.


Impossibility of Consensus in Message Passing

• Assume there exists an n-processor consensus algorithm A for message passing with 1 fault
• Use A as a subroutine to design A’ for shared memory
• Previous results show A’ cannot exist
• So A cannot exist
• Idea of A’: Simulate message channels with read/write registers. Then run A on top of these channels to get A’.
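
The slides do not spell out how a channel is built from registers. One possible construction, sketched below as an assumption, uses a single-writer register per directed channel that holds the whole sequence of messages sent so far; the receiver remembers how many it has already consumed. The class names `Register` and `Channel` are hypothetical.

```python
class Register:
    """A read/write register, written only by the channel's sender."""

    def __init__(self):
        self._value = ()

    def write(self, value):
        self._value = value

    def read(self):
        return self._value


class Channel:
    """One directed message channel simulated with one register."""

    def __init__(self):
        self._reg = Register()
        self._sent = ()     # sender-local copy of everything sent so far
        self._consumed = 0  # receiver-local count of messages delivered

    def send(self, msg):
        """Sender: append to the local log and write the whole log."""
        self._sent = self._sent + (msg,)
        self._reg.write(self._sent)

    def receive_all(self):
        """Receiver: read the log and return only the new messages."""
        log = self._reg.read()
        new = list(log[self._consumed:])
        self._consumed = len(log)
        return new


ch = Channel()
ch.send("propose 0")
ch.send("propose 1")
first_batch = ch.receive_all()
second_batch = ch.receive_all()
ch.send("decide 1")
third_batch = ch.receive_all()
```

Because the receiver only ever reads, a crashed or slow receiver never blocks the sender, and messages arrive in order and without loss, which is the behavior a message-passing algorithm A expects from its channels.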


Implications and Limitations of the Result

• FLP says consensus is impossible in an asynchronous environment, even with a single crash failure.
– All of the proofs are about liveness, not safety
  • Castro/Liskov rely on this
– Explains the “window of vulnerability” in practice:
  • Interval of time in which a fault can cause the entire system to wait indefinitely
– Do you care about liveness or response time (soft real-time guarantees)?
• From a theoretical perspective, one can also “get around” this result by:
– Using randomization (algorithm due to Ben-Or tolerates <= 1/3 faulty processors)
– Using read-modify-write (RMW) registers, rather than just read/write (R/W) registers
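
To see why an RMW register sidesteps the result, consider a single register with an atomic compare-and-swap: the first processor to swap its input in wins, and everyone else adopts that value. The sketch below is an illustration only; the `threading.Lock` merely stands in for the atomicity that hardware would provide, and the names `RMWRegister`, `decide` and `run_threads` are made up. Inputs are assumed never to be `None`.

```python
import threading


class RMWRegister:
    """Register with an atomic compare-and-swap (read-modify-write)."""

    def __init__(self):
        self._value = None
        self._lock = threading.Lock()  # models hardware atomicity only

    def compare_and_swap(self, expected, new):
        """Atomically set the value to `new` if it equals `expected`.

        Always returns the value held before the operation.
        """
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new
            return old


def decide(reg, my_input):
    """One RMW operation per processor: the first swap wins."""
    old = reg.compare_and_swap(None, my_input)
    return my_input if old is None else old


def run_threads(inputs):
    """Run one thread per input value and collect the decisions."""
    reg = RMWRegister()
    decisions = [None] * len(inputs)

    def worker(i):
        decisions[i] = decide(reg, inputs[i])

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(len(inputs))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return decisions
```

Each processor decides after a single operation of its own, regardless of how many others have stopped, so the scheme is wait-free. Agreement holds because only one swap can find the register empty, and validity holds because the stored value is always some processor's input.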


Overview of Results on Consensus

• Let f be the maximum number of faulty processors.

• The following are tight bounds for synchronous message passing:

                        Crash         Byzantine
Number of rounds        f+1           f+1
Number of processors    >= f+1        >= 3f+1
Message size            Polynomial    Polynomial

• The partially synchronous case is not as well studied.


Administrivia

• If you’re doing a project and haven’t met with me in the last 3 weeks, let me know ASAP.
• Final project deadlines:
– Poster session Dec 13 in the afternoon (with 262)
– Final papers due Dec 15
• Papers online for next week by Thursday
