

Byzantine Fault Tolerance

CS 425: Distributed Systems, Fall 2011

Material derived from slides by I. Gupta and N. Vaidya

Reading List

• L. Lamport, R. Shostak, M. Pease, “The Byzantine Generals Problem,” ACM ToPLaS 1982.

• M. Castro and B. Liskov, “Practical Byzantine Fault Tolerance,” OSDI 1999.

Byzantine Generals Problem

A sender wants to send a message to n-1 other peers

• Fault-free nodes must agree

• If the sender is fault-free, they agree on its message

• Up to f failures

[Figure: sender S sends value v to peers 1, 2, and 3.]

Byzantine Generals Algorithm

Faulty peer

[Figure, built up over several slides: sender S sends value v to peers 1, 2, and 3, one of which is faulty. The peers then exchange the values they received; the faulty peer's messages are arbitrary ("?"). Each good peer ends up with the vector [v, v, ?].]

Majority vote results in correct result at good peers.

Byzantine Generals Algorithm

Faulty source

[Figure, built up over several slides: the faulty sender S sends three different values, v, w, and x, to the three peers. The peers then exchange the values they received, so every peer ends up with the same vector [v, w, x].]

Vote result identical at good peers.
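
The two scenarios in the figures (a faulty peer, then a faulty source) can be sketched as a small simulation of one relay round followed by a majority vote. This is only an illustrative sketch: the function names, the choice of peer 3 as the faulty peer, and the "retreat" fallback used when no value has a strict majority are assumptions, not taken from the slides.

```python
from collections import Counter

DEFAULT = "retreat"  # assumed fallback when no value has a strict majority


def majority(values):
    """Return the strict-majority value, or DEFAULT if there is none."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else DEFAULT


def one_round_agreement(sent, relay):
    """Simulate the single relay round shown in the figures.

    sent:  {peer: value the sender delivered to that peer}
    relay: function (from_peer, to_peer, received_value) -> relayed value;
           a faulty peer may return anything here.
    Returns {peer: decided value}.
    """
    peers = sorted(sent)
    decisions = {}
    for p in peers:
        # Own value from the sender, plus what every other peer claims it received.
        vector = [sent[p]] + [relay(q, p, sent[q]) for q in peers if q != p]
        decisions[p] = majority(vector)
    return decisions


def honest(from_peer, to_peer, value):
    return value


def peer3_lies(from_peer, to_peer, value):
    # Assumption: peer 3 is the faulty one and relays garbage.
    return "?" if from_peer == 3 else value


# Case 1: fault-free sender, faulty peer 3.
faulty_peer = one_round_agreement({1: "v", 2: "v", 3: "v"}, peer3_lies)

# Case 2: faulty sender sends three different values; all peers are honest.
faulty_source = one_round_agreement({1: "v", 2: "w", 3: "x"}, honest)
```

In the first case the good peers 1 and 2 both see [v, v, ?] and decide v. In the second case every peer sees some ordering of [v, w, x], finds no majority, and falls back to the same default, so the good peers still agree.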

Known Results

• Need 3f + 1 nodes to tolerate f failures

• Need Ω(n²) messages in general
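
The 3f + 1 bound is easy to turn into arithmetic. The helper names below are hypothetical; the sketch simply inverts n ≥ 3f + 1.

```python
def min_nodes(f):
    """Smallest system size that tolerates f Byzantine failures."""
    return 3 * f + 1


def max_faults(n):
    """Largest f such that n >= 3f + 1."""
    return (n - 1) // 3


for n in (3, 4, 7, 10):
    print(n, "nodes tolerate", max_faults(n), "Byzantine failure(s)")
```

Three nodes tolerate no Byzantine failure at all, which is why the figures use a sender plus three peers (n = 4) for f = 1.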

Ω(n²) Message Complexity

• Each message at least 1 bit

• Ω(n²) bits “communication complexity” to agree on just 1 bit value
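
As a rough illustration of quadratic growth, the sketch below counts the messages in the exchange pattern from the earlier figures: one broadcast from the sender, then one relay from each peer to every other peer. It only counts messages for that single-relay pattern; it is not a proof of the lower bound, and the function name is hypothetical.

```python
def relay_round_messages(n):
    """Messages in one sender broadcast plus one all-to-all relay among the peers."""
    peers = n - 1
    broadcast = peers               # sender -> each peer
    relays = peers * (peers - 1)    # each peer -> every other peer
    return broadcast + relays


for n in (4, 7, 10):
    print(n, "nodes:", relay_round_messages(n), "messages")
```

The total simplifies to (n-1)², so even this simplest exchange is already quadratic in n.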

Practical Byzantine Fault Tolerance

• Computer systems provide crucial services

• Computer systems fail
  – Crash-stop failure
  – Crash-recovery failure
  – Byzantine failure

• Examples: natural disaster, malicious attack, hardware failure, software bug, etc.

• Need highly available service: replicate to increase availability

Challenges

[Figure: two clients issue Request A and Request B to the replicated service.]

Requirements

• All replicas must handle the same requests despite failures.

• Replicas must handle requests in identical order despite failures.

Challenges

[Figure: the two clients' requests are given an order: 1: Request A, 2: Request B.]

State Machine Replication

[Figure: each of the four replicas holds the same ordered sequence, 1: Request A then 2: Request B, for the requests issued by the two clients.]

How to assign sequence numbers to requests?
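
The idea behind state machine replication can be sketched as follows: if every replica is deterministic and executes requests strictly in sequence-number order, all replicas reach the same state even when messages arrive in different orders. The class name and the key-value requests are hypothetical, chosen only for illustration.

```python
class KeyValueReplica:
    """A deterministic state machine: same ordered log in, same state out."""

    def __init__(self):
        self.state = {}
        self.log = {}          # sequence number -> request
        self.next_seq = 1      # next sequence number to execute

    def deliver(self, seq, request):
        """Record a request, then execute every request that is now in order."""
        self.log[seq] = request
        while self.next_seq in self.log:
            key, value = self.log[self.next_seq]
            self.state[key] = value
            self.next_seq += 1


requests = {1: ("x", "A"), 2: ("x", "B")}   # 1: Request A, 2: Request B

r1, r2 = KeyValueReplica(), KeyValueReplica()
for seq in (1, 2):          # requests arrive in order
    r1.deliver(seq, requests[seq])
for seq in (2, 1):          # requests arrive out of order
    r2.deliver(seq, requests[seq])
```

Both replicas end with the same state because the second one buffers request 2 until request 1 has been executed. The sketch takes the sequence numbers as given; agreeing on them is the hard part.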

Primary Backup Mechanism

[Figure: in View 0, two clients send their requests to the primary, which orders them as 1: Request A, 2: Request B.]

What if the primary is faulty?

• Agreeing on sequence number

• Agreeing on changing the primary (view change)

Normal Case Operation

• Three-phase algorithm:
  – PRE-PREPARE picks order of requests
  – PREPARE ensures order within views
  – COMMIT ensures order across views

• Replicas remember messages in log

• Messages are authenticated
  – ⟨·⟩σk denotes a message sent by k
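
Message authentication can be illustrated with a keyed MAC from the Python standard library. This is a stand-in under stated assumptions: the keys, the message encoding, and the use of HMAC-SHA256 are all hypothetical, and a shared-key MAC does not give every property of a digital signature.

```python
import hashlib
import hmac

# Hypothetical per-replica secret keys, for illustration only.
KEYS = {0: b"key-of-replica-0", 1: b"key-of-replica-1"}


def sign(k, payload: bytes) -> bytes:
    """Return an authenticator for payload as sent by replica k."""
    return hmac.new(KEYS[k], payload, hashlib.sha256).digest()


def verify(k, payload: bytes, tag: bytes) -> bool:
    """Check that tag authenticates payload as coming from replica k."""
    return hmac.compare_digest(sign(k, payload), tag)


msg = b"PRE-PREPARE|v=0|n=1|m=Request A"   # assumed encoding
tag = sign(0, msg)
```

Here `verify(0, msg, tag)` succeeds, while checking the same tag against replica 1's key, or against a modified message, fails.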

Pre-prepare Phase

[Figure: the client sends request m to the primary (Replica 0). The primary sends ⟨PRE-PREPARE, v, n, m⟩σ0 to Replicas 1, 2, and 3; Replica 3 has failed.]

Prepare Phase

[Figure, built up over three slides: after accepting the PRE-PREPARE for request m, a backup (Replica 1 in the figure) sends ⟨PREPARE, v, n, D(m), 1⟩σ1 to the other replicas; Replica 3 has failed.]

Collect PRE-PREPARE + 2f matching PREPARE

Commit Phase

[Figure: after the PRE-PREPARE and PREPARE exchanges for request m, a replica (Replica 2 in the figure) sends ⟨COMMIT, v, n, D(m)⟩σ2 to the other replicas; Replica 3 has failed.]

Commit Phase (2)

[Figure: all three phases (PRE-PREPARE, PREPARE, COMMIT) for request m among Replicas 0 to 3.]

Collect 2f+1 matching COMMIT: execute and reply
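
The two collection rules (PRE-PREPARE plus 2f matching PREPAREs, then 2f+1 matching COMMITs) can be sketched as bookkeeping at a single replica. This is a minimal sketch, not the protocol implementation: the class and method names are hypothetical, message authentication and view checks are omitted, and counting the replica's own PREPARE and COMMIT toward the thresholds is an assumption.

```python
from collections import defaultdict


class ReplicaLog:
    """Tracks messages for (view, seq, digest) slots at one replica."""

    def __init__(self, f):
        self.f = f
        self.pre_prepared = set()            # slots with an accepted PRE-PREPARE
        self.prepares = defaultdict(set)     # slot -> ids of replicas that sent PREPARE
        self.commits = defaultdict(set)      # slot -> ids of replicas that sent COMMIT

    def on_pre_prepare(self, v, n, d):
        self.pre_prepared.add((v, n, d))

    def on_prepare(self, v, n, d, sender):
        self.prepares[(v, n, d)].add(sender)

    def on_commit(self, v, n, d, sender):
        self.commits[(v, n, d)].add(sender)

    def prepared(self, v, n, d):
        """PRE-PREPARE plus 2f matching PREPAREs."""
        slot = (v, n, d)
        return slot in self.pre_prepared and len(self.prepares[slot]) >= 2 * self.f

    def committed(self, v, n, d):
        """Prepared, plus 2f + 1 matching COMMITs: safe to execute and reply."""
        return self.prepared(v, n, d) and len(self.commits[(v, n, d)]) >= 2 * self.f + 1


# f = 1 with Replica 3 failed, as in the figures.
log = ReplicaLog(f=1)
log.on_pre_prepare(0, 1, "D(m)")
for sender in (1, 2):                # PREPAREs from the two live backups
    log.on_prepare(0, 1, "D(m)", sender)
ready_to_commit = log.prepared(0, 1, "D(m)")
for sender in (0, 1, 2):             # COMMITs from the three live replicas
    log.on_commit(0, 1, "D(m)", sender)
can_execute = log.committed(0, 1, "D(m)")
```

Storing sender ids in sets means duplicate messages from one replica are counted once, so a single faulty replica cannot fill a quorum by repeating itself.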

View Change

• Provide liveness when primary fails
  – Timeouts trigger view changes
  – Select new primary (= view number mod 3f+1)

• Brief protocol
  – Replicas send VIEW-CHANGE message along with the requests they prepared so far
  – New primary collects 2f+1 VIEW-CHANGE messages
  – Constructs information about committed requests in previous views
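
The primary-selection rule is a one-line computation. The function name is hypothetical, and the sketch assumes replicas are numbered 0 to 3f.

```python
def primary_for_view(view, f):
    """Replica id of the primary in the given view, for n = 3f + 1 replicas."""
    return view % (3 * f + 1)


for view in range(6):
    print("view", view, "-> primary is replica", primary_for_view(view, f=1))
```

With f = 1 the primary rotates through Replicas 0, 1, 2, 3 and wraps back to Replica 0 in view 4.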

View Change Safety

• Goal: no two different committed requests with the same sequence number across views

[Figure: the quorum for Committed Certificate (m, v, n) overlaps the View Change Quorum; at least one correct replica in the overlap has Prepared Certificate (m, v, n).]
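
The safety argument rests on quorum intersection: any two sets of 2f+1 replicas out of 3f+1 share at least f+1 members, so at least one shared member is correct. The brute-force check below, with a hypothetical function name, confirms this for small f.

```python
from itertools import combinations


def min_quorum_overlap(f):
    """Smallest intersection of any two quorums of size 2f+1 among 3f+1 replicas."""
    n, q = 3 * f + 1, 2 * f + 1
    quorums = [set(c) for c in combinations(range(n), q)]
    return min(len(a & b) for a in quorums for b in quorums)


for f in (1, 2):
    print("f =", f, "-> minimum overlap:", min_quorum_overlap(f))
```

The same result follows from counting: 2(2f+1) - (3f+1) = f+1, which exceeds the f replicas that may be faulty.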

Related Works

Fault Tolerance

• Fail Stop Fault Tolerance
  – Paxos, 1989 (TR)
  – VS Replication, PODC 1988

• Byzantine Fault Tolerance
  – Byzantine Agreement: Rampart, TPDS 1995; SecureRing, HICSS 1998; PBFT, OSDI '99; BASE, TOCS '03
  – Byzantine Quorums: Malkhi-Reiter, JDC 1998; Phalanx, SRDS 1998; Fleet, ToKDI '00; Q/U, SOSP '05
  – Hybrid Quorum: HQ Replication, OSDI '06