
Network algorithms

Date post: 23-Feb-2016
Upload: dyani
NETWORK ALGORITHMS Presenter- Kurchi Subhra Hazra
Transcript
Page 1: Network algorithms

NETWORK ALGORITHMS

Presenter: Kurchi Subhra Hazra

Page 2: Network algorithms

Agenda

• Basic Algorithms such as Leader Election

• Consensus in Distributed Systems

• Replication and Fault Tolerance in Distributed Systems

• GFS as an example of a Distributed System

Page 3: Network algorithms

Network Algorithms

• Distributed System is a collection of entities where

- Each of them is autonomous, asynchronous and failure-prone
- Communicating through unreliable channels
- To perform some common function

• Network algorithms enable such distributed systems to effectively perform these “common functions”

Page 4: Network algorithms

Global State in Distributed Systems

• We want to estimate a “consistent” state of a distributed system

• Required for determining if the system is deadlocked, terminated and for debugging

• Two approaches:
1. Centralized - All processes and channels report to a central process
2. Distributed – Chandy Lamport Algorithm

Page 5: Network algorithms

Chandy Lamport Algorithm

Based on Marker Messages M

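On receiving M over channel c:
If state is not recorded:
a) Record own state, and record the state of c as empty
b) Start recording state of the other incoming channels
c) Send Marker Messages to all outgoing channels
Else:
a) Record state of c as the messages received over c since recording started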
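The marker rule above can be sketched as a small single-threaded simulation. This is only an illustration under assumptions that are not on the slide: channels are FIFO queues held in memory, application messages are integers added to the receiver's state, and the names `Network`, `Process`, `snapshot_total` and `run_example` are hypothetical.

```python
from collections import deque

MARKER = "M"  # marker message, as on the slide


class Process:
    def __init__(self, pid, state):
        self.pid = pid
        self.state = state          # application state (assumed: an integer balance)
        self.recorded_state = None  # local snapshot; None until recorded
        self.channel_state = {}     # sender pid -> messages recorded on that channel
        self.recording = set()      # senders whose incoming channel is still recorded


class Network:
    def __init__(self, states):
        self.procs = {pid: Process(pid, s) for pid, s in states.items()}
        # One FIFO channel per ordered pair (sender, receiver).
        self.chan = {(s, d): deque() for s in states for d in states if s != d}

    def send(self, src, dst, msg):
        self.chan[(src, dst)].append(msg)

    def _record(self, p, marker_from=None):
        p.recorded_state = p.state                  # a) record own state
        for src, dst in self.chan:
            if dst == p.pid:
                p.channel_state[src] = []
                if src != marker_from:
                    p.recording.add(src)            # b) record the other incoming channels
            elif src == p.pid:
                self.send(p.pid, dst, MARKER)       # c) markers on all outgoing channels

    def initiate(self, pid):
        self._record(self.procs[pid])

    def deliver(self, src, dst):
        """Deliver the oldest message on channel src -> dst."""
        msg = self.chan[(src, dst)].popleft()
        p = self.procs[dst]
        if msg != MARKER:
            p.state += msg                          # application message
            if src in p.recording:
                p.channel_state[src].append(msg)    # in flight during the snapshot
        elif p.recorded_state is None:
            self._record(p, marker_from=src)        # first marker: channel src is empty
        else:
            p.recording.discard(src)                # later marker: stop recording src


def snapshot_total(net):
    """Sum of recorded process states plus recorded in-flight messages."""
    return sum(p.recorded_state + sum(sum(m) for m in p.channel_state.values())
               for p in net.procs.values())


def run_example():
    net = Network({1: 10, 2: 20, 3: 30})
    net.procs[2].state -= 5
    net.send(2, 1, 5)                               # 5 units in flight from P2 to P1
    net.initiate(1)                                 # P1 starts the snapshot
    for src, dst in [(2, 1), (1, 2), (2, 1), (1, 3), (2, 3), (3, 1), (3, 2)]:
        net.deliver(src, dst)
    return net
```

In `run_example`, the transfer that is in flight when P1 records its state ends up in the recorded state of channel 2 -> 1, so the snapshot still accounts for all 60 units even though no process was paused.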

Page 6: Network algorithms

Chandy Lamport Algorithm

[Figure: timelines of processes P1, P2 and P3 with their events, application messages a and b, and Marker messages M]

1- P1 initiates snapshot: records its state (S1); sends Markers to P2 & P3; turns on recording for channels Ch21 and Ch31

2- P2 receives Marker over Ch12, records its state (S2), sets state(Ch12) = {}; sends Marker to P1 & P3; turns on recording for channel Ch32

3- P1 receives Marker over Ch21, sets state(Ch21) = {a}

4- P3 receives Marker over Ch13, records its state (S3), sets state(Ch13) = {}; sends Marker to P1 & P2; turns on recording for channel Ch23

5- P2 receives Marker over Ch32, sets state(Ch32) = {b}

6- P3 receives Marker over Ch23, sets state(Ch23) = {}

7- P1 receives Marker over Ch31, sets state(Ch31) = {}

Taken from CS 425/UIUC/Fall 2009

Page 7: Network algorithms

Leader Election
• Suppose you want to
- elect a master server out of n servers
- elect a co-ordinator among different mobile systems

Common Leader Election Algorithms
- Ring Election
- Bully Election

Two requirements
- Safety (Process with best attribute is elected)
- Liveness (Election terminates)

Page 8: Network algorithms

Ring Election
• Processes organized in a ring
• The initiating process sends an election message clockwise to the next process in the ring, carrying its id and own attribute value
• The next process checks the election message:
a) If its own attribute value is greater, it replaces the process id (and attribute value) in the message with its own and passes the message on.
b) If its own attribute value is less, it simply passes on the message.
c) If the message carries its own id, the message has travelled around the whole ring: it declares itself as the leader and passes on an “elected” message.

What happens when a node fails?
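A minimal sketch of the election pass described above, assuming a failure-free ring, a single initiator, and ties broken by the higher index. The function name `ring_election` and the list-based ring are hypothetical; the final "elected" round is not modelled.

```python
def ring_election(attrs, starter=0):
    """Return (leader_index, election_messages_sent).

    attrs[i] is the attribute value of process i; processes sit on the ring in
    index order and forward to (i + 1) % n. Ties are broken by the higher index.
    """
    n = len(attrs)
    best = starter                      # id carried by the election message
    pos = (starter + 1) % n             # process receiving the message
    sent = 1
    while pos != best:                  # c) message is back at the process it names
        if (attrs[pos], pos) > (attrs[best], best):
            best = pos                  # a) receiver substitutes its own id
        pos = (pos + 1) % n             # a)/b) pass the message on
        sent += 1
    return best, sent
```

For example, `ring_election([3, 7, 5, 1])` elects process 1 after 5 election messages, while starting at process 2 (just after the eventual leader) needs 7, because the message must first reach the leader and then complete a full lap.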

Page 9: Network algorithms

Ring Election - Example

Taken from CS 425/UIUC/Fall 2009

Page 10: Network algorithms

Ring Election - Example

Taken from CS 425/UIUC/Fall 2009

Page 11: Network algorithms

Bully Algorithm

Best case and worst case scenarios

Taken from CS 425/UIUC/Fall 2009

Page 12: Network algorithms

Consensus
• A set of n processes/systems attempt to “agree” on some information
• Pi begins in undecided state and proposes value vi ∈ D
• The processes Pi communicate by exchanging values
• Pi sets its decision value di and enters decided state

• Requirements:

1. Termination: Eventually all correct processes decide, i.e., each correct process sets its decision variable
2. Agreement: Decision value of all correct processes is the same
3. Integrity: If all correct processes proposed v, then any correct decided process has di = v

Page 13: Network algorithms

2 Phase Commit Protocol
• Useful in distributed transactions to perform atomic commit
• Atomic Commit: Set of distinct changes applied as a single operation

• Suppose A transfers $300 from A’s account to B’s bank account.
• A = A - 300
• B = B + 300

Either both operations take effect or neither does; otherwise the accounts are left inconsistent.
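The transfer above can be sketched as a failure-free two-phase commit. The message names canCommit and doCommit follow the slides; the `Account` class, the abort step, the overdraft rule used for voting, and the in-process "coordinator" function are assumptions made only for illustration.

```python
class Account:
    """A participant holding one balance (hypothetical example class)."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.pending = None          # change staged during phase 1

    def can_commit(self, delta):
        # Phase 1 vote: refuse any change that would overdraw the account.
        if self.balance + delta < 0:
            return False
        self.pending = delta
        return True

    def do_commit(self):
        # Phase 2: make the staged change permanent.
        self.balance += self.pending
        self.pending = None

    def do_abort(self):
        # Phase 2 alternative: discard the staged change.
        self.pending = None


def two_phase_commit(changes):
    """changes is a list of (account, delta); returns True if committed."""
    votes = [acct.can_commit(delta) for acct, delta in changes]   # phase 1
    if all(votes):
        for acct, _ in changes:
            acct.do_commit()                                      # phase 2: commit
        return True
    for acct, _ in changes:
        acct.do_abort()                                           # phase 2: abort
    return False
```

With A at 500 and B at 100, `two_phase_commit([(a, -300), (b, 300)])` moves the money; a second identical call is refused by A in phase 1, and B's staged credit is discarded so neither balance changes.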

Page 14: Network algorithms

2 Phase Commit Protocol

What happens if the co-ordinator and a participant fail after doCommit?

Page 15: Network algorithms

Issue with 2PC

Co-ordinator

B

A

CanCommit?

Page 16: Network algorithms

Issue with 2PC

Co-ordinator

B

A

Yes

Page 17: Network algorithms

Issue with 2PC

Co-ordinator

B

A

doCommit

A crashes

Co-ordinator crashes

B commits

A new co-ordinator cannot know whether A had committed.

Page 18: Network algorithms

3 Phase Commit Protocol (3PC)

Use an additional stage

Page 19: Network algorithms

3PC Cont…

[Figure: message sequence between the Co-ordinator and Cohorts 1, 2 and 3: canCommit, ack, preCommit, ack, commit; all three cohorts commit]

Page 20: Network algorithms

3PC Cont…

• Why is this better?
• 2PC: execute transaction when everyone is willing to COMMIT it
• 3PC: execute transaction when everyone knows it will COMMIT (http://www.coralcdn.org/07wi-cs244b/notes/l4d.txt)

• But 3PC is expensive
• Timeouts triggered by slow machines

Page 21: Network algorithms

Paxos Protocol
• A consensus algorithm

• Important Safety Conditions:
- Only one value is chosen
- Only a proposed value is chosen

• Important Liveness Conditions:
- Some proposed value is eventually chosen
- Given a value is chosen, a process can learn the value eventually

• Nodes behave as Proposers, Acceptors and Learners

Page 22: Network algorithms

Paxos Protocol – Phase 1

[Figure: a Proposer and four Acceptors]

• Proposer selects a number n for a proposal of value v and sends a Prepare message to the acceptors
• What about this acceptor? A majority of acceptors is enough
• Acceptors respond back with an Acknowledgement carrying the highest n they have seen

Page 23: Network algorithms

Paxos Protocol – Phase 2

[Figure: a Proposer and four Acceptors; three acceptors reply with n]

Majority of acceptors agree on proposal n with value v

Page 24: Network algorithms

Paxos Protocol – Phase 2

[Figure: a Proposer and four Acceptors]

• Majority of acceptors agree on proposal n with value v
• Proposer sends Accept
• Acceptors accept
• What if v is null?
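The two phases above can be sketched as single-decree Paxos running in one process with no message loss. The class and function names (`Acceptor`, `propose`) are hypothetical, and learners are omitted. One possible reading of the "v is null" question is the case where no acceptor reports a previously accepted value; in this sketch the proposer then keeps its own value.

```python
class Acceptor:
    def __init__(self):
        self.promised = -1      # highest proposal number promised so far
        self.accepted = None    # (n, v) of the highest-numbered accepted proposal

    def prepare(self, n):
        # Phase 1: promise to ignore proposals numbered below n.
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n, v):
        # Phase 2: accept unless a higher-numbered promise was made meanwhile.
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, v)
            return True
        return False


def propose(acceptors, n, v):
    """Try to get value v chosen with proposal number n.

    Returns the chosen value, or None if no majority was reached.
    """
    majority = len(acceptors) // 2 + 1

    # Phase 1: Prepare / Acknowledgement.
    replies = [a.prepare(n) for a in acceptors]
    promises = [accepted for ok, accepted in replies if ok]
    if len(promises) < majority:
        return None

    # If any acceptor already accepted a value, adopt the one with the
    # highest proposal number; otherwise the proposer keeps its own v.
    prior = [p for p in promises if p is not None]
    if prior:
        v = max(prior, key=lambda nv: nv[0])[1]

    # Phase 2: Accept.
    acks = sum(a.accept(n, v) for a in acceptors)
    return v if acks >= majority else None
```

With four fresh acceptors, `propose(acceptors, 1, "x")` chooses "x"; a later `propose(acceptors, 2, "y")` also returns "x", because the proposer must adopt the value already accepted, which is how the "only one value is chosen" safety condition is preserved.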

Page 25: Network algorithms

Paxos Protocol Cont…
• What if an arbitrary number of proposers is allowed?

[Figure: proposers P and Q and an Acceptor, with proposal numbers n1 and n2 over Rounds 1 and 2]

Page 26: Network algorithms

Paxos Protocol Cont…
• What if an arbitrary number of proposers is allowed?

• To ensure progress, use a distinguished proposer

[Figure: proposers P and Q and an Acceptor over Rounds 1 to 4, with proposal numbers n3 and n4]

Page 27: Network algorithms

Paxos Protocol Contd…

• Some issues:
a) How to choose proposer?
b) How do we ensure unique n?
c) Expensive protocol
d) No primary if distinguished proposer used

Originally used by Paxons to run their part-time parliament

Page 28: Network algorithms

Replication
• Replication is important for
1. Fault Tolerance
2. Load Balancing
3. Increased Availability

Requirements:
1. Transparency
2. Consistency

Page 29: Network algorithms

Failure in Distributed Systems

• An important consideration in every design decision

• Fault detectors should be:
a) Complete – should be able to detect a fault when it occurs
b) Accurate – should not raise false positives
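The tension between the two properties above can be seen in a timeout-based detector. The sketch below is not from the slides: heartbeats, the `HeartbeatDetector` class and the explicit `now` timestamps (used instead of a real clock so the behaviour is deterministic) are all assumptions.

```python
class HeartbeatDetector:
    """Suspects a process when no heartbeat has arrived within `timeout`."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}              # pid -> time of latest heartbeat

    def heartbeat(self, pid, now):
        self.last_seen[pid] = now

    def suspects(self, now):
        # Every process that stays silent is eventually listed here (complete),
        # but a merely slow process can be listed too (not accurate).
        return {pid for pid, t in self.last_seen.items()
                if now - t > self.timeout}
```

With `timeout=3`, a process whose heartbeat is delayed from time 0 to time 6 is suspected at time 5 and cleared again at time 7: a false positive. Raising the timeout reduces such mistakes at the cost of slower detection.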

Page 30: Network algorithms

Byzantine Faults

• Arbitrary messages and transitions
• Cause: e.g., software bugs, malicious attacks

• Byzantine Agreement Problem: “Can a set of concurrent processes achieve coordination in spite of the faulty behavior of some of them?”

• Concurrent processes could be replicas in distributed systems

Page 31: Network algorithms

Practical Byzantine Fault Tolerance (PBFT)

• Replication algorithm that is able to tolerate Byzantine faults.
• Useful for software faults
• Why “Practical”? -> since it can be used in an asynchronous environment like the internet

• Important Assumptions:
1. At most f = ⌊(n-1)/3⌋ of the n replicas can be faulty
2. All replicas start in the same state
3. Failures are independent – Practical?

Page 32: Network algorithms

PBFT Cont..

[Figure: message flow between client C and replicas R1 to R4 through the phases request, pre-prepare, prepare, commit, reply]

C: Client
R1: Primary replica

• Commit phase starts after accepting 2f prepares
• Execution after 2f+1 commits
• Client blocks and waits for f+1 replies
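The thresholds on this slide follow from the number of replicas. As a small worked sketch, assuming the standard PBFT bound of at most f = ⌊(n-1)/3⌋ faulty replicas out of n (the function name `pbft_quorums` is hypothetical):

```python
def pbft_quorums(n):
    """Message thresholds for n replicas, using the counts named on the slide."""
    f = (n - 1) // 3                 # maximum number of faulty replicas tolerated
    return {
        "f": f,
        "prepares": 2 * f,           # prepares accepted before the commit phase
        "commits": 2 * f + 1,        # commits needed before execution
        "client_replies": f + 1,     # matching replies the client waits for
    }
```

For the four replicas R1 to R4 in the figure this gives f = 1: two prepares, three commits and two matching replies. Tolerating f = 2 requires seven replicas.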

Page 33: Network algorithms

PBFT Cont…
• The algorithm provides
• -> Safety
• By guaranteeing linearizability. Pre-prepare and prepare ensure total order on messages

• -> Liveness
• By providing for view change when the primary replica fails. Here, synchrony is assumed.

• How do we know a priori the value of f?

Page 34: Network algorithms

Google File System
• Revisited traditional file system design

1. Component failures are the norm
2. Multi-GB files are common
3. Files mutated by appending new data
4. Relaxed consistency model

Page 35: Network algorithms

GFS Architecture

Leader Election / Replication

Maintains metadata, namespace, chunk metadata etc

Page 36: Network algorithms

GFS – Relaxed Consistency

Page 37: Network algorithms

GFS – Design Issues

Single Master

Rationale: Keep things simple

Problems:
1. Increasing volume of underlying storage -> Increase in metadata
2. Clients not as fast as master server -> Master server became bottleneck

Current: Multiple Masters per data center

Ref: http://queue.acm.org/detail.cfm?id=1594206

Page 38: Network algorithms

GFS – Design Issues

• Replication of chunks

a) Replication across racks – default number is 3
b) Allowing concurrent changes to the same file -> In retrospect, they would rather have a single writer
c) Primary replica serializes mutations to chunks - They do not use any of the consensus protocols before applying mutations to the chunks.
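Point c) can be illustrated with a toy model in which the primary alone picks the order of mutations and every replica applies them in that order. This is a sketch of the idea only, not GFS code: the classes `ChunkReplica` and `Primary`, the serial numbers, and the absence of failures, leases and data transfer are all assumptions.

```python
class ChunkReplica:
    """Holds one copy of a chunk as an ordered list of applied mutations."""

    def __init__(self):
        self.data = []

    def apply(self, serial, mutation):
        self.data.append((serial, mutation))


class Primary(ChunkReplica):
    """The replica that decides the mutation order for its chunk."""

    def __init__(self, secondaries):
        super().__init__()
        self.secondaries = secondaries
        self.next_serial = 0

    def mutate(self, mutation):
        # The primary assigns the next serial number; no vote or consensus
        # round takes place among the replicas.
        serial = self.next_serial
        self.next_serial += 1
        self.apply(serial, mutation)
        for replica in self.secondaries:
            replica.apply(serial, mutation)   # same order on every replica
        return serial
```

With one primary and two secondaries (three copies, matching the default replication count), concurrent writers that call `mutate` all see their changes land in one order chosen by the primary, so the three copies stay identical.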

Ref: http://queue.acm.org/detail.cfm?id=1594206

Page 39: Network algorithms

THANK YOU

