Date post: | 23-Dec-2015 |
Upload: | hortense-malone |
Election Algorithms
Topics
• Issues
• Detecting Failures
• Bully algorithm
• Ring algorithm
Readings
• Van Steen and Tanenbaum: 5.4
• Coulouris: 11.3
Election Algorithms
Remember using Lamport clocks for total order.
Can you think of another way to do this? It turns out that you can use a sequencer.
• All operations go to a sequencer.
• The sequencer assigns a number to each message before the message goes to each replica.
What if the sequencer goes down?
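A minimal sketch of the sequencer idea, for illustration only. The names `Sequencer`, `Replica`, `stamp`, and `deliver` are hypothetical and not from the readings; the sketch assumes a single in-memory sequencer and shows how replicas that receive stamped messages in different orders still apply them in the same order.

```python
import itertools


class Sequencer:
    """Hypothetical sequencer: stamps every message with the next number."""

    def __init__(self):
        self._counter = itertools.count(1)

    def stamp(self, message):
        # Pair the message with a unique, increasing sequence number.
        return (next(self._counter), message)


class Replica:
    """Hypothetical replica: applies messages strictly in sequence order."""

    def __init__(self):
        self.log = []          # messages applied so far, in order
        self._pending = {}     # messages that arrived early, keyed by number
        self._next = 1         # next sequence number this replica may apply

    def deliver(self, stamped):
        seq, message = stamped
        self._pending[seq] = message
        # Apply every message whose turn has come; keep the rest buffered.
        while self._next in self._pending:
            self.log.append(self._pending.pop(self._next))
            self._next += 1


sequencer = Sequencer()
stamped = [sequencer.stamp(m) for m in ["write x=1", "write x=2", "write x=3"]]

r1, r2 = Replica(), Replica()
for s in stamped:             # r1 receives the messages in order
    r1.deliver(s)
for s in reversed(stamped):   # r2 receives them in reverse order
    r2.deliver(s)
```

Both replicas end up with the same log. The sketch also makes the weakness visible: every operation depends on the one `Sequencer` object, which is why its failure matters.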
Election Algorithms
Many distributed algorithms require a process to act as a coordinator.
The coordinator can be any process that organizes actions of other processes.
A coordinator may fail.
How is a new coordinator chosen or elected?
Election Algorithms
Assumptions
• Each process has a unique number to distinguish it from the others.
• There is one process per machine (which suggests that an IP address can be the unique identifier).
• Processes know each other’s process numbers.
• Processes do not know which ones are currently up and which ones are down.
General Approach
• Locate the process with the highest process number and designate it as the coordinator.
• Election algorithms differ in how they do this.
Issues in Dealing with Coordinator Failure
Detecting Failure
• Any node might detect failure first.
• Multiple processes might detect failure at once.
Election
• Must run without coordination.
• Must deal with arbitrary process failures.
• All nodes must agree on when the election is over and who the new coordinator is.
Detecting Failures
Timeouts are used to detect failures:
T = 2 Ttrans + Tprocess
• Where Ttrans is the maximum transmission delay and Tprocess is the maximum delay for processing a message.
If a process fails to respond to a request within T seconds, then an election is initiated.
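The timeout rule can be sketched as a small calculation. The function names and the millisecond values below are assumptions chosen for illustration; the formula itself is the one stated above.

```python
def election_timeout(t_trans, t_process):
    """T = 2 * Ttrans + Tprocess: request out, processing, reply back."""
    return 2 * t_trans + t_process


def coordinator_suspected(elapsed, t_trans, t_process):
    """True if no reply arrived within T, so an election should start."""
    return elapsed > election_timeout(t_trans, t_process)


# Assumed example bounds, in milliseconds.
T = election_timeout(t_trans=50, t_process=20)
```

With these assumed bounds T is 120 ms, so a reply that is still missing after 150 ms would trigger an election, while one still missing after 100 ms would not.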
Bully Algorithm
When a process, P, notices that the coordinator is no longer responding to requests, it initiates an election.
• P sends an ELECTION message to all processes with higher numbers.
• If no one responds, P wins the election and becomes the coordinator.
• If one of the higher-ups answers, it takes over. P’s job is done.
Bully Algorithm
When a process gets an ELECTION message from one of its lower-numbered colleagues:
• The receiver sends an OK message back to the sender to indicate that it is alive and will take over.
• The receiver holds an election, unless it is already holding one.
Eventually, all processes give up but one, and that one is the new coordinator.
The new coordinator announces its victory by sending all processes a message telling them that, starting immediately, it is the new coordinator.
Bully Algorithm
If a process that was previously down comes back:
• It holds an election.
• If it happens to be the highest-numbered process currently running, it will win the election and take over the coordinator’s job.
The “biggest guy” always wins, hence the name “bully” algorithm.
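The bully rules described so far can be sketched as a single-threaded simulation. This is an illustrative sketch under simplifying assumptions, not a networked implementation: messages are delivered instantly, a down process simply never answers, and the function name `bully_election` and the message log format are hypothetical.

```python
from collections import deque


def bully_election(initiator, all_ids, alive):
    """Simulate one bully election; return (coordinator, message log).

    all_ids: every process number that exists (known to all processes).
    alive:   the set of processes that are currently up (unknown to them).
    """
    holding = {initiator}        # processes that have started an election
    queue = deque([initiator])   # elections still to be run
    messages = []                # (kind, sender, receiver) tuples
    coordinator = None

    while queue:
        p = queue.popleft()
        got_ok = False
        for q in sorted(i for i in all_ids if i > p):
            messages.append(("ELECTION", p, q))
            if q in alive:
                # A live higher-numbered process answers OK and takes over,
                # holding its own election unless it already holds one.
                messages.append(("OK", q, p))
                got_ok = True
                if q not in holding:
                    holding.add(q)
                    queue.append(q)
        if not got_ok:
            coordinator = p      # nobody higher answered: p wins

    # The winner announces itself to every other process.
    for q in sorted(all_ids):
        if q != coordinator:
            messages.append(("COORDINATOR", coordinator, q))
    return coordinator, messages


# Assumed scenario: processes 0..7, old coordinator 7 is down, 4 notices.
coordinator, log = bully_election(4, range(8), alive={0, 1, 2, 3, 4, 5, 6})
```

In this assumed scenario process 6 wins: 4 hears OK from 5 and 6, 5 hears OK from 6, and 6 gets no answer from 7.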
The Bully Algorithm (Example)
The bully election algorithm:
a) Process 4 holds an election
b) Process 5 and 6 respond, telling 4 to stop
c) Now 5 and 6 each hold an election
The Bully Algorithm (Example)
d) Process 6 tells 5 to stop
e) Process 6 wins and tells everyone
Bully Algorithm Analysis
Best case
• The node with the second highest identifier detects the failure.
• Total messages = N-2
• One message to each of the other processes, indicating that the process with the second highest identifier is the new coordinator.
Worst case
• The node with the lowest identifier detects the failure.
• This causes N-1 processes to initiate the election algorithm, each sending messages to the processes with higher identifiers.
• Total messages = O(N^2)
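To make the two bounds concrete, here is a rough count under stated assumptions: N is the total number of processes including the failed coordinator, which has the highest identifier; the best case counts only the announcement messages, as above; the worst case counts only ELECTION messages (OK and COORDINATOR messages add lower-order terms). The function name is hypothetical.

```python
def bully_message_counts(n):
    """Return (best_case, worst_case_election_messages) for n processes."""
    # Best case: the second-highest process announces itself to the
    # n - 2 remaining processes (everyone except itself and the failed one).
    best = n - 2
    # Worst case: each of the n - 1 live processes, ranked i = 1 (lowest)
    # to n - 1, sends ELECTION to the n - i processes numbered above it.
    worst_election = sum(n - i for i in range(1, n))
    return best, worst_election


best, worst = bully_message_counts(8)
```

The worst-case sum is (N-1) + (N-2) + ... + 1 = N(N-1)/2, which is where the O(N^2) figure comes from. For N = 8 this gives 6 messages in the best case and 28 ELECTION messages in the worst case.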
Bully Algorithm Discussion
How many processes are used to detect a coordinator failure?
• As many as you want. You could have all other processes check on the coordinator.
It is impossible for two processes to be elected at the same time.
Ring Algorithm
Use a ring (processes are physically or logically ordered, so that each process knows who its successor is).
Algorithm
When a process notices that the coordinator is not functioning, it:
• Builds an ELECTION message (containing its own process number).
• Sends the message to its successor (if the successor is down, the sender skips over it and goes to the next member along the ring, or the one after that, until a running process is located).
• At each step, the sender adds its own process number to the list in the message.
Ring Algorithm
Algorithm (continued)
When the message gets back to the process that started it all:
• The process recognizes the message because it contains its own process number.
• It changes the message type to COORDINATOR.
• It circulates the message once again to inform everyone else who the new coordinator is (the list member with the highest number) and who the members of the new ring are.
• When the message has circulated once, it is removed.
• Even if two ELECTIONs are started at once, everyone will pick the same leader, since the node with the highest identifier is picked.
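The ring procedure can be sketched as a single-threaded simulation. This is an illustrative sketch, assuming instant message delivery and that a sender can tell when its successor is down; the function name `ring_election` and its return values are hypothetical.

```python
def ring_election(initiator, ring, alive):
    """Simulate one ring election.

    ring:  process numbers in ring order (each one's successor is the next).
    alive: the set of processes that are currently up.
    Returns (coordinator, members, total_messages).
    """
    n = len(ring)

    def next_alive(p):
        # Skip over down processes until a running one is found.
        i = ring.index(p)
        for step in range(1, n + 1):
            q = ring[(i + step) % n]
            if q in alive:
                return q

    # ELECTION round: the list starts with the initiator's own number and
    # each process that receives the message appends its number.
    members = [initiator]
    election_hops = 1
    p = next_alive(initiator)
    while p != initiator:
        members.append(p)
        p = next_alive(p)
        election_hops += 1

    # Back at the initiator: the highest number in the list wins.
    coordinator = max(members)
    # COORDINATOR round: the message circulates once more and is removed.
    coordinator_hops = len(members)
    return coordinator, members, election_hops + coordinator_hops


# Assumed scenario: ring 0..7, old coordinator 7 is down, 4 notices.
coordinator, members, total = ring_election(4, list(range(8)), {0, 1, 2, 3, 4, 5, 6})
```

In this assumed scenario the list comes back as 4, 5, 6, 0, 1, 2, 3, process 6 becomes coordinator, and 14 messages are sent in total, which is 2(N-1) for N = 8.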
Ring Algorithm
Initiation:
1. Process 4 sends an ELECTION message with its ID to its successor (or the next alive process).
Ring Algorithm
Initiation:
2. Each process adds its own ID and forwards the ELECTION message.
Ring Algorithm contd…
Leader Election:
3. The message comes back to the initiator (here, process 4).
4. The initiator announces the winner by sending another message around the ring.
Ring Algorithm Analysis
• At best, 2(N-1) messages are passed:
• One round for the ELECTION message
• One round for the COORDINATOR message
• Assumes that only a single process starts an election.
• Multiple elections cause an increase in messages, but no real harm is done.
Summary
• Synchronization between processes often requires that one process act as a coordinator.
• The coordinator is not fixed.
• Election algorithms determine the coordinator.