Post on 19-Dec-2015
Fall 2010 5DV020 3
Outline
• Group communication
• Fault-tolerant services
– Passive and active replication
• Highly available services
Group communication
• Static vs. dynamic groups
• Primary-partition vs. partitionable groups
• Group management
– Interface for membership changes
– Failure detection
– Notification upon membership changes
– Group address expansion
Group views
• Views contain the set of members at a given point in time
– Processes identified as failed are not in the view
• Events occur in views
• View-synchronous group communication
– Based on view delivery, we can know which messages must have been delivered to other members
View-synchronous group communication
• Correct processes deliver the same set of messages in any given view
• Messages are delivered at most once
• Correct processes always deliver messages they send
– If delivering to q fails, the next view excludes q
Why replication?
• Many algorithms require a working server node
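• Performance (load balancing)
• Increased availability
– Availability = 1 - P(all n replicas crashed) = 1 - p^n, where p is the probability that a single replica has crashed (assuming independent failures)
• Fault-tolerance
– Correct servers in majority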
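The availability formula above can be checked with a few lines of Python. This is a minimal sketch: the function name `availability` is made up for illustration, and it assumes every replica is down independently with the same probability p.

```python
def availability(p: float, n: int) -> float:
    """Probability that at least one of n replicas is up.

    Assumes each replica is down independently with probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 - p ** n


# Availability grows quickly with the number of replicas.
for n in (1, 2, 3):
    print(n, availability(0.05, n))
```

With p = 0.05, a single server is available 95% of the time, while two replicas already give 1 - 0.05^2 = 0.9975.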
Replication
• Replication transparency
– Client unaware of replication
• Problem with >1 client
– Concurrent access, rather than exclusive
– Operations are interleaved
• How do we ensure correctness?
Correctness of interleavings
• Always
– Interleaved sequence of operations must meet the specification of a single correct copy of the object(s)
• Sequential consistency property
– Order of operations is consistent with the program order in which each individual process executed them
• Linearizability property
– Order of operations is consistent with the real times at which the operations occurred during execution
Example (interleaved operations)
• C1: A, B, C
• C2: d, e, f
• Order during execution: A, B, d, C, e, f
• An interleaving with sequential consistency: A, B, d, e, f, C
• Interleaving with linearizability: A, B, d, C, e, f
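The two ordering conditions in this example can be checked mechanically. The sketch below is only illustrative: the function names are hypothetical, operations are treated as instantaneous and non-overlapping (so the only order consistent with real time is the execution order itself), and the separate requirement that the interleaving meets the specification of a single correct copy is not checked.

```python
def preserves_program_order(interleaving, programs):
    """True if every client's operations appear in the interleaving
    in the same relative order in which that client issued them."""
    for ops in programs.values():
        seen = [op for op in interleaving if op in ops]
        if seen != list(ops):
            return False
    return True


def matches_real_time_order(interleaving, execution_order):
    """True if the interleaving equals the real-time execution order.

    Assumption: operations do not overlap in time.
    """
    return list(interleaving) == list(execution_order)


programs = {"C1": ["A", "B", "C"], "C2": ["d", "e", "f"]}
executed = ["A", "B", "d", "C", "e", "f"]
candidate = ["A", "B", "d", "e", "f", "C"]

# candidate keeps each client's program order but not the real-time order
print(preserves_program_order(candidate, programs))
print(matches_real_time_order(candidate, executed))
```

The candidate A, B, d, e, f, C passes the first check but not the second, which is why it is acceptable under sequential consistency but not under linearizability.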
Generalized replication
1. Request: client makes request
2. Coordination: replica managers decide upon order of request
3. Execution: request is executed
4. Agreement: replica managers agree on result of execution
5. Response: response is sent back to the client
Passive replication
• One primary replica manager, many backups
• If the primary fails, a backup can take its place (election!)
• Implements linearizability if:
– A failing primary is replaced by a unique backup
– Backups agree on which operations had been performed when the primary crashed
  • View-synchronous group communication!
Passive replication
1. Request: front end issues request with unique ID
2. Coordination: primary checks whether the request has already been carried out; if so, it returns the cached response
3. Execution: perform operation, cache results
4. Agreement: primary sends updated state to backups
5. Response: primary sends result to front end, which forwards to the client
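The five phases above can be sketched in a few lines of Python. This is a toy model under loud assumptions: the classes `Primary` and `Backup` and their methods are invented for illustration, the only operation is "add an amount to a counter", the agreement step is a plain method call instead of view-synchronous group communication, and failure handling and election are left out.

```python
class Backup:
    """Holds a copy of the primary's state and response cache."""

    def __init__(self):
        self.state = {}
        self.responses = {}

    def apply_update(self, state, request_id, response):
        self.state = dict(state)
        self.responses[request_id] = response


class Primary:
    def __init__(self, backups):
        self.state = {}
        self.responses = {}  # request ID -> cached response
        self.backups = backups

    def handle(self, request_id, key, amount):
        # Coordination: a repeated request ID gets the cached response.
        if request_id in self.responses:
            return self.responses[request_id]
        # Execution: perform the operation and cache the result.
        self.state[key] = self.state.get(key, 0) + amount
        response = self.state[key]
        self.responses[request_id] = response
        # Agreement: push the updated state to every backup.
        for backup in self.backups:
            backup.apply_update(self.state, request_id, response)
        # Response: returned to the front end.
        return response


backups = [Backup(), Backup()]
primary = Primary(backups)
first = primary.handle("req-1", "x", 5)
retry = primary.handle("req-1", "x", 5)  # retransmission, not re-executed
```

Because the response cache is copied to the backups together with the state, a backup that takes over could still answer a retransmitted request without executing it twice.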
Active replication
• More distributed• All replica managers carry out all operations
• Requests to RM are totally ordered
• Front ends issue one request at a time (FIFO)
• Implements sequential consistency
Active replication
1. Request: front end adds unique identifier to request, mcasts it to RMs
2. Coordination: totally ordered request delivery to RMs
3. Execution: each RM executes request
4. Agreement: not needed
5. Response: all RMs respond to the front end, which interprets the responses and forwards its interpretation to the client
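A rough Python sketch of these phases follows. Everything named here is hypothetical: a sequential loop in a single front end stands in for totally ordered multicast, the operation is a counter increment, and majority voting is just one possible way for the front end to "interpret" the responses (the slides do not prescribe it).

```python
from collections import Counter


class ReplicaManager:
    def __init__(self):
        self.state = {}
        self.done = {}  # request ID -> response, to ignore duplicates

    def deliver(self, request_id, key, amount):
        # Execution: every replica manager executes every request.
        if request_id not in self.done:
            self.state[key] = self.state.get(key, 0) + amount
            self.done[request_id] = self.state[key]
        return self.done[request_id]


class FaultyReplicaManager(ReplicaManager):
    """Executes correctly but replies with a wrong value."""

    def deliver(self, request_id, key, amount):
        super().deliver(request_id, key, amount)
        return -1


class FrontEnd:
    def __init__(self, replicas):
        self.replicas = replicas
        self.next_id = 0

    def request(self, key, amount):
        # Request: attach a unique identifier.
        request_id = self.next_id
        self.next_id += 1
        # Coordination: the loop delivers requests to all replicas in
        # the same order (stand-in for totally ordered multicast).
        responses = [rm.deliver(request_id, key, amount) for rm in self.replicas]
        # Response: accept a value only if a majority of replicas agree.
        value, votes = Counter(responses).most_common(1)[0]
        if votes <= len(self.replicas) // 2:
            raise RuntimeError("no majority among replica responses")
        return value


front_end = FrontEnd([ReplicaManager(), ReplicaManager(), FaultyReplicaManager()])
r1 = front_end.request("x", 5)
r2 = front_end.request("x", 2)
```

With three replica managers, one wrong reply is outvoted by the two correct ones, which illustrates how active replication can mask an arbitrary failure.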
Comparison (Active/Passive)
• Handling of crash failures?
– Both: yes (but differently)
• Handling of arbitrary failures?
– Active: yes, Passive: no
• Complexity?
• Optimizations?
– Send "reads" to backups in passive
  • Lose linearizability property!
– Send "reads" to a single replica manager in active
  • Lose fault tolerance
Highly available services
• Goal is to allow clients to use the service for as long as possible
– Even if network connections are lost
– Even if results may be inconsistent
Gossip
• Guarantees by Gossip
– Each client gets a consistent service over time
  • Replicas will provide data that is at least as fresh as what the client has seen so far
– Relaxed consistency between replicas
  • Generally less than sequential consistency
• Eventually, all updates are applied (in order), but clients may observe stale data
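The "at least as fresh as what the client has seen" guarantee can be illustrated with vector timestamps. The slides do not say how the guarantee is implemented, so the design below is an assumption: a bulletin-board style service where `Replica`, `Client` and their methods are hypothetical names, each replica logs the updates it accepted, and gossip copies missing log entries between replicas.

```python
def merge(a, b):
    """Component-wise maximum of two vector timestamps."""
    return [max(x, y) for x, y in zip(a, b)]


class Replica:
    def __init__(self, index, n):
        self.index = index
        # log[i] holds the updates first accepted by replica i, in order.
        self.log = [[] for _ in range(n)]

    @property
    def timestamp(self):
        return [len(entries) for entries in self.log]

    def update(self, message):
        self.log[self.index].append(message)
        return self.timestamp

    def gossip_from(self, other):
        # Copy the entries this replica has not seen yet.
        for i, entries in enumerate(other.log):
            self.log[i].extend(entries[len(self.log[i]):])

    def query(self, client_ts):
        # Refuse to answer with data older than the client has seen.
        if not all(m >= c for m, c in zip(self.timestamp, client_ts)):
            return None
        messages = [m for entries in self.log for m in entries]
        return messages, self.timestamp


class Client:
    def __init__(self, n):
        self.ts = [0] * n  # what this client has observed so far

    def post(self, replica, message):
        self.ts = merge(self.ts, replica.update(message))

    def read(self, replica):
        result = replica.query(self.ts)
        if result is None:
            return None  # replica is too stale for this client
        messages, ts = result
        self.ts = merge(self.ts, ts)
        return messages


a, b = Replica(0, 2), Replica(1, 2)
client = Client(2)
client.post(a, "hello")
stale = client.read(b)   # b has not heard about "hello" yet
b.gossip_from(a)
fresh = client.read(b)   # after gossip, b can serve the client
```

A different client that has not seen the update could still read from the stale replica, which is the relaxed consistency between replicas mentioned above.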
Gossip contd.
• Covered in more depth later by Daniel
• Highly relevant for today's distributed systems
• Used by, e.g., Facebook for Cassandra (source)