Virtual Synchrony
Jared Cantwell
Review
• Multicast
• Causal and total ordering
• Consistent cuts
• Synchronized clocks
• Impossibility of consensus
• Distributed file systems
Goal
• Distributed programming is hard
• What tools can make it easier?
• What assumptions can make it easier?
Distributed programming is hard! Let’s go shopping!!!
According to http://en.wikipedia.org/wiki/Barbie, Barbie once said “Math is hard!” (misquoted).
The Process Group Approach to Reliable Distributed Computing
• Ken Birman
  – Professor, Cornell University
• ISIS
  – “toolkit mechanism for distributed programming”
  – Financial trading floors
  – Telecommunications switching
Virtual Synchrony
• Simplify distributed systems programming by assuming a synchronous environment
• Features:
  – Process Groups
  – Reliable Multicast
  – Fault Tolerance
  – Performance
Outline
• Problem/Motivation
• Solution (Virtual Synchrony)
  – Assumptions
  – Close Synchrony
  – Virtual Synchrony
Motivation
• Distributed programming is hard
Difficulties
• No reliable multicast
• Membership churn
• Message ordering
• State transfers
• Failure atomicity
No Reliable Multicast
[Figure: multicast timelines for processes p, q, r, comparing "Ideal" and "Reality"]
• UDP, TCP, and multicast are not good enough
• What is the correct way to recover?
Membership Churn
[Figure: timelines for processes p, q, r, with labels "Receives new membership" and "Never sent"]
• Membership changes are not instant
• How to handle failure cases?
Message Ordering
[Figure: timelines for processes p, q, r, with messages 1 and 2]
• Everybody wants it!
• How can you know if you have it?
• How can you get it?
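One way to make "How can you know if you have it?" concrete is to compare delivery logs after the fact. The sketch below is only an illustration: the log format and the names `same_relative_order` and `is_totally_ordered` are assumptions, not something from the slides. It also needs every process's log in one place, which a running process never has, and that is part of why the question is hard.

```python
from itertools import combinations

def same_relative_order(log_a, log_b):
    """True if the messages both processes delivered appear in the same order."""
    common = set(log_a) & set(log_b)
    return [m for m in log_a if m in common] == [m for m in log_b if m in common]

def is_totally_ordered(logs):
    """Check every pair of delivery logs for agreement on message order."""
    return all(same_relative_order(a, b) for a, b in combinations(logs.values(), 2))

# Hypothetical delivery logs for processes p, q, r.
agree = {"p": ["m1", "m2"], "q": ["m1", "m2"], "r": ["m1", "m2"]}
disagree = {"p": ["m1", "m2"], "q": ["m2", "m1"], "r": ["m1", "m2"]}

print(is_totally_ordered(agree), is_totally_ordered(disagree))
```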
State Transfers
• New nodes must get current state
• Does not happen instantly
• How do you handle nodes failing/joining?
[Figure: timelines for processes p, q, r]
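Failure Atomicity
[Figure: timelines for processes p, q, r, comparing "Ideal" and "Reality"; the "Reality" side is marked with x and ?]
• Nodes can fail mid-transmit
• Some nodes receive message, others do not
• Inconsistencies arise!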
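The failure case above can be reproduced with a toy model in which a multicast is just a loop of point-to-point sends. This is an illustrative sketch only; the function `multicast_with_crash` and its `crash_after` parameter are made up for the example.

```python
def multicast_with_crash(recipients, message, crash_after):
    """Send one copy per recipient in turn; the sender dies after `crash_after` sends."""
    delivered = {r: [] for r in recipients}
    for sent, recipient in enumerate(recipients):
        if sent == crash_after:
            break  # sender fails mid-transmit; remaining copies are never sent
        delivered[recipient].append(message)
    return delivered

# Hypothetical run: sender p multicasts "x" to q and r but crashes after one send.
state = multicast_with_crash(["q", "r"], "x", crash_after=1)
print(state)
```

Here q holds the message and r does not, so any state derived from it now differs between the two survivors.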
Motivation Review
• Distributed programming is hard!
• No reliable multicast
• Membership churn
• Message ordering
• State transfers
• Failure atomicity
Assumptions
• WAN of LANs
• Unreliable network
• Flow control at lowest layer
• Clocks not synchronized
• No partitions
  – CAP Theorem?
Failure Model
• Nodes crash
• Network is lossy
• Can’t distinguish the difference
Outline
• Problem/Motivation
• Solution (Virtual Synchrony)
  – Assumptions
  – Close Synchrony
    • Model
    • Significance
    • Issues
  – Virtual Synchrony
Model
• Events (all or nothing)
  – Internal computation
  – Message transmission & delivery
  – Membership change
Model
• Synchronous execution
[Figure: synchronous execution timelines for processes p, q, r, s, t, u. Source: Ken’s slides, 2006]
Significance
• Multicast is always reliable
• Membership is always consistent
• Totally ordered message delivery
• State transfer happens instantaneously
• Failure atomicity
  – Multicast is a single event
Issues
• Discrete event simulator
• Is it practical?
• Impossible with failures
• Very expensive
  – System progresses in lock-step
  – Limited by speed of other members
Outline
• Virtual Synchrony
  – Asynchronous Execution
  – Virtual Synchrony
  – ISIS
  – Parallels
  – Benefits
  – Discussion
Asynchronous Execution
• Key to high throughput in distributed systems
• Only wait for responses (or when sends are too fast)
• Communication channel
  – Acts as a pipeline
  – Not limited by latency
• Not possible with Close Synchrony!!
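A back-of-the-envelope comparison shows why the pipeline matters. The cost model and all numbers below are assumptions chosen for illustration (integer microseconds, one sender, a fixed one-way latency); they are not measurements from ISIS.

```python
def lockstep_time_us(n_messages, send_us, latency_us):
    """Each message waits for its acknowledgement before the next one is sent."""
    return n_messages * (send_us + 2 * latency_us)

def pipelined_time_us(n_messages, send_us, latency_us):
    """Messages are sent back to back; only the final acknowledgement is awaited."""
    return n_messages * send_us + 2 * latency_us

# Hypothetical workload: 1000 messages, 100 us to send each, 10 ms one-way latency.
lockstep = lockstep_time_us(1000, 100, 10_000)
pipelined = pipelined_time_us(1000, 100, 10_000)
print(lockstep, pipelined)
```

Under these assumptions the lock-step run is dominated by latency (about 20 s), while the pipelined run is dominated by the send cost (about 0.12 s).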
Asynchronous Execution
[Figure: asynchronous execution timelines for processes p, q, r, s, t, u. Source: Ken’s slides, 2006]
Virtual Synchrony
• Close Synchrony + Asynchronous
• Indistinguishable to application
• So... when can synchronous execution be relaxed?
ISIS
• Communication Framework
• Membership Service
• VS primitives
  – ABCAST
  – CBCAST
ISIS
• Problem
  – Crash and lossy network are indistinguishable
• Solution:
  – Membership list
  – Nonresponsive or failed members are dropped
  – Only listed members can participate
  – Re-join protocol
  – Does Membership exist in all distributed systems?
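The membership rules above can be sketched as a small timeout-based service. This is not the ISIS implementation; the class `MembershipService`, its method names, the heartbeat mechanism, and the numeric view identifier are all assumptions made for the example.

```python
class MembershipService:
    """Toy membership list: silent members are dropped and must re-join."""

    def __init__(self, members, timeout):
        self.timeout = timeout
        self.view_id = 0                      # bumped on every membership change
        self.last_heard = {m: 0.0 for m in members}

    @property
    def view(self):
        return sorted(self.last_heard)

    def heartbeat(self, member, now):
        """Record activity; returns False for members not on the list."""
        if member not in self.last_heard:
            return False                      # only listed members can participate
        self.last_heard[member] = now
        return True

    def check(self, now):
        """Drop every member that has been silent longer than the timeout."""
        suspects = [m for m, t in self.last_heard.items() if now - t > self.timeout]
        for m in suspects:
            del self.last_heard[m]
        if suspects:
            self.view_id += 1
        return suspects

    def join(self, member, now):
        """Re-join protocol, reduced to adding the member in a new view."""
        if member not in self.last_heard:
            self.last_heard[member] = now
            self.view_id += 1

svc = MembershipService(["p", "q", "r"], timeout=5.0)
svc.heartbeat("p", now=4.0)
svc.heartbeat("q", now=4.0)
dropped = svc.check(now=6.0)          # r was silent (slow or crashed, we cannot tell)
late = svc.heartbeat("r", now=7.0)    # r was only slow, but it is no longer listed
svc.join("r", now=8.0)
```

The service cannot tell a crashed r from a slow one, so it drops r either way, and r gets back in only through the re-join path.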
ISIS
• Atomic Broadcast (ABCAST)
• No message can be delivered to any user until all previous ABCAST messages have been delivered
• Costly to implement
• ... But not everyone needs such strong guarantees
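The ABCAST delivery rule can be illustrated with a hold-back queue. The slides do not describe how ISIS assigns the order, so the sketch below swaps in a fixed sequencer, a common textbook technique, purely for illustration; the class names `Sequencer` and `TotalOrderReceiver` are hypothetical.

```python
class Sequencer:
    """Assigns a global sequence number to every broadcast (assumed single node)."""

    def __init__(self):
        self.next_seq = 0

    def stamp(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        return (seq, payload)

class TotalOrderReceiver:
    """Holds messages back until every earlier one has been delivered."""

    def __init__(self):
        self.expected = 0
        self.held = {}
        self.delivered = []

    def receive(self, stamped):
        seq, payload = stamped
        self.held[seq] = payload
        # Deliver as long as the next expected message is available.
        while self.expected in self.held:
            self.delivered.append(self.held.pop(self.expected))
            self.expected += 1

sequencer = Sequencer()
a, b, c = (sequencer.stamp(x) for x in ("a", "b", "c"))

in_order, reordered = TotalOrderReceiver(), TotalOrderReceiver()
for msg in (a, b, c):
    in_order.receive(msg)
for msg in (c, a, b):          # the network reorders the messages
    reordered.receive(msg)
```

Both receivers end with the same delivery order. The cost shows up as the extra hop through the sequencer and the time messages spend held back.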
ISIS
• Causal Atomic Broadcast (CBCAST)
• Sufficient for most programmers
• Concurrent messages commute
• Weaker than ABCAST
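Causal delivery is commonly implemented with vector timestamps. The following is a minimal sketch of that general technique, not a transcription of the ISIS code; the class `CausalProcess` and the process indices are assumptions. A message from sender j is delivered once it is the next message from j and everything it causally depends on has already been delivered.

```python
class CausalProcess:
    def __init__(self, index, n):
        self.index = index
        self.clock = [0] * n      # clock[k] = messages from process k delivered here
        self.pending = []
        self.delivered = []

    def send(self, payload):
        """Stamp a new multicast; the sender delivers its own message immediately."""
        self.clock[self.index] += 1
        self.delivered.append(payload)
        return (self.index, list(self.clock), payload)

    def _deliverable(self, sender, stamp):
        for k, count in enumerate(stamp):
            if k == sender:
                if count != self.clock[k] + 1:   # must be the next message from sender
                    return False
            elif count > self.clock[k]:          # a causal predecessor is missing
                return False
        return True

    def receive(self, message):
        self.pending.append(message)
        progress = True
        while progress:                          # one delivery may unblock others
            progress = False
            for msg in list(self.pending):
                sender, stamp, payload = msg
                if self._deliverable(sender, stamp):
                    self.pending.remove(msg)
                    self.clock[sender] = stamp[sender]
                    self.delivered.append(payload)
                    progress = True

p, q, r = (CausalProcess(i, 3) for i in range(3))
m1 = p.send("m1")
q.receive(m1)
m2 = q.send("m2")      # causally after m1
r.receive(m2)          # arrives early and is held back
r.receive(m1)          # releases both, in causal order
```

Messages with no causal relation pass the test in whatever order they arrive, so two processes may deliver them in different orders. That is the sense in which CBCAST is weaker than ABCAST.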
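When to use CBCAST?
• When any conflicting multicasts are uniquely ordered along a single causal chain
• ... This is Virtual Synchrony
[Figure: timelines for processes p, r, s, t with numbered multicasts along causal chains. Source: Ken’s slides, 2006]
Each thread corresponds to a different lock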
Parallels
• Logical time
• Replication in database systems
• Schneider’s state machine approach
• Parallel processor architectures
• Distributed database systems
Benefits
• Assume a closely synchronous model
• Group state and state transfer
• Pipelined communication (async)
• Single event model
• Failure handling
Discussion
• Partitions
• False positives
  – Most have them, VS admits it
• False negatives
  – Depend on a timeout
Summary
• Programming in distributed systems is hard
• Close Synchrony makes it easier
  – Costs too much
• Take asynchronous when you can
• Virtual Synchrony
  – Pipelined
  – Easy to reason about
Understanding the Limitations of Causally and Totally Ordered Communication
• Authors
  – David Cheriton
    • Stanford
    • PhD – Waterloo
    • Billionaire
  – Dale Skeen
    • PhD – UC Berkeley
    • 3-phase commit protocol
The flaws of CATOCS
• Unrecognized causality
• No semantic ordering
• No efficiency gain (over state-level techniques)
• No scalability
Unrecognized Causality
• External communication is unknown
[Figure: timelines for processes p, q, r, s]
Unrecognized Causality
• Database is external entity
• Causal relation exists, but CATOCS misses it
No Semantic Ordering
• Serialization
  – Messages can’t be “grouped together”
  – Implementing it eliminates the need for CATOCS
• Causal Memory
  – Solution: state-level logical clock
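One possible reading of a "state-level logical clock" is a version number kept with the data itself, so the application can order updates without help from the communication layer. The slides give no details, so the sketch below is an assumption throughout, including the class name `VersionedObject` and the rule that stale updates are dropped.

```python
class VersionedObject:
    """Keeps a logical version with the value; older updates are ignored."""

    def __init__(self, value=None):
        self.version = 0
        self.value = value

    def apply(self, version, value):
        """Apply an update only if it is newer than the current state."""
        if version <= self.version:
            return False          # stale or duplicate update
        self.version, self.value = version, value
        return True

# Hypothetical updates that the network delivers out of order.
obj = VersionedObject()
first = obj.apply(2, "stop")      # newer update arrives first
second = obj.apply(1, "start")    # older update arrives late and is rejected
```

The final state is the same whichever order the two updates arrive in, which is the property the application wanted from ordered delivery.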
No Efficiency Gain
• Still need state-level techniques
• False causality
  – Reduces performance
  – Increased memory
• Message overhead
No Efficiency Gain
• What if m2 happened to follow m1, but was not causally related?
• CATOCS would create a false causality
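The m1/m2 case can be traced with vector timestamps, assuming the usual vector-clock delivery test. The function `deliverable`, the process indices, and the timestamps are illustrative assumptions. Because q happened to deliver m1 before sending m2, m2's timestamp records a dependency on m1 even if the application sees the two as unrelated.

```python
def deliverable(local_clock, sender, stamp):
    """Vector-clock causal delivery test at one receiver."""
    for k, count in enumerate(stamp):
        if k == sender:
            if count != local_clock[k] + 1:
                return False
        elif count > local_clock[k]:
            return False
    return True

P, Q = 0, 1                  # hypothetical process indices; receiver r is index 2
m1 = (P, [1, 0, 0])          # sent by p
m2 = (Q, [1, 1, 0])          # sent by q after it happened to deliver m1

r_clock = [0, 0, 0]
blocked_before = not deliverable(r_clock, *m2)   # m2 arrives first and is buffered
r_clock[P] = 1                                   # r finally delivers m1
released_after = deliverable(r_clock, *m2)
```

Receiver r has to buffer m2 until m1 shows up, which is the performance and memory cost listed above.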
No Scalability
• ≈ quadratic growth of expected message buffering
• Rebuttal:
  – Worst case
  – Impractical use case
Summary
• CATOCS software is overkill
• Communication system doesn’t know everything
• Everything is better at the application level
Conclusions
• Distributed programming is hard
• Close Synchrony
  – Too costly
• Virtual Synchrony
  – Limitations
• VS not perfect for all situations