Fault Tolerance
• We have so far assumed “fail-stop” failures (e.g., power failures or system crashes)
• In other words, if the server is up, it follows the protocol
• Hard enough:
• difficult to distinguish between crash vs. network down
• difficult to deal with network partition
Larger Class of Failures
• Can one handle a larger class of failures?
• Buggy servers that compute incorrectly rather than stopping
• Servers that have been modified by an attacker
• Referred to as Byzantine faults
Model
• Provide a replicated state machine abstraction
• Assume 2f+1 of 3f+1 nodes are non-faulty
• In other words, one needs 3f+1 replicas to handle f faults
• Asynchronous system, unreliable channels
• Use cryptography (both public-key and secret-key crypto)
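The replica arithmetic in this model can be sketched with a few helper functions. The function names are illustrative and do not come from the slides.

```python
def replicas_needed(f: int) -> int:
    """Minimum number of replicas needed to tolerate f Byzantine faults."""
    return 3 * f + 1


def max_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    return (n - 1) // 3


def quorum_size(f: int) -> int:
    """Quorum used by the protocol: 2f + 1 out of 3f + 1 replicas."""
    return 2 * f + 1


for f in range(1, 4):
    n = replicas_needed(f)
    print(f"f={f}: replicas={n}, quorum={quorum_size(f)}, non-faulty>={n - f}")
```

Note that adding replicas does not always help: under this arithmetic, 5 or 6 replicas tolerate no more faults than 4.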
General Idea
• Primary-backup plus quorum system
• Executions are sequences of views
• Clients send signed commands to primary of current view
• Primary assigns sequence number to client’s command
• Primary commits to a quorum
Attacker’s Powers
• Worst case: a single attacker controls the f faulty replicas
• Supplies the code that faulty replicas run
• Knows the code the non-faulty replicas are running
• Knows the faulty replicas’ crypto keys
• Can read network messages
What faults cannot happen?
• No more than f out of 3f+1 replicas can be faulty
• No client failure -- clients can never do anything bad (or rather such behavior can be detected using standard techniques)
• No guessing of crypto keys or breaking of cryptography
What could go wrong?
• Primary could be faulty!
• Could ignore commands; assign same sequence number to different requests; skip sequence numbers; etc.
• Can equivocate or lie differently to different nodes
• Backups could be faulty!
• Could incorrectly store commands forwarded by a correct primary
• Faulty replicas could incorrectly respond to the client!
Example Use Scenario
• Arvind:
echo A > grade
echo B > grade
tell Kaiyuan "the grade file is ready"
• Kaiyuan:
cat grade
Design 1
• client, n servers
• client sends request to all of them
• waits for all n to reply
• only proceeds if all n agree
• what is wrong with this design?
Design 2
• let us have replicas vote
• 2f+1 servers, assume no more than f are faulty
• client waits for f+1 matching replies
• if only f are faulty, and network works eventually, must get them!
• what is wrong with design 2?
Issues with Design 2
• f+1 matching replies might be f bad nodes & 1 good
• so maybe only one good node got the operation!
• next operation also waits for f+1
• might not include that one good node that saw op1
• example: S1 S2 S3 (S1 is bad)
• everyone hears and replies to write("A")
• S1 and S2 reply to write("B"), but S3 misses it
• client can't wait for S3 since it may be the one faulty server
• S1 and S3 reply to read(), but S2 misses it; read() yields "A"
• result: client tricked into accepting out-of-date state
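The S1/S2/S3 example can be replayed as a small simulation. This is a minimal sketch: the `Server`, `StaleServer`, and `call` names are hypothetical, message loss is modeled by simply leaving a server out of the `reachable` list, and the faulty S1 is assumed to acknowledge writes while always answering reads with "A".

```python
from collections import Counter

F = 1  # tolerated faults; Design 2 uses 2F + 1 = 3 servers


class Server:
    """A correct server holding a single value."""

    def __init__(self, name):
        self.name = name
        self.value = None

    def write(self, v):
        self.value = v
        return "ok"

    def read(self):
        return self.value


class StaleServer(Server):
    """Faulty server (S1): acknowledges writes but always answers reads with "A"."""

    def read(self):
        return "A"


def call(servers, reachable, op, *args):
    """Send op to the reachable servers; accept a reply once F + 1 of them match."""
    replies = [getattr(servers[name], op)(*args) for name in reachable]
    value, count = Counter(replies).most_common(1)[0]
    return value if count >= F + 1 else None


servers = {"S1": StaleServer("S1"), "S2": Server("S2"), "S3": Server("S3")}
call(servers, ["S1", "S2", "S3"], "write", "A")  # everyone hears write("A")
call(servers, ["S1", "S2"], "write", "B")        # S3 misses write("B")
result = call(servers, ["S1", "S3"], "read")     # S2 misses the read
print(result)  # the client accepts this value even though write("B") completed
```

The only correct server that saw write("B") is S2, and the read quorum {S1, S3} does not include it.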
Design 3
• 3f+1 servers, of which at most f are faulty
• client waits for 2f+1 matching replies
• f bad nodes plus a majority of the good nodes
• so all sets of 2f+1 overlap in at least one good node
• does design 3 have everything we need?
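The overlap claim can be checked by brute force for small f. The sketch below (the helper name `min_overlap` is illustrative) enumerates every pair of reply sets and reports the smallest intersection, for both Design 2 and Design 3.

```python
from itertools import combinations


def min_overlap(n: int, q: int) -> int:
    """Smallest intersection between any two q-sized subsets of n nodes."""
    nodes = range(n)
    return min(
        len(set(a) & set(b))
        for a in combinations(nodes, q)
        for b in combinations(nodes, q)
    )


for f in (1, 2, 3):
    design2 = min_overlap(2 * f + 1, f + 1)      # f+1 replies out of 2f+1 servers
    design3 = min_overlap(3 * f + 1, 2 * f + 1)  # 2f+1 replies out of 3f+1 servers
    print(f"f={f}: design 2 overlap={design2}, design 3 overlap={design3}")
```

In Design 2 two reply sets may share a single node, which can be faulty. In Design 3 they share at least f+1 nodes, so at least one of the shared nodes is good.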
Refined Approach
• let us have a primary to pick order for concurrent client requests
• use a quorum of 2f+1 out of 3f+1 nodes
• have a mechanism to deal with faulty primary
• clients notify replicas of each operation, as well as primary; if no progress, force change of primary
• replicas exchange info about ops sent by primary
• replicas send results directly to client
PBFT: Overview
• Normal operation: how the protocol works in the absence of failures
• View changes: how to depose a faulty primary and elect a new one
• Garbage collection: how to reclaim the storage used to keep various certificates
Normal Operation
• Three phases:
• Pre-prepare: assigns sequence number to request
• Prepare: ensures fault-tolerant consistent ordering of requests within views
• Commit: ensures fault-tolerant consistent ordering of requests across views
• Each replica maintains the following state:
• Service state
• Message log with all messages sent/received
• Integer representing the current view number
Prepare Certificate
• P-certificates ensure total order within views
• Replica produces P-certificate(m,v,n) iff its log holds:
• The request m
• A PRE-PREPARE for m in view v with sequence number n
• 2f PREPAREs from different backups that match the pre-prepare
• A P-certificate(m,v,n) means that a quorum agrees with assigning sequence number n to m in view v
• No two non-faulty replicas can hold P-certificate(m1,v,n) and P-certificate(m2,v,n) with m1 ≠ m2
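The P-certificate condition can be written as a predicate over a replica's message log. This is a minimal sketch: the `Msg` record and its field names are assumptions made for illustration, signatures are assumed to have been checked before a message enters the log, and a request is identified by its digest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Msg:
    kind: str    # "REQUEST", "PRE-PREPARE", or "PREPARE"
    view: int
    seq: int
    digest: str  # digest of the request m
    sender: int  # replica id (or client id for a REQUEST)


def has_p_certificate(log, digest, view, seq, f, primary):
    """True iff the log holds m, a matching PRE-PREPARE, and 2f matching PREPAREs."""
    has_request = any(m.kind == "REQUEST" and m.digest == digest for m in log)
    has_pre_prepare = any(
        m.kind == "PRE-PREPARE"
        and (m.view, m.seq, m.digest, m.sender) == (view, seq, digest, primary)
        for m in log
    )
    # Count distinct backups, so a replayed PREPARE is not counted twice.
    backups = {
        m.sender
        for m in log
        if m.kind == "PREPARE"
        and (m.view, m.seq, m.digest) == (view, seq, digest)
        and m.sender != primary
    }
    return has_request and has_pre_prepare and len(backups) >= 2 * f


log = [
    Msg("REQUEST", 0, 0, "d1", 99),
    Msg("PRE-PREPARE", 0, 1, "d1", 0),
    Msg("PREPARE", 0, 1, "d1", 1),
    Msg("PREPARE", 0, 1, "d1", 2),
]
print(has_p_certificate(log, "d1", view=0, seq=1, f=1, primary=0))
```

The pre-prepare plus 2f prepares come from 2f+1 distinct replicas, which is the quorum the slide refers to.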
P-certificates are not enough
• A P-certificate proves that a majority of correct replicas has agreed on a sequence number for a client’s request
• Yet that order could be modified by a new leader elected in a view change
Commit Certificate
• C-certificates ensure total order across views
• can’t miss P-certificate during a view change
• A replica has a C-certificate(m,v,n) if:
• it had a P-certificate(m,v,n)
• log contains 2f+1 matching COMMIT from different replicas (including itself)
• Replica executes a request after it gets a C-certificate for it, and has cleared all requests with smaller sequence numbers
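The commit rule and the in-order execution rule can be sketched as two small functions. The tuple layout `(sender, view, seq, digest)` for COMMIT messages and both function names are assumptions for illustration; the P-certificate check is passed in as a boolean.

```python
def has_c_certificate(has_p_cert, commits, view, seq, digest, f):
    """True iff the replica had the P-certificate and 2f+1 matching COMMITs.

    commits is an iterable of (sender, view, seq, digest) tuples, including
    the replica's own COMMIT.
    """
    senders = {s for (s, v, n, d) in commits if (v, n, d) == (view, seq, digest)}
    return has_p_cert and len(senders) >= 2 * f + 1


def execute_ready(committed, last_executed):
    """Sequence numbers that may run now: the contiguous run after last_executed."""
    ready = []
    n = last_executed + 1
    while n in committed:
        ready.append(n)
        n += 1
    return ready


commits = [(0, 0, 1, "d1"), (1, 0, 1, "d1"), (2, 0, 1, "d1")]
print(has_c_certificate(True, commits, view=0, seq=1, digest="d1", f=1))
print(execute_ready({1, 2, 4}, last_executed=0))  # 4 waits until 3 is committed
```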
BFT Discussion
• Is PBFT practical?
• Does it address the concerns that enterprise users would like to be addressed?
Bitcoin
• a digital currency
• a public ledger to prevent double-spending
• no centralized trust or mechanism <-- this is hard!
Why digital currency?
• might make online payments easier
• credit cards have worked well but aren't perfect
• insecure -> fraud -> fees, restrictions, reversals
• record of all your purchases
Idea
• Signed sequence of transactions
• there are a bunch of coins, each owned by someone
• every coin has a sequence of transaction records
• one for each time this coin was transferred as payment
• a coin's latest transaction indicates who owns it now
Transaction Record
• pub(user1): public key of new owner
• hash(prev): hash of this coin's previous transaction record
• sig(user2): signature over transaction by previous owner's private key
• BitCoin has more complexity: amount (fractional), multiple in/out, ...
Transaction Example
1. Y owns a coin, previously given to it by X:
• T7: pub(Y), hash(T6), sig(X)
2. Y buys a hamburger from Z and pays with this coin
• Z sends public key to Y
• Y creates a new transaction and signs it
• T8: pub(Z), hash(T7), sig(Y)
3. Y sends transaction record to Z
4. Z verifies: T8's sig() corresponds to T7's pub()
5. Z gives hamburger to Y
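The T7/T8 exchange can be sketched end to end. Everything here is illustrative: the `Tx` record, `transfer`, and `valid_transfer` are hypothetical names, and signatures use textbook RSA over Mersenne primes purely so the example runs with the standard library (Python 3.8+). This toy scheme is not secure and is not what Bitcoin uses.

```python
import hashlib
from dataclasses import dataclass


def toy_keypair(p, q, e=65537):
    """Textbook RSA key pair from two primes: (public, private). NOT secure."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (n, e), (n, pow(e, -1, phi))


def digest(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def sign(priv, data: bytes) -> int:
    n, d = priv
    return pow(digest(data) % n, d, n)


def verify(pub, data: bytes, sig: int) -> bool:
    n, e = pub
    return pow(sig, e, n) == digest(data) % n


@dataclass(frozen=True)
class Tx:
    pub: tuple      # pub(new owner)
    prev_hash: str  # hash(prev): hash of the coin's previous record
    sig: int        # sig(previous owner) over the fields above

    def body(self) -> bytes:
        return f"{self.pub}|{self.prev_hash}".encode()

    def tx_hash(self) -> str:
        return hashlib.sha256(self.body() + str(self.sig).encode()).hexdigest()


def transfer(prev: Tx, owner_priv, new_owner_pub) -> Tx:
    """The current owner signs the coin over to new_owner_pub."""
    unsigned = Tx(new_owner_pub, prev.tx_hash(), 0)
    return Tx(new_owner_pub, prev.tx_hash(), sign(owner_priv, unsigned.body()))


def valid_transfer(prev: Tx, tx: Tx) -> bool:
    """Payee's check: tx points at prev and is signed by the key named in prev.pub."""
    return tx.prev_hash == prev.tx_hash() and verify(prev.pub, tx.body(), tx.sig)


x_pub, x_priv = toy_keypair(2**31 - 1, 2**61 - 1)
y_pub, y_priv = toy_keypair(2**89 - 1, 2**107 - 1)
z_pub, z_priv = toy_keypair(2**127 - 1, 2**19 - 1)

t6 = Tx(x_pub, "earlier-history", 0)  # stand-in for the coin's earlier records
t7 = transfer(t6, x_priv, y_pub)      # T7: pub(Y), hash(T6), sig(X)
t8 = transfer(t7, y_priv, z_pub)      # T8: pub(Z), hash(T7), sig(Y)
forged = transfer(t7, z_priv, z_pub)  # Z tries to sign on Y's behalf
print(valid_transfer(t7, t8), valid_transfer(t7, forged))
```

The check in `valid_transfer` is step 4 of the example: T8's signature must verify under the public key recorded in T7.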
Double Spending
• Y creates two transactions for same coin: Y->Z, Y->Q
• both with hash(T7)
• Y shows different transactions to Z and Q
• both transactions look good, including signatures and hash
• now both Z and Q will give hamburgers to Y
Defense
• publish log of all transactions to everyone, in same order
• so Q knows about Y->Z, and will reject Y->Q
• a "public ledger"
• ensure Y can't un-publish a transaction
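The check a payee performs against the published log can be sketched as follows. The `PublicLedger` class is hypothetical and is a single in-memory object, so it shows only the double-spend check, not how peers come to agree on one log without a central party.

```python
class PublicLedger:
    """Append-only log of transactions, seen by everyone in the same order."""

    def __init__(self):
        self.log = []       # published transactions, in order
        self.spent = set()  # previous-record hashes that have been spent

    def publish(self, tx_id, prev_hash, payee):
        """Append the transaction unless its coin was already spent."""
        if prev_hash in self.spent:
            return False
        self.spent.add(prev_hash)
        self.log.append((tx_id, prev_hash, payee))
        return True


ledger = PublicLedger()
first = ledger.publish("Y->Z", "hash(T7)", "Z")
second = ledger.publish("Y->Q", "hash(T7)", "Q")  # same coin, same hash(T7)
print(first, second)
```

There is deliberately no method to remove an entry, mirroring the requirement that Y cannot un-publish a transaction.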
Strawman Solution
• Assume a p2p network
• Peers flood new transactions over “overlay”
• Transaction is acceptable only if majority of peers think it is valid
• What are the issues with this scheme?
BitCoin Block Chain
• the block chain contains transactions on all coins
• many peers, each with a complete copy of the chain
• proposed transactions flooded to all peers
• new blocks flooded to all peers
• each block: hash(prevblock), set of transactions, nonce, current wall clock timestamp
• a new block roughly every 10 minutes, containing new xactions
• payee doesn't verify until xaction is in the block chain
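The block layout listed above can be sketched as a record with the four fields, plus a check that each block points at its predecessor. The `Block` class, the JSON serialization, and `chain_is_linked` are assumptions for illustration and do not match Bitcoin's real wire format.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field


@dataclass
class Block:
    prev_hash: str      # hash(prevblock)
    transactions: list  # set of transactions (strings here)
    nonce: int = 0
    timestamp: float = field(default_factory=time.time)

    def block_hash(self) -> str:
        payload = json.dumps(
            [self.prev_hash, self.transactions, self.nonce, self.timestamp]
        )
        return hashlib.sha256(payload.encode()).hexdigest()


def chain_is_linked(chain) -> bool:
    """Every block must carry the hash of the block before it."""
    return all(b.prev_hash == a.block_hash() for a, b in zip(chain, chain[1:]))


b5 = Block("hash-of-B4", ["X->Y"])
b6 = Block(b5.block_hash(), ["W->V"])
b7 = Block(b6.block_hash(), ["Y->Z"])
chain = [b5, b6, b7]
linked_before = chain_is_linked(chain)

b6.transactions.append("Y->Q")  # rewriting an old block changes its hash
linked_after = chain_is_linked(chain)
print(linked_before, linked_after)
```

Because each block embeds the previous block's hash, altering B6 breaks the link from B7, which is what makes history hard to rewrite.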
“Mining” Blocks
• requirement: hash(block) has N leading zeros
• each peer tries nonce values until this works out
• trying one nonce is fast, but most nonces won't work
• mining a block is not a specific fixed amount of work
• one node can take months to create one block
• but thousands of peers are working on it
• so that the expected time until the first peer finds one is about 10 minutes
• the winner floods the new block to all peers
• there is an incentive to mine a block -- 12.5 bc
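The nonce search can be sketched with the standard library. This is a simplified sketch: "N leading zeros" is interpreted here as N leading zero bits of a single SHA-256 hash, the 8-byte nonce encoding is an arbitrary choice, and the header is a placeholder byte string.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Number of zero bits at the front of the digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mine(header: bytes, zero_bits: int):
    """Try nonces 0, 1, 2, ... until hash(header + nonce) has enough leading zeros."""
    nonce = 0
    while True:
        digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= zero_bits:
            return nonce, digest.hex()
        nonce += 1


nonce, block_hash = mine(b"hash(B5)|Y->Z", zero_bits=12)
print(nonce, block_hash)
```

Each attempt succeeds with probability 2^-N, so the expected number of attempts is 2^N, but any single search may finish much sooner or later. That variance is why mining is not a fixed amount of work.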
Timing
• start: all peers know till B5
• and are working on B6 (trying different nonces)
• Y sends Y->Z transaction to peers, which flood it
• peers buffer the transaction until B6 is computed
• peers that heard Y->Z include it in next block
• so eventually block chain is: B5, B6, B7, where B7 includes Y->Z
Double Spending
• what if Y sends out Y->Z and Y->Q at the same time?
• no correct peer will accept both
• a block will have one but not both
• but there could be a fork: B6<-BZ and B6<-BQ
Forked Chain
• each peer believes whichever of BZ/BQ it saw first
• tries to create a successor
• if many more saw BZ than BQ, more will mine for BZ
• so BZ successor likely to be created first
• even otherwise one will be extended first given significant variance in mining success time
• peers always switch to mining the longest fork, reinforcing agreement
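The fork-choice rule described here can be sketched as a single function. The name `choose_fork` and the list-of-labels chain representation are illustrative; the rule follows the slide (length), and ties keep whichever fork the peer saw first.

```python
def choose_fork(current, candidates):
    """Keep the current chain unless some candidate is strictly longer."""
    best = current
    for chain in candidates:
        if len(chain) > len(best):
            best = chain
    return best


mine_on = ["B5", "B6", "BQ"]  # this peer saw BQ first
same_length = choose_fork(mine_on, [["B5", "B6", "BZ"]])
after_extension = choose_fork(mine_on, [["B5", "B6", "BZ", "B8"]])
print(same_length[-1], after_extension[-1])
```

Once BZ gains a successor, a peer that had been mining on BQ abandons it, so the shorter fork stops growing.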
Double Spending Defense
• wait for enough blocks to be minted
• if a few blocks have been minted, unlikely that a different fork will win
• if selling a high-value item, then wait for a few blocks before shipping
• could attacker start a fork from an old block?
• yes -- but fork must be longer in order for peers to accept it
• if the attacker has 1000s of CPUs -- more than all the honest bitcoin peers -- then the attacker can create the longest fork
• system works only if no entity controls a majority of nodes
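Why waiting a few blocks helps can be illustrated with a simplified random-walk (gambler's ruin) model, which is an assumption added here and not part of the slides: each new block is found by the attacker with probability q and by honest peers with probability 1 - q, and the attacker starts z blocks behind.

```python
def catch_up_probability(q: float, z: int) -> float:
    """Chance that an attacker with share q of mining power ever erases a z-block deficit."""
    p = 1.0 - q
    if q >= p:
        return 1.0  # with a majority, the attacker eventually overtakes
    return (q / p) ** z


for z in (1, 3, 6):
    print(z, catch_up_probability(0.10, z), catch_up_probability(0.30, z))
print(catch_up_probability(0.51, 6))
```

Under this model the attacker's chance falls off exponentially with the number of blocks the payee waits, as long as the attacker holds less than half of the mining power.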