Distributed Systems · CAP Theorem. Raft • Proposed by Ongaro and Ousterhout in 2014 • Five...
![Page 1: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/1.jpg)
Distributed Systems Day 12: Consistency
“Got to Catch Them All…”
![Page 2: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/2.jpg)
Today
• Raft Recap
• Consensus
• CAP Theorem
![Page 3: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/3.jpg)
Raft
• Proposed by Ongaro and Ousterhout in 2014
• Five components
  • Leader election
  • Log replication
  • Safety
  • Client protocol
  • Membership changes
• Assumes crash failures (so no Byzantine failures)
• No dependency on time for safety
  • But depends on time for availability
• Tolerates ⌊(N-1)/2⌋ failures in a cluster of N servers
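The failure bound follows from the majority requirement: a cluster keeps working as long as a majority of servers is still up. A minimal sketch of that arithmetic, with illustrative helper names (`majority`, `tolerated_failures`) that are not taken from the Raft paper:

```python
def majority(n: int) -> int:
    """Smallest number of servers that forms a majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Crash failures a cluster of n servers can survive: floor((n - 1) / 2)."""
    return (n - 1) // 2


for n in (3, 4, 5, 7):
    # A majority plus the tolerated failures always adds up to the cluster size.
    print(f"N={n}: majority={majority(n)}, tolerated failures={tolerated_failures(n)}")
```

Going from N=3 to N=4 adds a server but no extra fault tolerance, which is one reason clusters are usually sized with an odd N.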
![Page 4: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/4.jpg)
Raft Properties
• Safety: at most one leader
  • Each follower votes for at most one candidate in a given term
  • A candidate needs a majority to be leader
• Liveness: eventually there will be a leader
  • Challenge: if multiple servers try to call for election → split vote
  • Timeout + randomness: randomness helps to ensure that one server detects the leader failure faster than the others
• Log Safety: if leader commits, then data is in all future leaders
  • Election Modifications: followers only vote for a candidate whose log has an equal or higher term/index
  • Commit Modifications: New Leader does not commit until entries in current term have been agreed on by followers
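The two voting rules (one vote per term, and only for a candidate whose log is at least as up to date) can be sketched as a single request handler. This is a simplified illustration, not the full RequestVote RPC from the paper; the `Follower` class and its field names are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Follower:
    current_term: int = 0
    voted_for: Optional[str] = None   # candidate voted for in current_term
    last_log_term: int = 0
    last_log_index: int = 0

    def handle_vote_request(self, candidate: str, term: int,
                            cand_last_term: int, cand_last_index: int) -> bool:
        if term < self.current_term:
            return False                      # stale candidate
        if term > self.current_term:
            self.current_term = term          # new term: forget the old vote
            self.voted_for = None
        if self.voted_for not in (None, candidate):
            return False                      # at most one vote per term
        # Log safety: compare last term first, then last index.
        log_ok = (cand_last_term, cand_last_index) >= (
            self.last_log_term, self.last_log_index)
        if not log_ok:
            return False
        self.voted_for = candidate
        return True


f = Follower(current_term=2, last_log_term=2, last_log_index=5)
print(f.handle_vote_request("A", term=3, cand_last_term=2, cand_last_index=5))
print(f.handle_vote_request("B", term=3, cand_last_term=2, cand_last_index=9))
```

In this run the follower grants its term-3 vote to A and then refuses B, even though B's log is longer, because it has already voted in that term.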
![Page 5: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/5.jpg)
How do you Change Cluster Size?
N=5 → N=7
Goal: add two new servers to cluster
Challenge: consistently get all servers to agree on new cluster size
Case: During cluster update, leader fails AND different servers have different notions of size → multiple leaders
![Page 6: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/6.jpg)
Solution: Need a protocol to consistently update the cluster
![Page 7: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/7.jpg)
Configuration Changes
• Cannot switch directly from one configuration to another: conflicting majorities could arise
• Switching from N=3 to N=5
See the paper for details
[Figure: timeline of Servers 1 to 5 switching from C_old to C_new; at one point in time a majority of C_old and a majority of C_new exist side by side]
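To see why a direct switch is unsafe, one can enumerate the majorities of the old and new configurations and look for pairs that share no server. The sketch below assumes C_old = {1, 2, 3} and C_new = {1, 2, 3, 4, 5}, matching the N=3 to N=5 case; the function names are made up for this example.

```python
from itertools import combinations


def majorities(config):
    """All minimal majority subsets of a configuration (a set of server ids)."""
    need = len(config) // 2 + 1
    return [set(c) for c in combinations(sorted(config), need)]


def disjoint_majorities(c_old, c_new):
    """Pairs (majority of c_old, majority of c_new) that share no server."""
    return [(a, b) for a in majorities(c_old) for b in majorities(c_new)
            if not (a & b)]


c_old = {1, 2, 3}
c_new = {1, 2, 3, 4, 5}
conflicts = disjoint_majorities(c_old, c_new)
for old_side, new_side in conflicts:
    print(f"{sorted(old_side)} can elect a leader under C_old while "
          f"{sorted(new_side)} elects another under C_new")
```

Each printed pair is a way to end up with two leaders in the same term if some servers still believe in C_old while others already use C_new. Two majorities of the same configuration always overlap, so the problem only appears while the two configurations are mixed.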
![Page 8: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/8.jpg)
Today
• Raft Recap
• Consensus
• CAP Theorem
![Page 9: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/9.jpg)
![Page 10: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/10.jpg)
![Page 11: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/11.jpg)
Approaches to Replication
Passive Replication
• Total ordering
• Protocols: Zookeeper, Paxos, Chubby
Active Replication
• FIFO ordering
• Tolerates Byzantine failures
Lazy replication
• Causal ordering
• Protocols: Gossip, DynamoDB, CassandraDB, VoldemortDB, MongoDB
[Figure: three panels labeled Active Replication, Passive Replication, and Lazy Replication, each showing Servers A, B, and C with front ends (FE); in the Passive Replication panel Server A is the leader and Servers B and C are followers]
![Page 12: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/12.jpg)
Approaches to Replication
Passive Replication
• Total ordering
• Performance issues: slow and limits parallelism
  • All servers process the same request
Lazy replication
• Causal ordering
• Performance: faster
  – Any server can process any request
  – More parallelism
[Figure: two panels labeled Passive Replication and Lazy Replication, each showing Servers A, B, and C with front ends (FE); in the Passive Replication panel Server A is the leader and Servers B and C are followers]
![Page 13: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/13.jpg)
Thinking About Consistency
• All replicas are one server
• If different clients write and read to this “one” server, what should we expect?
[Figure: clients C1 to C4 issue Get(c), set(c=5), and set(c=7) requests through front ends (FE) to Server A (leader), Server B (follower), and Server C (follower)]
![Page 14: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/14.jpg)
Consistency Spectrum
STRONG CONSISTENCY: SLOWER BUT EASY TO PROGRAM
• Strict Serializability
• Linearizable
• Sequential
• Causal+
• Eventual
WEAK CONSISTENCY: FAST BUT HARDER TO PROGRAM
![Page 15: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/15.jpg)
Linearizable
• Total order + FIFO + “Time”
  • Recall: Total order means all servers operate in same order
  • Linearizable is a subset of Total order
  • Ordering must be FIFO and based on time
[Figure: timeline of clients C1 to C4 issuing Get(c), set(c=5), and set(c=7) requests; initial c=3]
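One way to make the “based on time” requirement concrete is a brute-force check: a history is linearizable if some total order of the operations both respects real time (an operation that finished before another started comes first) and behaves like a single register. The sketch below is illustrative only; the `Op` record, the timestamps, and the two histories are made up for this example, and the search is exponential, so it only suits tiny histories.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Op:
    client: str
    kind: str     # "set" or "get"
    value: int    # value written by a set, or value returned by a get
    start: float  # time the client sent the request
    end: float    # time the client received the response


def legal_for_register(order, initial):
    """True if every get returns the most recently set value in this order."""
    current = initial
    for op in order:
        if op.kind == "set":
            current = op.value
        elif op.value != current:
            return False
    return True


def is_linearizable(history, initial):
    """Brute force: try every total order that respects real time."""
    n = len(history)
    for perm in permutations(range(n)):
        pos = {idx: p for p, idx in enumerate(perm)}
        # If op i finished before op j started, i must be ordered first.
        real_time_ok = all(
            pos[i] < pos[j]
            for i in range(n) for j in range(n)
            if history[i].end < history[j].start
        )
        if real_time_ok and legal_for_register(
                [history[i] for i in perm], initial):
            return True
    return False


stale_read = [
    Op("C1", "set", 5, start=0, end=1),
    Op("C2", "get", 5, start=2, end=3),
    Op("C3", "get", 3, start=4, end=5),  # old value read after 5 was visible
]
overlapping = [
    Op("C1", "set", 5, start=0, end=4),
    Op("C2", "get", 3, start=1, end=2),  # overlaps the set, may be ordered first
    Op("C3", "get", 5, start=3, end=5),
]
print(is_linearizable(stale_read, initial=3))
print(is_linearizable(overlapping, initial=3))
```

The first history is rejected because C3 reads the initial value strictly after C2 has already observed 5. The second is accepted because C2's read overlaps the write and can be placed before it.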
![Page 16: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/16.jpg)
Sequential
• Total order + FIFO
  • Sequential is a subset of Total order
  • Ordering must be FIFO
  • Requests from different clients can be reshuffled
[Figure: timeline of clients C1 to C4 issuing Get(c), set(c=5), and set(c=7) requests; initial c=3]
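Sequential consistency can be checked the same brute-force way, except that the only ordering constraint is each client's own program order (FIFO); wall-clock time plays no role. The sketch below is an illustration with made-up histories, where each client maps to its list of `(kind, value)` operations in the order it issued them.

```python
from itertools import permutations


def is_sequentially_consistent(per_client, initial):
    """Brute force: look for one total order that keeps every client's
    program order and behaves like a single register."""
    ops = [(client, i, kind, value)
           for client, program in per_client.items()
           for i, (kind, value) in enumerate(program)]
    for order in permutations(ops):
        next_index = {}       # next expected operation index per client
        current = initial
        ok = True
        for client, i, kind, value in order:
            if i != next_index.get(client, 0):
                ok = False    # violates this client's FIFO order
                break
            next_index[client] = i + 1
            if kind == "set":
                current = value
            elif value != current:
                ok = False    # a get returned something other than the latest set
                break
        if ok:
            return True
    return False


# C3 reads the initial value; its request can be reshuffled before C1's set.
reshuffled = {"C1": [("set", 5)], "C2": [("get", 5)], "C3": [("get", 3)]}

# C3 sees 5 then 7, C4 sees 7 then 5: no single total order explains both.
disagree = {
    "C1": [("set", 5)],
    "C2": [("set", 7)],
    "C3": [("get", 5), ("get", 7)],
    "C4": [("get", 7), ("get", 5)],
}
print(is_sequentially_consistent(reshuffled, initial=3))
print(is_sequentially_consistent(disagree, initial=3))
```

The first history is accepted whatever the wall-clock timing of the requests, which is exactly the reshuffling that linearizability forbids. The second is rejected because the two readers disagree on the order of the writes.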
![Page 17: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/17.jpg)
Causal+
• Must respect Causality
• Need vector clocks to track and maintain causality
• Only causally related events need to be ordered
• NO TOTAL ORDERING!!!!!
[Figure: timeline of clients C1 to C4 issuing Get(c), set(c=5), and set(c=7) requests; initial c=3]
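A minimal vector clock sketch shows how causally related writes are told apart from concurrent ones. Clocks are plain dicts from node name to counter; the helper names and the three-write scenario are assumptions chosen for illustration.

```python
def vc_increment(clock, node):
    """Return a copy of clock with node's counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def vc_merge(a, b):
    """Element-wise maximum, used when a node learns about another's events."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}


def happened_before(a, b):
    """True if a causally precedes b: a <= b everywhere and a < b somewhere."""
    keys = a.keys() | b.keys()
    return (all(a.get(k, 0) <= b.get(k, 0) for k in keys)
            and any(a.get(k, 0) < b.get(k, 0) for k in keys))


def concurrent(a, b):
    """True if neither event causally precedes the other."""
    keys = a.keys() | b.keys()
    equal = all(a.get(k, 0) == b.get(k, 0) for k in keys)
    return not equal and not happened_before(a, b) and not happened_before(b, a)


w1 = vc_increment({}, "C1")                 # C1: set(c=5)
w2 = vc_increment(vc_merge({}, w1), "C2")   # C2 reads c=5, then set(c=7)
w3 = vc_increment({}, "C3")                 # C3 writes without seeing either

print(happened_before(w1, w2))  # w2 depends on w1, so they must be ordered
print(concurrent(w1, w3))       # unrelated writes, no order is required
```

Only the pair (w1, w2) has to be applied in the same order everywhere; w3 may be interleaved differently on different servers, which is why there is no total ordering.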
![Page 18: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/18.jpg)
Eventual
• Anything can happen
• If no writes → eventually all servers return the same data
[Figure: timeline of clients C1 to C4 issuing Get(c), set(c=5), and set(c=7) requests; initial c=3]
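The convergence claim can be illustrated with a toy anti-entropy loop. The slide does not say how replicas reconcile, so this sketch assumes a last-writer-wins rule on `(timestamp, writer, value)` triples and a fixed ring in which each server pushes its state to the next one; both choices are assumptions made for this example.

```python
def merge(a, b):
    """Last-writer-wins (an assumed rule): keep the triple that sorts highest."""
    return max(a, b)


def anti_entropy_round(replicas, ring):
    """Each server pushes its state to the next server in the ring."""
    for src, dst in zip(ring, ring[1:] + ring[:1]):
        replicas[dst] = merge(replicas[dst], replicas[src])


def converge(replicas, ring, max_rounds=100):
    """Run rounds with no new writes until all replicas hold the same triple."""
    replicas = dict(replicas)
    rounds = 0
    while len(set(replicas.values())) > 1:
        if rounds == max_rounds:
            raise RuntimeError("did not converge")
        anti_entropy_round(replicas, ring)
        rounds += 1
    return replicas, rounds


# Divergent state left behind by set(c=5) and set(c=7); Server A still has c=3.
start = {
    "A": (0, "init", 3),
    "B": (1, "C1", 5),
    "C": (2, "C2", 7),
}
final, rounds = converge(start, ring=["A", "B", "C"])
print(final, rounds)
```

Before convergence, a Get(c) may return 3, 5, or 7 depending on which server answers. Once writes stop, every server ends up holding the same value.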
![Page 19: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/19.jpg)
Consistency Spectrum
• Linearizable: total order + real time
• Sequential: total order + client order
• Causal+: causally ordered + eventually everyone agrees
• Eventual: eventually everyone agrees
STRONG CONSISTENCY: SLOWER BUT EASY TO PROGRAM
• Strict Serializability
• Linearizable
• Sequential
• Causal+
• Eventual
WEAK CONSISTENCY: FAST and Parallel BUT HARDER TO PROGRAM: need conflict resolution
![Page 20: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/20.jpg)
Today
• Raft Recap
• Consensus
• CAP Theorem
![Page 21: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/21.jpg)
CAP Theorem
• Consistency Model: How does your system react during partition?
• C: Consistency (linearizable)
  • All access is linearizable
  • Requires consensus
• A: Availability
  • All clients can make progress
• P: Partition tolerance
  • C and A can happen during a partition
![Page 22: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/22.jpg)
CAP Theorem
[Figure: Server A (leader), Server B (follower), and Server C (follower) with two front ends (FE), split by a network partition]
Can't commit:
• Not the leader
• Can't reach leader
No availability
![Page 23: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/23.jpg)
CAP Theorem
[Figure: Server A (leader), Server B (follower), and Server C (follower) with two front ends (FE), split by a network partition]
I'll commit: there WILL BE CONFLICTS
NO CONSISTENCY
Availability
![Page 24: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/24.jpg)
[Figure: two copies of the partitioned cluster (Server A as leader, Servers B and C as followers, two front ends). First copy: “I'll commit: there WILL BE CONFLICTS”, no consistency, availability. Second copy: “Can't commit: not the leader, can't reach leader”, no availability]
Raft (linearizable): Passive Replication:
• Strong consistency
• During partition:
  • Some clients will make no progress
  • Because leader is unavailable
Eventual Consistency:
• During partition:
  • Some clients will make progress
  • Since clients can change same data
• No consistency guarantees
![Page 25: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/25.jpg)
CAP Theorem
• C: Consistency (Linearizable)
• A: Availability
• P: Partition tolerance
• Given a “Partition”, you must pick between “Availability” and “Consistency”
  • Pick Consistency: Some clients (not all) can change “data consistently”
  • Pick Availability: All clients can change data but “inconsistently”
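The trade-off can be sketched with a toy model of a partitioned cluster. It is a deliberate simplification, not Raft itself: in the assumed "CP" mode a write is accepted only on a side of the partition that holds a majority of the servers, while in the assumed "AP" mode every side accepts writes locally. The function and mode names are made up for this example.

```python
def handle_writes(mode, partition, writes, initial=3):
    """Apply writes to a partitioned cluster.

    mode:      "CP" (pick consistency) or "AP" (pick availability)
    partition: list of sets of server names, one set per side
    writes:    list of (server_contacted, value)
    Returns (accepted flags, final value stored on each server).
    """
    cluster_size = sum(len(side) for side in partition)
    state = {server: initial for side in partition for server in side}
    accepted = []
    for server, value in writes:
        side = next(s for s in partition if server in s)
        if mode == "CP" and len(side) <= cluster_size // 2:
            accepted.append(False)   # no majority reachable: refuse the write
            continue
        for s in side:               # replicate within the reachable side only
            state[s] = value
        accepted.append(True)
    return accepted, state


partition = [{"A"}, {"B", "C"}]
writes = [("A", 5), ("B", 7)]        # set(c=5) on one side, set(c=7) on the other

print(handle_writes("CP", partition, writes))
print(handle_writes("AP", partition, writes))
```

With "CP" the client talking to the isolated server makes no progress, but the accepted write is never contradicted. With "AP" both clients make progress and the two sides end up holding different values for c, which must be reconciled after the partition heals.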
![Page 26: Distributed Systems · • CAP Theorem. Raft • Proposed by Ongaroand Ousterhoutin 2014 • Five components • Leader election • Log replication • Safety • Client protocol](https://reader033.vdocument.in/reader033/viewer/2022052104/603fcf46eb72ff145553e596/html5/thumbnails/26.jpg)
Today
• Raft Recap
  • Challenges: how to change the size of the cluster
• Consensus: Consistency Models
  • Definitions of different consistency models
  • Differences between the models
• CAP Theorem: Given “P”, you can only have “A” or “C”.
  • When designing a system that must tolerate partitions, you must pick between “A” and “C”.