Reliable Distributed Systems
Fault Tolerance (Recoverability and High Availability)
Reliability and transactions
- Transactions are well matched to the database model and to recoverability goals
- Transactions don't work well for non-database applications (general-purpose O/S applications) or for availability goals (systems that must keep running even if components fail)
- When building high-availability systems, we encounter the replication issue
Types of reliability
- Recoverability: the server can restart without intervention in a sensible state. Transactions do give us this
- High availability: the system remains operational during failure. The challenge is to replicate the critical data needed for continued operation
Replicating a transactional server
Two broad approaches:
- Just use distributed transactions to update multiple copies of each replicated data item. We already know how to do this with 2PC; each server has "equal status"
- Somehow treat replication as a special situation. Leads to a primary server approach with a "warm standby"
Replication with 2PC
- Our goal will be "1-copy serializability": defined to mean that the multi-copy system behaves indistinguishably from a single-copy system
- Considerable formal and theoretical work has been done on this
- As a practical matter: replicate each data item; the transaction manager reads any single copy but updates all copies
Observation
- Notice that the transaction manager must know where the copies reside
- In fact there are two models:
  - Static replication set: the set is fixed, although some members may be down
  - Dynamic: the set changes while the system runs, but lists only operational members
- Today we stick to the static case
Replication and Availability
A series of potential issues:
- How can we update an object during periods when one of its replicas may be inaccessible?
- How can the 2PC protocol be made fault-tolerant?
  - A topic we'll study in more depth, but the bottom line is: we can't!
Usual responses?
- Quorum methods: each replicated object has an update quorum Qu and a read quorum Qr
- Designed so that Qu + Qr > # replicas and Qu + Qu > # replicas
- Idea is that any read or update will overlap with the last update
Quorum example
- X is replicated at {a,b,c,d,e}
- Possible values?
  - Qu = 1, Qr = 5 (violates Qu + Qu > 5)
  - Qu = 2, Qr = 4 (same issue)
  - Qu = 3, Qr = 3
  - Qu = 4, Qr = 2
  - Qu = 5, Qr = 1 (violates availability)
- Probably prefer Qu = 4, Qr = 2
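The two overlap rules above can be checked mechanically. A minimal sketch (the function name `valid_quorums` is my own, not from the slides):

```python
def valid_quorums(n):
    """Enumerate (Qu, Qr) pairs satisfying the two overlap rules:
    Qu + Qr > n  (every read quorum overlaps the last update quorum) and
    Qu + Qu > n  (any two update quorums overlap, so versions are ordered)."""
    return [(qu, qr)
            for qu in range(1, n + 1)
            for qr in range(1, n + 1)
            if qu + qr > n and qu + qu > n]

# For X replicated at {a,b,c,d,e} (n = 5), the minimal choices from the
# slide are (3,3), (4,2) and (5,1); Qu = 1 and Qu = 2 are ruled out
# because two concurrent updates could then miss each other entirely.
```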
Things to notice
- Even reading a data item requires that multiple copies be accessed! This could be much slower than normal local access performance
- Also, notice that we won't know whether we succeeded in reaching the update quorum until we get responses
- Implies that any quorum replication scheme needs a 2PC protocol to commit
Next issue?
- Now we know that we can solve the availability problem for reads and updates if we have enough copies
- What about for 2PC? Need to tolerate crashes before or during runs of the protocol
- A well-known problem
Availability of 2PC
- It is easy to see that 2PC is not able to guarantee availability
- Suppose the manager talks to 3 processes, and suppose 1 process and the manager fail
- The other 2 are "stuck" and can't terminate the protocol
What can be done?
We'll revisit this issue soon. Basically:
- Can extend to a 3PC protocol that will tolerate failures if we have a reliable way to detect them
- But network problems can be indistinguishable from failures
- Hence there is no commit protocol that can guarantee progress despite failures
- Anyhow, the cost of 3PC is very high
A quandary?
- We set out to replicate data for increased availability
- And concluded that the quorum scheme works for updates, but commit is required, and commit represents a vulnerability
- Other options?
Other options
- We mentioned primary-backup schemes
- These are a second way to solve the problem
- Based on the log at the data manager
Server replication
- Suppose the primary sends the log to the backup server
- It replays the log and applies committed transactions to its replicated state
- If the primary crashes, the backup soon catches up and can take over
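The log-shipping idea above can be sketched as follows: the backup replays shipped log records but applies only transactions for which it has seen a commit record. The record format here is illustrative, not from the slides:

```python
def replay(log):
    """Apply committed transactions from a primary's shipped log.
    Records are (txid, op, key, value); a commit is (txid, "commit", None, None).
    Writes of uncommitted transactions are buffered, never applied."""
    state, pending = {}, {}
    for txid, op, key, value in log:
        if op == "write":
            pending.setdefault(txid, []).append((key, value))
        elif op == "commit":
            for k, v in pending.pop(txid, []):
                state[k] = v
    return state  # any uncommitted tail of the log is simply dropped
```

If the primary crashes mid-stream, the backup's state reflects only the commit records it actually received, which is exactly why it "may have missed the last few updates" (next slides).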
Primary/backup
[Figure: clients connect to the primary, which streams its log to the backup.]
Clients initially connected to primary, which keeps backup up to date. Backup tracks the log.
Primary/backup
[Figure: the primary has crashed; the backup sees the channel break.]
Primary crashes. Backup sees the channel break and applies committed updates. But it may have missed the last few updates!
Primary/backup
[Figure: clients reconnect to the backup after the primary fails.]
Clients detect the failure and reconnect to the backup. But some clients may have "gone away". Backup state could be slightly stale. New transactions might suffer from this.
Issues?
- Under what conditions should the backup take over?
- Revisits the consistency problem seen earlier with clients and servers
- Could end up with a "split brain"
- Also notice that we still need 2PC to ensure that primary and backup stay in the same state!
Split brain: reminder
[Figure: clients connected to the primary, which streams its log to the backup.]
Clients initially connected to primary, which keeps backup up to date. Backup follows the log.
Split brain: reminder
[Figure: some links break but not all; primary and backup each believe they are in charge.]
Transient problem causes some links to break but not all. Backup thinks it is now primary; primary thinks backup is down.
Split brain: reminder
[Figure: clients split between primary and backup; one client disconnected from both.]
Some clients are still connected to the primary, but one has switched to the backup and one is completely disconnected from both.
Implication?
- A strict interpretation of ACID leads to the conclusion that there are no ACID replication schemes that provide high availability
- Most real systems solve this by weakening ACID
Real systems
- They use primary-backup with logging
- But they simply omit the 2PC
- The backup might take over in the wrong state (it may lag the state of the primary)
- Can use hardware to reduce or eliminate the split-brain problem
How does hardware help?
- Idea is that primary and backup share a disk
- Hardware is configured so only one can write the disk
- If the backup takes over, it grabs the "token"
- Token loss causes the primary to shut down (if it hasn't actually crashed)
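The token mechanism can be modeled as a single-holder flag: grabbing it displaces the previous holder, who must then stop writing. This is a toy model of the idea, not a real disk-reservation API:

```python
class SharedDiskToken:
    """Models a hardware write token: at most one server holds it,
    and losing it is the signal for the old primary to shut down."""
    def __init__(self):
        self.holder = None

    def grab(self, server):
        previous = self.holder
        self.holder = server   # hardware enforces a single writer
        return previous        # the displaced holder, if any

    def may_write(self, server):
        return self.holder == server
```

If the backup grabs the token during a split brain, the primary's next write is refused, so at most one side can make updates even when both believe they are in charge.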
Reconciliation
- This is the problem of fixing the transactions impacted by the lack of 2PC
- Usually just a handful of transactions: they committed, but the backup doesn't know, because it never saw the commit record
- Later, the server recovers and we discover the problem
- Need to apply the missing ones
- Also causes cascaded rollback
- Worst case may require human intervention
Summary
- Reliability can be understood in terms of
  - Availability: system keeps running during a crash
  - Recoverability: system can recover automatically
- Transactions are best for the latter
- Some systems need both sorts of mechanisms, but there are "deep" tradeoffs involved
Replication and High Availability
- All is not lost! Suppose we move away from the transactional model
- Can we replicate data at lower cost and with high availability?
- Leads to the "virtual synchrony" model
- Treats data as the "state" of a group of participating processes
- Replicated update: done with multicast
Steps to a solution
- First look more closely at 2PC, 3PC, and failure detection
- 2PC and 3PC both "block" in real settings
- But we can replace failure detection by consensus on membership
- Then these protocols become non-blocking (although solving a slightly different problem)
- Generalized approach leads to ordered atomic multicast in dynamic process groups
Non-blocking commit
- Goal: a protocol that allows all operational processes to terminate the protocol even if some subset crash
- Needed if we are to build highly available transactional systems (or systems that use quorum replication)
Definition of the problem
- Given a set of processes, one of which wants to initiate an action
- Participants may vote for or against the action
- Originator will perform the action only if all vote in favor; if any vote against (or don't vote), we "abort" the protocol and do not take the action
- Goal is an all-or-nothing outcome
Non-triviality
- Want to avoid solutions that do nothing (trivial case of "all or none")
- Would like to say that if all vote for commit, the protocol will commit... but in distributed systems we can't be sure votes will reach the coordinator!
- Any "live" protocol risks making a mistake: counting a live process that voted to commit as a failed process, leading to an abort
- Hence, the non-triviality condition is hard to capture
Typical protocol
- Coordinator asks all processes if they can take the action
- Processes decide if they can, and send back "ok" or "abort"
- Coordinator collects all the answers (or times out)
- Coordinator computes the outcome and sends it back
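The decision rule in the steps above is small enough to write down directly. A sketch of the coordinator's outcome computation, where a missing vote (`None`) models a timeout (names are illustrative):

```python
def coordinator_outcome(votes):
    """Phase 1 has collected one vote per participant.
    Commit only if every participant answered 'ok';
    an 'abort' vote or a timeout (None) forces abort."""
    if all(v == "ok" for v in votes.values()):
        return "commit"
    return "abort"
```

Note that a timeout and a genuine crash look identical here; that ambiguity is the root of the blocking scenarios on the following slides.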
Commit protocol illustrated
[Figure, three steps: (1) coordinator asks "ok to commit?"; (2) participants reply "ok with us"; (3) coordinator sends "commit".]
Note: garbage collection protocol not shown here
Failure issues
- So far, we have implicitly assumed that processes fail by halting (and hence not voting)
- In real systems a process could fail in arbitrary ways, even maliciously
- This has led to work on the "Byzantine generals" problem, a variation on commit set in a "synchronous" model with malicious failures
Failure model impacts costs!
- Byzantine model is very costly: 3t+1 processes are needed to overcome t failures, and the protocol runs in t+1 rounds
- This cost is unacceptable for most real systems, hence such protocols are rarely used
- Main areas of application: hardware fault-tolerance, security systems
- For these reasons, we won't study such protocols
Commit with a simpler failure model
- Assume processes fail by halting
- Coordinator detects failures (unreliably) using timeouts; it can make mistakes!
- Now the challenge is to terminate the protocol if the coordinator fails instead of, or in addition to, a participant!
Commit protocol illustrated
[Figure: coordinator asks "ok to commit?"; one participant has crashed; the others reply "ok with us"; the coordinator times out waiting for the crashed participant and sends "abort!".]
Note: garbage collection protocol not shown here
Example of a hard scenario
- Coordinator starts the protocol
- One participant votes to abort, all others to commit
- Coordinator and one participant now fail...
- We now lack the information to correctly terminate the protocol!
Commit protocol illustrated
[Figure: after the vote, the coordinator and one participant crash. Survivors replied "ok", but the decision is unknown and the failed participant's vote is unknown.]
Example of a hard scenario
- Problem: if the coordinator told the failed participant to abort, all must abort
- If it voted for commit and was told to commit, all must commit
- Surviving participants can't deduce the outcome without knowing how the failed participant voted
- Thus the protocol "blocks" until recovery occurs
Skeen: three-phase commit
- Seeks to increase availability
- Makes an unrealistic assumption that failures are accurately detectable
- With this, can terminate the protocol even if a failure does occur
Skeen: three-phase commit
- Coordinator starts the protocol by sending a request
- Participants vote to commit or to abort
- Coordinator collects votes, decides on the outcome
- Coordinator can abort immediately
- To commit, coordinator first sends a "prepare to commit" message
- Participants acknowledge; commit occurs during a final round of "commit" messages
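The value of the extra round can be sketched as a pair of rules: commit is preceded by a "prepare" phase, and the prepared state is what lets survivors run the protocol forward after a coordinator crash. A sketch under this slide's assumption of accurate failure detection (function names are my own):

```python
def three_pc_rounds(votes):
    """Rounds the coordinator drives. Unlike 2PC, a 'prepare' round
    precedes 'commit', so the decision is known to all survivors
    before any process actually commits."""
    if not all(v == "ok" for v in votes.values()):
        return ["vote", "abort"]
    return ["vote", "prepare", "commit"]

def recover_after_coordinator_crash(surviving_states):
    """If any survivor reached 'prepared', everyone voted commit, so
    roll forward; otherwise no one can have committed, so roll back."""
    return "commit" if "prepared" in surviving_states else "abort"
```

This captures the observation on the next slides: "prepared" implies a unanimous commit vote, so the outcome is always reconstructible from the survivors, provided failure detection is reliable.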
Three-phase commit protocol illustrated
[Figure, four rounds: "ok to commit?" → "ok" → "prepare to commit" → "prepared" → "commit".]
Note: garbage collection protocol not shown here
Observations about 3PC
- If any process is in "prepare to commit", all voted for commit
- Protocol commits only when all surviving processes have acknowledged prepare to commit
- After the coordinator fails, it is easy to run the protocol forward to the commit state (or back to the abort state)
Assumptions about failures
- If the coordinator suspects a failure, the failure is "real", and the faulty process, if it later recovers, will know it was faulty
- Failures are detectable with bounded delay
- On recovery, a process must go through a reconnection protocol to rejoin the system! (Find out the status of pending protocols that terminated while it was not operational)
Problems with 3PC
- With realistic failure detectors (that can make mistakes), the protocol still blocks!
- Bad case arises during "network partitioning", when the network splits the participating processes into two or more sets of operational processes
- Can prove that this problem is not avoidable: there are no non-blocking commit protocols for asynchronous networks
Situation in practical systems?
- Most use protocols based on 2PC: 3PC is more costly and, ultimately, still subject to blocking!
- Need to extend with a form of garbage collection mechanism to avoid accumulation of protocol state information (can solve in the background)
- Some systems simply accept the risk of blocking when a failure occurs
- Others weaken the consistency property to make progress, at the risk of inconsistency with a failed process
Process groups
- To overcome the cost of replication, we will introduce a dynamic process group model (processes that join and leave while the system is running)
- Will also relax our consistency goal: seek only consistency within a set of processes that all remain operational and members of the system
- In this model, 3PC is non-blocking!
- Yields an extremely cheap replication scheme!
Failure detection
Basic question: how to detect a failure
- Wait until the process recovers. If it was dead, it tells you: "I died, but I feel much better now". Could be a long wait
- Use some form of probe. But might make mistakes
- Substitute agreement on membership
Now, failure is a "soft" concept: rather than "up" or "down", we think about whether a process is behaving acceptably in the eyes of peer processes
Architecture
- Membership agreement: "join/leave" and "P seems to be unresponsive"
- 3PC-like protocols use membership changes instead of failure notifications
- Applications use replicated data for high availability
Issues?
How to "detect" failures:
- Can use timeout
- Or could use other system monitoring tools and interfaces
- Sometimes can exploit hardware
Tracking membership:
- Basically, need a new replicated service
- System membership "lists" are the data it manages
- We'll say it takes join/leave requests as input and produces "views" as output
Architecture
[Figure: application processes X, Y, Z send join/leave requests and failure reports ("A seems to have failed") to the GMS processes A, B, C, D; the GMS outputs the sequence of membership views {A}, {A,B,D}, {A,D}, {A,D,C}, {D,C}.]
Issues
- The group membership service (GMS) has just a small number of members
- This core set tracks membership for a large number of system processes
- Internally it runs a group membership protocol (GMP)
- The full system membership list is just replicated data managed by GMS members, updated using multicast
GMP design
- What protocol should we use to track the membership of the GMS itself?
- Must avoid the split-brain problem
- Desire continuous availability
- We'll see that a version of 3PC can be used
- But can't "always" guarantee liveness
Reading ahead?
- Read chapters 12, 13
- Thought problem: how important is external consistency (called dynamic uniformity in the text)?
- Homework: read about FLP. Identify other "impossibility results" for distributed systems. What is the simplest case of an impossibility result that you can identify?