pwlsd.1 a critique of the cap theorem - martin kleppmann
A Critique of the CAP Theorem
Martin Kleppmann
Papers We Love San Diego
Daniel Norman – August 4th, 2016
@DreamingInCode
Chronology of CAP
2000: Brewer publicizes his conjecture in various talks and papers – “The CAP Principle”.
2002: Gilbert and Lynch formally proved Brewer’s conjecture, and CAP Theorem was born.
1970s, 80s, and 90s: Absolutely[1] Nothing[2] happened[3] Nothing[4] to[5] see[6] here[7] folks.
[1] Paul R Johnson and Robert H Thomas. RFC 677: The maintenance of duplicate databases. Network Working Group, January 1975. URL https://tools.ietf.org/html/rfc677.
[2] Jim N Gray, Raymond A Lorie, Gianfranco R Putzolu, and Irving L Traiger. Granularity of locks and degrees of consistency in a shared data base. In G M Nijssen, editor, Modelling in Data Base Management Systems: Proceedings of the IFIP Working Conference on Modelling in Data Base Management Systems, pages 364–394. Elsevier/North Holland, 1976.
[3] Leslie Lamport. How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Transactions on Computers, 28(9):690–691, September 1979. doi:10.1109/TC.1979.1675439.
[4] Bruce G Lindsay, Patricia Griffiths Selinger, C Galtieri, Jim N Gray, Raymond A Lorie, Thomas G Price, Gianfranco R Putzolu, Irving L Traiger, and Bradford W Wade. Notes on distributed databases. Technical Report RJ2571(33471), IBM Research, July 1979.
[5] Bowen Alpern and Fred B Schneider. Defining liveness. Information Processing Letters, 21(4):181–185, October 1985. doi:10.1016/0020-0190(85)90056-0.
[6] Susan B Davidson, Hector Garcia-Molina, and Dale Skeen. Consistency in partitioned networks. ACM Computing Surveys, 17(3):341–370, September 1985. doi:10.1145/5505.5508.
[7] M P Herlihy and J M Wing. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems, 12(3), July 1990.
“Consistency”
“I’m totally pro-choice.” (Fox News, October 31, 1999)
“I’m pro-life.” (CPAC, February 10, 2011)
“I wanted to do this for myself. I had to do it for myself.” (Time, August 18, 2015)
“I don’t want it for myself. I don’t need it for myself.” (ABC News, November 20, 2015)
“I think the institution of marriage should be between a man and a woman.” (The Advocate, February 15, 2000)
“If two people dig each other, they dig each other.” (Trump University “Trump Blog,” December 22, 2005)
“I’m against gay marriage.” (Fox News, April 14, 2011)
What is “Consistency”?
● Brewer defines consistency as one-copy-serializability (1SR)
● Gilbert and Lynch define consistency as linearizability
Put Simply:
Linearizability can be viewed as a special case of strict serializability where transactions are restricted to consist of a single operation applied to a single object.[1]
[1] Linearizability: A Correctness Condition for Concurrent Objects. By M P. Herlihy and J M. Wing. ACM Transactions on Programming Languages and Systems, Vol. 12, No. 3, July 1990.
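To make the "single operation on a single object" reading concrete, here is a minimal brute-force sketch that checks whether a short history of reads and writes on one register is linearizable. The `Op` record and the `is_linearizable` helper are hypothetical names invented for this illustration, not anything from the papers, and trying every permutation is only practical for a handful of operations.

```python
from itertools import permutations
from typing import NamedTuple, Optional


class Op(NamedTuple):
    start: float           # time the client invoked the operation
    end: float             # time the client received the response
    kind: str              # "write" or "read"
    value: Optional[int]   # value written, or value the read returned


def is_linearizable(history, initial=None):
    """Return True if some total order respects real time and register semantics."""
    for order in permutations(history):
        # Real-time rule: an operation that finished before another started
        # may not be placed after it in the total order.
        respects_time = all(
            not (later.end < earlier.start)
            for i, earlier in enumerate(order)
            for later in order[i + 1:]
        )
        if not respects_time:
            continue
        current = initial
        legal = True
        for op in order:
            if op.kind == "write":
                current = op.value
            elif op.value != current:
                # A read must return the most recently written value.
                legal = False
                break
        if legal:
            return True
    return False


# The read overlaps the write, so it may take effect before the write does.
concurrent_read = [Op(0, 3, "write", 1), Op(1, 2, "read", None)]
# The write finished before the read began, so returning the old value is stale.
stale_read = [Op(0, 1, "write", 1), Op(2, 3, "read", None)]

print(is_linearizable(concurrent_read))
print(is_linearizable(stale_read))
```

The two sample histories differ only in timing: the same "read returns the initial value" response is acceptable while the write is still in flight, and a violation once the write has completed.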
“Availability”
Gilbert & Lynch defined availability differently.
Property of algorithm, or observed metric?
Brewer: “availability is obviously continuous from 0 to 100 percent”
Gilbert & Lynch: “For a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response”
“Availability”
The Server is DOWN! – What is “UP” anyway?
● Is it a server that fails to respond within 1000 ms?
● What if it responds 5 minutes later?
● What if it responds, but with invalid data?
● What if it responds but the response is not received?
● What if your request packet is dropped, but all others are fine?
● What is the sound of one hand clapping?
“Availability”
● Gilbert & Lynch definition – Contradictory and counter-intuitive
● Brewer’s definition is ok
● Nonsensical to call an algorithm “Available”
“Partition Tolerance”
A network partition is:
“a communication failure in which the network is split into disjoint sub-networks, with no communication possible across sub-networks”
“Partition Tolerance”
Most networks behave like fair-loss links
A fair-loss link is a link on which every message you send has a non-zero probability of being delivered
(and a probability of delivery below 100%)
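A rough sketch of what that means in practice: on a fair-loss link, retransmitting eventually gets a message through, and the only question is how long you are willing to keep trying. The simulation below is an illustration under assumptions (independent drops, a made-up `send_until_delivered` helper, an arbitrary retry cap), not a model of any real network.

```python
import random


def send_until_delivered(loss_probability, rng, max_attempts=10_000):
    """Retransmit over a simulated fair-loss link; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        # Assumption: each transmission is dropped independently
        # with probability loss_probability.
        if rng.random() >= loss_probability:
            return attempt
    # Past the retry cap, the sender cannot tell a very lossy link from a partition.
    raise TimeoutError("gave up: link is indistinguishable from a partition")


rng = random.Random(42)
attempts = [send_until_delivered(0.9, rng) for _ in range(1000)]
print("worst case attempts:", max(attempts))
print("mean attempts:", sum(attempts) / len(attempts))
```

Setting `loss_probability` to 1.0 turns the link into a permanent partition, and the sender only learns that by running out of patience.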
“Partition Tolerance”
A meditation:
What fraction of messages must go undelivered before it’s a “Partition”?
CA AP CP CCCP
Using Gilbert & Lynch definitions,
essentially only two options:
CA - Avoid the gulag dear comrade, coordinate carefully with your party official.
AP - You’re on your own, capitalist swine!
CP - Not really any different from CA
CCCP - Purge all your data and start over every few years
PROVE IT
From “Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services”, Gilbert & Lynch, 2002
PROVE IT
G&L offer irrefutable proof that:
The operation of their algorithm is non-linearizable only if the partition lasts forever.
But:
Partitions don’t last forever!
Sometimes faults aren’t detected!
Maybe you don’t even want linearizability!
GREETINGS PROFESSOR BREWER
CAP IS A STRANGE GAME.
THE ONLY WINNING MOVE IS NOT TO PLAY.
HELLO
HOW ABOUT A NICE GAME OF CHESS?
In my humble opinion... There’s a simpler way to say it:
● Some consistency models require total event ordering
● Any up-to-date list can only exist in a single point in space
● We are not optimistic about FTL information transfer
Be patient, or be self-sufficient. No whining.
Kleppmann proposes a “Delay sensitivity framework”
In a nutshell: Travel takes time, how patient can you be?
So What’s better?
“Delay sensitivity framework”
● Network latency is lumpy / uncertain
● Very few scenarios entail actually reliable networks - packet loss is common
● Understanding lower bounds for different consistency models
So What’s better?
Proposed Terminology
● Availability - Empirical measurements only!
● Delay Sensitive - How patient can we be for a given operation?
● Network Faults - All kinds of weird shit happen on networks.
● Fault Tolerance - Under what specific failure modes can we guarantee our invariants?
● Consistency - Not just one, but many different consistency models to choose from
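One way to read the first two terms together: availability becomes a number you measure against a deadline you choose. The sketch below assumes a list of observed response times in milliseconds, with `None` standing for a request that never got a response; the `observed_availability` helper and the sample data are invented for illustration.

```python
def observed_availability(latencies_ms, deadline_ms):
    """Fraction of requests answered within deadline_ms (None = no response at all)."""
    if not latencies_ms:
        raise ValueError("need at least one request")
    on_time = sum(1 for t in latencies_ms if t is not None and t <= deadline_ms)
    return on_time / len(latencies_ms)


# Made-up measurements: mostly fast, one slow, one five-minute straggler, two lost.
samples = [12, 40, 950, None, 300_000, 25, 18, None, 70, 33]

for deadline in (50, 1000, 600_000):
    print(f"deadline {deadline} ms -> availability {observed_availability(samples, deadline)}")
```

The same system scores differently depending on how patient the caller is, which is the point of treating delay sensitivity as a property of each operation.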
So What’s better?
Strong vs Weak
Consistency models galore:
Consistency models are a continuum.
“Strong” and “Weak” are conversational terms, not formal.
“Weak” does not mean unsafe.
It’s all about selecting which invariants you want.
Kyle Kingsbury 2014 – https://aphyr.com/posts/313-strong-consistency-models
In Conclusion:
● Irrefutable truths may be predicated on misleading definitions
● Consistency is subjective
● CAP has outlived its usefulness
● We can do better!
● Let’s reason about latency
● Let’s fix our terminology
● Rigor is good, CS needs moar of it