Cassandra EU 2012 - Highly Available: The Cassandra Distribution Model by Sam Overton
Sam Overton's talk from Cassandra Europe on March 28th, 2012.
Highly Available: The Cassandra Distribution Model
Sam Overton
Cassandra Europe 2012
Cassandra is:
● built for scalability
● built to tolerate failure
In this talk:
● Cassandra distribution overview
● Partitioning and placement
● Replication
● Consistency
Overview
● High availability
● Partition tolerant
● Tunable consistency
● Scalable
● Replication
● No single point of failure
Partitioning and placement
Should...
● Assign data to hosts
● Have no single point of failure (SPOF) for routing clients to data
● Balance load
● Allow scaling without moving too much data
Consistent Hashing
[Figure: keys (k1, v1), (k2, v2), (k3, v3) placed on the token ring]
● partitioner maps key to ring token
● hosts' tokens determine placement of keys
● and proportion of data assigned to each host
● each row is stored on one host
● wide rows can cause hot-spotting!
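The ring lookup described above can be sketched in a few lines. This is a minimal illustration rather than Cassandra's implementation: `RING_SIZE`, `token_for`, the `Ring` class and the four host tokens are all assumptions made for the example, and MD5 is used only as a convenient stand-in hash.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space for this sketch


def token_for(key: str) -> int:
    """Hypothetical partitioner: hash the row key onto the ring."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


class Ring:
    def __init__(self, host_tokens: dict):
        # Sort hosts by token so the ring can be binary-searched.
        self._entries = sorted((tok, host) for host, tok in host_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def owner(self, key_token: int) -> str:
        """Return the first host whose token is >= the key's token, wrapping around."""
        i = bisect.bisect_left(self._tokens, key_token)
        return self._entries[i % len(self._entries)][1]


# Four hypothetical hosts with evenly spaced tokens.
ring = Ring({"A": 0, "B": RING_SIZE // 4, "C": RING_SIZE // 2, "D": 3 * RING_SIZE // 4})
for key in ("k1", "k2", "k3"):
    print(key, "->", ring.owner(token_for(key)))
```

Because each host owns the range ending at its own token, the spacing of the tokens decides how much of the key space each host receives.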
So how does it scale?
Bootstrapping a new node
Range is transferred from old host to new host
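To see why bootstrapping moves only a small amount of data, the sketch below compares key ownership before and after a new node joins. The ring of 100 tokens, the host names and the new node's token are hypothetical values chosen to keep the arithmetic easy to follow.

```python
import bisect


def owner(tokens: dict, key_token: int) -> str:
    """First host clockwise whose token is >= key_token (wrapping around)."""
    entries = sorted((tok, host) for host, tok in tokens.items())
    i = bisect.bisect_left([tok for tok, _ in entries], key_token)
    return entries[i % len(entries)][1]


# Hypothetical ring with tokens in the range 0..99.
before = {"A": 0, "B": 25, "C": 50, "D": 75}
after = dict(before, E=60)  # bootstrap new node E with token 60

# Record every token whose owner changes when E joins.
moved = {}
for t in range(100):
    old, new = owner(before, t), owner(after, t)
    if old != new:
        moved[t] = (old, new)

print(len(moved), "of 100 tokens change owner")
print(set(moved.values()))
```

In this toy ring only the range between C's token and E's token changes hands, and all of it comes from a single old host (D). Every other host is untouched.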
Decommission is the reverse process
● Tokens can be assigned manually, automatically or randomly
● Every node has full knowledge of placement
● Client connects to any node, max 1 hop to data
● Node status is gossiped
Partitioners
● Converts a row key (from client data) into a token on the ring
● RandomPartitioner
● Order Preserving Partitioner
Random Partitioner
● token = hash(key)
● good load balancing
● no range queries across row keys
Order Preserving Partitioner
● token = key
● requires manual load balancing
● careful selection of tokens around the ring
● allows range queries across row keys
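The difference between the two partitioners can be illustrated by sorting the same keys by each kind of token. This is a sketch under assumptions: both function names are hypothetical, and the MD5-based function only approximates the idea of `token = hash(key)` and does not reproduce Cassandra's exact token arithmetic.

```python
import hashlib


def random_partitioner_token(key: bytes) -> int:
    """token = hash(key): an MD5-based stand-in for a hashing partitioner."""
    return int.from_bytes(hashlib.md5(key).digest(), "big")


def order_preserving_token(key: bytes) -> bytes:
    """token = key: ring order is the same as key order."""
    return key


keys = [b"user:001", b"user:002", b"user:003", b"user:004"]

# Order-preserving tokens keep adjacent keys adjacent on the ring, so a range
# of row keys maps to a contiguous range of tokens (and possibly one hot host).
opp_order = sorted(keys, key=order_preserving_token)

# Hashed tokens scatter adjacent keys, which spreads load but means a range
# of row keys no longer corresponds to a range of tokens.
rp_order = sorted(keys, key=random_partitioner_token)

print(opp_order)
print(rp_order)
```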
● Get it right first time!
● Design data model for RandomPartitioner (RP)
● Custom partitioners are possible if necessary
Replication
● For availability
● For redundancy
● Can increase read bandwidth
● Replication Factor (RF) is the number of copies of data
● Defined per-keyspace
● Can be changed (e.g. if data becomes more or less valuable)
● Determines how many failures can be tolerated
Replication Strategy
● Determines how replicas are assigned for each host
● Defined per keyspace (like RF)
● SimpleStrategy
● NetworkTopologyStrategy
● Custom strategies can be written
Replication Strategy : Simple Strategy
[Figure: ring placement of (k1, v1) and (k2, v2), e.g. RF=3]
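As a rough sketch of SimpleStrategy placement, the function below puts the first replica on the host that owns the key's token and the remaining replicas on the next hosts clockwise around the ring. The function name, the tiny 0..99 token space and the host tokens are assumptions for illustration only.

```python
import bisect


def simple_strategy_replicas(host_tokens: dict, key_token: int, rf: int) -> list:
    """Return the owner of key_token plus the next rf - 1 hosts clockwise."""
    entries = sorted((tok, host) for host, tok in host_tokens.items())
    tokens = [tok for tok, _ in entries]
    start = bisect.bisect_left(tokens, key_token)
    count = min(rf, len(entries))  # cannot place more replicas than hosts
    return [entries[(start + i) % len(entries)][1] for i in range(count)]


# Hypothetical four-host ring.
ring = {"A": 0, "B": 25, "C": 50, "D": 75}
print(simple_strategy_replicas(ring, 30, rf=3))
print(simple_strategy_replicas(ring, 80, rf=3))
```

The placement ignores racks and datacentres entirely, which is the gap that a topology-aware strategy is meant to fill.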
Replication Strategy : Network Topology Strategy
Multi-datacentre support
[Figure: two datacentres, DC1 and DC2]
Snitches
● Enables routing of requests according to node proximity
● Used by replication strategy to determine rack and DC membership
● Custom snitches can be written
Simple Snitch
● Every host is in the same rack & DC with equal proximity
RackInferringSnitch
● Infers the rack & DC from IP address of host
  e.g. 123.8.2.100: DC = 8, rack = 2, host = 100
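The inference can be sketched as plain string handling. This assumes the labels in the example above sit under the second, third and fourth octets of the address, which matches how RackInferringSnitch is documented to behave. The function name `infer_location` is hypothetical.

```python
def infer_location(ip: str) -> dict:
    """Split a dotted-quad IPv4 address into DC, rack and host parts."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError(f"not a dotted-quad IPv4 address: {ip!r}")
    # Second octet -> datacentre, third -> rack, fourth -> host.
    return {"dc": octets[1], "rack": octets[2], "host": octets[3]}


print(infer_location("123.8.2.100"))
```

This only works when the network's addressing plan really does follow the physical layout. Otherwise a configuration-driven snitch is the safer choice.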
EC2Snitch
● DC = EC2 region
● Rack = EC2 availability zone
Property file snitch
● Rack and DC membership read from configuration file
DynamicSnitch
● Wraps each of the other snitches
● Records latency stats from read operations
● Avoids routing to slow hosts
● Configurable update intervals
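The idea of scoring hosts by observed read latency might look roughly like the following. This is a simplified sketch: the `LatencyTracker` class, its sliding window and the plain average are assumptions, and the real DynamicSnitch scoring is more involved.

```python
from collections import defaultdict, deque


class LatencyTracker:
    def __init__(self, window: int = 100):
        # Keep only the most recent `window` samples per host.
        self._samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, host: str, latency_ms: float) -> None:
        self._samples[host].append(latency_ms)

    def score(self, host: str) -> float:
        samples = self._samples.get(host)
        # Hosts with no samples score 0.0 so they still get tried.
        return sum(samples) / len(samples) if samples else 0.0

    def sort_by_proximity(self, replicas: list) -> list:
        # sorted() is stable, so ties keep the ordering of the wrapped snitch.
        return sorted(replicas, key=self.score)


tracker = LatencyTracker()
for host, ms in [("A", 5.0), ("A", 7.0), ("B", 50.0), ("C", 2.0)]:
    tracker.record(host, ms)
print(tracker.sort_by_proximity(["A", "B", "C"]))
```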
Consistency
● Replication and failures/partitions cause inconsistency
● Old versions of data can be returned
Timestamps:
● Chosen by the client
● Can be used to avoid read-modify-write
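One way to picture how client-chosen timestamps resolve conflicting versions is a "highest timestamp wins" rule, sketched below. The `Cell` type and `reconcile` function are hypothetical names, and the sketch does not model what happens when two versions carry the same timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    value: str
    timestamp: int  # chosen by the client, e.g. microseconds since the epoch


def reconcile(*versions: Cell) -> Cell:
    """Pick the version with the highest client-supplied timestamp."""
    return max(versions, key=lambda c: c.timestamp)


stale = Cell("v1", timestamp=1000)
fresh = Cell("v2", timestamp=2000)
print(reconcile(stale, fresh).value)
```

Because the winner depends only on the timestamps, a client can write a new version without first reading the old one, provided its clock is reasonably accurate.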
● Cassandra allows a trade-off between partition-tolerance and consistency
● For strong consistency: R + W > N (R = replicas read, W = replicas written, N = replication factor)
● E.g. with 5 replicas (RF = N = 5): write to 3, read from 3
[Figure: five replicas, each holding version 1]
[Figure: write to 3 replicas; three replicas now hold version 2, two still hold version 1]
[Figure: read from 3 replicas; at least one of them holds version 2]
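The R + W > N rule can be checked by brute force for small clusters: a read is guaranteed to see the latest write exactly when every possible read set shares at least one replica with every possible write set. The function name `always_overlaps` is made up for this sketch.

```python
from itertools import combinations


def always_overlaps(n: int, r: int, w: int) -> bool:
    """True if every set of r replicas read shares a replica with every set of w written."""
    replicas = range(n)
    return all(
        set(read) & set(written)
        for written in combinations(replicas, w)
        for read in combinations(replicas, r)
    )


print(always_overlaps(5, 3, 3))  # R + W = 6, which is greater than N = 5
print(always_overlaps(5, 2, 3))  # R + W = 5, which is not greater than N = 5
```

With N = 5, writing to 3 and reading from 3 always overlaps, while reading from only 2 can miss all three updated replicas.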
Consistency Level
● ANY (only for writes)
● ONE, TWO, THREE
● QUORUM (N/2 + 1, rounded down)
● LOCAL_QUORUM
● ALL
● Relax strong consistency for partition tolerance
● To tolerate 1 node failure with strong consistency use RF=3 with CL=QUORUM
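The arithmetic behind the RF=3, CL=QUORUM recommendation is small enough to spell out. The helper names below are hypothetical, and the quorum size uses integer division.

```python
def quorum(rf: int) -> int:
    """Number of replicas that must respond at QUORUM: N/2 + 1, rounded down."""
    return rf // 2 + 1


def tolerated_failures(rf: int, required: int) -> int:
    """How many replicas can be down while `required` responses are still possible."""
    return rf - required


for rf in (1, 2, 3, 5):
    q = quorum(rf)
    strongly_consistent = q + q > rf  # R + W > N with R = W = quorum
    print(f"RF={rf} quorum={q} tolerates={tolerated_failures(rf, q)} strong={strongly_consistent}")
```

RF=2 gains nothing over RF=1 at QUORUM, since both replicas must respond. RF=3 is the smallest replication factor that keeps quorum reads and writes available with one node down.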
Increasing Consistency
● Read repair
● Hinted hand-off
● Anti-entropy repair
Read Repair
Hinted Hand-off
[Figure sequence, e.g. RF=2: both replicas start with (k1, v1); a write of (k1, v2) arrives; an additional copy of (k1, v2) appears (the hint); one replica then holds (k1, v2) while the other still holds (k1, v1); finally both replicas hold (k1, v2)]
● Hinted writes do not count towards the chosen consistency level
● … except with CL=ANY which succeeds even if all replicas are down
● Don't rely on hints: hints cannot be read!
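A toy model of hinted hand-off, under heavy simplification: the `Replica` and `Coordinator` classes are invented for this sketch, hints live in a plain list on the coordinator, timestamps are ignored, and CL=ANY is not modelled. It shows only that a hint is stored for a down replica, is not counted as an acknowledgement, and is replayed once the replica returns.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class Coordinator:
    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = []  # (target replica, key, value) held for down replicas

    def write(self, key, value, required_acks):
        acks = 0
        for replica in self.replicas:
            if replica.up:
                replica.data[key] = value
                acks += 1
            else:
                # Stored for later delivery; deliberately not counted as an ack.
                self.hints.append((replica, key, value))
        return acks >= required_acks

    def replay_hints(self):
        remaining = []
        for replica, key, value in self.hints:
            if replica.up:
                replica.data[key] = value
            else:
                remaining.append((replica, key, value))
        self.hints = remaining


# RF=2: both replicas start with (k1, v1), then one goes down.
r1, r2 = Replica("r1"), Replica("r2")
r1.data["k1"] = r2.data["k1"] = "v1"
coordinator = Coordinator([r1, r2])

r2.up = False
print(coordinator.write("k1", "v2", required_acks=1))  # only r1 acknowledges
print(r2.data["k1"])                                   # r2 is still stale

r2.up = True
coordinator.replay_hints()
print(r2.data["k1"])                                   # r2 has caught up
```

Note that while r2 was down, a read served by r2 alone would have had no way to see the hinted value, which is why hints should not be relied on for consistency.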
Anti-entropy repair
● Manual maintenance process
● Compares all data stored on a host with the replicas
● Differences are streamed to restore consistency
● Must be run at least once every 10 days (the default gc_grace_seconds) to ensure tombstones are replicated
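The compare-then-stream idea can be sketched with per-bucket digests. This is a flattened stand-in, not the real mechanism: Cassandra builds Merkle trees over token ranges, whereas this example hashes keys into a fixed number of buckets. The function names, the bucket count and the `(value, timestamp)` cell layout are all assumptions.

```python
import hashlib


def bucket_of(key: str, buckets: int) -> int:
    """Assign a key to one of `buckets` comparison buckets."""
    return int.from_bytes(hashlib.md5(key.encode()).digest(), "big") % buckets


def bucket_digests(data: dict, buckets: int) -> list:
    """Digest the contents of each bucket so replicas can compare cheaply."""
    hashers = [hashlib.sha256() for _ in range(buckets)]
    for key in sorted(data):
        value, timestamp = data[key]
        hashers[bucket_of(key, buckets)].update(f"{key}={value}@{timestamp};".encode())
    return [h.hexdigest() for h in hashers]


def repair(local: dict, remote: dict, buckets: int = 8) -> int:
    """Bring both replicas up to date; return how many buckets had to be streamed."""
    local_digests = bucket_digests(local, buckets)
    remote_digests = bucket_digests(remote, buckets)
    mismatched = {b for b in range(buckets) if local_digests[b] != remote_digests[b]}
    for key in set(local) | set(remote):
        if bucket_of(key, buckets) in mismatched:
            # Only keys in differing buckets are exchanged; newest timestamp wins.
            copies = [d[key] for d in (local, remote) if key in d]
            newest = max(copies, key=lambda cell: cell[1])
            local[key] = remote[key] = newest
    return len(mismatched)


replica_a = {"k1": ("v1", 1), "k2": ("v2", 5)}
replica_b = {"k1": ("v1", 1), "k2": ("old", 2), "k3": ("v3", 3)}
print(repair(replica_a, replica_b), "bucket(s) streamed")
print(replica_a == replica_b)
```

Comparing digests first means that buckets which already agree cost only a hash comparison, and only the differing portions of the data need to cross the network.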
fin.