![Page 1: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/1.jpg)
Introduction to Cassandra DuyHai DOAN, Technical Advocate
![Page 2: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/2.jpg)
@doanduyhai
About me!
Duy Hai DOAN
Cassandra technical advocate
• talks, meetups, confs
• open-source devs (Achilles, …)
• OSS technical point of contact in Europe
☞ [email protected]
• production troubleshooting
![Page 3: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/3.jpg)
Datastax!
• Founded in April 2010
• We drive Apache Cassandra™
• Home to Cassandra chair & most committers (≈80%)
• Headquarters in the San Francisco Bay Area
• EU headquarters in London, offices in France and Germany
• Datastax Enterprise = OSS Cassandra + features
![Page 4: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/4.jpg)
Agenda!
• Architecture
• Replication
• Consistency
• Read/write path
• Data Model
• Failure handling
![Page 5: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/5.jpg)
What is Cassandra!
![Page 6: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/6.jpg)
Cassandra history!
NoSQL database
• created at Facebook
• open-sourced in 2008
• current version = 2.1
• column-oriented ☞ distributed table
![Page 7: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/7.jpg)
Cassandra 5 key facts!
Linear scalability
Small & « huge » scale
• 3 → 1k+ node clusters
• 3 GB → PB+
![Page 8: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/8.jpg)
Cassandra 5 key facts!
Continuous availability (≈100% up-time)
• resilient architecture (Dynamo)
• rolling upgrades
• data backward compatible n/n+1 versions
![Page 9: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/9.jpg)
Cassandra 5 key facts!
Multi-data centers
• out-of-the-box (config only)
• AWS conf for multi-region DCs
• GCE/CloudStack support
• resilience, work-load segregation
• virtual data-centers
![Page 10: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/10.jpg)
Cassandra 5 key facts!
Operational simplicity
• 1 node = 1 process + 1 config file
• deployment automation
• OpsCenter for monitoring
![Page 11: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/11.jpg)
Cassandra 5 key facts!
Analytics combo
• Cassandra + Spark = awesome!
• realtime streaming
![Page 12: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/12.jpg)
Cassandra architecture!
![Page 13: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/13.jpg)
Cassandra architecture!
[Diagram: Node1 and Node2 each stack four layers: API (CQL & RPC), CLUSTER (Dynamo), DATA STORE (Bigtable), DISKS; a client request arrives at the cluster]
![Page 14: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/14.jpg)
Data distribution!
Random: hash of #partition → token = hash(#p)
Hash: ]-X, X]
X = huge number (2^64 / 2)
[Diagram: ring of 8 nodes, n1 to n8]
![Page 15: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/15.jpg)
Token Ranges!
A: ]0, X/8]
B: ]X/8, 2X/8]
C: ]2X/8, 3X/8]
D: ]3X/8, 4X/8]
E: ]4X/8, 5X/8]
F: ]5X/8, 6X/8]
G: ]6X/8, 7X/8]
H: ]7X/8, X]
[Diagram: ring of 8 nodes (n1 to n8), each owning one token range (A to H)]
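The mapping from partition key to token to range can be sketched as follows. This is only an illustration under stated assumptions: `token_for` uses MD5 as a stand-in hash (not Cassandra's real partitioner), and the names `token_for`, `range_for`, `UPPER_BOUNDS` are hypothetical. The range bounds follow the slide: eight equal ranges `]lower, upper]` between 0 and X.

```python
import hashlib
from bisect import bisect_left

X = 2**63  # the slide's "huge number" (2^64 / 2)
RANGES = "ABCDEFGH"
# Inclusive upper bound of each range: A ends at X/8, B at 2X/8, ..., H at X.
UPPER_BOUNDS = [(i + 1) * X // 8 for i in range(8)]


def token_for(partition_key: str) -> int:
    """Hypothetical hash: map a partition key to a token in ]0, X]."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % X + 1


def range_for(token: int) -> str:
    """Return the label of the range ]lower, upper] that contains the token."""
    return RANGES[bisect_left(UPPER_BOUNDS, token)]


token = token_for("jdoe")
print(f"token={token} falls in range {range_for(token)}")
```

Because the upper bound of each range is inclusive, a token equal to X/8 still belongs to A, and X/8 + 1 is the first token of B.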
![Page 16: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/16.jpg)
Linear scalability!
[Diagram: the ring with 8 nodes (n1 to n8), then the same ring grown to 10 nodes (n1 to n10)]
![Page 17: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/17.jpg)
Q & A
![Page 18: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/18.jpg)
Replication & Consistency Model!
![Page 19: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/19.jpg)
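Failure tolerance by replication!
Replication Factor (RF) = 3
[Diagram: ring of 8 nodes (n1 to n8) owning token ranges A to H; each partition is stored on 3 replicas (1, 2, 3), so three consecutive nodes hold the range sets {B, A, H}, {C, B, A} and {D, C, B}]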
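The range sets in the diagram can be reproduced with a small sketch. It assumes that range A is owned by n1, B by n2, and so on, and that replicas are simply the next nodes clockwise on the ring; both are assumptions made for illustration, and the function names are hypothetical.

```python
NODES = ["n1", "n2", "n3", "n4", "n5", "n6", "n7", "n8"]
RANGES = ["A", "B", "C", "D", "E", "F", "G", "H"]  # assumed: RANGES[i] is owned by NODES[i]


def replicas_for(range_label, rf=3):
    """Owner of the range plus the next rf - 1 nodes clockwise."""
    owner = RANGES.index(range_label)
    return [NODES[(owner + i) % len(NODES)] for i in range(rf)]


def ranges_held_by(node, rf=3):
    """A node holds its own range plus the rf - 1 ranges that precede it."""
    idx = NODES.index(node)
    return [RANGES[(idx - i) % len(RANGES)] for i in range(rf)]


print(replicas_for("A"))     # ['n1', 'n2', 'n3']
print(ranges_held_by("n2"))  # ['B', 'A', 'H']
```

With RF = 3, losing any single node still leaves two copies of every range.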
![Page 20: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/20.jpg)
Coordinator node!
Incoming requests (read/write)
Coordinator node handles the request
Every node can be coordinator → masterless
[Diagram: a client request reaches the coordinator on the 8-node ring (n1 to n8), which contacts replicas 1, 2 and 3]
![Page 21: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/21.jpg)
Consistency!
Tunable at runtime
• ONE
• QUORUM (strict majority w.r.t. RF)
• ALL
Applies to both reads & writes
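As a small worked example, the number of replica acknowledgements each level requires can be computed from the replication factor. The function name `required_acks` is hypothetical; the only rule taken from the slide is that QUORUM is a strict majority of RF.

```python
def required_acks(level: str, rf: int) -> int:
    """Number of replicas that must answer for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1  # strict majority of the replication factor
    if level == "ALL":
        return rf
    raise ValueError(f"unknown consistency level: {level}")


for rf in (3, 5):
    print(rf, {lvl: required_acks(lvl, rf) for lvl in ("ONE", "QUORUM", "ALL")})
```

For RF = 3 a quorum is 2 replicas, and for RF = 5 it is 3.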
![Page 22: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/22.jpg)
![Page 23: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/23.jpg)
![Page 24: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/24.jpg)
Write consistency!
Write ONE
• write request to all replicas in parallel
• wait for ONE ack before returning to client
• other acks later, asynchronously
[Diagram: the coordinator writes to replicas 1, 2 and 3 on the 8-node ring; acks arrive after 5 µs, 10 µs and 120 µs]
![Page 25: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/25.jpg)
Write consistency!
Write QUORUM
• write request to all replicas in parallel
• wait for QUORUM acks before returning to client
• other acks later, asynchronously
[Diagram: the coordinator writes to replicas 1, 2 and 3 on the 8-node ring; acks arrive after 5 µs, 10 µs and 120 µs]
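To make the effect on latency concrete, here is a minimal sketch using the ack times from the diagram (5, 10 and 120 µs). It assumes the coordinator answers the client as soon as the required number of acks has arrived; `response_time_us` is a hypothetical helper, not a Cassandra API.

```python
def response_time_us(ack_latencies_us, required_acks):
    """Time at which the coordinator can answer: the k-th fastest ack."""
    if not 1 <= required_acks <= len(ack_latencies_us):
        raise ValueError("required_acks must be between 1 and the number of replicas")
    return sorted(ack_latencies_us)[required_acks - 1]


acks = [120, 5, 10]  # ack latency of each replica, in microseconds
print("ONE:   ", response_time_us(acks, 1))  # 5
print("QUORUM:", response_time_us(acks, 2))  # 10
print("ALL:   ", response_time_us(acks, 3))  # 120
```

The slow replica (120 µs) only delays the client when the write is done at ALL.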
![Page 26: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/26.jpg)
![Page 27: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/27.jpg)
Read consistency!
Read ONE
• read from one node among all replicas
• contact the fastest node (stats)
[Diagram: the coordinator reads from one of replicas 1, 2 and 3 on the 8-node ring]
![Page 28: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/28.jpg)
![Page 29: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/29.jpg)
![Page 30: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/30.jpg)
![Page 31: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/31.jpg)
Read consistency!
Read QUORUM
• read from the fastest node
• AND request digests from other replicas to reach QUORUM
• return most up-to-date data to client
• repair if digest mismatch
[Diagram: the coordinator reads from replicas 1, 2 and 3 on the 8-node ring]
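The QUORUM read steps above can be sketched as follows. This is a simplified model, not Cassandra's implementation: each replica answer is a `Cell` with a value and a write timestamp, the digest is an MD5 of both (an assumption), and `quorum_read` is a hypothetical function that returns the winning cell plus the indexes of the replicas that would need repair.

```python
import hashlib
from typing import List, NamedTuple, Tuple


class Cell(NamedTuple):
    value: str
    timestamp: int  # write timestamp, higher means more recent


def digest(cell: Cell) -> str:
    """Short fingerprint a replica can send instead of the full data."""
    return hashlib.md5(f"{cell.value}@{cell.timestamp}".encode("utf-8")).hexdigest()


def quorum_read(answers: List[Cell]) -> Tuple[Cell, List[int]]:
    """answers[0] is the full read from the fastest replica, the rest are digest-only."""
    data = answers[0]
    if all(digest(other) == digest(data) for other in answers[1:]):
        return data, []  # digests match, nothing to repair
    # Mismatch: compare the full data and keep the most recent write.
    latest = max(answers, key=lambda c: c.timestamp)
    stale = [i for i, c in enumerate(answers) if c.timestamp < latest.timestamp]
    return latest, stale


print(quorum_read([Cell("A", 1), Cell("B", 2)]))
```

In the example the fastest replica still holds the old value A, so the newer B is returned and replica 0 is flagged for repair.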
![Page 32: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/32.jpg)
Consistency in action!
RF = 3, Write ONE, Read ONE
Write ONE: B
Replicas: B A A (data replication in progress …)
Read ONE: A
![Page 33: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/33.jpg)
Consistency in action!
RF = 3, Write ONE, Read QUORUM
Write ONE: B
Replicas: B A A (data replication in progress …)
Read QUORUM: A
![Page 34: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/34.jpg)
Consistency in action!
RF = 3, Write ONE, Read ALL
Write ONE: B
Replicas: B A A (data replication in progress …)
Read ALL: B
![Page 35: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/35.jpg)
Consistency in action!
RF = 3, Write QUORUM, Read ONE
Write QUORUM: B
Replicas: B B A (data replication in progress …)
Read ONE: A
![Page 36: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/36.jpg)
Consistency in action!
RF = 3, Write QUORUM, Read QUORUM
Write QUORUM: B
Replicas: B B A (data replication in progress …)
Read QUORUM: B
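The five scenarios above follow a standard quorum rule that the slides do not state explicitly: a read is guaranteed to see the latest write only when the number of replicas written plus the number of replicas read exceeds RF, so that both sets must overlap. A minimal sketch for RF = 3, with hypothetical names:

```python
# Acks required at RF = 3 for each consistency level (QUORUM = 3 // 2 + 1).
ACKS_AT_RF3 = {"ONE": 1, "QUORUM": 2, "ALL": 3}


def read_sees_latest_write(write_level, read_level, rf=3, acks=ACKS_AT_RF3):
    """True when every read replica set must overlap every write replica set."""
    return acks[write_level] + acks[read_level] > rf


scenarios = [("ONE", "ONE"), ("ONE", "QUORUM"), ("ONE", "ALL"),
             ("QUORUM", "ONE"), ("QUORUM", "QUORUM")]
for w, r in scenarios:
    print(f"Write {w:6} + Read {r:6} -> latest value guaranteed: "
          f"{read_sees_latest_write(w, r)}")
```

Only Write ONE + Read ALL and Write QUORUM + Read QUORUM pass the check, which matches the two scenarios where the read returns B.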
![Page 37: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/37.jpg)
Consistency level!
ONE
Fast, may not read latest written value
![Page 38: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/38.jpg)
Consistency level!
QUORUM
Strict majority w.r.t. Replication Factor
Good balance
![Page 39: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/39.jpg)
Consistency level!
ALL
Paranoid
Slow, no high availability
![Page 40: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/40.jpg)
Consistency summary!
ONE read + ONE write
☞ available for read/write even with (N-1) replicas down
QUORUM read + QUORUM write
☞ available for read/write even with 1+ replica down
![Page 41: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/41.jpg)
Q & A
![Page 42: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/42.jpg)
Read & Write Path!
![Page 43: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/43.jpg)
Cassandra Write Path!
[Diagram, step 1: commit logs (Commit log1, Commit log2, …, Commit logn) next to the Memory area]
![Page 44: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/44.jpg)
Cassandra Write Path!
[Diagram, steps 1 and 2: commit logs (Commit log1 … Commit logn); Memory holds MemTable Table1, MemTable Table2, …, MemTable TableN]
![Page 45: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/45.jpg)
Cassandra Write Path!
[Diagram, step 3: SStable1, SStable2 and SStable3 for Table1, Table2 and Table3; commit logs (Commit log1 … Commit logn); Memory]
![Page 46: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/46.jpg)
Cassandra Write Path!
[Diagram: Memory holds MemTable Table1, MemTable Table2, …, MemTable TableN again; SStable1, SStable2 and SStable3 for Table1, Table2 and Table3; commit logs (Commit log1 … Commit logn)]
![Page 47: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/47.jpg)
Cassandra Write Path!
[Diagram: a second set of SStable1, SStable2, SStable3 … next to the existing SSTables of Table1, Table2 and Table3; commit logs (Commit log1 … Commit logn); Memory]
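The three numbered steps of the write path (commit log, MemTable, flush to SSTable) can be sketched as a toy model. Everything here is a simplification under stated assumptions: one table, in-memory lists standing in for disk files, a flush triggered by a hypothetical `memtable_limit`, and a commit log that is simply cleared after a flush.

```python
class Node:
    """Toy single-table node illustrating the order of operations on a write."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only, stands in for the on-disk commit log
        self.memtable = {}     # in-memory, mutable
        self.sstables = []     # flushed files, never modified after creation
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # step 1: durable append
        self.memtable[key] = value            # step 2: update the MemTable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # step 3: flush to a new SSTable

    def flush(self):
        # An SSTable is written sorted by key and then left untouched.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs to be replayed


node = Node()
node.write("jdoe", 33)
node.write("hsue", 28)
print(node.sstables)  # [{'hsue': 28, 'jdoe': 33}]
```

Each flush produces a new SSTable instead of rewriting an old one, which is why several SSTables accumulate per table.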
![Page 48: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/48.jpg)
Cassandra Read Path!
SELECT .. FROM … WHERE #partition =…;
[Diagram: memory split into JVM Heap and Off Heap; step 1 is the Row Cache; SStable1, SStable2 and SStable3 on disk]
![Page 49: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/49.jpg)
Cassandra Read Path!
SELECT .. FROM … WHERE #partition =…;
[Diagram: step 1 Row Cache; step 2 one Bloom Filter per SSTable (SStable1, SStable2, SStable3), each still undecided (? ? ?); memory split into JVM Heap and Off Heap]
![Page 50: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/50.jpg)
Bloom filters in action!
Bit array: 1 0 0 1* 0 0 1 0 1 1
Hash functions: h1 h2 h3
Write #partition = foo
Write #partition = bar
Read #partition = qux
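A Bloom filter like the one pictured can be sketched in a few lines. This is a generic illustration, not Cassandra's implementation: the three hash functions are simulated by salting SHA-256 (an assumption), and the class and method names are hypothetical.

```python
import hashlib


class BloomFilter:
    def __init__(self, size=10, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = [0] * size

    def _positions(self, key):
        # Simulate h1, h2, h3 by salting one hash function with an index.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "maybe present".
        return all(self.bits[pos] for pos in self._positions(key))


bf = BloomFilter()
bf.add("foo")
bf.add("bar")
print(bf.bits, bf.might_contain("foo"), bf.might_contain("qux"))
```

A key that was written always answers "maybe present". A key that was never written usually answers "absent", but can collide with bits set by other keys, which is the false-positive case; either way the filter lets a read skip SSTables that cannot contain the partition.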
![Page 51: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/51.jpg)
Cassandra Read Path!
SELECT .. FROM … WHERE #partition =…;
[Diagram: step 1 Row Cache; step 2 Bloom Filters answer × ✓ × for the three SSTables (SStable1, SStable2, SStable3); step 3 Partition Key Cache; memory split into JVM Heap and Off Heap]
![Page 52: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/52.jpg)
Partition Key Cache!
SStable: #partition001 data data …. #partition002 … #partition350 data data …
Partition Key Cache:
#Partition      Offset      …
#partition001   0x0         …
#partition002   0x153       …
#partition350   0x5464321   …
![Page 53: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/53.jpg)
Cassandra Read Path!
SELECT .. FROM … WHERE #partition =…;
[Diagram: step 1 Row Cache; step 2 Bloom Filters; step 3 Partition Key Cache; step 4 Key Index Sample; only SStable2 remains to be read; memory split into JVM Heap and Off Heap]
![Page 54: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/54.jpg)
Key Index Sample!
SStable: #partition001 data data …. #partition128 … #partition350 data data …
Key Index Sample:
Sample          Offset
#partition001   0x0
#partition128   0x4500
#partition256   0x851513
#partition512   0x5464321
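The sampled index narrows a lookup to a small region: find the last sampled key that is not greater than the requested key, then scan forward from its offset. The sketch below uses the values from the table; it assumes keys are ordered as plain strings (a simplification) and `scan_start_offset` is a hypothetical helper.

```python
from bisect import bisect_right

# (sampled partition key, offset) pairs from the slide, sorted by key.
SAMPLE = [
    ("#partition001", 0x0),
    ("#partition128", 0x4500),
    ("#partition256", 0x851513),
    ("#partition512", 0x5464321),
]
SAMPLE_KEYS = [key for key, _ in SAMPLE]


def scan_start_offset(partition_key):
    """Offset of the closest sampled key <= partition_key, or None if there is none."""
    i = bisect_right(SAMPLE_KEYS, partition_key) - 1
    return SAMPLE[i][1] if i >= 0 else None


print(hex(scan_start_offset("#partition350")))  # 0x851513
```

`#partition350` is not in the sample, so the scan starts at the entry for `#partition256` and moves forward from there.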
![Page 55: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/55.jpg)
Cassandra Read Path!
SELECT .. FROM … WHERE #partition =…;
[Diagram: step 1 Row Cache; step 2 Bloom Filters; step 3 Partition Key Cache; step 4 Key Index Sample; step 5 Compression Offset; SStable2; memory split into JVM Heap and Off Heap]
![Page 56: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/56.jpg)
Compression Offset!
Normal offset   Compressed offset
0x0             0x0
0x4500          0x100
0x851513        0x1353
0x5464321       0x21245
![Page 57: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/57.jpg)
Cassandra Read Path!
SELECT .. FROM … WHERE #partition =…;
[Diagram: step 1 Row Cache; step 2 Bloom Filters; step 3 Partition Key Cache; step 4 Key Index Sample; step 5 Compression Offset; step 6 SStable2; step 7 MemTable; results are merged (∑); memory split into JVM Heap and Off Heap]
![Page 58: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/58.jpg)
Data model!
Last Write Win!
CQL basics!
From SQL to CQL!
![Page 59: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/59.jpg)
Last Write Win (LWW)!
INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
#partition jdoe: age = 33, name = John DOE
![Page 60: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/60.jpg)
Last Write Win (LWW)!
INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
jdoe: age (t1) = 33, name (t1) = John DOE
t1 = auto-generated timestamp (μs)
![Page 61: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/61.jpg)
Last Write Win (LWW)!
UPDATE users SET age = 34 WHERE login = 'jdoe';
SSTable1: jdoe: age (t1) = 33, name (t1) = John DOE
SSTable2: jdoe: age (t2) = 34
![Page 62: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/62.jpg)
Last Write Win (LWW)!
DELETE age FROM users WHERE login = 'jdoe';
SSTable1: jdoe: age (t1) = 33, name (t1) = John DOE
SSTable2: jdoe: age (t2) = 34
SSTable3: jdoe: age (t3) = ✕ (tombstone)
![Page 63: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/63.jpg)
Last Write Win (LWW)!
SELECT age FROM users WHERE login = 'jdoe';
Which value wins? ? ? ?
SSTable1: jdoe: age (t1) = 33, name (t1) = John DOE
SSTable2: jdoe: age (t2) = 34
SSTable3: jdoe: age (t3) = ✕ (tombstone)
![Page 64: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/64.jpg)
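Last Write Win (LWW)!
SELECT age FROM users WHERE login = 'jdoe';
✓ the most recent write wins: age (t3), the tombstone
✕ older values are ignored: age (t1), age (t2)
SSTable1: jdoe: age (t1) = 33, name (t1) = John DOE
SSTable2: jdoe: age (t2) = 34
SSTable3: jdoe: age (t3) = ✕ (tombstone)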
![Page 65: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/65.jpg)
Compaction!
SSTable1: jdoe: age (t1) = 33, name (t1) = John DOE
SSTable2: jdoe: age (t2) = 34
SSTable3: jdoe: age (t3) = ✕ (tombstone)
New SSTable: jdoe: age (t3) = ✕ (tombstone), name (t1) = John DOE
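Last Write Win and compaction can be sketched with the three SSTables from the slides. This is a toy model with hypothetical names: an SSTable is a dict of cells `(timestamp, value)`, the timestamps t1, t2, t3 are replaced by 1, 2, 3, and a tombstone is a sentinel object. Real compaction also purges old tombstones eventually, which is left out here.

```python
TOMBSTONE = object()  # marker written by DELETE

sstable1 = {"jdoe": {"age": (1, 33), "name": (1, "John DOE")}}
sstable2 = {"jdoe": {"age": (2, 34)}}
sstable3 = {"jdoe": {"age": (3, TOMBSTONE)}}


def compact(sstables):
    """Merge SSTables, keeping for each cell the value with the highest timestamp."""
    merged = {}
    for sstable in sstables:
        for partition, cells in sstable.items():
            row = merged.setdefault(partition, {})
            for column, (ts, value) in cells.items():
                if column not in row or ts > row[column][0]:
                    row[column] = (ts, value)
    return merged


def read(sstables, partition, column):
    """Return the live value of a cell, or None if absent or deleted."""
    cell = compact(sstables).get(partition, {}).get(column)
    if cell is None or cell[1] is TOMBSTONE:
        return None
    return cell[1]


print(read([sstable1, sstable2], "jdoe", "age"))            # 34
print(read([sstable1, sstable2, sstable3], "jdoe", "age"))  # None
```

The order in which SSTables are merged does not matter, because only the timestamps decide which cell survives.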
![Page 66: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/66.jpg)
CRUD operations!
INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
UPDATE users SET age = 34 WHERE login = 'jdoe';
DELETE age FROM users WHERE login = 'jdoe';
SELECT age FROM users WHERE login = 'jdoe';
![Page 67: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/67.jpg)
DDL Statements!
Keyspace (tables container)
• CREATE/ALTER/DROP
Table
• CREATE/ALTER/DROP
Type (custom type)
• CREATE/ALTER/DROP
![Page 68: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/68.jpg)
User management & permissions!
User
• CREATE/ALTER/DROP
Permissions
• GRANT <xxx> PERMISSION ON <resource> TO <user_name>
• REVOKE <xxx> PERMISSION ON <resource> FROM <user_name>
![Page 69: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/69.jpg)
Simple Table!
CREATE TABLE users (
  login text,
  name text,
  age int,
  …
  PRIMARY KEY(login));
login = partition key (#partition)
![Page 70: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/70.jpg)
Clustered table!
CREATE TABLE mailbox (
  login text,
  message_id timeuuid,
  interlocutor text,
  message text,
  PRIMARY KEY((login), message_id));
login = partition key
message_id = clustering column
(login, message_id) = uniqueness
![Page 71: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/71.jpg)
Queries!
Get message by user and message_id (date)
SELECT * FROM mailbox WHERE login = 'jdoe' and message_id = '2014-09-25 16:00:00';
Get message by user and date interval
SELECT * FROM mailbox WHERE login = 'jdoe' and message_id <= '2014-09-25 16:00:00' and message_id >= '2014-09-20 16:00:00';
![Page 72: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/72.jpg)
Queries!
Get message by message_id only (#partition not provided)
SELECT * FROM mailbox WHERE message_id = '2014-09-25 16:00:00';
Get message by date interval (#partition not provided)
SELECT * FROM mailbox WHERE message_id <= '2014-09-25 16:00:00' and message_id >= '2014-09-20 16:00:00';
![Page 73: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/73.jpg)
Queries!
Get message by user range (range query on #partition)
SELECT * FROM mailbox WHERE login >= 'hsue' and login <= 'jdoe';
Get message by user pattern (non exact match on #partition)
SELECT * FROM mailbox WHERE login like '%doe%';
![Page 74: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/74.jpg)
WHERE clause restrictions!
All queries (INSERT/UPDATE/DELETE/SELECT) must provide #partition
Only exact match (=) on #partition, range queries (<, ≤, >, ≥) not allowed
• ☞ full cluster scan
On clustering columns, only range queries (<, ≤, >, ≥) and exact match
WHERE clause only possible on columns defined in PRIMARY KEY
![Page 75: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/75.jpg)
Collections & maps!
CREATE TABLE users (
  login text,
  name text,
  age int,
  friends set<text>,
  hobbies list<text>,
  languages map<int, text>,
  …
  PRIMARY KEY(login));
![Page 76: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/76.jpg)
Collections & maps!
All elements loaded at each access
• ☞ low cardinality only (≤ 1000 elements)
Should not be used for 1-N, N-N relations
![Page 77: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/77.jpg)
User Defined Type (UDT)!
Instead of
CREATE TABLE users (
  login text,
  …
  street_number int,
  street_name text,
  postcode int,
  country text,
  …
  PRIMARY KEY(login));
![Page 78: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/78.jpg)
User Defined Type (UDT)!
CREATE TYPE address (
  street_number int,
  street_name text,
  postcode int,
  country text);
CREATE TABLE users (
  login text,
  …
  location frozen<address>,
  …
  PRIMARY KEY(login));
![Page 79: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/79.jpg)
UDT in action!
INSERT INTO users(login, name, location) VALUES (
  'jdoe',
  'John DOE',
  {
    street_number: 124,
    street_name: 'Congress Avenue',
    postcode: 95054,
    country: 'USA'
  });
![Page 80: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/80.jpg)
![Page 81: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/81.jpg)
From SQL to CQL!
Remember…
CQL is not SQL
![Page 82: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/82.jpg)
From SQL to CQL!
Remember…
there is no join
(do you want to scale ?)
![Page 83: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/83.jpg)
From SQL to CQL!
Remember…
there is no integrity constraint
(do you want to read-before-write ?)
![Page 84: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/84.jpg)
From SQL to CQL!
1-1, N-1 : normalized
[Diagram: Comment (n) → User (1)]
CREATE TABLE comments (
  article_id uuid,
  comment_id timeuuid,
  author_id uuid, // typical join id
  content text,
  PRIMARY KEY((article_id), comment_id));
![Page 85: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/85.jpg)
From SQL to CQL!
1-1, N-1 : de-normalized
[Diagram: Comment (n) → User (1)]
CREATE TABLE comments (
  article_id uuid,
  comment_id timeuuid,
  author frozen<person>, // person is UDT
  content text,
  PRIMARY KEY((article_id), comment_id));
![Page 86: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/86.jpg)
From SQL to CQL!
N-1, N-N : normalized
[Diagram: User (n) ↔ User (n)]
CREATE TABLE friends(
  login text,
  friend_login text, // typical join id
  PRIMARY KEY((login), friend_login));
![Page 87: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/87.jpg)
From SQL to CQL!
N-1, N-N : de-normalized
[Diagram: User (n) ↔ User (n)]
CREATE TABLE friends(
  login text,
  friend_login text,
  friend frozen<person>, // person is UDT
  PRIMARY KEY((login), friend_login));
![Page 88: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/88.jpg)
![Page 89: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/89.jpg)
Data modeling best practices!
Start by queries
• identify core functional read paths
• 1 read scenario ≈ 1 SELECT
Denormalize
• wisely, only duplicate necessary & immutable data
• functional/technical trade-off
![Page 90: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/90.jpg)
Data modeling best practices!
Person UDT
- firstname/lastname
- date of birth
- sex
- mood
- location
![Page 91: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/91.jpg)
Data modeling best practices!
John DOE, male
birthdate: 21/02/1981
subscribed since 03/06/2011
☉ San Mateo, CA
"Impossible is not John DOE"
Full detail read from User table on click
![Page 92: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/92.jpg)
Q & A
![Page 93: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/93.jpg)
Failure Handling!
![Page 94: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/94.jpg)
Failure Handling!
Failure occurs when
• node hardware failure (disk, …)
• network issues
• heavy load, node flapping
• JVM long GCs
![Page 95: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/95.jpg)
Hinted Handoff!
When 1 replica down
• coordinator stores Hints
[Diagram: on the 8-node ring (n1 to n8), one of replicas 1, 2 and 3 is down (×); the coordinator keeps a Hint for it]
![Page 96: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/96.jpg)
Hinted Handoff!
When replica up
• coordinator forwards stored Hints
[Diagram: on the 8-node ring (n1 to n8), the coordinator sends the stored Hints to the replica that came back]
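Hinted handoff can be sketched as a coordinator that remembers the writes a down replica missed and replays them once it is back. This is a toy model with hypothetical names (`Coordinator`, `mark_up`); replicas are plain dicts, and details such as hint expiry are left out.

```python
class Coordinator:
    def __init__(self, replicas):
        self.replicas = replicas                 # replica name -> key/value store
        self.up = {name: True for name in replicas}
        self.hints = []                          # (replica name, key, value)

    def write(self, key, value):
        """Write to every live replica, store a hint for each one that is down."""
        acks = 0
        for name, store in self.replicas.items():
            if self.up[name]:
                store[key] = value
                acks += 1
            else:
                self.hints.append((name, key, value))
        return acks

    def mark_up(self, name):
        """Replica is back: forward its hints, keep the hints of other replicas."""
        self.up[name] = True
        remaining = []
        for target, key, value in self.hints:
            if target == name:
                self.replicas[name][key] = value
            else:
                remaining.append((target, key, value))
        self.hints = remaining


coordinator = Coordinator({"n2": {}, "n3": {}, "n4": {}})
coordinator.up["n4"] = False
print(coordinator.write("jdoe", 34))  # 2 acks, enough for QUORUM at RF = 3
coordinator.mark_up("n4")
print(coordinator.replicas["n4"])     # {'jdoe': 34}
```

The write still succeeds at ONE or QUORUM while n4 is down, and n4 catches up without waiting for a read or a manual repair.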
![Page 97: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/97.jpg)
Consistent read!
Read at QUORUM or more
• read from the fastest node
• AND request digests from other replicas to reach QUORUM
• return most up-to-date data to client
• repair if digest mismatch
[Diagram: the coordinator reads from replicas 1, 2 and 3 on the 8-node ring]
![Page 98: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/98.jpg)
Read Repair!
Read at CL < QUORUM
• read from one least-loaded node
• every x reads, request digests from other replicas
• compare digest & repair asynchronously
• read_repair_chance (10%)
[Diagram: the coordinator reads from replicas 1, 2 and 3 on the 8-node ring]
![Page 99: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/99.jpg)
Manual repair!
Should be part of cluster operations
• scheduled
• I/O intensive
Use incremental repair (new in 2.1)
![Page 100: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/100.jpg)
Q & A
![Page 101: Cassandra introduction apache con 2014 budapest](https://reader033.vdocument.in/reader033/viewer/2022052908/55945a0f1a28ab51728b45fd/html5/thumbnails/101.jpg)
Training Day | December 3rd
Beginner Track
• Introduction to Cassandra
• Introduction to Spark, Shark, Scala and Cassandra
Advanced Track
• Data Modeling
• Performance Tuning
Conference Day | December 4th
Cassandra Summit Europe 2014 will be the single largest gathering of Cassandra users in Europe. Learn how the world's most successful companies are transforming their businesses and growing faster than ever using Apache Cassandra.
http://bit.ly/cassandrasummit2014