@doanduyhai Introduction to Cassandra DuyHai DOAN, Technical Advocate

Post on 20-Apr-2020

Page 1: Introduction to Cassandra - GitHub… · @doanduyhai Datastax • Founded in April 2010 • We contribute a lot to Apache Cassandra™ • 400+ customers (25 of the Fortune 100),

@doanduyhai

Introduction to Cassandra DuyHai DOAN, Technical Advocate

Agenda

Architecture
•  cluster
•  replication

Data model
•  last write wins (LWW)
•  CQL basics (CRUD, DDL, collections, clustering column)
•  lightweight transactions

Who Am I?

Duy Hai DOAN
Cassandra technical advocate
•  talks, meetups, confs
•  open-source devs (Achilles, …)
•  OSS Cassandra point of contact

[email protected] ☞ @doanduyhai

Datastax

•  Founded in April 2010
•  We contribute a lot to Apache Cassandra™
•  400+ customers (25 of the Fortune 100), 200+ employees
•  Headquarters in the San Francisco Bay Area
•  EU headquarters in London, offices in France and Germany
•  Datastax Enterprise = OSS Cassandra + extra features

Cassandra 5 key facts

Key fact 1: linear scalability

[Figure: cluster sizes ranging from NetcoSports (3 nodes, ≈3GB) to 1k+ nodes (PB+); label: YOU]

Cassandra 5 key facts

Key fact 2: continuous availability (≈100% up-time)
•  resilient architecture (Dynamo)

Cassandra 5 key facts

Key fact 3: multi-data centers
•  out-of-the-box (config only)
•  AWS conf for multi-region DCs
•  GCE/CloudStack support
•  Microsoft Azure

Cassandra 5 key facts

Key fact 4: operational simplicity
•  1 node = 1 process + 2 config files (main + IP)
•  deployment automation
•  OpsCenter for monitoring

Cassandra 5 key facts

Cassandra architecture

Cluster
Replication

Cassandra architecture

Cluster layer
•  Amazon Dynamo paper
•  masterless architecture

Data-store layer
•  Google Bigtable paper
•  columns/column families

Data distribution

Random: hash of #partition → token = hash(#p)
Hash: ]-X, X]
X = huge number (2^64/2)

[Figure: ring of 8 nodes, n1 to n8]

Token Ranges

A: ]0, X/8]
B: ]X/8, 2X/8]
C: ]2X/8, 3X/8]
D: ]3X/8, 4X/8]
E: ]4X/8, 5X/8]
F: ]5X/8, 6X/8]
G: ]6X/8, 7X/8]
H: ]7X/8, X]

[Figure: ring of 8 nodes, n1 to n8, with token ranges A to H placed around the ring]
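
The mapping from a partition key to a token range can be sketched in a few lines of Python. This is only an illustration under stated assumptions: MD5 stands in for the cluster's real partitioner, tokens are confined to ]0, X] as on the Token Ranges slide, and the pairing of range A with n1, B with n2, and so on is assumed from the figure. The names `token` and `owner` are hypothetical.

```python
import hashlib

X = 2**64 // 2        # "huge number" from the slides
RANGES = "ABCDEFGH"   # eight equal token ranges over ]0, X]


def token(partition_key: str) -> int:
    """Hash a partition key to a token in ]0, X].

    MD5 is only a stand-in here (assumption), not the partitioner a real
    cluster would use.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % X + 1


def owner(tok: int) -> tuple:
    """Return (range letter, node name) for a token.

    Range number k (0-based) covers ]k*X/8, (k+1)*X/8]; range A is assumed
    to belong to n1, B to n2, and so on.
    """
    index = (tok - 1) * 8 // X
    return RANGES[index], f"n{index + 1}"


# Example: find where each row of a table would live.
placement = {key: owner(token(key)) for key in ("user_id1", "user_id2", "user_id3")}
```

Because the token depends only on the partition key, any node can compute where a row lives without asking a master.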

Distributed Table

[Figure: ring of 8 nodes, n1 to n8, with token ranges A to H; rows user_id1 to user_id5 are spread across the ring]


Linear scalability

[Figure: ring of 8 nodes, n1 to n8]
8 nodes

Linear scalability

[Figure: ring of 10 nodes, n1 to n10]
10 nodes

Consistency

Tunable at runtime
•  ONE
•  QUORUM (strict majority w.r.t. RF)
•  ALL

Applies to both reads and writes

Consistency in action

RF = 3, Write ONE, Read ONE
•  Write ONE: B
•  replicas: B A A (data replication in progress …)
•  Read ONE: A

Consistency in action

RF = 3, Write ONE, Read QUORUM
•  Write ONE: B
•  replicas: B A A (data replication in progress …)
•  Read QUORUM: A

Consistency in action

RF = 3, Write ONE, Read ALL
•  Write ONE: B
•  replicas: B A A (data replication in progress …)
•  Read ALL: B

Consistency in action

RF = 3, Write QUORUM, Read ONE
•  Write QUORUM: B
•  replicas: B B A (data replication in progress …)
•  Read ONE: A

Consistency in action

RF = 3, Write QUORUM, Read QUORUM
•  Write QUORUM: B
•  replicas: B B A (data replication in progress …)
•  Read QUORUM: B
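
The five scenarios above follow one rule of thumb: a read is guaranteed to see the latest write when the replicas contacted by the read must overlap those that acknowledged the write, that is when W + R > RF. A minimal Python sketch of that check (the function names are hypothetical, not part of any Cassandra API):

```python
def replicas_required(level: str, rf: int) -> int:
    """Number of replicas that must answer for a consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1   # strict majority of the replication factor
    if level == "ALL":
        return rf
    raise ValueError(f"unknown consistency level: {level}")


def read_sees_latest_write(write_level: str, read_level: str, rf: int) -> bool:
    """True when every read set overlaps every write set (W + R > RF)."""
    w = replicas_required(write_level, rf)
    r = replicas_required(read_level, rf)
    return w + r > rf


# The five (write, read) combinations from the slides, with RF = 3.
SCENARIOS = [("ONE", "ONE"), ("ONE", "QUORUM"), ("ONE", "ALL"),
             ("QUORUM", "ONE"), ("QUORUM", "QUORUM")]
results = {(w, r): read_sees_latest_write(w, r, rf=3) for w, r in SCENARIOS}
```

With RF = 3 only Write ONE + Read ALL and Write QUORUM + Read QUORUM pass the check, which matches the scenarios where the read returns B.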

Consistency trade-off

Consistency level

ONE
•  fast, may not read latest written value

QUORUM
•  strict majority w.r.t. replication factor
•  good balance

ALL
•  paranoid
•  slow, no high availability

Consistency summary

ONE read + ONE write
☞ available for read/write even with (N-1) replicas down

QUORUM read + QUORUM write
☞ available for read/write even with 1+ replicas down
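
The availability figures in this summary are simple arithmetic on the replication factor. A small sketch, assuming N in the summary means the replication factor (the helper name `max_replicas_down` is hypothetical):

```python
def max_replicas_down(level: str, rf: int) -> int:
    """How many replicas of a partition may be down while the level is still reachable."""
    required = {
        "ONE": 1,
        "QUORUM": rf // 2 + 1,   # strict majority
        "ALL": rf,
    }[level]
    return rf - required


# Tolerated failures per level for two common replication factors.
tolerance = {rf: {level: max_replicas_down(level, rf)
                  for level in ("ONE", "QUORUM", "ALL")}
             for rf in (3, 5)}
```

For RF = 3 this gives 2 for ONE (that is N-1), 1 for QUORUM, and 0 for ALL, which is why ALL offers no high availability.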

Q & A

Data model

Last Write Wins
CQL basics
Clustered tables
Lightweight transactions

Last Write Wins (LWW)

INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);

jdoe (#partition) → age: 33, name: John DOE

Last Write Wins (LWW)

INSERT INTO users(login, name, age) VALUES('jdoe', '…

jdoe → age (t1): 33, name (t1): John DOE

Last Write Wins (LWW)

UPDATE users SET age = 34 WHERE login = 'jdoe';

SSTable1: jdoe → age (t1): 33, name (t1): John DOE
SSTable2: jdoe → age (t2): 34

Last Write Wins (LWW)

DELETE age FROM users WHERE login = 'jdoe';

SSTable1: jdoe → age (t1): 33, name (t1): John DOE
SSTable2: jdoe → age (t2): 34
SSTable3: jdoe → age (t3): tombstone

Last Write Wins (LWW)

SELECT age FROM users WHERE login = 'jdoe';

SSTable1: jdoe → age (t1): 33, name (t1): John DOE   ?
SSTable2: jdoe → age (t2): 34   ?
SSTable3: jdoe → age (t3): tombstone   ?

Last Write Wins (LWW)

SELECT age FROM users WHERE login = 'jdoe';

SSTable1: jdoe → age (t1): 33, name (t1): John DOE   ✕
SSTable2: jdoe → age (t2): 34   ✕
SSTable3: jdoe → age (t3): tombstone   ✓ (latest timestamp wins)
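
The read path above boils down to "pick the cell with the highest timestamp". A minimal Python sketch of that rule, with `None` standing in for a tombstone and plain integers for the timestamps t1 to t3 (both are simplifications, and `Cell` and `read_column` are hypothetical names):

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    value: Optional[int]   # None marks a tombstone (deleted column)
    timestamp: int         # write timestamp; larger means more recent


def read_column(versions):
    """Return the value of the most recent cell, or None if it is a tombstone."""
    if not versions:
        return None
    latest = max(versions, key=lambda cell: cell.timestamp)
    return latest.value


# The three versions of jdoe's age, one per SSTable.
age_in_sstable1 = Cell(33, 1)     # INSERT at t1
age_in_sstable2 = Cell(34, 2)     # UPDATE at t2
age_in_sstable3 = Cell(None, 3)   # DELETE at t3 (tombstone)

age = read_column([age_in_sstable1, age_in_sstable2, age_in_sstable3])
```

Here `age` ends up as `None`: the tombstone written at t3 hides both earlier values.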

Compaction

SSTable1: jdoe → age (t1): 33, name (t1): John DOE
SSTable2: jdoe → age (t2): 34
SSTable3: jdoe → age (t3): tombstone

New SSTable: jdoe → age (t3): tombstone, name (t1): John DOE
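
Compaction applies the same last-write-wins rule across whole SSTables. A rough Python sketch, modeling each SSTable as a dict from (partition, column) to (value, timestamp); this representation and the `compact` helper are illustrative assumptions, not the real on-disk format:

```python
TOMBSTONE = None   # marker for a deleted column


def compact(sstables):
    """Merge SSTables, keeping only the newest cell per (partition, column)."""
    merged = {}
    for sstable in sstables:
        for key, (value, timestamp) in sstable.items():
            if key not in merged or timestamp > merged[key][1]:
                merged[key] = (value, timestamp)
    return merged


sstable1 = {("jdoe", "age"): (33, 1), ("jdoe", "name"): ("John DOE", 1)}
sstable2 = {("jdoe", "age"): (34, 2)}
sstable3 = {("jdoe", "age"): (TOMBSTONE, 3)}

new_sstable = compact([sstable1, sstable2, sstable3])
```

As on the slide, the new SSTable keeps the tombstone for age (t3) and the name written at t1; the obsolete values 33 and 34 are gone.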

Historical data

history
SSTable1: id → date1 (t1), date2 (t2), …, date9 (t9)
SSTable2: id → date10 (t10), date11 (t11), …

You want to keep data history?
•  do not use the internally generated timestamp!
•  ☞ time-series data modeling

CRUD operations

INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);

UPDATE users SET age = 34 WHERE login = 'jdoe';

DELETE age FROM users WHERE login = 'jdoe';

SELECT age FROM users WHERE login = 'jdoe';

Simple Table

CREATE TABLE users (
    login text,
    name text,
    age int,
    …
    PRIMARY KEY(login));

login: partition key (#partition)

Clustered table (1 – N)

CREATE TABLE mailbox (
    login text,
    message_id timeuuid,
    interlocutor text,
    message text,
    PRIMARY KEY((login), message_id));

•  login: partition key
•  message_id: clustering column (sorted)
•  (login, message_id): uniqueness

On disk layout (SSTable1, SSTable2)

jdoe → message_id1, message_id2, …, message_id104
hsue → message_id1, message_id2, …, message_id78
jdoe → message_id105, message_id106, …, message_id169

Queries

Get message by user and message_id (date)
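
Why this query is cheap can be seen in a toy model of the mailbox table: the partition key selects one partition, and inside it rows stay sorted by clustering column, so both exact lookups and range slices are binary searches. The sketch below is an in-memory illustration only; the `Mailbox` class is hypothetical and plain integers stand in for timeuuid values.

```python
import bisect
from collections import defaultdict


class Mailbox:
    """Toy model: login (partition key) -> rows sorted by message_id (clustering column)."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def insert(self, login, message_id, message):
        rows = self._partitions[login]
        ids = [mid for mid, _ in rows]
        pos = bisect.bisect_left(ids, message_id)
        if pos < len(rows) and rows[pos][0] == message_id:
            rows[pos] = (message_id, message)        # same primary key: overwrite
        else:
            rows.insert(pos, (message_id, message))  # keep the partition sorted

    def get(self, login, message_id):
        """Exact match on partition key and clustering column."""
        rows = self._partitions.get(login, [])
        ids = [mid for mid, _ in rows]
        pos = bisect.bisect_left(ids, message_id)
        if pos < len(rows) and rows[pos][0] == message_id:
            return rows[pos][1]
        return None

    def slice_by_id(self, login, start, end):
        """Range query on the clustering column, inside a single partition."""
        rows = self._partitions.get(login, [])
        ids = [mid for mid, _ in rows]
        return rows[bisect.bisect_left(ids, start):bisect.bisect_right(ids, end)]


mailbox = Mailbox()
for message_id, text in [(3, "c"), (1, "a"), (2, "b")]:
    mailbox.insert("jdoe", message_id, text)
mailbox.insert("hsue", 1, "x")
```

Every method takes `login` first: without the partition key there is no partition to search in.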

Queries

Get message by message_id only (#partition not provided)

SELECT * FROM mailbox WHERE message_id = '2014-09-25 16:00:00';

Get message by date interval (#partition not provided)

SELECT * FROM mailbox WHERE message_id <= '2014-09-25 16:00:00' AND message_id >= '2014-09-20 16:00:00';

Without #partition

[Figure: ring of 8 nodes, each marked with "?"]

No #partition ☞ no token ☞ where are my data?

The importance of #partition

In RDBMS, no primary key ☞ full table scan 😭

The importance of #partition

With Cassandra, no partition key ☞ full CLUSTER scan 😱

Queries

Get message by user range (range query on #partition)

SELECT * FROM mailbox WHERE login >= 'hsue' AND login <= 'jdoe';

Get message by user pattern (non exact match on #partition)

SELECT * FROM mailbox WHERE login LIKE '%doe%';

Clustering order

CREATE TABLE mailbox (
    login text,
    message_id timeuuid,
    interlocutor text,
    message text,
    PRIMARY KEY((login), message_id))
WITH CLUSTERING ORDER BY (message_id DESC);

Reverse on disk layout

SSTable1
jdoe → message_id169, message_id168, …, message_id105

SSTable2
jdoe → message_id104, message_id103, …, message_id1

WHERE clause restrictions

All queries (INSERT/UPDATE/DELETE/SELECT) must provide #partition

Only exact match (=) on #partition, range queries (<, ≤, >, ≥) not allowed
•  ☞ full cluster scan

On clustering columns, only range queries (<, ≤, >, ≥) and exact match

WHERE clause only possible •  on columns defined
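
The first three restrictions can be expressed as a small validator. This is a sketch of the rules as stated on the slide, not of Cassandra's actual query validation; `check_where` and its input shape (a list of column and operator pairs) are hypothetical.

```python
RANGE_OPS = {"<", "<=", ">", ">="}


def check_where(restrictions, partition_key, clustering_columns):
    """Return a list of rule violations; an empty list means the WHERE clause is fine.

    restrictions: list of (column, operator) pairs, e.g. [("login", "=")].
    """
    problems = []
    partition_ops = [op for column, op in restrictions if column == partition_key]
    if not partition_ops:
        problems.append("#partition not provided")
    for op in partition_ops:
        if op != "=":
            problems.append(f"'{op}' not allowed on #partition, only '='")
    for column, op in restrictions:
        if column in clustering_columns and op != "=" and op not in RANGE_OPS:
            problems.append(f"'{op}' not allowed on clustering column {column}")
    return problems


# Exact match on #partition plus a range on the clustering column: accepted.
ok = check_where([("login", "="), ("message_id", ">="), ("message_id", "<=")],
                 partition_key="login", clustering_columns=["message_id"])

# Clustering column only: rejected, the partition key is missing.
missing = check_where([("message_id", "=")],
                      partition_key="login", clustering_columns=["message_id"])
```

Only the three rules that are fully stated above are covered.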

WHERE clause restrictions

What if I want to perform an « arbitrary » WHERE clause?
•  search form scenario, dynamic search fields

DO NOT RE-INVENT THE WHEEL!
☞ Apache Solr (Lucene) integration (Datastax Enterprise)
☞ Same JVM, 1-cluster-2-products (Solr & Cassandra)

SELECT * FROM users WHERE solr_query = 'age:[33 TO *] AND gender:male';

SELECT * FROM users WHERE solr_query = 'lastname:*schwei?er';

Collections & maps

CREATE TABLE users (
    login text,
    name text,
    age int,
    friends set<text>,
    hobbies list<text>,
    languages map<int, text>,
    …
    PRIMARY KEY(login));

Keep cardinality low! (≈ 1000)

From SQL to CQL

Normalized

[Figure: Comment (n) to User (1) relationship]

CREATE TABLE comments (
    article_id uuid,
    comment_id timeuuid,
    author_id text, // typical join id
    content text,
    PRIMARY KEY((article_id), comment_id));

From SQL to CQL

De-normalized

[Figure: Comment (n) to User (1) relationship]

CREATE TABLE comments (
    article_id uuid,
    comment_id timeuuid,
    author_json text, // de-normalize
    content text,
    PRIMARY KEY((article_id), comment_id));

Data modeling best practices

Start by queries
•  identify core functional read paths
•  1 read scenario ≈ 1 SELECT

Denormalize
•  wisely, only duplicate necessary & immutable data
•  functional/technical trade-off

Data modeling best practices

Person JSON
•  firstname/lastname
•  date of birth
•  gender
•  mood
•  location

Data modeling best practices

John DOE, male
birthdate: 21/02/1981
subscribed since 03/06/2011
☉ San Mateo, CA

''Impossible is not John DOE''

Full detail read from User table on click
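
One way to build the de-normalized `author_json` value is to copy only the fields that never change. The sketch below assumes that name, date of birth and gender count as immutable while mood and location do not; that split, the field names and the `author_json` helper are all illustrative assumptions.

```python
import json

# Assumption: these fields never change, so duplicating them is safe.
IMMUTABLE_FIELDS = ("firstname", "lastname", "date_of_birth", "gender")


def author_json(user: dict) -> str:
    """Serialize the immutable part of a user for storage next to each comment."""
    snapshot = {field: user[field] for field in IMMUTABLE_FIELDS}
    return json.dumps(snapshot, sort_keys=True)


user = {
    "login": "jdoe",
    "firstname": "John",
    "lastname": "DOE",
    "date_of_birth": "21/02/1981",
    "gender": "male",
    "mood": "Impossible is not John DOE",   # mutable: left in the User table
    "location": "San Mateo, CA",            # mutable: left in the User table
}

comment_author = author_json(user)
```

Mutable details stay in the User table and are fetched only when the full profile is opened, so a change of mood never requires rewriting old comments.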

Lightweight Transaction (LWT)

What?
☞ make operations linearizable

Why?
☞ solve a class of race conditions in Cassandra that would otherwise require installing an external lock manager

Lightweight Transaction (LWT)

Two clients check for the same account, both find nothing, and both insert:

SELECT * FROM account WHERE id = 'jdoe'; (0 rows)
SELECT * FROM account WHERE id = 'jdoe'; (0 rows)

INSERT INTO account (id, email) VALUES ('jdoe', '[email protected]');
INSERT INTO account (id, email) VALUES ('jdoe', '[email protected]'); ☞ winner

Lightweight Transaction (LWT)

How?
☞ implementing Paxos protocol on Cassandra

Syntax?

INSERT INTO account (id, email) VALUES ('jdoe', '[email protected]') IF NOT EXISTS;

UPDATE account SET email = '[email protected]' WHERE id = 'jdoe' IF email = '[email protected]';
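
The semantics of IF NOT EXISTS and IF column = value are those of compare-and-set: the check and the write happen as one atomic step, and the caller learns whether the write was applied. The Python sketch below imitates that contract with a local lock. It is not Paxos and not how Cassandra implements LWT; the `AccountTable` class and the example addresses are hypothetical.

```python
import threading


class AccountTable:
    """Single-process stand-in for a table with conditional writes."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()   # makes check + write one atomic step

    def insert_if_not_exists(self, account_id, email):
        """Like INSERT ... IF NOT EXISTS: returns True only if the row was created."""
        with self._lock:
            if account_id in self._rows:
                return False
            self._rows[account_id] = email
            return True

    def update_if(self, account_id, expected_email, new_email):
        """Like UPDATE ... IF email = expected: returns True only if applied."""
        with self._lock:
            if account_id not in self._rows or self._rows[account_id] != expected_email:
                return False
            self._rows[account_id] = new_email
            return True

    def get(self, account_id):
        return self._rows.get(account_id)


accounts = AccountTable()
first = accounts.insert_if_not_exists("jdoe", "first@example.com")
second = accounts.insert_if_not_exists("jdoe", "second@example.com")
```

Only the first insert is applied (`first` is True, `second` is False), so the race from the previous slide can no longer end with one client silently overwriting the other.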

Lightweight Transaction (LWT)

Recommendations
•  insert with LWT ☞ delete must use LWT

INSERT INTO my_table … IF NOT EXISTS ☞ DELETE FROM my_table … IF EXISTS

Lightweight Transaction (LWT)

Recommendations
•  LWT expensive (4 round-trips), do not abuse
•  only for 1% – 5% use cases

Lightweight Transaction (LWT)

[Figure: the 4 round-trips of an LWT, numbered 1 to 4; labels in the figure: Compare, Swap / Learn, Queue-in, Consensus]

Q & A

Thank You @doanduyhai

[email protected]

https://academy.datastax.com/