Cassandra 2.0 - introduction. use cases
DESCRIPTION
Patrick McFadin from DataStax talks about Cassandra at Big Data Guru Meetup
TRANSCRIPT
![Page 1: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/1.jpg)
©2013 DataStax Confidential. Do not distribute without consent.
@PatrickMcFadin
Patrick McFadin Chief Evangelist/Solution Architect - DataStax
Cassandra: Introduction
![Page 2: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/2.jpg)
Who I am
• Patrick McFadin
• Solution Architect at DataStax
• Cassandra MVP
• User for years
• Follow me for more:
I talk about Cassandra and building scalable, resilient apps ALL THE TIME!
@PatrickMcFadin
Dude. Uptime == $$
![Page 3: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/3.jpg)
Five Years of Cassandra
[Timeline figure: years 0 to 5, starting Jul-08. Releases marked: 0.1, 0.3, 0.6, 0.7, 1.0, 1.2..., 2.0, and DSE]
![Page 4: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/4.jpg)
Cassandra - An introduction
![Page 5: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/5.jpg)
Cassandra - Intro
• Based on the Amazon Dynamo and Google BigTable papers
• Shared nothing
• Data safe as possible
• Predictable scaling
Dynamo
BigTable
![Page 6: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/6.jpg)
Cassandra - More than one server
• All nodes participate in a cluster
• Shared nothing
• Add or remove as needed
• More capacity? Add a server
![Page 7: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/7.jpg)
Cassandra - Locally Distributed
• Client writes to any node
• Node coordinates with others
• Data replicated in parallel
• Replication factor: How many copies of your data?
• RF = 3 here
![Page 8: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/8.jpg)
Cassandra - Geographically Distributed
• Client writes local
• Data syncs across WAN
• Replication Factor per DC
![Page 9: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/9.jpg)
Cassandra - Consistency
• Consistency Level (CL)
• Client specifies per read or write
• ALL = All replicas ack
• QUORUM = A majority (more than 50%) of replicas ack
• LOCAL_QUORUM = A majority (more than 50%) of replicas in the local DC ack
• ONE = Only one replica acks
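The consistency levels above reduce to a count of replica acknowledgements. The following is a minimal sketch of that arithmetic, not driver code: the class `ConsistencyMath` and its method names are hypothetical, and it assumes a quorum is computed as floor(replicas / 2) + 1.

```java
// Hypothetical helper (not part of any driver): how many replica acks each
// consistency level waits for, assuming quorum = floor(replicas / 2) + 1.
class ConsistencyMath {
    enum Level { ONE, QUORUM, LOCAL_QUORUM, ALL }

    // Smallest number of replicas that is strictly more than half.
    static int quorum(int replicas) {
        return replicas / 2 + 1;
    }

    // totalReplicas: sum of the replication factors across all data centers.
    // localDcReplicas: replication factor of the data center the client writes to.
    static int requiredAcks(Level level, int totalReplicas, int localDcReplicas) {
        switch (level) {
            case ONE:
                return 1;
            case QUORUM:
                return quorum(totalReplicas);
            case LOCAL_QUORUM:
                return quorum(localDcReplicas);
            case ALL:
                return totalReplicas;
            default:
                throw new IllegalArgumentException("Unknown level: " + level);
        }
    }

    public static void main(String[] args) {
        // Example: two data centers with RF = 3 each, so 6 replicas in total.
        for (Level level : Level.values()) {
            System.out.println(level + " waits for " + requiredAcks(level, 6, 3) + " ack(s)");
        }
    }
}
```

With two data centers at RF = 3 each, QUORUM has to wait on replicas across the WAN while LOCAL_QUORUM only needs two local acks, which is why it is the usual choice for multi-DC clusters.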
![Page 10: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/10.jpg)
Cassandra - Transparent to the application
• A single node failure shouldn’t bring failure
• Replication Factor + Consistency Level = Success
• This example:
• RF = 3
• CL = QUORUM
More than 50% ack (2 of 3 replicas), so we are good!
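The "Replication Factor + Consistency Level = Success" rule can be checked with a few lines of arithmetic. This is an illustrative sketch only: `FailureTolerance` and its methods are made-up names, and it assumes a QUORUM request succeeds whenever the live replicas still form a majority.

```java
// Hypothetical sketch: how many replica failures a QUORUM request survives.
class FailureTolerance {
    // Majority of the replicas for one piece of data.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when enough replicas are still up to reach a quorum.
    static boolean quorumSucceeds(int replicationFactor, int replicasDown) {
        return replicationFactor - replicasDown >= quorum(replicationFactor);
    }

    // Number of replicas that can be down before QUORUM requests start failing.
    static int tolerableFailures(int replicationFactor) {
        return replicationFactor - quorum(replicationFactor);
    }

    public static void main(String[] args) {
        int rf = 3; // the value used in the slide's example
        System.out.println("RF=" + rf + " tolerates " + tolerableFailures(rf) + " replica(s) down");
        System.out.println("One replica down, QUORUM ok? " + quorumSucceeds(rf, 1));
        System.out.println("Two replicas down, QUORUM ok? " + quorumSucceeds(rf, 2));
    }
}
```

Under these assumptions RF = 3 survives exactly one lost replica at QUORUM, which matches the slide's example.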
![Page 11: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/11.jpg)
Cassandra Applications - Drivers
• DataStax Drivers for Cassandra
• Java
• C#
• Python
• more on the way
![Page 12: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/12.jpg)
Cassandra Applications - Connecting
• Create a pool of local servers
• Client just uses session to interact with Cassandra
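    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.policies.Policies;
    import java.util.List;

    public class VideoDbBasicImpl {
        private final Cluster cluster;
        private final Session session;

        // Example arguments from the slide:
        //   contactPoints = {"10.0.0.1", "10.0.0.2"}
        //   keyspace = "videodb"
        public VideoDbBasicImpl(List<String> contactPoints, String keyspace) {
            // The contact points only seed discovery; the driver learns the rest of the cluster.
            cluster = Cluster
                .builder()
                .addContactPoints(
                    contactPoints.toArray(new String[contactPoints.size()]))
                .withLoadBalancingPolicy(Policies.defaultLoadBalancingPolicy())
                .withRetryPolicy(Policies.defaultRetryPolicy())
                .build();

            // All reads and writes go through this session.
            session = cluster.connect(keyspace);
        }
    }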
![Page 13: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/13.jpg)
Cassandra Applications - Load balancing
• Token aware - Request sent to primary node with data
• Calls can be asynchronous and in parallel
[Diagram: client threads send requests through the driver to nodes, with steps numbered 1 to 6]
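Token-aware routing can be pictured as a lookup on a sorted ring of tokens. The sketch below is a simplified illustration and not the driver's implementation: `TokenRing` is a hypothetical class, and it assumes one token per node, tokens supplied directly as `long` values (no partitioner hashing), and replicas placed on the next nodes clockwise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical token ring: each node owns the range ending at its token.
class TokenRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(String node, long token) {
        ring.put(token, node);
    }

    // Primary node: first node whose token is >= the key's token, wrapping to the start.
    String primaryFor(long keyToken) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("Ring has no nodes");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(keyToken);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // Replicas: the primary plus the next nodes clockwise, up to rf nodes.
    List<String> replicasFor(long keyToken, int rf) {
        List<String> replicas = new ArrayList<>();
        if (ring.isEmpty()) {
            return replicas;
        }
        Long token = ring.ceilingKey(keyToken);
        if (token == null) {
            token = ring.firstKey();
        }
        int wanted = Math.min(rf, ring.size());
        while (replicas.size() < wanted) {
            replicas.add(ring.get(token));
            token = ring.higherKey(token);
            if (token == null) {
                token = ring.firstKey(); // wrap around the ring
            }
        }
        return replicas;
    }

    public static void main(String[] args) {
        TokenRing ring = new TokenRing();
        ring.addNode("node-a", 0L);
        ring.addNode("node-b", 100L);
        ring.addNode("node-c", 200L);
        System.out.println("Primary for token 150: " + ring.primaryFor(150L));
        System.out.println("Replicas for token 150 at RF=3: " + ring.replicasFor(150L, 3));
    }
}
```

A driver that knows this mapping can send each request straight to a node holding the data instead of paying an extra hop through an arbitrary coordinator.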
![Page 14: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/14.jpg)
Cassandra Applications - Fault tolerance
• Try first with a Consistency Level of QUORUM
• If fails, retry with Consistency Level ONE
[Diagram: a client, nodes, and replicas]
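The try-QUORUM-then-ONE pattern can be sketched without any driver at all. Everything below is hypothetical: `DowngradingRetry`, its `UnavailableException`, and the `Function`-based request are stand-ins for whatever the application uses to issue a query at a given consistency level.

```java
import java.util.function.Function;

// Hypothetical sketch of retrying at a weaker consistency level.
class DowngradingRetry {
    enum Level { QUORUM, ONE }

    // Stand-in for the error raised when too few replicas respond.
    static class UnavailableException extends RuntimeException {
        UnavailableException(String message) {
            super(message);
        }
    }

    // Runs the request at QUORUM first; if that fails, retries once at ONE.
    // A failure at ONE propagates to the caller.
    static <T> T executeWithFallback(Function<Level, T> request) {
        try {
            return request.apply(Level.QUORUM);
        } catch (UnavailableException e) {
            return request.apply(Level.ONE);
        }
    }

    public static void main(String[] args) {
        int liveReplicas = 1; // simulate two of three replicas being down
        String result = executeWithFallback(level -> {
            int needed = (level == Level.QUORUM) ? 2 : 1;
            if (liveReplicas < needed) {
                throw new UnavailableException("Not enough replicas for " + level);
            }
            return "served at " + level;
        });
        System.out.println(result);
    }
}
```

The trade-off is that a read served at ONE may return stale data, so this fallback suits cases where availability matters more than reading the latest write.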
![Page 15: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/15.jpg)
Application Example - Layout
• Active-Active
• Service based DNS routing
Cassandra Replication
![Page 16: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/16.jpg)
Application Example - Uptime
• Normal server maintenance
• Application is unaware
Cassandra Replication
![Page 17: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/17.jpg)
Application Example - Failure
• Data center failure
• Data is safe. Route traffic.
Another happy user!
![Page 18: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/18.jpg)
Cassandra Users and Use Cases
![Page 19: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/19.jpg)
Netflix
• If you haven’t heard their story… where have you been?
• $18B market cap. Runs on Cassandra
• User accounts
• Playlists
• Payments
• Statistics
![Page 20: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/20.jpg)
Spotify
• Millions of songs. Millions of users.
• Playlists
• 1 billion playlists
• 30+ Cassandra clusters
• 50+ TB of data
• 40k req/sec peak
http://www.slideshare.net/noaresare/cassandra-nyc
![Page 21: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/21.jpg)
Instagram (Facebook)
• Loads and loads of photos. (Probably yours)
• All in AWS
• Security audits
• News feed
• 20k writes/sec. 15k reads/sec.
![Page 22: Cassandra 2.0 - introduction. use cases](https://reader033.vdocument.in/reader033/viewer/2022060106/54b6caa04a79599d1b8b45ac/html5/thumbnails/22.jpg)