cassandra sf 2015 - repeatable, scalable, reliable, observable cassandra


Upload: aaronmorton

Post on 13-Apr-2017


TRANSCRIPT

Page 1: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

CASSANDRA SF 2015

REPEATABLE, SCALABLE, RELIABLE, OBSERVABLE CASSANDRA

Aaron Morton

@aaronmorton

Co-Founder & Principal Consultant

Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License

Page 2: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

About The Last Pickle.

Work with clients to deliver and improve Apache Cassandra based solutions.

Apache Cassandra Committer, DataStax MVP, Apache Usergrid Committer.

Based in New Zealand, Australia, & USA.

Page 3: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Design

Development

Deployment

Page 4: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Scalable Data Model

Use no-look writes to avoid unnecessary reads.

Page 5: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

No Look Writes

CREATE TABLE user_visits (
    user text,
    day  int,   // YYYYMMDD
    PRIMARY KEY (user, day)
);

Page 6: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

No Look Writes

// Bad: read before writing

SELECT * FROM user_visits WHERE user = 'aaron' AND day = 20150924;

INSERT INTO user_visits (user, day) VALUES ('aaron', 20150924);

Page 7: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

No Look Writes

// Better: write without reading; an INSERT is an upsert, so repeating it is safe

INSERT INTO user_visits (user, day) VALUES ('aaron', 20150924);

INSERT INTO user_visits (user, day) VALUES ('aaron', 20150924);

Page 8: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Scalable Data Model

Limit partition size by bounding it in time or space.

Page 9: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Limit Partition Size

// Bad: one partition per user grows without bound

CREATE TABLE user_visits (
    user       text,
    visit_time timestamp,
    data       blob,        // up to 100K
    PRIMARY KEY (user, visit_time)
);

Page 10: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Limit Partition Size

// Better: one partition per user per day

CREATE TABLE user_visits (
    user       text,
    day_bucket int,         // YYYYMMDD
    visit_time timestamp,
    data       blob,        // up to 100K
    PRIMARY KEY ( (user, day_bucket), visit_time)
);
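
The application has to compute day_bucket itself before each write or read. A minimal Java sketch of that step, assuming buckets are UTC calendar days encoded as a YYYYMMDD integer; the class and method names are hypothetical and not part of the talk.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

public final class DayBucket {
    private DayBucket() {}

    // Encode the UTC calendar day of a visit as a YYYYMMDD integer,
    // the same shape as the day_bucket column.
    // UTC is an assumption; use whichever zone the application standardises on.
    public static int of(Instant visitTime) {
        LocalDate day = visitTime.atZone(ZoneOffset.UTC).toLocalDate();
        return day.getYear() * 10_000 + day.getMonthValue() * 100 + day.getDayOfMonth();
    }

    public static void main(String[] args) {
        System.out.println(DayBucket.of(Instant.parse("2015-09-24T17:30:00Z"))); // 20150924
    }
}
```

Readers of a time range then query one partition per day in the range, which keeps every partition bounded by a single day of visits.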

Page 11: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Scalable Data Model

Avoid mixed workloads on a single table to reduce the impact of fragmentation.

Page 12: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Mixed Workloads

// Bad: columns with very different update rates share one table

CREATE TABLE user (
    user       text,
    password   text,        // updated when the password changes
    last_visit timestamp,   // updated on each page request
    PRIMARY KEY (user)
);

Page 13: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Mixed Workloads

// Better

CREATE TABLE user_password (
    user     text,
    password text,
    PRIMARY KEY (user)
);

CREATE TABLE user_last_visit (
    user       text,
    last_visit timestamp,
    PRIMARY KEY (user)
);

Page 14: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Scalable Data Model

Use LeveledCompactionStrategy when the workload has overwrites or tombstones.

Page 15: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Use LCS for Overwrites

CREATE TABLE user_visits (
    user text,
    day  int,   // YYYYMMDD
    PRIMARY KEY (user, day)
) WITH COMPACTION = {
    'class' : 'LeveledCompactionStrategy'
};

Page 16: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Scalable Data Model

Create parallel data models so throughput increases with node count.

Page 17: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Parallel Data Models

// Bad

CREATE TABLE hotel_price (
    checkin_day int,    // YYYYMMDD
    hotel_name  text,
    price_data  blob,
    PRIMARY KEY (checkin_day, hotel_name)
);

Page 18: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Parallel Data Models

// Better

CREATE TABLE hotel_price (
    checkin_day int,    // YYYYMMDD
    city        text,
    hotel_name  text,
    price_data  blob,
    PRIMARY KEY ( (checkin_day, city), hotel_name)
);

Page 19: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Scalable Data Model

Use concurrent asynchronous requests to complete tasks.

Page 20: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Concurrent Asynchronous Requests

CREATE TABLE hotel_price (
    checkin_day int,    // YYYYMMDD
    city        text,
    hotel_name  text,
    price_data  blob,
    PRIMARY KEY ( (checkin_day, city), hotel_name)
);

Page 21: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Concurrent Asynchronous Requests

// Request each city concurrently

SELECT * FROM hotel_price WHERE checkin_day = 20150924 AND city = 'Santa Clara';

SELECT * FROM hotel_price WHERE checkin_day = 20150924 AND city = 'San Jose';
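
On the client side, the pattern is to start every per-city request before waiting on any of them. A minimal sketch using only the JDK's CompletableFuture; queryCity is a hypothetical stand-in for a driver call such as an asynchronous execute, and the FanOut class is illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

public final class FanOut {
    private FanOut() {}

    // Start one asynchronous query per city, then wait for all of them.
    // queryCity stands in for whatever async call the real client makes.
    public static <R> List<R> queryAll(List<String> cities,
                                       Function<String, CompletableFuture<R>> queryCity) {
        // Issue every request before waiting on any of them.
        List<CompletableFuture<R>> inFlight = cities.stream()
                .map(queryCity)
                .collect(Collectors.toList());
        // join() blocks until each result is ready; results keep the input order.
        return inFlight.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> rows = queryAll(
                Arrays.asList("Santa Clara", "San Jose"),
                city -> CompletableFuture.supplyAsync(() -> "rows for " + city));
        System.out.println(rows); // [rows for Santa Clara, rows for San Jose]
    }
}
```

Total latency is then roughly that of the slowest single request rather than the sum of all of them.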

Page 22: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Scalable Data Model

Document when Eventual Consistency, Strong Consistency or Linearizable Consistency is required.

Page 23: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Scalable Data Model

Smoke test the data model.

Page 24: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Data Model Smoke Test

/*
 * Get Pricing Data
 */

// Load Data

INSERT INTO city_distances (city, distance, nearby_city)
VALUES ('Santa Clara', 0, 'Santa Clara');

INSERT INTO city_distances (city, distance, nearby_city)
VALUES ('Santa Clara', 1, 'San Jose');

INSERT INTO hotel_price (checkin_day, city, hotel_name, price_data)
VALUES (20150924, 'Santa Clara', 'Hilton Santa Clara', 0xFF);

INSERT INTO hotel_price (checkin_day, city, hotel_name, price_data)
VALUES (20150924, 'San Jose', 'Hyatt San Jose', 0xFF);

Page 25: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Data Model Smoke Test

// Step 1
// Get the nearby cities for the one selected by the user

SELECT nearby_city FROM city_distances
WHERE city = 'Santa Clara' AND distance < 2;

// Step 2
// Parallel requests for each city returned

SELECT city, hotel_name, price_data FROM hotel_price
WHERE checkin_day = 20150924 AND city = 'Santa Clara';

SELECT city, hotel_name, price_data FROM hotel_price
WHERE checkin_day = 20150924 AND city = 'San Jose';

Page 26: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Design

Development

Deployment

Page 27: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Application Development

Ensure read requests are bounded and know what the size is.

(Hint: use auto-paging in 2.0.)

Page 28: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Auto Paging

PreparedStatement prepStmt = session.prepare(CQL);
BoundStatement boundStmt = new BoundStatement(prepStmt);

// Fetch at most 100 rows per page; the driver requests further pages as rows are consumed
boundStmt.setFetchSize(100);

Page 29: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Application Development

Use appropriate Consistency Level.

(see Data Model Smoke Test)

Page 30: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Application Development

Use Token Aware Asynchronous requests with CL ONE where possible.

Page 31: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Token Aware Policy

Cluster cluster = Cluster.builder()
    .addContactPoints("10.10.10.10")
    .withLoadBalancingPolicy(new TokenAwarePolicy(
        new DCAwareRoundRobinPolicy("DC1")))
    .build();

Page 32: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Asynchronous Requests

ResultSetFuture f = ses.executeAsync(stmt.bind("fo"));
Row row = f.getUninterruptibly().one();

Page 33: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Application Development

Avoid DDOS’ing the cluster.
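
One common way to keep asynchronous clients from overwhelming the cluster is to cap the number of requests in flight. A minimal sketch of that idea using a JDK Semaphore; the BoundedRequests class, its method names and the limit value are hypothetical, and the supplier stands in for a real asynchronous driver call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

public final class BoundedRequests {
    private final Semaphore permits;

    public BoundedRequests(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    // Block the caller until a slot is free, then start the request.
    // The slot is released when the request completes, whether it succeeds or fails.
    public <R> CompletableFuture<R> submit(Supplier<CompletableFuture<R>> request) {
        permits.acquireUninterruptibly();
        try {
            return request.get().whenComplete((result, error) -> permits.release());
        } catch (RuntimeException e) {
            permits.release(); // the request never started, so free the slot
            throw e;
        }
    }

    // Number of additional requests that could start right now.
    public int availableSlots() {
        return permits.availablePermits();
    }
}
```

Because submit blocks when the limit is reached, a burst of work slows the producer down instead of queueing unbounded requests against the cluster.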

Page 34: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring and Alerting

Use what you like and what works for you.

Page 35: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring and Alerting

Some suggestions: OpsCenter, Riemann, Grafana, Logstash, Sensu.

Page 36: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

How To Monitor

Cluster wide aggregate.

All nodes (if possible).

Top 3 & Bottom 3 Nodes.

Individual Nodes.

Page 37: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

How To Monitor Rates

1 Minute Rate

Derivative of Counts
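
When only a raw counter is available, a rate can be derived from two samples of it. A minimal sketch of that "derivative of counts" calculation; the CounterRate class is hypothetical, and treating a counter drop as a restart is an assumption of this sketch.

```java
public final class CounterRate {
    private CounterRate() {}

    // Events per second between two samples of a monotonically increasing counter.
    public static double perSecond(long earlierCount, long earlierMillis,
                                   long laterCount, long laterMillis) {
        long elapsedMillis = laterMillis - earlierMillis;
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("samples must be in time order");
        }
        // A drop in the counter usually means the node restarted; report 0 for
        // that interval rather than a negative rate (an assumption of this sketch).
        long delta = Math.max(0L, laterCount - earlierCount);
        return delta * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        // 1,200 new requests over 60 seconds
        System.out.println(perSecond(10_000, 0, 11_200, 60_000)); // 20.0
    }
}
```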

Page 38: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

How To Monitor Latency

75th Percentile

95th Percentile

99th Percentile
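
Cassandra reports these percentiles itself, so nothing needs to be computed by hand. The sketch below only illustrates what the numbers mean, using the nearest-rank definition on a small sample; the Percentile class is hypothetical.

```java
import java.util.Arrays;

public final class Percentile {
    private Percentile() {}

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    public static double nearestRank(double[] samples, double p) {
        if (samples.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Nine fast requests and one slow one, in milliseconds.
        double[] latencies = {5, 5, 5, 5, 5, 5, 5, 5, 5, 500};
        System.out.println(nearestRank(latencies, 75)); // 5.0
        System.out.println(nearestRank(latencies, 95)); // 500.0
    }
}
```

The example shows why the higher percentiles are worth graphing: a single slow request is invisible at the 75th percentile but dominates the 95th and 99th.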

Page 39: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring Cluster Throughput

.o.a.c.m.ClientRequest. (o.a.c.m is short for org.apache.cassandra.metrics)

Write.Latency.1MinuteRate
Read.Latency.1MinuteRate

Page 40: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring Local Table Throughput

.o.a.c.m.ColumnFamily.

KEYSPACE.TABLE.WriteLatency.1MinuteRate
KEYSPACE.TABLE.ReadLatency.1MinuteRate

Page 41: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring Request Latency

.o.a.c.m.ClientRequest.

Write.Latency.75percentile
Write.Latency.95percentile
Write.Latency.99percentile
Read.Latency.75percentile…

Page 42: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring Request Latency Per Table

.o.a.c.m.ColumnFamily.

KEYSPACE.TABLE.CoordinatorWriteLatency.95percentile

KEYSPACE.TABLE.CoordinatorReadLatency.95percentile

Page 43: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring Local Table Latency

.o.a.c.m.ColumnFamily.

KEYSPACE.TABLE.WriteLatency.95percentile
KEYSPACE.TABLE.ReadLatency.95percentile

Page 44: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring Read Path

.o.a.c.m.ColumnFamily.KEYSPACE.TABLE.

LiveScannedHistogram.95percentile

TombstoneScannedHistogram.95percentile

SSTablesPerReadHistogram.95percentile

Page 45: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring Inconsistency

.o.a.c.m.

Storage.TotalHints.count

HintedHandOffManager.Hints_created-IP_ADDRESS.count

.o.a.c.m.Connection.TotalTimeouts.1MinuteRate

Page 46: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring Eventual Consistency

.o.a.c.m.

ReadRepair.RepairedBackground.1MinuteRate

ReadRepair.RepairedBlocking.1MinuteRate

Page 47: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring Client Errors

.o.a.c.m.ClientRequest.

Write.Unavailables.1MinuteRate
Read.Unavailables.1MinuteRate
Write.Timeouts.1MinuteRate
Read.Timeouts.1MinuteRate

Page 48: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring Errors

.o.a.c.m.

Storage.Exceptions.count

Page 49: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring Disk Usage

.o.a.c.m.

Storage.Load.count

ColumnFamily.KEYSPACE.TABLE.TotalDiskSpaceUsed.count

Page 50: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring Pending Compactions

.o.a.c.m.

Compaction.PendingTasks.value

ColumnFamily.KEYSPACE.TABLE.PendingCompactions.value

Compaction.TotalCompactionsCompleted.1MinuteRate

Page 51: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring Node Performance

.o.a.c.m.ThreadPools.request.

MutationStage.PendingTasks.value
ReadStage.PendingTasks.value
ReplicateOnWriteStage.PendingTasks.value
RequestResponseStage.PendingTasks.value

Page 52: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Monitoring Node Performance

.o.a.c.m.DroppedMessage.

MUTATION.Dropped.1MinuteRate
READ.Dropped.1MinuteRate

Page 53: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Design

Development

Provisioning

Page 54: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Smoke Tests

“preliminary testing to reveal simple failures severe enough to reject a prospective software release.”

Page 55: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Disk Smoke Tests

“Disk Latency and Other Random Numbers”

Al Tobey

http://tobert.github.io/post/2014-11-13-slides-disk-latency-and-other-random-numbers.html

Page 56: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Cassandra Smoke Test

cassandra-stress write cl=quorum -schema replication\(factor=3\) -mode native prepared cql3

cassandra-stress read cl=quorum -mode native prepared cql3

cassandra-stress mixed cl=quorum ratio\(read=1,write=4\) -mode native prepared cql3

Page 57: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Run Books

Plan now.

Page 58: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Run Books

Why are we doing this?

What are we doing?

How will we do it?

Page 59: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Fire Drills

Practice now.

Page 60: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Fire Drill: Short Term Single Node Failure

Down for less than Hint Window.

Available for QUORUM.

No action necessary on return.
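
Whether a drill leaves the cluster available for QUORUM comes down to arithmetic on the replication factor. A minimal sketch, assuming a single data center and that the down nodes are all replicas of the same token range; the QuorumCheck class and its method names are hypothetical.

```java
public final class QuorumCheck {
    private QuorumCheck() {}

    // Replicas that must respond for QUORUM: a majority of the replication factor.
    public static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when enough replicas are still up to satisfy QUORUM.
    public static boolean quorumAvailable(int replicationFactor, int replicasDown) {
        return replicationFactor - replicasDown >= quorum(replicationFactor);
    }

    public static void main(String[] args) {
        System.out.println(quorumAvailable(3, 1)); // true: 2 of 3 replicas remain
        System.out.println(quorumAvailable(3, 2)); // false: only 1 of 3 remains
    }
}
```

With a replication factor of 3, losing one replica keeps QUORUM available, while losing two leaves at most consistency level ONE for the affected ranges.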

Page 61: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Fire Drill: Short Term Multi Node Failure (Break the cluster)

Down for less than Hint Window.

Available for ONE (maybe).

Repair on return.

Page 62: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Fire Drill: Availability Zone / Rack Partition

Down for less than Hint Window.

Available for QUORUM.

Maybe repair on return.

Page 63: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Fire Drill: Medium Term Single Node Failure

Down between Hint Window and gc_grace_seconds.

Available for QUORUM.

Repair on return.

Page 64: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Fire Drill: Long Term Single Node Failure

Down longer than gc_grace_seconds.

Available for QUORUM.

Replace node.
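
The short, medium and long term single node drills differ only in how the downtime compares with the hint window and gc_grace_seconds. A minimal sketch of that decision for a run book; the ReturnAction class is hypothetical, and the 3 hour and 10 day values in main are the usual defaults, which should be checked against the actual cluster and table settings.

```java
import java.time.Duration;

public final class ReturnAction {
    private ReturnAction() {}

    public enum Action { NONE, REPAIR, REPLACE }

    // Pick the recovery step for a single node based on how long it was down.
    public static Action forDowntime(Duration down, Duration hintWindow, Duration gcGrace) {
        if (down.compareTo(hintWindow) < 0) {
            return Action.NONE;     // hints cover the missed writes
        }
        if (down.compareTo(gcGrace) <= 0) {
            return Action.REPAIR;   // hints stopped being stored; repair on return
        }
        return Action.REPLACE;      // past gc_grace_seconds; replace the node
    }

    public static void main(String[] args) {
        Duration hintWindow = Duration.ofHours(3); // assumed default hint window
        Duration gcGrace = Duration.ofDays(10);    // assumed default gc_grace_seconds
        System.out.println(forDowntime(Duration.ofMinutes(20), hintWindow, gcGrace)); // NONE
        System.out.println(forDowntime(Duration.ofDays(2), hintWindow, gcGrace));     // REPAIR
        System.out.println(forDowntime(Duration.ofDays(14), hintWindow, gcGrace));    // REPLACE
    }
}
```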

Page 65: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Fire Drill: Rolling Upgrade

Repeated short term failure.

Available for QUORUM.

Page 66: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Fire Drill: Scale Up

Repeated short term failure.

Available for QUORUM.

Page 67: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Fire Drill: Scale Out

Available for ALL.

Page 68: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Thanks.

Page 69: Cassandra SF 2015 - Repeatable, Scalable, Reliable, Observable Cassandra

Aaron Morton

@aaronmorton

Co-Founder & Principal Consultant

www.thelastpickle.com