cassandra@coursera: aws deploy and mysql transition

Post on 19-Aug-2014

Category:

Engineering

DESCRIPTION

Touches on what Coursera aims to get out of Cassandra, what goes into a good deployment, and our experience so far transitioning off MySQL.

TRANSCRIPT

Cassandra @ Coursera: Deploying in AWS, MySQL Transition

Daniel Chia @DanielJHChia

Software Engineer, Infrastructure

Overview

• Why Cassandra

• What goes into a good deployment

• MySQL → Cassandra transition experience

110 partners

698 courses

8.5 million learners

A Coursera Course

Your Final Project

This is your chance to apply the course concepts to real-world situations

Identity Verified Certificates

Technical

• 100% hosted on AWS

• Service-oriented architecture

• Mix of MySQL and Cassandra for persistence

What do we care about?

We care about…

• Availability

• Scalability

• Operational Ease

• Latency

• (Bonus) Multi-region writes

Availability matters

EBS Outage (2012)

Master us-east-1a

Slave us-east-1c

Scalability

Sharded by class

Machine 1: class1, class2, class3, class4, class5

Machine 2: class6, class7, class8, class9, class10

Machine 3: class11, class12, class13, class14, class15

New use-case

Uh-oh… doesn’t fit in existing sharding
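A minimal sketch of the "sharded by class" layout, assuming the five-classes-per-machine split shown on the slide. The function names and the example fan-out query are hypothetical; the slides do not say what the new use-case was, only that it did not fit the class-based sharding.

```python
# Hypothetical layout mirroring the slide: 15 classes spread over 3 machines.
CLASSES_PER_MACHINE = 5


def machine_for_class(class_number):
    """Return the 1-based machine that holds a 1-based class number."""
    if class_number < 1:
        raise ValueError("class numbers start at 1")
    return (class_number - 1) // CLASSES_PER_MACHINE + 1


def machines_for_classes(class_numbers):
    """Machines a query must visit when it spans several classes."""
    return sorted({machine_for_class(n) for n in class_numbers})


# A query scoped to one class touches exactly one machine...
single = machines_for_classes([7])
# ...while a query keyed by something other than class (illustrative only)
# can end up visiting every machine.
fan_out = machines_for_classes([2, 7, 13])
```

Any access pattern that is not keyed by class loses the benefit of the shard key, which is one way a new use-case can fail to fit.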

We care about…

• Availability

• Scalability

• Operational Ease

• Performance

• (Bonus) Multi-region

So we decided to… Try Cassandra!

Cassandra ≠ [database XYZ]

“But if you judge a fish by its ability to climb a tree, it will live its whole life believing that it is stupid.”

–Albert Einstein

Time to deploy Cassandra!

sudo apt-get install dse-full

A good deployment

• Machine-level

• Cluster-level

Picking a machine

• Disk

• IOPS… IOPS… IOPS

• Latency

Author: D-Kuru/Wikimedia Commons Licence: CC-BY-SA-3.0-AT

Picking a machine

• CPU

Author: Mark Sze Licence: CC BY-NC-ND 2.0

Picking a machine

• Memory

• Save some for page cache!

Author: brutalSoCal Licence: CC BY-NC-ND 2.0

On AWS

• Ephemeral disks.

• Please don’t use EBS. Really.

• IOPS usually the problem

• Instance sizes:

• spinning disk: m1.large, m1.xlarge, m2.4xlarge

• ssd: m3.xlarge, c3.2xlarge, i2.*

Set up the machine

• Lots of documentation / talks about this

• Recommended reading: Datastax guide [1]

[1] http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installRecommendSettings.html

Cluster configuration

[Diagram: cluster of three nodes A, B, C]

Priam: care and feeding of Cassandra on AWS

https://github.com/Netflix/Priam

Cluster Topology

• We use RF=3

• Ring balanced within datacenter

• Nodes alternate racks (or AZs)

Cluster Topology (Priam)

• Token assignments stored in a database

• Can take over a token in the event of node failure

Cluster Topology (Priam)

• Priam assigns tokens evenly per region

• Alternates AZs within region

[Diagram: six-node ring with nodes alternating across az1, az2, az3]
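The even token spacing and AZ alternation described above can be sketched as follows. This is an illustration of the idea only, not Priam's actual algorithm; it assumes the Murmur3Partitioner token range, and the function names are made up for the example.

```python
# Assumption: Murmur3Partitioner, whose tokens span [-2**63, 2**63 - 1].
MIN_TOKEN = -(2 ** 63)
RING_SIZE = 2 ** 64


def even_tokens(node_count):
    """Evenly spaced tokens for node_count nodes in one region."""
    return [MIN_TOKEN + (RING_SIZE * i) // node_count for i in range(node_count)]


def assign_ring(node_count, zones):
    """Pair each token with an AZ, cycling through the zones in ring order."""
    tokens = even_tokens(node_count)
    return [(token, zones[i % len(zones)]) for i, token in enumerate(tokens)]


ring = assign_ring(6, ["az1", "az2", "az3"])
for token, zone in ring:
    print(f"{zone}  {token}")
```

With three zones cycled in ring order, any three neighbouring nodes sit in three different AZs, which is what lets rack-aware replication with RF=3 place one replica per AZ.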

Autoscaling groups

• Recover from lost instance

• We don't use it for scaling with traffic

Important: Need one ASG per AZ

• Single ASG, size 9, balanced: east-1a ×3, east-1b ×3, east-1c ×3

• Single ASG, size 9, unbalanced: east-1a ×3, east-1b ×4, east-1c ×2

• One ASG per AZ: ASG-1a size 3, ASG-1b size 3, ASG-1c size 3
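A small sketch of the "one ASG per AZ" layout as plain data, assuming a nine-node cluster over three zones. The group names and dictionary keys are hypothetical and are not an AWS API payload; the point is only that each group is pinned to one AZ and has a fixed size.

```python
def per_az_groups(cluster_name, zones, nodes_per_zone):
    """Describe one fixed-size auto scaling group per availability zone."""
    return [
        {
            "name": f"{cluster_name}-{zone}",   # hypothetical naming scheme
            "availability_zones": [zone],       # pinned to a single AZ
            "min_size": nodes_per_zone,
            "max_size": nodes_per_zone,         # fixed size: recovery only
        }
        for zone in zones
    ]


groups = per_az_groups(
    "cassandra", ["us-east-1a", "us-east-1b", "us-east-1c"], 3
)
```

Because each group can only launch into its own zone, a replacement for a lost node comes back in the AZ that lost it, and the per-AZ counts stay at 3/3/3.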

Backups

• Data on ephemeral disks

• Guard against application errors

• SSTables immutable -> ship to S3

• Priam does this
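Why immutability makes shipping to S3 simple can be sketched like this. It is not how Priam is implemented; the `upload` callable stands in for an S3 put, and the sketch assumes `data_dir` is a snapshot directory containing only finished SSTable files.

```python
from pathlib import Path


def ship_new_sstables(data_dir, already_shipped, upload):
    """Upload files that have not been shipped yet and return their keys.

    SSTable files are never modified after they are written, so a file
    that was uploaded once never needs to be uploaded again.
    """
    shipped_now = []
    for path in sorted(Path(data_dir).rglob("*")):
        if not path.is_file():
            continue
        key = path.relative_to(data_dir).as_posix()
        if key in already_shipped:
            continue
        upload(path, key)            # hypothetical callable, e.g. an S3 put
        already_shipped.add(key)
        shipped_now.append(key)
    return shipped_now
```

Calling the function repeatedly with the same `already_shipped` set uploads only the files that appeared since the previous run, so each backup pass is incremental.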

Restore

• Have to be able to use your backup

• Also useful for QA / test

• Priam handles this rather nicely

Deployed! Time to chill?

https://www.flickr.com/photos/spunkinator/2394514059 Creative Commons

Monitoring: working / not working doesn’t count.

We have our own custom reporter agent for Datadog. There’s pluggable reporter support in 2.0.2 now.

JVM GC woes

JVM GC woes: all happy now

SSTables Read Histogram

Questions? (before we carry on)

Transition takes

• time

• mindset shift

• expertise

• (some) risk

Our experience

• Pick one feature first

• Mindset shift

• Data modeling consulting

• Libraries / Patterns / Data-as-a-service

Pick one feature

• Don’t go all in with Cassandra with something important right away

• Work closely with that team

You probably will make mistakes

Oops!

Mindset shift

• Everyone knows SQL

• Not everyone knows Cassandra / NoSQL

• Need to know queries beforehand

Enrollment Example

• Learners enroll into a course

• learner (many-to-many) course

• Need to keep track of this membership

MySQL Model

CREATE TABLE `courses_learners` (
  `id` INT(11) NOT NULL AUTO_INCREMENT,
  `course_id` INT(11) NOT NULL,
  `learner_id` INT(11) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `c_l` (`learner_id`, `course_id`),
  CONSTRAINT `ref1` FOREIGN KEY (`course_id`),
  CONSTRAINT `ref2` FOREIGN KEY (`learner_id`)
)

Cassandra Style

CREATE TABLE courses_by_learner (
  learner_id uuid,
  course_id uuid,
  PRIMARY KEY (learner_id, course_id)
);
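To illustrate "need to know queries beforehand", here is an in-memory sketch of query-first modeling. `courses_by_learner` mirrors the table above; `learners_by_course` is a hypothetical second table that would be added only if the reverse lookup were a required query. Class and method names are illustrative, not taken from the talk.

```python
from collections import defaultdict


class EnrollmentStore:
    """In-memory stand-in for two query-specific tables."""

    def __init__(self):
        # Each dict maps a partition key to its set of clustering keys.
        self.courses_by_learner = defaultdict(set)
        self.learners_by_course = defaultdict(set)  # hypothetical second table

    def enroll(self, learner_id, course_id):
        # One logical write goes to every table that serves a query.
        # Re-enrolling is a no-op, much like an upsert on the same key.
        self.courses_by_learner[learner_id].add(course_id)
        self.learners_by_course[course_id].add(learner_id)

    def courses_for(self, learner_id):
        """Single-partition read: all courses for one learner."""
        return sorted(self.courses_by_learner.get(learner_id, ()))

    def learners_in(self, course_id):
        """Single-partition read: all learners in one course."""
        return sorted(self.learners_by_course.get(course_id, ()))


store = EnrollmentStore()
store.enroll("learner-1", "course-a")
store.enroll("learner-1", "course-b")
store.enroll("learner-2", "course-a")
```

In the MySQL model one join table answers both directions; here each direction gets its own table keyed by what the query filters on, at the cost of writing the enrollment twice.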

Data modeling consulting

• Build core team proficient at C* data modeling

• Available to consult for trickier use cases

Libraries / Patterns

• Abstract away simple (but common) use-cases

• Key-value storage

• Simple time series

• Maybe not every developer will need deep C* knowledge?

• More radical: data as a service (e.g. STAASH)

STAASH: https://github.com/Netflix/staash
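One way to picture "abstract away simple use-cases" is a small key-value interface that application code depends on, with the storage details hidden behind it. This is a hypothetical sketch, not Coursera's library or the STAASH API; the in-memory backend only stands in for a Cassandra-backed one.

```python
from abc import ABC, abstractmethod


class KeyValueStore(ABC):
    """Minimal interface an application team would code against."""

    @abstractmethod
    def put(self, namespace, key, value):
        """Store value under (namespace, key), overwriting any old value."""

    @abstractmethod
    def get(self, namespace, key, default=None):
        """Return the stored value, or default if the key is absent."""


class InMemoryKeyValueStore(KeyValueStore):
    """Test double; a real backend would map namespace to a table."""

    def __init__(self):
        self._data = {}

    def put(self, namespace, key, value):
        self._data[(namespace, key)] = value

    def get(self, namespace, key, default=None):
        return self._data.get((namespace, key), default)


kv = InMemoryKeyValueStore()
kv.put("profile", "learner-1", {"locale": "en"})
```

Callers see only `put` and `get`, so schema design and consistency settings can be decided once by the team that knows C*.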

It’s a long road, but we’ll get there…

Author: Carissa Rogers License: CC BY 2.0

Conclusion

• Know Cassandra

• Know what makes a good deployment

• Know that new skills have to be acquired

Questions?

We’re hiring! coursera.org/jobs
