re:invent 2012 optimizing cassandra
DESCRIPTION
AWS re:Invent 2012 presentation on Optimizing Cassandra usage at Netflix. Overview of Netflix Open Source projects. Gregg Ulrich, Ruslan Meshenberg.
TRANSCRIPT
Optimizing Cassandra for AWS
Ruslan Meshenberg, Gregg Ulrich - Netflix
Agenda
Netflix
AWS Cassandra
Netflix Inc.
With more than 30 million streaming members in the United States, Canada, Latin America, the United Kingdom, Ireland, Sweden, Norway, Denmark and Finland, Netflix, Inc. is the world's leading internet subscription service for enjoying movies and TV series.
Why Cloud?
[Chart: Netflix API – Growth in requests. Requests in billions (per day), January 2010 to January 2011, y-axis 0 to 25, plotted against Data Center Capacity]
Netflix.com is now ~100% Cloud
• Some small back-end data sources still in progress
• USA-specific logistics remain in the datacenter
• Working on SOX, PCI as scope starts to include AWS
• All international product is cloud based
What is Cassandra?
• Persistent data store
• NoSQL
• Distributed key/value store
• Tunable eventual consistency
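The "tunable" part of tunable eventual consistency can be illustrated with the standard replica-overlap rule: a read is guaranteed to include the latest acknowledged write when the replicas read (R) plus the replicas written (W) exceed the replication factor (N). The following is a minimal Python sketch of that rule. ONE, QUORUM and ALL are Cassandra consistency levels, but the function names are hypothetical and not part of any Cassandra API.

```python
def consistency_levels(n):
    """Replica counts required by three Cassandra consistency levels for RF = n."""
    return {"ONE": 1, "QUORUM": n // 2 + 1, "ALL": n}


def read_sees_latest_write(n, r, w):
    """True when every read set must overlap every write set (R + W > N)."""
    return r + w > n


N = 3  # replication factor used as the example here
levels = consistency_levels(N)
for write_cl in levels:
    for read_cl in levels:
        strong = read_sees_latest_write(N, levels[read_cl], levels[write_cl])
        print(f"write={write_cl:<6} read={read_cl:<6} overlap guaranteed: {strong}")
```

Combinations where the overlap is not guaranteed still converge eventually; they trade read-your-writes behavior for lower latency and higher availability.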
Why did we choose Cassandra?
• Open sourced and written in Java
• Multi-region replication
• Data model supports wide range of use-cases
• Runs on commodity hardware
• Enhanced to understand AWS topology
• Durable
Durability
• No single point of failure or specialized instances
• Multiple copies of data across availability zones
• Bootstrapping and hints restore data quickly
• All writes appended to a commit log
• Asynchronous cross-regional replication
How we configure Cassandra in AWS
[Diagram: Cassandra rings in us-east-1, eu-west-1 and us-west-2, with nodes labeled by availability zone (1c, 1d, 1e in some rings; 1a, 1b, 1c in others) and S3 shown alongside each region]
Durability (Quorum)
[Diagram legend: one instance; availability zone; replica set]
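A small sketch of why quorum operations can survive the loss of a whole availability zone, assuming a replication factor of 3 with one replica in each of three zones. The zone labels and function names are hypothetical placeholders, not taken from the slides.

```python
def quorum(replication_factor):
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def available_at_quorum(replica_zones, failed_zones):
    """True if enough replicas remain outside the failed zones to form a quorum."""
    live = [zone for zone in replica_zones if zone not in failed_zones]
    return len(live) >= quorum(len(replica_zones))


# Assumed layout: one replica of every row in each of three zones.
replica_zones = ["zone-a", "zone-b", "zone-c"]

print(available_at_quorum(replica_zones, failed_zones={"zone-a"}))
print(available_at_quorum(replica_zones, failed_zones={"zone-a", "zone-b"}))
```

With one zone down, two of three replicas remain and quorum reads and writes still succeed; with two zones down, only operations at consistency level ONE could be served.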
How we configure Cassandra in AWS
• Mostly m2.4xlarge, but migrating to SSDs
• Ephemeral storage for better performance
• Multiple ASGs per cluster, each with one AZ
• Single tenanted clusters
• Overprovisioned clusters
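With one ASG per availability zone, each zone's nodes launch independently, so token placement has to keep the ring balanced and keep neighboring ring positions in different zones. The sketch below shows one way to do that. It assumes a RandomPartitioner-style token space of 2**127; the function and zone names are hypothetical, and this is not presented as Priam's actual algorithm.

```python
RING_SIZE = 2 ** 127  # assumed token space (RandomPartitioner-style)


def assign_tokens(zones, nodes_per_zone):
    """Return (token, zone) pairs: evenly spaced tokens, zones interleaved."""
    total = len(zones) * nodes_per_zone
    ring = []
    for slot in range(total):
        zone = zones[slot % len(zones)]      # rotate through zones slot by slot
        token = slot * RING_SIZE // total    # equal share of the ring per node
        ring.append((token, zone))
    return ring


ring = assign_tokens(["zone-a", "zone-b", "zone-c"], nodes_per_zone=2)
for token, zone in ring:
    print(f"{zone}  {token}")
```

Because the zones rotate around the ring, any three consecutive positions span all three zones, which is what lets a zone-aware replication strategy place one copy of each row per zone.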
Optimizations
• Cassandra enhancements
• Client libraries
• Operations
• Schema and data management
Cassandra enhancements
• Bug fixes
• New features
• Performance
• Security
• AWS environment
Making a better Java client
• Multi-region and zone aware
• Latency aware load balancer
• Fluent API on top of Thrift
• Best Practice Recipes
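To make "latency aware load balancer" concrete, here is a minimal Python sketch that tracks an exponentially weighted moving average (EWMA) of response time per host and routes to the fastest one. The class, its parameters and the smoothing factor are all hypothetical; this illustrates the general idea only and is not the client library's implementation.

```python
class LatencyAwarePicker:
    """Pick the host with the lowest smoothed latency (illustrative only)."""

    def __init__(self, hosts, alpha=0.2):
        self.alpha = alpha                        # weight given to the newest sample
        self.ewma_ms = {host: None for host in hosts}

    def record(self, host, latency_ms):
        """Fold one observed latency into the host's moving average."""
        previous = self.ewma_ms[host]
        if previous is None:
            self.ewma_ms[host] = float(latency_ms)
        else:
            self.ewma_ms[host] = self.alpha * latency_ms + (1 - self.alpha) * previous

    def pick(self):
        """Prefer hosts with no samples yet, then the lowest average."""
        for host, average in self.ewma_ms.items():
            if average is None:
                return host
        return min(self.ewma_ms, key=self.ewma_ms.get)


picker = LatencyAwarePicker(["node-1", "node-2"])
picker.record("node-1", 10)
picker.record("node-2", 2)
print(picker.pick())  # node-2 has the lower average so far
```

A production balancer would typically also decay stale measurements and add some randomness so that a single fast host is not overloaded; both are omitted here for brevity.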
Filling the operational void
• Tomcat webapp for Cassandra administration
• AWS-style instance provisioning
• Full and incremental backups
• JMX metrics collection
• Consistent configuration across clusters
• REST API for most administrative operations
• Security Groups configuration
Managing your data and schema
• Missing UI for Cassandra client users
• View and edit schema
• Point queries and data updates
• High level cluster status and metrics
• Manages multiple Cassandra clusters
• Integrated access control
• Schema auditing
High level cluster status
Data query tool
Schema management tool
Operations
• June 29th AWS partial outage
• Observations
• Monitoring
• Maintenance
From the Netflix tech blog:
“Cassandra, our distributed cloud persistence store which is distributed across all zones and regions, dealt with the loss of one third of its regional nodes without any loss of data or availability.”
June 29th AWS partial outage
• During outage
- All Cassandra instances in us-east-1a were inaccessible
- nodetool ring showed all nodes as DOWN
- Monitoring other AZs to ensure availability
- Waited for AWS to resolve the issue
• Recovery – power restored to us-east-1a
- Majority of instances rejoined the cluster without issue
- Most of remainder required a reboot to fix
- The others needed to be replaced, one at a time
Observations: AWS
• Ephemeral drive performance is better than EBS
• Instances seldom die on their own
• Use as many availability zones as possible
• Understand how AWS launches instances
• I/O is constrained in most AWS instance types
- Repairs are very I/O intensive
- Large size-tiered compactions can impact latency
• SSDs are game changers
Observations: Cassandra
• A slow node is worse than a down node
• Cold cache increases load and kills latency
• Use whatever dials you can find in an emergency
- Remove node from coordinator list
- Compaction throttling
- Min/max compaction thresholds
- Enable/disable gossip
• Leveled compaction performance is very promising
• 1.1.x and 1.2.x should address some big issues
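The emergency "dials" listed above map roughly onto `nodetool` subcommands. The Python sketch below only builds the command lines and does not run them. The subcommands named are standard `nodetool` operations, but the host, keyspace, column family and numeric values are placeholders, and treating `disablethrift` as the way to take a node out of the coordinator list is an assumption about what the speakers meant.

```python
def nodetool(host, *args):
    """Build (not run) a nodetool command line as an argv list."""
    return ["nodetool", "-h", host, *map(str, args)]


def emergency_dials(host, keyspace, column_family):
    """Map each dial from the slide to a candidate nodetool command."""
    return {
        # Assumption: stopping Thrift keeps clients from using this node as coordinator.
        "remove from coordinator list": nodetool(host, "disablethrift"),
        # Throughput cap in MB/s; 8 is an arbitrary illustrative value.
        "compaction throttling": nodetool(host, "setcompactionthroughput", 8),
        # Min/max SSTable counts for a minor compaction; values are illustrative.
        "min/max compaction thresholds": nodetool(
            host, "setcompactionthreshold", keyspace, column_family, 4, 32
        ),
        "disable gossip": nodetool(host, "disablegossip"),
        "enable gossip": nodetool(host, "enablegossip"),
    }


for dial, argv in emergency_dials("10.0.0.1", "example_ks", "example_cf").items():
    print(f"{dial}: {' '.join(argv)}")
```

Each command is independent; in practice an operator would pick the one dial that addresses the symptom rather than apply them all.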
Monitoring
• Actionable
- Hardware and network issues
- Cluster consistency
• Cumulative Cassandra trends
- Throughput and latency
- Key Cassandra metrics (queues, dropped ops, table reads)
• Informational
- Schema changes
- Log file errors/exceptions
- Recent restarts
Maintenance
• Repair clusters regularly
• Run off-line major compactions to avoid latency
- SSDs will make this unnecessary
• Always replace nodes when they fail
• Periodically replace all nodes in the cluster
• Upgrade to new versions
- Binary (rpm) for major upgrades or emergencies
- Rolling AMI push over time
Scaling Cassandra
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
[Chart: Client Writes/s by node count – Replication Factor = 3. Plotted values: 174373, 366828, 537172 and 1099837 writes/s; x-axis 0 to 350 nodes, y-axis 0 to 1200000 writes/s]
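The four plotted values can be checked for linear scaling by dividing each by its node count. The node counts used below (48, 96, 144, 288) do not appear in the slide text; they are assumed from the linked benchmark post and should be verified against it. A short Python sketch:

```python
# Writes/s from the chart; node counts are an assumption (see lead-in).
benchmark = {48: 174373, 96: 366828, 144: 537172, 288: 1099837}

per_node = {nodes: writes / nodes for nodes, writes in benchmark.items()}

for nodes, writes in benchmark.items():
    print(f"{nodes:>3} nodes: {writes:>9,} writes/s, {per_node[nodes]:,.0f} per node")

spread = max(per_node.values()) / min(per_node.values())
print(f"max/min per-node throughput: {spread:.3f}")
```

If per-node throughput stays roughly constant as the cluster grows, total throughput is scaling close to linearly with node count.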
800K writes per second in production
Disk vs. SSD Benchmark
Same Throughput, Lower Latency, Half Cost
[Diagram: a load test driver (load generation) calls a REST service (application) in two configurations. Disk: 36x m2.xlarge EVcache (Memcached) in front of 48x m2.4xlarge Cassandra. SSD: 15x hi1.4xlarge Cassandra]
Netflix is “all in” with Cassandra
50 Number of production clusters
15 Number of multi-region clusters
4 Max regions, one cluster
101 Total TB of data across all clusters
780 Number of Cassandra nodes
72/32 Largest Cassandra cluster (nodes/data in TB)
250k/800k Max read/writes per second on a single cluster
Future optimizations
• Cassandra as a Service
• Fewer clusters, more data
• Autoscaling Cassandra
• Priam on PEDs
• Self maintaining Cassandra clusters
All optimizations are open sourced
• Enhancements committed to open source project
• Netflix@github
- Astyanax
- Priam
- Cassandra Explorers (coming soon)
• Motivations
- Give back to Apache licensed OSS community
- Help define best practices
Netflix Open Source Center
Conclusion
• Cassandra is high performing and durable in AWS
• Cassandra is flexible enough to handle most use-cases
• AWS offerings help provide a complete solution
• Cassandra performs well in AWS, especially on SSDs
• “Just because Netflix does it doesn’t make it right for you”
Follow us
• http://techblog.netflix.com
• http://netflix.github.com
• Twitter
- @Netflix
- @NetflixJobs
- @rusmeshenberg (Ruslan)
- @eatupmartha (Gregg)
We are sincerely eager to hear your FEEDBACK on this presentation and on re:Invent.
Please fill out an evaluation form when you have a chance.