Cassandra & Puppet: Scaling data at $15 per month
DESCRIPTION
Constant Contact shares lessons learned from a DevOps approach to implementing Cassandra to manage social media data for over 400k small business customers. Puppet is the critical piece of our tool chain. The single most important factor was the willingness of Development and Operations to stretch beyond traditional roles and responsibilities.
TRANSCRIPT
![Page 1: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/1.jpg)
Copyright © 2011 Constant Contact Inc. 1
Cassandra & Puppet: Scaling data at $15/month
Constant Contact
March 2011
Dave Connors – VP Operations
Jim Ancona – Systems Architect
Mark Schena – Manager Systems Automation
![Page 2: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/2.jpg)
Constant Contact
2000 – 2010
Market leader for Small Businesses
• Email, Event & Survey
• Over 400k paying customers
• No. 134 on the Deloitte Technology Fast 500 listing
Business model
• Many customers pay as little as $15 a month
• ~2 million database transactions per minute
![Page 3: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/3.jpg)
Constant Contact
The business problem
![Page 4: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/4.jpg)
Constant Contact
Small Businesses are looking to us for help with Social Media marketing
• Social media brings 10-100 times more data
• Challenge with our business model
![Page 5: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/5.jpg)
The Key Challenge
Integrate social media data
• Solution = NoSQL
• Cost = Low
• Time to market = ?
![Page 6: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/6.jpg)
Implementation
Ops and Dev both face issues
• Data model
• Monitoring
• Authentication
• Logging
• Risk profile
• Roles & Responsibilities
Implementing NoSQL
![Page 7: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/7.jpg)
Dev
Ops
![Page 8: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/8.jpg)
Apache Cassandra
• Developed at Facebook
• Open sourced in 2008
• Incubated at Apache
• Became an Apache top-level project in 2010
• http://cassandra.apache.org
• In use at Digg, Facebook, Twitter, Reddit, Rackspace, Cloudkick, Cisco, …
• Largest production cluster has over 100 TB of data in over 150 machines
![Page 9: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/9.jpg)
What is Cassandra?
• Implemented in Java
• Fault Tolerant
• Elastic
• Durable
• Rich data model
• Replicated data
• Consistency options
![Page 10: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/10.jpg)
Replication
[Diagram: three copies of X]
How many copies of each piece of data do we want?
N=3
![Page 11: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/11.jpg)
Consistency Level ONE
[Diagram: a Writer and a Reader working against replicas that hold a mix of X and Y values]
![Page 12: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/12.jpg)
Consistency Level Quorum
[Diagram: a Writer and a Reader working against replicas that hold a mix of X and Y values]
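The difference between the two consistency slides can be checked with a little arithmetic. The sketch below is not from the talk; it assumes the usual rule of thumb that a read is guaranteed to overlap the latest write when the number of replicas read (R) plus the number written (W) exceeds the replication factor (N). The function names are illustrative only.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority of N."""
    return replication_factor // 2 + 1


def read_overlaps_write(n: int, w: int, r: int) -> bool:
    """True when any R replicas read must include one of the W written (R + W > N)."""
    return r + w > n


N = 3  # replication factor from the Replication slide
Q = quorum(N)

print("quorum for N=3:", Q)
print("ONE write + ONE read overlap:", read_overlaps_write(N, 1, 1))
print("QUORUM write + QUORUM read overlap:", read_overlaps_write(N, Q, Q))
```

With N=3, reading and writing at ONE touches one replica each, so a reader can land on a replica that has not yet received the new value. At QUORUM both sides touch two of the three replicas, so at least one replica is common to both.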
![Page 13: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/13.jpg)
Risks and Mitigation
Risks
• Moving target
• Developer unfamiliarity
• Operational procedures
• Reliability concerns
Mitigation
• Deployment automation
• Community involvement
• Training/Consulting
• Application selection
• Lots of monitoring
• Phased rollout
![Page 14: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/14.jpg)
Development Challenges
Understanding the data model
Choosing a client
■ Clients available for Java, Python, .NET, Ruby, PHP
■ Don’t use Thrift
Moving target
![Page 15: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/15.jpg)
Open Source
• Not “one neck to wring”
• Paid support and training are available: http://datastax.com
• Community
■ Mailing lists
■ IRC #cassandra at freenode
• Contribute
![Page 16: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/16.jpg)
Phased Rollout
• Switchable modes
• Mirroring
• Dial-able traffic
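The slide gives no implementation detail, so the following is only one way the three ideas could fit together, written as a minimal sketch. Everything in it is hypothetical: the mode names, the store labels "old" and "new", and the choice of a CRC32 bucket to decide which accounts fall inside the dial.

```python
import zlib
from enum import Enum
from typing import List


class Mode(Enum):
    """Hypothetical switchable modes for a phased rollout."""
    OLD_ONLY = "old_only"   # legacy store only
    MIRROR = "mirror"       # legacy store, plus mirrored writes to the new store
    NEW_ONLY = "new_only"   # new store only


def in_dial(account_id: str, dial_percent: int) -> bool:
    """Deterministically place an account in a 0-99 bucket and compare to the dial."""
    bucket = zlib.crc32(account_id.encode("utf-8")) % 100
    return bucket < dial_percent


def write_targets(mode: Mode, account_id: str, dial_percent: int) -> List[str]:
    """Return which stores a write should go to under the current mode and dial."""
    if mode is Mode.OLD_ONLY:
        return ["old"]
    if mode is Mode.NEW_ONLY:
        return ["new"]
    # MIRROR: always write the old store; mirror only the dialed-in share of accounts.
    if in_dial(account_id, dial_percent):
        return ["old", "new"]
    return ["old"]


print(write_targets(Mode.MIRROR, "account-42", 100))
```

Because the bucket is derived from the account identifier, turning the dial up only ever adds accounts to the mirrored set, and turning it to 0 sends all traffic back to the old store.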
![Page 17: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/17.jpg)
Collaboration
• Big, complex project
• Close collaboration
• Flexible roles
• Ability to iterate
![Page 18: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/18.jpg)
Dev
Ops
![Page 19: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/19.jpg)
“Are you sure you really want that?”
• 3 x 500 GB disks
• 1 x 250 GB disk
• No swap
• RAID 0 for root partition and data storage
• 32 GB memory
![Page 20: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/20.jpg)
We will need how many servers?
![Page 21: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/21.jpg)
How many nodes?
• Quorum = 3
• Multiple Datacenters = 2
• Use only half the available disk = 2
• 12 Servers = ~1 TB of Data Storage
• ~6 TB of Data Storage
3 x 2 = 6
6 x 2 = 12
12 x 6 = 72
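The multiplication on this slide can be written out as a short calculation. This is a sketch of the slide's arithmetic only, assuming (as the slide does) that each group of 12 servers yields roughly 1 TB of usable data; the function and parameter names are made up for illustration.

```python
import math


def nodes_needed(target_tb: float,
                 replicas: int = 3,
                 datacenters: int = 2,
                 disk_headroom: int = 2) -> int:
    """Servers required when each ~1 TB of data costs replicas x datacenters x headroom servers."""
    servers_per_tb = replicas * datacenters * disk_headroom  # 3 x 2 x 2 = 12
    return math.ceil(target_tb) * servers_per_tb


print("servers for ~1 TB:", nodes_needed(1))
print("servers for ~6 TB:", nodes_needed(6))
```

For the ~6 TB target this reproduces the 72 nodes quoted later in the conclusion.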
![Page 22: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/22.jpg)
Random Partitioner
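The slide shows only the title, so the details here are background rather than the speakers' material. With Cassandra's RandomPartitioner, row keys are hashed with MD5 onto a ring of tokens from 0 to 2^127, and a balanced cluster gives node i the token i * 2^127 / N. The sketch below computes those evenly spaced tokens and looks up which node owns a key. The key hashing is deliberately simplified (digest taken modulo the ring size), so it will not match Cassandra's real token values; helper names are illustrative.

```python
import bisect
import hashlib
from typing import List

RING_SIZE = 2 ** 127  # token space used by RandomPartitioner


def initial_tokens(node_count: int) -> List[int]:
    """Evenly spaced tokens, one per node, so each node owns an equal slice of the ring."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def token_for_key(key: bytes) -> int:
    """Simplified stand-in for the partitioner: MD5 digest folded into the ring."""
    return int.from_bytes(hashlib.md5(key).digest(), "big") % RING_SIZE


def owner_index(token: int, tokens: List[int]) -> int:
    """A node owns the range (previous token, its own token]; wrap around past the last node."""
    return bisect.bisect_left(tokens, token) % len(tokens)


tokens = initial_tokens(72)
key_token = token_for_key(b"customer:12345")
print("node index for key:", owner_index(key_token, tokens))
```

Because MD5 spreads keys roughly uniformly, evenly spaced tokens give each of the 72 nodes a similar share of the data without any knowledge of the keys themselves.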
![Page 23: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/23.jpg)
Tool Chain
![Page 24: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/24.jpg)
DevOps with Puppet
• Puppet is the shared framework between Operations and Development
• Versioning of Puppet code allows for adoption of development best practices
• Leverage domain-specific knowledge and skill
![Page 25: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/25.jpg)
Always Move Forward
![Page 26: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/26.jpg)
Operational Efficiencies
• Remote logging is a requirement
• Cassandra uses log4j natively
• Resources not available for remote log4j development
• Scribed with Puppet provides the solution
![Page 27: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/27.jpg)
Development takes the Operational Lead
• Munin
• JMX trending
• Identify critical data points
• Rapid development of graphs
• Puppet Definitions are used for rapid deployment
![Page 28: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/28.jpg)
Sample Munin Graph
![Page 29: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/29.jpg)
Example: Munin Puppet Code

    define munin::cassandracolumnfamily ( ) {
      include cassandravirtual
      # Realize the virtual File resource titled "jmxbin"
      File <| title == "jmxbin" |>

      $confdir   = "/opt/cassandra-munin-plugins"
      $plugindir = "/etc/munin/plugins"
      $target    = "/opt/cassandra-munin-plugins/jmx_"

      # Match 3 strings separated by periods
      $pattern      = '^([^.]*)[.]([^.]*)[.]([^.]*)$'
      $keyspace     = regsubst($name, $pattern, '\1')
      $columnfamily = regsubst($name, $pattern, '\2')
      $file         = regsubst($name, $pattern, '\3')

      # Per-attribute plugin configuration rendered from a template
      file { "${keyspace}_${columnfamily}_${file}.conf":
        owner   => 'root',
        ensure  => 'file',
        group   => 'root',
        type    => 'file',
        path    => "${confdir}/${keyspace}_${columnfamily}_${file}.conf",
        mode    => '644',
        content => template("munin/attribute_${file}.conf.erb"),
        require => [
          Package['munin-node'],
          File['/opt/cassandra-munin-plugins'],
          File['jmxquery'],
        ],
      }

      # Symlink in the Munin plugin directory pointing at the shared JMX plugin
      file { "$plugindir/${keyspace}_${columnfamily}_${file}":
        ensure  => 'link',
        owner   => 'root',
        group   => 'root',
        mode    => '511',
        type    => 'link',
        target  => "$target",
        require => [
          File['/opt/cassandra-munin-plugins'],
          File["${keyspace}_${columnfamily}_${file}.conf"],
          File['jmxquery'],
          Package['munin-node'],
        ],
      }
    }
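To make the naming convention in the Puppet define easier to follow, here is a small sketch that applies the same regular expression to a resource title and derives the two paths the define manages. The title "Social.Posts.read_latency" is a made-up example, and the helper names are hypothetical. One difference from the Puppet code is flagged in the comments: Puppet's regsubst leaves a non-matching string unchanged, whereas this sketch raises an error.

```python
import re
from typing import Dict, Tuple

# Same pattern as the Puppet define: three strings separated by periods.
PATTERN = re.compile(r'^([^.]*)[.]([^.]*)[.]([^.]*)$')

CONFDIR = "/opt/cassandra-munin-plugins"
PLUGINDIR = "/etc/munin/plugins"
TARGET = "/opt/cassandra-munin-plugins/jmx_"


def split_title(name: str) -> Tuple[str, str, str]:
    """Split 'keyspace.columnfamily.file' into its three parts."""
    match = PATTERN.match(name)
    if match is None:
        # Unlike regsubst, which would return the title unchanged, fail loudly here.
        raise ValueError(f"expected keyspace.columnfamily.file, got {name!r}")
    keyspace, columnfamily, file = match.groups()
    return keyspace, columnfamily, file


def managed_paths(name: str) -> Dict[str, str]:
    """Paths of the config file and plugin symlink the define would manage for this title."""
    keyspace, columnfamily, file = split_title(name)
    base = f"{keyspace}_{columnfamily}_{file}"
    return {
        "conf": f"{CONFDIR}/{base}.conf",
        "plugin_link": f"{PLUGINDIR}/{base}",
        "link_target": TARGET,
    }


print(managed_paths("Social.Posts.read_latency"))
```

One resource title therefore carries everything needed to graph a single JMX attribute of a single column family, which is what lets new graphs be rolled out by adding one line per attribute.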
![Page 30: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/30.jpg)
Conclusion
• Cassandra as an appliance
• Development Best Practices with Life Cycle Management
• Traditional vs. Today
■ Infrastructure: 4 weeks vs. 4 hours to build 72 nodes
■ Development to Deployment: 9 months vs. 3 months
■ Cost: Millions vs. 150k
![Page 31: Cassandra & puppet, scaling data at $15 per month](https://reader033.vdocument.in/reader033/viewer/2022061201/5469e960af7959653c8b645e/html5/thumbnails/31.jpg)
Q&A
Thank You!