Apache Cassandra in the Cloud
![Page 1: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/1.jpg)
Cassandra in the Cloud
Adam Zegelin
Co-founder and VP of Engineering @ Instaclustr
Sydney Tech Day
instaclustr.com @Instaclustr
![Page 2: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/2.jpg)
Instaclustr
• Instaclustr provides Cassandra-as-a-Service in the cloud.
• Aussie startup, based out of Canberra. Our CTO, Ben Bromhead, recently opened our Silicon Valley office. We’re soon to open an office in London.
• Real production experience — several customers in production.
• Currently running on Amazon Web Services; Microsoft Azure is in private beta. In discussions with Google and IBM, with more to come.
• DataStax are investors & partners.
![Page 3: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/3.jpg)
• Ben and I started out as a different company: data hosting & marketplace, running in the cloud.
• As a startup we wanted to use Cassandra in the cloud alongside our app, but not host & manage it
• Founded Instaclustr — now our full focus
• Instaclustr: Focus on writing your app, not managing database infrastructure
![Page 4: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/4.jpg)
C* Model
• One database, many servers
• All servers (nodes) participate in the cluster
• Decentralised
• Need more capacity? Add more servers!
• Multiple servers ≣ built-in redundancy
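
The slide does not say how data is spread across nodes, so the following is only a toy sketch of the general idea behind "one database, many servers": keys are hashed onto a ring and each node owns a slice of it. The `Ring` class, the node names and the use of MD5 are assumptions for illustration; this is not Cassandra's actual partitioner.

```python
import bisect
import hashlib


class Ring:
    """Toy token ring: a key belongs to the first node token at or after its hash."""

    def __init__(self, nodes):
        # One token per node, derived from the node name (illustrative only).
        self._tokens = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def owner(self, key):
        # Walk clockwise from the key's hash, wrapping around to the first token.
        tokens = [t for t, _ in self._tokens]
        i = bisect.bisect_left(tokens, self._hash(key))
        return self._tokens[i % len(self._tokens)][1]

    def add_node(self, node):
        # A new node takes over only the slice of the ring just before its token.
        bisect.insort(self._tokens, (self._hash(node), node))


ring = Ring(["node-a", "node-b", "node-c"])  # hypothetical node names
print(ring.owner("user:42"))
```

In this sketch, adding a node moves only the keys that fall into the new node's slice; every other key stays where it was, which is why capacity can be added without reshuffling the whole data set.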
![Page 5: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/5.jpg)
100,000 ops/sec
200,000 ops/sec
400,000 ops/sec
![Page 6: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/6.jpg)
[Figure: client and cluster diagram; labels: client, 0, 4, 28]
![Page 7: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/7.jpg)
Hosting C*
• Traditional — servers, racks, data centres
• Cloud — unlimited* compute resources
• Hybrid/cloud bursting — overflow into the cloud
![Page 8: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/8.jpg)
Traditional Model
• Buy or rent your own servers
• Manage hardware & software deployments and updates
• Slow time to market
• Inflexible
• Buy enough hardware to handle peak load upfront
• Requires good capacity planning
• Co-locate in a data centre, or build your own if you're big enough (or have certain requirements)
![Page 9: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/9.jpg)
☁
• Pay for what you use
• Hardware is no longer a concern
• Almost instant-on compute resources: < 1 minute boot time
• Flexible
• Scale up and down with load
• Respond quickly to changes in capacity requirements
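
To make the contrast with buying for peak load concrete, here is a rough sketch that compares node-hours for a fixed, peak-sized cluster against one that follows the load hour by hour. All numbers (`hourly_load`, `ops_per_node`) are hypothetical, and the model ignores the time a cluster needs to stream data when nodes join or leave.

```python
import math


def nodes_needed(load_ops, ops_per_node):
    """Smallest node count that covers the load (at least one node)."""
    return max(1, math.ceil(load_ops / ops_per_node))


def node_hours(hourly_load, ops_per_node):
    """Return (fixed, elastic) node-hours for a list of hourly loads in ops/sec."""
    # Fixed: provision for the peak hour and keep that size all day.
    fixed = nodes_needed(max(hourly_load), ops_per_node) * len(hourly_load)
    # Elastic: resize every hour to match the load.
    elastic = sum(nodes_needed(load, ops_per_node) for load in hourly_load)
    return fixed, elastic


# Hypothetical day: 20 quiet hours and a 4-hour peak.
hourly_load = [20_000] * 20 + [100_000] * 4
fixed, elastic = node_hours(hourly_load, ops_per_node=25_000)
print(fixed, elastic)  # 96 36
```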
![Page 10: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/10.jpg)
☁
• Node replacement is easy
• Global deployments: C* nodes with a data replica + app instance close to your users.
• Redundancy: not all your eggs in one basket (availability zones, regions, providers)
• Split workloads: one DC for the app, one DC for analytics. This reduces the performance impact of data analytics on user-facing nodes.
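
One way the split-workload setup is commonly expressed is a keyspace that uses `NetworkTopologyStrategy` with a replica count per data centre. The helper below is a minimal sketch that only builds the CQL text; the keyspace name `clicks` and the DC names `app_dc` and `analytics_dc` are made up for illustration, and real DC names depend on how the cluster is configured.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement with a replica count per data centre."""
    if not replicas_per_dc:
        raise ValueError("at least one data centre is required")
    # e.g. "'app_dc': 3, 'analytics_dc': 2"
    options = ", ".join(
        "'%s': %d" % (dc, rf) for dc, rf in replicas_per_dc.items()
    )
    return (
        "CREATE KEYSPACE IF NOT EXISTS " + keyspace
        + " WITH replication = {'class': 'NetworkTopologyStrategy', "
        + options + "};"
    )


# Hypothetical names: three replicas for the app DC, two for analytics.
print(create_keyspace_cql("clicks", {"app_dc": 3, "analytics_dc": 2}))
```

The app would then talk to nodes in its own DC while analytics jobs read from the other, so heavy scans do not compete with user-facing requests.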
![Page 11: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/11.jpg)
Hybrid/Cloud Bursting
• C* nodes in your own data centre and C* nodes in the cloud
• Both are C* replicated data centres
• Live backups & fail-over — faster data recovery
• On-demand extra capacity for planned peak loads
• Periodic computationally expensive analytics jobs
![Page 12: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/12.jpg)
Amazon Web Services
• Largest cloud provider: well supported (documentation, community, value-add services)
• Multiple regions — Sydney, APAC, US, Europe
• Node sizes to fit all C* use cases
• SSD-backed nodes
• Virtual Private Clouds
• VPNs and VPC peering for isolated access
![Page 13: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/13.jpg)
Gotchas
• Buggy APIs
• Noisy neighbours
• Data sovereignty & security
![Page 14: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/14.jpg)
C* + ☁ + Instaclustr
• We run C* for metrics and log storage (also dog-fooding)
• The scale of this cluster, in both performance & storage, will grow as we manage more nodes: more nodes = more data + more ops/sec
• Improved dev & testing: every developer can run their own copy of our app + the monitoring cluster, on demand
![Page 15: Apache Cassandra in the Cloud](https://reader036.vdocument.in/reader036/viewer/2022073102/55cebddebb61eb912f8b4806/html5/thumbnails/15.jpg)
Case Study
• Advertising company: records click metrics and serves targeted advertisements
• Requires < 10 ms response time from C*
• Originally managed it themselves
• Managing their app and the C* cluster was a burden on their engineering team
• Switched over to Instaclustr
• Instaclustr + Cloud + C* is flexible. Their performance requirements changed, and the cloud & C* allowed us to change the underlying virtual machines. In production. At runtime.