Active Cloud DB at CloudComp '10
![Page 1: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/1.jpg)
Active Cloud DB: A RESTful Software-as-a-Service for Language-Agnostic Access to Distributed Datastores
Chris Bunch, Jonathan Kupferman, Chandra Krintz
Wednesday, October 27, 2010
CloudComp 2010
![Page 2: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/2.jpg)
Who’s Using NoSQL?
and many others!
![Page 3: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/3.jpg)
Do It Yourself!
• Pick a datastore
• Learn how the interfaces SHOULD work
• Learn how the interfaces REALLY work
• Migrate to a non-relational data model
• Each of these is non-trivial!
![Page 4: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/4.jpg)
Trouble in Paradise
(at least they’re honest about it)
![Page 5: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/5.jpg)
The Problem
• No way to compare databases with real applications
• No standard on what a real test is
• Too many variables in the equation
• Topology, query language, data model, APIs, consistency settings (to name a few)
![Page 6: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/6.jpg)
You Need A Better Way
• Need a platform to:
• Easily evaluate datastores
• Quickly evaluate datastores
• Evaluate datastores on similar metrics
![Page 7: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/7.jpg)
Our Contribution
• Active Cloud DB: A Google App Engine app that exposes the DB via REST
• Exposes string key/value DB
• Speeds up repeated operations via caching
• Works on Google or AppScale
• Free access to BigTable
![Page 8: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/8.jpg)
![Page 9: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/9.jpg)
Realistically Speaking
• One test takes about 2 hours
• In one day at work you could generate a graph comparing:
• HBase
• Cassandra
• Google BigTable
• Amazon SimpleDB
![Page 10: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/10.jpg)
RESTful Interface
• GET /resources/key ➜ get
• POST /resources/key (with value) ➜ put
• DELETE /resources/key ➜ delete
• GET /resources ➜ query (get all)
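
The mapping above can be sketched as a small Ruby client. This is a minimal sketch under stated assumptions, not the project's actual client: the class name `ActiveCloudDBClient`, the helper `request_for`, and the form field name `value` are all hypothetical, since the slides do not say how the value is encoded in the POST body.

```ruby
require "erb"
require "net/http"
require "uri"

# Hypothetical client; class and method names are illustrative only.
class ActiveCloudDBClient
  def initialize(base_url)
    @base = URI(base_url)
  end

  # Build (but do not send) the HTTP request for one operation.
  def request_for(operation, key = nil, value = nil)
    case operation
    when :get
      Net::HTTP::Get.new(path(key))
    when :put
      request = Net::HTTP::Post.new(path(key))
      request.set_form_data("value" => value) # field name is an assumption
      request
    when :delete
      Net::HTTP::Delete.new(path(key))
    when :query
      Net::HTTP::Get.new("/resources")
    else
      raise ArgumentError, "unknown operation: #{operation}"
    end
  end

  # Send the request and return the response body as a string.
  def call(operation, key = nil, value = nil)
    use_ssl = @base.scheme == "https"
    Net::HTTP.start(@base.host, @base.port, use_ssl: use_ssl) do |http|
      http.request(request_for(operation, key, value)).body
    end
  end

  private

  # Percent-encode the key so it is safe to place in the URL path.
  def path(key)
    "/resources/#{ERB::Util.url_encode(key)}"
  end
end
```

With a running instance, `ActiveCloudDBClient.new("http://your-app.appspot.com").call(:get, "some-key")` would issue the GET shown on the slide. Keeping request construction separate from sending makes the mapping easy to inspect without a server.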
![Page 11: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/11.jpg)
Caching Support
• Leverages Memcache API / memcached
• Provides a Least-Recently-Used Cache
• Write-through caching strategy - all puts / deletes are written to the cache
• Generational caching strategy - queries use a generation number
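
The two strategies above can be illustrated with a small in-memory sketch. It is an assumption-laden illustration rather than the Active Cloud DB implementation: plain Ruby hashes stand in for both memcached and the datastore, LRU eviction is omitted, the `CachedStore` class and its cache key prefixes are hypothetical, and treating a delete as removal of the cached item is one possible reading of the slide.

```ruby
# Hypothetical wrapper; a Hash stands in for both memcached and the datastore.
class CachedStore
  attr_reader :datastore_reads

  def initialize
    @db = {}              # stand-in for the backing datastore
    @cache = {}           # stand-in for memcached (no LRU eviction here)
    @generation = 0       # bumped on every write to invalidate cached queries
    @datastore_reads = 0  # counts reads that missed the cache
  end

  # Write-through: the datastore and the cache are updated together.
  def put(key, value)
    @db[key] = value
    @cache["item:#{key}"] = value
    @generation += 1
  end

  def delete(key)
    @db.delete(key)
    @cache.delete("item:#{key}")
    @generation += 1
  end

  def get(key)
    cache_key = "item:#{key}"
    return @cache[cache_key] if @cache.key?(cache_key)

    @datastore_reads += 1
    value = @db[key]
    @cache[cache_key] = value unless value.nil?
    value
  end

  # Generational: the cache key embeds the generation number, so results
  # cached before the latest write are simply never looked up again.
  def query
    cache_key = "query:#{@generation}"
    return @cache[cache_key] if @cache.key?(cache_key)

    @datastore_reads += 1
    @cache[cache_key] = @db.dup
  end
end
```

In a real deployment the stale `query:<n>` entries would not be deleted explicitly; they would age out of the least-recently-used cache on their own.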
![Page 12: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/12.jpg)
Bookstore App
• Four prototypes available that use Active Cloud DB:
• Ruby on Rails
• Ruby (through Sinatra)
• Python (via Django)
• Python (through web.py)
![Page 13: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/13.jpg)
![Page 14: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/14.jpg)
The Actual Code
• With BigTable:
• val = `curl -X GET http://your-app.appspot.com/resources/#{key}`
• Or in AppScale:
• val = `curl -X GET http://128.111.55.223:8080/resources/#{key}`
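
The one-liners above interpolate `key` straight into a shell command, which works for simple keys. A hedged variant for keys that may contain spaces or other special characters could look like the sketch below; the helper name `curl_get_command` is hypothetical and not part of Active Cloud DB.

```ruby
require "erb"
require "shellwords"

# Hypothetical helper: builds the curl command for a GET on one key.
def curl_get_command(base_url, key)
  # Percent-encode the key for the URL path, then shell-escape the whole URL.
  url = "#{base_url}/resources/#{ERB::Util.url_encode(key)}"
  "curl -X GET #{Shellwords.escape(url)}"
end

# Usage against either backend; only the base URL changes:
#   val = `#{curl_get_command("http://your-app.appspot.com", key)}`
#   val = `#{curl_get_command("http://128.111.55.223:8080", key)}`
```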
![Page 15: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/15.jpg)
AppScale
• Originally presented at CloudComp 2009
• An open-source implementation of the Google App Engine APIs
• Automatically configures and deploys cloud infrastructures to run your application
• Includes database deployment
![Page 16: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/16.jpg)
• Supported Datastores as of AppScale 1.4:
• HBase, Hypertable
• MySQL
• Cassandra, Voldemort, Scalaris
• MongoDB
• MemcacheDB
• Amazon SimpleDB
![Page 17: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/17.jpg)
![Page 18: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/18.jpg)
Not Good Enough
• AppScale / GAE solve the problem for Python and Java
• But only with certain APIs
• And with certain restrictions
• Need something general purpose
• All languages, no restrictions
![Page 19: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/19.jpg)
But how do we test it?
• Cassandra 0.5.0 / MemcacheDB 1.2.1β
• Place 1000 items in the database and time:
• Get, put, query, delete operations
• Nine accessor threads
• Standard deployment model
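
The batch test described above could be driven by a harness along these lines. This is only a sketch: `StubStore` is an in-memory stand-in for the REST endpoint, and the names `time_operation` and `batch_test`, the put/get/query/delete ordering, and timing the query as a single call are all assumptions not stated on the slide.

```ruby
require "benchmark"

# In-memory, thread-safe stand-in for the datastore behind the REST interface.
class StubStore
  def initialize
    @data = {}
    @lock = Mutex.new
  end

  def put(key, value)
    @lock.synchronize { @data[key] = value }
  end

  def get(key)
    @lock.synchronize { @data[key] }
  end

  def delete(key)
    @lock.synchronize { @data.delete(key) }
  end

  def query
    @lock.synchronize { @data.dup }
  end

  def size
    @lock.synchronize { @data.size }
  end
end

# Apply the block to every key, split across `threads` worker threads,
# and return the elapsed wall-clock time in seconds.
def time_operation(keys, threads, &operation)
  slices = keys.each_slice((keys.size.to_f / threads).ceil).to_a
  Benchmark.realtime do
    workers = slices.map do |slice|
      Thread.new { slice.each { |key| operation.call(key) } }
    end
    workers.each(&:join)
  end
end

# Time each operation type over `items` keys using `threads` accessor threads.
def batch_test(store, items: 1000, threads: 9)
  keys = (1..items).map { |i| "key#{i}" }
  {
    put: time_operation(keys, threads) { |key| store.put(key, "value-#{key}") },
    get: time_operation(keys, threads) { |key| store.get(key) },
    query: Benchmark.realtime { store.query },
    delete: time_operation(keys, threads) { |key| store.delete(key) }
  }
end
```

Swapping `StubStore` for an object that issues the corresponding HTTP requests would turn the same harness into a test against a live deployment.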
![Page 20: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/20.jpg)
![Page 21: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/21.jpg)
![Page 22: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/22.jpg)
![Page 23: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/23.jpg)
A different type of test
• Workload model
• 10000 random operations selected
• 50/30/20 get/put/query ratio
• Constrained to 16 nodes
• Performed on initially empty database
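
One way to generate such a workload is sketched below. The 10,000 operations and the 50/30/20 mix come from the slide; the function names, the key space of 1,000 keys, the value format, and the fixed seed are assumptions added so the sketch is reproducible.

```ruby
# Pick one operation according to the percentage weights in `mix`.
def pick_operation(rng, mix = { get: 50, put: 30, query: 20 })
  roll = rng.rand(100) # integer in 0..99
  mix.each do |operation, weight|
    return operation if roll < weight

    roll -= weight
  end
end

# Build a list of [operation, key, value] tuples; key_space and seed are
# illustrative choices, not values from the talk.
def generate_workload(count: 10_000, key_space: 1000, seed: 42)
  rng = Random.new(seed)
  Array.new(count) do
    operation = pick_operation(rng)
    key = "key#{rng.rand(key_space)}"
    case operation
    when :get   then [:get, key]
    when :put   then [:put, key, "value#{rng.rand(1_000_000)}"]
    when :query then [:query]
    end
  end
end
```

Because the database starts empty, many early gets in such a workload will target keys that have not been written yet, which is part of what distinguishes this test from the batch test.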
![Page 24: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/24.jpg)
![Page 25: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/25.jpg)
![Page 26: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/26.jpg)
![Page 27: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/27.jpg)
Future Work
• Performance impact of:
• Cache size
• Millions of items in DB
• Overhead of Active Cloud DB
• Transaction support
![Page 28: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/28.jpg)
Related Work
• BigTable as a Web Service
• Not open source, HBase-like API
• Yahoo Cloud Serving Benchmark [SOCC10]
• Doesn’t run applications
• No automation - you set up the DB, you set up the schemas, etc.
![Page 29: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/29.jpg)
Active Cloud DB is Open for Business
• Open source - free to use
• Customize your own batch test or workload test
• Access it via any programming language
• Bookstore applications included
![Page 30: Active Cloud DB at CloudComp '10](https://reader033.vdocument.in/reader033/viewer/2022051816/54620ef9af79599e2c8b492e/html5/thumbnails/30.jpg)
Thanks!
• Download Active Cloud DB and AppScale:
• http://appscale.cs.ucsb.edu
• To my advisor, Chandra Krintz
• To the AppScale team, especially co-lead Navraj Chohan