SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012
DESCRIPTION
The database world is undergoing a major upheaval. NoSQL databases such as MongoDB and Cassandra are emerging as a compelling choice for many applications. They can simplify the persistence of complex data models and offer significantly better scalability and performance. But these databases have very different and unfamiliar data models and APIs, as well as a limited transaction model. Moreover, the relational world is fighting back with so-called NewSQL databases such as VoltDB, which uses a radically different architecture to offer high scalability and performance as well as the familiar relational model and ACID transactions. Sounds great, but unlike with a traditional relational database you can’t use JDBC and you must partition your data. In this presentation you will learn about popular NoSQL databases (MongoDB and Cassandra) as well as VoltDB. We will compare and contrast each database’s data model and Java API using NoSQL and NewSQL versions of a use case from the book POJOs in Action. We will learn about the benefits and drawbacks of using NoSQL and NewSQL databases.
TRANSCRIPT
![Page 1: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/1.jpg)
SQL, NoSQL, NewSQL? What's a developer to do?
Chris Richardson
Author of POJOs in Action Founder of the original CloudFoundry.com
@crichardson Blog: http://plainoldobjects.com
Copyright (c) 2012 Chris Richardson. All rights reserved.
![Page 2: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/2.jpg)
Overall presentation goal
The joy and pain of building Java applications that use NoSQL and NewSQL
2
![Page 3: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/3.jpg)
About Chris
3
![Page 4: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/4.jpg)
(About Chris)
4
![Page 5: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/5.jpg)
About Chris()
5
![Page 6: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/6.jpg)
About Chris
6
![Page 7: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/7.jpg)
About Chris
http://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/
7
![Page 8: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/8.jpg)
About Chris
Developer Advocate for CloudFoundry.com
8
![Page 9: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/9.jpg)
Agenda
Why NoSQL? NewSQL?
Persisting entities
Implementing queries
Slide 9
![Page 10: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/10.jpg)
Food to Go
Take-out food delivery service
“Launched” in 2006
Used a relational database (naturally)
10
![Page 11: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/11.jpg)
Success Growth challenges
Increasing traffic
Increasing data volume
Distribute across a few data centers
Increasing domain model complexity
![Page 12: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/12.jpg)
Limitations of relational databases
Scaling
Distribution
Updating schema
O/R impedance mismatch
Handling semi-structured data
12
![Page 13: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/13.jpg)
Solution: Spend Money
Hire more DevOps
Use application-level sharding
Build your own middleware
…
Buy SSD and RAM
Buy Oracle
Buy high-end servers
…
http://upload.wikimedia.org/wikipedia/commons/e/e5/Rising_Sun_Yacht.JPG
OR
http://www.trekbikes.com/us/en/bikes/road/race_performance/madone_5_series/madone_5_2/#
13
![Page 14: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/14.jpg)
Solution: Use NoSQL
Benefits
Higher performance
Higher scalability
Richer data-model
Schema-less
Drawbacks
Limited transactions
Relaxed consistency
Unconstrained data
14
![Page 15: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/15.jpg)
Slide 15
MongoDB
![Page 16: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/16.jpg)
MongoDB is a document oriented DB
16
Server
Database: Food To Go
Collection: Restaurants
{
  "_id": ObjectId("4bddc2f49d1505567c6220a0"),
  "name": "Ajanta",
  "serviceArea": ["94619", "99999"],
  "openingHours": [
    { "dayOfWeek": 1, "open": 1130, "close": 1430 },
    { "dayOfWeek": 2, "open": 1130, "close": 1430 },
    …
  ]
}
BSON = binary JSON
Stored as a sequence of bytes on disk, so I/O is fast
![Page 17: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/17.jpg)
MongoDB – writes are fast
Slide 17
Client to MongoDB:
Insert(collection, documents) (Async)
Insert(collection, documents) (Async)
Insert(collection, documents) (Async)
getLastError() (Blocks until writes succeed)
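The write pattern on this slide can be sketched in plain Java. This is a toy model, not the MongoDB driver: `FireAndForgetWrites`, `insert` and `awaitLastWrite` are hypothetical names, and a single background thread stands in for the server. Inserts return immediately; only the final call blocks, much as getLastError() does.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical in-memory stand-in for a client connection; not the MongoDB driver API.
final class FireAndForgetWrites {
    // One daemon thread plays the role of the server and applies writes in order.
    private final ExecutorService server = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });
    private final List<String> stored = new CopyOnWriteArrayList<>();
    private CompletableFuture<Void> lastWrite = CompletableFuture.completedFuture(null);

    // Returns immediately: the document is queued and applied later.
    synchronized void insert(String document) {
        lastWrite = CompletableFuture.runAsync(() -> stored.add(document), server);
    }

    // Blocks until every write queued so far has been applied, then reports the count.
    synchronized int awaitLastWrite() {
        lastWrite.join();
        return stored.size();
    }

    public static void main(String[] args) {
        FireAndForgetWrites client = new FireAndForgetWrites();
        client.insert("{name: 'Ajanta'}");
        client.insert("{name: 'Montclair Eggshop'}");
        client.insert("{name: 'Another restaurant'}");
        System.out.println("writes applied: " + client.awaitLastWrite());
    }
}
```

Because the writes are applied in order on one thread, waiting for the most recent one is enough to know that the earlier ones have completed too.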
![Page 18: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/18.jpg)
MongoDB is scalable
[Diagram: client, mongos, Config Server (mongod), and two shards; Shard 1 and Shard 2 each contain one mongod (master) and two mongod (replica)]
Collections spread over multiple shards
A shard consists of a replica set = generalization of master-slave
4/11/12 Slide 18
![Page 19: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/19.jpg)
MongoDB use cases
Use cases:
High volume writes
Complex data
Semi-structured data
Used by: Shutterfly, Foursquare, Bit.ly, …
Slide 19
![Page 20: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/20.jpg)
20
Apache Cassandra
![Page 21: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/21.jpg)
Cassandra is a column-oriented DB
Keyspace
Column Family
Row Key K1: (N1, V1, TS1) (N2, V2, TS2) (N3, V3, TS3)
Row Key K2: (N1, V1, TS1) (N2, V2, TS2) (N3, V3, TS3)
N = Column Name, V = Column Value, TS = Timestamp
21
Column name/value: number, string, Boolean, timestamp, and composite
![Page 22: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/22.jpg)
Cassandra – inserting/updating data
CF.insert(key=K1, (N4, V4, TS4), …)
Idempotent = transaction
Column Family before the insert
K1: (N1, V1, TS1) (N2, V2, TS2) (N3, V3, TS3)
…
Column Family after the insert
K1: (N1, V1, TS1) (N2, V2, TS2) (N3, V3, TS3) (N4, V4, TS4)
…
22
![Page 23: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/23.jpg)
Cassandra – retrieving data
CF.slice(key=K1, startColumn=N2, endColumn=N4)
Column Family
K1: (N1, V1, TS1) (N2, V2, TS2) (N3, V3, TS3) (N4, V4, TS4)
…
Result
K1: (N2, V2, TS2) (N3, V3, TS3) (N4, V4, TS4)
23
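The insert and slice operations from the last two slides can be modelled with sorted maps. The sketch below is a toy model in plain Java, not a Cassandra client: `ColumnFamilySketch` is a hypothetical class, timestamps are left out, and column names are assumed to sort as strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model of a column family: row key -> columns kept sorted by column name.
final class ColumnFamilySketch {
    private final Map<String, NavigableMap<String, String>> rows = new HashMap<>();

    // Insert-or-overwrite: repeating the same call leaves the row unchanged (idempotent).
    void insert(String rowKey, String columnName, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(columnName, value);
    }

    // Returns the columns of one row whose names lie in [startColumn, endColumn].
    SortedMap<String, String> slice(String rowKey, String startColumn, String endColumn) {
        NavigableMap<String, String> columns = rows.getOrDefault(rowKey, new TreeMap<>());
        return columns.subMap(startColumn, true, endColumn, true);
    }

    public static void main(String[] args) {
        ColumnFamilySketch cf = new ColumnFamilySketch();
        cf.insert("K1", "N1", "V1");
        cf.insert("K1", "N2", "V2");
        cf.insert("K1", "N3", "V3");
        cf.insert("K1", "N4", "V4");
        cf.insert("K1", "N4", "V4"); // same insert again: no effect
        System.out.println(cf.slice("K1", "N2", "N4")); // columns N2, N3 and N4
    }
}
```

Keeping the columns of a row sorted by name is what makes a slice a cheap range scan rather than a search.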
![Page 24: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/24.jpg)
Tunable reads and writes
Any node (for writes)
One replica
Quorum of replicas
Local quorum
Each quorum
All replicas
Toward "Any node": higher availability, lower response time, less consistency
Toward "All replicas": lower availability, higher response time, more consistency
Slide 24
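The trade-off can be made concrete with a little arithmetic. The sketch below is not taken from the slides: it uses the usual overlap argument that a read is guaranteed to see the latest acknowledged write when the number of replicas written plus the number read exceeds the replica count. `QuorumMath` is a hypothetical helper, and the replication factor of 3 is an assumption for illustration.

```java
// Hypothetical helper illustrating the consistency arithmetic; not a Cassandra API.
final class QuorumMath {
    // Smallest majority of the replicas.
    static int quorum(int replicas) {
        return replicas / 2 + 1;
    }

    // True when any set of read replicas must share at least one member with
    // any set of write replicas, so a read always reaches the latest write.
    static boolean readsSeeLatestWrite(int replicas, int writeAcks, int readAcks) {
        return writeAcks + readAcks > replicas;
    }

    public static void main(String[] args) {
        int replicas = 3; // assumed replication factor
        int q = quorum(replicas);
        System.out.println("quorum of " + replicas + " = " + q);
        System.out.println("quorum write + quorum read: " + readsSeeLatestWrite(replicas, q, q));
        System.out.println("one write + one read: " + readsSeeLatestWrite(replicas, 1, 1));
    }
}
```

Writing and reading at "One replica" is fast and stays available when nodes fail, but the two sets need not overlap, which is the "less consistency" end of the scale.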
![Page 25: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/25.jpg)
Cassandra is scalable
Slide 25
[Diagram: Datacenter 1 and Datacenter 2, each with an Application and a Cassandra cluster of four nodes (Node 1 to Node 4)]
![Page 26: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/26.jpg)
Cassandra use cases
Use cases:
Big data
Multiple Data Center distributed database
(Write intensive) Logging
High-availability (writes)
Used by: Netflix, Facebook, Digg, etc.
Slide 26
![Page 27: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/27.jpg)
Other NoSQL databases
http://nosql-database.org/ lists 122+ NoSQL databases
Type: Examples
Extensible columns/Column-oriented: HBase, SimpleDB, DynamoDB
Graph: Neo4j
Key-value: Redis, Membase
Document: CouchDB
27
![Page 28: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/28.jpg)
Solution: Use NewSQL
Relational databases with SQL and ACID transactions
AND
New and improved architecture
Radically better scalability and performance
NewSQL vendors: ScaleDB, NimbusDB, …, VoltDB
Slide 28
![Page 29: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/29.jpg)
Stonebraker’s motivations
29
“…Current databases are designed for 1970s hardware …”
Stonebraker: http://www.slideshare.net/VoltDB/sql-myths-webinar
Significant overhead in “…logging, latching, locking, B-tree, and buffer management operations…”
SIGMOD 08: Through the looking glass: http://dl.acm.org/citation.cfm?id=1376713
![Page 30: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/30.jpg)
About VoltDB
Open-source
In-memory relational database
Durability thru replication; snapshots and logging
Transparent partitioning
Fast and scalable
Slide 30
http://www.mysqlperformanceblog.com/2011/02/28/is-voltdb-really-as-scalable-as-they-claim/
…VoltDB is very scalable; it should scale to 120 partitions, 39 servers, and 1.6 million complex transactions per second at over 300 CPU cores…
![Page 31: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/31.jpg)
The future is polyglot persistence
IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg
e.g. Netflix • RDBMS • SimpleDB • Cassandra • Hadoop/Hbase
31
![Page 32: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/32.jpg)
Spring Data is here to help
http://www.springsource.org/spring-data
For NoSQL databases
32
![Page 33: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/33.jpg)
Agenda
Why NoSQL? NewSQL?
Persisting entities
Implementing queries
Slide 33
![Page 34: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/34.jpg)
Food to Go – Place Order use case
1. Customer enters delivery address and delivery time
2. System displays available restaurants
3. Customer picks restaurant
4. System displays menu
5. Customer selects menu items
6. Customer places order
34
![Page 35: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/35.jpg)
Food to Go – Domain model (partial)
class Restaurant {
  long id;
  String name;
  Set<String> serviceArea;
  Set<TimeRange> openingHours;
  List<MenuItem> menuItems;
}
class MenuItem {
  String name;
  double price;
}
class TimeRange {
  long id;
  int dayOfWeek;
  int openingTime;
  int closingTime;
}
35
![Page 36: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/36.jpg)
Database schema
RESTAURANT table
ID Name …
1 Ajanta
2 Montclair Eggshop
RESTAURANT_ZIPCODE table
Restaurant_id zipcode
1 94707
1 94619
2 94611
2 94619
RESTAURANT_TIME_RANGE table
Restaurant_id dayOfWeek openTime closeTime
1 Monday 1130 1430
1 Monday 1730 2130
2 Tuesday 1130 …
36
![Page 37: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/37.jpg)
Pick your JDBC-based framework
interface AvailableRestaurantRepository {
  void add(Restaurant restaurant);
  Restaurant findDetailsById(int id);
  …
}
37
Spring JDBC Hibernate/JPA …
JDBC
![Page 38: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/38.jpg)
Restaurant aggregate
But with NoSQL?
interface AvailableRestaurantRepository {
  void add(Restaurant restaurant);
  Restaurant findDetailsById(int id);
  …
}
38
Restaurant
MenuItem TimeRange
? ?
![Page 39: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/39.jpg)
Slide 39
MongoDB
![Page 40: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/40.jpg)
MongoDB: persisting restaurants is easy
40
{
  "_id": ObjectId("4bddc2f49d1505567c6220a0"),
  "name": "Ajanta",
  "serviceArea": ["94619", "99999"],
  "openingHours": [
    { "dayOfWeek": 1, "open": 1130, "close": 1430 },
    { "dayOfWeek": 2, "open": 1130, "close": 1430 },
    …
  ]
}
![Page 41: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/41.jpg)
Spring Data for Mongo code
@Repository
public class AvailableRestaurantRepositoryMongoDbImpl implements AvailableRestaurantRepository {

  public static String AVAILABLE_RESTAURANTS_COLLECTION = "availableRestaurants";

  @Autowired
  private MongoTemplate mongoTemplate;

  @Override
  public void add(Restaurant restaurant) {
    mongoTemplate.insert(restaurant, AVAILABLE_RESTAURANTS_COLLECTION);
  }

  @Override
  public Restaurant findDetailsById(int id) {
    return mongoTemplate.findOne(new Query(where("_id").is(id)),
        Restaurant.class, AVAILABLE_RESTAURANTS_COLLECTION);
  }
}
41
![Page 42: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/42.jpg)
Spring Configuration
@Configuration
public class MongoConfig extends AbstractDatabaseConfig {

  @Value("#{mongoDbProperties.databaseName}")
  private String mongoDbDatabase;

  @Bean
  public Mongo mongo() throws UnknownHostException, MongoException {
    return new Mongo(databaseHostName);
  }

  @Bean
  public MongoTemplate mongoTemplate(Mongo mongo) throws Exception {
    MongoTemplate mongoTemplate = new MongoTemplate(mongo, mongoDbDatabase);
    …
    return mongoTemplate;
  }
}
42
![Page 43: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/43.jpg)
43
Apache Cassandra
![Page 44: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/44.jpg)
Option #1: Use a column per attribute
Column Family: RestaurantDetails
Row key 1
name Ajanta
type indian
serviceArea[0] 94619
serviceArea[1] 94707
openingHours[0].dayOfWeek Monday
openingHours[0].open 1130
openingHours[0].close 1430
Row key 2
name Egg shop
type Break Fast
serviceArea[0] 94611
serviceArea[1] 94619
openingHours[0].dayOfWeek Monday
openingHours[0].open 0830
openingHours[0].close 1430
Column Name = path/expression to access property value
![Page 45: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/45.jpg)
Option #2: Use a single column ✔
Column Family: RestaurantDetails
1 attributes: { name: “Ajanta”, … }
2 attributes: { name: “Montclair Eggshop”, … }
Column value = serialized object graph, e.g. JSON
Can’t use secondary indexes but they aren’t helpful for these use cases
45
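The two mapping options can be compared side by side in a small sketch. It is plain Java with hypothetical names (`RestaurantColumns`, `columnPerAttribute`, `singleColumn`), it only covers the name and service area, and the JSON is assembled by hand without escaping, which a real serializer would handle.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the two column layouts; not a Cassandra client.
final class RestaurantColumns {
    // Option #1: one column per attribute; the column name is the property path.
    static Map<String, String> columnPerAttribute(String name, List<String> serviceArea) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("name", name);
        for (int i = 0; i < serviceArea.size(); i++) {
            columns.put("serviceArea[" + i + "]", serviceArea.get(i));
        }
        return columns;
    }

    // Option #2: a single "attributes" column holding the whole object as JSON text.
    static Map<String, String> singleColumn(String name, List<String> serviceArea) {
        StringBuilder json = new StringBuilder("{\"name\":\"" + name + "\",\"serviceArea\":[");
        for (int i = 0; i < serviceArea.size(); i++) {
            if (i > 0) json.append(',');
            json.append('"').append(serviceArea.get(i)).append('"');
        }
        json.append("]}");
        return Map.of("attributes", json.toString());
    }

    public static void main(String[] args) {
        List<String> zips = List.of("94619", "94707");
        System.out.println(columnPerAttribute("Ajanta", zips));
        System.out.println(singleColumn("Ajanta", zips));
    }
}
```

With a single column the database sees only an opaque value, which is why secondary indexes on individual attributes are not available in Option #2.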
![Page 46: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/46.jpg)
Cassandra code: wrapper around Hector
public class AvailableRestaurantRepositoryCassandraKeyImpl implements AvailableRestaurantRepository {

  // Not final: a final field cannot be populated by field injection.
  @Autowired
  private CassandraTemplate cassandraTemplate;

  public void add(Restaurant restaurant) {
    cassandraTemplate.insertEntity(keyspace, RESTAURANT_DETAILS_CF, restaurant);
  }

  public Restaurant findDetailsById(int id) {
    String key = Integer.toString(id);
    return cassandraTemplate.findEntity(Restaurant.class, keyspace, key, RESTAURANT_DETAILS_CF);
    …
  }
… 46
Home grown wrapper class
http://en.wikipedia.org/wiki/Hector
![Page 47: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/47.jpg)
Slide 47
![Page 48: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/48.jpg)
Using VoltDB
Use the original schema
Standard SQL statements
BUT YOU MUST
Write stored procedures and invoke them using proprietary interface
Partition your data
Slide 48
![Page 49: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/49.jpg)
About VoltDB stored procedures
Key part of VoltDB
Replication = executing stored procedure on replica
Logging = log stored procedure invocation
Stored procedure invocation = transaction
Slide 49
![Page 50: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/50.jpg)
About partitioning
Slide 50
RESTAURANT table (Partition column: ID)
ID Name …
1 Ajanta
2 Eggshop
…
Partition 1
ID Name …
1 Ajanta
…
Partition 2
ID Name …
2 Eggshop
…
![Page 51: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/51.jpg)
Example VoltDB cluster
Slide 51
VoltDB Server A
ID Name …
1 Ajanta
…
VoltDB Server B
ID Name …
2 Eggshop
…
VoltDB Server C
ID Name …
… ..
…
Partition 1a Partition 2a Partition 3a
ID Name …
1 Ajanta
…
ID Name …
2 Eggshop
…
ID Name …
… ..
…
Partition 3b Partition 1b Partition 2b
(3 servers) x (2 partitions per server) ÷ (2 replicas per partition) = 3 partitions
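The sizing rule on this slide is simple enough to check in code. A minimal sketch, with a hypothetical `ClusterSizing` helper whose parameters are named after the terms on the slide:

```java
// Hypothetical helper that reproduces the slide's arithmetic; not a VoltDB API.
final class ClusterSizing {
    // Distinct partitions = total partition slots divided by copies kept of each partition.
    static int uniquePartitions(int servers, int partitionsPerServer, int replicasPerPartition) {
        int totalSlots = servers * partitionsPerServer;
        if (replicasPerPartition <= 0 || totalSlots % replicasPerPartition != 0) {
            throw new IllegalArgumentException("slots must divide evenly among replicas");
        }
        return totalSlots / replicasPerPartition;
    }

    public static void main(String[] args) {
        // 3 servers x 2 partitions per server = 6 slots; 2 copies of each partition.
        System.out.println(uniquePartitions(3, 2, 2) + " partitions");
    }
}
```

Adding replicas therefore buys durability at the cost of capacity: the same six slots hold six distinct partitions with one copy each, but only three with two copies each.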
![Page 52: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/52.jpg)
Single partition procedure: FAST
Slide 52
VoltDB Server A
…
ID Name …
1 Ajanta
…
VoltDB Server B
…
ID Name …
2 Eggshop
…
VoltDB Server C
…
ID Name …
… ..
…
SELECT * FROM RESTAURANT WHERE ID = 1
High-performance lock free code Partition column
Stored procedure parameter
![Page 53: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/53.jpg)
Multi-partition procedure: SLOWER
Slide 53
VoltDB Server A
…
ID Name …
1 Ajanta
…
VoltDB Server B
…
ID Name …
2 Eggshop
…
VoltDB Server C
…
ID Name …
… ..
…
SELECT * FROM RESTAURANT WHERE NAME = ‘Ajanta’
Communication/Coordination overhead
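The difference between the two kinds of procedure comes down to how many partitions a query has to visit. The sketch below is a toy model in plain Java, not VoltDB code: `PartitionedTable` is hypothetical, and placing a row by `id` modulo the partition count is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy partitioned RESTAURANT table: each partition maps ID -> NAME.
final class PartitionedTable {
    private final List<Map<Integer, String>> partitions = new ArrayList<>();
    int partitionsVisited; // partitions touched by the most recent query

    PartitionedTable(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    // Assumed placement rule: the partition column (ID) modulo the partition count.
    private Map<Integer, String> partitionFor(int id) {
        return partitions.get(Math.floorMod(id, partitions.size()));
    }

    void insert(int id, String name) {
        partitionFor(id).put(id, name);
    }

    // WHERE ID = ? : the partition column pins the query to one partition.
    String findById(int id) {
        partitionsVisited = 1;
        return partitionFor(id).get(id);
    }

    // WHERE NAME = ? : no partition column in the predicate, so every partition is asked.
    List<Integer> findIdsByName(String name) {
        partitionsVisited = 0;
        List<Integer> ids = new ArrayList<>();
        for (Map<Integer, String> partition : partitions) {
            partitionsVisited++;
            partition.forEach((id, n) -> {
                if (n.equals(name)) ids.add(id);
            });
        }
        return ids;
    }

    public static void main(String[] args) {
        PartitionedTable table = new PartitionedTable(3);
        table.insert(1, "Ajanta");
        table.insert(2, "Eggshop");
        System.out.println(table.findById(1) + " via " + table.partitionsVisited + " partition");
        System.out.println(table.findIdsByName("Ajanta") + " via " + table.partitionsVisited + " partitions");
    }
}
```

In a real cluster each extra partition visited also means network round trips and coordination between servers, which is the overhead this slide refers to.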
![Page 54: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/54.jpg)
Chosen partitioning scheme
54
<partitions>
  <partition table="restaurant" column="id"/>
  <partition table="service_area" column="restaurant_id"/>
  <partition table="menu_item" column="restaurant_id"/>
  <partition table="time_range" column="restaurant_id"/>
  <partition table="available_time_range" column="restaurant_id"/>
</partitions>
Performance is excellent: much faster than MySQL
![Page 55: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/55.jpg)
Stored procedure – AddRestaurant
@ProcInfo(singlePartition = true, partitionInfo = "Restaurant.id: 0")
public class AddRestaurant extends VoltProcedure {

  public final SQLStmt insertRestaurant = new SQLStmt("INSERT INTO Restaurant VALUES (?,?);");
  public final SQLStmt insertServiceArea = new SQLStmt("INSERT INTO service_area VALUES (?,?);");
  public final SQLStmt insertOpeningTimes = new SQLStmt("INSERT INTO time_range VALUES (?,?,?,?);");
  public final SQLStmt insertMenuItem = new SQLStmt("INSERT INTO menu_item VALUES (?,?,?);");

  public long run(int id, String name, String[] serviceArea, long[] daysOfWeek,
      long[] openingTimes, long[] closingTimes, String[] names, double[] prices) {
    // Queue every insert, then execute them together as one batch.
    voltQueueSQL(insertRestaurant, id, name);
    for (String zipCode : serviceArea)
      voltQueueSQL(insertServiceArea, id, zipCode);
    for (int i = 0; i < daysOfWeek.length; i++)
      voltQueueSQL(insertOpeningTimes, id, daysOfWeek[i], openingTimes[i], closingTimes[i]);
    for (int i = 0; i < names.length; i++)
      voltQueueSQL(insertMenuItem, id, names[i], prices[i]);
    voltExecuteSQL(true);
    return 0;
  }
}
55
![Page 56: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/56.jpg)
VoltDb repository – add()
@Repository
public class AvailableRestaurantRepositoryVoltdbImpl implements AvailableRestaurantRepository {

  @Autowired
  private VoltDbTemplate voltDbTemplate;

  @Override
  public void add(Restaurant restaurant) {
    invokeRestaurantProcedure("AddRestaurant", restaurant);
  }

  private void invokeRestaurantProcedure(String procedureName, Restaurant restaurant) {
    Object[] serviceArea = restaurant.getServiceArea().toArray();
    long[][] openingHours = toArray(restaurant.getOpeningHours());
    Object[][] menuItems = toArray(restaurant.getMenuItems());
    voltDbTemplate.update(procedureName, restaurant.getId(), restaurant.getName(), serviceArea,
        openingHours[0], openingHours[1], openingHours[2], menuItems[0], menuItems[1]);
  }
56
Flatten Restaurant
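The slide does not show the `toArray` helper that flattens the opening hours. One plausible shape, given that `openingHours[0]`, `[1]` and `[2]` are passed as the `daysOfWeek`, `openingTimes` and `closingTimes` parameters, is a set of parallel arrays. The sketch below is an assumption rather than the author's code, and its `TimeRange` is a minimal stand-in for the domain class.

```java
import java.util.List;

// Hypothetical version of the flattening step; not the helper used in the talk.
final class FlattenOpeningHours {
    // Minimal stand-in for the TimeRange class from the domain model slide.
    static final class TimeRange {
        final int dayOfWeek;
        final int openingTime;
        final int closingTime;

        TimeRange(int dayOfWeek, int openingTime, int closingTime) {
            this.dayOfWeek = dayOfWeek;
            this.openingTime = openingTime;
            this.closingTime = closingTime;
        }
    }

    // Row 0 = days of week, row 1 = opening times, row 2 = closing times.
    static long[][] toArray(List<TimeRange> openingHours) {
        long[][] columns = new long[3][openingHours.size()];
        for (int i = 0; i < openingHours.size(); i++) {
            TimeRange tr = openingHours.get(i);
            columns[0][i] = tr.dayOfWeek;
            columns[1][i] = tr.openingTime;
            columns[2][i] = tr.closingTime;
        }
        return columns;
    }

    public static void main(String[] args) {
        long[][] flat = toArray(List.of(new TimeRange(1, 1130, 1430), new TimeRange(1, 1730, 2130)));
        System.out.println("opening times: " + java.util.Arrays.toString(flat[1]));
    }
}
```

Flattening is needed because a stored procedure takes scalar and array parameters rather than an object graph, so each attribute of the nested objects travels in its own array.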
![Page 57: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/57.jpg)
VoltDbTemplate wrapper class
public class VoltDbTemplate {

  private Client client;

  public VoltDbTemplate(Client client) {
    this.client = client;
  }

  public void update(String procedureName, Object... params) {
    try {
      ClientResponse x = client.callProcedure(procedureName, params);
      …
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }
57
VoltDB client API
![Page 58: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/58.jpg)
VoltDb server configuration
voltdb-project.xml:
<?xml version="1.0"?>
<project>
  <info>
    <name>Food To Go</name>
    ...
  </info>
  <database>
    <schemas>
      <schema path='schema.sql' />
    </schemas>
    <partitions>
      <partition table="restaurant" column="id"/>
      ...
    </partitions>
    <procedures>
      <procedure class='net.chrisrichardson.foodToGo.newsql.voltdb.procs.AddRestaurant' />
      ...
    </procedures>
  </database>
</project>
Compile the catalog:
voltcompiler target/classes \
  src/main/resources/sql/voltdb-project.xml foodtogo.jar
deployment.xml:
<deployment>
  <cluster hostcount="1" sitesperhost="5" kfactor="0" />
</deployment>
Start the server:
bin/voltdb leader localhost catalog foodtogo.jar deployment deployment.xml
58
![Page 59: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/59.jpg)
Agenda
Why NoSQL? NewSQL?
Persisting entities
Implementing queries
Slide 59
![Page 60: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/60.jpg)
Finding available restaurants
Available restaurants =
Serve the zip code of the delivery address
AND
Are open at the delivery time
public interface AvailableRestaurantRepository {
  List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime);
  …
}
60
![Page 61: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/61.jpg)
Finding available restaurants on Monday, 6.15pm for 94619 zip
select r.*
from restaurant r
inner join restaurant_time_range tr on r.id = tr.restaurant_id
inner join restaurant_zipcode sa on r.id = sa.restaurant_id
where '94619' = sa.zip_code
and tr.day_of_week = 'monday'
and tr.openingtime <= 1815
and 1815 <= tr.closingtime
Straightforward three-way join
61
![Page 62: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/62.jpg)
Slide 62
MongoDB
![Page 63: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/63.jpg)
MongoDB = easy to query
Find a restaurant that serves the 94619 zip code and is open at 6.15pm on a Monday
{
  serviceArea: "94619",
  openingHours: {
    $elemMatch: {
      "dayOfWeek": "Monday",
      "open": {$lte: 1815},
      "close": {$gte: 1815}
    }
  }
}
DBCursor cursor = collection.find(qbeObject);
while (cursor.hasNext()) {
  DBObject o = cursor.next();
  …
}
63
db.availableRestaurants.ensureIndex({serviceArea: 1})
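What the $elemMatch query asks for can be spelled out as an ordinary predicate. The sketch below is plain Java over in-memory objects, not driver code; `ElemMatchSketch` and `OpeningHours` are hypothetical names. The key point is that the day, open and close conditions must all hold for the same array element.

```java
import java.util.List;

// Hypothetical in-memory rendering of the query's meaning; not the MongoDB driver.
final class ElemMatchSketch {
    static final class OpeningHours {
        final String dayOfWeek;
        final int open;
        final int close;

        OpeningHours(String dayOfWeek, int open, int close) {
            this.dayOfWeek = dayOfWeek;
            this.open = open;
            this.close = close;
        }
    }

    static boolean isAvailable(List<String> serviceArea, List<OpeningHours> openingHours,
                               String zip, String day, int time) {
        if (!serviceArea.contains(zip)) {
            return false; // serviceArea: "94619" matches any element of the array
        }
        for (OpeningHours hours : openingHours) {
            // All three conditions are tested against one and the same element.
            if (hours.dayOfWeek.equals(day) && hours.open <= time && time <= hours.close) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<OpeningHours> hours = List.of(
            new OpeningHours("Monday", 1130, 1430),
            new OpeningHours("Monday", 1730, 2130));
        System.out.println(isAvailable(List.of("94619", "94707"), hours, "94619", "Monday", 1815));
    }
}
```

Without the per-element grouping, a restaurant open on Monday at lunch and on Tuesday evening would wrongly match a Monday 6.15pm request, because each condition could be satisfied by a different element.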
![Page 64: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/64.jpg)
MongoTemplate-based code
@Repository
public class AvailableRestaurantRepositoryMongoDbImpl implements AvailableRestaurantRepository {

  // Not final: a final field cannot be populated by field injection.
  @Autowired
  private MongoTemplate mongoTemplate;

  @Override
  public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) {
    int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
    int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);
    Query query = new Query(where("serviceArea").is(deliveryAddress.getZip())
        .and("openingHours").elemMatch(where("dayOfWeek").is(dayOfWeek)
            .and("openingTime").lte(timeOfDay)
            .and("closingTime").gte(timeOfDay)));
    return mongoTemplate.find(AVAILABLE_RESTAURANTS_COLLECTION, query, AvailableRestaurant.class);
  }

mongoTemplate.ensureIndex("availableRestaurants", new Index().on("serviceArea", Order.ASCENDING));
64
![Page 65: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/65.jpg)
65
But how do we do this with Apache Cassandra?!
![Page 66: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/66.jpg)
Column Family
X Y Z
Slice is equivalent to a simple SELECT
keyVal colName colValue
X Y Z
…
columnFamily.slice(key=keyVal, startColumn=startVal, endColumn=endVal)
select key, colName, colValue from columnFamily where key = keyVal and colName >= startVal and colName <= endVal
![Page 67: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/67.jpg)
Slide 67
Queries, rather than the data model, drive NoSQL database design
We need to implement an index that can be queried using a slice
![Page 68: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/68.jpg)
Simplification #1: Denormalization
Restaurant_id Day_of_week Open_time Close_time Zip_code
1 Monday 1130 1430 94707
1 Monday 1130 1430 94619
1 Monday 1730 2130 94707
1 Monday 1730 2130 94619
2 Monday 0700 1430 94619
…
SELECT restaurant_id
FROM time_range_zip_code
WHERE day_of_week = 'Monday'
AND zip_code = 94619
AND 1815 < close_time
AND open_time < 1815
Simpler query: No joins Two = and two <
68
![Page 69: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/69.jpg)
Simplification #2: Application filtering
SELECT restaurant_id, open_time
FROM time_range_zip_code
WHERE day_of_week = 'Monday'
AND zip_code = 94619
AND 1815 < close_time
The open_time < 1815 condition moves out of the query and is checked by the application
Even simpler query • No joins • Two = and one <
69
![Page 70: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/70.jpg)
Simplification #3: Eliminate multiple =’s with concatenation
SELECT restaurant_id, open_time
FROM time_range_zip_code
WHERE zip_code_day_of_week = '94619:Monday'
AND 1815 < close_time
Restaurant_id Zip_dow Open_time Close_time
1 94707:Monday 1130 1430
1 94619:Monday 1130 1430
1 94707:Monday 1730 2130
1 94619:Monday 1730 2130
2 94619:Monday 0700 1430
…
Row key
range
70
![Page 71: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/71.jpg)
Column family as an index
Restaurant_id Zip_dow Open_time Close_time
1 94707:Monday 1130 1430
1 94619:Monday 1130 1430
1 94707:Monday 1730 2130
1 94619:Monday 1730 2130
2 94619:Monday 0700 1430
…
Column Family: AvailableRestaurants
Row key 94619:Monday
(1430,0700,2) JSON FOR EGG
(1430,1130,1) JSON FOR AJANTA
(2130,1730,1) JSON FOR AJANTA
Column name = (Close_time, Open_time, Restaurant_id)
![Page 72: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/72.jpg)
Querying with a slice

    slice(key = 94619:Monday, sliceStart = (1815, *, *), sliceEnd = (2359, *, *))

Column Family: AvailableRestaurants

| Row key | Column name (close_time, open_time, restaurant_id) | Column value |
|---|---|---|
| 94619:Monday | (1430,0700,2) | JSON FOR EGG |
| | (1430,1130,1) | JSON FOR AJANTA |
| | (2130,1730,1) | JSON FOR AJANTA |

Slice result: (2130,1730,1) JSON FOR AJANTA

18:15 is after 17:30 {Ajanta}
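
The index-plus-slice idea can be sketched without Cassandra at all. The following is an in-memory analogy, not the Hector or Cassandra API: each row key maps to a sorted map of column names, and a slice becomes a `tailMap` call. The class name `AvailableRestaurantsIndex`, its methods, and the string encoding of the composite column name are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class AvailableRestaurantsIndex {

    // Row key -> columns sorted by column name (an in-memory stand-in for a column family).
    private final Map<String, NavigableMap<String, String>> rows = new HashMap<>();

    static String rowKey(String zipCode, String dayOfWeek) {
        return zipCode + ":" + dayOfWeek;
    }

    // Zero-padded fields make string order match (close_time, open_time, restaurant_id) order.
    static String columnName(int closeTime, int openTime, int restaurantId) {
        return String.format("%04d:%04d:%08d", closeTime, openTime, restaurantId);
    }

    void insert(String zipCode, String dayOfWeek, int openTime, int closeTime,
                int restaurantId, String restaurantJson) {
        rows.computeIfAbsent(rowKey(zipCode, dayOfWeek), k -> new TreeMap<>())
            .put(columnName(closeTime, openTime, restaurantId), restaurantJson);
    }

    List<String> findAvailable(String zipCode, String dayOfWeek, int time) {
        List<String> result = new ArrayList<>();
        NavigableMap<String, String> columns = rows.get(rowKey(zipCode, dayOfWeek));
        if (columns == null) {
            return result;
        }
        // Slice: all columns from (time, *, *) to the end of the row (inclusive start).
        String sliceStart = String.format("%04d:", time);
        for (Map.Entry<String, String> column : columns.tailMap(sliceStart, true).entrySet()) {
            // Application filtering: keep entries that opened before the requested time.
            int openTime = Integer.parseInt(column.getKey().split(":")[1]);
            if (openTime < time) {
                result.add(column.getValue());
            }
        }
        return result;
    }
}
```

With the three columns shown above loaded under 94619:Monday, a lookup at 1815 slices down to the single (2130,1730,1) column, which then passes the open-time check.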
![Page 73: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/73.jpg)
    private void insertAvailability(Restaurant restaurant) {
        // One index column per (zip code, opening-hours range) combination
        for (String zipCode : (Set<String>) restaurant.getServiceArea()) {
            for (TimeRange tr : (Set<TimeRange>) restaurant.getOpeningHours()) {
                String dayOfWeek = format2(tr.getDayOfWeek());
                String openingTime = format4(tr.getOpeningTime());
                String closingTime = format4(tr.getClosingTime());
                String restaurantId = format8(restaurant.getId());

                // Row key is zip code + day of week; the value is the restaurant as JSON
                String key = formatKey(zipCode, dayOfWeek);
                String columnValue = toJson(restaurant);

                // Column name is (closing time, opening time, restaurant id)
                Composite columnName = new Composite();
                columnName.add(0, closingTime);
                columnName.add(1, openingTime);
                columnName.add(2, restaurantId);

                ColumnFamilyUpdater<String, Composite> updater = compositeCloseTemplate.createUpdater(key);
                updater.setString(columnName, columnValue);
                compositeCloseTemplate.update(updater);
            }
        }
    }
Needs a few pages of code
    @Override
    public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) {
        int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);
        int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
        String zipCode = deliveryAddress.getZip();
        String key = formatKey(zipCode, format2(dayOfWeek));

        // Slice on the first column-name component (closing time): from timeOfDay up to 2359
        HSlicePredicate<Composite> predicate = new HSlicePredicate<Composite>(new CompositeSerializer());
        Composite start = new Composite();
        Composite finish = new Composite();
        start.addComponent(0, format4(timeOfDay), ComponentEquality.GREATER_THAN_EQUAL);
        finish.addComponent(0, format4(2359), ComponentEquality.GREATER_THAN_EQUAL);
        predicate.setRange(start, finish, false, 100);

        final List<AvailableRestaurantIndexEntry> closingAfter = new ArrayList<AvailableRestaurantIndexEntry>();

        // Collect every column in the slice together with its opening time and restaurant id
        ColumnFamilyRowMapper<String, Composite, Object> mapper = new ColumnFamilyRowMapper<String, Composite, Object>() {
            @Override
            public Object mapRow(ColumnFamilyResult<String, Composite> results) {
                for (Composite columnName : results.getColumnNames()) {
                    String openTime = columnName.get(1, new StringSerializer());
                    String restaurantId = columnName.get(2, new StringSerializer());
                    closingAfter.add(new AvailableRestaurantIndexEntry(openTime, restaurantId, results.getString(columnName)));
                }
                return null;
            }
        };

        compositeCloseTemplate.queryColumns(key, predicate, mapper);

        // Application filtering: keep only the entries that are already open at timeOfDay
        List<AvailableRestaurant> result = new LinkedList<AvailableRestaurant>();
        for (AvailableRestaurantIndexEntry trIdAndAvailableRestaurant : closingAfter) {
            if (trIdAndAvailableRestaurant.isOpenBefore(timeOfDay))
                result.add(trIdAndAvailableRestaurant.getAvailableRestaurant());
        }
        return result;
    }
![Page 74: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/74.jpg)
What did I just do to query the data?
• Wrote code to maintain an index
• Reduced performance due to extra writes
![Page 75: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/75.jpg)
But what would you rather implement?
“Complex” query logic
OR
Multi-datacenter, multi-master database infrastructure
![Page 76: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/76.jpg)
![Page 77: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/77.jpg)
VoltDB - attempt #0
    SELECT r.*
    FROM restaurant r, time_range tr, service_area sa
    WHERE ? = sa.zip_code
      AND sa.restaurant_id = tr.restaurant_id
      AND r.restaurant_id = sa.restaurant_id
      AND tr.day_of_week = ?
      AND tr.open_time <= ?
      AND ? <= tr.close_time

Slow

    @ProcInfo(singlePartition = false)
    public class FindAvailableRestaurants extends VoltProcedure { ... }
![Page 78: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/78.jpg)
VoltDB - attempt #1
    ERROR 10:12:03,251 [main] COMPILER: Failed to plan for statement type(findAvailableRestaurants_with_join) select r.* from restaurant r,time_range tr, service_area sa Where ? = sa.zip_code and sa.restaurant_id= tr.restaurant_id and r.restaurant_id = sa.restaurant_id and tr.day_of_week=? and tr.open_time <= ? and ? <= tr.close_time
    Error: "Unable to plan for statement. Possibly joining partitioned tables in a multi-partition procedure using a column that is not the partitioning attribute or a non-equality operator. This is statement not supported at this time."
    2012-03-19 08:19:31,743 ERROR [main] COMPILER: Catalog compilation failed.
#fail
    @ProcInfo(singlePartition = false)
    public class FindAvailableRestaurants extends VoltProcedure { ... }

    create index idx_service_area_zip_code on service_area(zip_code); ..
![Page 79: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/79.jpg)
VoltDB - attempt #2
    @ProcInfo(singlePartition = true, partitionInfo = "Restaurant.id: 0")
    public class AddRestaurant extends VoltProcedure {

        public final SQLStmt insertAvailable =
            new SQLStmt("INSERT INTO available_time_range VALUES (?,?,?, ?, ?, ?);");

        public long run(....) {
            ...
            for (int i = 0; i < daysOfWeek.length; i++) {
                voltQueueSQL(insertOpeningTimes, id, daysOfWeek[i], openingTimes[i], closingTimes[i]);
                // One denormalized row per (opening-hours range, zip code) combination
                for (String zipCode : serviceArea) {
                    voltQueueSQL(insertAvailable, id, daysOfWeek[i], openingTimes[i],
                                 closingTimes[i], zipCode, name);
                }
            }
            ...
            voltExecuteSQL(true);
            return 0;
        }
    }
Works but queries are only slightly faster than MySQL!
    public final SQLStmt findAvailableRestaurants_denorm = new SQLStmt(
        "select restaurant_id, name from available_time_range tr " +
        "where ? = tr.zip_code " +
        "and tr.day_of_week=? " +
        "and tr.open_time <= ? " +
        " and ? <= tr.close_time ");
![Page 80: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/80.jpg)
VoltDB - attempt #3
    <partitions>
        ...
        <partition table="available_time_range" column="zip_code"/>
    </partitions>

Queries are really fast, but inserts are not

    @ProcInfo(singlePartition = false, ...)
    public class AddRestaurant extends VoltProcedure { ... }

    @ProcInfo(singlePartition = true, partitionInfo = "available_time_range.zip_code: 0")
    public class FindAvailableRestaurants extends VoltProcedure { ... }
![Page 81: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/81.jpg)
VoltDB – key lesson
A given partitioning scheme can be good for some use cases but bad for others
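
A rough sketch of why partitioning available_time_range by zip_code helps the query but hurts the insert. It assumes a simple hash partitioner, which is a stand-in for illustration only and not how VoltDB actually assigns rows to partitions; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class PartitioningTradeoff {

    // Hypothetical partitioner for illustration; VoltDB's real hashing is internal.
    static int partitionFor(String zipCode, int partitionCount) {
        return Math.floorMod(zipCode.hashCode(), partitionCount);
    }

    // A lookup by zip code only needs the one partition that owns that zip code.
    static Set<Integer> partitionsForQuery(String zipCode, int partitionCount) {
        Set<Integer> partitions = new TreeSet<>();
        partitions.add(partitionFor(zipCode, partitionCount));
        return partitions;
    }

    // Adding a restaurant writes one row per zip code it serves,
    // so the insert may have to touch several partitions.
    static Set<Integer> partitionsForInsert(List<String> serviceArea, int partitionCount) {
        Set<Integer> partitions = new TreeSet<>();
        for (String zipCode : serviceArea) {
            partitions.add(partitionFor(zipCode, partitionCount));
        }
        return partitions;
    }

    public static void main(String[] args) {
        int partitionCount = 4;
        System.out.println("query touches:  " + partitionsForQuery("94619", partitionCount));
        System.out.println("insert touches: "
            + partitionsForInsert(Arrays.asList("94619", "94707"), partitionCount));
    }
}
```

The query always resolves to one partition, while the insert's partition set grows with the size of the restaurant's service area, which matches the single-partition and multi-partition procedure settings in attempt #3.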
![Page 82: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/82.jpg)
Summary
Different databases = different tradeoffs
![Page 83: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/83.jpg)
… Summary…
| | Benefits | Drawbacks |
|---|---|---|
| RDBMS | • SQL • ACID • Familiar | • Lack of performance and scalability • Difficult to distribute • Schema updates • Handling semi-structured data |
| NoSQL | • Scalability • Performance • Schema-less | • Lack of ACID • Limited querying (some) |
| NewSQL | • ACID • SQL • Familiar • Scalability • Performance | • Proprietary API • Schema updates • Handling semi-structured data • Not all use cases are performant |
![Page 84: SQL? NoSQL? NewSQL?!? What's a Java developer to do? - PhillyETE 2012](https://reader033.vdocument.in/reader033/viewer/2022052410/554a5c58b4c905572f8b502d/html5/thumbnails/84.jpg)
… Summary
Very carefully pick the NewSQL/NoSQL DB for your application
Consider a polyglot persistence architecture
Encapsulate your data access code so you can switch
Startups = avoid NewSQL/NoSQL for shorter time to market?
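
One way to read "encapsulate your data access code" is to put the query behind an interface and keep database-specific code in its implementations. The sketch below is an assumption about how that could look: the interface name, the simplified method signatures (plain strings and ints instead of the talk's `Address` and `Date`), and the in-memory implementation are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical repository interface: callers depend on this, not on a database API.
interface AvailableRestaurantRepository {
    void addAvailability(String restaurantName, String zipCode, int dayOfWeek,
                         int openTime, int closeTime);

    List<String> findAvailableRestaurants(String zipCode, int dayOfWeek, int timeOfDay);
}

// In-memory stand-in; a Cassandra, MongoDB or VoltDB class could implement the same interface.
class InMemoryAvailableRestaurantRepository implements AvailableRestaurantRepository {

    private static final class Entry {
        final String restaurantName;
        final String zipCode;
        final int dayOfWeek;
        final int openTime;
        final int closeTime;

        Entry(String restaurantName, String zipCode, int dayOfWeek, int openTime, int closeTime) {
            this.restaurantName = restaurantName;
            this.zipCode = zipCode;
            this.dayOfWeek = dayOfWeek;
            this.openTime = openTime;
            this.closeTime = closeTime;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    @Override
    public void addAvailability(String restaurantName, String zipCode, int dayOfWeek,
                                int openTime, int closeTime) {
        entries.add(new Entry(restaurantName, zipCode, dayOfWeek, openTime, closeTime));
    }

    @Override
    public List<String> findAvailableRestaurants(String zipCode, int dayOfWeek, int timeOfDay) {
        List<String> result = new ArrayList<>();
        for (Entry e : entries) {
            // Same conditions as the denormalized query: open_time <= ? AND ? <= close_time
            if (e.zipCode.equals(zipCode) && e.dayOfWeek == dayOfWeek
                    && e.openTime <= timeOfDay && timeOfDay <= e.closeTime) {
                result.add(e.restaurantName);
            }
        }
        return result;
    }
}
```

Because callers see only `AvailableRestaurantRepository`, swapping the backing store means writing another implementation rather than touching the rest of the application.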