nosql session gluecon may 2010

NoSQL : Channeling the Data Explosion Dwight Merriman CEO, 10gen @dmerr dmerr.tumblr.com GlueCon 2010

Upload: mongodb

Post on 19-Dec-2014


DESCRIPTION

Overview of NoSQL at GlueCon. Talk given by Dwight from 10gen/MongoDB.

TRANSCRIPT

Page 1: NOSQL Session GlueCon May 2010

NoSQL : Channeling the Data Explosion

Dwight Merriman
CEO, 10gen

@dmerr dmerr.tumblr.com

GlueCon 2010

Page 2: NOSQL Session GlueCon May 2010

The database world is changing

No longer one-size-fits-all

Page 3: NOSQL Session GlueCon May 2010

NoSQL = Non-relational next generation operational data stores and databases

Page 4: NOSQL Session GlueCon May 2010

Scaling Out

no joins + light transactional semantics = horizontally scalable architectures

Page 5: NOSQL Session GlueCon May 2010

Why?

http://www.globalnerdy.com/2007/09/07/multicore-musings/

cloud

commodity

Page 6: NOSQL Session GlueCon May 2010

How the NoSQL Products Vary

• What’s the same
– No joins
– No complex transactions

• What varies
– Scale-out model
– Consistency model
– Data model

Page 7: NOSQL Session GlueCon May 2010

Scaling Out

distribution & query models

Consistent hashing

Order preserving range chunking

Scatter gather
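The first two distribution models can be sketched in a few lines. This is a minimal illustration, not how any particular product implements them: the names `fnv1a`, `HashRing`, `chunks`, `shardForKey`, and `shardsForRange`, the shard names, and the split points are all hypothetical. Consistent hashing places nodes and keys on a ring so that adding a node only moves keys onto the new node; order-preserving range chunking keeps keys sorted so a range query touches only the chunks it overlaps. A query that cannot be routed by key would have to go to every shard (scatter gather).

```javascript
// FNV-1a 32-bit hash; any well-mixed hash would do for this sketch.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0; // force unsigned 32-bit
}

// Consistent hashing: each node owns several points on a ring,
// and a key belongs to the first point at or after its hash.
class HashRing {
  constructor(nodes, pointsPerNode = 64) {
    this.points = [];
    for (const node of nodes) {
      for (let i = 0; i < pointsPerNode; i++) {
        this.points.push({ pos: fnv1a(`${node}#${i}`), node });
      }
    }
    this.points.sort((a, b) => a.pos - b.pos);
  }

  nodeFor(key) {
    const h = fnv1a(key);
    let lo = 0;
    let hi = this.points.length;
    while (lo < hi) { // binary search for first point with pos >= h
      const mid = (lo + hi) >>> 1;
      if (this.points[mid].pos < h) lo = mid + 1;
      else hi = mid;
    }
    return this.points[lo % this.points.length].node; // wrap around the ring
  }
}

// Order-preserving range chunking: chunks cover contiguous key ranges
// [min, max); max === null means unbounded. Split points are made up.
const chunks = [
  { min: '', max: 'h', shard: 'shard0' },
  { min: 'h', max: 'p', shard: 'shard1' },
  { min: 'p', max: null, shard: 'shard2' },
];

function shardForKey(key) {
  return chunks.find(c => key >= c.min && (c.max === null || key < c.max)).shard;
}

// Shards that a range query over [lo, hi) has to visit.
function shardsForRange(lo, hi) {
  return chunks
    .filter(c => (c.max === null || lo < c.max) && hi > c.min)
    .map(c => c.shard);
}
```

With hashing, a range query generally has to visit every node because neighbouring keys land in unrelated places; with range chunking, `shardsForRange('f', 'k')` only needs the two chunks that overlap that range.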

Page 8: NOSQL Session GlueCon May 2010

Data models

no joins + light transactional semantics = horizontally scalable architectures

Important side effect : new data models = improved ways to develop apps

Page 9: NOSQL Session GlueCon May 2010

Data Models

• Key/value
• Column-oriented “bigtable-style”
• Document-oriented (JSON)

Page 10: NOSQL Session GlueCon May 2010

Data Models

{
  title: 'Too Big to Fail',
  author: 'John S',
  ts: Date("05-Nov-09 10:33"),
  comments: [
    { author: 'Ian White', comment: 'Great article!' },
    { author: 'Joe Smith', comment: 'But how fast is it?',
      replies: [ { author: 'Jane Smith', comment: 'scalable?' } ] }
  ],
  tags: ['finance', 'economy']
}

Page 11: NOSQL Session GlueCon May 2010

{
  title: 'Too Big to Fail',
  author: 'John S',
  ts: Date("05-Nov-09 10:33"),
  comments: [
    { author: 'Ian White', comment: 'Great article!' },
    { author: 'Joe Smith', comment: 'But how fast is it?',
      replies: [ { author: 'Jane Smith', comment: 'scalable?' } ] }
  ],
  tags: ['finance', 'economy']
}

db.posts.find( { tags : 'economy' } ).sort({ts:-1}).limit(10).skip(10)

db.posts.find( { "comments.author" : "Ian White" } )
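What the two queries above ask for can be mimicked against an in-memory array. This is only a sketch of the matching semantics in plain JavaScript, not MongoDB's implementation: `valuesAt`, `matchesField`, `find`, and `page` are hypothetical helpers, and the second post is invented sample data. It assumes that equality against an array field matches when any element equals the value, that a dotted path such as `comments.author` fans out across the embedded array, and that skip is applied before limit regardless of the order in which they are chained.

```javascript
// Sample data: the first post mirrors the slide; the second is made up.
const posts = [
  {
    title: 'Too Big to Fail',
    author: 'John S',
    ts: new Date('2009-11-05T10:33:00Z'),
    comments: [
      { author: 'Ian White', comment: 'Great article!' },
      { author: 'Joe Smith', comment: 'But how fast is it?',
        replies: [{ author: 'Jane Smith', comment: 'scalable?' }] },
    ],
    tags: ['finance', 'economy'],
  },
  {
    title: 'Hypothetical second post',
    author: 'Jane D',
    ts: new Date('2009-11-06T09:00:00Z'),
    comments: [],
    tags: ['economy'],
  },
];

// Resolve a dotted path, fanning out across any arrays along the way.
function valuesAt(doc, path) {
  let current = [doc];
  for (const part of path.split('.')) {
    current = current
      .flatMap(v => (Array.isArray(v) ? v : [v]))
      .filter(v => v !== null && typeof v === 'object')
      .map(v => v[part])
      .filter(v => v !== undefined);
  }
  return current;
}

// Equality on an array field matches if any element equals the value.
function matchesField(value, wanted) {
  return Array.isArray(value) ? value.includes(wanted) : value === wanted;
}

// Keep documents where every { path: value } pair in the query matches.
function find(docs, query) {
  return docs.filter(doc =>
    Object.entries(query).every(([path, wanted]) =>
      valuesAt(doc, path).some(v => matchesField(v, wanted))));
}

// Newest first, then skip, then limit.
function page(docs, { skip = 0, limit = Infinity } = {}) {
  return [...docs].sort((a, b) => b.ts - a.ts).slice(skip, skip + limit);
}

const economyPage2 = page(find(posts, { tags: 'economy' }), { skip: 10, limit: 10 });
const byIan = find(posts, { 'comments.author': 'Ian White' });
```

Note that `comments.author` does not reach into `replies`; matching Jane Smith would need the path `comments.replies.author`.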

Page 12: NOSQL Session GlueCon May 2010

Influences

Page 13: NOSQL Session GlueCon May 2010

CAP

It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties:
• Availability
• Atomic consistency
in all fair executions (including those in which messages are lost).

Page 14: NOSQL Session GlueCon May 2010

Consistency Models - CAP

• Choices are AP or CP
• Write Availability, not Read Availability, is the Main Question
• It’s not all about CAP

Eventual consistency makes these non-availability aspects better:
• Multi data center
• Speed
• Even load distribution

Page 15: NOSQL Session GlueCon May 2010

Eventual Consistency

Page 16: NOSQL Session GlueCon May 2010

Eventual Consistency

Read(x) : 1, 2, 2, 4, 4, 4, 4 …

Page 17: NOSQL Session GlueCon May 2010

Could we get this?

Read(x) : 1, 2, 1, 4, 2, 4, 4, 4 …
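One way to read the two sequences: the first never goes backwards, the second does. A minimal sketch, assuming the values of x are written in increasing order and that two replicas catch up at different speeds; the replica histories, the `schedule` of which replica serves each read, and the helper names `readAll` and `isMonotonic` are all hypothetical. A client that stays on one replica sees the first sequence, while a client bouncing between replicas can see the second.

```javascript
// Value of x visible on each replica at successive read times (made-up lag).
const replicas = {
  A: [1, 2, 2, 4, 4, 4, 4, 4],
  B: [1, 1, 1, 1, 2, 4, 4, 4],
};

// The sequence a client observes if read number t is served by schedule[t].
function readAll(schedule) {
  return schedule.map((name, t) => replicas[name][t]);
}

// True if no read returns an older value than an earlier read did.
function isMonotonic(reads) {
  for (let i = 1; i < reads.length; i++) {
    if (reads[i] < reads[i - 1]) return false;
  }
  return true;
}

const pinned = readAll(['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A']);
const bouncing = readAll(['A', 'A', 'B', 'A', 'B', 'A', 'A', 'A']);

console.log(pinned, isMonotonic(pinned));     // [1,2,2,4,4,4,4,4] true
console.log(bouncing, isMonotonic(bouncing)); // [1,2,1,4,2,4,4,4] false
```

Both replicas converge on 4, so both clients are eventually consistent; only the pinned client also gets monotonic reads.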

Page 18: NOSQL Session GlueCon May 2010

Terms

• R
• W
• N
– R+W>N has nice properties

• Sloppy quorum
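The "nice property" of R+W>N can be checked by brute force. This sketch assumes the usual Dynamo-style meaning of the terms, which the slide does not spell out: N is the number of replicas holding a value, W the number that must acknowledge a write, and R the number consulted on a read. The helpers `subsets` and `everyReadSeesLatestWrite` are hypothetical names. It enumerates every possible write set and read set and confirms that they always share at least one replica exactly when R+W>N. Sloppy quorums are not modeled here.

```javascript
// All k-element subsets of the replica indices 0..n-1.
function subsets(n, k, start = 0, prefix = []) {
  if (prefix.length === k) return [prefix];
  const out = [];
  for (let i = start; i < n; i++) {
    out.push(...subsets(n, k, i + 1, [...prefix, i]));
  }
  return out;
}

// True if every possible read quorum overlaps every possible write quorum,
// i.e. a read always reaches at least one replica that took the latest write.
function everyReadSeesLatestWrite(n, r, w) {
  return subsets(n, w).every(writeSet =>
    subsets(n, r).every(readSet =>
      readSet.some(replica => writeSet.includes(replica))));
}

for (const [n, r, w] of [[3, 2, 2], [3, 1, 1], [3, 1, 3], [5, 2, 3]]) {
  console.log(`N=${n} R=${r} W=${w}`,
    'R+W>N:', r + w > n,
    'overlap guaranteed:', everyReadSeesLatestWrite(n, r, w));
}
```

For N=3, R=W=2 gives overlap while tolerating one replica being down; R=W=1 is faster but a read may miss the latest write.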

Page 19: NOSQL Session GlueCon May 2010

R+W>N

If R+W > N, we can’t have both fast local reads and writes at the same time if all the data centers are equal peers?

Page 20: NOSQL Session GlueCon May 2010

Network Partitions

Page 21: NOSQL Session GlueCon May 2010

Trivial Network Partitions

Page 22: NOSQL Session GlueCon May 2010
Page 23: NOSQL Session GlueCon May 2010

Sometimes we need global state / more consistency

• Unique key constraints
– User registration

• ACL changes
• Are we surprising the user?
– read-your-own-writes

Page 24: NOSQL Session GlueCon May 2010

Could it be the case that…

uptime( CP + average developer ) >= uptime( AP + average developer )

where uptime:= system is up and non-buggy?

Page 25: NOSQL Session GlueCon May 2010

Predictions

• JSON will be the most popular building block for non-relational data models
• Tunable consistency in all the products
• Some SQL in these products!

Page 26: NOSQL Session GlueCon May 2010

Questions?

Thank you

[email protected]
@dmerr
dmerr.tumblr.com
@mongodb
Download : www.mongodb.org
10gen is hiring in SF and NYC – [email protected]