why nosql? why riak? - goto conferencegotocon.com/dl/jaoo-brisbane-2010/slides/justinsheehy... ·...


Why NoSQL? Why Riak?

Justin Sheehy

1


What's all of this NoSQL nonsense?

2

MongoDB

CouchDB

Cassandra

Voldemort

Neo4j

Membase

Redis

HBase

Riak

(and the list goes on...)


What went wrong with SQL databases?

3

Nothing! They are great tools.


NoSQL came before SQL.

5

IBM IMS

MUMPS

Honeywell IDS

Cincom TOTAL

dbm

VAX DBMS

PICK

(and the list goes on...)


But that's not really NoSQL, is it?

6

IBM IMS

MUMPS

Honeywell IDS

Cincom TOTAL

dbm

VAX DBMS

PICK

(and the list goes on...)


The start of today's NoSQL

7

Amazon: Dynamo (2007)

Google: Bigtable (2006)

Technology is not the important part!


8

Amazon sells books. (esp. 3+ years ago)

Why not just use Oracle/MySQL/etc?

Technology is not the most important part!

The start of today's NoSQL


9

Amazon sells books. (esp. 3+ years ago)

Why not just use Oracle/MySQL/etc?

Business needs (re)created demand for alternative database technologies.

The start of today's NoSQL


What's all of this NoSQL nonsense?

10

MongoDB

CouchDB

Cassandra

Voldemort

Neo4j

Membase

Redis

HBase

Riak

Do we really need all of these?


It's still about business needs.

11

Redis

Riak

use cases drive decisions!


It's still about business needs.

12

Redis

Riak

choose what solves your real problem


13

how do you know what solves your real problem?

two places to start:

➡ innate/exposed data model
➡ distribution/operational model


14

data model differences

requirements usually from app developers

Native Data Structures

Semi-Structured Documents

Bigtable Column Family

Tabular/Relational

Hierarchical

Graph Traversal

Key/Value

Indexed Query


15

data model differences

not much useful ordering

Bigtable Column Family

Graph Traversal

which one is more powerful?


16

data model differences

ColFam: Cassandra

Graph: Neo4j

K/V: Voldemort

etc etc


17

distribution model differences

requirements usually from biz or operations

Locally Embedded

Single Server

Distributed System

Server Replication


21

starting to decide

Locally Embedded

Single Server

Distributed System

Server Replication

ColFam | Graph | K/V | etc etc

We need a K/V or ColFam store.

We need to recover quickly from server failure.


22

Distributed System

Server Replication

ColFam | K/V

now we're getting somewhere!
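The narrowing done so far can be sketched as a filter over the two axes, data model and distribution model. This is only an illustration: the `Store` type, the `shortlist` helper, and the placement of each product in `CANDIDATES` are assumptions made for the example, not classifications taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Store:
    name: str
    data_model: str     # e.g. "K/V", "ColFam", "Graph"
    distribution: str   # e.g. "Locally Embedded", "Single Server",
                        # "Server Replication", "Distributed System"


# Illustrative placements only (assumptions for this sketch).
CANDIDATES = [
    Store("Cassandra", "ColFam", "Distributed System"),
    Store("HBase", "ColFam", "Distributed System"),
    Store("Riak", "K/V", "Distributed System"),
    Store("Voldemort", "K/V", "Distributed System"),
    Store("Neo4j", "Graph", "Single Server"),
    Store("dbm", "K/V", "Locally Embedded"),
]


def shortlist(candidates, data_models, distributions):
    """Keep only the stores that satisfy both requirements."""
    return [
        store.name
        for store in candidates
        if store.data_model in data_models
        and store.distribution in distributions
    ]


# "We need a K/V or ColFam store" and
# "We need to recover quickly from server failure".
names = shortlist(
    CANDIDATES,
    data_models={"K/V", "ColFam"},
    distributions={"Server Replication", "Distributed System"},
)
print(names)
```

The point of the sketch is that the requirements, not the products, drive the filter; changing either set of requirements changes the list.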


26

Distributed System

Server Replication

ColFam | K/V

we can make a shorter list...
Riak, Voldemort, CouchDB, Cassandra, HBase...

then we can narrow it more...
protocols, licensing, benchmarking, simplicity...

and make a choice!

How about MySQL?

I believe we could, Bob.


28

How about MySQL?

I believe we could, Bob.

wait, WHAT?

Great tools are still great tools.


29

Great tools are still great tools.

Understand your needs before you choose.


30

So what's so special about Riak?


33

Locally Embedded

Single Server

Distributed System

Server Replication

ColFam | Graph | K/V | etc etc

Riak KV

Riak Search

Riak Core


34

Locally Embedded

Single Server

Distributed System

Server Replication

ColFam | Graph | K/V | etc | ?

improvements flow upward

Riak Core can flow sideways


36

Locally Embedded

Single Server

Distributed System

Server Replication

ColFam | Graph | K/V | etc | ?

Will the market need this?

I know how to get there.


37

The Riak key/value stack:

client application

protobuf http

riak_client

dynamo model FSMs

riak core

vnode master

k/v vnode

storage engine
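A rough sketch of the top of this stack, where a client application reaches the store over HTTP. It assumes objects are exposed at `/riak/<bucket>/<key>` on port 8098; both the path layout and the port are assumptions here, not stated on the slides, so check the documentation for the Riak version in use. `build_put` is a hypothetical helper that only constructs the request and never sends it.

```python
from urllib.parse import quote
from urllib.request import Request


def build_put(bucket, key, value, host="127.0.0.1", port=8098):
    """Build (but do not send) an HTTP PUT that would store `value`.

    The URL layout and default port are assumptions for illustration.
    """
    url = "http://{}:{}/riak/{}/{}".format(
        host, port, quote(bucket, safe=""), quote(key, safe="")
    )
    return Request(
        url,
        data=value,
        method="PUT",
        headers={"Content-Type": "text/plain"},
    )


req = build_put("talks", "jaoo 2010", b"Why NoSQL? Why Riak?")
print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing `req` to `urllib.request.urlopen` against a running node; everything below the protocol layer stays hidden from the client.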


38

(diagram: three cluster nodes, each running the same stack)

client application

protobuf http

riak_client

dynamo model FSMs

vnode master

k/v vnode

storage engine

the cluster nodes are united by riak core via gossip, consistent hashing, etc
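To make the consistent hashing mentioned on this slide concrete, here is a minimal sketch, not Riak's actual code. It assumes a SHA-1 ring split into equally sized partitions (one vnode each) that are handed out to nodes round-robin. The function names, the 64-partition default, and the replica count of 3 are all choices made for the example.

```python
import hashlib

RING_SIZE = 2 ** 160  # size of the SHA-1 output space


def build_ring(nodes, num_partitions=64):
    """Assign equally sized partitions to nodes round-robin."""
    return [nodes[i % len(nodes)] for i in range(num_partitions)]


def partition_for(bucket, key, num_partitions):
    """Map a bucket/key pair to the partition that contains its hash."""
    digest = hashlib.sha1("{}/{}".format(bucket, key).encode()).digest()
    position = int.from_bytes(digest, "big")
    return position * num_partitions // RING_SIZE


def preference_list(ring, bucket, key, n=3):
    """Return the n consecutive (partition, node) pairs that hold replicas."""
    start = partition_for(bucket, key, len(ring))
    return [
        (p % len(ring), ring[p % len(ring)])
        for p in range(start, start + n)
    ]


ring = build_ring(["node1", "node2", "node3", "node4"])
replicas = preference_list(ring, "talks", "jaoo-2010")
print(replicas)
```

Because every node can compute the same preference list from the key alone, any node can accept a request and route it, which is one reason no node needs to be special.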


39

client application

protobuf http

riak_client

dynamo model FSMs

riak core

vnode master

k/v vnode

storage engine

just a local k/v store:


40

client application

protobuf http

riak_client

dynamo model FSMs

riak core

vnode master

k/v vnode

storage engine

just an abstract k/v store:


42

client application

protobuf http

riak_client

dynamo model FSMs

riak core

vnode master

k/v vnode

storage engine

a distributed system at heart:

virtual nodes

gossip

failure detection

vector clocks

sloppy quorums

remote dispatch

dynamic membership
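Of the techniques listed, vector clocks are the easiest to show in a few lines. The sketch below is a generic textbook version, assuming a clock is a plain mapping from actor name to counter; the function names are invented for the example and do not mirror any Riak module.

```python
def increment(clock, actor):
    """Return a copy of `clock` with `actor`'s counter advanced by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a, b):
    """True if clock `a` has seen every event recorded in clock `b`."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a, b):
    """Classify two clocks as equal, ordered, or concurrent."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a newer"
    if descends(b, a):
        return "b newer"
    return "concurrent"


def merge(a, b):
    """Combine two clocks by taking the per-actor maximum."""
    return {
        actor: max(a.get(actor, 0), b.get(actor, 0))
        for actor in a.keys() | b.keys()
    }


base = increment({}, "x")
left = increment(base, "y")    # one client updates via y
right = increment(base, "z")   # another updates via z, unaware of y
print(compare(left, base), compare(left, right))
```

Two writes that compare as concurrent are the case a store must surface or resolve; once they are reconciled, `merge(left, right)` descends from both.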


43

client application

protobuf http

riak_client

dynamo model FSMs

riak core

vnode master

k/v vnode

storage engine

carefully managed complexity...


44

client application

protobuf http

riak_client

dynamo model FSMs

riak core

vnode master

k/v vnode

storage engine

allows simplicity at the edges


45

client application

protobuf http

riak_client

dynamo model FSMs

riak core

vnode master

k/v vnode

k/v storage engine

let's make a dist. search system!


47

client application

protobuf http

riak_client

dynamo model FSMs

riak core

vnode master

k/v vnode

k/v storage engine

let's make a dist. search system!

adapted FSMs

search storage

search vnode


48

protobuf http

riak_client

dynamo model FSMs

riak core

vnode master

k/v vnode

k/v storage engine

adapted FSMs

search storage

search vnode

search client

solr lucene

done!
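The idea that a shared core lets a new kind of vnode slot in beside the k/v one can be sketched roughly as follows. This is a hypothetical Python analogy, not the riak core API (which is Erlang): `Core`, `KVVnode`, `SearchVnode`, and the command names are all invented for illustration.

```python
import hashlib


class KVVnode:
    """Stores opaque values by key."""

    def __init__(self):
        self.data = {}

    def handle(self, command, key, value=None):
        if command == "put":
            self.data[key] = value
            return "ok"
        if command == "get":
            return self.data.get(key)
        raise ValueError("unknown command: " + command)


class SearchVnode:
    """Keeps a tiny inverted index: term -> set of document ids."""

    def __init__(self):
        self.index = {}

    def handle(self, command, key, value=None):
        if command == "index":
            self.index.setdefault(key, set()).add(value)
            return "ok"
        if command == "lookup":
            return sorted(self.index.get(key, set()))
        raise ValueError("unknown command: " + command)


class Core:
    """Routes commands to per-partition vnodes of any registered service."""

    def __init__(self, num_partitions=8):
        self.num_partitions = num_partitions
        self.factories = {}   # service name -> vnode class
        self.vnodes = {}      # (service, partition) -> vnode instance

    def register(self, service, factory):
        self.factories[service] = factory

    def dispatch(self, service, command, key, value=None):
        digest = hashlib.sha1(key.encode()).digest()
        partition = int.from_bytes(digest, "big") % self.num_partitions
        slot = (service, partition)
        if slot not in self.vnodes:
            self.vnodes[slot] = self.factories[service]()
        return self.vnodes[slot].handle(command, key, value)


core = Core()
core.register("kv", KVVnode)
core.register("search", SearchVnode)
core.dispatch("kv", "put", "talk-1", "Why Riak?")
core.dispatch("search", "index", "riak", "talk-1")
print(core.dispatch("kv", "get", "talk-1"))
print(core.dispatch("search", "lookup", "riak"))
```

In this analogy the routing code in `Core` never changes when a new service is registered, which is the property the slides attribute to the shared core layer.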


49

protobuf http

riak_client

dynamo model FSMs

riak core

vnode master

k/v vnode

k/v storage engine

adapted FSMs

search storage

search vnode

search client

solr lucene

the sum is greater than the parts


50

I don't know what's next


51

protobuf http

riak_client

dynamo model FSMs

riak core

vnode master

k/v vnode

k/v storage engine

adapted FSMs

search storage

search vnode

search client

solr lucene

I don't know what's next,

but I know how to build it.

more FSMs

X client

another interface

X storage

X vnode


52

riak core

I don't know what's next,

but I know how to build it.

scalability

fault-tolerance

ease of operations

interoperability

pluggable protocols

flexible storage

predictable availability

I know how to build it


53

scalability

fault-tolerance

ease of operations

interoperability

pluggable protocols

flexible storage

predictable availability

I know how to build it,

but you don't have to.


54

our earlier evaluation criteria:

➡ innate/exposed data model
➡ distribution/operational model


56

➡ innate/exposed data model
➡ distribution/operational model

fully distributed system

flexible and growing


57

Justin Sheehy

http://www.basho.com/