london web performance-couchbase meetup

1 Couchbase Distributed Document Database Perry Krug Sr. Solutions Architect

Upload: couchbase

Post on 22-Jun-2015


Category:

Technology



TRANSCRIPT

Page 1: London web performance-Couchbase meetup

1

Couchbase: Distributed Document Database

Perry Krug, Sr. Solutions Architect

Page 2: London web performance-Couchbase meetup

2

Company Background

• Leading NoSQL database company

• Open source development and distribution model

• Provide an easy-to-develop, easy-to-deploy, high-performance, easily scalable document database

• Focused on internet and mobile applications and cloud computing environments

• Most mature, reliable and widely deployed solution
– 1000s of production deployments worldwide

• Located in Silicon Valley (Mountain View, CA)
– 80 employees including 50 in engineering/product
– Round C company; VCs include Accel Partners, Mayfield, North Bridge, and Ignition

Page 3: London web performance-Couchbase meetup

3

Paid Production Deployments (partial list)

Page 4: London web performance-Couchbase meetup

4

Couchbase automatically distributes data across commodity servers. Built-in caching enables apps to read and write data with sub-millisecond latency. And with no schema to manage, Couchbase effortlessly accommodates changing data management requirements.

Couchbase Server (a.k.a. Membase)

Simple. Fast. Elastic. NoSQL.

Page 5: London web performance-Couchbase meetup

5

RDBMS HAS DOMINATED FOR 40 YEARS BUT NO LONGER BEST SOLUTION FOR MANY APPS

Relational database technology has served us well for 40 years, and will likely continue to do so for the foreseeable future to support transactions requiring ACID guarantees. But a large, and increasingly dominant, class of software systems and data do not need those guarantees. Much of the data manipulated by Web applications have less strict transactional requirements but, for lack of a practical alternative, many IT teams continue to use relational technology, needlessly tolerating its cost and scalability limitations. For these applications and data, distributed document cache and database technologies such as Couchbase’s provide a promising alternative.

Carl Olofson, IDC Research Vice President, Information and Data Management

Page 6: London web performance-Couchbase meetup

6

Modern Interactive Software Architecture

Application Scales Out: just add more commodity web servers

Database Scales Up: get a bigger, more complex server

Expensive & disruptive sharding, doesn’t perform at web scale

Page 7: London web performance-Couchbase meetup

7

Data Layer Matches Application Logic Tier Architecture

Application Scales Out: just add more commodity web servers

Database Scales Out: just add more commodity data servers

Scaling out flattens the cost and performance curves

• Horizontally scalable with auto-sharding
• High performance at web scale
• Schema-less for flexibility

Page 8: London web performance-Couchbase meetup

8

Couchbase NoSQL: Simple, Fast, Elastic

• Easily scale apps to an “infinite” number of users
– Simply add nodes with a single click
– Never need to change your application to scale
– Simple development with memcached API

• High performance with predictably low latency
– Sub-millisecond reads and writes
– No drop in performance as app scales

• Schema-less document database
– Flexibility to meet rapidly changing market requirements
– Roadmap: Indexing, querying similar to RDBMS capabilities

• Low-cost solution that economically scales with app

Page 9: London web performance-Couchbase meetup

9

PERFORMANCE

Page 10: London web performance-Couchbase meetup

10

Key results of Cisco and Solarflare Benchmark

Couchbase Server demonstrates

• Consistent sub-millisecond latency for mixed workload

• High throughput

• Linear scalability

http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/white_paper_c11-708169.pdf

Page 11: London web performance-Couchbase meetup

11

Your secret weapon: Sub-millisecond AND consistent latency

[Chart: latency (microseconds) vs. object size (bytes)]

Consistently low latencies in microseconds for varying document sizes with a mixed workload

Page 12: London web performance-Couchbase meetup

12

Your secret weapon: Sub-millisecond AND consistent latency

[Chart: operations per second vs. number of servers in cluster]

High throughput with 1.4 GB/sec data transfer rate using 4 servers

Linear throughput scalability

Page 13: London web performance-Couchbase meetup

13

SCALE

Page 14: London web performance-Couchbase meetup

14

Draw Something by OMGPOP

Page 15: London web performance-Couchbase meetup

15

Draw Something “goes viral” 3 weeks after launch

[Chart: Draw Something by OMGPOP, daily active users (millions, axis 2 to 16) by date, 2/6 through 3/21]

Page 16: London web performance-Couchbase meetup

16

As usage grew, game data went non-linear.

[Chart: Draw Something by OMGPOP, daily active users (millions, axis 2 to 16) by date, 2/6 through 3/21]

By March 19, there were over 30,000,000 downloads of the app, over 5,000 drawings being stored per second, over 2,200,000,000 drawings stored, over 105,000 database transactions per second, and over 3.3 terabytes of data stored.

Chart annotation: Instagram (7.5M in 5 wks)

Page 17: London web performance-Couchbase meetup

17

The game exploded. But Couchbase did not.

Without a second of downtime, and while sustaining front-end performance, the cluster was continuously expanded to support growth, absorbing frequent server hardware failures.

                    February 6   February 13  February 20  February 27  March 5      March 12     March 19
Drawings/second     0            3            50           333          1660         3000         5400
Total drawings      0            5 million    12 million   50 million   500 million  1 billion    2.2 billion
R/W latency (usec)  30           40           32           31           38           29           34
Servers             6            6            6            18           54           72           90

Page 18: London web performance-Couchbase meetup

18

In contrast.

[Chart: The Simpsons: Tapped Out, daily active users (millions, axis 2 to 16) by date, 2/6 through 3/21]

#2 Free app on iPad

#3 Free app on iPhone

Page 19: London web performance-Couchbase meetup

19

NO SCHEMA

Page 20: London web performance-Couchbase meetup

20

Document database

{
  "_id": "brewery_Cleveland_ChopHouse_and_Brewery",
  "_rev": "1-00000061480b50910000000000000000",
  "city": "Cleveland",
  "updated": "2010-07-22 20:00:20",
  "code": "44113",
  "name": "Cleveland ChopHouse and Brewery",
  "country": "United States",
  "phone": "1-216-623-0909",
  "state": "Ohio",
  "address": [
    "824 West St.Clair Avenue"
  ],
  "geo": {
    "loc": ["-81.6994", "41.4995"],
    "accuracy": "ROOFTOP"
  },
  "$expiration": 0,
  "$flags": 0
}

• JSON objects
• Flexible schema

Page 21: London web performance-Couchbase meetup

21

COUCHBASE SOLUTION“THE BASICS”

Page 22: London web performance-Couchbase meetup

22

COUCHBASE CLIENT LIBRARY

Basic Operation – scale out

Docs distributed evenly across servers in the cluster

Each server stores both active & replica docs
– Only one server active for a given doc at a time

Client library provides app with simple interface to database

Cluster map provides map to which server doc is on
– App never needs to know

App reads, writes, updates docs

Multiple app servers can access same document at same time

[Diagram: App Server 1 and App Server 2, each with a Couchbase client library and cluster map, send read/write/update requests to a three-server Couchbase Server cluster. Each server holds active docs and replica docs. User-configured replica count = 1.]

Page 23: London web performance-Couchbase meetup

23

Add Nodes

Two servers added to cluster
– One-click operation

Docs automatically rebalanced across cluster
– Even distribution of docs
– Minimum doc movement

Cluster map updated

App database calls now distributed over larger # of servers
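The rebalancing idea above (even distribution with little doc movement) can be sketched as follows. This is a simplified illustration, not Couchbase's actual rebalance algorithm: the function name `rebalance`, the flat list of partition owners, and the quota rule are all assumptions made for the example.

```javascript
// Hypothetical sketch: each array slot is one partition, the value is its owning server.
function rebalance(assignment, servers) {
  const total = assignment.length;
  const base = Math.floor(total / servers.length);
  const remainder = total % servers.length;
  // Target number of partitions per server (first `remainder` servers get one extra).
  const quota = new Map(servers.map((s, i) => [s, base + (i < remainder ? 1 : 0)]));
  const counts = new Map(servers.map((s) => [s, 0]));
  const result = assignment.slice();
  const toMove = [];

  // Pass 1: every server keeps its partitions up to its quota; the rest must move.
  result.forEach((owner, vb) => {
    if (counts.has(owner) && counts.get(owner) < quota.get(owner)) {
      counts.set(owner, counts.get(owner) + 1);
    } else {
      toMove.push(vb);
    }
  });

  // Pass 2: hand the surplus partitions to servers that are still under quota.
  for (const vb of toMove) {
    const dest = servers.find((s) => counts.get(s) < quota.get(s));
    result[vb] = dest;
    counts.set(dest, counts.get(dest) + 1);
  }
  return result;
}

// 30 partitions spread over three servers, then two servers are added.
const before = Array.from({ length: 30 }, (_, vb) => ["server1", "server2", "server3"][vb % 3]);
const after = rebalance(before, ["server1", "server2", "server3", "server4", "server5"]);
const moved = after.filter((owner, vb) => owner !== before[vb]).length;
console.log(moved); // 12 partitions move; the other 18 stay where they were
```

In this example only the partitions needed to fill the two new servers change owner, which is the "minimum doc movement" property the slide describes.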

[Diagram: App Server 1 and App Server 2, each with a Couchbase client library and cluster map, send read/write/update requests to the Couchbase Server cluster. Servers 4 and 5 join Servers 1 to 3, and active and replica docs are spread across all five servers. User-configured replica count = 1.]

Page 24: London web performance-Couchbase meetup

24

Fail Over Node

App servers happily accessing docs on Server 3

Server fails

App server requests to server 3 fail

Cluster detects server has failed
– Promotes replicas of docs to active
– Updates cluster map

App server requests for docs now go to appropriate server

Typically rebalance would follow
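The failover steps above can be sketched on a toy cluster map. The map shape (`active` and `replica` per partition) and the function name `failOver` are hypothetical, chosen only to illustrate replica promotion with a replica count of 1.

```javascript
// Hypothetical sketch: each entry says which server holds the active and replica copy.
function failOver(clusterMap, failedServer) {
  return clusterMap.map(({ active, replica }) => {
    if (active === failedServer) {
      // Promote the replica to active; no replica exists until a rebalance recreates one.
      return { active: replica, replica: null };
    }
    if (replica === failedServer) {
      // The active copy is unaffected, but its replica is lost for now.
      return { active, replica: null };
    }
    return { active, replica };
  });
}

const clusterMap = [
  { active: "server1", replica: "server2" },
  { active: "server2", replica: "server3" },
  { active: "server3", replica: "server1" },
  { active: "server1", replica: "server3" },
];
const afterFailure = failOver(clusterMap, "server3");
console.log(afterFailure);
```

After the call, no entry routes to `server3`, and the entries left without a replica show why a rebalance would typically follow.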

[Diagram: App Server 1 and App Server 2, each with a Couchbase client library and cluster map, and a five-server Couchbase Server cluster in which Server 3 has failed. Replica docs on the remaining servers are promoted to active. User-configured replica count = 1.]

Page 25: London web performance-Couchbase meetup

25

Couchbase Server 2.0

• Next major release of Couchbase Server
• Currently in Developer Preview, approaching Beta and GA

What’s new:
• New storage engine technology (append-only B-tree)
• Indexing and Querying
• Incremental Map Reduce
• Cross Data Center Replication
• Better memory management, large data sets, and other technological improvements
• Fully backwards compatible with existing Couchbase Server

Page 26: London web performance-Couchbase meetup

26

Couchbase Server 2.0 Architecture

[Diagram: Data Manager and Cluster Manager]

Data Manager:
– Memcached
– Membase EP Engine
– storage interface
– CouchStore, auto compaction
– Couch API, Couch View (port 8092)
– Port 11210 (Memcapable 2.0)
– Moxi, port 11211 (Memcapable 1.0)

Cluster Manager (Erlang/OTP):
– http REST management API / Web UI (HTTP, port 8091)
– Components, labeled “on each node” or “one per cluster”: heartbeat, process monitor, global singleton supervisor, configuration manager, rebalance orchestrator, node health monitor, vBucket state and replication manager
– Erlang port mapper (port 4369)
– Distributed Erlang (ports 21100 - 21199)

Other labels: Membase, CouchBase, Distributed Indexing

Page 27: London web performance-Couchbase meetup

27

Couchbase Server 2.0 Architecture

[Diagram: same architecture as on page 26, with the memcached layer labeled “Memcached Interface”]

Page 28: London web performance-Couchbase meetup

28

Couchbase Server 2.0 Architecture

[Diagram: same architecture as on page 26, with the layers labeled “Memcached Interface” and “EP Engine”]

Page 29: London web performance-Couchbase meetup

29

Partitioning The Data – vbucket map
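The slide's diagram is not in this transcript, so here is a minimal sketch of how a key might be routed through a vBucket map. The hash function, the small `NUM_VBUCKETS` value, and all function names are assumptions for illustration; they are not Couchbase's actual hashing or map layout.

```javascript
// Hypothetical sketch of key -> vBucket -> server routing.
const NUM_VBUCKETS = 8; // assumption: a small number to keep the example readable

// Simple deterministic string hash (illustrative stand-in for the real hash).
function hashKey(key) {
  let h = 0;
  for (let i = 0; i < key.length; i++) {
    h = (h * 31 + key.charCodeAt(i)) >>> 0; // keep the value as an unsigned 32-bit integer
  }
  return h;
}

// A key always lands in the same vBucket, regardless of how many servers exist.
function vbucketFor(key) {
  return hashKey(key) % NUM_VBUCKETS;
}

// The map assigns each vBucket an active server and one replica server.
function buildVbucketMap(servers) {
  const map = [];
  for (let vb = 0; vb < NUM_VBUCKETS; vb++) {
    map.push({
      active: servers[vb % servers.length],
      replica: servers[(vb + 1) % servers.length],
    });
  }
  return map;
}

// The client looks up the active server for a key; the app never sees this step.
function serverFor(key, vbucketMap) {
  return vbucketMap[vbucketFor(key)].active;
}

const vbucketMap = buildVbucketMap(["server1", "server2", "server3"]);
const target = serverFor("brewery_Cleveland_ChopHouse_and_Brewery", vbucketMap);
console.log(target);
```

Because keys map to vBuckets and only vBuckets map to servers, changing the cluster means updating the map rather than rehashing every key.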

Page 30: London web performance-Couchbase meetup

30

Indexing and querying

• Built-in incremental map reduce

• Map functions are written in JavaScript and executed on V8

• Index is built incrementally as mutations stream in

• Query in a scatter/gather fashion
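To illustrate what "built incrementally as mutations stream in" can mean, here is a minimal sketch of an index that reprocesses only the changed document on each mutation. The class `IncrementalIndex`, its methods, and the two-argument map function (which receives `emit` as a parameter) are hypothetical and are not the server's implementation.

```javascript
// Hypothetical sketch of an incrementally maintained view index.
class IncrementalIndex {
  constructor(mapFn) {
    this.mapFn = mapFn;
    this.rowsByDoc = new Map(); // doc id -> rows that doc currently contributes
  }

  // Apply one mutation: a new or updated doc, or null for a deletion.
  applyMutation(id, doc) {
    const rows = [];
    if (doc !== null) {
      this.mapFn(doc, (key, value) => rows.push({ key, value, id }));
    }
    if (rows.length > 0) {
      this.rowsByDoc.set(id, rows); // replaces any rows from the previous version
    } else {
      this.rowsByDoc.delete(id);
    }
  }

  // Return all rows ordered by key, as a view query would see them.
  rows() {
    const all = [];
    for (const docRows of this.rowsByDoc.values()) all.push(...docRows);
    return all.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  }
}

const index = new IncrementalIndex((doc, emit) => {
  if (doc.city) emit(doc.city, 1);
});
index.applyMutation("doc1", { city: "Cleveland" });
index.applyMutation("doc2", { city: "Akron" });
index.applyMutation("doc1", { city: "Boston" }); // update replaces doc1's old row
index.applyMutation("doc2", null); // delete removes doc2's row
console.log(index.rows());
```

Each mutation touches only the rows of the affected document, so the cost of keeping the index current follows the rate of change rather than the size of the data set.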

Page 31: London web performance-Couchbase meetup

31

Incremental map reduce using JavaScript

{
  "_id": "brewery_Cleveland_ChopHouse_and_Brewery",
  "_rev": "1-00000061480b50910000000000000000",
  "city": "Cleveland",
  "updated": "2010-07-22 20:00:20",
  "code": "44113",
  "name": "Cleveland ChopHouse and Brewery",
  "country": "United States",
  "phone": "1-216-623-0909",
  "state": "Ohio",
  "address": [
    "824 West St.Clair Avenue"
  ],
  "geo": {
    "loc": ["-81.6994", "41.4995"],
    "accuracy": "ROOFTOP"
  },
  "$expiration": 0,
  "$flags": 0
}

• Document from our built-in beer sample database

Page 32: London web performance-Couchbase meetup

32

Map function

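function (doc) {
  // Emit the most specific location key available for this document.
  // Use && so that every listed field must be present; the comma operator
  // would test only the last field.
  if (doc.country && doc.state && doc.city) {
    emit([doc.country, doc.state, doc.city], 1);
  } else if (doc.country && doc.state) {
    emit([doc.country, doc.state], 1);
  } else if (doc.country) {
    emit([doc.country], 1);
  }
}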

• Map functions
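A map function like this one can be tried outside the server with a stand-in `emit`. The sketch below is an assumption-laden local harness: the `emitted` array, the local `emit`, and the sample documents are hypothetical, and the conditions are written with `&&` so that every listed field must be present.

```javascript
// Hypothetical local harness: collect what the map function would emit.
const emitted = [];
function emit(key, value) {
  emitted.push({ key, value });
}

// The by-location map function, with && between the field checks.
function map(doc) {
  if (doc.country && doc.state && doc.city) {
    emit([doc.country, doc.state, doc.city], 1);
  } else if (doc.country && doc.state) {
    emit([doc.country, doc.state], 1);
  } else if (doc.country) {
    emit([doc.country], 1);
  }
}

// Illustrative documents with progressively less location detail.
const docs = [
  { country: "United States", state: "Ohio", city: "Cleveland" },
  { country: "United States", state: "Ohio" },
  { country: "Belgium" },
  { name: "no location fields" },
];
docs.forEach(map);
console.log(emitted);
```

The last document emits nothing, so it simply does not appear in the index.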

REST call: http://db1.couchbase.com:8092/beer-sample/_design/dev_beer/_view/by_location?limit=10
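The REST call above follows a regular pattern (host, port 8092, bucket, design document, view, query parameters). A small helper can assemble it; the function name `viewUrl` and its parameter names are hypothetical, and only the `limit` parameter shown on the slide is used.

```javascript
// Hypothetical helper that assembles a view query URL in the shape shown on the slide.
function viewUrl(host, bucket, designDoc, view, params) {
  const url = new URL(
    `http://${host}:8092/${encodeURIComponent(bucket)}` +
      `/_design/${encodeURIComponent(designDoc)}/_view/${encodeURIComponent(view)}`
  );
  // Append each query parameter, for example limit=10.
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, String(value));
  }
  return url.toString();
}

const queryUrl = viewUrl("db1.couchbase.com", "beer-sample", "dev_beer", "by_location", { limit: 10 });
console.log(queryUrl);
```

With these arguments the helper reproduces the URL from the slide.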

Page 33: London web performance-Couchbase meetup

33

Reduce functions

• Built-in reduce functions
– _count
– _sum
– _stats ({"sum": 1411, "count": 1411, "min": 1, "max": 1, "sumsqr": 1411})

• Developing procedure
– Develop against a subset of the data
– Build the index on the entire cluster
– Promote a dev_ view to production

Page 34: London web performance-Couchbase meetup

34

APP SERVER 1

COUCHBASE CLIENT LIBRARY

Indexing and Querying

Indexing work is distributed amongst nodes
– Large data set possible
– Parallelize the effort

Each node has index for data stored on it

Queries combine the results from required nodes
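The scatter/gather pattern described here can be sketched as follows: each node answers from its own sorted index, and the results are merged. The in-memory arrays standing in for nodes and the function names `queryNode` and `scatterGather` are assumptions for illustration.

```javascript
// Hypothetical sketch: each "node" is an array of index rows already sorted by key.
function compareKeys(a, b) {
  return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
}

// Each node needs to return at most `limit` rows from its local index.
function queryNode(nodeRows, limit) {
  return nodeRows.slice(0, limit);
}

function scatterGather(nodes, limit) {
  // Scatter: ask every node for its first `limit` rows.
  const partials = nodes.map((rows) => queryNode(rows, limit));
  // Gather: merge the partial results, order them by key, and apply the limit again.
  const merged = [].concat(...partials).sort(compareKeys);
  return merged.slice(0, limit);
}

const nodes = [
  [{ key: "Akron" }, { key: "Denver" }],
  [{ key: "Boston" }, { key: "El Paso" }],
  [{ key: "Cleveland" }],
];
const top3 = scatterGather(nodes, 3);
console.log(top3.map((row) => row.key));
```

Asking each node for only `limit` rows is enough, because no row beyond a node's first `limit` could appear in the global first `limit`.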

[Diagram: App Server 1 and App Server 2, each with a Couchbase client library and cluster map, send a query to a three-server Couchbase Server cluster and receive a combined response. Each server holds active and replica docs. User-configured replica count = 1.]

Page 35: London web performance-Couchbase meetup

35

Cross Data Center Replication

Data close to users

Multiple locations for disaster recovery

Independently managed clusters serving local data

[Diagram: US, Europe, and Asia data centers with replication between them]

Page 36: London web performance-Couchbase meetup

36

Integration to Analytics systems

Use the cross data center interface
– Agnostic to topology changes
– De-duplication
– Effective changes feed of the entire cluster
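One way to picture de-duplication in a changes feed is to keep only the most recent mutation per document key. The sketch below assumes each mutation carries a `key` and a monotonically increasing `seq`; both field names and the function `deduplicate` are hypothetical and are not taken from the cross data center interface itself.

```javascript
// Hypothetical sketch: collapse a stream of mutations to the latest one per key.
function deduplicate(mutations) {
  const latest = new Map();
  for (const m of mutations) {
    const seen = latest.get(m.key);
    if (!seen || m.seq > seen.seq) {
      latest.set(m.key, m);
    }
  }
  // Emit the surviving mutations in sequence order.
  return [...latest.values()].sort((a, b) => a.seq - b.seq);
}

const feed = [
  { key: "user::1", seq: 1, value: { score: 10 } },
  { key: "user::2", seq: 2, value: { score: 5 } },
  { key: "user::1", seq: 3, value: { score: 25 } },
];
const effective = deduplicate(feed);
console.log(effective);
```

A consumer such as an analytics system then sees one effective change per document instead of every intermediate write.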

[Diagram: a three-server Couchbase Server cluster (active and replica docs on each server, user-configured replica count = 1) feeds a cross data center connector. The changes feed can be consumed by any destination.]

Page 37: London web performance-Couchbase meetup

37

• Support large-scale analytics on application data by streaming data from Couchbase to Hadoop
– Real-time integration using Flume
– Batch integration using Sqoop

• Examples
– Various game statistics (e.g., monthly / daily / hourly rankings)
– Analyze game patterns from users to enhance various game metrics

Couchbase and Hadoop Integration

[Diagram: memcached protocol listener/sender, engine interface, and Couchbase Storage Engine, with TAP and Sqoop labels]

Page 38: London web performance-Couchbase meetup

38

Couchbase Client SDKs

Java Client SDK

.Net SDK

PHP SDK

Ruby SDK

Python SDK

spymemcached connection

HTTP couchDB connection

Java client API

User Code

Couchbase Server

// listURIs is a List<URI> of cluster nodes
CouchbaseClient cb = new CouchbaseClient(listURIs, "aBucket", "letmein");

// this is all the same as before
cb.set("hello", 0, "world");
cb.get("hello");
// keys is a Collection<String> of document keys
Map<String, Object> manyThings = cb.getBulk(keys);

// accessing a view
View view = cb.getView("design_document", "my_view");
Query query = new Query();
query.setRange("abegin", "theend");

http://www.couchbase.org/code

Page 39: London web performance-Couchbase meetup

39

Couchbase Demonstration

• Couchbase ServerTemplate Demo
– Starting with one database node under load
– Dynamically scaling to two database nodes
– Easy management and monitoring
– Not possible with any other database technology

[Diagram: application user, web application server, and Couchbase servers in EC2 or the datacenter]

Page 40: London web performance-Couchbase meetup

40

THANK YOU

COUCHBASE SIMPLE, FAST, ELASTIC NOSQL

QUESTIONS?

@couchbase [email protected]

Page 41: London web performance-Couchbase meetup

41

COUCHBASE CUSTOMERS

Page 42: London web performance-Couchbase meetup

42

Paid Production Deployments – Social Gaming


Page 43: London web performance-Couchbase meetup

43

Paid Production Deployments – Key Segments

Ad Platforms

Social Networks

Page 44: London web performance-Couchbase meetup

44

Production Deployments – Key Industry Segments

Online Biz Services

Online Media

E-Commerce

Page 45: London web performance-Couchbase meetup

45

Production Deployments – Key Industry Segments

HealthCare

Military/Government

Communications

Page 46: London web performance-Couchbase meetup

46

Production Deployments – Key Industry Segments

Online Education

Web Design

Financial Services

Page 47: London web performance-Couchbase meetup

47

Production Deployments – Key Industry Segments

Software

Security

Page 48: London web performance-Couchbase meetup

48

Production Deployments – Enterprises