couchbase_john_bryce_israel_training_use_cases

Why companies use Couchbase Perry Krug Sr. Solutions Architect

Upload: couchbase

Post on 11-May-2015

Page 1: Couchbase_John_Bryce_Israel_Training_use_cases

Why companies use Couchbase

Perry Krug

Sr. Solutions Architect

Page 2: Couchbase_John_Bryce_Israel_Training_use_cases

Common Use Cases

Social Gaming
• Couchbase stores player and game data
• Example customers include: Zynga, Tapjoy, Ubisoft, Tencent

Mobile Apps
• Couchbase stores user info and app content
• Example customers include: Kobo, Playtika

Ad Targeting
• Couchbase stores user information for fast access
• Example customers include: AOL, Mediamind, Convertro

Session Store
• Couchbase Server as a key-value store
• Example customers include: Concur, Sabre

User Profile Store
• Couchbase Server as a key-value store
• Example customers include: Tunewiki

High Availability Cache
• Couchbase Server used as a cache tier replacement
• Example customers include: Orbitz

Content & Metadata Store
• Couchbase document store with Elasticsearch
• Example customers include: McGraw Hill

3rd Party Data Aggregation
• Couchbase stores social media and data feeds
• Example customers include: Sambacloud

Page 3: Couchbase_John_Bryce_Israel_Training_use_cases

Use Cases & Customers

Columns: Web App or Use Case, Couchbase Solution, Example Customer

• Content Store & Metadata System: Couchbase document store + Elasticsearch
• Social Game & Mobile App: Couchbase stores game and player data
• Ad Targeting: Couchbase stores user information for fast access
• User Profile Store: Couchbase Server as a key-value store
• Session Store: Couchbase Server as a key-value store
• High Availability Caching Tier: Couchbase Server as a memcached tier replacement
• Chat/Messaging Platform: Couchbase Server

Page 4: Couchbase_John_Bryce_Israel_Training_use_cases

Use Case: Content and Metadata Store

Types of Data
• Content metadata
• Content: articles, text
• Landing pages for website
• Digital content: eBooks, magazines, research material

Application Requirements
• Flexibility to store any kind of content
• Fast access to content metadata (most accessed objects) and content
• Full-text search across data set
• Scales horizontally as more content gets added to the system

Why NoSQL and Couchbase
• Fast access to metadata and content via object-managed cache
• JSON provides schema flexibility to store all types of content and metadata
• Indexing and querying provides real-time analytics capabilities across dataset
• Integration with Elasticsearch for full-text search
• Ease of scalability ensures that the data cluster can be grown seamlessly as the amount of user and ad data grows

Page 5: Couchbase_John_Bryce_Israel_Training_use_cases

McGraw Hill Education Labs Learning portal

Page 6: Couchbase_John_Bryce_Israel_Training_use_cases

Use Case: Content and metadata store

Building a self-adapting, interactive learning portal with Couchbase

Page 7: Couchbase_John_Bryce_Israel_Training_use_cases

The Problem

As learning moves online in great numbers, there is a growing need to build interactive learning environments that:

• Scale! Scale to millions of learners
• Serve MHE as well as third-party content, including open content
• Support learning apps
• Self-adapt via usage data

Page 8: Couchbase_John_Bryce_Israel_Training_use_cases

The Challenge

Backend is an Interactive Content Delivery Cloud that must:

• Allow for elastic scaling under spike periods
• Be able to catalog & deliver content from many sources
• Provide consistent low latency for metadata and stats access
• Support full-text search for content discovery
• Offer tunable content ranking & recommendation functions

Experimented with a combination of:

• XML databases
• SQL/MR engines
• In-memory data grids
• Enterprise search servers

Page 9: Couchbase_John_Bryce_Israel_Training_use_cases
Page 10: Couchbase_John_Bryce_Israel_Training_use_cases

The Learning Portal

• Designed and built as a collaboration between MHE Labs and Couchbase

• Serves as proof-of-concept and testing harness for Couchbase + ElasticSearch integration

• Available for download and further development as open source code

Page 11: Couchbase_John_Bryce_Israel_Training_use_cases

Techniques Used

• Document Modeling
• Metadata & Content Storage
• View Querying to support Content Browsing
• Elasticsearch Integration (Full-Text Search)
  - Content updated in near real-time
  - Search content summaries
  - Relevancy boosted based on user preferences
• Real-Time Content Updates
• Event Logging for offline analysis

Page 12: Couchbase_John_Bryce_Israel_Training_use_cases

Couchbase 2.0 + Elasticsearch

• Store full-text articles as well as document metadata for image, video and text content in Couchbase
• Log user behavior to calculate user preference statistics (e.g. video > text)
• Continuously accept updates from Couchbase with new content & stats
• Combine user preference statistics with custom relevancy scoring to provide personalized search results
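One way to picture the personalization step on this slide is the sketch below. It is a minimal illustration, not the portal's actual scoring code: the function name `personalized_score`, the `weight` parameter, and the linear boost formula are all assumptions made for the example. It scales a search engine's base relevancy score by the share of a user's past views that went to that content type.

```python
def personalized_score(base_score, content_type, user_views, weight=1.0):
    """Boost a base relevancy score by the user's preference for a content type.

    user_views maps content type -> number of views by this user. This shape
    is hypothetical; the slides only say that per-type view counts are logged.
    """
    total = sum(user_views.values())
    if total == 0:
        return base_score  # no history yet: leave the score unchanged
    preference = user_views.get(content_type, 0) / total
    return base_score * (1.0 + weight * preference)


# A user who watched video three times as often as they read text.
views = {"video": 30, "text": 10}
video_score = personalized_score(2.0, "video", views)
text_score = personalized_score(2.0, "text", views)
print(video_score > text_score)
```

With equal base scores, the content type the user views more often ranks higher, which is the "video > text" idea from the slide.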

Page 13: Couchbase_John_Bryce_Israel_Training_use_cases

Data Model

Content Metadata Bucket
• Stores content metadata for media objects and content for articles
• Includes tags, contributors, type information
• Includes pointer to the media

User Profiles Bucket
• Stores user view details per type
• Updated every time a user views a doc with running count
• To be used for customizing ES search results per user preference

Content Stats Bucket
• Stores content view details
• Updated every time a document is viewed
• To be used for boosting ES search results based on popularity
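The three buckets can be pictured as three kinds of JSON document. The sketch below is illustrative only: every field name, key format, and value is an assumption, since the slides describe the buckets in words and do not show the real documents. It uses plain Python dictionaries rather than a Couchbase client, and shows the two running counts being updated on each view.

```python
import json

# Hypothetical documents, one per bucket described on the slide.
content_metadata = {
    "id": "content::1001",
    "type": "video",
    "title": "Intro to Fractions",
    "tags": ["math", "fractions"],
    "contributors": ["MHE"],
    "media_url": "media/1001",  # pointer to the media (placeholder value)
}
user_profile = {"id": "user::42", "views_by_type": {}}
content_stats = {"id": "stats::1001", "views": 0}


def record_view(profile, stats, content):
    """Update both running counts when a user views a document."""
    by_type = profile["views_by_type"]
    by_type[content["type"]] = by_type.get(content["type"], 0) + 1
    stats["views"] += 1


record_view(user_profile, content_stats, content_metadata)
record_view(user_profile, content_stats, content_metadata)
print(json.dumps(user_profile, sort_keys=True))
print(json.dumps(content_stats, sort_keys=True))
```

Keeping the per-user and per-content counters in separate documents means each view touches two small documents instead of rewriting the larger metadata document.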

Page 14: Couchbase_John_Bryce_Israel_Training_use_cases

Architecture

Page 15: Couchbase_John_Bryce_Israel_Training_use_cases

Social and Mobile Gaming

Use Case: Social Gaming

Types of Data
• User account information
• User game profile info
• User’s social graph
• State of the game
• Player badges and stats

Application Requirements
• Ability to support rapid growth
• Fast response times for awesome user experience
• Game uptime: 24x7x365
• Easy to update apps with new features

Why NoSQL and Couchbase
• Scalability ensures that games are ready to handle the millions of users that come with viral growth
• High performance guarantees players are never left waiting to make their next move
• Always-on operations means zero interruption to game play (and revenue)
• Flexible data model means games can be developed rapidly and updated easily with new features

Page 16: Couchbase_John_Bryce_Israel_Training_use_cases

Social gaming at Tencent Stomp Games

Page 17: Couchbase_John_Bryce_Israel_Training_use_cases

Use Case: Social gaming

Building a social game with an awesome user experience that can scale to millions of players

Page 18: Couchbase_John_Bryce_Israel_Training_use_cases

The Problem

Social gaming is all about the experience

Application needs:
- User-centric data (read key-value access)
- Scalability
- Easy and simple backend

Page 19: Couchbase_John_Bryce_Israel_Training_use_cases

The Challenge

Backend must be a platform for multiple games:

• Must be scalable
• Highly available
• Extreme performance (latency and throughput)
• Cost effective
• Operationally easy to maintain

Experimented with several databases:

• Couchbase
• MongoDB
• dbShards
• MySQL Cluster

Page 20: Couchbase_John_Bryce_Israel_Training_use_cases

Evaluation considerations

Databases compared: Couchbase, MongoDB, dbShards, MySQL Cluster (NDB)

Criteria:
• Sharding strategy
• Replication
• Failover support
• Scalability
• Customized data support
• System compatibility
• Coding effort
• Performance
• Protocol
• Upgrade difficulty
• Data persisting method
• Map Reduce / Join
• SQL compatible
• Licensing
• Price
• Bulk price
• Management / monitor tool
• Hardware requirement
• Supported OS
• Operation knowledge
• Operation training
• Operation difficulty
• Developer company size
• Market penetration
• Support
• Successful use cases

Page 21: Couchbase_John_Bryce_Israel_Training_use_cases

The architecture

Page 22: Couchbase_John_Bryce_Israel_Training_use_cases


Draw Something by OMGPOP

Page 23: Couchbase_John_Bryce_Israel_Training_use_cases

As Usage Grew, Game Data Went Non-Linear

Draw Something by OMGPOP

[Chart: Daily Active Users (millions)]

Page 24: Couchbase_John_Bryce_Israel_Training_use_cases

In Contrast…

The Simpsons: Tapped Out

[Chart: Daily Active Users (millions)]

Page 25: Couchbase_John_Bryce_Israel_Training_use_cases

Use Case: 3rd Party Data Aggregation

Types of Data
• Social media feeds: Twitter, Facebook, LinkedIn
• Blogs, news, press articles
• Data service feeds: Hoovers, Reuters

Application Requirements
• Flexibility to store any kind of content
• Flexibility to handle schema changes
• Full-text search across data set
• High-speed data ingestion
• Scales horizontally as more content gets added to the system

Why NoSQL and Couchbase
• JSON provides schema flexibility to store all types of content and metadata
• Fast access to individual documents via built-in cache, high write throughput
• Indexing and querying provides real-time analytics capabilities across dataset
• Integration with Elasticsearch for full-text search
• Ease of scalability ensures that the data cluster can be grown seamlessly as the amount of user and ad data grows

Page 26: Couchbase_John_Bryce_Israel_Training_use_cases

3rd party data aggregation at Sambacloud

Page 27: Couchbase_John_Bryce_Israel_Training_use_cases

Use Case: 3rd party data aggregation

Building a data and content aggregation and management platform

Page 28: Couchbase_John_Bryce_Israel_Training_use_cases

The Problem

• More and more data and content coming in from external sources: social media, data services, press and news, blogs
• A single content store is required for all this information, able to handle different types of formats and schemas

Page 29: Couchbase_John_Bryce_Israel_Training_use_cases

The Challenge

The platform must support:

• Flexible data model to handle any schema and constant changes to schemas
• Elastic scaling, particularly for cloud environments
• Consistent low-latency access and ability to handle incoming streams
• Full-text search for content
• Lightweight analytics for sorting / ranking

Page 30: Couchbase_John_Bryce_Israel_Training_use_cases

The Technologies

SambaCloud Content Services: REST API, HTML5

• Work: Agile Projects
• Share: Any Content
• Organize: Channels
• Recommend: Analytics

Page 31: Couchbase_John_Bryce_Israel_Training_use_cases

Use Case: High Availability Caching

Types of Data
• Application objects
• Popular search query results
• Session information
• Heavily accessed web landing pages

Application Requirements
• Consistently low response times for document / key lookups
• High availability: 24x7x365
• Operationally easy to migrate / upgrade / maintain with app online
• Replacement for entire caching tier

Why NoSQL and Couchbase
• Sub-millisecond latency with consistently high read / write throughput
• Always-on operations even for database upgrades and maintenance, with zero downtime
• memcached compatibility for easy migration to Couchbase without any application changes
• High availability and disaster recovery with intra-cluster and cross-cluster replication (XDCR)

Page 32: Couchbase_John_Bryce_Israel_Training_use_cases

Challenges with a Memcached Tier

Cold Cache
  Symptoms: Slowdown or collapse of the data service layer due to heavily overloaded RDBMS when memcached nodes go down (on failure or for maintenance)
  Couchbase solution: Data is automatically replicated across the Couchbase cluster, providing high availability of data even on failures

Heavy RDBMS Contention
  Symptoms: Multiple requests for data items that do not exist in the cache result in sudden shifting of load to the relational database, causing heavy contention
  Couchbase solution: By replicating data across the cluster, Couchbase Server provides consistent performance without shifting load to the RDBMS layer

Lack of Scalability
  Symptoms: Adding or removing memcached nodes is complicated and causes unpredictable application performance degradation
  Couchbase solution: Auto-sharding and online rebalancing in Couchbase Server provide easy, non-disruptive expansion of the cluster

Complex Monitoring
  Symptoms: Management of individual memcached nodes increases the complexity of operations and lacks a single consistent view of the caching layer
  Couchbase solution: Couchbase Server provides a built-in admin console for cluster-wide management and monitoring, as well as RESTful APIs for easy automation and third-party integration

Page 33: Couchbase_John_Bryce_Israel_Training_use_cases

Before and After: Replacing Caching Tier with Couchbase Server

Page 34: Couchbase_John_Bryce_Israel_Training_use_cases

Memcached Tier Replacement: How it Works

• Fully memcached protocol compatible

• Easy to replace a tier of individual memcached servers with a Couchbase Server cluster

• The cluster receives reads and writes, keeps frequently accessed items in memory, persists and shards and replicates the data amongst the cluster

• Reads and writes are still as low latency and high throughput as memcached

• User gets all the scalability and high-availability advantages of a Couchbase Server cluster
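Protocol compatibility means an application keeps sending the same memcached requests and only the server address changes. The sketch below builds two requests in the memcached text protocol as raw bytes to show what such a client puts on the wire. The helper names `memcached_set` and `memcached_get` and the sample key are made up for the example, and nothing is sent to a server.

```python
def memcached_set(key, value, flags=0, exptime=0):
    """Build a memcached text-protocol 'set' request as bytes."""
    data = value.encode("utf-8")
    # Header format: set <key> <flags> <exptime> <bytes>\r\n, then the data block.
    header = f"set {key} {flags} {exptime} {len(data)}\r\n".encode("ascii")
    return header + data + b"\r\n"


def memcached_get(key):
    """Build a memcached text-protocol 'get' request as bytes."""
    return f"get {key}\r\n".encode("ascii")


set_request = memcached_set("session::abc", "hello")
get_request = memcached_get("session::abc")
print(set_request)
print(get_request)
```

In practice an existing memcached client library produces these bytes, which is why the slide can claim a tier swap with little or no application change.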

Page 35: Couchbase_John_Bryce_Israel_Training_use_cases

Use Case: Ad Targeting

Types of Data
• User profile: preferences and psychographic data
• Ad serving history by user
• Ad buying history by advertiser
• Ad serving history by advertiser

Application Requirements
• High performance to meet limited ad serving budget; time allowance is typically <40 msec
• Scalability to handle hundreds of millions of user profiles and rapidly growing amount of data
• 24x7x365 availability to avoid ad revenue loss

Why NoSQL and Couchbase
• Sub-millisecond reads/writes mean less time is needed for data access, more time is available for ad logic processing, and more highly optimized ads will be served
• Ease of scalability ensures that the data cluster can be grown seamlessly as the amount of user and ad data grows
• Always-on operations = always-on revenue. You will never miss the opportunity to serve an ad because of downtime.

Page 36: Couchbase_John_Bryce_Israel_Training_use_cases

Couchbase is the Complete Solution

Easy Scalability
Grow cluster without application changes, without downtime, with a single click

Consistent High Performance
Consistent sub-millisecond read and write response times with consistent high throughput

Always On 24x365
No downtime for software upgrades, hardware maintenance, etc.

Flexible Data Model
JSON document model with no fixed schema

Page 37: Couchbase_John_Bryce_Israel_Training_use_cases

Proven Easy, Online Scalability

Page 38: Couchbase_John_Bryce_Israel_Training_use_cases

Scaling

• Fully online throughout

• Single REST/Click to add or remove arbitrary number of nodes

• Parallelize data movement on rebalance, throttles to prevent overload
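The "single REST call" can be sketched as follows. This is an assumption-laden illustration: the endpoint paths (`/controller/addNode`, `/controller/rebalance`), the form fields, and port 8091 follow the Couchbase Server REST API as commonly documented, but they should be checked against the documentation for the version in use. The host names and credentials are placeholders, authentication headers are omitted, and the requests are only built, never sent.

```python
import urllib.parse
import urllib.request


def add_node_request(cluster, new_host, admin_user, admin_password):
    """Build (but do not send) a POST asking the cluster to add a node."""
    body = urllib.parse.urlencode(
        {"hostname": new_host, "user": admin_user, "password": admin_password}
    ).encode("ascii")
    return urllib.request.Request(
        f"http://{cluster}:8091/controller/addNode", data=body, method="POST"
    )


def rebalance_request(cluster, known_nodes, ejected_nodes=()):
    """Build (but do not send) a POST that starts an online rebalance."""
    body = urllib.parse.urlencode(
        {
            "knownNodes": ",".join(known_nodes),
            "ejectedNodes": ",".join(ejected_nodes),
        }
    ).encode("ascii")
    return urllib.request.Request(
        f"http://{cluster}:8091/controller/rebalance", data=body, method="POST"
    )


# Placeholder hosts and credentials, for illustration only.
add_req = add_node_request(
    "cb1.example.com", "cb2.example.com", "Administrator", "password"
)
reb_req = rebalance_request(
    "cb1.example.com", ["ns_1@cb1.example.com", "ns_1@cb2.example.com"]
)
print(add_req.get_method(), add_req.full_url)
print(reb_req.get_method(), reb_req.full_url)
```

Sending either request would take a call such as `urllib.request.urlopen` plus admin credentials; the application itself needs no change while the rebalance runs.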

Page 39: Couchbase_John_Bryce_Israel_Training_use_cases

Couchbase: High throughput that scales linearly

Linear throughput scalability

High throughput with 1.4 GB/sec data transfer rate using 4 servers

http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/white_paper_c11-708169.pdf

Page 40: Couchbase_John_Bryce_Israel_Training_use_cases

Proven Rapid Growth Scalability

Draw Something by OMGPOP

[Chart: Daily Active Users (millions), Feb 2012 to March 2012]

Page 41: Couchbase_John_Bryce_Israel_Training_use_cases

Consistent High Performance

Page 42: Couchbase_John_Bryce_Israel_Training_use_cases

Consistent High Performance

• Consistent, predictable sub-millisecond latency: apps need fast, predictable access to data; it’s not good enough to be fast some of the time

• Consistent, predictable throughput: throughput capacity of your data layer should be independent of the mix of reads and writes

Page 43: Couchbase_John_Bryce_Israel_Training_use_cases

Consistent low latency with varying doc sizes

Consistently low latencies in microseconds for varying document sizes with a mixed workload

http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/white_paper_c11-708169.pdf

Page 44: Couchbase_John_Bryce_Israel_Training_use_cases

High throughput that scales linearly

Linear throughput scalability

High throughput with 1.4 GB/sec data transfer rate using 4 servers

Page 45: Couchbase_John_Bryce_Israel_Training_use_cases

LinkedIn 4-node cluster

Page 46: Couchbase_John_Bryce_Israel_Training_use_cases

Always On 24x7x365

Page 47: Couchbase_John_Bryce_Israel_Training_use_cases

Always on 24x7x365

• Online upgrades: balance in nodes with new versions
• Online backup
• Online compaction
• Built-in monitoring plus REST interface: cluster wide to per node drill down
• Full admin REST interface for easy integration

Page 48: Couchbase_John_Bryce_Israel_Training_use_cases

Availability

[Chart: bars labeled Cache 1, Cache 2, Cache 3 and Couchbase on a scale of 0 to 100; data labels shown: 82, 57, 72]

Page 49: Couchbase_John_Bryce_Israel_Training_use_cases

Flexible Data Model

Page 50: Couchbase_John_Bryce_Israel_Training_use_cases

Relational vs Document Data Model

Relational data model
Highly-structured table organization with rigidly-defined data formats and record structure. (Illustrated as a table with columns C1, C2, C3, C4.)

Document data model
Collection of complex documents with arbitrary, nested data formats and varying “record” format. (Illustrated as a set of JSON documents.)
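The contrast between the two models can be made concrete with a small sketch. All field names and values here are invented for illustration: the relational side is a list of tuples that must all match one column list, while the document side is a list of JSON-style documents whose fields and nesting differ from record to record.

```python
import json

# Relational style: every row has the same four columns.
columns = ("id", "name", "type", "level")
rows = [(1, "alice", "player", 7), (2, "bob", "player", 3)]

# Document style: each JSON document carries its own, possibly nested, shape.
documents = [
    {"id": 1, "name": "alice", "badges": ["early-bird"], "stats": {"level": 7}},
    {"id": 2, "name": "bob", "friends": [1]},
]

for row in rows:
    assert len(row) == len(columns)  # rigid record structure

for doc in documents:
    print(json.dumps(doc, sort_keys=True))

# Field sets differ from one document to the next.
field_sets = [sorted(doc) for doc in documents]
print(field_sets)
```

Adding a new attribute to one document requires no change to the others, which is the schema flexibility the slide refers to.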

Page 51: Couchbase_John_Bryce_Israel_Training_use_cases

Comparisons

Page 52: Couchbase_John_Bryce_Israel_Training_use_cases

Couchbase Server vs. MongoDB

Easy Scalability
  Couchbase Server: With 1-click, horizontally grow cluster, even scale across datacenters
  MongoDB: Complex multi-step scaling, no write scaling across data centers

Consistent, High Performance
  Couchbase Server: Consistent sub-millisecond reads/writes; consistent high throughput
  MongoDB: High & inconsistent latency; lower throughput

Flexible Data Model
  Couchbase Server: Schemaless data model for rapid development
  MongoDB: Schemaless data model for rapid development

Always On 24x7x365
  Couchbase Server: No downtime for software upgrades, hardware maintenance, etc.
  MongoDB: Difficult online upgrade; not all maintenance is online

Page 53: Couchbase_John_Bryce_Israel_Training_use_cases

Couchbase Server Leadership vs. Cassandra

Easy Scalability
  Couchbase Server: With 1-click, horizontally grow cluster, even scale across datacenters
  Cassandra: Complex multi-step scaling, coarse-grain growth recommended

Consistent, High Performance
  Couchbase Server: Consistent sub-millisecond reads/writes and high throughput
  Cassandra: High and inconsistent latency; medium throughput

Flexible Data Model
  Couchbase Server: Schemaless data model for rapid development
  Cassandra: Very complex columnar data model

Always On 24x7x365
  Couchbase Server: No downtime for software upgrades, hardware maintenance, etc.
  Cassandra: Online upgrades and online maintenance

Page 54: Couchbase_John_Bryce_Israel_Training_use_cases

Read performance comparison - NoSQL databases

[Chart: Read latencies against throughput. X-axis: Operations per Second. Y-axis: 95th Percentile Latency (ms). Series: MongoDB, Cassandra, Couchbase.]

• MongoDB cannot handle throughput above ~8000 ops / sec
• Couchbase handles ~3X throughput with significantly lower latency

Third Party Data - Altoros

Page 55: Couchbase_John_Bryce_Israel_Training_use_cases

Write performance comparison - NoSQL databases

[Chart: Insert/update latencies against throughput. X-axis: Operations per Second. Y-axis: 95th Percentile Latency (ms). Series: MongoDB, Cassandra, Couchbase.]

• MongoDB latency shoots up beyond 6000 ops / sec
• Couchbase latency stays consistently low even at 20000 ops / sec

Third Party Data - Altoros

Page 56: Couchbase_John_Bryce_Israel_Training_use_cases

Thank you!

Get Couchbase http://www.couchbase.com/download

[email protected]

Page 57: Couchbase_John_Bryce_Israel_Training_use_cases