couchbase server 101: couchbase connect 2015

COUCHBASE 101 Dipti Borkar Head, WW Solutions Engineering

Upload: couchbase

Post on 26-Jul-2015


TRANSCRIPT

COUCHBASE 101

Dipti Borkar, Head, WW Solutions Engineering

©2015 Couchbase Inc. 2

Agenda

• Where does Couchbase fit in?
• Key Concepts
• Operations
• Cluster-wide operations
• Look at a Live Cluster


Big Data = Operational + Analytic (NoSQL + Hadoop)

• Online Web/Mobile/IoT apps: millions of customers/consumers

• Offline, batch-oriented analytics apps: hundreds of business analysts

Couchbase meets today’s & tomorrow’s requirements

Flexible data model

Consistent performance at scale

High availability

Easy, affordable scalability

24x365


Enterprises use Couchbase to enable key objectives

• 360 Degree Customer View
• Profile Management
• Catalog
• Fraud Detection
• Content Management
• Internet of Things
• Digital Communication
• Real Time Big Data
• Mobile Applications
• Personalization

Key Concepts


Couchbase can act as a Key-Value Store or a Document Store

Key-Value Store:

2014-06-23-10:15am : 75F

2014-06-23-11:30am : 77F

2014-06-23-02:00pm : 82F

Document Store:

0001: {firstname: “Dipti”, lastname: “Borkar”, language: “English”, time_zone: “PST”, zip: 94403 }

Key: UTF-8 string up to 250 bytes

Value: can be 0 bytes to 20 MB (best practice < 1 MB)
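The size limits above can be checked on the client before a write. The following is a minimal sketch, not a Couchbase API: `check_document` is a hypothetical helper, and it assumes "MB" means 1024 × 1024 bytes, which the slide does not specify.

```python
MAX_KEY_BYTES = 250                      # slide: key is a UTF-8 string up to 250 bytes
MAX_VALUE_BYTES = 20 * 1024 * 1024       # slide: value up to 20 MB (assumed binary MB)
RECOMMENDED_VALUE_BYTES = 1024 * 1024    # slide: best practice is under 1 MB


def check_document(key: str, value: bytes) -> list[str]:
    """Return a list of problems; an empty list means the pair fits the limits."""
    problems = []
    key_bytes = len(key.encode("utf-8"))  # the limit is in bytes, not characters
    if key_bytes > MAX_KEY_BYTES:
        problems.append(f"error: key is {key_bytes} bytes, limit is {MAX_KEY_BYTES}")
    if len(value) > MAX_VALUE_BYTES:
        problems.append(f"error: value is {len(value)} bytes, limit is {MAX_VALUE_BYTES}")
    elif len(value) >= RECOMMENDED_VALUE_BYTES:
        problems.append("warning: value is 1 MB or more, above the best-practice size")
    return problems


if __name__ == "__main__":
    print(check_document("0001", b'{"firstname": "Dipti"}'))
    print(check_document("k" * 251, b""))
```

Note that a key made of multi-byte characters can exceed the limit with fewer than 250 characters, which is why the check encodes the key first.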


Fundamentals

Document ID or Key
• Similar to primary keys in relational databases
• Documents are partitioned based on the document ID
• ID-based document lookup is extremely fast
• Must be unique

Value
• JSON
• Binary: integers, strings, booleans
• Common binary values include serialized objects, compressed XML, compressed text, encrypted values

Metadata
• CAS value (unique identifier for concurrency)
• TTL
• Flags (optional client library metadata)
• Revision #
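To illustrate how a CAS value supports concurrency, here is a minimal sketch of optimistic locking against an in-memory stand-in. `ToyStore` and `CasMismatch` are hypothetical names for this example, not part of any Couchbase client library, and the integer counter is only an assumed way to generate CAS values.

```python
import itertools


class CasMismatch(Exception):
    """Raised when a write carries a CAS value that is no longer current."""


class ToyStore:
    def __init__(self):
        self._docs = {}                    # key -> (cas, value)
        self._next_cas = itertools.count(1)

    def get(self, key):
        """Return (cas, value) so the caller can send the CAS back on update."""
        return self._docs[key]

    def set(self, key, value, cas=None):
        """Store value; if cas is given, only succeed when it matches."""
        current = self._docs.get(key)
        if cas is not None and (current is None or current[0] != cas):
            raise CasMismatch(key)
        new_cas = next(self._next_cas)     # every successful write changes the CAS
        self._docs[key] = (new_cas, value)
        return new_cas


if __name__ == "__main__":
    store = ToyStore()
    store.set("0001", {"zip": 94403})
    cas, doc = store.get("0001")           # two writers read the same version
    store.set("0001", {"zip": 94404}, cas=cas)   # first writer wins
    try:
        store.set("0001", {"zip": 94405}, cas=cas)  # second writer is stale
    except CasMismatch:
        print("stale write rejected; re-read and retry")
```

The rejected writer would normally re-read the document, reapply its change, and try again with the fresh CAS value.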


Benefits of JSON

• Can represent complex objects and data structures

• Very simple notation: lightweight, compact, readable

• The most common API return type for integrations: Facebook, Twitter, you name it, return JSON

• Native to JavaScript (can be useful)

• Can be inserted straight into Couchbase (faster development)

• Serialization and deserialization are very fast

Storing and retrieving documents

• Clients read from and write to documents (user/application data)

• Documents are stored in data buckets

• Data buckets live on server nodes

• Server nodes form a Couchbase cluster

• Dynamically scalable

• Based on hash partitioning

Objects Serialized to JSON and Back

User Object

string uid

string firstname

string lastname

int age

array favorite_colors

string email

add(): the User object is serialized to a JSON document stored under the key u::[email protected]

get(): the JSON document is deserialized back into a User object

u::[email protected]

{ “uid”: 123456, “firstname”: “John”, “lastname”: “Smith”, “age”: 22, “favorite_colors”: [“blue”, “black”], “email”: “[email protected]” }
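The round trip on this slide can be sketched with the standard library alone. This is an illustration, not SDK code: the `bucket` dict stands in for a data bucket, `add` and `get` are hypothetical helpers named after the slide's arrows, and the e-mail address and key are made-up example values.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class User:
    uid: str
    firstname: str
    lastname: str
    age: int
    favorite_colors: list
    email: str


bucket = {}  # stand-in for a data bucket: key -> JSON text


def add(key: str, user: User) -> None:
    """Serialize the object to JSON and store it; refuse to overwrite (assumed add semantics)."""
    if key in bucket:
        raise KeyError(f"{key} already exists")
    bucket[key] = json.dumps(asdict(user))


def get(key: str) -> User:
    """Fetch the JSON text and rebuild the object from it."""
    return User(**json.loads(bucket[key]))


if __name__ == "__main__":
    john = User("123456", "John", "Smith", 22, ["blue", "black"], "john@example.com")
    add("u::john@example.com", john)       # hypothetical key
    print(bucket["u::john@example.com"])   # the stored JSON document
    print(get("u::john@example.com") == john)
```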


Couchbase provides a complete Data Management solution

• High availability cache

• Key-value store

• Document database

• Embedded database

• Sync management

Multi-purpose capabilities support a broad range of apps and use cases

Enterprises often start with cache, then broaden usage to other apps and use cases


What makes Couchbase unique?

Enterprises choose Couchbase for several key advantages

• Performance & scalability leader: sub-millisecond latency with high throughput; memory-centric architecture

• Multi-purpose: cache, key-value store, document database, and local/mobile database in a single platform

• Simplified administration: easy to deploy & manage; integrated Admin Console, single-click cluster expansion & rebalance

• Always-on availability (24x365): data replication across nodes, clusters, and data centers

Operations


Couchbase Server Architecture

DATA MANAGER
• Query Engine (8092: Query API)
• Object-managed Cache (11210 / 11211: data access ports)
• Storage Engine

CLUSTER MANAGER (Erlang/OTP)
• HTTP: REST management API / Web UI (8091: Admin Console)
• Replication, Rebalance, Shard State Manager

Single Node Operations - Write

[Diagram: the App Server writes a Doc to the Managed Cache; in numbered steps the Doc is then placed on the Disk Queue (to Disk) and on the Replication Queue (memory-to-memory replication to other node).]

Single Node Operations - Read

[Diagram: the App Server issues Get Doc 1 and Doc 1 is returned from the Managed Cache. Also shown: Disk, Disk Queue, Replication Queue (memory-to-memory replication to other node).]

Single Node Operations – Cache Ejection

[Diagram: Docs 1 to 6 are on Disk; the Managed Cache holds Docs 2 to 6 and Doc 1 has been ejected from it. Also shown: App Server, Disk Queue, Replication Queue (memory-to-memory replication to other node).]

Single Node Operations – Cache Miss

[Diagram: the App Server issues Get Doc 1; Doc 1 is not in the Managed Cache (which holds Docs 2 to 6), so it is fetched from Disk into the cache and returned. Also shown: Disk Queue, Replication Queue (memory-to-memory replication to other node).]
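The four single-node slides (write, read, cache ejection, cache miss) can be tied together in one toy model. This is a sketch under stated assumptions, not Couchbase's implementation: `ToyNode` is a hypothetical class, ejection is modelled as least-recently-used, and only docs already persisted to disk are allowed to leave the cache.

```python
from collections import OrderedDict, deque


class ToyNode:
    def __init__(self, cache_capacity: int):
        self.cache_capacity = cache_capacity
        self.cache = OrderedDict()        # key -> value, least recently used first
        self.disk = {}                    # persisted copies
        self.dirty = set()                # keys written but not yet on disk
        self.disk_queue = deque()
        self.replication_queue = deque()  # would feed memory-to-memory replication

    def write(self, key, value):
        """Write goes to the managed cache first, then is queued for disk and replication."""
        self.cache[key] = value
        self.cache.move_to_end(key)
        self.dirty.add(key)
        self.disk_queue.append(key)
        self.replication_queue.append(key)

    def flush(self):
        """Drain the disk queue, then eject if the cache is over capacity."""
        while self.disk_queue:
            key = self.disk_queue.popleft()
            self.disk[key] = self.cache[key]
            self.dirty.discard(key)
        self._eject()

    def read(self, key):
        """Return (value, 'hit') from the cache, or load from disk on a miss."""
        if key in self.cache:
            self.cache.move_to_end(key)
            return self.cache[key], "hit"
        value = self.disk[key]            # cache miss: fetch from disk
        self.cache[key] = value
        self._eject(keep=key)
        return value, "miss"

    def _eject(self, keep=None):
        for key in list(self.cache):
            if len(self.cache) <= self.cache_capacity:
                break
            if key != keep and key not in self.dirty:
                del self.cache[key]       # safe: a copy exists on disk


if __name__ == "__main__":
    node = ToyNode(cache_capacity=2)
    for i in (1, 2, 3):
        node.write(f"doc{i}", f"v{i}")
    node.flush()
    print(list(node.cache))               # doc1 has been ejected
    print(node.read("doc1"))              # cache miss, served from disk
    print(node.read("doc1"))              # now a cache hit
```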

Cluster-wide Operations


Auto sharding – Bucket and vBuckets

• Each bucket has active and replica data sets

• Each data set has 1024 virtual buckets (vBuckets)

• Documents get logically mapped to vBuckets

• Document IDs always get hashed to the same virtual bucket

• Virtual buckets do not have a fixed physical server location

• The mapping between the virtual buckets and physical servers is called the cluster map

• Each virtual bucket contains 1/1024th of the data set

[Diagram: each data bucket is divided into virtual buckets (vB) 1 … 1024.]
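The mapping from document ID to vBucket can be sketched as a stable hash taken modulo 1024. The slides do not say which hash function Couchbase uses, so CRC32 from the standard library is an assumption here, and `vbucket_for` is a hypothetical helper.

```python
import zlib
from collections import Counter

NUM_VBUCKETS = 1024  # from the slide: each data set has 1024 vBuckets


def vbucket_for(doc_id: str) -> int:
    """Map a document ID to a vBucket number in the range 0..1023."""
    # Assumption: CRC32 stands in for whatever hash the server actually uses.
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_VBUCKETS


if __name__ == "__main__":
    # The same ID always lands in the same vBucket.
    print(vbucket_for("0001"), vbucket_for("0001"))
    # Many IDs spread across the vBuckets.
    spread = Counter(vbucket_for(f"user::{i}") for i in range(100_000))
    print(len(spread), "vBuckets used")
```

Because the hash depends only on the document ID, the vBucket for a document never changes, even when servers are added or removed.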


Cluster Map

[Diagram: documents read from / written to the cluster pass through Hash function (KEY) onto logical partitions vB1, vB2, vB3, vB4, vB5 … vB1024; the Cluster Map assigns those partitions to physical servers A, B, C. Adding a node to scale out produces a New Cluster Map.]
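A cluster map can be pictured as a table with one entry per vBucket naming the server that currently owns it. The sketch below assumes a CRC32 hash and a round-robin initial assignment, neither of which comes from the slides; `build_cluster_map` and `server_for` are hypothetical helpers.

```python
import zlib

NUM_VBUCKETS = 1024


def build_cluster_map(servers):
    """Assign vBuckets to servers round-robin (an assumed starting layout)."""
    return [servers[vb % len(servers)] for vb in range(NUM_VBUCKETS)]


def server_for(doc_id, cluster_map):
    """Two-step lookup: hash the key to a vBucket, then look the vBucket up in the map."""
    vb = zlib.crc32(doc_id.encode("utf-8")) % len(cluster_map)
    return vb, cluster_map[vb]


if __name__ == "__main__":
    old_map = build_cluster_map(["A", "B", "C"])
    new_map = build_cluster_map(["A", "B", "C", "D"])
    # The vBucket for a key is fixed; only the server behind it can change.
    print(server_for("0001", old_map))
    print(server_for("0001", new_map))
```

Rebuilding the map round-robin, as done here for brevity, moves far more vBuckets than necessary when a node is added; it only serves to show that the key-to-vBucket step is independent of the server list.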


Cluster Map – 2 nodes added

Multi-Node Operations

[Diagram: App Server 1 and App Server 2, each with a Couchbase client library and a cluster map, read/write/update against Servers 1 to 3. Active shards: Server 1 holds 5, 2, 9; Server 2 holds 4, 7, 8; Server 3 holds 1, 3, 6. Replica shards: Server 1 holds 4, 1, 8; Server 2 holds 6, 3, 2; Server 3 holds 7, 9, 5.]

• Docs distributed evenly across servers

• Each server stores both active and replica docs
  o Only one server is active for a given doc at a time

• Client library provides app with simple interface to database

• Cluster map tells the client library which server a doc is on
  o App never needs to know

• App reads, writes, updates docs

• Multiple app servers can access same document at same time

Adding Nodes

[Diagram: Servers 4 and 5, each with active and replica shards, join Servers 1 to 3; some shards move to the new servers while App Servers 1 and 2 continue to read/write/update through the client library and cluster map.]

• Two servers added with one-click operation

• Docs automatically rebalance across cluster
  o Even distribution of docs
  o Minimum doc movement

• Cluster map updated

• App database calls now distributed over larger number of servers
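The two rebalance goals above, even distribution and minimum movement, can be illustrated with a small planner that moves only as many vBuckets as the new servers need. This is a simplified sketch, not Couchbase's rebalance algorithm; `rebalance` is a hypothetical function and replicas are ignored.

```python
from collections import Counter


def rebalance(cluster_map, new_servers):
    """Return a new map in which only the vBuckets that must move change owner."""
    servers = sorted(set(cluster_map)) + list(new_servers)
    base, extra = divmod(len(cluster_map), len(servers))
    # Target share per server; the first `extra` servers take one more vBucket.
    target = {s: base + (1 if i < extra else 0) for i, s in enumerate(servers)}
    load = Counter(cluster_map)
    needy = [s for s in servers if load[s] < target[s]]
    new_map = list(cluster_map)
    for vb, owner in enumerate(cluster_map):
        if not needy:
            break
        if load[owner] > target[owner]:   # owner has more than its share
            dest = needy[0]
            new_map[vb] = dest
            load[owner] -= 1
            load[dest] += 1
            if load[dest] == target[dest]:
                needy.pop(0)
    return new_map


if __name__ == "__main__":
    old_map = ["ABC"[vb % 3] for vb in range(1024)]
    new_map = rebalance(old_map, ["D", "E"])
    moved = sum(1 for a, b in zip(old_map, new_map) if a != b)
    print(Counter(new_map), "moved:", moved)
```

In this sketch the number of vBuckets moved equals the number that end up on the new servers, so no data shuffles between the original servers.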

Managing failures

[Diagram: App Server 1 and App Server 2, each with a Couchbase client library and a cluster map, access Servers 1 to 5. Server 3 (active shards 1, 3, 6; replica shards 7, 9, 5) fails, and the replicas of Shard 1 and Shard 3 on other servers are promoted to active.]

• App servers accessing shards

• Requests to Server 3 fail

• Cluster detects server failed
  o Promotes replicas of shards to active
  o Updates cluster map

• Requests for docs now go to appropriate server

• Typically rebalance would follow
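The failover steps above amount to rewriting the cluster map so that each shard whose active copy was on the failed server now points at its replica. The sketch below uses the three-server shard layout from the Multi-Node Operations diagram; `fail_over` is a hypothetical function and the dict-based maps are an assumed representation.

```python
def fail_over(active_map, replica_map, failed):
    """Promote replicas for shards that were active on the failed server."""
    new_active = dict(active_map)
    new_replica = dict(replica_map)
    for shard, server in active_map.items():
        if server == failed:
            new_active[shard] = replica_map[shard]  # replica becomes active
            new_replica[shard] = None               # no replica until a rebalance
    for shard, server in replica_map.items():
        if server == failed:
            new_replica[shard] = None               # replica copy was lost
    return new_active, new_replica


# Shard layout taken from the three-server diagram.
ACTIVE = {5: "Server 1", 2: "Server 1", 9: "Server 1",
          4: "Server 2", 7: "Server 2", 8: "Server 2",
          1: "Server 3", 3: "Server 3", 6: "Server 3"}
REPLICA = {4: "Server 1", 1: "Server 1", 8: "Server 1",
           6: "Server 2", 3: "Server 2", 2: "Server 2",
           7: "Server 3", 9: "Server 3", 5: "Server 3"}

if __name__ == "__main__":
    active, replica = fail_over(ACTIVE, REPLICA, "Server 3")
    print({s: active[s] for s in (1, 3, 6)})  # where requests go now
    print([s for s, srv in replica.items() if srv is None])  # shards lacking a replica
```

The shards left without a replica are the reason a rebalance typically follows a failover.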

A look at a live cluster

Cross Data Center Replication (XDCR)

Market leading memory-to-memory replication

[Diagram: replication between clusters in New York and San Francisco.]

XDCR: Cross Data Center Replication
• Application can access both clusters (master – master)
• Scales out linearly
• Different from intra-cluster replication (“CP” versus “AP”)

XDCR: Flexible topologies
• One-one, one-many, many-one
• Differently sized and resourced clusters supported

XDCR after Write

[Diagram: on a Couchbase Server node, the App Server writes Doc 1 to the Managed Cache; Doc 1 is then placed on the Disk Queue (to Disk), the Replication Queue (memory-to-memory replication to other node), and the XDCR Queue (memory-to-memory replication to remote cluster, new in 3.0).]

Indexing and Querying Features

Index and Query
• Distributed indexing and querying
• Secondary indexes of JSON document content
• Flexible querying of indexes

Incremental Map-Reduce
• Distributed simple real-time analytics
• Only considers changes due to updated data

Full Text Search
• Robust integration with ElasticSearch / Solr cluster
• Flexible full text search and faceted search
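"Only considers changes due to updated data" can be illustrated with a count-per-key reduction that is adjusted one changed document at a time instead of being recomputed over the whole data set. This is a conceptual sketch, not the view engine's implementation; `IncrementalCount` and its method names are hypothetical.

```python
from collections import Counter


class IncrementalCount:
    """Keep a count of documents per emitted key, updated from changes only."""

    def __init__(self, map_fn):
        self.map_fn = map_fn      # doc -> key to emit, or None to emit nothing
        self.emitted = {}         # doc_id -> key emitted for the current version
        self.counts = Counter()   # the reduced result

    def apply_change(self, doc_id, doc):
        """Process one created, updated, or deleted (doc=None) document."""
        old_key = self.emitted.pop(doc_id, None)
        if old_key is not None:   # undo the previous version's contribution
            self.counts[old_key] -= 1
            if self.counts[old_key] == 0:
                del self.counts[old_key]
        if doc is not None:
            key = self.map_fn(doc)
            if key is not None:
                self.emitted[doc_id] = key
                self.counts[key] += 1


if __name__ == "__main__":
    by_language = IncrementalCount(lambda doc: doc.get("language"))
    by_language.apply_change("0001", {"language": "English"})
    by_language.apply_change("0002", {"language": "English"})
    by_language.apply_change("0002", {"language": "French"})  # an update
    print(dict(by_language.counts))
```

Each change touches at most two counters, regardless of how many documents are already indexed.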

View processing after write

[Diagram: on a Couchbase Server node, the App Server writes Doc 1 to the Managed Cache; Doc 1 is then placed on the Replication Queue (to other node) and the Disk Queue (to Disk), and passed to the View engine.]

Couchbase Server Architecture - Views

[Diagram: App Server 1 and App Server 2, each with a Couchbase client library and a cluster map, query Servers 1 to 3; each server holds active and replica shards.]

• Indexing work is distributed amongst nodes

• Large data set possible

• Parallelize the effort

• Each node has index for data stored on it

• Queries combine the results from required nodes
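The last two bullets describe a scatter-gather pattern: every node keeps a sorted index of its own data, and a query merges the per-node results. A minimal sketch with the standard library follows; `query_view` and the sample rows are hypothetical, and each node index is assumed to be a list of (key, doc_id) pairs already sorted by key.

```python
import heapq
from itertools import islice


def _scan(index, start_key):
    """Yield one node's index rows at or after start_key, in index order."""
    for row in index:
        if start_key is None or row[0] >= start_key:
            yield row


def query_view(node_indexes, start_key=None, limit=None):
    """Merge the sorted per-node results into one sorted result list."""
    streams = [_scan(index, start_key) for index in node_indexes]
    merged = heapq.merge(*streams)       # lazy k-way merge of sorted streams
    return list(islice(merged, limit))   # limit=None returns everything


if __name__ == "__main__":
    node_a = [("apple", "d1"), ("cherry", "d4")]
    node_b = [("banana", "d2")]
    node_c = [("avocado", "d3"), ("date", "d5")]
    print(query_view([node_a, node_b, node_c]))
    print(query_view([node_a, node_b, node_c], start_key="b", limit=2))
```

Because the merge is lazy, a query with a small limit reads only a few rows from each node.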


Couchbase Elastic Search Connector


Couchbase Solr Connector

N1QL: Why SQL for NoSQL?

Why SQL for NoSQL

JSON document model provides
• Rich Structure (no assembly)
• Structure Evolution (flexible schema, seamless change)

SQL provides
• Query across relationships
• Query in general

Why SQL for JSON?
• To address all these data concerns
• N1QL is SQL for JSON

Models for Representing Data

Data Concern | Relational Model | JSON Document Model (NoSQL)

Rich Structure | Multiple flat tables; constant assembly and disassembly | Documents; no assembly required!

Relationships | Represented; queried (SQL) | Represented; queried? Not so far…

Value Evolution | Data can be updated | Data can be updated

Structure Evolution | Uniform and rigid; change is disruptive and manual | Flexible; change is seamless and data-driven

SELECT

Standard SELECT pipeline
• SELECT, FROM, WHERE, GROUP BY, ORDER BY, LIMIT, OFFSET

Queries across relationships
• JOINs
• Subqueries
• NEST: a JOIN that embeds child objects within their parent
• UNNEST: a JOIN that surfaces nested objects as top-level data

Aggregation

Set operators
• UNION, INTERSECT, EXCEPT
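The shapes that NEST and UNNEST produce can be shown on plain Python dicts. This is only an illustration of the result structure, not N1QL syntax or its join rules: `nest` and `unnest` are hypothetical helpers, the sample documents are made up, and `nest` is written as an inner join that drops parents without children.

```python
def unnest(parents, array_field, alias):
    """One output row per element of a nested array, paired with its parent."""
    rows = []
    for parent in parents:
        for item in parent.get(array_field, []):
            rows.append({**parent, alias: item})
    return rows


def nest(parents, children, parent_key, child_fk, alias):
    """Embed each parent's matching children as an array inside the parent."""
    by_parent = {}
    for child in children:
        by_parent.setdefault(child[child_fk], []).append(child)
    return [
        {**parent, alias: by_parent[parent[parent_key]]}
        for parent in parents
        if parent[parent_key] in by_parent
    ]


if __name__ == "__main__":
    users = [{"uid": "1", "favorite_colors": ["blue", "black"]},
             {"uid": "2", "favorite_colors": []}]
    orders = [{"order": "o1", "uid": "1"}, {"order": "o2", "uid": "1"}]
    print(unnest(users, "favorite_colors", "color"))
    print(nest(users, orders, "uid", "uid", "orders"))
```

UNNEST turns one document with a two-element array into two rows, while NEST goes the other way and folds two child documents into one parent row.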

N1QL Architecture

Single node installation, services defined dynamically

Query service accesses Index and Data to formulate a response

All queries and direct access are topology aware and dynamically scalable

Indexing

CREATE / DROP INDEX

Two types of indexes
• View indexes
• GSI indexes (global secondary indexes, new)

Can index any data expression
• Nested / complex expressions
• Computed expressions

EXPLAIN

Data writes*

UPDATE … WHERE …
• Partial updates; deep updates

DELETE … WHERE …
• Deeply nested conditions

INSERT … VALUES …; INSERT … SELECT …
• Bulk insert; transfer and transformation

MERGE
• INSERT or UPDATE; ETL support

*Single-document atomicity.

Q & A

Thank you.

[email protected]

@dborkar