Couchbase TLV Dev track 04 - power techniques with indexing


Upload: couchbase

Post on 07-Jul-2015


TRANSCRIPT

Page 1: Couchbase TLV Dev track 04 - power techniques with indexing
Page 2: Couchbase TLV Dev track 04 - power techniques with indexing

Developing with Couchbase: Power Techniques with Indexing

Michael Nitschinger

Engineer, Developer Solutions

Page 3: Couchbase TLV Dev track 04 - power techniques with indexing

Agenda

• Introduction to Indexing and Querying in Couchbase

• Understand Map/Reduce Basics

• Architectural Overview

• Simple Indexes

• Simple Queries

Page 4: Couchbase TLV Dev track 04 - power techniques with indexing

Indexing and Querying

Page 5: Couchbase TLV Dev track 04 - power techniques with indexing

Views are Indexes

Indexes help to speed up access to data

[Diagram: an Index pointing to documents Doc1 through Doc5]

Page 6: Couchbase TLV Dev track 04 - power techniques with indexing

Couchbase Server 2.0: Views

• Storing and Indexing Data are separate processes

• In RDBMS, Indexes are optimized based on fixed data types.

• Map-Reduce is a flexible approach helping to Index unstructured data.

Page 7: Couchbase TLV Dev track 04 - power techniques with indexing

Map-Reduce in General

• The map function locates data items and outputs optimized data structures

• The reduce function aggregates the output from a map function.

• Together: very good for semi-structured and distributed data.

[Diagram: several Map steps, each producing an Output, feed a single Reduce step]

Page 8: Couchbase TLV Dev track 04 - power techniques with indexing

Couchbase Server Map-Reduce

In Couchbase, Map-Reduce is specifically used to create an Index.

Map functions are applied to JSON Documents and they output or “emit” a data structure designed to be rapidly queried and traversed.

[Diagram: CRUD Operations → MAP() → emit() → (processed)]

Page 9: Couchbase TLV Dev track 04 - power techniques with indexing

Couchbase Server Views

• Create a View of beer names

• Keep only documents whose JSON key type equals "beer" and that also have the JSON keys brewery_id and name

• Output the beer name and an Alcohol By Volume (ABV) value
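
A map function for this view could look like the sketch below. It assumes beer documents carry the fields type, brewery_id, name and abv; the emit() collector and the sample documents are stand-ins added only so the snippet runs outside the server.

```javascript
// Stand-in for the view engine's emit(): collects rows so the sketch runs locally.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Map function: index beer names and output the ABV as the row value.
function map(doc, meta) {
  if (doc.type === "beer" && doc.brewery_id && doc.name) {
    emit(doc.name, doc.abv);
  }
}

// Hypothetical sample documents (not taken from the slides).
map({ type: "beer", brewery_id: "b1", name: "Pale Ale", abv: 5.4 }, { id: "beer_1" });
map({ type: "brewery", name: "Some Brewery" }, { id: "brewery_1" });
map({ type: "beer", name: "No Brewery Beer", abv: 4.0 }, { id: "beer_2" });
```

Only the first document passes the filter, so the index ends up with a single row.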

Page 10: Couchbase TLV Dev track 04 - power techniques with indexing

Couchbase Server Views

• Views can cover a few different use cases

- Simple secondary indexes (the most common)

- Complex secondary, tertiary and composite indexes

- Aggregation functions (reduction), for example: count the number of North American Ales

- Organizing related data

Page 11: Couchbase TLV Dev track 04 - power techniques with indexing

Map() Function => Index

function(doc, meta) {
  emit(doc.username, doc.email);
}

• doc: Content

• meta: Metadata

• emit(): create row

• doc.username: indexed key

• doc.email: output value(s)

Every changed document goes through all map functions
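
How a map function turns documents into index rows can be modeled locally. The following is a simplified sketch, not the server's implementation: buildIndex is a hypothetical helper, emit is passed in as a third argument (on the server it is a global), the sample documents are invented, and rows are ordered by plain string comparison.

```javascript
// Simplified local model of building a view index from a map function.
function buildIndex(docs, mapFn) {
  const rows = [];
  for (const { meta, doc } of docs) {
    // Each emit() call creates one row; the document ID is attached to the row.
    const emit = (key, value) => rows.push({ key, value, id: meta.id });
    mapFn(doc, meta, emit);
  }
  // Keep rows ordered by key (plain string comparison, as a simplification).
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return rows;
}

// Hypothetical documents.
const docs = [
  { meta: { id: "u::1" }, doc: { username: "carol", email: "carol@example.com" } },
  { meta: { id: "u::2" }, doc: { username: "alice", email: "alice@example.com" } },
  { meta: { id: "u::3" }, doc: { username: "bob", email: "bob@example.com" } },
];

const index = buildIndex(docs, (doc, meta, emit) => emit(doc.username, doc.email));
```

Each row keeps the emitted key, the emitted value and the ID of the document it came from, which is what lets a query jump from an index-key back to the document.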

Page 12: Couchbase TLV Dev track 04 - power techniques with indexing

Single Element Keys (Text Key)

function(doc, meta) {
  emit(doc.email, null);  // doc.email is the text key
}

doc.email           meta.id
[email protected]   u::1
[email protected]   u::2
[email protected]   u::3

Page 13: Couchbase TLV Dev track 04 - power techniques with indexing

Compound Keys (Array)

function(doc, meta) {
  emit(dateToArray(doc.timestamp), 1);  // array key
}

Array-based index keys get sorted as strings, but can be grouped by array elements.

dateToArray(doc.timestamp)   value
[2012,7,9,18,45]             1
[2012,8,26,11,15]            1
[2012,9,13,2,12]             1

Page 14: Couchbase TLV Dev track 04 - power techniques with indexing

Indexing Architecture

[Diagram: an App Server writes Doc 1 to a Couchbase Server Node. Labeled components: Managed Cache, Disk Queue, Disk, Replication Queue (to other node), View Engine.]

Doc Updated in RAM Cache First

Indexer Updates Indexes After Documents Are On Disk, in Batches

All Documents & Updates Pass Through View Engine

Page 15: Couchbase TLV Dev track 04 - power techniques with indexing

Buckets >> Design Documents >> Views

Bucket: Beer-Sample

Design documents: Beers, Breweries

Views: location, beers, all, by_abv, by_name

Indexers Are Allocated Per Design Doc

All Updated at Same Time

Page 16: Couchbase TLV Dev track 04 - power techniques with indexing

Querying Views: Parameters

Page 17: Couchbase TLV Dev track 04 - power techniques with indexing

Parameters used in View Querying

• key = ""

- used for an exact match of an index-key

• keys = []

- used for matching a set of index-keys

• startkey/endkey = ""

- used for range queries on index-keys

• startkey_docid/endkey_docid = ""

- used to narrow a range by meta.id among rows that share the same index-key

• stale = [false, update_after, ok]

- used by the client to decide indexer behavior

• group/group_level

- used with reduces to aggregate with grouping
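
These parameters end up in the query string of a view request. The sketch below shows one way to assemble such a URL, assuming the view REST endpoint on port 8092; viewUrl is a hypothetical helper, and the host, design document and view names are placeholders. Key-like parameters are JSON-encoded before URL-encoding.

```javascript
// Build a view query URL. Key-like parameters are sent as JSON values.
function viewUrl(base, bucket, designDoc, view, params) {
  const jsonParams = new Set(["key", "keys", "startkey", "endkey"]);
  const query = Object.entries(params)
    .map(([name, value]) => {
      const raw = jsonParams.has(name) ? JSON.stringify(value) : String(value);
      return `${name}=${encodeURIComponent(raw)}`;
    })
    .join("&");
  return `${base}/${bucket}/_design/${designDoc}/_view/${view}?${query}`;
}

// Placeholder host, design document and view names.
const url = viewUrl("http://localhost:8092", "beer-sample", "beers", "by_name", {
  startkey: "b1",
  endkey: "zz",
  stale: "false",
});
```

String keys keep their JSON quotes (encoded as %22), which is why the slides write key="..." rather than a bare value.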

Page 18: Couchbase TLV Dev track 04 - power techniques with indexing

Query Pattern: Range

Page 19: Couchbase TLV Dev track 04 - power techniques with indexing

Index-Key Matching

doc.email meta.id

[email protected] u::1

[email protected] u::7

[email protected] u::2

[email protected] u::5

[email protected] u::6

[email protected] u::4

[email protected] u::3

?key="[email protected]"

Match a Single Index-Key

Page 20: Couchbase TLV Dev track 04 - power techniques with indexing

Range Query

doc.email meta.id

[email protected] u::1

[email protected] u::7

[email protected] u::2

[email protected] u::5

[email protected] u::6

[email protected] u::4

[email protected] u::3

?startkey="b1"&endkey="zz"

Pulls the index-keys that fall within the UTF-8 range between startkey and endkey.

?startkey="bz"&endkey="zn"

Pulls the index-keys that fall within the UTF-8 range between startkey and endkey.

?startkey="[email protected]"&endkey="[email protected]"

Range of a single item (can also be done with the key= parameter).
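
The three range queries above can be modeled on a small sorted index. This is a simplified sketch: rangeQuery is a hypothetical helper, the keys are invented, and plain JavaScript string comparison stands in for the server's collation. The end of the range is treated as inclusive.

```javascript
// Inclusive range filter over index rows that are already sorted by key.
function rangeQuery(rows, startkey, endkey) {
  return rows.filter((row) => row.key >= startkey && row.key <= endkey);
}

// Hypothetical index rows, sorted by key.
const rows = [
  { key: "abba@example.com", id: "u::1" },
  { key: "beth@example.com", id: "u::2" },
  { key: "carl@example.com", id: "u::3" },
  { key: "zed@example.com", id: "u::4" },
];

const wide = rangeQuery(rows, "b1", "zz");      // beth, carl, zed
const narrow = rangeQuery(rows, "bz", "zn");    // carl, zed
const single = rangeQuery(rows, "beth@example.com", "beth@example.com"); // beth only
```

Because startkey and endkey are compared as prefixes of the sorted keys, they do not have to be keys that actually exist in the index.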

Page 21: Couchbase TLV Dev track 04 - power techniques with indexing

Index-Key Set Matches

doc.email meta.id

[email protected] u::1

[email protected] u::7

[email protected] u::2

[email protected] u::5

[email protected] u::6

[email protected] u::4

[email protected] u::3

?keys=["[email protected]","[email protected]"]

Query Multiple in the Set (Array Notation)

Page 22: Couchbase TLV Dev track 04 - power techniques with indexing

Query Pattern: Basic Aggregations

Page 23: Couchbase TLV Dev track 04 - power techniques with indexing

Simple secondary Index

• Find the ABV for each brewery

Page 24: Couchbase TLV Dev track 04 - power techniques with indexing

Aggregation: Reducing doc.abv with _stats
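
The built-in _stats reduce returns the sum, count, minimum, maximum and sum of squares of the emitted numeric values. The sketch below computes the same five figures locally; stats is a hypothetical stand-in for the built-in, and the ABV values are invented.

```javascript
// Local stand-in for what the built-in _stats reduce computes over numeric values.
function stats(values) {
  return values.reduce(
    (acc, v) => ({
      sum: acc.sum + v,
      count: acc.count + 1,
      min: Math.min(acc.min, v),
      max: Math.max(acc.max, v),
      sumsqr: acc.sumsqr + v * v,
    }),
    { sum: 0, count: 0, min: Infinity, max: -Infinity, sumsqr: 0 }
  );
}

// Hypothetical ABV values emitted by a map function.
const abvStats = stats([4, 5, 9]);
```

From these figures a client can derive the average (sum / count) without the view having to store it.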

Page 25: Couchbase TLV Dev track 04 - power techniques with indexing

Group reduce (reduce by unique key)
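
Grouping applies the reduce once per unique key instead of once over the whole index. A minimal local sketch, assuming rows keyed by a brewery identifier; groupReduce and the sample rows are hypothetical.

```javascript
// Apply a reduce function separately to the values of each unique key.
function groupReduce(rows, reduceFn) {
  const groups = new Map();
  for (const { key, value } of rows) {
    const k = JSON.stringify(key); // serialize so array keys can be compared too
    if (!groups.has(k)) groups.set(k, []);
    groups.get(k).push(value);
  }
  return [...groups].map(([k, values]) => ({ key: JSON.parse(k), value: reduceFn(values) }));
}

// Hypothetical map output: one row per beer, keyed by brewery.
const rows = [
  { key: "brewery_a", value: 5 },
  { key: "brewery_a", value: 7 },
  { key: "brewery_b", value: 4 },
];

const countPerBrewery = groupReduce(rows, (values) => values.length);
```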

Page 26: Couchbase TLV Dev track 04 - power techniques with indexing

Querying from Views

Querying from Ruby Client

Page 27: Couchbase TLV Dev track 04 - power techniques with indexing

Query Pattern: Time Based Rollups

Page 28: Couchbase TLV Dev track 04 - power techniques with indexing

Find Comment Counts By Time

Document:
{
  "type": "comment",
  "about_id": "beer_Enlightened_Black_Ale",
  "user_id": 525,
  "text": "tastes like college!",
  "updated": "2010-07-22 20:00:20"
}

Metadata:
{
  "id": "u525_c1"
}

The "updated" field is the timestamp.

Page 29: Couchbase TLV Dev track 04 - power techniques with indexing

dateToArray() converts date-time strings to an array of values

• String or Integer based timestamps

• Output optimized for group_level queries

• Generates an array of JSON numbers: [2012,9,21,11,30,44]
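
To make the conversion concrete, here is a local approximation. It is not the server's dateToArray(): dateToArraySketch is a hypothetical name, string input is assumed to look like "YYYY-MM-DD HH:MM:SS", and integer input is assumed to be milliseconds since the epoch, read as UTC.

```javascript
// Approximation of dateToArray(): [year, month, day, hour, minute, second].
function dateToArraySketch(input) {
  if (typeof input === "string") {
    // Parse the fields directly so the result does not depend on the local time zone.
    const m = /^(\d{4})-(\d{2})-(\d{2})[ T](\d{2}):(\d{2}):(\d{2})/.exec(input);
    return m ? m.slice(1, 7).map(Number) : null;
  }
  const d = new Date(input);
  if (!isFinite(d.valueOf())) return null;
  return [
    d.getUTCFullYear(),
    d.getUTCMonth() + 1, // JavaScript months are zero-based
    d.getUTCDate(),
    d.getUTCHours(),
    d.getUTCMinutes(),
    d.getUTCSeconds(),
  ];
}

const fromString = dateToArraySketch("2010-07-22 20:00:20"); // [2010, 7, 22, 20, 0, 20]
const fromInteger = dateToArraySketch(0);                    // [1970, 1, 1, 0, 0, 0]
```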

Page 30: Couchbase TLV Dev track 04 - power techniques with indexing

Query with group_level=2 to get monthly rollups
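
The effect of group_level can be modeled by truncating each array key to its first N elements and summing the values per prefix. This is a local sketch with a hypothetical rollup helper; three keys come from the compound-key slide and one extra July row is invented to show counts adding up.

```javascript
// Sum values per key prefix, where the prefix length plays the role of group_level.
function rollup(rows, groupLevel) {
  const totals = new Map();
  for (const { key, value } of rows) {
    const prefix = JSON.stringify(key.slice(0, groupLevel));
    totals.set(prefix, (totals.get(prefix) || 0) + value);
  }
  return [...totals].map(([prefix, value]) => ({ key: JSON.parse(prefix), value }));
}

const rows = [
  { key: [2012, 7, 9, 18, 45], value: 1 },
  { key: [2012, 7, 21, 9, 5], value: 1 }, // invented row for illustration
  { key: [2012, 8, 26, 11, 15], value: 1 },
  { key: [2012, 9, 13, 2, 12], value: 1 },
];

const monthly = rollup(rows, 2); // keys [2012,7], [2012,8], [2012,9]
const yearly = rollup(rows, 1);  // key [2012]
```

The same index therefore answers yearly, monthly or daily questions; only the group_level in the query changes.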

Page 32: Couchbase TLV Dev track 04 - power techniques with indexing

Query Pattern: Leaderboard

Page 33: Couchbase TLV Dev track 04 - power techniques with indexing

Aggregate value stored in a document

• Let's find the top-rated beers!

{
  "brewery": "New Belgium Brewing",
  "name": "1554 Enlightened Black Ale",
  "style": "Other Belgian-Style Ales",
  "updated": "2010-07-22 20:00:20",
  "ratings": {
    "ingenthr": 5,
    "jchris": 4,
    "scalabl3": 5,
    "damienkatz": 1
  },
  "comments": ["f1e62", "6ad8c"]
}

Page 34: Couchbase TLV Dev track 04 - power techniques with indexing

Sort each beer by its average rating

• Let's find the top-rated beers!
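
One way to build such a leaderboard is to emit the average rating as the index key, so the index itself is ordered by rating. The sketch below is an assumption about how the slide's map function might look, not a copy of it; the emit() collector and the second document are invented so the snippet runs locally.

```javascript
// Stand-in for the view engine's emit(): collects rows so the sketch runs locally.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Map function: key is the average of all ratings, value is the beer name.
function map(doc, meta) {
  if (!doc.ratings) return;
  let total = 0;
  let count = 0;
  for (const user in doc.ratings) {
    if (Object.prototype.hasOwnProperty.call(doc.ratings, user)) {
      total += doc.ratings[user];
      count += 1;
    }
  }
  if (count > 0) emit(total / count, doc.name);
}

map(
  { name: "1554 Enlightened Black Ale", ratings: { ingenthr: 5, jchris: 4, scalabl3: 5, damienkatz: 1 } },
  { id: "beer_1" }
);
map({ name: "Hypothetical Ale", ratings: { a: 5, b: 4 } }, { id: "beer_2" }); // invented document

// Highest average first, as a query with descending=true would return the rows.
const leaderboard = [...rows].sort((a, b) => b.key - a.key);
```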


Page 35: Couchbase TLV Dev track 04 - power techniques with indexing

Q&A

Page 36: Couchbase TLV Dev track 04 - power techniques with indexing

Thanks!

Page 37: Couchbase TLV Dev track 04 - power techniques with indexing