ccsf12-app-development-with-indexes-queries-and-geo
TRANSCRIPT
![Page 1: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/1.jpg)
Developing with Views: See Inside the Data
J Chris Anderson, Architect
![Page 2: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/2.jpg)
What we’ll talk about
• Lifecycle of a view
• Index definition, build, and query phase
• Consistency options (async by default)
• Emergent Schema - Views and Documents
• Patterns:
• Secondary index
• Basic aggregations (avg ratings by brewery)
• Time-based analytics with group_level
• Leaderboard
• Schema Evolution
![Page 3: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/3.jpg)
View Lifecycle: Define - Build - Query
![Page 4: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/4.jpg)
View Definition (in JavaScript)
like: CREATE INDEX city ON brewery city;
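A SQL-style index on city might correspond to a map function like the one below. This is a minimal sketch, not the slide's original code: the `type` and `city` field names are assumed, and `emit` is defined locally as a stand-in for the function the view engine supplies.

```javascript
// Stand-in for the emit() the view engine supplies; collects rows locally.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: index brewery documents by city.
// The field names "type" and "city" are assumed for this example.
function map(doc, meta) {
  if (doc.type === "brewery" && doc.city) {
    emit(doc.city, null);
  }
}

// Local dry run over two hypothetical documents.
map({ type: "brewery", city: "Juneau" }, { id: "brewery_a" });
map({ type: "beer", name: "Some Ale" }, { id: "beer_b" });
console.log(JSON.stringify(rows)); // [{"key":"Juneau","value":null}]
```

Only the brewery document produces a row; documents that do not match the check are simply absent from the index.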
![Page 5: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/5.jpg)
Distributed Index Build Phase
• Optimized for lookups, in-order access and aggregations
• All view reads come from disk (different performance profile)
• Views build against every document on every node
  – This is why you should group them in a design document
• Automatically kept up to date
[Diagram: SERVER 1, SERVER 2 and SERVER 3, each holding a set of Active Docs and a set of Replica Docs]
![Page 6: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/6.jpg)
Dynamic Range Queries with Optional Aggregation
• Efficiently fetch a row or group of related rows.
• Queries use cached values from B-tree inner nodes when possible
• Take advantage of in-order tree traversal with group_level queries
[Diagram: SERVER 1, SERVER 2 and SERVER 3, each holding a set of Active Docs and a set of Replica Docs]
?startkey="J"&endkey="K"
{ "rows": [ { "key": "Juneau", "value": null } ] }
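The startkey/endkey lookup can be pictured as a filter over rows that are already sorted by key. The sketch below runs locally on hypothetical rows; plain JavaScript string comparison is used as a simplification of the server's collation rules, and `rangeQuery` is an illustrative name, not a server API.

```javascript
// Rows as a view might hold them, sorted by key (hypothetical data).
const indexRows = [
  { key: "Anchorage", value: null },
  { key: "Juneau", value: null },
  { key: "Ketchikan", value: null },
];

// Return rows with startkey <= key <= endkey.
// Plain string comparison is a simplification of the server's collation.
function rangeQuery(rows, startkey, endkey) {
  return {
    rows: rows.filter(function (r) {
      return r.key >= startkey && r.key <= endkey;
    }),
  };
}

console.log(JSON.stringify(rangeQuery(indexRows, "J", "K")));
// {"rows":[{"key":"Juneau","value":null}]}
```

"Ketchikan" sorts after "K", so it falls outside the range even though it starts with that letter.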
![Page 7: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/7.jpg)
Queries run against stale indexes by default
• stale=update_after (default if nothing is specified)
  – always get fastest response
  – can take two queries to read your own writes
• stale=ok
  – auto update will trigger eventually
  – might not see your own writes for a few minutes
  – least frequent updates -> least resource impact
• stale=false
  – Use with persistence observe if data needs to be included in view results
  – BUT be aware of the delay it adds; only use when really required
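One way to make the stale choice explicit is to set it when building the query URL. The helper below is a sketch under assumptions: the host, bucket, design document and view names are placeholders, and the URL layout (port 8092, `/_design/.../_view/...`) follows my understanding of the Couchbase Server 2.0 view REST interface, so check it against your deployment.

```javascript
// Build a view query URL with an explicit stale setting.
// Host, bucket, design document and view names are placeholders.
function viewUrl(target, params) {
  const query = new URLSearchParams(params).toString();
  return (
    "http://" + target.host + ":8092/" + target.bucket +
    "/_design/" + target.designDoc + "/_view/" + target.view +
    (query ? "?" + query : "")
  );
}

const target = {
  host: "localhost",
  bucket: "beer-sample",
  designDoc: "beer",
  view: "by_city",
};

console.log(viewUrl(target, {}));                 // no stale given: server default applies
console.log(viewUrl(target, { stale: "ok" }));    // never wait for the index
console.log(viewUrl(target, { stale: "false" })); // wait for the index to catch up
```

Keeping the setting in one helper makes it easy to reserve stale=false for the few queries that really need it.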
![Page 8: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/8.jpg)
Development vs. Production Views
• Development views index a subset of the data.
• Publishing a view builds the index across the entire cluster.
• Queries on production views are scattered to all cluster members and results are gathered and returned to the client.
![Page 9: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/9.jpg)
Emergent Schema
![Page 10: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/10.jpg)
Emergent Schema
JSON.org
GitHub API
Twitter API
"Capture the user's intent"
• Falls out of your key-value usage
• Helps to know what's efficient
• Mostly you can relax
![Page 11: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/11.jpg)
Query Pattern: Find by Attribute
![Page 12: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/12.jpg)
Find documents by a specific attribute
• Let's find beers by brewery_id!
![Page 13: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/13.jpg)
The index definition
![Page 14: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/14.jpg)
The result set: beers keyed by brewery_id
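The slide's code did not survive in this transcript, so here is a minimal sketch of what such an index definition and its result set could look like. The `type` and `brewery_id` checks, the document ids, and the local `emit`/`currentId` stand-ins are all assumptions for illustration.

```javascript
// Stand-ins for what the view engine provides; they collect rows locally.
const rows = [];
let currentId = null;
function emit(key, value) {
  rows.push({ id: currentId, key: key, value: value });
}

// Map function: one row per beer, keyed by the brewery it belongs to.
function map(doc, meta) {
  if (doc.type === "beer" && doc.brewery_id) {
    emit(doc.brewery_id, null);
  }
}

// Hypothetical documents for a local dry run.
const docs = {
  beer_pale_ale: { type: "beer", brewery_id: "brewery_b" },
  beer_black_ale: { type: "beer", brewery_id: "brewery_a" },
  brewery_a: { type: "brewery", city: "Juneau" },
};
for (const id of Object.keys(docs)) {
  currentId = id;
  map(docs[id], { id: id });
}

// View rows come back ordered by key.
rows.sort(function (a, b) {
  return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
});
console.log(JSON.stringify(rows));
// [{"id":"beer_black_ale","key":"brewery_a","value":null},{"id":"beer_pale_ale","key":"brewery_b","value":null}]
```

Querying with `?key="brewery_a"` would then return only the beers of that brewery.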
![Page 15: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/15.jpg)
Query Pattern: Basic Aggregations
![Page 16: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/16.jpg)
Use a built-in reduce function with a group query
• Let's find average abv for each brewery!
![Page 17: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/17.jpg)
We are reducing doc.abv with _stats
![Page 18: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/18.jpg)
Group reduce (reduce by unique key)
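To see what a group reduce with `_stats` yields, the sketch below imitates it locally. The `stats` function is a stand-in for the built-in reduce (assumed to report sum, count, min, max and sumsqr), `groupReduce` is an illustrative helper, and the rows are hypothetical output of a map such as `emit(doc.brewery_id, doc.abv)`.

```javascript
// Stand-in for the built-in _stats reduce over numeric values.
function stats(values) {
  return {
    sum: values.reduce(function (a, b) { return a + b; }, 0),
    count: values.length,
    min: Math.min.apply(null, values),
    max: Math.max.apply(null, values),
    sumsqr: values.reduce(function (a, b) { return a + b * b; }, 0),
  };
}

// Group rows by unique key (what group=true asks for) and reduce each group.
function groupReduce(rows) {
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row.key)) groups.set(row.key, []);
    groups.get(row.key).push(row.value);
  }
  return Array.from(groups, function (entry) {
    return { key: entry[0], value: stats(entry[1]) };
  });
}

// Hypothetical rows emitted by the map function.
const emitted = [
  { key: "brewery_a", value: 5.5 },
  { key: "brewery_a", value: 6.5 },
  { key: "brewery_b", value: 4.0 },
];

for (const row of groupReduce(emitted)) {
  // The average is derived in the application from sum and count.
  console.log(row.key, "average abv:", row.value.sum / row.value.count);
}
```

The average itself is not stored; the application divides sum by count from each reduced row.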
![Page 19: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/19.jpg)
Query Pattern: Time-based Rollups
![Page 20: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/20.jpg)
Find patterns in beer comments by time
{
  "type": "comment",
  "about_id": "beer_Enlightened_Black_Ale",
  "user_id": 525,
  "text": "tastes like college!",
  "updated": "2010-07-22 20:00:20"
}
{ "id": "f1e62" }
[Callout: timestamp]
![Page 21: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/21.jpg)
Query with group_level=2 to get monthly rollups
![Page 22: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/22.jpg)
dateToArray() is your friend
• String or Integer based timestamps
• Output optimized for group_level queries
• array of JSON numbers: [2012,9,21,11,30,44]
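A map function for the comment documents could key each row by the array form of its timestamp. In this sketch `dateToArray` and `emit` are local stand-ins so the code runs outside the server; the stand-in assumes UTC components, and the `type` and `updated` checks are assumptions about the document shape.

```javascript
// Local approximation of the dateToArray() helper available in view code.
// Assumption: it reports UTC components as [year, month, day, hour, minute, second].
function dateToArray(date) {
  const d = date instanceof Date ? date : new Date(date);
  if (!isFinite(d.valueOf())) return null;
  return [
    d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate(),
    d.getUTCHours(), d.getUTCMinutes(), d.getUTCSeconds(),
  ];
}

// Stand-in for the emit() the view engine supplies.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: key each comment by its timestamp array, value 1 for counting.
function map(doc, meta) {
  if (doc.type === "comment" && doc.updated) {
    emit(dateToArray(doc.updated), 1);
  }
}

// Dry run with an ISO timestamp that carries an explicit zone.
map({ type: "comment", updated: "2012-09-21T11:30:44Z" }, { id: "c1" });
console.log(JSON.stringify(rows)); // [{"key":[2012,9,21,11,30,44],"value":1}]
```

Timestamps without an explicit zone may be parsed as local time by some JavaScript engines, so the resulting array can shift by the local offset.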
![Page 23: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/23.jpg)
group_level=2 results
• Monthly rollup
• Sorted by time: sort the query results in your application if you want to rank by value (no chained map-reduce)
![Page 24: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/24.jpg)
group_level=3 - daily results - great for graphing
• Daily, hourly, minute or second rollup all possible with the same index.
• http://crate.im/posts/couchbase-views-reddit-data/
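The effect of group_level on array keys can be imitated locally: keep the first N elements of each key and add up the values of rows that share that prefix. `groupLevel` is an illustrative helper and the rows are hypothetical; the sketch assumes a counting reduce and rows already sorted by key.

```javascript
// Reduce rows by key prefix: group_level=N keeps the first N array elements.
// Rows must already be sorted by key, as they are in a view index.
function groupLevel(rows, level) {
  const out = [];
  for (const row of rows) {
    const prefix = row.key.slice(0, level);
    const last = out[out.length - 1];
    if (last && JSON.stringify(last.key) === JSON.stringify(prefix)) {
      last.value += row.value; // same group: add to the running count
    } else {
      out.push({ key: prefix, value: row.value });
    }
  }
  return out;
}

// Hypothetical rows keyed by [year, month, day, hour, minute, second].
const commentRows = [
  { key: [2010, 7, 22, 20, 0, 20], value: 1 },
  { key: [2010, 7, 22, 21, 5, 0], value: 1 },
  { key: [2010, 7, 23, 9, 0, 0], value: 1 },
  { key: [2010, 8, 1, 12, 0, 0], value: 1 },
];

console.log(JSON.stringify(groupLevel(commentRows, 2)));
// [{"key":[2010,7],"value":3},{"key":[2010,8],"value":1}]
console.log(JSON.stringify(groupLevel(commentRows, 3)));
// [{"key":[2010,7,22],"value":2},{"key":[2010,7,23],"value":1},{"key":[2010,8,1],"value":1}]
```

The same rows serve both rollups; only the prefix length changes between the monthly and daily views of the data.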
![Page 25: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/25.jpg)
Query Pattern: Leaderboard
![Page 26: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/26.jpg)
Aggregate value stored in a document
• Let's find the top-rated beers!
{
  "brewery": "New Belgium Brewing",
  "name": "1554 Enlightened Black Ale",
  "abv": 5.5,
  "description": "Born of a flood...",
  "category": "Belgian and French Ale",
  "style": "Other Belgian-Style Ales",
  "updated": "2010-07-22 20:00:20",
  "ratings": { "jchris": 5, "scalabl3": 4, "damienkatz": 1 },
  "comments": [ "f1e62", "6ad8c" ]
}
[Callout: ratings]
![Page 27: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/27.jpg)
Sort each beer by its average rating
• Let's find the top-rated beers!
[Callout: average]
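A leaderboard map function could compute the average inside the map and use it as the key, so the index itself is sorted by rating. This is a sketch, not the slide's code: `emit` is a local stand-in, the second document is hypothetical, and the descending sort imitates a query with `descending=true`.

```javascript
// Stand-in for the emit() the view engine supplies.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: key each beer by the average of its ratings.
function map(doc, meta) {
  if (doc.ratings) {
    var total = 0;
    var count = 0;
    for (var user in doc.ratings) {
      total += doc.ratings[user];
      count += 1;
    }
    if (count > 0) {
      emit(total / count, doc.name);
    }
  }
}

// Dry run: the beer from the slide, one hypothetical beer, one unrated beer.
map({ name: "1554 Enlightened Black Ale", ratings: { jchris: 5, scalabl3: 4, damienkatz: 1 } }, { id: "b1" });
map({ name: "Hypothetical Pale Ale", ratings: { a: 5, b: 4 } }, { id: "b2" });
map({ name: "Unrated Lager" }, { id: "b3" });

// Imitate ?descending=true: highest average first.
rows.sort(function (a, b) { return b.key - a.key; });
console.log(rows.map(function (r) { return r.value; }));
// [ 'Hypothetical Pale Ale', '1554 Enlightened Black Ale' ]
```

Adding `limit=10` to the descending query would give a top-ten list without any sorting in the application.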
![Page 28: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/28.jpg)
What Not to Write
![Page 29: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/29.jpg)
Most common mistakes
• Reduces that don’t reduce
• Trying to do too many things with one view
• Emitting too much data into a view value
• Expecting view query performance to be as fast as get/set
• Recursive queries require application code.
![Page 30: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/30.jpg)
Geographic index
![Page 31: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/31.jpg)
Experimental Status
• Not yet using Superstar trees
• (only fast on large clusters)
• Optimized for bulk loading
![Page 32: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/32.jpg)
Full Text index
![Page 33: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/33.jpg)
Elastic Search Adapter
ElasticSearch
• Elastic Search is good for ad-hoc queries and faceted browsing
• Our adapter is aware of changing Couchbase topology
• Documents are indexed by Elastic Search after they are stored to disk in Couchbase
![Page 34: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/34.jpg)
Questions?
![Page 35: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/35.jpg)
Views Under The Hood
J Chris Anderson, Architect
THIS TALK IS NOT WRITTEN YET
maybe combine with Dustin’s internals talk about vbucket handoff
![Page 36: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/36.jpg)
What we’ll talk about
• Key areas/topics discussed
![Page 37: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/37.jpg)
Dynamic Time Range Queries
![Page 38: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/38.jpg)
The B-tree Index
• Helps to know what's efficient
• Superstar
http://damienkatz.net/2012/05/stabilizing_couchbase_server_2.html
![Page 39: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/39.jpg)
Logical View B-tree
• Incremental reduce values are stored in the tree
[Diagram: B-tree with REDUCES stored at the inner nodes]
![Page 40: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/40.jpg)
Logical View B-tree
• Incremental reduce values are stored in the tree
[Diagram: B-tree with REDUCES stored in the tree: 25 above the values 7 5 5 3 2 3]
![Page 41: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/41.jpg)
Reduce!
• Incremental reduce values are stored in the tree
[Diagram: 25 above the values 7 5 5 3 2 3]
_count
function(keys, values) { return keys ? values.length : sum(values); }
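The two branches of this reduce correspond to the two levels of the tree: with keys present it counts rows, and with no keys it adds up counts that were already stored. The sketch below replays the numbers from the diagram; `sum` is defined locally as a stand-in for the helper available to view code, and the leaf contents are placeholders.

```javascript
// sum() is a helper available to view code; defined here so the sketch runs locally.
function sum(values) {
  return values.reduce(function (a, b) { return a + b; }, 0);
}

// The reduce from the slide: count rows at the leaves, add counts above them.
function count(keys, values) {
  return keys ? values.length : sum(values);
}

// Leaf level: each leaf node reduces its own rows (keys are present).
const leafSizes = [7, 5, 5, 3, 2, 3];
const leafCounts = leafSizes.map(function (n) {
  const keys = new Array(n).fill("k");   // placeholder keys
  const values = new Array(n).fill(null); // placeholder values
  return count(keys, values);
});

// Inner level: rereduce over the stored leaf results (keys is null).
const root = count(null, leafCounts);
console.log(leafCounts, root); // [ 7, 5, 5, 3, 2, 3 ] 25
```

When a leaf changes, only its count and the values on the path above it need to be recomputed.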
![Page 42: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/42.jpg)
Dynamic Queries
• You can query that tree dynamically
• Lots of the patterns are about pulling value from this data structure
[Diagram: 25 above the values 7 5 5 3 2 3]
?startkey="abba"&endkey="robot"
{ "value": 19 }
_count
function(keys, values) { return keys ? values.length : sum(values); }
![Page 43: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/43.jpg)
Dynamic Queries
• Queries use cached values from B-tree inner nodes when possible
• Take advantage of in-order tree traversal with group_level queries
[Diagram: 25 above the values 7 5 5 3 2 3; the queried range covers (7 5 5 2), giving 19]
?startkey="abba"&endkey="robot"
{ "value": 19 }
_count
function(keys, values) { return keys ? values.length : sum(values); }
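One possible reading of the (7 5 5 2) figure: three nodes lie entirely inside the key range, so their stored counts are reused, while one node is only partly covered and has to be scanned. The sketch below models that with a flat list of leaves over hypothetical integer keys; it illustrates the idea and is not the server's actual B-tree code.

```javascript
// Leaves hold sorted keys plus a stored reduce value (here a count).
function makeLeaf(keys) {
  return { keys: keys, count: keys.length };
}

// Build leaves of sizes 7 5 5 3 2 3 over integer keys 0..24 (hypothetical data).
const sizes = [7, 5, 5, 3, 2, 3];
const leaves = [];
let next = 0;
for (const size of sizes) {
  const keys = [];
  for (let i = 0; i < size; i++) keys.push(next++);
  leaves.push(makeLeaf(keys));
}

// Count keys in [start, end]: reuse the stored count when a leaf lies
// entirely inside the range, scan the leaf only when it is partly covered.
function rangeCount(allLeaves, start, end) {
  let total = 0;
  let scanned = 0;
  for (const leaf of allLeaves) {
    const first = leaf.keys[0];
    const last = leaf.keys[leaf.keys.length - 1];
    if (last < start || first > end) continue; // outside the range
    if (first >= start && last <= end) {
      total += leaf.count;                     // stored value, no scan
    } else {
      scanned += 1;
      total += leaf.keys.filter(function (k) {
        return k >= start && k <= end;
      }).length;
    }
  }
  return { total: total, leavesScanned: scanned };
}

console.log(rangeCount(leaves, 0, 18)); // { total: 19, leavesScanned: 1 }
```

Only the partly covered leaf is read row by row, which is why range reductions stay cheap even over large indexes.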
![Page 44: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/44.jpg)
Respect Reduce! (anti-pattern)
• Incremental reduce values are stored in the tree
function(keys, values) { return values; }
DO NOT DO THIS! IT DOESN’T reduce
[Diagram: leaf values ["ace", "argh!", "asphalt"], ["front", "garage", "hibernate"], ["pluto", "nectar", "mirage"]; parent value ["ace", "argh!", "asphalt", "front", "garage", "hibernate"]]
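The cost of a reduce that returns its input shows up in what each parent node has to store. This sketch compares the identity function with a counting reduce on the values from the slide; `storedAtParent` is an illustrative helper, and passing the values as stand-in keys is a simplification.

```javascript
// The anti-pattern: output is as large as the input.
function identityReduce(keys, values) {
  return values;
}

// A real reduce: output is a single number regardless of input size.
function countReduce(keys, values) {
  return keys
    ? values.length
    : values.reduce(function (a, b) { return a + b; }, 0);
}

const leafValues = [
  ["ace", "argh!", "asphalt"],
  ["front", "garage", "hibernate"],
  ["pluto", "nectar", "mirage"],
];

// Reduce each leaf, then rereduce the leaf results for the parent node.
function storedAtParent(reduce) {
  const leafResults = leafValues.map(function (values) {
    return reduce(values, values); // values double as stand-in keys
  });
  return reduce(null, leafResults);
}

const identityResult = storedAtParent(identityReduce);
const countResult = storedAtParent(countReduce);
console.log("identity stores", JSON.stringify(identityResult).length, "characters");
console.log("count stores", countResult); // count stores 9
```

With the identity function the stored value keeps growing toward the root, since every level repeats everything beneath it.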
![Page 45: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/45.jpg)
Just use the Map
• If you think you need “the identity reduce”, just use the map.
[Diagram: ["ace", "argh!", "asphalt", "front", "garage", "hibernate"]]
USE THE MAP
![Page 46: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/46.jpg)
Lookup via key-range
• Find tables during yesterday's lunch shift
• Find shifts owned by which manager
[Diagram: 25 above the values 7 5 5 3 2 3]
?startkey="abba"&endkey="robot"
{ "value": 19 }
![Page 47: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/47.jpg)
Schema evolution
![Page 48: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/48.jpg)
Application and Views
• Interactive schema fully controlled by application
• If your code can handle it, the database can
• Learn to write views defensively
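Writing a view defensively mostly means checking each field before using it, since nothing stops a document from arriving in an unexpected shape. The sketch below is one way to do that; the field names and the sample documents are assumptions, and `emit` is a local stand-in for the function the view engine supplies.

```javascript
// Stand-in for the emit() the view engine supplies.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Defensive map: verify every field before using it, so documents written
// by older or newer application code cannot break the index.
function map(doc, meta) {
  if (!doc || doc.type !== "beer") return;
  if (typeof doc.abv !== "number" || isNaN(doc.abv)) return;
  if (typeof doc.brewery_id !== "string" || doc.brewery_id === "") return;
  emit(doc.brewery_id, doc.abv);
}

// Dry run with well-formed and malformed hypothetical documents.
map({ type: "beer", brewery_id: "brewery_a", abv: 5.5 }, { id: "ok" });
map({ type: "beer", brewery_id: "brewery_a", abv: "5.5" }, { id: "abv_is_string" });
map({ type: "beer", abv: 4.2 }, { id: "no_brewery" });
map({ name: "no type at all" }, { id: "untyped" });

console.log(JSON.stringify(rows)); // [{"key":"brewery_a","value":5.5}]
```

Malformed documents are skipped rather than indexed with bad values, so a numeric reduce over this view never sees a string.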
![Page 49: CCSF12-App-Development-with-Indexes-Queries-and-Geo](https://reader037.vdocument.in/reader037/viewer/2022110121/558c8d2cd8b42a5c678b4735/html5/thumbnails/49.jpg)
Incremental schema evolution
• Use a view to decide which documents need work
• Make your workers idempotent
• Once all your data is cleaned up and old clients are no longer writing the old format, the cleanup view is obsolete, and so is any app code for dealing with the old case
• You've evolved!