![Page 1: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/1.jpg)
Couchbase and Hadoop
Perry Krug
Sr. Solutions Architect
![Page 2: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/2.jpg)
Agenda
• View basics
• Lifecycle of a view
• Index definition, build, and query phase
• Indexing details
• Replica indexes, failover and compaction
• Primary and Secondary indexes
• View best practices
• Couchbase and Elasticsearch
• Couchbase and Hadoop
![Page 3: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/3.jpg)
pol·y·glot /ˈpäliˌglät/
Adjective: Knowing or using several languages.
Noun: A person who knows several languages.
Synonyms: multilingual
per·sist·ence /pərˈsistəns/
Noun: The continued or prolonged existence of something.
Synonyms: perseverance - tenacity - pertinacity - stubbornness
![Page 4: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/4.jpg)
Couchbase Views – The basics
• Define materialized views on JSON documents and then query across the data set
• Using views you can define
  – Primary indexes
  – Simple secondary indexes (most common use case)
  – Complex secondary, tertiary and composite indexes
  – Aggregations (reduction)
• Views are eventually indexed
• Queries are eventually consistent with respect to documents
• Built using Map/Reduce technology
  – Map and Reduce functions are written in JavaScript
![Page 5: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/5.jpg)
View Lifecycle
Define -> Build -> Query
![Page 6: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/6.jpg)
Buckets & Design docs & Views
• Create design documents on a bucket
• Create views within a design document
BUCKET 1
  Design document 1
    View 1
    View 2
    View 3
  Design document 2
    View 4
    View 5
  Design document 3
    View 6
    View 7
BUCKET 2
![Page 7: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/7.jpg)
Distributed Indexing and Querying
• Indexing work is distributed amongst nodes
• Parallelize the effort
• Each node has index for data stored on it
• Queries combine the results from required nodes
[Diagram: Couchbase Server Cluster with user-configured replica count = 1. Servers 1, 2 and 3 each hold active and replica documents. App Server 1 and App Server 2 each run the Couchbase client library with a cluster map. "Create Index / View" and "Query" requests go to the cluster.]
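The idea that a query combines results from each node's index can be sketched as a merge of per-node sorted lists. This is only an illustration under stated assumptions: the row shape and the `mergeSorted` helper are hypothetical, and the server's real scatter-gather logic is not shown in the deck.

```javascript
// Per-node partial results, each already sorted by key (illustrative rows
// named after the active documents in the cluster diagram).
const nodeResults = [
  [{ key: "Doc 2", node: "Server 1" }, { key: "Doc 5", node: "Server 1" }],
  [{ key: "Doc 1", node: "Server 2" }, { key: "Doc 3", node: "Server 2" }],
  [{ key: "Doc 4", node: "Server 3" }, { key: "Doc 6", node: "Server 3" }],
];

// Hypothetical helper: merge sorted lists by repeatedly taking the
// smallest head element among the lists that still have rows left.
function mergeSorted(lists) {
  const positions = lists.map(() => 0);
  const merged = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < lists.length; i++) {
      if (positions[i] >= lists[i].length) continue;
      if (best === -1 || lists[i][positions[i]].key < lists[best][positions[best]].key) {
        best = i;
      }
    }
    if (best === -1) return merged; // every list is exhausted
    merged.push(lists[best][positions[best]++]);
  }
}

const combined = mergeSorted(nodeResults);
console.log(combined.map((r) => r.key).join(", "));
// prints "Doc 1, Doc 2, Doc 3, Doc 4, Doc 5, Doc 6"
```

Because each node only indexes its own data, the per-node lists never overlap, so a plain merge is enough in this sketch.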
![Page 8: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/8.jpg)
Eventually indexed Views – Data flow
[Diagram: an App Server sends Doc 1 to a Couchbase Server Node. Components shown on the node: Managed Cache, Replication Queue (to other node), Disk Queue, Disk, and the View engine, with numbered steps 1 to 3.]
![Page 9: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/9.jpg)
DEFINE: Index / View Definition in JavaScript
CREATE INDEX City ON Brewery.City;
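The SQL-style statement above is shorthand for what a view's map function does. A minimal sketch of such a map function follows, assuming brewery documents carry `type` and `city` fields. In Couchbase the view engine supplies `emit`; the stub, the sample documents and the driver loop here are stand-ins so the sketch runs on its own.

```javascript
// Stand-in for the view engine's emit(); the server supplies the real one.
const cityRows = [];
const emit = (key, value) => cityRows.push({ key, value });

// Map function: index brewery documents by city, which is the role
// "CREATE INDEX City ON Brewery.City" plays on the slide.
function map(doc, meta) {
  if (doc.type === "brewery" && doc.city) {
    emit(doc.city, null);
  }
}

// Sample documents, invented for illustration.
const docs = [
  { meta: { id: "brewery_a" }, doc: { type: "brewery", city: "Juneau" } },
  { meta: { id: "brewery_b" }, doc: { type: "brewery", city: "Denver" } },
  { meta: { id: "beer_c" }, doc: { type: "beer", name: "Some Ale" } },
];

// Run the map function over every document, then sort rows by key,
// as an index would store them.
for (const { doc, meta } of docs) map(doc, meta);
cityRows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

console.log(JSON.stringify(cityRows));
```

Emitting `null` as the value keeps the index small; the document itself can still be fetched by ID after the lookup.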
![Page 10: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/10.jpg)
BUILD: Distributed Index Build Phase
• Optimized for lookups, in-order access and aggregations
• View reads are from disk (different performance profile than GET/SET)
• Views built against every document on every node
  – Group them in a design document
• Views are automatically kept up to date
![Page 11: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/11.jpg)
QUERY: Dynamic Queries with Optional Aggregation
• Eventually consistent with respect to document updates
• Efficiently fetch a document or group of similar documents
• Queries will use cached values from B-tree inner nodes when possible
• Take advantage of in-order tree traversal with group_level queries
Query: ?startkey="J"&endkey="K"
Result: {"rows":[{"key":"Juneau","value":null}]}
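The `startkey`/`endkey` query on this slide can be read as a range scan over rows sorted by key. The sketch below assumes plain string comparison and an in-memory array; the server uses its own collation and an on-disk B-tree, and the `queryRange` helper and sample rows are hypothetical.

```javascript
// Rows as a view index would hold them: sorted by key (sample data).
const indexRows = [
  { key: "Denver", value: null },
  { key: "Juneau", value: null },
  { key: "Kansas City", value: null },
  { key: "Portland", value: null },
];

// Hypothetical helper: keep rows with startkey <= key <= endkey.
// Plain string comparison is an assumption made for this sketch.
function queryRange(rows, { startkey, endkey }) {
  return {
    rows: rows.filter(
      (r) =>
        (startkey === undefined || r.key >= startkey) &&
        (endkey === undefined || r.key <= endkey)
    ),
  };
}

console.log(JSON.stringify(queryRange(indexRows, { startkey: "J", endkey: "K" })));
// prints {"rows":[{"key":"Juneau","value":null}]}
```

"Kansas City" falls outside the range because it sorts after the bare string "K".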
![Page 12: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/12.jpg)
Simple Primary and Secondary Indexing
![Page 13: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/13.jpg)
Example Document
Callout: Document ID
![Page 14: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/14.jpg)
Define a primary index on the bucket
• Lookup the document ID / key by key, range, prefix, suffix
Index definition
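A primary index of this kind can be sketched as a map function that emits each document's ID as the key. The `emit` stub, the sample IDs and the `byPrefix` helper are assumptions for illustration. The prefix lookup is expressed as a key range ending in a high code point (`"\uefff"`, an assumed sentinel), compared with plain string ordering rather than the server's collation.

```javascript
// Stand-in for the view engine's emit(); the server supplies the real one.
const idRows = [];
const emit = (key, value) => idRows.push({ key, value });

// Primary index: every document is keyed by its own ID.
function map(doc, meta) {
  emit(meta.id, null);
}

// Sample documents, invented for illustration.
const docs = [
  { meta: { id: "brewery_new_belgium" }, doc: { type: "brewery" } },
  { meta: { id: "beer_abbey" }, doc: { type: "beer" } },
  { meta: { id: "beer_1554" }, doc: { type: "beer" } },
];
for (const { doc, meta } of docs) map(doc, meta);
idRows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

// Hypothetical helper: a prefix lookup written as a range, from the
// prefix itself up to the prefix followed by a high code point.
function byPrefix(rows, prefix) {
  const startkey = prefix;
  const endkey = prefix + "\uefff";
  return rows
    .filter((r) => r.key >= startkey && r.key <= endkey)
    .map((r) => r.key);
}

console.log(byPrefix(idRows, "beer_"));
// prints [ 'beer_1554', 'beer_abbey' ]
```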
![Page 15: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/15.jpg)
Define a secondary index on the bucket
• Lookup an attribute by value, range, prefix, suffix
Index definition
![Page 16: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/16.jpg)
Find documents by a specific attribute
• Let's find beers by brewery_id!
![Page 17: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/17.jpg)
The index definition
Callouts: Key, Value
![Page 18: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/18.jpg)
The result set: beers keyed by brewery_id
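A sketch of how such a result set could come about, assuming beer documents have `type`, `brewery_id` and `name` fields: the map function emits `brewery_id` as the key and the beer name as the value. The `emit` stub, the `byKey` helper and the sample documents are hypothetical.

```javascript
// Stand-in for the view engine's emit(); the server supplies the real one.
const beerRows = [];
const emit = (key, value) => beerRows.push({ key, value });

// Secondary index: key = brewery_id, value = beer name.
function map(doc, meta) {
  if (doc.type === "beer" && doc.brewery_id) {
    emit(doc.brewery_id, doc.name);
  }
}

// Sample documents, invented for illustration.
const docs = [
  { type: "beer", brewery_id: "new_belgium_brewing", name: "1554 Enlightened Black Ale" },
  { type: "beer", brewery_id: "new_belgium_brewing", name: "Abbey Belgian Style Ale" },
  { type: "beer", brewery_id: "other_brewery", name: "Other Ale" },
  { type: "brewery", name: "New Belgium Brewing" },
];
docs.forEach((doc, i) => map(doc, { id: "doc_" + i }));

// Hypothetical helper: exact-key lookup, like querying the view with key=...
const byKey = (rows, key) => rows.filter((r) => r.key === key).map((r) => r.value);

console.log(byKey(beerRows, "new_belgium_brewing"));
// prints [ '1554 Enlightened Black Ale', 'Abbey Belgian Style Ale' ]
```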
![Page 19: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/19.jpg)
Query Pattern: Basic Aggregations
![Page 20: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/20.jpg)
Use a built-in reduce function with a group query
• Let's find average abv for each brewery!
![Page 21: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/21.jpg)
Group reduce (reduce by unique key)
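One way to picture the group reduce: the map function emits `(brewery_id, abv)`, a statistics-style reduce keeps a sum and a count per unique key, and the average is the sum divided by the count. The sketch assumes a built-in reduce such as `_stats` is in use; the `groupStats` function is a simplified stand-in that tracks only sum and count, and the sample beers are invented.

```javascript
// Stand-in for the view engine's emit(); the server supplies the real one.
const abvRows = [];
const emit = (key, value) => abvRows.push({ key, value });

// Map: one row per beer, keyed by brewery, valued by alcohol by volume.
function map(doc, meta) {
  if (doc.type === "beer" && typeof doc.abv === "number") {
    emit(doc.brewery_id, doc.abv);
  }
}

// Simplified stand-in for a statistics reduce run with group=true:
// one result row per unique key, holding the sum and count of its values.
function groupStats(rows) {
  const groups = new Map();
  for (const { key, value } of rows) {
    const g = groups.get(key) || { sum: 0, count: 0 };
    g.sum += value;
    g.count += 1;
    groups.set(key, g);
  }
  return [...groups].map(([key, value]) => ({ key, value }));
}

// Sample documents, invented for illustration.
const beers = [
  { type: "beer", brewery_id: "brewery_a", abv: 5.5 },
  { type: "beer", brewery_id: "brewery_a", abv: 6.5 },
  { type: "beer", brewery_id: "brewery_b", abv: 4.0 },
];
beers.forEach((doc, i) => map(doc, { id: "beer_" + i }));

// The client derives the average from the reduced sum and count.
const averages = groupStats(abvRows).map(({ key, value }) => ({
  brewery: key,
  avgAbv: value.sum / value.count,
}));
console.log(JSON.stringify(averages));
// prints [{"brewery":"brewery_a","avgAbv":6},{"brewery":"brewery_b","avgAbv":4}]
```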
![Page 22: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/22.jpg)
Query Pattern: Time-based Rollups
![Page 23: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/23.jpg)
Find patterns in beer comments by time
{
  "type": "comment",
  "about_id": "beer_Enlightened_Black_Ale",
  "user_id": 525,
  "text": "tastes like college!",
  "updated": "2010-07-22 20:00:20"
}
{ "id": "f1e62" }
timestamp
![Page 24: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/24.jpg)
Query with group_level=2 to get monthly rollups
![Page 25: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/25.jpg)
group_level=3 - daily results - great for graphing
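The rollups on these slides rely on array keys: if the map function emits the timestamp as `[year, month, day, ...]`, then `group_level=2` groups by month and `group_level=3` groups by day. The sketch below assumes a counting reduce; `toDateArray` and `groupCount` are hypothetical helpers, and the comment documents are invented apart from the `updated` format shown in the example document.

```javascript
// Stand-in for the view engine's emit(); the server supplies the real one.
const commentRows = [];
const emit = (key, value) => commentRows.push({ key, value });

// Hypothetical helper: "2010-07-22 20:00:20" -> [2010, 7, 22, 20, 0, 20].
function toDateArray(timestamp) {
  return timestamp.split(/[- :]/).map(Number);
}

// Map: key each comment by its timestamp, split into an array.
function map(doc, meta) {
  if (doc.type === "comment" && doc.updated) {
    emit(toDateArray(doc.updated), 1);
  }
}

// Simplified grouped count: truncate each array key to its first
// `groupLevel` elements and count the rows sharing that truncated key.
function groupCount(rows, groupLevel) {
  const counts = new Map();
  for (const { key } of rows) {
    const groupKey = JSON.stringify(key.slice(0, groupLevel));
    counts.set(groupKey, (counts.get(groupKey) || 0) + 1);
  }
  return [...counts].map(([k, value]) => ({ key: JSON.parse(k), value }));
}

// Sample comments in chronological order, invented for illustration.
const comments = [
  { type: "comment", updated: "2010-07-22 20:00:20" },
  { type: "comment", updated: "2010-07-22 21:15:00" },
  { type: "comment", updated: "2010-07-30 09:00:00" },
  { type: "comment", updated: "2010-08-01 10:00:00" },
];
comments.forEach((doc, i) => map(doc, { id: "comment_" + i }));

console.log(JSON.stringify(groupCount(commentRows, 2))); // monthly rollup
console.log(JSON.stringify(groupCount(commentRows, 3))); // daily rollup
```

With this sample data the monthly rollup has two rows (July and August 2010) and the daily rollup has three.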
![Page 26: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/26.jpg)
Query Pattern: Leaderboard
![Page 27: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/27.jpg)
Aggregate value stored in a document
• Let's find the top-rated beers!
{
  "brewery": "New Belgium Brewing",
  "name": "1554 Enlightened Black Ale",
  "abv": 5.5,
  "description": "Born of a flood...",
  "category": "Belgian and French Ale",
  "style": "Other Belgian-Style Ales",
  "updated": "2010-07-22 20:00:20",
  "ratings": { "jchris": 5, "scalabl3": 4, "damienkatz": 1 },
  "comments": [ "f1e62", "6ad8c" ]
}
ratings
![Page 28: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/28.jpg)
Sort each beer by its average rating
• Let's find the top-rated beers!
average
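A leaderboard of this kind can be sketched by having the map function compute each beer's average rating and emit it as the key, so the index is already sorted by rating. The `emit` stub and the `topRated` helper (standing in for a query with `descending=true` and a `limit`) are assumptions; only the first sample beer's ratings come from the example document.

```javascript
// Stand-in for the view engine's emit(); the server supplies the real one.
const ratingRows = [];
const emit = (key, value) => ratingRows.push({ key, value });

// Map: key = average of the ratings object, value = beer name.
function map(doc, meta) {
  if (!doc.ratings) return;
  const scores = Object.values(doc.ratings);
  if (scores.length === 0) return;
  const total = scores.reduce((sum, s) => sum + s, 0);
  emit(total / scores.length, doc.name);
}

// The first beer uses the ratings from the example document;
// the other two are invented for illustration.
const beers = [
  { name: "1554 Enlightened Black Ale", ratings: { jchris: 5, scalabl3: 4, damienkatz: 1 } },
  { name: "Sample Beer A", ratings: { user_1: 4, user_2: 5 } },
  { name: "Sample Beer B", ratings: { user_1: 2, user_2: 2 } },
];
beers.forEach((doc, i) => map(doc, { id: "beer_" + i }));

// Hypothetical helper: highest keys first, cut off after `limit` rows.
function topRated(rows, limit) {
  return [...rows]
    .sort((a, b) => b.key - a.key)
    .slice(0, limit)
    .map((r) => r.value);
}

console.log(topRated(ratingRows, 2));
// prints [ 'Sample Beer A', '1554 Enlightened Black Ale' ]
```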
![Page 29: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/29.jpg)
Couchbase and Elasticsearch
![Page 30: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/30.jpg)
Full Text Search
![Page 31: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/31.jpg)
{ "name": "Abbey Belgian Style Ale", "description": "Winner of four World Beer Cup medals and eight medals at the Great American Beer Fest, Abbey Belgian Ale is the Mark Spitz of New Belgium’s lineup – but it didn’t start out that way."}
Search Across Full JSON Body
Search term: abbey
![Page 32: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/32.jpg)
![Page 33: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/33.jpg)
Faceted Search
Categories
Items with Counts
Range Facets
![Page 34: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/34.jpg)
Learning Portal – Proof of Concept
![Page 35: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/35.jpg)
Couchbase and Hadoop
![Page 36: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/36.jpg)
Operational vs. Analytic Databases
• Analytic Databases (Cloudera, etc.): Get insights from data
• Real-time, Interactive Databases (NoSQL, Couchbase): Fast access to data
![Page 37: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/37.jpg)
What is Sqoop?
Sqoop is a tool designed to transfer data between Hadoop and [OLTP] databases. You can use Sqoop to import data from [an OLTP] database management system (RDBMS) such as MySQL or Oracle [or Couchbase] into the Hadoop Distributed File System (HDFS), transform the data in Hadoop MapReduce, and then export the data back.
sqoop.apache.org
![Page 38: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/38.jpg)
Traditional ETL
[Diagram: Application, Data, T (transform), Data]
![Page 39: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/39.jpg)
A different paradigm
[Diagram: Data, Application, Data]
![Page 40: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/40.jpg)
A very scalable different paradigm
[Diagram: repeated Application and Data pairs, showing the same pattern scaled out]
![Page 41: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/41.jpg)
Where did the Transform go?
[Diagram: Application and Data, with the T (transform) steps spread across many nodes]
![Page 42: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/42.jpg)
Couchbase Import and Export
$ sqoop import --connect http://localhost:8091/pools --table DUMP
$ sqoop import --connect http://localhost:8091/pools --table BACKFILL_5
$ sqoop export --connect http://localhost:8091/pools --table DUMP --export-dir DUMP
• For imports, table must be:
  – DUMP: All keys currently in Couchbase
  – BACKFILL_n: All key mutations for n minutes
• Specified --username maps to bucket
  – By default set to "default" bucket
![Page 43: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/43.jpg)
Hadoop and Couchbase – Ad Targeting
[Diagram with three numbered flows: click stream events; profiles, campaigns; profiles, real-time campaign statistics]
40 milliseconds to respond with the decision.
![Page 44: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/44.jpg)
Moving Parts
![Page 45: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/45.jpg)
Content & Recommendation Targeting
![Page 46: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/46.jpg)
Moving Parts
![Page 47: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/47.jpg)
Thank you
Couchbase NoSQL Document Database
![Page 48: Couchbase_John_Bryce_Israel_Training_couchbase_hadoop](https://reader031.vdocument.in/reader031/viewer/2022032422/55a8e0fa1a28abcb4e8b4638/html5/thumbnails/48.jpg)