Călin Andrei Burloiu - Connecting Hadoop with Couchbase: Engineering for performance
DESCRIPTION
We needed a bridge between the real-time tier, where we used Couchbase, and the batch tier, built on Hadoop. For lack of something better, we built our own: Couchdoop, an open-source Hadoop connector for Couchbase. Our presentation will discuss best practices for creating a Hadoop connector for a NoSQL database. We will talk about the challenges we encountered while developing Couchdoop and share how we tuned it for performance. Together with Bigstep we worked on performance benchmarks for our technology, which show how much throughput can be squeezed from a Hadoop connector.
TRANSCRIPT
Two-tier Architecture
Real-time Tier (Couchbase)
• Detects user intent
• Gives next best recommendation or deal
Data Bridge (Couchdoop)
Batch Tier (Hadoop)
• Recommends products
Data crossing the bridge: User events, Recommendations
Importing Data
{ "user": "Rudy", "action": "view", "product": "Fender Guitar" }
{ "user": "Rudy", "action": "click", "product": "Guitar Amplifier" }
{ "user": "Emma", "action": "buy", "product": "Blue Skirt" }
Couchdoop
IMPORT
HDFS
Hadoop: Machine Learning, Recommendations
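The import slide shows user event documents moving from Couchbase into HDFS. As a rough illustration of what a batch job could do with them once they land there, the sketch below groups the events by user. It assumes the documents are stored one JSON document per line, which the slides do not state, and `group_events_by_user` is a hypothetical helper, not part of Couchdoop.

```python
import json
from collections import defaultdict

# Event documents copied from the slide. Storing them one JSON document
# per line is an assumption made for this sketch.
EVENT_LINES = [
    '{ "user": "Rudy", "action": "view", "product": "Fender Guitar" }',
    '{ "user": "Rudy", "action": "click", "product": "Guitar Amplifier" }',
    '{ "user": "Emma", "action": "buy", "product": "Blue Skirt" }',
]


def group_events_by_user(lines):
    """Parse JSON lines and collect (action, product) pairs per user."""
    events = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        doc = json.loads(line)
        events[doc["user"]].append((doc["action"], doc["product"]))
    return dict(events)


if __name__ == "__main__":
    for user, actions in group_events_by_user(EVENT_LINES).items():
        print(user, actions)
```

In a real Hadoop job the grouping would normally be done by the shuffle phase, with the user as the map output key; the in-memory dictionary here only stands in for that step.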
Exporting Data
{ "user": "Rudy", "recommendations": [ ["Ibanez Acoustic Guitar", 450], ["Guitar Tuner", 120], ["Sound Mixer", 30] ] }
Couchdoop
EXPORT
Hadoop: Machine Learning, Recommendations
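The export slide shows one recommendation document per user going back to Couchbase. A minimal sketch of how a batch job might assemble such a document from scored products follows. It treats the number in each pair as a score, which the slide does not confirm, and `build_recommendation_doc` and `top_n` are hypothetical names introduced for illustration.

```python
import json


def build_recommendation_doc(user, scored_products, top_n=3):
    """Build a document shaped like the one on the slide.

    scored_products is an iterable of (product_name, score) pairs.
    Treating the number as a score is an assumption of this sketch.
    """
    ranked = sorted(scored_products, key=lambda pair: pair[1], reverse=True)
    return {
        "user": user,
        "recommendations": [[name, score] for name, score in ranked[:top_n]],
    }


if __name__ == "__main__":
    scored = [
        ("Sound Mixer", 30),
        ("Ibanez Acoustic Guitar", 450),
        ("Guitar Tuner", 120),
    ]
    print(json.dumps(build_recommendation_doc("Rudy", scored)))
```

Keeping only the top few entries bounds the size of each document, which matters when the real-time tier has to read it on every request.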
Updating Data
{ "user": "Rudy", "recommendations": [ ["Ibanez Acoustic Guitar", 450], ["Guitar Tuner", 120], ["Sound Mixer", 30] ] }
Couchdoop
UPDATE
Hadoop: Machine Learning, Recommendations
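The update slide shows an existing recommendation document being modified rather than written from scratch. One way to picture that is the sketch below, which merges freshly computed pairs into an existing document. The merge policy (keep the highest score per product, then keep the top entries) is an assumption for illustration only, `merge_recommendations` is a hypothetical helper, and the "Guitar Strap" entry is made-up sample data.

```python
def merge_recommendations(existing_doc, new_pairs, top_n=3):
    """Return a copy of existing_doc with new (product, score) pairs merged in.

    Assumed policy: keep the highest score seen for each product,
    then keep only the top_n products by score.
    """
    best = {name: score for name, score in existing_doc.get("recommendations", [])}
    for name, score in new_pairs:
        if name not in best or score > best[name]:
            best[name] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    updated = dict(existing_doc)  # shallow copy; the input is not modified
    updated["recommendations"] = [[name, score] for name, score in ranked[:top_n]]
    return updated


if __name__ == "__main__":
    existing = {
        "user": "Rudy",
        "recommendations": [
            ["Ibanez Acoustic Guitar", 450],
            ["Guitar Tuner", 120],
            ["Sound Mixer", 30],
        ],
    }
    # "Guitar Strap" is hypothetical sample data, not from the slides.
    print(merge_recommendations(existing, [("Guitar Tuner", 200), ("Guitar Strap", 50)]))
```

An update of this kind needs a read of the current document before the write, so it costs more per record than a plain export.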