Webinar: MongoDB and Hadoop - Working Together to Provide Business Insights
DESCRIPTION
Join us for a webinar on how MongoDB and Hadoop can work together to solve Big Data problems in today's enterprises. We will take an in-depth look at how the two technologies make real business intelligence accessible to end users. After a brief introduction to both technologies, this webinar will dive deep into the MongoDB+Hadoop Connector and how it is applied to enable new business insights.
In this webinar you will learn:
• What information problems are a good fit for MongoDB and Hadoop
• How to integrate the two technologies using the MongoDB+Hadoop Connector
• Programming paradigms for tackling common problems
TRANSCRIPT
![Page 1: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/1.jpg)
MongoDB & Hadoop: Providing Business Insights
Thomas Boyd, Senior Solutions Architect, MongoDB
![Page 2: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/2.jpg)
2
What is MongoDB?
The leading NoSQL database
Document Database
Open-Source
General Purpose
![Page 3: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/3.jpg)
3
MongoDB Document Model
RDBMS vs. MongoDB
{
_id : ObjectId("4c4ba5e5e8aabf3"),
employee_name: "Dunham, Justin",
department : "Marketing",
title : "Product Manager, Web",
report_up: "Neray, Graham",
pay_band: "C",
benefits : [
{ type : "Health",
plan : "PPO Plus" },
{ type : "Dental",
plan : "Standard" }
]
}
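A minimal sketch of how an application might read the document above, assuming it has been loaded into a plain Python dictionary. The `_id` is kept as a string so that no MongoDB driver is needed, and the helper name `plan_for` is hypothetical.

```python
# The slide's employee document as a plain Python dict.
# Assumption: _id is shown as a string so no MongoDB driver is required.
employee = {
    "_id": "4c4ba5e5e8aabf3",
    "employee_name": "Dunham, Justin",
    "department": "Marketing",
    "title": "Product Manager, Web",
    "report_up": "Neray, Graham",
    "pay_band": "C",
    "benefits": [
        {"type": "Health", "plan": "PPO Plus"},
        {"type": "Dental", "plan": "Standard"},
    ],
}


def plan_for(doc, benefit_type):
    """Return the plan name for one benefit type, or None if absent."""
    for benefit in doc.get("benefits", []):
        if benefit["type"] == benefit_type:
            return benefit["plan"]
    return None
```

The benefits are embedded in the employee document itself, so reading them requires no join against a second table.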
![Page 4: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/4.jpg)
4
What is Hadoop?
“The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.”*
*source: hadoop.apache.org
• Large datasets
• Analytics
• Batch
• Map-Reduce
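The Map-Reduce model named in the last bullet can be sketched in a few lines of plain Python. This only illustrates the programming model on a single machine and is not Hadoop's API; the function names and sample records are hypothetical.

```python
from collections import defaultdict


def map_phase(records):
    """Emit one (key, 1) pair per record, keyed by department."""
    for record in records:
        yield record["department"], 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical sample input.
records = [
    {"employee_name": "A", "department": "Marketing"},
    {"employee_name": "B", "department": "Sales"},
    {"employee_name": "C", "department": "Marketing"},
]
counts = reduce_phase(shuffle(map_phase(records)))
```

In Hadoop the map and reduce steps run in parallel across a cluster and the framework performs the grouping; the shape of the computation is the same.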
![Page 5: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/5.jpg)
5
Enterprise IT Stack
[Diagram: layered enterprise IT stack]
Applications: CRM, ERP, Collaboration, Mobile, BI
Data Management: RDBMS (Online Data); EDW, Hadoop (Offline Data)
Infrastructure: OS & Virtualization, Compute, Storage, Network
Spanning all layers: Management & Monitoring; Security & Auditing
![Page 6: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/6.jpg)
6
Consideration: Online vs. Offline
Online
• Real-time
• Low-latency
• High availability
Offline
• Long-running
• High-latency
• Availability is lower priority
![Page 7: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/7.jpg)
7
Consideration: Online vs. Offline
Online vs. Offline
![Page 8: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/8.jpg)
8
Hadoop is good for…
• Risk Modeling
• Churn Analysis
• Recommendation Engine
• Ad Targeting
• Transaction Analysis
• Trade Surveillance
• Network Failure Prediction
• Search Quality
• Data Lake
![Page 9: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/9.jpg)
9
MongoDB is good for…
• 360 Degree View of the Customer
• Mobile & Social Apps
• Fraud Detection
• User Data Management
• Content Management & Delivery
• Reference Data
• Product Catalogs
• Machine to Machine Apps
• Data Hub
![Page 10: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/10.jpg)
10
MongoDB and Hadoop: Complementary
Hadoop
• “Data Lake”
• In-depth analytics
MongoDB
• Real-time systems
• Light-weight analytical workloads
![Page 11: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/11.jpg)
11
Use MongoDB+Hadoop Together
E-Commerce
• Products & Inventory
• Real-time recommendations
• Customer profile
• Session management
• Customer clickstream
• Fraud detection
MongoDB Connector for Hadoop
Analysis
• Transaction history
• Clickstream history
• Recommendation model
• Fraud modeling
![Page 12: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/12.jpg)
12
Example – Fraud Detection
Payments
• Online payments processing
MongoDB Connector for Hadoop
Nightly Analysis
• Fraud modeling
Results Cache
3rd Party Data Sources
Fraud Detection
query only
![Page 13: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/13.jpg)
13
Customer example – Global Travel Firm
Travel
• Flights, hotels and cars
• Real-time offers
• User profiles, reviews
• User metadata (previous purchases, clicks, views)
MongoDB Connector for Hadoop
Algorithms
• User segmentation
• Offer recommendation engine
• Ad serving engine
• Bundling engine
![Page 14: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/14.jpg)
14
Customer example – MetLife
Insurance
• Insurance policies
• Demographic data
• Customer web data
• Call center data
• Real-time churn detection
MongoDB Connector for Hadoop
Churn Analysis
• Customer action analysis
• Churn prediction algorithms
![Page 15: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/15.jpg)
15
Customer example – Criteo
Ad-Serving
• Catalogs and products
• User profiles
• Clicks
• Views
• Transactions
MongoDB Connector for Hadoop
Algorithms
• User segmentation
• Recommendation engine
• Prediction engine
![Page 16: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/16.jpg)
16
What is MongoDB-Hadoop Connector?
• Java Map-Reduce, Streaming Map-Reduce, Pig, & Hive access to MongoDB
  – MongoDB as input
    • mongo.job.input.format=com.mongodb.hadoop.MongoInputFormat
    • mongo.input.uri=mongodb://my-db:27017/db1.collection1
  – MongoDB as output
    • mongo.job.output.format=com.mongodb.hadoop.MongoOutputFormat
    • mongo.output.uri=mongodb://my-db:27017/db1.collection2
  – Using MongoDB backup files
    • mongo.job.output.format=com.mongodb.hadoop.BSONFileOutputFormat
    • mapred.output.dir=file:///results.bson
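A small sketch of how the input and output settings on this slide might be assembled for a job. It assumes the format classes live in the connector's `com.mongodb.hadoop` package and that the output collection is set through `mongo.output.uri`; both should be checked against the connector version in use. The helper names `connector_properties` and `as_hadoop_args` are hypothetical.

```python
# Assumption: the connector's format classes live in the com.mongodb.hadoop
# package; verify the class names against the connector version in use.
INPUT_FORMAT = "com.mongodb.hadoop.MongoInputFormat"
OUTPUT_FORMAT = "com.mongodb.hadoop.MongoOutputFormat"


def connector_properties(input_uri, output_uri):
    """Build job properties for reading from and writing to MongoDB."""
    return {
        "mongo.job.input.format": INPUT_FORMAT,
        "mongo.input.uri": input_uri,
        "mongo.job.output.format": OUTPUT_FORMAT,
        "mongo.output.uri": output_uri,
    }


def as_hadoop_args(properties):
    """Render properties as a flat list of -D key=value arguments."""
    args = []
    for key in sorted(properties):
        args.extend(["-D", "{}={}".format(key, properties[key])])
    return args


props = connector_properties(
    "mongodb://my-db:27017/db1.collection1",
    "mongodb://my-db:27017/db1.collection2",
)
args = as_hadoop_args(props)
```

The resulting list could be appended to a `hadoop jar` command line, or the same keys could be set in a job configuration file.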
![Page 17: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/17.jpg)
17
Enhancing MongoDB-Hadoop Connector
• Version 1.1.0, July 2013
  – Pig support
  – Hive support
  – Streaming support
  – Read/Write MongoDB backups
  – Update writes
  – Much more…
• Version 1.2.0, December 2013
  – Apache Hadoop 2.2 support
  – Multiple collections as M-R source
  – Multiple mongos support
  – Custom splitting support
  – Performance improvements
![Page 18: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/18.jpg)
18
MongoDB Native Analytics
• Rich query language
• Native secondary indexes
• Geospatial indexes & search
• Text indexes & search
• Aggregation framework
• JavaScript Map-Reduce
• Client-side analytics
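To make two of these options concrete, the sketch below writes an aggregation-framework pipeline as Python data and, next to it, a client-side computation of the same counts. The pipeline is not executed here because no server is assumed; with a driver such as PyMongo it would be passed to `collection.aggregate(pipeline)`. The sample documents and the function name are hypothetical.

```python
from collections import Counter

# An aggregation-framework pipeline expressed as Python data.
# Not executed here: no MongoDB server is assumed.
pipeline = [
    {"$unwind": "$benefits"},
    {"$group": {"_id": "$benefits.plan", "count": {"$sum": 1}}},
]


def count_plans_client_side(documents):
    """Client-side equivalent: count benefit plans across documents."""
    counts = Counter()
    for doc in documents:
        for benefit in doc.get("benefits", []):
            counts[benefit["plan"]] += 1
    return dict(counts)


# Hypothetical sample documents.
documents = [
    {"benefits": [{"type": "Health", "plan": "PPO Plus"},
                  {"type": "Dental", "plan": "Standard"}]},
    {"benefits": [{"type": "Health", "plan": "PPO Plus"}]},
]
plan_counts = count_plans_client_side(documents)
```

The server-side pipeline avoids shipping every document to the application, which matters as collections grow; the client-side version is simpler to write for small result sets.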
![Page 19: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/19.jpg)
19
Resources
Resource / Location
• White paper: Big Data: Examples and Guidelines for the Enterprise Decision Maker
  http://www.mongodb.com/lp/whitepaper/big-data-nosql
• Recorded Webinar Series: Thrive with Big Data
  http://www.mongodb.com/lp/big-data-series
• Recorded Webinar: What’s New with MongoDB Hadoop Integration
  http://www.mongodb.com/presentations/webinar-whats-new-mongodb-hadoop-integration
• Documentation: MongoDB Connector for Hadoop
  http://docs.mongodb.org/ecosystem/tools/hadoop/
• Trouble Tickets
  http://jira.mongodb.org (project = Hadoop Integration)
• Subscriptions, support, consulting, training
  https://www.mongodb.com/products/how-to-buy
![Page 20: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.vdocument.in/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/20.jpg)