Transcript
Page 1: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights

MongoDB & Hadoop: Providing Business Insights

Thomas Boyd, Senior Solutions Architect, MongoDB

What is MongoDB?

The leading NoSQL database

Document Database

Open-Source

General Purpose

MongoDB Document Model

RDBMS vs. MongoDB

{
  _id : ObjectId("4c4ba5e5e8aabf3"),
  employee_name : "Dunham, Justin",
  department : "Marketing",
  title : "Product Manager, Web",
  report_up : "Neray, Graham",
  pay_band : "C",
  benefits : [
    { type : "Health", plan : "PPO Plus" },
    { type : "Dental", plan : "Standard" }
  ]
}
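To make the document model concrete, here is a minimal sketch that writes the slide's employee record as a plain Python dict and reads a value out of the embedded benefits array. The helper name plan_for is hypothetical, and the _id is kept as a plain string rather than a real ObjectId.

```python
# The slide's employee document written as a plain Python dict.
# Assumption: the _id is kept as a string here; in MongoDB it would be an ObjectId.
employee = {
    "_id": "4c4ba5e5e8aabf3",
    "employee_name": "Dunham, Justin",
    "department": "Marketing",
    "title": "Product Manager, Web",
    "report_up": "Neray, Graham",
    "pay_band": "C",
    "benefits": [
        {"type": "Health", "plan": "PPO Plus"},
        {"type": "Dental", "plan": "Standard"},
    ],
}


def plan_for(document, benefit_type):
    """Return the plan name for one benefit type, or None if it is absent."""
    for benefit in document["benefits"]:
        if benefit["type"] == benefit_type:
            return benefit["plan"]
    return None


# The embedded array is read straight from the document, with no join
# against a separate benefits table.
health_plan = plan_for(employee, "Health")
vision_plan = plan_for(employee, "Vision")
```

In a relational schema the benefits would typically live in their own table; here they travel with the employee record.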

What is Hadoop?

“The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.”*

*source: hadoop.apache.org

• Large datasets
• Analytics
• Batch
• Map-Reduce
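The Map-Reduce model named above can be sketched in a few lines. This is an in-process illustration of the map, shuffle, and reduce steps only, not Hadoop code; the function names and the sample payment records are invented for the example.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to each record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle_phase(pairs):
    """Group emitted values by key, the step that sits between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped, reducer):
    """Apply the reducer once per key."""
    return {key: reducer(key, values) for key, values in grouped.items()}


# Hypothetical payment records, invented for illustration.
payments = [
    {"customer": "a", "amount": 20},
    {"customer": "b", "amount": 5},
    {"customer": "a", "amount": 30},
]


def emit_amount(record):
    """Mapper: emit one (customer, amount) pair per payment."""
    return [(record["customer"], record["amount"])]


def total(key, values):
    """Reducer: sum all amounts seen for one customer."""
    return sum(values)


totals = reduce_phase(shuffle_phase(map_phase(payments, emit_amount)), total)
```

In a real cluster the map and reduce steps run in parallel across machines and the framework handles the grouping; the shape of the computation is the same.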

Enterprise IT Stack

Applications: CRM, ERP, Collaboration, Mobile, BI

Data Management:
• Online Data: RDBMS, RDBMS
• Offline Data: EDW, Hadoop

Infrastructure: OS & Virtualization, Compute, Storage, Network

Across all layers: Management & Monitoring, Security & Auditing

Consideration: Online vs. Offline

Online
• Real-time
• Low-latency
• High availability

Offline
• Long-running
• High-latency
• Availability is lower priority

Hadoop is good for…

• Risk Modeling
• Churn Analysis
• Recommendation Engine
• Ad Targeting
• Transaction Analysis
• Trade Surveillance
• Network Failure Prediction
• Search Quality
• Data Lake

MongoDB is good for…

• 360 Degree View of the Customer
• Mobile & Social Apps
• Fraud Detection
• User Data Management
• Content Management & Delivery
• Reference Data
• Product Catalogs
• Machine to Machine Apps
• Data Hub

MongoDB and Hadoop: Complementary

MongoDB
• Real-time systems
• Light-weight analytical workloads

Hadoop
• “Data Lake”
• In-depth analytics

Use MongoDB+Hadoop Together

E-Commerce
• Products & Inventory
• Real-time recommendations
• Customer profile
• Session management
• Customer clickstream
• Fraud detection

Analysis
• Transaction history
• Clickstream history
• Recommendation model
• Fraud modeling

MongoDB Connector for Hadoop

Example – Fraud Detection

Payments
• Online payments processing

Nightly Analysis
• Fraud modeling

Other diagram labels: Fraud Detection, 3rd Party Data Sources, Results Cache, MongoDB Connector for Hadoop. Two connections in the diagram are marked "query only".
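One plausible reading of this diagram is that a nightly batch job builds fraud results, publishes them to a results cache, and the online fraud check only queries that cache. The sketch below illustrates that pattern with a plain dict standing in for the cache; the scoring rule, threshold, field names, and account ids are all hypothetical.

```python
from collections import defaultdict

# Stand-in for the "Results Cache"; a dict keyed by account id.
results_cache = {}


def nightly_analysis(payment_history, threshold=3):
    """Batch step: flag accounts with many declined payments.

    The rule and threshold are hypothetical placeholders for a real fraud model.
    """
    declined = defaultdict(int)
    for payment in payment_history:
        if payment["status"] == "declined":
            declined[payment["account"]] += 1
    return {account: count >= threshold for account, count in declined.items()}


def publish(scores):
    """Replace the cache contents with the latest batch results."""
    results_cache.clear()
    results_cache.update(scores)


def is_suspicious(account):
    """Online step: a read-only lookup, with no model evaluation per request."""
    return results_cache.get(account, False)


# Hypothetical payment history, invented for illustration.
history = [
    {"account": "acct-1", "status": "declined"},
    {"account": "acct-1", "status": "declined"},
    {"account": "acct-1", "status": "declined"},
    {"account": "acct-2", "status": "approved"},
    {"account": "acct-2", "status": "declined"},
]
publish(nightly_analysis(history))
```

Keeping the online path to a single lookup is what lets the expensive modeling stay in the batch layer.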

Customer example – Global Travel Firm

Travel
• Flights, hotels and cars
• Real-time offers
• User profiles, reviews
• User metadata (previous purchases, clicks, views)

Algorithms
• User segmentation
• Offer recommendation engine
• Ad serving engine
• Bundling engine

MongoDB Connector for Hadoop

Customer example – MetLife

Insurance
• Insurance policies
• Demographic data
• Customer web data
• Call center data
• Real-time churn detection

Churn Analysis
• Customer action analysis
• Churn prediction algorithms

MongoDB Connector for Hadoop

Customer example – Criteo

Ad-Serving
• Catalogs and products
• User profiles
• Clicks
• Views
• Transactions

Algorithms
• User segmentation
• Recommendation engine
• Prediction engine

MongoDB Connector for Hadoop

What is the MongoDB-Hadoop Connector?

• Java Map-Reduce, Stream Map-Reduce, Pig, & Hive access to MongoDB
– MongoDB as input
  • mongo.job.input.format=com.mongodb.hadoop.MongoInputFormat
  • mongo.input.uri=mongodb://my-db:27017/db1.collection1
– MongoDB as output
  • mongo.job.output.format=com.mongodb.hadoop.MongoOutputFormat
  • mongo.output.uri=mongodb://my-db:27017/db1.collection2
– Using MongoDB backup files
  • mongo.job.output.format=com.mongodb.hadoop.BSONFileOutputFormat
  • mapred.output.dir=file:///results.bson
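As a rough sketch of how these settings fit together, the snippet below collects the properties for a job that reads from one collection and writes to another, then renders them as Hadoop -D generic options. It assumes the job driver accepts generic options. The class names use the connector's com.mongodb.hadoop package, and the output URI uses the mongo.output.uri property. The helper names are hypothetical; the host and collection names come from the slide.

```python
def connector_properties(input_uri, output_uri):
    """Build the job properties for a MongoDB-to-MongoDB job."""
    for uri in (input_uri, output_uri):
        if not uri.startswith("mongodb://"):
            raise ValueError("expected a mongodb:// URI, got %r" % uri)
    return {
        "mongo.job.input.format": "com.mongodb.hadoop.MongoInputFormat",
        "mongo.input.uri": input_uri,
        "mongo.job.output.format": "com.mongodb.hadoop.MongoOutputFormat",
        "mongo.output.uri": output_uri,
    }


def as_generic_options(properties):
    """Render properties as -D key=value pairs, sorted for stable output."""
    args = []
    for key in sorted(properties):
        args.extend(["-D", "%s=%s" % (key, properties[key])])
    return args


props = connector_properties(
    "mongodb://my-db:27017/db1.collection1",
    "mongodb://my-db:27017/db1.collection2",
)
options = as_generic_options(props)
```

The resulting list could be appended to a job launch command; the same keys could equally be set in a job configuration file.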

Enhancing MongoDB-Hadoop Connector

• Version 1.1.0, July 2013
– Pig support
– Hive support
– Streaming support
– Read/Write MongoDB backups
– Update writes
– Much more…

• Version 1.2.0, December 2013
– Apache Hadoop 2.2 support
– Multiple collections as M-R source
– Multiple mongos support
– Custom splitting support
– Performance improvements

MongoDB Native Analytics

• Rich query language
• Native secondary indexes
• Geospatial indexes & search
• Text indexes & search
• Aggregation framework
• JavaScript Map-Reduce
• Client-side analytics
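To give a feel for the aggregation framework, the sketch below writes a small pipeline as data (a list of stage documents using the $group and $sort stages) and pairs it with a plain-Python function that computes the same result on sample records. The sample documents and the helper name count_by_department are invented for illustration; the plain-Python function is a comparison aid, not how MongoDB evaluates the pipeline.

```python
from collections import Counter

# An aggregation pipeline is a list of stage documents.
# This one counts employees per department and sorts by department name.
pipeline = [
    {"$group": {"_id": "$department", "count": {"$sum": 1}}},
    {"$sort": {"_id": 1}},
]

# Hypothetical sample documents, invented for illustration.
employees = [
    {"employee_name": "Dunham, Justin", "department": "Marketing"},
    {"employee_name": "Example, One", "department": "Engineering"},
    {"employee_name": "Example, Two", "department": "Marketing"},
]


def count_by_department(documents):
    """Plain-Python equivalent of the pipeline above, for comparison only."""
    counts = Counter(doc["department"] for doc in documents)
    return [{"_id": dept, "count": counts[dept]} for dept in sorted(counts)]


expected_shape = count_by_department(employees)
```

With a driver such as PyMongo, the pipeline list would be passed to a collection's aggregate method, and the work would run inside the database rather than in the client.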

Resources

White paper: Big Data: Examples and Guidelines for the Enterprise Decision Maker
http://www.mongodb.com/lp/whitepaper/big-data-nosql

Recorded Webinar Series: Thrive with Big Data
http://www.mongodb.com/lp/big-data-series

Recorded Webinar: What’s New with MongoDB Hadoop Integration
http://www.mongodb.com/presentations/webinar-whats-new-mongodb-hadoop-integration

Documentation: MongoDB Connector for Hadoop
http://docs.mongodb.org/ecosystem/tools/hadoop/

Trouble Tickets (project = Hadoop Integration)
http://jira.mongodb.org

Subscriptions, support, consulting, training
https://www.mongodb.com/products/how-to-buy
