Webinar: Introducing the MongoDB Connector for BI 2.0 with Tableau

Posted on 19-Jan-2017


TRANSCRIPT

Introducing the MongoDB Connector for BI 2.0 with Tableau

Buzz Moschetti, Enterprise Architect
buzz.moschetti@mongodb.com, @buzzmoschetti

Vaidy Krishnan, Senior Product Marketing Manager
vkrishnan@tableau.com

Agenda

• Introduction to MongoDB
• What is the BI Connector?
• Analytics with Tableau on MongoDB
• Demo
• Best Practices


MongoDB: The Post-Relational General Purpose Database

• Document Data Model
• Open-Source
• Fully Featured
• High Performance
• Scalable

{
  name: "John Smith",
  pfxs: ["Dr.", "Mr."],
  address: "10 3rd St.",
  phone: {
    home: 1234567890,
    mobile: 1234568138
  }
}


MongoDB Company Overview

• 600+ employees
• 2500+ customers
• Over $311 million in funding
• Offices in NY & Palo Alto and across EMEA and APAC


db-engines.com Ranks ~300 Databases


Nexus Architecture

• Scalability & Performance
• Always On, Global Deployments
• Flexibility
• Expressive Query Language & Secondary Indexes
• Strong Consistency
• Enterprise Management & Integrations

• Deep Dive


Major Sweet Spots

• Big Data
• Product & Asset Catalogs
• Security & Fraud
• Internet of Things
• Database-as-a-Service
• Mobile Apps
• Customer Data Management
• Single View
• Social & Collaboration
• Content Management
• Intelligence Agencies
• Top Investment and Retail Banks
• Top Global Shipping Company
• Top Industrial Equipment Manufacturer
• Top Media Company
• Top Investment and Retail Banks
• Complex Data Management
• Top Investment and Retail Banks
• Embedded/ISV
• Cushman & Wakefield


MongoDB Query Language is Powerful

> db.results.values.aggregate([
{$match: { runnum:23, timeSeriesPath: "CDSSpread.12M//1909468128" },
{$project: { timeSeriesPath: "$timeSeriesPath", values: foml }},
{$unwind: {path: "$values", idx: "v_idx"}},
{$match: {values: {$gt: 60}, {$or: [ {idx: 0}, {idx: {$size: . . .},
{$group: {_id: {a: "$timeSeriesPath", b: term: "$idx"},
          n: {$sum:1}, max: {$max: "$values"}, min: {$min: "$values"}},
          sdev: {$stdDevPop: "$values"}}
,{$lookup: { from: "deskLimits", localField: "instID", foreignField: "instID", as: "inst"}},
{$match: {maxDeskLimit: {$gt: {$cond: [ {$gt: [2, $max]}, 2, $max]}}}},
{$group: {_id: "$deskID", total: {$sum: "$max"}}}
]);


Able To Leap Tall Buildings in a Single Bound!

> db.foo.insert({_id:1, "poly": [ [0,0], [2,12], [4,0], [2,5], [0,0] ] });
> db.foo.insert({_id:2, "poly": [ [2,2], [5,8], [6,0], [3,1], [2,2] ] });

> db.foo.aggregate([
 {$project: {"conv": {$map: { input: "$poly", as: "z", in: {
      x: {$arrayElemAt: ["$$z",0]}, y: {$arrayElemAt: ["$$z",1]},
      len: {$literal: 0} }}}}}
,{$addFields: {first: {$arrayElemAt: [ "$conv", 0 ]} }}
,{$project: {"qqq":
    {$reduce: { input: "$conv", initialValue: "$first", in: {
        x: "$$this.x", y: "$$this.y",
        len: {$add: ["$$value.len", // len = oldlen + newLen
            {$sqrt: {$add: [
                {$pow:[ {$subtract:["$$value.x","$$this.x"]}, 2]},
                {$pow:[ {$subtract:["$$value.y","$$this.y"]}, 2]}
            ] }} ] } }}
    }}}
,{$project: {"len": "$qqq.len"}}
]);

{ "_id" : 1, "len" : 35.10137973546188 }
{ "_id" : 2, "len" : 19.346952903339393 }
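The pipeline above computes each polygon's perimeter: $map turns every [x, y] pair into a record with a running length, and $reduce walks the records adding the distance from the previous point. The same idea can be sketched in plain Python. This is only an illustration of the logic, not something the BI Connector or the server runs; the function name `perimeter` is made up for this example.

```python
from functools import reduce
from math import sqrt

def perimeter(poly):
    """Sum the edge lengths of a closed polygon given as [x, y] pairs."""
    # Counterpart of the $map stage: each [x, y] pair becomes a record
    # carrying a running length that starts at 0.
    conv = [{"x": p[0], "y": p[1], "len": 0.0} for p in poly]

    def step(value, this):
        # Counterpart of the $reduce "in" expression: move to the current
        # point and add the distance from the previous point.
        edge = sqrt((value["x"] - this["x"]) ** 2 + (value["y"] - this["y"]) ** 2)
        return {"x": this["x"], "y": this["y"], "len": value["len"] + edge}

    # The initial value is the first point, as in the $addFields stage, so
    # the first iteration adds a zero-length edge.
    return reduce(step, conv, conv[0])["len"]

polys = {
    1: [[0, 0], [2, 12], [4, 0], [2, 5], [0, 0]],
    2: [[2, 2], [5, 8], [6, 0], [3, 1], [2, 2]],
}
for _id, poly in polys.items():
    print(_id, perimeter(poly))
```

Run on the two polygons inserted above, it should agree with the `len` values the slide reports, up to floating-point rounding.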


… But It Doesn’t Natively Speak SQL

> db.restaurants.sql("select * from restaurants where cuisine = 'Peruvian'");
2017-01-12T14:57:23.930-0500 E QUERY [main] TypeError: db.restaurants.sql is not a function


The MongoDB BI Connector: A “SQL Bridge”

[Diagram: MongoDB <-> MongoDB BI Connector <-> Anything That Speaks MySQL]

select A.fn, A.LN, P.prodType, T.amt, T.td
from tx T
JOIN product P on T.product = P.prod
JOIN acct A on T.acct = A.acct
where A.acct in ('A5', 'A10')
and T.td = '2015-03-01 00:00:00'
and P.prodType = 'CAR'

db.tx.aggregate([
{$match: {td: ISODate("2015-03-01 00:00:00")}},
{$lookup: {from: "acct", localField: "acct" …
{$match: {acct: {$in: ["A5", "A10"]}}},
{$lookup: {from: "product", localField: "prod" …
{$match: {prodType: "CAR"}}
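To make the "bridge" idea concrete, here is a minimal sketch that runs the slide's SQL against an in-memory SQLite database and then applies the same sequence of match and lookup steps to plain Python dictionaries. It assumes toy data invented for this example, uses SQLite only as a stand-in SQL engine (the BI Connector itself speaks the MySQL protocol), and the helper `lookup` only imitates a $lookup followed by an unwind. It is not how the connector translates queries internally.

```python
import sqlite3

# Toy data (hypothetical); the same rows serve as SQL tables and as documents.
tx = [
    {"acct": "A5", "product": "P1", "amt": 100.0, "td": "2015-03-01 00:00:00"},
    {"acct": "A10", "product": "P2", "amt": 250.0, "td": "2015-03-01 00:00:00"},
    {"acct": "A7", "product": "P1", "amt": 75.0, "td": "2015-03-01 00:00:00"},
    {"acct": "A5", "product": "P1", "amt": 60.0, "td": "2015-03-02 00:00:00"},
]
acct = [
    {"acct": "A5", "fn": "Ann", "LN": "Lee"},
    {"acct": "A7", "fn": "Bob", "LN": "Ray"},
    {"acct": "A10", "fn": "Cy", "LN": "Ng"},
]
product = [
    {"prod": "P1", "prodType": "CAR"},
    {"prod": "P2", "prodType": "BOAT"},
]

def sql_result():
    """Run the slide's SQL join on the toy data."""
    con = sqlite3.connect(":memory:")
    con.execute("create table tx (acct, product, amt, td)")
    con.execute("create table acct (acct, fn, LN)")
    con.execute("create table product (prod, prodType)")
    con.executemany("insert into tx values (:acct, :product, :amt, :td)", tx)
    con.executemany("insert into acct values (:acct, :fn, :LN)", acct)
    con.executemany("insert into product values (:prod, :prodType)", product)
    rows = con.execute(
        """
        select A.fn, A.LN, P.prodType, T.amt, T.td
        from tx T
        join product P on T.product = P.prod
        join acct A on T.acct = A.acct
        where A.acct in ('A5', 'A10')
          and T.td = '2015-03-01 00:00:00'
          and P.prodType = 'CAR'
        """
    ).fetchall()
    con.close()
    return rows

def lookup(docs, foreign, local_field, foreign_field, as_field):
    """Imitate $lookup plus an unwind: one output document per matching pair."""
    return [
        {**d, as_field: f}
        for d in docs
        for f in foreign
        if d[local_field] == f[foreign_field]
    ]

def pipeline_result():
    """Apply the stages in the order the slide's pipeline lists them."""
    docs = [d for d in tx if d["td"] == "2015-03-01 00:00:00"]   # match on td
    docs = lookup(docs, acct, "acct", "acct", "a")               # lookup acct
    docs = [d for d in docs if d["acct"] in ("A5", "A10")]       # match on acct
    docs = lookup(docs, product, "product", "prod", "p")         # lookup product
    docs = [d for d in docs if d["p"]["prodType"] == "CAR"]      # match on prodType
    return [
        (d["a"]["fn"], d["a"]["LN"], d["p"]["prodType"], d["amt"], d["td"])
        for d in docs
    ]

print(sql_result())
print(pipeline_result())
```

Both functions should return the same rows, which is the point of the slide: the SQL client sees joined tables while the work is expressed as pipeline stages.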


Authentication & Entitlements are ALSO Bridged

biUser?mechanism=MONGODB-CR,source=authDB
password=*******

client = connect(biUser, *******);


A Mapping File is The Key Ingredient

schema:
- db: food
  tables:
  - table: restaurants
    collection: restaurants
    columns:
    - Name: _id
      MongoType: bson.ObjectId
      SqlName: _id
      SqlType: varchar
    - Name: address.building
      MongoType: string
      SqlName: address.building
      SqlType: varchar
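The mapping file flattens nested fields into dotted column names, each with a MongoDB type and a SQL type. A rough sketch of that flattening in Python is shown here. It is not the algorithm the real mapping generator uses: the type table is an assumption (only the string-to-varchar pairing appears in the slide), arrays are skipped, and the sample document and the names `columns` and `render` are invented for illustration.

```python
# Assumed Python-type -> (MongoType, SqlType) pairs. Only "string -> varchar"
# is taken from the slide; the other rows are illustrative guesses.
TYPE_MAP = {
    str: ("string", "varchar"),
    int: ("int", "int"),
    float: ("float64", "float"),
}

def columns(doc, prefix=""):
    """Yield one column mapping per scalar field, with dotted names for nesting."""
    for key, value in doc.items():
        name = prefix + key
        if isinstance(value, dict):
            # Nested documents contribute their fields under "parent.child".
            yield from columns(value, name + ".")
        elif type(value) in TYPE_MAP:
            mongo_type, sql_type = TYPE_MAP[type(value)]
            yield {"Name": name, "MongoType": mongo_type,
                   "SqlName": name, "SqlType": sql_type}
        # Arrays and any other types are skipped in this sketch.

def render(db, collection, doc):
    """Render the columns in the same layout as the mapping file above."""
    lines = ["schema:", f"- db: {db}", "  tables:", f"  - table: {collection}",
             f"    collection: {collection}", "    columns:"]
    for col in columns(doc):
        lines.append(f"    - Name: {col['Name']}")
        for field in ("MongoType", "SqlName", "SqlType"):
            lines.append(f"      {field}: {col[field]}")
    return "\n".join(lines)

# Hypothetical sample document.
sample = {
    "name": "Chifa Lima",
    "address": {"building": "1007", "zipcode": "10462"},
    "seats": 40,
}
print(render("food", "restaurants", sample))
```

In practice a generator would sample many documents rather than one, since fields can be missing or vary in type from document to document.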


Mapping Generator to Get You Started

mongodrdl -d food -c restaurants -o food.drdl

mongosqld --schema=food.drdl


Tableau’s Big Data Focus

• Connectivity: Access to all data
• Performance: Fast interaction with all data
• Discovery: Finding the right data

Analytics for All your Data

Broad access to Big Data platforms

Visual analytics without coding

Platform query performance

Consistent visual interface

Hybrid data architecture

Big Data Connectivity Roadmap

[Timeline: 2010 to 2017, with a "Today" marker]

• Tableau v6.1.4: Cloudera Hadoop
• Tableau v7.0.10: Hortonworks Hadoop
• Tableau v8.2.3: IBM BigInsights
• Tableau v9.0: Spark SQL
• Tableau v5.2: Pivotal Greenplum & HAWQ
• Tableau v7.0.10: Cloudera Impala
• Tableau v7.0.7: MapR Hadoop
• Tableau v7.0.10: Datastax Enterprise & Cassandra
• Tableau v8.1.4: Splunk
• Tableau v8.0.1: Amazon Redshift
• Tableau v8.2.3: MarkLogic
• Tableau v8.3.2: Amazon EMR
• Tableau v8.0: Google BigQuery

Cold, Warm, Hot Framework

Cold
• The Data Lake
• Store Everything and Anything
• Unknown Questions with Unknown Answers
• Unstructured / Data Mining / Data Science

Warm
• Data Warehouses
• Data marts prepared for entity analytics
• Known questions with unknown answers
• Regularly refreshed business concepts

Hot
• In-memory computing
• Precomputed aggregates to answer specific questions
• Known questions with known answers
• Dashboards

Cold, Warm, Hot Strategy

[Figure: data tiers by Data Size and Performance: aggregated data, prepared data, large data (raw or prepared)]

Cold, Warm, Hot Strategy with Optimized MongoDB

[Figure: data tiers by Data Size and Performance: aggregated data, prepared data, large data (raw or prepared)]

How do we see customers using Tableau on MongoDB?

• Use Case
  – Data Exploration/Mining
  – Ad-Hoc Report Conceptual Modeling
  – Query directly/Explore Concepts to Migrate to Analytically Optimized Data Stores

MongoDB – Verticals & Use Cases

• Financial Services: Analyze ticks, tweets, satellite imagery, weather trends, and any other type of data to inform trading algorithms in real time.

• Government: Identify social program fraud within seconds based on program history, citizen profile, and geospatial data.

• High Tech: Identify unique individuals across any type of device, browser, or app and use a holistic behavioral model to advertise to them.

• Retail: Set up a digital geo-fence around your brick-and-mortar locations to push in-store incentives to shoppers in real time.


Basic MongoDB Optimizations

✔ DO:

• Model for use
• Index effectively
• Use prejoined array tables
• Leverage custom pipelines in DRDL
• Let dates (SQL timestamp) and decimal types flow without conversion to string

✗ AVOID:

• Casts
• Date arithmetic
• Cross-collection
• Non-equijoins
• Subqueries
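The "prejoined array tables" recommendation can be pictured as follows: instead of exposing an array as a child table that the BI tool must join back to its parent, each array element becomes one row that already carries the parent's fields. The sketch below shows that shape in plain Python. The column naming (an `_idx` suffix for the element position, dotted names for element fields) and the sample document are assumptions made for illustration, not output copied from the connector.

```python
def prejoined_rows(doc, array_field, parent_fields):
    """Return one row per array element, repeating the chosen parent fields."""
    rows = []
    for idx, item in enumerate(doc.get(array_field, [])):
        # Parent columns are copied onto every row, so no join is needed later.
        row = {field: doc[field] for field in parent_fields}
        row[f"{array_field}_idx"] = idx
        for key, value in item.items():
            row[f"{array_field}.{key}"] = value
        rows.append(row)
    return rows

# Hypothetical restaurant document with an array of inspection grades.
restaurant = {
    "_id": 1,
    "name": "Chifa Lima",
    "grades": [
        {"grade": "A", "score": 11},
        {"grade": "B", "score": 17},
    ],
}

rows = prejoined_rows(restaurant, "grades", ["_id", "name"])
for row in rows:
    print(row)
```

The trade-off is wider, repeated rows in exchange for queries that avoid a join, which fits the slide's advice to avoid cross-collection work and non-equijoins.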

Tableau Data Extracts – When to use them?

Extracts Recommended:

• Slow SQL to MQL translation
• Smaller dataset sizes needed
• Offline analysis required
• Reduce “big query” impact on nominal workload performance**

Live Connection Recommended:

• Fast SQL to MQL translation
• Larger dataset sizes needed
• Real-time analysis required

Optimize your Tableau Data Extracts

• Extract Sampling Techniques
  • Filters
    • Keep only well-known dimensions and measures
    • Use short date ranges
  • Aggregates
    • Aggregate dimensions and measures when possible
    • Roll-up dates when possible
  • Samples
    • Utilize Custom SQL with sample function
  • Top N
    • May be skewed since non-random sampling
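The four extract-reduction techniques listed for this slide (filters, aggregates, samples, Top N) can be compared on a small synthetic dataset. The sketch below uses plain Python rather than Tableau or Custom SQL, and the daily sales records are invented. It is meant only to show how each technique shrinks the data and why Top N is not a representative sample.

```python
import random
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical data: one sales record per day of 2016.
rng = random.Random(42)
start = date(2016, 1, 1)
records = [{"day": start + timedelta(days=i), "amt": rng.randint(1, 100)}
           for i in range(366)]

def filter_range(rows, first, last):
    """Filter: keep a short date range."""
    return [r for r in rows if first <= r["day"] <= last]

def rollup_by_month(rows):
    """Aggregate: roll daily amounts up to (year, month) totals."""
    totals = defaultdict(int)
    for r in rows:
        totals[(r["day"].year, r["day"].month)] += r["amt"]
    return dict(totals)

def random_sample(rows, n, seed=0):
    """Sample: n rows chosen uniformly at random."""
    return random.Random(seed).sample(rows, n)

def top_n(rows, n):
    """Top N: the n largest amounts, which is not a random sample."""
    return sorted(rows, key=lambda r: r["amt"], reverse=True)[:n]

def mean_amt(rows):
    return sum(r["amt"] for r in rows) / len(rows)

q1 = filter_range(records, date(2016, 1, 1), date(2016, 3, 31))
monthly = rollup_by_month(records)
print(len(records), "rows ->", len(q1), "after filter,", len(monthly), "after roll-up")
print("mean of all rows:     ", mean_amt(records))
print("mean of random sample:", mean_amt(random_sample(records, 30)))
print("mean of top 30:       ", mean_amt(top_n(records, 30)))
```

The Top N mean sits at or above the overall mean by construction, which is the skew the slide warns about. A random sample has no such built-in bias.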

General Techniques for Improvement

• Partition field as filter
• Single denormalized table
• Monitor for long running queries
• Data blending large datasets
  – Executed on the Tableau client side
• Cull Unnecessary joins
  – …and take advantage of prejoined tables in the BI Connector
  – Imperfectly implemented on many big data systems
  – Assume referential integrity
• Inefficient formulas

Leverage a multi-tiered approach based on your data

[Figure: data tiers: TDE + fast analytical database; aggregated data; prepared data; raw data (large); MongoDB]

Human Scale of Data

• Chunks of Human Consumable Data
• Aggregation of Data Tiers
  • Year to Quarter to Month to Week to Day to Records
  • Region to Country to State to County to Zip Code
• Drill Down to Raw Data with Context
• Use Aggregates for Guided Drilling
• Use Action Filters to Navigate the Pyramid
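Guided drilling through aggregation tiers amounts to re-aggregating at a finer level while carrying the selections made so far as a filter context. A minimal Python sketch of that idea is given here, using one invented record per day of 2016. The function names `keys` and `drill` are hypothetical, and this is not how Tableau action filters are implemented.

```python
from collections import Counter
from datetime import date, timedelta

def keys(d):
    """Break a date into the tier keys used for drilling."""
    return {"year": d.year, "quarter": (d.month - 1) // 3 + 1,
            "month": d.month, "day": d.day}

def drill(days, level, **context):
    """Count records per value of `level`, keeping only rows that match the filter context."""
    counts = Counter()
    for d in days:
        k = keys(d)
        if all(k[name] == value for name, value in context.items()):
            counts[k[level]] += 1
    return dict(counts)

# Hypothetical data: one record per day of 2016.
days = [date(2016, 1, 1) + timedelta(days=i) for i in range(366)]

print(drill(days, "quarter", year=2016))            # top tier: 4 chunks
print(drill(days, "month", year=2016, quarter=1))   # drill into Q1
print(drill(days, "day", year=2016, month=2))       # drill into February
```

Each call returns a small, human-sized result even though the underlying data has many more rows, and the context passed down plays the role of the action filter.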

[Figure: a single consumable chunk of data at the human scale (dashboard). Aggregation levels: Year (4), Month (48), Week (105), Day (90), Raw Data. Controls: Filter Year, Filter Month, Filter Week, Filter Day, Select Week, Select Month, Select Dimension]

In the Weeds

Action Filters: Big Data Secret Weapon

• Use Action Filters to Jump from Tier to Tier with a filter context
• Drill Down to the Details
• Leave the Data in the Appropriate Data Architecture
  • Hot - Analytical Query
  • Warm - Entity Query
  • Cold - Data Discovery

HOT
• Dashboard or Document Acceleration
• High Performance
• Aggregations
• Persistence

WARM
• Row Level Security
• Live Connections
• Core Report Development

COLD
• Data Mining
• Detailed Data
• Raw Data
• Machine Learning

Don’t forget the laws of Data Motion

1. Do you have sufficient infrastructure/hardware to deal with the kind of data that will be analyzed? ~ Law of inertia: nothing moves until sufficient force is applied to move it.

2. Have you chosen an underlying data source that matches your performance aspirations, and have you engineered it for interactive performance? ~ Law of dynamics: force = mass * acceleration.

3. Have you designed your Tableau vizzes so that the queries run efficiently? ~ For every action (viz) there is an equal and opposite reaction (from the data source).

Q & A

Thank You!

Buzz Moschetti, Enterprise Architect
buzz.moschetti@mongodb.com, @buzzmoschetti

Vaidy Krishnan, Senior Product Marketing Manager
vkrishnan@tableau.com
