delivering real time analytics in 1 click
TRANSCRIPT
1 1
©2015 Talend Inc.
Delivering Real Time Analytics in One Click
Jean-Michel Franco - @jmichel_franco
Mark Balkenende - @MarkBalk
2 2
See this presentation online
An online version of this presentation is available at the following URL.
• https://info.talend.com/en_bd_realtime_analytics_oneclick.html
3 3
Your Speakers Today
Jean-Michel Franco, Product Marketing Director, Data Governance Products
Mark Balkenende, Manager, Technical Product Marketing
4 4
Connecting the Data-Driven Enterprise
Data-Driven companies…
• 23 times greater customer acquisition
• 6 times greater customer retention
• 19 times more profitability
5 5
Connecting the data-driven enterprise with information as an asset
6 6
So you’re getting ready to roll out your data lake
7 7
But will this finally deliver on the promise of analytics?
In most companies, fewer than 10% of employees have access to BI and analytic systems.
8 8
Of course, you can leverage data discovery, data visualization and predictive analysis
9 9
Source: September 20, 2011, “Understanding The Business Intelligence Growth Opportunity” Forrester report
But the scope and reach of analytics have expanded
NOW
10 10
What you need to design is a data refinery
11 11
BI as we believe it should go
The three new dimensions of analytics
Build an agile and manageable data integration layer
From dashboard to analytical application
Predictive analytics and machine learning
Embed analytics in your operational processes
Provisioning the data
Designing the system of insights
Operationalize your analytics
Big Data integration
Big Data analytics
Data integration & preparation
12 12
Build an agile and manageable integration layer
Data Inventory
Data Preparation
Master Data Mgmt.
Data Integration
Create your data catalog.
Profile the data.
Augment and connect.
Productize the data flows.
Sanction the data.
Share and monitor.
13 13
Big data and open source are opening new horizons for data scientists
Designing the system of insights
• The data scientist role is finally recognized as essential to success in analytics
• Democratization of analytics and machine learning technologies
- Open source tools: RapidMiner, KNIME, R …
- Cloud-based machine learning platforms: Google Prediction API, Azure ML, Amazon ML…
- A wider range of high-end solutions: Blue Yonder, Watson, SAS, BigML…
• Better options for operationalizing analytics, rather than using them mostly on an ad hoc basis
- Run the model in place, with schema-on-read, where the big data resides in Hadoop
- Robust options for deploying models are now emerging (Mahout, Spark ML)
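The schema-on-read idea mentioned on this slide can be sketched in a few lines of plain Python. This is only an illustration, not Hadoop or Talend code: the sample events, the `SCHEMA` mapping and the `read_with_schema` helper are all hypothetical names invented for this example. Raw records are stored untouched, and types and defaults are applied only when the data is read.

```python
import json

# Raw events are stored exactly as they arrived; no schema is enforced on write.
# (Made-up sample data for illustration.)
RAW_EVENTS = [
    '{"user": "u1", "amount": "19.90", "ts": "2015-06-01T10:00:00"}',
    '{"user": "u2", "amount": 5, "channel": "mobile"}',
    'not valid json',
]

# Hypothetical schema chosen by the reader at query time: field -> (cast, default).
SCHEMA = {"user": (str, None), "amount": (float, 0.0), "channel": (str, "web")}


def read_with_schema(lines, schema):
    """Yield one typed record per parseable line, applying the schema on read."""
    for line in lines:
        try:
            raw = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of failing the whole read
        if not isinstance(raw, dict):
            continue
        # Keep only the fields the schema asks for; cast present values,
        # fall back to the default for missing ones.
        yield {
            field: cast(raw[field]) if field in raw else default
            for field, (cast, default) in schema.items()
        }


records = list(read_with_schema(RAW_EVENTS, SCHEMA))
```

A different analysis could read the same raw lines with a different schema, which is the flexibility the slide points to.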
14 14
Operationalize your analytics
Enterprise Apps
Market Data
Sensors
Logs
Digital applications
Data Integration
Real-time data & application integration
Data warehouse & marts
Ad hoc analysis & mining
Reporting
Data Lake
Data profiling & preparation
Data Discovery
Data modeling
The Data Lab
The Data Factory
Data Hub
Data flows
Predictions & prescriptions
Embedded analytics
15 15
Easiest and Most Powerful Integration Solution for Big Data
Introducing Talend Big Data
16 16
Future-Proof Architecture
ETL: Day-to-day integration
ELT: DW appliance
Camel: Message transformation
Hadoop: Highly scalable
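The difference between the ETL and ELT styles listed on this slide can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under stated assumptions, not how Talend generates jobs: the table names, the `ORDERS` sample rows and the `etl` and `elt` functions are hypothetical, and SQLite stands in for a data warehouse appliance.

```python
import sqlite3

# Made-up source rows: (order id, amount as text, currency in mixed case).
ORDERS = [("a1", "19.90", "eur"), ("a2", "5.00", "EUR"), ("a3", "7.50", "usd")]


def etl(conn, rows):
    """ETL: transform in the integration layer, then load the clean rows."""
    conn.execute("CREATE TABLE orders_etl (id TEXT, amount REAL, currency TEXT)")
    clean = [(oid, float(amount), cur.upper()) for oid, amount, cur in rows]
    conn.executemany("INSERT INTO orders_etl VALUES (?, ?, ?)", clean)


def elt(conn, rows):
    """ELT: load the raw rows first, then let the database engine transform them."""
    conn.execute("CREATE TABLE orders_raw (id TEXT, amount TEXT, currency TEXT)")
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?, ?)", rows)
    conn.execute(
        "CREATE TABLE orders_elt AS "
        "SELECT id, CAST(amount AS REAL) AS amount, UPPER(currency) AS currency "
        "FROM orders_raw"
    )


conn = sqlite3.connect(":memory:")
etl(conn, ORDERS)
elt(conn, ORDERS)
etl_rows = conn.execute("SELECT * FROM orders_etl ORDER BY id").fetchall()
elt_rows = conn.execute("SELECT * FROM orders_elt ORDER BY id").fetchall()
```

Both paths end with the same cleaned table. The design choice is where the transformation runs: in the integration engine (ETL) or inside the target database (ELT).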
17 17
Simplify Real-Time Big Data
100x performance increase
< 1 sec response
Address new use cases
(last minute defense, dynamic pricing, real-time fraud detection, CEP, etc.)
New components for streaming data
18 18
Spark integration in Talend Studio
Apache
• Technical Preview
• Machine learning components require a Talend Big Data Platform license
• Implementation of the Spark, MLlib and Spark Streaming APIs
• 17 components for data integration
- Data integration: Load, Connection, Sample, FilterRow, FilterColumns, Normalize, Union, Replicate, Aggregate, Sort, Join, Uniq, Log, Store
- Machine learning and data quality: Sample, ALS Model, Recommend
"Don't assume you can easily port existing applications to Spark from another data-processing model, like MapReduce. Moving to Spark means a complete reimplementation, and the potential benefits must outweigh that cost."
Nick Heudecker - Gartner
19 19
Otto Optimizes Pricing & Stock
A company that’s doing everything right
Challenge:
• Ever increasing Big Data velocity
• Many last minute cart abandonments
• Hard to optimize pricing
Why Talend:
• Is the central integration tool within their Business Intelligence (BI) organization.
• Integrates clickstreams from the last 6 months
Value:
• Leftover merchandise reduced by 20%
• Can predict abandoned shopping carts in real time with 90% accuracy
• Performs dynamic pricing
20 20
Demonstration
Key capabilities
• Drives the learning process by integrating data in Hadoop and launching the MLlib learning process
• Drives the recommendation process by ingesting demographics data into the engine, and integrating the output into any application or data target.
Business Benefits
• Hides the underlying complexity of Hadoop and Spark
• Easily embed machine learning into any application or data target
• Machine learning with precision and at scale
• Predictive analysis for the rest of us
Demographics data Big Data
tSparkALSModel
tSparkRecommend
Test
Run
Training data
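The two-step flow in this demonstration (train a model with tSparkALSModel, then score with tSparkRecommend) can be illustrated with a toy alternating least squares (ALS) recommender. This sketch is neither Talend's nor MLlib's implementation: it is pure Python, uses a single latent factor (rank 1) to keep the algebra to one line per update, and the ratings, the regularization value and every function name are assumptions made for illustration.

```python
# Observed ratings as (user, item) -> rating; made-up data for illustration.
RATINGS = {
    ("ann", "shoes"): 5.0, ("ann", "boots"): 4.0,
    ("bob", "shoes"): 4.0, ("bob", "boots"): 3.0, ("bob", "scarf"): 1.0,
    ("cat", "scarf"): 5.0, ("cat", "gloves"): 4.0,
    ("dan", "shoes"): 5.0, ("dan", "gloves"): 1.0,
}


def objective(ratings, x, y, reg):
    """Squared error on observed ratings plus L2 penalty on all factors."""
    err = sum((r - x[u] * y[i]) ** 2 for (u, i), r in ratings.items())
    penalty = sum(v * v for v in x.values()) + sum(v * v for v in y.values())
    return err + reg * penalty


def train_als(ratings, iterations=20, reg=0.1):
    """Rank-1 ALS: alternately solve for user factors and item factors."""
    users = sorted({u for u, _ in ratings})
    items = sorted({i for _, i in ratings})
    x = {u: 1.0 for u in users}  # one latent factor per user
    y = {i: 1.0 for i in items}  # one latent factor per item
    history = []
    for _ in range(iterations):
        # Item factors fixed: each user factor is a 1-D ridge regression.
        for u in users:
            num = sum(r * y[i] for (uu, i), r in ratings.items() if uu == u)
            den = reg + sum(y[i] ** 2 for (uu, i) in ratings if uu == u)
            x[u] = num / den
        # User factors fixed: same closed-form update for each item factor.
        for i in items:
            num = sum(r * x[u] for (u, ii), r in ratings.items() if ii == i)
            den = reg + sum(x[u] ** 2 for (u, ii) in ratings if ii == i)
            y[i] = num / den
        history.append(objective(ratings, x, y, reg))
    return x, y, history


def recommend(user, ratings, x, y, top_n=2):
    """Rank the items the user has not rated by predicted score."""
    seen = {i for (u, i) in ratings if u == user}
    scores = {i: x[user] * y[i] for i in y if i not in seen}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


x_model, y_model, history = train_als(RATINGS)
recs = recommend("ann", RATINGS, x_model, y_model)
```

Because each half-step solves its sub-problem exactly, the regularized objective recorded in `history` never increases from one iteration to the next. A production model would use more latent factors and a distributed solver, which is what the Spark components on the slide are meant to hide.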
21 21
Start now with the Talend Big Data Sandbox
Virtual Image installed with
• Multiple scenarios for you to try:
- Clickstream data
- Twitter sentiment
- Apache weblogs
- ETL Offload
- Recommendations through Spark Machine Learning
Download your free Talend Big Data Sandbox today!
http://www.talend.com/talend-big-data-sandbox