rapidminer overview - aptusdatalabs.com · the rapidminer data science platform lightning fast real...
RapidMiner Highlights
Analysts
• Leader 2014, 2015, 2016 & 2017: Gartner Magic Quadrant for Data Science Platforms
• Leader 2017: Predictive Analytics & Machine Learning
• Innovation Winner 2015: Wisdom of Crowds for Advanced & Predictive Analytics, Big Data Analytics & End-User Data Prep
• #1 Open-Source Platform, last five years in a row: Data Mining & Analytics Software Poll
By the numbers
• #1 Data Science Platform
• 200,000+ Engaged Community Members
• 250+ Global Clients
• 50+ Channel Partners
Accolades
• CB Insights, The AI 100, 2017: “100 Startups Using Artificial Intelligence to Transform Industries”
• Ventana Research, 2016 Technology Innovation Awards Winner: Predictive Analytics
Insight Without Action Has No Value
Data Science
Big Data
Machine Learning
Human / Automated Actions
Data Visualization
Analytic Data Marts
Drilldown
Current Insight
Business Intelligence
Database
Sums & Counts
Historical Information
Step Five
Passive
Reactive
Proactive
Analytics 1.0: Descriptive
Analytics 2.0: Diagnostic
Analytics 3.0*: Predictive & Prescriptive
*First referenced by Thomas H. Davenport, HBR, December 2013
High Value Use Cases Need Real Data Science
Drive Revenue: +50% new revenue opportunities*
Reduce Costs: -34% realized cost savings*
Avoid Risks: +46% increased profitability*
*Ventana Research Next Generation Predictive Analytics Benchmark Research, 2015
Customer Analytics
• Customer Acquisition
• Cross-sell/Upsell
• Offer Optimization
• Retention & Loyalty
• Win back
• Channel / Mix Optimization
• Web Analytics
• Pricing Optimization
• …
Operational Analytics
• Supply Chain Optimization
• Manufacturing Operations
• Asset Performance
• Process Engineering
• Capacity Planning
• Call Center Operations
• Retail Store Operations
• Predictive Maintenance
• IT Operations
• …
Risk Analytics
• Credit Scoring
• Insurance Underwriting
• Capital Planning
• Stress Testing
• Fraud Detection
• Anti-Money Laundering
• Rogue Trading
• Cyber Security
• Compliance
• …
Automotive
Banking
Insurance
Government
e-Health
Travel, Transport & Logistics
Life Sciences
Oil & Gas
Manufacturing
Utilities
Retail & Consumer Goods
Telco
Lightning-Fast Unified Platform
Incorporate all types of data
Data Prep: Speed & optimize ALL data exploration, blending & cleansing tasks
• Data selection
• Data cleaning
• Data integration
• Data formatting
• Data exploration
Model & Validate: Apply machine learning to rapidly prototype & confidently validate predictive models
• Modeling
• Cross validation
• Model optimization
• Model management
• Model export
Operationalize: Easily deploy & maintain models and embed analytic results
• Model deployment
• Scoring as web service
• Model monitoring
• Reporting and visualization
• Maintenance
Embed results in all types of business apps & data visualization tools
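The "Cross validation" step listed above can be sketched in plain Python. This is only an illustration of k-fold validation in general, not RapidMiner's implementation: the function names, the toy majority-class "model", and the churn labels are all assumptions made for the example.

```python
import random
from collections import Counter


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and split them into k nearly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def majority_class(labels):
    """Toy stand-in for a model: always predict the most common training label."""
    return Counter(labels).most_common(1)[0][0]


def cross_validate(labels, k=5, seed=0):
    """Return the mean held-out accuracy of the toy model over k folds."""
    folds = k_fold_indices(len(labels), k, seed)
    scores = []
    for held_out in folds:
        held_out_set = set(held_out)
        # Train on every row that is not in the held-out fold.
        train = [labels[i] for i in range(len(labels)) if i not in held_out_set]
        prediction = majority_class(train)
        hits = sum(1 for i in held_out if labels[i] == prediction)
        scores.append(hits / len(held_out))
    return sum(scores) / len(scores)


# Hypothetical data: 80 customers who stayed and 20 who churned.
labels = ["stay"] * 80 + ["churn"] * 20
folds = k_fold_indices(len(labels), k=5)
accuracy = cross_validate(labels, k=5)
```

Each row is held out exactly once, so the averaged score estimates how the model behaves on data it has not seen. A real workflow would replace `majority_class` with an actual learner.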
The RapidMiner Competitive Advantage
Lightning Fast Data Science: Powerful, visual & guided use of 1,500 data prep and machine learning functions & third party libraries
Unified Platform: Prototype, Substantiate, Operationalize; seamless, high performance orchestration
#1 Marketplace for Data Science Expertise: On-demand consultants, algorithms & extensions; global presence & domain expertise in every industry
Real data science, fast and simple.
RapidMiner Platform & Pricing (1 year subscription shown)

RapidMiner Studio
• Visual Workflow Designer
• Guided Analytics & Reusable Processes
• Wealth of Predictive Algorithms & Functions
Studio Free
Studio Small: $2,500 per user
Studio Medium: $5,000 per user
Studio Large: $10,000 per user
[Chart: Studio editions by Data Rows (10,000; 100,000; 1,000,000; Unlimited) and Cores (1; 2; 4; 8; Unlimited)]

RapidMiner Server
• Collaborate & Share
• Compute
• Integrate
• Operationalize
Server Free
Server Small: $15,000 per instance, 2x performance
Server Medium: $30,000 per instance, 4x performance
Server Large: $60,000 per instance, 10x+ performance
[Chart: Server editions by Data Rows (10,000; 100,000; 1,000,000; Unlimited) and Cores (1; 2; 4; 8; Unlimited)]
Row limits in Studio apply when using Server or Radoop, limiting the data a user can use.

RapidMiner Radoop
• Execute Data Science Workflows Seamlessly on Hadoop
• Analysis upon the full breadth & variety of stored big data
Radoop Free: 70+ native Hadoop operators only
Radoop Enterprise: first user $15,000, each additional user $5,000. Executes all 1500+ RapidMiner functions plus 70+ native Hadoop operators

Free product versions receive community support.
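As a worked example of the list prices on this slide, the following sketch totals a one-year subscription for a team. The prices are taken from the slide; the function names and the sample team composition are hypothetical, and no discounts or taxes are modeled.

```python
# List prices from the slide, in USD for a one-year subscription.
STUDIO_PRICE_PER_USER = {"free": 0, "small": 2_500, "medium": 5_000, "large": 10_000}
SERVER_PRICE_PER_INSTANCE = {"free": 0, "small": 15_000, "medium": 30_000, "large": 60_000}
RADOOP_FIRST_USER = 15_000
RADOOP_ADDITIONAL_USER = 5_000


def radoop_enterprise_cost(users):
    """First user costs $15,000; each additional user costs $5,000."""
    if users <= 0:
        return 0
    return RADOOP_FIRST_USER + RADOOP_ADDITIONAL_USER * (users - 1)


def annual_cost(studio_users, studio_tier, server_instances, server_tier, radoop_users=0):
    """Sum the Studio, Server and Radoop Enterprise list prices for one year."""
    studio = studio_users * STUDIO_PRICE_PER_USER[studio_tier]
    server = server_instances * SERVER_PRICE_PER_INSTANCE[server_tier]
    return studio + server + radoop_enterprise_cost(radoop_users)


# Hypothetical team: 4 Studio Medium users, 1 Server Small instance, 3 Radoop users.
total = annual_cost(4, "medium", 1, "small", radoop_users=3)
print(f"${total:,} per year")
```

For this team the total is 4 x $5,000 + $15,000 + ($15,000 + 2 x $5,000) = $60,000.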
Get Successful with RapidMiner
1. Get Started: Jumpstart your enablement and get started fast with free self-service tutorials, videos and the daily demo
2. Get Guidance: Attend product workshops and ask questions of product experts as you build your first machine learning workflows
3. Get Educated & Certified: Develop the essential skills to be successful with the RapidMiner product suite
• Live Online: Virtual instructor-led
• Self-Paced Online: Learn when convenient
• Classroom: Face-to-face at our or your office
• Books
• Videos & In-Product Tutorials
• Webinars
• Demos & Documentation
• Community & Blogs
4. Get Successful: Utilize the experience and expertise of the RapidMiner Customer Success Team
• Customer orientation
• Installation support & guidance
• Implementation planning
• Use case, architecture, best practices
• Training, Certification & Services needs
• Quarterly reviews
5. Get Connected & Contribute: Connect to the RapidMiner community to learn, share and contribute
• 200,000+ members, 34,000+ posts
• Innumerable external blogs, articles, scientific papers & books
RapidMiner Partner Network
• Technology
• Value Added Resellers
• Systems Integrators
• Global Partners
• OEM
rapidminer.com
@rapidminer
Real data science, fast and simple.
RapidMiner Inc.
10 Milk Street
11th Floor
Boston, MA 02108
Boston Budapest Dortmund London
Bridge the Data Science Skills Gap
Chief Executive Officer: Leverage prescriptive analytics in all your decisions to achieve better outcomes
Chief Analytics Officer: Empower operational workers to consume data science in their routine decision making
Coding Data Scientist: Accelerate the creation of high-value data science while streamlining low-value tasks
Applied Data Scientist: Confidently extract the hidden value from your data using intuitive predictive analytics
RapidMiner Data Science Impact
Build Better Predictive Models Faster
Easily Use Predictive Analytics
Operationalize Competitive Advantage
• 95% faster
• 5-10x data science capability
• 50%: Created new revenue opportunities*
• 46%: Increased profitability*
• 39%: Improved customer service*
*Ventana Research Next-Generation Predictive Analytics Benchmark Research, 2015
The RapidMiner Data Science Platform
RapidMiner Studio: Lightning Fast Real Data Science, Code Optional
• Modeling: Efficiently build and deliver better models faster
• Validation: Confidently & accurately estimate model performance
• Data Access: Connect to any data source, any format, at any scale
• Data Exploration: Quickly discover patterns or data quality issues
• Data Prep: Speed & optimize ALL data exploration, blending & cleansing tasks
RapidMiner Server: Seamless Deployment, Management & Collaboration
• Integration: Efficiently build and deliver better models faster
• Management: Confidently & accurately estimate model performance
• Collaboration: Connect to any data source, any format, at any scale
• Computation: Quickly discover patterns or data quality issues
• Scheduling: Speed & optimize ALL data exploration, blending & cleansing tasks
RapidMiner Radoop: Simplified, Intelligent Big Data Science & Machine Learning
• Simplified Analytics: Reduces Hadoop complexity
• Lightning Fast: Covers complete analytics lifecycle
• Broad Data Access: Eliminate connectivity struggles
• Integrated Security: Ensure security compliance
• Optimized for Hadoop: Leverage Hadoop distributed power
• Scalable Processing: Process in-Hadoop and in-memory
• Spark Execution: Execute RapidMiner sub-processes in parallel
RapidMiner Marketplaces: On-demand Innovation & Execution
• Extensive Domain Expertise: Expert marketplace of certified RapidMiner skills
• Plug-ins, Algorithms, Extensions: Product Marketplace to extend and innovate
The RapidMiner Platform
[Architecture diagram; component labels follow]
RapidMiner Studio: Visual Workflow Designer
• Workflow Builder
• Process Execution Engine
RapidMiner Server: Collaborate + Compute + Deploy + Maintain
• Process Execution Engine
• Process Scheduler
• Data and Process Repository
• User/Group Access Rights Management
• Web App Portal
• Web Services
• RapidMiner Web Applications
RapidMiner Radoop: Compile + Execute in Hadoop
RapidMiner Market Place: Industry, Application & ML Extensions
Integrate using Web Services, JSON, SQL, …
• Application (BI, ERP, CRM…) / Portal
• Java SE/EE Application Server
• Application
• Databases / Data warehouses
Incorporate all types of data
Run in multiple Compute Engines
• R / Python / SQL Scripting
• In-Memory H2O / Weka
• In-Hadoop & Spark
RapidMiner Studio
All-In-One Data Science Workflow Designer
Lightning Fast: Visual interface for rapidly building complete analytic workflows
Powerful: Rich library of algorithms and functions to build the strongest possible model for any use case
Open & Extensible: Open source innovation keeps pace with changing business needs
RapidMiner Server
Operationalization & Collaboration Management
Team Collaboration: Central repository facilitates sharing of data sources, analytic processes & best practices
Frictionless Operationalization: Flexible execution options streamline deployment, maintenance & embedding of analysis
Dynamic & Continuous Model Management: Individual and customizable processes to check for accuracy drifts or shifts
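A check for accuracy drift, as mentioned above, can be illustrated with a few lines of Python. This is a generic sketch, not RapidMiner Server functionality: the function names, the 0.90 baseline, the 0.05 tolerance and the sample labels are all assumed for the example.

```python
def accuracy(predictions, actuals):
    """Fraction of predictions that match the observed outcomes."""
    if not actuals or len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must be non-empty and equal in length")
    return sum(p == a for p, a in zip(predictions, actuals)) / len(actuals)


def has_drifted(baseline_accuracy, recent_predictions, recent_actuals, tolerance=0.05):
    """Flag drift when recent accuracy falls more than `tolerance` below the baseline."""
    recent = accuracy(recent_predictions, recent_actuals)
    return (baseline_accuracy - recent) > tolerance, recent


# Hypothetical monitoring window: 10 recently scored cases with known outcomes.
recent_predictions = ["ok", "ok", "fraud", "ok", "ok", "fraud", "ok", "ok", "ok", "ok"]
recent_actuals = ["ok", "fraud", "fraud", "ok", "fraud", "fraud", "ok", "ok", "fraud", "ok"]

drifted, recent_accuracy = has_drifted(0.90, recent_predictions, recent_actuals)
if drifted:
    print(f"Accuracy dropped to {recent_accuracy:.0%}; consider retraining the model")
```

Run on a schedule, a check like this turns model monitoring into an alert instead of a manual review.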
RapidMiner Radoop
Extends RapidMiner’s visual workflow to Hadoop
Hadoop Made Easy: Translates data science workflows into Hadoop so data scientists concentrate on analytics, not Hadoop programming
In-Hadoop Execution: Pushes analytic instructions into Hadoop for computation
Secure: Complies with Hadoop security standards
Sample Use Cases
Telco - Switzerland
Server & Equipment Load Forecasting, Predictive
Maintenance, Predicting & Preventing Server & Component
Failures
Telco - Austria
Automated Customer Feedback Text Analysis for Automated E-Mail
Categorization & Routing
Telco – Hungary
Customer Relationship Analytics, Churn Prediction & Prevention, Direct
Marketing Campaign Optimization, Scheduling & Automated Execution of
ETL Tasks
Telco – Germany
Automated Online Market Research, Text Analytics, Sentiment Analysis,
Customer Insight
Marketing – Germany
Automated Online Market Research, Text & Sentiment Analysis, Customer
Insight, Competitive Intelligence
Telco – Germany
Fraud Detection & Prevention
OEM – Europe
Fraud Detection & Prevention Solutions for Telecoms
Telco – Europe
CRM applications including optimization of direct marketing
campaigns, automated generation of product recommendations for cross-
selling and up-selling, customer churn prevention, and fraud detection
Payments – Worldwide
Sentiment Analysis of online text sources, including social media and
other user generated content for customer care triage
Telco - Austria
Optimize customer support by automatically categorizing unstructured data by content, prioritizing requests and reducing response time and cost, thereby increasing customer satisfaction
Payments – Worldwide
Customer feedback & voice of the customer: text mining, automated text categorization, and sentiment analysis applied to customer support and satisfaction to prevent customer churn
Market Research – Worldwide
Prediction of sales volumes; CRM optimization; social media
monitoring and sentiment analysis
Sample Customer Use Cases
Multiple Customers, Industries
Automated Customer Feedback Text Analysis for Automated E-Mail / Social
Media, Categorization, Triage & Routing
Partner - Europe
Smart meter installation optimization as a service – maximize first time visit
success
Market Research – Worldwide Org
Prediction of sales volumes; CRM optimization; social media monitoring and
sentiment analysis
Automated Customer Feedback Text Analysis for Automated E-Mail Categorization &
Routing
Telco – Europe
CRM applications including optimization of direct marketing campaigns,
automated generation of product recommendations for cross-selling and up-selling, customer churn prevention,
and fraud detection
Payments – Worldwide
Sentiment Analysis of online text sources, including social media and other user generated content for
customer care triage
Payments - Russia
Fraud detection in a retail network using historical data on service usage, transaction history, customer profiles, usage logs, and known cases of fraudulent behavior
Sample Customer Use Cases
Voice of the Customer
Automated Customer Feedback Text Analysis for Automated E-Mail / Social
Media, Categorization, Triage & Routing
Manufacturing – Predictive Maintenance
High Value Assets - Silicon, Cars, Trucks, Aircraft, Turbines, IT
Infrastructure,…
Maximizing Customer Lifetime Value
CRM applications including optimization of direct marketing campaigns,
automated generation of product recommendations for cross-selling and up-selling, customer churn prevention,
and fraud detection
Manufacturing – Production Optimization
Optimization Of Production Logistics & Flows, Quality, Yield, Product Mix, Process
Mining
Fraud Detection
Fraud detection in a retail network using historical data on service usage, transaction history, customer profiles, usage logs, and known cases of fraudulent behavior
Safeguarding Electronic Payments
The Challenge
• Protecting against fraud and anticipation of risk 7x24
• Large and diverse set of partners (merchants) – over 70,0000
• How to classify and check merchant ecommerce sites for payment system compliance?
RapidMiner Solution
• Analyze, classify and check merchants’ ecommerce sites for compliance
• Utilize text mining with NLP to auto-categorize with high sentiment accuracy
• Mash up the widest data sets: historical data on service usage, transaction history, customer profiles, usage logs, and known cases of fraudulent behavior
• Detect anomalies, misuse and fraud through operationalized classification model
Outcome
• Only 8-10% of merchant sites now screened manually at 80% confidence threshold
• Accurate automated analysis of high-risk sites: 92% correctly classified
• Elimination of false positives: no normal sites classified as high risk
• Time and cost to resolve fraud case radically reduced
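The idea of screening sites manually only below a confidence threshold can be sketched as follows. The 80% threshold comes from the slide; the `Prediction` class, the `route` function and the sample sites are hypothetical, and the customer's actual classification model is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    site: str          # merchant site identifier (hypothetical examples below)
    label: str         # class assigned by the model, e.g. "compliant" or "high_risk"
    confidence: float  # model confidence in that label, between 0 and 1


def route(predictions, threshold=0.80):
    """Accept confident predictions automatically; send the rest to manual review."""
    automated, manual = [], []
    for prediction in predictions:
        if prediction.confidence >= threshold:
            automated.append(prediction)
        else:
            manual.append(prediction)
    return automated, manual


predictions = [
    Prediction("shop-a.example", "compliant", 0.97),
    Prediction("shop-b.example", "high_risk", 0.91),
    Prediction("shop-c.example", "compliant", 0.62),
    Prediction("shop-d.example", "compliant", 0.88),
]

automated, manual = route(predictions)
manual_share = len(manual) / len(predictions)
print(f"{manual_share:.0%} of sites need manual screening")
```

Raising the threshold sends more sites to analysts; lowering it automates more decisions at the cost of acting on less certain predictions.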
Anticipating the risk of fraud
Russia’s largest electronic payment service
Repeat Business through Marketing Efficacy
The Challenge
• Industry with tight margins & intense competition
• Broad array of online & mobile channels for customers to place orders
• Goal to improve marketing offers and create more repeat business
RapidMiner Solution
• Capture a vast array of customer ordering data from multiple online & mobile phone channels
• Use RapidMiner to join & enrich data with 3rd-party demographics & competitive data
• Use data science to assess performance and growth drivers at individual stores & franchise groups
• Results used to tailor coupons & upsell offers to customers
Outcome
• Greater flow of repeat customers, driving growth at individual stores and franchise groups
• Far outpaced the industry: Posted best Q2 & Q3 domestic same-store sales growth of the 25 largest restaurant chains in the U.S.
• Next steps: RapidMiner Radoop
Identify upsell offers through deep customer analytics
Large North American
restaurant delivery chain
Customer Satisfaction through Quality of Service
The Challenge
• Backend infrastructure footprint & costs increasing yearly
• Customer satisfaction driven by service quality in areas such as video streaming latency
• Network operation teams must accelerate root cause analysis, reduce time to repair
• Data visualization with big data alone cannot provide the operationalized insight needed
RapidMiner Solution
• Secure large scale Hortonworks Hadoop Big Data Hub architecture to leverage data lakes
• Correlation of log events with historical log data to preempt service quality degradation
• Through machine learning, rapidly predict demand as consumer usage patterns change
• Utilize text mining to optimize help desk ticket triage and processing
Outcome
• Reduce infrastructure requirements (-10%)
• Improved customer retention (2%+)
• IT Operations costs reduced (-30%)
Customer experience begins with network quality
Leading European Telecoms Provider
Drive Data Science Agility & Cut Costs
The Challenge
• Existing data science teams looking to replace SAS
– Strong dislike of unwieldy SAS platform with the coding & complexity of its multiple applications & user interfaces
– Cost of SAS too high
RapidMiner Solution
• Pull together customer data from across a number of internal databases & third-party sources
• Easily incorporate a large library of legacy predictive models written in R & Python
• Small team of 4 data scientists using collaboration features in RapidMiner Server to share data prep and machine learning processes
Outcome
• Improved upsell opportunities and customer retention
• Speeds the process of data prep, rapid prototyping & validation of models over SAS methods and coding-only methods
• Expansion into Risk department where data science team doesn’t code in SAS, R or Python
Faster development & deployment of customer analytics models
Leading North American Financial Services Institution
Gartner & Forrester – RapidMiner a Clear Leader
“…a Leader, owing to its market presence, the volume of client inquiries that Gartner receives about it, its user community, and its well-rounded product that addresses most data science use cases well.”
“Reference customers praised many facets of the platform — its large selection of algorithms, flexible modeling capabilities, data source integration and consequent data preparation. The platform's strength lies not just in particular areas, but also in its all-around consistency.”
2017
“RapidMiner wraps breadth and depth in a beautiful package.
RapidMiner invested heavily to revamp visual interface to make it the most concise and fluid that we have seen during this evaluation. Add to that, RapidMiner’s comprehensive set of operators that encapsulate a wide range of data prep, analytical, and modeling functionality to increase
productivity of data scientists.”
Magic Quadrant for Data Science Platforms
PAML Wave