rapidminer overview - aptusdatalabs.com · the rapidminer data science platform lightning fast real...

Real data science, fast and simple. RapidMiner Overview

Upload: buiduong

Post on 10-Mar-2019


Real data science, fast and simple.

RapidMiner

Overview

2

Analysts

RapidMiner Highlights

By the

numbers

#1 Open-Source

Platform Last five years in a row

Data Mining &

Analytics Software Poll

Leader 2017

Predictive Analytics

& Machine Learning

#1

Data Science

Platform

200,000+

Engaged

Community

Members

250+

Global

Clients

Channel

Partners

50+

Innovation Winner 2015

Wisdom of Crowds for Advanced

& Predictive Analytics, Big Data

Analytics & End-User Data Prep

Leader 2014, 2015, 2016 & 2017

Gartner Magic Quadrant

for Data Science Platforms

Accolades

CB Insights: The AI 100, 2017

“100 Startups Using Artificial Intelligence

to Transform Industries”

Ventana Research: 2016 Technology Innovation

Awards Winner

Predictive Analytics

3

Insight Without Action Has No Value

Data Science

Big Data

Machine Learning

Human / Automated Actions

Data Visualization

Analytic Data Marts

Drilldown

Current Insight

Business Intelligence

Database

Sums & Counts

Historical Information

Step Five

Passive

Reactive

Proactive

Analytics 1.0: Descriptive

Analytics 2.0: Diagnostic

Analytics 3.0*: Predictive & Prescriptive

*First referenced by Thomas H Davenport, HBR December 2013

4

High Value Use Cases Need Real Data Science

Drive Revenue: +50% new revenue opportunities*
Reduce Costs: -34% realized cost savings*
Avoid Risks: +46% increased profitability*

*Ventana Research Next Generation Predictive Analytics Benchmark Research, 2015

Customer Analytics
• Customer Acquisition
• Cross-sell/Upsell
• Offer Optimization
• Retention & Loyalty
• Win back
• Channel / Mix Optimization
• Web Analytics
• Pricing Optimization
• …

Risk Analytics
• Credit Scoring
• Insurance Underwriting
• Capital Planning
• Stress Testing
• Fraud Detection
• Anti-Money Laundering
• Rogue Trading
• Cyber Security
• Compliance
• …

Operational Analytics
• Supply Chain Optimization
• Manufacturing Operations
• Asset Performance
• Process Engineering
• Capacity Planning
• Call Center Operations
• Retail Store Operations
• Predictive Maintenance
• IT Operations
• …

• Automotive
• Banking
• Insurance
• Government
• e-Health
• Travel, Transport & Logistics
• Life Sciences
• Oil & Gas
• Manufacturing
• Utilities
• Retail & Consumer Goods
• Telco

5

Lightning-Fast Unified Platform

Incorporate all types of data

Data Prep: Speed & optimize ALL data exploration, blending & cleansing tasks
• Data selection
• Data cleaning
• Data integration
• Data formatting
• Data exploration

Model & Validate: Apply machine learning to rapidly prototype & confidently validate predictive models
• Modeling
• Cross validation
• Model optimization
• Model management
• Model export

Operationalize: Easily deploy & maintain models and embed analytic results
• Model deployment
• Scoring as web service
• Model monitoring
• Reporting and visualization
• Maintenance

Embed results in all types of business apps & data visualization tools
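Cross validation, listed under Model & Validate, can be illustrated with a minimal k-fold sketch. This is a generic illustration in plain Python, not RapidMiner's own operator. The function names and the toy majority-class "model" are assumptions chosen to keep the example self-contained.

```python
from collections import Counter


def k_fold_indices(n_rows, k):
    """Split row indices 0..n_rows-1 into k contiguous, near-equal folds.

    Assumes 1 <= k <= n_rows so that no fold is empty.
    """
    base, extra = divmod(n_rows, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def majority_class(labels):
    """Toy stand-in for a model: always predict the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


def cross_validate(labels, k):
    """Return the mean held-out accuracy of the toy model over k folds."""
    scores = []
    for test_idx in k_fold_indices(len(labels), k):
        held_out = set(test_idx)
        # Train only on rows outside the current fold.
        train = [y for i, y in enumerate(labels) if i not in held_out]
        prediction = majority_class(train)
        hits = sum(1 for i in test_idx if labels[i] == prediction)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)
```

Each row is held out exactly once, so the averaged score estimates how the model would perform on data it has not seen. A real workflow would swap the toy predictor for an actual learner.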

6

The RapidMiner Competitive Advantage

Lightning Fast Data Science
Powerful, visual & guided use of 1,500 data prep and machine learning functions & third party libraries

Unified Platform
Prototype – Substantiate – Operationalize: seamless, high performance orchestration

#1 Marketplace for Data Science Expertise
On-demand consultants, algorithms & extensions; global presence & domain expertise in every industry


7

RapidMiner Platform & Pricing (1 year subscription shown)

RapidMiner Studio
• Visual Workflow Designer
• Guided Analytics & Reusable Processes
• Wealth of Predictive Algorithms & Functions

Studio Free
Studio Small: $2,500 per user
Studio Medium: $5,000 per user
Studio Large: $10,000 per user

RapidMiner Server
• Collaborate & Share
• Compute
• Integrate
• Operationalize

Server Free
Server Small: $15,000 per instance (2x performance)
Server Medium: $30,000 per instance (4x performance)
Server Large: $60,000 per instance (10x+ performance)

[Chart, shown for both Studio and Server. Data Rows axis: 10,000 / 100,000 / 1,000,000 / Unlimited. Cores axis: 1 / 2 / 4 / 8 / Unlimited.]

Row limits in Studio apply when using Server or Radoop, so limiting the data a user can use.

RapidMiner Radoop
• Execute Data Science Workflows Seamlessly on Hadoop
• Analysis upon the full breadth & variety of stored big data

Radoop Free: 70+ native Hadoop operators only
Radoop Enterprise: First user $15,000; each additional user $5,000. Executes all 1500+ RapidMiner functions plus 70+ native Hadoop operators

Free product versions receive community support.
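The list prices on this slide combine by simple arithmetic, which the sketch below works through. The dollar figures are transcribed from the slide. The function names are hypothetical, and the sketch assumes prices add up with no bundles or discounts, which the slide does not state.

```python
# List prices transcribed from the slide (USD, 1 year subscription).
STUDIO_PER_USER = {"Free": 0, "Small": 2_500, "Medium": 5_000, "Large": 10_000}
SERVER_PER_INSTANCE = {"Free": 0, "Small": 15_000, "Medium": 30_000, "Large": 60_000}
RADOOP_FIRST_USER = 15_000
RADOOP_ADDITIONAL_USER = 5_000


def radoop_enterprise_cost(users):
    """First user at $15,000, each additional user at $5,000."""
    if users < 0:
        raise ValueError("users must be non-negative")
    if users == 0:
        return 0
    return RADOOP_FIRST_USER + RADOOP_ADDITIONAL_USER * (users - 1)


def annual_cost(studio_tier, studio_users, server_tier, server_instances, radoop_users):
    """Sum the three product lines, assuming purely additive list pricing."""
    return (
        STUDIO_PER_USER[studio_tier] * studio_users
        + SERVER_PER_INSTANCE[server_tier] * server_instances
        + radoop_enterprise_cost(radoop_users)
    )
```

For example, four Studio Medium users, one Server Small instance and three Radoop Enterprise users come to 4 × $5,000 + $15,000 + ($15,000 + 2 × $5,000) = $60,000 per year under these assumptions.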

8

Get Successful with RapidMiner

1. Get Started
Jumpstart your enablement and get started fast with free self-service tutorials, videos and the daily demo

2. Get Guidance
Attend product workshops and ask questions of product experts as you build your first machine learning workflows

3. Get Educated & Certified
Develop the essential skills to be successful with the RapidMiner product suite
• Live Online: Virtual instructor-led
• Self-Paced Online: Learn when convenient
• Classroom: Face-to-face at our or your office

4. Get Successful
Utilize the experience and expertise of the RapidMiner Customer Success Team
• Customer orientation
• Installation support & guidance
• Implementation planning
• Use case, architecture, best practices
• Training, Certification & Services needs
• Quarterly reviews

5. Get Connected & Contribute
Connect to the RapidMiner community: learn, share, contribute
• 200,000+ members, 34,000+ posts
• Innumerable external blogs, articles, scientific papers & books

• Books
• Videos & In-Product Tutorials
• Webinars
• Demos & Documentation
• Community & Blogs

9

RapidMiner Partner Network
• Technology
• Value Added Resellers
• Systems Integrators
• Global Partners
• OEM

rapidminer.com

@rapidminer

Real data science, fast and simple.

RapidMiner Inc.

10 Milk Street

11th Floor

Boston, MA 02108

Boston Budapest Dortmund London

11

Additional Content

12

Bridge the Data Science Skills Gap

Chief Analytics Officer: Empower operational workers to consume data science in their routine decision making

Coding Data Scientist: Accelerate the creation of high-value data science while streamlining low-value tasks

Applied Data Scientist: Confidently extract the hidden value from your data using intuitive predictive analytics

Chief Executive Officer: Leverage prescriptive analytics in all your decisions to achieve better outcomes

RapidMiner Data Science Impact

Operationalize Competitive Advantage

*Ventana Research Next-Generation Predictive Analytics Benchmark Research, 2015

95% faster

50% Created new revenue opportunities*

Improved customer service*

46%

39%

Increased profitability*

5-10x data science capability

Build Better Predictive Models Faster Easily Use Predictive Analytics

13

The RapidMiner Data Science Platform

RapidMiner Studio: Lightning Fast Real Data Science, Code Optional
• Modeling: Efficiently build and deliver better models faster
• Validation: Confidently & accurately estimate model performance
• Data Access: Connect to any data source, any format, at any scale
• Data Exploration: Quickly discover patterns or data quality issues
• Data Prep: Speed & optimize ALL data exploration, blending & cleansing tasks

RapidMiner Server: Seamless Deployment, Management & Collaboration
• Integration: Efficiently build and deliver better models faster
• Management: Confidently & accurately estimate model performance
• Collaboration: Connect to any data source, any format, at any scale
• Computation: Quickly discover patterns or data quality issues
• Scheduling: Speed & optimize ALL data exploration, blending & cleansing tasks

RapidMiner Radoop: Simplified, Intelligent Big Data Science & Machine Learning
• Simplified Analytics: Reduces Hadoop complexity
• Lightning Fast: Covers complete analytics lifecycle
• Broad Data Access: Eliminate connectivity struggles
• Integrated Security: Ensure security compliance
• Optimized for Hadoop: Leverage Hadoop distributed power
• Scalable Processing: Process in-Hadoop and in-memory
• Spark Execution: Execute RapidMiner sub-processes in parallel

RapidMiner Marketplaces: On-demand Innovation & Execution
• Extensive Domain Expertise: Expert marketplace of certified RapidMiner skills
• Plug-ins, Algorithms, Extensions: Product Marketplace to extend and innovate

14

The RapidMiner Platform

RapidMiner Studio: Visual Workflow Designer
• Workflow Builder
• Process Execution Engine

RapidMiner Server: Collaborate + Compute + Deploy + Maintain
• Process Execution Engine
• Process Scheduler
• Data and Process Repository
• User/Group Access Rights Management
• Web App Portal
• Web Services
• RapidMiner Web Applications

RapidMiner Radoop: Compile + Execute in Hadoop

RapidMiner Market Place: Industry, Application & ML Extensions

Integrate using Web Services, JSON, SQL, …
• Application (BI, ERP, CRM…) / Portal
• Java SE/EE
• Application Server
• Application
• Databases / Data warehouses

Incorporate all types of data

Run in multiple Compute Engines
• R / Python / SQL Scripting
• In-Memory H2O / Weka
• In-Hadoop & Spark

15

RapidMiner Studio

All-In-One Data Science Workflow Designer

Lightning Fast: Visual interface for rapidly building complete analytic workflows

Powerful: Rich library of algorithms and functions to build the strongest possible model for any use case

Open & Extensible: Open source innovation keeps pace with changing business needs

16

RapidMiner Server

Operationalization & Collaboration Management

Team Collaboration: Central repository facilitates sharing of data sources, analytic processes & best practices

Frictionless Operationalization: Flexible execution options streamline deployment, maintenance & embedding of analysis

Dynamic & Continuous Model Management: Individual and customizable processes to check for accuracy drifts or shifts
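A check for accuracy drift can be sketched in a few lines. This is a generic illustration, not RapidMiner Server's mechanism. The function names and the 0.05 tolerance are assumptions made for the example.

```python
def accuracy(predictions, actuals):
    """Share of predictions that match the observed outcomes."""
    if len(predictions) != len(actuals) or not actuals:
        raise ValueError("need two equal-length, non-empty sequences")
    return sum(p == a for p, a in zip(predictions, actuals)) / len(actuals)


def has_drifted(baseline_accuracy, recent_accuracy, tolerance=0.05):
    """Flag drift when recent accuracy falls more than `tolerance` below baseline.

    The tolerance is an illustrative default, not a recommended value.
    """
    return (baseline_accuracy - recent_accuracy) > tolerance
```

A scheduled process could compute `accuracy` on the latest labeled batch and trigger retraining or an alert whenever `has_drifted` returns true.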

17

RapidMiner Radoop
Extends RapidMiner’s visual workflow to Hadoop

Hadoop made easy: Translates data science workflows into Hadoop so data scientists concentrate on analytics, not Hadoop programming

In-Hadoop Execution: Pushes analytic instructions into Hadoop for computation

Secure: Complies with Hadoop security standards

18

Sample Use Cases

Telco – Switzerland
Server & Equipment Load Forecasting, Predictive Maintenance, Predicting & Preventing Server & Component Failures

Telco – Austria
Automated Customer Feedback Text Analysis for Automated E-Mail Categorization & Routing

Telco – Hungary
Customer Relationship Analytics, Churn Prediction & Prevention, Direct Marketing Campaign Optimization, Scheduling & Automated Execution of ETL Tasks

Telco – Germany
Automated Online Market Research, Text Analytics, Sentiment Analysis, Customer Insight

Marketing – Germany
Automated Online Market Research, Text & Sentiment Analysis, Customer Insight, Competitive Intelligence

Telco – Germany
Fraud Detection & Prevention

OEM – Europe
Fraud Detection & Prevention Solutions for Telecoms

Telco – Europe
CRM applications including optimization of direct marketing campaigns, automated generation of product recommendations for cross-selling and up-selling, customer churn prevention, and fraud detection

Payments – Worldwide
Sentiment Analysis of online text sources, including social media and other user generated content for customer care triage

Telco – Austria
Optimize customer support by automatically categorizing unstructured data by content, prioritizing requests, and reducing response time and cost, thereby increasing customer satisfaction

Payments – Worldwide
Customer feedback & voice of the customer, churn prevention, text mining, automated text categorization, and sentiment analysis applied to customer support and satisfaction to prevent customer churn

Market Research – Worldwide
Prediction of sales volumes; CRM optimization; social media monitoring and sentiment analysis

19

Sample Customer Use Cases

Multiple Customers, Industries
Automated Customer Feedback Text Analysis for Automated E-Mail / Social Media, Categorization, Triage & Routing

Partner – Europe
Smart meter installation optimization as a service – maximize first time visit success

Market Research – Worldwide Org
Prediction of sales volumes; CRM optimization; social media monitoring and sentiment analysis

Automated Customer Feedback Text Analysis for Automated E-Mail Categorization & Routing

Telco – Europe
CRM applications including optimization of direct marketing campaigns, automated generation of product recommendations for cross-selling and up-selling, customer churn prevention, and fraud detection

Payments – Worldwide
Sentiment Analysis of online text sources, including social media and other user generated content for customer care triage

Payments – Russia
Fraud detection in retail network historical data on service usage, transaction history, customer profiles, usage logs, and known cases of fraudulent behavior

20

Sample Customer Use Cases

Voice of the Customer
Automated Customer Feedback Text Analysis for Automated E-Mail / Social Media, Categorization, Triage & Routing

Manufacturing – Predictive Maintenance
High Value Assets: Silicon, Cars, Trucks, Aircraft, Turbines, IT Infrastructure, …

Maximizing Customer Lifetime Value
CRM applications including optimization of direct marketing campaigns, automated generation of product recommendations for cross-selling and up-selling, customer churn prevention, and fraud detection

Manufacturing – Production Optimization
Optimization of Production Logistics & Flows, Quality, Yield, Product Mix, Process Mining

Fraud Detection
Fraud detection in retail network historical data on service usage, transaction history, customer profiles, usage logs, and known cases of fraudulent behavior

21

Safeguarding Electronic Payments

Anticipating the risk of fraud
Russia’s largest electronic payment service

The Challenge
• Protecting against fraud and anticipation of risk 7x24
• Large and diverse set of partners (merchants) – over 70,0000
• How to classify and check merchant ecommerce sites for payment system compliance?

RapidMiner Solution
• Analyze, classify and check merchants’ ecommerce sites for compliance
• Utilize text mining with NLP to auto-categorize with high sentiment accuracy
• Mashup the widest data sets: historical data on service usage, transaction history, customer profiles, usage logs, and known cases of fraudulent behavior
• Detect anomalies, misuse and fraud through operationalized classification model

Outcome
• Only 8-10% of merchant sites now screened manually at 80% confidence threshold
• Accurate automated analysis of high risk sites: 92% correctly classified
• Elimination of false positives: no normal sites classified as high risk
• Time and cost to resolve fraud case radically reduced
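The outcome of screening only low-confidence sites manually can be illustrated with a small routing sketch. It assumes the classifier emits a label and a confidence score per site. The function names, the tuple layout and the sample site names are hypothetical; only the 80% threshold comes from the slide.

```python
def route_by_confidence(scored_sites, threshold=0.80):
    """Split sites into auto-classified results and a manual review queue.

    scored_sites: iterable of (site, label, confidence) tuples.
    Sites scoring below the threshold are sent to manual screening.
    """
    automated, manual = [], []
    for site, label, confidence in scored_sites:
        if confidence >= threshold:
            automated.append((site, label))
        else:
            manual.append(site)
    return automated, manual


def manual_share(scored_sites, threshold=0.80):
    """Fraction of sites that still need manual screening."""
    scored_sites = list(scored_sites)
    if not scored_sites:
        return 0.0
    _, manual = route_by_confidence(scored_sites, threshold)
    return len(manual) / len(scored_sites)
```

Raising the threshold sends more sites to manual review, and lowering it trusts the model more, so the threshold sets the balance between review cost and risk.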

22

Repeat Business through Marketing Efficacy

Identify upsell offers through deep customer analytics
Large North American restaurant delivery chain

The Challenge
• Industry with tight margins & intense competition
• Broad array of online & mobile channels for customers to place orders
• Goal to improve marketing offers and create more repeat business

RapidMiner Solution
• Capture a vast array of customer ordering data from multiple online & mobile phone channels
• Use RapidMiner to join & enrich data with 3rd-party demographics & competitive data
• Use data science to assess performance and growth drivers at individual stores & franchise groups
• Results used to tailor coupons & upsell offers to customers

Outcome
• Greater flow of repeat customers, driving growth at individual stores and franchise groups
• Far outpaced the industry: Posted best Q2 & Q3 domestic same-store sales growth of the 25 largest restaurant chains in the U.S.
• Next steps: RapidMiner Radoop

23

Customer Satisfaction through Quality of Service

Customer experience begins with network quality
Leading European Telecoms Provider

The Challenge
• Backend infrastructure footprint & costs increasing yearly
• Customer satisfaction driven by service quality in areas such as video streaming latency
• Network operation teams must accelerate root cause analysis, reduce time to repair
• Data visualization with big data alone cannot provide operationalized insight needed

RapidMiner Solution
• Secure large scale Hortonworks Hadoop Big Data Hub architecture to leverage data lakes
• Correlation of log events with historical log data to preempt service quality degradation
• Through machine learning rapidly predict demand as consumer usage patterns change
• Utilize text mining to optimize help desk ticket triage and processing

Outcome
• Reduce infrastructure requirements (-10%)
• Improved customer retention (2%+)
• IT Operations costs reduced (-30%)

24

Drive Data Science Agility & Cut Costs

Faster development & deployment of customer analytics models
Leading North American Financial Services Institution

The Challenge
• Existing data science teams looking to replace SAS
– Strong dislike of unwieldy SAS platform with the coding & complexity of its multiple applications & user interfaces
– Cost of SAS too high

RapidMiner Solution
• Pull together customer data from across a number of internal databases & third-party sources
• Easily incorporate a large library of legacy predictive models written in R & Python
• Small team of 4 data scientists using collaboration features in RapidMiner Server to share data prep and machine learning processes

Outcome
• Improved upsell opportunities and customer retention
• Speeds the process of data prep, rapid prototyping & validation of models over SAS methods and coding-only methods
• Expansion into Risk department where data science team doesn’t code in SAS, R or Python

25

Gartner & Forrester – RapidMiner a Clear Leader

2017

Magic Quadrant for Data Science Platforms

“…a Leader, owing to its market presence, the volume of client inquiries that Gartner receives about it, its user community, and its well-rounded product that addresses most data science use cases well.”

“Reference customers praised many facets of the platform – its large selection of algorithms, flexible modeling capabilities, data source integration and consequent data preparation. The platform's strength lies not just in particular areas, but also in its all-around consistency.”

PAML Wave

“RapidMiner wraps breadth and depth in a beautiful package. RapidMiner invested heavily to revamp visual interface to make it the most concise and fluid that we have seen during this evaluation. Add to that, RapidMiner’s comprehensive set of operators that encapsulate a wide range of data prep, analytical, and modeling functionality to increase productivity of data scientists.”

26

Peer Insights – True Expert Validation

Verified software ratings and reviews from your enterprise IT peers

Reviews for Advanced Analytics Platforms

Business Software and Services Reviews

Top Predictive Analytics Products by Enterprise reviewers