Katya Vladislavleva, PhD, PDEng From Data Science to Data Stories

Post on 25-Jun-2020

Page 1: From Data Science to Data Stories - MathWorks › content › dam › mathworks › ... · Source: « The Four Imperatives of Winning in The Age of the Customer » Forrester Report,

Katya Vladislavleva, PhD, PDEng

From Data Science to Data Stories

Page 2:

OUR TECHNOLOGY IS SHAPED BY THE REAL WORLD

Different industries are solving the same analytics problems

• Energy

• Advanced manufacturing

• Materials

• Finance

• (Digital) Health

• Consumer products

• Business operations

Page 3:

If you can measure it, you can understand it.

If you can understand it, you can alter it.

Katherine Neville

Page 4:

OF THE BUDGET GOES TO DATA COLLECTION

Page 6:

96% of data collected is not utilized (not monetized)

Source: IDC 2014

Page 7:

More data does not always imply more information

[Figure: Confusion plotted against how much data you have]

Page 8:

The real goal is understanding and control

[Figure: Insight and Control plotted against how much data you have]

Page 9:

The Age of the Customer is here

• Age of Manufacturing: 1900-1960

• Age of Distribution: 1960-1990

• Age of Information: 1990-2010

• Age of the Customer: 2010-Beyond

Source: « Competitive Strategy in The Age of the Customer » Forrester Report, 06/2011

Page 11:

Good Advice from Forrester

• Embrace the mobile

• Be there, be digital

• Attain Maximum Customer Intelligence with Data Analytics

• Create customer experience excellence

Source: « The Four Imperatives of Winning in The Age of the Customer » Forrester Report, 10/2015

Page 12:

MISMATCH BETWEEN THE VOLUME OF DATA AND OUR CAPACITY TO ANALYSE IT GROWS ALARMINGLY FAST

“By 2018, the US alone will face a shortage of up to 190,000 data scientists as well as 1.5 million managers and analysts with enough proficiency in statistics to use big data effectively.”

Page 13:

Define KPIs

Collect Data

Organize & Process Data

Interpret Data

Act on Findings

[Figure: Profit (from -$$ through 0 to +$$) and Competitive Advantage plotted against Time, with curves labeled "Analytics" and "Analytics ?"]

Page 14:

Change the Mindset

Page 15:

Change the Mindset

Work-in-process costs are reduced by 75%

Page 16:

Change the Mindset

Hmm… If this is possible, what if we optimize our products for profitability?

Forward-thinking Leader

Page 17:

There are different kinds of big data

Big Data

Medium & Small Data

Big Hype

Big Mess

Page 18:

http://www.tableau.com/sites/default/files/media/top8bigdatatrends2016_final_2.pdf

Page 37:

Process Data: 154,260 energy observations

78% elements

OVERALL HEALTH OF YOUR DATA 42%

TOTAL NUMBER OF TAGS: 1,239

TAGS WITH AT LEAST 50% DATA: 1,183

TOTAL NUMBER OF ENERGY OUTPUTS: 12

CONSTANT TAGS: 292

DISCRETE TAGS: 609

CONTINUOUS TAGS: 574
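
The tag counts on this slide (total tags, tags with at least 50% data, constant, discrete and continuous tags) could be produced by a check along the following lines. This is only a sketch: the function name `summarize_tags`, the 50% coverage threshold as a parameter, and the rule that a tag with at most 10 distinct values counts as discrete are all assumptions for illustration, not the classification the authors actually used.

```python
def summarize_tags(tags, min_coverage=0.5, max_discrete_levels=10):
    """Count tags by data coverage and by kind.

    `tags` maps a tag name to its list of readings, with None for a
    missing reading. The thresholds are illustrative assumptions.
    """
    summary = {
        "total": len(tags),
        "with_enough_data": 0,
        "constant": 0,
        "discrete": 0,
        "continuous": 0,
    }
    for values in tags.values():
        present = [v for v in values if v is not None]
        coverage = len(present) / len(values) if values else 0.0
        if coverage < min_coverage:
            continue  # too sparse to be a usable model input
        summary["with_enough_data"] += 1
        levels = len(set(present))
        if levels == 1:
            summary["constant"] += 1  # flagged in addition to its kind
        if levels <= max_discrete_levels:
            summary["discrete"] += 1
        else:
            summary["continuous"] += 1
    return summary


# Made-up readings for four hypothetical tags.
example = {
    "valve_open": [1, 1, 1, 1],
    "sparse_sensor": [None, None, None, 1.0],
    "pump_mode": [0, 1, 0, 1],
    "temperature": [float(i) for i in range(20)],
}
print(summarize_tags(example))
```

In this sketch every tag with enough data is either discrete or continuous, and constant tags are flagged on top of that split.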

Page 39:

Our Hackathon Outcomes

1. Built Non-linear Predictive Models for Power Consumption with local error bounds

2. Which led to a Dashboard with a focus on the variables that currently matter most

3. That also integrates an innovative approach to running What-If Scenarios starting from the current process state

4. And allows you to maximize the Sustainability of your process at all times.
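
A what-if scenario that starts from the current process state can be pictured as: copy the current readings, override the variables being explored, and compare the model's prediction before and after. The sketch below assumes a model that is simply a callable taking a dictionary of readings. The names `what_if` and `toy_model`, the tag names, and the coefficients are all hypothetical and not taken from the dashboard.

```python
def what_if(model, current_state, changes):
    """Compare a model's prediction for the current state and a modified one.

    `model` is any callable mapping a dict of readings to a number.
    `changes` overrides some readings; the current state is not mutated.
    """
    unknown = set(changes) - set(current_state)
    if unknown:
        raise KeyError(f"unknown variables: {sorted(unknown)}")
    scenario = {**current_state, **changes}
    baseline = model(current_state)
    predicted = model(scenario)
    return {
        "baseline": baseline,
        "scenario": predicted,
        "delta": predicted - baseline,
    }


def toy_model(state):
    """Stand-in for a fitted non-linear power-consumption model."""
    return 5.0 + 0.8 * state["feed_rate"] + 0.001 * state["temperature"] ** 2


current = {"feed_rate": 10.0, "temperature": 50.0}
result = what_if(toy_model, current, {"feed_rate": 8.0})
print(result)
```

Starting from the live state rather than from an arbitrary operating point keeps every scenario anchored to conditions the process is actually in.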

Page 40:

Data Pre-Processing & Analysis

✓ Integrated and aligned all available tags with energy data

✓ Generated monthly and quarterly datasets with 5 min averages

✓ Looked at Data Health, Data distributions, Linear Correlations, Mutual Information Content and Variable Connections & Grouping
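
Two of the pre-processing steps listed on this slide, the 5-minute averages and the linear correlations between aligned tags, can be sketched with the Python standard library. The function names and the sample readings are assumptions for illustration; the original work was done in Matlab and Python, and the slides do not show that code.

```python
from collections import defaultdict
from datetime import datetime
from math import sqrt
from statistics import fmean


def five_minute_averages(samples):
    """Average (timestamp, value) pairs into 5-minute buckets.

    Each bucket is keyed by its start time, e.g. 10:05:00 covers
    10:05:00 up to but not including 10:10:00.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        start = ts.replace(minute=ts.minute - ts.minute % 5,
                           second=0, microsecond=0)
        buckets[start].append(value)
    return {start: fmean(vals) for start, vals in sorted(buckets.items())}


def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length series."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def correlate_tags(tag_a, tag_b):
    """Correlate two averaged tags on the buckets they have in common."""
    shared = sorted(set(tag_a) & set(tag_b))
    return pearson([tag_a[t] for t in shared], [tag_b[t] for t in shared])


t0 = datetime(2016, 1, 1, 10, 0, 0)
readings = [
    (t0, 1.0),
    (t0.replace(minute=2), 3.0),
    (t0.replace(minute=5), 10.0),
    (t0.replace(minute=9, second=59), 20.0),
]
print(five_minute_averages(readings))
```

Correlating only on shared buckets is what "aligned" means in this sketch: a tag with gaps contributes just the intervals where both series have data.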

Page 41:

Interactive Explorer of Tag Relationships

Page 42:

Our Modeling Process

Organize all your process data into one big table

Apply advanced machine learning

$$$ Identify driving metrics & Deploy Predictive Models

$$$

Define your KPIs

Page 43:

Build Compact Non-linear Models per regime using extensive process of variable competition and elimination

[Figure: Tag Importance plotted against Iteration]

Page 44:

Result is compact 5-variable model-ensembles with error limits for each regime

[Figure: Energy Consumption plotted against measurements sorted by Power Consumption]
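
To illustrate how an ensemble can report error limits alongside a prediction, here is a minimal sketch. It swaps in a much simpler technique than the one described: 100 straight-line fits on bootstrap resamples of a single input, rather than compact non-linear 5-variable models. The function names, the 5%/95% limits, and the data are assumptions for illustration only.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares slope and intercept for one input variable."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_ensemble(xs, ys, n_models=100, seed=0):
    """Fit one line per bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    while len(models) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # degenerate resample: a line cannot be fitted
        models.append(fit_line(bx, by))
    return models


def predict_with_limits(models, x):
    """Median prediction plus the 5th and 95th percentile members."""
    preds = sorted(slope * x + intercept for slope, intercept in models)
    lower = preds[int(0.05 * (len(preds) - 1))]
    upper = preds[int(0.95 * (len(preds) - 1))]
    return statistics.median(preds), lower, upper


# Made-up data: y = 2x + 1 with a small alternating disturbance.
xs = [float(i) for i in range(20)]
ys = [2 * x + 1 + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]
ensemble = fit_ensemble(xs, ys)
print(predict_with_limits(ensemble, 10.0))
```

The spread between ensemble members is what yields the limits: where the members disagree, the band widens, which is one way to obtain error bounds that vary locally.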

Page 45:

Dashboard with a focus

Power Consumption

Page 46:

Power Consumption

Page 47:

Models are Robust by Design

[Figure: Energy Consumption]

✓ 600,000 models created to produce a final ensemble for one region

✓ 2 regions of electricity consumption lead to 1,200,000 models

✓ 50% of the modeling effort is spent on cross-validation and making sure the models are predictive and not over-fitting

✓ From 1,183 potential inputs only five (5) metrics per region are necessary and sufficient

✓ Global R² is 0.97; per-regime R² is 0.68-0.7

✓ Final ensembles consist of 100 models each and also provide confidence limits
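
The gap between the global R² and the per-regime R² reported above can look surprising. One possible explanation, which the slides do not confirm, is that the regimes sit at very different consumption levels: a model that places each measurement in the right regime already explains most of the overall variance, so the global score is high even when the fit inside each regime is only moderate. The numbers below are made up purely to show the effect.

```python
import statistics


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = statistics.fmean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Two hypothetical regimes with well-separated consumption levels.
low_actual = [10.0, 11.0, 12.0, 13.0]
low_pred = [10.5, 10.5, 12.5, 12.5]
high_actual = [100.0, 101.0, 102.0, 103.0]
high_pred = [100.5, 100.5, 102.5, 102.5]

r2_low = r_squared(low_actual, low_pred)
r2_high = r_squared(high_actual, high_pred)
r2_global = r_squared(low_actual + high_actual, low_pred + high_pred)

print(f"per regime: {r2_low:.3f}, {r2_high:.3f}; global: {r2_global:.5f}")
```

Here each regime scores 0.8 on its own while the pooled score is above 0.99, because the pooled total variance is dominated by the distance between the two regimes.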

Page 48:

Time It Took Us

Total Time Spent: 71 h

• Data Download

• Data Preparation & Aggregation

• Predictive Modeling (not counting DataStories’ RoboElves)

• Data Visualization & Connected Dashboard

• Presentation Prep

• Meetings & Driving

[Figure: chart of the activities above, with labels 1, 34, 5, 6]

Matlab, Python, JS, HTML5

Page 49:

Benefits of using Matlab

✓ Super fast implementation

✓ Reliable deployment + flexibility (Matlab package, Stand-alone, Cloud)

✓ Code protection

✓ Matlab users can integrate it easily in their Matlab-based routines

✓ Plug & Play Hadoop integration

Page 50:

Business Outcomes

✓ Project of high business value

✓ Perfect product validation

Special thanks to the Brussels Data Science Community for organizing the hackathon

Page 51:

Subscribe for the VIP beta at beta.datastories.com