evolved analytics llc big insight...

18
Evolved Analytics LLC Big Insight vs. Big Data Mark Kotanchek 1

Upload: others

Post on 14-Jun-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

Evolved Analytics LLC

Big Insight vs. Big Data

Mark Kotanchek

1

Page 2: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com

Modeling Options

2

Drivingvariablesare known

Model structure is KNOWN to be

LINEAR

Drivingvariablesare known

Modelstructure is KNOWN but NONLINEAR

Driving variablesare known

Model structure is NOT known

Driving variablesare NOT known

Model structure is NOT known

known nonlinear System & Driving

variables are known

Linear System & Driving

Variables are known

Unknown Model Structure but

Driving variables are known

Linear Regression

Non-linear Regression Parameter Estimation

Neural Networks,

SVM, Random Forests, Symbolic

Regression

Symbolic Regression

Model structure is NOT known

& Driving variables

are NOT known

Page 3: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com

Modeling Workflow is Everything!

❖ Awareness & execution across all aspects

❖ Analysis flow is iterative

❖ Utilize visualization to guide analysis

❖ An audit trail is fundamental

3

Data

DataExploration

Data Selection &

Conditioning

ModelDevelopment

Insight &Understanding

ModelSelection

ModelDeployment

ModelMaintenance

Page 4: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com

Data in the Real World

4

Missing Elements

Correlated Variables

Lots of Records

Wide

Too Little Data

Noisy

Wrong Data

Unreliable Sensors

Page 5: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com

Key Point

5

Symbolic Regression⇒

Hypothesis Generator

Human limits of imagination & possibility are not imposed!

The only constraint is the supplied building blocks. We can exploit this creativity

to produce trustable data models.

Page 6: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com

The Illustration Data

❖ Data from an industrial chemical reactor

❖ Having problems with product quality (QC)

❖ 125 variables sampled over three months

❖ Chemical composition from gas chromatography (GC)

❖ Process information from plant (flows, temps & source)

❖ Two users:

❖ Analytical Chemists (Why??)

❖ Production Engineers (What to do??)

6

Page 7: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com7

The Zen of the data Correlated

Inputs

Page 8: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com8

Generate Models

Page 9: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com

Developed Models

9

Page 10: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com10

Model Dimensionality

Page 11: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com

Supporting the Chemists (Round 2)

11

Variable subsets can be isolated for focused modeling

MetaVariables can be explored and used in

subsequent modeling

Page 12: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com

Chemists (focused)

12

Supplied MetaVariables were

exploited

Each search is stochastic

It looks like 5 variables are required

This is an interesting

combination

Page 13: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com

Supporting Production

13

Modeling only using PI variables

Define inputs for focused model

development (vars in at least 20% of

models)

3 or 4 vars needed for good-enough models

Page 14: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com

Focused Production

Models

14

32 10 minute searches (80 minutes of

clock time)Picked four variables for deployed model development

Page 15: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com

Deployable Models

❖ 80 minutes of model development (32 independent searches of 10 minutes on a quad-core laptop)

❖ Models were rewarded for simplicity and accuracy

❖ The individual models are good; however, we want trustable models based upon ensembles

15

Page 16: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com

An Ensemble❖ From candidate (accurate and

simple) models, models chosen for their diversity

❖ Ensemble has better prediction accuracy than individual models

❖ Divergence of models provides a trust metric!

16

Page 17: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com

Ensemble Performance

❖ Ensemble predictions have a trust metric based upon divergence.

❖ Temperatures are coupled so they cannot be varied independently ⟹ prediction spread greatly increases if we try to do that!

17

Page 18: Evolved Analytics LLC Big Insight vs.library.wolfram.com/infocenter/Conferences/9317/1445620635.pdf · Evolved Analytics LLC Big Insight vs. ... Modeling and data results can be archived

www.evolved-analytics.com

Conclusions

❖ One more thing …

❖ Modeling and data results can be archived to analysis report at the click of a button

❖ A function package is available for use in a notebook front end as well as to facilitate automated analysis flows

❖ For more information or trial licenses for DataModeler, contact

[email protected]

❖ We also do consulting and custom analysis system development

❖ We have offices in the US and Europe

18