machine learning in the age of big data: new approaches and business applications

Machine learning in the age of big data Armando Vieira Closer Armando.lidinwise.com

Upload: armando-vieira

Post on 25-May-2015

DESCRIPTION

Presentation at the University of Lisbon on machine learning and big data: deep learning algorithms and applications to credit risk analysis, churn detection and recommendation algorithms.

TRANSCRIPT

Page 1: machine learning in the age of big data: new approaches and business applications

Machine learning in the age of big data

Armando Vieira, Closer

Armando.lidinwise.com

Page 2: machine learning in the age of big data: new approaches and business applications

Predicting the flu

Page 3: machine learning in the age of big data: new approaches and business applications

Topics

1. Machine Learning: finding features, patterns & representations
2. The connectionist approach: Neural Networks
3. Applications
4. The Deep Learning “revolution”: a step closer to the brain?
5. Applications
6. The Big Data deluge: better algorithms & more data

Page 4: machine learning in the age of big data: new approaches and business applications

What is an “intelligent” machine?

Was “Deep Blue” intelligent? How about Watson? Or Google?
Have machines reached the intelligence level of a rat?
Let’s be pragmatic: I’ll call “intelligent” any device capable of surprising me!

Page 5: machine learning in the age of big data: new approaches and business applications

Connectionism

Page 6: machine learning in the age of big data: new approaches and business applications

The connectionist way

1943: McCulloch & Pitts + Hebb
1968: Rosenblatt’s perceptron and the Minsky argument, or why a good theory may kill an even better idea
1985: Rumelhart perceptron
2006: Hinton, Deep Learning (Boltzmann) networks
All together: Watson, Google et al.

Page 7: machine learning in the age of big data: new approaches and business applications

Symbolic machines

Page 8: machine learning in the age of big data: new approaches and business applications

The brain way

Page 9: machine learning in the age of big data: new approaches and business applications

Modeling the Human Brain?

Input builds up on receptors (dendrites)
Cell has an input threshold
Upon breach of the cell’s threshold, activation is fired down the axon
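
The three steps on this slide (input accumulates, it is compared with a threshold, the cell fires) can be sketched as a single threshold unit. This is a minimal illustration, not code from the talk: the function name `threshold_neuron`, the unit weights and the thresholds are all assumptions chosen for the example.

```python
def threshold_neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted input reaches the threshold, else stay silent (0)."""
    # Input builds up: each signal is scaled by the strength of its connection.
    activation = sum(w * x for w, x in zip(weights, inputs))
    # The cell fires down the axon only once the threshold is reached.
    return 1 if activation >= threshold else 0


# Hypothetical example: with unit weights, a threshold of 2 behaves like a
# logical AND of two binary inputs, and a threshold of 1 like a logical OR.
pairs = [(a, b) for a in (0, 1) for b in (0, 1)]
and_outputs = [threshold_neuron([a, b], [1, 1], threshold=2) for a, b in pairs]
or_outputs = [threshold_neuron([a, b], [1, 1], threshold=1) for a, b in pairs]
```

Changing only the threshold turns the same unit from AND into OR, which is why learning in such a model comes down to adjusting weights and thresholds.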

Page 10: machine learning in the age of big data: new approaches and business applications

The visual cortex

How does the brain do the trick?

Page 11: machine learning in the age of big data: new approaches and business applications

The simplest neural network

Page 12: machine learning in the age of big data: new approaches and business applications

What is a Multilayer Perceptron?

A step closer to success thanks to a training algorithm: back-propagation
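
As a rough illustration of back-propagation, the sketch below trains a perceptron with one hidden layer on XOR, using sigmoid units and a half squared error. The layer size, learning rate, number of epochs and seed are assumptions made for the example, not values from the talk, and whether a run reaches a good solution depends on the random initialization.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Return the hidden activations and the network output for one input vector."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, y


def loss(x, t, W1, b1, W2, b2):
    """Half squared error for a single example."""
    return 0.5 * (forward(x, W1, b1, W2, b2)[1] - t) ** 2


def backprop(x, t, W1, b1, W2, b2):
    """Gradients of the half squared error with respect to every parameter."""
    h, y = forward(x, W1, b1, W2, b2)
    # Error signal at the output: (y - t) times the sigmoid derivative y(1 - y).
    delta_out = (y - t) * y * (1.0 - y)
    # Error signal propagated back through the output weights to each hidden unit.
    delta_hid = [delta_out * W2[j] * h[j] * (1.0 - h[j]) for j in range(len(h))]
    gW1 = [[d * xi for xi in x] for d in delta_hid]
    gW2 = [delta_out * hj for hj in h]
    return gW1, delta_hid, gW2, delta_out


def train(data, hidden=3, lr=0.5, epochs=2000, seed=0):
    """Online gradient descent: update the weights after every example."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, t in data:
            gW1, gb1, gW2, gb2 = backprop(x, t, W1, b1, W2, b2)
            for j in range(hidden):
                W2[j] -= lr * gW2[j]
                b1[j] -= lr * gb1[j]
                for i in range(n_in):
                    W1[j][i] -= lr * gW1[j][i]
            b2 -= lr * gb2
    return W1, b1, W2, b2


xor_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
W1, b1, W2, b2 = train(xor_data)
predictions = [forward(x, W1, b1, W2, b2)[1] for x, _ in xor_data]
```

A single threshold unit cannot represent XOR, so this is the smallest problem where the hidden layer, and therefore a way of training it, is actually needed.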

Page 13: machine learning in the age of big data: new approaches and business applications
Page 14: machine learning in the age of big data: new approaches and business applications

Learning a function

Page 15: machine learning in the age of big data: new approaches and business applications

Training is nothing more than fitting: regression, classification, recommendations

The problem is that we have to find a way to represent the world (extract features)

Supervised / unsupervised

Page 16: machine learning in the age of big data: new approaches and business applications
Page 17: machine learning in the age of big data: new approaches and business applications

Can an ANN learn this?

Page 18: machine learning in the age of big data: new approaches and business applications
Page 19: machine learning in the age of big data: new approaches and business applications

Sure!

Page 20: machine learning in the age of big data: new approaches and business applications
Page 21: machine learning in the age of big data: new approaches and business applications

A very simple problem

Money

Age

FRUSTRATION

Page 22: machine learning in the age of big data: new approaches and business applications

Curse of dimensionality

Page 23: machine learning in the age of big data: new approaches and business applications

Learning too much

Page 24: machine learning in the age of big data: new approaches and business applications

Overfitting

The simpler hypothesis has the lower error rate

Page 25: machine learning in the age of big data: new approaches and business applications

Optimization & convergence

ANNs are very hard to optimize
Lots of local minima (traps for stochastic gradient descent)
Permutation invariant (no unique solution)
How to stop training?
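
One common answer to “how to stop training?” is early stopping on a validation set: keep the epoch with the lowest validation error and give up once it has not improved for a while. The sketch below assumes the validation loss per epoch is already available as a list; the function name and the `patience` and `min_delta` parameters are illustrative choices, not something the slides specify.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (best_epoch, best_loss), scanning until `patience` epochs pass without improvement."""
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, current in enumerate(val_losses):
        if current < best_loss - min_delta:
            # New best model: remember it and reset the waiting counter.
            best_loss, best_epoch, waited = current, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # no improvement for `patience` epochs in a row
    return best_epoch, best_loss


# Hypothetical validation curve: it improves, then starts to overfit.
curve = [0.9, 0.7, 0.6, 0.62, 0.61, 0.65, 0.5]
best_epoch, best_loss = early_stopping_epoch(curve, patience=3)
```

With a patience of 3 the scan stops before the late drop to 0.5, which shows the trade-off: a small patience saves time but can miss a later improvement.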

Page 26: machine learning in the age of big data: new approaches and business applications
Page 27: machine learning in the age of big data: new approaches and business applications

Feeding & understanding the beast

Neural networks are incredibly powerful algorithms
But they are also wild beasts that should be treated with great care
It’s very easy to fall into the GIGO (garbage in, garbage out) trap
Problems like overfitting, suboptimization, bad conditioning and wrong interpretation are common

Page 28: machine learning in the age of big data: new approaches and business applications

Some care

Interpretation of outputs:
  Loss function
  Outputs ≠ probabilities
  Where to draw the line?
  Be VERY careful in interpreting the outputs of ML algorithms: you do not always get what you see

Input preparation:
  Clean & balance the data
  Normalize it properly
  Remove unneeded features, create new ones
  Missing values
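
Two of the input-preparation steps, handling missing values and normalizing, can be sketched as follows. Mean imputation and z-score scaling are assumptions picked for the illustration (the slide does not say which methods were used), and the function name is hypothetical. The sketch also assumes every column has at least one observed value.

```python
import math


def impute_and_standardize(rows):
    """Replace missing values (None) by the column mean, then scale each column to zero mean, unit variance."""
    n_cols = len(rows[0])
    columns = []
    for j in range(n_cols):
        present = [r[j] for r in rows if r[j] is not None]
        mean = sum(present) / len(present)
        variance = sum((v - mean) ** 2 for v in present) / len(present)
        std = math.sqrt(variance) or 1.0  # constant column: avoid dividing by zero
        columns.append([((mean if r[j] is None else r[j]) - mean) / std for r in rows])
    # Transpose back from columns to rows.
    return [list(row) for row in zip(*columns)]


# Hypothetical data: two features on very different scales, one missing value.
raw = [[1.0, 10.0], [2.0, None], [3.0, 30.0]]
prepared = impute_and_standardize(raw)
```

After this step both features live on the same scale, so neither dominates the weighted sums inside the network simply because of its units.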

Page 29: machine learning in the age of big data: new approaches and business applications

PCA, Isomap, NMF and the like

Page 30: machine learning in the age of big data: new approaches and business applications

Applications

Rutherford Backscattering (RBS)
Credit Risk & Scoring
Churn prediction (CDR)
Prediction of hotel demand with Google Trends
AdWords optimization

Page 31: machine learning in the age of big data: new approaches and business applications

Ion beam analysis

[Figure: ion beam analysis techniques (RBS, NRA, PIXE, ERDA, channelling); energy scale in MeV/amu]

Page 32: machine learning in the age of big data: new approaches and business applications

Rutherford backscattering: where is the pattern?

[Figure: RBS spectra, yield (arb. units) vs. channel, for a 25 Å Ge δ-layer under 400 nm Si. Panels: (a) beam energy 1.2, 1.6 and 2 MeV; (b) scattering angle 120°, 140° and 180°; (c) angle of incidence 50°, 25° and 0°]

Page 33: machine learning in the age of big data: new approaches and business applications

Ge in Si: ANN architecture

architecture                    train set error   test set error
(I, 100, O)                     6.3               11.7
(I, 250, O)                     5.2               10.1
(I, 100, 80, O)                 3.6               5.3
(I, 100, 50, 20, O)             4.2               5.1
(I, 100, 80, 50, O)             3.0               4.1
(I, 100, 80, 80, O)             2.8               4.7
(I, 100, 50, 100, O)            3.0               4.2
(I, 100, 80, 80, 50, O)         3.2               4.1
(I, 100, 80, 50, 30, 20, O)     3.8               5.3

Page 34: machine learning in the age of big data: new approaches and business applications

Anything in Al2O3: test set

[Figure: ANN predictions vs. data on the test set. Panel a): dose (ANN) vs. dose (data), both in 10^15 at/cm2, axes from 0.1 to 100. Panel b): depth (ANN) vs. depth (data), both in 10^15 at/cm2, axes from 0 to 4000]

Page 35: machine learning in the age of big data: new approaches and business applications
Page 36: machine learning in the age of big data: new approaches and business applications

Churn prediction at a telecom

Page 37: machine learning in the age of big data: new approaches and business applications

Model validation | Lift and Profit curves
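
A lift curve for a churn model can be computed by ranking customers by predicted score and comparing the churn rate in the top slice with the overall churn rate. The sketch below is one conventional way to do this, not necessarily how the curves on the slide were produced; the function name, the bin count and the toy scores are assumptions.

```python
def lift_table(scores, labels, n_bins=10):
    """Return (fraction targeted, fraction of positives captured, lift) for each cumulative bin."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)
    base_rate = total_positives / len(labels)
    table = []
    for b in range(1, n_bins + 1):
        size = max(1, round(b * len(ranked) / n_bins))
        top = ranked[:size]
        captured = sum(label for _, label in top)
        # Lift: churn rate among the targeted customers relative to the overall rate.
        lift = (captured / len(top)) / base_rate
        table.append((b / n_bins, captured / total_positives, lift))
    return table


# Hypothetical scores for ten customers; label 1 means the customer churned.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
table = lift_table(scores, labels, n_bins=5)
```

A profit curve follows from the same ranking once a value per retained customer and a cost per contact are attached to each bin.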

Page 38: machine learning in the age of big data: new approaches and business applications
Page 39: machine learning in the age of big data: new approaches and business applications

AdWords optimization

Page 40: machine learning in the age of big data: new approaches and business applications

Bankruptcy prediction

Page 41: machine learning in the age of big data: new approaches and business applications

The Rating System

Page 42: machine learning in the age of big data: new approaches and business applications
Page 43: machine learning in the age of big data: new approaches and business applications

[Figure: surface plot of the score as a function of EBIT (eb) and current ratio (cr)]

Score (EBIT, Current ratio)

Page 44: machine learning in the age of big data: new approaches and business applications

Fraud Detection

Page 45: machine learning in the age of big data: new approaches and business applications

Hotel demand prediction using Google

Page 46: machine learning in the age of big data: new approaches and business applications

Credit Scoring

Before / After

Page 47: machine learning in the age of big data: new approaches and business applications

Where is the information?

Page 48: machine learning in the age of big data: new approaches and business applications

ANNs are not a “silver bullet”!

Neural networks are good when: plenty of training data is available; the variables are continuous; the relevant features are known; the mapping is unique.

Neural networks are less useful when: the problem is linear; there is little data compared to the size of the search space; the data is high dimensional; there are long-range correlations.

They are black boxes

Page 49: machine learning in the age of big data: new approaches and business applications

Characteristics of ANNs

Characteristic                Traditional methods (Von Neumann)        Artificial neural networks
Logic                         Deductive                                Inductive
Processing principle          Logical                                  Gestalt
Processing style              Sequential                               Distributed (parallel)
Functions realised through    Concepts, rules, calculations            Concepts, images, categories, maps
Connections between concepts  Programmed a priori                      Dynamic, evolving
Programming                   Through a limited set of rigid rules     Self-programmable (given an appropriate architecture)
Learning                      By rules                                 By examples (analogies)
Self-learning                 Through internal algorithmic parameters  Continuously adaptable
Tolerance to errors           Mostly none                              Inherent

Page 50: machine learning in the age of big data: new approaches and business applications

Where is the intelligence?

ANNs are massive correlation & feature extraction machines: isn’t that what intelligence is all about?
Knowledge is embedded in a messy network of weights
Capable of modelling an arbitrarily complex mapping

Page 51: machine learning in the age of big data: new approaches and business applications

We need thousands of examples for training. Why?

Prior

Algorithms are simple: complexity lies in the data

Still…

Page 52: machine learning in the age of big data: new approaches and business applications

Deep Learning approach

Page 53: machine learning in the age of big data: new approaches and business applications

Boltzmann Machines

Page 54: machine learning in the age of big data: new approaches and business applications
Page 55: machine learning in the age of big data: new approaches and business applications

Teaching an RBM by Gibbs sampling
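
The slides for this part are images only, so the following is a sketch of one standard way to train a restricted Boltzmann machine with Gibbs sampling: contrastive divergence with a single Gibbs step (CD-1). That this is the exact procedure shown in the talk is an assumption, and the layer sizes, learning rate and toy patterns are invented for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sample_hidden(v, W, c, rng):
    """Sample binary hidden units given the visible units; W[i][j] links v[i] to h[j]."""
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v)))) for j in range(len(c))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def sample_visible(h, W, b, rng):
    """Sample binary visible units given the hidden units."""
    probs = [sigmoid(b[i] + sum(h[j] * W[i][j] for j in range(len(h)))) for i in range(len(b))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def cd1_update(v0, W, b, c, rng, lr=0.1):
    """One CD-1 step: up, down, up again, then update W, b and c in place."""
    ph0, h0 = sample_hidden(v0, W, c, rng)   # positive phase, driven by the data
    _, v1 = sample_visible(h0, W, b, rng)    # one Gibbs step back to a reconstruction
    ph1, _ = sample_hidden(v1, W, c, rng)    # negative phase, driven by the reconstruction
    for i in range(len(b)):
        for j in range(len(c)):
            # Data correlations minus reconstruction correlations.
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])
    return v1


rng = random.Random(0)
n_visible, n_hidden = 4, 2
W = [[0.0] * n_hidden for _ in range(n_visible)]
b = [0.0] * n_visible
c = [0.0] * n_hidden
patterns = [[1, 1, 0, 0], [0, 0, 1, 1]]  # hypothetical toy data
reconstruction = cd1_update(patterns[0], W, b, c, rng)
```

In practice the update is repeated over all patterns for many epochs; a single step is shown here so that the effect of one up-down-up pass is easy to inspect.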

Page 56: machine learning in the age of big data: new approaches and business applications
Page 57: machine learning in the age of big data: new approaches and business applications
Page 58: machine learning in the age of big data: new approaches and business applications
Page 59: machine learning in the age of big data: new approaches and business applications
Page 60: machine learning in the age of big data: new approaches and business applications
Page 61: machine learning in the age of big data: new approaches and business applications
Page 62: machine learning in the age of big data: new approaches and business applications

Hinton et al, 2006

Page 63: machine learning in the age of big data: new approaches and business applications
Page 64: machine learning in the age of big data: new approaches and business applications
Page 65: machine learning in the age of big data: new approaches and business applications

Nice features of Deep Learning

“Quasi” unsupervised machines
Extract and combine subtle features in the data
Build high-level representations (abstractions)
Capable of knowledge transfer
Can handle (very) high-dimensional data
Are deep and broad: millions of synapses
Work both ways: up and down

Page 66: machine learning in the age of big data: new approaches and business applications

The dimensionality curse crushed?

Learning features that are not mutually exclusive

Page 67: machine learning in the age of big data: new approaches and business applications

Applications

Top in image identification (in some cases it beats humans)
Top in video classification
Top in real-time translation
Top in gene identification
Reverse engineering: can replicate complex human behaviour, like walking
Data visualization and text disambiguation (river-bank/bank-bailout)
Kaggle

Page 68: machine learning in the age of big data: new approaches and business applications

Before Now

Page 69: machine learning in the age of big data: new approaches and business applications

Here comes everybody: Big Data, real BIG

Page 70: machine learning in the age of big data: new approaches and business applications

The Big Data Revolution

Page 71: machine learning in the age of big data: new approaches and business applications

In data we trust

In 2 years we produce more data (and garbage) than was accumulated over all of history
Zettabytes of data (10^21 bytes) are produced every year

Page 72: machine learning in the age of big data: new approaches and business applications

Data is the new gold… and it’s cheap

Machine learning molecules (**)

Page 73: machine learning in the age of big data: new approaches and business applications

Does size matter? A lot!

Most ML algorithms work better (sometimes much better) by simply throwing more data at them
And now we have more data. Plenty of it!
Which is signal and which is noise? Let the machines decide (they are good at it)
Where do humans stand in this equation? We are feeding the machines!

Page 74: machine learning in the age of big data: new approaches and business applications

What has changed in such a data-deluged world?

Don’t look for causation; welcome correlations
Messiness: prepare to get your hands dirty
Don’t expect definitive answers. Only communists have them!
Stop searching for God’s equation
Keep theories at bay and let the data speak
Exactitude may not be better than “estimations”
Forget about keeping data clean and organized
Data is alive and wild. Don’t imprison it

Page 75: machine learning in the age of big data: new approaches and business applications

Examples

Flu prediction
Netflix movie rating contest
New York City building security
Used car
Veg food -> airport
Prediction of rare events (fraud) and why it’s important

Page 76: machine learning in the age of big data: new approaches and business applications

6. Conclusions & reflections

Big Data & algorithms are revolutionizing the world. Fast!

A step closer to the brain? Yes and no
What is missing?
Predictive analytics (crime before it occurs)?
Algorithms that learn & adapt
Replace humans?
Augment reality

Page 77: machine learning in the age of big data: new approaches and business applications
Page 78: machine learning in the age of big data: new approaches and business applications

Are (intelligent) algorithms taking control of our lives?

Recommendations (Amazon, Netflix, Facebook)
Trading (70% of Wall Street trading is done by them)
Identifying your partner, recruiting, votes
Images, video, voice, translation (real time)
Where are we heading? NSA? Black boxes?

Page 79: machine learning in the age of big data: new approaches and business applications

References

Deeplearning.net
Hinton Google talks
“Too big to know”
Big Data: a new revolution that will transform business
Machine Learning in R

Page 80: machine learning in the age of big data: new approaches and business applications

Code

Matlab (several implementations; google for them)
R (CRAN repository), rminer
Python (scikit-learn)
C++ (mainly on GitHub)
Torch
More on Deeplearning.net

Page 81: machine learning in the age of big data: new approaches and business applications

User based Collaborative Filters

Recommend an unseen item i to a user u based on the engagement of other users with items 1 to 8. Items recommended in this case are i2 followed by i1.
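
A user-based collaborative filter of this kind can be sketched in a few lines: find the users most similar to u, then score the items u has not seen by how strongly those neighbours engaged with them. Cosine similarity, the neighbourhood size and the toy engagement data below are assumptions made for the illustration; the data is invented and is not the matrix shown on the slide.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[key] * b[key] for key in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend_user_based(ratings, user, k=2, top_n=2):
    """Rank unseen items by the similarity-weighted engagement of the k nearest users."""
    neighbours = sorted(
        ((cosine(ratings[user], r), other) for other, r in ratings.items() if other != user),
        reverse=True,
    )[:k]
    scores = {}
    for similarity, other in neighbours:
        for item, value in ratings[other].items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * value
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical engagement data (1 = the user engaged with the item).
ratings = {
    "u": {"i3": 1, "i4": 1},
    "a": {"i1": 1, "i2": 1, "i3": 1, "i4": 1},
    "b": {"i2": 1, "i3": 1},
    "c": {"i5": 1},
}
recommended = recommend_user_based(ratings, "u")
```

On this toy data both neighbours engaged with i2 and only one with i1, so i2 is ranked first.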

Page 82: machine learning in the age of big data: new approaches and business applications

Item based Collaborative Filters

Item-based recommendation for a user ua based on a neighbourhood of k = 3. Items recommended in this case are i3 followed by i4.

(Item-based CF is superior to user-based CF, but it requires a lot of information, such as ratings or user interactions with the product.)
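
The item-based variant swaps the roles: similarity is computed between items (two items are close when the same users engaged with both), and each unseen item is scored against the items the user already has. The sketch below uses cosine similarity and at most k = 3 neighbouring items per candidate; the function names and the engagement data are hypothetical and do not reproduce the slide's matrix.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[key] * b[key] for key in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def item_vectors(ratings):
    """Turn user -> {item: value} into item -> {user: value}."""
    items = {}
    for user, row in ratings.items():
        for item, value in row.items():
            items.setdefault(item, {})[user] = value
    return items


def recommend_item_based(ratings, user, k=3, top_n=2):
    """Score each unseen item by its similarity to the k closest items the user engaged with."""
    items = item_vectors(ratings)
    seen = ratings[user]
    scores = {}
    for candidate, vector in items.items():
        if candidate in seen:
            continue
        nearest = sorted(((cosine(vector, items[s]), s) for s in seen), reverse=True)[:k]
        scores[candidate] = sum(similarity * seen[s] for similarity, s in nearest)
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical engagement data (1 = the user engaged with the item).
ratings = {
    "ua": {"i1": 1, "i2": 1},
    "u1": {"i1": 1, "i2": 1, "i3": 1},
    "u2": {"i1": 1, "i3": 1, "i4": 1},
    "u3": {"i2": 1, "i4": 1, "i5": 1},
}
recommended = recommend_item_based(ratings, "ua")
```

Item-item similarities change slowly and can be precomputed offline, which is part of why this variant tends to scale better, at the price of needing enough interaction data per item.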