machine learning with matlab - ku leuven · 2014-12-08 · 6 challenges –machine learning lots of...

17
1 © 2014 The MathWorks, Inc. Machine Learning with MATLAB Leuven Statistics Day2014 Rachid Adarghal , Account Manager Jean-Philippe Villaréal, Application Engineer

Upload: others

Post on 23-Mar-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

1© 2014 The MathWorks, Inc.

Machine Learning with MATLAB

Leuven Statistics Day2014Rachid Adarghal, Account Manager

Jean-Philippe Villaréal, Application Engineer

Page 2: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

2

Side note: Design of Experiments with

MATLAB

Page 3: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

3

What You Will Learn

Get an overview of Machine Learning

Machine learning models and techniques available

in MATLAB

MATLAB as an interactive environment

– Evaluate and choose the best algorithm

Page 4: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

4

Machine Learning

Characteristics

– Lots of data (many variables)

– System too complex to understand

the governing equation

Page 5: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

5

Domains of Application

Handwriting recognition

Autonomous vehicles

DNA sequencing / Genomics

Cancer tumor classification

Social Network Analysis

Astronomical Data Analysis

Market Segmentation

Organizing Computer Cluster for efficiency

Spam / non spam email classification

Hearing headsets: optimizing signal (Cocktail party)

Shazam / SoundHound

FingerPrinting

Page 6: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

6

Challenges – Machine Learning

Lots of data, with many variables (predictors)

Data is too complex to know the governing

equation

Significant technical expertise required

– Black box modelling

No “one size fits all” approach: Requires an

iterative approach:

– Try multiple algorithms, see what works best

– Time consuming to conduct the analysis

– Know-how required to debug your algorithm efficiently

Page 7: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

7

MATLAB Solutions

Strong environment for interactive exploration

Algorithms and Apps to get started

– Clustering, Classification, Regression

– Neural Network app, Curve fitting app

Easy to evaluate, iterate, and choose the best

algorithm

Parallel Computing

Deployment for Data Analytics workflows

Page 8: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

9

Overview – Machine Learning

Machine

Learning

Supervised

Learning

Classification

Regression

Unsupervised

LearningClustering

Group and interpretdata based only

on input data

Develop predictivemodel based on bothinput and output data

Type of Learning Categories of Algorithms

Recommender

systems

Page 9: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

10

Unsupervised Learning

Fuzzy C-Means

Hierarchical

clustering

Self-Organizing

Maps

Gaussian

Mixture

Hidden Markov

Model

k-Means

Partitional

Clustering

Overlapping

Clustering

Clustering

Page 10: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

11

Supervised Learning

Non-linear Reg.

(GLM, Logistic)

Linear

RegressionDecision Trees

Ensemble

Methods

Neural

Networks

Nearest

Neighbor

Discriminant

AnalysisNaive Bayes

Support Vector

Machines

Classification

Regression

Page 11: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

12

Supervised Learning - Workflow

Known data

Known responses

Model

Train the Model

Model

New Data

Predicted

Responses

Use for Prediction

Measure Accuracy

Select Model

Import Data

Explore Data

Data

Prepare Data

Speed up Computations

Page 12: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

13

Example – Bank Marketing Campaign

Goal:

– Predict if customer would subscribe to

bank term deposit based on different

attributes

Approach:

– Train a classifier using different models

– Measure accuracy and compare models

– Reduce model complexity

– Use classifier for prediction

Data set downloaded from UCI Machine Learning repository

http://archive.ics.uci.edu/ml/datasets/Bank+Marketing

0

10

20

30

40

50

60

70

80

90

100

Perc

enta

ge

Bank Marketing Campaign

Misclassification Rate

Neur

al N

et

Logi

stic

Reg

ress

ion

Dis

crim

inant

Ana

lysi

s k-

neare

st N

eig

hbor

s

Naiv

e B

ayes

Sup

port V

M

Deci

sion

Tree

s

Tre

eBagg

er

Redu

ced

TB

No

Misclassified

Yes

Misclassified

Page 13: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

14

Summary– Bank Marketing Campaign

Numerous predictive models with

rich documentation

– Clustering, regression, classification

Interactive tools to help discovery

– Histograms, bar charts, ROC curves

– Graphical Apps

Built-in parallel computing support

Quick prototyping

– Focus on modeling not programming

0

10

20

30

40

50

60

70

80

90

100

Perc

enta

ge

Bank Marketing Campaign

Misclassification Rate

Neur

al N

et

Logi

stic

Reg

ress

ion

Dis

crim

inant

Ana

lysi

s k-

neare

st N

eig

hbor

s

Naiv

e B

ayes

Sup

port V

M

Deci

sion

Tree

s

Tre

eBagg

er

Redu

ced

TB

No

Misclassified

Yes

Misclassified

Page 14: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

15

Learn More: Machine Learning with MATLAB

Visit our discovery page: www.mathworks.com/machine-learning

Page 15: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

16

Deploying / Sharing Your Application

MATLAB Compiler

.exe.dll

.lib

Builder

NE

Builder

JA

Builder

Ex

APPS

Web

Web

Web

MATLAB Coder

.exe .lib .dll

MEX

.CTF

Page 16: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

17

MathWorks Services

Trainings: – More that 30 course offerings

Consulting Services

– Enhance your team

Technical Support

– Ask questions

An active community: – MATLAB Central

– File exchange

– Blogs

– Newsletters

Page 17: Machine Learning with MATLAB - KU Leuven · 2014-12-08 · 6 Challenges –Machine Learning Lots of data, with many variables (predictors) Data is too complex to know the governing

18© 2014 The MathWorks, Inc.

Thank you for attending!