Demystifying Machine Learning Professor Paul Kennedy [email protected] Centre for Artificial Intelligence School of Software, Faculty of Engineering & IT


Demystifying Machine Learning

Professor Paul Kennedy

[email protected]

Centre for Artificial Intelligence

School of Software, Faculty of Engineering & IT


• What is ML?

• Different types of ML & what they can and can’t do

• Some examples

• Brief overview of some common ML approaches

• How to go about solving problems with ML

What is Machine Learning?

Machine Learning

• A computer program is said to learn from experience E with respect to tasks T and performance measure P if its performance at tasks T, as measured by P, improves with experience E.

• T = tasks

  • e.g., predict if a customer will take up an offer

• P = performance measure

  • e.g., % correctly predicted

• E = experience

  • e.g., past examples of customers who did and didn’t take up the offer.
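The T / P / E definition above can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than material from the talk: the "interest score", the rule that customers take up the offer at a score of 50 or more, and the very simple threshold learner. It only shows performance P (accuracy) on task T (predicting take-up) changing as experience E (past customers seen) grows.

```python
# Hypothetical toy world: a customer takes up the offer exactly when
# their "interest score" is 50 or more. The learner does not know this rule.
def takes_up_offer(score):
    return score >= 50

def learn_threshold(examples):
    """Experience E -> model: put a cut-off halfway between the highest
    score seen without take-up and the lowest score seen with take-up."""
    neg = [s for s, took in examples if not took]
    pos = [s for s, took in examples if took]
    if not pos:
        return float("inf")    # never saw a take-up: always predict "no"
    if not neg:
        return float("-inf")   # only saw take-ups: always predict "yes"
    return (max(neg) + min(pos)) / 2

def accuracy(threshold, test_scores):
    """Performance measure P: fraction of customers predicted correctly."""
    hits = sum((s > threshold) == takes_up_offer(s) for s in test_scores)
    return hits / len(test_scores)

# Past customers in a fixed, scrambled order (a permutation of 0..99).
past = [(s, takes_up_offer(s)) for s in ((i * 37) % 100 for i in range(100))]
# New customers the learner has not seen before.
new_customers = [s + 0.5 for s in range(100)]

results = {n: accuracy(learn_threshold(past[:n]), new_customers)
           for n in (2, 3, 100)}
for n, acc in results.items():
    print(f"E = {n:3d} past customers -> P = {acc:.2f}")
```

With this particular data the accuracy rises as more past customers are used, but a learner this simple is not guaranteed to improve at every step.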

[Figure (Flach 2012): Task: domain objects → features → model → output. Learning problem: training data → learning algorithm → model.]

Different types of ML & what it can and can’t do

Machine Learning

• Unsupervised methods

  • Make sense of the data

• Supervised methods

  • Learn a relationship between inputs and outputs from old data.

  • Use to predict output for new inputs.

• Others: reinforcement learning, semi-supervised learning, transfer learning, one-class learning, …

Unsupervised Learning

• Make sense of the dataset by representing it in another form.

• Identify clusters or groups in the data.

• Identify frequent patterns in the data.

• e.g. clustering, association rule mining, neural networks, …

Supervised Learning

• Using existing data, learn the relationship between some ‘inputs’ and a known ‘output’ or target value.

• The learned model can then be used to make predictions for new ‘input’ data.

• e.g. classification and regression

• Decision trees, neural networks, support vector machines, random forest, …
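As a minimal sketch of the supervised idea above (learn from old input/output pairs, then predict for a new input), here is a one-variable least-squares regression in plain Python. The floor-area and price figures are made up for illustration, and `fit_line` and `predict` are hypothetical helper names, not part of any library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up 'old data': floor area (input) and price (known output).
areas = [50, 60, 80, 100]
prices = [150, 170, 210, 250]

slope, intercept = fit_line(areas, prices)

def predict(area):
    """Use the learned relationship on a new input."""
    return slope * area + intercept

print(predict(70))
```

The same fit-then-predict pattern applies to the classifiers listed above; only the form of the model changes.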


What it can do

• Unsupervised:

  • can divide the data points into groups that hopefully match reality, and assess how well the clusters make sense.

• Supervised:

  • if there are enough data points with few enough attributes …

  • that match the scope of the real-world domain …

  • and there is a relationship between the inputs and outputs, …

  • these methods can find a pattern that can generalise, and we can estimate its quality.

What it can’t do

• It’s not magic!

• Heavily reliant on the quality and amount of input / training data

• Usually doesn’t ‘understand’ the problem in a ‘human’ way

• Sometimes cannot explain why a decision was made


Some examples

Predicting the 2012 US election result

• Nate Silver used predictive analytics & statistics to correctly predict the outcomes of 50 out of 50 states from polling and related data.

• Republican pundits were confident in their landslide-win predictions. Democrat pundits predicted a razor-thin victory.

• Shows the power of a data-centric approach over “gut-feeling”.

AlexNet

• Deep convolutional neural network trained on GPUs

• Famously won the ImageNet LSVRC-2012 competition by a large margin: 15.3% vs 26.2% (second place) top-5 error rates

• Boosted deep learning research

• Since surpassed by later networks, e.g. Microsoft’s ResNet, which won the 2015 competition.

Krizhevsky et al., Communications of the ACM. 60 (6): 84–90.

Local Interpretable Model-Agnostic Explanations (LIME)

LIME, Ribeiro et al., 2016



Brown et al., Adversarial Patch, arXiv:1712.09665v2 [cs.CV] 17 May 2018

Brief overview of methods

Unsupervised techniques

• Association analysis / Market Basket Analysis

  • Identify frequent and/or interesting patterns in transaction databases.

  • e.g. if someone buys bread they are also likely to buy butter.

  • Rules come with measures of how often the rule appears (support) and how often it is true (confidence).
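The bread-and-butter rule above can be scored with the two usual measures, support (how often the items appear together) and confidence (how often the rule holds when its left-hand side is present). The baskets below are invented for illustration, and the helper names are hypothetical.

```python
# Invented shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def count(itemset, baskets):
    """Number of baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets)

def support(itemset, baskets):
    """Fraction of all baskets containing the itemset."""
    return count(itemset, baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing `antecedent`, the fraction that also
    contain `consequent`, i.e. how often the rule is true."""
    return count(antecedent | consequent, baskets) / count(antecedent, baskets)

rule_support = support({"bread", "butter"}, baskets)
rule_confidence = confidence({"bread"}, {"butter"}, baskets)
print(f"bread -> butter: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Algorithms such as Apriori search for all rules above chosen support and confidence thresholds; this sketch only scores one rule.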

Unsupervised techniques

• Clustering

  • Identify groups within data where data points in the group are similar to one another but different to those in other groups.

  • hierarchical, k-means, k-medoids, EM, DBSCAN, BIRCH, …
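Of the clustering methods listed, k-means is probably the easiest to sketch. The version below is a bare-bones illustration in plain Python, assuming made-up 2-D points and hand-picked starting centroids; real implementations add random restarts and better initialisation.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    centroids = list(centroids)
    for _ in range(iterations):
        labels = [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        updated = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:              # nothing moved: converged
            break
        centroids = updated
    return labels, centroids

# Two visibly separate groups of made-up points.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

The result depends on the starting centroids, which is one reason the cluster quality needs to be assessed afterwards.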

Classification & prediction methods

• Linear regression, logistic regression, sparse variants …

• k-nearest neighbour classifiers, …

• Decision trees

• Random forest

• Artificial neural networks: multilayer perceptrons, deep networks, convolutional neural networks, recurrent neural networks, …

• Support vector machines, …

• Naive Bayes, Bayesian networks, …

• Ensemble methods: gradient boosting, AdaBoost, XGBoost, …
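To give a feel for one entry in the list above, here is a minimal k-nearest neighbour classifier in plain Python: a new point gets the majority label of its k closest training points. The training points and the "yes"/"no" labels are invented, and `knn_predict` is a hypothetical helper name.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to `query`.
    `train` is a list of (features, label) pairs."""
    by_distance = sorted(
        train,
        key=lambda example: sum((a - b) ** 2 for a, b in zip(example[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Invented labelled examples with two numeric features each.
train = [
    ((1, 1), "no"), ((2, 1), "no"), ((1, 2), "no"),
    ((6, 6), "yes"), ((7, 6), "yes"), ((6, 7), "yes"),
]

print(knn_predict(train, (2, 2)))
print(knn_predict(train, (6.5, 6.5)))
```

There is no training step here beyond storing the data, which is what distinguishes k-nearest neighbour from most of the other methods in the list.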


Decision Tree


Random Forest


Neural Nets

[Figure: output labels “Yes” / “No”]


Support Vector Machines

How to go about solving problems

Fitting to the business

• Understanding the business context and, going further, framing a business question.

• Translating the business question into a data analytics question.

• Collecting, understanding and processing data from across the business and possibly externally.

• Building models and evaluating them.

• Deploying the results in the business to deliver benefits.

• It is an iterative process.

CRISP-DM view

[Figure: CRoss-Industry Standard Process for Data Mining (CRISP-DM) methodology. Source: Kenneth Jensen / Wikimedia Commons / Public Domain]

Validation

• Need to evaluate the quality of models.

• Many approaches

  • Hold out sets

  • Bootstrap validation

  • K-fold cross validation

• Data split (figure):

  • Train: used to train the model

  • Validation: used to tune parameters

  • Test: used to evaluate the model
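The K-fold approach listed above can be sketched as an index splitter: the data is cut into k folds, and each fold takes one turn as the evaluation set while the rest is used for training. `k_fold_indices` is a hypothetical helper written for this illustration; it assumes the rows have already been shuffled.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for each of k folds over n rows.
    Every row lands in exactly one test fold."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

A model would be fitted on each `train_idx` and scored on the matching `test_idx`, and the k scores averaged. A final held-out test set, as in the split above, is still kept aside for the last evaluation.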

Ways it can go wrong

• Answering the wrong business question

• Not deploying properly

• Model goes stale - the underlying problem is non-stationary

• Overfitting - the model has a high training accuracy, but doesn’t work well in the real world

• Underfitting - the model has a low accuracy even on the training data

• p >> n, aka the curse of dimensionality

  • Too many attributes (p) for the number of data points (n)

• Imbalanced classes - the model is biased towards the majority class, often not the one you’re interested in

• Bias - training data is biased or not representative of the real-world situation

• Insufficient data cleaning
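The imbalanced-classes pitfall above is easy to show with numbers. In this made-up example only 5 of 100 cases belong to the class of interest, and a "model" that always predicts the majority class still scores 95% accuracy while finding none of the cases that matter.

```python
# Made-up labels: 1 = rare class of interest, 0 = majority class.
labels = [1] * 5 + [0] * 95

# A degenerate 'model' that always predicts the majority class.
predictions = [0] * len(labels)

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Recall: the share of true rare cases that the model actually finds.
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / sum(labels)

print(f"accuracy = {accuracy:.2f}, recall on rare class = {recall:.2f}")
```

This is why accuracy alone can be a misleading performance measure when the classes are imbalanced.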