
Page 1: Introduction to Machine Learning - uliege.be · 2014-10-16 · Introduction to Machine Learning. Lesson given by Sébastien Piérard in the course "Vision 3D" (ULg, Pr. M. Van Droogenbroeck)

Introduction to Machine Learning

Lesson given by Sébastien Piérard in the course "Vision 3D" (ULg, Pr. M. Van Droogenbroeck)

INTELSIG, Montefiore Institute, University of Liège, Belgium

November 30, 2011

Page 2:

An introductory example

- Can you decide which silhouettes are those of humans?

- Try to write an algorithm to solve this problem!

Page 4:

An introductory example

Observation

Most of the tasks related to video scene interpretation are complex. A human expert can easily make the right decision, but usually without being able to explain how he does it.

Solution

Machine learning techniques can learn such decision rules automatically from examples; they are indispensable in computer science.

Page 5:

Outline

1 Introduction to machine learning (ML)

2 Classification

3 Conclusion

Page 7:

Machine learning techniques

Their¹ aim is to:

- build a decision rule automatically;

- be able to generalize to unseen objects;

- and speed up the decisions.

Computational cost:

- The model is learned only once.

- The model is used many times.

- Which operation should be the fastest?

¹ In this document, we consider only "supervised" machine learning techniques.

Page 8:

Machine learning techniques

Examples of techniques are:

- the nearest neighbors;

- the neural networks;

- the Extremely Randomized Trees (Extra-Trees) [5];

- and the Support Vector Machines (SVM) [2].

A good reference book on this topic is [6].

Page 9:

Families of machine learning methods

machine learning
  - supervised learning
      - classification
      - regression
  - unsupervised learning

Page 10:

Applications

Such techniques have proven to be successful for many purposes:

- detecting people in images [3];

- recognizing them [1];

- analyzing their behavior [7];

- detecting faces with software embedded in cameras [8];

- etc.

[image source: Shotton et al., 2011 [7]]

Page 11:

Classification vs regression

Example of classification

yes yes no yes no no

Example of regression

65.2° −2.0° −71.5° 15.4° −47.4° −5.5°

Page 12:

In practice . . .

[diagram: ML users use libraries ("black boxes") built by ML designers]

There exist many machine learning libraries. For example:

- libSVM (Matlab, Java, Python, etc.)

- Regression trees (C/Matlab, on Pierre Geurts's webpage)

- Neural Network Toolbox (Matlab)

- Java-ML (Java)

- Shark (C++)

- Shogun (C++)

- ...

Page 13:

Outline

1 Introduction to machine learning (ML)

2 Classification

3 Conclusion

Page 14:

What is learned

Example of learning database:

           x1    x2    x3    x4    x5    y
sample 1   7.99  6.77  9.75  1.58  1.00  0
sample 2   2.24  9.51  1.14  8.00  7.66  0
sample 3   2.18  2.83  2.96  5.14  9.73  0
sample 4   8.44  7.39  4.57  4.94  2.70  1
sample 5   9.55  5.92  2.52  0.46  1.53  1
sample 6   3.32  9.13  0.50  5.07  8.22  2

MODEL ≡ y(x1, x2, x3, x4, x5) = ?

- y is the output variable (the class).

- The samples have to be described by attributes x1, x2, ...

- The same number of attributes should be used for all samples.

- The meaning of an attribute should not depend on the sample.
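The requirements above can be made concrete with a few lines of plain Python (the values are copied from the table; the assertions encode the two constraints on attributes):

```python
# Learning set: one row of attributes per sample, one class label per sample.
X = [
    [7.99, 6.77, 9.75, 1.58, 1.00],  # sample 1
    [2.24, 9.51, 1.14, 8.00, 7.66],  # sample 2
    [2.18, 2.83, 2.96, 5.14, 9.73],  # sample 3
    [8.44, 7.39, 4.57, 4.94, 2.70],  # sample 4
    [9.55, 5.92, 2.52, 0.46, 1.53],  # sample 5
    [3.32, 9.13, 0.50, 5.07, 8.22],  # sample 6
]
y = [0, 0, 0, 1, 1, 2]

# Every sample must be described by the same number of attributes,
# and there must be one label per sample.
assert all(len(row) == len(X[0]) for row in X)
assert len(y) == len(X)
```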

Page 15:

Example of classification task

handwritten character recognition

- size = 100 samples

- choice: attributes = raw pixels

- the size of the images is 32 × 32

- dimension = 1024 attributes

[image source: P. Geurts, ”An introduction to Machine Learning”]

Page 16:

The intrinsic difficulty of machine learning

The theoretical rule to minimize the error rate is

    y(x) = arg max_{yi ∈ {0, 1, ...}} P[y = yi | x]    (1)

Let ρ be the probability density function of all objects in the attribute space, and ρi be the probability density function of the objects belonging to class yi. Using Bayes' rule,

    P[y = yi | x] = ρi[x] · P[y = yi] / ρ[x]    (2)

Therefore,

    y(x) = arg max_{yi ∈ {0, 1, ...}} ( ρi[x] · P[y = yi] )    (3)

The intrinsic difficulty is that it is very difficult to estimate ρi from the learning database, because the space is not densely sampled.
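Rule (3) can be sketched numerically. In this toy sketch the class densities ρi are one-dimensional Gaussians and the priors are invented for illustration; nothing here comes from the slides:

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

# Hypothetical class-conditional densities rho_i and priors P[y = yi].
classes = {
    0: {"mean": 2.0, "std": 1.0, "prior": 0.7},
    1: {"mean": 5.0, "std": 1.5, "prior": 0.3},
}

def classify(x):
    """Equation (3): pick the class maximizing rho_i[x] * P[y = yi]."""
    return max(classes,
               key=lambda c: gaussian_pdf(x, classes[c]["mean"],
                                          classes[c]["std"]) * classes[c]["prior"])

print(classify(1.8))  # close to class 0's mean -> 0
print(classify(6.0))  # close to class 1's mean -> 1
```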

Page 17:

An example of decision rule in 1D

Imagine we have to recognize men and women based on a single attribute: the height. In Australia, there are 100 women for 100 men. But in Russia, there are 114 women for 100 men.

[figure: densities ρman and ρwoman as a function of height (160–180 cm), together with the prior-weighted curves ρman · 100/214 and ρwoman · 114/214, and the resulting Australian and Russian decision thresholds]

y = (height < 169.34) ? "woman" : "man";
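The threshold is simply the height at which the prior-weighted densities cross. A sketch in Python, with assumed Gaussian height models (the means, standard deviations, and therefore the resulting thresholds are illustration values and will not reproduce the slide's 169.34 cm):

```python
import math

def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

# Assumed class-conditional height models (illustration only).
MEAN_WOMAN, STD_WOMAN = 165.0, 6.0
MEAN_MAN, STD_MAN = 176.0, 6.5

def decide(height_cm, p_woman, p_man):
    """Prior-weighted Bayes decision between 'woman' and 'man'."""
    score_w = gaussian_pdf(height_cm, MEAN_WOMAN, STD_WOMAN) * p_woman
    score_m = gaussian_pdf(height_cm, MEAN_MAN, STD_MAN) * p_man
    return "woman" if score_w > score_m else "man"

def threshold(p_woman, p_man):
    """Scan upward for the height where the decision flips to 'man'."""
    h = 150.0
    while decide(h, p_woman, p_man) == "woman":
        h += 0.01
    return h

# Russia: 114 women per 100 men; Australia: 100 per 100.
print(round(threshold(114 / 214, 100 / 214), 2))  # Russian threshold
print(round(threshold(100 / 214, 100 / 214), 2))  # Australian threshold
```

A larger prior for "woman" pushes the threshold upward, which is why the Russian and Australian rules differ.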

Page 18:

Example of classifier: the nearest neighbor(s)

Let us consider a problem in two dimensions:

[figure: a 2-D learning set in the (x1, x2) plane with a query point "?", and the decisions obtained with 1-NN and k-NN]

Page 19:

Example of classifier: the nearest neighbor(s)

- The size of the neighborhood is automatically chosen depending on k.

- The model is the learning set (or a pruned version of it).

- First drawback: the time needed to take a decision is O(n), where n is the learning set size.

- Second drawback: which distance measure should we select? There exists an infinity of possible choices! [4]
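A minimal brute-force 1-NN classifier in plain Python illustrates both drawbacks: the decision loops over the whole learning set (O(n)), and the Euclidean distance is an arbitrary choice among many:

```python
import math

def nearest_neighbor_classify(X, y, query):
    """Predict the class of `query` as the class of its nearest sample.
    One pass over the whole learning set: O(n) per decision."""
    best_i = min(range(len(X)),
                 key=lambda i: math.dist(X[i], query))  # Euclidean distance
    return y[best_i]

# Toy 2-D learning set with two clusters.
X = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 7.5)]
y = ["a", "a", "b", "b"]

print(nearest_neighbor_classify(X, y, (2.0, 1.0)))  # near the "a" cluster
print(nearest_neighbor_classify(X, y, (8.5, 8.0)))  # near the "b" cluster
```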

Page 20:

Example of classifier: the decision trees

[image source: P. Geurts, ”An introduction to Machine Learning”]

Page 21:

Choosing the complexity of the model

Which model is the best?

[image source: P. Geurts, ”An introduction to Machine Learning”]

model 1: error(LS) = 3.4 %, error(TS) = 3.5 %
model 2: error(LS) = 1.0 %, error(TS) = 1.5 %
model 3: error(LS) = 0.0 %, error(TS) = 3.5 %

- Does the model explain the learning set? → resubstitution error = error estimated on the learning set (LS)

- Is the model able to predict the classes for unknown samples? → generalization error = error estimated on the test set (TS)
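The gap between the two errors is easy to reproduce: a model that merely memorizes the learning set has a zero resubstitution error yet may generalize poorly. A toy sketch (data invented for illustration):

```python
def memorizing_classifier(learning_set):
    """Model = the learning set itself: answers perfectly on seen points,
    falls back to a constant class everywhere else."""
    memory = dict(learning_set)
    return lambda x: memory.get(x, 0)

learning_set = [((1, 1), 0), ((2, 5), 1), ((4, 2), 0), ((5, 5), 1)]
test_set     = [((1, 2), 0), ((2, 4), 1), ((5, 4), 1)]

model = memorizing_classifier(learning_set)

def error_rate(model, samples):
    return sum(model(x) != label for x, label in samples) / len(samples)

print(error_rate(model, learning_set))  # resubstitution error: 0.0
print(error_rate(model, test_set))      # generalization error: much worse
```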

Page 22:

Choosing the complexity of the model

[image source: P. Geurts, ”An introduction to Machine Learning”]

Page 23:

Evaluation of binary classifiers

- The two classes are { positive, negative }.

- Example: P = human, N = non-human.

                          classified as positive    classified as negative
real class      positive  true positives (TP)       false negatives (FN)
(ground truth)  negative  false positives (FP)      true negatives (TN)

#P = #TP + #FN        #N = #TN + #FP

TPR = #TP / #P        FNR = #FN / #P
TNR = #TN / #N        FPR = #FP / #N
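These definitions translate directly into code. A small sketch in plain Python with invented labels:

```python
def binary_rates(truth, predicted):
    """Confusion-matrix rates for a binary classifier.
    `truth` and `predicted` are parallel lists of booleans (True = positive)."""
    tp = sum(t and p for t, p in zip(truth, predicted))
    fn = sum(t and not p for t, p in zip(truth, predicted))
    fp = sum(not t and p for t, p in zip(truth, predicted))
    tn = sum(not t and not p for t, p in zip(truth, predicted))
    P, N = tp + fn, tn + fp
    return {"TPR": tp / P, "FNR": fn / P, "TNR": tn / N, "FPR": fp / N}

truth     = [True, True, True, False, False]
predicted = [True, False, True, False, True]
print(binary_rates(truth, predicted))
```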

Page 24:

Evaluation of binary classifiers

Remark 1:

To evaluate a classifier, two quantities are required:

1. TPR or FNR (TPR + FNR = 1)
2. TNR or FPR (TNR + FPR = 1)

Remark 2:

There is always a trade-off:

- It is easy to obtain a high TPR.

- It is easy to obtain a high TNR.

- But it is difficult to obtain both simultaneously!

Remark 3:

A binary classifier ≡ a threshold thr. Both TPR and TNR depend on the value of thr. Therefore, we need to choose the value of thr carefully to optimize (TPR, TNR)!

Page 25:

Evaluation: receiver operating characteristic (ROC)

{ (TPR(thr), TNR(thr)) : thr ∈ ℝ } = a ROC curve

[figure: ROC curves (TNR on the horizontal axis, TPR on the vertical axis, both from 0 to 1) for models 1, 2, 3 and for a "stupid model" baseline]
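A ROC curve can be traced by sweeping the decision threshold over the classifier's scores. A self-contained sketch (scores and labels are invented; following the slide's convention, each point is a (TNR, TPR) pair):

```python
def roc_points(scores, labels):
    """(TNR, TPR) pairs obtained by sweeping a decision threshold.
    `labels` are booleans (True = positive); predict positive when score >= thr."""
    points = []
    for thr in sorted(set(scores)) + [float("inf")]:
        pred = [s >= thr for s in scores]
        tp = sum(p and l for p, l in zip(pred, labels))
        tn = sum(not p and not l for p, l in zip(pred, labels))
        tpr = tp / sum(labels)
        tnr = tn / (len(labels) - sum(labels))
        points.append((tnr, tpr))
    return points

scores = [0.9, 0.8, 0.6, 0.55, 0.4, 0.2]
labels = [True, True, False, True, False, False]
for tnr, tpr in roc_points(scores, labels):
    print(f"TNR = {tnr:.2f}  TPR = {tpr:.2f}")
```

The sweep always produces the two trivial endpoints (accept everything, reject everything); the interesting part of the curve lies between them.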

Page 26:

Evaluation: detection error tradeoff (DET)

{ (FNR(thr), FPR(thr)) : thr ∈ ℝ } = a DET curve (the same information as the ROC curve, plotted with error rates on logarithmic axes)

[figure: DET curves for models 1, 2, 3 and the "stupid model", with FNR = 1 − TPR on the horizontal axis and FPR = 1 − TNR on the vertical axis, both on logarithmic scales from 10⁻⁵ to 10⁰]

Page 27:

Outline

1 Introduction to machine learning (ML)

2 Classification

3 Conclusion

Page 28:

Conclusion

ML = automatic + generalization + preprocessing

Machine learning techniques are:

- powerful methods;

- a complement to traditional algorithmics;

- indispensable in computer science;

- adequate for real-time computations;

- "easy" to use².

² However, optimal results are difficult to obtain. This is why researchers are still working on machine learning methods.

Page 29:

Bibliography I

[1] N. Boulgouris, D. Hatzinakos, and K. Plataniotis. Gait recognition: a challenging signal processing technology for biometric identification. IEEE Signal Processing Magazine, 22(6):78–90, November 2005.

[2] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20(3):273–297, 1995.

[3] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), volume 1, pages 886–893, San Diego, USA, June 2005.

Page 30:

Bibliography II

[4] M. Deza and E. Deza. Encyclopedia of Distances. Springer, 2009.

[5] P. Geurts, D. Ernst, and L. Wehenkel. Extremely randomized trees. Machine Learning, 63(1):3–42, April 2006.

[6] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics. Springer, second edition, September 2009.

Page 31:

Bibliography III

[7] J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, and A. Blake. Real-time human pose recognition in parts from single depth images. In IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, June 2011.

[8] P. Viola and M. Jones. Robust real-time face detection. International Journal of Computer Vision, 57(2):137–154, 2004.
