Classification
Derek Hoiem, CS 598, Spring 2009
Jan 27, 2009

TRANSCRIPT

Page 1: Classification Derek Hoiem CS 598, Spring 2009 Jan 27, 2009

Classification

Derek Hoiem
CS 598, Spring 2009

Jan 27, 2009

Page 2

Outline

• Principles of generalization

• Survey of classifiers

• Project discussion

• Discussion of Rosch

Page 3

Pipeline for Prediction

Imagery → Representation → Classifier → Predictions

Page 4

No Free Lunch Theorem

Page 5

Bias and Variance

[Figure: error vs. model complexity; low complexity gives high bias / low variance, high complexity gives low bias / high variance]

Page 6

Overfitting
• Need validation set
• Validation set not same as test set
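One common way to realize the validation/test distinction is a three-way split; a minimal sketch in pure Python (the function name and split fractions are illustrative, not from the slides):

```python
import random

def train_val_test_split(data, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle once, then carve out disjoint validation and test sets.

    The validation set tunes hyperparameters (e.g. to detect overfitting);
    the test set is touched only once, for the final error estimate.
    """
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

train, val, test = train_val_test_split(list(range(100)))
```

The key property is disjointness: a hyperparameter chosen on `val` has never seen `test`, so the final test error remains an unbiased estimate.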

Page 7

Bias-Variance View of Features
• More compact = lower variance, potentially higher bias
• More features = higher variance, lower bias
• More independence among features = simpler classifier, lower variance

Page 8

How to reduce variance
• Parameterize model
  – E.g., linear vs. piecewise

Page 9

How to measure complexity?
• VC dimension

Upper bound on generalization error, holding with probability $1 - \eta$:

$$E_{\text{test}} \le E_{\text{train}} + \sqrt{\frac{h\left(\ln(2N/h) + 1\right) - \ln(\eta/4)}{N}}$$

where $N$ is the size of the training set and $h$ is the VC dimension.
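To see the bound's behavior numerically, here is a small sketch assuming Vapnik's standard form of the VC bound (the exact constants on the original slide may differ):

```python
import math

def vc_generalization_bound(train_error, n, h, eta=0.05):
    """Upper bound on test error, holding with probability 1 - eta.

    n: number of training examples; h: VC dimension of the hypothesis class.
    Assumes Vapnik's standard bound (an assumption about the slide's formula).
    """
    penalty = math.sqrt((h * (math.log(2 * n / h) + 1) - math.log(eta / 4)) / n)
    return train_error + penalty

# More data tightens the bound; higher VC dimension loosens it.
loose = vc_generalization_bound(0.10, n=1_000, h=50)
tight = vc_generalization_bound(0.10, n=100_000, h=50)
```

Note the penalty shrinks roughly like sqrt(h/N): growing the training set or shrinking the hypothesis class both tighten the guarantee.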

Page 10

How to reduce variance
• Parameterize model
• Regularize
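As an illustration of regularization reducing variance, here is a hypothetical 1D ridge-style fit (an L2 penalty on the weight); the data, learning rate, and function name are made up for the example:

```python
def ridge_fit_1d(xs, ys, lam, steps=2000, lr=0.01):
    """Fit y ~ w*x by gradient descent on mean squared error + lam * w**2.

    The penalty shrinks w toward 0: larger lam -> lower variance, more bias.
    """
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n + 2 * lam * w
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]              # true slope is 2
w_free = ridge_fit_1d(xs, ys, lam=0.0)   # recovers ~2.0
w_reg = ridge_fit_1d(xs, ys, lam=5.0)    # shrunk toward 0
```

With `lam=0` the fit matches the data exactly; any positive `lam` trades some bias (a smaller slope) for stability under noisy data.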

Page 11

How to reduce variance
• Parameterize model
• Regularize
• Increase number of training examples

Page 12

Effect of Training Size

[Figure: error (y-axis) vs. number of training examples (x-axis)]

Page 13

Risk Minimization
• Margins

[Figure: two classes (x's and o's) in feature space (x1, x2)]

Page 14

Classifiers
• Generative methods
  – Naïve Bayes
  – Bayesian Networks
• Discriminative methods
  – Logistic Regression
  – Linear SVM
  – Kernelized SVM
• Ensemble methods
  – Randomized Forests
  – Boosted Decision Trees
• Instance-based
  – K-nearest neighbor
• Unsupervised
  – K-means

Page 15

Components of classification methods
• Objective function
• Parameterization
• Regularization
• Training
• Inference

Page 16

Classifiers: Naïve Bayes
• Objective
• Parameterization
• Regularization
• Training
• Inference

[Figure: graphical model with class y as parent of features x1, x2, x3]

Page 17

Classifiers: Logistic Regression
• Objective
• Parameterization
• Regularization
• Training
• Inference
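A sketch of logistic regression trained by stochastic gradient ascent on the log-likelihood (no regularization; the toy 1D data and learning rate are illustrative):

```python
import math

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Maximize the log-likelihood of y in {0,1} under p = sigmoid(w.x + b).

    The per-example gradient of the log-likelihood is simply (y - p) * x.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1 / (1 + math.exp(-z))
            err = t - p
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def predict_logistic(w, b, x):
    """Threshold the linear score at 0 (equivalently, probability at 0.5)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

X = [[0.0], [1.0], [3.0], [4.0]]
y = [0, 0, 1, 1]
w, b = fit_logistic(X, y)
```

Unlike Naïve Bayes, this directly models p(y | x) with a linear decision boundary, making no assumption about how x is generated.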

Page 18

Classifiers: Linear SVM
• Objective
• Parameterization
• Regularization
• Training
• Inference

[Figure: two classes (x's and o's) in feature space (x1, x2)]
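One illustrative way to train a linear SVM is subgradient descent on the regularized hinge loss (a Pegasos-style sketch with made-up data; not necessarily the formulation used in the lecture):

```python
def fit_linear_svm(X, y, lam=0.01, lr=0.01, epochs=500):
    """Minimize hinge loss + lam * ||w||^2 by per-example subgradient steps.

    Labels y are in {-1, +1}; the margin constraint is y * (w.x + b) >= 1,
    and lam controls the regularization (variance reduction).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            margin = t * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Inside the margin: hinge subgradient plus L2 shrinkage.
                w = [wi + lr * (t * xi - 2 * lam * wi) for wi, xi in zip(w, x)]
                b += lr * t
            else:
                # Outside the margin: only the L2 shrinkage applies.
                w = [wi * (1 - 2 * lr * lam) for wi in w]
    return w, b

def svm_score(w, b, x):
    """Signed distance proxy: positive -> class +1, negative -> class -1."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

X = [[1.0, 1.0], [2.0, 1.5], [4.0, 4.0], [5.0, 4.5]]
y = [-1, -1, 1, 1]
w, b = fit_linear_svm(X, y)
```

Only points with margin below 1 push on the weights, which is why the solution depends on a few support vectors rather than on every example.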


Page 20

Classifiers: Linear SVM
• Objective
• Parameterization
• Regularization
• Training
• Inference

[Figure: two classes (x's and o's) in feature space (x1, x2); one o falls among the x's, so a hard-margin separator fails]

Needs slack

Page 21

Classifiers: Kernelized SVM
• Objective
• Parameterization
• Regularization
• Training
• Inference

[Figure: 1D data (x's and o's) along x1 is not linearly separable; after mapping x1 → (x1, x1²) the classes separate]
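The idea in the figure, lifting 1D inputs to (x1, x1²) so a linear boundary works, can be checked directly; a toy sketch (data invented for illustration):

```python
def feature_map(x1):
    """Lift a 1D input to (x1, x1**2). A kernelized SVM computes inner
    products in such a lifted space without building it explicitly."""
    return (x1, x1 ** 2)

# o's cluster near 0; x's lie on both sides. No single threshold on x1
# separates them, but a threshold on x1**2 does.
xs_o = [-0.5, 0.0, 0.5]
xs_x = [-3.0, -2.5, 2.5, 3.0]

def separable_1d(a, b):
    """True if some threshold splits the two 1D value sets."""
    return max(a) < min(b) or max(b) < min(a)

sep_raw = separable_1d(xs_o, xs_x)
sep_lifted = separable_1d([feature_map(v)[1] for v in xs_o],
                          [feature_map(v)[1] for v in xs_x])
```

This explicit map is only feasible for tiny feature spaces; the kernel trick (e.g. polynomial or RBF kernels) gets the same effect implicitly for very high-dimensional lifts.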

Page 22

Classifiers: Decision Trees
• Objective
• Parameterization
• Regularization
• Training
• Inference

[Figure: two classes (x's and o's) in feature space (x1, x2)]
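The greedy step a decision tree repeats at every node, picking the best axis-aligned split, can be sketched for a single node (a misclassification-count criterion on invented data; real trees usually use entropy or Gini):

```python
def best_stump(X, y):
    """Exhaustively pick the (feature, threshold, sign) axis-aligned split
    with the fewest training errors on labels in {-1, +1}."""
    best = None
    for f in range(len(X[0])):
        for thresh in sorted({x[f] for x in X}):
            for sign in (1, -1):
                preds = [sign if x[f] > thresh else -sign for x in X]
                errors = sum(p != t for p, t in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, f, thresh, sign)
    return best

X = [[1.0, 5.0], [2.0, 4.0], [6.0, 1.0], [7.0, 2.0]]
y = [-1, -1, 1, 1]
errors, feature, thresh, sign = best_stump(X, y)
```

A full tree recurses on the two sides of this split; tree depth is the main bias/variance knob.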

Page 23

Ensemble Methods: Boosting

figure from Friedman et al. 2000

Page 24

Boosted Decision Trees

[Figure: two boosted decision trees for classifying image regions as Ground, Vertical, or Sky. Internal nodes test features such as Gray?, High in Image?, Many Long Lines?, Very High Vanishing Point?, Smooth?, Green?, and Blue?, with Yes/No branches leading to the three labels.]

[Collins et al. 2002]

P(label | good segment, data)

Page 25

Boosted Decision Trees
• How to control bias/variance trade-off
  – Size of trees
  – Number of trees
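As a sketch of the "number of weak learners" knob, here is a minimal discrete AdaBoost over 1D threshold stumps (toy data; not necessarily the exact variant used in the lecture):

```python
import math

def stump_1d(xs, ys, weights):
    """Best weighted threshold classifier on 1D data (the weak learner)."""
    best = None
    for thresh in sorted(set(xs)):
        for sign in (1, -1):
            err = sum(w for x, t, w in zip(xs, ys, weights)
                      if (sign if x > thresh else -sign) != t)
            if best is None or err < best[0]:
                best = (err, thresh, sign)
    return best

def adaboost_1d(xs, ys, rounds=3):
    """AdaBoost: upweight the examples the current ensemble gets wrong.

    The number of rounds plays the role of "number of trees" on the slide.
    """
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, thresh, sign = stump_1d(xs, ys, weights)
        err = max(err, 1e-9)
        if err >= 0.5:
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thresh, sign))
        weights = [w * math.exp(-alpha * t * (sign if x > thresh else -sign))
                   for w, x, t in zip(weights, xs, ys)]
        z = sum(weights)
        weights = [w / z for w in weights]
    return ensemble

def predict_boosted(ensemble, x):
    score = sum(a * (s if x > th else -s) for a, th, s in ensemble)
    return 1 if score > 0 else -1

# An interval pattern no single stump can fit, but three weighted stumps can.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [-1, 1, 1, -1]
model = adaboost_1d(xs, ys, rounds=3)
```

Each round adds capacity (lower bias); too many rounds on noisy data raises variance, which is why the round count and stump/tree size are the tuning knobs.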

Page 26

K-nearest neighbor
• Objective
• Parameterization
• Regularization
• Training
• Inference

[Figure: two classes (x's and o's) in feature space (x1, x2)]
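A minimal k-NN sketch (illustrative data); note there is no training step, and k itself acts as the regularizer:

```python
from collections import Counter

def knn_predict(train, x, k=3):
    """Classify x by majority vote among its k nearest training points.

    There is no model to fit: the "model" is the data. Larger k gives a
    smoother boundary (lower variance, higher bias).
    """
    by_dist = sorted(train,
                     key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

train = [((1.0, 1.0), 'o'), ((1.5, 1.2), 'o'), ((0.8, 0.9), 'o'),
         ((5.0, 5.0), 'x'), ((5.2, 4.8), 'x'), ((4.9, 5.1), 'x')]
```

The cost shifts entirely to inference: each prediction scans the training set, so spatial indexes (k-d trees, etc.) matter at scale.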

Page 27

Clustering

[Figure: points in feature space (x1, x2), shown first unlabeled and then with cluster assignments (marked +)]
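A minimal K-means sketch (Lloyd's algorithm) with a simple first-k initialization; data and names are illustrative:

```python
def kmeans(points, k, iters=20):
    """Lloyd's algorithm: alternate assigning points to the nearest center
    and moving each center to the mean of its assigned points."""
    centers = list(points[:k])  # naive init; k-means++ is a better choice
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        centers = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl
                   else centers[i] for i, cl in enumerate(clusters)]
    return centers

points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centers = kmeans(points, k=2)
```

As unsupervised learning, there are no labels to fit; the objective is purely the within-cluster squared distance, and different initializations can reach different local minima.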

Page 28

References
• General
  – Tom Mitchell, Machine Learning, McGraw Hill, 1997
  – Christopher Bishop, Neural Networks for Pattern Recognition, Oxford University Press, 1995
• Adaboost
  – Friedman, Hastie, and Tibshirani, “Additive logistic regression: a statistical view of boosting”, Annals of Statistics, 2000
• SVMs
  – http://www.support-vector.net/icml-tutorial.pdf

Page 29

Project ideas?

Page 30

Discussion of Rosch