Machine Learning: Classifier Evaluation Madhavan Mukund Chennai Mathematical Institute http://www.cmi.ac.in/~madhavan AlgoLabs Certification Course on Machine Learning 23 February, 2015


Machine Learning: Classifier Evaluation

Madhavan Mukund

Chennai Mathematical Institute
http://www.cmi.ac.in/~madhavan

AlgoLabs Certification Course on Machine Learning
23 February, 2015


Evaluating a classifier

Accuracy: What fraction of predictions are correct?

Need access to an “oracle” to validate answers

Classification is often asymmetric

Suppose 1% of email traffic constitutes phishing

An email filter that always says “No” is 99% accurate, but totally useless!

Note: It is conventional to assume that “Yes” is the minority answer

Need a finer classification of correct predictions and errors
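The phishing example can be checked with a few lines of Python. This is a minimal sketch: the traffic sample (10 phishing emails out of 1000) and the helper name `accuracy` are assumptions made for illustration, not part of the original slides.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the true labels."""
    if len(actual) != len(predicted):
        raise ValueError("label lists must have the same length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Hypothetical traffic: 1% phishing ("Yes"), 99% legitimate ("No").
actual = ["Yes"] * 10 + ["No"] * 990

# A filter that always answers "No".
always_no = ["No"] * len(actual)

print(accuracy(actual, always_no))  # 0.99

# Number of phishing emails the filter actually catches.
caught = sum(a == "Yes" and p == "Yes" for a, p in zip(actual, always_no))
print(caught)  # 0
```

The filter scores 99% accuracy while catching none of the phishing emails, which is why accuracy alone is not enough for asymmetric problems.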


Evaluating a classifier . . .

Confusion matrix

                    Classified positive    Classified negative
Actual positive     TP                     FN
Actual negative     FP                     TN

Precision: What fraction of positive classifications are correct?

$p = \frac{TP}{TP + FP}$

Recall: What fraction of actual positive cases are correctly classified?

$r = \frac{TP}{TP + FN}$
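The four confusion-matrix counts, and precision and recall computed from them, can be sketched as follows. The function names and the small label lists are illustrative assumptions. Returning 0.0 when a denominator is zero is a convention chosen here, since the ratio is undefined in that case.

```python
def confusion_counts(actual, predicted):
    """Return (TP, FN, FP, TN) for boolean labels, True meaning positive."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1
        elif a and not p:
            fn += 1
        elif not a and p:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def precision(tp, fp):
    # Fraction of positive classifications that are correct.
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    # Fraction of actual positives that are classified positive.
    return tp / (tp + fn) if tp + fn else 0.0


# Hypothetical labels for eight items.
actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]

tp, fn, fp, tn = confusion_counts(actual, predicted)
print(tp, fn, fp, tn)  # 2 1 1 4
print(precision(tp, fp), recall(tp, fn))
```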


Evaluating a classifier . . .

                    Classified positive    Classified negative
Actual positive     1                      99
Actual negative     0                      1000

Here p = 1 but r = 0.01

No functional relationship between p and r

In practice, they are typically inversely related: increasing p reduces r and vice versa

Conservative classifier — higher precision, ignores valid cases

Permissive classifier — higher recall, more mistakes
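The trade-off can be illustrated by sweeping a decision threshold over classifier scores. This is only a sketch: the scores, labels, and thresholds below are invented for illustration, and the slides do not prescribe a threshold-based classifier. A high threshold plays the role of the conservative classifier, a low one the permissive classifier.

```python
def precision_recall(labels, scores, threshold):
    """Classify as positive when score >= threshold; return (p, r)."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, q in zip(labels, predicted) if y and q)
    fp = sum(1 for y, q in zip(labels, predicted) if not y and q)
    fn = sum(1 for y, q in zip(labels, predicted) if y and not q)
    # 0.0 for an empty denominator is a convention chosen for this sketch.
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return p, r


# The table from the slide: TP = 1, FN = 99, FP = 0.
print(1 / (1 + 0), 1 / (1 + 99))  # 1.0 0.01

# Hypothetical scores (higher means "more likely positive") and true labels.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [True, True, False, True, False, True, False, False]

for threshold in (0.85, 0.5, 0.25):
    p, r = precision_recall(labels, scores, threshold)
    print(threshold, round(p, 3), round(r, 3))
```

On this data, lowering the threshold from 0.85 to 0.25 raises recall from 0.5 to 1.0 while precision falls from 1.0 to about 0.571.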


Evaluating a classifier . . .

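Combine p, r into a single F-score: weighted harmonic mean

$$F = \frac{1}{\alpha \frac{1}{p} + (1 - \alpha) \frac{1}{r}} = \frac{(\beta^2 + 1)\,pr}{\beta^2 p + r}$$

where $\alpha \in [0, 1]$ and $\beta^2 = \frac{1 - \alpha}{\alpha}$

$$F_{\beta = 1} = \frac{2pr}{p + r}$$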
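Both forms of the F-score can be written directly from the formulas above. The function names are assumptions for illustration. Returning 0.0 when p and r are both zero is a convention chosen here, and the α form as written needs p, r and α to be nonzero.

```python
def f_score(p, r, beta=1.0):
    """F-score in the beta form: (beta^2 + 1) p r / (beta^2 p + r)."""
    if p == 0 and r == 0:
        return 0.0  # convention for this sketch; the ratio is undefined
    b2 = beta * beta
    return (b2 + 1) * p * r / (b2 * p + r)


def f_score_alpha(p, r, alpha):
    """Same quantity as a weighted harmonic mean; needs p, r, alpha > 0."""
    return 1.0 / (alpha / p + (1 - alpha) / r)


# The earlier example: p = 1, r = 0.01 gives a very low F1.
print(f_score(1.0, 0.01))

# beta = 2 corresponds to alpha = 1 / (beta^2 + 1) = 0.2.
print(f_score(0.6, 0.75, beta=2.0), f_score_alpha(0.6, 0.75, alpha=0.2))
```

Because the harmonic mean is dominated by the smaller value, a classifier with perfect precision but 1% recall still gets an F1 of roughly 0.02.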


Evaluating a classifier . . .

Who provides the “oracle” to validate answers?

Holdout sets

Exclude a random sample of training data

Build classifier on remaining data, check answers on holdout set

Suitable if we have a large volume of training data

Cross validation

Systematically exclude 1/n of training data

Build classifier on remaining data and check answers on excluded set

Repeat n times to span entire training data

Aggregate the scores obtained
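The cross-validation steps above can be sketched with the standard library alone. Everything named here (`k_fold_indices`, `cross_validate`, the majority-class "classifier", the toy data) is a hypothetical stand-in chosen for illustration, and taking the mean is just one way to aggregate the scores. A single holdout set corresponds to using only one of the folds.

```python
import random
from collections import Counter


def k_fold_indices(n_items, n_folds, seed=0):
    """Shuffle indices and split them into n_folds disjoint folds."""
    if not 2 <= n_folds <= n_items:
        raise ValueError("need 2 <= n_folds <= n_items")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_folds-th index, so fold sizes differ by at most 1.
    return [indices[i::n_folds] for i in range(n_folds)]


def cross_validate(items, labels, n_folds, train, score, seed=0):
    """Train on all folds but one, score on the excluded fold, average."""
    scores = []
    for held_out in k_fold_indices(len(items), n_folds, seed):
        excluded = set(held_out)
        train_idx = [i for i in range(len(items)) if i not in excluded]
        model = train([items[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        scores.append(score(model,
                            [items[i] for i in held_out],
                            [labels[i] for i in held_out]))
    return sum(scores) / len(scores)


# Hypothetical stand-in classifier: always predict the majority training label.
def train_majority(train_items, train_labels):
    return Counter(train_labels).most_common(1)[0][0]


def accuracy_score(model, test_items, test_labels):
    return sum(model == y for y in test_labels) / len(test_labels)


items = list(range(12))
labels = ["Yes"] * 3 + ["No"] * 9
mean_acc = cross_validate(items, labels, 4, train_majority, accuracy_score)
print(mean_acc)
```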


Overfitting

Model is too specific

Tailored to fit anomalies in training data

Performs suboptimally on general data

More formally, there is another classifier such that:

Current classifier beats the other one on this data . . .

. . . but the other one is better on unseen data


Overfitting . . .

Synthetic data, two classes, 0.75 yes and 0.25 no

Blindly saying yes has 0.25 error

Decision tree has 119 nodes, 0.35 error!


Overfitting . . .

Two classes, Pr(Y) = p and Pr(N) = 1 − p, with Y the majority class (p > 0.5)

Always choose majority class Y: error is 1 − p

Assign each item class Y with probability p, N with probability 1 − p

Errors on Y: p(1 − p)

Errors on N: (1 − p)p

Total error 2p(1 − p) > 1 − p, since p > 0.5!
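The comparison can be checked numerically. The sketch below computes both error rates exactly and adds a seeded simulation of the random assignment. The function names, seed, and trial count are assumptions chosen for illustration.

```python
import random


def majority_error(p):
    """Error of always answering the majority class Y, where Pr(Y) = p."""
    return 1 - p


def proportional_guess_error(p):
    """Error of guessing Y with probability p, independently of the truth."""
    # True Y guessed N: p * (1 - p); true N guessed Y: (1 - p) * p.
    return 2 * p * (1 - p)


def simulate_guess_error(p, trials=100_000, seed=1):
    """Empirical error rate of the random assignment."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(trials):
        truth_is_y = rng.random() < p
        guess_is_y = rng.random() < p
        wrong += truth_is_y != guess_is_y
    return wrong / trials


# p = 0.75 matches the synthetic data split of 0.75 yes and 0.25 no.
print(majority_error(0.75))            # 0.25
print(proportional_guess_error(0.75))  # 0.375
print(simulate_guess_error(0.75))      # close to 0.375
```

With p = 0.75, mimicking the class proportions at random costs 0.375 error against 0.25 for the blind majority answer, which is in line with the 0.35 error of the 119-node tree on the synthetic data.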


Overfitting . . .

Prune the tree

Top-down: stop expanding tree if information gain drops below a threshold

Bottom-up:

Remove children of a node if estimated error across children is more than for original

Estimate error using holdout data
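Bottom-up pruning with holdout data can be sketched on a toy tree structure. This is a minimal illustration, not the course's implementation: the `Node` class, the way tests are encoded as functions returning a child index, and the tiny holdout set are all assumptions. Following the wording above, a node's children are removed only when the subtree makes strictly more holdout errors than the node would as a leaf.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    majority: str                                 # majority training label here
    test: Optional[Callable[[dict], int]] = None  # child index; None for a leaf
    children: List["Node"] = field(default_factory=list)


def predict(node, item):
    """Follow tests down to a leaf and return its majority label."""
    while node.children:
        node = node.children[node.test(item)]
    return node.majority


def prune(node, holdout):
    """Prune bottom-up using the holdout (item, label) pairs reaching node."""
    if not node.children:
        return node
    # Route holdout items to the child each one would reach.
    buckets = [[] for _ in node.children]
    for item, label in holdout:
        buckets[node.test(item)].append((item, label))
    node.children = [prune(c, b) for c, b in zip(node.children, buckets)]
    subtree_errors = sum(predict(node, x) != y for x, y in holdout)
    leaf_errors = sum(node.majority != y for _, y in holdout)
    if subtree_errors > leaf_errors:
        # The split hurts on holdout data: collapse this node into a leaf.
        node.children = []
        node.test = None
    return node


# Hypothetical tree: the root splits on "a", its right child splits on "b".
inner = Node("Yes", lambda x: x["b"], [Node("No"), Node("Yes")])
root = Node("Yes", lambda x: x["a"], [Node("No"), inner])

holdout = [
    ({"a": 0, "b": 0}, "No"),
    ({"a": 1, "b": 0}, "Yes"),
    ({"a": 1, "b": 0}, "Yes"),
    ({"a": 1, "b": 1}, "Yes"),
]
prune(root, holdout)
print(len(root.children), len(root.children[1].children))  # 2 0
```

Here the split on "b" misclassifies two holdout items that a plain "Yes" leaf would get right, so it is removed, while the split on "a" is kept.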


Overfitting . . .

Party affiliation of US legislators based on voting pattern


Overfitting . . .

Party affiliation of US legislators based on voting pattern, afterpruning