Machine Learning: Classifier Evaluation
Madhavan Mukund
Chennai Mathematical Institute
http://www.cmi.ac.in/~madhavan
AlgoLabs Certification Course on Machine Learning
23 February, 2015
Evaluating a classifier
Accuracy What fraction of predictions are correct?
Need access to an “oracle” to validate answers
Classification is often asymmetric
Suppose 1% of email traffic constitutes phishing
An email filter that always says “No” is 99% accurate, but totally useless!
Note: Conventional to assume that “Yes” is the minority answer
Need a finer classification of correct predictions and errors
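The phishing example can be checked in a few lines of Python. This is a minimal sketch on made-up data: the 1000-message sample and the helper name `accuracy` are assumptions for illustration, not part of the lecture.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)

# Hypothetical traffic: 10 phishing messages ("Yes") out of 1000, i.e. 1%.
actual = ["Yes"] * 10 + ["No"] * 990

# A filter that always answers "No".
always_no = ["No"] * len(actual)

acc = accuracy(actual, always_no)
caught = sum(1 for a, p in zip(actual, always_no) if a == "Yes" and p == "Yes")

print(acc)     # 0.99
print(caught)  # 0 phishing messages detected
```

High accuracy here says nothing about the minority class, which is the case the filter exists for.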
Evaluating a classifier . . .
Confusion matrix
| | Classified positive | Classified negative |
|---|---|---|
| Actual positive | TP | FN |
| Actual negative | FP | TN |
Precision
What fraction of positive classifications are correct?
$$p = \frac{TP}{TP + FP}$$
Recall
What fraction of actual positive cases are correctly classified?
$$r = \frac{TP}{TP + FN}$$
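The two ratios can be computed directly from the confusion matrix counts. The sketch below is illustrative only: the counts are invented, the function names are hypothetical, and it assumes the denominators are nonzero.

```python
def precision(tp, fp):
    """Fraction of positive classifications that are correct."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of actual positives that are classified positive."""
    return tp / (tp + fn)

# Hypothetical confusion matrix counts.
tp, fn, fp, tn = 40, 10, 20, 930

p = precision(tp, fp)   # 40 / 60
r = recall(tp, fn)      # 40 / 50
print(round(p, 3), round(r, 3))   # 0.667 0.8
```

Note that TN does not appear in either formula, which is why these measures are not distorted by a large majority class.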
Evaluating a classifier . . .
| | Classified positive | Classified negative |
|---|---|---|
| Actual positive | 1 | 99 |
| Actual negative | 0 | 1000 |
Here p = 1 but r = 0.01
No functional relationship between p and r
In practice, they are typically inversely related: increasing p reduces r and vice versa
Conservative classifier: higher precision, ignores valid cases
Permissive classifier: higher recall, more mistakes
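One common way this trade-off shows up is through a decision threshold on a classifier's score. The sketch below assumes such a score exists; the scores, labels, and thresholds are invented for illustration, and reporting p = 1.0 when nothing is flagged is a convention chosen here, not something from the slides.

```python
# Hypothetical classifier scores (higher = more likely positive) with true labels.
scored = [(0.95, True), (0.90, True), (0.80, False), (0.70, True),
          (0.60, False), (0.40, True), (0.30, False), (0.10, False)]

def precision_recall(scored, threshold):
    """Classify as positive when score >= threshold; return (p, r)."""
    tp = sum(1 for s, y in scored if s >= threshold and y)
    fp = sum(1 for s, y in scored if s >= threshold and not y)
    fn = sum(1 for s, y in scored if s < threshold and y)
    p = tp / (tp + fp) if tp + fp else 1.0   # convention when nothing is flagged
    r = tp / (tp + fn)
    return p, r

conservative = precision_recall(scored, 0.85)  # flags only the top two items
permissive = precision_recall(scored, 0.35)    # flags six of the eight items

print(conservative)  # p = 1.0, r = 0.5
print(permissive)    # p = 4/6, r = 1.0
```

Lowering the threshold can never reduce recall, but it usually admits more false positives, which pulls precision down.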
Evaluating a classifier . . .
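Combine p, r into a single F-score: weighted harmonic mean
$$F = \frac{1}{\alpha \frac{1}{p} + (1 - \alpha) \frac{1}{r}} = \frac{(\beta^2 + 1)\,p\,r}{\beta^2 p + r}$$
where $\alpha \in [0, 1]$ and $\beta^2 = \frac{1 - \alpha}{\alpha}$
$$F_{\beta=1} = \frac{2pr}{p + r}$$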
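As a quick numerical check that the two forms of F agree, here is a minimal sketch. The function names `f_alpha` and `f_beta` are hypothetical, and the code assumes p > 0, r > 0, and 0 < α ≤ 1 so that no division by zero occurs.

```python
def f_alpha(p, r, alpha):
    """Weighted harmonic mean of precision p and recall r, with weight alpha on p."""
    return 1.0 / (alpha / p + (1.0 - alpha) / r)

def f_beta(p, r, beta):
    """Equivalent form using beta^2 = (1 - alpha) / alpha."""
    b2 = beta * beta
    return (b2 + 1.0) * p * r / (b2 * p + r)

# Precision and recall from the 1 / 99 / 0 / 1000 confusion matrix.
p, r = 1.0, 0.01

f1 = f_beta(p, r, 1.0)      # beta = 1 corresponds to alpha = 0.5
print(round(f1, 4))         # 0.0198
```

Because the harmonic mean is dominated by the smaller value, a classifier with perfect precision but 1% recall still scores close to zero.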
Evaluating a classifier . . .
Who provides the “oracle” to validate answers?
Holdout sets
Exclude a random sample of training data
Build classifier on remaining data, check answers on holdout set
Suitable if we have a large volume of training data
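A holdout split can be sketched with the standard library alone. The function name `holdout_split`, the 20% fraction, and the fixed seed are assumptions made for this example; the slides do not prescribe a particular split size.

```python
import random

def holdout_split(data, fraction=0.2, seed=0):
    """Set aside a random sample of `data`; return (training, holdout)."""
    rng = random.Random(seed)
    shuffled = list(data)          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    k = round(len(shuffled) * fraction)
    return shuffled[k:], shuffled[:k]

# Hypothetical labelled data: (features, label) pairs.
data = [((i,), i % 2 == 0) for i in range(100)]

train, held_out = holdout_split(data)
print(len(train), len(held_out))   # 80 20
```

The classifier is built from `train` only, and the known labels in `held_out` play the role of the oracle.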
Cross validation
Systematically exclude 1/n of training data
Build classifier on remaining data and check answers on excluded set
Repeat n times to span entire training data
Aggregate the scores obtained
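The cross validation steps above can be sketched as follows. Everything named here is hypothetical: `train_fn` and `score_fn` stand in for whatever learner and metric are in use, the folds are taken by striding through the list (real data would normally be shuffled first), and averaging is just one way to aggregate the scores.

```python
def cross_validate(data, n, train_fn, score_fn):
    """n-fold cross validation: each item is excluded exactly once."""
    scores = []
    for i in range(n):
        excluded = data[i::n]   # every n-th item, starting at offset i
        remaining = [x for j, x in enumerate(data) if j % n != i]
        model = train_fn(remaining)
        scores.append(score_fn(model, excluded))
    return sum(scores) / n      # aggregate by averaging

# Stand-in learner: "training" just picks the majority label.
def train_majority(rows):
    labels = [label for _, label in rows]
    return max(set(labels), key=labels.count)

# Stand-in metric: accuracy of that constant prediction.
def accuracy(model, rows):
    return sum(1 for _, label in rows if label == model) / len(rows)

# Hypothetical data: 10 "Yes" and 30 "No" items.
data = [((i,), "Yes" if i % 4 == 0 else "No") for i in range(40)]

mean_score = cross_validate(data, 5, train_majority, accuracy)
print(mean_score)   # 0.75
```

Unlike a single holdout set, every item contributes to both training and evaluation, which matters when labelled data is scarce.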
Overfitting
Model is too specific
Tailored to fit anomalies in training data
Performs suboptimally on general data
More formally, there is another classifier such that:
Current classifier beats the other one on this data . . .
. . . but the other one is better on unseen data
Overfitting . . .
Synthetic data, two classes, 0.75 yes and 0.25 no
Blindly saying yes has 0.25 error
Decision tree has 119 nodes, 0.35 error!
Overfitting . . .
Two classes, Pr(Y) = p and Pr(N) = 1 − p, with Y the majority class (p > 0.5)
Always choose the majority class Y: error is 1 − p
Assign each item class Y with probability p, N with probability 1 − p
Errors on Y: p(1 − p)
Errors on N: (1 − p)p
Total error 2p(1 − p) > 1 − p, since p > 0.5!
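The comparison can be checked numerically for the synthetic data with p = 0.75. This is a sketch under the assumption that guesses are drawn independently of the true labels; the function names, trial count, and seed are illustrative choices.

```python
import random

def majority_error(p):
    """Error when every item is labelled with the majority class Y."""
    return 1 - p

def random_label_error(p):
    """Expected error when each label is drawn as Y with probability p."""
    return 2 * p * (1 - p)

def simulate(p, trials=100_000, seed=0):
    """Monte Carlo estimate of the random-labelling error rate."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(trials):
        actual_is_y = rng.random() < p
        guess_is_y = rng.random() < p   # guess ignores the actual label
        wrong += actual_is_y != guess_is_y
    return wrong / trials

p = 0.75
print(majority_error(p))       # 0.25
print(random_label_error(p))   # 0.375
estimate = simulate(p)         # should land near 0.375
```

An expected error of 0.375 is in the same range as the 0.35 reported for the 119-node tree, which suggests the tree is doing little better than reproducing the class proportions.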
Overfitting . . .
Prune the tree
Top-down: stop expanding tree if information gain drops below a threshold
Bottom-up:
Remove children of a node if estimated error across children is more than for original
Estimate error using holdout data
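The bottom-up rule can be sketched as a recursive pass over the tree. This is one possible reading of the slide, not the lecture's implementation: the `Node` class, the categorical-feature tree layout, and the choice to count raw holdout errors are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    label: str                      # majority training label at this node
    feature: Optional[int] = None   # index of the feature tested; None for a leaf
    children: dict = field(default_factory=dict)  # feature value -> child Node

def predict(node, x):
    """Follow feature tests down the tree; fall back to the node label."""
    while node.feature is not None:
        child = node.children.get(x[node.feature])
        if child is None:
            break
        node = child
    return node.label

def errors(node, holdout):
    """Number of holdout items the subtree rooted at `node` gets wrong."""
    return sum(1 for x, y in holdout if predict(node, x) != y)

def prune(node, holdout):
    """Prune children first, then collapse this node if its subtree does worse."""
    if node.feature is None:
        return node
    for value, child in list(node.children.items()):
        subset = [(x, y) for x, y in holdout if x[node.feature] == value]
        node.children[value] = prune(child, subset)
    leaf = Node(label=node.label)
    if errors(node, holdout) > errors(leaf, holdout):
        return leaf
    return node

# Hypothetical tree with one split that does not hold up on holdout data.
tree = Node(label="yes", feature=0,
            children={0: Node(label="yes"), 1: Node(label="no")})
holdout = [((0,), "yes"), ((1,), "yes"), ((1,), "yes"), ((1,), "no")]

pruned = prune(tree, holdout)
print(pruned.feature is None, pruned.label)   # True yes
```

Here the split makes 2 errors on the holdout items while a single "yes" leaf makes 1, so the children are removed.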
Overfitting . . .
Party affiliation of US legislators based on voting pattern
Overfitting . . .
Party affiliation of US legislators based on voting pattern, afterpruning