Theorem 1. Let D be a distribution over X × {−1, 1}, and let S be a sample of m examples chosen independently at random according to D. Assume that...
The AdaBoost Algorithm
A typical learning curve
…and a boosting one
This lecture
Supervised learning
Boosting: introduction
Boosting example
Start with uniform distribution on data
Weak learners = halfplanes
Round 1
Round 2
Round 3
Final hypothesis
Hypotheses as points in space
Separation
AdaBoost - technical
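The technical version of the algorithm is standard enough to sketch in code. Below is a minimal NumPy implementation of binary AdaBoost; the decision-stump weak learner is an illustrative assumption (the lecture's example uses halfplanes), not part of the slides.

```python
import numpy as np

def adaboost(X, y, weak_learn, T):
    """Binary AdaBoost (Freund & Schapire '97); labels y in {-1, +1}.
    weak_learn(X, y, D) must return a hypothesis h with h(X) in {-1, +1}."""
    m = len(y)
    D = np.full(m, 1.0 / m)                  # round 1: uniform distribution
    hyps, alphas = [], []
    for _ in range(T):
        h = weak_learn(X, y, D)
        pred = h(X)
        eps = max(D @ (pred != y), 1e-12)    # weighted error, clamped for eps = 0
        if eps >= 0.5:                       # weak learner has no edge left
            break
        alpha = 0.5 * np.log((1 - eps) / eps)
        D = D * np.exp(-alpha * y * pred)    # up-weight mistakes, down-weight hits
        D = D / D.sum()                      # normalize (the Z_t factor)
        hyps.append(h); alphas.append(alpha)
    def H(Xq):                               # final weighted-majority hypothesis
        return np.sign(sum(a * h(Xq) for a, h in zip(alphas, hyps)))
    return H

def stump_learner(X, y, D):
    """Hypothetical weak learner: best axis-aligned threshold stump under D."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for s in (1, -1):
                pred = s * np.sign(X[:, j] - thr + 1e-12)
                err = D @ (pred != y)
                if err < best[0]:
                    best = (err, j, thr, s)
    _, j, thr, s = best
    return lambda Xq: s * np.sign(Xq[:, j] - thr + 1e-12)
```

The weak learner is pluggable; any procedure that beats chance on the weighted sample fits the `weak_learn(X, y, D)` interface assumed here.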
A Bayesian interpretation
AdaBoost – update rule
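The update rule on this slide is the standard one; writing it out for reference:

```latex
\alpha_t = \frac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t},
\qquad
D_{t+1}(i) = \frac{D_t(i)\,\exp(-\alpha_t\, y_i\, h_t(x_i))}{Z_t}
```

where $\epsilon_t$ is the weighted error of $h_t$ under $D_t$ and $Z_t$ is the normalizer that makes $D_{t+1}$ a distribution.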
Online allocation: the Hedge algorithm
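Hedge maintains one weight per expert and shrinks weights multiplicatively with incurred loss; AdaBoost's distribution update is this rule with training examples playing the role of experts. A minimal sketch (the parameter β and the loss-matrix shape are illustrative assumptions):

```python
import numpy as np

def hedge(losses, beta=0.9):
    """Hedge online allocation (Freund & Schapire).
    losses: (T, N) array, loss of each of N experts in each of T rounds,
    entries in [0, 1]; beta in (0, 1) controls how fast weights decay."""
    T, N = losses.shape
    w = np.ones(N)                    # one weight per expert
    total = 0.0
    for t in range(T):
        p = w / w.sum()               # allocation: a distribution over experts
        total += p @ losses[t]        # expected loss suffered this round
        w = w * beta ** losses[t]     # multiplicative penalty for loss
    return total, w
```

The cumulative loss provably stays close to that of the best single expert in hindsight, which is the guarantee boosting inherits.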
AdaBoost – distribution update
Training error
Theorem [Freund & Schapire '97]
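The theorem cited here bounds the training error of the combined hypothesis H by a product over rounds; with the edge written as $\gamma_t = \tfrac{1}{2} - \epsilon_t$:

```latex
\widehat{\operatorname{err}}_S(H)
\;\le\; \prod_{t=1}^{T} 2\sqrt{\epsilon_t(1-\epsilon_t)}
\;=\; \prod_{t=1}^{T} \sqrt{1-4\gamma_t^2}
\;\le\; \exp\Big(-2\sum_{t=1}^{T}\gamma_t^2\Big)
```

So any uniform edge $\gamma_t \ge \gamma > 0$ drives the training error to zero exponentially fast in T.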
Boosting and margin distribution
Why margins are important
• Generalization error
Generalization error based on margins
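The margin-based bound referred to here is from Schapire, Freund, Bartlett & Lee '98; up to constants and log factors it has the form below, where $f$ is the (normalized) weighted vote, $\theta > 0$ a margin threshold, and $d$ the VC-dimension of the weak-hypothesis class:

```latex
\Pr_{D}\big[\, y f(x) \le 0 \,\big]
\;\le\; \Pr_{S}\big[\, y f(x) \le \theta \,\big]
\;+\; \tilde{O}\!\left(\sqrt{\frac{d/\theta^{2}}{m}}\right)
```

Notably the bound does not depend on the number of rounds T, which is the usual explanation for why AdaBoost can keep improving test error after the training error reaches zero.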
AdaBoost - remarks
AdaBoost is adaptive:
• does not need to know the weak learners' edge or T a priori
• can exploit εt ≪ ½
GOOD: does not overfit
BAD: susceptible to noise
...but a nice property: it can identify outliers
Recent advances
Boosting vs. bagging
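For contrast with boosting's adaptive reweighting, bagging draws each round's training set as a uniform bootstrap resample of the data and combines hypotheses by an unweighted vote. A minimal sketch (the `learn(X, y)` interface and seed are assumptions for illustration):

```python
import numpy as np

def bagging(X, y, learn, B, seed=0):
    """Bagging: B rounds, each trained on a uniform bootstrap resample
    (contrast with boosting, which reweights the same sample adaptively).
    learn(X, y) -> hypothesis h with h(X) in {-1, +1}."""
    rng = np.random.default_rng(seed)
    m = len(y)
    hyps = []
    for _ in range(B):
        idx = rng.integers(0, m, size=m)   # sample m points with replacement
        hyps.append(learn(X[idx], y[idx]))
    def H(Xq):                             # plain (unweighted) majority vote
        return np.sign(sum(h(Xq) for h in hyps))
    return H
```

The two differences from AdaBoost are exactly the ones the slide contrasts: uniform resampling instead of error-driven reweighting, and equal votes instead of α-weighted ones.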
Boosting vs SVMs
…hope you are still with me
Boosting using confidence-rated predictions
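The confidence-rated extension is due to Schapire & Singer '99: weak hypotheses become real-valued, $h_t : X \to \mathbb{R}$, with $|h_t(x)|$ read as confidence, and on each round one chooses $h_t$ and $\alpha_t$ to minimize the normalizer

```latex
Z_t \;=\; \sum_{i=1}^{m} D_t(i)\,\exp\big(-\alpha_t\, y_i\, h_t(x_i)\big)
```

since the training error of the final hypothesis is bounded by $\prod_t Z_t$.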
Multiclass: AdaBoost.M1
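AdaBoost.M1 is the direct multiclass extension: weak hypotheses output class labels, the update shrinks correctly classified examples by $\beta_t = \epsilon_t/(1-\epsilon_t)$, and the final hypothesis takes a $\log(1/\beta_t)$-weighted vote. A minimal sketch (the `weak_learn` interface is an assumption); note M1 still demands weighted error below ½, which gets harder as the number of classes grows:

```python
import numpy as np

def adaboost_m1(X, y, weak_learn, T):
    """AdaBoost.M1 (Freund & Schapire): multiclass labels, label-valued
    weak hypotheses.  weak_learn(X, y, D) -> h with h(X) giving class labels."""
    m = len(y)
    D = np.full(m, 1.0 / m)
    hyps, betas = [], []
    for _ in range(T):
        h = weak_learn(X, y, D)
        miss = (h(X) != y)
        eps = max(D @ miss, 1e-12)         # clamp to avoid log(0) below
        if eps >= 0.5:                     # M1 needs better-than-half accuracy
            break
        beta = eps / (1 - eps)
        D = D * np.where(miss, 1.0, beta)  # shrink correctly classified examples
        D = D / D.sum()
        hyps.append(h); betas.append(beta)
    classes = np.unique(y)
    def H(Xq):                             # log(1/beta_t)-weighted vote per class
        votes = np.zeros((len(Xq), len(classes)))
        for h, b in zip(hyps, betas):
            pred = h(Xq)
            for k, c in enumerate(classes):
                votes[:, k] += np.log(1.0 / b) * (pred == c)
        return classes[votes.argmax(axis=1)]
    return H
```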
Multiclass, multilabel: AdaBoost.MH
Output coding for multiclass problems
InfoBoost [Aslam 2000]
Applications
so