Boosting and Additive Models
Chapter 10, Elements of Statistical Learning
Outline
• Model Averaging
– Bagging
– Boosting
• Boosting: AdaBoost
• Forward Stagewise Modeling
• Interpretation of Boosting
• Summary
Classification Problem
Classification Tree (CART)
Decision Boundary: CART
Comparison of Learning Methods
Is there a method that combines the advantages of SVM and CART?
Or can we keep the advantages of CART while increasing its predictive power?
Bagging (Bootstrap Aggregation)
• Bagging averages a given procedure over many bootstrap samples in order to reduce its variance
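The slides illustrate bagging only through decision-boundary pictures. As a minimal sketch (the toy 1-D data set and the helper names `fit_stump` and `bagged_predict` are my own, not from the slides), bootstrap aggregation of tree stumps looks like:

```python
import random

# Toy 1-D data set (hypothetical, for illustration only):
# label is +1 when x > 0.5, else -1.
data = [(i / 10, 1 if i / 10 > 0.5 else -1) for i in range(10)]

def fit_stump(sample):
    """Fit a one-split 'tree' (decision stump) by minimizing training error."""
    best = None
    for xi, _ in sample:
        for sign in (1, -1):
            err = sum(1 for x, y in sample
                      if (sign if x > xi else -sign) != y)
            if best is None or err < best[0]:
                best = (err, xi, sign)
    _, thr, sign = best
    return lambda x: sign if x > thr else -sign

def bagged_predict(x, stumps):
    """Bagging: majority vote over stumps grown on bootstrap resamples."""
    return 1 if sum(s(x) for s in stumps) > 0 else -1

random.seed(0)
B = 25  # number of bootstrap replicates
stumps = []
for _ in range(B):
    boot = [random.choice(data) for _ in data]  # sample n points with replacement
    stumps.append(fit_stump(boot))

print(bagged_predict(0.9, stumps), bagged_predict(0.1, stumps))
```

Each stump is fit to a different bootstrap resample, so its split point varies; the vote averages out that variability, which is exactly the variance reduction the slide refers to.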
Decision Boundary: Bagging
Bagging can dramatically reduce the variance of unstable procedures (like trees), leading to improved prediction.
However, any simple, interpretable structure of the original tree is lost
Decision Boundary: Bagging
History of Boosting
Procedure of Boosting
Boosting vs. Bagging
AdaBoost (Freund & Schapire 1996)
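AdaBoost.M1 can be sketched in pure Python with decision stumps as the weak learner. The toy data set below is my own illustration (positives at both ends, negatives in the middle, so no single stump can be perfect), not from the slides:

```python
import math

# Hypothetical toy 1-D data: no single stump separates it,
# but a weighted vote of stumps does.
X = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1]

def best_stump(w):
    """Weak learner: stump minimizing the weighted 0-1 training error."""
    best = None
    for thr in X:
        for sign in (1, -1):
            err = sum(wi for xi, yi, wi in zip(X, y, w)
                      if (sign if xi > thr else -sign) != yi)
            if best is None or err < best[1]:
                best = ((thr, sign), err)
    (thr, sign), err = best
    return (lambda x, t=thr, s=sign: s if x > t else -s), err

def adaboost(M=10):
    n = len(X)
    w = [1.0 / n] * n                  # start with uniform weights
    ensemble = []                      # list of (alpha_m, G_m) pairs
    for _ in range(M):
        G, err = best_stump(w)
        err = max(err, 1e-12)          # guard against a perfect stump
        # Freund-Schapire form of the vote weight; ESL's Algorithm 10.1 uses
        # log((1-err)/err) with an indicator update -- equivalent after
        # renormalization of the weights.
        alpha = 0.5 * math.log((1 - err) / err)
        w = [wi * math.exp(-alpha * yi * G(xi))
             for xi, yi, wi in zip(X, y, w)]     # up-weight mistakes
        Z = sum(w)
        w = [wi / Z for wi in w]                 # renormalize
        ensemble.append((alpha, G))
    return ensemble

def predict(x, ensemble):
    """Final classifier: sign of the weighted committee vote."""
    return 1 if sum(a * G(x) for a, G in ensemble) > 0 else -1

ens = adaboost()
print([predict(xi, ens) for xi in X])
```

After a few rounds the reweighting forces later stumps to concentrate on the points earlier stumps got wrong, and the weighted committee classifies the whole set correctly.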
Forward Stagewise Modeling
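Forward stagewise modeling greedily adds one basis function at a time, leaving all previously fitted terms untouched. In the book's notation (Algorithm 10.2 of ESL), with loss $L$ and basis family $b(x;\gamma)$:

```latex
% Forward stagewise additive modeling (Algorithm 10.2 in ESL):
\begin{align*}
& f_0(x) = 0 \\
& \text{For } m = 1, \dots, M: \\
& \quad (\beta_m, \gamma_m) = \arg\min_{\beta,\, \gamma}
    \sum_{i=1}^{N} L\bigl(y_i,\; f_{m-1}(x_i) + \beta\, b(x_i; \gamma)\bigr) \\
& \quad f_m(x) = f_{m-1}(x) + \beta_m\, b(x; \gamma_m)
\end{align*}
```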
Stagewise Least Squares
Stagewise Least Squares
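With squared-error loss the stagewise step simplifies: since $(y_i - f_{m-1}(x_i) - \beta b(x_i;\gamma))^2 = (r_i - \beta b(x_i;\gamma))^2$, each new basis function is simply fit to the current residuals $r_i$. A hedged sketch (toy data and the helper `fit_residual_stump` are my own), using stumps as the basis family:

```python
# Toy regression data (hypothetical): fit y = x^2 on a grid by forward
# stagewise least squares with stumps as basis functions.
X = [i / 10 for i in range(10)]
y = [x * x for x in X]

def fit_residual_stump(r):
    """Least-squares stump for residuals r: two piecewise means."""
    best = None
    for thr in X[:-1]:                         # keep the right side non-empty
        left = [ri for xi, ri in zip(X, r) if xi <= thr]
        right = [ri for xi, ri in zip(X, r) if xi > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((ri - lm) ** 2 for ri in left)
               + sum((ri - rm) ** 2 for ri in right))
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda x: lm if x <= thr else rm

model = []          # the additive expansion, one stump per pass
r = y[:]            # residuals start at y itself (f_0 = 0)
for _ in range(20):
    h = fit_residual_stump(r)
    model.append(h)
    r = [ri - h(xi) for xi, ri in zip(X, r)]   # only the residuals are updated

F = lambda x: sum(h(x) for h in model)
mse = sum((F(xi) - yi) ** 2 for xi, yi in zip(X, y)) / len(X)
print(mse)
```

Earlier stumps are never refit; each pass only chips away at what is left over, which is the defining feature of stagewise (as opposed to stepwise) fitting.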
AdaBoost: Stagewise Modeling
Why Exponential Loss? (Section 10.5 in the 2nd edition)
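Section 10.5's answer in a nutshell: the population minimizer of the exponential loss is half the log-odds, so thresholding AdaBoost's output at zero reproduces the Bayes classifier:

```latex
f^*(x) \;=\; \arg\min_{f(x)} \; \mathrm{E}_{Y \mid x}\!\left[ e^{-Y f(x)} \right]
        \;=\; \tfrac{1}{2} \log \frac{\Pr(Y = 1 \mid x)}{\Pr(Y = -1 \mid x)},
\qquad
\operatorname{sign}\bigl(f^*(x)\bigr) \;=\; \text{Bayes rule}
```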
General Stagewise Algorithm
Boosting: avoid overfitting
Concluding Remarks
SVM via Loss + Penalty
SVM = hinge loss + L2 regularization
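In the loss + penalty view, with $f(x) = \beta_0 + x^{T}\beta$ and $[\,z\,]_{+} = \max(z, 0)$ the hinge, the SVM optimization problem is:

```latex
\min_{\beta_0,\, \beta} \;\sum_{i=1}^{N} \bigl[\, 1 - y_i f(x_i) \,\bigr]_{+}
  \;+\; \frac{\lambda}{2}\, \lVert \beta \rVert^2,
\qquad f(x) = \beta_0 + x^{T} \beta
```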
Logistic Regression
Boosting via Loss + Penalty
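One way to read this slide: boosting fits the same kind of loss + penalty problem, with exponential loss over the additive expansion in the weak learners $h_m$, and the penalty implicit. Running stagewise fitting with many small steps approximately traces the path of the L1-constrained solution:

```latex
\min_{\{\alpha_m\}} \;\sum_{i=1}^{N}
    \exp\!\Bigl( -\, y_i \sum_{m} \alpha_m\, h_m(x_i) \Bigr)
\quad \text{subject to} \quad \sum_{m} \lvert \alpha_m \rvert \le t
```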
Acknowledgement
• Dr Trevor Hastie’s slides for Chapter 10 in “Elements of Statistical Learning”
http://www-stat.stanford.edu/~hastie/TALKS/boost.pdf
http://www-stat.stanford.edu/~hastie/Papers/svmtalk.pdf
• “SVM tutorial” by Dr. C. Burges