Boosting (Penn Engineering CIS 520, cis520/lectures/12_boosting.pdf)


Page 1:

Boosting
Lyle Ungar

Learning objectives:
- Review stagewise regression
- Know the AdaBoost and gradient boosting algorithms

Page 2:

Stagewise Regression
- Sequentially learn the weights a_t
  - Never readjust previously learned weights

h(x) = 0 or average(y)
For t = 1:T
    r_t = y - h(x)
    regress r_t = a_t h_t(x) to find a_t
    h(x) = h(x) + a_t h_t(x)
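The loop above can be sketched in Python. This is an illustrative sketch, not code from the lecture: the weak learners are depth-1 regression stumps on a 1-D input, and names like `fit_stump` are my own.

```python
import numpy as np

def fit_stump(x, r):
    """Depth-1 regression stump: pick the threshold on x that best fits
    the residuals r, predicting the mean residual on each side."""
    best = None
    for thr in np.unique(x)[:-1]:                # keep both sides non-empty
        pred = np.where(x <= thr, r[x <= thr].mean(), r[x > thr].mean())
        sse = ((r - pred) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, thr, r[x <= thr].mean(), r[x > thr].mean())
    _, thr, lo, hi = best
    return lambda z: np.where(z <= thr, lo, hi)

def stagewise_regression(x, y, T=20):
    """Stagewise regression: repeatedly fit a weak learner to the current
    residuals and add it in; earlier weights are never readjusted."""
    h = np.full_like(y, y.mean(), dtype=float)    # h(x) = average(y)
    learners = []
    for t in range(T):
        r = y - h                                 # r_t = y - h(x)
        ht = fit_stump(x, r)
        pred = ht(x)
        # a_t from a one-variable least-squares regression of r_t on h_t(x)
        a = (r @ pred) / (pred @ pred + 1e-12)
        h = h + a * pred                          # h(x) = h(x) + a_t h_t(x)
        learners.append((a, ht))
    return h, learners
```

On a noisy sine curve, the training error shrinks steadily as T grows, even though each stump alone is a very weak fit.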

Page 3:

Boosting
- Ensemble method
  - Weighted combination of weak learners h_t(x)
- Learned stagewise
  - At each stage, boosting gives more weight to what it got wrong before

Page 4:

AdaBoost

where a_t = (1/2) ln((1 − ε_t)/ε_t), with ε_t the weighted probability of the prediction being wrong; a_t is half the log-odds of the prediction being right

Page 5:

AdaBoost example
- https://alliance.seas.upenn.edu/~cis520/dynamic/2019/wiki/index.php?n=Lectures.Boosting

Page 6:

AdaBoost minimizes exponential loss
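Concretely, the objective is the exponential loss of the margin y_i F(x_i) (the standard formulation; the derivation itself was on the slide image):

```latex
L_{\exp} \;=\; \sum_{i=1}^{n} \exp\bigl(-y_i F(x_i)\bigr),
\qquad
F(x) \;=\; \sum_{t=1}^{T} a_t\, h_t(x)
```

Each stage of AdaBoost greedily adds the (a_t, h_t) pair that most decreases this sum, which is what yields the half-log-odds formula for a_t.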

Page 7:

And it learns exponentially fast

The average training error decreases exponentially in the number of stages T and the edge γ of the weak learner
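The standard Freund–Schapire bound (not shown on the surviving slide text) makes this precise: if every weak learner has weighted error ε_t ≤ 1/2 − γ, the training error of the ensemble satisfies

```latex
\frac{1}{n}\sum_{i=1}^{n}
\mathbf{1}\bigl[\,y_i \ne \operatorname{sign}(F(x_i))\,\bigr]
\;\le\;
\prod_{t=1}^{T} 2\sqrt{\varepsilon_t(1-\varepsilon_t)}
\;\le\;
e^{-2\gamma^{2} T}
```

so any weak learner that is consistently better than chance by a margin γ drives the training error to zero exponentially fast in T.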

Page 8:

Gradient Tree Boosting
- Current state of the art for moderate-sized data sets
  - i.e., on average very slightly better than random forests when you don't have enough data to do deep learning

Page 9:

Gradient Boosting
- Model: F(x) = Σ_i g_i h_i(x) + const
- Loss function: L(y, F(x))
  - L2 or logistic or ...
- Base learner: h_i(x)
  - Decision tree of specified depth
- Optionally subsample features
  - "stochastic gradient boosting"
- Do stagewise estimation of F(x)
  - Estimate h_i(x) and g_i at each iteration i
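The stagewise loop can be sketched as below. This is an illustrative sketch under my own naming, with a depth-1 regression stump standing in for the fixed-depth tree, and the per-stage weight g_i folded into a fixed learning rate; the L1 case is included to show that the pseudo-residual is the negative gradient of the loss, not always the plain residual.

```python
import numpy as np

def stump(x, r):
    """Depth-1 regression tree fit to the pseudo-residuals r:
    best threshold split on x, mean of r in each leaf."""
    best_sse, best_t = np.inf, None
    for t in np.unique(x)[:-1]:                  # keep both sides non-empty
        pred = np.where(x <= t, r[x <= t].mean(), r[x > t].mean())
        sse = ((r - pred) ** 2).sum()
        if sse < best_sse:
            best_sse, best_t = sse, t
    lo, hi = r[x <= best_t].mean(), r[x > best_t].mean()
    return lambda z: np.where(z <= best_t, lo, hi)

def gradient_boost(x, y, T=50, lr=0.1, loss="l2"):
    """Stagewise F(x) = const + sum_i g_i h_i(x): each h_i is fit to the
    negative gradient of the loss at the current F (the pseudo-residual)."""
    start = np.median(y) if loss == "l1" else y.mean()
    F = np.full_like(y, start, dtype=float)
    trees = []
    for _ in range(T):
        # pseudo-residual -dL/dF: y - F for L2, sign(y - F) for L1
        r = (y - F) if loss == "l2" else np.sign(y - F)
        h = stump(x, r)
        F = F + lr * h(x)                        # g_i folded into the fixed lr
        trees.append(h)
    return F, trees
```

Swapping in a different loss changes only the pseudo-residual line; the tree-fitting machinery is untouched, which is the point of the gradient-boosting abstraction.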

Page 10:

https://en.wikipedia.org/wiki/Gradient_boosting

The pseudo-residual is the negative gradient of the loss with respect to the current prediction, r_i = −∂L(y_i, F(x_i))/∂F(x_i). For squared error, this is just the standard residual y_i − F(x_i)

Page 11:

Gradient Tree Boosting for Regression

- Loss function: L2
- Base learners h_i(x)
  - Fixed-depth regression tree fit on the pseudo-residual
  - Gives a constant prediction for each leaf of the tree
- Stagewise: find weights on each h_i(x)
  - Fancy version: fit a different weight for each leaf of the tree

Page 12:

Regularization helps

http://scikit-learn.org/stable/auto_examples/ensemble/plot_gradient_boosting_regularization.html

Subsample = stochastic gradient boosting

Learning rate = shrinkage on g
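The scikit-learn example linked above exposes both knobs directly; assuming scikit-learn is available, a minimal version looks like this (the data here is my own toy example, not the one from the linked page):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)

# learning_rate shrinks each tree's contribution (shrinkage on g);
# subsample < 1.0 fits each tree on a random subsample of the rows
# ("stochastic gradient boosting")
model = GradientBoostingRegressor(
    n_estimators=200, max_depth=2,
    learning_rate=0.1, subsample=0.5, random_state=0)
model.fit(X, y)
print(model.score(X, y))   # R^2 on the training data
```

Lowering the learning rate generally requires more trees to reach the same training fit, but tends to generalize better; the linked plot shows exactly this trade-off.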

Page 13:

What you should know
- Boosting
  - Stagewise regression upweighting previous errors
  - Gives highly accurate ensemble models
  - Relatively fast
  - Tends not to overfit (but still: use early stopping!)
- Gradient Tree Boosting
  - Uses pseudo-residuals
  - Very accurate!!!