
Page 1: A Brief Introduction to AdaBoost

Hongbo Deng

6 Feb, 2007

Some of the slides are borrowed from Derek Hoiem & Jan Šochman.

Page 2: Outline

Background

AdaBoost Algorithm

Theory/Interpretations

Page 3: What's So Good About AdaBoost

Can be used with many different classifiers

Improves classification accuracy

Commonly used in many areas

Simple to implement

Relatively resistant to overfitting in practice

Page 4: A Brief History

Resampling for estimating a statistic:

Bootstrapping

Resampling for classifier design:

Bagging

Boosting (Schapire 1989)

AdaBoost (Freund & Schapire 1995)

Page 5: Bootstrap Estimation

Repeatedly draw n samples from D

For each set of samples, estimate a statistic

The bootstrap estimate is the mean of the individual estimates

Used to estimate a statistic (parameter) and its variance
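To make the procedure concrete, here is a minimal sketch in Python (mine, not from the slides; the function and variable names are illustrative):

```python
import random

def bootstrap_estimate(data, statistic, num_resamples=1000):
    """Bootstrap a statistic: resample with replacement, average the estimates."""
    n = len(data)
    estimates = []
    for _ in range(num_resamples):
        resample = [random.choice(data) for _ in range(n)]  # draw n samples from D
        estimates.append(statistic(resample))
    mean_est = sum(estimates) / num_resamples
    # The spread of the individual estimates approximates the statistic's variance.
    var_est = sum((e - mean_est) ** 2 for e in estimates) / (num_resamples - 1)
    return mean_est, var_est

# Example: bootstrap the mean of a small dataset.
data = [2.1, 3.5, 2.9, 4.0, 3.3, 2.7]
est, var = bootstrap_estimate(data, statistic=lambda s: sum(s) / len(s))
```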

Page 6: Bagging (Bootstrap Aggregating)

For i = 1 .. M:
  Draw n' < n samples from D with replacement
  Learn classifier Ci

Final classifier is a vote of C1 .. CM

Increases classifier stability / reduces variance

[Diagram: bootstrap samples D1, D2, D3 drawn from D]
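A minimal sketch of this loop (mine, not from the slides), using scikit-learn decision trees as the base classifier and assuming X, y are NumPy arrays with labels in {-1, +1}:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, M=25, n_prime=None, seed=0):
    """Train M classifiers, each on n' samples drawn from D with replacement."""
    rng = np.random.default_rng(seed)
    n = len(X)
    n_prime = n_prime or n  # the slide draws n' < n; n' = n is also common
    classifiers = []
    for _ in range(M):
        idx = rng.integers(0, n, size=n_prime)  # bootstrap sample of D
        classifiers.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return classifiers

def bagging_predict(classifiers, X):
    """Final classifier is a majority vote of C1 .. CM."""
    votes = np.stack([c.predict(X) for c in classifiers])
    # With labels in {-1, +1}, the majority vote is the sign of the summed votes.
    return np.sign(votes.sum(axis=0))
```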

Page 7: Boosting (Schapire 1989)

Consider creating three component classifiers for a two-category problem through boosting.

Randomly select n1 < n samples from D without replacement to obtain D1; train weak learner C1

Select n2 < n samples from D, half of them misclassified by C1, to obtain D2; train weak learner C2

Select all remaining samples from D on which C1 and C2 disagree; train weak learner C3

Final classifier is a vote of the weak learners

[Diagram: D partitioned into D1, D2, D3, with + and − labeled regions]
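A rough sketch of these three stages (my reconstruction, not code from the slides; assumes NumPy arrays, labels in {-1, +1}, and enough correctly and incorrectly classified samples to fill D2):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def schapire_boost(X, y, n1, n2, seed=0):
    """Schapire's original three-classifier boosting, as outlined on this slide."""
    rng = np.random.default_rng(seed)

    # Stage 1: random subset D1 (drawn without replacement) -> train C1.
    i1 = rng.choice(len(X), size=n1, replace=False)
    c1 = DecisionTreeClassifier(max_depth=1).fit(X[i1], y[i1])

    # Stage 2: D2 mixes samples C1 misclassifies with ones it gets right -> C2.
    wrong = np.flatnonzero(c1.predict(X) != y)
    right = np.flatnonzero(c1.predict(X) == y)
    i2 = np.concatenate([rng.choice(wrong, size=n2 // 2, replace=False),
                         rng.choice(right, size=n2 - n2 // 2, replace=False)])
    c2 = DecisionTreeClassifier(max_depth=1).fit(X[i2], y[i2])

    # Stage 3: D3 is every sample on which C1 and C2 disagree -> C3.
    i3 = np.flatnonzero(c1.predict(X) != c2.predict(X))
    c3 = DecisionTreeClassifier(max_depth=1).fit(X[i3], y[i3])

    def predict(Xq):
        # Final classifier: majority vote of C1, C2, C3.
        return np.sign(c1.predict(Xq) + c2.predict(Xq) + c3.predict(Xq))
    return predict
```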

Page 8: AdaBoost - Adaptive Boosting

Instead of resampling, AdaBoost re-weights the training set: each training sample carries a weight that determines its probability of being selected for a training set.

AdaBoost is an algorithm for constructing a "strong" classifier as a linear combination of "simple" "weak" classifiers.

Final classification is based on a weighted vote of the weak classifiers.

Page 9: AdaBoost Terminology

h_t(x) … "weak" or basis classifier (Classifier = Learner = Hypothesis)

… "strong" or final classifier

Weak Classifier: < 50% error over any distribution

Strong Classifier: thresholded linear combination of the weak classifiers' outputs
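The final-classifier formula did not survive extraction; in standard AdaBoost notation it is:

```latex
% Strong classifier: sign of a weighted linear combination of the weak classifiers
H(x) = \operatorname{sign}\!\left( \sum_{t=1}^{T} \alpha_t \, h_t(x) \right)
```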

Page 10: Discrete AdaBoost Algorithm

Each training sample has a weight, which determines the probability of being selected for training the component classifier.
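The slide's pseudocode was lost in extraction; the following is a compact implementation of standard Discrete AdaBoost with decision stumps (my reconstruction, assuming NumPy arrays and labels in {-1, +1}):

```python
import numpy as np

def adaboost_train(X, y, T=50):
    """Discrete AdaBoost with axis-aligned threshold stumps; y in {-1, +1}."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)  # D_1(i) = 1/n: start with uniform weights
    ensemble = []            # list of (alpha_t, feature, threshold, polarity)
    for _ in range(T):
        # Weak learner: the stump (feature, threshold, polarity) with
        # minimum weighted error under the current distribution.
        best = None
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for s in (+1, -1):
                    pred = s * np.where(X[:, j] > thr, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, s)
        err, j, thr, s = best
        if err >= 0.5:  # no weak classifier beats chance: stop
            break
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))
        pred = s * np.where(X[:, j] > thr, 1, -1)
        w *= np.exp(-alpha * y * pred)  # shrink correct, grow misclassified
        w /= w.sum()                    # renormalize so D_{t+1} is a distribution
        ensemble.append((alpha, j, thr, s))
    return ensemble

def adaboost_predict(ensemble, X):
    """H(x) = sign(sum_t alpha_t * h_t(x))."""
    score = sum(a * s * np.where(X[:, j] > thr, 1, -1)
                for a, j, thr, s in ensemble)
    return np.sign(score)
```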

Pages 11–12: Find the Weak Classifier
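The selection criterion (reconstructed here, since the slides' formulas were lost) is the standard one: pick the hypothesis with minimum weighted training error under the current distribution D_t.

```latex
% Weighted error of a candidate h under D_t, and the chosen weak classifier
\epsilon_t = \sum_{i=1}^{n} D_t(i)\, \mathbf{1}\bigl[\, h(x_i) \neq y_i \,\bigr],
\qquad
h_t = \arg\min_{h \in \mathcal{H}} \epsilon_t
```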

Page 13: The algorithm core
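The core quantity of each round is the classifier weight α_t; the slide's formula is missing, but the standard expression is:

```latex
% Weight of weak classifier h_t; grows as its weighted error eps_t falls below 1/2
\alpha_t = \frac{1}{2} \ln\!\left( \frac{1 - \epsilon_t}{\epsilon_t} \right)
```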

Page 14: Reweighting

y · h(x) = +1 (correctly classified): the sample's weight is decreased

y · h(x) = −1 (misclassified): the sample's weight is increased

Pages 15–16: Reweighting

In this way, AdaBoost "focuses on" the informative or "difficult" examples.
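The update rule behind the two cases on Page 14, in the usual notation (reconstructed; Z_t is the normalizer that makes D_{t+1} a distribution):

```latex
% exp(-alpha_t) < 1 when y_i h_t(x_i) = +1; exp(+alpha_t) > 1 when y_i h_t(x_i) = -1
D_{t+1}(i) = \frac{D_t(i)\, \exp\bigl( -\alpha_t\, y_i\, h_t(x_i) \bigr)}{Z_t}
```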


Pages 17–24: Algorithm recapitulation (t = 1, 2, …)

Page 25: Pros and cons of AdaBoost

Advantages:

Very simple to implement

Does feature selection, resulting in a relatively simple classifier

Fairly good generalization

Disadvantages:

Suboptimal solution (greedy, stagewise optimization)

Sensitive to noisy data and outliers

Page 26: References

Duda, Hart & Stork – Pattern Classification

Freund – "An adaptive version of the boost by majority algorithm"

Freund & Schapire – "Experiments with a new boosting algorithm"

Freund & Schapire – "A decision-theoretic generalization of on-line learning and an application to boosting"

Friedman, Hastie & Tibshirani – "Additive Logistic Regression: A Statistical View of Boosting"

Jin, Liu, et al. (CMU) – "A New Boosting Algorithm Using Input-Dependent Regularizer"

Li, Zhang, et al. – "FloatBoost Learning for Classification"

Opitz & Maclin – "Popular Ensemble Methods: An Empirical Study"

Rätsch & Warmuth – "Efficient Margin Maximization with Boosting"

Schapire, Freund, et al. – "Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods"

Schapire & Singer – "Improved Boosting Algorithms Using Confidence-rated Predictions"

Schapire – "The Boosting Approach to Machine Learning: An Overview"

Zhang, Li, et al. – "Multi-view Face Detection with FloatBoost"

Page 27: Appendix

Bound on training error

AdaBoost variants

Page 28: Bound on Training Error (Schapire)
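The bound itself was lost in extraction; the standard result (Freund & Schapire, with ε_t = 1/2 − γ_t) is:

```latex
% Training error of H drops exponentially if each weak learner beats chance by gamma_t
\frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\bigl[\, H(x_i) \neq y_i \,\bigr]
  \;\le\; \prod_{t=1}^{T} Z_t
  \;=\; \prod_{t=1}^{T} 2 \sqrt{\epsilon_t (1 - \epsilon_t)}
  \;\le\; \exp\!\left( -2 \sum_{t=1}^{T} \gamma_t^{2} \right)
```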

Page 29: Discrete AdaBoost (DiscreteAB) (Friedman's wording)

Page 30: Discrete AdaBoost (DiscreteAB) (Freund and Schapire's wording)

Page 31: AdaBoost with Confidence-Weighted Predictions (RealAB)
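The algorithm listing is missing from the transcript; the defining feature of Real AdaBoost is that the weak learner outputs a real-valued confidence rather than a hard ±1 label, conventionally the half log-odds under the current weighted distribution:

```latex
% Real-valued weak hypothesis (Schapire & Singer / Friedman et al.)
h_t(x) = \frac{1}{2} \ln \frac{P_w(y = +1 \mid x)}{P_w(y = -1 \mid x)}
```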

Page 32: AdaBoost Variants Proposed by Friedman

LogitBoost:

Solves a weighted least-squares problem (a Newton step on the logistic loss) at each round

Requires care to avoid numerical problems

GentleBoost:

Update is f_m(x) = P_w(y=1 | x) − P_w(y=−1 | x) instead of the half log-odds ratio

Bounded in [−1, 1]
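For reference, the LogitBoost quantities the garbled lines above allude to, as given in Friedman, Hastie & Tibshirani's paper (reconstructed here, not verbatim from the slide):

```latex
% LogitBoost round: weighted least-squares fit of f_m to the working response z_i
z_i = \frac{y_i^{*} - p(x_i)}{p(x_i)\,\bigl(1 - p(x_i)\bigr)},
\qquad
w_i = p(x_i)\,\bigl(1 - p(x_i)\bigr),
\qquad
y_i^{*} = \frac{y_i + 1}{2} \in \{0, 1\}
```

The numerical care the slide mentions is needed because z_i blows up as p(x_i) approaches 0 or 1.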

Page 33: AdaBoost Variants Proposed by Friedman: LogitBoost

Page 34: AdaBoost Variants Proposed by Friedman: GentleBoost

Page 35: Thanks!!! Any comments or questions?