10701 Ensemble of Trees: Bagging and Random Forest


TRANSCRIPT

Page 1

10701

Ensemble of trees: Bagging and Random Forest

Page 2

Bagging

• Bagging, or bootstrap aggregation, is a technique for reducing the variance of an estimated prediction function.

• For classification, a committee of trees is grown and each tree casts a vote for the predicted class.

Page 3

Bootstrap

The basic idea: randomly draw datasets with replacement from the training data, each sample the same size as the original training set.
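A minimal sketch of the bootstrap draw in NumPy; the array names (X, y), their sizes, and the number of samples B are illustrative assumptions, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_sample(X, y):
    """Draw one bootstrap sample: N rows sampled with replacement."""
    n = X.shape[0]
    idx = rng.integers(0, n, size=n)   # indices sampled with replacement
    return X[idx], y[idx]

# Example: B bootstrap samples from a toy training set
X = rng.normal(size=(100, 5))          # N = 100 examples, M = 5 features
y = rng.integers(0, 2, size=100)       # binary labels
B = 10
samples = [bootstrap_sample(X, y) for _ in range(B)]
```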

Page 4

Bagging

Create bootstrap samples from the training data

[Figure: training data with N examples and M features, with several bootstrap samples drawn from it]

Page 5

Random Forest Classifier

Construct a decision tree

[Figure: N examples × M features training data and a decision tree]

Page 6

Bagging tree classifier

Take the majority vote

[Figure: several trees built from bootstrap samples of the N examples × M features data; their votes are combined]

Page 7

Bagging

Given training data Z = {(x1, y1), (x2, y2), . . . , (xN, yN)}, draw bootstrap samples Z*b, where b = 1, ..., B. The prediction at input x when bootstrap sample b is used for training is $\hat{f}^{*b}(x)$, and the bagged estimate averages them:

$\hat{f}_{\mathrm{bag}}(x) = \frac{1}{B}\sum_{b=1}^{B}\hat{f}^{*b}(x)$

Page 8

Bagging

Hastie

Treat the voting proportions as probabilities
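To make the recipe concrete, here is a hedged sketch of bagged trees with a majority vote, assuming scikit-learn's DecisionTreeClassifier is available and binary 0/1 labels; the voting proportions double as the probability estimates mentioned above. The number of trees B is an illustrative choice.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def fit_bagged_trees(X, y, B=50):
    """Fit B trees, each on its own bootstrap sample of the training data."""
    trees = []
    n = X.shape[0]
    for _ in range(B):
        idx = rng.integers(0, n, size=n)                 # bootstrap sample
        trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return trees

def predict_bagged(trees, X):
    """Each tree casts a vote; voting proportions serve as class probabilities."""
    votes = np.stack([t.predict(X) for t in trees])      # shape (B, n_test), 0/1 votes
    proportions = votes.mean(axis=0)                      # fraction of trees voting class 1
    return (proportions >= 0.5).astype(int), proportions
```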

Page 9

Random forest classifier

The random forest classifier is an extension of bagging which, in addition to resampling the training examples, also randomizes over the features: each split considers only a random subset of the features.

Page 10

Random Forest Classifier

Training Data

[Figure: training data with N examples and M features]

Page 11

Random Forest Classifier

Create bootstrap samples from the training data

[Figure: training data with N examples and M features, with several bootstrap samples drawn from it]

Page 12

Random Forest Classifier

Construct a decision tree

[Figure: N examples × M features training data and a decision tree]

Page 13

Random Forest Classifier

[Figure: N examples × M features training data and a decision tree]

At each node, when choosing the split feature, choose only among m < M features.

Page 14

Random Forest Classifier

Create a decision tree from each bootstrap sample

[Figure: one tree per bootstrap sample of the N examples × M features data]

Page 15

Random Forest Classifier

Take the majority vote

[Figure: ensemble of trees whose predictions are combined by majority vote]
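A sketch of the random-forest recipe from the preceding slides: bootstrap samples plus only m < M candidate features at each split. The per-split feature sampling is delegated here to scikit-learn's DecisionTreeClassifier via its max_features argument; B, m, and the sqrt(M) default are illustrative choices, not prescribed by the slides.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def fit_random_forest(X, y, B=100, m=None):
    """Grow B trees on bootstrap samples, considering only m features per split."""
    n, M = X.shape
    m = m if m is not None else max(1, int(np.sqrt(M)))    # common default m ~ sqrt(M)
    forest = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)                   # bootstrap sample
        tree = DecisionTreeClassifier(max_features=m)      # m < M features per split
        forest.append(tree.fit(X[idx], y[idx]))
    return forest

def predict_forest(forest, X):
    """Take the majority vote over the trees (assumes integer class labels)."""
    votes = np.stack([t.predict(X) for t in forest])       # shape (B, n_test)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```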

Page 16

Page 17

Random forest for biology

[Figure: example decision trees over biological features such as GeneExpress, TAP, Y2H, GOProcess, HMS-PCI, GeneOccur, GOLocalization, ProteinExpress, Domain, and SynExpress]

Page 18

Regression

10-701

Machine Learning

Page 19

Where we are

Inputs → Classifier → Predict category

Inputs → Density Estimator → Probability

Inputs → Regressor → Predict real number   ← Today

Page 20

Choosing a restaurant

Reviews (out of 5 stars)   $    Distance   Cuisine (out of 10)   Score
4                          30   21         7                     8.5
2                          15   12         8                     7.8
5                          27   53         9                     6.7
3                          20   5          6                     5.4

• In everyday life we need to make decisions by taking into account lots of factors.

• The question is what weight we put on each of these factors (how important they are with respect to the others).

• Assume we would like to build a recommender system for ranking potential restaurants based on an individual's preferences.

• If we have many observations we may be able to recover the weights.


Page 21

Linear regression

• Given an input x we would like to compute an output y

• For example:
  - Predict height from age
  - Predict Google's price from Yahoo's price
  - Predict distance from wall using sensor readings

[Figure: scatter plot of Y against X]

Note that now Y can be continuous.

Page 22

Linear regression

• Given an input x we would like to compute an output y

• In linear regression we assume that y and x are related by the following equation:

y = wx + ε

where w is a parameter and ε represents measurement or other noise

[Figure: scatter of the observed values (X) against Y, the quantity we are trying to predict]

Page 23

Linear regression

• Our goal is to estimate w from training data of <xi, yi> pairs

• One way to find such a relationship is to minimize the least squares error:

$\hat{w} = \arg\min_w \sum_i (y_i - w x_i)^2$

• Several other approaches can be used as well

• So why least squares?
  - minimizes the squared distance between measurements and the predicted line
  - has a nice probabilistic interpretation
  - easy to compute

[Figure: X–Y scatter with the fitted line y = wx]

If the noise ε is Gaussian with mean 0 then least squares is also the maximum likelihood estimate of w.

Page 24

Solving linear regression using least squares minimization

• You should be familiar with this by now …

• We just take the derivative w.r.t. w and set it to 0:

$\frac{\partial}{\partial w}\sum_i (y_i - w x_i)^2 = -2\sum_i x_i (y_i - w x_i) = 0$

$\Rightarrow\; \sum_i x_i y_i = w \sum_i x_i^2$

$\Rightarrow\; w = \frac{\sum_i x_i y_i}{\sum_i x_i^2}$
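A quick numerical check of the closed form above, with synthetic data generated at w = 2 and Gaussian noise (matching the examples on the following slides); the data itself is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + rng.normal(scale=1.0, size=200)   # generated with w = 2, noise std = 1

w = np.sum(x * y) / np.sum(x ** 2)              # w = sum_i x_i y_i / sum_i x_i^2
print(w)                                        # should recover roughly 2
```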

Page 25

Regression example

• Generated: w=2

• Recovered: w=2.03

• Noise: std=1

Page 26

Regression example

• Generated: w=2

• Recovered: w=2.05

• Noise: std=2

Page 27

Regression example

• Generated: w=2

• Recovered: w=2.08

• Noise: std=4

Page 28

Bias term

• So far we assumed that the line passes through the origin

• What if the line does not?

• No problem, simply change the model to

y = w0 + w1x + ε

• Can use least squares to determine w0, w1

$w_0 = \frac{1}{n}\sum_i (y_i - w_1 x_i)$

$w_1 = \frac{\sum_i x_i (y_i - w_0)}{\sum_i x_i^2}$

[Figure: fitted line with intercept w0 on the X–Y scatter plot]
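The two conditions above are coupled (w0 depends on w1 and vice versa); one simple way to satisfy them numerically is to alternate the two updates until they settle. This is a sketch under the assumption that the reconstruction of the formulas above is right; the "simpler solution" promised on the next slide is the matrix form used later in the lecture.

```python
import numpy as np

def fit_with_bias(x, y, iters=100):
    """Alternate the two optimality conditions for w0 and w1 until they stabilize."""
    w0, w1 = 0.0, 0.0
    for _ in range(iters):
        w0 = np.mean(y - w1 * x)                    # w0 = (1/n) sum_i (y_i - w1 x_i)
        w1 = np.sum(x * (y - w0)) / np.sum(x ** 2)  # w1 = sum_i x_i (y_i - w0) / sum_i x_i^2
    return w0, w1
```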

Page 29

Bias term

• So far we assumed that the line passes through the origin

• What if the line does not?

• No problem, simply change the model to

y = w0 + w1x + ε

• Can use least squares to determine w0, w1

$w_0 = \frac{1}{n}\sum_i (y_i - w_1 x_i)$

$w_1 = \frac{\sum_i x_i (y_i - w_0)}{\sum_i x_i^2}$

[Figure: fitted line with intercept w0 on the X–Y scatter plot]

Just a second, we will soon give a simpler solution.

Page 30

Multivariate regression

• What if we have several inputs?
  - Stock prices for Yahoo, Microsoft and Ebay for the Google prediction task

• This becomes a multivariate linear regression problem

• Again, it's easy to model:

y = w0 + w1x1 + … + wkxk + ε

[Figure: Google's stock price predicted from Yahoo's and Microsoft's stock prices]
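A sketch of the multivariate case with NumPy's least-squares solver; a column of ones supplies w0. The synthetic arrays merely stand in for the stock-price example and are not real data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 3
X = rng.normal(size=(n, k))                       # e.g. Yahoo, Microsoft, Ebay prices
y = 1.0 + X @ np.array([0.5, -0.2, 0.8]) + rng.normal(scale=0.1, size=n)

A = np.column_stack([np.ones(n), X])              # prepend a column of ones for w0
w, *_ = np.linalg.lstsq(A, y, rcond=None)         # w = [w0, w1, ..., wk]
```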

Page 31

Multivariate regression

• What if we have several inputs?
  - Stock prices for Yahoo, Microsoft and Ebay for the Google prediction task

• This becomes a multivariate regression problem

• Again, it's easy to model:

y = w0 + w1x1 + … + wkxk + ε

Not all functions can be approximated using the input values directly.

Page 32

$y = 10 + 3x_1^2 - 2x_2^2 + \epsilon$

In some cases we would like to use polynomial or other terms based on the input data. Are these still linear regression problems?

Yes. As long as the coefficients are linear, the equation is still a linear regression problem!

Page 33

Non-Linear basis function

• So far we only used the observed values

• However, linear regression can be applied in the same way to functions of these values

• As long as these functions can be directly computed from the observed values, the parameters are still linear in the data and the problem remains a linear regression problem

$y = w_0 + w_1 x_1^2 + \cdots + w_k x_k^2 + \epsilon$

Page 34

Non-Linear basis function

• What type of functions can we use?

• A few common examples:
  - Polynomial: $\phi_j(x) = x^j$ for j = 0 … n
  - Gaussian: $\phi_j(x) = \exp\!\left(-\frac{(x - \mu_j)^2}{2\sigma_j^2}\right)$
  - Sigmoid: $\phi_j(x) = \frac{1}{1 + \exp(-s_j x)}$

Any function of the input values can be used. The solution for the parameters of the regression remains the same.
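A sketch of the three basis-function families listed above; the parameter names mu, sigma, and s follow the reconstructed notation and are assumptions about the slide's symbols.

```python
import numpy as np

def poly_basis(x, j):
    """Polynomial basis: phi_j(x) = x**j."""
    return x ** j

def gaussian_basis(x, mu, sigma):
    """Gaussian basis centred at mu with width sigma (as reconstructed above)."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2))

def sigmoid_basis(x, s):
    """Sigmoid basis: phi_j(x) = 1 / (1 + exp(-s * x))."""
    return 1.0 / (1.0 + np.exp(-s * x))
```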

Page 35

General linear regression problem

• Using our new notation for the basis functions, linear regression can be written as

$y = \sum_{j=0}^{n} w_j \phi_j(x)$

• Here $\phi_j(x)$ can be either $x_j$ for multivariate regression or one of the non-linear bases we defined

• Once again we can use 'least squares' to find the optimal solution.

Page 36

LMS for the general linear regression problem

$y = \sum_{j=0}^{k} w_j \phi_j(x)$

Our goal is to minimize the following loss function:

$J(w) = \sum_i \Big(y^i - \sum_j w_j \phi_j(x^i)\Big)^2$

Moving to vector notation we get:

$J(w) = \sum_i \big(y^i - w^T \phi(x^i)\big)^2$

We take the derivative w.r.t. w:

$\frac{\partial}{\partial w}\sum_i \big(y^i - w^T\phi(x^i)\big)^2 = -2\sum_i \big(y^i - w^T\phi(x^i)\big)\,\phi(x^i)^T$

Equating to 0 we get:

$-2\sum_i \big(y^i - w^T\phi(x^i)\big)\,\phi(x^i)^T = 0 \;\Rightarrow\; \sum_i y^i \phi(x^i)^T = w^T \sum_i \phi(x^i)\phi(x^i)^T$

w – vector of dimension k+1
φ(x^i) – vector of dimension k+1
y^i – a scalar

Page 37

LMS for general linear regression problem

We take the derivative w.r.t. w:

$J(w) = \sum_i \big(y^i - w^T\phi(x^i)\big)^2$

$\frac{\partial}{\partial w}\sum_i \big(y^i - w^T\phi(x^i)\big)^2 = -2\sum_i \big(y^i - w^T\phi(x^i)\big)\,\phi(x^i)^T$

Equating to 0 we get:

$-2\sum_i \big(y^i - w^T\phi(x^i)\big)\,\phi(x^i)^T = 0 \;\Rightarrow\; \sum_i y^i \phi(x^i)^T = w^T \sum_i \phi(x^i)\phi(x^i)^T$

Define:

$\Phi = \begin{pmatrix} \phi_0(x^1) & \phi_1(x^1) & \cdots & \phi_k(x^1) \\ \phi_0(x^2) & \phi_1(x^2) & \cdots & \phi_k(x^2) \\ \vdots & \vdots & & \vdots \\ \phi_0(x^n) & \phi_1(x^n) & \cdots & \phi_k(x^n) \end{pmatrix}$

Then deriving w we get:

$w = (\Phi^T\Phi)^{-1}\Phi^T y$

Page 38

LMS for general linear regression problem

$J(w) = \sum_i \big(y^i - w^T\phi(x^i)\big)^2$

Deriving w we get:

$w = (\Phi^T\Phi)^{-1}\Phi^T y$

where Φ is an n by (k+1) matrix, y is a vector with n entries, and w is a vector with k+1 entries.

This solution is also known as the 'pseudo-inverse'.
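A sketch of the closed-form solution $w = (\Phi^T\Phi)^{-1}\Phi^T y$; np.linalg.pinv computes the pseudo-inverse directly, and the plain polynomial basis used to build Φ here is just an illustrative choice.

```python
import numpy as np

def design_matrix(x, k):
    """Phi: n by (k+1) matrix with phi_j(x_i) = x_i**j (polynomial basis as an example)."""
    return np.vander(x, N=k + 1, increasing=True)

def fit_general_linear(x, y, k):
    Phi = design_matrix(x, k)
    return np.linalg.pinv(Phi) @ y        # w = (Phi^T Phi)^{-1} Phi^T y
```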

Page 39

Example: Polynomial regression
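The slide's worked example is a figure; as a stand-in, here is a small demo of polynomial regression using the design-matrix fit sketched above (the data and degrees are illustrative).

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 30)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

for k in (1, 3, 9):                                   # low to high polynomial degree
    Phi = np.vander(x, N=k + 1, increasing=True)      # phi_j(x) = x**j
    w = np.linalg.pinv(Phi) @ y                       # pseudo-inverse solution
    print(k, np.round(w, 2))
```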

Page 40

A probabilistic interpretation

Our least squares minimization solution can also be motivated by a probabilistic interpretation of the regression problem:

y = wTφ(x) + ε

The MLE for w in this model is the same as the solution we derived for the least squares criterion:

$w = (\Phi^T\Phi)^{-1}\Phi^T y$

Page 41

Other types of linear regression

• Linear regression is a useful model for many problems

• However, the parameters we learn for this model are global; they are the same regardless of the value of the input x

• Extensions to linear regression adjust their parameters based on the region of the input we are dealing with

Page 42

Splines

• Instead of fitting one function for the entire region, fit a set of piecewise (usually cubic) polynomials satisfying continuity and smoothness constraints.

• Results in smooth and flexible functions without too many parameters

• Need to define the regions in advance (usually uniform)

$y = a_1 x^3 + b_1 x^2 + c_1 x + d_1$
$y = a_2 x^3 + b_2 x^2 + c_2 x + d_2$
$y = a_3 x^3 + b_3 x^2 + c_3 x + d_3$
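A sketch of a cubic smoothing fit, assuming SciPy is available; UnivariateSpline fits piecewise cubic polynomials with a smoothness penalty s, so it is a stand-in for the slide's piecewise construction rather than its exact method, and the data and s value are illustrative.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

spline = UnivariateSpline(x, y, k=3, s=5.0)   # k=3: cubic pieces; s controls smoothness
y_hat = spline(np.linspace(0, 10, 500))       # evaluate the fitted spline on a fine grid
```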

Page 43

LOCAL, KERNEL REGRESSION

Page 44

Local Kernel Regression

• What is the temperature in the room?

Average?   "Local" average at location x?

[Figure: temperature readings across the room; query location x]

Page 45

Local Average Regression

[Figure: local averaging within a ball of radius h around X]

Recall: the NN classifier with majority vote. Here we use the average instead:

$\hat{f}(X) = \frac{\text{sum of } Y\text{s in the } h\text{-ball around } X}{\#\text{pts in the } h\text{-ball around } X}$
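A sketch of the boxcar local-average estimator above; the bandwidth h and the data shapes are illustrative.

```python
import numpy as np

def local_average(x_train, y_train, x_query, h):
    """Average the Ys of training points within an h-ball around each query point."""
    preds = []
    for x0 in np.atleast_1d(x_query):
        in_ball = np.abs(x_train - x0) <= h
        preds.append(y_train[in_ball].mean() if in_ball.any() else np.nan)
    return np.array(preds)
```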

Page 46

Nadaraya-Watson Kernel Regression

[Figure: kernel weighting with bandwidth h]

Page 47

Local Kernel Regression

• Nonparametric estimator akin to kNN

• Nadaraya-Watson Kernel Estimator:

$\hat{f}(x) = \sum_i w_i(x)\, y_i \quad\text{where}\quad w_i(x) = \frac{K\!\left(\frac{x - x_i}{h}\right)}{\sum_j K\!\left(\frac{x - x_j}{h}\right)}$

• Weight each training point based on distance to the test point

• Boxcar kernel yields the local average

[Figure: kernel of bandwidth h]
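A sketch of the Nadaraya-Watson estimator with a Gaussian kernel (a boxcar kernel would recover the local average shown earlier); the kernel choice and h are illustrative assumptions.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, h):
    """Weight each training point by a Gaussian kernel of its distance to the query."""
    preds = []
    for x0 in np.atleast_1d(x_query):
        w = np.exp(-0.5 * ((x_train - x0) / h) ** 2)   # K((x0 - x_i)/h), Gaussian kernel
        preds.append(np.sum(w * y_train) / np.sum(w))  # kernel-weighted average of the Ys
    return np.array(preds)
```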

Page 48

Kernels

Page 49

Spatially adaptive regression

[Figure: a function whose smoothness varies across regions, fitted with different bandwidths h]

If function smoothness varies spatially, we want to allow the bandwidth h to depend on X.

Local polynomials, splines, wavelets, regression trees …

Page 50

Local Average Regression

[Figure: local averaging within a ball of radius h around X]

Recall: the NN classifier with majority vote. Here we use the average instead:

$\hat{f}(X) = \frac{\text{sum of } Y\text{s in the } h\text{-ball around } X}{\#\text{pts in the } h\text{-ball around } X}$

Page 51

Nadaraya-Watson Kernel Regression

[Figure: kernel weighting with bandwidth h]

Page 52

Local Kernel Regression

• Nonparametric estimator akin to kNN

• Nadaraya-Watson Kernel Estimator:

$\hat{f}(x) = \sum_i w_i(x)\, y_i \quad\text{where}\quad w_i(x) = \frac{K\!\left(\frac{x - x_i}{h}\right)}{\sum_j K\!\left(\frac{x - x_j}{h}\right)}$

• Weight each training point based on distance to the test point

• Boxcar kernel yields the local average

[Figure: kernel of bandwidth h]

Page 53

Kernels

Page 54

Choice of kernel bandwidth h

[Figure: Nadaraya-Watson fits with bandwidths h = 1, 10, 50, 200, illustrating bandwidths that are too small, just right, and too large. Image source: Larry's book, All of Nonparametric Statistics]

Page 55

Choice of Bandwidth

Large bandwidth h – averages more data points, reduces noise (lower variance)

Small bandwidth h – less smoothing, more accurate fit (lower bias)

Bias–variance tradeoff: h should depend on n, the number of training points (which determines the variance), and on the smoothness of the function (which determines the bias).
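One common way to resolve this tradeoff in practice is to pick h by error on held-out data; a minimal sketch, reusing the nadaraya_watson function sketched earlier (the candidate grid is illustrative).

```python
import numpy as np

def choose_bandwidth(x_train, y_train, x_val, y_val, candidates=(0.1, 0.5, 1, 2, 5, 10)):
    """Pick the bandwidth h with the smallest squared error on held-out data."""
    errs = []
    for h in candidates:
        y_hat = nadaraya_watson(x_train, y_train, x_val, h)   # from the earlier sketch
        errs.append(np.mean((y_val - y_hat) ** 2))
    return candidates[int(np.argmin(errs))]
```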

Page 56

Spatially adaptive regression

[Figure: a function whose smoothness varies across regions, fitted with different bandwidths h]

If function smoothness varies spatially, we want to allow the bandwidth h to depend on X.

Local polynomials, splines, wavelets, regression trees …

Page 57

Important points

• Linear regression

- basic model

- as a function of the input

• Solving linear regression

• Error in linear regression

• Advanced regression models