regression - university of washington€¦ · regression model: 26 stat/cse 416: intro to machine...

32
3/27/18 1 Regression: Predicting House Prices Emily Fox University of Washington March 27, 2018 1 ©2018 Emily Fox STAT/CSE 416: Intro to Machine Learning Predicting house prices ©2018 Emily Fox 2

Upload: others

Post on 12-Jun-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

1

STAT/CSE 416: Intro to Machine Learning

Regression:Predicting House PricesEmily FoxUniversity of WashingtonMarch 27, 2018

1 ©2018 Emily Fox

STAT/CSE 416: Intro to Machine Learning

Predicting house prices

©2018 Emily Fox2

Page 2: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

2

STAT/CSE 416: Intro to Machine Learning3

How much is my house worth?

©2018 Emily Fox

I want to listmy house

for sale

STAT/CSE 416: Intro to Machine Learning4

How much is my house worth?

©2018 Emily Fox

$$ ????

Page 3: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

3

STAT/CSE 416: Intro to Machine Learning

Data

©2018 Emily Fox

(x1 = sq.ft., y1 = $)

(x2 = sq.ft., y2 = $)

(x3 = sq.ft., y3 = $)

(x4 = sq.ft., y4 = $) Input vs. Output:• y is the quantity of interest

• assume y can be predicted from x

input output

(x5 = sq.ft., y5 = $)

5

STAT/CSE 416: Intro to Machine Learning6

Look at recent sales in my neighborhood

How much did they sell for?

©2018 Emily Fox

Page 4: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

4

STAT/CSE 416: Intro to Machine Learning7

Plot recent house sales (Past 2 years)

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

Terminology:x – feature,

covariate, or predictor

y – observation or response

STAT/CSE 416: Intro to Machine Learning8

Predict your house by similar houses

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

No house sold recently had exactlythe same sq.ft.

Page 5: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

5

STAT/CSE 416: Intro to Machine Learning9 ©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

• Look at average price in range

• Still only 2 houses!• Throwing out info

from all other sales

Predict your house by similar houses

STAT/CSE 416: Intro to Machine Learning10 ©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

Model – How we assume the world works

Regression model:

Page 6: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

6

STAT/CSE 416: Intro to Machine Learning11 ©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

“Essentially, all models are wrong, but some are useful.”

George Box, 1987.

Model – How we assume the world works

STAT/CSE 416: Intro to Machine Learning12 ©2018 Emily Fox

TrainingData

Featureextraction

ML model

Qualitymetric

ML algorithm

y

x ŷ

⌃f

Page 7: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

7

STAT/CSE 416: Intro to Machine Learning

Linear regression

©2018 Emily Fox13

STAT/CSE 416: Intro to Machine Learning14 ©2018 Emily Fox

TrainingData

Featureextraction

ML model

Qualitymetric

ML algorithm

y

x ŷ

f⌃

Page 8: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

8

STAT/CSE 416: Intro to Machine Learning15 ©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

yFit a line through the data

f(x) = w0+w1 x

parameters of model

Use a simple linear regression model

yi = w0+w1 xi + εi

STAT/CSE 416: Intro to Machine Learning16

Which line?

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

different parameters w0,w1

f(x) = w0+w1 x

Page 9: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

9

STAT/CSE 416: Intro to Machine Learning17 ©2018 Emily Fox

TrainingData

Featureextraction

ML model

Qualitymetric

ML algorithm

ŵy

x ŷ

f⌃

STAT/CSE 416: Intro to Machine Learning18

“Cost” of using a given line

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y Residual sum of squares (RSS)

RSS(w0,w1) = ($house 1-[w0+w1sq.ft.house 1])2

+ ($house 2-[w0+w1sq.ft.house 2])2

+ ($house 3-[w0+w1sq.ft.house 3])2

+ … [include all houses]

Page 10: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

10

STAT/CSE 416: Intro to Machine Learning19

Find “best” line

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y Minimize cost over all possible w0,w1

RSS(w0=1.1,w1=0.8)

RSS(w0=0.97,w1=0.85)

RSS(w0=0.98,w1=0.87)

RSS(w0,w1) =

(yi-[w0+w1xi])2

STAT/CSE 416: Intro to Machine Learning20 ©2018 Emily Fox

TrainingData

Featureextraction

ML model

Qualitymetric

ML algorithm

ŵy

x ŷ

Page 11: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

11

STAT/CSE 416: Intro to Machine Learning21

Minimizing the cost

©2018 Emily Fox

min (yi-[w0+w1xi])2

w0,w1

RSS(w0,w1) is a functionof 2 variables

Minimize function over all possible w0,w1

STAT/CSE 416: Intro to Machine Learning22

Gradient descent

©2018 Emily Fox

Algorithm:

while not convergedw(t+1) ß w(t) - η RSS(w(t))

Δ

ŵ

Page 12: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

12

STAT/CSE 416: Intro to Machine Learning

The fitted line: use + interpretation

©2018 Emily Fox23

STAT/CSE 416: Intro to Machine Learning24 ©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

Model vs. fitted line

Estimated parameters:ŵ0 , ŵ1

f(x) = ŵ0 + ŵ1 x⌃

yi = w0+w1 xi + εi

Regression model:

Page 13: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

13

STAT/CSE 416: Intro to Machine Learning25

yi = w0+w1 xi + εi

Seller: Predicting your house price

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

Best guess of your house price:

f(x) = ŵ0 + ŵ1 x

ŷhouse= ŵ0 + ŵ1 sq.ft.house

Regression model:

STAT/CSE 416: Intro to Machine Learning26

Buyer: Predicting size of house

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

Best guess of size of house you can afford:

f(x) = ŵ0 + ŵ1 x

$in bank = ŵ0 + ŵ1 sq.ft.

yi = w0+w1 xi + εi

Regression model:

Page 14: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

14

STAT/CSE 416: Intro to Machine Learning27

A concrete example

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y f(x) = -44850 + 280.76 x

Predict $ of 2,640 sq.ft. house:

-44850 + 280.76 * 2,640= $696,356

Predict sq.ft. of $859,000 sale:(859000+44850)/ 280.76

= 3,219 sq.ft.

STAT/CSE 416: Intro to Machine Learning28 ©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

Interpreting the coefficients

Predicted $ of house with sq.ft.=0 (just land)

ŷ = ŵ0 + ŵ1 x

Page 15: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

15

STAT/CSE 416: Intro to Machine Learning29 ©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

Interpreting the coefficients

ŷ = ŵ0 + ŵ1 x

1 sq. ft.

predictedchange in $

STAT/CSE 416: Intro to Machine Learning30 ©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

Interpreting the coefficients

1 sq. ft.

predictedchange in $

Warning: magnitude dependson units of both

features and observations

ŷ = ŵ0 + ŵ1 x

Page 16: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

16

STAT/CSE 416: Intro to Machine Learning31

A concrete example

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y f(x) = -$44,850 + 280.76 ($/sq.ft.) x

Predict $ of 2,640 sq.ft. house:

-$44,850 + 280.76 ($/sq.ft.) * 2,640 sq.ft.= $696,356

Predict sq.ft. of $859,000 sale:($859,000+$44,850)/ 280.76 ($/sq.ft.)

= 3,219 sq.ft.

STAT/CSE 416: Intro to Machine Learning32

A concrete example

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y f(x) = -$44,850 + 280.76 ($/sq.ft.) x

But what if:

- House was measured in square meters?

- Price was measured in RMB?

Page 17: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

17

STAT/CSE 416: Intro to Machine Learning

Adding higher order effects

©2018 Emily Fox33

STAT/CSE 416: Intro to Machine Learning34

Fit data with a line or … ?

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

You show your friend your analysis

Page 18: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

18

STAT/CSE 416: Intro to Machine Learning35

Fit data with a line or … ?

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

Dude, it’s not a linear relationship!

STAT/CSE 416: Intro to Machine Learning36

What about a quadratic function?

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

Dude, it’s not a linear relationship!

Page 19: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

19

STAT/CSE 416: Intro to Machine Learning37 ©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

fw(x) = w0 + w1 x+ w2 x2

What about a quadratic function?

STAT/CSE 416: Intro to Machine Learning38

Even higher order polynomial

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

f(x) = w0 + w1 x+ w2 x2 + … + wp xp

Page 20: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

20

STAT/CSE 416: Intro to Machine Learning39

Polynomial regression

Model:yi = w0 + w1 xi+ w2 xi

2 + … + wp xip + εi

feature 1 = 1 (constant)feature 2 = xfeature 3 = x2

…feature p+1 = xp

©2018 Emily Fox

treat as different features

parameter 1 = w0

parameter 2 = w1

parameter 3 = w2

…parameter p+1 = wp

STAT/CSE 416: Intro to Machine Learning40

Generic basis expansion

Model:yi = w0h0(xi) + w1 h1(xi) + … + wD hD(xi)+ εi

= wj hj(xi) + εi

feature 1 = h0(x)…often 1feature 2 = h1(x)… e.g., xfeature 3 = h2(x)… e.g., x2 or sin(2πx/12)…feature p+1 = hp(x)… e.g., xp

©2018 Emily Fox

jth regression coefficientor weight

jth feature

DX

j=0

Page 21: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

21

STAT/CSE 416: Intro to Machine Learning41

Generic basis expansion

Model:yi = w0h0(xi) + w1 h1(xi) + … + wD hD(xi)+ εi

= wj hj(xi) + εi

feature 1 = h0(x)…often 1 (constant)feature 2 = h1(x)… e.g., xfeature 3 = h2(x)… e.g., x2 or sin(2πx/12) or log(x)…feature D+1 = hD(x)… e.g., xp

©2018 Emily Fox

DX

j=0

STAT/CSE 416: Intro to Machine Learning

Adding other features

42 ©2018 Emily Fox

Page 22: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

22

STAT/CSE 416: Intro to Machine Learning43

Predictions just based on house size

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

Only 1 bathroom! Not same as my

3 bathrooms

STAT/CSE 416: Intro to Machine Learning44

Add more inputs

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x[1]

y

# bat

hroom

s

x[2]

f(x) = w0 + w1 sq.ft.+ w2 #bath

Page 23: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

23

STAT/CSE 416: Intro to Machine Learning45

Many possible inputs

- Square feet

- # bathrooms- # bedrooms

- Lot size- Year built

- …

©2018 Emily Fox

STAT/CSE 416: Intro to Machine Learning46

General notation

Output: yInputs: x = (x[1],x[2],…, x[d])

Notational conventions:x[j] = jth input (scalar)hj(x) = jth feature (scalar)xi = input of ith data point (vector)xi[j] = jth input of ith data point (scalar)

©2018 Emily Fox

d-dim vector

scalar

Page 24: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

24

STAT/CSE 416: Intro to Machine Learning47

Generic linear regression model

©2018 Emily Fox

Model:yi = w0 h0(xi) + w1 h1(xi) + … + wD hD(xi) + εi

= wj hj(xi) + εi

feature 1 = h0(x) … e.g., 1feature 2 = h1(x) … e.g., x[1] = sq. ft.feature 3 = h2(x) … e.g., x[2] = #bath

or, log(x[7]) x[2] = log(#bed) x #bath…feature D+1 = hD(x) … some other function of x[1],…, x[d]

DX

j=0

STAT/CSE 416: Intro to Machine Learning48

More on notation

# observations (xi,yi) : N# inputs x[j] : d# features hj(x) : D

©2015 Emily Fox & Carlos Guestrin

Page 25: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

25

STAT/CSE 416: Intro to Machine Learning49

RSS for multiple regression

©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

y

RSS(w) = (yi- )2# b

athro

oms

x[1]

x[2]

STAT/CSE 416: Intro to Machine Learning50

How many features to use?

• More on this soon!

©2018 Emily Fox

Page 26: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

26

STAT/CSE 416: Intro to Machine Learning

A compact representation

51 ©2018 Emily Fox

STAT/CSE 416: Intro to Machine Learning52

Representing function using vectors

©2018 Emily Fox

1 0 0 0 5 3 0 0 1 0 0 0 0

3 0 0 0 2 0 0 1 0 1 0 0 0

f(xi) = w0 h0(xi) + w1 h1(xi) + … + wD hD(xi) = wj hj(xi)DX

j=0

Page 27: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

27

STAT/CSE 416: Intro to Machine Learning53

As an algorithm…

©2018 Emily Fox

1 0 0 0 5 3 0 0 1 0 0 0 0

3 0 0 0 2 0 0 1 0 1 0 0 0

f(xi) = w0 h0(xi) + w1 h1(xi) + … + wD hD(xi) = wj hj(xi)DX

j=0

Algorithm:

for j=0,…,D

STAT/CSE 416: Intro to Machine Learning54

Compact notation

©2018 Emily Fox

1 0 0 0 5 3 0 0 1 0 0 0 0

3 0 0 0 2 0 0 1 0 1 0 0 0

f(xi) = w0 h0(xi) + w1 h1(xi) + … + wD hD(xi) = wj hj(xi)DX

j=0

Page 28: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

28

STAT/CSE 416: Intro to Machine Learning

Interpreting the fitted function

55 ©2018 Emily Fox

STAT/CSE 416: Intro to Machine Learning56 ©2018 Emily Fox

square feet (sq.ft.)

pri

ce ($

)

x

y

Interpreting the coefficients –Simple linear regression

ŷ = ŵ0 + ŵ1 x

1 sq. ft.

predictedchange in $

Page 29: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

29

STAT/CSE 416: Intro to Machine Learning57

fix

©2018 Emily Fox

Interpreting the coefficients –Two linear features

square feet (sq.ft.)

pri

ce ($

)

x[1]

y

# bat

hroom

s

x[2]

ŷ = ŵ0 + ŵ1 x[1] + ŵ2 x[2]

STAT/CSE 416: Intro to Machine Learning58

fix

©2018 Emily Fox

Interpreting the coefficients –Two linear features

ŷ = ŵ0 + ŵ1 x[1] + ŵ2 x[2]

# bathrooms

pri

ce ($

)

x[2]

1 bathroom

predictedchange in $

y

For fixed # sq.ft.!

Page 30: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

30

STAT/CSE 416: Intro to Machine Learning59

fix fix fixfix

©2018 Emily Fox

Interpreting the coefficients –Multiple linear featuresŷ = ŵ0 + ŵ1 x[1] + …+ŵj x[j] + … + ŵd x[d]

square feet (sq.ft.)

pri

ce ($

)

x[1]

y

# bat

hroom

s

x[2]

STAT/CSE 416: Intro to Machine Learning60 ©2018 Emily Fox

Interpreting the coefficients-Polynomial regressionŷ = ŵ0 + ŵ1x +… + ŵj xj + … + ŵp xp

square feet (sq.ft.)

pri

ce ($

)

x

y

Can’t hold other features fixed!

Page 31: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

31

STAT/CSE 416: Intro to Machine Learning

Summary for regression

67 ©2018 Emily Fox

STAT/CSE 416: Intro to Machine Learning68 ©2018 Emily Fox

TrainingData

Featureextraction

ML model

Qualitymetric

ML algorithm

ŵy

x ŷ

Page 32: Regression - University of Washington€¦ · Regression model: 26 STAT/CSE 416: Intro to Machine Learning Buyer: Predicting size of house ©2018 Emily Fox square feet (sq.ft.)) x

3/27/18

32

STAT/CSE 416: Intro to Machine Learning69

What you can do now…• Describe the input (features) and output (real-valued predictions) of a

regression model

• Calculate a goodness-of-fit metric (e.g., RSS)

• Understand how gradient descent is used to estimate model parameters by minimizing RSS

• Exploit the estimated model to form predictions

• Describe a regression model using multiple features• Interpret coefficients in a regression model with multiple features

• Describe other applications where regression is useful

©2018 Emily Fox