Multiple Regression I
KNNL – Chapter 6
Models with Multiple Predictors
• Most practical problems have more than one potential predictor variable
• Goal is to determine effects (if any) of each predictor, controlling for others
• Can include polynomial terms to allow for nonlinear relations
• Can include product terms to allow for interactions when effect of one variable depends on level of another variable
• Can include “dummy” variables for categorical predictors
First-Order Model with 2 Numeric Predictors
Y_i = β_0 + β_1 X_i1 + β_2 X_i2 + ε_i

E{ε_i} = 0  ⇒  E{Y} = β_0 + β_1 X_1 + β_2 X_2   (Plane in 3 dimensions)
Interpretation of Regression Coefficients
• Additive: E{Y} = β_0 + β_1 X_1 + β_2 X_2 ≡ Mean of Y @ X_1, X_2
• β_0 ≡ Intercept, Mean of Y when X_1 = X_2 = 0
• β_1 ≡ Slope with respect to X_1 (effect of increasing X_1 by 1 unit, while holding X_2 constant)
• β_2 ≡ Slope with respect to X_2 (effect of increasing X_2 by 1 unit, while holding X_1 constant)
• These can also be obtained by taking the partial derivatives of E{Y} with respect to X_1 and X_2, respectively
• Interaction Model: E{Y} = β_0 + β_1 X_1 + β_2 X_2 + β_3 X_1 X_2
• When X_2 = 0: Effect of increasing X_1 by 1: β_1(1) + β_3(1)(0) = β_1
• When X_2 = 1: Effect of increasing X_1 by 1: β_1(1) + β_3(1)(1) = β_1 + β_3
• The effect of increasing X_1 depends on the level of X_2, and vice versa
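The interaction arithmetic above can be checked numerically. A minimal sketch, with made-up coefficient values (b0–b3 are hypothetical, not from the text):

```python
# Hypothetical coefficients for the interaction model E{Y} = b0 + b1*X1 + b2*X2 + b3*X1*X2
b0, b1, b2, b3 = 10.0, 2.0, 5.0, 1.5

def mean_response(x1, x2):
    """Mean response under the interaction model."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

# Effect of a 1-unit increase in X1, at each level of X2:
effect_at_x2_0 = mean_response(1, 0) - mean_response(0, 0)  # b1 = 2.0
effect_at_x2_1 = mean_response(1, 1) - mean_response(0, 1)  # b1 + b3 = 3.5
print(effect_at_x2_0, effect_at_x2_1)
```

The X1 effect differs across X2 levels exactly by b3, which is what "interaction" means here.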
General Linear Regression Model
Y_i = β_0 + β_1 X_i1 + β_2 X_i2 + ... + β_{p-1} X_{i,p-1} + ε_i = β_0 + Σ_{k=1}^{p-1} β_k X_ik + ε_i    i = 1,...,n

E{ε_i} = 0  ⇒  E{Y} = β_0 + β_1 X_1 + β_2 X_2 + ... + β_{p-1} X_{p-1}   (Hyperplane in p dimensions)

When p - 1 = 1, this reduces to simple linear regression.

Normality, independence, and constant variance for errors:

ε_i ~ NID(0, σ²)  ⇒  Y_i ~ N(β_0 + β_1 X_i1 + ... + β_{p-1} X_{i,p-1}, σ²),  with Y_i, Y_j independent for i ≠ j
Special Types of Variables/Models - I
• p-1 distinct numeric predictors (attributes)
  Y = Sales, X_1 = Advertising, X_2 = Price
• Categorical Predictors – Indicator (dummy) variables, representing m-1 levels of an m-level categorical variable
  Y = Salary, X_1 = Experience, X_2 = 1 if College Grad, 0 if Not
• Polynomial Terms – Allow for bends in the regression
  Y = MPG, X_1 = Speed, X_2 = Speed²
• Transformed Variables – Transform the Y variable to achieve linearity
  Y' = ln(Y),  Y' = 1/Y
Special Types of Variables/Models - II
• Interaction Effects – Effect of one predictor depends on levels of other predictors
  Y = Salary, X_1 = Experience, X_2 = 1 if College Grad, 0 if Not, X_3 = X_1 X_2

  E{Y} = β_0 + β_1 X_1 + β_2 X_2 + β_3 X_1 X_2

  Non-College Grads (X_2 = 0):  E{Y} = β_0 + β_1 X_1 + β_2(0) + β_3 X_1(0) = β_0 + β_1 X_1
  College Grads (X_2 = 1):      E{Y} = β_0 + β_1 X_1 + β_2(1) + β_3 X_1(1) = (β_0 + β_2) + (β_1 + β_3) X_1

• Response Surface Models
  E{Y} = β_0 + β_1 X_1 + β_2 X_1² + β_3 X_2 + β_4 X_2² + β_5 X_1 X_2
• Note: Although the Response Surface Model has polynomial terms, it is linear with respect to the regression parameters
Matrix Form of Regression Model

Y_i = β_0 + β_1 X_i1 + β_2 X_i2 + ... + β_{p-1} X_{i,p-1} + ε_i    i = 1,...,n

Matrix Form:

Y = [Y_1, Y_2, ..., Y_n]'   (n×1)

X = [ 1  X_11  X_12  ...  X_1,p-1 ]
    [ 1  X_21  X_22  ...  X_2,p-1 ]
    [ ⋮   ⋮     ⋮           ⋮     ]
    [ 1  X_n1  X_n2  ...  X_n,p-1 ]   (n×p)

β = [β_0, β_1, ..., β_{p-1}]'   (p×1)    ε = [ε_1, ε_2, ..., ε_n]'   (n×1)

Y = Xβ + ε    E{ε} = 0 (n×1)    σ²{ε} = σ²I (n×n)

⇒  E{Y} = E{Xβ + ε} = Xβ    σ²{Y} = σ²I
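As a concrete illustration, the design matrix X can be assembled in NumPy. The data values here are hypothetical, chosen only to show the shapes:

```python
import numpy as np

# Hypothetical data: n = 5 observations, p - 1 = 2 predictors
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])

# Design matrix X (n x p): leading column of 1s carries the intercept
X = np.column_stack([np.ones(len(Y)), X1, X2])
print(X.shape)  # (5, 3), i.e., n = 5, p = 3
```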
Least Squares Estimation of Regression Coefficients
Goal: Minimize  Q = Σ_{i=1}^n (Y_i - β_0 - β_1 X_i1 - ... - β_{p-1} X_{i,p-1})²

Obtain estimates b_0, b_1, ..., b_{p-1} that minimize Q(b_0, b_1, ..., b_{p-1})

Normal Equations:  X'X b = X'Y  ⇒  b = (X'X)⁻¹ X'Y
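A minimal sketch of the normal equations in NumPy, using the same hypothetical data as above; solving the linear system is numerically preferable to forming (X'X)⁻¹ explicitly:

```python
import numpy as np

# Hypothetical data (n = 5, p = 3)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])
X  = np.column_stack([np.ones(len(Y)), X1, X2])

# Normal equations X'X b = X'Y
b = np.linalg.solve(X.T @ X, X.T @ Y)

# Cross-check against NumPy's least-squares routine
b_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.allclose(b, b_lstsq))  # True
```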
Maximum Likelihood also leads to the same estimator b:

L(β, σ²) = (1/(2πσ²))^{n/2} exp{ -(1/(2σ²)) Σ_{i=1}^n (Y_i - β_0 - β_1 X_i1 - ... - β_{p-1} X_{i,p-1})² }

since maximizing L with respect to β involves minimizing Σ_{i=1}^n (Y_i - β_0 - β_1 X_i1 - ... - β_{p-1} X_{i,p-1})²
Fitted Values and Residuals
Fitted Values:  Ŷ = [Ŷ_1, Ŷ_2, ..., Ŷ_n]'    Residuals:  e = [e_1, e_2, ..., e_n]'

Ŷ = Xb = X(X'X)⁻¹X'Y = HY    where  H = X(X'X)⁻¹X'  and  H = H' = HH  (symmetric, idempotent)

e = Y - Ŷ = Y - Xb = Y - X(X'X)⁻¹X'Y = (I - H)Y    with  (I - H) = (I - H)' = (I - H)(I - H)

σ²{Ŷ} = σ²{HY} = H σ²{Y} H' = σ²H    s²{Ŷ} = MSE·H

σ²{e} = σ²{(I - H)Y} = (I - H) σ²{Y} (I - H)' = σ²(I - H)    s²{e} = MSE·(I - H)
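The hat-matrix identities above can be verified directly. A sketch with the same hypothetical data:

```python
import numpy as np

# Hypothetical data (n = 5, p = 3)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])
X  = np.column_stack([np.ones(len(Y)), X1, X2])

H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix H = X(X'X)^{-1}X'
Y_hat = H @ Y                          # fitted values
e = Y - Y_hat                          # residuals

# H is symmetric and idempotent; residuals are orthogonal to the columns of X
print(np.allclose(H, H.T), np.allclose(H, H @ H), np.allclose(X.T @ e, 0))
```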
Analysis of Variance – Sums of Squares
Ȳ = (1/n) Σ_{i=1}^n Y_i = (1/n) 1'Y    [Ȳ, Ȳ, ..., Ȳ]' = (1/n) JY,  where J is the n×n matrix of 1s

Ŷ = Xb = HY

SSTO = Σ_{i=1}^n (Y_i - Ȳ)² = (Y - (1/n)JY)'(Y - (1/n)JY) = Y'[I - (1/n)J]Y

SSE = Σ_{i=1}^n (Y_i - Ŷ_i)² = (Y - Ŷ)'(Y - Ŷ) = Y'(I - H)Y = Y'Y - b'X'Y

SSR = Σ_{i=1}^n (Ŷ_i - Ȳ)² = (Ŷ - (1/n)JY)'(Ŷ - (1/n)JY) = Y'[H - (1/n)J]Y = b'X'Y - (1/n)Y'JY

MSE = SSE/(n - p)    E{MSE} = σ²

MSR = SSR/(p - 1)    E{MSR} = σ² + [ Σ_{k=1}^{p-1} Σ_{k'=1}^{p-1} β_k β_{k'} Σ_{i=1}^n (X_ik - X̄_k)(X_ik' - X̄_{k'}) ] / (p - 1)

⇒  E{MSR} = E{MSE} when β_1 = ... = β_{p-1} = 0;  E{MSR} > E{MSE} otherwise
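The decomposition SSTO = SSR + SSE can be confirmed numerically. A sketch with the hypothetical data used earlier:

```python
import numpy as np

# Hypothetical data (n = 5, p = 3)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])
X  = np.column_stack([np.ones(len(Y)), X1, X2])
n, p = X.shape

b = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ b

SSTO = np.sum((Y - Y.mean()) ** 2)      # total sum of squares
SSE  = np.sum((Y - Y_hat) ** 2)         # error sum of squares
SSR  = np.sum((Y_hat - Y.mean()) ** 2)  # regression sum of squares
MSE  = SSE / (n - p)

print(np.isclose(SSTO, SSR + SSE))  # decomposition SSTO = SSR + SSE holds
```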
ANOVA Table, F-test, and R²

Analysis of Variance (ANOVA) Table

Source        df       Sum of Squares               Mean Square
---------------------------------------------------------------------------
Regression    p - 1    SSR = b'X'Y - (1/n)Y'JY      MSR = SSR/(p - 1)
Error         n - p    SSE = Y'Y - b'X'Y            MSE = SSE/(n - p)
Total         n - 1    SSTO = Y'Y - (1/n)Y'JY

Test of H_0: β_1 = ... = β_{p-1} = 0  vs  H_A: Not all β_k = 0

Test Statistic:  F* = MSR/MSE    Rejection Region:  F* ≥ F(1 - α; p - 1, n - p)    P-value = Pr{F(p - 1, n - p) ≥ F*}

Coefficient of Multiple Determination:  R² = SSR/SSTO = 1 - SSE/SSTO    Correlation:  R = √R²

Adjusted R²:  R_a² = 1 - [SSE/(n - p)] / [SSTO/(n - 1)] = 1 - [(n - 1)/(n - p)](SSE/SSTO)

R_a² places a "penalty" on models with extra predictors
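A sketch computing F*, R², and R_a² for the hypothetical data; the F critical value and P-value would come from an F-distribution table or a statistics library, so only the statistics themselves are computed here:

```python
import numpy as np

# Hypothetical data (n = 5, p = 3)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])
X  = np.column_stack([np.ones(len(Y)), X1, X2])
n, p = X.shape

b = np.linalg.solve(X.T @ X, X.T @ Y)
SSTO = np.sum((Y - Y.mean()) ** 2)
SSE  = np.sum((Y - X @ b) ** 2)
SSR  = SSTO - SSE

F_star = (SSR / (p - 1)) / (SSE / (n - p))        # F* = MSR/MSE
R2     = SSR / SSTO                               # coefficient of multiple determination
R2_adj = 1 - (n - 1) / (n - p) * SSE / SSTO       # adjusted R^2

print(round(R2, 4), round(R2_adj, 4))
```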
Inferences Regarding Regression Parameters
Y = Xβ + ε    E{ε} = 0    σ²{ε} = σ²I    ⇒    E{Y} = Xβ    σ²{Y} = σ²I

E{b} = E{(X'X)⁻¹X'Y} = (X'X)⁻¹X'E{Y} = (X'X)⁻¹X'Xβ = β    (b is unbiased)

σ²{b} = σ²{(X'X)⁻¹X'Y} = (X'X)⁻¹X' σ²{Y} [(X'X)⁻¹X']' = σ²(X'X)⁻¹X'X(X'X)⁻¹ = σ²(X'X)⁻¹

s²{b} = MSE·(X'X)⁻¹    (diagonal elements give s²{b_0}, s²{b_1}, ..., s²{b_{p-1}})

(b_k - β_k)/s{b_k} ~ t(n - p)    (1 - α)100% CI for β_k:  b_k ± t(1 - α/2; n - p) s{b_k}

Test of H_0: β_k = 0  vs  H_A: β_k ≠ 0    Test Statistic:  t* = b_k/s{b_k}

Rejection Region:  |t*| ≥ t(1 - α/2; n - p)    P-value = 2·Pr{t(n - p) ≥ |t*|}

Simultaneous (Bonferroni) (1 - α)100% CIs for g parameters:  b_k ± t(1 - α/(2g); n - p) s{b_k}
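The standard errors s{b_k} and test statistics t* follow directly from s²{b} = MSE·(X'X)⁻¹. A sketch with the hypothetical data (the t critical value would come from a t-table or statistics library):

```python
import numpy as np

# Hypothetical data (n = 5, p = 3)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])
X  = np.column_stack([np.ones(len(Y)), X1, X2])
n, p = X.shape

b   = np.linalg.solve(X.T @ X, X.T @ Y)
MSE = np.sum((Y - X @ b) ** 2) / (n - p)

s2_b   = MSE * np.linalg.inv(X.T @ X)  # s^2{b} = MSE (X'X)^{-1}
s_b    = np.sqrt(np.diag(s2_b))        # standard errors s{b_k}
t_star = b / s_b                       # t* = b_k / s{b_k} for H0: beta_k = 0

print(s_b.shape, t_star.shape)
```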
Estimating Mean Response at Specific X-levels
Given a set of levels of X_1, ..., X_{p-1}:  X_h = [1, X_h1, ..., X_h,p-1]'

E{Y_h} = X_h'β    Ŷ_h = X_h'b

σ²{Ŷ_h} = X_h' σ²{b} X_h = σ² X_h'(X'X)⁻¹X_h    s²{Ŷ_h} = MSE · X_h'(X'X)⁻¹X_h

(1 - α)100% CI for E{Y_h}:  Ŷ_h ± t(1 - α/2; n - p) s{Ŷ_h}

(1 - α)100% Confidence Region for the Regression Surface:  Ŷ_h ± W s{Ŷ_h},  W² = p F(1 - α; p, n - p)

(1 - α)100% CIs for several (g) mean responses (Bonferroni):  Ŷ_h ± B s{Ŷ_h},  B = t(1 - α/(2g); n - p)
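A sketch of the point estimate and standard error of the mean response at a hypothetical X_h (the X-levels in Xh are made up for illustration):

```python
import numpy as np

# Hypothetical data (n = 5, p = 3)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])
X  = np.column_stack([np.ones(len(Y)), X1, X2])
n, p = X.shape

b   = np.linalg.solve(X.T @ X, X.T @ Y)
MSE = np.sum((Y - X @ b) ** 2) / (n - p)

Xh = np.array([1.0, 2.5, 3.0])  # hypothetical X-levels, with leading 1 for the intercept
Yh_hat = Xh @ b                 # estimated mean response X_h'b
s2_Yh  = MSE * Xh @ np.linalg.inv(X.T @ X) @ Xh  # s^2{Yhat_h}
s_Yh   = np.sqrt(s2_Yh)

print(round(float(Yh_hat), 4), round(float(s_Yh), 4))
```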
Predicting New Response(s) at Specific X-levels
Given a set of levels of X_1, ..., X_{p-1}:  X_h = [1, X_h1, ..., X_h,p-1]'

E{Y_h} = X_h'β    Ŷ_h = X_h'b

Single new observation:  s²{pred} = MSE · [1 + X_h'(X'X)⁻¹X_h]

Mean of m new observations (at the same X-levels):  s²{predmean} = MSE · [1/m + X_h'(X'X)⁻¹X_h]

(1 - α)100% prediction interval for Y_h(new):  Ŷ_h ± t(1 - α/2; n - p) s{pred}

Scheffé: (1 - α)100% intervals for several (g) new observations:  Ŷ_h ± S s{pred},  S² = g F(1 - α; g, n - p)

Bonferroni: (1 - α)100% intervals for several (g) new observations:  Ŷ_h ± B s{pred},  B = t(1 - α/(2g); n - p)
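Comparing the three variances makes the ordering explicit: predicting one new observation is more uncertain than predicting a mean of m, which is more uncertain than estimating the mean response. A sketch with the hypothetical data and X-levels used above:

```python
import numpy as np

# Hypothetical data (n = 5, p = 3)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])
X  = np.column_stack([np.ones(len(Y)), X1, X2])
n, p = X.shape

b   = np.linalg.solve(X.T @ X, X.T @ Y)
MSE = np.sum((Y - X @ b) ** 2) / (n - p)

Xh = np.array([1.0, 2.5, 3.0])         # hypothetical X-levels
q  = Xh @ np.linalg.inv(X.T @ X) @ Xh  # X_h'(X'X)^{-1}X_h

s2_mean     = MSE * q            # s^2{Yhat_h}: mean response
s2_pred     = MSE * (1 + q)      # s^2{pred}: one new observation
s2_predmean = MSE * (1 / 3 + q)  # s^2{predmean}: mean of m = 3 new observations

print(s2_pred > s2_predmean > s2_mean)  # True
```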