Multiple Regression I
KNNL – Chapter 6
Models with Multiple Predictors
• Most practical problems have more than one potential predictor variable
• Goal is to determine effects (if any) of each predictor, controlling for others
• Can include polynomial terms to allow for nonlinear relations
• Can include product terms to allow for interactions when effect of one variable depends on level of another variable
• Can include “dummy” variables for categorical predictors
First-Order Model with 2 Numeric Predictors
Y_i = β_0 + β_1 X_i1 + β_2 X_i2 + ε_i

E{ε_i} = 0  ⇒  E{Y} = β_0 + β_1 X_1 + β_2 X_2   (Plane in 3 dimensions)
Interpretation of Regression Coefficients
• Additive: E{Y} = β_0 + β_1 X_1 + β_2 X_2 ≡ Mean of Y @ X_1, X_2
• β_0 ≡ Intercept, Mean of Y when X_1 = X_2 = 0
• β_1 ≡ Slope with respect to X_1 (effect of increasing X_1 by 1 unit, while holding X_2 constant)
• β_2 ≡ Slope with respect to X_2 (effect of increasing X_2 by 1 unit, while holding X_1 constant)
• These can also be obtained by taking the partial derivatives of E{Y} with respect to X_1 and X_2, respectively
• Interaction Model: E{Y} = β_0 + β_1 X_1 + β_2 X_2 + β_3 X_1 X_2
• When X_2 = 0: Effect of increasing X_1 by 1: β_1(1) + β_3(1)(0) = β_1
• When X_2 = 1: Effect of increasing X_1 by 1: β_1(1) + β_3(1)(1) = β_1 + β_3
• The effect of increasing X_1 depends on the level of X_2, and vice versa
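The interaction arithmetic above can be checked numerically. A minimal sketch, with made-up coefficient values (b0–b3 are hypothetical, not from the text):

```python
# Hypothetical coefficients for the interaction model E{Y} = b0 + b1*X1 + b2*X2 + b3*X1*X2
b0, b1, b2, b3 = 10.0, 2.0, 5.0, 1.5

def mean_response(x1, x2):
    """Mean response under the interaction model."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

# Effect of a 1-unit increase in X1, at each level of X2:
effect_at_x2_0 = mean_response(1, 0) - mean_response(0, 0)  # b1 = 2.0
effect_at_x2_1 = mean_response(1, 1) - mean_response(0, 1)  # b1 + b3 = 3.5
print(effect_at_x2_0, effect_at_x2_1)
```

The X1 effect differs across X2 levels exactly by b3, which is what "interaction" means here.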
General Linear Regression Model
Y_i = β_0 + β_1 X_i1 + β_2 X_i2 + ... + β_{p-1} X_{i,p-1} + ε_i = β_0 + Σ_{k=1}^{p-1} β_k X_ik + ε_i    i = 1,...,n

E{ε_i} = 0  ⇒  E{Y} = β_0 + β_1 X_1 + β_2 X_2 + ... + β_{p-1} X_{p-1}   (Hyperplane in p dimensions)

When p - 1 = 1, this reduces to simple linear regression.

Normality, independence, and constant variance for errors:

ε_i ~ NID(0, σ²)  ⇒  Y_i ~ N(β_0 + β_1 X_i1 + ... + β_{p-1} X_{i,p-1}, σ²),  with Y_i, Y_j independent for i ≠ j
Special Types of Variables/Models - I
• p-1 distinct numeric predictors (attributes)
  Y = Sales, X_1 = Advertising, X_2 = Price
• Categorical Predictors – Indicator (dummy) variables, representing m-1 levels of an m-level categorical variable
  Y = Salary, X_1 = Experience, X_2 = 1 if College Grad, 0 if Not
• Polynomial Terms – Allow for bends in the regression
  Y = MPG, X_1 = Speed, X_2 = Speed²
• Transformed Variables – Transform the Y variable to achieve linearity
  Y' = ln(Y),  Y' = 1/Y
Special Types of Variables/Models - II
• Interaction Effects – Effect of one predictor depends on levels of other predictors
  Y = Salary, X_1 = Experience, X_2 = 1 if College Grad, 0 if Not, X_3 = X_1 X_2

  E{Y} = β_0 + β_1 X_1 + β_2 X_2 + β_3 X_1 X_2

  Non-College Grads (X_2 = 0):  E{Y} = β_0 + β_1 X_1 + β_2(0) + β_3 X_1(0) = β_0 + β_1 X_1
  College Grads (X_2 = 1):      E{Y} = β_0 + β_1 X_1 + β_2(1) + β_3 X_1(1) = (β_0 + β_2) + (β_1 + β_3) X_1

• Response Surface Models
  E{Y} = β_0 + β_1 X_1 + β_2 X_1² + β_3 X_2 + β_4 X_2² + β_5 X_1 X_2
• Note: Although the Response Surface Model has polynomial terms, it is linear with respect to the regression parameters
Matrix Form of Regression Model

Y_i = β_0 + β_1 X_i1 + β_2 X_i2 + ... + β_{p-1} X_{i,p-1} + ε_i    i = 1,...,n

Matrix Form:

Y = [Y_1, Y_2, ..., Y_n]'   (n×1)

X = [ 1  X_11  X_12  ...  X_1,p-1 ]
    [ 1  X_21  X_22  ...  X_2,p-1 ]
    [ ⋮   ⋮     ⋮           ⋮     ]
    [ 1  X_n1  X_n2  ...  X_n,p-1 ]   (n×p)

β = [β_0, β_1, ..., β_{p-1}]'   (p×1)    ε = [ε_1, ε_2, ..., ε_n]'   (n×1)

Y = Xβ + ε    E{ε} = 0 (n×1)    σ²{ε} = σ²I (n×n)

⇒  E{Y} = E{Xβ + ε} = Xβ    σ²{Y} = σ²I
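As a concrete illustration, the design matrix X can be assembled in NumPy. The data values here are hypothetical, chosen only to show the shapes:

```python
import numpy as np

# Hypothetical data: n = 5 observations, p - 1 = 2 predictors
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])

# Design matrix X (n x p): leading column of 1s carries the intercept
X = np.column_stack([np.ones(len(Y)), X1, X2])
print(X.shape)  # (5, 3), i.e., n = 5, p = 3
```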
Least Squares Estimation of Regression Coefficients
Goal: Minimize  Q = Σ_{i=1}^n (Y_i - β_0 - β_1 X_i1 - ... - β_{p-1} X_{i,p-1})²

Obtain estimates b_0, b_1, ..., b_{p-1} that minimize Q(b_0, b_1, ..., b_{p-1})

Normal Equations:  X'X b = X'Y  ⇒  b = (X'X)⁻¹ X'Y
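A minimal sketch of the normal equations in NumPy, using the same hypothetical data as above; solving the linear system is numerically preferable to forming (X'X)⁻¹ explicitly:

```python
import numpy as np

# Hypothetical data (n = 5, p = 3)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])
X  = np.column_stack([np.ones(len(Y)), X1, X2])

# Normal equations X'X b = X'Y
b = np.linalg.solve(X.T @ X, X.T @ Y)

# Cross-check against NumPy's least-squares routine
b_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.allclose(b, b_lstsq))  # True
```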
Maximum Likelihood also leads to the same estimator b:

L(β, σ²) = (1/(2πσ²))^{n/2} exp{ -(1/(2σ²)) Σ_{i=1}^n (Y_i - β_0 - β_1 X_i1 - ... - β_{p-1} X_{i,p-1})² }

since maximizing L with respect to β involves minimizing Σ_{i=1}^n (Y_i - β_0 - β_1 X_i1 - ... - β_{p-1} X_{i,p-1})²
Fitted Values and Residuals
Fitted Values:  Ŷ = [Ŷ_1, Ŷ_2, ..., Ŷ_n]'    Residuals:  e = [e_1, e_2, ..., e_n]'

Ŷ = Xb = X(X'X)⁻¹X'Y = HY    where  H = X(X'X)⁻¹X'  and  H = H' = HH  (symmetric, idempotent)

e = Y - Ŷ = Y - Xb = Y - X(X'X)⁻¹X'Y = (I - H)Y    with  (I - H) = (I - H)' = (I - H)(I - H)

σ²{Ŷ} = σ²{HY} = H σ²{Y} H' = σ²H    s²{Ŷ} = MSE·H

σ²{e} = σ²{(I - H)Y} = (I - H) σ²{Y} (I - H)' = σ²(I - H)    s²{e} = MSE·(I - H)
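The hat-matrix identities above can be verified directly. A sketch with the same hypothetical data:

```python
import numpy as np

# Hypothetical data (n = 5, p = 3)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])
X  = np.column_stack([np.ones(len(Y)), X1, X2])

H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix H = X(X'X)^{-1}X'
Y_hat = H @ Y                          # fitted values
e = Y - Y_hat                          # residuals

# H is symmetric and idempotent; residuals are orthogonal to the columns of X
print(np.allclose(H, H.T), np.allclose(H, H @ H), np.allclose(X.T @ e, 0))
```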
Analysis of Variance – Sums of Squares
Ȳ = (1/n) Σ_{i=1}^n Y_i = (1/n) 1'Y    [Ȳ, Ȳ, ..., Ȳ]' = (1/n) JY,  where J is the n×n matrix of 1s

Ŷ = Xb = HY

SSTO = Σ_{i=1}^n (Y_i - Ȳ)² = (Y - (1/n)JY)'(Y - (1/n)JY) = Y'[I - (1/n)J]Y

SSE = Σ_{i=1}^n (Y_i - Ŷ_i)² = (Y - Ŷ)'(Y - Ŷ) = Y'(I - H)Y = Y'Y - b'X'Y

SSR = Σ_{i=1}^n (Ŷ_i - Ȳ)² = (Ŷ - (1/n)JY)'(Ŷ - (1/n)JY) = Y'[H - (1/n)J]Y = b'X'Y - (1/n)Y'JY

MSE = SSE/(n - p)    E{MSE} = σ²

MSR = SSR/(p - 1)    E{MSR} = σ² + [ Σ_{k=1}^{p-1} Σ_{k'=1}^{p-1} β_k β_{k'} Σ_{i=1}^n (X_ik - X̄_k)(X_ik' - X̄_{k'}) ] / (p - 1)

⇒  E{MSR} = E{MSE} when β_1 = ... = β_{p-1} = 0;  E{MSR} > E{MSE} otherwise
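The decomposition SSTO = SSR + SSE can be confirmed numerically. A sketch with the hypothetical data used earlier:

```python
import numpy as np

# Hypothetical data (n = 5, p = 3)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])
X  = np.column_stack([np.ones(len(Y)), X1, X2])
n, p = X.shape

b = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ b

SSTO = np.sum((Y - Y.mean()) ** 2)      # total sum of squares
SSE  = np.sum((Y - Y_hat) ** 2)         # error sum of squares
SSR  = np.sum((Y_hat - Y.mean()) ** 2)  # regression sum of squares
MSE  = SSE / (n - p)

print(np.isclose(SSTO, SSR + SSE))  # decomposition SSTO = SSR + SSE holds
```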
ANOVA Table, F-test, and R²

Analysis of Variance (ANOVA) Table

Source        df       Sum of Squares               Mean Square
---------------------------------------------------------------------------
Regression    p - 1    SSR = b'X'Y - (1/n)Y'JY      MSR = SSR/(p - 1)
Error         n - p    SSE = Y'Y - b'X'Y            MSE = SSE/(n - p)
Total         n - 1    SSTO = Y'Y - (1/n)Y'JY

Test of H_0: β_1 = ... = β_{p-1} = 0  vs  H_A: Not all β_k = 0

Test Statistic:  F* = MSR/MSE    Rejection Region:  F* ≥ F(1 - α; p - 1, n - p)    P-value = Pr{F(p - 1, n - p) ≥ F*}

Coefficient of Multiple Determination:  R² = SSR/SSTO = 1 - SSE/SSTO    Correlation:  R = √R²

Adjusted R²:  R_a² = 1 - [SSE/(n - p)] / [SSTO/(n - 1)] = 1 - [(n - 1)/(n - p)](SSE/SSTO)

R_a² places a "penalty" on models with extra predictors
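A sketch computing F*, R², and R_a² for the hypothetical data; the F critical value and P-value would come from an F-distribution table or a statistics library, so only the statistics themselves are computed here:

```python
import numpy as np

# Hypothetical data (n = 5, p = 3)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])
X  = np.column_stack([np.ones(len(Y)), X1, X2])
n, p = X.shape

b = np.linalg.solve(X.T @ X, X.T @ Y)
SSTO = np.sum((Y - Y.mean()) ** 2)
SSE  = np.sum((Y - X @ b) ** 2)
SSR  = SSTO - SSE

F_star = (SSR / (p - 1)) / (SSE / (n - p))        # F* = MSR/MSE
R2     = SSR / SSTO                               # coefficient of multiple determination
R2_adj = 1 - (n - 1) / (n - p) * SSE / SSTO       # adjusted R^2

print(round(R2, 4), round(R2_adj, 4))
```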
Inferences Regarding Regression Parameters
Y = Xβ + ε    E{ε} = 0    σ²{ε} = σ²I    ⇒    E{Y} = Xβ    σ²{Y} = σ²I

E{b} = E{(X'X)⁻¹X'Y} = (X'X)⁻¹X'E{Y} = (X'X)⁻¹X'Xβ = β    (b is unbiased)

σ²{b} = σ²{(X'X)⁻¹X'Y} = (X'X)⁻¹X' σ²{Y} [(X'X)⁻¹X']' = σ²(X'X)⁻¹X'X(X'X)⁻¹ = σ²(X'X)⁻¹

s²{b} = MSE·(X'X)⁻¹    (diagonal elements give s²{b_0}, s²{b_1}, ..., s²{b_{p-1}})

(b_k - β_k)/s{b_k} ~ t(n - p)    (1 - α)100% CI for β_k:  b_k ± t(1 - α/2; n - p) s{b_k}

Test of H_0: β_k = 0  vs  H_A: β_k ≠ 0    Test Statistic:  t* = b_k/s{b_k}

Rejection Region:  |t*| ≥ t(1 - α/2; n - p)    P-value = 2·Pr{t(n - p) ≥ |t*|}

Simultaneous (Bonferroni) (1 - α)100% CIs for g parameters:  b_k ± t(1 - α/(2g); n - p) s{b_k}
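The standard errors s{b_k} and test statistics t* follow directly from s²{b} = MSE·(X'X)⁻¹. A sketch with the hypothetical data (the t critical value would come from a t-table or statistics library):

```python
import numpy as np

# Hypothetical data (n = 5, p = 3)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])
X  = np.column_stack([np.ones(len(Y)), X1, X2])
n, p = X.shape

b   = np.linalg.solve(X.T @ X, X.T @ Y)
MSE = np.sum((Y - X @ b) ** 2) / (n - p)

s2_b   = MSE * np.linalg.inv(X.T @ X)  # s^2{b} = MSE (X'X)^{-1}
s_b    = np.sqrt(np.diag(s2_b))        # standard errors s{b_k}
t_star = b / s_b                       # t* = b_k / s{b_k} for H0: beta_k = 0

print(s_b.shape, t_star.shape)
```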
Estimating Mean Response at Specific X-levels
Given a set of levels of X_1, ..., X_{p-1}:  X_h = [1, X_h1, ..., X_h,p-1]'

E{Y_h} = X_h'β    Ŷ_h = X_h'b

σ²{Ŷ_h} = X_h' σ²{b} X_h = σ² X_h'(X'X)⁻¹X_h    s²{Ŷ_h} = MSE · X_h'(X'X)⁻¹X_h

(1 - α)100% CI for E{Y_h}:  Ŷ_h ± t(1 - α/2; n - p) s{Ŷ_h}

(1 - α)100% Confidence Region for the Regression Surface:  Ŷ_h ± W s{Ŷ_h},  W² = p F(1 - α; p, n - p)

(1 - α)100% CIs for several (g) mean responses (Bonferroni):  Ŷ_h ± B s{Ŷ_h},  B = t(1 - α/(2g); n - p)
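A sketch of the point estimate and standard error of the mean response at a hypothetical X_h (the X-levels in Xh are made up for illustration):

```python
import numpy as np

# Hypothetical data (n = 5, p = 3)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])
X  = np.column_stack([np.ones(len(Y)), X1, X2])
n, p = X.shape

b   = np.linalg.solve(X.T @ X, X.T @ Y)
MSE = np.sum((Y - X @ b) ** 2) / (n - p)

Xh = np.array([1.0, 2.5, 3.0])  # hypothetical X-levels, with leading 1 for the intercept
Yh_hat = Xh @ b                 # estimated mean response X_h'b
s2_Yh  = MSE * Xh @ np.linalg.inv(X.T @ X) @ Xh  # s^2{Yhat_h}
s_Yh   = np.sqrt(s2_Yh)

print(round(float(Yh_hat), 4), round(float(s_Yh), 4))
```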
Predicting New Response(s) at Specific X-levels
Given a set of levels of X_1, ..., X_{p-1}:  X_h = [1, X_h1, ..., X_h,p-1]'

E{Y_h} = X_h'β    Ŷ_h = X_h'b

Single new observation:  s²{pred} = MSE · [1 + X_h'(X'X)⁻¹X_h]

Mean of m new observations (at the same X-levels):  s²{predmean} = MSE · [1/m + X_h'(X'X)⁻¹X_h]

(1 - α)100% prediction interval for Y_h(new):  Ŷ_h ± t(1 - α/2; n - p) s{pred}

Scheffé: (1 - α)100% intervals for several (g) new observations:  Ŷ_h ± S s{pred},  S² = g F(1 - α; g, n - p)

Bonferroni: (1 - α)100% intervals for several (g) new observations:  Ŷ_h ± B s{pred},  B = t(1 - α/(2g); n - p)
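Comparing the three variances makes the ordering explicit: predicting one new observation is more uncertain than predicting a mean of m, which is more uncertain than estimating the mean response. A sketch with the hypothetical data and X-levels used above:

```python
import numpy as np

# Hypothetical data (n = 5, p = 3)
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 4.0, 7.2, 7.9, 10.5])
X  = np.column_stack([np.ones(len(Y)), X1, X2])
n, p = X.shape

b   = np.linalg.solve(X.T @ X, X.T @ Y)
MSE = np.sum((Y - X @ b) ** 2) / (n - p)

Xh = np.array([1.0, 2.5, 3.0])         # hypothetical X-levels
q  = Xh @ np.linalg.inv(X.T @ X) @ Xh  # X_h'(X'X)^{-1}X_h

s2_mean     = MSE * q            # s^2{Yhat_h}: mean response
s2_pred     = MSE * (1 + q)      # s^2{pred}: one new observation
s2_predmean = MSE * (1 / 3 + q)  # s^2{predmean}: mean of m = 3 new observations

print(s2_pred > s2_predmean > s2_mean)  # True
```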