12-1. 12-2 chapter twelve multiple regression and model building mcgraw-hill/irwin copyright © 2004...
Chapter Twelve
Multiple Regression and Model Building
McGraw-Hill/Irwin Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.
Multiple Regression
12.1 The Linear Regression Model
12.2 The Least Squares Estimates and Prediction
12.3 The Mean Squared Error and the Standard Error
12.4 Model Utility: R², Adjusted R², and the F Test
12.5 Testing the Significance of an Independent Variable
12.6 Confidence Intervals and Prediction Intervals
12.7 Dummy Variables
12.8 Model Building and the Effects of Multicollinearity
12.9 Residual Analysis in Multiple Regression
12.1 The Linear Regression Model
The linear regression model relating y to x1, x2, …, xk is

$$y = \mu_{y|x_1,x_2,\dots,x_k} + \varepsilon = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon$$

where

$\mu_{y|x_1,x_2,\dots,x_k} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k$ is the mean value of the dependent variable y when the values of the independent variables are x1, x2, …, xk.

$\beta_0, \beta_1, \beta_2, \dots, \beta_k$ are the regression parameters relating the mean value of y to x1, x2, …, xk.

$\varepsilon$ is an error term that describes the effects on y of all factors other than the independent variables x1, x2, …, xk.
Example: The Linear Regression Model
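Example 12.1: The Fuel Consumption Case

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

| Week | Average Hourly Temperature, x1 (°F) | Chill Index, x2 | Fuel Consumption, y (MMcf) |
|---|---|---|---|
| 1 | 28.0 | 18 | 12.4 |
| 2 | 28.0 | 14 | 11.7 |
| 3 | 32.5 | 24 | 12.4 |
| 4 | 39.0 | 22 | 10.8 |
| 5 | 45.9 | 8 | 9.4 |
| 6 | 57.8 | 16 | 9.5 |
| 7 | 58.1 | 1 | 8.0 |
| 8 | 62.5 | 0 | 7.5 |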
The Linear Regression Model Illustrated
Example 12.1: The Fuel Consumption Case
The Regression Model Assumptions
Model

$$y = \mu_{y|x_1,x_2,\dots,x_k} + \varepsilon = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon$$

Assumptions about the model error terms, the ε's:

Mean Zero: The mean of the error terms is equal to 0.
Constant Variance: The variance of the error terms is σ², the same for every combination of values of x1, x2, …, xk.
Normality: The error terms follow a normal distribution for every combination of values of x1, x2, …, xk.
Independence: The values of the error terms are statistically independent of each other.
12.2 Least Squares Estimates and Prediction
Estimation/Prediction Equation:

$$\hat{y} = b_0 + b_1 x_{01} + b_2 x_{02} + \dots + b_k x_{0k}$$

b0, b1, b2, …, bk are the least squares point estimates of the parameters β0, β1, β2, …, βk.
x01, x02, …, x0k are specified values of the independent predictor variables x1, x2, …, xk.
$\hat{y}$ is the point estimate of the mean value of the dependent variable when the values of the independent variables are x01, x02, …, x0k. It is also the point prediction of an individual value of the dependent variable when the values of the independent variables are x01, x02, …, x0k.
Example: Least Squares Estimation
Example 12.3: The Fuel Consumption Case, Minitab Output

FuelCons = 13.1 - 0.0900 Temp + 0.0825 Chill

| Predictor | Coef | StDev | T | P |
|---|---|---|---|---|
| Constant | 13.1087 | 0.8557 | 15.32 | 0.000 |
| Temp | -0.09001 | 0.01408 | -6.39 | 0.001 |
| Chill | 0.08249 | 0.02200 | 3.75 | 0.013 |

S = 0.3671   R-Sq = 97.4%   R-Sq(adj) = 96.3%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 2 | 24.875 | 12.438 | 92.30 | 0.000 |
| Residual Error | 5 | 0.674 | 0.135 | | |
| Total | 7 | 25.549 | | | |

Predicted Values (Temp = 40, Chill = 10)

| Fit | StDev Fit | 95.0% CI | 95.0% PI |
|---|---|---|---|
| 10.333 | 0.170 | (9.895, 10.771) | (9.293, 11.374) |
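The least squares estimates in this output can be recomputed from the eight observations of Example 12.1. The following is a minimal sketch, assuming NumPy is available; the variable names (`temp`, `chill`, `fuel`, `X`, `b`) are illustrative and not part of the original slides. It should give coefficients close to the Minitab values and a point prediction close to 10.333 at Temp = 40, Chill = 10.

```python
import numpy as np

# Fuel consumption data from Example 12.1 (eight weeks).
temp = np.array([28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5])  # x1, deg F
chill = np.array([18.0, 14.0, 24.0, 22.0, 8.0, 16.0, 1.0, 0.0])    # x2
fuel = np.array([12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5])      # y, MMcf

# Design matrix with a leading column of ones for the intercept b0.
X = np.column_stack([np.ones(len(temp)), temp, chill])

# Least squares point estimates b = (b0, b1, b2).
b, *_ = np.linalg.lstsq(X, fuel, rcond=None)

# Point estimate of the mean (and point prediction) at Temp = 40, Chill = 10.
x0 = np.array([1.0, 40.0, 10.0])
y_hat = float(x0 @ b)

print("b =", np.round(b, 5))
print("y_hat =", round(y_hat, 3))
```

`np.linalg.lstsq` is used here instead of explicitly inverting X'X because it is the more numerically stable choice; for a problem this small either approach gives the same estimates.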
Example: Point Predictions and Residuals
Example 12.3: The Fuel Consumption Case

| Week | Average Hourly Temperature, x1 (°F) | Chill Index, x2 | Observed Fuel Consumption, y (MMcf) | Predicted Fuel Consumption, 13.1087 - .0900x1 + .0825x2 | Residual, e = y - pred |
|---|---|---|---|---|---|
| 1 | 28.0 | 18 | 12.4 | 12.0733 | 0.3267 |
| 2 | 28.0 | 14 | 11.7 | 11.7433 | -0.0433 |
| 3 | 32.5 | 24 | 12.4 | 12.1631 | 0.2369 |
| 4 | 39.0 | 22 | 10.8 | 11.4131 | -0.6131 |
| 5 | 45.9 | 8 | 9.4 | 9.6372 | -0.2372 |
| 6 | 57.8 | 16 | 9.5 | 9.2260 | 0.2740 |
| 7 | 58.1 | 1 | 8.0 | 7.9616 | 0.0384 |
| 8 | 62.5 | 0 | 7.5 | 7.4831 | 0.0169 |
12.3 Mean Square Error and Standard Error
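Sum of Squared Errors:

$$SSE = \sum e_i^2 = \sum (y_i - \hat{y}_i)^2$$

Mean Square Error, point estimate of the residual variance σ²:

$$s^2 = MSE = \frac{SSE}{n-(k+1)}$$

Standard Error, point estimate of the residual standard deviation σ:

$$s = \sqrt{MSE} = \sqrt{\frac{SSE}{n-(k+1)}}$$

Example 12.3: The Fuel Consumption Case

$$s^2 = MSE = \frac{SSE}{n-(k+1)} = \frac{0.674}{8-3} = 0.1348 \qquad s = \sqrt{s^2} = \sqrt{0.1348} = 0.3671$$

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 2 | 24.875 | 12.438 | 92.30 | 0.000 |
| Residual Error | 5 | 0.674 | 0.135 | | |
| Total | 7 | 25.549 | | | |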
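As a quick check of the arithmetic, the mean square error and standard error can be computed directly from the SSE in the analysis of variance table. This is a small sketch in plain Python; the helper name `mean_square_error` is hypothetical. Because the table reports SSE rounded to three decimals, the resulting s can differ from Minitab's 0.3671 in the fourth decimal place.

```python
import math


def mean_square_error(sse, n, k):
    """Point estimate of the residual variance: s^2 = SSE / (n - (k + 1))."""
    return sse / (n - (k + 1))


# Values from the fuel consumption case: SSE = 0.674, n = 8 weeks, k = 2 predictors.
sse, n, k = 0.674, 8, 2

mse = mean_square_error(sse, n, k)  # s^2
s = math.sqrt(mse)                  # standard error s

print("MSE =", round(mse, 4))
print("s   =", round(s, 4))
```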
12.4 Model Utility: Multiple Coefficient of Determination, R²
The multiple coefficient of determination R² is

$$R^2 = \frac{\text{Explained variation}}{\text{Total variation}}$$

Total variation = $\sum (y_i - \bar{y})^2$ = Total Sum of Squares (SSTO)
Explained variation = $\sum (\hat{y}_i - \bar{y})^2$ = Regression Sum of Squares (SSR)
Unexplained variation = $\sum (y_i - \hat{y}_i)^2$ = Error Sum of Squares (SSE)

Total variation = Explained variation + Unexplained variation

R² is the proportion of the total variation in y explained by the linear regression model.

Multiple correlation coefficient R: $R = \sqrt{R^2}$
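12.4 Model Utility: Adjusted R²
The adjusted multiple coefficient of determination is

$$\bar{R}^2 = \left(R^2 - \frac{k}{n-1}\right)\left(\frac{n-1}{n-(k+1)}\right)$$

Fuel Consumption Case:

S = 0.3671   R-Sq = 97.4%   R-Sq(adj) = 96.3%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 2 | 24.875 | 12.438 | 92.30 | 0.000 |
| Residual Error | 5 | 0.674 | 0.135 | | |
| Total | 7 | 25.549 | | | |

$$R^2 = \frac{24.875}{25.549} = 0.974, \qquad \bar{R}^2 = \left(0.974 - \frac{2}{8-1}\right)\left(\frac{8-1}{8-(2+1)}\right) = 0.963$$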
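The two model utility measures can be checked from the sums of squares in the analysis of variance table. This is a sketch in plain Python with illustrative variable names. It also evaluates the algebraically equivalent form 1 - (1 - R²)(n - 1)/(n - (k + 1)), which is not on the slides but is a common way to write adjusted R².

```python
# Sums of squares from the fuel consumption ANOVA table.
explained = 24.875   # SSR
total = 25.549       # SSTO
n, k = 8, 2          # observations, independent variables

r2 = explained / total

# Adjusted R^2 in the form used on the slide.
adj_r2 = (r2 - k / (n - 1)) * ((n - 1) / (n - (k + 1)))

# Equivalent form, shown only as a cross-check.
adj_r2_alt = 1 - (1 - r2) * (n - 1) / (n - (k + 1))

print("R^2          =", round(r2, 3))
print("adjusted R^2 =", round(adj_r2, 3))
```

Adjusted R² is smaller than R² here because it penalizes the model for using k = 2 predictors with only n = 8 observations.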
12.4 Model Utility: F Test for Linear Regression Model
To test H0: β1 = β2 = … = βk = 0 versus
Ha: At least one of β1, β2, …, βk is not equal to 0

Test Statistic:

$$F(\text{model}) = \frac{(\text{Explained variation})/k}{(\text{Unexplained variation})/[n-(k+1)]}$$

Reject H0 in favor of Ha if: $F(\text{model}) > F_\alpha$ or p-value < α

$F_\alpha$ is based on k numerator and n-(k+1) denominator degrees of freedom.
Example: F Test for Linear Regression
Example 12.5: The Fuel Consumption Case, Minitab Output

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 2 | 24.875 | 12.438 | 92.30 | 0.000 |
| Residual Error | 5 | 0.674 | 0.135 | | |
| Total | 7 | 25.549 | | | |

Test Statistic:

$$F(\text{model}) = \frac{(\text{Explained variation})/k}{(\text{Unexplained variation})/[n-(k+1)]} = \frac{24.875/2}{0.674/(8-3)} = 92.30$$

F-test at the α = 0.05 level of significance:

Reject H0 at the α = 0.05 level of significance, since

$$F(\text{model}) = 92.30 > 5.79 = F_{.05} \quad \text{and} \quad \text{p-value} = 0.000 < 0.05$$

$F_{.05}$ is based on 2 numerator and 5 denominator degrees of freedom.
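The overall F statistic can be reproduced from the analysis of variance table. This is a sketch in plain Python; the critical value 5.79 is copied from the slides rather than computed, and the variable names are illustrative. Because the tabled sums of squares are rounded, the result is about 92.3 and can differ from Minitab's 92.30 in the second decimal place.

```python
# Sums of squares from the fuel consumption ANOVA table.
explained = 24.875     # SSR
unexplained = 0.674    # SSE
n, k = 8, 2

# F(model) = (Explained variation / k) / (Unexplained variation / (n - (k + 1)))
f_model = (explained / k) / (unexplained / (n - (k + 1)))

# F_.05 with 2 numerator and 5 denominator degrees of freedom, as quoted on the slide.
f_crit = 5.79

reject_h0 = f_model > f_crit
print("F(model) =", round(f_model, 2), "| reject H0:", reject_h0)
```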
12.5 Testing Significance of the Independent Variable
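Test Statistic:

$$t = \frac{b_j}{s_{b_j}}$$

If the regression assumptions hold, we can reject H0: βj = 0 at the α level of significance (probability of Type I error equal to α) if and only if the appropriate rejection point condition holds or, equivalently, if the corresponding p-value is less than α.

| Alternative | Reject H0 if | p-Value |
|---|---|---|
| $H_a: \beta_j \neq 0$ | $\lvert t \rvert > t_{\alpha/2}$, that is, $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$ | Twice the area under the t distribution to the right of $\lvert t \rvert$ |
| $H_a: \beta_j > 0$ | $t > t_\alpha$ | Area under the t distribution to the right of t |
| $H_a: \beta_j < 0$ | $t < -t_\alpha$ | Area under the t distribution to the left of t |

$t_\alpha$, $t_{\alpha/2}$ and p-values are based on n-(k+1) degrees of freedom.

100(1-α)% Confidence Interval for βj:

$$[\,b_j \pm t_{\alpha/2}\, s_{b_j}\,]$$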
Example: Testing and Estimation for the β's
Example 12.6: The Fuel Consumption Case, Minitab Output

| Predictor | Coef | StDev | T | P |
|---|---|---|---|---|
| Constant | 13.1087 | 0.8557 | 15.32 | 0.000 |
| Temp | -0.09001 | 0.01408 | -6.39 | 0.001 |
| Chill | 0.08249 | 0.02200 | 3.75 | 0.013 |

Test:

$$t = \frac{b_2}{s_{b_2}} = \frac{0.08249}{0.02200} = 3.75 > 2.571 = t_{.025}$$

$$\text{p-value} = 2 \times P(t > 3.75) = 0.013$$

$t_\alpha$, $t_{\alpha/2}$ and p-values are based on 5 degrees of freedom.

Chill is significant at the α = 0.05 level, but not at α = 0.01.

Interval:

$$[\,b_2 \pm t_{\alpha/2}\, s_{b_2}\,] = [\,0.08249 \pm (2.571)(0.02200)\,] = [\,0.08249 \pm 0.05656\,] = [\,0.02593,\ 0.13905\,]$$
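The t statistic and the 95% confidence interval for the Chill coefficient follow directly from the Minitab coefficient table. This is a sketch in plain Python; the rejection point 2.571 (t.025 with 5 degrees of freedom) is copied from the slides, not computed, and the variable names are illustrative.

```python
# Chill row of the Minitab coefficient table.
b2 = 0.08249      # least squares estimate of beta_2
s_b2 = 0.02200    # its standard error

# t_.025 with n - (k + 1) = 5 degrees of freedom, as quoted on the slide.
t_crit = 2.571

t_stat = b2 / s_b2
significant_at_05 = abs(t_stat) > t_crit

half_width = t_crit * s_b2
ci = (b2 - half_width, b2 + half_width)

print("t =", round(t_stat, 2), "| significant at 0.05:", significant_at_05)
print("95% CI for beta_2 =", tuple(round(v, 5) for v in ci))
```

The interval excludes 0, which agrees with rejecting H0: β2 = 0 at the 0.05 level in the two-sided test.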
12.6 Confidence and Prediction Intervals
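Prediction:

$$\hat{y} = b_0 + b_1 x_{01} + b_2 x_{02} + \dots + b_k x_{0k}$$

If the regression assumptions hold,

100(1 - α)% confidence interval for the mean value of y:

$$[\,\hat{y} \pm t_{\alpha/2}\, s_{\hat{y}}\,], \qquad s_{\hat{y}} = s\sqrt{\text{Distance value}}$$

100(1 - α)% prediction interval for an individual value of y:

$$[\,\hat{y} \pm t_{\alpha/2}\, s_{(y-\hat{y})}\,], \qquad s_{(y-\hat{y})} = s\sqrt{1 + \text{Distance value}}$$

$t_{\alpha/2}$ is based on n-(k+1) degrees of freedom.

(Distance value requires matrix algebra)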
Example: Confidence and Prediction Intervals
Example 12.9: The Fuel Consumption Case, Minitab Output

FuelCons = 13.1 - 0.0900 Temp + 0.0825 Chill

Predicted Values (Temp = 40, Chill = 10)

| Fit | StDev Fit | 95.0% CI | 95.0% PI |
|---|---|---|---|
| 10.333 | 0.170 | (9.895, 10.771) | (9.293, 11.374) |

95% Confidence Interval:

$$[\,\hat{y} \pm t_{\alpha/2}\, s\sqrt{\text{Distance value}}\,] = [\,10.333 \pm (2.571)(0.3671)\sqrt{0.2144515}\,] = [\,10.333 \pm 0.438\,] = [\,9.895,\ 10.771\,]$$

95% Prediction Interval:

$$[\,\hat{y} \pm t_{\alpha/2}\, s\sqrt{1 + \text{Distance value}}\,] = [\,10.333 \pm (2.571)(0.3671)\sqrt{1 + 0.2144515}\,] = [\,10.333 \pm 1.041\,] = [\,9.292,\ 11.374\,]$$
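The distance value is usually obtained with matrix algebra as x0'(X'X)⁻¹x0, where X is the design matrix (including a column of ones) and x0 holds the specified predictor values. That formula is standard but is not spelled out on the slides, so the following NumPy sketch should be read under that assumption; the variable names are illustrative and the rejection point 2.571 is copied from the slides. It should give a distance value near 0.2145 and intervals close to the Minitab output.

```python
import numpy as np

# Fuel consumption data from Example 12.1.
temp = np.array([28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5])
chill = np.array([18.0, 14.0, 24.0, 22.0, 8.0, 16.0, 1.0, 0.0])
fuel = np.array([12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5])

X = np.column_stack([np.ones(len(temp)), temp, chill])
n, p = X.shape                      # p = k + 1 parameters

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ fuel            # least squares estimates
resid = fuel - X @ b
s = float(np.sqrt(resid @ resid / (n - p)))   # standard error

# Specified predictor values: Temp = 40, Chill = 10.
x0 = np.array([1.0, 40.0, 10.0])
y_hat = float(x0 @ b)
distance = float(x0 @ XtX_inv @ x0)  # assumed formula: x0' (X'X)^-1 x0

t_crit = 2.571                       # t_.025 with 5 df, taken from the slides

ci_half = t_crit * s * np.sqrt(distance)
pi_half = t_crit * s * np.sqrt(1.0 + distance)
ci = (y_hat - ci_half, y_hat + ci_half)
pi = (y_hat - pi_half, y_hat + pi_half)

print("distance value =", round(distance, 4))
print("95% CI =", tuple(round(float(v), 3) for v in ci))
print("95% PI =", tuple(round(float(v), 3) for v in pi))
```

The prediction interval is wider than the confidence interval because it adds the variability of an individual observation (the "1 +" under the square root) to the uncertainty in the estimated mean.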
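12.7 Dummy Variables
Example 12.11: The Electronics World Case

| Store | Number of Households, x | Location | Location Dummy, DM | Sales Volume, y |
|---|---|---|---|---|
| 1 | 161 | Street | 0 | 157.27 |
| 2 | 99 | Street | 0 | 93.28 |
| 3 | 135 | Street | 0 | 136.81 |
| 4 | 120 | Street | 0 | 123.79 |
| 5 | 164 | Street | 0 | 153.51 |
| 6 | 221 | Mall | 1 | 241.74 |
| 7 | 179 | Mall | 1 | 201.54 |
| 8 | 204 | Mall | 1 | 206.71 |
| 9 | 214 | Mall | 1 | 229.78 |
| 10 | 101 | Mall | 1 | 135.22 |

Location Dummy Variable:

$$D_M = \begin{cases} 1 & \text{if a store is in a mall location} \\ 0 & \text{otherwise} \end{cases}$$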
Example: Regression with a Dummy Variable
Example 12.11: The Electronics World Case, Minitab Output

Sales = 17.4 + 0.851 Households + 29.2 DM

| Predictor | Coef | StDev | T | P |
|---|---|---|---|---|
| Constant | 17.360 | 9.447 | 1.84 | 0.109 |
| Househol | 0.85105 | 0.06524 | 13.04 | 0.000 |
| DM | 29.216 | 5.594 | 5.22 | 0.001 |

S = 7.329   R-Sq = 98.3%   R-Sq(adj) = 97.8%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 2 | 21412 | 10706 | 199.32 | 0.000 |
| Residual Error | 7 | 376 | 54 | | |
| Total | 9 | 21788 | | | |
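To see how the dummy variable enters the fit, the regression can be rerun on the ten Electronics World stores. This is a minimal NumPy sketch with illustrative variable names; the 0/1 coding follows the location dummy definition (1 for a mall location, 0 otherwise). The estimates should be close to the Minitab coefficients.

```python
import numpy as np

# Electronics World data from Example 12.11.
households = np.array([161, 99, 135, 120, 164, 221, 179, 204, 214, 101], dtype=float)
location = ["Street"] * 5 + ["Mall"] * 5
sales = np.array([157.27, 93.28, 136.81, 123.79, 153.51,
                  241.74, 201.54, 206.71, 229.78, 135.22])

# Location dummy: 1 if the store is in a mall, 0 otherwise.
d_mall = np.array([1.0 if loc == "Mall" else 0.0 for loc in location])

# Design matrix: intercept, households, dummy.
X = np.column_stack([np.ones(len(sales)), households, d_mall])
b, *_ = np.linalg.lstsq(X, sales, rcond=None)

print("b0 (street intercept)        =", round(float(b[0]), 3))
print("b1 (slope on households)     =", round(float(b[1]), 5))
print("b2 (mall vs. street shift)   =", round(float(b[2]), 3))
```

In this model the dummy coefficient b2 shifts the intercept: for the same number of households, the estimated mean sales volume of a mall store exceeds that of a street store by b2, while both share the slope b1.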
12.8 Model Building and the Effects of Multicollinearity
Example: The Sales Territory Performance Case

| Sales | Time | MktPoten | Adver | MktShare | Change | Accts | WkLoad | Rating |
|---|---|---|---|---|---|---|---|---|
| 3669.88 | 43.10 | 74065.11 | 4582.88 | 2.51 | 0.34 | 74.86 | 15.05 | 4.9 |
| 3473.95 | 108.13 | 58117.30 | 5539.78 | 5.51 | 0.15 | 107.32 | 19.97 | 5.1 |
| 2295.10 | 13.82 | 21118.49 | 2950.38 | 10.91 | -0.72 | 96.75 | 17.34 | 2.9 |
| 4675.56 | 186.18 | 68521.27 | 2243.07 | 8.27 | 0.17 | 195.12 | 13.40 | 3.4 |
| 6125.96 | 161.79 | 57805.11 | 7747.08 | 9.15 | 0.50 | 180.44 | 17.64 | 4.6 |
| 2134.94 | 8.94 | 37806.94 | 402.44 | 5.51 | 0.15 | 104.88 | 16.22 | 4.5 |
| 5031.66 | 365.04 | 50935.26 | 3140.62 | 8.54 | 0.55 | 256.10 | 18.80 | 4.6 |
| 3367.45 | 220.32 | 35602.08 | 2086.16 | 7.07 | -0.49 | 126.83 | 19.86 | 2.3 |
| 6519.45 | 127.64 | 46176.77 | 8846.25 | 12.54 | 1.24 | 203.25 | 17.42 | 4.9 |
| 4876.37 | 105.69 | 42053.24 | 5673.11 | 8.85 | 0.31 | 119.51 | 21.41 | 2.8 |
| 2468.27 | 57.72 | 36829.71 | 2761.76 | 5.38 | 0.37 | 116.26 | 16.32 | 3.1 |
| 2533.31 | 23.58 | 33612.67 | 1991.85 | 5.43 | -0.65 | 142.28 | 14.51 | 4.2 |
| 2408.11 | 13.82 | 21412.79 | 1971.52 | 8.48 | 0.64 | 89.43 | 19.35 | 4.3 |
| 2337.38 | 13.82 | 20416.87 | 1737.38 | 7.80 | 1.01 | 84.55 | 20.02 | 4.2 |
| 4586.95 | 86.99 | 36272.00 | 10694.20 | 10.34 | 0.11 | 119.51 | 15.26 | 5.5 |
| 2729.24 | 165.85 | 23093.26 | 8618.61 | 5.15 | 0.04 | 80.49 | 15.87 | 3.6 |
| 3289.40 | 116.26 | 26879.59 | 7747.89 | 6.64 | 0.68 | 136.58 | 7.81 | 3.4 |
| 2800.78 | 42.28 | 39571.96 | 4565.81 | 5.45 | 0.66 | 78.86 | 16.00 | 4.2 |
| 3264.20 | 52.84 | 51866.15 | 6022.70 | 6.31 | -0.10 | 136.58 | 17.44 | 3.6 |
| 3453.62 | 165.04 | 58749.82 | 3721.10 | 6.35 | -0.03 | 138.21 | 17.98 | 3.1 |
| 1741.45 | 10.57 | 23990.82 | 860.97 | 7.37 | -1.63 | 75.61 | 20.99 | 1.6 |
| 2035.75 | 13.82 | 25694.86 | 3571.51 | 8.39 | -0.43 | 102.44 | 21.66 | 3.4 |
| 1578.00 | 8.13 | 23736.35 | 2845.50 | 5.15 | 0.04 | 76.42 | 21.46 | 2.7 |
| 4167.44 | 58.54 | 34314.29 | 5060.11 | 12.88 | 0.22 | 136.58 | 24.78 | 2.8 |
| 2799.97 | 21.14 | 22809.53 | 3552.00 | 9.14 | -0.74 | 88.62 | 24.96 | 3.9 |
Correlation Matrix
Example: The Sales Territory Performance Case
Multicollinearity
Multicollinearity refers to the condition where the independent variables (or predictors) in a model are dependent, related, or correlated with each other.
Effects
Hinders ability to use the bj's, t statistics, and p-values to assess the relative importance of predictors.
Does not hinder ability to predict the dependent (or response) variable.

Detection
Scatter Plot Matrix
Correlation Matrix
Variance Inflation Factors (VIF)
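The variance inflation factor for predictor xj is commonly defined as VIFj = 1/(1 - Rj²), where Rj² comes from regressing xj on the other predictors. The slides name VIF without giving the formula, so the sketch below relies on that standard definition. The helper `vif` is hypothetical, and for brevity it is demonstrated on the two fuel consumption predictors rather than on the sales territory data.

```python
import numpy as np


def vif(predictors):
    """Return VIF_j = 1 / (1 - R_j^2) for each column of a 2-D predictor array."""
    predictors = np.asarray(predictors, dtype=float)
    n, k = predictors.shape
    factors = []
    for j in range(k):
        target = predictors[:, j]
        others = np.delete(predictors, j, axis=1)
        # Regress x_j on the remaining predictors (with an intercept).
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2_j = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2_j))
    return np.array(factors)


# Demonstration on the fuel consumption predictors (Temp, Chill).
temp = np.array([28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5])
chill = np.array([18.0, 14.0, 24.0, 22.0, 8.0, 16.0, 1.0, 0.0])
v = vif(np.column_stack([temp, chill]))

# With only two predictors, both VIFs equal 1 / (1 - r^2), r being their correlation.
r = np.corrcoef(temp, chill)[0, 1]
print("VIF =", np.round(v, 3), "| 1/(1 - r^2) =", round(float(1 / (1 - r ** 2)), 3))
```

A VIF of 1 means a predictor is uncorrelated with the others; larger values indicate stronger multicollinearity. Any cutoff for "too large" is a rule of thumb rather than something stated in these slides.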
12.9 Residual Analysis in Multiple Regression
For an observed value of yi, the residual is

$$e_i = y_i - \hat{y}_i = y_i - (b_0 + b_1 x_{i1} + \dots + b_k x_{ik})$$

If the regression assumptions hold, the residuals should look like a random sample from a normal distribution with mean 0 and variance σ².

Residual Plots
Residuals versus each independent variable
Residuals versus predicted y's
Residuals in time order (if the response is a time series)
Histogram of residuals
Normal plot of the residuals
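Before drawing the plots listed above, it can help to compute the residuals and a few numeric summaries. The sketch below does this for the fuel consumption model with NumPy. The variable names are illustrative, and dividing each residual by s is a simple scaling chosen here for readability, not the studentized residual a statistics package may report.

```python
import numpy as np

# Fuel consumption data from Example 12.1.
temp = np.array([28.0, 28.0, 32.5, 39.0, 45.9, 57.8, 58.1, 62.5])
chill = np.array([18.0, 14.0, 24.0, 22.0, 8.0, 16.0, 1.0, 0.0])
fuel = np.array([12.4, 11.7, 12.4, 10.8, 9.4, 9.5, 8.0, 7.5])

X = np.column_stack([np.ones(len(temp)), temp, chill])
n, p = X.shape
b, *_ = np.linalg.lstsq(X, fuel, rcond=None)

fitted = X @ b
resid = fuel - fitted                       # e_i = y_i - y_hat_i
s = float(np.sqrt(resid @ resid / (n - p)))
scaled = resid / s                          # residuals in units of s

# With an intercept in the model, least squares residuals average to zero.
print("mean residual =", round(float(resid.mean()), 10))

# Fitted value and scaled residual per week: the raw material for a
# "residuals versus predicted y" plot.
for week, (f, e) in enumerate(zip(fitted, scaled), start=1):
    print(f"week {week}: fitted = {f:7.4f}, residual / s = {e:6.2f}")

largest = int(np.argmax(np.abs(resid))) + 1
print("largest absolute residual occurs in week", largest)
```

With only eight observations, such summaries and the corresponding plots give at most a rough check of the constant variance and normality assumptions.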
Multiple Regression
Summary:
12.1 The Linear Regression Model
12.2 The Least Squares Estimates and Prediction
12.3 The Mean Squared Error and the Standard Error
12.4 Model Utility: R², Adjusted R², and the F Test
12.5 Testing the Significance of an Independent Variable
12.6 Confidence Intervals and Prediction Intervals
12.7 Dummy Variables
12.8 Model Building and the Effects of Multicollinearity
12.9 Residual Analysis in Multiple Regression