© 2010 Pearson Prentice Hall. All rights reserved. Least Squares Regression Models
Post on 19-Dec-2015
Least Squares Regression Models
The least-squares regression model is given by
$$y_i = \beta_1 x_i + \beta_0 + \varepsilon_i$$
where
• $y_i$ is the value of the response variable for the ith individual
• $\beta_0$ and $\beta_1$ are the parameters to be estimated based on sample data
• $x_i$ is the value of the explanatory variable for the ith individual
• $\varepsilon_i$ is a random error term with mean 0 and variance $\sigma_{\varepsilon_i}^2 = \sigma^2$; the error terms are independent and normally distributed
• $i = 1, \dots, n$, where n is the sample size (number of ordered pairs in the data set)
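To make the model's assumptions concrete, the following Python sketch simulates responses from it. The function name `simulate_responses`, the x-values, and the parameter values (intercept 5.0, slope 0.01, error standard deviation 0.5) are hypothetical and chosen only for illustration; they do not come from the slides.

```python
import random

def simulate_responses(x_values, beta0, beta1, sigma, seed=0):
    """Draw y_i = beta1 * x_i + beta0 + eps_i, with independent eps_i ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [beta1 * x + beta0 + rng.gauss(0.0, sigma) for x in x_values]

# Hypothetical inputs (assumptions, not values from the slides).
x_values = [10, 20, 30, 40, 50]
y_values = simulate_responses(x_values, beta0=5.0, beta1=0.01, sigma=0.5)

# With sigma = 0 the error term vanishes and every point lies on the line.
noise_free = simulate_responses(x_values, beta0=5.0, beta1=0.01, sigma=0.0)
```

Setting `sigma=0.0` shows the deterministic part of the model on its own; any positive `sigma` scatters the responses around that line with the same spread at every x, which is the constant-variance assumption.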
Formulas for the slope and intercept estimates
For the estimated regression equation given by the formula
$$\hat{y}_i = b_0 + b_1 x_i$$
the slope $b_1$ is calculated by
$$b_1 = \frac{\sum xy - \dfrac{(\sum x)(\sum y)}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}}$$
and the intercept $b_0$ can be found with
$$b_0 = \bar{y} - b_1 \bar{x}$$
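The two formulas above can be sketched in a few lines of Python. The function name `least_squares_fit` is an illustrative choice; the data are the drilling measurements used in the examples later in these slides.

```python
def least_squares_fit(x, y):
    """Return (b0, b1) using the computational formulas for slope and intercept."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    b0 = sum_y / n - b1 * (sum_x / n)
    return b0, b1

# Drilling data from the slides: depth in feet, time to drill 5 feet in minutes.
depth = [35, 50, 75, 95, 120, 130, 145, 155, 160, 175, 185, 190]
minutes = [5.88, 5.99, 6.74, 6.1, 7.47, 6.93, 6.42, 7.97, 7.92, 7.62, 6.89, 7.9]

b0, b1 = least_squares_fit(depth, minutes)
print(round(b1, 4), round(b0, 4))  # 0.0116 5.5273
```

Rounded to four decimal places, the result agrees with the regression line the slides report from Minitab.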
The standard error of the estimate, $s_e$, is found using the formula
$$s_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n-2}} = \sqrt{\frac{\sum \text{residuals}^2}{n-2}}$$
Parallel Example 2: Compute the Standard Error
Compute the standard error of the estimate for the drilling data, which is presented on the next slide.
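| Depth at Which Drilling Begins, x (in feet) | Time to Drill 5 Feet, y (in minutes) |
|---|---|
| 35 | 5.88 |
| 50 | 5.99 |
| 75 | 6.74 |
| 95 | 6.1 |
| 120 | 7.47 |
| 130 | 6.93 |
| 145 | 6.42 |
| 155 | 7.97 |
| 160 | 7.92 |
| 175 | 7.62 |
| 185 | 6.89 |
| 190 | 7.9 |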
Solution
Step 1: Using technology (e.g., Minitab), we find the least-squares regression line to be
$$\hat{y} = 0.0116x + 5.5273$$
Step 2, 3: The predicted values as well as the residuals for the 12 observations are given in the table on the next slide.
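| Depth, x | Time, y | $\hat{y}$ | $y - \hat{y}$ | $(y - \hat{y})^2$ |
|---|---|---|---|---|
| 35 | 5.88 | 5.9333 | -0.0533 | 0.0028 |
| 50 | 5.99 | 6.1073 | -0.1173 | 0.0138 |
| 75 | 6.74 | 6.3973 | 0.3427 | 0.1174 |
| 95 | 6.1 | 6.6293 | -0.5293 | 0.2802 |
| 120 | 7.47 | 6.9193 | 0.5507 | 0.3033 |
| 130 | 6.93 | 7.0353 | -0.1053 | 0.0111 |
| 145 | 6.42 | 7.2093 | -0.7893 | 0.6230 |
| 155 | 7.97 | 7.3253 | 0.6447 | 0.4156 |
| 160 | 7.92 | 7.3833 | 0.5367 | 0.2880 |
| 175 | 7.62 | 7.5573 | 0.0627 | 0.0039 |
| 185 | 6.89 | 7.6733 | -0.7833 | 0.6136 |
| 190 | 7.9 | 7.7313 | 0.1687 | 0.0285 |

$$\sum \text{residuals}^2 = 2.7012$$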
Solution
Step 4: We find the sum of the squared residuals by summing the last column of the table:
$$\sum \text{residuals}^2 = 2.7012$$
Step 5: The standard error of the estimate is then given by:
$$s_e = \sqrt{\frac{\sum \text{residuals}^2}{n-2}} = \sqrt{\frac{2.7012}{10}} = 0.5197$$
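Steps 2 through 5 can be reproduced with a short Python sketch. It takes the rounded regression line from Step 1 as given, so the intermediate values match the slide's table; the function name `standard_error_of_estimate` is an illustrative choice.

```python
import math

def standard_error_of_estimate(x, y, b0, b1):
    """Return (sum of squared residuals, s_e) for the line y-hat = b0 + b1 * x."""
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in residuals)
    # Divide by n - 2, not n: two parameters (b0 and b1) were estimated.
    return sse, math.sqrt(sse / (len(x) - 2))

# Drilling data from the slides.
depth = [35, 50, 75, 95, 120, 130, 145, 155, 160, 175, 185, 190]
minutes = [5.88, 5.99, 6.74, 6.1, 7.47, 6.93, 6.42, 7.97, 7.92, 7.62, 6.89, 7.9]

# Coefficients as rounded on the slide (Step 1).
sse, se = standard_error_of_estimate(depth, minutes, b0=5.5273, b1=0.0116)
print(round(sse, 4), round(se, 4))  # approximately 2.7012 and 0.5197
```

Using the unrounded least-squares coefficients instead would lower the sum of squared residuals very slightly, but the standard error still rounds to 0.5197.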
CAUTION!
Be sure to divide by n-2 when computing the standard error of the estimate.
Parallel Example 4: Verifying That the Residuals Are Normally Distributed
Verify that the residuals from the drilling example are normally distributed.
[Figure: normal probability plot of the residuals]
Conclusion: We have insufficient evidence at the 5% level of significance to support the claim that the residual errors from this model are not normally distributed.
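The slides do not say which normality test produced this conclusion. As a rough, assumption-laden sketch of the idea behind a normal probability plot, the Python code below pairs the ordered residuals from the table with expected normal scores and computes their correlation. The plotting positions $(i - 0.375)/(n + 0.25)$ are one common convention and are an assumption here, as are the helper names.

```python
import math
from statistics import NormalDist, mean

# Residuals y - y-hat from the drilling table in the slides.
residuals = [-0.0533, -0.1173, 0.3427, -0.5293, 0.5507, -0.1053,
             -0.7893, 0.6447, 0.5367, 0.0627, -0.7833, 0.1687]

def normal_scores(n):
    """Expected z-scores for n ordered observations (assumed plotting positions)."""
    return [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

ordered = sorted(residuals)
scores = normal_scores(len(ordered))
r_plot = correlation(ordered, scores)  # close to 1 when the plot is roughly linear
```

A correlation near 1 means the points of the normal probability plot fall close to a straight line, which is consistent with normally distributed residuals. Turning this into a formal test requires a critical value for n = 12, which this sketch does not supply.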
Hypothesis Test Regarding the Slope Coefficient, $\beta_1$
To test whether two quantitative variables are linearly related, we use the following steps, provided that
1. the sample is obtained using random sampling.
2. the residuals are normally distributed with constant error variance.
Step 1: Determine the null and alternative hypotheses. The hypotheses can be structured in one of three ways:

| Two-Tailed | Left-Tailed | Right-Tailed |
|---|---|---|
| $H_0: \beta_1 = 0$ | $H_0: \beta_1 = 0$ | $H_0: \beta_1 = 0$ |
| $H_1: \beta_1 \neq 0$ | $H_1: \beta_1 < 0$ | $H_1: \beta_1 > 0$ |

Step 2: Select a level of significance, $\alpha$, depending on the seriousness of making a Type I error.
Step 3: Compute the test statistic
$$t_0 = \frac{b_1 - \beta_1}{s_{b_1}} = \frac{b_1}{s_{b_1}}$$
which follows Student's t-distribution with n − 2 degrees of freedom. Remember, when computing the test statistic, we assume the null hypothesis to be true. So, we assume that $\beta_1 = 0$.
P-Value Approach
Step 4: Use Table VI to estimate the P-value using n − 2 degrees of freedom.
[Figures: two-tailed, left-tailed, and right-tailed cases]
Step 5: If the P-value < $\alpha$, reject the null hypothesis.
Step 6: State the conclusion.
CAUTION!
Before testing $H_0: \beta_1 = 0$, be sure to draw a residual plot to verify that a linear model is appropriate.
Parallel Example 5: Testing for a Linear Relation
Test the claim that there is a linear relation between drill depth and drill time at the $\alpha = 0.05$ level of significance using the drilling data.
Solution
Verify the requirements:
• We assume that the experiment was randomized so that the data can be assumed to represent a random sample.
• In Parallel Example 4 we confirmed that the residuals were normally distributed by constructing a normal probability plot.
• To verify the requirement of constant error variance, we plot the residuals against the explanatory variable, drill depth.
[Figure: residuals plotted against drill depth]
There is no discernible pattern.
Solution
Step 1: We want to determine whether a linear relation exists between drill depth and drill time without regard to the sign of the slope. This is a two-tailed test with
$H_0: \beta_1 = 0$ versus $H_1: \beta_1 \neq 0$
Step 2: The level of significance is $\alpha = 0.05$.
Step 3: Using technology, we obtained an estimate of $\beta_1$ in Parallel Example 2, $b_1 = 0.0116$. To determine the standard deviation of $b_1$, we compute $\sum (x_i - \bar{x})^2$. The calculations are on the next slide.
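| Depth, x | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$ |
|---|---|---|
| 35 | -91.25 | 8326.5625 |
| 50 | -76.25 | 5814.0625 |
| 75 | -51.25 | 2626.5625 |
| 95 | -31.25 | 976.5625 |
| 120 | -6.25 | 39.0625 |
| 130 | 3.75 | 14.0625 |
| 145 | 18.75 | 351.5625 |
| 155 | 28.75 | 826.5625 |
| 160 | 33.75 | 1139.0625 |
| 175 | 48.75 | 2376.5625 |
| 185 | 58.75 | 3451.5625 |
| 190 | 63.75 | 4064.0625 |

$$\sum (x_i - \bar{x})^2 = 30006.25$$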
Solution
Step 3, cont'd: We have
$$s_{b_1} = \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}} = \frac{0.5197}{\sqrt{30006.25}} = 0.0030$$
The test statistic is
$$t_0 = \frac{b_1}{s_{b_1}} = \frac{0.0116}{0.0030} = 3.867$$
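The Step 3 arithmetic can be checked with a brief Python sketch. It starts from the slide's rounded values of $s_e$ and $b_1$ and, like the slide, rounds $s_{b_1}$ to four decimal places before dividing; the variable names are illustrative.

```python
import math

# Depths from the drilling data in the slides.
depth = [35, 50, 75, 95, 120, 130, 145, 155, 160, 175, 185, 190]
se = 0.5197   # standard error of the estimate, from Parallel Example 2
b1 = 0.0116   # estimated slope, from Parallel Example 2

x_bar = sum(depth) / len(depth)
sxx = sum((x - x_bar) ** 2 for x in depth)  # sum of squared deviations of x

s_b1 = se / math.sqrt(sxx)                  # standard deviation of b1
t0_slide = b1 / round(s_b1, 4)              # same rounding as on the slide
t0_unrounded = b1 / s_b1                    # without rounding s_b1 first

print(sxx, round(s_b1, 4), round(t0_slide, 3))  # 30006.25 0.003 3.867
```

The value of `t0_unrounded` differs from 3.867 only in the later decimal places, so the rounding of $s_{b_1}$ does not affect the conclusion of the test.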
Solution: P-Value Approach
Step 4: Since this is a two-tailed test, the P-value is the sum of the area under the t-distribution with 12-2=10 degrees of freedom to the left of -t0 = -3.867 and to the right of t0 = 3.867. Using Table VI we find that with 10 degrees of freedom, the value 3.867 is between 3.581 and 4.144 corresponding to right-tail areas of 0.0025 and 0.001, respectively. Thus, the P-value is between 0.002 and 0.005.
Step 5: Since the P-value is less than the level of significance, 0.05, we reject the null hypothesis.
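Table VI only brackets the P-value. As a hedged cross-check, the Python sketch below approximates tail areas of Student's t-distribution by integrating its density with Simpson's rule, using only the standard library. The function names, the `alternative` labels, and the number of integration steps are assumptions made for this illustration; statistical software would normally be used instead.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def right_tail_area(t, df, steps=2000):
    """P(T > t) for t >= 0, via composite Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 - total * h / 3  # half the area lies to the right of 0

def p_value(t0, df, alternative="two-sided"):
    """P-value for H0: beta1 = 0 under the chosen alternative hypothesis."""
    upper = right_tail_area(abs(t0), df)
    if alternative == "two-sided":
        return 2 * upper
    if alternative == "right":
        return upper if t0 >= 0 else 1 - upper
    if alternative == "left":
        return upper if t0 <= 0 else 1 - upper
    raise ValueError("alternative must be 'two-sided', 'left', or 'right'")

p = p_value(3.867, df=10)  # two-tailed test from the drilling example
print(0.002 < p < 0.005)   # True, matching the bracket read from Table VI
```

The same `right_tail_area` function returns approximately 0.0025 at t = 3.581 and 0.001 at t = 4.144 for 10 degrees of freedom, which are the two table entries quoted in Step 4.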
Solution
Step 6: There is sufficient evidence at the $\alpha = 0.05$ level of significance to conclude that a linear relation exists between drill depth and drill time.
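Confidence Intervals for the Slope of the Regression Line
A $(1-\alpha) \cdot 100\%$ confidence interval for the slope of the true regression line, $\beta_1$, is given by the following formulas:
Lower bound: $b_1 - t_{\alpha/2} \cdot \dfrac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}} = b_1 - t_{\alpha/2} \cdot s_{b_1}$
Upper bound: $b_1 + t_{\alpha/2} \cdot \dfrac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}} = b_1 + t_{\alpha/2} \cdot s_{b_1}$
Here, $t_{\alpha/2}$ is computed using n − 2 degrees of freedom.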
Note: The confidence interval formula for $\beta_1$ can be used only if the data are randomly obtained, the residuals are normally distributed, and there is constant error variance.
Parallel Example 7: Constructing a Confidence Interval for the Slope of the True Regression Line
Construct a 95% confidence interval for the slope of the true regression line for the drilling example.
Solution
The requirements for the usage of the confidence interval formula were verified in previous examples. We also determined
• $b_1 = 0.0116$
• $s_{b_1} = 0.0030$
in previous examples.
Solution
Since $t_{0.025} = 2.228$ for 10 degrees of freedom, we have
Lower bound = 0.0116 − 2.228 · 0.003 = 0.0049
Upper bound = 0.0116 + 2.228 · 0.003 = 0.0183
We are 95% confident that the mean increase in the time it takes to drill 5 feet for each additional foot of depth at which the drilling begins is between 0.005 and 0.018 minutes.
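The interval arithmetic is small enough to verify directly. The following Python sketch plugs in the values quoted in the solution; the function name `slope_confidence_interval` is an illustrative choice, and the critical value 2.228 is taken from the slide rather than computed.

```python
def slope_confidence_interval(b1, s_b1, t_crit):
    """Return (lower, upper) bounds b1 -/+ t_crit * s_b1 for the true slope."""
    margin = t_crit * s_b1
    return b1 - margin, b1 + margin

# b1 and s_b1 from the earlier examples; t_crit is t_0.025 with 10 degrees of freedom.
lower, upper = slope_confidence_interval(b1=0.0116, s_b1=0.0030, t_crit=2.228)
print(round(lower, 4), round(upper, 4))  # 0.0049 0.0183
```

Because the interval lies entirely above 0, it is consistent with the earlier two-tailed test, which rejected $H_0: \beta_1 = 0$ at the same 5% level.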
The Coefficient of Determination
The Coefficient of Determination is the proportion of the variability in the response variable that can be attributed to the least squares regression model.
How to calculate $R^2$
Using the sum of squares technique:
$$R^2 = 1 - \frac{\sum \text{residuals}^2}{\sum (y - \bar{y})^2}$$
But for simple linear regression (SLR) models we can simplify the calculation slightly:
$$R^2 = (r)^2$$
where r is the correlation between the response and predictor variables.
Parallel Example 8: Calculating the Coefficient of Determination
Using technology for our drilling example, we can calculate the correlation between the response and predictor to be 0.772822. Using the simplified calculation for the coefficient of determination, that means:
$$R^2 = (0.772822)^2 = 0.5973$$
Interpretation: Our model using the depth at which drilling begins as a predictor is able to explain 59.73% of the natural variability in the time it takes to drill 5 feet.
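As a check that the sum of squares technique and the simplified calculation agree, the Python sketch below computes $R^2$ both ways from the drilling data. The function name `coefficient_of_determination` is an illustrative choice, and the fit uses unrounded least-squares coefficients.

```python
import math

def coefficient_of_determination(x, y):
    """Return (R^2 from sums of squares, correlation r) for simple linear regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # least-squares slope
    b0 = y_bar - b1 * x_bar             # least-squares intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)      # correlation between x and y
    return 1 - sse / syy, r

# Drilling data from the slides.
depth = [35, 50, 75, 95, 120, 130, 145, 155, 160, 175, 185, 190]
minutes = [5.88, 5.99, 6.74, 6.1, 7.47, 6.93, 6.42, 7.97, 7.92, 7.62, 6.89, 7.9]

r_squared, r = coefficient_of_determination(depth, minutes)
print(round(r, 4), round(r_squared, 4))  # 0.7728 0.5973
```

Both routes give the same value up to floating-point error, which is why the shortcut $R^2 = (r)^2$ is safe for a model with a single predictor.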