chapter 9.3 ppt · 2018-04-26 · corresponding predicted y-value. ... constructing a prediction...
TRANSCRIPT
M E A S U R E S O F R E G R E S S I O N A N D P R E D I C T I O N I N T E R V A L S
Larson/Farber 4th ed.
1
Section 9.3
Section 9.3 Objectives
Larson/Farber 4th ed.
2
� Interpret the three types of variation about a regression line
� Find and interpret the coefficient of determination� Find and interpret the standard error of the estimate
for a regression line� Construct and interpret a prediction interval for y
Variation About a Regression Line
Larson/Farber 4th ed.
3
� Three types of variation about a regression line¡ Total variation¡ Explained variation¡ Unexplained variation
� To find the total variation, you must first calculate¡ The total deviation¡ The explained deviation¡ The unexplained deviation
Variation About a Regression Line
Larson/Farber 4th ed.
4
iy y-ˆiy y-
ˆi iy y-
(xi, ŷi)
x
y (xi, yi)
(xi, yi)
Unexplained deviation
ˆi iy y-Total
deviationiy y- Explained
deviationˆiy y-
y
x
Total Deviation = Explained Deviation = Unexplained Deviation =
Variation About a Regression Line
Larson/Farber 4th ed.
5
Total variation• The sum of the squares of the differences between
the y-value of each ordered pair and the mean of y.
Explained variation� The sum of the squares of the differences between
each predicted y-value and the mean of y.
( )2iy yå -Total variation =
Explained variation = ( )2ˆiy yå -
Variation About a Regression Line
Larson/Farber 4th ed.
6
Unexplained variation� The sum of the squares of the differences between
the y-value of each ordered pair and each corresponding predicted y-value.
( )2ˆi iy yå -Unexplained variation =
The sum of the explained and unexplained variation is equal to the total variation.
Total variation = Explained variation + Unexplained variation
Coefficient of Determination
Larson/Farber 4th ed.
7
Coefficient of determination� The ratio of the explained variation to the total
variation.� Denoted by r2
2 Explained variationTotal variationr =
Example: Coefficient of Determination
Larson/Farber 4th ed.
8
The correlation coefficient for the advertising expenses and company sales data as calculated in Section 9.1 isr ≈ 0.913. Find the coefficient of determination. What does this tell you about the explained variation of the data about the regression line? About the unexplained variation?
22 (0.913)0.834
r »»
About 83.4% of the variation in the company sales can be explained by the variation in the advertising expenditures. About 16.9% of the variation is unexplained.
Solution:
The Standard Error of Estimate
Larson/Farber 4th ed.
9
Standard error of estimate� The standard deviation of the observed yi -values
about the predicted ŷ-value for a given xi -value.� Denoted by se.
� The closer the observed y-values are to the predicted y-values, the smaller the standard error of estimate will be.
2( )ˆ2
i ie
y ys nå -= -
n is the number of ordered pairs in the data set
Larson/Farber 4th ed. 10
2( )ˆ2
i ie
y ys nå -= -
2
, , , ( ), ˆ ˆ( )ˆi i i i i
i i
x y y y yy y
--
The Standard Error of Estimate
1. Make a table that includes the column heading shown.
2. Use the regression equation to calculate the predicted y-values.
3. Calculate the sum of the squares of the differences between each observed y-value and the corresponding predicted y-value.
4. Find the standard error of estimate.
ˆ iy mx b= +
2 ( )ˆi iy yå -
In Words In Symbols
Example: Standard Error of Estimate
Larson/Farber 4th ed.
11
The regression equation for the advertising expenses and company sales data as calculated in section 9.2 is
ŷ = 50.729x + 104.061Find the standard error of estimate. (Calculated in 1000s)
Solution:Use a table to calculate the sum of the squared differences of each observed y-value and the corresponding predicted y-value.
Larson/Farber 4th ed. 12
Solution: Standard Error of Estimatex y ŷ i (yi – ŷ i)2
2.4 225 225.81 (225 – 225.81)2 = 0.6561
1.6 184 185.23 (184 – 185.23)2 = 1.5129
2.0 220 205.52(220 – 205.52)2 =
209.6704
2.6 240 235.96 (240 – 235.96)2 = 16.3216
1.4 180 175.08 (180 – 175.08)2 = 24.2064
1.6 184 185.23 (184 – 185.23)2 = 1.5129
2.0 186 205.52(186 – 205.52)2 =
381.0304
2.2 215 215.66 (215 – 215.66)2 = 0.4356Σ = 635.3463 unexplained
variation
Solution: Standard Error of Estimate
Larson/Farber 4th ed.
13
• n = 8, Σ(yi – ŷ i)2 = 635.3463
2( )ˆ2
i ie
y ys nå -= =-
The standard error of estimate of the company sales for a specific advertising expense is about $10,290.
635.3463 10.2908 2 »-
Prediction Intervals
Larson/Farber 4th ed.
14
� Two variables have a bivariate normal distribution if for any fixed value of x, the corresponding values of y are normally distributed and for any fixed values of y, the corresponding x-values are normally distributed.
Prediction Intervals
Larson/Farber 4th ed.
15
� A prediction interval can be constructed for the true value of y.
� Given a linear regression equation ŷ = mx + b and x0, a specific value of x, a c-prediction interval for y is
ŷ – E < y < ŷ + E where
� The point estimate is ŷ and the margin of error is E. The probability that the prediction interval contains yis c.
202 2( )11
( )c en x xE t s n n x x
-= + +å - å
Constructing a Prediction Interval for y for a Specific Value of x
Larson/Farber 4th ed.
16
1. Identify the number of ordered pairs in the data set n and the degrees of freedom.
2. Use the regression equation and the given x-value to find the point estimate ŷ.
3. Find the critical value tc that corresponds to the given level of confidence c.
ˆi iy mx b= +
Use Table 5 in Appendix B.
In Words In Symbols
d.f. = n – 2
Constructing a Prediction Interval for y for a Specific Value of x
Larson/Farber 4th ed.
17
2( )ˆ2
i ie
y ys nå -= -
4. Find the standard error of estimate se.
5. Find the margin of error E.
6. Find the left and right endpoints and form the prediction interval.
In Words In Symbols
202 2( )11
( )c en x xE t s n n x x
-= + +å - å
Left endpoint: ŷ – E Right endpoint: ŷ + E Interval: ŷ – E < y < ŷ + E
202 2( )11
( )c en x xE t s n n x x
-= + +å - å
Example: Constructing a Prediction Interval
Larson/Farber 4th ed.
18
Construct a 95% prediction interval for the company sales when the advertising expenses are $2100. What can you conclude?Recall, n = 8, ŷ = 50.729x + 104.061, se = 10.290 S.E.= 10.975
Solution:Point estimate:ŷ = 50.729(2.1) + 104.061 ≈ 210.592
Critical value:d.f. = n –2 = 8 – 2 = 6 tc = 2.447
215.8, 32.44, 1.975x x xS = S = =
Solution: Constructing a Prediction Interval
Larson/Farber 4th ed.
19
0
2
2
2
2 2
1 8(2.1 1.975)(2.447)(10.290) 1 26.8578 8(32.44) (15
( )
8)
1
.
1( )c e
n x xE t s n n x x-
-= + +
=
»
+ +å - å
-
Left Endpoint: ŷ – E Right Endpoint: ŷ + E
183.735 < y < 237.449
210.592 – 26.857≈ 183.735
210.592 + 26.857≈ 237.449
You can be 95% confident that when advertising expenses are $2100, the company sales will be between $183,735 and $237,449.
Section 9.3 Summary
Larson/Farber 4th ed.
20
� Interpreted the three types of variation about a regression line
� Found and interpreted the coefficient of determination
� Found and interpreted the standard error of the estimate for a regression line
� Constructed and interpreted a prediction interval for y