1 Slide
Simple Linear Regression
Estimation and Residuals
Chapter 14
BA 303 – Spring 2011
2 Slide
Point Estimation
$$\hat{y} = b_0 + b_1 x$$
If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be:
$$\hat{y} = 10 + 5(3) = 25 \text{ cars}$$
3 Slide
Confidence Interval of $E(y_p)$
The CI is an interval estimate of the mean value of y for a given value of x.
Confidence Interval Estimate of $E(y_p)$
$$\hat{y}_p \pm t_{\alpha/2}\, s_{\hat{y}_p}$$
where the confidence coefficient is $1 - \alpha$ and $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom.
4 Slide
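Confidence Interval for $E(y_p)$
Estimate of the Standard Deviation of $\hat{y}_p$
$$s_{\hat{y}_p} = s\sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$
$$s_{\hat{y}_p} = 2.16025\sqrt{\frac{1}{5} + \frac{(3-2)^2}{(1-2)^2 + (3-2)^2 + (2-2)^2 + (1-2)^2 + (3-2)^2}}$$
$$s_{\hat{y}_p} = 2.16025\sqrt{\frac{1}{5} + \frac{1}{4}} = 1.4491$$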
5 Slide
Confidence Interval for $E(y_p)$
The 95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is:
$$\hat{y}_p \pm t_{\alpha/2}\, s_{\hat{y}_p}$$
$$25 \pm 3.182(1.4491)$$
$$25 \pm 4.61$$
20.39 to 29.61 cars
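The arithmetic behind this interval can be checked with a short script. This is a minimal sketch, assuming the values quoted on the slides: the estimated equation with b0 = 10 and b1 = 5, s = 2.16025, the five x values that appear in the standard-deviation sum, and the table value t = 3.182. The variable names are illustrative only.

```python
import math

# Values taken from the slides; variable names are illustrative.
x = [1, 3, 2, 1, 3]        # TV ads run in each of the n = 5 weeks
b0, b1 = 10, 5             # estimated regression equation: y_hat = 10 + 5x
s = 2.16025                # standard error of the estimate
t_crit = 3.182             # t table value for alpha/2 = 0.025 with n - 2 = 3 d.f.

n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)   # sum of squared deviations of x

x_p = 3                                    # number of TV ads of interest
y_hat_p = b0 + b1 * x_p                    # point estimate of E(y_p)

# Estimated standard deviation of y_hat_p
s_yhat_p = s * math.sqrt(1 / n + (x_p - x_bar) ** 2 / sxx)

margin = t_crit * s_yhat_p
ci = (y_hat_p - margin, y_hat_p + margin)

print(f"s_yhat_p = {s_yhat_p:.4f}")
print(f"95% CI: {ci[0]:.2f} to {ci[1]:.2f} cars")
```

The margin of error is smallest when x_p equals the mean of x, because the squared-deviation term in the square root is then zero.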
6 Slide
Prediction Interval
The PI is an interval estimate of an individual value of y for a given value of x. The margin of error is larger than for a CI.
Prediction Interval Estimate of $y_p$
$$\hat{y}_p \pm t_{\alpha/2}\, s_{\text{ind}}$$
where the confidence coefficient is $1 - \alpha$ and $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom.
7 Slide
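Prediction Interval for $y_p$
Estimate of the Standard Deviation of an Individual Value of $y_p$
$$s_{\text{ind}} = s\sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$
$$s_{\text{ind}} = 2.16025\sqrt{1 + \frac{1}{5} + \frac{1}{4}}$$
$$s_{\text{ind}} = 2.16025(1.20416) = 2.6013$$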
8 Slide
Prediction Interval for $y_p$
The 95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is:
$$\hat{y}_p \pm t_{\alpha/2}\, s_{\text{ind}}$$
$$25 \pm 3.1824(2.6013)$$
$$25 \pm 8.28$$
16.72 to 33.28 cars
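The same calculation can be wrapped in a small function. This is a sketch under the assumption that the point estimate, s, and the t table value are supplied by the caller, as they are on the slides. The function name `prediction_interval` and its parameters are hypothetical, not part of any library.

```python
import math

def prediction_interval(x, x_p, y_hat_p, s, t_crit):
    """Return (lower, upper) bounds for an individual value of y at x_p."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    # The leading 1 adds the variability of an individual value around its
    # mean; it is the only difference from the confidence-interval formula.
    s_ind = s * math.sqrt(1 + 1 / n + (x_p - x_bar) ** 2 / sxx)
    margin = t_crit * s_ind
    return y_hat_p - margin, y_hat_p + margin

# Values quoted on the slides for 3 TV ads.
lower, upper = prediction_interval(
    x=[1, 3, 2, 1, 3], x_p=3, y_hat_p=25, s=2.16025, t_crit=3.1824
)
print(f"95% PI: {lower:.2f} to {upper:.2f} cars")
```

Because of the extra 1 under the square root, the prediction interval is always wider than the confidence interval at the same x_p and confidence level.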
9 Slide
Comparison
Point Estimate: 25 cars
Confidence Interval: 20.39 to 29.61 cars
Prediction Interval: 16.72 to 33.28 cars
10 Slide
PRACTICE: PREDICTION INTERVALS AND CONFIDENCE INTERVALS
11 Slide
Data
$x_i$    $y_i$
1        2.8
3        8.0
5        13.2
$\bar{x} = 3$, $\sum (x_i - \bar{x})^2 = 10$, $s = 2.033$
t table value = 3.182 ($\alpha = 0.05$, $\alpha/2 = 0.025$, d.f. = n - 2 = 3)
14 Slide
RESIDUAL ANALYSIS
15 Slide
Residual Analysis
If the assumptions about the error term $\epsilon$ appear questionable, the hypothesis tests about the significance of the regression relationship and the interval estimation results may not be valid.
The residuals provide the best information about $\epsilon$.
Residual for Observation i
$$y_i - \hat{y}_i$$
Much of the residual analysis is based on an examination of graphical plots.
16 Slide
Residual Plot Against x
If the assumption that the variance of $\epsilon$ is the same for all values of x is valid, and the assumed regression model is an adequate representation of the relationship between the variables, then the residual plot should give an overall impression of a horizontal band of points.
17 Slide
Residual Plot Against x
[Figure: residual $y - \hat{y}$ plotted against x, with a reference line at 0. Panel title: Good Pattern]
18 Slide
Residual Plot Against x
[Figure: residual $y - \hat{y}$ plotted against x, with a reference line at 0. Panel title: Nonconstant Variance]
19 Slide
Residual Plot Against x
[Figure: residual $y - \hat{y}$ plotted against x, with a reference line at 0. Panel title: Model Form Not Adequate]
20 Slide
Residuals
$x_i$    $y_i$    $\hat{y}_i$    $y_i - \hat{y}_i$
1        14       15             -1
3        24       25             -1
2        18       20             -2
1        17       15              2
3        27       25              2
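The residuals in this table can be reproduced from the raw data. The sketch below assumes the standard least-squares formulas for b0 and b1, which are not shown on these slides, and uses illustrative variable names.

```python
import math

# Data from the table: TV ads run (x) and cars sold (y).
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Standard least-squares estimates (assumed; not derived on these slides).
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

# Standard error of the estimate: sqrt(SSE / (n - 2)).
sse = sum(r ** 2 for r in residuals)
s = math.sqrt(sse / (n - 2))

print(f"y_hat = {b0:g} + {b1:g}x")
print("residuals:", residuals)
print(f"s = {s:.4f}")
```

With least squares and an intercept term, the residuals always sum to zero, which gives a quick sanity check on a hand-computed table.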
21 Slide
Residual Plot Against x
[Figure: residuals plotted against x. The x-axis runs from 0 to 4 and the residual axis from -3 to 3.]
22 Slide
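Standardized Residuals
Standardized Residual for Observation i
$$\frac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}}$$
where:
$$s_{y_i - \hat{y}_i} = s\sqrt{1 - h_i}$$
$$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum (x_i - \bar{x})^2}$$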
23 Slide
Standardized Residuals
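$x_i$    $(x_i - \bar{x})^2$    $\frac{(x_i - \bar{x})^2}{\sum (x_i - \bar{x})^2}$    $h_i$     $s_{y_i - \hat{y}_i}$
1        1                      0.2500                                              0.4500    1.6020
3        1                      0.2500                                              0.4500    1.6020
2        0                      0.0000                                              0.2000    1.9321
1        1                      0.2500                                              0.4500    1.6020
3        1                      0.2500                                              0.4500    1.6020
$\sum (x_i - \bar{x})^2 = 4$, $s = 2.1602$, $\bar{x} = 2$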
24 Slide
Standardized Residuals
$x_i$    $y_i$    $\hat{y}_i$    $y_i - \hat{y}_i$    $s_{y_i - \hat{y}_i}$    $\frac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}}$
1        14       15             -1                   1.6020                   -0.6242
3        24       25             -1                   1.6020                   -0.6242
2        18       20             -2                   1.9321                   -1.0351
1        17       15              2                   1.6020                    1.2484
3        27       25              2                   1.6020                    1.2484
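The standardized residuals can be recomputed from the definitions of $h_i$ and $s_{y_i - \hat{y}_i}$. The following is a minimal sketch, assuming the example data and the estimated equation ŷ = 10 + 5x from the slides; the variable names are illustrative.

```python
import math

# Example data and estimated regression equation from the slides.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
b0, b1 = 10, 5

n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
s = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))

# Leverage h_i of each observation.
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

# Standard deviation of each residual, then the standardized residual.
s_resid = [s * math.sqrt(1 - h) for h in leverage]
std_residuals = [r / sr for r, sr in zip(residuals, s_resid)]

for xi, h, sr, z in zip(x, leverage, s_resid, std_residuals):
    print(f"x={xi}  h={h:.4f}  s_resid={sr:.4f}  standardized={z:.4f}")
```

Observations far from the mean of x have larger leverage and therefore a smaller residual standard deviation, which is why the same raw residual can standardize to different values.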
25 Slide
Standardized Residual Plot
The standardized residual plot can provide insight about the assumption that the error term $\epsilon$ has a normal distribution.
If this assumption is satisfied, the distribution of the standardized residuals should appear to come from a standard normal probability distribution.
26 Slide
Standardized Residual Plot
[Figure: standardized residuals plotted against x. The x-axis runs from 0 to 4 and the standardized residual axis from -1.5 to 1.5.]
27 Slide
Standardized Residual Plot
All of the standardized residuals are between -1.5 and +1.5, indicating that there is no reason to question the assumption that $\epsilon$ has a normal distribution.
28 Slide
Outliers and Influential Observations
Detecting Outliers
• An outlier is an observation that is unusual in comparison with the other data.
• Minitab classifies an observation as an outlier if its standardized residual value is < -2 or > +2.
• This standardized residual rule sometimes fails to identify an unusually large observation as being an outlier.
• This rule's shortcoming can be circumvented by using studentized deleted residuals.
• For an outlier, the |i th studentized deleted residual| will be larger than the |i th standardized residual|.
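The standardized residual rule stated on this slide is simple to apply by hand or in code. The sketch below illustrates only the ±2 cutoff as the slide describes it; it is not Minitab's implementation, and the function name `flag_outliers` is hypothetical.

```python
def flag_outliers(std_residuals, cutoff=2.0):
    """Return indices of observations whose standardized residual
    is below -cutoff or above +cutoff."""
    return [i for i, r in enumerate(std_residuals) if abs(r) > cutoff]

# Standardized residuals from the worked example on the earlier slide.
std_residuals = [-0.6242, -0.6242, -1.0351, 1.2484, 1.2484]
print(flag_outliers(std_residuals))  # → []
```

None of the five observations in the example exceeds the cutoff, so the rule flags no outliers.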
29 Slide
PRACTICE: STANDARDIZED RESIDUALS
30 Slide
Standardized Residuals
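$x_i$    $(x_i - \bar{x})^2$    $\frac{(x_i - \bar{x})^2}{\sum (x_i - \bar{x})^2}$    $h_i$    $s_{y_i - \hat{y}_i}$
1
2
3
4
5
$\sum (x_i - \bar{x})^2 = 10$, $\bar{x} = 3$, $s = 2.0330$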
32 Slide
COMPUTER SOLUTIONS
33 Slide
Computer Solution
Performing the regression analysis computations without the help of a computer can be quite time-consuming.
34 Slide
Our Solution – Calculations
35 Slide
Our Solution – Calculations
36 Slide
Basic Minitab Output
37 Slide
Minitab Residuals, Prediction Intervals, and Confidence Intervals
38 Slide
Excel Output
39 Slide