1 Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc.
Simple Linear Regression
Estimates for single and mean responses
Properties of the Sampling Distribution of a + bx for a Fixed x Value
Let x* denote a particular value of the independent variable x. When the four basic assumptions of the simple linear regression model are satisfied, the sampling distribution of the statistic a + bx* has the following properties:
1. The mean value of a + bx* is α + βx*, so a + bx* is an unbiased statistic for estimating the average y value when x = x*.
2. The standard deviation of the statistic a + bx*, denoted by σ_{a+bx*}, is given by
$$\sigma_{a+bx^*} = \sigma\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$
3. The distribution of the statistic a + bx* is normal.
Additional Information about the Sampling Distribution of a + bx for a Fixed x Value
The estimated standard deviation of the statistic a + bx*, denoted by s_{a+bx*}, is given by
$$s_{a+bx^*} = s_e\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$
When the four basic assumptions of the simple linear regression model are satisfied, the probability distribution of the standardized variable
$$t = \frac{a + bx^* - (\alpha + \beta x^*)}{s_{a+bx^*}}$$
is the t distribution with df = n - 2.
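The estimated standard deviation above can be computed directly from data. The following is a minimal sketch, not the authors' procedure: the function names `fit_line` and `se_of_fit` and the five-point data set are made up for illustration.

```python
import math

def fit_line(x, y):
    """Least-squares intercept a, slope b, s_e, x-bar, and S_xx."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))  # residual standard deviation, df = n - 2
    return a, b, s_e, x_bar, s_xx

def se_of_fit(s_e, n, x_star, x_bar, s_xx):
    """Estimated standard deviation of a + b*x_star."""
    return s_e * math.sqrt(1 / n + (x_star - x_bar) ** 2 / s_xx)

# Hypothetical data, chosen only to keep the arithmetic easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b, s_e, x_bar, s_xx = fit_line(x, y)
se_center = se_of_fit(s_e, len(x), 3, x_bar, s_xx)  # x* equal to x-bar
se_edge = se_of_fit(s_e, len(x), 5, x_bar, s_xx)    # x* far from x-bar
print(a, b, se_center, se_edge)
```

The standard deviation is smallest when x* equals the sample mean of x and grows as x* moves away from it, because of the (x* - x̄)² term.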
Confidence Interval for a Mean y Value
When the four basic assumptions of the simple linear regression model are met, a confidence interval for α + βx*, the average y value when x has the value x*, is
a + bx* ± (t critical value) s_{a+bx*}
where the t critical value is based on df = n - 2.
Many authors give the following equivalent form for the confidence interval.
$$a + bx^* \pm (t\ \text{critical value})\, s_e\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$
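To see that the two forms of the confidence interval agree, a short sketch can compute both. All numbers below are hypothetical summary statistics, and the function names are illustrative; the t critical value 3.182 is the standard table value for 95% confidence with df = 3.

```python
import math

def mean_ci_from_se(fit, se_fit, t_crit):
    """Short form: fit +/- t_crit * s_{a+bx*}."""
    margin = t_crit * se_fit
    return fit - margin, fit + margin

def mean_ci_expanded(a, b, x_star, s_e, n, x_bar, s_xx, t_crit):
    """Expanded form with s_e and the square-root term written out."""
    fit = a + b * x_star
    margin = t_crit * s_e * math.sqrt(1 / n + (x_star - x_bar) ** 2 / s_xx)
    return fit - margin, fit + margin

# Hypothetical summary statistics (not taken from the example data).
a, b = 2.2, 0.6
s_e, n, x_bar, s_xx = 0.9, 5, 3.0, 10.0
x_star = 4.0
t_crit = 3.182  # t table value, 95% confidence, df = n - 2 = 3

se_fit = s_e * math.sqrt(1 / n + (x_star - x_bar) ** 2 / s_xx)
ci_short = mean_ci_from_se(a + b * x_star, se_fit, t_crit)
ci_long = mean_ci_expanded(a, b, x_star, s_e, n, x_bar, s_xx, t_crit)
print(ci_short, ci_long)
```

Both calls return the same interval, since the expanded form only substitutes the definition of s_{a+bx*}.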
Prediction Interval for a Single y Value
When the four basic assumptions of the simple linear regression model are met, a prediction interval for y*, a single y observation made when x has the value x*, has the form
$$a + bx^* \pm (t\ \text{critical value})\sqrt{s_e^2 + s_{a+bx^*}^2}$$
where the t critical value is based on df = n - 2.
Many authors give the following equivalent form for the prediction interval.
$$a + bx^* \pm (t\ \text{critical value})\, s_e\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$
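The prediction interval differs from the confidence interval for a mean only by the extra s_e² under the square root. The sketch below, with hypothetical inputs and illustrative function names, checks that the two forms of the prediction interval match and that the prediction interval is wider than the confidence interval at the same x*. The value 2.228 is the standard t table value for 95% confidence with df = 10.

```python
import math

def prediction_interval(fit, s_e, se_fit, t_crit):
    """fit +/- t_crit * sqrt(s_e^2 + s_{a+bx*}^2)."""
    margin = t_crit * math.sqrt(s_e ** 2 + se_fit ** 2)
    return fit - margin, fit + margin

def prediction_interval_expanded(fit, s_e, n, x_star, x_bar, s_xx, t_crit):
    """fit +/- t_crit * s_e * sqrt(1 + 1/n + (x* - x_bar)^2 / S_xx)."""
    margin = t_crit * s_e * math.sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / s_xx)
    return fit - margin, fit + margin

# Hypothetical inputs, for illustration only.
fit, s_e, n = 50.0, 4.0, 12
x_star, x_bar, s_xx = 10.0, 8.0, 40.0
t_crit = 2.228  # t table value, 95% confidence, df = n - 2 = 10

se_fit = s_e * math.sqrt(1 / n + (x_star - x_bar) ** 2 / s_xx)
pi_short = prediction_interval(fit, s_e, se_fit, t_crit)
pi_long = prediction_interval_expanded(fit, s_e, n, x_star, x_bar, s_xx, t_crit)

ci_width = 2 * t_crit * se_fit
pi_width = pi_short[1] - pi_short[0]
print(pi_short, pi_long, ci_width, pi_width)
```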
Example - Mean Annual Temperature vs. Mortality
Data were collected in certain regions of Great Britain, Norway, and Sweden to study the relationship between the mean annual temperature and the mortality rate for a specific type of breast cancer in women.
* Lea, A.J. (1965). New observations on distribution of neoplasms of female breast in certain European countries. British Medical Journal, 1, 488-490.
Mean Annual Temperature (°F)    Mortality Index
51                              103
50                              105
50                              100
49                              96
49                              87
48                              95
47                              89
45                              89
46                              79
42                              85
44                              82
Regression Analysis: Mortality index versus Mean annual temperature

The regression equation is
Mortality index = - 21.8 + 2.36 Mean annual temperature

Predictor    Coef      SE Coef    T        P
Constant     -21.79    15.67      -1.39    0.186
Mean ann     2.3577    0.3489     6.76     0.000

S = 7.545    R-Sq = 76.5%    R-Sq(adj) = 74.9%

Analysis of Variance
Source            DF    SS        MS        F        P
Regression        1     2599.5    2599.5    45.67    0.000
Residual Error    14    796.9     56.9
Total             15    3396.4

Unusual Observations
Obs    Mean ann    Mortalit    Fit      SE Fit    Residual    St Resid
15     31.8        67.30       53.18    4.85      14.12       2.44RX

R denotes an observation with a large standardized residual
X denotes an observation whose X value gives it large influence.
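The entries in this output are tied together by a few standard identities, which can be checked with a short sketch. The variable names are illustrative; the numbers are copied from the output above, so agreement is only up to the rounding in the printed values.

```python
import math

# Values copied from the Minitab output.
coef, se_coef = 2.3577, 0.3489
ss_reg, ss_res, ss_tot = 2599.5, 796.9, 3396.4
df_res = 14

t_slope = coef / se_coef      # T for the slope: Coef / SE Coef
r_sq = ss_reg / ss_tot        # R-Sq: regression SS / total SS
ms_res = ss_res / df_res      # residual mean square
s = math.sqrt(ms_res)         # S: square root of the residual mean square
f_stat = ss_reg / ms_res      # F: regression MS / residual MS (regression df = 1)

print(round(t_slope, 2), round(r_sq, 3), round(s, 3), round(f_stat, 2))
```

In simple linear regression the F statistic equals the square of the slope's t statistic, which is why both tests give the same P-value here.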
[Regression Plot: Mortality index versus Mean annual temperature. Fitted line: Mortality index = -21.7947 + 2.35769 Mean annual temperature; S = 7.54466, R-Sq = 76.5%, R-Sq(adj) = 74.9%]
The point at a mean annual temperature of 31.8 (observation 15) has a large standardized residual and is influential because of its low mean annual temperature.
Predicted Values for New Observations
New Obs    Fit      SE Fit    95.0% CI             95.0% PI
1          53.18    4.85      ( 42.79,  63.57)     ( 33.95,  72.41) X
2          60.72    3.84      ( 52.48,  68.96)     ( 42.57,  78.88)
3          72.51    2.48      ( 67.20,  77.82)     ( 55.48,  89.54)
4          83.34    1.89      ( 79.30,  87.39)     ( 66.66, 100.02)
5          96.09    2.67      ( 90.37, 101.81)     ( 78.93, 113.25)
6          99.16    3.01      ( 92.71, 105.60)     ( 81.74, 116.57)
X denotes a row with X values away from the center

Values of Predictors for New Observations
New Obs    Mean ann
1          31.8
2          35.0
3          40.0
4          44.6
5          50.0
6          51.3
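The CI and PI columns can be rebuilt from the Fit and SE Fit columns with the interval formulas. The sketch below assumes S = 7.545 from the regression output and a t critical value of 2.1448 (the standard table value for 95% confidence with df = 14, which is not printed in the output). Because Fit and SE Fit are rounded, the results match the printed intervals only to within a few hundredths.

```python
import math

S_E = 7.545      # S from the regression output
T_CRIT = 2.1448  # assumed t table value, 95% confidence, df = 14

# (Fit, SE Fit) pairs copied from the Minitab output.
rows = [(53.18, 4.85), (60.72, 3.84), (72.51, 2.48),
        (83.34, 1.89), (96.09, 2.67), (99.16, 3.01)]

def intervals(fit, se_fit):
    """Return (CI, PI) for one new observation."""
    ci_margin = T_CRIT * se_fit
    pi_margin = T_CRIT * math.sqrt(S_E ** 2 + se_fit ** 2)
    ci = (fit - ci_margin, fit + ci_margin)
    pi = (fit - pi_margin, fit + pi_margin)
    return ci, pi

results = [intervals(fit, se) for fit, se in rows]
for (fit, se), (ci, pi) in zip(rows, results):
    print(f"{fit:6.2f}  CI ({ci[0]:6.2f}, {ci[1]:6.2f})  PI ({pi[0]:6.2f}, {pi[1]:6.2f})")
```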
These are the x* values for which the fits, standard errors of the fits, 95% confidence intervals for mean y values, and 95% prediction intervals for single y values are given above.
[Regression Plot with 95% CI and 95% PI bands: Mortality index versus Mean annual temperature. Fitted line: Mortality index = -21.7947 + 2.35769 Mean annual temperature; S = 7.54466, R-Sq = 76.5%, R-Sq(adj) = 74.9%]
95% prediction interval for a single y value at x = 45: (67.62, 100.98)
95% confidence interval for the mean y value at x = 40: (67.20, 77.82)
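These two intervals can be approximately reproduced from the regression output alone. The sketch below rests on several assumptions that are not stated in the source: S_xx is backed out from the slope's standard error (SE Coef = s_e / sqrt(S_xx)), x̄ is taken to be about 44.6 (inferred because the SE Fit at x = 44.6 is close to s_e / sqrt(n)), and the t critical value 2.1448 is the standard table value for df = 14.

```python
import math

# From the Minitab output.
A, B = -21.7947, 2.35769
S_E, SE_SLOPE, N = 7.54466, 0.3489, 16

# Assumptions (not printed in the source).
T_CRIT = 2.1448               # t table value, 95% confidence, df = 14
X_BAR = 44.6                  # inferred approximate sample mean of x
S_XX = (S_E / SE_SLOPE) ** 2  # from SE of slope = s_e / sqrt(S_xx)

def se_fit(x_star):
    """Estimated standard deviation of a + b*x_star."""
    return S_E * math.sqrt(1 / N + (x_star - X_BAR) ** 2 / S_XX)

def confidence_interval(x_star):
    fit = A + B * x_star
    margin = T_CRIT * se_fit(x_star)
    return fit - margin, fit + margin

def prediction_interval(x_star):
    fit = A + B * x_star
    margin = T_CRIT * math.sqrt(S_E ** 2 + se_fit(x_star) ** 2)
    return fit - margin, fit + margin

pi_45 = prediction_interval(45)
ci_40 = confidence_interval(40)
print(pi_45, ci_40)
```

Under these assumptions the computed intervals agree with the values above to within rounding.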