
Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc.

Simple Linear Regression

Estimates for single and mean responses


Properties of the Sampling Distribution of a + bx for a Fixed x Value

Let x* denote a particular value of the independent variable x. When the four basic assumptions of the simple linear regression model are satisfied, the sampling distribution of the statistic a + bx* has the following properties:

1. The mean value of a + bx* is α + βx*, so a + bx* is an unbiased statistic for estimating the average y value when x = x*.


2. The standard deviation of the statistic a + bx*, denoted by $\sigma_{a+bx^*}$, is given by

$$\sigma_{a+bx^*} = \sigma\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$

3. The distribution of the statistic a + bx* is normal.


Additional Information about the Sampling Distribution of a + bx for a Fixed x Value

The estimated standard deviation of the statistic a + bx*, denoted by $s_{a+bx^*}$, is given by

$$s_{a+bx^*} = s_e\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$
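As a rough illustration of the estimated standard deviation formula, the Python sketch below evaluates it at two x* values. The function name and every numeric input (s_e = 2.0, n = 16, x̄ = 10, S_xx = 50) are hypothetical, chosen only so that the arithmetic is easy to follow; none of them come from a data set in this section.

```python
import math

def estimated_sd_of_fit(s_e, n, x_star, x_bar, s_xx):
    """Estimated standard deviation of a + b*x_star:
    s_e * sqrt(1/n + (x_star - x_bar)^2 / S_xx)."""
    return s_e * math.sqrt(1.0 / n + (x_star - x_bar) ** 2 / s_xx)

# Hypothetical summary quantities (assumptions for illustration only).
s_e, n, x_bar, s_xx = 2.0, 16, 10.0, 50.0

# At x* = x_bar the second term vanishes, leaving s_e / sqrt(n).
sd_at_center = estimated_sd_of_fit(s_e, n, x_bar, x_bar, s_xx)

# Five units away from x_bar the estimate is larger.
sd_away = estimated_sd_of_fit(s_e, n, 15.0, x_bar, s_xx)

print(f"x* = 10: {sd_at_center:.4f}")
print(f"x* = 15: {sd_away:.4f}")
```

The estimate is smallest at x* = x̄ and grows as x* moves away from x̄, which is why intervals widen toward the edges of the observed x range.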

When the four basic assumptions of the simple linear regression model are satisfied, the probability distribution of the standardized variable

$$t = \frac{a + bx^* - (\alpha + \beta x^*)}{s_{a+bx^*}}$$

is the t distribution with df = n - 2.


Confidence Interval for a Mean y Value

When the four basic assumptions of the simple linear regression model are met, a confidence interval for α + βx*, the average y value when x has the value x*, is

$$a + bx^* \pm (t\ \text{critical value})\, s_{a+bx^*}$$

where the t critical value is based on df = n - 2.

Many authors give the following equivalent form for the confidence interval.

$$a + bx^* \pm (t\ \text{critical value})\, s_e\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$
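The confidence interval for a mean y value can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name and all inputs (a = 1.0, b = 0.5, s_e = 2.0, n = 16, x̄ = 10, S_xx = 50) are hypothetical, and the t critical value 2.145 is the tabled 95% two-sided value for df = 14.

```python
import math

def mean_response_ci(a, b, x_star, s_e, n, x_bar, s_xx, t_crit):
    """Confidence interval for the mean y value at x = x_star."""
    fit = a + b * x_star
    se_fit = s_e * math.sqrt(1.0 / n + (x_star - x_bar) ** 2 / s_xx)
    half_width = t_crit * se_fit
    return fit - half_width, fit + half_width

# Hypothetical fitted line and summary quantities (assumptions).
a, b = 1.0, 0.5
s_e, n, x_bar, s_xx = 2.0, 16, 10.0, 50.0
t_crit = 2.145  # 95% two-sided t critical value, df = n - 2 = 14 (t table)

lower, upper = mean_response_ci(a, b, 15.0, s_e, n, x_bar, s_xx, t_crit)
print(f"95% CI for the mean y value at x* = 15: ({lower:.4f}, {upper:.4f})")
```

With these inputs the fit is 8.5 and the estimated standard deviation of the fit is 1.5, so the interval is 8.5 ± 2.145(1.5).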


Prediction Interval for a Single y Value

When the four basic assumptions of the simple linear regression model are met, a prediction interval for y*, a single y observation made when x has the value x*, has the form

$$a + bx^* \pm (t\ \text{critical value})\sqrt{s_e^2 + s_{a+bx^*}^2}$$

where the t critical value is based on df = n - 2.

Many authors give the following equivalent form for the prediction interval.

$$a + bx^* \pm (t\ \text{critical value})\, s_e\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$$
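The two forms of the prediction interval can be checked against each other numerically. The sketch below does this in Python with hypothetical inputs (fit = 8.5, s_e = 2.0, n = 16, x̄ = 10, S_xx = 50, x* = 15). The function name is an assumption for illustration, and 2.145 is the tabled 95% two-sided t critical value for df = 14.

```python
import math

def prediction_interval(fit, s_e, se_fit, t_crit):
    """Prediction interval for a single y value: fit +/- t * sqrt(s_e^2 + se_fit^2)."""
    half_width = t_crit * math.sqrt(s_e ** 2 + se_fit ** 2)
    return fit - half_width, fit + half_width

# Hypothetical summary quantities (assumptions for illustration only).
s_e, n, x_bar, s_xx = 2.0, 16, 10.0, 50.0
x_star, fit, t_crit = 15.0, 8.5, 2.145

# Estimated standard deviation of the fit at x_star.
se_fit = s_e * math.sqrt(1.0 / n + (x_star - x_bar) ** 2 / s_xx)

pi_lower, pi_upper = prediction_interval(fit, s_e, se_fit, t_crit)

# Half-width from the equivalent form: t * s_e * sqrt(1 + 1/n + (x* - x_bar)^2 / S_xx).
half_equiv = t_crit * s_e * math.sqrt(1.0 + 1.0 / n + (x_star - x_bar) ** 2 / s_xx)

# Half-width of the confidence interval for the mean y value, for comparison.
ci_half = t_crit * se_fit

print(f"95% PI: ({pi_lower:.4f}, {pi_upper:.4f})")
print(f"PI half-width {half_equiv:.4f} vs CI half-width {ci_half:.4f}")
```

Because the prediction interval adds the variability of a single observation (s_e²) to that of the estimated mean, it is always wider than the confidence interval at the same x*.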


Example - Mean Annual Temperature vs. Mortality

Data was collected in certain regions of Great Britain, Norway and Sweden to study the relationship between the mean annual temperature and the mortality rate for a specific type of breast cancer in women.

* Lea, A.J. (1965) New Observations on distribution of neoplasms of female breast in certain European countries. British Medical Journal, 1, 488-490

Mean Annual Temperature (°F)   51   50   50   49   49   48   47   45   46   42   44
Mortality Index                103  105  100  96   87   95   89   89   79   85   82



Regression Analysis: Mortality index versus Mean annual temperature

The regression equation is
Mortality index = - 21.8 + 2.36 Mean annual temperature

Predictor    Coef      SE Coef   T       P
Constant     -21.79    15.67     -1.39   0.186
Mean ann     2.3577    0.3489    6.76    0.000

S = 7.545   R-Sq = 76.5%   R-Sq(adj) = 74.9%

Analysis of Variance

Source           DF   SS       MS       F       P
Regression       1    2599.5   2599.5   45.67   0.000
Residual Error   14   796.9    56.9
Total            15   3396.4

Unusual Observations

Obs   Mean ann   Mortalit   Fit     SE Fit   Residual   St Resid
15    31.8       67.30      53.18   4.85     14.12      2.44RX

R denotes an observation with a large standardized residual
X denotes an observation whose X value gives it large influence.
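Several entries in the Minitab output follow directly from others, and recomputing them is a useful way to read the table. The Python sketch below is only a consistency check on the printed values. The variable names are hypothetical, and small discrepancies are expected because the inputs are already rounded.

```python
import math

# Values copied from the Minitab output above.
coef_slope, se_slope = 2.3577, 0.3489
coef_const, se_const = -21.79, 15.67
ss_reg, ss_res, ss_tot = 2599.5, 796.9, 3396.4
df_reg, df_res, df_tot = 1, 14, 15

# T column: coefficient divided by its standard error.
t_slope = coef_slope / se_slope
t_const = coef_const / se_const

# S is the square root of the residual mean square.
ms_res = ss_res / df_res
s = math.sqrt(ms_res)

# F is the regression mean square over the residual mean square.
f_stat = (ss_reg / df_reg) / ms_res

# R-Sq and adjusted R-Sq.
r_sq = ss_reg / ss_tot
r_sq_adj = 1.0 - ms_res / (ss_tot / df_tot)

print(f"T (slope) = {t_slope:.2f}, T (constant) = {t_const:.2f}")
print(f"S = {s:.3f}, F = {f_stat:.2f}")
print(f"R-Sq = {100 * r_sq:.1f}%, R-Sq(adj) = {100 * r_sq_adj:.1f}%")
```

The residual degrees of freedom, 14, equal n - 2, so the fit is based on n = 16 observations.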



[Figure: Regression Plot of Mortality index versus Mean annual temperature, with fitted line Mortality in = -21.7947 + 2.35769 Mean annual; S = 7.54466, R-Sq = 76.5%, R-Sq(adj) = 74.9%]

Observation 15 (mean annual temperature 31.8) has a large standardized residual and is influential because of its low mean annual temperature.



Predicted Values for New Observations

New Obs   Fit     SE Fit   95.0% CI            95.0% PI
1         53.18   4.85     (42.79, 63.57)      (33.95, 72.41)    X
2         60.72   3.84     (52.48, 68.96)      (42.57, 78.88)
3         72.51   2.48     (67.20, 77.82)      (55.48, 89.54)
4         83.34   1.89     (79.30, 87.39)      (66.66, 100.02)
5         96.09   2.67     (90.37, 101.81)     (78.93, 113.25)
6         99.16   3.01     (92.71, 105.60)     (81.74, 116.57)

X denotes a row with X values away from the center

Values of Predictors for New Observations

New Obs   Mean ann
1         31.8
2         35.0
3         40.0
4         44.6
5         50.0
6         51.3

These are the x* values for which the fits, standard errors of the fits, 95% confidence intervals for mean y values, and 95% prediction intervals for single y values are given above.
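The CI and PI columns can be approximately reproduced from the Fit and SE Fit columns together with S = 7.545 from the regression output. The Python sketch below assumes the tabled 95% two-sided t critical value of 2.145 for df = 14. The function and variable names are hypothetical, and the results agree with Minitab only to within rounding because Fit and SE Fit are printed to two decimals.

```python
import math

T_CRIT = 2.145  # 95% two-sided t critical value, df = 14 (from a t table)
S_E = 7.545     # "S" in the Minitab regression output

# (x*, Fit, SE Fit) copied from the Predicted Values output.
rows = [
    (31.8, 53.18, 4.85),
    (35.0, 60.72, 3.84),
    (40.0, 72.51, 2.48),
    (44.6, 83.34, 1.89),
    (50.0, 96.09, 2.67),
    (51.3, 99.16, 3.01),
]

def intervals(fit, se_fit, s_e=S_E, t_crit=T_CRIT):
    """Return (CI, PI) for one row, each as a (lower, upper) tuple."""
    ci_half = t_crit * se_fit
    pi_half = t_crit * math.sqrt(s_e ** 2 + se_fit ** 2)
    return (fit - ci_half, fit + ci_half), (fit - pi_half, fit + pi_half)

# Map each x* to its recomputed (CI, PI) pair.
results = {x: intervals(fit, se_fit) for x, fit, se_fit in rows}

for x, (ci, pi) in results.items():
    print(f"x* = {x:5.1f}  CI = ({ci[0]:6.2f}, {ci[1]:6.2f})  "
          f"PI = ({pi[0]:6.2f}, {pi[1]:6.2f})")
```

Both intervals are narrowest near x* = 44.6 and widen toward x* = 31.8, where SE Fit is largest.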


[Figure: Regression Plot of Mortality index versus Mean annual temperature, showing the fitted regression line with 95% CI and 95% PI bands; Mortality in = -21.7947 + 2.35769 Mean annual; S = 7.54466, R-Sq = 76.5%, R-Sq(adj) = 74.9%]

95% prediction interval for a single y value at x = 45: (67.62, 100.98)

95% confidence interval for the mean y value at x = 40: (67.20, 77.82)
