Regression Analysis Part C Confidence Intervals and Hypothesis Testing Read Chapters 3, 4 and 5 of Forecasting and Time Series, An Applied Approach.

Upload: harry-mccarthy

Post on 05-Jan-2016


TRANSCRIPT

Page 1: Regression Analysis Part C Confidence Intervals and Hypothesis Testing Read Chapters 3, 4 and 5 of Forecasting and Time Series, An Applied Approach

Regression Analysis

Part C: Confidence Intervals and Hypothesis Testing

Read Chapters 3, 4 and 5 of Forecasting and Time Series, An Applied Approach.

Page 2: Regression Analysis Modules

L01C MGS 8110 - Regression Inference 2

Part A – Basic Model & Parameter Estimation

Part B – Calculation Procedures

Part C – Inference: Confidence Intervals & Hypothesis Testing

Part D – Goodness of Fit

Part E – Model Building

Part F – Transformed Variables

Part G – Standardized Variables

Part H – Dummy Variables

Part I – Eliminating Intercept

Part J – Outliers

Part K – Regression Example #1

Part L – Regression Example #2

Part N – Non-linear Regression

Part P – Non-linear Example

Page 3: Overview of Part L01C, Confidence Intervals and Hypothesis Testing

• Confidence Intervals
  • For Yi prediction and Yi mean
    • Formulas for univariate and multivariate cases.
    • Example calculation: 1) Manual in Excel and 2) SPSS.
  • For Regression Coefficients, bi
    • Formulas for univariate and multivariate cases.
    • Example calculation: 1) Data Analysis in Excel and 2) SPSS.

• Hypothesis Testing
  • For Regression Coefficients, bi
  • For Entire Regression Model, F-test

Page 4: Underlying Statistical Theory, Confidence Intervals and Hypothesis Testing

$$\frac{b_j - \beta_j}{s_{b_j}} \sim t(n-k) \quad \text{(has a t distribution)}$$

$$\frac{y_i - \hat{y}_i}{s_e} \sim t(n-k) \quad \text{(has a t distribution)}$$

Page 5: The Standard Error of a Regression Equation, single independent variable

$$s_e^2 = \frac{SSE}{n-k} = \frac{\sum_i (Y_i - \hat{Y}_i)^2}{n-k} = \frac{\sum_i Y_i^2 - \sum_i \hat{Y}_i^2}{n-k}$$

where
$Y_i$ is the actually observed value of the dependent variable.
$\hat{Y}_i$ is the predicted value from the fitted regression equation.
p = 1 is the number of independent variables.
k = p+1 = 2 is the number of parameters, $\beta_0$, $\beta_1$.
n is the sample size used when calculating s.
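As a rough numerical check of this formula, the sketch below recomputes the standard error from the residual sum of squares (7991.535) and the sample size (28) that the SPSS ANOVA output reports for the Sales versus R D example used later in these slides. The variable names are illustrative only.

```python
import math

# Values read from the SPSS ANOVA table of the Sales vs. R D example.
sse = 7991.535   # residual (unexplained) sum of squares
n = 28           # number of observations
p = 1            # number of independent variables
k = p + 1        # number of estimated parameters (intercept and slope)

s_squared = sse / (n - k)     # s_e^2, the residual mean square
s_e = math.sqrt(s_squared)    # standard error of the regression

print(f"s_e^2 = {s_squared:.3f}, s_e = {s_e:.3f}")
```

The two results should agree with the residual Mean Square (307.367) and with the s = 17.53 cell of the Excel worksheet.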

Page 6: Confidence Interval for Individual Prediction, single independent variable

$$\hat{Y}_f \pm t(n-k,\ \alpha/2)\; s_e \sqrt{1 + \frac{1}{n} + \frac{(X_f - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}}$$

where
f denotes the future (forecasted) or predicted value.
p = 1 is the number of independent variables.
k = p+1 = 2 is the number of parameters, $\beta_0$, $\beta_1$.
n is the sample size used when calculating s.
$1-\alpha$ is the confidence level, typically .95. So $\alpha/2$ = .025.

Page 7: Confidence Interval for Mean Prediction, single independent variable (1 of 2)

$$\hat{Y}_f \pm t(n-k,\ \alpha/2)\; s_e \sqrt{\frac{1}{m} + \frac{1}{n} + \frac{(X_f - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}}$$

where
f denotes the future (forecasted) or predicted value.
p = 1 is the number of independent variables.
k = p+1 = 2 is the number of parameters, $\beta_0$, $\beta_1$.
n is the sample size used when calculating s.
m is the sample size that is going to be used to calculate the mean value.
$1-\alpha$ is the confidence level, typically .95. So $\alpha/2$ = .025.

Page 8: Confidence Interval for Mean Prediction, single independent variable (2 of 2)

When m = 1, the CI for the mean becomes the CI for an individual Y:

$$\hat{Y}_f \pm t(n-k,\ \alpha/2)\; s_e \sqrt{\frac{1}{1} + \frac{1}{n} + \frac{(X_f - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}} = \hat{Y}_f \pm t(n-k,\ \alpha/2)\; s_e \sqrt{1 + \frac{1}{n} + \frac{(X_f - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}}$$

When m = infinity, the term 1/m vanishes and the CI for the mean becomes the CI for a general mean:

$$\hat{Y}_f \pm t(n-k,\ \alpha/2)\; s_e \sqrt{\frac{1}{n} + \frac{(X_f - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}}$$
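The two limiting cases can be sketched with one small function. This is an illustration only: the function name `interval` and its arguments are hypothetical, and the default inputs are the rounded summary values of the Sales versus R D example (a = 9.151, b = 3.459, s_e = 17.532, mean of X = 25.0536, sum of squared deviations of X = 1495.92, t = 2.06).

```python
import math


def interval(x_f, m, a=9.151, b=3.459, s_e=17.532, n=28,
             x_bar=25.0536, ss_x=1495.92, t_crit=2.06):
    """Interval for the mean of m future observations at x_f.

    m = 1 gives the interval for an individual prediction;
    m = math.inf gives the interval for the general mean.
    """
    y_hat = a + b * x_f
    inv_m = 0.0 if math.isinf(m) else 1.0 / m
    radicand = inv_m + 1.0 / n + (x_f - x_bar) ** 2 / ss_x
    half_width = t_crit * s_e * math.sqrt(radicand)
    return y_hat - half_width, y_hat + half_width


individual = interval(20, m=1)        # individual prediction at X = 20
mean_ci = interval(20, m=math.inf)    # general mean at X = 20
print(individual, mean_ci)
```

With these rounded inputs the individual interval at X = 20 should come out close to the (41.27, 115.39) obtained in the manual calculation, and the interval for the mean is considerably narrower.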

Page 9: Confidence Intervals for Individual Predictions and Mean Predictions

[Figure: plot of Sales (vertical axis, 0 to 200) against R/D Expense (horizontal axis, 0 to 45).]

Page 10: CI Manual Calculations, single independent variable

For $X_f = 20$, with $t(26,\ .025) = 2.06$ from =TINV(0.05,26):

$$\hat{Y}_f \pm t(n-k,\ \alpha/2)\; s_e \sqrt{1 + \frac{1}{n} + \frac{(X_f - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}}$$

$$= [9.151 + 3.459(20)] \pm (2.06)\sqrt{307.367}\,\sqrt{1 + \frac{1}{28} + \frac{(20 - 25.05)^2}{(7.443)^2 (27)}}$$

$$= 78.33 \pm (2.06)(17.532)\sqrt{1.0357 + 0.01705}$$

$$= 78.33 \pm (2.06)(17.532)\sqrt{1.05275} = 78.33 \pm 37.06$$

$$= (41.27,\ 115.39)$$

Descriptive Statistics

         Mean      Std. Deviation   N
SALES    95.82     30.968           28
R D      25.0536   7.44342          28

ANOVA (b)

Model 1       Sum of Squares   df   Mean Square   F        Sig.
Regression    17902.572        1    17902.572     58.245   .000 (a)
Residual      7991.535         26   307.367
Total         25894.107        27

a. Predictors: (Constant), R D
b. Dependent Variable: SALES

Coefficients (a)

              Unstandardized Coefficients   Standardized Coefficients
Model 1       B        Std. Error           Beta                        t       Sig.
(Constant)    9.151    11.830                                           .774    .446
R D           3.459    .453                 .831                        7.632   .000

a. Dependent Variable: SALES

Page 11: CI Manual Calculations, single independent variable

$$\hat{Y}_f \pm t(n-k,\ \alpha/2)\; s_e \sqrt{\frac{1}{m} + \frac{1}{n} + \frac{(X_f - \bar{X})^2}{\sum_i (X_i - \bar{X})^2}}$$

Excel worksheet. Named cells: a = 9.15 "A_coef", b = 3.46 "B_coef", t = 2.06 "t_05" from =TINV(0.05,26).

ORIGINAL DATA

Quarter   X (R D)   Y (Sales)   Y pred    Y^2        (Y pred)^2   (X-Xbar)^2
1         9.25      40          41.150    1600.00    1693.34      249.75
2         12.50     37          52.393    1369.00    2745.06      157.59
3         17.50     50          69.690    2500.00    4856.76      57.06
4         20.00     70          78.339    4900.00    6137.00      25.54
5         15.00     60          61.042    3600.00    3726.11      101.07
6         18.00     60          71.420    3600.00    5100.84      49.75
7         22.00     72          85.258    5184.00    7268.90      9.32
8         25.25     88          96.501    7744.00    9312.43      0.04
9         15.00     101         61.042    10201.00   3726.11      101.07
10        20.25     80          79.204    6400.00    6273.25      23.07
11        24.25     81          93.042    6561.00    8656.73      0.65
12        27.50     97          104.285   9409.00    10875.29     5.99
13        25.00     110         95.636    12100.00   9146.26      0.00
14        25.75     89          98.231    7921.00    9649.26      0.49
15        29.25     103         110.339   10609.00   12174.62     17.61
16        32.75     117         122.447   13689.00   14993.18     59.24
17        30.00     131         112.933   17161.00   12753.91     24.47
18        28.00     98          106.014   9604.00    11239.05     8.68
19        33.50     112         125.041   12544.00   15635.30     71.34
20        38.25     134         141.473   17956.00   20014.74     174.15
21        32.00     153         119.852   23409.00   14364.52     48.25
22        25.25     145         96.501    21025.00   9312.43      0.04
23        22.25     101         86.123    10201.00   7417.12      7.86
24        25.00     89          95.636    7921.00    9146.26      0.00
25        26.25     90          99.960    8100.00    9992.08      1.43
26        31.25     105         117.257   11025.00   13749.32     38.40
27        30.00     125         112.933   15625.00   12753.91     24.47
28        40.50     145         149.257   21025.00   22277.70     238.59

Sum's =   701.50    2,683.00              282,983.0  274,991.47   1,495.92 "X_SS"
Mean's =  25.05     95.82       95.821
min = 9.25, max = 40.50
s^2 = 307.37, s = 17.53 "St_Err"

PREDICTED VALUES

                                                                              Individual       Mean
X    (X-Xbar)^2   Ret     sqrt(1+(1/n)+Ret)   sqrt((1/n)+Ret)   Delta   Delta M   Y pred   Lower    Upper    Lower    Upper
9    257.72       0.172   1.099               0.46              39.61   16.44     40.29    0.68     79.89    23.85    56.72
10   226.61       0.151   1.090               0.43              39.27   15.59     43.74    4.48     83.01    28.15    59.34
11   197.50       0.132   1.081               0.41              38.94   14.76     47.20    8.26     86.15    32.44    61.96
12   170.40       0.114   1.072               0.39              38.64   13.94     50.66    12.02    89.30    36.72    64.60
13   145.29       0.097   1.064               0.36              38.36   13.13     54.12    15.77    92.48    40.99    67.26
14   122.18       0.082   1.057               0.34              38.09   12.35     57.58    19.49    95.68    45.24    69.93
15   101.07       0.068   1.050               0.32              37.85   11.58     61.04    23.19    98.89    49.46    72.62
16   81.97        0.055   1.044               0.30              37.63   10.84     64.50    26.87    102.13   53.66    75.34
17   64.86        0.043   1.039               0.28              37.43   10.13     67.96    30.53    105.40   57.83    78.09
18   49.75        0.033   1.034               0.26              37.26   9.46      71.42    34.16    108.68   61.96    80.88
19   36.65        0.024   1.030               0.25              37.11   8.84      74.88    37.77    111.99   66.04    83.72
20   25.54        0.017   1.026               0.23              36.98   8.28      78.34    41.36    115.32   70.06    86.62
21   16.43        0.011   1.023               0.22              36.87   7.79      81.80    44.93    118.67   74.01    89.59
22   9.32         0.006   1.021               0.20              36.79   7.38      85.26    48.47    122.04   77.88    92.64
23   4.22         0.003   1.019               0.20              36.73   7.07      88.72    51.99    125.44   81.64    95.79
24   1.11         0.001   1.018               0.19              36.69   6.88      92.18    55.49    128.87   85.30    99.06
25   0.00         0.000   1.018               0.19              36.68   6.81      95.64    58.96    132.31   88.83    102.45
26   0.90         0.001   1.018               0.19              36.69   6.87      99.10    62.41    135.78   92.23    105.96
27   3.79         0.003   1.019               0.20              36.72   7.05      102.55   65.83    139.27   95.51    109.60
28   8.68         0.006   1.021               0.20              36.78   7.34      106.01   69.24    142.79   98.67    113.36
29   15.57        0.010   1.023               0.21              36.86   7.74      109.47   72.61    146.33   101.73   117.21
30   24.47        0.016   1.026               0.23              36.96   8.22      112.93   75.97    149.90   104.71   121.16
31   35.36        0.024   1.029               0.24              37.09   8.78      116.39   79.30    153.48   107.61   125.17
32   48.25        0.032   1.033               0.26              37.24   9.40      119.85   82.61    157.09   110.46   129.25
33   63.15        0.042   1.038               0.28              37.42   10.06     123.31   85.90    160.73   113.25   133.37
34   80.04        0.054   1.044               0.30              37.61   10.76     126.77   89.16    164.38   116.01   137.54
35   98.93        0.066   1.050               0.32              37.83   11.50     130.23   92.40    168.06   118.73   141.73
36   119.82       0.080   1.056               0.34              38.07   12.26     133.69   95.62    171.76   121.43   145.95
37   142.72       0.095   1.064               0.36              38.33   13.05     137.15   98.82    175.48   124.10   150.20
38   167.61       0.112   1.071               0.38              38.61   13.85     140.61   102.00   179.22   126.76   154.46
39   194.50       0.130   1.080               0.41              38.91   14.67     144.07   105.16   182.98   129.40   158.74
40   223.40       0.149   1.089               0.43              39.23   15.50     147.53   108.30   186.76   132.03   163.03
41   254.29       0.170   1.098               0.45              39.57   16.34     150.99   111.42   190.56   134.64   167.33

Cell formulas annotated on the worksheet:
Mean of X: =B34/28
s^2: =(E34-F34)/26
s: =SQRT(E36)
(X-Xbar)^2: =(I37-X_mean)^2
Ret: =J37/X_SS
sqrt(1+(1/n)+Ret): =SQRT(1+(1/28)+K37)
Delta: =t_05*St_Err*L37
Interval limits: =P37+N37 and =P37+O37
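The worksheet calculation can be sketched in a few lines of Python. This is a minimal illustration under two assumptions: the critical value `T_05 = 2.0555` is the unrounded value of =TINV(0.05,26) (the slide displays it as 2.06), and the helper name `worksheet_row` is hypothetical. The data are the 28 quarters of R D and Sales listed in the slides.

```python
import math

rd = [9.25, 12.50, 17.50, 20.00, 15.00, 18.00, 22.00, 25.25, 15.00, 20.25,
      24.25, 27.50, 25.00, 25.75, 29.25, 32.75, 30.00, 28.00, 33.50, 38.25,
      32.00, 25.25, 22.25, 25.00, 26.25, 31.25, 30.00, 40.50]
sales = [40, 37, 50, 70, 60, 60, 72, 88, 101, 80, 81, 97, 110, 89, 103, 117,
         131, 98, 112, 134, 153, 145, 101, 89, 90, 105, 125, 145]

T_05 = 2.0555  # assumption: unrounded =TINV(0.05,26)
n, k = len(rd), 2

x_bar = sum(rd) / n
y_bar = sum(sales) / n
ss_x = sum((x - x_bar) ** 2 for x in rd)                       # "X_SS"
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(rd, sales))

b = s_xy / ss_x              # slope, "B_coef"
a = y_bar - b * x_bar        # intercept, "A_coef"
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(rd, sales))
s_e = math.sqrt(sse / (n - k))                                 # "St_Err"


def worksheet_row(x_f):
    """Return Y pred and the individual and mean interval limits at x_f."""
    ret = (x_f - x_bar) ** 2 / ss_x
    y_pred = a + b * x_f
    delta = T_05 * s_e * math.sqrt(1 + 1 / n + ret)    # individual prediction
    delta_m = T_05 * s_e * math.sqrt(1 / n + ret)      # mean prediction
    return (y_pred, y_pred - delta, y_pred + delta,
            y_pred - delta_m, y_pred + delta_m)


row_20 = worksheet_row(20)
print([round(v, 2) for v in row_20])
```

For X = 20 the result should agree, up to rounding, with the worksheet row: Y pred 78.34, individual limits 41.36 and 115.32, mean limits 70.06 and 86.62.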

Page 12: SPSS Data Analysis Calculations, single independent variable

Page 13: SPSS Data Analysis Calculations, single independent variable (continued)

In the SPSS output, lmci_1 and umci_1 are the lower and upper limits for the mean prediction; lici_1 and uici_1 are the lower and upper limits for an individual prediction.

Quarter   R_D     Sales    lmci_1   umci_1   lici_1   uici_1
1         9.25    40.00    24.93    57.37    1.63     80.67
2         12.50   37.00    38.86    65.93    13.90    90.89
3         17.50   50.00    59.90    79.48    32.35    107.03
4         20.00   70.00    70.06    86.62    41.36    115.32
5         15.00   60.00    49.46    72.62    23.19    98.89
6         18.00   60.00    61.96    80.88    34.16    108.68
7         22.00   72.00    77.88    92.64    48.47    122.04
8         25.25   88.00    89.69    103.31   59.83    133.18
9         15.00   101.00   49.46    72.62    23.19    98.89
10        20.25   80.00    71.05    87.35    42.26    116.15
11        24.25   81.00    86.19    99.89    56.36    129.72
12        27.50   97.00    97.10    111.47   67.54    141.03
13        25.00   110.00   88.83    102.45   58.96    132.31
14        25.75   89.00    91.39    105.07   61.55    134.91
15        29.25   103.00   102.49   118.19   73.46    147.22
16        32.75   117.00   112.56   132.34   85.08    159.82
17        30.00   131.00   104.71   121.16   75.97    149.90
18        28.00   98.00    98.67    113.36   69.24    142.79
19        33.50   112.00   114.63   135.45   87.53    162.55
20        38.25   134.00   127.42   155.53   102.79   180.15
21        32.00   153.00   110.46   129.25   82.61    157.09
22        25.25   145.00   89.69    103.31   59.83    133.18
23        22.25   101.00   78.83    93.42    49.35    122.89
24        25.00   89.00    88.83    102.45   58.96    132.31
25        26.25   90.00    93.06    106.86   63.27    136.65
26        31.25   105.00   108.33   126.19   80.13    154.38
27        30.00   125.00   104.71   121.16   75.97    149.90
28        40.50   145.00   133.33   165.18   109.86   188.66

Page 14: The Standard Error of a Regression Equation, multivariate case

$$s_e^2 = \frac{SSE}{n-k} = \frac{\mathbf{Y'Y} - \mathbf{b'X'Y}}{n-k}, \qquad s = \sqrt{\frac{\mathbf{Y'Y} - \mathbf{b'X'Y}}{n-k}}$$

where
Y is the actually observed values of the dependent variable, an [n x 1] vector.
X is the actually observed values of the independent variables, an [n x k] matrix whose first column is a column of ones for the intercept.
b is the calculated regression parameters, a [k x 1] vector, $\mathbf{b} = (\mathbf{X'X})^{-1}(\mathbf{X'Y})$.
p is the number of independent variables.
k = p+1 is the number of parameters, $\beta_0, \beta_1, \ldots, \beta_p$.
n is the sample size used when calculating s.

Page 15: Confidence Interval for Individual Predictions, multivariate case

$$\hat{Y}_f \pm t(n-k,\ \alpha/2)\; s_e \sqrt{1 + \mathbf{X}_f'(\mathbf{X'X})^{-1}\mathbf{X}_f}$$

where
$\mathbf{X}_f$ is a vector of specified values for the independent variables, $\mathbf{X}_f' = [1,\ X_{f,1},\ X_{f,2},\ \ldots,\ X_{f,p}]$.
p is the number of independent variables.
k = p+1 is the number of parameters, $\beta_0, \beta_1, \ldots, \beta_p$.
n is the sample size used when calculating s.
$1-\alpha$ is the confidence level, typically .95. So $\alpha/2$ = .025.

Page 16: Confidence Interval for Mean Predictions, multivariate case

$$\hat{Y}_f \pm t(n-k,\ \alpha/2)\; s_e \sqrt{\mathbf{X}_f'(\mathbf{X'X})^{-1}\mathbf{X}_f}$$

where
$\mathbf{X}_f$ is a vector of specified values for the independent variables, $\mathbf{X}_f' = [1,\ X_{f,1},\ X_{f,2},\ \ldots,\ X_{f,p}]$.
p is the number of independent variables.
k = p+1 is the number of parameters, $\beta_0, \beta_1, \ldots, \beta_p$.
n is the sample size used when calculating s.
$1-\alpha$ is the confidence level, typically .95. So $\alpha/2$ = .025.

Page 17: CI Manual Calculations, multivariate case

$$s = \sqrt{\frac{\mathbf{Y'Y} - \mathbf{b'X'Y}}{n-k}}$$

Excel worksheet (house price example, 16 observations).

Y:
68.70, 54.90, 51.50, 71.60, 58.40, 40.70, 51.70, 71.90, 57.10, 58.30, 73.50, 58.50, 49.10, 67.50, 53.70, 50.00

X' (rows: constant, Size, Age):
1      1      1      1      1      1      1      1      1      1      1      1      1      1      1      1
2.05   1.7    1.47   1.75   1.94   1.19   1.56   1.95   1.6    1.49   1.91   1.38   1.55   1.88   1.6    1.55
3.43   11.61  8.31   0      7.41   31.7   16.1   2.05   1.74   2.76   0      0      12.61  2.8    7.08   18

Y'Y       b'X'Y    Num                Dem   s^2                 s
56256.6   56031    225.6 (=AT5-AW5)   13    17.35 (=AY5/AZ5)    4.166 "St_Er"

b = (X'X)^-1 (X'Y)    X'Y         (X'X)^-1
29.4198               937.1        5.751   -3.157   -0.057
20.3318               1583.523    -3.157    1.769    0.028
-0.5878               6352.303    -0.057    0.028    0.001

b' = [29.42, 20.33, -0.5878]
"t_.025" = 2.16

Rad = X_f'(X'X)^-1 X_f, computed with =MMULT(AR20:AT20,AT$13:AT$15).

                                                        Individual       Mean
X_f (1, Size, Age)   Y_f     Rad     Delta    Delta_I   Lower   Upper    Lower   Upper
1   1.1    0         51.78   0.947   8.759    12.558    39.23   64.34    43.03   60.54
1   1.3    0         55.85   0.534   6.574    11.145    44.71   67.00    49.28   62.43
1   1.5    0         59.92   0.262   4.603    10.108    49.81   70.03    55.31   64.52
1   1.75   0         65      0.12    3.123    9.526     55.47   74.53    61.88   68.12
1   1.9    0         68.05   0.142   3.389    9.617     58.43   77.67    64.66   71.44
1   2.1    0         72.12   0.294   4.882    10.238    61.88   82.35    67.23   77
1   1.1    32        32.97   0.647   7.241    11.551    21.42   44.52    25.73   40.21
1   1.3    32        37.04   0.591   6.920    11.353    25.69   48.39    30.12   43.96
1   1.5    32        41.11   0.677   7.404    11.653    29.45   52.76    33.7    48.51
1   1.7    32        45.17   0.904   8.555    12.417    32.76   57.59    36.62   53.73
1   1.9    32        49.24   1.272   10.151   13.566    35.67   62.81    39.09   59.39
1   2.1    32        53.31   1.782   12.014   15.011    38.29   68.32    41.29   65.32

X_f' (the same twelve points, one per column):
1     1     1     1      1     1     1     1     1     1     1     1
1.1   1.3   1.5   1.75   1.9   2.1   1.1   1.3   1.5   1.7   1.9   2.1
0     0     0     0      0     0     32    32    32    32    32    32
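The matrix calculation can be reproduced without a spreadsheet. The sketch below is one possible illustration using only the Python standard library; the helper names `inverse` and `intervals` are hypothetical, and `T_025 = 2.1604` is assumed to be the unrounded value of =TINV(0.05,13), which the slides display as 2.16. The data are the 16 houses (Price, Size, Age) from the slides.

```python
import math

price = [68.70, 54.90, 51.50, 71.60, 58.40, 40.70, 51.70, 71.90,
         57.10, 58.30, 73.50, 58.50, 49.10, 67.50, 53.70, 50.00]
size = [2.05, 1.70, 1.47, 1.75, 1.94, 1.19, 1.56, 1.95,
        1.60, 1.49, 1.91, 1.38, 1.55, 1.88, 1.60, 1.55]
age = [3.43, 11.61, 8.31, 0.00, 7.41, 31.70, 16.10, 2.05,
       1.74, 2.76, 0.00, 0.00, 12.61, 2.80, 7.08, 18.00]

T_025 = 2.1604  # assumption: unrounded =TINV(0.05,13)

X = [[1.0, s, a] for s, a in zip(size, age)]   # design matrix with a constant column
n, k = len(X), len(X[0])


def inverse(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    dim = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(dim)]
           for i, row in enumerate(matrix)]
    for col in range(dim):
        pivot = max(range(col, dim), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(dim):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[dim:] for row in aug]


columns = list(zip(*X))                                   # the rows of X'
xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
xty = [sum(u * y for u, y in zip(ci, price)) for ci in columns]
xtx_inv = inverse(xtx)
b = [sum(v * w for v, w in zip(row, xty)) for row in xtx_inv]   # b = (X'X)^-1 X'Y

yty = sum(y * y for y in price)
sse = yty - sum(bi * v for bi, v in zip(b, xty))          # Y'Y - b'X'Y
s_e = math.sqrt(sse / (n - k))


def intervals(x_f):
    """Return (Y_f, individual interval, mean interval) for x_f = [1, size, age]."""
    y_f = sum(bi * xi for bi, xi in zip(b, x_f))
    rad = sum(x_f[i] * xtx_inv[i][j] * x_f[j] for i in range(k) for j in range(k))
    delta_mean = T_025 * s_e * math.sqrt(rad)
    delta_ind = T_025 * s_e * math.sqrt(1.0 + rad)
    return y_f, (y_f - delta_ind, y_f + delta_ind), (y_f - delta_mean, y_f + delta_mean)


y_f, individual, mean_ci = intervals([1.0, 1.1, 0.0])
print([round(v, 3) for v in b], round(s_e, 3))
print(round(y_f, 2), [round(v, 2) for v in individual], [round(v, 2) for v in mean_ci])
```

For the first prediction point (Size 1.1, Age 0) the output should be close to the worksheet row: Y_f 51.78, individual limits 39.23 and 64.34, mean limits 43.03 and 60.54. For anything larger than a classroom example a linear algebra library would be the more usual choice than a hand-written inverse.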

Page 18: SPSS Data Analysis Calculations, multivariate case

Page 19: SPSS Data Analysis Calculations, multivariate case (continued)

House   Price   Size   Age     lmci_1   umci_1   lici_1   uici_1
1       68.70   2.05   3.43    64.49    73.68    58.98    79.19
2       54.90   1.70   11.61   54.42    59.90    47.75    66.57
3       51.50   1.47   8.31    51.28    57.57    44.89    63.96
4       71.60   1.75   0.00    61.88    68.12    55.47    74.53
5       58.40   1.94   7.41    60.54    68.47    54.67    74.34
6       40.70   1.19   31.70   28.05    41.91    23.62    46.34
7       51.70   1.56   16.10   48.48    54.86    42.13    61.22
8       71.90   1.95   2.05    64.24    71.49    58.16    77.56
9       57.10   1.60   1.74    57.56    64.29    51.32    70.54
10      58.30   1.49   2.76    54.09    62.09    48.24    67.94
11      73.50   1.91   0.00    64.81    71.69    58.62    77.89
12      58.50   1.38   0.00    51.73    63.22    46.80    68.16
13      49.10   1.55   12.61   50.89    56.15    44.15    62.90
14      67.50   1.88   2.80    62.88    69.12    56.47    75.52
15      53.70   1.60   7.08    55.37    60.21    48.47    67.11
16      50.00   1.55   18.00   46.75    53.95    40.66    60.05

Page 20: The Standard Error of a Regression Equation

Review from previous slide.

$$s_e = \sqrt{\frac{\sum_i \left(Y_i^2 - \hat{Y}_i^2\right)}{n-k}}$$

$$s_e = \sqrt{\frac{\mathbf{Y'Y} - \mathbf{b'X'Y}}{n-k}}$$

where
$Y_i$ is the actually observed value of the dependent variable.
$\hat{Y}_i$ is the predicted value from the fitted regression equation.
p = 1 is the number of independent variables.
k = p+1 = 2 is the number of parameters, $\beta_0$, $\beta_1$.
n is the sample size used when calculating s.

Page 21: Skip’s Quick and Dirty Method to Estimate the Confidence Interval for a Regression Line

$$\hat{Y}_{UCL} = \hat{Y} + 2 s_e = \hat{a} + \hat{b}X + 2 s_e$$

$$\hat{Y}_{LCL} = \hat{Y} - 2 s_e = \hat{a} + \hat{b}X - 2 s_e$$

Procedure:
Select a range of X values from Minimum X to Maximum X.

Calculate the corresponding predicted values for Y, Yhat.

Add and subtract 2 times the Standard Error for Regression to the predicted values.

Optional – plot the two CL lines on the scatter plot.
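The procedure can be sketched as follows, using the rounded Sales versus R D results from the worksheet (a = 9.15, b = 3.46, s = 17.53, X from 9.25 to 40.50). The number of grid points and the variable names are arbitrary choices for illustration.

```python
# Rounded results from the Sales vs. R D worksheet.
a, b, s_e = 9.15, 3.46, 17.53
x_min, x_max = 9.25, 40.50
steps = 5  # arbitrary number of intervals between x_min and x_max

band = []
for i in range(steps + 1):
    x = x_min + i * (x_max - x_min) / steps
    y_hat = a + b * x              # predicted value
    lcl = y_hat - 2 * s_e          # lower control line
    ucl = y_hat + 2 * s_e          # upper control line
    band.append((x, lcl, y_hat, ucl))

for x, lcl, y_hat, ucl in band:
    print(f"X = {x:6.2f}  LCL = {lcl:7.2f}  Yhat = {y_hat:7.2f}  UCL = {ucl:7.2f}")
```

The band has the same width (4 times the standard error) at every X, which is what makes it quick: unlike the exact intervals it does not widen away from the mean of X.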

Page 22: Confidence Interval for Regression Coefficients, single independent variable

$$\hat{b}_1 \pm t(n-k,\ \alpha/2)\; s_e \frac{1}{\sqrt{\sum_i (X_i - \bar{X})^2}} \;=\; \hat{b}_1 \pm t(n-2,\ \alpha/2)\; s_{b_1}$$

$$\hat{b}_0 \pm t(n-k,\ \alpha/2)\; s_e \sqrt{\frac{1}{n} + \frac{\bar{X}^2}{\sum_i (X_i - \bar{X})^2}} \;=\; \hat{b}_0 \pm t(n-2,\ \alpha/2)\; s_{b_0}$$

where
p = 1 is the number of independent variables.
k = p+1 = 2 is the number of parameters, $\beta_0$, $\beta_1$.
n is the sample size used when calculating s.
$1-\alpha$ is the confidence level, typically .95. So $\alpha/2$ = .025.
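As a numerical illustration of these two formulas, the sketch below recomputes the coefficient standard errors for the Sales versus R D example from the SPSS summary values (s = 17.532, n = 28, mean of R D = 25.0536, standard deviation of R D = 7.44342) and forms the intervals with t = 2.06. The variable names are illustrative only.

```python
import math

# Summary values from the SPSS output for the Sales vs. R D example.
b0, b1 = 9.151, 3.459
s_e = 17.532
n = 28
x_bar = 25.0536
sd_x = 7.44342
t_crit = 2.06            # =TINV(0.05,26), rounded as on the slides

ss_x = (n - 1) * sd_x ** 2                        # sum of (X_i - Xbar)^2

s_b1 = s_e / math.sqrt(ss_x)                      # standard error of the slope
s_b0 = s_e * math.sqrt(1 / n + x_bar ** 2 / ss_x) # standard error of the intercept

ci_b1 = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)
ci_b0 = (b0 - t_crit * s_b0, b0 + t_crit * s_b0)

print(f"s_b1 = {s_b1:.3f}, CI for b1 = ({ci_b1[0]:.3f}, {ci_b1[1]:.3f})")
print(f"s_b0 = {s_b0:.3f}, CI for b0 = ({ci_b0[0]:.3f}, {ci_b0[1]:.3f})")
```

The two standard errors should match the Std. Error column of the SPSS Coefficients table (.453 for R D and 11.830 for the constant).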

Page 23: Confidence Interval for Regression Coefficients, multivariate case

$$\hat{b}_i \pm t(n-k,\ \alpha/2)\; s_e \sqrt{d_i} \;=\; \hat{b}_i \pm t(n-k,\ \alpha/2)\; s_{b_i}$$

where $d_i$ is the ith diagonal element of $(\mathbf{X'X})^{-1}$.

where
p is the number of independent variables.
k = p+1 is the number of parameters, $\beta_0, \beta_1, \ldots, \beta_p$.
n is the sample size used when calculating s.
$1-\alpha$ is the confidence level, typically .95. So $\alpha/2$ = .025.

Page 24: Excel, Data Analysis Calculations, Multivariate Case

Menu path: TOOLS / DATA ANALYSIS / Regression

House   Price   Size   Age
1       68.70   2.05   3.43
2       54.90   1.70   11.61
3       51.50   1.47   8.31
4       71.60   1.75   0.00
5       58.40   1.94   7.41
6       40.70   1.19   31.70
7       51.70   1.56   16.10
8       71.90   1.95   2.05
9       57.10   1.60   1.74
10      58.30   1.49   2.76
11      73.50   1.91   0.00
12      58.50   1.38   0.00
13      49.10   1.55   12.61
14      67.50   1.88   2.80
15      53.70   1.60   7.08
16      50.00   1.55   18.00

Page 25: Excel, Data Analysis Calculations, Multivariate Case (continued)

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.914
R Square            0.836
Adjusted R Square   0.810
Standard Error      4.166
Observations        16

ANOVA
             df   SS       MS      F      Significance F
Regression   2    1146.2   573.1   33.0   0.0
Residual     13   225.6    17.4
Total        15   1371.8

            Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept   29.42          9.99             2.94     0.011     7.84        51.00       7.84          51.00
Size        20.33          5.54             3.67     0.003     8.36        32.30       8.36          32.30
Age         -0.59          0.15             -3.85    0.002     -0.92       -0.26       -0.92         -0.26

Page 26: SPSS Data Analysis Calculations, Multivariate Case

Page 27: SPSS Data Analysis Calculations, Multivariate Case (continued)

Coefficients (a)

              Unstandardized Coefficients   Standardized Coefficients                    95% Confidence Interval for B
Model 1       B        Std. Error           Beta                        t        Sig.    Lower Bound   Upper Bound
(Constant)    29.420   9.990                                            2.945    .011    7.837         51.002
SIZE          20.332   5.540                .503                        3.670    .003    8.363         32.301
AGE           -.588    .153                 -.527                       -3.845   .002    -.918         -.258

a. Dependent Variable: PRICE

Page 28: Hypothesis Test of Regression Coefficient

$$H_0: \beta_j = 0 \qquad H_1: \beta_j \neq 0$$

Test Statistic:

$$t_C = \frac{b_j}{s_{b_j}}$$

Reject $H_0$ if $|t_C| > t(n-k,\ \alpha/2)$, where $t(n-k,\ \alpha/2)$ = TINV($\alpha$, n-k),

or Reject $H_0$ if Prob-value < $\alpha$, where Prob-value = TDIST($|t_C|$, n-k, 2).

where
p is the number of independent variables.
k = p+1 is the number of parameters, $\beta_0, \beta_1, \ldots, \beta_p$.
n is the sample size used when calculating s.
$1-\alpha$ is the confidence level, typically .95. So $\alpha/2$ = .025.
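The critical-value form of the test can be sketched as below, using the Size and Age coefficients and standard errors reported by SPSS for the house price example and the critical value 2.16 from =TINV(0.05,13). The dictionary layout and names are illustrative only.

```python
# (coefficient, standard error) pairs from the SPSS Coefficients table.
coefficients = {
    "SIZE": (20.332, 5.540),
    "AGE": (-0.588, 0.153),
}
T_CRIT = 2.16  # =TINV(0.05,13), i.e. t(n-k, alpha/2) with n = 16, k = 3

results = {}
for name, (b_j, s_bj) in coefficients.items():
    t_c = b_j / s_bj                   # test statistic
    reject_h0 = abs(t_c) > T_CRIT      # two-sided test of H0: beta_j = 0
    results[name] = (t_c, reject_h0)

for name, (t_c, reject_h0) in results.items():
    print(f"{name}: t_C = {t_c:.3f}, reject H0: {reject_h0}")
```

Both statistics exceed 2.16 in absolute value, so both slopes are judged different from zero at the .05 level. The Age statistic computed from these rounded inputs differs slightly in the third decimal from the -3.845 that SPSS prints.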

Page 29: Excel, Data Analysis Calculation, Multivariate Case

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.914
R Square            0.836
Adjusted R Square   0.810
Standard Error      4.166
Observations        16

ANOVA
             df   SS       MS      F      Significance F
Regression   2    1146.2   573.1   33.0   0.000
Residual     13   225.6    17.4
Total        15   1371.8

            Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept   29.42          9.99             2.94     0.011     7.84        51.00
Size        20.33          5.54             3.67     0.003     8.36        32.30
Age         -0.59          0.15             -3.85    0.002     -0.92       -0.26

Critical value for $\alpha$ = 0.05: t = 2.16, from =TINV(0.05,13)

Page 30: SPSS Data Analysis Calculations, Multivariate Case

Coefficients (a)

              Unstandardized Coefficients   Standardized Coefficients                    95% Confidence Interval for B
Model 1       B        Std. Error           Beta                        t        Sig.    Lower Bound   Upper Bound
(Constant)    29.420   9.990                                            2.945    .011    7.837         51.002
SIZE          20.332   5.540                .503                        3.670    .003    8.363         32.301
AGE           -.588    .153                 -.527                       -3.845   .002    -.918         -.258

a. Dependent Variable: PRICE

Page 31: Summary

Coefficients (a)

              Unstandardized Coefficients   Standardized Coefficients                    95% Confidence Interval for B
Model 1       B        Std. Error           Beta                        t        Sig.    Lower Bound   Upper Bound
(Constant)    29.420   9.990                                            2.945    .011    7.837         51.002
SIZE          20.332   5.540                .503                        3.670    .003    8.363         32.301
AGE           -.588    .153                 -.527                       -3.845   .002    -.918         -.258

a. Dependent Variable: PRICE

Summary:

Never test the intercept (constant). Discussed in more detail in L01I.

If sig is less than .05, keep the variable (slope not equal to zero).

If sig is greater than .05, consider eliminating the variable from the model (slope could be zero).

Page 32: Summary (continued)

Coefficients (a)

              Unstandardized Coefficients   Standardized Coefficients                    95% Confidence Interval for B
Model 1       B        Std. Error           Beta                        t        Sig.    Lower Bound   Upper Bound
(Constant)    29.420   9.990                                            2.945    .011    7.837         51.002
SIZE          20.332   5.540                .503                        3.670    .003    8.363         32.301
AGE           -.588    .153                 -.527                       -3.845   .002    -.918         -.258

a. Dependent Variable: PRICE

Summary:

Never test the intercept (constant).

If sig is less than .05, keep the variable (slope not equal to zero).

If sig is greater than .05, consider eliminating the variable from the model (slope could be zero).

If you can’t remember these rules a year from now, look at the confidence interval. Does the confidence interval contain 0 (zero)?
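The confidence-interval rule of thumb can be written as a one-line check. The sketch below applies it to the 95% bounds that SPSS reports for the house price example; the intercept is left out, following the advice never to test it. The function name `contains_zero` is hypothetical.

```python
# 95% confidence bounds for B from the SPSS Coefficients table.
bounds = {
    "SIZE": (8.363, 32.301),
    "AGE": (-0.918, -0.258),
}


def contains_zero(lower, upper):
    """True when zero lies inside the interval, so the slope could be zero."""
    return lower <= 0.0 <= upper


keep = {name: not contains_zero(lower, upper) for name, (lower, upper) in bounds.items()}
print(keep)
```

Neither interval contains zero, which agrees with both Sig. values being below .05.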

Page 33: F-test for Overall Model

$$H_0: \beta_1 = \beta_2 = \beta_3 = \ldots = \beta_p = 0 \qquad H_1: \text{at least one } \beta_j \neq 0$$

Test Statistic:

$$F_C = \frac{SS_{model}/(k-1)}{SSE/(n-k)} = \frac{\text{Explained Variation}/(k-1)}{\text{Unexplained Variation}/(n-k)} = \frac{\sum_i (\hat{Y}_i - \bar{Y})^2/(k-1)}{\sum_i (Y_i - \hat{Y}_i)^2/(n-k)} = \frac{\left(\sum_i \hat{Y}_i^2 - n\bar{Y}^2\right)/(k-1)}{\left(\sum_i Y_i^2 - \sum_i \hat{Y}_i^2\right)/(n-k)}$$

Reject $H_0$ if $F_C > F(k-1,\ n-k,\ \alpha)$, where $F(k-1,\ n-k,\ \alpha)$ = FINV($\alpha$, k-1, n-k),

or Reject $H_0$ if Prob value < $\alpha$, where Prob value = FDIST($F_C$, k-1, n-k).
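A short numerical illustration of the overall F-test, using the sums of squares that SPSS reports for the house price example (SSR = 1146.245, SSE = 225.589, n = 16, k = 3). The critical value 3.81 is an assumption taken from a standard F table for =FINV(0.05,2,13); it does not appear in the slides.

```python
# Sums of squares from the SPSS ANOVA table of the house price example.
ssr = 1146.245   # regression (explained) sum of squares
sse = 225.589    # residual (unexplained) sum of squares
n = 16           # observations
k = 3            # parameters: intercept, Size, Age

F_CRIT = 3.81    # assumption: table value of FINV(0.05, k-1, n-k) = FINV(0.05, 2, 13)

ms_regression = ssr / (k - 1)
ms_residual = sse / (n - k)
f_stat = ms_regression / ms_residual
r_squared = ssr / (ssr + sse)

print(f"F = {f_stat:.3f}, R^2 = {r_squared:.3f}, reject H0: {f_stat > F_CRIT}")
```

The statistic should reproduce the F of 33.027 in the ANOVA table, far above the critical value, so the hypothesis that all slopes are zero is rejected.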

Page 34: Excel, Data Analysis Calculation, Multivariate Case

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.914
R Square            0.836
Adjusted R Square   0.810
Standard Error      4.166
Observations        16

ANOVA
             df   SS       MS      F      Significance F
Regression   2    1146.2   573.1   33.0   0.000
Residual     13   225.6    17.4
Total        15   1371.8

            Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept   29.42          9.99             2.94     0.011     7.84        51.00
Size        20.33          5.54             3.67     0.003     8.36        32.30
Age         -0.59          0.15             -3.85    0.002     -0.92       -0.26

Page 35: SPSS Data Analysis Calculations, Multivariate Case

ANOVA (b)

Model 1       Sum of Squares   df   Mean Square   F        Sig.
Regression    1146.245         2    573.123       33.027   .000 (a)
Residual      225.589          13   17.353
Total         1371.834         15

a. Predictors: (Constant), AGE, SIZE
b. Dependent Variable: PRICE

Page 36: Review of ANOVA Analysis

[Figure: plot of Sales against R/D Expense with the fitted line Sales = 9.15 + 3.46(R/D Expense).]

Green = Residual from mean.

Blue, dashed = portion of residual explained by regression equation.

Red = portion of residual still unexplained after fitting regression equation.

Page 37: Fundamental Concept of ANOVA Analysis

Residual Analysis
Total = Unexplained + Explained

$$(y_i - \bar{y}) = (y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})$$

It can be shown (algebraically complex) that
Total SS = Unexplained SS + Explained SS

$$\sum_i (y_i - \bar{y})^2 = \sum_i (y_i - \hat{y}_i)^2 + \sum_i (\hat{y}_i - \bar{y})^2$$

Page 38: Review of ANOVA Table (1 of 3)

Terminology and Table Calculations

SS (Sum of Squares)                                       df (degrees of freedom)   MS (Mean Squares)                        F
SSR (Sum of Squares Regression)                           k-1                       SSR/(k-1) (Mean Squares Regression)      [SSR/(k-1)] / [SSE/(n-k)]
SSE (Sum of Squares Error, Sum of Squares Residual)       n-k                       SSE/(n-k) (Mean Squares Residual)
SST (Sum of Squares Total)                                n-1

Page 39: Review of ANOVA Table (2 of 3)

Algebraic explanation of terms

                SS                                            df     MS          F
(explained)     $SSR = \sum_i (\hat{y}_i - \bar{y})^2$        k-1    SSR/(k-1)   [SSR/(k-1)] / [SSE/(n-k)]
(unexplained)   $SSE = \sum_i e_i^2 = \sum_i (y_i - \hat{y}_i)^2$   n-k    SSE/(n-k)
(total)         $SST = \sum_i (y_i - \bar{y})^2$              n-1

Page 40: Review of ANOVA Table (3 of 3)

Calculation formulas

                SS                                            df     MS          F
(explained)     $SSR = \sum_i \hat{y}_i^2 - n\bar{y}^2$       k-1    SSR/(k-1)   [SSR/(k-1)] / [SSE/(n-k)]
(unexplained)   $SSE = \sum_i y_i^2 - \sum_i \hat{y}_i^2$     n-k    SSE/(n-k)
(total)         $SST = \sum_i y_i^2 - n\bar{y}^2$             n-1

$$R^2 = SSR/SST \qquad s_e^2 = SSE/(n-k)$$
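The calculation formulas can be checked against the column sums of the Sales versus R D worksheet (sum of Y = 2,683.00, sum of Y squared = 282,983.0, sum of predicted Y squared = 274,991.47, n = 28, k = 2). The sketch below is only an illustration; the variable names are not part of the course material.

```python
# Column sums from the Excel worksheet of the Sales vs. R D example.
sum_y = 2683.00
sum_y2 = 282983.0        # sum of Y^2
sum_yhat2 = 274991.47    # sum of (Y pred)^2
n, k = 28, 2

y_bar = sum_y / n
ssr = sum_yhat2 - n * y_bar ** 2     # explained
sse = sum_y2 - sum_yhat2             # unexplained
sst = sum_y2 - n * y_bar ** 2        # total

r_squared = ssr / sst
s_e_squared = sse / (n - k)
f_stat = (ssr / (k - 1)) / s_e_squared

print(f"SSR = {ssr:.3f}, SSE = {sse:.3f}, SST = {sst:.3f}")
print(f"R^2 = {r_squared:.4f}, s_e^2 = {s_e_squared:.3f}, F = {f_stat:.3f}")
```

Up to the rounding of the worksheet sums, the three sums of squares should match the SPSS ANOVA table for this example (17902.572, 7991.535 and 25894.107), and SSR + SSE equals SST by construction.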

Page 41: Review of ANOVA Table (1 of 3)

Matrix explanation of terms
Regression prediction compared to prediction mean of y

                SS                                                               df     MS          F
(explained)     $SSR = \mathbf{b'X'Y} - n\bar{y}^2_{original}$                   k-1    SSR/(k-1)   [SSR/(k-1)] / [SSE/(n-k)]
(unexplained)   $SSE = \mathbf{Y'Y} - \mathbf{b'X'Y}$                            n-k    SSE/(n-k)
(total)         $SST = \mathbf{Y'Y} - n\bar{y}^2_{original}$                     n-1

Page 42: Review of ANOVA Table (2 of 3)

Alternative matrix explanation of terms
Regression prediction compared to prediction 0 (zero)

                SS                                          df     MS          F
(explained)     $SSR = \mathbf{b'X'Y}$                      k      SSR/k       [SSR/k] / [SSE/(n-k)]
(unexplained)   $SSE = \mathbf{Y'Y} - \mathbf{b'X'Y}$       n-k    SSE/(n-k)
(total)         $SST = \mathbf{Y'Y}$                        n

Page 43: Review of ANOVA Table (3 of 3)

Alternative matrix explanation of terms
Regression prediction compared to prediction mean of Y & 0 (zero)

                       SS                                                                 df     MS          F
(explained)            $SS_{b_0} = n\bar{y}^2_{original}$                                 1
(explained)            $SSR_{b \mid b_0} = \mathbf{b'X'Y} - n\bar{y}^2_{original}$        k-1    SSR/(k-1)   [SSR/(k-1)] / [SSE/(n-k)]
(unexplained)          $SSE = \mathbf{Y'Y} - \mathbf{b'X'Y}$                              n-k    SSE/(n-k)
(total)                $SST = \mathbf{Y'Y} - n\bar{y}^2_{original}$                       n-1
(uncorrected total)    $SST = \mathbf{Y'Y}$                                               n

Page 44: Statistical Assumptions

Statistical Assumptions
0. The expected value of the residuals is zero, $E(\varepsilon_i) = 0$. The algebraic equation is the correct functional form and accurately predicts $E(Y_{i,j})$ for all j.

Inference Assumptions
1. The residual variance is constant. That is, $\sigma_{i,j}^2 = \sigma^2$ for all $X_{i,j}$ and all i and j. The variance of the observations ($Y_{i,j}$) does not change as more observations are obtained and/or as different values of $X_j$ are observed.

2. The observations are statistically independent. That is, $Y_{i,j}$ is statistically independent of all other $Y_{i',j}$ values for all i (with j fixed). Knowing the current value of Y does not provide insights into the value of the next Y.

3. The residual errors are normally distributed. The $\varepsilon_{i,j}$ terms are $N(0, \sigma^2)$.