sda 3e chapter 6 (1)

30
© 2007 Pearson Education Chapter 6: Regression  Analysis Part 1: Simple Linear Regression

Upload: xinearpinger

Post on 07-Apr-2018

220 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 1/30

© 2007 Pearson Education

Chapter 6: Regression Analysis

Part 1: Simple Linear Regression

Page 2: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 2/30

Regression Analysis Building models that characterize the

relationships between a dependent variable

and one (single) or more (multiple)independent variables, all of which arenumerical, for example:

Sales = a + b*Price + c*Coupons + 

d*Advertising + e*Price*Advertising   Cross-sectional data

Time series data (forecasting)

Page 3: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 3/30

Simple Linear Regression Single independent variable

Linear relationship

Page 4: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 4/30

SLR ModelY =  0 +  1   X + e  

Intercept slope error

E(Y X)

f(Y X)

Page 5: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 5/30

Error Terms (Residuals)e 

i   = Y 

i -  0 -  1   X 

i  

Page 6: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 6/30

Estimation of the Regression

LineTrue regression line (unknown): 

Y =  0 +  1   X + e 

Estimated regression line:

Y = b0

+ b1  X 

Observed errors:

ei= Y i - b

0- b

1  X i 

Page 7: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 7/30

Least Squares Regression=

+-n

i

ii X bbY 1

2

10])[(

2

1

2

1

1

 X n X 

Y  X nY  X 

bn

ii

n

i

ii

-

-

=

=

=

b0

=     Y - b1     X  

minimize

a

bc

Page 8: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 8/30

Excel Trendlines Construct a scatter diagram

Method 1: Select Chart/Add Trendline  

Method 2: Select data series; right click  

= 0.0854(6000) – 108.59

= 403.816000ˆY 

Page 9: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 9/30

Regression and

Investment Risk  Systematic risk  – variation in stock price

explained by the market

Measured by beta

Beta = 1: perfect match to marketmovements

Beta < 1: stock is less volatile than market Beta > 1: stock is more volatile than

market

Page 10: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 10/30

Systematic Risk (Beta)

Beta is the slope of theregression line

Page 11: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 11/30

Theory Without Regression

Y

Best estimate for Y is the

mean; independent of the

value of X

A measure of total variation is

SST =  (Y i -    Y)2 

Unexplained

variation

Xi 

Yi 

Page 12: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 12/30

Theory With RegressionObserved values Yi 

Fitted values Yi 

Xi 

Yi 

Variation unexplained

after regression, Y - Y

Fitted line

Y = b0 + b1X

Y

Variation explained

by regression, Y - Y

Page 13: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 13/30

Sums of SquaresSST =  (Y 

i-    Y)2 

=  (   – Y )

2

+  (Y - )2

 Y ˆ

Y ˆ

= SSR + SSE

Explained variation Unexplained variation

Page 14: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 14/30

Coefficient of Determination R 2 = SSR/SST = (SST  – SSE)/SST = 1  – SSE/SST =

coefficient of determination: the proportion of variation explained by the independent variable

(regression model)

0 R 2  1

 Adjusted R 2 incorporates sample size and number of explanatory variables (in multiple regression models).

Useful for comparing against models with differentnumbers of explanatory variables.

-

---=

2

111

22

n

n R Radj

Page 15: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 15/30

Correlation Coefficient Sample correlation coefficient 

R = R 2 

Properties

-1 R  1

R = 1 => perfect positive correlation

R = -1 => perfect negative correlation

R = 0 => no correlation

Page 16: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 16/30

Standard Error of the Estimate MSE = SSE/(n-2) = an unbiased

estimate of the variance of the errors

about the regression line Standard error of the estimate is

Measures the spread of data about the line

SYX = 

2-n

SSE 

Page 17: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 17/30

Confidence Interval for Mean

 Value of Y  Confidence intervals depend on the value of 

the independent variable

t /2, n-2 SYXhi iY ˆ

=

-

-+= n

i

i

ii

 X  X 

 X  X 

nh

1

2

2

)(

)(1

Page 18: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 18/30

Prediction Intervals Prediction intervals apply to individual

(not mean) values of the dependent

variable

t /2, n-2 SYX1+hi iY ˆ

Page 19: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 19/30

PHStat Simple Linear

Regression

Define range of dependentand independent variables

Choose regression options

Choose output options

Page 20: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 20/30

Regression Statistics and

Confidence Interval Outputs

Page 21: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 21/30

Regression as ANOVA  Testing for significance of regression

SST = SSR + SSE

The null hypothesis implies that SST = SSE, or SSR = 0

MSR = SSR/1 = variance explained by regression

F = MSR/MSE

If F > critical value, it is likely that 1  0, or that theregression line is significant

H0: 1 = 0

H1: 1  0

Page 22: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 22/30

t-test for Significance of 

Regression

= -

-=

n

iiYX  X  X S

bt 

1

2

11

)( / 

  with n-2 degrees of freedom

This allows you to test

H0: slope = 1 

H1: slope  1 

Page 23: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 23/30

Excel Regression Tool Excel menu > Tools > Data Analysis >

Regression 

Input variable ranges

Check appropriateboxes

Select output options

Page 24: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 24/30

Regression OutputCorrelation coefficient

SYX 

b0 b1 

p-value for

significance of 

regression

t- test for slope Confidence

interval for slope

Page 25: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 25/30

Residuals

Standard residuals are residuals divided by their standard

error, expressed in units independent of the units of the

data.

Page 26: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 26/30

 Assumptions Underlying

Regression Linearity

Check with scatter diagram of the data or the residual plot

Normally distributed errors for each X with mean 0 and constant

variance Examine histogram of standardized residuals or use goodness-of-fit

tests

Homoscedasticity  – constant variance about the regression linefor all values of the independent variable

Examine by plotting residuals and looking for differences invariances at different values of X

No autocorrelation. Residuals should be independent for eachvalue of the independent variable. Important if the independentvariable is time (e.g., forecasting models).

Page 27: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 27/30

Residual Plot

Page 28: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 28/30

Histogram of Residuals

Page 29: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 29/30

Evaluating HomoscedasticityOK 

Heteroscadastic

Page 30: SDA 3E Chapter 6 (1)

8/4/2019 SDA 3E Chapter 6 (1)

http://slidepdf.com/reader/full/sda-3e-chapter-6-1 30/30

 Autocorrelation Durbin-Watson statistic

D < 1 suggest autocorrelation

D > 1.5 suggest no autocorrelation

D > 2.5 suggest negative autocorrelation

PHStat tool calculates this statistic

=

=

--

=n

i

i

n

i

ii

e

ee

 D

1

2

2

2

1 )(