sda 3e chapter 6 (1)
Post on 07-Apr-2018
220 Views
Preview:
TRANSCRIPT
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 1/30
© 2007 Pearson Education
Chapter 6: Regression Analysis
Part 1: Simple Linear Regression
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 2/30
Regression Analysis Building models that characterize the
relationships between a dependent variable
and one (single) or more (multiple)independent variables, all of which arenumerical, for example:
Sales = a + b*Price + c*Coupons +
d*Advertising + e*Price*Advertising Cross-sectional data
Time series data (forecasting)
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 3/30
Simple Linear Regression Single independent variable
Linear relationship
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 4/30
SLR ModelY = 0 + 1 X + e
Intercept slope error
E(Y X)
f(Y X)
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 5/30
Error Terms (Residuals)e
i = Y
i - 0 - 1 X
i
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 6/30
Estimation of the Regression
LineTrue regression line (unknown):
Y = 0 + 1 X + e
Estimated regression line:
Y = b0
+ b1 X
Observed errors:
ei= Y i - b
0- b
1 X i
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 7/30
Least Squares Regression=
+-n
i
ii X bbY 1
2
10])[(
2
1
2
1
1
X n X
Y X nY X
bn
ii
n
i
ii
-
-
=
=
=
b0
= Y - b1 X
minimize
a
bc
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 8/30
Excel Trendlines Construct a scatter diagram
Method 1: Select Chart/Add Trendline
Method 2: Select data series; right click
= 0.0854(6000) – 108.59
= 403.816000ˆY
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 9/30
Regression and
Investment Risk Systematic risk – variation in stock price
explained by the market
Measured by beta
Beta = 1: perfect match to marketmovements
Beta < 1: stock is less volatile than market Beta > 1: stock is more volatile than
market
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 10/30
Systematic Risk (Beta)
Beta is the slope of theregression line
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 11/30
Theory Without Regression
Y
Best estimate for Y is the
mean; independent of the
value of X
A measure of total variation is
SST = (Y i - Y)2
Unexplained
variation
Xi
Yi
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 12/30
Theory With RegressionObserved values Yi
Fitted values Yi
Xi
Yi
Variation unexplained
after regression, Y - Y
Fitted line
Y = b0 + b1X
Y
Variation explained
by regression, Y - Y
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 13/30
Sums of SquaresSST = (Y
i- Y)2
= ( – Y )
2
+ (Y - )2
Y ˆ
Y ˆ
= SSR + SSE
Explained variation Unexplained variation
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 14/30
Coefficient of Determination R 2 = SSR/SST = (SST – SSE)/SST = 1 – SSE/SST =
coefficient of determination: the proportion of variation explained by the independent variable
(regression model)
0 R 2 1
Adjusted R 2 incorporates sample size and number of explanatory variables (in multiple regression models).
Useful for comparing against models with differentnumbers of explanatory variables.
-
---=
2
111
22
n
n R Radj
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 15/30
Correlation Coefficient Sample correlation coefficient
R = R 2
Properties
-1 R 1
R = 1 => perfect positive correlation
R = -1 => perfect negative correlation
R = 0 => no correlation
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 16/30
Standard Error of the Estimate MSE = SSE/(n-2) = an unbiased
estimate of the variance of the errors
about the regression line Standard error of the estimate is
Measures the spread of data about the line
SYX =
2-n
SSE
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 17/30
Confidence Interval for Mean
Value of Y Confidence intervals depend on the value of
the independent variable
t /2, n-2 SYXhi iY ˆ
=
-
-+= n
i
i
ii
X X
X X
nh
1
2
2
)(
)(1
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 18/30
Prediction Intervals Prediction intervals apply to individual
(not mean) values of the dependent
variable
t /2, n-2 SYX1+hi iY ˆ
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 19/30
PHStat Simple Linear
Regression
Define range of dependentand independent variables
Choose regression options
Choose output options
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 20/30
Regression Statistics and
Confidence Interval Outputs
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 21/30
Regression as ANOVA Testing for significance of regression
SST = SSR + SSE
The null hypothesis implies that SST = SSE, or SSR = 0
MSR = SSR/1 = variance explained by regression
F = MSR/MSE
If F > critical value, it is likely that 1 0, or that theregression line is significant
H0: 1 = 0
H1: 1 0
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 22/30
t-test for Significance of
Regression
= -
-=
n
iiYX X X S
bt
1
2
11
)( /
with n-2 degrees of freedom
This allows you to test
H0: slope = 1
H1: slope 1
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 23/30
Excel Regression Tool Excel menu > Tools > Data Analysis >
Regression
Input variable ranges
Check appropriateboxes
Select output options
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 24/30
Regression OutputCorrelation coefficient
SYX
b0 b1
p-value for
significance of
regression
t- test for slope Confidence
interval for slope
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 25/30
Residuals
Standard residuals are residuals divided by their standard
error, expressed in units independent of the units of the
data.
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 26/30
Assumptions Underlying
Regression Linearity
Check with scatter diagram of the data or the residual plot
Normally distributed errors for each X with mean 0 and constant
variance Examine histogram of standardized residuals or use goodness-of-fit
tests
Homoscedasticity – constant variance about the regression linefor all values of the independent variable
Examine by plotting residuals and looking for differences invariances at different values of X
No autocorrelation. Residuals should be independent for eachvalue of the independent variable. Important if the independentvariable is time (e.g., forecasting models).
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 27/30
Residual Plot
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 28/30
Histogram of Residuals
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 29/30
Evaluating HomoscedasticityOK
Heteroscadastic
8/4/2019 SDA 3E Chapter 6 (1)
http://slidepdf.com/reader/full/sda-3e-chapter-6-1 30/30
Autocorrelation Durbin-Watson statistic
D < 1 suggest autocorrelation
D > 1.5 suggest no autocorrelation
D > 2.5 suggest negative autocorrelation
PHStat tool calculates this statistic
=
=
--
=n
i
i
n
i
ii
e
ee
D
1
2
2
2
1 )(
top related