doane chapter 12a
TRANSCRIPT
-
7/29/2019 Doane Chapter 12a
1/36
Bivariate Regression (Part 1)
Chapter 12
Visual Displays and Correlation Analysis
Bivariate Regression
Regression Terminology
Ordinary Least Squares Formulas
Tests for Significance
Visual Displays and Correlation Analysis
Visual Displays
Begin the analysis of bivariate data (i.e., two variables) with a scatter plot.
A scatter plot
- displays each observed data pair (x_i, y_i) as a dot on an X/Y grid
- indicates visually the strength of the relationship between the two variables
McGraw-Hill/Irwin 2007 The McGraw-Hill Companies, Inc. All rights reserved.
Correlation Analysis
The sample correlation coefficient (r) measures the degree of linearity in the relationship between X and Y.
-1 ≤ r ≤ +1
Values of r near -1 indicate a strong negative relationship; values near +1 indicate a strong positive relationship.
r = 0 indicates no linear relationship.
In Excel, use =CORREL(array1,array2), where array1 is the range for X and array2 is the range for Y.
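As a rough illustration of what the correlation coefficient measures, the sketch below computes r from its usual definition (cross-products of deviations divided by the square root of the product of the sums of squared deviations). The function name and the small data set are hypothetical and are not taken from the chapter.

```python
import math

def sample_correlation(x, y):
    """Pearson sample correlation coefficient r for paired data."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Hypothetical illustration data (not from the chapter).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = sample_correlation(x, y)
print(round(r, 4))  # 0.7746
```

With the same two columns in a worksheet, Excel's =CORREL should return the same value.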
[Figure: scatter plots illustrating Strong Positive Correlation, Weak Positive Correlation, Weak Negative Correlation, Strong Negative Correlation, No Correlation, and Nonlinear Relation]
Tests for Significance
r is an estimate of the population correlation coefficient ρ (rho).
To test the hypothesis H0: ρ = 0, the test statistic is:
$$t = r\sqrt{\frac{n-2}{1-r^2}}$$
The critical value t_α is obtained from Appendix D using ν = n − 2 degrees of freedom for any α.
Find the p-value using Excel's function =TDIST(t,deg_freedom,tails) or MINITAB.
Equivalently, you can calculate the critical value for the correlation coefficient using
$$r_\alpha = \frac{t_\alpha}{\sqrt{t_\alpha^2 + n - 2}}$$
This method gives a benchmark for the correlation coefficient. However, it yields no p-value and is inflexible if you change your mind about α.
Steps in Testing if ρ = 0
Step 1: State the Hypotheses
Determine whether you are using a one- or two-tailed test and the level of significance (α).
H0: ρ = 0
H1: ρ ≠ 0
Step 2: Calculate the Critical Value
For degrees of freedom ν = n − 2, look up the critical value t_α in Appendix D, then calculate
$$r_\alpha = \frac{t_\alpha}{\sqrt{t_\alpha^2 + n - 2}}$$
Step 3: Make the Decision
If the sample correlation coefficient r exceeds the critical value r_α, then reject H0.
If using the t statistic method, reject H0 if t > t_α or if the p-value < α.
For a two-tailed test, compare |r| and |t| with the critical values.
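The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample values (n = 5, r = 0.7746) are hypothetical, the function names are invented for this sketch, and the critical value 3.182 is the two-tailed t value for α = .05 with 3 degrees of freedom as read from a standard t table.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def critical_r(t_alpha, n):
    """Critical correlation r_alpha implied by a critical t value."""
    return t_alpha / math.sqrt(t_alpha ** 2 + n - 2)

# Hypothetical example: n = 5 pairs with sample correlation r = 0.7746.
n = 5
r = 0.7746
# Assumption: two-tailed critical value for alpha = .05 and 3 d.f.,
# taken from a standard t table.
t_alpha = 3.182

t_stat = correlation_t_statistic(r, n)
r_alpha = critical_r(t_alpha, n)

# Both decision rules compare absolute values for a two-tailed test.
reject_by_t = abs(t_stat) > t_alpha
reject_by_r = abs(r) > r_alpha
print(f"t = {t_stat:.3f}, r_alpha = {r_alpha:.3f}, reject H0: {reject_by_t}")
```

The two rules always agree, because plugging r_α into the t formula returns t_α exactly. In this example neither threshold is exceeded, so H0 is not rejected.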
Quick Rule for Significance
A quick test for significance of a correlation at α = .05 is
$$|r| > \frac{2}{\sqrt{n}}$$
Role of Sample Size
As sample size increases, the critical value of r becomes smaller.
This makes it easier for smaller values of the sample correlation coefficient to be considered significant.
A larger sample does not mean that the correlation is stronger, nor does its significance imply importance.
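To see how the threshold shrinks with sample size, the sketch below evaluates the quick rule |r| > 2/√n for a few sample sizes. The function names and the sample sizes are hypothetical choices for illustration, and the quick rule is only an approximation to the exact critical value.

```python
import math

def quick_rule_threshold(n):
    """Approximate two-tailed critical |r| at alpha = .05: 2 / sqrt(n)."""
    return 2 / math.sqrt(n)

def is_significant_quick(r, n):
    """Apply the quick rule |r| > 2 / sqrt(n)."""
    return abs(r) > quick_rule_threshold(n)

# Hypothetical sample sizes chosen for illustration.
for n in (25, 100, 400):
    print(n, round(quick_rule_threshold(n), 3))
```

Under this rule a correlation of 0.25 is not significant with n = 25 (threshold 0.4) but is significant with n = 100 (threshold 0.2), even though the strength of the relationship is the same.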
Bivariate Regression
What is Bivariate Regression?
Bivariate regression analyzes the relationship between two variables.
It specifies one dependent (response) variable and one independent (predictor) variable.
The hypothesized relationship may be linear, quadratic, or some other form.
[Figure: Model Form]
Regression Terminology
Models and Parameters
Unknown parameters are
β_0 Intercept
β_1 Slope
The assumed model for a linear relationship is
y_i = β_0 + β_1 x_i + ε_i for all observations (i = 1, 2, …, n)
The error term ε_i is not observable and is assumed to be normally distributed with a mean of 0 and standard deviation σ.
The fitted model used to predict the expected value of Y for a given value of X is
$$\hat{y}_i = b_0 + b_1 x_i$$
The fitted coefficients are
b_0 the estimated intercept
b_1 the estimated slope
The residual is e_i = y_i − ŷ_i. Residuals may be used to estimate σ, the standard deviation of the errors.
Fitting a Regression on a Scatter Plot in Excel
Step 1:
- Highlight the data columns.
- Click on the Chart Wizard and choose Scatter Plot.
- In the completed graph, click once on the points in the scatter plot to select the data.
- Right-click and choose Add Trendline.
- Choose Options and check Display Equation.
Ordinary Least Squares Formulas
Slope and Intercept
The ordinary least squares (OLS) method estimates the slope and intercept of the regression line so that the residuals are small. Specifically, it chooses b_0 and b_1 to minimize the sum of the squared residuals.
The sum of the residuals = 0
The sum of the squared residuals is SSE
The OLS estimator for the slope is:
$$b_1 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2}$$
The OLS estimator for the intercept is:
$$b_0 = \bar{y} - b_1\bar{x}$$
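A minimal sketch of the OLS calculation, assuming the standard deviation-form estimators (slope = sum of cross-products over sum of squared x deviations, intercept = ȳ − b_1 x̄). The function name and the data are hypothetical; the example also checks that the residuals sum to zero and computes SSE.

```python
def ols_fit(x, y):
    """Return (b0, b1), the OLS intercept and slope for paired data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical illustration data (not from the chapter).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = ols_fit(x, y)

# Fitted values, residuals, and the error sum of squares.
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
sse = sum(e ** 2 for e in residuals)

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, SSE = {sse:.2f}")
print("residuals sum to zero:", abs(sum(residuals)) < 1e-9)
```

For these data the fitted line is ŷ = 2.2 + 0.6x, which should match the trendline equation Excel displays for the same points.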
Assessing Fit
We want to explain the total variation in Y around its mean (SST, the total sum of squares):
$$SST = \sum_{i=1}^{n}(y_i-\bar{y})^2$$
The regression sum of squares (SSR) is the explained variation in Y:
$$SSR = \sum_{i=1}^{n}(\hat{y}_i-\bar{y})^2$$
The error sum of squares (SSE) is the unexplained variation in Y:
$$SSE = \sum_{i=1}^{n}(y_i-\hat{y}_i)^2$$
If the fit is good, SSE will be relatively small compared to SST.
A perfect fit is indicated by SSE = 0.
The magnitude of SSE depends on n and on the units of measurement.
Coefficient of Determination
R² is a measure of relative fit based on a comparison of SSR and SST.
$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
0 ≤ R² ≤ 1
Often expressed as a percent, an R² = 1 (i.e., 100%) indicates perfect fit.
In a bivariate regression, R² = r².
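The decomposition of variation can be checked numerically. The sketch below uses hypothetical data, fits the line by OLS, and then computes SST, SSR, SSE, and R² = SSR/SST, comparing R² with the squared sample correlation. All variable names are illustrative.

```python
import math

# Hypothetical illustration data (not from the chapter).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# OLS fit.
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

# Total, explained, and unexplained variation in Y.
sst = sum((yi - y_bar) ** 2 for yi in y)
ssr = sum((fi - y_bar) ** 2 for fi in fitted)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

r_squared = ssr / sst
r = ss_xy / math.sqrt(ss_xx * sst)  # sample correlation coefficient

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"R^2 = {r_squared:.2f}, r^2 = {r ** 2:.2f}")
```

Here SST = SSR + SSE (6.0 = 3.6 + 2.4), and R² equals r², as expected for a bivariate regression.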
Tests for Significance
Standard Error of Regression
The standard error (s_yx) is an overall measure of model fit.
$$s_{yx} = \sqrt{\frac{SSE}{n-2}}$$
If the fitted model's predictions are perfect (SSE = 0), then s_yx = 0. Thus, a small s_yx indicates a better fit.
Used to construct confidence intervals.
Magnitude of s_yx depends on the units of measurement of Y and on data magnitude.
Confidence Intervals for Slope and Intercept
Standard error of the slope:
$$s_{b_1} = \frac{s_{yx}}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}$$
Standard error of the intercept:
$$s_{b_0} = s_{yx}\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}}$$
Confidence interval for the true slope:
$$b_1 - t_{n-2}\,s_{b_1} \le \beta_1 \le b_1 + t_{n-2}\,s_{b_1}$$
Confidence interval for the true intercept:
$$b_0 - t_{n-2}\,s_{b_0} \le \beta_0 \le b_0 + t_{n-2}\,s_{b_0}$$
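The standard errors and intervals can be sketched as follows, assuming the usual textbook formulas: s_yx = sqrt(SSE/(n − 2)), the slope's standard error is s_yx divided by the square root of the sum of squared x deviations, and each interval is the estimate plus or minus t times its standard error. The data are hypothetical, and the value 3.182 is the t table entry for 95% confidence with 3 degrees of freedom.

```python
import math

# Hypothetical illustration data (not from the chapter).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# OLS fit.
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Standard error of the regression, using n - 2 degrees of freedom.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_yx = math.sqrt(sse / (n - 2))

# Standard errors of the slope and intercept.
s_b1 = s_yx / math.sqrt(ss_xx)
s_b0 = s_yx * math.sqrt(1 / n + x_bar ** 2 / ss_xx)

# Assumption: t table value for 95% confidence with n - 2 = 3 d.f.
t_crit = 3.182
slope_ci = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)
intercept_ci = (b0 - t_crit * s_b0, b0 + t_crit * s_b0)

print(f"s_yx = {s_yx:.4f}, s_b1 = {s_b1:.4f}, s_b0 = {s_b0:.4f}")
print(f"slope CI: ({slope_ci[0]:.3f}, {slope_ci[1]:.3f})")
print(f"intercept CI: ({intercept_ci[0]:.3f}, {intercept_ci[1]:.3f})")
```

With only five observations the slope interval is wide and includes zero, which is consistent with failing to reject a zero slope at α = .05.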
Hypothesis Tests
If β_1 = 0, then X cannot influence Y and the regression model collapses to a constant β_0 plus random error.
The hypotheses to be tested are:
H0: β_1 = 0
H1: β_1 ≠ 0
A t test is used with ν = n − 2 degrees of freedom. The test statistics for the slope and intercept are:
Slope:
$$t = \frac{b_1 - 0}{s_{b_1}}$$
Intercept:
$$t = \frac{b_0 - 0}{s_{b_0}}$$
t_{n-2} is obtained from Appendix D or Excel for a given α.
Reject H0 if t > t_α or if p-value < α.
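A short sketch of the two t tests, assuming the estimate-divided-by-standard-error form of the test statistic and a two-tailed comparison against a table value. The data are hypothetical, and 3.182 is the t table value for α = .05 (two-tailed) with 3 degrees of freedom.

```python
import math

# Hypothetical illustration data (not from the chapter).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# OLS fit and standard errors.
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_yx = math.sqrt(sse / (n - 2))
s_b1 = s_yx / math.sqrt(ss_xx)
s_b0 = s_yx * math.sqrt(1 / n + x_bar ** 2 / ss_xx)

# Test statistics for H0: beta_1 = 0 and H0: beta_0 = 0.
t_slope = (b1 - 0) / s_b1
t_intercept = (b0 - 0) / s_b0

# Assumption: two-tailed critical value for alpha = .05 with 3 d.f.
t_crit = 3.182
reject_slope = abs(t_slope) > t_crit
reject_intercept = abs(t_intercept) > t_crit

print(f"slope: t = {t_slope:.3f}, reject H0: {reject_slope}")
print(f"intercept: t = {t_intercept:.3f}, reject H0: {reject_intercept}")
```

In a bivariate regression the slope's t statistic equals the t statistic for testing ρ = 0, so the two tests lead to the same decision.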
[Figure: Using Excel]
[Figure: Using MegaStat]
[Figure: Using MINITAB]
Applied Statistics in Business and Economics
End of Chapter 12, Part 1 of 2