Statistics for Business and Economics: Chapter 14
Course material for Statistics for Business and Economics (Anderson, Sweeney, Williams), Chapter 14
1 Slide
Slides Prepared by JOHN S. LOUCKS
St. Edward’s University
© 2002 South-Western/Thomson Learning
2 Slide
Chapter 14 Simple Linear Regression
Simple Linear Regression Model
Least Squares Method
Coefficient of Determination
Model Assumptions
Testing for Significance
Using the Estimated Regression Equation for Estimation and Prediction
Computer Solution
Residual Analysis: Validating Model Assumptions
Residual Analysis: Outliers and Influential Observations
3 Slide
The Simple Linear Regression Model
Simple Linear Regression Model
$y = \beta_0 + \beta_1 x + \varepsilon$
Simple Linear Regression Equation
$E(y) = \beta_0 + \beta_1 x$
Estimated Simple Linear Regression Equation
$\hat{y} = b_0 + b_1 x$
4 Slide
Least Squares Method
Least Squares Criterion
$\min \sum (y_i - \hat{y}_i)^2$
where:
$y_i$ = observed value of the dependent variable for the ith observation
$\hat{y}_i$ = estimated value of the dependent variable for the ith observation
5 Slide
The Least Squares Method
Slope for the Estimated Regression Equation
$b_1 = \dfrac{\sum x_i y_i - \left(\sum x_i \sum y_i\right)/n}{\sum x_i^2 - \left(\sum x_i\right)^2/n}$
y-Intercept for the Estimated Regression Equation
$b_0 = \bar{y} - b_1 \bar{x}$
where:
$x_i$ = value of independent variable for ith observation
$y_i$ = value of dependent variable for ith observation
$\bar{x}$ = mean value for independent variable
$\bar{y}$ = mean value for dependent variable
n = total number of observations
6 Slide
Example: Reed Auto Sales
Simple Linear Regression
Reed Auto periodically has a special week-long sale. As part of the advertising campaign, Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown below.

Number of TV Ads    Number of Cars Sold
1                   14
3                   24
2                   18
1                   17
3                   27
7 Slide
Example: Reed Auto Sales
Slope for the Estimated Regression Equation
$b_1 = \dfrac{220 - (10)(100)/5}{24 - (10)^2/5} = \dfrac{20}{4} = 5$
y-Intercept for the Estimated Regression Equation
$b_0 = 20 - 5(2) = 10$
Estimated Regression Equation
$\hat{y} = 10 + 5x$
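The slope and intercept above can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `least_squares` is an illustrative choice, not something from the slides, and the data are the five Reed Auto observations.

```python
# Reed Auto data from the slides: TV ads (x) and cars sold (y).
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]


def least_squares(x, y):
    """Return (b0, b1) using the computational formulas from the slides."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    # Slope: (sum xy - (sum x)(sum y)/n) / (sum x^2 - (sum x)^2/n)
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: y-bar - b1 * x-bar
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


b0, b1 = least_squares(x, y)
print(f"y-hat = {b0:.0f} + {b1:.0f}x")
```

The intermediate sums are the same ones that appear in the hand calculation (220, 10, 100, and 24), so the script should agree with the estimated regression equation on the slide.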
8 Slide
Example: Reed Auto Sales
Scatter Diagram: Cars Sold (vertical axis, 0 to 30) versus TV Ads (horizontal axis, 0 to 4), with the fitted line y = 5x + 10
9 Slide
The Coefficient of Determination
Relationship Among SST, SSR, SSE
SST = SSR + SSE
$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$
Coefficient of Determination
$r^2 = \text{SSR}/\text{SST}$
where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error
10 Slide
Example: Reed Auto Sales
Coefficient of Determination
$r^2 = \text{SSR}/\text{SST} = 100/114 = .8772$
The regression relationship is very strong since 88% of the variation in number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.
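The three sums of squares behind this value can be computed directly from the data. The following is a sketch, assuming the fitted equation with intercept 10 and slope 5 from the earlier slide; the variable names are illustrative.

```python
# Reed Auto data and the fitted coefficients from the slides.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
b0, b1 = 10, 5

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]  # predicted cars sold

sst = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # due to regression
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # due to error

r2 = ssr / sst
print(f"SST={sst:.0f} SSR={ssr:.0f} SSE={sse:.0f} r2={r2:.4f}")
```

Checking that SST equals SSR plus SSE is a quick way to catch arithmetic mistakes before computing the coefficient of determination.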
11 Slide
The Correlation Coefficient
Sample Correlation Coefficient
$r_{xy} = (\text{sign of } b_1)\sqrt{\text{Coefficient of Determination}}$
$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$
where:
$b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$
12 Slide
Example: Reed Auto Sales
Sample Correlation Coefficient
$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$
The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is "+".
$r_{xy} = +\sqrt{.8772}$
$r_{xy} = +.9366$
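The sign rule can be written as a small helper. This is a sketch only; `sample_correlation` is a hypothetical name, and the inputs are the slope and the sums of squares from the Reed Auto example.

```python
import math


def sample_correlation(b1, r2):
    """Correlation as (sign of b1) times the square root of r^2."""
    sign = 1 if b1 >= 0 else -1
    return sign * math.sqrt(r2)


# SSR = 100 and SST = 114 from the Reed Auto example; the slope is +5.
r_xy = sample_correlation(b1=5, r2=100 / 114)
print(round(r_xy, 4))
```

A negative slope with the same coefficient of determination would give the same magnitude with a minus sign.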
13 Slide
Model Assumptions
Assumptions About the Error Term $\varepsilon$
• The error $\varepsilon$ is a random variable with mean of zero.
• The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.
• The values of $\varepsilon$ are independent.
• The error $\varepsilon$ is a normally distributed random variable.
14 Slide
Testing for Significance
To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.
Two tests are commonly used:
• t Test
• F Test
Both tests require an estimate of $\sigma^2$, the variance of $\varepsilon$ in the regression model.
15 Slide
Testing for Significance
An Estimate of $\sigma^2$
The mean square error (MSE) provides the estimate of $\sigma^2$, and the notation $s^2$ is also used.
$s^2 = \text{MSE} = \text{SSE}/(n - 2)$
where:
$\text{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$
16 Slide
Testing for Significance
An Estimate of $\sigma$
• To estimate $\sigma$ we take the square root of $s^2$.
• The resulting s is called the standard error of the estimate.
$s = \sqrt{\text{MSE}} = \sqrt{\dfrac{\text{SSE}}{n - 2}}$
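For the Reed Auto data, SSE is 14 and n is 5, so both estimates follow in two lines. This is a sketch; the function name `standard_error_of_estimate` is an illustrative choice.

```python
import math


def standard_error_of_estimate(sse, n):
    """Return (MSE, s) where MSE = SSE/(n - 2) and s = sqrt(MSE)."""
    mse = sse / (n - 2)
    return mse, math.sqrt(mse)


# SSE = 14 and n = 5 for the Reed Auto example.
mse, s = standard_error_of_estimate(sse=14, n=5)
print(f"MSE = {mse:.3f}, s = {s:.4f}")
```

The MSE value is the same 4.667 that appears as the denominator of the F statistic later in the chapter.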
17 Slide
Testing for Significance: t Test
Hypotheses
H0: $\beta_1 = 0$
Ha: $\beta_1 \neq 0$
Test Statistic
$t = \dfrac{b_1}{s_{b_1}}$
Rejection Rule
Reject H0 if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
where $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom.
18 Slide
Example: Reed Auto Sales
t Test
• Hypotheses
H0: $\beta_1 = 0$
Ha: $\beta_1 \neq 0$
• Rejection Rule
For $\alpha$ = .05 and d.f. = 3, $t_{.025}$ = 3.182
Reject H0 if t > 3.182
• Test Statistic
t = 5/1.08 = 4.63
• Conclusion
Reject H0
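The slide uses 1.08 for the standard error of the slope without showing where it comes from. The sketch below assumes the usual formula, s divided by the square root of the sum of squared deviations of x from its mean, which is not written on the slides. The critical value 3.182 is taken from the slide rather than computed.

```python
import math

x = [1, 3, 2, 1, 3]   # TV ads
b1 = 5                # estimated slope
s = math.sqrt(14 / 3) # standard error of the estimate, sqrt(SSE/(n - 2))
t_crit = 3.182        # t_.025 with 3 d.f., as given on the slide

x_bar = sum(x) / len(x)
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Assumed formula (not shown on the slides): s_b1 = s / sqrt(sum (x - x_bar)^2)
s_b1 = s / math.sqrt(sxx)
t = b1 / s_b1
reject_h0 = abs(t) > t_crit  # two-tailed rule from the previous slide

print(f"s_b1 = {s_b1:.2f}, t = {t:.2f}, reject H0: {reject_h0}")
```

Using the absolute value of t covers both tails of the rejection rule in one comparison.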
19 Slide
Confidence Interval for $\beta_1$
We can use a 95% confidence interval for $\beta_1$ to test the hypotheses just used in the t test.
H0 is rejected if the hypothesized value of $\beta_1$ is not included in the confidence interval for $\beta_1$.
20 Slide
Confidence Interval for $\beta_1$
The form of a confidence interval for $\beta_1$ is:
$b_1 \pm t_{\alpha/2}\, s_{b_1}$
where:
$b_1$ is the point estimate
$t_{\alpha/2}\, s_{b_1}$ is the margin of error
$t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with n - 2 degrees of freedom
21 Slide
Example: Reed Auto Sales
Rejection Rule
Reject H0 if 0 is not included in the confidence interval for $\beta_1$.
95% Confidence Interval for $\beta_1$
$b_1 \pm t_{\alpha/2}\, s_{b_1}$ = 5 ± 3.182(1.08) = 5 ± 3.44
or 1.56 to 8.44
Conclusion
Reject H0
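The interval arithmetic above is easy to reproduce. This is a minimal sketch with a hypothetical helper name, using the rounded standard error 1.08 and the table value 3.182 exactly as they appear on the slide.

```python
def slope_confidence_interval(b1, s_b1, t_crit):
    """Return (lower, upper) for b1 +/- t_crit * s_b1."""
    margin = t_crit * s_b1
    return b1 - margin, b1 + margin


lower, upper = slope_confidence_interval(b1=5, s_b1=1.08, t_crit=3.182)
contains_zero = lower <= 0 <= upper

print(f"{lower:.2f} to {upper:.2f}; contains 0: {contains_zero}")
```

Because the hypothesized value 0 falls outside the interval, the check leads to the same conclusion as the t test.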
22 Slide
Testing for Significance: F Test
Hypotheses
H0: $\beta_1 = 0$
Ha: $\beta_1 \neq 0$
Test Statistic
F = MSR/MSE
Rejection Rule
Reject H0 if $F > F_{\alpha}$
where $F_{\alpha}$ is based on an F distribution with 1 d.f. in the numerator and n - 2 d.f. in the denominator.
23 Slide
Example: Reed Auto Sales
F Test
• Hypotheses
H0: $\beta_1 = 0$
Ha: $\beta_1 \neq 0$
• Rejection Rule
For $\alpha$ = .05 and d.f. = 1, 3: $F_{.05}$ = 10.13
Reject H0 if F > 10.13.
• Test Statistic
F = MSR/MSE = 100/4.667 = 21.43
• Conclusion
We can reject H0.
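The F statistic for this example can be reproduced from the sums of squares. The sketch below assumes MSR equals SSR divided by 1, the number of independent variables, which the slides use implicitly; the helper name is illustrative and the critical value is the one given on the slide.

```python
def f_statistic(ssr, sse, n):
    """F = MSR / MSE for simple linear regression."""
    msr = ssr / 1        # one independent variable, so 1 numerator d.f.
    mse = sse / (n - 2)  # n - 2 denominator d.f.
    return msr / mse


f = f_statistic(ssr=100, sse=14, n=5)
f_crit = 10.13           # F_.05 with (1, 3) d.f., as given on the slide
reject_h0 = f > f_crit

print(f"F = {f:.2f}, reject H0: {reject_h0}")
```

With a single independent variable, F equals the square of the t statistic (4.63 squared is about 21.4), so the two tests reach the same conclusion.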
24 Slide
Some Cautions about the Interpretation of Significance Tests
Rejecting H0: $\beta_1 = 0$ and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y.
Just because we are able to reject H0: $\beta_1 = 0$ and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y.
25 Slide
Using the Estimated Regression Equation for Estimation and Prediction
Confidence Interval Estimate of $E(y_p)$
$\hat{y}_p \pm t_{\alpha/2}\, s_{\hat{y}_p}$
Prediction Interval Estimate of $y_p$
$\hat{y}_p \pm t_{\alpha/2}\, s_{\text{ind}}$
where the confidence coefficient is $1 - \alpha$ and $t_{\alpha/2}$ is based on a t distribution with n - 2 d.f.
26 Slide
Example: Reed Auto Sales
Point Estimation
If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be:
$\hat{y}$ = 10 + 5(3) = 25 cars
Confidence Interval for $E(y_p)$
95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is:
25 ± 4.61 = 20.39 to 29.61 cars
Prediction Interval for $y_p$
95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is:
25 ± 8.28 = 16.72 to 33.28 cars
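The margins 4.61 and 8.28 are stated without their standard errors. The sketch below assumes the standard formulas, which are not written on the slides: the standard error for the mean response is s times the square root of 1/n + (x_p - x-bar)^2 / sum (x_i - x-bar)^2, and the standard error for an individual value adds 1 under the square root. The function name `interval_margins` is illustrative.

```python
import math

x = [1, 3, 2, 1, 3]    # TV ads
b0, b1 = 10, 5         # fitted coefficients from the slides
s = math.sqrt(14 / 3)  # standard error of the estimate
t_crit = 3.182         # t_.025 with 3 d.f., as given on the slides


def interval_margins(x_p):
    """Return (point estimate, CI margin, PI margin) at x_p.

    Uses the assumed textbook formulas described in the lead-in.
    """
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    y_p = b0 + b1 * x_p
    h = 1 / n + (x_p - x_bar) ** 2 / sxx
    ci_margin = t_crit * s * math.sqrt(h)      # for the mean E(y_p)
    pi_margin = t_crit * s * math.sqrt(1 + h)  # for an individual y_p
    return y_p, ci_margin, pi_margin


y_p, ci, pi = interval_margins(3)
print(f"CI: {y_p - ci:.2f} to {y_p + ci:.2f}")
print(f"PI: {y_p - pi:.2f} to {y_p + pi:.2f}")
```

The prediction interval is wider because it accounts for the variability of a single week around the mean as well as the uncertainty in the estimated mean.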
27 Slide
Residual Analysis
Residual for Observation i
$y_i - \hat{y}_i$
Standardized Residual for Observation i
$\dfrac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}}$
where:
$s_{y_i - \hat{y}_i} = s\sqrt{1 - h_i}$
28 Slide
Example: Reed Auto Sales
Residuals

Observation    Predicted Cars Sold    Residuals
1              15                     -1
2              25                     -1
3              20                     -2
4              15                      2
5              25                      2
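The residuals in the table, and the standardized residuals defined on the previous slide, can be computed as follows. The slides do not define h_i; this sketch assumes the usual leverage formula for simple linear regression, h_i = 1/n + (x_i - x-bar)^2 / sum (x_i - x-bar)^2.

```python
import math

x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
b0, b1 = 10, 5

n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, predicted)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Leverage h_i: assumed textbook formula, not shown on the slides.
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
standardized = [
    e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)
]

for i, (yh, e, z) in enumerate(zip(predicted, residuals, standardized), 1):
    print(f"{i}  {yh:>3}  {e:>3}  {z:6.2f}")
```

Each row prints the observation number, the predicted value, the residual, and the standardized residual, so the first three columns can be compared against the table.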
29 Slide
Example: Reed Auto Sales
Residual Plot: Residuals (vertical axis, -3 to 3) versus TV Ads (horizontal axis, 0 to 4)
30 Slide
Residual Analysis
Detecting Outliers
• An outlier is an observation that is unusual in comparison with the other data.
• Minitab classifies an observation as an outlier if its standardized residual value is < -2 or > +2.
• This standardized residual rule sometimes fails to identify an unusually large observation as being an outlier.
• This rule’s shortcoming can be circumvented by using studentized deleted residuals.
• The |i th studentized deleted residual| will be larger than the |i th standardized residual|.
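The ±2 rule on standardized residuals is simple to express in code. This is a sketch of the rule as described above, not of Minitab's implementation; the function name and the second list of values are hypothetical and included only to show a flagged observation.

```python
def flag_outliers(standardized_residuals, threshold=2.0):
    """Return 1-based observation numbers whose standardized residual
    is below -threshold or above +threshold."""
    return [
        i
        for i, z in enumerate(standardized_residuals, start=1)
        if abs(z) > threshold
    ]


# Approximate standardized residuals for the Reed Auto data, computed
# with the assumed leverage formula for simple linear regression.
reed = [-0.62, -0.62, -1.04, 1.25, 1.25]

# Hypothetical values, invented only to illustrate the rule.
hypothetical = [0.3, -2.4, 1.1, 2.1]

print(flag_outliers(reed))
print(flag_outliers(hypothetical))
```

None of the Reed Auto observations exceeds the threshold, while the second and fourth hypothetical values would be flagged.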
31 Slide
End of Chapter 14