Statistics for Business and Economics Chapter 11 Multiple Regression and Model Building

Upload: sukey

Post on 15-Jan-2016


Page 1: Statistics for Business and Economics

Statistics for Business and Economics

Chapter 11 Multiple Regression and Model Building

Page 2: Statistics for Business and Economics

Learning Objectives

1. Explain the Linear Multiple Regression Model

2. Describe Inference About Individual Parameters

3. Test Overall Significance

4. Explain Estimation and Prediction

5. Describe Various Types of Models

6. Describe Model Building

7. Explain Residual Analysis

8. Describe Regression Pitfalls

Page 3: Statistics for Business and Economics

Types of Regression Models

Regression models

• Simple (1 explanatory variable): linear or non-linear

• Multiple (2+ explanatory variables): linear or non-linear

Page 4: Statistics for Business and Economics

Models With Two or More Quantitative Variables

Page 6: Statistics for Business and Economics

Multiple Regression Model

• General form: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$

• k independent variables

• x1, x2, …, xk may be functions of variables

— e.g. x2 = (x1)²

Page 7: Statistics for Business and Economics

Regression Modeling Steps

1. Hypothesize deterministic component

2. Estimate unknown model parameters

3. Specify probability distribution of random error term

• Estimate standard deviation of error

4. Evaluate model

5. Use model for prediction and estimation

Page 8: Statistics for Business and Economics

Probability Distribution of Random Error

Page 10: Statistics for Business and Economics

Assumptions for Probability Distribution of ε

1. Mean is 0

2. Constant variance, σ²

3. Normally Distributed

4. Errors are independent

Page 11: Statistics for Business and Economics

Linear Multiple Regression Model

Page 12: Statistics for Business and Economics

Types of Regression Models

Explanatory variable

• 1 quantitative variable: 1st-order model, 2nd-order model, 3rd-order model

• 2 or more quantitative variables: 1st-order model, 2nd-order model, interaction model

• 1 qualitative variable: dummy-variable model

Page 14: Statistics for Business and Economics

First-Order Multiple Regression Model

Relationship between 1 dependent and 2 or more independent variables is a linear function

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$

• y: dependent (response) variable

• x1, …, xk: independent (explanatory) variables

• β0: population y-intercept

• β1, …, βk: population slopes

• ε: random error

Page 15: Statistics for Business and Economics

First-Order Model With 2 Independent Variables

• Relationship between 1 dependent and 2 independent variables is a linear function

• Model: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$

• Assumes no interaction between x1 and x2

— Effect of x1 on E(y) is the same regardless of x2 values

Page 16: Statistics for Business and Economics

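Population Multiple Regression Model

Bivariate model:

$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$

$E(y) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i}$

[Figure: response plane over the (x1, x2) axes with intercept β0 on the y axis; the observed y at (x1i, x2i) differs from the plane by εi]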

Page 17: Statistics for Business and Economics

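Sample Multiple Regression Model

Bivariate model:

$y_i = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i} + \hat{\varepsilon}_i$

$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i}$

[Figure: estimated response plane over the (x1, x2) axes with intercept β̂0 on the y axis; the observed y at (x1i, x2i) differs from the plane by the residual ε̂i]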

Page 18: Statistics for Business and Economics

No Interaction

Effect (slope) of x1 on E(y) does not depend on x2 value

E(y) = 1 + 2x1 + 3x2

• x2 = 0: E(y) = 1 + 2x1 + 3(0) = 1 + 2x1

• x2 = 1: E(y) = 1 + 2x1 + 3(1) = 4 + 2x1

• x2 = 2: E(y) = 1 + 2x1 + 3(2) = 7 + 2x1

• x2 = 3: E(y) = 1 + 2x1 + 3(3) = 10 + 2x1

[Figure: the four parallel lines plotted as E(y) (0 to 12) against x1 (0 to 1.5)]
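The parallel lines on this slide can be checked numerically. The sketch below is only an illustration: `expected_y` is a hypothetical helper name, and it simply encodes the slide's example E(y) = 1 + 2x1 + 3x2.

```python
# Hypothetical helper encoding the slide's example E(y) = 1 + 2*x1 + 3*x2.
def expected_y(x1, x2):
    return 1 + 2 * x1 + 3 * x2

x2_levels = (0, 1, 2, 3)

# Intercept of each line: E(y) at x1 = 0 for a fixed x2.
intercepts = {x2: expected_y(0, x2) for x2 in x2_levels}

# Slope of each line: change in E(y) for a 1-unit increase in x1 at a fixed x2.
slopes = {x2: expected_y(1, x2) - expected_y(0, x2) for x2 in x2_levels}

print(intercepts)
print(slopes)
```

Every x2 level gives the same slope, which is what "no interaction" means here; only the intercept shifts.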

Page 19: Statistics for Business and Economics

Parameter Estimation

Page 21: Statistics for Business and Economics

First-Order Model Worksheet

Case, i   yi   x1i   x2i
1         1    1     3
2         4    8     5
3         1    3     2
4         3    5     6
:         :    :     :

Run regression with y, x1, x2

Page 22: Statistics for Business and Economics

Multiple Linear Regression Equations

Too complicated by hand! Ouch!

Page 23: Statistics for Business and Economics

Interpretation of Estimated Coefficients

1. Slope ($\hat{\beta}_k$)

• Estimated y changes by $\hat{\beta}_k$ for each 1-unit increase in xk, holding all other variables constant

– Example: if $\hat{\beta}_1$ = 2, then sales (y) is expected to increase by 2 for each 1-unit increase in advertising (x1), given the number of sales reps (x2)

2. Y-intercept ($\hat{\beta}_0$)

• Average value of y when every xk = 0

Page 24: Statistics for Business and Economics

1st Order Model Example

You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.) and newspaper circulation (000) on the number of ad responses (00). Estimate the unknown parameters.

You’ve collected the following data:

Resp (y)   Size (x1)   Circ (x2)
1          1           2
4          8           8
1          3           1
3          5           7
2          6           4
4          10          6

Page 25: Statistics for Business and Economics

Parameter Estimation Computer Output

Parameter Estimates

                Parameter   Standard   T for H0:
Variable   DF   Estimate    Error      Param=0     Prob>|T|
INTERCEP   1    0.0640      0.2599     0.246       0.8214
ADSIZE     1    0.2049      0.0588     3.656       0.0399
CIRC       1    0.2805      0.0686     4.089       0.0264

The Parameter Estimate column gives $\hat{\beta}_0$ (INTERCEP), $\hat{\beta}_1$ (ADSIZE), and $\hat{\beta}_2$ (CIRC):

$\hat{y} = .0640 + .2049 x_1 + .2805 x_2$
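For readers who want to see where these estimates come from, here is a minimal sketch that solves the least-squares normal equations (X'X)b = X'y for the six observations of the ad-response example. It uses only the Python standard library. The helper names `solve` and `least_squares` are made up for this illustration and are not taken from the slides, which rely on packaged software.

```python
def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def least_squares(columns, y):
    """Return [b0, b1, ...] minimizing SSE by solving the normal equations."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]  # leading 1 = intercept
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


resp = [1, 4, 1, 3, 2, 4]    # y: ad responses (00)
size = [1, 8, 3, 5, 6, 10]   # x1: ad size (sq. in.)
circ = [2, 8, 1, 7, 4, 6]    # x2: circulation (000)

b0, b1, b2 = least_squares([size, circ], resp)
print(round(b0, 4), round(b1, 4), round(b2, 4))
```

Up to rounding, the three values should agree with the Parameter Estimate column of the computer output.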

Page 26: Statistics for Business and Economics

Interpretation of Coefficients Solution

1. Slope ($\hat{\beta}_1$)

• Number of responses to ad is expected to increase by .2049 (20.49) for each 1 sq. in. increase in ad size, holding circulation constant

2. Slope ($\hat{\beta}_2$)

• Number of responses to ad is expected to increase by .2805 (28.05) for each 1 unit (1,000) increase in circulation, holding ad size constant

Page 27: Statistics for Business and Economics

Estimation of σ²

Page 29: Statistics for Business and Economics

Estimation of σ²

For a model with k independent variables:

$s^2 = \dfrac{SSE}{n - (k + 1)}$, where $SSE = \sum (y_i - \hat{y}_i)^2$

$s = \sqrt{s^2} = \sqrt{\dfrac{SSE}{n - (k + 1)}}$

Page 30: Statistics for Business and Economics

Calculating s² and s Example

You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.), x1, and newspaper circulation (000), x2, on the number of ad responses (00), y. Find SSE, s², and s.

Page 31: Statistics for Business and Economics

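Analysis of Variance Computer Output

Analysis of Variance

Source           DF   SS         MS         F       P
Regression       2    9.249736   4.624868   55.44   .0043
Residual Error   3    .250264    .083421
Total            5    9.5

SSE is the SS entry in the Residual Error row; s² is the MS entry in the same row.

$s^2 = \dfrac{.250264}{6 - 3} = .083421$

$s = \sqrt{.083421} = .2888$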
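The same arithmetic as a short sketch, using SSE from the Residual Error row with n = 6 observations and k = 2 independent variables. The variable names are illustrative only.

```python
import math

sse = 0.250264   # SSE: SS entry of the Residual Error row
n, k = 6, 2      # sample size and number of independent variables

s2 = sse / (n - (k + 1))   # estimate of the error variance
s = math.sqrt(s2)          # estimate of the error standard deviation

print(round(s2, 6), round(s, 4))
```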

Page 32: Statistics for Business and Economics

Evaluating the Model

Page 34: Statistics for Business and Economics

Evaluating Multiple Regression Model Steps

1. Examine variation measures

2. Test parameter significance

• Individual coefficients

• Overall model

3. Do residual analysis

Page 35: Statistics for Business and Economics

Variation Measures

Page 37: Statistics for Business and Economics

Multiple Coefficient of Determination

• Proportion of variation in y ‘explained’ by all x variables taken together

$R^2 = \dfrac{\text{Explained variation}}{\text{Total variation}} = 1 - \dfrac{SSE}{SS_{yy}}$

• Never decreases when new x variable is added to model

— Only y values determine SSyy

— Disadvantage when comparing models

Page 38: Statistics for Business and Economics

Adjusted Multiple Coefficient of Determination

• Takes into account n and number of parameters

• Similar interpretation to R²

$R_a^2 = 1 - \left[\dfrac{n - 1}{n - (k + 1)}\right](1 - R^2)$

Page 39: Statistics for Business and Economics

Estimation of R² and Ra² Example

You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.), x1, and newspaper circulation (000), x2, on the number of ad responses (00), y. Find R² and Ra².

Page 40: Statistics for Business and Economics

Excel Computer Output Solution

[Figure: Excel regression output with the R² and Ra² entries highlighted]
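Both measures can also be computed directly from quantities in the analysis-of-variance output given earlier for this example (SSE = .250264, SSyy = 9.5, n = 6, k = 2). The sketch below applies the two formulas. It is an illustration with made-up variable names, not a reproduction of the Excel output.

```python
sse = 0.250264   # unexplained variation (Residual Error SS)
ss_yy = 9.5      # total variation (Total SS)
n, k = 6, 2      # sample size and number of independent variables

r2 = 1 - sse / ss_yy
r2_adj = 1 - (n - 1) / (n - (k + 1)) * (1 - r2)

print(round(r2, 4), round(r2_adj, 4))
```

The adjusted value is smaller than R² because the factor (n − 1)/(n − (k + 1)) exceeds 1 whenever k ≥ 1.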

Page 41: Statistics for Business and Economics

Testing Parameters

Page 43: Statistics for Business and Economics

Inference for an Individual β Parameter

• Confidence interval: $\hat{\beta}_i \pm t_{\alpha/2}\, s_{\hat{\beta}_i}$, with df = n – (k + 1)

• Hypothesis test

H0: βi = 0

Ha: βi ≠ 0 (or < or >)

• Test statistic: $t = \dfrac{\hat{\beta}_i}{s_{\hat{\beta}_i}}$

Page 44: Statistics for Business and Economics

Confidence Interval Example

You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.), x1, and newspaper circulation (000), x2, on the number of ad responses (00), y. Find a 95% confidence interval for β1.

Page 45: Statistics for Business and Economics

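Excel Computer Output Solution

[Figure: Excel regression output with $\hat{\beta}_1$ and its standard error $s_{\hat{\beta}_1}$ highlighted]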

Page 46: Statistics for Business and Economics

Confidence Interval Solution

$\hat{\beta}_1 \pm t_{.025}\, s_{\hat{\beta}_1}$

$.204921 \pm 3.182(.058822)$

$.0177 \le \beta_1 \le .3921$
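A short sketch of the interval arithmetic, using the estimate, standard error, and tabled t value (3 df, .025 in each tail) quoted on this slide. The variable names are illustrative.

```python
beta1_hat = 0.204921   # estimated slope for ad size
se_beta1 = 0.058822    # its standard error
t_crit = 3.182         # tabled t value for df = 3, alpha/2 = .025

half_width = t_crit * se_beta1
lower = beta1_hat - half_width
upper = beta1_hat + half_width

print(round(lower, 4), round(upper, 4))
```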

Page 47: Statistics for Business and Economics

Hypothesis Test Example

You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.), x1, and newspaper circulation (000), x2, on the number of ad responses (00), y. Test the hypothesis that the mean ad response increases as circulation increases (ad size constant). Use α = .05.

Page 48: Statistics for Business and Economics

Hypothesis Test Solution

• H0: β2 = 0

• Ha: β2 > 0

• α = .05

• df = 6 - 3 = 3

• Critical value: t = 2.353 (reject H0 if t > 2.353; upper-tail area .05)

Test Statistic:

Decision:

Conclusion:

Page 49: Statistics for Business and Economics

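Excel Computer Output Solution

[Figure: Excel regression output with $\hat{\beta}_2$ and its standard error $s_{\hat{\beta}_2}$ highlighted]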

Page 50: Statistics for Business and Economics

Hypothesis Test Solution

• H0: β2 = 0

• Ha: β2 > 0

• α = .05

• df = 6 - 3 = 3

• Critical value: t = 2.353 (reject H0 if t > 2.353; upper-tail area .05)

Test Statistic: $t = \dfrac{\hat{\beta}_2}{s_{\hat{\beta}_2}} = \dfrac{.280492}{.068602} = 4.089$

Decision: Reject at α = .05

Conclusion: There is evidence the mean ad response increases as circulation increases
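The test statistic and decision can be reproduced in a few lines. This sketch takes the estimate, standard error, and one-tailed critical value from the slide; the variable names are illustrative.

```python
beta2_hat = 0.280492   # estimated slope for circulation
se_beta2 = 0.068602    # its standard error
t_crit = 2.353         # tabled t value for df = 3, upper-tail area .05

t_stat = beta2_hat / se_beta2
reject_h0 = t_stat > t_crit   # upper-tailed test of H0: beta2 = 0

print(round(t_stat, 3), reject_h0)
```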

Page 51: Statistics for Business and Economics

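Excel Computer Output Solution

[Figure: Excel regression output with the test statistic $t = \hat{\beta}_2 / s_{\hat{\beta}_2}$ and its P-value highlighted]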

Page 53: Statistics for Business and Economics

Testing Overall Significance

• Shows if there is a linear relationship between all x variables together and y

• Hypotheses

— H0: β1 = β2 = ... = βk = 0 (no linear relationship)

— Ha: At least one coefficient is not 0 (at least one x variable affects y)

Page 54: Statistics for Business and Economics

Testing Overall Significance

• Test statistic: $F = \dfrac{MS(\text{Model})}{MS(\text{Error})}$

• Degrees of freedom: ν1 = k, ν2 = n – (k + 1)

— k = Number of independent variables

— n = Sample size

Page 55: Statistics for Business and Economics

Testing Overall Significance Example

You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.), x1, and newspaper circulation (000), x2, on the number of ad responses (00), y. Conduct the global F–test of model usefulness. Use α = .05.

Page 56: Statistics for Business and Economics

Testing Overall Significance Solution

• H0: β1 = β2 = 0

• Ha: At least 1 not zero

• α = .05

• ν1 = 2, ν2 = 3

• Critical value: F = 9.55 (reject H0 if F > 9.55; upper-tail area .05)

Test Statistic:

Decision:

Conclusion:

Page 57: Statistics for Business and Economics

Testing Overall Significance Computer Output

Analysis of Variance

Source    DF   Sum of Squares   Mean Square   F Value   Prob>F
Model     2    9.2497           4.6249        55.440    0.0043
Error     3    0.2503           0.0834
C Total   5    9.5000

The Model DF is k and the Error DF is n – (k + 1). The Mean Square column gives MS(Model) and MS(Error).

Page 58: Statistics for Business and Economics

Testing Overall Significance Solution

• H0: β1 = β2 = 0

• Ha: At least 1 not zero

• α = .05

• ν1 = 2, ν2 = 3

• Critical value: F = 9.55 (reject H0 if F > 9.55; upper-tail area .05)

Test Statistic: $F = \dfrac{4.6249}{.0834} = 55.44$

Decision: Reject at α = .05

Conclusion: There is evidence at least 1 of the coefficients is not zero
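As a cross-check, the F statistic can be rebuilt from the sums of squares and degrees of freedom in the analysis-of-variance output. This sketch uses the unrounded sums of squares reported earlier for this example. The variable names are illustrative.

```python
ss_model, df_model = 9.249736, 2   # k = 2
ss_error, df_error = 0.250264, 3   # n - (k + 1) = 6 - 3

ms_model = ss_model / df_model
ms_error = ss_error / df_error
f_stat = ms_model / ms_error

f_crit = 9.55                      # tabled F value for (2, 3) df, alpha = .05
reject_h0 = f_stat > f_crit

print(round(f_stat, 2), reject_h0)
```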

Page 59: Statistics for Business and Economics

Testing Overall Significance Computer Output Solution

Analysis of Variance

Source    DF   Sum of Squares   Mean Square   F Value   Prob>F
Model     2    9.2497           4.6249        55.440    0.0043
Error     3    0.2503           0.0834
C Total   5    9.5000

F Value = MS(Model) / MS(Error); Prob>F is the P-value.

Page 60: Statistics for Business and Economics

Interaction Models

Page 62: Statistics for Business and Economics

Interaction Model With 2 Independent Variables

• Hypothesizes interaction between pairs of x variables

— Response to one x variable varies at different levels of another x variable

• Contains two-way cross product terms: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$

• Can be combined with other models

— Example: dummy-variable model

Page 63: Statistics for Business and Economics

Effect of Interaction

Given: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$

• Without interaction term, effect of x1 on y is measured by β1

• With interaction term, effect of x1 on y is measured by β1 + β3x2

— Effect increases as x2 increases (when β3 > 0)

Page 64: Statistics for Business and Economics

Interaction Model Relationships

Effect (slope) of x1 on E(y) depends on x2 value

E(y) = 1 + 2x1 + 3x2 + 4x1x2

• x2 = 0: E(y) = 1 + 2x1 + 3(0) + 4x1(0) = 1 + 2x1

• x2 = 1: E(y) = 1 + 2x1 + 3(1) + 4x1(1) = 4 + 6x1

[Figure: the two non-parallel lines plotted as E(y) (0 to 12) against x1 (0 to 1.5)]
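A small numerical check of the slide's example, E(y) = 1 + 2x1 + 3x2 + 4x1x2. In this sketch `expected_y` is a hypothetical helper name. The slope in x1 is measured as the change in E(y) when x1 goes from 0 to 1 at a fixed x2.

```python
# Hypothetical helper encoding the slide's example with an interaction term.
def expected_y(x1, x2):
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2

# Slope of E(y) in x1 at each fixed x2; algebraically this is 2 + 4*x2.
slopes = {x2: expected_y(1, x2) - expected_y(0, x2) for x2 in (0, 1)}

print(slopes)
```

Unlike the no-interaction model, the slope changes with x2, so the lines are not parallel.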

Page 65: Statistics for Business and Economics

Interaction Model Worksheet

Case, i   yi   x1i   x2i   x1i x2i
1         1    1     3     3
2         4    8     5     40
3         1    3     2     6
4         3    5     6     30
:         :    :     :     :

Multiply x1 by x2 to get x1x2. Run regression with y, x1, x2, x1x2

Page 66: Statistics for Business and Economics

Interaction Example

You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.), x1, and newspaper circulation (000), x2, on the number of ad responses (00), y. Conduct a test for interaction. Use α = .05.

Page 67: Statistics for Business and Economics

Interaction Model Worksheet

yi   x1i   x2i   x1i x2i
1    1     2     2
4    8     8     64
1    3     1     3
3    5     7     35
2    6     4     24
4    10    6     60

Multiply x1 by x2 to get x1x2. Run regression with y, x1, x2, x1x2
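The worksheet step of building the cross-product column can be sketched as follows, using the ad-size and circulation values from the example. The list names are illustrative.

```python
size = [1, 8, 3, 5, 6, 10]   # x1: ad size (sq. in.)
circ = [2, 8, 1, 7, 4, 6]    # x2: circulation (000)

# Interaction column: element-wise product x1 * x2 for each case.
size_x_circ = [x1 * x2 for x1, x2 in zip(size, circ)]

print(size_x_circ)
```

The resulting column is then supplied to the regression as a third independent variable alongside x1 and x2.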

Page 68: Statistics for Business and Economics

Excel Computer Output Solution

Global F–test indicates at least one parameter is not zero

[Figure: Excel regression output with F and its P-value highlighted]

Page 69: Statistics for Business and Economics

Interaction Test Solution

• H0: β3 = 0

• Ha: β3 ≠ 0

• α = .05

• df = n – (k + 1) = 6 - 4 = 2

• Critical values: t = ±4.303 (reject H0 if t < -4.303 or t > 4.303; area .025 in each tail)

Test Statistic:

Decision:

Conclusion:

Page 70: Statistics for Business and Economics

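Excel Computer Output Solution

[Figure: Excel regression output with the test statistic $t = \hat{\beta}_3 / s_{\hat{\beta}_3}$ highlighted]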

Page 71: Statistics for Business and Economics

Interaction Test Solution

• H0: β3 = 0

• Ha: β3 ≠ 0

• α = .05

• df = n – (k + 1) = 6 - 4 = 2

• Critical values: t = ±4.303 (reject H0 if t < -4.303 or t > 4.303; area .025 in each tail)

Test Statistic: t = 1.8528

Decision: Do not reject at α = .05

Conclusion: There is no evidence of interaction

Page 72: Statistics for Business and Economics

Second–Order Models

Page 74: Statistics for Business and Economics

Second-Order Model With 1 Independent Variable

• Relationship between 1 dependent and 1 independent variable is a quadratic function

• Useful 1st model if non-linear relationship suspected

• Model: $E(y) = \beta_0 + \beta_1 x + \beta_2 x^2$, where β1x is the linear effect and β2x² is the curvilinear effect

Page 75: Statistics for Business and Economics

Second-Order Model Relationships

[Figure: four panels of y against x1, two showing curves with β2 > 0 and two showing curves with β2 < 0]

Page 76: Statistics for Business and Economics

Second-Order Model Worksheet

Case, i   yi   xi   xi²
1         1    1    1
2         4    8    64
3         1    3    9
4         3    5    25
:         :    :    :

Create x² column. Run regression with y, x, x².

Page 77: Statistics for Business and Economics

2nd Order Model Example

The data shows the number of weeks employed and the number of errors made per day for a sample of assembly line workers. Find a 2nd order model, conduct the global F–test, and test if β2 ≠ 0. Use α = .05 for all tests.

Errors (y)   Weeks (x)
20           1
18           1
16           2
10           4
8            4
4            5
3            6
1            8
2            10
1            11
0            12
1            12

Page 78: Statistics for Business and Economics

Second-Order Model Worksheet

yi   xi   xi²
20   1    1
18   1    1
16   2    4
10   4    16
:    :    :

Create x² column. Run regression with y, x, x².

Page 79: Statistics for Business and Economics

Excel Computer Output Solution

$\hat{y} = 23.728 - 4.784 x + .242 x^2$
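The fitted equation can be checked against the twelve (errors, weeks) observations of the example. The sketch below builds the x² column and solves the least-squares normal equations using only the standard library. The helpers `solve` and `least_squares` are names made up for this illustration; the slides themselves use Excel.

```python
def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def least_squares(columns, y):
    """Return [b0, b1, ...] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]  # leading 1 = intercept
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


errors = [20, 18, 16, 10, 8, 4, 3, 1, 2, 1, 0, 1]   # y: errors per day
weeks = [1, 1, 2, 4, 4, 5, 6, 8, 10, 11, 12, 12]    # x: weeks employed
weeks_sq = [x ** 2 for x in weeks]                  # created x^2 column

b0, b1, b2 = least_squares([weeks, weeks_sq], errors)
print(round(b0, 3), round(b1, 3), round(b2, 3))
```

The negative linear coefficient and positive squared coefficient describe errors that fall quickly at first and then level off.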

Page 80: Statistics for Business and Economics

Overall Model Test Solution

Global F–test indicates at least one parameter is not zero

[Figure: Excel regression output with F and its P-value highlighted]

Page 81: Statistics for Business and Economics

β2 Parameter Test Solution

β2 test indicates curvilinear relationship exists

[Figure: Excel regression output with the t statistic for β2 and its P-value highlighted]

Page 83: Statistics for Business and Economics

Second-Order Model With 2 Independent Variables

• Relationship between 1 dependent and 2 independent variables is a quadratic function

• Useful 1st model if non-linear relationship suspected

• Model: $E(y) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i} + \beta_4 x_{1i}^2 + \beta_5 x_{2i}^2$

Page 84: Statistics for Business and Economics

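Second-Order Model Relationships

$E(y) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i} + \beta_4 x_{1i}^2 + \beta_5 x_{2i}^2$

[Figure: three response surfaces of y over the (x1, x2) axes, labeled β4 + β5 > 0, β4 + β5 < 0, and β3² > 4β4β5]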

Page 85: Statistics for Business and Economics

Second-Order Model Worksheet

Case, i   yi   x1i   x2i   x1i x2i   x1i²   x2i²
1         1    1     3     3         1      9
2         4    8     5     40        64     25
3         1    3     2     6         9      4
4         3    5     6     30        25     36
:         :    :     :     :         :      :

Multiply x1 by x2 to get x1x2; then create x1², x2². Run regression with y, x1, x2, x1x2, x1², x2².

Page 86: Statistics for Business and Economics

Models With One Qualitative Independent Variable

Page 88: Statistics for Business and Economics

Dummy-Variable Model

• Involves categorical x variable with 2 levels

— e.g., male-female; college-no college

• Variable levels coded 0 and 1

• Number of dummy variables is 1 less than number of levels of variable

• May be combined with quantitative variable (1st order or 2nd order model)

Page 89: Statistics for Business and Economics

Dummy-Variable Model Worksheet

Case, i   yi   x1i   x2i
1         1    1     1
2         4    8     0
3         1    3     1
4         3    5     1
:         :    :     :

x2 levels: 0 = Group 1; 1 = Group 2. Run regression with y, x1, x2

Page 90: Statistics for Business and Economics

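Interpreting Dummy-Variable Model Equation

Given: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i}$

y = Starting salary of college graduates

x1 = GPA

x2 = 0 if Male, 1 if Female

Male (x2 = 0): $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 (0) = \hat{\beta}_0 + \hat{\beta}_1 x_{1i}$

Female (x2 = 1): $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 (1) = (\hat{\beta}_0 + \hat{\beta}_2) + \hat{\beta}_1 x_{1i}$

Same slopes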

Page 91: Statistics for Business and Economics

Dummy-Variable Model Example

Computer Output: $\hat{y}_i = 3 + 5 x_{1i} + 7 x_{2i}$

x2 = 0 if Male, 1 if Female

Male (x2 = 0): $\hat{y}_i = 3 + 5 x_{1i} + 7(0) = 3 + 5 x_{1i}$

Female (x2 = 1): $\hat{y}_i = 3 + 5 x_{1i} + 7(1) = (3 + 7) + 5 x_{1i}$

Same slopes
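To make the "same slopes, different intercepts" reading concrete, the sketch below evaluates the example equation ŷ = 3 + 5x1 + 7x2 for both groups. `predict` is a hypothetical helper, and the GPA values are arbitrary inputs chosen for illustration.

```python
# Hypothetical helper encoding the example's fitted equation.
def predict(gpa, female):
    """Predicted y for a given GPA (x1) and dummy x2 (0 = male, 1 = female)."""
    return 3 + 5 * gpa + 7 * female

for gpa in (2.0, 3.0):  # arbitrary illustrative GPA values
    male, fem = predict(gpa, 0), predict(gpa, 1)
    print(gpa, male, fem, fem - male)

# The gap between groups is the dummy coefficient at every GPA.
gap_at_2 = predict(2.0, 1) - predict(2.0, 0)
gap_at_3 = predict(3.0, 1) - predict(3.0, 0)
```

The gap equals the coefficient of x2 regardless of GPA, which is the vertical distance between the two parallel lines.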

Page 92: Statistics for Business and Economics

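Dummy-Variable Model Relationships

[Figure: two parallel lines of y against x1 with the same slope $\hat{\beta}_1$; the Male line has intercept $\hat{\beta}_0$ and the Female line has intercept $\hat{\beta}_0 + \hat{\beta}_2$]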

Page 93: Statistics for Business and Economics

Nested Models

Page 94: Statistics for Business and Economics

Comparing Nested Models

• Reduced model contains a subset of terms in the complete (full) model

• Tests the contribution of a set of x variables to the relationship with y

• Null hypothesis H0: βg+1 = ... = βk = 0

— Variables in set do not significantly improve the model when all other variables are included

• Used in selecting x variables or models

• Part of most computer programs
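The slide does not show the test statistic. For reference, the standard nested-model (partial) F statistic is written below in the slide's notation, with g terms in the reduced model and k in the complete model. The symbols SSE_R and SSE_C (reduced-model and complete-model SSE) are labels introduced here, not taken from the slides.

```latex
F = \frac{(SSE_R - SSE_C)/(k - g)}{SSE_C / [\,n - (k + 1)\,]},
\qquad \nu_1 = k - g, \quad \nu_2 = n - (k + 1)
```

A large F means the extra k − g terms reduce SSE by more than chance alone would explain, so H0 is rejected.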

Page 95: Statistics for Business and Economics

Selecting Variables in Model Building

Page 96: Statistics for Business and Economics

Selecting Variables in Model Building

A butterfly flaps its wings in Japan, which causes it to rain in Nebraska. -- Anonymous

Use Theory Only! Use Computer Search!

Page 97: Statistics for Business and Economics

Model Building with Computer Searches

• Rule: Use as few x variables as possible

• Stepwise Regression

— Computer selects x variable most highly correlated with y

— Continues to add or remove variables depending on SSE

• Best subset approach

— Computer examines all possible sets

Page 98: Statistics for Business and Economics

Residual Analysis

Page 100: Statistics for Business and Economics

Residual Analysis

• Graphical analysis of residuals

— Plot estimated errors versus xi values (estimated errors are called residuals: the difference between actual yi and predicted yi)

— Plot histogram or stem-&-leaf of residuals

• Purposes

— Examine functional form (linear v. non-linear model)

— Evaluate violations of assumptions

Page 101: Statistics for Business and Economics

Residual Plot for Functional Form

[Figure: two residual plots of e against x, labeled "Add x² Term" and "Correct Specification"]

Page 102: Statistics for Business and Economics

Residual Plot for Equal Variance

[Figure: two residual plots of e against x, labeled "Unequal Variance" (fan-shaped) and "Correct Specification"]

Standardized residuals used typically.

Page 103: Statistics for Business and Economics

Residual Plot for Independence

[Figure: two residual plots of e against x, labeled "Not Independent" and "Correct Specification"]

Plots reflect sequence data were collected.

Page 104: Statistics for Business and Economics

Residual Analysis Computer Output

      Dep Var   Predict              Student
Obs   SALES     Value     Residual   Residual   -2-1-0 1 2
1     1.0000    0.6000     0.4000     1.044     |    |**  |
2     1.0000    1.3000    -0.3000    -0.592     |   *|    |
3     2.0000    2.0000     0          0.000     |    |    |
4     2.0000    2.7000    -0.7000    -1.382     |  **|    |
5     4.0000    3.4000     0.6000     1.567     |    |*** |

Plot of standardized (student) residuals
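The Residual column is the observed value minus the predicted value. A minimal sketch using the SALES and Predict Value columns from the output above follows. It does not attempt the studentized residuals, which need additional quantities not shown on the slide.

```python
sales = [1.0, 1.0, 2.0, 2.0, 4.0]       # Dep Var SALES column
predicted = [0.6, 1.3, 2.0, 2.7, 3.4]   # Predict Value column

# Residual = actual y minus predicted y, rounded to 4 places like the output.
residuals = [round(y - y_hat, 4) for y, y_hat in zip(sales, predicted)]

print(residuals)
print(round(sum(residuals), 4))
```

The residuals sum to zero, as they do for any least-squares fit that includes an intercept.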

Page 105: Statistics for Business and Economics

Regression Pitfalls

Page 106: Statistics for Business and Economics

Regression Pitfalls

• Parameter Estimability

— Number of different x–values must be at least one more than order of model

• Multicollinearity

— Two or more x–variables in the model are correlated

• Extrapolation

— Predicting y–values outside sampled range

• Correlated Errors

Page 107: Statistics for Business and Economics

Multicollinearity

• High correlation between x variables

• Coefficients measure combined effect

• Leads to unstable coefficients depending on x variables in model

• Always exists – matter of degree

• Example: using both age and height as explanatory variables in same model

Page 108: Statistics for Business and Economics

Detecting Multicollinearity

• Correlations between pairs of x variables are significant and stronger than their correlations with the y variable

• Non–significant t–tests for most of the individual parameters, but overall model test is significant

• Estimated parameters have wrong sign
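The first check can be sketched with a plain Pearson correlation. The example below reuses the ad-size and circulation columns from the chapter's advertising data purely as an illustration. `pearson` is a hypothetical helper written for this sketch, and no cutoff for "too correlated" is implied.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / math.sqrt(s_aa * s_bb)


size = [1, 8, 3, 5, 6, 10]   # x1: ad size (sq. in.)
circ = [2, 8, 1, 7, 4, 6]    # x2: circulation (000)
resp = [1, 4, 1, 3, 2, 4]    # y: ad responses (00)

r_x1x2 = pearson(size, circ)   # correlation between the two x variables
r_x1y = pearson(size, resp)    # correlation of each x variable with y
r_x2y = pearson(circ, resp)

print(round(r_x1x2, 3), round(r_x1y, 3), round(r_x2y, 3))
```

Comparing the x-to-x correlation with the x-to-y correlations is the informal screen described in the first bullet.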

Page 109: Statistics for Business and Economics

Solutions to Multicollinearity

• Eliminate one or more of the correlated x variables

• Avoid inference on individual parameters

• Do not extrapolate

Page 110: Statistics for Business and Economics

Extrapolation

[Figure: plot of y against x; the sampled range of x is labeled "Interpolation" and the regions on either side of it are labeled "Extrapolation"]

Page 111: Statistics for Business and Economics

Conclusion

1. Explained the Linear Multiple Regression Model

2. Described Inference About Individual Parameters

3. Tested Overall Significance

4. Explained Estimation and Prediction

5. Described Various Types of Models

6. Described Model Building

7. Explained Residual Analysis

8. Described Regression Pitfalls