multiple regression involves the use of more than one independent variable. multivariate analysis...


Post on 20-Dec-2015


Page 1: Multiple Regression Involves the use of more than one independent variable. Multivariate analysis involves more than one dependent variable - OMS 633 Adding

Multiple Regression

Involves the use of more than one independent variable.

Multivariate analysis involves more than one dependent variable - OMS 633

Adding more variables will help us to explain more variance - the trick becomes: are the additional variables significant and do they improve the overall model? Additionally, the added independent variables should not be too highly related with each other!


Multiple Regression

A sample data set:

Sales = hundreds of gallons
Price = price per gallon
Advertising = hundreds of dollars

Week  Sales  Price  Advertising
1     10     1.3    9
2     6      2      7
3     5      1.7    5
4     12     1.5    14
5     10     1.6    15
6     15     1.2    12
7     5      1.5    6
8     12     1.4    10
9     17     1      15
10    20     1.1    21


Analyzing the output

Evaluate for multicollinearity
State and interpret the equation
Interpret Adjusted R2
Interpret Syx
Are the independent variables significant?
Is the model significant?
Forecast and develop prediction interval
Examine the error terms
Calculate MAD, MSE, MAPE, MPE


Correlation Matrix

Simple correlation for each combination of variables (independents vs. independents; independents vs. dependent)

             Sales     Price     Advertising
Sales        1
Price        -0.86349  1
Advertising  0.891497  -0.65449  1


Multicollinearity

It’s possible that the independent variables are related to one another. If they are highly related, this condition is called multicollinearity.

Problems:
A regression coefficient that is positive in sign in a two-variable model may change to a negative sign.
Estimates of the regression coefficient change greatly from sample to sample because the standard error of the regression coefficient is large.

Highly interrelated independent variables can explain some of the same variance in the dependent variable, so there is no added benefit, even though the R-square has increased.

When two independent variables are highly correlated (a correlation of about .7 or higher), we would throw one of them out.
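The screening rule can be sketched in a few lines of Python. This is a minimal illustration, assuming the .7 cutoff is applied to the absolute value of the correlation between independent variables; the function name `collinear_pairs` and the dictionary layout are hypothetical, and the correlations are the ones reported in the correlation matrix for the sample data.

```python
# Pairwise correlations as reported in the correlation matrix.
correlations = {
    ("Sales", "Price"): -0.86349,
    ("Sales", "Advertising"): 0.891497,
    ("Price", "Advertising"): -0.65449,
}
independents = {"Price", "Advertising"}
THRESHOLD = 0.7  # rule-of-thumb cutoff, applied to |r| (assumption)


def collinear_pairs(corr, predictors, threshold):
    """Return pairs of independent variables whose |r| meets the threshold."""
    return [
        pair
        for pair, r in corr.items()
        if set(pair) <= predictors and abs(r) >= threshold
    ]


flagged = collinear_pairs(correlations, independents, THRESHOLD)
print(flagged if flagged else "No independent variables exceed the cutoff")
```

For this data the only pair of independent variables is Price and Advertising, and their correlation of -0.65449 falls just under the cutoff, so both stay in the model.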


Multiple Regression Equation

Gallon Sales = 16.4 - 8.2476 (Price) + .59 (Adv)

$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_i X_i$$

             Coefficients  Standard Error  t Stat  P-value
Intercept    16.41         4.34            3.78    0.01
Price        -8.25         2.20            -3.76   0.01
Advertising  0.59          0.13            4.38    0.00
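The coefficients in this output can be reproduced from the weekly data with ordinary least squares. The sketch below solves the two-predictor normal equations in plain Python; the function names are illustrative and not part of the original slides. One assumption to flag: the week 7 price is entered as 1.6. The data table lists 1.5 for that week, but the reported correlations and coefficients are reproduced with 1.6.

```python
# Weekly sample data. Week 7 price is entered as 1.6 (assumption, see lead-in).
sales = [10, 6, 5, 12, 10, 15, 5, 12, 17, 20]            # hundreds of gallons
price = [1.3, 2.0, 1.7, 1.5, 1.6, 1.2, 1.6, 1.4, 1.0, 1.1]  # dollars per gallon
adv = [9, 7, 5, 14, 15, 12, 6, 10, 15, 21]                # hundreds of dollars


def mean(values):
    return sum(values) / len(values)


def cross_dev(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def fit_two_predictors(y, x1, x2):
    """Least-squares b0, b1, b2 for Y = b0 + b1*X1 + b2*X2."""
    s11, s22, s12 = cross_dev(x1, x1), cross_dev(x2, x2), cross_dev(x1, x2)
    s1y, s2y = cross_dev(x1, y), cross_dev(x2, y)
    det = s11 * s22 - s12 ** 2  # near zero when X1 and X2 are collinear
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return b0, b1, b2


b0, b1, b2 = fit_two_predictors(sales, price, adv)
print(f"Sales = {b0:.2f} + ({b1:.2f})(Price) + ({b2:.2f})(Adv)")
```

The determinant in the denominator shrinks as the two independent variables become more correlated, which is one way to see why multicollinearity makes the coefficient estimates unstable.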


Regression Coefficients

b0 is the Y-intercept: the predicted value of sales when X1 and X2 are both 0.

b1 and b2 are net regression coefficients. Each measures the change in Y per unit change in the relevant independent variable, holding the other independent variables constant.


Regression Coefficients

For each unit increase ($1.00) in price, sales will decrease 8.25 hundred gallons, holding advertising constant.

For each unit increase ($100, represented as 1) in Advertising, sales will increase .59 hundred gallons, holding price constant.

Be very careful about the units! A value of 10 for advertising indicates $1,000, because advertising is in hundreds of dollars.

Gallons = 16.4 - 8.2476 (1.00) + .59 (10)

= 14.05, or about 1,405 gallons


Regression Coefficients

How does a one cent increase in price affect sales (holding advertising at $1,000)?

16.4 - 8.25 (1.01) + .59 (10) = 13.9675

That is .0825 hundred gallons (8.25 gallons) less than at a price of $1.00.

If price stays at $1.00 and advertising increases by $100, from $1,000 to $1,100:

16.4 - 8.25 (1.00) + .59 (11) = 14.64
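These what-if calculations are easy to get wrong on units, so a small helper can make the conversions explicit. The sketch below uses the rounded coefficients from the worked examples (16.4, -8.25, .59); the function name `predict_sales` and its dollar-based arguments are illustrative choices, not part of the original material.

```python
B0, B_PRICE, B_ADV = 16.4, -8.25, 0.59  # rounded coefficients from the equation


def predict_sales(price_dollars, advertising_dollars):
    """Predicted sales in hundreds of gallons.

    Advertising enters the equation in hundreds of dollars,
    so the dollar amount is divided by 100 first.
    """
    adv_hundreds = advertising_dollars / 100
    return B0 + B_PRICE * price_dollars + B_ADV * adv_hundreds


base = predict_sales(1.00, 1000)
one_cent_more = predict_sales(1.01, 1000)
more_advertising = predict_sales(1.00, 1100)

print(f"Base case:          {base:.4f} hundred gallons")
print(f"Price up one cent:  {one_cent_more:.4f} ({(one_cent_more - base) * 100:+.2f} gallons)")
print(f"Advertising +$100:  {more_advertising:.4f} ({(more_advertising - base) * 100:+.2f} gallons)")
```

Each difference from the base case equals the relevant coefficient times the size of the change, which is exactly what "holding the other variable constant" means.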


Regression Statistics

Standard error of the estimate
R2 and Adjusted R2

Regression Statistics
Multiple R         0.965364
R Square           0.931929
Adjusted R Square  0.91248
Standard Error     1.507196
Observations       10


R2 and Adjusted R2

Same formulas as Simple Regression.
R2 = SSR/SST (this is an UNADJUSTED R2)
Adjusted R2 from ANOVA = 1 - MSE/(SST/(n-1))

About 91% of the variance in gallons sold is explained by price per gallon and advertising (adjusted R2 = .912).
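Both figures can be checked directly from the sums of squares in the ANOVA output for this example. The short sketch below does that; the variable names are illustrative, and k = 3 counts the estimated coefficients including the intercept.

```python
# Sums of squares from the ANOVA output for the sample data.
ssr, sse, sst = 217.6985, 15.90149, 233.6
n, k = 10, 3  # observations; estimated coefficients including the intercept

r_squared = ssr / sst                           # unadjusted
mse = sse / (n - k)                             # residual mean square
adjusted_r_squared = 1 - mse / (sst / (n - 1))  # penalizes extra variables

print(f"R2          = {r_squared:.6f}")
print(f"Adjusted R2 = {adjusted_r_squared:.6f}")
```

Adjusted R2 is always a little lower than R2 because it compares variances (sums of squares divided by their degrees of freedom) rather than raw sums of squares.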


Standard Error of the Estimate

Measures the standard amount that the actual values (Y) differ from the estimated values (Ŷ).

No change in formula, except that in this example k = 3 (the number of estimated coefficients, including the intercept).

Can still use the square root of MSE.
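As a quick check, the standard error reported in the regression statistics for this example can be recovered from the residual sum of squares. This is a minimal sketch; the function name is hypothetical.

```python
import math


def standard_error_of_estimate(sse, n, k):
    """Square root of MSE, where MSE = SSE / (n - k)."""
    return math.sqrt(sse / (n - k))


# Residual sum of squares and sample size from the ANOVA output.
s_yx = standard_error_of_estimate(sse=15.90149, n=10, k=3)
print(f"Syx = {s_yx:.6f} hundred gallons")
```

Because sales are measured in hundreds of gallons, a value of about 1.5 means the typical miss is roughly 150 gallons.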


Evaluate the Independent Variables

Ho: The regression coefficient is not significantly different from zero

HA: The regression coefficient is significantly different from zero

Use the t-stat and the p-value to evaluate EACH independent variable. If an independent variable is NOT significant, we remove it from the model and re-run!

             Coefficients  Standard Error  t Stat    P-value
Intercept    16.40637      4.342519        3.778075  0.00691
Price        -8.24758      2.196057        -3.75563  0.007115
Advertising  0.585101      0.133672        4.377145  0.003246
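Each t-stat is just the coefficient divided by its standard error. The sketch below recomputes them and compares against a two-tailed critical value; the choice of a .05 significance level (critical t of about 2.365 with 7 degrees of freedom) is an assumption for illustration, since the slides do not fix a level.

```python
# (coefficient, standard error) pairs from the regression output.
coefficients = {
    "Price": (-8.24758, 2.196057),
    "Advertising": (0.585101, 0.133672),
}
T_CRITICAL = 2.365  # two-tailed, alpha = .05, df = n - k = 7 (assumed level)

t_stats = {name: b / se for name, (b, se) in coefficients.items()}
significant = {name: abs(t) > T_CRITICAL for name, t in t_stats.items()}

for name, t in t_stats.items():
    verdict = "keep" if significant[name] else "remove and re-run"
    print(f"{name:12s} t = {t:8.4f} -> {verdict}")
```

Both variables clear the critical value, which agrees with their p-values being well below .05.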


Evaluate the Model

Ho: The model is NOT valid and there is NOT a statistical relationship between the dependent and independent variables

HA: The model is valid. There is a statistical relationship between the dependent and independent variables.

If F from the ANOVA is greater than the F from the F-table, reject Ho: the model is valid. We can also look at the p-value (Significance F). If the p-value is less than our chosen significance level, we can REJECT Ho.

ANOVA

            df  SS        MS        F         Significance F
Regression  2   217.6985  108.8493  47.91657  8.23E-05
Residual    7   15.90149  2.271641
Total       9   233.6
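The F statistic in the ANOVA table is the ratio of the two mean squares, and each mean square is a sum of squares divided by its degrees of freedom. The sketch below rebuilds it from the table; the .05 level and the corresponding table value of 4.74 for (2, 7) degrees of freedom are assumptions for illustration.

```python
# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression, df_regression = 217.6985, 2
ss_residual, df_residual = 15.90149, 7

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

F_CRITICAL = 4.74  # F table, alpha = .05, df = (2, 7) (assumed level)
model_is_significant = f_stat > F_CRITICAL

print(f"F = {f_stat:.4f}, critical F = {F_CRITICAL}")
print("Reject Ho" if model_is_significant else "Fail to reject Ho")
```

An F of about 48 against a critical value under 5 leaves little doubt, which is also what the very small Significance F says.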


Forecast and Prediction Interval

Same as simple regression - however, many times we will not have the correction factor (formula under the square root). It is acceptable to use the Standard error of the estimate provided in the computer output.

$$\hat{Y} \pm Z_{\alpha/2}\, S_{yx} \sqrt{1 + \frac{1}{n} + \frac{(X - \bar{X})^2}{\sum (X_i - \bar{X})^2}}$$
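When the correction factor is dropped, the interval reduces to the forecast plus or minus Z times the standard error of the estimate. The sketch below applies that shortcut to the forecast for a $1.00 price and $1,000 of advertising; the 95% confidence level and the function name are assumptions for illustration.

```python
from statistics import NormalDist


def prediction_interval(y_hat, s_yx, confidence=0.95):
    """Approximate interval: y_hat +/- Z * s_yx (no correction factor)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * s_yx
    return y_hat - half_width, y_hat + half_width


# Forecast for price = $1.00 and advertising = 10 (hundreds of dollars).
y_hat = 16.4 - 8.25 * 1.00 + 0.59 * 10
low, high = prediction_interval(y_hat, s_yx=1.507196)

print(f"Forecast: {y_hat:.2f} hundred gallons")
print(f"95% interval: {low:.2f} to {high:.2f} hundred gallons")
```

Because the correction factor is always greater than 1, this shortcut gives an interval that is somewhat too narrow, more so with small samples or when forecasting far from the mean of the data.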


Examining the Errors

Heteroscedasticity exists when the residuals do not have a constant variance across an entire range of values.

Run an autocorrelation on the error terms to determine if the errors are random. If the errors are not random, the model needs to be re-evaluated. More on this in Chapter 9.

Evaluate with MAD, MAPE, MPE, MSE
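The four error measures can be computed from the actual and fitted values in a few lines. The sketch below assumes the usual definitions with the error taken as actual minus forecast and each measure averaged over n observations; the two-point data set is made up purely to show the arithmetic.

```python
def errors(actual, forecast):
    return [a - f for a, f in zip(actual, forecast)]


def mad(actual, forecast):
    """Mean absolute deviation."""
    e = errors(actual, forecast)
    return sum(abs(x) for x in e) / len(e)


def mse(actual, forecast):
    """Mean squared error."""
    e = errors(actual, forecast)
    return sum(x ** 2 for x in e) / len(e)


def mape(actual, forecast):
    """Mean absolute percentage error, as a proportion."""
    e = errors(actual, forecast)
    return sum(abs(x) / abs(a) for x, a in zip(e, actual)) / len(e)


def mpe(actual, forecast):
    """Mean percentage error (signed), as a proportion; measures bias."""
    e = errors(actual, forecast)
    return sum(x / a for x, a in zip(e, actual)) / len(e)


# Hypothetical values, not taken from the slides.
actual = [10, 20]
forecast = [8, 22]
print(mad(actual, forecast), mse(actual, forecast))
print(round(mape(actual, forecast), 4), round(mpe(actual, forecast), 4))
```

MPE keeps the sign of each error, so a value near zero suggests the forecasts are not consistently high or low, while MAD, MSE, and MAPE measure the size of the misses.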


Dummy Variables

Used to determine the relationship between qualitative independent variables and a dependent variable.
Differences based on gender
Effect of training/no-training on performance
Seasonal data - quarters

We use 0 and 1 to indicate “off” or “on”. For example, code males as 1 and females as 0.


Dummy Variables

The data show job performance ratings along with achievement test scores and gender, coded as females (0) and males (1).

How do males and females differ in their job performance?

Rating  Test Score  Gender
5       60          0
4       55          0
3       35          0
10      96          0
2       35          0
7       81          0
6       65          0
9       85          0
9       99          1
2       43          1
8       98          1
6       91          1
7       95          1
3       70          1
6       85          1


Dummy Variables

The regression equation:

Job performance = -1.96 +.12 (test score) -2.18 (gender)

Holding gender constant, a one unit increase in test score increases job performance rating by .12 points.

Holding test score constant, males have a performance rating 2.18 points lower than females. Or stated differently, females have a job performance rating 2.18 points higher than males, holding test scores constant.

            Coefficients  Standard Error  t Stat  P-value
Intercept   -1.96         0.71            -2.77   0.02
Test Score  0.12          0.01            11.86   0.00
Gender      -2.18         0.45            -4.84   0.00
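A dummy variable shifts the intercept rather than the slope, which is easiest to see by predicting for a female and a male with the same test score. The sketch below uses the rounded coefficients from this output; the function name and the test score of 80 are arbitrary choices for illustration.

```python
def predicted_rating(test_score, is_male):
    """Job performance rating; is_male is the 0/1 dummy (females 0, males 1)."""
    return -1.96 + 0.12 * test_score - 2.18 * is_male


score = 80  # arbitrary example score
female = predicted_rating(score, 0)
male = predicted_rating(score, 1)

print(f"Female, score {score}: {female:.2f}")
print(f"Male,   score {score}: {male:.2f}")
print(f"Gap (male - female): {male - female:.2f}")
print(f"Effect of one more test point: {predicted_rating(score + 1, 0) - female:.2f}")
```

The gap is the same at every test score, because the gender coefficient only moves the line up or down; the slope on test score is shared by both groups.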


Dummy Variable Analysis

Evaluate for multicollinearity
State and interpret the equation
Interpret Adjusted R2
Interpret Syx
Are the independent variables significant?
Is the model significant?
Forecast and develop prediction interval
Examine the error terms
Calculate MAD, MSE, MAPE, MPE


Model Evaluation

If the variables indicate multicollinearity, run the model, interpret, but then re-run the best model (i.e., throw out one of the highly correlated variables).

If one of the independent variables is NOT significant (whether a dummy variable or other), throw it out and re-run the model.

If the overall model is not significant - back to the drawing board - need to gather better predictor variables… maybe an elective course!


Stepwise Regression

Sometimes, we will have a great number of variables - running a correlation matrix will help determine if any variables should NOT be in the model (low correlation with the dependent variable).

Can also run different types of regression, such as stepwise regression


Stepwise regression

Adds one variable at a time - one step at a time. Based on explained variance (and highest correlation with the dependent variable). The independent variable that explains the most variance in the dependent variable is entered into the model first.

A partial F-test is performed to see if a new variable stays or is eliminated.


Start with the correlation Matrix

                    Unit Sales  Test Score  Age (years)  Anxiety   Experience (Years)  High School GPA
Unit Sales          1
Test Score          0.67612     1
Age (years)         0.798141    0.227706    1
Anxiety             -0.29586    -0.22199    -0.28679     1
Experience (Years)  0.549834    0.349639    0.539568     -0.27869  1
High School GPA     0.621784    0.317772    0.694569     -0.24438  0.3121288           1


Stepwise Regression

F-to-Enter: 4.00   F-to-Remove: 4.00

Response is Unit Sales on 5 predictors, with N = 30

Step          1        2
Constant   -100.85   -86.79

Age (yea     6.97     5.93
T-Value      7.01    10.60

Test Sco              0.200
T-Value               8.13

S            6.85     3.75
R-Sq        63.70    89.48


Stepwise Regression

The equation at Step 1:
Sales = -100.85 + 6.97 (age)

The equation at Step 2:
Sales = -86.79 + 5.93 (age) + .200 (test score)

No other variables are significant; the model stops.
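The first step of the procedure, and the entry test at the second step, can be traced from the numbers already shown. The sketch below picks the predictor with the largest absolute correlation with Unit Sales and squares that correlation to get the Step 1 R-Sq; it then uses the fact that, for a single added variable, the partial F equals the square of its t-value. It illustrates the logic only and is not the stepwise routine that produced the output.

```python
# Correlations with Unit Sales, from the first column of the correlation matrix.
corr_with_sales = {
    "Test Score": 0.67612,
    "Age (years)": 0.798141,
    "Anxiety": -0.29586,
    "Experience (Years)": 0.549834,
    "High School GPA": 0.621784,
}

# Step 1: the variable that explains the most variance enters first.
first_variable = max(corr_with_sales, key=lambda name: abs(corr_with_sales[name]))
r_sq_step1 = 100 * corr_with_sales[first_variable] ** 2  # percent

# Step 2: Test Score enters if its partial F exceeds F-to-Enter.
F_TO_ENTER = 4.00
t_test_score = 8.13            # T-Value for Test Sco at Step 2
partial_f = t_test_score ** 2  # partial F = t squared for one added variable
test_score_enters = partial_f > F_TO_ENTER

print(f"Step 1 enters {first_variable}, R-Sq = {r_sq_step1:.2f}")
print(f"Step 2 partial F for Test Score = {partial_f:.2f}, enters: {test_score_enters}")
```

Note that High School GPA has the third-highest correlation with Unit Sales but never enters: it is also highly correlated with Age (.69), so most of what it explains is already in the model.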