multiple regression dr. andy field. slide 2 aims understand when to use multiple regression....
TRANSCRIPT
![Page 1: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/1.jpg)
Multiple Regression
Dr. Andy Field
![Page 2: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/2.jpg)
Slide 2
Aims• Understand When To Use Multiple Regression.• Understand the multiple regression equation and
what the betas represent.• Understand Different Methods of Regression
– Hierarchical– Stepwise– Forced Entry
• Understand How to do a Multiple Regression on PASW/SPSS
• Understand how to Interpret multiple regression.• Understand the Assumptions of Multiple
Regression and how to test them
![Page 3: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/3.jpg)
Slide 3
What is Multiple Regression?
• Linear Regression is a model to predict the value of one variable from another.
• Multiple Regression is a natural extension of this model:– We use it to predict values of an outcome
from several predictors.– It is a hypothetical model of the relationship
between several variables.
![Page 4: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/4.jpg)
Regression: An Example• A record company boss was interested in
predicting record sales from advertising.• Data
– 200 different album releases• Outcome variable:
– Sales (CDs and Downloads) in the week after release
• Predictor variables– The amount (in £s) spent promoting the record
before release (see last lecture)– Number of plays on the radio (new variable)
![Page 5: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/5.jpg)
Slide 5
The Model with One Predictor
![Page 6: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/6.jpg)
Slide 6
Multiple Regression as an Equation
• With multiple regression the relationship is described using a variation of the equation of a straight line.
110 Xbby 110 Xbby innXbXb 22 innXbXb 22
![Page 7: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/7.jpg)
Slide 7
b0
• b0 is the intercept.
• The intercept is the value of the Y variable when all Xs = 0.
• This is the point at which the regression plane crosses the Y-axis (vertical).
![Page 8: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/8.jpg)
Slide 8
Beta Values
• b1 is the regression coefficient for variable 1.
• b2 is the regression coefficient for variable 2.
• bn is the regression coefficient for nth variable.
![Page 9: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/9.jpg)
Slide 9
The Model with Two Predictors
bAdverts
bairplay
b0
![Page 10: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/10.jpg)
Slide 10
Methods of Regression
• Hierarchical:– Experimenter decides the order in which
variables are entered into the model.
• Forced Entry:– All predictors are entered simultaneously.
• Stepwise:– Predictors are selected using their semi-
partial correlation with the outcome.
![Page 11: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/11.jpg)
![Page 12: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/12.jpg)
Slide 12
Hierarchical Regression
• Known predictors (based on past research) are entered into the regression model first.
• New predictors are then entered in a separate step/block.
• Experimenter makes the decisions.
![Page 13: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/13.jpg)
Slide 13
Hierarchical Regression• It is the best method:
– Based on theory testing.– You can see the unique predictive
influence of a new variable on the outcome because known predictors are held constant in the model.
• Bad Point:– Relies on the experimenter knowing
what they’re doing!
![Page 14: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/14.jpg)
Slide 14
Forced Entry Regression
• All variables are entered into the model simultaneously.
• The results obtained depend on the variables entered into the model.– It is important, therefore, to have good
theoretical reasons for including a particular variable.
![Page 15: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/15.jpg)
Slide 15
Stepwise Regression I
• Variables are entered into the model based on mathematical criteria.
• Computer selects variables in steps.• Step 1
– SPSS looks for the predictor that can explain the most variance in the outcome variable.
![Page 16: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/16.jpg)
Exam Performance
Revision Time
Variance accounted for
by Revision Time (33.1%)
Previous Exam
Variance explained
(1.7%)
Difficulty
Variance explained
(1.3%)
![Page 17: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/17.jpg)
Correlations
1.000 .130** -.113* .575**
. .005 .014 .000
474 474 474 474
.130** 1.000 -.012 .047
.005 . .790 .303
474 474 474 474
-.113* -.012 1.000 -.236**
.014 .790 . .000
474 474 474 474
.575** .047 -.236** 1.000
.000 .303 .000 .
474 474 474 474
Pearson Correlation
Sig. (2-tailed)
N
Pearson Correlation
Sig. (2-tailed)
N
Pearson Correlation
Sig. (2-tailed)
N
Pearson Correlation
Sig. (2-tailed)
N
Exam Mark (%)
First Year Exam Grade
Difficulty of Question
Revision Time
Exam Mark(%)
First YearExamGrade
Difficulty ofQuestion
RevisionTime
Correlation is significant at the 0.01 level (2-tailed).**.
Correlation is significant at the 0.05 level (2-tailed).*.
![Page 18: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/18.jpg)
Slide 18
Stepwise Regression II
• Step 2:–Having selected the 1st predictor, a
second one is chosen from the remaining predictors.
–The semi-partial correlation is used as a criterion for selection.
![Page 19: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/19.jpg)
Slide 19
Semi-Partial Correlation• Partial correlation:
– measures the relationship between two variables, controlling for the effect that a third variable has on them both.
• A semi-partial correlation:– Measures the relationship between two
variables controlling for the effect that a third variable has on only one of the others.
![Page 20: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/20.jpg)
Slide 20
Partial Correlation Semi-Partial Correlation
![Page 21: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/21.jpg)
Slide 21
Semi-Partial Correlation in Regression
• The semi-partial correlation– Measures the relationship between a
predictor and the outcome, controlling for the relationship between that predictor and any others already in the model.
– It measures the unique contribution of a predictor to explaining the variance of the outcome.
![Page 22: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/22.jpg)
Slide 22
Excluded Variablesc
.103a 2.756 .006 .126 .998
.024a .624 .533 .029 .944
.024b .631 .528 .029 .944
First Year Exam Grade
Difficulty of Question
Difficulty of Question
Model1
2
Beta In t Sig.Partial
Correlation Tolerance
CollinearityStatistics
Predictors in the Model: (Constant), Revision Timea.
Predictors in the Model: (Constant), Revision Time, First Year Exam Gradeb.
Dependent Variable: Exam Mark (%)c.
Coefficientsa
-10.863 3.069 -3.539 .000
3.400 .222 .575 15.282 .000
-24.651 5.858 -4.208 .000
3.371 .221 .570 15.241 .000
.175 .063 .103 2.756 .006
(Constant)
Revision Time
(Constant)
Revision Time
First Year Exam Grade
Model1
2
B Std. Error
UnstandardizedCoefficients
Beta
StandardizedCoefficients
t Sig.
Dependent Variable: Exam Mark (%)a.
![Page 23: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/23.jpg)
Slide 23
Problems with Stepwise Methods
• Rely on a mathematical criterion.– Variable selection may depend upon only
slight differences in the Semi-partial correlation.
– These slight numerical differences can lead to major theoretical differences.
• Should be used only for exploration
![Page 24: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/24.jpg)
Slide 24
Doing Multiple Regression
![Page 25: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/25.jpg)
Slide 25
Doing Multiple Regression
![Page 26: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/26.jpg)
Regression Statistics
![Page 27: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/27.jpg)
Regression Diagnostics
![Page 28: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/28.jpg)
Slide 28
Output: Model Summary
![Page 29: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/29.jpg)
Slide 29
R and R2
• R– The correlation between the observed
values of the outcome, and the values predicted by the model.
• R2
– Yhe proportion of variance accounted for by the model.
• Adj. R2
– An estimate of R2 in the population (shrinkage).
![Page 30: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/30.jpg)
Slide 30
Output: ANOVA
![Page 31: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/31.jpg)
Slide 31
Analysis of Variance: ANOVA
• The F-test– looks at whether the variance
explained by the model (SSM) is significantly greater than the error within the model (SSR).
– It tells us whether using the regression model is significantly better at predicting values of the outcome than using the mean.
![Page 32: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/32.jpg)
Slide 32
Output: betas
![Page 33: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/33.jpg)
Slide 33
How to Interpret Beta Values
• Beta values:– the change in the outcome associated
with a unit change in the predictor.
• Standardised beta values:– tell us the same but expressed as
standard deviations.
![Page 34: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/34.jpg)
Slide 34
Beta Values
• b1= 0.087.– So, as advertising increases by £1,
record sales increase by 0.087 units.
• b2= 3589.– So, each time (per week) a song is
played on radio 1 its sales increase by 3589 units.
![Page 35: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/35.jpg)
Slide 35
Constructing a Model
plays3589Adverts087.041124Sales22110
XbXbby
plays3589Adverts087.041124Sales22110
XbXbby
181959
538358700041124
153589000,000,1087.041124Sales
plays 15 g,AdvertisinMillion £1
181959
538358700041124
153589000,000,1087.041124Sales
plays 15 g,AdvertisinMillion £1
![Page 36: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/36.jpg)
Slide 36
Standardised Beta Values• 1= 0.523
– As advertising increases by 1 standard deviation, record sales increase by 0.523 of a standard deviation.
• 2= 0.546– When the number of plays on radio increases
by 1 s.d. its sales increase by 0.546 standard deviations.
![Page 37: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/37.jpg)
Slide 37
Interpreting Standardised Betas
• As advertising increases by £485,655, record sales increase by 0.523 80,699 = 42,206.
• If the number of plays on radio 1 per week increases by 12, record sales increase by 0.546 80,699 = 44,062.
![Page 38: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/38.jpg)
Reporting the Model
![Page 39: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/39.jpg)
Slide 39
How well does the Model fit the data?
• There are two ways to assess the accuracy of the model in the sample:
• Residual Statistics– Standardized Residuals
• Influential cases– Cook’s distance
![Page 40: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/40.jpg)
Slide 40
Standardized Residuals • In an average sample, 95% of
standardized residuals should lie between 2.
• 99% of standardized residuals should lie between 2.5.
• Outliers– Any case for which the absolute value of
the standardized residual is 3 or more, is likely to be an outlier.
![Page 41: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/41.jpg)
Slide 41
Cook’s Distance
• Measures the influence of a single case on the model as a whole.
• Weisberg (1982):– Absolute values greater than 1 may be
cause for concern.
![Page 42: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/42.jpg)
Slide 42
Generalization
• When we run regression, we hope to be able to generalize the sample model to the entire population.
• To do this, several assumptions must be met.
• Violating these assumptions stops us generalizing conclusions to our target population.
![Page 43: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/43.jpg)
Slide 43
Straightforward Assumptions • Variable Type:
– Outcome must be continuous– Predictors can be continuous or dichotomous.
• Non-Zero Variance:– Predictors must not have zero variance.
• Linearity:– The relationship we model is, in reality, linear.
• Independence:– All values of the outcome should come from a
different person.
![Page 44: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/44.jpg)
Slide 44
The More Tricky Assumptions
• No Multicollinearity:– Predictors must not be highly correlated.
• Homoscedasticity:– For each value of the predictors the variance of the
error term should be constant.• Independent Errors:
– For any pair of observations, the error terms should be uncorrelated.
• Normally-distributed Errors
![Page 45: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/45.jpg)
Slide 45
Multicollinearity
• Multicollinearity exists if predictors are highly correlated.
• This assumption can be checked with collinearity diagnostics.
![Page 46: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/46.jpg)
• Tolerance should be more than 0.2 (Menard, 1995)
• VIF should be less than 10 (Myers, 1990)
![Page 47: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/47.jpg)
Slide 47
Checking Assumptions about Errors
• Homoscedacity/Independence of Errors:–Plot ZRESID against ZPRED.
• Normality of Errors:–Normal probability plot.
![Page 48: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/48.jpg)
Regression Plots
![Page 49: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/49.jpg)
Homoscedasticity:ZRESID vs. ZPRED
GoodGood BadBad
![Page 50: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/50.jpg)
Normality of Errors: Histograms
GoodGood BadBad
![Page 51: Multiple Regression Dr. Andy Field. Slide 2 Aims Understand When To Use Multiple Regression. Understand the multiple regression equation and what the](https://reader035.vdocument.in/reader035/viewer/2022062304/56649d975503460f94a81415/html5/thumbnails/51.jpg)
Normality of Errors: Normal Probability Plot
Normal P-P Plot of Regression
Standardized Residual
Dependent Variable: Outcome
Observed Cum Prob
1.00.75.50.250.00
Exp
ect
ed
Cu
m P
rob
1.00
.75
.50
.25
0.00
BadBadGoodGood