Regression
Maarten Buis
12-12-2005
Outline
• Recap
• Estimation
• Goodness of Fit
• Goodness of Fit versus Effect Size
• Transformation of Variables and Effect Size
Recap
• With regression we looked at the effect of one variable on another
• An effect is a comparison of groups
• The effect of, for instance, age consists of a comparison of too many groups
• So, look at an average effect
• This implies a straight line
• The average effect is the slope
          rent   surface area
room 1     175   13
room 2     180   16
room 3     185   16
room 4     190   20
room 5     200   18
room 6     210   19
room 7     210   20
room 8     210   22
room 9     230   20
room 10    240   18
room 11    240   18
room 12    250   24
room 13    250   20
room 14    280   24
room 15    300   23
room 16    300   26
room 17    310   27
room 18    325   28
room 19    620   49
Mean and regression
• Mean summarizes observations with one number that minimizes the sum of squared deviations from that number
• Regression summarizes observations with one line that minimizes the sum of squared deviations from that line.
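The first bullet can be checked numerically. The sketch below uses the rents from the room data above and compares the sum of squared deviations around the mean with the sums around two nearby numbers. The helper name `sum_sq_dev` and the offsets of 10 are illustrative choices, not part of the slides.

```python
# Rents of the 19 rooms, copied from the room data above
rent = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
        240, 250, 250, 280, 300, 300, 310, 325, 620]


def sum_sq_dev(values, center):
    """Sum of squared deviations of the values from one summary number."""
    return sum((v - center) ** 2 for v in values)


mean_rent = sum(rent) / len(rent)

# Compare the mean with two nearby candidates (the +/- 10 offsets are arbitrary)
for candidate in (mean_rent - 10, mean_rent, mean_rent + 10):
    total = sum_sq_dev(rent, candidate)
    print(f"summary = {candidate:7.2f}  sum of squares = {total:10.2f}")
```

The middle line of the output has the smallest sum of squares; moving the summary number away from the mean in either direction makes it larger.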
[Figure: scatter plots of rent of room (vertical axis, 200 to 600) against surface area of room (horizontal axis, 15 to 50); one panel marks the mean]
Ordinary Least Squares (OLS)
• $\hat{y} = b_0 + b_1 x$
• So we want to minimize: $\sum_i (y_i - \hat{y}_i)^2$
• by choosing optimal values of $b_0$ and $b_1$
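As a rough illustration of this minimization, the sketch below fits a line to the room data (rent against surface area). It uses the standard closed-form least-squares solution, which the slides do not show, and the function names `ols_fit` and `sum_sq_resid` are illustrative.

```python
# Room data from the recap: rent and surface area of 19 rooms
rent = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
        240, 250, 250, 280, 300, 300, 310, 325, 620]
area = [13, 16, 16, 20, 18, 19, 20, 22, 20, 18,
        18, 24, 20, 24, 23, 26, 27, 28, 49]


def ols_fit(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept: the line passes through the means
    return b0, b1


def sum_sq_resid(x, y, b0, b1):
    """Sum of squared differences between observed y and predicted y."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


b0, b1 = ols_fit(area, rent)
print("intercept:", round(b0, 2), "slope:", round(b1, 2))
print("sum of squared residuals:", round(sum_sq_resid(area, rent, b0, b1), 2))
```

Changing either `b0` or `b1` away from the fitted values and calling `sum_sq_resid` again gives a larger sum, which is what "optimal" means here.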
What you need to know
• How to find the slope and intercept in:
  – a graph
  – a regression equation
  – SPSS output
• How to interpret the slope and intercept
Coefficients (a)

Model 1                         B          Std. Error   Beta    t        Sig.
(Constant)                      4845,644   235,959              20,536   ,000
age (age at day of interview)   -33,002    3,317        -,205   -9,950   ,000

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: incmid (household income in guilders)
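To make the interpretation concrete, here is a minimal sketch that plugs the reported intercept and slope into the regression equation. The coefficients are taken from the SPSS output above with decimal commas written as points; the function name and the ages 40 and 41 are illustrative assumptions.

```python
# Coefficients as reported in the SPSS output (decimal commas written as points)
INTERCEPT = 4845.644   # predicted income when age equals 0
SLOPE = -33.002        # change in predicted income per extra year of age


def predicted_income(age):
    """Predicted household income in guilders for a given age."""
    return INTERCEPT + SLOPE * age


# The intercept is the prediction at age 0
print(predicted_income(0))

# The slope is the difference between two predictions one year apart
# (40 and 41 are arbitrary example ages)
print(predicted_income(41) - predicted_income(40))
```

The second printed value equals the slope up to floating-point rounding: comparing groups that differ by one year of age gives an average income difference of about -33 guilders.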
Coefficients (a)

Model 1        B          Std. Error   Beta    t        Sig.
(Constant)     4845,644   235,959              20,536   ,000
age10          -330,023   33,167       -,205   -9,950   ,000

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: incmid (household income in guilders)
COMPUTE age10 = age/10 .
Coefficients (a)

Model 1                         B       Std. Error   Beta    t        Sig.
(Constant)                      4,846   ,236                 20,536   ,000
age (age at day of interview)   -,033   ,003         -,205   -9,950   ,000

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: incmid1000
COMPUTE incmid1000 = incmid/1000 .
Coefficients (a)

Model 1        B          Std. Error   Beta    t        Sig.
(Constant)     3030,520   59,960               50,542   ,000
age55          -33,002    3,317        -,205   -9,950   ,000

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: incmid (household income in guilders)
COMPUTE age55 = age-55 .
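The pattern in these outputs can be reproduced on any data set. The sketch below applies the same three kinds of transformation to the room data from the recap, since the income data are not available here. The reference value 20 for centering and the function name `ols_fit` are illustrative assumptions.

```python
# Room data from the recap: rent and surface area of 19 rooms
rent = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
        240, 250, 250, 280, 300, 300, 310, 325, 620]
area = [13, 16, 16, 20, 18, 19, 20, 22, 20, 18,
        18, 24, 20, 24, 23, 26, 27, 28, 49]


def ols_fit(x, y):
    """Return (b0, b1) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


b0, b1 = ols_fit(area, rent)

# Predictor divided by 10 (like age10 = age/10): slope becomes 10 times larger
b0_x10, b1_x10 = ols_fit([a / 10 for a in area], rent)

# Outcome divided by 1000 (like incmid1000 = incmid/1000): both coefficients shrink
b0_y1000, b1_y1000 = ols_fit(area, [r / 1000 for r in rent])

# Predictor centered (like age55 = age-55); 20 is a hypothetical reference value
b0_c, b1_c = ols_fit([a - 20 for a in area], rent)

print("original:      ", b0, b1)
print("x / 10:        ", b0_x10, b1_x10)
print("y / 1000:      ", b0_y1000, b1_y1000)
print("x centered:    ", b0_c, b1_c)
```

The reported SPSS coefficients follow the same pattern: -330,023 is ten times -33,002 up to rounding, and the intercept after centering, 3030,520, is the predicted income at age 55, roughly 4845,644 + 55 × (-33,0023). Beta and t for the age variable stay the same in all four outputs, because changing the units does not change the strength of the relationship.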
How well does the regression fit?
• We started with variation in the dependent variable
• We fitted a regression, which has less variation around the regression line
• The decrease in variation (Proportion of variance explained) is a measure of fit.
• $R^2$
Model Summary

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       ,205 (a)   ,042       ,042                1,45385

a. Predictors: (Constant), age (age at day of interview)
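The proportion of variance explained can be computed directly as one minus the ratio of the variation around the regression line to the variation around the mean. The sketch below does this for the room data from the recap, not for the income data behind the output above. The function name `r_squared` is illustrative.

```python
# Room data from the recap: rent and surface area of 19 rooms
rent = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
        240, 250, 250, 280, 300, 300, 310, 325, 620]
area = [13, 16, 16, 20, 18, 19, 20, 22, 20, 18,
        18, 24, 20, 24, 23, 26, 27, 28, 49]


def r_squared(x, y):
    """Proportion of the variation in y explained by a straight line in x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Variation we started with: squared deviations from the mean
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    # Variation left over: squared deviations from the regression line
    ss_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_resid / ss_total


print("R squared for the room data:", round(r_squared(area, rent), 3))
```

With one predictor, R Square is the square of R, which matches the output above: ,205 squared is about ,042.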
Standard Error of the Estimate
• Unfortunate choice of name; it should have been called the standard deviation of the estimate
• Measures the (unexplained) variation around the regression line.
$$S_y = \sqrt{\frac{\sum_i (y_i - \bar{y})^2}{N-1}} \qquad S_{y.x} = \sqrt{\frac{\sum_i (y_i - \hat{y}_i)^2}{N-2}}$$
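Both quantities can be computed side by side: the standard deviation of the dependent variable (variation around the mean, divided by N-1) and the standard error of the estimate (variation around the regression line, divided by N-2). The sketch below does this for the room data from the recap; the function name `spread_measures` is illustrative.

```python
import math

# Room data from the recap: rent and surface area of 19 rooms
rent = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
        240, 250, 250, 280, 300, 300, 310, 325, 620]
area = [13, 16, 16, 20, 18, 19, 20, 22, 20, 18,
        18, 24, 20, 24, 23, 26, 27, 28, 49]


def spread_measures(x, y):
    """Return (s_y, s_yx): spread around the mean and around the fitted line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Standard deviation of y: deviations from the mean, N - 1 in the denominator
    s_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    # Standard error of the estimate: deviations from the line, N - 2
    s_yx = math.sqrt(
        sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    )
    return s_y, s_yx


s_y, s_yx = spread_measures(area, rent)
print("standard deviation of rent:    ", round(s_y, 2))
print("standard error of the estimate:", round(s_yx, 2))
```

The second number is in the same units as the dependent variable and is smaller than the first here, because the line accounts for part of the variation in rent.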