Sociology 601 Class 23: November 17, 2009


Upload: lelia

Post on 24-Jan-2016


Page 1: Sociology 601 Class 23: November 17, 2009

Sociology 601 Class 23: November 17, 2009

• Homework #8

• Review

– spurious, intervening, & interaction effects

– stata regression commands & output

• F-tests and inferences (A&F 11.4)


Page 2: Sociology 601 Class 23: November 17, 2009

Review: Types of 3-variable Causal Models

• Spurious
– x2 causes both x1 and y
– e.g., age causes both marital status and earnings

• Intervening
– x1 causes x2, which causes y
– e.g., marital status causes more hours worked, which raises annual earnings

• No statistical difference between these models: both are estimated with the same regression of y on x1 and x2; only the assumed causal order differs.

• Statistical interaction effects: the relationship between x1 and y depends on the value of another variable, x2
– e.g., the relationship between marital status and earnings is different for men and women.


Page 3: Sociology 601 Class 23: November 17, 2009

Review: Causal Models with earnings & marital status

1. bivariate relationship: married → earnings

2. spuriousness: age → married; age → earnings

3. intervening: married → hours → earnings

4. interaction effect: married → earnings, with the strength of the effect depending on gender


Page 4: Sociology 601 Class 23: November 17, 2009

Review: Stata Commands

• describe
• summarize
• tab
• tab xcat, sum(yvar)
• drop if / keep if
• gen / replace
• ttest
• regress
• predict / predict, residuals
• histogram / scattergram
• graph box yvar, over(xvar)


Page 5: Sociology 601 Class 23: November 17, 2009

Review: Regression models using Stata

see:

http://www.bsos.umd.edu/socy/vanneman/socy601/conrinc.do


Page 6: Sociology 601 Class 23: November 17, 2009

Review: Regression models with Earnings, Marital status and Age

bivariate relationship:

. * association of earnings and marital status:
. regress conrinc married

      Source |       SS       df       MS              Number of obs =     725
-------------+------------------------------           F(  1,   723) =   31.29
       Model |  1.9321e+10     1  1.9321e+10           Prob > F      =  0.0000
    Residual |  4.4645e+11   723   617501240           R-squared     =  0.0415
-------------+------------------------------           Adj R-squared =  0.0402
       Total |  4.6577e+11   724   643334846           Root MSE      =   24850

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     married |    10383.4   1856.279     5.59   0.000     6739.057    14027.74
       _cons |   35065.27   1380.532    25.40   0.000     32354.94     37775.6
------------------------------------------------------------------------------

spuriousness (partial):

. * age makes the marriage-earnings relationship partly spurious:
. regress conrinc married age

      Source |       SS       df       MS              Number of obs =     725
-------------+------------------------------           F(  2,   722) =   36.20
       Model |  4.2454e+10     2  2.1227e+10           Prob > F      =  0.0000
    Residual |  4.2332e+11   722   586315863           R-squared     =  0.0911
-------------+------------------------------           Adj R-squared =  0.0886
       Total |  4.6577e+11   724   643334846           Root MSE      =   24214

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     married |   8243.081   1840.613     4.48   0.000     4629.489    11856.67
         age |   702.0977   111.7749     6.28   0.000     482.6551    921.5403
       _cons |   8836.284   4387.025     2.01   0.044     223.4344    17449.13
------------------------------------------------------------------------------


Page 7: Sociology 601 Class 23: November 17, 2009

Review: Regression models with Earnings, Marital status and Hours Worked

Intervening variable relationship (hours worked):

. * hours worked explains some of how marital status increases earnings:
. regress conrinc married age hrs1

      Source |       SS       df       MS              Number of obs =     664
-------------+------------------------------           F(  3,   660) =   25.02
       Model |  4.4322e+10     3  1.4774e+10           Prob > F      =  0.0000
    Residual |  3.8970e+11   660   590458672           R-squared     =  0.1021
-------------+------------------------------           Adj R-squared =  0.0980
       Total |  4.3402e+11   663   654637868           Root MSE      =   24299

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     married |   7328.527   1934.225     3.79   0.000     3530.551     11126.5
         age |   631.5836   117.8463     5.36   0.000     400.1848    862.9824
        hrs1 |   281.3472   71.47315     3.94   0.000     141.0051    421.6894
       _cons |  -232.1376   5465.426    -0.04   0.966    -10963.86    10499.58
------------------------------------------------------------------------------

But: problem with N! (N drops from 725 to 664 because hrs1 is missing for some cases.)

Create new hours worked:

. gen hrs=hrs1
(101 missing values generated)

. replace hrs=hrs2 if hrs1>=.
(24 real changes made, 2 to missing)

. replace hrs=0 if hrs1>=. & wrkstat>=3
(101 real changes made)


Page 8: Sociology 601 Class 23: November 17, 2009

Review: Regression models with Earnings, Marital status and Hours Worked

Intervening variable relationship (revised hours worked):

. regress conrinc married age hrs

      Source |       SS       df       MS              Number of obs =     725
-------------+------------------------------           F(  3,   721) =   36.27
       Model |  6.1081e+10     3  2.0360e+10           Prob > F      =  0.0000
    Residual |  4.0469e+11   721   561294582           R-squared     =  0.1311
-------------+------------------------------           Adj R-squared =  0.1275
       Total |  4.6577e+11   724   643334846           Root MSE      =   23692

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     married |   7465.107   1805.967     4.13   0.000     3919.526    11010.69
         age |   640.1643    109.891     5.83   0.000     424.4197    855.9089
         hrs |   278.3368   48.31685     5.76   0.000     183.4783    373.1954
       _cons |  -493.7634    4587.79    -0.11   0.914    -9500.786    8513.259
------------------------------------------------------------------------------

b(married) reduced to 7465.1 from 8243.1 (N= 725 for both)


Page 9: Sociology 601 Class 23: November 17, 2009

Review: Regression models with Earnings, Marital status, Age, and Hours worked


                Model 0        Model 1        Model 2x       Model 2
Married         10,383.4***    8,243.1***     7,328.5***     7,465.1***
Age                            702.1***       631.6***       640.2***
Hours worked                                  281.3***       278.3***
Constant        35,065.3***    8,836.3*       -232.1 n.s.    -493.8 n.s.
N               725            725            664            725
R-square        0.042          0.091          0.102          0.131
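The change in b(married) across models can be expressed as a percentage reduction. The following is a minimal Python sketch, not part of the original slides; the coefficients are copied from the Stata output in this review, and the helper name `pct_reduction` is hypothetical.

```python
# Married coefficients copied from the regression output in this review.
b_married = {
    "Model 0": 10383.4,   # conrinc on married
    "Model 1": 8243.081,  # adds age
    "Model 2": 7465.107,  # adds age and revised hours (hrs)
}


def pct_reduction(before, after):
    """Percent by which a coefficient shrinks when a control is added."""
    return 100.0 * (before - after) / before


# How much of the marriage effect is accounted for by each added variable.
age_effect = pct_reduction(b_married["Model 0"], b_married["Model 1"])
hrs_effect = pct_reduction(b_married["Model 1"], b_married["Model 2"])

print(f"adding age:   b(married) shrinks by {age_effect:.1f}%")
print(f"adding hours: b(married) shrinks by {hrs_effect:.1f}%")
```

Models 1 and 2 are compared because both use N = 725; Model 2x is left out since its smaller N makes the comparison unclear.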

Page 10: Sociology 601 Class 23: November 17, 2009

Review: Regression models with Earnings and Marital status, separately by Gender

Statistical Interaction Effect:

. * association of earnings and marital status for men:
. regress conrinc married if sex==1

      Source |       SS       df       MS              Number of obs =     725
-------------+------------------------------           F(  1,   723) =   31.29
       Model |  1.9321e+10     1  1.9321e+10           Prob > F      =  0.0000
    Residual |  4.4645e+11   723   617501240           R-squared     =  0.0415
-------------+------------------------------           Adj R-squared =  0.0402
       Total |  4.6577e+11   724   643334846           Root MSE      =   24850

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     married |    10383.4   1856.279     5.59   0.000     6739.057    14027.74
       _cons |   35065.27   1380.532    25.40   0.000     32354.94     37775.6
------------------------------------------------------------------------------

. * association of earnings and marital status for women:
. regress conrinc married if sex==2

      Source |       SS       df       MS              Number of obs =     749
-------------+------------------------------           F(  1,   747) =    0.26
       Model |   106732224     1   106732224           Prob > F      =  0.6129
    Residual |  3.1118e+11   747   416578779           R-squared     =  0.0003
-------------+------------------------------           Adj R-squared = -0.0010
       Total |  3.1129e+11   748   416164546           Root MSE      =   20410

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     married |   755.3387   1492.253     0.51   0.613     -2174.17    3684.848
       _cons |      26201   1038.855    25.22   0.000     24161.57    28240.42
------------------------------------------------------------------------------
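The two regressions show very different marriage slopes for men and women. One rough way to judge whether the gap is larger than chance would produce is a z-test on the difference between the two coefficients. The sketch below is an illustration only and does not appear in the slides: it assumes the male and female subsamples are independent and large enough for a normal approximation, and the variable names are hypothetical.

```python
import math

# Coefficients and standard errors copied from the two regressions above.
b_men, se_men = 10383.4, 1856.279      # regress conrinc married if sex==1
b_women, se_women = 755.3387, 1492.253  # regress conrinc married if sex==2

# Difference in the marriage slope between men and women.
diff = b_men - b_women

# Standard error of the difference, assuming independent subsamples.
se_diff = math.sqrt(se_men ** 2 + se_women ** 2)

# Large-sample z statistic for H0: the two slopes are equal.
z = diff / se_diff

print(f"difference in slopes = {diff:.1f}")
print(f"z = {z:.2f}")
```

A z well above 2 in absolute value would be consistent with a statistical interaction between marital status and gender.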


Page 11: Sociology 601 Class 23: November 17, 2009

Inferences: F-tests of global model

H0: β1 = β2 = ... = βk = 0

• the intercept (α, or β0) is not part of H0

F-tests of H0:

• Calculate new test statistic, F

• ratio of “explained variance” / “unexplained variance”

• F-distribution: ratio of two independent chi-square variables, each divided by its degrees of freedom

• df1 (numerator); df2 (denominator)

• if df1 = 1, then $F = t^2$

• Table D, pages 671-673

• Global F-test less useful (almost always significant unless you have a really bad model or very small N).

• Base for F-test comparing regression models (later)
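The statement that F equals t squared when df1 = 1 can be checked against the bivariate model of conrinc on married reported earlier in this review. This is a small Python sketch under that assumption; the numbers are copied from that Stata output and the variable names are hypothetical.

```python
# Bivariate model: regress conrinc married (df1 = 1, df2 = 723).
b_married = 10383.4    # coefficient from the Stata output
se_married = 1856.279  # its standard error
f_reported = 31.29     # F(1, 723) printed by Stata

# t statistic for the single predictor.
t = b_married / se_married

# With one predictor, the global F statistic is the square of that t.
f_from_t = t ** 2

print(f"t = {t:.2f}, t squared = {f_from_t:.2f}, reported F = {f_reported}")
```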


Page 12: Sociology 601 Class 23: November 17, 2009

F-test: Method 1, STATA output

. regress conrinc married age hrs

      Source |       SS       df       MS              Number of obs =     725
-------------+------------------------------           F(  3,   721) =   36.27
       Model |  6.1081e+10     3  2.0360e+10           Prob > F      =  0.0000
    Residual |  4.0469e+11   721   561294582           R-squared     =  0.1311
-------------+------------------------------           Adj R-squared =  0.1275
       Total |  4.6577e+11   724   643334846           Root MSE      =   23692

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     married |   7465.107   1805.967     4.13   0.000     3919.526    11010.69
         age |   640.1643    109.891     5.83   0.000     424.4197    855.9089
         hrs |   278.3368   48.31685     5.76   0.000     183.4783    373.1954
       _cons |  -493.7634    4587.79    -0.11   0.914    -9500.786    8513.259
------------------------------------------------------------------------------

df1 = 3 (= k = number of slope parameters: β(married), β(age), β(hrs))

df2 = 721 [= N – (k+1) = 725 – (3+1)]

Critical value: F(3, 721) ≈ 2.60 at α = .05; observed F = 36.27 >> 2.60, so reject H0.


Page 13: Sociology 601 Class 23: November 17, 2009

F-test: Method 2, using R-square

$$F = \frac{R^2 / k}{(1 - R^2) / [N - (k+1)]}, \qquad df_1 = k;\quad df_2 = N - (k+1)$$

$$F = \frac{.1311 / 3}{(1 - .1311) / [725 - (3+1)]} = \frac{.0437}{.8689 / 721} = 36.26$$
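The R-square formula applies to any of the models in this review, not only the last one. The following Python sketch is an illustration rather than part of the slides; the function name `global_f` is hypothetical, and the R-squared, k, and N values are copied from the Stata output.

```python
def global_f(r2, k, n):
    """Global F statistic from R-squared, k predictors, and n cases."""
    df1 = k
    df2 = n - (k + 1)
    return (r2 / df1) / ((1.0 - r2) / df2)


# (R-squared, k, N) copied from the Stata output for each model.
models = {
    "Model 0": (0.0415, 1, 725),  # married
    "Model 1": (0.0911, 2, 725),  # married, age
    "Model 2": (0.1311, 3, 725),  # married, age, hrs
}

f_values = {name: global_f(*args) for name, args in models.items()}

for name, f in f_values.items():
    print(f"{name}: F = {f:.2f}")
```

The results can differ from Stata's printed F in the second decimal place because R-squared is rounded to four digits in the output.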

Page 14: Sociology 601 Class 23: November 17, 2009

F-test: Method 3, using SSE and Model SS

$$F = \frac{\text{Model SS} / k}{\text{SSE} / [N - (k+1)]} = \frac{\text{Model Mean Square}}{\text{Mean Square Error}}, \qquad df_1 = k;\quad df_2 = N - (k+1)$$

$$F = \frac{2.0360 \times 10^{10}}{561294582} = 36.27$$
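The sums-of-squares route can be traced step by step from the ANOVA block of the Stata output. This Python sketch is illustrative only; it uses the rounded sums of squares printed for the model with married, age, and hrs (N = 725), and the variable names are hypothetical.

```python
# Values copied from the ANOVA block of the model with married, age, hrs.
n, k = 725, 3
model_ss = 6.1081e10  # Model sum of squares
resid_ss = 4.0469e11  # Residual sum of squares (SSE)
total_ss = 4.6577e11  # Total sum of squares

# Mean squares: each sum of squares divided by its degrees of freedom.
model_ms = model_ss / k
mse = resid_ss / (n - (k + 1))

# Global F statistic is the ratio of the two mean squares.
f_stat = model_ms / mse

# The same sums of squares also give R-squared, linking this to Method 2.
r2 = model_ss / total_ss

print(f"F = {f_stat:.2f}, R-squared = {r2:.4f}")
```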

Page 15: Sociology 601 Class 23: November 17, 2009

Inferences: βi


H0: βi = 0

• what we are usually most interested in

test statistic:

$$t = \frac{b_i}{\hat{\sigma}_{b_i}}, \qquad df = df_2 = N - (k+1)$$

$\hat{\sigma}_{b_i}$ is calculated from matrix routines
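Each t in the Stata coefficient table is the coefficient divided by its standard error. The sketch below reproduces the t column for the model with married, age, and hrs; it is an illustration, with values copied from the Stata output and hypothetical variable names.

```python
# (coefficient, standard error) copied from: regress conrinc married age hrs
estimates = {
    "married": (7465.107, 1805.967),
    "age": (640.1643, 109.891),
    "hrs": (278.3368, 48.31685),
}

n, k = 725, 3
df = n - (k + 1)  # degrees of freedom for each t test

# t statistic for H0: beta_i = 0 is b_i divided by its standard error.
t_stats = {name: b / se for name, (b, se) in estimates.items()}

for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f} on {df} df")
```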

Page 16: Sociology 601 Class 23: November 17, 2009

Next: Regression with Dummy Variables


• Agresti and Finlay 12.3

• (skim 12.1-12.2 on analysis of variance)

Example: marital status, 3 categories

• currently married
• never married
• widowed / separated / divorced