Introduction to Causal Inference

Kenneth A. Frank, CSTAT, 2-4-2011

Overview

• Alternative Causal Mechanisms and the Counterfactual
• Approximations to the Counterfactual
• How Regression works: Explained Variance in Regression
• Concern over Missing Confound (Internal Validity)
• Consider Alternate Sample (External Validity)
• Defining Absorption
• Analyzing Pre/post-test designs ANCOVA: Analysis of Cova...
• Schools as Fixed or Random
• Statistical power in multilevels
• Differential Treatment Effects and Heckman’s Rationality
• References on Causal Inference

My Take

• Sociological
• Motivated by studies of social context
  – People select themselves into contexts
  – Cannot randomize
  – Each context is different (effects across contexts?)
• Regression based
  – Control for confounds
  – Explore interactions
• Sensitivity/robustness
  – What would it take to invalidate an inference?

Methods Covered

• Counterfactual (2 potential outcomes)
• Statistical control via regression/general linear model
  – Random and fixed effects
• Robustness of inference
  – for impact of a confounding variable (internal validity)
  – for representativeness of sample (external validity)
  – Robustness indices: a form of sensitivity analysis
• Absorption
  – Randomization
  – Instrumental variables
  – Pre-test
• Differential treatment effects
  – Treatment effect for treated/for control
• Propensity scores
  – Attention to assignment mechanism
    • Logistic regression
  – Using propensity scores in analysis
    • Weighting
    • Control
    • Strata
    • Matching

Example: The effect of National Board Certification on the help a teacher provides others (Frank et al.)

What is National Board Certification?

The National Board (a private organization) offers a certification process for primary and secondary teachers. The process takes approximately 1 year and involves considerable reflection and documentation of practice. It emphasizes a progressive approach to teaching and engagement in professional leadership.

The fifth core proposition of the NBPTS states that accomplished teaching reaches outside of the individual classroom and involves collaboration with other teachers, parents, administrators, and others (National Board for Professional Teaching Standards, 1989).

Descriptive

Q: Do National Board certified teachers (NBCTs) provide more help to others in their schools than non-NBCTs?

A: Yes, the average NBCT is nominated by about 1.6 others as providing help with instruction, in contrast to about .95 for a non-NBCT.

Causal Inference

Q: Does National Board certification affect the amount of help a teacher provides?

Frank, K.A., Gary Sykes, Dorothea Anagnostopoulos, Marisa Cannata, Linda Chard, Ann Krause, Raven McCrory. Extended Influence: National Board Certified Teachers as Help Providers. Submitted to Education, Evaluation, and Policy Analysis

Policy Implications

• Board has emphasized helpfulness as one of its goals
• Other practices of BCTs may disseminate throughout the school
• A key goal of the organizational literature has been to cultivate more “social capital” and sense of community, where teachers help each other more, leading to better student outcomes.
• Amount of help teachers receive affects implementation of innovations (Frank, Zhao and Borman 2004; Zhao and Frank 2003) http://www.msu.edu/~kenfrank/research.htm#social

Incentives for more teachers within existing BCT-oriented schools to become BCTs

Incentives for schools and districts with few or no BCTs to engage BCTs

Correlation Does Not Equal Causation

• Estimated effect could be attributed to an unmeasured covariate (an alternative causal mechanism)
• Example
  – y = amount of help a teacher provides to others
  – s = whether or not a teacher became National Board Certified
  – cv = confounding variable (e.g., inclination to be helpful) representing an alternative causal mechanism

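The Impact of a Confounding Variable on a Regression Coefficient

[Figure: path diagram. Inclination to be helpful (confounding variable, cv) is linked to Board Certified (s) by $r_{s \cdot cv}$ and to number of others helped (y) by $r_{y \cdot cv}$. The product $r_{s \cdot cv} \times r_{y \cdot cv}$ is the impact of the confound on $t(\beta_1)$ for the path from s to y.]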

Alternative Causal Mechanisms and the Counterfactual

1) I have a headache

2) I take an aspirin (treatment)

3) My headache goes away (outcome)

Q) Is it because I took the aspirin?

A) We’ll never know – it is counterfactual – for the individual

This is the Fundamental Problem of Causal Inference

Treatment Effect and Missing Data for the Counterfactual

[Table: Potential Outcome by Assignment]

Counterfactual and Philosophers: Hume

• Spatial/temporal contiguity
  – Cause and measurement of effect apply to a single unit
• Temporal succession
  – Effect assessed after treatment is applied
• Constant conjunction
  – If effect is constant
• Missing: the effect of one cause is relative to the effects of others

Mill

• Liked the experimental paradigm
• Concomitant variation:
  – Correlational smoke implies causational fire (I agree, more later)
• Method of Difference: $Y_i^t - Y_i^c$
• Method of Residues: $Y_{ab} - Y_a$
• Method of Agreement: $Y_i^t - Y_i^c = 0$ implies null effect
  – Compare observed effect against null effect
• Limitation: anything can be a cause

Suppes

• Prima facie cause
  – Correlation
• Genuine cause
  – No confounding variables
• Liked the experimental paradigm
• Limitation: must explain the full cause of an effect, rather than the small effect of a particular cause

Lewis

• Named the counterfactual
• “If A were the case, C would be the case” is true in the actual world if and only if (i) there are no possible A-worlds; or (ii) some A-world where C holds is closer to the actual world than is any A-world where C does not hold. http://plato.stanford.edu/entries/causation-counterfactual/

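Basic Model for the Counterfactual

Outcome under treatment: $9 = 2 + 4 + 3$

Outcome under control: $5 = 2 + 3$

Treatment effect: $[2+4+3] - [2+3] = (2-2) + (4-0) + (3-3) = 4$

With a treatment indicator (1 = treatment, 0 = control):

Outcome $= 2 + (1 \text{ or } 0) \times 4 + 3$

Under treatment: $9 = 2 + (1) \times 4 + 3$

Under control: $5 = 2 + (0) \times 4 + 3$

Treatment effect: $[2+4+3] - [2+3] = (2-2) + (4-0) + (3-3) = 4$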
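The arithmetic above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `potential_outcome` and the argument labels `baseline`, `effect`, and `individual` are assumptions, since the slide gives only the numbers 2, 4, and 3.

```python
def potential_outcome(treated, baseline=2, effect=4, individual=3):
    """Outcome = baseline + (1 or 0) x effect + individual term."""
    return baseline + int(treated) * effect + individual

y_treatment = potential_outcome(True)    # 2 + (1) x 4 + 3
y_control = potential_outcome(False)     # 2 + (0) x 4 + 3

# Only one of the two outcomes is ever observed for a given individual;
# the difference between them is the treatment effect.
treatment_effect = y_treatment - y_control

print(y_treatment, y_control, treatment_effect)
```

Holding the baseline and individual terms fixed is what makes the difference equal the effect term alone.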

Treatment Effect and Missing Data for the Counterfactual

[Table: Potential Outcome by Assignment]

Reflection

• What part is most confusing to you?
  – Why?
  – More than one interpretation?
• Talk with one other, share
• Find a new partner and share problems and solutions

Approximations to the Counterfactual

• Compare repetitions within person (observe teachers before and after certification)
• Randomly assign people to become certified or not (Fisher/Rosenbaum)
  – Randomization (with large enough n) ensures that there will be no baseline differences between those assigned to treatment and those assigned to control
• Regression (assuming all relevant confounds have been measured)
• Each attempts to approximate the counterfactual by ensuring no relationship between the confound and assignment to treatment condition ($r_{x \cdot cv} = 0$, so $r_{x \cdot cv} \times r_{y \cdot cv} = 0$)

Randomization often not possible, especially for social contexts

• Logistics
  – Getting people to agree
• Independence
  – People within social contexts (e.g., schools) are dependent, so randomize at the level of the context (the school), which is very costly
• Ethics
  – Assigning adolescents to friendship groups?!
• Timing: the longer the treatment intervention, the more likely it is to violate the assumption that the control group represents the forecast for the treatment group
• Exposure to confounding with small n

Rubin’s (1974) response

• Was causal inference impossible prior to randomized experiments (circa 1930)?
• Make maximum use of data
• Approximate counterfactual
  – Statistical control
  – Propensity score matching: match those who received treatment with similar others who received control (like “twins”).

Employ Statistical Control for Confound

$y = \beta_0 + \beta_1 s + \beta_2\,\text{confound}$

SPSS Syntax for reading in toy counterfactual data

DATA LIST FREE / y confound s .
BEGIN DATA .
9 6 1
10 7 1
11 8 1
5 3 0
6 4 0
7 5 0
END DATA .

Counterfactual Predicted Values from Regression: Effect isn’t 4, it’s 1!

Regression Without Control: Wrong Answer, Estimate of 4

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT y
  /METHOD=ENTER s .

Regression with Control: Right Answer, Estimate of 1

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT y
  /METHOD=ENTER s confound .

$\hat{y} = \hat\beta_0 + \hat\beta_1 s + \hat\beta_2\,\text{confound}$

$\hat{y}_1^{c} = 2 + 1 \times 0 + 1 \times 6 = 8$
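For readers without SPSS, the two regressions on the toy data can be sketched with NumPy. This is an illustrative equivalent rather than the workshop's own code; the variable names are assumptions, and `numpy.linalg.lstsq` stands in for the REGRESSION command.

```python
import numpy as np

# Toy counterfactual data from the DATA LIST block: columns are y, confound, s
data = np.array([
    [9, 6, 1], [10, 7, 1], [11, 8, 1],
    [5, 3, 0], [6, 4, 0], [7, 5, 0],
], dtype=float)
y, confound, s = data[:, 0], data[:, 1], data[:, 2]
ones = np.ones_like(y)

# Regression without control: y on s only
b_naive, *_ = np.linalg.lstsq(np.column_stack([ones, s]), y, rcond=None)

# Regression with control: y on s and the confound
b_ctrl, *_ = np.linalg.lstsq(np.column_stack([ones, s, confound]), y, rcond=None)

print("estimate without control:", round(b_naive[1], 3))
print("estimate with control:", round(b_ctrl[1], 3))

# Predicted counterfactual for the first treated unit (confound = 6) under control
y_hat_control = b_ctrl[0] + b_ctrl[1] * 0 + b_ctrl[2] * 6
print("predicted outcome under control:", round(y_hat_control, 3))
```

The first treated unit was observed at y = 9, so comparing it with its predicted outcome under control recovers the controlled estimate.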

Counterfactual Predicted Values from Regression: Effect isn’t 4, it’s 1!

Keys to Statistical Control

• Need to know and measure relevant covariates (so that errors are independently and identically distributed)
  – An omitted confound creates dependencies among units that have similar values on the confound (e.g., teachers who are similarly inclined to help)
• Assumes optimal control for covariate is a linear function of X’s
• Assumes constant treatment effect

How Regression Works: Explained Variance in Regression

[Figure: Venn diagram of circles Y, X1, and X2. Circles represent variances.]

X1 and X2 explain different parts of Y. X1 and X2 are independent (uncorrelated).

But usually there is multicollinearity (or the need for statistical control)

[Figure: Venn diagram of circles Y, X1, and X2, with X1 and X2 overlapping.]

‘Competition’ between the variables (in explaining Y)!

The degree of competition depends on the amount of correlation (overlap) between the ‘independent’ (!) variables.

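[Figure: Venn diagram of circles Y, X1, and X2. Area a is the part of Y overlapping only X1, b the part overlapping only X2, c the part overlapping both X1 and X2, and e the remaining part of Y.]

$r^2_{YX_1} = a + c$

$r^2_{YX_2} = b + c$

$R^2_{Y \cdot X_1 X_2} = a + b + c$

$pr_1^2 = \frac{a}{a + e} = \frac{R^2_{Y \cdot X_1 X_2} - r^2_{YX_2}}{1 - r^2_{YX_2}}$

$sr_1^2 = a = R^2_{Y \cdot X_1 X_2} - r^2_{YX_2}$

$pr_2^2 = \frac{b}{b + e} = \frac{R^2_{Y \cdot X_1 X_2} - r^2_{YX_1}}{1 - r^2_{YX_1}}$

$sr_2^2 = b = R^2_{Y \cdot X_1 X_2} - r^2_{YX_1}$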

Focus on Overlap and alternative explanations

Data

• 47 schools (in 2 states)
• 1583 teachers
• Case studies in 4 schools
• Surveys:
  – background
  – attitudes towards leadership and BCT
  – sociometric: teachers were asked to list others who helped with instruction

Syntax for Descriptives

GET FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\workshop.sav'.

DESCRIPTIVES VARIABLES=bct leave female glevel owned yrstch nograde attracth expanseh bcttreat leader leadna white
  /STATISTICS=MEAN STDDEV .

Table 1: Measures and Descriptive Statistics (n=1363)

Variable                                                      Mean  Std Dev
Number of other teachers helped by respondent (attracth)       .96     1.08
Number of other teachers who helped respondent (expanseh)      .91      .77
Board certified teacher, 1 = Yes, 0 = No (BCT)                 .13      .34
White (white)                                                  .84      .37
Female (female)                                                .93      .25
Highest grade level taught (glevel)                           8.32     4.13
No grade level indicated (nograde)                             .04      .19
Level of own education (owned)                                3.01     1.02
Years teaching (yrstch)                                      16.12     8.64
Intention to leave (leave)                                    1.72      .72
Perceived advantage of certification (bcttreat)               1.95      .55
Enhancement through leadership (leader)                       2.35     1.20
Missing on enhancement of teaching (leadna)                    .17      .37
Number of certified others in school (nbct)                   2.31     2.44
Number of certified others in school squared (nbctsq)         6.42    11.69

(n is approximately 1208)

Descriptives Separately for BCT and non-BCT

SORT CASES BY bct .
SPLIT FILE LAYERED BY bct .
DESCRIPTIVES VARIABLES=leave female glevel owned yrstch nograde attracth expanseh white bcttreat leader leadna nbct
  /STATISTICS=MEAN STDDEV .

Try it, what do you get?

Recall regression model with statistical control for a confound

$y = \beta_0 + \beta_1 s + \beta_2\,\text{confound}$

$\text{Help Provided} = \beta_0 + \beta_1\,\text{Board Certification} + \beta_2\,\text{Leadership}$

Partialled and unpartialled (zero order) correlations

Unpartialled (zero-order, or total) shared variance between help provided (y) and board certification (x) is $.176^2 = .031$.

Shared variance between help provided (y) and board certification (x), partialled for enhancement of teaching through leadership, is $.167^2 = .028$.

The difference between unpartialled and partialled is the variance shared by board certification (x) and help provided (y) that is also accounted for by enhancement of teaching through leadership (the confound):

$.031 - .028 = .003$

How Regression Works: Overlapping Variances

[Figure: two Venn diagrams. One shows Help provided and Board Certification overlapping with Enhancement through leadership; the other shows Help provided and Board Certification only.]

Variance shared by help provided and board certification, partialling for enhancement through leadership: $.167^2 = .028$

Variance shared by help provided and board certification: $.176^2 = .031$

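How Regression Works: Partial and Semi-Partial Correlation

Partial correlation: correlation between s and y, where both s and y have been controlled for the confounding variable

$r_{s \cdot y | cv} = \frac{r_{s \cdot y} - r_{s \cdot cv}\, r_{y \cdot cv}}{\sqrt{1 - r_{y \cdot cv}^2}\,\sqrt{1 - r_{s \cdot cv}^2}} = \frac{.176 - .072 \times .170}{\sqrt{1 - .170^2}\,\sqrt{1 - .072^2}} = .167$

Semi-partial correlation: correlation between s and y, where only s has been controlled for the confounding variable

$sr_s = \frac{r_{s \cdot y} - r_{s \cdot cv}\, r_{y \cdot cv}}{\sqrt{1 - r_{s \cdot cv}^2}} = \frac{.176 - .072 \times .170}{\sqrt{1 - .072^2}} = .164$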
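The partial and semi-partial calculations can be checked with a short Python sketch. The function names are illustrative assumptions, and the inputs are the three correlations as they appear in the worked equations on this slide.

```python
from math import sqrt

def partial_corr(r_sy, r_s_cv, r_y_cv):
    """Correlation of s and y with the confound removed from both."""
    return (r_sy - r_s_cv * r_y_cv) / (sqrt(1 - r_s_cv**2) * sqrt(1 - r_y_cv**2))

def semipartial_corr(r_sy, r_s_cv, r_y_cv):
    """Correlation of s and y with the confound removed from s only."""
    return (r_sy - r_s_cv * r_y_cv) / sqrt(1 - r_s_cv**2)

# Values as printed in the worked equations above
r_sy, r_s_cv, r_y_cv = 0.176, 0.072, 0.170

pr = partial_corr(r_sy, r_s_cv, r_y_cv)
sr = semipartial_corr(r_sy, r_s_cv, r_y_cv)
print(round(pr, 3), round(sr, 3))
```

The two statistics share a numerator; they differ only in whether the denominator also removes the confound from y.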

Regression and Correlation Coefficient

The t-ratios for the regression coefficient and the correlation are identical.

$\hat\beta_1 = r_{s \cdot y}\,\frac{sd(y)}{sd(s)} = .176 \times \frac{1.077}{.341} = .557$

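Regression of Help Provided on Board Certification Controlling for Enhancement of Teaching through Leadership

$\hat\beta_{1|c} = r_{s \cdot y | c}\,\frac{sd(y | c)}{sd(s | c)} = .167 \times \frac{1.075}{.336} = .534$

Here $sd(y | c)$ is the standard deviation of the residuals from the model $y = \beta_0 + \beta_1 c$, and $sd(s | c)$ is the standard deviation of the residuals from the model $s = \beta_0 + \beta_1 c$.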
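The conversion from a correlation to a regression coefficient, with and without the control, can be sketched as follows. The helper name `slope_from_correlation` is an assumption; the inputs are the values printed on the slides. With these rounded inputs the unadjusted coefficient comes out near .556, which presumably differs from the printed .557 only through rounding.

```python
def slope_from_correlation(r, sd_y, sd_x):
    """Regression slope of y on x implied by their correlation and standard deviations."""
    return r * sd_y / sd_x

# Zero-order: help provided (y) on board certification (s)
b_unadjusted = slope_from_correlation(0.176, 1.077, 0.341)

# Controlling for leadership: partial correlation and residual standard deviations
b_adjusted = slope_from_correlation(0.167, 1.075, 0.336)

print(round(b_unadjusted, 3), round(b_adjusted, 3))
```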

How Regression Works: Impact of Enhancement of Teaching Through Leadership on Correlation Between Board Certification and Help Provided

[Figure: path diagram. CV = enhancement of teaching through leadership, S = Board Certification, Y = Help Provided. $r_{s \cdot y} = .176$, $r_{s \cdot cv} = .17$, $r_{y \cdot cv} = .07$; the impact is $r_{s \cdot cv} \times r_{y \cdot cv}$; $r_{s \cdot y | cv} = .167$.]

Calculating Impacts: Correlations Between BCT, Amount of Help Provided, and Covariates

Impacts of Covariates on Correlation between BCT and Help Provided: Component Correlations

Reflection

• What part is most confusing to you?
  – Why?
  – More than one interpretation?
• Talk with one other, share
• Find a new partner and share problems and solutions

Exercise: How Regression Works

• Calculate the correlation between board certification and help provided
  – Unpartialled
  – Partialled (for something other than leadership)
  – (see basic calculations, sheet 1) https://www.msu.edu/~kenfrank/research.htm#causal
• Do the same for an example in a data set you have

Exercise: Find Impacts of Measured Covariates on Correlation between BCT and Help Provided

Use data file “Board Certified Teachers”

GET FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\COURSES\causal '+
  'inference\groningen\data\spass_data\workshop.sav'.
DATASET NAME DataSet6 WINDOW=FRONT.
CORRELATIONS
  /VARIABLES=attracth bct expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna
  /PRINT=TWOTAIL NOSIG
  /STATISTICS DESCRIPTIVES
  /MATRIX=OUT(forimp)
  /MISSING=PAIRWISE .

GET FILE='forimp'.

AUTORECODE VARIABLES=ROWTYPE_ varname_ /INTO t n /PRINT.

FILTER OFF.
USE ALL.
SELECT IF(t = 1 and n>=4).
EXECUTE .

COMPUTE impact = attracth * bct .
EXECUTE .

SORT CASES BY impact (D) .

SAVE OUTFILE='impact.sav' /keep rowtype_ varname_ attracth bct impact /COMPRESSED.
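The logic of the syntax above (multiply each covariate's correlation with help provided by its correlation with board certification, then sort in descending order) can be sketched in Python. The entry for `leader` uses the two correlations quoted in the slides (.072 and .170; the product does not depend on their order). The other two rows are hypothetical placeholders for illustration, not results from the data.

```python
# For each covariate: (correlation with help provided, correlation with board certification)
correlations = {
    "leader": (0.072, 0.170),           # values quoted in the slides
    "hypothetical_a": (0.050, 0.100),   # hypothetical, for illustration only
    "hypothetical_b": (-0.040, 0.080),  # hypothetical, for illustration only
}

# Impact of a covariate = product of its two component correlations
impacts = {name: r_y * r_s for name, (r_y, r_s) in correlations.items()}

# Mirror SORT CASES BY impact (D): largest impact first
ranked = sorted(impacts.items(), key=lambda item: item[1], reverse=True)

for name, impact in ranked:
    print(f"{name}: {impact:.3f}")
```

A negative impact, as in the last hypothetical row, indicates a covariate whose omission would suppress rather than inflate the estimated association.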

Reminder: Motivation: If you don’t argue scientifically, those you disagree with will, and your views will not be heard.

Concern over Missing Confound (Internal Validity)

• Causal inference concern: How much of the estimate of the Board Certification effect would have to be attributed to other factors to invalidate the causal inference?
  – Maybe NBCTs help more because they had a previous inclination to help?
• We may never know, but we can quantify the concern
  – What would the impact of a confound (e.g., inclination to help) have to be to alter our inference? (Frank, 2000)

Full Regression of Help Provided Others on Board Certification and Covariates

UNIANOVA attracth BY school WITH bct leave female glevel owned yrstch nograde expanseh leader white nbct nbctsq bcttreat leadna
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /PRINT = PARAMETER
  /CRITERIA = ALPHA(.05)
  /DESIGN = bct leave female glevel owned yrstch nograde expanseh leader white nbct nbctsq bcttreat leadna school .

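Impact of an Unmeasured Confounding Variable on Inference of Effect of Board Certification on Help Provided

[Figure: path diagram. An unmeasured confounding variable (cv) is linked to Board Certified (s) by $r_{s \cdot cv}$ and to number of others helped (y) by $r_{y \cdot cv}$. The product $r_{s \cdot cv} \times r_{y \cdot cv}$ is the impact on $t(\beta_1)$ for the path from s to y.]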

What Must Be the Impact of an Unmeasured Confounding Variable to Invalidate the Inference?

Step 1: Establish Correlation Between BCT and Help Provided, partialling for all covariates

Step 2: Define a Threshold for Inference

Step 3: Calculate the Threshold for the Impact Necessary to Invalidate the Inference

Step 4: Multivariate Extension, with other Covariates

Step 1: Establish Correlation Between BCT and Help Provided, partialling for all covariates

$r = \frac{t}{\sqrt{(n - q - 1) + t^2}} = \frac{6.79}{\sqrt{1156 + 6.79^2}} = .196$

t is taken from the regression (t = 6.79), n is the sample size, and q is the number of parameters estimated, so that n − q − 1 = 1156.
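As a quick check of Step 1, the conversion from a t-ratio to a correlation can be sketched in Python. The function name `r_from_t` is an assumption; the inputs are the t-ratio and degrees of freedom given on the slide.

```python
from math import sqrt

def r_from_t(t, df):
    """Convert a t-ratio to a correlation: r = t / sqrt(df + t^2)."""
    return t / sqrt(df + t**2)

# t = 6.79 from the regression, df = n - q - 1 = 1156
r_xy = r_from_t(6.79, 1156)
print(round(r_xy, 3))
```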

Step 2: Define a Threshold for Inference

• Define r# as the value of r that is just statistically significant:

$r^{\#} = \frac{t_{critical}}{\sqrt{(n - q - 1) + t_{critical}^2}}$

n is the sample size, q is the number of parameters estimated, and $t_{critical}$ is the critical value of the t-distribution for making an inference.

$r^{\#} = \frac{1.96}{\sqrt{1156 + 1.96^2}} = .058$

r# can also be defined in terms of effect sizes.

Step 3: Calculate the Threshold for the Impact Necessary to Invalidate the Inference

Define the impact: $k = r_{x \cdot cv} \times r_{y \cdot cv}$, and assume $r_{x \cdot cv} = r_{y \cdot cv}$ (which maximizes the impact of the confounding variable). Then

$r_{x \cdot y | cv} = \frac{r_{x \cdot y} - r_{x \cdot cv}\, r_{y \cdot cv}}{\sqrt{1 - r_{y \cdot cv}^2}\,\sqrt{1 - r_{x \cdot cv}^2}} = \frac{r_{x \cdot y} - k}{1 - k}$

Set $r_{x \cdot y | cv} = r^{\#}$ and solve for k to find the threshold for the impact of a confounding variable (TICV):

$TICV = \frac{r_{x \cdot y} - r^{\#}}{1 - |r^{\#}|} = \frac{.196 - .058}{1 - .058} = .147$

Impact of an unmeasured confound > .147 → inference invalid

Impact of an unmeasured confound < .147 → inference valid
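Steps 1 through 3 can be chained in a short Python sketch. The function names are illustrative assumptions; the inputs (t = 6.79, critical t = 1.96, n − q − 1 = 1156) come from the slides, and the correlations are carried at full precision before rounding.

```python
from math import sqrt

def r_from_t(t, df):
    """Convert a t-ratio to a correlation: r = t / sqrt(df + t^2)."""
    return t / sqrt(df + t**2)

def impact_threshold(r_xy, r_threshold):
    """TICV = (r_xy - r#) / (1 - |r#|)."""
    return (r_xy - r_threshold) / (1 - abs(r_threshold))

df = 1156                         # n - q - 1
r_xy = r_from_t(6.79, df)         # Step 1: observed (partial) correlation
r_sharp = r_from_t(1.96, df)      # Step 2: threshold r#
ticv = impact_threshold(r_xy, r_sharp)  # Step 3

# If r_x.cv = r_y.cv, each component correlation must exceed sqrt(TICV)
component = sqrt(ticv)

print(round(r_sharp, 3), round(ticv, 3), round(component, 2))
```

Substituting the threshold back into $(r_{x \cdot y} - k)/(1 - k)$ returns r#, which is a convenient sanity check on the algebra.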

Calculations made easy!

• http://www.msu.edu/~kenfrank/papers/calculating%20indices%203.xls

Live Example

N-q=1131-18=1113. T=.603/.092=6.56

Impact Threshold=.142 Component correlations = .38

Frank, K.A., Gary Sykes, Dorothea Anagnostopoulos, Marisa Cannata, Linda Chard, Ann Krause, Raven McCrory. 2008. Extended Influence: National Board Certified Teachers as Help Providers. Education, Evaluation, and Policy Analysis. Vol 30(1): 3-30.

Exercise 3: Impact Threshold Exercise

1) Identify a statistical inference from an article you are interested in.

2) Describe possible confounds/alternative explanations that could bias the estimate.

3) Note the t-ratio and sample size.

4) Calculate robustness of inference using http://www.msu.edu/~kenfrank/papers/calculating%20indices%203.xls

5) Explain your inference and how robust you think it is. Why could your inference be challenged?

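Step 4: Multivariate Extension, with Covariates

With covariates z in the model, the impact is defined in terms of partial correlations:

$k = r_{x \cdot cv | z} \times r_{y \cdot cv | z}$

$TICV = \frac{r_{x \cdot y | z} - r^{\#}}{1 - |r^{\#}|}\,\sqrt{(1 - r_{x \cdot z}^2)(1 - r_{y \cdot z}^2)} = .125$

Maximizing the impact with covariates z in the model implies

$r_{y \cdot cv}^2 = TICV\,\sqrt{\frac{1 - r_{y \cdot z}^2}{1 - r_{x \cdot z}^2}}$

and

$r_{x \cdot cv}^2 = TICV\,\sqrt{\frac{1 - r_{x \cdot z}^2}{1 - r_{y \cdot z}^2}}$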

SPSS Syntax for Obtaining Multivariate Impact Threshold

GET

FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\workshop.sav'.

UNIANOVA

attracth BY school WITH leave female glevel owned yrstch nograde expanseh

bcttreat leader leadna white nbct nbctsq

/METHOD = SSTYPE(3)

/INTERCEPT = INCLUDE

/PRINT =ETASQ PARAMETER

/CRITERIA = ALPHA(.05)

/DESIGN = leave female glevel owned yrstch nograde expanseh bcttreat leader leadna

white nbct nbctsq school .

UNIANOVA

bct BY school WITH leave female glevel owned yrstch nograde expanseh bcttreat

leader leadna white nbct nbctsq

/METHOD = SSTYPE(3)

/INTERCEPT = INCLUDE

/PRINT = ETASQ PARAMETER

/CRITERIA = ALPHA(.05)

/DESIGN = leave female glevel owned yrstch nograde expanseh bcttreat leader leadna

white nbct nbctsq school .

Obtaining $R^2$

Multivariate Calculations

• http://www.msu.edu/~kenfrank/papers/calculating%20indices%203.xls

What must be the Impact of an Unmeasured Confound to Invalidate the Inference?

If k > .125 (or .147 without covariates) then the inference is invalid.

If $r_{x \cdot cv} = r_{y \cdot cv}$, then each would have to be greater than $k^{1/2} = .38$ to alter the inference (with the multivariate correction, $r_{y \cdot cv} > .38$ and $r_{x \cdot cv} > .34$).

Furthermore, correlations must be partialled for covariates z.

The impact of the strongest measured covariate (perception that leadership will enhance teaching) is .012.

The impact of an unmeasured confound would have to be ten times greater than the impact of the strongest observed covariate to invalidate the inference. Hmmm….

Applications of Impact Threshold

• Frank, K.A., Gary Sykes, Dorothea Anagnostopoulos, Marisa Cannata, Linda Chard, Ann Krause, Raven McCrory. 2008. Extended Influence: National Board Certified Teachers as Help Providers. Education, Evaluation, and Policy Analysis. Vol 30(1): 3-30.
• Frisco, Michelle, Muller, C. and Frank, K.A. 2007. Using propensity scores to study changing family structure and academic achievement. Journal of Marriage and Family. Vol 69(3): 721–741.
• *Frank, K. A. and Min, K. 2007. Indices of Robustness for Sample Representation. Sociological Methodology. Vol 37, 349-392. (* co first authors.)
• Frank, K. 2000. "Impact of a Confounding Variable on the Inference of a Regression Coefficient." Sociological Methods and Research, 29(2), 147-194.
• Crosnoe, Robert and Carey E. Cooper. 2010. “Economically Disadvantaged Children’s Transitions into Elementary School: Linking Family Processes, School Contexts, and Educational Policy.” American Educational Research Journal 47: 258-291.
• Crosnoe, Robert. 2009. “Low-Income Students and the Socioeconomic Composition of Public High Schools.” American Sociological Review 74: 709-730.
• Maroulis, S. & Gomez, L. (2008). “Does ‘Connectedness’ Matter? Evidence from a Social Network Analysis within a Small School Reform.” Teachers College Record, Vol. 110, Issue 9.
• Cheng, Simon, Regina E. Werum, and Leslie Martin. 2007. “Adult Social Capital: How Family and Community Ties Shape Track Placement of Ethnic Groups in Germany.” American Journal of Education 114: 41-74.
• William Carbonaro, Elizabeth Covay. School Sector and Student Achievement in the Era of Standards Based Reforms. Sociology of Education vol. 83 no. 2 160-182.

See also:

• Pan, W., and Frank, K.A. (2004). "An Approximation to the Distribution of the Product of Two Dependent Correlation Coefficients." Journal of Statistical Computation and Simulation, 74, 419-443.
• Pan, W., and Frank, K.A., 2004. "A probability index of the robustness of a causal inference," Journal of Educational and Behavioral Statistics, 28, 315-337.

Consider Alternate Sample (External Validity)

Causal Inference concern: We cannot assert cause if the effect of Board Certification is not constant across contexts.

Statistical Translation: Would the inference be valid if the sample included more of some population (e.g. teachers in other states) for which the effect was not as strong?

Rephrased for robustness: what must be the conditions in the alternative sample to invalidate the inference?

Consider Alternate Sample (External Validity)

Define π as the proportion of the sample that is replaced with an alternate sample.

$r^{u}_{xy}$ is the correlation in the unobserved data.

$R_{xy}$ is the combined correlation for observed and unobserved data:

$R_{xy} = (1 - \pi)\, r_{xy} + \pi\, r^{u}_{xy}$

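Thresholds for Sample Replacement

Let $r^{u}_{xy}$ denote the correlation in the unobserved (replacement) data and π the proportion of the sample replaced. Set $R_{xy} = r^{\#}$ and solve for $r^{u}_{xy}$: if half the sample is replaced (π = .5), the original inference is invalid if $r^{u}_{xy} < 2r^{\#} - r_{xy}$.

Therefore, $2r^{\#} - r_{xy}$ defines the threshold for replacement: TR(π = .5).

If $r^{u}_{xy} = 0$, the inference is altered if $\pi > 1 - r^{\#}/r_{xy}$. Therefore $1 - r^{\#}/r_{xy}$ defines the threshold for replacement: TR($r^{u}_{xy} = 0$).

Assumes means and variances are constant across samples; alternative calculations are available.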

Example of Thresholds for Replacement

TR(π = .5) = $2r^{\#} - r_{xy|z}$ = 2(.058) − .196 = −.081. The correlation between Board Certification and number of others helped would have to be less than −.081 to alter the inference if half the teachers in our sample were replaced (e.g., with teachers from another state).

TR(replacement correlation = 0) = $1 - r^{\#}/r_{xy|z}$ = 1 − (.058/.196) = .71. More than 70% of teachers would have to be replaced with others for whom Board Certification has no effect (correlation of 0) to invalidate the inference in a combined sample.
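Both replacement thresholds can be reproduced with a brief Python sketch. The function names are assumptions; the inputs come from the slides, and the correlations are computed from the t-ratios at full precision, which is presumably why the first threshold rounds to −.081 rather than the −.080 obtained from the rounded values.

```python
from math import sqrt

def r_from_t(t, df):
    """Convert a t-ratio to a correlation: r = t / sqrt(df + t^2)."""
    return t / sqrt(df + t**2)

def combined_r(pi, r_observed, r_replacement):
    """Correlation in a sample where a proportion pi of cases is replaced."""
    return (1 - pi) * r_observed + pi * r_replacement

df = 1156
r_xy = r_from_t(6.79, df)      # observed partial correlation
r_sharp = r_from_t(1.96, df)   # threshold for inference

# Replace half the sample: how low must the replacement correlation be?
tr_half = 2 * r_sharp - r_xy

# Replacement cases have zero correlation: what proportion must be replaced?
tr_zero = 1 - r_sharp / r_xy

print(round(tr_half, 3), round(tr_zero, 2))
```

At either threshold the combined correlation lands on r#, which is what the checks with `combined_r` confirm.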

Calculations for Robustness of Inference for External Validity

Basis of Comparison: Separate Effects for observed subgroups

• White (n=981): .71; Non-white (n=176): .27
• Female (n=1080): .63; Male (n=77): -.50 !
• Compare -.504 with TR(π=.5) = -.081.
• Results are invalidated for populations consisting of more male (elementary) teachers.
• Only 5 males were BCT:

GET FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\workshop.sav'.

CROSSTABS
  /TABLES=bct BY female
  /FORMAT= AVALUE TABLES
  /CELLS= COUNT ROW COLUMN TOTAL
  /COUNT ROUND CELL .

Generally, How Much Bias Must There Be to Invalidate the Inference?

Estimate = unbiased estimate + bias:

$r_{observed} = r_{unbiased} + M$

where $r_{unbiased}$ is defined by $E(r_{unbiased}) = \rho$, the relationship in the population.

Inference invalid if $r_{unbiased} < r^{\#}$. So…

1) Set $r_{unbiased} < r^{\#}$ and solve for M. Inference invalid if $M > r_{observed} - r^{\#}$.

2) As a proportion of the initial estimate, inference invalid if $M / r_{observed} > 1 - r^{\#}/r_{observed} = TR(r_{xy} = 0) = .71$.

Interpretation: 71% of the estimate must be attributable to bias to alter the inference (same as % replacement if $r_{unobserved} = 0$).

3) Rule of thumb (for large n):

% bias needed to invalidate inference = $1 - t_{critical}/t_{observed}$

Sykes et al.: % bias needed to invalidate inference = 1 − 1.96/6.79 = .71
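The rule of thumb can be sketched in one small Python function. The function name is an assumption; the default critical value of 1.96 and the observed t of 6.79 are taken from the slide.

```python
def percent_bias_to_invalidate(t_observed, t_critical=1.96):
    """Large-n rule of thumb: share of the estimate that must be due to bias."""
    return 1 - t_critical / t_observed

bias_share = percent_bias_to_invalidate(6.79)
print(round(bias_share, 2))
```

Because the rule depends only on the ratio of the two t values, it can be applied to any published estimate that reports a t-ratio.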

Exercise: Robustness for Sample Representativeness (External Validity)

1) Identify a statistical inference in your own work or in the literature for which there is concern about the external validity.

2) Identify possible populations for which the effect may not apply.

3) Note the t-ratio and sample size.

4) Calculate robustness of inference using http://www.msu.edu/~kenfrank/papers/calculating%20indices%203.xls

5) Discuss with a new partner your inference and how robust you think it is. Partner can challenge. Then change roles.

Assumptions are the bridge between statistical and causal inference

[Figure: Assumptions shown as the bridge between Statistical Inference and Causal Inference.]

Cornfield, J., & Tukey, J. W. (1956, Dec.), Average Values of Mean Squares in Factorials. Annals of Mathematical Statistics, 27(No. 4), 907-949.

In Donald Rubin’s words

“Nothing is wrong with making assumptions; on the contrary, such assumptions are the strands that join the field of statistics to scientific disciplines. The quality of these assumptions and their precise explication, not their existence, is the issue”(Rubin, 2004, page 345).

Conclusions for Robustness Indices

• Objections to moving from statistical to causal inference are stated in terms of violations of assumptions
  – No unobserved confounding variables
  – Treatment has same effect for all
• Robustness indices quantify how much assumptions must be violated to invalidate the inference.
• No new causal inferences!
  – Robustness indices merely quantify the terms of debate regarding causal inferences.
• Can be used with any threshold.
• Can be used (theoretically) for any t-ratio
  – Discuss: Statistical inference as threshold?
• Extension of sensitivity analysis: indices are a property of the original estimate

Limitations

• Would like to do an experiment
• Would like longitudinal data to control for previous inclination to help
  – (perhaps leverage this study to get a second wave of data?)
• Don’t know if BCTs are more helpful or merely perceived as such because of symbolic status
• Nationally representative data?

Defining Absorption

• The impact of any given covariate can be absorbed by controlling for other covariates: the impact of covariate c on the association between treatment x and outcome y is reduced once controlling for covariate a.

$Absorb(a, c, x, y) = 1 - \frac{r_{cx|a} \times r_{cy|a}}{r_{cx} \times r_{cy}} = 1 - \frac{\text{impact of } c \text{ given } a}{\text{impact of } c}$
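The absorption index can be sketched as a small Python function. The function name and all four input correlations below are hypothetical, chosen only so the arithmetic is easy to follow; they are not estimates from the teacher data.

```python
def absorption(r_cx, r_cy, r_cx_given_a, r_cy_given_a):
    """Proportion of covariate c's impact that is absorbed by controlling for a."""
    impact = r_cx * r_cy                        # impact of c before controlling for a
    impact_post = r_cx_given_a * r_cy_given_a   # impact of c after controlling for a
    return 1 - impact_post / impact

# Hypothetical correlations: controlling for a halves each component correlation
absorb = absorption(r_cx=0.20, r_cy=0.30, r_cx_given_a=0.10, r_cy_given_a=0.15)
print(round(absorb, 2))
```

A value near 1 means that, once a is controlled, there is little remaining need to control for c; a value near 0 means a absorbs almost none of c's impact.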

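The impact of confound c on the association between treatment x and outcome y is reduced once controlling for covariate a

[Figure: path diagram linking the confound to x and to y, with correlations $r_{s \cdot y}$, $r_{s \cdot cv}$, and $r_{y \cdot cv}$ and impact $r_{s \cdot cv} \times r_{y \cdot cv}$; covariate a overlaps both paths from the confound. Green indicates absorbed impact.]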

Syntax for calculating absorption

SUBTITLE "Impact Partialing Leader".
GET FILE='workshop.sav'.
PARTIAL CORR
  /VARIABLES= attracth bct expanseh white female leave glevel nograde owned yrstch nbct nbctsq bcttreat leadna BY leader
  /SIGNIFICANCE=TWOTAIL
  /matrix=out(forimpa.sav)
  /MISSING=LISTWISE .

GET FILE='forimpa.sav'.
AUTORECODE VARIABLES=ROWTYPE_ /INTO t /PRINT.
FILTER OFF.
USE ALL.
SELECT IF(t=3).
EXECUTE .

COMPUTE attracth_post=attracth.
COMPUTE bct_post=bct.
COMPUTE impact_post=attracth_post * bct_post.
EXECUTE .

SAVE OUTFILE='impactaa.sav' /keep ROWTYPE_ VARNAME_ attracth_post bct_post impact_post /COMPRESSED.

GET FILE='impactaa.sav'.
AUTORECODE VARIABLES=VARNAME_ /INTO n /PRINT.

FILTER OFF.
USE ALL.
SELECT IF(n>2).
EXECUTE .
SAVE OUTFILE='impacta.sav' /keep ROWTYPE_ VARNAME_ attracth_post bct_post impact_post /COMPRESSED.

GET FILE='impact.sav'.
SORT CASES BY VARNAME_ (A) .
SAVE OUTFILE='byn.sav' /COMPRESSED.
GET FILE='impacta.sav'.
SORT CASES BY VARNAME_ (A) .
SAVE OUTFILE='byna.sav' /COMPRESSED.
GET FILE='byn.sav'.
MATCH FILES /FILE=* /FILE='byna.sav' /RENAME (ROWTYPE_ = d0) /BY VARNAME_ /DROP= d0.
EXECUTE.
COMPUTE absorb=1-impact_post/impact .
EXECUTE .
SAVE OUTFILE='absorb.sav' /keep ROWTYPE_ VARNAME_ absorb impact attracth bct impact_post attracth_post bct_post /COMPRESSED.

Extent to which Leader absorbs the impact of other covariates on inference regarding effect of BCT on help provided

Once controlling for leader, there is less need to control for intention to leave or years teaching.

Absorption Exercise

• Looking at the absorption and impact matrices, can you guess what will happen when you add female to the model? How about when you add the number of others in the school who are board certified (nbct)?
• Check using syntax:

UNIANOVA bct BY school WITH leave female glevel owned yrstch nograde expanseh bcttreat leader leadna white nbct nbctsq
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /PRINT = ETASQ PARAMETER
  /CRITERIA = ALPHA(.05)
  /DESIGN = leave female glevel owned yrstch nograde expanseh bcttreat leader leadna white nbct nbctsq school .

How a pre-test absorbs impact

Analyzing Pre/post-test designs ANCOVA: Analysis of Covariance

Research questions:

• Pre- versus post interacting with treatment (Not recommended): Is there a difference between pre and post scores, and does that difference depend on whether or not the subject participated in the treatment?
• ANCOVA: Controlling for the pre-test, did subjects who participated in the treatment score higher on the post-test than those in the control?
  – Did the effect of the treatment depend on the level of the pre-test, that is, did the treatment work better for some than others?
• Difference scores: Did the subjects who participated in the treatment learn more (or grow more) from pre-test to post-test than those in the control?

Models: pre- versus post interacting with treatment (Not recommended)

$y_i = \hat\beta_0 + \hat\beta_1\,dtreatment_i + \hat\beta_2\,dpost_i + \hat\beta_3\,dtreatment_i \times dpost_i + \hat{e}_i$

Problem: observations are not independent, because each person is measured twice, pre and post. Unmeasured characteristics of each person affect both of that person's error terms, so the errors will be correlated:

$y_{bob,post} = \hat\beta_0 + \hat\beta_1\,dtreatment_{bob,post} + \hat\beta_2\,dpost_{bob,post} + \hat\beta_3\,dtreatment_{bob,post} \times dpost_{bob,post} + \hat{e}_{bob,post}$

$y_{bob,pre} = \hat\beta_0 + \hat\beta_1\,dtreatment_{bob,pre} + \hat\beta_2\,dpost_{bob,pre} + \hat\beta_3\,dtreatment_{bob,pre} \times dpost_{bob,pre} + \hat{e}_{bob,pre}$

The errors for these two equations will be dependent due to the common effect of “bobness” on each error that has not been accounted for.

ANCOVA

• Controlling for the pre-test, did subjects who participated in the treatment score higher on the post-test than those in the control? Did the effect of the treatment depend on the level of the pre-test, that is, did the treatment work better for some than others?

$\text{post achievement}_i = \hat\beta_0 + \hat\beta_1\,pre_i + \hat\beta_2\,dtreatment_i + \hat\beta_3\,pre_i \times treatment_i + \hat{e}_i$

Alternate Expression of model with factors (categorical variables), covariates (continuous variables) and interactions

ˆ ˆpost achievement pre prei j i j ij ije


Difference Scores

• Construct: $\Delta y_i = y_{\text{post},i} - y_{\text{pre},i}$. This measures the change from y-pre to y-post for person i.
• Model: $\Delta y_i = \hat{\beta}_0 + \hat{\beta}_1\,\mathrm{dtreatment}_i + e_i$
• Advantages: only one observation per person. Essentially modeling “growth.”
• Disadvantage: cannot test for interaction effect.
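As a rough illustration of the difference-score model, the sketch below builds the gain score for each person and estimates the treatment coefficient. The data and the function name `difference_score_effect` are hypothetical, not taken from the workshop data. It relies on the fact that with a single 0/1 predictor the OLS slope equals the difference in group means.

```python
from statistics import mean

def difference_score_effect(pre, post, treated):
    """Estimate beta_1 in: (post - pre) = beta_0 + beta_1 * dtreatment + e."""
    gains = [after - before for before, after in zip(pre, post)]
    gains_treated = [g for g, t in zip(gains, treated) if t == 1]
    gains_control = [g for g, t in zip(gains, treated) if t == 0]
    # With a single 0/1 predictor, the OLS slope equals the difference in group means.
    return mean(gains_treated) - mean(gains_control)

# Hypothetical scores for six people (three treated, three control).
pre = [10, 12, 14, 11, 13, 15]
post = [15, 16, 19, 12, 15, 16]
treated = [1, 1, 1, 0, 0, 0]

effect = difference_score_effect(pre, post, treated)
print(round(effect, 3))
```

Because each person contributes a single gain score, the two-observations-per-person dependence does not arise in this model.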


When to use Difference scores versus ANCOVA

• Allison argues for using difference scores when the pre-test is not considered a causal predictor of either the outcome or the treatment condition.
• A. Pre-test “causing” outcome: stocks versus flows
  – The pre-test can “cause” the post-test when the outcome is like a “stock”: the outcome has an inherent persistence over time, such as height, which typically cannot decrease (Allison, page 107). In this case, use ANCOVA.
  – The pre-test is not considered “causal” for most measures of behavior and attitude, which must be regenerated each time, like something that “flows” and can therefore be cut off.
• B. Pre-test “causing” treatment: examples (Allison, page 109):
  – (use Δ): All seniors in high school A are enrolled in the treatment, and the SAT is administered before and after the treatment or control period. All students in high school B serve as controls.
  – (use ANCOVA): The SAT is administered as a pre-test to a group of high school seniors. Those who score below 400 are enrolled in the treatment, and those who score above 400 are in the control.
  – (use Δ): Seniors self-select into treatment and control before seeing the results of a pre-test administration of the SAT.
  – (use ANCOVA): Seniors self-select into the program after seeing the results of a pre-test administration of the SAT.


Flow Chart for use of Difference Scores versus ANCOVA

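[Flow chart. Decision node: Does the pre-test cause the treatment conditions? Branches: No / Yes. Labels in the chart: “Use ANCOVA”, “Difference score”, “If test has high reliability”. End points: Difference Score and ANCOVA.]

Difference Score: $\Delta y_i = \hat{\beta}_0 + \hat{\beta}_1\,\mathrm{dtreatment}_i + e_i$

ANCOVA: $\text{post achievement}_i = \hat{\beta}_0 + \hat{\beta}_1\,\mathrm{pre}_i + \hat{\beta}_2\,\mathrm{dtreatment}_i + \hat{\beta}_3\,\mathrm{pre}_i \times \mathrm{treatment}_i + e_i$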


Absorption of Impact Via Randomly Assigned Treatment

Green area goes to zero


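How Random Assignment Absorbs Impact

[Figure: path diagram with nodes Random assignment (s), Treatment (x), and Number others helped (y). Labels: $t(\hat{\beta}_1)$, $r_{x \cdot cv}$, $r_{y \cdot cv}$, $r_{x \cdot cv} \times r_{y \cdot cv}$, $r_{xs}$, and $r_{ys|x} = 0$.]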


How Does Regression Discontinuity Absorb Impact?

• Criteria for Assignment to treatment conditions known with certainty

• Comparison of those who just exceeded criteria with those who just missed criteria.


How Regression Discontinuity absorbs impact


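How Instrumental Variable Absorbs Impact

[Figure: path diagram with nodes Instrumental Variable, Treatment (x), and Number others helped (y). Labels: $t(\hat{\beta}_1)$, $r_{x \cdot cv}$, $r_{y \cdot cv}$, $r_{x \cdot cv} \times r_{y \cdot cv}$, $r_{xs}$, $r_{ys|x} = 0$, and “Fidelity Assumed”.]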


Impact Thresholds and Instrumental variables

• Can still do impact threshold.
• Define iv as the instrumental variable, cv as the confound.
• Exclusion restriction: For any confounding variable for which $r_{cv \cdot y} > 0$, $r_{iv \cdot cv}$ must equal 0. In other words, $r_{iv \cdot y} \times r_{iv \cdot cv} = 0$.
• But what if this doesn’t hold? Inference invalidated by $r_{iv \cdot y} \times r_{iv \cdot cv}$. This is the impact of a confound.
• Can compare with existing relationships between IV and other covariates.


Comment on Instrumental Variables

• Exclusion restriction (the instrument is related to treatment assignment but related to the outcome only through treatment) is difficult to satisfy
  – Attempts: draft number (Angrist et al.)
  – Whether you’re Catholic or not as an instrument for attending Catholic school
• A meta-analysis of effects of welfare, job training and employment service programs on earnings [Glazerman, Stephen, Levy, Dan and Myers, David (2003). “Nonexperimental versus Experimental Estimates of Earnings Impacts.” Annals of the American Academy of Political and Social Science (589): 63-85] found that statistical control for a prior measure most closely approximated randomized experiments.
• Steiner, Peter M., Thomas D. Cook & William R. Shadish (in press). On the importance of reliable covariate measurement in selection bias adjustments using propensity scores. Journal of Educational and Behavioral Statistics.
• Steiner, Peter M., Thomas D. Cook, William R. Shadish & M.H. Clark (in press). The importance of covariate selection in controlling for selection bias in observational studies. Psychological Methods.
• Cook, T. D., Shadish, W. R., & Wong, V. C. (2008). Three conditions under which experiments and observational studies produce comparable causal estimates: New findings from within-study comparisons. Journal of Policy Analysis and Management, 27(4), 724–750.


Parents of Friends as Instrument for Friends?


Reflection

1) Identify the aspects that are unclear to you or that concern you

2) Find a partner or two and discuss your concerns

3) Be prepared to teach others or share concerns


Schools as Fixed or Random

• Problem: students and teachers are nested within schools (data are multilevel)
• Common problem in social science research: people nested within organizations
  – If there is no control for organizations, members of a given organization are commonly affected by that organization
  – Example: All students are commonly affected by their principal
  – Implication: error terms are not independent, standard errors are biased, p values are wrong!
• Response: control for schools
• Fixed effects: enter a dummy variable for each school (except one) to control for school effects.
  – Same way one controls for gender or race
• Random effects (multilevels):
  – Assume there is a distribution (e.g., normal) of effects across schools, only estimate parameters of distribution


Schools as Fixed or Random

• Fixed: essentially using dummy variables to control for each school
  – Spends degrees of freedom: 1 for each school
  – Focus on individual effects within contexts
  – Schools in the sample are the population of interest
  – Controls for all unobservable factors associated with school
• Random: assume residual school effects are normally distributed
  – Only estimate mean and variance, not each one
  – Can estimate effects at individual or school level, as well as cross-level interactions (slopes as outcomes)
  – Schools are considered a sample from a larger population
  – Controls for all unobservable factors associated with school?
    • Pretty much, with careful centering (see next results)
• Biggest difference is whether all predictors are adjusted for group characteristics
  – Fixed effects: yes
  – Random effects: no
    • (unless you group mean center all variables)
    • Subtract the group (school) mean from each predictor
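Group mean centering, as described above, can be sketched in a few lines. This is a minimal illustration with made-up values. The function name `group_mean_center` and the toy data are assumptions for the example, not part of the workshop files.

```python
from collections import defaultdict

def group_mean_center(values, groups):
    """Subtract each group's (school's) mean from the values of its members."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        sums[group] += value
        counts[group] += 1
    means = {group: sums[group] / counts[group] for group in sums}
    return [value - means[group] for value, group in zip(values, groups)]

# Hypothetical years of teaching for five teachers in two schools.
yrstch = [5.0, 15.0, 10.0, 20.0, 30.0]
school = ["A", "A", "A", "B", "B"]

centered = group_mean_center(yrstch, school)
print(centered)
```

After centering, every predictor has mean zero within each school, so its coefficient reflects only within-school variation, which is what the fixed-effects model uses.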


Syntax for Schools as Random versus Fixed Effects

UNIANOVA attracth BY school WITH bct expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna /METHOD = SSTYPE(3) /INTERCEPT = INCLUDE /PRINT = PARAMETER /CRITERIA = ALPHA(.05) /DESIGN = bct expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna school .

SORT CASES BY school (A) .

AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=school /bct_mean = MEAN(bct) /expanseh_mean = MEAN(expanseh) /white_mean = MEAN(white) /female_mean = MEAN(female) /leave_mean = MEAN(leave) /glevel_mean = MEAN(glevel) /nograde_mean = MEAN(nograde) /owned_mean = MEAN(owned) /yrstch_mean = MEAN(yrstch) /leader_mean = MEAN(leader) /nbct_mean = MEAN(nbct) /nbctsq_mean = MEAN(nbctsq) /bcttreat_mean = MEAN(bcttreat) /leadna_mean = MEAN(leadna).

COMPUTE bct = bct - bct_mean .
COMPUTE expanseh = expanseh - expanseh_mean .
COMPUTE white = white - white_mean .
…………………………………
COMPUTE leadna = leadna - leadna_mean .
EXECUTE .
SAVE OUTFILE=*+ ' inference workshop\spss dataset\cen.sav'
  /keep school attracth q71 bct expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna.

UNIANOVA attracth BY school WITH bct expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna /RANDOM=school /METHOD = SSTYPE(3) /INTERCEPT = INCLUDE /PRINT = PARAMETER /CRITERIA = ALPHA(.05) /DESIGN = bct expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna.


Output Controlling for Schools as Random Effects

Compare with estimate of .622681 (se=.091653) from model Controlling for schools as fixed effects


Statistical power in multilevels

• How to choose
  – Number of cases per unit
  – Number of units
• Where to allocate resources: rules of thumb
  – The larger the intraclass correlation (e.g., variation between schools), the more df are based on number of units, and you should sample more units and fewer per unit
  – The smaller the intraclass correlation, the more df are based on number of observations within units, and you should sample more observations per unit
  – 80 is good to detect moderate effect.
  – You need less if you have a pretest (increases precision)
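One way to see the first two rules of thumb numerically is through the standard design-effect approximation, 1 + (m − 1)ρ, for clusters of size m with intraclass correlation ρ. This formula is brought in here as an assumption for illustration. It is not taken from the slides, and it is not a substitute for a full power analysis.

```python
def effective_sample_size(n_units, n_per_unit, icc):
    """Approximate number of independent observations in a clustered sample."""
    design_effect = 1 + (n_per_unit - 1) * icc
    return n_units * n_per_unit / design_effect

# Same total of 1000 observations, allocated two ways (hypothetical numbers).
few_large_units = effective_sample_size(n_units=20, n_per_unit=50, icc=0.2)
many_small_units = effective_sample_size(n_units=50, n_per_unit=20, icc=0.2)
no_clustering = effective_sample_size(n_units=20, n_per_unit=50, icc=0.0)

print(round(few_large_units, 1), round(many_small_units, 1), no_clustering)
```

With a sizeable intraclass correlation, spreading the same budget over more units yields more effective information. With an intraclass correlation of zero, the allocation does not matter.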


Optimal design software

– http://sitemaker.umich.edu/group-based/home
– Developed by Raudenbush, S.
– http://sitemaker.umich.edu/group-based/optimal_design_software


References (http://sitemaker.umich.edu/group-based/references)

Bloom, H. S., Richburg-Hayes, L., & Black, A. R. (2007). Using Covariates to Improve Precision for Studies That Randomize Schools to Evaluate Educational Interventions. Educational Evaluation and Policy Analysis, 29(1), 30-59. (http://epa.sagepub.com/cgi/content/abstract/29/1/30, 10-03-2007)

This article examines how controlling statistically for baseline covariates, especially pretests, improves the precision of studies that randomize schools to measure the impacts of educational interventions on student achievement. Empirical findings from five urban school districts indicate that (1) pretests can reduce the number of randomized schools needed for a given level of precision to about half of what would be needed otherwise for elementary schools, one fifth for middle schools, and one tenth for high schools, and (2) school-level pretests are as effective in this regard as student-level pretests. Furthermore, the precision-enhancing power of pretests (3) declines only slightly as the number of years between the pretest and posttests increases; (4) improves only slightly with pretests for more than 1 baseline year; and (5) is substantial, even when the pretest differs from the posttest. The article compares these findings with past research and presents an approach for quantifying their uncertainty.

Hedges, L. V., & Hedberg, E. C. (2007). Intraclass Correlation Values for Planning Group-Randomized Trials in Education. Educational Evaluation and Policy Analysis, 29(1), 60-87. (http://epa.sagepub.com/cgi/content/abstract/29/1/60, 10-03-2007)

Experiments that assign intact groups to treatment conditions are increasingly common in social research. In educational research, the groups assigned are often schools. The design of group-randomized experiments requires knowledge of the intraclass correlation structure to compute statistical power and sample sizes required to achieve adequate power. This article provides a compilation of intraclass correlation values of academic achievement and related covariate effects that could be used for planning group-randomized experiments in education. It also provides variance component information that is useful in planning experiments involving covariates. The use of these values to compute the statistical power of group-randomized experiments is illustrated.

Raudenbush, S. W. (1997). Statistical Analysis and Optimal Design for Cluster Randomized Trials. Psychological Methods, 2(2), 173-185. (raudenbush.1997.pdf, 1854.0 kb, 10-03-2007)

Raudenbush, S. W., & Liu, X. (2001). Effects of Study Duration, Frequency of Observation, and Sample Size on Power in Studies of Group Differences in Polynomial Change. Psychological Methods, 6(4), 387-401. (raudenbush.liu.2001.pdf, 1551.0 kb, 10-03-2007)

Raudenbush, S. W., Martinez, A., & Spybrook, J. (2007). Strategies for Improving Precision in Group-Randomized Experiments. Educational Evaluation and Policy Analysis, 29(1), 5-29. (http://epa.sagepub.com/cgi/content/abstract/29/1/5, 10-03-2007)

Interest has rapidly increased in studies that randomly assign classrooms or schools to interventions. When well implemented, such studies eliminate selection bias, providing strong evidence about the impact of the interventions. However, unless expected impacts are large, the number of units to be randomized needs to be quite large to achieve adequate statistical power, making these studies potentially quite expensive. This article considers when and to what extent matching or covariance adjustment can reduce the number of groups needed to achieve adequate power and when these approaches actually reduce power. The presentation is nontechnical.


Differential Treatment Effects and Heckman’s Rationality

• Individuals choose treatments they expect will be most beneficial to them: they can anticipate the outcome of treatment.
  – Treatment effect for treated > treatment effect for control
  – Attend to assignment mechanism: factors that affect choice of treatment
• OLS estimates average treatment effect
  – Invalidates paradigm of randomized experiment because people choose treatments.


Differential Treatment Effects


Policy Implications

• Treatment effect for treated evaluates effect of existing program for those who received it.

• Treatment effect for control evaluates effect of program if it is expanded to those now receiving the control.


Propensity scores

– Estimate differential treatment effects

– Improve covariance adjustment

– Non-monotonic relationship between propensity and discriminant function of covariates

– Unequal variances in treatment and control group

» Dilation effect of treatment (Rosenbaum 2000)

– Motivate evaluation of assignment mechanism

• Cf. Heckman’s 2005 critique of Rubin/Holland model

– Align with counterfactual

• matched comparisons

– No need to match on all covariates

• comparisons within propensity strata

• Presentation loosely based on “Introduction to Propensity Score Matching,” Guo et al.

– http://ssw.unc.edu/jif/sacws/docs/Day1a.ppt


Definition of Propensity

Propensity of receiving treatment (i.e., s = 1) given covariates x: e(x) = Pr{s = 1 | x}.

Note that e(x) is not a probability, since all subjects have already received the treatment (1) or not (0).

Can be obtained as predicted value from logistic regression
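To make the definition concrete, the sketch below computes propensities for the simplest possible case: a single categorical covariate. In that case a saturated logistic regression reproduces the observed proportion treated within each covariate level, so no fitting routine is needed. The data and the function name `propensity_by_cell` are hypothetical. With several or continuous covariates one would fit a logistic regression instead.

```python
from collections import defaultdict

def propensity_by_cell(treated, covariate):
    """e(x) = Pr{s = 1 | x}, estimated as the proportion treated at each level of x."""
    n_cases = defaultdict(int)
    n_treated = defaultdict(int)
    for s, x in zip(treated, covariate):
        n_cases[x] += 1
        n_treated[x] += s
    return [n_treated[x] / n_cases[x] for x in covariate]

# Hypothetical data: eight teachers, covariate = female, treatment = board certified.
female = [1, 1, 1, 1, 0, 0, 0, 0]
bct = [1, 1, 1, 0, 1, 0, 0, 0]

pbct = propensity_by_cell(bct, female)
print(pbct)
```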


Impact of an Unmeasured Confounding Variable on Inference of Effect of Board Certification on Help Provided

Frank, K.A., Gary Sykes, Dorothea Anagnostopoulos, Marisa Cannata, Linda Chard, Ann Krause, Raven McCrory. 2008. Extended Influence: National Board Certified Teachers as Help Providers. Educational Evaluation and Policy Analysis, 30(1): 3-30.


Use Propensity to Weight Analysis: For Treatment Effect for Treated

$$\omega(t, x) = t + (1 - t)\,\frac{e(x)}{1 - e(x)}$$

where ω is the weight, t = treatment (1 or 0), and e(x) is the propensity to have received the treatment (predicted value from logistic regression).


Propensity and the Relevant Comparison: Estimate of Treatment for those who received the treatment

[Figure: bracket marking the Treatment Effect for Treated]


For Treatment Effect for Untreated

$$\omega(t, x) = t\,\frac{1 - e(x)}{e(x)} + (1 - t)$$



[Figure: estimate of treatment for those who received the control, with a bracket marking the Treatment Effect for Control]


Use Propensity to Weight Analysis: Estimate of Treatment for People at the Margin of Indifference (EOTM)

$$\omega(t, x) = \frac{t}{e(x)} + \frac{1 - t}{1 - e(x)}$$

(Hirano and Imbens 2001; Robins, Rotnitzky and Zhao 1995)
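The three weighting schemes from these slides (for the treated, for the untreated, and EOTM) can be written as small functions. This is a sketch that follows the slide formulas directly. The function names are illustrative assumptions, and propensities of exactly 0 or 1 are assumed not to occur.

```python
def weight_treated(t, e):
    """Treatment effect for the treated: t + (1 - t) * e / (1 - e)."""
    return t + (1 - t) * e / (1 - e)

def weight_control(t, e):
    """Treatment effect for the untreated: t * (1 - e) / e + (1 - t)."""
    return t * (1 - e) / e + (1 - t)

def weight_eotm(t, e):
    """People at the margin of indifference (EOTM): t / e + (1 - t) / (1 - e)."""
    return t / e + (1 - t) / (1 - e)

# Hypothetical propensity of 0.25, for a treated case (t=1) and a control case (t=0).
for t in (1, 0):
    print(t, weight_treated(t, 0.25), weight_control(t, 0.25), weight_eotm(t, 0.25))
```

Treated cases always receive weight 1 under the first scheme and control cases weight 1 under the second, so each scheme reweights only the comparison group toward the group of interest.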



[Figure: Estimated Effect for People at the Margin of Indifference (EOTM)]


General procedure for propensity score analysis

• Step 1) Estimate propensity of receiving the treatment (versus control)
  – Use logistic regression of factors predicting treatment versus control
  – Interpret logistic regression
  – Save predicted values: these are the propensities
• Step 2) Balance
  – Compare distribution of propensity by treatment and control groups
  – Compare treatment and control by covariates (balance) accounting for propensity
    • Either by strata or using weights
• Step 3) Estimate effect of treatment on outcome by
  – propensity strata
  – matching treatment and control groups on propensity
    • Includes composite matches (Heckman’s kernel functions)
  – weighting analyses by propensity (Ken’s preferred)
  – controlling for propensity (Heckman’s control functions)


Step 1) SPSS Syntax for propensity model and saving predicteds

GET FILE='F:\RA work\for Ken\causal inference\SPSS\workshop.sav'.

LOGISTIC REGRESSION bct
  /METHOD = ENTER expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna
  /SAVE = PRED
  /CRITERIA = PIN(.05) POUT(.10) ITERATE(20) CUT(.5) .
COMPUTE pbct=pre_1.
EXECUTE .
SAVE OUTFILE='F:\RA work\for Ken\causal inference\SPSS\withp.sav' /COMPRESSED.

GET FILE='F:\RA work\for Ken\causal inference\SPSS\withp.sav'.
COMPUTE pbct=pre_1.
* pweight: treated cases get 1/pbct, control cases get 1/(1 - pbct).
IF (pbct > 0) pweight=bct/pbct + (1-bct)/(1-pbct).
* pweightt: treated cases get 1, control cases get 1/(1 - pbct).
IF (pbct > 0) pweightt=bct + (1-bct)/(1-pbct).
* pweightc: treated cases get 1/pbct, control cases get 1.
IF (pbct > 0) pweightc=bct/pbct + (1-bct).
EXECUTE .

VARIABLE LABELS pbct 'baseline propensity'.
VARIABLE LABELS pweight 'weight EOTM: those on the margin weight'.
VARIABLE LABELS pweightt 'weight for treatment effect for treated'.
VARIABLE LABELS pweightc 'weight for treatment effect for control'.
EXECUTE.

SAVE OUTFILE='F:\RA work\for Ken\causal inference\SPSS\pmp.sav' /COMPRESSED.


Table 2: Logistic Regression for Being Board Certified

| Independent Variable | Estimate | Standard Error | Wald Chi-Square | Pr>ChiSq |
|---|---|---|---|---|
| Intercept | -6.8725 | 1.0566 | 35.1514 | <.0001 |
| White | -.078 | .246 | .101 | .751 |
| Female | 1.447 | .605 | 5.722 | .017 |
| highest grade level taught | -.0001 | .023 | .0000 | .996 |
| no grade level indicated | -1.176 | .776 | 2.297 | .130 |
| level of own education | .403 | .100 | 16.348 | <.0001 |
| Years teaching | .003 | .011 | .055 | .814 |
| Intention to Leave | -.097 | .131 | .549 | .459 |
| perceived advantage of certification | .136 | .160 | .731 | .393 |
| Enhancement of teaching through leadership | .695 | .157 | 19.482 | <.0001 |
| missing on enhancement of teaching | .962 | .582 | 2.735 | .098 |
| number other teachers who helped respondent | 0.185 | .1100 | 2.818 | 0.1230 |
| number certified others in school | .1234 | .068 | 3.306 | .069 |
| number certified others in school squared | -.013 | .014 | .913 | .339 |


Interpreting logistic regression

• Key predictors
  – Level of own education
  – Enhancement of teaching through leadership
• Adjusting for context through number of others in school who were certified
• Keep in even marginal variables
• Logistic function correctly classifies 62% of cases when classified as BCT if probability > .13 (13% of teachers are Board certified)


Step 2) Syntax for checking balance of propensity

GET FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\pmp.sav'.

CROSSTABS
  /TABLES=bct BY female
  /FORMAT= AVALUE TABLES
  /CELLS= COUNT ROW COLUMN TOTAL
  /COUNT ROUND CELL .

SORT CASES BY bct (A) .

EXAMINE VARIABLES=pbct pweight pweightt pweightc BY bct
  /PLOT BOXPLOT HISTOGRAM
  /COMPARE GROUP
  /STATISTICS NONE
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.


Boxplot Comparison of Distributions of Propensity between NBCTs and non-NBCTs: Common support

[Boxplot: Propensity Score by group (NBCT, Other)]


EOTM Weights before Trimming


Code for Trimming weight and recheck balance of propensity

RECODE pweight (20 thru Highest=20) .
EXECUTE .
RECODE pweight pweightc (20 thru Highest=20) .
EXECUTE .

SAVE OUTFILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\mp.sav' /COMPRESSED.

subtitle "visual of balance of weights and propensity".
EXAMINE VARIABLES=pbct pweight pweightt pweightc BY bct
  /PLOT BOXPLOT HISTOGRAM
  /COMPARE GROUP
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.
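The trimming step in the RECODE commands caps any weight above 20 at 20, so that a few cases with extreme propensities do not dominate the analysis. A minimal sketch of the same operation, with hypothetical weights and an illustrative function name:

```python
def trim_weights(weights, cap=20.0):
    """Replace every weight above the cap with the cap (cap of 20 as in the slides)."""
    return [min(w, cap) for w in weights]

# Hypothetical EOTM weights before trimming.
pweight = [1.2, 35.0, 20.0, 4.0]
trimmed = trim_weights(pweight)
print(trimmed)
```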


Weights after trimming


Syntax for checking balance of covariates

DESCRIPTIVES VARIABLES=pweight /STATISTICS=MEAN .
COMPUTE npweight = pweight /1.943653666769.
EXECUTE .

WEIGHT BY npweight .
T-TEST GROUPS = bct(0 1)
  /MISSING = ANALYSIS
  /VARIABLES = attracth expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna
  /CRITERIA = CI(.95) .

WEIGHT BY npweight .
SORT CASES BY bct .
SPLIT FILE LAYERED BY bct .
DESCRIPTIVES VARIABLES=attracth expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna
  /STATISTICS=MEAN STDDEV MIN MAX.
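The logic of this balance check can be sketched as follows: divide the weights by their mean (as the COMPUTE npweight line does), then compare weighted covariate means across treatment groups. The toy data, propensities, and helper names below are hypothetical and chosen so the arithmetic is easy to follow.

```python
def eotm_weight(t, e):
    """EOTM weight: t / e + (1 - t) / (1 - e)."""
    return t / e + (1 - t) / (1 - e)

def normalize(weights):
    """Divide by the mean weight so the weights average 1."""
    mean_w = sum(weights) / len(weights)
    return [w / mean_w for w in weights]

def group_mean(values, weights, treated, group):
    """Weighted mean of a covariate within one treatment group."""
    keep = [i for i, t in enumerate(treated) if t == group]
    total = sum(weights[i] for i in keep)
    return sum(values[i] * weights[i] for i in keep) / total

# Hypothetical data: propensity depends only on female.
female = [1, 1, 1, 1, 0, 0, 0, 0]
bct = [1, 1, 1, 0, 1, 0, 0, 0]
pbct = [0.75, 0.75, 0.75, 0.75, 0.25, 0.25, 0.25, 0.25]

npweight = normalize([eotm_weight(t, e) for t, e in zip(bct, pbct)])
ones = [1.0] * len(bct)

unweighted = (group_mean(female, ones, bct, 1), group_mean(female, ones, bct, 0))
weighted = (group_mean(female, npweight, bct, 1), group_mean(female, npweight, bct, 0))
print(unweighted, weighted)
```

In this toy example the unweighted groups differ sharply on the covariate, while the weighted groups have the same mean, which is the pattern a balance table is meant to reveal.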


Testing for Balance, weighted by Propensity (EOTM)

| Variable | BCT (n=162) | Non-BCT (n=1038) |
|---|---|---|
| Number other teachers helped by respondent | 1.38 (3.59) | .89 (1.06) |
| number other teachers who helped respondent | .90 (2.09) | .91 (.81) |
| White | .82 (1.00) | .84 (.39) |
| Female | .95 (.59) | .93 (.27) |
| highest grade level taught | 8.42 (10.21) | 8.3 (4.45) |
| no grade level indicated* | .01 (.31) | .04 (.21) |
| level of own education | 3.08 (2.57) | 3.01 (1.10) |
| years teaching | 15.92 (18.22) | 16.1 (9.56) |
| Intention to leave | 1.68 (1.96) | 1.70 (.80) |
| perceived advantage of certification | 1.95 (1.22) | 1.94 (.61) |
| enhancement through leadership | 2.43 (3.13) | 2.35 (1.29) |
| number certified others in school | 2.43 (6.50) | 2.31 (2.63) |


Exercise: What is Balance without weights?

GET FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\mp.sav'.

subtitle "checking for balance among covariates".

T-TEST GROUPS = bct(0 1)
  /MISSING = ANALYSIS
  /VARIABLES = attracth expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna
  /CRITERIA = CI(.95) .

subtitle "checking for balance among covariates".

SORT CASES BY bct .
SPLIT FILE LAYERED BY bct .
DESCRIPTIVES VARIABLES=attracth expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna
  /STATISTICS=MEAN STDDEV MIN MAX.
SPLIT FILE OFF.


Step 3) syntax for estimating effects with weights

subtitle "weighted by pweight, EOTM".
UNIANOVA attracth BY school WITH bct
  /REGWGT = npweight
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /PRINT = ETASQ PARAMETER
  /CRITERIA = ALPHA(.05)
  /DESIGN = bct school .

subtitle "weighted by pweightt, for treated".
UNIANOVA attracth BY school WITH bct
  /REGWGT = npweightt
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /PRINT = ETASQ PARAMETER
  /CRITERIA = ALPHA(.05)
  /DESIGN = bct school .

subtitle "weighted by pweightc, for control".
UNIANOVA attracth BY school WITH bct
  /REGWGT = npweightc
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /PRINT = ETASQ PARAMETER
  /CRITERIA = ALPHA(.05)
  /DESIGN = bct school .

*notes for syntax to get npweight, npweightt, npweightc
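To show what a propensity-weighted estimate does, the sketch below compares an unweighted and an EOTM-weighted difference in mean outcomes. It is a deliberately simplified stand-in for the weighted UNIANOVA runs: it omits the school fixed effects, and the data are hypothetical, constructed so that the true treatment effect is 1 and female is a confound.

```python
def eotm_weight(t, e):
    """EOTM weight: t / e + (1 - t) / (1 - e)."""
    return t / e + (1 - t) / (1 - e)

def weighted_effect(outcome, treated, weights):
    """Weighted mean outcome of the treated minus that of the controls.

    With a single 0/1 predictor this equals the weighted least squares slope.
    """
    def wmean(group):
        keep = [i for i, t in enumerate(treated) if t == group]
        return sum(outcome[i] * weights[i] for i in keep) / sum(weights[i] for i in keep)
    return wmean(1) - wmean(0)

# Hypothetical data: outcome = 1 * bct + 2 * female, propensity depends on female.
female = [1, 1, 1, 1, 0, 0, 0, 0]
bct = [1, 1, 1, 0, 1, 0, 0, 0]
pbct = [0.75, 0.75, 0.75, 0.75, 0.25, 0.25, 0.25, 0.25]
helped = [1 * t + 2 * f for t, f in zip(bct, female)]

unweighted = weighted_effect(helped, bct, [1.0] * len(bct))
weighted = weighted_effect(helped, bct, [eotm_weight(t, e) for t, e in zip(bct, pbct)])
print(unweighted, weighted)
```

In this constructed example the unweighted difference is inflated by the confound, and weighting by the propensity recovers the effect built into the data.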


Syntax and Output for Treatment Effect for Treated

UNIANOVA attracth BY school WITH bct /REGWGT = npweightt /METHOD = SSTYPE(3) /INTERCEPT = INCLUDE /PRINT = PARAMETER /CRITERIA = ALPHA(.05) /DESIGN = bct school .


Syntax and Output for Treatment Effect for Control

UNIANOVA attracth BY school WITH bct /REGWGT = npweightc /METHOD = SSTYPE(3) /INTERCEPT = INCLUDE /PRINT = PARAMETER /CRITERIA = ALPHA(.05) /DESIGN = bct school .


Syntax and Output for EOTM

UNIANOVA attracth BY school WITH bct /REGWGT = npweight /METHOD = SSTYPE(3) /INTERCEPT = INCLUDE /PRINT = PARAMETER /CRITERIA = ALPHA(.05) /DESIGN = bct school .


Table 3: Estimated Effect of Board Certification on Amount of Help Provided

Non bootstrap standard errors in ().

| Model* | Coefficient | Std error | t-ratio | P Value |
|---|---|---|---|---|
| Weighted by propensity (EOTM) | .569 | .138 | 4.12 | <.001 |
| Weighted by propensity (treatment effect for the treated) | .598 | .130 | 4.60 | <.001 |
| Weighted by propensity (treatment effect for the control) | .562 | .138 | 4.07 | <.001 |
| Unweighted, with covariates (a) | .603 | .092 | 6.56 | <.001 |
| Unweighted with covariates, using multiple imputation | .621 | .092 | 6.75 | <.001 |
| Unweighted, no covariates | .583 | .092 | 6.35 | <.001 |
| Unweighted, no control for school | .540 | .092 | 5.88 | <.001 |
| NBPTS certified teacher versus other teachers who applied, EOTM (n=280, bct=160, non-bct=120) | .577 | .167 | 3.46 | <.001 |
| NBPTS certified teacher versus other teachers who did not apply, EOTM (n=1017, bct=160, non-bct=857) | .562 | .139 | 4.04 | <.001 |

*Schools controlled for with fixed effects in all models unless otherwise stated. n=1131 unless otherwise stated. Standard errors based on 500 bootstrap replications.
(a) R2=.21 for standard model with covariates.


Interpretation

• Propensity weighting did not make much of a difference!
• Allowed for focus on different treatment effects
• In paper, applied robustness indices to estimates based on propensities
• Schools controlled for with fixed effects
  – Accounts for any factor that can be attributed to schools
    • Principal, student composition, unmeasured factors


Criticisms of propensity scores

• No better than the covariates that go into it
  – No control for unobservables
• Ambivalent about quality of propensity model
  – Group overlap must be substantial
  – Propensity model should not fit too well!
    • Implies confounding of covariates and treatment
  – Not good enough implies poorly understood treatment mechanism (poor control)
• Short-term biases (2 years) are substantially less than medium-term (3 to 5 year) biases; the value of comparison groups may deteriorate


Reflection

1) Identify the aspects that are unclear to you or that concern you

2) Find a partner or two and discuss your concerns

3) Be prepared to teach others or share concerns


Alternative to Weighting by Propensity

• Matching (Rosenbaum and Rubin 1983; Morgan 2001)

• Analyses by Strata (Morgan 2001)

• Kernel Matching (Heckman et al.)

• Control for propensity (Heckman and Robb’s control function – see Winship and Morgan 677).


Matching, Propensity strata and Regression Adjustment

• Heckman refers to regression adjustment as the same as matching and propensity strata. Here’s why:
• Matching with an infinite number of strata:
  – One pair of observations, in treatment and control, within each stratum
• Implies that strata level is not related to treatment: there’s a treatment and control in each stratum.
• Estimate from matching would be mean difference between treatment and control groups


Matching, Propensity strata and Regression Adjustment

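If there is one treatment case and one control case in each stratum, the estimate from regression would be the mean difference between treatment and control because:

$$r_{x \cdot y | cv} = \frac{r_{x \cdot y} - r_{x \cdot cv}\, r_{y \cdot cv}}{\sqrt{1 - r_{y \cdot cv}^2}\,\sqrt{1 - r_{x \cdot cv}^2}}$$

But $r_{x \cdot cv} = 0$ (because every stratum contains one treatment and one control case, stratum is unrelated to treatment), so the numerator reduces to $r_{x \cdot y}$. Controlling for strata therefore leaves the estimated treatment coefficient unchanged, and regression generates the same estimate as matching.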
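A quick numeric check of the partial-correlation argument, using hypothetical correlations. The function simply evaluates the standard partial correlation formula. When the covariate is unrelated to treatment, the numerator is the raw correlation, and the only remaining change is the rescaling by the outcome variance the covariate explains.

```python
import math

def partial_corr(r_xy, r_xcv, r_ycv):
    """Partial correlation of x and y controlling for cv."""
    numerator = r_xy - r_xcv * r_ycv
    denominator = math.sqrt(1 - r_ycv ** 2) * math.sqrt(1 - r_xcv ** 2)
    return numerator / denominator

# Hypothetical values: treatment-outcome correlation of .3, covariate-outcome of .4.
related = partial_corr(r_xy=0.3, r_xcv=0.5, r_ycv=0.4)    # cv related to treatment
unrelated = partial_corr(r_xy=0.3, r_xcv=0.0, r_ycv=0.4)  # cv unrelated to treatment
print(round(related, 4), round(unrelated, 4))
```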


Syntax for Propensity by Strata

GET FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\forstrata.sav'.

RANK VARIABLES=rpbct (A) /RANK /NTILES (5) /PRINT=YES /TIES=MEAN .

RECODE Nrpbct (1=0) (2=1) (3=2) (4=3) (5=4) (SYSMIS=SYSMIS) INTO rpbct .
EXECUTE .

SAVE OUTFILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\strata.sav' /COMPRESSED.

SORT CASES BY rpbct (A) .

SAVE OUTFILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\strata_s.sav' /COMPRESSED.

GET FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\strata_s.sav'.

subtitle "checking for balance".
SORT CASES BY rpbct .
SPLIT FILE LAYERED BY rpbct .
T-TEST GROUPS = bct(0 1)
  /MISSING = ANALYSIS
  /VARIABLES = attracth expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna
  /CRITERIA = CI(.95) .
SPLIT FILE OFF.

SORT CASES BY rpbct .
SPLIT FILE LAYERED BY rpbct .
DESCRIPTIVES VARIABLES=attracth expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna
  /STATISTICS=MEAN STDDEV MIN MAX.
SPLIT FILE OFF.

subtitle "estimate by strata".
SORT CASES BY rpbct .
SPLIT FILE LAYERED BY rpbct .
UNIANOVA attracth BY school WITH bct
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /PRINT = PARAMETER
  /CRITERIA = ALPHA(.05)
  /DESIGN = bct school.
SPLIT FILE OFF.


Estimates by Strata (including controls for school)

| Strata | Est | se |
|---|---|---|
| 1 | .22 | .33 |
| 2 | .74 | .21 |
| 3 | .40 | .21 |
| 4 | .62 | .19 |
| Average | .50 | |
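The strata approach can be sketched in two steps: cut the propensity into equal-sized ranked groups (quintiles in the syntax above), then average the within-stratum estimates. The propensities below are hypothetical, and ties are broken by sort order rather than by mean rank. The four estimates are the ones reported in the table.

```python
from statistics import mean

def assign_strata(propensity, n_strata=5):
    """Label each case 0..n_strata-1 by the rank of its propensity."""
    order = sorted(range(len(propensity)), key=lambda i: propensity[i])
    strata = [0] * len(propensity)
    for rank, i in enumerate(order):
        strata[i] = rank * n_strata // len(propensity)
    return strata

# Hypothetical propensities for ten cases.
pbct = [0.05, 0.9, 0.3, 0.5, 0.1, 0.7, 0.2, 0.6, 0.4, 0.8]
strata = assign_strata(pbct)
print(strata)

# Within-stratum estimates as reported in the table above.
estimates = [0.22, 0.74, 0.40, 0.62]
average = mean(estimates)
print(average)
```

The simple average of the four reported estimates is about .495, consistent with the .50 shown in the table.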


Exercise

Identify an inference regarding an effect in your own work that might benefit from using propensity scores:

1) Is the “treatment” dichotomous?

2) Are you interested in differential treatment effects (e.g., for the control and for the treated)?

3) Do you know what factors affect treatment choice?

4) Which propensity approach appeals to you?


Substantive Conclusion

• We infer that National Board certification has an effect on the amount of help teachers provide to others
  – Effect is at least half a standard deviation
  – Largest effect is more than 1-to-1 diffusion: for every BCT, 1.5 others receive help (e.g., 4 BCTs help 6 others, for a total of 10 in the school affected by the process).
  – Let’s debate in quantitative terms of robustness indices.


Policy Implications

• Extra Benefit of Board Certification
  – Contribute to social capital
  – Spread ideas of board certification
  – Help other teachers innovate
• Offer incentives for Board Certification
  – Can advocate policy because inferences robust


Methodological Conclusion

• Propensity scores narrow the estimate

• Robustness Indices quantify threats to validity

• Robustness Indices more informative than propensity scores?


Methods Reviewed

• Counterfactual (2 possible outcomes)
• Statistical control
  – Random and fixed effects
• Robustness of inference
  – for impact of a confounding variable (internal validity)
  – for representativeness of sample (external validity)
  – Robustness indices a form of sensitivity analysis
• Absorption
  – Randomization
  – Instrumental variables
  – Pre-test
• Differential treatment effects
  – Treatment effect for treated/for control
• Propensity scores
  – Attention to assignment mechanism
    • Logistic regression
  – Using propensity scores in analysis
    • Weighting
    • Control
    • Strata
    • Matching


References on Causal Inference

• Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81, 945-970.
• Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and non-randomized studies. Journal of Educational Psychology, 66, 688-701.
• Rubin, D. B. (2004). “Teaching Statistical Inference for Causal Effects in Experiments and Observational Studies.” Journal of Educational and Behavioral Statistics, 29(3): 343-368.
• Winship, C., & Morgan, S. (1999). The Estimation of Causal Effects from Observational Data. Annual Review of Sociology, 25, 659-707.
• Winship, C. and Sobel, M. (2004). “Causal Inference in Sociological Studies.” Chapter 21 in Handbook of Data Analysis (Hardy, Melissa, and Bryman, Alan, eds.). London: Sage Publications.
• Heckman, James. (2005). “The Scientific Model of Causality.” Sociological Methodology.
• Manski, Charles F. 1995. Identification Problems in the Social Sciences. Cambridge, MA: Harvard University Press.
• Rosenbaum, Paul R. (2002). Observational Studies. New York: Springer.

On the Web

• http://www.wjh.harvard.edu/soc/faculty/winship/CFA_site.html (Winship’s portal)
• http://www.ets.org/research/dload/AERA_2004-Holland.pdf (recent Paul Holland)
• http://bayes.cs.ucla.edu/jp_home.html (Judea Pearl)
• http://plato.stanford.edu/entries/causation-counterfactual/ (philosophy of counterfactual)
• http://sekhon.berkeley.edu/causalinf/causalinf.pdf (syllabus on causal inference)


Technical Appendix B for calculating Impact Thresholds

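Spreadsheet row 2:

| Cell | Quantity | Value or formula |
|---|---|---|
| A2 | t critical | 1.96 |
| B2 | n | 1295 |
| C2 | r# | =+A2/SQRT(A2*A2+B2-3) |
| D2 | observed t | 7.34 |
| E2 | r (x,y) | =+D2/SQRT(B2-2+D2*D2) |
| F2 | ITCV | =+(E2-C2)/(1-C2) |
| G2 | r(x,cv) | =+SQRT(F2) |
| H2 | r(y,cv) | =+SQRT(F2) |

Spreadsheet row 7: Multivariate (with other covariates, z, in model)

| Cell | Quantity | Value or formula |
|---|---|---|
| A7 | t critical | 1.96 |
| B7 | num z | 45 |
| C7 | r# | =+A7/(SQRT(A7*A7+B2-B7-3)) |
| D7 | R2 (x,z) | 0.15 |
| E7 | R2 (y,z) | 0.13 |
| F7 | ITCV | =+F2*SQRT((1-D7)*(1-E7)) |
| G7 | r(x,cv) | =SQRT(+F7*SQRT((1-D7)/(1-E7))) |
| H7 | r(y,cv) | =SQRT(+F7*SQRT((1-E7)/(1-D7))) |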

User enters values in yellow boxes

Indices calculated in pink

User can replace threshold value, r#, in green. Default is defined by statistical significance

Note that R2 (x,z) and R2 (y,z) only need to be entered to correct ITCV calculations in F7-H7.

Can be downloaded from http://www.msu.edu/~kenfrank/
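For readers without the spreadsheet, the first-row calculations (no other covariates) can be mirrored in a few lines. This sketch transcribes the cell formulas for r#, r(x,y), and the ITCV as they appear in the appendix. The function names are illustrative, and the inputs are the example values shown there (t critical = 1.96, n = 1295, observed t = 7.34).

```python
import math

def threshold_r(t_critical, n):
    """r#: correlation at the threshold for inference (spreadsheet cell C2)."""
    return t_critical / math.sqrt(t_critical ** 2 + n - 3)

def observed_r(t_observed, n):
    """r(x,y) implied by the observed t-ratio (cell E2)."""
    return t_observed / math.sqrt(n - 2 + t_observed ** 2)

def itcv(r_xy, r_threshold):
    """Impact threshold for a confounding variable (cell F2)."""
    return (r_xy - r_threshold) / (1 - r_threshold)

r_sharp = threshold_r(1.96, 1295)
r_xy = observed_r(7.34, 1295)
impact_threshold = itcv(r_xy, r_sharp)
# Cells G2 and H2: the component correlations r(x,cv) and r(y,cv).
component = math.sqrt(impact_threshold)

print(round(r_sharp, 4), round(r_xy, 4), round(impact_threshold, 4), round(component, 4))
```

Under these inputs, an omitted confound would need an impact of roughly .15 (correlations of about .39 with both treatment and outcome) to bring the estimate down to the threshold.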


Questions for Scott Page

• When a group makes a decision from statistical evidence
  – Are they making a causal inference?
  – How much of the discourse is “you didn’t control for xxx”?
    • Ken says: How strong would an unmeasured factor have to be to invalidate the inference?
  – Do you believe in statistical controls?
• Class project: Evaluating effect of NCLB sanctions on Michigan schools
  – Compare schools just above cutoff for sanctions with those just below cutoff for sanctions
