Introduction to Causal Inference
Kenneth A. Frank, CSTAT, 2-4-2011
Overview
• Alternative Causal Mechanisms and the Counterfactual
• Approximations to the Counterfactual
• How Regression works: Explained Variance in Regression
• Concern over Missing Confound (Internal Validity)
• Consider Alternate Sample (External Validity)
• Defining Absorption
• Analyzing Pre/post-test designs ANCOVA: Analysis of Cova...
• Schools as Fixed or Random
• Statistical power in multilevels
• Differential Treatment Effects and Heckman’s Rationality
• References on Causal Inference
My Take
• Sociological
• Motivated by studies of social context
– People select themselves into contexts
– Cannot randomize
– Each context is different (effects across contexts?)
• Regression based
– Control for confounds
– Explore interactions
• Sensitivity/robustness
– What would it take to invalidate an inference?
Methods Covered
• Counterfactual (2 potential outcomes)
• Statistical control via regression/general linear model
– Random and fixed effects
• Robustness of inference
– for impact of a confounding variable (internal validity)
– for representativeness of sample (external validity)
– Robustness indices a form of sensitivity analysis
• Absorption
– Randomization
– Instrumental variables
– Pre-test
• Differential treatment effects
– Treatment effect for treated/for control
• Propensity scores
– Attention to assignment mechanism
• Logistic regression
– Using propensity scores in analysis
• Weighting
• Control
• Strata
• Matching
Example: The effect of National Board Certification on the help a teacher provides others (Frank et al)
What is National Board Certification?
The National Board (a private organization) offers a certification process for primary and secondary teachers. The process takes approximately 1 year and involves considerable reflection and documentation of practice. Emphasis on progressive approach to teaching and engagement in professional leadership.
The fifth core proposition of the NBPTS states that accomplished teaching reaches outside of the individual classroom and involves collaboration with other teachers, parents, administrators, and others (National Board for Professional Teaching Standards, 1989)
Descriptive
Q: Do National Board certified teachers (NBCTs) provide more help to others in their schools than non-NBCTs?
A: Yes, the average NBCT is nominated by about 1.6 others as providing help with instruction, in contrast to about .95 for a non-NBCT.
Causal Inference
Q: Does National Board certification affect the amount of help a teacher provides?
Frank, K.A., Gary Sykes, Dorothea Anagnostopoulos, Marisa Cannata, Linda Chard, Ann Krause, Raven McCrory. Extended Influence: National Board Certified Teachers as Help Providers. Submitted to Education, Evaluation, and Policy Analysis
Policy Implications
• Board has emphasized helpfulness as one of its goals
• Other practices of BCTs may disseminate throughout the school
• Key goal of organizational literature has been to cultivate more “social capital” and sense of community, where teachers help each other more, leading to better student outcomes.
• Amount of help teachers receive affects implementation of innovations (Frank, Zhao and Borman 2004; Zhao and Frank 2003) http://www.msu.edu/~kenfrank/research.htm#social
Incentives for more teachers within existing BCT oriented schools to become BCT’s
Incentives for schools and districts with few or no BCTs to engage BCT
Correlation Does Not Equal Causation
• Estimated effect could be attributed to an unmeasured covariate (an alternative causal mechanism)
• Example
y = amount of help a teacher provides to others
s = whether or not a teacher became National Board Certified
cv = confounding variable (e.g., inclination to be helpful) representing an alternative causal mechanism
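[Figure: The Impact of a Confounding Variable on a Regression Coefficient. Path diagram: Inclination to be Helpful (confounding variable, cv) is linked to Board Certified (s) by $r_{s \cdot cv}$ and to Number of others helped (y) by $r_{y \cdot cv}$; the product $r_{s \cdot cv} \times r_{y \cdot cv}$ bears on $t(\beta_1)$ for the path from s to y.]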
Alternative Causal Mechanisms and the Counterfactual
1) I have a headache
2) I take an aspirin (treatment)
3) My headache goes away (outcome)
Q) Is it because I took the aspirin?
A) We’ll never know – it is counterfactual – for the individual
This is the Fundamental Problem of Causal Inference
Treatment Effect and Missing data for the Counterfactual
[Table: Potential Outcome by Assignment]
Counterfactual and Philosophers: Hume
• Spatial/temporal contiguity:
– Cause and measurement of effect apply to single unit
• Temporal succession
– Effect assessed after treatment is applied
• Constant conjunction
– If effect is constant
• Missing: effect of one cause is relative to effects of others
Mill
• Liked the experimental paradigm
• Concomitant variation:
– Correlational smoke → causational fire (I agree, more later)
• Method of Difference: $Y_i^t - Y_i^c$
• Method of Residues: $Y_{ab} - Y_a$
• Method of Agreement: $Y_i^t - Y_i^c = 0$ implies null effect
– compare observed effect against null effect
• Limitation: anything can be a cause
Suppes
• Prima facie cause
– Correlation
• Genuine cause
– No confounding variables
• Liked the experimental paradigm
• Limitation: must explain full cause of effect, rather than small effect of particular cause
Lewis
• Named the counterfactual
• “If A were the case, C would be the case” is true in the actual world if and only if (i) there are no possible A-worlds; or (ii) some A-world where C holds is closer to the actual world than is any A-world where C does not hold. http://plato.stanford.edu/entries/causation-counterfactual/
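Basic Model for the Counterfactual
Outcome with treatment: $9 = 2 + 4 + 3$
Outcome with control: $5 = 2 + 3$
Treatment effect: $[2 + 4 + 3] - [2 + 3] = (2 - 2) + (4 - 0) + (3 - 3) = 4$
As one model with a treatment indicator (1 or 0): $y = 2 + (1 \text{ or } 0) \times 4 + 3$
$9 = 2 + (1) \times 4 + 3$
$5 = 2 + (0) \times 4 + 3$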
Treatment Effect and Missing data for the Counterfactual
[Table: Potential Outcome by Assignment]
Reflection
• What part is most confusing to you?
– Why?
– More than one interpretation?
• Talk with one other, share
• Find a new partner and share problems and solutions
Approximations to the Counterfactual
• Compare repetitions within person (observe teachers before and after certification)
• Randomly assign people to become certified or not (Fisher/Rosenbaum)
– Randomization (with large enough n) ensures that there will be no baseline differences between those assigned to treatment and those assigned to control
• Regression (assuming all relevant confounds have been measured)
• Each attempts to approximate the counterfactual by ensuring no relationship between confound and assignment to treatment condition ($r_{x \cdot cv} = 0 \rightarrow r_{x \cdot cv} \times r_{y \cdot cv} = 0$)
Randomization often not possible, especially for social contexts
• Logistics
– Getting people to agree
• Independence
– People within social contexts (e.g., schools) are dependent → randomize at level of context (the school): $$$$$$$
• Ethics
– Assigning adolescents to friendship groups?!
• Timing: the longer the treatment intervention, the more likely to violate assumption that control group represents forecast for treatment group
• Exposure to confounding with small n
Rubin’s (1974) response
• Was causal inference impossible prior to randomized experiments (circa 1930)?
• Make maximum use of data
• Approximate counterfactual
– Statistical control
– Propensity score matching: match those who received treatment with similar others who received control (like “twins”).
Employ Statistical Control for Confound
$y = \beta_0 + \beta_1 s + \beta_2\,\mathrm{confound}$
SPSS Syntax for reading in toy counterfactual data
DATA LIST FREE / y confound s .
Begin DATA .
9 6 1
10 7 1
11 8 1
5 3 0
6 4 0
7 5 0
End DATA .
Counterfactual Predicted Values from Regression: Effect isn’t 4, it’s 1!
Regression Without Control: wrong answer: Estimate of 4
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT y
  /METHOD=ENTER s .
Regression with Control: Right answer, Estimate of 1
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT y
  /METHOD=ENTER s confound .
$y = \beta_0 + \beta_1 s + \beta_2\,\mathrm{confound}$
$\hat{y}^{c}_{1} = 2 + 1 \times 0 + 1 \times 6 = 8$
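The two regressions can be checked outside SPSS. The following is a minimal Python sketch, assuming NumPy is available; the helper name `ols` is hypothetical and only wraps an ordinary least-squares fit on the six toy cases read in by the DATA LIST block.

```python
import numpy as np

# Toy counterfactual data from the DATA LIST block: y, confound, s
y = np.array([9, 10, 11, 5, 6, 7], dtype=float)
confound = np.array([6, 7, 8, 3, 4, 5], dtype=float)
s = np.array([1, 1, 1, 0, 0, 0], dtype=float)


def ols(outcome, *predictors):
    """Least-squares coefficients, intercept first (hypothetical helper)."""
    X = np.column_stack([np.ones_like(outcome), *predictors])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return coef


# Regression without control: y on s only
naive = ols(y, s)
# Regression with control: y on s and the confound
controlled = ols(y, s, confound)

print("estimate for s without control:", round(naive[1], 3))
print("estimate for s with control:", round(controlled[1], 3))
```

The uncontrolled fit reproduces the raw difference in group means, while the controlled fit attributes most of that difference to the confound.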
Counterfactual Predicted Values from Regression: Effect isn’t 4, it’s 1!
Keys to Statistical Control
• Need to know and measure relevant covariates (identically independently distributed errors)
– Omitted confound → dependencies among units that have similar values on the confound (e.g., teachers who are similarly inclined to help)
• Assumes optimal control for covariate is linear function of X’s
• Assumes constant treatment effect
How Regression works: Explained Variance in Regression
[Figure: Venn diagram of Y, X1, and X2. Circles represent variances.]
X1 and X2 explain different parts of Y
X1 and X2 are independent (uncorrelated)
But usually there is multicollinearity (or the need for statistical control)
[Figure: Venn diagram of Y with overlapping X1 and X2.]
‘Competition’ between the variables (in explaining Y)!
The degree of competition depends on the amount of correlation (overlap) between the ‘independent’ (!) variables
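[Figure: Venn diagram of Y, X1, and X2 with regions a (Y explained only by X1), b (Y explained only by X2), c (Y explained jointly by X1 and X2), and e (unexplained part of Y).]
$r^2_{YX_1} = a + c$
$r^2_{YX_2} = b + c$
$R^2_{Y \cdot X_1 X_2} = a + b + c$
Squared partial correlation for X1: $pr_1^2 = \frac{a}{a + e} = \frac{R^2_{Y \cdot X_1 X_2} - r^2_{YX_2}}{1 - r^2_{YX_2}}$
Squared semi-partial correlation for X1: $sr_1^2 = a = R^2_{Y \cdot X_1 X_2} - r^2_{YX_2}$
Squared partial correlation for X2: $pr_2^2 = \frac{b}{b + e} = \frac{R^2_{Y \cdot X_1 X_2} - r^2_{YX_1}}{1 - r^2_{YX_1}}$
Squared semi-partial correlation for X2: $sr_2^2 = b = R^2_{Y \cdot X_1 X_2} - r^2_{YX_1}$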
Focus on Overlap and alternative explanations
Example: The effect of National Board Certification on the help a teacher provides others (Frank et al)
Descriptive
Q: Do National Board certified teachers (NBCTs) provide more help to others in their schools than non-NBCTs?
A: Yes, the average NBCT is nominated by about 1.6 others as providing help with instruction, in contrast to about .95 for a non-NBCT.
Causal Inference
Q: Does National Board certification affect the amount of help a teacher provides?
Data
• 47 schools (in 2 states)
• 1583 teachers
• Case studies in 4 schools
• Surveys:
– background
– attitudes towards leadership and bct
– sociometric: teachers were asked to list others who helped with instruction
Syntax for Descriptives
GET FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\workshop.sav'.
DESCRIPTIVES VARIABLES=bct leave female glevel owned yrstch nograde attracth expanseh bcttreat leader leadna white /STATISTICS=MEAN STDDEV .
Table 1: Measures and Descriptive statistics (n=1363)
Variable | Mean | Std Dev
Number other teachers helped by respondent (attracth) | .96 | 1.08
Number other teachers who helped respondent (expanseh) | .91 | .77
Board certified teacher, 1 = Yes, 0 = No (BCT) | .13 | .34
White (white) | .84 | .37
Female (female) | .93 | .25
Highest grade level taught (glevel) | 8.32 | 4.13
No grade level indicated (nograde) | .04 | .19
Level of own education (owned) | 3.01 | 1.02
Years teaching (yrstch) | 16.12 | 8.64
Intention to leave (leave) | 1.72 | .72
Perceived advantage of certification (bcttreat) | 1.95 | .55
Enhancement through leadership (leader) | 2.35 | 1.20
Missing on enhancement of teaching (leadna) | .17 | .37
Number certified others in school (nbct) | 2.31 | 2.44
Number certified others in school squared (nbctsq) | 6.42 | 11.69
(n is approximately 1208)
Descriptives Separately for BCT and non-BCT
GET FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\workshop.sav'.
SORT CASES BY bct .
SPLIT FILE LAYERED BY bct .
DESCRIPTIVES VARIABLES=leave female glevel owned yrstch nograde attracth expanseh white bcttreat leader leadna nbct
  /STATISTICS=MEAN STDDEV .
Try it, what do you get?
Recall regression model with statistical control for a confound
$y = \beta_0 + \beta_1 s + \beta_2\,\mathrm{confound}$
$\text{Help Provided} = \beta_0 + \beta_1\,\text{Board Certification} + \beta_2\,\text{Leadership}$
Partialled and unpartialled (zero order) correlations
Unpartialled (zero-order, or total) variation between help provided (y) and board certification (x) is $.176^2 = .031$
Variation between help provided (y) and board certification (x), partialled for enhancement of teaching through leadership, is $.167^2 = .028$
The difference between unpartialled and partialled is the variance between board certification (x) and help provided (y) that is also accounted for by enhancement of teaching through leadership (the confound):
$.031 - .028 = .003$
How Regression Works: Overlapping Variances
[Figure: overlapping variances of Help provided, Board Certification, and Enhancement through leadership, next to the overlap of Help provided and Board Certification alone.]
Variance between help provided and board certification, partialling for enhancement through leadership: $.167^2 = .028$
Variance between help provided and board certification: $.176^2 = .031$
How Regression Works: Partial and Semi-Partial correlation
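Partial Correlation: correlation between s and y, where s and y have been controlled for the confounding variable
$r_{s \cdot y | cv} = \frac{r_{s \cdot y} - r_{s \cdot cv}\, r_{y \cdot cv}}{\sqrt{1 - r^2_{y \cdot cv}}\,\sqrt{1 - r^2_{s \cdot cv}}} = \frac{.176 - .072 \times .170}{\sqrt{1 - .170^2}\,\sqrt{1 - .072^2}} = .167$
Semi-Partial Correlation: correlation between s and y, where s has been controlled for the confounding variable
$sr_{s} = \frac{r_{s \cdot y} - r_{s \cdot cv}\, r_{y \cdot cv}}{\sqrt{1 - r^2_{s \cdot cv}}} = \frac{.176 - .072 \times .170}{\sqrt{1 - .072^2}} = .164$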
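As a check on the arithmetic, here is a small Python sketch of the partial and semi-partial correlations. The function names are hypothetical, and the assignment of .072 to the s-cv correlation and .170 to the y-cv correlation is an assumption: it is the assignment under which the semi-partial denominator uses .072, as in the slide's numbers.

```python
import math


def partial_corr(r_sy, r_scv, r_ycv):
    """Correlation of s and y with cv removed from both."""
    numerator = r_sy - r_scv * r_ycv
    return numerator / math.sqrt((1 - r_scv**2) * (1 - r_ycv**2))


def semipartial_corr(r_sy, r_scv, r_ycv):
    """Correlation of s and y with cv removed from s only."""
    numerator = r_sy - r_scv * r_ycv
    return numerator / math.sqrt(1 - r_scv**2)


# Assumed assignment of the component correlations (see lead-in)
r_sy, r_scv, r_ycv = 0.176, 0.072, 0.170

pr = partial_corr(r_sy, r_scv, r_ycv)
sr = semipartial_corr(r_sy, r_scv, r_ycv)
print(round(pr, 3), round(sr, 3))
```

The partial correlation is symmetric in the two component correlations, so only the semi-partial value depends on which one is taken to be the s-cv correlation.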
Regression and Correlation Coefficient
T ratio for regression coefficient and correlation are identical
$\hat{\beta}_1 = r_{s \cdot y}\,\frac{sd(y)}{sd(s)} = .176 \times \frac{1.077}{.341} = .557$
Regression of Help Provided on Board Certification Controlling for Enhancement of Teaching through Leadership
Controlling for enhancement of teaching through leadership (c):
$\hat{\beta}_{1|c} = r_{s \cdot y | c}\,\frac{sd(y|c)}{sd(s|c)} = .167 \times \frac{1.075}{.336} = .534$
Model for $sd(y|c)$: $y = \beta_0 + \beta_1 c$
Model for $sd(s|c)$: $s = \beta_0 + \beta_1 c$
The Impact of Enhancement of Teaching through Leadership on Correlation Between Board Certification and Help Provided
[Figure: path diagram. CV = Enhancement of teaching through leadership; S = Board Certification; Y = Help Provided. Labels: $r_{sy} = .18$, $r_{s \cdot cv} = .17$, $r_{y \cdot cv} = .07$, $r_{s \cdot cv} \times r_{y \cdot cv}$.]
How Regression Works: Impact of Enhancement of Teaching Through Leadership on Correlation Between Board Certification and Help Provided
$r_{sy} = .176$
$r_{sy|cv} = .167$
Calculating Impacts:Correlations Between BCT, Amount of Help Provided, and Covariates
Impacts of Covariates on Correlation between BCT and Help Provided
Component Correlations
Reflection
• What part is most confusing to you?
– Why?
– More than one interpretation?
• Talk with one other, share
• Find a new partner and share problems and solutions
Exercise: How Regression Works
• Calculate the correlation between board certification and help provided
– Unpartialed
– Partialed (for something other than leadership)
• (see basic calculations, sheet 1). https://www.msu.edu/~kenfrank/research.htm#causal
• Do same for example in a data set you have
Exercise: Find Impacts of measured Covariates on Correlation between BCT and Help Provided
Use data file “Board Certified Teachers”
GET FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\COURSES\causal '+ 'inference\groningen\data\spass_data\workshop.sav'.
DATASET NAME DataSet6 WINDOW=FRONT.
CORRELATIONS
  /VARIABLES=attracth bct expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna
  /PRINT=TWOTAIL NOSIG
  /STATISTICS DESCRIPTIVES
  /matrix=out(forimp)
  /MISSING=PAIRWISE .
GET FILE= ' forimp'.
AUTORECODE VARIABLES=ROWTYPE_ varname_ /INTO t n /PRINT.
FILTER OFF.
USE ALL.
SELECT IF(t = 1 and n>=4).
EXECUTE .
COMPUTE impact = attracth * bct .
EXECUTE .
SORT CASES BY impact (D) .
SAVE OUTFILE='impact' /keep rowtype_ varname_ attracth bct impact /COMPRESSED.
Reminder: Motivation: If you don’t argue scientifically, those who you disagree with will, and your views will not be heard
Concern over Missing Confound(Internal Validity)
• Causal Inference concern: How much of the estimate of the Board Certification effect would have to be attributed to other factors to invalidate the causal inference?
– Maybe NBCTs help more because they had a previous inclination to help?
• We may never know, but we can quantify the concern
– What would the impact of a confound (e.g., inclination to help) have to be to alter our inference? (Frank, 2000)
Full Regression of Help Provided Others on Board Certification and Covariates
UNIANOVA attracth BY school WITH bct leave female glevel owned yrstch nograde expanseh leader white nbct nbctsq bcttreat leadna
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /PRINT = PARAMETER
  /CRITERIA = ALPHA(.05)
  /DESIGN = bct leave female glevel owned yrstch nograde expanseh leader white nbct nbctsq bcttreat leadna school .
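Impact of an Unmeasured Confounding Variable on Inference of Effect of Board Certification on Help Provided
[Figure: path diagram. Inclination to be Helpful (confounding variable, cv) is linked to Board Certified (s) by $r_{s \cdot cv}$ and to Number of others helped (y) by $r_{y \cdot cv}$; the product $r_{s \cdot cv} \times r_{y \cdot cv}$ bears on $t(\beta_1)$ for the path from s to y.]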
What must be the Impact of an Unmeasured Confounding Variable to Invalidate the Inference?
Step 1: Establish Correlation Between BCT and Help Provided, partialling for all covariates
Step 2: Define a Threshold for Inference
Step 3: Calculate the Threshold for the Impact Necessary to Invalidate the Inference
Step 4: Multivariate Extension, with other Covariates
Step 1: Establish Correlation Between BCT and Help Provided, partialling for all covariates
$r = \frac{t}{\sqrt{(n - q - 1) + t^2}} = \frac{6.79}{\sqrt{1156 + 6.79^2}} = .196$
t is taken from the regression (t = 6.79); n is the sample size; q is the number of parameters estimated; n − q − 1 = 1156
Step 2: Define a Threshold for Inference
• Define $r^{\#}$ as the value of r that is just statistically significant:
$r^{\#} = \frac{t_{critical}}{\sqrt{(n - q - 1) + t_{critical}^2}}$
n is the sample size; q is the number of parameters estimated; $t_{critical}$ is the critical value of the t-distribution for making an inference
$r^{\#} = \frac{1.96}{\sqrt{1156 + 1.96^2}} = .058$
$r^{\#}$ can also be defined in terms of effect sizes
Step 3: Calculate the Threshold for the Impact Necessary to Invalidate the Inference
Define the impact: $k = r_{x \cdot cv} \times r_{y \cdot cv}$, and assume $r_{x \cdot cv} = r_{y \cdot cv}$ (which maximizes the impact of the confounding variable). Then
$r_{x \cdot y | cv} = \frac{r_{x \cdot y} - r_{x \cdot cv}\, r_{y \cdot cv}}{\sqrt{1 - r^2_{y \cdot cv}}\,\sqrt{1 - r^2_{x \cdot cv}}} = \frac{r_{x \cdot y} - k}{1 - k}$
Set $r_{x \cdot y | cv} = r^{\#}$ and solve for k to find the threshold for the impact of a confounding variable (TICV):
$TICV = \frac{r_{x \cdot y} - r^{\#}}{1 - |r^{\#}|} = \frac{.196 - .058}{1 - .058} = .147$
impact of an unmeasured confound > .147 → inference invalid
impact of an unmeasured confound < .147 → inference valid
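Steps 1 through 3 can be reproduced in a few lines. This is a minimal Python sketch, not the spreadsheet linked in these slides; the function names `r_from_t` and `ticv` are hypothetical, and the inputs (t = 6.79, critical t = 1.96, n − q − 1 = 1156) are the values quoted above.

```python
import math


def r_from_t(t, df):
    """Convert a t-ratio to a correlation, with df = n - q - 1."""
    return t / math.sqrt(df + t**2)


def ticv(r_xy, r_sharp):
    """Threshold for the impact of a confounding variable (bivariate case)."""
    return (r_xy - r_sharp) / (1 - abs(r_sharp))


df = 1156
r_xy = r_from_t(6.79, df)       # Step 1: observed partial correlation
r_sharp = r_from_t(1.96, df)    # Step 2: threshold correlation r#
threshold = ticv(r_xy, r_sharp)  # Step 3: impact threshold

# If both component correlations are equal, each must exceed sqrt(threshold)
component = math.sqrt(threshold)

print(round(r_xy, 3), round(r_sharp, 3), round(threshold, 3), round(component, 2))
```

Working from the unrounded correlations gives the same three-decimal results as the rounded values shown on the slide.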
Calculations made easy!
• http://www.msu.edu/~kenfrank/papers/calculating%20indices%203.xls
Live Example
N-q=1131-18=1113. T=.603/.092=6.56
Impact Threshold=.142 Component correlations = .38
Frank, K.A., Gary Sykes, Dorothea Anagnostopoulos, Marisa Cannata, Linda Chard, Ann Krause, Raven McCrory. 2008. Extended Influence: National Board Certified Teachers as Help Providers. Educational Evaluation and Policy Analysis. Vol 30(1): 3-30.
Exercise 3: Impact Threshold Exercise
1)Identify a statistical inference from an article you are interested in.
2) Describe possible confounds/alternative explanations that could bias the estimate
3) Note the t-ratio and sample size
4) Calculate robustness of inference using http://www.msu.edu/~kenfrank/papers/calculating%20indices%203.xls
5) Explain your inference and how robust you think it is. Why could your inference be challenged?
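Step 4: Multivariate Extension, with Covariates
$TICV = \sqrt{(1 - r^2_{x \cdot z})(1 - r^2_{y \cdot z})}\;\frac{r_{x \cdot y | z} - r^{\#}}{1 - |r^{\#}|}$
$k = r_{x \cdot cv | z} \times r_{y \cdot cv | z}$
Maximizing the impact with covariates z in the model implies
$r^2_{y \cdot cv} = TICV \sqrt{\frac{1 - r^2_{y \cdot z}}{1 - r^2_{x \cdot z}}}$
and
$r^2_{x \cdot cv} = TICV \sqrt{\frac{1 - r^2_{x \cdot z}}{1 - r^2_{y \cdot z}}}$
$TICV = .125$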
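The multivariate threshold can be sketched the same way. The Python below assumes the square-root form of the adjustment, in which the bivariate threshold is scaled by the variance in x and y left unexplained by the covariates z. The function names are hypothetical, and the two R-squared inputs in the example are made-up placeholders, since the values obtained from the UNIANOVA runs are not reproduced in this transcript.

```python
import math


def ticv_multivariate(r_xy_given_z, r_sharp, r2_xz, r2_yz):
    """Impact threshold when covariates z are already in the model."""
    scale = math.sqrt((1 - r2_xz) * (1 - r2_yz))
    return scale * (r_xy_given_z - r_sharp) / (1 - abs(r_sharp))


def component_correlations(ticv, r2_xz, r2_yz):
    """Component correlations (r_x.cv, r_y.cv) that maximize the impact."""
    ratio = math.sqrt((1 - r2_yz) / (1 - r2_xz))
    r_y_cv = math.sqrt(ticv * ratio)
    r_x_cv = math.sqrt(ticv / ratio)
    return r_x_cv, r_y_cv


# Hypothetical R-squared values for x on z and y on z (placeholders only)
r2_xz_demo, r2_yz_demo = 0.20, 0.10

threshold_z = ticv_multivariate(0.196, 0.058, r2_xz_demo, r2_yz_demo)
r_x_cv, r_y_cv = component_correlations(threshold_z, r2_xz_demo, r2_yz_demo)
print(round(threshold_z, 3), round(r_x_cv, 3), round(r_y_cv, 3))
```

With both R-squared values set to zero the function reduces to the bivariate threshold, and the product of the two component correlations always equals the threshold passed in.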
SPSS Syntax for Obtaining Multivariate Impact Threshold
GET
FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\workshop.sav'.
UNIANOVA
attracth BY school WITH leave female glevel owned yrstch nograde expanseh
bcttreat leader leadna white nbct nbctsq
/METHOD = SSTYPE(3)
/INTERCEPT = INCLUDE
/PRINT =ETASQ PARAMETER
/CRITERIA = ALPHA(.05)
/DESIGN = leave female glevel owned yrstch nograde expanseh bcttreat leader leadna
white nbct nbctsq school .
UNIANOVA
bct BY school WITH leave female glevel owned yrstch nograde expanseh bcttreat
leader leadna white nbct nbctsq
/METHOD = SSTYPE(3)
/INTERCEPT = INCLUDE
/PRINT = ETASQ PARAMETER
/CRITERIA = ALPHA(.05)
/DESIGN = leave female glevel owned yrstch nograde expanseh bcttreat leader leadna
white nbct nbctsq school .
Obtaining R2
Multivariate Calculations
• http://www.msu.edu/~kenfrank/papers/calculating%20indices%203.xls
What must be the Impact of an Unmeasured Confound to Invalidate the Inference?
If k > .125 (or .147 without covariates) then the inference is invalid
If $r_{x \cdot cv} = r_{y \cdot cv}$, then each would have to be greater than $k^{1/2} = .38$ to alter the inference (with the multivariate correction, $r_{y \cdot cv} > .38$ and $r_{x \cdot cv} > .34$).
Furthermore, correlations must be partialled for covariates z.
Impact of strongest measured covariate (perception leadership will enhance teaching) is .012;
Impact of unmeasured confound would have to be ten times greater than the impact of the strongest observed covariate to invalidate the inference. Hmmm….
Applications of Impact Threshold
• Frank, K.A., Gary Sykes, Dorothea Anagnostopoulos, Marisa Cannata, Linda Chard, Ann Krause, Raven McCrory. 2008. Extended Influence: National Board Certified Teachers as Help Providers. Educational Evaluation and Policy Analysis. Vol 30(1): 3-30.
• Frisco, Michelle, Muller, C. and Frank, K.A. 2007. Using propensity scores to study changing family structure and academic achievement. Journal of Marriage and Family. Vol 69(3): 721–741.
• *Frank, K. A. and Min, K. 2007. Indices of Robustness for Sample Representation. Sociological Methodology. Vol 37, 349-392. * co first authors.
• Frank, K. 2000. "Impact of a Confounding Variable on the Inference of a Regression Coefficient." Sociological Methods and Research, 29(2), 147-194.
• Crosnoe, Robert and Carey E. Cooper. 2010. “Economically Disadvantaged Children’s Transitions into Elementary School: Linking Family Processes, School Contexts, and Educational Policy.” American Educational Research Journal 47: 258-291.
• Crosnoe, Robert. 2009. “Low-Income Students and the Socioeconomic Composition of Public High Schools.” American Sociological Review 74: 709-730.
• Maroulis, S. & Gomez, L. (2008). “Does ‘Connectedness’ Matter? Evidence from a Social Network Analysis within a Small School Reform.” Teachers College Record, Vol. 110, Issue 9.
• Cheng, Simon, Regina E. Werum, and Leslie Martin. 2007. “Adult Social Capital: How Family and Community Ties Shape Track Placement of Ethnic Groups in Germany.” American Journal of Education 114: 41-74.
• Carbonaro, William and Elizabeth Covay. School Sector and Student Achievement in the Era of Standards Based Reforms. Sociology of Education, vol. 83 no. 2, 160-182.
See also
• Pan, W., and Frank, K.A. (2004). "An Approximation to the Distribution of the Product of Two Dependent Correlation Coefficients." Journal of Statistical Computation and Simulation, 74, 419-443.
• Pan, W., and Frank, K.A., 2004. "A probability index of the robustness of a causal inference," Journal of Educational and Behavioral Statistics, 28, 315-337.
Consider Alternate Sample(External Validity)
Causal Inference concern: We cannot assert cause if the effect of Board Certification is not constant across contexts.
Statistical Translation: Would the inference be valid if the sample included more of some population (e.g. teachers in other states) for which the effect was not as strong?
Rephrased for robustness: what must be the conditions in the alternative sample to invalidate the inference?
Consider Alternate Sample(External Validity)
Define π as the proportion of the sample that is replaced with an alternate sample.
$r'_{xy}$ is the correlation in the unobserved (replacement) data
$R_{xy}$ is the combined correlation for observed and unobserved data:
$R_{xy} = (1 - \pi)\, r_{xy} + \pi\, r'_{xy}$
Thresholds for Sample Replacement
Set $R_{xy} = r^{\#}$ and solve for $r'_{xy}$, the correlation in the replacement data:
If half the sample is replaced (π = .5), the original inference is invalid if $r'_{xy} < 2r^{\#} - r_{xy}$
Therefore, $2r^{\#} - r_{xy}$ defines the threshold for replacement: TR(π = .5)
If $r'_{xy} = 0$, the inference is altered if $\pi > 1 - r^{\#}/r_{xy}$. Therefore $1 - r^{\#}/r_{xy}$ defines the threshold for replacement: TR($r'_{xy} = 0$)
Assumes means and variances are constant across samples; alternative calculations are available.
Example of Thresholds for Replacement
TR(π = .5) = $2r^{\#} - r_{xy|z} = 2(.058) - .196 = -.081$.
The correlation between Board Certification and number of others helped would have to be less than -.081 to alter the inference if half the teachers in our sample were replaced (e.g., with teachers from another state).
TR(replacement $r_{xy} = 0$) = $1 - r^{\#}/r_{xy|z} = 1 - (.058/.196) = .71$
More than 70% of teachers would have to be replaced with others for whom Board Certification has no effect (replacement $r_{xy} = 0$) to invalidate the inference in a combined sample.
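The two replacement thresholds follow directly from the combined-correlation formula. The Python sketch below uses hypothetical function names and starts from the unrounded correlations implied by t = 6.79 and critical t = 1.96 with n − q − 1 = 1156, which is an assumption about how the slide's -.081 and .71 were obtained (the rounded inputs .058 and .196 give -.080 and .70).

```python
import math


def r_from_t(t, df):
    """Convert a t-ratio to a correlation, with df = n - q - 1."""
    return t / math.sqrt(df + t**2)


def combined_correlation(r_obs, r_replacement, pi):
    """Correlation after a proportion pi of the sample is replaced."""
    return (1 - pi) * r_obs + pi * r_replacement


def tr_half_replaced(r_obs, r_sharp):
    """Replacement-sample correlation below which the inference fails at pi = .5."""
    return 2 * r_sharp - r_obs


def tr_null_replacement(r_obs, r_sharp):
    """Proportion that must be replaced with zero-effect cases to alter the inference."""
    return 1 - r_sharp / r_obs


r_obs = r_from_t(6.79, 1156)
r_sharp = r_from_t(1.96, 1156)

tr_half = tr_half_replaced(r_obs, r_sharp)
tr_null = tr_null_replacement(r_obs, r_sharp)
print(round(tr_half, 3), round(tr_null, 2))
```

At either threshold the combined correlation lands exactly on r#, which is the condition the thresholds were solved for.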
Calculations for Robustness of Inference for External Validity
Basis of Comparison: Separate Effects for observed subgroups
• White (n=981): .71; Non-white (n=176): .27
• Female (n=1080): .63; Male (n=77): -.50 !
• Compare -.504 with TR(π = .5) = -.081.
• The inference would be invalid for populations consisting of more male (elementary) teachers.
• Only 5 males who were bct:
GET
FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\workshop.sav'.
CROSSTABS
/TABLES=bct BY female
/FORMAT= AVALUE TABLES
/CELLS= COUNT ROW COLUMN TOTAL
/COUNT ROUND CELL .
Generally, How Much Bias Must there be to Invalidate the Inference?
Estimate = unbiased estimate + bias:
$r_{observed} = r_{unbiased} + M$
where $r_{unbiased}$ is defined by $E(r_{unbiased})$ = relationship in population, or ρ
Inference invalid if $r_{unbiased} < r^{\#}$. So…
1) Set $r_{unbiased} < r^{\#}$ and solve for M. Inference invalid if
$M > (r_{observed} - r^{\#})$.
2) As a proportion of initial estimate, inference invalid if
$M / r_{observed} > 1 - r^{\#}/r_{observed}$ = TR(replacement $r_{xy} = 0$) = .71
Interpretation: 71% of the estimate must be attributable to bias to alter the inference (same as % replacement if $r_{unobserved} = 0$)
3) Rule of thumb (for large n):
% bias needed to invalidate inference = $1 - t_{critical}/t_{observed}$
Sykes et al.: % bias needed to invalidate inference = $1 - 1.96/6.79 = .71$
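The rule of thumb is a one-line calculation. A minimal Python sketch, with a hypothetical function name, using the t-ratios quoted above:

```python
def pct_bias_to_invalidate(t_observed, t_critical=1.96):
    """Large-n rule of thumb: share of the estimate that must be bias."""
    return 1 - t_critical / t_observed


pct_bias = pct_bias_to_invalidate(6.79)
print(round(pct_bias, 2))
```

An estimate whose t-ratio equals the critical value has no room for bias at all, so the function returns 0 in that case.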
Exercise: Robustness for Sample Representativeness (external Validity)
1)Identify a statistical inference in your own work or in the literature for which there is concern about the external validity
2) Identify possible populations for which the effect may not apply
3) Note the t-ratio and sample size
4) Calculate robustness of inference using http://www.msu.edu/~kenfrank/papers/calculating%20indices%203.xls
5) Discuss with a new partner your inference and how robust you think it is. Partner can challenge. Then change roles.
Assumptions are the bridge between statistical and causal inference
Statistical Inference Causal Inference
Assumptions
Cornfield, J., & Tukey, J. W. (1956, Dec.), Average Values of Mean Squares in Factorials. Annals of Mathematical Statistics, 27(No. 4), 907-949.
In Donald Rubin’s words
“Nothing is wrong with making assumptions; on the contrary, such assumptions are the strands that join the field of statistics to scientific disciplines. The quality of these assumptions and their precise explication, not their existence, is the issue” (Rubin, 2004, page 345).
Conclusions for Robustness Indices
• Objections to moving from statistical to causal inference are framed in terms of violations of assumptions
– No unobserved confounding variables
– Treatment has same effect for all
• Robustness indices quantify how much assumptions must be violated to invalidate inference.
• No new causal inferences!
– Robustness indices merely quantify terms of debate regarding causal inferences.
• Can be used with any threshold.
• Can be used (theoretically) for any t-ratio
– Discuss: Statistical inference as threshold?
• Extension of sensitivity – indices are a property of original estimate
Limitations
• Would like to do experiment
• Would like longitudinal data to control for previous inclination to help
– (perhaps leverage this study to get a second wave of data?)
• Don’t know if BCTs are more helpful or merely perceived as such because of symbolic status
• Nationally representative data?
Defining Absorption
• The impact of any given covariate can be absorbed by controlling for other covariates: the impact of covariate c on the association between treatment x and outcome y is reduced once controlling for covariate a
$Absorb(a, c, x, y) = 1 - \frac{r_{cx|a}\, r_{cy|a}}{r_{cx}\, r_{cy}}$
where the numerator is the impact of c given a and the denominator is the impact of c
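The absorption index can be computed from five zero-order correlations. The Python sketch below is an illustration only: the function names are hypothetical, and the correlation values in the example are made up, since the slides' own correlation matrix is not reproduced in this transcript.

```python
import math


def partial_r(r_ab, r_ac, r_bc):
    """Correlation of a and b after partialling c out of both."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))


def absorption(r_cx, r_cy, r_ca, r_xa, r_ya):
    """Share of covariate c's impact that is absorbed by controlling for a."""
    impact = r_cx * r_cy
    impact_given_a = partial_r(r_cx, r_ca, r_xa) * partial_r(r_cy, r_ca, r_ya)
    return 1 - impact_given_a / impact


# Hypothetical correlations, for illustration only
absorbed = absorption(r_cx=0.30, r_cy=0.20, r_ca=0.50, r_xa=0.40, r_ya=0.30)
print(round(absorbed, 3))
```

When a is uncorrelated with c, x, and y, the partial correlations equal the zero-order ones and nothing is absorbed.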
The impact of confound c on the association between treatment x and outcome y is reduced once controlling for covariate a
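[Figure: path diagram of Confound, x, y, and covariate a, labeled $r_{sy}$, $r_{s \cdot cv}$, $r_{y \cdot cv}$, and $r_{s \cdot cv} \times r_{y \cdot cv}$. Green indicates absorbed impact.]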
Syntax for calculating absorption
SUBTITLE "Impact Partialing Leader".
GET FILE='workshop.sav'.
PARTIAL CORR
  /VARIABLES= attracth bct expanseh white female leave glevel nograde owned yrstch nbct nbctsq bcttreat leadna BY leader
  /SIGNIFICANCE=TWOTAIL
  /matrix=out(forimpa.sav)
  /MISSING=LISTWISE .
GET FILE='forimpa.sav'.
AUTORECODE VARIABLES=ROWTYPE_ /INTO t /PRINT.
FILTER OFF.
USE ALL.
SELECT IF(t=3).
EXECUTE .
COMPUTE attracth_post=attracth.
COMPUTE bct_post=bct.
COMPUTE impact_post=attracth_post * bct_post.
EXECUTE .
SAVE OUTFILE='impactaa.sav' /keep ROWTYPE_ VARNAME_ attracth_post bct_post impact_post /COMPRESSED.
GET FILE='impactaa.sav'.
AUTORECODE VARIABLES=VARNAME_ /INTO n /PRINT.
FILTER OFF.
USE ALL.
SELECT IF(n>2).
EXECUTE .
SAVE OUTFILE='impacta.sav' /keep ROWTYPE_ VARNAME_ attracth_post bct_post impact_post /COMPRESSED.
GET FILE='impact.sav'.
SORT CASES BY VARNAME_ (A) .
SAVE OUTFILE='byn.sav' /COMPRESSED.
GET FILE='impacta.sav'.
SORT CASES BY VARNAME_ (A) .
SAVE OUTFILE='byna.sav' /COMPRESSED.
GET FILE='byn.sav'.
MATCH FILES /FILE=* /FILE='byna.sav' /RENAME (ROWTYPE_ = d0) /BY VARNAME_ /DROP= d0.
EXECUTE.
COMPUTE absorb=1-impact_post/impact .
EXECUTE .
SAVE OUTFILE='absorb.sav' /keep ROWTYPE_ VARNAME_ absorb impact attracth bct impact_post attracth_post bct_post /COMPRESSED.
Extent to which Leader absorbs the impact of other covariates on inference regarding effect of BCT on help provided
Once controlling for leader, there is less need to control for intention to leave or years teaching
Absorption Exercise
• Looking at the absorption and impact matrices can you guess what will happen when you add female to the model? How about when you add number of others in the school who are board certified (nbct)
• Check using syntax:
GET FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\workshop.sav'.
UNIANOVA bct BY school WITH leave female glevel owned yrstch nograde expanseh bcttreat leader leadna white nbct nbctsq
  /METHOD = SSTYPE(3)
  /INTERCEPT = INCLUDE
  /PRINT = ETASQ PARAMETER
  /CRITERIA = ALPHA(.05)
  /DESIGN = leave female glevel owned yrstch nograde expanseh bcttreat leader leadna white nbct nbctsq school .
How a pre-test absorbs impact
Analyzing Pre/post-test designs ANCOVA: Analysis of Covariance
• Research questions:
• Pre- versus post interacting with treatment (Not recommended): Is there a difference between pre and post scores, and does that difference depend on whether or not the subject participated in the treatment?
• ANCOVA: Controlling for the pre-test, did subjects who participated in the treatment score higher on the post test than those in the control?
– Did the effect of the treatment depend on the level of the pre-test -- did the treatment work better for some than others?
• Difference scores: Did the subject who participated in the treatment learn more (or grow more) from pre-test to post-test than those in the control?
Models: pre- versus post interacting with treatment (Not recommended):
$y_{i} = \hat{\beta}_0 + \hat{\beta}_1\,dtreatment_{i} + \hat{\beta}_2\,dpost_{i} + \hat{\beta}_3\,dtreatment_{i} \times dpost_{i} + \hat{e}_{i}$
$y_{bob\ post} = \hat{\beta}_0 + \hat{\beta}_1\,dtreatment_{bob\ post} + \hat{\beta}_2\,dpost_{bob\ post} + \hat{\beta}_3\,dtreatment_{bob\ post} \times dpost_{bob\ post} + \hat{e}_{bob\ post}$
$y_{bob\ pre} = \hat{\beta}_0 + \hat{\beta}_1\,dtreatment_{bob\ pre} + \hat{\beta}_2\,dpost_{bob\ pre} + \hat{\beta}_3\,dtreatment_{bob\ pre} \times dpost_{bob\ pre} + \hat{e}_{bob\ pre}$
Problem: observations are not independent, because each person is measured twice, pre and post. The effects of each person will affect both error terms for that person, and the errors will thus be correlated.
The errors for Bob's two observations (pre and post) will be dependent due to the common effect of “bobness” on each error that has not been accounted for.
ANCOVA:
• Controlling for the pre-test, did subjects who participated in the treatment score higher on the post test than those in the control? Did the effect of the treatment depend on the level of the pre-test -- did the treatment work better for some than others?
$post\ achievement_i = \hat{\beta}_0 + \hat{\beta}_1\,pre_i + \hat{\beta}_2\,dtreatment_i + \hat{\beta}_3\,pre_i \times treatment_i + \hat{e}_i$
Alternate Expression of model with factors (categorical variables), covariates (continuous variables) and interactions
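$$post\ achievement_{ij} = \mu + \alpha_j + \hat\beta\, pre_{ij} + \hat\beta_j\, pre_{ij} + e_{ij}$$
(person i in treatment condition j: a factor effect for condition, a covariate slope for the pre-test, and a condition-specific slope for the interaction)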
Home
Difference Scores
• Construct: $\Delta y_i = y_{\text{post},i} - y_{\text{pre},i}$. This measures the change from y-pre to y-post for person i.
• Model: $\Delta y_i = \hat\beta_0 + \hat\beta_1\, dtreatment_i + \hat e_i$
• Advantages: only one observation per person. Essentially modeling “growth.”
• Disadvantage: cannot test for interaction effect.
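The difference-score model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are made up, and the function name `difference_score_effect` is hypothetical. With a single 0/1 treatment dummy, the OLS slope on the gains equals the difference between the mean gain in the treatment group and the mean gain in the control group, which is what the sketch computes.

```python
from statistics import mean

def difference_score_effect(pre, post, treated):
    """Mean gain of the treated minus mean gain of the controls.

    With one 0/1 treatment dummy this equals the OLS slope in
    delta_y = b0 + b1 * dtreatment + e.
    """
    gains = [after - before for before, after in zip(pre, post)]
    treated_gains = [g for g, t in zip(gains, treated) if t == 1]
    control_gains = [g for g, t in zip(gains, treated) if t == 0]
    return mean(treated_gains) - mean(control_gains)

# Made-up scores for six people (illustrative only, not from the study).
pre = [50, 55, 60, 52, 58, 61]
post = [58, 62, 69, 54, 61, 63]
treated = [1, 1, 1, 0, 0, 0]

effect = difference_score_effect(pre, post, treated)
print(round(effect, 3))
```

Each person contributes one gain, so the dependence between a person's pre and post errors no longer enters the model.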
Home
When to use Difference scores versus ANCOVA
• Allison argues for using difference scores when the pre-test is not considered a cause of either the outcome (A) or the treatment condition (B).
• A. Pre-test “causing” outcome: stocks versus flows
• The pre-test can “cause” the post-test when the outcome is like a “stock”: the outcome has an inherent persistence over time, such as height, which typically cannot decrease (Allison, page 107). In this case, use ANCOVA.
• The pre-test is not considered “causal” for most measures of behavior and attitude, which must be regenerated each time, like something that “flows” and can therefore be cut off.
• B. Pre-test “causing” treatment:
• Examples (Allison, page 109):
• (use Δ): All seniors in high school A are enrolled in the treatment, and the SAT is administered before and after the treatment or control period. All students in high school B serve as controls.
• (use ANCOVA): The SAT is administered as a pretest to a group of high school seniors. Those who score below 400 are enrolled in the treatment, and those who score above 400 are in the control.
• (use Δ): Seniors self-select into treatment and control before seeing the results of a pre-test administration of the SAT.
• (use ANCOVA): Seniors self-select into the program after seeing the results of a pre-test administration of the SAT.
Home
Flow Chart for use of Difference Scores versus ANCOVA
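[Flow chart] Does the pre-test cause the treatment conditions?
• Yes: use ANCOVA
• No: use the difference score if the test has high reliability; otherwise ANCOVA
Difference score: $\Delta y_i = \hat\beta_0 + \hat\beta_1\, dtreatment_i + \hat e_i$
ANCOVA: $post\ achievement_i = \hat\beta_0 + \hat\beta_1\, pre_i + \hat\beta_2\, dtreatment_i + \hat\beta_3\, pre_i \times treatment_i + \hat e_i$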
Home
Absorption of Impact Via Randomly Assigned Treatment
Green area goes to zero
Home
How Random Assignment Absorbs Impact
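[Path diagram: Random assignment (s) determines Treatment (x), which affects Number of others helped (y). Inclination to be Helpful (confounding variable, cv) is linked to x by $r_{x \cdot cv}$ and to y by $r_{y \cdot cv}$; its impact is $r_{x \cdot cv} \times r_{y \cdot cv}$. Other labels: $r_{x \cdot s}$; $r_{y \cdot s|x} = 0$.]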
Home
How Does Regression Discontinuity Absorb Impact?
• Criteria for Assignment to treatment conditions known with certainty
• Comparison of those who just exceeded criteria with those who just missed criteria.
Home
How Regression Discontinuity absorbs impact
Home
How Instrumental Variable Absorbs Impact
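[Path diagram: Instrumental Variable determines Treatment (x), which affects Number of others helped (y). Inclination to be Helpful (confounding variable, cv) is linked to x by $r_{x \cdot cv}$ and to y by $r_{y \cdot cv}$; its impact is $r_{x \cdot cv} \times r_{y \cdot cv}$. Other labels: $r_{x \cdot s}$; $r_{y \cdot s|x} = 0$; Fidelity Assumed.]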
Home
Impact Thresholds and Instrumental variables
• Can still do impact threshold.
• Define iv as the instrumental variable, cv as the confound.
• Exclusion restriction: for any confounding variable for which $r_{cv \cdot y} > 0$, $r_{iv \cdot cv}$ must equal 0. In other words, $r_{cv \cdot y} \times r_{iv \cdot cv} = 0$.
• But what if this doesn’t hold? The inference is invalidated by $r_{cv \cdot y} \times r_{iv \cdot cv}$. This is the impact of a confound.
• Can compare with existing relationships between IV and other covariates.
Home
Comment on Instrumental Variables
• Exclusion restriction (the instrument is related to treatment assignment but related to the outcome only through treatment) is difficult to satisfy
– Attempts: draft # (Angrist et al)
– Whether you’re Catholic or not for attending Catholic school
• A recent meta-analysis [Glazerman, Stephen, Levy, Dan and Myers, David (2003). “Nonexperimental versus Experimental Estimates of Earnings Impacts.” Annals of the American Academy of Political and Social Science (589): 63-85] of the effects of welfare, job training and employment service programs on earnings found that statistical control for a prior measure came closest to the results of randomized experiments.
• Steiner, Peter M., Thomas D. Cook & William R. Shadish (in press). On the importance of reliable covariate measurement in selection bias adjustments using propensity scores. Journal of Educational and Behavioral Statistics.
• Steiner, Peter M., Thomas D. Cook, William R. Shadish & M.H. Clark (in press). The importance of covariate selection in controlling for selection bias in observational studies. Psychological Methods.
• Cook, T. D., Shadish, W. R., & Wong, V. C. (2008). Three conditions under which experiments and observational studies produce comparable causal estimates: New findings from within-study comparisons. Journal of Policy Analysis and Management, 27(4), 724–750.
Home
Parents of Friends as Instrument for Friends?
Home
Reflection
1) Identify the aspects that are unclear to you or that concern you
2) Find a partner or two and discuss your concerns
3) Be prepared to teach others or share concerns
Home
Schools as Fixed or Random
• Problem: students and teachers are nested within schools (data are multilevel)
• Common problem in social science research: people nested within organizations
– If there is no control for organizations, members of a given organization are commonly affected by that organization
– Example: all students are commonly affected by their principal
– Implication: error terms are not independent, standard errors are biased, p values are wrong!
• Response: control for schools
• Fixed effects: enter a dummy variable for each school (except one) to control for school effects.
– Same way one controls for gender or race
• Random effects (multilevels):
– Assume there is a distribution (e.g., normal) of effects across schools, only estimate parameters of distribution
Home
Schools as Fixed or Random
• Fixed: essentially using dummy variables to control each school
– Spends degrees of freedom: 1 for each school
– Focus on individual effects within contexts
– Schools in the sample are the population of interest
– Controls for all unobservable factors associated with school
• Random: assume residual school effects are normally distributed
– Only estimate mean and variance, not each one
– Can estimate effects at individual or school level, as well as cross-level interactions (slopes as outcomes)
– Schools are considered a sample from a larger population
– Controls for all unobservable factors associated with school?
• Pretty much, with careful centering (see next results)
• Biggest difference is whether all predictors are adjusted for group characteristics
– Fixed effects: yes
– Random effects: no
• (unless you group mean center all variables)
• Subtract the group (school) mean from each predictor
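Group mean centering can be sketched as follows. This is a minimal Python illustration, not the workshop's SPSS procedure; the function name `group_mean_center` and the example values are assumptions made for the sketch. Each predictor value has the mean of its own school subtracted, so only within-school variation remains.

```python
from collections import defaultdict

def group_mean_center(values, groups):
    """Subtract each group's mean from the values belonging to that group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        totals[group] += value
        counts[group] += 1
    return [value - totals[group] / counts[group]
            for value, group in zip(values, groups)]

# Hypothetical data: years teaching for five teachers in two schools.
school = ["A", "A", "A", "B", "B"]
yrstch = [2.0, 4.0, 6.0, 10.0, 20.0]

centered = group_mean_center(yrstch, school)
print(centered)
```

After centering, every school has mean zero on the predictor, which is why a random-effects model on centered predictors adjusts for school characteristics in much the same way as school dummies do.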
Home
Syntax for Schools as Random versus Fixed Effects
UNIANOVA attracth BY school WITH bct expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna /METHOD = SSTYPE(3) /INTERCEPT = INCLUDE /PRINT = PARAMETER /CRITERIA = ALPHA(.05) /DESIGN = bct expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna school .
SORT CASES BY school (A) .
AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=school /bct_mean = MEAN(bct) /expanseh_mean = MEAN(expanseh) /white_mean = MEAN(white) /female_mean = MEAN(female) /leave_mean = MEAN(leave) /glevel_mean = MEAN(glevel) /nograde_mean = MEAN(nograde) /owned_mean = MEAN(owned) /yrstch_mean = MEAN(yrstch) /leader_mean = MEAN(leader) /nbct_mean = MEAN(nbct) /nbctsq_mean = MEAN(nbctsq) /bcttreat_mean = MEAN(bcttreat) /leadna_mean = MEAN(leadna).
COMPUTE bct = bct - bct_mean.
COMPUTE expanseh = expanseh - expanseh_mean.
COMPUTE white = white - white_mean.
…………………………………
COMPUTE leadna = leadna - leadna_mean.
EXECUTE .
SAVE OUTFILE=*+ ' inference workshop\spss dataset\cen.sav' /keep school attracth q71 bct expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna.
UNIANOVA attracth BY school WITH bct expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna /RANDOM=school /METHOD = SSTYPE(3) /INTERCEPT = INCLUDE /PRINT = PARAMETER /CRITERIA = ALPHA(.05) /DESIGN = bct expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna.
Home
Output Controlling for Schools as Random Effects
Compare with estimate of .622681 (se=.091653) from model Controlling for schools as fixed effects
Home
Statistical power in multilevels
• How to choose
– Number of cases per unit
– Number of units
• Where to allocate resources:
• Rules of thumb
– the larger the intraclass correlation (e.g., variation between schools), the more df are based on number of units, and you should sample more units and fewer per unit
– the smaller the intraclass correlation, the more df are based on number of observations within units, and you should sample more observations per unit.
– 80 is good to detect moderate effect.
– You need fewer if you have a pretest, which increases precision
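The rules of thumb above can be illustrated with the standard design effect for equal-sized clusters, 1 + (m − 1) × ICC. That formula is not on the slide; it is brought in here as an assumption to make the trade-off concrete, and the function names and sample sizes are hypothetical. This is a rough sketch, not a substitute for a proper power analysis.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n_units, cluster_size, icc):
    """Total sample size deflated by the design effect."""
    return n_units * cluster_size / design_effect(cluster_size, icc)

# Two ways to spend the same 800 observations (hypothetical budgets).
for icc in (0.20, 0.01):
    many_units = effective_sample_size(80, 10, icc)   # 80 schools x 10 each
    few_units = effective_sample_size(40, 20, icc)    # 40 schools x 20 each
    print(icc, round(many_units, 1), round(few_units, 1))
```

With a large intraclass correlation the design with more units keeps far more effective information; with a small one the two designs are close, so adding observations within units costs little.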
Home
Optimal design software
– http://sitemaker.umich.edu/group-based/home
– Developed by Raudenbush, S
– http://sitemaker.umich.edu/group-based/optimal_design_software
Home
References
http://sitemaker.umich.edu/group-based/references
Bloom, H. S., Richburg-Hayes, L., & Black, A. R. (2007). Using Covariates to Improve Precision for Studies That Randomize Schools to Evaluate Educational Interventions. Educational Evaluation and Policy Analysis, 29(1), 30-59. (http://epa.sagepub.com/cgi/content/abstract/29/1/30, 10-03-2007)
This article examines how controlling statistically for baseline covariates, especially pretests, improves the precision of studies that randomize schools to measure the impacts of educational interventions on student achievement. Empirical findings from five urban school districts indicate that (1) pretests can reduce the number of randomized schools needed for a given level of precision to about half of what would be needed otherwise for elementary schools, one fifth for middle schools, and one tenth for high schools, and (2) school-level pretests are as effective in this regard as student-level pretests. Furthermore, the precision-enhancing power of pretests (3) declines only slightly as the number of years between the pretest and posttests increases; (4) improves only slightly with pretests for more than 1 baseline year; and (5) is substantial, even when the pretest differs from the posttest. The article compares these findings with past research and presents an approach for quantifying their uncertainty.
Hedges, L. V., & Hedberg, E. C. (2007). Intraclass Correlation Values for Planning Group-Randomized Trials in Education. Educational Evaluation and Policy Analysis, 29(1), 60-87. (http://epa.sagepub.com/cgi/content/abstract/29/1/60, 10-03-2007)
Experiments that assign intact groups to treatment conditions are increasingly common in social research. In educational research, the groups assigned are often schools. The design of group-randomized experiments requires knowledge of the intraclass correlation structure to compute statistical power and sample sizes required to achieve adequate power. This article provides a compilation of intraclass correlation values of academic achievement and related covariate effects that could be used for planning group-randomized experiments in education. It also provides variance component information that is useful in planning experiments involving covariates. The use of these values to compute the statistical power of group-randomized experiments is illustrated.
Raudenbush, S. W. (1997). Statistical Analysis and Optimal Design for Cluster Randomized Trials. Psychological Methods, 2(2), 173-185. (raudenbush.1997.pdf, 1854.0 kb, 10-03-2007)
Raudenbush, S. W., & Liu, X. (2001). Effects of Study Duration, Frequency of Observation, and Sample Size on Power in Studies of Group Differences in Polynomial Change. Psychological Methods, 6(4), 387-401. (raudenbush.liu.2001.pdf, 1551.0 kb, 10-03-2007)
Raudenbush, S. W., Martinez, A., & Spybrook, J. (2007). Strategies for Improving Precision in Group-Randomized Experiments. Educational Evaluation and Policy Analysis, 29(1), 5-29. (http://epa.sagepub.com/cgi/content/abstract/29/1/5, 10-03-2007)
Interest has rapidly increased in studies that randomly assign classrooms or schools to interventions. When well implemented, such studies eliminate selection bias, providing strong evidence about the impact of the interventions. However, unless expected impacts are large, the number of units to be randomized needs to be quite large to achieve adequate statistical power, making these studies potentially quite expensive. This article considers when and to what extent matching or covariance adjustment can reduce the number of groups needed to achieve adequate power and when these approaches actually reduce power. The presentation is nontechnical.
Home
Differential Treatment Effects and Heckman’s Rationality
• Individuals choose treatments they expect will be most beneficial to them: they can anticipate the outcome of treatment.
– Treatment effect for treated > treatment effect for control
– Attend to assignment mechanism: factors that affect choice of treatment
• OLS estimates average treatment effect
– Invalidates paradigm of randomized experiment because people choose treatments.
Home
Differential Treatment Effects
Home
Policy Implications
• Treatment effect for treated evaluates effect of existing program for those who received it.
• Treatment effect for control evaluates effect of program if it is expanded to those now receiving the control.
Home
Propensity scores
– Estimate differential treatment effects
– Improve covariance adjustment
– Non-monotonic relationship between propensity and discriminant function of covariates
– Unequal variances in treatment and control group
» Dilation effect of treatment (Rosenbaum 2000)
– Motivate evaluation of assignment mechanism
• Cf. Heckman’s 2005 critique of Rubin/Holland model
– Align with counterfactual
• matched comparisons
– No need to match on all covariates
• comparisons within propensity strata
• Presentation loosely based on “Introduction to Propensity Score Matching,” Guo et al
– http://ssw.unc.edu/jif/sacws/docs/Day1a.ppt
Home
Definition of Propensity
Propensity of receiving treatment (i.e., s = 1) given covariates x: $e(x) = \Pr\{s = 1 \mid x\}$
Note that e(x) is not a probability, since all subjects have already received the treatment (1) or not (0).
It can be obtained as the predicted value from a logistic regression of s on x.
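As a rough illustration of how a propensity is obtained as a predicted value, the sketch below fits a one-covariate logistic regression by plain gradient ascent in Python. The data, the function names (`fit_logistic`, `propensity`), the step size and the number of iterations are all assumptions chosen for the example; real analyses would use a statistics package, as the SPSS syntax later in the workshop does.

```python
import math

def fit_logistic(x, s, steps=5000, rate=0.1):
    """Fit Pr{s = 1 | x} = 1 / (1 + exp(-(b0 + b1 * x))) by gradient ascent."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        grad0 = grad1 = 0.0
        for xi, si in zip(x, s):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            grad0 += si - p            # score for the intercept
            grad1 += (si - p) * xi     # score for the slope
        b0 += rate * grad0 / n
        b1 += rate * grad1 / n
    return b0, b1

def propensity(xi, b0, b1):
    """Predicted value e(x) from the fitted logistic model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))

# Made-up covariate and treatment indicator for ten subjects.
x = [0, 1, 2, 3, 4, 5, 1, 2, 3, 4]
s = [0, 0, 0, 1, 1, 1, 1, 0, 1, 0]

b0, b1 = fit_logistic(x, s)
e = [propensity(xi, b0, b1) for xi in x]
print([round(p, 2) for p in e])
```

Every subject gets a propensity strictly between 0 and 1 even though each has actually received the treatment or not.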
Home
Impact of an Unmeasured Confounding Variable on Inference of Effect of Board
Certification on Help Provided
Frank, K.A., Gary Sykes, Dorothea Anagnostopoulos, Marisa Cannata, Linda Chard, Ann Krause, Raven McCrory. 2008. Extended Influence: National Board Certified Teachers as Help Providers. Educational Evaluation and Policy Analysis. Vol 30(1): 3-30.
Home
Use Propensity to Weight Analysis: For Treatment Effect for Treated
$$\omega(t, x) = t + (1 - t)\,\frac{e(x)}{1 - e(x)}$$
Where ω is the weight, t = treatment (1 or 0), e(x) is propensity to have received the treatment (predicted value from logistic regression)
Home
Propensity and the Relevant Comparison: Estimate of Treatment for
those who received the treatment
Treatment Effect for Treated}
Home
For Treatment Effect for Untreated
$$\omega(t, x) = t\,\frac{1 - e(x)}{e(x)} + (1 - t)$$
Where ω is the weight, t = treatment (1 or 0), e(x) is propensity to have received the treatment (predicted value from logistic regression)
Home
Propensity and the Relevant Comparison: Estimate of Treatment for
those who received the control
Treatment Effect for control {
Home
Use Propensity to Weight Analysis: Estimate of Treatment for People at the Margin of Indifference (EOTM)
$$\omega(t, x) = \frac{t}{e(x)} + \frac{1 - t}{1 - e(x)}$$
Where ω is the weight, t = treatment (1 or 0), e(x) is propensity to have received the treatment (predicted value from logistic regression)
(Hirano and Imbens 2001; Robins, Rotnitzky and Zhao 1995)
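The three weighting schemes on the preceding slides (treatment effect for the treated, for the untreated, and EOTM) can be written as small Python functions. This is a sketch that follows the slide formulas; the function names are hypothetical, and it assumes every propensity lies strictly between 0 and 1.

```python
def weight_treated(t, e):
    """Treatment effect for the treated: treated count once,
    controls are weighted by the odds of treatment e / (1 - e)."""
    return t + (1 - t) * e / (1 - e)

def weight_untreated(t, e):
    """Treatment effect for the untreated: controls count once,
    treated are weighted by the odds of control (1 - e) / e."""
    return t * (1 - e) / e + (1 - t)

def weight_eotm(t, e):
    """EOTM: each case is weighted by the inverse of the propensity
    for the condition it actually received."""
    return t / e + (1 - t) / (1 - e)

# A treated and a control case, both with propensity 0.25 (made-up value).
for t in (1, 0):
    print(t, weight_treated(t, 0.25), weight_untreated(t, 0.25), weight_eotm(t, 0.25))
```

A treated case with a low propensity gets a large EOTM weight because it stands in for the many similar cases that ended up in the control.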
Home
Propensity and the Relevant Comparison: Estimate of Treatment for
People at the Margin of Indifference (EOTM)
Estimated Effect for People at the Margin
of Indifference (EOTM)
Home
General procedure for propensity score analysis
• Step 1) Estimate propensity of receiving the treatment (versus control)
– using logistic regression of factors predicting treatment versus control
– Interpret logistic regression
– Save predicted values: these are the propensities
• Step 2) Balance
– Compare distribution of propensity by treatment and control groups
– Compare treatment and control by covariates (balance) accounting for propensity
• Either by strata or using weights
• Step 3) Estimate effect of treatment on outcome by
– propensity strata
– matching treatment and control groups on propensity
• Includes composite matches (Heckman’s Kernel functions)
– Weighting analyses by propensity (Ken’s preferred)
– Controlling for propensity (Heckman’s control functions)
Home
Step 1) SPSS Syntax for propensity model and saving predicteds
GET FILE='F:\RA work\for Ken\causal inference\SPSS\workshop.sav'.
LOGISTIC REGRESSION bct
 /METHOD = ENTER expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna
 /SAVE = PRED
 /CRITERIA = PIN(.05) POUT(.10) ITERATE(20) CUT(.5) .
COMPUTE pbct=pre_1.
EXECUTE .
SAVE OUTFILE='F:\RA work\for Ken\causal inference\SPSS\withp.sav' /COMPRESSED.

GET FILE='F:\RA work\for Ken\causal inference\SPSS\withp.sav'.
COMPUTE pbct=pre_1.
IF (pbct > 0) pweight=bct/pbct + (1-bct)/(1-pbct).
IF (pbct > 0) pweightt=bct + (1-bct)/(1-pbct).
IF (pbct > 0) pweightc=bct/pbct + (1-bct).
EXECUTE .

VARIABLE LABELS pbct 'baseline propensity'.
VARIABLE LABELS pweight 'weight EOTM: those on the margin weight'.
VARIABLE LABELS pweightt 'weight for treatment effect for treated'.
VARIABLE LABELS pweightc 'weight for treatment effect for control'.
EXECUTE.
SAVE OUTFILE='F:\RA work\for Ken\causal inference\SPSS\pmp.sav' /COMPRESSED.
Home
Table 2: Logistic Regression for Being Board Certified
Independent Variable                          Estimate   Standard Error   Wald Chi-Square   Pr>ChiSq
Intercept                                     -6.8725    1.0566           35.1514           <.0001
White                                         -.078      .246             .101              .751
Female                                        1.447      .605             5.722             .017
Highest grade level taught                    -.0001     .023             .0000             .996
No grade level indicated                      -1.176     .776             2.297             .130
Level of own education                        .403       .100             16.348            <.0001
Years teaching                                .003       .011             .055              .814
Intention to leave                            -.097      .131             .549              .459
Perceived advantage of certification          .136       .160             .731              .393
Enhancement of teaching through leadership    .695       .157             19.482            <.0001
Missing on enhancement of teaching            .962       .582             2.735             .098
Number other teachers who helped respondent   0.185      .1100            2.818             0.1230
Number certified others in school             .1234      .068             3.306             .069
Number certified others in school squared     -.013      .014             .913              .339
Home
Interpreting logistic regression
• Key predictors
– Level of own education
– Enhancement of teaching through leadership
• Adjusting for context through number of others in school who were certified
• Keep even marginal variables in the model
• Logistic function correctly classifies 62% of cases when a case is classified as BCT if its probability > .13 (13% of teachers are Board certified)
Home
Step 2) Syntax for checking balance of propensity
GET
FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\pmp.sav'.
CROSSTABS
/TABLES=bct BY female
/FORMAT= AVALUE TABLES
/CELLS= COUNT ROW COLUMN TOTAL
/COUNT ROUND CELL .
SORT CASES BY bct (A) .
EXAMINE
VARIABLES=pbct pweight pweightt pweightc BY bct
/PLOT BOXPLOT HISTOGRAM
/COMPARE GROUP
/STATISTICS NONE
/CINTERVAL 95
/MISSING LISTWISE
/NOTOTAL.
Home
Boxplot Comparison of Distributions of Propensity between NBCTs and non-NBCTs: Common support
[Boxplot: Propensity Score by group (NBCT, Other)]
Home
EOTM Weights before Trimming
Home
Code for Trimming weight and recheck balance of propensity
RECODE pweight pweightc (20 thru Highest=20) .
EXECUTE .
SAVE OUTFILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\mp.sav' /COMPRESSED.
subtitle "visual of balance of weights and propensity".EXAMINE VARIABLES=pbct pweight pweightt pweightc BY bct /PLOT BOXPLOT HISTOGRAM /COMPARE GROUP /STATISTICS DESCRIPTIVES /CINTERVAL 95 /MISSING LISTWISE /NOTOTAL.
Home
Weights after trimming
Home
Syntax for checking balance of covariates
DESCRIPTIVES VARIABLES=pweight /STATISTICS=MEAN .COMPUTE npweight = pweight /1.943653666769.EXECUTE .
WEIGHT BY npweight .T-TEST GROUPS = bct(0 1) /MISSING = ANALYSIS /VARIABLES = attracth expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna /CRITERIA = CI(.95) .
WEIGHT BY npweight .SORT CASES BY bct .SPLIT FILE LAYERED BY bct .DESCRIPTIVES VARIABLES=attracth expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna /STATISTICS=MEAN STDDEV MIN MAX.
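The logic of the balance check (normalize the weights by their mean, then compare weighted covariate means across groups) can be sketched in Python. The data below are made up so that the true propensities are known, and the function names are hypothetical; the SPSS syntax above is what the workshop actually ran.

```python
def normalize(weights):
    """Divide each weight by the mean weight so the weights average 1."""
    mean_weight = sum(weights) / len(weights)
    return [w / mean_weight for w in weights]

def weighted_group_means(values, weights, treated):
    """Weighted mean of a covariate, separately for control (0) and treated (1)."""
    means = {}
    for group in (0, 1):
        num = sum(v * w for v, w, t in zip(values, weights, treated) if t == group)
        den = sum(w for w, t in zip(weights, treated) if t == group)
        means[group] = num / den
    return means

# Made-up data: a binary covariate x; 1 of 4 treated when x = 0, 3 of 4 when x = 1.
x       = [0, 0, 0, 0, 1, 1, 1, 1]
treated = [1, 0, 0, 0, 1, 1, 1, 0]
e       = [0.25 if xi == 0 else 0.75 for xi in x]   # propensities, known here

# EOTM weights: inverse of the propensity for the condition received.
eotm = [t / p + (1 - t) / (1 - p) for t, p in zip(treated, e)]
npweight = normalize(eotm)

unweighted = weighted_group_means(x, [1.0] * len(x), treated)
weighted = weighted_group_means(x, npweight, treated)
print(unweighted, weighted)
```

Unweighted, the treated and control groups differ sharply on x; with the EOTM weights the two weighted means coincide, which is what balance means here. Normalizing changes the scale of the weights but not the weighted means.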
Home
Testing for Balance, weighted by Propensity (EOTM)
Variable                                        BCT (n=162)      Non-BCT (n=1038)
Number other teachers helped by respondent      1.38 (3.59)      .89 (1.06)
Number other teachers who helped respondent     .90 (2.09)       .91 (.81)
White                                           .82 (1.00)       .84 (.39)
Female                                          .95 (.59)        .93 (.27)
Highest grade level taught                      8.42 (10.21)     8.3 (4.45)
No grade level indicated*                       .01 (.31)        .04 (.21)
Level of own education                          3.08 (2.57)      3.01 (1.10)
Years teaching                                  15.92 (18.22)    16.1 (9.56)
Intention to leave                              1.68 (1.96)      1.70 (.80)
Perceived advantage of certification            1.95 (1.22)      1.94 (.61)
Enhancement through leadership                  2.43 (3.13)      2.35 (1.29)
Number certified others in school               2.43 (6.50)      2.31 (2.63)
Home
Exercise: What is Balance without weights?
GET FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\mp.sav'.
subtitle "checking for balance among covariates".
T-TEST GROUPS = bct(0 1) /MISSING = ANALYSIS /VARIABLES = attracth expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna /CRITERIA = CI(.95) .
subtitle "checking for balance among covariates".
SORT CASES BY bct .SPLIT FILE LAYERED BY bct .DESCRIPTIVES VARIABLES=attracth expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna /STATISTICS=MEAN STDDEV MIN MAX.SPLIT FILE OFF.
Home
Step 3) syntax for estimating effects with weights
subtitle "weighted by pweight, EOTM".UNIANOVA attracth BY school WITH bct /REGWGT = npweight /METHOD = SSTYPE(3) /INTERCEPT = INCLUDE /PRINT = ETASQ PARAMETER /CRITERIA = ALPHA(.05) /DESIGN = bct school .
subtitle "weighted by pweightt, for treated".UNIANOVA attracth BY school WITH bct /REGWGT = npweightt /METHOD = SSTYPE(3) /INTERCEPT = INCLUDE /PRINT = ETASQ PARAMETER /CRITERIA = ALPHA(.05) /DESIGN = bct school .
subtitle "weighted by pweightc, for control".UNIANOVA attracth BY school WITH bct /REGWGT = npweightc /METHOD = SSTYPE(3) /INTERCEPT = INCLUDE/PRINT = ETASQ PARAMETER /CRITERIA = ALPHA(.05) /DESIGN = bct school .
*notes for syntax to get npweight, npweightt, npweightc
Home
Syntax and Output for Treatment Effect for Treated
UNIANOVA attracth BY school WITH bct /REGWGT = npweightt /METHOD = SSTYPE(3) /INTERCEPT = INCLUDE /PRINT = PARAMETER /CRITERIA = ALPHA(.05) /DESIGN = bct school .
Home
Syntax and Output for Treatment Effect for Control
UNIANOVA attracth BY school WITH bct /REGWGT = npweightc /METHOD = SSTYPE(3) /INTERCEPT = INCLUDE /PRINT = PARAMETER /CRITERIA = ALPHA(.05) /DESIGN = bct school .
Home
Syntax and Output for EOTM
UNIANOVA attracth BY school WITH bct /REGWGT = npweight /METHOD = SSTYPE(3) /INTERCEPT = INCLUDE /PRINT = PARAMETER /CRITERIA = ALPHA(.05) /DESIGN = bct school .
Home
Table 3: Estimated Effect of Board Certification on Amount of Help Provided
Non bootstrap standard errors in ()
.
Model*                                                                                              Coefficient   Std error   t-ratio   P Value
Weighted by propensity (EOTM)                                                                       .569          .138        4.12      <.001
Weighted by propensity (treatment effect for the treated)                                           .598          .130        4.60      <.001
Weighted by propensity (treatment effect for the control)                                           .562          .138        4.07      <.001
Unweighted, with covariates (a)                                                                     .603          .092        6.56      <.001
Unweighted with covariates, using multiple imputation                                               .621          .092        6.75      <.001
Unweighted, no covariates                                                                           .583          .092        6.35      <.001
Unweighted, no control for school                                                                   .540          .092        5.88      <.001
NBPTS certified teacher versus other teachers who applied, EOTM (n=280, bct=160, non-bct=120)       .577          .167        3.46      <.001
NBPTS certified teacher versus other teachers who did not apply, EOTM (n=1017, bct=160, non-bct=857) .562         .139        4.04      <.001
*Schools controlled for with fixed effects in all models unless otherwise stated. n=1131 unless otherwise stated. Standard errors based on 500 bootstrap replications. a R2=.21 for standard model with covariates.
Home
Interpretation
• Propensity weighting did not make much of a difference!
• Allowed for focus on different treatment effects
• In paper, applied robustness indices to estimates based on propensities
• Schools controlled for with fixed effects
– Accounts for any factor that can be attributed to schools
• Principal, student composition, unmeasured factors
Home
Criticisms of propensity scores
• No better than the covariates that go into it
• no control for unobservables
• Ambivalent about quality of propensity model
• Group overlap must be substantial
• Propensity model should not fit too well!
• fitting too well implies confounding of covariates and treatment
• not fitting well enough implies a poorly understood treatment mechanism, and therefore poor control
• Short-term biases (2 years) are substantially less than medium-term (3 to 5 year) biases: the value of comparison groups may deteriorate
Home
Reflection
1) Identify the aspects that are unclear to you or that concern you
2) Find a partner or two and discuss your concerns
3) Be prepared to teach others or share concerns
Home
Alternative to Weighting by Propensity
• Matching (Rosenbaum and Rubin 1983; Morgan 2001)
• Analyses by Strata (Morgan 2001)
• Kernel Matching (Heckman et al.)
• Control for propensity (Heckman and Robb’s control function – see Winship and Morgan 677).
Home
Matching, Propensity strata and Regression Adjustment
• Heckman refers to regression adjustment as same as matching and propensity strata. Here’s why:
• Matching with an infinite number of strata:
– One pair of observations, in treatment and control, within each stratum
• Implies that strata level is not related to treatment – there’s a treatment and control in each stratum.
• Estimate from matching would be mean difference between treatment and control groups
Home
Matching, Propensity strata and Regression Adjustment
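If there is one case in each stratum, the estimate from regression would be the mean difference between treatment and control because:
$$r_{x \cdot y | cv} = \frac{r_{x \cdot y} - r_{x \cdot cv}\, r_{y \cdot cv}}{\sqrt{1 - r_{y \cdot cv}^2}\,\sqrt{1 - r_{x \cdot cv}^2}}$$
But $r_{x \cdot cv} = 0$ (because there is one treatment and one control case within each stratum), so the numerator reduces to $r_{x \cdot y}$: controlling for cv removes nothing from the relationship between x and y, and regression generates the same estimate as the mean difference from matching.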
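The argument can be checked numerically. The Python sketch below implements the partial correlation formula from the slide and, as an added assumption not shown on the slide, the standard formula for the standardized regression coefficient of x when cv is also in the model. The function names and the correlation values are illustrative only.

```python
import math

def partial_corr(r_xy, r_xcv, r_ycv):
    """Partial correlation of x and y controlling for cv."""
    return (r_xy - r_xcv * r_ycv) / (
        math.sqrt(1 - r_ycv ** 2) * math.sqrt(1 - r_xcv ** 2))

def standardized_coef_x(r_xy, r_xcv, r_ycv):
    """Standardized coefficient for x in a regression of y on x and cv."""
    return (r_xy - r_xcv * r_ycv) / (1 - r_xcv ** 2)

r_xy, r_ycv = 0.30, 0.40   # illustrative correlations

# When treatment is unrelated to the stratum (r_xcv = 0), the coefficient
# for x is the same whether or not cv is controlled.
print(standardized_coef_x(r_xy, 0.0, r_ycv))   # equals r_xy
print(standardized_coef_x(r_xy, 0.5, r_ycv))   # changes once x and cv correlate
print(partial_corr(r_xy, 0.0, r_ycv))
```

The treatment coefficient is untouched by the control when the correlation between x and cv is zero; only the scaling of the partial correlation changes, through the variance in y that cv explains.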
Home
Syntax for Propensity by Strata
GET FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\forstrata.sav'.
RANK VARIABLES=rpbct (A) /RANK /NTILES (5) /PRINT=YES /TIES=MEAN .
RECODE Nrpbct (1=0) (2=1) (3=2) (4=3) (5=4) (SYSMIS=SYSMIS) INTO rpbct .EXECUTE .
SAVE OUTFILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\strata.sav' /COMPRESSED.
SORT CASES BY rpbct (A) .
SAVE OUTFILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\strata_s.sav' /COMPRESSED.
GET FILE='C:\Documents and Settings\kenfrank\My Documents\MyFiles\sykes\strata_s.sav'.
subtitle "checking for balance".SORT CASES BY rpbct .SPLIT FILE LAYERED BY rpbct .T-TEST GROUPS = bct(0 1) /MISSING = ANALYSIS /VARIABLES = attracth expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna /CRITERIA = CI(.95) .SPLIT FILE OFF.
SORT CASES BY rpbct .SPLIT FILE LAYERED BY rpbct .DESCRIPTIVES VARIABLES=attracth expanseh white female leave glevel nograde owned yrstch leader nbct nbctsq bcttreat leadna /STATISTICS=MEAN STDDEV MIN MAX.SPLIT FILE OFF.
subtitle "estimate by strata".SORT CASES BY rpbct .SPLIT FILE LAYERED BY rpbct .UNIANOVA attracth BY school WITH bct /METHOD = SSTYPE(3) /INTERCEPT = INCLUDE /PRINT = PARAMETER /CRITERIA = ALPHA(.05) /DESIGN = bct school.SPLIT FILE OFF.
Home
Estimates by Strata (including controls for school)
Strata Est se
1 .22 .33
2 .74 .21
3 .40 .21
4 .62 .19
Average .50
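The stratification logic can be sketched in Python: rank cases on propensity, cut into five strata, estimate the treatment and control difference within each stratum, and average. The data and function names are hypothetical. Unlike the SPSS syntax, the sketch breaks ties arbitrarily, does not control for school, and skips any stratum that lacks either treated or control cases.

```python
from statistics import mean

def quintile_strata(propensities):
    """Assign each case to a stratum 0..4 by rank of its propensity."""
    order = sorted(range(len(propensities)), key=lambda i: propensities[i])
    strata = [0] * len(propensities)
    for rank, i in enumerate(order):
        strata[i] = rank * 5 // len(propensities)
    return strata

def stratified_effect(y, treated, strata):
    """Simple average of within-stratum treated-minus-control mean differences."""
    effects = []
    for s in sorted(set(strata)):
        y_t = [v for v, t, k in zip(y, treated, strata) if k == s and t == 1]
        y_c = [v for v, t, k in zip(y, treated, strata) if k == s and t == 0]
        if y_t and y_c:                 # skip strata without both groups
            effects.append(mean(y_t) - mean(y_c))
    return mean(effects)

# Made-up data: ten cases, outcome = stratum level + 2 if treated.
propensities = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
treated = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
strata = quintile_strata(propensities)
y = [k + 2 * t for k, t in zip(strata, treated)]

print(stratified_effect(y, treated, strata))

# The four stratum estimates reported on the slide average to 0.495 (shown as .50).
slide_average = mean([0.22, 0.74, 0.40, 0.62])
```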
Home
Exercise
Identify an inference regarding an effect in your own work that might benefit from using propensity scores:
1) Is the “treatment” dichotomous?
2) Are you interested in differential treatment effects (e.g., for the control and for the treated)?
3) Do you know what factors affect treatment choice?
4) Which propensity approach appeals to you?
Home
Substantive Conclusion
• We infer that National Board certification has an effect on the amount of help teachers provide to others
– Effect is at least .5 of a standard deviation
– Largest effect is more than 1-to-1 diffusion: for every BCT, 1.5 others receive help (e.g., 4 BCTs help 6 others, a total of 10 in the school affected by the process).
– Let’s debate in quantitative terms of robustness indices.
Home
Policy Implications
• Extra Benefit of Board Certification
– Contribute to social capital
– Spread ideas of board certification
– Help other teachers innovate
• Offer incentives for Board Certification
– Can advocate policy because inferences robust
Home
Methodological Conclusion
• Propensity scores narrow the estimate
• Robustness Indices quantify threats to validity
• Robustness Indices more informative than propensity scores?
Home
Methods Reviewed
• Counterfactual (2 possible outcomes)
• Statistical control
– Random and fixed effects
• Robustness of inference
– for impact of a confounding variable (internal validity)
– for representativeness of sample (external validity)
– Robustness indices a form of sensitivity analysis
• Absorption
– Randomization
– Instrumental variables
– Pre-test
• Differential treatment effects
– Treatment effect for treated/for control
• Propensity scores
– Attention to assignment mechanism
• Logistic regression
– Using propensity scores in analysis
• Weighting
• Control
• Strata
• Matching
Home
References on Causal Inference
• Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81, 945-970.
• Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and non-randomized studies. Journal of Educational Psychology, 66, 688-701.
• Rubin, D. B. (2004). “Teaching Statistical Inference for Causal Effects in Experiments and Observational Studies.” Journal of Educational and Behavioral Statistics, Vol 29(3): 343-368.
• Winship, C., & Morgan, S. (1999). The Estimation of Causal Effects from Observational Data. Annual Review of Sociology, 25, 659-707.
• Winship, C. and Sobel, M. (2004). “Causal Inference in Sociological Studies.” Chapter 21 in Handbook of Data Analysis (Hardy, Melissa, and Bryman, Alan, eds.). London: Sage Publications.
• Heckman, James. (2005). “The Scientific Model of Causality.” Sociological Methodology.
• Manski, Charles F. 1995. Identification Problems in the Social Sciences. Cambridge, MA: Harvard University Press.
• Rosenbaum, Paul R. (2002). Observational Studies. New York: Springer.
On the Web
• http://www.wjh.harvard.edu/soc/faculty/winship/CFA_site.html (Winship’s portal)
• http://www.ets.org/research/dload/AERA_2004-Holland.pdf (recent Paul Holland)
• http://bayes.cs.ucla.edu/jp_home.html (Judea Pearl)
• http://plato.stanford.edu/entries/causation-counterfactual/ (philosophy of counterfactual)
• http://sekhon.berkeley.edu/causalinf/causalinf.pdf (syllabus on causal inference)
Home
Technical Appendix B for calculating Impact Thresholds
Bivariate (spreadsheet row 2, columns A to H)
A2  t critical    1.96
B2  n             1295
C2  r#            =+A2/SQRT(A2*A2+B2-3)
D2  observed t    7.34
E2  r(x,y)        =+D2/SQRT(B2-2+D2*D2)
F2  ITCV          =+(E2-C2)/(1-C2)
G2  r(x,cv)       =+SQRT(F2)
H2  r(y,cv)       =+SQRT(F2)
Multivariate (with other covariates, z, in model; spreadsheet row 7, columns A to H)
A7  t critical    1.96
B7  num z         45
C7  r#            =+A7/(SQRT(A7*A7+B2-B7-3))
D7  R2 (x,z)      0.15
E7  R2 (y,z)      0.13
F7  ITCV          =+F2*SQRT((1-D7)*(1-E7))
G7  r(x,cv)       =SQRT(+F7*SQRT((1-D7)/(1-E7)))
H7  r(y,cv)       =SQRT(+F7*SQRT((1-E7)/(1-D7)))
User enters values in yellow boxes
Indices calculated in pink
User can replace threshold value, r#, in green. Default is defined by statistical significance
Note that R2 (x,z) and R2 (y,z) only need to be entered to correct ITCV calculations in F7-H7.
Can be downloaded from http://www.msu.edu/~kenfrank/
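The spreadsheet formulas can be transcribed into Python for readers without the workbook. This sketch copies the cell formulas as they appear in the appendix (including the n − 3 and n − 2 terms) and uses the example inputs from the slide; the function and variable names are hypothetical, and it is not the downloadable spreadsheet itself.

```python
import math

def threshold_r(t_critical, n, num_z=0):
    """r#: correlation at the threshold for inference (cells C2 and C7)."""
    return t_critical / math.sqrt(t_critical ** 2 + n - num_z - 3)

def observed_r(t_observed, n):
    """r(x,y) recovered from the observed t (cell E2)."""
    return t_observed / math.sqrt(n - 2 + t_observed ** 2)

def itcv(r_xy, r_threshold):
    """Impact threshold for a confounding variable (cell F2)."""
    return (r_xy - r_threshold) / (1 - r_threshold)

# Bivariate case, using the inputs shown in the appendix.
t_critical, n, t_observed = 1.96, 1295, 7.34
r_sharp = threshold_r(t_critical, n)
r_xy = observed_r(t_observed, n)
impact = itcv(r_xy, r_sharp)
r_cv = math.sqrt(impact)            # r(x,cv) = r(y,cv) (cells G2, H2)

# Multivariate case: adjust for covariates z already in the model.
r2_xz, r2_yz = 0.15, 0.13
impact_mv = impact * math.sqrt((1 - r2_xz) * (1 - r2_yz))               # F7
r_x_cv = math.sqrt(impact_mv * math.sqrt((1 - r2_xz) / (1 - r2_yz)))    # G7
r_y_cv = math.sqrt(impact_mv * math.sqrt((1 - r2_yz) / (1 - r2_xz)))    # H7

print(round(r_sharp, 4), round(r_xy, 4), round(impact, 4), round(r_cv, 4))
print(round(impact_mv, 4), round(r_x_cv, 4), round(r_y_cv, 4))
```

In the multivariate case the two component correlations multiply back to the adjusted impact, so G7 and H7 describe one way of splitting the threshold between the confound's relationship with x and with y.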
Home
Questions for Scotte Page
• When a group makes a decision from statistical evidence
– Are they making a causal inference?
– How much of discourse is: “you didn’t control for xxx”?
• Ken says: how strong would an unmeasured factor have to be to invalidate the inference?
– Do you believe in statistical controls?
• Class project: Evaluating effect of NCLB sanctions on Michigan schools
– Compare schools just above cutoff for sanctions with those just below cutoff for sanctions