
Page 1: Research Designs That Assess Effectiveness

Research Designs That Assess Effectiveness

Christopher Schatschneider, Ph.D.

Florida Center for Reading Research

Florida State University

Presented at the pre-application meeting for the

Interagency Education Research Initiative (IERI)

Institute of Education Sciences

U.S. Department of Education

Washington, D.C.

February 21, 2003

Page 2: Research Designs That Assess Effectiveness

Why run an experiment?

• We run experiments to determine cause and effect relationships.

• We typically have a particular cause in mind, and want to know if it has an effect on an outcome, and if so, to what degree.

Page 3: Research Designs That Assess Effectiveness

Cause and effect

• What has to happen to establish a cause and effect relationship?
  – The cause must precede the effect.
  – The cause must be related to the effect.
  – We can find no other plausible alternative explanation for the effect other than the cause.

Page 4: Research Designs That Assess Effectiveness

Cause and effect

• Experiments are ideally suited to studying cause and effect relationships because they

– ensure that a presumed cause is deliberately manipulated and, thereby, precedes the observed effect

– incorporate procedures that help determine whether the cause is related to the effect

– incorporate procedures to minimize and/or assess the influence of extraneous factors that could produce the effect presumed to be attributable to the cause

Page 5: Research Designs That Assess Effectiveness

Factors that influence early achievement

• The IERI request for applications asks for research on reading, mathematics and the sciences

• These constructs are clearly influenced by multiple factors:
  – Type of school instruction
  – Time spent learning these skills in the classroom
  – Previous exposure to these skills outside the classroom
  – Home environment characteristics that impact these skills

Page 6: Research Designs That Assess Effectiveness

Overarching design issue

• As researchers, our goal is typically to investigate some intervention or approach that may impact these constructs.

• To do this, we must design studies that will
  – Maximize Systematic Variance
  – Minimize Error Variance
  – Control Extraneous Variance (Kerlinger, 1973)

Page 7: Research Designs That Assess Effectiveness

MAXMINCON

• Maximize Systematic Variance
  – Deliver the treatment with high fidelity and to the targeted population.
  – Collect information on variables that may impact treatment. Any variance you can account for becomes systematic variance; any variance left unaccounted for becomes error variance.

Page 8: Research Designs That Assess Effectiveness

MAXMINCON

• Minimize Error Variance
  – Use highly reliable measures
  – Use multiple measures of the same construct

• Control Extraneous Variance
  – This is probably the largest area of concern when developing a research design. We want to control the influence of other factors that may impact our outcome in addition to the treatment.

Page 9: Research Designs That Assess Effectiveness

How can we control extraneous influences?

• The best way to control extraneous influences is to employ random assignment of units to conditions.
  – Random assignment creates a hypothetical counterfactual that has the best probabilistic chance of being equivalent to the experimental group.

• A hypothetical counterfactual, in this case, is a group of people who represent what would have happened to the treatment group had they not received treatment.

• It is by comparing the hypothetical counterfactual to the treatment group that we are able to estimate treatment effects.

– Random assignment distributes nonexperimental causes across all groups.

– This allows for a strong statement about experimental causality, because any other influence on the outcome can differ across groups only by chance.

– Random assignment also protects against many threats to internal validity.
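To make the mechanics concrete, here is a minimal sketch of simple random assignment of units to conditions in Python. The unit names and the two-condition setup are illustrative assumptions, not part of the talk.

```python
import random

def randomly_assign(units, conditions=("treatment", "control"), seed=None):
    """Randomly assign units to conditions in (near-)equal numbers."""
    rng = random.Random(seed)
    shuffled = list(units)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)    # every ordering is equally likely
    # Deal the shuffled units round-robin into the conditions
    return {unit: conditions[i % len(conditions)]
            for i, unit in enumerate(shuffled)}

# Hypothetical example: six classrooms assigned to two conditions
print(randomly_assign(["class_A", "class_B", "class_C",
                       "class_D", "class_E", "class_F"], seed=42))
```

Recording the seed (or the assignment itself) lets reviewers verify that the assignment was in fact random.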

Page 10: Research Designs That Assess Effectiveness

Nonrandom assignment in experiments

• Sometimes, however, people cannot be randomly assigned to conditions. Random assignment:
  – May be unethical
  – May be impractical
  – May be very expensive
  – May cause artificially low external validity
  – May conflict with an interest in “intact groups” (e.g., gender)

• These designs must employ different strategies to compensate for the fact that random assignment did not occur.

• These types of designs (called quasi-experimental designs) still provide information about cause and effect relationships. They just have to work harder to do it.

Page 11: Research Designs That Assess Effectiveness

Quasi-experimental designs

• Quasi-experimental designs are still experiments and can provide information about cause and effect relationships

• However, they must overcome some threats to the validity of this approach.

Page 12: Research Designs That Assess Effectiveness

Threats to internal validity

1) Selection bias – individuals assigned to a given experimental condition happen to be different at the outset of the experiment in ways that might erroneously be attributed to the treatment. This is probably the biggest problem.

2) History effects – specific events in addition to the treatment (X) occur DURING the experiment.

3) Regression to the mean – refers to the tendency of individuals with extreme scores on one assessment to obtain scores that are closer to the population mean on a subsequent assessment (a short simulation after this list illustrates the effect).

4) Mortality/Attrition – differential loss of participants after treatment begins.
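As promised above, here is a minimal simulation of regression to the mean. All numbers (population mean 100, SD 15, the measurement noise, the cut-off of 80) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
true_score = rng.normal(100, 15, n)        # stable underlying ability
test1 = true_score + rng.normal(0, 10, n)  # pretest  = ability + noise
test2 = true_score + rng.normal(0, 10, n)  # posttest, independent noise

# Select an "extreme" group using a cut-off on the first test only
low = test1 < 80
print(f"Selected group, test 1 mean: {test1[low].mean():.1f}")
print(f"Same group,     test 2 mean: {test2[low].mean():.1f}")
# The test 2 mean drifts back toward 100 with no treatment at all,
# because part of each extreme test 1 score was just noise.
```

A single-pretest cut-off design can therefore show "improvement" that is pure artifact, which is why multiple baselines help.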

Page 13: Research Designs That Assess Effectiveness

How can quasi-experiments address these concerns?

• Shadish et al. argue for three principles when working with quasi-experimental designs:

1) Identification and evaluation of plausible threats to internal validity.

2) Control by design: multiple control groups, multiple baselines, statistical control as a last resort.

3) Coherent pattern matching: this principle involves making a complex prediction about a particular causal hypothesis that would leave few viable alternative explanations. The logic behind this principle is that the more complex the prediction, the less likely it is that a given alternative could generate the same results.

Page 14: Research Designs That Assess Effectiveness

Other threats to the validity of experiments

• There are other threats to experimental validity that must be addressed by both experimental and quasi-experimental designs.

Page 15: Research Designs That Assess Effectiveness

Construct validity

– Do the measures proposed have high reliability and are they appropriate for the age range?

– Do the measures adequately measure the construct being assessed?

– Are there multiple measures of the same construct?

Page 16: Research Designs That Assess Effectiveness

External validity

• Will the effects generalize? To whom will they generalize?
  – How will the sample be obtained?
  – What will the characteristics of the sample be?
  – Is the sample somehow different from the population it is meant to generalize to?

Page 17: Research Designs That Assess Effectiveness

Statistical conclusion validity

• Are the proposed analyses appropriate for the hypotheses stated? Do the analyses “map up” to the hypotheses?

• Do the analyses take into account the nature of the data? I am going to take a wild guess and state that almost all of the research proposals will propose collecting data that is nested. The proposed data analyses need to account for the nested structure of the data.

• Does the proposed design have enough power? Will you control for Type I error?

Page 18: Research Designs That Assess Effectiveness

Design and analysis tips

• DISCLAIMER: The following tips are my own and are not endorsed by anyone at IERI.

• These are the suggestions I would provide if you were to approach me personally and ask for my opinion.

Page 19: Research Designs That Assess Effectiveness

Tip #1

• Randomize units to conditions whenever possible. Randomization is one of the best tools we have to control extraneous variance.

• Try to randomize units to conditions at your highest unit of nesting. For example:
  – If your study involves teacher training as the treatment, try to randomize teachers to treatments.
  – If your study involves school reform models, then try to randomize at the school level.

• Block randomization is useful when the number of higher-level units is low.

Page 20: Research Designs That Assess Effectiveness

Block Randomization

• Block randomization works by grouping units together based on some factor that needs to be controlled, then randomly assigning the units within the block to the experimental conditions. This strategy can sometimes take the “lumpiness” out of the randomization process.
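A minimal sketch of that procedure in Python; the school and district names are hypothetical, and blocks whose size is not a multiple of the number of conditions will be only near-balanced.

```python
import random
from collections import defaultdict

def block_randomize(units, blocks, conditions=("treatment", "control"), seed=None):
    """Randomly assign units to conditions *within* each block."""
    rng = random.Random(seed)
    by_block = defaultdict(list)
    for unit, block in zip(units, blocks):
        by_block[block].append(unit)          # group units by blocking factor
    assignment = {}
    for block_units in by_block.values():
        rng.shuffle(block_units)              # randomize order within the block
        for i, unit in enumerate(block_units):
            # round-robin keeps conditions (near-)balanced inside each block
            assignment[unit] = conditions[i % len(conditions)]
    return assignment

# Hypothetical example: 8 schools blocked by district, randomized within district
schools   = [f"school_{i}" for i in range(8)]
districts = ["north"] * 4 + ["south"] * 4
print(block_randomize(schools, districts, seed=1))
```

Because each district contributes equally to both conditions, district differences cannot masquerade as a treatment effect.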

Page 21: Research Designs That Assess Effectiveness

Tip #2

• If you can’t randomize, control.
  – If you aren’t going to randomize, you have to prove to the reviewers that you will be able to make valid treatment inferences.
  – Options include:

• Multiple control groups. This would increase the chances of being able to make valid comparisons to the treatment group.

• Multiple baselines. This applies if you are forming groups based on some extreme score (for example, if you are planning on using a cut-off score for admittance to the study). Multiple pretest measures greatly reduce regression-to-the-mean effects.

• Measure relevant variables at baseline and propose to investigate baseline differences among the groups

• Employ statistical controls. This should really be used as a last resort.

• Hypothesize some within-treatment effects.

Page 22: Research Designs That Assess Effectiveness

Tip #3

• If you have nested data, propose analyses that can account for the nesting (a brief sketch follows the list).
  – Hierarchical Linear Models
    • Random Effects ANOVA
    • Random Effects Regression
    • Growth curves
    • Latent growth curves
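One way such an analysis might look: a random-intercept model fit with statsmodels on simulated students-within-classrooms data. All names and numbers below are illustrative assumptions, not from the talk.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate students nested within classrooms, treatment assigned per classroom
rng = np.random.default_rng(0)
n_class, n_per = 20, 15
classroom    = np.repeat(np.arange(n_class), n_per)
treatment    = np.repeat(rng.integers(0, 2, n_class), n_per)   # classroom-level
class_effect = np.repeat(rng.normal(0, 2, n_class), n_per)     # shared classroom variance
score = 50 + 3 * treatment + class_effect + rng.normal(0, 5, n_class * n_per)
df = pd.DataFrame({"score": score, "treatment": treatment, "classroom": classroom})

# Random intercept for classroom accounts for the nesting
model = smf.mixedlm("score ~ treatment", data=df, groups=df["classroom"])
print(model.fit().summary())
```

Ignoring the classroom grouping here would understate the standard error of the treatment effect, because students in the same classroom are not independent observations.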

Page 23: Research Designs That Assess Effectiveness

Tip # 4

• Make sure your analyses map up to your hypotheses (and make it easy for the reviewer to see it).

Page 24: Research Designs That Assess Effectiveness

Tip # 5

• Make sure your proposed statistics have adequate power.
  – With nested designs, this becomes difficult.
  – Power simulations are always an option.
  – Another strategy is the “at least” strategy (sketched below):
    • Perform a power analysis assuming that you are going to aggregate your data to the highest unit.
    • Report that your power will be greater than this.
  – Consulting a statistician is another option.
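A minimal sketch of the “at least” strategy using statsmodels; the effect size, cluster counts, and alpha are illustrative assumptions.

```python
from statsmodels.stats.power import TTestIndPower

# "At least" strategy: pretend the analysis happens on classroom means only,
# so the sample size per arm is the number of classrooms, not students.
analysis = TTestIndPower()
power = analysis.solve_power(effect_size=0.5,  # assumed Cohen's d at the classroom level
                             nobs1=30,         # assumed 30 classrooms per condition
                             alpha=0.05)
print(f"Power at the aggregated level: {power:.2f}")
# The full multilevel analysis also uses the student-level data,
# so its power should be at least this value.
```

This gives a conservative floor you can report, rather than an exact power figure that would require assumptions about the intraclass correlation.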

Page 25: Research Designs That Assess Effectiveness

Conclusions

• Randomize when you can

• Control when you can’t

• Provide evidence that your treatment effects will be correctly estimated

Page 26: Research Designs That Assess Effectiveness

Finally

• I want to acknowledge that my presentation today was greatly influenced by Shadish, Cook, and Campbell (2002), and that I also borrowed much from a book chapter I co-authored with Frank Vellutino.

Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.

Vellutino, F., & Schatschneider, C. (2003). Experimental and quasi-experimental design in literacy research. To appear in Literacy Research Methods (N. Duke & M. Mallette, Eds.). Guilford Press.