
1

Evidence Based Policy, Evidence Grading Schemes and Entities, and Ethics in Complex Research Systems

Robert Boruch, University of Pennsylvania

September 14-15, 2008

5th European Conference on Complex Systems, Jerusalem

2

Summary of Themes

People want to know “what works” so as to inform their decisions.

The scientific quality of evidence on “what works” is variable and often poor.

People’s access to information on what works through the internet is substantial.

Organizations and data bases have been created to (a) develop evidence grading schemes, (b) apply the schemes in systematic reviews of evidence from multiple studies, and (c) disseminate results through the internet.

3

The “What Works” Theme

“What works” refers here to estimating the effects of social, educational, or criminological interventions

In a statistically/scientifically unbiased way

And so as to generate a statistical statement of one’s confidence in the results.

4

Evidence Based Policy/Law: A Driver of Interest in What Works

US, Canada, UK, Israel (e.g. National Academy)

Sweden, Norway, Denmark

Australia, Malaysia, China

Mexico, others in Central America

Multinationals: OECD, World Bank

Others

5

Information Glut as Driver: Naïve Web Searches

A Google search on “evidence based” yields 9,660,000 links.

A Google search on “what works” yields 6,350,000 links (.21 seconds).

A Google search on “evidence based practice” yields 2,000,000 links (.42 seconds).

A Google search on “evidence based policy” yields 132,000 links (.35 seconds).

What are we to make of this?

6

Publication Rates in Education

About 20,000 articles on education are published each year in English language journals

Only 2 to 5 per 1,000 of these each year (roughly 40 to 100 articles) report on controlled trials of programs, policies, or practices to estimate effectiveness

For every curriculum package that has been tested in a controlled trial, there are 50-80 that are claimed to be effective based on no defensible scientific evidence.

7

Relevant Organizations Nested

National, State/Provincial, Municipal:

Policy or law

Agencies within nation, etc., e.g. National Science Foundations, Institute of Education Sciences (US), University Research

Programs and projects within agencies

Data bases and reports within projects

Users of information at each level, e.g. scientists, policy people, the public

8

International Organizations: NGOs

Cochrane Collaboration in Health Care: http://cochrane.org

Campbell Collaboration in education, welfare, crime and justice: http://campbellcollaboration.org

9

Two Examples Here

International Campbell Collaboration in education, welfare, crime and justice

What Works Clearinghouse in education (Institute of Education Sciences, US)

10

Data Bases in this Context

Evidence Grading Schemes currently focus on reports of statistical analyses of impact, not, as yet, on micro-records of individuals.

Example: 5-10 statistical reports (part of the ingredients of a data base) evaluating the impact of conditional income transfer programs in developing regions

Example: Cochrane Collaboration data base on randomized trials contains nearly 0.5 million such reports

“Meta-analysis” of results of multiple studies

11

C2 SPECTR

C2 Social, Psychological, Educational, and Criminological Trials Register

13,000+ entries on randomized and possibly randomized trials

Feeding into C2 systematic reviews

Feeding into the IES What Works Clearinghouse (USDE)

12

The Campbell Collaboration

Mission: since 2000, prepare, maintain and make accessible C2 systematic reviews of evidence on the effects of interventions (“what works”) to inform decision makers and stakeholders.

International and Multidisciplinary: Education, social welfare/services, crime and justice

http://campbellcollaboration.org

Precedent: Cochrane Collaboration in health (1993)

13

Nine Key Principles of C2: A Scientific Ethic

1. Collaborating across Nations and Disciplines

2. Building on Enthusiasm

3. Avoiding Duplication

4. Minimizing Bias

5. Keeping Current

6. Striving for Relevance

7. Promoting Access

8. Ensuring Quality

9. Maintaining Continuity

14

What are Evidence Grading Schemes (EGSs)?

These are inventories (guidance, checklists, scales) or processes that…

facilitate making transparent and uniform scientific judgments about…

The quality of evidence on effects of programs or practices or policies

15

C2’s and Others’ Major Evidence Grading Distinction on What Works

Randomized controlled trials yield the least biased and least equivocal evidence on “what works”, i.e. the effect of a new intervention (program, practice, etc.)

Alternative methods to estimate the effect of interventions yield more equivocal and more biased estimates of effect, e.g. “before-after” evaluations and other nonrandomized trials.

Both randomized trials and nonrandomized trials are important, but they must be separated in evidence grading schemes.

16

Example: Randomized Controlled Trial

Individuals or entities such as villages or organizations are randomly allocated to one of two or more interventions

The random allocation assures a fair comparison of the effects of the interventions

And the random allocation assures a statistically credible statement about confidence in the result, e.g. confidence interval and statistical tests

17

More Specific Example

A new commercial curriculum package for math education is the intervention under investigation

The new curriculum is RANDOMLY allocated to half of a sample of 100 schools, with the remaining half of schools serving as a control group, so as to form two equivalent groups of schools (fair comparison)

The outcomes, such as achievement test scores, from the intervention group and the control group are compared
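The allocation and comparison steps in this example can be sketched in code. This is a minimal illustration rather than the procedure used in any actual trial: the school identifiers, the seeds, and the test scores are all made up, and the 95% interval uses a simple normal approximation.

```python
import random
import statistics


def allocate_schools(school_ids, seed=0):
    """Randomly split schools into equal-sized intervention and control groups."""
    rng = random.Random(seed)
    shuffled = list(school_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def difference_in_means(treated, control, z=1.96):
    """Return the mean difference and an approximate 95% confidence interval."""
    diff = statistics.mean(treated) - statistics.mean(control)
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return diff, (diff - z * se, diff + z * se)


# Hypothetical sample of 100 schools, as in the example above.
schools = [f"school_{i:03d}" for i in range(100)]
intervention, control = allocate_schools(schools, seed=2008)

# Synthetic achievement scores, invented only to exercise the comparison step.
score_rng = random.Random(1)
scores = {school: score_rng.gauss(500, 50) for school in schools}

diff, ci = difference_in_means(
    [scores[s] for s in intervention],
    [scores[s] for s in control],
)
print(len(intervention), len(control))
print(f"difference = {diff:.1f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```

Because chance alone decides which schools receive the curriculum, any systematic difference in outcomes between the two groups can be attributed to the intervention, within the stated confidence interval.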

18

Entities and Evidence Grading Schemes for What Works

Cochrane Collaboration: Systematic reviews in health

Campbell Collaboration: crime, education, welfare

Society for Prevention Research (Prevention Science 2006)

What Works Clearinghouse, Institute of Education Sciences WWC IES http://whatworks.ed.gov

Food and Drug Administration, other regulatory agencies

National Register of Evidence-based Programs and Practices

Others: California etc.

19

What are the Ingredients of EGSs?

Pre-specification of primary outcomes

Pre-specification of all analyses

Pre-specification of all measures

Control for assignment/selection bias

Appropriate comparison condition

Control for subject awareness of assigned intervention

Control for provider awareness of assigned intervention

Control for data collector awareness of assigned intervention

Assurances to participants to elicit disclosure

Intervention fidelity/Measurement of exposure

Control for contamination and co-intervention

Reliability and validity of exposure measures

Comparison condition fidelity

Reliability of outcome measures

Validity of outcome measures

Adherence to standards for data collection

Adjustment for differential attrition

Adjustment for overall loss to follow-up

Adjustment for missing data

Analysis meets statistical assumptions

Analysis consistent with study theory

Adjustment for multiple measures

Absence of or explanation for anomalous findings

20

WWC Aims

To be a trusted source of scientific evidence on what works, what does not, and on where evidence is absent…

Not to endorse products

http://www.whatworks.ed.gov

21

What Works Clearinghouse Illustration

22

Beginning Reading Review Protocol

The Beginning Reading What Works Clearinghouse (WWC) review focuses on reading interventions for students in grades K–3 (or ages 5-8) that are intended to increase skills in alphabetics (phonemic awareness, phonological awareness, letter recognition, print awareness and phonics), reading fluency, comprehension (vocabulary and reading comprehension), or general reading achievement. Interventions for this review are defined as programs, products, practices, or policies that are intended to increase skills in the areas named above. For the first set of intervention Beginning Reading reports, the WWC focused on “branded” programs and products.

Effectiveness ratings for Beginning Reading programs in four domains

Intervention    Alphabetics    Comprehension    Fluency    General reading achievement

DaisyQuest

Reading Recovery®

WWC Intervention Reports provide all findings that "Meet Evidence Standards" or "Meet Evidence Standards with Reservations" for studies on a particular intervention. Intervention reports are created for those interventions that have at least one study that "Meets Evidence Standards" or "Meets Evidence Standards with Reservations." Intervention reports are one component of the decision-making process, but should not be the sole source of information when making educational decisions.

Key

Positive effects: strong evidence of a positive effect with no overriding contrary evidence

Potentially positive effects: evidence of a positive effect with no overriding contrary evidence

Mixed effects: evidence of inconsistent effects

No discernible effects: no affirmative evidence of effects

Potentially negative effects: evidence of a negative effect with no overriding contrary evidence

Negative effects: strong evidence of a negative effect with no overriding contrary evidence

23

Example: C2 Parental Involvement Trials

500 possibly relevant studies of impact

45 Possible Randomized Controlled Trials (RCTs)

20 RCTs Met Study Inclusion Criteria

18 RCTs included in the Meta-Analysis

Nye, Turner, Schwartz

http://campbellcollaboration.org

24

Statistics for each study: Hedges's g and 95% CI

Study name            Comparison             Outcome       Hedges's g   Lower limit   Upper limit
Ryan (1964)           Parent_vs_Control      Combined           0.347         0.088         0.605
Aronson (1966)        Combined               Read_Ach           1.109         0.421         1.798
Clegg (1971)          Combined               Combined           0.776        -0.098         1.651
Hirst (1974)          Parent_vs_Control      Combined           0.181        -0.217         0.579
Henry (1974)          Combined               Combined           0.281        -0.677         1.239
O'Neil (1975)         Combined               Combined           0.223        -0.724         1.169
Tizard (1982)         Combined               Read_Comp          0.879         0.369         1.390
Heller (1993)         ParentRpt_vs_Control   Combined           1.496         0.881         2.110
Miller (1993)         Combined               Combined           0.164        -0.557         0.884
Roeder (1993)         Parent_vs_Control      Math_Ach           0.123        -0.445         0.692
Fantuzzo (1995)       Combined               Combined           0.741        -0.047         1.529
Ellis (1996)          Parent_vs_Control      Combined          -0.116        -0.652         0.420
Joy (1996)            Combined               Cr_Math_Ach        0.114        -0.842         1.071
Peeples (1996)        Parent_vs_Control      Combined           0.920         0.345         1.495
Kosten (1997)         Parent_vs_Control      Science_Ach        0.075        -0.573         0.723
Hewison (1988)        Combined               Read_Comp          0.646         0.089         1.203
Meteyer (1998)        Parent_vs_Control      Combined           0.381        -0.164         0.925
Powell-Smith (2000)   Combined               Combined          -0.298        -1.076         0.480

Model: Fixed                                                    0.430         0.299         0.561
Model: Random                                                   0.453         0.248         0.659

Forest plot axis: Hedges's g from -2.00 to 2.00; values below 0.00 favor control, values above 0.00 favor treatment.

Figure: Efficacy of Parent Involvement on Student Achievement

Heterogeneity Statistics for a Fixed Effects Model: Q=35.6, df=17, Prob.=0.005, and I Squared=52.3%.
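The fixed-effect row and the heterogeneity statistics can be approximately reproduced from the per-study values in the forest plot data. The sketch below is not the reviewers' own computation. It assumes a standard inverse-variance fixed-effect model and recovers each study's standard error from the width of its reported 95% interval (an assumption, since the slide does not list standard errors). The function and variable names are illustrative.

```python
import math

# (study, Hedges's g, lower 95% limit, upper 95% limit) as reported in the forest plot.
STUDIES = [
    ("Ryan (1964)", 0.347, 0.088, 0.605),
    ("Aronson (1966)", 1.109, 0.421, 1.798),
    ("Clegg (1971)", 0.776, -0.098, 1.651),
    ("Hirst (1974)", 0.181, -0.217, 0.579),
    ("Henry (1974)", 0.281, -0.677, 1.239),
    ("O'Neil (1975)", 0.223, -0.724, 1.169),
    ("Tizard (1982)", 0.879, 0.369, 1.390),
    ("Heller (1993)", 1.496, 0.881, 2.110),
    ("Miller (1993)", 0.164, -0.557, 0.884),
    ("Roeder (1993)", 0.123, -0.445, 0.692),
    ("Fantuzzo (1995)", 0.741, -0.047, 1.529),
    ("Ellis (1996)", -0.116, -0.652, 0.420),
    ("Joy (1996)", 0.114, -0.842, 1.071),
    ("Peeples (1996)", 0.920, 0.345, 1.495),
    ("Kosten (1997)", 0.075, -0.573, 0.723),
    ("Hewison (1988)", 0.646, 0.089, 1.203),
    ("Meteyer (1998)", 0.381, -0.164, 0.925),
    ("Powell-Smith (2000)", -0.298, -1.076, 0.480),
]


def fixed_effect(studies, z=1.96):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I-squared."""
    effects = [g for _, g, _, _ in studies]
    # Assumption: each interval is g +/- z * SE, so SE = width / (2 * z).
    weights = [1 / ((upper - lower) / (2 * z)) ** 2
               for _, _, lower, upper in studies]
    total_weight = sum(weights)
    pooled = sum(w * g for w, g in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    q = sum(w * (g - pooled) ** 2 for w, g in zip(weights, effects))
    df = len(studies) - 1
    i_squared = max(0.0, (q - df) / q) * 100
    return {
        "pooled": pooled,
        "ci": (pooled - z * pooled_se, pooled + z * pooled_se),
        "q": q,
        "df": df,
        "i_squared": i_squared,
    }


result = fixed_effect(STUDIES)
print(result)
```

Under these assumptions the output should land close to the reported fixed-effect estimate, interval, Q, and I Squared. Small discrepancies are expected because the inputs are rounded to three decimals.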

25

Example: Petrosino et al on Scared Straight Trials

Over 600 articles that are possibly relevant to impact of Scared Straight

Only 15 reach a “reasonable” level of scientific standard

Only 7 met the standard of being a randomized controlled trial.

26

Figure 1. The effects of Scared Straight and other juvenile awareness programs on juvenile delinquency: random effects model, “first effect,” reported in the study (Petrosino, Turpin-Petrosino, and Buehler, 2002)

n=number of failures

N=number of participants

CI=confidence intervals

Random=random effects model assumed

27

C2 Product: Scared Straight

Pro Humanitate Award

Observational Studies

Ashcroft: -50% crime

Buckner: 0%

Berry: -5%

Mitchell: -53%

Several dozen others

Randomized Trials

Mich: +26% crime

Gtr Egypt: +5%

Yarb: +1%

Orchow: +2%

Vreeland: +11%

Finckenauer: +30%

Lewis: +14%

28

Scientific Ethic

Providing access to scientific reports of evaluations of the effect of interventions, e.g. journal publications and limited circulation reports from governments or private organizations

Providing information beyond reports to assure understanding

In principle, but not always in practice, providing access to micro-records from impact evaluations

29

Ethics of Research on Humans

Evidence Grading Schemes and organizations need not worry about individual privacy because they do not, as yet, have access to individuals’ records in identifiable form

They rely only on statistical/scientific reports, published in peer-reviewed journals and elsewhere, which include no individual records.

30

Ethics and Law: US

Individual rights to privacy are routinely assured on account of professional ethics statements and laws in the US.

The relevant codes of professional ethics in US include those of AERA, ASA, AAPOR, APA, and others.

The relevant laws in the US include the Family Educational Rights and Privacy Act (FERPA), the Privacy Act, and HIPAA

31

Ethics and Randomized Controlled Trials

Relevant codes and law concern individual privacy and confidentiality of individuals’ identifiable micro-records

Relevant regulations and codes include attention to informed consent (45 CFR 46)

Access to anonymous micro-records for secondary analysis is problematic and possibly unnecessary in this context

32

Appendices

33

Robert Boruch: Bio

Boruch is the University Trustee Chair Professor in the Graduate School of Education and the Statistics Department of the Wharton School at the University of Pennsylvania, Philadelphia, Pennsylvania.

Boruch is a Fellow of the American Statistical Association, Academy of Experimental Criminology, American Academy of Arts and Sciences, and American Educational Research Association.

Email: [email protected]

34

Provision to Advance Rigorous Evaluations in Legislation

The program shall allocate X% of program funds [or $Y million] to evaluate the effectiveness of funded projects using a methodology that –

– Includes, to the maximum extent feasible, random assignment of program participants (or entities working with such persons) to intervention and control groups; and

– Generates evidence on which program approaches and strategies are most effective.

The program shall require program grantees, as a condition of grant award, to participate in such evaluations if asked, including the random assignment.

35

Provision to Advance Replication of Research-Proven Interventions

Agency shall establish a competitive grant program focused on scaling up research-proven models

Grant applicants shall –

– Identify the research-proven model they will implement, including supporting evidence (well-designed RCTs showing sizeable, sustained effects on important outcomes);

– Provide a plan to adhere closely to key elements of the model; and

– Obtain sizeable matching funds from other sources, especially large formula grant programs.

36

A Focus on Data Bases that Concern “What Works”

Here, the focus is on projects that generate evidence about “what works” and what does not work, using good scientific standards

This is different from a focus on projects or programs that generate information on the nature of a problem, monitor program compliance with law, etc.

37

What are the Campbell Collaboration (C2) Assumptions?

Public interest in evidence based policy and practice will increase.

Scientific and government interest in cumulation and synthesis of evidence on “what works” will increase.

Access to information and evidence of dubious quality, and the need to screen for quality of evidence, will increase.

The use of randomized controlled trials to generate trustworthy evidence on what works will increase.

38

What are the Products?

1. Registries of C2 Systematic Reviews of the effects of interventions (C2-RIPE)

2. Registries of reports of randomized trials and nonrandomized trials (C2-SPECTR) and future reports of randomized trials (C2-PROT)

3. Standards of evidence for conducting C2 Systematic reviews

4. Annual Campbell Colloquia

5. Training for producing reviews

6. New technologies and methodologies

7. Web site: http://www.campbellcollaboration.org

39

What are Other C2 Products?

C2 Trials Register (C2 SPECTR): 13,000 entries

Annals of the American Academy of Political and Social Sciences: Special Issues

C2 Prospective Trials Register

C2 Policy Briefs

Annual and Intermediate Meetings: London, Philadelphia, Stockholm, Lisbon, Paris, Oslo, Copenhagen, Helsinki, Los Angeles

40

Hand Search vs Machine Based Search

Journal of Educational Psychology ('03-'06)

Hand search: RCT=66

Full Text Elec N=99: 59% accurate, 41% false positives, 24% false negatives

Abstract only Elec N=11: 91% accurate, 9% false positive, 85% false negative
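The abstract-only figures can be checked with a short calculation. The sketch assumes that "accurate" means the share of electronic hits that are true RCTs, that "false positive" is the remaining share of hits, and that "false negative" is the share of the 66 hand-found RCTs that the electronic search missed. The slide does not state these definitions, and the count of 10 true hits is inferred from 91% of 11. The full-text row is left out because its derivation is not given. The function name and parameters are illustrative.

```python
def search_accuracy(hits, true_hits, gold_total):
    """Summarize an electronic search against a hand-search gold standard.

    hits: number of records the electronic search returned
    true_hits: how many of those records are genuine RCTs (assumed count)
    gold_total: number of RCTs found by hand search
    """
    return {
        "accurate": true_hits / hits,
        "false_positive": (hits - true_hits) / hits,
        "false_negative": (gold_total - true_hits) / gold_total,
    }


# Abstract-only search: 11 hits, of which 10 are assumed to be true RCTs,
# compared with the 66 RCTs identified by hand search.
abstract_only = search_accuracy(hits=11, true_hits=10, gold_total=66)
for name, value in abstract_only.items():
    print(f"{name}: {value:.0%}")
```

Read this way, the abstract-only search is precise but misses most of the trials that hand searching finds, which is the contrast the slide draws.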

41

What Is the Value Added?

Building a cumulative knowledge base

Developing exhaustive searches

Producing transparent and uniform standards of evidence

International scope

Periodic updating

Making reviews accessible

42

C2 Futures/Tensions

C2 Production: AIR and others

C2 Publications v journals

C2 and governments and C2 apart from governments

C2 and Sustainability, C2 as voluntary Organization versus C2 and Spin Off Organizations and Products

43

What are Other Illustrative Reviews?

“Scared Straight” Programs (Done, Award)

Multi-systemic Therapy (Done)

Parental Involvement (Done)

After School programs (Due 12/05)

Peer Assisted Learning

Counter Terrorism Strategies (Under revision)

Reducing Illegal Firearms Possession
