8.0 Assessing the Quality of the Evidence


TRANSCRIPT

Page 1

8.0 Assessing the Quality of the Evidence

Page 2

Assessing study quality or critical appraisal

►Minimize Bias

►Weight for Quality

►Assess relationship between effect size and quality


Page 3

Coding for study quality

Pre-established criteria (internal validity) applied to each study to inform the synthesis (meta-analysis)

Either use only findings from studies judged to be of high quality, or qualify the findings

Look for homogeneity/heterogeneity

Examine differences in findings according to quality (sensitivity analysis)
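The weighting and sensitivity-analysis steps above can be sketched in a few lines of Python. This is our own illustration with invented study data: pool effect sizes using fixed-effect inverse-variance weights, then re-pool using only the studies coded as high quality.

```python
# Sketch of a quality-based sensitivity analysis (invented data, our own
# illustration): pool effect sizes with fixed-effect inverse-variance
# weights, then repeat using only the studies coded as high quality.

def pooled_effect(effects, variances):
    """Fixed-effect inverse-variance weighted mean effect size."""
    weights = [1.0 / v for v in variances]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)

# Hypothetical coded studies: (effect size d, variance of d, high quality?)
studies = [
    (0.60, 0.04, False),
    (0.15, 0.02, True),
    (0.55, 0.05, False),
    (0.20, 0.03, True),
]

all_d = pooled_effect([e for e, v, q in studies],
                      [v for e, v, q in studies])
hq_d = pooled_effect([e for e, v, q in studies if q],
                     [v for e, v, q in studies if q])

print(f"All studies:       d = {all_d:.2f}")
print(f"High quality only: d = {hq_d:.2f}")
# A large gap suggests the overall result is driven by lower-quality studies.
```

A large difference between the two pooled estimates is exactly what a sensitivity analysis is looking for.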


Page 4

“A careful look at randomized experiments will make clear that they are

not the gold standard. But then, nothing is. And the alternatives are usually

worse.”

Berk RA. (2005) Journal of Experimental Criminology 1, 417-433.

Page 5

Code for Study Design Characteristics:

1. Design Type: RCT, quasi-experiment, or other

2. Fidelity to Random Allocation: Is the method of assignment unclear? Look for confusion between non-random and random assignment – the former can lead to bias.

3. Allocation Concealment


Page 6

Which studies are RCTs?

1. “We took two groups of schools – one group had high ICA use and the other low ICA use – we then took a random sample of pupils from each school and tested them.”

2. “We put the students into two groups, we then randomly allocated one group to the intervention whilst the other formed the control.”

3. “We formed the two groups so that they were approximately balanced on gender and pre-test scores.”

4. “We identified 200 children with a low reading age and then randomly selected 50 to whom we gave the intervention. They were then compared to the remaining 150.”

5. “Of the eight [schools] two randomly chosen schools served as a control group.”


Page 7

EXAMPLES


Page 8

Is it randomized?

“The groups were balanced for gender and, as far as possible, for school. Otherwise, allocation was randomized.”

Thomson et al. Br J Educ Psychology 1998;68:475-91.


Page 9

Is it randomized?

“The students were assigned to one of three groups, depending on how revisions were made: exclusively with computer word processing, exclusively with paper and pencil or a combination of the two techniques.”

Greda and Hannafin, J Educ Res 1992; 85:144.

Page 10

Mixed allocation

“Students were randomly assigned to either Teen Outreach participation or the control condition either at the student level (i.e., sites had more students sign up than could be accommodated and participants and controls were selected by picking names out of a hat or choosing every other name on an alphabetized list) or less frequently at the classroom level.”

Allen et al, Child Development 1997;64:729-42.

Page 11

Non-random assignment confused with random allocation

“Before mailing, recipients were randomized by rearranging them in alphabetical order according to the first name of each person. The first 250 received one scratch ticket for a lottery conducted by the Norwegian Society for the Blind, the second 250 received two such scratch tickets, and the third 250 were promised two scratch tickets if they replied within one week.”

Finsen V, Storeheier AH (2006) Scratch lottery tickets are a poor incentive to respond to mailed questionnaires. BMC Medical Research Methodology 6, 19. doi:10.1186/1471-2288-6-19.

Page 12

Misallocation issues

“23 offenders from the treatment group could not attend the CBT course and they were then placed in the control group.”


Page 13

Concealed allocation – Why is it important?

►Inflated Effect Sizes

►Selection bias and exaggeration of group differences


Page 14

Allocation concealment: A meta-analysis

►250 randomized trials in the field of pregnancy and childbirth.

►The trials were divided into 3 concealment groups: Good concealment (difficult to subvert); Unknown (not enough detail in paper); Poor (e.g., randomisation list on a public notice board).

►Results: Inflated effect sizes for poorly concealed compared with well-concealed randomisation.

Page 15

Comparison of adequate, unclear and inadequate concealment

Allocation concealment   Effect size (OR)
Adequate                 1.0
Unclear                  0.67   (P < 0.01)
Inadequate               0.59

Schulz et al. JAMA 1995; 273:408.

Page 16

Examples of good allocation concealment

►“Randomisation by centre was conducted by personnel who were not otherwise involved in the research project.” [1]

►Distant assignment was used to: “protect overrides of group assignment by the staff, who might have a concern that some cases receive home visits regardless of the outcome of the assignment process.”[2]

[1] Cohen et al. (2005) J of Speech Language and Hearing Res. 48, 715-729.

[2] Davis RG, Taylor BG. (1997) Criminology 35, 307-333.

Page 17

Assignment Discrepancy

►“Pairs of students in each classroom were matched on a salient pretest variable, Rapid Letter Naming, and randomly assigned to treatment and comparison groups.”

►“The original sample – those students were tested at the beginning of Grade 1 – included 64 assigned to the SMART program and 63 assigned to the comparison group.”

Baker S, Gersten R, Keating T. (2000) When less may be more: A 2-year longitudinal evaluation of a volunteer tutoring program requiring minimal training. Reading Research Quarterly 35, 494-519.

Page 18

Change in concealed allocation

[Bar chart: percentage of trials using concealed allocation (axis 0-50%), drug trials vs. non-drug trials, before 1997 (<1997) vs. after 1996 (>1996). Drug: P = 0.04; No Drug: P = 0.70.]

NB: No education trial used concealed allocation.

Page 19

Example of unbalanced trial affecting results

►Trowman and colleagues undertook a systematic review to see if calcium supplements were useful for helping weight loss among overweight people.

►The meta-analysis of final weights showed a statistically significant benefit of calcium supplements. HOWEVER, a meta-analysis of baseline weights showed that most of the trials had ‘randomized’ lower weight people into the intervention group. When this was taken into account there was no longer any difference.


Page 20

Meta-analysis of baseline body weight

Trowman R et al. (2006) A systematic review of the effects of calcium supplementation on body weight. British Journal of Nutrition 95, 1033-38.

Page 21

Summary of assignment and concealment

Code for Design Type

Code Fidelity of Allocation

Code for assignment discrepancies


Page 22

Other design issues

► Attrition (drop-out)

► Unblinded ascertainment (outcome measurement)

► Small samples can lead to Type II error (concluding there is no difference when there is a difference)

► Multiple statistical tests can give Type I errors (concluding there is a difference when this is due to chance)

► Poor reporting of uncertainty (e.g., lack of confidence intervals)
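The multiple-testing point above can be made concrete with a line of arithmetic (our own sketch, not from the slides): with k independent tests and every null hypothesis true, the chance of at least one spuriously "significant" result is 1 − (1 − α)^k.

```python
# Family-wise false-positive probability under k independent tests at
# alpha = 0.05, with every null hypothesis true (our own illustration).

alpha = 0.05
for k in (1, 5, 10, 20):
    family_wise = 1 - (1 - alpha) ** k
    print(f"{k:2d} tests -> P(at least one false positive) = {family_wise:.2f}")
```

With 20 tests the chance of at least one false positive is roughly 64%, which is why un-prespecified multiple comparisons are coded as a quality concern.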


Page 23

Blinding of Participants and Investigators

►Participants can be blinded to: research hypotheses; the nature of the control or experimental condition; whether or not they are taking part in a trial.

►Investigators should be blinded (if possible) to follow-up tests as this eliminates ‘ascertainment’ bias.


Page 24

Blinded outcome assessment

[Bar chart: percentage of trials with blinded outcome assessment (axis 0-40%), by field (Hlth Ed vs. Education) and period (<1997 vs. >1996). Hlth Ed: P = 0.03; Education: P = 0.13.]

Torgerson CJ, Torgerson DJ, Birks YF, Porthouse J. (2005) A comparison of randomized controlled trials in health and education. British Educational Research Journal 31, 761-785.

24

Page 25

Statistical power

►Few effective educational interventions produce large effect sizes, especially when the comparator group is an ‘active’ intervention.

►In a tightly controlled setting, 0.5 of a standard deviation difference at post-test is good. Smaller effect sizes are to be expected in field trials (e.g., 0.25). To detect an effect size of 0.5 with 80% power (sig. = 0.05), we need 128 participants in an individually randomized experiment.
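The 128-participant figure can be roughly checked with the standard normal-approximation sample-size formula (our own sketch; the z values are hard-coded for a two-sided 5% test and 80% power):

```python
import math

# Rough check of the 128-participant figure using the normal-approximation
# formula: n_per_group = 2 * (z_alpha + z_beta)**2 / d**2.
# z values hard-coded for a two-sided alpha of 0.05 and 80% power.

z_alpha = 1.96    # two-sided alpha = 0.05
z_beta = 0.8416   # power = 0.80
d = 0.5           # target standardized effect size

n_per_group = math.ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)
total = 2 * n_per_group
print(total)  # 126; a t-distribution correction nudges this up to the ~128 quoted
```

Note the quadratic cost of small effects: halving d to 0.25 quadruples the required sample.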

Page 26

Percentage of trials underpowered (n < 128)

[Bar chart: percentage of trials underpowered (axis 0-90%), by field (Hlth Ed vs. Education) and period (<1997 vs. >1996). Hlth Ed: P = 0.22; Education: P = 0.76.]

Torgerson CJ, Torgerson DJ, Birks YF, Porthouse J. (2005) A comparison of randomized controlled trials in health and education. British Educational Research Journal 31, 761-785.


Page 27

Code for analysis issues

►Code for whether, once randomized, all participants are included within their allocated groups for analysis (i.e., was intention to treat analysis used).

►Code for whether a single analysis is pre-specified before data analysis.


Page 28

Attrition

►Attrition can lead to bias; a high quality trial will have maximal follow-up after allocation.

►A good trial reports low attrition with no between group differences.

►Rule of thumb: 0-5% attrition is not likely to be a problem; 6-20% is worrying; more than 20% suggests selection bias.
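The rule of thumb can be encoded as a small coding aid (our own helper; the category labels are illustrative, not a published scale):

```python
# The attrition rule of thumb as a small coding aid (our own helper;
# the thresholds follow the slide, the labels are illustrative).

def attrition_flag(n_randomized, n_analyzed):
    """Return a rough risk label for the percentage lost to follow-up."""
    loss_pct = 100.0 * (n_randomized - n_analyzed) / n_randomized
    if loss_pct <= 5:
        return "not likely to be a problem"
    if loss_pct <= 20:
        return "worrying"
    return "possible selection bias"

print(attrition_flag(100, 97))  # not likely to be a problem
print(attrition_flag(100, 85))  # worrying
print(attrition_flag(100, 70))  # possible selection bias
```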

Page 29

Poorly reported attrition

► In a RCT of Foster-Carers extra training was given. “Some carers withdrew from the study once the dates and/or location were confirmed; others withdrew once they realized that they had been allocated to the control group” “117 participants comprised the final sample”

► No split between groups is given except in one table, which shows 67 in the intervention group and 50 in the control group – the control group is about 25% smaller. Unequal attrition is a hallmark of potential selection bias, but we cannot be sure.

Macdonald & Turner, Brit J Social Work (2005) 35, 1265.

Page 30

What is the problem here?

Random allocation:

160 children in 20 schools (8 per school); 80 in each group

76 children allocated to control; 76 allocated to the intervention group

1 school (8 children) withdrew

N = 17 children replaced following discussion with teachers


Page 31

Intention to Treat (ITT) vs. Treatment Only (TO)

►ITT: the analysis of the outcome measure for all participants initially assigned to a condition, regardless of whether or not they completed or received that intervention.

►TO: the analysis of only those participants who were initially assigned to a condition AND who completed the intervention.
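A toy example (our own, with invented scores) shows why the two analyses can disagree when dropouts differ systematically from completers:

```python
from statistics import mean

# Toy comparison of intention-to-treat (ITT) vs. treatment-only (TO)
# analyses with invented scores. Two intervention-arm dropouts score low;
# ITT keeps them in their allocated group, TO silently excludes them.

intervention = [  # (post-test score, completed the intervention?)
    (12, True), (11, True), (13, True), (4, False), (5, False),
]
control = [8, 9, 7, 8, 9]

itt_effect = mean(s for s, done in intervention) - mean(control)
to_effect = mean(s for s, done in intervention if done) - mean(control)

print(f"ITT effect: {itt_effect:.1f}")  # modest difference
print(f"TO effect:  {to_effect:.1f}")   # inflated by excluding dropouts
```

The TO estimate looks far more impressive only because the worst-off participants were dropped from the intervention arm.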


Page 32

Survey of trial quality

Torgerson CJ, Torgerson DJ, Birks YF, Porthouse J. (2005) A comparison of randomized controlled trials in health and education. British Educational Research Journal 31, 761-785. (based on n = 168 trials)


Characteristic             Drug   Health   Education
Cluster randomised          1%     36%       18%
Sample size justified      59%     28%        0%
Concealed randomisation    40%      8%        0%
Blinded follow-up          53%     30%       14%
Use of CIs                 68%     41%        1%
Low statistical power      45%     41%       85%

Page 33

CONSORT

►Because the majority of health care trials were badly reported, a group of health care trial methodologists developed the CONSORT statement, which indicates key methodological items that must be reported in a trial report.

►This has now been adopted by all major medical journals and some psychology journals.


Page 34

The CONSORT guidelines, adapted for trials in educational research

► Was the target sample size adequately determined?

► Was intention-to-treat analysis used? (i.e., were all children who were randomized included in the follow-up and analysis?)

► Were the participants allocated using random number tables, coin flip, or computer generation?

► Was the randomisation process concealed from the investigators? (i.e., were the researchers who were recruiting children to the trial blind to the child’s allocation until after that child had been included in the trial?)

► Were follow-up measures administered blind? (i.e., were the researchers who administered the outcome measures blind to treatment allocation?)

► Was the precision of the effect size estimated (confidence intervals)?

► Were summary data presented in sufficient detail to permit alternative analyses or replication?

► Was the discussion of the study findings consistent with the data?
