
Original Article

Received 26 July 2011, Revised 22 November 2011, Accepted 12 December 2011 Published online 2 February 2012 in Wiley Online Library

(wileyonlinelibrary.com) DOI: 10.1002/jrsm.52

Structural Approach to Bias in Meta-analyses

Ian Shrier*

Methods to calculate bias-adjusted estimates for meta-analyses are becoming more popular. The objective of this paper is to use the structural approach to bias and causal diagrams to show that (i) the current use of the bias-adjusted estimating tools may sometimes introduce bias rather than reduce it and (ii) the Cochrane collaboration risk of bias tool, which was designed for randomized studies, is also applicable to non-randomized studies with only minimal changes. Causal diagrams are used to illustrate each of the items in the current risk of bias tool and how they apply to both randomized and non-randomized studies. With the exception of confounding by indication, the structure of all potential biases present in non-randomized studies may also be present in randomized studies. In addition, causal diagrams demonstrate important limitations to the methods currently being developed to provide bias-adjusted estimates of individual studies in meta-analyses. Finally, causal diagrams can be helpful in deciding when it is appropriate to combine studies in a meta-analysis of non-randomized studies even though the studies may use different adjustment sets. Copyright © 2012 John Wiley & Sons, Ltd.

Keywords: bias; meta-analyses; diagram

Introduction

The fundamental objective of evidence-based medicine (EBM) is to enable clinicians and policy makers to make causal inferences about exposures in order to develop or prescribe effective health interventions. In general, randomized trials are stronger than non-randomized studies in establishing causation [1,2]. In randomized studies, the treatment and control groups are expected to have equal distributions of all prognostic factors except for the exposure of interest. Therefore, any difference in outcome is likely due to the treatment. In non-randomized studies, there may be systematic differences between the treatment and control groups, and these differences may be just as likely as the treatment to explain the difference in outcome. That said, both randomized and non-randomized studies can have flaws in methodology and analysis that will result in the calculated treatment effect being systematically different from the true treatment effect [1–5]; that is, the study may provide a biased estimate of the true treatment effect.

Because EBM requires the synthesis of evidence from different studies, researchers conducting meta-analyses must evaluate biases within each individual study, as well as biases created during the process of evidence synthesis. Several groups have been working on advanced methods that quantify the direction and magnitude of biases in order to provide “bias-adjusted estimates” for meta-analyses [3,4,6–8]. The ultimate objective of these methods is to improve the estimated causal effect for meta-analyses limited to randomized studies and meta-analyses limited to non-randomized studies. In addition, because the source of the bias is not important once corrected for, bias-adjusted estimates would also allow for meta-analyses that combine randomized and non-randomized studies together.

There are two general approaches to calculate bias-adjusted estimates. First, reviewers may provide estimates for the magnitude and direction (along with uncertainty) for each type of possible bias in each particular study on a study-by-study basis. The effects of the different biases are then combined mathematically for each reviewer, and then the overall estimate is obtained by combining the results across reviewers [3,4]. Currently, these estimates are obtained by expert opinion. An alternative proposed method would use results from meta-epidemiological studies to estimate the usual bias associated with each type of potential bias (e.g., allocation concealment) and apply this usual bias to each of the studies in the meta-analysis [9].

Centre for Clinical Epidemiology and Community Studies, Jewish General Hospital, Lady Davis Institute for Medical Research, McGill University, Montreal, Canada
*Correspondence to: Ian Shrier, MD, PhD, Centre for Clinical Epidemiology and Community Studies, Jewish General Hospital, 3755 Cote Ste-Catherine Rd, Montreal, QC H3T 1E2, Canada. E-mail: [email protected]

Copyright © 2012 John Wiley & Sons, Ltd. Res. Syn. Meth. 2011, 2 223–237

In order for bias-adjusted estimate methods to provide valid estimates, one must be able to list all the potential biases and understand how they interact. For randomized studies, the list is fairly limited. Several bias lists for randomized studies are available and contain similar items; this article focuses on the Cochrane collaboration’s qualitative “risk of bias” tool (Table 1) [5,10]. This is because the Cochrane collaboration produces a large number of meta-analyses and now requires the risk of bias tool in all future meta-analyses. Although there is general agreement on the lists of bias in randomized studies, the issue appears more complex in non-randomized studies. For example, Chavalarias and Ioannidis [11] found 235 bias terms in the literature. Although some suggestions for a framework have been made [3,12], these lists do not cover the full range of traditional epidemiological biases and do not account for interaction between biases.

For bias-adjusted estimates to be practical in a meta-analysis, one needs to develop an approach that addresses both the multitude of biases and the interaction of biases. One promising alternative for assessing bias is to use causal diagrams and the structural approach to bias [13–15]. For those who are unfamiliar with causal diagrams, the diagrams encode causal relationships between variables. A unidirectional arrow from X to Y means that variable X causes variable Y; the absence of an arrow means X does not cause Y. If X and Y are both caused by Z, then Z is said to be a common cause (or “ancestor”) of X and Y. If Z and L both cause X, then X is said to be a common effect (or collider) of Z and L (or “descendant” of both Z and L). Causal diagrams need to include all variables that are common causes of any two variables in the diagram but do not need to include variables that are a cause of the exposure alone or a cause of the outcome alone [13]. When using causal diagrams to choose among different regression models, each variable included in a regression model would also need to be included in the diagram.

Table 1. The Cochrane risk of bias tool [10].

Sequence generation
- Description: Describe the method used to generate the allocation sequence in sufficient detail to allow an assessment of whether it should produce comparable groups.
- Review authors’ judgement: Was the allocation sequence adequately generated?

Allocation concealment
- Description: Describe the method used to conceal the allocation sequence in sufficient detail to determine whether intervention allocations could have been foreseen in advance of, or during, enrolment.
- Review authors’ judgement: Was allocation adequately concealed?

Blinding of participants, personnel, and outcome assessors (assessments should be made for each main outcome, or class of outcomes)
- Description: Describe all measures used, if any, to blind study participants and personnel from knowledge of which intervention a participant received. Provide any information relating to whether the intended blinding was effective.
- Review authors’ judgement: Was knowledge of the allocated intervention adequately prevented during the study?

Incomplete outcome data (assessments should be made for each main outcome, or class of outcomes)
- Description: Describe the completeness of outcome data for each main outcome, including attrition and exclusions from the analysis. State whether attrition and exclusions were reported, the numbers in each intervention group (compared with total randomized participants), reasons for attrition/exclusions where reported, and any re-inclusions in analyses performed by the review authors.
- Review authors’ judgement: Were incomplete outcome data adequately addressed?

Selective outcome reporting
- Description: State how the possibility of selective outcome reporting was examined by the review authors and what was found.
- Review authors’ judgement: Are reports of the study free of suggestion of selective outcome reporting?

Other sources of bias
- Description: State any important concerns about bias not addressed in the other domains in the tool. If particular questions/entries were pre-specified in the review’s protocol, responses should be provided for each question/entry.
- Review authors’ judgement: Was the study apparently free of other problems that could put it at a high risk of bias?
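The vocabulary above (ancestor, descendant, common cause, collider) is purely graph-theoretic, so it can be checked mechanically. The following sketch is not from the paper and uses an invented toy diagram: it encodes a small causal diagram as a dictionary mapping each parent to its children and tests these relationships.

```python
# Illustrative sketch (not from the paper): a causal diagram as a dict of
# parent -> children edges, with helpers for the vocabulary defined above.

def descendants(dag, node):
    """All variables causally downstream of `node` (its descendants)."""
    seen = set()
    stack = [node]
    while stack:
        for child in dag.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def is_common_cause(dag, z, x, y):
    """Z is a common cause (ancestor) of both X and Y."""
    d = descendants(dag, z)
    return x in d and y in d

def is_collider(dag, x, z, l):
    """X is a common effect (collider) of its direct causes Z and L."""
    return x in dag.get(z, []) and x in dag.get(l, [])

# Toy diagram: Z -> X, Z -> Y, X -> Y, L -> X. Z confounds the X-Y
# relationship; X is a collider of Z and L.
dag = {"Z": ["X", "Y"], "L": ["X"], "X": ["Y"]}

print(is_common_cause(dag, "Z", "X", "Y"))  # True: Z is an ancestor of both
print(is_collider(dag, "X", "Z", "L"))      # True: X is a common effect
```

Software for selecting covariate sets, such as the programs cited below, builds on traversals of exactly this kind, although the real algorithms (d-separation, backdoor criterion) are more involved.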

Using causal diagrams, each traditional epidemiological “bias” can be grouped into one of four categories [13,16–18]. These biases are briefly explained below and fully explained in Appendix 1.

1. Confounding bias. This may occur when there is a common cause of exposure and outcome and one has not blocked the bias through conditioning on a sufficient set of covariates [13].

2. Collider-stratification bias (also known as selection bias). This may occur when one conditions on a common effect of two different causes. This is especially important because it means that including a covariate in a model can create bias rather than minimize it even if the covariate (i) is associated with exposure, (ii) is associated with outcome, (iii) changes the effect estimate when included in the model, and (iv) is not affected by exposure. Thus, standard epidemiological rules for deciding which variables to include in a model to reduce bias are not sufficient, and one must understand the causal relationships to know if including a covariate is likely to reduce or increase bias [13].

3. Measurement error bias. This may occur with measurement error [17,18], where measurement error means that the measured value of a variable is not reflective of its true value (e.g., misclassified exposure or outcome) and may be due to either random or systematic error.

4. Over-adjustment bias. This may occur when one conditions on a variable that lies within the causal pathway or conditions on a marker for a variable within the causal pathway [16,19].
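To make category 2 concrete, here is a hypothetical simulation (all numbers invented, not from the paper) of collider-stratification bias: Z and L are independent causes of a common effect X. Marginally, Z and L are uncorrelated; restricting the analysis to X > 1, that is, conditioning on the collider, manufactures a spurious negative Z–L association.

```python
# Hypothetical simulation of collider-stratification bias (illustrative
# numbers only): conditioning on a common effect X of two independent
# causes Z and L induces an association between them.
import random
import statistics

random.seed(0)
n = 50_000
Z = [random.gauss(0, 1) for _ in range(n)]
L = [random.gauss(0, 1) for _ in range(n)]
X = [z + l + random.gauss(0, 1) for z, l in zip(Z, L)]  # X is a collider

def corr(a, b):
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)
    return cov / (statistics.pstdev(a) * statistics.pstdev(b))

marginal = corr(Z, L)
selected = [(z, l) for z, l, x in zip(Z, L, X) if x > 1]  # condition on X
conditional = corr([z for z, _ in selected], [l for _, l in selected])

print(f"corr(Z, L) overall:     {marginal:+.3f}")    # near zero
print(f"corr(Z, L) given X > 1: {conditional:+.3f}")  # clearly negative
```

The same mechanism operates whether the conditioning happens through a regression covariate, a restriction criterion, or loss to follow-up.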

Causal diagrams have been used to explain biases within original research studies related to randomized controlled trials (e.g., loss to follow-up), cohort studies (e.g., possible common causes of exposure and outcome), and case-control studies (e.g., Berkson’s bias) [13]. Once a causal diagram is drawn, it is possible to follow some simple rules to determine which covariates should be included in the analysis in order to obtain an unbiased estimate [20,21], assuming the causal diagram drawn is correct. In addition, several software programs have been developed to help understand which covariate sets minimize bias and which covariate sets introduce bias [22,23].

One important limitation of using causal diagrams is that one never knows if one has drawn the correct causal diagram. That said, this actually represents a strength of the causal diagram approach because it requires authors to make their underlying assumptions transparent, which is an essential component of any systematic review or meta-analysis. This is explained in greater depth later. In brief, we have previously suggested that where several causal diagrams are plausible, authors should display each of the causal diagrams and associated analyses [20]. Whether or not one chooses to draw the causal diagrams, bias will be minimized (or not) depending on the true causal diagram. If an investigator chooses not to draw out the plausible causal diagrams and still applies an analytical strategy (including propensity scores), it simply means that the investigator is deciding that one particular causal diagram is the most appropriate (i.e., the one in which his/her analytical strategy will minimize bias) without making the underlying assumptions and reasoning available to the reader.

Readers should understand that causal diagrams do not affect the validity of any randomized or non-randomized study. Unknown common causes of exposure and outcome can never be ruled out completely in a non-randomized study, whereas the randomization algorithm is the only possible cause of exposure in a well-conducted randomized trial. In addition, when there is unmeasured confounding in observational studies, including instrumental variables (or indeed any variable that is more strongly associated with exposure than with the outcome, i.e., even some true confounders) in an analytical model may actually amplify the bias in the results compared with not including these variables [24]. The objective of causal diagrams is to help investigators choose analytical strategies that will minimize bias in both randomized and non-randomized studies and make the potential biases in each study design apparent. Readers interested in learning more about causal diagrams are referred elsewhere for more extensive background [13–19,21,25].
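The bias-amplification point about instrumental variables can also be illustrated numerically. In the hypothetical model below (all coefficients invented for illustration), U is an unmeasured confounder of X and Y, and Z causes X but has no direct effect on Y; "adjusting" for Z by stratifying on it moves the estimated effect of X further from the truth than leaving Z out.

```python
# Hypothetical simulation of bias amplification (illustrative numbers
# only): adjusting for an instrument Z enlarges the residual confounding
# bias from an unmeasured U rather than reducing it.
import random
import statistics

random.seed(1)
n = 100_000
U = [random.gauss(0, 1) for _ in range(n)]     # unmeasured confounder
Z = [random.choice([0, 1]) for _ in range(n)]  # instrument: causes X only
X = [2 * z + u + random.gauss(0, 1) for z, u in zip(Z, U)]
Y = [1.0 * x + u + random.gauss(0, 1) for x, u in zip(X, U)]  # true effect 1.0

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

crude = slope(X, Y)  # biased upward by U (theoretically 4/3 here)
# "Adjust" for Z by estimating the slope within each Z stratum and pooling.
strata = []
for zval in (0, 1):
    xs = [x for x, z in zip(X, Z) if z == zval]
    ys = [y for y, z in zip(Y, Z) if z == zval]
    strata.append(slope(xs, ys))
adjusted = statistics.fmean(strata)  # theoretically 1.5: further from truth

print(f"true effect: 1.0  crude: {crude:.3f}  Z-adjusted: {adjusted:.3f}")
```

Intuitively, conditioning on Z removes the variation in X that was "clean" (driven by the instrument), so a larger share of the remaining variation in X is due to the confounder U.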

Given that bias-adjusted methods for combining studies in meta-analyses are becoming more formalized and causal diagrams help make underlying assumptions of analytical models transparent, the overall objective of this paper is to show how causal diagrams can be used to (i) demonstrate that with minor modifications, the current Cochrane risk of bias tool provides an applicable framework for all study designs and (ii) highlight contexts where current “bias-adjusted” estimate methods being proposed for meta-analyses may inadvertently introduce bias rather than correct for it.

Each of the causal diagrams in this paper includes the following features: If the bias exists in both randomized and non-randomized studies, the term “allocation process” is used; if the bias is different in randomized and non-randomized studies, the factors responsible for the allocation to exposure group are stated. Allocation factors cause subjects to be assigned to exposure groups, which cause exposure to participants (called “true exposure” because not all participants will comply). In the causal diagrams shown, exposure is a cause of the outcome. When a variable is noted with an “*”, it refers to the observed measure of the variable, which includes the associated measurement error [26].


Cochrane risk of bias tool

The premise of this paper is that causal diagrams suggest that the current Cochrane risk of bias tool (see Table 1 for a full description of the tool), developed for randomized trials, also provides an appropriate foundation for assessment of biases in non-randomized studies. This framework lists five general sources of bias, with an additional “other” category [10]: sequence generation, allocation concealment, blinding (participants, personnel, and outcome assessors), incomplete outcome data, and selective outcome reporting. Each of the following sections lists a Cochrane risk of bias tool item, describes a causal diagram for the item, illustrates differences between randomized and non-randomized studies where appropriate, and explains the usefulness for authors/readers/teachers of meta-analyses.

Sequence generation

Sequence generation refers to the process leading to group assignment. In a truly randomized trial, the sequence generation is randomly generated (caused), usually by a computer algorithm, and is causally unrelated to the outcome. The sequence generation creates a group assignment, which causes investigators to inform subjects which group they are assigned to, which affects the probability that subjects will actually be exposed or not, and which causes a change in the probability of the outcome occurring (assuming exposure has an effect).

The key to sequence generation is that the group assignment should be causally unrelated to the outcome. However, some studies labeled as randomized may not use truly random assignments. When this occurs, a bias may be introduced if the assignment is not random with respect to the outcome. Therefore, all risk of bias tools (including the Cochrane risk of bias tool) ask authors of systematic reviews to assess the likelihood of this occurring. In Figure 1, the sequence generation of one of the included “randomized studies” was based (caused) on the month of birth; that is, it was not random and should not have been called a randomized study. In this example, all participants born in the first half of the year are assigned to one group, and those born in the second half of the year are assigned to another group (one group will be older than the other). If age is a prognostic factor (Figure 1), then confounding bias exists because age is a common cause of both exposure and outcome [13]. Inappropriate sequence generation can sometimes be accounted for in an analysis, but this is not usually done. First, if authors were aware of the problem, they would have avoided it. Second, in the example given above, residual confounding would always exist because each group has completely different months of birth with no overlap. That said, the authors of a meta-analysis might not believe age affects the outcome. If true, they would draw a similar causal diagram but delete the arrow between age and outcome. If this causal diagram was correct and one applied the methods previously mentioned that determine if bias exists [20,21], one would conclude that the allocation method used did not introduce any bias even though it was not “randomized”. Therefore, in a meta-analysis using bias-adjusted methodology, no adjustment would be necessary for this item in this study. Drawing the causal diagram clearly delineates the assumptions and the underlying reasoning why one should or should not adjust for the sequence generation.
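The month-of-birth example can be sketched as a small simulation (all numbers invented for illustration): participants born January–June are "treated", earlier birth months mean slightly older participants, and the naive group difference is biased exactly when age affects the outcome. Deleting the age-to-outcome arrow (setting its coefficient to zero) removes the bias, mirroring the two causal diagrams discussed above.

```python
# Hypothetical illustration of confounding from allocation by birth month
# (numbers are invented, not from the paper). The treated group is
# systematically older; if age also causes the outcome, the naive group
# difference is confounded.
import random
import statistics

random.seed(2)
TRUE_EFFECT = 1.0

def naive_estimate(beta_age, n=100_000):
    """Difference in mean outcome between groups allocated by birth month."""
    outcomes = {True: [], False: []}
    for _ in range(n):
        month = random.randint(1, 12)
        treated = month <= 6                # allocation by month of birth
        # Earlier birth month -> slightly older at a fixed study date.
        age = 40 + random.gauss(0, 5) + (12 - month) / 12
        outcome = TRUE_EFFECT * treated + beta_age * age + random.gauss(0, 1)
        outcomes[treated].append(outcome)
    return statistics.fmean(outcomes[True]) - statistics.fmean(outcomes[False])

print(f"age causes outcome:         {naive_estimate(beta_age=2.0):.2f}")  # biased
print(f"age does not cause outcome: {naive_estimate(beta_age=0.0):.2f}")  # ~1.0
```

With `beta_age = 2.0` the treated group is 0.5 years older on average, so the estimate is inflated by about 1.0 above the true effect; with `beta_age = 0.0` the same non-random allocation introduces no bias, which is the point made in the text.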

Allocation concealment

Lack of allocation concealment was originally associated with ~30% overestimation of treatment effect and has received a great deal of attention as a potential bias [8,9,27–31]. In reality, this is really a problem related to “blinding”. Blinding has many components; one can blind 1) investigators, 2) patients, 3) those responsible for assigning participants to exposure groups, 4) those responsible for allocating treatment, 5) those responsible for carrying out all study processes, and/or 6) those responsible for assessing outcome. Allocation concealment refers only to blinding of the person allocating the treatment. The Cochrane risk of bias tool and other bias adjustment methods [4,9,10] list this potential bias separately from other types of “blinding”. Therefore, blinding of those unrelated to allocation concealment is discussed separately in the next section (associated with Figure 3). The causal diagrams in Figure 2 suggest that the underlying mechanism of bias related to allocation concealment could be either confounding bias (Figure 2a) or collider-stratification bias (Figure 2b).

Figure 1. A causal diagram for the “sequence generation” item in the Cochrane risk of bias tool [10]. In an appropriately randomized study, the sequence generation does not cause the outcome (i.e., the arrow from “unmeasured factor” to “outcome” would not exist). In some studies labeled “randomized”, the sequence generation may in fact be due to a factor that is also a cause of the outcome; that is, it is not a true randomized study. For example, if one allocates treatment by date of birth (e.g., participants born January–June are in one group and participants born July–December are in another group) and age is a cause of the outcome, confounding bias exists because age is a common cause of both exposure and outcome.


Figure 2. A causal diagram for the “allocation concealment” item in the Cochrane risk of bias tool [10], which is due to un-blinding of the person allocating the treatment (un-blinding of investigator, patient, outcome assessor, or other study personnel is explained in Figure 3). In A, the allocator has knowledge about the health of the participant (which is affected by causal factors of the outcome), and this causes the allocator to “cheat” and allocate certain participants to exposure groups that they were not actually assigned to. In this diagram, participant prognosis is the common cause of exposure and outcome. In B, poor research training of the investigators leads to a protocol in which the allocator was not blinded. In addition, poor research training also leads to other quality issues in the study (e.g., follow-up procedures are different for the two groups). Because follow-up procedures are caused by both exposure and poor research training (i.e., follow-up procedures are a common effect), a conditional association is created between exposure and poor research training (and therefore also between allocation concealment and outcome). Note that there is no arrow from allocation concealment to exposure because this can only occur through fraud (illustrated in A). There is no arrow from allocation concealment to outcome because this can only occur through un-blinding of an outcome assessor or the patient, and that is illustrated in Figure 3.


For pedagogical purposes, we restrict the discussion to randomized studies because almost all non-randomized studies do not have allocation concealment. In Figure 2a, the person allocating the treatment knows the health status of the participants (allocation not concealed) and therefore may decide to give a different treatment than what was randomly assigned. This is a form of fraud. Although fraud could occur at any level of a study, it is illustrated here because allocation concealment has received so much attention, and it is in fact the only possible cause of the bias in this context. To minimize this bias, some have argued that elaborate tactics (e.g., central randomization) are necessary to prevent the allocator from determining what the assignment is; if simple mechanisms are used (e.g., allocation using envelopes), the risk of bias is high [31–33]. However, these elaborate “solutions” have not been tested, and in theory, it is not clear that they would be successful. For example, an allocator who wants a subject assigned to a particular group could simply assign the subject even if central randomization said not to, and then switch the assignment for a subsequent subject to create the correct balance between groups. Furthermore, if the fraud begins with the investigators (to show a treatment is beneficial/harmful despite the science), the investigators would likely lie about allocation concealment in their report; the risk of bias would remain high despite the reporting that elaborate mechanisms were used.

Another possibility suggested by Wood et al. [9] is that lack of allocation concealment may simply represent a marker for general poor study quality. If true, the current suggestions on how to account for this bias in meta-analyses may increase rather than minimize bias. Figure 2b is a causal diagram showing why this would occur. According to this theory, poor research training would increase the probability (cause) that the study investigators used different follow-up procedures (i.e., number of follow-up visits, methods to detect outcome, stringency in quality control) for each of the exposure groups. In the causal diagram, this is expressed by including an arrow from poor research training to follow-up procedures and from exposure to follow-up procedures. Therefore, the follow-up procedure represents a collider between exposure group and poor research training. Furthermore, in this theory, allocation concealment cannot cause exposure by any other method except fraud. Because this was illustrated in Figure 2a, it is not repeated again here, and there is no arrow from allocation concealment to exposure. Finally, if the person allocating treatment also happens to be the person responsible for assessing outcome, there might be measurement error bias, but this is due to un-blinding of the outcome assessor (discussed in Figure 3) and not due to lack of allocation concealment.

What are the implications of the causal structure in Figure 2b? First, there is a conditional association between exposure and poor research training; that is, a non-causal association (collider-stratification bias) between allocation concealment and outcome is created if any of the follow-up procedures (i.e., the collider) are included in the model. Second, if all other study methods are appropriate (i.e., if the arrow from poor training to follow-up procedures is removed), allocation concealment is not a cause of bias; any method that provided an “allocation concealment bias-adjusted” estimate for such a study [4,6–9] would introduce bias rather than minimize it. Third, if poor follow-up procedures were present and bias-adjusted estimates were calculated for these poor follow-up procedures, any further adjustment for allocation concealment would again introduce bias. Finally, if training on general study methods were improved, lack of allocation concealment would likely become unimportant, because it is just a marker (as opposed to a cause) for bias. Indeed, improved training may be the reason for the observed reduction in the effect of allocation concealment bias (published by the same group of authors) from 30% in 2001 [34] to 17% in 2008 [9], and the finding that a simple method of sealed, sequentially numbered opaque envelopes yielded equivalent effect estimates compared with central randomization [31]. That said, providing bias-adjusted estimates based on allocation concealment alone (i.e., no other bias adjustments are made) could theoretically be an efficient method but only under the assumption that allocation concealment is a good marker for the combined total of all biases related to methodological design. In conclusion, causal diagrams illustrate that providing bias-adjusted estimates for allocation concealment in a meta-analysis requires careful thought, and that such bias-adjusted estimates should probably not include any other adjustments for methodological quality.

Figure 3. A causal diagram for the “blinding” item in the Cochrane risk of bias tool [10]. In A, the bias associated with the placebo effect is shown. This bias can occur in both randomized and non-randomized studies, and therefore, the generic term “allocation process” is used to assign exposure groups. The bias only occurs if the unblinded participant had prior beliefs about the effects of a particular exposure, and these beliefs actually cause a change in the outcome (e.g., through psychobiological or behavioral mechanisms). Note that there is no bias if one is interested in the total causal effect of the exposure (which includes psychobiological effects). In B, the biases associated with lack of blinding of the investigator/clinician/participant when one is not interested in including psychobiological effects (i.e., non-placebo effect) are shown. The effects are different for randomized (factors causing exposure other than randomization are not present) and non-randomized studies (“randomization” factor not present). In non-randomized studies, the probability of exposure changes if the clinician (and/or participant) believes that the effect of the exposure will affect the outcome. If the reason for these beliefs is the presence/absence of factors that cause the outcome, there is confounding bias (confounding by indication). If the reasons for these beliefs are unrelated to the outcome (i.e., the arrows from causal outcome factors to clinician/participant beliefs would be absent), there is no confounding bias. A separate bias occurs if the participant is assigned to one exposure group and then later changes because they become aware of their prognosis. In C, assessor non-blinding from any one of several causes can cause differences between the classified outcome (denoted “outcome*”) and the true outcome (denoted “outcome”), which is known as measurement error. In addition, the way questions are asked/information is obtained can lead to differences between classified exposure and true exposure (recall bias); the classified exposure (“exposure*”) is also caused by assignment to exposure (“exposure”). These biases occur regardless of whether the allocation process is randomized or not.

Copyright © 2012 John Wiley & Sons, Ltd. Res. Syn. Meth. 2011, 2 223–237


Blinding

Bias due to un-blinding the person responsible for assigning exposure (participants or clinician/investigators) is illustrated in Figures 3a and b, and bias due to un-blinding the person responsible for determining outcome (assessor, participant) is illustrated in Figure 3c. An unblinded analyst could also create bias through reporting bias (discussed later) or through deliberate inappropriate analyses (again equivalent to fraud and not shown as a causal diagram). Finally, un-blinding of any other study personnel (e.g., those applying an intervention, entering data) could also create bias but is not part of risk of bias tools and is not discussed here. In brief, bias would be created if these un-blinded individuals affected the application of the intervention, changed the expectations of the participant (equivalent to un-blinding the participant), or affected the measured outcome (equivalent to un-blinding the outcome assessor or analyst).

Figure 3a illustrates potential biases due to placebo effects (defined as psychobiological effects here) when a participant is unblinded in a study with truly objective outcomes; that is, response bias (where a subject responds in a particular way to please the researchers) is not possible because the participant is not the assessor. The causal diagram illustrates that the intention-to-treat analysis is biased in unblinded studies if one is interested in the effects of assigned exposure acting only through the biological effect (i.e., blinding eliminates the arrow between exposure group and participant beliefs of assigned exposure because participants do not know which exposure group they are in). However, there is no bias in unblinded studies when using an intention-to-treat analysis if (i) there are no psychobiological effects on the outcome (i.e., no arrow emanating from participant beliefs), (ii) participants can be convinced that the two treatments are likely to have the same effect (or equivalently, that participants do not have expectations that one treatment is superior; again, no arrow emanating from participant beliefs), or (iii) one is interested in the total causal effect of assigning exposure (where the total causal effect includes behavior change and psychobiological effects, such as when one is interested in whether the color and shape of a pill affects the outcome [35]). In addition, using the causal diagrams and applying the previously mentioned rules that help determine which covariates should be included in the analysis in order to obtain an unbiased estimate [20,21], the bias from lack of blinding can be minimized if the patients’ prior beliefs about the exposure are measured before the exposure is given, and then conditioned on.
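This last point can be illustrated with a minimal simulation. All effect sizes here are invented, and the model assumes the placebo response occurs only in participants with prior expectations: the intention-to-treat contrast in an unblinded trial mixes the biological and placebo effects, whereas the stratum measured (before exposure) to have no expectations recovers the biological effect alone.

```python
import numpy as np

# Hypothetical unblinded randomized trial: biological effect = 2.0,
# plus a placebo response of 3.0 only in treated participants who
# expected the treatment to work.
rng = np.random.default_rng(0)
n = 200_000
x = rng.integers(0, 2, n)                 # randomized assignment
prior_belief = rng.integers(0, 2, n)      # measured BEFORE exposure is given
biological, placebo = 2.0, 3.0
y = biological * x + placebo * x * prior_belief + rng.normal(0, 1, n)

# Intention-to-treat contrast mixes biological and placebo effects (≈ 3.5)
itt = y[x == 1].mean() - y[x == 0].mean()

# Conditioning on prior beliefs: the no-expectation stratum isolates
# the biological effect (≈ 2.0)
no_expect = (y[(x == 1) & (prior_belief == 0)].mean()
             - y[(x == 0) & (prior_belief == 0)].mean())
```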

Figure 3b illustrates the non-placebo biases associated with un-blinding the investigator/clinician/participant. In general, one can group these biases into two categories: confounding by indication and misclassification (measurement error). First, confounding by indication occurs when the participant is assigned an exposure specifically because of the presence/absence of causal factors for the outcome. The presence/absence of these causal outcome factors affects the clinician’s “beliefs about exposure effects” (or the beliefs of the participant who accepts/refuses a treatment). This information then leads to prescribing (or accepting) a particular treatment. The causal outcome factors therefore represent a common cause of both exposure and outcome (confounding bias). Note that other more complicated forms of confounding by indication exist (Figures 6c and d in Ref. [13]), where conditioning on a common effect induces confounding bias because it creates a conditional common cause of exposure and outcome [36].

A slightly different bias occurs when the participant becomes unblinded after the exposure group has already been assigned. At this point, there would be measurement error bias if the participant decided to take the un-assigned exposure (traditionally known as contamination) or just stop the assigned exposure; biases due to dropping out of the study are discussed in Figure 4. All of these contexts can occur in both randomized and non-randomized studies.

Figure 3c addresses the issue of an unblinded assessor; similar effects are expected in both randomized and non-randomized studies. In the causal diagram, this is just a form of measurement error because the unblinded assessor is reporting a value for the outcome that is different from the true value. If the assessor is part of the research team, this may be called detection bias (because one group is being more accurately classified). If the assessor is the participant him/herself and it is a subjective outcome, it may be called response bias [37] (because the participant answers positively just to please the research team even if they do not feel better). Similarly, in a case-control study, the participant may recall exposures differently if they are un-blinded (often called recall bias but could also be called response bias if the motivation was to please the research team). Regardless of the name or who is responsible for the misclassification, causal diagrams illustrate that all of these are cases of measurement error bias.

Summarizing the causal diagrams in Figures 3a–c, the structure of the bias due to unblinded assessors is identical in randomized and non-randomized studies (frequency and magnitude may differ), except for some case-control analyses. Bias due to unblinded participants is identical for the placebo effect and contamination. The only structural difference between randomized and non-randomized studies (i.e., the probability of the different biases may also be different) is confounding by indication, which can occur for known or unknown reasons.

Figure 4. A causal diagram for the “incomplete outcome” item in the Cochrane risk of bias tool. This causal diagram demonstrates an example where one is conditioning on a descendent of a variable lying along the causal pathway (over-adjustment bias [16]). These biases occur regardless of whether the allocation process is randomized or not. Other causal diagrams related to incomplete outcome data can be found in Ref. [13], some of which apply to both randomized and non-randomized studies, and others that apply only to non-randomized studies.

Can confounding by indication be minimized at the analysis stage? Using causal diagrams [15], confounding by indication is minimized (i.e., the result is equivalent to the intention-to-treat analysis of a randomized study) by conditioning on a correct set of accurately measured covariates that blocks the confounding bias; one set of covariates that minimizes bias when included in the model is simply the indications (i.e., causes) for treatment. This should be achievable for studies on diseases where the indications are well described and understood [38]. For conditions where indications are ambiguous, prospective studies can obtain the indications from the physician or patient at the time the treatment is prescribed [38] and condition on them in the statistical model to provide unbiased estimates. The only limitation to this approach is that precision is decreased if it includes variables that are not also causes of the outcome [39].
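A minimal sketch of this idea, with invented numbers (treatment lowers a continuous outcome by 1.0, but sicker patients are treated more often): the crude contrast is confounded by indication, while standardizing over strata of the measured indication recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
severity = rng.integers(0, 2, n)              # indication for treatment
t = rng.binomial(1, np.where(severity == 1, 0.8, 0.2))  # sicker -> treated
true_effect = -1.0                            # treatment improves the outcome
y = true_effect * t + 2.0 * severity + rng.normal(0, 1, n)

# Crude contrast: confounded by indication (≈ +0.2, wrong sign)
crude = y[t == 1].mean() - y[t == 0].mean()

# Condition on the indication: stratum-specific contrasts averaged over
# the severity distribution (standardization) recover ≈ -1.0
adjusted = sum(
    (y[(severity == s) & (t == 1)].mean()
     - y[(severity == s) & (t == 0)].mean()) * (severity == s).mean()
    for s in (0, 1)
)
```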

When conducting a meta-analysis, investigators are clearly dependent on what information the original studies collected and reported and the analyses conducted. One serious limitation to meta-analyses of non-randomized studies using traditional methods is that different studies will use different sets of covariates in the adjusted analysis. Currently, investigators not using the bias-adjusted estimate methods previously described simply report meta-analysis results after combining studies with different covariate adjustment sets but never explain why they believe it is appropriate to do so. Causal diagrams force the investigator to make their assumptions explicit. For example, when the causal relationships between exposure and disease are well understood, and it is realistic to assume a particular causal model has a high probability of being correct, it is logical to combine studies that use different covariate adjustment sets if each of the covariate adjustment sets results in minimizing bias (see Appendix 1 for an example). When the causal relationships between exposure and disease are not well understood, investigators should draw all the causal diagrams they believe are realistic. The same process as above (i.e., when only one causal diagram is probable) is then applied to each of the causal diagrams as a sensitivity analysis. It may be that certain studies should be combined according to one causal diagram but not according to a different causal diagram; causal diagrams therefore increase transparency of the underlying assumptions being applied by the investigator doing the meta-analysis. Of course, methods that provide bias-adjusted estimates of the individual studies also depend on the true causal relationships, and therefore, causal diagrams would increase the transparency of these meta-analyses as well.
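This argument can be illustrated numerically. In the hypothetical linear model below (all variable names and coefficients are invented), the backdoor path runs exposure ← c1 → c2 → outcome, so adjusting for either c1 (study A) or c2 (study B) blocks it; both studies target the same causal effect and can reasonably be combined, while the crude estimate is confounded.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
beta = 0.5                              # true causal effect of x on y
c1 = rng.normal(0, 1, n)
c2 = c1 + rng.normal(0, 1, n)           # c1 -> c2
x = 0.8 * c1 + rng.normal(0, 1, n)      # c1 -> x
y = beta * x + 1.5 * c2 + rng.normal(0, 1, n)  # x -> y, c2 -> y

def ols_x_coef(y, *covs):
    """Coefficient on the first covariate from an OLS fit with intercept."""
    X = np.column_stack([np.ones(len(y)), *covs])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

est_study_a = ols_x_coef(y, x, c1)   # study A adjusts for c1 (≈ 0.5)
est_study_b = ols_x_coef(y, x, c2)   # study B adjusts for c2 (≈ 0.5)
est_crude   = ols_x_coef(y, x)       # no adjustment: confounded
```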

Incomplete outcome data

The issues concerning incomplete outcome data that have previously been discussed by others will not be repeated here (see Figure 6 in Ref. [13]). The bias occurs when there is differential loss to follow-up between the assignment groups, which is known as “informative censoring” in traditional epidemiological teaching [13]. Another example of informative censoring not previously discussed (Figure 4) is the over-adjustment bias that occurs where loss to follow-up is a marker of side effects due to exposure (i.e., one is effectively conditioning on an effect of the exposure).

Incomplete outcome data bias may occur in both randomized and non-randomized studies. The bias is probably more frequently present in non-randomized studies because some of the causal diagrams describing the bias are specific to non-randomized studies (see Figure 6 in Ref. [13]), but none are specific to randomized studies. Using causal diagrams clearly shows that obtaining the reasons why the individuals dropped out of the study is essential to estimating bias. If the reason is not an effect of exposure, there is no bias. If the reason is an effect of exposure, non-regression methods of analysis may be more appropriate [40–42]. With respect to meta-analyses, causal diagrams are helpful pedagogically to illustrate the bias but less helpful from a practical viewpoint at this time.
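A hedged numerical sketch of the Figure 4 scenario (all parameters invented): side effects lie on the causal pathway from exposure to outcome, dropout tracks side effects, and a completers-only analysis attenuates the true total effect.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
x = rng.integers(0, 2, n)                       # randomized exposure
side_fx = 1.0 * x + rng.normal(0, 1, n)         # mediator on the causal path
y = 1.0 * side_fx + rng.normal(0, 1, n)         # total effect of x on y is 1.0
# dropout is a noisy marker of side effects (a descendant of the mediator)
retained = side_fx + rng.normal(0, 1, n) < 1.0

full_diff = y[x == 1].mean() - y[x == 0].mean()            # ≈ 1.0
completer_diff = (y[retained & (x == 1)].mean()
                  - y[retained & (x == 0)].mean())         # attenuated
```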

Selective reporting outcome

The selective reporting outcome row in the risk of bias tool is designed to evaluate whether some outcomes are specifically excluded from the published literature (which can occur based on either author or editor decisions). From a causal diagram approach, there is little difference between selective reporting bias and publication bias. Both biases are due to incomplete reporting of results in the published literature (publication bias simply being the extreme where no results are published). As such, the causal diagram in Figure 5 begins with the study results, which cause investigators/editors to choose which outcomes to emphasize (e.g., based on statistical significance, results supporting investigators’ theories, and so on). The choice of outcomes may also cause a change in the probability of publication (either by the authors or the editors), and this can affect the results of future meta-analyses. If the study protocol were available, one would be able to properly evaluate whether this occurred.

In essence, selective reporting bias and publication bias are examples of incomplete outcome data, but on the meta-analysis level rather than the individual study level. With respect to meta-analyses, not including data because of the results represents an example of over-adjustment bias (conditioning on an effect of the exposure, which are the results in this example). Attempts to simply include unpublished data do not remove bias because study quality (leading to study biases) represents a common cause of the ability to publish the study and the study results. Therefore, the best approach would be to obtain all the unpublished data and adjust for study quality (all internal biases) in order to obtain valid bias-adjusted estimates [4,6–9,43] or to qualitatively report on these biases if the expertise for bias-adjusted estimate calculations is not available. However, this is often not feasible or possible, and some authors have recently suggested a Bayesian approach that borrows information about publication bias from other meta-analyses [44]. In essence, this is conceptually similar to the previously discussed methods for allocation concealment where one uses usual bias estimates obtained from other meta-analyses [9].

Figure 5. A causal diagram for the “selective reporting” item in the Cochrane risk of bias tool [10] (also applicable to publication bias, which is the extreme of selective reporting and occurs when all results of a study are not reported). Each individual study is analogous to a cross-over study (because each study represents an observation with data for both comparison conditions). The results of the studies cause investigators (or editors) to emphasize some of the study outcomes but not others, which leads to the inclusion of only some results for the publication. In addition, study quality will affect the amount of bias in the study, which will affect both the study results and the probability of publishing. Including only published material is effectively conditioning on an effect of exposure. Including all the unpublished results will lead to bias unless one can appropriately condition on the study biases.

Readers should note that publication bias can occur for other reasons unrelated to study biases, and these are not included in Figure 5 (e.g., studies written by authors whose first language is not English may have difficulty being published in English language journals) because they do not represent a common cause of two variables in the causal diagram and therefore are not necessary for internal validity of the study (in this case, the meta-analysis). However, omitting such studies can affect generalizability and may still represent a concern depending on the research question.

Other biases

Figures 6 and 7 illustrate that many of the “biases” attributed solely to non-randomized study designs also occur in randomized trials, usually because of poor design. These biases are termed study-level biases because they occur at the level of the study group (i.e., each participant in the same group is affected the same way) rather than the participant level. Because they occur at the study group level, they may be more easily thought of as due to clustering or co-interventions.

Non-randomized studies that use control groups identified by a different period (pre-post studies, historical controls) or location are considered to have a high probability of bias. This is because other factors (e.g., level of health care) may have changed over time (or location), and these other factors might be responsible for the observed effect. In essence, this is just a specific example of assigning causal relations when there are “co-interventions”. For example, if a cluster randomized study has only two clusters, there may be co-interventions that occur by chance (Figure 6) [45]. In this case, it is not possible to know whether the randomized exposure or the co-intervention is responsible for the causal effect. With respect to historical control studies, these simply represent a two-cluster study with clustering due to time. If there are no co-interventions related to the outcome within the context of the study (e.g., health care does not generally change over the span of weeks or months), then historical controls could be appropriate, and bias needs to be evaluated on a study-by-study basis. From a meta-analysis perspective, causal diagrams suggest this bias occurs in both randomized and non-randomized studies (albeit more frequently in non-randomized studies). Therefore, any risk of bias tool that assesses biases for randomized studies to calculate bias-adjusted estimates in a meta-analysis must account for this bias as well.

Figure 6. A causal diagram for the biases that occur when the exposure and control groups are chosen based on time (e.g., historical controls, pre-post studies) or location. The choice of exposure applies to all individuals within the group, and the same choice may also cause a second exposure (i.e., the equivalent of a co-intervention). If the second exposure also causes the outcome, it is not possible to disentangle the effects of the exposure of interest from the co-intervention. This effect is also observed in cluster randomized trials with few clusters. See text for example.

Figure 7. A causal diagram for regression to the mean. In this example, exposure groups are clustered by time. The random variability in the outcome means that the response of the “exposure” group will vary over time by chance (indicated by dotted bidirectional arrow that suggests an association due to unmeasured causes [15]). The bias occurs when the outcome of one clustered group is unusually high, and this causes the investigators to decide to conduct the study; exposure should never be determined based on outcome. Because investigators use the group with the known high values that occurred by chance, it is very probable that the comparison group will have low values simply by chance.

Another commonly cited bias for pre-post studies is regression to the mean (Figure 7). If the pre-post study were well designed, with the intervention applied based on factors unrelated to the prognosis of patients, the probability of the pre group having a higher mean value than the post group is exactly the same as in any randomized trial, that is, a chance association. That said, pre-post studies might be conducted because clinicians notice an unusually high rate/proportion of an undesirable outcome at a particular period (e.g., high injury rate). The clinicians might then create an intervention to reduce the high rate and compare post-intervention results with pre-intervention results. This violates the basic principle that one should never define an exposure group based on the outcome (which was also the principle underlying recall bias in Figure 3b); in this case, the “pre” group was defined by their unusually high injury rate. More generically, one expects variability in the outcome rate over time, and time is considered the clustering factor in a pre-post study. This chance association at one point in time causes investigators to conduct the study, which creates a non-causal association between exposure and outcome. Here, causal diagrams are most useful for pedagogy and explain how regression to the mean is simply due to inappropriately allowing an outcome to determine exposure level.
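Regression to the mean in a pre-post design can be demonstrated with a small simulation (the rate and number of periods are arbitrary): injury counts are generated with a constant underlying rate, the study is "triggered" by the worst observed period, and the following period appears to show an improvement even though nothing changed.

```python
import numpy as np

rng = np.random.default_rng(4)
n_sim, n_periods, rate = 2000, 10, 50.0
# many replicate settings, each with a constant true injury rate
counts = rng.poisson(rate, size=(n_sim, n_periods))

# the study is triggered by the worst-looking period ("pre"); the next
# period is taken as "post" -- no intervention is actually applied
trigger = counts[:, :-1].argmax(axis=1)
pre = counts[np.arange(n_sim), trigger]
post = counts[np.arange(n_sim), trigger + 1]
apparent_reduction = (pre - post).mean()   # positive purely by chance
```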

Finally, biases associated with the selection of controls within “case-control studies” are often considered difficult to identify. However, causal diagrams have been used to show that they usually represent bias due to conditioning on a common effect (collider-stratification bias) [13]. For example, Berkson’s bias occurs when an investigator chooses controls from the same hospital where cases were admitted, which is often done in order to select controls from the same socio-economic area as the cases (i.e., individuals who attend the same hospital are likely to live in proximity to each other). However, conditioning on a common effect of two variables (e.g., hospitalization is caused by several different diseases) or one of its descendants creates a conditional association between the causes of these diseases. This means that any exposure that results in hospitalization will be associated with the outcome of interest even if it is not a cause of the outcome of interest. Because bias due to conditioning on a common effect can also occur when there is loss to follow-up in randomized studies [13], there is nothing “unique” about the bias in case-control studies. More importantly, if one chooses the controls for a case-control study appropriately (e.g., using incidence density sampling in a case-control study nested within a cohort study), there is no conditioning on a common effect and there is no bias. Therefore, from a risk of bias tool perspective, the relevant question for a case-control study is whether the control group was specifically identified because they experienced a particular outcome that is also caused by the exposure of interest. Because a cohort study would never select participants based on an outcome that occurred after exposure, neither should a case-control study (with the exception that controls cannot have the outcome of interest). From a meta-analysis perspective, causal diagrams should help investigators realize that collider-stratification bias in randomized trials creates the same problems as collider-stratification bias in non-randomized studies.

Implications for bias assessment tools in meta-analyses

At a fundamental level, the causal diagrams in Figures 1–7 illustrate that the only bias that occurs exclusively in non-randomized studies is the process of treatment allocation leading to confounding by indication; all the other general categories of biases differ only in the frequency or magnitude with which they occur. This supports our previous work arguing the same principles from a traditional epidemiological approach [38].

Given the above, any effective tool used to assess bias in randomized studies or calculate bias-adjusted estimates in meta-analyses must already include all biases except confounding by indication; only slight modifications would be necessary to make it appropriate for non-randomized studies. One possible series of modifications to the popular Cochrane risk of bias tool is shown in Table 2, with the additions necessary to highlight the biases associated with non-randomized studies marked in bold. In brief, issues related to clustering (which occur in both randomized and non-randomized studies) are now simply specified as examples under “other sources of bias” to help the less experienced authors remember these specific issues. Issues related to confounding by indication and those related to choosing exposure groups based on the outcome are listed under allocation generation (previously referred to as “sequence generation” in the original Cochrane risk of bias tool). This is because blinding (or lack thereof) causes a particular allocation process to occur, and therefore, allocation generation represents the more proximal cause of the bias. For pedagogical reasons, some may prefer to teach confounding by indication as part of blinding, as we have done in this paper. Finally, there is one additional row that refers to analytical procedures. Although this could be included under “other biases”, non-randomized studies will almost always require adjustment to minimize confounding bias, and authors need to evaluate if the adjustment set of covariates is likely to block confounding bias or introduce collider-stratification bias. Therefore, one would expect this bias will be frequent enough to warrant its own heading. Analytical procedures should also be included when assessing biases in randomized studies, especially when one is required to use the estimates provided by the investigators and cannot simply enter raw numbers (e.g., survival analysis).

Table 2. Proposed modifications to the Cochrane risk of bias tool (changes indicated in bold) so that it is applicable for non-randomized studies.

Domain: Allocation generation
Description: Describe the method used to generate the allocation sequence in sufficient detail to allow an assessment of whether it should produce comparable groups.
Review authors’ judgement: Was the allocation sequence adequately generated? For non-randomized studies, was the allocation based on the indications for treatment, or presence of outcome (introduces bias)?

Domain: Other sources of bias
Description: State any important concerns about bias not addressed in the other domains in the tool. If particular questions/entries were pre-specified in the review’s protocol, responses should be provided for each question/entry.
Review authors’ judgement: Was the study apparently free of other problems that could put it at a high risk of bias? In particular, were there any other “co-interventions” by design or association through clustering that could explain the results?

Domain: Analytical procedures
Description: Describe the statistical methods used to minimize bias.
Review authors’ judgement: Were appropriate statistical analyses used to minimize bias? A causal diagram outlining the theoretical causal relationships between variables of interest would be beneficial.

The items that did not require changes are not shown. These are allocation concealment; blinding of participants, personnel and outcome assessors; incomplete outcome data; and selective outcome reporting.

The Cochrane risk of bias tool was designed to assess and evaluate internal validity only and omits external validity. However, answers from perfectly conducted randomized trials on children may be biased if the population of interest is adults. Additional items to account for these biases should be considered as well when calculating bias-adjusted estimates in meta-analyses (as was done by some authors; see Ref. [3] for examples) but require selection diagrams [46] (as opposed to causal diagrams) and are beyond the scope of this paper. Finally, causal diagrams will also be helpful to those conducting health technology assessments and developing guidelines when using the GRADE system (Grading of Recommendations Assessment, Development, and Evaluation) [47]. For example, they can be used to transparently defend a decision to downgrade evidence from randomized trials [48] or to upgrade/downgrade evidence from non-randomized studies [48,49].

This paper has used causal diagrams to help modify the risk of bias tool and to explicitly state the underlying assumptions when calculating bias-adjusted estimates. Although causal diagrams are advocated by some [13,20,50–52], others believe that the additional benefits are “. . . generally not helpful and can too easily lead to theoretical infatuation with mathematical curiosities . . .” [53]. I believe the current manuscript highlights some of the benefits of the causal diagram approach; readers are encouraged to learn more about these methods and decide for themselves.


Conclusion

Using causal diagrams, this article illustrates that with the exception of confounding by indication, the structure of all potential biases present in non-randomized studies may also be present in randomized studies. Therefore, any appropriate risk of bias tool developed to assess bias in randomized studies, and to calculate bias-adjusted estimates in meta-analyses, should only require minor modifications for use with non-randomized studies. Causal diagrams help to clearly illustrate the underlying mechanisms listed in risk of bias tools and suggest that the current Cochrane risk of bias tool in particular remains applicable for use in non-randomized studies with only minor modifications. Causal diagrams also require investigators to explicitly state the underlying assumptions of their analytical models, thereby increasing transparency.

Appendix 1

The structural approach to bias uses causal diagrams to illustrate how all the epidemiological biases can be grouped into one of four categories (confounding bias, collider-stratification bias, over-adjustment bias and measurement error bias) according to the causal relationships between variables.

Causal diagrams generally use directed acyclic graphs to illustrate causal relationships. Each arrow connecting two variables indicates causation, and there are no bi-directional arrows; variables with no direct causal association are left unconnected. In complex situations where each variable might be considered a cause of the other variable (e.g., depression and chronic pain), one must remember that time is a component in the relationship; the same construct measured at different times represents distinct variables and must be treated as such. Therefore, a diagram relating chronic pain and depression might include variables for depression at time 1 and at time 2, variables for chronic pain at time 1 and time 2, and arrows from (i) depression at time 1 to both depression at time 2 and chronic pain at time 2, and (ii) chronic pain at time 1 to both chronic pain at time 2 and depression at time 2.
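This time-indexing device can be checked mechanically. The sketch below (using the third-party networkx library; node names follow the example in the text) shows that a naive graph with mutual causation contains a cycle, while the time-indexed version is a valid directed acyclic graph.

```python
import networkx as nx

# Naive encoding: depression and chronic pain "cause each other" -> cycle
naive = nx.DiGraph([("depression", "chronic_pain"),
                    ("chronic_pain", "depression")])

# Time-indexed encoding from the text: each construct at time 1 causes
# both constructs at time 2; no cycles remain
timed = nx.DiGraph([
    ("depression_t1", "depression_t2"),
    ("depression_t1", "chronic_pain_t2"),
    ("chronic_pain_t1", "chronic_pain_t2"),
    ("chronic_pain_t1", "depression_t2"),
])

naive_is_dag = nx.is_directed_acyclic_graph(naive)   # False: has a cycle
timed_is_dag = nx.is_directed_acyclic_graph(timed)   # True: valid DAG
```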

Confounding bias

Figure A-1 is an example of confounding bias. Because the covariate is a common cause of exposure and outcome, exposure and outcome will be associated unconditionally. If one conditions on the covariate, the exposure and outcome will not be associated.
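A minimal simulation of this structure (all probabilities invented; the exposure has no effect on the outcome by construction): exposure and outcome are associated crudely but not within strata of the covariate.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200_000
covariate = rng.integers(0, 2, n)                    # common cause
exposure = rng.binomial(1, np.where(covariate == 1, 0.7, 0.3))
outcome = rng.binomial(1, np.where(covariate == 1, 0.6, 0.2))  # no exposure effect

# Unconditional risk difference: nonzero (≈ 0.16) despite no causal effect
crude_rd = outcome[exposure == 1].mean() - outcome[exposure == 0].mean()

# Conditioning on the covariate: stratum-specific risk differences ≈ 0
stratum_rds = [
    outcome[(covariate == s) & (exposure == 1)].mean()
    - outcome[(covariate == s) & (exposure == 0)].mean()
    for s in (0, 1)
]
```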

Collider-stratification bias

Figure A-2 is an example of collider-stratification bias. This is also known as selection bias in the structural approach to bias because it is the usual cause of what has been traditionally called selection bias. If a covariate and exposure both cause a third variable, then including the third variable (or an effect of the third variable) in a regression model creates a conditional (non-causal) association between the exposure and the covariate. For example, both rain and sprinklers can cause a football field to be wet. If one knows the grass is wet, then knowing the sprinklers were off increases the probability that it rained; rain and sprinklers become associated when the common effect of “field wetness” is known. Berkson’s bias is a well-known example in health-related case-control studies and is illustrated in the top of Figure A-2. Let us say one is interested in whether estrogen prevents death from myocardial infarction (MI). An investigator chooses cases of death due to MI from a local hospital. In order to recruit controls from the same geographical area, the investigator chooses controls from patients admitted to the same hospital for non-myocardial infarction diagnoses. However, estrogen is causally related to hospitalization because it increases bone density and prevents osteoporotic fractures. Because both estrogen and MI cause hospitalization, they will be associated in this study specifically because both cases and controls were chosen based on hospitalization (i.e., one has conditioned on hospitalization). However, this association is non-causal and should not be interpreted to mean taking estrogen prevents MI.

Figure A-1. Causal diagram illustrating confounding bias. Conditioning on the variable in the box minimizes bias. See text for explanation.

Figure A-2. Causal diagrams illustrating collider-stratification bias in a case-control study (top) and in a prospective study (bottom). Conditioning on the variable that is in the box creates the bias. See text for explanation.

Figure A-3. Causal diagram illustrating over-adjustment bias. Conditioning on the variable that is in the box creates the bias. See text for details.

Figure A-4. Causal diagram illustrating one example of measurement bias. Measurement bias occurs when the value one measures for a variable is different from its true value. In this example, the bias is a form of collider-stratification bias. See text for details.

Collider-stratification bias can also occur in randomized trials if participants drop out because of side effects. In the bottom part of Figure A-2, both treatment and disease cause side effects, which then cause participants to drop out of the study. If one only analyzes the subjects who remain in the study, one is conditioning on a descendent of a collider, and therefore, a conditional association is created among the parents of the collider (i.e., treatment and disease). Because the disease causes death, the treatment will also be associated (non-causally) with death.
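This dropout mechanism is easy to reproduce in a hypothetical simulation (all probabilities invented; the treatment has no effect on death by construction): the full-cohort contrast is null, but the completers-only contrast shows a spurious association.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200_000
treatment = rng.integers(0, 2, n)
disease = rng.integers(0, 2, n)                  # independent of treatment
# both treatment and disease cause side effects, which cause dropout
p_drop = 0.05 + 0.4 * treatment + 0.4 * disease
dropped = rng.random(n) < p_drop
# death is caused by disease only; treatment has NO effect
death = rng.binomial(1, np.where(disease == 1, 0.4, 0.1))

full_rd = death[treatment == 1].mean() - death[treatment == 0].mean()  # ≈ 0
stay = ~dropped
completer_rd = (death[stay & (treatment == 1)].mean()
                - death[stay & (treatment == 0)].mean())  # spuriously negative
```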

Over-adjustment bias

Over-adjustment bias is defined as adjusting for a variable that (i) lies along the causal path from exposure to disease or (ii) is caused by a variable that lies along the causal path from exposure to disease [16]. A simple example is shown in Figure A-3. In brief, osteoarthritis causes subjects to limp, which results in a lower quality of life. In addition, osteoarthritis causes a decreased quality of life because of recurrent swelling, muscle weakness, increased physician visits, and other effects that may be independent of gait abnormality. If one were interested in the total causal effect of osteoarthritis (e.g., to estimate the potential benefits of a treatment for osteoarthritis), then an effect estimate obtained after conditioning on gait abnormality would be biased.
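A simulation with hypothetical effect sizes makes the point concrete: when part of the osteoarthritis effect on quality of life flows through limping, stratifying on limping removes that part of the effect, and the adjusted estimate understates the total effect.

```python
import random

random.seed(2)
N = 100_000

rows = []
for _ in range(N):
    oa = random.random() < 0.5
    limp = random.random() < (0.8 if oa else 0.1)   # gait abnormality lies on the causal path
    # Quality of life is harmed directly by osteoarthritis and also via limping
    qol = 100 - 10 * oa - 8 * limp + random.gauss(0, 5)
    rows.append((oa, limp, qol))

def mean(xs):
    return sum(xs) / len(xs)

# Total causal effect: simple exposed-versus-unexposed contrast
total = mean([q for o, _, q in rows if o]) - mean([q for o, _, q in rows if not o])

# "Adjusted" effect: contrast within strata of the mediator, then average the strata
strata = []
for s in (True, False):
    exposed = [q for o, l, q in rows if o and l == s]
    unexposed = [q for o, l, q in rows if not o and l == s]
    strata.append(mean(exposed) - mean(unexposed))
adjusted = sum(strata) / 2

print(f"Total effect estimate:    {total:.1f}")    # roughly -15.6 under these parameters
print(f"Mediator-adjusted effect: {adjusted:.1f}") # roughly -10: the indirect path is lost
```

The adjusted contrast recovers only the direct effect (-10 here) and misses the additional harm mediated by limping, which is exactly what a clinician estimating the total benefit of treating osteoarthritis would want to include.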

Measurement error bias

Measurement error bias is a special case of the other three biases [17], and one example is illustrated in Figure A-4. Often, we cannot measure a particular variable accurately. For example, an accurate measure of blood pressure requires catheterizing a main artery (e.g., radial artery). For most clinical studies, we accept a recording from a blood pressure cuff (sphygmomanometer) as an acceptable proxy. However, a regular-sized blood pressure cuff will underestimate the true blood pressure in participants with very large arms due to obesity or muscle (this is a mechanical effect because some of the pressure from the cuff is absorbed by the increased tissue and not transmitted to the artery). If one is trying to assess the role of anabolic steroids on death, the estimate obtained by using a regular cuff on every subject will be biased because the measured blood pressure is a collider between large arms and the true blood pressure. This creates a conditional (non-causal) association between anabolic steroids and death.
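This collider structure can also be demonstrated numerically. In the hypothetical simulation below, steroid use has a true null effect on death; death is driven only by true blood pressure, and the cuff under-reads by a fixed amount in subjects with large arms. The crude steroid-death contrast is null, but adjusting for measured blood pressure (by averaging within 10-mmHg strata) manufactures a positive association.

```python
import math
import random

random.seed(3)
N = 200_000

rows = []
for _ in range(N):
    steroids = random.random() < 0.5
    large_arms = random.random() < (0.8 if steroids else 0.1)
    true_bp = random.gauss(130, 15)                    # independent of steroid use (true null)
    # Cuff under-reads by ~15 mmHg in large arms, plus device noise
    measured_bp = true_bp - (15 if large_arms else 0) + random.gauss(0, 3)
    # Death probability depends only on TRUE blood pressure
    death = random.random() < 1 / (1 + math.exp(-(true_bp - 150) / 10))
    rows.append((steroids, measured_bp, death))

def risk_diff(rows):
    exposed = [d for s, _, d in rows if s]
    unexposed = [d for s, _, d in rows if not s]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

crude = risk_diff(rows)                                # near 0: steroids truly unrelated to death

# "Adjusting" for measured BP: weighted average of the risk difference within strata
diffs, weights = [], []
for lo in range(80, 180, 10):
    stratum = [r for r in rows if lo <= r[1] < lo + 10]
    if sum(1 for s, _, _ in stratum if s) > 50 and sum(1 for s, _, _ in stratum if not s) > 50:
        diffs.append(risk_diff(stratum))
        weights.append(len(stratum))
adjusted = sum(d * w for d, w in zip(diffs, weights)) / sum(weights)

print(f"Crude risk difference:    {crude:+.3f}")
print(f"Adjusted risk difference: {adjusted:+.3f}")    # positive: bias from the collider
```

Within any stratum of measured blood pressure, steroid users (who mostly have large arms) must have higher true blood pressure to produce the same cuff reading, so conditioning on the measurement links steroid use to death through true blood pressure.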


References

1. Hernán MA, Robins JM. 2012. Causal Inference. Chapman & Hall/CRC.
2. Shadish WR, Cook TD, Campbell DT. 2002. Experiments and generalized causal inference. In Experimental and Quasi-experimental Designs for Generalized Causal Inference. Houghton Mifflin Company: Boston, MA; 1–32.
3. Thompson S, Ekelund U, Jebb S, Lindroos AK, Mander A, Sharp S, et al. 2011. A proposed method of bias adjustment for meta-analyses of published observational studies. Int J Epidemiol 40(3): 765–777.
4. Turner RM, Spiegelhalter DJ, Smith GC, Thompson SG. 2009. Bias modelling in evidence synthesis. J R Stat Soc Ser A Stat Soc 172(1): 21–47.
5. Higgins JPT, Altman DG, Gotzsche PC, Juni P, Moher D, Oxman AD, et al. 2011. The Cochrane collaboration's tool for assessing risk of bias in randomised trials. Br Med J (In Press).
6. Wolpert RL, Mengersen KL. 2004. Adjusted likelihoods for synthesizing empirical evidence from studies that differ in quality and design: effects of environmental tobacco smoke. Statistical Sci 19(3): 450–471.
7. Greenland S, O'Rourke K. 2001. On the bias produced by quality scores in meta-analysis, and a hierarchical view of proposed solutions. Biostatistics 2: 463–471.
8. Welton NJ, Ades AE, Carlin JB, Altman DG, Sterne JAC. 2009. Models for potentially biased evidence in meta-analysis using empirically based priors. J R Stat Soc Ser A Stat Soc 172(Part 1): 119–136.
9. Wood L, Egger M, Gluud LL, Schulz KF, Juni P, Altman DG, et al. 2008. Empirical evidence of bias in treatment effect estimates in controlled trials with different interventions and outcomes: meta-epidemiological study. Br Med J 336(7644): 601–605.
10. Higgins JPT, Green S (eds). 2009. Cochrane Handbook for Systematic Reviews of Interventions Version 5.0.2 [updated September 2009]. The Cochrane Collaboration. Available from http://www.cochrane-handbook.org.
11. Chavalarias D, Ioannidis JP. 2010. Science mapping analysis characterizes 235 biases in biomedical research. J Clin Epidemiol 63(11): 1205–1215.
12. Reeves B, Shea B, Wells G. 2010. Developing a strategy to assess the extent of residual confounding in primary studies when including non-randomised studies (NRS) in a systematic review. Cochrane Colloquium 2010. Keystone, Colorado.
13. Hernán MA, Hernandez-Diaz S, Robins JM. 2004. A structural approach to selection bias. Epidemiology 15(5): 615–625.
14. Hernán MA, Hernandez-Diaz S, Werler MM, Mitchell AA. 2002. Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. Am J Epidemiol 155(2): 176–184.
15. Pearl J. 2000. Causality: Models, Reasoning and Inference. Cambridge University Press: Cambridge.
16. Schisterman EF, Cole SR, Platt RW. 2009. Overadjustment bias and unnecessary adjustment in epidemiologic studies. Epidemiology 20(4): 488–495.
17. Shahar E. 2009. Causal diagrams for encoding and evaluation of information bias. J Eval Clin Pract 15(3): 436–440.
18. Hernán MA, Cole SR. 2009. Invited commentary: causal diagrams and measurement bias. Am J Epidemiol 170(8): 959–962.
19. Weinberg CR. 1993. Toward a clearer definition of confounding. Am J Epidemiol 137(1): 1–8.
20. Shrier I, Platt RW. 2008. Reducing bias through directed acyclic graphs. BMC Med Res Methodol 8: 70.
21. Greenland S, Pearl J, Robins JM. 1999. Causal diagrams for epidemiologic research. Epidemiology 10(1): 37–48.
22. Textor J, Hardt J, Knüppel S. 2011. DAGitty: a graphical tool for analyzing causal diagrams. Epidemiology (Cambridge, Mass) 22(5): 745.
23. dagR-package: R functions for directed acyclic graphs [program]. 1.1.1 version, 2010.
24. Pearl J. 2009. On a class of bias-amplifying covariates that endanger effect estimates (R-356). UCLA Cognitive Systems Laboratory; 1–9.
25. Cole SR, Hernán MA. 2002. Fallibility in estimating direct effects. Int J Epidemiol 31(1): 163–165.
26. Robins JM. 2001. Data, design, and background knowledge in etiologic inference. Epidemiology 12(3): 313–320.
27. Schulz KF, Grimes DA. 2002. Allocation concealment in randomised trials: defending against deciphering. Lancet 359(9306): 614–618.
28. Schulz KF, Altman DG, Moher D. 2002. Allocation concealment in clinical trials. J Am Med Assoc 288(19): 2406–2407.
29. Juni P, Egger M. 2002. Allocation concealment in clinical trials. J Am Med Assoc 288(19): 2407–2408.
30. Berger VW, Do AC. 2010. Allocation concealment continues to be misunderstood. J Clin Epidemiol 63(4): 468–469.
31. Herbison P, Hay-Smith J, Gillespie WJ. 2011. Different methods of allocation to groups in randomized trials are associated with different levels of bias. A meta-epidemiological study. J Clin Epidemiol; In Press.
32. Doig GS, Simpson F. 2005. Randomization and allocation concealment: a practical guide for researchers. J Crit Care 20(2): 187–191; discussion 191–3.
33. Wilson M-F. 2010. Young athletes at risk: preventing and managing consequences of sports concussions in young athletes and the related legal issues. Marquette Sports Law Rev 21(1): 241–292.
34. Juni P, Altman DG, Egger M. 2001. Systematic reviews in health care: assessing the quality of controlled clinical trials. Br Med J 323(7303): 42–46.
35. Khan A, Bomminayuni EP, Bhat A, Faucett J, Brown WA. 2010. Are the colors and shapes of current psychotropics designed to maximize the placebo response? Psychopharmacology 211(1): 113–122.
36. Greenland S. 2003. Quantifying biases in causal models: classical confounding vs collider-stratification bias. Epidemiology 14(3): 300–306.
37. Hróbjartsson A, Kaptchuk TJ, Miller FG. 2011. Placebo effect studies are susceptible to response bias and to other types of biases. J Clin Epidemiol 64(11): 1223–1229.
38. Shrier I, Boivin JF, Steele RJ, Platt RW, Furlan A, Kakuma R, et al. 2007. Should meta-analyses of interventions include observational studies in addition to randomized controlled trials? A critical examination of the underlying principles. Am J Epidemiol 166: 1203–1209.
39. Vanderweele TJ, Shpitser I. 2011. A new criterion for confounder selection. Biometrics.
40. Frangakis CE, Rubin DB. 2002. Principal stratification in causal inference. Biometrics 58(1): 21–29.
41. Moodie EE, Stephens DA. 2011. Marginal structural models: unbiased estimation for longitudinal studies. Int J Public Health 56(1): 117–119.
42. Moodie EE, Stephens DA. 2010. Using directed acyclic graphs to detect limitations of traditional regression in longitudinal studies. Int J Public Health 55: 701–703.
43. Greenland S. 2005. Multiple-bias modelling for analysis of observational data. J R Stat Soc Ser A Stat Soc 168: 267–306.
44. Moreno SG, Sutton AJ, Ades AE, Cooper NJ, Abrams KR. 2011. Adjusting for publication biases across similar interventions performed well when compared with gold standard data. J Clin Epidemiol 64(11): 1230–1241.
45. Kramer MS, Martin RM, Sterne JA, Shapiro S, Dahhou M, Platt RW. 2009. The double jeopardy of clustered measurement and cluster randomisation. Br Med J 339: b2900.
46. Pearl J, Bareinboim E. 2010. Transportability across studies: a formal approach (R-372). University of California: Los Angeles; 1–19.
47. Guyatt G, Oxman AD, Akl E, Kunz R, Vist G, Brozek J, et al. 2010. GRADE guidelines: 1. Introduction: GRADE evidence profiles and summary of findings tables. J Clin Epidemiol: 1–12.
48. Balshem H, Helfand M, Schünemann HJ, Oxman AD, Kunz R, Brozek J, et al. 2011. GRADE guidelines: 3. Rating the quality of evidence. J Clin Epidemiol: 1–6.
49. Guyatt GH, Oxman AD, Sultan S, Glasziou P, Akl EA, Alonso-Coello P, et al. 2011. GRADE guidelines: 9. Rating up the quality of evidence. J Clin Epidemiol: 1–6.
50. Pearl J. 2009. Remarks on the method of propensity score. Stat Med 28(9): 1415–1416; author reply 1420–3.
51. Vanderweele TJ, Robins JM. 2007. Directed acyclic graphs, sufficient causes, and the properties of conditioning on a common effect. Am J Epidemiol 166(9): 1096–1104.
52. Sjolander A. 2009. Propensity scores and M-structures. Stat Med 28(9): 1416–1420; author reply 1420–3.
53. Rubin DB. 2009. Should observational studies be designed to allow lack of balance in covariate distributions across treatment groups. Stat Med 28: 1420–1423.

