ISPOR Task Force Report: ITC & NMA Study Questionnaire DRAFT FOR REVIEW ONLY
INDIRECT TREATMENT COMPARISON / NETWORK META-ANALYSIS
STUDY QUESTIONNAIRE TO ASSESS RELEVANCE AND
CREDIBILITY TO INFORM HEALTHCARE DECISION-MAKING: AN
ISPOR-AMCP-NPC GOOD PRACTICE TASK FORCE REPORT
DRAFT REPORT FOR REVIEW
VERSION SEPTEMBER 24, 2013
©2013 International Society for Pharmacoeconomics and Outcomes Research. All rights reserved under
International and Pan-American Copyright Conventions.
ABSTRACT
Evidence-based health-care decision making requires comparisons of efficacy and safety of all
relevant competing interventions for a certain disease state. When the available randomized
controlled trials of interest do not all compare the same interventions – but each trial compares
only a subset of the interventions of interest – it is possible to represent the evidence base as a
network where all trials have at least one intervention in common with another. Such a network
of trials involving treatments compared directly or indirectly can be synthesized by means of a
network meta-analysis. Despite the great realized or potential value of network meta-analysis to
inform healthcare decision-making, many decision-makers might not be very familiar with these
techniques and the use of network meta-analysis in health care decisions is still lagging. The
Task Force developed a consensus-based questionnaire to help decision makers assess the relevance and credibility of indirect treatment comparisons and network meta-analyses intended to inform healthcare decision-making. The developed questionnaire consists of 26 questions. The domain of relevance (4 questions) addresses the extent to which the results of the network meta-analysis, if accurate, apply to the setting of interest to the decision-maker. The credibility domains address the extent to which the network meta-analysis accurately answers the question it is designed to answer. The 22 credibility questions relate to the evidence base used, analysis methods, reporting quality & transparency, interpretation of findings, and conflict of interest. The questionnaire aims to provide a guide for assessing the degree of confidence that should be placed in a network meta-analysis and to enable decision-makers to gain awareness of the subtleties involved in evaluating networks of evidence from these kinds of studies. It is anticipated that user feedback will permit periodic evaluation and modification of the questionnaire. The goal is to make the questionnaire as useful as possible to the healthcare decision-making community.
Keywords: network meta-analysis, indirect comparisons, mixed treatment comparisons, validity,
bias, relevance, credibility, questionnaire, checklist, decision-making
BACKGROUND TO THE TASK FORCE
[To be added]
INTRODUCTION
Four Good Practices Task Forces developed a consensus-based set of questionnaires to help decision makers evaluate 1) prospective and 2) retrospective observational studies, 3) network meta-analysis (indirect treatment comparison), and 4) decision analytic modeling studies with greater uniformity and transparency [1-3]. The primary audiences of these questionnaires are assessors and reviewers of health care research studies for health technology assessment, drug formulary, and health care services decisions, who require varying levels of knowledge and expertise. This report focuses on the questionnaire to assess the relevance and credibility of network meta-analysis (indirect treatment comparison).
Systematic reviews of randomized controlled trials (RCTs) are considered the gold standard for evidence-based health care decision-making for clinical treatment guidelines, formulary management, and reimbursement policies. Many systematic reviews use meta-analysis of RCTs to combine quantitative results of several similar studies and summarize the available evidence [4]. Sound, comprehensive decision-making requires comparisons of all relevant competing interventions. Ideally, robustly and comprehensively designed RCTs would simultaneously compare all interventions of interest. Such studies are almost never available, thereby complicating decision-making [5-8]. New drugs are often compared with placebo or standard care, but not against each other, in trials aimed at obtaining approval for drug licensing; there may be no commercial incentive to compare the new treatment with an active control treatment [8,9]. Even if there were an incentive to incorporate competing interventions in an RCT, the interventions of interest may vary by country or may have changed over time due to new evidence and treatment insights. Moreover, for some indications, the relatively large number of competing interventions makes a trial incorporating all of them impractical.
In the absence of trials involving a direct comparison of treatments of interest, an indirect comparison can provide useful evidence for the difference in treatment effects between
competing interventions (which otherwise would be lacking) and for judiciously selecting the best choice(s) of treatment. For example, if two particular treatments have never been compared against each other head-to-head, but both have been compared against a common comparator, then an indirect treatment comparison can use the relative effects of the two treatments versus the common comparator [10-15].
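The common-comparator calculation described here (often called the adjusted indirect comparison, or Bucher method) can be sketched in a few lines. All effect estimates and standard errors below are hypothetical illustrations, not results from any actual trial.

```python
import math

def bucher_indirect(d_ab, se_ab, d_ac, se_ac):
    """Indirect estimate of C versus B from AB and AC trial results.

    d_ab, d_ac: relative effects of B vs. A and C vs. A (e.g., log odds ratios).
    se_ab, se_ac: their standard errors.
    """
    d_bc = d_ac - d_ab                      # the common comparator A cancels out
    se_bc = math.sqrt(se_ab**2 + se_ac**2)  # variances add for independent trials
    ci95 = (d_bc - 1.96 * se_bc, d_bc + 1.96 * se_bc)
    return d_bc, se_bc, ci95

# Hypothetical log odds ratios: B vs. A = -0.5 (SE 0.20), C vs. A = -0.8 (SE 0.25)
d_bc, se_bc, ci95 = bucher_indirect(-0.5, 0.20, -0.8, 0.25)
print(round(d_bc, 2), round(se_bc, 2))  # C vs. B estimate with its standard error
```

Note that the indirect estimate's standard error is larger than that of either direct comparison, reflecting the lower precision of indirect evidence.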
Although it is often argued that indirect comparisons are only needed when direct comparisons are not available, it is important to realize that both direct and indirect evidence contribute to the total body of evidence. Indirect evidence, in combination with direct evidence, may strengthen the assessment of treatments directly evaluated [6]. Even when the results of the direct evidence are conclusive, combining them with indirect estimates in a mixed treatment comparison (MTC) may yield a more refined and precise estimate for the interventions directly compared and broaden inference to the population sampled [12].
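The gain in precision from combining direct and indirect evidence can be illustrated with a simple fixed-effect, inverse-variance pooling sketch. The estimates below are hypothetical, and real mixed treatment comparisons typically use more general network models rather than this two-estimate shortcut:

```python
def combine_direct_indirect(d_direct, se_direct, d_indirect, se_indirect):
    """Pool a direct and an indirect estimate of the same treatment contrast
    by inverse-variance weighting (a simple fixed-effect sketch)."""
    w_dir = 1.0 / se_direct**2
    w_ind = 1.0 / se_indirect**2
    d_pooled = (w_dir * d_direct + w_ind * d_indirect) / (w_dir + w_ind)
    se_pooled = (w_dir + w_ind) ** -0.5
    return d_pooled, se_pooled

# Hypothetical C vs. B log odds ratios: direct -0.40 (SE 0.30), indirect -0.20 (SE 0.40)
d, se = combine_direct_indirect(-0.40, 0.30, -0.20, 0.40)
print(round(d, 3), round(se, 3))  # pooled SE is smaller than the direct SE of 0.30
```

The pooled estimate sits between the two inputs and is more precise than either, which is the "more refined and precise estimate" the text refers to.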
When the available RCTs of interest do not all compare the same interventions – but each trial compares only a subset of the interventions of interest – it is possible to represent the evidence base as a network where all trials have at least one intervention in common with another trial. Such a network of trials involving treatments compared directly, indirectly, or both can be synthesized by means of network meta-analysis [14]. In a traditional meta-analysis, all included studies compare the same intervention with the same comparator. Network meta-analysis extends this concept by including multiple pair-wise comparisons across a range of interventions and provides (indirect) estimates of relative treatment effect for multiple treatment comparisons for comparative or relative effectiveness purposes. In this report, we use the terms network meta-analysis and indirect comparison interchangeably.
Despite the great realized or potential value of network meta-analysis to inform healthcare decision-making and its increasing acceptance [e.g., by the Pharmaceutical Benefits Advisory Committee (PBAC) in Australia, the Canadian Agency for Drugs and Technologies in Health (CADTH), the National Institute for Health and Clinical Excellence (NICE) in the United Kingdom, the Institute for Quality and Efficiency in Health Care (IQWiG) in Germany, and the Haute Autorité de Santé (HAS) in France], many decision-makers might not be very familiar with these techniques, and the use of network meta-analysis in health care decisions is still lagging. A major dilemma
decision-makers face is the degree of confidence they should place in the results obtained with a network meta-analysis. There is a critical need for education on network meta-analysis, as well as for transparent and uniform ways to assess the quality of reported network meta-analyses to help inform decision-making.
In creating a questionnaire for health care decision makers to assess network meta-analyses, the Task Force was charged with achieving two goals. One goal was to provide a guide for assessing the degree of confidence that should be placed in such a study. The aim was to create a questionnaire that would be relatively easy and quick to use for individuals without an in-depth knowledge of design and statistical issues. The second goal was for the questionnaire to have educational and instructional value to decision-makers and other users, enabling them to gain awareness of the subtleties involved in evaluating network meta-analysis; we sought an instrument that could prompt or invite users to obtain additional education on the underlying methodologies.
QUESTIONNAIRE DEVELOPMENT
One issue in creating questionnaires for decision makers is whether they should be linked to scorecards, annotated scorecards, or checklists. The Task Force raised concerns that a scorecard with an accompanying scoring system may be misleading because it would not have adequate validity and measurement properties. Scoring systems may also provide users with a false sense of precision and have been shown to be problematic in the interpretation of randomized trials [16].
An alternative to a scorecard is a checklist. However, the Task Force felt that checklists might mislead users in their assessments because a study may satisfy all of the elements of a checklist and still harbor "fatal flaws" in the methods applied in the publication. Moreover, users might have the tendency to count up the number of elements present, converting it into an implicit score, and then apply that implicit scoring to their overall assessment of the evidence. In addition, the applicability of a study may depend on whether there is any other evidence that addresses the specific issue or the decision being made. In general, an evidence evaluator needs
to be aware of the strengths and weaknesses of each piece of evidence and apply his or her own reasoning. Indeed, the Task Force felt that, in addition to the potential for implicit or explicit scoring to be misleading, such scoring would also undermine the educational goal of the questionnaires.
The Task Force decided to develop a questionnaire characterized by two principal concepts: relevance and credibility. Relevance addresses the extent to which the results of the network meta-analysis, if trustworthy, apply to the setting of interest to the decision-maker. The domain of relevance addresses questions related to the population, comparators, endpoints, timeframe, and policy-relevant differences. Credibility was defined as the extent to which the network meta-analysis or indirect comparison accurately or validly answers the question it is designed to answer. To generate the questions for the credibility section, the Task Force relied on the expertise of its members and on the scientific literature, including the reports of the ISPOR Task Force and other pertinent publications on indirect treatment comparisons [10-15]. Items and suggested wording were also informed by earlier and recent efforts that provided guidance to evidence reviewers [17,18]. These questions were grouped into logical domains: evidence base used for the indirect comparison or network meta-analysis; analysis; reporting quality & transparency; interpretation; and conflict of interest.
The developed questionnaire consists of 6 domains with a total of 26 questions related to the relevance (4 questions) and credibility (22 questions) of a network meta-analysis. Each question can be scored with "Yes," "No," or "Can't Answer." "Can't Answer" can be used if the item is not reported, there is insufficient information in the report, or the assessor does not have sufficient training to answer the question. For one question, a "No" implies a "fatal flaw." A fatal flaw suggests significant opportunities for the findings to be misleading and that the decision maker should use extreme caution in applying the findings to inform decisions. However, the occurrence of a fatal flaw neither prevents a user from completing the questionnaire nor requires the user to judge the evidence as "insufficient" for use in decision making. The presence of a fatal flaw should raise a strong caution and should be carefully considered when the overall body of evidence is reviewed. Based on the number of items scored as satisfactory in each of the 6 domains, an overall judgment of the strength of each domain is then provided: "Strength," "Neutral," "Weakness," or "Fatal flaw." Based on the domain judgments, the overall relevance
and credibility of the network meta-analysis for decision-making are assessed separately by the user as either "Sufficient" or "Insufficient."
Following self-testing by the Task Force members during September/October of 2012 and subsequent modification of the questionnaire, the revised questionnaire was made available to volunteers interested in user testing. During April and May of 2013, each volunteer was asked to evaluate three published network meta-analyses with the questionnaire being developed. Based on the feedback and the agreement among the ratings provided, the current version of the questionnaire was found helpful in assisting users to systematically assess network meta-analysis studies. The questionnaire is provided in Appendix A.
QUESTIONNAIRE ITEMS
This section provides detailed information to facilitate answering the 26 questions. A glossary is provided in Appendix B.
Relevance
Relevance addresses the extent to which the results of the study apply to the setting of interest to the decision-maker. Relevance is a relative term that has to be determined by each decision-maker; the judgment of one decision-maker will not necessarily apply to other decision-makers.
1. Is the population relevant?
This question addresses whether the populations of the RCTs that form the basis for the network meta-analysis sufficiently match the population of interest to the decision-maker. Relevant population characteristics are not limited to the specific disease of interest but also pertain to disease stage, severity, comorbidities, treatment history, race, age, gender, and possibly other demographics. Typically, RCTs included in the network meta-analysis are identified by means of a systematic literature search, with the relevant studies in terms of population pre-defined by study selection criteria. If these criteria are reported, they are a good starting point to
judge relevance in terms of population. Evidence tables with inclusion criteria and baseline patient characteristics for each study provide the most relevant information to judge population relevance; exclusion criteria are also noteworthy. For example, if a decision involves covering a Medicare Part D population (e.g., those older than 65), studies with few patients above 60 years of age may be less relevant.
2. Are any relevant interventions missing?
This question addresses whether the intervention(s) included in the network meta-analysis match the one(s) of interest to the decision-maker and whether all relevant comparators have been considered. Depending on the included RCTs, the network meta-analysis may include additional interventions not necessarily of interest to the decision-maker; this does not, however, compromise relevance. Considerations when judging the relevance of included biopharmaceuticals are the dose and schedule of a drug, its mode of administration, and the background treatment. Whether the drug is used as induction or maintenance treatment can be relevant as well. For other medical technologies, one can consider whether the procedure or technique in the trials is the same as the procedure or technique of interest to the decision-maker.
3. Are any relevant outcomes missing?
This question asks what outcomes are assessed in the study and whether those outcomes are meaningful to the decision maker. There has been increasing emphasis on outcomes that are directly meaningful to the patient or the health system, such as cardiovascular events (e.g., myocardial infarction, stroke) or patient functioning and health-related quality of life (e.g., SF-36, EQ-5D), and decreasing emphasis on surrogate endpoints (e.g., cholesterol levels) unless validated. Other considerations include the feasibility of measuring relevant (i.e., final) outcomes, the predictive relationship between surrogate outcomes and final outcomes, and what kind of evidence will be considered "good enough" given the patient population, the burden of the condition, and the availability of alternative treatments (and the evidence supporting those treatments). Not only the endpoints themselves are of interest, but also their time points. For example, a network meta-analysis that included RCTs with a longer follow-up may be more relevant to help inform treatment decisions for a chronic disease than a network meta-analysis limited to studies with a short follow-up (if follow-up is related to treatment effect).
4. Is the context (settings & circumstances) applicable?
This question addresses whether there are any differences between the RCTs included in the network meta-analysis and the setting and circumstances of interest to the decision-maker that do not concern the population, interventions, or outcomes but may still render the findings not applicable. For example, the years when the studies included in the network meta-analysis were performed can be of interest when background medical care of a certain disease has changed dramatically over time; the standard treatment may change accordingly. Many of the RCTs underlying network meta-analyses comparing biopharmaceuticals are designed for efficacy ("can it work") purposes, and therefore the setting or circumstances may differ from the real-world setting of interest to the decision-maker. A relevant question to ask, therefore, is whether the relative treatment effects and the rank ordering of interventions obtained with the network meta-analysis would still hold if real-world compliance or adherence were taken into consideration. The answer might be "no" if some of the interventions are associated with a much lower compliance in the real-world setting than other interventions.
Credibility
Once the network meta-analysis is considered sufficiently relevant, its credibility is assessed. Credibility is defined as the extent to which the network meta-analysis or indirect comparison accurately or validly answers the question it is designed to answer. For the assessment questionnaire, we take the position that credibility is not limited to internal validity (i.e., whether the observed treatment effects resulting from the analysis reflect the true treatment effects) but also covers reporting quality & transparency, interpretation, and conflict of interest (see Figure 1). The internal validity of the network meta-analysis can be compromised by bias in the identification and selection of studies, bias in the individual studies included, and bias introduced by the use of inappropriate statistical methods.
Figure 1: Overview of domains related to assessment of the credibility of a network meta-analysis
Evidence base used for the indirect comparison or network meta-analysis
The first 6 questions of the credibility domain pertain to the validity of the evidence base feeding information into the network meta-analysis.
1. Did the researchers attempt to identify and include all relevant randomized controlled trials?
In order to have a network meta-analysis that reflects the currently available evidence, a systematic literature search needs to be performed. Although there is no guarantee that all relevant studies can be identified with the search, it is important that the researchers at least attempted to achieve this goal. Important questions to ask are:
- Was the literature search strategy defined in such a way that available randomized controlled trials concerning the relevant interventions could be identified?
- Were multiple databases searched (e.g., MEDLINE, EMBASE, the Cochrane Central Register of Controlled Trials)?
- Were review selection criteria defined in such a way that relevant randomized controlled trials were included if identified with the literature search?
When the answer to each of these questions is "yes," one may conclude that the researchers attempted to include all available relevant RCTs that have been published as full-text reports. As additional assurance that no key studies are missing, a search of clinical trial registries, such as clinicaltrials.gov, is also recommended.
2. Do the trials for the interventions of interest form one connected network of randomized controlled trials?
In order to allow comparisons of the treatment effects of all interventions included in the network meta-analysis with each other, the evidence base used for the network meta-analysis should meet two requirements. First, all of the RCTs must include at least one intervention (or placebo) in common with another trial. Second, all of the (indirect) treatment comparisons must be part of the same network. In Figure 2, four different evidence networks are presented. The nodes reflect interventions, and the edges reflect one or several RCTs, i.e., direct comparisons. The network in
Figure 2A includes AB studies comparing intervention B with A and AC studies comparing intervention C with A. The relative treatment effect of C versus B can be obtained with the indirect comparison of the AB and AC studies. The network in Figure 2B also includes CD studies, thereby allowing treatment comparisons of interventions A, B, C, and D. The absence of an edge between two nodes in Figures 2A and 2B indicates that the relative treatment effects are obtained by means of an indirect comparison. The networks in Figures 2C and 2D are characterized by a closed loop, which implies that for some of the treatment comparisons there is both direct and indirect evidence. In Figure 2C there is both direct and indirect evidence for the AB, AC, and BC comparisons. In Figure 2D there is direct and indirect evidence for all comparisons with the exception of the AD and BC contrasts, for which there is only indirect evidence.
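The connectivity requirement can be checked mechanically by treating interventions as graph nodes and trials as edges between the arms they compare. A sketch (the intervention labels are illustrative):

```python
from collections import defaultdict

def is_connected(trials):
    """Check whether a set of trials forms one connected evidence network.

    trials: iterable of trials, each given as the set of intervention labels
    (including placebo, if present) that the trial compares.
    Returns True if every intervention can be reached from every other via
    trials that share at least one common arm.
    """
    graph = defaultdict(set)
    for arms in trials:
        arms = list(arms)
        for a in arms:
            for b in arms:
                if a != b:
                    graph[a].add(b)
    if not graph:
        return True
    # traverse from an arbitrary intervention and see what is reachable
    start = next(iter(graph))
    seen = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for nxt in graph[node] - seen:
            seen.add(nxt)
            frontier.append(nxt)
    return seen == set(graph)

# Figure 2A-style network: AB and AC trials share comparator A -> connected
print(is_connected([{"A", "B"}, {"A", "C"}]))
# AB and CD trials with no common arm -> two disconnected sub-networks
print(is_connected([{"A", "B"}, {"C", "D"}]))
```

A network meta-analysis is only feasible over the interventions inside one connected component.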
Figure 2: Networks of randomized controlled trials to allow for network meta-analysis and indirect comparisons. All interventions that can be indirectly compared are part of one network.
In RCTs, the observed outcome with an intervention is the result of study characteristics, patient characteristics, and the treatment itself. In a placebo-controlled trial, the result of the placebo arm reflects the impact of study and patient characteristics on the outcome of interest, say outcome y (see Figure 3). In other words, the placebo response is the result of all known and unknown prognostic factors other than active treatment. We can call this the study effect. In the active intervention arm of the trial, the observed outcome y is a consequence of the study effect and a treatment effect. By randomly allocating patients to the intervention and placebo groups, both known and unknown prognostic factors (as well as both measured and unmeasured prognostic factors) are on average balanced between the different groups within a trial. Hence, the study effect as observed in the placebo arm is assumed to be the same in the active intervention arm, and therefore the difference between the active intervention arm and the placebo arm (say delta y) is attributable to the active intervention itself, resulting in a treatment effect (the blue box in Figure 3).
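The decomposition described above can be made concrete with hypothetical numbers:

```python
# Hypothetical decomposition of outcome y in a placebo-controlled trial.
study_effect = 0.30      # placebo response: all prognostic factors combined
treatment_effect = 0.15  # effect attributable to the active intervention

y_placebo = study_effect                    # observed outcome, placebo arm
y_active = study_effect + treatment_effect  # observed outcome, active arm

# Randomization balances prognostic factors across arms, so the between-arm
# difference (delta y) isolates the treatment effect:
delta_y = y_active - y_placebo
print(round(delta_y, 2))  # equals the treatment effect, 0.15
```

The key assumption, guaranteed by randomization within a trial but not across trials, is that the study effect is identical in both arms.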
Figure 3: Treatment effects, study effects, effect modifiers and prognostic factors in a randomized placebo-controlled trial.
With an indirect comparison, interest centers on the comparison of the treatment effects of interventions that have not been studied in a head-to-head fashion. In order to ensure that the indirect comparisons of interventions are not affected by differences in study effects between studies, we
want to consider only the treatment effects of each trial. This implies that all interventions indirectly compared have to be part of one network of trials where each trial has at least one intervention (or placebo) in common with another trial, as illustrated in Figure 4. If some interventions of interest are not part of the same network, then it is not possible to perform an indirect comparison of the treatment effects of these interventions without a substantial risk of bias, as illustrated in Figure 5.
Figure 4: Indirect comparison of AB, AC and CD studies where each trial is part of one network. In terms of treatment effects, B dominates A, C dominates B, and D dominates C.
Figure 5: AB and CD studies that do not have an intervention in common and for which an indirect comparison is not feasible.
3. Is it apparent that poor quality studies were included, thereby leading to bias?
The validity of the network meta-analysis is at risk not only when certain studies are not identified, but also when the internal validity of the individual RCTs is compromised. In order to answer this credibility question, the network meta-analysis report should have provided summary information on key study characteristics of each RCT, such as the method of randomization, treatment allocation concealment, blinding of the outcome assessor, and dropout. Frequently, a network meta-analysis report provides an overview of bias in individual studies as assessed with a specific checklist for individual study validity, such as the Cochrane Collaboration's tool for assessing risk of bias in randomized trials [19].
4. Is it likely that bias was induced by selective reporting of outcomes in the studies?
Outcome reporting bias occurs when outcomes of the individual trials are selected for publication based on their findings [20]. This can result in biased treatment effect estimates obtained with the network meta-analysis. As an assessment of the likelihood of bias due to selective reporting of outcomes, a determination can be made whether there is consistency in the studies used for the network meta-analysis with respect to the different outcomes. In other words, check whether any of the selected studies do not report some of the outcomes of interest and were therefore not included in some of the network meta-analyses of the different endpoints. It can be informative to look at the original publication of the suspect study to see whether there is any additional information. Furthermore, if reported in the network meta-analysis report, check the reasons why studies were excluded from the systematic review to ensure no eligible studies were excluded only because the outcome of interest was not reported [20]. If this were the case, a related risk of bias (publication bias) would manifest itself in the overall findings of the network meta-analysis.
5. Are there systematic differences in treatment effect modifiers (i.e., baseline patient or study characteristics that impact the treatment effects) across the different treatment comparisons in the network?
Study and patient characteristics can have an impact on the observed outcomes in the intervention and control arms of an RCT. As mentioned previously, study and patient
characteristics that influence the outcome to the same extent in the intervention and placebo arms are called prognostic factors. More specifically, the placebo response or study effect captures the impact on the outcome of interest of all study and patient characteristics that are prognostic factors.
Study and patient characteristics that influence the difference between the intervention and placebo arms regarding the outcome of interest are treatment effect modifiers (Figure 3). For example, if a medical intervention only works for men and not for women, a trial among men will demonstrate a positive treatment effect relative to placebo, whereas a trial among only women will not; gender is a treatment effect modifier for that intervention. As another example, if the outcome of interest (e.g., improvement in pain) is greater in a 24-week trial than in a 12-week trial, and there is no difference in the treatment effect of the intervention relative to placebo between the 24-week and 12-week trials, then trial duration is only a prognostic factor and not an effect modifier. If a variable is both a prognostic factor for the outcome of interest and a treatment effect modifier, then the placebo response (or baseline risk) of a placebo-controlled trial is associated with the treatment effect.
Although the indirect comparison or network meta-analysis is based on RCTs, randomization does not hold across the set of trials used for the analysis; patients are not randomized to different trials. As a result, there can be systematic differences in study characteristics or in the distribution of patient characteristics across trials. In general, if there is an imbalance in the distribution of effect modifiers across the different types of direct comparisons in a network meta-analysis, the corresponding indirect comparisons are biased [21,22].
In Figures 6-8, examples of valid and biased indirect comparisons and network meta-analyses are provided. It is important to acknowledge that there is always some risk of imbalances in unknown or unmeasured effect modifiers between studies evaluating different interventions. Accordingly, there is always a small risk of residual confounding bias, even if all observed effect modifiers are balanced across the direct comparisons.
Figure 6: Indirect comparison of an AB and AC study with different proportions of patients with moderate and severe disease. Disease severity is an effect modifier. The indirect comparison for the moderate disease population is valid, as is the indirect comparison for the severe population. The indirect comparison for the overall population is biased because the distribution of the effect modifier severity differs between the AB and AC studies.
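The bias illustrated in Figure 6 can be reproduced with a few lines of arithmetic. All numbers below are invented for illustration: two hypothetical trials (AB and AC) enroll different mixes of moderate and severe patients, and disease severity modifies the treatment effect.

```python
# Hypothetical true relative effects vs common comparator A, by severity stratum.
# B and C are equally effective within each stratum, so the true C-vs-B
# difference is zero everywhere.
effect_vs_A = {
    "moderate": {"B": 10.0, "C": 10.0},
    "severe":   {"B": 4.0,  "C": 4.0},
}

# The proportion of moderate patients differs between the two trials (imbalance)
prop_moderate = {"AB": 0.8, "AC": 0.2}

def observed_effect(trial, drug):
    """Trial-level effect vs A: severity-weighted average of stratum effects."""
    p = prop_moderate[trial]
    return p * effect_vs_A["moderate"][drug] + (1 - p) * effect_vs_A["severe"][drug]

d_AB = observed_effect("AB", "B")   # 0.8*10 + 0.2*4 = 8.8
d_AC = observed_effect("AC", "C")   # 0.2*10 + 0.8*4 = 5.2

# Within each stratum the indirect comparison C vs B is valid (zero) ...
indirect_moderate = effect_vs_A["moderate"]["C"] - effect_vs_A["moderate"]["B"]
indirect_severe = effect_vs_A["severe"]["C"] - effect_vs_A["severe"]["B"]

# ... but the overall indirect comparison is biased by the imbalance
indirect_overall = d_AC - d_AB      # -3.6, although the true difference is 0
```

The stratum-specific indirect comparisons recover the truth; only the overall comparison, which mixes the imbalanced severity distributions, is biased.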
Figure 7: Valid network meta-analysis of AB and AC studies with and without heterogeneity: no imbalance in the distribution of the effect modifier disease severity between the AB and AC comparisons.
Figure 8: Biased network meta-analysis of AB and AC studies with and without heterogeneity: imbalance in the distribution of the effect modifier disease severity between the AB and AC comparisons.
6. If yes (i.e., there are such systematic differences in treatment effect modifiers), were these imbalances in effect modifiers across the different treatment comparisons identified prior to comparing individual study results?
Frequently, several trial and patient characteristics differ across the different direct comparisons. Deciding which covariates are effect modifiers based on observed patterns in the results across trials can lead to false conclusions regarding the sources of inconsistency and to biased indirect comparisons [22,23]. It is recommended that researchers undertaking the network meta-analysis first generate a list of potential treatment effect modifiers for the interventions of interest, based on prior knowledge or reported subgroup results within individual studies, before comparing results between studies. Next, the study and patient characteristics that are judged likely effect modifiers have to be compared across studies to identify any imbalances between the different types of direct comparisons in the network.
Analysis
The next 13 questions pertain to the statistical methods used for the indirect comparison or network meta-analysis.
7. Were statistical methods used that preserve within-study randomization? (No naïve comparisons)
In order to acknowledge the randomization of treatment allocation within studies, and thereby minimize the risk of bias as much as possible, an indirect treatment comparison or network meta-analysis has to be based on the relative treatment effects within RCTs [5-15]. In other words, statistical methods need to be used that preserve within-study randomization. An invalid or naïve indirect comparison of RCTs that does not preserve randomization is presented in Figure 9. The naïve indirect comparison does not take into account any differences in study effects (represented by the white boxes for placebo) across trials. When the available RCTs form one evidence network, a naïve indirect comparison can be considered a fatal flaw.
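The distinction can be made concrete with invented arm-level means on some continuous outcome. The naïve comparison contrasts arms across trials directly, whereas an adjusted (Bucher-type) indirect comparison works with within-trial relative effects and so preserves randomization:

```python
# Hypothetical arm-level mean responses (all numbers invented)
ab_trial = {"placebo": 5.0, "B": 15.0}   # study effect: placebo response of 5
ac_trial = {"placebo": 10.0, "C": 22.0}  # study effect: placebo response of 10

# Naive indirect comparison: contrasts the B and C arms across trials,
# ignoring the different placebo responses (study effects) -- invalid
naive_C_vs_B = ac_trial["C"] - ab_trial["B"]          # 7.0

# Adjusted indirect comparison: based on within-trial relative effects,
# so within-study randomization is preserved
d_B_vs_placebo = ab_trial["B"] - ab_trial["placebo"]  # 10.0
d_C_vs_placebo = ac_trial["C"] - ac_trial["placebo"]  # 12.0
adjusted_C_vs_B = d_C_vs_placebo - d_B_vs_placebo     # 2.0
```

In this invented example the naïve comparison overstates the C-versus-B difference (7.0 versus 2.0) because the AC trial happens to have a higher placebo response.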
Figure 9: Naïve indirect comparison of RCTs that is invalid because differences in study effects are not acknowledged. In this example, the difference of C versus B is overestimated.
8. If both direct and indirect comparisons are available for pairwise contrasts (i.e., closed loops), was agreement in treatment effects (i.e., consistency) evaluated or discussed?
If a network has a closed loop, there is both direct and indirect evidence for some treatment contrasts (Figure 10). For example, in an ABC network that consists of AB trials, AC trials, and BC trials, direct evidence for the BC contrast is provided by the BC trials, and indirect evidence for the BC contrast by the indirect comparison of the AC and AB trials. If there are no systematic differences in treatment effect modifiers across the different direct comparisons that form the loop, then there will be no systematic differences between the direct and indirect estimates for each of the contrasts that are part of the loop [5,11,14,15,22]. Combining direct estimates with indirect estimates is then valid, and the pooled (i.e., mixed) result will reflect a greater evidence base and increased precision with respect to the treatment effect. However, if there are systematic differences in effect modifiers across the different direct comparisons of the network loop, the direct estimates will differ from the corresponding indirect estimates, and combining them may be inappropriate (Figure 11). Hence, in the presence of a closed loop, it is important that any direct comparisons be compared with the corresponding indirect comparisons regarding effect size or the distribution of treatment effect modifiers. However, statistical tests for inconsistency should not be over-interpreted, and their results should be weighed together with knowledge of the subject matter.
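One commonly used check is a Bucher-type z-statistic comparing the direct and indirect estimates of the same contrast. The log odds ratios and standard errors below are invented for illustration; as noted above, such tests tend to have low power and should not be over-interpreted.

```python
import math

# Hypothetical estimates (log odds ratio scale) in an ABC loop
d_BC_direct, se_direct = -0.20, 0.15   # pooled BC trials
d_AB, se_AB = 0.30, 0.12               # pooled AB trials
d_AC, se_AC = 0.05, 0.10               # pooled AC trials

# Indirect estimate of the BC contrast from the AB and AC comparisons
d_BC_indirect = d_AC - d_AB                        # -0.25
se_indirect = math.sqrt(se_AB**2 + se_AC**2)

# Inconsistency: difference between direct and indirect estimates
w = d_BC_direct - d_BC_indirect
se_w = math.sqrt(se_direct**2 + se_indirect**2)
z = w / se_w   # |z| > 1.96 would suggest statistically significant inconsistency
```

With these invented numbers |z| is well below 1.96, so there is no statistical signal of inconsistency, which of course does not prove consistency.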
Figure 10: Consistency between direct and indirect comparison
Figure 11: Inconsistency between direct and indirect comparison
9. In the presence of consistency between direct and indirect comparisons, were both direct and indirect evidence included in the network meta-analysis?
If there is a closed loop in an evidence network, the relative treatment effect estimates obtained with direct comparisons are comparable to those obtained with the corresponding indirect comparisons, and there is no imbalance in the distribution of the effect modifiers, then it is of interest to combine results of direct and indirect comparisons of the same treatment contrast in a network meta-analysis. The pooled result will then be based on a greater evidence base, with increased precision for the treatment effect, than when only direct evidence for the comparison of interest is considered [5,6,11,14,15].
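Under consistency, the gain in precision from combining both sources can be sketched with fixed-effect inverse-variance pooling of a direct and an indirect estimate of the same contrast (all numbers invented):

```python
import math

# Hypothetical direct and indirect estimates of the same BC contrast
d_direct, se_direct = -0.20, 0.15
d_indirect, se_indirect = -0.25, 0.16

# Inverse-variance weights: more precise evidence gets more weight
w_dir, w_ind = 1 / se_direct**2, 1 / se_indirect**2

# Mixed (pooled) estimate combining direct and indirect evidence
d_mixed = (w_dir * d_direct + w_ind * d_indirect) / (w_dir + w_ind)
se_mixed = math.sqrt(1 / (w_dir + w_ind))
```

The mixed estimate lies between the two inputs, and its standard error is smaller than that of either source alone, reflecting the greater evidence base.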
10. With inconsistency or an imbalance in the distribution of treatment effect modifiers across the different types of comparisons in the network of trials, did the researchers attempt to minimize this bias with the analysis?
In general, if there is an imbalance in the distribution of the effect modifiers across the different types of direct comparisons, transitivity does not hold and the corresponding indirect comparison is biased, and/or there is inconsistency between direct and indirect evidence (see Figures 8 and 11).
If a sufficient number of studies is included in the network meta-analysis, it may be possible to perform a meta-regression analysis in which the treatment effect of each study is a function not only of the treatment comparison of that study but also of an effect modifier [23-27]. In other words, with a meta-regression model we estimate the pooled relative treatment effect for a certain comparison based on the available studies, adjusted for differences in the level of the effect modifier between studies. This allows indirect comparisons of different treatments at the same level of the covariate, assuming that the estimated relationship between effect modifier and treatment effect is not greatly affected by (ecological) bias (see Figure 12). For example, if the proportion of subjects with severe disease differs among trials and disease severity affects the efficacy of at least one of the compared interventions, a meta-regression analysis can be used to adjust the indirect comparison estimates for the differences in the proportion of patients with severe disease. In addition, the model can predict indirect treatment comparison estimates for different proportions of patients with severe disease in a population. As an alternative, one can use the placebo response of a study as the covariate to adjust for any inconsistency, but this relies on the assumption that the study and patient characteristics (such as severity of illness) that are effect modifiers of the relative treatment effect are also prognostic factors of the outcome with placebo [22]. However, this is not a trivial task and requires appropriate model specification.
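The idea behind such a covariate adjustment can be sketched with a deliberately simplified example. A full network meta-regression is fitted jointly across the network with study-specific weights; here, as a toy illustration with invented data, we fit an unweighted ordinary least squares line to the observed effects of B versus A as a function of the proportion of patients with severe disease, and then predict the effect at a chosen covariate level:

```python
# Hypothetical AB studies: (x_i, d_i) where x_i is the proportion of patients
# with severe disease and d_i the observed relative effect of B vs A
studies = [(0.10, 9.4), (0.30, 8.2), (0.50, 7.0), (0.70, 5.8)]

n = len(studies)
mean_x = sum(x for x, _ in studies) / n
mean_d = sum(d for _, d in studies) / n

# Ordinary least squares for the meta-regression d_i = b0 + b1 * x_i
b1 = (sum((x - mean_x) * (d - mean_d) for x, d in studies)
      / sum((x - mean_x) ** 2 for x, _ in studies))
b0 = mean_d - b1 * mean_x

# Predicted relative effect of B vs A in a population with 40% severe disease
d_at_40pct = b0 + b1 * 0.4
```

Comparisons between treatments can then be made at a common level of the covariate, which is what removes the imbalance; model misspecification and ecological bias remain risks, as noted above.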
As an alternative to a meta-regression analysis, researchers can also attempt to use models with so-called inconsistency factors [27-29]. These network meta-analysis models explicitly acknowledge any difference between the direct and indirect relative treatment effect estimates of two interventions and are thereby less prone to biased estimates when direct and indirect evidence are pooled. These models apply only to treatment comparisons that are part of a closed loop in an evidence network.
In the absence of inconsistency and of differences in effect modifiers across the different comparisons, this question can be answered with “yes.” If there are inconsistencies or systematic differences in effect modifiers across comparisons, this question can also be answered with “yes” if statistical models are used that capture the inconsistency, or if meta-regression models are used that are expected to explain or adjust for the inconsistency or bias. Of course, one needs to be aware of the risk of model misspecification and of ecological bias when using only study-level data. If there is inconsistency but the researchers did not attempt to adjust for it, the question needs to be answered with “no.”
Figure 12: Network meta-analysis without and with adjustment for imbalance in an effect modifier using a meta-regression model. With this approach the treatment effects for AC and AB depend on the value of the effect modifier.
11. Was a valid rationale provided for the use of random effects or fixed effects models?
With multiple studies available for all or some of the direct comparisons in the network, it is possible that studies demonstrate clearly different relative treatment effects even though the same treatments were compared. When the differences in relative treatment effects are greater than can be expected due to chance, there is heterogeneity, and a random effects model is advocated [30]. A random effects model estimates the average relative treatment effect across the different levels of the responsible effect modifier. For example, if several trials compare intervention B with A and each trial has a different distribution of patients with moderate or severe disease (the effect modifier), then the heterogeneous study results can be pooled with a random effects model, which provides the average relative treatment effect across the different levels of disease severity. In the absence of heterogeneity, a fixed effect model can be defended. A fixed effect model assumes that the treatment effects are the same across the different trials included in the analysis.
A valid rationale for the use of random effects models includes the following: the presence of differences in known or likely treatment effect modifiers between studies comparing the same interventions (e.g., differences in the proportion of patients with severe disease among trials comparing the same interventions); the clear presence of variation in treatment effects among studies beyond what can be expected due to chance; or the conservative position that a random effects approach is what is expected in the studies addressing the specific clinical question.
A valid rationale for the use of a fixed effects model instead of a random effects model includes a judgment about the similarity of the studies with respect to important effect modifiers and the prior belief, based on experience in the relevant clinical field, that the intervention is likely to have a fixed relative effect irrespective of the populations studied. Often, model fit criteria are used to decide between models; the model with the better trade-off between fit and parsimony (the fixed effects model being the most parsimonious) is chosen. However, such approaches need to be adopted with caution because the statistical tests and measures of heterogeneity typically require a large number of studies to yield reliable results.
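The fixed versus random effects choice can be made concrete with the standard DerSimonian-Laird moment estimator of the between-study variance τ². For simplicity this sketch uses a single pairwise comparison (B versus A) with invented effects and standard errors rather than a full network:

```python
import math

# Hypothetical relative effects and standard errors from four B-vs-A trials
effects = [0.50, 0.10, 0.65, 0.05]
ses = [0.12, 0.12, 0.15, 0.10]

# Fixed effect: inverse-variance weights and pooled estimate
w = [1 / s**2 for s in ses]
d_fe = sum(wi * di for wi, di in zip(w, effects)) / sum(w)

# Cochran's Q and the DerSimonian-Laird estimate of tau^2
q = sum(wi * (di - d_fe) ** 2 for wi, di in zip(w, effects))
df = len(effects) - 1
c = sum(w) - sum(wi**2 for wi in w) / sum(w)
tau2 = max(0.0, (q - df) / c)   # Q well above df signals heterogeneity

# Random effects: weights incorporate the between-study variance
w_re = [1 / (s**2 + tau2) for s in ses]
d_re = sum(wi * di for wi, di in zip(w_re, effects)) / sum(w_re)
```

Here Q exceeds its degrees of freedom, τ² is positive, and the random effects estimate (the average effect across studies) differs from the fixed effect estimate and carries wider uncertainty; with only a handful of studies, such statistics are imprecise, which is the caution raised above.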
12. If a random effects model was used, were assumptions about heterogeneity explored or discussed?
With a random effects model, the between-study variation in treatment effects for the direct comparisons is explicitly taken into consideration by estimating a value for this heterogeneity, which is subsequently incorporated into the estimation of the summary effects. Since a network meta-analysis includes multiple treatment comparisons, one has the option to assume that the amount of heterogeneity is either the same for every treatment comparison or different. Frequently, however, it is assumed that the amount of heterogeneity is the same for all comparisons. Exploration, or at least a discussion, of this assumption is recommended. In many cases it is not possible to explore the assumption of comparison-specific heterogeneity with alternative statistical models because of limited data. If a valid rationale for the use of a fixed effects model was provided, answer this question with “yes.”
13. If there are indications of heterogeneity, were subgroup analyses or meta-regression analyses with pre-specified covariates performed?
Heterogeneity in relative treatment effects (i.e., true variation in relative treatment effects across studies comparing the same interventions) can be captured with random effects models, but the analysis will provide the average relative treatment effect across the different levels of the responsible effect modifier(s). This finding may not be very informative for decision-making, especially if there are great differences in relative treatment effects for the different levels of the effect modifiers [30]. It is more informative to estimate relative treatment effects for the different levels of the effect modifier, either with subgroup analyses or with a meta-regression analysis in which treatment effects are modeled as a function of the covariate, as illustrated in Figure 12.
Often there is a limited number of trials in a network meta-analysis, while many trial and patient characteristics may differ across studies. Deciding which covariates to include in the meta-regression models used for the network meta-analysis based on observed patterns in the trial data can lead to false conclusions regarding the sources of heterogeneity [23,24]. To avoid data dredging, it is strongly recommended to pre-specify the potential treatment effect modifiers that will be investigated.
In the absence of meta-regression analyses or subgroup analyses, this question can only be answered with “yes” if there are no indications of heterogeneity and a valid rationale for a fixed effects model was provided.
Reporting Quality & Transparency
The next 6 questions pertain to the transparency of the presentation of the evidence base and the results of the network meta-analysis. With a sufficiently transparent presentation, the credibility of the findings given the available studies can be accurately assessed and, if of interest, the analysis replicated.
14. Is a graphical or tabular representation of the evidence network provided, with information on the number of RCTs per direct comparison?
To help understand the findings of a network meta-analysis, an overview of the included RCTs is required. The evidence base can be summarized with an evidence network in which the available direct comparisons are reflected as edges (i.e., lines) between the different interventions, along with the number of RCTs per direct comparison. It is recommended that any trials comparing more than two interventions (i.e., trials with more than two arms) be highlighted. With such a network it is immediately clear for which treatment contrasts there is direct evidence, indirect evidence, or both [31]. A table with the studies in the rows, the interventions in the columns, and the observed individual study results for each intervention of each study in the cells is very informative as well.
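Such a network summary is easy to tabulate programmatically. The trial list below is hypothetical; arms are listed in a consistent (alphabetical) order so that edge keys match:

```python
from collections import Counter

# Hypothetical evidence base: the treatments compared within each RCT
rcts = [
    ("A", "B"), ("A", "B"),   # two AB trials
    ("A", "C"),               # one AC trial
    ("B", "C"),               # one BC trial
    ("A", "B", "C"),          # one three-arm trial (to be highlighted)
]

# Edges of the network: one edge per direct comparison, counting RCTs;
# a multi-arm trial contributes to every pairwise edge among its arms
edges = Counter()
for arms in rcts:
    for i in range(len(arms)):
        for j in range(i + 1, len(arms)):
            edges[(arms[i], arms[j])] += 1

# Multi-arm trials, which should be flagged in the network diagram
multi_arm = [arms for arms in rcts if len(arms) > 2]
```

The resulting counts (AB: 3, AC: 2, BC: 2) are exactly what a network diagram would display on its edges.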
15. Are the individual study results reported?
In order to assess the (face) validity of the results of the network meta-analysis, the individual study results need to be provided, either in the publication or in an online supplement. More specifically, presentation of the individual study results allows reviewers to compare these with the results of the network meta-analysis. It will also facilitate replication of the analysis, if desired.
16. Are results of direct comparisons reported separately from results of the indirect comparisons or network meta-analysis?
To judge whether the assumption of consistency between direct and indirect evidence holds, estimates of (pooled) direct comparisons can be compared with estimates obtained from the corresponding indirect comparisons. However, this is not a trivial task. A more pragmatic approach is to present the (pooled) direct evidence separately from the results of the network meta-analysis in which direct and indirect evidence for some comparisons (i.e., in the presence of closed loops) are combined. Although the absence of a difference between these two sets of results does not guarantee that there is no inconsistency, the opposite does hold: if the results based on direct evidence are systematically different from the results based on the combination of direct and indirect evidence, then the indirect evidence has to be inconsistent with the direct evidence.
17. Are all pairwise contrasts between interventions, as obtained with the network meta-analysis, reported along with measures of uncertainty?
With a network meta-analysis, relative treatment effect estimates between all the interventions included in the analysis can be obtained. For decision-making it is very informative when all these possible contrasts are presented. Equally important, for every relative treatment effect that is estimated, measures of uncertainty need to be presented (i.e., 95% confidence intervals or 95% credible intervals, which will be defined under the next question).
18. Is a ranking of interventions provided, given the reported treatment effects and their uncertainty, by outcome?
A network meta-analysis can be performed in a frequentist or a Bayesian framework. The result of a frequentist network meta-analysis comparing treatments is an estimate of the relative treatment effect along with a p-value and 95% confidence interval. The p-value indicates whether the results are statistically ‘significant’ or ‘non-significant’. It reflects the probability of obtaining a test statistic at least as extreme as the one actually observed, assuming that the null hypothesis (e.g., no difference between treatments) is true [32]. This is not very useful for decision-making because it does not provide information on the probability that a hypothesis (e.g., one treatment is better than the other) is true or false. Moreover, when the decision-maker is faced with a choice between more than two treatments, the p-value associated with each pairwise comparison in a network meta-analysis becomes even more difficult to interpret because it does not provide information about the ranking of treatments. The 95% confidence intervals corresponding to the effect estimates obtained within a frequentist framework cannot be interpreted in terms of probabilities either; a 95% confidence interval does not mean there is a 95% probability that the true value lies between the boundaries of the interval.
Within the Bayesian framework, the treatment effects to be estimated are considered random variables [33]. The belief regarding the treatment effect size before looking at the data can be summarized with a probability distribution: the prior distribution. This probability distribution is updated after the data have been observed, resulting in the posterior distribution, which summarizes the updated belief regarding the likely values of the effect size. The output of a Bayesian network meta-analysis is a (joint) posterior distribution of all relative treatment effects between the interventions included in the network. The posterior distribution for each relative treatment effect can be summarized with a mean or median, reflecting the most likely value of the effect size, as well as the 2.5th and 97.5th percentiles: the 95% credible interval. Figure 13 summarizes the output obtained with frequentist and Bayesian network meta-analyses. In the frequentist framework we obtain relative treatment effects of each intervention relative to a control, along with 95% confidence intervals (and p-values). In the Bayesian framework we obtain posterior probability distributions summarizing the likely values of the treatment effect of each intervention relative to a control.
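For a single treatment effect, this prior-to-posterior updating can be sketched with a conjugate normal-normal model. All numbers are invented, and in practice Bayesian network meta-analysis posteriors are obtained by simulation (e.g., MCMC) rather than in closed form:

```python
import math

# Hypothetical normal-normal conjugate update for one relative treatment effect
prior_mean, prior_var = 0.0, 100.0    # vague prior belief about the effect
data_mean, data_var = -0.30, 0.04     # observed effect and its variance

# Posterior: precision-weighted combination of prior and data
post_var = 1 / (1 / prior_var + 1 / data_var)
post_mean = post_var * (prior_mean / prior_var + data_mean / data_var)

# 95% credible interval: central 95% of the (normal) posterior distribution
lo = post_mean - 1.96 * math.sqrt(post_var)
hi = post_mean + 1.96 * math.sqrt(post_var)
# Interpretation: there is a 95% probability that the true effect lies in (lo, hi)
```

Because the prior is vague, the posterior mean sits essentially at the observed effect; the credible interval, unlike a confidence interval, supports the direct probability statement in the final comment.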
Figure 13: Frequentist versus Bayesian output of a network meta-analysis
Since the posterior distribution is a probability distribution, it allows for probability statements. Unlike the 95% confidence interval obtained within the frequentist framework, the 95% credible interval can be interpreted in terms of probabilities: there is a 95% chance that the true effect size falls between the boundaries of the credible interval. Consequently, in a network meta-analysis fitted within a Bayesian framework, the multiple inferences based on confidence intervals or p-values can be replaced with probability statements (see Figure 14): for example, “there is an x% probability that treatment C is better than B,” “there is a y% probability that treatment D is the most efficacious of treatments A, B, C, D, and E regarding this outcome,” or “there is a z% probability that intervention E is the least efficacious” [34].
The probabilities that each treatment ranks 1st, 2nd, 3rd, and so on out of all interventions compared are called rank probabilities and are based on the location, spread, and overlap of the posterior distributions of the relative treatment effects. Rank probabilities can be summarized in a graph with the rank (from 1 to the number of treatments in the analysis) on the horizontal axis and probability on the vertical axis. For each treatment, the probability is plotted against the rank and the points are connected by treatment: a rankogram, as illustrated in Figure 15 [28]. Note that solely presenting the probability of being the best can result in erroneous conclusions regarding the relative ranking of treatments, because interventions for which there is a lot of uncertainty (i.e., wide credible intervals) are more likely to be ranked best. The benefit of ranking probabilities is that they summarize the distribution of effects, thereby acknowledging both location and uncertainty.
Technically, ranking probabilities can also be obtained when a network meta-analysis has been performed in a frequentist framework, by using resampling techniques (such as bootstrapping). However, this implies a Bayesian interpretation of the frequentist analysis.
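Rank probabilities are computed by repeatedly drawing from the posterior distributions and counting how often each treatment occupies each rank. The sketch below uses invented, independent normal posterior summaries (a real joint posterior would carry correlations between effects) and assumes lower values are better:

```python
import random

random.seed(1)

# Hypothetical posterior summaries (mean, sd) of effects vs placebo;
# lower = better on this invented scale
posteriors = {
    "A": (0.00, 0.05),
    "B": (-0.30, 0.10),
    "C": (-0.25, 0.30),   # similar mean to B but much more uncertain
}

n_samples = 20000
rank_counts = {t: [0] * len(posteriors) for t in posteriors}

for _ in range(n_samples):
    # One joint draw: one value per treatment
    draw = {t: random.gauss(m, s) for t, (m, s) in posteriors.items()}
    # Sort treatments best (lowest) first; index 0 = rank 1
    for rank, t in enumerate(sorted(draw, key=draw.get)):
        rank_counts[t][rank] += 1

# Rank probabilities: the rows of a rankogram
rank_probs = {t: [c / n_samples for c in counts]
              for t, counts in rank_counts.items()}
p_best = {t: rank_probs[t][0] for t in posteriors}
```

In this invented example, the highly uncertain treatment C retains a sizeable probability of being best despite a slightly worse mean, which is exactly why the probability of being best should not be read in isolation from the full rankogram.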
Figure 14: Probabilistic interpretation of the posterior distribution within a Bayesian framework
Figure 15: ‘Rankograms’ showing the probability for each treatment to be at a specific rank in the treatment hierarchy
19. Is the impact of important patient characteristics on treatment effects reported?
If it has been determined that certain patient characteristics are effect modifiers and differ across studies, then it is of interest to report relative treatment effects for different levels of the effect modifier, as obtained with meta-regression or subgroup analyses. Factors of interest can include, for example, gender, severity of disease, distribution of biomarkers, and treatment history.
Interpretation
20. Are the conclusions fair and balanced?
If the conclusions are in line with the reported results of the network meta-analysis, the available evidence base, the credibility of the analysis methods, and any concerns about bias, then the conclusions can be considered fair and balanced.
Conflict of Interest
21. Were there any potential conflicts of interest?
Conflicts of interest may exist when an author (or the author's institution or employer) has financial or personal relationships or affiliations that could influence (or bias) the author's decisions, work, or manuscript.
22. If yes, were steps taken to address these?
In order to address potential conflicts of interest, all aspects of the conflicts should be noted (including the specific type and relationship of each conflict of interest), and the publication should be peer-reviewed. The contribution of each author should be clearly noted to document full disclosure of activities. In addition, a fair and balanced exposition, including the breadth and depth of the study's limitations, should be provided.
DISCUSSION
The objective was to develop a questionnaire to assess the relevance and credibility of network meta-analyses to help inform healthcare decision-making. Relevance concerns the extent to which the network meta-analysis is applicable to the problem faced by the decision-maker. Credibility concerns the extent to which the findings of the network meta-analysis are valid and trustworthy. The questionnaire also has an educational purpose: to raise awareness that evidence evaluation can be challenging and that important elements may not be obvious to all potential users.
The questionnaire assists evidence evaluators in applying a structured and consistent approach. The user provides his or her impression of the overall relevance and credibility of each network meta-analysis assessed. The questionnaire does not determine an overall impression or summary score. Although some may be interested in such scores to facilitate overall evidence synthesis, the use of such scores can be misleading. In addition, the applicability of a study may depend on whether there is any other evidence that addresses the specific issue or the decision being made. In general, an evidence evaluator needs to be aware of the strengths and weaknesses of each piece of evidence and apply his or her own reasoning.
User testing
The Task Force wanted to strike a balance between simplicity and comprehensiveness to ensure a questionnaire useful for end users who are not necessarily methodologists. Approximately 22 persons were solicited to participate in user testing; the response rate was 82%. Each volunteer was asked to evaluate three published network meta-analyses, rated by the Task Force as “good quality,” “medium quality,” and “poor quality,” respectively. There were not enough users to perform a formal psychometric evaluation of the questionnaire, but some interesting descriptive observations could be made. Multi-rater agreement, calculated as the percentage agreement over the response categories “yes,” “no,” “not applicable,” “not reported,” and “insufficient information” for each credibility question, exceeded 80% for 8 of the 22 questions. The average agreement score was 72%, with a range of 42%-91%. The lowest agreement scores were observed for credibility questions 5, 6, 10, 13, and 19, which relate to the key concepts of effect modifiers, inconsistency, and heterogeneity. Agreement in responses was greater for the well-reported, good quality study as determined by the Task Force. Missing ratings across the 4 credibility domains varied between 6% and 17%. For the overall summary question -- is the credibility “sufficient” or “insufficient”? -- there was 83% agreement among the ratings provided. The good quality study was generally rated as sufficient with respect to relevance and credibility, while the poor quality study was generally rated as insufficient. Based on these findings, it can be concluded that the questionnaire in its current form can be helpful for decision-making, but there is a clear need for education. Refinements of the questionnaire are expected to be made over time.
Educational needs
Across many international jurisdictions, the resources and expertise available to inform health care decision-makers vary widely. While there is broad experience in evaluating evidence from randomized controlled clinical trials, there is less experience with network meta-analysis among decision-makers. ISPOR has provided Good Research Practice recommendations on network meta-analysis [14,15]. This questionnaire is an extension of those recommendations and serves as a platform to assist decision-makers in understanding what a systematic evaluation of this research requires. By understanding what a systematic, structured approach to the appraisal of this research entails, decision-makers will, it is hoped, become more sophisticated users of this evidence. To that end, we anticipate that additional educational efforts and promotion of this questionnaire will be developed and made available. In addition, an interactive (i.e., web-based) questionnaire would facilitate uptake and support the questionnaire's educational goal.
Conclusion
The Task Force developed a consensus-based questionnaire to help decision-makers assess the relevance and credibility of indirect treatment comparisons and network meta-analyses to inform healthcare decision-making. The questionnaire aims to provide a guide for assessing the degree of confidence that should be placed in a network meta-analysis, and it enables decision-makers to gain awareness of the subtleties involved in evaluating these kinds of studies. It is anticipated that user feedback will permit periodic evaluation and modification of the questionnaire. The goal is to make the questionnaire as useful as possible to the health care decision-making community.
REFERENCES
1. Berger ML, et al. A Prospective Observational Study Questionnaire to Assess Study Relevance and Credibility to Inform Healthcare Decision-Making: An ISPOR-AMCP-NPC Good Practice Task Force Report. Value in Health 17:1.
2. Martin B, et al. A Retrospective Observational Study Questionnaire to Assess Study Relevance and Credibility to Inform Healthcare Decision-Making: An ISPOR-AMCP-NPC Good Practice Task Force Report. Value in Health 17:1.
3. Caro JJ, et al. A Modeling Study Questionnaire to Assess Study Relevance and Credibility to Inform Healthcare Decision-Making: An ISPOR-AMCP-NPC Good Practice Task Force Report. Value in Health 17:1.
4. Higgins JPT, Green S, eds. Cochrane Handbook for Systematic Reviews of Interventions Version 5.0.2 [updated September 2009]. The Cochrane Collaboration, 2009. Available from http://www.cochrane-handbook.org. [Accessed Feb 11, 2010].
5. Caldwell DM, Ades AE, Higgins JPT. Simultaneous comparison of multiple treatments: combining direct and indirect evidence. BMJ 2005;331:897-900.
6. Ioannidis JPA. Indirect comparisons: the mesh and mess of clinical trials. Lancet 2006;368:1470-2.
7. Sutton A, Ades AE, Cooper N, Abrams K. Use of indirect and mixed treatment comparisons for technology assessment. Pharmacoeconomics 2008;26:753-67.
8. Wells GA, Sultan SA, Chen L, et al. Indirect Evidence: Indirect Treatment Comparisons in Meta-Analysis. Ottawa: Canadian Agency for Drugs and Technologies in Health; 2009.
9. Bucher HC, Guyatt GH, Griffith LE, Walter SD. The results of direct and indirect treatment comparisons in meta-analysis of randomized controlled trials. J Clin Epidemiol 1997;50:683-91.
10. Song F, Altman DG, Glenny A, Deeks JJ. Validity of indirect comparison for estimating efficacy of competing interventions: empirical evidence from published meta-analyses. BMJ 2003;326:472.
11. Lu G, Ades AE. Combination of direct and indirect evidence in mixed treatment comparisons. Stat Med 2004;23:3105-24.
12. Song F, Loke YK, Walsh T, et al. Methodological problems in the use of indirect comparisons for evaluating healthcare interventions: survey of published systematic reviews. BMJ 2009;338:b1147.
13. Dias S, Sutton AJ, Ades AE, Welton NJ. Evidence Synthesis for Decision Making 2: A Generalized Linear Modeling Framework for Pairwise and Network Meta-analysis of Randomized Controlled Trials. Med Decis Making 2013;33:607-17.
14. Jansen JP, Fleurence R, Devine B, Itzler R, Barrett A, Hawkins N, Lee K, Boersma C, Annemans L, Cappelleri JC. Interpreting Indirect Treatment Comparisons & Network Meta-Analysis for Health Care Decision-making: Report of the ISPOR Task Force on Good Research Practices - Part 1. Value in Health 2011;14:417-28.
15. Hoaglin DC, Hawkins N, Jansen JP, Scott DA, Itzler R, Cappelleri JC, Boersma C, Thompson D, Larholt KM, Diaz M, Barrett A. Conducting Indirect Treatment Comparison and Network Meta-Analysis Studies: Report of the ISPOR Task Force on Indirect Treatment Comparisons Good Research Practices - Part 2. Value in Health 2011;14:429-37.
16. Jüni P, Witschi A, Bloch R, Egger M. The Hazards of Scoring the Quality of Clinical Trials for Meta-Analysis. JAMA 1999;282:1054-60.
17. Ades AE, Caldwell DM, Reken S, Welton NJ, Sutton AJ, Dias S. Evidence Synthesis for Decision Making 7: A Reviewer's Checklist. Med Decis Making 2013;33:679-91.
18. Donegan S, Williamson P, Gamble C, Tudur-Smith C. Indirect comparisons: a review of reporting and methodological quality. PLoS One 2010;5:e11054.
19. Higgins JP, Altman DG, Gøtzsche PC, Jüni P, Moher D, Oxman AD, Savovic J, Schulz KF, Weeks L, Sterne JA; Cochrane Bias Methods Group; Cochrane Statistical Methods Group. The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ 2011;343.
20. Dwan K, Gamble C, Kolamunnage-Dona R, Mohammed S, Powell C, Williamson PR. Assessing the potential for outcome reporting bias in a review: a tutorial. Trials 2010;11:52.
21. Jansen JP, Crawford B, Bergman G, Stam W. Bayesian meta-analysis of multiple treatment comparisons: an introduction to mixed treatment comparisons. Value Health 2008;11:956-64.
22. Jansen JP, Schmid CH, Salanti G. Directed acyclic graphs can help understand bias in indirect and mixed treatment comparisons. J Clin Epidemiol 2012;65:798-807.
23. Thompson SG, Higgins JP. How should meta-regression analyses be undertaken and interpreted? Stat Med 2002;21:1559-73.
24. Schmid CH, Stark PC, Berlin JA, Landais P, Lau J. Meta-regression detected associations between heterogeneous treatment effects and study-level, but not patient-level, factors. J Clin Epidemiol 2004;57:683-97.
25. Cooper NJ, Sutton AJ, Morris D, et al. Addressing between-study heterogeneity and inconsistency in mixed treatment comparisons: application to stroke prevention treatments in individuals with non-rheumatic atrial fibrillation. Stat Med 2009;28:1861-81.
26. Salanti G, Marinho V, Higgins JP. A case study of multiple-treatments meta-analysis demonstrates that covariates should be considered. J Clin Epidemiol 2009;62:857-64.
27. Dias S, Welton N, Marinho VCC, Salanti G, Ades AE. Estimation and adjustment of bias in randomised evidence using mixed treatment comparison meta-analysis. J R Stat Soc (A) 2010;173:613-29.
28. Lu G, Ades AE. Assessing evidence inconsistency in mixed treatment comparisons. J Am Stat Assoc 2006;101:447-59.
29. Dias S, Welton NJ, Caldwell DM, Ades AE. Checking consistency in mixed treatment comparison meta-analysis. Stat Med 2010;29:932-44.
30. Ades AE, Lu G, Higgins JP. The interpretation of random-effects meta-analysis in decision models. Med Decis Making 2005;25:646-54.
31. Salanti G, Kavvoura FK, Ioannidis JPA. Exploring the geometry of treatment networks. Ann Intern Med 2008;148:544-53.
32. Goodman SN. Toward evidence-based medical statistics: 1. The P value fallacy. Ann Intern Med 1999;130:995-1004.
33. Gelman AB, Carlin JS, Stern HS, Rubin DB. Bayesian Data Analysis. Boca Raton: Chapman and Hall-CRC; 1995.
34. Salanti G, Ades AE, Ioannidis JP. Graphical methods and numerical summaries for presenting results from multiple-treatment meta-analysis: an overview and tutorial. J Clin Epidemiol 2011;64:163-71.
APPENDIX A: QUESTIONNAIRE TO ASSESS THE RELEVANCE AND CREDIBILITY OF A NETWORK META-ANALYSIS
The questionnaire consists of 26 questions related to the relevance and credibility of an indirect
treatment comparison or network meta-analysis.
Relevance questions relate to the usefulness of the network meta-analysis for informing health
care decision making. Each question is scored Yes / No / Can't Answer. Based on the scores
for the individual questions, the overall relevance of the network meta-analysis is judged as
Sufficient or Insufficient.
If the network meta-analysis is considered sufficiently relevant, its credibility is then assessed.
Credibility is captured with questions in the following 5 domains: evidence base used for the
indirect comparison or network meta-analysis; analysis; reporting quality and transparency;
interpretation; and conflict of interest. Each question is scored Yes / No / Can't Answer. Based
on the number of questions scored satisfactorily in each domain, an overall judgment of the
strength of each domain is provided: Strength / Neutral / Weakness / Fatal flaw. If any item is
scored No in a way that constitutes a fatal flaw, the domain as a whole is scored as a fatal flaw
and the network meta-analysis may have serious validity issues. Based on the domain
judgments, the overall credibility of the network meta-analysis is judged as Sufficient or
Insufficient.
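As a purely hypothetical illustration (this is not part of the Task Force's procedure, and the function name, score encoding, and fatal-item set are invented), the fatal-flaw rule described above could be operationalized as follows:

```python
def domain_judgment(scores, assessor_call, fatal_items=frozenset()):
    # scores: {question_number: "Yes" | "No" | "Can't answer"}
    # assessor_call: the assessor's own overall judgment for the domain
    #   ("Strength" | "Neutral" | "Weakness") when no fatal flaw applies;
    #   the mapping from item scores to that call is deliberately left to
    #   the assessor's judgment, not an algorithm.
    # fatal_items: question numbers where a "No" constitutes a fatal flaw
    #   (e.g., question 7 on naive comparisons in the Analysis domain).
    if any(scores.get(q) == "No" for q in fatal_items):
        return "Fatal flaw"
    return assessor_call

print(domain_judgment({7: "No", 8: "Yes"}, "Strength", fatal_items={7}))  # → Fatal flaw
```

The point of the sketch is only that a single fatal-flaw item overrides whatever overall call the assessor would otherwise make for the domain.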
Relevance

Each question is scored Yes / No / Can't answer; "Can't answer" is further specified as Not reported, Insufficient information, or Other reason*. The answer marked "Strength" counts as a strength and the answer marked "Weakness" as a weakness.

1. Is the population relevant? (Strength: Yes / Weakness: No)
2. Are any critical interventions missing? (Strength: No / Weakness: Yes)
3. Are any relevant outcomes missing? (Strength: No / Weakness: Yes)
4. Is the context (e.g., settings and circumstances) applicable to your population? (Strength: Yes / Weakness: No)

Comments:

* Other reasons can include insufficient training of the assessor. Please specify in the comments section.

Is the relevance of the network meta-analysis sufficient or insufficient for your decision-problem?
Credibility

Evidence base used for the indirect comparison or network meta-analysis

Each question is scored Yes / No / Can't answer (Not reported / Insufficient information / Other reason*).

1. Did the researchers attempt to identify and include all relevant randomized controlled trials?** (Strength: Yes / Weakness: No)
2. Do the trials for the interventions of interest form one connected network of randomized controlled trials? (Strength: Yes / Weakness: No)
3. Is it apparent that poor quality studies were included, thereby leading to bias? (Strength: No / Weakness: Yes)
4. Is it likely that bias was induced by selective reporting of outcomes in the studies? (Strength: No / Weakness: Yes)
5. Are there systematic differences in treatment effect modifiers (i.e., baseline patient or study characteristics that impact the treatment effects) across the different treatment comparisons in the network? (Strength: No / Weakness: Yes)
6. If yes (i.e., there are such systematic differences in treatment effect modifiers), were these imbalances in effect modifiers across the different treatment comparisons identified prior to comparing individual study results? (Strength: Yes / Weakness: No / Not applicable)

Overall judgment (Strength / Neutral / Weakness)

* Other reasons can include insufficient training of the assessor.
** To help answer this item, consider the following sub-questions:
- Was the literature search strategy defined in such a way that available randomized controlled trials concerning the relevant interventions could be identified?
- Were multiple databases searched (e.g., MEDLINE, EMBASE, Cochrane)?
- Were review selection criteria defined in such a way that relevant randomized controlled trials were included (if identified with the literature search)?
Analysis

Each question is scored Yes / No / Can't answer (Not reported / Insufficient information / Other reason*).

7. Were statistical methods used that preserve within-study randomization? (No naive comparisons.) (Strength: Yes / Weakness: No -> Fatal flaw)
8. If both direct and indirect comparisons are available for pairwise contrasts (i.e., closed loops), was agreement in treatment effects (i.e., consistency) evaluated or discussed? (Strength: Yes / Weakness: No / Not applicable)
9. In the presence of consistency between direct and indirect comparisons, were both direct and indirect evidence included in the network meta-analysis? (Strength: Yes / Weakness: No / Not applicable)
10. With inconsistency or an imbalance in the distribution of treatment effect modifiers across the different types of comparisons in the network of trials, did the researchers attempt to minimize this bias with the analysis?** (Strength: Yes / Weakness: No)
11. Was a valid rationale provided for the use of random effects or fixed effects models? (Strength: Yes / Weakness: No)
12. If a random effects model was used, were assumptions about heterogeneity explored or discussed?*** (Strength: Yes / Weakness: No)
13. If there are indications of heterogeneity, were subgroup analyses or meta-regression analysis with pre-specified covariates performed?*** (Strength: Yes / Weakness: No)

Overall judgment (Strength / Neutral / Weakness / Fatal flaw)

* Other reasons can include insufficient training of the assessor.
** In the absence of inconsistency and of differences in effect modifiers across comparisons, this item is scored "yes." If there are inconsistencies or systematic differences in effect modifiers across comparisons, this item is scored "yes" if models were used that capture the inconsistency, or if meta-regression models were used that are expected to explain or adjust for the inconsistency or bias.
*** If a valid rationale for a fixed effects model was provided, score "yes."
Reporting Quality & Transparency

Each question is scored Yes / No / Can't answer (Not reported / Insufficient information / Other reason*).

14. Is a graphical or tabular representation of the evidence network provided, with information on the number of RCTs per direct comparison? (Strength: Yes / Weakness: No)
15. Are the individual study results reported? (Strength: Yes / Weakness: No)
16. Are results of direct comparisons reported separately from results of the indirect comparisons or network meta-analysis? (Strength: Yes / Weakness: No)
17. Are all pairwise contrasts between interventions as obtained with the network meta-analysis reported along with measures of uncertainty? (Strength: Yes / Weakness: No)
18. Is a ranking of interventions provided, given the reported treatment effects and their uncertainty, by outcome? (Strength: Yes / Weakness: No)
19. Is the impact of important patient characteristics on treatment effects reported? (Strength: Yes / Weakness: No)

Overall judgment (Strength / Neutral / Weakness)

* Other reasons can include insufficient training of the assessor.
Interpretation

Each question is scored Yes / No / Can't answer (Not reported / Insufficient information / Other reason*).

20. Are the conclusions fair and balanced? (Strength: Yes / Weakness: No)

Overall judgment (Strength / Neutral / Weakness)

* Other reasons can include insufficient training of the assessor.

Conflict of Interest

21. Were there any potential conflicts of interest?** (Strength: No / Weakness: Yes)
22. If yes, were steps taken to address these?*** (Strength: Yes / Weakness: No)

Overall judgment (Strength / Neutral / Weakness)

* Other reasons can include insufficient training of the assessor.
** Conflicts of interest may exist when an author (or the author's institution or employer) has financial or personal relationships or affiliations that could influence (or bias) the author's decisions, work, or manuscript.
*** To address potential conflicts of interest, all aspects should be noted, including the specific type and relationship of the conflict of interest, and the publication should be peer-reviewed. The contribution of each author should be clearly noted to document full disclosure of activities. In addition, a fair and balanced exposition, including the breadth and depth of the study's limitations, should be provided.

Is the credibility of the network meta-analysis sufficient or insufficient?
APPENDIX B: GLOSSARY
Closed loop
A network of linked RCTs that is characterized by both direct and indirect
evidence for a particular comparison.
For example, when in addition to AB and AC studies the evidence base also
consists of BC studies, there is both direct and indirect evidence for each of
the three pairwise comparisons (AB, AC, BC).
(A network of only AB and AC trials does not have a closed loop; there is
only indirect evidence for the BC comparison.)
Confounding
Confounding bias in indirect comparisons or network meta-analysis implies that the estimated
relative treatment effect between two interventions, obtained by indirectly comparing results
from different trials, is affected by differences in the distribution of effect modifiers between
these trials. For example, if the AB trial is performed in a population with 70% severe disease
and 30% moderate disease and the AC trial is performed in a population with 30% severe
disease and 70% moderate disease -- and it is known that severity of disease influences the
efficacy of C or B or both -- then the indirect comparison of C versus B is affected by
confounding bias.
Note that in observational, non-randomized head-to-head studies, confounding bias arises if
there is a difference in prognostic factors or effect modifiers or both. In indirect comparisons of
RCTs, confounding bias arises only if there are imbalances in effect modifiers between the
trials being indirectly compared.
[Figures: network of AB and AC randomized controlled trials (not a closed loop); network of AB, AC, and BC randomized controlled trials (closed loop).]
Consistency
There is consistency when the treatment effects from head-to-head
comparisons are in agreement with treatment effects obtained from indirect
comparisons. Consistency between direct and indirect estimated effects is the
manifestation of transitivity. Transitivity refers to the case when there are no
systematic differences in treatment effect modifiers between the studies that
provide indirect evidence and the group of studies that provide direct
evidence.
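As a rough sketch of one way such agreement can be checked on a single closed loop (a Bucher-style z-test on the difference between the direct and indirect estimates; the log odds ratios below are made up for illustration):

```python
import math

def consistency_z(d_direct, se_direct, d_indirect, se_indirect):
    # z-statistic for the discrepancy between the direct and indirect
    # estimates of the same contrast (both on the log scale);
    # |z| > 1.96 flags inconsistency at the conventional 5% level.
    diff = d_direct - d_indirect
    se_diff = math.sqrt(se_direct**2 + se_indirect**2)
    return diff / se_diff

# Illustrative (invented) log odds ratios for the BC contrast:
z = consistency_z(d_direct=-0.30, se_direct=0.15,
                  d_indirect=-0.25, se_indirect=0.25)
print(round(z, 2))  # small |z|: no evidence of inconsistency
```

Published analyses typically use more general approaches (e.g., node-splitting in a full network model), but the underlying comparison is the same.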
Contrast
A pairwise treatment comparison of interest that can be estimated
based on direct evidence, indirect evidence, or both.
For example, the BC contrast (i.e. comparison) can be estimated with BC trials
(direct evidence), AB combined with AC trials (indirect evidence), or both.
Decision set
A set of randomized controlled trials for which at least one of the arms of each trial concerns an intervention of interest from a decision-making perspective.
Direct comparison / direct evidence
The evaluation of two or more treatment alternatives in a randomized (placebo-) controlled trial.
Effect-modifier
A study or patient characteristic that influences the true treatment effect of one intervention
relative to an active comparator or to a placebo.
For example, the relative risk of an adverse event with treatment B relative to A is greater
among men than women. Gender is the effect modifier.
Effect modifiers cause heterogeneity. If there is variation in effect modifiers across subgroups
or studies, there will be true variation in treatment effects among these subgroups or studies
comparing the same interventions.
Effect modifiers can cause inconsistency in network meta-analysis. If there is a real imbalance
in the distribution of effect modifiers between studies comparing different sets of interventions,
the corresponding indirect comparison is biased or invalid.
Evidence network
A network of randomized controlled trials in which each trial has at least one intervention in common with another trial.
Evidence set
An evidence network that includes all studies that provide either direct
evidence, indirect evidence, or both for the interventions of interest. This set
can include trials that do not include interventions of interest but provide a link
between trials evaluating interventions of interest.
For example, if the interventions B and D are of interest for a decision-maker
but are studied only in AB trials and CD trials, then AC trials provide relevant
evidence to facilitate this comparison.
Fixed effects model
A meta-analysis model to combine treatment effects from different studies
based on the assumption that all studies involving the same treatments aim to
estimate the same treatment effect. Observed variation in treatment effects for
a particular type of comparison (e.g., AB) is simply due to chance. (See Figure
7, example without heterogeneity.)
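The fixed effects assumption can be illustrated with a minimal inverse-variance pooling sketch; the log odds ratios and standard errors below are invented for illustration and not tied to any study in this report:

```python
import math

def fixed_effect_pool(effects, ses):
    # Inverse-variance fixed-effect pooling: each study is weighted by
    # 1/variance, under the assumption that all studies estimate the
    # same true treatment effect and differ only by chance.
    weights = [1.0 / se**2 for se in ses]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Three hypothetical AB trials reporting log odds ratios:
pooled, se = fixed_effect_pool([-0.50, -0.30, -0.45], [0.20, 0.25, 0.15])
print(round(pooled, 3), round(se, 3))  # pooled estimate and its standard error
```

Note how the most precise study (smallest standard error) receives the largest weight.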
Heterogeneity
Variation in treatment effects across trials beyond what can be expected due to chance. The
studies aim to estimate different, possibly context-specific, treatment effects; there are
systematic differences in known or unknown study or patient characteristics across studies that
influence the treatment effects.
[Figure: four AB studies with variation in true treatment effects (heterogeneity), caused by variation in disease severity (but not age) across the studies that influences the treatment effect.]
Indirect comparison / indirect
evidence
A comparison of interventions that were studied in different trials. For
example, the estimated treatment effects (e.g. odds ratios) of AB studies and
AC studies can be indirectly compared to get an estimate for a BC
comparison. (See Figure 4).
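A minimal numerical sketch of the adjusted indirect comparison (the Bucher method [9]), using made-up log odds ratios versus the common comparator A:

```python
import math

def bucher_indirect(d_ab, se_ab, d_ac, se_ac):
    # Adjusted indirect comparison: the BC effect is the difference of the
    # AC and AB effects, which cancels the common comparator A while
    # preserving within-study randomization.
    d_bc = d_ac - d_ab
    se_bc = math.sqrt(se_ab**2 + se_ac**2)  # variances of the two sources add
    return d_bc, se_bc

d_bc, se_bc = bucher_indirect(d_ab=-0.40, se_ab=0.10, d_ac=-0.65, se_ac=0.12)
print(round(d_bc, 2), round(se_bc, 3))  # → -0.25 0.156
```

Because the variances add, the indirect estimate is always less precise than either of the direct estimates it is built from.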
Measures of uncertainty
95% confidence intervals, standard errors, p-values, or 95% credible intervals.
Meta-regression
A meta-analysis model to combine treatment effects from different studies that
aims to explain heterogeneity or reduce inconsistency by explicitly capturing
the impact of study or patient characteristics on treatment effects.
Mixed treatment comparisons
Combination of treatment effect estimates of both direct and indirect evidence for a particular treatment comparison.
Naïve comparison
Indirect comparison where differences in study effects (or placebo effects) are
ignored, which is not recommended. For example, the response with drug B
from an AB trial is compared with the response with drug C from an AC trial,
ignoring differences in response with treatment A.
Network meta-analysis
A statistical method to combine evidence regarding more than two
interventions from different RCTs that are part of one evidence network.
If the available evidence base consists of a network of RCTs involving direct
and indirect treatment comparisons, it can be synthesized with a network
meta-analysis.
Outcome
An endpoint used to measure the effect of a treatment; it can be an efficacy or a safety outcome.
Prior distribution
Within the Bayesian framework, treatment effects are considered random variables, and we are
interested in how likely a value for the effect is given the observed data. Because the treatment
effects to be estimated are random variables, the belief regarding the possible values of the
treatment effect before looking at the data obtained in a study can be summarized with a
probability distribution: the prior distribution. This distribution is updated after the data have
been observed, resulting in the posterior distribution, which summarizes the updated
probabilities of the possible values for the treatment effect.
Posterior distribution
The probability distribution obtained with a Bayesian analysis that summarizes
the probability of the possible values for the treatment effects we are interested
in. (See Prior distribution for more information.)
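As a toy example of the prior-to-posterior update, consider a conjugate beta-binomial model with invented counts (actual network meta-analyses typically rely on MCMC sampling rather than this closed form):

```python
# Prior belief about a response probability: Beta(a, b); a flat Beta(1, 1)
# prior treats all response rates as equally plausible before seeing data.
a_prior, b_prior = 1, 1
r, n = 12, 40  # hypothetical trial arm: 12 responders out of 40 patients

# Conjugate update: the posterior is again a Beta distribution.
a_post, b_post = a_prior + r, b_prior + (n - r)
post_mean = a_post / (a_post + b_post)
print(round(post_mean, 3))  # 13/42, i.e. roughly 0.31
```

The posterior mean sits between the prior mean (0.5) and the observed proportion (0.30), pulled almost entirely toward the data because the prior is weak.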
Prognostic factor
A patient or study characteristic that has an equal impact on the outcome of interest in the intervention and control arms of a randomized controlled trial.
Random effects model
A statistical model to combine treatment effects from different studies based
on the assumption that there is variation in the true treatment effects (i.e.,
heterogeneity) across studies for a particular treatment comparison (e.g., AB).
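One common frequentist way to estimate this between-study variation is the DerSimonian-Laird method; the sketch below uses invented log odds ratios and is only one of several possible estimators:

```python
import math

def dersimonian_laird(effects, ses):
    # DerSimonian-Laird random-effects meta-analysis: estimates the
    # between-study variance tau^2 from Cochran's Q and adds it to each
    # study's variance before inverse-variance pooling.
    w = [1.0 / se**2 for se in ses]
    fixed = sum(wi * d for wi, d in zip(w, effects)) / sum(w)
    q = sum(wi * (d - fixed)**2 for wi, d in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # truncated at zero
    w_star = [1.0 / (se**2 + tau2) for se in ses]
    pooled = sum(wi * d for wi, d in zip(w_star, effects)) / sum(w_star)
    pooled_se = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled_se, tau2

# Three hypothetical AB trials with visibly heterogeneous effects:
pooled, pooled_se, tau2 = dersimonian_laird([-0.8, -0.1, -0.5], [0.2, 0.2, 0.2])
print(round(pooled, 3), round(tau2, 3))
```

A positive tau^2 widens the pooled interval relative to a fixed effects analysis, reflecting the extra between-study uncertainty.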
Study effect
True outcome in a study if there is (or would have been) no intervention. The
study effect reflects everything that is going on in a study other than the
(active) intervention(s).
Transitivity
If treatment B is more efficacious than treatment A, and treatment C is more efficacious than
treatment B, then treatment C must be more efficacious than A. Transitivity is the key
underlying assumption for indirect treatment comparisons and network meta-analysis. The
implication of the transitivity assumption is that there are no systematic differences in treatment
effect modifiers between studies comparing different interventions (e.g., the distributions of
effect modifiers in AB studies, AC studies, and BC studies are similar). Transitivity manifests
as agreement between the relative treatment effects obtained from direct and indirect evidence.
Treatment effect
Difference in outcomes between an active intervention and a control compared in a randomized controlled trial.
Within-study randomization
Randomization of patients to the alternative interventions compared in a randomized controlled
trial. As a result, the treatment effect (i.e., the difference in the outcome of interest between the
interventions compared) is, in expectation, not affected by confounding bias.