Master Thesis
Determining optimal cut-offs in the meta-analysis of diagnostic test
accuracy studies
Author:
Susanne Steinhauser
Supervisors:
Prof. Dr. Martin Schumacher
Dr. Gerta Rücker
September 28, 2015
FAKULTÄT FÜR MATHEMATIK
UND PHYSIK
This page contains personal data and is therefore not approved for online publication.
Abstract
In some systematic reviews of diagnostic test accuracy studies, several studies report more than one cut-off value and the corresponding values of sensitivity and specificity. Until now, however, there has been no widely used meta-analysis approach that uses this information. The traditional bivariate model, for example, assumes only one pair of sensitivity and specificity per study. But as more information is available, it is only reasonable to make use of it.
We therefore describe a new approach utilizing such data. It is based on the idea of estimating the distribution functions of the underlying biomarker in the non-diseased and diseased study populations. We assume a normally or logistically distributed biomarker and estimate different distribution parameters in both groups. This is achieved through a linear regression of the (probit or logit) transformed proportions of negative test results of the non-diseased and diseased individuals, respectively, using a mixed effects model with study as grouping factor. We present a number of possible mixed models. Once both distribution functions are estimated, they give the pooled sensitivity and specificity at a specific cut-off, and the summary receiver operating characteristic (SROC) curve follows directly. Furthermore, the difference of the distribution functions is the Youden index, so that an optimal cut-off across studies can be determined by maximizing the Youden index.
The approach is applied to several examples, almost all leading to convincing results. An extensive simulation study was conducted, showing the strengths and limitations of the approach.
Zusammenfassung in deutscher Sprache
For systematic reviews of diagnostic studies in which some studies report more than one cut-off value and the corresponding sensitivity and specificity, there is currently no widely used meta-analysis approach that uses this information. The traditional bivariate model, for example, assumes only one pair of sensitivity and specificity per study. If more information is available, however, it makes sense to use it.
We describe a new approach that uses exactly this kind of data. The basic idea is to estimate the distribution function of the underlying biomarker in the group of non-diseased and in the group of diseased patients, respectively. We take a parametric approach (normally or logistically distributed biomarker) and estimate different distribution parameters for the two groups. This is achieved by linear regression of the (probit or logit) transformed proportions of negative test results in the two groups using a mixed model with study as grouping factor. We present a variety of possible mixed models. Once both distribution functions have been estimated, they yield the pooled sensitivity and specificity for a specific cut-off, and the estimated summary receiver operating characteristic (SROC) curve follows directly. Moreover, the difference of the distribution functions is the Youden index, so that the optimal cut-off across all studies can be determined by maximizing the Youden index.
The new approach is demonstrated on several examples, almost all of which lead to very convincing results. An extensive simulation study was conducted, showing the strengths and limitations of the approach.
Contents
Acknowledgements i
Abstract iii
Zusammenfassung in deutscher Sprache iv
Abbreviations vii
Chapter 1. Introduction 1
Chapter 2. Background 3
2.1. Diagnostic Test Accuracy Studies 3
2.2. Meta-Analyses of Diagnostic Test Accuracy Studies 9
2.3. Traditional Approaches 11
Chapter 3. Theory 15
3.1. Motivation 15
3.2. Existing Approaches 15
3.3. Novel Approach 17
3.4. Model Selection 36
3.5. Implementation in R 38
3.6. Weighting Parameters 40
Chapter 4. Examples 45
4.1. Troponin as a marker for myocardial infarction 45
4.2. Procalcitonin as a marker for sepsis 49
4.3. Procalcitonin as a marker for neonatal sepsis 52
4.4. CAGE Questionnaire 53
Chapter 5. Simulation Study 55
5.1. Design 55
5.2. Results 57
Chapter 6. Discussion 65
Chapter 7. Conclusion 69
Appendix A. Data Sets 71
Appendix B. R Code 75
B.1. Code Novel Approach 75
B.2. Code Simulation Study 94
Appendix C. Simulation Study Plots 105
Bibliography 109
Abbreviations
DTA    Diagnostic Test Accuracy
TN     True Negatives
TP     True Positives
FN     False Negatives
FP     False Positives
TNR    True Negative Rate
TPR    True Positive Rate
ROC    Receiver Operating Characteristic
SROC   Summary Receiver Operating Characteristic
REML   Restricted Maximum Likelihood
AIC    Akaike Information Criterion
cAIC   conditional Akaike Information Criterion
MSE    Mean Squared Error
CHAPTER 1
Introduction
The number of clinical studies published every year is growing rapidly (Ressing et al., 2009). For example, there are more than 70 studies
investigating the predictive ability of procalcitonin (a prohormone that
can be measured in the blood) regarding sepsis.¹ Studies like these are
called diagnostic test accuracy (DTA) studies, as they investigate the
performance of a diagnostic test, in this case based on the biomarker
procalcitonin. DTA studies often report two measures: sensitivity and
specificity. These measures stand for the success rate of the diagnostic
test in diseased and non-diseased individuals and depend on the cho-
sen cut-off value of the biomarker. This flood of clinical studies needs
to be structured so that researchers and clinicians do not lose track.
Therefore, systematic reviews with meta-analyses are inevitable. They
collect and summarize study results regarding one subject matter ques-
tion.
For example, Wacker et al. (2013) conducted a meta-analysis about
procalcitonin as a diagnostic marker for sepsis to give an overview of
the current state of research. Despite mentioning that some studies
reported sensitivity and specificity at different cut-offs, they used only
one pair of sensitivity and specificity per study for their meta-analysis.
This was due to the use of a traditional approach for meta-analyses of
DTA studies, which only allows one pair of sensitivity and specificity
per study. Subsequently, further meta-analyses appeared in which several studies reported more than one cut-off and the corresponding values of sensitivity and specificity. This raised the questions: Which is the right cut-off to choose as a meta-analyst, and how can we use the full information provided by the studies?
There are already existing approaches which face this problem and
make use of more than one pair of sensitivity and specificity per study
¹This number results from a brief PubMed search about meta-analyses on this subject, counting the included studies.
(see for example Hamza et al. (2009), Putter et al. (2010) and Martínez-Camblor (2014)). As more data is used, the results are expected to be more reliable and the biomarker is evaluated at its best. Ultimately, patients will benefit if the best-performing biomarker can be identified out of a group of potential biomarkers and used in practice.
Furthermore, it is of interest to know at which cut-off value of the
biomarker this performance can be expected. That is why meta-analysts
have asked how to determine an optimal cut-off across all studies.
In this thesis we present a new approach for a meta-analysis of DTA
studies responding to these issues. We elaborated, refined and applied
an idea suggested by G. Rücker one year ago. This new approach uses data in which several studies report more than one cut-off and the corresponding values of sensitivity and specificity, and it leads to pooled sensitivity and specificity as well as to an optimal cut-off value across
studies. The fundamental idea is to estimate the distribution functions
of the biomarker within the diseased and non-diseased individuals us-
ing a linear mixed effects model.
This thesis is organized as follows: In the second chapter, we give background information about diagnostic test accuracy
studies and meta-analyses of these and briefly introduce two traditional
approaches. Then, in chapter 3, we present our new approach. First, we
give a motivation and briefly describe some existing approaches. Step
by step, we explain the procedure, touch on the subject of model selec-
tion, introduce the implementation in R and give two weighting options.
In chapter 4 we show some examples taken from current meta-analyses.
A simulation study evaluating the performance of our new approach is
presented in chapter 5. In chapter 6 we discuss the approach and the
results of the evaluations and we finish with a conclusion in the last
chapter.
CHAPTER 2
Background
2.1. Diagnostic Test Accuracy Studies
2.1.1. Diagnostic test accuracy study. As the subject of this
thesis is meta-analysis of diagnostic test accuracy studies, we first want
to take a closer look at this special type of study. This chapter follows
chapter 9 of Schwarzer et al. (2015).
A diagnostic test accuracy study investigates if and how well a diagnos-
tic test can recognize or rule out a disease. A test can be, for example,
based on a questionnaire testing for alcoholism and we want to know if
the questions can distinguish correctly between harmful and harmless
alcohol consumption. A test could also be based on a biomarker. As
defined by the World Health Organisation (2001), "a biomarker is any substance, structure or process that can be measured in the body or its products and influence or predict the incidence of outcome or disease".
An example for a biomarker is the concentration of the prohormone
procalcitonin in the blood, where high concentration can be an indica-
tor for sepsis.
Moreover, a DTA study can provide indications for treatment decisions
of physicians. A study could, for example, report a threshold of the
concentration of procalcitonin that - if exceeded - should be seen as an
indicator of sepsis.
For DTA studies one assumes a fully accurate gold standard, that is, one knows exactly which of the patients are ill and which are healthy. Of course this assumption cannot be upheld entirely in most cases. However, it should be approximated as closely as possible.
Conducting a DTA study, one needs two groups of patients: one group
composed of diseased (D+) and one group composed of non-diseased
individuals (D−). Without loss of generality, we assume a positive test
indicates illness. T+ will stand for patients with a positive test result,
Test result \ Disease    D+    D−    Total
T+                       TP    FP    TP+FP
T−                       FN    TN    FN+TN
Total                    n1    n0    n

Table 2.1. T+ denotes a positive test result, T− a negative test result; D+ denotes diseased, D− non-diseased individuals; n1 is the number of individuals in the diseased group, n0 the number of individuals in the non-diseased group and n the overall study population. 'TP' denotes true positives, 'FP' false positives, 'FN' false negatives and 'TN' true negatives.
whereas T− will indicate a negative test result. As the test to be examined is presumably not perfect, we get the fourfold table shown in table 2.1. A number of diseased individuals will have a positive test result, the true positives, denoted 'TP', but a number of diseased individuals will wrongly have a negative test result, the false negatives, denoted 'FN'. On the other hand, a number of non-diseased individuals will correctly test negative, denoted 'TN' (true negatives), whereas some will incorrectly test positive, denoted 'FP' (false positives).
The number of diseased individuals is denoted by n1 and the number of
the non-diseased by n0. In the whole thesis we will use the subscripts
1 and 0 to distinguish between diseased and non-diseased, respectively.
2.1.2. Definition of sensitivity and specificity. To rate or compare tests, several measures have been developed. We want to present the two most common ones, according to Honest and Khan (2002): sensitivity and specificity.
Sensitivity (Se) is the probability of a positive test result, given the person has the disease:

    Se = P(T+ | D+).
This probability can be estimated as

    Ŝe = TP / n1,

and is then also called the true positive rate (TPR). In contrast, specificity (Sp) is the probability of a negative test result, given the person is non-diseased:

    Sp = P(T− | D−).

To estimate this probability we again use the values of the fourfold table:

    Ŝp = TN / n0,

and call it the true negative rate (TNR). As one deduces from the definitions, it is desirable that both sensitivity and specificity are close to one, as they state probabilities of correct decisions. But as we will easily see in the next paragraph, there is a trade-off between these two measures, so in most cases they cannot both be maximised at the same time.
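These estimators are straightforward to compute. As a minimal sketch (the thesis's own implementation, listed in Appendix B, is in R; this illustration uses Python, and the function name is ours), consider the first row of Table 2.2:

```python
# Sketch in Python (the thesis's own code, listed in Appendix B, is in R):
# estimating sensitivity and specificity from a fourfold table. The function
# name is ours; the example values are the first row of Table 2.2.
def estimate_se_sp(tp, fp, fn, tn):
    n1 = tp + fn  # number of diseased individuals
    n0 = tn + fp  # number of non-diseased individuals
    se = tp / n1  # true positive rate (TPR)
    sp = tn / n0  # true negative rate (TNR)
    return se, sp

# Example: study 1 of Table 2.2 at cut-off 5
se, sp = estimate_se_sp(tp=106, fp=131, fn=4, tn=91)
print(se, sp)
```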
2.1.3. Tests based on a continuous marker. In the following
we consider a continuous biomarker X and want to design a test based
on its value. We imagine for instance a substance in the blood that
we can measure. Without loss of generality, we assume that a higher
marker value indicates a higher probability of illness, as is true for most biomarkers. Therefore, plotting the probability of having
a specific marker value for diseased and non-diseased individuals (plot
2.1), we see the distribution of the diseased further to the right. To
create a test based on this marker, we choose a specific marker value as
cut-off value and all individuals with marker values higher than the cut-
off value get a positive test result whereas individuals with a smaller
marker value get a negative test result.
As can be seen in graphic 2.1, choosing a cut-off value results in a
fourfold table as in table 2.1. That way, every choice of a cut-off leads
to a pair of estimated sensitivity and specificity. In this thesis we will
use the terms 'cut-off value' and 'threshold' as synonyms. Increasing the threshold leads to more negative test results and fewer positive ones; decreasing it leads to more positive and fewer negative test results. Thus
Figure 2.1. Distributions of a continuous biomarker; the left curve is that of the non-diseased and the right one that of the diseased. The cut-off value 1.5 leads to a fourfold table with true negatives (TN), false negatives (FN), true positives (TP) and false positives (FP).
increasing the threshold results in an increasing specificity and a de-
creasing sensitivity and vice versa for a decreasing threshold.
As in general the density functions will overlap, there is no cut-off value
which leads to a sensitivity and specificity both equal to one. Instead,
we have to find a trade-off between sensitivity and specificity.
2.1.4. Receiver operating characteristic curve. To plot the
triples of cut-off value, sensitivity and specificity in a two-dimensional plot, there are two common ways to proceed. First, we want to introduce the receiver operating characteristic (ROC) curve. This curve,
originally developed in the signal detection theory, was later used in
medical diagnostics (Lusted, 1971). This subsection is based on Schu-
macher and Schulgen (2008, p. 330 ff.).
In this approach we keep sensitivity and specificity as pairs of a com-
mon cut-off, but neglect the value of the cut-off. By plotting these pairs
as sensitivity against one minus specificity, both on the range [0, 1], we get the ROC curve (see plot 2.2).

Figure 2.2. ROC curve of a normally distributed biomarker. The dot represents the values of sensitivity and specificity at cut-off 1.5.
Each point on the curve represents the pair of sensitivity and specificity for one cut-off. If a specific cut-off value is marked as a dot in the graphic, increasing the cut-off value moves the dot downwards, towards the origin.
Throughout this thesis we will consider two classes of distributions for the biomarker X: the normal distribution and the logistic distribution. Depending on the disease status we will choose different parameters.
First we assume that the biomarker X is normally distributed. For the diseased, the distribution is described by N(µ1, σ1²) and for the non-diseased by N(µ0, σ0²), where µ1 is greater than µ0 and N(µ, σ²) is the normal distribution with mean µ and variance σ². Then
the ROC curve is given by

    Se(x) = 1 − Φ_{µ1,σ1}( Φ⁻¹_{µ0,σ0}(1 − x) ),   0 ≤ x ≤ 1,   (1)
where Φ_{µ,σ} is the distribution function of the normal distribution with parameters µ and σ. Since a test is better the higher sensitivity and specificity are, the ROC curve should ideally run close to the upper left corner.
In the following we want to consider the biomarker X to be logistically
distributed.
Definition 2.1 (Logistic distribution). Let the continuous random variable X be logistically distributed with location parameter µ and dispersion parameter σ, σ > 0. We write X ∼ Logistic(µ, σ). Then the density function is given by

    f_{µ,σ}(x) = exp(−(x − µ)/σ) / ( σ [1 + exp(−(x − µ)/σ)]² ).

Therewith, the distribution function, named the expit function, is given by

    expit_{µ,σ}(x) = 1 / (1 + exp(−(x − µ)/σ)).

The inverse of the expit function is called the logit function:

    logit_{µ,σ}(x) = µ + σ log( x / (1 − x) ).

The terms 'logit' and 'expit' without indices refer to the standard parameter choice µ = 0 and σ = 1.
The mean of the random variable X is E(X) = µ and the variance Var(X) = σ²π²/3.
Let the distribution of the diseased be described by Logistic(µ1,σ1)
and of the non-diseased by Logistic(µ0,σ0), where again µ1 is greater
than µ0. Then the ROC curve is given by
    Se(x) = 1 − expit_{µ1,σ1}( logit_{µ0,σ0}(1 − x) ),   0 ≤ x ≤ 1.
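Both parametric ROC curves can be evaluated numerically. The following sketch (in Python, whereas the thesis's own code is in R; all parameter values are illustrative assumptions, not taken from any example in the thesis) implements equation (1) and its logistic analogue:

```python
from math import exp, log
from statistics import NormalDist

# Sketch in Python (the thesis's own code is in R): evaluating the parametric
# ROC curves Se(x) = 1 - F1(F0^{-1}(1 - x)), where F0 and F1 are the biomarker
# distribution functions in the non-diseased and diseased group, respectively.
# All parameter values below are illustrative assumptions.

def roc_normal(x, mu0, sd0, mu1, sd1):
    # equation (1): Se(x) = 1 - Phi_{mu1,sd1}(Phi^{-1}_{mu0,sd0}(1 - x))
    cutoff = NormalDist(mu0, sd0).inv_cdf(1 - x)  # cut-off with specificity 1 - x
    return 1 - NormalDist(mu1, sd1).cdf(cutoff)

def expit(c, mu, s):
    # distribution function of Logistic(mu, s)
    return 1 / (1 + exp(-(c - mu) / s))

def logit(x, mu, s):
    # inverse of the expit function
    return mu + s * log(x / (1 - x))

def roc_logistic(x, mu0, s0, mu1, s1):
    # logistic analogue of equation (1)
    return 1 - expit(logit(1 - x, mu0, s0), mu1, s1)

# Example with mu1 > mu0, as assumed throughout the chapter:
print(roc_normal(0.5, mu0=0.0, sd0=1.0, mu1=2.0, sd1=1.0))
print(roc_logistic(0.5, mu0=0.0, s0=1.0, mu1=2.0, s1=1.0))
```

Plotting these functions over x ∈ (0, 1) reproduces curves of the shape shown in plot 2.2.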
2.1.5. Youden index. Another way to depict the triples of cut-off value, sensitivity and specificity is a Youden index plot (see plot 2.3 at the left). To reduce one dimension, the sum of sensitivity and specificity minus one is plotted against the cut-off values.
Figure 2.3. Left: Youden index curve; the optimal threshold is derived as the threshold where the maximum is obtained. Right: Summary ROC curve with the optimal threshold at 1.1 (λ = 0.5) marked as a dot.
Definition 2.2 (Youden index). The Youden index for a cut-off value x is defined by

    Y(x) = Se(x) + Sp(x) − 1.   (2)
The point where the Youden index is maximized can be seen as
an optimal cut-off value, as at this point the sum of sensitivity and
specificity is maximal (see figure 2.3). The optimal cut-off will from
now on refer to this interpretation.
Definition 2.3 (Weighted Youden index). The weighted Youden index for a cut-off value x is defined by

    Y(x) = 2 (λw · Se(x) + (1 − λw) · Sp(x)) − 1,   (3)

where λw ∈ [0, 1] is a weighting parameter.
The parameter λw can be used to weight either Se or Sp more heavily. Choosing λw = 0.5 results in the Youden index defined in equation (2), with equally weighted sensitivity and specificity. To place more emphasis on sensitivity, one could for example choose λw = 2/3. This can be reasonable, for instance, for a test at the beginning of an examination, where it is important to recognize all diseased individuals.
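The optimal threshold can be found numerically by maximizing the (weighted) Youden index over a grid of cut-off values. The following sketch (Python, whereas the thesis's code is in R) assumes a normally distributed biomarker with illustrative parameters; with equal variances and λw = 0.5 the optimum lies midway between the two group means:

```python
from statistics import NormalDist

# Sketch in Python (the thesis's own code is in R): the weighted Youden index
# of equation (3), maximized over a grid of cut-off values. The biomarker is
# assumed normally distributed; all parameter values are illustrative.

def weighted_youden(c, mu0, sd0, mu1, sd1, lam=0.5):
    sp = NormalDist(mu0, sd0).cdf(c)      # Sp(c) = F0(c)
    se = 1 - NormalDist(mu1, sd1).cdf(c)  # Se(c) = 1 - F1(c)
    return 2 * (lam * se + (1 - lam) * sp) - 1

grid = [i / 100 for i in range(-400, 801)]  # candidate cut-offs from -4 to 8
best = max(grid, key=lambda c: weighted_youden(c, 0.0, 1.0, 3.0, 1.0))
print(best)  # with equal variances the optimum lies midway between the means
```

Raising λw above 0.5 shifts the optimal threshold downwards, trading specificity for sensitivity, as described above.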
2.2. Meta-Analyses of Diagnostic Test Accuracy Studies
2.2.1. Meta-analysis. Often there is a large number of studies
considering similar questions. Then it is useful to conduct a system-
atic review, which seeks to bring together all available studies on one
subject and summarizes the results.
"Meta-analysis is a statistical technique for combining the findings of independent studies. It is most often used to assess the clinical effectiveness of healthcare interventions." (Crombie and Davies, 2009) A simple meta-analysis for studies all reporting the same effect measure could be to calculate a weighted average of that effect measure.
2.2.2. Meta-analyses of DTA studies. The methodology of
systematic reviews and meta-analyses of DTA studies is relatively new; the first papers appeared in the 1990s (Willis and Quigley, 2011).
There is a variety of different outcomes that a DTA study can report:
Different effect sizes (sensitivity and specificity or others), even sev-
eral pairs of sensitivity and specificity, cut-off values, and much more.
Most DTA studies report just a pair of sensitivity and specificity of
their choice, possibly announcing the cut-off. But we found a number
of meta-analyses with studies reporting several up to all possible triples
(see section 4). Reporting all cut-offs is certainly only reasonable for
discrete markers.
Depending on the subject, it can happen that not all measures are available. Consider, for example, an imaging method where a physician decides on the basis of an image whether the person has an illness or not: it has a sensitivity and specificity which may be calculated from the number of correct and false decisions of the physician. But there is no numerical value above which the physician opts for disease; thus there is no threshold one can report.
The kind of meta-analysis we mainly want to consider in this thesis uses data where each study reports one or more pairs of sensitivity and specificity and the corresponding cut-offs (see Table 2.2).
To conduct a meta-analysis of diagnostic test accuracy studies, one needs to pay attention to some specialties.

study   cut-off   TP    FP    FN   TN
1       5         106   131   4    91
1       13        92    38    18   184
1       14        92    36    18   186
1       15        93    29    17   193
2       14        19    21    6    121
3       14        159   190   6    87
3       27        119   36    46   141
4       14        37    163   1    105

Table 2.2. Excerpt of the data of the meta-analysis of Zhelev et al. (2015), consisting of 4 studies reporting 4, 1, 2 and 1 threshold(s) and the corresponding fourfold tables.

The most important peculiarity of meta-analyses of DTA studies is that sensitivity and specificity
are dependent measures. Hence, it is inappropriate to conduct two sep-
arate meta-analyses for them.
A further challenge is that heterogeneity between studies is typically large in a meta-analysis of DTA studies. Firstly, there is variation between studies in how a continuous marker is dichotomised into a test classification, i.e. how thresholds are chosen. Secondly, there is variation in the accuracy of tests across different settings, e.g. different sample sizes or more or less similar study populations.
If it is appropriate, a study should report the cut-off value for every pair
of sensitivity and specificity. In a meta-analysis optimal cut-offs per
study can be averaged or an overall optimal cut-off can be computed.
However, some care has to be taken concerning the concept of an optimal threshold across studies, as this is only reasonable if a biomarker
value has the same meaning in all studies and does not differ because
of, for example, laboratory conditions.
If studies report multiple triples, the triples of each study are based on the same individuals and are therefore dependent as well. Hence, it is not appropriate to conduct separate meta-analyses for them.
As Honest and Khan (2002) showed, the primary goals of meta-analyses of DTA studies are typically to pool sensitivity and specificity over all studies and to obtain a summary ROC (SROC) curve. It is also of
interest to obtain an optimal threshold of the biomarker over all studies.
2.3. Traditional Approaches
The customary approaches for meta-analysis of diagnostic accuracy
studies only use one pair of sensitivity and specificity per study. They
aim to pool sensitivity and specificity and/or to estimate a summary
ROC curve. Widely used approaches are the hierarchical model (Rutter
and Gatsonis, 2001) and the bivariate model (Reitsma et al., 2005).
2.3.1. Hierarchical model. Rutter and Gatsonis proposed a hi-
erarchical model focussing on an estimate of a summary ROC curve.
They used a mixed model to allow for variation of test stringency and
test accuracy across studies.
At the study level, the numbers of individuals testing positive in study s, s = 1, ..., m, are assumed to be independent and to follow binomial distributions
TPs ∼ B(n1s, Ses),
FPs ∼ B(n0s, 1− Sps),
where n1s and n0s are the numbers of diseased and non-diseased individuals in study s. Furthermore, Rutter and Gatsonis assumed that the logit transformed sensitivity depends linearly on the logit transformed 1 − specificity for every study s, parametrizing them as follows:

    logit(Ses) = (θs + αs/2) e^(−β/2),
    logit(1 − Sps) = (θs − αs/2) e^(β/2),
where θs is the random threshold in study s and αs the random accuracy
in study s, which are allowed to vary across studies. The parameter β
is an asymmetry parameter and constant over all studies.
At the between-study level, we will describe the hierarchical model without covariates. The study-level parameters θs and αs are assumed to be normally distributed to account for variation across studies:

    θs ∼ N(Θ, τθ²),
    αs ∼ N(Λ, τα²).
Note that the threshold parameter θs just describes the heterogeneity along the SROC curve and does not stand for explicit cut-off values, as the cut-off data is not used in the model.
Then, in logit space, the SROC curve is linear, as logit(Se) can be expressed as

    logit(Se) = e^(−β) logit(1 − Sp) + Λ e^(−β/2).

Back-transforming this equation leads to the SROC curve

    Se = expit( e^(−β) logit(1 − Sp) + Λ e^(−β/2) ).
2.3.2. Bivariate model. The bivariate model was proposed by
Reitsma et al. in 2005. They aimed to pool sensitivity and specificity by modelling a bivariate distribution. At the study level, the number of
positive test results is binomially distributed, just as in the hierarchical
model.
At the between-study level, they assume a bivariate normal distribution of the logit transformed sensitivity and specificity:

    ( logit(Ses), logit(1 − Sps) )ᵀ ∼ N( (µ1, µ0)ᵀ, Σ ),   Σ = | τ1²  τ10 |
                                                              | τ10  τ0² |.
That way, they preserve the two-dimensional nature of the data, ac-
knowledge possible correlation and account for variability between the
studies with random effects.
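To make the between-study model concrete, the following sketch (Python, whereas the thesis's code is in R) draws study-specific pairs from this bivariate normal distribution via a Cholesky factorization of the covariance matrix and back-transforms them to the (Se, Sp) scale; all parameter values are illustrative assumptions:

```python
import random
from math import exp, sqrt

# Sketch in Python: drawing study-specific pairs (logit(Se_s), logit(1 - Sp_s))
# from the bivariate normal distribution of the bivariate model via a Cholesky
# factorization of the covariance matrix, then back-transforming to (Se, Sp).
# All parameter values below are illustrative assumptions.

def expit(z):
    return 1 / (1 + exp(-z))

def draw_study(mu1, mu0, tau1, tau0, tau10, rng):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    # Cholesky factor of the covariance matrix [[tau1^2, tau10], [tau10, tau0^2]]
    l21 = tau10 / tau1
    l22 = sqrt(tau0 ** 2 - l21 ** 2)
    logit_se = mu1 + tau1 * z1
    logit_fpr = mu0 + l21 * z1 + l22 * z2  # logit(1 - Sp_s)
    return expit(logit_se), 1 - expit(logit_fpr)

rng = random.Random(42)  # fixed seed for reproducibility
studies = [draw_study(1.5, -1.0, 0.5, 0.4, -0.1, rng) for _ in range(5)]
print(studies)
```

A negative τ10, as chosen here, encodes the typical trade-off: studies with higher sensitivity tend to have lower specificity.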
In 2007 and 2008 it was proven that the hierarchical and the bivariate
approach are closely related and even equivalent under the condition
of no covariates (Harbord et al., 2007; Arends et al., 2008). Hence,
they illustrate the same idea from two different points of view and the
parameters of one model can be converted into the ones of the other
model.
2.3.3. Benefits and drawbacks. Many studies present only one pair of sensitivity and specificity, with or without information on the underlying cut-off; hence it is a great benefit to have meta-analysis methods dealing with such data.
In addition, as stated in subsection 2.2.2, not all diagnostic test accuracy studies have a calculable numerical cut-off value. For these kinds of studies it is important to have meta-analysis approaches which do not need cut-off information.
A problem arising when using only one pair of sensitivity and specificity per study is that the SROC curve is not uniquely defined. As Arends et al. (2008) showed, there are many different ways to define the straight line in logit space, i.e. the transformed SROC curve. As every approach proceeds and is justified differently, there is no completely natural way to define the SROC curve, but a large number of ways to do so.
Furthermore, using only one pair of sensitivity and specificity per study might lead to an overly optimistic SROC curve. Most studies will not report an arbitrary pair of sensitivity and specificity but a somehow optimal one. Thus, the data available for a summary ROC curve are all the 'optimal' points. So it is likely that the summary ROC curve will be too optimistic and cannot be seen as a mean of the single ROC curves.
This problem was already addressed by Rücker and Schumacher (2010). They proposed a new approach which uses an 'optimal point' assumption to identify a straight line in logit space and thus an SROC curve.
Another point to be mentioned is the cut-off selection. If studies present more than one cut-off value, the meta-analyst needs to reduce the data and select a cut-off. This procedure also leads to bias and, again, does not use the full information.
CHAPTER 3
Theory
3.1. Motivation
In this chapter we want to describe a new approach for meta-
analysis of DTA studies, which is feasible if all studies report cut-off
values and, moreover, a number of these studies provide information
about several triples of cut-off, sensitivity and specificity. There are a
number of systematic reviews providing data of this form, as can be
seen in chapter 4.
To apply the traditional approaches, the meta-analyst has to decide on one pair of sensitivity and specificity and thus discard a lot of information; the cut-off information also goes unused. Furthermore, the selection of a somehow optimal pair of sensitivity and specificity may cause additional bias, and the test accuracy might be overestimated.
To avoid these issues we want to use all information provided. As we
will see, this will also lead to a naturally defined SROC curve.
3.2. Existing Approaches
We want to introduce three already existing approaches, which also
use data where several studies provide information about more than
one pair of sensitivity and specificity.
3.2.1. Multivariate random effects approach. In the publi-
cation of Hamza et al. (2009) the bivariate random effects approach
is generalized to the situation where each study reports k (k ≥ 3)
thresholds and the corresponding values of sensitivity and specificity.
With a multivariate random effects approach they estimate a summary ROC curve, interpreting the individual ROC curves of the studies as random samples from the population of all ROC curves of such studies. That way,
they do not need further assumptions for the summary ROC curve, as
was the case in the traditional approaches (see section 2.3), and the
SROC curve can be seen as an average of all study-specific ROC curves.
They are also able to calculate pooled sensitivity and specificity at any
threshold.
A clear limitation of this approach is the equal number of thresholds
demanded per study. Although there are possibilities to handle this
constraint and to accept a different number of thresholds in some cases,
the total number of different thresholds across all studies should not
be too large, as this increases the number of parameters and then the
likelihood method may not work correctly any more.
3.2.2. Survival approach. Putter et al. (2010) proposed a meta-
analysis of DTA studies with multiple thresholds using survival meth-
ods. As in the multivariate approach of Hamza et al., they act on
the assumption that all studies present the same number of thresholds, i.e. categories. Putter et al. assume the numbers of diseased and
non-diseased with a positive or negative test result, respectively, to be
Poisson distributed. Then a multivariate gamma distribution is used to
describe between-study variation. Hence, correlation of sensitivity and
specificity within a given study is included through common random
effects. Furthermore, extra correlation of sensitivity and specificity of
different thresholds is included.
But in this approach we encounter the same problem as in the multi-
variate random effects approach: the necessity of an equal number of
thresholds per study.
3.2.3. Non-parametric approach. The last approach we want
to mention is the fully non-parametric approach of Martínez-Camblor
(2014) to estimate a summary ROC curve. The data used are the
number of TNs, TPs, FNs and FPs for one or multiple thresholds per
study. In a first step, the points are depicted in ROC space and linearly
interpolated within each study, taking into account that all ROC curves
begin at (0,0) and end at (1,1). Then a global ROC curve is computed
as a weighted mean of the individual interpolated ROC curves. There
are two weighting schemes proposed, one based on a fixed-effects model
and one on a random-effects model.
In contrast to the above-mentioned approaches, the number of thresh-
olds presented per study is not fixed and all information can be em-
ployed. On the other hand, the approach only leads to an SROC curve
and no pooled sensitivity and specificity for specific cut-offs are esti-
mated.
3.3. Novel Approach
3.3.1. Overview. The novel approach we want to present is char-
acterized by the estimation of the distribution functions of the biomarker
within the non-diseased and diseased individuals, respectively.
The method assumes a continuous biomarker which is normally or logistically
distributed. Different location and dispersion parameters for non-diseased
and diseased individuals are estimated from the available data.
The distribution functions of the biomarker in the non-diseased and
diseased population specify the estimated specificity and one minus
sensitivity, respectively, per threshold. To account for the heterogeneity
of the studies, a mixed effects model is used, and the bivariate structure
is taken into account by allowing for correlation of the random effects of
the non-diseased and diseased individuals.
This results in a large number of possible models. Having estimated the
underlying distribution functions, one can read off the pooled sensi-
tivity and specificity values at every threshold and confidence regions
can be specified. The summary ROC curve follows naturally and the
summary Youden index is simply the difference of the two estimated
distributions. Furthermore, the optimal cut-off among all studies can
be calculated.
In the following subsections, we explain the procedure step by step,
considering two different distribution assumptions throughout: the normal
and the logistic distribution. In the first subsection, the data are
transformed so that they depend linearly on the threshold. Then a straight
line for each group is estimated using a linear mixed model. After
back-transforming, we determine the optimal cut-off value. In the last
subsection we compute confidence regions for the distribution parameters,
sensitivity and specificity, and the optimal cut-off.
3.3.2. Probit/Logit Transformation. First of all, we transform
sensitivity and specificity so that they are linear in the threshold. This
gives us the possibility of using a linear mixed effects model to esti-
mate the distribution functions. Starting with a normal distribution
assumption, let N(\mu_0, \sigma_0^2) be the distribution for the non-diseased
individuals and N(\mu_1, \sigma_1^2) the one for the diseased.
Let x be a cut-off value. The specificity, i.e. the probability of a
negative test result given a non-diseased individual, at cut-off x is the
area under the density function for values smaller than x (see figure 2.1).
This is by definition the value of the distribution function of the
biomarker of the non-diseased, i.e. \Phi_{\mu_0,\sigma_0}, at point x.
This can be restated with the standard normal distribution and we get
\[ \mathrm{Sp}(x) = \Phi_{\mu_0,\sigma_0}(x) = \Phi\left(\frac{x - \mu_0}{\sigma_0}\right). \]
Applying the probit function \Phi^{-1}, the inverse of the standard normal
distribution function, results in
\[ \Phi^{-1}\bigl(\mathrm{Sp}(x)\bigr) = \frac{x - \mu_0}{\sigma_0}, \]
and the expression on the left-hand side is linear in x.
Considering the distribution function of the biomarker of the diseased,
\Phi_{\mu_1,\sigma_1}, which gives the probability of a negative test result
for diseased individuals, i.e. one minus sensitivity, we get the following
equivalence:
\[ 1 - \mathrm{Se}(x) = \Phi_{\mu_1,\sigma_1}(x) = \Phi\left(\frac{x - \mu_1}{\sigma_1}\right). \]
Probit transforming the equation leads to
\[ \Phi^{-1}\bigl(1 - \mathrm{Se}(x)\bigr) = \frac{x - \mu_1}{\sigma_1}. \]
We conclude that the probit transformed specificity and one minus
sensitivity depend linearly on the cut-off value x.
In the following, we assume logistic distributions for the non-diseased
and diseased individuals. Let the location parameters be \mu_0 and \mu_1,
and the dispersion parameters \sigma_0 and \sigma_1, for non-diseased and
diseased, respectively. For a threshold x we get, analogously to the
normal case,
\[ \mathrm{Sp}(x) = \mathrm{expit}_{\mu_0,\sigma_0}(x) = \frac{1}{1 + \exp\left(-\frac{x-\mu_0}{\sigma_0}\right)}. \]
Applying the standard logit function results in
\[ \mathrm{logit}(\mathrm{Sp}(x)) = \mathrm{logit}\left(\frac{1}{1 + \exp\left(-\frac{x-\mu_0}{\sigma_0}\right)}\right) = \log\left(\frac{1}{\exp\left(-\frac{x-\mu_0}{\sigma_0}\right)}\right) = \frac{x - \mu_0}{\sigma_0}. \]
Concerning the diseased individuals, we have
\[ 1 - \mathrm{Se}(x) = \mathrm{expit}_{\mu_1,\sigma_1}(x) = \frac{1}{1 + \exp\left(-\frac{x-\mu_1}{\sigma_1}\right)}. \]
Logit transforming this equation leads to
\[ \mathrm{logit}\bigl(1 - \mathrm{Se}(x)\bigr) = \frac{x - \mu_1}{\sigma_1}, \]
and we get linearity in x for the logit transformed specificity and one minus
sensitivity as well.
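The linearity just derived can be checked numerically. The following sketch is not part of the thesis; the parameter values are assumed purely for illustration. It verifies that, for a normally distributed biomarker, the probit transformed specificity is linear in the cut-off x with intercept −µ0/σ0 and slope 1/σ0.

```python
# Illustrative check (assumed parameters, not thesis data): the probit
# transformed specificity of a N(mu0, sigma0^2) biomarker is linear in x.
from statistics import NormalDist

mu0, sigma0 = 2.0, 1.5           # assumed parameters of the non-diseased group
std = NormalDist()               # standard normal distribution

def probit_sp(x):
    """Phi^{-1}(Sp(x)) with Sp(x) = Phi((x - mu0) / sigma0)."""
    sp = NormalDist(mu0, sigma0).cdf(x)
    return std.inv_cdf(sp)

for x in (0.0, 1.0, 3.0, 5.0):
    # agrees with the linear form (x - mu0) / sigma0 derived above
    assert abs(probit_sp(x) - (x - mu0) / sigma0) < 1e-9
```

The analogous check for the logistic case uses the logit in place of the probit function.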
3.3.3. Linear regression. In the next step, we want to fit the
transformed data with a linear model to extract the parameters of
the earlier mentioned linear dependence. Let us first consider a linear
fixed effects model. The definition closely follows the formulation of
Gałecki and Burzykowski (2013).
Definition 3.1 (Linear fixed effects model). A linear regression
model with fixed effects is given by
\[ y = X\beta + e, \qquad e \sim N(0, R), \]
where y = (y_1, \ldots, y_n)' consists of the response variables for the n
observations,
\[ X = \begin{pmatrix} x_{11} & x_{12} & \ldots & x_{1p} \\ x_{21} & x_{22} & \ldots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \ldots & x_{np} \end{pmatrix} \]
is the design matrix, consisting of the n observed values of p (known)
covariates (p < n), and \beta = (\beta_1, \ldots, \beta_p)' are the
corresponding (unknown) regression parameters. We assume the residual
errors e = (e_1, \ldots, e_n)' to be multivariate normally distributed with
variance matrix R and e_1, \ldots, e_n to be independent.
In our case we want to only take one covariate into consideration,
the threshold, and estimate two separate regression lines for the dis-
eased and non-diseased, respectively.
Let k_s be the number of cut-offs in study s (s = 1, \ldots, m) and let
x_{si} be the value of the i-th cut-off of study s. The number of cut-offs
may vary over the studies. We want to explain the transformed proportions
of negative test results, with TN_{si}/n_{0s} being the proportion of
negative test results of the non-diseased of study s at cut-off i, and
FN_{si}/n_{1s} the corresponding proportion for the diseased.
From now on, we will present all linear models with probit transformed
proportions of negative test results, but of course the same models can
be used for logit transformed ones.
We consider the following regression model:
\[ \Phi^{-1}\left(\frac{TN_{si}}{n_{0s}}\right) = \alpha_0 + \beta_0 x_{si} + e_{si}, \]
\[ \Phi^{-1}\left(\frac{FN_{si}}{n_{1s}}\right) = \alpha_1 + \beta_1 x_{si} + f_{si}, \]
\[ e_{si} \sim N\left(0, \frac{\gamma^2}{w_{si}}\right), \quad f_{si} \sim N\left(0, \frac{\gamma^2}{v_{si}}\right), \quad s = 1, \ldots, m, \; i = 1, \ldots, k_s, \]
where \alpha_0 and \alpha_1 are the intercepts and \beta_0 and \beta_1 the
slopes for the non-diseased and diseased, respectively. The independent
error terms of the non-diseased are denoted by e_{si}, those of the diseased
by f_{si}, for every study s and cut-off i. Both are normally distributed
with mean zero and variances \gamma^2/w_{si} and \gamma^2/v_{si},
respectively, where \gamma is an unknown scale parameter and w_{si} and
v_{si} are given prior weights. We use this formulation for the variances
of the residual errors because it is the way it is implemented in the R
function lmer(), which we will use for the regressions (see section 3.5).
This regression model is a fixed effects model. A linear regression
with only fixed effects assumes one constant overall intercept and slope.
The data deviate from this regression line only by chance, and the way they
deviate is described by
\[ e_{si} \sim N\left(0, \frac{\gamma^2}{w_{si}}\right), \quad f_{si} \sim N\left(0, \frac{\gamma^2}{v_{si}}\right), \quad s = 1, \ldots, m, \; i = 1, \ldots, k_s. \]
For the meta-analysis data this means that, for example for the diseased,
we estimate one straight line, i.e. one transformed distribution, which
underlies all data. In this way we assume that all studies share the same
biomarker distribution.
However, one can observe considerable heterogeneity between the studies,
which is the consequence of different sample sizes, different laboratory
conditions, different study populations (differing in countries, specialized
clinics, etc.) and much more. To explain part of this heterogeneity in the
model, we want to include random effects.
A further disadvantage of this fixed effects model is the estimation of two
separate straight lines, leading to two separate distribution functions,
disregarding the bivariate character of the data. The data of the
non-diseased and diseased individuals are not independent, but come from
the same studies. That is why we want to include correlated parameters to
link the two regression lines.
A model describing such a set-up is the linear mixed effects model. The
formulation closely follows Laird and Ware (1982) and Gałecki and
Burzykowski (2013).
Definition 3.2 (Linear mixed effects model). A linear mixed model
is given through an extension of the linear fixed effects model (see
definition 3.1) by random effects. For hierarchical data with a single
level of grouping, we formulate the linear mixed model at a given level of
the grouping factor s (s = 1, \ldots, m) as follows:
\[ y_s = X_s\beta + Z_s d_s + e_s, \tag{4} \]
\[ d_s \sim N(0, D), \quad e_s \sim N(0, R_s), \quad d_s \perp e_s, \]
where y_s, X_s and e_s are the vector of responses, the design matrix and
the vector of residual errors for grouping factor s, and \beta the
regression parameters of the fixed effects as in definition 3.1. The
covariates matrix of the random effects,
\[ Z_s = \begin{pmatrix} z_{11}^{(s)} & z_{12}^{(s)} & \ldots & z_{1q}^{(s)} \\ z_{21}^{(s)} & z_{22}^{(s)} & \ldots & z_{2q}^{(s)} \\ \vdots & \vdots & & \vdots \\ z_{n1}^{(s)} & z_{n2}^{(s)} & \ldots & z_{nq}^{(s)} \end{pmatrix}, \]
consists of n observations of q (known) covariates, and
d_s = (d_1^{(s)}, \ldots, d_q^{(s)})' is the corresponding (unknown) vector
of random effects. The residual errors are independent and independent of
the random effects and have variance matrix R_s. The q \times q
variance-covariance matrix D of the random effects is positive-definite.
To write one model for all data, let Y = (y_1', \ldots, y_m')',
d = (d_1', \ldots, d_m')' and e = (e_1', \ldots, e_m')' be the vectors
containing all observed values of the dependent variable, all random
effects and all residual errors of all grouping factors s. Define the
matrices
\[ X = \begin{pmatrix} X_1 \\ \vdots \\ X_m \end{pmatrix} \quad \text{and} \quad Z = \begin{pmatrix} Z_1 & 0 & \ldots & 0 \\ 0 & Z_2 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & Z_m \end{pmatrix}, \]
where 0 denotes a matrix with all elements equal to zero. Then the linear
mixed regression (equation (4)) can be written for all data as follows:
\[ Y = X\beta + Zd + e, \quad d \sim N(0, \mathcal{D}), \quad e \sim N(0, \mathcal{R}), \]
where \mathcal{D} = \mathrm{diag}(D, \ldots, D), a block-diagonal matrix with
the matrix D as main diagonal entries, and
\mathcal{R} = \mathrm{diag}(R_1, \ldots, R_m).
In the case of meta-analysis data, there is a clear hierarchical struc-
ture by the different studies. Furthermore, the correlation between
values of the same study must not be neglected. Considering the stud-
ies as randomly chosen out of the overall study population, a linear
mixed model with study as grouping factor is an adequate way of re-
gressing the data.
For the fixed effects covariates matrix we set p = 2; thus we consider
only a fixed intercept and a single covariate, the threshold (cf. the
fixed effects model). For the random effects covariates matrix Z_s of
grouping level s we will choose either
\[ Z_s = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}, \quad Z_s = \begin{pmatrix} z_1^{(s)} \\ z_2^{(s)} \\ \vdots \\ z_n^{(s)} \end{pmatrix} \quad \text{or} \quad Z_s = \begin{pmatrix} 1 & z_1^{(s)} \\ 1 & z_2^{(s)} \\ \vdots & \vdots \\ 1 & z_n^{(s)} \end{pmatrix}, \]
starting with the one on the left-hand side. As we aim to estimate
two distributions, for the non-diseased and the diseased, respectively,
we want to estimate two regression lines as in the fixed effects model.
Extending the fixed effects model with the same random intercept in both
regressions leads to the linear mixed model
\[ \Phi^{-1}\left(\frac{TN_{si}}{n_{0s}}\right) = \alpha_0 + a_s + \beta_0 x_{si} + e_{si}, \tag{model CI} \]
\[ \Phi^{-1}\left(\frac{FN_{si}}{n_{1s}}\right) = \alpha_1 + a_s + \beta_1 x_{si} + f_{si}, \]
\[ a_s \sim N(0, \tau_a^2), \quad e_{si} \sim N\left(0, \frac{\gamma^2}{w_{si}}\right), \quad f_{si} \sim N\left(0, \frac{\gamma^2}{v_{si}}\right), \quad s = 1, \ldots, m, \; i = 1, \ldots, k_s, \]
where a_s is the random intercept, shared by both groups. The variable a_s
is normally distributed with mean zero and variance \tau_a^2. The residual
errors e_{si} and f_{si} are independent and independent of a_s.
The model parameters which we do not mention here are defined in the same
way as in the fixed effects model. For all following models it holds that
model parameters which are not mentioned are defined as in the previous
model. The model name 'CI' stands for 'common (random) intercept'. The
denomination of the models will always refer to the random effects
structure.
The following paragraph is based on Barr et al. (2013). With model CI
we estimate an overall straight line for each group, just as in the fixed
effects model, with the fixed effects parameters \alpha_{0/1} and
\beta_{0/1}. These fixed effects parameters do not depend on the selection
of studies for the meta-analysis, but represent the overall study
population. In contrast, the random effects a_s take different values for
every study. The specific set of intercepts \alpha_{0/1} + a_s,
s = 1, \ldots, m, for a given meta-analysis is assumed to be a random
subset of the intercepts in the underlying study population. Another
instantiation of the same meta-analysis, including different studies,
would therefore have different realizations of the a_s effects.
The primary goal is to produce a model which represents the whole study
population from which the studies are randomly drawn, rather than to
describe the specific values a_s, s = 1, \ldots, m, of this sample.
Therefore, instead of estimating the individual a_s effects, the
model-fitting algorithm estimates the population distribution from which
the a_s are drawn. We assume the study-specific intercepts a_s to be mean
zero normally distributed and estimate the variance \tau_a^2.
By including the random term a_s we allow for study-specific intercepts
that lead to study-specific means (see subsection 3.3.4). Furthermore,
the two regression lines are now connected via a common random intercept.
In this way we acknowledge the bivariate structure of sensitivity and
specificity, which derive from the same studies.
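The effect of the common random intercept can be illustrated with a small simulation; all numbers below are made up and only stand in for the quantities of model CI. Both regression lines of a study are shifted by the same realization a_s, so their vertical distance is identical across studies.

```python
# Sketch of model CI's random-effects structure (assumed values only).
import random

random.seed(1)
alpha0, alpha1 = -2.0, -4.0      # assumed fixed intercepts (non-diseased, diseased)
beta0 = beta1 = 1.0              # assumed fixed slopes
tau_a = 0.5                      # assumed standard deviation of a_s

a = [random.gauss(0.0, tau_a) for _ in range(5)]   # realized random intercepts

def line0(s, x):                 # study-specific line, non-diseased
    return alpha0 + a[s] + beta0 * x

def line1(s, x):                 # study-specific line, diseased
    return alpha1 + a[s] + beta1 * x

# The shared a_s cancels in the difference of the two lines of a study:
diffs = [line1(s, 1.0) - line0(s, 1.0) for s in range(5)]
```

Under model DI, in contrast, each group would receive its own (correlated) intercept deviation and the differences would vary from study to study.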
Another way to include a random intercept with respect to the grouping
factor 'study' is to include different random intercepts for non-diseased
and diseased individuals. This leads to model DI ('different intercepts'):
\[ \Phi^{-1}\left(\frac{TN_{si}}{n_{0s}}\right) = \alpha_0 + a_{0s} + \beta_0 x_{si} + e_{si}, \tag{model DI} \]
\[ \Phi^{-1}\left(\frac{FN_{si}}{n_{1s}}\right) = \alpha_1 + a_{1s} + \beta_1 x_{si} + f_{si}, \]
\[ (a_{0s}, a_{1s})' \sim N\left(0, \begin{pmatrix} \tau_{0a}^2 & \rho\tau_{0a}\tau_{1a} \\ \rho\tau_{0a}\tau_{1a} & \tau_{1a}^2 \end{pmatrix}\right), \]
\[ e_{si} \sim N\left(0, \frac{\gamma^2}{w_{si}}\right), \quad f_{si} \sim N\left(0, \frac{\gamma^2}{v_{si}}\right), \quad s = 1, \ldots, m, \; i = 1, \ldots, k_s. \]
Whereas in model CI we assumed the study-specific deviations a_s,
s = 1, \ldots, m, added to \alpha_{0/1} to be the same for both groups, in
model DI we allow non-diseased and diseased individuals to have different
study-specific intercepts. The random intercepts of the non-diseased and
diseased individuals, i.e. a_{0s} and a_{1s}, are assumed to be bivariate
normally distributed. The correlation of the random effects reflects that
the diseased and non-diseased data of a study belong together. As before,
the random effects of different studies, a_{0s} and a_{0s'}, a_{1s} and
a_{1s'}, and also a_{0s} and a_{1s'} with s \neq s', are assumed to be
independent. This also holds for all following models.
In the following we include random slopes instead of random intercepts.
The counterpart of model CI is model CS ('common slope'), which includes
the same random slope for both the non-diseased and the diseased
individuals. This corresponds to the random effects covariates matrix
\[ Z_s = \begin{pmatrix} z_1^{(s)} \\ z_2^{(s)} \\ \vdots \\ z_n^{(s)} \end{pmatrix}. \]
Model CS is given by
\[ \Phi^{-1}\left(\frac{TN_{si}}{n_{0s}}\right) = \alpha_0 + (\beta_0 + b_s)x_{si} + e_{si}, \tag{model CS} \]
\[ \Phi^{-1}\left(\frac{FN_{si}}{n_{1s}}\right) = \alpha_1 + (\beta_1 + b_s)x_{si} + f_{si}, \]
\[ b_s \sim N(0, \tau_b^2), \quad e_{si} \sim N\left(0, \frac{\gamma^2}{w_{si}}\right), \quad f_{si} \sim N\left(0, \frac{\gamma^2}{v_{si}}\right), \quad s = 1, \ldots, m, \; i = 1, \ldots, k_s. \]
As before, we can also allow different random effects in the two groups,
which leads to model DS ('different slopes'):
\[ \Phi^{-1}\left(\frac{TN_{si}}{n_{0s}}\right) = \alpha_0 + (\beta_0 + b_{0s})x_{si} + e_{si}, \tag{model DS} \]
\[ \Phi^{-1}\left(\frac{FN_{si}}{n_{1s}}\right) = \alpha_1 + (\beta_1 + b_{1s})x_{si} + f_{si}, \]
\[ (b_{0s}, b_{1s})' \sim N\left(0, \begin{pmatrix} \tau_{0b}^2 & \rho\tau_{0b}\tau_{1b} \\ \rho\tau_{0b}\tau_{1b} & \tau_{1b}^2 \end{pmatrix}\right), \]
\[ e_{si} \sim N\left(0, \frac{\gamma^2}{w_{si}}\right), \quad f_{si} \sim N\left(0, \frac{\gamma^2}{v_{si}}\right), \quad s = 1, \ldots, m, \; i = 1, \ldots, k_s. \]
Finally, we can include both a random intercept and a random slope, using
the random effects covariates matrix
\[ Z_s = \begin{pmatrix} 1 & z_1^{(s)} \\ 1 & z_2^{(s)} \\ \vdots & \vdots \\ 1 & z_n^{(s)} \end{pmatrix}. \]
Starting with a common random intercept and a common random slope for both
groups, this leads to the following model:
\[ \Phi^{-1}\left(\frac{TN_{si}}{n_{0s}}\right) = \alpha_0 + a_s + (\beta_0 + b_s)x_{si} + e_{si}, \tag{model CICS} \]
\[ \Phi^{-1}\left(\frac{FN_{si}}{n_{1s}}\right) = \alpha_1 + a_s + (\beta_1 + b_s)x_{si} + f_{si}, \]
\[ (a_s, b_s)' \sim N\left(0, \begin{pmatrix} \tau_a^2 & \rho\tau_a\tau_b \\ \rho\tau_a\tau_b & \tau_b^2 \end{pmatrix}\right), \]
\[ e_{si} \sim N\left(0, \frac{\gamma^2}{w_{si}}\right), \quad f_{si} \sim N\left(0, \frac{\gamma^2}{v_{si}}\right), \quad s = 1, \ldots, m, \; i = 1, \ldots, k_s, \]
where the variance-covariance matrix of (a_s, b_s)' equals the matrix D in
definition 3.2.
Proceeding as before, we now include distinct random intercepts for
diseased and non-diseased individuals:
\[ \Phi^{-1}\left(\frac{TN_{si}}{n_{0s}}\right) = \alpha_0 + a_{0s} + (\beta_0 + b_s)x_{si} + e_{si}, \tag{model DICS} \]
\[ \Phi^{-1}\left(\frac{FN_{si}}{n_{1s}}\right) = \alpha_1 + a_{1s} + (\beta_1 + b_s)x_{si} + f_{si}, \]
\[ (a_{0s}, a_{1s}, b_s)' \sim N\left(0, \begin{pmatrix} \tau_{0a}^2 & \rho_1\tau_{0a}\tau_{1a} & \rho_2\tau_{0a}\tau_b \\ \rho_1\tau_{0a}\tau_{1a} & \tau_{1a}^2 & \rho_3\tau_{1a}\tau_b \\ \rho_2\tau_{0a}\tau_b & \rho_3\tau_{1a}\tau_b & \tau_b^2 \end{pmatrix}\right), \]
\[ e_{si} \sim N\left(0, \frac{\gamma^2}{w_{si}}\right), \quad f_{si} \sim N\left(0, \frac{\gamma^2}{v_{si}}\right), \quad s = 1, \ldots, m, \; i = 1, \ldots, k_s. \]
The variance-covariance matrix of the random effects is a composition of
those we have already seen. The random intercept and random slope of the
same study, as well as the different random intercepts for non-diseased
and diseased of the same study, may be correlated. For ease of notation we
reuse the same symbols for the correlation coefficients across the
different models, although they denote different parameters.
Instead of including separate random intercepts for the groups, we could
add different random slopes for the non-diseased and diseased individuals.
This leads to model CIDS:
\[ \Phi^{-1}\left(\frac{TN_{si}}{n_{0s}}\right) = \alpha_0 + a_s + (\beta_0 + b_{0s})x_{si} + e_{si}, \tag{model CIDS} \]
\[ \Phi^{-1}\left(\frac{FN_{si}}{n_{1s}}\right) = \alpha_1 + a_s + (\beta_1 + b_{1s})x_{si} + f_{si}, \]
\[ (a_s, b_{0s}, b_{1s})' \sim N\left(0, \begin{pmatrix} \tau_a^2 & \rho_1\tau_a\tau_{0b} & \rho_2\tau_a\tau_{1b} \\ \rho_1\tau_a\tau_{0b} & \tau_{0b}^2 & \rho_3\tau_{0b}\tau_{1b} \\ \rho_2\tau_a\tau_{1b} & \rho_3\tau_{0b}\tau_{1b} & \tau_{1b}^2 \end{pmatrix}\right), \]
\[ e_{si} \sim N\left(0, \frac{\gamma^2}{w_{si}}\right), \quad f_{si} \sim N\left(0, \frac{\gamma^2}{v_{si}}\right), \quad s = 1, \ldots, m, \; i = 1, \ldots, k_s. \]
The last model we want to present is the most complex one, including
different random intercepts and different random slopes for the two groups:
\[ \Phi^{-1}\left(\frac{TN_{si}}{n_{0s}}\right) = \alpha_0 + a_{0s} + (\beta_0 + b_{0s})x_{si} + e_{si}, \tag{model DIDS} \]
\[ \Phi^{-1}\left(\frac{FN_{si}}{n_{1s}}\right) = \alpha_1 + a_{1s} + (\beta_1 + b_{1s})x_{si} + f_{si}, \]
\[ (a_{0s}, a_{1s}, b_{0s}, b_{1s})' \sim N\left(0, \begin{pmatrix} \tau_{0a}^2 & \rho_1\tau_{0a}\tau_{1a} & \rho_2\tau_{0a}\tau_{0b} & \rho_3\tau_{0a}\tau_{1b} \\ \rho_1\tau_{0a}\tau_{1a} & \tau_{1a}^2 & \rho_4\tau_{1a}\tau_{0b} & \rho_5\tau_{1a}\tau_{1b} \\ \rho_2\tau_{0a}\tau_{0b} & \rho_4\tau_{1a}\tau_{0b} & \tau_{0b}^2 & \rho_6\tau_{0b}\tau_{1b} \\ \rho_3\tau_{0a}\tau_{1b} & \rho_5\tau_{1a}\tau_{1b} & \rho_6\tau_{0b}\tau_{1b} & \tau_{1b}^2 \end{pmatrix}\right), \]
\[ e_{si} \sim N\left(0, \frac{\gamma^2}{w_{si}}\right), \quad f_{si} \sim N\left(0, \frac{\gamma^2}{v_{si}}\right), \quad s = 1, \ldots, m, \; i = 1, \ldots, k_s. \]
As one can see, the total number of parameters to estimate is quite high
for model DIDS. Hence, using this model requires sufficiently many data
points.
So far we have presented eight linear mixed effects models, which are
shown in an overview in table 3.1. A special case of these regression
models is obtained by fixing the slopes for the diseased and non-diseased
individuals to be equal, \beta_0 = \beta_1. Apart from that, no further
simplification of the fixed effects structure is feasible, as the main
idea is to estimate two distributions, for the non-diseased and diseased
individuals, respectively, that are separable by their location.
Under the given circumstances (estimation of two straight lines for the
diseased and non-diseased individuals, respectively, same fixed effects,
study as the only grouping factor, correlated random effects and the
threshold as the only covariate), table 3.1 contains a complete list of
all possible models.
3.3.4. Back-transformation. The linear regression provides intercepts
\alpha_j and slopes \beta_j, j = 0, 1. We want to back-transform the
regression lines to obtain the distributions for the non-diseased and
diseased individuals, i.e. to compute the distribution parameters \mu_j,
\sigma_j, j = 0, 1. In subsection 3.3.2 we obtained the equivalence of the
probit transformed specificity and one minus sensitivity to
Model   Specification
DIDS    Different random intercepts and different random slopes
CIDS    Common random intercept and different random slopes,
        a_{0s} = a_{1s} = a_s
DICS    Different random intercepts and common random slope,
        b_{0s} = b_{1s} = b_s
CICS    Common random intercept and common random slope,
        a_{0s} = a_{1s} = a_s, b_{0s} = b_{1s} = b_s
DS      Different random slopes,
        a_{0s} = a_{1s} = 0
CS      Common random slope,
        a_{0s} = a_{1s} = 0, b_{0s} = b_{1s} = b_s
DI      Different random intercepts,
        b_{0s} = b_{1s} = 0
CI      Common random intercept,
        a_{0s} = a_{1s} = a_s, b_{0s} = b_{1s} = 0

Table 3.1. Linear mixed effects models listed according to their random effects structure.
\[ \frac{z - \mu_j}{\sigma_j}, \quad j = 0, 1, \]
respectively, and the same for the logit transformed proportions. After
regressing the data we obtain
\[ \frac{z - \mu_j}{\sigma_j} = \alpha_j + \beta_j z \quad (j = 0, 1). \]
By equating coefficients we get
\[ \alpha_j = -\frac{\mu_j}{\sigma_j}, \quad \beta_j = \frac{1}{\sigma_j} \quad (j = 0, 1), \]
from which we obtain
\[ \mu_j = -\frac{\alpha_j}{\beta_j}, \quad \sigma_j = \frac{1}{\beta_j} \quad (j = 0, 1). \]
Thus, the \beta_j (j = 0, 1) must be positive to yield positive
dispersions. This means that Sp(z) and 1 - Se(z), i.e. the probabilities
of a negative test result, should increase with increasing cut-off values
within both groups.
Within studies this is always true (see figure 2.1). But if we combine
data of several studies, as is done in a meta-analysis, and regress the
combined data, it need not be true anymore.
As we can see, fixing \beta_0 = \beta_1 in the linear regression models
amounts to assuming that the biomarker distributions of the diseased and
non-diseased individuals have the same dispersion parameter, whereas
varying the intercept of the linear regression shifts the mean of the
distribution function.
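The back-transformation can be sketched in a few lines; the numbers are assumed for illustration and do not come from any example in the thesis.

```python
# Round trip through the back-transformation mu_j = -alpha_j / beta_j,
# sigma_j = 1 / beta_j (assumed, illustrative parameter values).

def back_transform(alpha, beta):
    if beta <= 0:
        raise ValueError("slope must be positive to yield a positive dispersion")
    return -alpha / beta, 1.0 / beta          # (mu, sigma)

mu0, sigma0 = 2.0, 1.5                        # assumed true parameters
alpha0, beta0 = -mu0 / sigma0, 1.0 / sigma0   # coefficients of the fitted line
mu_hat, sigma_hat = back_transform(alpha0, beta0)
```

The guard on the slope mirrors the positivity requirement discussed above: a non-positive fitted slope would give a non-positive dispersion.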
3.3.5. Optimal cut-off value. The optimal cut-off value of a
biomarker is the value where the Youden index is maximized (see sub-
section 2.1.5).
Let us first assume a normally distributed biomarker. Then we can write
the weighted Youden index at a cut-off value x as
\[ Y(x) = \lambda_w\left(1 - 2\Phi\left(\frac{x - \mu_1}{\sigma_1}\right)\right) + (1 - \lambda_w)\left(2\Phi\left(\frac{x - \mu_0}{\sigma_0}\right) - 1\right), \]
see equation (3). The weighted Youden index is maximized at one of the two
points of intersection of the weighted densities of the two normal
distributions \varphi_{\mu_1,\sigma_1} and \varphi_{\mu_0,\sigma_0}, thus at
one of the two solutions of
\[ \lambda_w \varphi_{\mu_1,\sigma_1}(x) = (1 - \lambda_w)\varphi_{\mu_0,\sigma_0}(x). \]
The argument x_0 of the maximal Youden index is given by
\[ x_0 = \frac{\mu_0\sigma_1^2 - \mu_1\sigma_0^2 + \sqrt{\sigma_0^2\sigma_1^2\left[2(\sigma_1^2 - \sigma_0^2)\left(\log\frac{\sigma_1}{\sigma_0} - \mathrm{logit}(\lambda_w)\right) + (\mu_1 - \mu_0)^2\right]}}{\sigma_1^2 - \sigma_0^2}. \]
If \sigma_0 = \sigma_1 =: \sigma holds, the argument x_0 of the maximal
Youden index is given by
\[ x_0 = \frac{\sigma^2\,\mathrm{logit}(\lambda_w) + \frac{1}{2}(\mu_0^2 - \mu_1^2)}{\mu_0 - \mu_1}. \]
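As a numerical sanity check (assumed parameter values, not taken from the thesis), the closed-form cut-off for σ0 ≠ σ1 can be compared with a grid search over the weighted Youden index:

```python
import math
from statistics import NormalDist

mu0, sigma0 = 0.0, 1.0           # assumed non-diseased distribution
mu1, sigma1 = 2.0, 1.5           # assumed diseased distribution
lam = 0.5                        # weight lambda_w

def youden(x):
    se = 1.0 - NormalDist(mu1, sigma1).cdf(x)
    sp = NormalDist(mu0, sigma0).cdf(x)
    return lam * (2 * se - 1) + (1 - lam) * (2 * sp - 1)

def logit(p):
    return math.log(p / (1 - p))

def optimal_cutoff():
    """Closed-form solution for sigma0 != sigma1 (formula above)."""
    d = sigma1**2 - sigma0**2
    root = math.sqrt(sigma0**2 * sigma1**2
                     * (2 * d * (math.log(sigma1 / sigma0) - logit(lam))
                        + (mu1 - mu0)**2))
    return (mu0 * sigma1**2 - mu1 * sigma0**2 + root) / d

x0 = optimal_cutoff()
grid_best = max((i / 1000 for i in range(-2000, 4000)), key=youden)
```

The closed-form value and the grid maximizer agree to grid resolution.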
Under the logistic assumption the weighted Youden index is defined as
\[ Y(x) = \lambda_w\bigl(1 - 2\,\mathrm{expit}_{\mu_1,\sigma_1}(x)\bigr) + (1 - \lambda_w)\bigl(2\,\mathrm{expit}_{\mu_0,\sigma_0}(x) - 1\bigr). \]
To obtain the point where the Youden index is maximized we need to solve
the equation
\[ \lambda_w\,\mathrm{expit}'_{\mu_1,\sigma_1}(x) = (1 - \lambda_w)\,\mathrm{expit}'_{\mu_0,\sigma_0}(x). \tag{5} \]
The prime symbol ' denotes the derivative, here and throughout the thesis.
As no analytical solution has been found yet, we propose a fixed point
iteration to compute the optimal cut-off. The solution x \in [\mu_0, \mu_1]
of (5) can be written as
\[ x = \mu_0 + \sigma_0\,\mathrm{arccosh}\left(\frac{1 - \lambda_w}{\lambda_w}\,\frac{\sigma_1}{\sigma_0}\left(1 + \cosh\left(\frac{x - \mu_1}{\sigma_1}\right)\right) - 1\right). \tag{6} \]
Interpreting the right-hand side of (6) as a function g of x leads to
\[ g(x) = \mu_0 + \sigma_0\,\mathrm{arccosh}\left(\frac{1 - \lambda_w}{\lambda_w}\,\frac{\sigma_1}{\sigma_0}\left(1 + \cosh\left(\frac{x - \mu_1}{\sigma_1}\right)\right) - 1\right). \tag{7} \]
Its inverse is given by
\[ g^{-1}(x) = \mu_1 - \sigma_1\,\mathrm{arccosh}\left(\frac{\lambda_w}{1 - \lambda_w}\,\frac{\sigma_0}{\sigma_1}\left(1 + \cosh\left(\frac{x - \mu_0}{\sigma_0}\right)\right) - 1\right). \tag{8} \]
We can apply fixed point iteration to these functions with, for example,
starting value (\mu_0 + \mu_1)/2. By the Banach fixed point theorem, a
function has precisely one fixed point, towards which the fixed point
iteration converges, if the function is Lipschitz continuous with Lipschitz
constant L \in [0, 1). Depending on the parameters \mu_0, \mu_1, \sigma_0
and \sigma_1, this needs to be checked for the functions in (7) and (8).
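A minimal sketch of the proposed fixed point iteration follows; the parameter values are assumed, and convergence is only guaranteed when the contraction condition above holds, which it does for these numbers.

```python
import math

mu0, sigma0 = 0.0, 1.0           # assumed non-diseased parameters
mu1, sigma1 = 2.0, 1.3           # assumed diseased parameters
lam = 0.5

def g(x):                        # right-hand side of equation (6)
    inner = ((1 - lam) / lam) * (sigma1 / sigma0) \
        * (1 + math.cosh((x - mu1) / sigma1)) - 1
    return mu0 + sigma0 * math.acosh(inner)

x = (mu0 + mu1) / 2              # proposed starting value
for _ in range(200):
    x = g(x)                     # fixed point iteration

def expit_deriv(x, mu, sigma):   # derivative of the logistic cdf
    z = (x - mu) / sigma
    return math.exp(-z) / (sigma * (1 + math.exp(-z)) ** 2)

# optimality condition (5) at the fixed point:
lhs = lam * expit_deriv(x, mu1, sigma1)
rhs = (1 - lam) * expit_deriv(x, mu0, sigma0)
```

If the iteration with g diverges, iterating with the inverse (8) instead can converge, since the two contraction constants are reciprocal.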
3.3.6. Confidence regions. To obtain confidence intervals we use the
delta method. The definition is taken from Agresti (1990), p. 422.
Definition 3.3 (Delta method). Let X_{n1}, X_{n2}, \ldots, X_{nl} be
asymptotically multivariate normally distributed random variables with
means \Theta_1, \Theta_2, \ldots, \Theta_l and covariance matrix \Sigma/n.
The subscript n expresses the dependence on the sample size n. We define
X_n = (X_{n1}, X_{n2}, \ldots, X_{nl})' and
\Theta = (\Theta_1, \Theta_2, \ldots, \Theta_l)'. More precisely, the X_n
converge in distribution as follows:
\[ \sqrt{n}(X_n - \Theta) \xrightarrow{d} N(0, \Sigma). \]
Suppose the function f(X_n) has a non-zero differential
d = (d_{X_1}, d_{X_2}, \ldots, d_{X_l})' at \Theta, where
\[ d_{X_i}(\Theta) = \left.\frac{\partial f(X_n)}{\partial X_{ni}}\right|_{(\Theta_1, \Theta_2, \ldots, \Theta_l)}. \]
It follows that
\[ \sqrt{n}\bigl(f(X_n) - f(\Theta)\bigr) \xrightarrow{d} N(0, d'\Sigma d). \]
Thus, for large samples, f(X_n) has a distribution similar to the normal
with mean f(\Theta) and variance (d'\Sigma d)/n.
Therefore, in the univariate case (l = 1) we can approximate
Var(f(X)), dropping the subscript n for ease of reading, by
\[ \bigl(f'(\Theta)\bigr)^2\,\mathrm{Var}(X), \]
and in the bivariate case (l = 2) we can approximate Var(f(X_1, X_2)) by
\[ d_{X_1}(\Theta)^2\,\mathrm{Var}(X_1) + 2\,d_{X_1}(\Theta)\,d_{X_2}(\Theta)\,\mathrm{Cov}(X_1, X_2) + d_{X_2}(\Theta)^2\,\mathrm{Var}(X_2). \]
3.3.6.1. Distribution parameters. In the following we calculate confidence
intervals for the distribution parameters, assuming \mu_j and \sigma_j,
j = 0, 1, to be asymptotically normally distributed. First, we compute the
variance of \sigma_j, j = 0, 1. With \sigma_j = 1/\beta_j in mind, we use
the univariate delta method with f(X) = 1/X and derivative
f'(X) = -1/X^2 and conclude
\[ \mathrm{Var}(\sigma_j) = \mathrm{Var}\left(\frac{1}{\beta_j}\right) = \frac{1}{\mathrm{E}(\beta_j)^4}\,\mathrm{Var}(\beta_j), \quad j = 0, 1. \]
For the computation of the variance of \mu_j = -\alpha_j/\beta_j,
j = 0, 1, we apply the bivariate delta method with
f(X_1, X_2) = X_2/X_1 and partial derivatives
\frac{\partial f}{\partial X_1} = -X_2/X_1^2 and
\frac{\partial f}{\partial X_2} = 1/X_1. For j = 0, 1 this results in
\[ \mathrm{Var}(\mu_j) = \mathrm{Var}\left(-\frac{\alpha_j}{\beta_j}\right) = \frac{\mathrm{E}(\alpha_j)^2}{\mathrm{E}(\beta_j)^4}\,\mathrm{Var}(\beta_j) - 2\,\frac{\mathrm{E}(\alpha_j)}{\mathrm{E}(\beta_j)^3}\,\mathrm{Cov}(\alpha_j, \beta_j) + \frac{1}{\mathrm{E}(\beta_j)^2}\,\mathrm{Var}(\alpha_j). \]
The variances of \alpha_j and \beta_j, j = 0, 1, are obtained from the
linear regression.
We get the (1 - \alpha) confidence interval for \sigma_j, j = 0, 1, as
\[ \left[\sigma_j - z_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(\sigma_j)},\; \sigma_j + z_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(\sigma_j)}\right] \]
and the (1 - \alpha) confidence interval for \mu_j, j = 0, 1, as
\[ \left[\mu_j - z_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(\mu_j)},\; \mu_j + z_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(\mu_j)}\right], \]
where z_{1-\frac{\alpha}{2}} is the (1 - \frac{\alpha}{2}) quantile of the
standard normal distribution.
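The computation can be sketched as follows; the coefficient estimates and their (co)variances are made-up numbers standing in for lmer() output.

```python
import math
from statistics import NormalDist

alpha, beta = -1.2, 0.8                    # assumed fitted intercept and slope
var_alpha, var_beta, cov_ab = 0.04, 0.01, -0.005   # assumed from the regression
z = NormalDist().inv_cdf(0.975)            # for a 95% confidence interval

mu = -alpha / beta
sigma = 1.0 / beta

var_sigma = var_beta / beta**4             # univariate delta method
var_mu = (alpha**2 / beta**4) * var_beta \
    - 2 * (alpha / beta**3) * cov_ab \
    + var_alpha / beta**2                  # bivariate delta method

ci_sigma = (sigma - z * math.sqrt(var_sigma), sigma + z * math.sqrt(var_sigma))
ci_mu = (mu - z * math.sqrt(var_mu), mu + z * math.sqrt(var_mu))
```

In practice the expectations in the formulas are replaced by the regression estimates, as done here.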
3.3.6.2. Sensitivity and specificity. To construct (1 - \alpha) confidence
intervals for sensitivity and specificity, we first consider confidence
intervals for the (probit or logit) transformed specificity and one minus
sensitivity. We demonstrate the procedure for the probit transformation
only, but it holds just as well for the logit transformation, simply
replacing \Phi by expit and \Phi^{-1} by logit.
The variances of the probit transformed specificity (now called y_0) and
the probit transformed one minus sensitivity (now called y_1) at a fixed
cut-off value x are given by
\[ \mathrm{Var}\bigl(\Phi^{-1}(\mathrm{Sp})\bigr) = \mathrm{Var}(y_0) = \mathrm{Var}(\alpha_0 + \beta_0 x) = \mathrm{Var}(\alpha_0) + 2x\,\mathrm{Cov}(\alpha_0, \beta_0) + x^2\,\mathrm{Var}(\beta_0), \]
\[ \mathrm{Var}\bigl(\Phi^{-1}(1 - \mathrm{Se})\bigr) = \mathrm{Var}(y_1) = \mathrm{Var}(\alpha_1 + \beta_1 x) = \mathrm{Var}(\alpha_1) + 2x\,\mathrm{Cov}(\alpha_1, \beta_1) + x^2\,\mathrm{Var}(\beta_1). \]
Thus, assuming the transformed specificity and one minus sensitivity to be
approximately normally distributed, the confidence interval for the
transformed specificity is given by
\[ \left[y_0 - z_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(y_0)};\; y_0 + z_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(y_0)}\right] \]
and for the transformed one minus sensitivity by
\[ \left[y_1 - z_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(y_1)};\; y_1 + z_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(y_1)}\right]. \]
We back-transform these confidence intervals with \Phi to obtain the
confidence interval of the specificity,
\[ \left[\Phi\left(y_0 - z_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(y_0)}\right);\; \Phi\left(y_0 + z_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(y_0)}\right)\right], \]
and of the sensitivity,
\[ \left[1 - \Phi\left(y_1 + z_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(y_1)}\right);\; 1 - \Phi\left(y_1 - z_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(y_1)}\right)\right]. \]
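A pointwise interval for the specificity at a fixed cut-off can be sketched as below; the coefficient values are assumed, and Φ is evaluated with the standard library's normal distribution.

```python
import math
from statistics import NormalDist

std = NormalDist()
alpha0, beta0 = -1.2, 0.8                  # assumed regression coefficients
var_a0, var_b0, cov_a0b0 = 0.04, 0.01, -0.005
x = 2.0                                    # cut-off of interest
z = std.inv_cdf(0.975)

y0 = alpha0 + beta0 * x                    # probit transformed specificity
var_y0 = var_a0 + 2 * x * cov_a0b0 + x**2 * var_b0

lo = y0 - z * math.sqrt(var_y0)
hi = y0 + z * math.sqrt(var_y0)
sp = std.cdf(y0)                           # pooled specificity at x
ci_sp = (std.cdf(lo), std.cdf(hi))         # back-transformed with Phi
```

Because Φ is monotone, the back-transformed interval automatically lies within (0, 1).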
3.3.6.3. Optimal cut-off. In the following we determine a confidence
interval for the optimal cut-off under the normal distribution assumption.
Recall the formula for the optimal cut-off x_0 if \sigma_0 \neq \sigma_1:
\[ x_0 = \frac{\mu_0\sigma_1^2 - \mu_1\sigma_0^2 + \sqrt{\sigma_0^2\sigma_1^2\left[2(\sigma_1^2 - \sigma_0^2)\left(\log\frac{\sigma_1}{\sigma_0} - \mathrm{logit}(\lambda_w)\right) + (\mu_1 - \mu_0)^2\right]}}{\sigma_1^2 - \sigma_0^2}. \]
First, we conduct a reparametrization with
\[ \mu_j = -\frac{\alpha_j}{\beta_j}, \quad \sigma_j = \frac{1}{\beta_j} \quad (j = 0, 1), \]
to achieve a representation of the optimal cut-off which depends on
\alpha_0, \alpha_1, \beta_0 and \beta_1:
\[ x_0 = h(\alpha_0, \alpha_1, \beta_0, \beta_1) = \frac{\alpha_1\beta_1 - \alpha_0\beta_0 + \sqrt{2(\beta_0^2 - \beta_1^2)\left(\log\frac{\beta_0}{\beta_1} - \mathrm{logit}(\lambda_w)\right) + (\alpha_0\beta_1 - \alpha_1\beta_0)^2}}{\beta_0^2 - \beta_1^2}. \]
Assuming the optimal cut-off to be asymptotically normally distributed, we
can use the delta method to obtain a variance. Let \Theta be the mean
vector (\mathrm{E}(\alpha_0), \mathrm{E}(\alpha_1), \mathrm{E}(\beta_0), \mathrm{E}(\beta_1))'.
The delta method implies that the approximate variance of the optimal
cut-off is given by
\[ \left.\frac{\partial h}{\partial \alpha_0}\right|_\Theta^2 \mathrm{Var}(\alpha_0) + \left.\frac{\partial h}{\partial \alpha_1}\right|_\Theta^2 \mathrm{Var}(\alpha_1) + \left.\frac{\partial h}{\partial \beta_0}\right|_\Theta^2 \mathrm{Var}(\beta_0) + \left.\frac{\partial h}{\partial \beta_1}\right|_\Theta^2 \mathrm{Var}(\beta_1) \]
\[ + 2\left.\frac{\partial h}{\partial \alpha_0}\right|_\Theta \left.\frac{\partial h}{\partial \alpha_1}\right|_\Theta \mathrm{Cov}(\alpha_0, \alpha_1) + 2\left.\frac{\partial h}{\partial \alpha_0}\right|_\Theta \left.\frac{\partial h}{\partial \beta_0}\right|_\Theta \mathrm{Cov}(\alpha_0, \beta_0) \]
\[ + 2\left.\frac{\partial h}{\partial \alpha_0}\right|_\Theta \left.\frac{\partial h}{\partial \beta_1}\right|_\Theta \mathrm{Cov}(\alpha_0, \beta_1) + 2\left.\frac{\partial h}{\partial \alpha_1}\right|_\Theta \left.\frac{\partial h}{\partial \beta_0}\right|_\Theta \mathrm{Cov}(\alpha_1, \beta_0) \]
\[ + 2\left.\frac{\partial h}{\partial \alpha_1}\right|_\Theta \left.\frac{\partial h}{\partial \beta_1}\right|_\Theta \mathrm{Cov}(\alpha_1, \beta_1) + 2\left.\frac{\partial h}{\partial \beta_0}\right|_\Theta \left.\frac{\partial h}{\partial \beta_1}\right|_\Theta \mathrm{Cov}(\beta_0, \beta_1). \]
The variances and covariances in the formula result from the linear
regression. Denoting
\[ S := \sqrt{2(\beta_0^2 - \beta_1^2)\left(\log\frac{\beta_0}{\beta_1} - \mathrm{logit}(\lambda_w)\right) + (\alpha_0\beta_1 - \alpha_1\beta_0)^2}, \]
the partial derivatives are given by
\[ \frac{\partial h}{\partial \alpha_0} = \frac{-\beta_0 + \frac{\beta_1}{S}(\alpha_0\beta_1 - \alpha_1\beta_0)}{\beta_0^2 - \beta_1^2}, \]
\[ \frac{\partial h}{\partial \alpha_1} = \frac{\beta_1 - \frac{\beta_0}{S}(\alpha_0\beta_1 - \alpha_1\beta_0)}{\beta_0^2 - \beta_1^2}, \]
\[ \frac{\partial h}{\partial \beta_0} = \frac{-\alpha_0 + \frac{1}{\beta_0 S}(\beta_0^2 - \beta_1^2)}{\beta_0^2 - \beta_1^2} + \frac{4\beta_0\left(\log\frac{\beta_0}{\beta_1} - \mathrm{logit}(\lambda_w)\right) - 2\alpha_1(\alpha_0\beta_1 - \alpha_1\beta_0)}{2S(\beta_0^2 - \beta_1^2)} - \frac{2\beta_0(\alpha_1\beta_1 - \alpha_0\beta_0 + S)}{(\beta_0^2 - \beta_1^2)^2}, \]
\[ \frac{\partial h}{\partial \beta_1} = \frac{\alpha_1 - \frac{1}{\beta_1 S}(\beta_0^2 - \beta_1^2)}{\beta_0^2 - \beta_1^2} + \frac{-4\beta_1\left(\log\frac{\beta_0}{\beta_1} - \mathrm{logit}(\lambda_w)\right) + 2\alpha_0(\alpha_0\beta_1 - \alpha_1\beta_0)}{2S(\beta_0^2 - \beta_1^2)} + \frac{2\beta_1(\alpha_1\beta_1 - \alpha_0\beta_0 + S)}{(\beta_0^2 - \beta_1^2)^2}. \]
If \sigma_0 = \sigma_1 =: \sigma holds, the optimal cut-off value is given
by
\[ x_0 = \frac{\sigma^2\,\mathrm{logit}(\lambda_w) + \frac{1}{2}(\mu_0^2 - \mu_1^2)}{\mu_0 - \mu_1}. \]
As \sigma_0 = \sigma_1, it follows that \beta_0 = \beta_1 =: \beta. We
first reparametrize so as to obtain the optimal cut-off value as a function
of \alpha_0, \alpha_1 and \beta:
\[ x_0 = h_\sigma(\alpha_0, \alpha_1, \beta) = \frac{\mathrm{logit}(\lambda_w) + \frac{1}{2}(\alpha_0^2 - \alpha_1^2)}{\beta(\alpha_1 - \alpha_0)}. \]
Then, with the delta method, we obtain the following estimate of the
variance of x_0, with \Theta being the mean vector
(\mathrm{E}(\alpha_0), \mathrm{E}(\alpha_1), \mathrm{E}(\beta))':
\[ \left.\frac{\partial h_\sigma}{\partial \alpha_0}\right|_\Theta^2 \mathrm{Var}(\alpha_0) + \left.\frac{\partial h_\sigma}{\partial \alpha_1}\right|_\Theta^2 \mathrm{Var}(\alpha_1) + \left.\frac{\partial h_\sigma}{\partial \beta}\right|_\Theta^2 \mathrm{Var}(\beta) \]
\[ + 2\left.\frac{\partial h_\sigma}{\partial \alpha_0}\right|_\Theta \left.\frac{\partial h_\sigma}{\partial \alpha_1}\right|_\Theta \mathrm{Cov}(\alpha_0, \alpha_1) + 2\left.\frac{\partial h_\sigma}{\partial \alpha_0}\right|_\Theta \left.\frac{\partial h_\sigma}{\partial \beta}\right|_\Theta \mathrm{Cov}(\alpha_0, \beta) + 2\left.\frac{\partial h_\sigma}{\partial \alpha_1}\right|_\Theta \left.\frac{\partial h_\sigma}{\partial \beta}\right|_\Theta \mathrm{Cov}(\alpha_1, \beta), \]
where the derivatives are given by
\[ \frac{\partial h_\sigma}{\partial \alpha_0} = \frac{\alpha_0(\alpha_1 - \alpha_0) + \mathrm{logit}(\lambda_w) + \frac{1}{2}(\alpha_0^2 - \alpha_1^2)}{\beta(\alpha_1 - \alpha_0)^2}, \]
\[ \frac{\partial h_\sigma}{\partial \alpha_1} = -\frac{\alpha_1(\alpha_1 - \alpha_0) + \mathrm{logit}(\lambda_w) + \frac{1}{2}(\alpha_0^2 - \alpha_1^2)}{\beta(\alpha_1 - \alpha_0)^2}, \]
\[ \frac{\partial h_\sigma}{\partial \beta} = -\frac{\mathrm{logit}(\lambda_w) + \frac{1}{2}(\alpha_0^2 - \alpha_1^2)}{\beta^2(\alpha_1 - \alpha_0)}. \]
Thus, the confidence interval of the optimal cut-off in the normal
distribution case is given by
\[ \left[x_0 - z_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(x_0)},\; x_0 + z_{1-\frac{\alpha}{2}}\sqrt{\mathrm{Var}(x_0)}\right]. \]
Under the logistic assumption, the optimal cut-off can only be determined
by a fixed point iteration, so we cannot obtain a variance estimate in the
same way.
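The whole construction for σ0 ≠ σ1 can be sketched compactly. To keep the sketch short, the gradient of h is obtained by central finite differences instead of the analytic partial derivatives given above; all inputs are assumed values standing in for regression output.

```python
import math
from statistics import NormalDist

lam = 0.5

def logit(p):
    return math.log(p / (1 - p))

def h(a0, a1, b0, b1):                     # reparametrized optimal cut-off
    s = math.sqrt(2 * (b0**2 - b1**2) * (math.log(b0 / b1) - logit(lam))
                  + (a0 * b1 - a1 * b0) ** 2)
    return (a1 * b1 - a0 * b0 + s) / (b0**2 - b1**2)

theta = [-0.5, -2.0, 1.0, 0.8]             # assumed (alpha0, alpha1, beta0, beta1)
cov = [[0.020, 0.005, 0.001, 0.001],       # assumed covariance matrix
       [0.005, 0.030, 0.001, 0.002],
       [0.001, 0.001, 0.004, 0.001],
       [0.001, 0.002, 0.001, 0.005]]

def grad(f, p, eps=1e-6):                  # central finite differences
    out = []
    for i in range(len(p)):
        up, dn = list(p), list(p)
        up[i] += eps
        dn[i] -= eps
        out.append((f(*up) - f(*dn)) / (2 * eps))
    return out

d = grad(h, theta)
var_x0 = sum(d[i] * cov[i][j] * d[j] for i in range(4) for j in range(4))
x0 = h(*theta)
z = NormalDist().inv_cdf(0.975)
ci = (x0 - z * math.sqrt(var_x0), x0 + z * math.sqrt(var_x0))
```

Using the analytic derivatives instead of numerical differentiation gives the same interval up to discretization error.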
3.4. Model Selection
In the novel approach a broad range of linear mixed models are
introduced. This leads directly to the question of model selection. The
model selection of linear mixed effects models is a current research
question, as common measures known from fixed effects models cannot be
used for mixed models. Several approaches to model selection have been
presented in recent years. We briefly present two that seem promising to
us. Unfortunately, neither approach is suitable for selecting one of our
models.
3.4.1. REML criterion. It is common to estimate mixed models
with a restricted maximum likelihood (REML) approach. The idea is to
obtain less biased variance estimates for the random effects, considering
linear combinations of the data that remove the fixed effects (Faraway
(2006), p. 172). The REML criterion is defined as
\[ \mathrm{REML} = -2\log(\mathrm{Lik}_{\mathrm{REML}}), \]
where \mathrm{Lik}_{\mathrm{REML}} is the likelihood function of the
transformed data. The preferred model is the one with the smallest REML
criterion.
As the fixed effects are eliminated, the REML criterion can only compare
mixed models with the same fixed effects. Thus, for our range of models
(where some models assume equal fixed slopes \beta_0 = \beta_1 and others
do not), this model selection method is not appropriate.
3.4.2. Conditional AIC. The Akaike information criterion (AIC) is well known from linear fixed effects models. It is defined as follows:

Definition 3.4 (Akaike information criterion). Let L̂ be the maximized value of the likelihood function for the model and p the number of estimated parameters in the model. Then the AIC of the model is defined as

AIC = −2 ln(L̂) + 2p.
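For a fitted model with known maximized log-likelihood, the AIC is a one-line computation; a minimal Python sketch (function name our own):

```python
def aic(max_log_likelihood, p):
    """Akaike information criterion: -2*ln(L_hat) + 2*p, where the
    first argument is the maximized log-likelihood ln(L_hat)."""
    return -2 * max_log_likelihood + 2 * p
```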
The problem of using this measure with mixed effects models is the lack of clarity on how to determine the number of parameters p for the
model. Imagine we had a mixed effects model with only one random
effect. So we could argue that we only have one parameter more than
the corresponding fixed effects model, the variance parameter for that
random effect. But on the other hand, we are incorporating as many
random effects as we have studies. So what is the correct number of
parameters?
To face this problem, Vaida and Blanchard (2005) proposed a condi-
tional Akaike information criterion (cAIC). Instead of using twice the
number of parameters as penalty term, they propose a penalty term
which is related to the effective degrees of freedom for a linear mixed
model stated by Hodges and Sargent (2001). The effective degrees of
freedom reflect an intermediate level of complexity between the fixed-
effects model without the random effects and a corresponding model
with all random effects counted as fixed ones. Greven and Kneib (2010)
identified deficits of the cAIC (the estimation of the random effects co-
variance matrix induces bias) and derived an analytic representation of
a corrected version of the cAIC. They also provide an implementation
in an R package.
This approach seems very promising, but unfortunately the R implementation did not yet work in all of our examples, and its results were not entirely plausible.
3.5. Implementation in R
To apply the linear mixed models presented in subsection 3.3.3 we
use the lmer() function of the R-package ’lme4’. The function provides
maximum likelihood or restricted maximum likelihood estimates of the
parameters in linear mixed effects models. The model is described
by a formula, including fixed and random effects terms. To estimate
the parameters of, for example, model DI with normal distribution
assumption, we use the following code:
lmer(qnorm(NN) ~ Group * Cut + (Group | Study)),

where the left-hand side of the ~ symbol represents the response variable and the right-hand side the explanatory variables. Here qnorm() is the probit function and 'NN' a combined vector of the proportions of the negative test results, first of the non-diseased and then of the diseased individuals.
On the right-hand side of the formula Group * Cut stands for the fixed effects and (Group | Study) for the random effects with study as clustering factor. 'Group' is a vector containing zeros in the first half and ones in the second, standing for the non-diseased and diseased individuals, respectively. 'Cut' is the vector of thresholds of studies 1 to m. The * symbol means that three regression parameters will be estimated, one for the covariate Group (i1), one for the interaction Group·Cut (s1) and one for Cut (s0). With the intercept i0 (which is estimated by default), we obtain the fixed effects parameters of model DI as follows:
α0 = i0,
α1 = i0 + i1,
β0 = s0,
β1 = s0 + s1,
as for the non-diseased the group variable is zero and for the diseased
it is one.
In the following, let us examine the random effects. There are two
random effects, the default random intercept i0 and a random effect i1
of the covariate Group. To obtain the random effects of model DI we
proceed analogously to the fixed effects:
a0s = i0,
a1s = i0 + i1.
For the variances and covariance it thus holds that

τ²0a = Var(i0),
τ²1a = Var(i0 + i1) = Var(i0) + 2 Cov(i0, i1) + Var(i1),
Cov(a0s, a1s) = Cov(i0, i0 + i1) = τ²0a + Cov(i0, i1).
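The mapping from the lmer() output to the parameters of model DI can be collected in a small helper. The following Python sketch (function and argument names our own) mirrors the equations above:

```python
def di_parameters(i0, i1, s0, s1, var_i0, var_i1, cov_i0_i1):
    """Recover the parameters of model DI from the lmer() estimates:
    fixed effects i0, i1, s0, s1 and the variance-covariance entries
    of the two random effects."""
    fixed = {
        "alpha0": i0,        # intercept, non-diseased
        "alpha1": i0 + i1,   # intercept, diseased
        "beta0": s0,         # slope, non-diseased
        "beta1": s0 + s1,    # slope, diseased
    }
    random_effects = {
        "tau0a_sq": var_i0,
        "tau1a_sq": var_i0 + 2 * cov_i0_i1 + var_i1,
        "cov_a0s_a1s": var_i0 + cov_i0_i1,
    }
    return fixed, random_effects
```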
In R the linear mixed model is written as a single equation to allow for correlation between the random effects of the non-diseased and the diseased individuals.
One needs to pay attention to the number of parameters that need to
be estimated for a model. For the most complex model, model DIDS,
the number of parameters is the highest and none of our examples of
chapter 4 provided enough data to estimate these parameters.
To include heteroscedasticity with a small number of additional parameters (Gałecki and Burzykowski, 2013, p.124), the lmer() function assumes the variance of the residual error of study s and cut-off i to be an unknown scale parameter γ² divided by a given prior weight wsi or vsi. By default wsi = vsi = 1 is assumed for all observations (Bates et al., 2015, p.42). Thus, the higher the weight for a
given observation, the lower the variance. The prior weights are not
normalized or standardized. Therefore, if the weights have relatively
large magnitudes, then in order to compensate, the parameter γ will
need to have a relatively large magnitude (Bates et al., 2015, p.42).
This impairs the convergence of the lmer function, as the eigenvalues
rise. So it is advisable to scale the weights. We propose scaling the weights so that they have a mean of one. In section 3.6 we will take a look at possible prior weights.
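The proposed scaling to mean one is straightforward; a minimal Python sketch (function name our own):

```python
def scale_to_mean_one(weights):
    """Scale prior weights so that their mean is one, easing
    convergence of the mixed-model fit."""
    mean = sum(weights) / len(weights)
    return [w / mean for w in weights]
```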
3.6. Weighting Parameters
In meta-analysis it is reasonable to assign different weights to different studies. First of all, studies with larger sample sizes should be emphasized, as the information these studies provide represents a larger part of the population and they are thus expected to be more reliable. Secondly, one could emphasize studies with smaller variance. A small variance can be due to a large sample size and to homogeneous test results, both of which are desirable. In the following we want to
present these two possibilities of weighting.
3.6.1. Sample Size. To weight the studies with their sample size,
we set weights w0si and w1si for the non-diseased and diseased individ-
uals of study s and cut-off i as
w0si = n0s ,
w1si = n1s ,
where n0s is the number of non-diseased individuals in study s and n1s
the number of diseased individuals in study s. It is important to weight
the groups separately, because a study may have different sample sizes
in each group.
3.6.2. Inverse Variance. To determine inverse variance weights,
we need to compute the variances of the probit and logit transformed
specificity and one minus sensitivity out of the data. Therefore, we use
the delta method (see definition 3.3).
First, we determine the mean and variance of the estimated specificity
and one minus sensitivity. Considering the TN as a (n0, p0)-binomial
distributed random variable within the non-diseased, with p0 being the
probability of a negative test result for a non-diseased individual and
n0 the number of non-diseased individuals, we have
E(Ŝp) = E(TN/n0) = (1/n0) E(TN) = p0,   (9)
Var(Ŝp) = Var(TN/n0) = (1/n0²) Var(TN) = p0(1 − p0)/n0.   (10)
Analogously we assume the TP as being a (n1,p1)-binomial distributed
random variable with p1 the probability of a positive test result for a
diseased individual and n1 the number of diseased individuals. Thereby
we have
E(1 − Ŝe) = 1 − p1,   (11)
Var(1 − Ŝe) = p1(1 − p1)/n1.   (12)
Let us first consider the probit transformation Φ⁻¹. For a random variable X the derivative of the probit function is given by

(Φ⁻¹)′(X) = 1 / φ(Φ⁻¹(X)),   (13)

where φ is the density function of the standard normal distribution.
Let us assume that the specificity and one minus the sensitivity are asymptotically normally distributed. With the delta method we obtain

Var(Φ⁻¹(Ŝp)) = (1 / φ(Φ⁻¹(E(Ŝp))))² · Var(Ŝp)          [by (13)]
             = (1 / φ(Φ⁻¹(p0)))² · p0(1 − p0)/n0.       [by (9), (10)]
Estimating p0 as p̂0 = TN/n0, we get for the estimated variance

V̂ar(Φ⁻¹(Ŝp)) = (1 / φ(Φ⁻¹(p̂0)))² · p̂0(1 − p̂0)/n0
             = (1 / φ(Φ⁻¹(TN/n0)))² · (TN · FP)/n0³.
For 1-sensitivity we obtain analogously

Var(Φ⁻¹(1 − Ŝe)) = (1 / φ(Φ⁻¹(1 − p1)))² · p1(1 − p1)/n1   [by (13), (11), (12)]
and with the probability p1 estimated as p̂1 = TP/n1, we conclude for the estimated variance

V̂ar(Φ⁻¹(1 − Ŝe)) = (1 / φ(Φ⁻¹(FN/n1)))² · (TP · FN)/n1³.
Thus, the weights of study s and cut-off i for the non-diseased and diseased, defined as (unscaled) inverse variances, result in

w0si = 1 / V̂ar(Φ⁻¹(TNsi/n0s)) = n0s³ · φ(Φ⁻¹(TNsi/n0s))² / (TNsi · FPsi),
w1si = 1 / V̂ar(Φ⁻¹(FNsi/n1s)) = n1s³ · φ(Φ⁻¹(FNsi/n1s))² / (TPsi · FNsi).
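The probit inverse-variance weights above depend only on the 2×2 table of a study at a given cut-off. The following Python sketch (function name our own; the standard-library NormalDist supplies φ and Φ⁻¹) computes them:

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # phi = .pdf, probit = .inv_cdf

def probit_weights(TN, FP, FN, TP):
    """Unscaled inverse-variance weights of one study at one cut-off
    under the probit transformation:
    w0 = n0^3 * phi(probit(TN/n0))^2 / (TN * FP),
    w1 = n1^3 * phi(probit(FN/n1))^2 / (TP * FN)."""
    n0 = TN + FP  # number of non-diseased
    n1 = TP + FN  # number of diseased
    w0 = n0 ** 3 * STD_NORMAL.pdf(STD_NORMAL.inv_cdf(TN / n0)) ** 2 / (TN * FP)
    w1 = n1 ** 3 * STD_NORMAL.pdf(STD_NORMAL.inv_cdf(FN / n1)) ** 2 / (TP * FN)
    return w0, w1
```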
Now let us consider the logit transformation. For a random variable X the derivative of the logit function is given by

logit′(X) = 1/(X(1 − X)).   (14)

Then we obtain with the delta method

Var(logit(Ŝp)) = 1/(E(Ŝp)²(1 − E(Ŝp))²) · Var(Ŝp)   [by (14)]
              = 1/(p0²(1 − p0)²) · p0(1 − p0)/n0
              = 1/(n0 p0(1 − p0)).   [by (9), (10)]
The estimated variance results in

V̂ar(logit(Ŝp)) = 1/(n0 p̂0(1 − p̂0)) = n0/(TN · FP).
To calculate the variance of one minus sensitivity we proceed as above and get

Var(logit(1 − Ŝe)) = 1/((1 − p1)² p1²) · p1(1 − p1)/n1 = 1/(n1 p1(1 − p1)).   [by (14), (11), (12)]
Then the estimate of the variance is

V̂ar(logit(1 − Ŝe)) = 1/(n1 p̂1(1 − p̂1)) = n1/(TP · FN).
So for the weights w0si of the non-diseased in study s and cut-off i and w1si of the diseased in study s and cut-off i we obtain the (unscaled) inverse-variance weights

w0si = (TNsi · FPsi)/n0s,
w1si = (TPsi · FNsi)/n1s.
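Under the logit transformation the weights reduce to simple functions of the cell counts; a minimal Python sketch (function name our own):

```python
def logit_weights(TN, FP, FN, TP):
    """Unscaled inverse-variance weights of one study at one cut-off
    under the logit transformation: w0 = TN*FP/n0, w1 = TP*FN/n1."""
    n0 = TN + FP  # number of non-diseased
    n1 = TP + FN  # number of diseased
    return TN * FP / n0, TP * FN / n1
```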
CHAPTER 4
Examples
In the following chapter we want to apply the novel approach to
different examples. In all plots the (transformed) proportions of neg-
ative test results of the non-diseased individuals will be depicted as
open circles and the ones of the diseased individuals as filled circles.
The different studies are marked in different colors. The regression
line (and the distribution function) of the non-diseased is depicted as a
dashed line, whereas the regression line (and the distribution function)
of the diseased is depicted as a continuous line. Grey lines mark the
confidence regions of the estimated specificity and one minus sensitiv-
ity, respectively.
To apply our approach to the examples, we linearly regress the transformed proportions of negative test results for the diseased and non-diseased, respectively. To this end, we use the lmer() function in R with normal distribution assumption, REML estimation and inverse variance weights scaled to mean one. To avoid problems with zero values, we add a
continuity correction of 0.5 to TN, TP, FN and FP. We limit our-
selves to using models of the general form, i.e. where the fixed slopes
of non-diseased and diseased individuals may differ, and mark
these models with ’*’. To choose one model of this range, we select the
one with the smallest REML criterion. This procedure leads to model
*DICS in almost all examples shown. We use a weighting parameter
λw of 0.5, meaning that sensitivity and specificity are equally weighted,
except when noted otherwise.
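The preprocessing just described (continuity correction of 0.5 and probit transformation of the proportions of negative test results) can be sketched as follows; the function name is our own:

```python
from statistics import NormalDist

probit = NormalDist().inv_cdf

def transform_proportions(TN, FP, FN, TP, cc=0.5):
    """Apply the continuity correction to the 2x2 counts and
    probit-transform the proportions of negative test results in the
    non-diseased and diseased group."""
    TN, FP, FN, TP = TN + cc, FP + cc, FN + cc, TP + cc
    specificity = TN / (TN + FP)        # P(negative | non-diseased)
    one_minus_sens = FN / (FN + TP)     # P(negative | diseased)
    return probit(specificity), probit(one_minus_sens)
```

The correction keeps the transformation finite even when a cell count is zero.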
4.1. Troponin as a marker for myocardial infarction
The first example that we want to consider is the data of the sys-
tematic review of Zhelev et al. (2015) where they investigated the ”di-
agnostic accuracy of single baseline measurement of Elecsys Troponin
T high-sensitive assay for diagnosis of acute myocardial infarction in
[Figure: plot of P(negative test result) against the troponin threshold in ng/L (log scale).]
Figure 4.1. Troponin data. The proportions of negative test results of the non-diseased are depicted as open circles (estimated specificities), the ones of the diseased as filled circles (estimated 1-sensitivities). The data points belonging to different studies are marked in different colors and data points of the same studies are connected with lines.
emergency department”. There are 23 studies included, where 8 stud-
ies report 2 to 4 thresholds and 13 only one. Most of them (20 studies)
reported a threshold of 14 ng/L, as this is the manufacturer’s recom-
mended threshold (the 99% quantile of a healthy reference population).
Four studies reported a threshold of 3 ng/L and four a threshold of 5
ng/L. Furthermore, some other thresholds were reported. The data
can be found in table A.1 in the appendix.
Plotting the proportions of negative test results against the logarith-
mized threshold of troponin leads to figure 4.1.
Zhelev et al. conducted two meta-analyses using a bivariate model,
[Figure: plot of Φ⁻¹(P(negative test result)) against the troponin threshold in ng/L (log scale).]
Figure 4.2. Troponin data. Regression lines of the non-diseased (open circles, dashed line) and the diseased individuals (filled circles, solid line). Different studies are marked in different colors.
one pooling the data for the 14 ng/L threshold and one combined for
the thresholds 3 and 5 ng/L, as there was not enough data to perform
separate meta-analyses for these thresholds as well.
To apply our novel approach, we will proceed step-by-step. First, we
transform the proportions of negative test results applying the probit
function. After the transformation, we use a linear mixed effects model
to regress the data. From all *-models we select model *DICS, as it
has the smallest REML criterion of this range of models. The result
can be seen in figure 4.2.
[Figure: plot of P(negative test result) against the troponin threshold in ng/L (log scale), with the optimal threshold at 20.8 ng/L.]
Figure 4.3. Troponin data. Biomarker distributions within the non-diseased (open circles, dashed line) and within the diseased individuals (filled circles, solid line). The grey lines mark the confidence regions. The optimal threshold, derived from a maximization of the Youden index, is depicted as a solid vertical line. Different studies are marked in different colors.
We back-transform the data and the parameters and obtain the dis-
tribution functions of the biomarker within the non-diseased and dis-
eased individuals, respectively (see figure 4.3). The resulting sensitiv-
ity at threshold 14 ng/L is 0.88 [0.85, 0.91] and the specificity is 0.75
[0.68, 0.82]. Zhelev et al. obtained a higher diagnostic accuracy of the
biomarker at the threshold 14 ng/L, as the sensitivity was 0.90 [0.86,
0.92] and the specificity was 0.77 [0.69, 0.84].
Our optimal threshold, however, determined by a maximization of the Youden index, is 20.78 ng/L. This is a lot higher than the recommended
[Figure: plot of the Youden index against the troponin threshold in ng/L (log scale), with the optimal threshold at 20.8 ng/L.]
Figure 4.4. The Youden index of the troponin data with weighting parameter λw = 0.5. The optimal threshold, depicted as a solid vertical line, is derived as that threshold where the maximum is obtained.
threshold. The Youden index (with weighting parameter λw = 0.5, see
equation (3)) can be seen in figure 4.4.
At threshold 20.78 ng/L the sensitivity is 0.81 [0.75, 0.86] and the specificity is 0.86 [0.82, 0.90]. This means that with the threshold of 20.78 ng/L, sensitivity decreased by 0.07, but specificity increased by 0.11 with respect to the threshold of 14 ng/L. The big difference
between the recommended threshold and the point of maximization of
the Youden index might come from a different weighting of sensitivity
and specificity. As the diagnostic accuracy of Elecsys Troponin T high
sensitive assay is analyzed in the emergency department, it is likely
that sensitivity is emphasized. Setting the weighting parameter λw of
the weighted Youden index to 2/3, we obtain an optimal threshold of
13.9 ng/L, which is very close to the recommended threshold.
Finally, we present the estimated summary ROC curve (figure 4.5), which can be obtained easily since estimates of all distribution parameters are available (see equation (1)).
[Figure: summary ROC curve, sensitivity against 1 − specificity.]
Figure 4.5. Summary ROC curve of the troponin data with two different optimal thresholds obtained by choosing weighting parameters λw = 0.5 (optimal threshold at 20.8 ng/L, marked with black cross) and λw = 2/3 (optimal threshold at 13.9 ng/L, marked with red cross). Different studies are marked in different colors.
4.2. Procalcitonin as a marker for sepsis
The next example we want to consider is the data of the systematic review of Wacker et al. (2013). In this publication a meta-analysis has been conducted "to investigate the ability of procalcitonin to differentiate between sepsis and systemic inflammatory response syndrome of non-infectious origin in critically ill patients" (Wacker et al., 2013).
There were 31 different studies included. They obtained a pooled sen-
sitivity of 0.77 [0.72, 0.81] and a pooled specificity of 0.79 [0.74, 0.84]
by using a bivariate mixed effects regression model.
Wacker et al. mentioned that several studies reported more than one
threshold and provided us with a list of these studies. There were 11
of the 31 studies reporting between 2 and 5 thresholds (see table A.2).
We applied our approach to that data, logarithmizing the procalcitonin
(PCT) thresholds and using model *DICS. The results are shown in
plot 4.6. We obtain an optimal threshold of 1.1 ng/mL and pooled
sensitivity of 0.71 [0.63, 0.78] and pooled specificity of 0.80 [0.73, 0.85]
at this threshold. Compared to the results of Wacker et al., our esti-
mated sensitivity is lower and has a bigger confidence interval, whereas
the estimates for the specificity are quite similar. The summary ROC
curve is shown in plot 4.7.
It is important to mention that the procalcitonin data is very het-
erogeneous. Wacker et al. investigated sources of heterogeneity with
meta-regression, but could not explain the heterogeneity. As different
procalcitonin assays were used in the studies, it may be reasonable to
stratify with respect to these assays. The different assays are all tests
to determine the concentration of procalcitonin in human serum and
plasma. To be precise, these assays are PCT-Q (a rapid test without
instruments), PCT-LIA (a manual standard test which is very reliable)
and PCT-Kryptor (a fully automated test).
The data and the estimated distribution functions stratified by assays
can be seen in figures 4.8 and 4.9. We used model *DICS for all as-
says. The resulting optimal thresholds and the pooled sensitivities and
specificities at these thresholds are given in table 4.1. For the PCT-Q
assay the point where the Youden index is maximised, i.e. the optimal
threshold, is 0.34 ng/mL. However, this value is outside of the data
range (see top panel in figure 4.8). Thus, to avoid extrapolation, we
[Figure: plot of P(negative test result) against the procalcitonin threshold in ng/mL (log scale), with the optimal threshold at 1.1 ng/mL.]
Figure 4.6. Distribution functions of procalcitonin within the non-diseased (open circles, dashed line) and within the diseased individuals (filled circles, solid line). The grey lines mark the confidence regions and different studies are marked in different colors. The optimal threshold, derived from a maximization of the Youden index, is depicted as a solid vertical line.
Assay        Optimal threshold [ng/mL]  Sensitivity        Specificity
PCT-Q        0.50                       0.80 [0.68, 0.89]  0.86 [0.77, 0.92]
PCT-LIA      2.10                       0.65 [0.54, 0.75]  0.84 [0.73, 0.91]
PCT-Kryptor  0.87                       0.71 [0.55, 0.83]  0.81 [0.72, 0.87]
Table 4.1. Sensitivities and specificities at the optimal threshold for the different assays of procalcitonin.
[Figure: model-based summary ROC curve with the optimal threshold at 1.1 ng/mL, (Se, Sp) = (0.71, 0.80).]
Figure 4.7. Summary ROC curve of the procalcitonin data with the optimal threshold. Different studies are marked in different colors.
recommend using a threshold of 0.5 ng/mL.
The optimal thresholds of the assays differ a lot. Therefore, it is rea-
sonable to stratify. Nevertheless, the data still contains a lot of hetero-
geneity, which explains the big confidence intervals. The three sum-
mary ROC curves of the different procalcitonin assays are shown in
plot 4.10. They differ a lot, and the PCT-Q assay seems to have the best diagnostic accuracy. However, the data set of the PCT-Q assay
was the smallest with only 4 studies included.
[Figure: two plots of P(negative test result) against the procalcitonin threshold in ng/mL (log scale); optimal thresholds at 0.3 ng/mL (top) and 2.1 ng/mL (bottom).]
Figure 4.8. Top: procalcitonin data with PCT-Q assay, bottom: procalcitonin data with PCT-LIA assay. Procalcitonin distribution functions within the non-diseased (open circles, dashed line) and within the diseased individuals (filled circles, solid line). The grey lines mark the confidence regions and different studies are marked in different colors. The optimal threshold, derived from a maximization of the Youden index, is depicted as a solid vertical line.
[Figure: plot of P(negative test result) against the procalcitonin threshold in ng/mL (log scale), with the optimal threshold at 0.9 ng/mL.]
Figure 4.9. Procalcitonin data with PCT-Kryptor assay. Biomarker distributions within the non-diseased (open circles, dashed line) and within the diseased individuals (filled circles, solid line). The grey lines mark the confidence regions and different studies are marked in different colors. The optimal threshold, derived from a maximization of the Youden index, is depicted as a solid vertical line.
[Figure: three summary ROC curves with optimal thresholds of PCT-Q at 0.5 ng/mL, PCT-LIA at 2.1 ng/mL and PCT-Kryptor at 0.87 ng/mL.]
Figure 4.10. Summary ROC curves of the different procalcitonin assays with the optimal threshold marked as a cross (PCT-Q assay in black, PCT-LIA assay in red and PCT-Kryptor assay in green). Different studies are marked in different colors.
4.3. Procalcitonin as a marker for neonatal sepsis
In the following we want to consider the data of the systematic
review of Vouloumanou et al. (2011), where the value of serum pro-
calcitonin for the distinction of individuals with and without neonatal
sepsis was investigated. They reported 16 studies, whereof 3 reported
2 to 3 thresholds and the others only one. The data set with multiple
thresholds per study can be found in table A.3. Martínez-Camblor used
[Figure: plot of Φ⁻¹(P(negative test result)) against the procalcitonin threshold in ng/mL (log scale).]
Figure 4.11. Procalcitonin data concerning neonatal sepsis with regression lines of the non-diseased (open circles, dashed line) and the diseased individuals (filled circles, solid line). Different studies are marked in different colors.
this example to demonstrate his non-parametric approach described in
subsection 3.2.3.
Applying our new approach to this data may, depending on the model,
result in regression lines with negative slope. This leads to negative
standard deviations and thus the model fails (see paragraph 3.3.4).
The result for model *CS and logarithmized procalcitonin threshold
can be seen in plot 4.11.
For some of the other models the regression lines are not decreasing,
but almost horizontal, resulting in optimal thresholds ranging from 0 to 1.4 · 10^43. Thus, we do not obtain reasonable results for this data.
4.4. CAGE Questionnaire
Generally, our approach is based on the assumption of a continuous
biomarker. However, in the following we want to apply our approach
to a discrete biomarker. Aertgeerts et al. (2004) conducted a meta-
analysis to assess diagnostic characteristics of the CAGE, a self-report
questionnaire to identify alcoholism. The data can be found in table
A.4. Putter et al. (2010) used this data example to demonstrate their
approach which is described in subsection 3.2.2. The data consists of
10 studies all reporting 5 thresholds (0, 1, 2, 3, 4) and corresponding
values of sensitivity and specificity.
The results of our approach can be seen in plot 4.12. The regression was conducted with model *DICS and led to an optimal threshold of 1.56, but this threshold does not exist for the discrete CAGE score.
Instead, we need to maximize the Youden index on the discrete set of
thresholds {0, 1, 2, 3, 4}. This results in the optimal threshold 2 with
a sensitivity of 0.70 [0.60, 0.78] and a specificity of 0.88 [0.82, 0.92].
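Maximizing the Youden index over a discrete threshold set can be sketched as follows, here with the pooled distributions assumed normal; the function name and any parameter values are our own illustration, not the fitted CAGE estimates:

```python
from statistics import NormalDist

def discrete_optimal_threshold(mu0, sd0, mu1, sd1, thresholds):
    """Evaluate the Youden index Se + Sp - 1 of the fitted normal
    biomarker distributions only at the existing thresholds and
    return the best threshold together with its Youden index."""
    best_c, best_j = None, -1.0
    for c in thresholds:
        sp = NormalDist(mu0, sd0).cdf(c)      # P(negative | non-diseased)
        se = 1 - NormalDist(mu1, sd1).cdf(c)  # P(positive | diseased)
        if se + sp - 1 > best_j:
            best_c, best_j = c, se + sp - 1
    return best_c, best_j
```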
The Poisson correlated gamma frailty model of Putter et al. resulted
in a sensitivity of 0.69 [0.63, 0.76] and a specificity of 0.88 [0.83, 0.94]
at threshold 2. Thus, the results are very similar, with our approach
leading to slightly bigger confidence intervals.
The summary ROC curve is shown in plot 4.13.
[Figure: plot of P(negative test result) against the CAGE score threshold, with the optimal threshold at 1.6.]
Figure 4.12. Distribution functions of the CAGE score within the non-diseased (open circles, dashed line) and within the diseased individuals (filled circles, solid line). The grey lines mark the confidence regions and different studies are marked in different colors. The optimal threshold, derived from a maximization of the Youden index, is depicted as a solid vertical line.
[Figure: model-based summary ROC curve with the optimal threshold at 2, (Se, Sp) = (0.70, 0.88).]
Figure 4.13. Summary ROC curve of the CAGE data with the optimal threshold marked as a cross. Different studies are marked in different colors.
CHAPTER 5
Simulation Study
5.1. Design
To evaluate the performance of our method we conducted a simulation study. We aimed to investigate how precisely the new approach can estimate the true distributions of ill and healthy individuals. Furthermore, we examined whether the model is a suitable approach to estimate the pooled sensitivity and specificity and the optimal threshold in a meta-analysis.
We considered 384 scenarios with 1000 runs each. Data was simulated roughly mimicking the example data presented in chapter 4. For an overview of the data acquisition see the flow chart in figure 5.1. All
scenarios can be seen in table 5.1.
To obtain a data set of a DTA study, the number of studies was
randomly set to 10, 20 or 30. The ’real’ overall distributions of the
biomarker within the non-diseased and diseased individuals were nor-
mal distributions with fixed mean 0 for the non-diseased and varying
mean 2.5 (’nearby distributions’) or 4 (’distant distributions’) for the
diseased. The standard deviation of the non-diseased was 1.5, 1, 2.5,
2.5, respectively, varying together with the standard deviation of the
diseased of 1.5 (same standard deviations), 2 (different standard devia-
tions), 2.5 (same standard deviations) and 4 (different standard devia-
tions). The standard deviations were varying together with the mean,
the first two smaller standard deviations combined with the nearby
distributions and the two bigger ones with the distant distributions.
To obtain study-specific distributions random noise was added to the
’real’ overall distributions. The extent of the random noise was de-
termined by a visual comparison with the examples. Namely a mean
zero normal error with standard deviation 0 (’no heterogeneity’), 0.5
(’moderate heterogeneity’), 1 (’large heterogeneity’) or 1.5 (’huge het-
erogeneity’) was added to the mean parameters. Likewise, a normal
61
error with standard deviation 0, 0.3, 0.4 or 0.5 was added to the stan-
dard deviation parameters. These noise distributions were symmetri-
cally truncated. Those of the mean parameters were truncated so that
the mean of the study-specific distribution of the diseased individuals
was greater than that of the non-diseased; those of the standard de-
viation parameters were truncated in order to guarantee non-negative
study-specific standard deviations.
The total number of individuals per study was drawn from a log-normal
distribution1 with parameters µ = 5 and σ = 1. The proportion of ill
individuals per study was drawn from a normal distribution with mean
0.5 and standard deviation 0.2, truncated to the interval (0.2, 0.8) to
obtain realistic proportions. Drawing from the respective study-specific
distribution as many times as the number of non-diseased and diseased
led to biomarker values for all individuals.
The number of thresholds per study was drawn from a Poisson dis-
tribution with parameter λ = 1.3 or 2 (rejecting zeros) or fixed to 5
thresholds per study. The values of the thresholds were spaced equidis-
tantly between the 40% quantile of the study-specific distribution of
the non-diseased individuals and the 60% quantile of the study-specific
distribution of the diseased individuals. Therewith we obtained whole
data sets of DTA studies.
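The data-generating steps above can be sketched as follows. This is a simplified illustration with names of our own (e.g. the truncation of the mean noise is implemented by re-drawing only the diseased mean), not the original simulation code:

```python
import math
import random
from statistics import NormalDist

rng = random.Random(1)

def poisson(lam):
    """Knuth's Poisson sampler."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while p > limit:
        k += 1
        p *= rng.random()
    return k - 1

def trunc_normal(mu, sd, ok):
    """Draw from N(mu, sd) until the predicate ok holds (truncation)."""
    while True:
        x = rng.gauss(mu, sd)
        if ok(x):
            return x

def simulate_study(mu0=0.0, sd0=1.5, mu1=2.5, sd1=1.5,
                   tau_mu=0.5, tau_sd=0.3, lam=1.3):
    """One simulated DTA study: noisy study-specific parameters,
    log-normal total sample size, truncated-normal disease proportion,
    Poisson number of cut-offs (zeros rejected) and cut-offs spaced
    equidistantly between the 40% quantile of the non-diseased and
    the 60% quantile of the diseased distribution."""
    m0 = rng.gauss(mu0, tau_mu)
    m1 = trunc_normal(mu1, tau_mu, lambda x: x > m0)   # keep m1 > m0
    s0 = trunc_normal(sd0, tau_sd, lambda x: x > 0)    # keep sd positive
    s1 = trunc_normal(sd1, tau_sd, lambda x: x > 0)
    n = max(2, int(rng.lognormvariate(5, 1)))          # total sample size
    p = trunc_normal(0.5, 0.2, lambda x: 0.2 < x < 0.8)
    n1 = max(1, round(n * p))                          # number of ill
    n0 = max(1, n - n1)                                # number of healthy
    x0 = [rng.gauss(m0, s0) for _ in range(n0)]        # biomarker, healthy
    x1 = [rng.gauss(m1, s1) for _ in range(n1)]        # biomarker, ill
    k = 0
    while k == 0:                                      # reject zero cut-offs
        k = poisson(lam)
    lo = NormalDist(m0, s0).inv_cdf(0.4)
    hi = NormalDist(m1, s1).inv_cdf(0.6)
    cuts = [lo + (hi - lo) * i / max(k - 1, 1) for i in range(k)]
    tables = [(sum(v <= c for v in x0), sum(v > c for v in x0),
               sum(v <= c for v in x1), sum(v > c for v in x1))
              for c in cuts]                           # (TN, FP, FN, TP)
    return n0, n1, cuts, tables
```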
Conducting the meta-analysis we applied 4 selected linear random ef-
fects models: CI, DS, CICS and CIDS, with increasing complexity. All
models were used in the special case of same fixed slopes (β0 = β1) and
in the general case of different fixed slopes (the latter models are marked with '*'), leading to a total number of eight different models. For the
computational implementation of the linear random effect models we
used the lmer() function of the R package lme4 1.1-7 with REML es-
timation. We did not include the most complex model DIDS because
there was mostly insufficient data, as the simulation data was mim-
icking the example data. For weighting of the studies we used inverse
variance weights scaled to mean one. Sensitivity and specificity were
equally weighted.
¹A random variable X is log-normally distributed with parameters µ and σ if ln(X) is normally distributed with mean µ and standard deviation σ.
[Figure: flow chart. Inputs: number of studies per meta-analysis (10, 20, 30); parameters of the overall distributions µ0 = 0, µ1 = (2.5, 4), σ0 = (1.5, 1, 2.5, 2.5), σ1 = (1.5, 2, 2.5, 4); heterogeneity parameters τµ = (0, 0.5, 1, 1.5), τσ = (0, 0.3, 0.4, 0.5); number of patients per study ~ Lognormal(5, 1); proportion ill:healthy ~ N(0.5, 0.2); number of cut-offs ~ Pois(λ = 1.3, 2) or fixed to 5. From these follow the parameters of the distribution per study, the numbers of ill/healthy, the values of the cut-offs, the biomarker values and finally TP, TN, FP, FN.]
Figure 5.1. Flow chart of the data acquisition in the simulation study.
µ0   µ1   σ0/σ1     λ                      τµ/τσ
0    2.5  1.5/1.5   1.3, 2, 5 thresholds   0/0, 0.5/0.3, 1/0.4, 1.5/0.5
0    2.5  1/2       1.3, 2, 5 thresholds   0/0, 0.5/0.3, 1/0.4, 1.5/0.5
0    4    2.5/2.5   1.3, 2, 5 thresholds   0/0, 0.5/0.3, 1/0.4, 1.5/0.5
0    4    2.5/4     1.3, 2, 5 thresholds   0/0, 0.5/0.3, 1/0.4, 1.5/0.5
In every combination the models CI, DS, CICS, CIDS, *CI, *DS, *CICS and *CIDS were fitted.
Table 5.1. All scenarios of the simulation study. Every combination of µ0, µ1, σ0/σ1, λ and τµ/τσ defines one line of the full table; every line represents 8 scenarios, differing in the linear mixed model used.
5.2. Results
We investigated bias, coverage and mean squared error (MSE) of
the distribution parameters µ0, µ1, σ0 and σ1 and of sensitivity and
specificity at three points: at the mean of the non-diseased population
(sens 1, spec 1), at the ’real’ optimal threshold (sens 2, spec 2) and at
the mean of the diseased population (sens 3, spec 3). Furthermore, we
investigated bias and MSE for the optimal threshold.
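These three performance measures can be summarized per scenario in a few lines. The sketch below is ours and purely illustrative (Python rather than the R used for the analyses; the function name and toy numbers are hypothetical): bias is the mean deviation of the estimates from the true value, MSE the mean squared deviation, and coverage the proportion of 95% confidence intervals containing the true value.

```python
def bias_coverage_mse(estimates, ci_lower, ci_upper, truth):
    """Summarize the replicates of one simulation scenario:
    bias = mean(estimate - truth), MSE = mean((estimate - truth)^2),
    coverage = share of 95% CIs that contain the true value."""
    n = len(estimates)
    bias = sum(e - truth for e in estimates) / n
    mse = sum((e - truth) ** 2 for e in estimates) / n
    coverage = sum(lo <= truth <= hi
                   for lo, hi in zip(ci_lower, ci_upper)) / n
    return bias, mse, coverage

# toy replicates around a true value of 0 (illustrative numbers only)
est = [0.1, -0.2, 0.05, 0.3, -0.1]
lo = [e - 0.4 for e in est]
hi = [e + 0.4 for e in est]
b, m, c = bias_coverage_mse(est, lo, hi, truth=0.0)
# b = 0.03, m = 0.0305, c = 1.0
```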
Each figure contains eight plots. In every plot bias, coverage or MSE is
plotted against the linear models. From left to right the heterogeneity
of the studies increases. The four plots at the bottom show the case
of the same standard deviation in the 'real' overall distributions of the
biomarker, the top four plots the case of different standard deviations.
5.2.1. Bias.
5.2.1.1. Distribution parameters. First, we consider the bias of
the distribution parameters µ0, µ1, σ0 and σ1 with λ = 1.3 and nearby
distributions of the biomarker (see figure 5.2). In the case of no hetero-
geneity and same standard deviations, the bias was essentially zero for
all parameters, but it increased with increasing heterogeneity, reaching
values up to 100 for single parameters. The parameter µ0, the mean of
the non-diseased, was consistently underestimated, whereas the other
parameters were overestimated. In the case of different standard
deviations (the upper row of plots) we find a similar structure, with
one striking difference in the case of no heterogeneity: for the models
with the same fixed slope for the non-diseased and diseased individuals,
all parameters have nonzero bias. An explanation could be that the
data are almost perfect, as there is no heterogeneity, but the slopes of
the two straight lines to be estimated differ. In this case all parameters
suffer from the constraint of estimating these lines with a common
slope. This phenomenon vanishes with more heterogeneity.
In the case of λ = 2 (see figure C.1 in the appendix), and even more so
in the case of a fixed number of 5 thresholds per study, the bias of the
distribution parameters decreases markedly (see the graphic at the
top of figure 5.3). One can observe a zigzag pattern of the bias, where
models DS and *DS have the greatest bias and models CIDS and *CIDS
[Figure 5.2: two pictures of eight panels each, bias plotted against the models CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS]
Figure 5.2. Bias of µ0 (open light blue circle), µ1 (filled light blue circle), σ0 (open dark blue circle) and σ1 (filled dark blue circle) for λ = 1.3 and nearby distributions. The picture at the top shows the whole scenario; the one at the bottom is zoomed so that −10 ≤ bias ≤ 10. In both pictures the heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
the lowest, equally for same and different standard deviations.
Let us consider the case of distant distributions, where the overall dis-
tribution of the biomarker within the non-diseased individuals is a
mean-zero normal distribution with standard deviation 2.5 and the one
within the diseased individuals is a normal distribution with mean 4
and standard deviation 2.5 or 4. The bias of the distribution parameters
decreased in comparison with the bias in the case of nearby distributions
(see figure 5.3 at the bottom, others not shown). This could be due to
the larger values of mean and standard deviation and therefore larger
differences between these parameters in the non-diseased and diseased
individuals. Thus the added heterogeneity, which stayed the same,
affected the parameters less.
5.2.1.2. Sensitivity and Specificity. In the following we consider the
bias of sensitivity and specificity with λ = 1.3 and nearby distributions
(see figures 5.4 and 5.5). In the case of same standard deviations and no
heterogeneity there was almost no bias. With increasing heterogeneity,
the sensitivity was underestimated at threshold 0 and overestimated at
threshold 2.5; for specificity it was the other way around. Thus, small
values of sensitivity and specificity were overestimated and large ones
underestimated. At the 'real' optimal threshold 1.25 both were slightly
underestimated. For different standard deviations we observe the same
pattern, but additionally we obtain bias for the models assuming
same standard deviations in the no-heterogeneity case. At threshold
0 and at the 'real' optimal threshold the bias in the case of different
standard deviations was generally slightly larger than in the case of
same standard deviations, at threshold 2.5 slightly smaller.
Consistent with the results for the distribution parameters, the bias of
sensitivity and specificity generally decreased with an increasing
number of thresholds. At the outer two points (points 1 and 3) the bias
decreased markedly, while at the 'real' optimal threshold it stayed
the same (see figures 5.6 (top) and C.3 for the case of 5 thresholds per
study; the case λ = 2 is not shown). Again, we can observe a zigzag
pattern, with the highest bias resulting from models DS and *DS and
the lowest from CIDS and *CIDS, equally for same and different stan-
dard deviations.
[Figure 5.3: two pictures of eight panels each, bias plotted against the models CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS]
Figure 5.3. Bias of µ0 (open light blue circle), µ1 (filled light blue circle), σ0 (open dark blue circle) and σ1 (filled dark blue circle). Top: in the case of 5 thresholds per study and nearby distributions. Bottom: in the case of λ = 1.3 and distant distributions; zoomed version so that −10 ≤ bias ≤ 10. For an overview see figure C.2. In both pictures the heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure 5.4: two pictures of eight panels each, bias plotted against the models CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS]
Figure 5.4. Bias of sensitivity and specificity at threshold 0 (sens 1, spec 1) in the top panel and at the 'real' optimal threshold (sens 2, spec 2) in the bottom panel, for λ = 1.3 and nearby distributions. In both pictures the heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure 5.5: eight panels, bias plotted against the models CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS]
Figure 5.5. Bias of sensitivity (sens 3) and specificity (spec 3) at threshold 2.5, for λ = 1.3 and nearby distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
In the case of distant distributions the bias of sensitivity and speci-
ficity decreased in comparison to the nearby distributions (see figures
5.6 (bottom) and C.4, others not shown). This matches the results for
the distribution parameters.
5.2.1.3. Optimal threshold. In the meta-analysis an overall optimal
threshold was estimated. The bias of this optimal threshold in the case
of λ = 1.3 and same standard deviations was small (see figure 5.7). In
the case of different standard deviations the bias of the models with
the same slope was markedly higher than that of the models with
different slopes. With an increasing number of thresholds per study
the bias decreased (see figure C.5).
5.2.2. MSE.
5.2.2.1. Distribution parameters. The mean squared error of the
distribution parameters in the case of λ = 1.3 and nearby distributions
[Figure 5.6: two pictures of eight panels each, bias plotted against the models CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS]
Figure 5.6. Bias of sensitivity (sens 1) and specificity (spec 1) at threshold 0. Top: in the case of 5 thresholds per study and nearby distributions. Bottom: in the case of λ = 1.3 and distant distributions. Both: the heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure 5.7: two pictures of eight panels each, bias plotted against the models CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS]
Figure 5.7. Bias of the optimal threshold in the case of λ = 1.3. Top: nearby distributions. Bottom: distant distributions. Both: the heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
ranged from almost 0 up to 5.04·10⁶, increasing strongly with increas-
ing heterogeneity. With no heterogeneity the MSE was close to
zero, and for moderate heterogeneity it did not exceed 15 (see figure
C.6).
For distant distributions there was an outlier at almost 8·10⁸ (in the
case of large heterogeneity), but generally the MSE was smaller than
for nearby distributions. For example, in the case of no and moderate
heterogeneity the MSE did not exceed 5 (figures not shown). This may
again be due to the fact that the heterogeneity parameters affected the
larger values of the distant distributions less.
With an increasing number of thresholds the MSE decreased markedly,
with a maximum value of 2·10³ and almost all values below 15 in the
case of 5 thresholds per study.
5.2.2.2. Sensitivity and specificity. In the case of λ = 1.3 and nearby
distributions the MSE did not exceed 0.08 at the three measuring
points. It was markedly lower at points 2 and 3 (not exceeding 0.02
and 0.05, respectively). The MSE of specificity was consistently higher
than that of sensitivity at threshold 0, and vice versa at threshold
2.5 (see figures 5.8 and C.7). The same effect could be observed for the
bias.
With an increasing number of thresholds the MSE decreased; for
example, with 5 thresholds per study the MSE reached at most
0.06 (not shown). For distant distributions and λ = 1.3 the MSE was
smaller, never exceeding 0.04 (see figures C.8 and C.9).
5.2.2.3. Optimal threshold. The MSE of the optimal threshold in-
creased sharply with increasing heterogeneity (see figure 5.9). It de-
creased strongly with an increasing number of thresholds, such that
for 5 thresholds per study the MSE was always below 3 (not shown).
It was smaller in the case of distant distributions than in the case of
nearby distributions (not shown).
5.2.3. Coverage. The coverage is the probability that the real value
is contained in the confidence interval of the estimated value. As
we chose 95% confidence intervals, the coverage should ideally be 0.95.
5.2.3.1. Distribution parameters. The coverage of the distribution
parameters varied between almost 0 and almost 1 (see figure
[Figure 5.8: eight panels, MSE plotted against the models CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS]
Figure 5.8. MSE of sensitivity (sens 1) and specificity (spec 1) at threshold 0 in the case of λ = 1.3 and nearby distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
5.10). The models with the same fixed slope consistently had smaller
coverage than the models with different fixed slopes, but we will see
that this phenomenon did not exist to this extent for sensitivity and
specificity. In general, the coverage was almost never 0.95. In the case
of no heterogeneity we see two different phenomena: firstly, when the
distribution functions had the same standard deviations, the coverage
hovered around 0.95 for all models; secondly, in the case of different
standard deviations, the *-models resulted in more or less the same
coverage, whereas the models with the same fixed slope had a coverage
close to zero. This may be explained by narrow confidence intervals,
due to the absence of heterogeneity, in combination with the existing
bias.
For an increasing number of thresholds the coverage decreased (see
figure C.11). In the case of distant distribution functions the coverage
was slightly higher (not shown).
[Figure 5.9: eight panels, MSE plotted against the models CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS]
Figure 5.9. MSE of the optimal threshold in the case of λ = 1.3 and nearby distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations. This is a zoomed-in version; for the overview see figure C.10.
5.2.3.2. Sensitivity and specificity. The coverage of sensitivity and
specificity decreased strongly with increasing heterogeneity in the
case of λ = 1.3 and nearby distributions. For the outer thresholds (0
and 2.5) and in the case of same standard deviations the coverage was
between 0.7 and 0.97 with no heterogeneity and between 0 and
0.7 with huge heterogeneity. At the 'real' optimal threshold the coverage
also decreased with increasing heterogeneity, but always stayed
above 0.5 (see figures 5.11 and C.12 (top)).
With an increasing number of thresholds per study the coverage at
thresholds 0 and 2.5 spread over almost the whole interval [0,1],
and at the 'real' optimal threshold the coverage decreased slightly
(see figures C.12 (bottom) and C.13).
In the case of distant distributions the coverage at thresholds 0 and 4
was higher than with nearby distributions at thresholds 0 and 2.5.
[Figure 5.10: eight panels, coverage plotted against the models CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS]
Figure 5.10. Coverage of the distribution parameters µ0 (open light blue circle), µ1 (filled light blue circle), σ0 (open dark blue circle) and σ1 (filled dark blue circle) in the case of λ = 1.3 and nearby distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
At the ’real’ optimal threshold it was slightly higher as well (figures
not shown).
[Figure 5.11: two pictures of eight panels each, coverage plotted against the models CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS]
Figure 5.11. Coverage of sensitivity and specificity at threshold 0 (sens 1, spec 1) in the top panel and at the 'real' optimal threshold (sens 2, spec 2) in the bottom panel in the case of λ = 1.3 and nearby distributions. In both pictures the heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
CHAPTER 6
Discussion
We have described and evaluated a new approach for meta-analysis
of diagnostic test accuracy studies, where several studies report more
than one threshold and the corresponding values of sensitivity and
specificity. The approach uses a parametric assumption (normal or lo-
gistic) for the distribution of a continuous biomarker. The idea is to
estimate the distribution functions of the biomarker, one distribution
function within the non-diseased and one within the diseased study
population. This is achieved by the use of a mixed effects model with
study as random factor.
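The core of this estimation step can be sketched in a few lines. For a normally distributed biomarker, probit(F(c)) = (c − µ)/σ, so a straight-line fit of the probit-transformed proportions of negative test results against the thresholds yields µ and σ from its intercept and slope. The sketch below is ours and deliberately simplified: it uses a plain least-squares fit for a single group in place of the mixed effects model, so study-level random effects are ignored, and all names are hypothetical.

```python
from statistics import NormalDist

def fit_group(cutoffs, neg_props):
    """Least-squares line through the probit-transformed proportions
    of negative test results.  Since probit(F(c)) = (c - mu)/sigma,
    the fitted slope is 1/sigma and the intercept is -mu/sigma."""
    std = NormalDist()
    x = list(cutoffs)
    y = [std.inv_cdf(p) for p in neg_props]     # probit transform
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    slope = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
             / sum((xi - xbar) ** 2 for xi in x))
    intercept = ybar - slope * xbar
    return -intercept / slope, 1.0 / slope      # (mu, sigma)

# proportions taken from an exactly N(0, 1.5) biomarker, so the
# fit recovers the true parameters
truth = NormalDist(mu=0.0, sigma=1.5)
cutoffs = [-1.0, 0.0, 1.0, 2.0]
mu_hat, sigma_hat = fit_group(cutoffs, [truth.cdf(c) for c in cutoffs])
# mu_hat ≈ 0.0, sigma_hat ≈ 1.5
```

For the logistic case the probit transform is replaced by the logit; in the thesis the line is fitted jointly for both groups via lmer() with study as grouping factor.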
The traditional approaches, as for example the hierarchical model of
Rutter and Gatsonis (2001) and the bivariate model of Reitsma et al.
(2005), only use one pair of sensitivity and specificity per study. Only
recently, alternative approaches using more than one pair of sensitivity
and specificity per study have been described, as for example the mul-
tivariate random effects approach of Hamza et al. (2009), the survival
approach of Putter et al. (2010) and the non-parametric approach of
Martínez-Camblor (2014). Riley et al. (2014) also proposed a multi-
variate regression model, which is closely related to our models.
Our new approach for meta-analysis of DTA studies has its strengths
and limitations.
Strengths. Our approach uses multiple pairs of sensitivity and speci-
ficity and their corresponding thresholds per study. In comparison with
traditional approaches, this has several advantages: we do not need
to select one pair of sensitivity and specificity per study (which may
lead to bias) and we do not need further assumptions to determine a
summary ROC curve. Instead, we use all the given information and
therefore the results are expected to be more reliable. In contrast to
the alternative approaches of Hamza et al. (2009) and Putter et al.
(2010), our approach can deal with a varying number of thresholds per
study. This was the case in most of the systematic reviews we found
in which several studies provided multiple thresholds.
The assumption of a normally or logistically distributed biomarker with
different parameters for the non-diseased and diseased individuals is very
common. Thus, our approach follows a very natural idea by estimating
these biomarker distributions. Everything is based upon these distri-
butions: sensitivity and specificity, the SROC curve and the Youden
index, and with it the optimal threshold. Thus, directly and without
further assumptions, we obtain all desired quantities. By using a
mixed effects model we acknowledge the diversity of the studies, while
the data of each study have in principle the same structure. By allow-
ing correlated random effects, we respect the bivariate character of the
study data.
Furthermore, with our approach we can determine an optimal thresh-
old across all studies. This is important information for clinicians. In
clinical routine it is of interest not only which biomarker is best for a
specific illness, but also at which threshold an optimal discrimination
between non-diseased and diseased individuals can be achieved.
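How these quantities follow from the fitted distributions can be sketched as below (our illustration, not the thesis code; a simple grid search between the two means, which suffices when the Youden maximum lies between them, as in the scenarios considered here):

```python
from statistics import NormalDist

def sens_spec(mu0, sigma0, mu1, sigma1, c):
    """Pooled specificity F0(c) and sensitivity 1 - F1(c) at cut-off c
    under the fitted normal biomarker distributions."""
    spec = NormalDist(mu0, sigma0).cdf(c)
    sens = 1.0 - NormalDist(mu1, sigma1).cdf(c)
    return sens, spec

def optimal_threshold(mu0, sigma0, mu1, sigma1, steps=10_000):
    """Grid search maximizing the Youden index
    J(c) = sens(c) + spec(c) - 1 = F0(c) - F1(c) between the means."""
    best_c, best_j = mu0, -1.0
    for i in range(steps + 1):
        c = mu0 + (mu1 - mu0) * i / steps
        sens, spec = sens_spec(mu0, sigma0, mu1, sigma1, c)
        if sens + spec - 1.0 > best_j:
            best_c, best_j = c, sens + spec - 1.0
    return best_c

# equal SDs: the Youden-optimal cut-off is the midpoint of the means
c_opt = optimal_threshold(0.0, 1.5, 2.5, 1.5)   # ≈ 1.25
sens, spec = sens_spec(0.0, 1.5, 2.5, 1.5, c_opt)
```

For equal standard deviations the maximizer is the midpoint of the two means, which is why the 'real' optimal threshold in the nearby simulation scenarios (µ0 = 0, µ1 = 2.5) is 1.25; for unequal standard deviations the optimum solves a quadratic instead.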
Limitations. Some care has to be taken concerning the concept
of an optimal threshold across studies. This concept is only reasonable
if a biomarker value has the same meaning in all studies and does not
differ because of laboratory conditions. If the thresholds are very het-
erogeneous, this has to be doubted. Of course the question also arises
to what extent it is reasonable to pool sensitivity and specificity if the
studies are very inhomogeneous.
A weak point of the method is the possibility of decreasing proportions
of negative test results with an increasing threshold across studies, as
can be seen in example 4.3. Within a study, a positive correlation is
assured by definition, but across studies this cannot be guaranteed. In
this case, the method will fail.
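A simple diagnostic for this failure mode is to check, before fitting, whether the pooled trend of the proportions of negative test results against the threshold is increasing at all; a sketch (ours, with hypothetical names and toy data):

```python
def pooled_slope(points):
    """Least-squares slope of the proportion of negative test results
    against the threshold, pooled over all studies.  A negative slope
    signals the failure mode described above."""
    x = [c for c, _ in points]
    y = [p for _, p in points]
    n = len(points)
    xbar, ybar = sum(x) / n, sum(y) / n
    return (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
            / sum((xi - xbar) ** 2 for xi in x))

# two studies, each increasing within itself, but the between-study
# shift makes the pooled trend decrease
study_a = [(1.0, 0.80), (2.0, 0.85)]
study_b = [(3.0, 0.30), (4.0, 0.35)]
s = pooled_slope(study_a + study_b)   # s = -0.19 < 0
```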
If the studies report too few data points, some linear mixed effects
models may not be applicable, as the number of parameters to be
estimated may be too large. Moreover, the lmer() function had con-
vergence and/or calculation problems for some models and data sets.
Examples. In all examples a large heterogeneity between the stud-
ies can be seen. Nevertheless, the application of our approach to these
examples led to very convincing results (except for example 4.3). The
distribution functions and the pooled sensitivity and specificity with
their confidence intervals seemed reasonable. The selection method of
considering only *-models and then selecting the one with the smallest
REML criterion resulted, for all examples except example 4.3, in choos-
ing model *DICS. The repeated choice of this model might be explained
by the great difference between the intercepts of the transformed nega-
tive test results of the non-diseased and diseased individuals, whereas
the slopes of the data points were almost the same. Possibly this model
could be included in a future simulation study. With different choices
of models we always obtained visually good results for the estimated
distributions, but the optimal thresholds varied.
Example 4.3, regarding procalcitonin as a marker for neonatal sepsis,
shows that the approach does not work for all data. If the heterogene-
ity between studies is very large and the number of thresholds per study
is small, a reasonable regression cannot be assured, as it is then possible
to obtain regression lines with negative slope. Therefore, it is not advis-
able to use our model for such data.
Example 4.4 shows that the approach may work well for discrete
biomarkers, too. With our approach we obtain almost the same results
as Putter et al. and have the advantage of not being restricted to data
with the same number of thresholds per study.
Simulation study. The simulation study showed that with grow-
ing heterogeneity the quality of the estimates deteriorates. Generally,
reasonable results of the new approach can only be expected for the
heterogeneity levels 'no' and 'moderate'. However, since the distribu-
tion estimates for almost all data examples have been convincing, we
assume that in practice most data sets have a heterogeneity level within
this range. This assumption is supported by the heterogeneity levels
that Martínez-Camblor (2014) examined in his simulation study. He
used mean-zero normal errors with variances 0, 0.1 and 0.2 for the means
of the distributions and with variances 0 and 0.1 for the logarithmized
variances of the distributions. Therefore, all his levels of heterogeneity
are smaller than our 'moderate' heterogeneity level.
For data with at most moderate heterogeneity the linear mixed mod-
els allowing for different fixed slopes (denoted with *) are to be
preferred. They led to smaller bias and MSE in scenarios where the
standard deviations were different and to equivalent bias and MSE
in scenarios where the standard deviations were the same.
Furthermore, the bias and MSE of the estimates decreased with an
increasing number of thresholds per study. Therefore, study investiga-
tors are encouraged to report as many thresholds as possible.
If the biomarker distributions of the non-diseased and diseased individ-
uals were further apart (a mean-zero distribution within the non-
diseased and a mean of 4 within the diseased), the bias and the MSE
were smaller than in the case of nearby distributions (mean 0 and mean
2.5). However, as the differences between the means and also between
the standard deviations of the two biomarker distributions were larger
in the first case, the smaller bias and MSE could be due to the fact
that the heterogeneity parameters (which stayed the same) had less
influence.
In most circumstances the bias of sensitivity and specificity was
smallest for the most complex models examined, the CIDS and *CIDS
models (common random intercept and different random slopes). On
the other hand, we observed that the more complex the mixed effects
model was, the more convergence problems occurred in the lmer()
function.
Unfortunately, the coverage of the estimates of the distribution param-
eters as well as of sensitivity and specificity was not satisfactory. This
could be due to the existing bias and to incorrect confidence intervals.
For the confidence intervals we assumed the parameters to be approx-
imately normally distributed, but possibly the normal quantiles led to
confidence intervals that are too narrow.
CHAPTER 7
Conclusion
In this thesis a new approach for meta-analyses of DTA studies
in which several studies provide more than one threshold was described.
We accounted for the heterogeneity between the studies as well as
for the bivariate character of the data. We applied the new approach to
several examples, almost all leading to convincing results. Our simula-
tion study showed that the method works reasonably only in scenarios
of no or moderate heterogeneity between the studies and that the cov-
erage of the estimated parameters is not satisfactory. We proposed a
total of 16 linear mixed models which differ in their fixed and random
effects structure for the estimation of the distribution functions. It
would be desirable to have a criterion for selecting the model of choice
for concrete data, but model selection remains an unsolved problem.
Furthermore, it would be worthwhile to determine a confidence interval
for the optimal threshold in the case of logistic distributions, possibly
using an empirical approach with a resampling method. Moreover, one
could include the uncertainty of the optimal threshold in the confidence
intervals of sensitivity and specificity at the optimal threshold.
Although our new approach can still be improved in some aspects, it
has all the important properties needed: it acknowledges the heterogene-
ity of the studies and the bivariate character of the data and includes
multiple thresholds per study, possibly differing in number.
APPENDIX A
Data Sets
study cutoff TP FP FN TN
1 5 106 131 4 91
1 13 92 38 18 184
1 14 92 36 18 186
1 15 93 29 17 193
2 3 202 424 3 310
2 5 196 340 9 394
2 14 189 149 16 585
2 17 186 114 19 620
3 3 130 378 0 195
3 14 111 101 19 472
4 3 35 77 0 25
4 14 33 31 2 71
5 5 39 140 1 260
5 14 36 80 4 320
5 20 35 48 5 352
5 30 32 28 8 372
6 12 54 5 7 114
6 14 53 2 8 117
6 17 51 2 10 117
7 14 159 90 6 87
7 27 119 36 46 141
8 5 434 1087 9 542
9 14 19 21 6 121
9 18 19 15 6 127
10 14 37 163 1 105
11 14 53 33 14 733
12 14 101 59 27 173
13 14 42 49 3 223
14 14 125 117 11 250
15 14 398 363 46 1265
16 14 62 52 13 82
17 14 23 159 9 195
18 14 12 60 1 69
19 14 128 18 3 84
20 14 71 80 8 199
21 14 61 126 9 282
Table A.1. Troponin data of Zhelev et al. (2015) with multiple thresholds.
study cutoff TP FP TN FN assay
1 0.5 63 11 38 8 PCT-Q
1 2 49 2 47 22 PCT-Q
1 10 18 0 49 53 PCT-Q
2 0.1 111 161 54 11 PCT-LIA
2 0.3 84 71 144 38 PCT-LIA
2 0.4 77 56 159 45 PCT-LIA
2 0.5 73 45 170 49 PCT-LIA
3 0.5 10.22 4.62 9.38 3.78 PCT-LIA
3 1 9.94 1.12 12.88 4.06 PCT-LIA
3 1.2 9.52 0.28 13.72 4.48 PCT-LIA
3 2 12.32 0 14 1.68 PCT-LIA
3 5 5.74 0 14 8.26 PCT-LIA
4 3.03 52 11 10 10 PCT-LIA
4 15.75 47 2 19 15 PCT-LIA
5 1 42 6 26 9 PCT-LIA
5 10 9 0 32 42 PCT-LIA
6 0.087 59 11 8 15 PCT-Kryptor
6 0.1 56 9 10 18 PCT-Kryptor
6 0.25 41 2 17 33 PCT-Kryptor
6 0.5 34 2 17 40 PCT-Kryptor
7 2 31 9 35 1 PCT-Kryptor
7 10 21 3 41 11 PCT-Kryptor
8 0.5 76 11 45 36 PCT-Q
8 2 52 4 52 60 PCT-Q
8 10 30 2 54 82 PCT-Q
9 0.5 20.5 25 14 4.5 PCT-LIA
9 2.5 17 10 29 8 PCT-LIA
9 5 12.5 7 32 12.5 PCT-LIA
10 0.1 138 49 84 65 PCT-Kryptor
10 0.5 83 17 116 120 PCT-Kryptor
10 3 37 4 129 166 PCT-Kryptor
11 0.5 44 73 15 1 PCT-LIA
11 1.5 34 20 68 11 PCT-LIA
11 3 24 12 76 21 PCT-LIA
12 1.2 21 2 13 13 PCT-LIA
13 1 29 2 38 7 PCT-Kryptor
14 9.7 28 9 27 3 PCT-Kryptor
15 1.6 16 8 23 4 PCT-LIA
16 0.6 39 9 20 8 PCT-LIA
17 0.28 20 3 9 4 PCT-LIA
18 1.1 58 4 14 2 PCT-LIA
19 2.2 31 0 11 24 PCT-Kryptor
20 1.1 34 5 17 7 PCT-LIA
21 0.5 17 5 58 24 PCT-LIA
22 0.25 77 23 32 19 PCT-Kryptor
23 0.5 53 5 37 19 PCT-Q
24 0.5 22 1 24 3 PCT-Q
25 5.79 17 2 17 13 PCT-Kryptor
26 0.32 65 9 16 13 PCT-Kryptor
27 2 82 92 116 37 PCT-LIA
28 3.3 19 5 6 3 PCT-LIA
29 2 49 6 14 26 PCT-LIA
30 1 19 2 21 8 PCT-Kryptor
31 1.31 55 2 8 20 PCT-LIA
Table A.2. Procalcitonin data of Wacker et al. (2013) with multiple thresholds.
study cutoff TP FP FN TN
1 0.5 92 0 31 40
2 0.6 30 15 0 28
3 5.75 20 31 9 63
4 0.5 16 41 2 28
4 2 16 24 2 45
4 10 13 17 5 52
5 1 11 22 8 108
6 0.8 11 1 3 25
7 1 38 5 12 17
8 0.5 38 8 7 123
9 0.55 43 41 14 107
10 2 7 21 0 95
11 0.5 35 3 1 12
11 1 26 0 10 15
12 0.5 51 60 14 58
13 1 15 6 4 109
14 2 34 11 7 16
14 6 32 2 9 25
15 0.5 26 39 20 77
16 5 16 65 3 66
Table A.3. Procalcitonin data (neonatal sepsis) of Vouloumanou et al. (2011) with multiple thresholds.
study cutoff TP FP FN TN
1 0 76 134 0 0
2 0 53 247 0 0
3 0 63 61 0 0
4 0 48 56 0 0
5 0 175 1795 0 0
6 0 294 527 0 0
7 0 57 60 0 0
8 0 110 117 0 0
9 0 25 129 0 0
10 0 52 483 0 0
1 1 70 35 6 99
2 1 46 50 7 197
3 1 50 14 13 47
4 1 46 18 2 38
5 1 107 235 68 1560
6 1 261 99 33 428
7 1 56 15 1 45
8 1 78 48 32 69
9 1 22 16 3 113
10 1 52 304 0 179
1 2 61 9 15 125
2 2 35 19 18 228
3 2 44 9 19 52
4 2 42 9 6 47
5 2 80 95 95 1700
6 2 216 45 78 482
7 2 47 6 10 54
8 2 58 15 52 102
9 2 12 1 13 128
10 2 48 184 4 299
1 3 42 3 34 131
2 3 23 2 30 245
3 3 33 3 30 58
4 3 27 2 21 54
5 3 42 30 133 1765
6 3 130 11 164 516
7 3 30 2 27 58
8 3 30 2 80 115
9 3 6 0 19 129
10 3 24 58 28 425
1 4 21 1 55 133
2 4 10 0 43 247
3 4 17 1 46 60
4 4 17 0 31 56
5 4 20 7 155 1788
6 4 56 1 238 526
7 4 23 0 34 60
8 4 10 1 100 116
9 4 2 0 23 129
10 4 5 5 47 478
Table A.4. CAGE data of Aertgeerts et al. (2004).
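The R function evaluate() listed in Appendix B expects these tables as a data frame with one row per reported cut-off and columns study, cutoff, TP, FP, FN, TN. A minimal sketch using the first two rows of Table A.1:

```r
# Two rows of the troponin data (study 1 of Zhelev et al., Table A.1).
data <- data.frame(
  study  = c(1, 1),
  cutoff = c(5, 13),
  TP = c(106, 92), FP = c(131, 38),
  FN = c(4, 18),   TN = c(91, 184)
)
# Study-specific sensitivity and specificity follow directly from the counts.
sens <- data$TP / (data$TP + data$FN)
spec <- data$TN / (data$TN + data$FP)
```

As expected, raising the cut-off from 5 to 13 trades sensitivity (0.96 to 0.84) for specificity (0.41 to 0.83).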
APPENDIX B
R Code
B.1. Code Novel Approach
Code of the R function evaluate.R of our new approach.
1 # Auxiliary functions
2
3 # Define logit function
4 logit <- function(x){
5 log(x) - log(1-x)
6 }
7 # Define expit function
8 expit <- function(x){
9 (1 + exp(-x))^(-1)
10 }
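A quick sanity check (not part of the thesis code): expit is the inverse of logit, so transformed proportions can be mapped back to the probability scale without loss.

```r
# logit and expit as defined in evaluate.R; redefined here so the
# sketch is self-contained.
logit <- function(x){ log(x) - log(1 - x) }
expit <- function(x){ (1 + exp(-x))^(-1) }
p <- c(0.1, 0.5, 0.9)
roundtrip <- expit(logit(p))  # recovers p up to floating-point error
```
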
11
12 # Function for fixed point iteration
13 iterate <- function(f, x0 , nmax , eps , print=FALSE) {
14 x <- x0
15 n <- 0
16 while (abs(f(x) - x) > eps & n < nmax) {
17 n <- n+1
18 x <- f(x)
19 if (print) print(x)
20 }
21 if(n < nmax)
22 return(x)
23 else
24 stop("Iteration of maximal Youden index reached the maximal number of iterations without converging.")
25 }
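A usage sketch for the fixed-point iteration above, applied to the textbook fixed point of cos(x) at x ≈ 0.739085 rather than a thesis example; iterate() is repeated here so the sketch is self-contained.

```r
# Fixed-point iteration as defined in evaluate.R.
iterate <- function(f, x0, nmax, eps, print = FALSE) {
  x <- x0
  n <- 0
  while (abs(f(x) - x) > eps & n < nmax) {
    n <- n + 1
    x <- f(x)
    if (print) print(x)
  }
  if (n < nmax) return(x)
  stop("Fixed-point iteration did not converge.")
}
# Converges because |cos'(x)| < 1 near the fixed point.
x <- iterate(cos, x0 = 1, nmax = 1000, eps = 1e-9)
```
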
26
27 #----------------------------------------------------
28 # Function to evaluate and plot meta-analysis data
29
30 evaluate <- function(data, normal=TRUE, model="6c",
31                      log=FALSE, reml=TRUE,
32                      weights="Default",
33                      lambda=0.5, nmax=1000,
34                      eps=0.000000001,
35                      print.iterations=FALSE,
36                      xlab="Threshold", print=FALSE,
37                      plots=c(1,2,3,4,5),
38                      evaluateCutoff="optCutoff",
39                      printCI=FALSE,
40                      wl=FALSE,
41                      output=TRUE){
42
43 # Input parameters:
44 # data: (dataframe) dataframe with columns named:
45 #   study, cutoff, TP, FP, TN, FN
46 # normal: (logical)
47 #   TRUE: biomarker distribution assumption is normal,
48 #   FALSE: biomarker distribution assumption is logistic
49 # model: (string) choose one of the following models:
50 #   1b (=CI), 2b (=DI), 3b (=CS), 4b (=DS), 5b (=CICS),
51 #   6b (=DICS), 7b (=CIDS), 8b (=DIDS),
52 #   1c (=*CI), 2c (=*DI), 3c (=*CS), 4c (=*DS),
53 #   5c (=*CICS), 6c (=*DICS), 7c (=*CIDS), 8c (=*DIDS)
54 # log: (logical) TRUE: scale x-axis with log(x),
55 #   FALSE: nothing happens
56 # reml: (logical) argument of function lmer.
57 #   TRUE: REML=TRUE, FALSE: REML=FALSE (estimation with ML)
58 # weights: (string) prior weighting for studies.
59 #   "Default": no weighting (all weights = 1)
60 #   "SampleSize": sample size
61 #   "SampleSizeScaled1": sample size,
62 #     scaled so that the weights sum up to 1
63 #   "SampleSizeScaled2": sample size,
64 #     scaled so that the weights have mean 1
65 #   "InverseVariance": inverse variance
66 #   "InverseVarianceScaled1": inverse variance,
67 #     scaled so that the weights sum up to 1
68 #   "InverseVarianceScaled2": inverse variance,
69 #     scaled so that the weights have mean 1
70 # lambda: (numeric) weighting parameter for higher weighting
71 #   of sens (lambda > 0.5) or spec (lambda < 0.5).
72 #   lambda = 0.5 is equally weighted
73 # nmax: (numeric) maximal number of iterations for
74 #   finding the optimal cutoff of the logistic distribution
75 # eps: (numeric) smallest difference for end of iterations
76 #   for finding the optimal cutoff of the logistic distribution
77 # print.iterations: (logical)
78 #   TRUE: output of iterations, FALSE: no output
79 # xlab: (string) x axis label
80 # print: (logical) TRUE: creates plots as pdfs in folder,
81 #   FALSE: direct output of plots
82 # plots: (vector of numerics)
83 #   1: transformed data with regression lines
84 #   2: data with distribution functions
85 #   3: Youden index
86 #   4: ROC curve
87 # evaluateCutoff: (string)
88 #   the cutoff where Sens and Spec are evaluated
89 #   for output. Default is the optimal cutoff.
90 # printCI: (logical) TRUE: print confidence intervals,
91 #   FALSE: do not print CIs
92 # wl: (logical) TRUE: data points are connected with lines,
93 #   FALSE: they are not
94 # output: (logical) TRUE if output on the console is desired,
95 #   else FALSE
96
97 ######################### FUNCTIONS #################################
98 # Function for calculating the weighted cut-off point of
99 # two logistic distributions by an iterative fixed-point procedure
100
101 # function
102 g <- function(x) {
103   m1 - s1*acosh(lambda/(1-lambda)*s0/s1*(1 + cosh((x-m0)/s0)) - 1)
104 }
105
106 # inverse function
107 f <- function(x) {
108   m0 + s0*acosh((1-lambda)/lambda*s1/s0*(1 + cosh((x-m1)/s1)) - 1)
109 }
110
111 iter <- NULL
112
113 # error handling for function iterate, chooses f or g for iterations
114 saveIterate <- function(x0, nmax, eps, print) {
115   tryCatch( { # try iterating with function f
116     x <- iterate(f, x0, nmax, eps, print)
117     print("Optimal cut-off iteration with f")
118     return(list(x=x, iter="f"))
119   },
120   warning=function(w) {
121     suppressWarnings(x <- iterate(f, x0, nmax, eps, print))
122     warning(w$message)
123     return(list(x=x, iter="f"))
124   },
125   # if an error occurs, iterate with function g
126   error=function(e) {
127     tryCatch({ # try iteration with function g
128       x <- iterate(g, x0, nmax, eps, print)
129       print("Optimal cut-off iteration with g")
130       return(list(x=x, iter="g"))
131     },
132     warning=function(wa) {
133       suppressWarnings(x <- iterate(g, x0, nmax, eps, print))
134       warning(wa$message)
135       return(list(x=x, iter="g"))
136     },
137     error=function(er){
138       stop("Optimal cutoff iteration didn't converge. Use normal=TRUE.")
139       return(list(x=NULL, iter=NULL))
140     })
141   })
142 }
143
144 ############# DATA ORGANISING ######################################
145 attach(data)
146
147 # Reorganizing the data
148 CutoffSingle <- cutoff
149 Cutoff <- c(cutoff, cutoff)
150 Group <- c(rep(0, length(study)), rep(1, length(study)))
151 Negative0 <- TN
152 Negative1 <- FN
153 Negative <- c(TN, FN)
154 N0 <- FP + TN # number of non-diseased patients
155 N1 <- TP + FN # number of diseased patients
156 N <- c(N0, N1) # number of patients per study IN ONE GROUP (0 or 1)
157 NN0 <- (Negative0 + 0.5) / (N0 + 1) # with continuity correction
158 NN1 <- (Negative1 + 0.5) / (N1 + 1)
159 NN <- (Negative + 0.5) / (N + 1)
160 StudySingle <- study
161 Study <- c(study, study)
162 numberOfStudies <- nlevels(factor(Study))
163 # Vector consisting of as many colors as the maximal study number
164 colorChart <- rainbow(max(study))
165 colorVector <- colorChart[StudySingle]
166 # Dataframe with one row per cutoff of each study, first
167 # for all non-diseased individuals and then everything again
168 # for the diseased individuals (each cutoff of each study appears twice)
169 Data <- data.frame(Study, Group, Cutoff, N, Negative)
170
171 #---------------------------------------------
172
173 ## log transformation of cutoff values
174 if(log) {Cut <- log(Cutoff)
175   CutSingle <- log(CutoffSingle)
176 } else {Cut <- Cutoff
177   CutSingle <- CutoffSingle}
178
179
180 ################### WEIGHTS #######################################
181
182 # Default weights
183 if(weights == "Default"){
184 w <- NULL
185 }
186
187 # weights according to sample size
188 if(weights == "SampleSize"){
189 # weights are sample sizes
190 w <- c(N0,N1)
191 }
192
193 # Sample size --------- scale: sum up to 1
194 if(weights == "SampleSizeScaled1"){
195 w <- c(N0,N1)
196 w <- (w / sum(w))
197 }
198
199 # Sample size --------- scale: mean 1
200 if(weights == "SampleSizeScaled2"){
201 w <- (length(N) * N) / sum(N)
202 }
203
204
205 # Inverse variance weights with continuity correction
206 if(weights == "InverseVariance"){
207   if(!normal){
208     w0 <- (TN + 0.5) * (FP + 0.5) / (N0 + 1)
209     w1 <- (TP + 0.5) * (FN + 0.5) / (N1 + 1)
210     w <- c(w0, w1)
211   } else {
212     w1 <- (N1+1)^3*dnorm(qnorm((FN+0.5)/(N1+1)))^2/((TP+0.5)*(FN+0.5))
213     w0 <- (N0+1)^3*dnorm(qnorm((TN+0.5)/(N0+1)))^2/((TN+0.5)*(FP+0.5))
214     w <- c(w0, w1)
215   }
216 }
217
218 # Inverse variance weights with continuity correction --- scale: sum up to 1
219 if(weights == "InverseVarianceScaled1"){
220   if(!normal){
221     w0 <- (TN + 0.5) * (FP + 0.5) / (N0 + 1)
222     w1 <- (TP + 0.5) * (FN + 0.5) / (N1 + 1)
223     w <- c(w0, w1)
224   } else {
225     w1 <- (N1+1)^3*dnorm(qnorm((FN+0.5)/(N1+1)))^2/((TP+0.5)*(FN+0.5))
226     w0 <- (N0+1)^3*dnorm(qnorm((TN+0.5)/(N0+1)))^2/((TN+0.5)*(FP+0.5))
227     w <- c(w0, w1)
228   }
229   # scaling to sum 1
230   w <- (w / sum(w))
231 }
232
233 # Inverse variance weights with continuity correction --- scale: mean 1
234 if(weights == "InverseVarianceScaled2"){
235   if(!normal){
236     w0 <- (TN + 0.5) * (FP + 0.5) / (N0 + 1)
237     w1 <- (TP + 0.5) * (FN + 0.5) / (N1 + 1)
238     w <- c(w0, w1)
239   } else {
240     w1 <- (N1+1)^3*dnorm(qnorm((FN+0.5)/(N1+1)))^2/((TP+0.5)*(FN+0.5))
241     w0 <- (N0+1)^3*dnorm(qnorm((TN+0.5)/(N0+1)))^2/((TN+0.5)*(FP+0.5))
242     w <- c(w0, w1)
243   }
244   # scaling
245   w <- length(w) * w / sum(w)
246 }
247
248 #----------------------------------------
249
250 detach(data)
251
252 results <- list()
253
254 # Function to transform/rescale the x-values, either with the probit or with the logit function
255 resc <- function(x){
256   if(normal) {qnorm(x)}
257   else logit(x)
258 }
259
260 ############### MODELS #############################################
261 # fixed effects: b (Group + Cut), c (Group * Cut)
262 # random effects: 1,2 (random intercept), 3,4 (random slope), 5,6 (random intercept + slope)
263
264 # Linear regression according to selected model
265 # models as in thesis subsection 3.3.3.
266
267 #-------------------------------------------------------
268 ## Models with Fixed Effects: 1 + Group + Cut
269 # random intercept, common distribution [thesis: model CI]
270 if(model == "1b"){
271   lmeModel <- lmer(resc(NN) ~ Group + Cut + (1|Study), REML = reml, weights=w)
272 }
273 # random intercept, different distributions [thesis: model DI]
274 if(model == "2b"){
275   lmeModel <- lmer(resc(NN) ~ Group + Cut + (1 + Group|Study), REML = reml, weights=w)
276 }
277
278 # random slope, common distribution [thesis: model CS]
279 if(model == "3b"){
280   lmeModel <- lmer(resc(NN) ~ Group + Cut + (0 + Cut|Study), REML = reml, weights=w)
281 }
282
283 # random slope, different distributions [thesis: model DS]
284 if(model == "4b"){
285   lmeModel <- lmer(resc(NN) ~ Group + Cut + (0 + Cut + Group:Cut|Study), REML = reml, weights=w)
286 }
287
288 # random slope and intercept, common distributions [thesis: model CICS]
289 if(model == "5b"){
290   lmeModel <- lmer(resc(NN) ~ Group + Cut + (Cut|Study), REML = reml, weights=w)
291 }
292
293 # random slope (common distribution) and intercept (different distributions) [thesis: model DICS]
294 if(model == "6b"){
295   lmeModel <- lmer(resc(NN) ~ Group + Cut + (Cut + Group|Study), REML = reml, weights=w)
296 }
297
298 # random slope (different distributions) and intercept (common distribution) [thesis: model CIDS]
299 if(model == "7b"){
300   lmeModel <- lmer(resc(NN) ~ Group + Cut + (Cut + Group:Cut|Study), REML = reml, weights=w)
301 }
302
303 # random slope (different distributions) and intercept (different distributions) [thesis: model DIDS]
304 if(model == "8b"){
305   lmeModel <- lmer(resc(NN) ~ Group + Cut + (Group*Cut|Study), REML = reml, weights=w)
306 }
307
308 # --------------------------------------------------------
309 ## Models with Fixed Effects: Group * Cut
310 # random intercept, common distribution [thesis: model *CI]
311 if(model == "1c"){
312   lmeModel <- lmer(resc(NN) ~ Group * Cut + (1|Study), REML = reml, weights=w)
313 }
314
315 # random intercept, different distributions [thesis: model *DI]
316 if(model == "2c"){
317   lmeModel <- lmer(resc(NN) ~ Group * Cut + (1 + Group|Study), REML = reml, weights=w)
318 }
319
320 # random slope, common distribution [thesis: model *CS]
321 if(model == "3c"){
322   lmeModel <- lmer(resc(NN) ~ Group * Cut + (0 + Cut|Study), REML = reml, weights=w)
323 }
324
325 # random slope, different distributions [thesis: model *DS]
326 if(model == "4c"){
327   lmeModel <- lmer(resc(NN) ~ Group * Cut + (0 + Cut + Group:Cut|Study), REML = reml, weights=w)
328 }
329
330 # random slope + intercept, common distributions [thesis: model *CICS]
331 if(model == "5c"){
332   lmeModel <- lmer(resc(NN) ~ Group * Cut + (Cut|Study), REML = reml, weights=w)
333 }
334
335 # random slope (common distribution) and intercept (different distributions) [thesis: model *DICS]
336 if(model == "6c"){
337   lmeModel <- lmer(resc(NN) ~ Group * Cut + (Cut + Group|Study), REML = reml, weights=w)
338 }
339
340 # random slope (different distributions) and intercept (common distribution) [thesis: model *CIDS]
341 if(model == "7c"){
342   lmeModel <- lmer(resc(NN) ~ Group * Cut + (Cut + Group:Cut|Study), REML = reml, weights=w)
343 }
344
345 # random intercept (different distributions) and slope (different distributions) [thesis: model *DIDS]
346 if(model == "8c"){
347   lmeModel <- lmer(resc(NN) ~ Group * Cut + (Group*Cut|Study), REML = reml, weights=w)
348
349 }
350
351 #--------------------------------------------------------
352
353 s <- summary(lmeModel)
354
355 cf <- coef(s)
356 vc <- vcov(s)
357
358
359 ################ PARAMETER EXTRACTION #############################
360 # Extract regression coefficients alpha0, beta0 of the non-diseased and alpha1, beta1 of the diseased, and their variances.
361
362
363 if(grepl("b", model)) {
364 alpha0 <- cf[1,1]
365 alpha1 <- cf[1,1] + cf[2,1]
366 beta0 <- cf[3,1]
367 beta1 <- cf[3,1]
368 varalpha0 <- vc[1,1]
369 varalpha1 <- vc[1,1] + vc[2,2] + 2*vc[1,2]
370 varbeta0 <- vc[3,3]
371 varbeta1 <- vc[3,3]
372 cov0 <- vc[1,3]
373 cov1 <- vc[1,3] + vc[2,3]
374 }
375
376 if(grepl("c", model)) {
377 alpha0 <- cf[1,1]
378 alpha1 <- cf[1,1] + cf[2,1]
379 beta0 <- cf[3,1]
380 beta1 <- cf[3,1] + cf[4,1]
381 varalpha0 <- vc[1,1]
382 varalpha1 <- vc[1,1] + vc[2,2] + 2*vc[1,2]
383 varbeta0 <- vc[3,3]
384 varbeta1 <- vc[3,3] + vc[4,4] + 2*vc[3,4]
385 cov0 <- vc[1,3]
386 cov1 <- vc[1,3] + vc[1,4] + vc[2,3] + vc[2,4]
387 }
388
389 #.......................................................
390 # Compute the parameters of the biomarker distributions and their variances.
391
392 m0 <- -alpha0/beta0 # Mean disease-free
393 s0 <- 1/beta0       # Standard deviation disease-free
394 m1 <- -alpha1/beta1 # Mean diseased
395 s1 <- 1/beta1       # Standard deviation diseased
396
397
398 vars0 <- varbeta0/(beta0^4)
399 vars1 <- varbeta1/(beta1^4)
400 varm0 <- (alpha0^2)/(beta0^4)*varbeta0 + varalpha0/(beta0^2) - 2*alpha0/(beta0^3)*cov0
401 varm1 <- (alpha1^2)/(beta1^4)*varbeta1 + varalpha1/(beta1^2) - 2*alpha1/(beta1^3)*cov1
402
403
404 ###### if negative correlation of data, stop
405 if(beta0 <= 0 | beta1 <= 0) stop("Regression yields negative correlation. Try another model or get better data :)")
406 if(m1 < m0) stop("Estimated distribution of diseased patients is left of the non-diseased ones. Check whether higher values of your biomarker really indicate illness.")
407
408 ############ DISTRIBUTIONS ###################################
409
410 ######### NORMAL DISTRIBUTION ASSUMPTION #####################
411 if(normal){
412   # Compute cutpoint(s) 'cut' of the two normals weighted with lambda and 1-lambda
413   turn <- (m0*s1^2 - m1*s0^2)/(s1^2 - s0^2)
414   rad <- sqrt(s0^2*s1^2*(2*(s1^2 - s0^2)*(log(s1) - log(s0) - logit(lambda)) + (m1 - m0)^2)/(s1^2 - s0^2)^2)
415   x0 <- turn - rad
416   x1 <- turn + rad
417   if (s0 < s1) cut <- x1
418   if (s0 > s1) cut <- x0
419   if (s1 == s0) {
420     cut <- (s0^2*(-logit(lambda)) - 0.5*(m0^2 - m1^2))/(m1 - m0)
421   }
422   # Function to compute sensitivity and specificity and their confidence intervals
423   sesp <- function(x) {
424     y0 <- beta0*x + alpha0
425     sp <- pnorm(y0)
426     SEy0 <- sqrt(varalpha0 + x^2*varbeta0 + 2*x*cov0)
427     lsp <- pnorm(y0 - 1.96*SEy0)
428     usp <- pnorm(y0 + 1.96*SEy0)
429     y1 <- beta1*x + alpha1
430     se <- 1 - pnorm(y1)
431     SEy1 <- sqrt(varalpha1 + x^2*varbeta1 + 2*x*cov1)
432     lse <- 1 - pnorm(y1 + 1.96*SEy1)
433     use <- 1 - pnorm(y1 - 1.96*SEy1)
434     list(Sens = c("Sens", round(se,3), "[", round(lse,3), ";", round(use,3), "]"), Spec = c("Spec", round(sp,3), "[", round(lsp,3), ";", round(usp,3), "]"))
435   }
436 #.........................................
437   # Back-transform optimal cut-off
438   if(log) cutlog <- exp(cut)
439
440   # Plot y axis label
441   yLAB1 <- expression(Phi^{-1}~"(P(negative test result))")
442
443 }
444
445 ############## LOGISTIC DISTRIBUTION ASSUMPTION ###############################
446 if(!normal){
447   # Cutoff of the two logistics, weighted with lambda and 1-lambda
448   wmean <- (1-lambda)*m0 + lambda*m1
449   if ((1-lambda)*s1 != lambda*s0) {
450     x0 <- wmean
451     iterateResult <- saveIterate(x0, nmax, eps, print=print.iterations)
452     cut <- iterateResult$x
453     iter <- iterateResult$iter
454   }
455   if ((1-lambda)*s1 == lambda*s0) {
456     cut <- wmean
457   }
458
459   # Function to compute sensitivity and specificity and their confidence intervals
460   sesp <- function(x) {
461     y0 <- beta0*x + alpha0
462     sp <- expit(y0)
463     SEy0 <- sqrt(varalpha0 + x^2*varbeta0 + 2*x*cov0)
464     lsp <- expit(y0 - 1.96*SEy0)
465     usp <- expit(y0 + 1.96*SEy0)
466     y1 <- beta1*x + alpha1
467     se <- 1 - expit(y1)
468     SEy1 <- sqrt(varalpha1 + x^2*varbeta1 + 2*x*cov1)
469     lse <- 1 - expit(y1 + 1.96*SEy1)
470     use <- 1 - expit(y1 - 1.96*SEy1)
471     list(Sens = c("Sens", round(se,3), "[", round(lse,3), ";", round(use,3), "]"), Spec = c("Spec", round(sp,3), "[", round(lsp,3), ";", round(usp,3), "]"))
472   }
473
474   #...........................................
475   # Back-transform optimal cut-off
476   if(log) cutlog <- exp(cut)
477
478   # Plot y axis label
479   yLAB1 <- "Logit(P(negative test result))"
480 }
481
482 ############# PLOTS ##############################################
483
484 # Function to print confidence regions
485 printConfI <- function(){
486   gray <- rgb(190, 190, 190, maxColorValue=255, alpha=170)
487   upperSens <- function(vectorX) lapply(vectorX, function(x) 1 - as.numeric(sesp(x)$Sens[4]))
488   if(log) curve(upperSens(log(x)), col=gray, lwd=1, add=TRUE) else
489     curve(upperSens(x), col=gray, lwd=1, add=TRUE)
490   lowerSpec <- function(vectorX) lapply(vectorX, function(x) as.numeric(sesp(x)$Spec[4]))
491   if(log) curve(lowerSpec(log(x)), col=gray, lwd=1, add=TRUE) else
492     curve(lowerSpec(x), col=gray, lwd=1, add=TRUE)
493   lowerSens <- function(vectorX) lapply(vectorX, function(x) 1 - as.numeric(sesp(x)$Sens[6]))
494   if(log) curve(lowerSens(log(x)), col=gray, lwd=1, add=TRUE) else
495     curve(lowerSens(x), col=gray, lwd=1, add=TRUE)
496   upperSpec <- function(vectorX) lapply(vectorX, function(x) as.numeric(sesp(x)$Spec[6]))
497   if(log) curve(upperSpec(log(x)), col=gray, lwd=1, add=TRUE) else
498     curve(upperSpec(x), col=gray, lwd=1, add=TRUE)
499 }
500 #------------------------------------------------------
501 ## Plot 1: linear regression lines in logit/probit space
502 if(1 %in% plots){
503   if(print){pdf("PlotsMA/Plot1.pdf", width=5, height=4.5)}
504   if(log){
505     plot(Cutoff, resc(NN), pch=16, col=0, ylab=expression(atop("", Phi^{-1}~"(P(negative test result))")), xlab=xlab, cex.lab=0.7, cex.axis=0.7, log="x")
506     # add transformed data (possibly with lines)
507     points(CutoffSingle, resc(NN1), pch=16, cex=1, col=colorVector)
508     if(wl) connect <- lapply(1:numberOfStudies, function(i) lines(CutoffSingle[which(StudySingle==i)], resc(NN1[which(StudySingle==i)]), col=colorChart[i], lwd=1))
509     points(CutoffSingle, resc(NN0), pch=1, cex=1, col=colorVector)
510     if(wl) connect <- lapply(1:numberOfStudies, function(i) lines(CutoffSingle[which(StudySingle==i)], resc(NN0[which(StudySingle==i)]), col=colorChart[i], lwd=1))
511     # add linear regression lines
512     curve(alpha0 + beta0*log(x), lty=2, col=1, lwd=1, add=TRUE)
513     curve(alpha1 + beta1*log(x), lty=1, col=1, lwd=1, add=TRUE)
514   } else {
515     plot(Cutoff, resc(NN), pch=16, col=0, ylab=yLAB1, xlab=xlab)
516     # add transformed data (possibly with lines)
517     points(CutoffSingle, resc(NN1), pch=16, cex=1, col=colorVector)
518     if(wl) connect <- lapply(1:numberOfStudies, function(i) lines(CutoffSingle[which(StudySingle==i)], resc(NN1[which(StudySingle==i)]), col=colorChart[i], lwd=1))
519     points(CutoffSingle, resc(NN0), pch=1, cex=1, col=colorVector)
520     if(wl) connect <- lapply(1:numberOfStudies, function(i) lines(CutoffSingle[which(StudySingle==i)], resc(NN0[which(StudySingle==i)]), col=colorChart[i], lwd=1))
521     # add linear regression lines
522     abline(alpha0, beta0, lty=2, col=1, lwd=1)
523     abline(alpha1, beta1, lty=1, col=1, lwd=1)
524   }
525   if(print) dev.off()
526 }
527 #----------------------------------------------------------
528 ## Plot 2: data and biomarker distribution functions
529 if(2 %in% plots){
530   if(print){pdf("PlotsMA/Plot2.pdf", width=5, height=4.5)}
531   if(log) {
532     plot(Cutoff, NN, pch=16, col=0, ylab=expression(atop("", "P(negative test result)")), xlab=xlab, cex.lab=0.7, cex.axis=0.7, log="x", ylim=c(0,1))
533   }
534   if(!log) {
535     plot(Cutoff, NN, pch=16, col=0, ylab="P(negative test result)", xlab=xlab, ylim=c(0,1))
536   }
537   # add data (possibly with lines)
538   points(CutoffSingle, NN1, pch=16, cex=1, col=colorVector)
539   if(wl) connect <- lapply(1:numberOfStudies, function(i) lines(CutoffSingle[which(StudySingle==i)], NN1[which(StudySingle==i)], col=colorChart[i], lwd=1))
540   points(CutoffSingle, NN0, pch=1, cex=1, col=colorVector)
541   if(wl) connect <- lapply(1:numberOfStudies, function(i) lines(CutoffSingle[which(StudySingle==i)], NN0[which(StudySingle==i)], col=colorChart[i], lwd=1))
542   # add regression curves (possibly with confidence intervals)
543   if(normal) {
544     if(log) {
545       if(printCI) printConfI()
546       curve(pnorm(log(x), m0, s0), lty=2, col=1, lwd=1, add=TRUE)
547       curve(pnorm(log(x), m1, s1), lty=1, col=1, lwd=1, add=TRUE)
548     }
549     if(!log) {
550       if(printCI) printConfI()
551       curve(pnorm(x, m0, s0), lty=2, col=1, lwd=1, add=TRUE)
552       curve(pnorm(x, m1, s1), lty=1, col=1, lwd=1, add=TRUE)
553     }
554   }
555   else{
556     if(log) {
557       if(printCI) printConfI()
558       curve(expit(beta0*log(x) + alpha0), lty=2, col=1, lwd=1, add=TRUE)
559       curve(expit(beta1*log(x) + alpha1), lty=1, col=1, lwd=1, add=TRUE)
560     }
561     if(!log) {
562       if(printCI) printConfI()
563       curve(expit(beta0*x + alpha0), lty=2, col=1, lwd=1, add=TRUE)
564       curve(expit(beta1*x + alpha1), lty=1, col=1, lwd=1, add=TRUE)
565     }
566   }
567   # draw optimal cut-off
568   if(log) {abline(v=exp(cut))} else {abline(v=cut)}
569   # draw legend
570   if(log) legend(1, 0.12, paste("Optimal threshold =", round(cutlog,1)), cex=0.7, lwd=1, col=1, bty="n") else
571     legend(0, 1, paste("Optimal threshold =", round(cut,1)), cex=0.7, col=1, bty="n")
572   # legend(0.3*max(Cutoff), 0.95, paste("Optimal threshold =", round(cut,1)), cex=0.7, lwd=1, col=1, bty="n")
573   if(print) dev.off()
574 }
575
576 # ----------------------------------------------------------
577 ## Plot 3: Youden index = Sens + Spec - 1 ~ TNR - FNR
578 ##### not for the weighted Youden index! (-> would need to weight the data, too)
579 if(3 %in% plots){
580   if(print){pdf("PlotsMA/Plot3.pdf", width=5, height=4.5)}
581   # plot data
582   if(log) {
583     plot(CutoffSingle, NN0 - NN1, pch=16, ylab=expression(atop("", "Youden index")), xlab=xlab, cex.lab=0.7, cex.axis=0.7, ylim=c(0,1), col=colorVector,
584          cex=1, log="x")
585     if(wl) connect <- lapply(1:numberOfStudies, function(i) lines(CutoffSingle[which(StudySingle==i)], (NN0-NN1)[which(StudySingle==i)], col=colorChart[i], lwd=1))
586   }
587   if(!log) {
588     plot(CutoffSingle, NN0 - NN1, pch=16, ylab="Youden index", xlab=xlab, ylim=c(0,1), col=colorVector,
589          cex=1)
590     if(wl) connect <- lapply(1:numberOfStudies, function(i) lines(CutoffSingle[which(StudySingle==i)], (NN0-NN1)[which(StudySingle==i)], col=colorChart[i], lwd=1))
591   }
592   # plot Youden index
593   if(normal) {
594     if(log) {
595       curve(2*(1-lambda)*pnorm(log(x), m0, s0) + 2*lambda*(1 - pnorm(log(x), m1, s1)) - 1, col=1, add=TRUE)
596     }
597     if(!log) {
598       curve(2*(1-lambda)*pnorm(x, m0, s0) + 2*lambda*(1 - pnorm(x, m1, s1)) - 1, col=1, add=TRUE)
599     }
600   } else {
601     if(log) {
602       curve(2*(1-lambda)*expit(alpha0 + beta0*log(x)) + 2*lambda*(1 - expit(alpha1 + beta1*log(x))) - 1, col=1, add=TRUE)
603     }
604     if(!log) {
605       curve(2*(1-lambda)*expit(alpha0 + beta0*x) + 2*lambda*(1 - expit(alpha1 + beta1*x)) - 1, col=1, add=TRUE)
606     }
607   }
608   # plot line at the threshold where the maximum is obtained
609   if(log) {abline(v=exp(cut))} else {abline(v=cut)}
610   # draw legend
611   if(log) legend(1.1, 0.95, paste("Optimal threshold =", round(cutlog,1), "ng/mL"), cex=0.7, lwd=1, col=1, bty="n") else
612     legend(0.3*max(Cutoff), 0.95, paste("Optimal threshold =", round(cut,1)), cex=0.7, lwd=1, col=1, bty="n")
613   if(print) dev.off()
614 }
615
616 # --------------------------------------------------------
617 ## Plot 4: ROC curve
618 if(4 %in% plots){
619   if(print){pdf("PlotsMA/Plot4.pdf", width=5, height=5)}
620   # plot data
621   par(mfrow=c(1,1), pty="s")
622   plot(1-NN0, 1-NN1, pch=16, col=colorVector, cex=1, cex.lab=0.7, cex.axis=0.7,
623        xlim=c(0,1), ylim=c(0,1),
624        xlab="1 - Specificity", ylab="Sensitivity")
625   # add ROC curve
626   if(normal){
627     curve(1 - pnorm(qnorm(1-x, m0, s0), m1, s1), lwd=1, col=1, add=TRUE)
628     points(1 - pnorm(cut, m0, s0), 1 - pnorm(cut, m1, s1), lwd=1, cex=1, pch=3, col=1)
629   } else {
630     curve(1 - expit(alpha1 + beta1*(logit(1-x) - alpha0)/beta0), lwd=1, col=1, add=TRUE)
631     points(1 - expit(alpha0 + beta0*cut), 1 - expit(alpha1 + beta1*cut), lwd=1, cex=4, pch=3, col=1)
632   }
633   # add legend
634   if(log) {legend(0.2, 0.3, bty="n", lwd=c(1,NA,NA), pch=c(-1,3,NA), col=c(1,1,NA), cex=0.7, c("Model-based summary ROC curve", paste("Optimal threshold at", round(cutlog,1), "ng/mL,"), "(Se,Sp)=(0.71,0.80)"))
635   } else {
636     legend(0.2, 0.3, bty="n", lwd=2, pch=c(-1,3), col=c(1,1), cex=0.7, c("Model-based summary ROC curve", paste("Optimal threshold at", round(cut,1), "ng/mL")))
637   }
638   if(print) dev.off()
639 }
640
641
642 ############ OUTPUT #############################################
643 # print sens and spec and their confidence intervals at the chosen cut-off "evaluateCutoff"
644 if(evaluateCutoff == "optCutoff") {
645   evaluateCutoffHere <- cut
646   SESP <- sesp(cut)
647 } else {
648   if(log) evaluateCutoffHere <- as.numeric(log(evaluateCutoff)) else
649     evaluateCutoffHere <- as.numeric(evaluateCutoff)
650   SESP <- sesp(evaluateCutoffHere)
651 }
652
653 # create list with all results
654 results$model <- model
655 results$REML.criterion <- REMLcrit(lmeModel)
656 results$AIC <- AIC(logLik(lmeModel))
657 results$BIC <- BIC(lmeModel)
658 results$normal <- normal
659 results$log <- log
660 results$REML <- reml
661 results$weights.name <- weights
662 results$lambda <- lambda
663 results$nmax <- nmax
664 results$eps <- eps
665 results$iter <- iter
666 results$print.iterations <- print.iterations
667 results$print <- print
668 results$workingData <- Data # Cutoff not yet logarithmized
669 results$weights <- w
670 results$regression.coefficients <- data.frame(alpha_0=alpha0, beta_0=beta0, alpha_1=alpha1, beta_1=beta1)
671 results$distribution.parameters <- data.frame(m_0=m0, varm0=varm0, s_0=s0, vars0=vars0, m_1=m1, varm1=varm1, s_1=s1, vars1=vars1)
672 if(log) results$optimal.cutoff <- cutlog else
673   results$optimal.cutoff <- cut
674 results$pooled.sensitivity <- SESP$Sens[2] # element 1 is the label "Sens"
675 results$pooled.specificity <- SESP$Spec[2] # element 1 is the label "Spec"
676 results$inputData <- data
677 results$lmerOutput <- lmeModel
678
679 # Output on console
680 if(output){
681
682   cat("\nModel: ", model, "\n\n")
683
684   if(log) {cat("The optimal cut-off value is:", round(cutlog,3), "\n\n")} else
685     {cat("The optimal cut-off value is:", round(cut,3), "\n\n")}
686
687   cat("REML criterion: ", REMLcrit(lmeModel), "\n\n")
688
689   if(log)
690     cat("Pooled Sensitivity and Specificity at cutoff =", round(exp(evaluateCutoffHere),3), ":", "\n", SESP$Sens, "\n", SESP$Spec, "\n\n")
691   else
692     cat("Pooled Sensitivity and Specificity at cutoff =", round(evaluateCutoffHere,3), ":", "\n", SESP$Sens, "\n", SESP$Spec, "\n\n")
693
694   cat("--------------------------------", "\n\n")
695 }
696 return(results)
697
698 }
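The closed-form normal-case cut-off used in evaluate() (variables turn and rad) can be cross-checked against direct numerical maximization of the weighted Youden index. The parameter values below are illustrative assumptions, not thesis estimates.

```r
# Sketch with assumed parameters: non-diseased ~ N(m0, s0), diseased ~ N(m1, s1).
m0 <- 0;  s0 <- 1
m1 <- 2;  s1 <- 1.5
lambda <- 0.5  # equal weighting of sensitivity and specificity
logit <- function(x){ log(x) - log(1 - x) }
# Closed form used in evaluate() for the case s0 != s1
turn <- (m0*s1^2 - m1*s0^2)/(s1^2 - s0^2)
rad  <- sqrt(s0^2*s1^2*(2*(s1^2 - s0^2)*(log(s1) - log(s0) - logit(lambda)) +
             (m1 - m0)^2)/(s1^2 - s0^2)^2)
cut  <- if (s0 < s1) turn + rad else turn - rad
# Independent check: maximize the weighted Youden index numerically
youden <- function(x) (1 - lambda)*pnorm(x, m0, s0) + lambda*(1 - pnorm(x, m1, s1))
num <- optimize(youden, interval = c(-3, 6), maximum = TRUE)$maximum
```

Both routes agree (cut ≈ 1.087 for these parameters); at this point the weighted densities of the two groups intersect.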
B.2. Code Simulation Study
1 # Simulation study
2 # Susanne Steinhauser
3 # August/September 2015
4
5 library("data.table")
6 library(lme4)
7
8 # The version of the evaluate.R function used here differs from the one shown in section B.1 in:
9 # - 3 additional input parameters: cutoff1, cutoff2, cutoff3
10 # - at these cut-offs, sensitivity and specificity with their confidence intervals are evaluated
11 # - the output is a dataframe containing the distribution parameters with their variances (m0, varm0, m1, varm1, s0, vars0, s1, vars1),
12 #   the optimal cutoff (cut) and sensitivity and specificity at the 3 cut-offs given in the input with their upper and lower confidence interval boundaries.
13 source("Evaluate15Simu.R")
14
15 # Define logit function
16 logit <- function(x){
17 log(x) - log(1-x)
18 }
19
20
21 set.seed (14)

######################################### OUTPUT ##################################################

# open file
sink(paste(getwd(), "/", "SimuOutput", Sys.Date(), ".csv", sep="", collapse=NULL))
# write header to file
cat("numberOfRuns; mu0Perfect; lambda; sigma0Perfect; mu1Perfect; sigma1Perfect; tauMu; tauSigma; model; realMaxYouden; realSens1; realSpec1; cutoff2; realSens2; realSpec2; cutoff3; realSens3; realSpec3; ;")
cat("optCutMean; optCutSE; optCutMSE; m0mean; m0SE; m0MSE; m0Coverage; s0mean; s0SE; s0MSE; s0Coverage; m1mean; m1SE; m1MSE; m1Coverage; s1mean; s1SE; s1MSE; s1Coverage; ")
cat("sens1Mean; sens1SE; sens1MSE; sens1Coverage; spec1Mean; spec1SE; spec1MSE; spec1Coverage; sens2Mean; sens2SE; sens2MSE; sens2Coverage;")
cat("spec2Mean; spec2SE; spec2MSE; spec2Coverage; sens3Mean; sens3SE; sens3MSE; sens3Coverage; spec3Mean; spec3SE; spec3MSE; spec3Coverage; warnings; errors; negCorrError \n")

############################### PARAMETERS #########################################################

numberOfRuns <- 1000
# 1. number of studies per meta-analysis
numberOfStudies <- c(10, 20, 30)
# 2. normal distribution parameters of the healthy and ill populations
mu0Perfect <- 0
sigma0Perfect <- c(1.5, 1, 2.5, 2.5)
mu1Perfect <- c(2.5, 2.5, 4, 4)
sigma1Perfect <- c(1.5, 2, 2.5, 4)
# lognormal distribution parameters to draw the number of patients per study
meanlogNumberOfPatients <- 5
sdlogNumberOfPatients <- 1
# Poisson distribution parameter to draw the number of cutoffs
lambdaNumberOfCutoffs <- c(1.3, 2)
# 3. heterogeneity parameters
# for the code they should have the same length (for loop)
heterogenizeM <- c(0, 0.5, 1, 1.5)
heterogenizeS <- c(0, 0.3, 0.4, 0.5)
# vector of linear mixed effects models
modelVector <- as.list(c("1b", "4b", "5b", "7b", "1c", "4c", "5c", "7c"))
# weighting parameter for an unequal proportion of ill and healthy distributions
lambda <- 0.5

# initialize a warning and an error vector for the messages
warningMessageVector <- errorMessageVector <- vector(mode="character")
######################################### ITERATION OF PARAMETERS #################################
for(iterLambda in 1:length(lambdaNumberOfCutoffs)){
for(iterParameter in 1:length(mu1Perfect)){
for(h in 1:length(heterogenizeS)) {
for (nm in 1:length(modelVector)) {
  # create empty vectors to store the resulting parameters of all runs
  mu0RunsVector <- sigma0RunsVector <- mu1RunsVector <- sigma1RunsVector <-
    optCutRunsVector <- optCutMSE <- vector(length=numberOfRuns)
  coverageVectorM0 <- coverageVectorS0 <- coverageVectorM1 <- coverageVectorS1 <-
    vector(length=numberOfRuns)
  sens1 <- spec1 <- sens2 <- spec2 <- sens3 <- spec3 <-
    vector(length=numberOfRuns, mode="numeric")
  coverageVectorSens1 <- coverageVectorSens2 <- coverageVectorSens3 <-
    coverageVectorSpec1 <- coverageVectorSpec2 <- coverageVectorSpec3 <-
    vector(length=numberOfRuns)
  warningVector <- errorVector <- vector(length=numberOfRuns)
  negCorrVector <- rep(FALSE, numberOfRuns)

  # determine the optimal cutoff
  turn <- (mu0Perfect*sigma1Perfect[iterParameter]^2 -
           mu1Perfect[iterParameter]*sigma0Perfect[iterParameter]^2) /
          (sigma1Perfect[iterParameter]^2 - sigma0Perfect[iterParameter]^2)
  rad <- sqrt(sigma0Perfect[iterParameter]^2*sigma1Perfect[iterParameter]^2*
              (2*(sigma1Perfect[iterParameter]^2 - sigma0Perfect[iterParameter]^2)*
               (log(sigma1Perfect[iterParameter]) - log(sigma0Perfect[iterParameter]) -
                logit(lambda)) +
               (mu1Perfect[iterParameter] - mu0Perfect)^2)) /
         (sigma1Perfect[iterParameter]^2 - sigma0Perfect[iterParameter]^2)
  x0 <- turn - rad
  x1 <- turn + rad
  if (sigma0Perfect[iterParameter] < sigma1Perfect[iterParameter]) cut <- x1
  if (sigma0Perfect[iterParameter] > sigma1Perfect[iterParameter]) cut <- x0
  if (sigma1Perfect[iterParameter] == sigma0Perfect[iterParameter]) {
    cut <- (sigma0Perfect[iterParameter]^2*logit(lambda) +
            0.5*(mu0Perfect^2 - mu1Perfect[iterParameter]^2)) /
           (mu0Perfect - mu1Perfect[iterParameter])
  }

  YoudenindexMax <- cut

  # fix 3 cut-off values at which to evaluate the results for Sens, Spec
  cutoff1 <- YoudenindexMax
  cutoff2 <- mu0Perfect
  cutoff3 <- mu1Perfect[iterParameter]

  # function to calculate sens and spec at a cutoff value x
  sespSimu <- function(x) {
    sp <- pnorm(x, mu0Perfect, sigma0Perfect[iterParameter])
    se <- 1 - pnorm(x, mu1Perfect[iterParameter], sigma1Perfect[iterParameter])
    list(Sens=se, Spec=sp)
  }

  # get sens and spec at the fixed cutoffs 1, 2, 3
  sensPerfect1 <- sespSimu(cutoff1)$Sens
  specPerfect1 <- sespSimu(cutoff1)$Spec
  sensPerfect2 <- sespSimu(cutoff2)$Sens
  specPerfect2 <- sespSimu(cutoff2)$Spec
  sensPerfect3 <- sespSimu(cutoff3)$Sens
  specPerfect3 <- sespSimu(cutoff3)$Spec

  ########################################## RUNS #################################################
  for(m in 1:numberOfRuns) {
    # 1. generate a random sequence of entry choices for the numberOfStudies vector
    randomNumberOfStudiesVectorEntry <- sample(1:length(numberOfStudies),
                                               size=numberOfRuns, replace=TRUE)
    # vector with the number of studies for each run of the meta-analysis
    numberOfStudiesPerMetaanalysis <- numberOfStudies[randomNumberOfStudiesVectorEntry]
    # 2. determine the number of cutoffs per study (zero-truncated Poisson)
    numberOfCutoffsVector <- vector()
    while (length(numberOfCutoffsVector) < numberOfStudiesPerMetaanalysis[m]) {
      a <- rpois(lambda = lambdaNumberOfCutoffs[iterLambda], n = 1)
      if (a != 0) numberOfCutoffsVector <- c(numberOfCutoffsVector, a)
    }

    # 4. vector with the total patient number for each study
    totalPatientsPerStudy <- ceiling(rlnorm(n = numberOfStudiesPerMetaanalysis[m],
                                            meanlog = meanlogNumberOfPatients,
                                            sdlog = sdlogNumberOfPatients))
    # 5. vector with the proportion of ill patients for each study
    propIllPerStudy <- vector(length = numberOfStudiesPerMetaanalysis[m])
    i <- 0
    while(i < numberOfStudiesPerMetaanalysis[m]){
      p <- rnorm(1, mean=0.5, sd=0.2)
      if(p > 0.2 & p < 0.8) {propIllPerStudy[i+1] <- p; i <- i + 1}
    }
    # 4.) + 5.) vector with the number of ill patients per study
    numberIllPatients <- mapply(function(i,j) rbinom(n = 1, size = i, prob = j),
                                totalPatientsPerStudy, propIllPerStudy)

    # 2.) + 3.) heterogenize the distribution parameters
    # parameter per study: vector with an entry mu0 for every study, same for mu1.
    # Truncate the distributions so that mu1 is always greater than mu0.
    mu0 <- mu1 <- vector()
    while(length(mu0) < numberOfStudiesPerMetaanalysis[m]){
      m0 <- mu0Perfect + rnorm(1, mean=0, sd=heterogenizeM[h])
      m1 <- mu1Perfect[iterParameter] + rnorm(1, mean=0, sd=heterogenizeM[h])
      if((m1 - m0) > 0 & (m1 - m0) < 2*(mu1Perfect[iterParameter])) {
        mu0 <- c(mu0, m0)
        mu1 <- c(mu1, m1)
      }
    }
    # parameter per study: vector with an entry sigma0 for every study, same for sigma1.
    # Check that they are not negative and truncate the distributions symmetrically.
    sigma0 <- sigma1 <- vector()
    while(length(sigma0) < numberOfStudiesPerMetaanalysis[m]){
      s <- sigma0Perfect[iterParameter] + rnorm(1, mean=0, sd=heterogenizeS[h])
      if(s > 0 & s < 2*sigma0Perfect[iterParameter]) sigma0 <- c(sigma0, s)
    }
    while(length(sigma1) < numberOfStudiesPerMetaanalysis[m]){
      s <- sigma1Perfect[iterParameter] + rnorm(1, mean=0, sd=heterogenizeS[h])
      if(s > 0 & s < 2*sigma1Perfect[iterParameter]) sigma1 <- c(sigma1, s)
    }

    ############################################ DATA GENERATION ##################################
    SimuData <- list()

    for(s in 1:numberOfStudiesPerMetaanalysis[m]) {
      # generate biomarker values for healthy and ill patients (drawn from normal distributions)
      BiomarkerValuesHealthy <- rnorm(n=(totalPatientsPerStudy[s] - numberIllPatients[s]),
                                      mean=mu0[s], sd=sigma0[s])
      BiomarkerValuesIll <- rnorm(n=numberIllPatients[s], mean=mu1[s], sd=sigma1[s])

      ## 6. cut-off allocation: equidistantly between the 40% quantile of the distribution of
      ## the healthy and the 60% quantile of the distribution of the ill for each study
      cutoffMinVector <- qnorm(0.4, mean=mu0, sd=sigma0)
      cutoffMaxVector <- qnorm(0.6, mean=mu1, sd=sigma1)
      cutoffValues <- cutoffMinVector[s] + (1:numberOfCutoffsVector[s]) *
                      (cutoffMaxVector[s] - cutoffMinVector[s])/(numberOfCutoffsVector[s] + 1)

      for(c in 1:numberOfCutoffsVector[s]) {
        # 1.) + 2.) + 3.) + 4.) + 5.) + 6.) => test results of healthy and ill patients
        TNeg <- length(BiomarkerValuesHealthy[BiomarkerValuesHealthy < cutoffValues[c]])
        FPos <- length(BiomarkerValuesHealthy[BiomarkerValuesHealthy >= cutoffValues[c]])
        FNeg <- length(BiomarkerValuesIll[BiomarkerValuesIll < cutoffValues[c]])
        TPos <- length(BiomarkerValuesIll[BiomarkerValuesIll >= cutoffValues[c]])

        dataNew <- list(study=s, cutoff=cutoffValues[c], TN=TNeg, FN=FNeg, TP=TPos, FP=FPos)
        SimuData <- rbindlist(list(SimuData, dataNew), use.names=TRUE)
      }
    }
    ############################################### EVALUATE CALL #################################
    # apply the evaluate function to the generated data
    evalResult <- tryCatch({
      results <- evaluate(SimuData, normal=TRUE, model=modelVector[nm], cutoff1=cutoff1,
                          cutoff2=cutoff2, cutoff3=cutoff3, log=FALSE,
                          weights="InverseVarianceScaled2", plots=c(), output=FALSE)
      list(result=results, warn=FALSE, error=FALSE, warnMessage="-")
    },
    warning=function(w) {
      # w is the first warning of the first call. The warnings of the repeated call are
      # suppressed so they are not all shown again; the printout is thus always just the
      # first warning.
      suppressWarnings(results <- evaluate(SimuData, normal=TRUE, model=modelVector[nm],
                                           cutoff1=cutoff1, cutoff2=cutoff2, cutoff3=cutoff3,
                                           log=FALSE, weights="InverseVarianceScaled2",
                                           plots=c(), output=FALSE))
      list(result=results, warn=TRUE, error=FALSE, warnMessage=w$message)
    },
    error=function(e){
      list(result=data.frame(m0=NA, varm0=NA, s0=NA, vars0=NA, m1=NA, varm1=NA, s1=NA,
                             vars1=NA, cut=NA, SensC1=NA, SensC1l=NA, SensC1u=NA,
                             SensC2=NA, SensC2l=NA, SensC2u=NA,
                             SensC3=NA, SensC3l=NA, SensC3u=NA,
                             SpecC1=NA, SpecC1l=NA, SpecC1u=NA,
                             SpecC2=NA, SpecC2l=NA, SpecC2u=NA,
                             SpecC3=NA, SpecC3l=NA, SpecC3u=NA),
           warn=FALSE, error=TRUE, warnMessage=e$message)
    }, finally = {})

    ################################### STORAGE OF RESULTS ########################################
    attach(evalResult)
    # store results of every run
    optCutRunsVector[m] <- result$cut
    mu0RunsVector[m] <- result$m0
    sigma0RunsVector[m] <- result$s0
    mu1RunsVector[m] <- result$m1
    sigma1RunsVector[m] <- result$s1

    # coverage of m0, m1, s0, s1
    coverageVectorM0[m] <- abs(mu0Perfect - result$m0) <= 1.96*sqrt(result$varm0)
    coverageVectorM1[m] <- abs(mu1Perfect[iterParameter] - result$m1) <= 1.96*sqrt(result$varm1)
    coverageVectorS0[m] <- abs(sigma0Perfect[iterParameter] - result$s0) <= 1.96*sqrt(result$vars0)
    coverageVectorS1[m] <- abs(sigma1Perfect[iterParameter] - result$s1) <= 1.96*sqrt(result$vars1)

    # store sens and spec at the 3 fixed cutoffs
    sens1[m] <- result$SensC1
    sens2[m] <- result$SensC2
    sens3[m] <- result$SensC3
    spec1[m] <- result$SpecC1
    spec2[m] <- result$SpecC2
    spec3[m] <- result$SpecC3

    # coverage of sens, spec at the 3 fixed cutoffs
    # e.g. is sensPerfect1 in the confidence interval of sens1?
    coverageVectorSens1[m] <- (sensPerfect1 >= result$SensC1l & sensPerfect1 <= result$SensC1u)
    coverageVectorSens2[m] <- (sensPerfect2 >= result$SensC2l & sensPerfect2 <= result$SensC2u)
    coverageVectorSens3[m] <- (sensPerfect3 >= result$SensC3l & sensPerfect3 <= result$SensC3u)
    coverageVectorSpec1[m] <- (specPerfect1 >= result$SpecC1l & specPerfect1 <= result$SpecC1u)
    coverageVectorSpec2[m] <- (specPerfect2 >= result$SpecC2l & specPerfect2 <= result$SpecC2u)
    coverageVectorSpec3[m] <- (specPerfect3 >= result$SpecC3l & specPerfect3 <= result$SpecC3u)

    detach(evalResult)
    # count warnings and errors
    warningVector[m] <- evalResult$warn
    warningMessageVector <- c(warningMessageVector, evalResult$warnMessage)
    errorVector[m] <- evalResult$error
    errorMessageVector <- c(errorMessageVector, evalResult$warnMessage)
    if(evalResult$warnMessage == "neg correlation") negCorrVector[m] <- TRUE
  }
  ############################################ AVERAGING OF RESULTS #############################

  # calculate means and standard errors of the estimated parameters of all runs
  optCutMean <- mean(optCutRunsVector, na.rm=TRUE)
  optCutSE <- sd(optCutRunsVector, na.rm=TRUE)
  m0Mean <- mean(mu0RunsVector, na.rm=TRUE)
  m0SE <- sd(mu0RunsVector, na.rm=TRUE)
  s0Mean <- mean(sigma0RunsVector, na.rm=TRUE)
  s0SE <- sd(sigma0RunsVector, na.rm=TRUE)
  m1Mean <- mean(mu1RunsVector, na.rm=TRUE)
  m1SE <- sd(mu1RunsVector, na.rm=TRUE)
  s1Mean <- mean(sigma1RunsVector, na.rm=TRUE)
  s1SE <- sd(sigma1RunsVector, na.rm=TRUE)

  # calculate MSE
  m0MSE <- mean((rep(mu0Perfect, numberOfRuns) - mu0RunsVector)^2, na.rm=TRUE)
  s0MSE <- mean((rep(sigma0Perfect[iterParameter], numberOfRuns) - sigma0RunsVector)^2, na.rm=TRUE)
  m1MSE <- mean((rep(mu1Perfect[iterParameter], numberOfRuns) - mu1RunsVector)^2, na.rm=TRUE)
  s1MSE <- mean((rep(sigma1Perfect[iterParameter], numberOfRuns) - sigma1RunsVector)^2, na.rm=TRUE)

  sens1MSE <- mean((rep(sensPerfect1, numberOfRuns) - sens1)^2, na.rm=TRUE)
  spec1MSE <- mean((rep(specPerfect1, numberOfRuns) - spec1)^2, na.rm=TRUE)
  sens2MSE <- mean((rep(sensPerfect2, numberOfRuns) - sens2)^2, na.rm=TRUE)
  spec2MSE <- mean((rep(specPerfect2, numberOfRuns) - spec2)^2, na.rm=TRUE)
  sens3MSE <- mean((rep(sensPerfect3, numberOfRuns) - sens3)^2, na.rm=TRUE)
  spec3MSE <- mean((rep(specPerfect3, numberOfRuns) - spec3)^2, na.rm=TRUE)

  optCutMSE <- mean((rep(YoudenindexMax, numberOfRuns) - optCutRunsVector)^2, na.rm=TRUE)

  ########################################## OUTPUT RESULTS #####################################
  # format iteration and output parameters to decimal commas and 3 digits
  iterationParameters <- c(lambdaNumberOfCutoffs[iterLambda], sigma0Perfect[iterParameter],
                           mu1Perfect[iterParameter], sigma1Perfect[iterParameter],
                           heterogenizeM[h], heterogenizeS[h], modelVector[nm],
                           YoudenindexMax, sensPerfect1, specPerfect1, cutoff2, sensPerfect2,
                           specPerfect2, cutoff3, sensPerfect3, specPerfect3)
  outputParameters <-
    c(optCutMean, optCutSE, optCutMSE, m0Mean, m0SE, m0MSE,
      mean(coverageVectorM0, na.rm=TRUE),
      s0Mean, s0SE, s0MSE, mean(coverageVectorS0, na.rm=TRUE),
      m1Mean, m1SE, m1MSE, mean(coverageVectorM1, na.rm=TRUE),
      s1Mean, s1SE, s1MSE, mean(coverageVectorS1, na.rm=TRUE),
      mean(sens1, na.rm=TRUE), sd(sens1, na.rm=TRUE), sens1MSE,
      mean(coverageVectorSens1, na.rm=TRUE), mean(spec1, na.rm=TRUE), sd(spec1, na.rm=TRUE),
      spec1MSE, mean(coverageVectorSpec1, na.rm=TRUE),
      mean(sens2, na.rm=TRUE), sd(sens2, na.rm=TRUE), sens2MSE,
      mean(coverageVectorSens2, na.rm=TRUE), mean(spec2, na.rm=TRUE), sd(spec2, na.rm=TRUE),
      spec2MSE, mean(coverageVectorSpec2, na.rm=TRUE),
      mean(sens3, na.rm=TRUE), sd(sens3, na.rm=TRUE), sens3MSE,
      mean(coverageVectorSens3, na.rm=TRUE), mean(spec3, na.rm=TRUE), sd(spec3, na.rm=TRUE),
      spec3MSE, mean(coverageVectorSpec3, na.rm=TRUE),
      sum(warningVector, na.rm=TRUE), sum(errorVector, na.rm=TRUE),
      sum(negCorrVector, na.rm=TRUE))
  iterationParametersformatted <- format(iterationParameters, decimal.mark=",")
  outputParametersformatted <- format(outputParameters, decimal.mark=",")

  # write all parameters to file
  cat(";;")
  cat(iterationParametersformatted, sep=";")
  cat(";;")
  cat(outputParametersformatted, sep=";")
  cat("\n")
}
}
}
}

#)
sink()

cat("number Of Runs:", numberOfRuns)
cat("warning messages:")
print(warningMessageVector)
cat("error messages:")
print(errorMessageVector)
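The closed-form optimal cut-off computed above (the `turn` and `rad` expressions) can be cross-checked numerically. The sketch below, in Python, assumes — as I read the formula — that the cut-off maximizes the weighted Youden criterion λ·Sp(x) + (1 − λ)·Se(x) for two normal distributions; the parameter values are taken from the simulation grid, and the helper names are illustrative:

```python
from math import erf, log, sqrt

def norm_cdf(x, mu, sigma):
    """Normal CDF with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def logit(p):
    return log(p) - log(1.0 - p)

# Parameter values from the simulation grid (mu0Perfect = 0, second scenario).
mu0, s0, mu1, s1, lam = 0.0, 1.5, 2.5, 2.0, 0.5

# Closed form, mirroring the R code: root of a quadratic in x.
turn = (mu0 * s1**2 - mu1 * s0**2) / (s1**2 - s0**2)
rad = sqrt(s0**2 * s1**2 * (2 * (s1**2 - s0**2) *
                            (log(s1) - log(s0) - logit(lam)) +
                            (mu1 - mu0)**2)) / (s1**2 - s0**2)
cut_closed = turn + rad if s0 < s1 else turn - rad

# Numeric check: maximize lam*Sp(x) + (1-lam)*Se(x) on a fine grid.
youden = lambda x: lam * norm_cdf(x, mu0, s0) + (1 - lam) * (1 - norm_cdf(x, mu1, s1))
grid = [x / 1000.0 for x in range(-5000, 5001)]
cut_grid = max(grid, key=youden)

print(round(cut_closed, 3), round(cut_grid, 3))  # the two agree to grid resolution
```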
APPENDIX C
Simulation Study Plots
[Figure: x-axis Models (CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS); y-axis Bias (−10 to 30); series mu0, mu1, sigma0, sigma1; eight panels by heterogeneity and SD]
Figure C.1. Bias of µ0 (open light blue circle), µ1 (filled light blue circle), σ0 (open dark blue circle) and σ1 (filled dark blue circle) in the case of λ = 2 and nearby distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure: x-axis Models (CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS); y-axis Bias (−500 to 500); series mu0, mu1, sigma0, sigma1; eight panels by heterogeneity and SD]
Figure C.2. Bias of µ0 (open light blue circle), µ1 (filled light blue circle), σ0 (open dark blue circle) and σ1 (filled dark blue circle) in the case of λ = 1.3 and distant distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure: two stacked plots; x-axis Models (CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS); y-axis Bias (−0.1 to 0.2); series sens_2/spec_2 (top) and sens_3/spec_3 (bottom)]
Figure C.3. Bias of sensitivity and specificity at the 'real' optimal threshold (top) and at threshold 2.5 (bottom) in the case of 5 thresholds per study and nearby distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure: two stacked plots; x-axis Models (CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS); y-axis Bias (−0.1 to 0.2); series sens_2/spec_2 (top) and sens_3/spec_3 (bottom)]
Figure C.4. Bias of sensitivity and specificity at the 'real' optimal threshold (top) and at threshold 4 (bottom) in the case of λ = 1.3 and distant distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure: x-axis Models (CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS); y-axis Bias (−3 to 0); eight panels by heterogeneity and SD]
Figure C.5. Bias of the optimal threshold in the case of 5 thresholds per study and nearby distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure: two stacked plots; x-axis Models (CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS); y-axis MSE (0 to 5e+06 top, 20 to 80 bottom); series m0, s0, m1, s1]
Figure C.6. MSE of the distribution parameters in the case of nearby distributions and λ = 1.3. The top picture is the overall view and the one on the bottom a zoomed-in version with MSE ≤ 100. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure: two stacked plots; x-axis Models (CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS); y-axis MSE (0.02 to 0.08); series sens_2/spec_2 (top) and sens_3/spec_3 (bottom)]
Figure C.7. MSE of sensitivity and specificity at the 'real' optimal threshold (top) and at threshold 2.5 (bottom) in the case of λ = 1.3 and nearby distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure: two stacked plots; x-axis Models (CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS); y-axis MSE (0.02 to 0.08); series sens_1/spec_1 (top) and sens_2/spec_2 (bottom)]
Figure C.8. MSE of sensitivity and specificity at threshold 0 (top) and at the 'real' optimal threshold (bottom) in the case of λ = 1.3 and distant distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure: x-axis Models (CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS); y-axis MSE (0.02 to 0.08); series sens_3, spec_3]
Figure C.9. MSE of sensitivity and specificity at threshold 4 in the case of λ = 1.3 and distant distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure: x-axis Models (CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS); y-axis MSE (0 to 300); eight panels by heterogeneity and SD]
Figure C.10. MSE of the optimal threshold in the case of λ = 1.3 and nearby distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure: x-axis Models (CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS); y-axis Coverage (0.2 to 0.8); series m0, s0, m1, s1]
Figure C.11. Coverage of the distribution parameters in the case of 5 thresholds per study and nearby distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure: two stacked plots; x-axis Models (CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS); y-axis Coverage (0.2 to 0.8); series sens_3/spec_3 (top) and sens_1/spec_1 (bottom)]
Figure C.12. Coverage of sensitivity and specificity in the case of nearby distributions. Top: at threshold 2.5 in the case of λ = 1.3. Bottom: at threshold 0 in the case of 5 thresholds per study. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
[Figure: two stacked plots; x-axis Models (CI, DS, CICS, CIDS, *CI, *DS, *CICS, *CIDS); y-axis Coverage (0.2 to 0.8); series sens_2/spec_2 (top) and sens_3/spec_3 (bottom)]
Figure C.13. Coverage of sensitivity and specificity at the 'real' optimal threshold (top) and at threshold 2.5 (bottom) in the case of 5 thresholds per study and nearby distributions. The heterogeneity of the studies increases from left to right. The four plots at the bottom show the case of same standard deviations (SD), the top four plots the case of different standard deviations.
Bibliography
Aertgeerts, B., Buntinx, F., and Kester, A. (2004). The value of the
cage in screening for alcohol abuse and alcohol dependence in general
clinical populations: a diagnostic meta-analysis. Journal of Clinical
Epidemiology, 57(1):30–39.
Agresti, A. (1990). Categorical Data Analysis. John Wiley & Sons,
New York, first edition.
Arends, L. R., Hamza, T. H., van Houwelingen, J., Heijenbrok-Kal, M.,
Hunink, M., and Stijnen, T. (2008). Bivariate random effects meta-
analysis of ROC curves. Medical Decision Making, 28(5):621–638.
Barr, D. J., Levy, R., Scheepers, C., and Tily, H. J. (2013). Random ef-
fects structure for confirmatory hypothesis testing: Keep it maximal.
Journal of Memory and Language, 68:255–278.
Bates, D., Machler, M., Bolker, B. M., and Walker, S. C.
(2015). Package ’lme4’. Available from: https://cran.r-
project.org/web/packages/lme4/lme4.pdf.
Crombie, I. K. and Davies, H. T. (2009). What is meta-analysis?
Available from: http://www.medicine.ox.ac.uk/bandolier/painres/
download/whatis/meta-an.pdf.
Faraway, J. J. (2006). Extending the Linear Model with R. Taylor &
Francis Group.
Ga�lecki, A. and Burzykowski, T. (2013). Linear Mixed-Effects Mod-
els Using R: A Step-by-Step Approach. Springer Science+Business
Media.
Greven, S. and Kneib, T. (2010). On the behaviour of marginal and
conditional AIC in linear mixed models. Biometrika, 97(4):773–789.
Hamza, T. H., Arends, L. R., van Houwelingen, H. C., and Stijnen,
T. (2009). Multivariate random effects meta-analysis of diagnostic
tests with multiple thresholds. BMC Medical Research Methodology,
10(9):73.
133
134 Bibliography
Harbord, R. M., Deeks, J. J., Egger, M., Whiting, P., and Sterne,
J. A. (2007). A unification of models for meta-analysis of diagnostic
accuracy studies. Biostatistics, 8:239–251.
Hodges, J. S. and Sargent, D. J. (2001). Counting degrees of freedom
in hierarchical and other richly-parameterised models. Biometrika,
88:367–379.
Honest, H. and Khan, K. S. (2002). Reporting of measures of accuracy
in systematic reviews of diagnostic literature. BMC Health Services
Research, 2.
Lusted, L. B. (1971). Signal detectability and medical decision-making.
Science, 171:1217–1219.
Martínez-Camblor, P. (2014). Fully non-parametric receiver operat-
ing characteristic curve estimation for random-effects meta-analysis.
Statistical Methods in Medical Research.
Laird, N. M. and Ware, J. H. (1982). Random-effects models for longi-
tudinal data. Biometrics, 38:963–974.
Putter, H., Fiocco, M., and Stijnen, T. (2010). Meta-analysis of diag-
nostic test accuracy studies with multiple thresholds using survival
methods. Biometrical Journal, 52(1):95–110.
Reitsma, J., Glas, A., Rutjes, A., Scholten, R., Bossuyt, P., and Zwin-
derman, A. (2005). Bivariate analysis of sensitivity and specificity
produces informative summary measures in diagnostic reviews. Jour-
nal of Clinical Epidemiology, 58(10):982–990.
Ressing, M., Blettner, M., and Klug, S. J. (2009). Systematische
Übersichtsarbeiten und Metaanalysen. Deutsches Ärzteblatt International,
106(27):456–463.
Riley, R., Takwoingi, Y., Trikalinos, T., Guha, A., Biswas, A., Ensor,
J., Morris, R. K., and Deeks, J. (2014). Meta-analysis of test accu-
racy studies with multiple and missing thresholds: a multivariate-
normal model. Journal of Biometrics and Biostatistics, 5:196.
doi:10.4172/2155-6180.1000196.
Rücker, G. and Schumacher, M. (2010). Summary ROC curve based on
the weighted Youden index for selecting an optimal cutpoint in meta-
analysis of diagnostic accuracy. Statistics in Medicine, 29:3069–3078.
Rutter, C. M. and Gatsonis, C. A. (2001). A hierarchical regression
approach to meta-analysis of diagnostic test accuracy evaluations.
Statistics in Medicine, 20:2865–2884.
Schumacher, M. and Schulgen, G. (2008). Methodik klinischer Studien.
Methodische Grundlagen der Planung, Durchführung und Auswer-
tung. Springer-Verlag Inc., Heidelberg, third edition.
Schwarzer, G., Carpenter, J. R., and Rücker, G. (2015). Meta-Analysis
with R. Use R! series, Springer, Berlin, Heidelberg.
Vaida, F. and Blanchard, S. (2005). Conditional Akaike information
for mixed-effects models. Biometrika, 92(2):351–370.
Vouloumanou, E., Plessa, E., Karageorgopoulos, D., Mantadakis, E.,
and Falagas, M. (2011). Serum procalcitonin as a diagnostic marker
for neonatal sepsis: a systematic review and meta-analysis. Intensive
Care Medicine, 37(5):747–762.
Wacker, C., Prkno, A., Brunkhorst, F. M., and Schlattmann, P. (2013).
Procalcitonin as a diagnostic marker for sepsis: a systematic review
and meta-analysis. Lancet Infectious Diseases, 13(5):426–435.
Willis, B. H. and Quigley, M. (2011). Uptake of newer methodological
developments and the deployment of meta-analysis in diagnostic test
research: a systematic review. BMC Medical Research Methodology,
11:27.
World Health Organisation (2001). Biomarkers in risk assessment: Va-
lidity and validation. Available from: http://www.inchem.org/documents/ehc/ehc/ehc222.htm.
Zhelev, Z., Hyde, C., Youngman, E., Rogers, M., Fleming, S., Slade,
T., Coelho, H., Jones-Hughes, T., and Nikolaou, V. (2015). Diagnos-
tic accuracy of single baseline measurement of Elecsys troponin T
high-sensitive assay for diagnosis of acute myocardial infarction in
emergency department: systematic review and meta-analysis. BMJ,
350:h15. doi:10.1136/bmj.h15.
I hereby declare that I have written this thesis independently, that I
have cited all sources and marked all quotations, and that this thesis
has not been submitted in the same or a similar form to any other
examination authority.
Freiburg, 28.09.2015 ——————————————————–
Susanne Steinhauser