ACTA UNIVERSITATIS UPSALIENSIS
UPPSALA 2014

Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Pharmacy 189

Methodology for Handling Missing Data in Nonlinear Mixed Effects Modelling

ÅSA M. JOHANSSON

ISSN 1651-6192
ISBN 978-91-554-8970-0
urn:nbn:se:uu:diva-224098

Dissertation presented at Uppsala University to be publicly examined in B41, BMC, Husargatan 3, Uppsala, Friday, 29 August 2014 at 09:15 for the degree of Doctor of Philosophy (Faculty of Pharmacy). The examination will be conducted in English. Faculty examiner: Professor Leon Aarons (University of Manchester, UK; School of Pharmacy and Pharmaceutical Sciences).

Abstract

Johansson, Å. M. 2014. Methodology for Handling Missing Data in Nonlinear Mixed Effects Modelling. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Pharmacy 189. 75 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-554-8970-0.

To obtain a better understanding of the pharmacokinetic and/or pharmacodynamic characteristics of an investigated treatment, clinical data are often analysed with nonlinear mixed effects modelling. The developed models can be used to design future clinical trials or to guide individualised drug treatment. Missing data is a frequently encountered problem in analyses of clinical data, and to avoid jeopardising the predictive performance of the developed model, it is of great importance that the method chosen to handle the missing data is adequate for its purpose. The overall aim of this thesis was to develop methods for handling missing data in the context of nonlinear mixed effects models and to compare strategies for handling missing data, in order to provide guidance on efficient handling and on the consequences of inappropriate handling of missing data.

In accordance with missing data theory, all missing data can be divided into three categories: missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR). When data are MCAR, the underlying missing data mechanism does not depend on any observed or unobserved data; when data are MAR, the underlying missing data mechanism depends on observed data but not on unobserved data; when data are MNAR, the underlying missing data mechanism depends on the unobserved data itself.

Strategies and methods for handling missing observation data and missing covariate data were evaluated. These evaluations showed that the most frequently used estimation algorithm in nonlinear mixed effects modelling (first-order conditional estimation) resulted in biased parameter estimates regardless of the missing data mechanism. However, expectation maximization (EM) algorithms (e.g. importance sampling) resulted in unbiased and precise parameter estimates as long as data were MCAR or MAR. When the observation data are MNAR, a proper method for handling the missing data has to be applied to obtain unbiased and precise parameter estimates, regardless of the estimation algorithm.

The evaluation of different methods for handling missing covariate data showed that a correctly implemented multiple imputation method and full maximum likelihood modelling methods resulted in unbiased and precise parameter estimates when covariate data were MCAR or MAR. When the covariate data were MNAR, the only method resulting in unbiased and precise parameter estimates was a full maximum likelihood modelling method where an extra parameter was estimated, correcting for the unknown missing data mechanism's dependence on the missing data.

This thesis presents new insight into the dynamics of missing data in nonlinear mixed effects modelling. Strategies for handling different types of missing data have been developed and compared in order to provide guidance on efficient handling and on the consequences of inappropriate handling of missing data.

Keywords: Pharmacometrics, population models, censored observations, missing covariates, missing dependent variable, missing data mechanism, missing completely at random (MCAR), missing at random (MAR), missing not at random (MNAR), estimation algorithms

Åsa M. Johansson, Department of Pharmaceutical Biosciences, Box 591, Uppsala University, SE-75124 Uppsala, Sweden.

© Åsa M. Johansson 2014

http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-224098

There are three kinds of lies:
Lies, damned lies, and statistics.

List of papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Johansson ÅM, Hill N, Perisoglou M, Whelan J, Karlsson MO, Standing JF. (2011) A Population Pharmacokinetic/Pharmacodynamic Model of Methotrexate and Mucositis Scores in Osteosarcoma. Therapeutic Drug Monitoring. 33(6):711-718

II Johansson ÅM, Ueckert S, Plan EL, Hooker AC, Karlsson MO. (2014) Evaluation of Bias, Precision, Robustness and Runtime for the EM and Monte Carlo based Estimation Methods in NONMEM 7. Journal of Pharmacokinetics and Pharmacodynamics. [Accepted]

III Johansson ÅM, Karlsson MO. (2014) The Impact of Censored Observations on Model Fit and Structural Model Discrimination in Nonlinear Mixed Effects Modelling when using Different Estimation Algorithms. [In manuscript]

IV Johansson ÅM, Karlsson MO. (2013) Multiple Imputation of Missing Covariates in NONMEM and Evaluation of the Method's Sensitivity to η-Shrinkage. The AAPS Journal. 15(4):1035-1042

V Johansson ÅM, Karlsson MO. (2013) Comparison of Methods for Handling Missing Covariate Data. The AAPS Journal. 15(4):1232-1241

Reprints were made with permission from the publishers.

Contents

1 Introduction
    1.1 Pharmacological and physiological data
        1.1.1 Pharmacokinetic data
        1.1.2 Pharmacodynamic data
        1.1.3 Covariate data
    1.2 Pharmacometrics
        1.2.1 Application of pharmacometrics
        1.2.2 Nonlinear mixed effects models
        1.2.3 Maximum likelihood estimation
    1.3 Missing data
        1.3.1 Missing observations
        1.3.2 Missing covariates
2 Aims
3 Methods
    3.1 Data and models
        3.1.1 Methotrexate
        3.1.2 Response data models
        3.1.3 Missing observations
        3.1.4 Missing covariates
    3.2 Data analysis and model evaluation
        3.2.1 Model for clinical data
        3.2.2 Simulation studies
4 Results
    4.1 Model for clinical data
        4.1.1 Methotrexate
    4.2 Simulation studies
        4.2.1 Response data models
        4.2.2 Missing observations
        4.2.3 Missing covariates
5 Discussion
6 Conclusions
7 Acknowledgements
References

Abbreviations

AUC  Area under the concentration-time curve
BAYES  Full Markov Chain Monte-Carlo Bayesian analysis
BMI  Body mass index
BOV  Between occasion variability
BSV  Between subject variability
BQL  Below quantification limit
CL  Clearance
EM  Expectation maximization
FFM  Fat free mass
FO  First order approximation
FOCE  First order conditional estimation
fu  Fraction unbound
GFR  Glomerular filtration rate
IMP  Importance sampling expectation maximization
IMPMAP  Importance sampling EM assisted by mode a posteriori
ITS  Iterative two-stage
IQR  Interquartile range
LAPLACE  Laplace approximation
LLOQ  Lower limit of quantification
MAR  Missing at random
MCAR  Missing completely at random
MNAR  Missing not at random
NEE  Normalized estimation error
OFV  Objective function value
PD  Pharmacodynamics
PK  Pharmacokinetics
RER  Relative estimation error
RSE  Relative standard error
rRMSE  Relative root mean squared error
RTTE  Repeated time to event
SAEM  Stochastic approximation expectation maximization
SD  Standard deviation
SCr  Serum creatinine
TDM  Therapeutic drug monitoring
VPC  Visual predictive check
WT  Body weight

1. Introduction

1.1 Pharmacological and physiological data

The word pharmacology derives from Greek and means the knowledge or study of a poison or drug. The scientific field of pharmacology deals with the interactions occurring between chemical substances (man-made or natural) and living organisms. The science of pharmacology is divided into several sub-fields. One of them is pharmacokinetics (PK), which deals with how the living organism affects the chemical substance; another is pharmacodynamics (PD), which deals with how the chemical substance affects the living organism.

1.1.1 Pharmacokinetic data

To investigate how the living organism affects the chemical substance, concentration measurements of the substance and/or its metabolites are made at a number of different time points after the administration. The concentration could for example be measured from venous blood samples, exhaled air and/or urine samples. Concentration measurements are a continuous data variable.

1.1.2 Pharmacodynamic data

Unlike PK data, there is usually no direct way to measure how the chemical substance affects the living organism. Sometimes, the underlying response in the organism can be measured as a change in a biomarker [1]. The biomarker could for example be blood glucose concentration or blood pressure, measured at different time points after the administration of the substance. In these cases, the response is measured as a continuous data variable. When the response cannot be measured via a biomarker, the alternative is to rely on a clinical endpoint which reflects how the individual feels, functions or survives [1]. The clinical endpoint could for example be whether the individual experiences pain or not (binary data), whether the individual experiences mild, moderate or severe pain (ordered categorical data), how long it will take until the individual experiences one or more episodes of pain ((repeated) time to event data) or how many episodes of pain the individual experiences during a certain time interval (count data). All these clinical endpoints are examples of categorical data variables (time is usually handled as time intervals).

1.1.3 Covariate data

Apart from PK and PD data, other types of pharmacological and physiological data can be measured, which characterise and distinguish individuals from one another. These types of data are often referred to as covariate data. Covariate variables can be continuous, like measurements of body weight or serum creatinine concentrations, or they can be (nominal) categorical, like sex or the genotype regulating some metabolic rate.

1.2 Pharmacometrics

The efficacy and safety of a drug are usually assured through traditional statistical analyses of changes in PD data (biomarkers or clinical endpoints). These analyses compare PD data collected at two different time points (e.g. before and after drug intake) or PD data collected for two or more groups of individuals (e.g. PD data at the end of a study for different dose groups). Pharmacometrics is emerging as a powerful alternative to the traditional statistical efficacy and safety analyses. Instead of only analysing data collected at a couple of time points during the study, the pharmacometric approach allows the data analyst to analyse data from the whole time course of the study simultaneously. The drug effect (both desired and undesired) is linked to the measured drug concentrations (drug exposure) through mathematical statistical models.

1.2.1 Application of pharmacometrics

Before going into more detail on the mathematical statistical models used in pharmacometrics and how to analyse the data using these models, I want to give a couple of examples of when pharmacometrics is particularly applicable.

Drug development

Drug development is a long and costly process [2] and many clinical (and preclinical) trials fail to give answers to the questions they were intended to answer. Pharmacometric methods can be used during drug development to accumulate knowledge and information about the "dose-exposure-response" relationship of the investigated drug [3]. The knowledge, and not least the gaps in knowledge, can then be used to design future clinical trials in a cycle of learning and confirming [4]. The developed pharmacometric model(s) can also be used to identify failing drug candidates early on in the development process and thereby save a lot of time and money [3, 5].

The application of pharmacometric methods in drug development is referred to as model based drug development, and companies that have adapted their procedures to model based drug development have gained in efficiency [5].

Individualised drug treatment

The collection of PK and PD data does not stop once the drug has been introduced to the market. Therapeutic drug monitoring (TDM), where PK and/or PD data are collected regularly from each patient, is routine for many drugs. TDM is especially required for drugs with a narrow therapeutic window, i.e. a narrow span within which the drug has sufficient effect but not unacceptable side effects. Pharmacometric methods can be applied to TDM data to develop models which can be used to estimate initial doses and to optimize and individualise the dosing strategies [6–9].

1.2.2 Nonlinear mixed effects models

The population modelling approach is the most widely used in pharmacometrics. The application of nonlinear mixed effects modelling allows the modeller to simultaneously estimate parameters for the typical individual in the population (typical values) and variances describing the differences between individuals (between subject variability) and the variability due to measurement errors and model misspecification errors (residual error) [10]. These parameters are population parameters, and the typical value parameters are usually referred to as fixed effects while the variance parameters are referred to as random effects (to be precise, the variance parameters are variances of random effects and not random effects per se). Based on the estimates of the typical value and variance parameters, it is possible to obtain estimates of the individual parameters (the true random effects).

The nonlinear mixed effects model can be regarded as consisting of three different model parts: the structural model, the stochastic model, and the covariate model. The models are only models, trying to explain what is observed in the data. There is no such thing as a true model (unless the data are simulated from a model to start with); however, the model will be more or less useful depending on how well it captures the data.

Structural model

The structural model describes the central tendency of the data, a model for the typical individual in the studied population. A structural model for PK data aims at describing the absorption, disposition and elimination of the drug, while a structural model for PD data aims at describing the observed responses and determining an appropriate linkage between the PK model and the PD model. Examples of PK structural models are zero- or first-order absorption, one-, two- or three-compartment disposition and zero- or first-order elimination. The PK structural models are specified as systems of differential equations.
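As a minimal illustration of a PK structural model written as differential equations, the sketch below implements a one-compartment model with first-order absorption and first-order elimination. The function names, the parameter values and the explicit Euler integrator are assumptions made for this example and are not taken from the thesis; the closed-form solution assumes that the absorption rate constant differs from CL/V.

```python
import math


def one_cmt_oral_conc(t, dose, ka, cl, v):
    """Closed-form concentration for a one-compartment model with
    first-order absorption (ka) and first-order elimination (cl / v).
    Assumes ka != cl / v."""
    ke = cl / v
    return dose * ka / (v * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))


def one_cmt_oral_euler(t_end, dose, ka, cl, v, dt=1e-3):
    """The same model written as two differential equations and integrated
    with an explicit Euler scheme (chosen for brevity, not accuracy)."""
    ke = cl / v
    a_gut, a_central = dose, 0.0  # drug amounts in the two compartments
    for _ in range(int(round(t_end / dt))):
        d_gut = -ka * a_gut                     # dA_gut/dt
        d_central = ka * a_gut - ke * a_central  # dA_central/dt
        a_gut += d_gut * dt
        a_central += d_central * dt
    return a_central / v


# Hypothetical values: dose 100 mg, ka 1 /h, CL 5 L/h, V 50 L, t = 2 h
print(one_cmt_oral_conc(2.0, 100.0, 1.0, 5.0, 50.0))
print(one_cmt_oral_euler(2.0, 100.0, 1.0, 5.0, 50.0))
```

The two printed values should agree closely, which shows that the differential-equation form and the analytical solution describe the same typical-individual profile.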

The PD structural models differ depending on the type of response data observed. Continuous response variables are modelled using different types of compartmental models, e.g. indirect response models. For categorical response variables, the structural model could for example be a logistic regression model (for binary data), an ordered logistic regression model (for ordered categorical data) [11], a cumulative hazard model (for repeated time to event (RTTE) data) or a Poisson model (for count data) [12]. The linkage between the PK model and the PD model (also a structural model) is usually described with a linear, Emax or sigmoidal Emax model, via a direct effect or effect compartment model.
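The sketch below illustrates two of the building blocks named above: a (sigmoidal) Emax model as the PK/PD link and a logistic regression model for binary data. All function and parameter names are hypothetical and chosen for illustration; the way the drug effect enters the logit is one possible choice, not the parameterisation used in the thesis.

```python
import math


def emax_effect(conc, e0, emax, ec50, gamma=1.0):
    """Sigmoidal Emax model; gamma = 1 reduces it to the ordinary Emax model."""
    return e0 + emax * conc ** gamma / (ec50 ** gamma + conc ** gamma)


def logistic_probability(logit):
    """Inverse logit: maps a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-logit))


def probability_of_event(conc, baseline_logit, emax, ec50):
    """Binary-data model with a direct Emax drug effect on the logit scale
    (an assumed linkage, used here only as an example)."""
    return logistic_probability(emax_effect(conc, baseline_logit, emax, ec50))


print(emax_effect(10.0, 0.0, 100.0, 10.0))         # half of Emax at conc = EC50
print(probability_of_event(10.0, -1.0, 2.0, 10.0))  # logit = -1 + 1 = 0
```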

The developed model can give a more or less mechanistic description of the PK and/or PD data depending on the amount of information available in the data and the amount of prior knowledge available about the pharmacology of the drug and the physiology of the investigated biological system (e.g. the human body).

Stochastic model

The stochastic model identifies and quantifies the variability in the model and model parameters, thereby accounting for model misspecification and measurement errors and allowing individuals to differ from each other (and from the typical individual) [10]. Depending on the data, different levels of variability are identifiable. The two most common levels are the previously mentioned between subject variability (BSV) and residual error. However, if data are collected on a number of occasions for the same individual, it is not likely that the PK and/or PD model parameters are exactly the same on all occasions. This type of variability is called between occasion variability (BOV), and it has been shown that neglecting BOV when it is present leads to biased parameter estimates [13]. Another level of variability, which has to be accounted for if present, is the between study variability [14].

The BSV and BOV are usually modelled using a log-normal distribution of the individual parameter values (and values from different occasions) around the typical value of the parameter. The residual error is usually modelled as an additive, proportional or combined error (variability) around the model prediction of the individual drug concentrations/continuous responses.
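A minimal sketch of these stochastic components follows, assuming a scalar parameter, normally distributed random effects and hypothetical function names. It is meant only to show how log-normal BSV/BOV and a combined residual error are typically written, not to reproduce any model from the thesis.

```python
import math
import random


def individual_parameter(typical_value, eta, kappa=0.0):
    """Log-normal variability: eta is the between-subject random effect and
    kappa an optional between-occasion random effect."""
    return typical_value * math.exp(eta + kappa)


def add_residual_error(prediction, eps_prop=0.0, eps_add=0.0):
    """Combined residual error; set eps_add = 0 for a purely proportional
    error or eps_prop = 0 for a purely additive error."""
    return prediction * (1.0 + eps_prop) + eps_add


def simulate_individual_clearances(tv_cl, omega2, n_subjects, seed=1):
    """Draw individual clearances around the typical value tv_cl with
    between-subject variance omega2 (on the log scale)."""
    rng = random.Random(seed)
    sd = math.sqrt(omega2)
    return [individual_parameter(tv_cl, rng.gauss(0.0, sd)) for _ in range(n_subjects)]


print(simulate_individual_clearances(5.0, 0.09, 5))
```

A consequence of the log-normal form is that every simulated individual parameter is positive, whatever value the random effect takes.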

Covariate model

Some of the quantified BSV can be explained by differences in the individuals' covariates. It is more common to include covariates in the PK model than in the PD model (a lot of the variability in response between individuals can be explained by the differences in drug exposure). For the PK model, the most common covariates to include are covariates describing body size (e.g. body weight, body surface area, body mass index), kidney function (e.g. serum creatinine, creatinine clearance) and/or liver function (e.g. albumin, bilirubin).

The covariate effects are included as typical value parameters affecting the structural model parameters. Continuous covariates can for example be included in the model as a linear additive covariate effect, a covariate effect centred on or normalized to the median of the covariate or, as is often the case for body weight, using allometric scaling [15,16]. A categorical covariate can be included by estimating different typical value parameters for the different covariate categories, or by defining one category as the base category and then modelling the typical values for the other categories as changes from the base category.
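The covariate parameterisations listed above can be sketched as follows. The function names are hypothetical, and the reference weight of 70 kg and the exponent of 0.75 for clearance are commonly used conventions assumed here for illustration; the text does not state which values were used.

```python
def allometric_clearance(tv_cl, weight, ref_weight=70.0, exponent=0.75):
    """Allometric scaling of clearance with body weight.
    ref_weight and exponent are assumed conventional values."""
    return tv_cl * (weight / ref_weight) ** exponent


def median_centred_linear(tv, covariate, median, slope):
    """Linear covariate effect centred on the covariate median, so that
    tv is the typical value for an individual with the median covariate."""
    return tv * (1.0 + slope * (covariate - median))


def categorical_effect(tv_base, in_other_category, fractional_change):
    """Categorical covariate expressed as a fractional change from the
    base category."""
    return tv_base * (1.0 + fractional_change) if in_other_category else tv_base


print(allometric_clearance(10.0, 140.0))           # doubling weight scales CL by 2**0.75
print(median_centred_linear(10.0, 90.0, 80.0, 0.01))
print(categorical_effect(10.0, True, -0.2))
```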

General mathematical description

The mathematical description of a nonlinear mixed effects model is complex and involves vector and matrix calculus.

Let $N$ be the number of subjects included in the data set, independently sampled from the study population, and let $n_i$ be the number of observed dependent variables (i.e. measured concentrations and/or responses) for individual $i$. The $j$th ($j = 1, \ldots, n_i$) observed dependent variable for individual $i$ ($i = 1, \ldots, N$) is denoted $y_{ij}$.

The general nonlinear mixed effects model for continuous data can be defined as:

$$y_{ij} = f\left(t_{ij}, g(\theta, \eta_i, x_i, z_i)\right) + h\left(t_{ij}, g(\theta, \eta_i, x_i, z_i), \varepsilon_{ij}\right) \tag{1.1}$$

where $f(\cdot)$ is the function of the structural model, $h(\cdot)$ is the function describing the residual error model, $t_{ij}$ is the independent variable (e.g. observation time point) and $g(\cdot)$ is the vector function defining the $s$ individual parameters given the vector of typical value parameters $\theta$, the individual random effects $\eta_i$, the vector of discrete design variables $x_i$ (e.g. dose) and the vector of covariates $z_i$ (e.g. body weight). The individual parameters of the $i$th individual deviate from the population typical values $\theta$ with the random effects $\eta_i$ ($\eta_i \sim N(0, \boldsymbol{\Omega})$), where $\eta_i$ is a vector of size $s$ and $\boldsymbol{\Omega}$ is the $s \times s$ covariance matrix describing the correlations between the individual parameters. The diagonal elements of $\boldsymbol{\Omega}$ are also referred to as the BSV. In the residual error model, $\varepsilon_{ij}$ ($\varepsilon_{ij} \sim N(0, \boldsymbol{\Sigma})$) describes the deviation of the individual model prediction from the observed value, where $\varepsilon_{ij}$ is a vector (usually of size one or two) and $\boldsymbol{\Sigma}$ is the covariance matrix describing the correlation between the coordinates of the $\varepsilon_{ij}$-vector.

The general nonlinear mixed effects model for categorical data, describing the probability of observing the individual response $y_{ij}$, given the model parameters, can be defined as:

$$p\left(y_{ij} \mid t_{ij}, g(\theta, \eta_i, x_i, z_i)\right) \tag{1.2}$$

where $p(\cdot)$ is the probability (likelihood) function, and the other functions, variables and parameters are defined as before.

1.2.3 Maximum likelihood estimation

Maximum likelihood estimation is one of the most common methods used to estimate the parameters of a mathematical statistical model. The method estimates the parameters by finding the parameter values which maximize the likelihood of the observed data, i.e. the set of parameter values, of all possible sets of parameters, for which the observed data have the highest probability of occurring. Since the vectors $t_i$, $x_i$ and $z_i$ are constants in the maximization of the likelihood, they are suppressed in the equations in this subsection.

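The likelihood for the observed data (dependent variables) for individual $i$, given the population parameters ($\theta$, $\boldsymbol{\Sigma}$ and $\boldsymbol{\Omega}$), can be defined as [17]:

$$L_i(\theta, \boldsymbol{\Sigma}, \boldsymbol{\Omega} \mid y_i) = P(y_i \mid \theta, \boldsymbol{\Sigma}, \boldsymbol{\Omega}) = \int P(y_i \mid \eta_i, \theta, \boldsymbol{\Sigma}) \cdot P(\eta_i \mid \boldsymbol{\Omega}) \, d\eta_i \tag{1.3}$$

where $L_i$ is the $i$th individual's contribution to the likelihood, $P(y_i \mid \eta_i, \theta, \boldsymbol{\Sigma})$ is the conditional distribution of $y_i$ given the random effect $\eta_i$, $P(\eta_i \mid \boldsymbol{\Omega})$ is the 'prior' (marginal) distribution of $\eta_i$ and the product of the two is proportional to the 'posterior' distribution of $\eta_i$.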

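The conditional distribution of $\eta_i$ is then

$$P(\eta_i \mid \theta, \boldsymbol{\Sigma}, \boldsymbol{\Omega}, y_i) = \frac{P(y_i \mid \eta_i, \theta, \boldsymbol{\Sigma}) \cdot P(\eta_i \mid \boldsymbol{\Omega})}{P(y_i \mid \theta, \boldsymbol{\Sigma}, \boldsymbol{\Omega})} \propto P(y_i \mid \eta_i, \theta, \boldsymbol{\Sigma}) \cdot P(\eta_i \mid \boldsymbol{\Omega}) \tag{1.4}$$

and when $\theta$, $\boldsymbol{\Sigma}$ and $\boldsymbol{\Omega}$ are replaced with their maximum likelihood estimates, Equation 1.4 is called the empirical Bayes 'posterior' distribution of $\eta_i$ [17].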

The likelihood is the product of all individuals' contributions to the likelihood or, to simplify the calculations, the log-likelihood is the sum of the logarithms of all individuals' contributions to the likelihood,

$$\ln\{L(\theta, \boldsymbol{\Sigma}, \boldsymbol{\Omega})\} = \sum_{i=1}^{N} \ln\{L_i(\theta, \boldsymbol{\Sigma}, \boldsymbol{\Omega})\} \tag{1.5}$$

Since the nonlinear mixed effects models are nonlinear with respect to the random effects ($\eta_i$), there is usually no analytical solution to the integral in Equation 1.3. Therefore, a numerical optimization procedure has to be conducted to find the maximum of the likelihood. There are many algorithms available for numerical solution of the likelihood function. Gradient based algorithms (which maximize an approximated likelihood) and Expectation-Maximization (EM) algorithms are two groups of algorithms commonly used for maximum likelihood estimation in nonlinear mixed effects modelling. An alternative to maximum likelihood estimation is Bayesian hierarchical modelling [18].
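To make the integral in Equation 1.3 concrete, the sketch below evaluates an individual likelihood contribution by brute-force numerical integration over a scalar random effect, and sums the logarithms as in Equation 1.5. It is an illustration only: the function names are hypothetical, a single random effect and an additive normal residual error are assumed, and the prediction is taken to depend on the random effect but not on time. Grid integration of this kind is not one of the estimation algorithms discussed in the text.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def individual_likelihood(y_i, predict, sigma2, omega2, n_grid=2001, width=8.0):
    """Trapezoidal approximation of
    L_i = integral of P(y_i | eta) * P(eta) d eta
    for a scalar eta, over eta in [-width * sd, +width * sd]."""
    sd = math.sqrt(omega2)
    lower = -width * sd
    step = 2.0 * width * sd / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        eta = lower + k * step
        conditional = 1.0
        for y in y_i:  # observations are independent given eta
            conditional *= normal_pdf(y, predict(eta), sigma2)
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * conditional * normal_pdf(eta, 0.0, omega2)
    return total * step


def log_likelihood(data, predict, sigma2, omega2):
    """Sum of the individual log-likelihood contributions."""
    return sum(math.log(individual_likelihood(y_i, predict, sigma2, omega2))
               for y_i in data)


# Linear special case (prediction = 2 + eta), for which the integral is
# known in closed form: y ~ N(2, omega2 + sigma2).
linear = lambda eta: 2.0 + eta
print(individual_likelihood([2.5], linear, 0.25, 0.5), normal_pdf(2.5, 2.0, 0.75))

# Nonlinear case (prediction = 2 * exp(eta)): no closed form is available.
print(individual_likelihood([2.5], lambda eta: 2.0 * math.exp(eta), 0.25, 0.5))
```

For the linear case the numerical and closed-form values coincide, whereas the nonlinear case has to be evaluated numerically; this is the situation that motivates the approximations described next.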

The NONMEM® software [19] is the most frequently used software for nonlinear mixed effects modelling of pharmacological data. NONMEM was the modelling software used throughout this thesis and the following section gives an overview of some of the estimation algorithms available in NONMEM.

Gradient based algorithms

The gradient based estimation algorithms are not different algorithms per se; they all use the same quasi-Newton algorithm to maximize an approximated likelihood. The differences between the algorithms are in their approximations of the likelihood.

All gradient based algorithms use a second order Taylor series expansion around the random effects of the conditional distribution of $y_i$ (in Equation 1.3) to find a closed form solution of the (approximated) likelihood [17, 19]. The second order Taylor series expansion contains the second derivative of the conditional distribution of $y_i$ (the Hessian matrix, $\boldsymbol{\Delta}_i(\eta)$), which is often very complex and computationally expensive. The first order (FO) method and the first order conditional estimation (FOCE) method therefore approximate the Hessian matrix by a function of the gradient vector (the first derivative of the conditional distribution of $y_i$, $\Gamma(\eta)$) [17, 19]. For the FO method the Taylor series expansion is made around the centre of the 'prior' distributions of $\eta_i$ (the expected values of the 'priors' are zero) and for the FOCE method the expansion is made around the centre (mode) of the 'posterior' distribution of $\eta_i$ (the individual conditional estimates) [17, 19].

The LAPLACE method uses both the first and second derivative of the conditional distribution in the second order Taylor series expansion and the expansion is made around the mode of the 'posterior' distribution of $\eta_i$ [17, 19].

The 'prior' distributions of $\eta_i$ are defined in the model (Equations 1.1 and 1.2) as normally distributed with mean zero and variance $\boldsymbol{\Omega}$.

The LAPLACE algorithm gives the most precise approximation among the gradient based algorithms, and it is the only gradient based algorithm suitable for fitting categorical data models (Equation 1.2). The FOCE algorithm is the most frequently used algorithm for continuous data models (Equation 1.1), while FO, which has the advantage of being much faster than the other algorithms, is known to produce biased parameter estimates, especially for models with wide BSV and/or residual error [20, 21]. All gradient based algorithms perform best if the conditional distribution is approximately symmetric around its mode.
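The idea behind an expansion around the mode can be sketched with a generic textbook Laplace approximation for a scalar random effect: the log of the integrand in Equation 1.3 is replaced by a quadratic around its mode, which gives a closed-form integral. This is not NONMEM's implementation; the function names, the finite-difference derivatives and the plain Newton iteration are assumptions made for this example, and the iteration assumes a concave log-integrand near the mode.

```python
import math


def log_joint(eta, y_i, predict, sigma2, omega2):
    """ln P(y_i | eta) + ln P(eta) for a scalar random effect with an
    additive normal residual error."""
    value = -0.5 * (math.log(2.0 * math.pi * omega2) + eta * eta / omega2)
    for y in y_i:
        value -= 0.5 * (math.log(2.0 * math.pi * sigma2)
                        + (y - predict(eta)) ** 2 / sigma2)
    return value


def laplace_likelihood(y_i, predict, sigma2, omega2, h=1e-4, max_iter=50):
    """Laplace approximation of L_i: locate the mode of the log-integrand
    with Newton steps, then integrate the quadratic approximation."""
    def derivatives(eta):
        f0 = log_joint(eta, y_i, predict, sigma2, omega2)
        fp = log_joint(eta + h, y_i, predict, sigma2, omega2)
        fm = log_joint(eta - h, y_i, predict, sigma2, omega2)
        return (fp - fm) / (2.0 * h), (fp - 2.0 * f0 + fm) / (h * h)

    eta = 0.0  # start at the centre of the 'prior'
    for _ in range(max_iter):
        grad, hess = derivatives(eta)
        step = grad / hess
        eta -= step
        if abs(step) < 1e-9:
            break
    _, hess = derivatives(eta)
    mode_value = log_joint(eta, y_i, predict, sigma2, omega2)
    return math.exp(mode_value) * math.sqrt(2.0 * math.pi / -hess), eta


# Linear model: the integrand is exactly Gaussian, so the approximation is exact.
likelihood, mode = laplace_likelihood([2.5], lambda eta: 2.0 + eta, 0.25, 0.5)
print(likelihood, mode)
```

The approximation is exact when the 'posterior' is Gaussian and degrades as the 'posterior' becomes skewed, in line with the remark that these methods perform best for a conditional distribution that is approximately symmetric around its mode.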

EM algorithms

The EM algorithm was originally introduced to analyse 'incomplete data', i.e. data where part of the information is missing [22]. The individual random effects ($\eta_i$) are unobserved and can hence be treated as missing information, and the EM algorithm can be applied to maximize the likelihood. The algorithm consists of two steps: the expectation step (E-step) and the maximization step (M-step). During the E-step, the expected value of the likelihood is evaluated with respect to the conditional distribution of $\eta_i$ (Equation 1.4) given the observed data and the current estimates of the population parameters ($\theta$, $\boldsymbol{\Sigma}$ and $\boldsymbol{\Omega}$). During the M-step the population parameters are updated by maximizing the 'expected' likelihood from the E-step. The E-step followed by the M-step is repeated until the change in the population parameter estimates is negligible.
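The alternation between the two steps can be illustrated with a deliberately simple linear mixed model, y_ij = theta + eta_i + eps_ij, for which both steps have closed-form expressions. This is a sketch under that assumption, with hypothetical names and made-up data; a nonlinear model would require an approximated E-step.

```python
def em_random_intercept(data, theta=0.0, omega2=1.0, sigma2=1.0, n_iter=1000):
    """EM for y_ij = theta + eta_i + eps_ij with eta_i ~ N(0, omega2) and
    eps_ij ~ N(0, sigma2). The eta_i are treated as missing data."""
    n_total = sum(len(y_i) for y_i in data)
    for _ in range(n_iter):
        # E-step: mean and variance of each eta_i given the data and the
        # current population parameters (conditional distribution of eta_i).
        post_mean, post_var = [], []
        for y_i in data:
            v = 1.0 / (1.0 / omega2 + len(y_i) / sigma2)
            m = v * sum(y - theta for y in y_i) / sigma2
            post_mean.append(m)
            post_var.append(v)
        # M-step: update the population parameters by maximizing the
        # expected complete-data log-likelihood.
        theta = sum(y - m for y_i, m in zip(data, post_mean) for y in y_i) / n_total
        omega2 = sum(m * m + v for m, v in zip(post_mean, post_var)) / len(data)
        sigma2 = sum((y - theta - m) ** 2 + v
                     for y_i, m, v in zip(data, post_mean, post_var)
                     for y in y_i) / n_total
    return theta, omega2, sigma2


# Made-up balanced data: three individuals with three observations each.
example = [[0.5, 1.5, 1.0], [1.5, 2.5, 2.0], [2.5, 3.5, 3.0]]
print(em_random_intercept(example))
```

For balanced data such as this, the estimate of theta converges to the grand mean of the observations, which is the maximum likelihood estimate for this model.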

The expected value of the likelihood (E-step) cannot be evaluated in closed form due to the nonlinearity of the model. Four EM algorithms are available in NONMEM, and the main difference between the algorithms lies in their approximation of the E-step [19].

The iterative two stage (ITS) method estimates the individual random effects (Equation 1.4), given the observed data and the current estimates of the population parameters, using the same approximation as FOCE (or LAPLACE). During the M-step, the population parameters are updated using the estimated individual random effects [19, 23].

The E-step can also be approximated using Monte-Carlo integration or stochastic approximation [24, 25]. Both methods make use of a proposal density from which individual random effects are sampled (E-step), followed by maximization of the 'expected' likelihood (M-step). Two of the EM algorithms in NONMEM use Monte-Carlo integration to approximate the E-step: Monte-Carlo Importance Sampling EM (IMP) and Monte-Carlo Importance Sampling EM Assisted by Mode a Posteriori (IMPMAP). One EM algorithm uses stochastic approximation: Stochastic Approximation EM (SAEM) [19].

The IMP algorithm uses the mean and variance of the conditional distribution of $\eta_i$ (obtained from the previous M-step) to define the proposal density used in the Monte-Carlo integration, while the IMPMAP algorithm uses the mode and approximated variance of the conditional distribution of $\eta_i$ (obtained by using the same approximation as FOCE (or LAPLACE)) to define the proposal density. During the first iteration the algorithms are indistinguishable (even the IMP algorithm uses the FOCE (or LAPLACE) approximation) [19]. Both methods generate 300 Monte Carlo samples per individual (default value in NONMEM) in each iteration.
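The role of the proposal density can be illustrated with a generic importance sampling estimate of the integral in Equation 1.3. The sketch assumes a scalar random effect, a normal proposal density and an additive normal residual error; the function names are hypothetical and the code is not meant to mirror NONMEM's IMP or IMPMAP implementation. The default of 300 samples simply echoes the number quoted above.

```python
import math
import random


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def importance_sampling_likelihood(y_i, predict, sigma2, omega2,
                                   prop_mean, prop_var, n_samples=300, seed=1):
    """Monte-Carlo estimate of L_i: sample eta from a normal proposal density
    and average P(y_i | eta) * P(eta) / q(eta) over the samples."""
    rng = random.Random(seed)
    prop_sd = math.sqrt(prop_var)
    total = 0.0
    for _ in range(n_samples):
        eta = rng.gauss(prop_mean, prop_sd)
        conditional = 1.0
        for y in y_i:
            conditional *= normal_pdf(y, predict(eta), sigma2)
        total += conditional * normal_pdf(eta, 0.0, omega2) / normal_pdf(eta, prop_mean, prop_var)
    return total / n_samples


# Linear example with known 'posterior' N(1/3, 1/6): using it as proposal
# makes every importance weight equal to the exact likelihood.
estimate = importance_sampling_likelihood([2.5], lambda eta: 2.0 + eta,
                                          0.25, 0.5, 1.0 / 3.0, 1.0 / 6.0)
print(estimate, normal_pdf(2.5, 2.0, 0.75))
```

When the proposal matches the conditional distribution of the random effect exactly, as in this linear example, the estimate has zero Monte-Carlo variance; the closer the proposal is to that distribution, the fewer samples are needed, which is why the choice of proposal mean and variance matters.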

The SAEM algorithm consists of two phases: the burn-in phase and the accumulation phase. During the burn-in phase, the algorithm uses stochastic approximation with only a few samples per individual (the default value in NONMEM is two) in each iteration. When the population parameters are reasonably stable, the algorithm proceeds to the accumulation phase. During the accumulation phase, individual random effects are sampled using Markov Chain Monte-Carlo simulations and samples from previous iterations are averaged together (giving samples from previous iterations a greater importance) to update the population parameters [19, 26].

Bayesian hierarchical modelling algorithm

Bayesian hierarchical modelling algorithms are not maximum likelihood algorithms. Instead, they produce a great number of probable population parameters (default number in NONMEM is 10000) from which weighted averages of the parameters may be obtained. The likelihood evaluated by the Bayesian hierarchical modelling algorithms is extended to include user specified uncertainty distributions for the population parameters [18]. This extension is particularly appropriate when there is prior information on parameter estimates available from earlier studies [27].

The Bayesian hierarchical modelling algorithm available in NONMEM is a full Markov Chain Monte-Carlo Bayesian analysis method (BAYES). The BAYES algorithm consists of two phases: the burn-in phase (similar to SAEM) and the stationary distribution phase (during which the great number of probable population parameters are obtained) [19].

1.3 Missing data

Missing data is a frequently encountered problem in analyses of clinical data. There are many possible reasons why some of the data might be missing. These reasons are mathematically referred to as missing data mechanisms. In accordance with missing data theory, all missing data can be divided into three categories: missing completely at random (MCAR), missing at random (MAR) [28] and missing not at random (MNAR) [29]. When data are MCAR, the underlying missing data mechanism does not depend on any observed or unobserved data; when data are MAR, the underlying missing data mechanism depends on observed data but not on unobserved data; and when data are MNAR, the underlying missing data mechanism depends on the unobserved data itself.

The method chosen to handle missing data in a nonlinear mixed effects model will affect the outcome of the analysis (to a greater or lesser extent depending on the fraction of missing data and the importance of the data missing). It is therefore important that the method is chosen with great care.

Different types of data require different handling of missing data. It could be the dependent variable (i.e. drug concentration data and/or response data; from now on referred to as observation data) that is partly missing, or it could be covariate data that are missing.

1.3.1 Missing observations

Examples

In long-term studies some observations are usually censored due to individuals dropping out of the study prematurely [30]. However, studies covering only one treatment cycle can also suffer from missing observations. The most common type of missing observations within one treatment cycle is censoring of data due to measured concentrations being below the lower limit of quantification (BQL data). Another type of censoring is censoring due to discontinuation of the sampling schedule once a sample which fulfils certain concentration requirements has been measured. This type of censoring is common in high-dose methotrexate (chemotherapy) studies, where the TDM is usually discontinued once a concentration measurement below 0.2 μmol L−1 is observed [31–35].

Following the previously mentioned classification of missing data, the censored methotrexate concentrations are MAR (the underlying missing data mechanism depends on the concentration of the last observed sample and not on the concentrations of the missing samples) and the censored BQL data are MNAR (the underlying missing data mechanism depends on the concentrations of the missing samples themselves). An example of observation data being MCAR would be if one of the freezers used to store blood samples before the analytical assay broke and all the samples in that freezer were destroyed. The data would be MCAR as long as the reason why the samples were stored in that freezer had nothing to do with the concentration levels in the (not yet analysed) blood samples.
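The three mechanisms in these examples can be contrasted on one individual's concentration series. The sketch below is illustrative only; the function names, the use of `None` for a missing sample and the numerical values in the usage note are assumptions, not part of the analyses described here.

```python
import random


def random_censoring(concs, keep_prob, rng):
    """MCAR: each sample is lost with a probability unrelated to any concentration."""
    return [c if rng.random() < keep_prob else None for c in concs]


def tdm_censoring(concs, threshold):
    """MAR: sampling stops after the first observed concentration below the threshold.

    The sample that triggers the stop is itself observed, so missingness
    depends only on observed data.
    """
    observed = []
    stopped = False
    for c in concs:
        observed.append(None if stopped else c)
        if not stopped and c < threshold:
            stopped = True
    return observed


def bql_censoring(concs, lloq):
    """MNAR: a sample is missing because its own value is below the LLOQ."""
    return [c if c >= lloq else None for c in concs]
```

For a hypothetical series `[5.0, 1.2, 0.15, 0.05, 0.01]` with a threshold and LLOQ of 0.2, TDM censoring keeps the first three values (the 0.15 sample is observed and ends the sampling), whereas BQL censoring keeps only the first two.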

Modelling strategies

Let Y represent all observation data (i.e. all y_i) which were intended to be observed before initiation of data collection. Then Y_obs is the subset of Y which was actually observed during data collection and Y_miss is the subset of Y which was not observed (missing) during data collection (Y = (Y_obs, Y_miss)). Let R be a matrix of binary indicators of the same dimension as Y and let the elements of R take on a value of 0 when the corresponding element in Y is observed and 1 when the corresponding element in Y is missing. The part of R associated with individual i is denoted r_i. The elements of r_i can be treated as a set of random variables having a joint probability distribution, giving the probability of observing the observation data in y_i.

As previously described (Equation 1.3), a model for y_i is found by maximizing the likelihood of y_i given the model's population parameters (θ, Σ and Ω). To simplify the notation, the population parameters will be referred to as β in this subsection and t_i, x_i and z_i will be reintroduced in the notation as the joint vector v_i.

In the presence of missing observation data, the general model for proper inference about the population parameters in β has to be written as a joint distribution of y_i and r_i:

$$P(y_i, r_i \mid v_i, \beta, \phi) = P(y_i \mid v_i, \beta) \cdot P(r_i \mid y_i, v_i, \phi) \tag{1.6}$$

where φ are the parameters of the underlying missing data mechanism and P(r_i | y_i, v_i, φ) is the probability distribution of r_i given the complete observation data (both observed and missing), the joint vector of independent variables, discrete design variables and covariates, and the parameters of the underlying missing data mechanism [19, 29].

When the missing observation data are MCAR the probability distribution of r_i does not depend on the observed or missing values of y_i [28, 29]. However, it can be related to independent variables, discrete design variables or covariates, as long as those variables are included in the model:

$$P(r_i \mid y_i, v_i, \phi) = P(r_i \mid v_i, \phi), \quad \text{for all } y_i \tag{1.7}$$

This implies that valid inference about the population parameters in β is obtained using only the observed data in Y_obs [28, 29]. The underlying mechanism causing the observation data to be missing can be ignored in the estimation of the population parameters.

When the observation data are MAR the probability distribution of r_i is only dependent on the observed observation data and not on the missing data [28, 29]:

$$P(r_i \mid y_i, v_i, \phi) = P\big(r_i \mid (y_i)_{obs}, v_i, \phi\big), \quad \text{for all } (y_i)_{miss} \tag{1.8}$$

Valid inference about the parameters in β is then obtainable without a model for the probability distribution of r_i as long as the analysis method allows for correlations between the observed and the missing data. That is, if a proper estimation method is used (e.g. maximum likelihood estimation) the mechanism causing missing data can be ignored in the estimation of the model parameters [28, 29, 36].
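The step from the MAR condition to ignorability can be written out explicitly. The derivation below is a sketch in the notation of this subsection; it additionally assumes that β and φ are distinct parameters, which is not stated explicitly above. The observed-data likelihood is obtained by integrating the joint distribution over the missing observations, and under MAR the missingness term does not depend on the integration variable:

$$
\begin{aligned}
P\big((y_i)_{obs}, r_i \mid v_i, \beta, \phi\big)
 &= \int P(y_i \mid v_i, \beta)\, P(r_i \mid y_i, v_i, \phi)\, d(y_i)_{miss} \\
 &= P\big(r_i \mid (y_i)_{obs}, v_i, \phi\big) \int P(y_i \mid v_i, \beta)\, d(y_i)_{miss} \\
 &= P\big(r_i \mid (y_i)_{obs}, v_i, \phi\big) \cdot P\big((y_i)_{obs} \mid v_i, \beta\big)
\end{aligned}
$$

The first factor does not involve β, so maximizing the observed-data likelihood over β gives the same estimate whether or not that factor is included.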

When the observation data are MNAR, the probability distribution of r_i depends on all data in y_i (both (y_i)_obs and (y_i)_miss) and can therefore not be simplified [29]. That means that valid inference about the parameters in β is only obtainable using the joint distribution of y_i and r_i as given in Equation 1.6, i.e. the population model has to include an extra model for the missing data mechanism in order to get unbiased estimates of the model parameters [29].

When clinical data sets suffering from BQL censoring are analysed with nonlinear mixed effects modelling without proper handling of the censored data, the bias and imprecision of the model parameter estimates increase [37–40] and the type I error rate is elevated [41]. This means that there is an increased risk of selecting a more advanced structural model than needed, and even if the structural model is correct the parameter estimates are not reliable. By simultaneously modelling the observed observations using the model for continuous data (Equation 1.1) and the censored observations using the model for categorical data (Equation 1.2, evaluating the probability that the observation is BQL), unbiased and precise parameter estimates are obtainable [37, 38, 42].
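The idea of combining a continuous-data likelihood for observed samples with a probability of being BQL for censored samples can be sketched as follows. This is a minimal illustration assuming an additive, normally distributed residual error; it does not reproduce Equations 1.1 and 1.2, and the function names and record layout are hypothetical.

```python
import math


def normal_logpdf(x, mean, sd):
    """Log density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of N(mean, sd^2) at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def log_likelihood_with_bql(records, sd, lloq):
    """Sum the log-likelihood over (observed_value_or_None, model_prediction) pairs.

    A value of None marks a BQL sample. The sketch does not guard against a
    probability that underflows to zero for predictions far above the LLOQ.
    """
    total = 0.0
    for observed, predicted in records:
        if observed is None:
            # Censored sample: probability that the value lies below the LLOQ.
            total += math.log(normal_cdf(lloq, predicted, sd))
        else:
            # Observed sample: ordinary continuous-data contribution.
            total += normal_logpdf(observed, predicted, sd)
    return total
```

A censored sample whose prediction equals the LLOQ contributes log(0.5), since the value is then equally likely to fall on either side of the limit.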

Problems

According to missing data theory, the missing data mechanism can be ignored during parameter estimation when data are MCAR or MAR, as long as maximum likelihood estimation is used to fit the model. However, since the maximum likelihood algorithms available to fit a nonlinear mixed effects model rely on different approximations of the likelihood, it is unclear how the estimated model parameters and the selection of the correct structural model will be affected when some of the observation data are MCAR or MAR.


1.3.2 Missing covariates

The structural model of the PK-PD model is not affected by missing covariate data. However, since covariates can be used to explain some of the BSV in the model, the predictability and applicability of the model might be affected by missing covariates. If an important covariate is completely missing, i.e. there is no information about the covariate for any of the individuals in the data set, there is of course not much one can do to reduce the impact on the model. The following subsection will focus on methods to reduce the impact of partly missing covariates.

Examples

If individuals have been asked to fill out a form with questions about their sex and body weight and some of the forms are returned with the question about body weight unanswered, the missing weights are: MCAR if the reason they are missing is that some individuals did not see the question; MAR if the reason is that females in general were less willing to reveal their body weight than males (but the willingness was independent of the body weight itself); and MNAR if the reason is that obese individuals were less willing to reveal their body weight than normal-weight individuals.

Modelling strategies

Suggested modelling strategies for missing covariates can be divided into two categories: imputation methods and full maximum likelihood methods. In single imputation methods, the missing covariates are replaced by some values in order to obtain complete data sets. The data sets are then analysed as if the imputed values are the true values, without taking the uncertainty in the imputations into account. The most common types of single imputation are to replace the missing values with the mean (or median) of the variable (calculated using data from the individuals for whom the covariate was observed) or, if the covariate has been observed at some time points but is missing at others, to replace the missing values with the last observed value of the covariate (this method is often referred to as last observation carried forward).
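The two single imputation rules described above are simple enough to state in a few lines. The sketch below is illustrative; the function names and the use of `None` for a missing value are assumptions.

```python
def impute_mean(values):
    """Replace each missing value with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def impute_locf(values):
    """Last observation carried forward within one individual's time series.

    Missing values before the first observation stay missing, because there
    is nothing to carry forward.
    """
    filled = []
    last = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled
```

Both rules return a complete-looking data set, and neither records which values were imputed, which is exactly why the uncertainty in the imputations is lost in the subsequent analysis.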

Several covariates in the data set might be correlated and contain partly the same information [43]. It is therefore possible to establish relationships between the observed and the partly missing covariates and then base the imputations on these relationships [44, 45]. The relationships are established through regression (often linear regression) and the imputations can be conducted by replacing the missing values by the values obtained from the regression model, or by values simulated within a certain variance around the regression model (established by estimating the residual error in the regression model).
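A regression-based imputation of this kind might look like the sketch below, which uses a single fully observed covariate as predictor and ordinary least squares. The one-predictor setup and all names are assumptions made for illustration; at least three complete pairs are needed so that the residual standard deviation can be estimated.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    residual_sd = (rss / (n - 2)) ** 0.5
    return a, b, residual_sd


def impute_from_regression(x, y, rng=None):
    """Fill None entries of y from a regression of y on x.

    With rng=None the regression prediction is used. Otherwise a value is
    simulated around the prediction using the residual standard deviation.
    """
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    a, b, sd = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    filled = []
    for xi, yi in zip(x, y):
        if yi is None:
            pred = a + b * xi
            yi = pred if rng is None else rng.gauss(pred, sd)
        filled.append(yi)
    return filled
```

Passing a `random.Random` instance as `rng` gives the simulated variant; calling the function repeatedly with the same generator then yields several different completed data sets.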

Once a regression model and the associated variance have been established, it is possible to simulate more than one value to replace each missing data point. This method is called multiple imputations. Each completed data set is analysed separately, after which the results are combined into one set of estimates [46, 47]. By repeating the imputations many times and then combining the results, the uncertainty in the imputations is taken into account. A correctly implemented multiple imputations method is therefore considered to be one of the best and most accurate methods for handling missing data [36, 48].
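One common way to combine the results from the completed data sets is Rubin's rules, in which the pooled variance is the average within-imputation variance plus an inflated between-imputation variance. The text above does not state which combination rule is meant, so treating it as Rubin's rules is an assumption of this sketch, as are the function and variable names.

```python
def pool_estimates(estimates, variances):
    """Combine one parameter's estimates from m completed data sets.

    estimates: point estimates, one per completed data set
    variances: squared standard errors, one per completed data set
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    pooled = sum(estimates) / m
    within = sum(variances) / m
    between = sum((e - pooled) ** 2 for e in estimates) / (m - 1)
    total_variance = within + (1.0 + 1.0 / m) * between
    return pooled, total_variance
```

The between-imputation term is what a single imputation cannot provide: when the imputed values disagree across data sets, the estimates spread out and the pooled variance grows accordingly.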

Another alternative is to apply full maximum likelihood modelling to handle the missing covariates. This method also makes use of the established regression model and its residual error. However, instead of simulating values to replace the missing values, a full maximum likelihood estimation will find the most probable values of the missing covariate(s) given the observed data (including the observed dependent variable).

Problems

Even though better methods exist, simple versions of single imputation are still the methods most frequently used to handle missing covariates in nonlinear mixed effects modelling. Multiple imputations methods and methods based on full maximum likelihood modelling give less bias and higher precision in parameter estimates than simpler methods when handling missing data in linear fixed effect models [36, 48]. However, the performance of full maximum likelihood modelling and (multiple) imputation methods for handling missing covariate data in nonlinear mixed effects modelling has not been well studied [49, 50].

When applying a multiple imputations method, it is important to include all variables which can be predictive of the missing covariate or the underlying missing data mechanism in the regression model [51–54]. Information about the missing covariate data is available, not only in observed covariates, but also in the dependent variable (i.e. the concentration measurements and/or responses). Proper imputations should therefore be based on relationships established between all these variables. Since the model for the dependent variable is nonlinear with multiple hierarchies of random effects, it cannot be used directly in the imputation model. Wu and Wu suggest a multiple imputations method where the imputations are based on observed covariates and individual parameter estimates [49]. The individual parameter estimates contain information about the dependent variable [55, 56] and hence also about the missing covariates. With decreasing information on the individual level in the data set to be analysed, the individual parameter estimates will shrink towards the estimates of the typical value parameter (η-shrinkage) [57]. Shrunken individual estimates will affect the imputed values and it is therefore important to investigate how this will influence the final model.


2. Aims

The overall aim of this thesis was to develop methods for handling missing data in the context of nonlinear mixed effects models and to compare strategies for handling missing data in order to provide guidance for efficient handling and consequences of inappropriate handling of missing data.

The specific aims were:

• To understand the dynamics of missing data by analysing data collected during routine TDM of methotrexate in patients treated for osteosarcoma.

• To evaluate and compare the performance of different estimation algorithms, used for maximum likelihood estimation of nonlinear mixed effects models, when applied to fit different types of PD structural models to continuous and (different types of) categorical data.

• To evaluate how the population parameter estimates and the discrimination of correct structural model were affected by different patterns of missing observation data, and to investigate if there were any differences in these statistics when using different estimation algorithms to fit the nonlinear mixed effects models.

• To evaluate and compare the performance of full maximum likelihood modelling and (multiple) imputation methods, when applied to handle missing covariate data in nonlinear mixed effects modelling, and to evaluate the suggested multiple imputations method's sensitivity to η-shrinkage.


3. Methods

3.1 Data and models

In this section, the data sets and models used during the work with this thesis are presented. In Paper I, a clinical data set, with data collected during routine TDM in patients undergoing chemotherapy for osteosarcoma, was analysed. In Paper II, models developed to analyse different types of response data were used to simulate data sets in order to evaluate and compare different estimation algorithms for model analysis. In Papers III, IV and V, models were used to simulate data sets with different patterns of missing observations (Paper III) and different patterns of missing covariates (Papers IV and V) in order to evaluate and compare different strategies for model analysis when the data sets suffer from missing data.

3.1.1 Methotrexate

Data

After ethical approval by the Institutional Review Board at the University College London Hospital, data were collected (over an 18-month period) from patients being treated with high-dose methotrexate for osteosarcoma. A total of 46 individuals (30 males and 16 females) were recruited. The median dose (range) for all recruited patients was 11.9 (8–13) g m−2 and the doses were given as four-hour infusions. After each dose, the patients underwent daily TDM which continued until the methotrexate concentration had fallen below 0.2 μmol L−1. Blood samples from 301 methotrexate courses were collected (up to 12 courses per individual, median 6 courses) and a total of 943 plasma concentrations were available for development of the PK model. The majority of doses (89%) were followed by at least 3 plasma concentration samples, collected within 84 hours after dose. More samples were collected from individuals with a lower clearance (CL), and for some individuals up to 8 samples were collected up to 192 hours after dose (Figure 3.1 and Table 3.1).

Oral mucositis was graded according to the National Cancer Institute Common Toxicity Criteria with a scale from 0 (no mucositis) to 4 (life-threatening or disabling mucositis). Mucositis scores were recorded at the start and end of each methotrexate treatment period. Mucositis scores were collected from 28 of the individuals (18 males and 10 females) and a total of 384 mucositis scores from 195 methotrexate courses were available for development of the PK-PD model.


[Figure 3.1: scatter plot. x-axis: Time after dose [h], ticks at 24, 48, 72, 96, 120, 144, 168 and 192. y-axis: Concentration [μmol/L], ticks at 0.1, 1.0, 10.0 and 100.0.]

Figure 3.1. Methotrexate plasma concentrations collected and used in the development of the PK model. The grey dotted line shows the concentration of 0.2 μmol L−1. Once a concentration below this threshold was measured the subsequent samples were cancelled.

Table 3.1. Number of samples collected within each time interval after dose.

Time interval after dose [h]    No. samples collected (% of 301 MTX courses)
0–36                            299 (99.3%)
36–60                           299 (99.3%)
60–84                           268 (89.0%)
84–108                          59 (19.6%)
108–132                         10 (3.32%)
132–156                         4 (1.33%)
156–180                         3 (0.997%)
> 180                           1 (0.332%)

Total number of samples collected (% of all possible samples): 943 (39.2%)

Table 3.2 presents the mean and range of some covariates of the patients included in the study. Serum creatinine (SCr) measurements were missing for three subjects on between one and four occasions; in these cases, an earlier or later measure (whichever was nearest chronologically) was used to complete the data set (i.e. 'last observation carried forward' and 'first observation carried backward', respectively). A data set containing both observed methotrexate concentrations, mucositis scores and demographic statistics and clinical characteristics was created.

Table 3.2. Covariates of the individuals included in the high-dose methotrexate (MTX) study.

Covariate                     All patients         Patients with mucositis scores
                              [Mean (Range)]       [Mean (Range)]
Age [yr]                      19.3 (4–51)          23.6 (8–51)
Body weight [kg]              60.9 (17.4–107)      68.7 (26.3–107)
Height [cm]                   161 (105–195)        170 (127–195)
Body surface area [m2]        1.63 (0.72–2.39)     1.78 (0.96–2.39)
BMI [kg m−2]                  22.6 (12.8–35.7)     23.7 (14.2–35.7)
SCr [μmol L−1]                59 (18–104)          67 (34–104)
Dose MTX [g m−2]              11.9 (8–13)          11.8 (8–13)
Occasions [n]                 6.5 (1–12)           7.0 (2–12)
Samples per individual [n]    20.5 (3–38)          21.9 (6–38)

Pharmacokinetic model

PK structural models with one, two, and three compartments were investigated. Linear and nonlinear (Michaelis-Menten) models for CL (drug elimination) were investigated. A log-normal distribution of individual parameters was assumed by modelling the BSV as a log-normal random effect. BSV was tested for all parameters, both in diagonal and various block combinations, estimating correlations between η values. Additive, proportional, and combined residual error models were investigated.

Once the structural model was developed, CL was divided into two physiological components: one filtration part and one secretion/metabolism part [58, 59]. The typical value and variance of the filtration component were essentially fixed to the expected values from the literature. This was done using a relationship derived by Rhodin et al. [60] (Equation 3.1), multiplied by the methotrexate fraction unbound (fu) that was assumed to be 0.4 [61] (Equation 3.2). Therefore, the filtration element contained no estimated parameters but allowed for BSV using the fixed variance from Rhodin et al. [60].

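$$\mathrm{GFR}_i = 6.72 \cdot \left(\frac{\mathrm{WT}_i}{70}\right)^{0.632} \cdot e^{\eta_{\mathrm{GFR}_i}} \tag{3.1}$$

$$\mathrm{CL}_{\mathrm{filt},i} = f_u \cdot \mathrm{GFR}_i \tag{3.2}$$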

where GFR_i is an individual's expected glomerular filtration rate (GFR) estimated from the individual's body weight (WT_i), the expected typical GFR for a 70 kg subject (6.72 L h−1) and the allometric exponent estimated by Rhodin et al. [60]. The BSV in GFR (η_GFR) was fixed to a variance of 0.22, reported by Rhodin et al. [60]. The secretion/metabolism component was estimated according to Equation 3.3.

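$$\mathrm{CL}_{\mathrm{sec/met},i} = \theta_{\mathrm{CL}} \cdot \left(\frac{\mathrm{WT}_i}{70}\right)^{\theta_{\mathrm{WT,CL}}} \cdot e^{\eta_{\mathrm{sec/met}_i}} \tag{3.3}$$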

where CL_sec/met,i is the individual estimated secretion/metabolism component of CL calculated from the estimated typical value (θ_CL), estimated allometric exponent (θ_WT,CL), and BSV (η_sec/met,i).
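The two clearance components can be evaluated numerically as in the sketch below. The constants 6.72, 0.632 and 0.4 are taken from Equations 3.1 and 3.2 and the accompanying text; the function names are hypothetical, and any values passed for θ_CL and θ_WT,CL are placeholders rather than estimates from this work.

```python
import math

GFR_TYPICAL_70KG = 6.72   # L/h, typical GFR for a 70 kg subject (Equation 3.1)
GFR_EXPONENT = 0.632      # allometric exponent (Equation 3.1)
FRACTION_UNBOUND = 0.4    # methotrexate fraction unbound, fu (Equation 3.2)


def gfr(weight_kg, eta_gfr=0.0):
    """Individual expected glomerular filtration rate [L/h], Equation 3.1."""
    return GFR_TYPICAL_70KG * (weight_kg / 70.0) ** GFR_EXPONENT * math.exp(eta_gfr)


def cl_filtration(weight_kg, eta_gfr=0.0):
    """Filtration component of clearance [L/h], Equation 3.2."""
    return FRACTION_UNBOUND * gfr(weight_kg, eta_gfr)


def cl_secretion_metabolism(weight_kg, theta_cl, theta_wt_cl, eta=0.0):
    """Secretion/metabolism component of clearance, Equation 3.3.

    theta_cl and theta_wt_cl are estimated in the model; callers must supply them.
    """
    return theta_cl * (weight_kg / 70.0) ** theta_wt_cl * math.exp(eta)
```

For a typical 70 kg individual (η = 0) the filtration component is 0.4 · 6.72 = 2.688 L h−1, independent of any estimated parameter.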

The influence of deviations in serum creatinine (SCr), age, and sex was further included by scaling overall CL with the expected sex- and age-adjusted normal SCr according to the relationships derived by Ceriotti et al. [62] and the normal SCr values for adults reported by Junge et al. [63], as shown in Equations 3.4–3.9. Equations 3.6 and 3.7 represent a linear extrapolation between the oldest paediatric group predicted by Ceriotti et al. [62] and the adult male and female values [63] (Equations 3.8 and 3.9).

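$$\mathrm{CL}_{\mathrm{tot},i} = \left(\mathrm{CL}_{\mathrm{filt},i} + \mathrm{CL}_{\mathrm{sec/met},i}\right) \cdot \left(\frac{\mathrm{SCr}_{\mathrm{Mean}}}{\mathrm{SCr}_i}\right)^{\theta_{\mathrm{SCr,CL}}} \tag{3.4}$$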

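For age ≤ 15 years:

$$\mathrm{SCr}_{\mathrm{Mean}} = -2.37330 - 12.91367 \cdot \ln(\mathrm{age}) + 23.93581 \cdot \sqrt{\mathrm{age}} \tag{3.5}$$

For 15 years < age ≤ 17 years:

Boys:

$$\mathrm{SCr}_{\mathrm{Mean}} = 9.5471 \cdot \mathrm{age} - 87.847 \tag{3.6}$$

Girls:

$$\mathrm{SCr}_{\mathrm{Mean}} = 4.7137 \cdot \mathrm{age} - 15.347 \tag{3.7}$$

For age ≥ 17 years:

Men:

$$\mathrm{SCr}_{\mathrm{Mean}} = 84.0 \tag{3.8}$$

Women:

$$\mathrm{SCr}_{\mathrm{Mean}} = 69.5 \tag{3.9}$$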
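The piecewise definition in Equations 3.5–3.9 and the scaling in Equation 3.4 can be sketched as a pair of functions. The age ranges as written overlap at exactly 17 years; the sketch assigns that age to the linear segment, which is an assumption. Numerically, the linear segments coincide with the paediatric curve at 15 years and reach the adult values 84.0 and 69.5 at 18 years. Function names and the 'M'/'F' coding are hypothetical.

```python
import math


def scr_mean(age_years, sex):
    """Expected normal serum creatinine [umol/L]; sex is 'M' or 'F'."""
    if age_years <= 15:
        # Equation 3.5, the same for both sexes.
        return (-2.37330 - 12.91367 * math.log(age_years)
                + 23.93581 * math.sqrt(age_years))
    if age_years <= 17:
        # Equations 3.6 and 3.7; age 17 is placed here by assumption.
        if sex == "M":
            return 9.5471 * age_years - 87.847
        return 4.7137 * age_years - 15.347
    # Equations 3.8 and 3.9, adult values.
    return 84.0 if sex == "M" else 69.5


def cl_total(cl_filt, cl_sec_met, scr_mean_value, scr_observed, theta_scr_cl):
    """Total clearance scaled by relative serum creatinine, Equation 3.4."""
    return (cl_filt + cl_sec_met) * (scr_mean_value / scr_observed) ** theta_scr_cl
```

An individual whose observed SCr equals the expected normal value has a scaling factor of 1, so total CL is simply the sum of the two components regardless of θ_SCr,CL.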

The volumes of distribution (V1 and V2) and the inter-compartmental clearance (Q) were estimated according to Equation 3.10:

$$P_i = \theta_P \cdot \left(\frac{\mathrm{WT}_i}{70}\right)^{\theta_{\mathrm{WT},P}} \cdot e^{\eta_{P_i}} \tag{3.10}$$

where θ_WT,P is the estimated allometric covariate scaling parameter for parameter P (i.e. V1, V2 or Q). Substitution of body weight (WT) with fat-free mass (FFM) [64] in these equations and models with body mass index (BMI) included as an additional covariate were also investigated. The influence of BOV [13] was investigated on CL, central volume of distribution (V1), and on both parameters combined, with an occasion being treated either as a single dose or a two-week block encompassing two doses of methotrexate given at the end of each treatment cycle in accordance with the treatment protocol.

Pharmacodynamic model

Mucositis scores (ordered categorical data) were modelled as a 5-point ordered logistic regression model [11] describing the probability of each score in relation to the individual PK parameters. To investigate whether mucositis score was driven mainly by the drug concentration, drug exposure, or a mixture of these, the individual PK was linked to the probability of mucositis scores via an effect compartment model (Equation 3.11).

$$\frac{dC_{E,i}}{dt} = k_{1e} \cdot C_{P,i} - k_{e0} \cdot C_{E,i} \tag{3.11}$$

where C_E,i is the concentration at the effect site for individual i, C_P,i is the concentration in plasma for individual i, k_1e is the rate constant describing the relationship between the drug concentration in the central (plasma) compartment and the hypothetical effect compartment, and k_e0 is the rate constant describing the loss of drug concentration from the hypothetical effect compartment. k_1e was fixed to 1, assuming that the concentration in plasma will have an immediate effect on the probabilities of mucositis scores, and k_e0 was estimated. Using this method, if the parameter k_e0 is large (i.e., tends toward 1), the effect is driven by the individual drug concentration, and conversely if k_e0 tends toward zero, individual cumulative drug exposure (AUC_i) drives the effect. Both linear and Emax models were investigated for the influence of individual methotrexate PK parameters on mucositis score probability. BSV was tested on both the baseline and slope, and BOV was tested additionally on baseline, with an occasion being treated as a single dose or as a two-week block with two consecutive doses.
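The two limiting behaviours of the effect compartment can be checked numerically. The sketch below integrates Equation 3.11 with explicit Euler steps for a hypothetical mono-exponential plasma profile; the profile, its parameter values, the step size and the function names are all assumptions chosen for illustration and are unrelated to the methotrexate model itself.

```python
import math


def effect_site_concentration(cp, ke0, t_end, dt=0.001, k1e=1.0):
    """Integrate dCE/dt = k1e*CP(t) - ke0*CE with explicit Euler steps, CE(0) = 0.

    cp is a function returning the plasma concentration at time t.
    """
    ce = 0.0
    n_steps = int(round(t_end / dt))
    for step in range(n_steps):
        t = step * dt
        ce += dt * (k1e * cp(t) - ke0 * ce)
    return ce


def plasma_conc(t, c0=10.0, k=0.1):
    """Hypothetical mono-exponential plasma profile (not the thesis model)."""
    return c0 * math.exp(-k * t)
```

With `ke0 = 0.0` the effect-site concentration at 24 h approximates the cumulative exposure, c0/k · (1 − e^(−k·24)), whereas with `ke0 = 1.0` it follows the declining plasma concentration closely (for this profile, C_E approaches C_P/(ke0 − k) once the initial transient has passed).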

3.1.2 Response data models

Simulated data sets

The data consisted of responses modelled with continuous, binary, ordered categorical, RTTE and count PD models. The models were used to simulate new data sets.

Original data sets

The data set with continuous response data contained data from 95 individuals. The individuals were given moxonidine, and information on noradrenaline concentrations was available on three occasions for each individual, with five samples after each dose.


The other four data sets investigated (binary, ordered categorical, RTTE and count data) contained data from a (simulated) clinical trial evaluating the inhibition of reflux events following the administration of a new drug. In each case, the study design included 48 subjects divided evenly into four dose groups (0, 10, 50 and 200 dose units). All data sets, except the data set analysed by the RTTE data model, contained observations (responses or events) recorded during hourly intervals (eight observations per individual), whereas the data set analysed by the RTTE data model contained observations (events) recorded during one-minute intervals (480 observations per individual). The observations in the data sets analysed by the binary, RTTE and count data models were dichotomized as event/no event, whereas the observations (responses) in the data sets analysed by the ordered categorical data model were rated as no event (0), mild (1), moderate (2) or severe (3).

Population models

The model for continuous data characterized how moxonidine influences the levels of noradrenaline and was an adapted version of the model presented by Brynne et al. [65]. The noradrenaline levels were modelled with a direct inhibitory Emax (Imax) model related to a given concentration-time profile of moxonidine. The model consisted of five typical value parameters and four variance parameters, of which one was describing the BOV, two the BSV, and one the residual error. NONMEM code including parameter values and estimation settings can be found in the appendix of Paper II.

The other four models investigated all contained a time-constant baseline model with a variance parameter describing the BSV. In the models for binary, RTTE and count data the drug effect was incorporated as a typical value Imax model, where the drug effect could entirely inhibit non-zero responses (binary) or events (RTTE and count). In the model for ordered categorical data the drug effect was incorporated as a linear model with a variance parameter describing the BSV. The binary data model (logistic regression model) consisted of two typical value parameters and one variance parameter (BSV), the ordered categorical data model (ordered logistic regression model) [11] consisted of four typical value parameters and two variance parameters (BSV, which were modelled with correlation), the RTTE data model (cumulative hazard model) consisted of two typical value parameters and one variance parameter (BSV), and the count data model (Poisson model) [12] consisted of two typical value parameters and one variance parameter (BSV). NONMEM code including parameter values and estimation settings can be found in the appendix of Paper II.

Estimation models

The population models fitted to the simulated data sets were the same models as the ones used to simulate the data. Two sets of estimations were performed: one where the initial estimates of the parameters were the same as the true values (i.e. the values used in the simulations), and one where the initial estimates of the typical value parameters were randomly sampled from uniform distributions [θ, 2·θ] (where θ is the true value of the typical value parameter) and the initial estimates of the variance parameters were randomly sampled from a Wishart density with 20 degrees of freedom centred at the true values.
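The sampling of perturbed initial estimates can be sketched for a single typical value and a single variance parameter. For a 1×1 matrix, a Wishart draw with ν degrees of freedom reduces to a scaled chi-square (gamma) variate. Reading 'centred at the true values' as 'with mean equal to the true value' is an assumption of this sketch, as are the function names; correlated variance blocks would need a matrix-valued Wishart draw, which is not shown.

```python
import random


def sample_initial_typical_value(theta_true, rng):
    """Draw an initial estimate uniformly from [theta, 2*theta]."""
    return rng.uniform(theta_true, 2.0 * theta_true)


def sample_initial_variance(omega2_true, rng, df=20):
    """Scalar Wishart draw with mean omega2_true.

    For a 1x1 matrix, Wishart(df, scale) equals scale * chi-square(df), i.e. a
    gamma variate with shape df/2 and scale 2*scale. Choosing
    scale = omega2_true/df makes the mean equal to omega2_true.
    """
    scale = omega2_true / df
    return rng.gammavariate(df / 2.0, 2.0 * scale)
```

With 20 degrees of freedom the relative standard deviation of the sampled variance is sqrt(2/20), about 32%, so the initial estimates are perturbed noticeably but stay on the right order of magnitude.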

3.1.3 Missing observationsSimulated data sets

Population model

A population PK model with an intravenous bolus dose and a one-compartment disposition model was used to simulate all data sets (Equation 3.12):

$$\ln(C_{ij}) = \ln\left(\frac{Dose}{V_i}\right) - \frac{CL_i}{V_i} \cdot t_{ij} + \varepsilon_{ij} \qquad (3.12)$$

where $C_i$ is the concentration for individual i, $CL_i$ is the drug clearance for individual i, $V_i$ is the volume of distribution of the central compartment for individual i, $t_{ij}$ is the time point for the jth sample from the ith individual, $\varepsilon_{ij}$ describes how the observed concentration at the jth time point for the ith individual ($C_{ij}$) deviates from $C_i$, and $\varepsilon_{ij} \sim N(0, \sigma^2)$ where σ is the standard deviation of the residual error. To increase numerical stability the logarithm of the concentrations was simulated.

The individual parameters were log-normally distributed around the typical parameter value according to Equation 3.13:

$$P_i = \theta_P \cdot e^{\eta_{P_i}} \qquad (3.13)$$

where $P_i$ is the ith individual's value of model parameter P (e.g. CL), $\theta_P$ is the typical value (population median) of parameter P, $\eta_{P_i}$ describes the ith individual's deviation from the typical value and $\eta_{P_i} \sim N(0, \omega_P^2)$ where $\omega_P$ is the standard deviation of the BSV of parameter P. The typical value of CL was 10 [L h−1] and the typical value of V was 50 [L]. The individual parameters ($CL_i$ and $V_i$) were simulated with a BSV of 30% and the residual error of the population model was set to 15%.
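As a rough illustration of Equations 3.12 and 3.13, the simulation of one individual's log-concentrations could be sketched as below. This is not the NONMEM code used in the thesis. The dose of 100 units is a hypothetical value (the text does not state the dose), and the 30% BSV and 15% residual error are interpreted here as standard deviations of 0.3 and 0.15 on the log scale.

```python
import math
import random


def simulate_individual(rng, dose=100.0, theta_cl=10.0, theta_v=50.0,
                        omega=0.3, sigma=0.15,
                        times=(4, 8, 12, 16, 20, 24)):
    """Simulate ln(C_ij) for one individual after an intravenous bolus dose.

    dose is a hypothetical value; omega and sigma are assumed to be
    standard deviations on the log scale.
    """
    # Equation 3.13: individual parameters are log-normally distributed.
    cl = theta_cl * math.exp(rng.gauss(0.0, omega))
    v = theta_v * math.exp(rng.gauss(0.0, omega))
    # Equation 3.12: log-concentration plus additive residual error.
    return [math.log(dose / v) - cl / v * t + rng.gauss(0.0, sigma)
            for t in times]


rng = random.Random(2014)
log_conc = simulate_individual(rng)
```

Setting `omega=0.0` and `sigma=0.0` returns the typical log-concentration profile, which is a convenient check of the structural model.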

Censored observations

Each data set contained data from 400 individuals and there were up to six concentration measurements sampled from each individual (4, 8, 12, 16, 20 and 24 hours after dose). Three missing data mechanisms were simulated: random censoring (MCAR), TDM censoring (MAR) and BQL censoring (MNAR). The underlying missing data mechanisms were adjusted to give an equal number of observed concentration measurements. This was done by simulation of data sets with 10,000 individuals (Table 3.3). For the random censoring, the probability of a sample being observed at a specific time point or not was simulated according to the percentages derived from the TDM censoring column in Table 3.3. Once an individual got a sample censored, the samples at later time points were also censored (first panel in Figure 3.2). For the TDM censoring, all succeeding samples were censored as soon as a sample less than 0.5 was observed (second panel in Figure 3.2). For the BQL censoring, all samples less than 0.2375 were censored (note that this means that an individual might have one sample censored but still have another sample observed at a later time point) (third panel in Figure 3.2).

Table 3.3. Number of concentration measurements observed at different time points after dose for the investigated missing data mechanisms after simulation of a data set with 10,000 individuals.

| Time after dose [h] | TDM censoring (MAR), n (% of 10,000) | BQL censoring (MNAR), n (% of 10,000) |
|---|---|---|
| 4 | 10,000 (100%) | 9,974 (99.7%) |
| 8 | 9,411 (94.1%) | 7,787 (77.9%) |
| 12 | 2,946 (29.5%) | 3,498 (35.0%) |
| 16 | 398 (3.98%) | 1,140 (11.4%) |
| 20 | 30 (0.300%) | 307 (3.07%) |
| 24 | 4 (0.0400%) | 84 (0.840%) |
| Total number of concentration measurements (% of all) | 22,789 (38.0%) | 22,790 (38.0%) |
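The three censoring rules can be sketched as follows. This is an illustrative sketch only: concentrations are taken on the original (not log) scale, censored samples are marked as `None`, and the random censoring is implemented with conditional probabilities computed from the TDM counts in Table 3.3, which is one interpretation of "according to the percentages derived from the TDM censoring column". The function names are hypothetical.

```python
import random

# Observed counts per time point under TDM censoring (Table 3.3).
TDM_COUNTS = [10000, 9411, 2946, 398, 30, 4]


def random_censor(conc, rng, counts=TDM_COUNTS):
    """MCAR: monotone dropout whose marginal rates follow the TDM column."""
    observed, keep, prev = [], True, counts[0]
    for c, n in zip(conc, counts):
        # Probability of still being observed, given observed at the previous time.
        keep = keep and rng.random() < n / prev
        observed.append(c if keep else None)
        prev = n
    return observed


def tdm_censor(conc, threshold=0.5):
    """MAR: censor every sample after the first observed value below threshold."""
    observed, stopped = [], False
    for c in conc:
        observed.append(None if stopped else c)
        if c < threshold:
            stopped = True
    return observed


def bql_censor(conc, lloq=0.2375):
    """MNAR: censor each sample below the quantification limit, independently."""
    return [c if c >= lloq else None for c in conc]
```

Note that `tdm_censor` keeps the first sample below 0.5, since that sample was observed and triggered the stop, whereas `bql_censor` can leave an observed sample after a censored one.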

Estimation models

Evaluation of model fit

The population model fitted to the simulated PK data was the same one-compartment model as the one used to simulate the data. Two sets of estimations were performed: one where the initial estimates of the parameters were the same as the true values (i.e. the values used in the simulations), and one where the initial estimates of the parameters were randomly sampled from uniform distributions [0.5·P, 2·P], where P is the true value of the population parameter (θ, ω² or σ²).

Evaluation of structural model discrimination

To evaluate how the different patterns of censored observations affected the structural model discrimination, an additional population model with an intravenous bolus dose and a two-compartment disposition model was fitted to the simulated data sets. Two versions of the two-compartment model were investigated: one where only two typical value parameters (volume of distribution of the peripheral compartment (V2) and inter-compartmental clearance (Q)) were added to the model, and one where two typical value parameters (V2 and Q) and two variance parameters ($\omega^2_{V2}$ and $\omega^2_{Q}$) were added to the model. Initial estimates were set to: CL = 10 [L h−1], V1 = 40 [L], Q = 2 [L h−1], V2 = 10 [L], and when applicable $\omega^2_{V2}$ = 0.09 and $\omega^2_{Q}$ = 0.09 (30% BSV).

Figure 3.2. Simulated concentration-time profiles for the different missing data mechanisms. The grey dotted line in the second panel (MAR) shows the concentration of 0.5 units, and once a concentration below this threshold was measured the subsequent samples were censored. The grey dotted line in the third panel (MNAR) shows the lower limit of quantification of 0.2375 units, and all concentrations below this threshold were censored.

3.1.4 Missing covariates

Simulated data sets

Population model

A population PK model with constant infusion at steady state was used for the (estimations and) simulations (Equation 3.14):

$$\ln(Css_{ij}) = \ln\left(\frac{R_0}{CL_i}\right) + \varepsilon_{ij} \qquad (3.14)$$

where $Css_i$ is the steady state concentration for individual i, $R_0$ is the infusion rate, $CL_i$ is the drug clearance for individual i, $\varepsilon_{ij}$ describes how the observed concentration in the jth sample of the ith individual ($Css_{ij}$) deviates from $Css_i$, and $\varepsilon_{ij} \sim N(0, \sigma^2)$ where σ is the standard deviation of the residual error. To avoid simulation of negative concentrations the logarithm of the steady state concentrations was simulated.

The individual CL values were log-normally distributed around the typical value of CL (Equation 3.15):

$$CL_i = \theta \cdot e^{\eta_i} \qquad (3.15)$$

where θ is the typical value (population value) of CL, $\eta_i$ describes how the ith individual's value of CL deviates from the typical value of CL and $\eta_i \sim N(0, \omega^2)$ where ω is the standard deviation of the variability between individuals (i.e. BSV).

The relative difference in the typical value of CL between males and females was 50%, where females were assigned a lower CL than males (binary covariate, simulated and estimated as two typical value parameters, θ = 1 for females and θ = 2 for males). The individual CLs were simulated with a BSV of 30%. The residual error of the population model was set to 20%. NONMEM code for the population model can be found in the appendix of Paper V.

Missing covariates

Each data set consisted of data for 200 individuals; 60% of the individuals were randomly assigned to be males and 40% females. Two concentration measurements were simulated for each individual. Weights were simulated from two truncated log-normal distributions with sex-specific medians and variances (lnN(85.1, 0.0329) for males and lnN(73.0, 0.0442) for females) which had been estimated using a large data set with 1022 males and 423 females [66]. Three missing data mechanisms were simulated: MCAR, MAR and MNAR. For each mechanism, 50% of the individuals were assigned to lack information about the covariate sex. For MCAR, all individuals had the same probability of missing sex; for MAR, the underlying mechanism gave a higher probability of missing sex with increasing weight (27% probability of missing sex for a person weighing 40 kg and 83% probability of missing sex for a person weighing 145 kg); for MNAR, the underlying mechanism gave a three times higher probability of missing sex for males than females. The proportion of males in the data sets for whom sex was observed was then approximately 60% when data were MCAR, 56% when data were MAR and 37% when data were MNAR.
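The probabilities behind the three mechanisms can be worked out in a few lines. The sketch below is illustrative and makes two assumptions not stated in the text: the MNAR probabilities are solved so that exactly 50% of a 60/40 male/female population lack sex, and the MAR curve is taken to be logistic in weight, passing through the two quoted points (27% at 40 kg and 83% at 145 kg). The function names are hypothetical.

```python
import math


def p_missing_mcar(fraction_missing=0.5):
    """MCAR: the same probability of missing sex for everyone."""
    return fraction_missing


def p_missing_mnar(frac_male=0.6, fraction_missing=0.5, ratio=3.0):
    """MNAR: males are `ratio` times more likely to lack sex than females.

    Solves frac_male * ratio * p + (1 - frac_male) * p = fraction_missing.
    Returns (p_male, p_female).
    """
    p_female = fraction_missing / (frac_male * ratio + (1.0 - frac_male))
    return ratio * p_female, p_female


def make_mar_curve(w1=40.0, p1=0.27, w2=145.0, p2=0.83):
    """MAR: assumed logistic curve in weight through two quoted points."""
    def logit(p):
        return math.log(p / (1.0 - p))
    slope = (logit(p2) - logit(p1)) / (w2 - w1)
    intercept = logit(p1) - slope * w1
    return lambda weight: 1.0 / (1.0 + math.exp(-(intercept + slope * weight)))


p_male, p_female = p_missing_mnar()
# Share of males among individuals whose sex is observed under MNAR.
share_male_observed = 0.6 * (1 - p_male) / (0.6 * (1 - p_male) + 0.4 * (1 - p_female))
```

Under these assumptions the share of males among individuals with observed sex is about 38%, in line with the approximately 37% reported for MNAR.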

Estimation models

Six different methods for handling missing covariates were compared: complete case scenario (CC), single imputation of mode (SImode), single imputation based on weight (SIWT), multiple imputation based on weight and individual response (i.e. Cssi) (MI), full maximum likelihood modelling with informative priors based on weight (MOD) and full maximum likelihood modelling where the proportion of males (and females) among the individuals lacking information about sex was estimated as an extra parameter in the model (EST). For comparison purposes, estimation with all data (ALL) was also carried out. Implementation of all methods except CC and SImode required estimation of additional models for logistic regression, and MI also required additional simulations. NONMEM code showing the implementation of the different methods is available in the appendix of Paper V.

CC

All individuals lacking the covariate sex were excluded from the analysis, i.e. 50% of the data were discarded.

SImode

The mode of the covariate, i.e. the most frequently occurring category among the individuals for whom the covariate was observed, was imputed for all individuals lacking the covariate.

SIWT

A model was created to describe the probability of being male given the observed weight (P(male | weight)). The model was estimated as a logistic regression among the individuals for whom both covariates were observed. The model was used together with the observed weights to predict the probability of being male for each individual. For all individuals for whom the information about sex was missing the covariate was imputed based on the individual probability prediction, i.e. a probability prediction greater than or equal to 0.5 was imputed as 'male', otherwise 'female'.

37

MI

The multiple imputation method presented by Wu and Wu [49] was implemented in NONMEM as described in Paper IV. The PK model, without inclusion of any covariate (i.e. estimation of one typical value parameter instead of two), was fitted to the data to get individual estimates of CL (CLi) for all individuals. CLi contains information about the dependent variable (Cssi) [55, 56] and can hence be used to create a model which describes the probability of being male given the observed weight and the dependent variable (P(male | weight, CLi)). The probability model was estimated as a logistic regression among the individuals for whom the covariate was observed. For all individuals for whom information about sex was missing the covariate was imputed (simulated) based on the logistic regression model, the individuals' observed weight and their individual estimate of CL. The imputation step followed by an estimation of the imputed data set was repeated six times. The six sets of population parameter estimates were combined into one set by calculating the average of the six point estimates of each typical value and variance parameter in the population model [47, 53].
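The imputation draw and the pooling step of the MI method can be sketched as below. This is only an illustration of the two steps described above, not the NONMEM implementation of Paper IV: the logistic regression coefficients passed as `coef` and the parameter names in the example are hypothetical, and the estimation of the PK model on each imputed data set is left out.

```python
import math
import random
from statistics import mean


def impute_sex(weight, cl_i, coef, rng):
    """Draw sex from P(male | weight, CL_i) given logistic coefficients.

    coef = (intercept, slope for weight, slope for CL_i); hypothetical values.
    """
    b0, b_wt, b_cl = coef
    p_male = 1.0 / (1.0 + math.exp(-(b0 + b_wt * weight + b_cl * cl_i)))
    return "male" if rng.random() < p_male else "female"


def pool(estimates):
    """Average the point estimates of each parameter across imputations."""
    return {name: mean(e[name] for e in estimates) for name in estimates[0]}


# Six illustrative (made-up) sets of estimates, one per imputed data set.
fits = [{"theta_female": 1.0 + 0.01 * k, "theta_male": 2.0 - 0.01 * k}
        for k in range(6)]
pooled = pool(fits)
```

Unlike SIWT, which assigns the more probable category deterministically, the draw in `impute_sex` is random, so repeated imputations of the same individual can differ.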

MOD

A model was created to describe the probability of being male given the observed weight (P(male | weight)). The model was estimated as a logistic regression among the individuals for whom both covariates were observed. For subjects for whom information on sex was missing, the probability model was used in a mixture model together with the individuals' observed weights, to provide the probability of being male for each individual. The mixture model functionality uses the individual responses, in combination with information about the probability of belonging to each of the subpopulations (in this case male or female) on the population level, to estimate the model parameters [19].

EST

Rather than fixing the expected relation between covariates to the estimates from the portion of the population without missing information (as was done in the MOD method), these relations were estimated. Thus, the fraction of individuals belonging to each subpopulation (i.e. the population probability of being male) was estimated as a typical value parameter in the mixture model. To ensure a hierarchical relation between EST and MOD the typical value parameter was added to the individual predicted probability of being male given the observed weight (the same probability model as was used in MOD).

Evaluation of the MI method’s sensitivity to η-shrinkage

The individual estimates generated in the first step of the MI method (when fitting the PK model without inclusion of any covariate) can suffer from η-shrinkage [57]. The amount of shrinkage in the individual estimates for a particular individual is dependent on the magnitude of the variability of the residual error, the number of observations of the dependent variable for that individual and the informativeness of these observations [67]. An increase in shrinkage will decrease the apparent impact of the partly missing covariate on the concentrations. The weakened relationship will then be incorporated in the model which describes the likelihood of being male given the observed weight, which then will be used to impute the missing covariates.

Table 3.4. Scenarios investigated under the MAR missing data mechanism to evaluate the MI method's sensitivity to η-shrinkage

| Scenario | Sex difference in CL | Individuals with only 1 observation | Residual error |
|---|---|---|---|
| 1 | 17% | 0 | 20% |
| 2 | 17% | 0 | 50% |
| 3 | 17% | 2/3 | 50% |
| 4 | 17% | 2/3 | 70% |
| 5 | 50% | 0 | 20% |
| 6 | 50% | 0 | 50% |
| 7 | 50% | 2/3 | 50% |
| 8 | 50% | 2/3 | 70% |

To evaluate the impact of η-shrinkage on the performance of the developed MI method when handling missing covariate data, eight scenarios were investigated under the MAR missing data mechanism (Table 3.4). The population model was the same as described in Equations 3.14 and 3.15. However, the relative difference in the typical value of CL between males and females was altered between 50% and 17% and the residual error was altered between 20%, 50% and 70%. In addition, the number of observations per individual was altered between two samples for all individuals and two samples for one third of the individuals and one sample for the remaining two thirds (equally distributed between males and females). The level of η-shrinkage for each scenario was calculated after simulating data for 10,000 individuals (6,000 males and 4,000 females) and fitting the base model (the PK model without inclusion of any covariate) to the data.

3.2 Data analysis and model evaluation

All model analyses were performed using NONMEM (version VI in Paper I, version 7.1.2 in Papers II, IV and V and version 7.3.0 in Paper III) [19] facilitated with PsN (version 3.1–4.1.2) [68, 69], and statistical analyses of the data were completed using R (version 2.10–2.14) [70]. The OFV in NONMEM is proportional to −2·log-likelihood of the observed data given the estimated parameter values. The difference in OFV between two hierarchical (nested) models is then approximately χ²-distributed and a decrease in the OFV of 3.84 with 1 degree of freedom (one extra model parameter) gives a significantly better fit at the 5% significance level [19].
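The 3.84 cut-off can be checked with a short calculation. For one degree of freedom the χ² tail probability has a closed form in terms of the complementary error function, so no statistics library is needed. The function name below is hypothetical, and the sketch covers only the 1 degree of freedom case.

```python
import math


def lrt_p_value_1df(delta_ofv):
    """P-value for an OFV drop of delta_ofv between nested models, 1 df.

    Uses P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    if delta_ofv <= 0:
        return 1.0
    return math.erfc(math.sqrt(delta_ofv / 2.0))


p = lrt_p_value_1df(3.84)  # close to 0.05
```

Larger OFV drops give smaller p-values, for example a drop of 6.63 corresponds to roughly the 1% level.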

3.2.1 Model for clinical data

Methotrexate

The FOCE estimation algorithm with interaction (FOCE-INTER) was used when fitting the PK models, and the LAPLACE algorithm was used when fitting the joint PK-PD models [19]. Model development was guided by the OFV, precision in parameter estimates derived through a non-parametric bootstrap [71], and graphical assessment of basic goodness-of-fit plots and visual predictive checks (VPCs) [38, 72].

A non-parametric bootstrap was performed from which the precision in parameter estimates was evaluated by the calculation of standard errors. For the PK model, 500 samples were used, but due to long run times, 100 samples were used for the joint PK-PD model. The PK-PD bootstrap was stratified to ensure that patients with only PK data and patients with both PK and PD data were sampled in the same fractions as in the original dataset [71].

The simulation properties of the model were investigated by a VPC utilizing 1000 simulations from the final PK-PD model. In the VPC, the observations were compared with the simulated model predictions [38, 72]. The VPCs and the basic goodness-of-fit plots were derived using the library Xpose 4 (http://xpose.sf.net) implemented in R.

3.2.2 Simulation studies

Papers II, III, IV and V all use stochastic simulation and estimation (SSE) analyses to evaluate and compare the properties of different estimation algorithms and different imputation and modelling techniques for analysing models with and without missing data. In an SSE analysis, a large number of data sets are simulated by Monte Carlo simulations where the random effects are (randomly) sampled from their predefined distributions. One or more alternative models are then fitted to the simulated data sets. One set of parameter estimates is obtained for each simulated data set and each alternative model, making it possible to compare bias and precision of parameter estimates for the alternative models.

Response data models

The simulated data sets were analysed with seven different estimation algorithms available in NONMEM 7: FOCE (only continuous PD data), LAPLACE (only binary, ordered categorical, RTTE, and count data), ITS, IMP, IMPMAP, SAEM and BAYES. Default settings were used for all algorithms with a few exceptions: for the ITS, IMP and IMPMAP algorithms the maximum number of iterations (NITER) was increased to 1000 and a convergence test was applied to the OFV and all population parameter estimates (CTYPE=3); for the SAEM and BAYES algorithms (which both use Markov chains to generate samples) a convergence test was applied to the OFV and all population parameter estimates (CTYPE=3) and the OFV and parameter estimates from every 10th iteration were submitted to the convergence test (CINTERVAL=10) [19].

The algorithms based on EM and Monte Carlo methods (i.e. ITS, IMP, IMPMAP, SAEM and BAYES) are more efficient if the population model contains information on how the typical value parameters are associated arithmetically with the individual random effects (η-values). In NONMEM this is referred to as MU referencing [19]. All models were parametrised with MU referencing.

Bias and precision

Five hundred data sets were simulated from each PD model and the initial estimates (for the estimation step) were set to the values used to simulate the data, i.e. the true parameter values. Bias and precision of parameter estimates derived with the different estimation algorithms were evaluated in two ways: (i) comparison of the normalized estimation errors (NEEs) and (ii) comparison of relative root mean squared error (rRMSE).

The NEEs were calculated for each parameter and each algorithm after each simulation re-estimation cycle (i = 1...N) using Equation 3.16:

$$NEE\left({}^{a}_{p}\theta_i\right) = \frac{{}^{a}_{p}\theta_i - {}_{p}\theta}{SD_r\left({}_{p}\theta\right)} \qquad (3.16)$$

where ${}^{a}_{p}\theta_i$ represents the estimate of parameter p in data set i obtained with algorithm a, ${}_{p}\theta$ is the true value of the parameter, i.e. the value used in the simulations, and $SD_r({}_{p}\theta)$ is a robust estimate of the standard deviation of the estimates ${}^{a}_{p}\theta_i$ across algorithms. $SD_r({}_{p}\theta)$ was calculated by dividing the empirically determined interquartile range (IQR), i.e. the difference between the 75th and 25th percentile of all the ${}^{a}_{p}\theta_i$ for all algorithms, by the IQR of the standard normal distribution. A median test [73] was applied to test if the median NEE was significantly (1% significance level) different from zero.
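A minimal sketch of Equation 3.16 is given below, assuming the estimates of one parameter are available as a mapping from algorithm name to a list of estimates (a hypothetical data layout). The quartiles are computed with the default method of Python's `statistics.quantiles`, which may differ slightly from the percentile definition used in the thesis.

```python
from statistics import NormalDist, quantiles

# IQR of the standard normal distribution, about 1.349.
NORMAL_IQR = NormalDist().inv_cdf(0.75) - NormalDist().inv_cdf(0.25)


def robust_sd(pooled_estimates):
    """Robust SD: empirical IQR divided by the IQR of the standard normal."""
    q1, _, q3 = quantiles(pooled_estimates, n=4)
    return (q3 - q1) / NORMAL_IQR


def nee(estimates_by_algorithm, true_value):
    """Normalized estimation errors per algorithm for one parameter."""
    # The robust SD is computed once, pooling estimates from all algorithms.
    pooled = [e for est in estimates_by_algorithm.values() for e in est]
    sd_r = robust_sd(pooled)
    return {alg: [(e - true_value) / sd_r for e in est]
            for alg, est in estimates_by_algorithm.items()}
```

Because the same robust SD scales every algorithm's errors, NEEs for one parameter are directly comparable between algorithms.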

The rRMSE for each parameter and each algorithm was calculated by first calculating the RMSE according to Equation 3.17:

$$RMSE\left({}^{a}_{p}\theta\right) = \sqrt{N^{-1} \sum_{i=1}^{N} \left({}^{a}_{p}\theta_i - {}_{p}\theta\right)^2} \qquad (3.17)$$

and then dividing it by the RMSE calculated for the same parameter when estimated with FOCE/LAPLACE. An rRMSE less than one indicated that the algorithm was estimating the parameter with lower bias and higher precision than FOCE/LAPLACE, whereas an rRMSE greater than one indicated that the algorithm performance was inferior to FOCE/LAPLACE. The rRMSE was further summarized per algorithm by calculating the average across all typical value and variance parameters for that model.
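The RMSE of Equation 3.17 and the ratio to the reference algorithm can be sketched in a few lines. The function and argument names are hypothetical; `reference_estimates` stands for the estimates of the same parameter obtained with FOCE or LAPLACE.

```python
import math


def rmse(estimates, true_value):
    """Root mean squared error of the estimates around the true value."""
    n = len(estimates)
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)


def rrmse(estimates, reference_estimates, true_value):
    """RMSE relative to the reference algorithm (FOCE/LAPLACE)."""
    return rmse(estimates, true_value) / rmse(reference_estimates, true_value)


def mean_rrmse(rrmse_per_parameter):
    """Average rRMSE across all parameters of one model."""
    return sum(rrmse_per_parameter) / len(rrmse_per_parameter)
```

Since the RMSE combines bias and imprecision, an algorithm with biased but tightly clustered estimates can still obtain an rRMSE below one.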

Robustness

One hundred data sets were simulated from each PD model. Parameter estimates obtained when the estimations started from the true values were compared with parameter estimates obtained when the estimations started from randomly sampled initial estimates. The estimates were regarded as significantly different if the parameter estimate obtained when the estimation started from randomly sampled initial estimates was more than 1.96 standard deviations (calculated using the robust standard deviation of all estimates across all algorithms) away from the estimate obtained when the estimation started from the true values. The number of significantly different estimates was counted for each algorithm under all models.

Runtimes

The runtimes (in seconds) reported by NONMEM for 100 estimations with initial estimates set to the true parameter values were used to calculate the average estimation time for each algorithm and each model separately.

Missing observations

Two hundred data sets were simulated for each missing data mechanism and six algorithms were used to fit the models to the simulated data sets: FO, FOCE, LAPLACE, IMP, IMPMAP and SAEM. Default settings were used for all algorithms with a few exceptions: for the EM algorithms based on importance sampling (IMP and IMPMAP) the maximum number of iterations (NITER) was increased to 9999 and a convergence test was applied to the OFV and all population parameter estimates (CTYPE=3); for the SAEM algorithm a convergence test was applied to the OFV and all population parameter estimates (CTYPE=3) and the OFV and parameter estimates from every 10th iteration were submitted to the convergence test (CINTERVAL=10) [19]. To obtain an OFV which was comparable between all estimation algorithms, an importance sampling step at the final population parameter estimates (EONLY=1, NITER=5, ISAMPLE=1000) was conducted after estimation with each of the algorithms [19].

All models fitted with any of the EM algorithms were parametrised with MU referencing whenever possible.

Evaluation of model fit

The model fit was evaluated through a comparison of relative bias (RBias) and relative standard deviation (RSD) of parameter estimates of each population parameter, under each estimation algorithm (a), and under each missing data mechanism (m; including estimations using all data). The bias was defined as the deviation of the mean of the parameter estimates from the true parameter value and the RBias was calculated according to Equation 3.18:

$$RBias[P_{a,m}] = \frac{\bar{P}_{a,m} - P}{P} \qquad (3.18)$$

where $\bar{P}$ is the mean of the estimates of the population parameter and P is the corresponding true value. A 95% confidence interval for $\bar{P}$ was also calculated and all algorithms for which the confidence interval did not include the true parameter value were considered to be significantly biased (p < 0.05).

The RSD described the precision of the parameter estimates relative to the mean of the parameter estimates (Equation 3.19):

$$RSD[P_{a,m}] = \frac{SD[P_{a,m}]}{\bar{P}_{a,m}} \qquad (3.19)$$

where SD is the standard deviation of the distribution of the estimates. All algorithms which gave parameter estimates with an RSD < 10% for the estimates of the typical value parameters, and < 20% for the estimates of the variance parameters, were considered to give precise parameter estimates.
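Equations 3.18 and 3.19 translate directly into code. In the sketch below the function names are hypothetical, and the 95% confidence interval for the mean is computed with a normal approximation (mean ± 1.96·SD/√N), which is an assumption since the text does not state how the interval was derived.

```python
import math
from statistics import mean, stdev


def rbias(estimates, true_value):
    """Relative bias: deviation of the mean estimate from the true value."""
    return (mean(estimates) - true_value) / true_value


def rsd(estimates):
    """Relative standard deviation of the estimates around their mean."""
    return stdev(estimates) / mean(estimates)


def significantly_biased(estimates, true_value, z=1.96):
    """True if the assumed normal-approximation 95% CI excludes the true value."""
    m = mean(estimates)
    half_width = z * stdev(estimates) / math.sqrt(len(estimates))
    return not (m - half_width <= true_value <= m + half_width)
```

With 200 data sets per scenario the confidence interval is narrow, so even a relative bias of a few percent can be flagged as significant when the estimates are precise.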

The algorithms' ability to find the parameter estimates which maximize the likelihood of the observed data was evaluated by comparison of the OFVs of all data sets estimated under each estimation algorithm and under each missing data mechanism.

The FO algorithm is known to give more biased and less precise parameter estimates than the FOCE and LAPLACE algorithms. With increasing shrinkage in the individual random effects (η-values), the FOCE algorithm becomes increasingly similar to the FO algorithm. A separate simulation study was run to evaluate if any potential bias in the parameter estimates, after fitting the data sets using the FOCE algorithm, was the result of η-shrinkage. The residual error was reduced to 5%, 200 data sets were simulated and the population model was fitted to the data using the FOCE algorithm. Bias and precision in parameter estimates were evaluated as described above.

Evaluation of structural model discrimination

Both versions of the two-compartment model were hierarchically related to the one-compartment model and a comparison of OFVs was therefore valid. The type I error rate was calculated (for each estimation algorithm under each missing data mechanism, including estimations using all data) as the percentage of data sets for which the two-compartment model had a significantly (p < 0.05) better fit to the data than the one-compartment model.
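The type I error rate calculation can be sketched as a count of significant OFV drops. The text does not state which degrees of freedom were used for the test, so the critical value is passed in as an argument. The helper for 2 degrees of freedom (matching the version with two added typical value parameters) is an assumption, using the closed form P(χ²₂ > x) = exp(−x/2).

```python
import math


def critical_value_2df(alpha=0.05):
    """Critical OFV drop for 2 df (assumed), from P(chi2_2 > x) = exp(-x/2)."""
    return -2.0 * math.log(alpha)


def type_i_error_rate(ofv_one_cmt, ofv_two_cmt, critical):
    """Percentage of data sets where the two-compartment model fits
    significantly better, although the data come from a one-compartment model."""
    n_significant = sum(1 for o1, o2 in zip(ofv_one_cmt, ofv_two_cmt)
                        if o1 - o2 > critical)
    return 100.0 * n_significant / len(ofv_one_cmt)
```

With a well-behaved test the rate should be close to the nominal 5%, so a markedly higher rate indicates that the censoring pattern favours the larger model.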

Missing covariates

Two hundred data sets were simulated for each missing data mechanism, to compare the six methods for handling missing covariate data. All models were fitted with the FOCE estimation algorithm.

43

Bias and precision

Bias and precision of the estimated population parameters were evaluated (in a similar way as when the data sets suffered from missing observations) by calculation and comparison of relative bias (RBias) and relative standard deviation (RSD), for each population parameter (i), under each method for handling the missing data (j; including estimations using all data), and under each missing data mechanism (k). The bias was defined as the deviation of the mean of the estimates from the true value and the RBias was calculated according to Equation 3.20:

$$RBias[P_{i,j,k}] = \frac{\bar{P}_{i,j,k} - P_i}{P_i} \qquad (3.20)$$

where $\bar{P}$ is the mean of the estimates of the population parameter and P is the corresponding true value, i.e. the value used in the simulation. All methods which resulted in parameter estimates with an RBias < 5% for the typical value parameters, and < 10% for the variance parameters, were considered to be unbiased.

The RSD described the precision of the parameter estimates relative to the mean of the parameter estimates (Equation 3.21):

$$RSD[P_{i,j,k}] = \frac{SD[P_{i,j,k}]}{\bar{P}_{i,j,k}} \qquad (3.21)$$

where SD is the standard deviation of the distribution of the estimates. All methods which resulted in parameter estimates with an RSD < 10% for the estimates of the typical value parameters, and < 20% for the estimates of the variance parameters, were considered to give precise parameter estimates.

Evaluation of the MI method's sensitivity to η-shrinkage

Two hundred data sets were simulated for each scenario. The data sets were analysed using the MI method and, to enable a comparison, using all data. The base model and the model including the covariate effect were hierarchical (nested) and therefore a comparison of OFVs between the two models (to investigate if the covariate was found significant or not) was possible.

The efficiency of the MI method was evaluated by comparing the bias and precision of the estimated typical value and variance parameters with the bias and precision obtained when all data were used in the estimations. The calculations of bias and precision were done in accordance with Equations 3.20 and 3.21, and the definition of when an estimate was considered unbiased and/or precise was the same as described above.


4. Results

4.1 Model for clinical data

4.1.1 Methotrexate

Pharmacokinetic model

A three-compartment structural model gave a significantly lower OFV compared with the two-compartment model. This was thought to be due to the effect of censored observations [41], since the daily TDM was suspended once a plasma sample with a concentration less than 0.2 μmol L−1 was collected. The two-compartment model simulated data similar to the data observed, both regarding observed concentrations and the fraction of observations below 0.2 μmol L−1 (VPCs in Figure 4.1), and for these reasons, it was chosen as the final structural model. Michaelis-Menten elimination resulted in a similar OFV to the linear CL model, with Km estimates well above the observed concentration range, indicating that CL was best described by a linear process. A proportional error model was found to best describe the residual error.

No significant improvement in model fit was found when using FFM in the GFR calculation compared with body weight, and no improvement was obtained when BMI was included as an additional covariate. For this reason, unadjusted body weight was used in the final model. When covariate scaling parameters θWT,V1 and θWT,V1 were estimated, values were close to one, and no significant change in OFV was observed when they were fixed to one in the final model. Final parameter estimates and relative standard errors (calculated from the bootstrap estimates) for the PK model are presented in Table 4.1.

Table 4.1. Parameter estimates and relative standard errors (RSE), calculated from the bootstrap estimates (455 of 500), for the final PK model. The tabulated values are for an individual weighing 70 kg.

| Parameter | Typical value (RSE%) | BSV% (RSE%) | BOV% (RSE%) |
|---|---|---|---|
| CLsec [L/h] | 10.9 (12.5) | 12.9 (54.0) | 12.0 (18.7) |
| V1 [L] | 74.3 (12.3) | 8.02 (91.9) | |
| Q [L/h] | 0.110 (15.7) | 29.2 (60.2) | |
| V2 [L] | 4.10 (20.9) | 36.5 (36.5) | |
| θWT,CLsec | 0.944 (9.50) | | |
| θWT,Q | 0.693 (17.9) | | |
| θSCr,CL | 0.0831 (55.8) | | |
| Residual error (%) | 29.3 (5.70) | | |


Figure 4.1. The top panel shows the visual predictive check (VPC) of the final PK model and the observed data. The horizontal dashed gray line is the limit concentration of 0.2 μmol L−1 after which TDM is stopped. The dark gray circles indicate the observed concentrations, the solid black line is the observed median, the dashed black lines are the observed 5th and 95th percentiles, the light gray shaded area is the 95% confidence interval (CI) for the simulated median, and the dark gray shaded areas are the 95% CIs for the simulated 5th and 95th percentiles. The bottom panel shows the VPC for the fraction of the data that was censored due to cessation of TDM, once a concentration below 0.2 μmol L−1 was observed. The solid dark gray line is the observed fraction of censored observations, and the light gray shaded area is the 95% CI for the simulated fraction of censored observations.


Figure 4.2. A visual predictive check (VPC) of the final PD model (PK-PD model) and the observed data. The panels show the probability of each mucositis score with increasing cumulative AUCi. The solid dark gray line is the observed proportion of each score, and the light gray shaded area is the 95% confidence interval for the simulated probability of each score.

Table 4.2. Parameter estimates and relative standard errors (RSE), calculated from the bootstrap estimates (92 of 100) for the final PD model.

| Parameter | Typical value (RSE%) | BSV% (RSE%) | BOV% (RSE%) |
|---|---|---|---|
| B1 | -2.11 (16.8) | 67.7 (52.2) | 59.4 (36.8) |
| B2 | -1.49 (12.3) | | |
| B3 | -1.95 (21.8) | | |
| Slope | 0.000275 (19.3) | | |

B1, B2 and B3 are the baseline parameters (on the logit scale) from which the probabilities are calculated. The corresponding baseline probabilities (AUC = 0) for the bootstrap estimates (RSE%) would be: Pscore=0 = 0.89 (3.9), Pscore=1 = 0.084 (30.8), Pscore=2 = 0.023 (41.5), Pscore≥3 = 0.0043 (52.0).
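How baseline probabilities follow from B1, B2 and B3 can be illustrated with a small calculation. The sketch assumes the common cumulative-logit parameterization of an ordered categorical model, in which logit P(score ≥ k) is the sum of the first k baseline parameters; the thesis text does not spell out the parameterization, so this is an assumption, and the function names are hypothetical.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def baseline_probabilities(b, drug_effect=0.0):
    """Category probabilities from cumulative logits (assumed parameterization).

    b: baseline parameters (B1, B2, B3); drug_effect: e.g. Slope * AUC.
    Returns [P(score=0), P(score=1), P(score=2), P(score>=3)].
    """
    cumulative, logit = [], drug_effect
    for b_k in b:
        logit += b_k
        cumulative.append(expit(logit))  # P(score >= k)
    bounds = [1.0] + cumulative + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]


probs = baseline_probabilities([-2.11, -1.49, -1.95])
```

With the tabulated typical values this gives probabilities close to those quoted in the table footnote. Small differences are expected because the footnote values are derived from the bootstrap estimates.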

Pharmacodynamic model

In the effect compartment model, ke0 tended toward zero, indicating that the effect was driven by cumulative drug exposure rather than plasma concentration. Adding methotrexate AUCi to the probability model gave a significant drop in OFV. The relationship between AUCi and the probability of different mucositis scores was best described with a linear model (no improvement in OFV with an Emax model). Cumulative AUC for a two-week (i.e., two-dose) block gave a lower OFV than treating each dose separately. The final model contained BSV and BOV implemented additively on baseline. The probabilities for mucositis with scores 3 and 4 were similar, and they were therefore combined as one category. Simulations from the final PK driven mucositis model showed that the simulated probability of each mucositis score with increasing cumulative AUCi compared well with the observed proportions (Figure 4.2). Final parameter estimates and relative standard errors (calculated from the bootstrap estimates) for the PD model from the final PK-PD model are presented in Table 4.2.

Figure 4.3. Point range plot of the normalized estimation error for typical values (dark grey) and variance (light grey) parameters stratified by model type. The median NEE is shown as a cross if it was significantly different from zero (median test, 1% significance level) and otherwise as a filled circle. The vertical line indicates the precision of the estimates (quantiles of ±1 SD of the standard normal distribution).

4.2 Simulation studies

4.2.1 Response data models

Bias and precision

Figure 4.3 shows the NEEs for all parameters, estimated using each of the algorithms, stratified by model type. The estimates of the parameters should be approximately normally distributed around the median. The median estimate should be close to zero to be considered unbiased and the range of estimates should be approximately 2SDr(pθ) wide (±1SDr(pθ) from the median) to be considered precise.

The mean rRMSE of parameter estimates obtained with the different estimation algorithms across models are shown in Table 4.3. Tables with rRMSE for all parameters, estimated using each of the algorithms, stratified by model type are presented in the appendix of Paper II.

The algorithm giving the lowest bias and highest precision in parameter estimates across models was the IMP algorithm, closely followed by FOCE/LAPLACE, IMPMAP and SAEM. The ITS and the BAYES algorithms resulted in parameter estimates with marked bias and high imprecision for parameters in all of the investigated models.


Table 4.3. Mean rRMSE of parameter estimates obtained with the different estimation algorithms across models.

Algorithm   Continuous   Binary   OC      RTTE    Count
ITS         0.98*        0.91*    1.87    0.98    1.03
IMP         0.95         0.97     1.04    0.99    0.99
IMPMAP      1.01         0.98     1.02*   0.99    0.99
SAEM        0.99         1.03     1.17    1.04    1.04
BAYES       4580         1.24     1.33    31800   22900

*Artificially low rRMSE due to biased but precise estimates (determined from Figure 4.3).
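How a ratio such as those in Table 4.3 can arise is sketched below. The sketch assumes that the RMSE is computed on relative estimation errors and that the rRMSE is the ratio between the RMSE of an algorithm and that of a reference algorithm; the exact definitions are given elsewhere in the thesis, and all names and numbers here are hypothetical.

```python
import math


def rmse_percent(estimates, true_value):
    """RMSE of the relative estimation errors, in percent (assumed definition)."""
    rel_err = [(e - true_value) / true_value for e in estimates]
    return 100.0 * math.sqrt(sum(r * r for r in rel_err) / len(rel_err))


def relative_rmse(estimates, reference_estimates, true_value):
    """RMSE of one algorithm divided by the RMSE of the reference algorithm."""
    return rmse_percent(estimates, true_value) / rmse_percent(
        reference_estimates, true_value
    )


# Hypothetical estimates of one parameter (true value 10) from four data sets
reference = [9.0, 11.0, 10.5, 9.5]   # reference algorithm
candidate = [9.5, 10.5, 10.2, 9.8]   # algorithm under evaluation

ratio = relative_rmse(candidate, reference, true_value=10.0)
print(f"rRMSE = {ratio:.2f}")  # a value below 1 means a lower RMSE than the reference
```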

Continuous data model

The only algorithm resulting in unbiased and precise parameter estimates was the IMP algorithm (first panel in the first row of Figure 4.3). The ITS, IMP and SAEM algorithms resulted in a mean rRMSE of <1 (first column in Table 4.3), reflecting a lower RMSE than FOCE. The BAYES algorithm resulted in an extremely high mean rRMSE.

Binary data model

None of the algorithms resulted in unbiased estimates for all parameters; however, IMP, IMPMAP and SAEM were the best performing algorithms (second panel in the first row of Figure 4.3). The ITS, IMP and IMPMAP algorithms resulted in a mean rRMSE of <1 (second column in Table 4.3).

Ordered categorical data model

The algorithm resulting in the least biased and most precise parameter estimates was the LAPLACE algorithm, closely followed by the IMP algorithm (third panel in the first row of Figure 4.3). None of the algorithms resulted in a mean rRMSE of <1 (third column in Table 4.3).

RTTE data model

The algorithms resulting in the least biased and most precise parameter estimates were the IMP and the IMPMAP algorithms (first panel in the second row of Figure 4.3). The ITS, IMP and IMPMAP algorithms resulted in a mean rRMSE of <1, while the BAYES algorithm resulted in an extremely high mean rRMSE (fourth column in Table 4.3).

Count data model

The algorithms resulting in the least biased and most precise parameter estimates were the IMP and the IMPMAP algorithms (second panel in the second row of Figure 4.3). The IMP and IMPMAP algorithms resulted in a mean rRMSE of <1, while the BAYES algorithm resulted in an extremely high mean rRMSE (fifth column in Table 4.3).


Robustness

The sensitivity with respect to initial estimates differed considerably between models and none of the algorithms was superior to the others. The percentage of significantly different parameter estimates was above the expected 5% for all estimation algorithms when fitting the continuous data model and it was below, or slightly above, 5% for all algorithms when fitting the binary data model and the RTTE data model. When fitting the ordered categorical data model, only SAEM and BAYES had more than 5% significantly different parameter estimates, while when fitting the count data model, only ITS, IMP and IMPMAP had more than 5% significantly different parameter estimates.

Runtimes

The FOCE/LAPLACE algorithms had the shortest runtimes across all models, followed by the ITS algorithm, which had runtimes equal to LAPLACE for the ordered categorical data model but was 2-3 times slower than FOCE/LAPLACE for the remaining models. The IMP and IMPMAP algorithms had similar relative runtimes for all investigated models, with runtimes between 25 (continuous data model with IMPMAP) and 77 (binary model with IMP) times longer than for the FOCE/LAPLACE algorithms. The SAEM algorithm had runtimes shorter than or similar to those of IMP and IMPMAP for all models, and the runtimes were between 8.4 (ordered categorical model) and 54 (count model) times longer than for the FOCE/LAPLACE algorithms. The BAYES algorithm was the slowest for all models and the relative runtimes were between 47 (ordered categorical model) and 280 (count model) times longer than for the FOCE/LAPLACE algorithms.

4.2.2 Missing observations

Evaluation of model fit

Similar results were obtained when the initial estimates of the parameters were set to the true values and when the initial estimates were randomly sampled from uniform distributions. The only algorithm suffering from robustness problems (difficulties finding the minimum OFV when starting from randomly sampled initial estimates) was the SAEM algorithm, which resulted in extremely high OFVs and extreme parameter estimates when fitting the model to some of the data sets (11% when data were MCAR, 3.5% when data were MAR and 3.0% when data were MNAR). Only results following estimation from randomly sampled initial estimates are presented. The calculated bias and precision of parameter estimates are reported in tables in the appendix of Paper III.

All data

All algorithms except FO gave unbiased and precise parameter estimates for all parameters when all data were used in the estimation (first panel in Figure 4.4). The only estimates being significantly biased (p < 0.05) were the estimates of the typical value of CL when estimated using the FO algorithm. More than 75% of the estimates of the BSV of CL were higher than the true value when estimated using the FO algorithm; however, this bias was not significant.

MCAR

When data were MCAR, the parameter estimates obtained with the gradient based algorithms and the SAEM algorithm suffered more from the missing data than the estimates obtained with the IMP and IMPMAP algorithms (second panel in Figure 4.4). The FO algorithm showed the same pattern in the estimates as when all data were used in the estimation. The estimates of the typical value of CL were significantly biased (p < 0.05) and more than 75% of the estimates of the BSV of CL were higher than the true value, yet not significantly biased. When using the FOCE algorithm, more than 75% of the estimates of the typical value of CL were higher than the true value and almost 75% of the estimates of the BSV of CL were lower than the true value; however, these biases were not significant. The LAPLACE algorithm and the SAEM algorithm showed minor difficulties in the estimation of the BSV of CL (LAPLACE) and the BSV of V (SAEM). All algorithms except SAEM gave precise parameter estimates of all parameters. The imprecision of the estimates when using the SAEM algorithm was due to the observed robustness problems.

MAR

When data were MAR, the parameter estimates obtained with the gradient based algorithms and the SAEM algorithm suffered even more from the missing data than they did when data were MCAR (panel three in Figure 4.4). The FO algorithm gave significantly biased (p < 0.05) estimates of the typical value of CL, the BSV of V and the residual error parameter. In addition, more than 75% of the estimates of the typical value of V were higher than the true value, yet not significantly biased. After fitting the model with the FOCE algorithm, more than 75% of the estimates of the typical value of CL were higher than the true value and more than 75% of the estimates of the BSV of V were lower than the true value (not significantly biased). The LAPLACE algorithm gave estimates of the BSV of V of which more than 75% were lower than the true value and the SAEM algorithm showed minor difficulties estimating the same parameter. All algorithms except SAEM gave precise estimates of all parameters except the BSV of V, for which the estimates were imprecise (RSD > 20%) independent of the estimation algorithm. The imprecision of the estimates when using the SAEM algorithm was due to the observed robustness problems.


Figure 4.4. Box-plots showing bias and precision of population parameter estimates after fitting the model to 200 simulated data sets using six different estimation algorithms. The top panel shows the results for when all available data (six concentration measurements per individual) were used when fitting the model, and the subsequent panels show the results for when the model was fitted to data where approximately 62% of the concentration measurements were MCAR, MAR and MNAR respectively. When the SAEM algorithm was used, extreme parameter estimates (not shown in the panels) were obtained when fitting the model to some of the data sets.


MNAR

When data were MNAR, all estimation algorithms encountered problems when fitting the model to the data (bottom panel in Figure 4.4). All algorithms except SAEM resulted in significantly biased (p < 0.05) estimates of the typical value of V and, in addition, the FO algorithm gave significantly biased estimates of the BSV of CL and the FOCE algorithm gave significantly biased estimates of the residual error parameter. With a few exceptions, all algorithms resulted in parameter estimates of which 75% were lower than the true value (typical value of CL and BSV of CL) or of which 75% were higher than the true value (residual error parameter); however, these biases were not significant. All algorithms except SAEM gave precise estimates of all parameters except the BSV of V, for which the estimates were imprecise (RSD > 20%) independent of the estimation algorithm. The imprecision of the estimates when using the SAEM algorithm, as well as the non-significant bias of the typical value of V, were due to the observed robustness problems.

Comparison of OFVs

The comparison of OFVs for the different estimation algorithms revealed that the FO algorithm resulted in a higher OFV than FOCE for all investigated scenarios, while the LAPLACE, IMP, IMPMAP and SAEM algorithms resulted in OFVs which were the same (when all data were used in the estimation) or slightly lower (when data were MCAR, MAR or MNAR) than the ones obtained with the FOCE algorithm (excluding the runs resulting in extremely high OFVs when estimated with the SAEM algorithm).

Reduced η-shrinkage

When data were MCAR or MAR, and the FOCE algorithm was used to fit the model to data sets simulated with 5% residual error, the parameter estimates were better centred on the true value than they were when the data sets were simulated with 15% residual error. When data were MNAR, similar difficulties in estimation of the parameters were observed when the residual error was 5% as when the residual error was 15%.

Evaluation of structural model discrimination

Convergence was not always achieved when fitting the two-compartment model to the simulated data sets. The gradient based algorithms had most difficulties reaching convergence when there were four BSV parameters in the model, while the EM algorithms had most difficulties reaching convergence when there were only two BSV parameters in the model. Therefore, the results presented are the results following fitting the two-compartment model with two BSV parameters using the gradient based algorithms and fitting the two-compartment model with four BSV parameters using the EM algorithms.


The type I error rate was less than 5% for all estimation algorithms when using all data in the estimation and when data were MCAR or MAR, but it was (close to) 100% for all algorithms when data were MNAR.
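A type I error rate of this kind can be computed as sketched below. The sketch assumes that the data were simulated from the simpler structural model, that both models were fitted to every data set, and that the comparison is a likelihood ratio test on the drop in OFV with two degrees of freedom; the degrees of freedom and all OFV values are assumptions made for illustration only.

```python
import math


def lrt_critical_value_2df(alpha=0.05):
    """Chi-square critical value for 2 degrees of freedom.

    For 2 df the survival function is exp(-x / 2), so the critical value
    has the closed form -2 * ln(alpha). The choice of 2 df is an assumption.
    """
    return -2.0 * math.log(alpha)


def type_one_error_rate(ofv_reduced, ofv_full, critical_value):
    """Percentage of data sets in which the full model is wrongly selected."""
    pairs = list(zip(ofv_reduced, ofv_full))
    n_significant = sum(1 for red, full in pairs if red - full > critical_value)
    return 100.0 * n_significant / len(pairs)


# Hypothetical OFVs for four simulated data sets
ofv_one_cmt = [100.0, 250.0, 310.0, 95.0]
ofv_two_cmt = [99.0, 240.0, 309.5, 89.5]

crit = lrt_critical_value_2df()
rate = type_one_error_rate(ofv_one_cmt, ofv_two_cmt, crit)
print(f"critical dOFV = {crit:.2f}, type I error rate = {rate:.0f}%")
```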

4.2.3 Missing covariates

Bias and precision

The bias and precision in estimates of the typical value of CL for males (CLmale) and females (CLfemale), when estimated using the different methods for handling missing covariate data, are visualised in Figure 4.5, showing the bias as the deviation of the median estimate from the true value and the precision as the width of the box and the whiskers. The calculated bias and precision of the population parameter estimates (typical value and variance parameters) are presented in tables in Paper V.
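The relative bias (RBias) and relative standard deviation (RSD) reported in this section can be illustrated with a minimal sketch. The formulas used here (mean deviation from the true value, and standard deviation relative to the mean estimate, both in percent) are common definitions and are assumed rather than taken from the thesis; the estimates are hypothetical.

```python
import statistics


def relative_bias_percent(estimates, true_value):
    """Mean deviation from the true value, in percent of the true value."""
    return 100.0 * (statistics.mean(estimates) - true_value) / true_value


def relative_sd_percent(estimates):
    """Sample standard deviation in percent of the mean estimate."""
    return 100.0 * statistics.stdev(estimates) / statistics.mean(estimates)


# Hypothetical estimates of CLmale (true value 2, as in Figure 4.5)
cl_male_estimates = [1.7, 1.6, 1.8, 1.7]

rbias = relative_bias_percent(cl_male_estimates, true_value=2.0)
rsd = relative_sd_percent(cl_male_estimates)
print(f"RBias = {rbias:.1f}%, RSD = {rsd:.1f}%")
```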

CC

The CC method gave unbiased estimates for all population parameters, independent of the underlying missing data mechanism (Figure 4.5). The estimates were less precise than the ones obtained when using the more advanced methods (MI, MOD or EST) but all RSDs were below the predefined limits and should therefore be considered as precise.

SImode

When data were MCAR or MAR, the SImode method resulted in considerably underestimated values of CLmale (RBias, -16% and -14%) while the estimates of CLfemale remained unbiased but less precise than the estimates obtained when using the more advanced methods (MI, MOD and EST) (first and second panel in Figure 4.5). More outliers were observed when data were MAR than when data were MCAR. When data were MNAR, the estimates of CLmale were unbiased but less precise while the estimates of CLfemale were highly overestimated (RBias, +42%) (third panel in Figure 4.5). The estimates of the BSV were highly overestimated independent of the missing data mechanism (RBias between +76% and +110%).

SIWT

When data were MCAR or MAR, the estimates of CLmale were underestimated (RBias, -11%) whereas the estimates of CLfemale were overestimated (RBias, +11% and +10%) (first and second panel in Figure 4.5). For the MNAR missing data mechanism, the estimates of CLmale were unbiased according to the predefined limits (RBias, -3.1%) and the estimates of CLfemale were highly overestimated (RBias, +32%) (third panel in Figure 4.5). The estimates of the BSV were highly overestimated independent of the missing data mechanism (RBias between +69% and +87%).


Figure 4.5. Box-plot showing bias and precision of the estimates of the typical value of CL for males (true value, 2) and females (true value, 1) after fitting 200 simulated data sets, in which the covariate sex was MCAR (panel 1), MAR (panel 2) or MNAR (panel 3) for 50% of the individuals, using six different methods to handle the missing data.


MI

The MI method gave unbiased and precise estimates of all population parameters when data were MCAR or MAR (first and second panel in Figure 4.5). When data were MNAR, the estimates of CLmale and CLfemale were both overestimated (RBias, +5.2% for CLmale and +6.2% for CLfemale) (third panel in Figure 4.5). Both variance parameters (BSV and residual error) were estimated without any bias and with high precision, independent of the underlying missing data mechanism.

MOD

The results obtained when using the MOD method were similar to those observed for the MI method. All population parameters were estimated without any bias and with high precision when data were MCAR or MAR (first and second panel in Figure 4.5). When data were MNAR, the estimates of CLmale and CLfemale were both overestimated (RBias, +5.8% for CLmale and +10% for CLfemale) (third panel in Figure 4.5). Both variance parameters were estimated without any bias and with high precision, independent of the underlying missing data mechanism.

EST

The EST method was the only method resulting in unbiased and precise estimates of all population parameters independent of the underlying missing data mechanism (Figure 4.5). EST was significantly better than MOD in 8.5% of the simulated data sets when data were MCAR, in 13% of the data sets when data were MAR and in 100% of the data sets when data were MNAR. The extra parameter that was estimated in the EST method had a median estimate (median over the 200 simulated data sets) of 0 when data were MCAR or MAR, whereas it had a median estimate of 2.0 when data were MNAR.

Evaluation of the MI method’s sensitivity to η-shrinkage

The η-shrinkage was calculated for each scenario and the values are presented in Table 4.4. The bias and precision in estimates of CLmale and CLfemale, for each scenario, are visualised in Figure 4.6, showing the bias as the deviation of the median estimate from the true value and the precision as the width of the box and the whiskers. The calculated bias and precision of the population parameter estimates (typical value and variance parameters) are presented in a table in Paper IV.

The MI method gave similar bias and precision as when all data were used in the analysis, independent of the level of η-shrinkage. All population parameters were unbiased under all scenarios. However, the precision decreased with increasing shrinkage for both the typical value parameters and the BSV.

The percentage of data sets for which the covariate effect was significant was calculated for each scenario and is reported in Table 4.4.
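For readers unfamiliar with the quantity, η-shrinkage is commonly computed as one minus the ratio between the standard deviation of the individual η estimates and the estimated population standard deviation ω. The sketch below uses that common definition, which is assumed here rather than quoted from the thesis; the values are hypothetical.

```python
import statistics


def eta_shrinkage_percent(individual_etas, omega_sd):
    """Eta-shrinkage in percent: 100 * (1 - SD(individual etas) / omega)."""
    return 100.0 * (1.0 - statistics.stdev(individual_etas) / omega_sd)


# Hypothetical individual eta estimates and population SD (omega)
etas = [-0.2, -0.1, 0.0, 0.1, 0.2]
omega = 0.3

shrinkage = eta_shrinkage_percent(etas, omega)
print(f"eta-shrinkage = {shrinkage:.1f}%")
```

The less information each individual contributes, the closer the individual estimates move towards zero, the smaller their standard deviation becomes, and the higher the calculated shrinkage.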


Figure 4.6. Box-plots showing bias and precision of the estimated fixed effects of CL. The panels present the results for the different scenarios, and 200 data sets were simulated and thereafter re-estimated for each scenario. The covariate sex was missing at random for 50% of the individuals and the multiple imputation method (MI) was used to handle the missing data in the estimations, here compared with the results obtained when no data were missing (All).

Table 4.4. Calculated η-shrinkage and the percentage of data sets for which the covariate was found significant when the MI method was used to handle the missing data, compared with when all data were used in the estimation, for each investigated scenario.

Scenario   Shrinkage   Method   Covariate significant
1          8.8%        ALL      97%
                       MI       90%
2          33%         ALL      76%
                       MI       70%
3          42%         ALL      68%
                       MI       66%
4          54%         ALL      45%
                       MI       46%
5          4.5%        ALL      100%
                       MI       100%
6          21%         ALL      100%
                       MI       100%
7          28%         ALL      100%
                       MI       100%
8          40%         ALL      100%
                       MI       100%


5. Discussion

The research presented in this thesis showed that missing observation data is a greater problem during analyses of nonlinear mixed effects models than previously assumed. Modelling strategies for appropriate handling of missing observation data and missing covariate data were proposed and consequences of inappropriate handling were presented.

When analysing data collected during routine TDM in patients treated with high-dose methotrexate for osteosarcoma, several problems regarding missing data were observed. The greatest problem concerned missing observation data, since the TDM was suspended once a plasma sample with a concentration less than 0.2 μmol/L was collected. PK structural models with one, two and three compartments were tested and, even though the three-compartment model gave a significantly (p < 0.05) better fit to the data, the two-compartment model was chosen as final disposition model. This was because it was assumed that the censoring of lower concentration measurements erroneously caused the three-compartment model to give a better fit than the two-compartment model. The VPC for the final PK model (Figure 4.1) showed that the two-compartment model adequately simulated data compared with those observed, and adequately predicted the fraction of censored observations at different time points after dose.

A two-compartment disposition model for methotrexate is also proposed in several previous studies (e.g. [31–33, 35]). However, a three-compartment disposition model is not even tested in most of these studies [31–33]. In studies where both two- and three-compartment models are tested, the two-compartment model [35] and the three-compartment model [34, 74] are chosen as final disposition model for different data sets, based on statistical evaluations.

Missing covariate data was also a problem during the analysis of the methotrexate PK data. Serum creatinine measurements were missing for three subjects on between one and four occasions and single imputation, using 'last observation carried forward' (or 'first observation carried backward'), was used to fill in the missing values to enable the analysis. After single imputation, the imputed values are analysed as if they were the true values, without taking the uncertainty in the imputation into account. The shortcomings of single imputation methods are documented and discussed by others [29, 36, 48, 75]; however, since the fraction of missing SCr values in the data set was low, the impact of the imputations on the final parameter estimates was assumed to be negligible.
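The imputation rule described above can be sketched as follows. This is a minimal illustration with hypothetical serum creatinine values, not the code used in the analysis: missing values (None) are replaced by the last observed value, and gaps before the first observation are filled with the first observed value.

```python
def impute_locf_focb(values):
    """Last observation carried forward, first observation carried backward."""
    filled = list(values)
    last = None
    for i, v in enumerate(filled):
        if v is None:
            filled[i] = last          # stays None if nothing observed yet
        else:
            last = v
    # Leading gaps: carry the first observed value backward
    first = next((v for v in filled if v is not None), None)
    return [first if v is None else v for v in filled]


# Hypothetical serum creatinine values per occasion for one subject
scr = [None, 60.0, None, None, 72.0, None]
print(impute_locf_focb(scr))
```

The imputed series carries no marker of which values were observed and which were filled in, which is why the downstream analysis treats them all as true values.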


To evaluate the differences in performance between different estimation algorithms regarding the fitting of models to data where part of the observations were missing, the algorithms were first evaluated by comparing differences in their performance when fitting models to different types of response (PD) data where no observations were missing. The algorithms use different approximations to evaluate and maximize the likelihood ('expected' likelihood for the EM algorithms) and the performance of the algorithms was evaluated by comparing bias, precision, robustness and runtime. The IMP algorithm was the algorithm giving the lowest bias and highest precision in parameter estimates across models, closely followed by FOCE/LAPLACE, IMPMAP and SAEM. The ITS and BAYES algorithms resulted in biased and imprecise parameter estimates. However, the BAYES algorithm was not used in the way it is intended to be used; no prior (uncertainty) distributions of the population parameters were added to the likelihood and, instead of considering all probable sets of population parameters (as reported by BAYES), the means of all probable values were used when evaluating the bias and precision of parameter estimates.

The algorithms' robustness (with respect to initial estimates) differed considerably between models and none of the algorithms was superior to the others. In the comparison of runtimes, the FOCE/LAPLACE algorithms had the shortest runtimes across all models, followed by the ITS algorithm. The BAYES algorithm had the longest runtimes for all models tested. All response models used in the evaluation were quite simple and the differences in runtimes between the sampling based algorithms (IMP, IMPMAP, SAEM and BAYES) and the non-sampling based algorithms (FOCE/LAPLACE and ITS) are expected to decrease with increasing model complexity.

Based on the results from the evaluation of the performance of the algorithms when fitting models to different types of response data, the FOCE, LAPLACE, IMP, IMPMAP and SAEM algorithms were chosen to be included in the evaluation of differences in performance when fitting models to data where part of the observations were missing. To complete the picture, the FO algorithm (the gradient based algorithm with the roughest approximation of the likelihood) was also included in the comparison. Model fit and structural model discrimination were evaluated under three different missing (observation) data mechanisms: random censoring (MCAR), TDM censoring (MAR), and BQL censoring (MNAR), where TDM censoring was the same type of missing data mechanism as was observed in the methotrexate data set.
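The three censoring mechanisms can be made concrete with a small sketch. It is an illustration under stated assumptions, not the simulation code used in the thesis: the concentrations, limits and function names are hypothetical, TDM censoring is taken to mean that sampling stops after the first observed concentration below a limit (that sample itself is kept), and BQL censoring that every concentration below a quantification limit is missing.

```python
import random


def censor_mcar(conc, p_missing, rng):
    """MCAR: each observation is removed with the same probability."""
    return [None if rng.random() < p_missing else c for c in conc]


def censor_tdm(conc, stop_limit):
    """MAR: sampling stops after the first observed value below stop_limit."""
    kept, stopped = [], False
    for c in conc:
        if stopped:
            kept.append(None)         # never sampled
            continue
        kept.append(c)                # the sample triggering the stop is observed
        if c < stop_limit:
            stopped = True
    return kept


def censor_bql(conc, loq):
    """MNAR: the value is missing because it is itself below the limit."""
    return [None if c < loq else c for c in conc]


# Hypothetical concentration-time profile with six samples
conc = [8.0, 4.0, 2.0, 1.0, 0.5, 0.25]

print(censor_mcar(conc, 0.5, random.Random(1)))
print(censor_tdm(conc, 1.5))
print(censor_bql(conc, 1.5))
```

In the TDM case the reason for missingness (the 1.0 sample) remains in the data set, whereas in the BQL case the reason is the unobserved value itself, which is the distinction between MAR and MNAR.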

When data were MCAR or MAR, none of the algorithms included in the evaluation resulted in an elevated type I error rate during the evaluation of the structural model. When data were MNAR, all algorithms resulted in biased and imprecise parameter estimates as well as an elevated type I error rate. Since the TDM censoring in the methotrexate data set was MAR, this implies that the wrong structural model was chosen for methotrexate, and hence that the structural disposition model best describing the data is the three-compartment model. This finding is important considering the fact that the response model for mucositis was found to be correlated with methotrexate exposure over time (AUCi) rather than with the drug concentration at a specific time point after dose. A third disposition compartment means that the drug stays in the body for a longer time than if there are only two disposition compartments, i.e. the exposure to the drug is greater than expected and predicted by the published model.

The EM algorithms performed better than the gradient based algorithms under all missing data mechanisms, both considering bias and precision of parameter estimates when fitting the one-compartment model to the data, and considering convergence of estimations when fitting the two-compartment model to the data. The IMP algorithm was the algorithm giving the least biased (unbiased) and most precise parameter estimates when data were MCAR or MAR, closely followed by IMPMAP. The SAEM algorithm experienced some difficulties finding the minimum OFV; convergence was sometimes achieved at extreme OFVs, and SAEM was the EM algorithm resulting in the most biased (non-significant bias on 5% significance level) and least precise parameter estimates when data were MCAR or MAR.

The LAPLACE algorithm was the best of the gradient based algorithms, reaching almost as low OFVs as the EM algorithms under all missing data mechanisms and resulting in less bias in parameter estimates than FO and FOCE. The FO algorithm was the only algorithm resulting in significantly biased parameter estimates even when all data (six concentration samples per individual) were used to fit the model. The FOCE algorithm gave unbiased parameter estimates when all data were used to fit the model; however, when data were MCAR or MAR, more than 75% of the estimated CL values were higher than the true value (non-significant bias on 5% significance level) and problems estimating the BSV of CL and V were also observed (the bias was closely related to shrinkage and was reduced when the shrinkage was decreased). This indicates that the approximations of the likelihood used by the FO and FOCE (and to some extent also LAPLACE) algorithms might not be good enough when there is an imbalance in the distribution of observations over time after dose.

A statistically significant bias is not necessarily significant from a treatment perspective, and likewise a non-significant bias is not necessarily unimportant from a treatment perspective. The gradient based algorithms (FO, FOCE and LAPLACE) should therefore be avoided in the estimation of model parameters when data are MCAR or MAR. These findings are in disagreement with the current FDA guidelines [76], which state that: "if the concentration data are missing randomly, the process that caused the missing data can be ignored and the observed data can be analyzed without regard to the missing data". This statement is true if the missing observations are evenly distributed over time after dose [77] (not correlated with independent variables, discrete design variables or covariates; Equation 1.7). However, it does not seem to be generally applicable to all types of randomly missing observation data in nonlinear mixed effects modelling.

Based on these findings it can be noted that, in order to obtain unbiased and precise parameter estimates, the IMP algorithm would have been the best algorithm to use when estimating both the PK and the PD parameters in the joint PK-PD model for methotrexate.

The relative differences in performance of different methods used to handle missing covariate data were very similar when data were MCAR and MAR, whereas most methods gave a greater bias in the estimates when data were MNAR. The more advanced methods investigated (the multiple imputations method (MI), and the full maximum likelihood methods (MOD and EST)) gave unbiased and precise estimates of all population parameters when data were MCAR or MAR. The only method giving unbiased and precise estimates even when data were MNAR was the EST method (Figure 4.5).

The CC method gave unbiased but less precise estimates under all investigated missing data mechanisms. The reason why there were no biases in the estimates of the population parameters was that, given the particular model used (with categorical covariate data), the missing data were MCAR when analysed with the CC method, independent of the missing data mechanism. However, when the covariate is continuous and data are MAR or MNAR, the CC method is known to result in biased parameter estimates [29, 36, 78].

Two types of single imputation were included in the evaluation, SImode and SIWT. The rationale for including these methods, even though they are known to perform worse than full maximum likelihood methods and multiple imputations [36, 48], was that these methods are common in nonlinear mixed effects modelling of clinical data and it was considered important to show the consequences of this type of inappropriate handling of missing covariate data. The SImode method is equivalent to imputing the mean or median value of a continuous covariate and, as was shown, this type of method underestimates the strength of the covariate–parameter relationship, resulting in biased and imprecise parameter estimates [29].

A more advanced method to handle the missing covariate would be to derive a model for the covariate based on other, completely observed, variables and then use that model for the imputation. The SIWT method and the MI method both used this strategy. In the SIWT method, the single imputation of sex was based on the observed weight, while in the MI method, the multiple imputations of sex were based on observed weight and information about the dependent variable (Cssi). When the missing data were handled with the SIWT method, all estimates of the population parameters, except the estimates of the residual error, were biased. Despite the bias, the estimates were quite precise, which means that all the single imputed data sets resulted in similar parameter estimates. This indicates that the main problem with this method was that the correlation between weight and sex was weak and that a logistic regression model based on only weight gave a poor description of the individuals' true sexes.

The MI method used both the weight and information about the dependent variable as covariates in the logistic regression model and this resulted in unbiased estimates when data were MCAR or MAR. The reason for this was not only that multiple imputations are better than single imputation but also that the dependent variable was more strongly correlated with sex than weight was. When applying a multiple imputations method, it is important to include all variables which can be predictive of the missing covariate or the underlying missing data mechanism in the regression model [51–54]. These arguments should also be applicable to single imputation when the imputations are based on observed predictors in the data. Even if single imputation is a less proper procedure than multiple imputations, the differences between the methods would have been smaller if the same logistic regression model had been used for simulating the single imputations.
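The difference between the two imputation strategies can be sketched as follows. The sketch assumes a logistic regression model whose coefficients have already been estimated; the coefficient values, the predictor and the rule used for the single imputation (choosing the most probable category) are hypothetical and are not taken from the thesis.

```python
import math
import random


def prob_male(predictors, coefficients, intercept):
    """P(male) from a logistic regression with given (hypothetical) coefficients."""
    lp = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-lp))


def single_imputation(p_male):
    """One imputed value: here assumed to be the most probable category."""
    return "male" if p_male >= 0.5 else "female"


def multiple_imputations(p_male, n_imputations, rng):
    """Several imputed values drawn from the predicted probability."""
    return ["male" if rng.random() < p_male else "female"
            for _ in range(n_imputations)]


# Hypothetical individual: weight 80 kg, model with weight as only predictor
p = prob_male(predictors=[80.0], coefficients=[0.1], intercept=-7.0)

print(f"P(male) = {p:.2f}")
print("single:  ", single_imputation(p))
print("multiple:", multiple_imputations(p, 5, random.Random(42)))
```

Adding further predictors, such as information about the dependent variable, only extends the `predictors` and `coefficients` lists. The multiple draws retain the uncertainty in the predicted sex, which a single imputed value discards.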

The MOD method gave estimates similar to those obtained using the MI method; both methods gave unbiased and precise estimates of all population parameters when data were MCAR or MAR, whereas there was a bias in parameter estimates when data were MNAR. Methods using multiple imputations or full maximum likelihood modelling yield similar results when handling missing data in linear fixed effect models, when the methods are implemented in comparable ways [36, 48, 54]; the same relative performance is expected for nonlinear mixed effects models. The MI method and the MOD method both use the individual weights and (information about) the dependent variables when fitting the data sets: the MI method in the logistic regression model used in the imputations of missing sexes, and the MOD method when maximising the likelihood of the dependent variables by 'estimating' the individuals' sexes (in the mixture model) using information about their weights. Hence they produce similar results.
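One way to write the mixture-model idea described above, for an individual i whose sex is missing, is given below. The notation is an assumption made for illustration (y_i denotes the individual's observations, WT_i the weight and θ the population parameters) and is not copied from the thesis.

```latex
L_i(\theta) = P(\text{male} \mid WT_i)\, L_i(y_i \mid \text{SEX} = \text{male}, \theta)
            + \bigl[1 - P(\text{male} \mid WT_i)\bigr]\, L_i(y_i \mid \text{SEX} = \text{female}, \theta)
```

Under this reading, the weight enters through the mixing probability and the dependent variable through the two conditional likelihoods, so the same two sources of information are used as in the MI method.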

The extra typical value parameter, which was estimated when using the EST method, corrected for the unknown missing data mechanism's dependence on the missing data when data were MNAR. Therefore, the EST method was the only method tested which gave unbiased and precise estimates independent of the missing data mechanism. When analysing data where a large proportion of the individuals lack information about one or more covariates, estimation of this extra parameter should be used to evaluate the model before any conclusions are drawn from the estimated parameters. A significant drop in OFV when including the extra parameter could be the result of a (logistic) regression model with poor predictability or be due to data being MNAR.

The evaluation of the MI method’s sensitivity to η-shrinkage showed that the MI method gave results similar to those obtained when all data were used in the analysis, independent of η-shrinkage. When the amount of information on the individual level decreases and the individual estimates shrink towards the typical values, the amount of information available in the dependent variable about the partly missing covariate decreases. The MI method, which extracts information about the partly missing covariate from the dependent variable, is hence insensitive to η-shrinkage even though the individual estimates, used in the imputations, suffer from shrinkage.
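For orientation, η-shrinkage is often quantified as one minus the ratio between the standard deviation of the individual η estimates and the estimated population standard deviation ω. The sketch below assumes this SD-based definition; the function name and the η values are hypothetical.

```python
import statistics

def eta_shrinkage(individual_etas, omega):
    """SD-based eta-shrinkage: 1 - SD(individual etas) / omega.

    omega is the estimated population standard deviation of eta.
    """
    return 1.0 - statistics.stdev(individual_etas) / omega

# Hypothetical individual estimates of eta for one parameter.
etas = [-0.3, -0.1, 0.1, 0.3]
sd = statistics.stdev(etas)
shrinkage = eta_shrinkage(etas, omega=2.0 * sd)
```

A value near 0 means the individual estimates are as variable as the population model implies, whereas a value near 1 means they have collapsed towards the typical value.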

The covariate used in the evaluation of the performance of the different methods for handling missing covariate data was categorical, but the same relative performances of the MI, MOD and EST methods are expected for continuous covariates, since the models in the continuous case build on virtually the same statistical principles. As there is no rounding off to the nearest category in the continuous case, the continuous counterparts of the SImode and SIWT methods are expected to perform relatively better but still not as well as the MI, MOD and EST methods.

All tested methods applied to handle missing covariate data were evaluated using the FOCE algorithm. Missing covariate data (and methods to handle these) are not expected to have a great impact on the likelihood surface, and therefore there was no need to use a more advanced (accurate but slower) algorithm to fit the models.

6. Conclusions

During the analysis of data collected during routine TDM in patients treated with high-dose methotrexate for osteosarcoma, several problems regarding missing data were observed. Missing observations caused problems in the choice of structural model for the PK data, and missing covariate data caused problems during the covariate model analysis. The missing methotrexate concentrations were (erroneously) assumed to be the reason why the more advanced structural model (three-compartment disposition model) fitted the data significantly (p < 0.05) better than the less advanced structural model (two-compartment model). Further investigation revealed that a correct structural model was obtainable, regardless of which estimation algorithm was used to fit the models, as long as the data were MCAR or MAR. Since the missing methotrexate concentrations were MAR, the correct structural model for methotrexate should be a three-compartment disposition model, and not the so often presented two-compartment model.

The evaluation and comparison of the performance of different estimation algorithms showed that the IMP algorithm was the most reliable algorithm regarding bias and precision of parameter estimates, independent of the type of data analysed. The difference in performance between the gradient based algorithms (in this case FOCE and LAPLACE) and the EM algorithms (IMP, IMPMAP and SAEM) was greater when fitting models to data where part of the observations were missing than when fitting models to different types of PD data. Contrary to what is usually assumed regarding maximum likelihood estimation when applied to data sets with missing observations, the parameter estimates obtained when using the FOCE algorithm (and to some extent also the LAPLACE algorithm) were not reliable, regardless of whether data were MCAR, MAR or MNAR. This indicates that the approximation(s) of the likelihood used by the FOCE (and LAPLACE) algorithm(s) might not be good enough when there is an imbalance in the distribution of observations over time after dose. It is therefore recommended that models are fitted using one of the sampling based EM algorithms (IMP, IMPMAP or SAEM) when observation data are MCAR or MAR. When data are MNAR, a proper method for handling the missing data should be used to obtain a correct structural model and unbiased and precise parameter estimates, independent of the applied estimation algorithm.

When no observation data were missing, the gradient based algorithms performed well in comparison with the sampling based EM algorithms. However, at strategic points in the model development process, it is still recommended that parameter estimates obtained with FOCE/LAPLACE are compared with those obtained when applying the IMP, IMPMAP or SAEM algorithm.

The evaluation and comparison of the performance of full maximum likelihood modelling and (multiple) imputation methods, when applied to handle missing covariate data, showed that both multiple imputations (the suggested method was shown to be insensitive to η-shrinkage) and full maximum likelihood modelling were good approaches for handling missing covariate data when data were MCAR or MAR. When the covariate data were MNAR, the only method resulting in unbiased and precise parameter estimates was a full maximum likelihood modelling approach where an extra parameter was estimated, correcting for the unknown missing data mechanism’s dependence on the missing data. Since the missing data mechanism is usually unknown, full maximum likelihood modelling, with and without this extra parameter, could be applied to investigate if the data are MNAR.

This thesis presents new insight into the dynamics of missing data in nonlinear mixed effects modelling. Strategies for handling different types of missing data have been developed and compared in order to provide guidance for efficient handling and consequences of inappropriate handling of missing data.

7. Acknowledgements

The work presented in this thesis was carried out at the Department of Pharmaceutical Biosciences, Faculty of Pharmacy, Uppsala University, Sweden. I am grateful for the financial support from Apotekarsocieteten, which made it possible for me to attend a number of very interesting and instructive courses.

There are many people who (directly or indirectly) have contributed to this thesis. I feel very, very lucky to know you all! Special thanks to:

My primary supervisor Mats Karlsson for teaching me how to become a scientist. Thank you for all our interesting and inspiring discussions and the stimulating research projects. Thank you for believing in me and letting me do things my way.

My secondary supervisor Andrew Hooker for introducing me to the subject of pharmacometrics. Thank you for your valuable input, I wish I would have had the chance to work on more than one project with you.

Margareta Hammarlund-Udenaes for contributing to such a nice work environment. Thank you for giving me the opportunity to develop my teaching skills and for all your support.

My master thesis supervisors Joseph Standing (co-author Paper I) and Stefanie Hennig for teaching me a great many things during just a few months. Thank you for your support, then and now.

Kajsa Harling and Rikard Nordgren for all the help with new PsN functionalities, Jerker Nyberg for keeping our cluster in such good condition, and Magnus Jansson for all the help with computer related problems, including a full mug of tea on the wrong (in-)side of the laptop keyboard.

My roomie, co-author (Paper II) and dear friend Sebastian Ueckert, these last months would have been much scarier without you! Thank you for sharing your thoughts and fears about the future and thank you for all the laughs we’ve had. I’ve benefited a lot from our discussions, you are brilliant!

My co-author (Paper II) and dear friend Elodie Plan, your thoughts and ideas are invaluable.

My former roomies and dear friends: Martin Bergstrand for "bullshit-buttons", "flying-monkeys" and other crazy ideas; Ami Mohamed for beautiful (but a few too many) subway stations in Moscow and "fall-in-love" ice-cream in Seoul; and Chunli Chen for tasty tea and an extra big thank you for looking after my cats when I was in Uganda!

My wonderful friend Angelica Quartino for all our late night talks and for sharing struggles and success. I miss you a lot! I wish San Francisco was closer to Sweden...

My dear friends Emilie Hénin, Jan-Stefan van der Walt, Brigitte Lacroix, Alexandre Sostelly and Paul Westwood; some people pass through your life without leaving any marks while others stay in your heart forever no matter how seldom you meet or talk. You are my friends for life! Although I really dislike that you left Uppsala and me behind...

My wonderful, crazy friend Waqas Sadiq for your great laugh and crazy ideas. Thank you for being such a wonderful friend and thank you for deciding to do a postdoc in Uppsala instead of leaving (me) like the others did... I’m very sorry that I will have to leave you now, I hope you can forgive me!

My great friend Elke Krekels, I don’t know how to thank you! Without you this thesis would have been without a ’Discussion’... The dinner is on me next time! I’m looking forward to continue our discussions in Leiden.

My dear friend Camille Vong for all our nice dinners, late night talks and sharing of ideas and fears about the future. Thank you for chasing me in Seoul, I needed it!

David Khan for being such a nice and caring friend.

Elin Svensson and Anders Kristoffersson for being great friends with great minds. Thank you for all your valuable input during the development of the statistics course.

Jörgen Bengtsson, Lena Klarén, Ann-Marie Falk, Srebrenka Dobric, Agneta Freijs, Emma Lundkvist and Maria Swartling for all the inspiring discussions about teaching and for supporting and encouraging me.

Birgitta Rylén, Karin Tjäder, Annette Svensson Lindgren, Ulrica Bergström and Marina Rönngren for administrative support during my time as a teacher and my time as a PhD student.

All the fantastic and wonderful people at the department (past and present). The inspiring working environment you all create is really unique and I’m immensely thankful for the years I’ve been part of it.

I also want to thank the people at the Red Cross in Uppsala, especially Linda Wickström, Johan Jansson and Nina Carlsson. Thank you for all the laughs and for reminding me that there are more important things in life than pharmacometrics.

My dear mamma and pappa, you are the best parents anyone can have and I’m so proud of you! Thank you for all your love and care.

My big little brother Tomas for picking me up when I was on the bottom, securing the rope when I’m climbing and safeguarding when I’m lifting weights. You keep my feet on the ground and my mind clear, what would I do without you?

My best friend Richard for all the great times we’ve shared throughout the years. Thank you for your patience and constant support. Thank you for being one of the very few people who will ever read this thesis. Thank you for all the times you’ve kept me company over skype when I’ve worked till 4 o’clock in the morning. Thank you for all the ice-creams (I think I owe you one now) and for always being by my side. You are the best friend anyone can have!

My dear, awesome, fantastic Thomas, I’m the luckiest person in the world! ’Great minds think alike’; thank you for all our fruitful discussions about my research and for proof reading my thesis. Thank you for making me laugh, each and every day, even during stressful periods. You make me so happy and I love you so much! Pusspuss!

References

[1] Biomarkers Definitions Working Group, “Biomarkers and surrogate endpoints: preferred definitions and conceptual framework,” Clin. Pharmacol. Ther., vol. 69, pp. 89–95, Mar 2001.

[2] C. P. Adams and V. V. Brantner, “Spending on new drug development,” Health Economics, vol. 19, no. 2, pp. 130–141, 2010.

[3] R. L. Lalonde, K. G. Kowalski, M. M. Hutmacher, W. Ewy, D. J. Nichols, P. A. Milligan, B. W. Corrigan, P. A. Lockwood, S. A. Marshall, L. J. Benincosa, T. G. Tensfeldt, K. Parivar, M. Amantea, P. Glue, H. Koide, and R. Miller, “Model-based drug development,” Clin. Pharmacol. Ther., vol. 82, pp. 21–32, Jul 2007.

[4] L. B. Sheiner, “Learning versus confirming in clinical drug development,” Clin. Pharmacol. Ther., vol. 61, pp. 275–291, Mar 1997.

[5] P. A. Milligan, M. J. Brown, B. Marchant, S. W. Martin, P. H. van der Graaf, N. Benson, G. Nucci, D. J. Nichols, R. A. Boyd, J. W. Mandema, S. Krishnaswami, S. Zwillich, D. Gruben, R. J. Anziano, T. C. Stock, and R. L. Lalonde, “Model-based drug development: a rational approach to efficiently accelerate drug development,” Clin. Pharmacol. Ther., vol. 93, pp. 502–514, Jun 2013.

[6] A. K. Hamberg, M. L. Dahl, M. Barban, M. G. Scordo, M. Wadelius, V. Pengo, R. Padrini, and E. N. Jonsson, “A PK-PD model for predicting the impact of age, CYP2C9, and VKORC1 genotype on individualization of warfarin therapy,” Clin. Pharmacol. Ther., vol. 81, pp. 529–538, Apr 2007.

[7] J. E. Wallin, L. E. Friberg, A. Fasth, and C. E. Staatz, “Population pharmacokinetics of tacrolimus in pediatric hematopoietic stem cell transplant recipients: new initial dosage suggestions and a model-based dosage adjustment tool,” Ther Drug Monit, vol. 31, pp. 457–466, Aug 2009.

[8] A. K. Hamberg, L. E. Friberg, K. Hanseus, B. M. Ekman-Joelsson, J. Sunnegardh, A. Jonzon, B. Lundell, E. N. Jonsson, and M. Wadelius, “Warfarin dose prediction in children using pharmacometric bridging–comparison with published pharmacogenetic dosing algorithms,” Eur. J. Clin. Pharmacol., vol. 69, pp. 1275–1283, Jun 2013.

[9] W. W. Hope, M. Vanguilder, J. P. Donnelly, N. M. Blijlevens, R. J. Bruggemann, R. W. Jelliffe, and M. N. Neely, “Software for dosage individualization of voriconazole for immunocompromised patients,” Antimicrob. Agents Chemother., vol. 57, pp. 1888–1894, Apr 2013.

[10] L. Sheiner, B. Rosenberg, and V. Marathe, “Estimation of population characteristics of pharmacokinetic parameters from routine clinical data,” Journal of Pharmacokinetics and Biopharmaceutics, vol. 5, no. 5, pp. 445–479, 1977.

[11] A. Agresti, “Modelling ordered categorical data: recent advances and future challenges,” Stat Med, vol. 18, no. 17-18, pp. 2191–2207, 1999.

[12] B. Frame, R. Miller, and R. L. Lalonde, “Evaluation of mixture modeling with count data using NONMEM,” J Pharmacokinet Pharmacodyn, vol. 30, pp. 167–183, Jun 2003.

[13] M. O. Karlsson and L. B. Sheiner, “The importance of modeling interoccasion variability in population pharmacokinetic analyses,” J Pharmacokinet Biopharm, vol. 21, pp. 735–750, Dec 1993.

[14] S. Laporte-Simitsidis, P. Girard, P. Mismetti, S. Chabaud, H. Decousus, and J.-P. Boissel, “Inter-study variability in population pharmacokinetic meta-analysis: When and how to estimate it?,” Journal of Pharmaceutical Sciences, vol. 89, no. 2, pp. 155–167, 2000.

[15] G. B. West, J. H. Brown, and B. J. Enquist, “A general model for the origin of allometric scaling laws in biology,” Science, vol. 276, pp. 122–126, Apr 1997.

[16] G. B. West, J. H. Brown, and B. J. Enquist, “The fourth dimension of life: fractal geometry and allometric scaling of organisms,” Science, vol. 284, pp. 1677–1679, Jun 1999.

[17] Y. Wang, “Derivation of various NONMEM estimation methods,” J Pharmacokinet Pharmacodyn, vol. 34, pp. 575–593, Oct 2007.

[18] D. Lunn, N. Best, A. Thomas, J. Wakefield, and D. Spiegelhalter, “Bayesian analysis of population PK/PD models: General concepts and software,” Journal of Pharmacokinetics and Pharmacodynamics, vol. 29, no. 3, pp. 271–307, 2002.

[19] S. Beal, L. Sheiner, A. Boeckmann, and R. Bauer, “NONMEM user’s guides (1989-2009),” tech. rep., Icon Development Solutions, Ellicott City, MD, USA, 2009.

[20] U. Wählby, E. N. Jonsson, and M. O. Karlsson, “Assessment of actual significance levels for covariate effects in NONMEM,” Journal of Pharmacokinetics and Pharmacodynamics, vol. 28, no. 3, pp. 231–252, 2001.

[21] R. J. Bauer, S. Guzy, and C. Ng, “A survey of population analysis methods and software for complex pharmacokinetic and pharmacodynamic models with examples,” The AAPS Journal, vol. 9, no. 1, pp. E60–E83, 2007.

[22] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” Journal of the Royal Statistical Society, Series B, vol. 39, no. 1, pp. 1–38, 1977.

[23] F. Mentré and R. Gomeni, “A two-step iterative algorithm for estimation in nonlinear mixed-effect models with an evaluation in population pharmacokinetics,” Journal of Biopharmaceutical Statistics, vol. 5, no. 2, pp. 141–158, 1995. PMID: 7581424.

[24] G. C. G. Wei and M. A. Tanner, “A Monte Carlo implementation of the EM algorithm and the poor man’s data augmentation algorithms,” Journal of the American Statistical Association, vol. 85, no. 411, pp. 699–704, 1990.

[25] B. Delyon, M. Lavielle, and E. Moulines, “Convergence of a stochastic approximation version of the EM algorithm,” The Annals of Statistics, vol. 27, pp. 94–128, Mar 1999.

[26] E. Kuhn and M. Lavielle, “Coupling a stochastic approximation version of EM with an MCMC procedure,” ESAIM: Probability and Statistics, vol. 8, pp. 115–131, Sep 2004.

[27] A. Dokoumetzidis and L. Aarons, “Propagation of population pharmacokinetic information using a Bayesian approach: comparison with meta-analysis,” J Pharmacokinet Pharmacodyn, vol. 32, pp. 401–418, Aug 2005.

[28] D. B. Rubin, “Inference and missing data,” Biometrika, vol. 63, no. 3, pp. 581–592, 1976.

[29] R. Little and D. Rubin, Statistical analysis with missing data. Wiley series in probability and mathematical statistics. Probability and mathematical statistics, Wiley, 2002.

[30] P. Diggle and M. G. Kenward, “Informative drop-out in longitudinal data analysis,” Applied statistics, pp. 49–93, 1994.

[31] F. Odoul, C. Le Guellec, J. P. Lamagnere, D. Breilh, M. C. Saux, G. Paintaud, and E. Autret-Leca, “Prediction of methotrexate elimination after high dose infusion in children with acute lymphoblastic leukaemia using a population pharmacokinetic approach,” Fundam Clin Pharmacol, vol. 13, no. 5, pp. 595–604, 1999.

[32] D. Aumente, D. S. Buelga, J. C. Lukas, P. Gomez, A. Torres, and M. J. Garcia, “Population pharmacokinetics of high-dose methotrexate in children with acute lymphoblastic leukaemia,” Clin Pharmacokinet, vol. 45, no. 12, pp. 1227–1238, 2006.

[33] C. Piard, F. Bressolle, M. Fakhoury, D. Zhang, K. Yacouben, A. Rieutord, and E. Jacqz-Aigrain, “A limited sampling strategy to estimate individual pharmacokinetic parameters of methotrexate in children with acute lymphoblastic leukemia,” Cancer Chemother. Pharmacol., vol. 60, pp. 609–620, Sep 2007.

[34] C. Dupuis, C. Mercier, C. Yang, S. Monjanel-Mouterde, J. Ciccolini, R. Fanciullino, B. Pourroy, J. L. Deville, F. Duffaud, D. Bagarry-Liegey, A. Durand, A. Iliadis, and R. Favre, “High-dose methotrexate in adults with osteosarcoma: a population pharmacokinetics study and validation of a new limited sampling strategy,” Anticancer Drugs, vol. 19, pp. 267–273, Mar 2008.

[35] H. Colom, R. Farre, D. Soy, C. Peraire, J. M. Cendros, N. Pardo, M. Torrent, J. Domenech, and M. A. Mangues, “Population pharmacokinetics of high-dose methotrexate after intravenous administration in pediatric patients with osteosarcoma,” Ther Drug Monit, vol. 31, pp. 76–85, Feb 2009.

[36] J. L. Schafer and J. W. Graham, “Missing data: our view of the state of the art,” Psychol Methods, vol. 7, pp. 147–177, Jun 2002.

[37] S. L. Beal, “Ways to fit a PK model with some data below the quantification limit,” Journal of Pharmacokinetics and Pharmacodynamics, vol. 28, no. 5, pp. 481–504, 2001.

[38] M. Bergstrand and M. O. Karlsson, “Handling data below the limit of quantification in mixed effect models,” AAPS J, vol. 11, pp. 371–380, Jun 2009.

[39] J. P. Hing, S. G. Woolfrey, D. Greenslade, and P. M. Wright, “Analysis of toxicokinetic data using NONMEM: impact of quantification limit and replacement strategies for censored data,” J Pharmacokinet Pharmacodyn, vol. 28, pp. 465–479, Oct 2001.

[40] V. Duval and M. O. Karlsson, “Impact of omission or replacement of data below the limit of quantification on parameter estimates in a two-compartment model,” Pharm. Res., vol. 19, pp. 1835–1840, Dec 2002.

[41] W. Byon, C. V. Fletcher, and R. C. Brundage, “Impact of censoring data below an arbitrary quantification limit on structural model misspecification,” J Pharmacokinet Pharmacodyn, vol. 35, pp. 101–116, Feb 2008.

[42] J. E. Ahn, M. O. Karlsson, A. Dunne, and T. M. Ludden, “Likelihood based approaches to handling data below the quantification limit using NONMEM VI,” J Pharmacokinet Pharmacodyn, vol. 35, pp. 401–421, Aug 2008.

[43] T. Orchard and M. A. Woodbury, “A missing information principle: theory and applications,” Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1: Theory of Statistics, pp. 697–715, 1972.

[44] S. F. Buck, “A method of estimation of missing values in multivariate data suitable for use with an electronic computer,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 22, no. 2, pp. 302–306, 1960.

[45] J. E. Walsh, “Computer-feasible method for handling incomplete data in regression analysis,” J. ACM, vol. 8, pp. 201–211, Apr. 1961.

[46] D. Rubin, “Multiple imputations in sample surveys - a phenomenological Bayesian approach to nonresponse,” Proceedings of the survey research methods section: American Statistical Association, vol. 1, pp. 20–28, 1978.

[47] D. Rubin, Multiple Imputation for Nonresponse in Surveys. Wiley Classics Library, Wiley, 1987.

[48] R. J. Little, “Regression with missing x’s: a review,” Journal of the American Statistical Association, vol. 87, no. 420, pp. 1227–1237, 1992.

[49] H. Wu and L. Wu, “A multiple imputation method for missing covariates in non-linear mixed-effects models with application to HIV dynamics,” Stat Med, vol. 20, pp. 1755–1769, Jun 2001.

[50] P. Bonate, Pharmacokinetic-Pharmacodynamic Modeling and Simulation. Springer, 2006.

[51] X.-L. Meng et al., “Multiple-imputation inferences with uncongenial sources of input,” Statistical Science, vol. 9, no. 4, pp. 538–558, 1994.

[52] D. B. Rubin, “Multiple imputation after 18+ years,” Journal of the American Statistical Association, vol. 91, no. 434, pp. 473–489, 1996.

[53] J. Schafer, Analysis of Incomplete Multivariate Data. Chapman & Hall/CRC Monographs on Statistics & Applied Probability, Taylor & Francis, 1997.

[54] L. M. Collins, J. L. Schafer, and C.-M. Kam, “A comparison of inclusive and restrictive strategies in modern missing data procedures,” Psychological Methods, vol. 6, no. 4, p. 330, 2001.

[55] P. O. Maitre, M. Buhrer, D. Thomson, and D. R. Stanski, “A three-step approach combining Bayesian regression and NONMEM population analysis: application to midazolam,” J Pharmacokinet Biopharm, vol. 19, pp. 377–384, Aug 1991.

[56] J. W. Mandema, D. Verotta, and L. B. Sheiner, “Building population pharmacokinetic–pharmacodynamic models. I. Models for covariate effects,” J Pharmacokinet Biopharm, vol. 20, pp. 511–528, Oct 1992.

[57] L. B. Sheiner and S. L. Beal, “Bayesian individualization of pharmacokinetics: simple implementation and comparison with non-Bayesian methods,” J Pharm Sci, vol. 71, pp. 1344–1348, Dec 1982.

[58] N. Christophidis, W. J. Louis, I. Lucas, W. Moon, and F. J. Vajda, “Renal clearance of methotrexate in man during high-dose oral and intravenous infusion therapy,” Cancer Chemother. Pharmacol., vol. 6, no. 1, pp. 59–64, 1981.

[59] B. Winograd, R. J. Lippens, M. J. Oosterbaan, M. J. Dirks, T. B. Vree, and E. van der Kleijn, “Renal excretion and pharmacokinetics of methotrexate and 7-hydroxy-methotrexate following a 24-h high dose infusion of methotrexate in children,” Eur. J. Clin. Pharmacol., vol. 30, no. 2, pp. 231–238, 1986.

[60] M. M. Rhodin, B. J. Anderson, A. M. Peters, M. G. Coulthard, B. Wilkins, M. Cole, E. Chatelut, A. Grubb, G. J. Veal, M. J. Keir, and N. H. Holford, “Human renal function maturation: a quantitative description using weight and postmenstrual age,” Pediatr. Nephrol., vol. 24, pp. 67–76, Jan 2009.

[61] M. B. Maia, S. Saivin, E. Chatelut, M. F. Malmary, and G. Houin, “In vitro and in vivo protein binding of methotrexate assessed by microdialysis,” Int J Clin Pharmacol Ther, vol. 34, pp. 335–341, Aug 1996.

[62] F. Ceriotti, J. C. Boyd, G. Klein, J. Henny, J. Queraltó, V. Kairisto, and M. Panteghini, on behalf of the IFCC Committee on Reference Intervals and Decision Limits (C-RIDL), “Reference intervals for serum creatinine concentrations: Assessment of available data for global application,” Clinical Chemistry, vol. 54, no. 3, pp. 559–566, 2008.

[63] W. Junge, B. Wilke, A. Halabi, and G. Klein, “Determination of reference intervals for serum creatinine, creatinine excretion and creatinine clearance with an enzymatic and a modified Jaffé method,” Clin. Chim. Acta, vol. 344, pp. 137–148, Jun 2004.

[64] S. Janmahasatian, S. B. Duffull, S. Ash, L. C. Ward, N. M. Byrne, and B. Green, “Quantification of lean bodyweight,” Clin Pharmacokinet, vol. 44, no. 10, pp. 1051–1065, 2005.

[65] L. Brynne, J. L. McNay, H. G. Schaefer, K. Swedberg, C. G. Wiltse, and M. O. Karlsson, “Pharmacodynamic models for the cardiovascular effects of moxonidine in patients with congestive heart failure,” British Journal of Clinical Pharmacology, vol. 51, no. 1, pp. 35–43, 2001.

[66] K. Tunblad, L. Lindbom, L. McFadyen, E. N. Jonsson, S. Marshall, and M. O. Karlsson, “The use of clinical irrelevance criteria in covariate model building with application to dofetilide pharmacokinetic data,” J Pharmacokinet Pharmacodyn, vol. 35, pp. 503–526, Oct 2008.

[67] X. S. Xu, M. Yuan, M. O. Karlsson, A. Dunne, P. Nandy, and A. Vermeulen, “Shrinkage in nonlinear mixed-effects population models: quantification, influencing factors, and impact,” AAPS J, vol. 14, pp. 927–936, Dec 2012.

[68] L. Lindbom, J. Ribbing, and E. N. Jonsson, “Perl-speaks-NONMEM (PsN)–a Perl module for NONMEM related programming,” Comput Methods Programs Biomed, vol. 75, pp. 85–94, Aug 2004.

[69] L. Lindbom, P. Pihlgren, E. N. Jonsson, and N. Jonsson, “PsN-Toolkit–a collection of computer intensive statistical methods for non-linear mixed effect modeling using NONMEM,” Comput Methods Programs Biomed, vol. 79, pp. 241–257, Sep 2005.

[70] R Core Team, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2013.

[71] B. Efron, “Bootstrap methods: Another look at the jackknife,” The Annals of Statistics, vol. 7, no. 1, pp. 1–26, 1979.

[72] M. O. Karlsson and N. Holford, “A tutorial on Visual Predictive Checks.” PAGE 17, Abstr 1434, 2008.

[73] S. Siegel, Nonparametric statistics for the behavioral sciences. McGraw-Hill series in psychology, McGraw-Hill, 1956.

[74] C. Sabot, J. Debord, B. Roullet, P. Marquet, L. Merle, and G. Lachatre, “Comparison of 2- and 3-compartment models for the Bayesian estimation of methotrexate pharmacokinetics,” Int J Clin Pharmacol Ther, vol. 33, pp. 164–169, Mar 1995.

[75] A. Donner, “The relative effectiveness of procedures commonly used in multiple regression analysis for dealing with missing values,” The American Statistician, vol. 36, no. 4, pp. 378–381, 1982.

[76] U.S. Department of Health and Human Services, Food and Drug Administration, “Population Pharmacokinetics.” http://www.fda.gov/downloads/ScienceResearch/SpecialTopics/WomensHealthResearch/UCM133184.pdf, 1999.

[77] E. I. Ette, H. Sun, and T. M. Ludden, “Ignorability and parameter estimation in longitudinal pharmacokinetic studies,” The Journal of Clinical Pharmacology, vol. 38, no. 3, pp. 221–226, 1998.

[78] F. Harrell, Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. Graduate Texts in Mathematics, Springer, 2001.

Acta Universitatis Upsaliensis
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Pharmacy 189

Editor: The Dean of the Faculty of Pharmacy

A doctoral dissertation from the Faculty of Pharmacy, Uppsala University, is usually a summary of a number of papers. A few copies of the complete dissertation are kept at major Swedish research libraries, while the summary alone is distributed internationally through the series Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Pharmacy. (Prior to January, 2005, the series was published under the title “Comprehensive Summaries of Uppsala Dissertations from the Faculty of Pharmacy”.)

Distribution: publications.uu.se
urn:nbn:se:uu:diva-224098
