Estimation of Parameter Values
Nutrition Models Workshop
Luis Moraes
The Ohio State University
06/25/2017
Outline
• Nutrition models are VERY diverse
  – Combination of empirical, mechanistic, dynamic and static models
  – Regression, linear and nonlinear mixed models, differential equations
• Today: Main approaches for estimating parameters in a variety of models
  – Some mathematical description
  – Idea is for you to understand the reasoning and challenges of different approaches
• One exercise/demonstration at the end
  – Fit model with two approaches
Introduction
• Different types of models have been used for nutrition modeling
  – Compartmental, regression, meta-analysis, nonlinear mixed models, …
• One feature is common to almost all these models
  – Parameters are needed to describe the system
  – Quantify relationship between variables
Introduction
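• Simple example: linear regression
$$Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$
[Figure: plot of y versus x]
• Yi is the response variable for the ith observation
• xi is the predictor variable in the ith observation
• β0 is the intercept
• β1 is the slope
• εi is the error, E[εi] = 0, Var[εi] = σ² and εi are independent
• i = 1, …, n
In matrix notation: $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$
$$\begin{bmatrix} Y_1 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$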
Introduction
• In practice, the true parameter values are unknown
  – They are estimated from a sample
• Parameters have to be optimal in some sense
  – Least squares estimators minimize squared errors
  – Maximum likelihood estimators maximize the likelihood function
Least Squares Estimators
Least Squares Estimators
• The least squares estimators minimize the sum of squared errors:
$$Q = \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 x_i \right)^2$$
• How do we get them?
• We can find points of minimum and maximum of a function using derivatives.
For example, for $f(x) = 160x - 16x^2$
Set derivative to zero and “solve” for x:
160 – 32x = 0
x = 5
Second derivative test: f''(x) = −32 < 0, so x = 5 is a maximum
Least Squares Estimators
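$$\frac{\partial Q}{\partial \beta_0} = -2 \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 x_i \right)$$
$$\frac{\partial Q}{\partial \beta_1} = -2 \sum_{i=1}^{n} x_i \left( Y_i - \beta_0 - \beta_1 x_i \right)$$
Setting these partial derivatives to zero, we construct the normal equations
$$\sum_{i=1}^{n} Y_i = n b_0 + b_1 \sum_{i=1}^{n} x_i$$
$$\sum_{i=1}^{n} x_i Y_i = b_0 \sum_{i=1}^{n} x_i + b_1 \sum_{i=1}^{n} x_i^2$$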
Least Squares Estimators
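• The least squares estimators are the solutions to the normal equations
$$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
$$b_0 = \bar{Y} - b_1 \bar{x}$$
• The concept extends to multiple regression
$$Q = \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 x_{i,1} - \dots - \beta_{p-1} x_{i,p-1} \right)^2$$
• General form of the least squares estimators:
$$\mathbf{b} = \left( \mathbf{X}^T \mathbf{X} \right)^{-1} \mathbf{X}^T \mathbf{Y}$$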
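The closed-form estimators can be checked numerically. The following is a minimal sketch in plain Python: the data values and the function name `least_squares_fit` are assumptions made for illustration and are not part of the workshop material. It computes b0 and b1 and confirms that both normal equations hold at the solution.

```python
def least_squares_fit(x, y):
    """Closed-form least squares estimates (b0, b1) for Y = b0 + b1*x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data, for illustration only
x = [0.0, 20.0, 40.0, 60.0, 80.0, 100.0]
y = [12.0, 19.0, 33.0, 38.0, 52.0, 58.0]

b0, b1 = least_squares_fit(x, y)

# Both normal equations should hold at (b0, b1), up to rounding error
n = len(x)
normal_eq_1 = sum(y) - (n * b0 + b1 * sum(x))
normal_eq_2 = sum(xi * yi for xi, yi in zip(x, y)) - (
    b0 * sum(x) + b1 * sum(xi ** 2 for xi in x)
)

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"normal equation residuals: {normal_eq_1:.2e}, {normal_eq_2:.2e}")
```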
Least Squares Estimators
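• Estimates of the uncertainty associated with these parameters
• Estimator of the error’s variance
$$MSE = \frac{1}{n - p} \sum_{i=1}^{n} \left( Y_i - b_0 - b_1 x_{i,1} - \dots - b_{p-1} x_{i,p-1} \right)^2$$
• Estimated variance-covariance matrix of the parameters
$$\widehat{\mathrm{Var}}(\mathbf{b}) = MSE \left( \mathbf{X}^T \mathbf{X} \right)^{-1}$$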
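As a rough illustration of these two quantities, the sketch below works through the simple linear regression case (p = 2), where the 2 x 2 matrix XᵀX can be inverted by hand. The data are hypothetical and the variable names are chosen only for this example.

```python
# Hypothetical data, for illustration only
x = [0.0, 20.0, 40.0, 60.0, 80.0, 100.0]
y = [12.0, 19.0, 33.0, 38.0, 52.0, 58.0]
n = len(x)
p = 2  # number of regression parameters (intercept and slope)

# Elements of X^T X and X^T Y for the design matrix X = [1, x]
sum_x = sum(x)
sum_x2 = sum(xi ** 2 for xi in x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Inverse of the 2 x 2 matrix X^T X
det = n * sum_x2 - sum_x ** 2
xtx_inv = [[sum_x2 / det, -sum_x / det],
           [-sum_x / det, n / det]]

# b = (X^T X)^{-1} X^T Y
b0 = xtx_inv[0][0] * sum_y + xtx_inv[0][1] * sum_xy
b1 = xtx_inv[1][0] * sum_y + xtx_inv[1][1] * sum_xy

# MSE = SSE / (n - p)
sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - p)

# Estimated variance-covariance matrix of (b0, b1): MSE * (X^T X)^{-1}
cov_b = [[mse * v for v in row] for row in xtx_inv]
se_b0 = cov_b[0][0] ** 0.5
se_b1 = cov_b[1][1] ** 0.5

print(f"b0 = {b0:.4f} (SE {se_b0:.4f}), b1 = {b1:.4f} (SE {se_b1:.4f})")
print(f"MSE = {mse:.4f}")
```

The square roots of the diagonal elements are the standard errors of the estimates; the off-diagonal element is their estimated covariance.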
Nonlinear Models
• So far, we can estimate parameters in linear models
• Many phenomena in biology are nonlinear
  – For example, reaction velocity vs. substrate concentration in an enzymatic reaction
• Before we start with nonlinear models, let’s clarify
$$Y_i = \frac{\beta_1}{1 + \beta_2 \exp(\beta_3 x_i)} + \varepsilon_i$$
is a nonlinear model
$$Y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \varepsilon_i$$
is a linear model (it is linear in the parameters)
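One standard way to check the distinction, added here as a supplementary note, is to look at the partial derivatives of the mean function with respect to the parameters. In a linear model they do not involve the parameters; in a nonlinear model at least one of them does. For the quadratic model and for a term of the form $\exp(\beta_3 x_i)$:

```latex
\frac{\partial}{\partial \beta_2}\left(\beta_0 + \beta_1 x_i + \beta_2 x_i^2\right) = x_i^2
\quad \text{(free of the parameters: linear)}

\frac{\partial}{\partial \beta_3}\exp(\beta_3 x_i) = x_i \exp(\beta_3 x_i)
\quad \text{(depends on } \beta_3 \text{: nonlinear)}
```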
Nonlinear Models
[Figure: Michaelis-Menten Kinetics. Reaction Rate versus Substrate Concentration]
$$v = \frac{V_{max} S}{K_M + S}$$
Nonlinear Regression
$$Y_i = f(x_i, \boldsymbol{\theta}) + \varepsilon_i$$
f is the nonlinear function describing the relationship between Y and x
Michaelis-Menten example:
$$y_i = \frac{V_{max} x_i}{K_M + x_i} + \varepsilon_i, \qquad f(x_i, \boldsymbol{\theta}) = \frac{V_{max} x_i}{K_M + x_i}$$
• yi is the reaction rate for the ith observation
• xi is the associated substrate concentration
• $\boldsymbol{\theta} = (V_{max}, K_M)^T$ are the parameters to be estimated
• εi is the error, E[εi] = 0, Var[εi] = σ² and independent
• i = 1, …, n
Least Squares Estimators
• For the simple linear regression model, least squares minimize
$$Q = \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 x_i \right)^2$$
• For the nonlinear regression, the idea is the same: minimize
$$Q = \sum_{i=1}^{n} \left( Y_i - f(x_i, \boldsymbol{\theta}) \right)^2$$
Least Squares Estimators
• Solutions to the normal equations are often difficult to obtain analytically
• Numerical algorithms
  – For example, Gauss-Newton
• Require initial values to initialize numerical procedures
Gauss-Newton
• Default in PROC NLIN and nls()
• Approximate the nonlinear model with linear terms
• Taylor series expansion and least squares as for linear regression
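• Denote the least squares estimates $\mathbf{g}$ and the initial values
$$\mathbf{g}^{(0)} = \left( g_0^{(0)}, g_1^{(0)}, \dots, g_{p-1}^{(0)} \right)$$
• Approximation around starting values:
$$f(x_i, \boldsymbol{\theta}) \approx f(x_i, \mathbf{g}^{(0)}) + \sum_{k=0}^{p-1} \left[ \frac{\partial f(x_i, \boldsymbol{\theta})}{\partial \theta_k} \right]_{\boldsymbol{\theta} = \mathbf{g}^{(0)}} \left( \theta_k - g_k^{(0)} \right)$$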
Gauss-Newton
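• Model approximation
$$Y_i \approx f(x_i, \mathbf{g}^{(0)}) + \sum_{k=0}^{p-1} \left[ \frac{\partial f(x_i, \boldsymbol{\theta})}{\partial \theta_k} \right]_{\boldsymbol{\theta} = \mathbf{g}^{(0)}} \left( \theta_k - g_k^{(0)} \right) + \varepsilon_i$$
• It is a linear model!
$$\mathbf{Y}^{(0)} \approx \mathbf{D}^{(0)} \boldsymbol{\beta}^{(0)} + \boldsymbol{\varepsilon}$$
where $Y_i^{(0)} = Y_i - f(x_i, \mathbf{g}^{(0)})$, $D_{ik}^{(0)} = \left[ \partial f(x_i, \boldsymbol{\theta}) / \partial \theta_k \right]_{\boldsymbol{\theta} = \mathbf{g}^{(0)}}$ and $\beta_k^{(0)} = \theta_k - g_k^{(0)}$
• Estimate parameters by least squares:
$$\mathbf{b}^{(0)} = \left( \mathbf{D}^{(0)T} \mathbf{D}^{(0)} \right)^{-1} \mathbf{D}^{(0)T} \mathbf{Y}^{(0)}$$
• Update:
$$\mathbf{g}^{(1)} = \mathbf{g}^{(0)} + \mathbf{b}^{(0)}$$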
Gauss-Newton
• Evaluation criteria:
$$SSE^{(0)} = \sum_{i=1}^{n} \left( Y_i - f(x_i, \mathbf{g}^{(0)}) \right)^2$$
• Start the process again with $\mathbf{g}^{(1)}$ as the initial values
• Repeat procedure until $SSE^{(s+1)} - SSE^{(s)}$ is negligible
• Estimate of error’s variance:
$$MSE = \sum_{i=1}^{n} \left( Y_i - f(x_i, \mathbf{g}) \right)^2 / (n - p)$$
• Other methods available, e.g. Nelder-Mead and Marquardt
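The Gauss-Newton steps can be sketched in a few lines for the Michaelis-Menten model. This is a bare-bones illustration in plain Python, not the implementation used by PROC NLIN or nls(): the data values, the starting values and the function names are assumptions made for this example, and there is no step-size control or other safeguard.

```python
def mm(x, vmax, km):
    """Michaelis-Menten mean function f(x, theta) = Vmax * x / (KM + x)."""
    return vmax * x / (km + x)


def sse_mm(x, y, vmax, km):
    """Sum of squared errors at the given parameter values."""
    return sum((yi - mm(xi, vmax, km)) ** 2 for xi, yi in zip(x, y))


def gauss_newton_mm(x, y, vmax, km, tol=1e-8, max_iter=100):
    """Plain Gauss-Newton iterations from starting values (vmax, km)."""
    sse = sse_mm(x, y, vmax, km)
    for iteration in range(1, max_iter + 1):
        # Y(0): residuals at the current values
        r = [yi - mm(xi, vmax, km) for xi, yi in zip(x, y)]
        # D(0): partial derivatives of f with respect to Vmax and KM
        d_vmax = [xi / (km + xi) for xi in x]
        d_km = [-vmax * xi / (km + xi) ** 2 for xi in x]
        # Solve the 2 x 2 normal equations (D^T D) b = D^T Y(0)
        a11 = sum(u * u for u in d_vmax)
        a12 = sum(u * v for u, v in zip(d_vmax, d_km))
        a22 = sum(v * v for v in d_km)
        c1 = sum(u * ri for u, ri in zip(d_vmax, r))
        c2 = sum(v * ri for v, ri in zip(d_km, r))
        det = a11 * a22 - a12 * a12
        b_vmax = (a22 * c1 - a12 * c2) / det
        b_km = (a11 * c2 - a12 * c1) / det
        # Update: g(1) = g(0) + b(0)
        vmax, km = vmax + b_vmax, km + b_km
        new_sse = sse_mm(x, y, vmax, km)
        converged = abs(new_sse - sse) < tol  # change in SSE is negligible
        sse = new_sse
        if converged:
            break
    mse = sse / (len(x) - 2)  # p = 2 parameters
    return vmax, km, sse, mse, iteration


# Illustrative data: substrate concentration and reaction rate
x = [0.02, 0.02, 0.06, 0.06, 0.11, 0.11, 0.22, 0.22, 0.56, 0.56, 1.10, 1.10]
y = [76.0, 47.0, 97.0, 107.0, 123.0, 139.0, 159.0, 152.0, 191.0, 201.0, 207.0, 200.0]

vmax_hat, km_hat, sse_hat, mse_hat, n_iter = gauss_newton_mm(x, y, vmax=205.0, km=0.08)
print(f"Vmax = {vmax_hat:.2f}, KM = {km_hat:.4f}, MSE = {mse_hat:.2f}, iterations = {n_iter}")
```

Poor starting values can make these plain iterations diverge, which is one reason the choice of initial values matters.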
Compartmental Models
• Traditionally used in nutritional modeling
  – Roots in pharmacokinetics and differential calculus
[Figure: two-compartment diagram. Drug in → Gut Compartment → (ka) → Blood Compartment → (ke)]
Compartmental Models
• Functional forms described in terms of differential equations
  – Instead of the “integrated form”
• Strategy for parameter estimation
  – Expected mean represented by a compartmental model f
  – If f cannot be obtained analytically, it has to be solved numerically (Euler, Runge-Kutta4, lsoda)
  – Can use nonlinear least squares but have to numerically solve f at each iteration
• Modern software estimates the parameters using maximum likelihood
Maximum Likelihood Estimation
• Another strategy for parameter estimation
• For regression models with independent $\varepsilon_i \sim N(0, \sigma^2)$, estimators coincide with least squares estimators
• Estimators maximize the likelihood function
  – Parameter values that are in best agreement with the data
Maximum Likelihood Estimation
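• p(y | μ, σ²) is the density function: how likely each value of y is
$$p(y \mid \mu, \sigma^2) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\left( -\frac{(y - \mu)^2}{2 \sigma^2} \right)$$
• The likelihood function is:
$$L(\mu, \sigma^2 \mid y_1, \dots, y_n) = p(y_1 \mid \mu, \sigma^2) \times \dots \times p(y_n \mid \mu, \sigma^2)$$
  – “How likely the whole data set is with that set of parameter values”
  – MLE: “maximize the likelihood of getting the observed data”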
Maximum Likelihood Estimation
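• Linear Regression Example:
$$Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$
$$L(\beta_0, \beta_1, \sigma^2 \mid y_1, \dots, y_n) = \prod_{i=1}^{n} \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\left( -\frac{1}{2 \sigma^2} \left( y_i - \beta_0 - \beta_1 x_i \right)^2 \right)$$
• It is easier to work with the log-likelihood
$$\log L(\beta_0, \beta_1, \sigma^2 \mid y_1, \dots, y_n) = -\frac{n}{2} \log(2 \pi) - \frac{n}{2} \log \sigma^2 - \frac{1}{2 \sigma^2} \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2$$
• To find parameters that maximize the likelihood we…
  – Take the derivative with respect to each parameter and set to zero. Also need second derivative test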
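To make the link with least squares concrete, the sketch below evaluates this log-likelihood for a small hypothetical data set. It uses the least squares estimates of β0 and β1 together with SSE/n for σ² (the maximum likelihood estimator of the variance divides by n rather than n − p), and compares the result with the log-likelihood at a few nearby parameter values. The data and names are assumptions for illustration.

```python
import math


def log_likelihood(b0, b1, sigma2, x, y):
    """Normal log-likelihood for the simple linear regression model."""
    n = len(x)
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * n * math.log(sigma2)
            - sse / (2.0 * sigma2))


# Hypothetical data, for illustration only
x = [0.0, 20.0, 40.0, 60.0, 80.0, 100.0]
y = [12.0, 19.0, 33.0, 38.0, 52.0, 58.0]
n = len(x)

# Least squares estimates of the intercept and slope
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# Maximum likelihood estimate of the error variance: SSE / n
sigma2_ml = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / n

ll_hat = log_likelihood(b0, b1, sigma2_ml, x, y)
ll_shift_b0 = log_likelihood(b0 + 1.0, b1, sigma2_ml, x, y)
ll_shift_b1 = log_likelihood(b0, 1.1 * b1, sigma2_ml, x, y)
ll_shift_s2 = log_likelihood(b0, b1, 1.5 * sigma2_ml, x, y)

print(f"log-likelihood at the estimates: {ll_hat:.4f}")
print(f"at perturbed values: {ll_shift_b0:.4f}, {ll_shift_b1:.4f}, {ll_shift_s2:.4f}")
```

Every perturbed value gives a lower log-likelihood than the estimates do.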
Mixed Models
• Modern mixed modeling relies heavily on likelihood methods
• Extension of (non)linear models with both fixed and random effects
• Probably the “type” of model you need to analyze your data or construct your nutrition model
• There’s a whole workshop on mixed models in this meeting
Nonlinear Mixed Models
• Nonlinear functional forms
  – Michaelis-Menten, logistic, exponential, Gompertz, …
• Random effects that “enter the model” nonlinearly
• Allow you to model nonlinear clustered, longitudinal data
  – Records from the same animal, treatment means from the same study
Nonlinear Mixed Models
$$y_{ij} = f(x_{ij}, \boldsymbol{\theta}_i) + \varepsilon_{ij}$$
• yij is the jth record on the ith “subject” or cluster
• xij is the associated predictor variable
• θi is the vector of subject specific parameters
θi = β + bi, bi ~ N(0, Ψ), where β is the “Fixed” part and bi is the “Random” part
• εij is the random error ~ N(0, σ²)
Nonlinear MM Maximum Likelihood
• There is more than one source of variability
  – Between subjects and within subjects
• To represent the generative process of the data we need to take both into account
  – Joint density of the response and the random effects
Maximum Likelihood
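• For the linear regression
$$p(y \mid \mu, \sigma^2) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\left( -\frac{(y - \mu)^2}{2 \sigma^2} \right)$$
• The likelihood function is:
$$L(\mu, \sigma^2 \mid y_1, \dots, y_n) = p(y_1 \mid \mu, \sigma^2) \times \dots \times p(y_n \mid \mu, \sigma^2)$$
• For the nonlinear mixed model, we need to compute the marginal density of the responses:
$$p(\mathbf{y} \mid \boldsymbol{\beta}, \boldsymbol{\Psi}, \sigma^2) = \int p(\mathbf{y} \mid \mathbf{b}, \boldsymbol{\beta}, \sigma^2) \, p(\mathbf{b} \mid \boldsymbol{\Psi}) \, d\mathbf{b}$$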
Maximum Likelihood
• Bad news from an estimation perspective
• The likelihood function to estimate the parameters requires integrating the joint density with respect to the random effects
• The integral often does not have a closed form expression
• Approximation of the likelihood function
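One simple way to approximate such an integral is numerical integration over the random effect. The sketch below does this with the trapezoidal rule for a deliberately simple case, chosen as an assumption for illustration: a single observation y = β + b + ε with b ~ N(0, ψ) and ε ~ N(0, σ²). This case is linear, so the marginal density has the closed form N(β, ψ + σ²) and the approximation can be checked; in a nonlinear mixed model no such check is usually available. Mixed model software uses more refined approximations than this.

```python
import math


def normal_pdf(z, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(z - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def marginal_density(y, beta, psi, sigma2, n_points=2001, width=8.0):
    """Approximate the integral of p(y | b) p(b) over b with the trapezoidal rule."""
    half = width * math.sqrt(psi)  # integrate over +/- 8 standard deviations of b
    h = 2.0 * half / (n_points - 1)
    total = 0.0
    for k in range(n_points):
        b = -half + k * h
        weight = 0.5 if k in (0, n_points - 1) else 1.0
        total += weight * normal_pdf(y, beta + b, sigma2) * normal_pdf(b, 0.0, psi)
    return total * h


# Hypothetical values, for illustration only
y_obs, beta, psi, sigma2 = 1.3, 1.0, 0.5, 0.25

approx = marginal_density(y_obs, beta, psi, sigma2)
exact = normal_pdf(y_obs, beta, psi + sigma2)  # closed form in this linear case

print(f"numerical integral: {approx:.8f}")
print(f"closed form:        {exact:.8f}")
```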
Exercise
• Let’s go back to the compartmental model
$$\frac{dy_1}{dt} = -k_a y_1$$
$$\frac{dy_2}{dt} = k_a y_1 - k_e y_2$$
• y1 is the amount of drug in the gut compartment
• y2 is the amount of drug in the blood compartment
• ka is the absorption rate (1/hr)
• ke is the elimination rate (1/hr)
Exercise
• Data from Davidian and Giltinan (1995)
• 12 subjects received a single oral dose of theophylline
  – Anti-asthmatic drug
• Single oral dose at time zero
• Measurements of blood concentrations of drug at 11 time points over a 25-hour period
Exercise
[Figure: Concentration (mg/L) versus Time (hr)]
Exercise
• We only observe data from the second compartment
• The differential equation for the second compartment can actually be solved analytically
• We will estimate the parameters in two different ways
  – Analytical solutions: nonlinear mixed model
  – Numerical solutions for differential equations in a mixed model framework
Exercise
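• The analytical solution to the second differential equation is
$$C(t) = \frac{Dose \, k_e k_a}{Cl \, (k_a - k_e)} \left[ \exp(-k_e t) - \exp(-k_a t) \right]$$
• As a nonlinear mixed model
$$C_{ij} = \frac{Dose_i \, k_e k_{a,i}}{Cl_i \, (k_{a,i} - k_e)} \left[ \exp(-k_e t_{ij}) - \exp(-k_{a,i} t_{ij}) \right] + \varepsilon_{ij}$$
$$\varepsilon_{ij} \sim N(0, \sigma^2) \quad \text{and} \quad \begin{bmatrix} Cl_i \\ k_{a,i} \end{bmatrix} \sim N\left( \begin{bmatrix} Cl \\ k_a \end{bmatrix}, \begin{bmatrix} \sigma^2_{Cl} & 0 \\ 0 & \sigma^2_{k_a} \end{bmatrix} \right)$$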
Exercise
• All three parameters must be positive
• Reparameterize model with parameters on a log scale
$$C(t) = \frac{Dose \, \exp(lk_e + lk_a - lCl)}{\exp(lk_a) - \exp(lk_e)} \left[ \exp(-\exp(lk_e) \, t) - \exp(-\exp(lk_a) \, t) \right]$$
where $lk_e = \log k_e$, $lk_a = \log k_a$ and $lCl = \log Cl$
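The reparameterization changes only the scale on which the parameters are estimated, not the curve itself. A quick sketch in Python, with hypothetical parameter values and function names chosen for this example, shows the two forms giving the same concentrations:

```python
import math


def conc(t, dose, ke, ka, cl):
    """Concentration curve with parameters on their original scale."""
    return dose * ke * ka / (cl * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))


def conc_log(t, dose, lke, lka, lcl):
    """Same curve with lke = log(ke), lka = log(ka), lcl = log(Cl)."""
    return (dose * math.exp(lke + lka - lcl) / (math.exp(lka) - math.exp(lke))
            * (math.exp(-math.exp(lke) * t) - math.exp(-math.exp(lka) * t)))


# Hypothetical parameter values, for illustration only
dose, ke, ka, cl = 4.0, 0.08, 1.5, 0.04
times = [0.0, 0.5, 1.0, 2.0, 5.0, 12.0, 24.0]

original = [conc(t, dose, ke, ka, cl) for t in times]
log_scale = [conc_log(t, dose, math.log(ke), math.log(ka), math.log(cl)) for t in times]

for t, c1, c2 in zip(times, original, log_scale):
    print(f"t = {t:5.1f} hr: {c1:.6f} vs {c2:.6f}")
```

Because exp() is always positive, any real value of the log-scale parameters maps back to positive ke, ka and Cl, so the optimizer can search without constraints.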
Exercise
• First method
  – Fit nonlinear mixed model in R
Exercise
• Second method
  – Nonlinear mixed model but solve differential equations numerically: lsoda solver
$$\frac{dy_1}{dt} = -k_a y_1, \qquad y_1(0) = Dose$$
$$\frac{dy_2}{dt} = k_a y_1 - k_e y_2, \qquad y_2(0) = 0$$
• Observation Equation:
$$C(t) = \frac{y_2(t) \, k_e}{Cl}$$
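The numerical route can be illustrated without any ODE library. The sketch below integrates the two differential equations with a fixed-step fourth-order Runge-Kutta scheme (a simpler stand-in for lsoda, used here only so the example is self-contained), applies the observation equation, and compares the result with the analytical solution. Parameter values and function names are hypothetical.

```python
import math


def solve_two_compartment(dose, ka, ke, t_end, n_steps=1000):
    """Integrate dy1/dt = -ka*y1, dy2/dt = ka*y1 - ke*y2 with fixed-step RK4."""
    def rhs(y1, y2):
        return -ka * y1, ka * y1 - ke * y2

    h = t_end / n_steps
    y1, y2 = dose, 0.0  # initial conditions y1(0) = Dose, y2(0) = 0
    for _ in range(n_steps):
        k1 = rhs(y1, y2)
        k2 = rhs(y1 + 0.5 * h * k1[0], y2 + 0.5 * h * k1[1])
        k3 = rhs(y1 + 0.5 * h * k2[0], y2 + 0.5 * h * k2[1])
        k4 = rhs(y1 + h * k3[0], y2 + h * k3[1])
        y1 += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y2 += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return y1, y2


# Hypothetical parameter values, for illustration only
dose, ka, ke, cl = 4.0, 1.5, 0.08, 0.04
t = 5.0

y1_t, y2_t = solve_two_compartment(dose, ka, ke, t)

# Observation equation: C(t) = y2(t) * ke / Cl
c_numeric = y2_t * ke / cl

# Analytical solution for comparison
c_analytic = dose * ke * ka / (cl * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))

print(f"numerical:  {c_numeric:.8f}")
print(f"analytical: {c_analytic:.8f}")
```

In a model-fitting loop this solve is repeated for every candidate set of parameter values, which is why the numerical route is slower than the analytical one.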
Bayesian Inference
• Combines prior information with new data: update of knowledge
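• All parameters are treated as random variables
  – Prior distributions for parameters
• Inference is based on the posterior distribution
• Bayes theorem
$$p(\boldsymbol{\theta} \mid \mathbf{y}) \propto g(\boldsymbol{\theta}) \, L(\boldsymbol{\theta} \mid \mathbf{y})$$
Posterior: $p(\boldsymbol{\theta} \mid \mathbf{y})$; Prior: $g(\boldsymbol{\theta})$; Data: $L(\boldsymbol{\theta} \mid \mathbf{y})$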
From: https://www.r-bloggers.com/the-beta-prior-likelihood-and-posterior/
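Bayes theorem can be applied directly on a grid when there is a single parameter. The sketch below assumes, for illustration only, a Beta(2, 2) prior for a proportion θ and data of 7 successes in 10 trials. It multiplies prior by likelihood at each grid point, normalizes, and compares the posterior mean with the known Beta(9, 5) result.

```python
# Hypothetical example: Beta(2, 2) prior for a proportion theta,
# data = 7 successes in 10 trials
a_prior, b_prior = 2.0, 2.0
successes, trials = 7, 10

grid = [i / 1000.0 for i in range(1, 1000)]  # theta values in (0, 1)

# Prior and likelihood, each up to a constant that cancels on normalization
prior = [t ** (a_prior - 1) * (1 - t) ** (b_prior - 1) for t in grid]
likelihood = [t ** successes * (1 - t) ** (trials - successes) for t in grid]

# Posterior is proportional to prior times likelihood
unnormalized = [g * l for g, l in zip(prior, likelihood)]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]  # probabilities on the grid

posterior_mean = sum(t * w for t, w in zip(grid, posterior))
exact_mean = (a_prior + successes) / (a_prior + b_prior + trials)  # Beta(9, 5) mean

print(f"grid posterior mean:  {posterior_mean:.6f}")
print(f"exact posterior mean: {exact_mean:.6f}")
```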
Bayesian Inference
• Inference based on posterior
  – Combines prior information with the observed data
• Particularly suited for models built with many parameters for which good biological knowledge is available
• Much free software is available
  – Including software with differential equation “solvers”
• We don’t have to have a known or tractable posterior: Markov Chain Monte Carlo (MCMC)
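To give a feel for how MCMC works, here is a minimal random-walk Metropolis sampler in plain Python. The setting is an assumption chosen so that the answer can be checked: a normal mean μ with known error standard deviation and a normal prior, for which the posterior mean is known in closed form. The data, prior values, step size and function names are all hypothetical, and real Bayesian software uses considerably more sophisticated samplers.

```python
import math
import random


def log_posterior(mu, y, sigma, prior_mean, prior_sd):
    """Log of prior times likelihood, up to an additive constant."""
    log_prior = -0.5 * ((mu - prior_mean) / prior_sd) ** 2
    log_lik = -0.5 * sum(((yi - mu) / sigma) ** 2 for yi in y)
    return log_prior + log_lik


def metropolis(y, sigma, prior_mean, prior_sd, n_iter=20000, step=0.4, seed=1):
    """Random-walk Metropolis draws from the posterior of mu."""
    rng = random.Random(seed)
    mu = prior_mean
    lp = log_posterior(mu, y, sigma, prior_mean, prior_sd)
    draws = []
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0.0, step)
        lp_prop = log_posterior(proposal, y, sigma, prior_mean, prior_sd)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            mu, lp = proposal, lp_prop
        draws.append(mu)
    return draws


# Hypothetical data and prior, for illustration only
y = [4.8, 5.6, 5.1, 4.4, 5.9, 5.3, 4.7, 5.2]
sigma = 0.5                       # error standard deviation, assumed known
prior_mean, prior_sd = 5.0, 1.0

draws = metropolis(y, sigma, prior_mean, prior_sd)
kept = draws[2000:]               # discard the first draws as burn-in
mcmc_mean = sum(kept) / len(kept)

# Closed-form posterior mean for this conjugate normal-normal case
post_precision = len(y) / sigma ** 2 + 1.0 / prior_sd ** 2
exact_mean = (sum(y) / sigma ** 2 + prior_mean / prior_sd ** 2) / post_precision

print(f"MCMC posterior mean:  {mcmc_mean:.3f}")
print(f"exact posterior mean: {exact_mean:.3f}")
```

The sampler only ever evaluates prior times likelihood, never the normalizing constant, which is why a known or tractable posterior is not required.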
Example of a Multivariate Nonlinear Model
• From Strathe et al. (2012). J. Agri. Sci. 150:764-774
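[Figure: energy partitioning diagram. ME is divided into MEM and (ME−MEM); F∙(ME−MEM) goes to MEPD and (1−F)∙(ME−MEM) to MELD; further nodes PD, LD and HP, with efficiencies kp, kf, 1−kp and 1−kf]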
Courtesy of A. B. Strathe and adapted after van Milgen and Noblet (1999)
Example for a Multivariate Nonlinear Model
Multivariate Model:
[Equation: PD as a function of BW, with parameters including PDmax]
$$LD = k_f \left( ME - a \cdot BW^{b} - \frac{1}{k_p} PD \right)$$
Priors from literature:
$$k_p \sim N(0.60, 0.10^2) \quad \text{and} \quad k_f \sim N(0.80, 0.10^2)$$
Credible Intervals for the Posterior
From Strathe et al. (2012). J. Agri. Sci. 150:764-774