-
Liang Li
Department of Quantitative Health Sciences, Cleveland Clinic
Joint Modeling of Longitudinal and Survival Data
Presented at ASA North Illinois Chapter Spring Meeting, March 5 2009
-
Outline of the Talk
What is joint modeling of longitudinal & survival data?
The shared parameter model
The measurement error perspective
Our proposal
Why it works (theoretical properties)
How it works (empirical performance)
Extension and on-going work
-
Longitudinal Data
Each subject is followed over a period of time, and a series of measurements is made.
e.g., after lung transplant, FEV1 is measured every week for a month, and every month afterwards until the end of the study
[Figure: FEV1 versus Months (0 to 18)]
-
Survival Data
Time to an event such as death, machine failure, disease relapse, progression-free survival (PFS), etc.
Could be censored (partially observed)
[Figure: Kaplan-Meier curve; x-axis: Years (0 to 6), y-axis: Survival (%)]
[Figure: follow-up of individual subjects ending in death or censored; x-axis: Time (Months)]
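As an illustration of how a Kaplan-Meier curve is computed from censored data, here is a minimal sketch in plain Python. The function name `kaplan_meier` and the toy data are assumptions made for this example; they are not taken from the talk.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : observed follow-up times Y = min(T, C)
    events : 1 if the event was observed, 0 if the subject was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


# Hypothetical toy data in months: 1 = death, 0 = censored.
toy_times = [2, 3, 3, 5, 8, 10]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
```

Censored subjects contribute to the number at risk until they leave the study but never trigger a drop in the curve, which is why the estimate only steps down at observed event times.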
-
Joint Modeling of Longitudinal and Survival Data
Question: how does the change in the (earlier) longitudinal profile of a subject relate to the risk of the (later) survival event?
Example 1: Rate of change of glomerular filtration rate (GFR) & time to end stage renal disease (ESRD) or death
Example 2: FEV1 & survival among cystic fibrosis patients
Widespread use and an active research field, e.g., surrogate endpoints
-
Longitudinal Profile
[Figure: FEV1 versus Months (0 to 18)]
longitudinal profile = signal + noise
Linear profile: subject-specific (random) intercept & slope
Relates intercept & slope to survival
Can we use raw data profile and avoid joint modeling?
Nonlinear profile: time-dependent covariate curve
-
Data Structure
[Diagram: two stages (Stage 1, Stage 2) in which the subject-specific intercept & slope link the longitudinal data (e.g., Linear Mixed Model) and the survival data (e.g., Cox Model)]
Two-stage hierarchical model
Longitudinal part and survival part are conditionally independent given the subject-specific intercept and slope
-
Shared Parameter Model
Two-stage hierarchical structure suggests the shared parameter model
Review by Tsiatis & Davidian (2004), and Tseng, Hsieh, Wang (2005), Liu & Ying (2007), among others
Almost all based on the following Fisher-likelihood
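$$\sum_{i=1}^{n} \log\left\{ \int f(\text{long} \mid \text{int}, \text{slope})\, f(\text{surv} \mid \text{int}, \text{slope})\, f(\text{int}, \text{slope})\, d[\text{int}, \text{slope}] \right\}$$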
Pros: maximum likelihood estimator
Cons: computationally intensive; distributional assumptions needed
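To make the computational cost concrete, the sketch below evaluates one subject's contribution to such a likelihood by numerical integration over the shared random effect. It is a deliberately simplified illustration and rests on assumptions that are not in the talk: a single random intercept `b`, longitudinal measurements `W_j = b + noise`, a constant hazard `exp(a0 + a2 * b)`, and a plain trapezoidal grid. All names are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def subject_marginal_loglik(w, y, delta, sigma, tau, a0, a2,
                            grid_half_width=6.0, n_grid=801):
    """log of the integral of f(long | b) f(surv | b) f(b) over b, one subject.

    Simplifying assumptions (not from the talk): random intercept b only,
    b ~ N(0, tau^2), W_j = b + N(0, sigma^2), constant hazard exp(a0 + a2*b).
    """
    lo, hi = -grid_half_width * tau, grid_half_width * tau
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        b = lo + k * step
        f_long = 1.0
        for wj in w:
            f_long *= normal_pdf(wj, b, sigma)
        hazard = math.exp(a0 + a2 * b)
        # Exponential survival density (delta = 1) or survivor function (delta = 0).
        f_surv = (hazard ** delta) * math.exp(-hazard * y)
        f_b = normal_pdf(b, 0.0, tau)
        weight = 0.5 if k in (0, n_grid - 1) else 1.0  # trapezoidal rule
        total += weight * f_long * f_surv * f_b * step
    return math.log(total)


example_loglik = subject_marginal_loglik(
    w=[0.3, 0.5, 0.1], y=2.0, delta=1, sigma=0.5, tau=1.0, a0=-1.0, a2=0.5)
```

Every likelihood evaluation repeats this integral for every subject, and with a random intercept and slope the integral becomes two-dimensional, which is one reason maximum likelihood fitting is slow.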
-
A New Perspective
Can we use a two-step approach for the two-stage problem?
step 1: estimate the intercept and slope for each subject
step 2: relate them to survival
[Figure: a subject's true line and fitted lines; x-axis: Time (0 to 10)]
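Step 1 of the two-step approach can be sketched as an ordinary least squares fit per subject. The code below is only an illustration: the helper `fit_line`, the visit times, and the "true" intercept and slope are hypothetical values chosen for the example.

```python
import random


def fit_line(times, values):
    """Ordinary least squares intercept and slope for one subject."""
    n = len(times)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = v_bar - slope * t_bar
    return intercept, slope


rng = random.Random(0)
true_intercept, true_slope = 14.0, -0.4   # hypothetical subject
visit_times = [0, 2, 4, 6, 8, 10]
observed = [true_intercept + true_slope * t + rng.gauss(0.0, 0.8)
            for t in visit_times]
est_intercept, est_slope = fit_line(visit_times, observed)
```

The fitted intercept and slope differ from the true ones because of measurement noise, and that discrepancy is what step 2 has to account for.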
-
The Measurement Error Perspective
Do a regression of survival using true subject-specific intercepts and slopes
true intercept & slope unknown
estimated intercept & slope act as surrogates
measurement error may cause bias in regression
[Figure: scatter plot of Y against Z (red) or X (blue), with the fitted regression Y ~ X]
Y = b0 + b1 Z + e
X = Z + U
Measurement error causes attenuation in the regression of Y on X
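The attenuation can be reproduced with a small simulation. This is a sketch under assumed values (true slope 1, equal variances for Z and U); none of the numbers come from the talk. With these choices the regression of Y on the surrogate X should recover roughly half of the true slope, since the attenuation factor is var(Z) / (var(Z) + var(U)).

```python
import random


def ols_slope(x, y):
    """Slope of the least squares regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return sxy / sxx


rng = random.Random(2009)
n = 20000
b0, b1 = 0.0, 1.0                 # hypothetical true coefficients
sd_z, sd_u, sd_e = 1.0, 1.0, 0.3  # hypothetical standard deviations

z = [rng.gauss(0.0, sd_z) for _ in range(n)]           # true covariate
x = [zi + rng.gauss(0.0, sd_u) for zi in z]            # surrogate X = Z + U
y = [b0 + b1 * zi + rng.gauss(0.0, sd_e) for zi in z]  # Y = b0 + b1 Z + e

slope_true_covariate = ols_slope(z, y)   # close to b1
slope_surrogate = ols_slope(x, y)        # attenuated toward zero
attenuation_factor = sd_z ** 2 / (sd_z ** 2 + sd_u ** 2)
```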
-
The Example
HEMO study: a clinical trial coordinated at Cleveland Clinic
2 by 2 design: standard or high dose of dialysis, low or high flux dialyzer
Neither treatment was found to significantly affect time to all-cause mortality (Rocco et al 2004)
We want to study a secondary question: whether the decline of albumin levels is a strong predictor of mortality
Challenge: albumin measurements need to be calibrated to remove artificial differences due to variations in total body water.
Dialysis schedules: Monday-Wednesday-Friday or Tuesday-Thursday-Saturday
-
Model & Notation - Longitudinal Part
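Longitudinal sub-model: linear mixed model
Stage 1: $W_{ij} = V_{ij}^T \gamma + D_{ij}^T \beta_i + \varepsilon_{ij}$
Stage 2: $\beta_i = X_i \alpha + b_i$
Here $\beta_i$ is the subject-specific intercept and slope, $\gamma$ and $\alpha$ are regression coefficients, and $b_i$ is the random deviation of subject $i$.
In the context of the application:
Stage 1: $\mathrm{ALB}_{ij} = \text{Mon/Tues} + \beta_{i0} + \mathrm{Time}_{ij}\,\beta_{i1} + \mathrm{Noise}_{ij}$
Stage 2: $\beta_{i0} = \mathrm{Intercept}_0 + \mathrm{Dose}_i A_0 + \mathrm{Flux}_i B_0 + b_{i0}$
Stage 2: $\beta_{i1} = \mathrm{Intercept}_1 + \mathrm{Dose}_i A_1 + \mathrm{Flux}_i B_1 + b_{i1}$
We start with $b_i \sim N(0, \Sigma)$ and show later that the conclusions hold even when this assumption is dropped.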
-
Model & Notation - Survival Part
Survival sub-model: Cox proportional hazard model
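T: time to event; C: time to censoring
$Y = \min(T, C)$, $\delta = 1\{T < C\}$
Log hazard function:
$$h(t; Z_i, b_i) = h_0(t) + Z_i^T a_1 + b_i^T a_2$$
where $b_i$ is the random part of the subject-specific intercept and slope.
In the context of the example:
$$h(t; Z_i, b_i) = h_0(t) + \mathrm{Dose}_i\, a_{11} + \mathrm{Flux}_i\, a_{12} + b_{i0}\, a_{20} + b_{i1}\, a_{21}$$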
The proposed model includes as special cases the models considered by Wang (2006), Ratcliffe et al (2004), Hsieh, Tseng, Wang (2006), Tsiatis & Davidian (2004), among others
Stage 2
-
Poissonization of Cox Model
Step 1: use B-splines to approximate the log baseline hazard
$$h_0(t) \approx \sum_{k=1}^{K} a_{0k} B_k(t)$$
Step 2: use the full likelihood of the Cox model, not the partial likelihood
$$\delta_i\, U_i(Y_i)^T \theta - \int_0^{Y_i} \exp\{U_i(t)^T \theta\}\, dt$$
Step 3: use the trapezoidal rule for numerical integration
Finally: we can fit a Cox model using Poisson regression
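The link between the full likelihood and Poisson regression can be checked numerically. In the sketch below the cumulative hazard is approximated by the trapezoidal rule, and the same quantity is rewritten as a Poisson log likelihood with pseudo-responses (0 on every grid point except the last, which carries the event indicator) and offsets equal to the log trapezoidal weights. The log hazard used here is a hypothetical function that is linear in t, standing in for the B-spline baseline plus covariates; all function names are assumptions made for the example.

```python
import math


def trapezoid_weights(y_end, n_intervals):
    """Grid points t_0..t_M on [0, y_end] and trapezoidal weights c_0..c_M."""
    step = y_end / n_intervals
    grid = [g * step for g in range(n_intervals + 1)]
    weights = [step] * (n_intervals + 1)
    weights[0] = weights[-1] = step / 2.0
    return grid, weights


def log_hazard(t, theta):
    # Hypothetical log hazard, linear in the parameters theta.
    return theta[0] + theta[1] * t


def cox_full_loglik(y_end, delta, theta, n_intervals=200):
    """delta * log hazard at Y minus the trapezoidal cumulative hazard."""
    grid, weights = trapezoid_weights(y_end, n_intervals)
    cum_hazard = sum(c * math.exp(log_hazard(t, theta))
                     for t, c in zip(grid, weights))
    return delta * log_hazard(y_end, theta) - cum_hazard


def poisson_loglik(y_end, delta, theta, n_intervals=200):
    """Poisson log likelihood with pseudo-responses and offsets log(c_g)."""
    grid, weights = trapezoid_weights(y_end, n_intervals)
    total = 0.0
    for g, (t, c) in enumerate(zip(grid, weights)):
        pseudo_response = delta if g == n_intervals else 0
        log_mean = log_hazard(t, theta) + math.log(c)
        total += pseudo_response * log_mean - math.exp(log_mean)
    return total


theta = (-1.0, 0.2)
# The two log likelihoods differ only by log(c_M), which is free of theta.
gap = poisson_loglik(3.0, 1, theta) - cox_full_loglik(3.0, 1, theta)
```

Because the gap does not depend on the parameters, maximizing the Poisson log likelihood with these offsets gives the same estimates as maximizing the trapezoidal approximation of the full likelihood.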
-
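The joint log likelihood (for one subject)
Key observation: $\beta_i$ appears in linear, quadratic or exponential terms
Longitudinal (Stage 1):
$$L_{Li}(\beta_i) = -\frac{n_i}{2}\log(2\pi\sigma_\varepsilon^2) - \frac{\|W_i - V_i\gamma - D_i\beta_i\|^2}{2\sigma_\varepsilon^2}$$
Survival/Poisson:
$$L_{Si}(\beta_i) = \sum_{g=0}^{M_i}\Big[ Y_{ig}\big\{U_{1ig}^T\theta_1 + \beta_i^T\theta_2 - \theta_2^T X_i\alpha + \log(c_{ig})\big\} - \exp\big\{U_{1ig}^T\theta_1 + \beta_i^T\theta_2 - \theta_2^T X_i\alpha + \log(c_{ig})\big\}\Big]$$
$$L_{Mi}(\beta_i) = -\frac{q}{2}\log(2\pi) - \frac{1}{2}\log|\Sigma| - \frac{1}{2}(\beta_i - X_i\alpha)^T\Sigma^{-1}(\beta_i - X_i\alpha)$$
True likelihood, to be replaced by a corrected version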
-
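From linear model theory, $\hat\beta_i$ is a measurement of $\beta_i$:
$$\hat\beta_i = (D_i^T D_i)^{-1} D_i^T (W_i - V_i\gamma)$$
$$\hat\beta_i \mid \beta_i \sim N\big(\beta_i,\ \sigma_\varepsilon^2 (D_i^T D_i)^{-1}\big)$$
If $W \sim N(X, \sigma_u^2)$, then each term of the true likelihood has a corrected counterpart with the same expectation:
Linear: $\sum_i X_i$ is replaced by $\sum_i W_i$
Quadratic: $\sum_i X_i^2$ is replaced by $\sum_i (W_i^2 - \sigma_u^2)$
Exponential: $\sum_i \exp(X_i)$ is replaced by $\sum_i \exp(W_i - \tfrac{1}{2}\sigma_u^2)$
Apply the correction to the joint log likelihood (formula omitted):
$$\sum_{i=1}^{n} L_{Li}(\beta_i) + \sum_{i=1}^{n} L_{Si}(\beta_i) + \sum_{i=1}^{n} L_{Mi}(\beta_i)$$
Corrected Likelihood
[Figure: plot against Time (0 to 10)]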
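The three corrections (linear, quadratic, exponential) can be checked by Monte Carlo. The sketch below draws a surrogate W from N(X, sigma_u^2) many times and averages the corrected terms; the value of X, the error standard deviation, and the number of draws are hypothetical choices for the example.

```python
import math
import random


def corrected_terms(w, sigma_u):
    """Unbiased surrogates for X, X^2 and exp(X) when W ~ N(X, sigma_u^2)."""
    linear = w
    quadratic = w * w - sigma_u ** 2
    exponential = math.exp(w - 0.5 * sigma_u ** 2)
    return linear, quadratic, exponential


rng = random.Random(1)
x_true, sigma_u, n_draws = 0.7, 0.5, 200000   # hypothetical values
sums = [0.0, 0.0, 0.0]
for _ in range(n_draws):
    terms = corrected_terms(rng.gauss(x_true, sigma_u), sigma_u)
    for k in range(3):
        sums[k] += terms[k]

mc_means = [s / n_draws for s in sums]
targets = [x_true, x_true ** 2, math.exp(x_true)]
```

Each Monte Carlo mean should sit close to its target, which is the property that lets the unknown quantity be eliminated from terms of these three forms.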
-
A Few Remarks
The proposed estimators are maximizers of the corrected joint log likelihood function
Variance components are estimated separately in a side step
Mis-specification is allowed, as with GEE
Results are not sensitive to the B-spline approximation
Statistical inference is based on the sandwich variance estimator
-
Summary on Proposed Method
Key idea: find a corrected joint log likelihood that looks like the true joint log likelihood with the unknowns eliminated
This is possible because the unknowns reside in linear, quadratic or exponential terms (Li and Greene, Biometrics 2008)
Combine three pieces of log likelihood together, similar in spirit to the h-likelihood (1996), but different from the classical Fisher likelihood (1922)
Compared with Wang (2006, Stat Sinica), our method offers:
a more general model (unknown parameters in both sub-models) that includes most published models as special cases
exact correction with the full likelihood instead of approximate correction with the partial likelihood
a concave likelihood (next page)
-
Theoretical Properties
The estimators of the unknown parameters are maximizers of the corrected joint log likelihood
As sample size becomes large:
the estimator is consistent
the estimator is asymptotically normal
the corrected joint log likelihood is concave
These properties remain valid even when the random effects do not have normal distribution or their variance matrix is misspecified (robust)
-
Simulation Results
We conducted extensive computer simulations to investigate the empirical performance of the proposed method
Bias, variance, coverage of confidence interval: Good
Result not sensitive to number of knots of B-spline
The computation is much faster than competing methods based on maximum likelihood
The algorithm is stable and always converges (owing to concavity)
The estimator is expected to be less efficient than maximum likelihood based methods, a trade-off for robustness
-
Parameter (true value)   Bias, uncorrected (two-step)   Bias, proposed   CI coverage of proposed (%)
L 1 = 1                  0.00197                        0.00299          94.5
L 2 = 2                  -0.00370                       -0.00571         94.0
L 3 = 1                  0.00591                        0.00659          94.0
L 4 = 0.5                -0.0104                        -0.0118          97.0
intercept = 0.5          -0.347                         0.0196           96.0
slope = 1                -0.471                         0.0552           95.5
n = 250
-
Application to HEMO Study Data
1628 patients with between 3 and 15 repeated measurements
Parameter                Estimate    p-value
Longitudinal sub-model
  intercept              3.7         < 0.001
  high dose              0.0012      0.94
  high flux              -0.007      0.67
  time (years)           -0.058      < 0.001
  high dose by time      -0.014      0.311
  high flux by time      -0.01       0.468
  Monday / Tuesday       -0.026      0.017
Survival sub-model
  high dose              -0.061      0.5
  high flux              -0.069      0.44
  random intercept       -1.5        < 0.001
  random slope           -3.7        < 0.001
[Figure: curves labeled smaller slope (-0.4) and larger slope (-0.2); x-axis: Time (0 to 10)]
-
Estimated baseline survival function and its 95% point-wise confidence interval
[Figure: Survival (%) versus Years (0 to 6); smooth curve compared with the step function from partial likelihood]
-
Summary
A new method for joint modeling
A general model that includes most published models as special cases
Theoretically appealing properties with reliable and easy computation
Robust against certain model mis-specification
Numerical integration methods other than the trapezoidal rule may be used (Poissonization is not essential)
Limitations:
Need at least three repeated measurements per subject
Trades efficiency for robustness; best for large sample sizes
-
Nonlinear Longitudinal Data
In a lung transplant study at Cleveland Clinic, investigators want to use FEV1 profile after lung transplant to predict mortality
The profile is clearly nonlinear
-
[Figure: mean FEV1 trajectory, subject-clustering ignored; x-axis: months after transplant (0 to 100), y-axis: FEV1]
-
[Figure: Subject-Specific Fitted Curves; x-axis: months after transplant (0 to 100), y-axis: fitted curves; curves labeled by subject number, 1 to 311]
-
[Figure: true curve and estimated curve; x-axis: Time (0 to 6)]
Replace the subject-specific intercept or slope with a time-dependent covariate
Want error correction?
-
Proposed Model & Method
Cox model with time-dependent covariate and time-dependent hazard ratios (varying coefficients)
Why varying coefficients: constant hazard ratios are unlikely for surgical data
Covariate observed with error: $W_i(t) = X_i(t) + \varepsilon_i(t)$
Constant coefficients: $h_i(t; X(t)) = \exp\{\beta_0(t) + \beta_1^T X_i(t)\}$
Varying coefficients: $h_i(t; X(t)) = \exp\{\beta_0(t) + \beta_1(t)^T X_i(t)\}$
Use whatever method to fit each subject's longitudinal profile separately
Get the estimated curve & its variation; do the measurement error correction
Deal with varying coefficients
-
Proposed Method
Local linear method: estimate the curves piece by piece in local neighborhoods.
[Figure: two panels of Y versus Time (0 to 10)]
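The local linear idea can be illustrated with a generic kernel-weighted least squares smoother. This sketch is for an ordinary regression curve, not for the Cox likelihood used in the talk, and the Epanechnikov kernel, bandwidths, and function names are assumptions made for the example.

```python
import math


def epanechnikov(u):
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def local_linear(t0, times, values, bandwidth):
    """Fit intercept + slope * (t - t0) by kernel-weighted least squares.

    Returns (estimated value at t0, estimated slope at t0).
    """
    s0 = s1 = s2 = r0 = r1 = 0.0
    for t, v in zip(times, values):
        d = t - t0
        k = epanechnikov(d / bandwidth) / bandwidth   # kernel weight K_h(t - t0)
        s0 += k
        s1 += k * d
        s2 += k * d * d
        r0 += k * v
        r1 += k * d * v
    # Solve the 2x2 weighted normal equations.
    det = s0 * s2 - s1 * s1
    intercept = (s2 * r0 - s1 * r1) / det
    slope = (s0 * r1 - s1 * r0) / det
    return intercept, slope


grid = [0.1 * k for k in range(101)]          # times 0, 0.1, ..., 10
curve = [math.sin(t) for t in grid]           # hypothetical smooth curve
value_at_5, slope_at_5 = local_linear(5.0, grid, curve, bandwidth=0.5)
```

Only observations within one bandwidth of t0 receive positive weight, so the curve is estimated one neighborhood at a time, with the local intercept giving the value of the curve at t0.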
-
Proposed Method
Local linear method for the full likelihood of Cox model
Our proposal differs from all previous methods in that we do not use the partial likelihood (to allow exact correction)
[Figure: two panels against Time (2 to 10), with subjects marked as artificially censored or removed]
-
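The Evolution of Likelihoods
Cox log likelihood:
$$\sum_{i=1}^{n}\left[\delta_i\{X_i(Y_i)^T\beta(Y_i)\} - \int_0^{Y_i}\exp\{X_i(t)^T\beta(t)\}\,dt\right]$$
Cox local likelihood:
$$\sum_{i=1}^{n}\left[K_h(Y_i - t_0)\,\delta_i\{X_i(Y_i)^T\beta(Y_i)\} - \int_0^{Y_i}K_h(t - t_0)\exp\{X_i(t)^T\beta(t)\}\,dt\right]$$
with local linear approx.: replace $\beta(t)$ by intercept + slope × t
with correction:
$$\sum_{i=1}^{n}\left[K_h(Y_i - t_0)\,\delta_i\{W_i(Y_i)^T\beta(Y_i)\} - \int_0^{Y_i}K_h(t - t_0)\exp\{W_i(t)^T\beta(t) - \tfrac{1}{2}\beta(t)^T\Sigma(t)\beta(t)\}\,dt\right]$$
under construction ... ...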
-
References
Liang Li, Bo Hu, Tom Greene (2009) A semiparametric joint model for longitudinal and survival data with application to hemodialysis study. Biometrics, in press.
Liang Li. Semiparametric joint modeling of nonlinear time-dependent covariate process and time to event outcome with varying coefficients. Working paper.