survival analysis: weeks 2-3 - stanford...
TRANSCRIPT
1
Survival Analysis: Weeks 2-3
Lu Tian and Richard Olshen
Stanford University
2
Kaplan-Meier(KM) Estimator
• Nonparametric estimation of the survival function
S(t) = pr(T > t)
• The nonparametric estimation is more robust and does not
depend on any parametric assumption.
3
If there is no censoring
• S(t) can be consistently estimated by
S(t) = n−1n∑
i=1
I(Ti > t).
• S is a discrete distribution with mass probability of n−1 at
observed times T1, · · · , Tn
• S(t) is the nonparametric maximum likelihood estimator
(NPMLE) for S(t).
4
CDF as NPMLE
• Assuming that F (·) is discrete with mass probability at
T1 < T2 < · · · < Tn, where {T1, T2, · · · } are observed times.
• Let f1 = pr(T = T1), f2 = pr(T = T2), · · · .
• Objective: estimate f1, f2, · · · .
• Method: maximize∏n
i=1 fi subject to∑n
i=1 fi = 1.
• Solution: f1 = f2 = · · · = fn = n−1.
5
Data and Assumptions
• Data: {(Ui, δi), i = 1, · · · , n} where Ui = min(Ti, Ci) and
δi = I(Ti ≤ Ci).
• Assumptions:
1. T1, · · · , Tn i.i.d. ∼ F (·) = 1− S(·)2. C1, · · · , Cn i.i.d. ∼ G(·)3. Ti ⊥ Ci, i = 1, · · · , n. Noninformative censoring!
6
If there is censoring
• Assuming that F (·) is discrete with mass probability at
v1 < v2 < · · · , where {v1, v2, · · · } are observed times.
• Let f1 = pr(T = v1), f2 = pr(T = v2), · · · .
• Objective: estimate f1, f2, · · · .
7
Example
• Obs: 2, 2, 3+, 5, 5+, 7, 9, 16, 16, 18+, where + means censored
• v1 = 2; v2 = 3, v3 = 5, v4 = 7, v5 = 9, v6 = 16, v7 = 18, v8 = 18+
• The likelihood function in terms of (f1, f2, · · · ) :
L(F ) = f21 (f3+f4+f5+f6+f7+f8)f3(f4+f5+f6+f7+f8)f4f5f
26 f8,
where f1 + f2 + f3 + f4 + f5 + f6 + f7 + f8 = 1
8
Reparametrization tricks
• The discrete hazard function: h1 = pr(T = v1) and
hj = pr(T = vj |T > vj−1), j > 2
• For t ∈ [vj , vj+1)
S(t) = pr(T > t) = pr(T > vj) =
j∏i=1
(1− hi).
• For t = vj
fj = f(t) = pr(T = t) = hj
j−1∏i=1
(1− hi).
9
Reparametrization tricks
• The likelihood function in terms of (h1, h2, · · · ) :
L(F ) =h21 × {(1− h1)(1− h2)} × {(1− h1)(1− h2)h3}
× {(1− h1)(1− h2)(1− h3)} × {(1− h1)(1− h2)(1− h3)h4}
× {(1− h1)(1− h2)(1− h3)(1− h4)h5}
× {(1− h1)(1− h2)(1− h3)(1− h4)(1− h5)h6}2
× {(1− h1)(1− h2)(1− h3)(1− h4)(1− h5)(1− h6)(1− h7)}
=h21(1− h1)
8 × (1− h2)8 × h3(1− h3)
6
h4(1− h4)4 × h5(1− h5)
3 × h26(1− h6)× (1− h7)
10
KM estimation
• The likelihood function
L(F ) =∏j
hdj
j (1− hj)Y (vj)−dj
where
dj =n∑
i=1
δiI(Ui = vj) = #failures at vj
Y (vj) =n∑
i=1
I(Ui ≥ vj) = #“at risk” at vj .
11
KM estimation
• hj = dj/Y (vj)
•
S(t) =
1 t < v1∏ji=1(1− hi) vj ≤ t < vj+1
,
which is the Kaplan-Meier Estimator.
12
Example
vj Y (vj) dj hj S(vj) =
j∏i=1
(1 − hi) = P (T > vj)
2 10 2 2/10 .8
5 7 1 1/7 .69 (= .8 × 67 )
7 5 1 1/5 .55 (= .69 × 45 )
9 4 1 1/4 .41 (= .55 × 34 )
16 3 2 2/3 .14 (= .41 × 13 )
18 1 0 0 .14
13
KM estimation
• Suppose that vg denotes the largest vj for which Y (vj) > 0.
1. if dg = Y (vj), then S(t) = 0 for t ≥ vg
2. if dg < Y (vj), then S(t) > 0 but not defined for t > vg. (Not
identifiable beyond vg.)
• The survival distribution may not be estimable with
right-censored data. Implicit extrapolation is sometimes used.
• The KM estimator can also be used to estimate the survival
function for the censoring distribution.
14
KM estimation
• KM estimator is a special MLE
1− S(t) = argmaxFL(F )
where F is the CDF for all discrete random variables
(nonparametric MLE).
15
Self-Consistency
• No censoring S(t) = n−1∑n
i=1 I(Ti > t)
• Right censoring: S(t) = n−1∑n
i=1 E(I(Ti > t)|Ui, δi)
1. E(I(Ti > t)|Ui, δi = 1) = I(Ui > t)
2. E(I(Ti > t)|Ui, δi = 0) = S(t)/S(Ui)I(t ≥ Ui) + I(Ui > t)
• Self-consistency iteration:
Snew(t) = n−1n∑
i=1
{I(Ui > t) + (1− δi)
Sold(t)
Sold(Ui)I(Ui ≤ t)
}
• The solution is still the KM estimator.
16
Redistribution of Mass
Step 1 2 2 3+ 5 5+ 7 9 16 16 18+
Step 2 110
110
110
110
110
110
110
110
110
110
Step 3 ↓ ↓ ↪→ 170
170
170
170
170
170
170
↓ ↓ ↓ ↪→ 15 (
870 )
15 (
870 )
15 (
870 )
15 (
870 )
15 (
870 )
︸ ︷︷ ︸ ︸ ︷︷ ︸ ︸ ︷︷ ︸Total
210 0 8
70 0 24175
24175
48175 Assume
Mass this is
somewhere
> 18
17
Nelson-Aalen Estimator
• How to estimate the cumulative hazard function?
H(t) =
j∑i=1
hi for vj ≤ t < vj+1
• H(t) = − log{S(t)}
− log{S(t)} =
j∑i=1
{− log(1− hi)} ≈j∑
i=1
hi = H(t)
for vj ≤ t < vj+1.
18
Asymptotical properties of KM estimator
• As n → ∞, S(t) → S(t) in probability.
• As n → ∞, n1/2{S(t)− S(t)} converges to N(0, σ2(t)) in
distribution.
19
Asymptotical properties of KM estimator
• How to estimate the variance of S(t)
• hi is an estimated probability.
• The variance of hi can be approximated by
hi(1− hi)
Y (vi)=
di(Y (vi)− di)
Y (vi)3
• hi and hj are asymptotically independent (why?).
20
Asymptotical variance of KM estimator
For vj ≤ t < vj+1 : Var(lnS(t)
)≈
j∑i=1
Var(ln(1− hi)
)δ−method
=
j∑i=1
Var(hi
)· 1
(1− hi)2
=
j∑i=1
diY (vi)(Y (vi)− di)
.
21
Asymptotical variance of KM estimator
Using the δ-method
Var(S(t)
)≈ Var
(lnS(t)
)(elnS(t)
)2
= S(t)2Var(lnS(t)
)
≈ S(t)2j∑
i=1
diY (vi)(Y (vi)− di)
(vj ≤ t < vj+1)
= σ2(t)
Greenwood’s formula
22
By-Product
The by-product of the Greenwood’s formula is the variance
estimator for Nelson-Aalen Estimator:
var(H(t)) =
j∑i=1
diY (vi)(Y (vi)− di)
, vj ≤ t < vj+1
23
Confidence Interval
• S(t)± 1.96σ(t), drawbacks?
• By δ-method
var(log(− log(S(t)))) =σ2(t)
(log(S(t)))2S(t)2
• The confidence interval for S(t)[exp{−e
log(− log(S(t)))− ˆ1.96σ(t)
log(S(t))S(t) }, exp{−elog(− log(S(t)))+
1.96σ(t)
log(S(t))S(t) }]
24
Median survival time
• How to estimate the median survival time
• Solving S(tM ) = 1/2, not always solvable!
• How to construct the CI for the median survival time?
Suppose that
1. pr(SL(t) < S(t)) = pr(SU (t) > S(t)) = 0.975.
2. SL(tML) = 0.5
3. SU (tMU ) = 0.5
4. The confidence interval for tM is [tML, tMU ].
25
CI for median survival time
0 1000 2000 3000 4000
0.0
0.2
0.4
0.6
0.8
1.0
2540 34453086
26
Median survival time
0.975 = pr(SL(tM ) < S(tM )) = pr(SL(tM ) < 0.5)
= pr(SL(tM ) < SL(tML)) = pr(tM ≥ tML)
0.975 = pr(SU (tM ) > S(tM )) = pr(SU (tM ) > 0.5)
= pr(SU (tM ) > SU (tMU )) = pr(tM ≤ tMU )
27
Restricted mean survival time
• The area under the survival curve is a nice summary for the
curve
• The AUC µ =∫ τ
0S(t)dt :
µ = tS(t)
∣∣∣∣τ0
+
∫ τ
0
tf(t)dt
= τS(τ) +
∫ τ
0
tf(t)dt =
∫ ∞
0
min(t, τ)f(t)dt = E{min(t, τ)}
• µ can be estimated as ∫ τ
0
S(t)dt.
28
Restricted mean survival time
• The restricted mean survival time E{min(T, τ)} can also be
estimated as
µIPW = n−1n∑
i=1
δi + (1− δi)I(Ui ≥ τ)
SC(Ti ∧ τ)Ti ∧ τ,
where SC(·) is a consistent estimator of the survival function of
the censoring time C. (How?)
• Rational
E
[I(Ci ≥ τ ∧ Ti)
SC(Ti ∧ τ)Ti ∧ τ |Ti
]≈ (Ti∧τ)
P (Ci ≥ τ ∧ Ti|Ti)
SC(Ti ∧ τ)= Ti∧τ
• This type of estimator is called the inverse probability
weighting estimator
29
Restricted mean survival time
• µ and µIPW are equivalent!
• The variance of the estimated area under the survival curve is
complicated (the derivation will be given later).
1
n
∫ τ
0
{∫ τ
tS(u)du
}2h(t)dt
P (U ≥ t).
30
Area under the KM curve
0 1000 2000 3000 4000
0.0
0.2
0.4
0.6
0.8
1.0
31
Model Checking
• Under the exponential distribution: log(S(t)) = −λt
• Plot t vs. log(S(t)) to visually check the expected linear
pattern.
• Under the Weibull distribution
log(− log(S(t))) = p log(λ) + p log(t)
• Plot log(t) vs. log(− log(S(t))) to visually check the expected
linear pattern.
32
Model Checking
4 5 6 7 8
−5
−4
−3
−2
−1
0
log(t)
log(
−lo
g(S
(t))
)
33
Coming Soon
• The distribution of S(t) as a process.