survival analysis: weeks 2-3 - stanford...

1

Survival Analysis: Weeks 2-3

Lu Tian and Richard Olshen

Stanford University

2

Kaplan-Meier(KM) Estimator

• Nonparametric estimation of the survival function

S(t) = pr(T > t)

• The nonparametric estimation is more robust and does not

depend on any parametric assumption.

3

If there is no censoring

• S(t) can be consistently estimated by

S(t) = n−1n∑

i=1

I(Ti > t).

• S is a discrete distribution with mass probability of n−1 at

observed times T1, · · · , Tn

• S(t) is the nonparametric maximum likelihood estimator

(NPMLE) for S(t).

4

CDF as NPMLE

• Assuming that F (·) is discrete with mass probability at

T1 < T2 < · · · < Tn, where {T1, T2, · · · } are observed times.

• Let f1 = pr(T = T1), f2 = pr(T = T2), · · · .

• Objective: estimate f1, f2, · · · .

• Method: maximize∏n

i=1 fi subject to∑n

i=1 fi = 1.

• Solution: f1 = f2 = · · · = fn = n−1.

5

Data and Assumptions

• Data: {(Ui, δi), i = 1, · · · , n} where Ui = min(Ti, Ci) and

δi = I(Ti ≤ Ci).

• Assumptions:

1. T1, · · · , Tn i.i.d. ∼ F (·) = 1− S(·)2. C1, · · · , Cn i.i.d. ∼ G(·)3. Ti ⊥ Ci, i = 1, · · · , n. Noninformative censoring!

6

If there is censoring

• Assuming that F (·) is discrete with mass probability at

v1 < v2 < · · · , where {v1, v2, · · · } are observed times.

• Let f1 = pr(T = v1), f2 = pr(T = v2), · · · .

• Objective: estimate f1, f2, · · · .

7

Example

• Obs: 2, 2, 3+, 5, 5+, 7, 9, 16, 16, 18+, where + means censored

• v1 = 2; v2 = 3, v3 = 5, v4 = 7, v5 = 9, v6 = 16, v7 = 18, v8 = 18+

• The likelihood function in terms of (f1, f2, · · · ) :

L(F ) = f21 (f3+f4+f5+f6+f7+f8)f3(f4+f5+f6+f7+f8)f4f5f

26 f8,

where f1 + f2 + f3 + f4 + f5 + f6 + f7 + f8 = 1

8

Reparametrization tricks

• The discrete hazard function: h1 = pr(T = v1) and

hj = pr(T = vj |T > vj−1), j > 2

• For t ∈ [vj , vj+1)

S(t) = pr(T > t) = pr(T > vj) =

j∏i=1

(1− hi).

• For t = vj

fj = f(t) = pr(T = t) = hj

j−1∏i=1

(1− hi).

9

Reparametrization tricks

• The likelihood function in terms of (h1, h2, · · · ) :

L(F ) =h21 × {(1− h1)(1− h2)} × {(1− h1)(1− h2)h3}

× {(1− h1)(1− h2)(1− h3)} × {(1− h1)(1− h2)(1− h3)h4}

× {(1− h1)(1− h2)(1− h3)(1− h4)h5}

× {(1− h1)(1− h2)(1− h3)(1− h4)(1− h5)h6}2

× {(1− h1)(1− h2)(1− h3)(1− h4)(1− h5)(1− h6)(1− h7)}

=h21(1− h1)

8 × (1− h2)8 × h3(1− h3)

6

h4(1− h4)4 × h5(1− h5)

3 × h26(1− h6)× (1− h7)

10

KM estimation

• The likelihood function

L(F ) =∏j

hdj

j (1− hj)Y (vj)−dj

where

dj =n∑

i=1

δiI(Ui = vj) = #failures at vj

Y (vj) =n∑

i=1

I(Ui ≥ vj) = #“at risk” at vj .

11

KM estimation

• hj = dj/Y (vj)

•

S(t) =

1 t < v1∏ji=1(1− hi) vj ≤ t < vj+1

,

which is the Kaplan-Meier Estimator.

12

Example

vj Y (vj) dj hj S(vj) =

j∏i=1

(1 − hi) = P (T > vj)

2 10 2 2/10 .8

5 7 1 1/7 .69 (= .8 × 67 )

7 5 1 1/5 .55 (= .69 × 45 )

9 4 1 1/4 .41 (= .55 × 34 )

16 3 2 2/3 .14 (= .41 × 13 )

18 1 0 0 .14

13

KM estimation

• Suppose that vg denotes the largest vj for which Y (vj) > 0.

1. if dg = Y (vj), then S(t) = 0 for t ≥ vg

2. if dg < Y (vj), then S(t) > 0 but not defined for t > vg. (Not

identifiable beyond vg.)

• The survival distribution may not be estimable with

right-censored data. Implicit extrapolation is sometimes used.

• The KM estimator can also be used to estimate the survival

function for the censoring distribution.

14

KM estimation

• KM estimator is a special MLE

1− S(t) = argmaxFL(F )

where F is the CDF for all discrete random variables

(nonparametric MLE).

15

Self-Consistency

• No censoring S(t) = n−1∑n

i=1 I(Ti > t)

• Right censoring: S(t) = n−1∑n

i=1 E(I(Ti > t)|Ui, δi)

1. E(I(Ti > t)|Ui, δi = 1) = I(Ui > t)

2. E(I(Ti > t)|Ui, δi = 0) = S(t)/S(Ui)I(t ≥ Ui) + I(Ui > t)

• Self-consistency iteration:

Snew(t) = n−1n∑

i=1

{I(Ui > t) + (1− δi)

Sold(t)

Sold(Ui)I(Ui ≤ t)

}

• The solution is still the KM estimator.

16

Redistribution of Mass

Step 1 2 2 3+ 5 5+ 7 9 16 16 18+

Step 2 110

110

110

110

110

110

110

110

110

110

Step 3 ↓ ↓ ↪→ 170

170

170

170

170

170

170

↓ ↓ ↓ ↪→ 15 (

870 )

15 (

870 )

15 (

870 )

15 (

870 )

15 (

870 )

︸︷︷︸︸︷︷︸︸︷︷︸Total

210 0 8

70 0 24175

24175

48175 Assume

Mass this is

somewhere

> 18

17

Nelson-Aalen Estimator

• How to estimate the cumulative hazard function?

H(t) =

j∑i=1

hi for vj ≤ t < vj+1

• H(t) = − log{S(t)}

− log{S(t)} =

j∑i=1

{− log(1− hi)} ≈j∑

i=1

hi = H(t)

for vj ≤ t < vj+1.

18

Asymptotical properties of KM estimator

• As n → ∞, S(t) → S(t) in probability.

• As n → ∞, n1/2{S(t)− S(t)} converges to N(0, σ2(t)) in

distribution.

19

Asymptotical properties of KM estimator

• How to estimate the variance of S(t)

• hi is an estimated probability.

• The variance of hi can be approximated by

hi(1− hi)

Y (vi)=

di(Y (vi)− di)

Y (vi)3

• hi and hj are asymptotically independent (why?).

20

Asymptotical variance of KM estimator

For vj ≤ t < vj+1 : Var(lnS(t)

)≈

j∑i=1

Var(ln(1− hi)

)δ−method

=

j∑i=1

Var(hi

)· 1

(1− hi)2

=

j∑i=1

diY (vi)(Y (vi)− di)

.

21

Asymptotical variance of KM estimator

Using the δ-method

Var(S(t)

)≈ Var

(lnS(t)

)(elnS(t)

)2

= S(t)2Var(lnS(t)

)

≈ S(t)2j∑

i=1


(vj ≤ t < vj+1)

= σ2(t)

Greenwood’s formula

22

By-Product

The by-product of the Greenwood’s formula is the variance

estimator for Nelson-Aalen Estimator:

var(H(t)) =

j∑i=1


, vj ≤ t < vj+1

23

Confidence Interval

• S(t)± 1.96σ(t), drawbacks?

• By δ-method

var(log(− log(S(t)))) =σ2(t)

(log(S(t)))2S(t)2

• The confidence interval for S(t)[exp{−e

log(− log(S(t)))− ˆ1.96σ(t)

log(S(t))S(t) }, exp{−elog(− log(S(t)))+

1.96σ(t)

log(S(t))S(t) }]

24

Median survival time

• How to estimate the median survival time

• Solving S(tM ) = 1/2, not always solvable!

• How to construct the CI for the median survival time?

Suppose that

1. pr(SL(t) < S(t)) = pr(SU (t) > S(t)) = 0.975.

2. SL(tML) = 0.5

3. SU (tMU ) = 0.5

4. The confidence interval for tM is [tML, tMU ].

25

CI for median survival time

0 1000 2000 3000 4000

0.0

0.2

0.4

0.6

0.8

1.0

2540 34453086

26

Median survival time

0.975 = pr(SL(tM ) < S(tM )) = pr(SL(tM ) < 0.5)

= pr(SL(tM ) < SL(tML)) = pr(tM ≥ tML)

0.975 = pr(SU (tM ) > S(tM )) = pr(SU (tM ) > 0.5)

= pr(SU (tM ) > SU (tMU )) = pr(tM ≤ tMU )

27

Restricted mean survival time

• The area under the survival curve is a nice summary for the

curve

• The AUC µ =∫ τ

0S(t)dt :

µ = tS(t)

∣∣∣∣τ0

+

∫ τ

0

tf(t)dt

= τS(τ) +

∫ τ

0

tf(t)dt =

∫ ∞

0

min(t, τ)f(t)dt = E{min(t, τ)}

• µ can be estimated as ∫ τ

0

S(t)dt.

28


• The restricted mean survival time E{min(T, τ)} can also be

estimated as

µIPW = n−1n∑

i=1

δi + (1− δi)I(Ui ≥ τ)

SC(Ti ∧ τ)Ti ∧ τ,

where SC(·) is a consistent estimator of the survival function of

the censoring time C. (How?)

• Rational

E

[I(Ci ≥ τ ∧ Ti)

SC(Ti ∧ τ)Ti ∧ τ |Ti

]≈ (Ti∧τ)

P (Ci ≥ τ ∧ Ti|Ti)

SC(Ti ∧ τ)= Ti∧τ

• This type of estimator is called the inverse probability

weighting estimator

29


• µ and µIPW are equivalent!

• The variance of the estimated area under the survival curve is

complicated (the derivation will be given later).

1

n

∫ τ

0

{∫ τ

tS(u)du

}2h(t)dt

P (U ≥ t).

30

Area under the KM curve

0 1000 2000 3000 4000

0.0

0.2

0.4

0.6

0.8

1.0

31

Model Checking

• Under the exponential distribution: log(S(t)) = −λt

• Plot t vs. log(S(t)) to visually check the expected linear

pattern.

• Under the Weibull distribution

log(− log(S(t))) = p log(λ) + p log(t)

• Plot log(t) vs. log(− log(S(t))) to visually check the expected

linear pattern.

32

Model Checking

4 5 6 7 8

−5

−4

−3

−2

−1

0

log(t)

log(

−lo

g(S

(t))

)

33

Coming Soon

• The distribution of S(t) as a process.

survival analysis: weeks 2-3 - stanford...

Documents