
Part [1.0] – Introduction to Development and Evaluation of Dynamic Predictions

• A Bansal & PJ Heagerty
• University of Washington

1 Biomarkers


Outline

• Introductions
• Evaluation – time-dependent accuracy
  • Breast cancer biomarker
  • Mayo PBC Data
  • Illustration: Multiple Myeloma
• Development of dynamic predictions
  • End Stage Renal Disease Study
• Decision theoretic evaluation
• Practice with R

Some Context – Armitage lecture 2010

“Whereas measurements and event data are very often collected together, the methods of analyzing them belong to two different fields, survival analysis and longitudinal analysis, which rarely connect.”

“The separation between event data and longitudinal measurements is artificial.”

“In epidemiology, there is a strong push towards the analysis of life-history data because such data are now increasingly becoming available due to the development of information technology.”

“To develop statistical tools for an integrated analysis of such data is a big challenge.”

O. Aalen (2012)

Part [1.1] – Measures of Classification Accuracy for the Prediction of Survival Times

Session Outline

• Examples
  • Breast Cancer Cytology Data
  • Mayo PBC Data
  • Cystic Fibrosis Foundation Registry Data
• ROC overview
• TP, FP for survival outcomes
• ROC^{C/D}_t(p)
• ROC^{I/D}_t(p), AUC(t), and concordance, C^τ
• Further work

Example: Comparing cytometry measures

Breast Cancer among Younger Women

• N = 253 women diagnosed with breast cancer (BC) at ages 20 to 44.
• Endpoint: time-until-death (any cause)
• Cytometry measurements:
  • “old” (S-phase ungated)
  • “new” (S-phase gated)
• Goal: compare measures as predictors of mortality
• Heagerty, Lumley & Pepe (2000)

[Figure: New versus Old Method. Scatter plot of %S Gated (y-axis, 0 to 7) versus %S Ungated (x-axis, 0 to 7).]

Predictive Survival Models

Mayo PBC Data

• N = 312 subjects, 125 deaths, 1974-1986

• Baseline measurements: bilirubin, prothrombin time, albumin, ...

• Goal: predict mortality; “medical management”

Predictive Survival Models

Cystic Fibrosis Data

• N = 23,530 subjects, 4,772 deaths, 1986-2000

• n = 160,005 longitudinal observations

• Longitudinal measurements: FEV1, weight, height

• Goal: predict mortality; transplantation selection

[Figure: FEV1 (0 to 100) versus age (10 to 50), one panel per subject, for a sample of individual subjects.]

Components of Accuracy

• Calibration
  • Bias – does observed match predicted?
  • Evaluated graphically and formally.
• Discrimination
  • Does prediction separate subjects with different risks?
  • Evaluated qualitatively based on K-M plots.

Harrell, Lee and Mark (1996); Hosmer and Lemeshow (2013)

Calibration: example from Levy et al. (2006)


BC Survival: New Measurement (gated)

[Figure (a): Survival for %S, Gated. Survival probability (0.0 to 1.0) versus time (0 to 150) for three groups: %S in [0.0, 2), N = 86; %S in [2, 6), N = 86; %S in [6, 100), N = 81.]

BC Survival: Old Measurement (ungated)

[Figure (b): Survival for %S, Ungated. Survival probability (0.0 to 1.0) versus time (0 to 150) for three groups: %S in [0.0, 1.1), N = 86; %S in [1.1, 3.5), N = 86; %S in [3.5, 100), N = 81.]

Discrimination

• Common to use ROC curves for logistic regression / binary classification.

Q: Extend classification error rate concepts to survival data?

Binary Classification

Sensitivity (“True Positive”)

BINARY TEST: P(T+ | D = 1)

CONTINUOUS MARKER: P(M > c | D = 1)

Specificity (“True Negative”)

BINARY TEST: P(T− | D = 0)

CONTINUOUS MARKER: P(M ≤ c | D = 0)

ROC Curve

An ROC curve plots the True Positive Rate, TP(c), versus the False Positive Rate, FP(c), for all possible cutpoints, c:

FP(c) = P(M > c | D = 0)

TP(c) = P(M > c | D = 1)

ROC Curve : [ FP(c), TP(c) ] ∀c
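
A minimal sketch of the empirical version of this curve, in Python with NumPy. The function name `empirical_roc` and the toy data are illustrative assumptions and do not come from the slides; each observed marker value is used as a cutpoint c, and "positive" means M > c.

```python
import numpy as np

def empirical_roc(marker, disease):
    """Return (FP, TP) arrays over all cutpoints c, calling M > c 'positive'."""
    marker = np.asarray(marker, dtype=float)
    disease = np.asarray(disease, dtype=bool)
    # Cutpoints: -inf (everyone positive) plus each distinct observed marker value.
    cuts = np.concatenate(([-np.inf], np.unique(marker)))
    fp = np.array([(marker[~disease] > c).mean() for c in cuts])  # P(M > c | D = 0)
    tp = np.array([(marker[disease] > c).mean() for c in cuts])   # P(M > c | D = 1)
    return fp, tp

# Toy data, made up for illustration only.
m = [0.2, 1.5, 0.7, 2.1, 1.0, 0.1]
d = [0, 1, 0, 1, 1, 0]
fp, tp = empirical_roc(m, d)
```

Plotting `tp` against `fp` traces the curve from (1, 1) at the lowest cutpoint to (0, 0) at the highest.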

[Figure, repeated over ten slides: “Marker versus Disease status” (marker values from -2 to 2 for D = 0 and D = 1) alongside the “ROC curve” (sensitivity versus 1 − specificity, both 0.0 to 1.0).]

ROC Curves

1. Compare different markers over full spectrum of error combinations.

2. Compare sensitivity when controlling specificity (e.g. TP when FP = 10%).

3. AUC interpretation: “For a randomly chosen case and control, the area under the ROC curve is the probability that the marker for the case is greater than the marker for the control.”

4. AUC is a marker-outcome concordance summary.
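
The probability interpretation in item 3 can be checked directly by comparing every case with every control. The following is a small sketch in Python; the helper name `auc_concordance`, the toy data, and the convention of counting tied pairs as one half are assumptions made for illustration.

```python
import numpy as np

def auc_concordance(marker, disease):
    """AUC as P(M_case > M_control), with ties counted as 1/2 (a common convention)."""
    m = np.asarray(marker, dtype=float)
    d = np.asarray(disease, dtype=bool)
    cases = m[d][:, None]       # column vector of case markers
    controls = m[~d][None, :]   # row vector of control markers
    # Broadcasting forms every (case, control) pair.
    return (cases > controls).mean() + 0.5 * (cases == controls).mean()

# Toy data, made up for illustration only.
m = [0.2, 1.5, 0.7, 2.1, 0.5, 0.1]
d = [0, 1, 0, 1, 1, 0]
auc = auc_concordance(m, d)   # 8 of the 9 case-control pairs are concordant
```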


Does a (repeated) measurement predict onset?

• Q: Can a measurement accurately predict which CASES will experience an event (soon)?
  • e.g. FEV1
  • e.g. death time
• Q: Can a measurement be used to accurately guide longitudinal treatment decisions?
  • e.g. lung transplantation

Classification Errors

• True Positive Rate

P( high measurement | CASE )

• False Positive Rate

P( high measurement | CONTROL )


Issues Related to Time

• Q: When is the measurement taken?
  • At baseline: Y(0)
  • At a follow-up time t: Y(t)
• Q: What time is used to determine when someone is a CASE or a CONTROL?
  • Case = event (disease, death) before time t.
  • Case = event (disease, death) at time t.
  • Control = event-free through time t.
  • Control = event-free through a large follow-up time t*.

Why study time-varying accuracy?

• Marker used to guide decisions at multiple subsequent time points

• However, performance of marker may vary over time as an individual’s underlying clinical status changes

• Set of subjects still alive (risk set) changes over time

• Could be used to guide therapy: target highly aggressive disease early on, and use therapies aimed at longer-term outcomes later

Sensitivity and Specificity for Survival

• Let T denote the survival time.
• Possible definitions:
  • Cases(t):
    1. Cumulative: event before time t: T ≤ t
    2. Incident: event at time t: T = t
  • Controls(t):
    ◦ Dynamic: event-free through time t: T > t


Cumulative Cases, Dynamic Controls

At time t,

• Cumulative (prevalent) cases: 0 ≤ T ≤ t

• Dynamic controls: T > t

(Heagerty, Lumley, Pepe, 2000)

Cumulative Cases, Dynamic Controls

[Figure: subject (y-axis, up to 20) versus study time (x-axis, 0 to 8); legend: Died, Censored.]

Example: t = 2 years

[Figure: subject (y-axis, up to 20) versus study time (x-axis, 0 to 8).]

Example: Identify subjects at high risk in the next 2 years to inform decisions regarding intervention
- More intense screening, preventative treatment


[Figure: timelines over study time (0 to 8) contrasting “Poor marker performance” with “Excellent marker performance”.]

Cumulative/Dynamic ROC

• More formally,

sensitivity^C(c, t) = P(M > c | case(t)) = P(M > c | T ≤ t)

specificity^D(c, t) = P(M ≤ c | control(t)) = P(M ≤ c | T > t)

• Using these definitions, define the ROC(t) curve for any time t.

• Area Under the Curve:

AUC^{C/D}(t) = P(M_j > M_k | j = case(t), k = control(t))

= P(M_j > M_k | T_j ≤ t, T_k > t)
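
As a rough illustration of the cumulative/dynamic definitions, the sketch below computes sensitivity among cumulative cases (T ≤ t), specificity among dynamic controls (T > t), and the corresponding AUC. It assumes every event time is observed, so it ignores censoring entirely; the function names and the toy data are hypothetical.

```python
import numpy as np

def cd_accuracy(marker, time, c, t):
    """Cumulative sensitivity and dynamic specificity at cutpoint c and time t.

    Simplifying assumption: no censoring, so every subject can be classified.
    """
    m = np.asarray(marker, dtype=float)
    T = np.asarray(time, dtype=float)
    sens = (m[T <= t] > c).mean()    # P(M > c | T <= t)
    spec = (m[T > t] <= c).mean()    # P(M <= c | T > t)
    return sens, spec

def cd_auc(marker, time, t):
    """P(M_j > M_k | T_j <= t, T_k > t), ties counted as 1/2."""
    m = np.asarray(marker, dtype=float)
    T = np.asarray(time, dtype=float)
    cases = m[T <= t][:, None]
    controls = m[T > t][None, :]
    return (cases > controls).mean() + 0.5 * (cases == controls).mean()

# Toy data, made up for illustration only.
marker = [3.0, 2.5, 0.5, 1.0, 2.0, 0.2]
time = [1.0, 1.5, 6.0, 3.0, 2.5, 8.0]
sens, spec = cd_accuracy(marker, time, c=1.5, t=2.0)
auc_t2 = cd_auc(marker, time, t=2.0)
```

With censored data, subjects censored before t cannot be classified this way, which is the problem the estimation slides address.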


Cumulative Cases, Dynamic Controls

[Figures: subject (y-axis, up to 20) versus study time (x-axis, 0 to 8), shown in turn for t = 2, t = 4, and t = 6.]

Estimation of Cumulative/Dynamic ROC

• Main issue: Censoring. Cannot classify everyone.

• Valid ROC estimator using nonparametric nearest neighbor estimation proposed by Heagerty, Lumley, Pepe (2000)

• Bootstrap resampling method for obtaining confidence intervals
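
The resampling idea can be sketched as follows. This is a generic percentile bootstrap applied to a simple binary-outcome AUC, not the nearest neighbor estimator of the slides; the function names, the number of resamples, the seed, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def auc(m, d):
    """P(M_case > M_control), ties counted as 1/2."""
    cases = m[d][:, None]
    controls = m[~d][None, :]
    return (cases > controls).mean() + 0.5 * (cases == controls).mean()

def bootstrap_ci(marker, disease, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap interval: resample subjects with replacement."""
    rng = np.random.default_rng(seed)
    m = np.asarray(marker, dtype=float)
    d = np.asarray(disease, dtype=bool)
    n = len(m)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)       # one bootstrap resample of subjects
        if d[idx].all() or not d[idx].any():
            continue                           # skip resamples with no cases or no controls
        stats.append(auc(m[idx], d[idx]))
    alpha = (1.0 - level) / 2.0
    return np.quantile(stats, [alpha, 1.0 - alpha])

# Toy data, made up for illustration only.
m = [0.2, 1.5, 0.7, 2.1, 0.5, 0.1, 1.1, 0.9, 1.8, 0.3]
d = [0, 1, 0, 1, 1, 0, 1, 0, 1, 0]
lo, hi = bootstrap_ci(m, d)
```

For a time-dependent estimator, the same loop would resample whole subjects (marker, follow-up time, and event indicator together) and recompute the estimate on each resample.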

[Figure: ROC curves (sensitivity versus 1 − specificity) in six panels: (a) ROC(40m), NNE; (b) ROC(60m), NNE; (c) ROC(100m), NNE; (d) ROC(40m), Kaplan-Meier; (e) ROC(60m), Kaplan-Meier; (f) ROC(100m), Kaplan-Meier.]

Accuracy Comparisons

Sensitivity

• Using gated (new) measurement:
  ◦ P[M1 ≤ 5.4 | N(60m) = 0] = 0.71
  ◦ P[M1 > 5.4 | N(60m) = 1] = 0.82
• Using ungated (old) measurement:
  ◦ P[M2 ≤ 3.5 | N(60m) = 0] = 0.71
  ◦ P[M2 > 3.5 | N(60m) = 1] = 0.54
• Therefore, with the cutpoints for M1 and M2 chosen to give equal specificity (0.71), the new measure, M1, has greater sensitivity (0.82 versus 0.54).

Accuracy Comparisons

AUC Calculations

• AUC for gated (new): 0.80, Bootstrap 95% CI: (0.72, 0.89)
• AUC for ungated (old): 0.68, Bootstrap 95% CI: (0.56, 0.77)
• 95% CI for difference in areas (new − old): (0.03, 0.26)

Recall our goal

How do we evaluate time-varying performance?


Time-varying performance?

[Figure: timelines over study time (0 to 8).]

Performance averaged over previous time points.

Extension to assess time-varying trends

[Figure: timelines over study time (0 to 8).]

At time t,

• Cumulative (prevalent) cases: t ≤ T ≤ t′

• Dynamic controls: T > t′

• Subset data at t to include only subjects with T ≥ t

• We set t′ = t + 1

• Look at cumulative incidence over 1-year span, i.e. T ≤ t + 1
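
The subsetting step above can be sketched as follows: at each time t, keep only subjects with T ≥ t, treat events in the next window as cases, and treat subjects surviving past the window as controls. The sketch assumes no censoring; the function name `windowed_cd_auc`, the `width` parameter, and the toy data are hypothetical.

```python
import numpy as np

def windowed_cd_auc(marker, time, t, width=1.0):
    """AUC for cases with t <= T <= t + width versus controls with T > t + width.

    Simplifying assumption: all event times are observed (no censoring).
    """
    m = np.asarray(marker, dtype=float)
    T = np.asarray(time, dtype=float)
    at_risk = T >= t                           # subset the data at time t
    m, T = m[at_risk], T[at_risk]
    cases = m[T <= t + width][:, None]         # events within the window
    controls = m[T > t + width][None, :]       # event-free through the window
    if cases.size == 0 or controls.size == 0:
        return float("nan")                    # undefined without both groups
    return float((cases > controls).mean() + 0.5 * (cases == controls).mean())

# Toy data, made up for illustration only.
marker = [3.0, 0.4, 0.5, 1.0, 2.0, 0.2]
time = [0.5, 1.5, 6.0, 3.0, 2.5, 8.0]
auc_by_t = {t: windowed_cd_auc(marker, time, t) for t in (0.0, 1.0, 2.0)}
```

Evaluating the function over a grid of t values gives a curve that can reveal whether a marker discriminates well early in follow-up but poorly later, or the reverse.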

Cumulative/Dynamic ROC for time-varying trends

[Figures: subject (y-axis, up to 20) versus study time (x-axis, 0 to 8); legend: Died, Censored; shown in turn for t = 0, t = 2, and t = 4.]

Cumulative Cases, Dynamic Controls

[Figure: timelines over study time (0 to 8).]

Can incorporate updated biomarker measurements!

Summary

• Define time-dependent ROC curves based on prospective data and cumulative occurrence of events.

• Local survival estimation handles censored event times.

• Tool to evaluate a marker or a survival regression model score.

• R package: survivalROC (see web for doc/validation)

• Can be extended to general longitudinal marker scenario

• Heagerty, Lumley and Pepe (2000) “Time-dependent ROC Curves for Censored Survival Data and a Diagnostic Marker”, Biometrics

Incident Cases, Dynamic Controls

At time t,

• Incident cases: T = t

• Dynamic controls: T > t

(Heagerty & Zheng, 2005)

⇒ At time t, only those still at risk are classified as cases or controls

Incident Cases, Dynamic Controls

[Figures: subject (y-axis, up to 20) versus study time (x-axis, 0 to 8); legend: Died, Censored; shown for t = 4.]

Incident/Dynamic ROC to assess time-varying trends

[Figures: subject (y-axis, up to 20) versus study time (x-axis, 0 to 8), shown in turn for t = 0, t = 2, t = 4, and t = 6.]

Incident/Dynamic ROC

• More formally,

sensitivity^I(c, t) = P(M > c | case(t)) = P(M > c | T = t)

specificity^D(c, t) = P(M ≤ c | control(t)) = P(M ≤ c | T > t)

• Using these definitions, define the ROC(t) curve for any time t.

• Area Under the Curve:

AUC^{I/D}(t) = P(M_j > M_k | j = case(t), k = control(t))

= P(M_j > M_k | T_j = t, T_k > t)


AUC(t) and Concordance

Q: The I/D ROC curve and AUC(t) provide time-specific summaries of accuracy, but is there a single global summary?

Concordance:

C = P(M_j > M_k | T_j < T_k)

C = ∫_t AUC(t) · w(t) dt

with w(t) = 2 · f(t) · S(t)
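
The weight w(t) = 2 · f(t) · S(t) follows from conditioning on the earlier of the two event times. A short derivation sketch, assuming two independent subjects j and k with continuous survival times, density f, and survival function S, and writing AUC(t) for the incident/dynamic quantity P(M_j > M_k | T_j = t, T_k > t):

```latex
\begin{align*}
P(T_j < T_k) &= \int f(t)\, S(t)\, dt = \tfrac{1}{2}, \\
P(M_j > M_k,\; T_j < T_k)
  &= \int P(M_j > M_k \mid T_j = t,\; T_k > t)\, S(t)\, f(t)\, dt
   = \int \mathrm{AUC}(t)\, f(t)\, S(t)\, dt, \\
C = P(M_j > M_k \mid T_j < T_k)
  &= \frac{P(M_j > M_k,\; T_j < T_k)}{P(T_j < T_k)}
   = \int \mathrm{AUC}(t)\, \underbrace{2\, f(t)\, S(t)}_{w(t)}\, dt .
\end{align*}
```

The factor 2 is the reciprocal of P(T_j < T_k) = 1/2, so w(t) integrates to one and C is a weighted average of AUC(t).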


AUC(t) and Concordance

• Time can be restricted to (0, τ) to obtain:

C^τ = P(M_j > M_k | T_j < T_k, T_j < τ)

= ∫_0^τ AUC(t) · w^τ(t) dt

with w^τ(t) = w(t) / [1 − S^2(τ)]
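
The normalizing constant 1 − S²(τ) is what makes the restricted weights integrate to one. A brief check, assuming a continuous survival time with S(0) = 1 and dS(t)/dt = −f(t):

```latex
\begin{align*}
\int_0^{\tau} w(t)\, dt
  = \int_0^{\tau} 2\, f(t)\, S(t)\, dt
  = \bigl[-S^2(t)\bigr]_0^{\tau}
  = 1 - S^2(\tau),
\qquad\text{so}\qquad
\int_0^{\tau} w^{\tau}(t)\, dt = 1 .
\end{align*}
```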


Estimation: Issues

• FP^D_t(c) = P(M > c | T > t) can be estimated non-parametrically for times when Σ_i R_i(t) is moderate-to-large, where R_i(t) = 1(T*_i > t) is the “at-risk” indicator.

• However, estimation of TP^I_t(c) = P(M > c | T = t) requires some sort of smoothing, since the observed subset with T_i = t may contain only one observation.

• Semiparametric method smooths by fitting a hazard model (Heagerty & Zheng, 2005)

• Nonparametric method uses kernel-based smoothing; preferable due to fewer assumptions (Saha-Chaudhuri & Heagerty, 2013; Bansal & Heagerty, in progress)

• Bootstrap resampling method for obtaining confidence intervals


Nonparametric Estimation of AUC^{I/D}(t)

Recall:

• AUC^{I/D}(t) = P(M_j > M_k | j = case, k = control)

• Interpretation: Given a random case and a random control, the probability that the case has a higher marker value than the control

[Figure: case rank (y-axis, 0.0 to 1.0) versus survival time (x-axis, 7 to 12).]

At time t,

• Calculate a rank for each case in the risk set relative to the controls in the risk set

• Mean rank at time t is calculated as the mean of the ranks for all cases in the risk set at t

• For window around t, AUC(t) = local average of the mean ranks across all observed event times in that window
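
The three steps above can be sketched in Python. This is an illustrative simplification rather than the published estimator: the rank of a case is taken to be the fraction of risk-set controls with a smaller marker value (ties counted as 1/2), and the local average uses a uniform window of half-width `bandwidth`. All function names, the window choice, and the toy data are assumptions.

```python
import numpy as np

def mean_case_ranks(marker, obs_time, event):
    """Mean case rank at each distinct observed event time.

    Controls at time s are subjects still under observation after s
    (obs_time > s), so censored subjects contribute while they are at risk.
    """
    m = np.asarray(marker, dtype=float)
    y = np.asarray(obs_time, dtype=float)
    e = np.asarray(event, dtype=bool)
    times, ranks = [], []
    for s in np.unique(y[e]):                    # distinct event times, sorted
        cases = m[(y == s) & e][:, None]
        controls = m[y > s][None, :]
        if controls.size == 0:
            continue                             # no controls left in the risk set
        r = (cases > controls).mean() + 0.5 * (cases == controls).mean()
        times.append(s)
        ranks.append(r)
    return np.array(times), np.array(ranks)

def auc_id(t, times, ranks, bandwidth):
    """Local average of the mean ranks for event times within `bandwidth` of t."""
    in_window = np.abs(times - t) <= bandwidth
    return float(ranks[in_window].mean()) if in_window.any() else float("nan")

# Toy data, made up for illustration only (event = 0 means censored).
marker = [3.0, 0.4, 0.5, 1.0, 2.0, 0.2]
obs_time = [0.5, 1.5, 6.0, 3.0, 2.5, 8.0]
event = [1, 1, 0, 1, 1, 0]
times, ranks = mean_case_ranks(marker, obs_time, event)
auc_at_2 = auc_id(2.0, times, ranks, bandwidth=1.0)
```

A smoother kernel, or a nearest-neighbor span, could replace the uniform window without changing the structure of the calculation.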


Motivation: Treatment Prioritization

• Organ transplantation seeks to prioritize limited donor organs by identifying those subjects who are at risk of death without intervention (and who would do well if transplanted).
  • Lung Allocation Score (see Gries et al. 2010)
  • MELD Score (Model for Endstage Liver Disease)

• The scientific goal is one where over time a good model/marker would identify those subjects at risk of death (from among those still at-risk).

• Q: Where do diseased subjects who die rank among those in the risk set?

Additional Comments

• For a baseline marker the C-index can also be estimated as the global weighted average:

C = ∫ A(t) · 2 f(t) S(t) dt

• We can also directly apply the WMR estimator to time-dependent covariates, M(t), since the method is based on risk sets and the case rank within the risk set.

• Time-dependent Covariate Example:
  • Cystic Fibrosis Data
  • FEV1, height, and weight are time-dependent

[Figure: AUC Based on Risk Set Rank. AUC(t) / Rank (y-axis, 0.0 to 1.0) versus Time (x-axis, 10 to 50).]

Some Current Extensions

• Q: How often does the CASE marker rank in the top 10% of the risk set (or among controls)?

• This concept is directly connected to sensitivity:

E{1[A(t) > (1 − p)]} = P[CASE(t) > 100(1 − p)% of CONTROLS(t)]

= TP^{I/D}(p, t)

(Saha-Chaudhuri & Heagerty, 2018)

Some Current Extensions

• Non-Parametric TP^{I/D}
  • Select a value of false-positive rate: p
  • Derive the indicators:

    H(t, p) = 1[A(t) > (1 − p)]

  • Locally weighted averages to obtain smooth curve in time:

    TP^{I/D}_{h_n}(t, p) = Σ_j K*_{h_n}(t − t_j) · H(t_j, p)
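
A minimal sketch of the indicator-and-smoothing step, assuming the case ranks at each event time are already available as fractions in [0, 1]. The Gaussian kernel, the normalization of the weights to sum to one, the function name `tp_id`, and the toy inputs are illustrative assumptions; the slides do not specify the kernel.

```python
import numpy as np

def tp_id(t, event_times, case_rank, p, bandwidth):
    """Kernel-weighted share of cases near time t whose rank exceeds 1 - p."""
    s = np.asarray(event_times, dtype=float)
    a = np.asarray(case_rank, dtype=float)
    h = (a > 1.0 - p).astype(float)                  # indicator 1[A > 1 - p] at each event time
    k = np.exp(-0.5 * ((t - s) / bandwidth) ** 2)    # Gaussian kernel weights (an assumption)
    return float(np.sum(k * h) / np.sum(k))          # weights normalized to sum to one

# Toy inputs, made up for illustration only.
event_times = [0.5, 1.5, 2.5, 3.0]
case_rank = [1.0, 0.25, 1.0, 1.0]
tp_top10 = tp_id(2.0, event_times, case_rank, p=0.10, bandwidth=1.0)
tp_all = tp_id(2.0, event_times, case_rank, p=1.0, bandwidth=1.0)
```

Setting p = 0.10 asks how often the case ranks in the top 10% of the risk set near time t; with p = 1.0 every case with a positive rank qualifies, so the estimate is 1.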


Summary

• The case rank is a descriptive summary that is clinically meaningful.

• Using the case rank provides a basis for non-parametric estimation of time-dependent accuracy summaries.

• WMR provides non-parametric estimation with analytical expressions for standard errors.

• Methods extend to allow time-dependent markers.

• Methods extend to estimation of time-dependent sensitivity.

Illustration: Multiple Myeloma

Overall survival from registration in study

• Survival curves for Multiple Myeloma have differently shaped segments

• Are different segments governed by different baseline prognostic variables (due to different disease biology)?

• Variables representing disease aggressiveness are more dominant earlier

Illustration: Multiple Myeloma

• Goal: To compare different prognostic markers with respect to their performance over time

• Their approach: Examine associations of variables with survival using Cox proportional hazards regression

• Our approach: Classification

(Bansal & Heagerty, 2019)

Illustration: Multiple Myeloma

• N = 775

• Prospective randomized trial comparing high-dose chemoradiotherapy to standard chemotherapy

• Median follow-up = 8.2 years

• Median survival = 4.0 years

• Survival similar in both study arms. Pooled together for prognostic marker analysis

Candidate Prognostic Markers

8 baseline variables investigated:

• Age

• Albumin

• Calcium

• Creatinine

• Hemoglobin

• Lactate Dehydrogenase (LDH)

• Platelet Count

• Serum beta-2-microglobulin (SB2M)

Methods

[Figure: schematic timelines over study time (0 to 8) for AUC^{Cumulative/Dynamic}(t) and for AUC^{Incident/Dynamic}(t).]

Results

[Figure: AUC(t) (y-axis, 0.40 to 0.70) versus Time (years, 0 to 6) in four panels: AUC^{C/D}(t) and AUC^{I/D}(t) for Platelet Count, SB2M, Creatinine, and Age; AUC^{C/D}(t) and AUC^{I/D}(t) for Albumin, Calcium, LDH, and Hemoglobin.]

Summary

Accuracy summary

ROC^{I/D}_t(p) : vary (M, t)

AUC(t) : vary (t)

C : global