roccurves for cohort data - haben michael · introduction: data and roc curves longitudinal...

49
Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work ROC Curves for Cohort Data Haben Michael Adviser: Lu Tian Summer 2017 1 / 49

Upload: others

Post on 11-Oct-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

ROC Curves for Cohort Data

Haben MichaelAdviser: Lu Tian

Summer 2017

1 / 49

Page 2: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Outline

Introduction: Data and ROC Curves

Longitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)

Future Work

2 / 49

Page 3: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Yale Prospective Longitudinal Pediatric HIV Cohort (1996–2013)

I 104 patients with HIV monitored over a period of 17 years, onaverage 6.7 yo at enrollment

I median 32.5 visits/patient, median 1 visit/3 months

3 / 49

Page 4: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

visit schedule

4 / 49

Page 5: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Among the measurements taken at each visit, we focus on:

I response: “blip” status, a binary variable representing atransient spike in viral load

I predictor #1: CD4 count, # CD4 cells/mm3 blood

I predictor #2: CD4 percentage, the proportion of CD4 in asmall sample

5 / 49

Page 6: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

The goal of therapy is to keep CD4 count high and to suppressviral load, and both are tested regularly (every 3-6 months)

I viral load is the amount of HIV in a sample, indicating, e.g.,transmission risk

I CD4 count measures immunosuppression–risk of opportunisticinfections and strength of immune system (∼ 500+ normal,< 200 is an AIDS diagnosis)

I “strongest predictor of HIV progression”–US DHHS

I CD4 percentage measures the proportion of white blood cellsthat are CD4 cells

I usually more stable than CD4 count, but regarded as a poorerpredictor of disease progression

6 / 49

Page 7: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Even when viral load is suppressed, transient spikes (“blips”) inviral load are often observed

Blips are often ignored by clinicians, but recent research hasestablished an association with the depletion of CD4 cells

CD4 count is the currently used predictor of blip status, but CD4percentage is much cheaper to measure. How can we compare thequality of the two predictors?

7 / 49

Page 8: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

8 / 49

Page 9: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

ROC curve

I graphical diagnostic of the performance of a threshold-basedclassifier

I plot of TPR vs FPR for a range of thresholds

I area under curve (AUC), summary statistic between 0 and 1,probability that a non-diseased marker is less than a diseasedmarker

9 / 49

Page 10: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

10 / 49

Page 11: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Suppose we want to use an ROC curve or derived statistics tocompare CD4 count and percentage as predictors of blip status.

11 / 49

Page 12: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

12 / 49

Page 13: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Three types of dependence to contend with:I longitudinal dependence (a) within and (b) between a given

patient’s diseased and non-diseased observations,I common variance estimators assume iid observations within

each population, and between populations

I and (c) pairwise dependence between a given patient’spredictors, CD4 count and percent

I common methods of comparison of ROC or AUC data assumeindependent samples for each predictor

(data from different patients assumed independent)

13 / 49

Page 14: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

patients

−−−−−−−→

(S11,

(X11

X ′11

)), (S12,

(X12

X ′12

)), . . . , (S1n1 ,

(X1n1

X ′1n1

))

(S21,

(X21

X ′21

)), (S22,

(X22

X ′22

)), . . . , (S2n2 ,

(X2n2

X ′2n2

))

...

(SN1,

(XN1

X ′N1

)), (SN2,

(XN2

X ′N2

)), . . . , (SNnN ,

(XNnN

X ′NnN

))

−−−−−−−→time/visits

14 / 49

Page 15: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Outline

Introduction: Data and ROC Curves

Longitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)

Future Work

15 / 49

Page 16: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

patients

−−−−−−−→

(S11,

(X11

X ′11

)), (S12,

(X12

X ′12

)), . . . , (S1n1 ,

(X1n1

X ′1n1

))

(S21,

(X21

X ′21

)), (S22,

(X22

X ′22

)), . . . , (S2n2 ,

(X2n2

X ′2n2

))

...

(SN1,

(XN1

X ′N1

)), (SN2,

(XN2

X ′N2

)), . . . , (SNnN ,

(XNnN

X ′NnN

))

−−−−−−−→time/visits

16 / 49

Page 17: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

I With non-iid data the empirical ROC curve isn’t consistentI consistent for what?

I A common approach is to fit a glm

logit{P(Sij = 1) | Xi1, . . . ,Xini} = αi + βiXij

with (αi , βi ) multivariate normal and construct an ROC curvefrom the fitted values{

(αi + βiXij ,Sij), i = 1, · · · , n; j = 1, · · · , ni}

17 / 49

Page 18: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

{(αi + βiXij ,Sij), i = 1, · · · , n; j = 1, · · · , ni}

I Predictiveness of the biomarkers Xi1, . . . ,Xini , versus a“score” fi (Xi1, . . . ,Xini ), e.g., αi + βiXij

I This measures predictiveness of both the biomarker and themodel

I E.g., intercept term large in magnitude relative to slope term,status is nearly constant within a cluster, nearly perfect ROCcurves, regardless of quality of the biomarker Xij .

18 / 49

Page 19: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

I It is reasonable that the analyst should exploit predictivenessof the model if that is available

I If the model is poor, the scoring system may not do justice tothe biomarker

19 / 49

Page 20: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Even assuming the model is correct, problems remain:

I In practice, the model is fit to the data,{(αi + βiXij ,Sij), i = 1, · · · , n; j = 1, · · · , ni

}so the predictiveness of the model will be overestimated

I Can’t set aside a test set of patients because the subjecteffects are drawn independently. Could set aside along the jdimension, which is close to what we do.

I The fit uses the entire data set, including “future”observations

I We adopt the vantage of an analyst who has a patient’shistory of biomarkers and disease statuses(Xij ,Sij), j = 1, . . . , ni , is confronted with a new biomarkerXi,j+1, and must now predict a new disease status. What isthe quality of the analyst’s prediction?

20 / 49

Page 21: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

I The procedure seems to be measuring the predictiveness ofthe biomarker for a given patient after adequate follow-up

I We can generalize the notion of ROC curve to the cohortsetting in other ways

21 / 49

Page 22: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

I refine notion of ROC curve

I relax parametric assumptions

I account for generalization error

I use available data–model relationship between past data andcurrent status–but no more

22 / 49

Page 23: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

We model biomarker levels Xij as an autoregressive processconditional on disease status Sij ∈ {0, 1}, in turn modeled as atwo-state markov process

P{Sij = b|Si(j−1) = a,Hi(j−1)} = pabi , a, b ∈ {0, 1}, j = 2, · · · ,

Xij | {Sij = a,Hij} = θ(a)i + ρ0Xi(j−1) + εij , j = 1, · · · ,

P(Si1 = a) = pa and Xi0 = 0,

where

ρ0 ∈ (−1, 1)

εijiid∼ N (0, σ2

0), j = 1, · · · , ni ,

ξi =

θ

(0)i

θ(1)i

g(p01i )g(p11i )

iid∼ N

θ0

θ1

µ01

µ11

,

(Σθ 00 Σp

) , i = 1, . . . , n,

and η = (p0, θ0, θ1, µ01, µ11,Σθ,Σp) are hyperparameters23 / 49

Page 24: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

...

... Si ,j−1

Xi ,j−1

Si ,j

Xi ,j

Si ,j+1

Xi ,j+1 ...

...

...

... Si ,j−1

Xi ,j−1

Si ,j

Xi ,j

Si ,j+1

Xi ,j+1 ...

...

. ..

. ..

...

... Si ,j−1

Xi ,j−1

Si ,j

Xi ,j

Si ,j+1

Xi ,j+1 ...

...

...

. . .

24 / 49

Page 25: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Random effects models are submodels:

g {P (Sij = 1 | Hij , ξi )} = g{P(Sij = 1 | Xi(j−1),Xij , Si(j−1), ξi

)}= αi0 + αi1Si(j−1) + αi2Xi(j−1) + αi3Xij

=def

Mij ,

where

αi0 =1

2σ20

{(θ

(0)i )2 − (θ

(1)i )2

}+ µ01i ,

αi1 = µ11i − µ01i ,

αi2 =1

σ20

ρ0

(0)i − θ

(1)i

),

αi3 = − 1

σ20

(0)i − θ

(1)i

).

(1)

25 / 49

Page 26: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Individual ROC curve

I contrast an individual’s diseased and non-diseased survivalfunctions

P(Mij > m | Sij = 1, ξi ) vs. P(Mij > m | Sij = 0, ξi )

withMij = αi0 + αi1Si(j−1) + αi2Xi(j−1) + αi3Xij

I Two CDFs imply an ROC curve (and derived statistics); arethese the correct CDFs?

I use all available history: under our model the current diseasestatus Sij given the score Mij is conditionally independent ofthe remaining data

I are functions of the available data

26 / 49

Page 27: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Individual ROC curve

I

P(Mij > m | Sij = 1, ξi ) vs. P(Mij > m | Sij = 0, ξi )

withMij = αi0 + αi1Si(j−1) + αi2Xi(j−1) + αi3Xij

I The ROC curve is a function of the subject effect ξi throughthe subject specific-coefficients, which must be estimated

I E.g., posterior means (empirical bayes)

ξi = E(ξi | available subject data, η)

27 / 49

Page 28: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

The quality of the approximation to the patient’s true ROCdepends on the size of the patient’s history. When the patienthistory is too small, an alternative is the average individual ROCcurve

ROC1(u, j) = Eξ {ROC(u, j |ξ, η)} , (2)

where the expectation is taken with respect to the random effect ξ

I in practice, sample ξi | η under our model and average theempirical ROCs

ROC1(u, j) = B−1B∑

b=1

ROC(u, j | ξ, η)

I√n{ROC1(u, j)− ROC1(u, j)} converges weakly to a mean

zero gaussian process (in u), provided√n(η − η) G

28 / 49

Page 29: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Expected individual ROC ROC1(u|t) and population ROC ROC4(u|t1, t2) at visits t1 = 6 through t2 = 8 with95% bootstrap CI; limiting individual ROC ROC2i for four patients (pediatric HIV data).

29 / 49

Page 30: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Population ROC curveI Consider a time-indexed ROC curve based on the marginal

distribution of the scores Mij

ROC3(u | j) = Sj ,1(S−1j ,0 (u))

Sj ,a(u) = Pj(M·j > u | S·j = a)

30 / 49

Page 31: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Population ROC curve

I Conditional on the population parameter, the patient data areiid, so the same holds for Mij , i = 1, . . . ,N, j fixed

I The empirical ROC curve based on pairs(Mij ,Yij), i = 1, . . . ,N,

ROC3(u | j) = Sj ,1(S−1j ,0 (u))

is a valid measure of the predictive performance of the scoresMij , regardless of the validity of the model (beyondindependence)

31 / 49

Page 32: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

With mild regularity conditions on the population parameter,

I ROC3(u | j) is consistent for ROC3(u | j) as the number ofpatients grows

I√n{ROC3(u | j)− ROC3(u | j)} converges to a mean zero

gaussian process (indexed by the FPR u)

I Variance can be approximated by the bootstrap

32 / 49

Page 33: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Expected individual ROC ROC1(u|t) and population ROC ROC4(u|t1, t2) at visits t1 = 6 through t2 = 8 with95% bootstrap CI; limiting individual ROC ROC2i for four patients (pediatric HIV data).

33 / 49

Page 34: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Population ROC for misspecified model

FPR 0.10 0.25 0.50 0.75t3 0.95 (0.045) 0.91 (0.065) 0.94 (0.043) 0.96 (0.027)5 0.94 (0.052) 0.93 (0.048) 0.92 (0.042) 0.93 (0.029)9 0.94 (0.045) 0.97 (0.042) 0.93 (0.048) 0.94 (0.027)

34 / 49

Page 35: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Outline

Introduction: Data and ROC Curves

Longitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)

Future Work

35 / 49

Page 36: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

patients

−−−−−−−→

(S11,

(X11

X ′11

)), (S12,

(X12

X ′12

)), . . . , (S1n1 ,

(X1n1

X ′1n1

))

(S21,

(X21

X ′21

)), (S22,

(X22

X ′22

)), . . . , (S2n2 ,

(X2n2

X ′2n2

))

...

(SN1,

(XN1

X ′N1

)), (SN2,

(XN2

X ′N2

)), . . . , (SNnN ,

(XNnN

X ′NnN

))

−−−−−−−→time/visits

36 / 49

Page 37: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

In considering both longitudinal and pairwise dependence weestimate a more tractable target, the AUC

I X1, . . . ,XNiid∼ FX non-diseased observations

I Y1, . . . ,YNiid∼ FY diseased observations

I AUC is P(X < Y )

I Unbiased estimator is∑

i ,j P(Xi < Yj) (Mann-WhitneyU-statistic)

37 / 49

Page 38: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

As with the ROC, we can generalize to individual and populationAUCs

UN = P(X < Y | X and Y belong to the same cluster)

=1

N

N∑i=1

Pµi (Xij < Yij)

VN = P(X < Y )

estimators

UN =1

N

N∑i=1

1

nimi

∑j ,k

{Xij < Yik}

VN =1∑

mi∑

ni

∑i ,j

∑r ,s

{Xir < Yjs}

38 / 49

Page 39: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

θi ∼ Unif [0, 1]

Xij | θi ∼ Unif [θi − δ, θi ]Yij | θi ∼ Unif [θi , θi + δ]

UN = 1

VN →p 1/2 + ε (N →∞)

39 / 49

Page 40: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Bi ∼ bernoulli

Xi1, . . . ,Xi,n−1,Yi1 | Bi

∼ Unif [2Bi − 1, 2Bi ]

UN →a.s. 1/2

VN →a.s. 1− ε (n,N →∞)

40 / 49

Page 41: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Expected individual AUC (UN , U′N)

I Independent sequence of bounded RVs, apply CLT

N−1/2(

(UN

U ′N

)−(UN

U ′N

)) N (0,Σ)

I Use usual covariance estimators to perform inference

41 / 49

Page 42: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Population AUC (VN , V′N)

I More commonly studied than individual AUC (opposite forROC)

I Almost a U-statistic, but (probably) not quite, because of therandom vector lengths as normalization

I We want a variance estimator in order to perform inference

I We could use the bootstrap, treating the clusters asindependent observations, but maybe lose too much power

42 / 49

Page 43: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

I Obuchowski ’97 describes a variance estimator. Developedwith radiology applications in mind. The different biomarkerscorrespond to different “readers.”

I nonparametric, asymptotic estimator

I little in the way of proof in the original paper

I in practice, seems very robust

43 / 49

Page 44: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Gives close to the same estimate as the jackknife varianceestimator, treating the clusters as the independent observations“left out”

Replication of Obuchowski simulation with jackknifevariance estimator. (Approximately 3.6e+5 observations;status and maker correlations of 0, .4, and .8; normal andnonnormal data; 100 clusters with 2observations/cluster.)

The Obuchowski ’97 simulation was replicated whilevarying N = # of clusters, computing the maximumabsolute difference between the 2 estimators on eachreplicate. Also plotted is a fit to 1/N2.

44 / 49

Page 45: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

The difference between the two variance estimators is O(1/N2)

Two conclusions:

1. Consistency of the Obuchowski estimator can be establishedfrom consistency of the jackknife estimator

I Arvesen ’69 establishes consistency of the jackknife varianceestimator for well-behaved functions of U-statistics

VN =N−2

∑i ,j

∑r ,s{Xir < Yjs}

(N−1∑

mi )(N−1∑

ni)

I also establishes consistency for independent, non-identicallydistributed data, matching our assumption on the patients

I minor modification of his proof needed to account forwithin-cluster dependence across diseaesed and non-diseasedobservations

45 / 49

Page 46: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Two conclusions:

2. might as well use bootstrap at the cluster level

46 / 49

Page 47: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

Outline

Introduction: Data and ROC Curves

Longitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)

Future Work

47 / 49

Page 48: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

I HIV data have a natural time parameter–the visits where thepatients are treated and measured

I Other data may consist of irregularly measured markersI individual ROC curves will mean different things for different

patientsI population ROC curve loses interpretation

48 / 49

Page 49: ROCCurves for Cohort Data - Haben Michael · Introduction: Data and ROC Curves Longitudinal Dependence (ROC Curves) Longitudinal and Pairwise Dependence (AUC) Future Work The goal

Introduction: Data and ROC CurvesLongitudinal Dependence (ROC Curves)

Longitudinal and Pairwise Dependence (AUC)Future Work

For the HIV data, are visits the “natural” time parameter?

49 / 49