
Accounting for Patient Heterogeneity

1) Hierarchical Bayesian Approaches 2) ANOVA – Based Approaches

Thall, et al., Statistics in Medicine 2003

A Trial of Imatinib in Sarcoma

• Experimental Rx = Imatinib (Gleevec, STI571)

• Motivation: Recent clinical successes in CML, GI stromal tumors

• Goal: Assess activity in each of 10 Sarcoma subtypes

Patient Outcome in the Gleevec Trial

Compared to baseline, the patient’s disease status at each two-month evaluation is one of:

CR = complete response

PR = partial response

SD = stable disease

PD = progressive disease or death

“Response” = {CR/PR @ month 2} or {SD @ month 2 and CR/PR/SD @ month 4}

[Figure: possible disease-status outcomes (CR, PR, SD, PD) at the month 2 and month 4 evaluations]

Bayesian Activity Trial Design (Thall & Sung, 1998)

Stop the trial early if

How to Accommodate Multiple Disease Subtypes?

Approach #1

1) Assume that the disease subtypes all have one common response probability, θ

2) Conduct the trial using one early stopping rule for all the disease subtypes combined

CRITICISM

This statistical model & method ignore subtypes

The possibility of activity in one subtype but not in another is not permitted

How to Accommodate Multiple Disease Subtypes?

Approach #2

1) Assume different, independent response probabilities {θ1,…,θK}

2) Conduct K independent trials, using a separate early stopping rule within each disease subtype

CRITICISM

The data are not shared between subtypes

Activity observed in one subtype is not permitted to increase prob(activity) in the other subtypes




A Bayesian Hierarchical Model

[Diagram: Hyper Parameters → Event Rate Parameter in S1, S2, …, SK → Data in S1, S2, …, SK]

Data and Parameters in the Sarcoma Trial

In sarcoma subtype j = 1, 2, …, 10,

mj = # patients evaluated (data)

Xj = # responses (data)

θj = probability of response (parameters)

Bayesian Hierarchical Model

[Diagram: Hyper Parameters → θ1, θ2, …, θk → (X1, m1), (X2, m2), …, (Xk, mk)]

Hierarchical Models

Hierarchical Model for the Sarcoma Trial

Work on the logit scale, logit(θj), for j = 1,…,k

Xj | θj, mj ~ binom(mj, θj), independently

logit(θ1), …, logit(θk) | μ, τ ~ iid N(μ, τ⁻¹), τ = precision

μ ~ N(-1.386, 10), τ ~ Gamma(2, 20)

Hierarchical Model for the Sarcoma Trial

τ has prior mean 0.10 and variance 0.005; μ has prior mean -1.386 = logit(.20) and variance 10

The priors on μ and τ reflect the elicited probabilities

Pr(θ1 > .30) = .45

Pr(θ1 > .30 | X1/n1 = 2/6) = .525

Pr(θ1 > .30 | X2/n2 = 2/6) = .47
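A quick way to check how the hyperpriors relate to the first elicited probability is to simulate from the prior. The sketch below is not from the slides. It assumes that N(-1.386, 10) is a normal with variance 10 and that Gamma(2, 20) is parameterized by shape and rate (which matches the stated prior mean 0.10 and variance 0.005). The function name `prior_prob_exceeds` is hypothetical.

```python
import math
import random


def prior_prob_exceeds(cut=0.30, draws=200_000, seed=1):
    """Monte Carlo estimate of the prior Pr(theta_1 > cut).

    Assumed parameterization (not stated explicitly on the slide):
      mu  ~ Normal(mean=-1.386, variance=10)
      tau ~ Gamma(shape=2, rate=20), a precision
      logit(theta_1) | mu, tau ~ Normal(mu, 1/tau)
    """
    rng = random.Random(seed)
    logit_cut = math.log(cut / (1.0 - cut))
    hits = 0
    for _ in range(draws):
        mu = rng.gauss(-1.386, math.sqrt(10.0))
        # random.gammavariate takes (shape, scale); scale = 1 / rate
        tau = rng.gammavariate(2.0, 1.0 / 20.0)
        logit_theta = rng.gauss(mu, 1.0 / math.sqrt(tau))
        hits += logit_theta > logit_cut
    return hits / draws


estimate = prior_prob_exceeds()
print(f"Prior Pr(theta_1 > .30) is approximately {estimate:.3f}")
```

If the assumed parameterization is the intended one, the estimate should land close to the elicited value of .45. The two conditional probabilities need the posterior, which requires more than prior simulation.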

Prior Correlation Between Sarcoma Subtypes

Sarcoma Trial Conduct

• Terminate accrual in sarcoma subtype j if Pr(θj > .30 | data) < .005

• “data” = outcomes from all 10 subtypes

• Minimum # patients = 8, maximum = 30 in each subtype

Borrowing Strength Between the Sarcoma Subtypes Reduces Both False Negative Rates and False Positive Rates

How Borrowing Strength Reduces False Negative Rates: Per-Subtype Rejection Probabilities

PRACTICAL ADVANTAGES of the HIERARCHICAL BAYES DESIGN


The hierarchical model allows data from each subtype to provide information about the outcome parameters in all of the other subtypes

It avoids the two undesirable approaches of doing

> One trial assuming one common parameter, ignoring the subtypes

> K separate trials that ignore each others’ data

Posterior Distribution Under the Hierarchical Model, Gleevec Study

[Figure: posterior densities of Pr(Response), one curve per subtype: Angiosarcoma, Ewings Family, Fibrosarcoma, Leiomyosarcoma, Liposarcoma, MFH, Osteosarcoma, Peripheral Nerve Sheath, Rhabdomyosarcoma, Synovial, Desmoid Tumors]

Posterior Distribution Under the Non-Hierarchical Model, Gleevec Study

[Figure: posterior densities of Pr(Response) by subtype]

Posterior Distribution Under the Non-Hierarchical Model (Equal Mean, Variance)

[Figure: posterior densities of Pr(Response) by subtype]

Posterior Results by Subtype (each slide showed the posterior density of Pr(Response))

Subtype                    Responses / Patients   Prob(p > 0.30 | Data)
Angiosarcoma               1 / 9   (11%)          0.06
Ewings Family              0 / 13  (0%)           0.001
Fibrosarcoma               2 / 7   (29%)          0.37
Leiomyosarcoma             6 / 30  (20%)          0.08
Liposarcoma                10 / 28 (36%)          0.7
MFH                        2 / 15  (13%)          0.05
Osteosarcoma               3 / 17  (18%)          0.09
Peripheral Nerve Sheath    1 / 5   (20%)          0.22
Rhabdomyosarcoma           0 / 2   (0%)           0.12
Synovial                   4 / 18  (22%)          0.18
Desmoid Tumors             8 / 13  (62%)          0.98

Effect of Desmoid Tumor Patients (8/13 response)

If the 13 Desmoid tumor patients are removed from the hierarchical structure:

The maximum change in Pr( p > 0.30 | Data) is 0.02.
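The slides do not say how the posterior probabilities were computed. As an illustration only, the sketch below fits the hierarchical logit-normal model to the per-subtype counts with a Metropolis-within-Gibbs sampler. The sampler, its tuning values, and all function names are assumptions, and Gamma(2, 20) is again read as shape and rate. Its output is not expected to match the slide values exactly.

```python
import math
import random

# (responses, patients) per subtype, taken from the per-subtype slides
DATA = {
    "Angiosarcoma": (1, 9), "Ewings Family": (0, 13), "Fibrosarcoma": (2, 7),
    "Leiomyosarcoma": (6, 30), "Liposarcoma": (10, 28), "MFH": (2, 15),
    "Osteosarcoma": (3, 17), "Peripheral Nerve Sheath": (1, 5),
    "Rhabdomyosarcoma": (0, 2), "Synovial": (4, 18), "Desmoid Tumors": (8, 13),
}
M0, V0 = -1.386, 10.0  # prior mean and variance of mu
A0, B0 = 2.0, 20.0     # assumed shape and rate of the Gamma prior on tau


def softplus(r):
    """Numerically stable log(1 + exp(r))."""
    return max(r, 0.0) + math.log1p(math.exp(-abs(r)))


def log_target(r, x, m, mu, tau):
    """Binomial log-likelihood plus N(mu, 1/tau) log-prior for r = logit(theta)."""
    return x * r - m * softplus(r) - 0.5 * tau * (r - mu) ** 2


def posterior_tail_probs(data, cut=0.30, iters=20_000, burn=2_000, step=1.0, seed=1):
    """Estimate Pr(theta_j > cut | all data) for every subtype."""
    rng = random.Random(seed)
    xs, ms = zip(*data.values())
    k = len(xs)
    rho = [M0] * k            # current logits, one per subtype
    mu, tau = M0, A0 / B0     # start at the prior means
    logit_cut = math.log(cut / (1.0 - cut))
    hits = [0] * k
    for it in range(iters):
        # Random-walk Metropolis update for each subtype's logit
        for j in range(k):
            prop = rho[j] + rng.gauss(0.0, step)
            log_ratio = (log_target(prop, xs[j], ms[j], mu, tau)
                         - log_target(rho[j], xs[j], ms[j], mu, tau))
            if rng.random() < math.exp(min(0.0, log_ratio)):
                rho[j] = prop
        # Conjugate normal update for mu
        prec = 1.0 / V0 + k * tau
        mean = (M0 / V0 + tau * sum(rho)) / prec
        mu = rng.gauss(mean, 1.0 / math.sqrt(prec))
        # Conjugate gamma update for tau (gammavariate takes shape, scale)
        ss = sum((r - mu) ** 2 for r in rho)
        tau = rng.gammavariate(A0 + 0.5 * k, 1.0 / (B0 + 0.5 * ss))
        if it >= burn:
            for j in range(k):
                hits[j] += rho[j] > logit_cut
    kept = iters - burn
    return {name: h / kept for name, h in zip(data, hits)}


for name, prob in posterior_tail_probs(DATA).items():
    print(f"{name:25s} Pr(theta > .30 | data) ~ {prob:.3f}")
```

Dropping the "Desmoid Tumors" entry from `DATA` and rerunning gives a rough way to explore the sensitivity question raised on the last slide, subject to Monte Carlo error.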

Hierarchical Bayesian Approaches to Phase II Trials in Diseases With Multiple Subtypes

Case II: Time-to-Event Outcomes

A Trial of Fludarabine + Busulfan (Flu/Bu) in Allogeneic Bone Marrow Transplantation (Allotx)

• Experimental Rx: Flu/Bu as a Preparative Regimen

• 3 Patient-Disease Subgroups: AML in Relapse, AML in Remission, MDS

• Goal: Improve DFS in each of the three groups

Data and Parameters in the AlloTx Trial

In each patient-disease subgroup j = 1, 2, 3 :

Xj,1, …, Xj,nj = failure (or censoring) times in the nj patients transplanted

Historical median failure time in subgroup j

Effect of Flu/Bu relative to historical in subgroup j

(Historical median) × (Flu/Bu effect) = Median failure time with Flu/Bu in subgroup j

Bayesian Hierarchical Model for the AlloTx Trial

[Diagram: Hyper Parameters → Flu/Bu effect in each subgroup (AML in Remission, AML in Relapse, MDS) → Failure time data in each subgroup]

Prior Correlation Between Patient-Disease Subgroups

Continuous Monitoring using an Approximate Posterior (CMAP) for Phase II Trials Based On Event Times

Cheung and Thall, 2002

A Problem With Phase II Designs Based On Binary Outcomes

The Problem: If the event/response requires a relatively long follow-up period T, the number of responses (M) may not be observed. Example: alive with remission after 6 months of treatment

A Solution: Use the event time, possibly right-censored, as the outcome variable.

A Question: How does one obtain the posterior for θE?

Clinical events: 3 basic cases

Notation

X = time to disease remission (“response”)

Z = time to death/relapse

R = time to “failure”, e.g. disease is resistant

T = a pre-specified, fixed observation time window

Cases

(1) Simple event: B = {X < T}

(2) Composite event: B = {X < T < Z}

(3) Competing risks: B = {X < min(T,R), Z > T}
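The three event definitions can be written as small predicates on fully observed times. This is only an illustrative sketch: the function names are hypothetical, and `math.inf` is used here to stand for an event that never occurs.

```python
import math


def simple_event(x, t):
    """Case 1: remission occurs before the end of the window, B = {X < T}."""
    return x < t


def composite_event(x, z, t):
    """Case 2: remission before T and no death/relapse by T, B = {X < T < Z}."""
    return x < t < z


def competing_risks_event(x, r, z, t):
    """Case 3: remission before both T and resistance, and alive past T."""
    return x < min(t, r) and z > t


# Remission at day 40, never declared resistant, death/relapse at day 200, T = 90
print(competing_risks_event(x=40, r=math.inf, z=200, t=90))
```

These predicates need the complete follow-up through T. The approximate posterior on the following slides addresses patients whose status is only partly observed.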

Approximate posterior, Case 2

1. Consider a current status likelihood of the observed data:

∏i [prob(Ai)]^Yi [1 − prob(Ai)]^(1−Yi)

where Ai = {Xi < Ci < Zi}, Ci = censoring time, Yi = indicator of the event Ai.

2. Consider the probability decomposition

Prob(A) = w1 θE + w2 (1 − θE)

where w1 = prob(A | B) and w2 = prob(A | Bᶜ), with Bᶜ the complement of B

Approximate Posterior, Case 2, cont.

Estimation strategy:

– Replace the nuisance parameters (w1 and w2) in the likelihood with estimates and obtain a “working likelihood” L(θE).

– The nuisance parameters can be estimated by the empirical quantities based on the completely followed patients

– Combine the working likelihood L(θE) with a beta prior: the posterior is a mixture of betas
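Why the posterior is a mixture of betas: each patient contributes a factor that is linear in θE, so the product expands into terms θE^k (1 − θE)^(n−k), and each term combines with a beta prior into a beta component. The sketch below illustrates that algebra only. It treats the plug-in estimates of w1 and w2 as given per-patient inputs (how they are estimated is not shown), and the function names are hypothetical.

```python
import math


def log_beta(a, b):
    """Log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_mixture_posterior(y, w1, w2, alpha, beta):
    """Return (weights, components) of the beta-mixture posterior for theta_E.

    y[i]  : 1 if event A_i was observed for patient i, else 0
    w1[i] : plug-in estimate of prob(A_i | B)
    w2[i] : plug-in estimate of prob(A_i | complement of B)
    alpha, beta : parameters of the beta prior on theta_E
    """
    coef = [1.0]  # coef[k] multiplies theta^k * (1 - theta)^(n - k)
    for yi, a1, a2 in zip(y, w1, w2):
        # Each factor has the form a * theta + b * (1 - theta)
        a, b = (a1, a2) if yi == 1 else (1.0 - a1, 1.0 - a2)
        new = [0.0] * (len(coef) + 1)
        for k, c in enumerate(coef):
            new[k + 1] += c * a  # take the theta term of this factor
            new[k] += c * b      # take the (1 - theta) term of this factor
        coef = new
    n = len(y)
    raw = [c * math.exp(log_beta(alpha + k, beta + n - k))
           for k, c in enumerate(coef)]
    total = sum(raw)
    weights = [r / total for r in raw]
    components = [(alpha + k, beta + n - k) for k in range(n + 1)]
    return weights, components


def posterior_mean(weights, components):
    """Mean of the beta mixture."""
    return sum(w * a / (a + b) for w, (a, b) in zip(weights, components))


# Two completely followed patients plus one partially followed patient
weights, comps = beta_mixture_posterior(
    y=[1, 0, 0], w1=[1.0, 1.0, 0.5], w2=[0.0, 0.0, 0.0], alpha=0.86, beta=1.14)
print(weights, posterior_mean(weights, comps))
```

With complete follow-up (w1 = 1, w2 = 0 for every patient) the mixture collapses to the usual single beta posterior, which is a convenient sanity check.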

Method: Monitoring θE

C = Continuous

M = Monitoring using

A = Approximate

P = Posteriors

1. Compute Prob (θE > θS + δ | Data), the stopping prob criterion each time a new patient is accrued, based on the approximate posterior described above

2. Design parameters: (Nmax, Nmin, δ, p, ρ) and the prior distributions.

Example: A leukemia trial

Patients: newly diagnosed acute myelogenous leukemia or myelodysplastic syndromes, with “-5/-7” cytogenetic abnormality.

Outcome structure: Competing risks (Case 3)

Z = survival time since day 0 of treatment

X = time to complete remission

R = time to declared resistant

Response, B = {X < min(90,R), Z > 90}

Historical Data

Empirical analysis

– Nhist = 335, Mhist = 144; the observed rate is 43%.

– Prior: θS ~ beta(145, 192); θE ~ beta(0.86, 1.14)

Model-based analysis

– Marginal models: generalized odds rate (Dabrowska and Doksum, 1988).

– Dependence structure: Shen and Thall (1998)

– Number of parameters: 17

– Model-based estimate of θhist is 44%.

4 Stopping Strategies

Thall-Simon (TS) stopping rule: Prob(θE > θS + .15 | Data) < .05

CMAP

TSCD: Continuous monitoring based on completely followed patients only

TS(1): wait and apply the TS stopping rule after every patient

TS(5): wait and apply the TS stopping rule after every 5 patients
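For completely followed patients with a binary outcome, the TS criterion can be approximated by Monte Carlo, using beta-binomial updating for θE and the beta(145, 192) prior for θS from the historical data slide. This is a minimal sketch, not the authors' software. The function name and the example counts are hypothetical.

```python
import random


def prob_improvement(x, n, delta=0.15, prior_e=(0.86, 1.14),
                     prior_s=(145.0, 192.0), draws=100_000, seed=1):
    """Monte Carlo estimate of Pr(theta_E > theta_S + delta | x responses in n).

    theta_E has a beta prior updated with the trial data;
    theta_S keeps its historical beta distribution.
    """
    rng = random.Random(seed)
    a_e, b_e = prior_e[0] + x, prior_e[1] + (n - x)
    hits = 0
    for _ in range(draws):
        theta_e = rng.betavariate(a_e, b_e)
        theta_s = rng.betavariate(*prior_s)
        hits += theta_e > theta_s + delta
    return hits / draws


# Hypothetical interim data: 2 responses among 20 completely followed patients
p = prob_improvement(x=2, n=20)
print("stop" if p < 0.05 else "continue", round(p, 4))
```

The cut-off .05 and δ = .15 are taken from the rule above. Applying this after every patient or every 5 patients corresponds to TS(1) and TS(5).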

A null scenario: Nmax=60, Nmin=10

Event times were generated under the historical model, with model-based mean θE = .44.

Patients arrive exponentially at a rate of 5 per 30 days.

                        CMAP         TSCD         TS(1)        TS(5)
Pr(reject treatment)    .82          .79          .82          .79
Mean duration (days)    286          331          520          469
Duration (Q1, Q3)       (186, 390)   (241, 417)   (171, 788)   (166, 692)
Sample size (ave)       34           42           29           33
# Turned away (ave)     0            0            52           40
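The accrual process in the simulation can be sketched as follows, reading "arrive exponentially at a rate of 5 per 30 days" as independent exponential inter-arrival times with mean 6 days. That reading is an assumption, and the function name is hypothetical.

```python
import random


def simulate_arrivals(n_patients, rate_per_day=5 / 30, seed=1):
    """Return arrival days for n_patients under exponential inter-arrival times."""
    rng = random.Random(seed)
    t = 0.0
    times = []
    for _ in range(n_patients):
        t += rng.expovariate(rate_per_day)  # mean gap = 1 / rate = 6 days
        times.append(t)
    return times


arrivals = simulate_arrivals(60)
print(f"60th patient arrives on day {arrivals[-1]:.0f}")
```

Under this accrual rate, enrolling Nmax = 60 patients takes about 360 days on average, which gives a sense of scale for the durations in the table.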

An alternative case: Nmax=60, Nmin=10

Event times were generated under the same model with parameters calibrated so that θE = .59.

                        CMAP         TSCD         TS(1)        TS(5)
Pr(reject treatment)    .16          .13          .17          .13
Mean duration (days)    411          434          689          618
Duration (Q1, Q3)       (389, 467)   (398, 469)   (578, 820)   (549, 711)
Sample size (ave)       55           57           53           55
# Turned away (ave)     0            0            51           37

Randomized Multi-arm Study

Select the best regimen from (E1,E2,E3)

1) Same priors on θS and the θEk’s

2) Stopping rules for arm Ek: Pr(θEk > θS + .15) < .10 OR Pr(θEk < maxj θEj) > .90

3) Nmax = 90 (total), Nmin = 10 (per arm)

4) Randomize evenly among the non-stopped arms

5) Choose the best among the non-stopped arms at the end

6) Consider 4 stopping strategies
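The two per-arm stopping probabilities can be approximated jointly by simulation. The sketch below assumes binary outcomes from completely followed patients and reuses the beta priors from the single-arm leukemia example. Those choices, the function names, and the example counts are all assumptions for illustration.

```python
import random


def arm_stopping_probs(arms, delta=0.15, prior_e=(0.86, 1.14),
                       prior_s=(145.0, 192.0), draws=50_000, seed=1):
    """For each arm (x responses, n patients), estimate
    Pr(theta_Ek > theta_S + delta) and Pr(theta_Ek < max_j theta_Ej)."""
    rng = random.Random(seed)
    k = len(arms)
    improve = [0] * k
    below_max = [0] * k
    for _ in range(draws):
        theta_s = rng.betavariate(*prior_s)
        thetas = [rng.betavariate(prior_e[0] + x, prior_e[1] + n - x)
                  for x, n in arms]
        best = max(thetas)
        for j, th in enumerate(thetas):
            improve[j] += th > theta_s + delta
            below_max[j] += th < best
    return [(improve[j] / draws, below_max[j] / draws) for j in range(k)]


def should_stop(p_improve, p_below_max):
    """Stopping rule for one arm, with the cut-offs from rule 2."""
    return p_improve < 0.10 or p_below_max > 0.90


# Hypothetical interim data for arms E1, E2, E3
results = arm_stopping_probs([(2, 20), (10, 20), (15, 20)])
for name, res in zip(("E1", "E2", "E3"), results):
    print(name, res, "stop" if should_stop(*res) else "continue")
```

Sampling all arms within the same draw keeps the comparison with the maximum consistent across arms.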

Multi-arm study, cont.

                   PSel                        N
Scene   Design     E1    E2    E3    None      E1   E2   E3   Duration
Null    CMAP       .08   .10   .08   .74       14   15   13   481
        TSCD       .15   .15   .16   .54       23   23   23   606
        TS(1)      .09   .09   .09   .73       13   13   13   879
        TS(5)      .13   .13   .13   .62       15   20   15   749
E3      CMAP       .04   .05   .63   .28       13   12   47   597
        TSCD       .07   .08   .69   .15       20   21   40   632
        TS(1)      .05   .06   .65   .24       11   11   51   950
        TS(5)      .06   .06   .71   .17       15   15   50   821

Conclusions

The price of ignoring censored data: inflation in the null sample size.

For the composite cases, the approximate posterior avoids complex modeling on dependence structure of times to event.

Computation of approximate posterior is easy.

For the simple case, the approximate posterior agrees with a nonparametric estimator based on right-censored data; Susarla and Van Ryzin (1976).

Most recent work: parametric model may be preferred in the simple case.

Accounting for Patient Heterogeneity in Phase II Using Regression

Two or more prognostic subgroups with different historical Pr(response) using “standard” therapy

If a subgroup is stopped early, the remaining sample size goes to the remaining subgroups

Allow different target Pr(response) values within subgroups

Use a regression model to “borrow strength” between subgroups

t = treatment group = E or S

Z = prognostic subgroup = 0, 1, …, K-1

Pr(Response | t, Z, parameters) = logit⁻¹{linear terms in the indicators I[Z = k]}

LINEAR TERMS

• Historical effect of prognostic subgroup k versus baseline group 0 (zero for k = 0): informative prior, from historical data

• E-versus-S treatment effect in prognostic subgroup k: non-informative priors
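One way to read the linear terms is as an intercept for the baseline subgroup on standard therapy, plus a subgroup effect, plus a subgroup-specific treatment effect when t = E. The sketch below uses that reading. The parameter names (`intercept`, `subgroup_effect`, `treatment_effect`) are hypothetical, and the numbers come from the two-subgroup example (historical .45 and .25, targets .60 and .40).

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(v):
    return 1.0 / (1.0 + math.exp(-v))


def response_prob(treatment, subgroup, intercept, subgroup_effect, treatment_effect):
    """Pr(Response | t, Z) under an assumed logistic parameterization.

    subgroup_effect[k]  : subgroup k versus baseline group 0 (entry 0 is zero)
    treatment_effect[k] : E-versus-S effect in subgroup k (logit scale)
    """
    eta = intercept + subgroup_effect[subgroup]
    if treatment == "E":
        eta += treatment_effect[subgroup]
    return expit(eta)


# Subgroup 0 = good prognosis (baseline), subgroup 1 = poor prognosis
historical = [0.45, 0.25]
target = [0.60, 0.40]
intercept = logit(historical[0])
subgroup_effect = [logit(p) - intercept for p in historical]
treatment_effect = [logit(t) - logit(h) for t, h in zip(target, historical)]

for k, label in enumerate(("Good", "Poor")):
    p_s = response_prob("S", k, intercept, subgroup_effect, treatment_effect)
    p_e = response_prob("E", k, intercept, subgroup_effect, treatment_effect)
    print(f"{label}: standard {p_s:.2f}, experimental at target {p_e:.2f}")
```

The same improvement of .15 on the probability scale corresponds to different treatment effects on the logit scale in the two subgroups, which is one reason to allow a separate treatment effect per subgroup.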

Early Stopping (“No Go”) Criteria

Given current data Dn, stop accrual in subgroup j if

for j = 0, 1, …, K-1, where pj is a fixed cut-off, usually .01 to .10, calibrated to obtain a design with given false negative rate.

An Example with Two Subgroups

                  Historical Standard Rx   Targets for the Expt'l Rx
Good Prognosis    .45                      .60
Poor Prognosis    .25                      .40


Nmax = 100 (approx. 50 per subgroup)

Apply subgroup-specific early stopping rules after cohorts of 10 patients

The early stopping rules are calibrated to control Pr(STOP | θ = target) = .10 within each subgroup


Early Stopping Rules

Good Prognosis: Target is .45 + .15 = .60. STOP if Pr(θG > .60 | data) is “small”

Poor Prognosis: Target is .25 + .15 = .40. STOP if Pr(θP > .40 | data) is “small”


Accrual may be stopped early in

1) Both subgroups (Trial is stopped)

2) Neither subgroup

3) One subgroup but not the other (“Treatment-subgroup interaction”)

In Case 3, all remaining patients, up to the maximum of 100, are accrued to the subgroup that has not been stopped.

Computer Simulation Results

True Values of θ              Ignoring Prognosis        Accounting for Prognosis
Good Prognosis, θG = .60      P(stop) = .42, N = 38     P(stop) = .10, N = 64
Poor Prognosis, θP = .25      P(stop) = .42, N = 38     P(stop) = .73, N = 32

Effects of treatment-subgroup interaction

If the new treatment achieves the target in the “good prognosis” subgroup but not in the “poor prognosis” subgroup, a conventional design ignoring treatment-subgroup interaction has

Pr(False Negative in “Good”) = .42

Pr(False Positive in “Poor”) = 1 - .42 = .58

Take Away Messages

In phase II, or ANY comparative trial :

Account for patient heterogeneity

Account for treatment-subgroup (treatment-covariate) interactions

The method is applied similarly for event times, using means or medians

The “Good” vs “Bad” Prognosis

dichotomy may be replaced with

“Biomarker +” vs “Biomarker –”

Currently being applied at MDACC to

a chemotherapy trial in acute leukemia
