Accounting for Patient Heterogeneity
1) Hierarchical Bayesian Approaches 2) ANOVA-Based Approaches
Thall, et al., Statistics in Medicine 2003
A Trial of Imatinib in Sarcoma
• Experimental Rx = Imatinib (Gleevec, STI571)
• Motivation: Recent clinical successes in CML, GI stromal tumors
• Goal: Assess activity in each of 10 Sarcoma subtypes
Patient Outcome in the Gleevec Trial
Compared to baseline, the patient’s disease status at each two-month evaluation is one of:
CR = complete response
PR = partial response
SD = stable disease
PD = progressive disease or death
“Response” = {CR/PR @ month 2} or {SD @ month 2 and CR/PR/SD @ month 4}
[Diagram: example outcome sequences over the month-2 and month-4 evaluations (CR, PR, SD, PD)]
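The response definition on this slide can be written as a small rule. The sketch below is only an illustration: the function name `is_response` and the use of `None` for a month-4 evaluation that is not needed are assumptions, not part of the trial protocol.

```python
from typing import Optional

RESPONDING = {"CR", "PR"}

def is_response(month2: str, month4: Optional[str] = None) -> bool:
    """Apply the slide's definition of "Response" to one patient's evaluations."""
    if month2 in RESPONDING:
        return True                                  # {CR/PR @ month 2}
    if month2 == "SD":
        return month4 in (RESPONDING | {"SD"})       # {SD @ month 2 and CR/PR/SD @ month 4}
    return False                                     # PD (or death) at month 2
```

For example, a patient with SD at month 2 and SD at month 4 counts as a response, while SD followed by PD does not.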
Bayesian Activity Trial Design (Thall & Sung, 1998)
Stop the trial early if
How to Accommodate Multiple Disease Subtypes?
Approach #1
1) Assume that the disease subtypes all have one common response probability, θ
2) Conduct the trial using one early stopping rule for all the disease subtypes combined
CRITICISM
This statistical model & method ignore subtypes
The possibility of activity in one subtype but not in another is not permitted
How to Accommodate Multiple Disease Subtypes?
Approach #2
1) Assume different, independent response probabilities {θ1,…,θK}
2) Conduct K independent trials, using a separate early stopping rule within each disease subtype
CRITICISM
The data are not shared between subtypes
Activity observed in one subtype is not permitted to increase prob(activity) in the other subtypes
A Bayesian Hierarchical Model
[Diagram with three levels: data in S1, S2, …, SK; event rate parameter in each of S1, S2, …, SK; hyperparameters]
Data and Parameters in the Sarcoma Trial
In sarcoma subtype j = 1, 2, …, 10,
mj = # patients evaluated (data)
Xj = # responses (data)
θj = probability of response (parameters)
Bayesian Hierarchical Model
[Diagram with three levels: data (X1, m1), (X2, m2), …, (Xk, mk); response probabilities θ1, θ2, …, θk; hyperparameters]
Hierarchical Models
Hierarchical Model for the Sarcoma Trial
Work on the log-odds scale: logit(θj) = log{θj / (1 − θj)}, for j = 1, …, k
Xj | θj, mj ~ binom(mj, θj), independently
logit(θ1), …, logit(θk) | μ, τ ~ iid N(μ, τ⁻¹), τ = precision
μ ~ N(−1.386, 10), τ ~ Gamma(2, 20)
Hierarchical Model for the Sarcoma Trial
τ has prior mean 0.10 and variance 0.005
μ has prior mean = logit(.20) & variance 10
The priors on μ and τ reflect the elicited probabilities
Pr(θ1 > .30) = .45
Pr(θ1 > .30 | X1/n1 = 2/6) = .525
Pr(θ1 > .30 | X2/n2 = 2/6) = .47
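The three elicited probabilities above can be approximated numerically. The sketch below is one way to do it and is not the computation used for the trial: it draws (μ, τ, θ) from the prior and weights each draw by the binomial likelihood (likelihood weighting), which is only practical for very small data sets like the 2/6 examples here. The symbols θ, μ, τ, the function names, the number of draws and the seed are all assumptions made for illustration. The Gamma(2, 20) prior is read as shape 2 and rate 20, which matches the stated mean 0.10 and variance 0.005.

```python
import math
import random

def expit(x):
    """Inverse logit via tanh, so large |x| cannot overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * x))

def prob_theta1_above(cutoff, data=None, n_draws=100_000, seed=1):
    """Estimate Pr(theta_1 > cutoff | data) by weighting prior draws.

    data maps a 1-based subtype index to (responses, patients evaluated);
    subtypes that are absent contribute no likelihood term.
    """
    data = data or {}
    needed = sorted({1} | set(data))     # subtype 1 plus every subtype with data
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n_draws):
        mu = rng.gauss(-1.386, math.sqrt(10.0))    # mean logit(.20), variance 10
        tau = rng.gammavariate(2.0, 1.0 / 20.0)    # shape 2, scale 1/20 (rate 20)
        sd = 1.0 / math.sqrt(tau)                  # tau is a precision
        theta = {j: expit(rng.gauss(mu, sd)) for j in needed}
        weight = 1.0
        for j, (x, m) in data.items():
            weight *= theta[j] ** x * (1.0 - theta[j]) ** (m - x)   # binomial kernel
        if theta[1] > cutoff:
            num += weight
        den += weight
    return num / den
```

Calling `prob_theta1_above(0.30)`, `prob_theta1_above(0.30, {1: (2, 6)})` and `prob_theta1_above(0.30, {2: (2, 6)})` gives Monte Carlo estimates that can be compared with the slide's .45, .525 and .47. The third call shows the borrowing of strength: data from subtype 2 alone moves the probability for subtype 1.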
Prior Correlation Between Sarcoma Subtypes
Sarcoma Trial Conduct
• Terminate accrual in sarcoma subtype j if Pr(θj > .30 | data) < .005
• “data” = outcomes from all 10 subtypes
• Minimum # patients = 8, maximum = 30 in each subtype
Borrowing Strength Between the Sarcoma Subtypes Reduces Both False Negative Rates and False Positive Rates
How Borrowing Strength Reduces False Negative Rates: Per-Subtype Rejection Probabilities
PRACTICAL ADVANTAGES of the HIERARCHICAL BAYES DESIGN
The hierarchical model allows data from each subtype to provide information about the outcome parameters in all of the other subtypes
It avoids the two undesirable approaches of doing
> One trial assuming one common parameter, ignoring the subtypes
> K separate trials that ignore each other’s data
Posterior Distribution Under the Hierarchical Model, Gleevec Study
[Figure: posterior densities of Pr(Response), one curve per subtype: Angiosarcoma, Ewings Family, Fibrosarcoma, Leiomyosarcoma, Liposarcoma, MFH, Osteosarcoma, Peripheral Nerve Sheath, Rhabdomyosarcoma, Synovial, Desmoid Tumors]
Posterior Distribution Under the Non-Hierarchical Model, Gleevec Study
[Figure: same layout, one curve per subtype]
Posterior Distribution Under the Non-Hierarchical Model (Equal Mean, Variance)
[Figure: same layout, one curve per subtype]
Posterior of Pr(Response) by Subtype
[Figures: one posterior density plot of Pr(Response) per subtype; the plot titles and summaries are tabulated below]
Subtype                    Responses         Prob(p > 0.30 | Data)
Angiosarcoma               1 / 9 (11%)       0.06
Ewings Family              0 / 13 (0%)       0.001
Fibrosarcoma               2 / 7 (29%)       0.37
Leiomyosarcoma             6 / 30 (20%)      0.08
Liposarcoma                10 / 28 (36%)     0.7
MFH                        2 / 15 (13%)      0.05
Osteosarcoma               3 / 17 (18%)      0.09
Peripheral Nerve Sheath    1 / 5 (20%)       0.22
Rhabdomyosarcoma           0 / 2 (0%)        0.12
Synovial                   4 / 18 (22%)      0.18
Desmoid Tumors             8 / 13 (62%)      0.98
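As a worked illustration, the termination rule from the trial conduct slide (terminate accrual in subtype j if Pr(θj > .30 | data) < .005, minimum 8 and maximum 30 patients per subtype) can be applied to the per-subtype values reported above. This is only a sketch: it assumes the rule is not applied before 8 patients have been evaluated, and the status labels and function name are hypothetical. The slides do not state which subtypes were actually closed.

```python
MIN_PATIENTS, MAX_PATIENTS, CUTOFF = 8, 30, 0.005

# subtype: (patients evaluated, Prob(p > 0.30 | Data)), copied from the slides
reported = {
    "Angiosarcoma": (9, 0.06),
    "Ewings Family": (13, 0.001),
    "Fibrosarcoma": (7, 0.37),
    "Leiomyosarcoma": (30, 0.08),
    "Liposarcoma": (28, 0.7),
    "MFH": (15, 0.05),
    "Osteosarcoma": (17, 0.09),
    "Peripheral Nerve Sheath": (5, 0.22),
    "Rhabdomyosarcoma": (2, 0.12),
    "Synovial": (18, 0.18),
    "Desmoid Tumors": (13, 0.98),
}

def accrual_status(n_evaluated, post_prob):
    """Classify one subtype under the stated rule (labels are illustrative)."""
    if n_evaluated >= MIN_PATIENTS and post_prob < CUTOFF:
        return "terminate"            # Pr(theta_j > .30 | data) < .005
    if n_evaluated >= MAX_PATIENTS:
        return "maximum reached"      # 30 patients already evaluated
    return "continue"

status = {name: accrual_status(n, p) for name, (n, p) in reported.items()}
```

With these reported values, only the Ewings Family subtype (0.001 with 13 patients) falls below the .005 cut-off.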
Effect of Desmoid Tumor Patients (8/13 response)
If the 13 Desmoid tumor patients are removed from the hierarchical structure:
The maximum change in Pr( p > 0.30 | Data) is 0.02.
Hierarchical Bayesian Approaches to Phase II Trials in Diseases With Multiple Subtypes
Case II: Time-to-Event Outcomes
A Trial of Fludarabine + Busulfan (Flu/Bu) in Allogeneic Bone Marrow Transplantation (Allotx)
• Experimental Rx: Flu/Bu as a Preparative Regimen
• 3 Patient-Disease Subgroups: AML in Relapse, AML in Remission, MDS
• Goal: Improve DFS in each of the three groups
Data and Parameters in the AlloTx Trial
In each patient-disease subgroup j = 1, 2, 3:
Xj,1, …, Xj,nj = failure (or censoring) times in nj patients transplanted
Historical median failure time in subgroup j
Effect of Flu/Bu relative to historical in subgroup j
(Historical median) × (Flu/Bu effect) = median failure time with Flu/Bu in subgroup j
Bayesian Hierarchical Model for the AlloTx Trial
[Diagram with three levels: failure time data in each of the three subgroups; Flu/Bu effect on AML in Remission, on AML in Relapse, and on MDS; hyperparameters]
Prior Correlation Between Patient-Disease Subgroups
Continuous Monitoring using an Approximate Posterior (CMAP) for Phase II Trials Based On Event Times
Cheung and Thall, 2002
A Problem With Phase II Designs Based On Binary Outcomes
The Problem: If the event/response requires a relatively long follow-up period T, the number of responses (M) may not be observed. Example: alive with remission after 6 months of treatment
A Solution: Use the event time, possibly right-censored, as the outcome variable.
A Question: How does one obtain the posterior for θE?
Clinical events: 3 basic cases
Notation
X = time to disease remission (“response”)
Z = time to death/relapse
R = time to “failure”, e.g. disease is resistant
T = a pre-specified, fixed observation time window
Cases
(1) Simple event: B = {X < T}
(2) Composite event: B = {X < T < Z}
(3) Competing risks: B = {X < min(T,R), Z > T}
Approximate posterior, Case 2
1. Consider a current status likelihood of the observed data:
∏i [prob(Ai)]^Yi [1 − prob(Ai)]^(1−Yi)
where Ai = {Xi < Ci < Zi}, Ci = censoring time, Yi = indicator of the event Ai.
2. Consider the probability decomposition
Prob(A) = w1 θE + w2 (1 − θE)
where w1 = prob(A | B) and w2 = prob(A | B^c), with B^c the complement of B
Approximate Posterior, Case 2, cont.
Estimation strategy:
– Replace the nuisance parameters (w1 and w2) in the likelihood with estimates and obtain a “working likelihood” L(θE).
– The nuisance parameters can be estimated by the empirical quantities based on the completely followed patients.
– Combine the working likelihood L(θE) with a beta prior: the posterior is a mixture of betas.
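The estimation strategy can be illustrated numerically. The sketch below evaluates the working likelihood times a beta prior on a grid of θE values instead of expanding the mixture of betas analytically, so it is a numerical stand-in rather than the closed form the slide refers to. The per-patient weights `w1` and `w2` are taken as already estimated inputs; the function names, grid size, and the default beta(0.86, 1.14) prior (borrowed from the leukemia example later in the talk) are assumptions.

```python
import math

def approx_posterior(y, w1, w2, a=0.86, b=1.14, n_grid=2000):
    """Grid approximation to the posterior of theta_E under the working likelihood.

    y[i]  : 1 if event A_i has been observed for patient i, else 0
    w1[i] : plug-in estimate of prob(A_i | B)
    w2[i] : plug-in estimate of prob(A_i | complement of B)
    """
    grid = [(g + 0.5) / n_grid for g in range(n_grid)]    # midpoints avoid 0 and 1
    log_post = []
    for theta in grid:
        # log beta(a, b) prior kernel
        lp = (a - 1.0) * math.log(theta) + (b - 1.0) * math.log(1.0 - theta)
        for yi, w1i, w2i in zip(y, w1, w2):
            p_a = w1i * theta + w2i * (1.0 - theta)       # Prob(A_i) decomposition
            lp += math.log(p_a) if yi else math.log(1.0 - p_a)
        log_post.append(lp)
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]    # rescale before exponentiating
    total = sum(weights)
    return grid, [w / total for w in weights]

def posterior_mean(grid, post):
    return sum(t * p for t, p in zip(grid, post))
```

A useful sanity check on the design: when every patient is completely followed, w1 = 1 and w2 = 0, the working likelihood is an ordinary binomial likelihood, and the result should match the usual beta posterior. A partially followed patient with no event yet (for example w1 = 0.4, w2 = 0) pulls the posterior down less than a completely followed non-responder would.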
Method: Monitoring θE
CMAP = Continuous Monitoring using Approximate Posteriors
1. Compute Prob (θE > θS + δ | Data), the stopping prob criterion each time a new patient is accrued, based on the approximate posterior described above
2. Design parameters: (Nmax, Nmin, δ, p, ρ) and the prior distributions.
Example: A leukemia trial
Patients: newly diagnosed acute myelogenous leukemia or myelodysplastic syndromes, with “-5/-7” cytogenetic abnormality.
Outcome structure: Competing risks (Case 3)
Z = survival time since day 0 of treatment
X = time to complete remission
R = time to declared resistant
Response, B = {X < min(90,R), Z > 90}
Historical Data
Empirical analysis
– Nhist = 335, Mhist = 144, observed rate is 43%.
– Prior: θS ~ beta(145, 192); θE ~ beta(0.86, 1.14)
Model-based analysis
– Marginal models: generalized odds rate (Dabrowska and Doksum, 1988).
– Dependence structure: Shen and Thall (1998)
– Number of parameters: 17
– Model-based estimate of θhist is 44%.
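The two beta priors can be checked with a little arithmetic. The reading below is an interpretation rather than something the slides state: beta(145, 192) is what a uniform beta(1, 1) prior becomes after the historical data (144 responses, 191 non-responses), and beta(0.86, 1.14) has essentially the same mean but an effective sample size of only 2.

```python
n_hist, m_hist = 335, 144

observed_rate = m_hist / n_hist                  # historical response rate

# beta(1, 1) updated with the historical counts
a_s, b_s = 1 + m_hist, 1 + (n_hist - m_hist)

# prior for the experimental treatment, as given on the slide
a_e, b_e = 0.86, 1.14

def beta_mean(a, b):
    return a / (a + b)

mean_s = beta_mean(a_s, b_s)                     # 145 / 337
mean_e = beta_mean(a_e, b_e)                     # 0.86 / 2
ess_s, ess_e = a_s + b_s, a_e + b_e              # effective prior sample sizes
```

Under this reading, the prior for θE is centred where the historical rate is but carries almost no weight, so the trial data dominate the posterior for θE.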
4 Stopping Strategies
Thall-Simon (TS) stopping rule: Prob(θE > θS + .15 | Data) < .05
1. CMAP
2. TSCD: Continuous monitoring based on completely followed patients only
3. TS(1): wait and apply the TS stopping rule after every patient
4. TS(5): wait and apply the TS stopping rule after every 5 patients
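For completely followed patients, the Thall-Simon criterion can be approximated by simulation from two beta distributions. The sketch below assumes the priors θS ~ beta(145, 192) and θE ~ beta(0.86, 1.14) from the historical-data slide and a binary response observed on all n patients, which corresponds to the TSCD-style use of the rule, not to CMAP. Function names, the number of draws and the seed are illustrative.

```python
import random

def prob_improvement(x, n, delta=0.15, n_draws=20_000, seed=0):
    """Monte Carlo estimate of Prob(theta_E > theta_S + delta | x responses in n patients)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        theta_s = rng.betavariate(145, 192)                  # standard-treatment prior
        theta_e = rng.betavariate(0.86 + x, 1.14 + n - x)    # conjugate beta update
        hits += theta_e > theta_s + delta
    return hits / n_draws

def ts_stop(x, n, cutoff=0.05):
    """True if the criterion falls below the cut-off, i.e. the rule says stop."""
    return prob_improvement(x, n) < cutoff
```

With 2 responses in 10 completely followed patients the estimated probability is far below .05, so the rule would stop; with 7 in 10 it would not.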
A null scenario: Nmax=60, Nmin=10
Event times were generated under the historical model, with model-based mean θE = .44.
Patients arrive exponentially at a rate of 5 per 30 days.
                        CMAP         TSCD         TS(1)        TS(5)
%Reject treatment       .82          .79          .82          .79
Mean Duration (days)    286          331          520          469
Duration (Q1,Q3)        (186, 390)   (241, 417)   (171, 788)   (166, 692)
Sample Size (Ave)       34           42           29           33
#Turned away (Ave)      0            0            52           40
An alternative case: Nmax=60, Nmin=10
Event times were generated under the same model with parameters calibrated so that θE = .59.
                        CMAP         TSCD         TS(1)        TS(5)
%Reject treatment       .16          .13          .17          .13
Mean Duration (days)    411          434          689          618
Duration (Q1,Q3)        (389, 467)   (398, 469)   (578, 820)   (549, 711)
Sample Size (Ave)       55           57           53           55
#Turned away (Ave)      0            0            51           37
Randomized Multi-arm Study
Select the best regimen from (E1, E2, E3)
1) Same priors on θS and the θEk’s
2) Stopping rules for arm Ek: Pr(θEk > θS + .15) < .10 OR Pr(θEk < maxj θEj) > .90
3) Nmax = 90 (total), Nmin = 10 (per arm)
4) Randomize evenly among the non-stopped arms
5) Choose the best among the non-stopped arms at the end
6) Consider 4 stopping strategies
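The two-part stopping rule for each arm can be sketched with the same beta machinery. This is an illustration under several assumptions: binary responses on completely followed patients (so it is not CMAP), the priors beta(145, 192) for θS and beta(0.86, 1.14) for each θEk, the maximum taken over all arms passed in, and no stopping before Nmin = 10 patients in an arm. The function name and arguments are hypothetical.

```python
import random

def arm_stop_flags(responses, totals, delta=0.15, n_min=10,
                   n_draws=20_000, seed=0):
    """Return one flag per arm: True if either stopping condition is met."""
    rng = random.Random(seed)
    k = len(responses)
    beats_standard = [0] * k
    not_best = [0] * k
    for _ in range(n_draws):
        theta_s = rng.betavariate(145, 192)
        theta_e = [rng.betavariate(0.86 + x, 1.14 + n - x)
                   for x, n in zip(responses, totals)]
        best = max(theta_e)
        for j in range(k):
            beats_standard[j] += theta_e[j] > theta_s + delta
            not_best[j] += theta_e[j] < best
    flags = []
    for j in range(k):
        futile = beats_standard[j] / n_draws < 0.10     # Pr(theta_Ek > theta_S + .15) < .10
        inferior = not_best[j] / n_draws > 0.90         # Pr(theta_Ek < max_j theta_Ej) > .90
        flags.append(totals[j] >= n_min and (futile or inferior))
    return flags
```

For example, with 2, 3 and 12 responses out of 15 patients in each of the three arms, the first two arms are flagged and the third is not.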
Multi-arm study, cont.
                  PSel                      N
Scene   Design    E1    E2    E3    None    E1   E2   E3   Duration
Null    CMAP      .08   .10   .08   .74     14   15   13   481
        TSCD      .15   .15   .16   .54     23   23   23   606
        TS(1)     .09   .09   .09   .73     13   13   13   879
        TS(5)     .13   .13   .13   .62     15   20   15   749
E3      CMAP      .04   .05   .63   .28     13   12   47   597
        TSCD      .07   .08   .69   .15     20   21   40   632
        TS(1)     .05   .06   .65   .24     11   11   51   950
        TS(5)     .06   .06   .71   .17     15   15   50   821
Conclusions
The price of ignoring censored data: inflation in the null sample size.
For the composite cases, the approximate posterior avoids complex modeling on dependence structure of times to event.
Computation of approximate posterior is easy.
For the simple case, the approximate posterior agrees with a nonparametric estimator based on right-censored data; Susarla and Van Ryzin (1976).
Most recent work: parametric model may be preferred in the simple case.
Accounting for Patient Heterogeneity in Phase II Using Regression
Two or more prognostic subgroups with different historical Pr(response) using “standard” therapy
If a subgroup is stopped early, the remaining sample size goes to the remaining subgroups
Allow different target Pr(response) values within subgroups
Use a regression model to “borrow strength” between subgroups
t = treatment group = E or S
Z = prognostic subgroup = 0, 1, …, K-1
Pr(Response | t, Z, parameters) = logit⁻¹{linear term for (t, Z)}
LINEAR TERMS, built from the subgroup indicators I[Z = k]:
Historical effect of prognostic subgroup k versus baseline group 0 (equal to 0 for k = 0): informative prior, from historical data
E-versus-S treatment effect in prognostic subgroup k: non-informative priors
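A minimal sketch of a logit-linear model of this kind is given below. The slide's exact linear term did not survive in the transcript, so the form used here (an intercept, plus a subgroup effect, plus a subgroup-specific treatment effect when t = E) is an assumption, as are all the names. The numbers come from the two-subgroup example later in the talk, with good prognosis taken as baseline group 0, which is also an assumption.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def response_prob(t, z, intercept, subgroup_effect, treatment_effect):
    """Pr(Response | t, Z = z) under the assumed logit-linear form."""
    eta = intercept + subgroup_effect[z]       # historical part (standard therapy)
    if t == "E":
        eta += treatment_effect[z]             # E-versus-S effect in subgroup z
    return expit(eta)

# Group 0 = good prognosis (historical .45), group 1 = poor prognosis (historical .25)
intercept = logit(0.45)
subgroup_effect = [0.0, logit(0.25) - logit(0.45)]      # baseline effect fixed at 0
# Treatment effects that would hit the targets .60 and .40 exactly
treatment_effect = [logit(0.60) - logit(0.45), logit(0.40) - logit(0.25)]
```

Because each subgroup has its own treatment effect, the model can represent a treatment that reaches its target in one prognostic subgroup but not in the other.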
Early Stopping (“No Go”) Criteria
Given current data Dn, stop accrual in subgroup j if
for j = 0, 1, …, K-1, where pj is a fixed cut-off, usually .01 to .10, calibrated to obtain a design with given false negative rate.
An Example with Two Subgroups
                  Historical Standard Rx    Targets for the Expt'l Rx
Good Prognosis    .45                       .60
Poor Prognosis    .25                       .40
An Example with Two Subgroups
Nmax = 100 (approx. 50 per subgroup)
Apply subgroup-specific early stopping rules after cohorts of 10 patients
The early stopping rules are calibrated to control Pr(STOP | θ = target) = .10 within each subgroup
An Example with Two Subgroups
Early Stopping Rules
Good Prognosis: Target is .45 + .15 = .60. STOP if Pr(θG > .60 | data) is “small”
Poor Prognosis: Target is .25 + .15 = .40. STOP if Pr(θP > .40 | data) is “small”
An Example with Two Subgroups
Accrual may be stopped early in
1) Both subgroups (Trial is stopped)
2) Neither subgroup
3) One subgroup but not the other (“Treatment-subgroup interaction”)
In Case 3, all remaining patients, up to the maximum of 100, are accrued to the subgroup that has not been stopped.
Computer Simulation Results
True values of θ             Ignoring prognosis        Accounting for prognosis
Good prognosis, θG = .60     P(stop) = .42, N = 38     P(stop) = .10, N = 64
Poor prognosis, θP = .25     P(stop) = .42, N = 38     P(stop) = .73, N = 32
Computer Simulation Results
Effects of treatment-subgroup interaction
If the new treatment achieves the target in the “good prognosis” subgroup but not in the “poor prognosis” subgroup, a conventional design ignoring treatment-subgroup interaction has
Pr(False Negative in “Good”) = .42
Pr(False Positive in “Poor”) = 1 - .42 = .58
Take Away Messages
In phase II, or ANY comparative trial :
Account for patient heterogeneity
Account for treatment-subgroup (treatment-covariate) interactions
The method is applied similarly for event times, using means or medians
The “Good” vs “Bad” Prognosis dichotomy may be replaced with “Biomarker +” vs “Biomarker –”
Currently being applied at MDACC to a chemotherapy trial in acute leukemia