goal : beat the frequentists at their own game in phase iii clinical trial design
Post on 03-Jan-2016
Goal: Beat the frequentists at their own game in phase III clinical trial design
Requirements:
Maintain overall false-positive error rate and targeted power
Compare to O’Brien-Fleming, Pocock and Optimal group-sequential designs
The method must be robust, and hence must not depend on the proportional hazards assumption
Bayesian Doubly Optimal Group Sequential Design for Clinical Trials
Solution: A Bayesian Doubly Optimal Group Sequential (BDOGS) Design
(Wathen and Thall, Stat in Medicine, 2008)
1. A robust Bayesian decision-theoretic approach to designing group sequential clinical trials
2. The focus is on two-arm trials with time-to-failure (TTF) outcomes
3. Uses Bayesian adaptive model selection
4. Maintains overall frequentist size and power
Basic Elements of BDOGS
1) Assume the data come from one of M models (characterized by their hazard functions)
2) Before the trial: Derive the Optimal Decision Bounds for each model, and store them
3) During the trial: At each interim analysis, make decisions using the Optimal Decision Bounds of the Optimal Model
4) The optimal boundaries depend on the model, and the model is optimized adaptively. The decision boundaries may therefore change from one interim evaluation to the next
BDOGS illustration
A Doubly Optimal Procedure
Step 1 (Before the Trial): For each of M specific models, obtain the Optimal Decision Boundaries using forward simulation.
Step 2 (During the Trial): Obtain posterior model probabilities for the set of M possible models using approximate Bayes Factors to determine the Optimal Model.
Step 3 (During the Trial): Apply the optimal decision boundaries corresponding to the optimal model at each interim decision based on the most recent data.
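As a rough illustration of Step 2, the sketch below converts per-model log marginal likelihoods into posterior model probabilities and picks the most probable model. The BIC-style approximation to the log marginal likelihood, the function names and all numbers are assumptions made for illustration; the slides only say that approximate Bayes Factors are used.

```python
import math

def bic_log_marginal(max_log_lik, n_params, n_events):
    """BIC-style approximation to a log marginal likelihood (an assumption here)."""
    return max_log_lik - 0.5 * n_params * math.log(n_events)

def posterior_model_probs(log_marginals, priors):
    """Bayes' rule over models, computed on the log scale for numerical stability."""
    log_post = [lm + math.log(p) for lm, p in zip(log_marginals, priors)]
    shift = max(log_post)
    weights = [math.exp(v - shift) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical maximized log-likelihoods for three candidate models at one interim look.
max_log_liks = [-120.4, -118.9, -119.7]
n_params = [2, 4, 4]
n_events = 50

log_marginals = [bic_log_marginal(ll, k, n_events)
                 for ll, k in zip(max_log_liks, n_params)]
probs = posterior_model_probs(log_marginals, priors=[1 / 3, 1 / 3, 1 / 3])

# Index of the model with the highest posterior probability (the "Optimal Model").
optimal_model = max(range(len(probs)), key=probs.__getitem__)
```

In a BDOGS-like procedure, the stored decision boundaries of `optimal_model` would then be the ones applied at this interim look.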
$\delta = \mu_E - \mu_S$ = actual improvement in median failure time of experimental (E) over standard (S), a parameter under the Bayesian model (hence random)
$\delta^*$ = fixed desired improvement in median failure time of E over S
Expected Utility = $\tfrac{1}{2} E_{\delta = 0}(N) + \tfrac{1}{2} E_{\delta = \delta^*}(N)$
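The quantity above averages the expected sample size under the null and under the targeted alternative. A minimal sketch of that arithmetic follows, assuming the trial can stop only at a fixed set of analyses; the look sizes and stopping probabilities are hypothetical numbers chosen for illustration.

```python
def expected_sample_size(look_sizes, stop_probs):
    """E(N) = sum over looks of (sample size at the look) x Pr(trial stops there)."""
    if abs(sum(stop_probs) - 1.0) > 1e-9:
        raise ValueError("stopping probabilities must sum to 1")
    return sum(n * p for n, p in zip(look_sizes, stop_probs))

def average_expected_size(look_sizes, stop_probs_null, stop_probs_alt):
    """Equal-weight average of E(N) under no improvement and under the target."""
    return (0.5 * expected_sample_size(look_sizes, stop_probs_null)
            + 0.5 * expected_sample_size(look_sizes, stop_probs_alt))

look_sizes = [40, 80, 122]          # hypothetical sample sizes at three analyses
stop_probs_null = [0.5, 0.3, 0.2]   # hypothetical Pr(stop at each look), null case
stop_probs_alt = [0.3, 0.4, 0.3]    # hypothetical Pr(stop at each look), alternative

value = average_expected_size(look_sizes, stop_probs_null, stop_probs_alt)
```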
Decision Boundaries
To facilitate computation, for each model BDOGS uses the two parametric boundary functions
$P_U = a_U - b_U \{ N^+(X_n)/N \}^{c_U}$
$P_L = a_L + b_L \{ N^+(X_n)/N \}^{c_L}$
where $N$ = maximum sample size, and
$N^+(X_n)$ = number of failure events in data $X_n$
$(a_U, b_U, c_U, a_L, b_L, c_L)$ characterize the decision boundary for a given model
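A small sketch of how the two boundary functions could be evaluated over the course of a trial. It assumes that cU and cL enter as exponents on the event fraction N+(Xn)/N; the parameter values are hypothetical, and the event counts are borrowed from the NSCLC example later in the slides.

```python
def upper_bound(n_events, n_max, a_u, b_u, c_u):
    """Upper probability cutoff P_U; it decreases as events accumulate when b_u > 0."""
    return a_u - b_u * (n_events / n_max) ** c_u

def lower_bound(n_events, n_max, a_l, b_l, c_l):
    """Lower probability cutoff P_L; it increases as events accumulate when b_l > 0."""
    return a_l + b_l * (n_events / n_max) ** c_l

N_MAX = 122
look_events = [25, 50, 75, 87, 112, 122]

# Hypothetical boundary parameters for a single model.
upper = [upper_bound(ev, N_MAX, a_u=0.99, b_u=0.04, c_u=2.0) for ev in look_events]
lower = [lower_bound(ev, N_MAX, a_l=0.01, b_l=0.30, c_l=1.5) for ev in look_events]

for ev, p_l, p_u in zip(look_events, lower, upper):
    print(f"events={ev:3d}  P_L={p_l:.3f}  P_U={p_u:.3f}")
```

With these parameter values the continuation region between P_L and P_U narrows as the event fraction approaches 1.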
Decision Rules
Superiority of S over E
$\pi_S = \Pr(\text{S superior to E} \mid x) > P_U$: Stop and select S
Superiority of E over S
$\pi_E = \Pr(\text{E superior to S} \mid x) > P_U$: Stop and select E
Futility
$\pi_S < P_L$ and $\pi_E < P_L$: Stop for futility
Acquire more information
$P_L \le \pi_S, \pi_E \le P_U$: Continue randomizing to obtain more information
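The four rules can be written as one function of the two posterior probabilities and the current cutoffs. This is a minimal sketch: the names `pi_s` and `pi_e` (posterior probabilities that S, respectively E, is superior) and the order in which the rules are checked are assumptions.

```python
def interim_decision(pi_s, pi_e, p_l, p_u):
    """Map posterior superiority probabilities and cutoffs to an interim action."""
    if pi_s > p_u:
        return "select S"
    if pi_e > p_u:
        return "select E"
    if pi_s < p_l and pi_e < p_l:
        return "futility"
    return "continue"

# Hypothetical values at one interim analysis.
print(interim_decision(pi_s=0.01, pi_e=0.97, p_l=0.05, p_u=0.95))  # select E
print(interim_decision(pi_s=0.02, pi_e=0.03, p_l=0.05, p_u=0.95))  # futility
print(interim_decision(pi_s=0.30, pi_e=0.40, p_l=0.05, p_u=0.95))  # continue
```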
Forward Simulation
Simulate the entire trial 5000 times assuming no improvement ($\delta = 0$), and 5000 times assuming the targeted improvement ($\delta = \delta^*$):
1. For each interim analysis, calculate the posterior probabilities $\pi_E$ and $\pi_S$ and store them, together with the number of patients and the number of events for each treatment arm.
2. Apply the decision rule d to the stored values to obtain the expected utility for a trial using d.
3. Find the d that maximizes the expected utility. (A complex search algorithm is required.)
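The store-then-search idea can be sketched with a toy model. Everything specific below is an assumption made for illustration: trial data are replaced by a normal score process, the posterior probabilities use a flat prior and an arbitrary margin, event counts stand in for sample size, utility is taken as minus the average sample size, and a small grid replaces the complex search mentioned above. It is not the BDOGS algorithm itself.

```python
import itertools
import math
import random
from statistics import NormalDist

LOOKS = [25, 50, 75, 87, 112, 122]    # event counts at each analysis (NSCLC example)
N_MAX = LOOKS[-1]
THETA_ALT = 0.5 * math.log(2.0)       # toy per-event drift when the median doubles
MARGIN = THETA_ALT / 4                # arbitrary margin inside pi_S and pi_E
PHI = NormalDist().cdf

def simulate_paths(theta, n_trials, rng):
    """Simulate trials once and store (events, pi_S, pi_E) at every look."""
    paths = []
    for _ in range(n_trials):
        score, prev, path = 0.0, 0, []
        for ev in LOOKS:
            inc = ev - prev
            score += rng.gauss(theta * inc, math.sqrt(inc))
            prev = ev
            est, sd = score / ev, 1.0 / math.sqrt(ev)   # flat-prior posterior of theta
            pi_e = 1.0 - PHI((MARGIN - est) / sd)
            pi_s = PHI((-MARGIN - est) / sd)
            path.append((ev, pi_s, pi_e))
        paths.append(path)
    return paths

def run_rule(path, rule):
    """Apply one boundary parameter set to a stored path: (decision, events used)."""
    a_u, b_u, c_u, a_l, b_l, c_l = rule
    for ev, pi_s, pi_e in path:
        frac = ev / N_MAX
        p_u = a_u - b_u * frac ** c_u
        p_l = a_l + b_l * frac ** c_l
        if pi_s > p_u:
            return "S", ev
        if pi_e > p_u:
            return "E", ev
        if pi_s < p_l and pi_e < p_l:
            return "futility", ev
    return "none", N_MAX

def evaluate(rule, null_paths, alt_paths):
    """Estimate utility, size and power of a rule from the stored simulations."""
    null = [run_rule(p, rule) for p in null_paths]
    alt = [run_rule(p, rule) for p in alt_paths]
    size = sum(d in ("S", "E") for d, _ in null) / len(null)
    power = sum(d == "E" for d, _ in alt) / len(alt)
    mean_null = sum(n for _, n in null) / len(null)
    mean_alt = sum(n for _, n in alt) / len(alt)
    return -(0.5 * mean_null + 0.5 * mean_alt), size, power

def search(null_paths, alt_paths, max_size=0.05, min_power=0.90):
    """Grid search for the highest-utility rule meeting the size and power targets."""
    grid = itertools.product([0.95, 0.99], [0.0, 0.04], [1.0],
                             [0.0, 0.05], [0.0, 0.3], [1.0])
    best = None
    for rule in grid:
        utility, size, power = evaluate(rule, null_paths, alt_paths)
        if size <= max_size and power >= min_power:
            if best is None or utility > best[0]:
                best = (utility, rule, size, power)
    return best   # None if no rule on the grid meets both constraints

rng = random.Random(2008)
null_paths = simulate_paths(0.0, 2000, rng)        # the slides use 5000 per case
alt_paths = simulate_paths(THETA_ALT, 2000, rng)
best = search(null_paths, alt_paths)
```

Because the posterior probabilities are stored once, each candidate rule is evaluated by replaying the same simulated trials rather than re-simulating, which is what keeps a search over many rules affordable.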
Examples of Hazard Functions (Models)
The hazard function for M1 (exponential distribution) is constant
A Metastatic Non-Small Cell Lung Cancer (NSCLC) Trial
Median overall survival (OS) in metastatic NSCLC is about 4 months
A phase III trial of localized surgery or radiation therapy versus systemic chemotherapy for metastatic NSCLC was designed with the goal of improving median progression-free survival (PFS) from 4 to 8 months
Initially, a conventional .05/.90 group sequential design with O’Brien-Fleming boundaries was planned, with up to 3 tests at 30, 60 and 89 events.
Under the “usual” assumptions, accruing 2 to 4 patients/month, a typical O’Brien-Fleming .05/.90 group sequential design will require about 100 to 120 patients and take about 2½ to 4½ years to complete
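For orientation, the event count behind a .05/.90 design can be approximated with Schoenfeld's formula for the log-rank test. The slides do not say how the event counts were derived, so the formula, the two-sided test, 1:1 randomization and exponential failure times (which give a hazard ratio of 4/8 = 0.5) are all assumptions of this sketch.

```python
import math
from statistics import NormalDist

def schoenfeld_events(hazard_ratio, alpha=0.05, power=0.90):
    """Approximate events for a two-sided log-rank test with 1:1 allocation."""
    z = NormalDist().inv_cdf
    return 4.0 * (z(1.0 - alpha / 2.0) + z(power)) ** 2 / math.log(hazard_ratio) ** 2

# Exponential medians of 4 and 8 months imply a hazard ratio of 0.5.
hazard_ratio = 4.0 / 8.0
events = schoenfeld_events(hazard_ratio)
events_rounded_up = math.ceil(events)
```

Under these assumptions the fixed-sample requirement is about 88 events, in the neighborhood of the 89 events at which the final O’Brien-Fleming test was planned.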
Analysis of Historical Data on PFS time in Metastatic NSCLC
A preliminary goodness-of-fit analysis, based on a published Kaplan-Meier plot of PFS times of NSCLC patients with metastatic disease, showed that the Log Normal distribution gave a much better fit than the Weibull or Exponential.
The proportional hazards assumption was very likely invalid.
The hazard function was very likely non-monotone.
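To see why a Log Normal fit implies a non-monotone hazard, the sketch below evaluates hazard functions for three distributions that share a median of 4 months. The Log Normal scale parameter (sigma = 1) and the Weibull shape are hypothetical values, not estimates from the historical data.

```python
import math

def exponential_hazard(t, median):
    """Constant hazard of an exponential distribution with the given median."""
    return math.log(2.0) / median

def weibull_hazard(t, median, shape):
    """Weibull hazard: increasing if shape > 1, decreasing if shape < 1."""
    scale = median / math.log(2.0) ** (1.0 / shape)
    return (shape / scale) * (t / scale) ** (shape - 1.0)

def lognormal_hazard(t, median, sigma):
    """Log-normal hazard = density / survival; it rises to a peak and then falls."""
    z = (math.log(t) - math.log(median)) / sigma
    density = math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))
    survival = 0.5 * math.erfc(z / math.sqrt(2.0))
    return density / survival

for t in [0.5, 2.0, 4.0, 8.0, 16.0, 64.0]:
    print(f"t={t:5.1f}  exp={exponential_hazard(t, 4.0):.3f}  "
          f"weibull={weibull_hazard(t, 4.0, 1.5):.3f}  "
          f"lognormal={lognormal_hazard(t, 4.0, 1.0):.3f}")
```

Because the Log Normal hazard first rises and then falls, the ratio of two such hazards generally changes over time, which is consistent with the doubt about proportional hazards noted above.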
A BDOGS Design for the NSCLC Trial
1. Test $H_0: \delta = 0$ versus $H_1: \delta \ne 0$
2. Assume med(T) = 4 months for standard therapy
3. Type I error = .05, power = .90 for $\delta$ = 4 months, i.e. an improvement to med(T) = 8 months
4. Assume accrual of 2 patients per month
5. Up to 5 interim analyses + 1 final analysis, at 25, 50, 75, 87, 112 and 122 events
6. Five possible models
Possible Models (Hazard Functions)
M1 = constant (Exponential model)
M2 = increasing
M3 = decreasing
M4 = initially increasing, then a slight decrease
M5 = initially increasing, then a large decrease
A priori, the 5 models were assumed to be
equally likely: Pr(M1) = …= Pr(M5) = .20.
Non-Constant Hazard Functions (Models)
Simulation Study for the NSCLC Trial
For comparability in the simulations:
An O’Brien-Fleming design was constructed to have the same 6 looks, for both superiority (reject the null) and inferiority (accept the null) decisions.
Both designs had the same maximum sample size N = 122 patients.
For each case (underlying true PFS distribution) studied, the data were simulated ahead of time and each method was presented with the same data.
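Presenting both methods with the same pre-simulated data can be sketched as follows. The Log Normal generator with sigma = 1, the absence of censoring and accrual, and the two `method_*` functions are placeholders invented for illustration; they are not BDOGS or O’Brien-Fleming.

```python
import math
import random
from statistics import median

def simulate_dataset(rng, n_per_arm=61, median_s=4.0, median_e=8.0, sigma=1.0):
    """Log-normal PFS times in months for each arm (61 + 61 = 122 patients)."""
    return {
        "S": [rng.lognormvariate(math.log(median_s), sigma) for _ in range(n_per_arm)],
        "E": [rng.lognormvariate(math.log(median_e), sigma) for _ in range(n_per_arm)],
    }

def make_datasets(seed, n_trials):
    """Generate all trial datasets ahead of time from one seeded generator."""
    rng = random.Random(seed)
    return [simulate_dataset(rng) for _ in range(n_trials)]

def method_a(data):
    """Placeholder analysis that uses every patient."""
    return median(data["E"]) > median(data["S"])

def method_b(data):
    """Placeholder analysis that uses only the first 30 patients per arm."""
    return median(data["E"][:30]) > median(data["S"][:30])

datasets = make_datasets(seed=122, n_trials=500)

# Each method sees exactly the same simulated trial, so differences in the
# results reflect the methods rather than simulation noise.
paired = [(method_a(d), method_b(d)) for d in datasets]
agreement = sum(a == b for a, b in paired) / len(paired)
```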
Non-constant Hazards Used in Simulation Study for S (solid line) and E (dashed line)
[Figure, panel A: Null Case. Box plots of sample size for BDOGS (B) and O’Brien-Fleming (OF), paired by simulated case: Exp, LN-BF, W-BF, WD, LN-ID2, LN-ID3, Exp]
Simulation Results: Null Case
B = BDOGS, OF = O’Brien-Fleming
Lower and upper lines = 2.5 and 97.5 percentiles
Line in box = median
Box = 25 to 75 percentiles
Dot in box = mean
[Figure, panel B: Alternative Case. Box plots of sample size for BDOGS (B) and O’Brien-Fleming (OF), paired by simulated case: Exp, LN-BF, W-BF, WD, LN-ID2, LN-ID3, WI]
Simulation Results: Alternative Case
B = BDOGS, OF = O’Brien-Fleming
Lower and upper lines = 2.5 and 97.5 percentiles
Line in box = median
Box = 25 to 75 percentiles
Dot in box = mean
Simulation Results
If the hazard is constant (Exponential), both BDOGS and OF maintain the targeted size and power, but OF requires a much larger sample (33% to 51% more patients).
If the failure times are Log Normal, both BDOGS and OF maintain the targeted size and power, but OF requires a much larger sample.
If the failure times are Weibull, both BDOGS and OF maintain the targeted power, BDOGS has a reduced size (.02), and OF requires a much larger sample.
If the failure times are Weibull with decreasing hazard, BDOGS has size .07, OF has reduced power (.81), and OF requires a much larger sample.
If the failure times are Weibull with increasing hazard, both methods have greatly reduced size (.01), OF has greatly increased power (.99), and OF requires a 61% to 141% larger sample.