Optimally Stratified Randomization
J Chipman
Introduction
Extensions
Implementation
Summary
References
Good Practices and Implementation Methods for Optimally Stratified Randomization
Jonathan Chipman, Cole Beck, Robert Greevy
Department of Biostatistics, Vanderbilt University School of Medicine
Midwest Biopharmaceutical Statistics Workshop
May 17, 2016
Outline
1 Optimally Stratified Randomization
2 Extending Matching On-The-Fly
3 R Implementation
4 Summary
Introduction
Randomization, on the whole, prevents confounding
Increased balance of covariates increases power to detect a treatment effect
Optimal Stratified Randomization
Robert Greevy, Bo Lu, Jeffrey H Silber, and Paul Rosenbaum. Optimal multivariate matching before randomization. Biostatistics (Oxford, England), 5(2):263–275, April 2004
Non-bipartite matches via Mahalanobis Distance
Efficiency gains up to 7% vs unrestricted randomization
Requires knowledge of all participants/clusters at start
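As a rough illustration of the distance these matches are built on, the sketch below computes a pairwise Mahalanobis distance matrix in base R. The simulated covariate matrix `X` and all object names are assumptions for illustration only; in practice the nbpMatching package handles this step, and a non-bipartite matching algorithm then pairs units on the resulting matrix.

```r
# Pairwise Mahalanobis distances among all units (base R sketch).
set.seed(2016)
n <- 8                                   # hypothetical number of units
p <- 3                                   # hypothetical number of covariates
X <- matrix(rnorm(n * p), ncol = p)      # simulated covariates (assumption)

S_inv <- solve(cov(X))                   # inverse sample covariance
D <- matrix(0, n, n)
for (i in seq_len(n)) {
  # mahalanobis() returns squared distances, so take the square root
  D[i, ] <- sqrt(mahalanobis(X, center = X[i, ], cov = S_inv, inverted = TRUE))
}
round(D[1:3, 1:3], 2)
```

Small entries of `D` mark units that are similar on all covariates at once, which is what makes them good candidates for a matched pair.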
OptimallyStratified
Randomization
J Chipman
Introduction
Extensions
Implementation
Summary
References
Matching on-the-fly
Adam Kapelner and Abba Krieger. Matching on-the-fly: Sequential allocation with higher power and efficiency. Biometrics, 70(2):378–388, January 2014
Matching on-the-fly, some additional notes
Requires pre-specifying:
  Initial reservoir size
  Threshold to denote degree of similarity
Threshold: Mahalanobis distance (MD) of random pairs scales to F(p, n−p)
Does not require reservoir to deplete
  May result in unequal treatment group sizes
  Kapelner and Krieger provide an estimator that accounts for the mix of matched and unmatched participants
In their simulations, saw a great increase in power given nonlinear covariates
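The scaling to F(p, n−p) can be written out as follows. This is a sketch under stated assumptions, not a quotation from the cited paper: the two participants' covariate vectors $x_i$ and $x_j$ are taken to be independent multivariate normal with common covariance, and $S$ is the sample covariance of the $n$ participants enrolled so far, with $p$ covariates.

```latex
D^2_{ij} = (x_i - x_j)^\top S^{-1} (x_i - x_j),
\qquad
\frac{n - p}{p\,(n - 1)} \cdot \frac{D^2_{ij}}{2} \;\sim\; F(p,\, n - p)
```

The factor of 1/2 reflects that the difference of two independent participants has twice the covariance of one. The result is exact only when $S$ is estimated independently of the pair being compared, so it should be read as an approximation here.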
What does matching on-the-fly look like?
[Figure: reservoir size after each enrollment (y-axis, 2 to 10) plotted against study progression (x-axis, 0 to 60 enrollments)]
Initial reservoir size of 10
60 participants, conditioning on 6 IID standard normal covariates
Threshold, λ, of 0.50
Extending Matching On-The-Fly
How to choose initial reservoir size?
A dynamic threshold can allow for an initial reservoir size of one
How to choose threshold?
Use a dynamic threshold that changes based upon reservoir size and expected remaining enrollment
What if covariates are non-normal?
Compare against empirically estimated distribution of unrestricted randomization
Dynamic Threshold
Goals of static vs dynamic threshold are not same:
Static threshold: only sufficiently close matches are paired
Dynamic threshold: everyone matches by end of study
Dynamic threshold is $F^{-1}_{(p,\,n-p)}(Q)$, where:
$$Q = P(\text{matching to reservoir member} \mid \text{enrollment}) = \frac{\#\text{ in reservoir}}{\#\text{ in reservoir} + \#\text{ expected remaining participants}}$$
Can allow initial reservoir size of 1
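A minimal sketch of the dynamic threshold in R follows. The function name `dynamic_threshold` and its argument names are hypothetical, and taking n to be the number enrolled so far is an assumption; `qf()` is the base R quantile function of the F distribution.

```r
# Dynamic threshold: the Q-th quantile of F(p, n - p), where Q is the share of
# potential future partners that are already sitting in the reservoir.
dynamic_threshold <- function(n_reservoir, n_remaining, p, n_enrolled) {
  Q <- n_reservoir / (n_reservoir + n_remaining)
  qf(Q, df1 = p, df2 = n_enrolled - p)
}

# Early in the study: one reservoir member, 50 participants still expected
early <- dynamic_threshold(n_reservoir = 1, n_remaining = 50, p = 6, n_enrolled = 10)

# Late in the study: three reservoir members, 2 participants still expected
late <- dynamic_threshold(n_reservoir = 3, n_remaining = 2, p = 6, n_enrolled = 58)

c(early = early, late = late)
```

With few reservoir members and many participants still expected, Q is small and only very close matches are accepted. As the expected remaining enrollment shrinks to zero, Q approaches 1 and the threshold grows without bound, so any remaining match is accepted.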
What does using the dynamic threshold look like:
[Figure: reservoir size after each enrollment (y-axis, 0 to 10) plotted against study progression (x-axis, 0 to 60 enrollments); darker red means match forced for balance]
60 participants
Conditioning on 6 IID standard normal covariates
How do dynamic vs static threshold compare?
A dynamic threshold has some benefits:
No need to pre-specify the threshold or the initial reservoir size
No data simulation required
No head-to-head comparison, so far.
Non-normal covariates
Null distribution: estimate by bootstrap sampling n/2 pairs from the final Mahalanobis distance matrix
Sequential enrollment:
Similarly estimate the null, but from current enrollment
Estimated distribution is biased, though the dynamic threshold can help reduce the impact of the bias.
If interested, further slide available during questions.
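One reading of the bootstrap step above is sketched here in base R: repeatedly pair participants at random, as unrestricted randomization would, and collect the resulting pair distances as the empirical null. The helper `null_distances`, the simulated lognormal covariates, and the illustrative value of `Q` are all assumptions, not the authors' code.

```r
# Empirical null distribution of pair distances under random pairing.
set.seed(7)
n <- 20                                      # participants enrolled so far (even)
X <- matrix(rlnorm(n * 4), ncol = 4)         # simulated non-normal covariates

# Mahalanobis distance matrix: whiten with the Cholesky factor, then use
# ordinary Euclidean distances on the whitened data.
D <- as.matrix(dist(X %*% solve(chol(cov(X)))))

null_distances <- function(D, n_boot = 1000) {
  n <- nrow(D)
  draws <- replicate(n_boot, {
    perm  <- sample(n)                       # random ordering of participants
    left  <- perm[seq(1, n - 1, by = 2)]     # first member of each random pair
    right <- perm[seq(2, n, by = 2)]         # second member of each random pair
    D[cbind(left, right)]                    # the n/2 random-pair distances
  })
  as.vector(draws)
}

null_d <- null_distances(D)
Q <- 0.25                                    # illustrative matching probability
threshold <- unname(quantile(null_d, probs = Q))
threshold
```

A new participant would then be matched to a reservoir member only if their distance falls below `threshold`, mirroring the F-quantile rule used when covariates are assumed normal.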
Efficiency gains
[Figure: OSR power relative to unrestricted randomization (y-axis, 0.95 to 1.15) versus adjusted R² of non-treatment covariates (x-axis, 0.0 to 0.8), for a 1/2 standard deviation treatment effect; annotated "FREE POWER !!!"]
n per group: 68, 55, 42, 30, 18
6 IID standard normal covariates
Sample size yields 80% power under unrestricted randomization
Implementing OSR
All participants/clusters known at start of trial
http://biostat.mc.vanderbilt.edu/wiki/Main/MatchedRandomization
Example to follow
Sequential Randomized Trials
Example pseudo code to follow
Single-institution with a statistician: doable
Multi-center institution: current effort for REDCap implementation
Estimating treatment effect
Covariate-adjusted linear model
t-test (conservative)
Average Treatment Effect [Kapelner and Krieger, 2014]
Design-based estimator [Imai et al., 2009]
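The first two estimators in this list can be sketched in base R on simulated data. Everything below is an assumption for illustration: the data-generating model, the variable names, and the reading of "t-test" as a two-sample test that ignores the matching.

```r
# Simulated matched-pair trial: one treated and one control unit per pair.
set.seed(3)
n_pairs <- 30
x   <- rnorm(2 * n_pairs)                  # a prognostic covariate
trt <- rep(c(0, 1), times = n_pairs)       # treatment indicator
y   <- 0.5 * trt + x + rnorm(2 * n_pairs)  # outcome with a 0.5 treatment effect

# Covariate-adjusted linear model: treatment effect adjusted for x
fit <- lm(y ~ trt + x)
adj_est <- unname(coef(fit)["trt"])

# Two-sample t-test that ignores the covariate and the matching
tt <- t.test(y ~ trt)

adj_est
tt$p.value
```

Because the covariate explains part of the outcome variance, the adjusted model typically gives a tighter interval for the treatment effect than the unadjusted t-test on the same data.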
Cluster Randomization (known at study onset)
# install from one of the three sources below
# installing from CRAN
install.packages("nbpMatching")
# installing from R-Forge
install.packages("nbpMatching",
                 repos = "http://R-Forge.R-project.org")
# installing from GitHub
library(devtools)
install_github("couthcommander/nbpMatching")
# load the package
library(nbpMatching)
Cluster Randomization (known at study onset)
library(nbpMatching)
# starting with example dataframe of 4 covariates
X[1:2,]
X1 X2 X3 X4 ID
1 0.53335445 1.9885732 -1.3974614 -0.0238064 ID1
2 0.03613734 0.3209346 0.4342036 0.4024346 ID2
# create the distance matrix
d1 <- gendistance(X,id="ID", ... )
d2 <- distancematrix(d1)
# create the optimal matching
m1 <- nonbimatch(d2)
m1$halves[1:2,]
Group1.ID Group1.Row Group2.ID Group2.Row Distance
1 ID1 1 ID27 27 1.2928466
2 ID2 2 ID7 7 0.3969663
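Once the pairs are formed, treatment is randomized within each pair. The sketch below shows one way to do that in base R. The data frame `halves` is a hypothetical stand-in that copies only the two rows and ID columns printed above, so the example runs without the package; the `assignment` layout and arm labels are likewise assumptions.

```r
# Stand-in for m1$halves, using the two pairs shown in the output above.
halves <- data.frame(
  Group1.ID = c("ID1", "ID2"),
  Group2.ID = c("ID27", "ID7"),
  stringsAsFactors = FALSE
)

set.seed(20160517)
# One fair coin flip per pair decides which member receives treatment
flip <- rbinom(nrow(halves), size = 1, prob = 0.5) == 1

assignment <- data.frame(
  ID   = c(halves$Group1.ID, halves$Group2.ID),
  pair = rep(seq_len(nrow(halves)), times = 2),
  arm  = c(ifelse(flip, "treatment", "control"),
           ifelse(flip, "control", "treatment")),
  stringsAsFactors = FALSE
)
assignment[order(assignment$pair), ]
```

Each pair contributes exactly one unit to each arm, so the arms are balanced in size and, through the matching, in covariates.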
The ...
gendistance(X,id="ID", ... )
weights: Upweight certain covariates
rankcols: Transform certain covariates to ranks
ndiscard: More clusters (hospitals) available than needed
See tutorial for more options and examples: http://biostat.mc.vanderbilt.edu/wiki/Main/MatchedRandomization
Sequential Randomization (pseudo code)
for (participant in 1:nexpected) {
  # gendistance() among all enrolled
  #   - Requires p+2 participants to calculate MD
  #   - When fewer than p+2 participants, either randomize or
  #     create fake buffer participants (not to be matched)
  # Calculate and compare to dynamic threshold
  #   - Assuming normal covariates: transform to F(p, n−p) scale
  #   - Otherwise, bootstrap null randomization distribution
  # If less than threshold, match and remove from reservoir
  # Otherwise either:
  #   - Randomize to trt/control and add to reservoir
  #   - Disregard threshold and match (if required to achieve
  #     equal sample size amongst trt/control)
  # Keep track of updated reservoir and treatment assignment
}
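The loop above can be sketched as runnable base R under several assumptions: covariates are simulated as IID standard normal, base R `mahalanobis()` stands in for `gendistance()`, participants enrolled before p+2 are simply randomized, and the F scaling `d2 * (n - p) / (2 * p * (n - 1))` is the approximation for random normal pairs. All object names are hypothetical.

```r
set.seed(1)
n_expected <- 60                                # planned enrollment
p <- 6                                          # number of covariates
X <- matrix(rnorm(n_expected * p), ncol = p)    # simulated covariates

trt <- rep(NA_integer_, n_expected)             # 0 = control, 1 = treatment
reservoir <- integer(0)                         # unmatched participants

for (i in seq_len(n_expected)) {
  n_remaining <- n_expected - i                 # still expected after this one
  matched <- FALSE
  can_compute <- i >= p + 2                     # enough data for the MD

  if (can_compute && length(reservoir) > 0) {
    S <- cov(X[seq_len(i), , drop = FALSE])     # covariance of all enrolled
    # squared MD from the newcomer to every reservoir member
    d2 <- mahalanobis(X[reservoir, , drop = FALSE], center = X[i, ], cov = S)
    f_stat <- d2 * (i - p) / (2 * p * (i - 1))  # assumed F(p, n - p) scaling

    # dynamic threshold based on reservoir size and remaining enrollment
    Q <- length(reservoir) / (length(reservoir) + n_remaining)
    threshold <- qf(Q, df1 = p, df2 = i - p)

    # force a match when the reservoir could otherwise not empty by the end
    force <- length(reservoir) >= n_remaining
    best <- which.min(f_stat)
    if (f_stat[best] <= threshold || force) {
      partner <- reservoir[best]
      trt[i] <- 1L - trt[partner]               # opposite arm to the partner
      reservoir <- reservoir[-best]
      matched <- TRUE
    }
  }

  if (!matched) {
    trt[i] <- rbinom(1, size = 1, prob = 0.5)   # unrestricted coin flip
    reservoir <- c(reservoir, i)
  }
}

table(trt)
length(reservoir)
```

With these settings the forced-match rule empties the reservoir by the final enrollment, so every participant ends in a pair and the two arms have equal size. A study whose enrollment ends before p+2 participants would need the buffer-participant approach from the pseudo code instead.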
Summary
Optimal Stratified Randomization
  Helps remove potential for confounding
  Increases power to detect treatment effect
Extensions to Matching On-the-fly
  No pre-specification needed of threshold and initial reservoir size
  Allow for non-normal covariates
References
http://biostat.mc.vanderbilt.edu/wiki/Main/MatchedRandomization
Robert Greevy, Bo Lu, Jeffrey H Silber, and Paul Rosenbaum. Optimal multivariate matching before randomization. Biostatistics (Oxford, England), 5(2):263–275, April 2004.
Kosuke Imai, Gary King, and Clayton Nall. The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation. Statistical Science, 24(1):29–53, February 2009.
Adam Kapelner and Abba Krieger. Matching on-the-fly: Sequential allocation with higher power and efficiency. Biometrics, 70(2):378–388, January 2014.
Performance of Sequential OSR
[Figure, first panel: empirical CDF of distance amongst pairs (x-axis, 0 to 4) for batch, sequential, and unrestricted randomization; percent buy back 67.0]
60 participants
6 standard normal covariates
[Figure, second panel: empirical CDF of distance amongst pairs (x-axis, 0 to 8) for batch, sequential, and unrestricted randomization; percent buy back 40.6]
60 participants
4 lognormal covariates
2 binary covariates (prevalence 0.2 and 0.4)
Bias Estimating Null Distribution
[Figure: CDF of F(6, n − 6) for normal covariates, plotted against F quantile (x-axis, 0 to 5); legend values 8, 9, 10, 15, 60]
Distances right skewed
Low probability of matching -> little bias
High probability of matching -> later in study