Optimally Stratified Randomization

J Chipman

Introduction

Extensions

Implementation

Summary

References

Good Practices and Implementation Methods for Optimally Stratified Randomization

Jonathan Chipman, Cole Beck, Robert Greevy

Department of Biostatistics, Vanderbilt University School of Medicine

Midwest Biopharmaceutical Statistics Workshop

May 17, 2016

Outline

1 Optimally Stratified Randomization

2 Extending Matching On-The-Fly

3 R Implementation

4 Summary

Introduction

Randomization, on the whole, prevents confounding

Increased balance of covariates increases power to detect a treatment effect

Optimal Stratified Randomization

Robert Greevy, Bo Lu, Jeffrey H Silber, and Paul Rosenbaum. Optimal multivariate matching before randomization. Biostatistics (Oxford, England), 5(2):263–275, April 2004

Non-bipartite matches via Mahalanobis Distance

Efficiency gains up to 7% vs unrestricted randomization

Requires knowledge of all participants/clusters at start
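The Mahalanobis distance between units i and j is sqrt((x_i − x_j)' S^{-1} (x_i − x_j)), with S the sample covariance of the covariates. A minimal base R sketch of the pairwise distance matrix that a non-bipartite matching would take as input is below. The covariate matrix `X` is simulated and hypothetical, and the sketch covers only the distances, not the optimal matching itself.

```r
# Pairwise Mahalanobis distances between all units (base R only).
# X is a hypothetical covariate matrix: 8 units, 3 covariates.
set.seed(1)
X <- matrix(rnorm(8 * 3), nrow = 8, ncol = 3)

S_inv <- solve(cov(X))   # inverse of the sample covariance matrix
n <- nrow(X)
D <- matrix(0, n, n)
for (i in seq_len(n)) {
  # mahalanobis() returns squared distances from every row of X to X[i, ]
  D[i, ] <- sqrt(mahalanobis(X, center = X[i, ], cov = S_inv, inverted = TRUE))
}

round(D[1:3, 1:3], 2)
```

The result is symmetric with a zero diagonal; an optimal matching would pair units so that the summed within-pair distance is as small as possible.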

Matching on-the-fly

Adam Kapelner and Abba Krieger. Matching on-the-fly: Sequential allocation with higher power and efficiency. Biometrics, 70(2):378–388, January 2014

Matching on-the-fly, some additional notes

Requires pre-specifying:

Initial reservoir size

Threshold to denote degree of similarity

Threshold: Mahalanobis distance (MD) of random pairs scales to F(p,n−p)

Does not require reservoir to deplete

May result in unequal treatment group sizes

Kapelner provides estimator that accounts for mix of matched and unmatched participants

In their simulations, saw great increase in power given nonlinear covariates
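The scaling to F(p,n−p) can be sketched in R as follows. This assumes multivariate normal covariates and the usual Hotelling-type scaling of a squared distance between two participants, (n − p) / (2p(n − 1)); the slides state only that the MD scales to F(p,n−p), so the exact constant here is an assumption. The function names `scale_to_f` and `is_close_match` are hypothetical.

```r
# Scale a squared Mahalanobis distance between two participants to the
# F(p, n - p) reference distribution and compare with a static threshold.
# Assumption: the factor 1/2 reflects that the difference of two independent
# participants has covariance 2 * Sigma.
scale_to_f <- function(d2, n, p) {
  d2 * (n - p) / (2 * p * (n - 1))
}

is_close_match <- function(d2, n, p, lambda) {
  # Match when the scaled distance is at or below the lambda quantile of F(p, n - p)
  scale_to_f(d2, n, p) <= qf(lambda, df1 = p, df2 = n - p)
}

# Hypothetical values: squared distance 9, n = 30 enrolled, p = 6 covariates
scale_to_f(9, n = 30, p = 6)
is_close_match(9, n = 30, p = 6, lambda = 0.50)
```

With λ = 0.50, a new participant is paired only when the candidate is closer than half of all random pairs would be under the normal model.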

What does matching on-the-fly look like?

[Figure: reservoir size after each enrollment (y-axis, 2 to 10) against study progression (x-axis, 0 to 60 enrollments)]

Initial reservoir size of 10

60 participants, conditioning on 6 IID standard normal covariates

Threshold, λ, of 0.50

Extending Matching On-The-Fly

How to choose initial reservoir size?

A dynamic threshold can allow for an initial reservoir size of one

How to choose threshold?

Use dynamic threshold that changes based upon reservoir size and expected remaining enrollment

What if covariates are non-normal?

Compare against empirically estimated distribution of unrestricted randomization

Dynamic Threshold

Goals of static vs dynamic threshold are not same:

Static threshold: only sufficiently close matches are paired

Dynamic threshold: everyone matches by end of study

Dynamic threshold is $F^{-1}_{p,\,n-p}(Q)$, where:

$$Q = P(\text{matching to reservoir member} \mid \text{enrollment}) = \frac{\#\text{ in reservoir}}{\#\text{ in reservoir} + \#\text{ expected remaining participants}}$$

Can allow initial reservoir size of 1
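A minimal R sketch of the dynamic threshold above. The function name `dynamic_threshold` and its arguments are hypothetical, and n in F(p,n−p) is taken here to be the number enrolled so far, which is an assumption.

```r
# Dynamic threshold on the F(p, n - p) scale:
#   Q = reservoir / (reservoir + expected remaining)
#   threshold = Q quantile of F(p, n - p)
dynamic_threshold <- function(n_reservoir, n_remaining, n_enrolled, p) {
  q <- n_reservoir / (n_reservoir + n_remaining)
  list(Q = q, threshold = qf(q, df1 = p, df2 = n_enrolled - p))
}

# Hypothetical trial of 60 participants with p = 6 covariates
early <- dynamic_threshold(n_reservoir = 4, n_remaining = 40, n_enrolled = 20, p = 6)
late  <- dynamic_threshold(n_reservoir = 4, n_remaining = 2,  n_enrolled = 58, p = 6)
last  <- dynamic_threshold(n_reservoir = 1, n_remaining = 0,  n_enrolled = 60, p = 6)

c(early = early$Q, late = late$Q, last = last$Q)
```

Early in the study Q is small, so only very close matches are accepted. As enrollment runs out Q approaches 1 and the threshold grows without bound, which is how the dynamic threshold pushes everyone to match by the end of the study.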

What does using the dynamic threshold look like:

[Figure: reservoir size after each enrollment (y-axis, 0 to 10) against study progression (x-axis, 0 to 60 enrollments); darker red means match forced for balance]

60 participants

Conditioning on 6 IID standard normal covariates

How do dynamic vs static threshold compare?

A dynamic threshold has some benefits:

No need to pre-specify threshold or initial reservoir size

No data simulation required

No head-to-head comparison, so far.

Non-normal covariates

Null distribution: estimate from bootstrap sampling n/2 pairs from final Mahalanobis distance matrix

Sequential enrollment:

Similarly estimate the null but from current enrollment

Estimated distribution is biased, though the dynamic threshold can help reduce the impact of the bias.

If interested, further slide available during questions.
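One way to read the bootstrap step is sketched below in base R: each replicate forms a random pairing of the enrolled participants (n/2 pairs) and records the within-pair distances, and the threshold is then a quantile of the pooled distances. The function `null_distances`, the simulated lognormal covariates, and the choice Q = 0.25 are all hypothetical.

```r
# Empirical null for distances between randomly formed pairs, estimated
# from a Mahalanobis distance matrix D of the participants enrolled so far.
null_distances <- function(D, n_boot = 200) {
  n <- nrow(D)
  half <- n %/% 2
  unlist(lapply(seq_len(n_boot), function(b) {
    perm <- sample.int(n)   # a random ordering defines a random pairing
    # pair the first half of the ordering with the second half
    D[cbind(perm[seq_len(half)], perm[half + seq_len(half)])]
  }))
}

set.seed(2016)
# Hypothetical non-normal covariates: 30 enrolled, 4 lognormal covariates
X <- matrix(rlnorm(30 * 4), nrow = 30, ncol = 4)
S_inv <- solve(cov(X))
D <- t(apply(X, 1, function(x) sqrt(mahalanobis(X, x, S_inv, inverted = TRUE))))

null_d <- null_distances(D)
Q <- 0.25   # would come from the dynamic threshold in a sequential trial
threshold <- unname(quantile(null_d, probs = Q))
threshold
```

A candidate match would be accepted when its distance falls below `threshold`, replacing the F(p,n−p) quantile used under normality.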

Efficiency gains

[Figure: OSR relative power to unrestricted randomization (y-axis, 0.95 to 1.15) against adjusted R² of non-treatment covariates (x-axis, 0.0 to 0.8), for a treatment effect of 1/2 standard deviation; n per group: 68, 55, 42, 30, 18; annotated "FREE POWER !!!"]

6 IID standard normal covariates

Sample size yields 80% power under unrestricted randomization

Implementing OSR

All participants/clusters known at start of trial

http://biostat.mc.vanderbilt.edu/wiki/Main/MatchedRandomization

Example to follow

Sequential Randomized Trials

Example pseudo code to follow

Single-institution with a statistician: do-able

Multi-center institution: Current effort for REDCap implementation

Estimating treatment effect

Covariate-adjusted linear model

t-test (conservative)

Average Treatment Effect [Kapelner and Krieger, 2014]

Design-based estimator [Imai et al., 2009]
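The first two estimators in the list can be sketched on simulated data as follows. Everything here is illustrative: the data are simulated, and sorting on one covariate stands in for the optimal matching so that the example needs only base R.

```r
# Simulated paired design; all names and values are illustrative.
set.seed(42)
n <- 60
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))

# Stand-in for optimal matching: sort on x1 and pair adjacent participants
dat <- dat[order(dat$x1), ]
dat$pair <- rep(seq_len(n / 2), each = 2)

# Randomize within each pair: one member to treatment (1), one to control (0)
dat$trt <- as.vector(replicate(n / 2, sample(c(0, 1))))

# Outcome with a true treatment effect of 0.5 and one prognostic covariate
dat$y <- 0.5 * dat$trt + dat$x1 + rnorm(n)

# Covariate-adjusted linear model
adj <- lm(y ~ trt + x1 + x2, data = dat)
adj_est <- coef(summary(adj))["trt", c("Estimate", "Std. Error")]

# Unpaired two-sample t-test, which ignores the matching
# (its difference is control mean minus treatment mean)
unadj <- t.test(y ~ trt, data = dat)

adj_est
unadj$conf.int
```

The unpaired t-test does not use the balance gained by matching, which is one reading of why the slide calls it conservative.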

Cluster Randomization (known at study onset)

# installing from CRAN
install.packages("nbpMatching")

# installing from R-Forge
install.packages("nbpMatching", repos = "http://R-Forge.R-project.org")

# installing from GitHub (requires the devtools package)
library(devtools)
install_github("couthcommander/nbpMatching")

# load the package
library(nbpMatching)

Cluster Randomization (known at study onset)

library(nbpMatching)

# starting with example dataframe of 4 covariates
X[1:2,]
          X1        X2         X3         X4  ID
1 0.53335445 1.9885732 -1.3974614 -0.0238064 ID1
2 0.03613734 0.3209346  0.4342036  0.4024346 ID2

# create the distance matrix
d1 <- gendistance(X, id="ID", ... )
d2 <- distancematrix(d1)

# create the optimal matching
m1 <- nonbimatch(d2)
m1$halves[1:2,]
  Group1.ID Group1.Row Group2.ID Group2.Row  Distance
1       ID1          1      ID27         27 1.2928466
2       ID2          2       ID7          7 0.3969663
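Once the pairs are formed, treatment is randomized within each pair. A minimal base R sketch of that step follows; the `halves` data frame is a hand-built stand-in that copies the two rows of `m1$halves` shown above, so the example runs without the package, and the seed is arbitrary.

```r
# Stand-in for m1$halves, using the two rows printed above
halves <- data.frame(
  Group1.ID = c("ID1", "ID2"),
  Group2.ID = c("ID27", "ID7"),
  Distance  = c(1.2928466, 0.3969663),
  stringsAsFactors = FALSE
)

set.seed(20160517)
# One fair coin per pair: TRUE sends the Group1 member to treatment
flip <- rbinom(nrow(halves), size = 1, prob = 0.5) == 1

assignment <- data.frame(
  pair      = seq_len(nrow(halves)),
  treatment = ifelse(flip, halves$Group1.ID, halves$Group2.ID),
  control   = ifelse(flip, halves$Group2.ID, halves$Group1.ID),
  stringsAsFactors = FALSE
)
assignment
```

Each pair contributes exactly one cluster to treatment and one to control, so the arms are equal in size by construction.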

The ...

gendistance(X,id="ID", ... )

weights: Upweight certain covariates

rankcols: Transform certain covariates to ranks

ndiscard: More clusters (hospitals) available than needed

See tutorial for more options and examples: http://biostat.mc.vanderbilt.edu/wiki/Main/
Sequential Randomization (pseudo code)

for (participant in 1:nexpected) {

  gendistance() among all enrolled
    # Requires p+2 participants to calculate MD
    # When fewer than p+2 participants, either randomize or
    # create fake buffer participants (not to be matched)

  Calculate and compare to dynamic threshold
    # Assuming normal covariates: transform to F(p,n−p) scale
    # Otherwise, bootstrap null randomization distribution

  If less than threshold, match and remove from reservoir

  Otherwise either:
    # Randomize to trt/control and add to reservoir
    # Disregard threshold and match (if required to achieve
    # equal sample size amongst trt/control)

  Keep track of updated reservoir and treatment assignment

}
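The pseudo code above can be sketched as runnable base R under some stated assumptions: normal covariates, the squared Mahalanobis distance scaled by (n − p) / (2p(n − 1)) to the F(p,n−p) scale, simple randomization while fewer than p+2 participants are enrolled, and an even expected enrollment. The function `sequential_osr` is hypothetical and uses `mahalanobis()` in place of `gendistance()` so that it is self-contained.

```r
# Sequential matching with a dynamic threshold (illustrative sketch).
# X: covariate matrix, one row per participant in order of enrollment.
sequential_osr <- function(X) {
  n_total <- nrow(X)
  p <- ncol(X)
  stopifnot(n_total %% 2 == 0, n_total >= 2 * (p + 2))
  arm <- rep(NA_integer_, n_total)      # 1 = treatment, 0 = control
  matched <- rep(FALSE, n_total)
  reservoir <- integer(0)               # enrolled but not yet matched

  for (i in seq_len(n_total)) {
    n <- i                              # number enrolled, including participant i
    r <- length(reservoir)
    best <- NA
    if (n >= p + 2 && r > 0) {
      S_inv <- solve(cov(X[seq_len(n), , drop = FALSE]))
      d2 <- mahalanobis(X[reservoir, , drop = FALSE], X[i, ], S_inv, inverted = TRUE)
      k <- which.min(d2)                # closest reservoir member
      # Dynamic threshold: Q quantile of F(p, n - p)
      Q <- r / (r + (n_total - n))
      close_enough <- d2[k] * (n - p) / (2 * p * (n - 1)) <= qf(Q, p, n - p)
      # Disregard the threshold when a match is required for equal arm sizes
      forced <- r >= n_total - n + 1
      if (close_enough || forced) best <- reservoir[k]
    }
    if (!is.na(best)) {
      arm[i] <- 1L - arm[best]          # opposite arm to the matched partner
      matched[c(i, best)] <- TRUE
      reservoir <- setdiff(reservoir, best)
    } else {
      arm[i] <- rbinom(1, 1, 0.5)       # randomize and add to reservoir
      reservoir <- c(reservoir, i)
    }
  }
  data.frame(id = seq_len(n_total), arm = arm, matched = matched)
}

set.seed(60)
X <- matrix(rnorm(60 * 6), nrow = 60, ncol = 6)   # 60 participants, 6 covariates
res <- sequential_osr(X)
table(res$arm)
```

Because every forced or accepted match assigns the opposite arm to the partner, and the forcing rule empties the reservoir by the last enrollment, the two arms end up the same size in this sketch.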

Summary

Optimal Stratified Randomization

Helps remove potential for confounding

Increases power to detect treatment effect

Extensions to Matching On-the-fly

No pre-specification needed of threshold and initial reservoir size

Allow for non-normal covariates

References

http://biostat.mc.vanderbilt.edu/wiki/Main/


Robert Greevy, Bo Lu, Jeffrey H Silber, and Paul Rosenbaum. Optimal multivariate matching before randomization. Biostatistics (Oxford, England), 5(2):263–275, April 2004.

Kosuke Imai, Gary King, and Clayton Nall. The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation. Statistical Science, 24(1):29–53, February 2009.

Adam Kapelner and Abba Krieger. Matching on-the-fly: Sequential allocation with higher power and efficiency. Biometrics, 70(2):378–388, January 2014.

Performance of Sequential OSR

[Figure, two panels: empirical CDF of distance amongst pairs for batch randomization, sequential randomization, and unrestricted randomization]

Panel 1 (x-axis, 0 to 4): 60 participants, 6 standard normal covariates; percent buy back 67.0

Panel 2 (x-axis, 0 to 8): 60 participants, 4 lognormal covariates and 2 binary covariates (prevalence 0.2 and 0.4); percent buy back 40.6

Bias Estimating Null Distribution

[Figure: CDF of F(6, n − 6) against F quantile (x-axis, 0 to 5) for normal covariates, with curves for n = 8, 9, 10, 15, 60]

Distances right skewed

Low probability of matching -> little bias

High probability of matching -> later in study
