Stratification: Are you a lumper or a splitter?

Upload: winifred-foster

Post on 03-Jan-2016


…and if you are a splitter, how should you split the data and when?

Outline of Stratification Lectures

• Definitions, examples and rationale (credibility)

• Implementation
– Fixed allocation (permuted blocks)
– Adaptive (minimization)

• Rationale: variance reduction
– Pre- and post-stratification

Stratification in randomized trials is different from stratified random sampling, where the population might be divided up into strata (e.g., census tracts) and each stratum is sampled randomly for some pre-specified sample size.

Typical Situation for Stratification in Trials

• Usually, no restriction on number of participants per stratum (goal is to enroll as rapidly as possible and include participants who are representative of target population)

• There are exceptions (sometimes required by the funder or regulatory authority): some trials set goals or caps on the enrollment of certain subgroups:
– ELITE II heart failure trial: at least 85% of patients had to be > 65 years.
– Dietary study to lower BP (DASH): a target of 50% women and 50% blacks.

Stratification

• A procedure in which factors known to be associated with the response (prognostic factors) are taken into account in the design (e.g., randomization)

• Another type of restriction on the randomization.
– Goal of permuted block randomization is to achieve balance on the number in each treatment arm over time.
– Goal of stratification is to achieve balance between groups with respect to important prognostic factors.

• Pre-stratification refers to a stratified design; post-stratification refers to the analysis

Example: Weight Loss Interventions in Clinical Practice (Appel L et al, N Engl J Med 2011)

• 415 participants randomized (1:1:1) to control (n=138), remote support (n=139) or in-person support (n=138) (a modest size trial)

• Methods:
– “Randomization was stratified according to sex and was generated in blocks of 3 and 6 with use of a Web-based program.”
– “The primary analysis was conducted with…repeated measures, mixed-effects. The model included adjustment for clinic, sex, age and race.”

• Results: Female sex n=88 in each treatment group

Post-stratification (def.)

Classification of experimental units into strata, after they have been randomized, for the purpose of data analysis, e.g., stratified analysis of variance (normally distributed response) or Mantel-Haenszel (binary response).

Often adjustment for baseline covariates is carried out using regression methods, e.g., linear regression or analysis of covariance (continuous), logistic regression (binary), or Cox regression (time to event).

This can be done irrespective of whether you employed pre-stratification.

Note: The term post-stratification is sometimes used to describe stratification on data collected post-randomization. Such analyses can be very difficult to interpret. More later on that issue.

General Problems/Issues with Post-Stratification

• Model dependence / data dredging
– How were covariates (stratifying variables) selected?
– How were cut-points (metric) chosen?

• Frequently covariates are not pre-specified
– Partial solution: an analysis plan in the protocol that includes all covariates considered important (pre-stratification variables + others), and an updated analysis plan prior to unblinding the results of the study to investigators.

Possible Stratification Scenarios

• Pre- plus post-stratification

• Pre-stratification only

• Post-stratification only

• Neither pre- nor post- stratification

• Regression adjustment with or without stratification

Examples

• Targeted temperature management after cardiac arrest (N Engl J Med 2013; 369: 2197-2206).
– Unadjusted and adjusted (design variables and design + other variables) Cox regression analyses for mortality (Table S10).

• Vaccine for influenza in children (N Engl J Med 2013; 369: 2481-2489).
– Cox model adjusted for variables used in minimization scheme (the “pre-stratification” variables).

• Solanezumab for Alzheimer’s disease (N Engl J Med 2014; 370: 311-321).
– Mixed model for change from baseline, adjusted for baseline and other covariates.

Advantages of Pre-Stratification

• Prevents “accidental bias” resulting from mal-distribution of important prognostic variables

• Increases precision (if stratifying variables are related to outcome)

• Ensures balance on stratifying factors in early interim analyses (even in large trials)

• Facilitates subgroup analysis by stratifying factor (more optimal allocation ratio)

• Results less subject to criticism

International Conference on Harmonization (ICH) Guideline (E-9 Document)

“Stratification by important prognostic factors measured at baseline (e.g., severity of disease, age, sex, etc.) may sometimes be valuable in order to promote balanced allocation within strata; this has greater potential benefit in small trials.”

Disadvantages of Pre-Stratification

These relate primarily to the additional administrative burden of implementing the randomization.

• May have several randomization schedules

• Measurements to define stratum must be carefully made prior to randomization

What Stratification Does Not Do

1. Guarantee adequate power to make within-stratum comparisons

2. Eliminate the need to carry out covariate-adjusted analysis

– Chance imbalance on other covariates

– Analysis consistent with design

Criticisms of UGDP

• Definition of target population

• Missing data and eligibility errors

• Differences in baseline characteristics
– “Among the five treatment groups, as well as among clinics, baseline risk factors were also unevenly distributed. This was due to simple randomization of patients without subsequent “stratification” to correct for chance preponderance of antecedent risk factors in one or more of treatment groups.”

• Defects in interpretation (e.g., accounting for adherence)

Seltzer H, Diabetes 1972 (see also Feinstein A, Clin Pharm Ther 1971 and Biometric Society review, JAMA 1975)

UGDP: Baseline Characteristics

Characteristic                 Placebo    Tolbutamide   Insulin Standard   Insulin Variable
                               (N=205)    (N=204)       (N=210)            (N=204)
Age ≥ 55 (%)                   42.3       48.2          46.2               46.2
Male (%)                       30.8       31.5          26.9               22.1
Non-white (%)                  49.8       47.2          51.0               41.2
Hypertension (%)               37.1       29.5          31.1               28.7
Diabetes (%)                   4.5        7.8           5.8                5.1
Angina (%)                     4.5        7.2           7.7                3.6
ECG abnormalities (%)          3.1        4.1           5.3                4.1
Cholesterol ≥ 300 mg/dl (%)    8.8        15.5          16.6               13.8
One or more of the above (%)   47.0       47.8          50.5               42.6

Source: Cornfield J, JAMA 1971

Baseline Characteristics of Patients in Trial to Prevent Toxoplasmic Encephalitis (JID 1994;169:384-94)

                           Placebo    Pyrimethamine
                           (n=132)    (n=264)
CD4+ count (cells/mm3)     96.1       97.4
AIDS OI (%)+               35.2       22.0
Karnofsky Score            89.5       89.7
Hemoglobin (g/dl)          12.6       12.7

+ P=0.007

“In view of the major imbalance between the groups in presentation at baseline with AIDS defining OIs, the rigorousness of the allocation procedures need to be supported in detail if the results are to be regarded as credible.”

NEJM referee for the paper (major reason for rejection)

Example

How a small difference in an important prognostic variable can bias treatment differences.

Baseline Characteristics in Trial of Didanosine (ddI) and Zalcitabine (ddC) (N Engl J Med 1994; 330:657-662)

                            ddI (N=230)       ddC (N=237)
                            Mean     SD       Mean     SD
Age (years)                 37.8     8.5      37.5     7.8
CD4+                        75.1     86.2     71.1     84.3
Karnofsky Score             87.2     11.9     85.3     11.9
Prior AIDS Diagnosis (%)    64.8              66.7

Frequency Distribution of Karnofsky Score by Treatment Group

Karnofsky Score    ddI (%)    ddC (%)
< 70               4.8        6.8
70 - 79            10.0       11.8
80 - 89            21.3       24.1
90 - 99            36.1       36.7
100+               27.8       20.6

Death Rate by Karnofsky Level

Karnofsky Score    Death Rate per 100 Person-years
< 70               169.8
70 - 79            84.0
80 - 89            41.0
90 - 99            31.9
100+               18.4

Comparison of Unadjusted and Adjusted Relative Risk Estimates

              RR (ddC/ddI)    P-value
Unadjusted    0.79            0.11
Adjusted      0.66            0.006
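
To see why the adjustment moves the estimate, one can apply the death rates by Karnofsky level to each arm's Karnofsky distribution. The sketch below is an illustration only, not the trial's analysis: it assumes the level-specific rates apply equally to both arms (no treatment effect) and weights by the percentage of patients in each level rather than by person-time.

```python
# Karnofsky distribution (% of each arm) and death rates per 100
# person-years, copied from the two preceding tables.
pct_ddi = [4.8, 10.0, 21.3, 36.1, 27.8]
pct_ddc = [6.8, 11.8, 24.1, 36.7, 20.6]
death_rate = [169.8, 84.0, 41.0, 31.9, 18.4]


def expected_rate(pct_by_level, rate_by_level):
    """Weighted average of the level-specific rates (weights = % of arm)."""
    return sum(p * r for p, r in zip(pct_by_level, rate_by_level)) / 100.0


rate_ddi = expected_rate(pct_ddi, death_rate)
rate_ddc = expected_rate(pct_ddc, death_rate)
ratio = rate_ddc / rate_ddi

print(f"expected rate, ddI: {rate_ddi:.1f} per 100 person-years")
print(f"expected rate, ddC: {rate_ddc:.1f} per 100 person-years")
print(f"ratio ddC/ddI with no treatment effect: {ratio:.2f}")
```

Under these assumptions the expected rates are about 41.9 (ddI) and 46.8 (ddC), a ratio of about 1.12. The Karnofsky imbalance alone would therefore make ddC look roughly 12% worse, which is in the same direction as the shift from the unadjusted RR (0.79) to the adjusted RR (0.66).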

A major problem with this study is the adjustment for the “small differences at baseline” between didanosine and zalcitabine. While there is a “small difference” noted, the variability for each of these variables is quite large. For example, the difference in CD4 count was 4 cells/mm3 between treatment groups; however, the standard error was over 86 cells/mm3. Similarly, for Karnofsky performance status, the difference between the two groups was 2, but the standard error was 11.9. And, finally, there was no difference in the presence of AIDS-defining illness between the two groups. In short, the conclusion that should be drawn is that there is, indeed, no difference between the two groups and attempting to adjust for these small differences is inappropriate. The discussion of Results on page 23, first paragraph, should be eliminated.

Comments by NEJM referee – this time no rejection!

Summary

• Small differences in a very important prognostic variable (irrespective of significance) can bias treatment comparisons

• Large, significant differences in unimportant variables will not bias treatment comparisons

• Remember a p-value is a function of both sample size and effect size

• Chance imbalances can occur with large sample sizes if there are many strata.
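
The point about p-values can be illustrated with the CD4+ difference from the ddI/ddC baseline table (a difference of 4 cells with SDs of 86.2 and 84.3). The sketch below uses a normal approximation; the second pair of sample sizes is hypothetical (the trial's group sizes multiplied by 100) and is there only to show the dependence on n.

```python
import math


def two_sided_p(diff, sd1, n1, sd2, n2):
    """Two-sided normal-approximation p-value for a difference in means."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = abs(diff) / se
    return math.erfc(z / math.sqrt(2.0))


# Difference in mean CD4+ and the two SDs, from the baseline table.
diff, sd1, sd2 = 4.0, 86.2, 84.3

p_actual = two_sided_p(diff, sd1, 230, sd2, 237)      # trial-sized groups
p_large = two_sided_p(diff, sd1, 23000, sd2, 23700)   # hypothetical: 100x larger

print(f"n = 230/237:     p = {p_actual:.2f}")
print(f"n = 23000/23700: p = {p_large:.1e}")
```

The same 4-cell difference is far from significant at the trial's size and highly significant at the hypothetical larger size, so the p-value by itself says little about whether an imbalance matters for the treatment comparison.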

Stratified Design for Comparing Treatments

              Treatment
Stratum       A            B            Total
1             $m_{1A}$     $m_{1B}$     $m_1$
2             $m_{2A}$     $m_{2B}$     $m_2$
3             $m_{3A}$     $m_{3B}$     $m_3$
4             $m_{4A}$     $m_{4B}$     $m_4$
Total         $n_a$        $n_b$

• Typical situation: $m_1 \neq m_2 \neq m_3 \neq m_4$

• Study is designed/powered based on $n_a$ and $n_b$

• Goal: $m_{iA} = m_{iB}$ for all $i$.

Considerations in the Decision to “Lump” or “Split”

1. Size of study

2. Homogeneity of study subjects

3. Strength of prognostic factors (between strata variability)

4. Administrative burden

5. Credibility

Usual Implementation

• Block randomization within stratum

i.e., prepare a separate randomization schedule for each stratum usually with relatively small block sizes

• It makes no sense to use simple randomization within strata (it does not force balance, so the stratification would accomplish nothing)

Note: The aim of this method is to ensure balance within strata formed by cross-classification of all factor levels.
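
A minimal sketch of block randomization within strata follows. The function names, the strata, the block size and the seed are all illustrative assumptions, not taken from any particular trial; a production system would also need allocation concealment and an audit trail.

```python
import random


def permuted_block_schedule(n_blocks, block_size, treatments=("A", "B"), rng=None):
    """Return a schedule made of randomly permuted, balanced blocks."""
    if block_size % len(treatments) != 0:
        raise ValueError("block size must be a multiple of the number of treatments")
    rng = rng or random.Random()
    schedule = []
    for _ in range(n_blocks):
        block = list(treatments) * (block_size // len(treatments))
        rng.shuffle(block)  # random permutation within the block
        schedule.extend(block)
    return schedule


def stratified_schedules(strata, n_blocks, block_size, seed=None):
    """Prepare one separate schedule per stratum."""
    rng = random.Random(seed)
    return {s: permuted_block_schedule(n_blocks, block_size, rng=rng) for s in strata}


# Hypothetical strata: 2 sites x prior treatment (yes/no) = 4 strata.
strata = [(site, prior) for site in ("site1", "site2") for prior in ("yes", "no")]
schedules = stratified_schedules(strata, n_blocks=3, block_size=4, seed=2016)
for stratum, schedule in schedules.items():
    print(stratum, "".join(schedule))
```

Each stratum gets its own list, and within every completed block of 4 the two treatments appear twice each, so the imbalance within a stratum can never exceed half the block size.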

Typical Stratifying Variables

• Clinical site (a good idea in a multi-center study, as each site can be viewed as a replication of the study)

• Baseline level for outcome of interest

• Stage of disease

• Combination of factors, e.g., a risk score

Stratification Example: TOMHS

• Multi-center (4 clinical sites) trial with two other strata defined by previous use of antihypertensive treatment (Rx) (Yes/No)

• 4 x 2 = 8 strata and randomization schedules – aim is to achieve the desired allocation ratio across all 8 groups

In general, $s$ stratification variables with $I_i$ levels for the $i$th variable result in $\prod_{i=1}^{s} I_i$ strata.
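
The count of strata can be checked by enumerating the cross-classification. In this small sketch the numbers of levels follow the TOMHS example (4 sites, prior treatment yes/no); the level labels are placeholders.

```python
from itertools import product
from math import prod

# Levels of each stratification variable (labels are placeholders).
levels = {
    "site": ["site 1", "site 2", "site 3", "site 4"],
    "prior_rx": ["yes", "no"],
}

# Number of strata = product of the number of levels of each variable.
n_strata = prod(len(v) for v in levels.values())

# The strata themselves: every combination of one level per variable.
strata = list(product(*levels.values()))

print(n_strata)
for stratum in strata:
    print(stratum)
```

Adding a third variable with 3 levels to `levels` would give 24 strata, which shows how quickly the number of separate randomization schedules grows.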

One can calculate the probability of obtaining a certain imbalance before the study begins. This can be used to decide whether to stratify the randomization.

$p(t)$ is the probability that $t$ of the $t_1$ patients in stratum 1 are randomized to group A, where $N_a$ and $N_b$ are the total numbers of patients in groups A and B. For a certain imbalance one can sum $p(t)$ over all $t$ that give that imbalance or worse.

$$p(t) = \frac{\binom{N_a}{t}\binom{N_b}{t_1 - t}}{\binom{N_a + N_b}{t_1}}$$

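Example: $N_a = 100$, $N_b = 100$, $t_1 = 40$, $g = 0.16$, $h = 0.24$ (the fractions of group A and group B that fall in stratum 1)

              Stratum 1    Stratum 2    Total
Group A       16           84           100
Group B       24           76           100
Total         40           160          200

Want the probability of obtaining the imbalance given by $g = 0.16$, $h = 0.24$, or worse.

$$p(t) = \frac{\binom{100}{t}\binom{100}{40 - t}}{\binom{200}{40}}$$

$$\sum_{t \le 16 \text{ or } t \ge 24} p(t) = 0.216$$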
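
The sum in this example can be evaluated directly. The sketch below assumes the hypergeometric form of $p(t)$ described above; the function names are illustrative.

```python
from math import comb


def p(t, n_a, n_b, t1):
    """Probability that t of the t1 stratum-1 patients fall in group A."""
    return comb(n_a, t) * comb(n_b, t1 - t) / comb(n_a + n_b, t1)


def prob_imbalance_or_worse(n_a, n_b, t1, low, high):
    """Sum p(t) over all t with t <= low or t >= high."""
    return sum(p(t, n_a, n_b, t1) for t in range(0, t1 + 1) if t <= low or t >= high)


prob = prob_imbalance_or_worse(n_a=100, n_b=100, t1=40, low=16, high=24)
print(f"{prob:.3f}")
```

The same two functions can be rerun before a study begins with the planned group sizes and an anticipated stratum size, to judge whether the chance of an unacceptable imbalance is large enough to justify stratifying.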

Probability of Given Imbalance or More Extreme

Fraction      Fraction      Total in Stratum
Assigned A    Assigned B    50       100      1000
.52           .48           1.0      .84      .23
.55           .45           .57      .42      .002
.60           .40           .25      .07      –
.70           .30           .01      –        –

Estimates for the Size of Treatment Imbalance

• Let B = block size; K = number of strata; and D = imbalance.

• Hallstrom and Davis (Cont Clin Trials, 1988) showed that, with 2 treatments, the total trial imbalance D in the number of patients assigned to each treatment, summed across all strata, has maximum KB/2 and variance K(B+1)/6

• Example: Cardiac arrhythmia trial with 270 strata (site, ejection fraction, time since MI) and block size of 4.

• Max D = 540; Var (D) = 225; SD (D) = 15; 2 SD = 30.

• In this trial, 4200 patients were to be randomized and an imbalance of 30 with probability = 0.05 was considered acceptable.
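
The arithmetic for the arrhythmia trial can be reproduced in a few lines. The two formulas are the ones quoted above; the per-stratum check is an added assumption, namely that enrollment in each stratum stops at a point equally likely to be any of the B positions within a block, which is one way to arrive at (B+1)/6 per stratum.

```python
import math


def max_imbalance(n_strata, block_size):
    """Largest possible total imbalance, KB/2."""
    return n_strata * block_size / 2


def imbalance_variance(n_strata, block_size):
    """Variance of the total imbalance, K(B+1)/6."""
    return n_strata * (block_size + 1) / 6


def per_stratum_variance_check(block_size):
    """Average over stopping points j = 1..B of the variance of (#A - #B)
    after j assignments drawn from one balanced block."""
    b = block_size
    return sum(j * (b - j) / (b - 1) for j in range(1, b + 1)) / b


K, B = 270, 4
var_d = imbalance_variance(K, B)
print("max D =", max_imbalance(K, B))
print("Var(D) =", var_d)
print("SD(D) =", math.sqrt(var_d), " 2 SD =", 2 * math.sqrt(var_d))
print("per-stratum check:", per_stratum_variance_check(B), "vs", (B + 1) / 6)
```

With K = 270 and B = 4 this gives the values on the slide (540, 225, 15 and 30), and the per-stratum check agrees with (B+1)/6 under the stated stopping-point assumption.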

For small studies with a large number of strata, the use of random permuted blocks within strata can be self-defeating.

Example: A study of testicular cancer
• 2 treatments
• 3 stratifying variables
– Stage: 2 levels
– Histology: 3 levels
– Age: 2 levels

No. of strata = 2 x 3 x 2 = 12.

Randomization Schedules for 12 Strata

* Patients randomized

                        Stage I            Stage II
Histology               Age < 15   ≥ 15    Age < 15   ≥ 15
Teratocarcinoma         A*         A*      A*         B*
                        A*         A*      A*         A*
                        B          A*      A*         A*
                        A          B       B          B
                        B          B       B          B
                        B          B       B          A

Embryonal carcinoma     A*         B       B*         A*
                        A*         B       B*         B*
                        B          A       A          B*
                        B          A       A          B*
                        B          A       B          A
                        A          B       A          A

Choriocarcinoma         B*         B       A*         B*
                        B          A       B*         B*
                        A          A       B*         A
                        A          B       B*         A
                        B          B       A          A
                        A          A       A          B

Marginal Totals for Strata

                          A     B
Teratocarcinoma           10    1
Embryonal carcinoma       3     5
Choriocarcinoma           1     6

Stage I                   7     1
Stage II                  7     11

Age: < 15                 8     6
Age: ≥ 15                 6     6

TOTAL                     14    12
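
The marginal totals can be recomputed from the starred (used) assignments in the 12 randomization schedules. The sketch below is a transcription of those entries into a dictionary, so it depends on reading the schedule as 6 assignments per stratum with columns Stage I (< 15, ≥ 15) and Stage II (< 15, ≥ 15); the data structure and function name are illustrative.

```python
from collections import Counter

# Assignments actually used (starred) in each stratum,
# keyed by (histology, stage, age group).
assigned = {
    ("Teratocarcinoma", "I", "<15"): "AA",
    ("Teratocarcinoma", "I", ">=15"): "AAA",
    ("Teratocarcinoma", "II", "<15"): "AAA",
    ("Teratocarcinoma", "II", ">=15"): "BAA",
    ("Embryonal carcinoma", "I", "<15"): "AA",
    ("Embryonal carcinoma", "I", ">=15"): "",
    ("Embryonal carcinoma", "II", "<15"): "BB",
    ("Embryonal carcinoma", "II", ">=15"): "ABBB",
    ("Choriocarcinoma", "I", "<15"): "B",
    ("Choriocarcinoma", "I", ">=15"): "",
    ("Choriocarcinoma", "II", "<15"): "ABBB",
    ("Choriocarcinoma", "II", ">=15"): "BB",
}


def marginal_totals(assigned, factor_index):
    """Count A and B assignments for each level of one stratifying factor."""
    totals = {}
    for stratum, arms in assigned.items():
        totals.setdefault(stratum[factor_index], Counter()).update(arms)
    return totals


for name, idx in (("Histology", 0), ("Stage", 1), ("Age", 2)):
    for level, counts in marginal_totals(assigned, idx).items():
        print(f"{name:10s}{level:22s}A={counts['A']:2d}  B={counts['B']:2d}")

overall = Counter("".join(assigned.values()))
print("TOTAL", overall["A"], overall["B"])
```

No stratum has completed even one block of 6, so blocking within strata has not yet forced any balance: with only 26 patients spread over 12 strata, teratocarcinoma is split 10 to 1 and Stage I is split 7 to 1.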

Minimization

A method of adaptive stratification which balances the marginal treatment totals for each stratification variable.

Interestingly, the European Committee for Proprietary Medicinal Products (CPMP) discourages use of minimization due to concerns about analysis. They note that the methods remain “highly controversial” and are “strongly discouraged”.

References

• Taves DR, Clin Pharmacol Ther 1974; 15:443-53.

• Pocock S, Simon R. Biometrics 1975; 31:103-15.

Some Notation

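Let $X_{ik}$ = number of patients already assigned to treatment $k$ who have the same level of prognostic factor $i$ as the new patient

$k = 1, 2$ (A or B) for our example

$i = 1, 2, \ldots, f$ indexes the prognostic factors of the new patient

$X^t_{ik} = X_{ik}$ if $t \neq k$ and $X^t_{ik} = X_{ik} + 1$ if $t = k$

$X^t_{ik}$ denotes the new allocation if the new patient is assigned to treatment $t$, where $t = 1, 2$ (A, B)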

Lack of Balance Functions

$B(t)$ could be a function of $X_{ik}$ or $X^t_{ik}$ which measures the “lack of balance”. Two examples:

$$B(t) = \sum_{i=1}^{f} X_{ik}, \quad k = t \quad \text{(sum of the marginal totals for treatment } t\text{)}$$

$$B(t) = \sum_{i=1}^{f} \operatorname{range}\left(X^t_{i1}, X^t_{i2}\right)$$

Rule of assignment: Use the treatment with the smallest $B(t)$ with higher probability.

Note: Pocock and Simon’s approach is more general than Taves’. It allows for variation among assignments to be considered (e.g., range) and non-deterministic assignment.

Characteristics of New Patient

Example (Pocock, page 85):

                                                 Number on each treatment
Factor                        Level              A       B       Patient
Performance status            Ambulatory         30      31      x
                              Non-ambulatory     10      9
Age                           < 50               18      17      x
                              ≥ 50               22      23
Disease-free interval         < 2 years          31      32
                              ≥ 2 years          9       8       x
Dominant metastatic lesion    Visceral           19      21      x
                              Osseous            8       7
                              Soft tissue        13      12

2 x 2 x 2 x 3 = 24 strata; x denotes the characteristics of the next patient to be randomized.

Note: Taves would simply sum the marginal totals and randomize to the treatment with the lowest total. In this case, A (76) instead of B (77).

Estimation of B(1)

                             k    $X_{ik}$    $X^1_{ik}$    Range ($|X^1_{i1} - X^1_{i2}|$)
i) Factor 1, Level 1         1    30          31            |31 - 31| = 0
                             2    31          31
ii) Factor 2, Level 1        1    18          19            |19 - 17| = 2
                             2    17          17
iii) Factor 3, Level 2       1    9           10            |10 - 8| = 2
                             2    8           8
iv) Factor 4, Level 1        1    19          20            |20 - 21| = 1
                             2    21          21

B(1) = 0 + 2 + 2 + 1 = 5

Estimation of B(2)

                             k    $X_{ik}$    $X^2_{ik}$    Range ($|X^2_{i1} - X^2_{i2}|$)
i) Factor 1, Level 1         1    30          30            |30 - 32| = 2
                             2    31          32
ii) Factor 2, Level 1        1    18          18            |18 - 18| = 0
                             2    17          18
iii) Factor 3, Level 2       1    9           9             |9 - 9| = 0
                             2    8           9
iv) Factor 4, Level 1        1    19          19            |19 - 22| = 3
                             2    21          22

B(2) = 2 + 0 + 0 + 3 = 5

Since B(1) = B(2), toss a coin for the next patient.
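
The two calculations can be reproduced with a short sketch of the range-based lack-of-balance function. The dictionary holds the current (A, B) totals for the four levels that match the new patient in the Pocock example; the function names are illustrative and the code covers only the two-treatment case.

```python
# Current marginal totals (A, B) for the levels matching the new patient.
current = {
    "performance status: ambulatory": (30, 31),
    "age: < 50": (18, 17),
    "disease-free interval: >= 2 years": (9, 8),
    "dominant lesion: visceral": (19, 21),
}


def lack_of_balance_range(totals, t):
    """B(t): sum over factors of the range of the totals after assigning arm t.

    t is 0 for treatment A (treatment 1) and 1 for treatment B (treatment 2).
    """
    score = 0
    for counts in totals.values():
        updated = list(counts)
        updated[t] += 1  # X^t_ik = X_ik + 1 when k == t
        score += max(updated) - min(updated)
    return score


def taves_totals(totals):
    """Sum of the current marginal totals for each treatment (Taves)."""
    return tuple(sum(c[k] for c in totals.values()) for k in range(2))


b_1 = lack_of_balance_range(current, 0)
b_2 = lack_of_balance_range(current, 1)
print("B(1) =", b_1, " B(2) =", b_2)
print("Taves totals (A, B):", taves_totals(current))
```

This gives B(1) = B(2) = 5, so the range criterion leaves the choice to a coin toss, while the Taves totals of 76 for A and 77 for B would send the patient to A.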

Implementation

The marginal totals must be updated continuously to determine B(t); therefore this is best done at a central coordinating/statistical center.

Flexibility in allocation: Examples

1. P = 1 if B(1) ≠ B(2); P = 1/2 if B(1) = B(2). Simple randomization if equal, deterministic if unequal.

2. P = 2/3 if B(1) ≠ B(2); P = 1/2 if B(1) = B(2).

P denotes Prob (groups become “more equal”), i.e., the probability of assigning the treatment with the smaller B(t).

The more P deviates from 1 when B(1) ≠ B(2), the less effective the balancing.
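
A minimal sketch of such an assignment rule is shown below. The function name, the scores and the seed are hypothetical; the only logic taken from the slide is that the arm with the smaller B(t) is chosen with probability P and that ties are settled by a fair coin.

```python
import random


def choose_treatment(b, p, rng):
    """Pick an arm given lack-of-balance scores b = {"A": B(1), "B": B(2)}.

    With probability p the arm with the smaller score is chosen;
    ties are broken by a fair coin.
    """
    if b["A"] == b["B"]:
        return rng.choice(["A", "B"])
    better = min(b, key=b.get)
    worse = "B" if better == "A" else "A"
    return better if rng.random() < p else worse


rng = random.Random(0)  # seed fixed only to make the sketch repeatable
scores = {"A": 3, "B": 5}  # hypothetical B(1), B(2)
for p in (1.0, 2 / 3):
    picks = [choose_treatment(scores, p, rng) for _ in range(10_000)]
    print(f"P = {p:.2f}: fraction assigned A = {picks.count('A') / len(picks):.3f}")
```

With P = 1 every patient goes to the arm that improves balance, which is fully predictable to anyone who knows the margins; with P = 2/3 about a third of assignments go the other way, trading some balance for unpredictability.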

Theoretical Challenge

• Not true randomization; in some cases deterministic
– Violation of randomization as basis for inference
– If the site knows all the margins, then it can predict the next assignment

• Reality: When done in a multi-center trial with central randomization, it is impossible for sites to predict
– Appears random to the sites
– Basis for inference: We do inference all the time in non-randomized trials; it doesn’t bother us then

Summary

• Unless a very small block size is used, over-stratification is likely with use of block randomization within strata if you have many strata relative to the total sample size.

• Minimization should be considered for situations where you have several important prognostic factors and a small sample size (particularly if you are concerned about using a very small block size).

• Therneau (Cont Clin Trials 1993;14:98-108) suggests that as the number of distinct groups (strata) approaches N/2, adaptive methods be considered.