odac may 3, 2004 1 subgroup analyses in clinical trials stephen l george, phd department of...
TRANSCRIPT
ODAC May 3, 2004 1
Subgroup Analyses in Clinical Trials
Stephen L George, PhD
Department of Biostatistics and Bioinformatics
Duke University Medical Center

Definition of Subgroup Analysis
An analysis of treatment effects within subgroups of patients enrolled on a clinical trial

Frequency of Subgroup Analyses
- Approximately 50% of reports of randomized clinical trials contain at least one subgroup analysis (Pocock et al 1987)
- Deciding on analysis after looking at the data is “dangerous, useful, and often done” (Good 1983)

Problems with Subgroup Analyses
- Increased probability of type I error when H0 true
- Decreased power (increased type II error) in individual subgroups when H1 true
- Difficulty in interpretation

General Assumptions in Clinical Trials
- Hypotheses tested usually address an overall or ‘average’ treatment effect in the study population
- No assumption of homogeneity of effect across subgroups
- Direction, not magnitude, of the treatment effect is expected to be the same in subgroups

Implications
- Overall treatment comparisons are of primary interest
- Stratification or regression techniques can be used to adjust the overall comparison for subgroups or covariates
- Subgroup analyses are generally of secondary interest as “hypothesis generating” techniques for future studies

Pre-planned vs Unplanned Subgroup Analyses
- Pre-planned analyses (hypothesis driven)
  - Subgroup hypotheses specified in advance
  - Control of error rates can, in principle, be addressed
- Unplanned analyses (exploratory)
  - Analyses suggested by the data
  - Exhaustive search for differential treatment effects by subgroups (data dredging)
  - Inflated, and generally unknown, error rates

ICH Guideline E3
Statistical Considerations (Appendix)
“… it is essential to consider the extent to which the analyses were planned prior to the availability of data…This is particularly important in the case of any subgroup analyses, because if such analyses are not preplanned they will ordinarily not provide an adequate basis for definitive conclusions.”

ICH Guideline E9
5.7 Subgroups, Interactions and Covariates
“In most cases…subgroup or interaction analyses are exploratory and should be clearly identified as such;…these analyses should be interpreted cautiously;…any conclusion of treatment efficacy (or lack thereof) or safety based solely on exploratory subgroup analyses are unlikely to be accepted.”

Error Rates in Subgroup Analyses
With k independent subgroups and no difference in treatments, the probability of at least one ‘significant’ subgroup is:
1 - (1 - α)^k
For example, α = 0.05, k = 10 yields
1 - (1 - 0.05)^10 = 0.40
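
The formula above is easy to check numerically. The following is a minimal Python sketch, assuming independent subgroup tests as the slide does; the function name `familywise_error` is illustrative and not from the presentation.

```python
def familywise_error(alpha: float, k: int) -> float:
    """Chance of at least one 'significant' result among k independent
    subgroup tests, each run at level alpha, when no true difference exists."""
    return 1.0 - (1.0 - alpha) ** k


# Tabulate the error rate for a few subgroup counts at alpha = 0.05.
for k in (1, 2, 5, 10, 20, 50, 100):
    print(f"k = {k:3d}: {familywise_error(0.05, k):.3f}")
```

With α = 0.05 the rate is already about 0.40 at k = 10 and keeps climbing toward 1 as k grows.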

[Figure: Error rate as a function of number of subgroups. x-axis: number of subgroups (0 to 100); y-axis: type I error rate (0 to 1).]

Control of Error Rates in Subgroup Analyses
- For planned subgroup analyses, the overall type I error rate can be controlled. One conservative way is to use α* = α/k in each of the subgroup analyses
- In this case, the power (probability of detecting real differences when present) is sharply reduced in individual subgroups
- For unplanned subgroup analyses, k is unknown so the error rates are unknown
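
As a rough illustration of the α* = α/k adjustment, the sketch below computes the per-subgroup level and the resulting overall type I error rate, assuming independent subgroups. The function names are hypothetical.

```python
def adjusted_alpha(alpha: float, k: int) -> float:
    """Per-subgroup significance level alpha* = alpha / k."""
    return alpha / k


def overall_error(per_test_alpha: float, k: int) -> float:
    """Overall type I error rate for k independent tests at per_test_alpha."""
    return 1.0 - (1.0 - per_test_alpha) ** k


alpha = 0.05
for k in (2, 5, 10):
    a_star = adjusted_alpha(alpha, k)
    print(f"k = {k:2d}: alpha* = {a_star:.4f}, "
          f"overall error = {overall_error(a_star, k):.4f}")
```

The overall rate stays just under α for every k, which is why the adjustment is described as conservative.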

Hypothetical Example
- Treatments: Experimental (E) and Control (C)
- Outcome: Overall survival
- Null median: 12 months
- Alt medians: 16 months (E) and 12 months (C)
- 36 month accrual, 12 month follow-up, N = 500
- α = 0.05, 1 - β = 0.80
- Subgroups: 350 males (70%), 150 females

Subgroup Tests (no α adjustment)
- Use α* = 0.05 in each subgroup
- Overall Type I error rate = 0.0975
- Power in males ≈ 0.64, females ≈ 0.33
- Probability that correct conclusion is reached in both subgroups (males, females) under the alternative hypothesis ≈ (0.64)(0.33) ≈ 0.21

Subgroup Tests (adjusted α)
- Use α* = 0.05/2 = 0.025 in each subgroup
- Overall Type I error rate = 1 - (1 - 0.025)^2 ≈ 0.0494
- Power in males ≈ 0.54, females ≈ 0.24
- Probability that correct conclusion is reached in both subgroups (males, females) under the alternative hypothesis ≈ (0.54)(0.24) ≈ 0.13
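
The slides do not say how the subgroup power figures were computed. The sketch below gets close to them under assumptions that are not in the presentation: exponential survival, uniform accrual, 1:1 allocation, a two-sided test, and Schoenfeld's normal approximation for the log-rank test. All names are illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist

ACCRUAL, FOLLOWUP = 36.0, 12.0    # months, from the hypothetical example
MEDIAN_E, MEDIAN_C = 16.0, 12.0   # alternative-hypothesis medians (months)


def event_probability(median: float) -> float:
    """Chance a patient has an event by the analysis, assuming exponential
    survival and entry uniformly spread over the accrual period."""
    lam = log(2.0) / median
    surviving = (exp(-lam * FOLLOWUP) - exp(-lam * (ACCRUAL + FOLLOWUP))) / (lam * ACCRUAL)
    return 1.0 - surviving


def approx_power(n: int, alpha: float) -> float:
    """Approximate log-rank power for n patients split 1:1 (assumption)."""
    events = n * 0.5 * (event_probability(MEDIAN_E) + event_probability(MEDIAN_C))
    log_hr = log(MEDIAN_C / MEDIAN_E)  # hazard ratio E vs C is 12/16 under exponential survival
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf(sqrt(events) * abs(log_hr) / 2.0 - z_alpha)


for label, n in (("All", 500), ("Males", 350), ("Females", 150)):
    print(f"{label:8s} n = {n}: power at 0.05 = {approx_power(n, 0.05):.2f}, "
          f"at 0.025 = {approx_power(n, 0.025):.2f}")
```

Under these assumptions the approximation lands within about 0.01 of the slide values for both the unadjusted and adjusted tests. Small differences are expected because the original calculation method is unknown.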

Aspirin Example
A randomized trial of aspirin and sulfinpyrazone in threatened stroke. The Canadian Cooperative Study Group. N Engl J Med 299: 53-59, 1978.
“Among men the risk reduction for stroke or death was 48 per cent … whereas no significant trend was observed among women…We conclude that aspirin is an efficacious drug for men with threatened stroke.”

Strokes or Deaths: Aspirin Study

           Aspirin   No Aspirin   Total Events   Total Subjects
Males         29         56            85             406
Females       17         12            29             179
Total         46         68           114             585

Risk Reduction: Aspirin Study

           O/E     Risk Reduction   χ²      P-value
Males      0.69        -48%         8.20    0.004
Females    1.18        +42%         0.93    0.35
Total      0.81        -31%         3.95    0.047
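
The event counts from the aspirin study can be used to approximate the O/E and risk-reduction columns. The table does not give the number of subjects per arm, so this sketch assumes equal allocation to aspirin and no aspirin. That is an assumption, not a fact from the slides, and the function name is illustrative.

```python
# (aspirin events, no-aspirin events), taken from the events table
EVENTS = {"Males": (29, 56), "Females": (17, 12), "Total": (46, 68)}


def summarize(aspirin: int, no_aspirin: int) -> tuple[float, float]:
    """Return (O/E, relative change in risk) for the aspirin arm,
    ASSUMING equal numbers of subjects in the two arms."""
    expected = (aspirin + no_aspirin) / 2.0  # events expected under no effect
    return aspirin / expected, aspirin / no_aspirin - 1.0


for group, (asp, no_asp) in EVENTS.items():
    o_over_e, change = summarize(asp, no_asp)
    print(f"{group:8s} O/E = {o_over_e:.2f}, risk change = {change:+.0%}")
```

These crude figures come out close to the reported values but not identical (for example, about -32% rather than -31% overall), as expected when the true arm sizes and the original analysis method are not used. The χ² and P-values cannot be reproduced from the event counts alone.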

Antiplatelet Meta-analysis (1988)
Secondary prevention of vascular disease by prolonged antiplatelet treatment. Antiplatelet Trialists' Collaboration. British Medical Journal 296: 320-331, 1988.
“Overall, allocation to antiplatelet treatment …reduced vascular mortality by 15% … and non-fatal vascular events (stroke or myocardial infarction) by 30% …”

Guidelines for Assessing Reported Subgroup Differences (Oxman and Guyatt 1992)
- A priori hypotheses stated
- Clinical importance of the difference
- Proper assessment of statistical significance
- Consistency across studies
- Indirect supporting evidence

Treatment-Covariate Interactions: A Generalization of Subgroup Concepts
- A treatment-covariate interaction exists when the treatment effect is not the same for all values of a covariate (e.g., gender, age, etc.)
- Quantitative interactions: Treatment effects in the same direction, but of different magnitude in some subgroups (common and even expected)
- Qualitative interactions: Treatment effects in opposite direction (rare)

Treatment-covariate Interactions
- Treatment X (0 for control, 1 for experimental)
- Covariate Z (e.g., Z = 0 for female, 1 for male)
- Outcome Y = β0 + β1 X + β2 Z + β3 XZ

                 Control      Experimental         Trt Effect
Female           β0           β0 + β1              β1
Male             β0 + β2      β0 + β1 + β2 + β3    β1 + β3
Gender Effect    β2           β2 + β3
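
The table can be reproduced directly from the model Y = β0 + β1 X + β2 Z + β3 XZ. The sketch below uses made-up coefficient values purely for illustration; they are hypothetical and do not come from any trial.

```python
def expected_outcome(x: int, z: int, b0: float, b1: float, b2: float, b3: float) -> float:
    """Model from the slide: Y = b0 + b1*X + b2*Z + b3*X*Z."""
    return b0 + b1 * x + b2 * z + b3 * x * z


def treatment_effect(z: int, *betas: float) -> float:
    """Experimental minus control at covariate value z; equals b1 + b3*z."""
    return expected_outcome(1, z, *betas) - expected_outcome(0, z, *betas)


def interaction_type(effect_female: float, effect_male: float) -> str:
    """Classify using the definitions of quantitative and qualitative interaction."""
    if effect_female == effect_male:
        return "none"
    if effect_female * effect_male < 0:
        return "qualitative"   # effects in opposite directions
    return "quantitative"      # same direction (or one zero), different size


# Hypothetical coefficients (b0, b1, b2, b3), chosen only for illustration.
betas = (10.0, 2.0, 1.0, -3.0)
female, male = treatment_effect(0, *betas), treatment_effect(1, *betas)
print(f"female effect = {female}, male effect = {male}, "
      f"interaction: {interaction_type(female, male)}")
```

With these values the female effect is β1 = 2 and the male effect is β1 + β3 = -1, so the interaction is qualitative. Setting β3 = 0 removes the interaction entirely.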

Some Strategies
- Design for overall hypotheses but test within pre-defined subgroups:
  - High overall error rates
  - Low power in subgroups
  - Biased estimates
- Design for overall hypotheses but test for pre-specified treatment-covariate interactions:
  - Low power to detect interactions

Some Strategies (continued)
- Design for overall hypotheses and conduct unplanned (exploratory) analyses of subgroup differences:
  - Higher, but unknown, error rates
  - Hypothesis generating exercise for future study
- Design for pre-specified subgroups or interactions:
  - Control of error rates
  - Large sample sizes

Conclusions
- Pre-planning is key
- Larger studies required for proper subgroup analyses
- Exploratory analyses are good for hypothesis generating but are not convincing alone
- More than one study important for validation