542-05-#1
STATISTICS 542
Introduction to Clinical Trials
SAMPLE SIZE ISSUES
Ref: Lachin, Controlled Clinical Trials 2:93-113, 1981.
542-05-#2
Sample Size Issues
• Fundamental Point
Trial must have sufficient statistical power to detect differences of clinical interest
• High proportion of published negative trials do not have adequate power
Freiman et al, NEJM (1978)
50/71 could miss a 50% benefit
542-05-#3
Example: How many subjects?
• Compare new treatment (T) with a control (C)
• Previous data suggests Control Failure Rate (Pc) ~ 40%
• Investigator believes treatment can reduce Pc by 25%
i.e. PT = .30, PC = .40
• N = number of subjects/group?
542-05-#4
• Estimates only approximate
– Uncertain assumptions
– Over-optimism about treatment
– Healthy screening effect
• Need series of estimates
– Try various assumptions
– Must pick most reasonable
• Be conservative yet be reasonable
542-05-#5
Statistical Considerations
Null Hypothesis (H0): No difference in the response exists between treatment and control groups
Alternative Hypothesis (Ha): A difference of a specified amount (δ) exists between treatment and control
Significance Level (α): Type I Error. The probability of rejecting H0 given that H0 is true
Power = (1 - β) (β = Type II Error): The probability of rejecting H0 given that H0 is not true
542-05-#6
Standard Normal Distribution
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
542-05-#7
Standard Normal Table
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
542-05-#8
Distribution of Sample Means (1)
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
542-05-#9
Distribution of Sample Means (2)
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
542-05-#10
Distribution of Sample Means (3)
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
542-05-#11
Distribution of Sample Means (4)
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
542-05-#12
Distribution of Test Statistics
• Many have a common form
• $\theta$ = population parameter (e.g. difference in means)
• $\hat\theta$ = sample estimate
• Then
– $Z = \dfrac{\hat\theta - E(\hat\theta)}{SE(\hat\theta)}$
• And then Z has (approximately) a Normal (0,1) distribution
542-05-#13
• If the statistic z is large enough (e.g. falls into the red area of the scale), we believe this result is too large to have come from a distribution with mean 0 (i.e. Pc - Pt = 0)
• Thus we reject H0: Pc - Pt = 0, accepting that there is at most a 5% chance that a result this extreme could have come from a distribution with no difference
542-05-#14
Normal Distribution
Ref: Brown & Hollander. Statistics: A Biomedical Introduction. John Wiley & Sons, 1977.
542-05-#15
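Two Groups
$H_0: \mu_C = \mu_T$ or $\mu_C - \mu_T = 0$
$H_1: \mu_C \ne \mu_T$ or $\mu_C - \mu_T \ne 0$
$Z = \dfrac{\bar{X}_C - \bar{X}_T}{\sigma \sqrt{2/N}} \sim N(0,1)$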
542-05-#16
Test Statistics
$Z = \dfrac{\hat\theta - E(\hat\theta)}{\sqrt{V(\hat\theta)}}$
$\hat\theta$ = sample estimate
$\theta$ = population parameter
542-05-#17
Test of Hypothesis
• Two sided vs. One sided

                   Two-sided               One-sided
e.g. H0:           PT = PC                 PT < PC
Reject H0 if:      |z| > Zα                z > Zα
Critical value:    α = .05, Zα = 1.96      α = .05, Zα = 1.645

where z = test statistic and Zα = critical value (classic test)
• Recommend
Zα be same value both cases (e.g. 1.96): two-sided α = .05 or one-sided α = .025, Zα = 1.96 in both
542-05-#18
Typical Design Assumptions (1)
1. α = .05, .025, .01
2. Power = .80, .90
Should be at least .80 for design
3. δ = smallest difference hope to detect
e.g. δ = PC - PT
= .40 - .30
= .10 (25% reduction!)
542-05-#19
Typical Design Assumptions (2)

Significance Level (Two Sided)     Power
α        Zα                        1 - β    Zβ
0.05     1.96                      0.80     0.84
0.025    2.24                      0.90     1.282
0.01     2.58                      0.95     1.645
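The constants in this table can be reproduced from the standard normal quantile function. The following is a minimal sketch using Python's standard library; the function names `z_alpha_two_sided` and `z_beta` are illustrative and not part of the lecture.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_alpha_two_sided(alpha: float) -> float:
    """Critical value Z_alpha such that P(|Z| > Z_alpha) = alpha."""
    return STD_NORMAL.inv_cdf(1 - alpha / 2)


def z_beta(power: float) -> float:
    """Constant Z_beta such that P(Z < Z_beta) = power = 1 - beta."""
    return STD_NORMAL.inv_cdf(power)


if __name__ == "__main__":
    for alpha in (0.05, 0.025, 0.01):
        print(f"two-sided alpha = {alpha:<5}  Z_alpha = {z_alpha_two_sided(alpha):.3f}")
    for power in (0.80, 0.90, 0.95):
        print(f"power = {power:.2f}  Z_beta = {z_beta(power):.3f}")
```

The printed values agree with the table after rounding (the slides round to two or three decimals).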
542-05-#20
Sample Size Exercise
• How many do I need?
• Next question, what’s the question?
• Reason is that sample size depends on the outcome being measured, and the method of analysis to be used
542-05-#21
Simple Case - Binomial
1. H0: PC = PT
2. Test Statistic (Normal Approx.)
$Z = \dfrac{\hat{p}_C - \hat{p}_T}{\sqrt{\bar{p}(1 - \bar{p})(1/N_C + 1/N_T)}}$
where
$\bar{p} = \dfrac{N_C \hat{p}_C + N_T \hat{p}_T}{N_C + N_T}$
3. Sample Size
Assume
• NT = NC = N
• HA: δ = PC - PT
542-05-#22
Sample Size Formula (1): Two Proportions
Simpler Case
$N = \dfrac{2 (Z_\alpha + Z_\beta)^2 \, \bar{p}(1 - \bar{p})}{\delta^2}$
• Zα = constant associated with α: P{|Z| > Zα} = α (two sided!)
(e.g. α = .05, Zα = 1.96)
• Zβ = constant associated with 1 - β: P{Z < Zβ} = 1 - β
(e.g. 1 - β = .90, Zβ = 1.282)
• Solve for Zβ (⇒ 1 - β) or δ
542-05-#23
Sample Size Formula (2): Two Proportions
$N = \dfrac{\left[ Z_\alpha \sqrt{2 \bar{p}(1 - \bar{p})} + Z_\beta \sqrt{P_C(1 - P_C) + P_T(1 - P_T)} \right]^2}{\delta^2}$
• Zα = constant associated with α: P{|Z| > Zα} = α (two sided!)
(e.g. α = .05, Zα = 1.96)
• Zβ = constant associated with 1 - β: P{Z < Zβ} = 1 - β
(e.g. 1 - β = .90, Zβ = 1.282)
542-05-#24
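Sample Size Formula
Power
• Solve for Zβ ⇒ 1 - β
$Z_\beta = \delta \sqrt{\dfrac{N}{2 \bar{p} \bar{q}}} - Z_\alpha$
Difference Detected
• Solve for δ
$\delta = (Z_\alpha + Z_\beta) \sqrt{\dfrac{2 \bar{p} \bar{q}}{N}}$
where $\bar{q} = 1 - \bar{p}$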
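Both rearrangements of the simpler two-proportion formula can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the inputs in the usage lines (478 subjects per group, average proportion .35, difference .10) are chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(n_per_group, p_bar, delta, z_alpha=1.96):
    """Power from Z_beta = delta * sqrt(N / (2 p q)) - Z_alpha."""
    q_bar = 1 - p_bar
    z_b = delta * sqrt(n_per_group / (2 * p_bar * q_bar)) - z_alpha
    return NormalDist().cdf(z_b)  # convert Z_beta to 1 - beta


def detectable_difference(n_per_group, p_bar, z_alpha=1.96, z_beta=1.282):
    """Smallest detectable delta = (Z_alpha + Z_beta) * sqrt(2 p q / N)."""
    q_bar = 1 - p_bar
    return (z_alpha + z_beta) * sqrt(2 * p_bar * q_bar / n_per_group)


if __name__ == "__main__":
    print(f"power = {power_two_proportions(478, 0.35, 0.10):.3f}")
    print(f"detectable delta = {detectable_difference(478, 0.35):.4f}")
```

With these inputs the power comes out close to .90 and the detectable difference close to .10, as expected for a design sized at that power.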
542-05-#25
Simple Example (1)
• H0: PC = PT
• HA: PC = .40, PT = .30
δ = .40 - .30 = .10
• Assume
α = .05, Zα = 1.96 (Two sided)
1 - β = .90, Zβ = 1.282
• $\bar{p}$ = (.40 + .30)/2 = .35
542-05-#26
Simple Example (2)
Thus
a.
$N = \dfrac{\left[ 1.96 \sqrt{2(.35)(.65)} + 1.282 \sqrt{(.3)(.7) + (.4)(.6)} \right]^2}{(.4 - .3)^2} = 476$
N = 476, 2N = 952
b.
$N = \dfrac{2 (1.96 + 1.282)^2 (.35)(.65)}{(.4 - .3)^2} = 478$
N = 478, 2N = 956
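The two calculations in this example can be reproduced with a few lines of code. This is a sketch under the slide's assumptions (Zα = 1.96, Zβ = 1.282, equal group sizes); the function names are illustrative.

```python
from math import sqrt


def n_simple(p_c, p_t, z_alpha=1.96, z_beta=1.282):
    """Per-group N from the simpler formula: 2 (Za + Zb)^2 p(1 - p) / delta^2."""
    p_bar = (p_c + p_t) / 2
    delta = p_c - p_t
    return 2 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / delta ** 2


def n_two_variances(p_c, p_t, z_alpha=1.96, z_beta=1.282):
    """Per-group N using separate variances under H0 and HA (formula 2)."""
    p_bar = (p_c + p_t) / 2
    delta = p_c - p_t
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p_c * (1 - p_c) + p_t * (1 - p_t)))
    return numerator ** 2 / delta ** 2


if __name__ == "__main__":
    print(f"(a) N = {n_two_variances(0.40, 0.30):.1f} per group")
    print(f"(b) N = {n_simple(0.40, 0.30):.1f} per group")
```

Rounded to the nearest integer, the results match the slide: 476 per group for (a) and 478 per group for (b). In practice one would usually round up.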
542-05-#27
Approximate* Total Sample Size for Comparing Various Proportions in Two Groups with Significance Level (α) of 0.05 and Power (1-β) of 0.80 and 0.90

True Proportions                   α = 0.05 (one-sided)      α = 0.05 (two-sided)
pC (Control)  pI (Intervention)    1-β = 0.90  1-β = 0.80    1-β = 0.90  1-β = 0.80
0.60          0.50                 850         610           1040        780
              0.40                 210         160           260         200
              0.30                 90          70            120         90
              0.20                 50          40            60          50
0.50          0.40                 850         610           1040        780
              0.30                 210         150           250         190
              0.25                 130         90            160         120
              0.20                 90          60            110         80
0.40          0.30                 780         560           960         720
              0.25                 330         240           410         310
              0.20                 180         130           220         170
0.30          0.20                 640         470           790         590
              0.15                 270         190           330         250
              0.10                 140         100           170         130
0.20          0.15                 1980        1430          2430        1810
              0.10                 440         320           540         400
              0.05                 170         120           200         150
0.10          0.05                 950         690           1170        870

*Sample sizes are rounded up to the nearest 10
542-05-#28
542-05-#29
Comparison of Means
• Some outcome variables are continuous
– Blood Pressure
– Serum Chemistry
– Pulmonary Function
• Hypothesis tested by comparison of mean values between groups, or comparison of mean changes
542-05-#30
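Comparison of Two Means
• H0: μC = μT, i.e. μC - μT = 0
• HA: μC - μT = δ
• Test statistic for sample means ~ N(μ, σ)
$Z = \dfrac{\bar{X}_C - \bar{X}_T}{\sqrt{\sigma^2 (1/N_C + 1/N_T)}} \sim N(0,1)$ for H0
• Let N = NC = NT for design
$N = \dfrac{2 (Z_\alpha + Z_\beta)^2 \sigma^2}{\delta^2} = \dfrac{2 (Z_\alpha + Z_\beta)^2}{(\delta/\sigma)^2}$
• Power
$Z_\beta = \sqrt{N/2} \, (\delta/\sigma) - Z_\alpha$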
542-05-#31
Example
e.g. IQ: σ = 15, δ = 0.3 x 15 = 4.5
• Set α = .05 (two-sided)
β = 0.10, 1 - β = 0.90
• HA: δ = 0.3σ, δ/σ = 0.3
• Sample Size
$N = \dfrac{2 (1.96 + 1.282)^2}{(0.3)^2} = \dfrac{2 (10.51)}{(0.3)^2} = \dfrac{21.02}{(0.3)^2}$
• N = 234
2N = 468
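The arithmetic in this example can be verified directly. The sketch below assumes the same constants as the slide (Zα = 1.96, Zβ = 1.282); the function name is illustrative.

```python
from math import ceil


def n_per_group_means(delta, sigma, z_alpha=1.96, z_beta=1.282):
    """Per-group N for comparing two means: 2 (Za + Zb)^2 sigma^2 / delta^2."""
    return 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2


if __name__ == "__main__":
    n = n_per_group_means(delta=4.5, sigma=15)  # IQ example: delta/sigma = 0.3
    print(f"N = {n:.1f}, rounded up to {ceil(n)} per group, {2 * ceil(n)} total")
```

Only the standardized difference δ/σ matters here, so any pair of values with δ/σ = 0.3 gives the same answer.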
542-05-#32
542-05-#33
Comparing Time to Event Distributions
• Primary efficacy endpoint is the time to an event
• Compare the survival distributions for the two groups
• Measure of treatment effect is the ratio of the hazard rates in the two groups (for exponential survival, this equals the inverse ratio of the medians)
• Must also consider the length of follow-up
542-05-#34
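Assuming Exponential Survival Distributions
$H_0: \lambda_1 = \lambda_2 \iff \theta = \lambda_1 / \lambda_2 = 1$
$H_a: \lambda_1 \ne \lambda_2 \iff \theta = \lambda_1 / \lambda_2 \ne 1$
If $P(T > t) = e^{-\lambda t}$, where $\lambda = \lambda_1$ in group 1 and $\lambda = \lambda_2$ in group 2, let
$\theta = \lambda_1 / \lambda_2 = med_2 / med_1$, where $med_i = -\ln(.5) / \lambda_i$
• Then define the effect size by
• Standard difference
$\ln(\theta) / \sqrt{2}$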
542-05-#35
Time to Failure (1)
• Use a parametric model for sample size
• Common model - exponential
– $S(t) = e^{-\lambda t}$, λ = hazard rate
– H0: λI = λC
– Estimate N
$N = \dfrac{2 (Z_\alpha + Z_\beta)^2}{[\ln(\lambda_C / \lambda_I)]^2}$
George & Desu (1974)
• Assumes all patients followed to an event (no censoring)
• Assumes all patients immediately entered
542-05-#36
Assuming Exponential Survival Distributions
• Simple case
• The statistical test is powered by the total number of events observed at the time of the analysis, d.
$d = \dfrac{4 (Z_\alpha + Z_\beta)^2}{[\ln(\theta)]^2}$, where $\theta = \lambda_C / \lambda_I$
542-05-#37
Converting Number of Events (d) to Required Sample Size (2N)
• d = 2N x P(event) ⇒ 2N = d/P(event)
• P(event) is a function of the length of total follow-up at time of analysis and the average hazard rate
• Let AR = accrual rate (patients per year)
A = period of uniform accrual (2N = AR x A)
F = period of follow-up after accrual complete
A/2 + F = average total follow-up at planned analysis
$\bar\lambda$ = average hazard rate
• Then P(event) = 1 – P(no event) =
$1 - e^{-\bar\lambda (A/2 + F)}$
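The two steps (required events, then conversion to a total sample size) can be sketched as follows. The hazard rates, accrual period and follow-up period in the usage lines are hypothetical inputs chosen for illustration, and taking the average hazard as the simple mean of the two group hazards is an assumption of this sketch, not something the slides specify.

```python
from math import exp, log


def required_events(hazard_ratio, z_alpha=1.96, z_beta=1.282):
    """Total events d = 4 (Za + Zb)^2 / [ln(theta)]^2."""
    return 4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2


def prob_event(avg_hazard, accrual_years, followup_years):
    """P(event) = 1 - exp(-avg_hazard * (A/2 + F)) under uniform accrual."""
    return 1 - exp(-avg_hazard * (accrual_years / 2 + followup_years))


def total_sample_size(events, p_event):
    """2N = d / P(event)."""
    return events / p_event


if __name__ == "__main__":
    lam_c, lam_i = 0.3, 0.2               # hypothetical yearly hazard rates
    d = required_events(lam_c / lam_i)
    avg_hazard = (lam_c + lam_i) / 2      # assumption: simple average of the two hazards
    p = prob_event(avg_hazard, accrual_years=3, followup_years=2)
    print(f"d = {d:.1f} events, P(event) = {p:.3f}, 2N = {total_sample_size(d, p):.1f}")
```

Because P(event) is below 1 whenever follow-up is finite, the total sample size always exceeds the required number of events.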
542-05-#38
Time to Failure (2)
• In many clinical trials
1. Not all patients are followed to an event (i.e. censoring)
2. Patients are recruited over some period of time (i.e. staggered entry)
• More General Model (Lachin, 1981)
$N = \dfrac{(Z_\alpha + Z_\beta)^2 \{ g(\lambda_C) + g(\lambda_I) \}}{(\lambda_C - \lambda_I)^2}$
where g(λ) is defined as follows
542-05-#39
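1. Instant Recruitment, Study Censored at Time T
$g(\lambda) = \dfrac{\lambda^2}{1 - e^{-\lambda T}}$
2. Continuous Recruiting (0, T) & Censored at T
$g(\lambda) = \dfrac{\lambda^3 T}{\lambda T - 1 + e^{-\lambda T}}$
3. Recruitment (0, T0) & Study Censored at T (T > T0)
$g(\lambda) = \dfrac{\lambda^2}{1 - \left[ e^{-\lambda (T - T_0)} - e^{-\lambda T} \right] / (\lambda T_0)}$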
542-05-#40
Example
Assume α = .05 (2-sided) & 1 - β = .90
λC = .3 and λI = .2
T = 5 years follow-up
T0 = 3
0. No Censoring, Instant Recruiting
N = 128
1. Censoring at T, Instant Recruiting
N = 188
2. Censoring at T, Continual Recruitment
N = 310
3. Censoring at T, Recruitment to T0
N = 233
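The four per-group sample sizes in this example can be recomputed from the exponential-model formulas (George & Desu for the no-censoring case; Lachin's N = (Zα + Zβ)² {g(λC) + g(λI)} / (λC − λI)² for the censored cases). The sketch below assumes Zα = 1.96 and Zβ = 1.282; the function names are illustrative.

```python
from math import exp, log


def g_instant(lam, T):
    """Case 1: instant recruitment, study censored at time T."""
    return lam ** 2 / (1 - exp(-lam * T))


def g_continuous(lam, T):
    """Case 2: continuous recruitment over (0, T), censored at T."""
    return lam ** 3 * T / (lam * T - 1 + exp(-lam * T))


def g_recruit_then_follow(lam, T, T0):
    """Case 3: recruitment over (0, T0), study censored at T > T0."""
    return lam ** 2 / (1 - (exp(-lam * (T - T0)) - exp(-lam * T)) / (lam * T0))


def n_censored(lam_c, lam_i, g, z_alpha=1.96, z_beta=1.282):
    """Per-group N = (Za + Zb)^2 [g(lam_c) + g(lam_i)] / (lam_c - lam_i)^2."""
    return (z_alpha + z_beta) ** 2 * (g(lam_c) + g(lam_i)) / (lam_c - lam_i) ** 2


def n_no_censoring(lam_c, lam_i, z_alpha=1.96, z_beta=1.282):
    """Case 0: per-group N = 2 (Za + Zb)^2 / [ln(lam_c / lam_i)]^2."""
    return 2 * (z_alpha + z_beta) ** 2 / log(lam_c / lam_i) ** 2


if __name__ == "__main__":
    lam_c, lam_i, T, T0 = 0.3, 0.2, 5, 3
    print(f"0. {n_no_censoring(lam_c, lam_i):.1f}")
    print(f"1. {n_censored(lam_c, lam_i, lambda l: g_instant(l, T)):.1f}")
    print(f"2. {n_censored(lam_c, lam_i, lambda l: g_continuous(l, T)):.1f}")
    print(f"3. {n_censored(lam_c, lam_i, lambda l: g_recruit_then_follow(l, T, T0)):.1f}")
```

The computed values agree with the slide's 128, 188, 310 and 233 to within rounding; case 2 falls very close to a rounding boundary, so a slightly different Zβ can move it by one.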
542-05-#41
Sample Size Adjustment for Non-Compliance (1)
• References:
1. Schork & Remington (1967) Journal of Chronic Diseases
2. Halperin et al (1968) Journal of Chronic Diseases
3. Wu, Fisher & DeMets (1980) Controlled Clinical Trials
• Problem
Some patients may not adhere to treatment protocol
• Impact
Dilute whatever true treatment effect exists
542-05-#42
Sample Size Adjustment for Non-Compliance (2)
• Fundamental Principle
Analyze All Subjects Randomized
• Called Intent-to-Treat (ITT) Principle
– Noncompliance will dilute treatment effect
• A Solution
Adjust sample size to compensate for dilution effect (reduced power)
• Definitions of Noncompliance
– Dropout: Patient in treatment group stops taking therapy
– Dropin: Patient in control group starts taking experimental therapy
542-05-#43
Comparing Two Proportions
– Assumes event rates will be altered by non‑compliance
– Define
PT* = adjusted treatment group rate
PC* = adjusted control group rate
If PT < PC,
[Figure: number line from 0 to 1.0 showing PT, PC and the adjusted rates PT*, PC*]
542-05-#44
Adjusted Sample Size
Simple Model
– Compute unadjusted N
– Assume no dropins
– Assume dropout proportion R
– Thus PC* = PC
PT* = (1-R) PT + R PC
– Then adjust N
$N^* = \dfrac{N}{(1 - R)^2}$
– Example

R      1/(1-R)^2    % Increase
.1     1.23         23%
.25    1.78         78%
542-05-#45
Sample Size Adjustment for Non-Compliance
Dropouts & dropins (R0, RI)
$N^* = \dfrac{N}{(1 - R_0 - R_I)^2}$
– Example

R0     RI     1/(1-R0-RI)^2    % Increase
.1     .1     1.56             56%
.25    .25    4.0              300% (4 times)
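The inflation factors in the dropout and dropin examples follow directly from N* = N / (1 − R0 − RI)². A minimal sketch, with illustrative function names:

```python
def inflation_factor(dropout, dropin=0.0):
    """Multiplier 1 / (1 - R0 - RI)^2 applied to the unadjusted sample size."""
    retained = 1 - dropout - dropin
    if retained <= 0:
        raise ValueError("dropout + dropin must be less than 1")
    return 1 / retained ** 2


def adjusted_n(n, dropout, dropin=0.0):
    """Adjusted per-group sample size N* = N * inflation factor."""
    return n * inflation_factor(dropout, dropin)


if __name__ == "__main__":
    for r0, ri in ((0.10, 0.0), (0.25, 0.0), (0.10, 0.10), (0.25, 0.25)):
        factor = inflation_factor(r0, ri)
        print(f"R0 = {r0:.2f}, RI = {ri:.2f}: factor = {factor:.2f}, "
              f"increase = {100 * (factor - 1):.0f}%")
```

The factor grows quickly: 25% dropout combined with 25% dropin quadruples the required sample size.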
542-05-#46
Sample Size Adjustments
• More Complex Model
Ref: Wu, Fisher, DeMets (1980)
• Further Assumptions
– Length of follow-up divided into intervals
– Hazard rate may vary
– Dropout rate may vary
– Dropin rate may vary
– Lag in time for treatment to be fully effective
542-05-#47
Example: Beta-Blocker Heart Attack Trial (BHAT) (1)
• Used complex model
• Assumptions
1. α = .05 (Two sided), 1 - β = .90
2. 3 year follow-up
3. PC = .18 (Control Rate)
4. PT = .13, treatment assumed 28% reduction
5. Dropout 26% (12%, 8%, 6%)
6. Dropin 21% (7%, 7%, 7%)
542-05-#48
Example: Beta-Blocker Heart Attack Trial (BHAT) (2)

Unadjusted          Adjusted
PC = .18            PC* = .175
PT = .13            PT* = .14
28% reduction       20% reduction
N = 1100            N* = 2000
2N = 2200           2N* = 4000
542-05-#49
“Equivalency” or Non-Inferiority Trials
• Compare new therapy with standard
• Wish to show new "as good as"
• Rationale may be cost, toxicity, profit
• Examples
– Intermittent Positive Pressure Breathing Trial
Expensive IPPB vs. Cheaper Treatment
– Nocturnal Oxygen Therapy Trial (NOTT)
12 Hours Oxygen vs. 24 Hours
• Problem
Can't show H0: δ = 0
• A Solution
Specify minimum difference = δmin
542-05-#50
Sample Size Formula: Two Proportions
Simpler Case
$N = \dfrac{2 (Z_\alpha + Z_\beta)^2 \, \bar{p}(1 - \bar{p})}{\delta^2}$
• Zα = constant associated with α
• Zβ = constant associated with 1 - β
• Solve for Zβ (⇒ 1 - β) or δ
542-05-#51
[Figure: Difference in Events (Test Drug – Standard Drug)]
542-05-#52
Multiple Response Variables
• Many trials measure several outcomes
(e.g. MILIS, NOTT)
• Must force investigator to rank them for importance
• Do sample size on a few outcomes (2-3)
• If estimates agree, OK
If not, must seek compromise
542-05-#53
Mid Stream Adjustments
• Murphy's Law applies to sample size
• Early results may show that event rate assumptions were way off, leaving the study badly underpowered
• Problem
– Quit?
– Continue for almost certain doom?
– Adjust sample size?
– Extend follow-up?
• Early Decision
Best to decide early, without looking at treatment comparisons
542-05-#54
Adaptive Designs
• One class allows re-estimating the sample size once the trial is underway
– Chung et al
– Chen, Lan & DeMets
• Methods have been criticized for allowing bias (e.g. Mehta & Tsiatis)
• Thus, methods still not widely used
– AHEFT Trial one example
• Will be discussed later in data monitoring lecture
542-05-#55
Sample Size Summary
• Ethically, the size of the study must be large enough to achieve the stated goals with reasonable probability (power)
• Sample size estimates are only approximate due to uncertainty in assumptions
• Need to be conservative but realistic
542-05-#56
Demo of Sample Size Program
www.biostat.wisc.edu/
• Program covers comparison of proportions, means, & time to failure
• Can vary control group rates or responses, alpha & power, hypothesized differences
• Program develops sample size table and a power curve for a particular sample size