Biostatistics Case Studies 2005

Peter D. Christenson

Biostatistician

http://gcrc.humc.edu/Biostat

Session 1:

Study Design for Demonstrating Lack of Treatment Effect: Equivalence or Non-inferiority

Case Study

pASA+PPI = 1.5%

Demonstrate: pclop – pASA+PPI ≤ 4%

N=145/group Power=80% for what?

Typical Analysis: Inferiority or Superiority

H0: pclop – pASA+PPI = 0%

H1: pclop – pASA+PPI ≠ 0%

H1 → therapies differ

α = 0.05

Power = 80% for Δ=|pclop - pASA+PPI| =?

[Figure: 95% CIs for pclop – pASA+PPI shown relative to 0. Panels: Clop inferior; Clop superior; No diff detected*. One panel is marked "Not used in this paper".]

* and 80% chance that a Δ of (?) or more would be detected.





α = 0.05

Power = 80% for Δ=|pclop - pASA+PPI| =?


So, N=331/group → 80% chance that a Δ of 4% or more would be detected.

Detectable Δ = 5.5%-1.5%=4%
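
The N=331/group figure can be checked with a standard normal-approximation formula for comparing two proportions. The sketch below is an assumption about the method, since the slides do not say which formula or software was used; it applies the pooled variance under H0 to the α term. The function name `n_per_group_two_proportions` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group N for a two-sided test of p1 vs p2 (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    null_sd = sqrt(2 * p_bar * (1 - p_bar))           # SD under H0 (pooled)
    alt_sd = sqrt(p1 * (1 - p1) + p2 * (1 - p2))      # SD under H1
    n = (z_alpha * null_sd + z_beta * alt_sd) ** 2 / (p2 - p1) ** 2
    return ceil(n)


# pASA+PPI = 1.5% versus pclop = 5.5%, i.e. a detectable difference of 4%
n_needed = n_per_group_two_proportions(0.015, 0.055)
print(n_needed)
```

Under these assumptions the formula gives 331 per group, which agrees with the slide.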





α = 0.05

Power = 80% for Δ=|pclop - pASA+PPI| =4%


Note that this could be formulated as two one-sided tests (TOST):

H0: pclop – pASA+PPI ≤ 0%

H1: pclop – pASA+PPI > 0%

H1 → clop inferior

α = 0.025

Power = 80% for pclop - pASA+PPI = 4%

H0: pclop – pASA+PPI ≥ 0%

H1: pclop – pASA+PPI < 0%

H1 → clop superior

α = 0.025

Power = 80% for pclop - pASA+PPI = -4%

Demonstrating Equivalence

H0: |pclop – pASA+PPI| ≥ E%

H1: |pclop – pASA+PPI| < E%

H1 → therapies “equivalent”, within E


Note that this could be formulated as two one-sided tests (TOST):

H0: pclop – pASA+PPI ≤ -4%

H1: pclop – pASA+PPI > -4%

H1 → clop non-superior

α = 0.025

Power = 80% for pclop - pASA+PPI = 0%

H0: pclop – pASA+PPI ≥ 4%

H1: pclop – pASA+PPI < 4%

H1 → clop non-inferior

α = 0.025


Demonstrating Equivalence

H0: |pclop – pASA+PPI | ≥ 4%

H1: |pclop – pASA+PPI | < 4%

H1 → equivalence

α = 0.05

Power = 80% for pclop - pASA+PPI = 0

[Figure: 95% CIs for pclop – pASA+PPI shown relative to -4, 0, and 4. Panels: Clop non-superior; Clop non-inferior; Equivalence*.]

* both non-superior and non-inferior.

This Paper: Inferiority and Non-Inferiority




Apparently, two one-sided tests (TOST), but only one explicitly powered:

α = 0.025

Power = 80% for pclop - pASA+PPI = ?%




α = 0.025


The authors chose E=4% as the largest difference between therapies at which the therapies are still considered equivalent.

This Paper: Inferiority and Non-Inferiority

[Figure: 95% CIs for pclop – pASA+PPI shown relative to -4, 0, and 4. Panels: Clop inferior; Clop non-inferior; “Non-clinical” inferiority*.]

* clop is statistically inferior, but not enough for clinical significance.

Decisions:

Observed Results: pclop = 8.6%; pASA+PPI = 0.7%; 95% CI for pclop – pASA+PPI = 3.4% to 12.4%

[Figure: observed 95% CI plotted on the pclop – pASA+PPI axis, marked at -4, 0, 4, and 12]

Clop inferior
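
The decision can be read directly off the confidence interval. The sketch below encodes one reading of the rules on the preceding slides: clop is inferior if the whole CI lies above 0, non-inferior if the whole CI lies below the 4% margin, and "non-clinically" inferior if both hold. The function name `classify_ci` and the "inconclusive" label for the remaining case are assumptions, not taken from the paper.

```python
def classify_ci(lower, upper, margin=4.0):
    """Classify a 95% CI for pclop - pASA+PPI (in percentage points)."""
    inferior = lower > 0            # CI entirely above 0
    non_inferior = upper < margin   # CI entirely below the margin E
    if inferior and non_inferior:
        return '"non-clinical" inferiority'
    if inferior:
        return "clop inferior"
    if non_inferior:
        return "clop non-inferior"
    return "inconclusive"  # hypothetical label: CI covers both 0 and the margin


# Observed 95% CI from the paper: 3.4 to 12.4
decision = classify_ci(3.4, 12.4)
print(decision)
```

For the observed interval the lower limit (3.4) is above 0 and the upper limit (12.4) is above 4, so this rule returns "clop inferior".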

Power for Test of Clopidogrel Non-Inferiority




α = 0.025


Power = 80% for pclop - pASA+PPI = 0%
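
The N=145/group in the case study is consistent with a common normal-approximation formula for a one-sided non-inferiority test of two proportions, assuming both true rates are 1.5% and the margin is 4%. This is a sketch under those assumptions, not necessarily the authors' calculation; `n_per_group_noninferiority` is a hypothetical name.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_noninferiority(p_std, p_new, margin, alpha=0.025, power=0.80):
    """Per-group N to show p_new - p_std < margin (one-sided, normal approx.)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    variance = p_std * (1 - p_std) + p_new * (1 - p_new)
    distance = margin - (p_new - p_std)  # gap between true difference and margin
    return ceil((z_alpha + z_beta) ** 2 * variance / distance ** 2)


# Both therapies assumed to have a 1.5% event rate; margin E = 4%
n_noninf = n_per_group_noninferiority(0.015, 0.015, 0.04)
print(n_noninf)
```

With these inputs the formula gives 145 per group, which agrees with the N stated in the case study.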

Power for Test of Clopidogrel Inferiority




α = 0.025


Power = 80% for pclop - pASA+PPI = 7.3%

Detectable Δ = 8.8%-1.5%=7.3%
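
The detectable difference of 7.3% can be checked by computing the approximate power of the one-sided inferiority test at N=145/group. The sketch uses a normal approximation with pooled variance under H0, which is an assumption about the method; `power_two_proportions` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n, alpha=0.025):
    """Approximate power of a one-sided test of H0: p2 - p1 <= 0, n per group."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    p_bar = (p1 + p2) / 2
    null_se = sqrt(2 * p_bar * (1 - p_bar) / n)            # SE under H0 (pooled)
    alt_se = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n)     # SE under H1
    return NormalDist().cdf(((p2 - p1) - z_alpha * null_se) / alt_se)


# pASA+PPI = 1.5% versus pclop = 8.8% (difference 7.3%), N = 145 per group
power_large_diff = power_two_proportions(0.015, 0.088, 145)
# Same N, but the smaller 4% difference (pclop = 5.5%)
power_small_diff = power_two_proportions(0.015, 0.055, 145)
print(round(power_large_diff, 2), round(power_small_diff, 2))
```

Under this approximation the power is roughly 80% for a 7.3% difference and well below 80% for a 4% difference, in line with the slide's point that N=145 only detects a large difference.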

Conclusions: This Paper

• In this paper, clop was so inferior that the investigators were apparently lucky to have enough power to detect it. With this N, the CI would have been too wide to detect a smaller therapy difference.

• The investigators justify testing non-inferiority of clop only (and not of aspirin + Nexium) by the lesser desirability of combination therapy (?).

• I feel that this would have been a good approach to study size and power for a new therapy competing against a standard, had the N for detecting clop inferiority also been considered.

• Note that the power calculations were based on actual percentages of subjects, whereas cumulative 12-month incidence was used in the analysis. I know of no power calculations for equivalence tests based on survival analysis.

Conclusions: General

• “Negligibly inferior” would be better than non-inferior.

• All inference can be based on confidence intervals.

• Pre-specify the comparisons to be made, which can be defined as where confidence intervals lie.

• Ns are smaller for equivalence tests, but a study sized that way may be underpowered to detect differences unless it is specifically designed for that.

• A study can be powered for only one comparison or for multiple comparisons. Power can be different for different comparisons.

• For large N, reversing α and β=1-power for the typical test gives the same N as for the equivalence test.

Appendix: Possible Errors in Study Conclusions

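Truth is H0 (No Effect), study claims No Effect: Correct. Specificity: set α=0.05, so specificity=95%.

Truth is H0 (No Effect), study claims Effect: Error (Type I).

Truth is H1 (Effect), study claims No Effect: Error (Type II).

Truth is H1 (Effect), study claims Effect: Correct. Sensitivity = power: maximize; choose N for 80%.

Typical study to demonstrate superiority/inferiority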

Appendix: Graphical Representation of Power

[Figure: sampling distributions of the effect (Group B mean – Group A mean) under H0: true effect=0 and HA: true effect=3, for N=100 per group. Effect in study=1.13. Larger Ns give narrower curves.]

\\\ = Probability of concluding HA if H0 is true: 5%.

/// = Probability of concluding H0 if HA is true: 41%. Power=100-41=59%.

Note greater power if larger N, and/or if true effect>3, and/or less subject heterogeneity.

Typical study to demonstrate superiority/inferiority
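
How power responds to N, true effect, and subject heterogeneity can be illustrated with a normal-approximation power function for comparing two means. The slide does not give the standard deviation, so `ASSUMED_SD = 9.7` is a hypothetical value back-solved so that N=100 per group and a true effect of 3 give about 59% power; `power_two_means` is likewise a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist


def power_two_means(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in means."""
    z = NormalDist()
    se = sd * sqrt(2 / n_per_group)      # SE of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic falls in either rejection tail
    return z.cdf(effect / se - z_crit) + z.cdf(-effect / se - z_crit)


ASSUMED_SD = 9.7  # assumption: not stated on the slide

baseline = power_two_means(3, ASSUMED_SD, 100)        # slide scenario
larger_n = power_two_means(3, ASSUMED_SD, 200)        # larger N
larger_effect = power_two_means(4, ASSUMED_SD, 100)   # true effect > 3
less_spread = power_two_means(3, 7.0, 100)            # less heterogeneity
print(round(baseline, 2))
```

Each of the three changes (larger N, larger true effect, smaller SD) raises the power above the baseline of about 59%.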

Appendix: Online Study Size / Power Calculator

www.stat.uiowa.edu/~rlenth/Power

Does NOT include tests for equivalence or non-inferiority or non-superiority
