power assessment for hierarchical combination endpoints ... · 6 compare power performances to...

28
Power Assessment for Hierarchical Combination Endpoints Using Joint Modelling of RTTE and TTE Models versus Finkelstein-Schoenfeld Method Camille Vong, Steve Riley, Lutz O. Harnisch PAGE 27, May 30 th 2018

Upload: others

Post on 13-Jan-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Power Assessment for Hierarchical Combination Endpoints Using Joint Modelling of RTTE and TTE Models versus

Finkelstein-Schoenfeld Method

Camille Vong,

Steve Riley,

Lutz O. Harnisch

PAGE 27, May 30th 2018

Page 2: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Transthyretin Amyloidosis (ATTR)

2

■ Transthyretin (TTR) is a circulating plasma protein that normally exists as a stable homotetramer. In diseased patients an unstable tetramer structure leads to formation of amyloid fibrils and subsequent tissue deposition in organs/tissues.

■ Two distinct clinical presentations of the amyloidosis: transthyretin familial amyloid polyneuropathy (ATTR-FAP) when the peripheral nerves are primarily affected and transthyretin amyloid cardiomyopathy (ATTR-CM) when the heart is primarily affected

■ ATTR-CM is a late onset disease and is rarely diagnosed. Death in most patients with cardiomyopathy is from cardiac causes, including sudden death, heart failure, and myocardial infarction.

Page 3: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

It’s a RARE Disease

3

■ Cardio-vascular trial sample size ~10 000 - 20 000 patients

Page 4: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

It’s a RARE Disease

4

■ Cardio-vascular trial sample size ~10 000 - 20 000 patients

“Approximately 800-1000

diagnosed patients with ATTR-CM

worldwide.”1

n = ~ 400 available

1 Ando Y et al. Guideline of transthyretin-related hereditary amyloidosis from clinicians. Orphanet Journal of Rare Diseases. 2013;8:31

Page 5: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

It’s a RARE CARDIO-VASCULAR Disease

5

Survival is the golden standard CV-related endpoint:

Low power to detect drug effect with the available sample size, too long to show benefit alone

Hence, use of an ancillary longitudinal endpoint:

Frequency of cardiovascular-related hospitalization visits

red=

death

/bla

ck=

Censo

red

/blu

e=

HO

■ Cardio-vascular trial sample size ~10 000 - 20 000 patients

Page 6: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Objectives

6

■ Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for a rare disease

● Apply the non-parametric Finkelstein-Schoenfeld (FS) test

● Enhance trial analytical metric with a model-based approach

• Exposure - Time-to-Event (TTE) for survival data

• Exposure - TTE with hospitalization frequency as time-varying covariate (TTE-COV)

• Exposure - Repeated Time-to-Event (RTTE) for hospitalization frequency

• Joint Exposure Repeated Time-to-Event and Time-to-Event (Joint RTTE+TTE)

Page 7: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Methodology Framework – Assumptions

7

Trial design: 30 months

2:1:2, n=400

Placebo:low:high

Hazard

Distributions: 1) Exponential for

mortality and HO

h(t) =

2) Exponential for

mortality and Weibull for

HO

h(t) =

And

h(t) = .(.t)-1 3) 40% no event

Correlation

between hazards: 1) R2=0

2) R2=1

Treatment

Effect Size:

-15% mortality

-0.5 CV-related HO

for high dose

A) placebo and

low dose similar

B) low and high

dose similar C) Emax

relationship

Convergence Type I error

PK, mortality,

hospitalization

(HO) data5

FS TTE TTE-COV RTTE RTTE+

TTE

Power versus sample size6

5 Nyberg J. Simulating large time-to-event trials in NONMEM. https://www.page-meeting.org/default.asp?abstract=3166 6 Ueckert S. Accelerating Monte-Carlo Power Studies through Parametric Power Estimation. J Pharmacokinet Pharmacodyn. 2016 Apr;43(2):223-34

Page 8: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

0 -1 -1 -1

+1 0 +1

+1 0 +1

-1 -1 0 -1

1+ +1 0

Finkelstein-Schoenfeld (FS)

8

■ Prior regulatory history in cardiac medical device trials 3,4

■ Non-parametric hierarchical & pairwise test derived from patient-to-patient comparison

1) Black and grey died

3 Leon MB et al. N Engl J Med 2010; 363:1597-1607, 4 https://www.accessdata.fda.gov/cdrh_docs/pdf10/P100041b.pdf

Page 9: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Finkelstein-Schoenfeld (FS)

9

■ Prior regulatory history in cardiac medical device trials

■ Non-parametric hierarchical & pairwise test derived from patient-to-patient comparison

0 -1 -1 -1 -1

+1 0 +1

+1 0 +1

+1 -1 -1 0 -1

1+ +1 0

1) Black and grey died but…

2) Black died before grey

Page 10: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Finkelstein-Schoenfeld (FS)

10

■ Prior regulatory history in cardiac medical device trials

■ Non-parametric hierarchical & pairwise test derived from patient-to-patient comparison

0 -1 -1 -1 -1

+1 0 +1

+1 0 +1

+1 -1 -1 0 -1

1+ +1 0

Page 11: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Finkelstein-Schoenfeld (FS)

11

■ Prior regulatory history in cardiac medical device trials

■ Non-parametric hierarchical & pairwise test derived from patient-to-patient comparison

0 -1 -1 -1 -1

+1 0 +1

+1 0 +1

+1 -1 -1 0 -1

1+ +1 0

H H H

H H

3

2

Page 12: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Finkelstein-Schoenfeld (FS)

12

■ Prior regulatory history in cardiac medical device trials

■ Non-parametric hierarchical & pairwise test derived from patient-to-patient comparison

H H H

H H

1

2

0 -1 -1 -1 -1

+1 0 +1

+1 0 +1

+1 -1 -1 0 -1

1+ +1 0

Page 13: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Finkelstein-Schoenfeld (FS)

13

■ Prior regulatory history in cardiac medical device trials

■ Non-parametric hierarchical & pairwise test derived from patient-to-patient comparison

H H H

H H

1

2

0 -1 -1 -1 -1

+1 0 +1 +1

+1 -1 0 +1

+1 -1 -1 0 -1

1+ +1 0

Page 14: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Finkelstein-Schoenfeld (FS)

14

■ Prior regulatory history in cardiac medical device trials

■ Non-parametric hierarchical & pairwise test derived from patient-to-patient comparison

H H H

H H

1

2

0 -1 -1 -1 -1

+1 0 +1 +1 +1

+1 -1 0 +1 -1

+1 -1 -1 0 -1

1+ -1 +1 +1 0

Page 15: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Finkelstein-Schoenfeld (FS)

15

■ Prior regulatory history in cardiac medical device trials

■ Non-parametric hierarchical & pairwise test derived from patient-to-patient comparison

H H H

H H

1

2

Ui

0 -1 -1 -1 -1 -4

+1 0 +1 +1 +1 +4

+1 -1 0 +1 -1 0

+1 -1 -1 0 -1 -2

1+ -1 +1 +1 0 +2

In each stratrum

Page 16: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Finkelstein-Schoenfeld (FS)

16

■ Prior regulatory history in cardiac medical device trials

■ Non-parametric hierarchical & pairwise test derived from patient-to-patient comparison

Ui

0 -1 -1 -1 -1 -4

+1 0 +1 +1 +1 +4

+1 -1 0 +1 -1 0

+1 -1 -1 0 -1 -2

1+ -1 +1 +1 0 +2

Ui

Ui

Generalized

Gehan

Wilcoxon test

p-value

Active

Placebo

Page 17: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Results: FS U-score distributions

17

Kaplan-Meier equivalent

No TTE event

Page 18: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Results: FS U-score distributions

18

Improvement

in FS

No HO event

Kaplan-Meier equivalent

HO

ignored

Page 19: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Drawbacks with Finkelstein-Schoenfeld

19

■ FS maintains the hierarchy (Mortality HO), but

● Ignores the assessment of the HO endpoint in patients who die in the trial

■ FS ignores the longitudinal aspect of the events

● Drop-out if it’s a competitive risk to death or dose interruption not accounted for

■ FS cannot test a dose-response if more than 1 active group

● Differentiation of doses requires multiple subgroup comparisons

■ FS is based on fixed set of strata (ie. categorical covariates)

● Integration of continuous covariates only if categorized

● Smaller N in each stratum to perform the test

Page 20: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Joint RTTE + TTE

20

■ Shared random effects (log-normal)

■ Link function as an estimated scaling factor (on baseline and/or on shape)

■ $MIX to have 40% of the population without an event

■ DRUG effect = Emax reduction on the baseline hazard of RTTE/TTE (and/or shape of Weibull)

■ One-inflated negative binomial for hospitalization duration

Page 21: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Joint RTTE + TTE

21

■ Shared random effects (log-normal)

■ Link function as an estimated scaling factor (on baseline and/or on shape)

■ $MIX to have 40% of the population without an event

■ DRUG effect = Emax reduction on the baseline hazard of RTTE/TTE (and/or shape of Weibull)

■ One-inflated negative binomial for hospitalization duration

7 PAGE 21 (2012) Abstr 2564 [www.page-meeting.org/?abstract=2564]

7

Page 22: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Results : Scenario A - similar placebo/low dose

Method FS TTE TTE-COV RTTE Joint

RTTE+TTE

Power *(%) 10 27 (77%) 23 (69%) 75 (90%) 79 (93%)

Method FS TTE TTE-COV RTTE Joint

RTTE+TTE

Power *(%) 37 29 (77%) 42 (76%) 42 (60%) 65 (95%)

22 * Convergence in brackets

Page 23: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Results : Scenario B - similar low/high dose

23

Method FS TTE TTE-COV RTTE Joint

RTTE+TTE

Power *(%) 17 20 (67%) 19 (62%) 71 (83%) 70 (83%)

Method FS TTE TTE-COV RTTE Joint

RTTE+TTE

Power *(%) 40 18 (50%) 28 (53%) 33 (77%) 49 (72%)

* Convergence in brackets

Page 24: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Results : Scenario C – Emax relationship

24

Method FS TTE TTE-COV RTTE Joint

RTTE+TTE

Power *(%) 13 20 (77%) 23 (63%) 61 (86%) 62 (94%)

Method FS TTE TTE-COV RTTE Joint

RTTE+TTE

Power *(%) 44 30 (85%) 54 (79%) 56 (92%) 75 (96%)

* Convergence in brackets

Page 25: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Results : All scenarios type I error rates

25

Correlation Endpoint Method Type I* (%)

R2 = 1

Mortality

HO data

FS NA

Mortality

data only

TTE 4

(43%)

Mortality

HO data

TTE-COV 7

(45%)

HO data only RTTE 7

(58%)

Mortality

HO data

Joint

RTTE+TTE

2

(57%)

R2 = 0

Mortality

HO data

FS NA

Mortality

data only

TTE 2

(30%)

Mortality

HO data

TTE-COV 3

(32%)

HO data only RTTE 4

(37%)

Mortality

HO data

Joint

RTTE+TTE

2

(17%)

* Convergence in brackets

Page 26: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Summary

26

■ Implementation of a model-based approach to link the probability of survival and the probability of hospitalization events.

■ In general, the joint RTTE+TTE and the RTTE methods provided the highest power to detect a drug effect.

● While correlated, the gain of power from the joint RTTE+TTE model is very moderate.

● While uncorrelated, the joint RTTE+TTE model added extra power by acknowledging the additional information from the TTE data.

● FS results were superior to TTE alone in general, but vary across the scenarios.

● Type I error rates were controlled in general and convergence rates with an Emax model show adequate robustness of the models in power assessment.

■ Challenges in introducing drug effects and characterizing the underlying relationship if multiple confounders exist. In case of informative dropout, a dropout model can be implemented but may be competitive to mortality.

■ Hierarchical metrics in power assessment could mimic FS decision rules.

■ Smaller sample sizes to detect a treatment effect in future trials could be achieved using this methodology.

Page 27: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

Acknowledgment

■ Jeffrey H Schwartz ■ Balarama Gundapaneni ■ Daniel Meyer

■ Steve Gibbs ■ Ken Salatka ■ Crima Shah

■ Vijayakumar Sundararajan

■ Tim Nicholas

■ Yea Min Huh ■ Sridhar Duvvuri ■ Jae Eun Ahn ■ Chay Lim

And:

Rare disease

patients in the

study

Page 28: Power Assessment for Hierarchical Combination Endpoints ... · 6 Compare power performances to detect a (small) drug effect for the purpose of informing a dose recommendation for

THANK YOU !

Rare

disease

patient