
Consistent Assessment of Biomarker and Subgroup Identification Methods

H.D. Hollins Showalter

5/20/2014 (MBSW)


Outline

1. Background

2. Data Generation

3. Performance Measurement

4. Example

5. Operationalization

6. Conclusion


Tailored Therapeutics

A medication for which treatment decisions are based on the molecular profile of the patient, the disease, and/or the patient’s response to treatment.

• A tailored therapeutic allows the sponsor to make a regulatory-approved claim of an expected treatment effect (efficacy or safety)

• “Tailored therapeutics can significantly increase value—first, for patients—who achieve better outcomes with less risk and, second, for payers—who more frequently get the results they expect.”*


*Opening Remarks at 2009 Investor Meeting, John C. Lechleiter, Ph.D.

Adapted from slides presented by William L. Macias, MD, PhD, Eli Lilly


Achieving Tailored Therapeutics

• Data source: clinical trials (mostly)
• Objective: identify biomarkers and subgroups
• Challenges: complexity, multiplicity
• Need: modern statistical methods


Prognostic vs. Predictive Markers

Prognostic Marker: Single trait or signature of traits that identifies different groups of patients with respect to the risk of an outcome of interest in the absence of treatment

Predictive Marker: Single trait or signature of traits that identifies different groups of patients with respect to the outcome of interest in response to a particular treatment


Statistical Interactions

[Figure: response plotted against marker status (− / +) for "No treatment" and "Treatment", annotated with the marker effect, the treatment effect, and the treatment-by-marker effect.]

$$Y = \beta_0 + \beta_1 M + \beta_2 T + \beta_3 M T + \varepsilon$$
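The interaction model on this slide can be illustrated with a small simulation. The sketch below is not from the presentation: the coefficient values, the binary coding of M (marker status) and T (treatment), the normal errors, and the function names are all assumptions chosen for illustration. It shows that the treatment-by-marker coefficient equals the difference between the treatment effect in M+ and the treatment effect in M−.

```python
import random
from statistics import mean

def simulate(n, b0, b1, b2, b3, sigma, seed):
    """Draw n subjects from Y = b0 + b1*M + b2*T + b3*M*T + eps (assumed normal eps)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        m = rng.randint(0, 1)  # marker status: 0 = M-, 1 = M+
        t = rng.randint(0, 1)  # treatment indicator: 0 = no treatment, 1 = treatment
        y = b0 + b1 * m + b2 * t + b3 * m * t + rng.gauss(0.0, sigma)
        rows.append((m, t, y))
    return rows

def cell_mean(rows, m, t):
    """Mean response among subjects with marker status m and treatment t."""
    return mean(y for mm, tt, y in rows if mm == m and tt == t)

def interaction_estimate(rows):
    """Treatment effect in M+ minus treatment effect in M- (estimates b3)."""
    effect_pos = cell_mean(rows, 1, 1) - cell_mean(rows, 1, 0)
    effect_neg = cell_mean(rows, 0, 1) - cell_mean(rows, 0, 0)
    return effect_pos - effect_neg

# Illustrative values only: a purely prognostic marker would have b3 = 0.
rows = simulate(n=20000, b0=0.0, b1=0.5, b2=-0.1, b3=-1.0, sigma=1.0, seed=1)
b3_hat = interaction_estimate(rows)
```

With `b3 = 0` the marker would shift the response equally in both arms (prognostic only); a nonzero `b3` is what makes it predictive.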


Types of Predictive Markers

[Figure: four panels, each plotting response against marker status (− / +) for "No treatment" and "Treatment".]


Predictive Marker Example

Entire population split on x1 (M+: x1 = 1; M−: x1 = 0)

| Group | Group size | Trt response | Pl response | Treatment effect |
|---|---|---|---|---|
| M+ (subgroup of interest) | 50% | -1.17 | -0.09 | -1.08 |
| M− | | -0.33 | -0.20 | -0.13 |

Entire population split on x1 and x2 (M+: x1 = 1 and x2 = 1)

| Group | Group size | Trt response | Pl response | Treatment effect |
|---|---|---|---|---|
| M+ (subgroup of interest) | 25% | -1.39 | -0.19 | -1.20 |
| M− | 75% | -0.23 | -0.13 | -0.1 |


BSID vs. “Traditional” Analysis

• Traditional subgroup analysis
  o Interaction testing, one at a time
  o Many statistical issues
  o Many gaps for tailoring

• Biomarker and subgroup identification (BSID)
  o Utilizes modern statistical methods
  o Addresses issues with subgroup analysis
  o Maximizes tailoring opportunities


Simulation to Assess BSID Methods

Objective

Consistent, rigorous, and comprehensive calibration and comparison of BSID methods

Value

• Further improve methodology
  o Identify the gaps (where existing methods perform poorly)
  o Synergy/combining ideas from multiple methods
• Optimize application for specific clinical trials


BSID Simulation: Three Components

1. Data generation
  o Key is consistency

2. BSID
  o “Open” and comprehensive application of analysis method(s)

3. Performance measurement
  o Key is consistency


BSID Simulation: Visual Representation

[Flow diagram with three stages.
Data Generation: Truth → Dataset 1, Dataset 2, …, Dataset n.
BSID: Dataset i → Results i.
Performance Measurement: Results i, together with the Truth → Performance Metrics i → Overall Performance Metrics.]
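The three-stage flow (data generation, BSID, performance measurement) can be sketched as a simple loop. Everything in this sketch is a placeholder and not taken from the presentation: `generate_dataset`, `toy_bsid`, the 0.5 threshold, and the single metric are hypothetical stand-ins meant only to show how the truth travels alongside each dataset and how per-dataset metrics are aggregated.

```python
import random
from statistics import mean

def generate_dataset(seed, n=200):
    """Hypothetical generator: returns (data, truth) with x1 as the predictive marker."""
    rng = random.Random(seed)
    data = [{"x1": rng.randint(0, 1), "trt": rng.randint(0, 1)} for _ in range(n)]
    for row in data:
        row["y"] = -1.0 * row["x1"] * row["trt"] + rng.gauss(0.0, 1.0)
    truth = {"predictive_markers": {"x1"}}
    return data, truth

def toy_bsid(data):
    """Placeholder method: proposes x1 if the treatment effect differs enough by x1."""
    def effect(level):
        trt = [r["y"] for r in data if r["x1"] == level and r["trt"] == 1]
        ctl = [r["y"] for r in data if r["x1"] == level and r["trt"] == 0]
        return mean(trt) - mean(ctl)
    return {"x1"} if abs(effect(1) - effect(0)) > 0.5 else set()

def marker_sensitivity(proposal, truth):
    """Share of the true predictive markers that the proposal recovered."""
    true_markers = truth["predictive_markers"]
    return len(proposal & true_markers) / len(true_markers)

def run_simulation(n_datasets, base_seed=2014):
    per_dataset = []
    for i in range(n_datasets):
        data, truth = generate_dataset(seed=base_seed + i)        # stage 1: data generation
        proposal = toy_bsid(data)                                 # stage 2: BSID
        per_dataset.append(marker_sensitivity(proposal, truth))   # stage 3: performance
    return mean(per_dataset)                                      # overall performance metric

overall = run_simulation(50)
```

Because the generator and the scoring function are shared, swapping `toy_bsid` for another method changes only stage 2, which is what makes the comparison consistent.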


Data Generation

• Creating virtual trial data
  o Make assumptions in order to emulate real trial data
  o Knowledge of disease and therapies, including historical data
  o Specific to BSID: must embed markers and subgroups
• In order to measure the performance of BSID methodology the “truth” is needed
  o This is challenging/impossible to discern using real trial data


Data Generation Survey

| Attribute | SIDES (2011) [1] | SIDES (2014) [2] | VT [3] | GUIDE [4] | QUINT [5] | IT [6] |
|---|---|---|---|---|---|---|
| n | 900 | 300, 900 | 400 - 2000 | 100 | 200 - 1000 | 300, 450 |
| p | 5 - 20 | 20 - 100 | 15 - 30 | 100 | 5 - 20 | 4 |
| response type | continuous | continuous | binary | binary | continuous | TTE |
| predictor type | binary | binary | continuous | categorical | continuous | ordinal, categorical |
| predictor correlation | 0, 0.3 | 0, 0.2 | 0, 0.7 | 0 | 0, 0.2 | 0 |
| treatment assignment | 1:1 | 1:1 | ? | ~1:1 | ~1:1 | ? |
| # predictive markers | 0 - 3 | 2 | 0, 2 | 0, 2 | 1 - 3 | 0, 2 |
| predictive effect(s) | higher order | higher order | higher order | N/A, simple, higher order | simple, higher order | simple |
| predictive M+ group size (% of n) | 15% - 20% | 50% | N/A, ~25%, ~50% | N/A, ~36% | ~16% - ~50% | N/A, ~25%, ? |
| # prognostic markers | 0 | 0 | 3 | 0 - 4 | 1 - 3 | 0, 2 |
| prognostic effect(s) | N/A | N/A | simple, higher order | N/A, simple, higher order | simple, higher order | simple |
| model | “contribution model” | “contribution model” | logit model (w/o and with subject-specific effects) | linear model (on probability scale) | “tree model” | exponential model |


Data Generation: Recommendations

• Clearly identify attributes and models
  o Transparency
  o Traceability of analysis

• Make sure to capture the “truth” in a way that facilitates performance measurement

• Derive efficiency and synergistic value (more on this later!)


Data Generation: Specifics

• Identify key attributes
  o Sample size
  o Number of predictors
  o Response type
  o Predictor type/correlation
  o Subgroup size
  o Sizes of effects: placebo response, overall treatment effect, predictive effect(s), prognostic effect(s)
  o Others: missing data, treatment assignment

• Specify model


Data Generation: Reqs

• Format data consistently
• Make code flexible enough to accommodate any/all attributes and models
• Ensure that individual datasets can be reproduced (i.e., various seeds for random number generation)

The resulting dataset(s) should always have the same look and feel
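One way to meet the reproducibility and consistent-format requirements is to derive each dataset's seed from a master seed and its index, and to emit every dataset with the same column layout. The sketch below is an assumption-laden illustration, not the presenter's code: the `COLUMNS` layout, the seed formula, and the function names are hypothetical.

```python
import random

COLUMNS = ("subject_id", "trt", "x1", "y")  # hypothetical fixed layout shared by all datasets

def dataset_seed(master_seed, dataset_index):
    """Derive a distinct, reproducible seed for each dataset (illustrative formula)."""
    return master_seed * 1_000_003 + dataset_index

def generate(master_seed, dataset_index, n=5):
    """Return n records with identical keys, fully determined by the two arguments."""
    rng = random.Random(dataset_seed(master_seed, dataset_index))
    return [
        dict(zip(COLUMNS, (i, rng.randint(0, 1), rng.randint(0, 1), round(rng.gauss(0, 1), 3))))
        for i in range(n)
    ]

first = generate(master_seed=42, dataset_index=21)
again = generate(master_seed=42, dataset_index=21)   # same arguments, same dataset
other = generate(master_seed=42, dataset_index=22)   # different index, different dataset
```

Recording only the master seed and the dataset index is then enough to regenerate any single dataset, for example one on which a method behaved unexpectedly.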


Performance Measurement

• Quantifying the ability of BSID methodology to recapture the “truth” underlying the (generated) data

• If done consistently, allows calibration and comparison of BSID methods


Performance Measurement: Survey

Methods surveyed: SIDES (2011) [1], SIDES (2014) [2], VT [3], GUIDE [4], QUINT [5], IT [6]

Measures reported:
• Selection rate
• Complete match rate
• Partial match rate
• Confirmation rate
• Treatment effect fraction
• Pr(complete match)
• Pr(partial match)
• Pr(selecting a subset)
• Treatment effect fraction (updated def.)
• Pr(selecting a superset)
• Finding correct X’s
• Closeness of […] to the true […]
• Closeness of the size of […] to the size of the true […]
• Properties of […] as an estimator of […]
• Power
• Pr(selection at 1st or 2nd level splits of trees)
• Accuracy
• Pr(nontrivial tree)
• (RP1a) Pr(type I errors)
• (RP1b) Pr(type II errors)
• (RP2) Rec. of tree complexity
• (RP3) Rec. of splitting vars and split points
• (RP4) Rec. of assignments of observations to partition classes
• Frequencies of the final tree sizes
• Bias assessment via likelihood ratio and logrank tests
• Frequency of (predictor) “hits”

Performance Measurement: Recommendations

• Marker Level (testing)
• Subgroup Level (estimation)
• Subject Level (prediction)

Perf. Measurement: Survey Revisited

[Slide graphic: the surveyed measures regrouped under three headings, Marker Level (testing), Subgroup Level (estimation), and Subj. Level (prediction).]


Contingency Table: Marker Level

| Identified as Predictive | Predictive Biomarker: True | Predictive Biomarker: False |
|---|---|---|
| Yes | True Positive | False Positive |
| No | False Negative | True Negative |

• Sensitivity = True Positive / True Predictive Biomarkers
• Specificity = True Negative / False Predictive Biomarkers
• PPV = True Positive / Identified as Predictive
• NPV = True Negative / Not Identified as Predictive
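The four marker-level measures follow directly from the contingency table. As a minimal sketch (the function name, the set-based inputs, and the example markers `x1` and `x7` are illustrative assumptions, not part of the presentation), they can be computed from the set of candidate predictors, the set of truly predictive markers, and the set a method identified:

```python
def marker_level_metrics(candidates, true_predictive, identified):
    """Sensitivity, specificity, PPV and NPV from the marker-level contingency table."""
    candidates, truth, found = set(candidates), set(true_predictive), set(identified)
    tp = len(found & truth)                 # identified and truly predictive
    fp = len(found - truth)                 # identified but not predictive
    fn = len(truth - found)                 # predictive but missed
    tn = len(candidates - truth - found)    # neither predictive nor identified

    def ratio(num, den):
        # An empty denominator (e.g. nothing identified) leaves the measure undefined.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

candidates = [f"x{i}" for i in range(1, 21)]   # 20 candidate predictors (illustrative)
metrics = marker_level_metrics(candidates, true_predictive={"x1"}, identified={"x1", "x7"})
```

The same function would also work at the subject level if the sets hold subject identifiers (true M+ members versus subjects classified as M+).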


Performance Measures: Marker Level

# and % of predictors: true vs. identified
• Sensitivity
• Specificity
• PPV
• NPV


Performance Measures: Subgroup Level

• Size of identified subgroup
• Treatment effect in the identified subgroup
  o Average the true “individual” treatment effects under potential outcomes framework
• Accuracy of estimated treatment effect
  o Difference (both absolute and direction) between estimate and true effect
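Because simulated data carry both potential outcomes for every subject, the true treatment effect in any proposed subgroup can be computed exactly. The sketch below assumes each subject record stores noise-free expected responses under placebo and under treatment; the field names (`y0_true`, `y1_true`), the four-subject dataset, and the estimate of -0.70 are hypothetical.

```python
from statistics import mean

def true_subgroup_effect(subjects, in_subgroup):
    """Average of the individual true effects (y1_true - y0_true) over the identified subgroup."""
    effects = [s["y1_true"] - s["y0_true"] for s in subjects if in_subgroup(s)]
    return mean(effects) if effects else float("nan")

def estimation_error(estimate, truth):
    """Signed difference (direction); its absolute value gives the magnitude."""
    return estimate - truth

# Illustrative subjects: expected response under placebo (y0_true) and treatment (y1_true).
subjects = [
    {"x1": 1, "y0_true": -0.1, "y1_true": -0.65},
    {"x1": 1, "y0_true": -0.1, "y1_true": -0.65},
    {"x1": 0, "y0_true": -0.1, "y1_true": -0.20},
    {"x1": 0, "y0_true": -0.1, "y1_true": -0.20},
]
truth = true_subgroup_effect(subjects, lambda s: s["x1"] == 1)
error = estimation_error(-0.70, truth)   # -0.70 stands in for a method's reported estimate
```

A negative `error` here means the method reported a larger benefit than the subgroup truly has, which is the direction of bias one would expect after a data-driven subgroup search.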


Perf. Measures: Subgroup Level, cont.

• Implications on sample size/time/cost of future trials
  o Given true treatment effect, what is the number of subjects needed in the trial for 90% power?
  o What is the cost of the trial? (mainly driven by # enrolled)
  o How much time will the trial take? (mainly driven by # screened)
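The sample-size question can be made concrete with the usual normal-approximation formula for comparing two means. This is a sketch under stated assumptions that the slides do not specify: 1:1 allocation, a two-sided 0.05 test, a known common standard deviation of 1, and screening volume equal to enrollment divided by the subgroup's prevalence. The effect sizes are illustrative.

```python
import math
from statistics import NormalDist

def n_per_arm(effect, sd, alpha=0.05, power=0.90):
    """Subjects per arm for a two-sided z-test comparing two means (1:1 allocation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)

def trial_size(effect, sd, subgroup_fraction, **kwargs):
    """Enrolled subjects drive cost; screened subjects drive time."""
    enrolled = 2 * n_per_arm(effect, sd, **kwargs)
    screened = math.ceil(enrolled / subgroup_fraction)
    return {"enrolled": enrolled, "screened": screened}

# Illustrative comparison: all-comers trial versus a trial restricted to a 50% subgroup.
whole = trial_size(effect=-0.325, sd=1.0, subgroup_fraction=1.0)
tailored = trial_size(effect=-0.55, sd=1.0, subgroup_fraction=0.5)
```

In this illustration the tailored trial enrolls fewer subjects because the effect is larger in the subgroup, but it must screen twice as many subjects as it enrolls, which is the cost/time trade-off the slide points to.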


Contingency Table: Subject Level

| Membership Classification | Potential to Realize Enhanced Treatment Effect*: True | Potential to Realize Enhanced Treatment Effect*: False |
|---|---|---|
| M+ | True Positive | False Positive |
| M− | False Negative | True Negative |

• Sensitivity = True Positive / True Enhanced Treatment Effect
• Specificity = True Negative / False Enhanced Treatment Effect
• PPV = True Positive / Classified as M+
• NPV = True Negative / Classified as M−

*at a meaningful or desired level


Performance Measures: Subject Level

Compare subgroup membership on the individual level: true vs. identified
• Sensitivity
• Specificity
• PPV
• NPV


Conditional Performance Measures

• Same metrics, computed with Null submissions (runs in which the method identifies no marker or subgroup) removed

Markers/subgroups can be very difficult to find. When a method DOES find something, how accurate is it?

Hard(er) to compare multiple methods when all performance measures are washed out by Null submissions


Cond. Subgroup Level Measures Example

Truth (but x1 very hard to find)
• M+ (x1 = 1): treatment effect 10, group size 50%
• M− (x1 = 0): treatment effect 0, group size 50%
• x2 = 1 and x2 = 0: group size 50% each

1000 simulations

| | BSID Method A | BSID Method B |
|---|---|---|
| Submissions | 900/1000: Null; 100/1000: x1 = 1 | 900/1000: Null; 50/1000: x1 = 1; 50/1000: x2 = 1 |
| Unconditional | Size: 0.95, Effect: 5.5 | Size: 0.95, Effect: 5.25 |
| Conditional | Size: 0.5, Effect: 10 | Size: 0.5, Effect: 7.5 |
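The unconditional and conditional figures in this example can be reproduced with a few lines of arithmetic. The sketch assumes that a Null submission is scored as proposing the entire population (size 1, true effect 5) and that x2 is unrelated to the treatment effect, so the x2 = 1 subgroup has true effect 5; both assumptions are consistent with the numbers on the slide but are not stated there. The names are illustrative.

```python
from statistics import mean

# Each proposal is summarized as (subgroup size, true treatment effect in that subgroup).
NULL = (1.0, 5.0)      # assumption: Null is scored as the whole population
X1_POS = (0.5, 10.0)   # the true M+ subgroup
X2_POS = (0.5, 5.0)    # x2 = 1 mixes x1 = 1 and x1 = 0 subjects equally

def summarize(submissions):
    """submissions: list of (proposal, number of simulations) pairs."""
    runs = [proposal for proposal, count in submissions for _ in range(count)]
    non_null = [proposal for proposal in runs if proposal is not NULL]

    def averages(rows):
        return (mean(r[0] for r in rows), mean(r[1] for r in rows))

    return {"unconditional": averages(runs), "conditional": averages(non_null)}

method_a = summarize([(NULL, 900), (X1_POS, 100)])
method_b = summarize([(NULL, 900), (X1_POS, 50), (X2_POS, 50)])
```

Unconditionally the two methods look nearly identical (effect 5.5 versus 5.25) because 90% of runs are Null; removing the Null runs exposes the difference (10 versus 7.5).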


Performance Measurement: Reqs

For each application of BSID, the user proposes:
• List of predictive biomarkers
• The one subgroup for designing the next study
• Estimated treatment effect in this subgroup

In conjunction with the “truth” underlying the generated data, all of the recommended performance measures can be calculated using these elements


Considering the “Three Levels”

What are the most important and relevant measures of a result? Depends on the objective…

• Marker Level: Invest further in the marker(s)
• Subgroup Level: Tailor the next study/design
• Subject Level: Impact in clinical practice


Data Generation Example

| Attribute | Value |
|---|---|
| simulations (datasets) | 200 |
| n | 240 |
| p | 20 |
| response type | continuous ([…] errors) |
| predictor type | ordinal (“genetic”) |
| predictor correlation | 0 |
| treatment assignment | 1:3 (pl:trt) |
| placebo response | -0.1 (in weakest responding subgroup) |
| treatment effect | -0.1 (in weakest responding subgroup) |
| # predictive markers | 1 |
| predictive effect size(s) (type) | -0.45 (dominant) |
| predictive M+ group size | ~50% of n |
| # prognostic markers | 0 |
| prognostic effect size(s) | N/A |
| model | linear model |
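A dataset with the attributes listed for this example could be generated along the following lines. This is a hedged sketch, not the presenter's generator: it assumes standard normal errors, codes each "genetic" predictor as 0, 1 or 2 copies of an allele drawn independently, places the predictive effect on the first predictor, and picks the allele frequency so that about half of subjects carry at least one copy (the dominant M+ group). All names are hypothetical.

```python
import math
import random

def generate_trial(seed, n=240, p=20, placebo=-0.1, trt_effect=-0.1,
                   predictive_effect=-0.45, sd=1.0):
    """One simulated trial; each record also stores the truth needed for scoring."""
    rng = random.Random(seed)
    allele_freq = 1 - math.sqrt(0.5)            # gives P(at least one copy) = 0.5
    arms = [0] * (n // 4) + [1] * (n - n // 4)  # 1:3 placebo:treatment
    rng.shuffle(arms)
    rows = []
    for trt in arms:
        # Ordinal predictors: number of allele copies (0, 1 or 2), uncorrelated.
        x = [sum(rng.random() < allele_freq for _ in range(2)) for _ in range(p)]
        m_plus = x[0] >= 1                      # dominant effect of the first marker
        y_placebo = placebo
        y_treated = placebo + trt_effect + predictive_effect * m_plus
        y = (y_treated if trt else y_placebo) + rng.gauss(0.0, sd)
        rows.append({"x": x, "trt": trt, "m_plus": m_plus, "y": y,
                     "true_effect": y_treated - y_placebo})
    return rows

trial = generate_trial(seed=1)
```

Storing `m_plus` and `true_effect` with every record is what later allows the subject-level and subgroup-level measures to be computed without re-deriving the truth.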


Data Generation Example, cont.


Data Generation Example, concl.

[Figure: two plots of response (y-axis from -0.5 to 0) against marker status (− / +) of x_1_1_1, with separate lines for trt 0 and trt 1.]

• Dataset 1: Trt 0: -0.141; Trt 1: -0.407; Effect: -0.266
• Dataset 21: Trt 0: -0.018; Trt 1: -0.427; Effect: -0.409


BSID Methods Applied to Example

Alpha controlled at 0.1

| Approach | Traditional | Virtual Twin [3] | TSDT [7] |
|---|---|---|---|
| Handling treatment-by-subgroup interaction | Model | Transformation | Sequential |
| Searching for candidate subgroups | Exhaustive | Recursive Partitioning | Recursive Partitioning |
| Addressing multiplicity | Simple (Sidak Correction) | Permutation | Sub-sampling + Permutation |
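For the "Traditional" column, the Sidak correction converts the overall alpha into a per-test significance level. The sketch below shows the standard formula; the choice of 20 tests (one interaction test per candidate predictor in the example) is an assumption for illustration, since the slides do not state how many tests were run.

```python
def sidak_alpha(family_alpha, n_tests):
    """Per-test level that keeps the family-wise error rate at family_alpha
    when the n_tests tests are independent."""
    return 1 - (1 - family_alpha) ** (1 / n_tests)

def sidak_adjusted_p(p_value, n_tests):
    """Equivalent adjustment applied to a single raw p-value."""
    return 1 - (1 - p_value) ** n_tests

# Overall alpha of 0.1 spread over an assumed 20 interaction tests.
per_test = sidak_alpha(0.10, 20)
```

A raw interaction p-value would then need to fall below `per_test` (roughly 0.005 here) to be declared significant, which helps explain why one-at-a-time testing so often returns nothing.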


Performance Measurement Example

Truth + Proposal = Performance Measures


Perf. Measurement Example, cont.


Perf. Measurement Example, concl.

| Measure | Traditional: Uncond. | Traditional: Cond. | Virtual Twin [3]: Uncond. | Virtual Twin [3]: Cond. | TSDT [7]: Uncond. | TSDT [7]: Cond. |
|---|---|---|---|---|---|---|
| Marker Level | | | | | | |
| Sensitivity | 0.025 | 0.227 | 0.135 | 0.614 | 0.39 | 0.929 |
| Specificity | 0.995 | 0.957 | 0.996 | 0.980 | 0.998 | 0.996 |
| PPV | 0.227 | 0.227 | 0.614 | 0.614 | 0.929 | 0.929 |
| NPV | 0.951 | 0.959 | 0.957 | 0.980 | 0.969 | 0.996 |
| Subgroup Level | | | | | | |
| Non-Identification (Null) | 89% | | 78% | | 58% | |
| Subgroup Size | 93.6% | 41.4% | 88.8% | 48.9% | 79.2% | 50.4% |
| Trt Effect in Subgroup | -0.335 | -0.388 | -0.359 | -0.466 | -0.416 | -0.535 |
| Subject Level | | | | | | |
| Sensitivity | 0.947 | 0.518 | 0.956 | 0.798 | 0.986 | 0.966 |
| Specificity | 0.076 | 0.689 | 0.180 | 0.820 | 0.406 | 0.966 |
| PPV | 0.523 | 0.639 | 0.576 | 0.814 | 0.702 | 0.967 |
| NPV | 0.592 | 0.592 | 0.805 | 0.805 | 0.965 | 0.965 |


Strategy

• Develop framework (done/ongoing)
• Present/get input (current)
  o Internal and external forums
  o Workshop
• Establish an open environment (future)
  o R package on CRAN
  o Web portal repository


Predictive Biomarker Project: Vision

• Access Web Portal
  o Read open description (objective, models, formats, etc.)
• Access web interface for Data Generation
  o Generate data under specified scenarios, or utilize “standard”/pre-existing scenarios
• Apply BSID methodology to datasets
  o Express results in the specified format
• Access web interface for Performance Measurement
  o Compare performance
• Encouraged to contribute to Repository
  o Open sharing of results, descriptions, programs


Pros and Cons

Pros
• More convenient and useful simulation studies to aid research
• Direct comparisons of performance by methods
• Optimization of methods for relevant and important scenarios for drug development
• New insights and collaborations
• Data sets could be applied for other statistical problems

Cons
• Need to develop infrastructure to support simulated data
• Access and upkeep
• Need experts to explicitly define the scope


Conclusion

• Simulation studies are a common approach to assessing BSID methods, but there is a lack of consistency in data generation and performance measurement

• The presented framework enables consistent, rigorous, comprehensive calibration and comparison of BSID methods

• Collaborating on this effort will result in efficiency and synergistic value


Acknowledgements

• Richard Zink
• Lei Shen
• Chakib Battioui
• Steve Ruberg
• Ying Ding
• Michael Bell


References

1. Lipkovich I, Dmitrienko A, Denne J, Enas, G. Subgroup identification based on differential effect search — a recursive partitioning method for establishing response to treatment in patient subpopulations. Statistics in Medicine 2011; 30:2601–2621. doi:10.1002/sim.4289.

2. Lipkovich I, Dmitrienko A. Strategies for Identifying Predictive Biomarkers and Subgroups with Enhanced Treatment Effect in Clinical Trials Using SIDES. Journal of Biopharmaceutical Statistics 2014; 24:130-153. doi:10.1080/10543406.2013.856024.

3. Foster JC, Taylor JMG, Ruberg SJ. Subgroup identification from randomized clinical trial data. Statistics in Medicine 2011; 30:2867–2880. doi:10.1002/sim.4322.

4. Loh WY, He X, Man M. A regression tree approach to identifying subgroups with differential treatment effects. Presented at Midwest Biopharmaceutical Statistics Workshop 2014.

5. Dusseldorp E, Van Mechelen I. Qualitative interaction trees: a tool to identify qualitative treatment-subgroup interactions. Statistics in Medicine 2014; 33:219–237. doi:10.1002/sim.5933.

6. Su X, Zhou T, Yan X, Fan J, Yang S. Interaction trees with censored survival data. International Journal of Biostatistics 2008; 4(1):Article 2. doi:10.2202/1557-4679.1071.

7. Battioui C, Shen L, Ruberg S. A Resampling-based Ensemble Tree Method to Identify Patient Subgroups with Enhanced Treatment Effect. Proceedings of the 2013 Joint Statistical Meetings.

8. Zink R, Shen L, Wolfinger R, Showalter H. Assessment of Methods to Identify Patient Subgroups with Enhanced Treatment Response in Randomized Clinical Trials. Presented at the 2013 ICSA Applied Statistical Symposium.

9. Shen L, Ding Y, Battioui C. A Framework of Statistical Methods for Identification of Subgroups with Differential Treatment Effects in Randomized Trials. Presented at the 2013 ICSA Applied Statistical Symposium.


Backup Slides


Data Generation: SIDES (2011) [1]

| Attribute | Value |
|---|---|
| simulations (datasets) | 5000 |
| n | 900 (then divided into 3 equal parts: 1 training, 2 test) |
| p | 5, 10, 20 |
| response type | continuous ([…] errors) |
| predictor type | binary (dichotomized from continuous) |
| predictor correlation | 0, 0.3 |
| treatment assignment | 1:1 (pl:trt) |
| placebo response | 0 |
| treatment effect | 0 |
| # predictive markers | 0, 1, 2, 3* |
| predictive effect size(s) | not explicitly stated |
| predictive M+ group size | 15% - 20% of n (but not explicitly stated) |
| # prognostic markers | 0 |
| prognostic effect size(s) | N/A |
| model | “contribution model” |


Data Generation: SIDES (2014) [2]

| Attribute | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 |
|---|---|---|---|---|
| simulations (datasets) | 10000 | 10000 | 10000 | 10000 |
| n | 300 | 300 | 900 | 900 |
| p | 20, 60, 100 | 20, 60, 100 | 20, 60, 100 | 20, 60, 100 |
| response type | continuous ([…] errors) | continuous ([…] errors) | continuous ([…] errors) | continuous ([…] errors) |
| predictor type | binary (dichotomized from continuous) | binary (dichotomized from continuous) | binary (dichotomized from continuous) | binary (dichotomized from continuous) |
| predictor correlation | 0 | 0.2* | 0 | 0.2* |
| treatment assignment | 1:1 (pl:trt) | 1:1 (pl:trt) | 1:1 (pl:trt) | 1:1 (pl:trt) |
| placebo response | 0 | 0 | 0 | 0 |
| treatment effect | 0 | 0 | 0 | 0 |
| # predictive markers | 2** | 2** | 2** | 2** |
| predictive effect size(s) | 0.35 | 0.35 | 0.6 | 0.6 |
| predictive M+ group size | 0.5 * n = 150 | 0.5 * n = 150 | 0.5 * n = 450 | 0.5 * n = 450 |
| # prognostic markers | 0 | 0 | 0 | 0 |
| prognostic effect size(s) | N/A | N/A | N/A | N/A |
| model | “contribution model” | “contribution model” | “contribution model” | “contribution model” |


Data Generation: Virtual Twins [3]

| Attribute | Null | Base | Modifications* |
|---|---|---|---|
| simulations (datasets) | 100 | 100 | |
| n | 1000 | 1000 | 400 and 2000 |
| p | 15 | 15 | 30 |
| response type | binary | binary | |
| predictor type | continuous ([…] errors) | continuous ([…] errors) | |
| predictor correlation | 0 | 0 | 0.7** |
| treatment assignment | ? | ? | |
| placebo response | -1 | -1 | |
| treatment effect | 0.1 | 0.1 | |
| # predictive markers | 0 | 2 | |
| predictive effect size(s) | 0 | 0.9 for X1*X2 | 1.5 for X1*X2 |
| predictive M+ group size | N/A | ~0.25 * n = ~250 | ~0.5 * n = ~500 |
| # prognostic markers | 3 | 3 | |
| prognostic effect size(s) | 0.5, 0.5, -0.5 for X1, X2, X7; 0.5 for X2*X7 | 0.5, 0.5, -0.5 for X1, X2, X7; 0.5 for X2*X7 | |
| model | logit model | logit model | logit model with subject-specific effects ai and (ai, bi) |


Data Generation: GUIDE [4]

| Attribute | M1 | M2 | M3 |
|---|---|---|---|
| simulations (datasets) | 1000 | 1000 | 1000 |
| n | 100 | 100 | 100 |
| p | 100 | 100 | 100 |
| response type | binary | binary | binary |
| predictor type | categorical (3 levels)* | categorical (3 levels)* | categorical (3 levels)* |
| predictor correlation | 0 | 0 | 0 |
| treatment assignment | ~1:1 (pl:trt) | ~1:1 (pl:trt) | ~1:1 (pl:trt) |
| placebo response | 0.4 | 0.3 | 0.2 |
| treatment effect | 0 | 0 | 0.2 |
| # predictive markers | 2 | 2 | 0 |
| predictive effect size(s) | 0.2, 0.15 for X1, X2; 0.05 for X1*X2 | 0.4 for X1*X2 | N/A |
| predictive M+ group size | ~0.36 * n = ~36 in strongest M+ group (but not explicitly stated) | ~0.36 * n = ~36 (but not explicitly stated) | N/A |
| # prognostic markers | 0 | 4 | 2 |
| prognostic effect size(s) | N/A | 0.2 for X3, X4; -0.2 for X1*X2 | 0.2 for X1, X2 |
| model | linear model (on probability scale) | linear model (on probability scale) | linear model (on probability scale) |


Data Generation: QUINT [5]

| Attribute | Model A | Model B*** | Model C*** | Model D*** | Model E |
|---|---|---|---|---|---|
| simulations (datasets) | 100 | 100 | 100 | 100 | 100 |
| n | 200, 300, 400, 500, 1000 | 200, 300, 400, 500, 1000 | 200, 300, 400, 500, 1000 | 200, 300, 400, 500, 1000 | 200, 300, 400, 500, 1000 |
| p | 5, 10, 20 | 5, 10, 20 | 5, 10, 20 | 5, 10, 20 | 5, 10, 20 |
| response type | continuous* | continuous* | continuous* | continuous* | continuous* |
| predictor type | continuous (multivariate normal)** | continuous (multivariate normal)** | continuous (multivariate normal)** | continuous (multivariate normal)** | continuous (multivariate normal)** |
| predictor correlation | 0, 0.2 | 0, 0.2 | 0, 0.2 | 0, 0.2 | 0, 0.2 |
| treatment assignment | ~1:1 (trt 1:trt 2) | ~1:1 (trt 1:trt 2) | ~1:1 (trt 1:trt 2) | ~1:1 (trt 1:trt 2) | ~1:1 (trt 1:trt 2) |
| treatment 1 response | 20*** | 20*** | 20*** | 18.33*** | 30*** |
| treatment 2 effect | -2.5, -5, -10*** | -2.5, -5, -10*** | -2.5, -5, -10*** | -2.5, -5, -10*** | 0*** |
| # predictive markers | 1 | 2 | 3 | 3 | 1 |
| predictive effect size(s) | 5, 10, 20*** | 5, 10, 20*** | 5, 10, 20*** | 5, 10, 20*** | 2.5, 5, 10*** |
| predictive M+ group size | ~0.16 * n (but not explicitly stated)*** | ~0.16 * n (but not explicitly stated)*** | ~0.38 * n (but not explicitly stated)*** | ~0.16 * n (but not explicitly stated)*** | ~0.5 * n (but not explicitly stated)*** |
| # prognostic markers | 1*** | 2*** | 3*** | 3*** | 1*** |
| prognostic effect size(s) | 20*** | 20*** | 20*** | 21.67*** | 10*** |
| model | “tree model” | “tree model” | “tree model” | “tree model” | “tree model” |


Data Generation: Interaction Trees6

5/20/2014 (MBSW)

Attribute | Model A | Model B | Model C | Model D
simulations (datasets) | 100 | 100 | 100 | 100
n | 450 test sample method (300 for learning sample, 150 for validation sample), 300 bootstrap method | 450 test sample method (300 for learning sample, 150 for validation sample), 300 bootstrap method | 450 test sample method (300 for learning sample, 150 for validation sample), 300 bootstrap method | 450 test sample method (300 for learning sample, 150 for validation sample), 300 bootstrap method
p | 4 | 4 | 4 | 4
response type | TTE (censoring rates = 0%, 50%) | TTE (censoring rates = 0%, 50%) | TTE (censoring rates = 0%, 50%) | TTE (censoring rates = 0%, 50%)
predictor type | ordinal for X1 and X3, categorical for X2 and X4 | ordinal for X1 and X3, categorical for X2 and X4 | ordinal for X1 and X3, categorical for X2 and X4 | ordinal for X1 and X3, categorical for X2 and X4
predictor correlation | 0 | 0 | 0 | 0
treatment assignment | ? | ? | ? | ?
placebo response | 0.135 | 0.135 | 0.135 | 0.135
treatment effect | 2* | 2* | 2* | 2*
# predictive markers | 0 | 2 | 2 | 2
predictive effect size(s) | N/A | 0.223 for X1*; 4.482 for X2* | 0.741 to 0.050 for X1* **; 1.350 to 20.086 for X2* ** | 0.5 for X1*; 2 for X2*
predictive M+ group size | N/A | ~0.25 * n in strongest M+ group (but not explicitly stated) | not explicitly stated** | ~0.25 * n in strongest M+ group (but not explicitly stated)
# prognostic markers | 2 | 0 | 0 | 0
prognostic effect size(s) | 0.223 for X1*; 4.482 for X2* | N/A | N/A | N/A
model | exponential model | exponential model | exponential model | exponential model
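As an illustration of a time-to-event response from an exponential model with a target censoring rate, here is a minimal Python sketch. The censoring mechanism is an assumption, not taken from the source: it relies on the standard fact that for independent exponential event and censoring times with hazards h and h_c, the probability of censoring is h_c / (h + h_c). The function name `simulate_tte` and the hazard values in the usage note are hypothetical.

```python
import random

def simulate_tte(hazards, censoring_rate=0.5, seed=0):
    """Return (observed_time, event_observed) pairs, one per subject.

    hazards: per-subject exponential hazard rates (hypothetical inputs).
    censoring_rate: target probability of censoring, 0 <= rate < 1.
    """
    rng = random.Random(seed)
    sample = []
    for hazard in hazards:
        event_time = rng.expovariate(hazard)
        if censoring_rate > 0:
            # Choose the censoring hazard so that P(censor < event) = rate.
            censor_hazard = hazard * censoring_rate / (1.0 - censoring_rate)
            censor_time = rng.expovariate(censor_hazard)
        else:
            censor_time = float("inf")    # no censoring
        sample.append((min(event_time, censor_time), event_time <= censor_time))
    return sample
```

For example, `simulate_tte([0.1] * 150 + [0.2] * 150, censoring_rate=0.5)` would give a sample of 300 subjects, about half of them censored; how hazards depend on treatment and markers is left to the caller.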


Perf. Measurement: SIDES (2011)1

• Selection rate, that is, the proportion of simulation runs in which at least one subgroup was identified.
  o Complete match rate: Proportion of simulation runs in which the ideal subgroup was selected as the top subgroup (computed over the runs in which at least one subgroup was selected).
  o Partial match rate: Proportion of simulation runs in which the top subgroup was a subset of the ideal subgroup (computed over the runs in which at least one subgroup was selected).

• Confirmation rate, that is, the proportion of simulation runs that yielded a confirmed subgroup (which is not necessarily identical to the ideal subgroup). In each run, the top subgroup was identified in terms of the treatment effect p-value in the training data set (if at least one subgroup was selected). The subgroup was classified as ‘confirmed’ if the treatment effect in this subgroup was significant at a two-sided 0.05 level in both test data sets.

• Treatment effect fraction defined as the fraction of the treatment effect (per patient) in the ideal group, which was retained in the top selected or confirmed subgroup. The fraction was defined as follows:
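Setting aside the treatment effect fraction, whose formula is not shown here, the rate-type measures on this slide are simple proportions over simulation runs. A minimal Python sketch follows, assuming each run reports its top subgroup as a set of patient identifiers (or `None` when nothing was selected); the function name `sides_rates` and this set representation are hypothetical.

```python
def sides_rates(top_subgroups, ideal):
    """Compute selection, complete match, and partial match rates.

    top_subgroups: one entry per simulation run, either None (no subgroup
        selected) or a set of patient ids forming the top subgroup.
    ideal: set of patient ids forming the ideal subgroup.
    """
    selected = [s for s in top_subgroups if s is not None]
    selection_rate = len(selected) / len(top_subgroups)
    if not selected:
        return selection_rate, None, None
    # Match rates are computed only over runs with at least one subgroup.
    complete = sum(s == ideal for s in selected) / len(selected)
    # "Subset" is read literally here, so complete matches also count
    # as partial matches (an assumption about the intended convention).
    partial = sum(s <= ideal for s in selected) / len(selected)
    return selection_rate, complete, partial
```

Representing subgroups as patient sets keeps the subset check trivial; comparing the biomarker rules that define the subgroups would be an alternative design.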


Perf. Measurement: SIDES (2014)2

• Probability of a complete match

• Probability of a partial match
  o Probability of selecting a subset
  o Probability of selecting a superset

• Treatment effect fraction (updated definition, not weighted by group sizes):


Perf. Measurement: Virtual Twins3

• Finding correct X’s

• Closeness of the estimated subgroup to the true subgroup. This is measured using sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and area under the ROC curve (AUC).

• Closeness of the size of the estimated subgroup to the size of the true subgroup

• Power. Another quantity of interest is the percentage of times methods find a null estimated subgroup when [...] and when [...].

• Properties of [...] as an estimator of [...]
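The closeness measures listed above can be computed by treating subgroup membership as a binary classification of patients. The sketch below is a minimal Python illustration, assuming the true and estimated subgroups are given as per-patient membership indicators; the function name `membership_measures` is hypothetical, and AUC is omitted because it requires a continuous score rather than a membership flag.

```python
def membership_measures(true_member, est_member):
    """Sensitivity, specificity, PPV and NPV of estimated subgroup membership.

    true_member, est_member: equal-length sequences of 0/1 (or bool) flags,
    one per patient, for the true and the estimated subgroup.
    """
    pairs = list(zip(true_member, est_member))
    tp = sum(1 for t, e in pairs if t and e)
    fn = sum(1 for t, e in pairs if t and not e)
    fp = sum(1 for t, e in pairs if not t and e)
    tn = sum(1 for t, e in pairs if not t and not e)

    def ratio(num, den):
        # Undefined when the denominator is empty (e.g. a null subgroup).
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

Returning `None` for empty denominators is one possible convention; a null estimated subgroup, for instance, leaves PPV undefined.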


Performance Measurement: GUIDE4

• Probabilities that the predictive markers are selected at first- and second-level splits of the trees

• Accuracy. Let n(t, y, z) denote the number of training samples in node t with Y = y and Z = z, and define $n(t,+,z) = \sum_y n(t,y,z)$ and $n_t = \sum_z n(t,+,z)$. Let $S_t$ be the subgroup defined by t. The value of $R(S_t)$ is estimated by $\hat{R}(S_t) = |n(t,1,1)/n(t,+,1) - n(t,1,0)/n(t,+,0)|$. The estimate $\hat{S}$ of $S^*$ is the subgroup $S_t$ such that $\hat{R}(S_t)$ is maximum among all terminal nodes. If $\hat{S}$ is not unique, it is taken as their union. The “accuracy” of $\hat{S}$ is defined to be $P(\hat{S})/P(S^*)$ if $\hat{S} \subset S^*$ and 0 otherwise.

• Pr(nontrivial tree)
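The accuracy measure can be sketched in Python as follows. The node-level estimate uses the formula on the slide, |n(t, 1, 1)/n(t,+, 1) − n(t, 1, 0)/n(t,+, 0)|. Everything else is an assumption made for illustration: the function names, the representation of terminal nodes as dictionaries of observations, the requirement that every terminal node contains both treatment arms, and the approximation of the accuracy ratio by the number of observations in the estimated subgroup divided by the number in the true subgroup.

```python
def effect_estimate(records):
    """|n(t,1,1)/n(t,+,1) - n(t,1,0)/n(t,+,0)| for one terminal node.

    records: iterable of (y, z) pairs with y, z in {0, 1}. Assumes the
    node contains observations from both treatment arms.
    """
    n = {(y, z): 0 for y in (0, 1) for z in (0, 1)}
    for y, z in records:
        n[(y, z)] += 1
    n_plus_1 = n[(0, 1)] + n[(1, 1)]   # n(t,+,1)
    n_plus_0 = n[(0, 0)] + n[(1, 0)]   # n(t,+,0)
    return abs(n[(1, 1)] / n_plus_1 - n[(1, 0)] / n_plus_0)

def subgroup_accuracy(terminal_nodes, true_subgroup):
    """Accuracy of the estimated subgroup against the true subgroup.

    terminal_nodes: dict node_id -> dict obs_id -> (y, z).
    true_subgroup: set of obs_ids in the true subgroup.
    """
    estimates = {t: effect_estimate(obs.values())
                 for t, obs in terminal_nodes.items()}
    best = max(estimates.values())
    estimated = set()
    for t, value in estimates.items():
        if value == best:               # ties: take the union of the nodes
            estimated |= set(terminal_nodes[t])
    if estimated <= true_subgroup:
        return len(estimated) / len(true_subgroup)
    return 0.0
```

Under this reading, an estimated subgroup that spills outside the true subgroup scores 0 regardless of how much of it overlaps.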


Performance Measurement: QUINT5

• (RP1a) Probability of type I errors

• (RP1b) Probability of type II errors

• (RP2) Recovery of tree complexity. Given an underlying true tree with a qualitative treatment–subgroup interaction that has been correctly detected, the probability of successfully identifying the complexity of the true tree.

• (RP3) Recovery of splitting variables and split points. Given an underlying true tree with a qualitative treatment–subgroup interaction that has been correctly detected, the probability of recovering the true tree in terms of the true splitting variables and the true split points.

• (RP4) Recovery of the assignments of the observations to the partition classes
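For RP1a and RP1b, one plausible way to estimate the error probabilities from simulation output is sketched below in Python. It assumes each simulated dataset is summarized by two flags: whether a qualitative interaction truly exists, and whether the method reported one. The function name `error_rates` and this summary format are hypothetical, not taken from the source.

```python
def error_rates(runs):
    """Estimate type I and type II error rates from simulation runs.

    runs: iterable of (truth, detected) boolean pairs, where truth says
    whether a qualitative interaction exists in the generating model and
    detected says whether the method reported one.
    """
    runs = list(runs)
    null_runs = [detected for truth, detected in runs if not truth]
    alt_runs = [detected for truth, detected in runs if truth]
    # Type I: an interaction is reported although none exists.
    type1 = sum(null_runs) / len(null_runs) if null_runs else None
    # Type II: an existing interaction goes undetected.
    type2 = (sum(not d for d in alt_runs) / len(alt_runs)) if alt_runs else None
    return type1, type2
```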


Perf. Measurement: Interaction Trees6

• Frequencies of the final tree sizes

• Frequency of (predictor) “hits”

• Bias assessment: the following were calculated for the pooled training and test samples and the validation samples
  o the likelihood ratio test (LRT) for overall interaction
  o the logrank test for treatment effect within the terminal node that showed maximal treatment efficacy

(for presentation convenience, the logworth of the p-value, which is defined as -log10 (p-value), was used).
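Two of these summaries are straightforward to compute. The Python sketch below shows the logworth transformation, -log10(p-value), and a tabulation of final tree sizes across simulation runs; the function names and the representation of tree sizes as a list of integers are assumptions made for illustration.

```python
import math
from collections import Counter

def logworth(p_value):
    """Logworth of a p-value: -log10(p-value)."""
    return -math.log10(p_value)

def tree_size_frequencies(final_tree_sizes):
    """Count how often each final tree size occurs across simulation runs."""
    return dict(sorted(Counter(final_tree_sizes).items()))
```

On this scale a p-value of 0.05 corresponds to a logworth of about 1.3, and larger values indicate stronger evidence.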


Predictive Biomarker Project

Data Generation

• Web interface
• Standard datasets

BSID

• Open methods
• Standard output

Performance Measurement

• Web interface
• Standard summary
