experimental design and statistical considerations in translational cancer research (in 15 minutes)...

Experimental Design and Statistical Considerations in Translational Cancer Research (in 15 minutes)

Elizabeth Garrett-Mayer, PhD
Associate Professor of Biostatistics and Epidemiology

Two Parts

Phase I studies
Taking markers into the clinic

Phase I Trial Design

Historically, a DOSE FINDING study

Classic Phase I objective: “What is the highest dose we can safely administer to patients?”

Translation: Kill the cancer, not the patient

Assumes a monotonic relationship between
  dose and toxicity
  dose and efficacy

Classic Phase I Assumption: Efficacy and toxicity both increase with dose

[Figure: probability of outcome (0.0 to 1.0) versus dose level (1 to 7), with one curve for response and one for DLT. DLT = dose-limiting toxicity.]

Classic Phase I approach: Algorithmic Designs

“3+3” or “3 by 3”

Prespecify a set of doses to consider, usually between 3 and 10 doses.

The MTD (maximum tolerated dose) is taken to be the highest dose at which 0 or 1 of six patients experiences DLT.

Confidence in the MTD is usually poor.

Treat 3 patients at dose level K.
1. If 0 patients experience DLT, escalate to dose level K+1.
2. If 2 or more patients experience DLT, de-escalate to dose level K-1.
3. If 1 patient experiences DLT, treat 3 more patients at dose level K.
   A. If 1 of 6 experiences DLT, escalate to dose level K+1.
   B. If 2 or more of 6 experience DLT, de-escalate to dose level K-1.
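The escalation rules of the 3+3 design can be sketched as a small simulation. This is a minimal illustration, not a validated trial tool. The function names, the true DLT probabilities, and the stopping conventions (six patients are required at the declared MTD, and -1 means even the lowest dose was too toxic) are assumptions added here, since the slide does not spell out how the trial ends.

```python
import random
from collections import Counter

def simulate_3plus3(true_dlt_probs, rng):
    """Run one simulated 3+3 trial.

    Returns the index of the declared MTD, or -1 if the lowest dose is too toxic.
    Stopping conventions are assumptions (see the lead-in text).
    """
    n_doses = len(true_dlt_probs)
    treated = [0] * n_doses
    dlts = [0] * n_doses
    too_toxic = n_doses  # lowest dose index shown to be too toxic so far
    k = 0
    while True:
        # Treat a cohort of 3 patients at dose level k.
        dlts[k] += sum(rng.random() < true_dlt_probs[k] for _ in range(3))
        treated[k] += 3
        if treated[k] == 3 and dlts[k] == 1:
            continue  # rule 3: expand to six patients at the same dose
        tolerated = dlts[k] == 0 if treated[k] == 3 else dlts[k] <= 1
        if tolerated:
            if k + 1 < too_toxic:
                k += 1  # rules 1 and A: escalate
            elif treated[k] == 6:
                return k  # nothing higher left to try; k is the MTD
            # otherwise loop again to bring this dose up to six patients
        else:
            too_toxic = k  # rules 2 and B: de-escalate
            if k == 0:
                return -1
            k -= 1
            if treated[k] == 6:
                return k

def mtd_distribution(true_dlt_probs, n_trials=2000, seed=1):
    """Tabulate which dose is declared the MTD across many simulated trials."""
    rng = random.Random(seed)
    return Counter(simulate_3plus3(true_dlt_probs, rng) for _ in range(n_trials))

if __name__ == "__main__":
    # Hypothetical true DLT probabilities for five dose levels.
    print(mtd_distribution([0.05, 0.10, 0.20, 0.35, 0.50]))
```

Running `mtd_distribution` with a plausible set of true DLT probabilities typically spreads the declared MTD over several dose levels, which is one way to see why confidence in the MTD from this design is usually poor.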

“Novel” Phase I approaches

Continual reassessment method (CRM) (O’Quigley et al., Biometrics 1990)
  Many changes and updates in 20 years
  Tends to be most preferred by statisticians

Other Bayesian designs (e.g. EWOC) and model-based designs (Cheng et al., JCO, 2004, v 22)

Other improvements in algorithmic designs
  Accelerated titration design (Simon et al. 1997, JNCI)
  Up-down design (Storer, 1989, Biometrics)

CRM: Bayesian Adaptive Design

Dose for next patient is determined based on toxicity responses of patients previously treated in the trial

After each cohort of patients, posterior distribution is updated to give model prediction of optimal dose for a given level of toxicity (DLT rate)

Find dose that is most consistent with desired DLT rate

Modifications have been both Bayesian and non-Bayesian.
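To make the updating step concrete, here is a minimal sketch of one common form of the CRM. It assumes a one-parameter power model, p_i(a) = skeleton_i ** exp(a), a normal prior on a, and a grid approximation to the posterior. The skeleton values, prior standard deviation, target DLT rate, and function names are illustrative assumptions, not taken from the slides.

```python
import math

def crm_posterior_dlt(skeleton, doses_given, dlt_observed, prior_sd=1.16):
    """Posterior mean DLT probability at each dose level.

    Assumed model: p_i(a) = skeleton[i] ** exp(a), with a ~ Normal(0, prior_sd^2).
    The posterior over a is approximated on a fixed grid.
    """
    grid = [-4.0 + 8.0 * i / 400 for i in range(401)]
    weights = []
    for a in grid:
        log_post = -0.5 * (a / prior_sd) ** 2  # log prior, up to a constant
        for dose, had_dlt in zip(doses_given, dlt_observed):
            p = skeleton[dose] ** math.exp(a)
            log_post += math.log(p) if had_dlt else math.log(1.0 - p)
        weights.append(math.exp(log_post))
    total = sum(weights)
    return [
        sum(w * s ** math.exp(a) for w, a in zip(weights, grid)) / total
        for s in skeleton
    ]

def crm_next_dose(skeleton, doses_given, dlt_observed, target=0.25):
    """Index of the dose whose posterior mean DLT rate is closest to the target."""
    post = crm_posterior_dlt(skeleton, doses_given, dlt_observed)
    return min(range(len(skeleton)), key=lambda i: abs(post[i] - target))

if __name__ == "__main__":
    # Hypothetical prior guesses ("skeleton") of the DLT rate at five doses.
    skeleton = [0.05, 0.12, 0.25, 0.40, 0.55]
    # Three patients treated at dose index 2, one of whom had a DLT.
    print(crm_next_dose(skeleton, [2, 2, 2], [1, 0, 0]))
```

Practical implementations usually add safety rules, such as not skipping untried dose levels when escalating; those are omitted here to keep the sketch short.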

New paradigm: Targeted Therapy

How do targeted therapies change the early phase drug development paradigm?

Not all targeted therapies have toxicity
  Toxicity may not occur at all
  Toxicity may not increase with dose

Targeted therapies may not reach the target of interest

Implications for study design:
  Previous assumptions may not hold
  Does efficacy increase with dose?
  Endpoint (DLT) may no longer be appropriate
  Should we be looking for the MTD?
  What good is phase I if the agent does not hit the target?

Possible Dose-Toxicity & Dose-Efficacy Relationships for Targeted Agent

[Figure: probability of outcome (0.0 to 1.0) versus dose (0 to 12), with separate curves for efficacy and toxicity.]

What is a Correlative Study?

A study that correlates a “marker” with disease

What is a marker? An innate characteristic of a tumor or tissue

Examples

Marker               Disease
PSA                  Prostate cancer
Estrogen receptor    Breast cancer
SUV from PET         Many cancers
KIT mutation         GIST

What is it good for?

Prognostic marker: Predicts outcome (independent of therapy)

Predictive marker: Predicts response to therapy

Can be used for
  Treatment assignment
  Treatment stratification in clinical trials
  Surrogate endpoint (?)
  Targeted therapy development
  Diagnosis

Mitotic Rate: Prognostic Marker

DeMatteo et al, Cancer, 112:608-615

Figure 3. Recurrence-free survival in 127 patients with completely resected localized gastrointestinal stromal tumor (GIST) based on mitotic rate

HER-2: Predictive Marker

[Figure: disease-free survival.]

Gennari A et al. J Natl Cancer Inst 2007;100:14-20

© The Author 2007. Published by Oxford University Press.

Lifecycle of a marker

Analytical development
  Measurement, logistics, etc.

Clinical development
  Sample collection, storage, processing
  “Retrospective” connection with outcome

Clinical validation
  “Prospective” connection with outcome

Statistical issues during analytical development

Reproducibility
  Repeat the measurement on the same sample multiple times under otherwise identical conditions

Suppose a binary marker, measured twice
  Results can be summarized in a fourfold (2x2) table

Statistical significance?
  Not good enough!
  p<0.05 shows there is a trend
  Need strong agreement, not just a trend
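A small worked example may help show why a significant p-value is not the same as strong agreement. The sketch below summarizes a hypothetical 2x2 table of repeated binary measurements with the observed agreement, Cohen's kappa, and a chi-square test of association (1 degree of freedom, no continuity correction). The counts and the function name are invented for illustration.

```python
import math

def agreement_summary(both_pos, pos_neg, neg_pos, both_neg):
    """Summarize a 2x2 table of two repeated binary measurements.

    Returns (observed agreement, Cohen's kappa, chi-square p-value).
    The p-value uses the chi-square test with 1 df and no continuity correction.
    """
    n = both_pos + pos_neg + neg_pos + both_neg
    row_pos, row_neg = both_pos + pos_neg, neg_pos + both_neg
    col_pos, col_neg = both_pos + neg_pos, pos_neg + both_neg
    observed = (both_pos + both_neg) / n
    expected = (row_pos * col_pos + row_neg * col_neg) / n ** 2
    kappa = (observed - expected) / (1.0 - expected)
    chi2 = n * (both_pos * both_neg - pos_neg * neg_pos) ** 2 / (
        row_pos * row_neg * col_pos * col_neg
    )
    # For 1 df, the chi-square tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return observed, kappa, p_value

if __name__ == "__main__":
    # Hypothetical counts: 100 samples, each measured twice.
    observed, kappa, p_value = agreement_summary(30, 20, 20, 30)
    print(f"agreement={observed:.2f} kappa={kappa:.2f} p={p_value:.3f}")
```

With these invented counts the association is statistically significant at the 0.05 level, yet the two measurements agree on only 60% of samples and kappa is 0.2, which would usually be regarded as weak reproducibility.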

Continuous Measurements

[Figure: three scatter plots of Measurement 2 versus Measurement 1.
  Panel 1: p = 1.2 x 10^-11, R-squared = 0.92
  Panel 2: p = 3.2 x 10^-5, R-squared = 0.62
  Panel 3: p = 5.2 x 10^-11, R-squared = 0.59]

DO NOT RELY ON P-VALUES!!
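One reason p-values mislead here is that, for a fixed strength of correlation, the p-value keeps shrinking as the number of samples grows. The sketch below uses the Fisher z-transformation, a standard normal approximation for testing zero correlation, to compute an approximate two-sided p-value from R-squared and the sample size. The sample sizes are hypothetical, and the function name is an assumption made for this example.

```python
import math

def approx_correlation_p_value(r_squared, n):
    """Approximate two-sided p-value for testing zero correlation.

    Uses the Fisher z-transformation: atanh(r) * sqrt(n - 3) is treated as
    standard normal under the null. This is an approximation, not an exact test.
    """
    r = math.sqrt(r_squared)
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))

if __name__ == "__main__":
    # Same R-squared, hypothetical sample sizes.
    for n in (20, 60, 200):
        print(n, approx_correlation_p_value(0.59, n))
```

The same modest R-squared produces a far smaller p-value with more samples, so a tiny p-value says little about whether two measurements agree well enough to be useful.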

Clinical development of a marker

Correlate marker(s) with the outcome on a cohort of patients

Many issues relate to bias
  Case/control selection
  Quality/Processing
  Over-fitting/Lack of validation

What is bias?

A systematic difference between what we think we observe and what we actually observe

The more “haphazard” the data collection process, the more chances of bias creeping in

Buyer beware: Commercial Tissue Microarrays

Why is bias a problem?
  Cannot be “quantified” (within a study)
  Does not diminish with increasing sample sizes
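The claim that bias does not diminish with increasing sample size can be illustrated with a toy simulation. In this sketch every observation carries the same systematic offset; the true mean, the size of the offset, the noise level, and the function name are all hypothetical values chosen for illustration.

```python
import math
import random
import statistics

def biased_estimate(n, true_mean=10.0, bias=1.0, sd=2.0, seed=0):
    """Estimate a mean from n observations that all share a systematic offset.

    Returns (sample mean, standard error). All parameter values are hypothetical.
    """
    rng = random.Random(seed)
    observations = [rng.gauss(true_mean + bias, sd) for _ in range(n)]
    mean = statistics.fmean(observations)
    standard_error = statistics.stdev(observations) / math.sqrt(n)
    return mean, standard_error

if __name__ == "__main__":
    for n in (100, 10_000):
        mean, se = biased_estimate(n)
        print(f"n={n}: estimate={mean:.2f}, standard error={se:.3f}")
```

The standard error shrinks as n grows, but the estimate settles near the biased value rather than the true one. The data themselves give no warning of this, which is the sense in which bias cannot be quantified within a study.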

Double dipping

Use the same data to develop/fine-tune a marker (or model) and evaluate its characteristics

Most obvious with multivariable analyses (gene signatures etc)

Might happen in seemingly innocuous circumstances
  Choosing a cutpoint
  Not reporting negative markers

VALIDATION!!!
  “cross-validation”: statistical approaches that use the same data but account for double-dipping
  true validation:
    repeat the study in a new but similar population
    apply the “model” to a new dataset and test its prediction accuracy
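Choosing a cutpoint is a simple case of double dipping that is easy to simulate. In the sketch below the marker is pure noise, unrelated to the outcome. A cutpoint is picked to maximize the difference in response rates on one dataset and then applied unchanged to a fresh dataset. The sample size, response rate, candidate cutpoints, and function names are assumptions made for this illustration.

```python
import random

def rate_difference(markers, outcomes, cutpoint):
    """Response rate in marker-high patients minus the rate in marker-low patients."""
    high = [y for x, y in zip(markers, outcomes) if x > cutpoint]
    low = [y for x, y in zip(markers, outcomes) if x <= cutpoint]
    if not high or not low:
        return 0.0
    return sum(high) / len(high) - sum(low) / len(low)

def null_dataset(n, rng):
    """Marker and outcome generated independently, so the marker is useless."""
    markers = [rng.gauss(0.0, 1.0) for _ in range(n)]
    outcomes = [1 if rng.random() < 0.3 else 0 for _ in range(n)]
    return markers, outcomes

def one_replicate(n, rng):
    """Pick the best-looking cutpoint on one dataset, then check it on a new one."""
    dev_x, dev_y = null_dataset(n, rng)
    new_x, new_y = null_dataset(n, rng)
    ranked = sorted(dev_x)
    # Candidate cutpoints: 20th to 80th percentiles of the development data.
    candidates = [ranked[n * q // 10] for q in range(2, 9)]
    best = max(candidates, key=lambda c: rate_difference(dev_x, dev_y, c))
    return rate_difference(dev_x, dev_y, best), rate_difference(new_x, new_y, best)

def average_effects(n=100, replicates=200, seed=1):
    """Average apparent effect (same data) and validated effect (new data)."""
    rng = random.Random(seed)
    results = [one_replicate(n, rng) for _ in range(replicates)]
    apparent = sum(r[0] for r in results) / replicates
    validated = sum(r[1] for r in results) / replicates
    return apparent, validated

if __name__ == "__main__":
    print(average_effects())
```

On average the apparent effect is clearly positive even though the marker carries no information, while the effect measured on new data averages close to zero. This is the optimism that cross-validation or true validation is meant to remove.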

All sorts of biases crept in
  Patients with tissue are unlikely to be a random sample
  No real inclusion/exclusion criteria

Possibly looked at many markers, many subsets and many thresholds

Build your marker into a clinical trial

Be critical of your results

Incorporating markers into clinical trials

Start as secondary endpoints in a Phase I or II trial

If Phase I, might be better to have an MTD-cohort and limit the correlative studies to that cohort

If Phase II and an expensive/invasive marker, consider a two-stage design where the marker will be measured only in the second stage
