Statistical Issues in Clinical Research - An Overview

Upload: pharma000

Post on 29-May-2018


  • 8/8/2019 Statistical Issues in Clinical Research - An Overview


    www.pharmasri.com


    - Answering even relatively simple questions under the best conditions (a controlled clinical trial) can be tricky. Possible sources of bias abound and, if appropriate safeguards are not taken, may combine to give a false or misleading conclusion

    - Some of the factors which make clinical research hard:
      - Formulating the right scientific question can be deceptively tricky
      - Logistical complexity, especially the need to use multiple sites
      - Trial conduct is highly interdisciplinary, requiring sustained, well-coordinated effort from many groups
      - Staggered recruitment of subjects; uncertainty about the accrual pattern is unavoidable
      - Patient dropout, particularly in longer trials
      - Potential for the goalposts to move mid-trial: unforeseen events can destroy, or severely reduce, the relevance of the study even before it ends


    - Lasagna's Law
      - The prevalence of any disease under study drops dramatically once study enrolment opens up, and returns to previous levels only once enrolment closes
    - Murphy's Law
      - Anything that can go wrong, will go wrong
      - In particular, the most egregious breach of protocol instructions will occur at the highest-enrolling site
    - Giltinan's Law
      - The quality of data obtained from any site is inversely proportional to the degree of exaltation of the thought leader or principal investigator at the site (in extreme cases, the role of thought leader is so all-consuming that delays in filing the necessary paperwork result in actual enrolment levels close to zero)

    Clearly, these are all just different manifestations of Murphy's Law


    - A key concern is that each individual study protocol must achieve its goals, not just on its own terms: it must also make sense within the broader picture
    - A major practical issue is the ever-changing nature of the landscape: the long duration of most trials, and the uncertainty about the results, mean that the original target may have shifted by completion of a given trial
    - Nonetheless, a key requirement when designing any trial is that the proposed design should give the best chance possible of enabling the development plan to proceed to the next stage, once results from the trial become available
    - The previous condition should be met even when results do not correspond to the desired answer; it is important to remember that a failed clinical trial is not one which fails to give the desired answer, but rather one which fails to give an unambiguous answer


    - Phase III objectives are determined primarily by (i) the target product profile (think "desired label claim") and (ii) norms for the given disease
    - Primary and secondary objectives should map readily to corresponding statistical hypotheses
    - Safety objectives are given greater emphasis in Phases I and II; Phase III focuses on efficacy and safety
    - Objectives should be specified as precisely as possible. At a minimum, include information on
      - What measure of efficacy/safety will be used?
      - Key features of the target patient population
      - Dosing regimen, i.e. amount, frequency, and route of dosing
    - It is preferable to use neutral language when specifying objectives (personal opinion). Phrases like "to compare (investigate) the efficacy" or "to characterize the pharmacokinetics" are preferable to, e.g., "to demonstrate efficacy" or "to establish superiority"


    Examples:

    - "To investigate the effect of a single 5 mg dose of rhwonderprotein, administered by transgenic snakebite, on clotting ability in Irish clergymen, as measured by the change from baseline in prothrombin time", rather than "To demonstrate the efficacy of rhwonderprotein in improving clotting ability"
    - "To investigate the effect of twice daily SC injection of 40 µg/kg of rhIGF-I for 12 weeks on glycemic control, in subjects with moderate to severe Type II diabetes, as measured by the average change from baseline in HbA1c, compared to subjects in the placebo group"


    Sources of bias:
      1. Selection bias
      2. Allocation bias
      3. Evaluation bias (observer/instrument)
      4. Recall bias
      5. Time (systematic change in patient population, treatment, or other aspect of study conduct as trial progresses)
      6. Withdrawal / drop-out patterns
      7. Lack of compliance with study protocol
      8. Unblinding (of patient, physician, or study personnel)

    Corresponding safeguards:
      1. Unambiguous eligibility criteria
      2. Randomization, stratification, blinding
      3. Blinding, standardization (training, or central evaluation)
      4. Appropriate data collection instruments
      5. Balanced treatment allocation; protocol should specify salient details of study conduct, avoiding room for differential interpretations
      6. Pre-specified analysis conventions, sensitivity analyses
      7. Training; engaged study coordinators at site
      8. Randomized allocation; suitable precautions surrounding treatment codes and drug inventory/supply


    - Randomization is the basis for statistical inference
    - A significance level represents the probability that differences in outcome can be the result of random fluctuations
    - Without randomization, a statistically significant difference may be the result of non-random differences in the distribution of unknown prognostic factors
    - Randomization does not ensure that groups are medically equivalent, but it distributes the unknown biasing factors randomly
    - Randomization plays an important role in the generalization of the observed clinical trial data


    - If prognostic factors are known, use randomization methods that can account for them
      - Stratification / blocking
      - Adaptive randomization
    - If possible, randomize patients within a site
    - Patients enrolled early may differ from patients enrolled later
      - Watch out for staggered enrollment
      - Temporary closing of study sites or arms can cause problems
    - Protocol amendments that affect inclusion/exclusion criteria may be tricky
    - Even in open-label studies, randomization codes should be locked
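
The stratification / blocking idea above can be pictured with a small sketch of permuted-block randomization within strata. This is only an illustration under stated assumptions, not a validated randomization system: the function name make_allocator, the arm labels "A" and "B", the block size of 4, the seed, and the example stratum (site, severity) are all hypothetical.

```python
import random
from collections import defaultdict


def make_allocator(arms=("A", "B"), block_size=4, seed=2024):
    """Return a function that assigns arms using permuted blocks within each stratum."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    # stratum -> assignments still unused in that stratum's current block
    pending = defaultdict(list)

    def assign(stratum):
        if not pending[stratum]:
            # Start a new block: equal numbers of each arm, in random order
            block = list(arms) * (block_size // len(arms))
            rng.shuffle(block)
            pending[stratum] = block
        return pending[stratum].pop()

    return assign


# Hypothetical example: eight subjects from one stratum (site 1, severe disease)
assign = make_allocator()
allocations = [assign(("site1", "severe")) for _ in range(8)]
print(allocations, allocations.count("A"), allocations.count("B"))
```

Within each stratum the arms are balanced after every complete block, which is what protects against chance imbalance on the stratifying factor (here, site and severity).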


    - Randomization does not guarantee that there will be no bias from subjective judgment in evaluating and reporting the treatment effect
    - Such bias can be minimized by concealing the identity of the treatment (blinding)
      - Types of blinding
    - Challenges
      - Ethical considerations
      - Unblinding procedures for safety reasons
      - Unblinding procedures at final analysis


    - Protection against certain types of bias is through appropriate design precautions (stratification, randomization, blinding)
    - Other types of bias are prevented only by giving unambiguous instructions to the sites on the intended patient population and how all aspects of the study should be conducted
    - Sites will sniff out each ambiguity in the protocol, and interpret and execute the instructions more divergently than you can imagine
    - Vagueness regarding key aspects of study conduct invites exactly this kind of divergence, e.g. use of con meds, evaluation schedule, endpoint definition, handling of dropouts, how key evaluations will be carried out, etc.
    - Major divergence in interpretation (e.g. in deciding eligibility, or how to measure a key response variable) has the potential to torpedo the protocol entirely, and may not become evident until it's too late


    - As a routine precaution, it is advisable to limit the contribution to enrolment of any single site to no more than 15% of the total. Note that this limit is generally not specified explicitly in protocol text, but is communicated to sites at study initiation nonetheless
    - Non-standard evaluations may require intensive training of site personnel to reduce systematic differences in evaluation among sites
    - Centralized (blinded) evaluation, when feasible, is often the best option
    - It is a good idea to develop a prospective publication strategy, securing upfront buy-in from key stakeholders
    - A plan and timetable for disseminating study results should be developed, following existing SOPs, and communicated to sites prospectively


    - Regular, frequent communication with sites is important
    - Early monitoring of key variables is advisable, to allow problems to be detected and fixed early
    - Appropriate mechanisms should be in place to allow evaluation of aggregated safety data in a timely fashion (remember that individual sites may not be able to discern adverse patterns, based only on their data)
    - Each team member should try to attain at least a basic understanding of the role of every other team member


    - Discussion here will focus primarily on efficacy endpoints
    - What about other kinds of endpoints?
    - Pharmacokinetic endpoints are generally standard parameters derived from the observed concentration-time profiles
    - Safety endpoints also tend to be fairly standard; most are common across protocols, with occasional disease/drug-specific markers
      - Incidence of adverse events (general, protocol-specified, by body system, etc.)
      - Changes in key laboratory parameters
      - Incidence of antibodies (neutralizing or not)
    - Pharmacodynamic endpoints, in contrast, are measures of activity, and will vary from study to study. Recommendations for efficacy endpoints apply.


    - No problem in Phase I, where focus is primarily on safety and PK endpoints. Limited sample sizes preclude formal evaluation of efficacy; if it must be mentioned in the protocol, it is preferable to refer to "activity", rather than "efficacy"
    - Drug approval requires establishing an acceptable risk-benefit profile. It is important to bear in mind that the regulatory expectation is that of clinical benefit to the patient
    - Thus, in general, the primary efficacy endpoint should be a measure of clinical effect (as opposed to, e.g., a biochemical or physiological marker)
    - Taking the primary efficacy endpoint in a pivotal trial to be a biomarker which is not a direct measure of clinical benefit is something which should be done only with prior buy-in from all relevant regulatory agencies
    - In general, such buy-in can be attained only in the case of an established surrogate endpoint (more on this below)


    - Generally speaking, endpoints which can be measured in a completely objective fashion are preferred
    - This may not always be possible: some degree of subjectivity may be unavoidable (e.g. in endpoints such as physician's or patient's evaluation of improvement)
    - The degree to which this kind of subjectivity may be acceptable is likely to depend on perceptions about the integrity of blinding in the study
    - In evaluating quality of life (QOL), use of a validated instrument is preferable. In many cases, a disease-specific QOL questionnaire exists
    - Consultation with the Health Economics group is highly recommended, to ensure that collection of QOL data supports the target product profile (don't wait until Phase III to do this)


    - In general, key efficacy endpoints should be straightforward to measure. Avoid measures which might still be considered experimental, which require highly complex instrumentation, or involve extremely specialized assays. Measurements which rely heavily on technician skill or judgement can also be problematic
    - Centralized evaluation of key endpoints may help guard against inter-site variation
    - If key variables do involve specialized assays, make sure that assay procedures are thoroughly understood, and consistently implemented


    - Multiple secondary endpoints are common *
    - Multiple primary endpoints are sometimes used
      - If consensus on a single 1° endpoint is impossible
      - Should be a course of last resort (personal view)
      - Have an associated penalty, in terms of a higher bar to declare statistical significance at a given level α
    - A common approach is to require significance at level α/k, where k is the number of co-primary endpoints (Bonferroni)
    - Bonferroni works reasonably, provided k is not too large, and if the constituent endpoints are uncorrelated
    - For highly correlated endpoints, Bonferroni is inefficient; true attained significance will be < α
    - Especially problematic if there is interest in multiple subsets

    * Try to show some discipline regarding the number of 2° endpoints
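
The Bonferroni penalty described above can be made concrete with a few lines of arithmetic. The sketch below assumes k independent endpoints, all with no true effect; the function names and the choice of α = 0.05 and k = 3 are illustrative assumptions, not part of the original slides.

```python
def bonferroni_threshold(alpha, k):
    """Per-endpoint significance level when k co-primary endpoints share level alpha."""
    return alpha / k


def familywise_error_independent(per_test_level, k):
    """Chance of at least one false positive across k independent tests with no true effect."""
    return 1.0 - (1.0 - per_test_level) ** k


alpha, k = 0.05, 3
per_endpoint = bonferroni_threshold(alpha, k)
print("per-endpoint level:", per_endpoint)
print("overall error, Bonferroni:", familywise_error_independent(per_endpoint, k))
print("overall error, no adjustment:", familywise_error_independent(alpha, k))
```

With independent endpoints the adjusted overall error rate lands just under α, while testing each endpoint at the full α would push it well above. For positively correlated endpoints the attained rate falls further below α, which is the inefficiency noted above.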


    1. Continuous, e.g. reduction in cholesterol, HbA1c, visual acuity
    2. Categorical
       a) Multiple categories with no natural ordering
       b) Ordered categorical, e.g. different degrees of improvement
    3. Dichotomous, e.g. response/non-response*, dead/alive at a specific time post-treatment
    4. Time-to-event, e.g. survival, time to progression

    Different analysis methods are appropriate for each main endpoint type; sample size requirements differ as well

    * (3) is obviously a special case of (2)


    - Approximate ordering by information content (from highest to lowest) is
        continuous > time-to-event ~ ordered categorical > categorical > binary
    - As a result, demonstrating an effect when the primary efficacy measure is a response rate is typically most demanding in terms of sample size
    - Although continuous response variables may have preferable statistical properties, it is quite common for FDA to require the primary efficacy variable to be a response rate, i.e. the proportion of subjects who reach a specified threshold of improvement on the continuous scale (Raptiva, Lucentis)


    - Response rate (where response is based on change in tumor size, according to well-defined criteria; best post-treatment evaluation is counted, so response is not linked to a specific timepoint)
    - Duration of response (note that the resolution with which this can be determined will depend on the frequency of scheduled evaluations)
    - Survival time
    - Time to disease progression, where criteria for progression are well-defined
    - Progression-free survival

    One major question is the extent to which a treatment effect on response, in terms of reduction of tumor size, is predictive of a treatment effect on survival. Unfortunately, this seems to vary by tumor and treatment class.


    In the standard hypothesis testing framework for efficacy:
      - Type I error: conclude an ineffective drug is effective (false positive)
      - Type II error: conclude an effective drug is ineffective (false negative)

    - Ideally, both error probabilities should be controlled
    - Generally, sample size is chosen to give acceptable power (defined as 1 - Type II error rate, or 1 - β) for a prespecified false positive rate, α
    - In Phase III efficacy trials, α is 0.05, by regulatory fiat
    - Acceptable power is generally taken to be 90% for pivotal studies


    - The requirement to control both error rates has implications for sample size, due to the tension between the two types of error
    - There are timeline implications, as study duration = treatment duration + accrual time
    - Common pitfall: exaggerating the extent of the possible treatment effect ("power for the home run"), which leads to over-optimistic sample sizes
    - General guideline: power the study to detect the treatment effect specified in the target product profile (regular, not optimistic, scenario)
    - In some cases, sample size is dictated by safety, rather than efficacy, considerations (to satisfy minimum regulatory requirements)


    - For a given value of α, power depends on
      - Magnitude of the treatment effect
      - Sample size
      - Inter-subject variability, for continuous measurements
      - Response rates, for binary responses
    - For most pivotal efficacy trials, the standard approach is to calculate the sample size necessary to give adequate (90%) power to detect a clinically meaningful treatment effect, with a type I error rate of 5%
    - Calculating the sample size needed for a given power requires some knowledge about variability of continuous responses (or response rates, for binary data)
    - "Clinically meaningful" needs to be defined in terms of the target product profile, not as "the effect size which will give acceptable power for the sample size I'm willing/able to use"
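
As a rough illustration of how these ingredients combine, the sketch below uses the usual normal-approximation formulas for a two-group, equal-allocation comparison with a two-sided test. The formulas are textbook approximations rather than anything prescribed in the slides, and the function names and example inputs (a difference of half a standard deviation; response rates of 30% versus 50%) are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def n_per_group_means(delta, sigma, alpha=0.05, power=0.90):
    """Approximate subjects per group to detect a mean difference delta (two-sided test)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # critical value for the type I error rate
    z_beta = z(power)           # quantile corresponding to the desired power
    return math.ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)


def n_per_group_rates(p_control, p_treated, alpha=0.05, power=0.90):
    """Approximate subjects per group to detect a difference between two response rates."""
    z = NormalDist().inv_cdf
    z_sum = z(1 - alpha / 2) + z(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return math.ceil(z_sum ** 2 * variance / (p_control - p_treated) ** 2)


print(n_per_group_means(delta=0.5, sigma=1.0))               # 90% power
print(n_per_group_means(delta=0.5, sigma=1.0, power=0.80))   # 80% power
print(n_per_group_rates(p_control=0.30, p_treated=0.50))
```

Because the required size scales with 1/delta squared, halving the assumed treatment effect roughly quadruples the sample size, which is why an over-optimistic effect size produces an under-sized trial.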


    - Sample size is not always dictated by this kind of power analysis: in some cases, safety requirements may be the deciding factor (rheumatoid arthritis, psoriasis)
    - In earlier phases, it may not be practical to run trials big enough to control both Type I and Type II error rates as well as we might like
    - 80% power is generally considered adequate in Phase II; on occasion we may settle for less
    - Similarly, requiring significance at the 5% level may be overly stringent in Phase II
    - Personal view: it is foolish to allow the hegemony of hypothesis testing to control our thinking prior to Phase III
    - Instead, view the issue as an estimation problem
    - Precision analysis
      - Choose sample size in such a way that there is a desired precision at a fixed confidence level
      - Small chance of detecting the true treatment effect
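
A precision analysis of the kind described above can be sketched as follows, assuming the quantity of interest is a difference in means between two equal-sized groups with a common, known standard deviation and a normal-approximation confidence interval. The function name and the example half-widths are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_for_precision(half_width, sigma, confidence=0.95):
    """Subjects per group so the CI for a difference in means has the given half-width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # half_width = z * sigma * sqrt(2 / n), solved for n
    return math.ceil(2 * (z * sigma / half_width) ** 2)


print(n_per_group_for_precision(half_width=0.5, sigma=1.0))
print(n_per_group_for_precision(half_width=0.25, sigma=1.0))
```

The calculation fixes how tightly the effect is estimated and says nothing about power, so a trial sized this way may still have only a small chance of declaring a true effect significant.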


    Challenge:

    - Power for correctly detecting a clinically meaningful difference at a fixed type I error rate depends primarily on the number of events (deaths, progressions, etc.)
    - Specifying the number of events doesn't uniquely determine the number of subjects
    - For instance, suppose the required number of events is 280. If 300 subjects per group is sufficient to give the required number of events, then 250 per group must as well: it will just take longer
    - Thus, sample size calculations are a little more complex for time-to-event responses and will depend on
      - calculating the number of events needed to give the desired power
      - an assumption about the median time-to-event in the control group
      - an assumption about the size of the difference between control and treated groups
      - projected accrual patterns
      - targeted study duration
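
The first step in that list, the number of events needed, is often approximated with Schoenfeld's formula for a logrank comparison with 1:1 allocation. The slides do not say which formula was used, so the sketch below is an assumption; the function name and the hazard ratios in the example are hypothetical.

```python
import math
from statistics import NormalDist


def events_required(hazard_ratio, alpha=0.05, power=0.90):
    """Approximate total events for a two-sided logrank test, equal allocation (Schoenfeld)."""
    z = NormalDist().inv_cdf
    z_sum = z(1 - alpha / 2) + z(power)
    return math.ceil(4 * z_sum ** 2 / math.log(hazard_ratio) ** 2)


print(events_required(0.70))              # 90% power
print(events_required(0.75, power=0.80))  # 80% power
```

Note that the result is a count of events, not subjects. Translating it into an enrolment target still requires the assumptions about control-group median, accrual pattern, and study duration listed above.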


    - Interim analysis is a tool to protect the welfare of subjects
      - By stopping enrollment/treatment as soon as a drug is determined to be harmful
      - By stopping enrollment as soon as a drug is determined to be beneficial
      - By stopping trials which will yield little additional useful information (or which have negligible chance of demonstrating efficacy if fully enrolled, given results to date)
    - The associated statistical methods are generally referred to as group sequential methods


    - Should preserve an overall false positive rate of α for the trial: cannot claim statistical significance at level α if the unadjusted p-value at one of the interim analyses happens to be less than α
    - In general, the unadjusted p-value for testing treatment effect at any given interim analysis will be compared to a more stringent (lower) bound: to stop early (for efficacy) requires compelling evidence
    - Regulatory agencies need to be convinced that interim analyses do not compromise the integrity of the blind
    - Regulatory guidelines over the past 10 years have become stricter and stricter, ultimately requiring that interim analyses be conducted by an external, independent group, i.e. study team members are no longer privy to interim results
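
Why an unadjusted interim p-value cannot be compared to α is easy to see by simulation. The sketch below assumes equally spaced looks at a standardized test statistic under the null hypothesis (no treatment effect) and counts how often any look crosses the unadjusted two-sided threshold. The function name, number of simulated trials, and seed are arbitrary choices for illustration.

```python
import math
import random
from statistics import NormalDist


def false_positive_rate(n_looks, n_trials=20000, alpha=0.05, seed=1):
    """Simulated chance that ANY of n_looks unadjusted tests is significant under the null."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(n_trials):
        cumulative = 0.0
        for look in range(1, n_looks + 1):
            # Each look adds one equally sized, independent chunk of data
            cumulative += rng.gauss(0.0, 1.0)
            if abs(cumulative / math.sqrt(look)) > critical:
                hits += 1
                break
    return hits / n_trials


for looks in (1, 2, 5):
    print(looks, "look(s):", false_positive_rate(looks))
```

The simulated rate sits near α with a single analysis and climbs as looks are added, which is the inflation that group sequential boundaries are designed to remove.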


    - Basically, interim results should not be shared with anyone in the sponsor company, or at participating study centers
    - The only feedback to the sponsor is in the form of the recommendations from the Data Monitoring Committee (DMC)
    - Details of any proposed interim analysis, including the sponsor's expectations of the DMC, should be laid out prospectively in a written charter
    - SOPs and a charter template exist and should be followed
    - Although team members do not conduct the actual analyses, scheduled interim analyses can be highly labor-intensive nonetheless. Genentech's biostatistician/statistical programmer will still need to work with the external data group to develop detailed specifications for the analyses and displays to be made available to the Data Monitoring Board


    - Early stopping for efficacy is not the only possibility (recent experience notwithstanding). Doing so is generally non-controversial, provided an appropriate group sequential stopping rule, and the role of the DMC, have been identified prospectively
    - Early stopping for safety can range from scenarios which are very clear-cut to situations which are considerably more ambiguous. In the latter case, having an experienced DMC chair can be particularly important
    - Early stopping for lack of efficacy (futility analysis) is not particularly common (with one exception, discussed on the next slide). The idea that incorporating this option can result in substantial reduction in the number of patients ("gating risk") seems slightly misleading (personal opinion)
      - Stopping for futility in a controlled trial will typically happen only if the treatment appears considerably inferior to control at the interim analysis
      - Enrolment continues during preparation for the interim analysis, which typically occurs at a point where accrual has gained momentum, so the number of subjects saved may not be that great


    - An exception is the case of uncontrolled oncology trials focusing on estimation of response rate
    - Use of a two-stage (or multi-stage) design is common
    - At a given analysis stage, if the observed response rate is so low that it essentially rules out the possibility that the true response rate is acceptable, one may choose to stop
    - Typically the argument is based on the upper 90% or 95% confidence limit for the true response rate: stop if this is lower than the minimum rate identified as interesting in the TPP
    - Recall the "rule of 3", often invoked in the context of safety data. If a particular event (adverse reaction, response) occurs in 0 out of N subjects tested, then the 95% upper confidence limit for the true rate of occurrence is approximately 3/N
    - Thus, for instance, if no responses are observed in the first 20 subjects, this effectively rules out values of the true response rate greater than 3/20, or 15%. If the TPP requires a response rate of at least 20%, stopping for futility seems warranted
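
The rule of 3 is an approximation to an exact binomial bound, and the same exact calculation extends to the case where a few responses have been seen. The sketch below computes a one-sided exact (Clopper-Pearson style) upper confidence limit by bisection; the function names are illustrative, and the 20-subject first stage is taken from the example above.

```python
import math


def rule_of_three(n):
    """Approximate 95% upper confidence limit when 0 events are seen in n subjects."""
    return 3 / n


def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(x + 1))


def exact_upper_limit(x, n, confidence=0.95):
    """One-sided exact upper confidence limit for the true rate, given x events in n."""
    if x >= n:
        return 1.0
    low, high = 0.0, 1.0
    for _ in range(60):
        mid = (low + high) / 2
        # The CDF decreases as p grows; find the p where it equals 1 - confidence
        if binomial_cdf(x, n, mid) > 1 - confidence:
            low = mid
        else:
            high = mid
    return (low + high) / 2


print(rule_of_three(20), exact_upper_limit(0, 20), exact_upper_limit(1, 20))
```

With 0 responses in 20 subjects the exact limit is about 13.9%, slightly below the 15% given by the rule of 3, so a 20% target is excluded. With 1 response in 20 the exact limit is about 21.6%, and the same futility argument no longer holds.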


    - A fairly detailed exposition can be found on our website at: gwiz/projects/stathelp (introductory course notes, lecture 4)
    - Use of the binomial distribution
    - Calculating standard errors; normal approximation for large samples
    - Estimation and confidence intervals for a single rate
    - Testing for difference between two rates (z-test, χ²-test, Fisher's exact test)
    - Estimation and confidence intervals for the difference between two rates
    - Testing for differences in rates among several groups (χ²-test, Fisher's exact test)


    - If the response of interest is survival time, then specialized methods are needed, for two main reasons
      - The frequency distribution of survival times is usually not well-behaved: not normal, not even symmetric
      - In the context of clinical studies, we cannot wait to observe all survival times. This means that, for some subjects, all we know is that their survival time exceeds the observation period
    - In statistical jargon, such survival times are called (right-)censored observations
    - Methods for survival times are also applicable to any response of type "time-to-event", e.g. time to disease progression, etc.


    - Definitions: survivor function, hazard function
    - Estimation of survival curve: Kaplan-Meier
    - Comparison of two or more survival curves: logrank test, Wilcoxon test
    - Comparing survival curves, allowing adjustment for other factors (e.g. baseline disease status): proportional hazards regression, aka the Cox model


    - We wish to estimate the proportion remaining disease-free at any given time or, equivalently, the probability that a member of the population from which the sample is drawn is alive without disease at that time
    - Because of the censoring, we use the Kaplan-Meier method. For each time interval we estimate the probability that those without disease at the beginning remain so throughout the interval. This is a conditional probability.
    - The probability of being disease-free at any time point is calculated as the product of the conditional probabilities of surviving without disease through each interval prior to that time point.
    - The calculations are simplified by ignoring times at which there were no recorded events (whether progressions or losses to censorship).
    - Censorship is accommodated in the calculations by ensuring that all subjects previously lost to censoring are removed from the risk set when calculating the conditional probability for a given timepoint
    - Because the overall probability of being disease-free at a particular timepoint is calculated as a product of the relevant conditional probabilities, this (Kaplan-Meier) method of estimating the survival curve is sometimes referred to as the product-limit estimate
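
The product-limit calculation described above fits in a few lines. The sketch below is a bare-bones illustration, not a replacement for statistical software: the function name and the eight follow-up times are made up, and the code adopts the usual convention that a subject censored at time t is still counted as at risk for events occurring at t.

```python
def kaplan_meier(times, events):
    """Product-limit estimate. events[i] is 1 for an observed event, 0 for censoring.

    Returns a list of (event time, estimated probability of remaining event-free).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_censored = 0
        # Gather everything recorded at this time point
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            else:
                n_censored += 1
            i += 1
        if n_events:
            # Conditional probability of getting through time t, given at risk at t
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set afterwards
        at_risk -= n_events + n_censored
    return curve


# Hypothetical data: months to progression; 0 marks a censored observation
months = [2, 3, 3, 5, 7, 8, 10, 12]
progressed = [1, 1, 0, 1, 0, 1, 0, 0]
for t, s in kaplan_meier(months, progressed):
    print(t, round(s, 3))
```

The estimate changes only at the four observed progressions (months 2, 3, 5, and 8); the censored observations contribute by shrinking the risk set used for later steps.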


    - Survival probabilities are usually presented as a connected "curve". The curve takes the form of a step function, with changes in the estimated probability occurring (only) when an event (progression) was observed
    - Observations censored during any interval affect the number still at risk at the start of the next interval. Censoring is thus accommodated when calculating the step sizes; its effect on the curve is relatively subtle, but becomes cumulatively more important over time. Some versions of the Kaplan-Meier curve display censoring times as superimposed short vertical lines (works best for relatively small sample sizes)
    - In practice, a computer is used to do these calculations.
    - Standard errors and confidence intervals for estimated survival probabilities can be found by using a formula due to Greenwood
    - Reporting estimated median survival with associated confidence limits is usual; estimating other percentiles is also possible


    The two most common tests are

    - Logrank test
    - Wilcoxon test

    If the comparison needs to allow adjustment for other covariates besides group ID (e.g. baseline disease status), the most common approach is

    - Cox (proportional hazards) regression

    As the name implies, this analysis frames the comparison in terms of the effect a treatment or covariate exerts on the hazard function, rather than directly on the survival function


    Logrank test

    - Basic idea: at each new event time, figure out the survival pattern that would be expected if the null hypothesis (no difference) were true
    - Quantify the difference between the observed survival pattern and that expected under the null hypothesis. This is done at each new event time.
    - Obtain a cumulative measure of discrepancy from H0 by adding up the contributions across all event times
    - Compare the result to appropriate tables (chi-square) to obtain a p-value

    Wilcoxon test: a variation of the logrank test which gives greater weight to discrepancies occurring earlier
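
The logrank steps above translate fairly directly into code for the two-group case. The sketch below is a plain, unoptimized illustration using the standard hypergeometric variance at each event time; the function name and the toy data are assumptions, and the p-value comes from the chi-square distribution with 1 degree of freedom.

```python
import math


def logrank_test(times1, events1, times2, events2):
    """Two-group logrank test. Returns (chi-square statistic, p-value)."""
    pooled = list(zip(times1, events1)) + list(zip(times2, events2))
    event_times = sorted({t for t, e in pooled if e})
    observed1 = expected1 = variance = 0.0
    for t in event_times:
        n1 = sum(1 for x in times1 if x >= t)  # at risk in group 1 just before t
        n2 = sum(1 for x in times2 if x >= t)
        d1 = sum(1 for x, e in zip(times1, events1) if x == t and e)
        d2 = sum(1 for x, e in zip(times2, events2) if x == t and e)
        n, d = n1 + n2, d1 + d2
        observed1 += d1
        expected1 += d * n1 / n  # events expected in group 1 if H0 were true
        if n > 1:
            variance += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    chi2 = (observed1 - expected1) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p_value


# Hypothetical data: all events observed, group 2 clearly doing better
chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(round(chi2, 2), round(p, 4))
```

The Wilcoxon variant mentioned above would multiply each event time's contribution by a weight (commonly the number still at risk), so that early differences count for more.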


    Limitations of the logrank test

    - Only addresses the question: is there a difference? No direct quantification of the size of the difference
    - Doesn't allow adjustment for other relevant prognostic factors (e.g. differences at baseline)

    These questions are usually addressed by Cox (proportional hazards) regression. Salient output is

    - Estimated coefficient with standard error and/or confidence interval
    - Usually interested in whether or not the coefficient is zero
    - Quantifies effect on the hazard, rather than the survival function


    For completeness, here are the definitions:

    ` Survival function

    S(t) = probability of surviving past time t

    ` Hazard function

    h(t) = instantaneous risk (rate) of dying at time t, given that one has survived until that time

    For calculus fans, the hazard function turns out to be h(t) = d/dt [ -log S(t) ]
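
Written out in full, and assuming a continuous survival time T with density f(t) = -dS(t)/dt (notation introduced here, not on the slide), the relationship between the two functions is:

```latex
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
     = \frac{f(t)}{S(t)}
     = -\frac{d}{dt} \log S(t),
\qquad
S(t) = \exp\!\left( -\int_0^t h(u)\, du \right)
```

The second identity shows why a statement about the hazard (as in Cox regression) is also, indirectly, a statement about survival.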

    Safety and efficacy data differ in some key aspects

    ` Safety hypotheses are not specified a priori

    ` Failure to achieve statistical significance does not mean that a safety finding can be ignored

    ` With safety data the goal is to prove a negative

    ` Safety analyses are usually descriptive

    ` A few serious medical events can lead to the termination of a product's development; extreme value distributions are relevant to safety analyses

    ` Concurrent controls may not provide adequate context for interpretation

    ` Phase III trials are typically sized based on efficacy: what types of safety statements are appropriate?

    ` Drug exposure: how to summarize, how to correlate with adverse events observed, etc.

    Dose response

    Open label trials

    Placebo-controlled trials

    ` Sources of bias (under-reporting, longer follow-up leads to more events)

    ` Adverse events: very many types, so what is an appropriate way to summarize/analyze?

    Multiplicity
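
The multiplicity concern can be made concrete with a short calculation. The sketch below assumes that each adverse event type is tested separately at the same significance level and that the tests are independent, which is a simplification; the function name and the counts of event types are illustrative only.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive among n independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Hypothetical numbers of adverse event types compared between arms
for k in (1, 10, 50):
    print(k, round(familywise_error_rate(0.05, k), 3))
```

With dozens of event types, at least one nominally "significant" difference is close to guaranteed even when the treatment has no effect on any of them.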

    ` Number of subjects and duration of exposure during development is minimal relative to the number of patients that may receive the drug post-approval:

    Only the most common AEs (e.g., incidence of 1% or more) are identified

    Less common AEs (1 in 1000) cannot be reliably detected

    Rare events (1 in 10,000) will almost certainly not be observed at all

    Some patient groups may have been excluded from trials entirely, or represented too sparsely to identify any risks specific to them
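
A short calculation shows why exposure during development limits what can be seen. The sketch assumes each exposed patient independently experiences the event with the stated incidence; the sample size of 1000 is a hypothetical figure chosen for illustration, and the function names are invented here. The second function is the familiar "rule of three": if no events are seen in n patients, an approximate 95% upper bound on the true incidence is 3/n.

```python
def prob_at_least_one(incidence, n_patients):
    """Probability of observing one or more events among n exposed patients."""
    return 1.0 - (1.0 - incidence) ** n_patients


def rule_of_three_upper_bound(n_patients):
    """Approximate 95% upper bound on incidence when zero events are observed."""
    return 3.0 / n_patients


n = 1000  # hypothetical number of patients exposed during development
for incidence in (0.01, 0.001, 0.0001):
    print(incidence, round(prob_at_least_one(incidence, n), 3))
print("upper bound if no events seen:", rule_of_three_upper_bound(n))
```

Observing a single case is also a much weaker requirement than reliably detecting an excess over control, so these probabilities are optimistic for signal detection.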

    ` Safety: Applicant must demonstrate product safety (FDA has obligation to demand)

    Extent of data: There must be sufficient information to decide whether the drug is safe.

    Adequate analyses: Adequate tests by all methods reasonably applicable must be performed to evaluate safety for labeled use.

    Reasonable results: Tests should show that the drug is safe as labeled.

    Risks must be adequately defined.

    Extreme risks (even if rare) must be obvious.

    ` Efficacy: Applicant must demonstrate substantial evidence of the effectiveness claimed.

    Substantial evidence: evidence consisting of adequate and well-controlled investigations, including clinical investigations, from which experts could conclude the drug will have the claimed effect.

    Investigations imply replication or corroboration.

    Typical: 2 Phase III trials with identical or similar designs

    In special circumstances: 1 Phase III trial may be sufficient.

    E.g. life-threatening diseases with very limited therapeutic options (always a good idea to talk to regulatory agencies prior to trial initiation)

    ` Regulatory Agencies

    FDA

    EEC (European Economic Community)

    ` U.S. Code of Federal Regulations for Clinical Trials

    ` ICH (International Conference on Harmonization)

    Initiatives undertaken by regulatory authorities and industry associations to promote international harmonization of regulatory requirements

    Good Clinical Practice (GCP)

    Structure and content of clinical studies

    Clinical safety data management: Definitions and standardsfor expedited reporting

    Statistical principles for clinical trials

    "... a laboratory measurement or physical sign used as a substitute for a clinical endpoint that measures how a patient feels, functions, or survives."

    From a definition of the term "surrogate endpoint" by Temple, cited in Fleming and DeMets (1996), Annals of Internal Medicine, 125, pages 605-613 [Surrogate endpoints in clinical trials: are we being misled?]

    Some thoughts on biomarkers

    Predict clinical efficacy of treatment based on its effect on a biomarker (data may be available earlier; may provide an answer with fewer subjects)

    Use in Phase II is common

    Dose ranging based on biomarker

    Phase III go/no go decision based on observed treatment effect on biomarker

    ` Biochemical (cholesterol, HIV viral load, cytokine concentration, hemoglobin A1c)

    ` Immunological (lymphocyte subpopulation counts, CD4+, CD11a+ T cells, CD20+ B cells...)

    ` Saturation of target cell surface antigen or soluble ligand

    ` Physiological (e.g. blood pressure, pulmonary function testing, episodes of arrhythmia)

    ` Imaging (angiography, tumor size, bone density by DEXA scan)

    ` Lowering of cholesterol level by treatment with statins (survival benefit established)

    ` Reduction in viral RNA in peripheral blood through treatment with protease inhibitors delays HIV disease progression

    ` Improved glycemic control (HbA1c) predictive of delayed onset of microvascular complications (retino-, nephro-, neuropathy) in Type I diabetes

    ` 90-minute TIMI flow (angiography) predictive of 30-day survival following thrombolytic therapy

    ` Reduction in free IgE following treatment with an anti-IgE antibody correlates with symptom improvement scores in allergic rhinitis and asthma

    Experience with biomarkers is not always positive

    ` CD4 counts as a surrogate in AIDS trials: mixed performance as a predictor of clinical benefit

    ` Tumor size in cancer trials: experience runs both ways, and appears to depend both on tumor type and on class of treatments

    ` Experience in the CAST trial demonstrated that treatment with encainide/flecainide clearly reduced the incidence of arrhythmias, but increased mortality

    ` Similar results in the context of treating atrial fibrillation

    ` Blood pressure as surrogate: effect translates to clinical benefit for some drug classes, but not others

    ` Biomarker not on the causal pathway of the disease process

    ` Several pathways: the intervention affects the one mediated through the biomarker, but not others (redundancy)

    ` Biomarker not on the pathway affected by the intervention, or is insensitive to treatment effect

    ` Intervention has mechanisms of action unrelated to the disease process (aka the law of unintended consequences)

    ` Failure of either type is possible: the biomarker could falsely predict, or fail to predict, clinical benefit

    Other potential contributing factors include:

    Measurement difficulties due to rater effects

    GNE experience (K-interferon in renal cell carcinoma) strongly supports advisability of blinded tumor evaluation by a single central review board (avoid bias, minimize center differences)

    Measurement difficulties arising from sample preparation, transport, storage, and handling

    Time constraints in assaying fresh blood, possible effects of activation of T-cells, lack of standardization of FACS assay protocols and reporting methods, heterogeneity of tumor samples, center differences (use of local or central labs)

    Other potential assay-related difficulties include:

    Matrix effects

    Interference by other proteins can affect assay specificity and/or sensitivity

    Development of antibodies

    Can be hard to detect; harder to quantify reliably; extremely difficult to assess clinical significance, if any

    Inter-laboratory differences

    Can be large enough to make biomarker data uninterpretable

    ` Avoid the "what we can measure is what we should measure" fallacy

    ` Experience with imaging-based biomarkers to date has been disappointing

    ` Non-targeted genomic assays (e.g. microarrays followed by data mining) have the potential for much wasted effort

    ` Avoid the "rearranging the deckchairs on the Titanic" fix, e.g. straining to improve assay precision from a CV of 20% to 15% when the within-subject CV for the marker is 40% and the inter-subject CV is 50%.

    ` Cytokines make particularly treacherous biomarkers

    ` Proteomics is not for sissies

    ` Distinguish between "must know" and "nice to know"

    ` An understanding of mechanism of action may be nice to know, but is not a requirement for drug approval
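
The deckchairs example in the list above can be checked with a few lines of arithmetic. The sketch assumes the three sources of variation (assay, within-subject, inter-subject) are independent and that their CVs combine approximately as a root sum of squares; both are simplifying assumptions made here, and the function name is illustrative.

```python
import math


def total_cv(assay_cv, within_subject_cv, inter_subject_cv):
    """Approximate overall CV from independent components (root sum of squares)."""
    return math.sqrt(assay_cv ** 2 + within_subject_cv ** 2 + inter_subject_cv ** 2)


before = total_cv(0.20, 0.40, 0.50)
after = total_cv(0.15, 0.40, 0.50)
print(round(before, 3), round(after, 3))
```

Under these assumptions the overall CV moves only from about 67% to about 66%, so the effort spent tightening the assay buys almost nothing while biological variation dominates.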

    ` If the word "cascade" appears in the description of the disease process, all bets are off

    ` The topic of biomarkers seems to drive otherwise thoughtful researchers to an irrational frenzy of wishful thinking

    ` The message so eloquently expounded by Jagger et al. remains as relevant today as it was in 1969

    ` Lasagna's Law already militates against rapid accrual of eligible subjects to clinical trials

    ` To slow recruitment from a trickle to a complete grinding halt only two words are needed in the protocol: "serial biopsy"

    ` Utility of a particular biomarker depends not only on the disease, but also on the nature of the therapeutic intervention

    ` Validation of any candidate biomarker must necessarily be considered on a case-by-case basis

    ` Validity of a marker for a given drug class may not transfer to other drug classes for the same disease

    ` Success is most likely when the intervention clearly affects the biomarker, whose role in the disease process is well-established and clearly understood

    ` Validation of a putative marker cannot happen without ultimately generating the required clinical outcome data

    ` Regulatory conservatism is to be expected, and seems appropriate
