1

EPI-820 Evidence-Based Medicine

LECTURE 8: PROGNOSIS

Mat Reeves BVSc, PhD

2

Objectives:

• 1. Review definitions.
• 2. Understand the concept of natural history and inception cohort studies.
• 3. Define commonly used measures of prognosis.
• 4. Understand origins of bias in follow-up studies.
• 5. Understand basic statistical methodology for survival data.
• 6. Define characteristics of an ideal prognostic study.
• 7. Understand the rationale and development of clinical decision/prediction rules.

3

“Prediction is very difficult, especially about the future”

-Niels Bohr

4

I. Definitions

• Prognosis: the prediction of the future course of events following the onset of disease.
  – can include death, complications, remission/recurrence, morbidity, disability, and social or occupational function.

• Prognostic factors: factors associated with a particular outcome among diseased subjects.
  – examples include age, co-morbidities, tumor size, severity of disease, etc.
  – often different from disease risk factors, e.g., BMI and pre-menopausal breast CA.

5

I. Definitions

• Natural history: the evolution of disease without medical intervention.

• Clinical course: the evolution of disease in response to medical intervention.

6

Natural History Studies

• Degree to which natural history can be studied depends on the medical system and the type of disease.

• The natural history of some diseases can be studied because they:
  – remain unrecognized (i.e., asymptomatic), e.g., anemia, hypertension.
  – are considered “normal” discomforts, e.g., arthritis, mild depression.

7

Natural History Studies

• Natural history studies permit the development of rational strategies for:
  – early detection of disease
    • e.g., CIN and invasive cervical CA.
  – treatment of disease
    • e.g., middle ear infections.
    • DCIS
    • Hypertension

8

A. Study Designs

• Measuring natural history or clinical course of disease requires a cohort study:
  – Most often use a retrospective cohort.

• Exposed group = affected (diseased) patients followed to measure outcomes of interest.

9

Cohort Study Design

• To determine if the outcome is atypical, need to compare to a non-exposed group which should be:
  – obtained from the same source population that gave rise to the patients.
  – monitored with the same intensity.
• If an unaffected cohort is unavailable, use “outside” data, e.g., standardized incidence or mortality ratios for cancer studies.
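As a rough illustration of the “outside data” approach, the sketch below computes a standardized mortality ratio (SMR = observed deaths / expected deaths) in Python. The age strata, person-years, and reference rates are hypothetical values chosen to make the arithmetic easy to follow; they are not data from the lecture.

```python
def standardized_mortality_ratio(observed_deaths, person_years, reference_rates):
    """Return (SMR, expected deaths).

    person_years    -- follow-up time in the patient cohort, by stratum
    reference_rates -- death rate per person-year in the outside population,
                       using the same strata
    """
    expected = sum(person_years[s] * reference_rates[s] for s in person_years)
    return observed_deaths / expected, expected


# Hypothetical cohort of diseased patients, split into two age strata.
person_years = {"40-59": 1000, "60-79": 500}
reference_rates = {"40-59": 0.005, "60-79": 0.02}  # hypothetical outside rates

smr_value, expected_deaths = standardized_mortality_ratio(30, person_years, reference_rates)
print(f"Expected deaths: {expected_deaths:.1f}, SMR: {smr_value:.2f}")
```

An SMR above 1 suggests more deaths among the patients than the outside population would predict, subject to the usual caveat that the outside population may not be monitored with the same intensity.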

10

B. The Inception Cohort

• Prognosis studies usually involve taking a sample of diseased patients.

• Such cohort studies are very susceptible to bias because the clinical course of disease can be both prolonged and variable.

• A key factor in determining disease outcome: when did the time clock start?

11

Starting Point

• Starting point must be well defined and clearly specified:
  – The same point in the course of the disease for all individuals.
  – Ideal starting point is near the onset (“inception”) of disease = inception cohort.
  – Usually it is:
    • onset of symptoms, time at first diagnosis, or beginning of treatment.

12

Effect of Using a Defined Inception Cohort on Average Survival Time

[Figure: timelines for 5 cases over years 0 to 5, each showing a pre-clinical phase and a clinical phase, with the start of the cohort marked.]

a) No inception cohort: 5 cases identified at different points in time. Av. duration of survival = 14.5/5 = 2.9 yrs

b) Inception cohort: 5 cases identified at the same point in time. Av. duration of survival = 11/5 = 2.2 yrs

13

II. Bias in Follow-up Studies

• A. Selection or Confounding Bias
  – i) Assembly or susceptibility bias occurs when the exposed and non-exposed groups differ other than by the prognostic factors under study, and the extraneous factor affects the outcome of the study.
  – Examples:
    • differences in starting point of disease (survival cohort)
    • differences in stage or extent of disease, co-morbidities, prior treatment, age, gender, or race.

14

Survival Cohorts

• Survival cohort (or available patient cohort) studies can be very biased because:
  – a convenience sample of current patients is likely to be at various stages in the course of their disease.
  – individuals not accounted for have different experiences from those included, e.g., died soon after treatment.

• Not a true inception cohort e.g., retrospective case series.

• Very common!

15

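Survival Cohorts (Fletcher)

[Figure: flow diagram comparing a true cohort with a survival cohort.]

• True cohort: Assemble cohort (N = 150) → Measure outcomes: Improved = 75, Not improved = 75.
  – Observed improvement = 50%; true improvement = 50%.
• Survival cohort: Assemble patients → Begin follow-up (N = 50); Not observed (N = 100) → Measure outcomes: Improved = 40, Not improved = 10.
  – Dropouts (not observed): Improved = 35, Not improved = 65.
  – Observed improvement = 80%; true improvement = 50%.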
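The arithmetic behind the Fletcher figure can be checked with a short Python sketch. The counts are the ones shown on the slide; the dictionary layout and function name are just one convenient way to organize them.

```python
# Counts taken from the Fletcher survival-cohort figure on this slide.
true_cohort = {"improved": 75, "not_improved": 75}   # all 150 patients
observed = {"improved": 40, "not_improved": 10}      # the 50 who began follow-up
not_observed = {"improved": 35, "not_improved": 65}  # the 100 who were never observed


def proportion_improved(group):
    """Fraction of a group whose outcome was 'improved'."""
    return group["improved"] / (group["improved"] + group["not_improved"])


true_rate = proportion_improved(true_cohort)
observed_rate = proportion_improved(observed)
missed_rate = proportion_improved(not_observed)

print(f"True improvement:        {true_rate:.0%}")
print(f"Observed improvement:    {observed_rate:.0%}")
print(f"Among those not observed: {missed_rate:.0%}")
```

The survival cohort overstates improvement because the patients who were never observed did worse than those who were available for follow-up.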

16

A. Selection Bias

• ii) Migration bias
  – occurs when patients drop out of the study (lost-to-follow-up).
  – usually subjects drop out for a valid reason, e.g., death, recovery, side effects, or disinterest.
  – these factors are often related to prognosis.
  – assess extent of bias by using a best/worst case analysis.
  – patients can also cross over from one exposure group to another
    • if cross-over occurs at random = non-differential misclassification of exposure
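A best/worst case analysis can be sketched in a few lines of Python. The idea is to recompute the outcome risk twice: once assuming none of the patients lost to follow-up had the event (best case) and once assuming all of them did (worst case). The cohort size and event counts below are hypothetical.

```python
def best_worst_case(enrolled, followed, events):
    """Bound the event risk when some patients are lost to follow-up.

    enrolled -- patients who entered the cohort
    followed -- patients with known outcome
    events   -- events observed among those followed
    """
    lost = enrolled - followed
    return {
        "observed": events / followed,              # ignores the lost patients
        "best_case": events / enrolled,             # no lost patient had the event
        "worst_case": (events + lost) / enrolled,   # every lost patient had the event
    }


# Hypothetical study: 100 enrolled, 80 followed to the end, 20 deaths observed.
bounds = best_worst_case(enrolled=100, followed=80, events=20)
for label, risk in bounds.items():
    print(f"{label:>10}: {risk:.0%}")
```

If the conclusion of the study would be the same at both bounds, losses to follow-up are unlikely to matter; a wide gap signals that migration bias could change the result.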

17

A. Selection Bias

• iii) Generalizability bias
  – related to the selective referral of patients to tertiary (academic) medical centers.
  – highly selected patient pools have a different clinical spectrum of disease.
  – influences generalizability (see Motulsky, 1978; Melton, 1985).

18

II. Bias in Follow-Up Studies

• B. Measurement bias
  – Measurement (or assessment) bias occurs when one group has a higher (or lower) probability of having their outcome measured or detected.
  – likely for softer outcomes:
    • side effects, mild disabilities, subclinical disease, or
    • the specific cause of death.

19

B. Measurement bias

• Measurement bias can be minimized by:

• ensuring observers are blinded to the exposure status of the patients.

• using careful criteria (definitions) for all outcome events.

• applying equally rigorous efforts to ascertain all events in both exposure groups.

20

III. Commonly Used Measures of Prognosis

• A. 5-year Survival Rate

$$\text{5-year survival rate} = \frac{\text{Number of individuals who survived from } t_0 \text{ to } t_0 + 5 \text{ years}}{\text{Number of individuals with disease at } t_0}$$

– typically $t_0$ is the point of initial diagnosis or treatment.

– the complement of a cumulative incidence: 1 − risk of death at 5 years (a proportion, not a true rate).

– measures the proportion of the original patient population alive at 5 years.

– easy to interpret and remember, but fails to indicate the rate of death.

21

Fig. Limitation of 5-year Survival Rates (From Fletcher)

[Figure: % surviving (0 to 100) plotted against years (0 to 5), with separate curves for age at 100 yrs, aneurysm, AIDS, and CML.]

22

B. Case-fatality Rate (CFR)

$$\text{CFR} = \frac{\text{Number of individuals who die during } t_0 \text{ to } t_0 + 1}{\text{Number of individuals with disease at } t_0}$$

– specific type of cumulative incidence (= proportion, not a rate).

– measures the risk of death among those individuals who develop disease.

– must have an explicit (or implicit) time period that is sufficiently long to ensure that all relevant events have been observed, e.g., Legionnaires' disease, stroke, CML.

– Same principles apply to response, remission, and recurrence rates.

23

C. Mortality or Death Rate (MR)

$$\text{MR} = \frac{\text{Number of individuals who die during } t_0 \text{ to } t_0 + 1}{\text{Total population time-at-risk during } t_0 \text{ to } t_0 + 1}$$

– defined as the incidence rate of death per "population time"
– denominator is population time-at-risk.
– measures the speed of death due to a specific disease
– distinguish from the case-fatality rate! Example:
  • Stroke 28-day CFR = 23%
  • Stroke MR = 63/100,000
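To make the difference in denominators concrete, here is a minimal Python sketch that computes a case-fatality rate and a mortality rate from the same number of deaths. The counts are hypothetical, picked only so that the results land near the stroke figures quoted on the slide; they are not the data behind those figures.

```python
def case_fatality_rate(deaths, cases):
    """Proportion of diseased individuals who die (denominator = cases)."""
    return deaths / cases


def mortality_rate(deaths, person_time, per=100_000):
    """Deaths per unit of population time-at-risk (denominator = person-time)."""
    return deaths / person_time * per


# Hypothetical community: 100,000 person-years at risk, 274 stroke cases, 63 stroke deaths.
deaths, cases, person_years = 63, 274, 100_000

cfr = case_fatality_rate(deaths, cases)
mr = mortality_rate(deaths, person_years)

print(f"CFR: {cfr:.0%} of stroke patients died")
print(f"MR:  {mr:.0f} deaths per 100,000 person-years")
```

The same 63 deaths give a large CFR and a small MR because one divides by the patients with disease and the other by the time-at-risk of the whole population.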

24

IV. Statistical Methods Used in Prognosis Studies

• A. Analysis of Survival or Failure Time Data
  – primary end point of prognosis studies is time until the event of interest occurs, e.g., death or relapse.
  – analysis of survival or failure time data requires specific techniques:
    • Kaplan-Meier estimator
    • Log-rank test
    • Cox proportional hazards regression model.

25

Censoring:

• Defn: when the event of interest is not observed in all individuals because:
  – study was stopped before everyone in the study had the event
  – loss to follow-up
  – death from other (competing) causes, e.g., road traffic accidents

26

Censoring

• The standard survival methods (Kaplan-Meier, log-rank, Cox) assume censoring is non-informative:

• reason for incomplete observation is not related to the underlying risk of failure.

• therefore, survival experience of those “lost to follow-up” is assumed to be the same as those that remain.

27

Survival function (S(t))

• Defn: the probability of survival to a given point in time (t)

• S(3) = 0.60 indicates that 60% of the population survived 3 years.

• Graphically displayed using a "life table" or "survival curve."

• Median survival time = the time at which half the patients have "failed"
  – a crude but common measure of survival

28

A Typical Survival Curve Showing the Survival Function (S(t)) Plotted Against Time with a Median Survival Time of 1.25 Years

[Figure: S(t) (0 to 1) plotted against years (0 to 5).]

29

Hazard function (h(t))

• Defn: the instantaneous risk (rate) of an event at a specific moment in time (t), given the patient has already survived to that point in time.
  – closely linked to the survival function.
  – indicates the probability of the patient "failing" during the next (short) time period.
  – a direct measure of prognosis.
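The link between the hazard and survival functions can be written out explicitly. The relationships below are the standard textbook ones for a continuous survival time; the symbols $T$ (the random time to event) and $\Delta t$ (a short interval) are introduced here for illustration and do not appear on the slide.

```latex
h(t) = \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
     = -\frac{d}{dt} \ln S(t),
\qquad
S(t) = \exp\!\left( -\int_0^t h(u)\, du \right)
```

Read this way, a constantly high hazard drives S(t) down quickly, while a hazard near zero leaves the survival curve nearly flat.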

30

Kaplan-Meier Estimator

• a widely accepted method of estimating S(t).
• S(t) is expressed as the product of conditional probabilities, e.g.,
  – S(3) = S(1) × S(2|1) × S(3|2)
  – where:
    • S(1) = probability of surviving year 1
    • S(2|1) = conditional probability of surviving year 2, given survival to year 1.
    • S(3|2) = conditional probability of surviving year 3, given survival to year 2.
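A small worked example may help fix the idea. The yearly probabilities are hypothetical, and the general product-limit form uses $d_i$ (failures at time $t_i$) and $n_i$ (subjects at risk just before $t_i$), notation that is assumed here rather than taken from the slide.

```latex
% Hypothetical yearly values: S(1) = 0.90, S(2|1) = 0.80, S(3|2) = 0.75
S(3) = S(1) \times S(2 \mid 1) \times S(3 \mid 2)
     = 0.90 \times 0.80 \times 0.75 = 0.54

% General product-limit (Kaplan-Meier) form
\hat{S}(t) = \prod_{i:\, t_i \le t} \left( 1 - \frac{d_i}{n_i} \right)
```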

31

Kaplan-Meier Estimator

• estimators or "curves" begin at time zero with S(t) = 1 and then decrease in a series of steps corresponding to observed times of failure.

• censored observations contribute to the survival probability estimates up until the time of censoring.

• no assumptions made about the shape of the survival or hazard function (= a non-parametric technique).

• variability of estimates is greatest at the right-hand tails of the curves, where few subjects remain at risk and few failures occur.
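The properties listed on this slide can be seen in a minimal, from-scratch Python sketch of the Kaplan-Meier calculation. It is an illustration under simple assumptions (one group, no confidence intervals), not a substitute for a statistics package, and the six follow-up times are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, S(t))] at each distinct failure time.

    times  -- follow-up time for each subject
    events -- 1 if the event was observed, 0 if the subject was censored
    """
    failures = Counter(t for t, e in zip(times, events) if e == 1)
    survival, curve = 1.0, []
    for t in sorted(failures):
        # Censored subjects count as at risk up to their censoring time.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1 - failures[t] / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First failure time at which S(t) drops to 0.5 or below (None if never)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical data: follow-up in years; event = 0 marks a censored subject.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"S({t}) = {s:.3f}")
print("Median survival time:", median_survival(curve))
```

The curve only steps down at observed failure times, and the subject censored at year 2 still contributes to the denominator for the failure at year 2 before leaving the risk set.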

32

Kaplan-Meier Estimators of the Survival Function (S(t)) for Two Groups (Treatment and Control).

[Figure: S(t) (0 to 1) plotted against years (0 to 5), with separate curves for controls and treatment.]

33

Log-rank test

• a statistical test of the difference in survival distributions (see Peto et al, 1977).

• at each observed time of failure, compares the observed number of events in one group to the expected number (based on identical hazard functions for the two groups).

• gives equal weight to differences at each point in time.

• if it makes sense to place greater emphasis on differences at earlier time periods then use the generalized Wilcoxon test (Cox and Oakes, 1984).
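The observed-versus-expected comparison described above can be sketched in plain Python for two groups. This is a bare-bones illustration of the usual log-rank chi-square statistic with 1 degree of freedom, using the hypergeometric variance at each failure time; the function name and the follow-up data are hypothetical.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value) on 1 df."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)] + \
             [(t, e, 1) for t, e in zip(times_b, events_b)]
    failure_times = sorted({t for t, e, _ in pooled if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in failure_times:
        n = sum(1 for u, _, _ in pooled if u >= t)               # at risk, both groups
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)  # at risk, group A
        d = sum(1 for u, e, _ in pooled if u == t and e == 1)    # failures at t
        d_a = sum(1 for u, e, g in pooled if u == t and e == 1 and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n      # expected under identical hazards
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("no information to compare the groups")
    chi_square = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2))  # chi-square tail area, 1 df
    return chi_square, p_value


# Hypothetical follow-up times in years (event = 1 means failure observed).
control_times, control_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
treated_times, treated_events = [6, 7, 8, 9, 10], [1, 1, 1, 1, 0]

chi_square, p_value = logrank_test(control_times, control_events,
                                   treated_times, treated_events)
print(f"chi-square = {chi_square:.2f}, p = {p_value:.4f}")
```

Every failure time contributes with the same weight, which is the "equal weight" property noted above; a generalized Wilcoxon test would instead weight each term by the number still at risk.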

34

Cox proportional hazards model

• Very powerful regression modeling technique based on the hazard function (see Tibshirani 1982).
  – allows the full flexibility of regression analysis to be applied to survival data.
  – in its simplest form it is an extension of the log-rank test.

• Advantages:
  – ability to handle a large number of prognostic variables (both discrete and continuous)
  – can adjust for confounding variables, and
  – evaluate interaction effects
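For reference, the model is usually written in the following form. The notation is the conventional one and is assumed here: $h_0(t)$ is an unspecified baseline hazard, $x_1, \dots, x_p$ are the prognostic variables, and $\beta_1, \dots, \beta_p$ are the regression coefficients.

```latex
\begin{aligned}
h(t \mid x_1, \dots, x_p) &= h_0(t)\, \exp(\beta_1 x_1 + \dots + \beta_p x_p) \\
\text{HR for a one-unit increase in } x_j &= \frac{h(t \mid x_j + 1)}{h(t \mid x_j)} = e^{\beta_j}
\end{aligned}
```

The hazard ratio $e^{\beta_j}$ does not depend on $t$, which is the "proportional hazards" assumption; each coefficient is adjusted for the other variables in the model.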

35

B. Statistical Control of Common (Selection) Biases

• Prognostic studies are essentially observational studies that focus on survival (or some other outcome).

• Techniques to control biases are therefore the same as used in observational epidemiology.

36

Table. Methods for Controlling Selection Bias (from Fletcher)

| Method | Description | Phase of study: Design | Phase of study: Analysis |
|---|---|---|---|
| Randomization | Random assignment ensures that known and unknown confounders are equally distributed between exposure groups (this is rarely feasible, however, unless a specific RCT designed to evaluate some aspect of prognosis is being conducted). | + | |
| Restriction | If a strong confounding factor is known, such as age or sex, limit the range of the characteristics of patients in the study. | + | |
| Matching | Match exposure groups on the basis of important prognostic variables, such as stage of disease, age, or sex. | + | |
| Stratification | Compare event rates within subgroups (strata) with otherwise similar probability of outcomes, e.g., sex- or age-group-specific rates. | | + |

37

Table. Adjustment Procedures to Control Selection Bias

| Method | Description | Phase of study: Design | Phase of study: Analysis |
|---|---|---|---|
| Adjustment: simple | Mathematically adjust crude rates for a characteristic known to be an important prognostic factor, e.g., age adjustment. | | + |
| Adjustment: multiple | Use mathematical models to adjust risk estimates for several prognostic variables (Cox regression). | | + |
| Sensitivity analysis | Describe how the results could differ by changing the values of known prognostic factors over plausible ranges. Best/worst case analysis is an example. | | + |
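Simple adjustment can be illustrated with direct age standardization. In the Python sketch below, two hypothetical patient groups have identical age-specific death risks but different age mixes, so their crude risks differ until both are applied to a common standard population. All numbers and names are invented for the example.

```python
def crude_rate(stratum_rates, stratum_counts):
    """Overall event rate implied by stratum-specific rates and group sizes."""
    events = sum(stratum_rates[s] * stratum_counts[s] for s in stratum_rates)
    return events / sum(stratum_counts.values())


def directly_standardized_rate(stratum_rates, standard_counts):
    """Apply a group's stratum-specific rates to a shared standard population."""
    return crude_rate(stratum_rates, standard_counts)


# Hypothetical 5-year death risks by age group (the same in both groups).
rates_a = {"<65": 0.10, "65+": 0.40}
rates_b = {"<65": 0.10, "65+": 0.40}

# Hypothetical age mix: group A is mostly young, group B mostly old.
counts_a = {"<65": 80, "65+": 20}
counts_b = {"<65": 20, "65+": 80}
standard = {"<65": 50, "65+": 50}  # hypothetical standard population

crude_a, crude_b = crude_rate(rates_a, counts_a), crude_rate(rates_b, counts_b)
adj_a = directly_standardized_rate(rates_a, standard)
adj_b = directly_standardized_rate(rates_b, standard)

print(f"Crude:    A = {crude_a:.0%}, B = {crude_b:.0%}")
print(f"Adjusted: A = {adj_a:.0%}, B = {adj_b:.0%}")
```

The crude difference disappears after adjustment, showing that it was produced by age (an extraneous prognostic factor) and not by group membership.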

38

V. Ideal Characteristics of Prognostic Studies

• 1. Was the sample well defined and representative of a definable underlying population? Was the referral pattern well described?

• 2. Was an inception cohort assembled? Were all the study patients at a similar well defined point in the course of their disease?

• 3. Was the follow-up complete and sufficiently long?

39

V. Ideal Characteristics of Prognostic Studies

• 4. Were objective and unbiased outcome criteria used?

• 5. Was the outcome assessment blind?

• 6. Was adjustment for extraneous prognostic factors carried out?

40

Editorial Readings

• Melton
  – What is selection bias, and how can it affect the conclusions of studies?

• Motulsky
  – Why did the author place such emphasis on understanding the selection method?

41

VI. Clinical Decision Rules (CDR)

• clinical tools that combine history, physical examination, and simple diagnostic tests to aid in diagnostic, prognostic or treatment decisions

• Outcomes:
  – Probability of disease/event (risk)
    • e.g., APGAR, APACHE, CVD risk prediction (Framingham), colic prognosis
  – Diagnostic/treatment decision
    • e.g., breast biopsy decisions, breast CA risk (Gail model), colic surgery

42

CDRs – 3 Step Development Process

• Step 1 - Derivation:
  – Identify important (predictive) variables
  – Use statistical methods (logistic regression, recursive partitioning, neural networks), or pick variables based on expert opinion
  – Initial statistical testing (validation)
    • Split sample (development/training and test sets)
    • Bootstrap techniques
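The two internal validation approaches named in Step 1 can be sketched with the Python standard library. The helper names, the 70/30 split, and the use of integers as stand-ins for patient records are all assumptions made for illustration; in practice the rule would be derived on one part and its accuracy measured on the other.

```python
import random


def split_sample(records, derivation_fraction=0.7, seed=0):
    """Randomly divide records into a derivation set and a test set."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * derivation_fraction)
    return shuffled[:cut], shuffled[cut:]


def bootstrap_samples(records, n_samples=200, seed=0):
    """Yield resamples of the same size, drawn with replacement."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        yield [rng.choice(records) for _ in records]


# Hypothetical dataset: 100 patient records, represented here by their IDs.
patients = list(range(100))

derivation_set, test_set = split_sample(patients)
print(len(derivation_set), "patients to derive the rule,", len(test_set), "to test it")

resamples = list(bootstrap_samples(patients, n_samples=5))
print("Distinct patients in first bootstrap resample:", len(set(resamples[0])))
```

Both approaches reuse the development population, so they cannot replace the prospective validation described in Step 2.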

43


• Step 2 – Validation
  – Usually prospective; validation required because:
    • CDR accuracy may be specific to the development population (because of severity & disease prevalence)
    • CDR may not be applied in the same manner in other populations
  – Narrow: Application to similar patient population.
  – Broad: Application to different populations with varying prevalence and disease spectrum.

44


• Step 3 – Impact Analysis
  – Required because reluctance to use CDRs is common. Why?
    • Concern about different patient population/settings
    • Risk of false negatives (esp. legal concerns)
    • Rules are complicated or take too long to use
    • Doesn't provide a course of action (just a probability!!)
  – Test effect of CDR on physician behavior, clinical practice, and patient outcomes
  – Rarely done!
  – Ideal = randomize individual patients (difficult) or practices.
  – Or, evaluate using a pre–post design

45

CDR - Hierarchy of Evidence

| Level | Minimum Evidence Required | Recommended Use |
|---|---|---|
| 1 | Accuracy and applicability demonstrated in BROAD prospective validation studies in different populations, PLUS impact analysis. | Widespread use. High confidence that can change practice or outcomes. |
| 2 | Accuracy demonstrated in BROAD prospective validation studies in different populations | Widespread use. High confidence. |
| 3 | Accuracy demonstrated in single NARROW prospective validation study | Use with caution in similar settings |
| 4 | Statistical validation or retrospective validation only | Further evaluation required before use |

46

CDR – Methodological Standards (McGinn, JAMA 2000)

• Were all important predictors included in the derivation process?
  – = content validity
  – were they collected prospectively in a blinded fashion for the purposes of CDR development?
  – every patient included? minimal missing data?
  – were predictors present in a large proportion of the study population?

• Was the patient population and setting well defined?
  – age, sex, referral filter

• Were all outcome events clearly defined?
  – Are they of clinical importance?
  – Were they determined blindly (independent of predictors)?

47

CDR – Methodological Standards (McGinn, JAMA 2000)

• Were appropriate statistical methods used?
  – Adequate sample size to avoid over-fitting (need at least 10 outcome events per predictor variable)

• Were results of CDR appropriate and clear?
  – Se/Sp or ROC curves
  – usually want high Se (to avoid FNs)
  – PVs of more use to clinicians (prevalence dependent)
  – LRs?
  – Prob (outcome) = survival curves

• Was reproducibility of predictors and the rule itself assessed?
  – Many S/S are not very reliable
  – Concerned with inter-observer variability (kappa, κ)
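The reporting measures named on this slide are easy to compute from a 2×2 table. The Python sketch below covers sensitivity, specificity, predictive values, likelihood ratios, Cohen's kappa for two observers, and the 10-events-per-variable rule of thumb. The function names and all counts are hypothetical.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Se, Sp, predictive values, and likelihood ratios from a 2x2 table."""
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    return {
        "sensitivity": se,
        "specificity": sp,
        "ppv": tp / (tp + fp),          # depends on prevalence
        "npv": tn / (tn + fn),          # depends on prevalence
        "lr_positive": se / (1 - sp),
        "lr_negative": (1 - se) / sp,
    }


def cohen_kappa(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Chance-corrected agreement between two observers on a yes/no finding."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    p_observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_yes_b_no) / n
    b_yes = (both_yes + a_no_b_yes) / n
    p_chance = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return (p_observed - p_chance) / (1 - p_chance)


def max_candidate_predictors(outcome_events, events_per_variable=10):
    """Rule of thumb: at least 10 outcome events per predictor variable."""
    return outcome_events // events_per_variable


# Hypothetical validation study of a rule: 100 with the outcome, 200 without.
measures = accuracy_measures(tp=90, fp=40, fn=10, tn=160)
for name, value in measures.items():
    print(f"{name:>12}: {value:.3f}")

# Hypothetical agreement between two clinicians rating one sign in 100 patients.
kappa = cohen_kappa(both_yes=40, a_yes_b_no=10, a_no_b_yes=10, both_no=40)
print(f"kappa = {kappa:.2f}")
print("Predictors supportable with 100 events:", max_candidate_predictors(100))
```

Sensitivity, specificity, and likelihood ratios are computed within the diseased and non-diseased columns, so they do not depend on how common the outcome is, whereas the predictive values would change if the same rule were applied where the outcome is rarer or more common.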