1
EPI-820 Evidence-Based Medicine
LECTURE 8: PROGNOSIS
Mat Reeves BVSc, PhD
2
Objectives:
• 1. Review definitions.
• 2. Understand concept of natural history and inception cohort studies.
• 3. Define commonly used measures of prognosis.
• 4. Understand origins of bias in follow-up studies.
• 5. Understand basic statistical methodology for survival data.
• 6. Define characteristics of an ideal prognostic study.
• 7. Understand the rationale and development of clinical decision/prediction rules.
3
“Prediction is very difficult, especially about the future”
-Niels Bohr
4
I. Definitions
• Prognosis: the prediction of the future course of events following the onset of disease.
  • can include death, complications, remission/recurrence, morbidity, disability and social or occupational function.
• Prognostic factors: factors associated with a particular outcome among diseased subjects.
  • examples include age, co-morbidities, tumor size, severity of disease etc.
  • often different from disease risk factors e.g., BMI and pre-menopausal breast CA.
5
I. Definitions
• Natural history: the evolution of disease without medical intervention.
• Clinical course: the evolution of disease in response to medical intervention.
6
Natural History Studies
• Degree to which natural history can be studied depends on the medical system and the type of disease.
• The natural history of some diseases can be studied because they:
  – remain unrecognized (i.e., asymptomatic) e.g., anemia, hypertension.
  – are considered “normal” discomforts e.g., arthritis, mild depression.
7
Natural History Studies
• Natural history studies permit the development of rational strategies for:
  • early detection of disease
    – e.g., CIN and invasive cervical CA.
  • treatment of disease
    – e.g., middle ear infections.
    – DCIS
    – Hypertension
8
A. Study Designs
• Measuring natural history or clinical course of disease requires a cohort study:
  • Most often a retrospective cohort is used.
• Exposed group = affected (diseased) patients followed to measure outcomes of interest.
9
Cohort Study Design
• To determine if outcome is atypical, need to compare to a non-exposed group which should be:
  • obtained from the same source population that gave rise to the patients.
  • monitored with the same intensity.
• If unaffected cohort unavailable, use “outside” data e.g., standardized incidence or mortality ratios for cancer studies.
10
B. The Inception Cohort
• Prognosis studies usually involve taking a sample of diseased patients.
• Such cohort studies are very susceptible to bias because the clinical course of disease can be both prolonged and variable.
• Key factor in determining disease outcome: when did the time clock start?
11
Starting Point
• Starting point must be well defined and clearly specified:
  • The same point in the course of the disease for all individuals.
  • Ideal starting point is near the onset (“inception”) of disease = inception cohort.
  • Usually it is:
    – onset of symptoms, time at first diagnosis, or beginning of treatment.
12
Effect of Using a Defined Inception Cohort on Average Survival Time
(Figure: each panel follows 5 cases along a time axis of 0 to 5 years, marking the pre-clinical phase, the clinical phase and the point at which the cohort starts.)
a) No inception cohort: 5 cases identified at different points in time. Av. duration of survival = 14.5/5 = 2.9 yrs.
b) Inception cohort: 5 cases identified at the same point in time. Av. duration of survival = 11/5 = 2.2 yrs.
13
II. Bias in Follow-up Studies
• A. Selection or Confounding Bias
  • i) Assembly or susceptibility bias occurs when the exposed and non-exposed groups differ other than by the prognostic factors under study, and the extraneous factor affects the outcome of the study.
  • Examples:
    • differences in starting point of disease (survival cohort)
    • differences in stage or extent of disease, co-morbidities, prior treatment, age, gender, or race.
14
Survival Cohorts
• Survival cohort (or available patient cohort) studies can be very biased because:
  • a convenience sample of current patients is likely to be at various stages in the course of their disease.
  • individuals not accounted for have different experiences from those included e.g., died soon after treatment.
• Not a true inception cohort e.g., retrospective case series.
• Very common!
15
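Survival Cohorts (Fletcher)
(Figure: flow diagram comparing a true cohort with a survival cohort.)
True cohort: assemble cohort (N = 150) → measure outcomes (improved = 75, not improved = 75). Observed improvement = 50%; true improvement = 50%.
Survival cohort: assemble patients → begin follow-up (N = 50); not observed (N = 100; dropouts: improved = 35, not improved = 65) → measure outcomes (improved = 40, not improved = 10). Observed improvement = 80%; true improvement = 50%.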
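The arithmetic behind the figure can be checked with a short sketch. The counts come from the slide; the function and variable names are illustrative only.

```python
# Counts from the Fletcher survival-cohort figure.
followed_improved, followed_not_improved = 40, 10   # patients available for follow-up
dropout_improved, dropout_not_improved = 35, 65     # patients never observed

def proportion_improved(improved, not_improved):
    """Proportion of a group whose outcome was 'improved'."""
    return improved / (improved + not_improved)

# Survival cohort: only the 50 available patients are counted.
observed = proportion_improved(followed_improved, followed_not_improved)

# Inception cohort: all 150 patients are counted.
true = proportion_improved(followed_improved + dropout_improved,
                           followed_not_improved + dropout_not_improved)

print(f"Observed improvement (survival cohort): {observed:.0%}")
print(f"True improvement (full cohort):         {true:.0%}")
```

The gap between the two figures arises entirely from who is left out of the denominator, not from any difference in treatment.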
16
A. Selection Bias
• ii) Migration bias
  • occurs when patients drop out of the study (lost-to-follow-up).
  • usually subjects drop out for a valid reason
    – e.g., death, recovery, side effects or disinterest.
  • these factors are often related to prognosis.
  • assess extent of bias by using a best/worst case analysis.
  • patients can also cross over from one exposure group to another
    – if cross-over occurs at random = non-differential misclassification of exposure
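A best/worst case analysis can be sketched as follows. The study counts are hypothetical, and the function name is an assumption made for illustration; the idea is simply to recompute the event proportion under the two extreme assumptions about the patients lost to follow-up.

```python
def best_worst_case(events_observed, n_followed, n_lost):
    """Return (best, observed, worst) event proportions.

    best  : assume none of the lost patients had the event
    worst : assume all of the lost patients had the event
    """
    n_total = n_followed + n_lost
    observed = events_observed / n_followed
    best = events_observed / n_total
    worst = (events_observed + n_lost) / n_total
    return best, observed, worst

# Hypothetical study: 100 enrolled, 80 followed, 20 lost, 16 deaths observed.
best, observed, worst = best_worst_case(events_observed=16, n_followed=80, n_lost=20)
print(f"best {best:.0%}, observed {observed:.0%}, worst {worst:.0%}")
```

If the conclusion of the study holds at both extremes, losses to follow-up are unlikely to explain the result; a wide gap between best and worst case signals that they could.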
17
A. Selection Bias
• iii) Generalizability bias
  • related to the selective referral of patients to tertiary (academic) medical centers.
  • a highly selected patient pool has a different clinical spectrum of disease.
  • influences generalizability (see Motulsky, 1978; Melton, 1985).
18
II. Bias in Follow-Up Studies
• B. Measurement bias
  • Measurement (or assessment) bias occurs when one group has a higher (or lower) probability of having their outcome measured or detected.
    – likely for softer outcomes
      • side effects, mild disabilities, subclinical disease or
      • the specific cause of death.
19
B. Measurement bias
• Measurement bias can be minimized by:
• ensuring observers are blinded to the exposure status of the patients.
• using careful criteria (definitions) for all outcome events.
• applying equally rigorous efforts to ascertain all events in both exposure groups.
20
III. Commonly Used Measures of Prognosis
• A. 5-year Survival Rate
• 5-year survival rate = (Number of individuals who survived from t0 to t0 + 5 years) / (Number of individuals with disease at t0)
  – typically t0 is the point of initial diagnosis or treatment.
  – the complement of the cumulative incidence of death (= 1 − risk of death at 5 years); a proportion, not a true rate.
  – measures the proportion of the original patient population alive at 5 years.
  – easy to interpret and remember, but fails to indicate the rate at which deaths occur during the 5 years.
21
Fig. Limitation of 5-year Survival Rates (From Fletcher)
(Figure: survival curves plotting % surviving, 0 to 100, against years, 0 to 5, for four conditions: age at 100 yrs, aneurysm, AIDS and CML.)
22
B. Case-fatality Rate (CFR)
• CFR = (Number of individuals who die during t0 to t0 + 1) / (Number of individuals with disease at t0)
  – specific type of cumulative incidence (= proportion, not a rate).
  – measures the risk of death among those individuals who develop disease.
  – must have an explicit (or implicit) time period that is sufficiently long to ensure that all relevant events have been observed e.g., Legionnaires' disease, stroke, CML.
  – Same principles apply to response, remission, and recurrence rates.
23
C. Mortality or Death Rate (MR)
• MR = (Number of individuals who die during t0 to t0 + 1) / (Total population time-at-risk during t0 to t0 + 1)
  – defined as the incidence rate of death per unit of "population time"
  – denominator is population time-at-risk.
  – measures the speed of death due to a specific disease
  – distinguish from the case-fatality rate! Example:
    • Stroke 28 day CFR = 23%
    • Stroke MR = 63/100,000
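The difference between the two measures is mostly a difference of denominators, which a minimal sketch makes concrete. The counts below are hypothetical, chosen only so that the results match the stroke figures on the slide; the "per year" time unit for the mortality rate is an assumption, since the slide does not state one.

```python
def case_fatality_rate(deaths_among_cases, n_cases):
    """Proportion of cases who die within the stated period (a risk between 0 and 1)."""
    return deaths_among_cases / n_cases

def mortality_rate(deaths, person_years, per=100_000):
    """Deaths per `per` person-years at risk in the whole population."""
    return deaths / person_years * per

# Hypothetical counts (not from the lecture).
cfr = case_fatality_rate(deaths_among_cases=230, n_cases=1_000)   # denominator: stroke cases
mr = mortality_rate(deaths=630, person_years=1_000_000)           # denominator: population time

print(f"28-day CFR = {cfr:.0%}; MR = {mr:.0f} per 100,000 person-years")
```

The CFR answers "if I have a stroke, how likely am I to die of it?", while the MR answers "how fast is stroke killing people in this population?".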
24
IV. Statistical Methods Used in Prognosis Studies
• A. Analysis of Survival or Failure Time Data
  • primary end point of prognosis studies is time until event of interest occurs e.g., death or relapse.
  • analysis of survival or failure time data requires specific techniques:
    – Kaplan-Meier estimator
    – Log-rank test
    – Cox proportional hazards regression model.
25
Censoring:
• Defn: when the event of interest is not observed in all individuals because:
  • study was stopped before everyone in the study had the event
  • loss to follow-up
  • death from other (competing) causes e.g., road traffic accidents
26
Censoring
• The standard survival methods (Kaplan-Meier, log-rank, Cox) assume censoring is non-informative:
  • reason for incomplete observation is not related to the underlying risk of failure.
  • therefore, survival experience of those “lost to follow-up” is assumed to be the same as those that remain.
27
Survival function (S(t))
• Defn: the probability of survival to a given point in time (t)
• S(3) = 0.60 indicates that 60% of the population survived 3 years.
• Graphically displayed using a "life table" or "survival curve."
• Median survival time = the time at which half the patients have "failed"
  • a crude but common measure of survival
28
A Typical Survival Curve Showing the Survival Function (S(t)) Plotted Against Time with a Median Survival Time of 1.25 Years
(Figure: y-axis S(t), 0 to 1; x-axis years, 0 to 5.)
29
Hazard function (h(t))
• Defn: the instantaneous rate (risk per unit time) of an event at a specific moment in time (t), given the patient has already survived to that point in time.
  • closely linked to the survival function.
  • indicates the probability of the patient "failing" during the next (short) time period.
  • a direct measure of prognosis.
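The link between the hazard and survival functions can be written out explicitly. This is the standard continuous-time relationship; T (the failure time) and f(t) (the density of failure times) are symbols introduced here and do not appear on the slide.

```latex
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
     = \frac{f(t)}{S(t)}
     = -\frac{d}{dt} \ln S(t),
\qquad
S(t) = \exp\!\left( -\int_0^t h(u)\, du \right)
```

Knowing either function therefore determines the other: a constant hazard, for example, gives an exponentially declining survival curve.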
30
Kaplan-Meier Estimator
• a widely accepted method of estimating S(t).
• S(t) is expressed as the product of conditional probabilities e.g.,
  • S(3) = S(1) x S(2|1) x S(3|2)
• where:
  – S(1) = probability of surviving year 1
  – S(2|1) = conditional probability of surviving year 2, given survival to year 1.
  – S(3|2) = conditional probability of surviving year 3, given survival to year 2.
31
Kaplan-Meier Estimator
• estimators or "curves" begin at time zero with S(t) = 1 and then decrease in a series of steps corresponding to observed times of failure.
• censored observations contribute to the survival probability estimates up until the time of censoring.
• no assumptions made about the shape of the survival or hazard function (= a non-parametric technique).
• variability of estimates is greatest at the right-hand end (tail) of the curves, where few subjects remain at risk and few failures occur.
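The product-of-conditional-probabilities idea can be sketched in a few lines of plain Python. The follow-up data are hypothetical and the function names are illustrative; in practice a statistical package would be used, but the sketch shows how censored subjects stay in the risk set until they leave.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Kaplan-Meier estimate of S(t).

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, S(time)) at each distinct failure time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    curve, s = [], 1.0
    for t in sorted(deaths):
        at_risk = sum(1 for u in times if u >= t)   # censored subjects count until they leave
        s *= 1 - deaths[t] / at_risk                # conditional probability of surviving t
        curve.append((t, s))
    return curve

def median_survival(curve):
    """First time at which S(t) falls to 0.5 or below (None if never reached)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical follow-up times in years; an event flag of 0 marks a censored observation.
times  = [1, 2, 2, 3, 4, 5, 5, 6]
events = [1, 1, 0, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(times, events)
for t, s in km_curve:
    print(f"S({t}) = {s:.3f}")
print("Median survival:", median_survival(km_curve))
```

The curve only steps down at observed failure times; censored observations shrink the risk set without producing a step.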
32
Kaplan-Meier Estimators of the Survival Function (S(t)) for Two Groups (Treatment and Control).
(Figure: y-axis S(t), 0 to 1; x-axis years, 0 to 5; one curve each for controls and treatment.)
33
Log-rank test
• a statistical test of the difference in survival distributions (see Peto et al, 1977).
• at each observed time of failure, compares the observed number of events in one group to the expected number (based on identical hazard functions for the two groups).
• gives equal weight to differences at each point in time.
• if it makes sense to place greater emphasis on differences at earlier time periods then use the generalized Wilcoxon test (Cox and Oakes, 1984).
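The observed-versus-expected comparison described above can be sketched as a two-group log-rank test. This is a minimal illustration with hypothetical data and an illustrative function name, using the usual hypergeometric variance and a 1-degree-of-freedom chi-square reference distribution; it is not meant to replace a statistical package.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square with 1 df, p-value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)] + \
           [(t, e, 1) for t, e in zip(times_b, events_b)]
    failure_times = sorted({t for t, e, _ in data if e == 1})
    obs_minus_exp, variance = 0.0, 0.0
    for t in failure_times:
        n = sum(1 for u, _, _ in data if u >= t)               # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e == 1)    # events, both groups
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        obs_minus_exp += d_a - d * n_a / n                     # observed - expected in group A
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))                   # upper tail of chi-square, 1 df
    return chi2, p_value

# Hypothetical follow-up times (years); every subject had the event (no censoring).
control_times, control_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
treated_times, treated_events = [6, 7, 8, 9, 10], [1, 1, 1, 1, 1]
chi2, p = logrank_test(control_times, control_events, treated_times, treated_events)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

Every failure time contributes one observed-minus-expected term with no extra weighting, which is the sense in which the test gives equal weight to differences at each point in time.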
34
Cox proportional hazards model
• Very powerful regression modeling technique based on the hazard function (see Tibshirani 1982).
  • allows the full flexibility of regression analysis to be applied to survival data.
  • in its simplest form it is an extension of the log-rank test.
• Advantages:
  • ability to handle a large number of prognostic variables (both discrete and continuous)
  • can adjust for confounding variables, and
  • can evaluate interaction effects
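For reference, the model is usually written as below. The notation is the standard one rather than anything shown on the slide: h_0(t) is an unspecified baseline hazard, x_1 to x_p are the prognostic variables, and the beta coefficients are estimated from the data.

```latex
h(t \mid x_1, \dots, x_p) = h_0(t)\, \exp(\beta_1 x_1 + \dots + \beta_p x_p),
\qquad
\mathrm{HR}_j = \exp(\beta_j)
```

Each exponentiated coefficient is a hazard ratio for a one-unit change in that variable, adjusted for the others. The "proportional hazards" assumption is that this ratio does not change over time.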
35
B. Statistical Control of Common (Selection) Biases
• Prognostic studies are essentially observational studies that focus on survival (or some other outcome).
• Techniques to control biases are therefore the same as used in observational epidemiology.
36
Table. Methods for Controlling Selection Bias (from Fletcher)
Method | Description | Phase of Study
Randomization | Random assignment ensures that known and unknown confounders are equally distributed between exposure groups (this is rarely feasible, however, unless a specific RCT designed to evaluate some aspect of prognosis is being conducted). | Design
Restriction | If a strong confounding factor is known, such as age or sex, limit the range of the characteristics of patients in the study. | Design
Matching | Match exposure groups on the basis of important prognostic variables, such as stage of disease, age or sex. | Design
Stratification | Compare event rates within subgroups (strata) with otherwise similar probability of outcomes e.g., sex- or age-group-specific rates. | Analysis
37
Table. Adjustment Procedures to Control Selection Bias
Method | Description | Phase of Study
Simple adjustment | Mathematically adjust crude rates for a characteristic known to be an important prognostic factor e.g., age adjustment. | Analysis
Multiple adjustment | Use mathematical models to adjust risk estimates for several prognostic variables (Cox regression). | Analysis
Sensitivity analysis | Describe how the results could differ by changing the values of known prognostic factors over plausible ranges. Best/worst case analysis is an example. | Analysis
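Simple adjustment can be illustrated with direct age standardization. The two hospitals, their counts and the equal standard weights below are all hypothetical; the point of the sketch is that identical stratum-specific rates can produce very different crude rates when the age mix differs.

```python
def crude_rate(deaths, persons):
    """Overall rate, ignoring strata."""
    return sum(deaths) / sum(persons)

def directly_standardized_rate(deaths, persons, standard_weights):
    """Weighted average of stratum-specific rates; weights must sum to 1."""
    return sum(w * d / n for d, n, w in zip(deaths, persons, standard_weights))

# Hypothetical two-stratum example: [young, old].
# Hospital A treats mostly older patients, hospital B mostly younger ones.
deaths_a, persons_a = [2, 60], [100, 300]
deaths_b, persons_b = [6, 20], [300, 100]
weights = [0.5, 0.5]   # assumed standard population: half young, half old

crude_a, crude_b = crude_rate(deaths_a, persons_a), crude_rate(deaths_b, persons_b)
adj_a = directly_standardized_rate(deaths_a, persons_a, weights)
adj_b = directly_standardized_rate(deaths_b, persons_b, weights)

print(f"Crude:    A = {crude_a:.1%}, B = {crude_b:.1%}")
print(f"Adjusted: A = {adj_a:.1%}, B = {adj_b:.1%}")
```

Comparing the stratum-specific rates directly (2% in the young and 20% in the old at both hospitals) is the stratification approach from the previous table; standardization condenses the same information into one summary figure per group.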
38
V. Ideal Characteristics of Prognostic Studies
• 1. Was the sample well defined and representative of a definable underlying population? Was the referral pattern well described?
• 2. Was an inception cohort assembled? Were all the study patients at a similar well defined point in the course of their disease?
• 3. Was the follow-up complete and sufficiently long?
39
V. Ideal Characteristics of Prognostic Studies
• 4. Were objective and unbiased outcome criteria used?
• 5. Was the outcome assessment blind?
• 6. Was adjustment for extraneous prognostic factors carried out?
40
Editorial Readings
• Melton
  • What is selection bias, and how can it affect the conclusions of studies?
• Motulsky
  • Why did the author place such emphasis on understanding the selection method?
41
VI. Clinical Decision Rules (CDR)
• clinical tools that combine history, physical examination, and simple diagnostic tests to aid in diagnostic, prognostic or treatment decisions
• Outcomes:
  • Probability of disease/event (risk)
    – e.g., APGAR, APACHE, CVD Risk Prediction (Framingham), colic prognosis
  • Diagnostic/treatment decision
    – e.g., Breast biopsy decisions, Breast CA risk (Gail model), colic surgery
42
CDRs – 3 Step Development Process
• Step 1 - Derivation:
  • Identify important (predictive) variables
  • Use statistical methods (logistic regression, recursive partitioning, neural networks), or pick variables based on expert opinion
  • Initial statistical testing (validation)
    – Split sample (development/training and test sets)
    – Bootstrap techniques
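The two internal validation techniques can be sketched with the standard library. Everything here is hypothetical: the records are made-up (rule prediction, actual outcome) pairs, and the function names are illustrative. The bootstrap shown only resamples the accuracy of a fixed rule; a full bootstrap validation would re-derive the rule in each resample, which is beyond this sketch.

```python
import random

def split_sample(records, test_fraction=0.3, seed=0):
    """Randomly split records into a derivation set and a test set."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(records):
    """Fraction of (predicted, actual) pairs that agree."""
    return sum(p == a for p, a in records) / len(records)

def bootstrap_interval(records, statistic, n_boot=1000, seed=0):
    """Percentile 95% interval for `statistic` from bootstrap resamples."""
    rng = random.Random(seed)
    estimates = sorted(
        statistic([rng.choice(records) for _ in records]) for _ in range(n_boot)
    )
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]

# Hypothetical (rule prediction, actual outcome) pairs: 80 agree, 20 disagree.
records = [(1, 1)] * 40 + [(0, 0)] * 40 + [(1, 0)] * 10 + [(0, 1)] * 10
derivation, test = split_sample(records)
low, high = bootstrap_interval(records, accuracy)
print(f"derivation n = {len(derivation)}, test n = {len(test)}")
print(f"accuracy = {accuracy(records):.2f}, bootstrap 95% interval = ({low:.2f}, {high:.2f})")
```

Both techniques reuse the original sample, which is why they count only as statistical validation and not as the prospective validation described in Step 2.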
43
CDRs – 3 Step Development Process
• Step 2 – Validation
  • Usually prospective, validation required because
    – CDR accuracy may be specific to development population (because of severity & disease prevalence)
    – CDR may not be applied in the same manner in other populations
  • Narrow: Application to similar patient populations.
  • Broad: Application to different populations with varying prevalence and disease spectrum.
44
CDRs – 3 Step Development Process
• Step 3 – Impact Analysis
  • Required because reluctance to use CDR is common. Why?
    – Concern about different patient population/settings
    – Risk of false negatives (esp. legal concerns)
    – Rules are complicated or take too long to use
    – Doesn't provide a course of action (just a probability!!)
  • Test effect of CDR on physician behavior, clinical practice and patient outcomes
  • Rarely done!
  • Ideal = Randomize individual patients (difficult) or practices.
  • Or, evaluate using pre – post design
45
CDR - Hierarchy of Evidence
Level | Minimum Evidence Required | Recommended Use
1 | Accuracy and applicability demonstrated in BROAD prospective validation studies in different populations, PLUS impact analysis. | Widespread use. High confidence that it can change practice or outcomes.
2 | Accuracy demonstrated in BROAD prospective validation studies in different populations. | Widespread use. High confidence.
3 | Accuracy demonstrated in single NARROW prospective validation study. | Use with caution in similar settings.
4 | Statistical validation or retrospective validation only. | Further evaluation required before use.
46
CDR – Methodological Standards (McGinn, JAMA 2000)
• Were all important predictors included in the derivation process?
  – = content validity
  – were they collected prospectively in a blinded fashion for the purposes of CDR development?
  – every patient included? minimal missing data?
  – were predictors present in a large proportion of the study population?
• Was the patient population and setting well defined?
  – age, sex, referral filter
• Were all outcome events clearly defined?
  – Are they of clinical importance?
  – Were they determined blindly (independent of predictors)?
47
CDR – Methodological Standards (McGinn, JAMA 2000)
• Were appropriate statistical methods used?
  – Adequate sample size to avoid over-fitting (need 10 outcomes per variable)
• Were results of CDR appropriate and clear?
  – Se/Sp or ROC curves
  – usually want high Se (to avoid FNs)
  – PVs of more use to clinicians (prevalence dependent)
  – LRs?
  – Prob (outcome) = survival curves
• Was reproducibility of predictors and the rule itself assessed?
  – Many signs and symptoms (S/S) are not very reliable
  – Concerned with inter-observer variability (kappa)
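The accuracy and reproducibility measures listed above can be computed from simple 2x2 counts. The rule, the counts and the function names below are hypothetical; the sketch shows why predictive values depend on prevalence while Se, Sp and likelihood ratios do not, and how kappa corrects observer agreement for chance.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Sensitivity, specificity and likelihood ratios from a 2x2 table."""
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    return {"Se": se, "Sp": sp, "LR+": se / (1 - sp), "LR-": (1 - se) / sp}

def predictive_values(se, sp, prevalence):
    """PPV and NPV at a given prevalence (Bayes' theorem)."""
    ppv = se * prevalence / (se * prevalence + (1 - sp) * (1 - prevalence))
    npv = sp * (1 - prevalence) / (sp * (1 - prevalence) + (1 - se) * prevalence)
    return ppv, npv

def cohen_kappa(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Chance-corrected agreement between two observers on a yes/no finding."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    p_observed = (both_yes + both_no) / n
    p_a_yes = (both_yes + a_yes_b_no) / n
    p_b_yes = (both_yes + a_no_b_yes) / n
    p_expected = p_a_yes * p_b_yes + (1 - p_a_yes) * (1 - p_b_yes)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical rule with Se = 0.90 and Sp = 0.80.
m = accuracy_measures(tp=90, fp=20, fn=10, tn=80)
ppv_high, npv_high = predictive_values(m["Se"], m["Sp"], prevalence=0.50)
ppv_low, npv_low = predictive_values(m["Se"], m["Sp"], prevalence=0.05)
kappa = cohen_kappa(both_yes=40, a_yes_b_no=10, a_no_b_yes=10, both_no=40)

print(f"PPV at 50% prevalence: {ppv_high:.2f}; at 5% prevalence: {ppv_low:.2f}")
print(f"kappa = {kappa:.2f}")
```

The same rule yields a much lower positive predictive value in the low-prevalence setting, which is one reason broad validation across populations matters.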