Retrospective Cohort Study
Review - Retrospective Cohort Study
Retrospective cohort study:
• Investigator has access to exposure data on a group of people.
• The study sample is divided into exposed and non-exposed groups.
• Both the exposures and outcomes of interest have already occurred (hence “retrospective”).
• The disease experience of exposed and non-exposed groups is compared (e.g. risk ratio or rate ratio).
[Figure: timeline of a retrospective cohort study, running from Exposure to Disease. Both exposure and disease have already occurred.]
Retrospective Cohort Studies
(Also called “historical” studies)
Design Features
Strengths:
• Can study the effects of exposures that no longer occur (e.g. discontinued medical treatments).
• Quicker and less costly than prospective cohort studies.
• Particularly efficient for study of rare exposures, especially occupational and “natural history” exposures.
Design Features
Strengths (cont.):
• Particularly efficient for studying diseases with long “latency” periods.
• Can examine multiple effects of a single exposure.
• Can yield information on multiple exposures.
• May allow direct measurement of incidence of disease in exposed and non-exposed groups (hence, calculation of relative risk).
Design Features
Limitations:
• Not useful for study of emerging, new exposures.
• Existing records or subject recall may be less accurate and complete than data collected prospectively (e.g. the records were not collected with the hypothesis of interest in mind).
• Information on potential confounding factors is often unavailable from existing records.
IMPORTANT CONCEPTS IN COHORT STUDIES
• Rates Versus Risks
• Calculating Person Time
• Estimating the Empirical Induction Period
• Estimating Effect of Relevant Exposure
Rates Versus Risks
• In some instances, it is more desirable to calculate and compare incidence rates, rather than incidence proportions (risks).
-- Recall that cumulative incidence estimates the probability (risk) that an individual will develop a disease during a specified period of time.
-- An incidence rate, by contrast, measures how fast new cases are occurring in a population.
Cumulative Incidence (CI)
No. of new cases of disease during a given period
CI = --------------------------------------------------------------
Total population at risk during the given period
Example: During a 1-year period, 10 out of 100 “at risk” persons develop the disease of interest.
10
CI = ----- = 0.10 or 10.0%
100
Incidence Rate (IR)
No. of new cases of disease during a given period
IR = --------------------------------------------------------------
Total “person-time” of observation
Range = 0 to Infinity
Incidence Rate (IR)
When we observe a group of individuals for a period of time in order to ascertain the development of an event, the actual time each individual is observed will most likely vary.
What is person time?
Each subject contributes a specific person-time of observation (days, months, years) to the denominator.
Person   Follow-up Time on Study                    Person Yrs.
1        <------------------------>                 2
2        <------------------------D                 2
3        <-----------WD                             1
4        <------------------------------------->    3
5        <------------------------>                 2
(Time axis: Jan. 1995, Jan. 1996, Jan. 1997, Jan. 1998)
Person-Time
Number of Cases: 1
Person Years of Observation: 10
IR = 1 case / 10 person years of follow-up
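The person-time tally above can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and variable names are assumptions made for the sketch, and the single case is taken to be person 2 (the subject whose follow-up ends at "D"), which matches the slide's count of one case.

```python
# Person-years contributed by each of the five subjects in the diagram.
# The "case" flag is an assumption read off the diagram: only person 2 ends at "D".
follow_up = [
    {"person": 1, "person_years": 2, "case": False},
    {"person": 2, "person_years": 2, "case": True},   # follow-up ends at D
    {"person": 3, "person_years": 1, "case": False},  # follow-up ends at WD
    {"person": 4, "person_years": 3, "case": False},
    {"person": 5, "person_years": 2, "case": False},
]

# Denominator: total person-time of observation.
total_person_years = sum(s["person_years"] for s in follow_up)

# Numerator: number of new cases during follow-up.
cases = sum(1 for s in follow_up if s["case"])

incidence_rate = cases / total_person_years

print(total_person_years)  # 10
print(cases)               # 1
print(incidence_rate)      # 0.1 (cases per person-year)
```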
Rates Versus Risks
Question: Among persons with acute leukemia, does antibiotic treatment prevent or delay the onset of gram-negative bacterial infections (as measured by the presence of fever)?
--- 35 patients receive antibiotic treatment
all 35 develop fever
260 person days of follow-up
--- 40 patients do not receive antibiotic treatment
all 40 develop fever
210 person days of follow-up
Rates Versus Risks
Treatment YES
CI = 35 / 35 = 1.0 (100%)
IR = 35 / 260 = 0.1346 / person day
Treatment NO
CI = 40 / 40 = 1.0 (100%)
IR = 40 / 210 = 0.1905 / person day
Risk Ratio = 1.0 / 1.0 = 1.0
Rate Ratio = 0.1346 / 0.1905 = 0.7066
Rates Versus Risks
Although antibiotic treatment did not prevent gram-negative bacterial infections, the rate ratio of 0.7066 suggests that it delays their onset.
In other words, the rate at which gram-negative bacterial infection develops on a given day of follow-up is lower in those treated with antibiotics.
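The comparison of risks and rates in the antibiotic example can be reproduced with a short Python sketch. The function and variable names are illustrative assumptions; the counts are the ones given on the slides. Because the sketch divides the unrounded rates, it gives a rate ratio of about 0.7067, whereas the slide's 0.7066 comes from dividing the rounded values 0.1346 and 0.1905.

```python
def cumulative_incidence(new_cases, population_at_risk):
    """Proportion of the at-risk group that develops the outcome."""
    return new_cases / population_at_risk


def incidence_rate(new_cases, person_time):
    """New cases per unit of person-time."""
    return new_cases / person_time


# Counts from the slides: everyone develops fever in both groups.
treated = {"n": 35, "cases": 35, "person_days": 260}
untreated = {"n": 40, "cases": 40, "person_days": 210}

ci_treated = cumulative_incidence(treated["cases"], treated["n"])
ci_untreated = cumulative_incidence(untreated["cases"], untreated["n"])
ir_treated = incidence_rate(treated["cases"], treated["person_days"])
ir_untreated = incidence_rate(untreated["cases"], untreated["person_days"])

risk_ratio = ci_treated / ci_untreated
rate_ratio = ir_treated / ir_untreated

print(f"IR treated:   {ir_treated:.4f} per person day")    # 0.1346
print(f"IR untreated: {ir_untreated:.4f} per person day")  # 0.1905
print(f"Risk ratio:   {risk_ratio:.4f}")                   # 1.0000
print(f"Rate ratio:   {rate_ratio:.4f}")                   # 0.7067
```

The risk ratio cannot distinguish the groups because every patient eventually develops fever; only the person-time denominator captures the difference in timing.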
Rates Versus Risks
In addition to being more informative on how fast new cases of disease are developing, the rate ratio can also be much more informative than the risk ratio, depending on the exposure and disease being measured and on the characteristics of the study cohort. This is particularly true for “time-dependent” exposures (exposures that change over time).
Definitions
“Open” or “Dynamic” Population:
Population in which person-time experience can accrue from a changing roster of individuals.
“Fixed Cohort”:
Exposure groups are defined at the start of follow-up with no movement of individuals between exposure groups (e.g. clinical trial).
“Closed” Cohort or Population:
Fixed cohort with no loss to follow-up.
When exposures are “dynamic,” it is important to take these changes into account as subjects are followed.
Example: Suppose a cohort of industrial workers is continuously exposed to a hazardous agent over the course of their working careers.
We wish to compare the mortality experience of those with low, moderate, and high exposure to mortality in the general population (see handouts).
Estimating Relevant Exposure
Whether implicit or explicit, and whether for cohort studies or case-control studies, it is important to consider the empirical induction period when estimating the effects of exposures.
Empirical Induction Period
The “empirical induction period” includes the time from causal action of the exposure to disease detection. This consists of 2 parts:
• Induction period: Period of time from causal action to disease initiation (triggering).
• Latent period: Time interval between disease occurrence and detection.
[Figure: timeline of years of smoking marked at 0, 30, 40, and 45, divided into a “pre-causal” period, an induction period, and a latent period.]
Empirical Induction Period
Example: Smoking and lung cancer.
Total study period: 45 years
Empirical induction period: 15 years
Induction period: 10 years
Latent period: 5 years
Empirical Induction Period
For many exposure/disease associations, the empirical induction period is unknown.
The latent period can be reduced by improved methods of disease detection.
Slow-growing cancers may appear to have long induction periods with respect to some causes because they have long latent periods.
Estimating Relevant Exposure
Depending on the empirical induction period, it is often inappropriate to uniformly assign persons as exposed or non-exposed.
Instead, persons can contribute person time to both exposed and non-exposed denominators.
In other words, the time at risk of disease may vary depending on the accumulation and intensity of exposure.
Example: Agent Orange exposure and thyroid cancer.
Exposed: Combat veterans exposed to Agent Orange in Vietnam (1967-70).
Nonexposed: Veterans in non-combat positions not exposed to Agent Orange in Vietnam (1967-70).
Follow-up Period: 1970 - 2000
Postulated Empirical Induction Period: 20 to 30 years
[Figure: timeline marked at 1967, 1970, 1987, 1990, 1997, and 2000, showing the range of exposure, the range of the pre-causal period, and the range of the empirical induction period.]
If we give “credit” to person time during the pre-causal period when exposed persons were presumed not at risk of disease occurrence, we may get a biased (usually diluted) estimate of the relative risk (see handout).
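One way to separate relevant from non-relevant follow-up time for an exposed person is sketched below. It is a simplification under stated assumptions: time is counted in whole calendar years, each person has a single exposure year, and the function name `split_follow_up` and its parameters are hypothetical. The years come from the Agent Orange example (exposure 1967-70, follow-up 1970-2000, postulated empirical induction period of 20 to 30 years).

```python
def split_follow_up(exposure_year, follow_start, follow_end, min_lag, max_lag):
    """Return (relevant_years, non_relevant_years) for one exposed person.

    Relevant time is the part of follow-up that falls inside the postulated
    empirical induction window [exposure_year + min_lag, exposure_year + max_lag].
    Everything else in follow-up is treated as non-relevant.
    """
    window_start = exposure_year + min_lag
    window_end = exposure_year + max_lag

    # Overlap between the follow-up interval and the induction window.
    overlap_start = max(follow_start, window_start)
    overlap_end = min(follow_end, window_end)
    relevant = max(0, overlap_end - overlap_start)

    total = follow_end - follow_start
    return relevant, total - relevant


# Veteran exposed in 1967: window is 1987-1997.
print(split_follow_up(1967, 1970, 2000, 20, 30))  # (10, 20)

# Veteran exposed in 1970: window is 1990-2000.
print(split_follow_up(1970, 1970, 2000, 20, 30))  # (10, 20)
```

Under these assumptions only 10 of each veteran's 30 follow-up years count as exposed time at risk; crediting all 30 years to the exposed denominator is what dilutes the estimate.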
Estimating Relevant Exposure
An important issue is what happens to the time experienced by exposed subjects that does not meet the definition of time at risk of exposure effects (the empirical induction period).
The non-relevant follow-up time can be:
1. Assigned to the denominator of the unexposed rate.
2. Excluded from the study.
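The two options can be compared with a small Python sketch. All counts below are hypothetical numbers invented purely for illustration; they do not come from the lecture or from any study. The sketch also shows the naive analysis that leaves the non-relevant time in the exposed denominator.

```python
# HYPOTHETICAL counts, for illustration only.
exposed_relevant = {"cases": 12, "person_years": 4000}      # inside the induction window
exposed_non_relevant = {"cases": 3, "person_years": 6000}   # outside the induction window
unexposed = {"cases": 10, "person_years": 10000}


def rate(cases, person_years):
    return cases / person_years


exposed_rate = rate(**exposed_relevant)

# Option 1: assign non-relevant follow-up time (and its cases) to the unexposed rate.
unexposed_rate_opt1 = rate(
    unexposed["cases"] + exposed_non_relevant["cases"],
    unexposed["person_years"] + exposed_non_relevant["person_years"],
)
rate_ratio_opt1 = exposed_rate / unexposed_rate_opt1

# Option 2: exclude the non-relevant follow-up time from the study.
unexposed_rate_opt2 = rate(**unexposed)
rate_ratio_opt2 = exposed_rate / unexposed_rate_opt2

# Naive analysis: credit all exposed follow-up time to the exposed group.
naive_exposed_rate = rate(
    exposed_relevant["cases"] + exposed_non_relevant["cases"],
    exposed_relevant["person_years"] + exposed_non_relevant["person_years"],
)
rate_ratio_naive = naive_exposed_rate / unexposed_rate_opt2

print(f"Option 1 rate ratio: {rate_ratio_opt1:.2f}")  # 3.69
print(f"Option 2 rate ratio: {rate_ratio_opt2:.2f}")  # 3.00
print(f"Naive rate ratio:    {rate_ratio_naive:.2f}")  # 1.50
```

With these made-up numbers the naive ratio is pulled toward 1, and option 1 gains person-time in the unexposed denominator at the cost of depending on how well the induction window is specified.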
Estimating Relevant Exposure
Advantages of assigning the non-relevant follow-up time to the denominator of the unexposed rate:
• Greater precision in estimating the rate among the unexposed.
• Greater comparability between the exposed and non-exposed on characteristics such as age and time period of follow-up.
Estimating Relevant Exposure
Disadvantage of assigning the non-relevant follow-up time to the denominator of the unexposed rate:
• If the empirical induction period is underestimated, truly “exposed” cases will be added to the rate of the non-exposed – this will tend to make the exposed and unexposed rates more similar than they really are.
Estimating Relevant Exposure
Disadvantage of excluding the non-relevant follow-up time from the study:
• The number of truly unexposed cases may be too small to produce a stable comparison.
Estimating Relevant Exposure
Since the empirical induction period is often unknown, how do we know whether the period we have assumed is appropriate?
• The empirical induction period can be “lagged,” with separate analyses conducted using each period, e.g.:
10 to 20 years
15 to 25 years
20 to 30 years
25 to 35 years
30 to 40 years
35 to 45 years
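The lagged windows listed above follow a regular pattern (10-year windows starting every 5 years), so they can be generated rather than typed out. The sketch below is illustrative only: the helper names are hypothetical, and the 30-year follow-up used in the example is an assumed value, chosen to show how later windows capture less and less observed person-time.

```python
def lagged_windows(first_start, last_start, step, width):
    """Candidate empirical induction periods as (start, end) years since exposure."""
    return [(s, s + width) for s in range(first_start, last_start + 1, step)]


def relevant_years(window, follow_up_years):
    """Years of follow-up (measured from exposure) that fall inside the window."""
    start, end = window
    return max(0, min(end, follow_up_years) - start)


windows = lagged_windows(first_start=10, last_start=35, step=5, width=10)
print(windows)
# [(10, 20), (15, 25), (20, 30), (25, 35), (30, 40), (35, 45)]

# ASSUMPTION: one person followed for 30 years from the time of exposure.
for window in windows:
    print(window, relevant_years(window, follow_up_years=30))
# (10, 20) 10
# (15, 25) 10
# (20, 30) 10
# (25, 35) 5
# (30, 40) 0
# (35, 45) 0
```

A separate rate ratio would then be estimated for each window, using only the cases and person-time that fall inside it.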
Estimating Relevant Exposure
If multiple empirical induction periods are analyzed:
• One can select the largest risk estimate, with the tenuous assumption that it is not the largest simply due to statistical variability (chance).
• The data should be inspected to see whether a consistent pattern of effects emerges that reflects the empirical induction period.
Estimating Relevant Exposure
To accurately estimate person time of follow-up, it is important to determine the time of the event as precisely as possible.
Example: Defining the time of onset for disorders such as multiple sclerosis and atherosclerosis can be ambiguous.
Timing of Outcome Events
As a general rule, there should be a written protocol on how to classify subjects on the basis of available information.
Example: Seroconversion to HIV might be measured as the midpoint between the time of the last negative and the first positive antibody test.
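A protocol rule of this kind is easy to state in code. The sketch below estimates the event date as the midpoint between the last negative and the first positive test. The function name and the test dates are hypothetical, and rounding down to a whole day is an assumption of the sketch, not something the lecture specifies.

```python
from datetime import date, timedelta


def midpoint_date(last_negative, first_positive):
    """Estimate the event date as the midpoint between two test dates.

    Rounds down to a whole day when the interval has an odd number of days.
    """
    if first_positive < last_negative:
        raise ValueError("first positive test must not precede last negative test")
    interval_days = (first_positive - last_negative).days
    return last_negative + timedelta(days=interval_days // 2)


# HYPOTHETICAL test dates, 60 days apart.
estimated = midpoint_date(date(2001, 1, 1), date(2001, 3, 2))
print(estimated)  # 2001-01-31
```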
Timing of Outcome Events