8/3/2019 prelim guide (Autosaved)
2. DISEASE AND POPULATION
CASE DEFINITION
Does it specify characteristic shared by all members of the class
being defined?
Does it specify what distinguished them from others outside the
class?
Is the data readily available?
What are the options for creating case definition?
When little is known about the disease, the definition can be developed inductively,
based on shared, readily available, observable clinical
characteristics
In acute outbreak, shared causes, place and time of occurrence
are themselves sometimes included
When is consistent definition useful?
For valid comparison
When are two or more case definitions useful?
To facilitate comparison when the case definition has changed over time
When the extent of evidence may vary among cases, e.g. definitive,
probable and possible cases during an outbreak
To assess the degree to which results depend on the case definition
What is the drawback of a broad case definition?
Low specificity, leading to bias and reduced statistical power in
case-control studies
What is the drawback of narrow case definition?
If later suspected to be too narrow, may be difficult and time
consuming to go back and find the missed cases
Epidemiologic case definition vs clinical diagnosis
Clinical diagnosis
Purpose: guide treatment choice, inform about prognosis
Evidence: may include costly and/or invasive tests
Epidemiologic study
Purpose: quantify population burden, assess disparities, identify
shared causes
Evidence: must usually be feasibly obtainable on a population scale
DISEASE MODELS
Is it non recurrent disease?
Alzheimer's disease, osteoarthritis, suicide, homicide, SIDS
Is it recurrent disease?
Depression, UTI, Low back pain
Are all people at risk?
Who are not at risk?
Diseased people are not at risk
Immunized people are not at risk
Does it have a susceptible state?
Does it have an immune state? Is immunity lifelong?
Does it have fluctuating at-risk periods? E.g. occupational injury
Is duration of disease event negligible?
Use line diagram
POPULATION
Defined vs Undefined population
Why to define population?
to know its size
generalizability
Population definition
Does it specify characteristics shared by all members of the
population being defined?
Personal attributes such as age, gender, membership;
geographic scope; time or time period
Does it specify what distinguished them from others outside the
population?
Link between Cases and the population at risk
If each of the cases had not developed the disease, would he or
she still have been included in the population?
If each of the non-cases in the population had developed the
disease, would he or she have been included as a case?
Defined population observed over time
Is it a closed population?
All members initially at risk
No gains or losses in membership during the period of observation
(except due to the disease itself)
Able to identify and count all new disease cases over a fixed time
period
Is it an open population?
population at risk can gain or lose members during time period of
interest. Eg births and deaths, migration, occurrence of new cases
and recovery of old ones
3. DISEASE FREQUENCY: BASIC
What is the dual interpretation of disease frequency?
Population disease burden
Individual disease risk
PREVALENCE
Count of prevalent cases
no. of people in the diseased state at that time
Prevalence
No. of prevalent cases/Size of population
What is a possible fixed time?
Is it calendar time?
Is it Age?
Is it Time relative to some salient event?
INCIDENCE
Cumulative Incidence
Number of people who develop the disease/ No of people initially
at risk for it
Is it a closed population?
Is timing of disease occurrence within the time period of interest?
Is each person counted as case only once?
What is it statistically?
Proportion
What are other names?
Incidence proportion
Attack rate
Can time period for each individual be different?
Yes
Incidence Rate
No of incident cases/Amount of person-time at risk
What is it statistically?
Rate
How to deal with recurrent event?
If recurrences are to be counted in the numerator, also add the
corresponding person-time at risk to the denominator
If recurrences are not to be counted in the numerator, don't add that person-time to the denominator
Estimating incidence rate when detailed data are not available
Person-time at risk can be approximated in two ways
Average number of persons at risk * total duration of follow-up
Total number of persons followed * average duration of follow-up
What are approaches to estimating average population at risk?
Population size at mid-period
Average of population size at start and at end of observation
period
Average several population size estimates made periodically
during the observation period
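The two incidence measures and the person-time approximation above can be sketched as follows. This is a minimal illustration only: the function names and all numbers are hypothetical and do not come from the guide.

```python
def cumulative_incidence(new_cases, initially_at_risk):
    """Proportion of a closed, initially at-risk population that develops disease."""
    return new_cases / initially_at_risk

def approx_person_time(pop_start, pop_end, years):
    """Approximate person-time as (average of start and end population) x duration."""
    return (pop_start + pop_end) / 2 * years

def incidence_rate(new_cases, person_time):
    """Incident cases per unit of person-time at risk."""
    return new_cases / person_time

# Hypothetical numbers, for illustration only
ci = cumulative_incidence(30, 1000)        # 30 new cases among 1000 at risk
pt = approx_person_time(10000, 12000, 2)   # open population followed for 2 years
ir = incidence_rate(44, pt)                # cases per person-year
print(ci, pt, ir)
```

The cumulative incidence is a unitless proportion tied to a stated time period, while the incidence rate carries units of 1/time.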
Are prevalent cases included in the population at risk?
Is their number small enough to be negligible?
Denominators other than person-time
Vehicle miles travelled
Comparison of Cumulative Incidence and Incidence Rate
Variants of Incidence
Mortality
Mortality Rate = No of deaths/Person-time at risk
Mortality Rate = Death rate (also called mortality density)
Cumulative Mortality = No of deaths/Population at risk
Fatality
Case fatality = Number of fatal cases/Total number of cases
Proxy Measure of Incidence
Proportional mortality = Deaths from disease/Death from all causes
Be aware of the pitfall!
When can proportional mortality be valid for comparison?
If the denominator (deaths from all causes) and the person-time of
follow-up are equal in the two populations
What can increase in proportional mortality mean?
Increase in disease
Decrease in other disease
Increase in relative size of that segment of population
Change in criteria for case definition
Proportional incidence = No of cases of certain disease/no of cases in a
larger category that contains it
Fetal Death ratio= No of fetal deaths/Number of live births
OTHER MEASURES OF DISEASE FREQUENCY
Period prevalence
Hybrid of prevalence and cumulative incidence
Period prevalence = P + (1 - P) * CI
(P = point prevalence at the start of the period, CI = cumulative incidence during the period)
What is the main limitation?
Point prevalence and cumulative incidence convey very different
kinds of information about disease frequency. Those distinctions are
lost when they are combined, which limits its usefulness as a summary
measure.
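A small numeric sketch of the period prevalence formula, assuming P is the point prevalence at the start of the period and CI is the cumulative incidence among those initially disease-free. The values are hypothetical.

```python
def period_prevalence(point_prevalence, cumulative_incidence):
    """Existing cases at the start plus new cases among those initially disease-free."""
    return point_prevalence + (1 - point_prevalence) * cumulative_incidence

# Hypothetical: 10% already diseased, 5% of the remainder become cases
pp = period_prevalence(0.10, 0.05)
print(pp)
```

The same result (0.145 here) could arise from many combinations of P and CI, which is the limitation noted above.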
Years of potential life lost
4. DISEASE FREQUENCY: ADVANCED
PREVALENCE
Length biased sampling?
A set of prevalent cases tends to be skewed toward cases with
more chronic forms of the disease. Why?
Other things being equal, a person's probability of being
captured as a prevalent case is proportional to the duration of his or
her disease.
What is the issue in evaluating a screening program?
Screening is like a prevalence survey, and cases detected by
screening tend to be skewed toward more slowly progressive forms
of pre-symptomatic disease
Estimate confidence interval of prevalence
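Stata command: .cii n r (n = number of people examined, r = number of prevalent cases; Stata 14 and later use .cii proportions n r)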
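For readers without Stata, a confidence interval for a prevalence can be sketched in Python. Note the assumption: this uses the Wilson score interval, a swapped-in approximation chosen for simplicity, and it will not necessarily match the interval Stata reports. The counts are hypothetical.

```python
import math

def wilson_ci(cases, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives about 95% coverage)."""
    p = cases / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half

# Hypothetical survey: 20 prevalent cases among 200 people examined
low, high = wilson_ci(20, 200)
print(round(low, 3), round(high, 3))
```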
CUMULATIVE INCIDENCE
Estimating Cumulative incidence in presence of Censoring
Kaplan Meier method
.stset y, failure(delta)    delta indicates censoring (1 = event occurred, 0 = censored)
.stsum    summarizes total time at risk
.sts graph, lost    gives KM graph
.sts list    gives detailed information
.stci    gives median survival time with 95% CI
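The Kaplan-Meier (product-limit) calculation behind these commands can be sketched as below. This is an illustrative sketch, not a replacement for the Stata routines: the function name and the follow-up data are hypothetical, and subjects censored at an event time are treated as still at risk at that time.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times:  follow-up time for each subject
    events: 1 if the disease occurred at that time, 0 if censored
    """
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        cases = sum(1 for tt, e in zip(times, events) if tt == t and e == 1)
        leaving = sum(1 for tt in times if tt == t)
        if cases:
            surv *= 1 - cases / at_risk   # conditional probability of remaining disease-free
            curve.append((t, surv))
        at_risk -= leaving                # cases and censored subjects leave the risk set
    return curve

# Hypothetical follow-up data
times = [2, 3, 3, 5, 8, 8, 9, 12]
events = [1, 1, 0, 1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
cumulative_incidence = [(t, 1 - s) for t, s in curve]
print(curve)
```

Cumulative incidence by time t is estimated as 1 minus the disease-free survival at t.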
RELATIONSHIPS AMONG DISEASE FREQUENCY MEASURES
Rates in Total population and its sub population
Example
Incidence Rate and Cumulative Incidence
Prevalence, Incidence and Duration of disease
Incidence of Two disease
Incidence rate in two stage process
Mortality, Incidence and Case Fatality
Is all data from same year?
If no, think that they may not have come to equilibrium yet.
6. SOURCES OF DATA ON DISEASE OCCURRENCE
NUMERATOR DATA
Death Records
Birth Records
Foetal Death Records
Disease report
Cancer, Birth defects, Trauma
Disease Registries
Health Care Records
Hospital Discharge Data
Clinical data (out patients)
Laboratory records
Pharmacy records
SURVEYS
National Health Interview Survey (NHIS)
National Health and Nutrition Examination Survey
NAMCS and NHAMCS
Behavioral Risk Factor Surveillance System (BRFSS)
DENOMINATOR DATA
Common sources of denominator data
U.S. Census
Administrative records
HMO enrollment records
Employment or labor union records
Alumni rosters
Etc.
Birth certificates for perinatal epidemiology
U.S. census data
USES OF MULTIPLE DATA SOURCES
Excluding ineligible cases
Validating using two data sources
Estimating completeness of data sources
Capture-recapture sampling: data layout
Can estimate x (cases missed by both sources) if we can assume that capture by each data
source is independent of capture by the other
Independence assumption implies that 50/250 = 10/x
Hence x = 10 * 250/50 = 50
Estimating total (known + unknown) cases
Estimated total cases (known + unknown) = 310 + 50 = 360, about
16% higher than simple tally of 310 known cases
What is the pitfall of capture recapture method?
Assumption of independent capture not easily tested and
may not always be plausible
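The capture-recapture arithmetic above can be reproduced with a short sketch. The layout is inferred from the figures in the notes (50 cases found by both sources, 250 by one source only, 10 by the other only); the function and argument names are hypothetical.

```python
def capture_recapture(both, only_a, only_b):
    """Two-source estimate assuming independent capture.

    Independence implies both / only_a = only_b / missed.
    """
    missed = only_a * only_b / both
    known = both + only_a + only_b
    return missed, known, known + missed

missed, known, total = capture_recapture(both=50, only_a=250, only_b=10)
print(missed, known, total, round((total / known - 1) * 100, 1))
```

If the two sources tend to capture the same kinds of cases (positive dependence), this estimate of missed cases will be too low.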
Reducing Misclassification
Verifying using two data sources
7. PERSON, PLACE AND TIME
USES OF DESCRIPTIVE EPIDEMIOLOGY
Quantifying disease burden
Assess health disparity
Generate hypothesis
Help target efforts at early detection
Detect emerging threats to public health
PERSON DISTRIBUTION
Age
Immunity
Human development
Slowly progressive disease
Age related variation in lifestyle
Gender
Biological (anatomic and hormonal)
Non biological (social)
Race and Ethnicity
What is the major problem with race data?
Misclassification
The race of an infant is defined differently, so use caution in interpretation
Socio-economic status
Marital Status
PLACE DISTRIBUTION
What is the limitation of spot map?
Only the number of cases is plotted; it does not account for the distribution of the
population at risk
What might be the underlying causes?
Geographical variation is due to variation in physical environment
Socio cultural variation
Differences in medical practice
TIME DISTRIBUTION
Secular trend
Cyclic variation
Age period and Birth Cohort
Calendar year - age = year of birth
COHORT EFFECTS
Lung cancer mortality in U.S. women
Age effect within a calendar year
Calendar effect within age group
In general, women along a diagonal belong to the same birth
cohort
Why consider birth cohort?
Shared experiences at an earlier age can affect future
disease risk
Can provide a simple explanation for otherwise puzzling
pattern of variation in rates by age
Lung cancer death rates
Connected by calendar year of death
Connected by birth cohort
8. INFERRING CAUSAL RELATION BETWEEN EXPOSURE AND
DISEASE
Why is causal relationship important?
Individual decision making
Clinicians making decisions on behalf of their patients
Public health practitioners
Cause
If the factor was not present, would some of the disease not
happen? It need not be a necessary factor
Is it a contributing component of a more complex mechanism
involving other factors? It need not be a sufficient factor
Causal inference guideline
1. Randomized trial evidence exists
2. Temporal sequence is correct
3. Association is strong
4. Association is biologically plausible
5. Association is strongest when predicted to be so
6. No alternative explanations exist (No confounding)
7. Observed evidence is consistent
It is a systematic way of thinking about causality
It is not proof, it is judgment
COMMON MISPERCEPTIONS OF THE NATURE OF CAUSES OF ILLNESS
AND INJURY
There are direct and indirect causes of disease, and the direct
causes are more important
In order to be considered a cause of disease, an exposure must be
present in every case
In order to be considered a cause of disease, an exposure must be
capable of producing that disease on its own
If studies reach heterogeneous conclusions, what are
possible explanations?
Some interacting agents may be missing in one of them
Results may be different in different settings
9. MEASURE OF EXCESS RISK
WHEN INCIDENCE RATE CAN BE ASCERTAINED
Measure | Formula | Helps answer the question
RR | Ie/Io | Does exposure (E) cause disease (D)?
AR | Ie - Io | Among persons exposed to E, what amount of the incidence of D is E responsible for?
AR% | (RR - 1)/RR * 100% = (Ie - Io)/Ie * 100% | What proportion of the occurrence of disease in exposed individuals was due to E? Vaccine efficacy is just the AR%: (Iunvac - Ivac)/Iunvac * 100%
PAR | It - Io | Should resources be allocated to controlling E or, instead, to exposures causing greater health problems in the population?
PAR% | (It - Io)/It * 100% | What portion of D in the population is caused by E? Should resources allocated to combating D be directed toward etiologic research or control of known etiologies?
1/AR | | Number needed to treat
(Ie, Io, It = incidence in the exposed, in the unexposed and in the total population)
Relative effect for positive associations
(Relative excess risk)
= RR - 1
Relative effect for negative associations
= 1 - RR
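The measures in the table can be computed together from three incidence figures. This is a sketch under stated assumptions: Ie, Io and It are read as incidence in the exposed, the unexposed and the total population, and the rates and exposure prevalence below are hypothetical.

```python
def excess_risk_measures(i_exposed, i_unexposed, i_total):
    """Return the measures of excess risk listed in the table above."""
    rr = i_exposed / i_unexposed
    ar = i_exposed - i_unexposed
    par = i_total - i_unexposed
    return {
        "RR": rr,
        "AR": ar,
        "AR%": (rr - 1) / rr * 100,
        "PAR": par,
        "PAR%": par / i_total * 100,
        "1/AR": 1 / ar,
    }

# Hypothetical rates per person-year, with 25% of the population exposed
ie, io, pe = 0.03, 0.01, 0.25
it = pe * ie + (1 - pe) * io          # total-population incidence
m = excess_risk_measures(ie, io, it)
for name, value in m.items():
    print(name, round(value, 4))
```

With the same RR, a higher baseline incidence Io gives a larger AR, which is why AR is larger for common illnesses.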
WHEN INCIDENCE RATE CAN NOT BE ASCERTAINED
RR is equal, AR is higher in population A compared to B
For a relative risk of given size, the RD or AR associated with a given
exposure will be larger for a common illness than for a rare illness
What does PAR% depend on?
RR
Proportion of population exposed
Measure | Formula | Notes
OR | ad/bc | Use the OR in case-control studies to approximate the RR only when the outcome is rare in the source population from which cases and controls were drawn
AR | It/[Pe + 1/(OR - 1)] = Io * (OR - 1) | Pe = proportion exposed in population as a whole: b/(b+d)
AR% | [(OR - 1)/OR] * 100% |
PAR | AR * Pe |
PAR% | AR% * Pc | Pc = proportion of cases who are exposed: a/(a+c)
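A sketch of the case-control versions of these measures. It assumes the 2x2 layout implied by the formulas (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls) and an externally supplied total-population incidence It. All counts are hypothetical.

```python
def case_control_measures(a, b, c, d, i_total):
    """a, c: exposed and unexposed cases; b, d: exposed and unexposed controls."""
    odds_ratio = (a * d) / (b * c)
    pe = b / (b + d)                  # exposure prevalence, estimated from controls
    pc = a / (a + c)                  # proportion of cases who are exposed
    ar = i_total / (pe + 1 / (odds_ratio - 1))
    ar_pct = (odds_ratio - 1) / odds_ratio * 100
    return {"OR": odds_ratio, "Pe": pe, "Pc": pc, "AR": ar,
            "AR%": ar_pct, "PAR": ar * pe, "PAR%": ar_pct * pc}

# Hypothetical counts; It = 0.015 per person-year is assumed known from other data
cc = case_control_measures(a=60, b=25, c=40, d=75, i_total=0.015)
for name, value in cc.items():
    print(name, round(value, 4))
```

As a consistency check, PAR divided by It reproduces PAR% when the OR is a good approximation to the RR.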
10. MEASUREMENT ERROR
SOURCES OF MEASUREMENT ERROR
Measurement in Exposure status
Interview/Questionnaire:
Has subject been misinformed about his/her exposure status?
Has subject intentionally misrepresented it?
Direct measurement
Does the variable vary over time? E.g. blood pressure, serum
cholesterol
Records: Is the record complete and faithful?
Measurement of Health outcome
Asymptomatic cases
Is current diagnostic technology applied to everyone who might
meet the criteria for being a case?
Diagnostic test
Is it an appropriate diagnostic test, able to distinguish between those who
have the disease and those who do not?
Information
Is all information available through the source of information in
use ?
Are both exposed and unexposed group followed for same period
of time?
ASSESSING MEASUREMENT ERROR
Why are quantitative indices of measurement error useful?
Choosing among different measures of the same characteristic in
the design phase
Detecting data quality problems during staff training and data
collection
gauging how large role measurement error may have played in
determining study results during data analysis
RELIABILITY
Concordance
What is limitation of Concordance?
It fails to account for agreement that chance alone could produce
Kappa
In Stata
. clear
. input clin self count
1. 0 0 1117
2. 1 0 132
3. 0 1 128
4. 1 1 170
5. end
. kap clin self [freq=count], tab
Interpretation Guidelines
What is limitation of Kappa?
Kappa declines as prevalence approaches 0 or 1. This property
should be kept in mind when comparing kappas among populations
in which the prevalence of the characteristic under study differs
substantially
What is the impact of a low kappa?
As kappa approaches 0, attenuation of the OR becomes severe,
and a true exposure-disease association may go undetected due to
measurement error
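The kappa calculation can be checked by hand with a short sketch that uses the four cell counts from the Stata example above (clinical versus self-report). The function name is hypothetical.

```python
def cohen_kappa(n00, n10, n01, n11):
    """Kappa for two binary measures; nXY = count with first measure X, second measure Y."""
    n = n00 + n10 + n01 + n11
    observed = (n00 + n11) / n                     # simple concordance
    first_pos = (n10 + n11) / n                    # first measure positive
    second_pos = (n01 + n11) / n                   # second measure positive
    expected = first_pos * second_pos + (1 - first_pos) * (1 - second_pos)
    return observed, expected, (observed - expected) / (1 - expected)

observed, expected, kappa = cohen_kappa(1117, 132, 128, 170)
print(round(observed, 3), round(expected, 3), round(kappa, 3))
```

Concordance here is about 0.83, but much of that agreement is expected by chance, so kappa is considerably lower.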
Intraclass Correlation Coefficient
VALIDITY
Sensitivity and Specificity
What are the properties of sensitivity and specificity?
Not dependent on frequency or prevalence of the trait among
persons tested
Relatively stable characteristics of a test, but may vary slightly
according to: who performs the test, test setting, disease severity
ROC curve
Plot sensitivity against 1-Specificity
The more accurate a test is, the farther toward the upper left its
curve falls in an ROC plot. This rule applies even if the two tests yield
results in entirely different units on totally different scales
AUC (Area under curve)
Summary measure of test accuracy based on ROC curve
0.5 is useless test, 1 is perfect test
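Sensitivity, specificity and the area under an ROC curve can be sketched as follows. The AUC here uses the trapezoidal rule over a few cutoffs, which is one common approximation and is assumed for illustration. All counts and ROC points are hypothetical.

```python
def sens_spec(tp, fn, fp, tn):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(points):
    """Trapezoidal area under ROC points given as (1 - specificity, sensitivity)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x2 - x1) * (y1 + y2) / 2 for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

sens, spec = sens_spec(tp=90, fn=10, fp=30, tn=70)

# Hypothetical test evaluated at two cutoffs
auc = roc_auc([(0.1, 0.6), (0.3, 0.9)])
# A test whose ROC points lie on the diagonal is no better than chance
auc_useless = roc_auc([(0.5, 0.5)])
print(sens, spec, auc, auc_useless)
```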
CONSEQUENCE OF MEASUREMENT ERROR
Differential misclassification
Is ascertainment of exposure status influenced by the presence or
absence of disease?
Is ascertainment of disease status influenced by the presence of
exposure?
Are exposed and unexposed group followed for same period of
time while ascertaining disease?
What is the impact of differential misclassification?
Can falsely exaggerate or falsely minimize an association
Non differential misclassification
What is the impact of non-differential misclassification?
Bias towards the null
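The bias toward the null can be illustrated for a binary exposure in a case-control setting. In this sketch the same sensitivity and specificity of exposure measurement are applied to cases and controls (non-differential error), and expected counts are used rather than a simulation. All counts and error rates are hypothetical.

```python
def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected numbers classified as exposed and unexposed after measurement error."""
    seen_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    seen_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return seen_exposed, seen_unexposed

def odds_ratio(a, b, c, d):
    """a, c: exposed and unexposed cases; b, d: exposed and unexposed controls."""
    return (a * d) / (b * c)

# Hypothetical true counts: cases 60 exposed / 40 unexposed, controls 25 / 75
true_or = odds_ratio(60, 25, 40, 75)
a, c = misclassify(60, 40, sensitivity=0.8, specificity=0.9)   # cases
b, d = misclassify(25, 75, sensitivity=0.8, specificity=0.9)   # controls, same error rates
observed_or = odds_ratio(a, b, c, d)
print(round(true_or, 2), round(observed_or, 2))
```

The observed OR lies between 1 and the true OR. If the error rates differed between cases and controls (differential error), the bias could go in either direction.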
What are strategies for minimizing misclassification?
Sharpen the tool: Use tools with highest reliability and validity
11. CONFOUNDING AND ITS CONTROL
What are the necessary conditions for a factor to be a confounder?
Factor is associated with disease of interest
It may cause the disease or cause the disease to be recognized (e.g. Pap smear and cervical
cancer)
It can be a cause or a correlate of a cause (e.g. race: biological and
socioeconomic; age is a correlate of cause)
Factor is associated with exposure of interest
Factor is not in causal pathway between exposure and disease
(mechanism by which exposure causes disease)
How to assess confounding?
Step 1: Is smoking associated with MI risk? Look in unexposed
Step 2: Is smoking associated with coffee drinking? Look in non-
diseased
Step 3: Is factor in the causal pathway?
When do confounder-disease and confounder-exposure associations result in
confounding?
Strong association between exposure & factor and between
disease & factor
Crude and stratum-specific effect measures differ
-> Crude and adjusted effect measures differ
When is a factor not to be considered as a potential confounder?
It's not associated with exposure in the source population for cases
(look at the source population, controls or non-diseased population,
not at cases)
It's a surrogate measure of exposure (obesity: abdominal
breadth, BMI)
It's a consequence of exposure
It's not predictive of disease occurrence apart from its association
with exposure (only look at unexposed people)
It's a consequence of outcome
When can a consequence of exposure be adjusted for?
When we are interested in the association that operates through
pathways other than that one
CONFOUNDING CONTROL IN DESIGN PHASE
1. Randomization
What is attraction of randomization?
Removes association between exposure and potential
confounders (usually)
Controls confounding by unknown or unmeasurable confounders
What is limitation of randomization?
Confounding may still occur due to accident of randomization
What is remedy of accident of randomization?
Increase size of study, use stratified randomization methods,
handle as for observational study & do multivariate analysis
2. Restriction
Requires all members of study population to have same status on
potential confounder(s)
When is restriction most useful?
Most useful when most potential subjects have same status on
potential confounder (eg: look at only singleton in prenatal studies)
What is limitation of Restriction?
Enhances ability to make statistical inference because we lose
some precision when controlling for confounding
It reduces generalizability; conclusions apply only to the
restricted group
Eliminates ability to assess effect modification
May be difficult (or expensive) to find sufficient subjects.
3. Matching
Cohort study: Each exposed subject matched to one or more non
exposed subjects on potential confounder
Case control study: Each case matched to one or more controls on
potential confounder
What is attraction of Matching?
Allows control for few well known strong risk factors (eg: age)
Increases efficiency of case control study
Precludes examining matching factor(s) as risk factors
What is limitation of Matching?
Differential loss to follow up may result in imbalance in matching
factors
Matching can create bias in case control study (if matched on the
factor which is only associated with exposure)
What is main purpose of matching in case control studies?
Increase efficiency of the study
CONFOUNDING CONTROL IN ANALYSIS PHASE
Standardized (adjusted) rates
Summary rate that enables comparison of two or more groups
that differ in their distribution of an important factor
Subgroup differences hidden
When is standardized rate not influenced by choice of standard
population?
When difference in rates of two communities is constant across
age categories, the size of difference between the adjusted rates
will not be influenced by choice of the standard population
Ratio will be same no matter what age distribution is chosen to
assign the weight
Example 1 : Categorical variable
Step 1: Choose standard or reference population with a known
confounder distribution
One of the two groups to be compared
Combination of two groups to be compared
2000 US standard population (age)
IARC world standard population (age)
Step 2: Apply the confounder category-specific rates for each
population to the number in the standard population in that category
Step 3: Add up the total number of hypothetical deaths in each
population, divide by the total in the standard population to determine
each population's adjusted rate
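Steps 1 to 3 of direct standardization can be sketched as follows. The standard population and the age-specific rates are hypothetical and chosen so that the rate difference is constant across strata.

```python
def directly_standardized_rate(stratum_rates, standard_counts):
    """Apply stratum-specific rates to a standard population and divide by its total."""
    expected_events = sum(r * n for r, n in zip(stratum_rates, standard_counts))
    return expected_events / sum(standard_counts)

# Hypothetical age-specific death rates in two communities (three age groups)
rates_a = [0.001, 0.004, 0.010]
rates_b = [0.002, 0.005, 0.011]   # 0.001 higher than A in every age group

standard_1 = [5000, 3000, 2000]   # one possible standard population
standard_2 = [1000, 1000, 8000]   # a much older standard population

adj_a1 = directly_standardized_rate(rates_a, standard_1)
adj_b1 = directly_standardized_rate(rates_b, standard_1)
adj_a2 = directly_standardized_rate(rates_a, standard_2)
adj_b2 = directly_standardized_rate(rates_b, standard_2)
print(adj_a1, adj_b1, adj_a2, adj_b2)
```

The adjusted rates change with the standard, but the difference between communities stays at 0.001 because the stratum-specific differences are constant.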
What happens when you fail to adjust for confounding by severity
score?
Mixing of effect of hospital (B vs A, exposure) with the effect of
severity score distribution (potential confounder)
Observed mortality rate of Hospital B may be due in part to
lower severity of illness of Hospital B patients (compared to that of
Hospital A)
Observed mortality rate in hospital B (vs A) will be lower than the
adjusted or unconfounded rate because patients at hospital B tend
to also have a factor that is known to decrease mortality (low
severity)
Example 2 : Continuous variable
Standardized Incidence Ratio (SIR) and Standardized Mortality Ratio
(SMR)
SIR and SMR are standardized rate ratios calculated using the
exposed group as the standard population
Ratio of the total number of deaths observed in the exposed group
to the number expected in the exposed group if the rates among
the unexposed prevailed within each age category
SMR: Miners and tuberculosis mortality
Is working as miner a risk factor for tuberculosis mortality?
TB mortality rate among miners = 384/294,013 = 130.6/100,000
TB mortality rate among all 35-64 year old men = 54.1/100,000 person-years
RR = 130.6/54.1 = 2.4
Age adjusted by standardization:
Step 1 Choose a standard population: the miners (exposed group). The
general population is the unexposed group
Step 2 Apply the age group-specific TB death rates of the unexposed
population (general population) to the standard population (here the number of miners)
in that category to get the hypothetical number of TB deaths in the
unexposed population if it had the age distribution (and size) of the miners
population
Step 3 Add up the total number of hypothetical deaths in the
unexposed (general) population
50.55+58.32+31.96 = 140.83 (among a hypothetical population of
294,013)
Step 4. Compare with the # of TB deaths actually observed in the
miners (among a real population of 294,013): 384
O = Observed # of deaths in the miners (exposed group)
E = Expected # of deaths in the miners if general population rates applied
O/E = 384/140.83 = 2.73
SMR = 2.73
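The miners example can be re-checked with a few lines. The expected deaths per age group, the observed deaths, the population size and the general population rate are the figures quoted above; the variable names are hypothetical.

```python
# Expected TB deaths in each age group if general-population rates applied to the miners
expected_by_age = [50.55, 58.32, 31.96]
observed_deaths = 384
miners = 294013
general_rate_per_100k = 54.1

expected_deaths = sum(expected_by_age)
smr = observed_deaths / expected_deaths

crude_rate_per_100k = observed_deaths / miners * 100000
crude_rr = crude_rate_per_100k / general_rate_per_100k
print(round(expected_deaths, 2), round(smr, 2), round(crude_rr, 1))
```

The age-adjusted SMR (2.73) is larger than the crude RR (2.4), so in this example age was masking part of the association.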
Crude vs stratum-specific vs standardized rates
Crude rates
Represent reality
Useful for health services needs assessment
Stratum-specific rates
Represent reality
Detailed information useful
Appropriate when stratum-specific effects differ
Standardized rates
Weights for each stratum defined by analyst
Facilitate comparison with other data and studies, particularly when
the standard population is known
Appropriate when stratum-specific effects are similar
POOLING USING MANTEL-HAENSZEL ADJUSTED ODDS RATIO
CASE CONTROL STUDIES : ODDS RATIO
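The formula for this section did not survive in these notes, so the sketch below uses the standard Mantel-Haenszel pooled odds ratio, sum(a*d/n) / sum(b*c/n) over strata, as an assumption about what was intended. The strata are hypothetical, with a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata of (a, b, c, d) counts."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

def crude_or(strata):
    """Odds ratio after collapsing all strata into one 2x2 table."""
    a, b, c, d = (sum(col) for col in zip(*strata))
    return (a * d) / (b * c)

# Hypothetical data: the OR is 2.0 within each stratum of the confounder
strata = [(40, 20, 10, 10), (5, 10, 20, 80)]
or_mh = mantel_haenszel_or(strata)
or_crude = crude_or(strata)
print(round(or_mh, 2), round(or_crude, 2))
```

Here the crude OR (4.5) overstates the stratum-specific OR (2.0), an example of positive confounding.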
COHORT STUDIES, PERSON-TIME DATA: RELATIVE RISK
COHORT STUDIES, CUMULATIVE INCIDENCE: RELATIVE RISK
COHORT STUDIES, PERSON-TIME DATA: RATE DIFFERENCE
DIRECTION OF CONFOUNDING
1. Positive-positive or Negative-negative
2. Positive-Negative
RESIDUAL CONFOUNDING
Is confounder measured?
Is there incomplete control of confounding?
Does the measurement use improperly defined categories?
Does measurement correctly capture attributes?
Is measurement imperfect surrogate for confounder?
What is effect of residual confounding?
Adjusted effect measure is closer to the crude effect measure, if the
measurement error is non-differential
The more precisely we measure the confounder, the more residual
confounding is reduced.
CONFOUNDING BY INDICATION (OR SEVERITY)
Non randomized pharmaco-epidemiology studies
Comparison of specific drug takers vs non takers
Drug treatment is a marker for the characteristic or condition that triggers
use of that treatment (and increases risk of the outcome)
May attenuate the beneficial effect of a new drug
Determine risk factors for disease complication/progression
Adjust for prognostic differences
Stratifying on basis of severity of illness
12. ECOLOGICAL STUDIES
INTRODUCTION
What are reasons to undertake ecological studies?
Only aggregate information on exposure in each group may be
available to the investigator, even though exposure status does
actually vary among individuals within each group.
The exposure of interest may actually vary only at the population
level, not among individual within the study populations. (no
ecological fallacy, no extrapolation)
What is the size of the population for ecological studies?
In principle, the aggregate population in ecological studies can be
of any size, including households, classrooms, workplaces,
communities, geographic regions, or entire nations.
Most often they are geopolitically defined populations for which
the necessary data are routinely collected, inasmuch as most
ecological studies use existing data.
What are the possible study designs?
Group randomized trial
Ecological cohort studies
Cross-sectional ecological studies
Longitudinal ecological studies
LEVELS OF MEASUREMENT
What are the different levels of measurement?
Individual level: such as a person's age, gender, gun ownership
status, and so on. Individual level measurements, however, are
often unavailable in an ecological study.
Population-level measures can be divided into two kinds:
a. An aggregate measure simply summarizes the distribution of an
individual-level factor that may vary within a population. It is a
statistic derived from individual-level data. E.g. mean age, median
age, proportion of persons aged 65 years or older.
b. An intrinsic population-level measure (termed an integral
measure) characterizes an entire population as a unit. E.g. a
city's size, its population density, law
How can aggregate measures be contextual variables?
The aggregate measure of an individual characteristic can take on
new meaning at the population level by describing a feature of the
environment people live in. E.g. a person's risk of becoming a
homicide victim may be influenced not only by whether he or she
owns a gun, but also by the general availability of firearms in the
community, as reflected by the population prevalence of gun
ownership.
STUDYING EFFECTS OF INDIVIDUAL LEVEL EXPOSURES
How can individual level exposures be assessed?
association between exposure prevalence and disease frequency
at the population level serve as a proxy for the individual level
association
What are the advantages?
Examine individual level associations in that pre-existing
population level data may be readily available. If so, it is quick and
low cost.
When exposure frequency varies substantially between
populations but not very much within population
If the exposure is subject to a high degree of measurement error
or short term biological variation at the individual level
Estimating Attributable Risk and Relative Risk
1. Apply regression analysis to the group-level data, modeling disease
rate as a function of exposure prevalence. Several forms of
regression can be used for this purpose, such as y = mx + c
2. Use the fitted regression model to predict the disease rate for a
population in which everyone is exposed, i.e. when x = 1. Call that
rate R1. Similarly, predict the rate for a population in which nobody
is exposed (x = 0), and call that rate R0
3. Estimate relative risk as R1/R0 and attributable risk as R1 - R0
A theoretically preferable analysis would give greater weight to
data from larger countries, whose rates are subject to less sampling error.
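The three steps can be sketched with an unweighted straight-line fit, y = mx + c, where x is exposure prevalence (so x = 1 means everyone exposed and x = 0 means nobody exposed). The group-level data are hypothetical, and the unweighted least-squares fit is a simplification of the weighted analysis mentioned above.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data for four populations
prevalence = [0.10, 0.20, 0.30, 0.40]      # proportion exposed
rate = [12.0, 14.0, 16.0, 18.0]            # disease rate per 100,000 person-years

slope, intercept = fit_line(prevalence, rate)
r0 = intercept                 # predicted rate when x = 0 (nobody exposed)
r1 = slope + intercept         # predicted rate when x = 1 (everyone exposed)
relative_risk = r1 / r0
attributable_risk = r1 - r0
print(round(r0, 2), round(r1, 2), round(relative_risk, 2), round(attributable_risk, 2))
```

Both predictions are extrapolations well outside the observed prevalence range of 0.10 to 0.40, which is the source of the extrapolation problem with this approach.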
What can be major problems?
(R1) or (R0) can be negative, which is, of course, impossible for a
rate
Exposure prevalences of 0 or 1 often fall way beyond the range of
observed exposure prevalences among the populations studied,
leading to large extrapolation errors
Because the number of population data points in an ecological
study is often small, there is very limited power to determine
whether one model form fits significantly better than another
Results are highly model dependent, sample size may be too
small to determine which model fits the data set
What are the pitfalls?
The associations at the population level need not necessarily
reflect association of similar magnitude or even similar direction, at
the individual level. (ecological fallacy)
Cross-level bias occurs when an association at one level of
aggregation is assumed to represent the association at another
level, when in fact the associations at the two levels are unequal.
What can lead to cross level bias?
Group-level association between exposure prevalence and
baseline disease rate (rate in non-exposed persons): the group (such as
country) is itself a group-level confounder. It is associated with both outcome
and exposure for the following reasons:
a. The groups may differ on the distribution of one or more
extraneous individual-level risk factors, such as age and gender
b. An intrinsically group level factor may be a confounder. For eg: lax
(negligent) law
c. The exposure itself may have effects at the group level above and
beyond its effects at the individual level. eg: homicide risk to a gun
non-owner may be greater in a country where owning a gun is
common than where it is rare.
An example in infectious disease epidemiology is herd immunity.
Unequal distribution of effect modifier in the group:
Model misspecification: For many graded exposures, the
relationship between exposure and risk at the individual level is
nonlinear. The only available data may be the mean exposure level for
each group, which cannot capture information about the
distribution of individuals among different exposure levels. The
same mean exposure level could result from most individuals falling
near the mean, or from two subgroups at opposite ends of the
exposure range. These two patterns could correspond to quite
different expected overall disease rates.
Number of groups available for study may be small. As a result, a
simple linear or log linear model between disease rate and mean
exposure level may appear to fit the ecological data adequately,
even though it is actually a poor reflection of the individual-level
relationship of real interest.
What is the effect of non-differential measurement error?
In ecological studies, non-differential misclassification of exposure can cause estimates of excess
risk to be biased away from the null.
What can be done about non-differential measurement error?
If sensitivity and specificity data for the measure of exposure are
available, the size of this non-conservative bias can be estimated
and a correction made.
How can confounding operate?
Confounders can operate at either individual or the group level.
What can be done about confounding?
The possibility of nonlinear associations motivates using more
finely detailed information about distribution of the confounder in
each group, if this information is available. For eg: rather than
including just mean age in a group-level regression analysis in an
attempt to remove confounding by age, better control may be
gained by including several age related variables, each of which
reflects the proportion of group members falling into a particular
age group.
Rate standardization can also be used to control confounding in
ecological studies. When doing so, also
standardize the prevalence of exposure and of other covariates to
the same reference population.
When are ecological studies less biased?
when within-group variation in exposure is small but between
group variations in exposure prevalence is large
What is the drawback of this ?
Confounding at group level
STUDYING EFFECTS OF GROUP LEVEL EXPOSURES
To evaluate programs and policies that apply to entire
populations - an intrinsically group-level characteristic
Cross-level bias is also of less concern, because the target level
of inference is the group level, the level at which such an exposure
would be potentially modifiable.
What is the drawback?
Individual- or group-level confounding factors can still bias the observed
group-level association.
How can potential bias be addressed?
cross-classifying the study population by age, gender, race, state
and calendar time
STUDYING EXPOSURES AT TWO OR MORE LEVELS AT ONCE
Individual level studies may be carried out in only a single setting,
or they may deliberately match or stratify study subjects on area of
residence, thus controlling for neighborhood level influences
Having information at more than one level can permit a richer
and more complete conceptualization of how disease occurs,
leading in turn to a wider range of opportunities for prevention.
If such a group-level association is present, what might it represent?
The threat of methodological artifact, such as measurement error or
residual confounding, always lurks in the background.
Shared environmental exposures
Selection effects: e.g. people with asthma may move to cleaner
places
Contagion: prevalence of illness itself can affect the level of risk to
susceptibles by influencing their chance of exposure.
13. RANDOMIZED TRIAL
CHARACTERISTICS
Comparative study of two or more intervention strategies
Exposure determined by formal chance mechanism
Each participant has a known probability of being assigned to
either arm
Outcome of assignment process is uncertain
What are the special strengths?
Protect against confounding, even by factors that may be
unknown or difficult to measure
Provide sound basis for statistical inference
Facilitate blinding
What favors RCT?
The exposure must be potentially modifiable
The exposure must be potentially modifiable by the investigator
There is genuine uncertainty about which intervention strategy is
superior
The primary outcomes are relatively common and occur relatively
soon
Explanatory vs Pragmatic Aims
Overall aim
Explanatory: Test scientific theory
Pragmatic: Guide practical decision-making
Experimental arm
Explanatory: Guided by theory to be tested, even if not necessarily suitable for wide use
Pragmatic: Feasible for routine application
Control arm
Explanatory: Placebo or other theoretically relevant alternative
Pragmatic: Best practical alternative
Eligibility
Explanatory: Often narrow, may include pre-screening for compliance
Pragmatic: Often broad, to represent potential target population
Outcomes
Explanatory: Those of interest for testing theory
Pragmatic: Those most salient to key decision makers, e.g. patients, clinicians, public health policy makers
What are the treatment arms?
Experimental
Control: Nothing, Placebo, Active alternative, Usual care
SELECTION OF STUDY SUBJECTS
What are the selection criteria
Eligibility
Internal validity
Generalizability
Risk and benefits to subjects
What affects internal validity?
Subject retention
Data quality
Compliance
Drop out
Statistical power: probability of experiencing a key outcome
Number of study subjects
For a parallel-groups trial with two equal-sized groups and a binary outcome:
Choosing values for
What are pitfalls of taking treatment effect based on pilot study?
underestimating intervention effect, can cause worthwhile
intervention to be abandoned prematurely
Overestimating intervention effect, causing main study to be
underpowered
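The sample size formula from the notes did not survive above. As a hedged illustration only, the sketch below assumes the common normal-approximation formula for comparing two proportions with equal group sizes. The function name, default alpha and power, and the example proportions are assumptions, not values from the notes.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p_control, p_experimental, alpha=0.05, power=0.80):
    # Normal-approximation sample size per arm for a two-sided test
    # comparing two proportions (assumed formula, equal group sizes).
    if p_control == p_experimental:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_experimental) / 2
    term_null = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    term_alt = z_beta * sqrt(
        p_control * (1 - p_control) + p_experimental * (1 - p_experimental)
    )
    return ceil((term_null + term_alt) ** 2 / (p_control - p_experimental) ** 2)


# Planned on an optimistic pilot estimate (20% vs 10%) ...
print(n_per_group(0.20, 0.10))
# ... versus the size needed if the true effect is smaller (20% vs 15%).
print(n_per_group(0.20, 0.15))
```

Comparing the two calls shows the second pitfall: a trial sized for the larger pilot effect would be underpowered if the true effect is the smaller one.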
INFORMED CONSENT
What are necessary elements of informed consent?
Awareness of participation in research
Procedures to be followed
Risks and discomfort
Potential benefits to self and others
Alternative treatments or procedures available
Confidentiality, data-retention provisions
Compensation should injury occur (if more than minimal risk)
Whom to contact if questions
Voluntary nature of participation and right to withdraw without
penalty or loss of benefits
RANDOMIZATION
Why randomize?
Protection against known and unknown confounding
Not costly, time consuming or difficult to do properly
Assignment list can usually be made up and checked in advance,
before any participants are enrolled, provided it is kept adequately
concealed
What are the three issues in randomization?
1. Sequence generation: Simple, Blocked, Stratified
Suggestion on choice of randomization approach
Method: Good choice when
Simple: Expected total n>200 and no interim analyses planned
Block, single block: Total sample size known in advance
Block, many small blocks: Wish to keep group sizes balanced throughout trial to facilitate interim analyses and possible early termination
Stratified Small trial (n
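A minimal sketch of sequence generation with many small blocks, assuming two arms labelled "A" and "B" and a block size of 4. The function name, arm labels and seed are hypothetical. A stratified scheme could run one such sequence per stratum.

```python
import random


def permuted_block_sequence(n_participants, block_size=4, arms=("A", "B"), seed=None):
    # Each block contains every arm equally often, in random order,
    # so group sizes never drift far apart during enrolment.
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_participants:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        sequence.extend(block)
    return sequence[:n_participants]


assignments = permuted_block_sequence(12, block_size=4, seed=2019)
print(assignments)
```

The list can be generated and checked before enrolment begins, but it must be kept concealed from those recruiting participants.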
What is intent-to-treat principle?
Primary analysis should compare outcomes between the groups
formed by randomization
What situations make it tempting to depart from intent-to-treat?
Non compliance
Cross over
Late exclusion: participant dropped from analysis after
randomization
What to do when these things happen?
Intent-to-treat should nearly always define the primary comparison
In pragmatic trials, discrepancies between intended and received
treatments may reflect real life
In explanatory trials, price of intent-to-treat can be higher
Interferes with estimating efficacy
Still, direction of bias is known and conservative
Design features in explanatory trials often aim at minimizing
discrepancies: e.g. tight eligibility criteria, run-in phase
Randomize as late as possible
ESTIMATING EFFICACY INDIRECTLY
Example
Counterfactual view of situation
Suppose controls had been allocated to experimental treatment
By virtue of randomization:
Would expect a similar proportion not to have received active
treatment
Would expect incidence among them to be similar to that actually
observed among non-recipients in experimental group
Can then estimate, by subtraction, experience of controls who
would have received experimental treatment, had they been
assigned to it
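The subtraction argument above can be sketched numerically. All counts below are hypothetical, and the function name is an assumption. The sketch rests on the assumption stated in the notes: by randomization, the control arm contains the same proportion of would-be non-recipients, with the same incidence, as the experimental arm.

```python
def efficacy_by_subtraction(exp_recipients, exp_recipient_events,
                            exp_nonrecipients, exp_nonrecipient_events,
                            control_n, control_events):
    # Proportion and risk of non-recipients observed in the experimental arm.
    prop_nonrecipient = exp_nonrecipients / (exp_recipients + exp_nonrecipients)
    risk_nonrecipient = exp_nonrecipient_events / exp_nonrecipients

    # Expected would-be non-recipients (and their events) among controls.
    ctrl_nonrecipients = control_n * prop_nonrecipient
    ctrl_nonrecipient_events = ctrl_nonrecipients * risk_nonrecipient

    # Subtract to get controls who would have received treatment if assigned.
    ctrl_would_be_recipients = control_n - ctrl_nonrecipients
    ctrl_would_be_events = control_events - ctrl_nonrecipient_events

    risk_treated = exp_recipient_events / exp_recipients
    risk_untreated = ctrl_would_be_events / ctrl_would_be_recipients
    return risk_treated, risk_untreated, risk_treated / risk_untreated


# Hypothetical trial: 1000 per arm; 800 of the experimental arm took treatment.
risk_treated, risk_untreated, ratio = efficacy_by_subtraction(
    800, 40, 200, 30, 1000, 110)
print(risk_treated, risk_untreated, ratio)
```

In this made-up example the intent-to-treat comparison (70/1000 vs 110/1000) understates the effect among those who would actually take the treatment.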
Subgroup Analysis
What is proper subgroup analysis based on?
Inherent participant characteristics that treatment group could
not affect
Other characteristics measured before randomization
Limit number of subgroup hypotheses
Use test of interaction to reduce multiple comparison problem
Interpret post-hoc subgroup differences with great caution
What is improper subgroup analysis?
Characteristics measured after randomization that could be
affected by treatment group. Examples: Compliance, response to
treatment
What is pitfall of subgroup analysis?
the more ways one looks for subgroup differences the more likely
it is that some statistically significant ones will be found, even if
they reflect only the play of chance
Because each subgroup is smaller than the full study population,
statistical tests for a treatment effect within a subgroup have less
power
What can be done if subgroup analysis is important to do?
Increase trial size accordingly during planning phase
DESIGN VARIATIONS
Factorial design
What are the attractions of factorial design?
Can tease apart two or more interventions
If interventions are synergistic or antagonistic when used in
combination, can find out
For overall effects, get two (or more) studies for not much more
than price of one
Sequential Trials
What is the attraction of sequential trial?
Allows termination of trial if one arm emerges as clearly superior
What is the drawback? And remedy?
Multiple comparisons can inflate alpha (probability of type I error);
the remedy is to use biostatistical methods designed to deal with this
Randomization of body parts within an individual
Attraction?
Minimize confounding
Cross over trial
Randomize order of exposure
Attraction?
Each study subject serves as his/her own control
Completely prevents confounding from individual level factors such as
age, gender, comorbidity
Increased statistical power or smaller sample size requirement
N-of-1 trials
Randomization of intervals of time
Group randomized trial
What is the attraction?
When by its nature, intervention applies to entire group. E.g. laws/policies, mass media campaign, environmental modifications
Intervention has spillover effects to others through social
interaction
Intervention effects are thought to be transmissible from person
to person
What can be the drawback?
Unacceptable risk of contamination within group if smaller units
randomized
Complexity, cost
When number of groups randomized is small, as it often is, greater risk of
imbalance in groups formed by randomization
Statistical power usually much lower than an individually randomized study of same size
Requires more complex statistical analysis to obtain valid
confidence interval limits and p-values
14. COHORT STUDIES
CHARACTERISTICS
Measures occurrence of illness in persons of differing exposure
Retrospective or prospective
No random assignment of study subjects
Exposure-disease induction/latent period is not too long
COHORT IDENTIFICATION
Geographic
Special exposure group
Special resources for cohort identification: records from life insurance, health insurance, union records, alumni records,
population based disease registries, other
What is limitation of using individual level measurement?
SOURCES OF DATA ON EXPOSURE
Records: Occupational, medical, pharmacy, prepaid health care
plans, census
Interview or questionnaire
Direct measurement: individual and environment
What is the limitation of having direct measurement?
If characteristic being measured exhibits short term variability
(e.g. serum cholesterol or systolic blood pressure) the value
obtained will not necessarily reflect the study participant's long
term mean level. The impact of this misclassification will be to dull
the study's ability to assess the degree of association.
ESTIMATING THE EXPECTED OCCURRENCE OF DISEASE AMONG
EXPOSED COHORT MEMBERS
What are the options for comparison group?
Disease status can be contrasted among heterogeneous exposure
status
Members of the cohort who are exposed to other exposure
known not to influence the disease (eg Asbestos worker vs cotton
textile worker)
If no difference is found, what can be the drawback in drawing
conclusion?
There can be two conclusions, either there was no association or,
other exposure is associated with an altered risk to a similar degree
Health outcome present in the geographic population in which
cohort members reside
When is this (general population) approach commonly used?
When death is outcome
When will this approach (general population) provide bias?
When size of non-exposed group is relatively small, and the
outcome of interest is relatively uncommon
When interpreting, what should be considered?
Have the outcome events under study been ascertained
comparably between the exposed cohort and the general
population?
To what extent has the rate of illness in the exposed cohort
influenced the size of the rate of the population as a whole?
On an average, are cohort members different from the general
population in ways that bear on disease incidence or mortality,
beyond difference in those demographic characteristics that are
measured in both groups and for which statistical adjustment can be
performed? (e.g. soldiers vs general population) (healthy worker
bias; Sick retiree bias)
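A sketch of the general-population comparison, assuming indirect standardization (an observed/expected ratio, often called the SMR when death is the outcome). The notes do not name the method, and the strata, rates and counts below are hypothetical.

```python
def expected_events(person_years_by_stratum, reference_rates):
    # Apply general-population rates to the cohort's own person-years,
    # stratum by stratum (e.g. age group), and add up.
    return sum(person_years_by_stratum[s] * reference_rates[s]
               for s in person_years_by_stratum)


def observed_to_expected(observed, person_years_by_stratum, reference_rates):
    return observed / expected_events(person_years_by_stratum, reference_rates)


# Hypothetical exposed cohort and general-population death rates per person-year.
cohort_person_years = {"40-49": 10000, "50-59": 5000}
population_rates = {"40-49": 0.001, "50-59": 0.004}

expected = expected_events(cohort_person_years, population_rates)
ratio = observed_to_expected(45, cohort_person_years, population_rates)
print(expected, ratio)
```

The adjustment covers only the characteristics used to form the strata. The questions above about comparable ascertainment and healthy worker bias still apply to the resulting ratio.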
What should be considered when comparing rates of illness or
death between patients who have received a specific medical
intervention and the population as a whole?
Could the condition that necessitated the treatment itself have an
impact on the incidence of the disease under study? (confounding
by indication)
At the time treatment was being considered, were members of
the treated group evaluated for the presence of a condition, with
only those not having the condition allowed to receive the
treatment? (Healthy screenee bias)
What can be done for this?
Omit from the analysis the part of the follow up experience of
exposed individuals that is most susceptible to these biases. i.e. that
which accrues (accumulates or adds) relatively soon after exposure
status is defined. (e.g. breast cancer counted after 3 years among
those with breast implants)
OUTCOME DEFINITION AND ASSESSMENT
Are there standard criteria to define outcome?
Is it assessed similarly among cohort and comparison group?
Is outcome measured at same period of time?
FOLLOW UP OF COHORT MEMBERS
When is validity of result threatened?
If there is under-ascertainment of disease, especially if it occurs only in the
exposed group
What are eligibility criteria?
Reachable throughout study period
Stable to maximize the likelihood of successful follow-up
Restriction of unstable subgroup
NATURE OF THE ILLNESS OUTCOME: INCIDENCE VS PREVALENCE
If the prevalence is measured
It is called cross sectional study
ISSUES IN ANALYSIS AND INTERPRETATION
For purpose of analysis, how soon after exposure should outcome
events that occur in cohort members begin to be counted?
Generally, immediately
Exceptions
Healthy worker bias
Healthy screenee bias
Persons whose diagnosis was made early in the follow-up period may have had that
disease present in hidden form before exposure commenced
(presence of disease in those persons could not have been affected
by the exposure, but it could possibly have influenced the likelihood of
receipt of exposure, or the presence or level of a characteristic
under study)
Always consider whether reverse causality could be possible
In pharmacological research, what bias can be encountered?
Immortal time bias
How are changes in exposure status of cohort members handled?
Data taken to reduce exposure misclassification
Person years contribute to denominator of exposed group, after
change of status
When duration is important, cohort members cannot be
permitted to contribute events to the numerator nor person-years
in denominator until they meet the criteria for a particular category
of duration
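A sketch of how person-time might be split when exposure status changes during follow-up. The function name and the dates are hypothetical. Time before the change counts as unexposed, which is also what guards against immortal time bias.

```python
def split_person_years(entry, end, exposure_start=None):
    # Returns (unexposed_years, exposed_years) for one cohort member.
    # Follow-up before exposure begins is credited to the unexposed group.
    if exposure_start is None or exposure_start >= end:
        return end - entry, 0
    start = max(entry, exposure_start)
    return start - entry, end - start


# Member followed 2000-2010 who first became exposed in 2004.
print(split_person_years(2000, 2010, 2004))
# Never exposed during follow-up.
print(split_person_years(2000, 2010))
# Already exposed at cohort entry.
print(split_person_years(2000, 2010, 1995))
```

When duration of exposure matters, the same idea extends to more categories: person-years and events are credited to a duration category only from the date its criteria are met.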
When is counting stopped?
Get full range of consequence as far as possible
15. CASE CONTROL STUDIES
Are cases and controls enrolled from the same underlying
population at risk?
Is there possibility of reverse causality?
ASCERTAINMENT OF EXPOSURE
Is exposure measured validly?
Is information on exposure present during an etiologically
relevant period?
Eg: Alcohol and accident: minutes to hours
Alcohol and cirrhosis: years
Is the exposure ascertained using same method in both cases and
controls?
Is exposure associated with mortality? Leading to less number of
people present to have developed the disease? (eg. Pg 397)
1. Interviews/questionnaires
Are questions similar and similarly asked ?
Are exposures included after illness began?
Are cases and control recalling exposure differently?
Socially sensitive issue, cases more honest
Other cases, cases recall better as they are interested in the
health event
Is non-response similar among cases and controls?
What are strengths?
Information obtainable about non recorded exposures
Information obtainable about etiologically relevant exposure
What can be done to maximize accuracy?
Define reference date for both cases and controls
Use: visual aids like picture, medicine bottles
Use controls with other disease
Withhold information about study hypothesis
Standardized validated instrument
Show cards
Mental prompts: timing in relation to important events
Verification of exposure information
2. Records
What are different sources of records? Vital, Registry, Employment, Medical, Pharmacy
What are the strengths?
objective exposure information routinely collected
Often more detailed exposure information than subject self-report
Vital statistics data routinely available
Hospital discharge data often available
What are limitations of using record?
Usually kept for different purpose, so may not provide
information precisely
Eg: Death certificate may list occupation but not exposure
Pharmacy list of prescribed drugs, but not information whether
they were taken
Etiologically relevant exposure time period may not be captured
Often more complete for cases than controls
Data quality and completeness are often an issue
Is information restricted to the point when the case is diagnosed?
Is information restricted to the same point for controls?
3. Physical and Laboratory measurement
What are limitations of laboratory measurement?
Post diagnosis exposure levels may not reflect pre-disease levels
Pre-diagnosis exposure may not reflect etiologically relevant time
period
Are levels measured following identification of cases and control?
Can rely: lead in dentine, BCG scar
Can't rely: hormone status
Which records to be excluded from analysis?
Those obtained within the period prior to diagnosis that might
correspond to the duration of preclinical stage of disease
Exception: genetically determined characteristics
CASE DEFINITION
Are there all (or representative sample) of members of defined
population who developed a given health outcome?
Can some person with disease go undiagnosed? If yes, Is there
reason to believe that exposed persons are relatively less likely to
go undiagnosed?
Is there chance that exposure status may influence a case's
likelihood of diagnosis and therefore selection for study?
If yes, define objectively; focus on more seriously ill cases
What are the criteria to identify and select cases for study?
Objective
Sensitive and specific
Specificity is of particular concern because, inadvertent inclusion
of persons without disease in the case group will generally obscure
any true association with the exposure
Eg: in study of Reye's syndrome and aspirin, they included only severe
cases, so that non-cases and misclassification could be avoided,
especially when there was general notion among physicians
regarding association with exposure.
What are sources of getting cases?
Geographically, members of health plan, occupational group,
registry
What is challenge of ascertainment of case from population based
case selection?
Complete ascertainment
Use capture/recapture method/ out of area health events
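A sketch of a two-source capture/recapture estimate of the total number of cases, assuming two independent case-finding sources and a closed population. The notes do not specify an estimator. The Lincoln-Petersen and Chapman forms are shown, with hypothetical counts.

```python
def lincoln_petersen(n_source1, n_source2, n_both):
    # Estimated total cases = n1 * n2 / (cases found by both sources).
    if n_both == 0:
        raise ValueError("no overlap between sources; estimate undefined")
    return n_source1 * n_source2 / n_both


def chapman(n_source1, n_source2, n_both):
    # Chapman's variant; less biased when the overlap is small.
    return (n_source1 + 1) * (n_source2 + 1) / (n_both + 1) - 1


# Hypothetical: registry finds 100 cases, hospital records 80, 40 in both.
total_lp = lincoln_petersen(100, 80, 40)
total_ch = chapman(100, 80, 40)
found = 100 + 80 - 40
print(total_lp, total_ch, found / total_lp)
```

The last printed value is the estimated completeness of ascertainment from the two sources combined.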
Are they drawn in an unselected manner with regard to exposure
status? Eg: including all eligible cases
Are they incident or prevalent cases?
Goal of etiology is to have incident cases
ISSUES IN CASE SELECTION
Inclusion of Prevalent cases
Under what circumstances it may be necessary to enroll prevalent
cases?
For some conditions, date of occurrence is unknown: eg: HIV
infection; For uncommon disease of long duration, incident series
may yield too few cases
What is disadvantage of adding prevalent cases?
Problems of accurate exposure ascertainment
If date of occurrence is known, exposure information must be obtained for more
distant points in the past, on average, than would be necessary for an
incident series
If date of occurrence is unknown, there will be uncertainty about the
best point in time before which one should elicit (produce)
exposure information.
By studying persons remaining alive with a given condition, one is
studying at the same time not only etiologic factors, but factors that
influence the duration of the condition, including those associated
with survival
Length biased sampling: cases with long lasting disease more
likely to be sampled; associations may be with disease duration, not
etiology
Inclusion of Diagnosed cases without disease
What is the impact of including cases without disease?
diagnosis may depend on presence of exposure
Over diagnosis may threaten internal validity more than under
diagnosis
Inclusion of cases only from the portion of the population
What is the impact?
Missing cases are not missed systematically
Missing due to death- those with longest survival are
preferentially included
Characteristics of tertiary care sites cases (clinic based studies)
Asymptomatic and symptomatic undiagnosed cases (population
based studies)
Influence of exposure on likelihood of diagnosis among truly
diagnosed persons
Are controls persons who would have been diagnosed had they become ill?
Do they have similar access to diagnostic services?
Are they willing to undergo the diagnostic procedure?
CONTROL DEFINITION
When is a control group not needed?
Occasionally, the proportion of ill persons who have had a specific
exposure is so high, unequivocally more than would be expected
in the population they were derived from, that the presence of an
association (though not its magnitude) can be surmised from a case
series alone. Eg: Pneumonia due to ingestion of adulterated
rapeseed oil in Spain in 1981
Ideal control group
Are controls at risk for developing disease?
Are the controls selected from a population whose distribution of
exposure is that of the population the case arose from?
If not, Selection bias
Are they identical to the cases with respect to their distribution of
all characteristics?
That influence the likelihood and/or degree of exposure, and
That, independent of their relationship to exposure, are also
related to the occurrence of the illness under study or to its
recognition
If not, Confounding
Can presence of exposure be measured accurately and in a manner
that is identical to that used for cases?
If not, information bias
Minimizing selection bias
Population based controls
Are controls selected from same population as cases?
Geographically defined population: Random digit dialing of
telephone numbers, area sampling, neighborhood sampling, voters
list, population registers, motor vehicle licenses, birth certificates
etc
Prepaid health care plan: who were members of the same health
plan when the illness or injury occurred?
Employed population: same group of employees
What are the drawbacks of random digit dialing?
Household identification: change of telephone number, have only
cell phones
Enumeration: answering machine screening of calls, inaccurate
response about eligibility
What are drawbacks of population based controls?
Not known to be free from disease
Response rate may be low and may not be unbiased sample of
population
Hard to identify if no list exist
Non-responding population based controls have been shown to include
more smokers and to be less educated and younger
What is effect of inclusion of diseased subjects in control group?
Benefit of population based controls outweighs
misclassification of some. (see lecture notes)
Examine for disease (after selection and if feasible)
estimate amount of undiagnosed disease
estimate resulting bias
Clinic based controls
Are cases selected from few hospitals or clinics?
If yes,
Are controls chosen from persons who, had they developed the
illness under study, would have received care at these hospitals or
clinics?
No selection bias
Do persons who do and do not receive care from these sources
differ with regard to their frequency or level of exposure?
Yes selection bias
What is drawback of having other ill people as control?
Hospitalized or clinic based controls may not be typical of those in
population from which cases arose in terms of exposure of interest
(don't represent population from where cases are coming)
Ill or recently diseased persons tend to have been smokers of
cigarettes more often than other people. Because smoking history of ill
persons overstate the cigarette consumption of the population from
which the cases arose, the odds ratio associated with smoking based
on the use of ill persons as controls will be spuriously low.
How to remove selection bias when taking ill controls?
Omit potential controls with conditions known to be related
(positively or negatively) to exposure. Eg : in study of bladder cancer
and prior use of sweeteners, excluded control who were hospitalized
for obesity related disease
this is successful, if can be judged correctly which conditions truly
are exposure related, and how accurately the presence of those
conditions can be determined.
But for cigarette smoking and alcohol drinking, it has been shown
that admitting diagnoses or statements of cause of death are incapable
of identifying the persons with illnesses related to these exposures.
What is advantage of selecting controls chosen from individual
who are tested for the presence of disease and are found not to
have?
inexpensive to find
comparability with regard to the choice of health care provider
this will increase study's validity if disease being investigated is
generally asymptomatic and so would not be detected in the
absence of testing
In situ cancer example: oral contraceptives and in situ cancer of cervix.
Women who use oral contraceptives were more likely to get
screening; in situ cancer can be in asymptomatic form and shall be
discovered only through screening. If controls were chosen from
general population, who may or may not have received cervical
screening, an apparent excess of oral contraceptive users would be
present among cases of in situ cancer even if no true association was
present
What is drawback of having controls that are test negative?
Those with a diagnostic evaluation but confirmed not to have
disease may not be typical of those in the population from which cases
arose (if they are in hospital, they have some problems so they
Will detract from study's validity if large majority of persons who develop
the disease soon would get diagnosed whether or not the test was
administered
Eg: Endometrial cancer and postmenopausal estrogen. Controls were
chosen from those who underwent biopsy and found negative, because
of hidden cases in the population. However, a group of scientist
believed that there were no such hidden cases. And also, estrogen use
predisposes bleeding leading to biopsy. So, they claimed that risk
estimate was spuriously high.
How is selection bias introduced if exposure information is not
received from all participants of study?
If the frequency of missing data and the degree to which exposure
frequencies or level differ between study subjects for whom exposure
status is and is not known.
Minimizing information bias
Are the questions asked in identically to both cases and controls?
Are the past exposures or events more salient to persons with an
illness? recall bias
Are these socially undesirable questions?
Eg: in prenatal study of malformation, control taken with other types
of malformation; anal intercourse and anal cancer, control were
with colon cancer
Are the questions very subjective? Eg: stress or shock producing
events and Down syndrome. Controls: mothers in general, OR=17;
controls: mothers of other mentally retarded children, OR=4.3
If questions for fatal diseases are asked of surrogates of cases, then
whom to ask for controls?
Though controls themselves might give more accurate answer, it
is better to ask from their surrogates for purpose of comparability
What is drawback of getting information from surrogates?
Misclassification of the exposure, especially by surrogates of
control, For e.g. study on radiation exposure and cancer
What is way to minimize information bias?
Blinding to those who are collecting information
Example of information bias in records
Records of endometriosis are higher among women with
infertility. However, women with infertility undergo laparoscopy as
a diagnostic tool to investigate the possible presence of conditions
such as endometriosis.
CONTROL OF CONFOUNDING IN CASE CONTROL STUDIES
Does the proportion of cases and controls vary across levels or
categories of the potential confounding factor?
Means of controlling confounding?
Restriction: Restrict cases and controls to a single category or level
of the potential confounder, e.g. in a study of physical activity and cardiac
arrest, exclude people who have clinically recognized heart disease
that could both predispose to cardiac arrest and affect physical activity
What is drawback of restriction?
Shrink pool of available subjects, especially because we are doing
case-control study for rare disease
limits generalization of results
Can't see effect modification
Adjustment: of potentially confounding factor in analysis phase
Matching
Individual matching
Frequency matching
Is matching alone sufficient to control for confounding?
No, should be considered in analysis as well
Appropriate to match Yes Is variable, one of the exposures of interest?
Is variable strongly associated with disease?
Is it inexpensive to do matching?
Is cost of ascertainment of exposure expensive?
What is drawback of matching?
Missing of possibly large fraction of cases
May be overmatching, Is it surrogate for exposure
measurement?
Matching can induce confounding
Is the factor associated with only exposure?
CASE CONTROL STUDIES THAT DIRECTLY COMPARE DISEASE OR
EXPOSURE SUBGROUP
Compare between different types of disease
Compare among different level of exposure
What are the possible interpretations?
e.g. OR > 1 with alcohol and HPV +ve oropharyngeal cancer:
Is alcohol a risk factor for HPV +ve cancer? OR
Is alcohol a protective factor for HPV -ve cancer?
Case control study superior over other study design
Is the disease too rare for prospective studies?
Is the induction period too short? Eg: alcohol-injury
Is the exposure-to-disease period very long?
Does it allow studying multiple exposure?
Allows to obtain information when exposure records do not exist
ESTIMATING SAMPLE SIZE REQUIREMENT
How many controls are needed?
4 controls per case is enough for maximizing power
Depends on cost
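A rough sketch of why returns diminish beyond about 4 controls per case. It assumes the commonly quoted approximation that, near the null, a study with r controls per case has relative efficiency r/(r+1) compared with one using unlimited controls. This approximation is an assumption and is not taken from the notes.

```python
def relative_efficiency(controls_per_case):
    # Approximate efficiency relative to unlimited controls: r / (r + 1).
    return controls_per_case / (controls_per_case + 1)


for r in (1, 2, 4, 10):
    print(r, round(relative_efficiency(r), 3))
```

Going from 1 to 4 controls per case raises the approximate efficiency from 0.5 to 0.8, while going from 4 to 10 adds comparatively little, so cost usually decides.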
ANALYSIS OF CATEGORICAL EXPOSURE
Odds ratio
OR is an estimate of RR in case-control studies
Under assumption that the disease is rare in both exposed and
unexposed persons (
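A small numerical sketch of the rare-disease assumption, using hypothetical full-cohort counts so that both RR and OR can be computed. Function and argument names are assumptions.

```python
def risk_ratio(exposed_ill, exposed_well, unexposed_ill, unexposed_well):
    risk_exposed = exposed_ill / (exposed_ill + exposed_well)
    risk_unexposed = unexposed_ill / (unexposed_ill + unexposed_well)
    return risk_exposed / risk_unexposed


def odds_ratio(exposed_ill, exposed_well, unexposed_ill, unexposed_well):
    # Cross-product ratio of the 2x2 table.
    return (exposed_ill * unexposed_well) / (exposed_well * unexposed_ill)


# Rare disease: 20/10000 exposed vs 10/10000 unexposed become ill.
rare = (20, 9980, 10, 9990)
# Common disease: 4000/10000 exposed vs 2000/10000 unexposed become ill.
common = (4000, 6000, 2000, 8000)

print(risk_ratio(*rare), odds_ratio(*rare))
print(risk_ratio(*common), odds_ratio(*common))
```

With the rare disease the OR is very close to the RR of 2; with the common disease the OR overstates it.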
16. INDUCTION PERIOD AND LATENT PERIOD
Is the time period relevant for etiology?
What happens if the period is not relevant?
Induction period: Interval between presence of exposure and initial presence of disease
Latent period: Interval between initial presence of disease and its recognition
How to measure induction/latent time?
The distribution of the length of time required for an exposure to
give rise to disease can be estimated by examining the relative risk
associated with that exposure over successive periods of time after
it was sustained. E.g. Leukemia in Hiroshima and Nagasaki
Enumerating times when cases occurred following the exposure
when nearly all exposed cases are due to exposure. Eg DES
Examination of variations in disease occurrence across
populations, or within a population over time
INFLUENCE OF THE SUSPECTED INDUCTION/LATENT
PERIOD ON STUDY DESIGN
Short induction/Latent period
What is the best study design for short induction/latent period ?
Difficult to perform cohort or case control studies, e.g. alcohol
consumption and myocardial infarction
Case-cross over study can be an option
What is problem with RCT and cohort studies?
the incidence in short time is small
What is problem with Case-control studies with short
induction/latent period?
Validity of studies investigating possible short term effects rests
on the comparability of exposure ascertainment between cases and
controls
May require recruiting a control group that is not representative of
the general population
If the exposure of interest is rare during the short period, then
will have little power to assess even moderate or large RR
What can be the possible limitations of case-cross over studies?
How much misclassification of exposure status may have occurred
from incorrectly judging the length of the relevant window of
exposure prior to disease onset?
To what extent was confounding influence of other exposures
taken into account? In case-cross over, confounding by factors that
do not vary to any appreciable extent over short period of time is
eliminated but those that can vary over short period of time can not
be eliminated
Relevancy of case-cross over studies
Can exposure status be assessed during both the hazard interval
and control interval? Can it be assessed comparably in each
interval?
Recall may differ between the hazard (incident) period and the control period
Has the duration of presumed induction period been judged
correctly?
May have misclassification related to time period
Could exposures other than the one under investigation vary over
time in the same way?
Confounders that vary over time, eg alcohol, smoking
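A sketch of a simple case-cross over analysis, assuming one hazard interval and one control interval per case and the usual matched-pair estimator (ratio of discordant pairs). The notes do not specify an analysis, and the data below are hypothetical.

```python
def case_crossover_odds_ratio(pairs):
    # pairs: one (exposed_in_hazard, exposed_in_control) tuple per case.
    # Only discordant cases carry information about the association.
    pairs = list(pairs)
    hazard_only = sum(1 for hazard, control in pairs if hazard and not control)
    control_only = sum(1 for hazard, control in pairs if control and not hazard)
    if control_only == 0:
        raise ValueError("no cases exposed only in the control interval")
    return hazard_only / control_only


# Hypothetical 100 cases of injury, exposure = recent alcohol use.
cases = ([(True, False)] * 30 + [(False, True)] * 10
         + [(True, True)] * 5 + [(False, False)] * 55)
print(case_crossover_odds_ratio(cases))
```

Because each case is compared with itself, stable individual characteristics cancel out. Factors that change between the two intervals do not, and a misjudged hazard window would misclassify exposure.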
Long induction/latent period
What are the problems?
Current exposures with future follow up will take a long time to
complete
Exposure status may change, necessitating future exposure
measurement
Often hindered by absent or imprecise measures of exposure status
What can be done?
Get exposure from records if possible
Memory can be used for specific exposure
Have patience
Invariant exposures can be studied with case-control studies
Use surrogate outcomes, surrogate exposure where possible
What can be good research design?
Nested case control study
17. IMPROVING SENSITIVITY OF EPIDEMIOLOGICAL STUDIES
SENSITIVE STUDY
If an exposure truly has the capacity to cause a disease, at least in
a portion of exposed individuals, a sensitive epidemiologic study is
one that will observe an association between that exposure and the
disease.
What are the strategies to enhance sensitivity,
irrespective of the size of the available study population?
Disaggregation of categories of the exposure of concern that are
heterogeneous with respect to their impact on disease occurrence
Disaggregation of disease entities that are heterogeneous with
respect to their association with the exposure of concern
Disaggregation of study subjects who, because of the presence of
one or more other exposures or characteristics, are not affected to
the same degree by the exposure of concern
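Why disaggregation helps can be seen with a small calculation. This is a minimal sketch with hypothetical counts, assuming the exposure triples risk in a small affected subgroup and has no effect in everyone else: the pooled relative risk is diluted toward 1, while the subgroup-specific relative risk is not.

```python
def relative_risk(cases_exp, n_exp, cases_unexp, n_unexp):
    """Cumulative incidence in exposed persons divided by that in unexposed persons."""
    # Cross-multiplied form of (cases_exp / n_exp) / (cases_unexp / n_unexp).
    return (cases_exp * n_unexp) / (cases_unexp * n_exp)

# Hypothetical counts:
# (exposed cases, exposed persons, unexposed cases, unexposed persons)
affected = (6, 200, 2, 200)      # exposure triples risk in this subgroup
not_affected = (8, 800, 8, 800)  # exposure has no effect in this subgroup
pooled = tuple(a + b for a, b in zip(affected, not_affected))

rr_affected = relative_risk(*affected)
rr_not_affected = relative_risk(*not_affected)
rr_pooled = relative_risk(*pooled)
print(rr_affected, rr_not_affected, rr_pooled)
```

A weaker pooled association is harder to distinguish from no association at a given study size, which is why separating the subgroups can make the study more sensitive.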
What happens when two exposures act through separate
means to produce disease?
The relative impact of either of them is greater in that segment of
the population in which the other exposure is absent
What happens when two factors have the capacity to act
together in a single causal pathway leading to disease?
The incidence of that disease in persons in whom both factors are
present would be more than the sum of the two rates produced by
either factor's presence alone.
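Both answers can be checked with a short calculation. This is a sketch under stated assumptions: the incidence figures (per 100,000 person-years) are hypothetical, and "the rate produced by" a factor is read as its excess over the background rate.

```python
def expected_if_separate(i00, i10, i01):
    """Joint incidence expected when each factor simply adds its own excess rate."""
    return i00 + (i10 - i00) + (i01 - i00)

def excess_over_additive(i00, i10, i01, i11):
    """Observed joint incidence minus the incidence expected under additivity."""
    return i11 - expected_if_separate(i00, i10, i01)

# Hypothetical incidence per 100,000 person-years
i00 = 10  # neither factor present
i10 = 30  # factor A only
i01 = 50  # factor B only

# Separate means: A adds 20 cases per 100,000 whether or not B is present.
i11_separate = expected_if_separate(i00, i10, i01)
rr_a_without_b = i10 / i00
rr_a_with_b = i11_separate / i01

# Shared pathway: observed joint incidence exceeds the additive expectation.
i11_observed = 150
excess = excess_over_additive(i00, i10, i01, i11_observed)
print(rr_a_without_b, rr_a_with_b, excess)
```

With separate means, the relative risk for A is larger where B is absent because the same excess is divided by a smaller baseline.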
VARIATION IN SIZE OF RELATIVE RISK ACROSS SUBGROUPS
Look for variation in RR or AR
What happens when the incidence of disease differs in two
subgroups?
If the RR associated with another exposure is the same in each of the
subgroups, the corresponding AR will differ, possibly to an
important degree
Differences in the RR associated with an exposure between the
subgroups may simply reflect the exposure adding to
the risk by the same amount in each subgroup
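A short calculation illustrates both statements. The incidence figures below are hypothetical (per 100,000 person-years), and the function name is only for illustration.

```python
def rr_and_ar(incidence_unexposed, incidence_exposed):
    """Return (relative risk, attributable risk) for one subgroup."""
    rr = incidence_exposed / incidence_unexposed
    ar = incidence_exposed - incidence_unexposed
    return rr, ar

# Same RR in both subgroups, different baseline incidence: AR differs tenfold.
same_rr_low = rr_and_ar(10, 20)
same_rr_high = rr_and_ar(100, 200)

# Same AR (10 extra cases per 100,000) in both subgroups: RR differs.
same_ar_low = rr_and_ar(10, 20)
same_ar_high = rr_and_ar(100, 110)

print(same_rr_low, same_rr_high)
print(same_ar_low, same_ar_high)
```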
AGE AS POSSIBLE EFFECT MODIFIER
Can we compare mean (or median) ages of cases who do
or do not have a particular exposure or characteristic?
No. It can be misleading, as the same difference in mean age at
diagnosis can be produced by completely different patterns of effect
modification (e.g., pg 430)
DOES IMPROVING SENSITIVITY OF EPIDEMIOLOGIC STUDIES
DECREASE THEIR SPECIFICITY?
Yes
LIMITATIONS OF EPIDEMIOLOGIC STUDIES
Under what circumstances are we unable to identify etiologic factor
through non-randomized trial?
Issue 1: Magnitude of increased risk produced by factor is too small
to be reliably identified
Strategy: Examine exposure-disease association within subgroups
of the population (based on the presence or level of other
risk factors) in whom the relative impact of exposure is
likely to be greatest
Issue 2: Magnitude of increased risk is theoretically not too small, but
Issue 2a: There is insufficient variation among individuals within a
population regarding presence/level of the factor
Strategy: Identify population within which there is variation
Issue 2b: We are unable to distinguish the effect of the factor from
that of other correlated factors
Strategy: Conduct the study in population in which the
confounding factor is not so highly correlated with
exposure in question
Issue 2c: Practical problems:
i) No valid measure of past presence or past levels of factor
ii) Lengthy induction/latent period
Strategy: Identify exposure records, or stored samples (as in nested
case control study)
18. SCREENING
WHEN CAN SCREENING BE JUSTIFIED?
Disease is an important public health problem
The natural history of the disease presents a suitable window of
opportunity for screening (long time window period, or already
receiving care at the right time)
Effective treatment is available and capable of favorably altering
the disease's natural history. Alternatively, an effective way to
prevent spread to other people is at hand.
Treatment, or interventions to prevent spread to others, are more
effective if initiated in the pre-symptomatic stage than when
initiated in symptomatic patients
A suitable screening test is available: reasonably inexpensive and
safe, acceptable to the population screened, and able to discriminate
between diseased and non-diseased persons
ASSESSING SCREENING TEST PERFORMANCE
Sensitivity and Specificity
Predictive value of the test
What does the prevalence of the disease affect?
Predictive value (especially, positive predictive value)
What is the implication of the fact that the PV+ of a
screening test can be quite low in screened populations
with low disease prevalence?
It can affect how a positive screening test result should be
interpreted and perhaps how this information is communicated to
the screenee
Persons with a positive screening test result must usually be
evaluated further to determine whether the result was a true
positive or a false positive
It affects the choice of a target population for screening. Subgroups in
which prevalence is highest can yield both more cases per screening
test and more true positives per positive screening test
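The dependence of predictive value on prevalence follows from Bayes' rule and can be sketched as below. The sensitivity, specificity, and prevalence values are hypothetical, chosen only to show the effect.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PV+, PV-) for a test applied to a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    pv_pos = true_pos / (true_pos + false_pos)
    pv_neg = true_neg / (true_neg + false_neg)
    return pv_pos, pv_neg

# Same hypothetical test (sensitivity 90%, specificity 95%) in two populations
pv_pos_low, _ = predictive_values(0.90, 0.95, prevalence=0.001)
pv_pos_high, _ = predictive_values(0.90, 0.95, prevalence=0.10)
print(round(pv_pos_low, 3), round(pv_pos_high, 3))
```

With these inputs PV+ is under 2% at a prevalence of 1 per 1,000 and about two-thirds at a prevalence of 10%, which is the argument for targeting high-prevalence subgroups.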
LIKELIHOOD RATIO
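The notes give only the heading here. As a reminder of the standard definitions (an assumption about what the heading was meant to cover), LR+ = sensitivity / (1 - specificity), LR- = (1 - sensitivity) / specificity, and post-test odds = pre-test odds x LR. This is a minimal sketch with hypothetical test characteristics.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomous test."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg

def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# Hypothetical test: sensitivity 90%, specificity 95%
lr_pos, lr_neg = likelihood_ratios(0.90, 0.95)
# Pre-test probability taken as a disease prevalence of 10%
p_after_positive = post_test_probability(0.10, lr_pos)
print(round(lr_pos, 1), round(lr_neg, 3), round(p_after_positive, 3))
```

When the pre-test probability is set to the prevalence, the post-test probability after a positive result is the same quantity as PV+.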
EVALUATING THE EFFECTIVENESS OF SCREENING
Does treatment given at early detection lead to a more
favorable outcome than treatment given when the cancer
is clinically manifest?
Randomized trial and cohort (follow-up) studies
In a non-randomized trial, is there potential confounding such that the
true benefit (or lack of benefit) associated with use of the test is distorted?
Which groups to compare?
Screened vs unscreened group
Is there lead time bias?
Most of the cancer have
What is the appropriate group to compare?
The mortality experience, not of cases alone, but of the screened
group with that of an unscreened group, with both groups
monitored from the time of screening
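Lead time bias can be shown with simple arithmetic. This sketch uses a hypothetical patient whose age at death is assumed to be unchanged by earlier treatment: survival measured from diagnosis looks longer with screening only because the clock starts earlier.

```python
def survival_from_diagnosis(age_at_diagnosis, age_at_death):
    """Years survived after diagnosis."""
    return age_at_death - age_at_diagnosis

# Hypothetical patient: death at age 70 whether or not the disease is screen-detected
AGE_AT_DEATH = 70
clinical = survival_from_diagnosis(67, AGE_AT_DEATH)  # diagnosed when symptomatic
screened = survival_from_diagnosis(63, AGE_AT_DEATH)  # diagnosed by screening
lead_time = screened - clinical
print(clinical, screened, lead_time)
```

The apparent gain equals the lead time and nothing more, which is why the comparison is made on mortality in the whole screened and unscreened groups from the time of screening rather than on survival of cases.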
Is there length-biased sampling?
Patients who have a long preclinical but detectable phase of
disease are more readily found via screening than are patients with
that disease whose preclinical phase is short.
Those patients tend to have better survival even in the absence of treatment
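The size of this effect can be sketched under a simple steady-state assumption, which is not stated in the notes: at a single screen, the number of detectable preclinical cases of each type is proportional to its incidence times the mean length of its preclinical phase. The disease types and figures are hypothetical.

```python
def share_of_screen_detected(incidence, preclinical_years):
    """Share of screen-detected cases contributed by each disease type.

    Assumes steady state: prevalent detectable cases = incidence x mean duration.
    """
    prevalent = {t: incidence[t] * preclinical_years[t] for t in incidence}
    total = sum(prevalent.values())
    return {t: n / total for t, n in prevalent.items()}

# Hypothetical: slow and fast tumors arise equally often (per 100,000 per year)
incidence = {"slow": 50, "fast": 50}
preclinical_years = {"slow": 4, "fast": 1}
shares = share_of_screen_detected(incidence, preclinical_years)
print(shares)
```

Although the two types are equally common among all cases, the slow-growing type makes up four fifths of screen-detected cases in this example, so screen-detected cases look more favorable even without any treatment benefit.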
What are the things to consider if it is an ecological study?
Reliable data
Size of the population in each time period large enough
Evidence to indicate that, in the absence of screening, the mortality rates
would not have fallen
Things to consider if it is a case-control study
Are the persons selected as cases ill or diseased to the extent that diagnosis
would occur in the absence of screening?
Are controls representative of the population that generated the
cases with respect to the presence or level of screening activity?
A control group that is restricted to earlier or less severe forms of the
condition under study is not appropriate (e.g. early-stage cancer)
While the control group would not exclude persons with early or mild
disease, it would include them only in proportion to their numbers
in the population
Are there confounding factors that are associated with
screening and with late-stage disease and mortality?
19. OUTBREAK INVESTIGATION
What are purposes of outbreak investigation?
Limit scope of severity and immediate threat to public health
Prevent future outbreaks
Identify new vehicles of infection
Monitor the success of intervention program
STEPS IN AN OUTBREAK INVESTIGATION
1. Verify the accuracy of disease reports
Confirm the diagnosis
2. Determine existence of an outbreak
Compare observed vs. expected in a preliminary investigation
3. Establish a case definition
May need to be modified as more information is available
When appropriate, classify by confirmed, probable, or possible
4. Identify additional cases
5. Conduct descriptive epidemiology
6. Generate and test hypotheses (e.g., disease causation, risk
factors, transmission)
7. Monitor course of the outbreak and reassess strategies
8. Carry out lab and environmental investigations
9. Implement disease control measures
10. Communicate findings
Detection: How are outbreaks identified?
Step 1: Verify the Accuracy of Disease Reports
Establish the accuracy of the data (report)
Know your data sources
Confirm the diagnosis
Review clinical findings: do they make sense?
Review laboratory results and methods
Interview cases and potential cases
Consult with subject matter experts
Is it an outbreak?
Rule out a pseudo-outbreak.
Consider other reasons for an increase in reports
For example, changes in
Reporting procedures
Case definitions
Awareness among reporters
Habits of reporters (referral bias)
Diagnostic tests used and their characteristics (esp. PPV)
Size of population
Step 2: Determine the Existence of Outbreak
Compare observed vs. expected number of cases
Observed: number of cases reported during this event
Expected: number of cases you would normally expect (in
comparable period of time)
Background rate: typical rate of disease among affected population;
consult historical surveillance data, scientific literature, and disease
registries
Use rates to make comparisons
Frequency of cases relative to population size
Is there a real increase in the rate of observed cases beyond what is
expected?
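The observed vs. expected comparison can be sketched as below. The counts and population are hypothetical, and treating the expected count as the mean of a Poisson distribution is an assumption made for illustration, not a method stated in these notes.

```python
import math

def rate_per_100k(cases, population):
    """Cases per 100,000 population."""
    return cases * 100_000 / population

def poisson_upper_tail(observed, expected):
    """P(X >= observed) when X follows a Poisson distribution with mean `expected`."""
    below = sum(math.exp(-expected) * expected ** k / math.factorial(k)
                for k in range(observed))
    return 1.0 - below

# Hypothetical: 12 cases reported this month in a population of 48,000,
# against a background of 4 cases per month from historical surveillance.
observed, expected, population = 12, 4, 48_000
observed_rate = rate_per_100k(observed, population)
expected_rate = rate_per_100k(expected, population)
p_value = poisson_upper_tail(observed, expected)
print(observed_rate, round(expected_rate, 1), round(p_value, 4))
```

A small tail probability suggests more cases than background variation would explain, but a pseudo-outbreak (changes in reporting, case definition, or testing) still has to be ruled out first.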
Is outbreak investigation Necessary?
When should a potential outbreak be investigated?
Considerations include:
Severity of illness
Communicability
Potential ongoing health threat
Need to learn more about the agent (new or novel)
Public concern and political considerations
Available resources
Step 3: Establish a case definition
Require standardized case definition
Case definition should include criteria for
o Person, Place, Time
o Clinical criteria (should be simple and objective)
Use CDC or CSTE case definition when possible
Do not include potential risk factor in case definition
Classify cases
Can have definite, probable and possible cases
o Useful for tracking cases
o Useful in estimating burden of illness
In larger outbreaks, not necessary to confirm every case
Step 4: Identify additional cases
Enhanced surveillance:
Active
Health departments actively solicit reports from:
Health care providers and health care facilities
Clinical and public health laboratories
Discrete populations (e.g., exposed persons)
Passive
Non-direct way of increasing awareness
Targeted communications
Step 5: Conduct Descriptive Epidemiology
Who, where and when?
Use the Data
Use descriptive epidemiology to characterize the outbreak by
person, place, and time.
For new conditions, you may need to produce description before
creating case definition.
Data can be used to refine case definition.
New clinical features?
What population is being affected?
Review of Descriptive Epidemiology Terms
Incubation period
Time between exposure to infectious agent and the
first signs/symptoms of clinical disease
Index case
Initial case/patient who may have become the source of exposure
for other cases or first affected case
Primary cases
Cases who were exposed to the source (agent)
Secondary cases
Cases who were exposed by a primary case in person-to-person
spread
Descriptive Epidemiology: Time
The epi curve displays the distribution of cases
over time (and can display more).
Can be used to:
Estimate magnitude and time trend
Determine exposure period
Help predict course of epidemic
Suggest the type of epidemic
Point source (exposure at one point in time)
Common (continuous) source (exposures continue over time)
Propagated (exposure to the source by initial cases, followed by
secondary cases infected from person-to-person spread)
How to create an Epi Curve
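One way to build an epi curve from a line list is to count cases by date of onset and include days with zero cases so that gaps stay visible. This is a minimal text-only sketch; the onset dates are hypothetical, and a daily interval is an assumption (in practice the interval is chosen to suit the incubation period).

```python
from collections import Counter
from datetime import date, timedelta

def epi_curve(onset_dates):
    """Return a list of (day, case count) covering every day from first to last onset."""
    counts = Counter(onset_dates)
    day, last = min(onset_dates), max(onset_dates)
    curve = []
    while day <= last:
        curve.append((day, counts.get(day, 0)))  # zero-case days are kept
        day += timedelta(days=1)
    return curve

# Hypothetical line list of onset dates
onsets = [
    date(2019, 8, 1),
    date(2019, 8, 2), date(2019, 8, 2), date(2019, 8, 2),
    date(2019, 8, 3), date(2019, 8, 3),
    date(2019, 8, 5),
]
for day, n in epi_curve(onsets):
    print(day.isoformat(), "X" * n)
```

A single sharp peak like this one is the pattern expected from a point source; a plateau suggests a continuous common source, and successive waves suggest propagated spread.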
Descriptive Epidemiology: Place
Plot locations of exposure
Descriptive Epidemiology: Person
Define population at risk
Age
Gender
Occupation
Social features
Medical history
Travel history
Step 6 (a): Generate hypotheses
Step 6 (b): Test hypotheses
Analytical Epidemiology
Different methods (study designs) for comparing groups
Two study designs used in outbreak investigation
1. Cohort studies
Well-defined groups of exposed and non-exposed individuals
Track and compare disease (or outcome) among exposed and
non-exposed individuals
2. Case-Control studies
Compare individuals with a disease (cases) to those without
the disease (controls)
Examine differences in exposures or risk factors
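The measures of association that these two designs yield can be sketched from a 2x2 table: attack rates and a risk ratio for a cohort study, and an odds ratio for a case-control study. The counts below are hypothetical (a food item at a shared meal), and the function names are for illustration only.

```python
def attack_rate(ill, total):
    """Proportion of a group that became ill."""
    return ill / total

def risk_ratio(ill_exposed, n_exposed, ill_unexposed, n_unexposed):
    """Cohort study: attack rate in exposed divided by attack rate in unexposed."""
    return attack_rate(ill_exposed, n_exposed) / attack_rate(ill_unexposed, n_unexposed)

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Case-control study: cross-product ratio of the 2x2 table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

# Hypothetical cohort: 30 of 40 who ate the item became ill, 5 of 50 who did not
rr = risk_ratio(30, 40, 5, 50)

# Hypothetical case-control data: 30 of 40 cases and 20 of 60 controls ate the item
or_estimate = odds_ratio(30, 10, 20, 40)
print(round(rr, 2), or_estimate)
```

Attack rates need the whole exposed and unexposed groups as denominators, so the risk ratio is available only when the cohort is well defined; when it is not, the odds ratio from a case-control comparison is used instead.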
Selecting an Appropriate Study Design
Case Control studies: Control Selection
Controls should be similar to cases with respect to opportunities for
exposure
Cannot have the disease in any form
Must represent the population from which cases came (e.g.,
same age group)
Strategies for control selection
Random sample
Friend or neighbor controls
Meal companions
Step 7: Monitor Outbreak and Reassess Strategies
Refine hypotheses if necessary
Have other potential explanations been overlooked?
Sequential case-control studies
Narrow down exposures to identify risk factor
Surveillance Data Needs During outbreaks
Step 8: Environmental and lab investigations
Complement epidemiological investigations
Environmental investigations
Examine and sample food, water sources, buildings,
materials, or environmental surfaces
Provide information about:
Exposure to agent
Contamination during food preparation, or manufacturing
Exposure during recreational activities
Document contaminated environment
Do trace back investigations
Step 9: Control and Prevention Measures
Implement Control and Prevention Policies
Policy development and implementation
Food safety
Guidance on procedures
Guidance on food handling
Policies about food preparation
Shellfish harvesting
Exclusion of ill children from daycare settings
Petting zoos; pet turtle bans; salmonella and psittacosis warnings at
pet shops, etc.
Isolation and quarantine
Immunization policy
Step 10: Communicate Findings
Provide ongoing current and accurate information to:
Staff within your team and agency
Environmental health officer, public information officer,
department administration
Other health agencies
Local and state health departments, CDC, and Indian Health
Service
Governmental agencies and jurisdictions
Health care providers and facilities
The public: media, schools, businesses
Communicate with the Public