large-scale epidemiological research design · seven steps for an epidemiology of consequences by...
Goal
To give you an overview of the discipline of epidemiology with a focus on large-scale studies
Outline
0 Epidemiologic Framework
0 Measuring disease occurrence
0 Selected study designs in epidemiology
0 Validity and measurement
0 Causation
0 Epidemiology and interface with policy
0 Epidemiology and biostatistics
Epidemiology
Some definitions:
“the study of the occurrence and distribution of diseases and other health-related conditions in populations” – Kelsey
“the study of the distribution and determinants of health-related states or events in specified populations and the application of this study to control health problems” – Last
“the study of the occurrence of illness” – Cole
“the science of understanding the causes and distribution of population health so that we may intervene to prevent disease and promote health” – Keyes and Galea
Selected varieties of epidemiology Schoenbach, Victor J
Surveillance, “shoe-leather” epidemiology (outbreak investigations), and epidemic control
Microbial epidemiology – biology and ecology of pathogenic microorganisms, their lifecycles, and their interactions with their human and non-human hosts
Descriptive epidemiology – examination of patterns of occurrence of disease and injury and their determinants
“Risk factor” epidemiology – searching for exposure-disease associations that may provide insights into etiology and avenues for prevention
Clinical epidemiology and the evaluation of healthcare – assess accuracy, efficacy, effectiveness, and unintended consequences of methods of prevention, early detection, diagnosis, treatment, and management of health conditions
Molecular epidemiology – investigate disease at the molecular level to precisely characterize pathological processes and exposures, to elucidate mechanisms of pathogenesis, and to identify precursor conditions
Varieties of epidemiology - cont
Genetic epidemiology – the confluence of molecular biology, population studies, and statistical models with an emphasis on heritable influences on disease susceptibility and expression
“Big” Epidemiology (i.e. Big $$$) – multisite collaborative trials, such as the Hypertension Detection and Follow-up Program (HDFP), Coronary Primary Prevention Trial (CPPT), Multiple Risk Factor Intervention Trial (MRFIT), Women’s Health Initiative (WHI)
Testimonial epidemiology – giving depositions and testifying in court or in legislative hearings on the state of epidemiologic evidence on a matter of dispute
Social epidemiology – interpersonal and community-level factors influencing health at the population level
Global epidemiology – assessing the effects of human activity on the ecosystem that supports life on Earth.
Life course epidemiology – The study of the influence of risk factors that occur across the life course (versus one point in time)
Defining Characteristics of Epidemiology
0 Tends to be observational, rather than experimental (which is why methods and careful interpretation are so critical)
0 Focuses on populations defined by social circumstances, geography
0 Often requires a multidisciplinary empirical approach to studies versus discipline-based approach
0 Tends to deal with etiology and disease control
0 Strong roots in public health
Seven Steps for an Epidemiology of Consequences
By Keyes and Galea (Epidemiology Matters: A new introduction to methodological foundations. NY, Oxford University Press, 2014)
1. Define the population of interest
2. Conceptualize and create measures of exposures and health indicators
3. Take a sample of the population
4. Estimate measures of association between exposures and health indicators of interest
5. Rigorously evaluate whether the association observed suggests a causal association
6. Assess the evidence for causes working together
7. Assess the extent to which the result matters (is externally valid) to other populations
Epidemiology
According to Rothman:
0 “The principles of epidemiologic research appear deceptively simple, which misleads some people into believing that anyone can master epidemiology by just applying common sense”
0 Interpreting results from epidemiologic studies requires an in-depth understanding of methods, threats to validity, and other epidemiological concepts
Warm-up question
0 More people die from cardiovascular disease each year in Toronto than in Vancouver. What are some of the explanations for this observed difference?
Threats to Validity
Bias can result for many reasons including:
Self selection (volunteering)
Nonresponse (refusal)
Loss to follow-up (attrition, migration)
Selective survival
Secular trends
Systematic errors in detection and diagnosis of health conditions
Choice of an inappropriate comparison group
Recall or reporting bias
False positives or negatives on diagnostic tests
Errors in assignment of cause of death or illness
Errors and omissions in medical records
Fundamental measures in epidemiology
0 Counts of illness/outcomes
Outcomes
0 Most commonly discrete (i.e. disease yes/no, number of cases, etc.)
0 Typically assumed to follow one of the following statistical distributions:
0 Binomial
0 Poisson
0 Negative Binomial
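A rough way to choose between the Poisson and negative binomial distributions for a count outcome is to compare the sample mean and variance, since a Poisson variable has variance equal to its mean. The sketch below is only an illustration: the weekly counts are invented, `dispersion_ratio` is a hypothetical helper name, and the 1.5 cut-off is an arbitrary assumption rather than a formal test.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean (close to 1 under a Poisson model)."""
    return variance(counts) / mean(counts)


# Hypothetical weekly case counts, invented for illustration only
weekly_cases = [0, 2, 1, 0, 9, 3, 0, 14, 1, 0]

ratio = dispersion_ratio(weekly_cases)
print(f"variance/mean = {ratio:.2f}")
if ratio > 1.5:  # arbitrary illustrative threshold, not a formal test
    print("Overdispersed: a negative binomial model may fit better than Poisson")
```

A ratio well above 1 suggests overdispersion, which is the usual reason to move from a Poisson to a negative binomial model.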
Risk/Incidence Proportion and Rate
Risk or incidence proportion (%)
$$\text{Risk} = \frac{\text{number of subjects developing disease during a time period}}{\text{total number of people followed during that time period}}$$
E.g. 5000 males were enrolled in a cohort study. After 10 years of follow-up, 5% of males had developed coronary artery disease; this is a proportion.
Incidence Rate
$$\text{Incidence rate} = \frac{\text{number of subjects developing disease during a time period}}{\text{total person-time at risk during the time period}}$$
E.g. Over 159 person-months of observation we observed 47 cases of hospital-acquired infection, resulting in an incidence rate of 0.30 cases per person-month
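The two incidence measures can be checked with a few lines of arithmetic. This minimal sketch reuses the numbers from the examples; the function names are assumptions made for illustration. Note that 47 cases over 159 person-months works out to about 0.30 per person-month, which is roughly 3.5 per person-year.

```python
def incidence_proportion(new_cases, people_followed):
    """Risk: new cases divided by the number of people followed."""
    return new_cases / people_followed


def incidence_rate(new_cases, person_time):
    """Rate: new cases divided by total person-time at risk."""
    return new_cases / person_time


# Cohort example: 5000 males, 5% develop coronary artery disease in 10 years
cad_cases = 250  # 5% of 5000
risk = incidence_proportion(cad_cases, 5000)

# Hospital-acquired infection example: 47 cases over 159 person-months
rate_per_month = incidence_rate(47, 159)
rate_per_year = rate_per_month * 12  # convert person-months to person-years

print(f"10-year risk: {risk:.1%}")
print(f"rate: {rate_per_month:.2f} per person-month "
      f"({rate_per_year:.2f} per person-year)")
```

The risk has no time unit of its own (the follow-up period must be stated alongside it), whereas the rate always carries a person-time unit.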
Prevalence
$$\text{Prevalence (\%)} = \frac{\text{number of subjects with disease}}{\text{total population}}$$
E.g. The prevalence of diabetes in Canada is 8%
Outcomes
0 Other outcome distributions include:
0 Normal or log-normal (biomarkers)
0 Survival (parametric, non-parametric and semi-parametric)
0 Multinomial
Selected epidemiological designs for large-scale studies
0 Follow-up or cohort (retrospective, prospective) studies, comparing people with and without a characteristic (exposure) in relation to a subsequent health-related event
0 Case-control studies, comparing exposures among people who have an outcome (cases) with exposures among people who do not (controls)
0 Intervention trials (clinical, community), in which a treatment or preventive intervention is provided to a group of people and their subsequent experience is compared to that of people not provided the intervention
0 Ecological - broad description of studies done at the group-level
0 Cross-sectional – study in which the exposure and outcome are measured at the same time
Cohort Studies
0 People followed up over a certain time to ascertain the occurrence of a health-related event
[Diagram: a cohort split into Exposed and Unexposed groups, followed over time to an outcome (disease, death, recurrence, recovery)]
• Calculate the incidence in the exposed (Ie) as a proportion or rate (depending on your data)
• Calculate the incidence in the unexposed group (Ine)
• The ratio of the two (Ie/Ine) is known as the relative risk when the incidences are proportions and the relative rate when they are rates
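When follow-up is recorded as person-time, the ratio Ie/Ine is an incidence rate ratio. The sketch below computes it with an approximate 95% interval on the log scale (standard error sqrt(1/cases_exposed + 1/cases_unexposed)). The counts and person-years are invented for illustration, and `rate_ratio` is a hypothetical helper name.

```python
import math


def rate_ratio(cases_exp, pt_exp, cases_unexp, pt_unexp):
    """Incidence rate ratio Ie/Ine with a 95% Wald interval on the log scale."""
    irr = (cases_exp / pt_exp) / (cases_unexp / pt_unexp)
    se_log = math.sqrt(1 / cases_exp + 1 / cases_unexp)
    lower = math.exp(math.log(irr) - 1.96 * se_log)
    upper = math.exp(math.log(irr) + 1.96 * se_log)
    return irr, lower, upper


# Hypothetical cohort: counts and person-years invented for illustration
irr, lower, upper = rate_ratio(cases_exp=60, pt_exp=10_000,
                               cases_unexp=30, pt_unexp=15_000)
print(f"IRR = {irr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The interval reflects random error only; it says nothing about confounding or other sources of bias.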
Attributes of a cohort design
0 Get an accurate measure of disease incidence due to the temporality of the design (i.e. disease or endpoint hasn’t happened at the start of follow-up)
0 Can be made up of volunteers – which can introduce bias (not in the case of registries/secondary data)
0 Retrospective cohorts where available are cheaper and quicker – but the principles don’t change
0 Non-comparability between exposed and unexposed (confounding) is the major issue due to ‘selection’ into exposure status
Examples of cohort studies
0 Nurses Health Cohort
0 Multicenter AIDS cohort study
0 Framingham Heart Study
0 UK study of doctors who smoke
0 Women’s Health Initiative (RCT and Cohort)
0 These studies have contributed to the fundamental epidemiology of many common conditions including cardiovascular disease, diabetes, and lung cancer
Relative risk of death from ovarian cancer, by use of HRT Relative risks are for HRT users compared with never-users, stratified by age and hysterectomy status, and adjusted by region of residence, socioeconomic group, time since menopause, parity.
Valerie Beral: Ovarian cancer and hormone replacement therapy in the Million Women Study
The Lancet, Volume 369, Issue 9574, 2007, 1703 – 1710 http://dx.doi.org/10.1016/S0140-6736(07)60534-0
Cohort Example: Million Women Study
www.millionwomenstudy.org
Diet, Lifestyle, and the Risk of Type 2 Diabetes Mellitus in Women
Frank B. Hu, M.D., JoAnn E. Manson, M.D., Meir J. Stampfer, M.D., Graham Colditz, M.D., Simin Liu, M.D., Caren G. Solomon, M.D., and Walter C. Willett, M.D. NEJM 2001
Strengths and limitations of the cohort design
Strengths:
0 Can be used to generate incidence estimates
0 Allows for the study of multiple outcomes
0 Temporality
Weaknesses
0 Often need a large sample size; statistically inefficient (unless disease is very common) – not good for rare diseases
0 Susceptible to bias due to the fact that the exposure is not random
0 Costly and time consuming when recruitment and primary data collection are involved
Case Control Studies
0 Samples of cases and non-cases (controls) are identified from a source population
[Diagram: Cases and Non-cases (controls), each classified as Exposed or Not exposed]
• Instead of calculating denominators for the rates or risks (as in a cohort study), we calculate the proportion of cases and controls with the exposure.
• The purpose of the control group is to estimate the distribution of the exposure in the source population, leading to the cardinal rule that controls should be selected independently of exposure status
Epi 101
0 Relative Risk (RR) = $\frac{a/(a+b)}{c/(c+d)}$, i.e. risk in the exposed / risk in the unexposed
0 Odds Ratio (OR) = $\frac{a/b}{c/d} = \frac{ad}{bc}$, i.e. ratio of the odds of developing the outcome in the exposed compared to the unexposed
0 Consensus: relative risk is preferred over the odds ratio for most prospective investigations

                     Disease Present    Disease Absent
Exposure Present            a                  b
Exposure Absent             c                  d
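The two measures can be computed directly from the cells of the 2x2 table, with a and b the exposed with and without disease and c and d the unexposed with and without disease. The counts below are hypothetical and the function names are chosen only for this sketch.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds of disease in the exposed divided by odds in the unexposed (ad/bc)."""
    return (a * d) / (b * c)


# Hypothetical counts: a, b = exposed with/without disease; c, d = unexposed
a, b, c, d = 30, 70, 10, 90

rr = relative_risk(a, b, c, d)
odds_r = odds_ratio(a, b, c, d)
print(f"RR = {rr:.2f}, OR = {odds_r:.2f}")
```

With a common outcome like this one (30% risk in the exposed), the odds ratio is noticeably further from 1 than the relative risk.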
Why do we use odds-ratios in case-control studies?
0 When sampling design is retrospective we can construct conditional distributions for the exposure (X) within the levels of the outcome variable
0 We cannot estimate probabilities with this type of design...
0 However the odds ratio can be computed the same way when it is defined as X given Y as it is for Y given X
Why do we use odds-ratios in case-control studies? - cont
Cohort Study: start from Exposed / Not Exposed (X) and observe the Disease Outcome (Y)
0 In statistical terms Y is the ‘random’ variable
Case Control Study: start from the Disease Outcome (Y) and look back at Exposed / Not Exposed (X)
0 In statistical terms X is the ‘random’ variable
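The symmetry of the odds ratio can be illustrated numerically. In this sketch, which uses an invented source population, a case-control sample keeps every case but only 10% of the non-cases. The odds ratio is unchanged, whether computed as disease odds by exposure or exposure odds by disease, while a risk ratio computed naively from the sample is not.

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio ad/bc of a 2x2 table."""
    return (a * d) / (b * c)


def risk_ratio(a, b, c, d):
    """(a/(a+b)) / (c/(c+d)); only meaningful when rows have true denominators."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical source population: a, b = exposed cases/non-cases; c, d = unexposed
population = (40, 960, 20, 1980)
# Case-control sample from it: every case, but only 10% of the non-cases
sample = (40, 96, 20, 198)

or_population = odds_ratio(*population)
or_sample = odds_ratio(*sample)

# Exposure odds among cases divided by exposure odds among controls (X given Y)
a, b, c, d = sample
exposure_or = (a / c) / (b / d)

rr_population = risk_ratio(*population)
rr_from_sample = risk_ratio(*sample)  # distorted by the sampling of controls

print(f"OR: population {or_population:.3f}, sample {or_sample:.3f}, "
      f"exposure odds ratio {exposure_or:.3f}")
print(f"RR: population {rr_population:.3f}, naive from sample {rr_from_sample:.3f}")
```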
Odds ratio and rare disease assumption
0 Mathematical assumption in case control studies
0 If the prevalence of the disease is low, then the odds ratio approaches the relative risk
0 Useful for approximating the relative risk
0 The more rare the disease the more likely the odds ratio approaches the relative risk
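The rare disease assumption can be seen by holding the relative risk fixed and varying the baseline risk. The sketch below assumes a true relative risk of 2 and a handful of illustrative baseline risks; none of these numbers come from a real study.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


TRUE_RR = 2.0  # assumed relative risk, fixed for the illustration
baseline_risks = [0.001, 0.01, 0.1, 0.3]  # risk in the unexposed

odds_ratios = []
for p0 in baseline_risks:
    p1 = TRUE_RR * p0  # risk in the exposed
    odds_ratios.append(odds(p1) / odds(p0))

for p0, or_value in zip(baseline_risks, odds_ratios):
    print(f"baseline risk {p0:>5}: RR = {TRUE_RR:.2f}, OR = {or_value:.3f}")
```

As the baseline risk rises, the odds ratio drifts further from the relative risk, which is why an odds ratio for a common outcome should not be read as a relative risk.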
ORs and their accompanying RRs for other outcomes or subgroup analyses in 41 RCTs.
Knol MJ, Duijnhoven RG, Grobbee DE, Moons KGM, et al. (2011) Potential Misinterpretation of Treatment Effects Due to Use of
Odds Ratios and Logistic Regression in Randomized Controlled Trials. PLoS ONE 6(6): e21248. doi:10.1371/journal.pone.0021248
http://www.plosone.org/article/info:doi/10.1371/journal.pone.0021248
Attributes of a case control study
0 Methodologically and analytically more sophisticated
0 Retrospective assessment of the exposure can be a major issue, particularly if it relies on participant recall (recall bias)
0 Above issue can be minimised if using biological markers (which is why case control studies are so common in genetic epidemiology) or using existing records
0 Statistically more efficient
0 Can match several ways – one-to-one matching (requires specialized techniques to account for lack of independence) or frequency matching
0 A well designed case control study can rival a cohort study
Matching in case control studies
0 The purpose of controls is to get an estimate of the prevalence of the exposure in the population that gave rise to the cases
0 The purpose of matching is not to control for confounding
0 Caveat: Matching can control for confounding in cohort studies but NOT in case control studies – where matching can actually introduce bias where there was none to begin with (See Modern Epidemiology)
0 You may still have to adjust for matched factors (i.e. in frequency matching)
0 Matching can make the study more statistically efficient
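One example of the specialized techniques needed for one-to-one matching is the matched-pair odds ratio, which uses only the discordant pairs: pairs where the case is exposed and the control is not, divided by pairs where the control is exposed and the case is not. The pair counts below are invented, and the comparison with an analysis that ignores the pairing is shown only to illustrate that the two can differ.

```python
def matched_pair_odds_ratio(pairs):
    """pairs: list of (case_exposed, control_exposed) booleans, one per matched pair."""
    case_only = sum(1 for case, ctrl in pairs if case and not ctrl)
    control_only = sum(1 for case, ctrl in pairs if ctrl and not case)
    return case_only / control_only


# Hypothetical 1:1 matched study with 100 pairs (counts invented for illustration)
pairs = ([(True, True)] * 15 + [(True, False)] * 30
         + [(False, True)] * 10 + [(False, False)] * 45)

matched_or = matched_pair_odds_ratio(pairs)

# Same data analysed as if the pairs were independent samples
cases_exposed = sum(case for case, _ in pairs)
controls_exposed = sum(ctrl for _, ctrl in pairs)
n = len(pairs)
unmatched_or = (cases_exposed * (n - controls_exposed)) / (
    (n - cases_exposed) * controls_exposed)

print(f"matched-pair OR = {matched_or:.2f}, OR ignoring matching = {unmatched_or:.2f}")
```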
Control selection
0 Several potential sources – each has pros/cons
0 Population-controls: Sampled (randomly) directly from the population
0 Neighbourhood controls: Sample controls from a nearby residence
0 Friend controls
0 Random-digit dialling: Similar to population controls
0 Hospital or clinic-based controls: Controls represent people treated at the same institution for another condition
0 Test-negative controls
0 Summary: Because many of these factors are associated with the exposure and/or outcome, they can introduce confounding so use matching techniques with caution
Variations of the case control
0 Nested case-control: sample cases and controls in a cohort that has already been enumerated (e.g. an occupational cohort)
0 Case-cohort: Every person in the cohort has a chance of being included as a control; however, this is defined in terms of person-time. A person who is a case may also be a control at another time point
0 Case-crossover Studies: Case control version of the crossover study; the control people represent the same people as the cases but at a different time point
Notable case control study examples
Title: HELICOBACTER-PYLORI INFECTION AND THE RISK OF GASTRIC-CARCINOMA
Author(s): PARSONNET J, FRIEDMAN GD, VANDERSTEEN DP, et al.
Source: NEW ENGLAND JOURNAL OF MEDICINE Volume: 325 Issue: 16 Pages: 1127-1131 Published: OCT 17 1991
Title: Epidemiologic classification of human papillomavirus types associated with cervical cancer
Author(s): Munoz N, Bosch FX, de Sanjose S, et al.
Source: NEW ENGLAND JOURNAL OF MEDICINE Volume: 348 Issue: 6 Pages: 518-527 Published: FEB 6 2003
Title: A CASE CONTROL STUDY OF SCREENING SIGMOIDOSCOPY AND MORTALITY FROM COLORECTAL-CANCER
Author(s): SELBY JV, FRIEDMAN GD, QUESENBERRY CP, et al.
Source: NEW ENGLAND JOURNAL OF MEDICINE Volume: 326 Issue: 10 Pages: 653-657 Published: MAR 5 1992
Strengths and Limitations of the case control
Strengths:
0 Efficient (both in terms of resources and statistics)
0 Can be used to study multiple exposures
0 Good for rare diseases
Weaknesses
0 Recall bias can be a significant problem
0 Temporality is more difficult to sort out (did the exposure come before the outcome or vice-versa)
0 Establishing an unbiased control group can be very difficult
Intervention Studies
0 People followed up over a certain time to ascertain the occurrence of a health-related event
[Diagram: participants randomized to Intervention or No Intervention, followed to an outcome (disease, death, recurrence, recovery)]
Calculate the incidence in the exposed (intervention) group (Ie) as a proportion or rate, depending on your data; calculate the incidence in the unexposed (no intervention) group (Ine)
The ratio of the two incidences (Ie/Ine) is known as the relative risk when the incidences are proportions and the relative rate when they are rates
Strengths and Limitations of Intervention Trials
Strengths:
Control over many biases (most like an “experiment”)
Stronger comparability i.e. Provides strongest evidence for causality in relation to temporality and control for unknown "confounders"
Fulfills the basic assumption of statistical hypothesis tests
Weaknesses
Expensive, time consuming
Sometimes ethically questionable
Subjects are often a highly selected group (selected for willingness to comply with treatment regimen, level of health, etc.) and may not be representative of all people who might be given the intervention (i.e., generalizability may suffer)
Can suffer ecological problems in community trials
Doesn’t work well for population-based interventions
Women’s Health Initiative Major Components of the Women's Health Initiative
Randomized Clinical Trial (CT) component
0 Hormone Therapy Trial (HT) (27,347)
0 Dietary Modification Trial (DM) (48,835)
0 Calcium/Vitamin D Supplementation Trial (CaD) (36,282)
0 The combined estrogen plus progestin trial ended early in 2002, with risks exceeding benefits:
“There is more potential for harm than good in healthy postmenopausal women taking a combination of estrogen and progesterone to prevent chronic disease”
Epidemiological Lessons learned from WHI
Dr. Anna Day, CMAJ editorial 2002
0 The scientific validity of ideas that appear to be intuitively correct must be proven through well-designed studies
0 “The WHI has clearly demonstrated that it is imperative that trials assessing the overall risk and benefit of primary prevention interventions for both men and women be conducted before such therapies are broadly instituted”
0 “We must ensure that we understand the values and paradigms that drive our hypotheses.”
Community randomized trials
Strengths and weaknesses of group/cluster RCTs
Strengths
0 More akin to ‘real-life’
0 More amenable to policy/community level exposures
Weaknesses
• Ecological fallacy (discussed in the section on ecological studies)
• Contamination
• Loss of statistical efficiency
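The loss of statistical efficiency in cluster-randomized trials is often summarized by the design effect, 1 + (m - 1) x ICC, where m is the cluster size and ICC is the intracluster correlation coefficient. This sketch assumes equal cluster sizes, and the trial size and ICC are hypothetical values chosen for illustration.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from randomizing clusters of equal size."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent individuals giving equivalent precision."""
    return n_total / design_effect(cluster_size, icc)


# Hypothetical trial: 40 communities of 50 people, assumed ICC of 0.02
deff = design_effect(cluster_size=50, icc=0.02)
ess = effective_sample_size(n_total=2000, cluster_size=50, icc=0.02)
print(f"design effect = {deff:.2f}, effective sample size = {ess:.0f} of 2000")
```

Even a small ICC roughly halves the effective sample size here, because people in the same community tend to resemble one another.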
Cross Sectional Studies
0 The outcome and the exposure are measured at the same time
[Diagram: Exposure and Outcome measured at the same time point]
Calculate the prevalence of the exposure in cases and controls
The ratio of the two prevalence estimates is known as the prevalence ratio
One can also calculate the prevalence odds ratio
Attributes of a cross-sectional design
0 Takes a “snap-shot” of a cohort by collecting information on exposures and outcomes at a single time point
0 Often the first step before going on to longitudinal analysis
0 Can be done for the purpose of monitoring trends over time (i.e. surveillance)
0 Effective for hypothesis generation
Strengths and Limitations of the cross sectional design
0 Strengths:
0 Can provide reliable measures of disease and exposure prevalence
0 Efficient
0 Weaknesses
0 Information on all factors is collected simultaneously, so it can be difficult to establish that a putative ‘cause’ antedated the ‘effect’
Ecological or Group-level Studies
0 Obtain data at the level of a group, community, or political entity (county, state, country), often by making use of routinely collected data
0 Used at the first stage of research (hypothesis generating)
0 Used to study group-level constructs (e.g. impact of legislation)
0 Used at the later stages of intervention to understand public health impact when individual-level studies have already confirmed a relationship
0 Subset includes multilevel studies (group and individual level factors)
Measurement Challenges
0 Important to distinguish between ecological measures which are used as individual surrogates versus ecological measures which are ecological in their own right
0 Examples:
0 Climate
0 Air pollution
0 Census-level socioeconomic status
0 Per capita fish intake
0 Hours of sunlight
0 Ecological Fallacy: the bias that may occur because an association observed between variables on an aggregate level does not represent associations that exist at an individual level
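A toy dataset makes the ecological fallacy concrete. In the invented data below, the exposure and outcome are perfectly negatively correlated within each of three communities, yet the community averages are perfectly positively correlated. The values and the `pearson` helper are illustrative assumptions only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical individual-level data: (exposure values, outcome values) per community
communities = {
    "A": ([1, 2, 3], [3, 2, 1]),
    "B": ([4, 5, 6], [6, 5, 4]),
    "C": ([7, 8, 9], [9, 8, 7]),
}

# Individual-level association inside each community
within = {name: pearson(x, y) for name, (x, y) in communities.items()}

# Group-level (ecological) association between community averages
mean_x = [sum(x) / len(x) for x, _ in communities.values()]
mean_y = [sum(y) / len(y) for _, y in communities.values()]
ecological = pearson(mean_x, mean_y)

print("within-community correlations:", {k: round(v, 2) for k, v in within.items()})
print("correlation of community means:", round(ecological, 2))
```

An analyst with only the community averages would see a strong positive association, the opposite of what holds among individuals in every community.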
Comparison of observed rates of colorectal cancer (CRC) per 100 000 population with the rates obtained from each of the following three fitted models: models consisting of (A) meat; (B) meat + fish and (C) meat + fish + olive oil.
Stoneham M et al. J Epidemiol Community Health 2000;54:756-760
©2000 by BMJ Publishing Group Ltd
EXAMPLE
Measuring the neighbourhood using UK benefits data: a multilevel analysis of mental health status
David L Fone (1), Keith Lloyd (2) and Frank D Dunstan (1)
(1) Department of Primary Care & Public Health, Centre for Health Sciences Research, Cardiff University, Heath Park, Cardiff CF14 4YS, UK
(2) School of Medicine, Swansea University, Swansea SA2 8PP, UK
“Results: Each contextual variable was significantly associated with individual mental health after adjusting for individual risk factors, so that living in a ward with high levels of claimants was associated with worse mental health. ……….. All contextual effects were significantly stronger in people who were economically inactive and unavailable for work. Conclusion: This study provides evidence for substantive contextual effects on mental health, and in particular the importance of small-area levels of economic inactivity and disability. DWP benefits data offer a more specific measure of local neighbourhood than generic deprivation indices and offer a starting point to hypothesise possible causal pathways to individual mental health status.”
Strengths and Limitations of Ecological Studies
Strengths:
Cost effective
Appropriate for group-level variables and/or measuring population impact
Can be extended to multilevel designs which incorporate individual measures
Weaknesses
Ecological fallacy
Temporality often difficult to establish
Common data sources used in epidemiology
Aggregate data examples
0 Vital statistics (birth rates, death rates, pregnancy rates, birth weight)
0 Demographic, economic, housing, geographical, and other data from the Census and other government data-gathering activities
0 Summaries of disease and injury reporting systems and registries
0 Workplace monitoring systems
0 Environmental monitoring systems (e.g., air pollution measurements)
0 Production and sales data (e.g. pharmacy data)
Individual-level data examples
0 Vital events registration (births, deaths, marriages)
0 Disease and injury reporting systems and registries
0 National surveys
0 Computer data files (e.g., health insurers)
0 Medical records
0 Questionnaires - in person, by telephone, mailed
0 Biological specimens (routinely or specially collected)
Validity
0 Random error exists with any sampling
0 Bias is a result of a systematic error
0 Bias can arise at any point of the epidemiologic study: study design, selection of subjects, measurement of E/O/C (exposures, outcomes, confounders), and analysis
0 Most critical and time-consuming aspect in the conduct of large scale epidemiological studies
Common sources of bias
0 Selection bias: systematic error in the selection of subjects
0 E.g. volunteers, healthy individuals, non-institutionalized, sick individuals
0 Information bias: data collection errors
0 E.g. poor sensitivity/specificity
0 Recall bias
0 Interviewer bias
0 Biases specific to certain problems
0 E.g. lead-time bias with screening
Confounding
0 Generally, a confounder is something that results in “non-comparability” between exposed and unexposed
0 Rothman describes it as the “confusion of the effects”
[Diagram: Confounder linked to both Exposure and Outcome]
Confounding Example
Birth Order Down Syndrome
Maternal Age
Stark and Mantel from Rothman 2012
Stark and Mantel noted a much higher prevalence of Down syndrome among children later in the birth order, with the highest prevalence among those fifth-born or later
Confounding
0 Confounders need to be measured fully and correctly; otherwise control of confounding is incomplete (residual confounding)
0 Unmeasured confounding is a result of variables which could be confounding the association but were not/could not be measured
0 Controlling can be done through statistical adjustment or stratification
0 It is inappropriate to rely on statistical significance to assess confounding – rather the effect size between exposure and outcome with and without confounder is most appropriate to detect confounding
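The change-in-estimate idea can be sketched by comparing a crude relative risk with a stratum-adjusted one. This example uses the Mantel-Haenszel risk ratio as one possible adjustment method (a choice made for this illustration, not one named in the slides). The two strata of counts are invented so that the exposure has no effect within either stratum, yet the crude estimate suggests one.

```python
def risk_ratio(a, b, c, d):
    """a, b = exposed cases/non-cases; c, d = unexposed cases/non-cases."""
    return (a / (a + b)) / (c / (c + d))


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel pooled risk ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * (c + d) / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(c * (a + b) / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical strata of a confounder (e.g. younger and older participants)
strata = [
    (10, 90, 40, 360),   # low-risk stratum: risk is 10% in both exposure groups
    (120, 280, 30, 70),  # high-risk stratum: risk is 30% in both exposure groups
]

# Crude table: collapse over the confounder by summing cell by cell
crude_table = tuple(sum(cells) for cells in zip(*strata))
crude_rr = risk_ratio(*crude_table)
adjusted_rr = mantel_haenszel_rr(strata)

print(f"crude RR = {crude_rr:.2f}, Mantel-Haenszel adjusted RR = {adjusted_rr:.2f}")
```

The gap between the crude and adjusted estimates, rather than any p-value, is what signals confounding here.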
Measurement
0 Measurement is one of the most important pieces to epidemiologic study
0 Need a clear definition and measurement of the disease and/or phenomena under study
0 Need an unbiased, accurate, and precise way to measure it
Factors that complicate measurement
0 In many cases there is a significant lag time between exposure and disease – and when they happen close together temporality is more complicated
0 Agent of disease may leave no measurable indicator of past exposure
0 Disease definitions and exposure cut-offs vary with time
0 Exposures can evolve or accumulate over time
0 Interactions and effect modification
Selected strategies to improve measurement
0 Multiple measures
0 Adjustment of study results for measurement error
0 Validation
0 Quality control measures
0 Aids to recall
0 Uses of diaries/24-hour recall
0 Use of scales
0 Develop standards for interpretation
0 Training
0 Measurement models
Causation
0 Goal: to identify a factor that causes a disease – such that the elimination or treatment of that factor would result in a reduction and/or elimination of that disease
0 In practice much of large-scale epidemiology focuses on single risk factors…although this is changing to consider more of a comprehensive system
Causation – some history
0 Traditionally assessed by comparing against criteria
0 E.g. The Bradford-Hill Criteria: Strength
Consistency
Specificity
Temporality
Dose-response
Plausibility
Coherence
Experimental Evidence
Analogy
Other ways to conceptualize causality
0 Sufficient-component model (Rothman)
0 Distinguishes between necessary and sufficient causes
0 In principle, a cause can be necessary – without it the effect will not occur
0 A sufficient cause is a complete causal mechanism, which can be made up of multiple component causes
0 A sufficient cause is represented by a complete circle (a "causal pie"), the segments of which represent component causes.
0 May be multiple pathways and causal mechanisms for a given disease (multi-causality)
0 A component cause that is a part of every sufficient cause is a necessary cause
[Figure: causal pie, with labels “Single component cause” and “One causal mechanism”. Rothman: The Causal Pie Model, 2012]
Causal-web model (Krieger, 1994)
0 Multiple direct and indirect causes
0 Direct: proximal causes, for example
0 Indirect: Mediated through one or more intervening variables
[Diagram: an exposure with a direct effect on the outcome, and an exposure with an indirect effect that acts on the outcome through a mediator]
Counterfactual model (Greenland, Hernán)
0 Imagine the same individual in the absence of the exposed state – this is known as the “counterfactual”
0 Individual contrasts can be aggregated up to populations, and various mathematical relationships used to estimate the causal effect
0 The counterfactual cannot be observed – basis of causal inference analysis
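A toy table of potential outcomes shows why the unobservable counterfactual matters. Each invented person below has an outcome under exposure and an outcome under no exposure, but only the one matching their actual exposure would ever be seen in real data. Because exposure is not randomly assigned in this made-up example, the observed risk difference differs from the average causal effect.

```python
# Hypothetical people: (outcome if exposed, outcome if unexposed, actually exposed?)
# In a real study only one of the two outcomes is ever observed per person.
people = [
    (1, 1, True), (1, 1, True), (1, 0, True), (0, 0, True),
    (1, 0, False), (0, 0, False), (0, 0, False), (0, 0, False),
]

n = len(people)

# Average causal effect: everyone exposed versus everyone unexposed
risk_if_all_exposed = sum(y1 for y1, _, _ in people) / n
risk_if_none_exposed = sum(y0 for _, y0, _ in people) / n
causal_risk_difference = risk_if_all_exposed - risk_if_none_exposed

# Observed association: uses only the outcome that actually occurred
exposed_outcomes = [y1 for y1, _, exposed in people if exposed]
unexposed_outcomes = [y0 for _, y0, exposed in people if not exposed]
observed_risk_difference = (sum(exposed_outcomes) / len(exposed_outcomes)
                            - sum(unexposed_outcomes) / len(unexposed_outcomes))

print(f"causal risk difference   = {causal_risk_difference:.2f}")
print(f"observed risk difference = {observed_risk_difference:.2f}")
```

The exposed group in this example would have had more disease even without exposure, so the observed contrast overstates the causal effect.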
Causality – analytic techniques
0 Statistical measures of association used in epidemiology reflect but do not explain the number of pathways through which an exposure may be causally related to disease (Dohoo, 2012)
0 Methods are currently developing (and becoming more sophisticated) to elucidate the causal pathway
0 Path analysis
0 Analytic causal models (marginal structural models)
FDA's folic acid policy a success, report
HPV vaccine protects girls from cervical cancer
Cell phones while driving cause same impairment as drinking and driving
Fifteen minutes of sunlight a day to provide adequate levels of Vitamin D
Soft drinks to be banned from school
BPA banned from baby bottles
Epidemiology and policy
0 Epidemiological studies are key to informing public health and health policy
0 This stage brings in ‘levels of evidence’
0 How STRONG is the evidence (rigor)
0 How RELEVANT is the evidence (generalizability)
0 In the absence of large effects or an appeal to the precautionary principle, policy change generally does not happen without large-scale studies and/or a large body of evidence
0 Policymaking also involves other things aside from epidemiological evidence (ethical, legal, cost-effectiveness)
Epidemiology and biostatistics
0 Differences in epidemiology and biostatistics are often emphasized, but historically are not that great
0 Training in either discipline facilitates effective collaboration
0 Epidemiologists use statistics as a tool to understand the issue they are studying
0 Biostatisticians are critical to the conduct of epidemiologic study by:
0 Ensuring appropriate and accurate conduct and application of statistical techniques
0 Generating new approaches to deal with data/analysis challenges and to match the conceptual hypothesis
Pragmatic approach to data analysis in epidemiology
1) Get to know your data – intimately! (data cleaning)
2) Classify your outcome (distribution)
3) Determine/confirm your question
4) Match the APPROPRIATE analysis for:
I. YOUR DATA
II. YOUR QUESTION
5) No statistical technique, no matter how fancy, can overcome study design and measurement issues (i.e. bias, misclassification, unmeasured confounding, etc.)
Future of ‘large-scale’ epidemiology
0 Increased use of data linkage
0 Investments into studies that can examine the impact of multiple/complex interventions/exposures
0 Increasing interest in multi-level design
0 Being more explicit about measuring and understanding that individuals are nested within larger structures (families, communities, etc.)
0 Increased use of multiple types of data within one study, from the cell to the environment, and new unstructured data (e.g. social media)
0 “Data Science”
Analytic Methods
0 The most common analytic methods for epidemiological studies are statistical (of different types)
0 However, there is increasing use of other methodologies, including mixed methods (quantitative + qualitative), machine learning, and micro-simulation modelling
Conclusion
0 Large scale epidemiological studies can provide unequivocal information on disease processes and risk factors
0 Cornerstone of prevention and public health policy
0 Requires solid methodological base
0 Some of the best large-scale epidemiological studies are done in interdisciplinary teams with experts on methods, statistics and analysis, data, disease/clinical knowledge, and ethics
Selected Resources: General Epidemiology
0 Rothman, Kenneth J. Epidemiology - an introduction 2nd ed. NY, Oxford University Press, 2012.
0 Rothman, Kenneth J. and Sander Greenland. Modern Epidemiology. 3rd ed. Hagerstown MD, Lippincott-Raven, 2008. (advanced)
0 Szklo, Moyses and F. Javier Nieto. Epidemiology: beyond the basics. 3rd ed. Gaithersburg MD, Aspen, 2014
0 Kelsey, Jennifer L., Alice S. Whittemore, Alfred S. Evans and W. Douglas Thompson. Methods in observational epidemiology. 2nd ed. NY, Oxford, 1996.
0 Keyes, Katherine M., Galea, Sandro. Epidemiology Matters: A new introduction to methodological foundations. NY Oxford University Press, 2014.
Questions?