large-scale epidemiological research design · seven steps for an epidemiology of consequences by...

Goal: To give you an overview of the discipline of epidemiology with a focus on large-scale studies

Outline

0 Epidemiologic Framework

0 Measuring disease occurrence

0 Selected study designs in epidemiology

0 Validity and measurement

0 Causation

0 Epidemiology and interface with policy

0 Epidemiology and biostatistics

Epidemiology

Some definitions:

“the study of the occurrence and distribution of diseases and other health-related conditions in populations” – Kelsey

“the study of the distribution and determinants of health related states or events in specified populations and the application of this study to control health problems” – Last

“the study of the occurrence of illness” – Cole

“the science of understanding the causes and distribution of population health so that we may intervene to prevent disease and promote health” – Keyes and Galea

Selected varieties of epidemiology (Schoenbach, Victor J)

Surveillance, “shoe-leather” epidemiology (outbreak investigations), and epidemic control

Microbial epidemiology – biology and ecology of pathogenic microorganisms, their lifecycles, and their interactions with their human and non-human hosts

Descriptive epidemiology – examination of patterns of occurrence of disease and injury and their determinants

“Risk factor” epidemiology – searching for exposure-disease associations that may provide insights into etiology and avenues for prevention

Clinical epidemiology and the evaluation of healthcare – assess accuracy, efficacy, effectiveness, and unintended consequences of methods of prevention, early detection, diagnosis, treatment, and management of health conditions

Molecular epidemiology – investigate disease at the molecular level to precisely characterize pathological processes and exposures, to elucidate mechanisms of pathogenesis, and to identify precursor conditions

Varieties of epidemiology - cont

Genetic epidemiology – the confluence of molecular biology, population studies, and statistical models with an emphasis on heritable influences on disease susceptibility and expression

“Big” Epidemiology (i.e. Big $$$) – multisite collaborative trials, such as the Hypertension Detection and Follow-up Program (HDFP), Coronary Primary Prevention Trial (CPPT), Multiple Risk Factor Intervention Trial (MRFIT), Women’s Health Initiative (WHI)

Testimonial epidemiology – giving depositions and testifying in court or in legislative hearings on the state of epidemiologic evidence on a matter of dispute

Social epidemiology – interpersonal and community-level factors influencing health at the population level

Global epidemiology – assessing the effects of human activity on the ecosystem that supports life on Earth.

Life course epidemiology – The study of the influence of risk factors that occur across the life course (versus one point in time)

Defining Characteristics of Epidemiology

0 Tends to be observational, rather than experimental (which is why methods and careful interpretation are so critical)

0 Focuses on populations defined by social circumstances, geography

0 Often requires a multidisciplinary empirical approach to studies versus discipline-based approach

0 Tends to deal with etiology and disease control

0 Strong roots in public health

Seven Steps for an Epidemiology of Consequences

By Keyes and Galea (Epidemiology Matters: A new introduction to methodological foundations. NY, Oxford University Press, 2014)

1. Define the population of interest

2. Conceptualize and create measures of exposures and health indicators

3. Take a sample of the population

4. Estimate measures of association between exposures and health indicators of interest

5. Rigorously evaluate whether the association observed suggests a causal association

6. Assess the evidence for causes working together

7. Assess the extent to which the result matters – is externally valid – for other populations

Epidemiology

According to Rothman:

0 “The principles of epidemiologic research appear deceptively simple, which misleads some people into believing that anyone can master epidemiology by just applying common sense”

0 Interpreting results from epidemiologic studies requires an in-depth understanding of methods, threats to validity, and other epidemiological concepts

Warm-up question

0 More people die from cardiovascular disease each year in Toronto than in Vancouver. What are some of the explanations for this observed difference?

Threats to Validity

Bias can result for many reasons including:

Self selection (volunteering)

Nonresponse (refusal)

Loss to follow-up (attrition, migration)

Selective survival

Secular trends

Systematic errors in detection and diagnosis of health conditions

Choice of an inappropriate comparison group

Recall or reporting bias

False positives or negatives on diagnostic tests

Errors in assignment of cause of death or illness

Errors and omissions in medical records

Outline


0 Measuring disease occurrence

0 Study designs in epidemiology

0 Validity

0 Causation

0 Measurement


Fundamental measures in epidemiology

0 Counts of illness/outcomes

Outcomes

0 Most commonly discrete (i.e. disease yes/no, number of cases, etc.)

0 Typically assumed to follow the following statistical distributions:

0 Binomial

0 Poisson

0 Negative Binomial

Risk/Incidence Proportion and Rate

Risk or incidence (%) = $\dfrac{\text{\# of subjects developing disease during a time period}}{\text{Total number of people followed during that time period}}$

E.g. 5000 males were enrolled in a cohort study. After 10 years of follow-up 5% of males developed coronary artery disease – it is a proportion.

Incidence Rate = $\dfrac{\text{\# of subjects developing disease during a time period}}{\text{Total \textbf{person-time at risk} during the time period}}$

E.g. Over 159 person-months of observation we observed 47 cases of hospital-acquired infection, resulting in an incidence rate of 47/159 ≈ 0.30 cases per person-month (about 3.5 cases per person-year)
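The two measures above differ only in their denominator, which a short calculation makes concrete. The following is a minimal sketch using the numbers from the two examples; the function names are illustrative and not part of the original slides.

```python
def incidence_proportion(new_cases, people_followed):
    """Risk: new cases divided by the number of people followed."""
    return new_cases / people_followed


def incidence_rate(new_cases, person_time):
    """Rate: new cases divided by the total person-time at risk."""
    return new_cases / person_time


# Cohort example: 5% of 5000 males is 250 cases over 10 years of follow-up.
risk = incidence_proportion(250, 5000)

# Infection example: 47 cases over 159 person-months of observation.
rate_per_month = incidence_rate(47, 159)
rate_per_year = rate_per_month * 12  # 12 person-months in one person-year

print(f"10-year risk: {risk:.1%}")
print(f"rate: {rate_per_month:.2f} cases per person-month")
print(f"rate: {rate_per_year:.2f} cases per person-year")
```

The rate carries a time unit and the risk does not, so the unit of person-time always has to be reported alongside a rate.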

Prevalence

Prevalence (%) = $\dfrac{\text{\# of subjects with disease}}{\text{Total population}}$

E.g. The prevalence of diabetes in Canada is 8%

Outcomes

0 Other outcome distributions include:

0 Normal or log-normal (biomarkers)

0 Survival (parametric, non-parametric and semi-parametric)

0 Multinomial

Outline



0 Study designs in epidemiology

0 Validity

0 Causation

0 Measurement


Selected epidemiological designs for large-scale studies

0 Follow-up or cohort (retrospective, prospective) studies, comparing people with and without a characteristic (exposure) in relation to a subsequent health-related event

0 Case-control studies, comparing the exposures of people who have an outcome (cases) with those of people who do not (controls)

0 Intervention trials (clinical, community), in which a treatment or preventive intervention is provided to a group of people and their subsequent experience is compared to that of people not provided the intervention

0 Ecological - broad description of studies done at the group-level

0 Cross-sectional – study in which the exposure and outcome are measured at the same time

Cohort Studies

0 People followed up over a certain time to ascertain the occurrence of a health-related event

[Figure: a cohort divided into exposed and unexposed groups and followed over time for the outcome (disease, death, recurrence, recovery)]

• Calculate the incidence in the exposed (Ie) as a proportion or rate (depending on your data)

• Calculate the incidence in the unexposed group (Ine)

• The ratio of the two (Ie/Ine) is known as the relative risk for incidence proportion ratios and relative rate for incidence rate ratios

Attributes of a cohort design

0 Get an accurate measure of disease incidence due to the temporality of the design (i.e. disease or endpoint hasn’t happened at the start of follow-up)

0 Can be made up of volunteers – which can introduce bias (not in the case of registries/secondary data)

0 Retrospective cohorts where available are cheaper and quicker – but the principles don’t change

0 Non-comparability between exposed and unexposed (confounding) is the major issue due to ‘selection’ into exposure status

Examples of cohort studies

0 Nurses Health Cohort

0 Multicenter AIDS cohort study

0 Framingham Heart Study

0 UK study of doctors who smoke

0 Women’s Health Initiative (RCT and Cohort)

0 These studies have contributed to the fundamental epidemiology of many common conditions including cardiovascular disease, diabetes, and lung cancer

Relative risk of death from ovarian cancer, by use of HRT Relative risks are for HRT users compared with never-users, stratified by age and hysterectomy status, and adjusted by region of residence, socioeconomic group, time since menopause, parity.

Valerie Beral: Ovarian cancer and hormone replacement therapy in the Million Women Study

The Lancet, Volume 369, Issue 9574, 2007, 1703 – 1710 http://dx.doi.org/10.1016/S0140-6736(07)60534-0

Cohort Example: Million Women Study

www.millionwomenstudy.org

Diet, Lifestyle, and the Risk of Type 2 Diabetes Mellitus in Women Frank B. Hu, M.D., JoAnn E.

Manson, M.D., Meir J. Stampfer, M.D., Graham Colditz, M.D., Simin Liu, M.D., Caren G. Solomon,

M.D., and Walter C. Willett, M.D. NEJM 2001

Strengths and limitations of the cohort design

Strengths:

0 Can be used to generate incidence estimates

0 Allows for the study of multiple outcomes

0 Temporality

Weaknesses

0 Often need a large sample size; statistically inefficient (unless disease is very common) – not good for rare diseases

0 Susceptible to bias due to the fact that the exposure is not random

0 Costly and time consuming when recruitment and primary data collection are involved

Case Control Studies

0 Cases and non-cases (controls) are sampled from a source population

[Figure: cases and non-cases (controls), each classified as exposed or not exposed]

• Instead of calculating denominators for the rates or risks (as in a cohort study), we calculate the proportion of cases and controls with the exposure.

• The purpose of the control group is to estimate the distribution of the exposure in the source population – leading to the cardinal rule that controls should be selected independently of exposure status

Epi 101

0 Relative Risk (RR) = $\dfrac{a/(a+b)}{c/(c+d)}$

i.e. risk in the exposed / risk in the unexposed

0 Odds Ratio (OR) = $\dfrac{a/b}{c/d} = \dfrac{ad}{bc}$

i.e. ratio of the odds of developing the outcome in the exposed compared to the unexposed

0 Consensus: relative risk is preferred over the odds ratio for most prospective investigations


                    Disease Present    Disease Absent

Exposure Present          a                  b

Exposure Absent           c                  d
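As a small worked sketch of the two formulas, the code below computes both measures from the four cells of a 2x2 table, taking a and b as exposed people with and without disease, and c and d as unexposed people with and without disease. The counts are hypothetical and chosen only for illustration.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds of disease in the exposed divided by odds in the unexposed."""
    return (a * d) / (b * c)


# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed develop disease.
a, b, c, d = 30, 70, 10, 90

print(f"RR = {relative_risk(a, b, c, d):.2f}")
print(f"OR = {odds_ratio(a, b, c, d):.2f}")
```

With an outcome this common (10% to 30%), the odds ratio is noticeably further from 1 than the relative risk.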

Why do we use odds-ratios in case-control studies?

0 When sampling design is retrospective we can construct conditional distributions for the exposure (X) within the levels of the outcome variable

0 We cannot estimate probabilities of the outcome (risks) with this type of design, because the investigator fixes the ratio of cases to controls

0 However, the odds ratio takes the same value whether it is defined for X given Y or for Y given X


[Figure: direction of sampling in each design]

Cohort Study: start from Exposed / Not Exposed (X) and follow forward to Disease Outcome (Y)

0 In statistical terms Y is the ‘random’ variable

Case Control Study: start from Disease Outcome (Y) and look back to exposure (X)

0 In statistical terms X is the ‘random’ variable

Odds ratio and rare disease assumption

0 Mathematical assumption in case control studies

0 If the prevalence of the disease is low, then the odds ratio approaches the relative risk

0 Useful for approximating the relative risk

0 The rarer the disease, the more closely the odds ratio approximates the relative risk
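To see how quickly the approximation degrades, the sketch below holds the relative risk fixed at 2.0 and varies the risk in the unexposed. The baseline risks are hypothetical values chosen for illustration.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


def or_for_given_rr(baseline_risk, rr):
    """Odds ratio implied by a relative risk at a given risk in the unexposed."""
    exposed_risk = baseline_risk * rr
    return odds(exposed_risk) / odds(baseline_risk)


# Hold the relative risk at 2.0 and make the outcome progressively more common.
for baseline in (0.001, 0.01, 0.10, 0.30):
    implied_or = or_for_given_rr(baseline, 2.0)
    print(f"baseline risk {baseline:.3f}: RR = 2.00, OR = {implied_or:.2f}")
```

At a baseline risk of 0.1% the two measures are practically identical, while at 30% the odds ratio is 3.5 for the same relative risk of 2.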

ORs and their accompanying RRs for other outcomes or subgroup analyses in 41 RCTs.

Knol MJ, Duijnhoven RG, Grobbee DE, Moons KGM, et al. (2011) Potential Misinterpretation of Treatment Effects Due to Use of

Odds Ratios and Logistic Regression in Randomized Controlled Trials. PLoS ONE 6(6): e21248. doi:10.1371/journal.pone.0021248

http://www.plosone.org/article/info:doi/10.1371/journal.pone.0021248


Attributes of a case control study

0 Methodologically and analytically more sophisticated

0 Retroactive assessment of the exposure can be a major issue - particularly if it relies on participant recall (recall bias)

0 Above issue can be minimised if using biological markers (which is why case control studies are so common in genetic epidemiology) or using existing records

0 Statistically more efficient

0 Can match several ways – one-to-one matching (requires specialized techniques to account for lack of independence) or frequency matching

0 A well designed case control study can rival a cohort study
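One of the specialized techniques for one-to-one matching is the matched-pair (conditional) odds ratio, which uses only the pairs in which the case and its control differ on exposure. The sketch below uses 100 hypothetical pairs; the data and function names are illustrative assumptions, not taken from any study cited here.

```python
def matched_pair_odds_ratio(pairs):
    """Conditional odds ratio from 1:1 matched pairs.

    Each pair is (case_exposed, control_exposed). Only discordant pairs,
    where exactly one member was exposed, contribute to the estimate.
    """
    case_only = sum(1 for case, control in pairs if case and not control)
    control_only = sum(1 for case, control in pairs if control and not case)
    return case_only / control_only


def unmatched_odds_ratio(pairs):
    """Odds ratio obtained if the pairing is ignored (shown for comparison)."""
    a = sum(1 for case, _ in pairs if case)            # exposed cases
    c = sum(1 for case, _ in pairs if not case)        # unexposed cases
    b = sum(1 for _, control in pairs if control)      # exposed controls
    d = sum(1 for _, control in pairs if not control)  # unexposed controls
    return (a * d) / (b * c)


# Hypothetical data: 100 matched case-control pairs.
pairs = (
    [(True, True)] * 20      # both exposed (concordant)
    + [(True, False)] * 30   # only the case exposed
    + [(False, True)] * 10   # only the control exposed
    + [(False, False)] * 40  # neither exposed (concordant)
)

print(f"matched-pair OR: {matched_pair_odds_ratio(pairs):.2f}")
print(f"unmatched OR:    {unmatched_odds_ratio(pairs):.2f}")
```

In this example, ignoring the pairing pulls the estimate toward 1, which is why the analysis should respect the matching.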

Matching in case control studies

0 The purpose of controls is to get an estimate of the prevalence of the exposure in the population that gave rise to the cases

0 The purpose of matching is not to control for confounding

0 Caveat: Matching can control for confounding in cohort studies but NOT in case control studies – where matching can actually introduce bias where there was none to begin with (See Modern Epidemiology)

0 You may still have to adjust for matched factors (i.e. in frequency matching)

0 Matching can make the study more statistically efficient

Control selection

0 Several potential sources – each has pros/cons

0 Population-controls: Sampled (randomly) directly from the population

0 Neighbourhood controls: Sample controls from a nearby residence

0 Friend controls

0 Random-digit dialling: Similar to population controls

0 Hospital or clinic-based controls: Controls represent people treated at the same institution for another condition

0 Test-negative controls

0 Summary: Because many of these factors are associated with the exposure and/or outcome, they can introduce confounding so use matching techniques with caution

Variations of the case control

0 Nested case-control: sample cases and controls from a cohort that has already been enumerated (e.g. an occupational cohort)

0 Case-cohort: every person in the cohort has a chance of being included as a control; however, this is defined in terms of person-time. A person who is a case may also be a control at another time point

0 Case-crossover studies: case-control version of the crossover study; the controls are the same people as the cases but at a different time point

Notable case control study examples

Title: Helicobacter pylori infection and the risk of gastric carcinoma

Author(s): Parsonnet J, Friedman GD, Vandersteen DP, et al.

Source: New England Journal of Medicine Volume: 325 Issue: 16 Pages: 1127-1131 Published: Oct 17 1991

Title: Epidemiologic classification of human papillomavirus types associated with

cervical cancer

Author(s): Munoz N, Bosch FX, de Sanjose S, et al.


6 Pages: 518-527 Published: FEB 6 2003

Title: A CASE CONTROL STUDY OF SCREENING SIGMOIDOSCOPY AND

MORTALITY FROM COLORECTAL-CANCER

Author(s): SELBY JV, FRIEDMAN GD, QUESENBERRY CP, et al.


10 Pages: 653-657 Published: MAR 5 1992

Strengths and Limitations of the case control

Strengths:

0 Efficient (both in terms of resources and statistics)

0 Can be used to study multiple exposures

0 Good for rare diseases

Weaknesses

0 Recall bias can be a significant problem

0 Temporality is more difficult to sort out (did the exposure come before the outcome or vice-versa)

0 Establishing an unbiased control group can be very difficult

Intervention Studies

0 People followed up over a certain time to ascertain the occurrence of a health-related event

[Figure: participants randomized to intervention or no intervention and followed for the outcome (disease, death, recurrence, recovery)]

Calculate the incidence in the exposed (intervention) group (Ie) as a proportion or rate, depending on your data; calculate the incidence in the non-exposed group (Ine)

The ratio of the two incidences (Ie/Ine) is known as the relative risk for incidence proportion ratios and relative rate for incidence rate ratios

Strengths and Limitations of Intervention Trials

Strengths:

Control over many biases (most like an “experiment”)

Stronger comparability i.e. Provides strongest evidence for causality in relation to temporality and control for unknown "confounders"

Fulfills the basic assumption of statistical hypothesis tests

Weaknesses

Expensive, time consuming

Sometimes ethically questionable

Subjects are often a highly selected group (selected for willingness to comply with treatment regimen, level of health, etc.) and may not be representative of all people who might be given the intervention (i.e., generalizability may suffer)

Can suffer ecological problems in community trials

Doesn’t work well for population-based interventions

Women’s Health Initiative

Major Components of the Women's Health Initiative

Randomized Clinical Trial (CT) component

0 Hormone Therapy Trial (HT) (27,347)

0 Dietary Modification Trial (DM) (48,835)

0 Calcium/Vitamin D Supplementation Trial (CaD) (36,282)

0 Trial ended early in 2002 - Combined estrogen and progestin replacement, with risks exceeding benefits:

“There is more potential for harm than good in healthy postmenopausal women taking a combination of estrogen and progesterone to prevent chronic disease”

Epidemiological Lessons learned from WHI

Dr. Anna Day, CMAJ editorial 2002

0 The scientific validity of ideas that appear to be intuitively correct must be proven through well-designed studies

0 “The WHI has clearly demonstrated that it is imperative that trials assessing the overall risk and benefit of primary prevention interventions for both men and women be conducted before such therapies are broadly instituted”

0 “We must ensure that we understand the values and paradigms that drive our hypotheses.”

Community randomized trials

Strengths and limitations of group/cluster RCTs

Strengths

0 More akin to ‘real-life’

0 More amenable to policy/community level exposures

Weaknesses

• Ecological fallacy (discussed in the section on ecological studies)

• Contamination

• Loss of statistical efficiency
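The loss of statistical efficiency is commonly summarized by the design effect, 1 + (m − 1) × ICC, where m is the cluster size and ICC is the intracluster correlation coefficient. The sketch below applies it to a hypothetical trial; the cluster size, number of clusters and ICC are assumed values for illustration only.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from randomizing clusters rather than individuals."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent individuals the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)


# Hypothetical trial: 40 communities of 50 people each, ICC of 0.02.
n_total = 40 * 50
deff = design_effect(50, 0.02)
n_eff = effective_sample_size(n_total, 50, 0.02)

print(f"design effect: {deff:.2f}")
print(f"{n_total} participants behave like about {n_eff:.0f} independent ones")
```

Even a small intracluster correlation roughly halves the effective sample size here, because each cluster is large.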

Cross Sectional Studies

0 The outcome and the exposure are measured at the same time

Exposure

Outcome

Calculate the prevalence of the outcome in the exposed and in the unexposed

The ratio of the two prevalence estimates is known as the prevalence ratio

One can also calculate the prevalence odds ratio

Attributes of a cross-sectional design

0 Takes a “snap-shot” of a cohort by collecting information on exposures and outcomes at a single time point

0 Often the first step before going on to longitudinal analysis

0 Can be done for the purpose of monitoring trends over time (i.e. surveillance)

0 Effective for hypothesis generation

Strengths and Limitations of the cross sectional design

0 Strengths:

0 Can provide reliable measures of disease and exposure prevalence

0 Efficient

0 Weaknesses

0 Information on all factors is collected simultaneously, so it can be difficult to establish that a putative ‘cause’ antedated the ‘effect’

Ecological or Group-level Studies

0 Obtain data at the level of a group, community, or political entity (county, state, country), often by making use of routinely collected data

0 Used at the first stage of research (hypothesis generating)

0 Used to study group-level constructs (e.g. impact of legislation)

0 Used at the later stages of intervention to understand public health impact when individual-level studies have already confirmed a relationship

0 Subset includes multilevel studies (group and individual level factors)

Measurement Challenges

0 Important to distinguish between ecological measures which are used as individual surrogates versus ecological measures which are ecological in their own right

0 Examples:

0 Climate

0 Air pollution

0 Census-level socioeconomic status

0 Per capita fish intake

0 Hours of sunlight

0 Ecological Fallacy: the bias that may occur because an association observed between variables on an aggregate level does not represent associations that exist at an individual level
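A toy dataset shows how the ecological fallacy can arise. In this minimal sketch, made-up (exposure, outcome) values for individuals in three areas are constructed so that area averages rise together while the association within every area runs the other way. The data are entirely hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical (exposure, outcome) pairs for individuals in three areas.
areas = {
    "A": [(1, 5), (2, 4), (3, 3)],
    "B": [(4, 8), (5, 7), (6, 6)],
    "C": [(7, 11), (8, 10), (9, 9)],
}

# Ecological analysis: one data point per area, using the area means.
mean_x = [sum(x for x, _ in people) / len(people) for people in areas.values()]
mean_y = [sum(y for _, y in people) / len(people) for people in areas.values()]
ecological_r = pearson(mean_x, mean_y)

# Individual-level analysis within each area.
within_r = {
    name: pearson([x for x, _ in people], [y for _, y in people])
    for name, people in areas.items()
}

print(f"between areas: r = {ecological_r:+.2f}")
for name, r in within_r.items():
    print(f"within area {name}: r = {r:+.2f}")
```

The area-level correlation is strongly positive while each within-area correlation is strongly negative, so an inference about individuals drawn from the aggregate data would point in the wrong direction.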

Comparison of observed rates of colorectal cancer (CRC) per 100 000 population with the rates obtained from each

of the following three fitted models: models consisting of (A) meat; (B) meat + fish and (C) meat + fish + olive oil.

Stoneham M et al. J Epidemiol

Community Health 2000;54:756-

760

©2000 by BMJ Publishing Group Ltd

EXAMPLE Measuring the neighbourhood using UK benefits data: a multilevel analysis of

mental health status David L Fone1 , Keith Lloyd2 and Frank D Dunstan1

1 Department of Primary Care & Public Health, Centre for Health Sciences Research, Cardiff University, Heath Park, Cardiff CF14 4YS, UK

2 School of Medicine, Swansea University, Swansea SA2 8PP, UK

“Results: Each contextual variable was significantly associated with individual mental health after adjusting for individual risk factors, so that living in a ward with high levels of claimants was associated with worse mental health. ……….. All contextual effects were significantly stronger in people who were economically inactive and unavailable for work. Conclusion: This study provides evidence for substantive contextual effects on mental health, and in particular the importance of small-area levels of economic inactivity and disability. DWP benefits data offer a more specific measure of local neighbourhood than generic deprivation indices and offer a starting point to hypothesise possible causal pathways to individual mental health status.”

Strengths and Limitations of Ecological Studies

Strengths: Cost effective

Appropriate for group-level variables and/or measuring population impact

Can be extended to multilevel designs which incorporate individual measures

Weaknesses Ecological fallacy

Temporality often difficult to establish

Outline



0 Study designs in epidemiology

0 Validity and measurement

0 Causation


0 Biostatistics and epidemiology

Common data sources used in epidemiology

Aggregate data examples

0 Vital statistics (birth rates, death rates, pregnancy rates, birth weight)

0 Demographic, economic, housing, geographical, and other data from the Census and other government data-gathering activities

0 Summaries of disease and injury reporting systems and registries

0 Workplace monitoring systems

0 Environmental monitoring systems (e.g., air pollution measurements)

0 Production and sales data (e.g. pharmacy data)

Individual-level data examples

0 Vital events registration (births, deaths, marriages)

0 Disease and injury reporting systems and registries

0 National surveys

0 Computer data files (e.g., health insurers)

0 Medical records

0 Questionnaires - in person, by telephone, mailed

0 Biological specimens (routinely or specially collected)

Validity

0 Random error exists with any sampling

0 Bias is a result of a systematic error

0 Bias can arise at any point of the epidemiologic study:

0 Most critical and time-consuming aspect in the conduct of large scale epidemiological studies

[Figure: Study design → Selection of subjects → Measurement of E/O/C → Analysis]

Common sources of bias

0 Selection bias: systematic error in the selection of subjects

0 E.g. Volunteers, healthy individuals, non-institutionalized, sick individuals

0 Information bias: Data collection errors

0 E.g. Poor sensitivity/specificity

0 Recall bias

0 Interviewer bias

0 Biases specific to certain problems

0 E.g. lead-time bias with screening
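To illustrate how poor sensitivity or specificity produces information bias, the sketch below computes the prevalence an imperfect test would report and then inverts the relationship (the Rogan-Gladen correction). The test characteristics and the 8% true prevalence are hypothetical values chosen for illustration.

```python
def observed_prevalence(true_prev, sensitivity, specificity):
    """Prevalence the test reports: true positives plus false positives."""
    return sensitivity * true_prev + (1 - specificity) * (1 - true_prev)


def corrected_prevalence(observed_prev, sensitivity, specificity):
    """Rogan-Gladen estimator: solve the formula above for the true prevalence."""
    return (observed_prev + specificity - 1) / (sensitivity + specificity - 1)


# Hypothetical test: 80% sensitivity, 95% specificity, true prevalence of 8%.
obs = observed_prevalence(0.08, 0.80, 0.95)
print(f"observed prevalence:  {obs:.3f}")
print(f"corrected prevalence: {corrected_prevalence(obs, 0.80, 0.95):.3f}")
```

Here false positives among the large disease-free group outweigh the missed cases, so the observed prevalence (11%) overstates the true one (8%). The correction only works when sensitivity and specificity are known from a validation study.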

Confounding

0 Generally – a confounder is something that results in “non-comparability” between exposed and unexposed

0 Rothman describes as the “confusion of the effects”

Exposure Outcome

Confounder

Confounding Example

Birth Order Down Syndrome

Maternal Age

Stark and Mantel from Rothman 2012

Stark and Mantel noted a much higher prevalence of Down syndrome among children who were high in the birth order, with the highest among 5+

Confounding

0 Confounders need to be measured fully and correctly; otherwise control of confounding is incomplete (residual confounding)

0 Unmeasured confounding is a result of variables which could be confounding the association but were not/could not be measured

0 Controlling can be done through statistical adjustment or stratification

0 It is inappropriate to rely on statistical significance to assess confounding – rather the effect size between exposure and outcome with and without confounder is most appropriate to detect confounding
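The comparison of effect sizes with and without the confounder can be sketched with a stratified analysis. The code below contrasts the crude odds ratio with a Mantel-Haenszel summary odds ratio across two levels of a confounder. All counts are hypothetical and were constructed so that the exposure has no effect within either stratum.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a single 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio across strata of (a, b, c, d)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Hypothetical counts per level of the confounder:
# (a, b, c, d) = exposed cases, exposed non-cases,
#                unexposed cases, unexposed non-cases.
strata = {
    "confounder absent": (1, 999, 4, 3996),
    "confounder present": (16, 3984, 4, 996),
}

# Crude table: add the strata together cell by cell.
crude = odds_ratio(*(sum(cells) for cells in zip(*strata.values())))
adjusted = mantel_haenszel_or(strata.values())

print(f"crude OR:    {crude:.2f}")
print(f"adjusted OR: {adjusted:.2f}")
print(f"relative change after adjustment: {(crude - adjusted) / adjusted:.0%}")
```

The crude odds ratio is above 2 while the stratum-adjusted value is 1, so the whole crude association is produced by the confounder. It is this change in the estimate, not a p-value, that signals confounding.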

Measurement

0 Measurement is one of the most important pieces to epidemiologic study

0 Need a clear definition and measurement of the disease and/or phenomena under study

0 Need an unbiased, accurate, and precise way to measure it

Factors that complicate measurement

0 In many cases there is a significant lag time between exposure and disease – and when they happen close together temporality is more complicated

0 Agent of disease may leave no measurable indicator of past exposure

0 Disease definitions and exposure cut-offs vary with time

0 Exposures can evolve or accumulate over time

0 Interactions and effect modification

Selected strategies to improve measurement

0 Multiple measures

0 Adjustment of study results for measurement error

0 Validation

0 Quality control measures

0 Aids to recall

0 Uses of diaries/24-hour recall

0 Use of scales

0 Develop standards for interpretation

0 Training

0 Measurement models

Outline





0 Causation

0 Epidemiology and interface with policy


Causation

0 Goal: to identify a factor that causes a disease – such that the elimination or treatment of that factor would result in a reduction and/or elimination of that disease

0 In practice much of large-scale epidemiology focuses on single risk factors…although this is changing to consider more of a comprehensive system

Causation – some history

0 Traditionally assessed by comparing against criteria

0 E.g. The Bradford-Hill Criteria:

Strength

Consistency

Specificity

Temporality

Dose-response

Plausibility

Coherence

Experimental Evidence

Analogy

Other ways to conceptualize causality

0 Sufficient-component model (Rothman)

0 Distinguishes between necessary and sufficient causes

0 In principle, a cause can be necessary – without it the effect will not occur

0 A sufficient cause is a complete causal mechanism, which can be made up of multiple component causes

0 A sufficient cause is represented by a complete circle (a "causal pie"), the segments of which represent component causes.

0 May be multiple pathways and causal mechanisms for a given disease (multi-causality)

0 A component cause that is a part of every sufficient cause is a necessary cause

Single component cause

One causal mechanism

Rothman: The Causal Pie Model, 2012

Causal-web model (Krieger, 1994)

0 Multiple direct and indirect causes

0 Direct: proximal causes, for example

0 Indirect: Mediated through one or more intervening variables

Exposure with direct effect

Outcome

Mediator

Exposure with indirect effect

Counterfactual model (Greenland, Hernan)

0 Imagine the same individual in the absence of the exposed state – this is known as the “counterfactual”

0 Individual contrasts can be aggregated up to populations, using various mathematical relationships to estimate the average causal effect

0 The counterfactual cannot be observed – basis of causal inference analysis
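The counterfactual idea can be made concrete with a toy population in which, unlike in real data, both potential outcomes are written down for every person. The sketch below is purely hypothetical and is meant to show why the observed association need not equal the causal effect.

```python
# Hypothetical population. Each person has two potential outcomes:
# y1 (outcome if exposed) and y0 (outcome if unexposed), plus the exposure
# actually received. In real data only one of y1 / y0 is ever observed.
people = [
    # (exposed, y1, y0)
    (True, 1, 1),
    (True, 1, 1),
    (True, 1, 0),
    (True, 0, 0),
    (False, 1, 0),
    (False, 0, 0),
    (False, 0, 0),
    (False, 0, 0),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Causal risk difference: everyone exposed versus everyone unexposed.
causal_rd = mean(y1 for _, y1, _ in people) - mean(y0 for _, _, y0 in people)

# Associational risk difference: uses only the outcomes that would be observed.
observed_exposed = mean(y1 for exposed, y1, _ in people if exposed)
observed_unexposed = mean(y0 for exposed, _, y0 in people if not exposed)
associational_rd = observed_exposed - observed_unexposed

print(f"causal risk difference:        {causal_rd:.2f}")
print(f"associational risk difference: {associational_rd:.2f}")
```

The association (0.75) overstates the causal effect (0.25) because the people who happened to be exposed would have had more disease even without the exposure. Causal inference methods aim to recover the first quantity from data that only reveal the second.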

Causality – analytic techniques

0 Statistical measures of association used in epidemiology reflect but do not explain the number of pathways through which an exposure may be causally related to disease (Dohoo, 2012)

0 Methods are currently developing (and becoming more sophisticated) to elucidate the causal pathway

0 Path analysis

0 Analytic causal models (marginal structural models)

Outline





0 Causation



FDA's folic acid policy a success, report

HPV vaccine protects girls from cervical cancer

Cell phones while driving cause same impairment as drinking and driving

Fifteen minutes of sunlight a day to provide adequate levels of Vitamin D

Soft drinks to be banned from school

BPA banned from baby bottles

Epidemiology and policy

0 Epidemiological studies are key to informing public health and health policy

0 This stage brings in ‘levels of evidence’

0 How STRONG is the evidence (rigor)

0 How RELEVANT is the evidence (generalizability)

0 Generally policy change does not happen without large-scale studies and/or a large body of evidence, unless effects are large and/or the precautionary principle is invoked

0 Policymaking also involves other things aside from epidemiological evidence (ethical, legal, cost-effectiveness)

Outline





0 Causation



Epidemiology and biostatistics

0 Differences in epidemiology and biostatistics are often emphasized, but historically are not that great

0 Training in either discipline facilitates effective collaboration

0 Epidemiologists use statistics as a tool to understand the issue they are studying

0 Biostatisticians are critical to the conduct of epidemiologic study by:

0 Ensuring appropriate and accurate conduct and application of statistical techniques

0 Generating new approaches to deal with data/analysis challenges and to match the conceptual hypothesis

Pragmatic approach to data analysis in epidemiology

1) Get to know your data – intimately! (data cleaning)

2) Classify your outcome (distribution)

3) Determine/confirm your question

4) Match the APPROPRIATE analysis for:

I. YOUR DATA

II. YOUR QUESTION

5) No statistical technique – no matter how fancy – can overcome study design and measurement issues (i.e. bias, misclassification, unmeasured confounding, etc.)

Future of ‘large-scale’ epidemiology

0 Increased use of data linkage

0 Investments into studies that can examine the impact of multiple/complex interventions/exposures

0 Increasing interest in multi-level design

0 Being more explicit about measuring and understanding that individuals are nested within larger structures (families, communities, etc.)

0 Increased use of multiple types of data within one study - from the cell to the environment and new unstructured data (e.g. social media)

0 “Data Science”

Analytic Methods

0 The most common analytic methods for epidemiological studies are statistical (of different types)

0 However, there is increasing use of other methodologies, including mixed methods (quantitative + qualitative), machine learning, micro-simulation modelling, etc.

Conclusion

0 Large scale epidemiological studies can provide unequivocal information on disease processes and risk factors

0 Cornerstone of prevention and public health policy

0 Requires solid methodological base

0 Some of the best large-scale epidemiological studies are done in interdisciplinary teams with experts on methods, statistics and analysis, data, disease/clinical knowledge, and ethics

Selected Resources: General Epidemiology

0 Rothman, Kenneth J. Epidemiology - an introduction 2nd ed. NY, Oxford University Press, 2012.

0 Rothman, Kenneth J. and Sander Greenland. Modern Epidemiology. 3rd ed. Hagerstown MD, Lippincott-Raven, 2008. (advanced)

0 Szklo, Moyses and F. Javier Nieto. Epidemiology: beyond the basics. 3rd ed. Gaithersburg MD, Aspen, 2014

0 Kelsey, Jennifer L., Alice S. Whittemore Alfred S. Evans and W. Douglas Thompson. Methods in observational epidemiology. Second Edition NY, Oxford, 1996.

0 Keyes, Katherine M., Galea, Sandro. Epidemiology Matters: A new introduction to methodological foundations. NY Oxford University Press, 2014.

Questions? [email protected]
