Study Design: Case-Control Studies
Paul L. Reiter, PhD, Assistant Professor
Division of Cancer Prevention and Control
Learning Objectives
Describe the strengths and weaknesses of case-control studies
Describe the importance of the selection of controls
Compare and contrast the different types of matching for case-control studies
Describe the different types of biases commonly associated with case-control studies
Module Outline
Case-Control Studies
  What are they?
  Case Selection
  Control Selection
  Matching
  Potential Biases
  Strengths and Weaknesses
Case-Control Study
Nature / subjects / others assign exposure status
  No formal procedure of random assignment
Subjects selected based on disease status (cases and controls)
Past exposure status is determined for cases and controls
  Compare exposure in cases versus controls
Case-Control Study
                 Disease
                 Yes    No
Exposure  Yes    a      b
          No     c      d
Case-control studies start with disease status and then determine exposure
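As a worked illustration of comparing exposure in cases versus controls, the sketch below computes the exposure odds ratio, ad/bc, from the four cell counts. The slides do not name a measure at this point; the odds ratio is the usual choice for case-control data. The function name and the counts are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Exposure odds ratio for a 2x2 case-control table.

    a = exposed cases,   b = exposed controls
    c = unexposed cases, d = unexposed controls
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Hypothetical counts, for illustration only
example_or = odds_ratio(a=40, b=20, c=60, d=80)
print(round(example_or, 2))  # 2.67
```

With these made-up counts, the odds of exposure among cases are roughly 2.7 times the odds among controls.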
Source Cohort
“It is helpful to think of any case-control study as being nested – that is, conducted – within a cohort of exposed and unexposed…Case-control studies can be thought of as nested within a source population…”
Rothman and Greenland
Source Cohort
The “source cohort” behind a case-control study is the population (cohort) that gave rise to the cases
Example
Cases: lung cancer cases in Franklin County, OH
Source Cohort: residents of Franklin County, OH
Selecting Participants
                 Disease
                 Yes    No
Exposure  Yes    ???    ???
          No     ???    ???
Total            n1     n0
Goal: Select n1 cases and n0 controls from source cohort without knowledge of their exposure status
Selecting Cases
We want cases to be all (or a “representative” sample) of the diseased members of the source cohort
  “Representative” = group provides a valid estimate of exposure
Where to Find Cases
Clinic-based cases
  Hospitals
  Outpatient clinics
  Physician practices
Population-based cases
  Disease registries
  Death certificates
Considerations
Clinic-based cases
  Possibly harder to define “source cohort” due to referral patterns
  If examining severely ill patients, may get “survivors” instead of a representative sample
Population-based cases
  May be difficult to find a registry for some diseases (e.g., HPV infection)
Incident vs. Prevalent Cases
Incident
  Newly diagnosed cases
  Have to wait for new cases to be diagnosed and have a system for identifying them
Prevalent
  People who may have had disease for some time
  Any risk factors identified may be related more to survival than disease development
Verdict: Incident cases are generally preferred
Selecting Controls
We want controls to be a “representative” sample of the non-diseased members of the source cohort
  “Representative” = group provides a valid estimate of exposure
Selecting controls is extremely important since they serve as the “comparison group” to cases for your study
  Want to select the most valid comparison group possible
Selecting Controls
Select individuals who might have become cases in your study if they had developed disease, that is, from the source cohort that gave rise to the cases
Try to conceptualize the “source cohort” (although it may not be easily identifiable) and select controls from that cohort
Selecting Controls is Difficult!
Control selection is “one of the most difficult problems in epidemiology” (Gordis)
It is also one of the most important components of a case-control study!
Where to Find Controls
Medical care system
  Hospitals
  Outpatient clinics
  Physician practices
Community
  General population
  Family members or friends
  Neighbors (geographic controls)
  Other (schools, worksites, etc.)
Deceased individuals
Medical Care System Controls
Advantages
  Theoretically belong to same source cohort as cases (if using clinic-based cases)
  Easily identifiable
  High cooperation rate
  “Mental set” is similar to cases (potentially less recall bias)
Disadvantages
  Might have medical condition caused by exposure
  Only a subset of source population
Medical Care System Controls
General rules
  Choose control conditions likely to have same referral pattern as disease of interest
  Exclude conditions known to be associated (positively or negatively) with the exposure
  Preferable to select controls from multiple disease categories
Community Controls
Advantages
  Theoretically belong to same source cohort as cases (if using population-based cases)
  Random sampling of population-based controls is usually the most desirable option, if possible
Disadvantages
  Source cohort not always easily identifiable to allow for random sampling of controls
  Low cooperation rate
  Possible “overmatching” if using family or friends
  “Mental set” different from cases (recall bias)
Community Controls - Methodology
Random digit dialing (RDD)
  Cell phone only households
  Negative influence of telemarketing
Door-to-door
  More likely option for developing countries
Ask cases to provide list of family members, friends, or neighbors
Public databases (DMV, voter registration lists, etc.)
How Many Controls Do I Need?
[Figure: precision of estimates (y-axis) plotted against number of controls per case (x-axis, 0 to 6)]
Gains in statistical efficiency diminish sharply once the control-to-case ratio exceeds 4 or 5
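One common rule of thumb behind this statement (an assumption here, not taken from the slides) is that with m controls per case, efficiency relative to an unlimited supply of controls is roughly m / (m + 1) when the true association is close to null. A minimal sketch, with a hypothetical function name:

```python
def relative_efficiency(m):
    """Approximate efficiency of m controls per case relative to unlimited controls.

    Uses the m / (m + 1) rule of thumb, which assumes an association near the null.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    return m / (m + 1)


for m in range(1, 7):
    print(m, round(relative_efficiency(m), 3))
```

Under this approximation, moving from 1 to 4 controls per case raises relative efficiency from 0.50 to 0.80, while moving from 4 to 6 adds less than 0.06.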
Matching - Definition
“Matching refers to the selection of a reference series – unexposed subjects in a cohort study or controls in a case-control study – that is identical, or nearly so, to the index series [exposed or cases] with respect to the distribution of one or more potentially confounding factors.”
Rothman and Greenland
Reason for Matching
“A major concern in conducting a case-control study is that cases and controls may differ in characteristics or exposures other than the one that has been targeted for the study.”
Gordis
Matching
Matching ensures that controls and cases are similar on selected characteristics
Two types of matching
  Individual matching
  Group matching
Individual Matching
Also called “matched pairs”
Matching occurs subject by subject
For each case, select one or more controls with characteristics that match that case
Example
  Case is a 50-year-old African American man, and we want to match on age, race, and gender
  A control would be selected who is 50 years old, African American, and male
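The individual-matching example can be sketched in code. This is a minimal illustration assuming exact matching, one control per case, and sampling without replacement; the function name, record layout, and data are hypothetical.

```python
import random


def match_individually(cases, control_pool, keys, seed=0):
    """Pair each case with one unused control that agrees on every key.

    Returns (pairs, unmatched_cases). Cases with no eligible control are
    left unmatched, which mirrors a known cost of individual matching.
    """
    rng = random.Random(seed)
    available = list(control_pool)
    pairs, unmatched = [], []
    for case in cases:
        candidates = [c for c in available
                      if all(c[k] == case[k] for k in keys)]
        if not candidates:
            unmatched.append(case)
            continue
        chosen = rng.choice(candidates)
        available.remove(chosen)  # each control is used at most once
        pairs.append((case, chosen))
    return pairs, unmatched


cases = [{"id": "case1", "age": 50, "race": "African American", "gender": "male"}]
control_pool = [
    {"id": "ctrl1", "age": 50, "race": "African American", "gender": "male"},
    {"id": "ctrl2", "age": 62, "race": "White", "gender": "female"},
]
pairs, unmatched = match_individually(cases, control_pool, ["age", "race", "gender"])
print(pairs[0][1]["id"])  # ctrl1, the only eligible control
```

In practice, continuous variables such as age are often matched within a range (for example, within a few years) rather than exactly; that refinement is left out here.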
Group Matching
Also called “frequency matching”
For a stratum of cases, select a stratum of controls. The proportion of a characteristic should be the same between cases and controls
Often requires that all cases are selected first
Example
  There are 400 cases (300 female, 100 male)
  We would select 300 female and 100 male controls if we wanted to match on gender
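The group-matching example can be sketched as follows. This is a minimal illustration assuming one control per case within each stratum; the function name and the control pool are hypothetical.

```python
import random
from collections import Counter


def frequency_match(cases, control_pool, key, seed=0):
    """Sample controls so each stratum of `key` has as many controls as cases."""
    rng = random.Random(seed)
    targets = Counter(case[key] for case in cases)
    selected = []
    for stratum, n_needed in targets.items():
        eligible = [c for c in control_pool if c[key] == stratum]
        if len(eligible) < n_needed:
            raise ValueError(f"not enough controls in stratum {stratum!r}")
        selected.extend(rng.sample(eligible, n_needed))
    return selected


# 400 cases (300 female, 100 male), as in the example; the pool is made up
cases = [{"gender": "female"}] * 300 + [{"gender": "male"}] * 100
control_pool = [{"gender": "female"}] * 1000 + [{"gender": "male"}] * 1000
controls = frequency_match(cases, control_pool, "gender")
counts = Counter(c["gender"] for c in controls)
print(counts["female"], counts["male"])  # 300 100
```

Because the stratum targets are counted from the cases, this approach needs the full case series before controls are drawn.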
Matching – Positives and Negatives
Positives
  Leads to more efficient stratified analyses
Negatives
  Cannot examine the relation of a matched variable to the disease
  May increase the complexity of study logistics (hard to find a control for some cases)
  In individual matching, cannot use cases for which no matched control was found
  Risk of “overmatching”, which can result in loss of precision
Matching – The Verdict
Be careful when opting for a matched design
Match (if at all) on only a few variables suspected to be strong confounders
Selection Bias
Control-selection bias
  If exposure in selected controls differs from exposure in source cohort
Case-selection bias
  If exposure in selected cases differs from exposure in source cohort
  If some cases did not arise from the source cohort
Want well-defined inclusion/exclusion criteria and sound selection methods
Recall Bias
Remember that we identify cases and controls based on disease status and then need to determine past exposure
May not be a problem for some exposures (e.g., presence of a gene) but other exposure data rely on interviews or surveys
Recall is a major problem in case-control studies
Recall Bias
Some participants may not be able to remember or accurately report information related to exposure
  Or they simply may not have the requested information
This means that some cases/controls will likely be misclassified as exposed/unexposed
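To make the effect of misclassification concrete, the sketch below applies assumed recall sensitivity and specificity to a hypothetical true table and recomputes the odds ratio. All numbers and function names are invented for illustration and do not come from the slides.

```python
def reported_counts(true_exposed, true_unexposed, sensitivity, specificity):
    """Expected numbers reported as exposed / unexposed under imperfect recall."""
    exposed = sensitivity * true_exposed + (1 - specificity) * true_unexposed
    unexposed = (true_exposed + true_unexposed) - exposed
    return exposed, unexposed


def odds_ratio(a, b, c, d):
    """a, c = exposed, unexposed cases; b, d = exposed, unexposed controls."""
    return (a * d) / (b * c)


# Hypothetical truth: 40 of 100 cases and 20 of 100 controls were exposed
true_or = odds_ratio(40, 20, 60, 80)

# Scenario 1: cases and controls recall equally imperfectly
a, c = reported_counts(40, 60, sensitivity=0.8, specificity=0.9)
b, d = reported_counts(20, 80, sensitivity=0.8, specificity=0.9)
same_recall_or = odds_ratio(a, b, c, d)

# Scenario 2: cases recall perfectly, controls under-report exposure
b2, d2 = reported_counts(20, 80, sensitivity=0.6, specificity=1.0)
differential_or = odds_ratio(40, b2, 60, d2)

print(round(true_or, 2), round(same_recall_or, 2), round(differential_or, 2))
```

With these assumed values, equal misclassification in both groups pulls the odds ratio toward 1 (about 1.94 instead of 2.67), while better recall among cases than controls inflates it (about 4.89).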
Interviewer Bias
If using interviewers to collect data, they may not be blinded to the case-control status of participants
  Interviewers may phrase items differently or probe further on exposure questions when interviewing cases
Minimizing Information Bias
Exposure status (and other variables) should be measured in a comparable fashion in cases and controls
Exposure status should not be known when a case or control is selected for study
Sources of exposure information
  Self-reports
  Surrogate / proxy (e.g., spouse)
  Records (hospital, worksite)
  Physical measurements
  Stored samples
Confounding Bias
Confounding: A situation in which the effect or association between an exposure and outcome is distorted by the presence of another variable
If confounding is present in the source cohort, then it should also be present in the study sample
  Since we select cases and controls to be “representative” of the source cohort
Several ways to control for confounding
  Stratification, statistical modeling, etc.
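As one illustration of stratification, the sketch below compares a crude odds ratio with a Mantel-Haenszel pooled odds ratio across strata of a confounder. The slides do not name a specific estimator; Mantel-Haenszel is a standard choice and is used here as an assumption, with hypothetical data and function names.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio.

    Each stratum is (a, b, c, d):
    a = exposed cases,   b = exposed controls
    c = unexposed cases, d = unexposed controls
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


def crude_or(strata):
    """Odds ratio after collapsing all strata into a single table."""
    a, b, c, d = (sum(cell) for cell in zip(*strata))
    return (a * d) / (b * c)


# Hypothetical data, stratified by a confounder (for example, smoking status)
strata = [
    (30, 15, 10, 5),   # confounder present: stratum odds ratio = 1.0
    (5, 10, 30, 60),   # confounder absent:  stratum odds ratio = 1.0
]
print(round(crude_or(strata), 3))            # 2.275
print(round(mantel_haenszel_or(strata), 3))  # 1.0
```

In this made-up example the crude odds ratio suggests an association, but the stratum-specific and pooled estimates are both 1.0, so the apparent association is due entirely to the confounder.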
Strengths of Case-Control Studies
Easier to study rare diseases
Can examine a variety of exposures for a given disease
Compared to cohort studies, usually:
  Quicker
  Easier
  Cheaper
Under certain conditions, results can estimate a causal parameter
Weaknesses of Case-Control Studies
Difficulty in selecting appropriate controls
Information bias (particularly recall bias)
Not ideal for rare exposures (cohort studies are probably better for this)
Can be difficult to establish temporality between exposure and disease
Case-Control vs. Cohort
                        Case-Control                                    Cohort (Prospective)
Study Group             Diseased persons (cases)                        Exposed persons
Comparison Group        Nondiseased (controls)                          Unexposed persons
Multiple Associations   Several exposures with disease                  Several diseases with exposure
Cost of Study           Relatively inexpensive                          Expensive
Time Required           Relatively short                                Generally long
Best When               Disease is rare                                 Exposure is rare
Problems                Selection of controls, information bias, etc.   Loss to follow-up, misclassified outcomes, etc.
Summary
“A case-control study is a useful first step when searching for a cause of an adverse health outcome.”
Gordis
Summary
Case-control studies have several important strengths, but be aware of their limitations
  Biases discussed earlier
Control selection is crucial to a case-control study
  Source of controls
  Matching
Thank you for completing this module
• If you have any questions, write to me.
• [email protected]
References
Gordis, L. (2009). Epidemiology, 4th Edition. Philadelphia, PA: Elsevier/Saunders.
Rothman, K.J., Greenland, S. & Lash, T.L. (2008). Modern Epidemiology, 3rd Edition. Philadelphia, PA: Lippincott, Williams & Wilkins.
Rothman, K.J. & Greenland, S. (1998). Modern Epidemiology, 2nd Edition. Philadelphia, PA: Lippincott, Williams & Wilkins.