epidemiologic methods- fall 2002. bias in clinical research: selection and measurement bias...
Post on 19-Dec-2015
Epidemiologic Methods- Fall 2002
Where we have been:
Making, assessing, and using measurements
Lecture Title
1 Understanding Measurement: Reproducibility & Validity
2 Study Design
3 Measures of Disease Occurrence I
4 Measures of Disease Occurrence II
5 Measures of Disease Association I
6 Measures of Disease Association II
Where we are going:
Threats to validity in clinical research studies and how they can be prevented
Lecture Title
7 Bias in Clinical Research: Selection and Measurement Bias
8 Confounding and Interaction I: General Principles
9 Confounding and Interaction II: Assessing Interaction
10 Confounding and Interaction II: Stratified Analysis
11 Conceptual Approach to Multivariable Analysis I
12 Conceptual Approach to Multivariable Analysis II
Bias in Clinical Research: Selection and Measurement Bias
• Framework for threats to validity (bias)
• Selection bias
– by study design:
• descriptive
• case-control
• cross-sectional
• longitudinal studies (cohort or experimental)
• Measurement bias
– exposure vs. outcome
– non-differential vs. differential
Internal vs External Validity
• Validity
– before, for measurements:
• accuracy of evaluation of individual traits or characteristics
– today, for entire studies:
• accuracy of inferences about populations
• Internal validity
– Do the results obtained from the actual subjects accurately represent the target population?
• External validity (aka generalizability)
– Do the results obtained from the actual subjects pertain to persons outside of the target population?
– Internal validity is a prerequisite for external validity
[Figure: 2x2 table of Exposed (+/-) by Diseased (+/-). The study sample is drawn from the reference/target/source population (internal validity); results are then extended to other populations (external validity)]
• The goal of any study is to find the truth
• Ways of missing the truth (getting the wrong answer):
– Bias
• Any systematic process that results in an incorrect estimate of:
– measure of disease (or exposure) occurrence in a descriptive study
– measure of association between exposure and disease in an analytic study
– Chance
• Random error
– type I
– type II
Threats to Validity in Clinical Research
MetLife Is Settling Bias Lawsuit
BUSINESS/FINANCIAL DESK | August 30, 2002, Friday
MetLife said yesterday that it had reached a preliminary settlement of a class-action lawsuit accusing it of charging blacks more than whites for life insurance from 1901 to 1972.
MetLife, based in New York, did not say how much the settlement was worth but said it should be covered by the $250 million, before tax, that it set aside for the case in February.
“Bias” in Webster’s Dictionary
1 : a line diagonal to the grain of a fabric; especially : a line at a 45° angle to the selvage often utilized in the cutting of garments for smoother fit
2 a : a peculiarity in the shape of a bowl that causes it to swerve when rolled on the green
b : the tendency of a bowl to swerve; also : the impulse causing this tendency
c : the swerve of the bowl
3 a : bent or tendency
b : an inclination of temperament or outlook; especially : a personal and sometimes unreasoned judgment : prejudice
c : an instance of such prejudice
d (1) : deviation of the expected value of a statistical estimate from the quantity it estimates
(2) : systematic error introduced into sampling or testing by selecting or encouraging one outcome or answer over others
4 a : a voltage applied to a device (as a transistor control electrode) to establish a reference level for operation
b : a high-frequency voltage combined with an audio signal to reduce distortion in tape recording
Classification Schemes for “Ways of Getting the Wrong Answer”
• Szklo and Nieto
– Bias
• Selection Bias
• Information/Measurement Bias
– Confounding
– Chance
• Other Common Approach
– Bias
• Selection Bias
• Information/Measurement Bias
• Confounding Bias
– Chance
Selection Bias
• Technical definition
– Bias that is caused when individuals have different probabilities of being included in the study according to relevant study characteristics: namely, the exposure and the outcome of interest
• Plain definition
– Bias that is caused by some kind of problem in the process of selecting subjects initially or, in a longitudinal study, in the process that determines how long subjects participate in the study
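One way to read the technical definition is to attach a selection probability to each cell of the exposure-by-disease table. The sketch below is only an illustration: the function names, population counts, and selection probabilities are hypothetical and do not come from the lecture. It suggests that the odds ratio is distorted when selection depends jointly on exposure and outcome, but not when it depends on disease status alone.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    return (a / c) / (b / d)


def sample_table(table, probs):
    """Expected cell counts when each cell is sampled with its own probability."""
    return tuple(n * p for n, p in zip(table, probs))


# Hypothetical reference population (not from the lecture); true OR = 4.0
population = (50, 200, 50, 800)
true_or = odds_ratio(*population)

# Selection depends on disease only (all cases, 10% of non-cases): OR unchanged
or_by_disease = odds_ratio(*sample_table(population, (1.0, 0.1, 1.0, 0.1)))

# Selection depends on exposure AND disease: exposed non-cases are
# under-sampled relative to unexposed non-cases, so the OR is inflated
or_joint = odds_ratio(*sample_table(population, (1.0, 0.05, 1.0, 0.1)))

print(true_or, or_by_disease, or_joint)
```

In this toy setup the observed odds ratio equals the true one multiplied by the cross-product of the four selection probabilities, which is 1 whenever selection depends on only one of the two variables.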
Selection Bias in a Descriptive Study
• Pre-election surveys re: 1948 Presidential Election
– various methods used to find subjects
– largest % favored Dewey
• General election results
– Truman beat Dewey
• Ushered in realization of the importance of representative (random) sampling
Leukemia Incidence Among Observers of a Nuclear Bomb Test
Caldwell et al. JAMA 1980
• Smoky Atomic Test in Nevada
• Outcome of 76% of troops at site was later found; occurrence of leukemia determined
– 82% contacted by the investigators
– 18% contacted the investigators on their own
• 4.4 times greater risk of leukemia than those contacted by the investigators
[Figure: Descriptive Study: Unbiased Sampling. Study sample drawn from the reference/target/source population]
[Figure: Descriptive Study: Selection Bias. Study sample drawn from the reference/target/source population]
[Figure: Analytic Study: Unbiased Sampling. 2x2 table of Exposed (+/-) by Diseased (+/-); study sample drawn from the reference population]
[Figure: Analytic Study: Selection Bias. 2x2 table of Exposed (+/-) by Diseased (+/-); study sample drawn from the reference population]
Selection Bias in Case-Control Studies
Coffee and cancer of the pancreas MacMahon et al. N Eng J Med 1981; 304:630-3
Cases: patients with histologic diagnosis of pancreatic cancer in any of 11 large hospitals in Boston and Rhode Island between October 1974 and August 1979
What study base gave rise to these cases?
How should controls be selected?
Selection Bias in a Case-Control Study
Coffee and cancer of the pancreas MacMahon et al. N Eng J Med 1981; 304:630-3
Controls:
• Other patients under the care of the same physician as the cases with pancreatic cancer
• Patients with diseases known to be associated with smoking or alcohol consumption were excluded
Coffee and cancer of the pancreas
MacMahon et al. (N Eng J Med 1981; 304:630-3)

Males
                       Case    Control    Total
Coffee: > 1 cup/day    207     275        482
No coffee              9       32         41
Total                  216     307

OR = (207/9) / (275/32) = 2.7 (95% CI, 1.2-6.5)
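The odds ratio in the table above can be recomputed directly from the four cell counts. The sketch below also attaches a confidence interval using Woolf's log-based approximation; that choice is an assumption of this example, not something stated on the slide, and the slide's interval (1.2-6.5) was presumably obtained with a different method, so the limits are not expected to match exactly.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval.

    a, b = exposed cases, exposed controls
    c, d = unexposed cases, unexposed controls
    """
    or_ = (a / c) / (b / d)
    # Standard error of ln(OR) under Woolf's approximation
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper


# Cell counts for males from the coffee and pancreatic cancer table
or_, lower, upper = odds_ratio_with_ci(207, 275, 9, 32)
print(round(or_, 1))  # 2.7
```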
Relative to the study base that gave rise to the cases, the controls:
• Other patients under the care of the same physician at the time of an interview with a patient with pancreatic cancer
– Most of the MDs were gastroenterologists whose other patients were likely advised to stop using coffee
• Patients with diseases known to be associated with smoking or alcohol consumption were excluded
– Smoking and alcohol use are correlated with coffee use; therefore, the sample is relatively depleted of coffee users
[Figure: Case-control Study of Coffee and Pancreatic Cancer: Selection Bias. 2x2 table of coffee / no coffee by cancer / no cancer; study sample drawn from the reference population]
Selection Bias in a Cross-sectional Study
• Inclusion of prevalent cases causes all sorts of problems
• Finding a diseased person in a cross-sectional study requires 2 things:
– the disease occurred in the first place
– the case survived long enough to be sampled
• Any factor associated with a prevalent case of disease might be associated with disease development, survival with disease, or both
• Assuming goal is to find factors associated with disease development, bias in prevalence ratio occurs any time that exposure under study is associated with survival with disease
Selection Bias in a Cross-sectional Study
e.g. Smoking and emphysema
• Smoking is a cause of emphysema, but persons with emphysema who continue to smoke have shorter survival
• Hence, in any cross-section of persons with emphysema, those who smoke less are apt to be more greatly represented (because of the survival disadvantage of those who continue to smoke)
• Therefore, cross-sectional study of current smoking and emphysema will result in a prevalence ratio that underestimates the entity you are presumably really interested in: the incidence ratio
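The argument above can be illustrated with the steady-state relation prevalence odds = incidence rate x mean duration of disease. This is a minimal sketch under that steady-state assumption; the incidence rates and durations are hypothetical numbers chosen for illustration, not data from the lecture.

```python
def steady_state_prevalence(incidence_rate, mean_duration):
    """Prevalence implied by prevalence odds = incidence rate * mean duration."""
    odds = incidence_rate * mean_duration
    return odds / (1 + odds)


# Hypothetical values: smokers develop emphysema 4 times as often,
# but survive with it only half as long as non-smokers.
incidence_smokers, incidence_nonsmokers = 0.004, 0.001   # per person-year
duration_smokers, duration_nonsmokers = 5.0, 10.0        # years with disease

incidence_ratio = incidence_smokers / incidence_nonsmokers
prevalence_ratio = (
    steady_state_prevalence(incidence_smokers, duration_smokers)
    / steady_state_prevalence(incidence_nonsmokers, duration_nonsmokers)
)

print(incidence_ratio, round(prevalence_ratio, 2))
```

With these made-up inputs the prevalence ratio comes out at roughly half the incidence ratio, which is the direction of bias the slide describes.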
[Figure: Cross-sectional study of smoking and emphysema. 2x2 table of Smoke (+/-) by Emphysema (+/-); study sample drawn from the reference/target population]
Selection Bias: Cohort Studies/RCTs
• Among initially selected subjects, selection bias much less likely to occur compared to case-control or cross-sectional studies
– Reason: study participants (exposed or unexposed; treatment vs placebo) are selected (theoretically) before the outcome occurs
[Figure: Cohort Study/RCT. 2x2 table of Exposed (+/-) by Diseased (+/-); study sample drawn from the reference population]
Since disease has not occurred yet among initially selected subjects, there is no opportunity for disproportionate sampling with respect to exposure and disease
[Figure: Cohort Study/RCT. Exposed (E) and unexposed subjects sampled from the reference population into the study sample]
All that is sampled is exposure status
Even if disproportionate sampling occurs, it will not result in selection bias when forming measures of association
Selection Bias: Cohort Studies
• Selection bias can occur on the “front-end” of the cohort if diseased individuals are unknowingly entered into the cohort
• e.g.:
– Consider a cohort study of the effects of exercise on all-cause mortality among persons initially thought to be completely healthy.
– If some enrolled participants had undiagnosed cardiovascular disease and, as a consequence, were likely to exercise less, what would the effect be on the measure of association?
[Figure: Cohort Study of Exercise and Survival. 2x2 table of exercise / no exercise by death / no death; study sample drawn from the reference population]
Selection bias will lead to a spurious protective effect of exercise
Selection Bias: Cohort Studies/RCTs
• Most common form of selection bias does not occur with the process of initial selection of subjects
• Instead, selection bias most commonly caused by forces that determine length of participation (who ultimately stays in the analysis) i.e. loss to follow-up
– When those lost to follow-up have a different probability of the outcome than those who remain (i.e. informative censoring) AND
– this probability is different across exposure groups
– selection bias results
Selection Bias: Cohort Studies/RCTs
e.g., Cohort study of progression to AIDS: injection drug users (IDU) vs homosexual men
• In general, getting sicker is a common reason for loss to follow-up
• Therefore, persons who are lost to follow-up have different AIDS incidence than those who remain (i.e., informative censoring)
• In general, IDU are more likely to become lost to follow-up at any given level of feeling sick
• Therefore, the degree of informative censoring differs across exposure groups (IDU vs homosexual men)
• Results in selection bias: underestimates the incidence of AIDS in IDU relative to homosexual men
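The reasoning in this example can be sketched numerically. In the illustration below both groups have the same true risk of AIDS, and all loss-to-follow-up fractions are hypothetical values chosen for the example (they are not from the lecture). It also makes the simplifying assumption that outcomes are counted only among subjects who are not lost.

```python
def observed_risk(n, true_risk, loss_if_progressing, loss_if_not):
    """Risk observed among subjects who remain under follow-up.

    loss_if_progressing / loss_if_not are the fractions lost to follow-up
    among those who would and would not develop the outcome.
    """
    progressors = n * true_risk
    non_progressors = n - progressors
    remaining_progressors = progressors * (1 - loss_if_progressing)
    remaining_non = non_progressors * (1 - loss_if_not)
    return remaining_progressors / (remaining_progressors + remaining_non)


# Hypothetical inputs: identical true risk (30%) in both groups, but
# informative censoring is stronger among IDU.
risk_idu = observed_risk(1000, 0.30, loss_if_progressing=0.50, loss_if_not=0.20)
risk_men = observed_risk(1000, 0.30, loss_if_progressing=0.20, loss_if_not=0.10)

observed_ratio = risk_idu / risk_men
print(round(risk_idu, 3), round(risk_men, 3), round(observed_ratio, 2))
```

Although the true risk ratio is 1.0 by construction, the observed ratio falls below 1 because the sickest IDU are removed from the analysis more often.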
[Figure: Effect of Selection Bias in a Cohort Study. Survival curves showing: survival assuming no informative censoring and no difference between IDU and homosexual men; effect of informative censoring in IDU group; effect of informative censoring in homosexual male group]
[Figure: Cohort Study of HIV Risk Group and AIDS Progression. 2x2 table of IDU / homosexual men by AIDS / no AIDS; study sample drawn from the reference population]
Selection bias will lead to spurious underestimation of AIDS incidence in IDU group
Managing Selection Bias
• Prevention and avoidance are critical
• Unlike confounding where there are solutions in the analysis of the data, once the subjects are selected, there are usually no fixes for selection bias
• In case-control studies:
– Follow the study base principle
• In cross-sectional studies:
– Be aware of how exposure in question affects disease survival
• In longitudinal studies (cohorts/RCTs):
– Screen for occult disease at baseline
– Avoid losses to follow-up
Measurement Bias
• Definition
– bias that is caused when the information collected about or from subjects is inaccurate (invalid; erroneous)
• any type of variable: exposure, outcome, or confounder
– aka: misclassification bias; information bias (text); identification bias
• misclassification is the immediate result
Definition of Terms Related to Measurement Accuracy
• Sensitivity
– the ability of a test (measurement) to identify correctly those who have the characteristic (disease or exposure) of interest
• Specificity
– the ability of a test (measurement) to identify correctly those who do NOT have the characteristic of interest
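The two definitions above translate directly into proportions from a validation table that compares the measurement against a gold standard. The counts used below are hypothetical and serve only to show the arithmetic.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of those WITH the characteristic who are correctly identified."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Proportion of those WITHOUT the characteristic who are correctly identified."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical validation study: 100 truly exposed, 200 truly unexposed
sens = sensitivity(true_positives=90, false_negatives=10)
spec = specificity(true_negatives=160, false_positives=40)
print(sens, spec)  # 0.9 0.8
```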
Causes for Misclassification
• Participant recall
• Ambiguous questions
• Under or overzealous interviewers
• Problems in biological specimen question
• Faulty instruments
• Data management problems
[Figure: Non-Differential Misclassification of Exposure. 2x2 table of Exposed (+/-) by Diseased (+/-); study sample drawn from the reference/target population]
• Problems with sensitivity - independent of disease status
• Problems with specificity - independent of disease status
Non-differential Misclassification of Exposure

Truth: No misclassification (100% sensitivity/specificity)
Exposure    Cases    Controls
Yes         50       20
No          50       80
OR = (50/50) / (20/80) = 4.0

Presence of 70% sensitivity in exposure classification
Exposure    Cases       Controls
Yes         50-15=35    20-6=14
No          50+15=65    80+6=86
OR = (35/65) / (14/86) = 3.3

Effect of non-differential misclassification of 2 exposure categories: Bias the OR toward the null value of 1.0
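The 70% sensitivity example can be reproduced with a few lines of arithmetic. The helper names below are hypothetical; the counts, sensitivity, and specificity are those given in the example above.

```python
def misclassify(exposed, unexposed, sens, spec):
    """Expected observed counts after imperfect exposure classification."""
    observed_exposed = exposed * sens + unexposed * (1 - spec)
    observed_unexposed = exposed * (1 - sens) + unexposed * spec
    return observed_exposed, observed_unexposed


def odds_ratio(case_exp, case_unexp, control_exp, control_unexp):
    return (case_exp / case_unexp) / (control_exp / control_unexp)


# True counts: cases 50 exposed / 50 unexposed, controls 20 / 80
true_or = odds_ratio(50, 50, 20, 80)

# Same sensitivity (0.70) and specificity (1.0) in cases and controls,
# i.e. non-differential misclassification
cases = misclassify(50, 50, sens=0.70, spec=1.0)
controls = misclassify(20, 80, sens=0.70, spec=1.0)
observed_or = odds_ratio(*cases, *controls)

print(round(true_or, 1), round(observed_or, 1))  # 4.0 3.3
```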
[Figure: Non-Differential Misclassification of Exposure: Imperfect Sensitivity. 2x2 table of Exposed (+/-) by Diseased (+/-); study sample drawn from the reference/target population; problems with sensitivity]
[Figure: Non-Differential Misclassification of Exposure. 2x2 table of Exposed (+/-) by Diseased (+/-); problems with sensitivity and problems with specificity, both independent of disease status]
Non-Differential Misclassification of Exposure: Imperfect Sensitivity and Specificity

True distribution (gold standard):
Exposure    Cases    Controls
Yes         50       20
No          50       80
True OR = (50/50) / (20/80) = 4.0

Study distribution (sensitivity 0.90, specificity 0.80):
                         Cases                          Controls
                         True exp  True unexp  Total    True exp  True unexp  Total
Classified exposed       45        10          55       18        16          34
Classified unexposed     5         40          45       2         64          66

Observed distribution:
Exposure    Cases    Controls
Yes         55       34
No          45       66
Observed OR = (55/45) / (34/66) = 2.4
Non-differential Misclassification of Exposure: Magnitude of Bias on the Odds Ratio
Assume True OR=4.0

Sensitivity    Specificity    Prev of Exp in controls    Observed OR
0.90           0.85           0.20                       2.6
0.60           0.85           0.20                       1.9
0.90           0.95           0.20                       3.2
0.90           0.60           0.20                       1.9
0.90           0.90           0.368                      3.0
0.90           0.90           0.20                       2.8
0.90           0.90           0.077                      2.2
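The observed odds ratios tabulated above can be derived from the true odds ratio, the prevalence of exposure in controls, and the sensitivity and specificity of exposure classification. The function below is a sketch of that calculation (its name and structure are illustrative, not from the lecture); it assumes the same sensitivity and specificity apply to cases and controls.

```python
def observed_or(true_or, prev_controls, sens, spec):
    """Observed OR under non-differential exposure misclassification."""
    # True exposure prevalence in cases implied by the true OR
    odds_controls = prev_controls / (1 - prev_controls)
    odds_cases = true_or * odds_controls
    prev_cases = odds_cases / (1 + odds_cases)

    def classified_exposed(p):
        # True positives plus false positives, as a proportion
        return sens * p + (1 - spec) * (1 - p)

    obs_cases = classified_exposed(prev_cases)
    obs_controls = classified_exposed(prev_controls)
    return (obs_cases / (1 - obs_cases)) / (obs_controls / (1 - obs_controls))


# (sensitivity, specificity, prevalence of exposure in controls)
rows = [
    (0.90, 0.85, 0.20),
    (0.60, 0.85, 0.20),
    (0.90, 0.95, 0.20),
    (0.90, 0.60, 0.20),
    (0.90, 0.90, 0.368),
    (0.90, 0.90, 0.20),
    (0.90, 0.90, 0.077),
]
results = [round(observed_or(4.0, prev, se, sp), 1) for se, sp, prev in rows]
print(results)  # [2.6, 1.9, 3.2, 1.9, 3.0, 2.8, 2.2]
```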
[Figure: Non-Differential Misclassification of Outcome. 2x2 table of Exposed (+/-) by Diseased (+/-); study sample drawn from the reference/target population]
• Problems with sensitivity - independent of exposure status
• Problems with specificity - independent of exposure status
Non-differential Misclassification of Outcome: Magnitude of Bias on the Odds Ratio
Assume True OR=4.0

Sensitivity    Specificity    Prev of Exp in controls    Observed OR
0.90           0.85           0.20                       2.8
0.60           0.85           0.20                       1.9
0.90           0.95           0.20                       3.2
0.90           0.60           0.20                       2.1
Special Situation in a Cohort or Cross-sectional Study
Misclassification of outcome
• If specificity of outcome measurement is 100%
• Any degree of imperfect sensitivity, if non-differential, will not bias the risk ratio or prevalence ratio
• e.g.

Truth
             Disease    No Disease    Total
Exposed      20         80            100
Unexposed    10         90            100
Risk (or prevalence) ratio = (20/100) / (10/100) = 2.0

70% sensitivity
             Disease    No Disease    Total
Exposed      20-6=14    80+6=86       100
Unexposed    10-3=7     90+3=93       100
Risk (or prevalence) ratio = (14/100) / (7/100) = 2.0

• Worth knowing about when choosing cutoff for continuous variables on ROC curves: choose most specific cutoff
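The special situation can be checked with a short calculation: when specificity is 100%, imperfect sensitivity scales both risks by the same factor, so the ratio is unchanged. The function names are illustrative; the risks (20% and 10%) and the 70% sensitivity are those in the example, while the 90% specificity used for contrast is a hypothetical value.

```python
def observed_risk(true_risk, sens, spec):
    """Observed risk when the outcome is measured with given sensitivity/specificity."""
    return sens * true_risk + (1 - spec) * (1 - true_risk)


def observed_risk_ratio(risk_exposed, risk_unexposed, sens, spec):
    return observed_risk(risk_exposed, sens, spec) / observed_risk(
        risk_unexposed, sens, spec
    )


truth = observed_risk_ratio(0.20, 0.10, sens=1.0, spec=1.0)
perfect_spec = observed_risk_ratio(0.20, 0.10, sens=0.70, spec=1.0)
# Hypothetical contrast: once specificity drops below 100%, bias toward the null appears
imperfect_spec = observed_risk_ratio(0.20, 0.10, sens=0.70, spec=0.90)

print(round(truth, 2), round(perfect_spec, 2), round(imperfect_spec, 2))
```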
Differential Misclassification of Exposure
Weinstock et al. AJE 1991
• Nested case-control study within the Nurses Health Study
• Cases: women with new melanoma diagnoses
• Controls: women w/out melanoma - by incidence density sampling
• Measurements: questionnaire about “tanning ability”; administered shortly after melanoma development

• Question asked after diagnosis
                       Melanoma    No Melanoma
No tan to light tan    15          77
Med to dark tan        19          157
OR = (15/19) / (77/157) = 1.6

• Question asked before diagnosis (NHS baseline)
                       Melanoma    No Melanoma
No tan to light tan    9           79
Med to dark tan        25          155
OR = (9/25) / (79/155) = 0.7
[Figure: “Tanning Ability” and Melanoma. 2x2 table of Exposed (+/-) by Diseased (+/-); study sample drawn from the reference/target population]
Imperfect specificity - mostly in cases
Differential Misclassification of Exposure: Magnitude of Bias on the Odds Ratio
Assume True OR=3.9
Prevalence of Exposure in Controls = 0.1

Exposure Classification
Sensitivity          Specificity
Cases    Controls    Cases    Controls    OR
0.90     0.60        1.0      1.0         5.79
0.60     0.90        1.0      1.0         2.22
1.0      1.0         0.9      0.70        1.00
1.0      1.0         0.7      0.90        4.43
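The same kind of calculation can be sketched for differential misclassification by letting sensitivity and specificity differ between cases and controls. The function below is illustrative rather than the source of the tabulated figures: using a true OR of exactly 3.9 it gives values close to, but not identical with, those in the table, presumably because the 3.9 shown on the slide is itself rounded.

```python
def observed_or(true_or, prev_controls, sens_cases, sens_controls,
                spec_cases, spec_controls):
    """Observed OR when exposure classification accuracy differs by disease status."""
    odds_controls = prev_controls / (1 - prev_controls)
    odds_cases = true_or * odds_controls
    prev_cases = odds_cases / (1 + odds_cases)

    # Proportion classified as exposed = true positives + false positives
    obs_cases = sens_cases * prev_cases + (1 - spec_cases) * (1 - prev_cases)
    obs_controls = (sens_controls * prev_controls
                    + (1 - spec_controls) * (1 - prev_controls))
    return (obs_cases / (1 - obs_cases)) / (obs_controls / (1 - obs_controls))


# (sens cases, sens controls, spec cases, spec controls) as in the table
rows = [
    (0.90, 0.60, 1.0, 1.0),
    (0.60, 0.90, 1.0, 1.0),
    (1.0, 1.0, 0.9, 0.70),
    (1.0, 1.0, 0.7, 0.90),
]
results = [observed_or(3.9, 0.1, *row) for row in rows]
print([round(r, 2) for r in results])
```

The first and last rows are biased away from the null, the second and third toward it, which is the point of the table: with differential misclassification the direction of bias is not predictable in advance.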
Misclassification: Summary of Effects
• Dichotomous exposure and outcome

Misclassification      Measure of Association
Non-differential
  Exposure             Towards null
  Outcome              Towards null*
Differential
  Exposure             Away or towards null
  Outcome              Away or towards null
*Exception: When specificity is 100%, no effect on risk ratio regardless of sensitivity

• Multi-level exposure and/or outcome
– more complicated and less predictable
– e.g. non-differential misclassification can lead to bias away from null
[Figure: diagrams contrasting poor vs. good reproducibility and poor vs. good validity]
Managing Measurement Bias
• Prevention and avoidance are critical
• If true sensitivity/specificity are known, complex back-calculation techniques exist that can be used in the analysis phase
• Optimize the reproducibility/validity of your measurements!
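As one simple instance of the back-calculation idea mentioned above, observed counts for a dichotomous exposure can be corrected algebraically when sensitivity and specificity are known. This is a sketch of a standard correction, not necessarily the technique the lecture has in mind; it assumes non-differential error, sensitivity + specificity > 1, and ignores the added statistical uncertainty. The inputs are the observed counts from the sensitivity 0.90 / specificity 0.80 example earlier in the lecture.

```python
def corrected_exposed(observed_exposed, total, sens, spec):
    """Back-calculate the true number exposed from the observed number."""
    return (observed_exposed - (1 - spec) * total) / (sens + spec - 1)


def odds_ratio(case_exp, case_unexp, control_exp, control_unexp):
    return (case_exp / case_unexp) / (control_exp / control_unexp)


# Observed: cases 55 exposed / 45 unexposed; controls 34 exposed / 66 unexposed
observed = odds_ratio(55, 45, 34, 66)

true_cases_exp = corrected_exposed(55, 100, sens=0.90, spec=0.80)
true_controls_exp = corrected_exposed(34, 100, sens=0.90, spec=0.80)
corrected = odds_ratio(
    true_cases_exp, 100 - true_cases_exp,
    true_controls_exp, 100 - true_controls_exp,
)

print(round(observed, 1), round(corrected, 1))  # 2.4 4.0
```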
Selection Bias in a Clinical Trial
• Losses to follow-up are the big unknown in clinical trials and the major potential for selection bias
• If:
– a symptomatic side effect of a drug is more common in persons “sick” from disease
– occurrence of the side effect is associated with more losses to follow-up
• Then:
– drug treatment group would be selectively depleted of the sickest persons
– drug overall looks better