Case-Control Studies and Odds Ratio
STAT 6395
Spring 2008
Filardo and Ng
Lopez-Carrillo et al. Chili pepper consumption and gastric cancer in Mexico
• Hypothesis: chili pepper consumption increases the risk of gastric cancer
• Source population: residents of Mexico City metropolitan area
• Cases: Cases of stomach cancer diagnosed between September 17, 1989 and June 30, 1990 in 15 Mexico City metropolitan area hospitals (about 80% of the stomach cancer cases)
• Controls: Age-stratified random sample of residents of the Mexico City metropolitan area selected from the household sampling frame of the 1986-87 Mexican National Health Survey
Measurement of exposure: interview (structured questionnaire)
• Cases were queried about their dietary habits during the 12-month period prior to the onset of symptoms.
• Controls were queried about their dietary habits during the 12-month period preceding the interview.
• Cases and controls were asked “Do you eat chili peppers or chili sauces with your meals?”
Case-control study of stomach cancer and chili pepper consumption
Chili pepper consumption    Stomach cancer cases    Controls
Yes                         204                     552
No                          9                       145
Total                       213                     697
Odds ratio = (204 × 145)/(552 × 9) = 5.95
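The cross-product calculation above can be sketched in a few lines of Python. The function name `odds_ratio` and its parameter names are illustrative choices, not part of the original slides; the counts are the ones reported in the chili pepper table.

```python
def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Cross-product odds ratio for a 2x2 case-control table."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)


# Counts from the chili pepper consumption table.
chili_or = odds_ratio(
    exposed_cases=204,
    exposed_controls=552,
    unexposed_cases=9,
    unexposed_controls=145,
)
print(f"Odds ratio = {chili_or:.2f}")  # prints "Odds ratio = 5.95"
```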
Data layout for case-control study with multiple levels of exposure
First select cases and controls (columns), then measure past exposure (rows)
                 Cases          Controls
High exposure    a1             b1
Low exposure     a2             b2
Not exposed      c              d
Total            a1 + a2 + c    b1 + b2 + d
Calculation of odds ratio in case-control study with multiple levels of exposure
No exposure group → ‘reference’
OR for high exposure group → (a1d)/(b1c)
OR for low exposure group → (a2d)/(b2c)
• Hypothesis: HPV infection greatly increases the risk of CIN, a precursor of cervical cancer
• Source population: Women receiving Pap smears at the Kaiser Permanente prepaid health plan in Portland, Oregon between 4/1/1989 and 11/2/1990
Schiffman et al. Epidemiologic evidence showing that human papillomavirus infection causes most cervical intraepithelial neoplasia
• Cases: women found to have CIN on Pap smear
• Controls: a random sample of women with normal Pap smear and no known history of CIN
• Measurement of exposure: cervicovaginal lavage, as part of the Pap smear screening, for HPV testing
Schiffman et al. Epidemiologic evidence showing that human papillomavirus infection causes most cervical intraepithelial neoplasia
HPV status                          CIN cases    Controls    Odds ratio
Types 16, 18                        158          13          50.9
Types 31, 33, 35, 39, 45, 51, 52    117          15          32.7
Types 6, 11, 42, other, unknown     108          52          8.7
Negative                            89           373         1.0
Total                               472          453
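As a rough check on the HPV table, each exposure level can be compared with the HPV-negative reference group using the (a·d)/(b·c) rule for multiple exposure levels. The helper name `odds_ratios_vs_reference` and the dictionary layout are assumptions made for this sketch; the counts come from the table.

```python
def odds_ratios_vs_reference(table, reference):
    """Odds ratio of every exposure level against the reference level.

    `table` maps an exposure level to a (cases, controls) pair.
    """
    ref_cases, ref_controls = table[reference]
    return {
        level: (cases * ref_controls) / (controls * ref_cases)
        for level, (cases, controls) in table.items()
    }


# Counts from the Schiffman et al. table: (CIN cases, controls).
hpv_table = {
    "Types 16, 18": (158, 13),
    "Types 31, 33, 35, 39, 45, 51, 52": (117, 15),
    "Types 6, 11, 42, other, unknown": (108, 52),
    "Negative": (89, 373),
}

hpv_ors = odds_ratios_vs_reference(hpv_table, reference="Negative")
for level, value in hpv_ors.items():
    print(f"{level}: OR = {value:.1f}")
```

The reference group always has an odds ratio of 1.0 by construction, which matches the "Negative" row of the table.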
Matched-pair case-control study with a dichotomous (yes/no) exposure
4 possible combinations of matched pairs:
• Pairs in which both the case and the control were exposed (concordant)
• Pairs in which neither the case nor the control was exposed (concordant)
• Pairs in which the case was exposed but the control was not (discordant)
• Pairs in which the control was exposed but the case was not (discordant)
Data layout for matched-pair case-control study with a dichotomous exposure
Exposure status of matched pairs (“+” = exposed; “-” = unexposed)
Exposure status (case/control)    +/+    +/-    -/+    -/-
Number of matched pairs           q      r      s      t
q and t are concordant pairs (uninformative)
r and s are discordant pairs
Odds ratio = r/s
• Hypothesis: Infection of genital tract with Chlamydia trachomatis increases the risk of ectopic pregnancy
• Source population: Surrounding area of several hospitals
• Cases: Women admitted to the participating hospitals for ectopic pregnancy
• Controls: women attending prenatal clinics at those hospitals
Chow et al. The association between Chlamydia trachomatis and ectopic pregnancy
• Controls were individually matched to cases on age (within 1 year), ethnicity, and hospital
• Measurement of exposure: blood drawn for antibody titer
Chow et al. The association between Chlamydia trachomatis and ectopic pregnancy
Exposure status of matched pairs (“+” = exposed; “-” = unexposed)
Exposure status (case/control)    +/+    +/-    -/+    -/-
Number of matched pairs           72     109    36     40
Odds ratio = 109/36 = 3.0
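The matched-pair estimate uses only the discordant pairs. The sketch below recomputes it from the Chow et al. counts and, purely for contrast, also shows the cross-product odds ratio obtained if the matching were ignored and the pairs collapsed into an ordinary 2x2 table. That unmatched figure does not appear in the slides; it is included only as an illustration, and the variable names are assumptions.

```python
# Pair counts from the Chow et al. table (case/control exposure status).
q = 72   # +/+ : both exposed (concordant)
r = 109  # +/- : case exposed, control not (discordant)
s = 36   # -/+ : control exposed, case not (discordant)
t = 40   # -/- : neither exposed (concordant)

# Matched-pair odds ratio: ratio of the two kinds of discordant pairs.
matched_or = r / s

# Illustration only: collapse the pairs into an unmatched 2x2 table.
cases_exposed, cases_unexposed = q + r, s + t
controls_exposed, controls_unexposed = q + s, r + t
unmatched_or = (cases_exposed * controls_unexposed) / (
    controls_exposed * cases_unexposed
)

print(f"Matched-pair OR = {matched_or:.2f}")          # prints "Matched-pair OR = 3.03"
print(f"Unmatched (collapsed) OR = {unmatched_or:.2f}")  # prints "Unmatched (collapsed) OR = 3.29"
```

Changing q or t leaves `matched_or` untouched, which is why the concordant pairs are described as uninformative.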
Advantages of case-control studies
• Take considerably less time to conduct than concurrent cohort studies
• If the exposures of interest are relatively common, require a much smaller sample size than cohort studies
• Considerably less costly to conduct than concurrent cohort studies
• Can test hypotheses about the relationship between many different exposures and the disease of interest in a single study
• Useful for studying rare diseases
Disadvantages of case-control studies
• Past exposures are ascertained after the onset of disease
  - Uncertainty about whether exposures preceded the onset of disease
  - Exposure information is less accurate
  - Possibility of recall bias: differential recall of exposures by those who develop the disease compared to those who do not
• Prone to biases in selection of controls
• Can’t calculate actual incidence rates; can only estimate relative risks from the odds ratio
• Case-control studies are observational studies
Smoking and Lung Cancer
• Dramatic increase in reported mortality rates from lung cancer in Great Britain and other countries from 1920s to 1950
• Four major hypotheses about the cause of these increases:
  - Improved diagnosis
  - Air pollution
  - Occupational exposures
  - Tobacco smoking
• Several small case-control studies pointed to smoking
Smoking and Lung Cancer
Landmark Studies in 1950
• Hypothesis: smoking increases the risk of lung cancer
• Wynder and Graham: United States
• Doll and Hill: Great Britain
• Large hospital-based case-control studies
• Landmark studies because of:
  - Importance of the result
  - Thoughtfulness of the methodology and interpretation
Smoking and Lung Cancer
Smoking: Attributable risk percent (population) in Canada
Cause of death             AR% (population)
All causes                 20%
Lung cancer                85%
COPD                       80%
Coronary heart disease     15%
Cerebrovascular disease    10%
Smoking and Lung Cancer
Smoking is a risk factor for:
• Other cancers: larynx, oral cavity, esophagus, bladder, pancreas, kidney, cervix
• Peripheral artery occlusive disease
• Peptic ulcers
• Periodontal disease
• Low birth weight, preterm delivery, neonatal death
• …and fires in homes
Smoking and Lung Cancer
Environmental Tobacco Smoke
• Lung cancer
• Coronary heart disease
• Eye irritation
• Respiratory symptoms
• Aggravates allergic symptoms
Smoking and Lung Cancer
Doll and Hill study: Cases
• Provisional cases: all patients presumed to be newly diagnosed with carcinoma of the lung admitted to 20 London hospitals between April 1948 and October 1949
• Notification about new cases: admitting clerk, physician, cancer registrar, or radiotherapy department
Smoking and Lung Cancer
Doll and Hill study: Cases
• Authors believed they missed some cases, but there was no reason to believe that the missed cases smoked more or less than the reported cases
• Interviewer visited the hospital to interview the patient
Smoking and Lung Cancer
Doll and Hill study: Controls
• For each lung cancer case, the interviewer chose a noncancer control patient matched to the case on:
  - Sex
  - Age (within the same 5-year age group)
  - Hospital (in the same hospital at or about the same time)
Smoking and Lung Cancer
• At 2 hospitals (Brompton and Harefield), it was not always possible to find a control. In these instances, controls were selected from one of the 2 neighboring hospitals.
• Even this did not always work for Brompton Hospital, so patients who had been interviewed as cancer patients and were later determined not to have cancer were used as controls.
Doll and Hill study: Controls
Smoking and Lung Cancer
• Initial diagnoses were provisional
• Generally the hospital discharge diagnosis was accepted as the final diagnosis
• Later evidence (autopsy, biopsy) taken into account
• Final diagnosis based on best available evidence
Doll and Hill study: Confirmation of diagnoses
Smoking and Lung Cancer
Doll and Hill study: Flow Diagram of Case Selection
Notification of provisional cancer cases (including stomach and colorectal cancers) (2,370)
Excluded: cases > age 75 (150)
Excluded: diagnosis changed before interview (80)
Provisional cancer cases eligible for interview (2,140)
Smoking and Lung Cancer
Provisional cancer cases eligible for interview (2,140)
Not interviewed (408):
  Discharged before interview (189)
  Too ill to be interviewed (116)
  Died before interview (67)
  Too deaf (24)
  Unable to speak English (11)
  Patient unreliable (1)
  Patient refused (0)
Provisional cancer cases interviewed (1,732)
Smoking and Lung Cancer
Interviewed study subjects (1,732 provisional cases and 743 original controls) after confirmation of diagnoses
Disease group                   No. of subjects
Lung carcinoma                  709
Stomach/colorectal carcinoma    637
Other cancers                   81
Non-cancer controls             709
Other subjects*                 335
Excluded subjects&              4
*Provisional cancer cases found not to have cancer on final diagnosis, and original non-cancer controls whose paired provisional lung carcinoma cases were found not to be lung carcinoma
&Excluded due to doubts about their true category
Smoking and Lung Cancer
Subjects included in this study
• 709 cases of carcinoma of the lung (649 men, 60 women)
• 709 non-cancer controls, matched to the cases by sex, age, and hospital (649 men, 60 women)
Smoking and Lung Cancer
Subjects included in this study
• Social class distribution of cases and controls was similar
• A higher proportion of lung cancer cases lived outside of London
Smoking and Lung Cancer
Measurement of smoking - Patients were asked:
• If they had smoked at any period of their lives
• The ages at which they had started and stopped
• The amount they smoked before the onset of their illness
• The main changes in their smoking history
• The maximum they had ever smoked
Smoking and Lung Cancer
Measurement of smoking - Patients were asked:
• The proportion of their smoking that was cigarettes vs. pipes
• Whether or not they inhaled
Definition of a smoker: A person who had smoked as much as one cigarette a day for as long as one year
Smoking and Lung Cancer
Reliability sub-study
• Reliability is the reproducibility or repeatability of a measurement
• Validity is the degree to which a measurement measures what it purports to measure (accuracy)
• 50 controls were re-interviewed about their smoking histories 6 months or more after the initial interview
• Found fairly good agreement between the 2 interviews
Smoking and Lung Cancer
Male Smokers and Non-smokers (2x2 table)
               Lung cancer cases    Controls
Smokers        647                  622
Non-smokers    2                    27
Odds ratio = (647 × 27)/(622 × 2) = 14.0
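The slide reports only the point estimate. As a hedged illustration of how much uncertainty a cell of 2 non-smoking cases carries, the sketch below adds an approximate 95% confidence interval using Woolf's logit method. The interval and the function name `odds_ratio_with_ci` are not from the original slides; they are assumptions added for illustration.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Cross-product odds ratio with an approximate (Woolf, logit) interval.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under Woolf's approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log_or)
    upper = math.exp(math.log(estimate) + z * se_log_or)
    return estimate, lower, upper


# Male smokers and non-smokers from the Doll and Hill table.
smoking_or, ci_lower, ci_upper = odds_ratio_with_ci(a=647, b=622, c=2, d=27)
print(f"OR = {smoking_or:.1f}, approximate 95% CI {ci_lower:.1f} to {ci_upper:.1f}")
```

The interval is wide because the 1/c term (c = 2) dominates the standard error; Woolf's approximation is also known to be rough when a cell is this small, so the limits should be read as indicative only.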
Smoking and Lung Cancer
Lung cancer and amount smoked immediately before the onset of illness*
Cigarettes/day    Cases    Controls    Odds ratio
50+               32       13          33.2
25-49             136      71          25.9
15-24             196      190         13.9
5-14              250      293         11.5
1-4               33       55          8.1
0                 2        27          1.0
*If the subject had given up smoking before then, he was classified by the amount smoked immediately prior to giving up smoking.
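The dose-response pattern in this table can be checked with a short script: each smoking level is compared with the non-smokers, and the resulting odds ratios are tested for a steady increase. The list layout and variable names are assumptions for this sketch; the counts are those in the table.

```python
# (cigarettes/day, cases, controls), ordered from lowest to highest exposure.
smoking_levels = [
    ("0", 2, 27),        # reference group: non-smokers
    ("1-4", 33, 55),
    ("5-14", 250, 293),
    ("15-24", 196, 190),
    ("25-49", 136, 71),
    ("50+", 32, 13),
]

_, ref_cases, ref_controls = smoking_levels[0]

# Odds ratio of each level against the non-smoking reference group.
dose_ors = [
    (cases * ref_controls) / (controls * ref_cases)
    for _, cases, controls in smoking_levels
]

for (label, _, _), value in zip(smoking_levels, dose_ors):
    print(f"{label:>6} cigarettes/day: OR = {value:.1f}")

# True when every level has a higher odds ratio than the level below it.
is_monotonic = all(low < high for low, high in zip(dose_ors, dose_ors[1:]))
print("Odds ratio rises with each exposure level:", is_monotonic)
```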
Smoking and Lung Cancer
Lung cancer and other measures of cigarette smoking
• Maximum amount ever smoked regularly
• Total amount of tobacco smoked in lifetime (number of cigarettes)
• Results similar to those obtained for amount smoked immediately before onset of illness
Smoking and Lung Cancer
Alternative explanations of results
• Smoking is a major risk factor for lung cancer
• Selection of an inappropriate control group (selection bias)
Smoking and Lung Cancer
Alternative explanations of results
• Exaggeration of smoking habits by lung cancer cases who thought they had an illness that they could attribute to smoking (recall bias)
• Exaggeration of smoking habits of lung cancer cases by interviewers who believed that smoking caused lung cancer (interviewer bias)
Smoking and Lung Cancer
Selection of controls: place of residence
• A higher proportion of cases than controls lived outside London
• There was no difference in place of residence between cases and controls from the district hospitals, which did not have special cancer treatment facilities
Smoking and Lung Cancer
Selection of controls: place of residence
• A strong association between smoking and lung cancer was observed when the analysis was restricted to the district hospitals
• Persons residing outside London smoked less than London residents
Smoking and Lung Cancer
Selection of controls: did the hospital-based control group smoke less than the source population?
• Would result in an overestimation of the association between smoking and lung cancer
• Many of the subjects in the control group were hospitalized for non-malignant respiratory disease and cardiovascular disease
Smoking and Lung Cancer
Selection of controls: did the hospital-based control group smoke less than the source population?
• Both diseases have a strong association with cigarette smoking
• If anything, the selection of controls was biased toward choosing controls who smoked more than the source population
Smoking and Lung Cancer
Selection of controls: Did the interviewers select a disproportionate number of light smokers to be control patients from among the patients available for selection?
• The smoking habits of the patients whom the interviewers selected for interview did not differ from the smoking habits of the patients (other than lung cancer) whose names were notified by the hospitals (stomach and colorectal cancer patients)
Smoking and Lung Cancer
Did the lung cancer cases exaggerate their smoking habits? (recall bias)
• Having respiratory symptoms may have influenced their replies to the smoking questions
• However, patients with nonmalignant respiratory diseases, who would also have respiratory symptoms, did not give smoking histories appreciably different from patients with nonrespiratory diseases
Smoking and Lung Cancer
Did the lung cancer cases exaggerate their smoking habits? (recall bias)
• Smoking was not thought to be related to lung cancer at the time, so there would be no reason for lung cancer cases to overstate or understate their smoking because they knew they had lung cancer
Smoking and Lung Cancer
Did the interviewers overstate the smoking habits of the lung cancer cases? (interviewer bias)
• Interviewers could not be blinded to diagnoses
• 209 patients thought to have lung cancer at the time of interview later had their diagnoses disproved
Smoking and Lung Cancer
Did the interviewers overstate the smoking habits of the lung cancer cases? (interviewer bias)
• Smoking among the lung cancer cases was much greater than smoking among the patients incorrectly thought to have lung cancer
• Smoking among non-lung cancer patients and patients incorrectly thought to have lung cancer did not differ
Smoking and Lung Cancer
Conclusions
• There is a real association between smoking and lung cancer
• Smoking is an important factor in the development of lung cancer
• The risk increases with increasing amount smoked
Smoking and Lung Cancer
Molecular epidemiology
• The use in epidemiologic studies of molecular, biochemical, pathology, and other laboratory methods to measure exposure, disease, disease precursors, or susceptibility to disease
• Biological marker: a molecular, biochemical, or cellular indicator of exposure, disease, disease precursors, or susceptibility to disease
Genetic epidemiology
• The epidemiologic study of the role of genetic factors and their interactions with environmental factors in the etiology of disease
Hankinson et al. Plasma prolactin levels and subsequent risk of breast cancer in postmenopausal women
• Case-control study nested within the Nurses’ Health Study cohort
• Hypothesis: plasma prolactin levels are positively associated with breast cancer risk in postmenopausal women
Molecular Epidemiology
Methodologic issues in measurement of biologic markers
• Biomarker stability: transport and storage conditions
• Control of extraneous determinants of biomarker levels
• Intra-subject variability (reliability of a single sample)
Molecular Epidemiology
Methodologic issues in measurement of biologic markers
• Intra-laboratory variability: precision of the laboratory assay
• Measurement of biomarker blind with respect to other study variables
• Laboratory drift (systematic intra-laboratory variability over time)
Molecular Epidemiology
Prolactin
• Polypeptide hormone
• Secreted primarily by anterior pituitary gland
• Essential for breast development and lactation
• Involved in mammary tumor formation in rodents
• Produced by normal and malignant breast tissue
Molecular Epidemiology
Prolactin
• More than 50% of breast cancers have prolactin receptors
• Prolactin can stimulate the growth of human breast cancer cells grown in tissue culture
• Extraneous determinants of plasma prolactin:
  - Circadian variation
  - Increases substantially with a noontime meal
  - Higher in women using postmenopausal hormones (PMHs)
Molecular Epidemiology
Previous epidemiologic studies of plasma prolactin and postmenopausal breast cancer
• Case-control studies inconsistent
  - Small
  - Presence of breast cancer may influence prolactin levels
• One previous cohort study with only 40 postmenopausal breast cancer cases. A nonsignificant positive association was observed
Molecular Epidemiology
Nurses’ Health Study
• Concurrent cohort study of 121,700 female registered nurses, age 30-55 years, who returned a mailed questionnaire in 1976
• Baseline for current study: 1989-1990, when blood samples were collected
Molecular Epidemiology
• Collected to assess relationship between serum hormone levels, serum micronutrient levels and disease
• Women willing to provide a blood sample were sent a blood collection kit in 1989-1990
Molecular Epidemiology
Blood samples
• 32,826 women returned blood samples by overnight delivery in a Styrofoam mailer cooled with a frozen gel pack
• Plasma was separated from red and white blood cells, and stored on liquid nitrogen
Molecular Epidemiology
Blood samples
Plasma prolactin stability
• 97% of the blood samples were received within 26 hours of being drawn
• Plasma prolactin is stable indefinitely when frozen on liquid nitrogen
Molecular Epidemiology
Preliminary study
• 3 tubes of blood drawn from each of 9 postmenopausal women
• 1 tube processed immediately
• 2 tubes stored for 24 or 48 hours in a Styrofoam mailer cooled with a frozen gel pack
Molecular Epidemiology
Measured plasma prolactin according to delay in processing after phlebotomy
Delay (hours)    Plasma prolactin (ng/ml)*
0                8.8 (4.3)
24               8.9 (4.5)
48               8.8 (4.2)
*Mean (standard deviation) of results from 9 postmenopausal women
Molecular Epidemiology
Cases
• Postmenopausal at time of blood collection
• No reported cancer diagnosis prior to blood collection
• Diagnosed with breast cancer after blood collection but before June 1, 1994
Molecular Epidemiology
Cases
• All but one case confirmed by medical record review
• Mean time from blood collection to diagnosis: 28 months
Molecular Epidemiology
Controls
• Postmenopausal at time of blood collection
• Matched to cases on:
  - Recent PMH use
  - Time of day of blood draw (± 2 hours)
  - Fasting status at time of blood draw (at least 10 hours since a meal vs. < 10 hours or unknown)
  - Age (± 2 years)
  - Month of blood collection (± 1 month)
Molecular Epidemiology
Nested case-control study (nested within a concurrent cohort study)
[Figure: timeline. Baseline: plasma prolactin collected from 1989 to 1990. Cases and controls identified as of June 1994. Stored specimens used to assess exposure and compare it among study groups.]
Case-control studies: population-based, hospital-based, nested
Defenses against laboratory drift (systematic intra-laboratory variability over time)
• All case-control pairs or case-control triplets were analyzed together
• Replicate quality control samples from the plasma pool were included in each batch to assess inter-batch variability
Molecular Epidemiology
Laboratory drift
• The samples were assayed in 2 batches (1993 and 1996)
Molecular Epidemiology
Laboratory drift
• Prolactin values were systematically higher in the first batch
  - The mean levels of the quality control samples were higher in batch 1 than in batch 2
  - Using quartile cut points based on all control subjects combined, the highest quartile contained 41% of batch 1 control subjects and 18% of batch 2 control subjects
• Without drift, the highest quartile would be expected to contain about 25% of batch 1 and 25% of batch 2 control subjects
Molecular Epidemiology
• To categorize cases and controls into plasma prolactin exposure categories, batch was taken into account
• Batch-specific quartile cut points were defined based on the distribution of the control values in each batch
Molecular Epidemiology
Laboratory drift
• What if the investigators had been oblivious to the problem of laboratory drift and all of the cases had been assayed in batch 1 and all of the controls in batch 2?
Molecular Epidemiology
Laboratory drift
Batch-specific cut points
Quartile       Batch 1 (ng/ml)    Batch 2 (ng/ml)
1 (lowest)     <6.4               <5.9
2              6.5-9.3            6.0-7.6
3              9.4-13.7           7.7-9.7
4 (highest)    >13.7              >9.7
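A minimal sketch of how batch-specific cut points might be applied when assigning a subject to a prolactin quartile. The cut points are taken from the table; the function name `prolactin_quartile` is hypothetical, and treating each upper bound as inclusive (for example, 6.4 ng/ml falls in quartile 1 of batch 1) is an assumption, since the table does not say how boundary values were handled.

```python
from bisect import bisect_left

# Upper bounds (ng/ml) of quartiles 1-3 for each assay batch, from the table.
BATCH_CUT_POINTS = {
    1: [6.4, 9.3, 13.7],
    2: [5.9, 7.6, 9.7],
}


def prolactin_quartile(value_ng_ml, batch):
    """Return the quartile (1-4) for a prolactin value within its own batch.

    Assumption: a value equal to a cut point belongs to the lower quartile.
    """
    return bisect_left(BATCH_CUT_POINTS[batch], value_ng_ml) + 1


# The same measured value lands in different quartiles depending on batch.
print(prolactin_quartile(10.0, batch=1))  # prints 3
print(prolactin_quartile(10.0, batch=2))  # prints 4
```

Classifying each subject against the cut points of her own batch keeps a systematic shift between batches from being mistaken for a difference in exposure.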
Molecular Epidemiology
Final study cohort
• Study subjects initially eligible: 337 cases, 493 controls
• Final study subjects after exclusions: 306 cases, 448 controls
Molecular Epidemiology
Baseline characteristics of cases and controls
Characteristic                       Cases (mean)    Controls (mean)    p-value
Age                                  61.5            61.9               -
Age at menarche                      12.5            12.6               0.30
Age at menopause                     49.1            49.4               0.39
Parity                               3.2             3.5                0.51
BMI at baseline                      25.7            25.7               0.92
Family history of breast cancer      16.6%           12.7%              0.05
History of benign breast disease     49.8%           37.5%              0.003
Median plasma prolactin (ng/ml)      9.0             7.9                0.01
Molecular Epidemiology
Multivariable ORs for the relationship between plasma prolactin and breast cancer
Quartile    Cases    Controls    OR*     95% CI
1           64       121         1.00    -
2           63       112         1.05    0.65-1.71
3           79       112         1.45    0.91-2.31
4           100      103         2.03    1.24-3.31
*Multivariable conditional logistic regression adjusted for BMI at age 18, family history of breast cancer, age at menarche, age at first birth, parity, age at menopause, duration of PMH use
Molecular Epidemiology
ORs for the relationship between plasma prolactin and breast cancer according to batch
Quartile    OR (batch 1)         OR (batch 2)
1           1.00                 1.00
2           1.45                 0.87
3           1.81                 1.53
4           1.83 (0.79-4.23)*    2.47 (1.28-4.76)*
*95% CI
Molecular Epidemiology
Conclusions
• This was the first study of the relationship between plasma prolactin and risk of breast cancer with sufficient power to detect a weak to moderate association
• A significant positive association between plasma prolactin and subsequent risk of breast cancer was observed
Molecular Epidemiology
Conclusions
• The study design (case-control study nested within a concurrent cohort study) ensured that the elevated prolactin levels occurred before the breast cancer
• Additional studies are needed to confirm and extend this finding
Molecular Epidemiology