Analysis of matched data; plus, diagnostic testing
Correlated Observations
Correlated data arise when pairs or clusters of observations are related and thus are more similar to each other than to other observations in the dataset.
Ignoring correlations will:
– overestimate p-values for within-person or within-cluster comparisons
– underestimate p-values for between-person or between-cluster comparisons
Pair Matching: Why match?
Pairing can control for extraneous sources of variability and increase the power of a statistical test.
Match 1 control to 1 case based on potential confounders, such as age, gender, and smoking.
Example
Johnson and Johnson (NEJM 287: 1122-1125, 1972) selected 85 Hodgkin’s patients who had a sibling of the same sex who was free of the disease and whose age was within 5 years of the patient’s. They presented the data as:
                 Tonsillectomy   None
Hodgkin’s             41          44
Sib control           33          52
OR = 1.47; chi-square = 1.53 (NS)
From John A. Rice, “Mathematical Statistics and Data Analysis.”
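The unpaired figures above can be reproduced with a short sketch. It assumes the Pearson chi-square without continuity correction (which matches the reported 1.53); the function name `unpaired_2x2` is illustrative and not from the slides.

```python
def unpaired_2x2(a, b, c, d):
    """Odds ratio and Pearson chi-square (no continuity correction)
    for a 2x2 table [[a, b], [c, d]] treated as independent samples."""
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi_square = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    return odds_ratio, chi_square

# Rows: Hodgkin's patients, sibling controls; columns: tonsillectomy, none.
or_unpaired, chi2_unpaired = unpaired_2x2(41, 44, 33, 52)
print(round(or_unpaired, 2), round(chi2_unpaired, 2))
```

This analysis treats the 170 people as independent, which is exactly the assumption the letters to the editor objected to.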
Example
But several letters to the editor pointed out that those investigators had made an error by ignoring the pairings. These are not independent samples because the sibs are paired. It is better to analyze the data like this:
                             Control (sib)
Case (Hodgkin’s)       Tonsillectomy   None
Tonsillectomy               26          15
None                         7          37
OR = 2.14*; chi-square = 2.91 (p = .09)
From John A. Rice, “Mathematical Statistics and Data Analysis.”
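As a rough check on the paired analysis, the sketch below stores the 85 sibling pairs by exposure pattern, shows that collapsing them recovers the unpaired margins (41 and 33), and then computes the matched odds ratio and chi-square from the discordant pairs only. The dictionary layout and variable names are assumptions made for illustration.

```python
import math

# (case exposed?, control exposed?) -> number of sibling pairs
pairs = {
    (True, True): 26,    # both had a tonsillectomy
    (True, False): 15,   # only the Hodgkin's patient did (cell b)
    (False, True): 7,    # only the sibling control did (cell c)
    (False, False): 37,  # neither did
}

# Collapsing the pairs recovers the margins of the unpaired table.
cases_exposed = pairs[(True, True)] + pairs[(True, False)]
controls_exposed = pairs[(True, True)] + pairs[(False, True)]

# Only the discordant pairs enter the matched analysis.
b, c = pairs[(True, False)], pairs[(False, True)]
matched_or = b / c
chi_square = (b - c) ** 2 / (b + c)
# Upper tail of a chi-square with 1 df, written via the normal tail.
p_value = math.erfc(math.sqrt(chi_square / 2))

print(cases_exposed, controls_exposed)
print(round(matched_or, 2), round(chi_square, 2), round(p_value, 2))
```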
Pair Matching: example
Match each MI case to a control (no MI) based on age and gender.
Ask about history of diabetes to find out whether diabetes increases the risk of MI.
Pair Matching: example
Which cells are informative?
Just the discordant cells are informative!
                        MI controls
MI cases          Diabetes   No diabetes   Total
Diabetes              9          37          46
No diabetes          16          82          98
Total                25         119         144
Pair Matching
OR estimate comes only from discordant pairs!
The question is: among the discordant pairs, what proportion are discordant in the direction of the case vs. the direction of the control? If more discordant pairs “favor” the case, this indicates OR>1.
P(“favors” case | discordant pair) =
$$\hat{p} = \frac{b}{b+c} = \frac{37}{37+16} = \frac{37}{53}$$
odds(“favors” case | discordant pair) =
$$OR = \frac{b}{c} = \frac{37}{16}$$
OR estimate comes only from discordant pairs!!
OR = 37/16 = 2.31
Makes Sense!
McNemar’s Test
                        MI controls
MI cases          Diabetes   No diabetes
Diabetes              9          37
No diabetes          16          82
Null hypothesis: P(“favors” case | discordant pair) = .5 (note: equivalent to OR=1.0 or cell b=cell c)
$$p\text{-value} = \binom{53}{37}(.5)^{37}(.5)^{16} + \binom{53}{38}(.5)^{38}(.5)^{15} + \binom{53}{39}(.5)^{39}(.5)^{14} + \dots$$
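The binomial sum above can be evaluated directly. This is a minimal sketch: the sum on the slide is read as the upper tail P(X >= 37), and doubling it for a two-sided p-value is an assumption of this example (reasonable because Binomial(n, .5) is symmetric). The helper name is hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

b, c = 37, 16                      # discordant pairs favoring case / control
upper_tail = binomial_upper_tail(b, b + c)
two_sided = min(1.0, 2 * upper_tail)
print(upper_tail, two_sided)
```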
McNemar’s Test
Null hypothesis: P(“favors” case | discordant pair) = .5 (note: equivalent to OR=1.0 or cell b=cell c)
By normal approximation to binomial:
$$Z = \frac{37 - \frac{53}{2}}{\sqrt{53(.5)(.5)}} = \frac{10.5}{3.64} = 2.88; \quad p < .01$$
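McNemar’s Test: generally
                 controls
cases         exp    No exp
exp            a       b
No exp         c       d
By normal approximation to binomial:
$$Z = \frac{b - \frac{b+c}{2}}{\sqrt{(b+c)(.5)(.5)}} = \frac{\frac{b-c}{2}}{\sqrt{\frac{b+c}{4}}} = \frac{b-c}{\sqrt{b+c}}$$
Equivalently:
$$\chi^2_1 = \left(\frac{b-c}{\sqrt{b+c}}\right)^2 = \frac{(b-c)^2}{b+c}$$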
McNemar’s Test
$$\chi^2_1 = \frac{(37-16)^2}{53} = \frac{21^2}{53} = 8.32 = 2.88^2; \quad p < .01$$
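To see that the Z form and the chi-square form are the same test, the following sketch computes both from the discordant counts and checks that Z squared equals the chi-square. The function name is illustrative, and the two-sided p-value via `math.erfc` is this example's choice rather than something shown on the slides.

```python
import math

def mcnemar_normal_approx(b, c):
    """Z and chi-square (1 df) forms of McNemar's test from discordant counts."""
    n_disc = b + c
    z = (b - n_disc / 2) / math.sqrt(n_disc * 0.5 * 0.5)
    chi_square = (b - c) ** 2 / n_disc
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, chi_square, p_two_sided

z, chi2, p = mcnemar_normal_approx(37, 16)
print(round(z, 2), round(chi2, 2), p < 0.01)
```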
Example: McNemar’s EXACT test
Split-face trial:
– Researchers assigned 56 subjects to apply SPF 85 sunscreen to one side of their faces and SPF 50 to the other prior to engaging in 5 hours of outdoor sports during mid-day. The outcome is sunburn (yes/no).
– Unit of observation = side of a face
– Are the observations correlated? Yes.
Russak JE et al. JAAD 2010; 62: 348-349.
Results ignoring correlation:
Table I -- Dermatologist grading of sunburn after an average of 5 hours of skiing/snowboarding (P = .03; Fisher’s exact test)
Sun protection factor   Sunburned   Not sunburned
85                          1            55
50                          8            48
Fisher’s exact test compares the following proportions: 1/56 versus 8/56. Note that individuals are being counted twice!
Correct analysis of data:
Table 1. Correct presentation of the data (P = .016; McNemar’s exact test).
                     SPF-50 side
SPF-85 side       Sunburned   Not sunburned
Sunburned             1            0
Not sunburned         7           48
McNemar’s exact test: Null hypothesis: X ~ binomial(n=7, p=.5)
$$P(X=0) = \binom{7}{0}(.5)^0(.5)^7 = .0078$$
$$P(X=7) = \binom{7}{7}(.5)^7(.5)^0 = .0078$$
Two-sided p-value = .0156
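The two analyses of the split-face data can be compared side by side. This sketch uses exact fractions; the two-sided Fisher p-value is taken as the sum of table probabilities no larger than the observed one, which is one common convention and an assumption here. Function names are illustrative.

```python
from fractions import Fraction
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for [[a, b], [c, d]]: sum of
    hypergeometric probabilities no larger than that of the observed table."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        return Fraction(comb(col1, x) * comb(n - col1, row1 - x), comb(n, row1))

    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    p_obs = prob(a)
    return float(sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs))

def mcnemar_exact_two_sided(b, c):
    """Two-sided exact McNemar p-value: the smaller discordant count is
    compared with Binomial(b + c, 0.5) and the tail is doubled."""
    n, k = b + c, min(b, c)
    tail = sum(Fraction(comb(n, i), 2**n) for i in range(k + 1))
    return float(min(Fraction(1), 2 * tail))

# Ignoring the pairing: 1/56 vs 8/56 sunburned sides.
fisher_p = fisher_exact_two_sided(1, 55, 8, 48)
# Respecting the pairing: 0 subjects burned only on the SPF-85 side, 7 only on SPF-50.
mcnemar_p = mcnemar_exact_two_sided(0, 7)
print(round(fisher_p, 2), round(mcnemar_p, 4))
```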
RECALL: 95% confidence interval for a difference in INDEPENDENT proportions
Standard error can be estimated by:
$$\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Standard error of the difference of two proportions =
$$\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
95% confidence interval for the difference between two proportions:
$$(\hat{p}_1 - \hat{p}_2) \pm 1.96 \times \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
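A minimal sketch of the independent-samples interval follows. For illustration only, it plugs in the marginal proportions from the MI example (46/144 cases and 25/144 controls with diabetes) as if they came from independent groups, which they do not; the function name is hypothetical.

```python
import math

def diff_ci_independent(x1, n1, x2, n2, z=1.96):
    """Difference in proportions with a normal-approximation CI,
    assuming the two samples are independent."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff, diff - z * se, diff + z * se

diff, lower, upper = diff_ci_independent(46, 144, 25, 144)
print(round(diff, 3), round(lower, 3), round(upper, 3))
```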
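95% CI for difference in dependent proportions
Here the first proportion is computed among cases and the second among controls; n is the number of pairs.
$$Var(\hat{p}_{D/E}) = \frac{p_{D/E}(1-p_{D/E})}{n}$$
$$Var(\hat{p}_{D/\sim E}) = \frac{p_{D/\sim E}(1-p_{D/\sim E})}{n}$$
$$Cov(\hat{p}_{D/E}, \hat{p}_{D/\sim E}) = \frac{p_{D\&E}\,p_{\sim D\&\sim E} - p_{D\&\sim E}\,p_{\sim D\&E}}{n}$$
Variance of the difference of two random variables is the sum of their variances minus 2*covariance:
$$Var(\hat{p}_1 - \hat{p}_2) = Var(\hat{p}_1) + Var(\hat{p}_2) - 2\,Cov(\hat{p}_1, \hat{p}_2)$$
$$Var(\hat{p}_{D/E} - \hat{p}_{D/\sim E}) = \frac{p_{D/E}(1-p_{D/E})}{n} + \frac{p_{D/\sim E}(1-p_{D/\sim E})}{n} - 2\left(\frac{p_{D\&E}\,p_{\sim D\&\sim E} - p_{D\&\sim E}\,p_{\sim D\&E}}{n}\right)$$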
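95% CI for difference in dependent proportions
                        MI controls
MI cases          Diabetes   No diabetes   Total
Diabetes              9          37          46
No diabetes          16          82          98
Total                25         119         144
$$\hat{p}_{D/E} - \hat{p}_{D/\sim E} = \frac{46}{144} - \frac{25}{144} = .32 - .17 = .15$$
$$Var(\hat{p}_{D/E} - \hat{p}_{D/\sim E}) = \frac{p_{D/E}(1-p_{D/E})}{n} + \frac{p_{D/\sim E}(1-p_{D/\sim E})}{n} - 2\left(\frac{p_{D\&E}\,p_{\sim D\&\sim E} - p_{D\&\sim E}\,p_{\sim D\&E}}{n}\right)$$
$$= \frac{\frac{46}{144}\left(1-\frac{46}{144}\right)}{144} + \frac{\frac{25}{144}\left(1-\frac{25}{144}\right)}{144} - 2\left(\frac{\frac{9}{144} \cdot \frac{82}{144} - \frac{37}{144} \cdot \frac{16}{144}}{144}\right) = .0024$$
$$95\%\ CI:\ 0.15 \pm 1.96\sqrt{.0024} = 0.05\ \text{to}\ 0.24$$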
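The paired-data interval can be checked numerically. The sketch below takes the four cells of the pair table (a, b, c, d as in the generic McNemar table) and applies the variance formula with the covariance term; the function name and return layout are assumptions for illustration.

```python
import math

def diff_ci_paired(a, b, c, d, z=1.96):
    """CI for the difference in exposure proportions (cases minus controls)
    from a table of n matched pairs with cells a, b (row 1) and c, d (row 2)."""
    n = a + b + c + d
    p_cases = (a + b) / n          # exposed proportion among cases
    p_controls = (a + c) / n       # exposed proportion among controls
    cov = ((a / n) * (d / n) - (b / n) * (c / n)) / n
    var = (p_cases * (1 - p_cases) / n
           + p_controls * (1 - p_controls) / n
           - 2 * cov)
    diff = p_cases - p_controls
    half_width = z * math.sqrt(var)
    return diff, var, diff - half_width, diff + half_width

diff, var, lower, upper = diff_ci_paired(9, 37, 16, 82)
print(round(diff, 2), round(var, 4), round(lower, 2), round(upper, 2))
```

Because the covariance here is positive, the paired variance is slightly smaller than the value obtained by treating the two proportions as independent.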
The connection between McNemar and Cochran-Mantel-Haenszel Tests
View each pair as its own “age-gender” stratum
Example: Concordant for exposure (cell “a” from before)
               Case (MI)   Control
Diabetes           1          1
No diabetes        0          0
The 144 pairs form four kinds of strata:
x 9
               Case (MI)   Control
Diabetes           1          1
No diabetes        0          0
x 37
               Case (MI)   Control
Diabetes           1          0
No diabetes        0          1
x 16
               Case (MI)   Control
Diabetes           0          1
No diabetes        1          0
x 82
               Case (MI)   Control
Diabetes           0          0
No diabetes        1          1
Mantel-Haenszel for pair-matched data
We want to know the relationship between diabetes and MI controlling for age and gender (the matching variables).
Mantel-Haenszel methods apply.
RECALL: The Mantel-Haenszel Summary Odds Ratio
               Case   Control
Exposed          a       b
Not Exposed      c       d
$$OR_{MH} = \frac{\sum_{i=1}^{k} \frac{a_i d_i}{T_i}}{\sum_{i=1}^{k} \frac{b_i c_i}{T_i}}$$
x 9: ad/T = 0; bc/T = 0
               Case (MI)   Control
Diabetes           1          1
No diabetes        0          0
x 37: ad/T = 1/2; bc/T = 0
               Case (MI)   Control
Diabetes           1          0
No diabetes        0          1
x 16: ad/T = 0; bc/T = 1/2
               Case (MI)   Control
Diabetes           0          1
No diabetes        1          0
x 82: ad/T = 0; bc/T = 0
               Case (MI)   Control
Diabetes           0          0
No diabetes        1          1
Mantel-Haenszel Summary OR
$$OR_{MH} = \frac{\sum_{i=1}^{144} \frac{a_i d_i}{2}}{\sum_{i=1}^{144} \frac{b_i c_i}{2}} = \frac{37 \times \frac{1}{2}}{16 \times \frac{1}{2}} = \frac{37}{16}$$
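The reduction of the Mantel-Haenszel odds ratio to b/c can be verified by treating each pair as a stratum. In this sketch each stratum is written as (a, b, c, d) with a = case exposed, b = control exposed, c = case unexposed, d = control unexposed, and identical strata are grouped with a count; this representation is an assumption chosen for brevity.

```python
def mantel_haenszel_or(strata):
    """strata: list of ((a, b, c, d), count), one entry per stratum type.
    Returns sum(a*d/T) / sum(b*c/T) over all strata."""
    numerator = sum(count * a * d / (a + b + c + d)
                    for (a, b, c, d), count in strata)
    denominator = sum(count * b * c / (a + b + c + d)
                      for (a, b, c, d), count in strata)
    return numerator / denominator

pair_strata = [
    ((1, 1, 0, 0), 9),    # both diabetic: contributes 0 to both sums
    ((1, 0, 0, 1), 37),   # only the case diabetic: ad/T = 1/2
    ((0, 1, 1, 0), 16),   # only the control diabetic: bc/T = 1/2
    ((0, 0, 1, 1), 82),   # neither diabetic: contributes 0 to both sums
]
or_mh = mantel_haenszel_or(pair_strata)
print(or_mh)
```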
Mantel-Haenszel Test Statistic (same as McNemar’s)
recall:
$$E(a_k) = \frac{(a_k+b_k)(a_k+c_k)}{n_k}$$
$$Var(a_k) = \frac{(a_k+b_k)(c_k+d_k)(a_k+c_k)(b_k+d_k)}{n_k^2(n_k-1)}$$
$$\frac{\left[\sum_{k=1}^{K}(a_k - E(a_k))\right]^2}{\sum_{k=1}^{K} Var(a_k)} \sim \chi^2_1$$
Concordant cells contribute nothing to Mantel-Haenszel statistic (observed=expected)
recall:
$$E(a_k) = \frac{(row1)(col1)}{n_k}; \quad Var(a_k) = \frac{(row1)(row2)(col1)(col2)}{n_k^2(n_k-1)}$$
               Case (MI)   Control
Diabetes           1          1
No diabetes        0          0
$$E(a_k) = \frac{(2)(1)}{2} = 1; \quad a_k - E(a_k) = 1 - 1 = 0; \quad Var(a_k) = \frac{(2)(1)(1)(0)}{2^2(1)} = 0$$
               Case (MI)   Control
Diabetes           0          0
No diabetes        1          1
$$E(a_k) = \frac{(0)(1)}{2} = 0; \quad a_k - E(a_k) = 0 - 0 = 0; \quad Var(a_k) = \frac{(0)(1)(1)(2)}{2^2(1)} = 0$$
Discordant cells
               Case (MI)   Control
Diabetes           1          0
No diabetes        0          1
$$E(a_k) = \frac{(1)(1)}{2} = \frac{1}{2}; \quad a_k - E(a_k) = 1 - \frac{1}{2} = \frac{1}{2}; \quad Var(a_k) = \frac{(1)(1)(1)(1)}{2^2(2-1)} = \frac{1}{4}$$
               Case (MI)   Control
Diabetes           0          1
No diabetes        1          0
$$E(a_k) = \frac{(1)(1)}{2} = \frac{1}{2}; \quad a_k - E(a_k) = 0 - \frac{1}{2} = -\frac{1}{2}; \quad Var(a_k) = \frac{(1)(1)(1)(1)}{2^2(2-1)} = \frac{1}{4}$$
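$$\chi^2_1 = \frac{\left[\sum_{k}(a_k - E(a_k))\right]^2}{\sum_{k} Var(a_k)} = \frac{[37(.5) - 16(.5)]^2}{(37+16)(.25)} = \frac{[.5(37-16)]^2}{(53)(.25)} = \frac{.25(37-16)^2}{.25(53)} = \frac{(37-16)^2}{53} = 8.32; \quad p < .01$$
In general:
$$\chi^2_{CMH} = \frac{\left[\sum_{k}(a_k - E(a_k))\right]^2}{\sum_{k} Var(a_k)} = \frac{[.5(\text{case disc. cells}) - .5(\text{control disc. cells})]^2}{.25(\text{disc. cells})} = \frac{[.5(b) - .5(c)]^2}{(.25)(b+c)} = \frac{.25(b-c)^2}{.25(b+c)} = \frac{(b-c)^2}{b+c} = \text{McNemar's } \chi^2_1$$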
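The same stratum-by-stratum view gives the test statistic. The sketch below accumulates observed minus expected and the hypergeometric variance over the 144 pair strata and compares the result with the McNemar formula; as before, the grouped (cells, count) representation and the function name are illustrative assumptions.

```python
def cmh_chi_square(strata):
    """strata: list of ((a, b, c, d), count). Returns the Mantel-Haenszel
    chi-square: [sum(a - E(a))]^2 / sum(Var(a))."""
    total_diff = 0.0
    total_var = 0.0
    for (a, b, c, d), count in strata:
        n = a + b + c + d
        expected = (a + b) * (a + c) / n
        variance = (a + b) * (c + d) * (a + c) * (b + d) / (n**2 * (n - 1))
        total_diff += count * (a - expected)
        total_var += count * variance
    return total_diff**2 / total_var

pair_strata = [
    ((1, 1, 0, 0), 9),    # concordant exposed: diff 0, variance 0
    ((1, 0, 0, 1), 37),   # case exposed only: diff +.5, variance .25
    ((0, 1, 1, 0), 16),   # control exposed only: diff -.5, variance .25
    ((0, 0, 1, 1), 82),   # concordant unexposed: diff 0, variance 0
]
chi2_cmh = cmh_chi_square(pair_strata)
chi2_mcnemar = (37 - 16) ** 2 / (37 + 16)
print(round(chi2_cmh, 2), round(chi2_mcnemar, 2))
```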
Example: Salmonella Outbreak in France, 1993
From: “Large outbreak of Salmonella enterica serotype paratyphi B infection caused by a goats' milk cheese, France, 1993: a case finding and epidemiological study” BMJ 312: 91-94; Jan 1996.
[Figure: Epidemic Curve]
Matched Case Control Study
Case = Salmonella gastroenteritis.
Community controls (1:1) matched for:
– age group (< 1, 1-4, 5-14, 15-34, 35-44, 45-54, 55-64, or >= 65 years)
– gender
– city of residence
Results
In 2x2 table form: any goat’s cheese
                          Controls
Cases             Goat’s cheese   None   Total
Goat’s cheese          23          23      46
None                    6           7      13
Total                  29          30      59
$$OR = \frac{b}{c} = \frac{23}{6} = 3.8$$
In 2x2 table form: Brand A Goat’s cheese
                          Controls
Cases             Brand A   None   Total
Brand A               8      24      32
None                  2      25      27
Total                10      49      59
$$OR = \frac{b}{c} = \frac{24}{2} = 12.0$$
Each pair as its own stratum:
x 8
             Case   Control
Brand A        1       1
None           0       0
x 24
             Case   Control
Brand A        1       0
None           0       1
x 2
             Case   Control
Brand A        0       1
None           1       0
x 25
             Case   Control
Brand A        0       0
None           1       1
Using Agresti notation here!
$$E(n_{11k}) = \frac{n_{1+k}\,n_{+1k}}{n_{++k}}; \quad Var(n_{11k}) = \frac{n_{1+k}\,n_{2+k}\,n_{+1k}\,n_{+2k}}{n_{++k}^2(n_{++k}-1)}$$
8 concordant exposed:
$$E(n_{11k}) = \frac{2 \cdot 1}{2} = 1; \quad \text{Observed} - E(n_{11k}) = 1 - 1 = 0; \quad Var(n_{11k}) = \frac{2 \cdot 1 \cdot 0 \cdot 1}{4(2-1)} = 0$$
Summary: 8 concordant-exposed pairs (=strata) contribute nothing to the numerator (observed-expected=0) and nothing to the denominator (variance=0).
25 concordant unexposed:
$$E(n_{11k}) = \frac{0 \cdot 1}{2} = 0; \quad \text{Observed} - E(n_{11k}) = 0 - 0 = 0; \quad Var(n_{11k}) = \frac{0 \cdot 1 \cdot 2 \cdot 1}{4(2-1)} = 0$$
Summary: 25 concordant-unexposed pairs contribute nothing to the numerator (observed-expected=0) and nothing to the denominator (variance=0).
24 discordant cells favor case:
$$E(n_{11k}) = \frac{(1)(1)}{2} = \frac{1}{2}; \quad \text{Observed} - E(n_{11k}) = 1 - .5 = .5; \quad Var(n_{11k}) = \frac{1 \cdot 1 \cdot 1 \cdot 1}{4(2-1)} = \frac{1}{4}$$
Summary: 24 discordant “case-exposed” pairs contribute +.5 each to the numerator (observed-expected= +.5) and .25 each to the denominator (variance= .25).
2 discordant cells favor control:
$$E(n_{11k}) = \frac{(1)(1)}{2} = \frac{1}{2}; \quad \text{Observed} - E(n_{11k}) = 0 - .5 = -.5; \quad Var(n_{11k}) = \frac{1 \cdot 1 \cdot 1 \cdot 1}{4(2-1)} = \frac{1}{4}$$
Summary: 2 discordant “control-exposed” pairs contribute -.5 each to the numerator (observed-expected= -.5) and .25 each to the denominator (variance= .25).
$$\chi^2_{CMH} = \frac{[8(0) + 25(0) + 24(.5) - 2(.5)]^2}{0 + 0 + 24(.25) + 2(.25)} = \frac{22^2(.25)}{26(.25)} = \frac{22^2}{26} = \frac{(24-2)^2}{26} = \frac{(b-c)^2}{b+c}$$
Diagnostic Testing and Screening Tests
Characteristics of a diagnostic test
Sensitivity = Probability that, if you truly have the disease, the diagnostic test will catch it.
Specificity = Probability that, if you truly do not have the disease, the test will register negative.
Calculating sensitivity and specificity from a 2x2 table
                       Screening Test
Truly have disease      +      -     Total
+                       a      b     a+b
-                       c      d     c+d
Sensitivity = a/(a+b): Among those with true disease, how many test positive?
Specificity = d/(c+d): Among those without the disease, how many test negative?
Hypothetical Example
                             Mammography
Breast cancer (on biopsy)     +      -     Total
+                             9      1      10
-                           109    881     990
Sensitivity = 9/10 = .90
Specificity = 881/990 = .89
1 false negative out of 10 cases
109 false positives out of 990
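The two definitions translate directly into code. This is a small sketch using the hypothetical mammography counts from the table; the argument names (tp, fn, fp, tn for true/false positives/negatives) are this example's labels for cells a, b, c, d.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (FP + TN)."""
    return tp / (tp + fn), tn / (fp + tn)

# 10 women with cancer (9 test positive), 990 without (881 test negative).
sens, spec = sensitivity_specificity(tp=9, fn=1, fp=109, tn=881)
print(round(sens, 2), round(spec, 2))
```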
What factors determine the effectiveness of screening?
The prevalence (risk) of disease.
The effectiveness of screening in preventing illness or death.
– Is the test any good at detecting disease/precursor (sensitivity of the test)?
– Is the test detecting a clinically relevant condition?
– Is there anything we can do if disease (or pre-disease) is detected (cures, treatments)?
– Does detecting and treating disease at an earlier stage really result in a better outcome?
The risks of screening, such as false positives and radiation.
Positive predictive value
The probability that if you test positive for the disease, you actually have the disease.
Depends on the characteristics of the test (sensitivity, specificity) and the prevalence of disease.
Example: Mammography
Mammography utilizes ionizing radiation to image breast tissue.
The examination is performed by compressing the breast firmly between a plastic plate and an x-ray cassette that contains special x-ray film.
Mammography can identify breast cancers too small to detect on physical examination.
Early detection and treatment of breast cancer (before metastasis) can improve a woman’s chances of survival.
Studies show that, among 50-69 year-old women, screening results in 20-35% reductions in mortality from breast cancer.
Mammography
Controversy exists over the efficacy of mammography in reducing mortality from breast cancer in 40-49 year old women.
Mammography has a high rate of false positive tests that cause anxiety and necessitate further costly diagnostic procedures.
Mammography exposes a woman to some radiation, which may slightly increase the risk of mutations in breast tissue.
Example
A 60-year-old woman has an abnormal mammogram; what is the chance that she has breast cancer? I.e., what is the positive predictive value?
Calculating PPV and NPV from a 2x2 table
                       Screening Test
Truly have disease      +      -
+                       a      b
-                       c      d
Total                  a+c    b+d
PPV = a/(a+c): Among those who test positive, how many truly have the disease?
NPV = d/(b+d): Among those who test negative, how many truly do not have the disease?
Hypothetical Example
                             Mammography
Breast cancer (on biopsy)     +      -
+                             9      1
-                           109    881
Total                       118    882
Prevalence of disease = 10/1000 = 1%
PPV = 9/118 = 7.6%
NPV = 881/882 = 99.9%
What if disease was twice as prevalent in the population?
                             Mammography
Breast cancer (on biopsy)     +      -     Total
+                            18      2      20
-                           108    872     980
sensitivity = 18/20 = .90
specificity = 872/980 = .89
Sensitivity and specificity are characteristics of the test, so they don’t change!
What if disease was more prevalent?
                             Mammography
Breast cancer (on biopsy)     +      -
+                            18      2
-                           108    872
Total                       126    874
Prevalence of disease = 20/1000 = 2%
PPV = 18/126 = 14.3%
NPV = 872/874 = 99.8%
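The dependence of PPV and NPV on prevalence can be sketched with Bayes' rule. The function below takes sensitivity, specificity, and prevalence as inputs; the values 0.90 and 881/990 come from the hypothetical mammography table, and the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV from test characteristics and disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

for prevalence in (0.01, 0.02):
    ppv, npv = predictive_values(0.90, 881 / 990, prevalence)
    print(f"prevalence={prevalence:.0%}  PPV={ppv:.1%}  NPV={npv:.1%}")
```

Holding the test fixed, doubling the prevalence roughly doubles the PPV here, while the NPV barely moves.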
Conclusions
Positive predictive value increases with increasing prevalence of disease.
It also increases if the diagnostic test is changed to improve its accuracy.