Analysis of matched data, HRP 261, 02/02/04 (Chapter 9, Agresti: read sections 9.1 and 9.2)



Page 1: Analysis of matched data HRP 261 02/02/04 Chapter 9 Agresti – read sections 9.1 and 9.2


Page 2: Pair matching: why match?

Pairing can control for extraneous sources of variability and increase the power of a statistical test.

Match 1 control to 1 case based on potential confounders, such as age, gender, and smoking.

Page 3: Example

Johnson and Johnson (NEJM 287: 1122-1125, 1972) selected 85 Hodgkin’s patients who had a sibling of the same sex who was free of the disease and whose age was within 5 years of the patient’s. They presented the data as:

                Tonsillectomy   None
Hodgkin’s            41          44
Sib control          33          52

From John A. Rice, “Mathematical Statistics and Data Analysis.”

OR=1.47; chi-square=1.53 (NS)
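The unpaired figures on this slide can be reproduced with a short sketch. It treats the two rows as independent samples, which is exactly the assumption the following slide criticizes; the function name is illustrative only.

```python
def unmatched_or_and_chi2(a, b, c, d):
    """Odds ratio and Pearson chi-square for an unpaired 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, chi2


# Rows: Hodgkin's patients, sibling controls; columns: tonsillectomy, none.
odds_ratio, chi2 = unmatched_or_and_chi2(41, 44, 33, 52)
print(f"OR = {odds_ratio:.2f}, chi-square = {chi2:.2f}")
```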

Page 4: Example

But several letters to the editor pointed out that those investigators had made an error by ignoring the pairings. These are not independent samples because the sibs are paired. It is better to analyze data like this:

                              Control (sib)
                          None   Tonsillectomy
Case   None                37          7
       Tonsillectomy       15         26

From John A. Rice, “Mathematical Statistics and Data Analysis.”

OR=2.14; chi-square=2.91 (p=.09)
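A minimal sketch of the paired analysis, using only the two discordant counts. It assumes the 15 pairs are those in which only the patient had a tonsillectomy and the 7 pairs are those in which only the sibling did, which is what OR=2.14 implies.

```python
def mcnemar(b, c):
    """Matched-pairs odds ratio and McNemar chi-square from the discordant counts.

    b: pairs where only the case is exposed; c: pairs where only the control is.
    """
    return b / c, (b - c) ** 2 / (b + c)


paired_or, paired_chi2 = mcnemar(15, 7)
print(f"OR = {paired_or:.2f}, chi-square = {paired_chi2:.2f}")
```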

Page 5: Pair matching: Agresti example

Match each MI case to an MI control based on age and gender.

Ask about history of diabetes to find out if diabetes increases your risk for MI.

Page 6: Pair matching: Agresti example

                               MI controls
                        Diabetes   No diabetes   Total
MI cases   Diabetes         9          37          46
           No diabetes     16          82          98
           Total           25         119         144

Which cells are informative?

Just the discordant cells are informative!

Page 7: Pair matching

OR estimate comes only from discordant pairs!

The question is: among the discordant pairs, what proportion are discordant in the direction of the case vs. the direction of the control? If more discordant pairs “favor” the case, this indicates OR>1.

Page 8:

P(“favors” case / discordant pair) =

\[
\frac{P(E/D)\,P(\sim E/\sim D)}{P(E/D)\,P(\sim E/\sim D) + P(\sim E/D)\,P(E/\sim D)}
\]

P(E/D)*P(~E/~D) = the probability of observing a case-control pair with only the case exposed

P(~E/D)*P(E/~D) = the probability of observing a case-control pair with only the control exposed

Page 9:

P(“favors” case / discordant pair) =

\[
\hat{p} = \frac{b}{b+c} = \frac{37}{37+16} = \frac{37}{53}
\]

Page 10:

odds(“favors” case / discordant pair) =

\[
OR = \frac{b}{c} = \frac{37}{16}
\]

Page 11:

OR estimate comes only from discordant pairs!!

OR = 37/16 = 2.31

Makes sense!
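As a small check of the reasoning above, the sketch below computes the proportion of discordant pairs that favor the case and converts it to odds. The variable names are illustrative; b and c are the discordant counts from the diabetes table.

```python
b, c = 37, 16  # only the case has diabetes; only the control has diabetes

p_favors_case = b / (b + c)                  # 37/53
odds_favors_case = p_favors_case / (1 - p_favors_case)

# The odds among discordant pairs reduce algebraically to b/c, the matched OR.
print(p_favors_case, odds_favors_case, b / c)
```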

Page 12: McNemar’s Test

Null hypothesis: P(“favors” case / discordant pair) = .5 (note: equivalent to OR=1.0 or cell b=cell c)

\[
p\text{-value} = \binom{53}{37}(.5)^{37}(.5)^{16} + \binom{53}{38}(.5)^{38}(.5)^{15} + \binom{53}{39}(.5)^{39}(.5)^{14} + \ldots
\]

By normal approximation to binomial:

\[
Z = \frac{37 - \frac{53}{2}}{\sqrt{53(.5)(.5)}} = \frac{10.5}{3.64} = 2.88; \quad p < .01
\]
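Both versions of the test can be sketched with the standard library. The exact tail sums the binomial terms from 37 to 53 under p = .5, and the two-sided normal p-value uses `math.erfc`; the function names are illustrative, and the two-sided convention is an assumption, since the slide does not say which tail it reports.

```python
from math import comb, erfc, sqrt


def exact_upper_tail(b, c):
    """P(X >= b) for X ~ Binomial(b + c, 0.5): the exact one-sided p-value."""
    n = b + c
    return sum(comb(n, k) for k in range(b, n + 1)) / 2 ** n


def mcnemar_z(b, c):
    """Normal approximation: (b - n/2) / sqrt(n * .5 * .5) with n = b + c."""
    n = b + c
    return (b - n / 2) / sqrt(n * 0.5 * 0.5)


z = mcnemar_z(37, 16)
p_exact = exact_upper_tail(37, 16)
p_normal_two_sided = erfc(abs(z) / sqrt(2))
print(z, p_exact, p_normal_two_sided)
```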

Page 13: McNemar’s Test: generally

                     controls
                  exp    No exp
cases   exp        a       b
        No exp     c       d

By normal approximation to binomial:

\[
Z = \frac{b - \frac{b+c}{2}}{\sqrt{(b+c)(.5)(.5)}} = \frac{\frac{b-c}{2}}{\sqrt{\frac{b+c}{4}}} = \frac{b-c}{\sqrt{b+c}}
\]

Equivalently:

\[
\chi^2_1 = \frac{(b-c)^2}{b+c}
\]

Page 14: 95% CI for difference in dependent proportions

                               MI controls
                        Diabetes   No diabetes   Total
MI cases   Diabetes         9          37          46
           No diabetes     16          82          98
           Total           25         119         144

\[
Var(\hat{p}_{E/D} - \hat{p}_{E/\sim D}) = Var(\hat{p}_{E/D}) + Var(\hat{p}_{E/\sim D}) - 2\,Cov(\hat{p}_{E/D}, \hat{p}_{E/\sim D})
\]

\[
= \frac{p_{E/D}(1 - p_{E/D})}{n} + \frac{p_{E/\sim D}(1 - p_{E/\sim D})}{n} - 2\,Cov(\hat{p}_{E/D}, \hat{p}_{E/\sim D})
\]

\[
= \frac{(.32)(.68) + (.17)(.83) - 2(.06 \times .57 - .26 \times .11)}{144} = .0024
\]

\[
95\%\ CI:\ .32 - .17 = .15 \pm 1.96\sqrt{.0024} = (.05, .24)
\]
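The interval can be reproduced from the four cell counts. This sketch assumes the covariance term is (p11*p22 - p12*p21)/n, which is how the slide's numbers (.06*.57 - .26*.11, divided by 144) read; the function name is illustrative.

```python
from math import sqrt


def paired_difference_ci(a, b, c, d, z=1.96):
    """Wald CI for p(cases exposed) - p(controls exposed) with matched pairs.

    Rows are cases, columns are controls: a = both exposed, b = case only,
    c = control only, d = neither.
    """
    n = a + b + c + d
    p_cases = (a + b) / n
    p_controls = (a + c) / n
    cov = ((a / n) * (d / n) - (b / n) * (c / n)) / n
    var = p_cases * (1 - p_cases) / n + p_controls * (1 - p_controls) / n - 2 * cov
    diff = p_cases - p_controls
    half_width = z * sqrt(var)
    return diff, var, (diff - half_width, diff + half_width)


diff, var, (lower, upper) = paired_difference_ci(9, 37, 16, 82)
print(f"diff = {diff:.3f}, var = {var:.4f}, CI = ({lower:.3f}, {upper:.3f})")
```

Because the unrounded proportions are used, the difference comes out as 21/144 rather than the rounded .15, but the interval agrees with the slide to two decimals.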

Page 15: Each pair is its own “age-gender” stratum

               Case (MI)   Control
Diabetes           1          1
No diabetes        0          0

Example: Concordant for exposure (cell “a” from before)

Page 16:

               Case (MI)   Control
Diabetes           1          1
No diabetes        0          0         x 9

               Case (MI)   Control
Diabetes           1          0
No diabetes        0          1         x 37

               Case (MI)   Control
Diabetes           0          1
No diabetes        1          0         x 16

               Case (MI)   Control
Diabetes           0          0
No diabetes        1          1         x 82

Page 17: Mantel-Haenszel for pair-matched data

We want to know the relationship between diabetes and MI controlling for age and gender.

Mantel-Haenszel methods apply.

Page 18: RECALL: The Mantel-Haenszel Summary Odds Ratio

               Case   Control
Exposed          a       b
Not Exposed      c       d

\[
OR_{MH} = \frac{\sum_{i=1}^{k} \frac{a_i d_i}{T_i}}{\sum_{i=1}^{k} \frac{b_i c_i}{T_i}}
\]

Page 19:

               Case (MI)   Control
Diabetes           1          1
No diabetes        0          0         ad/T=0, bc/T=0

               Case (MI)   Control
Diabetes           1          0
No diabetes        0          1         ad/T=1/2, bc/T=0

               Case (MI)   Control
Diabetes           0          1
No diabetes        1          0         ad/T=0, bc/T=1/2

               Case (MI)   Control
Diabetes           0          0
No diabetes        1          1         ad/T=0, bc/T=0

Page 20: Mantel-Haenszel Summary OR

\[
OR_{MH} = \frac{\sum_{i=1}^{144} \frac{a_i d_i}{2}}{\sum_{i=1}^{144} \frac{b_i c_i}{2}} = \frac{37 \times \frac{1}{2}}{16 \times \frac{1}{2}} = \frac{37}{16}
\]
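To see the Mantel-Haenszel sum collapse to b/c, one can build all 144 pair strata explicitly and apply the summary formula. This is a minimal sketch; the tuple layout (a, b, c, d) follows the Exposed/Not Exposed by Case/Control table used for the summary odds ratio, and the function name is illustrative.

```python
def mantel_haenszel_or(strata):
    """Summary OR = sum(a*d/T) / sum(b*c/T) over strata.

    Each stratum is (a, b, c, d): a = exposed case, b = exposed control,
    c = unexposed case, d = unexposed control.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


pairs = (
    [(1, 1, 0, 0)] * 9      # both exposed: contributes 0 to both sums
    + [(1, 0, 0, 1)] * 37   # only the case exposed: ad/T = 1/2
    + [(0, 1, 1, 0)] * 16   # only the control exposed: bc/T = 1/2
    + [(0, 0, 1, 1)] * 82   # neither exposed: contributes 0 to both sums
)
or_mh = mantel_haenszel_or(pairs)
print(len(pairs), or_mh)
```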

Page 21: Mantel-Haenszel Test Statistic (same as McNemar’s)

Recall:

\[
E(n_{11k}) = \frac{n_{1+k}\,n_{+1k}}{n_k}; \quad Var(n_{11k}) = \frac{n_{1+k}\,n_{2+k}\,n_{+1k}\,n_{+2k}}{n_k^2 (n_k - 1)}
\]

Concordant cells contribute 0.

Discordant cells:

\[
E(n_{11k}) = \frac{(1)(1)}{2} = \frac{1}{2}; \quad Var(n_{11k}) = \frac{(1)(1)(1)(1)}{2^2 (2 - 1)} = \frac{1}{4}
\]

\[
\chi^2_{CMH} = \frac{\left[\sum_k \left(n_{11k} - E(n_{11k})\right)\right]^2}{\sum_k Var(n_{11k})}
= \frac{[.5(\#\text{ case disc. cells}) - .5(\#\text{ con. disc. cells})]^2}{.25(\#\text{ disc. cells})}
= \frac{[.5b - .5c]^2}{(.25)(b + c)}
= \frac{.25(b - c)^2}{.25(b + c)}
= \frac{(b - c)^2}{b + c}
\]

Page 22: Example: Salmonella outbreak in France, 1993

From: “Large outbreak of Salmonella enterica serotype paratyphi B infection caused by a goats' milk cheese, France, 1993: a case finding and epidemiological study” BMJ 312: 91-94; Jan 1996.

Page 24: Epidemic curve

Page 25: Matched case-control study

Case = Salmonella gastroenteritis.

Community controls (1:1) matched for:

•age group (< 1, 1-4, 5-14, 15-34, 35-44, 45-54, 55-64, or >= 65 years)

•gender

•city of residence

Page 26: Results

Page 27: In 2x2 table form: any goat’s cheese

                                 Controls
                        Goat’s cheese   None   Total
Cases   Goat’s cheese        23          23      46
        None                  6           7      13
        Total                29          30      59

\[
OR = \frac{b}{c} = \frac{23}{6} = 3.8
\]

Page 28: In 2x2 table form: Brand B goat’s cheese

                                   Controls
                          Goat’s cheese B   None   Total
Cases   Goat’s cheese B          8           24      32
        None                     2           25      27
        Total                   10           49      59

\[
OR = \frac{b}{c} = \frac{24}{2} = 12.0
\]

Page 29:

            Case   Control
Brand B       1       1
None          0       0         x 8

            Case   Control
Brand B       1       0
None          0       1         x 24

            Case   Control
Brand B       0       1
None          1       0         x 2

            Case   Control
Brand B       0       0
None          1       1         x 25

Page 30:

For each stratum k:

\[
E(n_{11k}) = \frac{n_{1+k}\,n_{+1k}}{n_k}; \quad Var(n_{11k}) = \frac{n_{1+k}\,n_{2+k}\,n_{+1k}\,n_{+2k}}{n_k^2 (n_k - 1)}
\]

8 concordant exposed:

\[
E(n_{11k}) = \frac{2 \times 1}{2} = 1; \quad \text{Observed} - E(n_{11k}) = 1 - 1 = 0; \quad Var(n_{11k}) = \frac{2 \times 1 \times 0 \times 1}{4(2 - 1)} = 0
\]

Summary: 8 concordant-exposed pairs (=strata) contribute nothing to the numerator (observed-expected=0) and nothing to the denominator (variance=0).

25 concordant unexposed:

\[
E(n_{11k}) = \frac{0 \times 1}{2} = 0; \quad \text{Observed} - E(n_{11k}) = 0 - 0 = 0; \quad Var(n_{11k}) = \frac{0 \times 1 \times 2 \times 1}{4(2 - 1)} = 0
\]

Summary: 25 concordant-unexposed pairs contribute nothing to the numerator (observed-expected=0) and nothing to the denominator (variance=0).

Page 31:

24 discordant cells that favor the case:

\[
E(n_{11k}) = \frac{(1)(1)}{2} = \frac{1}{2}; \quad \text{Observed} - E(n_{11k}) = 1 - .5 = .5; \quad Var(n_{11k}) = \frac{1 \times 1 \times 1 \times 1}{4(2 - 1)} = \frac{1}{4}
\]

Summary: 24 discordant “case-exposed” pairs contribute +.5 each to the numerator (observed-expected= +.5) and .25 each to the denominator (variance= .25).

2 discordant cells that favor the control:

\[
E(n_{11k}) = \frac{(1)(1)}{2} = \frac{1}{2}; \quad \text{Observed} - E(n_{11k}) = 0 - .5 = -.5; \quad Var(n_{11k}) = \frac{1 \times 1 \times 1 \times 1}{4(2 - 1)} = \frac{1}{4}
\]

Summary: 2 discordant “control-exposed” pairs contribute -.5 each to the numerator (observed-expected= -.5) and .25 each to the denominator (variance= .25).

Page 32:

\[
\chi^2_{CMH} = \frac{[8(0) + 25(0) + 24(.5) - 2(.5)]^2}{0 + 0 + 24(.25) + 2(.25)}
= \frac{[22(.5)]^2}{26(.25)}
= \frac{22^2}{26}
= \frac{(24 - 2)^2}{26}
= \frac{(b - c)^2}{b + c}
\]
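The stratum-by-stratum bookkeeping can be checked with a short sketch that applies the hypergeometric mean and variance to each of the 59 Brand B pairs and compares the result with the McNemar form (b-c)^2/(b+c). The function name and tuple layout are illustrative.

```python
def cmh_chi2(strata):
    """Cochran-Mantel-Haenszel statistic over 2x2 strata.

    Each stratum is (n11, n12, n21, n22): rows = exposed / unexposed,
    columns = case / control.
    """
    observed_minus_expected = 0.0
    variance = 0.0
    for n11, n12, n21, n22 in strata:
        n = n11 + n12 + n21 + n22
        row1, row2 = n11 + n12, n21 + n22
        col1, col2 = n11 + n21, n12 + n22
        observed_minus_expected += n11 - row1 * col1 / n
        variance += row1 * row2 * col1 * col2 / (n ** 2 * (n - 1))
    return observed_minus_expected ** 2 / variance


brand_b_pairs = (
    [(1, 1, 0, 0)] * 8      # concordant exposed: adds 0 to both sums
    + [(1, 0, 0, 1)] * 24   # only the case exposed: +.5 and .25
    + [(0, 1, 1, 0)] * 2    # only the control exposed: -.5 and .25
    + [(0, 0, 1, 1)] * 25   # concordant unexposed: adds 0 to both sums
)
chi2_cmh = cmh_chi2(brand_b_pairs)
print(chi2_cmh, (24 - 2) ** 2 / (24 + 2))
```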

Page 33: M:1 matched studies

One-to-one pair matching provides the most cost-effective design when cases and controls are equally scarce.

But when cases are the limiting factor, as with rare diseases, statistical power may be increased by selecting more than 1 control matched to each case.

But with diminishing returns…

Page 34: M:1 matched studies

2:1 matched study of colorectal cancer.

Background: Carcinoembryonic antigen (CEA) is the classical tumor marker for colorectal cancer. This study investigated whether the plasma levels of carcinoembryonic antigen and/or CA 242 were elevated BEFORE clinical diagnosis of colorectal cancer.

From: Palmqvist R et al. Prediagnostic Levels of Carcinoembryonic Antigen and CA 242 in Colorectal Cancer: A Matched Case-Control Study. Diseases of the Colon & Rectum. 46(11):1538-1544, November 2003.

Page 35: M:1 matched studies

Prediagnostic Levels of Carcinoembryonic Antigen and CA 242 in Colorectal Cancer: A Matched Case-Control Study

Study design: A so-called “nested case-control study.”

Idea: Study subjects who were members of an ongoing prospective cohort study in Sweden had given blood at baseline, when they had no disease. Years later, blood can be thawed and tested for the presence of prediagnostic antigens.

Key innovation: The cohort is large, the disease is rare, and it’s too costly to test everyone’s blood; so only test stored blood of cases and matched controls from the cohort.

Page 36: M:1 matched studies

Two cancer-free controls were randomly selected for each case from the corresponding cohort at the time of diagnosis of the matched case.

Matched for:

•gender

•age at recruitment (±12 months)

•date of blood sampling (±2 months)

•fasting time (<4 hours, 4–8 hours, >8 hours)

Page 37: 2:1 matching:

•stratum=matching group

•3 subjects per stratum

•6 possible 2x2 tables…
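The six possible tables can be enumerated directly: the case is either exposed or not, and 0, 1, or 2 of the controls are exposed. The sketch below is illustrative only; the function name and the tuple layout (CEA + row, CEA - row) are assumptions made for the example.

```python
from itertools import product


def possible_strata(n_controls=2):
    """List every 2x2 table for one case matched to n_controls controls."""
    strata = []
    for case_exposed, controls_exposed in product((1, 0), range(n_controls, -1, -1)):
        table = (
            (case_exposed, controls_exposed),                   # exposed row
            (1 - case_exposed, n_controls - controls_exposed),  # unexposed row
        )
        total_exposed = case_exposed + controls_exposed
        # A stratum is informative only if exposure varies within it.
        informative = 0 < total_exposed < n_controls + 1
        strata.append((table, informative))
    return strata


for table, informative in possible_strata():
    print(table, "informative" if informative else "non-informative")
```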

Page 38:

          Case (CRC)   Controls
CEA +         1            2
CEA -         0            0
Everyone exposed; non-informative

          Case (CRC)   Controls
CEA +         1            1
CEA -         0            1
Case exposed; 1 control unexposed

          Case (CRC)   Controls
CEA +         1            0
CEA -         0            2
Case exposed; both controls unexposed

Page 39:

          Case (CRC)   Controls
CEA +         0            2
CEA -         1            0
Case unexposed; both controls exposed

          Case (CRC)   Controls
CEA +         0            1
CEA -         1            1
Case unexposed; 1 control exposed

          Case (CRC)   Controls
CEA +         0            0
CEA -         1            2
Everyone unexposed; non-informative

Page 40:

          Case (CRC)   Controls
CEA +         1            2
CEA -         0            0         x 0

          Case (CRC)   Controls
CEA +         1            1
CEA -         0            1         x 2

          Case (CRC)   Controls
CEA +         1            0
CEA -         0            2         x 12

Page 41:

          Case (CRC)   Controls
CEA +         0            2
CEA -         1            0         x 0

          Case (CRC)   Controls
CEA +         0            1
CEA -         1            1         x 1

          Case (CRC)   Controls
CEA +         0            0
CEA -         1            2         x 102

Page 42:

Represents all possible discordant tables (either 2 or 1 total exposed)

2 tables with 2 exposed:

          Case (CRC)   Controls
CEA +         1            1
CEA -         0            1         x 2

          Case (CRC)   Controls
CEA +         0            2
CEA -         1            0         x 0

13 tables with 1 exposed:

          Case (CRC)   Controls
CEA +         1            0
CEA -         0            2         x 12

          Case (CRC)   Controls
CEA +         0            1
CEA -         1            1         x 1

Page 43:

2 tables with 2 exposed

First table:

          Case (CRC)   Controls
CEA +         1            1
CEA -         0            1

Second table:

          Case (CRC)   Controls
CEA +         0            2
CEA -         1            0

\[
P(\text{first table}) = p_{E/D} \binom{2}{1} p_{E/\sim D}\,(1 - p_{E/\sim D})
\]

\[
P(\text{second table}) = (1 - p_{E/D}) \binom{2}{2} p_{E/\sim D}^{2}\,(1 - p_{E/\sim D})^{0}
\]

\[
P(\text{case exposed} / 2 \text{ total exposed}) = \frac{2\,p_{E/D}\,p_{E/\sim D}\,(1 - p_{E/\sim D})}{2\,p_{E/D}\,p_{E/\sim D}\,(1 - p_{E/\sim D}) + (1 - p_{E/D})\,p_{E/\sim D}^{2}}
\]

Page 44:

\[
\frac{2\,p_{E/D}\,p_{E/\sim D}\,(1 - p_{E/\sim D})}{2\,p_{E/D}\,p_{E/\sim D}\,(1 - p_{E/\sim D}) + (1 - p_{E/D})\,p_{E/\sim D}^{2}}
= \frac{2\,p_{E/D}\,(1 - p_{E/\sim D})}{2\,p_{E/D}\,(1 - p_{E/\sim D}) + (1 - p_{E/D})\,p_{E/\sim D}}
\]

\[
= \frac{2\,\frac{p_{E/D}\,(1 - p_{E/\sim D})}{(1 - p_{E/D})\,p_{E/\sim D}}}{2\,\frac{p_{E/D}\,(1 - p_{E/\sim D})}{(1 - p_{E/D})\,p_{E/\sim D}} + 1}
= \frac{2\,OR}{2\,OR + 1}
\]

Page 45:

13 tables with 1 exposed

First table:

          Case (CRC)   Controls
CEA +         0            1
CEA -         1            1

Second table:

          Case (CRC)   Controls
CEA +         1            0
CEA -         0            2

\[
P(\text{first table}) = (1 - p_{E/D}) \binom{2}{1} p_{E/\sim D}\,(1 - p_{E/\sim D})
\]

\[
P(\text{second table}) = p_{E/D} \binom{2}{0} p_{E/\sim D}^{0}\,(1 - p_{E/\sim D})^{2}
\]

\[
P(\text{case exposed} / 1 \text{ total exposed}) = \frac{p_{E/D}\,(1 - p_{E/\sim D})^{2}}{p_{E/D}\,(1 - p_{E/\sim D})^{2} + 2\,(1 - p_{E/D})\,p_{E/\sim D}\,(1 - p_{E/\sim D})}
\]

Page 46:

\[
\frac{p_{E/D}\,(1 - p_{E/\sim D})^{2}}{p_{E/D}\,(1 - p_{E/\sim D})^{2} + 2\,(1 - p_{E/D})\,p_{E/\sim D}\,(1 - p_{E/\sim D})}
= \frac{p_{E/D}\,(1 - p_{E/\sim D})}{p_{E/D}\,(1 - p_{E/\sim D}) + 2\,(1 - p_{E/D})\,p_{E/\sim D}}
\]

\[
= \frac{\frac{p_{E/D}\,(1 - p_{E/\sim D})}{(1 - p_{E/D})\,p_{E/\sim D}}}{\frac{p_{E/D}\,(1 - p_{E/\sim D})}{(1 - p_{E/D})\,p_{E/\sim D}} + 2}
= \frac{OR}{OR + 2}
\]

Page 47: Summary

P(case exposed/2 total exposed) = 2OR/(2OR+1)

P(case unexposed/2 total exposed) = 1-2OR/(2OR+1)

P(case exposed/1 total exposed) = OR/(OR+2)

P(case unexposed/1 total exposed) = 1-OR/(OR+2)

Therefore, we can make a likelihood equation for our data that is a function of the OR, and use MLE to solve for OR.
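The two conditional probabilities in the summary can be checked numerically. The sketch below builds the table probabilities from binomial terms for arbitrary exposure probabilities (the values 0.3 and 0.1 are made up for illustration) and compares the ratios with 2OR/(2OR+1) and OR/(OR+2).

```python
def conditional_probs(p_case, p_control):
    """Return P(case exposed | 2 total exposed) and P(case exposed | 1 total exposed)
    for one case and two controls, given each group's exposure probability."""
    q_case, q_control = 1 - p_case, 1 - p_control
    # Two subjects exposed: case + one control, or both controls only.
    case_and_one_control = p_case * 2 * p_control * q_control
    both_controls_only = q_case * p_control ** 2
    # One subject exposed: the case only, or exactly one control only.
    case_only = p_case * q_control ** 2
    one_control_only = q_case * 2 * p_control * q_control
    return (
        case_and_one_control / (case_and_one_control + both_controls_only),
        case_only / (case_only + one_control_only),
    )


p_case, p_control = 0.3, 0.1  # hypothetical exposure probabilities
odds_ratio = (p_case / (1 - p_case)) / (p_control / (1 - p_control))
given_two, given_one = conditional_probs(p_case, p_control)
print(given_two, 2 * odds_ratio / (2 * odds_ratio + 1))
print(given_one, odds_ratio / (odds_ratio + 2))
```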

Page 48: Applying to example data

\[
P(\text{data}/OR) = \left(\frac{2\,OR}{2\,OR + 1}\right)^{2} \left(1 - \frac{2\,OR}{2\,OR + 1}\right)^{0} \left(\frac{OR}{OR + 2}\right)^{12} \left(1 - \frac{OR}{OR + 2}\right)^{1}
\]

\[
= \left(\frac{2\,OR}{2\,OR + 1}\right)^{2} \left(\frac{1}{2\,OR + 1}\right)^{0} \left(\frac{OR}{OR + 2}\right)^{12} \left(\frac{2}{OR + 2}\right)^{1}
\]

A little complicated to solve further…
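Although the likelihood is awkward to solve by hand, it can be maximized numerically. The sketch below uses a golden-section search on the log-likelihood; this is one convenient choice, not necessarily how the lecture solved it. It assumes the exponents 2, 0, 12, and 1 are the stratum counts from the CEA tables, and that the maximum lies between 0.01 and 1000.

```python
from math import log, sqrt


def log_likelihood(odds_ratio):
    """Conditional log-likelihood for the 2:1 matched data."""
    p_two = 2 * odds_ratio / (2 * odds_ratio + 1)  # case exposed | 2 total exposed
    p_one = odds_ratio / (odds_ratio + 2)          # case exposed | 1 total exposed
    return 2 * log(p_two) + 0 * log(1 - p_two) + 12 * log(p_one) + 1 * log(1 - p_one)


def golden_section_max(f, lo, hi, tol=1e-6):
    """Locate the maximum of a unimodal function on [lo, hi]."""
    inv_phi = (sqrt(5) - 1) / 2
    x1 = hi - inv_phi * (hi - lo)
    x2 = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(x1) < f(x2):
            lo, x1 = x1, x2          # maximum lies to the right of x1
            x2 = lo + inv_phi * (hi - lo)
        else:
            hi, x2 = x2, x1          # maximum lies to the left of x2
            x1 = hi - inv_phi * (hi - lo)
    return (lo + hi) / 2


or_mle = golden_section_max(log_likelihood, 0.01, 1000.0)
print(f"MLE of OR is roughly {or_mle:.1f}")
```

Setting the derivative of the log-likelihood to zero gives the quadratic 2OR^2 - 49OR - 28 = 0, whose positive root is about 25.1, so the numerical answer can be checked against that.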

Page 49: Applying to example data

BD give a simpler, robust estimate of OR for 2:1 matching:

\[
OR = \frac{1(\#\text{ where 2 total exposed \& case exposed}) + 2(\#\text{ where 1 total exposed \& case exposed})}{2(\#\text{ where 2 total exposed \& 2 controls exposed}) + 1(\#\text{ where 1 total exposed \& control exposed})}
= \frac{1(2) + 2(12)}{2(0) + 1(1)} = 26.0
\]
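The weights 1 and 2 in this estimate are what the Mantel-Haenszel formula gives when each matched triplet is treated as a stratum with T = 3. The sketch below applies that formula to the stratum counts; the dictionary layout, keyed by (case exposed, number of controls exposed), is an assumption made for the example.

```python
def matched_mh_or(stratum_counts, n_controls=2):
    """Mantel-Haenszel OR for 1:M matching.

    stratum_counts maps (case_exposed, controls_exposed) to the number of
    matched sets with that pattern.
    """
    total = n_controls + 1
    numerator = 0.0
    denominator = 0.0
    for (case_exposed, controls_exposed), count in stratum_counts.items():
        a = case_exposed                   # exposed case
        b = controls_exposed               # exposed controls
        c = 1 - case_exposed               # unexposed case
        d = n_controls - controls_exposed  # unexposed controls
        numerator += count * a * d / total
        denominator += count * b * c / total
    return numerator / denominator


cea_counts = {
    (1, 2): 0,    # everyone exposed
    (1, 1): 2,    # case and one control exposed
    (1, 0): 12,   # only the case exposed
    (0, 2): 0,    # both controls exposed, case not
    (0, 1): 1,    # one control exposed, case not
    (0, 0): 102,  # no one exposed
}
or_simple = matched_mh_or(cea_counts)
print(or_simple)
```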