interpretation and choice of effect measures in epidemiologic analyses


  • 8/1/2019 Interpretation and Choice of Effect Measures in Epidemiologic Analyses

    1/8

AMERICAN JOURNAL OF EPIDEMIOLOGY
Copyright 1987 by The Johns Hopkins University School of Hygiene and Public Health
All rights reserved
Vol. 126, No. 5
Printed in U.S.A.

INTERPRETATION AND CHOICE OF EFFECT MEASURES IN EPIDEMIOLOGIC ANALYSES¹

SANDER GREENLAND

The concept of the odds ratio is now well-established in epidemiology, largely because it serves as a link between results obtainable from follow-up studies and those obtainable from case-control studies (1-7). Odds ratios also naturally arise when considering small sample analysis of 2 x 2 tables and in logistic and log-linear modeling (2-4, 8). This ubiquity, along with certain technical considerations, has led some authors to treat the odds ratio as perhaps a "universal" measure of epidemiologic effect, in that they would estimate odds ratios in follow-up studies as well as case-control studies (6, 9); others have expressed reservations about the utility of the odds ratio as something other than an estimate of an incidence ratio (10, 11).

I believe that such controversy as exists regarding the use of the odds ratio arises from its inherent disadvantages compared with the other measures for biological inference, and its inherent advantages for statistical inference. The purpose of this paper is to compare the interpretations and statistical properties of the common measures of effect in an attempt to delineate clearly the advantages and drawbacks of each measure for epidemiologic inference. I will argue that only incidence differences and ratios possess direct interpretations as measures of impact on average risk or hazard. Consequently, odds ratios are useful only when they serve as incidence-ratio estimates, and logistic and log-linear models are useful only insofar as they provide improved (smoothed (8)) estimates of incidence differences or ratios.

¹ Division of Epidemiology, University of California, Los Angeles, School of Public Health, Los Angeles, CA 90024.
The author thanks Drs. Hal Morgenstern, Charles Poole, and James Schlesselman for their helpful comments.

INTERPRETATIONS UNDER A STOCHASTIC-RISK MODEL

For simplicity, this section will focus on the problem of estimating the effect of a binary exposure factor on the risk of a binary disease outcome over a well-defined time period (the risk period). As discussed later, the conclusions extend to more general cases, including risk considered as a function of time (as in failure-time analysis) and polytomous, continuous, and multiple exposures (as in regression models).

There are two basic conceptual models for viewing individual disease risk: deterministic and stochastic (probabilistic) (7), deterministic being the special case of stochastic in which risks may be zero or one, but not in between. A treatment of the measures in the deterministic case has recently been given elsewhere (11). The arguments of this section generalize that treatment to the stochastic case.

Individual measures

Under the stochastic model, each individual is analogous to a coin to be tossed: for each individual, $i$, there is a certain unknown probability $r_{1i}$ that disease will occur when the individual is exposed, and a probability $r_{0i}$ that disease will occur when the individual is not exposed. These risks are analogous to the probability that a coin will land heads, and thus may vary between zero and one. One may define the survival probabilities of the individual as $s_{1i} = 1 - r_{1i}$ when exposed and $s_{0i} = 1 - r_{0i}$ when unexposed.

One may also define the odds of disease for the individual, or risk odds, by $w_{1i} = r_{1i}/s_{1i}$ and $w_{0i} = r_{0i}/s_{0i}$. Unlike risks, the odds are defined only if the survival probabilities $s_{1i}$ and $s_{0i}$ are nonzero.

The effect of exposure on the risk of an individual may be measured in terms of the risk difference $r_{1i} - r_{0i}$, the risk ratio $r_{1i}/r_{0i}$, the risk-odds difference $w_{1i} - w_{0i}$, or the risk-odds ratio $w_{1i}/w_{0i}$. Clearly, the risk ratio and risk-odds ratio will be undefined if the risk in absence of exposure $r_{0i}$ is zero. In addition, both the risk-odds difference and the risk-odds ratio will be undefined if either survival probability is zero, i.e., if either risk is one.
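The individual measures defined above, together with the cases in which they are undefined, can be sketched in a few lines of Python. This is only an illustration: the function name `individual_measures` and the convention of returning `None` for an undefined measure are assumptions of the sketch, not part of the paper.

```python
def individual_measures(r1, r0):
    """Effect measures for one individual with risk r1 if exposed and r0 if unexposed.

    A measure that is undefined for the given risks is returned as None
    (a convention chosen for this sketch).
    """
    if not (0.0 <= r1 <= 1.0 and 0.0 <= r0 <= 1.0):
        raise ValueError("risks must lie between zero and one")
    s1, s0 = 1.0 - r1, 1.0 - r0              # survival probabilities
    w1 = r1 / s1 if s1 > 0 else None         # risk odds when exposed
    w0 = r0 / s0 if s0 > 0 else None         # risk odds when unexposed
    odds_defined = w1 is not None and w0 is not None
    return {
        "risk_difference": r1 - r0,
        "risk_ratio": r1 / r0 if r0 > 0 else None,
        "odds_difference": w1 - w0 if odds_defined else None,
        # the odds ratio also needs a nonzero unexposed risk (nonzero w0)
        "odds_ratio": w1 / w0 if odds_defined and r0 > 0 else None,
    }
```

For instance, `individual_measures(0.60, 0.20)` gives a risk ratio of 3 and a risk-odds ratio of 6, whereas a risk of one under exposure leaves both odds measures undefined.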

Population measures

Epidemiology is largely concerned with inferences about average risks and effects in populations. In a cohort comprising $N_1$ exposed and $N_0$ unexposed individuals, the expected number of cases and noncases over the risk period would be as follows:

                          Subcohort 1 (exposed)     Subcohort 0 (unexposed)     Total
Disease occurs            $A = \sum_1 r_{1i}$       $B = \sum_0 r_{0i}$         $M_1$
Disease does not occur    $C = \sum_1 s_{1i}$       $D = \sum_0 s_{0i}$         $M_0$
Total                     $N_1$                     $N_0$                       $T$

where $\sum_1$ and $\sum_0$ mean summation over all individuals in subcohort 1 (exposed) and subcohort 0 (unexposed), respectively. The proportion expected to contract the disease in a group is the cumulative incidence (1) or incidence proportion (12). The incidence proportions $A/N_1$ and $B/N_0$ are interpretable as average risks in their respective groups, i.e., $A/N_1 = \sum_1 r_{1i}/N_1$ and $B/N_0 = \sum_0 r_{0i}/N_0$.

One can also construct another measure of disease occurrence in the above population. The ratio of the number expected to contract disease to the number expected to not contract disease in a group is the disease odds or incidence odds, which takes on the values $A/C$ for the exposed and $B/D$ for the unexposed. The incidence odds $A/C$ and $B/D$ are interpretable as ratios of the average risk to the average survival probability; that is,

$$\frac{A}{C} = \frac{\sum_1 r_{1i}/N_1}{\sum_1 s_{1i}/N_1} \quad \text{and} \quad \frac{B}{D} = \frac{\sum_0 r_{0i}/N_0}{\sum_0 s_{0i}/N_0}.$$

Note, however, that the incidence odds do not equal the simple averages of the risk odds; that is, $A/C \neq \sum_1 w_{1i}/N_1$ and $B/D \neq \sum_0 w_{0i}/N_0$. Although the incidence odds do equal the average risk odds when the risk odds are weighted by the survival probabilities, it will be shown below that the failure of the incidence odds to equal the simple average risk odds seriously handicaps the interpretability of measures based on the incidence odds.

Assume there is no confounding, in the sense that had exposure been completely absent, the average risk would have been the same among the subcohorts that were in fact the exposed and the unexposed (12); that is, $\sum_1 r_{0i}/N_1 = \sum_0 r_{0i}/N_0$. The incidence-proportion difference is then given by

$$\frac{A}{N_1} - \frac{B}{N_0} = \frac{\sum_1 r_{1i}}{N_1} - \frac{\sum_1 r_{0i}}{N_1} \tag{1}$$

$$= \frac{\sum_1 (r_{1i} - r_{0i})}{N_1}. \tag{2}$$

Thus, this incidence difference is interpretable as both the absolute change in the average risk of the exposed subcohort produced by exposure (expression 1, the average-risk difference) and the average absolute change in risk produced by exposure among exposed individuals (expression 2, the average risk-difference).
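As a rough numerical check of these population relations, the sketch below computes the expected-count summaries for a small exposed subcohort. The function name `cohort_summary` and the four-person population are hypothetical, and the no-confounding assumption is used so that the unexposed comparison can be computed from the exposed individuals' own unexposed risks.

```python
def cohort_summary(r1, r0):
    """Expected-count summaries for an exposed subcohort.

    r1[i], r0[i]: risk of individual i if exposed / if unexposed.
    Under no confounding, the mean of r0 over the exposed equals B/N0.
    """
    n = len(r1)
    A = sum(r1)                       # expected cases under exposure
    C = n - A                         # expected noncases under exposure
    B_over_N0 = sum(r0) / n           # average risk absent exposure
    return {
        "incidence_proportion": A / n,
        "incidence_odds": A / C,
        "average_risk_odds": sum(r / (1 - r) for r in r1) / n,
        "difference_of_average_risks": A / n - B_over_N0,                   # expression 1
        "average_risk_difference": sum(a - b for a, b in zip(r1, r0)) / n,  # expression 2
    }

# Hypothetical population: one high-risk and three low-risk individuals.
summary = cohort_summary([0.5, 0.1, 0.1, 0.1], [0.2, 0.05, 0.05, 0.05])
```

In this illustration expressions 1 and 2 agree (both 0.1125), while the incidence odds (0.25) differ from the simple average of the risk odds (about 0.33).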


The incidence-proportion ratio is given by

$$\frac{A/N_1}{B/N_0} = \frac{\sum_1 r_{1i}/N_1}{\sum_1 r_{0i}/N_1}. \tag{3}$$

Thus, the incidence-proportion ratio is interpretable as the proportionate change in the average risk of the exposed subcohort produced by exposure (expression 3, the average-risk ratio). Nevertheless, it is not interpretable as the average proportionate change in risk produced by exposure among exposed individuals, i.e., the average risk-ratio

$$\sum_1 (r_{1i}/r_{0i})/N_1, \tag{4}$$

which is undefined if unexposed risks of zero occur. If, however, the individual risk ratios $r_{1i}/r_{0i}$ are all equal to a constant value, expressions 3 and 4 and the incidence-proportion ratio will all equal that value.

The incidence-odds ratio is given by

$$\frac{A/C}{B/D} = \frac{\sum_1 r_{1i}/\sum_1 s_{1i}}{\sum_1 r_{0i}/\sum_1 s_{0i}}. \tag{5}$$

This expression represents the proportionate change in the incidence odds in the exposed produced by exposure. Nevertheless, it is not equivalent to the proportionate change in the average odds in the exposed produced by exposure,

$$\frac{\sum_1 w_{1i}/N_1}{\sum_1 w_{0i}/N_1}. \tag{6}$$

Neither of the last two expressions is equivalent to the average of the individual odds ratios among the exposed,

$$\sum_1 (w_{1i}/w_{0i})/N_1, \tag{7}$$

which is undefined if any risks of one or unexposed risks of zero occur. Thus, the incidence-odds ratio lacks any simple interpretation in terms of exposure effect on average risk or odds, or average exposure effect on individual risk or odds. Parallel arguments show that the same is true of the incidence-odds difference.

If the individual risk-odds ratios $w_{1i}/w_{0i}$ are all equal to a constant value (as assumed, for example, by a logistic-risk model), expressions 6 and 7 will equal that value, yet the incidence-odds ratio need not equal that value. For example, in a population in which 10 per cent of individuals had $r_{1i} = 0.60$ and $r_{0i} = 0.20$, and the remainder had $r_{1i} = 0.035$ and $r_{0i} = 0.006$, all the risk-odds ratios (and, thus, expressions 6 and 7) would equal 6.0. However, the incidence proportion would be 0.10(0.60) + 0.90(0.035) = 0.0915 under exposure and 0.10(0.20) + 0.90(0.006) = 0.0254 under nonexposure, yielding an incidence-odds ratio of 0.0915(1 - 0.0254)/0.0254(1 - 0.0915) = 3.9.

Connections to confounding criteria

The failure of the incidence-odds ratio to equal expressions 6 or 7, even when $w_{1i}/w_{0i}$ is constant, can be seen as an analog of the "paradoxical" behavior of the odds ratio noted by Miettinen and Cook (10) in their example 3. That example showed that the elevation in the crude incidence odds produced by exposure can fall short of the elevation in odds produced in any subgroup, even if confounding (as defined above) is entirely absent. Such paradoxical behavior cannot occur with the incidence-proportion difference or ratio (10). Boivin and Wacholder (9) failed to note that the crude odds ratio in example 3 of Miettinen and Cook (10) is an unbiased estimate of expression 5 (the true exposure effect on the incidence odds in the exposed); as a result, they postulated that odds ratio nonconfounding corresponds to the crude odds ratio being equal to a weighted average of stratum-specific odds ratios. This postulate overlooks the "defect" in the incidence-odds ratio demonstrated above, i.e., the crude incidence-odds ratio may equal neither an average effect nor an effect on average odds, and yet still unbiasedly represent the effect of exposure on the incidence odds.
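The behavior described here can be reproduced with the two-subgroup population used in the paper's earlier numerical example (10 per cent of individuals with risks 0.60 and 0.20, the remainder with risks 0.035 and 0.006). The following is a minimal sketch; the variable names are illustrative only.

```python
def odds(p):
    """Odds corresponding to a probability p < 1."""
    return p / (1.0 - p)

# (share of population, risk if exposed, risk if unexposed), from the text's example
groups = [(0.10, 0.60, 0.20), (0.90, 0.035, 0.006)]

# Risk-odds ratio within each subgroup
subgroup_ors = [odds(r1) / odds(r0) for _, r1, r0 in groups]

# Incidence proportions are share-weighted averages of the risks
ip1 = sum(share * r1 for share, r1, _ in groups)
ip0 = sum(share * r0 for share, _, r0 in groups)

# Incidence-odds ratio for the whole population (expression 5)
incidence_or = odds(ip1) / odds(ip0)

print(subgroup_ors, ip1, ip0, incidence_or)
```

Both subgroup odds ratios come out close to 6.0, yet the population incidence-odds ratio is about 3.9, below the odds ratio in either subgroup even though no confounding is present.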

Density measures

The above arguments deal only with comparisons of proportions getting disease (or not) in a simple closed cohort, in which everyone is observed throughout their risk period. Some populations reasonably approximate this model, especially in the fields of perinatal epidemiology and technology assessment. Even when loss to follow-up occurs, various methods still allow one to estimate the incidence proportions for the original cohort (4, 6). Nevertheless, most chronic disease studies are based on open (dynamic) populations, in which the incidence proportion cannot be directly measured or even simply defined. This has led to the development and use of concepts of person-time rates and incidence density (1, 2, 4-7) and comparisons based on these "density" measures. Somewhat lengthy development is required to connect incidence-density comparisons to exposure effects on incidence proportion (1, 4, 7), and the resulting connection is fairly abstract (much like the concept of incidence density itself). Nevertheless, if the disease is "rare" and censoring is unrelated to risk, the incidence-density ratio will approximate the incidence-proportion ratio and do so more closely than the incidence-odds ratio (5).

Incidence-density measures may also be directly linked to individual failure-time (incidence) distributions: Suppose at time $t$ the hazard (13) for an individual $i$ in a population is $h_i(t)$, the instantaneous incidence density in the population is $ID(t)$, and the size of the population is $N(t)$; then $ID(t) = \sum h_i(t)/N(t)$ (a proof is given in the Appendix). Thus, like incidence proportion and risk, but unlike incidence odds and risk odds, the population measure (incidence density) is a simple average of the individual parameters (here, hazards). As a consequence, incidence-density differences and ratios may be interpreted as differences and ratios of average hazards. As described in Implications for Modeling, these interpretations generalize to link density measures and failure-time models.

IMPLICATIONS FOR MODELING

The preceding observations have important implications for inferences about parameters in biologic models for individual disease risks or hazards. Under general risk-difference or risk-ratio models and certain mixtures of these, the form of covariate effects at the individual level will (in the absence of uncontrolled confounding) be followed at the population level by the incidence proportions. To see this, consider first a model stating that the risk of an individual $i$ with covariate level $x$ is given by $\alpha_i + d(x; \beta)$, where $\alpha_i$ is a random effect independent of $x$, and $d(x; \beta)$ is a general risk-difference function, e.g., $\beta x$ ($\beta$ and $x$ may be vectors). If the size of the observed population at level $x$ is $N(x)$, the incidence proportion at level $x$ will be $\sum (\alpha_i + d(x; \beta))/N(x) = \bar{\alpha}(x) + d(x; \beta)$, where $\bar{\alpha}(x) = \sum \alpha_i/N(x)$, the mean of the $\alpha_i$ at level $x$, and the sums are over individuals observed at level $x$ (the no-confounding assumption given earlier translates into assuming the $\bar{\alpha}(x)$ are constant across $x$; the assumption of independence of the $\alpha_i$ and $x$ is, however, sufficient for inference on $\beta$). Assume next a model in which risk is given by $r(x; \gamma)\lambda_i$, where $r(x; \gamma)$ is a general risk-ratio function, e.g., $\exp(\gamma x)$. The incidence proportion at level $x$ will then be $\sum r(x; \gamma)\lambda_i/N(x) = r(x; \gamma)\bar{\lambda}(x)$, where $\bar{\lambda}(x)$ is the mean of the $\lambda_i$ at level $x$. Finally, assume an additive mixture model $\alpha_i + d(x; \beta) + r(x; \gamma)\lambda_i$; the incidence proportion will then be $\bar{\alpha}(x) + d(x; \beta) + r(x; \gamma)\bar{\lambda}(x)$, where $\bar{\alpha}(x)$ and $\bar{\lambda}(x)$ are as before.
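A small simulation can illustrate how the risk-difference and risk-ratio forms carry over to the incidence proportions. Everything in this sketch is a hypothetical choice made for illustration: the parameter values, the uniform distribution of the random effects, and the use of the same individuals at every covariate level (which makes the mean random effect constant across levels, i.e., no confounding).

```python
import math
import random

def mean(values):
    return sum(values) / len(values)

random.seed(0)
beta, gamma = 0.05, 0.4                                     # hypothetical effect parameters
alphas = [random.uniform(0.0, 0.2) for _ in range(1000)]    # additive random effects
lambdas = [random.uniform(0.0, 0.2) for _ in range(1000)]   # multiplicative random effects
mean_alpha, mean_lambda = mean(alphas), mean(lambdas)

rows = []
for x in (0, 1, 2):
    # Incidence proportion = average of the individual risks at level x
    ip_difference_model = mean([a + beta * x for a in alphas])
    ip_ratio_model = mean([math.exp(gamma * x) * lam for lam in lambdas])
    rows.append((x,
                 ip_difference_model, mean_alpha + beta * x,            # same form, alpha replaced by its mean
                 ip_ratio_model, math.exp(gamma * x) * mean_lambda))    # same form, lambda replaced by its mean

for row in rows:
    print(row)
```

In each row the averaged individual risks agree (up to rounding error) with the model form evaluated at the mean random effect, which is the sense in which the incidence proportions follow the individual-level form.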

In contrast, the form of covariate effects in a general model for the difference or ratio of the risk odds (e.g., the logistic model, which states the risk odds is given by $\exp(\alpha_i + \beta x)$) will generally not be followed by the incidence odds. In order to force a correspondence between a risk-odds model and the incidence odds, much of the statistical literature implicitly assumes that the random effects $\alpha_i$ are constant within covariate levels. However, such an assumption is unrealistic because it translates into a biologic assumption that all individuals observed at the same covariate level have the same risk, i.e., that the measured covariates are the only important risk factors.

In a manner parallel to that just given for risk models and incidence proportions, one can show that under general hazard difference or general hazard ratio (proportional hazards) models and certain mixtures of these, the form of covariate effects at the individual hazard level will (in the absence of uncontrolled confounding) be followed at the population level by the instantaneous incidence densities.
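The contrast with risk-odds models can be sketched in the same way. Here every individual follows a logistic model with the same coefficient, so every individual odds ratio is identical, but the intercepts differ between individuals. The intercept values and the helper names (`expit`, `incidence_odds`) are assumptions made for this illustration.

```python
import math

def expit(z):
    """Inverse logit: the risk corresponding to log odds z."""
    return 1.0 / (1.0 + math.exp(-z))

def incidence_odds(alphas, beta, x):
    """Incidence odds at level x when individual i has risk odds exp(alpha_i + beta * x)."""
    ip = sum(expit(a + beta * x) for a in alphas) / len(alphas)
    return ip / (1.0 - ip)

beta = math.log(4.0)              # every individual risk-odds ratio equals 4
varied = [-3.0, 0.0, 3.0]         # hypothetical intercepts that vary within the level
constant = [0.0, 0.0, 0.0]        # intercepts constant within the level

or_varied = incidence_odds(varied, beta, 1) / incidence_odds(varied, beta, 0)
or_constant = incidence_odds(constant, beta, 1) / incidence_odds(constant, beta, 0)
print(or_varied, or_constant)
```

With constant intercepts the incidence-odds ratio recovers 4, but with varying intercepts it falls to roughly 1.9, so the incidence odds do not follow the individual-level logistic form.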

APPROXIMATE INTERPRETATIONS

While the incidence-proportion difference can be directly interpreted as an effect on average risk, and as an average effect on risk, the incidence-proportion ratio can ordinarily be interpreted in only the first of these ways, and the incidence-odds ratio lacks both interpretations. Under certain circumstances, however, the deficiencies of the ratio measures disappear.

Odds approximation to proportions

If the disease is rare, the incidence odds will approximate the incidence proportions, and consequently the odds and odds ratios can be used as substitutes for the more easily interpreted incidence proportions and their ratios (1-7). Nevertheless, the necessary rarity condition for this substitution should be properly appreciated: The incidence proportion should be low in all exposure and confounder categories of the analysis; it is simply a mistake to require only that the crude incidence be low. This problem arises, for example, in studies of perinatal mortality, in which the crude mortality is usually low, but is high in certain subgroups (e.g., very low birth weight infants). There is a rule of thumb for judging how low incidence must be to allow the rare disease assumption to be invoked: If the odds never exceeds $X$ in any of the subgroups to be compared, the odds or odds ratio will incorporate no more than $100X$ per cent error in estimating the corresponding proportion or ratio of proportions. For example, if the odds never exceeds 0.10, substitution of odds ratios for proportion ratios will lead to no more than 10 per cent error in estimating the latter. This rule is derived by noting that

$$\frac{A/C}{B/D} = \frac{A/N_1}{B/N_0} \left( \frac{N_1/C}{N_0/D} \right) = \frac{A/N_1}{B/N_0} \left( \frac{1 + A/C}{1 + B/D} \right).$$

The second factor in the last term is the bias in using the odds ratio as an estimate of the proportion ratio; if the odds $A/C$ and $B/D$ are both under $X$, this bias factor must fall between $1 - X$ and $1 + X$.
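The bias factor in this identity is simple to compute. The sketch below, with the hypothetical helper names `odds_ratio_bias` and `proportion_ratio_from_odds`, evaluates the factor and uses it to recover the proportion ratio from the two incidence odds.

```python
def odds_ratio_bias(odds_exposed, odds_unexposed):
    """Multiplicative bias of the odds ratio relative to the proportion ratio:
    (1 + A/C) / (1 + B/D)."""
    return (1.0 + odds_exposed) / (1.0 + odds_unexposed)

def proportion_ratio_from_odds(odds_exposed, odds_unexposed):
    """Incidence-proportion ratio implied by the two incidence odds."""
    odds_ratio = odds_exposed / odds_unexposed
    return odds_ratio / odds_ratio_bias(odds_exposed, odds_unexposed)

# Hypothetical odds of 0.10 (exposed) and 0.02 (unexposed): both within X = 0.10
bias = odds_ratio_bias(0.10, 0.02)
ratio = proportion_ratio_from_odds(0.10, 0.02)
```

In this illustration the odds ratio is 5 and the bias factor is about 1.08, so the odds ratio overstates the proportion ratio (about 4.64) by roughly 8 per cent, within the 10 per cent bound given by the rule of thumb.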

Incidence-odds approximation to average risk odds

The rare disease assumption is not sufficient to allow one to interpret the incidence odds as the average of the individual risk odds, and thus is not sufficient to allow interpretation of the incidence-odds ratio as the change in average risk odds. For example, in a population in which 2 per cent of individuals had a risk odds of 1.00 and the remainder had a risk odds of 0.01 (which translates to risks of 0.50 and 0.01), the incidence proportion would be 0.02(0.50) + 0.98(0.01) = 0.02, and so the incidence odds would be 0.02/0.98 = 0.02, but the average of the risk odds would be 0.02(1.00) + 0.98(0.01) = 0.03. Examples of this nature are not hard to find in perinatal epidemiology. A sufficient condition for the incidence odds to approximate the average risk odds (if defined) is that all the individual risk odds be low; unlike the rare disease assumption, however, this condition cannot be verified by examining the data.
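The arithmetic of this example can be checked directly. The following is a minimal sketch with illustrative variable names; it converts each risk odds to a risk exactly rather than using the rounded risk of 0.01.

```python
def risk_from_odds(w):
    """Risk corresponding to risk odds w."""
    return w / (1.0 + w)

# (share of population, individual risk odds), from the text's example
groups = [(0.02, 1.00), (0.98, 0.01)]

incidence_proportion = sum(share * risk_from_odds(w) for share, w in groups)
incidence_odds = incidence_proportion / (1.0 - incidence_proportion)
average_risk_odds = sum(share * w for share, w in groups)

print(round(incidence_odds, 2), round(average_risk_odds, 2))
```

The incidence odds round to 0.02 while the average risk odds round to 0.03, a discrepancy of about half again the incidence odds even though the disease is rare overall.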

Incidence-proportion ratio approximation to average risk ratio

The rare disease assumption is also insufficient to allow one to interpret the incidence-proportion ratio as the average of the individual risk ratios (expression 4 above). For example, suppose that in the study population 2 per cent of individuals had $r_{1i} = 0.50$ and $r_{0i} = 0.20$, and the remainder had $r_{1i} = 0.02$ and $r_{0i} = 0.002$. The incidence proportion for this population would be 0.02(0.50) + 0.98(0.02) = 0.03 under exposure and 0.02(0.20) + 0.98(0.002) = 0.006 under no exposure, for an incidence-proportion ratio of 0.03/0.006 = 5; in contrast, the average of the individual risk ratios would be 0.02(0.50/0.20) + 0.98(0.02/0.002) = 10. A sufficient (but unverifiable) condition for the incidence-proportion ratio to approximate the average risk ratio (if defined) is that all the individual risks be low.
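This example, too, can be reproduced in a few lines; the variable names are illustrative only.

```python
# (share of population, risk if exposed, risk if unexposed), from the text's example
groups = [(0.02, 0.50, 0.20), (0.98, 0.02, 0.002)]

ip_exposed = sum(share * r1 for share, r1, _ in groups)
ip_unexposed = sum(share * r0 for share, _, r0 in groups)

incidence_proportion_ratio = ip_exposed / ip_unexposed                  # expression 3
average_risk_ratio = sum(share * (r1 / r0) for share, r1, r0 in groups)  # expression 4

print(incidence_proportion_ratio, average_risk_ratio)
```

Without intermediate rounding the two quantities are about 4.97 and 9.85, matching the values of 5 and 10 quoted in the text: the average individual risk ratio is about twice the incidence-proportion ratio.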

STATISTICAL CONSIDERATIONS

The statistical properties of the measures discussed here have been studied in detail. The literature is vast and highly technical, and I will not attempt a review; the points I wish to consider follow directly from the cited references.

Sparse data efficiency

Epidemiologic studies frequently produce "sparse data," i.e., data that upon stratification by relevant variables (such as matching factors) yield small strata. For example, a twin- or neighborhood-matched pair study that retains the natural pairing will yield data with only two subjects per stratum and thus the data will be sparse. (One should not confuse the term sparse data with "small sample," because sparse data sets may be quite large, as in large matched studies.) For such data, the odds ratio possesses clear if rather technical advantages for formal statistical analysis: Sparse data methods that assume and estimate a constant odds ratio are highly efficient (in the statistical sense) (14), whereas sparse data methods that assume and estimate a constant difference or ratio of proportions can be highly inefficient (15).

Plausibility of homogeneity assumptions

The assumption of constancy (homogeneity) of an effect parameter is statistically convenient but biologically stringent, and it is good practice to critically examine the assumption before applying a technique based on it. There are no purely logical or general biologic reasons for believing such an assumption, but in certain situations there are purely logical reasons for disbelieving constancy of the difference or ratio of proportions. These reasons arise from the inherent range limitations of these measures. In 2 x 2 table notation, the difference cannot exceed $A/N_1$ (and so, because $A/N_1$ cannot exceed 1, cannot exceed $1 - B/N_0$) or fall below $-B/N_0$, and the ratio cannot exceed $N_0/B$. For example, if the incidence proportion among the unexposed was known to range as high as 0.5 in some strata, the incidence-proportion difference could not exceed 0.5 and the incidence-proportion ratio could not exceed 2.0 in those strata (since incidence proportions cannot exceed 1.0). If the incidence-proportion difference observed in other strata clearly exceeds 0.5, one would have to rule out constancy of the difference; similarly, if the incidence-proportion ratio observed in other strata clearly exceeds 2.0, one would have to rule out constancy of the proportion ratio.
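The range limitations can be expressed as a simple logical check. The sketch below is only illustrative: the function names are hypothetical, and it treats the stratum values as known quantities, ignoring the sampling error that the phrase "clearly exceed" allows for in the text.

```python
def max_difference(p0):
    """Largest incidence-proportion difference possible when the unexposed proportion is p0."""
    return 1.0 - p0

def max_ratio(p0):
    """Largest incidence-proportion ratio possible when the unexposed proportion is p0 > 0."""
    return 1.0 / p0

def constancy_ruled_out(observed_effects, baseline_proportions, bound):
    """True if some stratum's effect exceeds the tightest bound implied by any stratum's baseline."""
    tightest = min(bound(p0) for p0 in baseline_proportions)
    return any(effect > tightest for effect in observed_effects)

# Hypothetical strata: unexposed proportions 0.5 and 0.1, observed ratios 1.5 and 3.0
ruled_out = constancy_ruled_out([1.5, 3.0], [0.5, 0.1], max_ratio)
```

Here `ruled_out` is `True`: a ratio of 3.0 in one stratum cannot also hold in a stratum whose unexposed incidence proportion is 0.5, where the ratio is capped at 2.0.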

The odds ratio suffers from no such a priori range limitations, and so for common diseases the constant incidence-odds ratio assumption is logically less vulnerable to objection than are the other constancy assumptions. If, however, the disease is rare, the limit of the size of the incidence-proportion difference and ratio will be so high as to cause no problems.

Modeling considerations

Odds ratios arise naturally as antilogs of simple linear combinations of logistic or log-linear model coefficients (2-4, 8), and thus have been promoted as a link between the results of stratification analyses and modeling (2, 6); similarly, hazard ratios arise naturally as antilogs of linear combinations of Cox model coefficients (6, 13). The "naturalness" of these connections, however, reflects only mathematical convenience and should not be taken to impart any special biologic importance to either the measures or the models (7, 16, 17) (although this convenience may explain the preferences of many statisticians and software developers for the models).

Case-control analysis

If we now consider the earlier 2 x 2 table as observed case-control data, $A/N_1$ and $B/N_0$ cease to be meaningful quantities, and (without external information) analysis must depend on the case-control exposure-odds ratio $(A/B)/(C/D) = AD/BC$. Nevertheless, one should not generally equate this odds ratio to the incidence-odds ratio (as is done in most elementary texts): Depending on the specific methods of case and control selection, the case-control odds ratio may directly estimate the incidence-proportion ratio, incidence-density ratio, or incidence-odds ratio (1, 5, 18, 19). Even when the case-control odds ratio directly estimates the incidence-odds ratio, "disease rarity" will allow the case-control odds ratio to be used as an estimate of the incidence-proportion ratio (1-7), and external information about population rates of disease or exposure will allow direct estimation of the incidence proportions or densities (4, pp. 174-5). Thus, most case-control analyses need not be interpreted in terms of odds ratios.
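For concreteness, the exposure-odds ratio can be computed from the four cell counts as below. The function name and the counts are hypothetical, and the sketch says nothing about which incidence measure the result estimates, since that depends on how cases and controls were selected.

```python
def exposure_odds_ratio(a, b, c, d):
    """Case-control exposure-odds ratio (A/B)/(C/D) = AD/BC.

    a, b: exposed and unexposed cases; c, d: exposed and unexposed controls.
    """
    if b == 0 or c == 0:
        raise ValueError("exposure-odds ratio is undefined when B or C is zero")
    return (a * d) / (b * c)

# Hypothetical counts: 40 exposed and 20 unexposed cases, 10 exposed and 30 unexposed controls
example_or = exposure_odds_ratio(40, 20, 10, 30)
```

With these counts the exposure odds are 2.0 among cases and one third among controls, giving an exposure-odds ratio of 6.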

Prevalence data

The situation here somewhat parallels case-control analysis: under certain conditions, the prevalence-odds ratio directly estimates the incidence-density ratio (1, 7). Generally, however, inferences about risk variation from prevalence data will require more restrictive assumptions than will similar inferences from incidence-density case-control studies (1, 4).

CONCLUSION

I have argued that, for summarizing exposure impact on risk, the incidence-proportion ("risk") difference and ratio should be the measures of choice, for in the absence of bias only they possess direct interpretations in terms of exposure effect on average risk. The incidence-proportion difference possesses an additional interpretation as an average effect of exposure on risk. The incidence density can be directly interpreted as an average hazard, and thus can be used to estimate parameters of failure-time distributions; if the disease is rare over the risk period, the incidence-density ratio (properly averaged) closely approximates the incidence-proportion ratio and so inherits the latter's interpretation as well. The most common measure in epidemiologic statistics, the odds ratio, is biologically interpretable only insofar as it estimates the incidence-proportion or incidence-density ratio.

Nearly all unbiased etiologic studies can and should provide estimates of exposure effects on incidence proportions or densities. Odds ratios and parameters of multivariate models will often be useful in serving as or in constructing the estimates, but should not be treated as the end product of a statistical analysis of epidemiologic data or as summaries of effect in themselves. It has been argued elsewhere that standardized regression coefficients, correlations, and "variance explained" are also improper summaries of effect (20).

REFERENCES

1. Miettinen OS. Estimability and estimation in case-referent studies. Am J Epidemiol 1976;103:226-35.
2. Breslow NE, Day NE. Statistical methods in cancer research. Vol. 1. The analysis of case-control studies. IARC Scientific Publications no. 32. Lyon: International Agency for Research on Cancer, 1980.
3. Schlesselman JJ. Case-control studies: design, conduct, analysis. New York: Oxford University Press, Inc., 1982.
4. Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic research: principles and quantitative methods. Belmont, CA: Lifetime Learning Publications, 1982.
5. Greenland S, Thomas DC. On the need for the rare disease assumption in case-control studies. Am J Epidemiol 1982;116:547-53.
6. Kelsey JL, Thompson WD, Evans AS. Methods in observational epidemiology. New York: Oxford University Press, Inc., 1986.
7. Rothman KJ. Modern epidemiology. Boston: Little, Brown & Co., 1986.
8. Bishop YMM, Fienberg SE, Holland PW. Discrete multivariate analysis: theory and practice. Cambridge, MA: MIT Press, 1975.
9. Boivin J-F, Wacholder S. Conditions for confounding of the risk ratio and of the odds ratio. Am J Epidemiol 1985;121:152-8.
10. Miettinen OS, Cook EF. Confounding: essence and detection. Am J Epidemiol 1981;114:593-603.
11. Miettinen OS. Theoretical epidemiology. New York: John Wiley & Sons, 1985.
12. Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. Int J Epidemiol 1986;15:412-18.
13. Kalbfleisch JD, Prentice RL. The statistical analysis of failure-time data. New York: John Wiley & Sons, 1980.
14. Breslow NE. Odds ratio estimators when the data are sparse. Biometrika 1981;68:73-84.
15. Greenland S, Robins JM. Estimation of a common effect parameter from sparse follow-up data. Biometrics 1985;41:55-68.
16. Greenland S. Limitations of the logistic analysis of epidemiologic data. Am J Epidemiol 1979;110:693-8.
17. Siemiatycki J, Thomas DC. Biological models and statistical interactions: an example from multistage carcinogenesis. Int J Epidemiol 1981;10:383-7.
18. Miettinen OS. Design options in epidemiologic research: an update. Scand J Work Environ Health 1982;8(suppl 1):7-14.
19. Greenland S, Thomas DC, Morgenstern H. The rare-disease assumption revisited: a critique of "Estimators of relative risk in case-control studies." Am J Epidemiol 1986;124:869-76.
20. Greenland S, Schlesselman JJ, Criqui MH. The fallacy of employing standardized regression coefficients and correlations as measures of effect. Am J Epidemiol 1986;123:203-8.

APPENDIX

Proof that the instantaneous incidence density equals the average hazard

Let $r_i(t' \mid t)$ be the risk of an individual $i$ up to $t' > t$, given survival to $t$, and let $IP(t' \mid t)$ be the proportion of population members at $t$ who would become ill by $t'$. Then $h_i(t) = \lim_{t' \to t} r_i(t' \mid t)/(t' - t)$ (13, p. 6) and $ID(t) = \lim_{t' \to t} IP(t' \mid t)/(t' - t)$ (7, p. 31). By the arguments given in the text, $IP(t' \mid t) = \sum r_i(t' \mid t)/N(t)$, where the sum is over the population membership at $t$; dividing both sides of this equation by $t' - t$ and letting $t'$ go to $t$ yields $ID(t) = \sum h_i(t)/N(t)$.
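The limit in this proof can be illustrated numerically. The sketch below makes an assumption that is not in the paper: each individual has a constant hazard, so that the risk over an interval of length dt is 1 - exp(-h dt). The hazard values are hypothetical.

```python
import math

hazards = [0.01, 0.05, 0.20]                       # hypothetical constant hazards h_i
average_hazard = sum(hazards) / len(hazards)

def incidence_proportion(dt):
    """IP(t' | t) for t' - t = dt: the mean individual risk over the interval."""
    # -expm1(-h * dt) equals 1 - exp(-h * dt), computed accurately for small dt
    return sum(-math.expm1(-h * dt) for h in hazards) / len(hazards)

# IP(t' | t) / (t' - t) for shrinking interval lengths
approximations = [incidence_proportion(dt) / dt for dt in (1.0, 0.1, 0.001)]
print(approximations, average_hazard)
```

As the interval shrinks, the incidence proportion per unit time approaches the average hazard (about 0.0867 here), in line with the result of the proof.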