
Pharm Med 2008; 22 (1): 13-22
CURRENT OPINION
1178-2595/08/0001-0013/$48.00/0

© 2008 Adis Data Information BV. All rights reserved.

Number Needed to Detect: Nuances in the Use of a Simple and Intuitive Signal Detection Metric

Manfred Hauben,1,2,3 Ulrich Vogel4 and Francois Maignen5

1 Department of Medicine, Risk Management Strategy, Pfizer Inc., New York University School of Medicine, New York, New York, USA

2 Departments of Community and Preventive Medicine and Pharmacology, New York Medical College, Valhalla, New York, USA

3 School of Information Systems, Computing and Mathematics, Brunel University, London, England

4 Corporate Drug Safety, Boehringer Ingelheim GmbH, Ingelheim am Rhein, Germany

5 Post-Authorisation Pharmacovigilance, Safety and Efficacy Sector (Eudravigilance), European Medicines Agency, London, England

Abstract

Data mining algorithms are increasingly being used to support the process of signal detection and evaluation in pharmacovigilance. Published data mining exercises formulated within a screening paradigm typically calculate classical performance indicators such as sensitivity, specificity, predictive value and receiver operator characteristic curves. Extrapolating signal detection performance from these isolated data mining exercises to performance in real-world pharmacovigilance scenarios is complicated by numerous factors, and some published exercises may promote an inappropriate and exclusive focus on only one aspect of performance. In this article, we discuss a variation on positive predictive value that we call the 'number needed to detect', which provides a simple and intuitive screening metric that might usefully supplement the usual presentations of data mining performance. We use a series of figures to demonstrate the nature and application of this metric, and selected adaptive variations. Even with simple and intuitive metrics, precisely quantifying the performance of contemporary data mining algorithms in pharmacovigilance is complicated by the complexity of the phenomena under surveillance and the manner in which the data are recorded in spontaneous reporting systems.

It has been a decade since Bate et al.[1] pioneered the application of Bayesian methods to drug safety signal detection. This represented a quantum jump in a field not noted for frequent technological innovation. Other researchers followed suit with their own elegant variations.[2] Although disproportionality analysis is not new,[3] data mining has become one of the most active areas in pharmacovigilance as a result of these developments.

Numerous published exercises have aimed to elucidate the performance characteristics of data mining algorithms (DMAs) when applied to spontaneous reporting systems (SRS) data.[4] Typically these include traditional performance metrics used to evaluate screening tests, namely sensitivity, specificity, predictive value and receiver operator characteristic (ROC) curves.[4] These studies enhanced our understanding of DMAs in pharmacovigilance, but their limitations prevented generalization to the reality of pharmacovigilance.[4]

The scientific basis underlying the computation of these performance measures for the DMAs used on SRS databases is arguable, considering that traditional pharmacovigilance methods usually fail to systematically evaluate and, therefore, identify as such, the signals that are true negative (Slattery J., European Medicines Agency, personal communication). As a result, the metrics may be biased and suboptimal for intuition or from a decision-theoretic perspective. Similarly, false-negative signals are, in general, not systematically reviewed.

In other words, the signals of disproportionate reporting (SDRs) that are not evaluated as potential signals may be assumed, by default, to be negative or spurious signals without any further consideration of whether these signals are true- or false-negatives. At a certain point in time, because of the lack of information or scientific evidence concerning a particular signal, it may also be impossible to assess whether a signal is a true- or false-positive.
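As a point of reference, the classical screening metrics named above can be sketched from a 2x2 adjudication table. This is our own minimal illustration, not code from the paper; the function name and counts are hypothetical.

```python
# Minimal sketch of the classical screening metrics discussed above,
# computed from adjudicated counts of true/false positives/negatives.
# All counts below are hypothetical, purely for illustration.
def screening_metrics(tp, fp, tn, fn):
    sensitivity = tp / (tp + fn)  # fraction of true signals that are flagged
    specificity = tn / (tn + fp)  # fraction of non-signals correctly not flagged
    ppv = tp / (tp + fp)          # positive predictive value of a flagged item
    return sensitivity, specificity, ppv

# e.g. 9 true positives among 30 flagged items gives a PPV of 0.3
sens, spec, ppv = screening_metrics(tp=9, fp=21, tn=60, fn=10)
```

The NND discussed below is, in essence, the reciprocal of such a PPV computed over reviewable items.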


The latter considerations reflect the reality of pharmacovigilance, in which rational decisions must often be made in the setting of residual uncertainty. Finally, studies that tried to compute the characteristics of DMAs in terms of sensitivity and specificity did not always specify with which definitions of signals (true or false) these studies were conducted, or used definitions of limited generalizability.

This performance approach (similar to the evaluation of diagnostic procedures) can promote over-reliance on individual metrics and overly simplistic viewpoints, such as an exclusive emphasis on reducing false-positive or false-negative findings. False positives have an adverse impact on pharmacovigilance, but an exclusive focus on reducing them may prevent useful knowledge discovery given the known tradeoffs between sensitivity and specificity. An exclusive focus on reducing false negatives may be self-defeating, overloading safety surveillance systems with information of limited use, diluting credible signals and diverting resources from the most fruitful hypotheses. Seeking an appropriate balance of sensitivity and specificity for the task at hand may be the most suitable approach.[5]

This article describes a simple and probably more intuitive metric to supplement the calculation and presentation of DMA performance by providing an additional view of some of the costs and utilities associated with selection of DMAs/metrics/thresholds: the 'number needed to detect' (NND).

The proposed metric is founded on the basic principles of how DMAs are used, the outputs they provide and the ways that these outputs may be assessed and impact the signal detection process. We first discuss the basic notions of DMA efficiency and the impact of DMAs on pharmacovigilance workflow (many of these concepts could be applied to any signal methodology, including traditional methods based on clinical heuristics). The NND naturally follows. We include a set of diagrammatic representations that should clarify the calculation and interpretation of the NND and its adaptive variations.

1. Data Mining Algorithm (DMA) Efficiency and Workflow Impacts

Understanding the effect of DMAs on a signal detection programme involves notions of classifier efficiency as well as workflow impacts.

As commonly used, contemporary DMAs may be viewed as classifiers because they output a list of reported drug-event combinations that are classified as quantitatively 'interesting' versus 'not interesting' based on observed versus 'expected' reporting frequency, so-called 'signals of disproportionate reporting' or SDRs[6,7] (figure 1). An SDR is observed when a drug-event combination exceeds the expected reporting frequency by a specified multiple. We use SDR rather than signal to emphasize that such a numerical result, when viewed in a biological vacuum, is not adequate to determine that a credible signal exists.

Once presented with the list, each SDR is reviewed against criteria to determine whether a credible signal exists that warrants a detailed investigation of the underlying cases. These criteria include novelty, seriousness and whether the event is readily attributable to the treatment indication.

Classifier efficiency relates to the fact that some SDRs will be found to be directly relevant to patient safety but many will turn out not to be relevant to patient safety for various reasons; for example, because they reflect confounding or reporting artefact. DMA efficiency understandably depends on the numbers of relevant versus irrelevant drug-event combinations that are highlighted as quantitatively interesting by the DMA.

Workflow impact is closely related to efficiency but also depends on the resulting activities required to evaluate the SDRs. These activities include the number of unique reports that would need to be reviewed in detail as a consequence of using a DMA.

To summarize, classifier efficiency is a function of the number of relevant versus irrelevant associations highlighted as quantitatively interesting by each DMA. Workflow impact will depend on the latter classification efficiency as well as the resulting evaluative activities, such as the number of unique reports comprising the SDRs that would need to be reviewed in detail.

2. Number Needed to Detect

The NND is the number of items (described in detail below) that would have to be reviewed per credible signal identified. Thus, it is an expression of the proportion of items highlighted by the DMA that turn out to be relevant to public safety, and a variation on positive predictive value.

In anticipation of disagreements about what is a 'credible signal', we note there is considerable semantic ambiguity and imprecision about the meaning of the word 'signal'[7] and anticipate that there will be significant variability in what signals experts consider 'credible'. The latter concept relates to the construction of reference sets of adverse events (i.e. gold standards). This is a crucial, related, yet distinct issue from defining performance metrics that is beyond the scope of this exposition. While the values of sensitivity, specificity and predictive value will vary depending on the gold standards used, the concepts and definitions of these metrics are independent of which gold standard is used. Our intention is not to pin down gold standards, which have been discussed in other publications,[8,9] but to suggest additional performance metrics.
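As a minimal sketch (our own illustration, not code from the paper), the NND over a list of adjudicated items is simply the item count divided by the number of credible signals, i.e. the reciprocal of the positive predictive value over those items:

```python
# Hypothetical sketch: NND computed over the items output by a DMA.
# `adjudications` holds one boolean per item (e.g. per SDR): True if
# review concluded that a credible signal exists. Names are ours.
def nnd(adjudications):
    credible = sum(adjudications)
    if credible == 0:
        return None  # no credible signal found; the NND is undefined
    return len(adjudications) / credible

# 30 items reviewed, of which 9 are credible signals: roughly 3.3
# items would have to be reviewed per credible signal detected
ratio = nnd([True] * 9 + [False] * 21)
```

The variants below (NNDSDR, NNDreports and so on) differ only in what counts as an "item".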



Fig. 1. Signals of disproportionate reporting (SDRs) produced by two data mining algorithms (DMAs). HLT = higher level term; MedDRA = Medical Dictionary for Regulatory Activities; SMQ = standardized MedDRA queries.

Such metrics can be applied whatever gold standard is selected by a particular investigator or organization.

Applicable items used to calculate the NND can be varied to collectively provide differing perspectives on workflow impacts and performance; for example, the item could be the SDR. The NNDSDR would be the number of SDRs that would have to be reviewed to detect a single credible signal (figure 2). If an organization can quickly determine which SDRs are labelled, they might choose to calculate this based on unlabelled SDRs (figure 3). This could be calculated based only on which SDRs involved unlabelled events. The corresponding number of case reports associated with the set of SDRs allows calculation of the NNDreports, which is the number of reports that would have to be reviewed per credible signal detected (figures 4 and 5). This provides more insight into workflow impact since there are usually multiple preferred terms (PTs) per report, and the actual impact on workflow may be more closely related to the number of reports reviewed than the number of PTs reviewed (figure 6).
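Under the same illustrative assumptions as above (our own sketch with an assumed data structure, not the authors' code), the NNDreports variant sums the case reports underlying each SDR before dividing by the number of credible signals:

```python
# Hypothetical sketch of NND_reports: each SDR carries the number of
# case reports that would need detailed review, plus its adjudication.
def nnd_reports(sdrs):
    # sdrs: list of (n_reports, is_credible) pairs (an assumed structure)
    total_reports = sum(n for n, _ in sdrs)
    credible = sum(1 for _, ok in sdrs if ok)
    if credible == 0:
        return None  # undefined when no credible signal is detected
    return total_reports / credible

# e.g. three SDRs backed by 10, 5 and 15 reports, two adjudicated
# credible: 30 reports reviewed per 2 credible signals
example = nnd_reports([(10, True), (5, False), (15, True)])
```

Note that this simple sum counts a report once per SDR it supports; figure 6 discusses why reports containing multiple SDR events complicate the workflow picture.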


[Figure 2: diagram of the SDR lists and adjudications for two DMAs. Worked values from the diagram: DMA 1, NNDSDR = (21 + 9)/9 = 3.3; DMA 2, NNDSDR = (8 + 4)/4 = 3.0.]

Fig. 2. NNDSDR: number of signals of disproportionate reporting (SDRs) that would have to be reviewed to detect a single credible signal. NNDSDR (DMA 1) > NNDSDR (DMA 2). Efficiency DMA 2 > DMA 1. DMA = data mining algorithm; FP = false positive; HLT = higher level term; MedDRA = Medical Dictionary for Regulatory Activities; n = no; NND = number needed to detect; SMQ = standardized MedDRA queries; TP = true positive; y = yes.


[Figure 3: diagram of the SDR lists restricted to unlabelled events. Worked values from the diagram: DMA 1, NNDSDR (unlabelled) = (17 + 2)/2 = 9.5; DMA 2, NNDSDR (unlabelled) = (7 + 1)/1 = 8.0.]

Fig. 3. NNDSDR (unlabelled): number of unlabelled signals of disproportionate reporting (SDRs) that would have to be reviewed to detect a single credible signal. The two unlabelled true positive (TP) SDRs identified by DMA 1 are medically related and may be counted as one medical concept in a variation of this metric. NNDSDR (unlabelled) (DMA 1) > NNDSDR (unlabelled) (DMA 2). Efficiency DMA 2 > DMA 1. DMA = data mining algorithm; FP = false positive; HLT = higher level term; MedDRA = Medical Dictionary for Regulatory Activities; n = no; NND = number needed to detect; SMQ = standardized MedDRA queries; y = yes.
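The unlabelled variant in figure 3 can be sketched by first restricting the SDR list to unlabelled events. Again, this is an illustrative sketch with an assumed data structure, not the authors' code:

```python
# Hypothetical sketch of NND_SDR (unlabelled): only SDRs involving
# unlabelled events are counted, mirroring figure 3.
def nnd_sdr_unlabelled(sdrs):
    # sdrs: list of (is_labelled, is_credible) pairs (assumed structure)
    unlabelled = [credible for labelled, credible in sdrs if not labelled]
    hits = sum(unlabelled)
    if hits == 0:
        return None  # no credible signal among unlabelled SDRs
    return len(unlabelled) / hits

# 19 unlabelled SDRs of which 2 are credible: 19/2 = 9.5
value = nnd_sdr_unlabelled(
    [(False, True)] * 2 + [(False, False)] * 17 + [(True, False)] * 11
)
```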


[Figure 4: diagram of the report counts underlying each SDR. Worked values from the diagram: DMA 1, NNDreports = (N1 + N2 + ... + N30)/9; DMA 2, NNDreports = (N1 + N2 + ... + N12)/4.]

Fig. 4. NNDreports: number of reports that would have to be reviewed per credible signal detected. Signals may also be counted as distinct medical concepts rather than number of distinct Medical Dictionary for Regulatory Activities (MedDRA) preferred terms. DMA = data mining algorithm; FP = false positive; HLT = higher level term; n = no; NND = number needed to detect; SDR = signals of disproportionate reporting; SMQ = standardized MedDRA queries; TP = true positive; y = yes.


[Figure 5: diagram including medically related drug-event combinations (DECs) with no reporting disproportionality. Worked values from the diagram: DMA 1, NNDreports-related = (N1 + ... + N7 + Nx+1 + ... + Nx+4)/2; DMA 2, NNDreports-related = (N1 + ... + N6 + Nx+1 + ... + Nx+5)/2.]

Fig. 5. NNDreports-related: number of reports that would have to be reviewed per credible signal detected, including medically related drug-event combinations (DECs) with no reporting disproportionality. For a single credible medical concept identified by both data mining algorithms (DMAs), as in the example, the NNDreports-related is identical. FP = false positive; HLT = higher level term; MedDRA = Medical Dictionary for Regulatory Activities; n = no; NND = number needed to detect; SDR = signals of disproportionate reporting; SMQ = standardized MedDRA queries; TP = true positive; y = yes.

Since clinically unrelated SDRs could appear in the same reports (figure 6), such reports would have to be reviewed more than once, so the NND can be medically stratified by grouping items; for example, based on the higher levels of the Medical Dictionary for Regulatory Activities (MedDRA) hierarchy (illustrated in figure 7). Similarly, a variation of the NNDSDR could include concept-based grouping of both true- and false-positive SDRs.

Different DMAs generally produce lists of SDRs that overlap but also have considerable differences (figure 1). That is to say, some SDRs will be uniquely associated with one or the other of the DMAs being compared. Some of these will be relevant to public safety and others will not. Therefore, the subsets of SDRs that are unique to each DMA will have their own unique fraction of relevant versus irrelevant SDRs and corresponding incremental utilities and opportunity costs. It could therefore be useful to calculate and compare the unique benefits/opportunity costs associated with each DMA.
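The comparison of DMA-specific SDR subsets described above can be sketched with set operations. This is our illustration under assumed structures, not the authors' implementation:

```python
# Hypothetical sketch of NND_unique: the SDRs flagged only by one DMA,
# divided by the credible signals unique to that DMA (cf. figure 8).
def nnd_unique(sdrs_a, sdrs_b, credible):
    # sdrs_a, sdrs_b: sets of drug-event combinations flagged by each
    # DMA; `credible`: the set adjudicated as credible signals
    unique_a = sdrs_a - sdrs_b          # SDRs flagged by DMA A alone
    unique_hits = unique_a & credible   # credible signals unique to DMA A
    if not unique_hits:
        return None  # DMA A detects no unique credible signal
    return len(unique_a) / len(unique_hits)

# toy example: A flags {1..5}, B flags {4,5,6}, credible signals {1,2};
# A has 3 unique SDRs covering 2 unique credible signals
ratio = nnd_unique({1, 2, 3, 4, 5}, {4, 5, 6}, {1, 2})
```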

[Figure 6: diagram partitioning all cases for a drug in the source database by review impact per individual case: cases with a unique SDR event (example PT: Nausea); cases with multiple SDR events of a unique concept (example PTs: Nausea, Vomiting, Epigastric discomfort); cases with multiple SDR events that are not conceptually related, or multiple concepts (example PTs: Hypersensitivity, Cataract); and cases without an SDR event.]

Fig. 6. Workflow impact of signals of disproportionate reporting (SDRs) in case reports. MedDRA = Medical Dictionary for Regulatory Activities; PT = preferred term.


[Figure 7: diagram with the SDRs stratified into Groups I, II and III. Worked values of NNDgroup 1 are shown for each DMA, first with signals counted as distinct terms and then with signals counted as distinct medical concepts.]

Fig. 7. NNDgroup: number of reports that would have to be reviewed per credible signal detected, stratified by grouping (e.g. dictionary hierarchy). Groupings may contain more than one medical concept. Signals may also be counted as distinct medical concepts rather than number of distinct Medical Dictionary for Regulatory Activities (MedDRA) preferred terms. DMA = data mining algorithm; FP = false positive; HLT = higher level term; n = no; NND = number needed to detect; SDR = signals of disproportionate reporting; SMQ = standardized MedDRA queries; TP = true positive; y = yes.
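The stratified variant of figure 7 can be sketched by accumulating reports and credible signals per grouping (e.g. an assumed dictionary-hierarchy grouping) before dividing. An illustrative sketch only, with structures and names of our own choosing:

```python
from collections import defaultdict

# Hypothetical sketch of NND_group: reports and credible signals are
# accumulated per grouping (e.g. MedDRA hierarchy level) and a
# per-group NND is returned; None marks groups with no credible signal.
def nnd_by_group(sdrs):
    # sdrs: list of (group, n_reports, is_credible) (assumed structure)
    totals = defaultdict(lambda: [0, 0])
    for group, n_reports, credible in sdrs:
        totals[group][0] += n_reports
        totals[group][1] += int(credible)
    return {
        group: (reports / hits if hits else None)
        for group, (reports, hits) in totals.items()
    }

# two gastrointestinal SDRs (one credible) and one dermatological SDR
per_group = nnd_by_group([("gi", 10, True), ("gi", 6, False), ("derm", 8, False)])
```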


[Figure 8: diagram of the SDRs unique to each DMA. Worked values from the diagram: DMA 1, NNDunique = (12 + 5)/5 = 3.4; DMA 2, NNDunique = (2 + 0)/0, i.e. no unique credible signal.]

Fig. 8. NNDunique: number of signals of disproportionate reporting (SDRs) that would have to be reviewed to detect one credible signal that is unique to that data mining algorithm (DMA). DMA 2 detects no unique credible signal. FP = false positive; HLT = higher level term; MedDRA = Medical Dictionary for Regulatory Activities; n = no; NND = number needed to detect; SMQ = standardized MedDRA queries; TP = true positive; y = yes.


The NNDunique is the number of SDRs/reports that would have to be uniquely reviewed by choosing one or another DMA/metric/threshold to detect one credible signal that is unique to that DMA/metric/threshold (figure 8).

Other variations are possible. For example, one can perform NND calculations that sum the SDRs/reports over time in order to assess the costs and utilities associated with differential time-to-signal between approaches. Finally, DMA performance could be further calibrated by calculating the NND, in all its variations, over a range of statistical thresholds, thus allowing users to clearly see the impact of threshold selection on performance. If the information is available, review time per case can be combined with the NNDreports to derive a measure of person-time per credible signal detected.

3. Conclusions

NNDs may facilitate a better understanding of the costs and utilities associated with different data mining protocols, a selection based on the relative premiums placed on sensitivity and specificity in a given local environment. We recommend the NND as an additional option that is both intuitively attractive and entails little risk, though real-world experience will be needed to determine its incremental value. However, the NND is not a panacea. Indeed, as our examples clearly demonstrate, even with a simple metric, accurately measuring data mining performance may still be significantly complicated by the manner in which the data are recorded and the many nuances in the design, interpretation and extrapolation of data mining validation exercises. Furthermore, workflow impacts may go beyond reviewing case series, such as performing a formal epidemiological study. Such impacts are not captured with this metric. Some of the aforementioned issues may be mitigated by improvements in adverse event coding using automated knowledge-based approaches to terminological grouping of adverse events.[10]

Finally, it is important to note that while the NND in this context is superficially similar to the concept of the NND in medical screening, it is distinct and not susceptible to the same limitations as the former, which involves the total number screened to prevent an outcome. However, some biases observed in medical screening may also be an issue in the area of signal detection in pharmacovigilance.[4,11]

Acknowledgements

The views expressed in this paper are those of the authors and not necessarily the official view of the European Medicines Agency. No sources of funding were used to assist in the preparation of this review. The authors have no conflicts of interest that are directly relevant to the content of this review.

The authors would like to thank the following individuals who generously shared their time in reviewing early versions of this manuscript and providing thoughtful insights: Alan Hochberg, Jeffrey Aronson, David Goldsmith, David Madigan and Panos Tsintis.

References

1. Bate A, Lindquist M, Edwards IR, et al. A Bayesian neural network method for adverse drug reaction signal generation. Eur J Clin Pharmacol 1998; 54: 315-21
2. DuMouchel W. Bayesian data mining in large frequency tables with an application to the FDA spontaneous reporting system. Am Stat 1999; 53 (3): 177-202
3. Moore N, Thiessard F, Begaud B. The history of disproportionality measures (reporting odds ratio, proportional reporting rates) in spontaneous reporting of adverse drug reactions. Pharmacoepidemiol Drug Saf 2005; 14: 285-6
4. Hauben M, Madigan D, Gerrits C, et al. The role of data mining in pharmacovigilance. Expert Opin Drug Saf 2005; 4 (5): 929-48
5. Hauben M. Signal detection in the pharmaceutical industry: integrating clinical and computational approaches. Drug Saf 2007; 30 (7): 627-30
6. Hauben M, Reich L, Chung S. Post-marketing surveillance of potentially fatal reactions to oncology drugs: potential utility of two signal detection algorithms. Eur J Clin Pharmacol 2004; 60 (10): 747-50
7. Hauben M, Reich L. Communication of findings in pharmacovigilance: use of the term "signal" and the need for precision in its use. Eur J Clin Pharmacol 2005; 61 (5-6): 479-80
8. Hauben M, Reich L. Reply: the evaluation of data mining methods for the simultaneous and systematic detection of safety signals in large databases: lessons to be learned. Response to letter by Levine, et al. Br J Clin Pharmacol 2006; 61: 115-7
9. Hauben M, Aronson JK. Gold standards in pharmacovigilance: the use of definitive anecdotal reports of adverse drug reactions as pure gold and high-grade ore. Drug Saf 2007; 30: 645-55
10. Henegar C, Bousquet C, Lillo-Le Louet A, et al. A knowledge-based approach for automated signal generation in pharmacovigilance. Proceedings of the 11th World Congress on Medical Informatics; 2004 Sep 7-11; San Francisco (CA) [online]. Available from URL: http://cmbi.bjmu.edu.cn/news/report/2004/medinfo2004/pdffiles/papers/4696Henegar.pdf [Accessed 2008 Jan 28]
11. Hauben M, Van Puijenbroek EP, Gerrits C, et al. Data mining in pharmacovigilance: lessons from phantom ships. Eur J Clin Pharmacol 2006; 62 (11): 967-70

Correspondence: Dr Manfred Hauben, Risk Management Strategy, Pfizer Inc., 235 E. 42nd Street, New York, NY 10017-5755, USA.
E-mail: [email protected]
