Rapid information retrieval by creating a parallel implementation of Medline Bob Badgett Dept of Medicine UTHSC San Antonio 1/2006

Page 1

Rapid information retrieval by creating a parallel implementation of Medline

Bob Badgett
Dept of Medicine
UTHSC San Antonio
1/2006

Page 2

As Mark Twain reportedly put it, "Be careful about reading health books; you may die of a misprint"

Johnson T. NEJM 1998

Page 3

Only errors that led to proximate adverse event

Discharges have 12% adverse events

Page 4

The most common diagnosis in primary care is…

• Questions occur in 1/3 of visits
– We pursue answers to 55% of their questions
– Find answers to 70% (with difficulty in 40%)
– Result is only 40% of their questions being answered (guessing in 60%)

• The “diagnosis of information failure” occurs in about 20% of patients
– Twice as common as the most frequent single primary care diagnosis
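The percentages on this slide appear to chain together. As a rough check, assuming the 55% and 70% figures multiply and that unanswered questions per visit stand in for "information failure" per patient (both are assumptions about how the slide derives its totals, not something it states):

```python
# Rough check of the slide's figures, assuming the rates simply multiply.
questions_per_visit = 1 / 3   # questions occur in 1/3 of visits
pursued = 0.55                # answers pursued for 55% of questions
found = 0.70                  # answers found for 70% of pursued questions

answered = pursued * found                        # share of questions answered
unanswered = 1 - answered                         # share left to guessing
info_failure = questions_per_visit * unanswered   # share of visits affected

print(f"answered: {answered:.1%}")
print(f"unanswered: {unanswered:.1%}")
print(f"information failure: {info_failure:.1%}")
```

Under these assumptions the results land close to the slide's rounded 40%, 60% and 20%.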

Page 5

MEDLINE searching is misery when in a hurry
– 30 minutes to search
– 50% of clinical searches by experts fail
– Compared to librarians, clinicians find
• 50% fewer relevant articles
• 50% more irrelevant articles

Doctors have two minutes available

Page 6

Current search engine

• http://SUMSearch.uthscsa.edu
– Live searching of MEDLINE
– Iterative searching
– 400 - 500 queries per day
– Internationally recognized
• Review: equivalent to PubMed
– Basis of current grant proposals
NLM in collaboration with American College of Physicians, Thomson-MicroMedex, others

Page 7

Current method

• http://sumsearch.uthscsa.edu
– Externally searches MEDLINE via PubMed
– PubMed’s publicly stated limit is one search every 8 seconds
– We do ~6 per query
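To see why the stated limit matters, here is a back-of-envelope sketch using only the slide's two figures. The function names are hypothetical, and the sketch assumes searches are sent strictly one after another at the stated spacing, which may not match how the live site actually behaves.

```python
# Back-of-envelope latency under PubMed's stated limit (figures from the slide).
SECONDS_PER_SEARCH = 8     # one search every 8 seconds
SEARCHES_PER_QUERY = 6     # "~6 per query"

def min_seconds_per_query(searches, spacing):
    """Time until the last search may be issued, if the first is sent at t = 0."""
    return max(searches - 1, 0) * spacing

def max_queries_per_day(searches, spacing):
    """Upper bound on user queries per day if every search is spaced out."""
    return 86_400 / (searches * spacing)

print(min_seconds_per_query(SEARCHES_PER_QUERY, SECONDS_PER_SEARCH))  # 40
print(max_queries_per_day(SEARCHES_PER_QUERY, SECONDS_PER_SEARCH))    # 1800.0
```

Under these assumptions a fully compliant query cannot finish issuing its searches in under 40 seconds.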

Page 8

Users of proposal

• Department of Medicine, UTHSC San Antonio
– Bob Badgett
• School of Health Information Sciences, UTHSC Houston
– Elmer Bernstam

Page 9

Page 10

Knowledge management – 1. Vastness

[Figure: MEDLINE articles (not letters) and Harrison's pages, by year. Left axis: articles not letters, 0 to 12,000,000. Right axis: pages, 0 to 3000. Annotations: USPSTF 1 – 1989: 60 topics; USPSTF 2 – 1996: 70 topics; USPSTF 3 – 2000-2003: >80 topics; Prevention: 7.4 hours/day; Rx: increasing # of meds]

Page 11

Knowledge management – Vast & complex

• Articles come
– 13 million citations
– Half million added per year
– MEDLINE’s doubling time is 15 years

• Articles go
– 1/3 of research eventually refuted/attenuated
• JAMA. 2005. PMID: 16014596
– Original studies - T1/2 = 45 years
• Ann Intern Med. 2002. PMID: 12069563
– Practice guidelines – T1/2 = 6 years
• JAMA. 2001. PMID: 11572738

• Some articles never should have been
– 25 of 33 streptokinase studies maybe were not needed. PMID: 1614465

• But there is more…
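The doubling time and half-lives on this slide can be turned into yearly rates. The sketch below assumes simple exponential growth and decay, which the slide does not state explicitly; the function names are illustrative.

```python
def annual_growth_rate(doubling_years):
    """Yearly growth rate implied by a doubling time, assuming exponential growth."""
    return 2 ** (1 / doubling_years) - 1

def fraction_remaining(years, half_life):
    """Share of items still standing after `years`, assuming exponential decay."""
    return 0.5 ** (years / half_life)

# MEDLINE doubling every 15 years
print(f"growth per year: {annual_growth_rate(15):.1%}")
# Practice guidelines, T1/2 = 6 years
print(f"guidelines still valid after 6 years: {fraction_remaining(6, 6):.0%}")
# Original studies, T1/2 = 45 years
print(f"original studies still valid after 10 years: {fraction_remaining(10, 45):.0%}")
```

A 15-year doubling time implies growth of a little under 5% per year, which is in the same range as half a million additions on a base of 13 million citations.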

Page 12

Knowledge management – Misinformation

• Manuscript reviewers prefer manuscripts they agree with
– J Lab Clin Med. 1994. PMID: 8051481

• Quality of reviews and textbooks
– Original author misquoted in 15% of references
– Errors in citation of references - 25%
BMJ 1985. PMID: 3931753

• Biases that hinder dissemination
– Publication bias against negative studies
BMJ 1998. PMID: 98113104

• Industry sponsored research
• Media coverage of unpublished articles
– 1/3 never published

Page 13

Proposed search engine

http://medinformatics.uthscsa.edu/grant-public/

Page 14

Overall strategy

• Search ‘systematic textbook’
– PIER (American College of Physicians)

• Depending on query
– National Guidelines Clearinghouse
– FDA
– CDC
– Others

• In case nothing found (20%?)
– Evidence is too subtle or recent
– MEDLINE
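One way to read the strategy above is as a tiered lookup: the textbook first, then any sources that fit the query, and MEDLINE only when everything else comes back empty. The sketch below is an illustration under that reading; `overall_search`, the stub sources, and the "drug" keyword test are all hypothetical and not part of the proposal.

```python
from typing import Callable, List, Tuple

Search = Callable[[str], List[str]]   # hypothetical: query -> list of hits
Applies = Callable[[str], bool]       # hypothetical: does this source fit the query?

def overall_search(query: str, textbook: Search,
                   conditional: List[Tuple[Applies, Search]],
                   fallback: Search) -> List[str]:
    """Textbook first, then query-dependent sources, fallback only if all are empty."""
    hits = list(textbook(query))
    for applies, search in conditional:
        if applies(query):
            hits.extend(search(query))
    if not hits:
        hits = list(fallback(query))
    return hits

# Stub sources for illustration only; real ones would call each service.
def pier(query: str) -> List[str]:
    return ["PIER module"] if "asthma" in query else []

def fda(query: str) -> List[str]:
    return ["FDA label"]

def medline(query: str) -> List[str]:
    return ["MEDLINE citation"]

conditional = [(lambda q: "drug" in q, fda)]
common = overall_search("asthma drug", pier, conditional, medline)
rare = overall_search("rare syndrome", pier, conditional, medline)
print(common)   # textbook and FDA hits, MEDLINE not consulted
print(rare)     # nothing found upstream, so MEDLINE is the fallback
```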

Page 15

MEDLINE the data

• 15 million records in XML
– Currently 52 GB
– Growing at 6 GB per year

• Updated weekly

• Its thesaurus, MeSH, has about 23 thousand descriptors and is updated yearly

• The UMLS meta-thesaurus has 5 million concept names
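A file of this size is usually read as a stream rather than loaded whole. The sketch below shows that approach with Python's standard library on a tiny inline sample. The element names (`MedlineCitationSet`, `MedlineCitation`, `PMID`, `ArticleTitle`) are a simplified assumption about the file layout; real MEDLINE records nest more deeply and carry many more fields.

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for a MEDLINE file; the layout here is a simplified assumption.
SAMPLE = b"""<MedlineCitationSet>
  <MedlineCitation><PMID>1</PMID><ArticleTitle>First title</ArticleTitle></MedlineCitation>
  <MedlineCitation><PMID>2</PMID><ArticleTitle>Second title</ArticleTitle></MedlineCitation>
</MedlineCitationSet>"""

def iter_citations(stream):
    """Yield (pmid, title) pairs one record at a time."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "MedlineCitation":
            yield elem.findtext("PMID"), elem.findtext("ArticleTitle")
            elem.clear()  # drop the record's children once it has been read

records = list(iter_citations(io.BytesIO(SAMPLE)))
print(records)

# Average record size implied by the slide, taking 1 GB as 10**9 bytes.
avg_bytes = 52e9 / 15e6
print(f"{avg_bytes:.0f} bytes per record on average")
```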

Page 16

MEDLINE Strategy

• Original studies: 3-4 iterations with increasingly restrictive limits
• Systematic reviews: 3-4 iterations with increasingly restrictive limits
• Practice guidelines: 3-4 iterations with increasingly restrictive limits
• Other types: 3-4 iterations with increasingly restrictive limits

12 searches per query

Need subsecond
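The count of 12 searches follows from four article types with about three iterations each. The sketch below illustrates that pattern: each type is searched repeatedly with tighter limits, and the four types run in parallel. `count_hits` is a stub with made-up hit counts, and `MAX_HITS` is a hypothetical cutoff; neither comes from the slides.

```python
from concurrent.futures import ThreadPoolExecutor

CATEGORIES = ["original studies", "systematic reviews",
              "practice guidelines", "other types"]
MAX_HITS = 20  # hypothetical target size for a usable result list

def count_hits(query, category, level):
    """Stub for one MEDLINE search; each stricter level returns fewer hits."""
    return 400 // (4 ** level)   # made-up numbers: 400, 100, 25

def iterate(query, category, max_iterations=3):
    """Tighten the limits until the result list is small enough or we give up."""
    searches = 0
    hits = 0
    for level in range(max_iterations):
        hits = count_hits(query, category, level)
        searches += 1
        if hits <= MAX_HITS:
            break
    return category, hits, searches

# Run the four article types side by side rather than one after another.
with ThreadPoolExecutor(max_workers=len(CATEGORIES)) as pool:
    results = list(pool.map(lambda c: iterate("asthma", c), CATEGORIES))

total_searches = sum(searches for _category, _hits, searches in results)
print(total_searches)  # 12
```

With 12 searches behind every query, even modest per-search latency adds up, which is the motivation for needing each search to return in under a second.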