The Persistence of Error (2011 CrossRef Annual Meeting)

Post on 10-Jul-2015

Phil Davis, Ph.D.

pmd8@cornell.edu

The Persistence of Error: A Study of Retracted Articles on the Internet and in Personal Libraries

2011 CrossRef Annual Member Meeting

November 15, 2011

An Elegant Solution to a Poorly Understood Problem

2 November 21, 2011

What We Know

• Number of retractions small but increasing (Wager & Williams, 2011; Steen, 2011)

• Retracted articles continue to be cited as valid studies (Budd et al., 1998, 2011; Redman et al., 2008)

• Journal publishers are inconsistent in alerting readers: 41% of articles are watermarked, 32% contain no notification anywhere (Steen, 2011)

• Most publishers allow some form of self-archiving (SHERPA/Romeo; Morris, 2009)

• Authors often ignore publisher policy (Davis & Connolly, 2007)

• Journal articles are likely to be found on non-publisher websites (Wren, 2005)

What We Assume

• Reaching readers is a communication problem that is not being solved by publishers and indexers alone.

• There is more than one access conduit to the scholarly literature

• Proliferation of article versions

• Scholars hoard articles in personal libraries

• Article status is static unless stated otherwise

• As retraction numbers are small, little incentive to search for updates (high-cost, low return)

What We Don’t Know

• Extent of proliferation of retracted papers on the public internet (out of the control of the publisher)

• Where they exist and which version(s)?

• What exists in readers’ personal libraries?

What We Did

1. Searched for copies of retracted papers on the public Internet. Excluded published version on publisher’s website

2. Created an API that searched the Mendeley database for retracted articles

PMC (no notice on page view or pdf)

PMC (notice but not on pdf)

Advanced publication

Final manuscript on publisher’s site

Author manuscript in library repository

Pub version in repository

Reviewer manuscript in repository

Author website

Classes

Hospital Labs

Journal clubs

Medical schools

University Research Institutes

Advocacy

Commercial websites

Author, medical business

Aggregation sites

Entire issue

Clearinghouses

[Figure: Public Copies on the Web. Retracted articles (y-axis, 0 to 180) by year (x-axis, 1973 to 2010), with series "No public copies" and "Found public copies".]

Summary of Web Study

• 1,779 retracted articles from PubMed (1973-2010)

• 308 (12%) publicly accessible copies (excluding the published version on the journal website)

• 29 could be found in more than one location (max 5)

• 90% of copies were the published version; 9% final manuscripts; 1% other

• 41% in PMC; 28% on educational sites; 7% commercial

• 24% of copies carried retraction notices (5% excluding PMC page view)
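
The slides do not say how the list of retracted PubMed articles was compiled. As a minimal sketch, assuming the list could be drawn from PubMed's "Retracted Publication" publication type via the NCBI E-utilities ESearch endpoint, a query URL for a year range might be built as follows. The function name `build_retraction_query` and the exact search term are illustrative assumptions, not the study's method.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_retraction_query(start_year: int, end_year: int, retmax: int = 100) -> str:
    """Build an ESearch URL for retracted publications in a year range.

    The search term is an assumption for illustration, not the study's query.
    """
    if start_year > end_year:
        raise ValueError("start_year must not exceed end_year")
    # [pt] restricts by publication type, [dp] by date of publication.
    term = f'"retracted publication"[pt] AND {start_year}:{end_year}[dp]'
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Prints the URL only; no network request is made here.
    print(build_retraction_query(1973, 2010))
```

Only the URL is built here; fetching and paging through results is left out so the sketch stays independent of network access.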

A window into what is on computers

Mendeley API

Our API: http://www.fireisborn.org/retract/

Results from Mendeley

• 75% (1,340 of 1,779 records) could be found in Mendeley (mean readers = 3.4, max = 133)

• Caveat: We are not certain if they have the PDF

• Concentration of “readers” in top journals

• High-readership articles had more than 3 times the odds of being found on public (non-repository) websites (OR 3.28, 2.33-4.61, p < .0001)
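
For readers unfamiliar with the odds ratio (OR) reported above, the sketch below shows how an OR and an approximate confidence interval can be computed from a 2x2 table. It assumes a Wald interval on the log scale; the slides do not state which method produced the reported interval, and the counts in the usage example are hypothetical, not the study's data.

```python
import math


def odds_ratio_with_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Return (odds ratio, lower, upper) for a 2x2 table using a Wald interval.

    a: high-readership articles found on public websites
    b: high-readership articles not found
    c: other articles found
    d: other articles not found
    z: normal quantile for the interval (1.96 gives roughly 95%)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, NOT the study's data.
    or_value, lower, upper = odds_ratio_with_ci(20, 80, 10, 90)
    print(f"OR {or_value:.2f} ({lower:.2f}-{upper:.2f})")
```

An interval that excludes 1, as the reported 2.33-4.61 does, indicates that the association is unlikely to be due to chance alone.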

Implications

• The problem of persistence cannot be controlled by copyright. Publishers lack control of articles

• Increased access comes with a versioning problem

• Essential problem: How do you reach readers when a Version of Record is no longer a Version of Record?

Solutions

Given that 90% of public copies are the publisher version, CrossMark would be seen by the future reader

Caveats:

• Reader is still responsible for initiating the verification check

• Authors often write directly from bibliographic software

• Doesn’t prevent reuse/recycling of citations

• Doesn’t automatically update older PDFs (without symbol)

• Institutional self-archiving mandates may increase author manuscripts

1. Before Reading

2. Before Writing

3. Before Publication

Tripartite Solution

1. Before Reading

2. Before Writing

3. Before Publication
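
The "Before Writing" step can be pictured as a check run over a personal reference library. The sketch below is only an illustration of that idea: it assumes the library and a list of retracted articles are both available as DOI strings, and the helper names `normalize_doi` and `flag_retracted` and the example DOIs are hypothetical. It does not represent CrossMark or any reference manager's actual interface.

```python
# Common ways a DOI is written with a resolver or scheme prefix.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip a leading resolver prefix.

    DOIs are case-insensitive, so lowercasing is safe for comparison.
    """
    doi = doi.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    return doi


def flag_retracted(library_dois, retracted_dois):
    """Return the library entries whose DOI appears in the retracted list."""
    retracted = {normalize_doi(d) for d in retracted_dois}
    return [d for d in library_dois if normalize_doi(d) in retracted]


if __name__ == "__main__":
    # Hypothetical DOIs for illustration only.
    library = ["doi:10.1000/ABC", "10.1000/xyz"]
    retracted = ["https://doi.org/10.1000/abc"]
    print(flag_retracted(library, retracted))
```

Entries are returned in their original form so that a reader can locate them in the library as stored.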
