the persistence of error (2011 crossref annual meeting)

TRANSCRIPT

Page 1: The Persistence of Error (2011 CrossRef Annual Meeting)

Phil Davis, Ph.D.

[email protected]

The Persistence of Error: A Study of Retracted Articles on the Internet and in Personal Libraries

2011 CrossRef Annual Member Meeting

November 15, 2011

Page 2: The Persistence of Error (2011 CrossRef Annual Meeting)

An Elegant Solution to a Poorly Understood Problem

2 November 21, 2011

Page 3: The Persistence of Error (2011 CrossRef Annual Meeting)

What We Know

• Number of retractions small but increasing (Wager & Williams, 2011; Steen, 2011)

• Retracted articles continue to be cited as valid studies (Budd et al., 1998, 2011; Redman et al., 2008)

• Journal publishers are inconsistent in alerting readers: 41% of articles watermarked, 32% contain no notification anywhere (Steen, 2011)

• Most publishers allow some form of self-archiving (SHERPA/Romeo; Morris, 2009)

• Authors often ignore publisher policy (Davis & Connolly, 2007)

• Journal articles are likely to be found on non-publisher websites (Wren, 2005)

Page 4: The Persistence of Error (2011 CrossRef Annual Meeting)

What We Assume

• Reaching readers is a communication problem that is not being solved by publishers and indexers alone.

• There is more than one access conduit to the scholarly literature

• Proliferation of article versions

• Scholars hoard articles in personal libraries

• Article status is static unless stated otherwise

• As retraction numbers are small, little incentive to search for updates (high-cost, low return)

Page 5: The Persistence of Error (2011 CrossRef Annual Meeting)

What We Don’t Know

• Extent of proliferation of retracted papers on the public internet (out of the control of the publisher)

• Where they exist and which version(s)?

• What exists in readers’ personal libraries?

Page 6: The Persistence of Error (2011 CrossRef Annual Meeting)

What We Did

1. Searched for copies of retracted papers on the public Internet. Excluded published version on publisher’s website

2. Created an API that searched the Mendeley database for retracted articles
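
The second step amounts to looking up each retracted article in a reference-manager catalog. The sketch below shows one way such a lookup could work, assuming records are matched by DOI. The slides do not say how the lookup was keyed, and no real Mendeley API call is shown. The record layout, the DOIs, and the names `normalize_doi` and `match_retracted` are hypothetical.

```python
from statistics import mean


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip common prefixes so variants compare equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def match_retracted(retracted_dois, catalog):
    """Return {doi: reader_count} for retracted DOIs present in the catalog."""
    index = {normalize_doi(rec["doi"]): rec["readers"] for rec in catalog}
    found = {}
    for doi in retracted_dois:
        key = normalize_doi(doi)
        if key in index:
            found[key] = index[key]
    return found


# Hypothetical data, for illustration only (not from the study).
retracted = ["10.1000/example.1", "doi:10.1000/EXAMPLE.2", "10.1000/example.3"]
catalog = [
    {"doi": "https://doi.org/10.1000/example.1", "readers": 4},
    {"doi": "10.1000/example.2", "readers": 1},
    {"doi": "10.1000/other.9", "readers": 12},
]

found = match_retracted(retracted, catalog)
coverage = len(found) / len(retracted)
print(f"{len(found)} of {len(retracted)} found ({coverage:.0%}); "
      f"mean readers = {mean(found.values()):.1f}, max = {max(found.values())}")
```

A match here only shows that a catalog record exists. It says nothing about whether a reader holds the PDF.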

Page 7: The Persistence of Error (2011 CrossRef Annual Meeting)

PMC (no notice on page view or pdf)

Page 8: The Persistence of Error (2011 CrossRef Annual Meeting)

PMC (notice but not on pdf)

Page 9: The Persistence of Error (2011 CrossRef Annual Meeting)

Page 10: The Persistence of Error (2011 CrossRef Annual Meeting)

Advanced publication

Page 11: The Persistence of Error (2011 CrossRef Annual Meeting)

Final manuscript on publisher’s site

Page 12: The Persistence of Error (2011 CrossRef Annual Meeting)

Author manuscript in library repository

Page 13: The Persistence of Error (2011 CrossRef Annual Meeting)

Pub version in repository

Page 14: The Persistence of Error (2011 CrossRef Annual Meeting)

Reviewer manuscript in repository

Page 15: The Persistence of Error (2011 CrossRef Annual Meeting)

Author website

Page 16: The Persistence of Error (2011 CrossRef Annual Meeting)

Classes

Page 17: The Persistence of Error (2011 CrossRef Annual Meeting)

Hospital Labs

Page 18: The Persistence of Error (2011 CrossRef Annual Meeting)

Journal clubs

Page 19: The Persistence of Error (2011 CrossRef Annual Meeting)

Medical schools

Page 20: The Persistence of Error (2011 CrossRef Annual Meeting)

University Research Institutes

Page 21: The Persistence of Error (2011 CrossRef Annual Meeting)

Advocacy

Page 22: The Persistence of Error (2011 CrossRef Annual Meeting)

Commercial websites

Page 23: The Persistence of Error (2011 CrossRef Annual Meeting)

Author, medical business

Page 24: The Persistence of Error (2011 CrossRef Annual Meeting)

Aggregation sites

Page 25: The Persistence of Error (2011 CrossRef Annual Meeting)

Entire issue

Page 26: The Persistence of Error (2011 CrossRef Annual Meeting)

Clearinghouses

Page 27: The Persistence of Error (2011 CrossRef Annual Meeting)

[Figure: Public Copies on the Web. Chart of retracted articles by year, 1973-2010 (y-axis: Retracted articles, 0-180; x-axis: Year), with two series: No public copies and Found public copies.]

Page 28: The Persistence of Error (2011 CrossRef Annual Meeting)

Summary of Web Study

• 1,779 retracted articles from PubMed (1973-2010)

• 308 (12%) publicly-accessible copies (excluding published version on journal website)

• 29 could be found in more than one location (max 5)

• 90% of copies were published version; 9% final manuscripts; 1% other

• 41% in PMC; 28% on educational sites; 7% commercial

• 24% of copies with retraction notices (5% excluding PMC page view)

Page 29: The Persistence of Error (2011 CrossRef Annual Meeting)

A window into what is on computers

Page 30: The Persistence of Error (2011 CrossRef Annual Meeting)

Mendeley API

Our API: http://www.fireisborn.org/retract/

Page 31: The Persistence of Error (2011 CrossRef Annual Meeting)

Results from Mendeley

• 75% (1,340 of 1,779 records) could be found in Mendeley (mean readers = 3.4, max = 133)

• Caveat: We are not certain if they have the PDF

• Concentration of “readers” in top journals

• High readership articles more than 3x as likely to be found on public (non-repository) websites (OR 3.28, 2.33-4.61, p<.0001)
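
The reported figures can be checked for internal consistency with a few lines of arithmetic. The sketch below recomputes the 75% share and tests whether the odds ratio of 3.28 sits at the log-scale midpoint of the interval 2.33-4.61. It assumes the interval is a 95% Wald interval on the log-odds scale, which the slide does not state, so the derived standard error and p-value are only indicative.

```python
import math

# Figures as reported on the slide.
found, total = 1340, 1779
odds_ratio, lower, upper = 3.28, 2.33, 4.61

# Share of retracted records found in Mendeley.
share = found / total

# Assumption: 95% Wald interval, symmetric around the estimate on the log scale.
log_midpoint = math.exp((math.log(lower) + math.log(upper)) / 2)
std_error = (math.log(upper) - math.log(lower)) / (2 * 1.96)

# Two-sided p-value for the implied z statistic under a normal approximation.
z = math.log(odds_ratio) / std_error
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"share found: {share:.1%}")
print(f"log-scale midpoint of interval: {log_midpoint:.2f}")
print(f"implied z = {z:.2f}, two-sided p = {p_two_sided:.2e}")
```

Under that assumption the midpoint agrees with the reported odds ratio to two decimal places, and the implied p-value falls well below the reported .0001 threshold.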

Page 32: The Persistence of Error (2011 CrossRef Annual Meeting)

Implications

• The problem of persistence cannot be controlled by copyright. Publishers lack control of articles

• Increased access comes with a versioning problem

• Essential problem: How do you reach readers when a Version of Record is no longer a Version of Record?

Page 33: The Persistence of Error (2011 CrossRef Annual Meeting)

Solutions

Given that 90% of public copies are the publisher version, CrossMark would be seen by the future reader

Caveats:

• Reader still responsible for initiating the verification check

• Authors often write directly from bibliographic software

• Doesn’t prevent reuse/recycling of citations

• Doesn’t automatically update older PDFs (without symbol)

• Institutional self-archiving mandates may increase author manuscripts

Page 34: The Persistence of Error (2011 CrossRef Annual Meeting)

1. Before Reading

Page 35: The Persistence of Error (2011 CrossRef Annual Meeting)

2. Before Writing

Page 36: The Persistence of Error (2011 CrossRef Annual Meeting)

3. Before Publication

Page 37: The Persistence of Error (2011 CrossRef Annual Meeting)

Tripartite Solution

1. Before Reading

2. Before Writing

3. Before Publication
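
A check at the third point, before publication, could be a screen of a manuscript's reference list against a list of retracted DOIs. The sketch below is an illustration only, not the mechanism described in the talk. The DOI pattern is a common heuristic that will miss some older DOIs, and the reference text, the retracted set, and the names `extract_dois` and `flag_retracted` are hypothetical.

```python
import re

# Heuristic pattern for DOIs (an assumption; not every valid DOI matches it).
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>]+', re.IGNORECASE)


def extract_dois(text):
    """Return the set of lowercase DOIs found in free text."""
    dois = set()
    for match in DOI_PATTERN.findall(text):
        # Trim sentence punctuation that often trails a DOI in a reference.
        dois.add(match.rstrip(".,;)").lower())
    return dois


def flag_retracted(reference_text, retracted_dois):
    """Return the sorted DOIs in the text that appear in the retracted set."""
    known = {doi.lower() for doi in retracted_dois}
    return sorted(extract_dois(reference_text) & known)


# Hypothetical reference list and retraction list, for illustration only.
references = """
1. Author A. Example study. J Example. 2005. doi:10.1000/example.1.
2. Author B. Another study. J Example. 2008. https://doi.org/10.1000/example.7
"""
retracted = {"10.1000/example.1"}

flagged = flag_retracted(references, retracted)
print(flagged)
```

The same function could run at the earlier checkpoints, for example over a personal library export before reading or over a draft bibliography before writing.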
