Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Funded by the Competence Centers Programme. ISSI 2013 – Altmetrics 2, 15 July 2013. know-center.tugraz.at
Research in Progress (RiP). Christian Schlögl, Juan Gorraiz, Christian Gumpenberger, Kris Jack, Peter Kraker

Uploaded by peter-kraker on 10 May 2015. Category: Education.

DESCRIPTION

Presentation of the paper with Christian Schlögl, Juan Gorraiz, Christian Gumpenberger, and Kris Jack at ISSI 2013

TRANSCRIPT

Page 1: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Funded by the Competence Centers Programme

ISSI 2013 – Altmetrics 2, 15 July 2013

know-center.tugraz.at

Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal (RiP)*

Christian Schlögl, Juan Gorraiz, Christian Gumpenberger, Kris Jack, Peter Kraker

* Research in Progress

Page 2: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

© Know-Center 2011

2

www.know-center.at

Introduction

Many studies have compared download and citation data (Moed 2005, Bollen & Van De Sompel 2008, Schlögl & Gorraiz 2011)

Possible sources for download data

Repositories/preprint archives

Open access journals

E-journals

Recently, online reference systems have received a lot of attention as a possible source for altmetrics

A few studies have compared readership and citation data (Bar-Ilan 2012, Li and Thelwall 2012, Kraker et al. 2012)

In this study, we compare citations, downloads, and readership for the Journal of Strategic Information Systems

Page 3: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Research Questions

Are the most cited articles also the most downloaded ones, and the ones found most frequently in user libraries of the collaborative reference management system Mendeley?

Do citations, downloads, and readership have different obsolescence characteristics at publication level?

Are there other features in which citation, download and readership data differ?

Page 4: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Data

The Journal of Strategic Information Systems (JoSIS)

“The Journal of Strategic Information Systems focuses on the management, business and organizational issues associated with the introduction and utilization of information systems as a strategic tool, and considers these issues in a global context.” http://www.journals.elsevier.com/the-journal-of-strategic-information-systems/

Period of analysis: 2002-2011; 321 documents

Data sources:

ScienceDirect (SD): monthly download data (PDF & HTML)

Scopus: monthly citation data

Mendeley: monthly additions to user libraries (full length articles)

Page 5: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Mendeley

Online reference management system

Organizing personal research library

Creating user profile

Reading and annotating of PDFs

Forming private and public groups

Sharing of references/PDFs

Crowdsourced Mendeley research catalog

2.5 million users

428 million user documents

~75 million unique articles

http://www.mendeley.com/research-papers/

Page 6: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Methodology

Preprocessing

Matching documents between ScienceDirect and Scopus

No unique key shared by SD and Scopus

Different document types in SD and Scopus

Matching via title, journal, volume/issue, and page

Matching documents between Scopus and Mendeley via title (Levenshtein ratio 1/15.83); all but 5 documents were found

Descriptive statistics

Document types, publication dates, downloads, readers

Correlation analysis

Downloads vs. cites, readers vs. cites, downloads vs. readers
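The title-matching step in the preprocessing can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the normalization (lowercasing, collapsing whitespace) is an added assumption, and the slide's "1/15.83" is read here as "at most one edit per 15.83 characters", which is only one possible interpretation.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


def normalize(title: str) -> str:
    """Assumed normalization: lowercase and collapse runs of whitespace."""
    return " ".join(title.lower().split())


# Assumption: "1/15.83" means at most one edit per 15.83 characters of title.
MAX_EDITS_PER_CHAR = 1 / 15.83


def titles_match(t1: str, t2: str, threshold: float = MAX_EDITS_PER_CHAR) -> bool:
    """True if the length-normalized edit distance is within the threshold."""
    a, b = normalize(t1), normalize(t2)
    longest = max(len(a), len(b))
    if longest == 0:
        return True
    return levenshtein(a, b) / longest <= threshold


def match_titles(scopus_titles, mendeley_titles, threshold=MAX_EDITS_PER_CHAR):
    """Map each Scopus title to its closest Mendeley title, or None if too distant."""
    matches = {}
    for s in scopus_titles:
        best = min(mendeley_titles,
                   key=lambda m: levenshtein(normalize(s), normalize(m)),
                   default=None)
        ok = best is not None and titles_match(s, best, threshold)
        matches[s] = best if ok else None
    return matches
```

Titles that come back as None would correspond to documents that could not be matched and need manual checking.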

Page 7: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Results: Downloads per document type

Full length articles (FLAs) are the most downloaded document type (94.1% of all downloads)

All other documents are downloaded at a considerably lower level

Document type          n   % docs   % downloads   Downloads per doc (relative)
Announcement           5     1.6%       0.4%         5.9
Book review            4     1.2%       0.3%         5.5
Contents list         29     9.0%       0.4%         1.0
Editorial Board       29     9.0%       0.6%         1.5
Editorial             49    15.3%       3.3%         4.6
Erratum                1     0.3%       0.1%         5.7
Full length article  181    56.4%      94.1%        35.4
Index                 12     3.7%       0.2%         1.3
Miscellaneous          9     2.8%       0.2%         1.8
Publishers note        2     0.6%       0.2%         7.0
All                  321     100%       100%

Source: ScienceDirect; n=321

Page 8: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Results: Print publication delay

FLAs are published online more than 1.5 months before print publication on average.

Document type          n   Online date - print publication date (mean days)
Announcement           5    -13.2
Book review            4    -40.5
Contents list         29     12.9
Editorial Board       29     12.9
Editorial             49      9.0
Erratum                1   -145.0
Full length article  181    -49.8
Index                 12     -4.9
Miscellaneous          9     32.9
Publishers note        2    -13.0
All                  321    -24.9

Source: ScienceDirect; n=321
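As a quick consistency check, the overall mean of -24.9 days can be reproduced as the document-weighted mean of the per-type means in the table above. The sketch below only re-does this arithmetic with the values from the slide; the variable names are illustrative.

```python
# (document type, n, mean of "online date - print publication date" in days)
# Negative values mean the document appeared online before the print issue.
rows = [
    ("Announcement", 5, -13.2),
    ("Book review", 4, -40.5),
    ("Contents list", 29, 12.9),
    ("Editorial Board", 29, 12.9),
    ("Editorial", 49, 9.0),
    ("Erratum", 1, -145.0),
    ("Full length article", 181, -49.8),
    ("Index", 12, -4.9),
    ("Miscellaneous", 9, 32.9),
    ("Publishers note", 2, -13.0),
]

total_docs = sum(n for _, n, _ in rows)
# Weight each type's mean delay by its number of documents.
weighted_mean = sum(n * days for _, n, days in rows) / total_docs

print(total_docs, round(weighted_mean, 1))  # prints: 321 -24.9
```

Because full length articles make up more than half of the documents, their mean of -49.8 days dominates the overall figure.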

Page 9: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Results: Downloads per publication year (relational)

In many cases, downloads peak one year after publication

The highest download value in a single year is for FLAs published in 2011

Rows: publication year (PY); columns: download year (DL-year); "-" marks an empty cell

PY     n   2002  2003  2004  2005  2006  2007  2008  2009  2010  2011    all  DL/FLA
2002  13    1.0   2.3   1.7   1.3   1.2   1.4   2.4   2.8   2.8   2.7   19.6   7.4x
2003  21    0.0   1.3   2.2   1.0   1.0   0.9   1.5   1.3   1.5   1.1   11.9   2.8x
2004  17      -     -   1.7   2.6   2.1   2.2   2.4   2.7   2.9   2.3   18.9   5.5x
2005  18      -     -     -   1.7   2.3   1.8   2.0   2.4   2.6   2.2   15.0   4.1x
2006  14      -     -     -   0.2   2.4   2.1   1.8   2.1   2.0   2.0   12.5   4.4x
2007  18      -     -     -     -   0.0   2.7   3.6   3.4   3.5   2.9   16.1   4.4x
2008  16      -     -     -     -     -   0.0   2.9   3.5   3.0   2.4   11.8   3.6x
2009  14      -     -     -     -     -     -     -   3.1   4.0   3.1   10.2   3.6x
2010  21      -     -     -     -     -     -     -     -   3.9   4.4    8.3   2.0x
2011  29      -     -     -     -     -     -     -     -   0.3   5.6    5.9   1.0x
all  181    1.0   3.7   5.6   6.8   8.9  11.1  16.6  21.4  26.4  29.0  130.4

Source: ScienceDirect; FLA only (n=181)
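The DL/FLA column appears to express downloads per full length article relative to publication year 2011 (= 1.0x). That reading is an assumption, not something the slide states. The sketch below recomputes the column from the "n" and "all" columns under this assumption; since the published inputs are rounded, a recomputed ratio can differ from the slide by about 0.1.

```python
# publication year -> (number of FLAs, value in the "all" column)
TABLE = {
    2002: (13, 19.6), 2003: (21, 11.9), 2004: (17, 18.9), 2005: (18, 15.0),
    2006: (14, 12.5), 2007: (18, 16.1), 2008: (16, 11.8), 2009: (14, 10.2),
    2010: (21, 8.3), 2011: (29, 5.9),
}


def relative_downloads_per_fla(table, base_year=2011):
    """Downloads per FLA for each publication year, scaled so base_year equals 1.0."""
    per_fla = {py: total / n for py, (n, total) in table.items()}
    base = per_fla[base_year]
    return {py: value / base for py, value in per_fla.items()}


ratios = relative_downloads_per_fla(TABLE)
for py in sorted(ratios):
    print(py, f"{ratios[py]:.1f}x")
```

Under this reading, older volumes have accumulated several times more downloads per article than the 2011 volume, which had only been available for a year or less.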

Page 10: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Results: Citations per document type

Document types differ between Scopus and ScienceDirect (FLA ≈ articles + conference papers + reviews)

About 25% of all documents are not cited (primarily editorials, conference papers, and recent publications)

Doc type           No. docs   % uncited   Cites   Cites per doc type
Article                 151        15%     2563    14.8
Conference paper         13        69%        8     0.4
Editorial                33        79%       13     0.2
Review                   18         6%      383    20.2
All                     215        27%     2967    10.9

Source: Scopus; n=215

Page 11: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Results: Citations per publication year

Only a few documents are cited in their publication year; the citation maximum is reached several years after publication

In contrast, downloads reach their maximum in the year of publication or one year later

Rows: publication year (Pubyear); columns: citation year; "-" marks an empty cell

Pubyear    n   2002  2003  2004  2005  2006  2007  2008  2009  2010  2011    all   Cites per doc
2002      13      2    19    38    69    88   105   158   165   194   199   1037   79.8
2003      14      -     1     6    21    27    39    35    41    40    39    249   17.8
2004      17      -     -     0    15    40    56    74    78    88   107    458   26.9
2005      19      -     -     -     0    16    46    78    76    93    99    408   21.5
2006      14      -     -     -     1     2    14    31    31    53    49    181   12.9
2007      18      -     -     -     -     -     1    31    74    92    85    283   15.7
2008      15      -     -     -     -     -     -     3    30    69    83    185   12.3
2009      14      -     -     -     -     -     -     -     3    34    57     94    6.7
2010      18      -     -     -     -     -     -     -     -     5    40     45    2.5
2011       8      -     -     -     -     -     -     -     -     -    14     14    1.8
all      150      2    20    44   106   173   261   410   498   668   772   2954

Source: Scopus; Document types: articles, reviews, conference papers; only cited documents (n=150)

Callouts on the slide: Special Issue on “Trust in the Digital Economy“; Special Issue with conference papers
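To make the obsolescence pattern concrete, the lag between publication year and the year with the most citations can be read off the table for each volume. The sketch below does this with the slide's values; the placement of values in citation-year columns is reconstructed from the row and column totals, and the helper name is illustrative.

```python
YEARS = list(range(2002, 2012))  # citation years 2002..2011

# publication year -> citations per citation year (None = empty cell on the slide)
CITES = {
    2002: [2, 19, 38, 69, 88, 105, 158, 165, 194, 199],
    2003: [None, 1, 6, 21, 27, 39, 35, 41, 40, 39],
    2004: [None] * 2 + [0, 15, 40, 56, 74, 78, 88, 107],
    2005: [None] * 3 + [0, 16, 46, 78, 76, 93, 99],
    2006: [None] * 3 + [1, 2, 14, 31, 31, 53, 49],
    2007: [None] * 5 + [1, 31, 74, 92, 85],
    2008: [None] * 6 + [3, 30, 69, 83],
    2009: [None] * 7 + [3, 34, 57],
    2010: [None] * 8 + [5, 40],
    2011: [None] * 9 + [14],
}


def peak_lag(pub_year, row):
    """Years between publication and the (first) citation year with the most cites."""
    observed = [(value, year) for year, value in zip(YEARS, row) if value is not None]
    peak_value = max(value for value, _ in observed)
    peak_year = min(year for value, year in observed if value == peak_value)
    return peak_year - pub_year


lags = {py: peak_lag(py, row) for py, row in CITES.items()}
for py, lag in lags.items():
    print(py, lag)
```

The observation window ends in 2011, so the lags are lower bounds: for most volumes the peak falls in 2010 or 2011, and citations may still be rising.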

Page 12: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Results: Readers per print publication year

The relative youth of Mendeley (est. 2008) and the strong growth of its user base since then (now: 2.5 million) make obsolescence analyses difficult. Weighting by user/document growth is needed.

Rows: publication year (Pubyear); columns: readership year

Pubyear    n   2008  2009  2010  2011  2012 (to July)    all   Readers per doc
2002      13      7    30   126   245   183              591   45.5
2003      21      1    29    58   108   145              341   17.1
2004      17     11    36   107   158   165              477   28.1
2005      18      2    31    79   141   151              404   23.8
2006      14      6    39    88   128   148              409   29.2
2007      18      4    45   129   222   209              609   35.8
2008      16      7    36    99   182   164              488   32.5
2009      14      0    27   111   127   150              415   29.6
2010      21      0     0    84   238   191              513   24.4
2011      29      0     0     4   208   282              494   17.6
all      181     38   273   885  1757  1852             4741

Source: Mendeley; FLA only (n=181)

Page 13: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Results: Downloads vs. readers vs. cites (only FLAs)

Moderate to high correlation (Spearman) between downloads and readers (r=0.73) and between downloads and citations (r=0.77)

Moderate correlation between citations and readers (r=0.51)

[Figure: three scatter plots. Downloads vs. readers (x: downloads, y: readers; r=0.73, n=181). Downloads vs. cites (x: downloads, y: cites; r=0.77, n=151). Readers vs. cites (x: readership, y: cites; r=0.51, n=151).]
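For readers who want to reproduce this kind of analysis, Spearman's rank correlation can be sketched as the Pearson correlation of the ranked values, with tied values receiving the average of their ranks. The code below is a generic, dependency-free illustration rather than the authors' procedure, and the per-article counts in the usage lines are invented.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented per-article counts, for illustration only.
downloads = [120, 300, 80, 450, 200]
cites = [3, 10, 1, 12, 15]
print(round(spearman(downloads, cites), 2))
```

Rank correlation is a natural choice here because download, readership, and citation counts are typically highly skewed, so a few heavily used articles would otherwise dominate the coefficient.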

Page 14: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Results: Readership structure of Mendeley articles

2/3 of readership counts come from students

Researchers + Post Docs + Profs ≈ 1/4 of all readership counts

[Pie chart: readership by user type. Percentages are paired with legend entries in the order both appear on the slide.]

Student (PhD) 32%
Student (doctoral) 7%
Student (MA) 19%
Student (postgr.) 6%
Student (BA) 5%
Lecturer 5%
Sen. Lecturer 1%
Researcher (academic) 5%
Researcher (non-academic) 3%
Post Doc 3%
Assist. Prof. 5%
Assoc. Prof. 3%
Prof. 4%
Other 1%
Librarian 0%

Source: Mendeley; doc type: FLA; n=4741

Page 15: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Conclusions

Comparing different measures is not always easy

Different obsolescence characteristics of downloads and cites (readership to be determined)

Moderate to high correlation between downloads and cites

Moderate correlation between cites and readership data

For representative usage measures, we need to understand their characteristics on a large scale

To fully understand usage and impact of an article, it will be important to have many complementary measures with transparent biases

On the one hand, we need open bibliometric data; on the other hand, we need a better understanding of the research process

Page 16: Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal

Thank you very much for your attention!

Christian Schlögl, Juan Gorraiz, Christian Gumpenberger, Kris Jack, Peter Kraker
