The strange world of bibliometric numbers: Implications for professional practice Dr Ian Rowlands David Wilson Library Manchester Metropolitan University 27 June 2016

TRANSCRIPT

Page 1: The Strange World of Bibliometric Numbers: Implications for Professional Practice

The strange world of bibliometric numbers: Implications for professional practice

Dr Ian Rowlands

David Wilson Library

Manchester Metropolitan University

27 June 2016

Page 2: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Three themes

Look at the underlying data

• don’t take indicators at face value

Think how your data will be used

• put your numbers in context

Accept that bibliometrics is driven by rare events

• put measurements around that uncertainty

Page 3: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Content

The importance of context: Interpreting the h-index

The journal impact factor: A case study in extremes

Working with ‘difficult’ numbers

Page 4: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Preface

Bibliometrics deals with rare events

Page 5: The Strange World of Bibliometric Numbers: Implications for Professional Practice
Page 6: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Fly body length (mm)

Page 7: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Fly body length (mm)

Statistic            Value
Mean                 45.5
Median               45.5
Mode                 45
Range                36 – 55
Standard deviation   3.9

Page 8: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Citation frequencies: Nature 2008

Citations to present for 975 Nature articles and review papers published in 2008

Page 9: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Nature citations

Statistic             Value
Mean cites per paper  275.1
Median                164
Mode                  1
Range                 0 – 4,735
Standard deviation    366.6

Citations to present for 975 Nature articles and review papers published in 2008

Page 10: The Strange World of Bibliometric Numbers: Implications for Professional Practice

What’s the average??

The data range over three orders of magnitude!!

Page 11: The Strange World of Bibliometric Numbers: Implications for Professional Practice

A thought experiment

What if flies’ body lengths followed the same distribution as citations?

• most typically, a fly would not even exist (often, mode=0)

• 85 per cent of flies would have bodies shorter than average for the whole population, and most would be hors d’oeuvres for the top 15 per cent

• some flies would measure a giant 30 inches
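The effect described in this thought experiment is easy to reproduce. The sketch below is only an illustration: it assumes a log-normal sample as a stand-in for citation counts, with arbitrary parameters and seed, and does not use the Nature data from the talk.

```python
import random
import statistics

# Assumption: a log-normal sample stands in for citation counts.
# The parameters (mu=3.0, sigma=1.5) and the seed are arbitrary choices.
rng = random.Random(42)
cites = [int(rng.lognormvariate(3.0, 1.5)) for _ in range(10_000)]

mean = statistics.mean(cites)
median = statistics.median(cites)

# Fraction of simulated papers cited less often than the mean.
share_below_mean = sum(c < mean for c in cites) / len(cites)

print(f"mean = {mean:.1f}, median = {median}")
print(f"share of papers below the mean = {share_below_mean:.0%}")
```

In a sample like this the mean sits well above the median, so most papers count as "below average", which is the pattern the fly analogy describes.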

Page 12: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Lesson

‘Average’ is a problematic concept in bibliometrics

This has serious implications for

• methodology

• interpretation

• application

Page 13: The Strange World of Bibliometric Numbers: Implications for Professional Practice

The importance of context

Interpreting the h-index

Page 14: The Strange World of Bibliometric Numbers: Implications for Professional Practice

What is the h-index?

Page 15: The Strange World of Bibliometric Numbers: Implications for Professional Practice

What is the h-index?

36
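The definition itself is not in the transcript. For reference, the standard definition is that an author has index h when h of their papers have at least h citations each. The notation below is assumed here, not taken from the slides.

```latex
% Notation assumed here, not taken from the slides:
% c_{(1)} \ge c_{(2)} \ge \dots \ge c_{(n)} are one author's citation counts,
% sorted in descending order.
h = \max \{\, k \in \{1, \dots, n\} : c_{(k)} \ge k \,\},
\qquad h = 0 \text{ if no paper has been cited.}
```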

Page 16: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Interpreting the h-index

Harry
– 60 papers
– 6,000 citations
– 100 citations per paper

Tom
– 60 papers
– 6,000 citations
– 100 citations per paper

Page 18: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Interpreting the h-index

Harry
– 60 papers
– 6,000 citations
– 100 citations per paper

Tom
– 60 papers
– 6,000 citations
– 100 citations per paper

h-index = 20

h-index = 40

Page 19: The Strange World of Bibliometric Numbers: Implications for Professional Practice

The h-index measures consistency not absolute impact.

Quite a few Nobel laureates have low to moderate h-indexes …
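A minimal sketch of how two records with identical totals can end up with h-indexes of 20 and 40. The slides give only the totals, so the two per-paper distributions below are hypothetical and were chosen simply to match 60 papers and 6,000 citations.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical citation records (not from the talk): same paper count,
# same citation total, very different shapes.
profile_a = [290] * 20 + [5] * 40   # a few very highly cited papers
profile_b = [149] * 40 + [2] * 20   # many solidly cited papers

for name, record in (("A", profile_a), ("B", profile_b)):
    print(name, len(record), sum(record), h_index(record))
# prints:
# A 60 6000 20
# B 60 6000 40
```

Profile A concentrates its citations in 20 papers, so the count stops at 20 however heavily those papers are cited. Profile B spreads the same total over 40 papers and scores twice as high.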

Page 20: The Strange World of Bibliometric Numbers: Implications for Professional Practice

On the h-index and its variants

“These are often breathtakingly naïve attempts to capture a complex citation record with a single number. Indeed the primary advantage of these new indices over simple histograms of citation counts is that the indices discard almost all of the detail … and this makes it possible to rank any two scientists … Surely understanding ought to be the goal when assessing research, not ensuring that any two people are comparable.”

International Mathematical Union, Citation Statistics, June 2008, p. 14

http://www.mathunion.org/fileadmin/IMU/Report/CitationStatistics.pdf

Page 21: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Practical tips

The accuracy of h depends on not missing any relevant papers in the core as well as avoiding false drops

Present h with a health warning that pushes responsibility for curating an online identity back onto the client (e.g. ORCID, active management of their ResearcherID)

Source coverage (particularly Scopus vs Web of Science) is a seriously overlooked issue: different sources may yield very different h values

Since h throws away information about important highly cited papers (papers with citations > h) it does many researchers a disservice

Page 22: The Strange World of Bibliometric Numbers: Implications for Professional Practice

The journal impact factor

A case study in extremes

Page 23: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Dr Eugene Garfield

Page 24: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Journal impact factor 2015 calculation

JIF 2015 = citations accrued during 2015 ÷ (papers published in 2013 + papers published in 2014)

Numerator = ALL citations

Denominator = articles and reviews only
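The mismatch between numerator and denominator can be made concrete with a small calculation. This is a sketch with invented numbers for a hypothetical journal. It follows the standard definition, in which the citations counted are those received in the census year by items published in the two preceding years.

```python
def impact_factor(citations, citable_items):
    """Citations received in the census year divided by the number of
    citable items (articles and reviews) from the two preceding years."""
    return citations / citable_items

# Hypothetical journal: every number below is invented for illustration.
articles_and_reviews = 120 + 130      # published in 2013 and in 2014
cites_to_articles_and_reviews = 700   # received during 2015
cites_to_other_items = 50             # e.g. to editorials and letters

# The numerator counts ALL citations, the denominator only articles and reviews.
jif = impact_factor(
    cites_to_articles_and_reviews + cites_to_other_items, articles_and_reviews
)
# A like-for-like ratio would count only citations to articles and reviews.
like_for_like = impact_factor(cites_to_articles_and_reviews, articles_and_reviews)

print(f"JIF = {jif:.2f}, like-for-like ratio = {like_for_like:.2f}")
# prints: JIF = 3.00, like-for-like ratio = 2.80
```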

Page 25: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Bibliometric ratios can be very unstable

The journal impact factor is a simple ratio:

JIF = citations / papers

Citations can throw up surprises, and these will be amplified if the sample is small.
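To see the amplification, compare what a single surprise paper does to the ratio for journals of different sizes. The values here are hypothetical: 2 citations for a typical paper and 5,000 for the outlier.

```python
def ratio_with_outlier(n_papers, typical_cites, outlier_cites):
    """Citations per paper when one paper receives outlier_cites and
    each of the remaining papers receives typical_cites."""
    total = (n_papers - 1) * typical_cites + outlier_cites
    return total / n_papers

# Hypothetical values, chosen only to illustrate the size effect.
TYPICAL, OUTLIER = 2, 5000

for size in (50, 200, 1000):
    shifted = ratio_with_outlier(size, TYPICAL, OUTLIER)
    print(f"{size:>5} papers: {TYPICAL:.1f} -> {shifted:.1f} citations per paper")
```

The same outlier moves the small journal's ratio by two orders of magnitude and the large journal's by a factor of about three and a half.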

Page 26: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Journal impact factor instability

[Figure: mean % change in IF (2011 on 2010), on an axis from -125% to +175%, plotted against journal size (2011 articles), on an axis from 0 to 1,000]

Page 27: The Strange World of Bibliometric Numbers: Implications for Professional Practice

What happened here?

Page 28: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Acta Crystallographica Section A: citations received in 2008, 2009 and 2010

Cited item                                                                            2008    2009    2010
The whole journal                                                                     3,628   6,068   7,325
George Sheldrick, A short history of SHELX (2008) 64(1) pp 112-122.                   3,542   5,897   7,029
Helen Berman, The Protein Data Bank: A historical perspective (2008) 64(1) pp 88-95.  4       7       23
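A quick calculation on the figures in this table shows how far one paper dominates the journal's citations. The counts are copied from the table; the variable names are only illustrative.

```python
# Citation counts copied from the Acta Crystallographica Section A table.
journal = {2008: 3628, 2009: 6068, 2010: 7325}
shelx = {2008: 3542, 2009: 5897, 2010: 7029}

# Share of the whole journal's citations going to the SHELX paper,
# and the citations left over for everything else in the journal.
shares = {year: shelx[year] / journal[year] for year in journal}
rest = {year: journal[year] - shelx[year] for year in journal}

for year in sorted(journal):
    print(f"{year}: SHELX paper = {shares[year]:.1%}, all other papers = {rest[year]}")
# prints:
# 2008: SHELX paper = 97.6%, all other papers = 86
# 2009: SHELX paper = 97.2%, all other papers = 171
# 2010: SHELX paper = 96.0%, all other papers = 296
```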

Page 29: The Strange World of Bibliometric Numbers: Implications for Professional Practice

A short history of SHELX

Abstract

“An account is given of the development of the SHELX system of computer programs from SHELX-76 to the present day … This paper could serve as a general literature citation when one or more of the open-source SHELX programs … are employed in the course of a crystal-structure determination.”

George M Sheldrick, A short history of SHELX, Acta Crystallographica Section A (2008) 64(1): 112-122.

Page 30: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Top 10% of articles generate 40% of all citations …

… 82% of articles are ‘below average’

Bill Gates gets on the train … and, on average, everyone on board is a millionaire (at least until he gets off)

Page 31: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Lessons

The example of Acta Crystallographica A’s 2009 JIF is a salutary reminder that rare events do happen. The issue is compounded in this case because the denominator is small (127 papers).

How could the journal impact factor (and other bibliometric indicators) be better presented?

• in principle, the mode and median are far more appropriate and informative than the mean when dealing with highly skewed distributions

• but in reality, the mode and median for many indicators will simply be 0 or 1

• so this is not a terribly realistic strategy!

Page 32: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Working with ‘difficult’ numbers

Data transformation and stability intervals

Page 33: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Nature 2008 (n=945) cites to end 2015

Page 34: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Nature 2008 (n=945) cites to end 2015

A simple logarithmic data transform

Page 35: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Advantages

By using a logarithmic rather than a linear scale, the mode, median and mean converge and we have a much better sense of the central tendency.

This has three practical benefits:

• Suddenly ‘average’ becomes meaningful

• You can now use a whole range of statistical tests that assume a normal distribution (e.g. Student’s t-test, ANOVA)

• You can now put 95% confidence intervals around the mean, which aids interpretation
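A sketch of these steps on simulated data. Three things here are assumptions rather than anything shown in the talk: the log-normal sample, the use of log(c + 1) so that uncited papers stay in the sample, and the normal approximation (z = 1.96) for the interval.

```python
import math
import random
import statistics

# Assumption: simulated log-normal "citation counts"; parameters and seed are arbitrary.
rng = random.Random(7)
cites = [int(rng.lognormvariate(4.0, 1.2)) for _ in range(1000)]

# log(c + 1) keeps uncited papers (c = 0) in the sample.
logs = [math.log(c + 1) for c in cites]

mean_log = statistics.mean(logs)
sem = statistics.stdev(logs) / math.sqrt(len(logs))

# 95% confidence interval on the log scale (normal approximation).
low, high = mean_log - 1.96 * sem, mean_log + 1.96 * sem

# Back-transform the mean and its interval to the citation scale.
centre = math.exp(mean_log) - 1
ci = (math.exp(low) - 1, math.exp(high) - 1)

raw_mean = statistics.mean(cites)
raw_median = statistics.median(cites)
print(f"raw mean = {raw_mean:.1f}, raw median = {raw_median}")
print(f"back-transformed mean = {centre:.1f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```

The back-transformed mean lands close to the median rather than the raw mean, which is what makes it usable as an ‘average’ for skewed counts.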

Page 36: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Health warning

YOU MUST LOOK AT THE DATA

Fairly mature citation distributions are often approximately loglinear but this is not always the case.

Try other transforms (e.g. square root, reciprocal) to see if they offer a better solution.

If you want to be squeaky clean, consider a Box-Cox test to find the optimal transform.
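One simple way to compare candidate transforms is to check how much skewness each leaves behind. The sketch below again uses a simulated log-normal sample, and the +1 offsets and the skewness helper are assumptions of this example. With real data the ranking of the transforms may differ, which is the point of the health warning.

```python
import math
import random
import statistics

def skewness(values):
    """Population skewness: mean cubed deviation in standard-deviation units."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Assumption: simulated log-normal "citation counts"; parameters and seed are arbitrary.
rng = random.Random(11)
cites = [int(rng.lognormvariate(4.0, 1.2)) for _ in range(1000)]

# The +1 offsets keep uncited papers (c = 0) in the sample.
transforms = {
    "none": lambda c: c,
    "square root": lambda c: math.sqrt(c),
    "log(c + 1)": lambda c: math.log(c + 1),
    "reciprocal 1/(c + 1)": lambda c: 1 / (c + 1),
}

skews = {name: skewness([f(c) for c in cites]) for name, f in transforms.items()}
for name, value in skews.items():
    print(f"{name:<22} skewness = {value:+.2f}")
```

A skewness near zero suggests a roughly symmetric distribution. If SciPy is available, `scipy.stats.boxcox` estimates the Box-Cox parameter by maximum likelihood for strictly positive data, which automates this search.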

Page 37: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Stability intervals

Page 38: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Stability intervals
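The content of these two slides did not survive in the transcript, so the method used in the talk is unknown. One common way to build a stability interval, assumed here rather than taken from the talk, is a percentile bootstrap: resample the set of papers with replacement many times and report the range in which the indicator usually falls. The data are hypothetical.

```python
import random
import statistics

def stability_interval(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of values."""
    rng = random.Random(seed)
    n = len(values)
    # Mean of each bootstrap resample (same size, drawn with replacement).
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

# Hypothetical journal: 126 papers with modest counts plus one extreme outlier.
rng = random.Random(1)
cites = [rng.randint(0, 10) for _ in range(126)] + [5000]

observed = statistics.fmean(cites)
low, high = stability_interval(cites)
print(f"mean = {observed:.1f}, 95% stability interval = ({low:.1f}, {high:.1f})")
```

The interval is very wide because the mean depends almost entirely on whether, and how often, the outlier is drawn into a resample.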

Page 39: The Strange World of Bibliometric Numbers: Implications for Professional Practice

Final conclusions

Always look at the raw data, not the cooked indicator, and think about context

‘Rare events’ can make a huge difference

Bibliometric indicators are unstable and this can lead to poor decision-making

You have a responsibility to present meaningful averages and to put bounds around data uncertainty