advantages and drawbacks of bibliometrics

Application of bibliometric analysis: Advantages & pitfalls. Thed van Leeuwen. Workshop on Research Evaluation in Statistical Sciences, Bologna, 25th March 2010


TRANSCRIPT

Page 1: Advantages and drawbacks of bibliometrics

Application of bibliometric analysis

Advantages & pitfalls

Thed van Leeuwen

Workshop on Research Evaluation in Statistical Sciences,

Bologna, 25th March 2010

Page 2: Advantages and drawbacks of bibliometrics

Introduction of bibliometrics

• Bibliometrics can be defined as the quantitative analysis of science and technology performance and the cognitive and organizational structure of science and technology.

• The basis for these analyses is the scientific communication between scientists through (mainly) journal publications.

• Key concepts in bibliometrics are output and impact, as measured through publications and citations.

• Important starting point in bibliometrics: scientists express, through citations in their scientific publications, a certain degree of influence of others on their own work.

• When quantified on a large scale, citations indicate influence or (inter)national visibility of scientific activity, but should not be interpreted as a synonym for ‘quality’.

Page 3: Advantages and drawbacks of bibliometrics

CWTS data system

• CWTS has a full bibliometric license from Thomson Reuters Scientific to conduct evaluation studies using the Web of Science.

• Our database covers the period 1981-2009.

• Some characteristics:

– Over 31,000,000 publications.

– Over 350,000,000 citation relations between source papers.

– 100,000,000 authors (incl. variations), 15,000,000 ‘unique’ names.

– Over 60,000,000 addresses, some 90% cleaned up over the last 10 years.

– Contains reference sets for journal and field citation data.

Page 4: Advantages and drawbacks of bibliometrics

Bibliometric indicators produced by CWTS

Page 5: Advantages and drawbacks of bibliometrics

Some basic indicators are …

• P: number of publications in journals processed for the Web of Science.

• C: number of received citations, excl. self-citations.

• CPP: mean number of citations per publication, excl. self-citations.

• Pnc: percentage of the publications not cited (within a certain time-frame!).

• % SC: percentage of self-citations related to an output set.
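The basic indicators can be illustrated with a minimal sketch. The `Paper` record, the sample data, and two interpretations are assumptions made for this example and are not taken from the presentation: a paper is counted as "not cited" when it has no citations other than self-citations, and % SC is taken as the share of self-citations in all citations received.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    citations: int       # all citations received within the chosen window
    self_citations: int  # the part of `citations` that are self-citations


def basic_indicators(papers):
    """Return P, C, CPP, Pnc and %SC for a non-empty list of Paper objects."""
    p = len(papers)
    total = sum(x.citations for x in papers)
    self_c = sum(x.self_citations for x in papers)
    c = total - self_c  # C excludes self-citations
    # Assumption: "not cited" means no citations besides self-citations.
    uncited = sum(1 for x in papers if x.citations - x.self_citations == 0)
    return {
        "P": p,
        "C": c,
        "CPP": c / p,
        "Pnc": 100.0 * uncited / p,
        # Assumption: %SC is the share of self-citations in all citations.
        "%SC": 100.0 * self_c / total if total else 0.0,
    }


# Hypothetical output set, not data from the presentation.
example = [Paper(10, 2), Paper(0, 0), Paper(5, 1), Paper(3, 3)]
print(basic_indicators(example))
```

All values depend on the citation window used, which is why Pnc in particular is only meaningful together with its time-frame.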

Page 6: Advantages and drawbacks of bibliometrics

Important indicators are…

• CPP/JCSm: ratio between the actual impact (CPP) and the mean journal impact (JCSm).

• CPP/FCSm: ratio between the actual impact (CPP) and the mean field impact (FCSm).

• JCSm/FCSm: ratio between the journal impact and the field impact, indicative of the ‘quality’ of the journal package in the field.

Page 7: Advantages and drawbacks of bibliometrics

Various types of analysis focus on …

• Research profiles: a breakdown of the output over various fields of science.

• Scientific cooperation analysis: a breakdown of the output over various types of scientific collaboration.

• Knowledge user analysis: a breakdown of the ‘responding’ output into citing fields, countries or institutions.

• Highly cited paper analysis: which publications are among the most highly cited output (top 10%, 5%, 1%) of the global literature in the same field(s).

• Social network analysis: how the network of partners is composed, based on scientific cooperation.

Page 8: Advantages and drawbacks of bibliometrics

Journal & Field Normalization

Page 9: Advantages and drawbacks of bibliometrics

Calculating the JCSm & FCSm

-----------------------------------------------------------------------------------
       Type      Publ. year   Journal      Journal category   # citations until 1999
-----------------------------------------------------------------------------------
I      review    1996         CANCER RES   Oncology           17
II     note      1997         J CLIN END   Endocrinology      4
III    article   1999         J CLIN END   Endocrinology      6
IV     article   1999         J CLIN END   Endocrinology      8
-----------------------------------------------------------------------------------

Page 10: Advantages and drawbacks of bibliometrics

Calculating the JCSm & FCSm 2

----------------------------------
       CPP     JCS     FCS
----------------------------------
I      17      16.9    23.7
II     4       3.1     3.0
III    6       4.8     4.1
IV     8       4.8     4.1
----------------------------------

Page 11: Advantages and drawbacks of bibliometrics

Calculating the JCSm & FCSm 3

The mean citation score is determined as:

\[ \mathrm{CPP} = \frac{17 + 4 + 6 + 8}{1 + 1 + 1 + 1} = 8.8 \]

The mean journal citation score as:

\[ \mathrm{JCSm} = \frac{(1 \times 16.9) + (1 \times 3.1) + (2 \times 4.8)}{1 + 1 + 2} = 7.4 \]

The mean field citation score as:

\[ \mathrm{FCSm} = \frac{(1 \times 23.7) + (1 \times 3.0) + (2 \times 4.1)}{1 + 1 + 2} = 8.7 \]

The normalized indicators follow as:

\[ \mathrm{CPP} / \mathrm{JCSm} = 8.8 / 7.4 = 1.19 \]

\[ \mathrm{CPP} / \mathrm{FCSm} = 8.8 / 8.7 = 1.01 \]
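The same arithmetic can be sketched in a few lines of Python, using the four publications of the example. The helper `mean` and the variable names are choices made for this sketch. The slide divides the rounded means (8.8, 7.4, 8.7); working with the unrounded means gives ratios of about 1.18 and 1.00 instead of 1.19 and 1.01.

```python
# (publication, citations, JCS, FCS) for the four example publications
papers = [
    ("I", 17, 16.9, 23.7),
    ("II", 4, 3.1, 3.0),
    ("III", 6, 4.8, 4.1),
    ("IV", 8, 4.8, 4.1),
]


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


cpp = mean(c for _, c, _, _ in papers)   # mean citations per publication
jcsm = mean(j for _, _, j, _ in papers)  # mean journal citation score
fcsm = mean(f for _, _, _, f in papers)  # mean field citation score

print(f"CPP  = {cpp:.2f}")
print(f"JCSm = {jcsm:.2f}")
print(f"FCSm = {fcsm:.2f}")
print(f"CPP/JCSm = {cpp / jcsm:.2f}")
print(f"CPP/FCSm = {cpp / fcsm:.2f}")
```

Averaging JCS and FCS over the publications is equivalent to the weighted form on the slide, because publications III and IV share the same journal and field values.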

Page 12: Advantages and drawbacks of bibliometrics

Citation Windows & Impact Measurement

Page 13: Advantages and drawbacks of bibliometrics

Citation measurement and ‘windows’

• Publication years, fixed citation ‘window’.

Publications of 2002, with three citation years (namely 2002, 2003, and 2004), followed by 2003, with three years, etc.

• Blocks of publication years with a window decreasing in length.

Publications of 2002-2005, with citation window of 4 years (2002-2005), 3 years (2003-2005), 2 years (2004-2005), and 1 year (2005).
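The two window schemes can be generated programmatically. This is a minimal sketch; the function names are hypothetical, and cutting a fixed window off at the last available citation year is an assumption of the sketch.

```python
def fixed_window(pub_years, last_citation_year, length=3):
    """Map each publication year to a citation window of `length` years,
    truncated at the last year for which citation data exist."""
    return {
        year: list(range(year, min(year + length - 1, last_citation_year) + 1))
        for year in pub_years
    }


def year_block(first_year, last_year):
    """Map each publication year in a block to the citation years that
    remain until the end of the block (a window decreasing in length)."""
    return {
        year: list(range(year, last_year + 1))
        for year in range(first_year, last_year + 1)
    }


print(fixed_window(range(2002, 2010), last_citation_year=2009))
print(year_block(2002, 2005))
```

In the block scheme the oldest publications have the longest window, so a block mixes papers with four years of citations and papers with only one.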

Page 14: Advantages and drawbacks of bibliometrics

Citation measurement with ‘fixed window’

Publication year    Citation years
2002                2002 2003 2004
2003                2003 2004 2005
2004                2004 2005 2006
2005                2005 2006 2007
2006                2006 2007 2008
2007                2007 2008 2009
2008                2008 2009
2009                2009

Page 15: Advantages and drawbacks of bibliometrics

Citation measurement with ‘year blocks’

Block        Publication year    Citation years
2002-2005    2002                2002 2003 2004 2005
             2003                2003 2004 2005
             2004                2004 2005
             2005                2005
2003-2006    2003                2003 2004 2005 2006
             2004                2004 2005 2006
             2005                2005 2006
             2006                2006
2004-2007    2004                2004 2005 2006 2007
             2005                2005 2006 2007
             2006                2006 2007
             2007                2007
2005-2008    2005                2005 2006 2007 2008
             2006                2006 2007 2008
             2007                2007 2008
             2008                2008
2006-2009    2006                2006 2007 2008 2009
             2007                2007 2008 2009
             2008                2008 2009
             2009                2009

Page 16: Advantages and drawbacks of bibliometrics

Methodological issues

Page 17: Advantages and drawbacks of bibliometrics

Adequacy of citation indexes: implications for bibliometric studies

Page 18: Advantages and drawbacks of bibliometrics

How to tackle this issue?

• We conduct analyses on the adequacy of the citation indexes across disciplines based on reference behavior of researchers themselves.

• The degree of referring towards other indexed literature indicates the importance of journal literature in the scientific communication process.
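The idea can be sketched as a simple share: of all references given in a set of publications, which percentage points to literature that is itself covered by the index? The function name and the boolean input format are assumptions made for this illustration, not the CWTS implementation.

```python
def internal_coverage(reference_is_indexed):
    """Percentage of references that point to indexed literature.

    `reference_is_indexed` holds one boolean per cited reference:
    True if the cited item is covered by the citation index.
    """
    flags = list(reference_is_indexed)
    if not flags:
        raise ValueError("no references given")
    return 100.0 * sum(flags) / len(flags)


# Hypothetical reference list: three indexed items, one non-indexed item.
print(internal_coverage([True, True, True, False]))
```

Computed per discipline and per year, this share indicates how well an analysis based on the index alone represents the field.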

Page 19: Advantages and drawbacks of bibliometrics

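Assessment of WoS Coverage

[Figure: references from citing/source publications to cited/target items, split into targets covered by WoS (?%) and targets not covered by WoS (?%). Non-WoS targets: non-WoS journals, books, conference proceedings, reports, etc.]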

Page 20: Advantages and drawbacks of bibliometrics

Total ISI/WoS Database (2002)

[Figure: references from citing/source publications to cited/target items. WoS: 75%, Non-WoS: 25%.]

Page 21: Advantages and drawbacks of bibliometrics

The medical & Life sciences

[Figure: stacked bars (0% to 100%) showing the share of references to ISI and non-ISI sources in 1991, 1996, 2001 and 2006, for: Agriculture and Food Science; Basic Life Sciences; Basic Medical Sciences; Biological Sciences; Biomedical Sciences; Clinical Medicine; Health Sciences.]

Page 22: Advantages and drawbacks of bibliometrics

The natural sciences

[Figure: stacked bars (0% to 100%) showing the share of references to ISI and non-ISI sources in 1991, 1996, 2001 and 2006, for: Astronomy and Astrophysics; Chemistry and Chemical Engineering; Computer Sciences; Earth Sciences and Technology; Environmental Sciences and Technology; Mathematics; Physics and Materials Science; Statistical Sciences.]

Page 23: Advantages and drawbacks of bibliometrics

Statistical sciences

[Figure: stacked bars (0% to 100%) showing the share of references to ISI and non-ISI sources in 1991, 1996, 2001 and 2006.]

Page 24: Advantages and drawbacks of bibliometrics

The engineering sciences

[Figure: stacked bars (0% to 100%) showing the share of references to ISI and non-ISI sources in 1991, 1996, 2001 and 2006, for: Civil Engineering and Construction; Electrical Engineering and Telecommunication; Energy Science and Technology; General and Industrial Engineering; Instruments and Instrumentation; Mechanical Engineering and Aerospace.]

Page 25: Advantages and drawbacks of bibliometrics

The social and behavioral sciences

[Figure: stacked bars (0% to 100%) showing the share of references to ISI and non-ISI sources in 1991, 1996, 2001 and 2006, for: Economics and Business; Educational Sciences; Management and Planning; Political Science and Public Administration; Psychology; Social and Behavioral Sciences, Interdisciplinary; Sociology and Anthropology.]

Page 26: Advantages and drawbacks of bibliometrics

The humanities

[Figure: stacked bars (0% to 100%) showing the share of references to ISI and non-ISI sources in 1991, 1996, 2001 and 2006, for: Information and Communication Sciences; Language and Linguistics; Creative Arts, Culture and Music; History, Philosophy and Religion; Law and Criminology; Literature.]

Page 27: Advantages and drawbacks of bibliometrics

Overall WoS coverage by main field

EXCELLENT (> 80%): Biochem & Mol Biol; Biol Sci – Humans; Chemistry; Clin Medicine; Phys & Astron

VERY GOOD (60-80%): Appl Phys & Chem; Biol Sci – Anim & Plants; Psychol & Psychiat; Geosciences; Soc Sci ~ Medicine

GOOD (40-60%): Mathematics & Statistical sciences; Economics; Engineering

MODERATE (< 40%): Other Soc Sci; Humanities & Arts

Page 28: Advantages and drawbacks of bibliometrics

Conclusions on adequacy issue

• We can clearly conclude that the application of bibliometric techniques, solely based on WoS (but very likely also Scopus) will not be valid for some of the ‘soft’ fields in the social sciences and the humanities.

• That is why the toolbox has to be extended!

Page 29: Advantages and drawbacks of bibliometrics

The H-Index and its limitations

Page 30: Advantages and drawbacks of bibliometrics

The H-Index, defined as …

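• The H-Index of a publication set is the largest number h such that h publications in the set have each received at least h citations. With the publications ranked by descending number of citations, it is the last ranking position at which the number of received citations is still equal to or greater than that position.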

• Idea of an American physicist, J. Hirsch, who published about this index in the Proc. NAS USA.
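A minimal sketch of this definition, taking only a list of citation counts as input. The function name and the sample counts are chosen for illustration.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most cited publication first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank   # this publication still supports an h of `rank`
        else:
            break      # all later publications have even fewer citations
    return h


# Hypothetical citation counts for five publications.
print(h_index([10, 8, 5, 4, 3]))
```

Note that the result depends only on the ranked counts: neither the total number of citations nor the field or career length enters the calculation.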

Page 31: Advantages and drawbacks of bibliometrics

Examples of Hirsch-index values

• Environmental biologist, output of 188 papers, cited 4,788 times in the period 80-04.

• Hirsch-index value of 31

• Clinical psychologist, output of 72 papers, cited 760 times in the period 80-04.

• Hirsch-index value of 14

[Figure: two plots of citations against publications, one per researcher. First plot: publications axis 0 to 200, citations axis 0 to 350, marked "Value of H-Index = 31". Second plot: publications axis 0 to 80, citations axis 0 to 80, marked "Value of H-Index = 14".]

Page 32: Advantages and drawbacks of bibliometrics

Problems with the H-Index

• For serious evaluation of scientific performance, the H-Index is not a suitable indicator, as the index:

– Is insensitive to field-specific characteristics (e.g., differences in citation cultures between medicine and other disciplines).

– Does not take into account the age and career length of scientists; a small oeuvre necessarily leads to a low H-Index value.

– Is inconsistent in its ‘behaviour’.

Page 33: Advantages and drawbacks of bibliometrics

• Actual versus field normalized impact (CPP/FCSm) displayed against the output.

• Large output can be combined with a relatively low impact

[Figure: scatter plot of CPP/FCSm (vertical axis, 0.00 to 7.00) against total publications (horizontal axis, 0 to 250). Each point is labelled with a field code: Soc, Hum, Mat, Eng, Psy, Che, Med, Phy, Bio, Env.]

Page 34: Advantages and drawbacks of bibliometrics

• H-Index displayed against the output.

• Larger output is strongly correlated with a high H-Index value.

[Figure: scatter plot of the H-index (vertical axis, 0 to 60) against total publications (horizontal axis, 0 to 250). Each point is labelled with a field code: Med, Bio, Phy, Env, Psy, Che, Eng, Soc, Mat, Hum.]

Page 35: Advantages and drawbacks of bibliometrics

Consistency: Definition

Definition. A scientific performance measure is said to be consistent if and only if for any two actors A and B and for any number n ≥ 0 the ranking of A and B given by the performance measure does not change when A and B both have a new publication with n citations.
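The definition can be checked against the h-index with a small sketch. The two actors and their citation counts below are hypothetical and chosen for this illustration; they are not the actors shown in the presentation.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    # Counts are non-increasing and ranks increase, so the condition holds
    # exactly for the first h positions.
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


# Hypothetical actors: A has four papers with 4 citations each,
# B has three papers with 6 citations each.
actor_a = [4, 4, 4, 4]
actor_b = [6, 6, 6]
print(h_index(actor_a), h_index(actor_b))  # A ranks above B

# Both actors obtain the same new publication with n = 6 citations.
n = 6
print(h_index(actor_a + [n]), h_index(actor_b + [n]))  # now tied

# After a second identical publication the ranking is reversed.
print(h_index(actor_a + [n, n]), h_index(actor_b + [n, n]))
```

Under a consistent measure, identical additions to both actors would leave A ahead of B. Here the h-index first ties them and then ranks B above A.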


Page 36: Advantages and drawbacks of bibliometrics

Consistency: Motivation

• Consistency ensures that if the publishing behavior of two actors does not change over time, their ranking relative to each other also does not change.

• Consistency ensures that if the individual researchers in one research group X outperform the individual researchers in another research group Y, the former research group X as a whole outperforms the latter research group Y.


Page 37: Advantages and drawbacks of bibliometrics

Inconsistency of the h-index

[Figure: four plots of citations (vertical axis, 0 to 9) against publications (horizontal axis, 0 to 12) for Actor A and Actor B. The first pair of plots is annotated h = 4 and h = 6, the second pair h = 6 and h = 8.]

Page 38: Advantages and drawbacks of bibliometrics

ISI Impact Factors: calculation and validity

Page 39: Advantages and drawbacks of bibliometrics

Methodology: ISI’s classical IF

• The ISI Impact Factor (IF) is defined as the number of citations received by a journal in year t to items it published in the years t-1 and t-2, divided by the number of citeable documents in that same journal in the years t-1 and t-2.

• Or, as a formula:

\[ \mathrm{IF}_t = \frac{\text{citations in year } t}{\text{number of citeable documents in } t-1 \text{ and } t-2} \]

Page 40: Advantages and drawbacks of bibliometrics

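Share of ‘citations-for-free’ for The Lancet

------------------------------------------------------
Document type    Publications 90+91    Citations 1992
------------------------------------------------------
Article          784                   2986
Note             144                   593
Review           29                    232
Sub-total        957 (a)               7959 (b)
Letter           4181 (d)              4264 (e)
Editorial        1313                  905
Other            1421                  909
Total            7872                  14037 (c)
------------------------------------------------------

• ISI Method:

\[ \mathrm{IF} = \frac{\text{citations in 2000}}{\text{citeable documents in ‘98 and ‘99}} = \frac{14037\ (c)}{957\ (a)} = 14.7 \]

• CWTS Method:

\[ \mathrm{IF} = \frac{\text{citations to Art/Not/Rev in 2000}}{\text{Art/Not/Rev in ‘98 and ‘99}} = \frac{7959\ (b)}{957\ (a)} = 8.3 \]

\[ \mathrm{IF} = \frac{\text{citations to Art/Let/Not/Rev in 2000}}{\text{Art/Let/Not/Rev in ‘98 and ‘99}} = \frac{7959 + 4264\ (b + e)}{957 + 4181\ (a + d)} = 2.4 \]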
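The three impact factor variants differ only in which citations and which documents are counted. A short sketch using the labelled figures (a) to (e) from the slide; the variable names are chosen for this illustration.

```python
a = 957     # (a) articles, notes and reviews published
b = 7959    # (b) citations to articles, notes and reviews
c = 14037   # (c) citations to all document types
d = 4181    # (d) letters published
e = 4264    # (e) citations to letters

# ISI method: all citations in the numerator, only 'citeable' documents
# (articles, notes, reviews) in the denominator.
isi_if = c / a

# CWTS method: numerator and denominator cover the same document types.
cwts_if = b / a
cwts_if_with_letters = (b + e) / (a + d)

print(f"ISI IF                  = {isi_if:.1f}")
print(f"CWTS IF (Art/Not/Rev)   = {cwts_if:.1f}")
print(f"CWTS IF (incl. letters) = {cwts_if_with_letters:.1f}")
```

The gap between the first two values comes entirely from citations to letters, editorials and other items, which count in the ISI numerator while those items are left out of the denominator.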

Page 41: Advantages and drawbacks of bibliometrics

ISI Impact Factors

• From 1995 onwards CWTS has analyzed the uses and validity of the ISI Journal Impact Factor (IF).

• Most important points of criticism were:

– Calculated erroneously.

– Not sensitive to the composition of the journal in terms of document types.

– Not sensitive to the science fields a journal is attached to.

– Based on too short ‘citation windows’.

Page 42: Advantages and drawbacks of bibliometrics

Distribution of citations used for the calculation of the IF value of The Lancet

[Figure: stacked area chart (0% to 100%) for the years ‘82 to ‘98.]

• The IF-score of The Lancet is seriously ‘overrated’ by the scientific ‘audience’ of the journal.

• The red area indicates citations ‘for free’, while the blue area indicates ‘correct citations’

Page 43: Advantages and drawbacks of bibliometrics

Impact Factors for Br. J. Clin. Pharm. and Clin. Pharm. & Ther.

• The graph shows the correct and erroneous impact factors of BJCP and CPT

• In the case of CPT, citations to published meeting abstracts are included, while BJCP has stopped publishing meeting abstracts!

[Figure: line chart of impact factor values (vertical axis, 0.00 to 4.50) for the years ‘83 to ‘98, with four series: CPT Err IF, CPT IF, BJCP Err IF, BJCP IF.]

Page 44: Advantages and drawbacks of bibliometrics

Document types and fields

-----------------------------------------------------------------------------------------
Field                       Journal                         IF      Rank    JFIS    Rank
-----------------------------------------------------------------------------------------
IMMUNOLOGY                  ANN REV IMMUNOL                 50.49   1       5.18    1
BIOCHEM & MOLECULAR BIOL    ANN REV BIOCHEM                 34.61   1       4.10    3
PHARMACOL & PHARMACY        PHARMACOLOGICAL REV             27.74   1       4.75    1
CELL BIOL                   ANN REV CELL & DEVELOPM BIOL    27.53   1       1.72    13
DEVELOPMENTAL BIOL          ANN REV CELL & DEVELOPM BIOL    27.53   1       1.72    3
PHYSIOLOGY                  PHYSIOLOGICAL REV               24.82   1       3.18    1
CELL BIOLOGY                NATURE REV MOL CELL BIOL        22.21   4       2.76    8
ENDOCRINOL & METABOLISM     ENDOCRINE REV                   21.98   1       2.87    1
NEUROSCIENCES               ANN REV NEUROSCIENCE            21.89   1       3.12    4
PHYSICS                     REV MODERN PHYSICS              20.14   1       5.02    1
CHEMISTRY                   CHEMICAL REV                    19.67   1       2.89    2
-----------------------------------------------------------------------------------------

The IF is for ‘02, JFIS covers ‘98-‘02

Page 45: Advantages and drawbacks of bibliometrics

Fields and Citation windows

[Figure: horizontal bar chart (horizontal axis 0 to 4.5) with one bar per journal category, grouped into three panels.]

Chemistry: POLYMER SCIENCE (55); CHEM, APPLIED (25); CHEM, CLIN&MEDIC (8); CHEM, PHYSICAL (78); CRYSTALLOGRAPHY (18); ELECTROCHEMISTRY (10); CHEM, INORG&NUC (37); BIOCH & MOL BIOL (169); CHEM, ORGANIC (42); CHEMISTRY (128); CHEM, MISCELLAN (7); CHEM, ANALYTICAL (54)

Engineering sciences: ENG, INDUSTRIAL (14); ENG, MANUFACT (5); ENGINEERING (84); ENG, BIOMEDICAL (33); ENG, PETROLEUM (8); ENG, MECHANIC (69); ENG, CIVIL (49); ENG, ENVIRONM (6); ENG, CHEMICAL (69); ENG, MARINE (8); ENG, ELECTRICAL (127)

Physics: PHYSICS, MATHEMA (10); ACOUSTICS (20); THERMODYNAMICS (11); PHYSICS, FLUIDS (16); PHYSICS, MISCELL (6); PHYSICS, AT,M,C (22); OPTICS (37); PHYSICS, APPLIED (49); PHYSICS, COND MA (36); PHYSICS (85); PHYSICS, NUCLEAR (16); PHYSICS, PART&FI (11)

Page 46: Advantages and drawbacks of bibliometrics

Citation measurement of IF

Publication year    Citation years
2002                2002 2003 2004
2003                2003 2004 2005
2004                2004 2005 2006
2005                2005 2006 2007
2006                2006 2007 2008
2007                2007 2008 2009
2008                2008 2009
2009                2009

Page 47: Advantages and drawbacks of bibliometrics

CWTS answer to the problems of the IF

• This indicator is the JFIS, the Journal-to-Field Impact Score.

• The JFIS solves the main objections against the Impact Factor, as

– the calculation of JFIS is based on equally large entities,

– document types are taken into account,

– JFIS is field-normalized, and finally,

– JFIS is based on longer citation windows (1-4 years).

Page 48: Advantages and drawbacks of bibliometrics

Citation measurement of JFIS

Block        Publication year    Citation years
2002-2005    2002                2002 2003 2004 2005
             2003                2003 2004 2005
             2004                2004 2005
             2005                2005
2003-2006    2003                2003 2004 2005 2006
             2004                2004 2005 2006
             2005                2005 2006
             2006                2006
2004-2007    2004                2004 2005 2006 2007
             2005                2005 2006 2007
             2006                2006 2007
             2007                2007
2005-2008    2005                2005 2006 2007 2008
             2006                2006 2007 2008
             2007                2007 2008
             2008                2008
2006-2009    2006                2006 2007 2008 2009
             2007                2007 2008 2009
             2008                2008 2009
             2009                2009

Page 49: Advantages and drawbacks of bibliometrics

End of the presentation

For questions regarding the contents of the presentation, mail to: [email protected]