Research metrics and the open
science development
Dr. Thed van Leeuwen
Center for Science & Technology Studies (CWTS)
TU Delft seminar on “Are you ready to publish?”, September 20th, 2017
Outline
• Bibliometrics and research management context
• Infamous bibliometric indicators
• … return to the production of scientific knowledge, and how to open that up!
• … so, how about the classical bibliometric approach?
• Take-home messages
What is bibliometrics?
• Bibliometrics can be defined as the quantitative analysis of science and technology (development), and the study of cognitive and organizational structures in science and technology.
• The basis for these analyses is the scientific communication between scientists, mainly through journal publications.
• Key concepts in bibliometrics are output and impact, as measured through publications and citations.
• An important starting point in bibliometrics: through the citations in their scientific publications, scientists express a certain degree of influence of others on their own work.
• Quantified on a large scale, citations indicate the (inter)national influence or (inter)national visibility of scientific activity, but should not be interpreted as a synonym for ‘quality’.
‘Classical’ image of the credibility cycle
[Diagram: the credibility cycle, adapted from Latour & Woolgar (1979) and Rip (1990), annotated ‘PEER REVIEW’]
Rise of performance indicators & bibliometrics
External push: the need for formalised measures
• ‘Push’ from science policy (from the 1970s onwards)
• Independent of peer review
• New Public Management / neo-liberalism (from the 1980s onwards)
Internal push: the matrix structure of science (Whitley)
• Researchers are part of an international community (peer review)
• But also part of local institutions (specific management practices, e.g. appraisals, external evaluations)
• Institute managers are not always part of the international expert community
• Tighter forms of management (from the 1990s onwards)
Extended credibility cycle
‘Citation score’ is here something of a metaphor:
• In a direct sense, we measure real impact, comparing actual and expected citation values
• In an indirect sense, we use derivatives, such as the JIF and the h-index …
Definitions of Journal Impact Factor & Hirsch Index
• Definition of the JIF:
– The mean citation score of a journal in year T: the citations received in year T by documents published in years T-1 and T-2, divided by the number of citable documents published in those two years.
• Definition of the h-index:
– The ‘impact’ of a researcher: with the papers of an oeuvre sorted by received citations in descending order, the h-index is the highest rank position at which the number of received citations still equals or exceeds that rank.
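Both definitions above can be sketched in a few lines of Python; the numbers in the example are invented purely for illustration:

```python
def journal_impact_factor(citations_in_T, citable_docs_T1, citable_docs_T2):
    """JIF for year T: citations received in year T by documents published
    in years T-1 and T-2, divided by the citable documents of those years."""
    return citations_in_T / (citable_docs_T1 + citable_docs_T2)

def h_index(citation_counts):
    """Largest h such that h papers have each received at least h citations."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical journal: 600 citations in year T, 100 citable documents
# in each of the two preceding years.
print(journal_impact_factor(600, 100, 100))  # 3.0

# Hypothetical oeuvre of six papers with these citation counts.
print(h_index([10, 8, 5, 4, 3, 0]))          # 4
```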
Problems with JIF
• Methodological issues
– Was/is calculated erroneously (Moed & van Leeuwen, 1996)
– Not field normalized
– Not document type normalized
– Underlying citation distributions are highly skewed (Seglen, 1994)
• Conceptual/general issues
– Inflation (van Leeuwen & Moed, 2002)
– Availability promotes journal publishing
– Is based on expected values only
– Stimulates one-indicator thinking
– Ignores other scholarly virtues
Problems with H-index
• Bibliometric-mathematical issues
– Mathematically inconsistent (Waltman & van Eck, 2012)
– Conservative: it can only increase …
– Not field normalized (van Leeuwen, 2008)
• Bibliometric-methodological issues
– How to define an author?
– In which bibliographic/metric environment? (Bar-Ilan, 2006)
• Conceptual/general issues
– Favors age, experience, and high productivity (Costas & Bordons, 2006)
– No relationship with research quality
– Ignores other elements of scholarly activity
– Promotes one-indicator thinking
Research cycle, or knowledge production process
[Diagram: the research cycle — conceptualization, data gathering, analysis, publication, review]
Research cycle & Open Science trends
[Diagram: the research cycle (conceptualization, data gathering, analysis, publication, review), annotated with Open Science trends: citizen science, open code, pre-prints, open access, data-intensive research, open labbooks/open data, open annotation, scientific blogs, collaborative bibliographies, collaborative workflows, alternative reputation systems]
Adding altmetric techniques to the Open Science model *
[Diagram: the same research cycle and Open Science trends, now with example platforms: the DOAJ list, RoarEprints.org, ArXiv, RunMyCode.org, SciStarter.com, FigShare.com, MyExperiment.org, dataDryad.org, OpenAnnotation.org, Researchgate.com, Mendeley.com, AltMetric.com, Academia.edu, SlideShare.com, ImpactStory.com]
* Thanks to colleagues from Technopolis
Some conclusions …
• Classical bibliometrics mainly focuses on output- and impact-related issues.
• Altmetric techniques describe other elements of the knowledge production process.
• However, Open Science has not yet landed to the same extent in all domains of scholarly activity.
• Nor have altmetric techniques and data matured far enough to be used to their full extent in a science policy context.
Results of the OA labeling analysis
Output for EU countries:
• Covers the period 2009–2016
• WoS articles, reviews, and letters
• Rather arbitrary threshold of 25%!
• Colour indicates the penetration of OA:
– Blue: OA ≤ 25%
– Red: OA > 25%
• Europe is becoming more OA focused!
Country          %OA in 2015   %OA in 2016
LATVIA           20%           21%
ROMANIA          20%           20%
BULGARIA         23%           25%
GREECE           23%           27%
LITHUANIA        23%           22%
MALTA            23%           30%
CYPRUS           27%           31%
ITALY            27%           33%
POLAND           27%           29%
CZECH REPUBLIC   28%           30%
SLOVAKIA         28%           26%
FINLAND          30%           33%
GERMANY          30%           33%
ESTONIA          31%           33%
HUNGARY          31%           35%
AUSTRIA          32%           36%
DENMARK          32%           38%
LUXEMBOURG       32%           40%
PORTUGAL         32%           36%
SLOVENIA         32%           37%
SPAIN            32%           37%
FRANCE           33%           36%
BELGIUM          34%           41%
GREAT BRITAIN    34%           44%
IRELAND          34%           39%
SWEDEN           34%           38%
NETHERLANDS      37%           43%
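The colour-coding rule of the analysis can be sketched as a simple threshold test; the function name is hypothetical, and the small sample of countries uses the 2016 percentages from the table above:

```python
# Colour-coding rule from the slide: blue if the OA share is at or below the
# (rather arbitrary) 25% threshold, red if above.
oa_share_2016 = {"ROMANIA": 20, "BULGARIA": 25, "NETHERLANDS": 43, "GREAT BRITAIN": 44}

def oa_colour(share_pct, threshold=25):
    """Return the map colour for a country's OA share in percent."""
    return "red" if share_pct > threshold else "blue"

for country, share in oa_share_2016.items():
    print(country, oa_colour(share))  # ROMANIA and BULGARIA blue, the rest red
```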
Distinguish between Gold and Green OA
Output for EU countries:
• Green OA focus is in North-Western Europe
• Gold OA focus is in many Eastern European countries
• Possible reasons for this:
1) Development of infrastructure in North-Western Europe?
2) A stronger grip of the publishing industry on Eastern European countries?
[Bar chart: % Gold OA versus % Green OA per country (0–100%), in chart order: Great Britain, France, Ireland, Netherlands, Belgium, Austria, Germany, Denmark, Luxembourg, Hungary, Sweden, Finland, Italy, Portugal, Spain, Greece, Malta, Latvia, Slovenia, Bulgaria, Estonia, Czech Republic, Cyprus, Slovakia, Poland, Romania, Lithuania]
OA, international cooperation, & research impact
[Bar chart: normalised impact (0.00–2.50) for Austria, Belgium, Denmark, Finland, the Netherlands, and Sweden, comparing all output, all OA output, and all OA output from international cooperation]
Smaller EU countries:
• First-time ‘proof’ of the effect of OA at the national scale, with a full set of WoS papers
• Green OA is a game changer when included in the analyses
• OA output from international cooperation reaches even higher impact levels
Take-home messages on journals
• Journals tend to publish positive/confirming results.
• Editorial boards are driven by market shares as well!
• Therefore, selection is harsh and rejection rates are high
Take-home messages on data
• Data are not frequently published.
• Therefore, they yield no credit for their producers!
• This keeps most scientific work non-transparent
• Databases with negative results are necessary
Take-home messages on bibliometrics
• Ask yourself the question “What do I want to measure?”
• … and also “Can that be measured?”
• Take care of proper data collection procedures.
• Then, always use actual and expected citation data.
• Apply various normalization procedures (field, document type, age)
• Always use a variety of indicators.
• Always include various elements of scholarly activity.
• And perhaps most important, include peer review in your assessment procedures!
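The advice to combine actual and expected citation data can be illustrated with an MNCS-style sketch: each paper's actual citations are divided by the expected (world-average) rate for its field, document type, and age. The expected values below are invented for illustration, not real field baselines:

```python
# Each paper carries its actual citation count and the expected count for
# a paper of its field, document type, and publication year (invented numbers).
papers = [
    {"citations": 12, "expected": 6.0},  # cited at twice the world average
    {"citations": 3,  "expected": 6.0},  # cited at half the world average
    {"citations": 0,  "expected": 2.0},  # uncited
]

def normalised_impact(papers):
    """Mean of the actual/expected ratios (an MNCS-style indicator).
    A value of 1.0 means citation impact at the world average."""
    return sum(p["citations"] / p["expected"] for p in papers) / len(papers)

print(round(normalised_impact(papers), 2))  # (2.0 + 0.5 + 0.0) / 3 = 0.83
```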
Take-home messages on OA publishing
• Green OA publishing seems to be rewarding for scholars
• So repositories are important here
• Gold OA is often very costly
• And perhaps most important, keep in mind that the focus on OA is a temporary issue!
Development of authorship across all domains of scholarly activity
[Chart: development of authorship per discipline (scale 1.00–7.00), for disciplines ranging from multidisciplinary journals, the medical, life, and natural sciences, and engineering, through the social sciences, to the humanities (history, philosophy and religion; creative arts, culture and music; literature)]
Example WoS record and its reference coverage:

AU: Moed, HF; Garfield, E.
TI: In basic science the percentage of ‘authoritative’ references decreases as bibliographies become shorter
SO: SCIENTOMETRICS 60 (3): 295-303, 2004 — in WoS: Y

RF (cited references — in WoS?):
• ABT HA, J AM SOC INF SCI T, v 53, p 1106, 2004 — Y
• GARFIELD, E., CITATION INDEXING, 1979 (book!) — N
• GARFIELD E, ESSAYS INFORMATION S, v 8, p 403, 1985 — N
• GILBERT GN, SOC STUDIES SCI, v 7, p 113, 1977 — Y
• MERTON RK, ISIS, v 79, p 606, 1988 — Y
• ROUSSEAU R, SCIENTOMETRICS, v 43, p 63, 1998 — Y
• ZUCKERMAN H, SCIENTOMETRICS, v 12, p 329, 1987 — Y

WoS coverage = 5/7 = 71% (references marked N are not in WoS)
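The coverage figure follows directly from the Y/N flags of the cited references; a minimal sketch:

```python
# Coverage of a record's references: the share of cited references that are
# themselves indexed in the database. Flags follow the record above (Y=True).
in_wos = [True, False, False, True, True, True, True]

coverage = sum(in_wos) / len(in_wos)
print(f"{coverage:.0%}")  # 71%
```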
WoS Coverage in 2010 across disciplines
• Black: excellent coverage (>80%)
• Blue: good coverage (60–80%)
• Green: moderate coverage (50–60%)
• Orange: moderate coverage (40–50%)
• Red: poor coverage (highly problematic, below 40%)
[Bar chart: % coverage of references in WoS per discipline, with the number of 2010 publications in parentheses, in chart order: Basic Life Sciences (99,991); Biomedical Sciences (105,156); Multidisciplinary Journals (8,999); Chemistry and Chemical Engineering (118,141); Clinical Medicine (224,983); Astronomy and Astrophysics (12,932); Physics and Materials Science (137,522); Basic Medical Sciences (18,450); Biological Sciences (60,506); Agriculture and Food Science (26,709); Instruments and Instrumentation (8,485); Earth Sciences and Technology (33,160); Psychology (24,244); Environmental Sciences and Technology (42,705); Mechanical Engineering and Aerospace (20,336); Health Sciences (29,213); Energy Science and Technology (15,021); Mathematics (27,873); Statistical Sciences (11,263); General and Industrial Engineering (8,756); Civil Engineering and Construction (8,430); Economics and Business (16,243); Electrical Engineering and Telecommunication (…); Management and Planning (7,201); Computer Sciences (23,687); Educational Sciences (9,917); Information and Communication Sciences (4,006); Social and Behavioral Sciences, Interdisciplinary (…); Sociology and Anthropology (9,907); Law and Criminology (5,299); Language and Linguistics (3,514); Political Science and Public Administration (6,423); History, Philosophy and Religion (11,753); Creative Arts, Culture and Music (6,147); Literature (4,786)]
Some clear ‘perversions’ of the system … ?
• “You call me, I call you”
• When time is passing by …
• Salami slicing to boost an academic career
• Multiple authorship (without seriously contributing)
• Putting your name on everything your unit produces
• The role of self citations
• Jumping on hypes and fashionable issues
Deconstructing the myth of the JIF…
• Take the Dutch output
• Group journals into classes of similar journal impact
• Focus on publications that belong to the top 10% of their field
[Bar chart: share of publications in the top 10% of their field (0%–40%), per journal impact class: A (0 < MNJS ≤ 0.40), B (0.40 < MNJS ≤ 0.80), C (0.80 < MNJS ≤ 1.20), D (1.20 < MNJS ≤ 1.60), E (MNJS > 1.60)]
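The five impact classes used in this analysis can be sketched as a simple binning function; the class boundaries are the MNJS intervals from the chart, and the function name is hypothetical:

```python
def mnjs_class(mnjs):
    """Bin a journal by its mean normalised journal score (MNJS)
    into the impact classes A (lowest) through E (highest)."""
    if mnjs <= 0.40:
        return "A"
    if mnjs <= 0.80:
        return "B"
    if mnjs <= 1.20:
        return "C"
    if mnjs <= 1.60:
        return "D"
    return "E"

print([mnjs_class(x) for x in (0.3, 0.9, 2.1)])  # ['A', 'C', 'E']
```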
The problem of fields and h-index …
• Spinoza candidates, across all domains …
• Use output, normalized impact, and h-index
[Two scatter plots for Spinoza candidates, points labelled by field (Med, Bio, Phy, Che, Psy, Eng, Env, Soc, Mat, Hum): field-normalised impact (CPP/FCSm, 0.00–7.00) versus total publications (0–250), and h-index (0–60) versus total publications (0–250)]
In what database context … ?
Database         H-index   Based upon …
Web of Science   14        Articles in journals
Scopus           25        Articles, book (chapters), and conference proceedings papers
Google Scholar   33        All types, incl. reports

Selected my own publications in WoS and Scopus; Google Scholar has a pre-set profile.