Bibliometric network analysis: Software tools, techniques, and an analysis of network science at Leiden University Ludo Waltman and Nees Jan van Eck Centre for Science and Technology Studies (CWTS), Leiden University LCN2 Seminar Leiden, November 27, 2015

Upload: nees-jan-van-eck

Post on 14-Apr-2017


Page 1: Bibliometric network analysis: Software tools, techniques, and an analysis of network science at Leiden University

Ludo Waltman and Nees Jan van Eck

Centre for Science and Technology Studies (CWTS), Leiden University

LCN2 Seminar

Leiden, November 27, 2015

Page 2: Centre for Science and Technology Studies (CWTS)

• Research center at Leiden University focusing on science and technology studies

• About 30 staff members

• History of more than 25 years in bibliometric and scientometric research

• Contract research

• Full access to large bibliographic databases (Web of Science and Scopus)

Page 3: Bibliographic databases: ‘Big data’

                Web of Science    Scopus
Journals        12,000            20,000
Publications    45 million        35 million
Citations       1 billion         0.9 billion

Page 4: Bibliometric networks

[Diagram: networks derived from a bibliographic database (Web of Science, Scopus)]

• Citation network of publications

• Co-authorship network of authors / organizations

• Co-citation network of pubs / authors / journals

• Co-occurrence network of terms

• Bibliographic coupling network of pubs / authors / journals
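As an illustration of how two of these networks can be derived from citation data, here is a minimal Python sketch. The toy reference lists and the function names are hypothetical and are not taken from the slides or from any of the tools mentioned; real database records would need parsing first.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing publication maps to the publications it cites.
references = {
    "P1": ["A", "B"],
    "P2": ["A", "B", "C"],
    "P3": ["C"],
}

def co_citation(references):
    """Count how often two publications are cited together by the same publication."""
    counts = Counter()
    for cited in references.values():
        for a, b in combinations(sorted(set(cited)), 2):
            counts[(a, b)] += 1
    return counts

def bibliographic_coupling(references):
    """Count the references shared by each pair of citing publications."""
    counts = Counter()
    for p, q in combinations(sorted(references), 2):
        shared = len(set(references[p]) & set(references[q]))
        if shared:
            counts[(p, q)] = shared
    return counts

cocit = co_citation(references)
coupling = bibliographic_coupling(references)
```

In this toy example A and B are co-cited twice, while P1 and P2 are coupled through two shared references.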

Page 5: Outline

• Software tools

• Network analysis techniques

• Analysis of network science

Page 6: Software tools

Page 7: Software tools

• VOSviewer (www.vosviewer.com)

– Tool for constructing and visualizing bibliometric networks

• CitNetExplorer (www.citnetexplorer.nl)

– Tool for visualizing and analyzing citation networks of publications

Page 8: VOSviewer

Page 9: Map of university co-authorship network

Page 10: Map of journal citation network

Page 11: CitNetExplorer

Page 12: VOSviewer and CitNetExplorer

VOSviewer

• Any type of bibliometric network

• Co-authorship, co-citation, and bibliographic coupling

• Time dimension is ignored

• Networks of at most ~10,000 nodes are supported

CitNetExplorer

• Only citation networks of publications

• Direct citation relations

• Time dimension is explicitly considered

• Millions of publications are supported

Page 13: Network analysis techniques

Page 14: Network analysis techniques

Layout:

• Visualization of similarities (VOS)

Community detection:

• Weighted modularity

• Smart local moving algorithm

Page 15: Clustering can be seen as mapping in a restricted space

Page 17: Unified approach to mapping and clustering

Minimize

$$Q(x_1, \ldots, x_n) = \sum_{i<j} \frac{2 m A_{ij}}{k_i k_j} d_{ij}^2 - \sum_{i<j} d_{ij}$$

where

$n$: number of nodes in the network

$m$: total weight of all edges in the network

$A_{ij}$: weight of edge between nodes $i$ and $j$

$k_i$: total weight of all edges of node $i$

Mapping

$x_i$: vector denoting the location of node $i$ in a $p$-dimensional space

$$d_{ij} = \|x_i - x_j\| = \sqrt{\sum_{k=1}^{p} (x_{ik} - x_{jk})^2}$$

Clustering

$x_i$: integer denoting the community to which node $i$ belongs

$\gamma$: resolution parameter

$$d_{ij} = \begin{cases} 0 & \text{if } x_i = x_j \\ 1/\gamma & \text{if } x_i \neq x_j \end{cases}$$
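The objective on this slide can be evaluated with a few lines of Python. The sketch below is only an illustration: it assumes a dense symmetric weight matrix with no isolated nodes, and the function names and the toy path network are hypothetical. The same function serves both interpretations; only the distance function changes.

```python
import math

def quality(A, x, distance):
    """Q = sum_{i<j} (2 m A_ij / (k_i k_j)) d_ij^2 - sum_{i<j} d_ij (to be minimized)."""
    n = len(A)
    k = [sum(row) for row in A]   # k_i: total edge weight of node i (assumed > 0)
    m = sum(k) / 2                # m: total edge weight in the network
    q = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = distance(x[i], x[j])
            q += (2 * m * A[i][j] / (k[i] * k[j])) * d ** 2 - d
    return q

def euclidean(xi, xj):
    """Mapping: x_i is a coordinate tuple in p-dimensional space."""
    return math.dist(xi, xj)

def make_cluster_distance(gamma):
    """Clustering: x_i is a community label; d_ij is 0 or 1/gamma."""
    def d(xi, xj):
        return 0.0 if xi == xj else 1.0 / gamma
    return d

# Hypothetical toy network: a path 0 - 1 - 2 - 3 with unit edge weights.
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

q_two_clusters = quality(A, [0, 0, 1, 1], make_cluster_distance(1.0))
q_one_cluster = quality(A, [0, 0, 0, 0], make_cluster_distance(1.0))
q_path_layout = quality(A, [(0.0,), (1.0,), (2.0,), (3.0,)], euclidean)
q_scrambled = quality(A, [(0.0,), (3.0,), (1.0,), (2.0,)], euclidean)
```

On this toy network, splitting the path in the middle gives a lower (better) value than a single cluster, and laying the nodes out in path order gives a lower value than a scrambled layout.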

Page 18: Unified approach: Clustering

Equivalent to a weighted variant of modularity-based community detection (Waltman et al., 2010)

Maximize

$$\hat{Q}(x_1, \ldots, x_n) = \frac{1}{2m} \sum_{i<j} \delta(x_i, x_j)\, w_{ij} \left( A_{ij} - \gamma \frac{k_i k_j}{2m} \right)$$

where $\delta(x_i, x_j)$ equals 1 if $x_i = x_j$ and 0 otherwise, and

$$w_{ij} = \frac{2m}{k_i k_j}$$
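A short derivation sketch shows why minimizing $Q$ under the clustering distance amounts to maximizing $\hat{Q}$. It assumes $d_{ij} = 0$ within a community and $d_{ij} = 1/\gamma$ otherwise, as defined for the clustering case. The shorthand $\delta_{ij} = \delta(x_i, x_j)$ and the constant $C$ are introduced here for illustration only.

$$
\begin{aligned}
d_{ij} &= \frac{1 - \delta_{ij}}{\gamma}, \qquad d_{ij}^2 = \frac{1 - \delta_{ij}}{\gamma^2} \\
Q &= \sum_{i<j} (1 - \delta_{ij}) \left( \frac{1}{\gamma^2} \frac{2 m A_{ij}}{k_i k_j} - \frac{1}{\gamma} \right) \\
  &= C - \frac{1}{\gamma^2} \sum_{i<j} \delta_{ij} \left( \frac{2 m A_{ij}}{k_i k_j} - \gamma \right) \\
  &= C - \frac{1}{\gamma^2} \sum_{i<j} \delta_{ij}\, w_{ij} \left( A_{ij} - \gamma \frac{k_i k_j}{2m} \right), \qquad w_{ij} = \frac{2m}{k_i k_j} \\
  &= C - \frac{2m}{\gamma^2}\, \hat{Q}
\end{aligned}
$$

Here $C = \sum_{i<j} \left( \frac{1}{\gamma^2} \frac{2 m A_{ij}}{k_i k_j} - \frac{1}{\gamma} \right)$ does not depend on the community assignment. Since $2m / \gamma^2 > 0$, any assignment that minimizes $Q$ also maximizes $\hat{Q}$.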

Page 19: Unified approach: Mapping

• Equivalent to the VOS (visualization of similarities) technique (Van Eck & Waltman, 2007)

• Limit case of multidimensional scaling (Van Eck et al., 2010)

VOS

$$Q = \sum_{i<j} \frac{2 m A_{ij}}{k_i k_j} \|x_i - x_j\|^2 - \sum_{i<j} \|x_i - x_j\|$$

MDS

$$\sigma = \sum_{i<j} W_{ij} \left( D_{ij} - \|x_i - x_j\| \right)^2$$

where the dissimilarities $D_{ij}$ and the weights $W_{ij}$ are functions of $A_{ij}$, $k_i$, $k_j$, and $m$.

Page 20: Unified approach

A commonly used clustering technique (modularity) and a commonly used mapping technique (MDS) can be brought together in a unified framework

[Diagram: the unified approach connects to modularity (weighted) for clustering and to VOS for mapping; VOS connects to MDS (limit case)]

Page 21: Louvain algorithm

• ‘Louvain algorithm’ (Blondel et al., 2008) is the most popular heuristic algorithm for large-scale modularity optimization

Page 22: Louvain algorithm

[Figure: Louvain algorithm on an example network. Labels: original network, local moving heuristic, reduced network, local moving heuristic. Modularity values shown: Q = 0.3791 and Q = 0.4151.]
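The two building blocks named in this figure, the local moving heuristic and the reduced network, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it optimizes standard modularity rather than the weighted variant from the earlier slides, it uses a dense weight matrix, and all function names are hypothetical.

```python
import random

def modularity(A, comm, gamma=1.0):
    """Standard modularity: (1/2m) * sum over same-community pairs of A_ij - gamma*k_i*k_j/2m."""
    n = len(A)
    k = [sum(row) for row in A]
    two_m = sum(k)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if comm[i] == comm[j]:
                q += A[i][j] - gamma * k[i] * k[j] / two_m
    return q / two_m

def local_moving(A, gamma=1.0, seed=0):
    """Move single nodes to the community with the largest modularity gain until no move helps."""
    n = len(A)
    k = [sum(row) for row in A]
    two_m = sum(k)
    comm = list(range(n))          # start from singleton communities
    tot = k[:]                     # tot[c]: total degree of community c
    rng = random.Random(seed)
    improved = True
    while improved:
        improved = False
        order = list(range(n))
        rng.shuffle(order)         # visit nodes in random order
        for i in order:
            old = comm[i]
            tot[old] -= k[i]       # temporarily take node i out of its community
            links = {}             # edge weight from i to each neighbouring community
            for j in range(n):
                if j != i and A[i][j] > 0:
                    links[comm[j]] = links.get(comm[j], 0.0) + A[i][j]
            best = old
            best_gain = links.get(old, 0.0) - gamma * k[i] * tot[old] / two_m
            for c, weight in links.items():
                gain = weight - gamma * k[i] * tot[c] / two_m
                if gain > best_gain:
                    best, best_gain = c, gain
            comm[i] = best
            tot[best] += k[i]
            if best != old:
                improved = True
    return comm

def reduce_network(A, comm):
    """Aggregate each community into one node; internal weight ends up on the diagonal."""
    labels = sorted(set(comm))
    index = {c: r for r, c in enumerate(labels)}
    R = [[0.0] * len(labels) for _ in labels]
    for i, row in enumerate(A):
        for j, w in enumerate(row):
            R[index[comm[i]]][index[comm[j]]] += w
    return R

# Hypothetical toy network: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
comm = local_moving(A)
R = reduce_network(A, comm)
```

A Louvain-style procedure would apply `local_moving` again to `R` and repeat until the partition stops changing. On this toy network the first pass already separates the two triangles.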

Page 23: Smart local moving algorithm

• Smart local moving algorithm extends the Louvain algorithm in two ways:

1. Multiple algorithm iterations, with output of one iteration serving as input for the next iteration

2. Recursive application of the local moving heuristic

Page 24: Smart local moving algorithm

[Figure: smart local moving algorithm on an example network. Labels: original network, local moving heuristic, local moving heuristic in subnetworks, reduced network. Modularity values shown: Q = 0.3791 and Q = 0.4198.]

Page 25: Empirical comparison (large networks)

• 6 networks

• Algorithms:

– Louvain (1 iteration)

– Louvain (10 iterations)

– Smart local moving (10 iterations)

• 10 algorithm runs using different random numbers

Page 26: Empirical comparison (large networks)

Network (nodes / edges)         Measure   Louvain   Louvain (iterative)   Smart local moving
Amazon (0.5M / 0.9M)            Qmin      0.9257    0.9293                0.9335
                                Qmax      0.9264    0.9299                0.9338
                                t         6         9                     28
DBLP (0.4M / 1.0M)              Qmin      0.8203    0.8243                0.8357
                                Qmax      0.8227    0.8271                0.8367
                                t         7         9                     26
IMDb (0.4M / 15.0M)             Qmin      0.6976    0.6994                0.7050
                                Qmax      0.7041    0.7052                0.7077
                                t         18        26                    100
LiveJournal (4.0M / 34.7M)      Qmin      0.7441    0.7578                0.7676
                                Qmax      0.7557    0.7658                0.7720
                                t         350       566                   1,549
WoS (10.6M / 104.5M)            Qmin      0.7714    0.7851                0.7918
                                Qmax      0.7786    0.7902                0.7957
                                t         6,800     8,398                 19,994
Web uk-2005 (39.5M / 783.0M)    Qmin      0.9793    0.9796                0.9801
                                Qmax      0.9795    0.9797                0.9801
                                t         11,006    11,736                17,074
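To make the trade-off in these numbers easier to read, the following Python sketch computes, for each network, the gain in Qmax of smart local moving over single-iteration Louvain and the ratio of their t values. The figures are copied from this slide; the dictionary layout and the function name are hypothetical.

```python
# (Qmax, t) per algorithm, copied from the comparison on this slide.
results = {
    "Amazon":      {"louvain": (0.9264, 6),     "slm": (0.9338, 28)},
    "DBLP":        {"louvain": (0.8227, 7),     "slm": (0.8367, 26)},
    "IMDb":        {"louvain": (0.7041, 18),    "slm": (0.7077, 100)},
    "LiveJournal": {"louvain": (0.7557, 350),   "slm": (0.7720, 1549)},
    "WoS":         {"louvain": (0.7786, 6800),  "slm": (0.7957, 19994)},
    "Web uk-2005": {"louvain": (0.9795, 11006), "slm": (0.9801, 17074)},
}

def compare(results):
    """Return {network: (gain in Qmax, ratio of t)} for smart local moving vs. Louvain."""
    summary = {}
    for name, row in results.items():
        q_louvain, t_louvain = row["louvain"]
        q_slm, t_slm = row["slm"]
        summary[name] = (round(q_slm - q_louvain, 4), round(t_slm / t_louvain, 2))
    return summary

summary = compare(results)
for name, (gain, ratio) in summary.items():
    print(f"{name}: Qmax gain {gain:.4f}, t ratio {ratio:.2f}")
```

In every network the modularity gain is positive, while the t ratio ranges from roughly 1.5 to about 5.6.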

Page 27: Large-scale analysis of the structure of science

Page 28: Algorithmic classification systems of science

• Publications (not journals) are clustered into research areas based on citation relations

• Research areas are defined at different levels of granularity and are organized hierarchically

• Clustering is performed using the smart local moving algorithm (improved Louvain algorithm; Waltman & Van Eck, 2013)

Page 29: Algorithmically constructed classification system of science

• 16.2 million publications from the period 2000–2014 indexed in Web of Science

• 241.7 million citation relations

• Classification system of 3 hierarchical levels:

– 28 broad disciplines

– 813 fields

– 3,822 subfields

Page 30: Breakdown of scientific literature into 3,822 subfields

[Figure: map of the subfields. Labeled areas: Social sciences and humanities; Biomedical and health sciences; Life and earth sciences; Physical sciences and engineering; Mathematics and computer science.]

Page 31: Publications in scientometrics subfield

Page 32: Time-line map of highly cited scientometrics publications

Page 33: Application: Exploring the interface between physical and medical sciences

Page 34: Application: Emerging research areas in physics

[Figure: map of physics research areas. Labeled areas: Particle physics; Astronomy and astrophysics; Optics; Applied physics; Atomic, molecular, and chemical physics; Condensed matter physics.]

Page 35: CWTS Leiden Ranking

Page 36: Analyzing the structure and evolution of network science

Page 37: Network science according to Wikipedia

Network science is an interdisciplinary academic field which studies complex networks such as telecommunication networks, computer networks, biological networks, cognitive and semantic networks, and social networks. The field draws on theories and methods including graph theory from mathematics, statistical mechanics from physics, data mining and information visualization from computer science, inferential modeling from statistics, and social structure from sociology.

Page 38: Networks textbook by Mark Newman

The scientific study of networks, including computer networks, social networks, and biological networks, has received an enormous amount of interest in the last few years. (...) The study of networks is broadly interdisciplinary and important developments have occurred in many fields, including mathematics, physics, computer and information sciences, biology, and the social sciences.

Page 39: Journal of Complex Networks

The journal covers everything from the basic mathematical, physical and computational principles needed for studying complex networks to their applications leading to predictive models in molecular, biological, ecological, informational, engineering, social, technological and other systems.

Page 40: Network Science journal

Network Science is a new journal for a new discipline - one using the network paradigm, focusing on actors and relational linkages, to inform research, methodology, and applications from many fields across the natural, social, engineering and informational sciences.

Page 41: Popular network terms

[Figure: popular network terms. Labeled terms: neural network; social network; wireless sensor network; complex network; wireless network; regulatory network.]

Page 42: Network publications

• Web of Science database

• Time period 1992–2014

• Research articles and review articles

• ‘network’ or ‘graph’ in title or abstract

• 0.7 million publications
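The selection criteria on this slide can be expressed as a simple filter. The Python sketch below is an illustration only: the record fields (`title`, `abstract`, `year`, `doc_type`) and the document type labels are hypothetical, and plain case-insensitive substring matching is an assumption about how the terms were matched.

```python
def is_network_publication(record):
    """Apply the slide's criteria: period, document type, and 'network' or 'graph' in the text."""
    text = (record["title"] + " " + record["abstract"]).lower()
    return (
        1992 <= record["year"] <= 2014
        and record["doc_type"] in {"Article", "Review"}
        # Substring matching also accepts plurals such as "networks"
        # and unrelated words such as "graphene".
        and ("network" in text or "graph" in text)
    )

# Hypothetical toy records.
records = [
    {"title": "Community structure in social networks", "abstract": "", "year": 2004, "doc_type": "Article"},
    {"title": "A survey of random graph models", "abstract": "", "year": 2010, "doc_type": "Review"},
    {"title": "Neural network training", "abstract": "", "year": 1990, "doc_type": "Article"},
    {"title": "Protein folding", "abstract": "No relevant terms here.", "year": 2005, "doc_type": "Article"},
    {"title": "Network analysis editorial", "abstract": "", "year": 2012, "doc_type": "Editorial"},
]

selected = [r for r in records if is_network_publication(r)]
```

Only the first two toy records pass: the third falls outside the period, the fourth lacks both terms, and the fifth has the wrong document type.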

Page 43: Number of network publications per year

Page 44: Co-occurrence relations between terms in network publications

[Figure: term map. Labeled areas: Biology; Neuroscience; Social science; Chemistry; Mathematics; Computer science.]

Page 46: Network fields

• Network publications are clustered into fields

• Based on 3.1 million citation relations between network publications

• Clustering methodology of Waltman and Van Eck (2012, 2013)

• Publications in the same journal are assigned to the same cluster, except for multidisciplinary journals

• 13 main clusters, covering 97% of all 0.7 million network publications

Page 47: Number of network publications per field

Page 48: Citation relations between journals with ≥ 100 network publications

[Figure: journal map. Labeled areas: Computer science; Mathematics; Physics; Neuroscience; Biology; Chemistry.]

Page 49: Convergence toward an integrated network science field?

Number of citations between network fields (× 100; 5-year citation window)

[Figure: two diagrams, for 2004 and 2014, linking the fields Physics, Math, CS, Biology, SS, and Neuro; the links are labeled with citation counts.]

Page 50: Convergence toward an integrated network science field?

% of publications in each of two fields citing at least one publication in the other field (5-year citation window)

[Figure: two diagrams, for 2004 and 2014, linking the fields Physics, Math, CS, Biology, SS, and Neuro; the links are labeled with percentages.]
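One possible reading of this indicator is sketched below in Python. Several details are assumptions rather than facts from the slides: the share is computed over the publications of both fields in a given year, and a citation falls inside the window when the cited publication appeared in the citing year or one of the four preceding years. The data layout and the function name are hypothetical.

```python
def cross_field_share(pubs, citations, field_a, field_b, year, window=5):
    """Percentage of publications from `year` in either field that cite the other field.

    pubs: publication id -> (field, publication year)
    citations: iterable of (citing id, cited id) pairs
    """
    other = {field_a: field_b, field_b: field_a}
    candidates = {p for p, (f, y) in pubs.items() if f in other and y == year}
    citing = set()
    for src, dst in citations:
        if src not in candidates or dst not in pubs:
            continue
        dst_field, dst_year = pubs[dst]
        in_window = 0 <= year - dst_year < window   # assumed window definition
        if dst_field == other[pubs[src][0]] and in_window:
            citing.add(src)
    return 100.0 * len(citing) / len(candidates) if candidates else 0.0

# Hypothetical toy data.
pubs = {
    "p1": ("Physics", 2014), "p2": ("Physics", 2014),
    "s1": ("SS", 2014), "s2": ("SS", 2012), "s3": ("SS", 2005),
    "m1": ("Math", 2013),
}
citations = [("p1", "s2"), ("p2", "s3"), ("s1", "m1"), ("p2", "m1")]

share_5 = cross_field_share(pubs, citations, "Physics", "SS", 2014)
share_10 = cross_field_share(pubs, citations, "Physics", "SS", 2014, window=10)
```

With the 5-year window only `p1` counts, giving one of three publications; widening the window to 10 years also admits the citation from `p2` to `s3`.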

Page 51: Convergence of social science and physics

Page 52: Citation relations between journals at the SS-physics interface (2005–2014)

[Figure: journal map. Labeled areas and journals: Scientometrics; Economics; Sociology and SNA; Physica A; PRE; PRL; PLOS ONE; PNAS; Nature; Science; Sci. Rep.; JSTAT; EPL; EPJ B.]

Page 53: Leiden University’s institutes with most publications on network science

• LUMC

• Leiden Institute of Advanced Computer Science (Science)

• Leiden Institute of Chemistry (Science)

• Leiden Institute of Physics (Science)

• Institute of Psychology (FSW)

• Mathematical Institute (Science)

• Leiden Observatory (Science)

• Institute of Biology Leiden (Science)

• Centre for Science and Technology Studies (FSW)

Page 54: Citation relations between journals with ≥ 100 network publications

[Figure: journal map. Labeled areas: Computer science; Mathematics; Physics; Neuroscience; Biology; Chemistry.]

Page 55: Leiden University’s publication output in network science journals

Page 56: Leiden University’s publication output in network science journals

[Figure: journal map annotated with Leiden institutes. Labels: CWTS; Leiden Institute of Chemistry; LIACS; Leiden Institute of Physics; Institute of Psychology; LUMC; Institute of Biology Leiden; Mathematical Institute.]

Page 57: Conclusions

• Network research has increased tremendously during the past 10–15 years

• Network research covers many fields of science, but there is only limited evidence of increasing integration

• Network research in social science and physics is becoming more connected

• Leiden University contributes to all major areas of network research, although the contribution in the area of computer science is somewhat modest

Page 58: Do it yourself!

www.vosviewer.com

www.citnetexplorer.nl

Page 59: Thank you for your attention!