Applications of community detection in bibliometric network analysis

Nees Jan van Eck, Centre for Science and Technology Studies (CWTS), Leiden University. EURANDOM workshop “Networks with community structure”, Eindhoven, January 24, 2014.

DESCRIPTION

In this talk, we focus on the analysis of bibliometric networks, and in particular on the detection of communities in these networks. We start by demonstrating VOSviewer, a popular software tool for visualizing bibliometric networks. We discuss the techniques used by VOSviewer for visualizing bibliometric networks and for detecting communities in these networks. We pay special attention to the close relationship between visualization and community detection, and we discuss the unified approach to visualization and community detection that is implemented in VOSviewer. We then shift our attention to community detection in very large citation networks, including millions of publications and hundreds of millions of citation relations. We show how community detection techniques can be used to construct highly detailed classification systems of science. We also discuss applications of such classification systems to science policy questions. Finally, we demonstrate CitNetExplorer, a new software tool in which community detection techniques are used to support the large-scale analysis of citation networks. We use CitNetExplorer to analyze the citation network of publications on network science and in particular on community detection.

TRANSCRIPT

Page 1: Applications of community detection in bibliometric network analysis

Applications of community detection in bibliometric network analysis

Nees Jan van Eck

Centre for Science and Technology Studies (CWTS), Leiden University

EURANDOM workshop “Networks with community structure”, Eindhoven

January 24, 2014

Page 2: Applications of community detection in bibliometric network analysis

Outline

• Bibliometric network analysis at CWTS

• VOSviewer

• Unified approach to visualization and community detection

• Community detection in large citation networks

• CitNetExplorer

Page 3: Applications of community detection in bibliometric network analysis

Bibliometric network analysis at CWTS

Page 4: Applications of community detection in bibliometric network analysis

Bibliometric network analysis at CWTS

• In-house databases:

– Thomson Reuters Web of Science

– Elsevier Scopus

• Bibliometric networks:

– Publication citation networks

– Journal co-citation/bibliographic coupling networks

– Term co-occurrence networks

– Co-authorship networks

– Etc.

• Applications:

– Research institutions: Research assessment

– Scientific publishers: Journal profiling

– Funding agencies: Science policy analyses
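The co-citation and bibliographic coupling networks listed above can be derived from a plain list of citation relations. The sketch below is a minimal illustration at the level of publications (the slide mentions journal-level networks, which would need an extra aggregation step). The function name and the toy data are hypothetical and not taken from the talk.

```python
from collections import defaultdict
from itertools import combinations


def coupling_and_cocitation(citations):
    """Derive two undirected weighted networks from (citing, cited) pairs.

    Bibliographic coupling weight: number of references two publications share.
    Co-citation weight: number of publications that cite both publications.
    """
    refs = defaultdict(set)      # publication -> publications it cites
    cited_by = defaultdict(set)  # publication -> publications citing it
    for citing, cited in citations:
        refs[citing].add(cited)
        cited_by[cited].add(citing)

    coupling = {}
    for a, b in combinations(sorted(refs), 2):
        shared = len(refs[a] & refs[b])
        if shared:
            coupling[(a, b)] = shared

    cocitation = {}
    for a, b in combinations(sorted(cited_by), 2):
        both = len(cited_by[a] & cited_by[b])
        if both:
            cocitation[(a, b)] = both

    return coupling, cocitation


# Toy data (hypothetical): P3 and P4 cite P1 and P2, P5 cites only P2.
citations = [("P3", "P1"), ("P3", "P2"), ("P4", "P1"), ("P4", "P2"), ("P5", "P2")]
coupling, cocitation = coupling_and_cocitation(citations)
```

The pairwise loops are quadratic in the number of publications, so this form only suits small examples.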

Page 5: Applications of community detection in bibliometric network analysis

VOSviewer (www.vosviewer.com)

Page 6: Applications of community detection in bibliometric network analysis

VOSviewer

(Van Eck & Waltman, Scientometrics, 2010)

Page 7: Applications of community detection in bibliometric network analysis

Citation network of fields in Web of Science

Page 8: Applications of community detection in bibliometric network analysis

Co-occurrence network of terms in clinical neurology

Page 9: Applications of community detection in bibliometric network analysis

Unified approach to visualization and community detection

Page 10: Applications of community detection in bibliometric network analysis

Visualization vs. community detection

• Visualization (‘mapping’):

– Assigning the nodes in a network to locations in a (usually two-dimensional) space

• Community detection (‘clustering’):

– Partitioning the nodes in a network into a number of groups

Page 11: Applications of community detection in bibliometric network analysis

Community detection seen as visualization in a restricted space

Page 12: Applications of community detection in bibliometric network analysis

Community detection seen as visualization in a restricted space

Page 13: Applications of community detection in bibliometric network analysis

Unified approach to visualization and community detection

Minimize

$$Q(x_1, \ldots, x_n) = \sum_{i<j} \frac{2m A_{ij}}{k_i k_j} d_{ij}^2 - \sum_{i<j} d_{ij}$$

where

$n$: number of nodes in the network

$m$: total weight of all edges in the network

$A_{ij}$: weight of edge between nodes $i$ and $j$

$k_i$: total weight of all edges of node $i$

Visualization

$x_i$: vector denoting the location of node $i$ in a $p$-dimensional space

$$d_{ij} = \|x_i - x_j\| = \sqrt{\sum_{k=1}^{p} (x_{ik} - x_{jk})^2}$$

Community detection

$x_i$: integer denoting the community to which node $i$ belongs

$\gamma$: resolution parameter

$$d_{ij} = \begin{cases} 0 & \text{if } x_i = x_j \\ 1/\gamma & \text{if } x_i \neq x_j \end{cases}$$
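As a minimal sketch of how one objective function serves both purposes, the code below evaluates Q = Σ_{i<j} (2m A_ij / (k_i k_j)) d_ij² − Σ_{i<j} d_ij with a pluggable distance: Euclidean for visualization, and 0 or 1/γ for community detection. The function names, the toy network (two triangles joined by one edge) and the coordinates are illustrative assumptions. The network is assumed to be undirected with no isolated nodes.

```python
import math
from itertools import combinations


def unified_objective(A, distance):
    """Evaluate Q for a symmetric weight matrix A and a distance function d(i, j)."""
    n = len(A)
    k = [sum(row) for row in A]   # k_i: total edge weight of node i
    m = sum(k) / 2.0              # m: total edge weight in the network
    q = 0.0
    for i, j in combinations(range(n), 2):
        s_ij = 2.0 * m * A[i][j] / (k[i] * k[j])
        d_ij = distance(i, j)
        q += s_ij * d_ij ** 2 - d_ij
    return q


def euclidean(positions):
    """Visualization: x_i is a location in p-dimensional space."""
    return lambda i, j: math.dist(positions[i], positions[j])


def community_distance(labels, gamma=1.0):
    """Community detection: x_i is a community label, gamma the resolution."""
    return lambda i, j: 0.0 if labels[i] == labels[j] else 1.0 / gamma


# Toy network (hypothetical): two triangles joined by the edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
q_split = unified_objective(A, community_distance([0, 0, 0, 1, 1, 1]))
q_single = unified_objective(A, community_distance([0] * 6))
q_layout = unified_objective(
    A, euclidean([(0, 0), (0, 1), (1, 0.5), (2, 0.5), (3, 0), (3, 1)])
)
```

Because Q is minimized, the split into the two triangles (lower Q) is preferred over placing all nodes in a single community.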

Page 14: Applications of community detection in bibliometric network analysis

Unified approach: Community detection

Equivalent to a weighted variant of modularity-based community detection (Waltman et al., 2010)

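Maximize

$$\hat{Q}(x_1, \ldots, x_n) = \frac{1}{2m} \sum_{i<j} \delta(x_i, x_j)\, w_{ij} \left( A_{ij} - \gamma \frac{k_i k_j}{2m} \right)$$

where $\delta(x_i, x_j)$ equals 1 if $x_i = x_j$ and 0 otherwise, and

$$w_{ij} = \frac{2m}{k_i k_j}$$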
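The equivalence can be checked numerically. Substituting the community-detection distance (0 within a community, 1/γ between communities) into the unified objective gives Q = C − (2m/γ²)·Q̂, where C = (1/γ²) Σ_{i<j} (2m A_ij/(k_i k_j) − γ) does not depend on the partition, so minimizing Q is the same as maximizing Q̂. The sketch below verifies this identity on a toy network. All function names and the data are illustrative assumptions.

```python
from itertools import combinations


def degrees(A):
    """Return (k, m): node weights k_i and total edge weight m."""
    k = [sum(row) for row in A]
    return k, sum(k) / 2.0


def unified_q(A, labels, gamma):
    """Unified objective with d_ij = 0 (same community) or 1/gamma (different)."""
    k, m = degrees(A)
    q = 0.0
    for i, j in combinations(range(len(A)), 2):
        d = 0.0 if labels[i] == labels[j] else 1.0 / gamma
        q += 2.0 * m * A[i][j] / (k[i] * k[j]) * d * d - d
    return q


def weighted_modularity(A, labels, gamma):
    """Weighted modularity variant with weights w_ij = 2m / (k_i k_j)."""
    k, m = degrees(A)
    total = 0.0
    for i, j in combinations(range(len(A)), 2):
        if labels[i] == labels[j]:
            w_ij = 2.0 * m / (k[i] * k[j])
            total += w_ij * (A[i][j] - gamma * k[i] * k[j] / (2.0 * m))
    return total / (2.0 * m)


def partition_independent_constant(A, gamma):
    """C = (1 / gamma^2) * sum_{i<j} (2m A_ij / (k_i k_j) - gamma)."""
    k, m = degrees(A)
    pairs = combinations(range(len(A)), 2)
    return sum(2.0 * m * A[i][j] / (k[i] * k[j]) - gamma for i, j in pairs) / gamma ** 2


def identity_gap(A, labels, gamma):
    """Absolute difference between Q and C - (2m / gamma^2) * Q_hat."""
    _, m = degrees(A)
    rhs = partition_independent_constant(A, gamma) - (
        2.0 * m / gamma ** 2
    ) * weighted_modularity(A, labels, gamma)
    return abs(unified_q(A, labels, gamma) - rhs)


# Toy network (hypothetical): two triangles joined by the edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
gap = identity_gap(A, [0, 0, 0, 1, 1, 1], gamma=2.0)
```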

Page 15: Applications of community detection in bibliometric network analysis

Unified approach: Visualization

• Equivalent to the VOS (visualization of similarities) technique (Van Eck & Waltman, 2007)

• Limit case of multidimensional scaling (Van Eck et al., 2010)

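VOS:

$$Q = \sum_{i<j} \frac{2m A_{ij}}{k_i k_j} \|x_i - x_j\|^2 - \sum_{i<j} \|x_i - x_j\|$$

MDS:

$$\sum_{i<j} W_{ij} \left( \|x_i - x_j\| - D_{ij} \right)^2, \qquad D_{ij} = \frac{k_i k_j}{2m A_{ij}}, \qquad W_{ij} = \frac{2m A_{ij}}{k_i k_j}$$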
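One way to see the relation between the two objectives is to expand a weighted MDS stress function. Assuming weights W_ij = s_ij and target distances D_ij = 1/s_ij, with s_ij = 2m A_ij/(k_i k_j), the stress equals Σ s_ij d_ij² − 2 Σ d_ij + Σ 1/s_ij, which is 4·Q(x/2) plus a constant, so both objectives have the same minimizers up to a uniform rescaling of the map. The sketch below checks this identity. It assumes every pair of nodes is connected (all A_ij > 0), since D_ij is undefined otherwise. The function names and the data are illustrative.

```python
import math
from itertools import combinations


def similarities(A):
    """s_ij = 2m A_ij / (k_i k_j) for all pairs i < j."""
    k = [sum(row) for row in A]
    m = sum(k) / 2.0
    return {
        (i, j): 2.0 * m * A[i][j] / (k[i] * k[j])
        for i, j in combinations(range(len(A)), 2)
    }


def vos_objective(s, x):
    """Q = sum s_ij ||x_i - x_j||^2 - sum ||x_i - x_j||."""
    total = 0.0
    for (i, j), s_ij in s.items():
        d = math.dist(x[i], x[j])
        total += s_ij * d * d - d
    return total


def mds_stress(s, x):
    """Weighted stress with W_ij = s_ij and D_ij = 1 / s_ij (assumes s_ij > 0)."""
    return sum(
        s_ij * (math.dist(x[i], x[j]) - 1.0 / s_ij) ** 2 for (i, j), s_ij in s.items()
    )


# Toy data (hypothetical): a complete weighted network on four nodes.
A = [
    [0, 3, 1, 1],
    [3, 0, 1, 2],
    [1, 1, 0, 4],
    [1, 2, 4, 0],
]
x = [(0.0, 0.0), (0.4, 0.1), (1.5, 0.9), (1.2, 1.4)]
half_x = [(a / 2.0, b / 2.0) for a, b in x]

s = similarities(A)
constant = sum(1.0 / s_ij for s_ij in s.values())
stress = mds_stress(s, x)
rescaled_vos = 4.0 * vos_objective(s, half_x) + constant
```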

Page 16: Applications of community detection in bibliometric network analysis

Unified approach

Most commonly used community detection technique (modularity) and most commonly used visualization technique (MDS) can be brought together in a unified framework

Unified approach

Modularity (weighted)

VOS

MDS (limit case)

Page 17: Applications of community detection in bibliometric network analysis

Community detection in large citation networks

Page 18: Applications of community detection in bibliometric network analysis

Classification systems of scientific publications

• Web of Science/Scopus classification systems:

– Scientific fields defined at the level of journals rather than individual publications

– Difficulties with multidisciplinary journals

– High level of aggregation

– Sometimes outdated or inaccurate

• Disciplinary classification systems:

– E.g., CA, JEL, MeSH, PACS

– Not available for all disciplines

– Sometimes outdated or inaccurate

Page 19: Applications of community detection in bibliometric network analysis

Algorithmically constructed classification systems

• Publications (not journals) are clustered into fields based on citation relations

• Fields are defined at different levels of granularity and are organized hierarchically

• Community detection based on a variant of the standard modularity function that accounts for differences in citation practices across fields

• Optimization using the smart local moving algorithm
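To give a feel for how such an optimization proceeds, the sketch below implements a plain local moving heuristic: nodes are visited in turn and moved to whichever community most increases the quality function, until no move helps. This is a deliberately simplified stand-in, not the smart local moving algorithm itself. The quality function Σ_{i<j} δ(x_i, x_j)(s_ij − γ) with s_ij = 2m A_ij/(k_i k_j) is borrowed from the unified approach as an assumption, because the slides do not spell out the exact variant used for the classification system.

```python
def local_moving(A, gamma=1.0, max_sweeps=100):
    """Greedy local moving for sum_{i<j} delta(x_i, x_j) * (s_ij - gamma).

    A is a symmetric weight matrix without isolated nodes.
    Returns one community label per node.
    """
    n = len(A)
    k = [sum(row) for row in A]
    m = sum(k) / 2.0
    s = [
        [2.0 * m * A[i][j] / (k[i] * k[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    labels = list(range(n))  # start with every node in its own community

    for _ in range(max_sweeps):
        moved = False
        for i in range(n):
            # Contribution of node i to the objective if placed in each community.
            gain = {}
            for j in range(n):
                if j != i:
                    gain[labels[j]] = gain.get(labels[j], 0.0) + s[i][j] - gamma

            # A node that is alone in its community contributes 0.
            best_label = labels[i]
            best_gain = gain.get(labels[i], 0.0)
            for label in sorted(gain):
                if gain[label] > best_gain + 1e-12:
                    best_label, best_gain = label, gain[label]

            if best_gain < -1e-12:
                # Every option is harmful: isolate node i in a fresh community.
                best_label = max(labels) + 1

            if best_label != labels[i]:
                labels[i] = best_label
                moved = True
        if not moved:
            break
    return labels


# Toy network (hypothetical): two triangles joined by the edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
labels = local_moving(A, gamma=1.0)
```

Each accepted move strictly increases the quality function, so the loop terminates. The dense n-by-n matrix is only workable for toy inputs; networks with millions of publications require sparse data structures.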

Page 20: Applications of community detection in bibliometric network analysis

Example (Waltman & Van Eck, 2012)

• 10.2 million publications from the period 2001–2010 indexed in Web of Science

• 97.6 million citation relations

• Classification system of 3 hierarchical levels:

– 20 broad disciplines

– 672 fields

– 22,412 subfields

Page 21: Applications of community detection in bibliometric network analysis

Visualization of 672 research areas at level 2 of the classification system

Page 22: Applications of community detection in bibliometric network analysis

Visualization of 417 publications in research area 4.30.10

Page 23: Applications of community detection in bibliometric network analysis

Application in a science policy context

Page 24: Applications of community detection in bibliometric network analysis

CitNetExplorer (www.citnetexplorer.nl)

Page 25: Applications of community detection in bibliometric network analysis

Exploring citation networks

• Macro-level applications:

– Studying the development of a research field over time

– Identifying research areas

• Micro-level applications:

– Studying the publication oeuvre of a researcher

– Supporting systematic literature reviewing

Page 26: Applications of community detection in bibliometric network analysis

HistCite

• Timeline visualization of publications and their citation relations, referred to as algorithmic historiography by Eugene Garfield

Page 27: Applications of community detection in bibliometric network analysis

CitNetExplorer

• New software tool for analyzing and visualizing citation networks

• Freely available on www.citnetexplorer.nl

• Runs on any system that offers Java support

• Citation networks can be constructed directly based on data downloaded from Web of Science

• Interactive functionality for drilling down into a citation network

• Very large citation networks can be handled, with millions of publications and tens of millions of citation relations
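Drilling down into a citation network can be pictured as selecting a few publications, expanding the selection along citation relations, and keeping only the subnetwork among the selected publications. The sketch below illustrates that idea with a breadth-first search. It is not CitNetExplorer's implementation, and the function names, the `follow` parameter and the toy data are hypothetical.

```python
from collections import defaultdict, deque


def expand_selection(citations, seeds, follow="citing"):
    """Return the seeds plus all publications reachable along citation relations.

    follow="citing": add publications that cite the selection, directly or indirectly.
    follow="cited":  add publications that the selection cites, directly or indirectly.
    """
    if follow not in ("citing", "cited"):
        raise ValueError("follow must be 'citing' or 'cited'")

    neighbors = defaultdict(set)
    for citing, cited in citations:
        if follow == "citing":
            neighbors[cited].add(citing)
        else:
            neighbors[citing].add(cited)

    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        publication = queue.popleft()
        for nxt in neighbors[publication]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def subnetwork(citations, publications):
    """Keep only citation relations whose endpoints are both selected."""
    return [(a, b) for a, b in citations if a in publications and b in publications]


# Toy data (hypothetical): each pair is (citing publication, cited publication).
citations = [("B", "A"), ("C", "A"), ("D", "B"), ("E", "D"), ("F", "X")]
later_work = expand_selection(citations, {"A"}, follow="citing")
earlier_work = expand_selection(citations, {"E"}, follow="cited")
drilled = subnetwork(citations, later_work)
```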

Page 28: Applications of community detection in bibliometric network analysis

Demonstration

• Database: Web of Science

• Fields: Physics and multidisciplinary (Nature, PLoS ONE, PNAS, Science, etc.)

• Time period: 1998–2012

• Number of publications: ~1.8 million

• Number of citation relations: ~15.1 million

Page 29: Applications of community detection in bibliometric network analysis

CitNetExplorer

Page 30: Applications of community detection in bibliometric network analysis

References

Van Eck, N.J., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84(2), 523-538.

Van Eck, N.J., & Waltman, L. (2011). Text mining and visualization using VOSviewer. ISSI Newsletter, 7(3), 50-54.

Van Eck, N.J., Waltman, L., Dekker, R., & Van den Berg, J. (2010). A comparison of two techniques for bibliometric mapping: Multidimensional scaling and VOS. JASIST, 61(12), 2405-2416.

Waltman, L., & Van Eck, N.J. (2012). A new methodology for constructing a publication-level classification system of science. JASIST, 63(12), 2378-2392.

Waltman, L., & Van Eck, N.J. (2013). A smart local moving algorithm for large-scale modularity-based community detection. European Physical Journal B, 86(11), 471.

Waltman, L., Van Eck, N.J., & Noyons, E.C.M. (2010). A unified approach to mapping and clustering of bibliometric networks. Journal of Informetrics, 4(4), 629-635.