The basics of network analysis using Gephi & Pajek

Michael S. Vitevitch
Department of Psychology
University of Kansas

Source: people.ku.edu/~mvitevit/short_course.pdf



Basic parts of a network

•  Vertices or Nodes
   – an entity

•  Edges or Links/Connections/Arcs
   – a relationship between entities

What kind of data might be appropriate for a network?

•  Survey
•  Any communication, meeting
•  Proximity
•  Affiliation
•  Behavioral

Network analysis & visualization

•  Numerous packages in R
•  Siena (RSiena)
•  UCINET
•  Python modules
•  and many others…

•  Gephi (Macs)
•  Pajek (PC)

Gephi & Pajek

•  Gephi (for Macs)
   – https://gephi.org/users/download/
   – I stated during the workshop that folks in my lab have had problems trying to install the PC version of Gephi. Recently, however, someone successfully installed the PC version, so Gephi may be a viable option on either PCs or Macs.

•  Pajek (for PCs)
   – http://mrvar.fdv.uni-lj.si/pajek/

Input formats

•  Adjacency matrix


Nykamp DQ, “Small undirected network with numbered nodes and labeled edges.” From Math Insight. http://mathinsight.org/image/small_undirected_network_numbered

An undirected network with 10 numbered nodes and 11 edges labeled by the corresponding component of the adjacency matrix.

For this network, the adjacency matrix is

\[
A =
\begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0
\end{bmatrix}
\]
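To make the adjacency-matrix format concrete, here is a minimal Python sketch (not part of the original slides) that reads the ten rows of the Math Insight matrix and recovers the edge list. The variable names are illustrative only.

```python
# Rows of the 10 x 10 adjacency matrix from the Math Insight example.
# Row i, column j is 1 when nodes i and j are connected (nodes are numbered 1..10).
ROWS = [
    "0100000000",
    "1000110000",
    "0000010000",
    "0000001000",
    "0100001000",
    "0110001110",
    "0001110000",
    "0000010011",
    "0000010100",
    "0000000100",
]

A = [[int(ch) for ch in row] for row in ROWS]
n = len(A)

# An undirected network has a symmetric adjacency matrix.
is_symmetric = all(A[i][j] == A[j][i] for i in range(n) for j in range(n))

# Read each undirected edge once, from the upper triangle only.
edges = [(i + 1, j + 1) for i in range(n) for j in range(i + 1, n) if A[i][j]]

# The degree of a node is the sum of its row.
degrees = {i + 1: sum(A[i]) for i in range(n)}

print("nodes:", n, "edges:", len(edges), "symmetric:", is_symmetric)
print("edge list:", edges)
print("degrees:", degrees)
```

Counting only the upper triangle gives the 11 edges stated in the caption; counting every 1 in the matrix would count each undirected edge twice.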

Input formats

•  .net file format (the “Pajek” format)

*Vertices 5
1 "EVELYN"
2 "LAURA"
3 "THERESA"
4 "BRENDA"
5 "CHARLOTTE"
*arcslist
*edgeslist
1 2 3 4
2 3 4 5
3 4 5
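As a rough illustration of how this list-style .net file can be read, the sketch below parses the five-vertex example by hand. It is not the Pajek or Gephi loader. It assumes that each row under `*edgeslist` (or `*arcslist`) starts with one vertex number followed by the vertices it connects to, and that the flattened numbers on the slide break into the three rows shown in `SAMPLE_NET`. The function name `parse_pajek_lists` is hypothetical.

```python
# The example file from the slide. The line breaks inside *edgeslist are an
# assumption: the slide text was flattened onto a single line.
SAMPLE_NET = '''*Vertices 5
1 "EVELYN"
2 "LAURA"
3 "THERESA"
4 "BRENDA"
5 "CHARLOTTE"
*arcslist
*edgeslist
1 2 3 4
2 3 4 5
3 4 5
'''


def parse_pajek_lists(text):
    """Parse *Vertices, *Arcslist and *Edgeslist sections of a small .net file."""
    labels = {}    # vertex number -> label
    arcs = []      # directed (source, target) pairs
    edges = set()  # undirected pairs, stored as (low, high)
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*"):
            # Section keywords are compared case-insensitively here.
            section = line.split()[0].lower()
            continue
        if section == "*vertices":
            number, _, label = line.partition(" ")
            labels[int(number)] = label.strip().strip('"')
        elif section in ("*arcslist", "*edgeslist"):
            source, *targets = (int(token) for token in line.split())
            for target in targets:
                if section == "*arcslist":
                    arcs.append((source, target))
                else:
                    edges.add(tuple(sorted((source, target))))
    return labels, arcs, edges


labels, arcs, edges = parse_pajek_lists(SAMPLE_NET)
print(labels)
print(sorted(edges))
```

Under these assumptions the example describes five labeled vertices, no arcs, and eight undirected edges.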

DAVIS.NET

•  Micro-structure
   – Degree
   – Clustering coefficient

•  Meso-structure
   – Paths through the network (length)
   – Centrality measures (Borgatti, 2005)
   – Community or cluster

•  Macro-structure
   – Components (cf., communities/clusters)

Load the .net file

•  Gephi
•  Pajek

Common network measures

•  Micro-structure
   – Degree
      •  Number of connections incident to a node
      •  Ranges from 0 (an isolated node) to very high values (hubs)
   – Clustering coefficient
      •  The extent to which neighbors of a node are connected to each other.
      •  0 to 1
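The two micro-structure measures can be computed by hand on a small network. The sketch below uses the 10-node Math Insight network from the adjacency-matrix slide and the usual local clustering coefficient (the fraction of pairs of a node's neighbors that are themselves connected). Whether this matches a particular menu option in Gephi or Pajek exactly is an assumption, and the function name is illustrative.

```python
from itertools import combinations

# Edge list of the 10-node example network (undirected).
EDGES = [(1, 2), (2, 5), (2, 6), (3, 6), (4, 7), (5, 7),
         (6, 7), (6, 8), (6, 9), (8, 9), (8, 10)]


def degree_and_clustering(edges):
    """Return (degree, clustering) dictionaries for an undirected edge list."""
    neighbors = {}
    for u, w in edges:
        neighbors.setdefault(u, set()).add(w)
        neighbors.setdefault(w, set()).add(u)

    degree = {v: len(nbrs) for v, nbrs in neighbors.items()}

    clustering = {}
    for v, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            # With fewer than two neighbors there are no pairs to connect.
            clustering[v] = 0.0
            continue
        # Count the pairs of neighbors that are linked to each other.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
        clustering[v] = 2 * links / (k * (k - 1))
    return degree, clustering


degree, clustering = degree_and_clustering(EDGES)
for v in sorted(degree):
    print(v, degree[v], round(clustering[v], 3))
```

Node 6 is the hub here (degree 5), yet its clustering coefficient is low because only one pair of its neighbors (8 and 9) is connected.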

Gephi

•  Statistics tab [right side]
   – RUN various options
      •  Degree
      •  Clustering coefficient

Pajek

Opening a network
•  FILE > Network > Read

Visualize the network
•  DRAW > Network

Pajek

•  Degree: to calculate it
   Network > Create Partition > Degree > All
   – This will create a “Partitions file” that you can save and examine later, or look at now by clicking on the magnifying glass icon.

•  See image on next slide

Pajek

•  Clustering Coefficient
   Network > Create Vector > Clustering Coefficients > CC1 (or CC2)

   CC1 = 1 hop
   CC2 = 2 hop
   CCx’ is normalized

   This will create a “Vectors file” that you can save and examine later, or look at now by clicking on the magnifying glass icon.

   See image on next slide

Common network measures

•  Meso-structure
   – Paths through the network (length)
      •  Network > Create Vector > Distribution of Distances*
      •  The average distance among reachable pairs, the diameter (the longest shortest path through the network), and the identity of the most distant vertices will be displayed in the REPORT window. A “Vectors file” with a distribution of distances in the network will be created that you can save and examine later, or look at now by clicking on the magnifying glass icon.
      •  See image on previous slide for location of “Vectors file”
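The quantities reported for path lengths can be reproduced with a breadth-first search from every node. The sketch below is not Pajek's implementation. It counts each unordered reachable pair once, and the toy network (a four-node chain plus one isolated node) and the function names are made up for illustration.

```python
from collections import Counter, deque


def bfs_distances(adj, source):
    """Shortest-path length from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def distance_summary(nodes, edges):
    """Distance distribution, average distance, diameter and most distant pairs."""
    adj = {v: set() for v in nodes}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)

    distribution = Counter()
    diameter, farthest = 0, []
    for s in nodes:
        for t, d in bfs_distances(adj, s).items():
            if s < t:  # count each unordered pair once
                distribution[d] += 1
                if d > diameter:
                    diameter, farthest = d, [(s, t)]
                elif d == diameter:
                    farthest.append((s, t))

    pairs = sum(distribution.values())
    average = sum(d * c for d, c in distribution.items()) / pairs if pairs else 0.0
    return distribution, average, diameter, farthest


# Toy network: chain 1-2-3-4, with node 5 isolated (unreachable, so ignored).
distribution, average, diameter, farthest = distance_summary(
    [1, 2, 3, 4, 5], [(1, 2), (2, 3), (3, 4)]
)
print(dict(distribution), average, diameter, farthest)
```

Unreachable pairs (here, anything involving node 5) are simply left out, which is what "average distance among reachable pairs" implies.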


Common network measures

•  Meso-structure
   – Centrality measures
      •  Network > Create Vector > Centrality > Closeness
      •  Network > Create Vector > Centrality > Betweenness
      •  A “Vectors file” will be created that you can save and examine later, or look at now by clicking on the magnifying glass icon.

In Gephi use:
•  “Network Diameter” and “Avg. Path Length” functions
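For readers who want to see what closeness and betweenness measure, here is a small sketch for an unweighted, undirected network. Betweenness is accumulated in the style of Brandes' algorithm. Closeness is taken as (number of reachable nodes) / (sum of distances to them). Both normalizations are assumptions and may differ from the values Pajek or Gephi report; the function name is illustrative.

```python
from collections import deque


def closeness_and_betweenness(nodes, edges):
    """Closeness and (unnormalized) betweenness for an undirected network."""
    adj = {v: set() for v in nodes}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)

    closeness = {}
    betweenness = {v: 0.0 for v in nodes}

    for s in nodes:
        dist = {s: 0}
        sigma = {v: 0 for v in nodes}  # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in nodes}  # predecessors on shortest paths
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)

        total = sum(dist.values())
        closeness[s] = (len(dist) - 1) / total if total else 0.0

        # Walk back from the farthest nodes, passing path credit to predecessors.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                betweenness[w] += delta[w]

    # Each unordered pair was counted from both endpoints.
    for v in betweenness:
        betweenness[v] /= 2
    return closeness, betweenness


# Toy network: a chain 1-2-3-4.
closeness, betweenness = closeness_and_betweenness(
    [1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)]
)
print(closeness)
print(betweenness)
```

In the chain, the two middle nodes lie on every shortest path between the ends, so they get the highest betweenness and closeness.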

Common network measures

•  Meso-structure
   – Community detection

In Gephi:
•  On the upper left, open the Clustering tab, select the algorithm you wish to use, and select the number of communities you wish to find. You might need to try several numbers to find the best modularity (Q) value, which measures how ‘good’ the communities are.

Common network measures

•  Meso-structure
   – Community detection

In Pajek:
•  Network > Create Partition > Communities > Louvain Method (this is the most common method used) > Multi-Level Coarsening + Single (or Multi-Level) Refinement
   – See http://mrvar.fdv.uni-lj.si/pajek/community/CommunityDrawExample.htm for more details about the differences between the types of refinement, and for information on how to set the parameters in the window that appears.

Common network measures

•  Macro-structure
   – Components (cf., communities/clusters)
      •  In Gephi: “Connected Components” function (Statistics tab, right side)
      •  In Pajek:
         – Network > Create Partition > Components > Weak (or Strong; this matters in a directed network, but not the simple undirected one we are looking at); then select the minimum number of nodes you wish to consider (e.g., 1 means isolated nodes each form a component; 2 means that pairs of nodes (or more) will be counted as a component, but isolated nodes will not).
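The effect of the minimum-size setting can be seen in a short sketch that finds the connected components of an undirected network and drops those below a chosen size. The `min_size` parameter stands in for the Pajek dialog value, and the toy network and function name are illustrative assumptions.

```python
def connected_components(nodes, edges, min_size=1):
    """Connected components of an undirected network with at least min_size nodes."""
    adj = {v: set() for v in nodes}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first search collects everything reachable from start.
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            u = stack.pop()
            component.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        if len(component) >= min_size:
            components.append(sorted(component))
    return components


# Toy network: a chain 1-2-3, a pair 4-5, and an isolated node 6.
NODES = [1, 2, 3, 4, 5, 6]
EDGES = [(1, 2), (2, 3), (4, 5)]

with_isolates = connected_components(NODES, EDGES, min_size=1)
without_isolates = connected_components(NODES, EDGES, min_size=2)
print(with_isolates)
print(without_isolates)
```

With a minimum size of 1 the isolated node counts as its own component; with a minimum size of 2 it is left out.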

References

Exploratory Social Network Analysis with Pajek, by Wouter de Nooy, Andrej Mrvar and Vladimir Batagelj
Introductory social network analysis with Pajek, presentation slides by Lada Adamic
MIT14_15JF09_pajek.pdf
Pajek Manual
Gephi tutorial slides
3363639.pdf
Borgatti, S. P. (2005). Centrality and Network Flow. Social Networks, 27, 55–71.