The basics of network analysis
using Gephi & Pajek
Michael S. Vitevitch
Department of Psychology
University of Kansas

Basic parts of a network
• Vertices or Nodes
  – an entity
• Edges or Links/Connections/Arcs
  – a relationship between entities

What kind of data might be appropriate for a network?
• Survey
• Any communication, meeting
• Proximity
• Affiliation
• Behavioral

Network analysis & visualization
• Numerous packages on R
• Siena (RSiena)
• UCINET
• Python modules
• and many others…
• Gephi (Macs)
• Pajek (PC)

Gephi & Pajek
• Gephi (for Macs)
  – https://gephi.org/users/download/
  – I stated during the workshop that people in my lab had problems installing the PC version of Gephi. Recently, however, someone installed the PC version successfully, so Gephi may be a viable option on both PCs and Macs.
• Pajek (for PCs)
  – http://mrvar.fdv.uni-lj.si/pajek/

Input formats
• Adjacency matrix

Nykamp DQ, “Small undirected network with numbered nodes and labeled edges.” From Math Insight. http://mathinsight.org/image/small_undirected_network_numbered

[Image: An undirected network with 10 numbered nodes and 11 edges labeled by the corresponding component of the adjacency matrix.]

For this network, the adjacency matrix is

\[
A = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0
\end{bmatrix}
\]
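
As a rough illustration of how an adjacency matrix encodes a network, the sketch below reads the ten rows of the Math Insight matrix and recovers the edge list. The function names (parse_adjacency, is_symmetric, edge_list) are made up for this example and are not part of Gephi or Pajek.

```python
# Rows of the 10x10 adjacency matrix from the Math Insight example,
# one string of 0/1 characters per row.
ROWS = [
    "0100000000",
    "1000110000",
    "0000010000",
    "0000001000",
    "0100001000",
    "0110001110",
    "0001110000",
    "0000010011",
    "0000010100",
    "0000000100",
]


def parse_adjacency(rows):
    """Turn rows of '0'/'1' characters into a list-of-lists integer matrix."""
    return [[int(ch) for ch in row] for row in rows]


def is_symmetric(matrix):
    """An undirected network has a symmetric adjacency matrix."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i]
               for i in range(n) for j in range(i + 1, n))


def edge_list(matrix):
    """Undirected edges as (i, j) pairs with i < j, using 1-based node numbers."""
    n = len(matrix)
    return [(i + 1, j + 1)
            for i in range(n) for j in range(i + 1, n)
            if matrix[i][j]]


A = parse_adjacency(ROWS)
EDGES = edge_list(A)
print(is_symmetric(A), len(EDGES))  # prints: True 11
```

The count of 11 edges agrees with the figure caption, which is a quick sanity check that the matrix was entered correctly.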

Input formats
• .net file format (the “Pajek” format)

*Vertices 5
1 "EVELYN"
2 "LAURA"
3 "THERESA"
4 "BRENDA"
5 "CHARLOTTE"
*arcslist
*edgeslist
1 2 3 4
2 3 4 5
3 4 5
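
To make the .net layout concrete, here is a minimal reader for the small listing above. It is only a sketch under two assumptions: that the edges list breaks into one line per source vertex followed by its neighbors (the usual *Edgeslist convention), and that only the *Vertices, *Arcslist and *Edgeslist sections appear. The function name parse_pajek is hypothetical; real Pajek files can contain many more sections and options.

```python
import shlex

# The example listing, assuming one edgeslist line per source vertex.
SAMPLE = '''*Vertices 5
1 "EVELYN"
2 "LAURA"
3 "THERESA"
4 "BRENDA"
5 "CHARLOTTE"
*arcslist
*edgeslist
1 2 3 4
2 3 4 5
3 4 5
'''


def parse_pajek(text):
    """Return (labels, arcs, edges) from a very small subset of the .net format."""
    labels = {}      # vertex number -> label
    arcs = []        # directed (source, target) pairs
    edges = set()    # undirected pairs stored as (low, high)
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*"):
            # Section headers are case-insensitive, e.g. *Vertices / *vertices.
            section = line.split()[0].lower()
            continue
        fields = shlex.split(line)  # shlex keeps quoted labels together
        if section == "*vertices":
            labels[int(fields[0])] = fields[1]
        elif section == "*arcslist":
            source = int(fields[0])
            arcs.extend((source, int(t)) for t in fields[1:])
        elif section == "*edgeslist":
            source = int(fields[0])
            for t in fields[1:]:
                edges.add(tuple(sorted((source, int(t)))))
    return labels, arcs, sorted(edges)


LABELS, ARCS, EDGES = parse_pajek(SAMPLE)
print(len(LABELS), len(ARCS), len(EDGES))
```

Under the assumed line breaks, the first edgeslist line connects vertex 1 (EVELYN) to vertices 2, 3 and 4, and so on for the following lines.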

DAVIS.NET
• Micro-structure
  – Degree
  – Clustering coefficient
• Meso-structure
  – Paths through the network (length)
  – Centrality measures (Borgatti, 2005)
  – Community or cluster
• Macro-structure
  – Components (cf. communities/clusters)

Common network measures
• Micro-structure
  – Degree
    • Number of connections incident to a node
    • Ranges from 0 (isolated nodes) up to very high values (hubs)
  – Clustering coefficient
    • The extent to which neighbors of a node are connected to each other
    • Ranges from 0 to 1
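
The two micro-level measures can be sketched in a few lines of plain Python. The edge list below is the 11-edge Math Insight network from the adjacency matrix slide. The clustering function implements the standard local clustering coefficient (links among a node's neighbors divided by the number of possible links) and sets it to 0 for nodes with fewer than two neighbors. Whether this matches Pajek's CC1 in every detail is an assumption, and all function names are illustrative.

```python
from itertools import combinations

# Edges of the 10-node example network (1-based node numbers).
EDGES = [(1, 2), (2, 5), (2, 6), (3, 6), (4, 7), (5, 7),
         (6, 7), (6, 8), (6, 9), (8, 9), (8, 10)]


def neighbors(edges):
    """Map each node to the set of nodes it is connected to."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree(adj):
    """Degree = number of connections incident to a node."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def clustering(adj):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    cc = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            cc[node] = 0.0  # convention used here: undefined cases count as 0
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        cc[node] = 2 * links / (k * (k - 1))
    return cc


ADJ = neighbors(EDGES)
DEG = degree(ADJ)
CC = clustering(ADJ)
for node in sorted(ADJ):
    print(node, DEG[node], round(CC[node], 3))
```

Node 6 is the hub of this network (degree 5) but has a low clustering coefficient, because only one pair of its neighbors (8 and 9) is connected. Note that an edge list cannot represent degree-0 nodes; isolated nodes would have to be listed separately.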

Pajek
• Degree: to calculate it
  Network > Create Partition > Degree > All
  – This will create a “Partitions file” that you can save and examine later, or look at now by clicking on the magnifying glass icon.
• See image on next slide

Pajek
• Clustering Coefficient
  Network > Create Vector > Clustering Coefficients > CC1 (or CC2)
  – CC1 = 1 hop
  – CC2 = 2 hop
  – CCx’ is normalized
  – This will create a “Vectors file” that you can save and examine later, or look at now by clicking on the magnifying glass icon.
• See image on next slide

Common network measures
• Meso-structure
  – Paths through the network (length)
    • Network > Create Vector > Distribution of Distances*
    • The average distance among reachable pairs, the diameter (the longest shortest path through the network), and the identity of the most distant vertices will be displayed in the REPORT window. A “Vectors file” with the distribution of distances in the network will also be created that you can save and examine later, or look at now by clicking on the magnifying glass icon.
    • See image on previous slide for location of “Vectors file”
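
The quantities reported for path lengths can be reproduced with a breadth-first search from every node. The sketch below uses the 10-node Math Insight example network and computes the average distance among reachable pairs, the diameter, and the most distant pairs. It illustrates the definitions only and is not Pajek's own implementation; the function names are invented for the example.

```python
from collections import deque

# Edges of the 10-node example network (1-based node numbers).
EDGES = [(1, 2), (2, 5), (2, 6), (3, 6), (4, 7), (5, 7),
         (6, 7), (6, 8), (6, 9), (8, 9), (8, 10)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Shortest-path length (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_summary(adj):
    """Average distance over reachable pairs, diameter, and the farthest pairs."""
    pair_dist = {}
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if s < t:  # count each unordered pair once
                pair_dist[(s, t)] = d
    average = sum(pair_dist.values()) / len(pair_dist)
    diameter = max(pair_dist.values())
    farthest = sorted(p for p, d in pair_dist.items() if d == diameter)
    return average, diameter, farthest


AVERAGE, DIAMETER, FARTHEST = distance_summary(build_adjacency(EDGES))
print(round(AVERAGE, 3), DIAMETER, FARTHEST)
```

Unreachable pairs never enter pair_dist, so in a disconnected network the average is taken over reachable pairs only, as in the description above.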

Common network measures
• Meso-structure
  – Centrality measures
    • Network > Create Vector > Centrality > Closeness
    • Network > Create Vector > Centrality > Betweenness
    • A “Vectors file” will be created that you can save and examine later, or look at now by clicking on the magnifying glass icon.

In Gephi use:
• “Network Diameter” and “Avg. Path Length” functions
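
For readers who want to see what the two centrality measures compute, here is a small sketch on the 10-node Math Insight example network. Closeness is taken as (number of reachable nodes) / (sum of distances to them), and betweenness is computed with Brandes' algorithm for unweighted graphs, counting each unordered pair once. Pajek and Gephi may normalize these values differently, so treat the exact scaling as an assumption; the function names are illustrative.

```python
from collections import deque

# Edges of the 10-node example network (1-based node numbers).
EDGES = [(1, 2), (2, 5), (2, 6), (3, 6), (4, 7), (5, 7),
         (6, 7), (6, 8), (6, 9), (8, 9), (8, 10)]


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def closeness(adj):
    """Reachable nodes divided by the total distance to them."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[s] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(adj):
    """Brandes' algorithm: share of shortest paths passing through each node."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []                       # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}     # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}    # dependency of s on each node
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both ends, so halve the totals.
    return {v: b / 2 for v, b in score.items()}


ADJ = build_adjacency(EDGES)
CLOSENESS = closeness(ADJ)
BETWEENNESS = betweenness(ADJ)
for node in sorted(ADJ):
    print(node, round(CLOSENESS[node], 3), BETWEENNESS[node])
```

In this network node 6 scores highest on both measures, while the degree-1 nodes (1, 3, 4, 10) have betweenness 0 because no shortest path passes through them.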

Common network measures
• Meso-structure
  – Community detection in Gephi:
    • On the upper left, open the Clustering tab, select the algorithm you wish to use, and select the number of communities you wish to find. You may need to try several numbers to find the best modularity (Q) value, which measures how ‘good’ the communities are.

Common network measures
• Meso-structure
  – Community detection in Pajek:
    • Network > Create Partition > Communities > Louvain Method (this is the most common method used) > Multi-Level Coarsening + Single (or Multi-Level) Refinement
    • See this website for more details about the differences between the types of refinement, and for information on how to set the parameters in the window that appears: http://mrvar.fdv.uni-lj.si/pajek/community/CommunityDrawExample.htm
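
The modularity (Q) value that community detection tries to maximize can be computed directly for any given partition. The sketch below uses the standard definition for an unweighted, undirected network: for each community, the fraction of edges inside it minus the squared fraction of edge endpoints it holds. The three-community partition of the 10-node Math Insight example network is a hypothetical grouping chosen only for illustration; it is not the output of the Louvain method or of Gephi.

```python
# Edges of the 10-node example network (1-based node numbers).
EDGES = [(1, 2), (2, 5), (2, 6), (3, 6), (4, 7), (5, 7),
         (6, 7), (6, 8), (6, 9), (8, 9), (8, 10)]

# Hypothetical partition: node -> community label (an assumption for this example).
PARTITION = {1: "a", 2: "a", 5: "a",
             4: "b", 7: "b",
             3: "c", 6: "c", 8: "c", 9: "c", 10: "c"}


def modularity(edges, community_of):
    """Q = sum over communities of (internal edges / m) - (degree sum / 2m)^2."""
    m = len(edges)
    internal = {}     # community -> number of edges with both ends inside it
    degree_sum = {}   # community -> total degree of its members
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


Q = modularity(EDGES, PARTITION)
Q_SINGLE = modularity(EDGES, {node: "all" for node in PARTITION})
print(round(Q, 4), round(Q_SINGLE, 4))
```

Putting every node into a single community gives Q = 0, which is the baseline that a useful partition should exceed.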

Common network measures
• Macro-structure
  – Components (cf. communities/clusters)
    • In Gephi: “Connected Components” function (Statistics tab, right side)
    • In Pajek: Network > Create Partition > Components > Weak (or Strong; this matters in a directed network, but not the simple undirected one we are looking at); then select the minimum number of nodes you wish to consider (e.g., 1 means isolated nodes each form a component; 2 means that pairs of nodes (or more) will be counted as a component, but isolated nodes will not).
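
The effect of the minimum-size setting can be illustrated with a short component search. The sketch below finds connected components in an undirected network and drops those smaller than min_size. The network is the 10-node Math Insight example plus three hypothetical extra nodes (11 is isolated, 12 and 13 form a pair) added only to show the difference between the two settings. The function name and parameter are illustrative, not Pajek's.

```python
from collections import deque

# The 10-node example network plus hypothetical additions:
# node 11 is isolated; nodes 12 and 13 form a separate pair.
NODES = list(range(1, 14))
EDGES = [(1, 2), (2, 5), (2, 6), (3, 6), (4, 7), (5, 7),
         (6, 7), (6, 8), (6, 9), (8, 9), (8, 10), (12, 13)]


def components(nodes, edges, min_size=1):
    """Connected components of an undirected network with at least min_size nodes."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen = set()
    found = []
    for start in sorted(adj):
        if start in seen:
            continue
        seen.add(start)
        members = []
        queue = deque([start])
        while queue:
            u = queue.popleft()
            members.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        if len(members) >= min_size:
            found.append(sorted(members))
    return found


WITH_ISOLATES = components(NODES, EDGES, min_size=1)
WITHOUT_ISOLATES = components(NODES, EDGES, min_size=2)
print(len(WITH_ISOLATES), len(WITHOUT_ISOLATES))  # prints: 3 2
```

With a minimum size of 1, the isolated node 11 counts as its own component; with a minimum size of 2, only the main component and the pair remain.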

References
• Exploratory Social Network Analysis with Pajek, by Wouter de Nooy, Andrej Mrvar and Vladimir Batagelj
• Introductory social network analysis with Pajek, presentation slides by Lada Adamic
• MIT14_15JF09_pajek.pdf
• Pajek Manual
• Gephi tutorial slides
• 3363639.pdf
• Borgatti, S. P. (2005). Centrality and Network Flow. Social Networks, 27, 55–71.