Network theory III, David Lusseau, BIOL4062/5062, d.lusseau@dal.ca
Outline
16 March: community structure
Suggested readings: Newman M.E.J. 2003. The structure and function of complex networks. SIAM Review 45, 167-256
What is a community?
A cluster of individuals that are more linked to one another than to others
Traditional techniques
Cluster analysis (hierarchical)
Multi-Dimensional Scaling
Principal Coordinate Analysis
Traditional techniques
How representative is the result?
Loss-of-information measure: stress in MDS
What is the best division?
Cluster analysis: peripheral individuals are lumped together
Girvan-Newman algorithm
Divisive clustering algorithm
Divides a population of n vertices into 1 to n communities
Finds the boundaries of communities
The weakest links between communities are identified by edge betweenness
Standardise betweenness at each step
Re-calculate edge betweenness at each step
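The divisive procedure above can be sketched in plain Python. This is a minimal illustration, not the implementation used in the lecture: it assumes an unweighted, undirected graph given as an edge list, removes the edge with the highest betweenness at each step, and skips the standardisation step. The function names and the toy graph are hypothetical.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path betweenness of every edge (unweighted graph, Brandes-style)."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest vertices, sharing path credit.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                credit = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += credit
                delta[v] += credit
    # Every path is counted from both ends; the ranking is unaffected.
    return score

def components(adj):
    """Connected components of the current graph, as a list of vertex sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def girvan_newman(edges):
    """Return the sequence of divisions, from 1 community down to n singletons."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    divisions = [components(adj)]
    while any(adj.values()):
        score = edge_betweenness(adj)       # re-calculated at each step
        u, v = max(score, key=score.get)    # edge with the highest betweenness
        adj[u].discard(v)
        adj[v].discard(u)
        comps = components(adj)
        if len(comps) > len(divisions[-1]):
            divisions.append(comps)
    return divisions

# Toy graph (hypothetical): two triangles joined by a single bridge, 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
divisions = girvan_newman(edges)
print(sorted(map(sorted, divisions[1])))
```

On this toy graph the bridge carries all shortest paths between the two triangles, so it is removed first and the first split recovers the two triangles.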
Zachary karate club
Girvan & Newman 2002 PNAS
Finding the best division
For each step, calculate a modularity coefficient
The best division has the most edges within communities and the fewest between them
Takes community size into consideration
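$$Q = \sum_i \left(e_{ii} - a_i^2\right), \qquad a_i = \sum_j e_{ij}$$

Example with three communities (counts sum to 108):

|   | 1  | 2  | 3  |
|---|----|----|----|
| 1 | 30 | 2  | 5  |
| 2 | 2  | 10 | 2  |
| 3 | 5  | 2  | 50 |

$$Q = \left(\frac{30}{108} - \left(\frac{37}{108}\right)^2\right) + \left(\frac{10}{108} - \left(\frac{14}{108}\right)^2\right) + \left(\frac{50}{108} - \left(\frac{57}{108}\right)^2\right)$$

Q = 0.42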
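The worked example can be checked with a few lines of Python. This is a small sketch that assumes the 3 x 3 table holds raw counts, which are divided by their total (108) to give the fractions e_ij; the function name is hypothetical.

```python
def modularity_from_counts(counts):
    """Q = sum_i (e_ii - a_i^2), with e_ij = counts_ij / total and a_i = sum_j e_ij."""
    total = sum(sum(row) for row in counts)
    e = [[c / total for c in row] for row in counts]
    q = 0.0
    for i, row in enumerate(e):
        a_i = sum(row)          # fraction of edge ends attached to community i
        q += e[i][i] - a_i ** 2
    return q

counts = [[30, 2, 5],
          [2, 10, 2],
          [5, 2, 50]]
q = modularity_from_counts(counts)
print(round(q, 2))  # 0.42
```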
Zachary karate club
Newman & Girvan 2003 Physical Review E
Modularity coefficient
The principle of modularity coefficient optimisation can be applied to any community structure algorithm
Extension to weighted matrices
Edge betweenness
Transform the similarity matrix into a dissimilarity matrix
Calculate geodesic paths using Dijkstra's algorithm
Problem: more likely to remove edges between strongly connected pairs
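A short sketch of the weighted step, under assumptions not stated on the slide: the dissimilarity is taken as 1 / similarity, and a similarity of 0 means no edge. The function names and the 3 x 3 toy matrix are hypothetical.

```python
import heapq

def similarity_to_dissimilarity(sim):
    """Assumed transformation: distance = 1 / similarity; None marks a missing edge."""
    n = len(sim)
    return [[1.0 / sim[i][j] if i != j and sim[i][j] > 0 else None
             for j in range(n)] for i in range(n)]

def dijkstra(dist, source):
    """Shortest (geodesic) distances from source, plus each vertex's predecessor."""
    n = len(dist)
    best = [float("inf")] * n
    prev = [None] * n
    best[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best[u]:
            continue  # stale heap entry
        for v in range(n):
            w = dist[u][v]
            if w is not None and d + w < best[v]:
                best[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    return best, prev

# Individuals 0-1 and 1-2 are strongly associated; 0-2 only weakly.
sim = [[0.0, 0.5, 0.2],
       [0.5, 0.0, 0.5],
       [0.2, 0.5, 0.0]]
best, prev = dijkstra(similarity_to_dissimilarity(sim), source=0)
print(best, prev)
```

Here the geodesic from 0 to 2 runs through 1 rather than along the weak direct link, so the strong edges collect the betweenness. This is the problem noted above: strongly connected pairs are the ones most likely to be cut.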
Alternative: modularity optimisation
Forget edge betweenness
Optimise for high Q!
Computer intensive
Prone to false optima (the search can stall at a local optimum)
These are difficult to detect: iterate the optimisation to find them
Not always successful
Modularity: greedy algorithm
Start with n communities (agglomerative clustering method)
At each step, link the two communities that provide the greatest increase (or the smallest decrease) in Q
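The agglomerative idea can be sketched as follows. This is a brute-force illustration for small unweighted graphs, not an efficient implementation: every pair of communities is tried at each step, Q is recomputed from scratch, and the best division seen along the way is kept. All names and the toy graph are hypothetical.

```python
from itertools import combinations

def modularity(edges, community_of):
    """Q = sum_c (within_c / m - (degree_c / 2m)^2) for an unweighted edge list."""
    m = len(edges)
    within, degree = {}, {}
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            within[cu] = within.get(cu, 0) + 1
    return sum(within.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())

def greedy_modularity(edges):
    """Merge communities greedily; return the best Q seen and its labelling."""
    nodes = sorted({v for e in edges for v in e})
    community_of = {v: v for v in nodes}        # start with n communities
    best_q = modularity(edges, community_of)
    best = dict(community_of)
    while len(set(community_of.values())) > 1:
        candidates = []
        labels = sorted(set(community_of.values()))
        for a, b in combinations(labels, 2):
            trial = {v: (a if c == b else c) for v, c in community_of.items()}
            candidates.append((modularity(edges, trial), trial))
        # Greatest increase, or smallest decrease, in Q.
        q, community_of = max(candidates, key=lambda t: t[0])
        if q > best_q:
            best_q, best = q, dict(community_of)
    return best_q, best

# Toy graph (hypothetical): two triangles joined by a single bridge.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
best_q, labels = greedy_modularity(edges)
groups = {}
for vertex, label in labels.items():
    groups.setdefault(label, set()).add(vertex)
print(round(best_q, 3), sorted(map(sorted, groups.values())))
```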
Q optimisation
Girvan-Newman
Modularity: greedy algorithm
Overlapping communities
Recognise that some individuals sit on the fence
Do not force them into one community or the other, but identify them as overlapping
Palla et al. 2005 Nature
Palla algorithm
Based on the k-clique principle: a community is composed of a number of k-cliques
k-cliques: fully connected subgraphs of k vertices
Adjacent k-cliques share k-1 vertices
Community: series of adjacent cliques
Palla et al. 2005 Nature
Palla algorithm
Find all k-cliques
Calculate the clique-clique overlap matrix
Define adjacent cliques
Issues (and advantages):
k is user-defined; find the 'best' k by trial and error
Works only on binary networks (weighted networks need to be transformed first)
Palla et al. 2005 Nature
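The three steps can be sketched in Python for a small binary network. This is a brute-force illustration under stated assumptions: k-cliques are found by testing every k-subset of vertices (feasible only for small graphs), and adjacent cliques are chained with a union-find structure. The function name and toy graph are hypothetical.

```python
from itertools import combinations

def k_clique_communities(edges, k):
    """Communities as unions of adjacent k-cliques (adjacent = sharing k-1 vertices)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Step 1: find all k-cliques (every pair of members must be linked).
    cliques = [frozenset(c) for c in combinations(sorted(adj), k)
               if all(b in adj[a] for a, b in combinations(c, 2))]

    # Step 2: clique-clique overlap; cliques sharing k-1 vertices are adjacent.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Step 3: a community is the union of a chain of adjacent cliques.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return list(groups.values())

# Triangles 0-1-2 and 1-2-3 share an edge; triangle 3-4-5 shares only vertex 3.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]
communities = k_clique_communities(edges, k=3)
print(sorted(map(sorted, communities)))
```

With k = 3, vertex 3 ends up in both communities: it sits on the fence and is reported as overlapping rather than forced into one group.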
Simply the best method
Modularity matrix
A matrix? Let’s eigenanalyse!
Let’s rewrite the modularity coefficient:
$$Q = \frac{1}{4m} \sum_{ij} \left(A_{ij} - \frac{k_i k_j}{2m}\right) s_i s_j$$

$\frac{k_i k_j}{2m}$: links distributed at random
$s_i s_j$: community identification ($s_i = \pm 1$)
Newman 2006 PNAS
Modularity matrix
Rows and columns each sum to 0
One eigenvector (1,1,1,…) with eigenvalue 0 (a property shared with the graph Laplacian)
The eigenvector of the dominant eigenvalue gives the best division into 2 communities (negative and positive elements)

$$B_{ij} = A_{ij} - \frac{k_i k_j}{2m}$$
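A minimal sketch of the eigenvector split in plain Python. It assumes a small, unweighted, undirected network and uses shifted power iteration instead of a full eigen-decomposition; the starting vector is an arbitrary choice and could, for some graphs, be orthogonal to the leading eigenvector. Function names and the toy graph are hypothetical.

```python
def modularity_matrix(A):
    """B_ij = A_ij - k_i * k_j / (2m) for a symmetric 0/1 adjacency matrix."""
    n = len(A)
    k = [sum(row) for row in A]
    two_m = sum(k)
    return [[A[i][j] - k[i] * k[j] / two_m for j in range(n)] for i in range(n)]

def leading_eigenvector(B, iterations=500):
    """Power iteration on B + shift * I, which has the same eigenvectors as B."""
    n = len(B)
    # Gershgorin bound: the shift makes every eigenvalue non-negative,
    # so the iteration converges to the eigenvector of B's largest eigenvalue.
    shift = max(sum(abs(x) for x in row) for row in B)
    v = [float(i + 1) for i in range(n)]  # arbitrary start vector (assumption)
    for _ in range(iterations):
        w = [sum(B[i][j] * v[j] for j in range(n)) + shift * v[i]
             for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Toy graph (hypothetical): two triangles joined by the bridge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
A = [[0] * 6 for _ in range(6)]
for u, v in edges:
    A[u][v] = A[v][u] = 1

B = modularity_matrix(A)
vec = leading_eigenvector(B)
side = [x > 0 for x in vec]   # sign of each element gives the community
print(side, [round(abs(x), 3) for x in vec])
```

The two bridge vertices (2 and 3) get the smallest magnitudes in this example, which matches the reading of magnitude as a measure of how firmly a vertex belongs to its community.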
Magnitude of eigenvector elements
Tells us how well a vertex is classified (whether it belongs to the core or the periphery of the community)
Zachary karate club
Finding the best division
Repeat the process on each subgraph
Recalculate the modularity coefficient for the whole graph
If the new division makes a zero or negative contribution to modularity, do not make it
Otherwise continue
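The stopping rule can be sketched as a simple check: recompute Q for the whole graph with and without a proposed split, and keep the split only if Q increases. This sketch assumes an unweighted edge list and a proposed split supplied by the caller (for instance from an eigenvector bisection); the function names are hypothetical.

```python
def modularity(edges, communities):
    """Whole-graph Q for a list of vertex sets."""
    m = len(edges)
    q = 0.0
    for group in communities:
        within = sum(1 for u, v in edges if u in group and v in group)
        degree = sum((u in group) + (v in group) for u, v in edges)
        q += within / m - (degree / (2 * m)) ** 2
    return q

def try_split(edges, communities, index, part_a, part_b):
    """Split communities[index] into part_a and part_b only if Q increases."""
    proposal = communities[:index] + [part_a, part_b] + communities[index + 1:]
    gain = modularity(edges, proposal) - modularity(edges, communities)
    if gain > 0:
        return proposal, gain
    return communities, gain   # zero or negative contribution: do not split

# Toy graph (hypothetical): two triangles joined by a single bridge.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
whole = [{0, 1, 2, 3, 4, 5}]
step1, gain1 = try_split(edges, whole, 0, {0, 1, 2}, {3, 4, 5})
step2, gain2 = try_split(edges, step1, 0, {0}, {1, 2})
print(step1, round(gain1, 3))
print(step2, round(gain2, 3))
```

In this example the first split is accepted, while breaking up a triangle lowers Q and is rejected, so the division stops at two communities.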
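Power of the modularity matrix method
Different types of null models can be tested
As long as we have one eigenvector (1,1,1,…) with eigenvalue 0
To ensure this, subtract the row sums from the diagonal

$$Q = \frac{1}{2m} \sum_{ij} \left(A_{ij} - P_{ij}\right) s_i s_j$$

where $P_{ij}$ is the expected value of $A_{ij}$ under the chosen null model (the constant prefactor does not affect which division maximises Q)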
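A small sketch of the diagonal correction, assuming the null model is supplied as a matrix P of expected link values. The uniform null model used here (every pair expected to have 2m / n^2) is a hypothetical example chosen only because its rows do not sum to the observed degrees.

```python
def generalised_modularity_matrix(A, P):
    """B = A - P, with each row sum subtracted from the diagonal so rows sum to 0."""
    n = len(A)
    B = [[A[i][j] - P[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        B[i][i] -= sum(B[i])   # guarantees (1,1,1,...) is an eigenvector with eigenvalue 0
    return B

# Toy graph (hypothetical): two triangles joined by the bridge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6
A = [[0] * n for _ in range(n)]
for u, v in edges:
    A[u][v] = A[v][u] = 1

two_m = sum(sum(row) for row in A)
P = [[two_m / n ** 2] * n for _ in range(n)]   # assumed uniform null model

B = generalised_modularity_matrix(A, P)
row_sums = [sum(row) for row in B]
print([round(s, 12) for s in row_sums])
```

Only the diagonal changes, so B stays symmetric and the eigenvector approach can be applied to it as before.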
Uncertainty
Bootstrapped algorithm
m results from the community algorithm
Matrix: likelihood that 2 individuals belong to the same community
Coarse-grain community identity
Provides uncertainty and overlap
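The co-membership matrix can be sketched as follows, assuming the m bootstrap replicates have already been run and each result is a list of sets of individuals. The function name and the four toy replicates are hypothetical.

```python
def co_membership(partitions, individuals):
    """Proportion of the m results in which each pair shares a community."""
    m = len(partitions)
    index = {name: i for i, name in enumerate(individuals)}
    n = len(individuals)
    counts = [[0] * n for _ in range(n)]
    for partition in partitions:
        for group in partition:
            for a in group:
                for b in group:
                    counts[index[a]][index[b]] += 1
    return [[c / m for c in row] for row in counts]

individuals = ["A", "B", "C", "D"]
# Four hypothetical bootstrap results from a community algorithm.
partitions = [
    [{"A", "B"}, {"C", "D"}],
    [{"A", "B"}, {"C", "D"}],
    [{"A", "B"}, {"C", "D"}],
    [{"A", "B", "C"}, {"D"}],
]
likelihood = co_membership(partitions, individuals)
for name, row in zip(individuals, row_list := likelihood):
    print(name, row)
```

Pairs with values near 1 are assigned together with little uncertainty; intermediate values, such as C with either neighbour here, flag individuals whose community membership is uncertain.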
Girvan-Newman in Netdraw
Modularity matrix in Socprog