JSM19 – Novel Approaches for Analyzing Dynamic Networks

Bayesian estimation of the latent dimension and communities in stochastic blockmodels

Francesco Sanna Passino, Nick Heard
Department of Mathematics, Imperial College London

[email protected]

July 30, 2019

Stochastic blockmodels as random dot product graphs

Consider an undirected graph with symmetric adjacency matrix $A \in \{0, 1\}^{n \times n}$.

In random dot product graphs, the probability of a link between two nodes is expressed as the inner product between two latent positions $x_i, x_j \in F$, where $0 \le x^\top y \le 1$ for all $x, y \in F$:

$$P(A_{ij} = 1) = x_i^\top x_j.$$

The stochastic blockmodel is the classical model for community detection in graphs. Given a matrix $B \in [0, 1]^{K \times K}$ of within- and between-community link probabilities, the probability of a link depends on the community allocations $z_i, z_j \in \{1, \dots, K\}$ of the two nodes:

$$P(A_{ij} = 1) = B_{z_i z_j}.$$

The stochastic blockmodel can be interpreted as a special case of a random dot product graph. If $B_{kh} = \mu_k^\top \mu_h$ with $\mu_k, \mu_h \in F$, and all the nodes in community $k$ are assigned the latent position $\mu_k$, then:

$$P(A_{ij} = 1) = \mu_{z_i}^\top \mu_{z_j}, \quad i < j, \quad A_{ij} = A_{ji}.$$
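The construction above can be illustrated with a minimal NumPy sketch. The community positions `mu`, the allocations `z`, and the function names are hypothetical values chosen for illustration, not taken from the slides; communities are indexed from 0 rather than 1.

```python
import numpy as np

def sbm_link_probabilities(mu, z):
    """Return P with P[i, j] = mu[z[i]] @ mu[z[j]], i.e. B expanded to node pairs."""
    X = mu[z]                      # n x d matrix of latent positions, one row per node
    return X @ X.T

def sample_undirected_graph(P, rng):
    """Draw A_ij ~ Bernoulli(P_ij) for i < j and mirror it, so that A_ij = A_ji."""
    n = P.shape[0]
    upper = np.triu(rng.random((n, n)) < P, k=1)   # strict upper triangle only
    return (upper | upper.T).astype(int)

# Hypothetical example values (assumptions, not from the slides).
mu = np.array([[0.7, 0.2], [0.2, 0.6]])   # K = 2 community positions in R^2
z = np.array([0, 0, 1, 1, 1])             # community allocation of each node
B = mu @ mu.T                             # B_kh = mu_k^T mu_h
P = sbm_link_probabilities(mu, z)
A = sample_undirected_graph(P, np.random.default_rng(0))
```

Every entry of `P` equals the corresponding block probability `B[z[i], z[j]]`, which is the sense in which the blockmodel is a special case of the random dot product graph.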

Network embeddings

Consider an undirected graph with symmetric adjacency matrix $A \in \{0, 1\}^{n \times n}$, and modified Laplacian $L = D^{-1/2} A D^{-1/2}$, $D = \mathrm{diag}\left(\sum_{i=1}^{n} A_{ij}\right)$.

The adjacency embedding of $A$ in $\mathbb{R}^d$ is:

$$X = [x_1, \dots, x_n]^\top = \Gamma \Lambda^{1/2} \in \mathbb{R}^{n \times d},$$

where $\Lambda$ is a $d \times d$ diagonal matrix containing the $d$ largest eigenvalues of $A$, and $\Gamma$ is an $n \times d$ matrix containing the corresponding orthonormal eigenvectors.

The Laplacian embedding of $A$ in $\mathbb{R}^d$ is:

$$\tilde{X} = [\tilde{x}_1, \dots, \tilde{x}_n]^\top = \tilde{\Gamma} \tilde{\Lambda}^{1/2} \in \mathbb{R}^{n \times d},$$

where $\tilde{\Lambda}$ is a $d \times d$ diagonal matrix containing the $d$ largest eigenvalues of $L$, and $\tilde{\Gamma}$ is an $n \times d$ matrix containing the corresponding orthonormal eigenvectors.
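Both embeddings can be sketched with a symmetric eigendecomposition. This is an illustration under stated assumptions: the function names and the small example graph are hypothetical, negative eigenvalues among the selected ones are clipped to zero before taking the square root, and isolated nodes are given a zero row and column in the Laplacian.

```python
import numpy as np

def spectral_embedding(M, d):
    """Rows of Gamma * Lambda^{1/2} for the d largest eigenvalues of the symmetric matrix M."""
    vals, vecs = np.linalg.eigh(M)            # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:d]          # indices of the d largest eigenvalues
    # Assumption: selected eigenvalues are non-negative; negative ones are clipped to 0.
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

def modified_laplacian(A):
    """L = D^{-1/2} A D^{-1/2}, with D the diagonal matrix of node degrees."""
    deg = A.sum(axis=0).astype(float)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5  # isolated nodes get a zero row/column
    return inv_sqrt[:, None] * A * inv_sqrt[None, :]

# Small hypothetical graph: two triangles joined by one edge.
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]])
X_adj = spectral_embedding(A, d=2)                       # adjacency embedding
X_lap = spectral_embedding(modified_laplacian(A), d=2)   # Laplacian embedding
```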

Spectral estimation of the stochastic blockmodel

Based on asymptotic properties, Rubin-Delanchy et al. (2017) propose the following algorithm for consistent estimation of the latent positions in stochastic blockmodels:

Algorithm 1: Spectral estimation of the stochastic blockmodel (spectral clustering)

Input: adjacency matrix $A$ (or the Laplacian matrix $L$), dimension $d$, and number of communities $K \ge d$.

1. Compute the spectral embedding into $\mathbb{R}^d$: the adjacency embedding $X = [x_1, \dots, x_n]^\top$ or the Laplacian embedding $\tilde{X} = [\tilde{x}_1, \dots, \tilde{x}_n]^\top$.
2. Fit a Gaussian mixture model with $K$ components.

Result: return cluster centres $\mu_1, \dots, \mu_K \in \mathbb{R}^d$ and node memberships $z_1, \dots, z_n$.

In practice, $d$ and $K$ are estimated sequentially. Issues:

- The sequential approach is sub-optimal: the estimate of $K$ depends on the choice of $d$.
- Theoretical results only hold for $d$ fixed and known.
- Distributional assumptions when $d$ is misspecified are not available.

This talk discusses a novel framework for joint estimation of $d$ and $K$.
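Algorithm 1 can be sketched in a few lines of NumPy. This is not the authors' implementation: the mixture is fitted with a plain EM routine written for illustration (with a farthest-point initialisation and a small ridge `reg`, both assumptions), and all function names are hypothetical. Both `d` and `K` must be supplied, which is exactly the limitation listed above.

```python
import numpy as np

def adjacency_embedding(A, d):
    """Step 1: spectral embedding from the d largest eigenvalues of A."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(vals)[::-1][:d]
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

def fit_gmm(X, K, n_iter=100, reg=1e-6):
    """Step 2: plain EM for a K-component Gaussian mixture with full covariances."""
    n, d = X.shape
    centres = [X[0]]                       # farthest-point initialisation (deterministic)
    for _ in range(1, K):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centres], axis=0)
        centres.append(X[np.argmax(dist)])
    mu = np.array(centres)
    Sigma = np.array([np.cov(X.T).reshape(d, d) + reg * np.eye(d)] * K)
    theta = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        logp = np.empty((n, K))            # E-step: log of weighted component densities
        for k in range(K):
            diff = X - mu[k]
            _, logdet = np.linalg.slogdet(Sigma[k])
            maha = np.sum(diff @ np.linalg.inv(Sigma[k]) * diff, axis=1)
            logp[:, k] = np.log(theta[k]) - 0.5 * (logdet + maha + d * np.log(2 * np.pi))
        logp -= logp.max(axis=1, keepdims=True)
        resp = np.exp(logp)
        resp /= resp.sum(axis=1, keepdims=True)
        nk = resp.sum(axis=0) + 1e-12      # M-step: weights, means, covariances
        theta = nk / n
        mu = (resp.T @ X) / nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            Sigma[k] = (resp[:, k, None] * diff).T @ diff / nk[k] + reg * np.eye(d)
    return mu, resp.argmax(axis=1)

def spectral_sbm(A, d, K):
    """Algorithm 1: embed, then cluster; returns (cluster centres, memberships)."""
    return fit_gmm(adjacency_embedding(A, d), K)

# Noise-free illustration: the expected adjacency matrix of a two-community SBM
# (hypothetical values) stands in for an observed graph.
mu_true = np.array([[0.7, 0.2], [0.2, 0.7]])
z_true = np.array([0] * 10 + [1] * 10)
P = mu_true[z_true] @ mu_true[z_true].T
centres, labels = spectral_sbm(P, d=2, K=2)
```

The recovered centres are only identified up to an orthogonal transformation and a relabelling of the communities, so they should be compared with the truth through inner products rather than coordinate by coordinate.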

A Bayesian model for network embeddings

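Choose an integer $m \le n$ and obtain the embedding $X \in \mathbb{R}^{n \times m}$, with $m$ arbitrarily large. Bayesian model for simultaneous estimation of $d$ and $K$, allowing for $d = \mathrm{rank}(B) \le K$:

$$x_i \mid d, z_i, \mu_{z_i}, \Sigma_{z_i}, \sigma^2_{z_i} \overset{d}{\sim} N_m\left( \begin{bmatrix} \mu_{z_i} \\ 0_{m-d} \end{bmatrix}, \begin{bmatrix} \Sigma_{z_i} & 0 \\ 0 & \sigma^2_{z_i} I_{m-d} \end{bmatrix} \right), \quad i = 1, \dots, n,$$

$$(\mu_k, \Sigma_k) \mid d \overset{iid}{\sim} \mathrm{NIW}_d(0, \kappa_0, \nu_0 + d - 1, \Delta_d), \quad k = 1, \dots, K,$$

$$\sigma^2_{kj} \overset{iid}{\sim} \text{Inv-}\chi^2(\lambda_0, \sigma_0^2), \quad j = d + 1, \dots, m,$$

$$d \mid z \overset{d}{\sim} \mathrm{Uniform}\{1, \dots, K_\varnothing\},$$

$$z_i \mid \theta \overset{iid}{\sim} \mathrm{Multinoulli}(\theta), \quad i = 1, \dots, n, \quad \theta \in S_{K-1},$$

$$\theta \mid K \overset{d}{\sim} \mathrm{Dirichlet}\left( \frac{\alpha}{K}, \dots, \frac{\alpha}{K} \right),$$

$$K \overset{d}{\sim} \mathrm{Geometric}(\omega),$$

where $K_\varnothing$ is the number of non-empty communities.

Alternative: $d \overset{d}{\sim} \mathrm{Geometric}(\delta)$.

Yang et al. (2019) independently proposed a similar frequentist model.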
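The block structure of the likelihood can be made concrete with a short sketch that builds the mean and covariance of $x_i$ for one community and draws from it. The parameter values and the function name are hypothetical; the sketch assumes that $\sigma^2_{z_i} I_{m-d}$ denotes a diagonal matrix with the $m - d$ variances $\sigma^2_{kj}$, $j = d+1, \dots, m$, on its diagonal, and it does not sample from the priors.

```python
import numpy as np

def node_distribution(mu_k, Sigma_k, sigma2_k, m):
    """Mean and covariance of x_i given z_i = k, for an m-dimensional embedding."""
    d = mu_k.shape[0]
    mean = np.concatenate([mu_k, np.zeros(m - d)])   # last m - d means are zero
    cov = np.zeros((m, m))
    cov[:d, :d] = Sigma_k                            # full covariance on the first d columns
    cov[d:, d:] = np.diag(sigma2_k)                  # independent noise on the remaining columns
    return mean, cov

# Hypothetical parameter values for one community with d = 2 and m = 5.
mu_k = np.array([0.4, -0.1])
Sigma_k = np.array([[0.02, 0.005], [0.005, 0.03]])
sigma2_k = np.array([0.01, 0.02, 0.015])
mean, cov = node_distribution(mu_k, Sigma_k, sigma2_k, m=5)
x = np.random.default_rng(1).multivariate_normal(mean, cov, size=100)
```

Changing `d` only moves the boundary between the informative block and the noise block, which is what lets the model treat `d` as a parameter while the embedding dimension `m` stays fixed.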

Empirical model validation

Figure 1. Scatterplot of the simulated $X_1$ and $X_2$, i.e. $X_{:d}$.

Figure 2. Scatterplot of the simulated $X_3$ and $X_4$.

Simulated GRDPG-SBM with $n = 2500$, $d = 2$, $K = 5$.

Nodes allocated to communities with probability $\theta_k = P(z_i = k) = 1/K$.

Empirical model validation

Figure 3. Within-cluster and overall means of $X_{:15}$ (x-axis: dimension; legend: overall mean, max/min within-cluster mean, within-cluster mean).

Figure 4. Within-cluster correlation coefficients of $X_{:30}$ (x-axis: correlation coefficient $\rho^{(k)}_{ij}$; histograms of $\rho^{(k)}_{ij}$ for $X_{:d}$ and for $X_{d:}$).

Means are approximately 0 for columns with index $> d$.

It is reasonable to assume correlation $\rho^{(k)}_{ij} = 0$ for $i, j > d$.
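The two diagnostics shown in Figures 3 and 4 can be computed along the following lines. This is a sketch with hypothetical function names and a toy embedding; it assumes the community allocations `z` are known, as they are in a simulation.

```python
import numpy as np

def within_cluster_means(X, z):
    """Return {k: column means of the rows of X allocated to cluster k}."""
    return {int(k): X[z == k].mean(axis=0) for k in np.unique(z)}

def within_cluster_correlations(X, z, k, cols):
    """Correlation coefficients rho_ij^(k), i < j, between the given columns in cluster k."""
    C = np.corrcoef(X[z == k][:, cols], rowvar=False)
    i, j = np.triu_indices(len(cols), k=1)
    return C[i, j]

# Toy embedding with two clusters of three nodes each (hypothetical values).
X = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 1.0],
              [3.0, 6.0, 0.0],
              [0.0, 1.0, 5.0],
              [1.0, 0.0, 4.0],
              [2.0, 2.0, 6.0]])
z = np.array([0, 0, 0, 1, 1, 1])
means = within_cluster_means(X, z)
rho = within_cluster_correlations(X, z, k=0, cols=[0, 1, 2])
```

On a real embedding one would compare the means of the columns beyond `d` with zero, and the histogram of `rho` computed on columns beyond `d` with the one computed on the first `d` columns.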

Curse of dimensionality

Figure 5. Within-block variance and total variance for the adjacency embedding obtained from a simulated SBM with $d = 2$, $K = 5$, $n = 500$, and well separated means $\mu_1 = [0.7, 0.4]$, $\mu_2 = [0.1, 0.1]$, $\mu_3 = [0.4, 0.8]$, $\mu_4 = [-0.1, 0.5]$ and $\mu_5 = [0.3, 0.5]$, and $\theta = (0.2, 0.2, 0.2, 0.2, 0.2)$. (Two panels, dimensions 0 to 500 and 0 to 25; x-axis: dimension; y-axis: variance, in units of $10^{-2}$; legend: within-cluster variance, total variance.)

For some $k$ and $k'$: $\sigma^2_{kj} \approx \sigma^2_{k'j}$ for $j \gg d$ and $k \ne k'$.
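The quantities plotted in Figure 5 can be computed per embedding column as follows. The function name and the toy data are hypothetical, and population variances (`ddof=0`) are an assumption made for the sketch.

```python
import numpy as np

def variance_profile(X, z):
    """Per-column total variance and within-cluster variances of an embedding X."""
    total = X.var(axis=0)
    within = np.array([X[z == k].var(axis=0) for k in np.unique(z)])  # one row per cluster
    return total, within

# Toy embedding with two clusters of three nodes each (hypothetical values).
X = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 1.0],
              [3.0, 6.0, 0.0],
              [0.0, 1.0, 5.0],
              [1.0, 0.0, 4.0],
              [2.0, 2.0, 6.0]])
z = np.array([0, 0, 0, 1, 1, 1])
total, within = variance_profile(X, z)
```

Columns where the rows of `within` are close to each other, and close to `total`, carry little information about the communities, which is the situation described for $j \gg d$.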

Second order clustering

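Bayesian model parsimony: $K$ is underestimated for $d \ll m$.

Possible solution: second order clustering $v = (v_1, \dots, v_K)$ with $v_k \in \{1, \dots, H\}$. If $v_k = v_{k'}$, then $\sigma^2_{kj} = \sigma^2_{k'j}$ for $j > d$:

$$x_i \mid d, z_i, v_{z_i}, \mu_{z_i}, \Sigma_{z_i}, \sigma^2_{v_{z_i}} \overset{d}{\sim} N_m\left( \begin{bmatrix} \mu_{z_i} \\ 0_{m-d} \end{bmatrix}, \begin{bmatrix} \Sigma_{z_i} & 0 \\ 0 & \sigma^2_{v_{z_i}} I_{m-d} \end{bmatrix} \right), \quad i = 1, \dots, n,$$

$$v_k \mid K, H \overset{d}{\sim} \mathrm{Multinoulli}(\phi), \quad k = 1, \dots, K,$$

$$\phi \mid H \overset{d}{\sim} \mathrm{Dirichlet}\left( \frac{\beta}{H}, \dots, \frac{\beta}{H} \right),$$

$$H \mid K \overset{d}{\sim} \mathrm{Uniform}\{1, \dots, K\}.$$

The parameter $v_k$ defines clusters of clusters.

Empirical results show that the model is able to handle $d \ll m$.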
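The effect of the second-order allocation $v$ is a two-level lookup: node to community, then community to shared variance vector. A minimal sketch, with hypothetical values and 0-based labels:

```python
import numpy as np

def node_noise_variances(z, v, sigma2):
    """Variances used for columns d+1..m of each node: row v[z[i]] of sigma2."""
    return sigma2[v[z]]

# Hypothetical example: K = 4 communities grouped into H = 2 second-order clusters.
z = np.array([0, 1, 2, 3, 1, 0])        # community of each node
v = np.array([0, 0, 1, 1])              # second-order cluster of each community
sigma2 = np.array([[0.01, 0.02],        # variances shared by communities with v_k = 0
                   [0.05, 0.04]])       # variances shared by communities with v_k = 1
noise = node_noise_variances(z, v, sigma2)   # one row of m - d variances per node
```

Communities 0 and 1 keep their own means and covariances on the first $d$ columns but share the noise variances, so only $H$ variance vectors need to be estimated instead of $K$.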

Diagram: first-order clusters $N_3(\mu_k, \Sigma_k)$, $k = 1, \dots, 5$ (3 = latent dimension), and second-order clusters $N_8(0_8, \sigma_h^2 I_5)$, $h = 1, 2, 3$.

Extension to directed and bipartite graphs

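Consider a directed graph with adjacency matrix $A \in \{0, 1\}^{n \times n}$.

The $d$-dimensional adjacency embedding of $A$ in $\mathbb{R}^{2d}$ is defined as:

$$X = U D^{1/2} \oplus V D^{1/2} = \begin{bmatrix} U D^{1/2} & V D^{1/2} \end{bmatrix} = \begin{bmatrix} X_s & X_r \end{bmatrix},$$

where $A = U D V^\top + U_\perp D_\perp V_\perp^\top$ is the singular value decomposition of $A$, with $U \in \mathbb{R}^{n \times d}$, $D \in \mathbb{R}_+^{d \times d}$ diagonal, and $V \in \mathbb{R}^{n \times d}$.

Essentially, only three distributions change:

$$x_i \mid d, K, z_i \overset{d}{\sim} N_{2m}\left( \begin{bmatrix} \mu_{z_i} \\ 0_{m-d} \\ \mu'_{z_i} \\ 0_{m-d} \end{bmatrix}, \begin{bmatrix} \Sigma_{z_i} & 0 & 0 & 0 \\ 0 & \sigma^2_{z_i} I_{m-d} & 0 & 0 \\ 0 & 0 & \Sigma'_{z_i} & 0 \\ 0 & 0 & 0 & \sigma^{2\prime}_{z_i} I_{m-d} \end{bmatrix} \right),$$

$$(\mu_k, \Sigma_k) \mid d, K \overset{iid}{\sim} \mathrm{NIW}_d(0, \kappa_0, \nu_0 + d - 1, \Delta_d), \quad k = 1, \dots, K,$$

$$\sigma^2_{kj} \mid d, K \overset{iid}{\sim} \text{Inv-}\chi^2(\lambda_0, \sigma_0^2), \quad j = d + 1, \dots, m.$$

Co-clustering: different clusters for sources and receivers, which covers bipartite graphs.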
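The directed embedding can be sketched with a truncated singular value decomposition. The function name and the example matrix are hypothetical; a rank-2 real matrix is used in place of a binary adjacency matrix so that the reconstruction is exact.

```python
import numpy as np

def directed_adjacency_embedding(A, d):
    """Return X = [Xs Xr] in R^{n x 2d}, with Xs = U D^{1/2} and Xr = V D^{1/2}."""
    U, s, Vt = np.linalg.svd(A)          # singular values in descending order
    scale = np.sqrt(s[:d])
    Xs = U[:, :d] * scale                # source positions
    Xr = Vt[:d].T * scale                # receiver positions
    return np.hstack([Xs, Xr]), Xs, Xr

# Hypothetical rank-2 example: M = S R^T.
S = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
R = np.array([[1.0, 2.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
M = S @ R.T
X, Xs, Xr = directed_adjacency_embedding(M, d=2)
```

Each node gets two positions, one as a source and one as a receiver, and `Xs @ Xr.T` recovers the rank-$d$ part $U D V^\top$ of the matrix.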

Enron e-mail network: number of clusters

Figure 6. Posterior histogram of $K_\varnothing$ and $H_\varnothing$, unconstrained model, MAP for $d$ in red.

Figure 7. Posterior histogram of $K_\varnothing$ and $H_\varnothing$, constrained model, MAP for $d$ in red.

Enron e-mail network: number of clusters

Figure 8. Posterior histogram of $K_\varnothing$, unconstrained model without second order clustering, MAP for $d$ in red.

Figure 9. Posterior histogram of $K_\varnothing$, constrained model without second order clustering, MAP for $d$ in red.

Enron e-mail network: scree-plot

Figure 10. Singular values of the adjacency matrix.

Choice of d is consistent with the elbow of the scree-plot.
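For comparison with a model-based estimate of $d$, a scree-plot elbow can be located with a crude heuristic. The rule used here (largest gap between consecutive singular values) is an assumption chosen for illustration and is not the method described in the talk; the function name and the example matrix are hypothetical.

```python
import numpy as np

def largest_gap_elbow(A, max_d=None):
    """Return (d, singular values), with d placed just before the largest drop."""
    s = np.linalg.svd(A, compute_uv=False)   # singular values, descending
    if max_d is not None:
        s = s[:max_d + 1]                    # only look for the elbow among the first max_d
    gaps = s[:-1] - s[1:]                    # drop between consecutive singular values
    return int(np.argmax(gaps)) + 1, s

# Hypothetical matrix with singular values 10, 9, 1 and 0.5.
d_elbow, sv = largest_gap_elbow(np.diag([10.0, 9.0, 1.0, 0.5]))
```

In this toy case the largest drop is between the second and third singular values, so the heuristic returns 2.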

Conclusion

Community detection and stochastic blockmodels:

- Bayesian model for simultaneous selection of $K$ and $d$ in generalised random dot product graphs.
- Allows for initial misspecification of the arbitrarily large parameter $m$, then refines the estimate of $d$.
- Gaussian mixture model (with constraints) based on spectral embedding.
- Easy to extend to directed and bipartite graphs.

More details: Sanna Passino and Heard, 2019 (arXiv: 1904.05333).

References

Athreya, A. et al. (2018). “Statistical Inference on Random Dot Product Graphs: a Survey”. In: Journal of Machine Learning Research 18.226, pp. 1–92.

Rubin-Delanchy, P. et al. (2017). “A statistical interpretation of spectral embedding: the generalised random dot product graph”. In: arXiv e-prints. arXiv: 1709.05506.

Sanna Passino, F. and N. A. Heard (2019). “Bayesian estimation of the latent dimension and communities in stochastic blockmodels”. In: arXiv e-prints. arXiv: 1904.05333.

Yang, C. et al. (2019). “Simultaneous dimensionality and complexity model selection for spectral graph clustering”. In: arXiv e-prints. arXiv: 1904.02926.
