FAST COUNTING OF TRIANGLES IN LARGE NETWORKS: ALGORITHMS AND LAWS
RPI Theory Seminar, 24 November 2008
Charalampos (Babis) Tsourakakis, School of Computer Science, Carnegie Mellon University
http://www.cs.cmu.edu/~ctsourak
Counting Triangles
Given an undirected, simple graph G(V,E), a triangle is a set of 3 vertices such that any two of them are connected by an edge of the graph.
Related Problems
a) Decide if a graph is triangle-free.
b) Count the total number of triangles δ(G).
c) Count the number of triangles δ(v) that each vertex v participates in.
d) List the triangles that each vertex v participates in.
Our focus
$$\delta(v) = |\{(u,w) \in E : (v,u) \in E,\ (v,w) \in E\}|$$
Why is triangle counting important*?
• Social Network Analysis: “Friends of friends are friends” [WF94]
• Web Spam Detection [BPCG08]
• Hidden Thematic Structure of the Web [EM02]
• Motif Detection, e.g. in biological networks [YPSB05]
* A few indicative reasons, from the graph mining perspective.
Why is triangle counting important?
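Furthermore, two often-used metrics are:

Clustering Coefficient
$$CC(G) = \frac{1}{|V'|}\sum_{v \in V'} cc(v) = \frac{1}{|V'|}\sum_{v \in V'} \frac{\delta(v)}{\tau(v)}$$
where $V' = \{v : d(v) \ge 2\}$ and $\tau(v) = \binom{d(v)}{2}$.

Transitivity Ratio
$$TR(G) = \frac{3\,\delta(G)}{\tau(G)}$$
where $\delta(G) = \frac{1}{3}\sum_{v \in V}\delta(v)$ and $\tau(G) = \sum_{v \in V}\tau(v)$.

[Figure: a triple at node v; a triangle]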
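A minimal sketch of how the two metrics could be computed on a small graph. It assumes δ(v) is the number of triangles through v and that the number of triples centered at v is d(v)(d(v)-1)/2; the function names and the toy graph are hypothetical, not taken from the talk.

```python
from itertools import combinations


def local_triangles(adj):
    """delta(v): triangles through each vertex of an undirected simple graph."""
    return {
        v: sum(1 for u, w in combinations(sorted(nbrs), 2) if w in adj[u])
        for v, nbrs in adj.items()
    }


def clustering_and_transitivity(adj):
    """Return (clustering coefficient, transitivity ratio)."""
    delta = local_triangles(adj)
    # Triples centered at v: unordered pairs of neighbours of v.
    tau = {v: len(n) * (len(n) - 1) // 2 for v, n in adj.items()}
    v_prime = [v for v in adj if len(adj[v]) >= 2]
    cc = sum(delta[v] / tau[v] for v in v_prime) / len(v_prime)
    total_triangles = sum(delta.values()) // 3
    tr = 3 * total_triangles / sum(tau.values())
    return cc, tr


# Toy graph (hypothetical): triangle 0-1-2 plus a pendant vertex 3 attached to 2.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
adj = {v: set() for e in edges for v in e}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

cc, tr = clustering_and_transitivity(adj)
print(cc, tr)
```

On this toy graph the two metrics differ (7/9 versus 3/5), because the clustering coefficient averages per-vertex ratios while the transitivity ratio divides global totals.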
Outline
• Related Work
• Proposed Method
• Experiments
• Triangle-related Laws
• Triangles in Kronecker Graphs
• Future Work & Open Problems
Counting methods

Dense graphs

| | Fast | Low space |
|---|---|---|
| Time complexity | $O(n^{2.37})$ | $O(n^3)$ |
| Space complexity | $O(n^2)$ | $O(m)$ |

Sparse graphs

| | Fast | Low space |
|---|---|---|
| Time complexity | $O(m^{0.7} n^{1.2} + n^{2+o(1)})$ | e.g. $O(n\, d_{\max}^2)$ |
| Space complexity | $\Theta(n^2)$ (eventually) | $\Theta(m)$ |
Outline of the Proposed Method
• EigenTriangle theorem
• EigenTriangleLocal theorem
• EigenTriangle algorithm
• EigenTriangleLocal algorithm
• Efficiency & Complexity
• Power law degree distributions
• Gershgorin discs
• Real world network spectra
Theorem [EigenTriangle]
Theorem The number of triangles δ(G) in an undirected, simple graph G(V,E) is given by:
$$\delta(G) = \frac{1}{6}\sum_{i=1}^{|V|} \lambda_i^3$$
where $\lambda_1, \lambda_2, \ldots, \lambda_{|V|}$ are the eigenvalues of the adjacency matrix of graph G.
Proof
Call A the adjacency matrix of the graph. Consider the i-th diagonal element of $A^3$, $\alpha_{ii}$. This element counts the closed walks of length 3 that start and end at vertex i, so it equals twice the number of triangles vertex i participates in (each triangle is counted as i-j-k and as i-k-j). The trace of $A^3$ is therefore 6δ(G), because each triangle is counted 6 times (once from each of its 3 participating vertices, in both directions). Furthermore, if Ax=λx, then $\lambda^3$ is an eigenvalue of $A^3$ (*), and vice versa: if λ is an eigenvalue of $A^3$, then $\sqrt[3]{\lambda}$ is an eigenvalue of A. Since the trace equals the sum of the eigenvalues, $\sum_i \lambda_i^3 = 6\,\delta(G)$.
(*) $A^3x = AAAx = AA\lambda x = \lambda AAx = \lambda A\lambda x = \lambda^2 Ax = \lambda^3 x$
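A quick numerical check of the EigenTriangle identity, δ(G) = (1/6) Σ λ_i³ = trace(A³)/6. This is a sketch assuming NumPy and a dense adjacency matrix; the function names are hypothetical, and the complete graph K4 (which has 4 triangles) is only a convenient test case.

```python
import numpy as np


def triangles_from_trace(A):
    """Triangles via trace(A^3) / 6."""
    return int(round(np.trace(A @ A @ A) / 6))


def triangles_from_spectrum(A):
    """Triangles via the sum of cubed eigenvalues / 6."""
    eigenvalues = np.linalg.eigvalsh(A)  # A is symmetric, so eigenvalues are real
    return int(round(np.sum(eigenvalues ** 3) / 6))


# K4: every 3 of the 4 vertices form a triangle, so there are 4 triangles.
A = np.ones((4, 4)) - np.eye(4)
print(triangles_from_trace(A), triangles_from_spectrum(A))
```

The eigenvalues of K4 are 3 and -1 (three times), so the cubes sum to 27 - 3 = 24 and both routes give 24 / 6 = 4.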
Theorem [EigenTriangleLocal]
Theorem The number of triangles δ(i) that vertex i participates in is equal to:
$$\delta(i) = \frac{1}{2}\sum_{j=1}^{|V|} \lambda_j^3\, u_{ij}^2$$
where $u_{ij}$ is the i-th entry of the j-th eigenvector.
Proof [Sketch] Follows from the previous theorem and the fact that A is symmetric and therefore diagonalizable, $A = U \Lambda U^T$, so that also $A^3 = U \Lambda^3 U^T$.
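The per-vertex identity can be checked the same way. The sketch below assumes NumPy; `numpy.linalg.eigh` returns the eigenvectors as columns, so `U[i, j]` is the i-th entry of the j-th eigenvector. Function names and the toy graph are hypothetical.

```python
import numpy as np


def local_triangles_spectral(A):
    """delta(i) = 1/2 * sum_j lambda_j^3 * U[i, j]^2."""
    lam, U = np.linalg.eigh(A)  # columns of U are orthonormal eigenvectors
    return (U ** 2) @ (lam ** 3) / 2


def local_triangles_direct(A):
    """delta(i) = (A^3)[i, i] / 2, for comparison."""
    return np.diag(A @ A @ A) / 2


# Toy graph: triangle 0-1-2 plus a pendant vertex 3 attached to 2.
A = np.zeros((4, 4))
for u, v in [(0, 1), (0, 2), (1, 2), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

print(np.round(local_triangles_spectral(A), 6))
```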
EigenTriangle Algorithm
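A minimal sketch of how an eigenvalue-based estimate of this kind could be organized. Everything here is an assumption for illustration: the function name, the `tol` parameter, and the stopping rule (stop once the newest cube is a small fraction of the running sum). A dense NumPy solver stands in for the Lanczos method, which a real implementation would use to obtain only the top eigenvalues of a sparse matrix.

```python
import numpy as np


def eigen_triangle(A, tol=0.05, max_rank=None):
    """Estimate the triangle count from the largest-magnitude eigenvalues.

    Returns (estimate, number of eigenvalues used).
    """
    # Stand-in for Lanczos: dense eigenvalues, sorted by decreasing |lambda|.
    lam = np.linalg.eigvalsh(A)
    lam = lam[np.argsort(-np.abs(lam))]
    max_rank = len(lam) if max_rank is None else min(max_rank, len(lam))

    cube_sum = 0.0
    rank = 0
    for value in lam[:max_rank]:
        cube_sum += value ** 3
        rank += 1
        # Assumed stopping rule: the newest cube barely changes the running sum.
        if cube_sum > 0 and abs(value ** 3) <= tol * cube_sum:
            break
    return cube_sum / 6, rank


# K5 has 10 triangles; its eigenvalues are 4 and -1 (four times).
K5 = np.ones((5, 5)) - np.eye(5)
estimate, rank = eigen_triangle(K5)
exact, full_rank = eigen_triangle(K5, tol=0.0)
print(estimate, rank, exact, full_rank)
```

With the default tolerance the loop stops after two eigenvalues with an estimate of 10.5, within 5% of the true count of 10; with `tol=0.0` it uses the whole spectrum and recovers the exact value.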
EigenTriangleLocal Algorithm
Why are these two algorithms efficient?
Skewed Degree Distributions
Skewed degree distributions are ubiquitous in nature! They have been termed “the signature of human activity” [FKP02], but they appear in all other kinds of networks as well, e.g. biological ones. See [N05][M04] for generative models of power law distributions.
They are typically referred to as power laws (even if we sometimes abuse the strict definition of a power law, i.e. $\log(y) = a \log(x) + b$).
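To make the log-log definition concrete, here is a small sketch that fits a and b in log(y) = a·log(x) + b by ordinary least squares. The function name and the synthetic data are hypothetical; a straight-line fit in log-log space is a quick diagnostic rather than a rigorous test for a power law.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares line through (log x, log y); returns (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Synthetic data (hypothetical): y = 100 * x^(-2), an exact power law.
xs = [1, 2, 4, 8, 16]
ys = [100 * x ** -2 for x in xs]
a, b = fit_power_law(xs, ys)
print(a, b)
```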
Examples of power laws
Newman [N05] demonstrated how often power laws appear, using many different types of networks, ranging from word frequencies to populations of cities.
[Figure annotations: many cities have a small population; few cities have a huge population]
Gershgorin’s Discs
Theorem Let B be an arbitrary n×n matrix. Then the eigenvalues λ of B are located in the union of the n discs
$$|\lambda - b_{kk}| \le \sum_{j \ne k} |b_{kj}|$$
For a proof see Demmel [D97], p.82.
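For an adjacency matrix the diagonal is zero and each row sum is a vertex degree, so every disc is centered at 0 with radius d(k). The sketch below (assuming NumPy; function names and the star graph are hypothetical) computes the discs and checks that the eigenvalues fall inside them.

```python
import numpy as np


def gershgorin_discs(B):
    """Return (centers, radii) of the Gershgorin discs of a square matrix."""
    B = np.asarray(B, dtype=float)
    centers = np.diag(B)
    radii = np.sum(np.abs(B), axis=1) - np.abs(centers)
    return centers, radii


def in_some_disc(value, centers, radii, eps=1e-9):
    """True if `value` lies in at least one disc (with a small tolerance)."""
    return bool(np.any(np.abs(value - centers) <= radii + eps))


# Star graph: vertex 0 connected to four leaves.
A = np.zeros((5, 5))
A[0, 1:] = 1.0
A[1:, 0] = 1.0

centers, radii = gershgorin_discs(A)
eigenvalues = np.linalg.eigvalsh(A)
print(radii, eigenvalues.max())
```

The largest disc has radius 4 while the largest eigenvalue of this star is only 2, a small instance of how loose the bound can be.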
Gershgorin Discs
Bounds on the airports network (observe how loose they are).
Typical real world spectra
[Figure panels: Airports; Political blogs]
Top Eigenvalues
Zooming in on the top eigenvalues and plotting the rank vs. the eigenvalue in log-log scale reveals that the top eigenvalues follow a power law [FFF99].
Some years later, Mihail & Papadimitriou [MP02] and Chung, Lu and Vu [CLV03] proved this fact.
Our idea
Simple & clear: use a low-rank approximation of $A^3$ to estimate the diagonal elements and the trace.
It also suggests a way of thinking: take advantage of special properties (e.g. power laws) to reduce the complexity of certain computational tasks in real-world networks.
Summing up: Why does it work?
The first main reason is the almost-symmetry of the spectrum around 0 for the bulk of the eigenvalues, i.e. all except the top ones.
Cubes strongly amplify this phenomenon!
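One way to read these two points is as a split of the EigenTriangle sum into a "top" part and a "bulk" part. The index set T (the few largest-magnitude eigenvalues) and the pairing of bulk eigenvalues are illustrative assumptions, not notation from the talk.

```latex
\delta(G) \;=\; \frac{1}{6}\sum_{i=1}^{|V|}\lambda_i^3
\;=\; \frac{1}{6}\sum_{i \in T}\lambda_i^3 \;+\; \frac{1}{6}\sum_{i \notin T}\lambda_i^3 .
% If the bulk is nearly symmetric around 0, each bulk eigenvalue \lambda has a
% partner close to -\lambda, and \lambda^3 + (-\lambda)^3 = 0, so the second sum
% nearly cancels.
% Cubing also widens the gap between top and bulk:
% if |\lambda_i| \le \tfrac{1}{2}|\lambda_1| then |\lambda_i|^3 \le \tfrac{1}{8}|\lambda_1|^3 .
```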
Complexity Analysis
The main computational bottleneck, which determines the complexity, is the Lanczos method.
Lanczos runs in time linear in the number of non-zero entries of the matrix, i.e. the edges, assuming that we compute a small, constant number of eigenvalues.
Convergence of Lanczos is fast due to the eigenvalue power law (see Kaniel-Paige theory [GL89]).
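The linear cost per step comes from the matrix-vector product, which touches each edge a constant number of times. The sketch below illustrates that with an adjacency-list product and uses plain power iteration, not Lanczos, as a deliberately simpler stand-in to estimate the top eigenvalue. Function names and the example graph are hypothetical; power iteration also assumes a unique largest-magnitude eigenvalue, so it would not converge on a bipartite graph.

```python
import math


def matvec(adj, x):
    """y = A x for an adjacency-list graph; each undirected edge is read twice."""
    return [sum(x[u] for u in nbrs) for nbrs in adj]


def top_eigenvalue_power_iteration(adj, iterations=100):
    """Power iteration (a simpler stand-in for Lanczos) for the top eigenvalue."""
    n = len(adj)
    x = [1.0 + i for i in range(n)]  # arbitrary non-uniform start vector
    for _ in range(iterations):
        y = matvec(adj, x)
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    y = matvec(adj, x)
    return sum(a * b for a, b in zip(x, y))  # Rayleigh quotient x^T A x


# K4 as adjacency lists; its largest eigenvalue is 3.
adj = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
estimate = top_eigenvalue_power_iteration(adj)
print(estimate)
```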
Datasets
Competitor: Node Iterator
The Node Iterator algorithm considers one node at a time, looks at its neighbors and counts how many pairs of them are connected to each other.
Complexity: $O(n\, d_{\max}^2)$
We report the results as the speedup that the EigenTriangle algorithm gives compared to the running time of the Node Iterator.
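A minimal sketch of a node-iterator style counter as described above, assuming the graph is stored as a dictionary of neighbor sets. The function name and the toy graph are hypothetical.

```python
from itertools import combinations


def node_iterator(adj):
    """Return (total triangles, per-vertex counts) for an undirected simple graph."""
    per_vertex = {}
    for v, nbrs in adj.items():
        # Test every pair of neighbours of v: about d(v)^2 / 2 membership checks.
        per_vertex[v] = sum(1 for u, w in combinations(nbrs, 2) if w in adj[u])
    # Every triangle is seen once from each of its three vertices.
    return sum(per_vertex.values()) // 3, per_vertex


# Toy graph: triangle 0-1-2 plus a pendant vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
total, per_vertex = node_iterator(adj)
print(total, per_vertex)
```

The pair test inside the loop is where the d_max² factor in the stated complexity comes from.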
Results: #Eigenvalues vs. Speedup
[Figure: scatterplot of the number of eigenvalues vs. speedup]
Results: #Edges vs. Speedup
[Figure: scatterplot of the number of edges vs. speedup]
Main points
Some interesting facts about the two scatterplots:
• The mean approximation rank required for at least 95% accuracy is 6.2.
• Speedups are between 33.7x and 1159x. The mean speedup is 250x.
• Notice the increasing speedup as the size of the network grows.
Zooming in
[Figure annotation: zooming in on this point]
Evaluating the Local Counting Method
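• Pearson’s correlation coefficient ρ
• Relative Reconstruction Error
$$RRE = \frac{1}{|V|}\sum_{i=1}^{|V|}\frac{|\delta(i) - \delta'(i)|}{\delta(i)}$$
where δ'(i) is the estimated number of triangles of vertex i.
Political Blogs: RRE $7 \times 10^{-4}$, ρ 99.97%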
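A small sketch of both evaluation measures, with `exact` standing for the true per-vertex counts and `estimated` for the approximations. The function names and the sample numbers are hypothetical. One assumption is labeled in the code: vertices with zero triangles are left out of the RRE average, since the ratio is undefined for them.

```python
import math


def relative_reconstruction_error(exact, estimated):
    """Mean of |exact - estimated| / exact.

    Assumption: vertices with exact == 0 are skipped (ratio undefined).
    """
    pairs = [(d, e) for d, e in zip(exact, estimated) if d > 0]
    return sum(abs(d - e) / d for d, e in pairs) / len(pairs)


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical per-vertex triangle counts and estimates.
exact = [10, 20, 40, 5]
estimated = [11, 18, 40, 5]
rre = relative_reconstruction_error(exact, estimated)
rho = pearson(exact, estimated)
print(rre, rho)
```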
#Eigenvalues vs. ρ for three networks
Observe how a low rank results in almost optimal results. This holds for surprisingly many real-world networks.
Triangle Participation Law
Plots the number of triangles δ (x-axis) vs. the count of vertices that participate in δ triangles.
a) EPINIONS, who-trusts-whom network
b) ASN, social network
c) HEP_TH, collaboration network
Degree Triangle Law
Plots the degree $d_i$ (x-axis) vs. the mean number of triangles that nodes with degree $d_i$ participate in.
[Figure panels: Epinions; ASN]
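The data behind both kinds of plot (triangle participation and degree vs. mean triangles) can be tabulated from per-vertex triangle counts. The sketch below assumes a dictionary of neighbor sets; the function name and toy graph are hypothetical, and real plots would use log-log axes on much larger graphs.

```python
from collections import Counter, defaultdict
from itertools import combinations


def triangle_law_data(adj):
    """Return (participation histogram, mean triangles per degree)."""
    delta = {
        v: sum(1 for u, w in combinations(nbrs, 2) if w in adj[u])
        for v, nbrs in adj.items()
    }
    # Triangle participation: delta -> number of vertices with that many triangles.
    participation = Counter(delta.values())
    # Degree vs. triangles: degree -> mean triangle count of vertices with that degree.
    by_degree = defaultdict(list)
    for v, nbrs in adj.items():
        by_degree[len(nbrs)].append(delta[v])
    degree_triangle = {d: sum(c) / len(c) for d, c in by_degree.items()}
    return participation, degree_triangle


# Toy graph: triangle 0-1-2 plus a pendant vertex 3 attached to 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
participation, degree_triangle = triangle_law_data(adj)
print(dict(participation), degree_triangle)
```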
Kronecker Graphs
This model was introduced in [LCKF05]. It is based on the simple operation of the Kronecker product and generates graphs that mimic real world networks.
Deterministic Kronecker Graphs: Kronecker product of the adjacency matrix at the current step k with the initiator adjacency matrix (typically small).
Stochastic Kronecker Graphs: Kronecker product of the matrix at the current step k with the initiator matrix. The initiator matrix contains probabilities. For more details see [LF07].
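A rough sketch of the stochastic variant in plain Python. The initiator values, the function names, and the sampling rule (each undirected edge kept independently with the probability in the Kronecker power, using only the upper triangle) are assumptions made for illustration; see the cited papers for the actual model.

```python
import random


def kron(A, B):
    """Kronecker product of two square matrices given as lists of lists."""
    n, m = len(A), len(B)
    return [
        [A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
        for i in range(n * m)
    ]


def kronecker_power(initiator, steps):
    """Multiply by the initiator `steps` times, starting from the initiator."""
    M = initiator
    for _ in range(steps):
        M = kron(M, initiator)
    return M


def sample_undirected(P, rng):
    """Assumed rule: keep edge {i, j}, i < j, with probability P[i][j]."""
    n = len(P)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < P[i][j]:
                A[i][j] = A[j][i] = 1
    return A


initiator = [[0.9, 0.5], [0.5, 0.1]]  # hypothetical edge probabilities
P = kronecker_power(initiator, 2)     # 8 x 8 probability matrix
A = sample_undirected(P, random.Random(0))
print(len(P), sum(map(sum, A)) // 2)
```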
Triangles in Kronecker Graphs
Some notation first:
A: n×n initiator adjacency matrix of the undirected, simple graph $G_A$
B = $A^{[k]}$: the k-th Kronecker product of A (A multiplied by itself k times, so $A^{[0]} = A$)
λ = (λ_1, ..., λ_n): the eigenvalues of A
Δ($G_A$), Δ($G_B$): the number of triangles of $G_A$, $G_B$
Theorem [KroneckerTRC]
$$\Delta(G_B) = 6^{k}\,\Delta(G_A)^{k+1}, \quad k \ge 0$$
Proof
We use induction on the number of recursion steps k. For k=0 the theorem trivially holds.
Assume now that KroneckerTRC holds for some $r \ge 0$. Call $C = A^{[r]}$, $D = A^{[r+1]}$ and let $[\mu_i]_{i=1..s}$ be the eigenvalues of C. By the assumption,
$$\Delta(G_C) = 6^{r}\,\Delta(G_A)^{r+1}$$
The eigenvalues of D are given by the Kronecker product of the spectra of C and A, i.e. all products $\mu_i \lambda_j$. By the EigenTriangle theorem, the number of triangles in D is given by:
$$\Delta(G_D) = \frac{1}{6}\sum_{i=1}^{s}\sum_{j=1}^{n}(\mu_i\lambda_j)^3 = \frac{1}{6}\sum_{i=1}^{s}\mu_i^3\sum_{j=1}^{n}\lambda_j^3 = \frac{1}{6}\sum_{i=1}^{s}\mu_i^3 \cdot 6\,\Delta(G_A) = \Delta(G_A)\sum_{i=1}^{s}\mu_i^3 = 6\,\Delta(G_C)\,\Delta(G_A) = 6 \cdot 6^{r}\,\Delta(G_A)^{r+1}\,\Delta(G_A) = 6^{r+1}\,\Delta(G_A)^{r+2}$$
Therefore KroneckerTRC holds for all $k \ge 0$. Q.E.D.
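The relation between triangle counts under Kronecker products can be checked numerically. This sketch assumes NumPy and the convention that the step-k matrix is A multiplied by itself k times (so k = 0 gives A, and the count is 6^k · Δ(G_A)^(k+1)); the function names and the initiator graph are hypothetical.

```python
import numpy as np


def count_triangles(A):
    """Triangles via trace(A^3) / 6 (exact for integer adjacency matrices)."""
    return int(round(np.trace(np.linalg.matrix_power(A, 3)) / 6))


def kronecker_power(A, k):
    """A multiplied by itself k times with the Kronecker product (k = 0 gives A)."""
    B = A
    for _ in range(k):
        B = np.kron(B, A)
    return B


# Hypothetical initiator: K4 minus the edge 2-3, which has 2 triangles.
A = np.array([
    [0, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
])

base = count_triangles(A)
for k in range(3):
    print(k, count_triangles(kronecker_power(A, k)), 6 ** k * base ** (k + 1))
```

The underlying reason is that trace((C ⊗ A)³) = trace(C³) · trace(A³), which is the same factorization the proof uses through the eigenvalues.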
Theoretical Challenge I: Spectra of real world networks
Can we prove things about the distribution of the eigenvalues, adopting a random graph model such as the expected degree model G(w) [CLV03]?
An analog to Wigner’s semicircle law for random Erdős-Rényi graphs (see Füredi-Komlós [FK81]).
[Figure: spectrum of $G_{40,1/2}$ over 100000 iterations [S07]]
Empirically, the rest of the spectrum follows a triangular-like distribution [FDBV01]. Can we prove something about this empirical observation?
Theoretical Challenge II: Eigenvectors of real world networks
Things are even “worse” than in the case of spectra: very little is known about the eigenvectors. Related work: see [P08] for random graphs.
Theoretical Challenge III: Degree Triangle Law
Prove, using the expected degree random graph model G(w), the pattern we saw (see [S04]).
Conjecture: The relationship we observed probably appears for some values of the slope of the degree distribution. Further recent experiments showed that for some graphs this pattern does not hold.
Experimental Challenge I: Compare with Streaming Methods
Streaming or semi-streaming methods perform one or O(1) passes over the graph [YKS02][BFLSS06][BPCG08].
Common underlying idea: sophisticated sampling methods.
Implement and compare.
Practical Challenge I: Triangles in Large Scale Graph Mining
Many gigabyte- and petabyte-sized graphs. How to handle these graphs? HADOOP
EigenTriangle algorithms are based just on simple matrix-vector multiplications.
Easy to parallelize in all sorts of architectures (distributed memory, shared memory).
See [DHV93] for the details.
PEGASUS: Peta-Graph Mining from the Triangle perspective
On-going work with U Kang and Christos Faloutsos in collaboration with Yahoo! Research.
Among others: implement the EigenTriangle algorithms in HADOOP and compare to other methods.
Find outliers with respect to triangles in graphs with many billions of edges.
Soon… Stay tuned!
Curious about:
Acknowledgements
Christos Faloutsos and Yiannis Koutis, for the helpful discussions
Maria Tsiarli, for the PEGASUS logo
References
[WF94] Wasserman, Faust: “Social Network Analysis: Methods and Applications (Structural Analysis in the Social Sciences)”
[EM02] Eckmann, Moses: “Curvature of co-links uncovers hidden thematic layers in the World Wide Web”
[BPCG08] Becchetti, Boldi, Castillo, Gionis: “Efficient Semi-Streaming Algorithms for Local Triangle Counting in Massive Graphs”
[FKP02] Fabrikant, Koutsoupias, Papadimitriou: “Heuristically Optimized Trade-offs: A New Paradigm for Power Laws in the Internet”
[N05] Newman: “Power laws, Pareto distributions and Zipf's law”
[M04] Mitzenmacher: “A brief history of generative models for power law and lognormal distributions”
[FK81] Füredi, Komlós: “Eigenvalues of random symmetric matrices”
[S04] Danilo Sergi: “Random graph model with power-law distributed triangle subgraphs”
[D97] Demmel: “Applied Numerical Linear Algebra”
[LCKF05] Leskovec, Chakrabarti, Kleinberg, Faloutsos: “Realistic, Mathematically Tractable Graph Generation and Evolution using Kronecker Multiplication”
[LF07] Leskovec, Faloutsos: “Scalable Modeling of Real Graphs using Kronecker Multiplication”
[FFF99] Faloutsos, Faloutsos, Faloutsos: “On power-law relationships of the Internet topology”
[MP02] Mihail, Papadimitriou: “On the Eigenvalue Power Law”
[CLV03] Chung, Lu, Vu: “Spectra of Random Graphs with given expected degrees”
[YKS02] Bar-Yossef, Kumar, Sivakumar: “Reductions in streaming algorithms, with an application to counting triangles in graphs”
[GL89] Golub, Van Loan: “Matrix Computations”
[BFLSS06] Buriol, Frahling, Leonardi, Spaccamela, Sohler: “Counting triangles in data streams”
[DHV93] Demmel, Heath, van der Vorst: “Parallel Numerical Linear Algebra”
[YPSB05] Ye, Peyser, Spencer, Bader: “Commensurate distances and similar motifs in genetic congruence and protein interaction networks in yeast”
[P08] Pradipta Mitra: “Entrywise Bounds for Eigenvectors of Random Graphs”
[FDBV01] Farkas, Derenyi, Barabasi, Vicsek: “Spectra of "real-world" graphs: Beyond the semi-circle law”
[S07] Spielman’s “Spectral Graph Theory and its Applications” class (Yale): http://www.cs.yale.edu/homes/spielman/eigs/
[F08] Faloutsos’ “Multimedia Databases and Data Mining” class (CMU): http://www.cs.cmu.edu/~christos/courses/826.S08
For more references, see also the paper: http://www.cs.cmu.edu/~ctsourak/tsourICDM08.pdf