Clustered graph, visualization and hierarchical visualization

Nathalie Villa-Vialaneix

http://www.nathalievilla.org

[email protected]

Séminaire LIPN, Université Paris 13

November, 15th, 2012

Graph mining (Séminaire LIPN) Nathalie Villa-Vialaneix Paris, 11/15 2012 1 / 35

An overview on graph visualization and clustering

Framework

A graph (network) G = (V ,E,W) with

• n vertices (nodes) V = {x1, . . . , xn};

• edges, E, weighted by Wij = Wji ≥ 0 (Wii = 0).



Network mining through visualization

A standard approach for network mining: using a force-directed placement (FDP) algorithm to display the graph; e.g., [Fruchterman and Reingold, 1991]

• attractive forces: along the edges, analogous to springs;

• repulsive forces: between all pairs of nodes, analogous to electric forces.

The algorithm starts from an initial (random) position and iterates until the layout is stabilized.
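The force model described above can be sketched as follows. This is a minimal illustration in the spirit of [Fruchterman and Reingold, 1991], not the reference implementation: the function name `fdp_layout`, the ideal edge length `k`, the force laws (attraction in d²/k, repulsion in k²/d) and the cooling schedule are assumptions chosen for brevity.

```python
import numpy as np

def fdp_layout(W, n_iter=200, seed=0):
    """Toy force-directed placement for a symmetric weight matrix W (n x n)."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    pos = rng.uniform(-0.5, 0.5, size=(n, 2))     # initial random positions
    k = 1.0 / np.sqrt(n)                          # ideal edge length (assumption)
    temp = 0.1                                    # maximal move per iteration
    for _ in range(n_iter):
        delta = pos[:, None, :] - pos[None, :, :]  # delta[i, j] = pos_i - pos_j
        dist = np.linalg.norm(delta, axis=2)
        np.fill_diagonal(dist, 1.0)                # delta[i, i] is zero anyway
        dist = np.maximum(dist, 1e-9)
        unit = delta / dist[:, :, None]
        # repulsion between all pairs of nodes ("electric" forces)
        repulse = (k ** 2 / dist)[:, :, None] * unit
        # attraction along the edges only ("springs"), scaled by the weights
        attract = -(W * dist ** 2 / k)[:, :, None] * unit
        disp = (repulse + attract).sum(axis=1)
        length = np.maximum(np.linalg.norm(disp, axis=1), 1e-9)
        pos += disp / length[:, None] * np.minimum(length, temp)[:, None]
        temp *= 0.98                               # cooling: the layout stabilizes
    return pos
```

Each iteration computes forces between all pairs of nodes, which illustrates why the approach becomes slow on very large graphs.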



Drawbacks of FDP algorithms

• slow (hard to use for very large graphs);

• more oriented toward aesthetics than toward an interpretable layout:
  • tendency: short edges with uniform lengths;
  • negative consequence: hubs are clustered in the center of the figure.

What the user usually prefers:

1 understanding the macroscopic structure of the graph, i.e., finding “communities” and their relations;

2 focusing on details for clusters that seem to be of interest.




Emphasizing “communities” in the layout

1 global approach: displaying all vertices while modifying the forces in such a way that the dense areas are emphasized: [Noack, 2007] (LinLog algorithm)

2 clustering the vertices and then using a simplified representation of the graph

3 combined approach: hierarchical representations where finer details are provided to the user [Auber et al., 2003, Auber and Jourdan, 2005, Seifi et al., 2010]





2 clustering the vertices and then using a simplified representation of the graph [Herman et al., 2000]

• partition the nodes into clusters V1, . . . , VC;

• display the clustered graph: nodes V1, . . . , VC (surface proportional to |Vj|) and edges with width proportional to $\sum_{x_k \in V_i,\, x_{k'} \in V_j} W_{kk'}$

Main issue: modifying FDP so that it can display nodes with different sizes.
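The quantities needed to draw the clustered graph (cluster sizes and aggregated edge weights) can be computed as in the sketch below. The function name `clustered_graph` and the encoding of the partition as an integer label vector are assumptions made for the example.

```python
import numpy as np

def clustered_graph(W, labels):
    """Return (sizes, agg): |V_j| and the summed weights between clusters."""
    labels = np.asarray(labels)
    n, C = W.shape[0], labels.max() + 1
    M = np.zeros((n, C))
    M[np.arange(n), labels] = 1.0     # one-hot membership matrix
    sizes = M.sum(axis=0)             # |V_j|, drives the node surface
    agg = M.T @ W @ M                 # agg[i, j] = sum of W over V_i x V_j
    np.fill_diagonal(agg, 0.0)        # within-cluster weights are not drawn
    return sizes, agg

# Two triangles {0,1,2} and {3,4,5} joined by the single edge 2-3.
W = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    W[u, v] = W[v, u] = 1.0
sizes, agg = clustered_graph(W, [0, 0, 0, 1, 1, 1])
```

Here both clusters have size 3 and the edge between them has width proportional to 1.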






2 clustering the vertices and then using a simplified representation of the graph: alternative approach, displaying while clustering, as in Self-Organizing Maps [Boulet et al., 2008], [Rossi and Villa-Vialaneix, 2010] and [Olteanu et al., 2013]








Outline of this talk

• self-organizing maps based on kernels and dissimilarities;

• modularity based representations:
  • combined with a map;
  • used hierarchically.


Self-organizing maps approaches

Outline

1 Self-organizing maps approaches

2 Modularity based approaches

3 Soft modularity

4 Hierarchical clustering and visualization



Basic ideas about SOM

Project the graph on a square grid (each square of the grid is a cluster) such that:

• the nodes in a same cluster are highly connected;

• the nodes in two close clusters are also connected (but less so);

• the nodes in two distant clusters are (almost) not connected.




Basics on Self-Organizing Maps (for multidimensional data)

• the map is made of neurons (visually symbolized by, e.g., rectangles), 1, . . . , M, with which prototypes $p_j$ are associated (a prototype is a “representative” of the neuron in the original dataset);

• the map is equipped with a neighborhood relationship, i.e., a “distance” (actually a dissimilarity) between neurons, D;

• goal: find the best mapping f(xi) ∈ {1, . . . , M} of the data xi to the different neurons by minimizing the energy

$$E = \sum_{i=1}^{n} \sum_{j=1}^{M} h\big(D(f(x_i), j)\big)\, \|x_i - p_j\|^2,$$

i.e., each observation is assigned to a neuron so that:

• the neuron’s prototype is “close” to the observation;

• the neighboring prototypes are also “close” to the observation;

• distant prototypes are “distant” from the observation

(topology preservation).





SOM, dissimilarity SOM and kernel SOM (batch)

Original SOM algorithm (batch): $x_1, \ldots, x_n \in \mathbb{R}^d$

Initialization: randomly set $p_1^0, \ldots, p_M^0$ in $\mathbb{R}^d$
for $l = 1 \to L$ do
  for all $i = 1 \to n$ do (Assignment)
    $f^l(x_i) \leftarrow \arg\min_{j=1,\ldots,M} \|x_i - p_j^{l-1}\|_{\mathbb{R}^d}$
  end for
  for all $j = 1 \to M$ do (Representation)
    $p_j^l \leftarrow \arg\min_{p \in \mathbb{R}^d} \sum_{i=1}^{n} h^l\big(D(f^l(x_i), j)\big)\, \|x_i - p\|^2_{\mathbb{R}^d}$
  end for
end for

Problems with graphs: the xi are nodes, so 1/ how to define the prototypes? and 2/ which distance to use between nodes?

[Villa and Rossi, 2007, Boulet et al., 2008]
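For reference, the batch SOM for data in R^d can be sketched as below. The representation step uses the closed form of the weighted least-squares problem (a weighted mean). The Gaussian neighborhood function, the shrinking radius `sigma`, the initialization from data points and the name `batch_som` are assumptions of this sketch, not choices stated on the slides.

```python
import numpy as np

def batch_som(X, grid_shape=(3, 3), n_iter=20, seed=0):
    """Batch SOM on a rectangular grid; returns (assignments, prototypes)."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    rows, cols = grid_shape
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    M = len(coords)
    # D[j, j'] = distance between neurons j and j' on the grid
    D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    # assumption: prototypes start at randomly chosen observations
    protos = X[rng.choice(n, size=M, replace=n < M)].astype(float)
    sigma0, sigma_end = max(rows, cols) / 2.0, 0.5
    for l in range(n_iter):
        sigma = sigma0 * (sigma_end / sigma0) ** (l / max(n_iter - 1, 1))
        # assignment: closest prototype
        d2 = ((X[:, None, :] - protos[None, :, :]) ** 2).sum(axis=2)
        f = d2.argmin(axis=1)
        # representation: H[i, j] = h(D(f(x_i), j)), then weighted means
        H = np.exp(-D[f] ** 2 / (2.0 * sigma ** 2))
        protos = (H.T @ X) / H.sum(axis=0)[:, None]
    d2 = ((X[:, None, :] - protos[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), protos
```

The weighted mean in the representation step is exactly what is unavailable for graph nodes, which motivates the dissimilarity and kernel variants.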




Dissimilarity SOM (batch): $x_i \in G$ defined by a dissimilarity relation $\delta(x_i, x_j)$

Initialization: randomly set $p_1^0, \ldots, p_M^0$ in $(x_i)_i$
for $l = 1 \to L$ do
  for all $i = 1 \to n$ do (Assignment)
    $f^l(x_i) \leftarrow \arg\min_{j=1,\ldots,M} \delta(x_i, p_j^{l-1})$
  end for
  for all $j = 1 \to M$ do (Representation)
    $p_j^l \leftarrow \arg\min_{p \in (x_i)_i} \sum_{i=1}^{n} h^l\big(D(f^l(x_i), j)\big)\, \delta(x_i, p)$
  end for
end for

[Kohonen and Somervuo, 1998, Kohonen and Somervuo, 2002]





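Dissimilarity SOM (batch): $x_i \in G$ defined by a dissimilarity relation $\delta(x_i, x_j)$

Initialization: randomly set $p_j^0 \leftarrow \sum_{i=1}^{n} \gamma_{ji}^0 x_i$ (symbolic)
for $l = 1 \to L$ do
  for all $i = 1 \to n$ do (Assignment)
    $f^l(x_i) \leftarrow \arg\min_{j=1,\ldots,M} \delta^2(x_i, p_j^{l-1})$, with $\delta^2(x_i, p_j^{l-1}) = \big(\Delta \gamma_j^{l-1}\big)_i - \frac{1}{2} (\gamma_j^{l-1})^T \Delta \gamma_j^{l-1}$, where $\Delta = (\delta(x_k, x_{k'}))_{k,k'}$
  end for
  for all $j = 1 \to M$ do (Representation)
    $\gamma_j^l \leftarrow \arg\min_{\gamma \in \mathbb{R}^n} \sum_{i=1}^{n} h^l\big(D(f^l(x_i), j)\big)\, \delta^2\big(x_i, \sum_{k=1}^{n} \gamma_k x_k\big)$
  end for
end for

[Rossi et al., 2007]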





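Kernel SOM (batch): $x_i \in G$ defined by a kernel relation $K(x_i, x_j)$ ⇒ $\exists\, \phi : G \to (H, \langle \cdot, \cdot \rangle_H)$ such that $K(x, x') = \langle \phi(x), \phi(x') \rangle_H$

Initialization: randomly set $p_j^0 \leftarrow \sum_{i=1}^{n} \gamma_{ji}^0 \phi(x_i)$
for $l = 1 \to L$ do
  for all $i = 1 \to n$ do (Assignment)
    $f^l(x_i) \leftarrow \arg\min_{j=1,\ldots,M} \|\phi(x_i) - p_j^{l-1}\|_H$, where $\|\phi(x_i) - p_j^{l-1}\|_H^2 = K(x_i, x_i) + \sum_{k,k'=1}^{n} \gamma_{jk}^{l-1} \gamma_{jk'}^{l-1} K(x_k, x_{k'}) - 2 \sum_{k=1}^{n} \gamma_{jk}^{l-1} K(x_i, x_k)$ (the term $K(x_i, x_i)$ does not depend on j and can be dropped in the arg min)
  end for
  for all $j = 1 \to M$ do (Representation)
    $\gamma_j^l \leftarrow \arg\min_{\gamma \in \mathbb{R}^n} \sum_{i=1}^{n} h^l\big(D(f^l(x_i), j)\big)\, \big\|\phi(x_i) - \sum_{k=1}^{n} \gamma_k \phi(x_k)\big\|_H^2$
  end for
end for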
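Both steps of the kernel SOM only require the kernel matrix, as the following sketch illustrates. The representation step uses the closed-form minimizer (a weighted mean in the feature space). The Gaussian neighborhood function and the names `kernel_som_distances` and `kernel_som_update` are assumptions of this example.

```python
import numpy as np

def kernel_som_distances(K, Gamma):
    """Squared distances ||phi(x_i) - p_j||_H^2, with p_j = sum_k Gamma[j, k] phi(x_k).

    K is the (n, n) kernel matrix, Gamma the (M, n) coefficients; output is (n, M).
    """
    cross = K @ Gamma.T                                # sum_k gamma_jk K(x_i, x_k)
    quad = np.einsum("jk,kl,jl->j", Gamma, K, Gamma)   # gamma_j^T K gamma_j
    return np.diag(K)[:, None] + quad[None, :] - 2.0 * cross

def kernel_som_update(f, D, sigma):
    """Representation step: gamma_jk = h(D(f(x_k), j)) / sum_i h(D(f(x_i), j))."""
    H = np.exp(-D[f] ** 2 / (2.0 * sigma ** 2))        # H[i, j] = h(D(f(x_i), j))
    return (H / H.sum(axis=0)).T                       # (M, n), rows sum to one
```

With a linear kernel K = X Xᵀ these distances coincide with the squared Euclidean distances to the prototypes Gamma @ X, which gives a convenient sanity check.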





Dissimilarity SOM (stochastic)

(Online relational SOM) [Olteanu et al., 2013]

Initialization: randomly set $\gamma_{ji}^0$ in $\mathbb{R}$
for $l = 1 \to L$ do
  Randomly choose an input $x_i$
  Assignment: $f^l(x_i) \leftarrow \arg\min_{j=1,\ldots,M} \big(\gamma_j^{l-1} \Delta\big)_i - \frac{1}{2} \gamma_j^{l-1} \Delta (\gamma_j^{l-1})^T$
  for all $j = 1 \to M$ do (Update of the prototypes)
    $\gamma_j^l \leftarrow \gamma_j^{l-1} + \alpha^l h^l\big(D(f^l(x_i), j)\big) \big(1_i - \gamma_j^{l-1}\big)$
  end for
end for

where $1_i$ is a vector with a single nonzero coefficient, equal to one, at the ith position.
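A compact sketch of the online relational SOM follows. Several details are assumptions made for the example and are not taken from the slides: the rows of the coefficient matrix are initialized as convex weights, the neighborhood function is Gaussian, and the step size `alpha` and radius `sigma` decrease linearly.

```python
import numpy as np

def online_relational_som(Delta, grid_shape=(3, 3), n_iter=500, seed=0):
    """Online relational SOM on an (n, n) dissimilarity matrix Delta."""
    rng = np.random.default_rng(seed)
    n = Delta.shape[0]
    rows, cols = grid_shape
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    M = len(coords)
    Gamma = rng.random((M, n))
    Gamma /= Gamma.sum(axis=1, keepdims=True)       # assumption: convex weights
    for l in range(n_iter):
        i = rng.integers(n)                         # randomly chosen input x_i
        # (gamma_j Delta)_i - 1/2 gamma_j Delta gamma_j^T for every neuron j
        dist = Gamma @ Delta[:, i] - 0.5 * np.einsum("jk,kl,jl->j", Gamma, Delta, Gamma)
        winner = dist.argmin()
        alpha = 0.5 * (1.0 - l / n_iter) + 0.01     # decreasing step size
        sigma = 1.5 - (l / n_iter)                  # decreasing radius (1.5 -> 0.5)
        h = np.exp(-D[winner] ** 2 / (2.0 * sigma ** 2))
        one_i = np.zeros(n)
        one_i[i] = 1.0
        Gamma += alpha * h[:, None] * (one_i[None, :] - Gamma)
    return Gamma
```

Since each update is a convex combination of the previous coefficients and the indicator of x_i, the rows of `Gamma` keep summing to one.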




Which dissimilarities/kernels for graphs?

Laplacian [Kondor and Lafferty, 2002]

For a graph with vertices V = {x1, . . . , xn} and weights $(w_{i,j})_{i,j=1,\ldots,n}$ (positive, symmetric), the Laplacian is $L = (L_{i,j})_{i,j=1,\ldots,n}$ where

$$L_{i,j} = \begin{cases} -w_{i,j} & \text{if } i \neq j \\ d_i = \sum_{j \neq i} w_{i,j} & \text{if } i = j \end{cases}$$

1 Diffusion matrix [Kondor and Lafferty, 2002]: for β > 0, $K^\beta = e^{-\beta L} = \sum_{k=0}^{+\infty} \frac{(-\beta L)^k}{k!}$, the heat kernel (or diffusion kernel);

2 Generalized inverse of the Laplacian [Fouss et al., 2007]: $K = L^+$;

3 Dissimilarity: length of the shortest path between two nodes.
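The first two kernels can be computed directly from the weight matrix, as in this sketch. The function names are illustrative; the heat kernel is obtained from the eigendecomposition of the symmetric Laplacian rather than from the series.

```python
import numpy as np

def laplacian(W):
    """Graph Laplacian L = diag(d) - W for a symmetric weight matrix W."""
    return np.diag(W.sum(axis=1)) - W

def heat_kernel(W, beta):
    """K^beta = exp(-beta L), via L = U diag(lambda) U^T."""
    vals, vecs = np.linalg.eigh(laplacian(W))
    return (vecs * np.exp(-beta * vals)) @ vecs.T

def laplacian_pinv(W):
    """Generalized (Moore-Penrose) inverse L^+ of the Laplacian."""
    return np.linalg.pinv(laplacian(W))

# Path graph on three vertices: 0 - 1 - 2.
W = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
K = heat_kernel(W, beta=0.5)
```

Because the rows of L sum to zero, the rows of the heat kernel sum to one, which is a quick check on an implementation.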



A first example: a medieval social network

Example from [Boulet et al., 2008], http://graphcomp.univ-tlse2.fr/

In the “Archives départementales du Lot” (Cahors, France), a large corpus of 5000 transactions (mostly land charters)

• coming from 4 “seigneuries” (about 25 small villages) in the South West of France;

• established between 1240 and 1520 (just before and after the Hundred Years’ War).


Simplification of this network by kernel SOM

nodes: individuals (≈ 600) named in the transactions, restricted to transactions established before the Hundred Years’ War (HYW); edges: two individuals are linked when they are named in a common transaction or have a common lord

Kernel SOM with heat kernel




A brief comparison with spectral clustering

Number of clusters: 35 50

Maximum size of the clusters: 255 268

Modularity: 0.597 0.420



A brief comparison with spectral clustering

Number of clusters: 35 29

Maximum size of the clusters: 255 325

Modularity: 0.597 0.433



Online relational SOM (faster)

Description:

• nodes: 105 American political books;

• edges weighted by the number of co-purchases of the two books on the internet (Amazon.com).

FDP representation




Modularity based approaches







Modularity [Newman and Girvan, 2004]

Popular quality measure for graph clustering: a partition of the vertices into C clusters, $(C_k)_{k=1,\ldots,C}$, has modularity

$$Q(C) = \frac{1}{2m} \sum_{k=1}^{C} \sum_{i,j \in C_k} (W_{ij} - P_{ij})$$

where the $P_{ij}$ are weights corresponding to a “null model” in which the weights only depend on the properties of the nodes and not on the clusters they belong to.

More precisely,

$$P_{ij} = \frac{d_i d_j}{2m}$$

where $d_i = \sum_{j \neq i} W_{ij}$ is the degree of vertex $x_i$ and $m = \frac{1}{2} \sum_i d_i$ is the total edge weight.

A “good” clustering should maximize Q.
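As an illustration, the modularity of a given partition can be evaluated in a few lines. The sketch assumes that the degree is the full row sum of W and that 2m is the sum of the degrees; the function name and the label-vector encoding are choices made for the example.

```python
import numpy as np

def modularity(W, labels):
    """Modularity Q of the partition encoded by integer labels."""
    labels = np.asarray(labels)
    d = W.sum(axis=1)                           # degrees d_i
    two_m = d.sum()                             # 2m = sum of the degrees
    B = W - np.outer(d, d) / two_m              # W_ij - P_ij
    same = labels[:, None] == labels[None, :]   # True when i and j share a cluster
    return B[same].sum() / two_m

# Two triangles {0,1,2} and {3,4,5} joined by the single edge 2-3.
W = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    W[u, v] = W[v, u] = 1.0
q_split = modularity(W, [0, 0, 0, 1, 1, 1])    # 5/14, about 0.357
q_single = modularity(W, [0, 0, 0, 0, 0, 0])   # 0: one cluster carries no information
```

Splitting the graph into its two triangles gives Q = 5/14, whereas putting all vertices in a single cluster gives Q = 0.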






Interpretation

• Q increases when (xi, xj) are in the same cluster and have a true weight Wij greater than the one expected in the null model, Pij;

• Q increases when (xi, xj) are in two different clusters and have a true weight Wij smaller than the one expected in the null model, Pij, because

$$Q(C) + \frac{1}{2m} \sum_{k \neq k'} \sum_{i \in C_k,\, j \in C_{k'}} (W_{ij} - P_{ij}) = 0;$$

• contrary to the minimization of the number of edges between clusters, modularity can help to separate nodes with high degrees into different clusters more easily.
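The identity used in the second point follows from summing W − P over all pairs of vertices. The short derivation below assumes that $d_i = \sum_j W_{ij}$ and that $2m = \sum_i d_i$:

```latex
\begin{align*}
\sum_{i,j} W_{ij} &= \sum_i d_i = 2m, \\
\sum_{i,j} P_{ij} &= \frac{1}{2m} \Big(\sum_i d_i\Big) \Big(\sum_j d_j\Big) = \frac{(2m)^2}{2m} = 2m, \\
0 = \frac{1}{2m} \sum_{i,j} (W_{ij} - P_{ij})
  &= Q(C) + \frac{1}{2m} \sum_{k \neq k'} \sum_{i \in C_k,\, j \in C_{k'}} (W_{ij} - P_{ij}).
\end{align*}
```

The last line splits the sum over all pairs into pairs within the same cluster, which give Q(C), and pairs in two different clusters.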



Drawing optimized clustering

Combine:

• high modularity, to ensure high intra-cluster density and low external connectivity;

• few edge crossings

by:

• classic solution: relying on a graph drawing algorithm after maximization of the modularity;

• extending the modularity to a criterion adapted to a prior structure (like a grid).






Soft modularity






Self Organizing Map principle

For data in $\mathbb{R}^d$, SOM minimizes (over the clustering and the prototypes $(p_j)$)

$$\sum_{j=1}^{M} \sum_{i=1}^{n} S_{f(x_i),j}\, \|x_i - p_j\|^2_{\mathbb{R}^d}$$

where $S_{kl}$ encodes the prior structure: close to 1 for close clusters and close to 0 for distant clusters.

This corresponds to a soft membership: $x_i$ belongs to $C_j$ with membership $S_{f(x_i),j}$.



Organized modularity [Rossi and Villa-Vialaneix, 2010]

Same idea: encode a prior structure via a matrix S. Maximize:

$$SQ = \frac{1}{2m} \sum_{i,j} S_{f(i) f(j)} (W_{ij} - P_{ij})$$

Hence:

• if a pair of vertices (xi, xj) is such that Wij > Pij, SQ increases with the closeness of f(xi) and f(xj) in the prior structure;

• if a pair of vertices (xi, xj) is such that Wij < Pij, SQ increases if f(xi) and f(xj) are distant in the prior structure.


Optimization

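The clustering is represented by an n × C assignment matrix M with $M_{ik} = \delta_{f(i)=k}$ (equal to 1 if f(i) = k and to 0 otherwise). The goal is then to maximize

$$SQ = F(M) = \frac{1}{2m} \sum_{i,j} \sum_{k,l} M_{ik} S_{kl} M_{jl} (W_{ij} - P_{ij})$$

The combinatorial problem is NP-complete ⇒ use of a deterministic annealing algorithm [Rossi and Villa-Vialaneix, 2010]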
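In matrix form, the criterion is a trace, which makes it cheap to evaluate for a candidate assignment. The sketch below only evaluates SQ and does not implement the optimization; the function name is illustrative, and it assumes that degrees are row sums of W and that 2m is their sum.

```python
import numpy as np

def organized_modularity(W, labels, S):
    """SQ = trace(S M^T B M) / (2m), with B = W - P and M the assignment matrix."""
    labels = np.asarray(labels)
    n, C = W.shape[0], S.shape[0]
    d = W.sum(axis=1)
    two_m = d.sum()
    B = W - np.outer(d, d) / two_m        # B_ij = W_ij - P_ij
    M = np.zeros((n, C))
    M[np.arange(n), labels] = 1.0         # M_ik = 1 if f(i) = k
    return np.trace(S @ M.T @ B @ M) / two_m

# Two triangles joined by one edge, placed on two neurons.
W = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    W[u, v] = W[v, u] = 1.0
labels = [0, 0, 0, 1, 1, 1]
sq_identity = organized_modularity(W, labels, np.eye(2))     # plain modularity
sq_soft = organized_modularity(W, labels, np.array([[1.0, 0.3], [0.3, 1.0]]))
```

With S equal to the identity the criterion reduces to the usual modularity, which is a useful consistency check.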


Comparison on the co-appearance network from “Les Misérables”

Co-appearance network from “Les Misérables” [Knuth, 1993]

[Figure: drawing of the network, with vertices labeled by character names (Myriel, Valjean, Fantine, Cosette, Javert, Marius, Gavroche, Enjolras, ...)]

77 nodes, density = 8.7%, transitivity = 49.9%


Methodology

Comparison of:

• kernel SOM with various kernels: heat kernel, generalized inverse of the Laplacian, modularity kernel (i.e., the positive part of W − P, which mimics the optimization of the modularity) and spectral SOM (based on the first M eigenvectors of the Laplacian);

• SQ optimization.

Parameters varied:

• size of the prior grid or number of clusters;

• for organized clusterings, type of neighborhood on the grid;

• for SOM, random or PCA initialization, and kernel parameter for the heat kernel.

Selection of the solutions: Pareto points according to modularity and number of edge crossings.
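Selecting the Pareto points amounts to keeping the solutions that no other solution beats on both criteria (higher modularity, fewer edge crossings). A minimal sketch, with a hypothetical function name and solutions encoded as (modularity, crossings) pairs:

```python
def pareto_points(solutions):
    """Indices of the non-dominated (modularity, crossings) pairs.

    A solution is dominated when another one has at least the same
    modularity and at most as many crossings, one of the two strictly.
    """
    front = []
    for a, (qa, ca) in enumerate(solutions):
        dominated = any(
            (qb >= qa and cb <= ca) and (qb > qa or cb < ca)
            for b, (qb, cb) in enumerate(solutions)
            if b != a
        )
        if not dominated:
            front.append(a)
    return front
```

The quadratic scan is sufficient here, since the number of candidate solutions (one per parameter setting) stays small.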


A brief comment on the kernel SOM solutions

[Figure: four scatter plots (panels: Generalized inverse, Heat kernel, Modularity kernel, Spectral SOM) showing the number of crossing edges (0 to 500) against the modularity (0.0 to 0.6) for prior grids of size 3, 4 and 5; Pareto points are marked in the Generalized inverse and Heat kernel panels]

Spectral SOM and Modularity kernel obtain poor results.

Analysis of the Pareto points for “Les Misérables”

Method            Number of clusters   Modularity   Nb of pairs of cut edges
Organized mod.    4² (7)               0.5638       1
Organized mod.    5² (7)               0.5652       3
                  3² (6)               0.5472       0
kernel SOM (HK)   5² (22)              0.5327       27
                  3² (8)               0.5276       2


Hierarchical clustering and visualization






Global overview

[Rossi and Villa-Vialaneix, 2011] Two combined steps:

• find a clustering hierarchy (by repeating modularity optimization within clusters) and test the significance of the partition at each step;

• display the different levels of the hierarchy with a modified force-directed algorithm.



A hierarchy of clusterings

Aim: mitigating the resolution limit of modularity (see [Fortunato, 2010]).

How to do so? Iterate the modularity optimization in each cluster. The modularity is optimized by a greedy algorithm with multi-level refinement similar to that of [Noack and Rotta, 2009].

Step 1




Step 2




Step 3




Stopping criterion

A clustering algorithm always provides a solution, relevant or not!

Significance of a node clustering:

1 Generate random graphs with the same degree distribution as the original graph;
  Simulation process: MCMC algorithm of [Roberts Jr., 2000], which permutes edges on random couples of pairs of connected nodes. After Q|E| permutations, the obtained graph is random for the uniform distribution on the set of graphs with the same degree distribution.

2 Optimize the modularity on these random graphs;

3 Find out the p-value of the observed modularity compared to the empirical distribution on random graphs;

4 If the new clustering is not found to be significant, the algorithm is stopped.
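Steps 1 and 3 can be sketched as follows. The code is a generic degree-preserving edge-switching scheme written for illustration and is not claimed to reproduce the algorithm of [Roberts Jr., 2000] exactly; the function names and the convention used for the empirical p-value are assumptions. The modularity optimization of step 2 is left out.

```python
import random

def degree_preserving_swaps(edges, n_swaps, seed=0):
    """Randomize a simple undirected graph while keeping every degree fixed."""
    rng = random.Random(seed)
    edges = [tuple(sorted(e)) for e in edges]
    present = set(edges)
    for _ in range(n_swaps):
        a, b = rng.sample(range(len(edges)), 2)   # two distinct edges
        (u, v), (x, y) = edges[a], edges[b]
        if rng.random() < 0.5:                    # random orientation of the second edge
            x, y = y, x
        # proposed switch: (u, v), (x, y) -> (u, y), (x, v)
        new1, new2 = tuple(sorted((u, y))), tuple(sorted((x, v)))
        if u == y or x == v or new1 in present or new2 in present:
            continue                              # would create a loop or a multi-edge
        present -= {edges[a], edges[b]}
        present |= {new1, new2}
        edges[a], edges[b] = new1, new2
    return edges

def empirical_p_value(observed, simulated):
    """Share of random graphs whose optimized modularity reaches the observed one."""
    return (1 + sum(s >= observed for s in simulated)) / (1 + len(simulated))
```

In the procedure above, `simulated` would hold the modularities obtained by optimizing on each randomized graph, and the hierarchy stops expanding when the resulting p-value is not small enough.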



Display a clustering hierarchy

Basics

• start from the first step of the clustering;

• expand the clusters by order of minimal decrease in modularity.

Issues

1 taking into account the size of the clusters: see [Tunkelang, 1999] for a modification of the FDP algorithm for nodes that have different sizes;

2 estimating in advance the space needed to represent a cluster when it is expanded.



Another medieval network...

From the same corpus of medieval documents:

• nodes: transactions and active individuals. 3 918 individuals and 6 455 transactions (total: 10 373 vertices);

• edges model the active involvement of an individual in a transaction.

Modularity optimization: 48 clusters having from 10 to 740 nodes.

Hierarchy: 4 levels, 89 classes at the last level.








Conclusion

Mining a graph from a clustering

• clustering can be used to provide a simplified representation of the network and to help the user understand its macroscopic structure;

• optimizing the modularity seems to provide better results than approaches based on the Laplacian (it helps to separate hubs and thus results in more balanced clusters);

• the approaches presented here are almost fully automated: solutions to tune the parameters are provided in the corresponding articles.

Perspectives: improve the representation of the hierarchy, incorporate additional information on nodes and edges in the clustering...

Thank you for your attention...



References

Auber, D., Chiricota, Y., Jourdan, F., and Melançon, G. (2003). Multiscale visualization of small world networks. In INFOVIS’03.

Auber, D. and Jourdan, F. (2005). Interactive refinement of multi-scale network clusterings. In International Conference on Information Visualisation, pages 703–709, Los Alamitos, CA, USA. IEEE Computer Society.

Boulet, R., Jouve, B., Rossi, F., and Villa, N. (2008). Batch kernel SOM and related Laplacian methods for social network analysis. Neurocomputing, 71(7-9):1257–1273.

Fortunato, S. (2010). Community detection in graphs. Physics Reports, 486:75–174.

Fouss, F., Pirotte, A., Renders, J., and Saerens, M. (2007). Random-walk computation of similarities between nodes of a graph, with application to collaborative recommendation. IEEE Trans Knowl Data En, 19(3):355–369.

Fruchterman, T. and Reingold, E. (1991). Graph drawing by force-directed placement. Software Pract Exper, 21:1129–1164.

Herman, I., Melançon, G., and Scott Marshall, M. (2000). Graph visualization and navigation in information visualisation. 6(1):24–43.

Knuth, D. (1993). The Stanford GraphBase: A Platform for Combinatorial Computing. Addison-Wesley, Reading, MA.

Kohonen, T. and Somervuo, P. (1998). Self-Organizing maps of symbol strings. Neurocomputing, 21:19–30.

Kohonen, T. and Somervuo, P. (2002). How to make large self-organizing maps for nonvectorial data. Neural Networks, 15(8):945–952.

Kondor, R. and Lafferty, J. (2002). Diffusion kernels on graphs and other discrete structures. In Proceedings of the 19th International Conference on Machine Learning, pages 315–322.

Newman, M. and Girvan, M. (2004). Finding and evaluating community structure in networks. Phys Rev E, 69:026113.

Noack, A. (2007). Energy models for graph clustering. J Graph Algorithms Appl, 11(2):453–480.

Noack, A. and Rotta, R. (2009). Multi-level algorithms for modularity clustering. In SEA ’09: Proceedings of the 8th International Symposium on Experimental Algorithms, pages 257–268, Berlin, Heidelberg. Springer-Verlag.

Olteanu, M., Villa-Vialaneix, N., and Cottrell, M. (2013). On-line relational SOM for dissimilarity data. In Estevez, P., Principe, J., Zegers, P., and Barreto, G., editors, Advances in Self-Organizing Maps (Proceedings of WSOM 2012), volume 198 of AISC, pages 13–22, Springer Verlag, Berlin, Heidelberg. To appear.

Roberts Jr., J. M. (2000). Simple methods for simulating sociomatrices with given marginal totals. Social Networks, 22(3):273–283.

Rossi, F., Hasenfuss, A., and Hammer, B. (2007). Accelerating relational clustering algorithms with sparse prototype representation. In 6th International Workshop on Self-Organizing Maps (WSOM), Bielefeld, Germany. Neuroinformatics Group, Bielefeld University.

Rossi, F. and Villa-Vialaneix, N. (2010). Optimizing an organized modularity measure for topographic graph clustering: a deterministic annealing approach. Neurocomputing, 73(7-9):1142–1163.

Rossi, F. and Villa-Vialaneix, N. (2011). Représentation d’un grand réseau à partir d’une classification hiérarchique de ses sommets. Journal de la Société Française de Statistique, 152(3):34–65.

Seifi, M., Guillaume, J., Latapy, M., and Le Grand, B. (2010). Visualisation interactive multi-échelle des grands graphes : application à un réseau de blogs. In Atelier EGC 2010, Visualisation et Extraction de Connaissances, Hammamet, Tunisie.

Tunkelang, D. (1999). A Numerical Optimization Approach to General Graph Drawing. PhD thesis, School of Computer Science, Carnegie Mellon University. CMU-CS-98-189.

Villa, N. and Rossi, F. (2007). A comparison between dissimilarity SOM and kernel SOM for clustering the vertices of a graph. In 6th International Workshop on Self-Organizing Maps (WSOM), Bielefeld, Germany. Neuroinformatics Group, Bielefeld University.

