Graph visualization by organized clustering:
application to social and biological networks
Nathalie Villa-Vialaneix (http://www.nathalievilla.org) & Fabrice Rossi
IUT de Carcassonne (UPVD) & Institut de Mathématiques de Toulouse
Challenging Problems in Statistical Learning, Paris, January 28-29, 2010
Network data
Many sources of large networks:
- social networks (emails, collaborations, phone calls, etc.)
- technological networks (Internet, etc.)
- biological networks (metabolic pathways, gene regulation, gene interactions, etc.)
Scope of the talk: a graph $G$ with vertices $\{x_1, x_2, \ldots, x_n\}$, undirected and weighted with weights $W$ such that $w_{ii} = 0$ (no loop) and $w_{ij} = w_{ji} \geq 0$, where $w_{ij} > 0 \Leftrightarrow$ there is an edge between nodes $x_i$ and $x_j$.
[Figure: an example of a large network]
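As a concrete illustration (not from the slides), such a graph can be stored as a symmetric weight matrix with a zero diagonal; all the sketches below operate on this representation:

```python
import numpy as np

# Weight matrix of a small undirected weighted graph on 4 nodes:
# w[i, i] = 0 (no loop) and w[i, j] = w[j, i] >= 0.
W = np.array([
    [0.0, 1.0, 0.5, 0.0],
    [1.0, 0.0, 2.0, 0.0],
    [0.5, 2.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])
assert np.allclose(W, W.T) and np.all(np.diag(W) == 0)
```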
Network data analysis
Over one hundred vertices, pure manual analysis is infeasible
⇒ need for automatic support for exploratory analysis:
- node/edge measures (e.g., degree distribution, betweenness, ...)
- visualization (e.g., force-directed algorithms)
- node clustering (community extraction)
- supervised analysis
- ...
Visualization from clustering
Large graph visualization is difficult.
A standard solution: simplify the graph prior to drawing. More precisely:
1. identify dense clusters of nodes
2. draw the corresponding graph of clusters
[Newman and Girvan, 2004]
Drawing a clustered graph
Given a partition $(C_k)_{k=1,\ldots,C}$:
- represent each cluster by a glyph (e.g., a circle) with surface proportional to $|C_k|$
- draw a segment between the glyphs of $C_k$ and $C_l$ with thickness proportional to $\sum_{i \in C_k, j \in C_l} W_{ij}$
Two requirements:
- the simplification induced by the clustering has to be faithful: each cluster should be as dense as possible (i.e., $\sum_{i,j \in C_k} W_{ij}$ should be high compared to the other weights)
- the graph induced by the clustering has to be readable: edge crossings should be minimized (a sketch of this construction follows the list).
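A minimal sketch of this construction, assuming numpy, networkx, and matplotlib are available (the helper name, the spring layout, and the scaling constants are our illustrative choices, not the authors' method):

```python
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt

def draw_cluster_graph(W, labels):
    """Draw the graph of clusters: glyph area ~ |C_k|,
    segment thickness ~ total weight between clusters."""
    clusters = np.unique(labels)
    C = len(clusters)
    sizes = np.array([np.sum(labels == k) for k in clusters])
    # Total weight between each pair of clusters.
    T = np.zeros((C, C))
    for a in range(C):
        for b in range(C):
            T[a, b] = W[np.ix_(labels == clusters[a], labels == clusters[b])].sum()
    G = nx.Graph()
    G.add_nodes_from(range(C))
    for a in range(C):
        for b in range(a + 1, C):
            if T[a, b] > 0:
                G.add_edge(a, b, weight=T[a, b])
    pos = nx.spring_layout(G, seed=0)
    widths = [G[u][v]["weight"] for u, v in G.edges()]
    wmax = max(widths) if widths else 1.0
    nx.draw(G, pos, node_size=300 * sizes,
            width=[3.0 * w / wmax for w in widths], with_labels=True)
    plt.show()
```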
Two approaches based on organizing maps
Main idea: organize the clustering on a grid to constrain the clusters' positions and to represent the most connected clusters close to each other.
1. Kernel SOM: generalization of Self-Organizing Maps to graphs by the use of a kernel
2. Organized modularity optimization: extension of a well-known clustering quality measure for graphs to organized clustering
Basic ideas about SOM
Project the graph on a square grid (each square of the grid is a cluster) such that:
- the nodes in a same cluster are highly connected
- the nodes in two close clusters are also (less strongly) connected
- the nodes in two distant clusters are (almost) not connected
SOM and kernel SOM
Original SOM algorithm (batch), for $x_1, \ldots, x_n \in \mathbb{R}^d$:
1. Initialization: initialize $p_1^0, \ldots, p_M^0$ randomly in $\mathbb{R}^d$
2. For $l = 1, \ldots, L$ do
3. Assignment: for all $i = 1, \ldots, n$, $f^l(x_i) = \arg\min_{j=1,\ldots,M} \|x_i - p_j^{l-1}\|_{\mathbb{R}^d}$
4. Representation: for all $j = 1, \ldots, M$, $p_j^l = \arg\min_{p \in \mathbb{R}^d} \sum_{i=1}^{n} h^l(f^l(x_i), j) \|x_i - p\|^2_{\mathbb{R}^d}$
Online versions by [Lau et al., 2006]
SOM and kernel SOM
Kernel SOM (batch): the $x_i \in G$ are defined through a kernel $K(x_i, x_j)$
$\Rightarrow \exists\, \phi: G \to (\mathcal{H}, \langle \cdot, \cdot \rangle_{\mathcal{H}})$ such that $K(x, x') = \langle \phi(x), \phi(x') \rangle_{\mathcal{H}}$
1. Initialization: initialize randomly $p_j^0 = \sum_{i=1}^{n} \gamma^0_{ji} \phi(x_i)$
2. For $l = 1, \ldots, L$ do
3. Assignment: for all $i = 1, \ldots, n$, $f^l(x_i) = \arg\min_{j=1,\ldots,M} \|\phi(x_i) - p_j^{l-1}\|_{\mathcal{H}}$
4. Representation: for all $j = 1, \ldots, M$, $\gamma_j^l = \arg\min_{\gamma \in \mathbb{R}^n} \sum_{i=1}^{n} h^l(f^l(x_i), j) \|\phi(x_i) - \sum_{k=1}^{n} \gamma_k \phi(x_k)\|^2_{\mathcal{H}}$
[Villa and Rossi, 2007, Hammer and Hasenfuss, 2007]
Online versions by [Lau et al., 2006]
SOM and kernel SOM
Using the kernel trick, both steps of kernel SOM can be computed from $K$ alone:
3. Assignment: for all $i = 1, \ldots, n$, $f^l(x_i) = \arg\min_{j=1,\ldots,M} \sum_{k,k'=1}^{n} \gamma^{l-1}_{jk} \gamma^{l-1}_{jk'} K(x_k, x_{k'}) - 2 \sum_{k=1}^{n} \gamma^{l-1}_{jk} K(x_i, x_k)$
4. Representation: for all $j = 1, \ldots, M$, $\gamma^l_{jk} = \frac{h^l(f^l(x_k), j)}{\sum_{k'=1}^{n} h^l(f^l(x_{k'}), j)}$
[Villa and Rossi, 2007, Hammer and Hasenfuss, 2007]
Online versions by [Lau et al., 2006]
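A compact batch kernel SOM following the two update rules above; this is a minimal numpy sketch (the grid shape, the Gaussian neighborhood function $h^l$, and the shrinking-radius schedule are our illustrative choices, not prescribed by the slides):

```python
import numpy as np

def batch_kernel_som(K, grid_w, grid_h, n_iter=30, seed=0):
    """Batch kernel SOM on a grid_w x grid_h grid.
    K: (n, n) kernel matrix; returns the cluster index f(i) of each node."""
    rng = np.random.default_rng(seed)
    n, M = K.shape[0], grid_w * grid_h
    # Grid coordinates of the M units, used by the neighborhood function.
    coords = np.array([(a, b) for a in range(grid_w) for b in range(grid_h)])
    d_grid = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    # Random initial prototype coefficients gamma (rows sum to 1).
    gamma = rng.random((M, n))
    gamma /= gamma.sum(axis=1, keepdims=True)
    for l in range(n_iter):
        # Assignment: ||phi(x_i) - p_j||^2 up to the constant K(x_i, x_i),
        # which does not change the argmin over j.
        dist = np.einsum("jk,jl,kl->j", gamma, gamma, K)[None, :] - 2 * K @ gamma.T
        f = dist.argmin(axis=1)
        # Gaussian neighborhood with a shrinking radius (illustrative schedule).
        radius = max(grid_w, grid_h) * (1 - l / n_iter) + 0.5
        h = np.exp(-(d_grid[f] ** 2) / (2 * radius ** 2))  # h[i, j] = h(f(x_i), j)
        # Representation: gamma_{jk} = h(f(x_k), j) / sum_k' h(f(x_k'), j).
        gamma = (h / h.sum(axis=0, keepdims=True)).T
    return f
```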
Which kernels?
Laplacian: $L = (L_{ij})_{i,j=1,\ldots,n}$ where
$$L_{ij} = \begin{cases} -w_{ij} & \text{if } i \neq j \\ d_i = \sum_{j \neq i} w_{ij} & \text{if } i = j \end{cases}$$
Regularized versions such as:
- Heat kernel [Kondor and Lafferty, 2002, Smola and Kondor, 2003]: for $\beta > 0$, $K^\beta = e^{-\beta L} = \sum_{k=0}^{+\infty} \frac{(-\beta L)^k}{k!}$
- Generalized inverse of the Laplacian [Fouss et al., 2007]: $K = L^+$
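Both kernels are straightforward to compute for moderate $n$; a sketch using numpy and scipy (the helper names are ours):

```python
import numpy as np
from scipy.linalg import expm

def laplacian(W):
    """Graph Laplacian L = D - W from a symmetric weight matrix W."""
    return np.diag(W.sum(axis=1)) - W

def heat_kernel(W, beta):
    """Heat kernel K_beta = exp(-beta * L)."""
    return expm(-beta * laplacian(W))

def pinv_laplacian_kernel(W):
    """Generalized (Moore-Penrose) inverse of the Laplacian, K = L+."""
    return np.linalg.pinv(laplacian(W))
```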
A first example: a medieval social network
Example from [Boulet et al., 2008]: in Cahors (Lot, France), a large corpus of 5,000 agrarian contracts has been preserved, coming from 4 seigneuries (about 25 small villages) and established between 1240 and 1520 (just before and after the Hundred Years' War).
A network of relations between peasants is built from common citations in a given contract.
Graph of clusters: the communities are related to time and space, and the leading people are emphasized.
But: the biggest communities are still very complex.
Modularity [Newman and Girvan, 2004]
Popular quality measure for graph clustering: a partition of the vertices into $C$ clusters, $(C_k)_{k=1,\ldots,C}$, has modularity
$$Q(C) = \frac{1}{2m} \sum_{k=1}^{C} \sum_{i,j \in C_k} (W_{ij} - P_{ij})$$
where the $P_{ij}$ are weights corresponding to a "null model" in which the weights only depend on the node properties and not on the clusters the nodes belong to. More precisely,
$$P_{ij} = \frac{d_i d_j}{2m}$$
where $d_i = \sum_{j \neq i} W_{ij}$ is the degree of vertex $x_i$ and $m = \frac{1}{2} \sum_i d_i$ is the total weight of the graph.
A "good" clustering should maximize $Q$.
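A direct numpy implementation of this definition (a minimal sketch; labels is an array giving the cluster index of each node):

```python
import numpy as np

def modularity(W, labels):
    """Newman-Girvan modularity of a partition of a weighted graph."""
    d = W.sum(axis=1)               # degrees
    m = d.sum() / 2.0               # total weight
    P = np.outer(d, d) / (2.0 * m)  # null-model weights
    same = labels[:, None] == labels[None, :]
    return ((W - P)[same]).sum() / (2.0 * m)
```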
Interpretation
- $Q$ increases when $(x_i, x_j)$ are in a same cluster and their true weight $W_{ij}$ is greater than the one expected under the null model, $P_{ij}$
- $Q$ increases when $(x_i, x_j)$ are in two different clusters and their true weight $W_{ij}$ is smaller than the one expected under the null model, $P_{ij}$, because
$$Q(C) + \frac{1}{2m} \sum_{k \neq k'} \sum_{i \in C_k,\, j \in C_{k'}} (W_{ij} - P_{ij}) = 0$$
- Contrary to the minimization of the number of edges between clusters, modularity can help to separate nodes with high degrees into different clusters more easily
Drawing optimized clustering
Combine:
- high modularity, to ensure high intra-cluster density and low external connectivity
- little edge crossing
by:
- the classic solution: relying on a graph drawing algorithm after maximization of the modularity
- or extending the modularity to a criterion adapted to a prior structure (like a grid)
Self Organizing Map principle
For data in $\mathbb{R}^d$, SOM minimizes (over the clustering and the prototypes $(p_k)$)
$$\sum_{k=1}^{C} \sum_{i=1}^{n} S_{f(x_i),k} \|x_i - p_k\|^2_{\mathbb{R}^d}$$
where:
- $(p_k)$ are the prototypes (one for each cluster of the grid) representing the clusters in the original space ($\mathbb{R}^d$)
- $f(x_i)$ is the cluster, on the grid, where $x_i$ is classified
- $S_{kl}$ encodes the prior structure: close to 1 for close clusters and close to 0 for distant clusters
This corresponds to a soft membership: $x_i$ belongs to $C_k$ with membership $S_{f(x_i),k}$.
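For reference, this energy in numpy form (a minimal sketch; X is the data matrix, P the prototype matrix, S the grid neighborhood matrix, and f the array of cluster assignments):

```python
import numpy as np

def som_energy(X, P, S, f):
    """SOM energy: sum_k sum_i S[f(i), k] * ||x_i - p_k||^2."""
    sq_dist = ((X[:, None, :] - P[None, :, :]) ** 2).sum(axis=-1)  # (n, C)
    return float((S[f] * sq_dist).sum())
```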
Organized modularity [Rossi and Villa-Vialaneix, 2010]
Same idea: encode a prior structure via a matrix $S$ and maximize
$$SQ = \frac{1}{2m} \sum_{i,j} S_{f(i)f(j)} (W_{ij} - P_{ij})$$
Hence:
- if a pair of vertices $(x_i, x_j)$ is such that $W_{ij} > P_{ij}$, $SQ$ increases with the closeness of $f(x_i)$ and $f(x_j)$ in the prior structure
- if a pair of vertices $(x_i, x_j)$ is such that $W_{ij} < P_{ij}$, $SQ$ increases if $f(x_i)$ and $f(x_j)$ are distant in the prior structure
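Compared to plain modularity, only the weighting by $S$ changes; a minimal numpy sketch (the function name is ours):

```python
import numpy as np

def organized_modularity(W, labels, S):
    """SQ = (1/2m) * sum_ij S[f(i), f(j)] * (W_ij - P_ij)."""
    d = W.sum(axis=1)
    m = d.sum() / 2.0
    P = np.outer(d, d) / (2.0 * m)
    return (S[np.ix_(labels, labels)] * (W - P)).sum() / (2.0 * m)
```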
Optimization
The clustering is represented by an $n \times C$ assignment matrix $M$ with $M_{ik} = \delta_{f(i)=k}$. The goal is then to maximize
$$SQ = F(M) = \frac{1}{2m} \sum_{i,j} \sum_{k,l} M_{ik} S_{kl} M_{jl} (W_{ij} - P_{ij})$$
This combinatorial problem is NP-complete ⇒ use of a deterministic annealing algorithm:
- given a temperature $\frac{1}{\beta}$, assume a Gibbs distribution on the solution space: $P(M) = \frac{1}{Z_P} e^{\beta F(M)}$
- compute the expectation $E(M)$ of $M$ with respect to $P$
- at the limit $\beta \to +\infty$, $E(M)$ converges to $M^*$, where $M^*$ realizes the maximum of $F(M)$
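In matrix form, $F(M) = \sum_{k,l} S_{kl} (M^\top B M)_{kl}$ with $B = \frac{1}{2m}(W - P)$; a minimal numpy sketch of this score, which agrees with organized_modularity above on hard assignment matrices:

```python
import numpy as np

def F(M, W, S):
    """F(M) = sum_{i,j,k,l} M_ik S_kl M_jl B_ij, with B = (W - P) / (2m)."""
    d = W.sum(axis=1)
    m = d.sum() / 2.0
    B = (W - np.outer(d, d) / (2.0 * m)) / (2.0 * m)
    return float(np.sum(S * (M.T @ B @ M)))
```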
Mean field approximation
Problem: $Z_P = \sum_M e^{\beta F(M)}$ is hard to compute ($C^n$ values for $M$), except when the distribution factorizes (use of block calculations). But $SQ$ does not factorize!
Solution: approximate $P(M)$ by a distribution that factorizes:
- $P(M)$ is approximated by
$$R(M, E) = \frac{e^{\beta \sum_{i,k} M_{ik} E_{ik}}}{\sum_N e^{\beta \sum_{i,k} N_{ik} E_{ik}}}$$
Under the distribution $R(M, E)$, the $M_{ik}$ are independent across $i = 1, \ldots, n$.
- The mean field $E$ is tuned by minimizing the Kullback-Leibler divergence $KL(R \| P) = \sum_M R(M, E) \log \frac{R(M, E)}{P(M)}$ ⇒ mean field equations:
$$\frac{\partial E_R(F(M))}{\partial E_{jl}} = \sum_k \frac{\partial E_R(M_{jk})}{\partial E_{jl}} E_{jk}$$
- $E_{ik}$ and $E_R(M_{ik})$ are iteratively estimated by an EM-like algorithm; at the limit, $E_R(M_{ik})$ gives the probability of $x_i$ to belong to cluster $k$ for the optimal $SQ$
Deterministic annealing
Algorithm: for an increasing sequence $\beta_1, \beta_2, \ldots, \beta_L$,
1. Initialize $E_R(M)$ randomly in $[0,1]$ such that $\sum_k E_R(M_{ik}) = 1$
2. Repeat for $l = 1, \ldots, L$:
   2.1 Compute $E$: $E_{ik} = 2 \sum_{j \neq i} \sum_{k'} E_R(M_{jk'}) S_{kk'} B_{ji}$, where $B = \frac{1}{2m}(W - P)$
   2.2 Compute $E_R(M)$: $E_R(M_{ik}) = \frac{e^{\beta_l E_{ik}}}{\sum_{k'} e^{\beta_l E_{ik'}}}$
3. Threshold $E_R(M_{ik})$ into a clustering: $f(i) = \arg\max_{k=1,\ldots,C} E_R(M_{ik})$
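Putting the whole scheme together; a minimal numpy sketch of this algorithm (the β schedule is left to the caller, and a few inner iterations are run at each temperature so the mean field equations can settle, which is our choice rather than the slides'):

```python
import numpy as np

def organized_modularity_annealing(W, S, betas, inner=20, seed=0):
    """Deterministic annealing for organized modularity.
    W: (n, n) weights; S: (C, C) grid neighborhood matrix;
    betas: increasing inverse temperatures. Returns cluster labels."""
    rng = np.random.default_rng(seed)
    n, C = W.shape[0], S.shape[0]
    d = W.sum(axis=1)
    m = d.sum() / 2.0
    B = (W - np.outer(d, d) / (2.0 * m)) / (2.0 * m)  # B = (W - P) / 2m
    # Random initialization of E_R(M), rows summing to 1.
    ERM = rng.random((n, C))
    ERM /= ERM.sum(axis=1, keepdims=True)
    for beta in betas:
        for _ in range(inner):
            # Mean field: E_ik = 2 * sum_{j != i} sum_{k'} ERM[j,k'] S[k,k'] B[j,i]
            Bo = B - np.diag(np.diag(B))  # exclude the j = i terms
            E = 2.0 * Bo.T @ ERM @ S.T    # shape (n, C)
            # Softmax at inverse temperature beta (numerically stabilized).
            Z = beta * (E - E.max(axis=1, keepdims=True))
            ERM = np.exp(Z)
            ERM /= ERM.sum(axis=1, keepdims=True)
    # Threshold E_R(M) into a hard clustering.
    return ERM.argmax(axis=1)
```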
A toy example
A toy example [Zachary, 1977]: Zachary's karate club (friendship social network between the 34 members of a karate club at a US university in the 1970s).
[Figure: the karate club network, nodes numbered 1 to 34]
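As a usage sketch, this network ships with networkx, so the pipeline above can be tried on it directly (the Gaussian neighborhood matrix S on a 2×2 grid is an illustrative choice; modularity and organized_modularity_annealing are the helpers defined earlier):

```python
import numpy as np
import networkx as nx

# Zachary's karate club as a weight matrix (unweighted graph: 0/1 entries).
G = nx.karate_club_graph()
W = nx.to_numpy_array(G)

# 2x2 grid: S close to 1 for close clusters, close to 0 for distant ones.
coords = np.array([(a, b) for a in range(2) for b in range(2)])
D = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
S = np.exp(-D ** 2)

labels = organized_modularity_annealing(W, S, betas=np.geomspace(50, 500, 30))
print(modularity(W, labels))
```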
Optimization of organized modularity on "Karate"
For a choice of neighborhood relationship leading to 4 non-empty clusters on a square grid of size 2×2:
[Figure: evolution of the organized modularity (0.0 to 0.4) during the annealing scheme, for β ranging from 50 to 500]
[Figure: final classification and layout]
[Figure: probability of each node to be in a given class (one panel per cluster, 1 to 4), just after the first phase transition, with the final classification and layout]
[Figure: probability of each node to be in a given class (one panel per cluster, 1 to 4), just after the second phase transition, with the final classification and layout]
Comparisons on a toy example: "Karate"
Optimal solution obtained with SOM (various kernels were tested), compared to direct SQ optimization:
[Figure: final classification and layout for SOM (heat kernel) and for SQ optimization, side by side]
SOM (heat kernel): modularity = 0.4188
SQ optimization: modularity = 0.4198 (the true optimum)
The SQ optimization solution is consistent with the true division of the social network.