• Aucun résultat trouvé

Community detection in complex networks R. Lambiotte Department of Mathematics University of Namur

N/A
N/A
Protected

Academic year: 2022

Partager "Community detection in complex networks R. Lambiotte Department of Mathematics University of Namur"

Copied!
97
0
0

Texte intégral

(1)

Community detection in complex networks

R. Lambiotte

Department of Mathematics

University of Namur

(2)

1. Community detection

2. Overlapping/multi-scale communities.

Dynamics and communities.

3. Why communities?

(3)

Networks

Mathematical tool to describe systems composed of elements and of the relations between them (directed or not)

Internet

Transport networks Power grids

Protein interaction networks Food webs

Metabolic networks Social networks

The brain Etc.

Natural language to

describe complex systems Possibility to analyse

systems of a very different nature within a single

framework

Identification of universal properties among diverse kinds of systems =>

generic organisation

principles

(4)

Modular Networks

Many networks are inhomogeneous and are made of modules: many links within modules and a few links between different modules

(5)

Observed in social, biological and information networks

S. Fortunato, Community detection in graph, Physics Reports, vol. 486, pp. 75-174 (2010)

(6)

Networks have a hierarchical/multi-scale structure: modules within modules

Nested organization

Hierarchical Networks

(7)

Hierarchy = multi-scale structure:

modules within modules

Different definitions for hierarchy

Hierarchy = subordination

General Colonel

Captain Sergeant

(8)

Hierarchy = multi-scale structure:

modules within modules

Different definitions for hierarchy

Hierarchy of nodes with different degrees of “modularity” (clustering)

Hierarchical Organization of Modularity in Metabolic Networks

E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, A.-L. Barabási, Science (2002)

(9)

Is it possible to uncover the (multi-scale) modular organisation of networks in an automated fashion?

Given a graph, we look for an algorithm able to uncover its modules without specifying their number or their size (related to but different from classical partitioning problems, see later)

- find modules and only the modules: the method automatically finds the true modular organisation

- multi-scale modularity?

- scalability (millions of nodes)

Community detection

algorithm

(10)

Fun, pays my salary, allows me (and others) to publish in obscure math/physics journals.

What is the big deal? Applications? Any use?

Community detection

(11)

Why looking for modules?

Graphs help us to comprehend in a visual way its global organisation. This works extremely well when the graph is small but, as soon as the system is

made of hundreds or thousands of nodes, a brute force representation typically leads to a meaningless cloud of nodes.

(12)

Why looking for modules?

Is it possible to uncover modules/hierarchies in large networks?

Intermediate levels of organization of complex systems

Find a partition of the network into communities

Coarse-grained description

Martin Rosvall and Carl T. Bergstrom, PNAS 105, 1118 –1123 (2008)

Uncovering communities/modules helps to understand the structure of the network and to draw a readable map of the network (when N is large).

(13)

Modules often overlap with properties/functions of nodes Data mining perspective:

Uncovering communities might help to uncover hidden properties between nodes

Why looking for modules?

(14)

Hundreds of ways to uncover communities/modules in networks:

- (very) fast vs (very) slow methods

- overlapping vs non-overlapping communities - single-scale or multi-scale?

Community detection

S. Fortunato, Community detection in graph, Physics Reports, vol. 486, pp. 75-174 (2010)

(15)

What is the partition of this graph into groups?

Question

(16)

Graph partitioning vs community detection

Both terms refer to the division of a network into dense groups

Graph partitioning: the number and size of the groups is fixed by the user. For

instance, bisection problem: what is the division of a network into 2 groups of equal size such that the number of links between the groups is minimised.

Application parallel computing: minimise inter-processor communication.

Divides the system into the required number of groups, whatever the organisation of the system.

Community detection: the number and size of the groups are unspecified, but

determined by the organisation of the network. Ideally, the method should be able to uncover a mixture of groups of different size in the same system. It should divide a network only when a good subdivision exists and leave it undivided otherwise.

In both cases:

1) How to formalise the problem?

2) How to solve it in practice?

(17)

Example: Graph bisection

Definition of the problem:

Find the best division of a network into 2 groups of size n1 and n2 such that the cut size is minimal, where the cut size is the total number of links between different groups.

PS: to divide the system into more than 2 groups, one can repeat the bisection: one can divide a network into two groups, than dividing these groups into 2 sub-groups, etc.

Solving the problem:

Looking through all bi-partitions and choose the one with the smallest cut size?

Advantage? Disadvantages?

(18)

Example: Graph bisection

Definition of the problem:

Find the best division of a network into 2 groups of size n1 and n2 such that the cut size is minimal, where the cut size is the total number of links between different groups.

PS: to divide the system into more than 2 groups, one can repeat the bisection: one can divide a network into two groups, than dividing these groups into 2 sub-groups, etc.

Solving the problem:

Looking through all bi-partitions and choose the one with the smallest cut size?

This exhaustive search is extremely costly in terms of computer time

(19)

Example: Graph bisection

The number of ways to divide a network of 2n nodes into two groups of n and n nodes is:

By using the Stirling approximation, one finds

The time to solve the problem by exhaustive search increases exponentially with the system size.

Not possible to solve it, even with modern computers, for systems of more than say 30-40 nodes. For instance when n=50: 126,410,606,437,752

Is it possible to find a smarter method to solve the problem? No, it is not

possible to find a method solving exactly the problem in polynomial time for all networks...

But several heuristics have been developed and allow to find approximate solutions in non-prohibitive times.

(20)

Kernighan-Lin algorithm

1)  Split in two groups of the required size

2)  For all pairs (i,j) of nodes belonging to different groups, we evaluate the cut size change by swapping both nodes. We swap the pair that most decreases or least increases the cut size.

3)  Step 2 is repeated by excluding nodes used before, until there are no pairs left to swap.

4)  When all possible swaps have been performed, we look through all intermediate bisections and select the one with the lowest cut size. This partition is fed back into 2.

This process is repeated until no improvement is made.

R=4 R=2

(21)

Kernighan-Lin algorithm

1)  Split in two groups of the required size

2)  For all pairs (i,j) of nodes belonging to different groups, we evaluate the cut size change by swapping both nodes. We swap the pair that most decreases or least increases the cut size.

3)  Step 2 is repeated by excluding nodes used before, until there are no pairs left to swap.

4)  When all possible swaps have been performed, we look through all

intermediate bisections and select the one with the lowest cut size. This partition is fed back into 2.

This process is repeated until no improvement is made.

Complexity: each round of the algorithm requires O(N) swaps. For each swap, one examines O(N^2) pairs of vertices. For each pair of vertices, evaluating the change in cut size is local, so O(1) for sparse networks. So O(N^3) times the number of

rounds, which is small in practice.

Performance: results strongly dependent on the initial partition, the method is often used to improve the results obtained from other methods

(22)

Spectral algorithm

Let us denote by the assignment of node i

But this solution is not a partition (except in extremely trivial situations) and it probably does not satisfy the required size of the groups.

By performing a spectral decomposition of the Laplacian matrix, one finds:

If there is no condition on si, the optimal solution would be

(23)

Spectral algorithm

We chose the partition with the specified sizes of communities the closest to optimality

If one wants a split into n1 and n2=n-n1 vertices, one orders the components of the Fiedler vector from the largest positive to the smallest negative and picks the n1

largest (smallest) components of the Fiedler vector

Complexity: O(N^2) on sparse networks

(24)

Quality of a partition

What is the best partition of a network into modules?

How do we rank the quality of partitions of different sizes?

...

Q1 Q2

Q3 Q4

(25)

expected number of links between i and j

Modularity

Q = fraction of edges within communities - expected fraction of such edges Let us attribute each node i to a community ci

Allows to compare partitions made of different numbers of modules

M.E.J. Newman and M. Girvan, Finding and evaluating community structure in networks, Phys. Rev. E, 69, 026113, 2004.

(26)

Modularity optimisation

Different types of algorithms (many similar to or inspired by graph partioning methods) for different applications:

Small networks (<102): Simulated Annealing

Intermediate size (102 -104): Spectral methods, PL, etc.

Large size (>104): greedy algorithms

(27)

Spectral optimization

Let us denote by the assignment of node i

By performing a spectral decomposition of the modularity matrix, one finds:

si is chosen to be as similar to the dominant eigenvector of the modularity matrix

Let us first focus on the best division of the network into 2 communities (of any size!).

M.E.J. Newman, Finding community structure in networks using the eigenvectors of matrices, Phys. Rev. E, vol.

74, 036104, 2006.

(28)

Spectral optimization

in terms of the spectral properties of the (normalized ) Laplacian NB. Modularity can also be rewritten as:

Suggests to bi-partition the graph according to the second dominant eigenvector of the normalized Laplacian (“a la Shi-Malik”).

(29)

Greedy optimisation

The algorithm is based on two steps that are repeated iteratively.

First phase: Find a local maximum

1) Give an order to the nodes (0,1,2,3,...., N-1)

2) Initially, each node belongs to its own community (N nodes and N communities)

3) One looks through all the nodes (from 0 to N-1) in an ordered way.

The selected node looks among its neighbours and adopt the

community of the neighbour for which the increase of modularity is maximum (and positive).

4)This step is performed iteratively until a local maximum of

modularity is reached (each node may be considered several times).

Node 0 moves to the community of Node 3

After N nodes have been considered

After each nodes has been considered 4 times

V.D. Blondel, J.-L. Guillaume, R. Lambiotte and E. Lefebvre, Fast unfolding of communities in large networks, J.

Stat. Mech., P10008, 2008.

(30)

Once a local maximum has been attained, we build a new network whose nodes are the communities. The weight of the links between communities is the total weight of the links between the nodes of these communities.

New network of 4 nodes!

Note the self-loops

In typical realizations, the number of nodes diminishes drastically at this step.

Greedy optimisation

(31)

The two steps are repeated iteratively, thereby leading to a hierarchical decomposition of the network.

Multi-scale optimisation: local search first among neighbours, then among neighbouring communities, etc.

Increasing values of modularity

Hierarchical representation

1st pass 2nd pass

Greedy optimisation

(32)

Very fast: O(N) in practice. The only limitation being the storage of the network in main memory

Good accuracy (among greedy methods)

Clauset A, Newman M E J and Moore C, 2004 Phys. Rev. E 70 066111.

Wakita K and Tsurumi T, 2007 Proceedings of IADIS international conference on WWW/

Internet 2007 153.

Pons P and Latapy M, 2006 Journal of Graph Algorithms and Applications 10 191.

(33)

How to test the methods?

Test the heuristics: what is the value of Q obtained for different algorithms? Time complexity?

(34)

Comparison with real-world data: do modules reveal nodes having similar meta- data?

How to test the methods?

But: meta-data are often unknown. No insurance that modular organization coincides with semantic/cultural organisation

(35)

How to test the methods?

Benchmarks: artificial networks with known community structure.

But: random networks (their structure is quite different from real-world networks);

in the way the benchmark is built, there is a (hidden) choice for what good partitions should be

Leon Danon, Jordi Duch, Albert Diaz-Guilera, Alex Arenas, J. Stat. Mech. (2005) P09008 Andrea Lancichinetti, Santo Fortunato, and Filippo Radicchi, Phys. Rev. E 78, 046110 (2008)

(36)

How to test the methods?

Benchmarks: ask the people!

(37)

Definition of the problem: a certain quality function with or without constraints on the system size (modularity, map equation, spin-glass modularity, etc)

Solving the problem: several existing heuristics (SA, spectral, greedy, etc);

divisive vs agglomerative

(38)

Anglia UArts

Aston

Bath

BathSpa

Beds

Birkbeck

Birmingham

Bolton

Bournemth Bradford

Brighton

Bristol Brunel Cambridge

Canterbury ICR

CLancs

BirmCity Chichester

Chester

Coventry

CityLondon

Cumbria

Cranfield

Derby DeMontfort

EAnglia Durham

EdgeHill

ELondon

Exeter Gloucs

InstEd Essex

Herts

Hudd

Goldsmiths Greenwich

Keele

Kent

Hull

Imperial Lancaster

Leeds Kings Kingston

Liverpool Lincoln

Leicester LeedsMet LSE

LBS

LplJM LplHope

Lough SouthBank

LondonMet

LSHTM

Newcastle Middlesex

MancMet

Manchester

OU

SOAS

Oxford OxBrookes

Nhampton Nhumbria

Nottingham NottTrent

Reading

Roehampton

Rholloway SchPharmRVC

Plymouth Portsmouth

QueenMary

SotonSol

Soton

Sunderland Staffs Salford

StGeorges

SheffHall Sheffield

Warwick

UCL Wminster

WEngland Sussex

Surrey

ThamesV

Teesside

Ulster

Aberdeen

YorkStJohn Queens Worcester

York Winchester

Wolves

Napier QMargaret

GlasgowCal HeriotWatt

Edinburgh

Glasgow

Abertay Dundee Cardiff

Bangor

UWA

WScotland Strthclyde

Stirling StAndrews RobGordon Swansea

SwanseaMet

Lampeter Glyndwr Glamorgan

In practice, certain codes are easy to implement

Otherwise, several heuristics have been implemented in free softwares such as

Network Workbench, Gephi, networkX, etc. with nice

visualisations

(39)

1. Community detection

2. Overlapping/multi-scale communities.

Dynamics and communities.

3. Why communities?

(40)

Uncovering meso-scale, intermediate levels of descripition

(41)

Hierarchical Modularity

Resolution limit

Optimising modularity uncovers one partition

What about sub (or hyper)-communities in a hierarchical network?

(42)

Hierarchical Modularity

Resolution limit

Optimising modularity uncovers one partition

What about sub (or hyper)-communities in a hierarchical network?

Reichardt & Bornholdt Arenas et al.

Tuning parameters allow to uncover communities of different sizes Reichardt & Bornholdt different of Arenas, except in the case of a regular graph where

J. Reichardt and S. Bornholdt, Phys. Rev. E 74, 016110 (2006). Statistical mechanics of community detection A Arenas, A Fernandez, S Gomez, New J. Phys. 10, 053039 (2008). Analysis of the structure of complex networks at different resolution levels

(43)

Modularity

Resolution limit

Optimising modularity uncovers one partition

What about sub (or hyper)-communities in a hierarchical network?

Reichardt & Bornholdt Corrected Arenas et al.

Preserves the eigenvectors of Laplacian (no A) and has a nice dynamical interpretation Reichardt & Bornholdt = corrected Arenas

R. Lambiotte, Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOpt), 2010 Proceedings of the 8th International Symposium on, 546-553 (2010)

(44)

Stability

The quality of a partition is determined by the patterns of a flow within the network: a flow should be trapped for long time periods within a

community before escaping it.

The stability of a partition is defined by the statistical properties of a random walker moving on the graph:

J.-C. Delvenne, S. Yaliraki & M. Barahona, Stability of graph communities across time scales. arXiv:0812.1811.

M. Rosvall and C. T. Bergstrom, PNAS 105, 1118 –1123 (2008)

(45)

Stability

The quality of a partition is determined by the patterns of a flow within the network: a flow should be trapped for long time periods within a

community before escaping it.

The stability of a partition is defined by the statistical properties of a random walker moving on the graph:

probability for a walker to be in the same community at times t

0

and t

0

+t when the system is at equilibrium

probability for two independent walkers to be in C (ergodicity)

J.-C. Delvenne, S. Yaliraki & M. Barahona, Stability of graph communities across time scales. arXiv:0812.1811.

(46)

a. Modularity vs Stability

Let us consider a random walk on an undirected network:

equilibrium

Probability that a walker is in the same community initially and at time t=1

Same probability for independent walkers

(47)

a. Modularity vs Stability

Let us consider a random walk on a directed network:

equilibrium

(48)

a. Modularity vs Stability

Flow-based modules Combinatorial modules

M. Rosvall and C. T. Bergstrom, PNAS 105, 1118 –1123 (2008)

(49)

a. Modularity vs Stability

(50)

a. Modularity vs Stability

Let us consider a random walk on a directed network:

equilibrium

but

(51)

Let us consider a continuous-time random walk with Poisson waiting times

equilibrium

Probability that a walker is in the same community initially and at time t

Same probability for independent walkers

b. Stability: time as a resolution parameter

(52)

What are the optimal partitions of R

t

?

t=0 Communities=single nodes

t small

favours single nodes modularity

!! Qt equivalent to the Hamiltonian formulation of Reichardt and Bornholdt (t=1/gamma)

When t goes to infinity, the optimal partition is made of 2 communities (by spectral decomposition, i.e. dominant eigenvector of the Laplacian)

...

b. Stability: time as a resolution parameter

(53)

Time is a “resolution parameter”: larger and larger communities when time is increased

0 0.2 0.4 0.6 0.8 1

0 1 2 3 4 5

R t

time 16 communities

8 communities 4 communities 2 communities

b. Stability: time as a resolution parameter

(54)

Time is a “resolution parameter”: larger and larger communities when time is increased

0 0.2 0.4 0.6 0.8 1

0 1 2 3 4 5

R t

time 16 communities

8 communities 4 communities 2 communities

b. Stability: time as a resolution parameter

(55)

Time is a “resolution parameter”: larger and larger communities when time is increased

0 0.2 0.4 0.6 0.8 1

0 1 2 3 4 5

R t

time 16 communities

8 communities 4 communities 2 communities

b. Stability: time as a resolution parameter

(56)

Time is a “resolution parameter”: larger and larger communities when time is increased

0 0.2 0.4 0.6 0.8 1

0 1 2 3 4 5

R t

time 16 communities

8 communities 4 communities 2 communities

b. Stability: time as a resolution parameter

(57)

Time is a “resolution parameter”: larger and larger communities when time is increased

0 0.2 0.4 0.6 0.8 1

0 1 2 3 4 5

R t

time 16 communities

8 communities 4 communities 2 communities

b. Stability: time as a resolution parameter

(58)

In practice: Optimisation

The stability R(t) of the partition of a graph with adjacency matrix A is equivalent to the modularity Q of a time-dependent graph with adjacency matrix X(t)

which is the flux of probability between 2 nodes at equilibrium and whose generalised degree is

For very large networks:

(59)

The optimization of R(t) over a period of time leads to a sequence of partitions that are optimal at different time scales.

How to select the most relevant scales of description?

The significance of a particular scale is usually associated to a certain notion of the robustness of the optimal partition. Here, robustness indicates that a small modification of the optimization algorithm, of the network, or of the quality

function does not alter this partition.

We look for regions of time where the optimal partitions are very similar. The similarity between two partition is measured by the normalised variation of information.

In practice: Selection of the most relevant

scales

(60)

M. Meilă, J. Multivariate Anal. 98, 873 (2007).

Verification on empirical networks and multi-scale benchmarks

0 5 10 15 20 25

10-1 100

Average number of communities

time

0 0.02 0.04 0.06 0.08 0.1 0.12 0.14

10-1 100

Average NVI

time algo

net QF

Football network

Compatible notions of robustness

Lack of robustness => high degeneracy of the QF landscape: uncovered partitions are not to be trusted; wrong resolution

algo: for each t, 100 optimizations of Louvain algorithm while changing the ordering of the nodes

net: for each t, 100 optimizations with a fixed algorithm but

randomized modifications of the network

QF: for each t, one optimization.

Partitions at 5 successive values of t

are compared.

(61)

M. Meilă, J. Multivariate Anal. 98, 873 (2007).

Verification on empirical networks and multi-scale benchmarks

0 5 10 15 20 25

10-1 100

Average number of communities

time

0 0.02 0.04 0.06 0.08 0.1 0.12 0.14

10-1 100

Average NVI

time algo

net QF

Football network

Compatible notions of robustness

Lack of robustness => high degeneracy of the QF landscape: uncovered partitions are not to be trusted; wrong resolution

algo: for each t, 100 optimizations of Louvain algorithm while changing the ordering of the nodes

net: for each t, 100 optimizations with a fixed algorithm but

randomized modifications of the network

QF: for each t, one optimization.

Partitions at 5 successive values of t

are compared.

(62)

M. Meilă, J. Multivariate Anal. 98, 873 (2007).

Verification on empirical networks and multi-scale benchmarks

Multi-scale network

Compatible notions of robustness

Lack of robustness => high degeneracy of the QF landscape: uncovered partitions are not to be trusted; wrong resolution

algo: for each t, 100 optimizations of Louvain algorithm while changing the ordering of the nodes

net: for each t, 100 optimizations with a fixed algorithm but

randomized modifications of the network

QF: for each t, one optimization.

Partitions at 5 successive values of t are compared.

100 101 102

10-1 100 101

Average number of communities

time

0 0.02 0.04 0.06 0.08 0.1 0.12

10-1 100 101

Average NVI

time algonet

QF

(63)

Overlapping communities

Node partitions are of limited use in systems where communities are overlapping, especially when the overlap is

pervasive

We all have different types of friends:

- Family - Friends

- Work Colleagues

(64)

Overlapping communities

Clique Percolation Method

Principle: looking at connectivity in terms of cliques

1)  Two k‐cliques are neighbours if they have a (k‐1)‐clique in common 2)  Connected components = modules

Palla et al., Nature 2005

(65)

Node partition Edge partition

T. Evans and R. Lambiotte, Phys. Rev. E, 80 (2009) 016105 Y.-Y. Ahn, J.P. Bagrow, S. Lehmann, Science 2010

The connection between two persons usually exists for one dominant reason => links

therefore typically belong to one single module

Overlapping communities

(66)

T. Evans and R. Lambiotte, Phys. Rev. E, 80 (2009) 016105 Y.-Y. Ahn, J.P. Bagrow, S. Lehmann, Science 2010

Modularity optimization on the Line Graph

Overlapping communities

Hierarchical clustering on a matrix of proximity between links

(67)

Local definitions of communities instead of global approaches Goodness of communities instead of partitions

Allows for overlapping communities

Triangles to Capture Social Cohesion, and C3 method for optimal covering, Friggeri et al.

Overlapping communities

(68)

Overlapping communities

Validated on a large-scale experiment on Facebook (Fellows): algorithmic

communities strongly correlated to the users’ perception of the quality of social communities.

Triangles to Capture Social Cohesion, A. Friggeri, G. Chelius and E. Fleury, SocialCom (2011)

(69)

Overlapping communities: an Application

Is the community structure of our networks the reflect of our individual psychological traits?

(70)

Overlapping communities: an Application

Is the community structure of our networks the reflect of our individual psychological traits?

The five-factor model of personality, or the big five, is the most comprehensive, reliable and useful set of personality concepts. The idea is that an individual can be associated with 5 scores that correspond to 5 main personality traits.

Personality traits predict a number of real-world behaviors. They, for example, are strong predictors of how marriages turn out: if one of the partner is high in

Neuroticism, then divorce is more likely.

The Revised NEO Personality Inventory, P. Costa and R. Mccrae, SAGE Publications (2005)

(71)

Overlapping communities: an Application

Is the community structure of our networks the reflect of our individual psychological traits?

- Facebook application: 5.5 million users

- Users can opt in and give their consent to share their profile information (40%) - Right incentives: subjects are not paid nor receive college credits. myPersonality users are solely motivated by the prospect of receiving reliable feedback and test results that accurately describe their personalities.

- Unreliable results are removed. Numerous validity tests

- myPersonality is able to obtain test results that are more reliable than those in pen-and-paper studies.

- myPersonality users are far less biased than those studies’ subjects for gender, age, and geography.

- VERY large scale data

(72)

Overlapping communities: an Application

Is the community structure of our networks the reflect of our individual psychological traits?

Ego-network of person John: friends of John and connections between them (John does not belong to his ego-network).

Number of friends = size of the ego-network

In the social science, long tradition in analysing and theorising ego-networks, e.g.

their connection to social capital: dense ego-networks favor trust and facilitate information flow // open ego-networks indicate bridging capital as individuals bridge structural holes between disconnected others.

~ 50k users with number of friends comprised between 50 and 2000

PS: Local network analysis because of our very incomplete knowledge of the whole Facebook network (thousands vs billions)

Extraversion is positively correlated to the size of the ego-network, but what about its organization?

Psychological Aspects of Social Communities, A. Friggeri, R. Lambiotte, M. Kosinski and E. Fleury, SocialCom (2012)

(73)

Overlapping communities: an Application

Is the community structure of our networks the reflect of our individual psychological traits?

Psychological Aspects of Social Communities, A. Friggeri, R. Lambiotte, M. Kosinski and E. Fleury, SocialCom (2012)

(74)

Overlapping communities: an Application

Is the community structure of our networks the reflect of our individual psychological traits?

Introverts tend to have less, larger communities: they hide into large communities.

Extroverts exhibit a higher overlap of the communities: they act as bridges between communities

No significant difference in the average value of cohesion.

Psychological Aspects of Social Communities, A. Friggeri, R. Lambiotte, M. Kosinski and E. Fleury, SocialCom (2012)

(75)

Dynamics and communities?

(76)

Small-World and Modularity

Modular vs Watts-Strogatz

(77)

Small-World and Modularity

Nodes within the same module are densely intra-connected, the number of

triangular motifs in a modular network is larger than in a random graph of the same size and connection density

The existence of a few links between nodes in different modules plays the role of topological short-cuts in a small-world formulation of network architecture.

Modular are SW but SW networks are not necessarily modular

(78)

Small-World, Modularity and Dynamics

Nodes within the same module are densely intra-connected, the number of

triangular motifs in a modular network is larger than in a random graph of the same size and connection density

The existence of a few links between nodes in different modules plays the role of topological short-cuts in a small-world formulation of network architecture.

Modular are SW but SW networks are not necessarily modular

Aggregation of Variables in Dynamic Systems by HA Simon and A Ando (1961)

Modularity naturally produces small-world networks, with the additional, crucial property of time-scale separation corresponding to fast intra-modular processes and slow inter-modular processes.

Near-decomposability:

- the short-time behaviour within a module is approximately independent of the short-time behaviour of the other components

- in the long-run, the behaviour of a module depends in only an aggregate way on the behaviour of other modules (i.e. not on the detailed state of their components).

(79)

Approach toward synchronization when

which is different from:

the probability to leave a node is

proportional to the degree of the node

the probability to leave a node does not depend on the degree of the node

Alex Arenas, Albert Diaz-Guilera, C Pérez-Vicente, Phys. Rev. Lett. 96, 114102 (2006).

Synchronization Reveals Topological Scales in Complex Networks

Example: Kuramoto

Kuramoto model

(80)

Alex Arenas, Albert Diaz-Guilera, C Pérez-Vicente, Phys. Rev. Lett. 96, 114102 (2006).

Synchronization Reveals Topological Scales in Complex Networks

Example: Kuramoto

(81)

Alex Arenas, Albert Diaz-Guilera, C Pérez-Vicente, Phys. Rev. Lett. 96, 114102 (2006).

Synchronization Reveals Topological Scales in Complex Networks

Example: (Partial) synchronization

Actually, the non-linearity is irrelevant...

(82)

Time-scale separation and spectral gaps

1/(1-lambda) ~ characteristic time WS

Modular

(83)

Time-scale separation and spectral gaps

What if the networks has pervasive overlaps?

(84)

1. Community detection

2. Overlapping/multi-scale communities.

Dynamics and communities.

3. Why communities?

(85)

Why communities?

Many complex systems are modular/hierarchical:

Generic mechanisms driving the emergence of modularity?

D. Meunier, R. Lambiotte and E.T. Bullmore, “Modular and hierarchical organisation in complex brain networks”, to appear in Frontiers in NeuroScience (2010) - 7 pages

(86)

Faster evolution

Simple systems evolve more rapidly if there are stable intermediate forms (modules) than if there are not present.

Among possible complex organizations, hierarchies are observed because they are the ones that have had the time to evolve

Simon, H. (1962). The architecture of complexity. Proceedings of the American Philosophical Society, 106, 467–482.

(87)

Faster evolution

Watchmaker parable:

“There once were two watchmakers, named Hora and Tempus, who made very fine watches. The phones in their workshops rang frequently; new customers were

constantly calling them. However, Hora prospered while Tempus became poorer and poorer. In the end, Tempus lost his shop. What was the reason behind this?

The watches consisted of about 1000 parts each.

The watches that Tempus made were designed such that, when he had to put down a partly assembled watch (for instance, to answer the phone), it immediately fell into pieces and had to be reassembled from the basic elements.

Hora had designed his watches so that he could put together subassemblies of about ten components each. Ten of these subassemblies could be put together to make a larger sub- assembly. Finally, ten of the larger subassemblies constituted the whole watch. Each subassembly could be put down without falling apart.”

Simon, H. (1962). The architecture of complexity. Proceedings of the American Philosophical Society, 106, 467–482.

(88)

Faster evolution when modular

Tempus Hora

- must complete 1 assembly of 1000 elements

- must complete 111 assemblies of 10 elements

Probability that an interruption occurs while a piece is added, say p=0.01

- loses on average 100 pieces

(1/0.01) - loses on average 5 pieces

- finishes an assembly with probability (1-0.01)^1000 ~ 0.000004

- finishes an assembly with probability (1-0.01)^10 ~ 0.9

Time to finish a watch: 100 /(1-0.01)^1000 Time to finish a watch: 111 * 5 /(1-0.01)^10 It will take Tempus 4000 times as long to assemble a swatch

Robust intermediate steps during evolution: if the system breaks down (whatever the reason), evolution does not restart from scratch, but from intermediate, stable

solutions (back-up!).

Importance of disturbance due to the environment

(89)

Separation of time scales: diversity

Pan, R.K. and Sinha, S. (2009) Modularity produces small-world networks with dynamical time-scale separation, EPL, 85:68006.

R. Lambiotte and M. Ausloos, J. Stat. Mech., (2007) P08026

Encapsulation of local activation: local consensus, local synchronization, local ordering, etc.

Locally dense and globally sparse (minimise wiring cost)

Promotes diversity and thus robustness in a fluctuating environment

Ising model Opinion dynamics,

e.g. MR

(90)

Separation of time scales: diversity

R. Lambiotte and M. Ausloos, J. Stat. Mech., (2007) P08026

MR

N agents have an opinion: -1 or 1 The population evolves by:

(i) picking a random voter

(ii) with probability q, the voter randomly selects a new state -1 or +1

(iii) with probability 1-q, the voter randomly selects two of its neighbours and the three agents adopt the state of the majority

(91)

Separation of time scales: diversity

R. Lambiotte and M. Ausloos, J. Stat. Mech., (2007) P08026

(92)

Separation of time scales: diversity

R. Lambiotte and M. Ausloos, J. Stat. Mech., (2007) P08026

Disordered phase Symmetric phase

Symmetric or Asymmetric phase

0 0.05 0.1 0.15 0.2 0.25

0 0.2 0.4 0.6 0.8 1

!

q Phase diagram??

??????

(93)

Separation of time scales: diversity

R. Lambiotte and M. Ausloos, J. Stat. Mech., (2007) P08026

Disordered phase Symmetric phase

Symmetric or Asymmetric phase

0 0.05 0.1 0.15 0.2 0.25

0 0.2 0.4 0.6 0.8 1

!

q

Noise or topology can drive the transition

(94)

Locally dense but sparse: knowledge creation and information diffusion

“closed”, dense, or cohesive networks

likely to foster trust, facilitate the enforcement of social norms, and enable the

creation of a common culture (Coleman, 1988)

facilitates the diffusion of

“complex knowledge”

“open”, sparse, or brokered networks

likely to connect people with different ideas, interests and perspectives (Burt, Granovetter)

accelerates diffusion, ensures communicability within the

system opposite benefits

Modular networks presents a compromise between these opposite effects

Dense modules Small diameter

P. Panzarasa, Journal of Informetrics, 3 (2009) 180–190

(95)

time

Co-evolution

The state of a node (spin, phase, etc) affects its connections, which affects its state, which affects its connections, etc.

Neuronal systems: synaptic plasticity in neuron dynamics Social networks: homophily and social influence

Natural tendency toward segregation and modularity

Rubinov, M., Sporns, O., van Leeuwen, C., and Breakspear, M. (2009).

Symbiotic relationship between brain structure and dynamics. BMC Neurosci., 10:55.

(96)

Why communities?

Generic mechanisms driving the emergence of modularity?

- Watchmaker: intermediate states facilitates the emergence of complex organisation from elementary subsystems

- Separation of time scales: enhances diversity, locally synchronised states - locally dense but globally sparse: advantages of dense structures while minimising the wiring cost

- in social systems, offer the right balance between dense networks (foster trust, facilitate diffusion of complex knowledge), and open networks (small diameter, ensures connectivity, facilitates diffusion of “simple” knowledge)

- naturally emerges from co-evolution and duplication processes (see Modularity

“for free” in genome architecture? Ricard V. Sole and Pau Fernandez)

- Optimality of modular networks at performing tasks in a changing environment (Kashtan, N. and Alon, U. (2005) Spontaneous evolution of modularity and network motifs. Proc. Natl.

Acad. Sci. USA, 102:13773--13778.)

- enhanced adaptivity and dynamical complexity, e.g. transient “chimera” states - delivers highly adaptive processing systems and to solve the dynamical

demands imposed by global integration and functional segregation (brain organisation)

D. Meunier, R. Lambiotte and E.T. Bullmore, “Modular and hierarchical organisation in complex brain networks”, to appear in Frontiers in NeuroScience (2010) - 7 pages

(97)

References:

Modularity and complex systems

- H.A. Simon, The Architecture of Complexity, Proc. Amer. Phil. Soc., 106, pp.

467-482 (1962)

Community detection

- M.E.J. Newman, Networks: An Introduction, Oxford University Press (2010)

- M.A. Porter, J.-P. Onnela and P.J. Mucha, Communities in Networks, Notices of the American Mathematical Society, vol. 56, pp. 1082-1097 (2009)

- S. Fortunato, Community detection in graph, Physics Reports, vol. 486, pp. 75-174 (2010)

- R. Lambiotte, Multi-scale Modularity in Complex Networks, Modeling and

Optimization in Mobile, Ad Hoc and Wireless Networks (WiOpt), 2010 Proceedings of the 8th International Symposium on, 546-553 (2010)

- M. Rosvall and C. T. Bergstrom, Maps of random walks on complex networks reveal community, PNAS 105, 1118 –1123 (2008)

Références

Documents relatifs

Une correction de biais est alors nécessaire : elle est obtenue, pour chaque secteur d'activité (en NES16 et NES36), par un étalonnage de l'indicateur sur les estimations

Positive/negative ratio and number of positive and negative feedback circuits in 2-PLNs divided by category (n+ s+ , n− s+ , n+ s− and n− s− ) using different thresholds for the

The intensity of multi-tasking n (the number of tasks per individual) as well as the level of consumption c increase with the efficiency of human capital E, the size of the workforce

Assuming that we observe all the divisions of cells occurring in continuous time on the interval [0, T ], with T &gt; 0, we propose an adaptive kernel estimator ˆ h of h for which

Quantum electrodynamics tells us, though, that the energy of the passing photons is not well defined and that from time to time, one of these photons creates an electron-hole

The difference of the squared charge radii of the proton and the deuteron, r d 2 − r 2 p = 3.82007(65) fm 2 , is accurately determined from the precise measurement of the isotope

In order to avoid uncertainties related to chirping effects in the Ti:sapphire laser and to the Stokes shift ( D vib ), the frequency calibration of the laser pulses leaving the

In this paper, we investigate the gap between the final true functional size of a piece of software at project closure and the functional size estimated much earlier