Small worlds

Pierre Senellart

21 November 2014

Usage rights licence

Collective Intelligence

Random networks and small worlds

Small worlds

I proposed a more difficult problem: to find a chain of contacts linking myself with an anonymous riveter at the Ford Motor Company, and I accomplished it in four steps. The worker knows his foreman, who knows Mr. Ford himself, who, in turn, is on good terms with the director general of the Hearst publishing empire. I had a close friend, Mr. Árpád Pásztor, who had recently struck up an acquaintance with the director of Hearst Publishing. It would take but one word to my friend to send a cable to the general director of Hearst asking him to contact Ford who could in turn contact the foreman, who could then contact the riveter, who could then assemble a new automobile for me, would I need one.

[...] Our friend was absolutely correct: nobody from the group needed more than five links in the chain to reach, just by using the method of acquaintance, any inhabitant of our Planet.

[Karinthy, 1929]

Six Degrees of Separation

Idea that any two persons on Earth are separated by a chain of six individuals who know each other

Appears widely in popular culture:

It’s a small world!

Stanley Milgram’s Experiment [Travers and Milgram, 1969]

Stanley Milgram (1933-1984): social psychologist

Experiment: people are asked to send a message to some unknown person, by forwarding it to an acquaintance who might be closer to this person

Results: only 29% of the messages arrived, with a mean number of intermediate acquaintances of 5.2.

This somewhat validates the six-degrees theory!

Other more recent experiments [Dodds et al., 2003] confirm this order of magnitude.

Kevin Bacon’s Number

(David Shankbone, Wikimedia)

Kevin Bacon: Hollywood actor, played in numerous movies, mostly secondary roles

Kevin Bacon’s number:

0 for Kevin Bacon himself

1 for actors who played in the same movie as Bacon

2 for actors who played in the same movie as someone with a number of 1

etc.

http://oracleofbacon.org/

Most actors have a small Bacon’s number!
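The recursive definition above amounts to a breadth-first search from Kevin Bacon in the co-appearance graph. Here is a minimal sketch of that idea; the `costars` data and the `bacon_numbers` helper are hypothetical placeholders, not real filmographies or the method used by oracleofbacon.org.

```python
from collections import deque

def bacon_numbers(costars, source):
    """Breadth-first search: number of each actor reachable from `source`."""
    numbers = {source: 0}
    queue = deque([source])
    while queue:
        actor = queue.popleft()
        for other in costars.get(actor, ()):
            if other not in numbers:  # first visit gives the shortest chain
                numbers[other] = numbers[actor] + 1
                queue.append(other)
    return numbers

# Hypothetical toy data: actor -> actors who played in a same movie.
costars = {
    "Bacon": {"A", "B"},
    "A": {"Bacon", "C"},
    "B": {"Bacon"},
    "C": {"A", "D"},
    "D": {"C"},
    "E": set(),  # shares no movie with anyone here: no Bacon number
}
numbers = bacon_numbers(costars, "Bacon")
print(sorted(numbers.items()))
```

Actors that cannot be reached (such as `"E"` here) simply do not appear in the result, which matches the convention that their number is undefined.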

Erdős number

(Kmhkmh, Wikimedia)

Paul Erdős (1913-1996):

Mathematician and computer scientist, worked across many fields, with many collaborators

Erdős number:

0 for Paul Erdős himself

1 for scientists who coauthored an article with Erdős

2 for scientists who coauthored an article with someone with a number of 1

etc.

http://www.ams.org/mathscinet/collaborationDistance.html

Questions

Is there really a pattern here?

How can this be mathematically modeled?

Can we explain what happens?

Anything else to discover in such networks?

Outline

Introduction

Basics of Graph Theory

Characteristics of Real-World Networks

Models of Networks

Conclusion

Graphs

[Figure: example graph on vertices 1 to 6]

Definition

A directed graph is a pair $(S, A)$ where:

$S$ is a finite set of vertices (or nodes)

$A$ is a subset of $S^2$ defining the edges (or arcs)

[Figure: example graph on vertices 1 to 6]

Definition

An undirected graph is a pair $(S, A)$ where:

$S$ is a finite set of vertices (or nodes)

$A$ is a set of (unordered) pairs of elements of $S$ defining the edges (or arcs)

Remark

Graph is the mathematical term, network is used to describe real-world graphs.
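The two definitions translate almost literally into code: ordered pairs for a directed graph, unordered pairs for an undirected one. The sketch below is one possible encoding; the vertex and edge sets are made up for illustration (the slide's figures are not recoverable), and `successors` and `neighbors` are hypothetical helper names.

```python
# Illustrative vertex set; not taken from the slide's figure.
S = {1, 2, 3, 4, 5, 6}

# Directed graph: A is a set of ordered pairs, a subset of S x S.
A_directed = {(1, 2), (2, 3), (3, 1), (4, 5)}

# Undirected graph: A is a set of unordered pairs, modeled as frozensets.
A_undirected = {frozenset(edge) for edge in A_directed}

def successors(A, u):
    """Vertices reached from u by following one directed edge."""
    return {v for (x, v) in A if x == u}

def neighbors(A, u):
    """Vertices that share an undirected edge with u."""
    return {v for edge in A if u in edge for v in edge if v != u}

print(successors(A_directed, 1))   # {2}
print(neighbors(A_undirected, 1))  # {2, 3}
```

Storing edges as a plain set keeps the code close to the mathematical definition; for large graphs an adjacency dictionary (vertex to set of neighbours) is usually the more practical choice.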

Paths and Connectedness

Definition

A path is a sequence of vertices $v_1, \dots, v_n$ such that $v_k$ is connected by an edge to $v_{k+1}$ for $1 \le k \le n-1$.

Definition

The underlying undirected graph of a directed graph $G$ is the graph obtained by adding all reverse edges.

Definition

An undirected graph is connected if for every two vertices $u$ and $v$, there exists a path starting from $u$ and ending in $v$.

A directed graph is strongly connected if, for every two vertices $u$ and $v$, there exists a directed path from $u$ to $v$, and is weakly connected if its underlying undirected graph is connected.

Connected Components

Definition

$(S', A')$ is a subgraph of $(S, A)$ if $S' \subseteq S$ and $A'$ is the restriction of $A$ to edges whose vertices are in $S'$.

Connected component: maximal connected subgraph

Strongly connected component: maximal strongly connected subgraph

Weakly connected component: maximal weakly connected subgraph
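Both kinds of components can be computed with plain reachability searches. The sketch below is a simple, quadratic-time illustration (not Tarjan's or Kosaraju's linear-time algorithm): the strongly connected component of a vertex is the set of vertices reachable from it that can also reach it back. The example graph and all function names are hypothetical.

```python
def reachable(adj, start):
    """Vertices reachable from `start` by following edges of `adj`."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def weak_components(vertices, edges):
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)  # reverse edge: underlying undirected graph
    comps, seen = [], set()
    for v in sorted(vertices):
        if v not in seen:
            comp = reachable(adj, v)
            comps.append(comp)
            seen |= comp
    return comps

def strong_components(vertices, edges):
    fwd = {v: set() for v in vertices}
    bwd = {v: set() for v in vertices}
    for u, v in edges:
        fwd[u].add(v)
        bwd[v].add(u)
    comps, seen = [], set()
    for v in sorted(vertices):
        if v not in seen:
            # Reachable from v AND able to reach v.
            comp = reachable(fwd, v) & reachable(bwd, v)
            comps.append(comp)
            seen |= comp
    return comps

# Hypothetical directed graph on vertices 1 to 6.
V = {1, 2, 3, 4, 5, 6}
E = {(1, 2), (2, 1), (2, 4), (4, 5), (5, 4), (3, 6)}
print(weak_components(V, E))    # [{1, 2, 4, 5}, {3, 6}]
print(strong_components(V, E))  # [{1, 2}, {3}, {4, 5}, {6}]
```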

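[Figure: a directed graph on vertices 1 to 6, shown with its strongly connected components and then with its weakly connected components highlighted]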

Vocabulary

Incident: an edge is said to be incident to a vertex if it has this vertex as an endpoint

Degree (of a vertex): number of edges incident to a vertex, in an undirected graph

Indegree (of a vertex): number of edges arriving at a vertex, in a directed graph

Outdegree (of a vertex): number of edges leaving a vertex, in a directed graph

Cycle: path whose start and end vertices are the same

Distance: length of the shortest path between two vertices

Sparse: a graph $(S, A)$ is sparse if $|A| \ll |S|^2$
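As a small illustration of this vocabulary, the sketch below counts indegrees, outdegrees and degrees from an edge list and computes the ratio $|A| / |S|^2$ used in the sparsity criterion. The edge list is a hypothetical example.

```python
from collections import Counter

# Hypothetical directed graph.
vertices = {1, 2, 3, 4}
edges = [(1, 2), (2, 3), (3, 1), (4, 3)]

outdegree = Counter(u for u, _ in edges)  # edges leaving each vertex
indegree = Counter(v for _, v in edges)   # edges arriving at each vertex

# Degree when the same edges are read as undirected.
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

# |A| / |S|^2: close to 0 for a sparse graph.
density = len(edges) / len(vertices) ** 2

print(indegree[3], outdegree[3], degree[3])  # 2 1 3
print(density)                               # 0.25
```

A four-vertex example is of course too small for "sparse" to be meaningful; the ratio only becomes informative on large networks.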

Bipartite Graphs

Definition

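A bipartite graph is an undirected graph $(S, A)$ such that $S = S_1 \cup S_2$ (with $S_1 \cap S_2 = \emptyset$), and no edge of $A$ is incident to two vertices in $S_1$ or two vertices in $S_2$.

Paths of length 2 in a bipartite graph define two regular (that is, ordinary, non-bipartite) undirected graphs, one on $S_1$ and one on $S_2$.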
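A path of length 2 in a bipartite graph goes from one side to the other and back, so it links two vertices of the same side that share a neighbour. The sketch below builds both derived graphs from a hypothetical actor-movie example; `cast` and `project` are illustrative names.

```python
from itertools import combinations

# Hypothetical bipartite graph: movie -> set of actors in it.
cast = {
    "M1": {"a", "b"},
    "M2": {"b", "c", "d"},
    "M3": {"d"},
}

def project(groups):
    """Link two items whenever they belong to a common group."""
    edges = set()
    for members in groups.values():
        for u, v in combinations(sorted(members), 2):
            edges.add((u, v))
    return edges

# Graph on actors: two actors are linked if they share a movie.
actor_graph = project(cast)

# Invert the relation to get actor -> set of movies, then project again.
movies_of = {}
for movie, members in cast.items():
    for actor in members:
        movies_of.setdefault(actor, set()).add(movie)
movie_graph = project(movies_of)

print(sorted(actor_graph))  # [('a', 'b'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
print(sorted(movie_graph))  # [('M1', 'M2'), ('M2', 'M3')]
```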

[Figure: a bipartite graph with parts {1, 2, 3, 4} and {5, 6, 7}, followed by the two derived graphs, one on each part]

Beware of Graph Drawings

[Figure: three drawings of one graph on vertices 1 to 4]

Three times the same graph!

No “best” drawing

Not always possible to draw a graph without edge crossings (not every graph is planar)

Outline

Introduction

Basics of Graph Theory

Characteristics of Real-World Networks: Social Networks, Natural Networks, Artificial Networks

Models of Networks

Conclusion

Characteristics of Interest

Sparsity. Is the network sparse ($|A| \ll |S|^2$)?

All networks considered here will be sparse.

Typical distance. What is the mean distance between any pair of vertices?

Local clustering. If $a$ is connected to both $b$ and $c$, is the probability that $b$ is connected to $c$ significantly greater than the probability that any two nodes are connected?

Degree distribution. What is the distribution of the degree of vertices?

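[Figure: three plots of $P$ against the degree $k$]

Poisson: $P(k) \propto \lambda^k / k!$

Power-law: $P(k) \propto k^{-\gamma}$

Gaussian: $P(k) \propto e^{-k^2}$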
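Two of these characteristics, the typical distance and the degree distribution, are easy to measure on a small graph. The sketch below averages shortest-path lengths over all ordered pairs of distinct, mutually reachable vertices and tabulates degrees; the adjacency data and function names are hypothetical, and the approach (one breadth-first search per vertex) is only practical for modest graph sizes.

```python
from collections import Counter, deque

def distances_from(adj, source):
    """Shortest-path lengths from `source` (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def mean_distance(adj):
    """Mean distance over ordered pairs of distinct reachable vertices."""
    total = pairs = 0
    for s in adj:
        for t, d in distances_from(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs  # assumes the graph has at least one edge

# Hypothetical undirected graph: the path 1 - 2 - 3 - 4.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}

degree_histogram = Counter(len(nbrs) for nbrs in adj.values())
print(mean_distance(adj))      # 20 / 12, about 1.667
print(dict(degree_histogram))  # {1: 2, 2: 2}
```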

Acquaintance Network

As in the experiment by Milgram

... or as given by social networking sites such as Facebook, LinkedIn...

Network characteristics

Logarithmic typical distance

Strong local clustering

Gaussian degree distribution [Amaral et al., 2000]

Actors and Scientists Networks

Bipartite graphs Actor-Movie and Scientist-Publication

Corresponding undirected graphs:

actors appearing in the same movie

scientists who coauthored a paper

Bacon’s/Erdős number: distance in the graph to the corresponding vertex

Network characteristics

Logarithmic typical distance

Strong local clustering

Power-law degree distribution ($2 \le \gamma \le 3$), with a possible tail cutoff [Amaral et al., 2000]

Sex Networks

[Amaral et al., 2000]

Network characteristics

In this particular case (small and incomplete community): [Amaral et al., 2000]

Unconnected network, long typical distance

No local clustering (the graph is almost bipartite!)

But for larger studies [Liljeros et al., 2001]:

Logarithmic typical distance

No strict local clustering because of the predominance of heterosexuality, but some kind of locality

Power-law degree distribution ($\gamma \approx 2.5$ for females, $\gamma \approx 2.3$ for males)

Neural Networks

(Dorling Kindersley, dkimages)

Network characteristics

Logarithmic typical distance [Watts and Strogatz, 1998]

Strong local clustering

Power-law degree distribution

Metabolic Networks

(Laboratory of Computer Engineering, Technical University of Helsinki)

Network characteristics

Logarithmic typical distance

Strong local clustering

Power-law degree distribution ($2 \le \gamma \le 2.4$) [Jeong et al., 2000]

The Internet: physical connections between LANs

http://www.opte.org/

Network characteristics

Logarithmic typical distance

Strong local clustering

Power-law degree distribution ($\gamma \approx 2.2$) [Faloutsos et al., 1999]

The Web: logical hyperlinks between Web pages

[Broder et al., 2000]

Network characteristics

Directed graph

Logarithmic typical distance

Strong local clustering

Power-law indegree and outdegree distribution ($2 \le \gamma \le 3$) [Broder et al., 2000]

Scientific Citations Network

Vertices: scientific publications

Edges: citation links

Network characteristics

Directed graph

No cycles! No strong connectivity.

Strong local clustering (on the underlying undirected graph)

Power-law indegree and outdegree distribution ($2 \le \gamma \le 3$)

Transportation Networks

Network characteristics

Long typical distance

Strong local clustering

Limited degree variations

Outline

Introduction

Basics of Graph Theory

Characteristics of Real-World Networks

Models of Networks: Random Networks, Small Worlds, Scale-Free Networks

Conclusion

Random Networks [Solomonoff and Rapoport, 1951, Erdős and Rényi, 1960]

Construction

1. Start with $n$ vertices and a probability $p$.

2. For each pair of vertices $(u, v)$, insert an edge between $u$ and $v$ with probability $p$.

Sparse if $p \ll 1$

Logarithmic typical distance (inside the giant connected component)!

No local clustering.
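The two construction steps can be sketched in a few lines. The function name `random_graph`, the `seed` parameter and the values of $n$ and $p$ are illustrative choices, not part of the original model description.

```python
import random
from itertools import combinations

def random_graph(n, p, seed=None):
    """Undirected random graph: each of the n(n-1)/2 pairs is kept with probability p."""
    rng = random.Random(seed)
    return {(u, v) for u, v in combinations(range(n), 2) if rng.random() < p}

n, p = 200, 0.05  # hypothetical parameters
edges = random_graph(n, p, seed=0)

# The expected number of edges is p * n(n-1)/2.
expected = p * n * (n - 1) / 2
print(len(edges), expected)
```

Since every pair is examined, this direct approach takes time quadratic in $n$; it is meant to mirror the definition rather than to scale to very large sparse graphs.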

Degree distribution in random networks

$$P(k) = \binom{n}{k} p^k (1-p)^{n-k} \sim \frac{(pn)^k e^{-pn}}{k!}$$

Remark

One can construct random graphs with an arbitrary degree distribution (more complicated); still no local clustering, obviously.
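The Poisson approximation of the binomial degree distribution can be checked numerically. The sketch below compares the two formulas for hypothetical values $n = 1000$ and $p = 0.005$ (so $pn = 5$); the function names are illustrative.

```python
from math import comb, exp, factorial

def binomial(k, n, p):
    """Exact degree distribution: C(n, k) p^k (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson(k, lam):
    """Poisson approximation with parameter lam = pn."""
    return lam**k * exp(-lam) / factorial(k)

n, p = 1000, 0.005  # hypothetical values, large n and small p
gap = max(abs(binomial(k, n, p) - poisson(k, n * p)) for k in range(30))

for k in (0, 5, 10):
    print(k, binomial(k, n, p), poisson(k, n * p))
print("largest pointwise gap for k < 30:", gap)
```

The approximation is good when $n$ is large and $p$ is small with $pn$ fixed, which is exactly the sparse regime considered here.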

Small Worlds [Watts and Strogatz, 1998, Watts, 1999]

Construction

1. Start with a regular lattice (a grid).

2. With probability $p$, reroute each edge randomly.

[Watts and Strogatz, 1998]

Sparse.
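A minimal sketch of this construction follows, under assumptions that go beyond the slide: the lattice is taken to be a ring where each vertex is linked to its `k` nearest neighbours on each side, and rerouting keeps one endpoint of the edge and redraws the other uniformly among vertices not already adjacent to it. The function name and parameters are hypothetical, and the sketch assumes `n > 2 * k`.

```python
import random

def small_world(n, k, p, seed=None):
    """Ring lattice on n vertices, then each edge is rerouted with probability p."""
    rng = random.Random(seed)
    edges = set()
    for u in range(n):
        for j in range(1, k + 1):
            edges.add(frozenset((u, (u + j) % n)))
    # Iterate over a sorted snapshot so that results are reproducible.
    for edge in sorted(edges, key=sorted):
        if rng.random() < p:
            u = min(edge)  # endpoint that is kept
            candidates = [w for w in range(n)
                          if w != u and frozenset((u, w)) not in edges]
            if candidates:
                edges.remove(edge)
                edges.add(frozenset((u, rng.choice(candidates))))
    return edges

lattice = small_world(20, 2, 0.0)
rewired = small_world(20, 2, 0.3, seed=1)
print(len(lattice), len(rewired))  # 40 40
```

Rerouting never creates self-loops or duplicate edges, so the number of edges stays at `n * k` for every value of `p`, and the graph remains sparse.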

Characteristics of Small Worlds

For $p = 0$: lattice (strong local clustering)

For $p = 1$: random graph (small typical distance)

Somewhere in between:

Small typical distance (thanks to rerouting)

Strong local clustering (thanks to the initial lattice)

Degree distribution resembling a Poisson distribution.

Measuring the local clustering

$$C_G = \frac{3 \times (\text{number of triangles in } G)}{\text{number of connected triples in } G}$$

$C_{fg} = 1$ for a fully connected graph

$C_{rg} = p$ for a random graph

A graph $G$ has strong local clustering if $C_G \gg C_{rg}$ (for the random graph with the same number of edges)
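The coefficient $C_G$ can be computed by visiting, for each vertex, every pair of its neighbours: each pair is one connected triple centred on that vertex, and the triple is closed when the two neighbours are themselves adjacent. Each triangle closes three triples (one per corner), which accounts for the factor 3 in the formula. The sketch below assumes an undirected graph stored as a dictionary of neighbour sets; the names and example graph are illustrative.

```python
from itertools import combinations

def clustering(adj):
    """3 * triangles / connected triples, for an undirected graph."""
    closed = triples = 0
    for v, nbrs in adj.items():
        for a, b in combinations(sorted(nbrs), 2):
            triples += 1          # a - v - b is a connected triple
            if b in adj[a]:
                closed += 1       # ... closed by the edge a - b
    return closed / triples if triples else 0.0

# Hypothetical graph: a triangle 1-2-3 with an extra vertex 4 attached to 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(clustering(adj))  # 0.6

# Fully connected graph on 4 vertices.
complete = {v: {w for w in range(4) if w != v} for v in range(4)}
print(clustering(complete))  # 1.0
```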

Preferential Attachment [Barabási and Albert, 1999]

Construction

1. Start with a small graph of size $m_0$, and let $m$ be a constant with $m < m_0$.

2. One after the other, $n - m_0$ vertices are added to the graph, each connected to $m$ existing vertices; the probability of connecting to a vertex is proportional to its degree.

Network characteristics

Sparse if $m$ and $n$ are chosen appropriately.

Small typical distance.

Strong local clustering

Power-law degree distribution (actually, with $\gamma = 3$, but variations allow arbitrary exponents).
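The construction can be sketched with a list in which every vertex appears once per incident edge, so that a uniform draw from the list selects a vertex with probability proportional to its degree. Several details below are assumptions, not taken from the slide: the initial graph is a complete graph on $m_0 \ge 2$ vertices, and the $m$ targets of a new vertex are made distinct by redrawing, which slightly departs from exact proportionality. Function and variable names are hypothetical.

```python
import random
from collections import Counter

def preferential_attachment(n, m, m0, seed=None):
    rng = random.Random(seed)
    # Assumed initial graph: complete graph on vertices 0 .. m0-1.
    edges = [(u, v) for u in range(m0) for v in range(u + 1, m0)]
    # Each vertex appears in `ends` once per incident edge.
    ends = [x for edge in edges for x in edge]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:      # redraw until m distinct targets
            targets.add(rng.choice(ends))
        for t in sorted(targets):
            edges.append((t, new))
            ends += [t, new]
    return edges

edges = preferential_attachment(n=100, m=2, m0=3, seed=0)
degree = Counter(x for edge in edges for x in edge)
print(len(edges))             # 3 + 2 * 97 = 197
print(degree.most_common(3))  # the highest-degree vertices
```

The number of edges grows linearly with $n$ (about $m \cdot n$), which is why the resulting graph is sparse when $m$ is small compared to $n$.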

Scale-Free Graphs

Graphs with a power-law degree distribution are called scale-free graphs:

There is no typical scale, or typical order of magnitude, for the degree of nodes.

$$\frac{P(\alpha k)}{P(k)} = \frac{(\alpha k)^{-\gamma}}{k^{-\gamma}} = \alpha^{-\gamma}$$
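As a worked illustration (the values $\alpha = 2$ and $\gamma = 3$ are chosen only as an example), the ratio does not depend on $k$ for a power law, whereas it does for a Poisson distribution $P(k) \propto \lambda^k / k!$:

$$\text{Power law: } \frac{P(2k)}{P(k)} = 2^{-3} = \frac{1}{8} \text{ for every } k, \qquad \text{Poisson: } \frac{P(2k)}{P(k)} = \frac{\lambda^{2k} / (2k)!}{\lambda^{k} / k!} = \frac{\lambda^{k}\, k!}{(2k)!}$$

In the Poisson case the ratio changes with $k$ and collapses quickly once $k$ exceeds $\lambda$, which is what gives such a distribution a typical scale.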

To remember

What you should remember

1. Most (but not all!) real-world networks:

are sparse

have small typical distance

have strong local clustering

2. In addition, a large class of them are scale-free

3. Three simple models of networks, modeling (and explaining?) some or all of these properties:

Random graphs

Small worlds

Preferential attachment

Applications of the Models

Epidemiology

Network fault detection

Efficient search in P2P networks

...

To go further

[Watts, 1999]: an easy-to-read book describing the small world problem and small-world models, with concrete applications

[Newman et al., 2006]: a detailed survey of the most fundamental works on network theory, network models, and experiments on real-world networks

Bibliography I

L. A. Amaral, A. Scala, M. Barthelemy, and H. E. Stanley. Classes of small-world networks. PNAS, 97(21):11149–11152, October 2000.

Albert-László Barabási and Réka Albert. Emergence of scaling in random networks. Science, 286(5439):509–512, October 1999.

Andrei Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Sridhar Rajagopalan, Raymie Stata, Andrew Tomkins, and Janet Wiener. Graph structure in the web. Computer Networks, 33(1-6):309–320, 2000.

Peter Sheridan Dodds, Roby Muhamad, and Duncan J. Watts. An experimental study of search in global social networks. Science, 301(5634):827–829, August 2003.

P. Erdős and A. Rényi. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci, 5:17–61, 1960.

Bibliography II

M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the internet topology. In Proc. SIGCOMM, pages 251–262, Cambridge, USA, August 1999.

H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, and A. L. Barabasi. The large-scale organization of metabolic networks. Nature, 407(6804), 2000.

Frigyes Karinthy. Chains. In Everything is different. 1929. Translated from Hungarian by Ádám Makkai, as reproduced in [Newman et al., 2006].

F. Liljeros, C. R. Edling, L. A. N. Amaral, H. E. Stanley, and Y. Aaberg. The web of human sexual contacts. Nature, 411(6840):907–908, 2001.

Mark Newman, Albert-László Barabási, and Duncan J. Watts. The Structure and Dynamics of Networks. Princeton University Press, 2006.

Bibliography III

Ray Solomonoff and Anatol Rapoport. Connectivity of random nets. Bulletin of Mathematical Biology, 13(2):107–117, June 1951.

Jeffrey Travers and Stanley Milgram. An experimental study of the small world problem. Sociometry, 32(4), December 1969.

Duncan J. Watts. Small Worlds. Princeton University Press, 1999.

Duncan J. Watts and Steven H. Strogatz. Collective dynamics of ‘small-world’ networks. Nature, 393(6684):440–442, 1998.

Usage rights licence

Public context, with modifications

By downloading or consulting this document, the user accepts the usage licence attached to it, as detailed in the following provisions, and undertakes to comply with it in full.

The licence grants the user a right of use over the document consulted or downloaded, in whole or in part, under the conditions defined below and to the express exclusion of any commercial use.

The right of use defined by the licence authorizes use intended for any audience, which includes:

– the right to reproduce all or part of the document on digital or paper media,

– the right to distribute all or part of the document to the public on paper or digital media, including by making it available to the public on a digital network,

– the right to modify the form or presentation of the document,

– the right to integrate all or part of the document into a composite document and to distribute it in this new document, provided that:

– the author is informed.

The notices relating to the source of the document and/or its author must be kept in their entirety.

The right of use defined by the licence is personal and non-exclusive.

Any use other than those provided for by the licence is subject to the prior and express authorization of the author: sitepedago@telecom-paristech.fr
