Traffic-driven model of the World-Wide-Web Graph


A. Barrat, LPT, Orsay, France

M. Barthélemy, CEA, France

A. Vespignani, LPT, Orsay, France

(2)

Outline

The WebGraph

Some empirical characteristics

Various models

Weights and strengths

Our model:

Definition

Analysis: analytics+numerics

Conclusions

(3)

The Web as a directed graph

[Figure: pages i, j and l connected by directed hyperlinks]

nodes i: web-pages

directed links: hyperlinks

in- and out-degrees: k_in, k_out

(4)

Empirical facts

• Small world: captured by Erdős-Rényi graphs

With probability p, an edge is established between each pair of vertices

Poisson degree distribution

<k> = p N
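As a rough illustration of the Erdős-Rényi construction, the sketch below samples a graph and compares the measured average degree with p N. The function name `erdos_renyi_degrees`, the seed and the sizes are assumptions made for this example, and the exact expectation for a simple graph is p (N - 1), which is close to p N for large N.

```python
import random

def erdos_renyi_degrees(n, p, seed=0):
    """Sample G(n, p) and return the list of vertex degrees."""
    rng = random.Random(seed)
    degree = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            # each pair of vertices is linked independently with probability p
            if rng.random() < p:
                degree[u] += 1
                degree[v] += 1
    return degree

n, p = 1000, 0.01
degrees = erdos_renyi_degrees(n, p)
mean_k = sum(degrees) / n
print(f"measured <k> = {mean_k:.2f}, expected p (N - 1) = {p * (n - 1):.2f}")
```

The degrees concentrate around <k>, with no heavy tail, which is why this model captures the small-world property but not broad connectivity distributions.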

(5)

Empirical facts

• Small world

• Large clustering:

different neighbours of a node will likely know each other

[Figure: node n with neighbours 1, 2, 3; neighbours of the same node have a higher probability of being connected]

=> graph models with large clustering, e.g. Watts-Strogatz 1998

(6)

Empirical facts

• Small world

• Large clustering

• Dynamical network

• Broad connectivity distributions (Barabási-Albert 1999; Broder et al. 2000; Kumar et al. 2000; Adamic-Huberman 2001; Laura et al. 2003)

• also observed in many other contexts (from biological to social networks)

• huge activity of modeling

(7)

Various growing network models

Barabási-Albert (1999): preferential attachment

Many variations on the BA model: rewiring (Tadic 2001, Krapivsky et al. 2001), addition of edges, directed model (Dorogovtsev-Mendes 2000, Cooper-Frieze 2001), fitness (Bianconi-Barabási 2001), ...

Kumar et al. (2000): copying mechanism

Pandurangan et al. (2002): PageRank + preferential attachment

Laura et al. (2002): Multi-layer model

Menczer (2002): textual content of web-pages

(8)

The Web as a directed graph

[Figure: pages i, j and l connected by directed hyperlinks]

nodes i: web-pages

directed links: hyperlinks

Broad P(k_in); cutoff for P(k_out) (Broder et al. 2000; Kumar et al. 2000; Adamic-Huberman 2001; Laura et al. 2003)

(9)

Additional level of complexity:

Weights and Strengths

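[Figure: pages i, j and l; the link from i to j carries the weight w_ij]

Links carry weights/traffic: w_ij

In- and out-strengths: s_in(i) = sum_j w_ji, s_out(i) = sum_j w_ij

Adamic-Huberman 2001: broad distribution of s_in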
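A minimal sketch of how in- and out-strengths follow from the link weights, assuming the usual definition (sum of the weights of incoming, respectively outgoing, links). The dictionary layout and the page names are made up for this example.

```python
def strengths(weights):
    """weights: dict mapping (i, j) -> w_ij for a directed link i -> j."""
    s_in, s_out = {}, {}
    for (i, j), w in weights.items():
        s_out[i] = s_out.get(i, 0.0) + w  # traffic leaving page i
        s_in[j] = s_in.get(j, 0.0) + w    # traffic arriving at page j
    return s_in, s_out

# hypothetical toy traffic between three pages
weights = {("a", "b"): 2.0, ("a", "c"): 1.0, ("c", "b"): 0.5}
s_in, s_out = strengths(weights)
print(s_in, s_out)
```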

(10)

Model: directed network

[Figure: new node n linking to an existing node i, which links to j]

(i) Growth

(ii) Strength-driven preferential attachment (n: k_out = m outlinks)

AND...

“Busy gets busier”

(11)

Weights reinforcement mechanism

[Figure: new node n linking to node i, which links to j]

The new traffic n -> i increases the traffic i -> j

“Busy gets busier”
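The two ingredients above (growth with strength-driven preferential attachment, then weight reinforcement) can be sketched as a small simulation. Several details are assumptions of this sketch rather than statements from the slides: the seed is a complete directed graph on m + 1 pages with unit weights, a new link has weight 1, a target i is chosen with probability proportional to s_in(i) + 1 (the offset lets pages without in-links be chosen), and each new link n -> i adds delta * w_ij / s_out(i) to every outgoing weight w_ij of i. The name `grow_traffic_web` is hypothetical.

```python
import random

def grow_traffic_web(n_nodes, m=2, delta=0.5, seed=0):
    """Grow a weighted directed graph; returns (w, s_in, s_out)."""
    rng = random.Random(seed)
    n0 = m + 1
    # assumed seed: complete directed graph, unit weights, w[i][j] is link i -> j
    w = {i: {j: 1.0 for j in range(n0) if j != i} for i in range(n0)}
    s_in = [float(m)] * n0
    s_out = [float(m)] * n0
    for n in range(n0, n_nodes):
        candidates = list(range(n))
        # assumed attractiveness: in-strength plus an offset of 1
        attract = [s_in[i] + 1.0 for i in candidates]
        targets = set()
        while len(targets) < m:  # m distinct outlinks for the new page
            targets.add(rng.choices(candidates, weights=attract)[0])
        w[n] = {}
        s_in.append(0.0)
        s_out.append(0.0)
        for i in targets:
            w[n][i] = 1.0
            s_out[n] += 1.0
            s_in[i] += 1.0
            # "busy gets busier": share delta among the outlinks of i,
            # proportionally to their current weights
            total = s_out[i]
            for j in w[i]:
                inc = delta * w[i][j] / total
                w[i][j] += inc
                s_in[j] += inc
            s_out[i] += delta
    return w, s_in, s_out

w, s_in, s_out = grow_traffic_web(500, m=2, delta=0.5)
print("largest in-strength:", max(s_in))
print("largest out-strength:", max(s_out))
```

Under these assumptions every page keeps exactly m outlinks, yet its out-strength grows each time it receives a new in-link, and each added page increases the total weight by m (1 + delta).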

(12)

Evolution equations

(Continuous approximation)

Coupling term

(13)

Resolution

Ansatz

supported by numerics:

(14)

Results

(15)

Approximation

Total in-weight sum_i s_in(i): approximately proportional to the total number of in-links sum_i k_in(i), times the average weight <w> = 1 + δ

Then: A = 1 + δ

γ_{s_in} ∈ [2; 2 + 1/m]

(16)

Numerical simulations

Measure of A => prediction of γ

Approx. of γ

(17)

Numerical simulations

NB: broad P(s_out) even if k_out = m

(18)

Clustering spectrum

i.e.: fraction of connected pairs of neighbours of node i
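A short sketch of this definition and of the resulting clustering spectrum C(k), the average clustering of nodes with degree k. It assumes an undirected view of the graph (link directions ignored) stored as a dict of neighbour sets. The function names and the toy graph are illustrative only.

```python
from collections import defaultdict
from itertools import combinations

def local_clustering(adj, i):
    """Fraction of pairs of neighbours of i that are themselves linked."""
    neigh = adj[i]
    k = len(neigh)
    if k < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(neigh, 2) if b in adj[a])
    return linked / (k * (k - 1) / 2)

def clustering_spectrum(adj):
    """Average local clustering of the nodes of each degree k."""
    by_degree = defaultdict(list)
    for i in adj:
        by_degree[len(adj[i])].append(local_clustering(adj, i))
    return {k: sum(cs) / len(cs) for k, cs in by_degree.items()}

# toy graph: triangle 1-2-3 plus a pendant node 4 attached to 1
adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
spectrum = clustering_spectrum(adj)
print(spectrum)
```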

(19)

Clustering spectrum

• δ increases => clustering increases

• New pages: point to various well-known pages, often connected together => large clustering for small nodes

• Old, popular pages with large k: many in-links from many less popular pages which are not connected together => smaller clustering for large nodes

(20)

Clustering and weighted clustering

takes into account the relevance of triangles in the global traffic
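The slide's formula is not reproduced here. As one possible reading, the sketch below uses the weighted clustering coefficient of Barrat et al. (2004) for undirected weighted graphs, c_w(i) = sum over linked neighbour pairs (j, h) of (w_ij + w_ih) / (s_i (k_i - 1)). Whether the talk uses exactly this form, or a directed variant, is an assumption. The function name and the toy weights are hypothetical.

```python
from itertools import combinations

def weighted_clustering(wadj, i):
    """wadj: dict node -> {neighbour: weight}, assumed symmetric."""
    neigh = wadj[i]
    k = len(neigh)
    if k < 2:
        return 0.0
    s = sum(neigh.values())  # strength of node i
    # each closed triangle (i, j, h) contributes the weights of its two
    # links attached to i
    tri = sum(neigh[j] + neigh[h]
              for j, h in combinations(neigh, 2) if h in wadj[j])
    return tri / (s * (k - 1))

# node 1 sends most of its traffic along the links that close a triangle
wadj = {
    1: {2: 5.0, 3: 5.0, 4: 1.0},
    2: {1: 5.0, 3: 1.0},
    3: {1: 5.0, 2: 1.0},
    4: {1: 1.0},
}
c_w = weighted_clustering(wadj, 1)
print(f"weighted clustering of node 1: {c_w:.3f} (topological: {1 / 3:.3f})")
```

With equal weights this expression reduces to the topological clustering. It exceeds it when the heavy links sit on triangles, as in the toy example.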

(21)

Clustering and weighted clustering

Weighted Clustering larger than topological clustering:

triangles carry a large part of the traffic

(22)

Assortativity

Average connectivity of nearest neighbours of i
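A minimal sketch of this quantity, assuming the standard definition k_nn(i) = (1 / k_i) * sum of the degrees of the neighbours of i, on an undirected view of the graph. The star graph is a made-up example of disassortative mixing: the hub has low-degree neighbours and the leaves have a high-degree neighbour.

```python
def average_neighbour_degree(adj, i):
    """Mean degree of the nearest neighbours of node i."""
    neigh = adj[i]
    if not neigh:
        return 0.0
    return sum(len(adj[j]) for j in neigh) / len(neigh)

# star graph: hub 0 linked to leaves 1..4
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
knn_hub = average_neighbour_degree(star, 0)
knn_leaf = average_neighbour_degree(star, 1)
print(knn_hub, knn_leaf)
```

A k_nn that decreases with the degree k, as here, is the signature of disassortative behaviour.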

(23)

Assortativity

• k_nn: disassortative behaviour, as usual in growing network models, and typical of technological networks

• lack of correlations in popularity as measured by the in-degree

(24)

Summary

Web: heterogeneous topology and traffic

Mechanism taking into account interplay between topology and traffic

Simple mechanism => complex behaviour, scale-free distributions for connectivity and traffic

Analytical study possible

Study of correlations: non-trivial hierarchical behaviour

Possibility to add features (fitnesses, rewiring, addition of edges, etc...), to modify the redistribution rule...

Empirical studies of traffic and correlations?
