Traffic-driven model of the World-Wide-Web Graph

(1)

A. Barrat, LPT, Orsay, France

M. Barthélemy, CEA, France

A. Vespignani, LPT, Orsay, France

(2)

Outline

The WebGraph

Some empirical characteristics

Various models

Weights and strengths

Our model:

Definition

Analysis: analytics+numerics

Conclusions

(3)

The Web as a directed graph

[Figure: nodes i, j and l connected by directed links]

nodes i: web-pages

directed links: hyperlinks

in- and out-degrees: $k^{in}_i$ and $k^{out}_i$

(4)

Empirical facts

• Small world: captured by Erdős-Rényi graphs

An edge is established between each pair of vertices with probability p

Poisson degree distribution

<k> = p N
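As a quick check of the relation <k> = p N, here is a minimal Python sketch that samples an Erdős-Rényi graph and compares the measured mean degree with p N. The function name `erdos_renyi_degrees` and the values of `n`, `p` and `seed` are illustrative choices, not taken from the slides. Strictly, the mean degree is p (N - 1), which is indistinguishable from p N for large N.

```python
import random
from itertools import combinations

def erdos_renyi_degrees(n, p, seed=0):
    """Sample G(n, p) and return the degree of every vertex."""
    rng = random.Random(seed)
    degree = [0] * n
    for i, j in combinations(range(n), 2):
        # Each pair of vertices is linked independently with probability p.
        if rng.random() < p:
            degree[i] += 1
            degree[j] += 1
    return degree

n, p = 1000, 0.01
degree = erdos_renyi_degrees(n, p)
mean_k = sum(degree) / n
print(mean_k, p * n)  # measured <k> next to the estimate p N
```

The degrees are narrowly spread around <k>, which is exactly what the broad distributions observed on the Web contradict.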

(5)

Empirical facts

• Small world

• Large clustering: different neighbours of a node are likely to know each other

[Figure: nodes 1, 2, 3 and n; neighbours of a common node have a higher probability of being connected]

=> graph models with large clustering, e.g. Watts-Strogatz 1998

(6)

Empirical facts

• Small world

• Large clustering

• Dynamical network

• Broad connectivity distributions

• also observed in many other contexts (from biological to social networks)

• intense modelling activity

(Barabási-Albert 1999; Broder et al. 2000; Kumar et al. 2000; Adamic-Huberman 2001; Laura et al. 2003)

(7)

Various growing network models

Barabási-Albert (1999): preferential attachment

Many variations on the BA model: rewiring (Tadić 2001, Krapivsky et al. 2001), addition of edges, directed model (Dorogovtsev-Mendes 2000, Cooper-Frieze 2001), fitness (Bianconi-Barabási 2001), ...

Kumar et al. (2000): copying mechanism

Pandurangan et al. (2002): PageRank + preferential attachment

Laura et al. (2002): multi-layer model

Menczer (2002): textual content of web-pages

(8)

The Web as a directed graph

[Figure: nodes i, j and l connected by directed links]

nodes i: web-pages

directed links: hyperlinks

Broad $P(k^{in})$; cut-off for $P(k^{out})$

(Broder et al. 2000; Kumar et al. 2000; Adamic-Huberman 2001; Laura et al. 2003)

(9)

Additional level of complexity:

Weights and Strengths

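[Figure: nodes i, j and l connected by directed, weighted links]

Links carry weights/traffic: $w_{ij}$

In- and out-strengths: $s^{in}_i = \sum_j w_{ji}$, $s^{out}_i = \sum_j w_{ij}$

Adamic-Huberman 2001: broad distribution of $s^{in}$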
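To make the notion of strength concrete, here is a small Python sketch that sums link weights into in- and out-strengths. It assumes the usual definitions (in-strength = total weight arriving at a page, out-strength = total weight leaving it). The `strengths` helper and the toy traffic values are hypothetical.

```python
from collections import defaultdict

def strengths(weights):
    """Return (s_in, s_out) for a dict mapping directed links (i, j) to w_ij."""
    s_in, s_out = defaultdict(float), defaultdict(float)
    for (i, j), w in weights.items():
        s_out[i] += w  # traffic leaving page i
        s_in[j] += w   # traffic arriving at page j
    return dict(s_in), dict(s_out)

# Hypothetical toy traffic: key (i, j) means page i links to page j.
weights = {("a", "b"): 2.0, ("a", "c"): 1.0, ("c", "b"): 0.5}
s_in, s_out = strengths(weights)
print(s_in)   # {'b': 2.5, 'c': 1.0}
print(s_out)  # {'a': 3.0, 'c': 0.5}
```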

(10)

Model: directed network

[Figure: new node n linking to node i, which links to node j]

(i) Growth

(ii) Strength-driven preferential attachment (n: $k^{out} = m$ outlinks)

AND...

“Busy gets busier”

(11)

Weights reinforcement mechanism

[Figure: new node n linking to node i, which links to node j]

The new traffic n→i increases the traffic i→j

“Busy gets busier”
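The growth, attachment and reinforcement rules can be sketched as a short simulation. The formulas did not survive on these slides, so the sketch rests on explicit assumptions: each new link has weight 1, a target is chosen with probability increasing with its in-strength, and each attachment to page i adds a total extra traffic `delta` to the out-links of i, shared in proportion to their current weights. The offset `s0` (so that brand-new pages can be selected at all), the fully connected seed of m + 1 pages, and all names are illustrative choices, not the authors' specification.

```python
import random

def simulate(n_nodes, m=2, delta=0.5, s0=1.0, seed=0):
    """Grow a weighted directed graph; returns (w, s_in, s_out)."""
    rng = random.Random(seed)
    seed_nodes = range(m + 1)
    # Assumed seed: m + 1 pages, each linking to the m others with weight 1.
    w = {(i, j): 1.0 for i in seed_nodes for j in seed_nodes if i != j}
    out_links = {i: [j for j in seed_nodes if j != i] for i in seed_nodes}
    s_in = {i: float(m) for i in seed_nodes}
    s_out = {i: float(m) for i in seed_nodes}

    for n in range(m + 1, n_nodes):
        nodes = list(s_in)
        # Strength-driven attachment: chance of being chosen grows with s_in.
        attract = [s_in[i] + s0 for i in nodes]
        targets = set()
        while len(targets) < m:
            targets.add(rng.choices(nodes, weights=attract)[0])

        out_links[n], s_in[n], s_out[n] = [], 0.0, 0.0
        for i in targets:
            # Growth: new link n -> i with unit weight.
            w[(n, i)] = 1.0
            out_links[n].append(i)
            s_out[n] += 1.0
            s_in[i] += 1.0
            # Reinforcement ("busy gets busier"): extra traffic delta is
            # spread over i's out-links in proportion to their weight.
            total = s_out[i]
            for j in out_links[i]:
                extra = delta * w[(i, j)] / total
                w[(i, j)] += extra
                s_in[j] += extra
            s_out[i] += delta
    return w, s_in, s_out

w, s_in, s_out = simulate(500, m=2, delta=0.5)
print(sum(w.values()) / len(w))  # average link weight, close to 1 + delta
```

Under these assumptions every page keeps exactly m out-links, yet its out-strength grows each time it is chosen, while in-degree and in-strength grow together.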

(12)

Evolution equations

(Continuous approximation)

Coupling term

(13)

Resolution

Ansatz: $s^{in}_i = A\,k^{in}_i$

supported by numerics

(14)

Results

(15)

Approximation

Total in-weight $\sum_i s^{in}_i$: approximately proportional to the total number of in-links $\sum_i k^{in}_i$, times average weight $\langle w \rangle = 1 + \delta$

Then: $A = 1 + \delta$

$\gamma_{s^{in}} \in [2, 2 + 1/m]$
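The counting behind this approximation can be written out in a few lines. This is a sketch under assumptions not spelled out in the text: each new page brings m links of weight 1, and each of these m attachments adds a total extra weight (written $\delta$ here) to the out-links of the chosen page. After t pages have been added:

```latex
\begin{align*}
\sum_i k^{in}_i(t) &\simeq m\,t, \\
\sum_i s^{in}_i(t) &\simeq m\,(1+\delta)\,t, \\
\langle w \rangle &= \frac{\sum_i s^{in}_i}{\sum_i k^{in}_i} \simeq 1+\delta .
\end{align*}
```

If the in-strength of every page is proportional to its in-degree with the same constant A, summing over all pages gives $A = \langle w \rangle$, which is the value quoted for A.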

(16)

Numerical simulations

Measure of A => prediction of $\gamma$

Approx. of $\gamma$

(17)

Numerical simulations

NB: broad $P(s^{out})$ even if $k^{out} = m$

(18)

Clustering spectrum

i.e.: fraction of connected couples of neighbours of node i
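A minimal Python sketch of this definition, and of the spectrum obtained by averaging over nodes of equal degree, is given below. It assumes link direction is ignored when counting neighbours. The function names and the toy graph are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def clustering(neighbours, i):
    """Fraction of pairs of neighbours of i that are themselves linked."""
    nbrs = neighbours[i]
    k = len(nbrs)
    if k < 2:
        return 0.0  # no pair of neighbours to test
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in neighbours[a])
    return linked / (k * (k - 1) / 2)

def clustering_spectrum(neighbours):
    """Average clustering of the nodes of each degree k."""
    by_degree = defaultdict(list)
    for i, nbrs in neighbours.items():
        by_degree[len(nbrs)].append(clustering(neighbours, i))
    return {k: sum(cs) / len(cs) for k, cs in sorted(by_degree.items())}

# Hypothetical toy graph given as undirected neighbour sets.
neighbours = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
spectrum = clustering_spectrum(neighbours)
print(spectrum)
```

In the toy graph, node 1 has three neighbours of which only the pair (2, 3) is linked, so its clustering is 1/3, while nodes 2 and 3 sit in a closed triangle and have clustering 1.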

(19)

Clustering spectrum

• $\delta$ increases => clustering increases

• New pages: point to various well-known pages, often connected together => large clustering for small nodes

• Old, popular pages with large k: many in-links from many less popular pages which are not connected together

=> smaller clustering for large nodes

(20)

Clustering and weighted clustering

takes into account the relevance of triangles in the global traffic
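The formula for the weighted clustering is not reproduced in the text, so the sketch below assumes one definition commonly used for weighted networks: each closed triangle at node i counts with the average weight of the two links joining i to the triangle, normalised by $s_i (k_i - 1)$. It further assumes symmetric weights (direction ignored). The names and toy values are hypothetical.

```python
from itertools import combinations

def weighted_clustering(w, i):
    """Weighted clustering of i; w[a][b] is the symmetric weight of link a-b."""
    nbrs = w[i]
    k, s = len(nbrs), sum(nbrs.values())
    if k < 2:
        return 0.0
    # Each closed triangle (i, j, h) contributes w_ij + w_ih.
    total = sum(w[i][j] + w[i][h]
                for j, h in combinations(nbrs, 2) if h in w[j])
    return total / (s * (k - 1))

# Hypothetical toy graph: the triangle 1-2-3 carries most of node 1's traffic.
w_toy = {
    1: {2: 3.0, 3: 3.0, 4: 1.0},
    2: {1: 3.0, 3: 1.0},
    3: {1: 3.0, 2: 1.0},
    4: {1: 1.0},
}
print(weighted_clustering(w_toy, 1))  # 3/7, above the topological value 1/3
```

With equal weights on all links this expression reduces to the topological clustering, so any excess signals that triangles carry a disproportionate share of the traffic.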

(21)

Clustering and weighted clustering

Weighted Clustering larger than topological clustering:

triangles carry a large part of the traffic

(22)

Assortativity

Average connectivity of nearest neighbours of i
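As an illustration, the sketch below computes the average degree of the nearest neighbours of each node and groups it by degree. It assumes link direction is ignored. The function names and the star-shaped toy graph are hypothetical.

```python
from collections import defaultdict

def knn(neighbours, i):
    """Average degree of the nearest neighbours of node i."""
    nbrs = neighbours[i]
    if not nbrs:
        return 0.0
    return sum(len(neighbours[j]) for j in nbrs) / len(nbrs)

def knn_spectrum(neighbours):
    """Average of knn over the nodes of each degree k."""
    by_degree = defaultdict(list)
    for i, nbrs in neighbours.items():
        by_degree[len(nbrs)].append(knn(neighbours, i))
    return {k: sum(v) / len(v) for k, v in sorted(by_degree.items())}

# Hypothetical star graph: hub 0 linked to four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(knn_spectrum(star))  # {1: 4.0, 4: 1.0}
```

A spectrum that decreases with k, as in this star, is the signature of disassortative mixing: high-degree nodes are mostly linked to low-degree ones.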

(23)

Assortativity

• $k_{nn}$: disassortative behaviour, as usual in growing network models and typical of technological networks

• lack of correlations in popularity as measured by the in-degree

(24)

Summary

Web: heterogeneous topology and traffic

Mechanism taking into account interplay between topology and traffic

Simple mechanism => complex behaviour: scale-free distributions for connectivity and traffic

Analytical study possible

Study of correlations: non-trivial hierarchical behaviour

Possibility to add features (fitnesses, rewiring, addition of edges, etc.) or to modify the redistribution rule

Empirical studies of traffic and correlations?
