• Aucun résultat trouvé

Maximal Flow / Minimal Cut

N/A
N/A
Protected

Academic year: 2022

Partager "Maximal Flow / Minimal Cut"

Copied!
16
0
0

Texte intégral

(1)

Pierre Senellart

(2)

What is a logical website?

2. Flow Simulation 3. Seed Extension 4. GEMO website 5. Conclusion

Simple idea: website = webserver But:

Some websites span over several webservers (e.g.

www-rocq.inria.fr, osage.inria.fr. . . ).

Some webservers contain different websites (e.g.

perso.wanadoo.fr).

Some websites may be split into different websites (e.g. INRIA website).

Limits of a website: a subjective notion

(3)

Talk outline

2. Flow Simulation 3. Seed Extension 4. GEMO website 5. Conclusion

1. Introduction

2. Flow Simulation

X Maximal Flow / Minimal Cut X Preflow-Push algorithm

X Adaptation to the Web 3. Seed Extension

X Markov CLustering algorithm X Flow simulation from MCL 4. GEMO website

X Experiments X Results

5. Conclusion

(4)

Maximal Flow / Minimal Cut

2. Flow Simulation

XMaxFlow / MinCut 3. Seed Extension

4. GEMO website 5. Conclusion

Transport network

Maximum flow ≡ Minimal cut

/6 /2

/1 /5 /2

/3

sink source

/4

(5)

Maximal Flow / Minimal Cut

2. Flow Simulation

XMaxFlow / MinCut 3. Seed Extension

4. GEMO website 5. Conclusion

Transport network

Maximum flow ≡ Minimal cut

/6 /2

/1 /5 /2

/3

sink source

4 0

4 2

1 /4 4

1

(6)

Maximal Flow / Minimal Cut

2. Flow Simulation

XMaxFlow / MinCut 3. Seed Extension

4. GEMO website 5. Conclusion

Transport network

Maximum flow ≡ Minimal cut

/6 /2

/1 /5 /2

/3

sink source

4 0

4 2

1 /4 4

1

(7)

Preflow-Push algorithm

2. Flow Simulation XPreflow-Push 3. Seed Extension 4. GEMO website 5. Conclusion

1. All nodes are assigned a height h: h(source) = N,

∀k 6= source, h(k) = 0 (N is the number of nodes) 2. Nodes with an overflow are visited, in any order.

If possible, the flow is pushed toward a lower node.

Capacities of edges are respected.

Otherwise, the node is heigtened.

Theorem

The process converges, whatever the sequence of

visited nodes may be. The maximal flow is obtained at the limit.

(8)

Adaptation to the Web

2. Flow Simulation

XAdaptation to the Web 3. Seed Extension

4. GEMO website 5. Conclusion

Website: nodes of a flow network delimited by a MaxFlow / MinCut.

Nodes: webpages, progressively crawled Edges: hyperlinks

Capacities: edition distance between URLs

A virtual source, pointing to a seed of pages with infinite capacity edges

A virtual sink, pointed by all nodes with very low capacity edges

(9)

Markov CLustering algorithm

2. Flow Simulation 3. Seed Extension

XMCL

4. GEMO website 5. Conclusion

An off-line Graph Clustering Algorithm based on flow simulation

Alternation between expansion (multiplication) and inflation (rescaling) of a stochastic matrix

Work (almost) only on undirected graphs

Complexity: O(n) (but with a high multiplying factor)

(10)

Markov CLustering algorithm

2. Flow Simulation 3. Seed Extension

XMCL

4. GEMO website 5. Conclusion

An off-line Graph Clustering Algorithm based on flow simulation

Alternation between expansion (multiplication) and inflation (rescaling) of a stochastic matrix

Work (almost) only on undirected graphs

Complexity: O(n) (but with a high multiplying factor)

(11)

Flow Simulation from MCL clusters

2. Flow Simulation 3. Seed Extension XF.S. from MCL 4. GEMO website 5. Conclusion

Process:

1. MCL Clustering of a large relevant portion of the graph 2. Identification of the most relevant cluster

3. Flow Simulation starting from this cluster

Advantages over pure MCL:

Dynamic discovery of clusters

Use the fact that the graph is directed

(12)

Experiments

2. Flow Simulation 3. Seed Extension 4. GEMO website

XExperiments 5. Conclusion

Crawl of a large part of *.inria.fr/* 72 hours MCL clustering of the obtained graph 24 hours Identification of the GEMO cluster < 1s Flow Simulation from this cluster 3 hours

87, 140 webpages in the graph.

276 clusters.

(13)

Results

2. Flow Simulation

3. Seed Extension Results 4. GEMO website

XResults 5. Conclusion

Size Precision Recall THESUS

Flow Simulation 8 87.5 1.3 xml

web

MCL 320 99.7 33.0 gemo

report MCL + Flow Simulation 788 90.4 86.4 gemo

report www-rocq.inria.fr/

verso/* 221 100.0 44.4 diapositive texte

{www-rocq,osage}.

inria.fr/verso/* 683 100.0 68.6 report

diapositive

(14)

In Brief

2. Flow Simulation 3. Seed Extension 4. GEMO website 5. Conclusion

XIn Brief

Website: a subjective, not obvious notion

Flow Simulation used to discover the limits of a website

Best results given by a combination of an off-line graph clustering and an on-line flow simulation

(15)

Future works

2. Flow Simulation 3. Seed Extension 4. GEMO website 5. Conclusion

XFuture works

Optimization, scaling

On-line computing of MCL?

Hierarchical clustering

Application of the Preflow-Push algorithm to peer-to-peer networks

(16)

Future works

2. Flow Simulation 3. Seed Extension 4. GEMO website 5. Conclusion

XFuture works

Optimization, scaling

On-line computing of MCL?

Hierarchical clustering

Application of the Preflow-Push algorithm to peer-to-peer networks

T

HANK YOU FOR YOUR ATTENTION

!

Références

Documents relatifs

In this paper, we analyze present situation of the Taxi Company which we research, research optimum arrange- ment of taxi drivers’ working hours so as to increase

13v, MS M.370, The Morgan Library & Museum / Right: Roger de Rabutin, comte de Bussy, attributed to Juste d’Egmont Château de Bussy-Rabutin.. The sequence of the miniatures suggests

Nous ne constatons pas de différence entre les soignants en 7h45-10h et 12h concernant la perception de la demande psychologique au travail.. Quels que soient leurs horaires de

On her deathbed, Ms Swinton asks Thomas to take her position as a schoolteacher in Swamp Creek.. Imagine his answer to Ms Swinton, with arguments justifying his acceptance or refusal

One cold day of January 1968, Claire Lucas receives the visit of two army officials, who inform her that Jimmy has been reported “missing in action”.... I set the kerosene lamp down

hopes—' Jetzt habe ich gute Hoffnung.' Another hour brought us to a place where the gradient slackened suddenly. The real work was done, and ten minutes further wading through

Now, when firms are allowed to adjust their labor input through the hours margin, a new worker reduces hours worked of the firm’s other employees by shifting production from

With a concave production function, the firm can reduce the marginal product by hiring an additional worker, thereby reducing the bargaining wage paid to all existing employees.. We