
LP-rounding-based algorithm

From the document Springer-Verlag Berlin Heidelberg GmbH (pages 182-187)


20 Multicut in General Graphs


This naturally raises the question of whether the ratio of minimum multicut to maximum multicommodity flow is bounded. Equivalently, is the integrality gap of LP (20.2) bounded? In the next section we present an algorithm for finding a multicut within an $O(\log k)$ factor of the maximum flow, thereby showing that the gap is bounded by $O(\log k)$.

20.2 LP-rounding-based algorithm

First notice that the dual program (20.2) can be solved in polynomial time using the ellipsoid algorithm, since there is a simple way of obtaining a separation oracle for it: simply compute the length of a minimum $s_i$-$t_i$ path, for each commodity $i$, w.r.t. the current distance labels. If all these lengths are $\ge 1$, we have a feasible solution. Otherwise, the shortest such path provides a violated inequality. Alternatively, the LP obtained in Exercise 20.1 can be solved in polynomial time. Let $d_e$ be the distance label computed for each edge $e$, and let $F = \sum_{e \in E} c_e d_e$ be the value of this fractional multicut. Since the optimal fractional multicut is the most cost-effective way of disconnecting all source-sink pairs, edges with large distance labels are more important than those with small distance labels for this purpose. The algorithm described below indirectly gives preference to edges with large distance labels.
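The separation oracle described above can be sketched as follows. This is only an illustration under stated assumptions: the graph is undirected and given as an edge list, the labels `d` are nonnegative, and the names `shortest_path`, `separation_oracle`, and the tolerance `tol` are hypothetical and not part of the text.

```python
import heapq

def shortest_path(n, adj, src, dst):
    """Dijkstra from src; returns (length, edge ids) of a shortest src-dst path."""
    dist = [float("inf")] * n
    prev = [None] * n          # prev[v] = (previous vertex, edge id used)
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        du, u = heapq.heappop(heap)
        if du > dist[u]:
            continue           # stale heap entry
        for v, eid, length in adj[u]:
            nd = du + length
            if nd < dist[v]:
                dist[v] = nd
                prev[v] = (u, eid)
                heapq.heappush(heap, (nd, v))
    path, v = [], dst
    while v != src and prev[v] is not None:
        u, eid = prev[v]
        path.append(eid)
        v = u
    return dist[dst], path[::-1]

def separation_oracle(n, edges, d, pairs, tol=1e-9):
    """Return None if every s_i-t_i shortest path has length >= 1 under the
    labels d; otherwise return (i, edge ids of the shortest violating path)."""
    adj = [[] for _ in range(n)]
    for eid, (u, v) in enumerate(edges):
        adj[u].append((v, eid, d[eid]))
        adj[v].append((u, eid, d[eid]))
    worst = None
    for i, (s, t) in enumerate(pairs):
        length, path = shortest_path(n, adj, s, t)
        if length < 1 - tol and (worst is None or length < worst[0]):
            worst = (length, i, path)
    return None if worst is None else (worst[1], worst[2])
```

The returned edge ids identify the violated inequality: the labels on that path sum to less than 1.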

The algorithm will work on graph $G = (V, E)$ with edge lengths given by $d_e$. The weight of edge $e$ is defined to be $c_e d_e$. Let $\mathrm{dist}(u, v)$ denote the length of the shortest path from $u$ to $v$ in this graph. For a set of vertices $S \subset V$, $\delta(S)$ denotes the set of edges in the cut $(S, \bar{S})$, $c(S)$ denotes the capacity of this cut, i.e., the total capacity of edges in $\delta(S)$, and $\mathrm{wt}(S)$ denotes the weight of set $S$, which is roughly the sum of weights of all edges having both endpoints in $S$ (a more precise definition is given below).

The algorithm will find disjoint sets of vertices, $S_1, \ldots, S_l$, $l \le k$, in $G$,


20.2.1 Growing a region: the continuous process

The sets $S_1, \ldots, S_l$ are found through a region growing process. Let us first present a continuous process to clarify the issues. For the sake of time efficiency, the algorithm itself will use a discrete process (see Section 20.2.2).

Each region is found by growing a set starting from one vertex, which is the source or sink of a pair. This will be called the root of the region. Suppose the root is $s_1$. The process consists of growing a ball around the root. For each radius $r$, define $S(r)$ to be the set of vertices at a distance $\le r$ from $s_1$, i.e., $S(r) = \{v \mid \mathrm{dist}(s_1, v) \le r\}$. Thus $S(0) = \{s_1\}$, and as $r$ increases continuously from 0, at discrete points, $S(r)$ grows by adding vertices in increasing order of their distance from $s_1$.

Lemma 20.2 If the region growing process is terminated before the radius becomes 1/2, then the set $S$ that is found contains no source-sink pair.

Proof: The distance between any pair of vertices in $S(r)$ is $\le 2r$, which is less than 1 when $r < 1/2$. Since for each commodity $i$, $\mathrm{dist}(s_i, t_i) \ge 1$, the lemma follows. □

For technical reasons that will become clear in Lemma 20.3 (see also Exercises 20.5 and 20.6), we will assign a weight to the root, $\mathrm{wt}(s_1) = F/k$.

The weight of $S(r)$ is the sum of $\mathrm{wt}(s_1)$ and the sum of the weights of edges, or parts of edges, in the ball of radius $r$ around $s_1$. Let us state this formally.

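For edges $e$ having at least one endpoint in $S(r)$, let $q_e$ denote the fraction of edge $e$ that is in $S(r)$. If both endpoints of $e$ are in $S(r)$, then $q_e = 1$. Otherwise, suppose $e = (u, v)$ with $u \in S(r)$ and $v \notin S(r)$. For such edges,

$$q_e = \frac{r - \mathrm{dist}(s_1, u)}{\mathrm{dist}(s_1, v) - \mathrm{dist}(s_1, u)}.$$

Define the weight of region $S(r)$,

$$\mathrm{wt}(S(r)) = \mathrm{wt}(s_1) + \sum_e c_e d_e q_e,$$

where the sum is over all edges having at least one endpoint in $S(r)$.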
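As a small illustration of this definition, the following sketch evaluates the continuous weight of the ball of radius $r$, taking the fraction of a cut edge $(u, v)$ inside the ball to be $(r - \mathrm{dist}(s_1, u)) / (\mathrm{dist}(s_1, v) - \mathrm{dist}(s_1, u))$. The function name `continuous_weight` and its argument layout are hypothetical, and the distances from the root are assumed to be precomputed.

```python
def continuous_weight(edges, cap, d, dist, r, root_weight):
    """Weight of the ball of radius r: root weight plus c_e * d_e * q_e over
    all edges with at least one endpoint in the ball.

    edges: list of (u, v); cap, d: per-edge capacity and distance label;
    dist[v]: precomputed shortest-path distance from the root to v.
    """
    total = root_weight
    for (u, v), c, de in zip(edges, cap, d):
        if dist[u] > dist[v]:
            u, v = v, u                    # let u be the endpoint nearer the root
        if dist[u] > r:
            continue                       # edge lies entirely outside the ball
        if dist[v] <= r:
            q = 1.0                        # both endpoints inside the ball
        else:
            q = (r - dist[u]) / (dist[v] - dist[u])   # fraction inside the ball
        total += c * de * q
    return total
```

For example, on the path 0-1-2 with capacities 1 and 2, both labels 0.5, and root weight 1.5, the weight is 1.75 at radius 0.25 and 2.5 at radius 0.75.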

We want to fix $\varepsilon$ so that we can guarantee that we will encounter the condition $c(S(r)) \le \varepsilon\,\mathrm{wt}(S(r))$ for $r < 1/2$. The important observation is that at each point the rate at which the weight of the region is growing is at least $c(S(r))$. Until this condition is encountered,

$$d\,\mathrm{wt}(S(r)) \ge c(S(r))\,dr > \varepsilon\,\mathrm{wt}(S(r))\,dr.$$

Exercise 20.5 will help the reader gain some understanding of such a process.

Lemma 20.3 Picking $\varepsilon = 2\ln(k + 1)$ suffices to ensure that the condition $c(S(r)) \le \varepsilon\,\mathrm{wt}(S(r))$ will be encountered before the radius becomes 1/2.

Proof: The proof is by contradiction. Suppose that throughout the region growing process, starting with $r = 0$ and ending at $r = 1/2$, $c(S(r)) > \varepsilon\,\mathrm{wt}(S(r))$. At any point the incremental change in the weight of the region is

$$d\,\mathrm{wt}(S(r)) = \sum_e c_e d_e\,dq_e.$$

Clearly, only edges having one endpoint in S(r) will contribute to the sum.

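Consider such an edge $e = (u, v)$ such that $u \in S(r)$ and $v \notin S(r)$. Then,

$$c_e d_e\,dq_e = c_e \frac{d_e}{\mathrm{dist}(s_1, v) - \mathrm{dist}(s_1, u)}\,dr.$$

Since $\mathrm{dist}(s_1, v) \le \mathrm{dist}(s_1, u) + d_e$, we get $d_e \ge \mathrm{dist}(s_1, v) - \mathrm{dist}(s_1, u)$, and hence $c_e d_e\,dq_e \ge c_e\,dr$. This gives

$$d\,\mathrm{wt}(S(r)) \ge c(S(r))\,dr > \varepsilon\,\mathrm{wt}(S(r))\,dr.$$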

As long as the terminating condition is not encountered, the weight of the region increases exponentially with the radius. The initial weight of the region is $F/k$ and the final weight is at most $F + F/k$. Integrating we get

$$\int_{F/k}^{F + F/k} \frac{1}{\mathrm{wt}(S(r))}\,d\,\mathrm{wt}(S(r)) > \int_0^{1/2} \varepsilon\,dr.$$

Therefore, $\ln(k + 1) > \frac{1}{2}\varepsilon$. However, this contradicts the assumption that $\varepsilon = 2\ln(k + 1)$, thus proving the lemma. □
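For readers who want the integration step spelled out, here is one way to evaluate both sides, writing $w$ for the weight of the region (a shorthand introduced only for this display). The left side is a logarithm of the ratio of the largest possible final weight to the initial weight, and the right side is the integral of a constant over an interval of length 1/2:

$$\int_{F/k}^{F + F/k} \frac{dw}{w} = \ln\frac{F + F/k}{F/k} = \ln(k + 1), \qquad \int_0^{1/2} \varepsilon\,dr = \frac{\varepsilon}{2}.$$

With $\varepsilon = 2\ln(k + 1)$ the right side equals $\ln(k + 1)$, so the strict inequality would read $\ln(k + 1) > \ln(k + 1)$, which is the contradiction.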

20.2.2 The discrete process

The discrete process starts with $S = \{s_1\}$ and adds vertices to $S$ in increasing order of their distance from $s_1$. Essentially, it involves executing a shortest path computation from the root. Clearly, the sets of vertices found by both processes are the same.

The weight of region $S$ is redefined for the discrete process as follows:

$$\mathrm{wt}(S) = \mathrm{wt}(s_1) + \sum_e c_e d_e,$$

where the sum is over all edges that have at least one endpoint in $S$, and $\mathrm{wt}(s_1) = F/k$. The discrete process stops at the first point when $c(S) \le \varepsilon\,\mathrm{wt}(S)$, where $\varepsilon$ is again $2\ln(k + 1)$. Notice that for the same set $S$, $\mathrm{wt}(S)$ in the discrete process is at least as large as that in the continuous process.


Therefore, the discrete process cannot terminate with a larger set than that found by the continuous process. Hence, the set S found contains no source-sink pair.
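A minimal sketch of the discrete process, assuming an undirected graph without self-loops given as an edge list, is shown below. The function name `grow_region` and its parameters are hypothetical. It runs a Dijkstra-style scan from the root, counts every edge with at least one endpoint in the region at its full weight, and stops as soon as the cut capacity is at most the parameter `eps` times the region weight.

```python
import heapq
import math

def grow_region(n, edges, cap, d, root, eps, root_weight):
    """Add vertices in increasing order of distance from root and stop at the
    first point where c(S) <= eps * wt(S). Returns (S, cut edge ids)."""
    adj = [[] for _ in range(n)]
    for eid, (u, v) in enumerate(edges):
        adj[u].append((v, eid))
        adj[v].append((u, eid))
    dist = {root: 0.0}
    heap = [(0.0, root)]
    S = set()
    cut_cap = 0.0            # c(S): capacity of edges leaving S
    weight = root_weight     # wt(S): root weight + c_e*d_e of edges touching S
    while heap:
        du, u = heapq.heappop(heap)
        if u in S:
            continue         # stale heap entry
        S.add(u)
        for v, eid in adj[u]:
            if v in S:
                cut_cap -= cap[eid]          # edge is now internal to S
            else:
                cut_cap += cap[eid]          # edge newly crosses the cut
                weight += cap[eid] * d[eid]  # first endpoint has entered S
                nd = du + d[eid]
                if nd < dist.get(v, math.inf):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        if cut_cap <= eps * weight:
            break
    cut = sorted(eid for eid, (u, v) in enumerate(edges) if (u in S) != (v in S))
    return S, cut
```

On the path 0-1-2-3 with capacities 5, 1, 5, labels 0, 1, 0, a single pair (so $F = 1$ and the root weight is 1) and `eps = 2 * math.log(2)`, the region grown from vertex 0 is {0, 1} and the cut is the middle edge.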

20.2.3 Finding successive regions

The first region is found in graph $G$, starting with any one of the sources as the root. Successive regions are found iteratively. Let $G_1 = G$ and $S_1$ be the region found in $G_1$. Consider a general point in the algorithm when regions $S_1, \ldots, S_{i-1}$ have already been found. Now, $G_i$ is defined to be the graph obtained by removing vertices $S_1 \cup \ldots \cup S_{i-1}$, together with all edges incident at them, from $G$.

If $G_i$ does not contain a source-sink pair, we are done. Otherwise, we pick the source of such a pair, say $s_j$, as the root, define its weight to be $F/k$, and grow a region in $G_i$. All definitions, such as distance and weight, are w.r.t. graph $G_i$. We will denote these with a subscript of $G_i$. Also, for a set of vertices $S$ in $G_i$, $c_{G_i}(S)$ will denote the total capacity of edges incident at $S$ in $G_i$, i.e., the total capacity of edges in $\delta_{G_i}(S)$. As before, the value of $\varepsilon$ is $2\ln(k + 1)$, and the terminating condition is $c_{G_i}(S_i) \le \varepsilon\,\mathrm{wt}_{G_i}(S_i)$. Notice that in each iteration the root is the only vertex that is defined to have nonzero weight.

In this manner, we will find regions $S_1, \ldots, S_l$, $l \le k$, and will output the set $M = \delta_{G_1}(S_1) \cup \ldots \cup \delta_{G_l}(S_l)$. Since edges of each cut are removed from the graph for successive iterations, the sets in this union are disjoint, and $c(M) = \sum_i c_{G_i}(S_i)$.

The algorithm is summarized below. Notice that while a region is growing, edges with large distance labels will remain in its cut for a longer time, and thus are more likely to be included in the multicut found. (Of course, the precise time that an edge remains in the cut is given by the difference between the distances from the root to the two endpoints of the edge.) As promised, the algorithm indirectly gives preference to edges with large distance labels.

Algorithm 20.4 (Minimum multicut)

1. Find an optimal solution to the LP (20.2), thus obtaining distance labels for edges of $G$.

2. $\varepsilon \leftarrow 2\ln(k + 1)$, $H \leftarrow G$, $M \leftarrow \emptyset$;

3. While $\exists$ a source-sink pair in $H$ do:

   Pick such a source, say $s_j$;

   Grow a region $S$ with root $s_j$ until $c_H(S) \le \varepsilon\,\mathrm{wt}_H(S)$;

   $M \leftarrow M \cup \delta_H(S)$;

   $H \leftarrow H$ with vertices of $S$ removed;

4. Output $M$.
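Steps 2 to 4 of the algorithm can be sketched as follows, assuming the distance labels from step 1 are already available (no LP solver is shown). The names `multicut` and `grow_region`, the edge-list representation, and the `alive` set standing for the current graph $H$ are all assumptions made for this illustration. The graph is taken to be undirected, without self-loops.

```python
import heapq
import math

def grow_region(adj, cap, d, alive, root, eps, root_weight):
    """Discrete region growing inside the subgraph induced by `alive`."""
    dist = {root: 0.0}
    heap = [(0.0, root)]
    S = set()
    cut_cap, weight = 0.0, root_weight
    while heap:
        du, u = heapq.heappop(heap)
        if u in S:
            continue                         # stale heap entry
        S.add(u)
        for v, eid in adj[u]:
            if v not in alive:
                continue                     # edge was removed with an earlier region
            if v in S:
                cut_cap -= cap[eid]          # edge became internal
            else:
                cut_cap += cap[eid]
                weight += cap[eid] * d[eid]
                nd = du + d[eid]
                if nd < dist.get(v, math.inf):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        if cut_cap <= eps * weight:          # terminating condition
            break
    return S

def multicut(n, edges, cap, d, pairs):
    """Return sorted edge ids of the multicut M, given distance labels d."""
    k = len(pairs)
    F = sum(c * de for c, de in zip(cap, d))   # value of the fractional multicut
    eps = 2 * math.log(k + 1)
    adj = [[] for _ in range(n)]
    for eid, (u, v) in enumerate(edges):
        adj[u].append((v, eid))
        adj[v].append((u, eid))
    alive = set(range(n))                      # vertices of the current graph H
    M = set()
    while True:
        # pick the source of any pair whose endpoints are both still in H
        root = next((s for s, t in pairs if s in alive and t in alive), None)
        if root is None:
            break
        S = grow_region(adj, cap, d, alive, root, eps, F / k)
        for u in S:
            for v, eid in adj[u]:
                if v in alive and v not in S:
                    M.add(eid)                 # edge in the cut of S within H
        alive -= S                             # H <- H with vertices of S removed
    return sorted(M)
```

For instance, on the path 0-1-2-3 with capacities 5, 1, 5, labels 0, 1, 0 and the single pair (0, 3), the sketch returns only the middle edge.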

Lemma 20.5 The set M found is a multicut.

Proof: We need to prove that no region contains a source-sink pair. In each iteration $i$, the sum of weights of edges of the graph and the weight defined on the current root is bounded by $F + F/k$. By the proof of Lemma 20.3, the continuous region growing process is guaranteed to encounter the terminating condition before the radius of the region becomes 1/2. Therefore, the distance between a pair of vertices in the region, $S_i$, found by the discrete process is also less than 1. Notice that we had defined these distances w.r.t. graph $G_i$. Since $G_i$ is a subgraph of $G$, the distance between a pair of vertices in $G$ cannot be larger than that in $G_i$. Hence, $S_i$ contains no source-sink pair. □

Lemma 20.6 $c(M) \le 2\varepsilon F = 4\ln(k + 1)F$.

Proof: In each iteration $i$, by the terminating condition we have $c_{G_i}(S_i) \le \varepsilon\,\mathrm{wt}_{G_i}(S_i)$. Since all edges contributing to $\mathrm{wt}_{G_i}(S_i)$ will be removed from the graph after this iteration, each edge of $G$ contributes to the weight of at most one region. The total weight of all edges of $G$ is $F$. Since each iteration helps disconnect at least one source-sink pair, the number of iterations is bounded by $k$. Therefore, the total weight attributed to source vertices is at most $F$. Summing gives:

$$c(M) = \sum_i c_{G_i}(S_i) \le \varepsilon \sum_i \mathrm{wt}_{G_i}(S_i) \le \varepsilon \left( k \cdot \frac{F}{k} + \sum_e c_e d_e \right) = 2\varepsilon F.$$

□


Theorem 20.7 Algorithm 20.4 achieves an approximation guarantee of O(log k) for the minimum multicut problem.

Proof: The proof follows from Lemmas 20.5 and 20.6, and from the fact that the value of the fractional multicut, $F$, is a lower bound on the minimum multicut. □

Exercise 20.6 justifies the choice of $\mathrm{wt}(s_1) = F/k$.

Corollary 20.8 In an undirected graph with $k$ source-sink pairs,

$$\max_{\text{m/c flow } F} |F| \;\le\; \min_{\text{multicut } C} |C| \;\le\; O(\log k) \left( \max_{\text{m/c flow } F} |F| \right),$$

where $|F|$ represents the value of multicommodity flow $F$, and $|C|$ represents the capacity of multicut $C$.
