
Dual-fitting-based analysis for the greedy set cover algorithm

From the document Springer-Verlag Berlin Heidelberg GmbH (pages 121-125)

13 Set Cover via Dual Fitting

13.1 Dual-fitting-based analysis for the greedy set cover algorithm

To formulate the set cover problem as an integer program, let us assign a variable $x_S$ to each set $S \in \mathcal{S}$, which is allowed 0/1 values. This variable will be set to 1 iff set $S$ is picked in the set cover. Clearly, the constraint is that for each element $e \in U$, at least one of the sets containing it must be picked.

$$
\begin{aligned}
\text{minimize} \quad & \sum_{S \in \mathcal{S}} c(S)\, x_S \\
\text{subject to} \quad & \sum_{S:\, e \in S} x_S \ge 1, && e \in U \\
& x_S \in \{0, 1\}, && S \in \mathcal{S}
\end{aligned}
\tag{13.1}
$$

V. V. Vazirani, Approximation Algorithms

© Springer-Verlag Berlin Heidelberg 2003

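The LP-relaxation of this integer program is obtained by letting the domain of variables $x_S$ be $1 \ge x_S \ge 0$. Since the upper bound on $x_S$ is redundant, we get the following LP, whose feasible solutions can be viewed as fractional set covers.

$$
\begin{aligned}
\text{minimize} \quad & \sum_{S \in \mathcal{S}} c(S)\, x_S \\
\text{subject to} \quad & \sum_{S:\, e \in S} x_S \ge 1, && e \in U \\
& x_S \ge 0, && S \in \mathcal{S}
\end{aligned}
\tag{13.2}
$$

Introducing a variable $y_e$ for each element $e \in U$, we obtain the dual program.

$$
\begin{aligned}
\text{maximize} \quad & \sum_{e \in U} y_e \\
\text{subject to} \quad & \sum_{e:\, e \in S} y_e \le c(S), && S \in \mathcal{S} \\
& y_e \ge 0, && e \in U
\end{aligned}
\tag{13.3}
$$

There is a purely mechanical procedure for obtaining the dual of a linear program.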

Once the dual is obtained, one can devise intuitive, and possibly physically meaningful, ways of thinking about it. Using this mechanical procedure, one can obtain the dual of a complex linear program in a fairly straightforward manner. Indeed, the LP-duality-based approach derives its wide applicability from this fact.

An intuitive way of thinking about LP (13.3) is that it is packing "stuff" into elements, trying to maximize the total amount packed, subject to the constraint that no set is overpacked. A set is said to be overpacked if the total amount packed into its elements exceeds the cost of the set. Whenever the coefficients in the constraint matrix, objective function, and right-hand side are all nonnegative, the minimization LP is called a covering LP and the maximization LP is called a packing LP. Thus, (13.2) and (13.3) are a covering-packing pair of linear programs. Such pairs of programs will arise frequently in subsequent chapters.

[Figure: a number line extending to $\infty$ that marks OPT and shows the costs of dual fractional solutions and of primal fractional solutions.]

At this point, we can state the lower bounding scheme being used by Algorithm 2.2. Denote by $\mathrm{OPT}_f$ the cost of an optimal fractional set cover, i.e., an optimal solution to LP (13.2). Clearly $\mathrm{OPT}_f \le \mathrm{OPT}$, the cost of an optimal (integral) set cover. The cost of any feasible solution to the dual program, LP (13.3), is a lower bound on $\mathrm{OPT}_f$, and hence also on OPT.

Algorithm 2.2 uses this as the lower bound.
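The lower-bound claim is weak duality specialized to this covering-packing pair. As a short worked derivation, take any $x$ with $x_S \ge 0$ and $\sum_{S:\, e \in S} x_S \ge 1$ for every $e \in U$, and any $y$ with $y_e \ge 0$ and $\sum_{e \in S} y_e \le c(S)$ for every $S \in \mathcal{S}$. Then

$$
\sum_{e \in U} y_e
\;\le\; \sum_{e \in U} y_e \sum_{S:\, e \in S} x_S
\;=\; \sum_{S \in \mathcal{S}} x_S \sum_{e \in S} y_e
\;\le\; \sum_{S \in \mathcal{S}} c(S)\, x_S .
$$

The first inequality uses $y_e \ge 0$ together with the covering constraints, the equality only exchanges the order of summation, and the last inequality uses $x_S \ge 0$ together with the packing constraints. Taking $x$ to be an optimal fractional set cover gives $\sum_{e \in U} y_e \le \mathrm{OPT}_f$.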

Algorithm 2.2 defines dual variables $\mathrm{price}(e)$, one for each element $e$. Observe that the cover picked by the algorithm is fully paid for by this dual solution. However, in general, this dual solution is not feasible (see Exercise 13.2). We will show below that if this dual is shrunk by a factor of $H_n$ (the $n$-th harmonic number, with $n = |U|$), it fits into the given set cover instance, i.e., no set is overpacked. For each element $e$ define

$$
y_e = \frac{\mathrm{price}(e)}{H_n}.
$$

Algorithm 2.2 uses the dual feasible solution, y, as the lower bound on OPT.
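The shrinking step can be tried out on a small instance. The following is a minimal Python sketch, not the book's code: the function names (`greedy_set_cover`, `harmonic`, `overpacked_sets`) and the three-element instance are hypothetical, chosen so that the raw prices overpack one set while the shrunk values $y_e$ do not. It assumes the given sets cover the universe.

```python
from fractions import Fraction


def greedy_set_cover(universe, sets, cost):
    """Greedy set cover; returns the picked sets and price(e) for each element.

    Assumes the sets together cover the universe (hypothetical helper).
    """
    uncovered = set(universe)
    picked, price = [], {}
    while uncovered:
        # Cost-effectiveness of S: cost per element that S would newly cover.
        candidates = [s for s in sets if sets[s] & uncovered]
        best = min(candidates,
                   key=lambda s: Fraction(cost[s]) / len(sets[s] & uncovered))
        newly = sets[best] & uncovered
        for e in newly:
            price[e] = Fraction(cost[best]) / len(newly)
        picked.append(best)
        uncovered -= newly
    return picked, price


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))


def overpacked_sets(values, sets, cost):
    """Names of sets whose elements carry more total value than the set costs."""
    return [s for s in sets if sum(values[e] for e in sets[s]) > cost[s]]


# Hypothetical instance: three singletons and one set covering everything.
universe = {1, 2, 3}
sets = {"s1": {1}, "s2": {2}, "s3": {3}, "full": {1, 2, 3}}
cost = {"s1": 2, "s2": 3, "s3": 6, "full": 7}

picked, price = greedy_set_cover(universe, sets, cost)
H_n = harmonic(len(universe))
y = {e: p / H_n for e, p in price.items()}

print(picked)                              # ['s1', 's2', 's3']
print(overpacked_sets(price, sets, cost))  # ['full']: raw prices are infeasible
print(overpacked_sets(y, sets, cost))      # []: the shrunk dual fits
```

On this instance the greedy cover costs 11, the prices sum to 11, and the optimal cover costs 7, so the greedy cost stays within the factor $H_3 = 11/6$ of the optimum.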

Lemma 13.2 The vector $y$ defined above is a feasible solution for the dual program (13.3).

Proof: We need to show that no set is overpacked by the solution $y$. Consider a set $S \in \mathcal{S}$ consisting of $k$ elements. Number the elements in the order in which they are covered by the algorithm, breaking ties arbitrarily, say $e_1, \ldots, e_k$.

Consider the iteration in which the algorithm covers element $e_i$. At this point, $S$ contains at least $k - i + 1$ uncovered elements. Thus, in this iteration, $S$ itself can cover $e_i$ at an average cost of at most $c(S)/(k - i + 1)$. Since the algorithm chose the most cost-effective set in this iteration, $\mathrm{price}(e_i) \le c(S)/(k - i + 1)$. Thus,

$$
y_{e_i} \le \frac{1}{H_n} \cdot \frac{c(S)}{k - i + 1}.
$$

Summing over all elements in $S$,

$$
\sum_{i=1}^{k} y_{e_i} \le \frac{c(S)}{H_n} \cdot \left( \frac{1}{k} + \frac{1}{k-1} + \cdots + \frac{1}{1} \right) = \frac{H_k}{H_n} \cdot c(S) \le c(S).
$$

Therefore, $S$ is not overpacked. □

Theorem 13.3 The approximation guarantee of the greedy set cover algorithm is $H_n$.

Proof: The cost of the set cover picked is

$$
\sum_{e \in U} \mathrm{price}(e) = H_n \left( \sum_{e \in U} y_e \right) \le H_n \cdot \mathrm{OPT}_f \le H_n \cdot \mathrm{OPT},
$$

where the first inequality follows from the fact that $y$ is dual feasible. □

13.1.1 Can the approximation guarantee be improved?

Consider the three questions raised in Section 1.1.2 regarding improving the approximation guarantee for vertex cover. Let us ask analogous questions for set cover. The first and third questions are already answered in Section 2.1.

As a corollary of Theorem 13.3 we get an upper bound of $H_n$ on the integrality gap of relaxation (13.2). Example 13.4 shows that this bound is essentially tight. Since the integrality gap of the LP-relaxation used bounds the best approximation factor one can hope to achieve using this relaxation, the answer to the second question is also essentially "no".

Example 13.4 Consider the following set cover instance. Let $n = 2^k - 1$, where $k$ is a positive integer, and let $U = \{e_1, e_2, \ldots, e_n\}$. For $1 \le i \le n$, consider $i$ written as a $k$-bit number. We can view this as a $k$-dimensional vector over GF[2]. Let $\vec{i}$ denote this vector. For $1 \le i \le n$ define set $S_i = \{e_j \mid \vec{i} \cdot \vec{j} = 1\}$, where $\vec{i} \cdot \vec{j}$ denotes the inner product of these two vectors. Finally, let $\mathcal{S} = \{S_1, \ldots, S_n\}$, and define the cost of each set to be 1.

It is easy to check that each set contains $2^{k-1} = (n + 1)/2$ elements, and each element is contained in $(n + 1)/2$ sets. Thus, $x_i = 2/(n + 1)$, $1 \le i \le n$, is a fractional set cover. Its cost is $2n/(n + 1)$.
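These counts can be checked mechanically for a small $k$. The following Python sketch is an illustration only: the helper name `gf2_instance` and the choice $k = 3$ are assumptions, and elements and sets are represented by their indices $1, \ldots, n$.

```python
from fractions import Fraction


def gf2_instance(k):
    """Return n = 2^k - 1 and the sets S_i = {j : i . j = 1 over GF(2)}."""
    n = 2**k - 1

    def dot(i, j):
        # Inner product over GF(2) is the parity of the bitwise AND.
        return bin(i & j).count("1") % 2

    sets = {i: {j for j in range(1, n + 1) if dot(i, j) == 1}
            for i in range(1, n + 1)}
    return n, sets


k = 3  # hypothetical small case, so n = 7
n, sets = gf2_instance(k)

# Every set has (n + 1) / 2 elements and every element lies in (n + 1) / 2 sets.
set_sizes = {len(s) for s in sets.values()}
memberships = {sum(1 for i in sets if j in sets[i]) for j in range(1, n + 1)}

# The uniform fractional solution x_i = 2 / (n + 1) covers each element exactly once.
x = {i: Fraction(2, n + 1) for i in sets}
coverage = {j: sum(x[i] for i in sets if j in sets[i]) for j in range(1, n + 1)}

print(set_sizes, memberships)   # {4} {4}
print(set(coverage.values()))   # {Fraction(1, 1)}
print(sum(x.values()))          # 7/4, i.e. 2n / (n + 1)
```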

Next, we will show that any integral set cover must pick at least $k$ of the sets. Consider the union of any $p$ sets, where $p < k$. Let $i_1, \ldots, i_p$ be the indices of these $p$ sets, and let $A$ be a $p \times k$ matrix over GF[2] whose rows consist of vectors $\vec{i_1}, \ldots, \vec{i_p}$, respectively. Since the rank of $A$ is $< k$, the dimension of its null space is $\ge 1$, and so the null space contains a nonzero vector, say $\vec{j}$. Since $A\vec{j} = 0$, the element $e_j$ is not in any of the $p$ sets. Hence the $p$ sets do not form a cover.

Therefore, any integral set cover has cost at least $k = \log_2(n + 1)$. Hence, the lower bound on the integrality gap established by this example is

$$
\left( \frac{n + 1}{2n} \right) \cdot \log_2(n + 1) > \frac{\log_2 n}{2}.
$$

□
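For small $k$ the bound can be confirmed by exhaustive search. The Python sketch below is only an illustration under stated assumptions: the helper names (`gf2_sets`, `min_cover_size`, `gap_lower_bound`) are hypothetical, the search is exponential and meant for tiny instances, and the ratio it reports uses the fractional cover of cost $2n/(n+1)$ from the example, so it is a lower bound on the true integrality gap rather than its exact value.

```python
from fractions import Fraction
from itertools import combinations
from math import log2


def gf2_sets(k):
    """Sets S_i = {j : i . j = 1 over GF(2)} for i, j in 1 .. 2^k - 1."""
    n = 2**k - 1
    return {i: {j for j in range(1, n + 1) if bin(i & j).count("1") % 2 == 1}
            for i in range(1, n + 1)}


def min_cover_size(sets, n):
    """Smallest number of sets covering {1, ..., n}, by brute force."""
    universe = set(range(1, n + 1))
    for p in range(1, len(sets) + 1):
        for chosen in combinations(sets, p):
            if set().union(*(sets[i] for i in chosen)) == universe:
                return p
    return None  # the sets do not cover the universe


def gap_lower_bound(k):
    """Integral optimum divided by the fractional cost 2n / (n + 1)."""
    n = 2**k - 1
    opt = min_cover_size(gf2_sets(k), n)
    return Fraction(opt) / Fraction(2 * n, n + 1)


for k in (2, 3, 4):
    n = 2**k - 1
    opt = min_cover_size(gf2_sets(k), n)
    ratio = gap_lower_bound(k)
    # The integral optimum equals k, and the ratio exceeds log2(n) / 2.
    print(k, opt, ratio, float(ratio) > log2(n) / 2)
```

For $k = 3$ the search finds an integral optimum of 3, so the ratio is $3 / (7/4) = 12/7$, in agreement with $\frac{n+1}{2n} \log_2(n+1)$ at $n = 7$.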
