Approximation Algorithms and Hardness Results for Labeled Connectivity Problems

Refael Hassin∗ Jérôme Monnot† Danny Segev∗

Abstract

Let G = (V, E) be a connected multigraph, whose edges are associated with labels specified by an integer-valued function L : E → N. In addition, each label ℓ ∈ N has a non-negative cost c(ℓ). The minimum label spanning tree problem (MinLST) asks to find a spanning tree in G that minimizes the overall cost of the labels used by its edges. Equivalently, we aim at finding a minimum cost subset of labels I ⊆ N such that the edge set {e ∈ E : L(e) ∈ I} forms a connected subgraph spanning all vertices. Similarly, in the minimum label s-t path problem (MinLP) the goal is to identify an s-t path minimizing the combined cost of its labels. The main contributions of this paper are improved approximation algorithms and hardness results for MinLST and MinLP.

Keywords: Labeled connectivity, approximation algorithms, hardness of approximation.

1 Introduction

In many graph connectivity problems each edge is associated with a numerical attribute, which may represent length, weight or cost, depending on the related real-life context, and the task is to identify a minimum cost subgraph satisfying given connectivity requirements. In this paper, we assume that the set of available edges is partitioned into classes, each of which can be purchased in its entirety or not at all. A convenient representation of such a model couples each edge with a label that specifies its class, and a subset of labels forms a feasible solution when the edges whose labels belong to this subset induce a subgraph satisfying the given connectivity requirements. The objective is to find a solution that minimizes some function defined over the costs of picked labels.

We address two fundamental labeled connectivity problems, those of constructing spanning trees and s-t paths by picking labels of minimum total cost. Formally, let G = (V, E) be a connected multigraph on n vertices, whose edges are associated with labels specified by an integer-valued function L : E → N. In addition, each label ℓ ∈ N has a non-negative cost c(ℓ). The minimum label spanning tree problem (MinLST) asks to find a spanning tree in G that minimizes the overall cost of the labels used by its edges. Equivalently, we aim at finding a minimum cost subset of labels I ⊆ N such that the edge set {e ∈ E : L(e) ∈ I} forms a connected subgraph spanning all vertices. Similarly, in the minimum label s-t path problem (MinLP) the goal is to identify an s-t path minimizing the combined cost of its labels, where s, t ∈ V are part of the input. We refer to the special cases of these problems in which at most r edges are assigned to any given label as MinLST_r and MinLP_r, respectively.

∗ School of Mathematical Sciences, Tel-Aviv University, Tel-Aviv 69978, Israel. Email: {hassin,segevd}@post.tau.ac.il.

† CNRS LAMSADE, Université Paris-Dauphine, Place du Maréchal de Lattre de Tassigny, 75775 Paris Cedex 16, France. Email: monnot@lamsade.dauphine.fr.


1.1 Related results

To the best of our knowledge, the weighted versions of both MinLST and MinLP have not been previously studied. Therefore, the approximability bounds of these problems are stated below with respect to the unweighted case, in which each label has a unit cost.

Chang and Leu [11], and independently Broersma and Li [7], proved that the decision problem of MinLST is NP-complete. The former authors also studied the performance of several heuristics, one of which is the maximum vertex covering algorithm. Krumke and Wirth [20] demonstrated that a variant of this algorithm (henceforth, modified MVC) guarantees an approximation factor of at most 2 ln n + 1, and accompanied this result by proving that MinLST is at least as hard to approximate as set cover. Wan, Chen and Xu [26] suggested a refined analysis of the modified MVC algorithm to obtain a factor of at most $H(n-1) = \sum_{k=1}^{n-1} \frac{1}{k}$. Recently, Xiong, Golden and Wasil [28] established that this algorithm provides a tight approximation guarantee of H(r) for MinLST_r, improving the bound of Wan et al.¹, which is independent of r.

Brüggemann, Monnot and Woeginger [9] considered a local-search heuristic, and showed that it constructs a solution for MinLST_r whose cost is within factor $\frac{r+1}{2}$ of optimum. In addition, they proved that MinLST_2 is polynomial-time solvable, whereas MinLST_r is APX-complete for r ≥ 3.

Carr, Doddi, Konjevod and Marathe [10] proved that MinLP contains as a special case the red-blue set cover problem, which was shown in the same paper to be inapproximable within a factor of $O(2^{\log^{1-\epsilon} n})$ for any ε > 0, unless NP ⊆ TIME($n^{\mathrm{polylog}(n)}$). However, this hardness result does not readily extend to MinLP, since the reduction described by Carr et al. is not approximation preserving. Relying on a more restrictive subproblem of red-blue set cover, Wirth [27, Thm. 2.16] established the above-mentioned lower bound for MinLP. On the positive side, Broersma, Li, Woeginger and Zhang [8] devised two exact exponential-time algorithms, with respective running times of $O(n \cdot \min\{L^d, 2^L\})$ and $O(n^2 L!)$, where L is the number of labels and d is the s-t distance in G. They also considered a Dijkstra-like algorithm for approximating MinLP, and demonstrated that it does not provide any constant factor. In fact, simple examples show that the resulting solution may have a cost of Ω(n) times the optimum, and moreover, to our knowledge a non-trivial approximation for MinLP has not been presented yet.

1.2 Our results

In this paper, we present improved approximation algorithms and hardness results for MinLST and MinLP. As a secondary objective, we make a concentrated effort to relate the algorithmic methods utilized in approximating these problems to a number of well-known techniques, originally studied in the context of integer covering. Our main findings can be briefly summarized as follows:

1. We extend the modified MVC algorithm to handle label costs in MinLST. Consequently, we derive an algorithm for the weighted case, with an approximation guarantee of H(n − 1). This result appears in Section 2.

2. We provide an additional O(log n) approximation for MinLST, based on assembling partial solutions obtained by repeatedly calling a constant-factor maximum coverage subroutine [1, 19, 25]. This approach, which encapsulates the principal idea we employ to approximate MinLP, is described in Section 2.

¹ Note that we may assume that r ≤ n − 1, since otherwise MinLST_r can be reduced to MinLST_{n−1} by

3. By prematurely terminating the modified MVC heuristic and switching to an exact algorithm for MinLST_2 (due to Brüggemann et al. [9]), we achieve an approximation factor of $H(r) - \frac{1}{6}$ for unweighted MinLST_r. Our algorithm was inspired by a similar improvement for the set cover problem, proposed by Goldschmidt, Hochbaum and Yu [15]. In addition to showing that the factor H(r) can be decreased by lower order terms, the underlying analysis is considerably simpler than that of Xiong et al. [28]. This algorithm is given in Section 3.

4. We devise the first non-trivial algorithm for MinLP, with an approximation factor of O(√n). In a preprocessing step, we “guess” certain attributes of an arbitrary optimal solution and modify the given instance accordingly. Once again, we make use of repeated calls to a maximum coverage subroutine, eventually allowing us to easily identify a near-optimal solution. This result is described in Section 4.

5. Since MinLP_r admits a constant-factor approximation when r = O(1), one may ask whether MinLP_r can be approximated in this case to any required degree. A negative answer is provided in Section 5, where we show that MinLP_r is at least as hard to approximate as Min-r-SAT, a special case of the minimum satisfiability problem (MinSAT) in which each clause consists of at most r literals. The inapproximability of Min-r-SAT was studied by Avidor and Zwick [5], whereas that of MinSAT was studied even earlier by Marathe and Ravi [23].

6. By utilizing a self-improvability property of MinLP, which is based on the notion of label squaring, we show that MinLP cannot be approximated within any polylogarithmic factor unless P = NP. This result is incomparable with the previously mentioned lower bound of Wirth, stating that an approximation factor of $O(2^{\log^{1-\epsilon} n})$ for some ε > 0 implies NP ⊆ TIME($n^{\mathrm{polylog}(n)}$). Our technique was motivated by an analogous construction due to Karger, Motwani and Ramkumar [18] for the longest path problem. This proof is given in Section 5.

1.3 Notation

We conclude this section by introducing some notation and terminology. Given a set of edges F ⊆ E, we use L(F) = {L(e) : e ∈ F} to denote the image of F under L. Furthermore, when H is a subgraph of G, the notation L(H) is used as a shorthand for L(E(H)). For a subset I ⊆ N, we denote by $L^{-1}(I) = \{e \in E : L(e) \in I\}$ the inverse image of I, excluding the case where the specified subset is actually a singleton, which is abbreviated by writing $L^{-1}(\ell)$ instead of $L^{-1}(\{\ell\})$. The contraction of an edge (u, v) is the multigraph obtained by identifying the vertices u and v, followed by eliminating any degenerate edge joining the newly created vertex to itself. It is easy to verify that, regardless of the order according to which the edges in a subset F ⊆ E are contracted, we always attain the same multigraph. Therefore, it is sensible to define the contraction of an edge set.

2 Approximating the Weighted MinLST Problem

In what follows, we present two approximation algorithms for the MinLST problem. Guided by considerably different techniques, both algorithms iteratively construct a feasible subset of labels whose cost is within factor O(log n) of optimum. Recalling that MinLST is at least as hard to approximate as set cover, the factor we derive is best possible up to a constant multiplicative factor, assuming P ≠ NP [4, 24].

2.1 The greedy algorithm

We extend the modified MVC algorithm of Krumke and Wirth [20] to handle label costs. In each step, our algorithm picks the most cost-effective label, namely, one that minimizes the ratio between its cost and the decrement in the number of vertices resulting from the contraction of its corresponding edges. A formal description of the algorithm is provided in Figure 1, followed by a tight analysis showing that its approximation guarantee is exactly H(n − 1).

1. I ← ∅.

2. While G contains at least two vertices

(a) For every label ℓ ∈ L(G), let $d_G(\ell)$ be the decrement in the number of vertices in G when the edge set $L^{-1}(\ell)$ is contracted.

(b) Pick a label ℓ∗ that minimizes the ratio $c(\ell)/d_G(\ell)$ over all labels in L(G).

(c) I ← I ∪ {ℓ∗}, G ← the contraction of $L^{-1}(\ell^*)$ in G.

3. Return I.

Figure 1: The greedy algorithm
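The steps of Figure 1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names, the edge-list representation `(u, v, label)`, and the use of a union-find structure to simulate contraction are all assumptions made here for the example. Costs are converted to exact fractions so that ratio comparisons are not affected by rounding.

```python
from fractions import Fraction


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def contract(parent, label_edges):
    """Contract the given edges on a copy of the structure.

    Returns the new structure and the resulting decrement in the
    number of vertices, i.e. d_G(label)."""
    parent = dict(parent)
    merged = 0
    for u, v in label_edges:
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv
            merged += 1
    return parent, merged


def greedy_min_lst(vertices, edges, cost):
    """edges: list of (u, v, label); cost: dict label -> non-negative cost.

    Returns the labels picked by the greedy rule, in order."""
    by_label = {}
    for u, v, lab in edges:
        by_label.setdefault(lab, []).append((u, v))
    parent = {v: v for v in vertices}
    remaining = len(parent)
    picked = []
    while remaining > 1:
        best = None
        for lab, label_edges in by_label.items():
            _, d = contract(parent, label_edges)
            if d == 0:  # label no longer present in the contracted graph
                continue
            ratio = Fraction(cost[lab]) / d
            if best is None or ratio < best[0]:
                best = (ratio, lab)
        if best is None:
            raise ValueError("the input graph is not connected")
        parent, d = contract(parent, by_label[best[1]])
        remaining -= d
        picked.append(best[1])
    return picked
```

On a triangle where label `a` covers two edges at cost 3 and label `b` covers the third edge at cost 1, the rule first picks `b` (ratio 1) and then `a`, although `a` alone is feasible; this is the kind of gap that the H(n − 1) analysis bounds.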

Theorem 1. The cost of the greedy solution is within factor H(n − 1) of optimum.

Proof. Let $\{\ell_1, \ldots, \ell_k\}$ be the set of labels returned by the algorithm, indexed by the order in which they were picked. In addition, for 1 ≤ j ≤ k, let $G_j$ be the processed multigraph at the beginning of the jth iteration (in which the label $\ell_j$ was picked). In what follows, we denote by OPT the cost of an optimal solution to the original instance, and by $\mathrm{OPT}(G_j)$ the cost of an optimal solution to the instance we obtain at the beginning of the jth iteration. Clearly, $\mathrm{OPT} = \mathrm{OPT}(G_1) \ge \cdots \ge \mathrm{OPT}(G_k)$.

We first show that $c(\ell_j) \le d_{G_j}(\ell_j) \cdot \frac{\mathrm{OPT}(G_j)}{|V(G_j)| - 1}$ for all 1 ≤ j ≤ k. Let $\{\ell^*_1, \ldots, \ell^*_p\} \subseteq L(G_j)$ be an optimal solution to the instance corresponding to $\mathrm{OPT}(G_j)$. Note that the algorithm had the option of picking each $\ell^*_i$ when $\ell_j$ was picked. By observing that a minimum-ratio label is picked in each iteration, we have $\frac{c(\ell_j)}{d_{G_j}(\ell_j)} \le \frac{c(\ell^*_i)}{d_{G_j}(\ell^*_i)}$ for every 1 ≤ i ≤ p, and the stated upper bound on $c(\ell_j)$ follows as

\[ \mathrm{OPT}(G_j) = \sum_{i=1}^{p} \frac{c(\ell^*_i)}{d_{G_j}(\ell^*_i)} \, d_{G_j}(\ell^*_i) \ge \frac{c(\ell_j)}{d_{G_j}(\ell_j)} \sum_{i=1}^{p} d_{G_j}(\ell^*_i) \ge \frac{c(\ell_j)}{d_{G_j}(\ell_j)} \left( |V(G_j)| - 1 \right) . \]

The second inequality holds since the set of edges $L^{-1}(\{\ell^*_1, \ldots, \ell^*_p\})$ forms a connected subgraph spanning $V(G_j)$, implying that $\sum_{i=1}^{p} d_{G_j}(\ell^*_i) \ge |V(G_j)| - 1$.

Using the upper bounds proved above, we conclude that the cost of $\{\ell_1, \ldots, \ell_k\}$ is

\[ \sum_{j=1}^{k} c(\ell_j) \le \sum_{j=1}^{k} d_{G_j}(\ell_j) \frac{\mathrm{OPT}(G_j)}{|V(G_j)| - 1} \le \mathrm{OPT} \sum_{j=1}^{k} \sum_{i=1}^{d_{G_j}(\ell_j)} \frac{1}{|V(G_j)| - i} = H(n-1) \cdot \mathrm{OPT} , \]

where the last equality holds since $d_{G_j}(\ell_j) = |V(G_j)| - |V(G_{j+1})|$.

Lemma 2. There are MinLST instances for which the cost of the greedy solution is H(n − 1) times the optimum.

Proof. Let G be a path connecting the vertices $v_1, \ldots, v_n$ in left-to-right order, augmented with n − 2 additional edges joining $v_1$ to $v_3, \ldots, v_n$. The labeling function and associated costs are: $L(v_1, v_i) = \ell_1$ for 2 ≤ i ≤ n; $L(v_i, v_{i+1}) = \ell_i$ for 2 ≤ i ≤ n − 1; and $c(\ell_i) = \frac{1}{i}$ for 1 ≤ i ≤ n − 1. It is not difficult to verify that the algorithm picks the labels $\ell_{n-1}, \ldots, \ell_1$ (in this order), constructing an approximate solution whose cost is H(n − 1), whereas the optimal solution picks $\ell_1$, with cost 1.
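The instance in this proof can be checked mechanically for small n. The sketch below builds it and runs the minimum-ratio rule with exact fractions. One assumption is made explicit here: in this instance the best ratio is always shared by $\ell_1$ and the highest-indexed remaining path label, and the code breaks such ties in favor of the larger index, which is the adversarial choice that realizes the stated bound. All function names are illustrative.

```python
from fractions import Fraction


def lemma2_instance(n):
    """Path v1..vn plus edges from v1 to v3..vn; labels are the integers 1..n-1."""
    edges = [(1, i, 1) for i in range(2, n + 1)]        # L(v1, vi) = l_1
    edges += [(i, i + 1, i) for i in range(2, n)]       # L(vi, vi+1) = l_i
    cost = {i: Fraction(1, i) for i in range(1, n)}     # c(l_i) = 1/i
    return edges, cost


def components_after(comp, label_edges):
    """Component map (vertex -> component id) after contracting label_edges."""
    comp = dict(comp)
    for u, v in label_edges:
        cu, cv = comp[u], comp[v]
        if cu != cv:
            for w in comp:
                if comp[w] == cv:
                    comp[w] = cu
    return comp


def greedy_with_adversarial_ties(n):
    """Run the greedy rule; ties on the ratio go to the largest label index."""
    edges, cost = lemma2_instance(n)
    by_label = {}
    for u, v, lab in edges:
        by_label.setdefault(lab, []).append((u, v))
    comp = {v: v for v in range(1, n + 1)}
    picked = []
    while len(set(comp.values())) > 1:
        size = len(set(comp.values()))
        best = None
        for lab in sorted(by_label, reverse=True):      # larger index first
            after = components_after(comp, by_label[lab])
            d = size - len(set(after.values()))
            if d > 0:
                ratio = cost[lab] / d
                if best is None or ratio < best[0]:     # strict: keeps first tie
                    best = (ratio, lab)
        comp = components_after(comp, by_label[best[1]])
        picked.append(best[1])
    return picked, sum(cost[lab] for lab in picked)
```

For n = 6 this picks the labels in the order 5, 4, 3, 2, 1 at total cost 1 + 1/2 + 1/3 + 1/4 + 1/5, whereas label 1 alone is feasible at cost 1.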

We note that an instance of MinLST can be viewed as a set cover instance with exponentially many elements. More precisely, the objective is to cover all non-trivial cuts in G, where for each label ℓ there is an analogous subset, consisting of the cuts crossed by $L^{-1}(\ell)$. Having observed this fact, one might be tempted to speculate that the algorithm we suggest is identical to the greedy set cover algorithm [17, 22] applied to the corresponding instance. However, simple examples show that in the weighted case the approximation factor of the latter is Ω(n).

2.2 The budgeted covering algorithm

Unlike the shortsighted approach employed by the greedy algorithm, which picks a single label in each step, the new strategy we suggest consists of repeatedly contracting an inexpensive collection of labels in an attempt to decrease the number of vertices by a constant fraction. Such a collection is identified by approximating a related instance of the budgeted maximum coverage problem, in which we are given a ground set U, a family S of subsets of U with non-negative costs, and a budget B. The objective is to find a subcollection S′ ⊆ S such that the total cost of the subsets in S′ is at most B, and such that the number of elements covered by S′ is maximized. Several algorithms achieve an approximation guarantee of $1 - \frac{1}{e}$ for the latter problem [1, 19, 25].

To simplify the description and analysis of the budgeted covering algorithm, given in Figure 2, it would be convenient to make two preliminary assumptions. First, we assume that $c_{\min} = \min_{\ell \in L(G)} c(\ell) > 0$, as all zero cost labels can be picked and contracted in advance. Second, given an accuracy requirement ε > 0, we assume that a parameter ∆ ∈ [OPT, (1 + ε)OPT] is known. This follows from observing that $c_{\min} \le \mathrm{OPT} \le |L(G)| c_{\max}$, where $c_{\max} = \max_{\ell \in L(G)} c(\ell)$, so all $O(\log_{1+\epsilon} \frac{|L(G)| c_{\max}}{c_{\min}})$ candidate values of the form $(1+\epsilon)^k c_{\min}$ can be tested, and one of these values satisfies the assumption we make on ∆.
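The enumeration of candidate values for ∆ can be sketched as follows. The function name and the list-of-costs input are illustrative assumptions; the grid is extended slightly past $|L(G)| c_{\max}$ so that the candidate falling in [OPT, (1 + ε)OPT] is always included.

```python
def candidate_budgets(costs, eps):
    """Geometric grid (1 + eps)^k * c_min that covers [c_min, |L| * c_max].

    costs: positive label costs, one entry per label.
    For every value OPT in that range, some returned candidate lies in
    [OPT, (1 + eps) * OPT]."""
    c_min, c_max = min(costs), max(costs)
    if c_min <= 0:
        raise ValueError("zero-cost labels should be contracted in advance")
    upper = len(costs) * c_max
    budgets = []
    delta = c_min
    # Stop once the grid has passed (1 + eps) * upper; the smallest grid
    # point that is >= OPT is always below this threshold.
    while delta < upper * (1 + eps):
        budgets.append(delta)
        delta *= 1 + eps
    return budgets
```

Running the budgeted covering algorithm once per candidate and keeping the cheapest feasible output then removes the need to know OPT in advance.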

1. I ← ∅, H ← G.

2. While H contains at least two vertices

(a) Create a budgeted maximum coverage instance by: The ground set is V(H); for each label ℓ ∈ L(H) there is a corresponding subset $V_\ell \subseteq V(H)$, consisting of all endpoints of edges in $L^{-1}(\ell)$; the cost of $V_\ell$ is c(ℓ); and the budget is ∆.

(b) Approximate the instance defined above, to obtain a subset I′ ⊆ L(H).

(c) I ← I ∪ I′, H ← the contraction of $L^{-1}(I')$ in H.

3. Return I.

Figure 2: The budgeted covering algorithm
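The outer loop of Figure 2 can be sketched as below. An important caveat: the coverage subroutine used here is a plain greedy heuristic chosen only to keep the example short. It is a stand-in and does not carry the $1 - \frac{1}{e}$ guarantee of the algorithms in [1, 19, 25], so the iteration bound of Theorem 3 does not apply to this sketch. Function names and the `(u, v, label)` edge representation are assumptions, and costs are assumed positive.

```python
def components(vertices, edges):
    """Map each vertex to a representative of its connected component."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return {v: find(v) for v in vertices}


def greedy_coverage(subsets, cost, budget):
    """Stand-in subroutine: repeatedly add the affordable subset with the
    most newly covered elements per unit cost (no 1 - 1/e guarantee)."""
    chosen, covered, spent = [], set(), 0
    while True:
        best = None
        for lab, members in subsets.items():
            gain = len(members - covered)
            if lab in chosen or gain == 0 or spent + cost[lab] > budget:
                continue
            key = gain / cost[lab]
            if best is None or key > best[0]:
                best = (key, lab)
        if best is None:
            return chosen
        chosen.append(best[1])
        covered |= subsets[best[1]]
        spent += cost[best[1]]


def budgeted_covering(vertices, edges, cost, delta):
    """edges: list of (u, v, label). Returns the picked labels, in order."""
    vertices = list(vertices)
    picked = []
    comp = {v: v for v in vertices}          # super-vertex of each vertex in H
    while len(set(comp.values())) > 1:
        subsets = {}                          # label -> V_label in H
        for u, v, lab in edges:
            if comp[u] != comp[v]:            # self-loops of H are discarded
                subsets.setdefault(lab, set()).update((comp[u], comp[v]))
        chosen = greedy_coverage(subsets, cost, delta)
        if not chosen:
            raise ValueError("no affordable label makes progress")
        picked += chosen
        kept = [(u, v) for u, v, lab in edges if lab in picked]
        comp = components(vertices, kept)
    return picked
```

Each round spends at most `delta`, mirroring the cost accounting used in the analysis; only the number of rounds depends on the quality of the subroutine.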

Theorem 3. The cost of the solution constructed by the budgeted covering algorithm is within factor $(1 + \epsilon) \log_{10/7} n$ of optimum.

Proof. Starting with an empty set of labels, in each iteration we augment I with labels whose total cost is at most ∆ ≤ (1 + ε)OPT. Therefore, it is sufficient to show that the algorithm terminates within $\log_{10/7} n$ iterations. To this end, we argue that contracting each of the label sets we obtain in step 2b removes more than a 0.3 fraction of the vertices in the processed multigraph.

Let I∗ ⊆ L(G) be an optimal solution, with $\sum_{\ell \in I^*} c(\ell) = \mathrm{OPT} \le \Delta$. Now consider a single iteration. Since $L^{-1}(I^*)$ forms a connected subgraph of G spanning all vertices, it follows that $\{V_\ell : \ell \in I^* \cap L(H)\}$ is a feasible solution to the budgeted maximum coverage instance defined in step 2a that fully covers V(H). Consequently, for the current approximate solution I′ we must have $|\bigcup_{\ell \in I'} V_\ell| \ge (1 - \frac{1}{e}) |V(H)|$, implying that the contraction of $L^{-1}(I')$ decreases the number of vertices by at least $\frac{1}{2}(1 - \frac{1}{e}) |V(H)| > 0.3 |V(H)|$.
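The two numerical facts used in this proof are easy to confirm. The short sketch below checks that $\frac{1}{2}(1 - \frac{1}{e})$ exceeds 0.3, and simulates the slowest shrinkage the proof permits (each round leaves strictly fewer than 70% of the vertices) to confirm that the number of rounds stays within $\lceil \log_{10/7} n \rceil$. The function names are illustrative.

```python
import math

# Fraction of vertices removed per round, as guaranteed in the proof.
SHRINK = 0.5 * (1 - 1 / math.e)


def iteration_bound(n):
    """Ceiling of log_{10/7}(n), the bound on the number of iterations."""
    return math.ceil(math.log(n) / math.log(10 / 7))


def slowest_run(n):
    """Number of rounds when each round leaves as many vertices as the
    proof permits: the largest integer strictly below 0.7 * n."""
    rounds = 0
    while n > 1:
        n = (7 * n - 1) // 10   # largest m with 10 * m < 7 * n
        rounds += 1
    return rounds
```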

3 An Improved Algorithm for Unweighted MinLST

In this section we present a hybrid algorithm for unweighted MinLST_r with an approximation guarantee of $H(r) - \frac{1}{6}$, demonstrating that the factor H(r) can be improved by lower order terms. We complement this result by constructing an instance on which the stated bound is attained.

3.1 The algorithm

As previously mentioned, our technique was stimulated by that of Goldschmidt et al. [15], who combined the greedy set cover algorithm with an exact solution to a related edge cover problem, and achieved a similar improvement for unweighted set cover. The algorithm we propose, formally described in Figure 3, executes the modified MVC heuristic as long as the contraction of some unpicked label decreases the number of vertices by at least 3, and then switches to an exact MinLST_2 algorithm, suggested by Brüggemann et al. [9].

1. I ← ∅.

2. While G contains at least two vertices

(a) For every label ℓ ∈ L(G), let $d_G(\ell)$ be the decrement in the number of vertices in G when the edge set $L^{-1}(\ell)$ is contracted.

(b) Let ℓ∗ be a label that maximizes $d_G(\ell)$ over all labels in L(G).

(c) If $d_G(\ell^*) \le 2$, proceed to step 3. Otherwise, I ← I ∪ {ℓ∗}, G ← the contraction of $L^{-1}(\ell^*)$ in G.

3. For each label ℓ ∈ L(G), leave in G a maximal forest consisting of (at most two) edges whose label is ℓ, and discard the edges remaining in $L^{-1}(\ell)$. Let G′ ⊆ G be the subgraph we obtain.

4. Find an optimal solution I′ ⊆ L(G′) to the resulting MinLST_2 problem on G′.

5. Return I ∪ I′.

Figure 3: The unweighted MinLST algorithm

Note that G′ may contain parallel edges, whereas the MinLST_2 algorithm of Brüggemann et al. was originally stated in terms of simple graphs. However, their algorithm is based on solving a related graphic matroid parity problem, which is also polynomial-time solvable on multigraphs [13]. This observation can be exploited to implement step 4 using minor adjustments to the algorithm described in [9].
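The two-phase structure of Figure 3 can be illustrated with the sketch below. It is not the algorithm of [9]: the exact phase is replaced here by brute-force enumeration over the remaining labels, which is exponential and only suitable for toy instances, and the pruning of step 3 is skipped because the enumeration does not need it. Function names, the edge representation and the tie-breaking among labels of equal decrement are assumptions.

```python
from itertools import combinations


def num_components(vertices, edges):
    """Number of connected components of (vertices, edges)."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    count = len(parent)
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count -= 1
    return count


def hybrid_min_lst(vertices, edges):
    """edges: list of (u, v, label); all labels have unit cost."""
    vertices = list(vertices)
    labels = sorted({lab for _, _, lab in edges})

    def comps(chosen):
        kept = [(u, v) for u, v, lab in edges if lab in chosen]
        return num_components(vertices, kept)

    picked = set()
    # Greedy phase (step 2): continue while some label merges >= 3 vertices.
    while comps(picked) > 1:
        now = comps(picked)
        gains = [(now - comps(picked | {lab}), lab)
                 for lab in labels if lab not in picked]
        if not gains or max(gains)[0] <= 2:
            break
        picked.add(max(gains)[1])
    # Exact phase (stand-in for steps 3-4): smallest set of extra labels
    # that connects the contracted graph, found by enumeration.
    rest = [lab for lab in labels if lab not in picked]
    for size in range(len(rest) + 1):
        for extra in combinations(rest, size):
            if comps(picked | set(extra)) == 1:
                return picked | set(extra)
    raise ValueError("the input graph is not connected")
```

In a polynomial-time implementation the enumeration would be replaced by the matroid-parity-based routine discussed above.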

3.2 Analysis

We break up step 2 of the algorithm into phases numbered r, . . . , 3, where phase j consists of the sequence of iterations in which the number of vertices decreased by exactly j. Let $I_j$ be the set of labels that were picked during phase j, with the convention that $|I_j| = n_j$. We also define $I_1$ and $I_2$ according to the following procedure: Starting with the multigraph G′, the labels in I′ are contracted one after the other (in some arbitrary order); then, $I_1$ is the set of labels whose contraction decreased the number of vertices by 1; and similarly, $I_2$ is the set of labels with a decrement of 2. Clearly, $|I| = \sum_{j=3}^{r} n_j$ and $|I'| = n_1 + n_2$.

For ease of presentation, we assume that none of $I_1, \ldots, I_r$ is empty, and remark that the general case can be handled by considering a number of degenerate scenarios. Having this assumption in mind, let $G_j$ be the processed multigraph at the beginning of the iteration in which the first label in $I_j$ is picked. In addition, let $\mathrm{OPT}(G_j)$ be the cardinality of an optimal solution to the MinLST instance whose underlying multigraph and labels are $G_j$ and $L(G_j)$, respectively. Using this notation, $\mathrm{OPT} = \mathrm{OPT}(G_r) \ge \cdots \ge \mathrm{OPT}(G_3) \ge |I'|$.

Lemma 4. $\mathrm{OPT}(G_j) \ge \frac{1}{j} \sum_{k=1}^{j} k n_k$ for every 3 ≤ j ≤ r.

Proof. We first claim that $|V(G_j)| = \sum_{k=1}^{j} k n_k + 1$. This follows from observing that the number of vertices of $G_j$ decreases by $j n_j$ in phase j, by $(j-1) n_{j-1}$ in phase j − 1, and so on, until it decreases by $3 n_3$ in phase 3. At this point, the number of remaining vertices is |V(G′)|, and the definition of $n_1$ and $n_2$ ensures that $|V(G')| = 2 n_2 + n_1 + 1$.

Now consider the multigraph $G_j$. Since the contraction of each label in $L(G_j)$ decreases the number of vertices by at most j, a property that extends to subsequent contractions as well, we have $\mathrm{OPT}(G_j) \ge \frac{1}{j}(|V(G_j)| - 1) = \frac{1}{j} \sum_{k=1}^{j} k n_k$.

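Lemma 5. $|I| + |I'| \le (H(r) - \frac{1}{6}) \mathrm{OPT}$.

Proof. By Lemma 4 we have

\[
\begin{aligned}
\mathrm{OPT} \sum_{j=4}^{r} \frac{1}{j}
&\ge \sum_{j=3}^{r-1} \frac{\mathrm{OPT}(G_j)}{j+1}
\ge \sum_{j=3}^{r-1} \frac{1}{j(j+1)} \sum_{k=1}^{j} k n_k
= \sum_{j=3}^{r-1} \left( \frac{1}{j} - \frac{1}{j+1} \right) \sum_{k=1}^{j} k n_k \\
&= \sum_{k=1}^{r-1} k n_k \sum_{j=\max\{k,3\}}^{r-1} \left( \frac{1}{j} - \frac{1}{j+1} \right)
= \sum_{k=1}^{r-1} k n_k \left( \frac{1}{\max\{k,3\}} - \frac{1}{r} \right) \\
&= \frac{n_1}{3} + \frac{2 n_2}{3} + \sum_{k=3}^{r-1} n_k - \frac{1}{r} \sum_{k=1}^{r-1} k n_k . \qquad (3.1)
\end{aligned}
\]

It follows that

\[
\begin{aligned}
|I| + |I'| &= \sum_{k=3}^{r} n_k + (n_1 + n_2) \le \sum_{k=1}^{r} n_k + \frac{n_2}{3} \\
&= \left( \frac{n_1}{3} + \frac{2 n_2}{3} + \sum_{k=3}^{r-1} n_k - \frac{1}{r} \sum_{k=1}^{r-1} k n_k \right) + \frac{1}{r} \sum_{k=1}^{r} k n_k + \frac{2 (n_1 + n_2)}{3} \\
&\le \left( \sum_{j=4}^{r} \frac{1}{j} + 1 + \frac{2}{3} \right) \mathrm{OPT} = \left( H(r) - \frac{1}{6} \right) \mathrm{OPT} ,
\end{aligned}
\]

where the second inequality is obtained by adding inequality (3.1), the inequality proved in Lemma 4 for j = r, and $\mathrm{OPT} \ge |I'| = n_1 + n_2$.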
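The chain of equalities leading to (3.1) is a telescoping argument, and it can be sanity-checked with exact arithmetic. The sketch below compares the double sum $\sum_{j=3}^{r-1} \frac{1}{j(j+1)} \sum_{k=1}^{j} k n_k$ with the closed form on the right-hand side of (3.1) for arbitrary non-negative integers $n_k$. The function names and the dictionary encoding of the values $n_k$ are assumptions made for this example.

```python
from fractions import Fraction


def telescoped_sum(n, r):
    """Sum over j = 3..r-1 of 1/(j(j+1)) * sum_{k=1}^{j} k * n[k]."""
    return sum(
        Fraction(1, j * (j + 1)) * sum(k * n[k] for k in range(1, j + 1))
        for j in range(3, r)
    )


def closed_form(n, r):
    """n_1/3 + 2 n_2/3 + sum_{k=3}^{r-1} n_k - (1/r) sum_{k=1}^{r-1} k n_k."""
    return (
        Fraction(n[1], 3)
        + Fraction(2 * n[2], 3)
        + sum(n[k] for k in range(3, r))
        - Fraction(1, r) * sum(k * n[k] for k in range(1, r))
    )
```

Both functions return the same `Fraction` for any choice of the counts, which is what the derivation asserts.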

Theorem 6. MinLST_r can be approximated within a factor of $H(r) - \frac{1}{6}$.

3.3 A tight example

To prove that the above analysis is tight, we present a MinLST_r instance (G, L) demonstrating that the suggested algorithm may construct a solution with cardinality $H(r) - \frac{1}{6}$ times the optimum. For 3 ≤ k ≤ r, let $G_k$ denote a graph formed by the union of two arbitrary edge-disjoint spanning trees on the same set of k + 1 vertices. The graph G is defined as follows:

1. For every 4 ≤ k ≤ r, there are $\frac{r!}{2k}$ isomorphic copies of $G_k$ whose vertex sets are pairwise-disjoint, except for a special vertex v that appears in all copies. We denote by $G_{k,j}$ the jth copy of $G_k$, and by $T'_{k,j}$ and $T''_{k,j}$ the pair of edge-disjoint spanning trees whose union is $G_{k,j}$. Note that both $T'_k = \bigcup_j T'_{k,j}$ and $T''_k = \bigcup_j T''_{k,j}$ are trees that span the set of vertices $\bigcup_j V(G_{k,j})$, each containing exactly $\frac{r!}{2}$ edges.

2. There are $\frac{r!}{3}$ isomorphic copies $G_{3,j}$ of $G_3$, whose only common vertex is v. As in the previous item, $T'_3 = \bigcup_j T'_{3,j}$ and $T''_3 = \bigcup_j T''_{3,j}$ are trees that span the vertices $\bigcup_j V(G_{3,j})$, with r! edges each.

3. Finally, there are $\frac{r!}{2}$ additional vertices, each of which is connected to v by an edge. We use S to denote the star induced by these vertices and v.

With respect to this graph, the labeling function L is defined by:

1. For every 4 ≤ k ≤ r and $1 \le j \le \frac{r!}{2k}$, all edges of the tree $T'_{k,j}$ are labeled by $\ell_{k,j}$. In addition, for every $1 \le j \le \frac{r!}{3}$, the edges of $T'_{3,j}$ are labeled by $\ell_{3,j}$.

2. On the remaining edges of G we spread the labels $\{\ell_j : 1 \le j \le \frac{r!}{2}\}$, such that each of them appears exactly r times: Once in S, twice in $T''_3$, and once in each of $T''_4, \ldots, T''_r$.

We observe that the algorithm is forced to pick all labels. More specifically, it begins step 2 by picking the labels in $\{\ell_{r,j} : 1 \le j \le \frac{r!}{2r}\}$, then those in $\{\ell_{r-1,j} : 1 \le j \le \frac{r!}{2(r-1)}\}$, and so on, until it picks $\{\ell_{4,j} : 1 \le j \le \frac{r!}{8}\}$ and $\{\ell_{3,j} : 1 \le j \le \frac{r!}{3}\}$. At this point in time, the contracted graph is simply S, implying that in step 4 there is a unique solution, which is the set of labels $\{\ell_j : 1 \le j \le \frac{r!}{2}\}$. It follows that the solution produced by the algorithm consists of $\sum_{k=4}^{r} \frac{r!}{2k} + \frac{r!}{3} + \frac{r!}{2} = (H(r) - \frac{1}{6}) \frac{r!}{2}$ labels, while the optimum is at most $\frac{r!}{2}$, since $\{\ell_j : 1 \le j \le \frac{r!}{2}\}$ is a feasible solution.
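The label count in the tight example can be verified with exact arithmetic. The sketch below, with illustrative function names, tallies the labels the algorithm is forced to pick and compares the ratio against $H(r) - \frac{1}{6}$; it also confirms that every group size $\frac{r!}{2k}$ is an integer.

```python
from fractions import Fraction
from math import factorial


def harmonic(r):
    """H(r) = 1 + 1/2 + ... + 1/r as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, r + 1))


def tight_example_counts(r):
    """Return (labels picked by the algorithm, upper bound on the optimum)."""
    f = factorial(r)
    groups = [Fraction(f, 2 * k) for k in range(4, r + 1)]  # labels l_{k,j}
    groups.append(Fraction(f, 3))                            # labels l_{3,j}
    assert all(g.denominator == 1 for g in groups)           # integer counts
    star_labels = Fraction(f, 2)                             # labels l_j
    return sum(groups) + star_labels, star_labels
```

For r = 4 this gives 23 labels against an optimum of at most 12, and 23/12 equals H(4) − 1/6.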

4 An O(√n) Approximation for MinLP

In what follows, we present an algorithm for the MinLP problem, achieving an approximation factor of O(√n). Throughout this section, we assume that the reader is familiar with the basics of budgeted maximum coverage given in Subsection 2.2.

The principal ideas that guide our algorithm are as follows. When s and t are distant enough, an optimal solution must traverse many vertices, a fact that establishes the existence of an inexpensive set of labels whose contraction significantly decreases the number of vertices. As demonstrated in the context of the budgeted covering algorithm, we can identify a label set possessing this property by employing a maximum coverage subroutine. In the opposite case, a shortest path connecting s and t constitutes a near-optimal solution, provided that its edges are not endowed with overly priced labels. These observations suggest a two-step approach: First, perform repeated contractions as long as s and t are distant, and then complete the solution by picking a shortest s-t path.

For this tactic to have a low order strongly-polynomial running time, we apply a technique that was originally proposed by Hassin [16] and enhanced by Lorenz and Raz [21]. In adherence to standard terminology, we define an α-test to be a procedure that, given a parameter ∆ ≥ 0, either constructs a feasible solution whose cost is at most α∆ or determines that OPT > ∆. The specifics of a $\frac{13}{3}\sqrt{n}$-test are provided in Figure 4, followed by a correctness proof.

1. I ← ∅, H ← G.

2. Eliminate from H all edges e with c(L(e)) > ∆.

3. While $\mathrm{dist}_H(s, t) \ge \sqrt{n}$

(a) Create a budgeted maximum coverage instance by: The ground set is V(H); for each label ℓ ∈ L(H) there is a corresponding subset $V_\ell \subseteq V(H)$, consisting of all endpoints of edges in $L^{-1}(\ell)$; the cost of $V_\ell$ is c(ℓ); and the budget is ∆.

(b) Approximate the instance defined above, to obtain a subset I′ ⊆ L(H).

(c) I ← I ∪ I′, H ← the contraction of $L^{-1}(I')$ in H.

4. If the number of iterations in step 3 was greater than $\frac{10}{3}\sqrt{n}$, return “OPT > ∆”.

5. Let P be a shortest s-t path in H. Return I ∪ L(P).

Figure 4: The MinLP test

Lemma 7. The procedure given in Figure 4 is a $\frac{13}{3}\sqrt{n}$-test.

Proof. We first claim that when OPT ≤ ∆, the number of iterations in step 3 is at most $\frac{10}{3}\sqrt{n}$, allowing us to safely report that OPT > ∆ at the detection of additional iterations. For this purpose, it is sufficient to show that the number of vertices in H decreases by at least $0.3\sqrt{n}$ whenever a label set is contracted.

Let I∗ ⊆ L(G) be an optimal solution, with $\sum_{\ell \in I^*} c(\ell) = \mathrm{OPT} \le \Delta$, and consider a single iteration. Since the edges $L^{-1}(I^* \cap L(H))$ survive step 2 and form a subgraph of H containing an s-t path, it follows that $\{V_\ell : \ell \in I^* \cap L(H)\}$ is a feasible solution to the budgeted maximum coverage instance defined in step 3a. Moreover, as the s-t distance in H is at least $\sqrt{n}$, the latter solution satisfies $|\bigcup_{\ell \in I^* \cap L(H)} V_\ell| \ge \sqrt{n}$. Consequently, for the current approximate solution I′ we must have $|\bigcup_{\ell \in I'} V_\ell| \ge (1 - \frac{1}{e})\sqrt{n}$, implying that the contraction of $L^{-1}(I')$ decreases the number of vertices by at least $\frac{1}{2}(1 - \frac{1}{e})\sqrt{n} > 0.3\sqrt{n}$.

We conclude the proof by considering the scenario in which the allowed number of iterations has not been exceeded. In this case, the cost of I is at most $\frac{10}{3}\sqrt{n}\,\Delta$, since in each iteration it was augmented with labels whose total cost is at most ∆. In addition, the cost of L(P) is bounded by $\sqrt{n}\,\Delta$, since P consists of at most $\sqrt{n}$ edges, all of which survived step 2. Together, these bounds give a total cost of at most $\frac{13}{3}\sqrt{n}\,\Delta$.

Now let $c_{st}$ be the minimum label cost for which the subgraph $(V, \{e : c(L(e)) \le c_{st}\})$ contains an s-t path. Clearly, $c_{st} \le \mathrm{OPT} \le |L(G)| c_{st}$. Given an accuracy requirement ε > 0, we conduct a binary search over $\{(1+\epsilon)^k c_{st} : 0 \le k \le \lceil \log_{1+\epsilon} |L(G)| \rceil\}$, involving $O(\log \log_{1+\epsilon} |L(G)|)$ calls to the $\frac{13}{3}\sqrt{n}$-test described above. As a consequence, we identify a value ∆ for which the test reports OPT > ∆, whereas for (1 + ε)∆ it successfully constructs a feasible solution. It follows that the cost of this solution is at most $(1 + \epsilon)\frac{13}{3}\sqrt{n} \cdot \mathrm{OPT}$.
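The binary search over candidate budgets can be sketched independently of the test itself. In the code below, `alpha_test` is a hypothetical callable standing for any α-test: it returns a solution when it succeeds and `None` when it reports OPT > ∆. The search maintains an index at which the test is known to succeed and stops when the preceding index (if any) is known to fail, which is the situation described in the text; it does not assume that the test behaves monotonically.

```python
import math


def approximate_with_test(alpha_test, c_st, num_labels, eps):
    """Binary search over the budgets (1 + eps)^k * c_st, 0 <= k <= k_max.

    alpha_test(delta) must return a solution of cost <= alpha * delta,
    or None to report OPT > delta. It is guaranteed to succeed whenever
    delta >= OPT, in particular at the largest budget."""
    k_max = 0
    if num_labels > 1:
        k_max = math.ceil(math.log(num_labels) / math.log(1 + eps))
    lo, hi = 0, k_max
    best = alpha_test(c_st * (1 + eps) ** hi)
    if best is None:
        raise ValueError("the test failed at an upper bound on OPT")
    # Invariant: the test succeeded at index hi, and failed at index
    # lo - 1 whenever lo > 0.
    while lo < hi:
        mid = (lo + hi) // 2
        solution = alpha_test(c_st * (1 + eps) ** mid)
        if solution is None:
            lo = mid + 1
        else:
            best, hi = solution, mid
    return best
```

If the loop ends at index 0, the budget is $c_{st} \le \mathrm{OPT}$; otherwise the test failed at the previous budget, so the returned solution costs at most (1 + ε)α · OPT in either case.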

Theorem 8. For any fixed ε > 0, MinLP can be approximated within a factor of $(1 + \epsilon)\frac{13}{3}\sqrt{n}$.

5 The Hardness of Approximating MinLP

The main result of this section is that MinLP cannot be approximated within any polylogarithmic factor unless P = NP. Prior to presenting this proof, we relate the approximability of MinLP_r with that of the Min-r-SAT problem.

5.1 MinLP_r and Min-r-SAT

The input to the minimum satisfiability problem (MinSAT) is a Boolean formula in conjunctive normal form, consisting of a collection $C = \{C_1, \ldots, C_m\}$ of clauses made up of literals from the set $X = \{x_1, \ldots, x_n\}$. The objective is to find a truth assignment to the variables that minimizes the number of satisfied clauses. We refer to the special case of this problem, in which each clause has at most r literals, as Min-r-SAT.

Marathe and Ravi [23] showed that MinSAT and vertex cover are equivalent with respect to approximability. Therefore, it is NP-hard to approximate the general MinSAT problem within any factor smaller than 1.3606 [12]. Having observed that this bound does not apply to Min-r-SAT for small values of r, Avidor and Zwick [5] provided a lower bound of $\frac{15}{14}$ for r = 2, and a bound of $\frac{7}{6}$ for all r ≥ 3. The next theorem extends these results to MinLP_r.

Theorem 9. For every r ≥ 2, MinLP_r is at least as hard to approximate as Min-r-SAT.

Proof. Given an instance (C, X) of Min-r-SAT, we show how to formulate it as a MinLP_r instance. For 1 ≤ j ≤ n, let $d_j$ and $\bar{d}_j$ be the number of clauses in which the literals $x_j$ and $\bar{x}_j$ appear, respectively. Without loss of generality, $d_j \ge 1$ and $\bar{d}_j \ge 1$, or otherwise the value of $x_j$ can be determined in advance. We define a MinLP_r instance (G, L, s, t) as follows:

1. The vertices of G are $v_1, \ldots, v_{n+1}$. In addition, for every 1 ≤ j ≤ n, we create two interior-disjoint paths, $P_j$ and $\bar{P}_j$, connecting $v_j$ and $v_{j+1}$. The length of $P_j$ is $d_j$, while that of $\bar{P}_j$ is $\bar{d}_j$.

2. We now spread the labels $\{\ell_1, \ldots, \ell_m\}$ on the edges of G. Specifically, let $C(x_j)$ and $C(\bar{x}_j)$ be the sets of clauses in C containing the literals $x_j$ and $\bar{x}_j$, respectively. Then, each edge of $P_j$ is given a distinct label from $\{\ell_i : C_i \in C(x_j)\}$, and similarly, the edges of $\bar{P}_j$ are given distinct labels from $\{\ell_i : C_i \in C(\bar{x}_j)\}$. Since each clause has at most r literals, the number of occurrences of each label is at most r.

3. Finally, we set $s = v_1$ and $t = v_{n+1}$.

We note that there is a one-to-one correspondence between truth assignments and s-t paths in G. First, suppose that f is a truth assignment that satisfies k clauses. Then the concatenation P of the paths $\{P_j : f(x_j) = \text{true}\}$ and $\{\bar{P}_j : f(x_j) = \text{false}\}$ forms an s-t path with |L(P)| = k. Conversely, suppose that P is an s-t path with |L(P)| = k. Then, as a result of setting each variable $x_j$ to true if and only if $P_j$ is a subpath of P, we obtain an assignment satisfying k clauses.
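The reduction can be made concrete with the sketch below, which builds the MinLP instance from a clause list and confirms by exhaustive search, on a small formula, that the minimum number of labels on an s-t path equals the minimum number of satisfied clauses. The encoding is an assumption of this example: clauses are lists of non-zero integers in the DIMACS style (j for $x_j$, −j for its negation), label i stands for clause $C_i$, and every literal is assumed to occur in at least one clause, matching the assumption $d_j, \bar{d}_j \ge 1$. Both solvers are brute force and intended for toy inputs only.

```python
from itertools import product


def minsat_to_minlp(clauses, num_vars):
    """Return (edges, s, t), where each edge is (u, v, label)."""
    edges, fresh = [], 0
    for j in range(1, num_vars + 1):
        for literal in (j, -j):               # paths P_j and \bar{P}_j
            labels = [i for i, c in enumerate(clauses) if literal in c]
            prev = ('v', j)
            for pos, lab in enumerate(labels):
                last = pos == len(labels) - 1
                nxt = ('v', j + 1) if last else ('inner', fresh)
                fresh += 1
                edges.append((prev, nxt, lab))
                prev = nxt
    return edges, ('v', 1), ('v', num_vars + 1)


def min_label_path(edges, s, t):
    """Minimum number of distinct labels over all simple s-t paths."""
    adj = {}
    for u, v, lab in edges:
        adj.setdefault(u, []).append((v, lab))
        adj.setdefault(v, []).append((u, lab))
    best = [None]

    def dfs(u, visited, labels):
        if u == t:
            if best[0] is None or len(labels) < best[0]:
                best[0] = len(labels)
            return
        for w, lab in adj.get(u, []):
            if w not in visited:
                dfs(w, visited | {w}, labels | {lab})

    dfs(s, {s}, frozenset())
    return best[0]


def min_sat(clauses, num_vars):
    """Minimum number of satisfied clauses over all truth assignments."""
    best = len(clauses)
    for bits in product([False, True], repeat=num_vars):
        satisfied = sum(
            any((lit > 0) == bits[abs(lit) - 1] for lit in c) for c in clauses
        )
        best = min(best, satisfied)
    return best
```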

5.2 Inapproximability within any polylogarithmic factor

In what follows, we prove that it is NP-hard to approximate MinLP within a factor of $O(\log^k n)$, for any fixed k ≥ 1. To simplify the presentation, our proof is decomposed into three stages. First, we provide a logarithmic lower bound on the approximability of MinLP by relating it to a subproblem of set cover. Then, we define a new graph operation, called label squaring, and use it to derive a self-improvability property. Finally, we establish the main result by exploiting this property and additional structure common to instances obtained from the reduction described in the first stage.

Lemma 10. There exists a constant c > 0, such that a polynomial time algorithm approximating MinLP within a factor of c ln n implies P = NP.

Proof. By plugging the proof system of Raz and Safra [24] (or alternatively, Arora and Sudan [4]) into the reduction of Bellare, Goldwasser, Lund and Russell [6], the former authors showed that set cover is NP-hard to approximate within a factor of O(log n). In other words, there is a constant c′ > 0 such that approximating set cover in polynomial time within a factor of c′ ln n implies P = NP. This result also applies to instances (U, S) with $|U| > |S|^{1/q}$, for some constant q ≥ 1, since the above-mentioned construction guarantees that |U| and |S| are polynomially related [3]. We refer to this special case as MinSC′.

Given a MinSC′ instance, consisting of a ground set $U = \{e_1, \ldots, e_n\}$ and a family of subsets $\mathcal{S} = \{S_1, \ldots, S_m\} \subseteq 2^U$, we define an instance $(G, L, s, t)$ of MinLP as follows:

1. The vertices of G are $v_0, \ldots, v_n$. In addition, for each element $e_j \in U$ we create a gadget $G(e_j)$ by connecting $v_{j-1}$ and $v_j$ to the upper rung of a ladder graph. More precisely, if $e_j$ belongs to p subsets in $\mathcal{S}$, we put together a ladder whose rungs are $(a_j^1, b_j^1), \ldots, (a_j^p, b_j^p)$, adding the edges $(v_{j-1}, a_j^1)$ and $(v_j, b_j^1)$. This configuration is illustrated in Figure 5.

Figure 5: The gadget $G(e_j)$

2. We now spread the labels $\{\ell_0, \ell_1, \ldots, \ell_m\}$ on the edges of G. Using the notation of item 1, each of the p rungs is given a distinct label from $\{\ell_i : e_j \in S_i\}$, whereas all other edges are labeled $\ell_0$.

3. We set $s = v_0$ and $t = v_n$.
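The three steps above can be sketched in code. The following Python sketch builds the MinLP instance from a small set cover instance and finds its optimum by brute force. The function names (`build_minlp_instance`, `connects`, `min_label_count`), the tuple encoding of vertices, and the integer encoding of labels (0 for $\ell_0$, i for $\ell_i$) are illustrative assumptions, not part of the paper. The sketch also assumes that the side rails of each ladder and the two connecting edges carry the label $\ell_0$, and that every element belongs to at least one subset.

```python
from itertools import combinations


def build_minlp_instance(n, subsets):
    """Return (edges, s, t) for elements 1..n and subsets S_1..S_m.

    Each edge is a triple (u, v, label); label 0 stands for l_0 and
    label i stands for l_i. Assumes every element lies in some subset.
    """
    edges = []
    for j in range(1, n + 1):
        # Indices i of the subsets S_i that contain element e_j.
        containing = [i for i, S in enumerate(subsets, start=1) if j in S]
        p = len(containing)
        a = [("a", j, r) for r in range(p)]
        b = [("b", j, r) for r in range(p)]
        # Attach v_{j-1} and v_j to the upper rung (assumed label l_0).
        edges.append((("v", j - 1), a[0], 0))
        edges.append((("v", j), b[0], 0))
        for r in range(p):
            # Rung r gets a distinct label from {l_i : e_j in S_i}.
            edges.append((a[r], b[r], containing[r]))
            if r + 1 < p:
                # Side rails of the ladder (assumed label l_0).
                edges.append((a[r], a[r + 1], 0))
                edges.append((b[r], b[r + 1], 0))
    return edges, ("v", 0), ("v", n)


def connects(edges, labels, s, t):
    """Check whether the edges with labels in `labels` join s and t."""
    adj = {}
    for u, v, lab in edges:
        if lab in labels:
            adj.setdefault(u, []).append(v)
            adj.setdefault(v, []).append(u)
    seen, stack = {s}, [s]
    while stack:
        x = stack.pop()
        if x == t:
            return True
        for y in adj.get(x, []):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False


def min_label_count(edges, s, t):
    """Brute force: fewest labels whose edges contain an s-t path."""
    all_labels = list({lab for _, _, lab in edges})
    for size in range(1, len(all_labels) + 1):
        for chosen in combinations(all_labels, size):
            if connects(edges, set(chosen), s, t):
                return size
    return None


edges, s, t = build_minlp_instance(4, [{1, 2}, {2, 3}, {3, 4}, {1, 4}])
print(min_label_count(edges, s, t))  # labels on a cheapest s-t path
```

On this toy instance a minimum cover uses two subsets, and the search reports three labels: the two subset labels plus $\ell_0$. The brute-force search is exponential in the number of labels and is meant only for checking tiny instances.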

At this point, it is imperative to remark that since $n > m^{1/q}$, the above construction ensures that the overall number of vertices satisfies

$$|V(G)| \leq n + 1 + 2nm \leq n + 1 + 2n^{q+1} \leq 4n^{q+1} .$$

Now let $c = \frac{c'}{4(q+1)}$, and suppose that MinLP can be approximated in polynomial time within a factor of $c \ln |V(G)|$. We show that this assumption leads to an approximation factor of at most $c' \ln n$ for MinSC′, implying P = NP. To this end, let $\mathcal{S}^* \subseteq \mathcal{S}$ be an optimal solution to the instance $(U, \mathcal{S})$. As all elements of U are covered by $\mathcal{S}^*$, the label of at least one rung in each of the n ladders belongs to $\{\ell_i : S_i \in \mathcal{S}^*\}$, and by augmenting this label set with $\ell_0$ we obtain a subgraph of G that contains an s-t path. Therefore, the number of labels in an optimal solution to the new MinLP instance is at most $|\mathcal{S}^*| + 1$. It follows that we can find in polynomial time an s-t path P satisfying

$$|L(P)| \leq \frac{c'}{4(q+1)} \ln |V(G)| \cdot (|\mathcal{S}^*| + 1) \leq \frac{c'}{4(q+1)} \ln (4n^{q+1}) \cdot 2|\mathcal{S}^*| \leq \frac{c'}{2} \ln (4n) \cdot |\mathcal{S}^*| \leq c' \ln n \cdot |\mathcal{S}^*| .$$

The second inequality holds since $|V(G)| \leq 4n^{q+1}$, and the last inequality follows from observing that we can assume without loss of generality that $n \geq 4$, so $\ln(4n) \leq 2 \ln n$. It is not difficult to verify that, as the path P necessarily traverses $\ell_0$-labeled edges, $\{S_i \in \mathcal{S} : \ell_i \in L(P)\}$ is a cover of U with cardinality at most $|L(P)| - 1 \leq c' \ln n \cdot |\mathcal{S}^*|$.

Remark 11. The reduction described in Lemma 10 produces MinLP instances in which the underlying graph is planar. Therefore, the result stated in Lemma 10 also applies to this restricted class of instances, an observation that will be crucial for the remainder of this section.

Given a MinLP instance $I = (G, L, s, t)$, its label squaring $I^2 = (G^2, L^2, s^2, t^2)$ is a new instance defined as follows. To assemble the graph $G^2$, we first construct a distinct copy $G_e$ of G for each original edge $e \in E(G)$. Letting $s_e$ and $t_e$ denote the vertices of $G_e$ that correspond to s and t, we arbitrarily assign $s_e$ and $t_e$ to different endpoints of e. Then, for each $v \in V(G)$, the vertices assigned to v are unified, over all copies, to a single vertex $v^2$. Using this notation, the source and destination are $s^2$ and $t^2$, respectively. Finally, the new set of labels is $L(G) \times L(G)$, where the edge of $G_e$ corresponding to $f \in E(G)$ is given the label $(L(e), L(f))$.
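The label squaring operation can be sketched as follows. This Python sketch is an illustration under stated assumptions: the graph is given as a list of `(u, v, label)` triples (parallel edges allowed, no self-loops), s and t are distinct, and $s_e$ is always assigned to the first listed endpoint of e, which is one of the arbitrary choices the definition permits. The names `label_squaring` and `min_label_count` and the `("orig", ...)` / `("copy", ...)` vertex encoding are hypothetical.

```python
from itertools import combinations


def label_squaring(vertices, edges, s, t):
    """Return (vertices2, edges2, s2, t2) for the squared instance."""
    new_vertices, new_edges = set(), []
    for idx, (u, v, lab_e) in enumerate(edges):
        # Copy G_e: internal vertices stay private to this copy.
        mapping = {x: ("copy", idx, x) for x in vertices}
        # s_e is unified with endpoint u, and t_e with endpoint v.
        mapping[s] = ("orig", u)
        mapping[t] = ("orig", v)
        new_vertices.update(mapping.values())
        for x, y, lab_f in edges:
            # The edge of G_e corresponding to f gets label (L(e), L(f)).
            new_edges.append((mapping[x], mapping[y], (lab_e, lab_f)))
    return new_vertices, new_edges, ("orig", s), ("orig", t)


def connects(edges, labels, s, t):
    """Check whether the edges with labels in `labels` join s and t."""
    adj = {}
    for u, v, lab in edges:
        if lab in labels:
            adj.setdefault(u, []).append(v)
            adj.setdefault(v, []).append(u)
    seen, stack = {s}, [s]
    while stack:
        x = stack.pop()
        if x == t:
            return True
        for y in adj.get(x, []):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False


def min_label_count(edges, s, t):
    """Brute force: fewest labels whose edges contain an s-t path."""
    all_labels = list({lab for _, _, lab in edges})
    for size in range(1, len(all_labels) + 1):
        for chosen in combinations(all_labels, size):
            if connects(edges, set(chosen), s, t):
                return size
    return None


V = ["s", "v", "t"]
E = [("s", "v", "a"), ("v", "t", "b"), ("v", "t", "c")]
V2, E2, s2, t2 = label_squaring(V, E, "s", "t")
print(len(V2), len(E2))
print(min_label_count(E, "s", "t"), min_label_count(E2, s2, t2))
```

In this toy instance every s-t path uses two labels, and every $s^2$-$t^2$ path in the squared instance uses four, so the two optima are related exactly by squaring. The squared graph has $|E(G)|^2$ edges, which is why the number of squaring rounds must stay constant for the construction to remain polynomial.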

Lemma 12. $\mathrm{OPT}(I^2) \leq \mathrm{OPT}^2(I)$.

Proof. Let $P^*$ be an optimal solution to I, that is, an s-t path in G satisfying $|L(P^*)| = \mathrm{OPT}(I)$. Then, by picking a copy of $P^*$ in each $G_e$, $e \in E(P^*)$, we obtain an $s^2$-$t^2$ path in $G^2$ whose set of labels is $L(P^*) \times L(P^*)$. It follows that $\mathrm{OPT}(I^2) \leq |L(P^*) \times L(P^*)| = \mathrm{OPT}^2(I)$.

Lemma 13. There is a polynomial-time algorithm that, given an $s^2$-$t^2$ path $P^2$ in $G^2$, finds an s-t path P in G satisfying $|L(P)| \leq |L^2(P^2)|^{1/2}$.


Figure 6: A label squaring example (an instance I with labels $\ell_1, \ell_2, \ell_3$ and its squaring $I^2$, whose labels are the pairs $(\ell_i, \ell_j)$)

Proof. Let $E' \subseteq E(G)$ be the set of edges e for which $P^2$ traverses at least one edge of $G_e$. By construction of $G^2$, there is an s-t path $P_0$ in G whose edges are a subset of $E'$, implying that $|L(P_0)| \leq |L(E')|$. In addition, since $P^2 \cap G_e$ is a copy of an s-t path whenever $e \in E'$, for each $\ell \in L(E')$ we can find an s-t path $P_\ell$ in G such that

$$|L(P_\ell)| \leq \min_{e \in E' : L(e) = \ell} |L^2(P^2 \cap G_e)| .$$

Now let P be the path that minimizes $|L(P)|$ over all paths in $\{P_0\} \cup \{P_\ell : \ell \in L(E')\}$. Then,

$$|L(P)| \leq \min \Big\{ |L(P_0)|, \min_{\ell \in L(E')} |L(P_\ell)| \Big\} \leq \Big( |L(P_0)| \cdot \min_{\ell \in L(E')} |L(P_\ell)| \Big)^{1/2} \leq \Big( |L(E')| \cdot \frac{1}{|L(E')|} \sum_{\ell \in L(E')} |L(P_\ell)| \Big)^{1/2} \leq \Big( \sum_{\ell \in L(E')} \min_{e \in E' : L(e) = \ell} |L^2(P^2 \cap G_e)| \Big)^{1/2} \leq |L^2(P^2)|^{1/2} .$$
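The chain of inequalities above compresses several elementary steps. One way to read it, offered as a sketch in the notation of the proof, is given below. The symbol $e_\ell$ is shorthand introduced here for an edge attaining the minimum for label $\ell$; it does not appear in the original argument.

```latex
\begin{align*}
&\text{(i)}   && \min\{a, b\} \leq \sqrt{ab} \quad \text{for } a, b \geq 0,
                 \text{ with } a = |L(P_0)|,\ b = \min_{\ell \in L(E')} |L(P_\ell)|; \\
&\text{(ii)}  && |L(P_0)| \leq |L(E')| \quad \text{and} \quad
                 \min_{\ell \in L(E')} |L(P_\ell)| \leq
                 \frac{1}{|L(E')|} \sum_{\ell \in L(E')} |L(P_\ell)|
                 \quad \text{(minimum is at most the average)}; \\
&\text{(iii)} && |L(P_\ell)| \leq \min_{e \in E' : L(e) = \ell} |L^2(P^2 \cap G_e)|
                 = |L^2(P^2 \cap G_{e_\ell})|
                 \quad \text{(choice of } P_\ell\text{)}; \\
&\text{(iv)}  && \sum_{\ell \in L(E')} |L^2(P^2 \cap G_{e_\ell})| \leq |L^2(P^2)| .
\end{align*}
```

Step (iv) holds because every label used inside $G_{e_\ell}$ has first coordinate $\ell$, so the sets $L^2(P^2 \cap G_{e_\ell})$ are pairwise disjoint subsets of $L^2(P^2)$ as $\ell$ ranges over $L(E')$.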

Theorem 14. For any fixed $k \geq 1$, MinLP cannot be approximated in polynomial time within a factor of $O(\log^k n)$ unless P = NP.

Proof. As previously explained in Remark 11, the result stated in Lemma 10 also applies to instances $I = (G, L, s, t)$ in which G is an n-vertex planar graph. Now suppose that there exists a polynomial-time algorithm A whose approximation factor for such instances is $\alpha(n) \leq c_k \ln^k n$, for some $c_k > 0$. We show that this algorithm can utilize the label squaring operation to obtain a self-improvability property, as a result of which we derive an approximation factor smaller than $c \ln n$ for planar MinLP, where c is the constant mentioned in Lemma 10.

We assume that n is sufficiently large so that $\ln^{1/2}(3n) \leq \frac{c}{4} \ln n$, and let $q = q(k, c_k)$ be the smallest integer satisfying $c_k^{2^{-q}} \leq 2$, $2^{2^{-q} q k} \leq 2$ and $2^{-q} k \leq \frac{1}{2}$; such an integer exists since $c_k^{2^{-q}} \to 1$, $2^{2^{-q} q k} \to 1$, and $2^{-q} k \to 0$ as q tends to infinity. Starting with a planar instance I, we repeatedly apply the label squaring operation q times, to obtain $I^{2^q} = (G^{2^q}, L^{2^q}, s^{2^q}, t^{2^q})$. We then employ the algorithm A to find an approximate $s^{2^q}$-$t^{2^q}$ path in $G^{2^q}$, and make use of Lemmas 12 and 13 to obtain an s-t path P in G such that

$$|L(P)| \leq \Big( \alpha\big(|V(G^{2^q})|\big) \cdot \mathrm{OPT}\big(I^{2^q}\big) \Big)^{2^{-q}} \leq \alpha^{2^{-q}}\big(|V(G^{2^q})|\big) \cdot \mathrm{OPT}(I) .$$

To bound the approximation guarantee $\alpha^{2^{-q}}(|V(G^{2^q})|)$ in terms of n and c, we first claim that the number of vertices in $G^{2^q}$ is at most $(3n)^{2^q}$. For this purpose, it can be easily verified that the label squaring operation preserves planarity, implying that the instances $I^{2^j}$ we obtain throughout the sequence are planar, and in particular $|E(G^{2^j})| \leq 3|V(G^{2^j})| - 6$. By combining this property with the observation that $|V(G^{2^{j+1}})| \leq |V(G^{2^j})| \cdot |E(G^{2^j})|$, we can inductively prove that $|E(G^{2^j})| \leq (3n)^{2^j}$ and $|V(G^{2^j})| \leq 3^{2^j - 1} n^{2^j}$, with room to spare. It follows that the approximation factor we derive is at most

$$\alpha^{2^{-q}}\big(|V(G^{2^q})|\big) \leq c_k^{2^{-q}} \ln^{2^{-q} k}\big((3n)^{2^q}\big) = c_k^{2^{-q}} \, 2^{2^{-q} q k} \ln^{2^{-q} k}(3n) \leq 4 \ln^{1/2}(3n) \leq c \ln n .$$
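To make the choice of q concrete, the following Python sketch computes the smallest q meeting the three conditions and evaluates the resulting factor bound for sample values. It is only an illustration: the function names `smallest_q` and `improved_factor_bound` are hypothetical, the search is assumed to start at q = 0, and each condition is rewritten in an equivalent form that avoids fractional exponents ($c_k^{2^{-q}} \leq 2$ becomes $\log_2 c_k \leq 2^q$, $2^{2^{-q} q k} \leq 2$ becomes $qk \leq 2^q$, and $2^{-q} k \leq 1/2$ becomes $2k \leq 2^q$).

```python
import math


def smallest_q(k, c_k):
    """Smallest q >= 0 with c_k^(2^-q) <= 2, 2^(2^-q * q * k) <= 2
    and 2^-q * k <= 1/2, each tested in an equivalent form."""
    q = 0
    while not (math.log2(c_k) <= 2 ** q
               and q * k <= 2 ** q
               and 2 * k <= 2 ** q):
        q += 1
    return q


def improved_factor_bound(n, k, c_k):
    """Evaluate c_k^(2^-q) * 2^(2^-q * q * k) * ln^(2^-q * k)(3n)."""
    q = smallest_q(k, c_k)
    e = 2.0 ** (-q)
    return (c_k ** e) * (2.0 ** (e * q * k)) * math.log(3 * n) ** (e * k)


for k, c_k in [(1, 1.0), (2, 1.0), (3, 1.0), (1, 16.0)]:
    q = smallest_q(k, c_k)
    bound = improved_factor_bound(10 ** 6, k, c_k)
    limit = 4 * math.sqrt(math.log(3 * 10 ** 6))
    print(k, c_k, q, bound <= limit)
```

Since q depends only on k and $c_k$, the $2^q$-fold squared instance has size polynomial in n, which is what keeps the overall procedure polynomial.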

6 Concluding Remarks

Non-uniform bound for MinLST. The algorithms presented in Section 2 approximate weighted MinLST within a factor of $O(\log n)$. Even though our analysis is best possible for the general problem, it does not take into account the maximum number of times a label may appear, and guarantees a uniform bound for all instances. Therefore, it would be of interest to investigate whether MinLSTr admits an improved r-dependent bound. We remark that neither the proof technique described in Section 3 nor the one suggested by Xiong et al. [28] seems to be extendible beyond the unweighted case.

Improved factor for MinLP. An obvious open question for future research is whether the approximation guarantee of $O(\sqrt{n})$ can be improved. We observed that a number of alternative procedures may substitute for the MinLP test proposed in Section 4, some of which are considerably more efficient. Unfortunately, none of these procedures turned out to be an $o(\sqrt{n})$-test. Nevertheless, in the unweighted case we can obtain a slightly better bound, in which the exponent of n is decreased by a multiplicative factor of $1 - \Omega(\log \mathrm{OPT} / \log n)$.

Labeled Steiner forest. The algorithm described in Section 4 can be modified to approximate the minimum label Steiner forest problem. In this generalization of MinLP, we are given a collection $\{s_1, t_1\}, \ldots, \{s_k, t_k\}$ of distinct pairs of vertices, and the objective is to find a minimum cost subset of labels that induces a subgraph containing an $s_i$-$t_i$ path for every $1 \leq i \leq k$. The general idea is that although we cannot compute in steps 3 and 5 the minimum number of edges required to connect all input pairs, a constant-factor approximation for this problem [2, 14] is sufficient.

Hardness of planar MinLST. By following the proof of Lemma 10, it is not difficult to verify that the reduction we present holds for MinLST as well. Therefore, planar instances of the latter problem are NP-hard to approximate within a factor of $c \log n$, for some constant $c > 0$. This lower bound settles the approximability of planar MinLST, which was posed as an open problem by Brüggemann et al. [9].


References

[1] A. A. Ageev and M. Sviridenko. Pipage rounding: A new method of constructing algorithms with proven performance guarantee. Journal of Combinatorial Optimization, 8(3):307–328, 2004.

[2] A. Agrawal, P. N. Klein, and R. Ravi. When trees collide: An approximation algorithm for the generalized Steiner problem on networks. SIAM Journal on Computing, 24(3):440–456, 1995.

[3] S. Arora. Personal communication, November 2005.

[4] S. Arora and M. Sudan. Improved low-degree testing and its applications. Combinatorica, 23(3):365–426, 2003.

[5] A. Avidor and U. Zwick. Approximating MIN 2-SAT and MIN 3-SAT. Theory of Computing Systems, 38(3):329–345, 2005.

[6] M. Bellare, S. Goldwasser, C. Lund, and A. Russell. Efficient probabilistically checkable proofs and applications to approximations. In Proceedings of the 25th Annual ACM Symposium on Theory of Computing, pages 294–304, 1993.

[7] H. Broersma and X. Li. Spanning trees with many or few colors in edge-colored graphs. Discussiones Mathematicae Graph Theory, 17(2):259–269, 1997.

[8] H. Broersma, X. Li, G. Woeginger, and S. Zhang. Paths and cycles in colored graphs. Australasian Journal on Combinatorics, 31:299–311, 2005.

[9] T. Brüggemann, J. Monnot, and G. J. Woeginger. Local search for the minimum label spanning tree problem with bounded color classes. Operations Research Letters, 31(3):195–201, 2003.

[10] R. D. Carr, S. Doddi, G. Konjevod, and M. V. Marathe. On the red-blue set cover problem. In Proceedings of the 11th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 345–353, 2000.

[11] R.-S. Chang and S.-J. Leu. The minimum labeling spanning trees. Information Processing Letters, 63(5):277–282, 1997.

[12] I. Dinur and S. Safra. On the hardness of approximating minimum vertex cover. Annals of Mathematics, 162(1):439–486, 2005.

[13] H. N. Gabow and M. F. M. Stallmann. Efficient algorithms for graphic intersection and parity. In Proceedings of the 12th International Colloquium on Automata, Languages and Programming, pages 210–220, 1985.

[14] M. X. Goemans and D. P. Williamson. A general approximation technique for constrained forest problems. SIAM Journal on Computing, 24(2):296–317, 1995.

[15] O. Goldschmidt, D. S. Hochbaum, and G. Yu. A modified greedy heuristic for the set covering problem with improved worst case bound. Information Processing Letters, 48(6):305–310, 1993.

[16] R. Hassin. Approximation schemes for the restricted shortest path problem. Mathematics of Operations Research, 17(1):36–42, 1992.


[17] D. S. Johnson. Approximation algorithms for combinatorial problems. Journal of Computer and System Sciences, 9(3):256–278, 1974.

[18] D. R. Karger, R. Motwani, and G. D. S. Ramkumar. On approximating the longest path in a graph. Algorithmica, 18(1):82–98, 1997.

[19] S. Khuller, A. Moss, and J. Naor. The budgeted maximum coverage problem. Information Processing Letters, 70(1):39–45, 1999.

[20] S. O. Krumke and H.-C. Wirth. On the minimum label spanning tree problem. Information Processing Letters, 66(2):81–85, 1998.

[21] D. H. Lorenz and D. Raz. A simple efficient approximation scheme for the restricted shortest path problem. Operations Research Letters, 28(5):213–219, 2001.

[22] L. Lovász. On the ratio of optimal integral and fractional covers. Discrete Mathematics, 13:383–390, 1975.

[23] M. V. Marathe and S. S. Ravi. On approximation algorithms for the minimum satisfiability problem. Information Processing Letters, 58(1):23–29, 1996.

[24] R. Raz and S. Safra. A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP. In Proceedings of the 29th Annual ACM Symposium on Theory of Computing, pages 475–484, 1997.

[25] A. Srinivasan. Distributions on level-sets with applications to approximation algorithms. In Proceedings of the 42nd Annual Symposium on Foundations of Computer Science, pages 588–597, 2001.

[26] Y. Wan, G. Chen, and Y. Xu. A note on the minimum label spanning tree. Information Processing Letters, 84(2):99–101, 2002.

[27] H.-C. Wirth. Multicriteria Approximation of Network Design and Network Upgrade Problems. PhD thesis, Department of Computer Science, Würzburg University, 2001.

[28] Y. Xiong, B. Golden, and E. Wasil. Worst-case behavior of the MVCA heuristic for the minimum labeling spanning tree problem. Operations Research Letters, 33(1):77–80, 2005.