Let T(i; j_1, j_2, ..., j_k) be the cost of an optimal tour from city i to city 1 that goes through each of the cities j_1, j_2, ..., j_k exactly once, in any order, and through no other cities.

The principle of optimality tells us that

$$T(i; j_1, j_2, \ldots, j_k) = \min_{1 \le m \le k} \Bigl[\, C_{i,j_m} + T(j_m; j_1, j_2, \ldots, j_{m-1}, j_{m+1}, \ldots, j_k) \,\Bigr],$$

where, by definition,

$$T(i; j) = C_{i,j} + C_{j,1}.$$

We can write a function T that directly implements the above recursive definition, but as in the optimal search tree problem, many subproblems would be solved repeatedly, leading to an algorithm requiring time Θ(n!). By caching the values T(i; j_1, j_2, ..., j_k), we reduce the time required to Θ(n²2ⁿ), still exponential, but considerably less than without caching.
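
As an illustration, here is a minimal Python sketch of the cached recursion; the function name tour_cost is our own, and we number the cities from 0, so city 0 plays the role of city 1 above.

```python
from functools import lru_cache

def tour_cost(C):
    """C[i][j] is the cost of traveling from city i to city j.
    Returns the cost of an optimal tour starting and ending at city 0."""
    n = len(C)

    @lru_cache(maxsize=None)          # the cache of values T(i; S)
    def T(i, S):
        # Cheapest way to leave city i, visit every city in S exactly
        # once (in some order), and end at city 0.
        if not S:
            return C[i][0]            # nothing left to visit: return home
        return min(C[i][j] + T(j, S - {j}) for j in S)

    return T(0, frozenset(range(1, n)))
```

There are at most n·2ⁿ distinct pairs (i, S), and each is evaluated once with an O(n) minimum, which matches the Θ(n²2ⁿ) bound.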

1.5 Greedy Heuristics

Optimization problems always have an objective function to be minimized or maximized, but it is not often clear what steps to take to reach the optimum value. For example, in the optimum binary search tree problem of the previous section, we used dynamic programming to examine systematically all possible trees; but perhaps there is a simple rule that leads directly to the best tree, say by choosing the largest β_i to be the root and then continuing recursively. Such an approach would be less time-consuming than the Θ(n³) algorithm we gave, but it does not necessarily give an optimum tree; if we follow the rule of choosing the largest β_i to be the root, we get trees that are no better, on the average, than randomly chosen trees (for instance, with three keys of frequencies 10, 9, and 9, rooting at the first key costs 1×10 + 2×9 + 3×9 = 55, while rooting at the middle key costs only 1×9 + 2×10 + 2×9 = 47). The problem with such an approach is that it makes decisions that are locally optimum, though perhaps not globally optimum. But such a “greedy” sequence of locally optimum choices does lead to a globally optimum solution in some circumstances.

Suppose, for example, β_i = 0 for 1 ≤ i ≤ n, and we remove the lexicographic requirement of the tree; the resulting problem is the determination of an optimal prefix code for n+1 letters with frequencies α_0, α_1, ..., α_n. Because we have removed the lexicographic restriction, the dynamic programming solution of the previous section no longer works, but the following simple greedy strategy yields an optimum tree: repeatedly combine the two lowest-frequency items as the left and right subtrees of a newly created item whose frequency is the sum of the two frequencies combined. Here is an example of this construction.

We start with six leaves with weights

α_0 = 25, α_1 = 34, α_2 = 38, α_3 = 58, α_4 = 95, α_5 = 20.

First, combine leaves α_0 = 25 and α_5 = 20 into a subtree of frequency 25 + 20 = 45:

[Figure: the forest now contains the subtree 45 (leaves α_0 = 25 and α_5 = 20) and the leaves α_1 = 34, α_2 = 38, α_3 = 58, α_4 = 95.]

Then combine leaves α_1 = 34 and α_2 = 38 into a subtree of frequency 34 + 38 = 72:

[Figure: the forest now contains the subtrees 45 (α_0, α_5) and 72 (α_1, α_2) and the leaves α_3 = 58, α_4 = 95.]

Next, combine the subtree of frequency α_0 + α_5 = 45 with α_3 = 58:

[Figure: the forest now contains the subtree 45 + 58 = 103 (subtree 45 and leaf α_3), the subtree 72 (α_1, α_2), and the leaf α_4 = 95.]

Then, combine the subtree of frequency α_1 + α_2 = 72 with α_4 = 95:

[Figure: the forest now contains the subtree 103 (45 and α_3) and the subtree 72 + 95 = 167 (subtree 72 and leaf α_4).]

Finally, combine the only two remaining subtrees:

[Figure: the completed tree of frequency 103 + 167 = 270, with left subtree 103 (45 (α_0, α_5) and α_3) and right subtree 167 (72 (α_1, α_2) and α_4).]

How do we know that the above-outlined process leads to an optimum tree? The key to proving that the tree is optimum is to assume, by way of contradiction, that it is not optimum. In this case, the greedy strategy must have erred in one of its choices, so let us look at the first error this strategy made. Since all previous greedy choices were not errors, and hence lead to an optimum tree, we can assume that we have a sequence of frequencies α_0, α_1, ..., α_n such that the first greedy choice is erroneous; without loss of generality, assume that α_0 and α_1 are the two smallest frequencies, those combined erroneously by the greedy strategy. For this combination to be erroneous, there must be no optimum tree in which these two αs are siblings, so consider an optimum tree, the locations of α_0 and α_1 in it, and the locations of the two deepest leaves in the tree, α_i and α_j:

[Figure: an optimum tree with α_0 and α_1 at arbitrary positions and the two deepest leaves α_i and α_j as siblings; arrows indicate interchanging α_0 with α_i and α_1 with α_j.]

By interchanging the positions of α_0 and α_i, and of α_1 and α_j (as shown), we obtain a tree in which α_0 and α_1 are siblings. Because α_0 and α_1 are the two lowest frequencies (they were the greedy algorithm’s choice), α_0 ≤ α_i and α_1 ≤ α_j; and because α_i and α_j are the two deepest leaves, level(α_i) ≥ level(α_0) and level(α_j) ≥ level(α_1). Hence the interchange changes the weighted path length by

$$(\alpha_0 - \alpha_i)\bigl[\mathrm{level}(\alpha_i) - \mathrm{level}(\alpha_0)\bigr] + (\alpha_1 - \alpha_j)\bigl[\mathrm{level}(\alpha_j) - \mathrm{level}(\alpha_1)\bigr] \le 0,$$

so the weighted path length of the modified tree is no larger than before the modification.

In other words, the first so-called mistake of the greedy algorithm was in fact not a mistake, since there is an optimum tree in which α_0 and α_1 are siblings. Thus we conclude that the greedy algorithm never makes a first mistake; that is, it never makes a mistake at all!

The greedy algorithm above is called Huffman’s algorithm. If the subtrees are kept on a priority queue ordered by cumulative frequency, the algorithm needs to insert the n+1 leaf frequencies onto the queue, and then repeatedly remove the two least elements on the queue, unite those two elements into a single subtree, and put that subtree back on the queue. This process continues until the queue contains a single item, the optimum tree. Reasonable implementations of priority queues yield O(n log n) implementations of Huffman’s greedy algorithm.
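
Here is a sketch of that process in Python, using the standard heapq module (a binary heap) as the priority queue; a binary heap suffices for the O(n log n) bound. The representation of trees as nested pairs is our own choice.

```python
import heapq
from itertools import count

def huffman(freqs):
    """freqs: the n+1 leaf frequencies. Returns the optimum prefix-code
    tree as nested pairs; a leaf is represented by its index in freqs."""
    tiebreak = count()  # breaks frequency ties so tuples never compare trees
    queue = [(f, next(tiebreak), leaf) for leaf, f in enumerate(freqs)]
    heapq.heapify(queue)                  # insert the n+1 leaf frequencies
    while len(queue) > 1:
        f1, _, t1 = heapq.heappop(queue)  # remove the two least elements,
        f2, _, t2 = heapq.heappop(queue)  # unite them into a single subtree,
        heapq.heappush(queue, (f1 + f2, next(tiebreak), (t1, t2)))  # put back
    return queue[0][2]                    # the single remaining item

# The worked example above, with the frequencies of alpha_0 through alpha_5:
print(huffman([25, 34, 38, 58, 95, 20]))  # (((5, 0), 3), ((1, 2), 4))
```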

The idea of making greedy choices, facilitated with a priority queue, works to find optimum solutions to other problems too. For example, a spanning tree of a weighted, connected, undirected graph G = (V, E) is a subset of |V| − 1 edges from E connecting all the vertices in G; a spanning tree is minimum if the sum of the weights of its edges is as small as possible. Prim’s algorithm uses a sequence of greedy choices to determine a minimum spanning tree: start with an arbitrary vertex v ∈ V as the spanning-tree-to-be; then repeatedly add the cheapest edge connecting the spanning-tree-to-be to a vertex not yet in it. If the vertices not yet in the tree are stored in a priority queue implemented by a Fibonacci heap, the total time required by Prim’s algorithm is O(|E| + |V| log |V|).
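
The following Python sketch keeps a binary heap of candidate edges rather than a Fibonacci heap of vertices; this simpler variant runs in O(|E| log |V|), slightly worse than the bound just quoted. The adjacency-list format is our own choice for illustration.

```python
import heapq

def prim(adj, start):
    """adj: dict mapping each vertex to a list of (weight, neighbor) pairs;
    the graph is assumed connected and undirected. Returns a minimum
    spanning tree as a list of (u, v, weight) edges."""
    in_tree = {start}
    # heap of candidate edges leaving the spanning-tree-to-be
    frontier = [(w, start, v) for w, v in adj[start]]
    heapq.heapify(frontier)
    tree = []
    while frontier and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(frontier)  # cheapest edge out of the tree
        if v in in_tree:
            continue                       # both endpoints already in the tree
        in_tree.add(v)
        tree.append((u, v, w))
        for w2, x in adj[v]:
            if x not in in_tree:
                heapq.heappush(frontier, (w2, v, x))
    return tree
```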

But why does the sequence of greedy choices lead to a minimum spanning tree? Suppose Prim’s algorithm does not result in a minimum spanning tree. As we did with Huffman’s algorithm, we ask what the state of affairs must be when Prim’s algorithm makes its first mistake; we will see that the assumption of a first mistake leads to a contradiction, proving the correctness of Prim’s algorithm. Let the edges added to the spanning tree be, in the order added, e_1, e_2, e_3, ..., and let e_i be the first mistake. In other words, there is a minimum spanning tree T_min containing e_1, e_2, ..., e_{i−1}, but no minimum spanning tree containing e_1, e_2, ..., e_i. Imagine what happens if we add the edge e_i to T_min: since T_min is a spanning tree, the addition of e_i causes a cycle containing e_i. Let e_max be the highest-cost edge on that cycle not among e_1, e_2, ..., e_i. There must be such an e_max because e_1, e_2, ..., e_i are acyclic, since they are in the spanning tree constructed by Prim’s algorithm. Moreover, because Prim’s algorithm always makes a greedy choice, that is, chooses the lowest-cost available edge, the cost of e_i is no more than the cost of any edge available to Prim’s algorithm when e_i is chosen; the cycle must contain at least one such available edge besides e_i (an edge connecting the tree built so far to a vertex outside it), and the cost of e_max is at least that of such an unchosen edge, so it follows that the cost of e_i is no more than the cost of e_max. In other words, the cost of the spanning tree T_min − {e_max} ∪ {e_i} is at most that of T_min; that is, T_min − {e_max} ∪ {e_i} is also a minimum spanning tree, contradicting our assumption that the choice of e_i is the first mistake.

Therefore, the spanning tree constructed by Prim’s algorithm must be a minimum spanning tree.

We can apply the greedy heuristic to many optimization problems, and even if the results are not optimal, they are often quite good. For example, in the n-city traveling salesman problem, we can get near-optimal tours in time O(n²) when the intercity costs are symmetric (C_{i,j} = C_{j,i} for all i and j) and satisfy the triangle inequality (C_{i,j} ≤ C_{i,k} + C_{k,j} for all i, j, and k). The closest insertion algorithm starts with a “tour” consisting of a single, arbitrarily chosen city, and successively inserts the remaining cities into the tour, making a greedy choice about which city to insert next and where to insert it: the city chosen for insertion is the city not on the tour but closest to a city on the tour; the chosen city is inserted adjacent to the city on the tour to which it is closest.
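
A Python sketch of the heuristic follows. For clarity it recomputes the closest outside city from scratch at every step, which costs O(n³) overall; caching, for each city not yet on the tour, its nearest tour city brings the heuristic down to the O(n²) bound mentioned above.

```python
def closest_insertion(C, start=0):
    """C: a symmetric n-by-n distance matrix satisfying the triangle
    inequality. Returns the closest insertion tour as a cyclic list of
    city indices, beginning with the arbitrarily chosen start city."""
    n = len(C)
    tour = [start]
    rest = set(range(n)) - {start}
    while rest:
        # Greedy choice: the city not on the tour closest to a tour city.
        k, m = min(((k, m) for k in rest for m in tour),
                   key=lambda pair: C[pair[0]][pair[1]])
        rest.remove(k)
        # Insert k adjacent to m, on whichever side lengthens the tour less.
        i = tour.index(m)
        before, after = tour[i - 1], tour[(i + 1) % len(tour)]
        grow_before = C[before][k] + C[k][m] - C[before][m]
        grow_after = C[m][k] + C[k][after] - C[m][after]
        tour.insert(i if grow_before <= grow_after else i + 1, k)
    return tour
```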

Given an n × n symmetric distance matrix C that satisfies the triangle inequality, let I_n of length |I_n| be the “closest insertion tour” produced by the closest insertion heuristic and let O_n be an optimal tour of length |O_n|. Then

$$\frac{|I_n|}{|O_n|} < 2.$$

This bound is proved by an incremental form of the optimality proofs for greedy heuristics we have seen above: we ask not where the first error is, but by how much we are in error at each greedy insertion into the tour. We establish a correspondence between edges of the optimal tour O_n and cities inserted on the closest insertion tour, and show that at each insertion of a new city into the closest insertion tour, the additional length added by that insertion is at most twice the length of the corresponding edge of the optimal tour O_n.

To establish the correspondence, imagine the closest insertion algorithm keeping track not only of the current tour, but also of a spider-like configuration including the edges of the current tour (the body of the spider) and pieces of the optimal tour (the legs of the spider); picture the current tour in solid lines and the pieces of the optimal tour as dotted lines.

Initially, the spider consists of the arbitrarily chosen city with which the closest insertion tour begins, and the legs of the spider consist of all the edges of the optimal tour except for one edge eliminated arbitrarily.

As each city is inserted into the closest insertion tour, the algorithm deletes from the spider-like configuration one of the dotted edges from the optimal tour. When city k is inserted between cities l and m, the edge deleted is the one attaching the spider to the leg that contains the inserted city (the edge from city x to city y).

Now,

$$C_{k,m} \le C_{x,y},$$

because of the greedy choice to add city k to the tour and not city y. By the triangle inequality,

$$C_{l,k} \le C_{l,m} + C_{m,k},$$

and by symmetry we can combine these two inequalities to get C_{l,k} ≤ C_{l,m} + C_{x,y}. Adding this last inequality to the first one above,

$$C_{l,k} + C_{k,m} \le C_{l,m} + 2C_{x,y},$$

that is,

$$C_{l,k} + C_{k,m} - C_{l,m} \le 2C_{x,y}.$$

Thus adding city k between cities l and m adds no more to I_n than 2C_{x,y}. Summing these incremental amounts over the course of the entire algorithm tells us that

$$|I_n| \le 2\,|O_n|,$$

as we claimed.