The Algorithms of Prim, Kruskal and Boruvka

Spanning Trees

4.4 The Algorithms of Prim, Kruskal and Boruvka

In this section, we will treat three popular algorithms for determining minimal spanning trees, all of which are based on the characterizations given in the previous section. Let us ﬁrst deal with a generic algorithm which has the advantage of allowing a rather simple proof. The three subsequent algorithms are special cases of this general method which is due to Prim [Pri57].

Algorithm 4.4.1 LetG= (V, E) be a connected graph with vertex setV = {1, . . . , n} andw:E→Ra weight function forG. The algorithm constructs a minimal spanning treeT for (G, w).

Procedure MINTREE(G, w;T)

(1) fori= 1ton doVi← {i};Ti← ∅od (2) fork= 1to n−1 do

(3) choose V_i with V_i=∅;

(4) choose an edgee=uvwith u∈V_i, v /∈V_i, andw(e)≤w(e) for all edgese=uv withu∈V_i,v∈/V_i;

(5) determine the indexj for whichv∈V_j; (6) Vi←Vi∪Vj;Vj← ∅;

(7) Ti←Ti∪Tj∪ {e}; Tj← ∅; (8) ifk=n−1thenT←Ti ﬁ (9) od

Theorem 4.4.2 Algorithm4.4.1determines a minimal spanning tree for the network(G, w).

Proof We use induction ont:=|T1|+· · ·+|Tn|to prove the following claim:

Fort= 0, . . . , n−1, there exists a minimal spanning treeT ofGcontainingT1, . . . , Tn.

(4.3) For t=n−1, this claim shows that the algorithm is correct. Clearly, (4.3) holds at the beginning of the algorithm—before the loop (2) to (9) is executed for the ﬁrst time—sincet= 0 at that point. Now suppose that (4.3) holds for t=k−1, that is, before the loop is executed for thek-th time. Lete=uvwith u∈Vi be the edge which is constructed in thek-th iteration. Ifeis contained in the minimal spanning treeT satisfying (4.3) fort=k−1, there is nothing to show. Thus we may assumee /∈T. ThenT∪ {e}contains the unique cycle

114 4 Spanning Trees

C=C_T(e); obviously, Chas to contain another edgef =rs withr∈V_i and s /∈V_i. By Theorem4.3.1,w(e)≥w(f). On the other hand, by the choice of ein step (4),w(e)≤w(f). Hence w(e) =w(f), and T= (T∪ {e})\ {f} is a minimal spanning tree ofGsatisfying (4.3) fort=k.

Of course, we cannot give the precise complexity of Algorithm4.4.1: this depends both on the choice of the indexiin step (3) and on the details of the implementation. We now turn to the three special cases of Algorithm 4.4.1 mentioned above. All of them are derived by making steps (3) and (4) in MINTREE precise. The ﬁrst algorithm was favored by Prim and is generally known as the algorithm of Prim, although it was already given by Jarn´ık [Jar30].

Algorithm 4.4.3 LetGbe a connected graph with vertex setV ={1, . . . , n} given by adjacency listsA_v, and letw:E→Rbe a weight function forG.

Procedure PRIM(G, w;T) (1) g(1)←0,S← ∅,T← ∅; (2) fori= 2ton dog(i)← ∞ od (3) while S=V do

(4) choose i∈V \S such thatg(i) is minimal; S←S∪ {i}; (5) ifi= 1thenT←T∪ {e(i)} ﬁ

(6) forj∈Ai∩(V \S)do

(7) ifg(j)> w(ij)theng(j)←w(ij); e(j)←ij ﬁ

(8) od

(9) od

Theorem 4.4.4 Algorithm4.4.3determines with complexityO(|V|²)a min-imal spanning treeT for the network(G, w).

Proof It is easy to see that Algorithm4.4.3is a special case of Algorithm4.4.1 (written a bit diﬀerently): if we always chooseV₁ in step (3) of MINTREE, we get the algorithm of Prim. The functiong(i) introduced here is just used to simplify ﬁnding a shortest edge leaving V₁=S. Hence the algorithm is correct by Theorem4.4.2; it remains to discuss its complexity. The while-loop is executed|V| times. During each of these iterations, the comparisons in step (4) can be done in at most|V| − |S|steps, so that we get a complexity of O(|V|²). As G is simple, this is also the overall complexity: in step (6),

each edge ofGis examined exactly twice.

Example 4.4.5 Let us apply Algorithm 4.4.3 to the undirected version of the network of Fig. 3.5, where we label the edges as follows:e₁={1,5},e₂= {6,8},e₃={1,3},e₄={4,5},e₅={4,8},e₆={7,8},e₇={6,7},e₈={4,7}, e₉={2,5}, e₁₀={2,4}, e₁₁={2,6}, e₁₂={3,6}, e₁₃={5,6}, e₁₄={3,8},

Fig. 4.1 The network considered in Example4.4.5

Table 4.2 The algorithm of Prim applied to the network in Fig.4.1

Iteration 1: i= 1,S={1},T=∅,g(2) = 28,e(2) =e15,g(5) = 1,e(5) =e1,g(3) = 2, e(3) =e3

Iteration 2: i= 5,S={1,5},T={e1},g(2) = 8,e(2) =e9,g(4) = 5,e(4) =e4, g(6) = 26,e(6) =e13

Iteration 3: i= 3,S={1,5,3},T={e1, e3},g(6) = 24,e(6) =e12,g(8) = 27,e(8) =e14

Iteration 4: i= 4,S={1,5,3,4},T={e1, e3, e4},g(7) = 8,e(7) =e8,g(8) = 7,e(8) =e5

Iteration 5: i= 8,S={1,5,3,4,8},T={e1, e3, e4, e5},g(6) = 1,e(6) =e2,g(7) = 7, e(7) =e6

Iteration 6: i= 6,S={1,5,3,4,8,6},T={e1, e3, e4, e5, e2} Iteration 7: i= 7,S={1,5,3,4,8,6,7},T={e1, e3, e4, e5, e2, e6} Iteration 8: i= 2,S={1,5,3,4,8,6,7,2},T={e1, e3, e4, e5, e2, e6, e9}

e15={1,2}. Thus the edges are ordered according to their weight. We do not really need this ordering for the algorithm of Prim, but will use it later for the algorithm of Kruskal. The algorithm of Prim then proceeds as summarized in Table4.2; the resulting minimal spanning tree is indicated by the bold edges in Fig.4.1.

Exercise 4.4.6 Let (G, w) be a network. The algorithm of Prim could pro-ceed in many diﬀerent ways: we might vary the start vertex (which we took to be the vertex 1 in Algorithm 4.4.3), and we might have a choice of the next edge to be added during an iteration (if there are edges of the same minimal weight available). To avoid the latter problem, we simply specify a tiebreaking rule between edges of the same wight (by putting them into a deﬁnite order). Show that then always the same minimal spanning tree arises, regardless of the start vertex used [Kra07]. Hint: Use a suitably perturbed weight function and apply Exercise4.3.5.

116 4 Spanning Trees

Now we turn to the second special case of Algorithm4.4.1; this is due to Kruskal [Kru56]. We ﬁrst give a somewhat vague version.

Algorithm 4.4.7 LetG= (V, E) be a connected graph withV ={1, . . . , n}, and letw:E→Rbe a weight function. The edges ofGare ordered according to their weight, that is,E={e1, . . . , em} withw(e1)≤ · · · ≤w(em).

Procedure KRUSKAL(G, w;T) (1) T← ∅;

(2) fork= 1tom do

(3) ifek does not form a cycle together with some edges ofT thenappendek toT ﬁ

(4) od

Note that the algorithm of Kruskal is the special case of MINTREE where Vi and e are chosen in such a way that w(e) is minimal among all edges which are still available: that is, among all those edges which do not have both end vertices in one of the sets V_j and would therefore create a cycle.

Again, Theorem4.4.2shows that the algorithm is correct. Alternatively, we could also appeal to Theorem4.3.1here: in step (3), we choose the edge which does not create a cycle with the edges already in the forest and which has minimal weight among all edges with this property. Thus the setT of edges constructed satisﬁes (4.1), proving again thatT is a minimal spanning tree.

Let us consider the complexity of Algorithm4.4.7. In order to arrange the edges according to their weight and to remove the edge of smallest weight, we use the data structurepriority queue already described in Sect. 3.7. Then these operations can be performed in O(|E|log|E|) steps. It is more diﬃcult to estimate the complexity of step (3) of the algorithm: how do we check whether an edge creates a cycle, and how many steps does this take? Here it helps to view the algorithm as a special case of Algorithm 4.4.1. In step (1), we begin with a (totally) disconnected forest T on n=|V| vertices which consists ofntrees without any edges. During each iteration, an edge is added to the forest T if and only if its two end vertices are contained in diﬀerent connected components of the forest constructed so far; these two connected components are then joined by adding the edge to the forestT. Therefore we may check for possible cycles by keeping a list of the connected components;

for this task, we need a data structure appropriate for treating partitions.

In particular, operations like disjoint unions (MERGE) and ﬁnding the com-ponent containing a given element should be easy to perform. Using such a data structure, we can write down the following more precise version of Algorithm4.4.7.

Algorithm 4.4.8 LetG= (V, E) be a connected graph withV ={1, . . . , n}, and letw:E→Rbe a weight function onG. We assume thatE is given as a list of edges.

Procedure KRUSKAL (G, w;T) (1) T← ∅;

(2) fori= 1tondo Vi← {i} od

(3) putE into a priority queueQwith priority functionw;

(4) while Q=∅ do

(5) e:= DELETEMIN(Q);

(6) ﬁnd the end verticesuandv ofe;

(7) ﬁnd the componentsV_u andV_v containinguand v, respectively;

(8) ifV_u=V_v thenMERGE(V_u, V_v);T←T∪ {e} ﬁ (9) od

Now it is easy to determine the complexity of the iteration. Finding and removing the minimal edge e in the priority queue takes O(log|E|) steps.

Successively merging the originalntrivial components and ﬁnding the com-ponents in step (7) can be done with a total eﬀort of O(nlogn) steps; see [AhoHU83], [CorLRS09] or [MehSa08]. AsGis connected,Ghas at leastn−1 edges, so that the overall complexity is O(|E|log|E|). We have established the following result.

Theorem 4.4.9 The algorithm of Kruskal (as given in Algorithm4.4.8) de-termines with complexityO(|E|log|E|)a minimal spanning tree for(G, w).

For sparse graphs, this complexity is much better than the complexity of the algorithm of Prim. In practice, the algorithm of Kruskal often contains one further step: after each merging of components, it is checked whether there is only one component left; in this case,T is already a tree and we may stop the algorithm.

Example 4.4.10 Let us apply the algorithm of Kruskal to the network of Fig.4.1. The edgese1, e2, e3, e4, e5, e6 ande9 are chosen successively, so that we obtain the same spanning tree as with the algorithm of Prim (although there the edges were chosen in a diﬀerent order). This has to happen here, since our small example has only one minimal spanning tree. In general, however, the algorithms of Prim and Kruskal will yield diﬀerent minimal spanning trees.

Exercise 4.4.11 Let (G, w) be a network, and letT be the set of all spanning trees of (G, w). Put

W(T) := max

w(e) :e∈T

forT∈ T.

Prove that all minimal spanning trees minimize the functionW overT. Let us turn to our third and ﬁnal special case of Algorithm 4.4.1; this is due to Boruvka [Bor26a] and requires that all edge weights are distinct.

118 4 Spanning Trees

Then we may combine several iterations of MINTREE into one larger step:

we always treat each nonemptyVi and add the shortest edge leaving Vi. We shall give a comparatively brief description of the resulting algorithm.

Algorithm 4.4.12 Let G= (V, E) be a connected graph with V = {1, . . . , n}, and let w:E→R be a weight function for which two distinct edges always have distinct weights.

Procedure BORUVKA(G, w;T) (1) fori= 1tondo V_i← {i} od (2) T← ∅;M← {V1, . . . , Vn}; (3) while |T|< n−1 do (4) for U∈M do

(5) ﬁnd an edgee=uv withu∈U,v /∈U andw(e)< w(e) for all edgese=uv with u∈U,v∈/U;

(6) ﬁnd the componentU containingv;

(7) T←T∪ {e};

(8) od

(9) forU∈M doMERGE(U, U)od (10) od

Theorem 4.4.13 The algorithm of Boruvka determines a minimal spanning tree for (G, w)inO(|E|log|V|)steps.

Proof It follows from Theorem 4.4.2that the algorithm is correct. The con-dition that all edge weights are distinct guarantees that no cycles are created during an execution of thewhile-loop. As the number of connected compo-nents is at least halved in each iteration, thewhile-loop is executed at most log|V| times. We leave it to the reader to give a precise formulation of steps (5) and (6) leading to the complexity ofO(|E|log|V|). (Hint: For each vertex v, we should originally have a listE_v of the edges incident withv.) Example 4.4.14 We apply the algorithm of Boruvka to the network shown in Fig. 4.2. When the while-loop is executed for the ﬁrst time, the edges {1,2}, {3,6}, {4,5}, {4,7} and {7,8} (drawn bold in Fig. 4.2) are chosen and inserted intoT. That leaves only three connected components, which are merged during the second execution of thewhile-loop by adding the edges {2,5} and{1,3}(drawn bold broken in Fig.4.2).

Exercise 4.4.15 Show that the condition that all edge weights are distinct is necessary for the correctness of the algorithm of Boruvka.

Exercise 4.4.16 Table4.3gives the distances (in units of 100 miles) between the airports of the cities London, Mexico City, New York, Paris, Peking and Tokyo: Find a minimal spanning tree for the corresponding graph [BonMu76].

Fig. 4.2 A network

Table 4.3 Distance table

for 6 cities L MC NY Pa Pe To

L – 56 35 2 51 60

MC 56 – 21 57 78 70

NY 35 21 – 36 68 68

Pa 2 57 36 – 51 61

Pe 51 78 68 51 – 13

To 60 70 68 61 13 –

Exercise 4.4.17 Thetree graph T(G) of a connected graphGhas the span-ning trees forGas vertices; two of these trees are adjacent if they have|V|−2 edges in common. Prove thatT(G) is connected. What can be said about the subgraph of minimal spanning trees (for a given weight functionw)?

The complexity of the algorithms discussed in this section can often be improved by using appropriate data structures. Implementations for the al-gorithms of Prim and Kruskal with complexity O(|E|log|V|) are given in [Joh75] and [CheTa76]. Using Fibonacci heaps, the algorithm of Prim can be implemented with complexityO(|E|+|V|log|V|); see [AhuMO93]. Boruvka’s algorithm (or appropriate variations) can likewise be implemented with com-plexity O(|E|log|V|); see [Yao75] and [CheTa76]. Almost linear bounds are in [FreTa87] and [GabGST86]; ﬁnally, an algorithm with linear complexity was discovered by Fredman and Willard [FreWi94]; of course, this supposes that the edges are already sorted according to their weights. Unfortunately, the best theoretical algorithms tend to be of no practical interest because of the large size of the implicit constants. There is a simple algorithm with complexityO(|V|) for planar graphs; see [Mat95].

120 4 Spanning Trees

The problem of ﬁnding a new minimal spanning tree if we change the weight of an edge and know a minimal spanning tree for the original graph already is discussed in [Fre85] and [Epp94]. On the average, an update may be done inO(log|V|) steps (under suitable assumptions). Finally, it can be veriﬁed in linear time (that is, with complexity O(|E|)) whether a given spanning tree is minimal. A similar result holds for thesensitivity analysisof minimal spanning trees; this is the problem how much the weight of a given edgeecan be increased without changing the minimal spanning tree already known. For the latter two problems, see [DixRT92].

Dans le document Algorithms and Computation in Mathematics • (Page 126-133)