Minimum-weight spanning tree (MST) algorithm in a synchronous system

5 Terminology and basic algorithms

5.5 Elementary graph algorithms

5.5.11 Minimum-weight spanning tree (MST) algorithm in a synchronous system

A minimum-weight spanning tree (MST) connects all the nodes of a weighted graph using a set of edges of minimum total weight, and thereby minimizes the total cost of the links over which any node can reach any other node. The classical centralized MST algorithms such as those by Prim, Dijkstra, and Kruskal [9] assume that the entire weighted graph is available for examination.

• Kruskal’s algorithm begins with a forest of graph components. In each iteration, it identifies the minimum-weight edge that connects two different components, and uses this edge to merge two components. This continues until all the components are merged into a single component.

• In Prim’s algorithm and Dijkstra’s algorithm, a single-node component is selected. In each iteration, a minimum-weight edge incident on the component is identified, and the component expands to include that edge and the node at the other end of that edge. After n − 1 iterations, all the nodes are included. The MST is defined by the edges that are identified in each iteration to expand the initial component.
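The two centralized strategies above can be sketched as follows. This is a minimal illustration, assuming nodes are numbered 0..n−1, the graph is connected, and edges are given as (weight, u, v) tuples; the function names kruskal and prim are chosen here for illustration only.

```python
import heapq

def kruskal(n, edges):
    """Merge components using the globally lightest edge that joins two of them."""
    parent = list(range(n))          # union-find forest: one component per node

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:                 # edge connects two different components
            parent[ru] = rv
            mst.append((w, u, v))
    return mst

def prim(n, edges, start=0):
    """Grow a single component from `start`, one minimum-weight incident edge at a time."""
    adj = {i: [] for i in range(n)}
    for w, u, v in edges:
        adj[u].append((w, u, v))
        adj[v].append((w, v, u))
    in_tree = {start}
    heap = list(adj[start])
    heapq.heapify(heap)
    mst = []
    while heap and len(in_tree) < n:
        w, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue                 # edge no longer leaves the component
        in_tree.add(v)
        mst.append((w, u, v))
        for e in adj[v]:
            if e[2] not in in_tree:
                heapq.heappush(heap, e)
    return mst
```

Both functions need the whole edge list up front, which is exactly the assumption that the distributed setting removes.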

In a distributed algorithm, each process can communicate only with its neighbors and is aware of only the incident links and their weights. It is also assumed that the processes know the value of |N| = n. The weight of each edge is unique in the network, which is necessary to guarantee a unique MST. (If weights are not unique, the IDs of the nodes on which they are incident can be used as tie-breakers by defining a well-formed order.)
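One possible way to realize such a tie-breaking order is to compare edges by a composite key. The sketch below is an assumption for illustration (the helper name edge_key is hypothetical): it orders edges by weight first and then by the sorted pair of endpoint IDs, so that both endpoints compute the same key for the same edge.

```python
def edge_key(weight, u, v):
    """Totally ordered key: weight first, then the two endpoint IDs in sorted order."""
    lo, hi = (u, v) if u <= v else (v, u)
    return (weight, lo, hi)
```

Two distinct edges of equal weight then compare unequal, because (assuming no parallel edges) they differ in at least one endpoint ID.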

A distributed algorithm by Gallager, Humblet, and Spira [14] that generalizes the strategy of Kruskal’s centralized algorithm is given after reviewing some definitions. A forest (i.e., a disjoint union of trees) is a graph in which any pair of nodes is connected by at most one path. A spanning forest of an undirected graph (N, L) is a maximal forest of (N, L), i.e., an acyclic and not necessarily connected graph whose set of vertices is N. When a spanning forest is connected, it becomes a spanning tree.

A spanning forest of G is a subgraph G′ of G having the same node set as G; the spanning forest can be viewed as a set of spanning trees, one spanning tree per “connected component” of G′. All MST algorithms begin with a spanning forest having n nodes (or connected components) and without any edges. They then add a “minimum-weight outgoing edge” (MWOE) between two components.² The spanning trees of the combining connected components combine with the MWOE to form a single spanning tree for the combined connected component. The addition of the MWOE is repeated until a spanning tree is produced for the entire graph (N, L). Such algorithms are correct because of the following observation.

² Note that this is an undirected graph. The direction of the “outgoing” edge is logical in the sense that it identifies the direction of expansion of the connected component under consideration.

Figure 5.8 Merging of MWOE components. (a) A cycle of length 2 is possible. (b) A cycle of length greater than 2 is not possible.

Observation 5.1 For any spanning forest {(N_i, L_i) | i = 1, ..., k} of a weighted undirected graph G, consider any component (N_j, L_j). Denote by λ_j the edge having the smallest weight among those that are incident on only one node in N_j. Then an MST for the graph G that includes all the edges in each L_i in the spanning forest must also include edge λ_j.

This observation says that for any “minimum-weight” component created so far, when it grows by joining another component, the growth must be via the MWOE for that component under consideration. Intuitively, the logic is as follows. For any component containing node set N_j, if edge x is used instead of the MWOE λ_j to connect with nodes in N \ N_j, then the resulting tree cannot be an MST because edge x can always be replaced with the MWOE that was not chosen to yield a lower cost tree.

Consider Figure 5.8(a) where three components have been identified and are encircled. The MWOE for each component is marked by an outgoing edge (other outgoing edges are not shown). Each of the three components shown must grow only by merging with the component at the other end of the MWOE.
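The MWOE of a component, as used in Observation 5.1, can be computed centrally as in the following minimal sketch. It assumes edges are (weight, u, v) tuples with distinct weights and that a component is given as a set of node IDs; the function name mwoe is chosen here for illustration.

```python
def mwoe(component, edges):
    """Return the lightest edge with exactly one endpoint in `component`, or None."""
    outgoing = [(w, u, v) for (w, u, v) in edges
                if (u in component) != (v in component)]
    return min(outgoing, default=None)
```

For the edge list [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)], the component {0, 1} has outgoing edges of weights 2, 4, and 5, so its MWOE is the edge of weight 2. In the distributed setting no single process can evaluate this expression directly, since each process sees only its own incident links.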

In a distributed algorithm, the addition of the edges should be done concurrently by having all the components identify their respective minimum-weight outgoing edge. The synchronous algorithm of Gallager–Humblet–Spira [14] uses the above observation, and is given in Algorithm 5.11. Initially, each node is the leader of its component, which contains only that node. The algorithm uses log n iterations. In each iteration, each component merges with at least one other component, so the number of components at least halves. Hence, log n iterations guarantee termination with a single component.


(message types)

SEARCH_MWOE(leader)               // broadcast by current leader on tree edges
EXAMINE(leader)                   // sent on non-tree edges after receiving SEARCH_MWOE
REPLY_MWOE(local_ID, remote_ID)   // details of potential MWOEs are convergecast to leader
ADD_MWOE(local_ID, remote_ID)     // sent by leader to add MWOE and identify new leader
NEW_LEADER(leader)                // broadcast by new leader after merging components

leader = i;

for round = 1 to log(n) do   // each merger in each iteration involves at least two components

1. if leader = i then

broadcast SEARCH_MWOE(leader) along marked edges of tree (Section 5.5.5).

2. On receiving a SEARCH_MWOE(leader) message that was broadcast on marked edges:

(a) Each process i (including leader) sends an EXAMINE message along unmarked (i.e., non-tree) edges to determine if the other end of the edge is in the same component (i.e., whether its leader is the same).

(b) From among all incident edges at i, for which the other end belongs to a different component, process i picks its incident MWOE(localID, remoteID).

3. The leaf nodes in the MST within the component initiate the convergecast (Section 5.5.5) using REPLY_MWOEs, informing their parent of their MWOE(localID, remoteID). All the nodes participate in this convergecast.

4. if leader = i then

await convergecast replies along marked edges.

Select the minimum MWOE(localID, remoteID) from all the replies.

broadcast ADD_MWOE(localID, remoteID) along marked edges of tree (Section 5.5.5).

// To ask process localID to mark the (localID, remoteID) edge, i.e., include it in MST of component.

5. if an MWOE edge gets marked by both the components on which it is incident then

(a) Define new_leader as the process with the larger ID on which that MWOE is incident (i.e., process whose ID is max(localID, remoteID)).

(b) new_leader identifies itself as the leader for the next round.

(c) new_leader broadcasts NEW_LEADER in the newly formed component along the marked edges (Section 5.5.5) announcing itself as the leader for the next round.

Algorithm 5.11 The synchronous MST algorithm by Gallager–Humblet–Spira (GHS algorithm). The code shown is for processor P_i, 1 ≤ i ≤ n.
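The round structure of Algorithm 5.11 can be explored with a centralized simulation. The sketch below is not the message-passing algorithm itself: it assumes a connected graph with nodes 0..n−1 and distinct edge weights, replaces the broadcasts and convergecasts with direct lookups, and uses the hypothetical names simulate_sync_ghs and _find. It does follow step 5 in electing, as the new leader, the larger-ID endpoint of the edge that both of its components selected.

```python
def _find(group, c):
    """Union-find lookup with path halving."""
    while group[c] != c:
        group[c] = group[group[c]]
        c = group[c]
    return c

def simulate_sync_ghs(n, edges):
    """Centrally simulate the merging rounds.

    edges: list of (weight, u, v), distinct weights, connected graph (assumed).
    Returns (sorted marked edges, final leader ID, number of rounds used).
    """
    leader = list(range(n))   # initially each node leads its own component
    marked = set()
    rounds = 0
    while len(set(leader)) > 1:
        rounds += 1
        # Steps 1-4: every component finds its MWOE as (weight, local, remote).
        best = {}
        for w, u, v in edges:
            if leader[u] == leader[v]:
                continue      # both ends are in the same component
            for local, remote in ((u, v), (v, u)):
                c = leader[local]
                if c not in best or w < best[c][0]:
                    best[c] = (w, local, remote)
        for w, a, b in best.values():
            marked.add((w, min(a, b), max(a, b)))
        # Step 5: merge the components joined by the chosen edges.
        group = {c: c for c in best}
        for w, a, b in best.values():
            group[_find(group, leader[a])] = _find(group, leader[b])
        new_leader = {}
        for c, (w, a, b) in best.items():
            if best[leader[b]][0] == w:   # edge was marked from both sides
                new_leader[_find(group, c)] = max(a, b)
        leader = [new_leader[_find(group, leader[i])] for i in range(n)]
    return sorted(marked), leader[0], rounds
```

On the path-like graph with edges [(1, 0, 1), (2, 2, 3), (3, 1, 2), (4, 0, 3)], the first round forms components {0, 1} and {2, 3} with leaders 1 and 3, and the second round joins them over the edge of weight 3, making node 2 the leader.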

Figure 5.9 The phases within an iteration in a component. (Figure legend: tree edge, cross edge, out-edge; the root of the component and its MWOE are labeled.)

Each iteration goes through a broadcast–convergecast–broadcast sequence to identify the MWOE of the component, and to select the leader for the next iteration. The MWOE is identified after the broadcast (steps 1 and 2) and convergecast (step 3) by the current leader, which then does a second broadcast (step 4). The leader is selected at the end of this second broadcast (step 4); among all the components that merge in an iteration, a single leader is selected, and it identifies itself among all the nodes in the newly forming component by doing a third broadcast (step 5). This sequence of steps can be visualized using the connected component enclosed within a rectangle in Figure 5.9, using the following narrative: (a) root broadcasts SEARCH_MWOE; (b) convergecast REPLY_MWOE occurs; (c) root broadcasts ADD_MWOE; (d) if the MWOE is also chosen as the MWOE by the component at the other end of the MWOE, the incident process with the higher ID is the leader for the next iteration and broadcasts NEW_LEADER.
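The convergecast in phase (b) amounts to computing a minimum over the component's tree, with each node forwarding the smaller of its own candidate and its children's replies. A minimal recursive sketch follows, assuming the tree is given as a hypothetical children map and each node's local candidate is a (weight, localID, remoteID) tuple, or None if the node has no outgoing edge.

```python
def convergecast_min(children, local_mwoe, node):
    """Return the minimum MWOE candidate in the subtree rooted at `node`."""
    best = local_mwoe.get(node)
    for child in children.get(node, []):
        reply = convergecast_min(children, local_mwoe, child)   # REPLY_MWOE from child
        if reply is not None and (best is None or reply < best):
            best = reply
    return best
```

For example, with children = {1: [2, 3], 3: [4]} and local candidates {2: (9, 2, 7), 3: (6, 3, 8), 4: (4, 4, 5)}, the root 1 ends up with (4, 4, 5), which it would then announce in ADD_MWOE.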

The correctness of the above algorithm hinges on the fact that in any iteration, when each component of the spanning forest joins with one or more other components of the spanning forest, the result is still a spanning forest! Observe that each component picks exactly one MWOE with which it connects to another component. However, more than two components can join together in one iteration. If multiple components join, we need to observe that the resulting component is still a spanning forest. To do so, model a directed graph (P, M) where P is the set of components at the start of an iteration and M is the set of |P| MWOE edges chosen by the components in P. In this graph, there is exactly one outgoing edge from each node in P. Recall that the direction of the MWOE is logical; the underlying graph remains undirected.

If component A chooses to include an MWOE leading to component B, then directed edge (A, B) exists in (P, M). By tracing any path in this graph, observe that MWOE weights must be monotonically decreasing. To see that (i) the merging of components retains the spanning forest property, and (ii) there is a unique leader in each component after the merger in the previous round, consider the following two cases:


1. If two components join, then each must have picked the other to join with, and we have a cycle of length two. As each component was a spanning forest, joining via the common MWOE still retains the spanning forest property, and there is a unique leader in the merged component.

2. If three or more components join, then two sub-cases are possible:

• There is some cycle of length three or more (see Figure 5.8(b)). But as any path in (P, M) follows MWOEs of monotonically decreasing weights, this implies a contradiction because at least one node must have chosen an incorrect MWOE.

• There is no cycle of length 3 or more, and at least one node in (P, M) will have two or more incoming edges (component C in Figure 5.8(a)). Further, there must exist a cycle of length two. Exercise 5.22 asks you to prove this formally.

As the graph has a cycle of length at most two (case 1), the resulting component after the merger of all the involved components is still a spanning component, and there is a unique leader in the merged component. That leader is the node with the larger PID incident on the MWOE that gets marked by both components on which it is incident.
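The structure of the graph (P, M) can be checked on small instances by following the single outgoing edge of each component until a component repeats. The sketch below is illustrative only: the component labels and weights are made up rather than taken from Figure 5.8, and choice maps each component to a hypothetical (weight, target) pair describing its MWOE.

```python
def cycle_reached(choice, start):
    """Follow outgoing MWOE pointers from `start`; return the cycle that is entered."""
    seen = []
    c = start
    while c not in seen:
        seen.append(c)
        c = choice[c][1]
    return seen[seen.index(c):]

# Hypothetical instance: A and C pick each other (weight 5); B picks C (weight 7).
choice = {"A": (5, "C"), "B": (7, "C"), "C": (5, "A")}
cycles = {start: cycle_reached(choice, start) for start in choice}
```

Here every starting component ends in the same cycle of length two between A and C, and the weights seen along the path from B (7, then 5) never increase, which is the pattern the argument above relies on.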

Complexity

• In each of the log n iterations, each component merges with at least one other component. So after the first iteration, there are at most n/2 components, after the second, at most n/4 components, and so on. Hence, at most log n iterations are needed and the number of nodes in each component after iteration k is at least 2^k. In each iteration, the time complexity is O(n) because the time complexity for broadcast and convergecast is bounded by O(n). Hence the time complexity is O(n · log n).

• In each of the log n iterations, O(n) messages are sent along the marked tree edges (steps 1, 3, 4, and 5). There may be up to l = |L| EXAMINE messages to determine the MWOEs in step 2 of each iteration. Hence, the total message complexity is O((n + l) · log n).
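The bounds above can be evaluated for concrete sizes. The following sketch ignores constant factors and is meant only as a rough calculator; the function name sync_ghs_bounds and the returned dictionary keys are chosen here for illustration.

```python
def sync_ghs_bounds(n, l):
    """Upper-bound estimates, ignoring constant factors, for n nodes and l links."""
    rounds = (n - 1).bit_length()                  # equals ceil(log2 n) for n >= 1
    components = [max(1, n // 2**k) for k in range(rounds + 1)]
    return {
        "rounds": rounds,
        "max_components_after_round": components,  # at most n / 2^k after round k
        "time_steps": n * rounds,                  # O(n log n)
        "messages": (n + l) * rounds,              # O((n + l) log n)
    }
```

For n = 8 and l = 12, this gives 3 rounds, at most 8, 4, 2, 1 components after rounds 0 to 3, and estimates of 24 time steps and 60 messages up to constant factors.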

The correctness of the GHS algorithm hinges on the fact that the execution occurs in synchronous rounds. This is necessary in step 2, where a process sends EXAMINE messages to its unmarked neighbors to determine whether those neighbors belong to the same or a different component than itself. If the neighbor is not synchronized, problems can occur. For example, consider edge (j, k), where j and k become a part of the same component in “iteration” x. From j’s perspective, the neighbor k may not yet have received its leader’s ID that was broadcast in step 5 of the previous iteration; hence k replies to the EXAMINE message sent by j based on an older ID for its leader. The testing process j may (incorrectly) include k in the same component as itself, thereby creating cycles in the graph. As the distance from the leader to any node in its component is not known, this needs to be dealt with even in a synchronous system. One way to enforce the synchronicity is to wait for O(n) number of communication steps; this way, all communication within the round would have completed in the synchronous model.
