A distributed world - Algorithmic Foundations of the Internet

Eight secret agents A, B, C, D, E, F, G, H stay in contact by phone using the network of Figure 8.4, which indicates seventeen existing direct lines. Each agent knows the telephone numbers of his neighbors, with whom he can speak directly, so for example B may call C, but at least two calls are needed to send a message from B to G. One day central command issues a communication that only seven of the seventeen available lines can be securely maintained, leaving to the agents the task of selecting lines in such a way that all agents will still be able to communicate with each other. In fact, as the graph of the network has eight nodes, seven properly chosen lines are sufficient to maintain the connectivity of the group, forming a connected graph as shown for example on the right side of Figure 8.4. Note that each communication between two agents may now pass through several nodes, thus requiring more calls than before. The order is sent from the central command to one agent elected as the leader for this operation, and the leader must arrange for the information to reach the other agents via telephone. To do this the agents must execute an algorithm, in this case called a communication protocol, consisting of a set of calls over the original lines, until a set of safe lines has been established.

While executing the protocol, the agents assume one of three possible conditions of LEADER, IDLE, or DONE, and react to the order at different times depending on the lines used, traffic on the network, and personal reaction time. At the beginning all the agents are IDLE except for the chosen LEADER.

FIGURE 8.4: A communication network with eight nodes and seventeen lines, and a possible set of seven safe lines connecting all nodes to one another.

The operation starts when the LEADER calls his neighbors (i.e., the agents directly connected with him) to communicate the order. Then the neighbors wake up and start selecting the safe lines. When an IDLE agent x receives the order by phone, say from agent y, he accepts the connection and agrees with y that their connecting line {x, y} has to be taken as safe. Then x refuses any further incoming call from a neighbor that communicates the order to him, and calls all his neighbors except y to communicate the order to them (if one of the neighbors, say z, has already received the order over another line, the call will be refused by z). At this point x becomes DONE, a status indicating that he has completed his task. The set of safe lines is built when all agents are DONE.

Formally the protocol can be specified as shown in Figure 8.5. The protocol is written for an arbitrary number of agents n and refers to a generic agent x because all the agents must follow the same procedure. Some delicate points remain, which we will come to in time, but first examine the given code to make sure that it implements the actions indicated above. Recall that the protocol starts with all agents IDLE except for the LEADER.

algorithm SELECTION

The generic agent x may be LEADER, IDLE, or DONE

if x is LEADER:
    call all the neighbors disregarding any call received;
    for any neighbor z accepting the call:
        add the line {x, z} to the set of safe lines;
    become DONE;

if x is IDLE:
    upon reception of a call from another agent y
        add the line {x, y} to the set of safe lines;
        call all neighbors except for y disregarding any call received;
        for any neighbor z accepting the call:
            add the line {x, z} to the set of safe lines;
        become DONE;

if x is DONE: disregard any call received.

FIGURE 8.5: Protocol for safe lines selection executed by each agent x.

Although the actions of each agent are clearly defined in the protocol, the effect of the overall process may be difficult to understand, as is the case for most distributed processes. In particular we claim that, when all the agents have reached the status DONE, a final set of exactly n−1 safe lines has been built. Before trying to prove this, consider a simulation of the agent behavior for the set of connections of Figure 8.4 to see how the protocol works. A number of interesting facts emerge. In particular the final set of safe connections depends on the timing of the phone calls and on the order in which the calls are made, so that different sets may come out for the same graph, starting with the same leader. Note that the operation "call all the neighbors disregarding any call received", although apparently parallel, requires that x calls his neighbors one after the other, since a phone call cannot start before a previous one has ended. Take H as LEADER and assume that each agent calls the others in circular alphabetic order. That is, agent A follows the order B, C, D, F, H; agent B follows the order C, H, A; agent C follows the order D, E, H, A, B; etc. Calls arriving at the same instant are answered in an arbitrary order. Assume that the lines have connection delays of one (e.g., one second) or two, and consider the following two limit cases.

Case 1. The lines of the octagonal perimeter of the graph have delay one and the other lines have delay two. So H wakes up A at time 1, A wakes B at time 2, and a chain of calls goes on along the octagonal perimeter, waking G at time 7. A set of seven safe lines is thus built (graph on the left of Figure 8.6), and all the other calls from each agent, including the calls from the LEADER, are refused by his neighbors since they arrive at later times. Different safe lines would have been obtained if H had called his neighbors in a different order, even with the same delays on the lines.

Case 2. The lines between H and his neighbors have a connection delay of one, and the delay of all the other lines is two. In this case H wakes up the other agents at times 1 to 7 and all the other calls are refused. A new set of seven safe lines is built (graph on the right of Figure 8.6). Note that if A had called his neighbors in the order D, F, H, B, C, he would have reached D at time 3 and F at time 5, waking up these two agents before they received the calls from H, which would then have been refused.
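The two cases can be replayed with a small event-driven simulation. The following Python sketch is only an illustration under stated assumptions: the list of lines is reconstructed from the neighbor orders quoted in the text and from the count of seventeen lines (the figure itself is not reproduced here), every call is assumed to occupy the caller for the full delay of the line whether it is accepted or refused, and the names build_graph, circular_order, select_safe_lines, delay_case1 and delay_case2 are hypothetical.

```python
import heapq

# Assumption: lines reconstructed from the neighbor orders given in the text.
PERIMETER = ["AB", "BC", "CD", "DE", "EF", "FG", "GH", "HA"]
CHORDS = ["HB", "HC", "HD", "HE", "HF", "AC", "AD", "AF", "CE"]
PERIMETER_SET = {frozenset(p) for p in PERIMETER}


def build_graph():
    """Return the network as a dict mapping each agent to its set of neighbors."""
    graph = {}
    for a, b in PERIMETER + CHORDS:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def circular_order(agent, neighbors):
    """Neighbors in circular alphabetic order, starting just after the agent."""
    return sorted(neighbors, key=lambda z: (ord(z) - ord(agent)) % 26)


def delay_case1(x, z):
    """Case 1: perimeter lines have delay 1, all other lines delay 2."""
    return 1 if frozenset((x, z)) in PERIMETER_SET else 2


def delay_case2(x, z):
    """Case 2: lines incident to H have delay 1, all other lines delay 2."""
    return 1 if "H" in (x, z) else 2


def select_safe_lines(graph, leader, delay):
    """Simulate the SELECTION protocol; return (safe lines, wake-up times)."""
    wake_time = {leader: 0}
    safe = set()
    events = []   # heap of (arrival time, tie-breaker, caller, callee)
    counter = 0   # tie-breaker: simultaneous calls are served in scheduling order

    def schedule_calls(x, start, skip=None):
        # x calls its neighbors one after the other, skipping the agent
        # from which it received the order.
        nonlocal counter
        t = start
        for z in circular_order(x, graph[x]):
            if z == skip:
                continue
            t += delay(x, z)
            heapq.heappush(events, (t, counter, x, z))
            counter += 1

    schedule_calls(leader, 0)
    while events:
        t, _, caller, callee = heapq.heappop(events)
        if callee in wake_time:
            continue  # callee already has the order: the call is refused
        wake_time[callee] = t
        safe.add(frozenset((caller, callee)))
        schedule_calls(callee, t, skip=caller)
    return safe, wake_time


if __name__ == "__main__":
    g = build_graph()
    for name, delay in (("Case 1", delay_case1), ("Case 2", delay_case2)):
        safe, wake = select_safe_lines(g, "H", delay)
        print(name, sorted("".join(sorted(line)) for line in safe), wake)
```

Under these assumptions Case 1 yields the chain of perimeter lines from H through A to G, and Case 2 yields the seven lines incident to H, with wake-up times 1 to 7 for A through G in both cases. Passing a different delay function or changing circular_order is a quick way to explore how other sets of safe lines arise.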

FIGURE 8.6: Safe lines (solid segments) and node wake-up times with LEADER H, under two timing assumptions. The lines of the octagonal perimeter have delay 1, or the lines connected with H have delay 1, while all the other lines have delay 2. In both panels the wake-up times are A 1, B 2, C 3, D 4, E 5, F 6, G 7.

What is probably more interesting is that by changing the timing on the lines and the order of the calls any connected set of seven safe lines can be obtained, as for example the one of Figure 8.4, which the reader may personally investigate.

Although the given protocol is very simple, a formal proof of its correctness can only be given in the realm of the theory of distributed algorithms and is not at all trivial. For those interested in getting deeper into the subject we refer to the bibliographical notes at the end of this chapter, and note here only a few considerations to show how treacherous these algorithms can be.

A simple question is whether one can be sure that all agents are reached by the order of the central command. This can be proved inductively. In fact, if an agent x other than the LEADER were never reached by a call, thus remaining IDLE, his neighbors would also remain IDLE forever, and the argument can be pushed back to the LEADER, who is not IDLE by definition. A similar argument shows that all the chosen safe lines form a connected graph, but it is more difficult to prove that there are exactly n−1 such lines. The reader may verify informally that the chosen lines induce a tree rooted at the LEADER, and recall from Chapter 3 that any tree of n nodes has n−1 arcs, a necessary and sufficient number to keep any graph connected. Aside from the root, the tree is composed of internal and leaf agents, each of which knows the line to its parent and the lines to its children, if any. But another question is more delicate.
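The tree property can at least be checked mechanically on any candidate set of safe lines. The following Python sketch is not a proof and is not part of the protocol: it is a centralized check, with the hypothetical name is_spanning_tree, that a set of lines has exactly n−1 elements and connects all n nodes. Lines are assumed to be given as pairs of distinct node names, such as the two-letter strings used here.

```python
def is_spanning_tree(nodes, lines):
    """Return True if the lines form a tree connecting all the nodes.

    A set of lines over n nodes is a spanning tree exactly when it has
    n - 1 lines and every node can be reached from any starting node.
    """
    nodes = set(nodes)
    lines = {frozenset(line) for line in lines}
    if not nodes or len(lines) != len(nodes) - 1:
        return False

    # Build the adjacency sets induced by the chosen lines only.
    adjacent = {x: set() for x in nodes}
    for line in lines:
        a, b = tuple(line)
        adjacent[a].add(b)
        adjacent[b].add(a)

    # Depth-first visit from an arbitrary node.
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        x = stack.pop()
        for z in adjacent[x]:
            if z not in seen:
                seen.add(z)
                stack.append(z)
    return seen == nodes


if __name__ == "__main__":
    agents = "ABCDEFGH"
    chain = ["HA", "AB", "BC", "CD", "DE", "EF", "FG"]
    print(is_spanning_tree(agents, chain))
```

The edge count alone is not enough: seven lines containing a cycle leave some agent disconnected, which is why the reachability visit is also needed.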

How do the agents become aware that all their colleagues have become DONE and that the set of safe lines is then ready to be used? To achieve this, further actions must be added to the protocol. The information about termination is collected from the leaves to the root on the safe lines of the induced tree, with a so-called convergecast operation. Then the leader, now the root, becomes aware that all the agents have completed their task and broadcasts this information to everybody in a final phase, again using the arcs of the tree. The complete protocol follows as an immediate extension of the previous one but is actually a bit technical and is not reported here. At the end each agent has the knowledge that the selection of safe lines has been completed and can use them with confidence.
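To give a feeling for the two phases, the following Python sketch computes one valid order of messages on a given tree. It is not the complete protocol, which the text does not report: it is a centralized illustration that assumes the tree is already known as a map from each node to its children, and the function name convergecast_then_broadcast is hypothetical.

```python
def convergecast_then_broadcast(children, root):
    """Return (reports, notices) for a tree given as a children map.

    reports: (child, parent) messages, ordered so that each node reports
             termination only after all of its children have reported.
    notices: (parent, child) messages, ordered so that each node is
             notified by its parent before it notifies its own children.
    """
    parent = {root: None}
    order = []          # pre-order visit: every node appears after its parent
    stack = [root]
    while stack:
        x = stack.pop()
        order.append(x)
        for c in children.get(x, []):
            parent[c] = x
            stack.append(c)

    # Reversed pre-order lists every node before its ancestors (leaves first).
    reports = [(x, parent[x]) for x in reversed(order) if parent[x] is not None]
    # Pre-order lists every node after its parent (root first).
    notices = [(parent[x], x) for x in order if parent[x] is not None]
    return reports, notices


if __name__ == "__main__":
    # Tree of Case 1: a chain from the LEADER H through A, B, ... to G.
    chain = {"H": ["A"], "A": ["B"], "B": ["C"], "C": ["D"],
             "D": ["E"], "E": ["F"], "F": ["G"]}
    reports, notices = convergecast_then_broadcast(chain, "H")
    print(reports)
    print(notices)
```

Each phase uses one message per arc of the tree, so n−1 reports travel toward the root and n−1 notices travel back. In a real distributed execution each node would decide locally when to report, namely when it has heard from all its children.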

The story is much more important than it may appear at a first glance because the protocol is designed for a wealth of problems on arbitrary networks whose entities may not even know the number of their peers, or have a map of the existing connections. If an entity knew the whole network in advance it could select a set of safe lines and communicate this set to the others by itself.

In big networks, however, the entities have only a fraction of this information, as in the Internet, which is too big to be known by all nodes, and in which different computers and connections may come up and go down at any time.

So the entities present at any moment can only work together according to a distributed protocol based on pieces of local knowledge, like the set of safe lines established by an agent x with his neighbors.

So let us now apply the experience gained with the agents to finding a community in a graph, a problem that was introduced in Section 7.7 of the last chapter. Take again the graph of Figure 8.4 and suppose that some of the agents are interested in a specific topic τ (e.g., movies). One of them, now the LEADER, wants to detect the subgraph of colleagues interested in τ together with all its arcs. This community is easy to detect with a centralized algorithm if the nodes are interrogated about their interests one by one, but they may be so numerous as to render the algorithm impractical, and other interesting features will come out from a distributed approach.

Perhaps surprisingly, the protocol for safe lines selection applies, with a few minor changes. Only the nodes interested in τ now call the neighbors to enquire about preferred topics. They do not disregard incoming calls as was required in the previous protocol for determining a tree with n−1 connections, and retain all the arcs connecting nodes with interest τ as part of the community.

The software suite of each node contains the protocol specified in Figure 8.7, which gets activated upon receipt of an enquiry about its interest in the given topic.⁷ Note the close similarity of the two protocols in Figures 8.5 and 8.7. Assuming A as the leader, two possible communities interested in τ are shown in Figure 8.8.

⁷ Some more details are needed in the protocol to handle the interleaving of outgoing and incoming calls.

algorithm COMMUNITY

The generic node x may be LEADER, IDLE, or DONE

if x is LEADER:
    call all the neighbors;
    for any neighbor z answering yes:
        add node z and arc {x, z} to the community;
    become DONE;

if x is IDLE and is interested in τ:
    upon receipt of the first call from another agent y
        answer yes;
        add node y and arc {x, y} to the community;
        call all the neighbors except those from which a call was received;
        for any neighbor z answering yes:
            add node z and arc {x, z} to the community;
        for any call received (during the node's own calls) from w:
            answer yes;
            add node w and arc {x, w} to the community;
        become DONE;

if x is IDLE and is not interested in τ:
    whenever receiving a call for τ answer no;

FIGURE 8.7: Protocol for the detection of a community with interest in a topic τ, executed by each node x. "Calls" implicitly ask for interest in τ. Note that only the leader may disregard any call received.

FIGURE 8.8: Two possible communities with an interest in τ (see engrossed arcs). (a) A, C, D, E, H is an α-community. (b) A, C, D, E, F is not an α-community due to node F.

The protocol terminates when all the nodes interested in topic τ have become DONE. At this point each node in the community has a complete local knowledge of it, that is, the node knows all its neighbors in the community and the arcs connecting to them. This condition can be disclosed to the LEADER during a termination phase, as was indicated for the protocol of safe lines selection. However, further information on the community can be gathered now. For example the subgraph in Figure 8.8(a) is an α-community (see definition (7.8) in the previous chapter). The subgraph in Figure 8.8(b), instead, is not an α-community because two of the four arcs incident to node F point outside the community. This information, and other properties such as the identity of the participating nodes, can be gathered in a final phase of the protocol.
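The outcome of the community protocol can be mimicked by a centralized visit, which is handy for checking the two examples. The following Python sketch rests on several assumptions: the lines of the network are reconstructed from the neighbor orders and the count of seventeen lines given earlier, the leader is assumed to be interested in τ, and an α-community is taken to mean that every member has strictly more arcs toward members than toward non-members, a reading inferred from the remark on node F rather than quoted from definition (7.8). The names build_graph, detect_community and is_alpha_community are hypothetical.

```python
from collections import deque

# Assumption: lines reconstructed from the neighbor orders given in the text.
LINES = ["AB", "BC", "CD", "DE", "EF", "FG", "GH", "HA",
         "HB", "HC", "HD", "HE", "HF", "AC", "AD", "AF", "CE"]


def build_graph(lines=LINES):
    """Return the network as a dict mapping each node to its set of neighbors."""
    graph = {}
    for a, b in lines:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def detect_community(graph, interested, leader):
    """Collect the interested nodes reachable from the leader through
    interested nodes, together with all arcs joining two of them."""
    members = {leader}
    arcs = set()
    queue = deque([leader])
    while queue:
        x = queue.popleft()
        for z in graph[x]:
            if z in interested:          # z answers yes to the call
                arcs.add(frozenset((x, z)))
                if z not in members:     # first call received by z
                    members.add(z)
                    queue.append(z)
    return members, arcs


def is_alpha_community(graph, members):
    """Assumed reading of the definition: each member has strictly more
    neighbors inside the community than outside it."""
    members = set(members)
    return all(len(graph[x] & members) > len(graph[x] - members)
               for x in members)


if __name__ == "__main__":
    g = build_graph()
    for interested in ({"A", "C", "D", "E", "H"}, {"A", "C", "D", "E", "F"}):
        members, arcs = detect_community(g, interested, "A")
        print(sorted(members), len(arcs), is_alpha_community(g, members))
```

With the reconstructed lines, the first set of interests gives a community that passes the check, while in the second one node F has two arcs inside and two outside and the check fails, in agreement with Figure 8.8.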

The peculiarities of distributed systems are quite surprising. Without global knowledge some apparently easy questions may have no answer, while some complex operations can be performed on the whole network using only the local knowledge of the nodes. In the case of the secret agents, when the safe line selection protocol is terminated a global connectivity is established on the whole network, and each agent can use his local safe lines leaving to the agents connected to him the responsibility of using in turn only safe lines with their neighbors. On the other hand it may be difficult or even impossible to solve elementary problems as we now show on a much simpler case.
