

and of the solution found by L-ALLIANCE strategy IV. The average of these 200 runs then indicated the typical performance of the optimal and of the L-ALLIANCE allocations for that degree of heterogeneity. I then computed the average percent worse of the L-ALLIANCE solution over the optimal solution for each of the heterogeneity differences of that scenario. This value indicates the relative performance of the L-ALLIANCE strategy IV solution compared to the optimal solution for that triple.

Finally, I categorized that triple into the appropriate time/energy profile region (see figures 4-1 and 4-2) according to its task coverage/m ratio. I repeated this process for all the triples, and then computed the average percent difference across each of the four regions (actually, three regions, since I had no data from region 4).

Figure 4-21 shows the results, indicating the percent worse than the optimal result for the scenarios in regions 1, 2, and 3 for both time and energy. A total of 331 scenarios make up the region 1 average, 139 scenarios make up the region 2 average, and 26 scenarios make up the region 3 average. These results indicate that L-ALLIANCE performs quite well for these smaller scenarios: less than 20% worse than optimal on average for any region, for either time or energy, with much better performance in region 1. The worst-case performance in terms of time was for the scenario involving four robots, eight tasks, and a task coverage of three, which was 28% worse than the optimal;

this scenario is indicated in figure 4-20 by the large dot at location (.75, 2) in region 3. The worst-case performance in terms of energy was for the scenario involving five robots, six tasks, and a task coverage of five, which was 25% worse than the optimal;

this scenario is indicated in figure 4-20 by the large dot at location (1, 1.2) in region 2. The key question, of course, is how seriously we should expect the performance of L-ALLIANCE to degrade as the size of the problem increases. Strategy IV performs particularly well in region 1 because, although the knowledge is distributed across motivational behaviors, the robots are essentially using global knowledge in their action selection due to the high task coverage and low mission size. However, as the relative number of tasks to perform increases, the purely greedy approach cannot always result in near-optimal performance, because it will at times be more efficient to make several less-than-optimal local task selections to arrive at a globally optimal result. Quantifying how much worse L-ALLIANCE performance can be than the optimal is difficult, however, and is a primary topic for future study.
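To make the greedy-versus-optimal comparison concrete, the following sketch contrasts a simple greedy allocator with an exhaustive optimal search on a tiny hypothetical scenario. This is an illustration only, not the L-ALLIANCE mechanism itself: the cost model (tasks run sequentially per robot, mission time is the largest per-robot load) and the example task times are assumptions made for this example.

```python
import itertools

def greedy_cost(task_times):
    # task_times[r][j]: expected time for robot r to perform task j.
    # Greedy local selection: repeatedly give the next task to whichever
    # (robot, task) pair finishes soonest given current robot loads;
    # mission time is the largest load (tasks run sequentially per robot).
    loads = [0.0] * len(task_times)
    remaining = set(range(len(task_times[0])))
    while remaining:
        r, j = min(((r, j) for r in range(len(task_times)) for j in remaining),
                   key=lambda p: loads[p[0]] + task_times[p[0]][p[1]])
        loads[r] += task_times[r][j]
        remaining.discard(j)
    return max(loads)

def optimal_cost(task_times):
    # Exhaustive search over all robot-per-task assignments --
    # exponential in the number of tasks, which is why the optimal
    # results could not be derived for region 4 scenarios.
    n_robots, n_tasks = len(task_times), len(task_times[0])
    best = float("inf")
    for assignment in itertools.product(range(n_robots), repeat=n_tasks):
        loads = [0.0] * n_robots
        for j, r in enumerate(assignment):
            loads[r] += task_times[r][j]
        best = min(best, max(loads))
    return best

# Hypothetical 2-robot, 3-task scenario: greedy finishes in 6 time
# units, while the optimal assignment finishes in 5 (20% worse).
times = [[2.0, 3.0, 4.0],
         [3.0, 3.0, 5.0]]
pct_worse = 100 * (greedy_cost(times) - optimal_cost(times)) / optimal_cost(times)
```

Here the greedy allocator commits robot 1 to a locally attractive task and is then forced to give robot 0 the expensive final task, while the optimal assignment balances the loads, which is exactly the kind of locally-reasonable, globally-suboptimal behavior described above.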

4.6 Discussion of Formal Model of L-ALLIANCE

Now that the philosophy behind the L-ALLIANCE learning approach has been presented, let us look in detail at how this philosophy is designed into the motivational

[Figure 4-20 here: a scatter plot titled "Scenarios Compared with Optimal", with Task Coverage / # Robots (0.00 to 1.00) on one axis and # Tasks / # Robots (0.00 to 4.00) on the other, divided into regions 1 through 4.]

Figure 4-20: The data points shown correspond to scenarios for which the optimal result could be computed. For each of these scenarios, I compared the time and energy usage required for the strategy IV distributed action selection technique against the required time and energy usage for the optimal result. The data points correspond to a total of 496 scenarios. The two heavy dots are the scenarios for which the time or energy performance was the worst. The dashed lines indicate the same four regions as shown in figures 4-1 and 4-2.


[Figure 4-21 here: a bar chart of Percent Worse than Optimal (0.00 to 0.20), showing time and energy bars for regions 1, 2, and 3.]

Figure 4-21: Comparison of strategy IV performance with the optimal performance. In region 1, the averages are over 331 scenarios, in region 2 the averages are over 139 scenarios, and in region 3 the averages are over 26 scenarios. Since the efficiency problem is exponential in the number of tasks, the optimal results could not be derived for any scenarios in region 4. The error bars indicate one standard deviation in the performance differences.

behavior mechanism. I organize this subsection by first discussing the threshold of activation of the behavior sets, followed by a discussion of the parameter settings pertinent to each of the sources of input to a robot's motivational behavior: sensory feedback, inter-robot communication, suppression from active behavior sets, learned robot influence, robot impatience, and robot acquiescence. In these sections, I discuss only those parameter issues in L-ALLIANCE that were previously ignored in my description of ALLIANCE. Chapter 3 provides the philosophy behind the basic ALLIANCE mechanism, which remains true for L-ALLIANCE. I conclude this section by showing how the L-ALLIANCE inputs are combined to compute the motivational levels. Appendix B summarizes the L-ALLIANCE formal model for easy reference.^5

Threshold of activation

A parameter of key importance to the efficiency of the robot team is the threshold of activation, θ. This parameter is used not only to determine the motivational level at which a behavior set is activated, but, more importantly, as a way of calibrating the impatience and acquiescence rates across motivational behaviors and across robots.

Recall from section 4.5.3 that I want the interaction of motivational behaviors to result in a robot selecting either the task it can perform the quickest or the task that requires the robot the longest time to accomplish, depending upon the task category. Since the L-ALLIANCE mechanism is distributed across several parallel processes, these orderings can be accomplished by setting the δ_slow_ij(k, t) and δ_fast_ij(t) impatience rates to values proportional to the expected completion times of their corresponding tasks. However, these rates are meaningless if the behavior sets activate at different levels, since a behavior set with a slower rate of impatience could activate before one with a faster impatience rate if the first behavior set had a low enough threshold of activation. Likewise, I want the robot team member that is superior at a given task to "win" the ability to perform that task by activating it prior to any of its teammates. Yet again, this cannot be accomplished if the robots have different thresholds of activation.

It is therefore important for the sake of efficiency for the value of θ to be uniform across robots and across the motivational behaviors of each robot. This uniformity should be quite easy to achieve: it can be obtained simply by the human designer broadcasting the desired value to all robots at the start of the mission, or by providing the robots with a simple arbitration mechanism that allows the team on its own to come to a consensus on what value of θ to use. Of course, as we saw in chapter 3, having uncalibrated θ's across motivational behaviors or across robots is not a catastrophic problem: the robots will still be able to accomplish their mission, although less efficiently.

^5 The model described in this section and in appendix B is a more recent version of that presented in [Parker, 1993b].

Sensory feedback

The use of sensory feedback in L-ALLIANCE is unchanged from its use in ALLIANCE.

Inter-robot communication

The rate at which robots communicate their current actions to their teammates is of central importance in ALLIANCE and L-ALLIANCE to the awareness robot team members have of the actions of their teammates. This in turn affects the efficiency of the team's selection of actions, since lack of awareness of the actions of teammates can lead to replication of effort and decreased efficiency. Since this issue is addressed extensively in chapter 5, I will not repeat my conclusions here. Suffice it to say that to ensure maximal efficiency, it is best to set the communication rates, ρ_i, to be fairly frequent relative to the time required to complete each task in the mission. Since the task completion time is usually many orders of magnitude larger than the time required to broadcast a message, it is likely that the communication system capacity easily suffices to meet this requirement.

The second parameter dealing with inter-robot communication is τ_i. This parameter is especially important for allowing a robot to know which other robots are present and, to some extent, functioning on the team. Although I want robots to adapt their own actions according to the current and expected actions of their teammates, I do not want robots to continue to be influenced by a robot that was on the team, but at some point has ceased to function. Thus, robots must at all times know which other robots are present and functioning on the team. This is implemented in ALLIANCE as follows: at the beginning of the mission, team members are unaware of any other robot on the team. The first message a robot receives from another robot, however, is sufficient to alert the receiving robot to the presence of that team member, since all robot messages are tagged with the unique identification of the sender. The robots then monitor the elapsed time from the most recent broadcast message of any type from each robot team member. If a robot does not hear from a particular teammate for a period of time τ_i, then it must assume that that teammate is no longer available to perform tasks in the mission.

Clearly, the proper value of τ_i is dependent upon each robot team member's ρ_i settings. If team members have different values for these parameters, then they cannot be sure how long to wait on messages from other robots. However, the difficulty should be minor if the τ_i values are set conservatively, say, to several times one's own time delay between messages. Even so, if a robot r_i erroneously assumes a team member r_k is no longer functional, the receipt of just one message from that team member at some point in the future is sufficient to reactivate r_k's influence on r_i's activities.

To refer to the team members that a robot r_i thinks are currently present on the team, I define the following set:

robots_present(i, t) = { k | ∃ j : comm_received(i, k, j, t - τ_i, t) = 1 }

The robots_present(i, t) set consists simply of those robots r_k from which r_i has received some type of communication message in the last τ_i time units.
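The bookkeeping behind this set can be sketched as follows. The class name, method names, and timing values are hypothetical illustrations; the thesis specifies only the timeout behavior itself, not an implementation.

```python
class TeammateMonitor:
    # Hypothetical bookkeeping for robots_present(i, t): tracks the time
    # of the most recent broadcast heard from each teammate, and treats a
    # teammate as present if it was heard within the last tau time units.
    def __init__(self, tau):
        self.tau = tau        # e.g. several times the expected inter-message delay
        self.last_heard = {}  # robot id -> time of most recent message

    def on_message(self, sender_id, t):
        # Any message of any type suffices, since every message carries the
        # sender's unique identification; a single message also reactivates
        # a teammate previously assumed non-functional.
        self.last_heard[sender_id] = t

    def robots_present(self, t):
        return {k for k, t_k in self.last_heard.items() if t - t_k <= self.tau}

monitor = TeammateMonitor(tau=5.0)
monitor.on_message("r2", t=0.0)
monitor.on_message("r3", t=3.0)
assert monitor.robots_present(t=4.0) == {"r2", "r3"}
assert monitor.robots_present(t=7.0) == {"r3"}    # r2 silent longer than tau
monitor.on_message("r2", t=8.0)                   # one message restores r2
assert monitor.robots_present(t=8.0) == {"r2", "r3"}
```

The final two assertions illustrate the recovery property noted above: a robot dropped from the set after a period of silence is restored by the receipt of a single message.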

Suppression from active behavior sets

The suppression from active behavior sets in L-ALLIANCE is implemented identically to the method utilized in ALLIANCE.