Decentralized mixing function control strategy for multi-robot informative persistent sensing applications


Decentralized Mixing Function Control Strategy for

Multi-Robot Informative Persistent Sensing Applications

by

Gavin Chase Hall

B.S. Mathematics, B.S. Mechanical Engineering, B.S. Physics

West Virginia University, 2009

Submitted to the Department of Mechanical Engineering

in partial fulfillment of the requirements for the degree of

Master of Science

at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

June 2014

© Massachusetts Institute of Technology 2014. All rights reserved.

Signature redacted

Author ....

Department of Mechanical Engineering

Signature redacted

May 15, 2014

Certified by

Daniela Rus

Professor of Electrical Engineering and Computer Science

Thesis Supervisor

Certified by ...

Signature redacted


Jean-Jacques E. Slotine

Professor of Mechanical Engineering

Thesis Supervisor

Signature redacted

Accepted by...

...

David E. Hardt

Chair, Department Committee on Graduate Students


Decentralized Mixing Function Control Strategy for Multi-Robot

Informative Persistent Sensing Applications

by

Gavin Chase Hall

Submitted to the Department of Mechanical Engineering on May 15, 2014, in partial fulfillment of the requirements for the degree of Master of Science

Abstract

In this thesis, we present a robust adaptive control law that enables a team of robots to generate locally optimal closed path persistent sensing trajectories through information rich areas of a dynamic, unknown environment. This controller is novel in that it allows the robots to combine their global sensor estimates of the environment using a mixing function to opt for either: (1) minimum variance (probabilistic), (2) Voronoi approximation, or (3) Voronoi (geometric) sensing interpretations and resulting coverage strategies. As the robots travel along their paths, they continuously sample the environment and reshape their paths according to one of these three control strategies so that ultimately, they only travel through regions where sensory information is nonzero. This approach builds on previous work that used a Voronoi-based control strategy to generate coverage paths [32]. Unlike the Voronoi-based coverage controller, the mixing-function-based coverage controller captures the intuition that globally integrated sensor measurements more thoroughly capture information about an environment than a collection of independent, localized measurements.

Using a non-linear Lyapunov function candidate, we prove that the robots' coverage path configurations converge to a locally optimal equilibrium between minimizing sensing error and path length. A path satisfying this equilibrium is called an informative path. We extend the informative path controller to include a stability margin and to be used in conjunction with a speed controller so that a robot or a group of robots equipped with a finite sensing footprint can stabilize a persistent task by maintaining all growing fields within the environment bounded for all time. Finally, we leverage our informative persistent paths to generate a dynamic patrolling policy that minimizes the distance between instantaneous vehicle position and incident customer demand for a large fleet of service vehicles operating in an urban transportation network. We evaluate the performance of the policy by conducting large-scale simulations to show global stability of the model and by comparing it against a greedy service policy and historical data from a fleet of 16,000 vehicles.

Thesis Supervisor: Daniela Rus


Acknowledgments

I would like to thank Professor Daniela Rus for two fantastic years of my life. Being new to research, she taught me many tricks of the trade to help me hit the ground running. Among them were her vision for the application of theory and her knack for clearly conveying complex solutions to select audiences. I like to think of her as a mother figure, except that she avoids me like the plague outside of school. She has been a great friend, and I can always count on her to give me the straight scoop about any issue. I would like to thank Professor Jean-Jacques E. Slotine for making the technical portions of this research a tad less gruesome. His course lectures single-handedly prepped me for the majority of the theoretical work between the covers of this thesis.

I would also like to thank all of my two friends in the Distributed Robotics Laboratory for making me realize that performing amazing work can actually be somewhat enjoyable.

My biggest thank you goes to my family. Whether it has been music, science, baseball, or trick-or-treating until I was 28, you've made every step of the way so much better. I hope my progress helps to make up for how much my sister let you down.


List of Figures

1-1 Path reshaping process for three robots
1-2 Example of persistent sensing by two robots
1-3 Informative patrolling loops over historical demand distribution in Singapore
2-1 Mixing function sensing behaviors
2-2 Mixing function supermodularity
2-3 Approximation to indicator function
2-4 Mean integral parameter error and Lyapunov-like function in single robot learning phase
2-5 Single robot learning phase with an informative path controller
2-6 Mean waypoint position error under the informative path controller for a single robot
2-7 Single robot path shaping phase with an informative path controller
2-8 Single robot W vs. Wn
2-9 Consensus and integral parameter errors for multiple robots
2-10 Multi-robot learning phase with informative path controller
2-11 Mean waypoint position error under the informative path controller for multiple robots
2-12 Multi-robot path shaping with informative path controller
2-13 Informative path configurations for two environments with different mixing functions
2-14 Multiple robot W vs. Wn
2-16 Sensitivity of Voronoi smoothing path separation
2-17 Relaxed path separation sensitivity of minimum variance coverage strategy
2-18 Setup for testing initial waypoint position sensitivity
3-1 Integral parameter error and Lyapunov function candidate of the informative persistence controller for a single robot
3-2 Single robot learning phase with informative persistence controller
3-3 Mean waypoint position error for a single robot
3-4 Persistent task stability margin for a single robot
3-5 Single robot path shaping phase with the informative persistence controller
3-6 Integral parameter error under the informative persistence controller for multiple robots
3-7 Lyapunov-like function in learning phase under the informative persistence controller for multiple robots
3-8 Consensus error under the informative persistence controller for multiple robots
3-9 Multi-robot learning phase with informative persistence controller
3-10 Mean waypoint position error and persistent sensing task's stability margin for multiple robots
3-11 Multi-robot path shaping phase with informative persistence controller
4-1 Arrival and destination distribution and map of Singapore with informative loops
4-2 Dynamic patrolling policy simulator


List of Algorithms

1 Mixing function controller for a single robot in an unknown environment: robot level
2 Mixing function controller for a single robot in an unknown environment: waypoint level
3 Mixing function controller for multiple robots in an unknown environment: robot level
4 Mixing function controller for multiple robots in an unknown environment: waypoint level
5 Informative persistence controller for a single robot: waypoint level
6 Informative persistence controller for multiple robots: waypoint level
7 Patrolling policy pseudocode


List of Tables

4.1 Total patrolling policy ONCALL distance ratios and utilization factor over 24-hour simulations
A.1 Common symbols for each control strategy
A.2 Single-robot controller symbols


Chapter 1 Introduction

1.1 Motivation and Goals

When monitoring an unfamiliar and changing environment, robots face two significant challenges: (1) the robots must identify the regions and rates of change corresponding to important sensory information in the environment, and (2) the robots must determine how to optimally position and allocate themselves to cooperatively collect this information. Given a group of robots, each equipped with a sensor to measure the environment, our goal is to derive an adaptive multi-robot control strategy that enables robots to employ multiple sensing strategies to generate a configuration of closed paths that the robots will travel along to maximize their collection of sensory information. These paths are called informative paths, because they drive the robots through locations in the environment where the sensory information is important.

Informative path planning brings the notion of sensory value into the planning problem, thus adding information to the geometric formulation. This has many applications. For example, it can be used by a team of robots operating in an underground mining environment to estimate regions of significant CH4 accumulation and generate paths that enable continual monitoring of these locations to prevent future mining disasters. Another example would be to use this controller to deploy a team of robots to monitor a wildfire over a large environment. The robots would be able to learn online the regions where the fire has spread and generate paths such that, between all robots, all locations with fire damage are constantly sensed, while regions with no fire damage are avoided.

In this thesis, we present a decentralized adaptive control algorithm for generating informative paths in unknown environments for multi-robot systems. This algorithm takes as input sensory information collected by each robot over a dynamic environment and outputs locally optimal paths whose trajectories cover the regions in the environment where sensory information is nonzero. The first feature of this informative path algorithm is a parameter estimation adaptation law the robots use to learn how sensory information is distributed throughout the environment. The second feature is a mixing-function-based coverage controller that reconfigures the paths based on a gradient optimization of a cost function composed of a class of parameterized sensor mixing functions [25].

Mixing functions dictate how sensor measurements from different robots are combined in order to represent different assumptions about the coverage task. By varying the value of a free parameter, and consequently the mixing function class and the robots' aggregate sensor mixing behaviors, the mixing function control strategy can recover multiple common control strategies, including minimum variance (probabilistic), Voronoi smoothing, and strictly Voronoi (geometric) based approaches. These control strategies represent a wide range of robot sensor combination abilities. Whereas the probabilistic control strategy promotes global sensor mixing by minimizing the expected variance of all robots' sensor measurements of a point of interest, the geometric control strategy employs no sensor mixing and only considers the sensor measurements of the robot closest to a point of interest [7, 25]. Voronoi smoothing bridges these two strategies by either increasing or decreasing the amount of sensor synthesis between robots in accordance with the mixing function's free parameter value.
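As a rough illustration of how a single free parameter can move between these behaviors, the sketch below assumes a mixing function of the power-sum form g_α(f_1, ..., f_n) = (Σ_i f_i^α)^(1/α), where each f_i > 0 is robot i's sensing cost at a point of interest and α < 0. This form, the function name `mixing_function`, and the sample costs are assumptions made for illustration only; the thesis's own definition appears in its mathematical formulation and may differ in detail.

```python
import math


def mixing_function(costs, alpha):
    """Combine per-robot sensing costs with an assumed power-sum mixing function.

    costs: positive sensing costs f_i, one per robot (hypothetical values).
    alpha: free parameter, alpha < 0, or -math.inf for the Voronoi limit.
    """
    if not costs or any(c <= 0 for c in costs):
        raise ValueError("costs must be a non-empty sequence of positive numbers")
    if alpha == -math.inf:
        # Limit case: only the closest (lowest-cost) robot matters.
        return min(costs)
    if alpha >= 0:
        raise ValueError("this sketch only covers alpha < 0")
    # Factor out the smallest cost so large negative alpha does not overflow.
    smallest = min(costs)
    scaled_sum = sum((c / smallest) ** alpha for c in costs)
    return smallest * scaled_sum ** (1.0 / alpha)


# Hypothetical sensing costs of three robots at one point of interest.
example_costs = [1.0, 2.0, 4.0]
probabilistic = mixing_function(example_costs, -1)        # all robots contribute
near_voronoi = mixing_function(example_costs, -50)        # smooth approximation
voronoi = mixing_function(example_costs, -math.inf)       # closest robot only
```

Under this assumed form, α = −1 lets every robot contribute to the combined cost, while increasingly negative α approaches the cost of the closest robot alone, which matches the Voronoi (geometric) interpretation described above.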

The mixing function controller is an extension to the probabilistic and geometric unifying controller introduced in [25]. It consists of placing the waypoints of a path in locally optimal positions that achieve an equilibrium between minimizing the cost of a class of mixing functions, sensing errors, and informative path lengths. Minimizing sensing errors allows the robots to be close to the points of interest that they are sensing. Minimizing path length reduces the robots' travel time through the regions of the environment with no sensory value. As the robots discover the structure of the environment, they reshape their paths according to this equilibrium to avoid visiting static areas and focus on sensing dynamic areas. An example of the reshaping process for three robots using a probabilistic control strategy is shown in Figure 1-1.
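The trade-off between sensing error and path length can be pictured with a toy gradient step. The sketch below is not the controller derived in this thesis: it assumes a simplified cost with a nearest-waypoint (Voronoi-style) sensing term and a squared-segment-length term on a closed path, and the names `reshape_step`, `w_sense`, `w_len`, and `step` are hypothetical.

```python
def reshape_step(waypoints, points, weights, w_sense=1.0, w_len=0.5, step=0.1):
    """One gradient-descent step on an assumed toy cost for a closed path.

    Toy cost: w_sense * sum_q weight_q * min_i |q - p_i|^2 / 2
            + w_len   * sum_i |p_i - p_(i+1)|^2 / 2
    waypoints: list of (x, y) waypoints p_i forming a closed loop.
    points, weights: points of interest q and their sensory importance.
    """
    n = len(waypoints)
    grads = [[0.0, 0.0] for _ in range(n)]

    # Sensing term: each point of interest pulls its nearest waypoint toward it.
    for (qx, qy), w in zip(points, weights):
        i = min(range(n),
                key=lambda k: (waypoints[k][0] - qx) ** 2 + (waypoints[k][1] - qy) ** 2)
        grads[i][0] += w_sense * w * (waypoints[i][0] - qx)
        grads[i][1] += w_sense * w * (waypoints[i][1] - qy)

    # Length term: each waypoint is pulled toward both neighbours on the loop.
    for i in range(n):
        px, py = waypoints[i]
        for j in ((i - 1) % n, (i + 1) % n):
            grads[i][0] += w_len * (px - waypoints[j][0])
            grads[i][1] += w_len * (py - waypoints[j][1])

    # Move every waypoint against its gradient.
    return [(p[0] - step * g[0], p[1] - step * g[1])
            for p, g in zip(waypoints, grads)]


# Hypothetical square path with one important point outside it.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
reshaped = reshape_step(square, points=[(2.0, 0.0)], weights=[1.0])
```

In this toy setting, a larger `w_sense` draws waypoints toward important points, while a larger `w_len` contracts the loop; repeated steps settle where the two pulls balance, which is the kind of equilibrium the paragraph above describes.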

A number of task-specific multi-robot control strategies have been proposed to accomplish informative path planning in a distributed and efficient way [7, 26]. In [27, 33], geometric Voronoi-based controllers with no sensor measurement synthesis between robots were used. As a result of the robustness of the mixing function control strategy, not only can we directly recover the Voronoi-based controller, we can also smoothly approximate Voronoi-based coverage arbitrarily well, while preserving its asymptotic stability and convergence guarantees. Another benefit of the mixing function control strategy is that we can employ a probabilistic approach to circumvent the localized sensing sensitivities intrinsic to geometric Voronoi-based controllers, which cause significant topological variance in informative path configurations for nearly identical initial robot configurations. Finally, because the mixing function controller does not require the expensive computation of a Voronoi tessellation over an environment for any of its derived control strategies, informative paths can be generated by robots with limited computational resources.

Generating informative paths using the mixing function control strategy is the first step in stabilizing a persistent sensing task [30, 31] in unknown environments, where the robots are assumed to have sensors with finite sensing radii and are required to revisit a location of interest at a specific calculated frequency. A persistent sensing task is defined as a monitoring scenario that can never be completed due to the continual change of the states of the environment. For example, if the robot were to stop monitoring, the information at some points in the environment would grow unbounded.

For a persistent sensing task, we also calculate the speed at which robots with finite sensor footprints collect sensory information along an informative path by instituting a stability margin that guarantees a bound on the difference between the robots' current estimate of the environment and the actual state of the environment for all time and all locations.

A consequence of having finite sensing radii is that the robots are unable to collect data over the entire environment at once. Therefore, as the data over a dynamic region become outdated, the robots must return to that region to collect new data. In order to prevent the robots' model of the environment from becoming too outdated, [30] presented a persistent sensing controller that calculates the speeds of the robots at each point along given paths, which are fittingly referred to as speed profiles. This speed controller enables robots to visit faster changing areas more frequently than slower changing areas.

[Figure 1-1 panels: (a) Iteration: 1, (b) Iteration: 10, (c) Iteration: 25, (d) Iteration: 50, (e) Iteration: 75, (f) Iteration: 100]

Figure 1-1: Starting with initial sweeping paths, the robots learn about the environment by observation; the observations are then used to transform the paths so that they are aligned with the important parts of the environment. The paths correspond to the trajectories of the three robots, where the robots' positions along these paths are represented by the black arrows. The important regions of the environment are shown in green.

[Figure 1-2 panels: (a) Stable, (b) Unstable]

Figure 1-2: Example of persistent sensing by two robots. Each robot with a finite sensing radius (red and blue circles around the robots' positions) travels through its path, with the objective of keeping the accumulation function (green dots) low everywhere. The robots collect data at the dynamic regions and shrink the accumulation function. The size of the green dot is proportional to the value of the accumulation function at that location. On the left, a stable speed profile maintains the accumulation function bounded everywhere for all time, whereas on the right, the speed profile is not stable, and the accumulation function grows unbounded in some locations.

The persistent sensing problem is defined in [30] as an optimization problem whose goal is to keep a time varying field as small in magnitude as possible everywhere. This field is referred to as the accumulation function. Where it is not covered by the robots' sensors, the accumulation function grows unbounded, thus indicating a rising need to collect data at that location. Likewise, the accumulation function shrinks where it is covered by the robots' sensors, indicating a decreasing need for data collection. A stable speed profile is defined as one that bounds the largest magnitude of the accumulation function.
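The growth and shrinkage of the accumulation function at a single location can be sketched in a few lines. The example below assumes a simple linear model in which the field grows at a constant production rate and is reduced at a constant consumption rate while a robot's sensor covers the point; the names `simulate_accumulation`, `production`, and `consumption`, and the numeric values, are hypothetical and are not taken from [30].

```python
def simulate_accumulation(production, consumption, covered, dt=0.1, z0=0.0):
    """Simulate an assumed accumulation function at one point of interest.

    production: growth rate of the field when nothing is sensing it.
    consumption: rate at which a covering sensor reduces the field.
    covered: sequence of booleans, True when a sensor covers the point.
    Returns the field value after each time step (never below zero).
    """
    z = z0
    history = []
    for is_covered in covered:
        rate = production - (consumption if is_covered else 0.0)
        z = max(0.0, z + rate * dt)  # the field cannot become negative
        history.append(z)
    return history


# The point is covered on every other time step (hypothetical visit pattern).
visits = [True, False] * 50
stable = simulate_accumulation(production=1.0, consumption=3.0, covered=visits)
unstable = simulate_accumulation(production=1.0, consumption=1.5, covered=visits)
```

With the same visit pattern, the first run stays bounded because each visit removes more than has accumulated since the previous one, whereas the second run drifts upward on every cycle. This mirrors the stable and unstable speed profiles described above.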

In this thesis, we extend the computed informative path configurations obtained from the mixing function controller to be used in conjunction with stabilizing speed profiles from the persistent sensing controller to create an informative persistence controller [33] that locally optimizes a persistent sensing task. Figure 1-2 shows an example of two robots performing a persistent sensing task with both a stable and unstable speed profile.


By assigning the accumulation function to a physical parameter such as oil spill levels, airborne particulate matter accumulation, or aggregate sensor errors, persistent sensing becomes a very practical approach to a wide array of real world monitoring scenarios. In this thesis, we consider an informative persistent sensing approach to urban Mobility-on-Demand (MOD) systems, where the accumulation function represents the historical number of passenger arrivals at discrete sets of locations over a defined period of time. In our previous work [22], we showed that autonomous driving can be used to mitigate the rebalancing problem current MOD systems face. Our objective is to minimize the waiting time of the passengers and the amount of time the vehicles in the system drive empty between subsequent customer requests. The critical question is where should each vehicle go once a delivery is complete? To solve this problem, we leverage our informative paths and persistent sensing controllers to develop optimized task allocation algorithms in the form of a dynamic patrolling policy for a fleet of MOD service vehicles such as taxis.

By using historical arrival distributions as input to our control algorithms, we can compute patrolling loops that minimize the distance driven by the vehicles to get to the next request. The algorithm was trained using one month of data from a fleet of 16,000 taxis in Singapore. The resulting informative loops are used to redistribute the vehicles to stationary virtual taxi stand locations along the loop. We compare the policy computed by our algorithm against a greedy policy as well as against the ground truth redistribution of taxis observed on the same dates and show up to a 6× reduction over historical data in customer waiting time and taxi distance driven without a passenger. These metrics represent two key evaluation criteria: (1) quality of customer service and (2) fuel efficiency.

1.2 Contribution to Robotics

This thesis makes the following contributions:

* A decentralized robust informative control strategy. We extend the probabilistic and geometric unifying controller presented in [25], so that now instead of statically partitioning themselves, a team of robots can adaptively compute closed paths online that continually travel through regions discovered to be important by observation in an unknown and dynamic environment. The provably stable and decentralized adaptive coverage controller uniquely combines robots' global sensor measurements based on a mixing function to learn the location of dynamic events in the environment and simultaneously computes closed informative paths based on these aggregated sensor behaviors.

[Figure 1-3 panels: (a) Patrol Loops, (b) Customer Demand Distribution]

Figure 1-3: Evolution of six informative patrol loops over historical demand distribution in Singapore. Service vehicles travel along these loops to minimize the driving distance between subsequent customer pick-ups. Each loop is updated 96 times over a 24 hour time to account for differences in customer demand throughout the day. The peak amplitude in the customer demand distribution represents the central business district of Singapore.

A mixing function controller is advantageous because it is amenable to geometric, probabilistic, and analytical interpretations, all of which have previously been presented separately [25]. We introduce a family of mixing functions with a free parameter, α, and show that different values of the parameter correspond to different assumptions about the coverage task, specifically showing that a minimum variance solution (probabilistic strategy) is obtained with a parameter value of α = −1, Voronoi coverage (geometric strategy) is recovered in the limit α → −∞, and Voronoi smoothing coverage is recovered for −∞ < α < 0.

Using a minimum variance controller (α = −1), we offer an improvement in informative path stability over the Voronoi approach presented in [33], by showing that small differences in the robots' initial waypoint positions do not result in significantly different informative path configurations. We derive both single robot and multi-robot cases for each informative path control strategy and prove asymptotic stability and convergence using Lyapunov stability theorems. We develop and analyze single robot and multi-robot informative path algorithms, and perform simulations in MATLAB for both cases.

* An informative persistence path controller. As [33] did for a Voronoi-based control strategy, we extend our robust adaptive coverage controller so that persistent sensing tasks can now be performed in unknown environments when the robots' stabilizing path configurations are unknown a priori. Combining a stability metric from persistent sensing tasks with our robust informative control strategy for the single-robot and multi-robot cases, we develop informative persistence controllers that locally optimize the persistent sensing task by generating informative paths for the robots and subsequently increase the stability metric of the persistent sensing task. Lyapunov proofs are used to prove stability of both the single-robot and multi-robot cases. We evaluate and simulate both single robot and multi-robot informative persistence algorithms in MATLAB.

* A dynamic patrolling policy for a fleet of service vehicles. We instantiate the informative persistence controller in a traffic application: matching supply and demand for taxis or a Mobility-on-Demand transportation system. The dynamic patrolling policy comprises multiple patrol loops and a provably stable vehicle redistribution model. Patrol loops are generated using actual historical data from a fleet of 16,000 vehicles over multiple days as the input to our mixing function informative path controller. In line with our objective to match supply and demand, the computed patrol loops minimize the instantaneous distance between customer requests and taxi position, as well as the length of each patrol loop. Once a configuration of patrol loops has been computed, a centralized scheduling algorithm is implemented to manage request allocation and vehicle redistribution for large-scale (> 500 agents) MATLAB simulations. Our dynamic patrolling algorithm is tested against a greedy service policy and actual historical taxi performance in Singapore.


1.3 Relation to Previous Work

This work builds on several bodies of work: (1) adaptive control, (2) informative path planning, (3) coverage control, and (4) multi-agent systems. Adaptive path planning algorithms traditionally consider the real time mapping of paths to a set of desired states in an unknown or dynamic environment in continuous-time systems. For example, in [34], an optimization for path planning was presented for the case of partially known environments. In [8], the authors present a path planning algorithm for deploying unmanned aerial vehicle systems in an unknown environment. Most of the previous work in path planning focuses on computing an optimal path according to some metric to reach a destination [14], [23].

In this thesis, the objective of the robots is not to reach a final destination, but instead to continually travel along their computed closed path trajectories through regions of the environment where sensory information is nonzero. We prioritize generating paths that allow the robots to travel through regions of interest in an unknown environment, using adaptive control strategies to create a novel algorithm for computing informative paths.

Informative path sensing extends adaptive path planning algorithms with an emphasis on efficiently measuring and monitoring a dynamic environment. Such a method for computing paths that provide the most information about an environment was presented in [28], with the aim of adaptively learning and traversing through regions of interest with multiple robots. Informative sensing while maintaining periodic connectivity for the robots to share information and synchronize was examined in [12]. Our work considers adaptive path planning and informative sensing in a similar context, by using a robust control strategy that can use both geometric and probabilistic sensor measurement behavior to optimize a coverage task in a dynamic environment, as opposed to [25], where a non-adaptive probabilistic and geometric unifying control strategy was implemented for a known, static environment.

Cortes et al. [7] introduced a geometric control strategy for multi-robot coverage in a known environment that continually drives the robots toward the centroids of their Voronoi cells, or centroidal Voronoi configuration. Schwager et al. [27] extended this work by enabling the robots to sample and adaptively learn an unknown environment before they began reshaping their paths. Similar work in Voronoi coverage includes [11], where the objective was to design a sampling trajectory that minimizes the uncertainty of an estimated random field at the end of a time frame.

Another common approach in coverage control is a probabilistic strategy. For example, [13] proposes an algorithm for positioning robots to maximize the probability of detecting an event that occurs in the environment and minimizes the predictive variance in a time frame. Both geometric and probabilistic control strategies are based on an optimization that the controllers solve through the evolution of a dynamical system.

Geometric and probabilistic control strategies were unified in a mixing function controller introduced in [25]. This work is the most relevant to our thesis, because it enables a group of agents to position themselves statically in locally optimal locations according to either a probabilistic or Voronoi sensing coverage interpretation of a known environment. In this thesis, we build upon the mixing function controller by defining an agent's closed path as a set of waypoints that can distributedly execute a parameter adaptation law and a decentralized gradient control law to learn an unknown environment from robots' estimates and compute informative sensing paths, respectively. A similar extension of a pre-existing control law to enable informative path generation was presented in [32], where a Voronoi-based control strategy introduced in [27] served as the inspiration. The resulting informative paths computed by our mixing function control strategy locally optimize the sensing position of each waypoint while minimizing the length of the informative path traveled by the robots. Our mixing function control strategy can also be used in conjunction with a governing region revisit policy or speed controller to achieve persistent sensing.

The persistent sensing concept motivating this thesis was introduced in [30], where a linear program was designed to calculate the robots' speeds at each point along given paths, in order for them to stabilize a persistent sensing task. A persistent sensing task entails bounding the growth of sensory information within the environment for all times. Examples of growing sensory information could include the amount of rainfall accumulated in a given area, or the amount of measurement uncertainty at a point of interest in the environment.

In [30], the robots were assumed to have full knowledge of the environment and were given pre-designed paths. Following the method introduced in [33] for Voronoi coverage, in this thesis, we remove all prior environment assumptions by having the robots learn the environment through parameter estimation, and then use this information to shape their paths into informative persistence paths. By removing these constraints, we create a viable persistent sensing strategy for unknown and dynamic environments.

Persistent sensing is related to sweep coverage [5], where robots with finite sensor footprints must sweep their sensor over every point in the environment. The problem is also related to environmental monitoring research such as [4, 6, 15, 18, 37]. In this prior work, the authors often use a probabilistic model of the environment, and estimate the state of that model using a Kalman filter. The robots are then controlled so as to maximize a metric on the quality of the state estimate. Due to the complexity of the models, performance guarantees are difficult to obtain. In this thesis, based on our fully connected robot network, we can provide guarantees on the boundedness of the accumulation function.

By likening the concept of informative persistence sensing to patrolling problems [9, 21], we are able to propose a control strategy that distributedly uses an informative persistence patrolling loop to locally optimize a task allocation scenario in a dynamic transportation network. Distributed dynamic vehicle routing scenarios are considered in [1], where events occur according to a random process and are serviced by the robot closest to them. Work on optimal task allocation dates to [19] and [10]. Mobility-on-demand (MOD) is a similar paradigm for dealing with increasing urban congestion. Generally speaking, the objective of MOD problems is to provide on-demand rental facilities of convenient and efficient modes of transportation [20]. Load balancing in DTA problems essentially reduces to the Pickup and Delivery problem (PDP), whereby passengers arriving into a network are transported to a delivery site by vehicles. Autonomous load balancing in MOD systems has been studied in [22], where a fluid model was used to represent supply and demand. In this thesis, we employ a PDP formulation to model an urban transportation network.

Socially-motivated optimization criteria have also been considered in prior work. In [24, 36], social optimum planning models were used to compute vehicle paths. Optimization of driving routes subject to congestion was considered in [17]. In a broader context, [16] observed the effect that multiple service policies had on logistic taxi optimization. More recently, in [35] we studied both system-level and social optimization criteria, showing a relationship between urban planning, fuel consumption, and quality of service metrics. In this work we consider similar evaluation models, showing how we can achieve an improvement with respect to all three of these aforementioned points of interest.

1.4 Thesis Organization

This thesis is divided into five chapters. Chapter 2 provides the main theoretical foundation of the thesis and derives the mixing function informative path controller for both single and multi-robot systems. Simulations and validations of the control algorithms are shown for a wide array of robot control strategies including minimum variance, Voronoi smoothing, and strictly Voronoi approaches. Chapter 3 extends the informative path controller to persistent sensing and introduces stability margin requirements to the mixing function controller. Simulations are shown for a minimum variance informative persistence control algorithm. Chapter 4 presents a dynamic patrolling policy for a fleet of service vehicles in a MOD system using informative paths derived from Voronoi-based controllers. Chapter 5


Chapter 2

Informative Path Controller Using a Mixing Function

The idea of using a mixing function for static coverage in a known environment with multi-robot systems was introduced in [25] and was shown to produce results that were more stable numerically as compared to a geometric Voronoi-based approach. In this chapter, we build on this robustness insight and show that we can use a mixing-function-based approach to create an informative path controller that is more stable numerically than the results in [33]. The mixing function control strategy consists of an adaptation law for parameter estimation and a gradient optimization of a coverage cost function consisting of a sensing error cost, a robot path length cost, and a parameterized mixing function. We show that informative paths generated by this control strategy can be altered by varying a free parameter $\alpha$ to enable different sensor estimate mixing behaviors between robots that can be interpreted as either probabilistic, geometric approximation, or geometric. The resulting informative paths computed by the mixing function control strategy, regardless of the sensing interpretation, locally optimize the coverage task. A mathematical formulation of the problem follows.

2.1 Problem Setup


A sensory function, defined as a map $\phi : Q \to \mathbb{R}_{>0}$, where $Q$ is a convex, compact environment, determines the constant rate of change of the environment at an arbitrary point $q \in Q$. Let there be $N \in \mathbb{Z}^+$ robots identified by $r \in \{1, \ldots, N\}$. Robot $r$ is equipped with a sensor to make a point measurement $\phi(p_r)$ at its position $p_r \in Q \subset \mathbb{R}^2$ while traveling along its closed path $\ell_r : [0, 1] \to \mathbb{R}^2$, consisting of a finite number $n_r$ of waypoints. Note that the $n_r$ waypoints corresponding to robot $r$ are different from the $n_{r'}$ waypoints corresponding to robot $r'$, $\forall r' \neq r$.

The position of the $i$th waypoint on $\ell_r$ is denoted $p_i^r \in \mathcal{P} \subset \mathbb{R}^{d_{\mathcal{P}}}$, where $i \in \{1, \ldots, n_r\}$, $\mathcal{P}$ is the state space of a single waypoint, and $d_{\mathcal{P}}$ is the dimension of the state space. We define $P_r = \{p_1^r, \ldots, p_{n_r}^r\} \in \mathcal{P}^{n_r}$ and $P = [P_1, \ldots, P_N] \in \mathcal{P}^{N n_r}$ as the configuration vectors of robot $r$ and of all waypoints, respectively. Because $\ell_r$ is closed, each waypoint $i$ has a previous waypoint $i - 1$ and next waypoint $i + 1$ related to it, which are called the neighbor waypoints of $i$. Note that $i + 1 = 1$ for $i = n_r$, and $i - 1 = n_r$ for $i = 1$. A robot moves between sequential waypoints in a straight line interpolation.

For each waypoint, the cost of the sensing estimate of a point $q \in Q$ from its position $p_i^r$ is given by the function

$$f(p_i^r, q) = \|q - p_i^r\|^2, \tag{2.1}$$

where $f(p_i^r, q) \in \mathbb{R}_{\geq 0}$ and is differentiable with respect to $p_i^r$. The sensor measurement estimates of the $N \cdot n_r$ waypoints are combined in a function $g(f(p_1^1, q), \ldots, f(p_{n_N}^N, q))$, called the mixing function [25]. The mixing function $g_\alpha : \mathbb{R}^{N n_r}_{\geq 0} \to \mathbb{R}$ defines how sensory information from different robots is combined to give an aggregate cost of the waypoints' estimate of $q$. We propose a mixing function of the form

$$g_\alpha = \left( \sum_{r=1}^{N} \sum_{i=1}^{n_r} f(p_i^r, q)^{\alpha} \right)^{1/\alpha}, \tag{2.2}$$

where $\alpha$ is a free parameter. The mixing function manifests assumptions about the coverage task: by changing the mixing function, we can derive a variety of distributed controllers, including minimum variance ($\alpha = -1$), Voronoi ($\alpha = -\infty$), and Voronoi smoothing coverage control ($-1 > \alpha > -\infty$) [25].

Consider a sensing task in which an event of interest occurs randomly at a point $q$ and is sensed at a distance by sensors located on different robots. The mixing function (2.2) assumes that different waypoints positioned at $p_i^r$ and $p_j^{r'}$ may both have some sensory information about the event, instead of only counting the information from the waypoint that is closest to $q$ as in the Voronoi approach. Unlike the geometric Voronoi approach, the mixing function captures the intuition that using more sensor estimates may provide a more accurate estimate of a point of interest in the environment than a single localized sensor estimate of the same point. Mixing function coverage for $-1 > \alpha > -\infty$ is shown in Figure 2-1a, where the overlap of sensor estimates at two waypoint locations is shown as the intersection of two circles. Figure 2-1b shows that the Voronoi coverage case only considers robots' sensor estimates of $q$ within their Voronoi partition [7], thus allowing for no sensor estimate mixing between waypoints.

[Figure 2-1 panels: (a) Mixing Function Schematic, $-1 \geq \alpha > -\infty$; (b) Voronoi Schematic, $\alpha = -\infty$. Labels in the panels: waypoint position, sensor cost $f(p_i^r, q)$, mixing function.]

Figure 2-1: The mixing function defines how sensor measurements of the convex environment are shared by the waypoints. For probabilistic and Voronoi smoothing cases ($-1 \geq \alpha > -\infty$), waypoints combine sensor estimates. For the Voronoi case ($\alpha = -\infty$), only sensor measurements of points of interest within a waypoint's Voronoi partition are considered, and waypoints do not combine sensor estimates.

The mixing function has several important properties. For $\alpha \geq 1$, $g_\alpha$ becomes the p-norm (with $p = \alpha$) of the vector $[f_1 \cdots f_{N n_r}]^T$. When $\alpha < 1$, $g_\alpha$ is non-convex and not a norm, because it violates the triangle inequality. When $\alpha < 0$, $g_\alpha$ is smaller than any of its arguments alone. Therefore, the cost of sensing at a point $q$ with different waypoints positioned at $p_i^r$ and $p_j^{r'}$, $\forall i \neq j$, is less than the cost of sensing with only one of the waypoints individually. Furthermore, the decrease in $g_\alpha$ from the addition of a second waypoint is greater than that from the addition of a third waypoint, and so on. Thus, there is a successively smaller benefit to adding more robots. This property is called supermodularity, and is shown in Figure 2-2.
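The shrinking benefit of each additional waypoint can be checked numerically. The following is a minimal sketch, not the thesis implementation: the function name `mixing_function` and the three cost values are hypothetical and chosen only for illustration, and the costs are assumed strictly positive so that negative powers are defined.

```python
import numpy as np

def mixing_function(costs, alpha):
    """g_alpha = (sum_k f_k**alpha)**(1/alpha) for strictly positive costs f_k."""
    costs = np.asarray(costs, dtype=float)
    return float(np.sum(costs ** alpha) ** (1.0 / alpha))

# Hypothetical sensing costs f(p, q) of three waypoints for a single point q.
costs = [1.0, 2.0, 4.0]
alpha = -1.0

g1 = mixing_function(costs[:1], alpha)  # one waypoint
g2 = mixing_function(costs[:2], alpha)  # two waypoints
g3 = mixing_function(costs[:3], alpha)  # three waypoints

# Each added waypoint lowers the aggregate cost, by a smaller amount each time.
print(g1, g2, g3)
print(g1 - g2, g2 - g3)

# For a very negative alpha the mixing function approaches min(costs).
g_limit = mixing_function(costs, -50.0)
print(g_limit)
```

In this example the aggregate cost with all three waypoints is below the smallest individual cost, which matches the property stated for negative values of the free parameter.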

Figure 2-2: As more sensing estimates are considered, the property of supermodularity dictates that the amount by which the mixing function decreases becomes increasingly smaller.

2.2 Mixing Function Cost

Building upon [25, 33] and using (2.2), we propose a generalized, non-convex cost function of the form

$$\begin{aligned} H(P) &= \int_Q \frac{W_s}{2}\, g\big(f(p_1^1, q), \ldots, f(p_{n_N}^N, q)\big)\, \phi(q)\, dq + \sum_{r=1}^{N} \sum_{i=1}^{n_r} \frac{W_n}{2} \|p_i^r - p_{i+1}^r\|^2 \\ &= \int_Q \frac{W_s}{2} \left( \sum_{r=1}^{N} \sum_{i=1}^{n_r} f(p_i^r, q)^{\alpha} \right)^{1/\alpha} \phi(q)\, dq + \sum_{r=1}^{N} \sum_{i=1}^{n_r} \frac{W_n}{2} \|p_i^r - p_{i+1}^r\|^2, \end{aligned} \tag{2.3}$$

where $\|\cdot\|$ denotes the $\ell^2$-norm, and the integrand $g(f(p_1^1, q), \ldots, f(p_{n_N}^N, q))$ represents the aggregated sensing estimate of all waypoints at a single arbitrary point $q$, with a corresponding weight $W_s \in \mathbb{Z}^+$. Integrating over all points in $Q$, weighted by $\phi(q)$, gives the first term of the cost function. The second term of the cost function represents the cost of positioning neighboring waypoints of the same robot too far from one another. Ultimately, this term dictates the cost assigned to the final length of the informative path, and it is given a corresponding weight $W_n \in \mathbb{Z}^+$.

Our goal is to develop a controller that stabilizes the waypoints around configurations $P^*$ that minimize $H$ [25]. The general mixing function cost (2.3) can be shown to recover several common existing coverage cost functions. Drawing out the relations between these different coverage algorithms suggests new insights into when one algorithm should be preferred over another.

Substituting (2.1) and (2.2) into the general cost function from (2.3), we explicitly derive the mixing function cost

$$H_\alpha(P) = \int_Q \frac{W_s}{2} \left( \sum_{r=1}^{N} \sum_{i=1}^{n_r} \big(\|q - p_i^r\|^2\big)^{\alpha} \right)^{1/\alpha} \phi(q)\, dq + \sum_{r=1}^{N} \sum_{i=1}^{n_r} \frac{W_n}{2} \|p_i^r - p_{i+1}^r\|^2. \tag{2.4}$$

This robust cost function consists of a sensing cost, a robot path length cost, and a mixing function cost. An equilibrium is reached between these individual costs when $\partial H_\alpha(P) / \partial p_i^r = 0$. This optimization defines how the mixing function control strategy generates informative paths. A formal definition of informative paths for multiple robots follows.

Definition 1 (Informative Paths for Multiple Robots using a Mixing Function) A collection of informative paths for a multi-robot system corresponds to the set of waypoint locations for each robot that locally minimizes (2.4).

2.3 Mixing Function Control Law

Because $H_\alpha$ is non-convex, a gradient based controller of the form

$$u_i^r = -K_i^r \frac{\partial H_\alpha(P)}{\partial p_i^r} \tag{2.5}$$

yields locally optimal waypoint configurations $P^*$ for a control input $u_i^r$ with integrator dynamics $\dot{p}_i^r = u_i^r$, and a strictly positive definite gain matrix $K_i^r$ [25].
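The gradient in this controller rests on one chain-rule step, which can be written out from (2.1) and (2.2) alone. The following is a worked sketch of that step; the dummy indices $r'$ and $j$ are introduced here only to keep the summation separate from the waypoint being differentiated.

```latex
\frac{\partial g_\alpha}{\partial p_i^r}
  = \frac{1}{\alpha}
    \Big( \sum_{r'=1}^{N} \sum_{j=1}^{n_{r'}} f(p_j^{r'}, q)^{\alpha} \Big)^{\frac{1}{\alpha} - 1}
    \alpha\, f(p_i^r, q)^{\alpha - 1}\,
    \frac{\partial f(p_i^r, q)}{\partial p_i^r}
  = \Big( \frac{f(p_i^r, q)}{g_\alpha} \Big)^{\alpha - 1}\, 2\,(p_i^r - q)
```

The second equality uses $\big(\sum f^{\alpha}\big)^{1/\alpha - 1} = g_\alpha^{1 - \alpha}$ and $\partial \|q - p_i^r\|^2 / \partial p_i^r = 2(p_i^r - q)$, so the weight $W_s / 2$ on the sensing term becomes $W_s$ after differentiation.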


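By substituting the explicit value of $H_\alpha(P)$, (2.5) becomes

$$\dot{p}_i^r = -K_i^r \left[ \int_Q \frac{W_s}{2} \left( \frac{f(p_i^r, q)}{g_\alpha} \right)^{\alpha - 1} \frac{\partial f(p_i^r, q)}{\partial p_i^r}\, \phi(q)\, dq + \sum_{r'=1}^{N} \sum_{j=1}^{n_{r'}} \frac{W_n}{2} \frac{\partial \|p_j^{r'} - p_{j+1}^{r'}\|^2}{\partial p_i^r} \right]. \tag{2.6}$$

By expanding the sensing cost, the general gradient based controller (2.6) becomes

$$\dot{p}_i^r = -K_i^r \left[ \int_Q \frac{W_s}{2} \left( \frac{f(p_i^r, q)}{g_\alpha} \right)^{\alpha - 1} \frac{\partial \big( \|q\|^2 - 2 q^T p_i^r + \|p_i^r\|^2 \big)}{\partial p_i^r}\, \phi(q)\, dq + \sum_{r'=1}^{N} \sum_{j=1}^{n_{r'}} \frac{W_n}{2} \frac{\partial \|p_j^{r'} - p_{j+1}^{r'}\|^2}{\partial p_i^r} \right]. \tag{2.7}$$

Using this gradient descent approach, we propose the following generalized mixing function control law to locally minimize (2.4) and enable waypoints to converge to an equilibrium configuration

$$\dot{p}_i^r = -K_i^r \left[ \int_Q W_s \left( \frac{f(p_i^r, q)}{g_\alpha} \right)^{\alpha - 1} (-q + p_i^r)\, \phi(q)\, dq - W_n p_{i+1}^r - W_n p_{i-1}^r + 2 W_n p_i^r \right]. \tag{2.8}$$

By substituting the values of the sensor cost (2.1) and the mixing function (2.2), the mixing function control law is explicitly defined as

$$\dot{p}_i^r = -K_i^r \left[ \int_Q W_s \left( \sum_{r'=1}^{N} \sum_{j=1}^{n_{r'}} \big(\|q - p_j^{r'}\|^2\big)^{\alpha} \right)^{\frac{1 - \alpha}{\alpha}} \big(\|q - p_i^r\|^2\big)^{\alpha - 1} (p_i^r - q)\, \phi(q)\, dq - W_n p_{i+1}^r - W_n p_{i-1}^r + 2 W_n p_i^r \right]. \tag{2.9}$$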

It follows from (2.8) and (2.9) that the term inside the integral, $\big( \sum_{r'=1}^{N} \sum_{j=1}^{n_{r'}} (\|q - p_j^{r'}\|^2)^{\alpha} \big)^{\frac{1 - \alpha}{\alpha}} (\|q - p_i^r\|^2)^{\alpha - 1}$, is equivalent to $(f(p_i^r, q) / g_\alpha)^{\alpha - 1}$. This term is important, because it gives an approximation to the indicator function of the Voronoi partition of waypoint $i$ of robot $r$. The approximation improves as $\alpha \to -\infty$. In addition to giving an approximation to a Voronoi partition, $(f(p_i^r, q) / g_\alpha)^{\alpha - 1}$ defines how the sensor estimates of different robots are combined over the environment. As shown in Section 2.4, Voronoi coverage is defined as $\lim_{\alpha \to -\infty} (f(p_i^r, q) / g_\alpha)^{\alpha - 1}$. At this limit, there is no sensor mixing between different robots. Using this intuition, we are able to use our coverage controller to approximate a Voronoi coverage controller with a higher degree of accuracy as we decrease $\alpha$ towards $-\infty$. Even for values of $\alpha \gg -\infty$, e.g. $\alpha \in [-15, -10]$, the smoothing controller approximates the Voronoi partition closely. The resulting contours of $(f(p_i^r, q) / g_\alpha)^{\alpha - 1}$ for various $\alpha$ values are shown in Figure 2-3.
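How quickly this weight approaches a Voronoi indicator can be seen with two waypoints and one query point. This is a minimal sketch under stated assumptions: the function name `mixing_weights` and the coordinates are hypothetical, and the computation is rescaled by the smallest cost only to keep the negative powers numerically stable.

```python
import numpy as np

def mixing_weights(waypoints, q, alpha):
    """Return (f_k / g_alpha)**(alpha - 1) for every waypoint k at the point q."""
    waypoints = np.asarray(waypoints, dtype=float)
    f = np.sum((waypoints - np.asarray(q, dtype=float)) ** 2, axis=1)
    ratio = f / f.min()                                # f_k relative to the closest waypoint
    g_rel = np.sum(ratio ** alpha) ** (1.0 / alpha)    # g_alpha in the same relative units
    return (ratio / g_rel) ** (alpha - 1.0)

# Hypothetical example: q lies closer to the first waypoint.
waypoints = [[0.2, 0.5], [0.8, 0.5]]
q = [0.4, 0.5]

w_probabilistic = mixing_weights(waypoints, q, alpha=-1.0)
w_smoothing = mixing_weights(waypoints, q, alpha=-15.0)
print(w_probabilistic)  # both waypoints contribute
print(w_smoothing)      # nearly all weight on the closest waypoint
```

With the free parameter at -1 both waypoints receive a nonzero weight at `q`, while at -15 the weight of the farther waypoint is negligible, which is the behavior the Voronoi indicator describes.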

[Figure 2-3 panels: (a) $\alpha = -0.5$; (b) $\alpha$ value not legible in the source; (c) $\alpha = -10$; (d) $\alpha = -15$. Both axes of each panel range from 0 to 1.]

Figure 2-3: Contour plots of $(f(p_i^r, q) / g_\alpha)^{\alpha - 1}$ are shown for two robots with five waypoints each. As $\alpha \to -\infty$, the contours approach a Voronoi partition.

Next, we define three substitution variables analogous to mass-moments of rigid bodies. The mass $M_i^r$, first mass-moment $Y_i^r$, and centroid $C_i^r$ of environment $Q$ for a mixing function are defined as

$$M_i^r = \int_Q W_s \left( \frac{f(p_i^r, q)}{g_\alpha} \right)^{\alpha - 1} \phi(q)\, dq, \tag{2.10}$$

$$Y_i^r = \int_Q W_s \left( \frac{f(p_i^r, q)}{g_\alpha} \right)^{\alpha - 1} \phi(q)\, q\, dq, \tag{2.11}$$

$$C_i^r = \frac{Y_i^r}{M_i^r}. \tag{2.12}$$

Let $e_i^r = C_i^r - p_i^r$. Note that $f(\|q - p_i^r\|)$ strictly increasing and $\phi(q)$ strictly positive imply both $M_i^r > 0$ $\forall Q \neq \emptyset$ and $C_i^r$ is in the interior of $Q$. Thus $M_i^r$ and $C_i^r$ have properties intrinsic to physical masses and centroids. Using these inertial property substitutions, the mixing function control law (2.9) is defined as

$$\dot{p}_i^r = \frac{K_i^r \left( M_i^r e_i^r + \gamma_i^r \right)}{\beta_i^r}, \tag{2.13}$$

where

$$\gamma_i^r = W_n \left( p_{i+1}^r + p_{i-1}^r - 2 p_i^r \right), \tag{2.14}$$

$$\beta_i^r = M_i^r + 2 W_n > 0. \tag{2.15}$$

Remark 2 $\beta_i^r > 0$ normalizes the weight distribution between sensing and staying close to neighboring waypoints.
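The control law (2.13) can be sketched numerically by replacing the integrals in (2.10) and (2.11) with sums over a grid. The code below is an illustrative sketch, not the thesis implementation: the function name `waypoint_velocities`, the grid discretization, the scalar gain `K`, the hotspot-shaped sensory function, and the explicit Euler update are all assumptions made here.

```python
import numpy as np

def waypoint_velocities(paths, grid, phi, cell_area, alpha, Ws=1.0, Wn=1.0, K=1.0):
    """Velocity of every waypoint from (2.13); paths is a list of (n_r, 2) arrays."""
    all_pts = np.vstack(paths)                                          # (W, 2)
    f = np.sum((grid[:, None, :] - all_pts[None, :, :]) ** 2, axis=2)   # (G, W) costs
    f = np.maximum(f, 1e-12)                     # guard against 0 raised to a negative power
    ratio = f / f.min(axis=1, keepdims=True)     # rescale for numerical stability
    g_rel = np.sum(ratio ** alpha, axis=1, keepdims=True) ** (1.0 / alpha)
    w = (ratio / g_rel) ** (alpha - 1.0)         # (f / g_alpha)^(alpha - 1)
    weighted = w * phi[:, None]                  # (G, W)
    M = Ws * cell_area * weighted.sum(axis=0)    # (2.10), one mass per waypoint
    Y = Ws * cell_area * weighted.T @ grid       # (2.11), shape (W, 2)
    C = Y / M[:, None]                           # (2.12)
    velocities, start = [], 0
    for path in paths:
        n = len(path)
        m = M[start:start + n]
        e = C[start:start + n] - path
        # Closed path: neighbors wrap around, as in the definition of i+1 and i-1.
        gamma = Wn * (np.roll(path, -1, axis=0) + np.roll(path, 1, axis=0) - 2 * path)
        beta = m + 2 * Wn
        velocities.append(K * (m[:, None] * e + gamma) / beta[:, None])
        start += n
    return velocities

# Hypothetical unit-square environment with one information hotspot.
xs = (np.arange(30) + 0.5) / 30
grid = np.stack(np.meshgrid(xs, xs), axis=-1).reshape(-1, 2)
phi = np.exp(-np.sum((grid - [0.7, 0.6]) ** 2, axis=1) / 0.02) + 0.05
paths = [np.array([[0.2, 0.2], [0.4, 0.2], [0.4, 0.4], [0.2, 0.4]]),
         np.array([[0.6, 0.6], [0.8, 0.6], [0.8, 0.8], [0.6, 0.8]])]
for _ in range(50):  # explicit Euler integration of the waypoint dynamics
    v = waypoint_velocities(paths, grid, phi, 1.0 / 900, alpha=-5.0, Wn=0.05)
    paths = [p + 0.1 * dp for p, dp in zip(paths, v)]
```

The grid resolution and step size trade accuracy against run time; a finer grid approximates the integrals better but scales the cost array with the number of cells times the number of waypoints.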

2.4 Deriving Common Control Strategies

In this section, we use the mixing function cost (2.4) and coverage control law (2.13) to derive Voronoi and minimum variance control strategies. These strategies represent the range of robot sensor behaviors that can be produced by using the mixing function. Whereas a Voronoi coverage strategy does not combine sensor measurements from different robots, a minimum variance strategy combines all robots' sensor measurements to minimize the expected variance of the global sensor estimate.


2.4.1 Voronoi Control Strategy ($\alpha = -\infty$)

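From [27], we define the Voronoi partition of the $i$th waypoint along $\ell_r$ as

$$V_i^r = \{ q \in Q : \|q - p_i^r\| \leq \|q - p_{i'}^{r'}\|, \ \forall (r', i') \neq (r, i) \}, \tag{2.16}$$

where $r, r' \in \{1, \ldots, N\}$, $i \in \{1, \ldots, n_r\}$ and $i' \in \{1, \ldots, n_{r'}\}$.

Because $\lim_{\alpha \to -\infty} g_\alpha(f(p_1^1, q), \ldots, f(p_{n_N}^N, q)) = \min_{r, i} f(p_i^r, q)$, it follows that for $\alpha = -\infty$, $g_{-\infty} = \min_{r, i} \|q - p_i^r\|^2$, which implies the Voronoi indicator function $(f(p_i^r, q) / g_\alpha)^{\alpha - 1} = 1$ for $q \in V_i^r$, where $f(p_i^r, q) = g_{-\infty}$, and $0$ elsewhere. Intuitively, $\min g_\alpha$ stipulates that there is no sharing of sensor measurements between waypoints over the environment, and consequently, waypoint $i$ of robot $r$ considers only $q \in V_i^r$.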

As a result of no sensor mixing between different robots, the cost incurred by all the robots due to the event at $q$ is the same as that incurred by the robot that is closest to $q$. Thus, for $g_{-\infty}$, the coverage cost of the mixing function controller is equivalent to the coverage cost of a Voronoi controller, which is defined in [32] as

$$H_V = \sum_{r=1}^{N} \sum_{i=1}^{n_r} \int_{V_i^r} \frac{W_s}{2} \|q - p_i^r\|^2 \phi(q)\, dq + \sum_{r=1}^{N} \sum_{i=1}^{n_r} \frac{W_n}{2} \|p_i^r - p_{i+1}^r\|^2. \tag{2.17}$$

By redefining the mass $M_i^r$, first mass-moment $Y_i^r$, and centroid $C_i^r$ of waypoint $i$'s Voronoi partition $V_i^r$ as

$$M_i^r = \int_{V_i^r} W_s \phi(q)\, dq, \tag{2.18}$$

$$Y_i^r = \int_{V_i^r} W_s \phi(q)\, q\, dq, \tag{2.19}$$

$$C_i^r = \frac{Y_i^r}{M_i^r}, \tag{2.20}$$

and by setting $e_i^r = C_i^r - p_i^r$, the resulting gradient descent Voronoi control law is derived as

$$\dot{p}_i^r = \frac{K_i^r \left( M_i^r e_i^r + \gamma_i^r \right)}{\beta_i^r}, \tag{2.21}$$

where $\gamma_i^r$ and $\beta_i^r$ are the same as in (2.14) and (2.15), respectively. From (2.21), we can see that the Voronoi control law differs from our proposed control law due to the absence of a mixing function in the mass properties $M_i^r$ and $Y_i^r$ of the Voronoi partition $V_i^r$.
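For comparison with the mixing function moments, the Voronoi mass properties (2.18) to (2.20) can be approximated on a grid by assigning each cell to its nearest waypoint. This is a minimal sketch under assumptions made here: the function name `voronoi_moments`, the uniform sensory function, and the grid discretization are illustrative, and every waypoint is assumed to own at least one grid cell so that the centroid is defined.

```python
import numpy as np

def voronoi_moments(grid, phi, waypoints, cell_area, Ws=1.0):
    """Grid approximation of (2.18)-(2.20): mass, first moment, centroid per waypoint."""
    d2 = np.sum((grid[:, None, :] - waypoints[None, :, :]) ** 2, axis=2)
    owner = np.argmin(d2, axis=1)            # index of the closest waypoint for each cell
    M = np.zeros(len(waypoints))
    Y = np.zeros((len(waypoints), 2))
    for k in range(len(waypoints)):
        cells = owner == k                   # discrete Voronoi partition of waypoint k
        M[k] = Ws * cell_area * phi[cells].sum()
        Y[k] = Ws * cell_area * (phi[cells, None] * grid[cells]).sum(axis=0)
    return M, Y, Y / M[:, None]

# Hypothetical example: uniform information over the unit square, two waypoints.
xs = (np.arange(20) + 0.5) / 20
grid = np.stack(np.meshgrid(xs, xs), axis=-1).reshape(-1, 2)
phi = np.ones(len(grid))
waypoints = np.array([[0.25, 0.5], [0.75, 0.5]])
M, Y, C = voronoi_moments(grid, phi, waypoints, cell_area=1.0 / 400)
print(M)
print(C)
```

Each cell contributes to exactly one waypoint here, whereas in the mixing function moments every cell contributes to every waypoint with a weight that depends on the free parameter.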

2.4.2 Minimum Variance Probabilistic Control Strategy ($\alpha = -1$)

In this section we use a mixing function with free parameter $\alpha = -1$ to derive a control strategy that minimizes the robots' expected variance of their measurements of a point of interest $q$. We will formulate an optimal Bayesian estimator for the location of $q$ given the aggregate measurements of the waypoints.

Assume that the waypoints have noisy measurements of the position of a point of interest in the environment. Let the point of interest be given by a random variable $q$ that takes on values in $Q$, where waypoint $i$ of robot $r$ has the current environment measurement $x_i^r = q - z$. Here, $z \sim N(0, I_{2 \times 2}\, f(p_i^r, q))$ is a bi-variate normally distributed random variable, and $I_{2 \times 2}$ is an identity matrix. The variance of the measurement, $f(p_i^r, q)$, is a function of the position of the sensor estimate and the point of interest. Given this variance, the measurement likelihood of waypoint $i$ of robot $r$ is

$$\Pr(x_i^r \mid q : p_i^r) = \frac{1}{2 \pi f(p_i^r, q)} \exp\left( -\frac{\|x_i^r - q\|^2}{2 f(p_i^r, q)} \right).$$

Assuming the measurement estimates obtained by different waypoints conditioned on $q$ are independent, and that $\phi(q)$ is the prior distribution of the position of $q$, Bayes' Theorem gives the posterior distribution,

$$\Pr(q \mid x_1^1, \ldots, x_{n_N}^N) = \frac{\prod_{r=1}^{N} \prod_{i=1}^{n_r} \Pr(x_i^r \mid q : p_i^r)\, \phi(q)}{\int_Q \prod_{r=1}^{N} \prod_{i=1}^{n_r} \Pr(x_i^r \mid q : p_i^r)\, \phi(q)\, dq}. \tag{2.22}$$

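Our goal is to position the waypoints so that their total estimate of $q$ is as accurate as possible. To achieve this, we want to position the robots so that they minimize the variance of their combined sensor measurements. The product of measurement likelihoods in the numerator of (2.22) can be simplified to a single likelihood function, which has the form of an un-normalized Gaussian

$$\prod_{r=1}^{N} \prod_{i=1}^{n_r} \Pr(x_i^r \mid q : p_i^r) = \gamma \exp\left( -\frac{\|\bar{x} - q\|^2}{2 g_{-1}} \right), \tag{2.23}$$

whose variance is equivalent to our mixing function $g_{-1} = \big( \sum_{r=1}^{N} \sum_{i=1}^{n_r} f(p_i^r, q)^{-1} \big)^{-1}$. The values of $\gamma$ and $\bar{x}$ are given by

$$\bar{x} = g_{-1} \sum_{r=1}^{N} \sum_{i=1}^{n_r} f(p_i^r, q)^{-1} x_i^r \quad \text{and} \tag{2.24}$$

$$\gamma = \frac{1}{(2\pi)^{N n_r} \prod_{r=1}^{N} \prod_{i=1}^{n_r} f(p_i^r, q)} \exp\left( \frac{\|\bar{x}\|^2}{2 g_{-1}} - \sum_{r=1}^{N} \sum_{i=1}^{n_r} \frac{\|x_i^r\|^2}{2 f(p_i^r, q)} \right), \tag{2.25}$$

respectively.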

Finally, the expectation over $q$ of the likelihood variance recovers our original general mixing cost function (2.3),

$$\begin{aligned} H_{-1} &= \frac{W_s}{2} E_q\big[ g_{-1}\big(f(p_1^1, q), \ldots, f(p_{n_N}^N, q)\big) \big] + \sum_{r=1}^{N} \sum_{i=1}^{n_r} \frac{W_n}{2} \|p_i^r - p_{i+1}^r\|^2 \\ &= \int_Q \frac{W_s}{2}\, g_{-1}\big(f(p_1^1, q), \ldots, f(p_{n_N}^N, q)\big)\, \phi(q)\, dq + \sum_{r=1}^{N} \sum_{i=1}^{n_r} \frac{W_n}{2} \|p_i^r - p_{i+1}^r\|^2. \end{aligned} \tag{2.26}$$

From this derivation, we can interpret the coverage control optimization as finding the waypoint positions that minimize the expected variance of the likelihood function for an optimal Bayesian estimator of the position of the point of interest $q$.
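The claim that independent Gaussian measurements combine into an estimate whose variance is the mixing function with free parameter -1 can be checked by simulation. The following is a hedged sketch: the function name `fuse`, the measurement variances, the sign convention of the noise, and the Monte Carlo setup are hypothetical choices made here for illustration.

```python
import numpy as np

def fuse(x, f):
    """Precision-weighted mean and its variance g_{-1}; x has shape (..., K, 2)."""
    f = np.asarray(f, dtype=float)
    g = 1.0 / np.sum(1.0 / f)                       # g_{-1} = (sum_k 1/f_k)^(-1)
    x_bar = g * np.sum(np.asarray(x, dtype=float) / f[:, None], axis=-2)
    return x_bar, g

rng = np.random.default_rng(0)
q = np.array([0.3, 0.7])                  # hypothetical point of interest
f = np.array([0.04, 0.09, 0.25])          # hypothetical measurement variances of three waypoints
trials = 20000

# Each trial draws one noisy measurement per waypoint, with covariance f_k * I.
noise = rng.normal(size=(trials, len(f), 2)) * np.sqrt(f)[:, None]
x_bar, g = fuse(q + noise, f)

empirical_var = x_bar.var(axis=0)         # per-coordinate variance of the fused estimate
print(g, empirical_var)
```

The fused variance is smaller than the smallest individual variance, which is the probabilistic reading of the property that the mixing function is smaller than any of its arguments when the free parameter is negative.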

2.5 Mixing Function Controller Convergence

In this section we introduce sensory function parameterization and prove that the proposed mixing function control law in (2.13) causes the set of robot path configurations to converge to a locally optimal configuration according to (2.4) for both known and unknown environments.

2.5.1 Sensory Function Parameterization

The sensory function O(q) can be parameterized as a linear combination of a set of known basis functions spanning Rd, where d is the dimension of the function space.


Assumption 1 (Sensory Basis Functions) $\exists\, a \in \mathbb{R}^d_{\geq 0}$ and $B : Q \to \mathbb{R}^d_{\geq 0}$, where $\mathbb{R}^d_{\geq 0}$ is a vector of nonnegative entries, such that

$$\phi(q) = B(q)^T a, \tag{2.27}$$

where the vector of basis functions $B(q)$ is known by the robots, and the nonnegative parameter vector $a$ is estimated in an unknown environment.
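Because the sensory function is linear in the parameter vector, point measurements constrain that vector through ordinary linear algebra. The sketch below illustrates this under assumptions made here: Gaussian bumps are used as the basis (the assumption above does not prescribe a particular basis), and the names `gaussian_basis`, `centers`, `sigma`, and `a_true` are hypothetical. The least-squares fit only illustrates the linear structure and is not the adaptation law of the thesis.

```python
import numpy as np

def gaussian_basis(q, centers, sigma):
    """B(q): one isotropic Gaussian bump per center, evaluated at points q of shape (G, 2)."""
    d2 = np.sum((q[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    return np.exp(-d2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)

# Hypothetical 3x3 arrangement of basis centers over the unit square.
c = np.array([1 / 6, 1 / 2, 5 / 6])
centers = np.stack(np.meshgrid(c, c), axis=-1).reshape(-1, 2)
sigma = 0.18
a_true = np.array([0, 0, 2, 0, 1, 0, 0, 0, 0.5], dtype=float)  # nonnegative parameters

# Noise-free point measurements phi(q) = B(q)^T a at random sample locations.
rng = np.random.default_rng(1)
samples = rng.random((200, 2))
B = gaussian_basis(samples, centers, sigma)
phi_samples = B @ a_true

# The linear parameterization lets the parameters be recovered by least squares.
a_hat, *_ = np.linalg.lstsq(B, phi_samples, rcond=None)
print(np.round(a_hat, 6))
```

Nonnegative basis values together with nonnegative parameters keep the modeled sensory function nonnegative everywhere, which is the role of the nonnegativity conditions in the assumption.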

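In adaptive control literature [29], $B$ represents the set of parameters that can be measured, given full state feedback and observable system dynamics. The parameter vector $a$ represents the set of estimated system parameters such as unknown trajectories or inertias. Denoting $\hat{a}_r(t)$ as robot $r$'s estimate of $a$, it follows that $\hat{\phi}_r(q) = B(q)^T \hat{a}_r$ is the robot's approximation of $\phi(q)$, and the mass moment approximations can now be defined as

$$\hat{M}_i^r = \int_Q W_s \left( \frac{f(p_i^r, q)}{g_\alpha} \right)^{\alpha - 1} \hat{\phi}_r(q)\, dq, \tag{2.28}$$

$$\hat{Y}_i^r = \int_Q W_s \left( \frac{f(p_i^r, q)}{g_\alpha} \right)^{\alpha - 1} q\, \hat{\phi}_r(q)\, dq, \tag{2.29}$$

$$\hat{C}_i^r = \frac{\hat{Y}_i^r}{\hat{M}_i^r}, \tag{2.30}$$

for the mixing function control strategy. Using $\tilde{a}_r = \hat{a}_r - a$, the sensory function error and mass moment errors of the mixing function are defined as

$$\tilde{\phi}_r(q) = \hat{\phi}_r(q) - \phi(q) = B(q)^T \tilde{a}_r, \tag{2.31}$$

$$\tilde{M}_i^r = \hat{M}_i^r - M_i^r = \int_Q W_s \left( \frac{f(p_i^r, q)}{g_\alpha} \right)^{\alpha - 1} B(q)^T dq\, \tilde{a}_r, \tag{2.32}$$

$$\tilde{Y}_i^r = \hat{Y}_i^r - Y_i^r = \int_Q W_s \left( \frac{f(p_i^r, q)}{g_\alpha} \right)^{\alpha - 1} q\, B(q)^T dq\, \tilde{a}_r, \tag{2.33}$$

$$\tilde{C}_i^r = \frac{\tilde{Y}_i^r}{\tilde{M}_i^r}. \tag{2.34}$$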


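When $\alpha = -1$, we recover the minimum variance mass moment errors

$$\tilde{M}_i^r = \hat{M}_i^r - M_i^r = \int_Q W_s \left( \frac{f(p_i^r, q)}{g_{-1}} \right)^{-2} B(q)^T dq\, \tilde{a}_r, \tag{2.35}$$

$$\tilde{Y}_i^r = \hat{Y}_i^r - Y_i^r = \int_Q W_s \left( \frac{f(p_i^r, q)}{g_{-1}} \right)^{-2} q\, B(q)^T dq\, \tilde{a}_r, \tag{2.36}$$

$$\tilde{C}_i^r = \frac{\tilde{Y}_i^r}{\tilde{M}_i^r}. \tag{2.37}$$

Similarly, when $\alpha = -\infty$, $(f(p_i^r, q) / g_\alpha)^{\alpha - 1} \to 1$ within $V_i^r$, and we recover the Voronoi mass moment errors

$$\tilde{M}_i^r = \hat{M}_i^r - M_i^r = \int_{V_i^r} W_s B(q)^T dq\, \tilde{a}_r, \tag{2.38}$$

$$\tilde{Y}_i^r = \hat{Y}_i^r - Y_i^r = \int_{V_i^r} W_s\, q\, B(q)^T dq\, \tilde{a}_r, \tag{2.39}$$

$$\tilde{C}_i^r = \frac{\tilde{Y}_i^r}{\tilde{M}_i^r}. \tag{2.40}$$

In order to compress the notation in all three mass moment approximations, we set the terms $B_{p_r}(t)$ and $\phi_{p_r}(t)$ as the value of the basis function vector and the value of $\phi$ at the robot's position $p_r(t)$, respectively.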

2.5.2 Coverage Convergence in a Known Environment

We assume that the robots and waypoints have full knowledge of the sensory parameter vector, i.e. $\hat{a}_r = a$, and therefore their sensory function estimate $\hat{\phi}_r(q) = \phi(q)$.

Theorem 1 (Mixing Function Convergence Theorem in a Known Environment) The configuration of all waypoint positions $P$ converges to a locally optimal configuration according to $\partial H_\alpha / \partial p_i^r = 0$.

Proof 1 We define a Lyapunov-like function based on the agent's path and environment measurement. Because the system is autonomous, we use LaSalle's Invariance Principle and invariant set theory to prove asymptotic stability of the system to a locally optimal equilibrium.


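Let $H_\alpha$ be the Lyapunov function candidate. Because it is comprised of two squared 2-norms, $H_\alpha$ is positive definite. Additionally, $H_\alpha \to \infty$ as $\|p\| \to \infty$, and has continuous first partial derivatives. Domain $Q$ is bounded, and therefore the state space of all waypoints $\mathcal{P}^{N n_r}$ is bounded. Let $\Omega = \{ P^* \mid \partial H_\alpha / \partial p_i^r = 0 \} \subset \mathcal{P}^{N n_r}$ be the invariant set of all critical points of $H_\alpha$ over $\mathcal{P}^{N n_r}$. Taking the time derivative of $H_\alpha$, we obtain

$$\dot{H}_\alpha = \sum_{r=1}^{N} \sum_{i=1}^{n_r} \left( \frac{\partial H_\alpha}{\partial p_i^r} \right)^T \dot{p}_i^r = -\sum_{r=1}^{N} \sum_{i=1}^{n_r} \frac{1}{\beta_i^r} \left( M_i^r e_i^r + \gamma_i^r \right)^T K_i^r \left( M_i^r e_i^r + \gamma_i^r \right) \leq 0.$$

The first derivative of the Lyapunov candidate $\dot{H}_\alpha \leq 0$, because $K_i^r$ is strictly positive definite and $\beta_i^r > 0$. Because $\dot{H}_\alpha$ is negative semi-definite and $H_\alpha$ is positive definite, this implies that $H_\alpha$ is non-increasing and lower bounded, thus $\exists\, s < \infty$ such that $\lim_{t \to \infty} H_\alpha = s$. Now $\Omega$ is explicitly defined as the set of solutions of $M_i^r e_i^r + \gamma_i^r = 0$, $\forall i, r$. Let $S$ be the largest invariant set within $\Omega$. By definition $\dot{p}_i^r = -\frac{K_i^r}{\beta_i^r} \frac{\partial H_\alpha}{\partial p_i^r} = \frac{K_i^r}{\beta_i^r} \left( M_i^r e_i^r + \gamma_i^r \right)$, from which it follows that $S = \Omega$, the set of all critical points of $H_\alpha$. Thus, $\Omega$ is an invariant set, and all trajectories converge to $\Omega$ as $t \to \infty$ using LaSalle's Invariance Principle. From (2.13), $M_i^r e_i^r + \gamma_i^r \to 0$ implies $\partial H_\alpha / \partial p_i^r = 0$.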

2.5.3 Coverage Convergence in an Unknown Environment

We now extend the mixing function control law in (2.13) to include a parameterized adaptation law that ensures each robot's independently synthesized path converges to a locally optimal configuration according to (2.4), while each of the robots' estimates of the environment converge to the real environment. The presence of a consensus term in the adaptation law enables all of the robots' estimates of the environment to converge to the same estimate [27]. In order for robots' estimates of the environment to converge, the consensus term requires that each robot has knowledge of the states of all the other robots. Thus, we assume that our network of robots is fully connected.


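The control law from (2.13) becomes

$$\dot{p}_i^r = \frac{K_i^r \left( \hat{M}_i^r \hat{e}_i^r + \gamma_i^r \right)}{\hat{\beta}_i^r}, \tag{2.41}$$

where

$$\gamma_i^r = W_n \left( p_{i+1}^r + p_{i-1}^r - 2 p_i^r \right),$$

$$\hat{\beta}_i^r = \hat{M}_i^r + 2 W_n,$$

and $\hat{e}_i^r = \hat{C}_i^r - p_i^r$.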

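Parameter vector $\hat{a}_r$ is adjusted according to the following features of the adaptation law

$$\Lambda_r = \int_0^t w_r(\tau)\, B_{p_r}(\tau)\, B_{p_r}(\tau)^T d\tau, \tag{2.42}$$

$$\lambda_r = \int_0^t w_r(\tau)\, B_{p_r}(\tau)\, \phi_{p_r}(\tau)\, d\tau, \tag{2.43}$$

where the data collection weight $w_r(t)$ [27] is defined as

$$w_r(t) = \begin{cases} \text{positive constant scalar}, & \text{if } t < \tau_{w_r} \\ 0, & \text{otherwise}, \end{cases} \tag{2.44}$$

where $\tau_{w_r}$ represents the time at which part of the adaptation for robot $r$ shuts down to maintain $\Lambda_r$ and $\lambda_r$ bounded. Let

$$b_i^r = \int_Q W_s \left( \frac{f(p_i^r, q)}{g_\alpha} \right)^{\alpha - 1} B(q)\, (q - p_i^r)^T dq\, \dot{p}_i^r, \tag{2.45}$$

$$b_r = \sum_{i=1}^{n_r} b_i^r,$$

$$\dot{\hat{a}}_{pre_r} = -b_r - \Gamma \left( \Lambda_r \hat{a}_r - \lambda_r \right) - \sum_{r'=1}^{N} l_{r, r'} \left( \hat{a}_r - \hat{a}_{r'} \right), \tag{2.46}$$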
