Evaluation of closed-loop performance of an estimation strategy for decentralized safety
controller under communication delay and measurement uncertainty
Delphine Bresch-Pietri and Domitilla Del Vecchio
Abstract
We present here the details of the performance evaluation of an estimation strategy for a decentralized safety controller for two agents, subject to communication delay and imperfect measurements.
The control objective is to ensure safety, meaning that the state of the two-agent system does not enter an undesired set in the state space. Assuming that a feedback map designed for the delay-free case is known, we propose a state estimation strategy which guarantees control agreement between the two agents in the case of bounded communication delay. We extend it to the case of infinitely-distributed communication delays by determining a lower bound for the probability of safety. In the present note, we discuss the performance of the resulting controller.
I. NOTATION
In the following, m and p are positive integers. We denote with a superscript i the variables relative to agent i for i∈ {1,2}, with a superscript L (resp. R) the variables relative to the local agent (resp. the remote one) and with a subscript the coordinate.
$|\cdot|$ denotes the Euclidean norm whereas $\|\cdot\|_\infty$ is used for the infinity norm of a signal. The diameter of a set $S$ is written as $D(S) = \sup_{(s_1,s_2)\in S^2} |s_1 - s_2|$. The distance between a point $x$ and a non-empty set $S$ is written as $d(x,S) = \inf_{s\in S} |s - x|$ and the distance between two non-empty sets $S_1$ and $S_2$ as
$$d(S_1,S_2) = \max\Big\{ \sup_{s_1\in S_1}\inf_{s_2\in S_2} |s_1-s_2|,\ \sup_{s_2\in S_2}\inf_{s_1\in S_1} |s_1-s_2| \Big\}.$$
The boundary of a set $S$ is written as $\partial S$ and its closure as $\bar S$.
D. Bresch-Pietri is with CNRS at GIPSA-lab, Control Department, 11 rue des Mathématiques, 38000 Grenoble, FRANCE.
Email address: [email protected]
D. Del Vecchio is with the Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge MA
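$C^0_{pw}(S_1,S_2)$ represents the set of piecewise continuous functions defined on the set $S_1$ and taking values in $S_2$. For two vectors $x$ and $\tilde x$ in $\mathbb{R}^p$, we write $x \le \tilde x$ if $x_i \le \tilde x_i$ for all $1\le i\le p$. For $S_1\subset\mathbb{R}^p$, $S_2\subset\mathbb{R}^p$ and $(\xi,\tilde\xi)\in C^0_{pw}(S_1,S_2)^2$, we write $\xi\le\tilde\xi$ if $\xi(s)\le\tilde\xi(s)$ for all $s\in S_1$. For two vectors $x$ and $\tilde x$ in $\mathbb{R}^p$ such that $x\le\tilde x$, we write $[x,\tilde x] = [x_1,\tilde x_1]\times[x_2,\tilde x_2]\times\dots\times[x_p,\tilde x_p]$ and $I(\mathbb{R}^p) = \{[x,\tilde x] \mid (x,\tilde x)\in(\mathbb{R}^p)^2\}$.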
ϕ(t,t0,x0,u)∈Rp is the flow associated with a given dynamics at time t ≥t0 corresponding to the initial condition x0∈Rp at time t0≥0 driven by the input signal u∈ C0pw([t0,∞),Rm).
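For a set $S\subset\mathbb{R}^p$, we write $\varphi(t,t_0,S,u) = \cup_{x_0\in S}\varphi(t,t_0,x_0,u)$. When possible, we simply let $\varphi(t,S,u)=\varphi(t,0,S,u)$. For $x:\mathbb{R}_+\to\mathbb{R}^p$ and $0\le t_1\le t_2$, we write $x|_{[t_1,t_2]}: s\in[t_1,t_2]\mapsto x(s)$.
When necessary, we write $\varphi(t,S,u|_{[\bar t,\bar t+t)})$ for the flow at time $t\ge 0$ driven by a portion of the input signal $u\in C_{pw}([0,\infty),\mathbb{R}^m)$, with $\bar t\ge0$. A scalar continuous function $\alpha:\mathbb{R}_+\to\mathbb{R}_+$ is said to be of class $\mathcal{K}$ if $\alpha(0)=0$ and $\alpha$ is strictly increasing. A scalar continuous function $\gamma:\mathbb{R}_+\to\mathbb{R}_+$ is said to be of class $\mathcal{K}_\infty$ if it is of class $\mathcal{K}$ and if $\gamma(t)\to\infty$ as $t\to\infty$.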
In the sequel, a white noise refers to a stochastic signal with a constant power spectral density for any frequency included in its (potentially infinite) spectrum. We write $E(X)$ for the expected value of a random variable $X$.
Finally, for $(x,y)\in\mathbb{R}\times(\mathbb{R}\setminus\{0\})$, we write $x\equiv 0 \bmod y$ if there exists $n\in\mathbb{N}$ such that $x=ny$, and $\lfloor x\rfloor = m$ with $m\in\mathbb{N}$ such that $m\le x<m+1$.
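The set distances introduced in this section can be made concrete on finite point sets. The following is a minimal sketch, assuming sets given as lists of coordinate tuples; the function names and the example sets are illustrative and do not come from the paper.

```python
import math
from itertools import product


def diameter(points):
    """D(S): largest Euclidean distance between two points of a finite set S."""
    return max(math.dist(a, b) for a, b in product(points, repeat=2))


def point_to_set(x, points):
    """d(x, S): smallest Euclidean distance from x to a point of S."""
    return min(math.dist(x, s) for s in points)


def hausdorff(set_a, set_b):
    """d(S1, S2): the larger of the two directed sup-inf distances."""
    a_to_b = max(point_to_set(a, set_b) for a in set_a)
    b_to_a = max(point_to_set(b, set_a) for b in set_b)
    return max(a_to_b, b_to_a)


# Illustrative sets: the corners of the unit square and a copy shifted by 3.
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
shifted = [(x + 3.0, y) for x, y in square]
print(diameter(square))
print(hausdorff(square, shifted))
```

For compact sets with infinitely many points, such as the boxes used later, the suprema and infima would have to be evaluated analytically rather than by enumeration.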
II. PROBLEM STATEMENT
A. Agent dynamics
We consider that each agent is governed by the same dynamics¹, namely, for $i\in\{1,2\}$,
$$\dot x^i(t) = f^i(x^i(t),u^i(t)), \qquad (1)$$
$$y^i(t) = x^i(t) + \sigma^i(t), \qquad (2)$$
with $(x^i,y^i)\in\mathbb{R}^n\times\mathbb{R}^n$, $u^i\in[u_m,u_M]\subset\mathbb{R}^m$ and $\sigma^i\in[\sigma^i_m,\sigma^i_M]\subset\mathbb{R}^n$. Further, in the sequel, we consider the measurement map $h^i: y^i\in\mathbb{R}^n\mapsto[y^i-\sigma^i_M,\ y^i-\sigma^i_m]$, which is bounded set-valued and such that, for any output $y^i(t)$, $x^i(t)\in h^i(y^i(t))$. In other words, for any measurement $y^i$, each agent has access to a bounded set-valued function that returns the set of all states consistent with the current output. Finally, it is assumed that the vector field $f^i$ satisfies the following property.
Assumption 1: For any initial condition $x_0\in\mathbb{R}^n$ and any input $u\in C_{pw}(\mathbb{R}_+,[u_m,u_M])$, the solution of (1) is global and unique.
¹ Note that other output maps could be considered, such as multiplicative bounded uncertainties for example. Provided that a corresponding bounded set-valued measurement map $h^i$ exists, the proposed estimation strategy will hold.
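The interval-valued measurement map of this subsection can be sketched in a few lines. This is only an illustration under assumed numbers: the helper names `measurement_set` and `contains`, the two-dimensional state and the noise bounds are hypothetical and not taken from the paper.

```python
def measurement_set(y, sigma_min, sigma_max):
    """h(y) = [y - sigma_max, y - sigma_min], returned as (lower, upper) corners."""
    lower = [yk - hi for yk, hi in zip(y, sigma_max)]
    upper = [yk - lo for yk, lo in zip(y, sigma_min)]
    return lower, upper


def contains(box, x):
    """True if the point x lies in the box given by its (lower, upper) corners."""
    lower, upper = box
    return all(lo <= xk <= up for lo, xk, up in zip(lower, x, upper))


# Illustrative numbers (not from the paper): 2-D state, noise bounded in [-0.1, 0.2].
x_true = [1.0, -2.0]
sigma = [0.15, -0.05]
y = [xk + sk for xk, sk in zip(x_true, sigma)]  # output equation (2)
box = measurement_set(y, sigma_min=[-0.1, -0.1], sigma_max=[0.2, 0.2])
print(contains(box, x_true))
```

Whatever admissible noise sample is drawn, the true state stays inside the returned box, which is the property the estimation strategy relies on.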
Note that this assumption also applies to the extended dynamics
$$\dot x(t) = \big(f^1(x^1,u^1(t)),\ f^2(x^2,u^2(t))\big), \qquad (3)$$
$$y = x+\sigma, \qquad (4)$$
in which $x=(x^1,x^2)$ and $\sigma=(\sigma^1,\sigma^2)$. In the sequel, we write $u=(u^1,u^2)$, $h(y)=(h^1(y^1),h^2(y^2))$ and $\varphi$ for the flow associated with (3), which is well-defined according to Assumption 1.
B. Delay-free control design
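Given an open set $\mathcal{B}\subset\mathbb{R}^{2n}$, define the capture set
$$\mathcal{C} = \big\{ S\subset\mathbb{R}^{2n} \mid \forall u\in C_{pw}(\mathbb{R}_+,[u_m,u_M]^2)\ \ \exists t\ge0\ \ \varphi(t,S,u)\cap\mathcal{B}\neq\emptyset \big\}.$$
Besides, define the operator
$$\Phi: \mathbb{R}_+\times 2^{\mathbb{R}^{2n}}\times 2^{C_{pw}(\mathbb{R}_+,[u_m,u_M]^2)} \to 2^{\mathbb{R}^{2n}}, \qquad (t,S,U)\mapsto \cup_{u\in U}\,\varphi(t,S,u). \qquad (5)$$
Assumption 2: There exists a decreasing and Cartesian product-valued feedback law $\pi: 2^{\mathbb{R}^{2n}}\to 2^{[u_m,u_M]}\times2^{[u_m,u_M]}$ such that, for all $S\subset\mathbb{R}^{2n}$ and $\tilde\pi\in2^{C_{pw}(\mathbb{R}_+,[u_m,u_M]^2)}$ such that $S\notin\mathcal{C}$ and $\tilde\pi(t)\subseteq\pi(\Phi(t,S,\tilde\pi|_{[0,t)}))$ for $t\ge0$, one has $\Phi(t,S,\tilde\pi|_{[0,t)})\notin\mathcal{C}$ for $t\ge 0$.
The map $\pi$ is decreasing, i.e., for two sets $S_1$ and $S_2$ in $\mathbb{R}^{2n}$ such that $S_1\subseteq S_2$, one has $\pi(S_1)\supseteq\pi(S_2)$. Qualitatively, this property indicates that any input keeping a set outside of the capture set should also keep any subset of it outside of the capture set.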
C. Agents communication and delays
From now on, we focus on one of the agents, referred to as the local agent. We introduce notations to outline information that the local agent receives from the other (remote) agent and computations that it performs based on both this information and locally available data. To this end, we use a superscript $L$ (resp. $R$) for quantities computed by the local (resp. remote) agent, with $(L,R)\in\{(1,2),(2,1)\}$.
We assume that both agents share the same universal time $t$, obtained from GPS measurements for example², and use it to stamp exchanged data. The information sent by the remote agent at time $t$ is then
$$\tilde z^R(t) = \big(t,\ h^R(y^R(t))\big). \qquad (6)$$
Further, we consider that communication delays occur between the two agents, with independent but symmetric communication channels, i.e., the delays share the same model. Namely, defining ZL(t) the set of information received by the local agent at time t, we have
$$\tilde z^R(t)\in Z^L(t+\tau_{RL}(t)), \qquad (7)$$
in which $\tau_{RL}\ge0$ is a continuous-time white noise process, which can be infinitely-distributed, and $\tau_{RL}(t)$ and $\tau_{LR}(t)$ have the same (time-invariant) probability density function but are independent.
$\tau_{RL}(t)$ represents the time the information takes to travel from the remote agent $R$ to the local agent $L$ ($\tau_{LR}$ represents the converse one). Note that the set $Z^L(t)$ can be empty (if no information is received at time $t$) or can contain several elements (if more than one piece of information is received at time $t$). The definition (6) of the exchanged information implies that, for each piece of information, the corresponding delay value is known, as the two agents can determine it by comparing the exchanged time stamp and the current time stamp.
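The role of the time stamp in (6) can be illustrated with a small sketch. The `Message` class, the helper names and the sample inbox below are hypothetical; the only facts taken from the text are that a message carries its emission time together with a measured set, and that the receiver recovers the delay by comparing the stamp with the shared current time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Sketch of z^R(t) = (t, h^R(y^R(t))): a time stamp plus a measured box."""
    stamp: float
    box: tuple  # (lower, upper) corners of the measured set at time `stamp`


def delays_of(received, now):
    """Delay of every message received at time `now`: current time minus stamp."""
    return [now - msg.stamp for msg in received]


def arrived_in_time(received, now, tau_star):
    """Illustrative filter: messages whose delay does not exceed tau_star."""
    return [msg for msg in received if now - msg.stamp <= tau_star]


# Hypothetical content of Z^L(5.0): two messages arrive at the same instant.
inbox = [Message(4.5, ((0.0,), (1.0,))), Message(3.0, ((0.2,), (0.9,)))]
print(delays_of(inbox, now=5.0))
print(len(arrived_in_time(inbox, now=5.0, tau_star=1.0)))
```

The threshold `tau_star` anticipates the synchronization delay introduced in the next subsection; an empty inbox simply yields empty lists, matching the case where no information is received.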
D. Problem under consideration and estimation strategy
The problem at stake here is: given a feedback map that guarantees safety in a context without communication delay and satisfies Assumption 2, design a state estimation strategy that allows that same feedback map to be used to guarantee safety in the presence of communication delay and measurement uncertainties. We formulate it mathematically in the following.
² We consider that the two agents are close enough so we can neglect the difference between the two received GPS signals [3].
Problem 1: Given systems (3)–(4) subject to communication delay (7), a bad set $\mathcal{B}\subset\mathbb{R}^{2n}$, and a feedback map satisfying Assumption 2, determine a state estimation procedure $t\in\mathbb{R}_+\mapsto(\hat x^1(t),\hat x^2(t))\subset\mathbb{R}^{2n}\times\mathbb{R}^{2n}$ and an initialization set $X_0\subset\mathbb{R}^{2n}$ such that, with $U^1: t\in\mathbb{R}\mapsto\pi(\hat x^1(t))$ and $U^2: t\in\mathbb{R}\mapsto\pi(\hat x^2(t))$, if $x(0)\in X_0$, $u^1(t)\in U^1_1(t)$ and $u^2(t)\in U^2_2(t)$ for $t\ge0$, then the solution of (3) satisfies $x(t)=\varphi(t,x(0),(u^1,u^2))\notin\mathcal{B}$ for $t\ge0$.
In our companion paper [1], we proved the following result for this problem.
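Theorem 1: Consider the plant (1) satisfying Assumption 1, the feedback law $\pi$ defined in Assumption 2 and the operator $\Phi$ defined in (5). Define $\tau^*\ge0$ and $\delta>0$ and let $\Phi(t,\emptyset,U)=\emptyset$ for $t\ge0$ and $U\in2^{C^0_{pw}(\mathbb{R}_+,[u_m,u_M]^2)}$. For $L\in\{1,2\}$, consider $\hat h^L_{d,syn}(t,\tau^*)=\emptyset$ for $t\in[0,\tau^*)$ and, for $t\ge\tau^*$,
$$\hat h^L_{d,syn}(t,\tau^*) = \begin{cases}
\hat h^L_d(z) & \text{if } t-\tau^*\equiv 0 \bmod \delta \text{ and if there exist } s\le t \text{ and } z\in Z^L(s) \text{ s.t. } t-z_1=\tau^*,\\[4pt]
\Phi\big(\delta,\ \hat h^L_{d,syn}(t-\delta,\tau^*),\ U^L|_{[t-\delta-\tau^*,\,t-\tau^*)}\big) & \text{if } t-\tau^*\equiv 0 \bmod \delta \text{ and if } t-z_1\neq\tau^* \text{ for all } z\in Z^L(s) \text{ and } s\le t,\\[4pt]
\Phi\Big(t-\tau^*-\big\lfloor\tfrac{t-\tau^*}{\delta}\big\rfloor\delta,\ \hat h^L_{d,syn}\big(\big\lfloor\tfrac{t-\tau^*}{\delta}\big\rfloor\delta+\tau^*,\tau^*\big),\ U^L|_{[\lfloor\frac{t-\tau^*}{\delta}\rfloor\delta,\ t-\tau^*)}\Big) & \text{otherwise,}
\end{cases} \qquad (8)$$
and let
$$\hat x^L_d(t) = \Big\{ x\in\mathbb{R}^{2n} \ \Big|\ \exists (x_0,w)\in\hat h^L_{d,syn}(\tau^*,\tau^*)\times U^L,\ x=\varphi(t-\tau^*,x_0,w) \text{ and, for } s\in[\tau^*,t],\ \varphi(s-\tau^*,x_0,w)\in\hat h^L_{d,syn}(s,\tau^*) \Big\}, \quad t\ge\tau^*, \qquad (9)$$
$$\hat x^L(t) = \Phi\big(\tau^*,\ \hat x^L_d(t),\ U^L|_{[t-\tau^*,\,t)}\big), \quad t\ge\tau^*, \qquad (10)$$
$$U^L(t) = \begin{cases} \pi(\hat x^L(t)) & \text{if } t\ge\tau^* \text{ and } \hat x^L(t)\neq\emptyset,\\ [u_m,u_M]^2 & \text{otherwise.} \end{cases} \qquad (11)$$
Provided that $\Phi(\tau^*,h(x(0)+\sigma),[u_m,u_M]^2)\notin\mathcal{C}$ and that $u^L(t)\in U^L_L(t)$ for $L\in\{1,2\}$ and $t\ge0$, then, for $T\ge\tau^*$,
$$\Pr\big(x(t)\notin\mathcal{B},\ t\in[0,T]\big) \ \ge\ p^2\big(p^2+(1-p^2)(1-p)^2\big)^{\lfloor(T-\tau^*)/\delta\rfloor} \ \triangleq\ \Pi(\delta,\tau^*), \qquad (12)$$
with $p=\Pr(\tau_{RL}(t)\le\tau^*)=\Pr(\tau_{LR}(t)\le\tau^*)$.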
One can note that the probability bound Π(δ,τ∗) is increasing with respect to τ∗ and with respect to δ. In the sequel, we are interested in the corresponding closed-loop performance and its dependence on those two tuning parameters.
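To get a feeling for the bound (12), it can be evaluated numerically. The sketch below implements the right-hand side of (12) as written above; the exponential delay distribution used to obtain p, its rate and the horizon are assumptions made purely for illustration and are not specified in this note.

```python
import math


def safety_probability_bound(p, horizon, tau_star, delta):
    """Lower bound Pi(delta, tau_star) of (12), with p = Pr(tau <= tau_star)."""
    if horizon < tau_star:
        raise ValueError("(12) is stated for T >= tau_star")
    updates = math.floor((horizon - tau_star) / delta)
    per_update = p**2 + (1 - p**2) * (1 - p) ** 2
    return p**2 * per_update**updates


def p_exponential(tau_star, rate):
    """Pr(tau <= tau_star) for an exponentially distributed delay (an assumption)."""
    return 1.0 - math.exp(-rate * tau_star)


T = 10.0  # illustrative horizon
for tau_star in (1.0, 3.0):
    p = p_exponential(tau_star, rate=1.0)
    for delta in (1.0, 3.0):
        print(tau_star, delta, safety_probability_bound(p, T, tau_star, delta))
```

With these illustrative values, enlarging either δ or τ∗ raises the bound, in line with the remark above.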
III. EVALUATION OF CLOSED-LOOP PERFORMANCE
In this section, we aim at providing an evaluation of the performance which can be obtained using the proposed technique, i.e., we determine how far from the bad set the trajectory generated by the control law can be with communication delay. With this aim in view, we define $U=\{u\in C_{pw}(\mathbb{R}_+,[u_m,u_M]^2)\mid u^L(t)\in U^L_L(t),\ t\ge\tau^*,\ L=1,2\}$ and consider the following quantity
$$\inf_{t\in[0,T]}\ \sup_{u\in U}\ d(\varphi(t,x_0,u),\mathcal{B}), \qquad (13)$$
for $T\ge\tau^*$ and a given $x_0\in\mathbb{R}^{2n}$ such that $\Phi(\tau^*,h(x_0+\sigma),[u_m,u_M]^2)\notin\mathcal{C}$. This distance quantifies, for given initial conditions satisfying the assumptions of Theorem 1, how far from the bad set the trajectory generated by the proposed control law can be in the presence of communication delay. However, in the case of an infinitely-distributed delay, the probability of entering the bad set is not equal to zero and therefore the trajectory can actually intersect the bad set. This is why we will restrict the trajectories under consideration to safe ones in the sequel.
A. Upper-bound for closed-loop performance
First, we characterize further the dynamics under consideration.
Assumption 3: There exist a positive increasing scalar continuous function $\kappa$, class $\mathcal{K}$ functions $\alpha_1$ and $\alpha_2$ and a class $\mathcal{K}_\infty$ function $\gamma$ such that, for arbitrary pairs $(x_1,x_2)\in\mathbb{R}^n\times\mathbb{R}^n$ and $(u_1,u_2)\in C^0_{pw}(\mathbb{R},\mathbb{R}^m)^2$, the corresponding solutions of (1) satisfy, for $t\ge0$,
$$|\varphi(t,x_1,u_1)-\varphi(t,x_2,u_2)| \le \kappa(t)\,\alpha_1(|x_1-x_2|)+\alpha_2(t)\,\gamma(\|u_2-u_1\|_\infty). \qquad (14)$$
This assumption is motivated by Theorem 3.4 in [2]. Indeed, provided that the vector field is continuously differentiable with respect to the state $x$ and to the input $u$, Assumption 3 holds.
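To evaluate the closed-loop performance for the proposed control technique and guarantee that the considered trajectories do not enter the bad set, we consider trajectories such that the following event, denoted $E_0$ in the sequel, holds:
$$\begin{aligned}
&\tau_{LR}(0)\le\tau^*,\ \ \tau_{RL}(0)\le\tau^* \ \text{ and, for } n\in\Big\{1,\dots,\Big\lfloor\tfrac{T-\tau^*}{\delta}\Big\rfloor\Big\},\\
&\text{either } \tau_{LR}(n\delta)\le\tau^* \text{ and } \tau_{RL}(n\delta)\le\tau^*, \ \text{ or } \ \tau_{LR}(n\delta)>\tau^* \text{ and } \tau_{RL}(n\delta)>\tau^*.
\end{aligned} \qquad (15)$$
As this event is not deterministically given, in the sequel, we consider a conditional expected value of the quantity (13).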
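The event (15) is a condition on the delays sampled at the update instants 0, δ, 2δ, and so on. A minimal sketch of the corresponding check is given below, assuming the two delay sequences are available as plain lists; the function name and the sample values are hypothetical.

```python
def event_holds(tau_lr, tau_rl, tau_star):
    """Check event (15) on delay samples taken at times 0, delta, 2*delta, ...

    tau_lr[n] and tau_rl[n] are the two channel delays at time n*delta.
    """
    if tau_lr[0] > tau_star or tau_rl[0] > tau_star:
        return False  # both initial messages must arrive within tau_star
    for a, b in zip(tau_lr[1:], tau_rl[1:]):
        if (a <= tau_star) != (b <= tau_star):
            return False  # one channel is within tau_star, the other is not
    return True


# Hypothetical samples with tau_star = 1.0.
print(event_holds([0.2, 0.4, 1.7], [0.9, 0.1, 2.5], 1.0))  # both late at n = 2
print(event_holds([0.2, 0.4, 1.7], [0.9, 0.1, 0.5], 1.0))  # only one late at n = 2
```

The check treats the two channels symmetrically, which mirrors the symmetric delay model of Section II-C.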
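Proposition 1: Consider the plant (1) satisfying Assumptions 1 and 3 and the assumptions of Theorem 1. Define $U=\{u\in C_{pw}(\mathbb{R}_+,[u_m,u_M]^2)\mid u^L(t)\in U^L_L(t),\ t\ge\tau^*,\ L=1,2\}$. Then, for $T\ge\tau^*$ and for trajectories originating from $x_0=x(0)$ such that the event (15) holds for all time, there exist a positive increasing scalar continuous function $\kappa$, class $\mathcal{K}$ functions $\alpha_1$ and $\alpha_2$ and a class $\mathcal{K}_\infty$ function $\gamma$ such that
$$\inf_{t\in[0,T]}\sup_{u\in U} E\big(d(\varphi(t,x_0,u),\mathcal{B})\,\big|\,E_0\big) \ \le\ \kappa(T)\,\alpha_1\big(\max_y D(h(y))\big) + \inf_{t\in[0,T]}\sup_{u\in U} E\big(d(\hat x^L(t),\mathcal{B})\,\big|\,E_0\big) + G(\delta,\tau^*), \qquad (16)$$
with
$$G(\delta,\tau^*) = \Pr(\tau\le\tau^*)\sum_{i=0}^{\lfloor(T-\tau^*)/\delta\rfloor-1}\Pr(\tau>\tau^*)^i\,\alpha_2(\tau^*+(i+1)\delta)\,\gamma(|u_M-u_m|) \ +\ \Pr(\tau>\tau^*)^{\lfloor(T-\tau^*)/\delta\rfloor}\,\alpha_2(T)\,\gamma(|u_M-u_m|). \qquad (17)$$
Proof: First, following the proofs provided in [1], if (15) holds then $\hat x^1(t)=\hat x^2(t)=\hat x(t)$ and $x(t)\in\hat x(t)$ for $t\in[\tau^*,T]$. Further, $\hat h^1_{d,syn}(t,\tau^*)=\hat h^2_{d,syn}(t,\tau^*)=\hat h_{d,syn}(t,\tau^*)$ for $t\in[\tau^*,T]$.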
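Consequently, one obtains
$$d(\varphi(t,x_0,u),\mathcal{B}) \le D(\hat x(t)) + d(\hat x(t),\mathcal{B}), \qquad (18)$$
and, from the linearity of the expected value, it follows that
$$\begin{aligned}
E\big(d(\varphi(t,x_0,u),\mathcal{B})\,\big|\,E_0\big) &\le E\big(D(\hat x(t))\,\big|\,E_0\big) + E\big(d(\hat x(t),\mathcal{B})\,\big|\,E_0\big)\\
&\le E\big(D(\Phi(\tau^*,\hat h_{d,syn}(t,\tau^*),U|_{[t-\tau^*,t]}))\,\big|\,E_0\big) + E\big(d(\hat x(t),\mathcal{B})\,\big|\,E_0\big),
\end{aligned}$$
in which the last inequality follows from the fact that $\hat h_{d,syn}(t,\tau^*)\supseteq\hat x_d(t)$ for $t\ge\tau^*$ using (9)–(10). From (10), we have that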
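$$\begin{aligned}
&E\big(D(\Phi(\tau^*,\hat h_{d,syn}(t,\tau^*),U|_{[t-\tau^*,t]}))\,\big|\,E_0\big)\\
&\quad = E\Big(D\Big(\Phi\Big(\tau^*+t-\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta,\ \hat x_{d,syn}\Big(\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta,\tau^*\Big),\ U|_{[\lfloor(t-\tau^*)/\delta\rfloor\delta-\tau^*,\ t)}\Big)\Big)\,\Big|\,E_0\Big),
\end{aligned}$$
and, by definition of the expected value and from Cases 1-3 in (8), that
$$\begin{aligned}
&E\big(D(\Phi(\tau^*,\hat h_{d,syn}(t,\tau^*),U|_{[t-\tau^*,t]}))\,\big|\,E_0\big)\\
&\quad = \Pr\Big(\tau\Big(\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta-\tau^*\Big)\le\tau^*\Big)\times D\Big(\Phi\Big(\tau^*+t-\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta,\ h\Big(y\Big(\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta-\tau^*\Big)\Big),\ U|_{[\lfloor(t-\tau^*)/\delta\rfloor\delta-\tau^*,\ t)}\Big)\Big)\\
&\qquad + \Pr\Big(\tau\Big(\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta-\tau^*\Big)>\tau^*\Big)\times E\Big(D\Big(\Phi\Big(\tau^*-\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta+t-\delta,\ \hat x_{d,cor}\Big(\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta-\delta\Big),\ U|_{[\lfloor(t-\tau^*)/\delta\rfloor\delta-\tau^*,\ t)}\Big)\Big)\,\Big|\,E_0\Big).
\end{aligned}$$
Using again Cases 1-3 in (8) backward in time iteratively and taking into account the fact that $\Pr(\tau(0)\le\tau^*\,|\,E_0)=1$, one obtains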
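$$\begin{aligned}
&E\big(D(\Phi(\tau^*,\hat h_{d,syn}(t,\tau^*),U|_{[t-\tau^*,t]}))\,\big|\,E_0\big)\\
&\quad = \Pr(\tau\le\tau^*)\sum_{i=0}^{\lfloor(t-\tau^*)/\delta\rfloor-1}\Pr(\tau>\tau^*)^i \times D\Big(\Phi\Big(\tau^*+t-\Big(\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor-i\Big)\delta,\ h\Big(y\Big(\Big(\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor-i\Big)\delta-\tau^*\Big)\Big),\ U|_{[(\lfloor\frac{t-\tau^*}{\delta}\rfloor-i)\delta-\tau^*,\ t)}\Big)\Big)\\
&\qquad + \Pr(\tau>\tau^*)^{\lfloor\frac{t-\tau^*}{\delta}\rfloor}\, D\big(\Phi(t,h(y(0)),U|_{[0,t)})\big).
\end{aligned} \qquad (19)$$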
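Finally, applying (14) and (5), one obtains, for $s,s_0\ge0$,
$$\begin{aligned}
D\big(\Phi(s,h(y(s_0)),U|_{[s_0,\,s+s_0)})\big) &\le \Big|\max_{\substack{u:\mathbb{R}_+\to[u_m,u_M]^2\\ x\in h(y(s_0))}}\varphi(s,x,u) - \min_{\substack{u:\mathbb{R}_+\to[u_m,u_M]^2\\ x\in h(y(s_0))}}\varphi(s,x,u)\Big|\\
&\le \kappa(s)\,\alpha_1\big(\max_y D(h(y))\big) + \alpha_2(s)\,\gamma(|u_M-u_m|),
\end{aligned}$$
in which $\kappa$, $\alpha_1$, $\alpha_2$ and $\gamma$ are introduced in Assumption 3. Then, one can bound the terms in (19) with this last inequality and take the inf and sup of both sides for $t\ge0$. The result follows noticing that both $t\le T$ and $t-\lfloor(t-\tau^*)/\delta\rfloor\delta\le\delta$, and matching the terms multiplying $\kappa(T)\,\alpha_1(\max_y D(h(y)))$.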
Note that, here, $\max_y D(h(y)) = |\sigma_M-\sigma_m|$. We keep the general formulation $\max_y D(h(y))$ to encapsulate other bounded set-valued measurement maps, for which the previous analysis and in particular Theorem 1 also hold.
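For the interval-valued map of Section II-A, this value follows directly from the definition of the diameter given in Section I. The short computation below spells it out, writing $\sigma_m$ and $\sigma_M$ for the stacked lower and upper noise bounds of both agents (a notational assumption consistent with $\sigma=(\sigma^1,\sigma^2)$).

```latex
\begin{align*}
D(h(y)) &= \sup_{(s_1,s_2)\in h(y)^2} |s_1 - s_2|,
  \qquad h(y) = [\,y-\sigma_M,\ y-\sigma_m\,],\\
        &= \bigl|(y-\sigma_m) - (y-\sigma_M)\bigr|
         = |\sigma_M - \sigma_m| .
\end{align*}
```

The supremum is attained at two opposite corners of the box, and the result does not depend on $y$, so taking the maximum over $y$ leaves it unchanged.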
According to Proposition 1, the distance of the trajectory from the bad set, i.e., the conservatism of the proposed control strategy, can be quantified by three terms corresponding to three different phenomena: (i) the magnitude of measurement uncertainties; (ii) the performance of the nominal feedback law introduced in Assumption 2; and (iii) the synchronization technique introduced in (8)–(11). Without measurement uncertainties and communication delay, the left-hand side of (16) is equal to the right-hand side, since $\max_y D(h(y))=0$ and one can choose δ = τ∗ = 0 and therefore G(δ,τ∗) = 0. This bodes well for the tightness of the bound. While the influence of the first two phenomena can be easily characterized, it is necessary to analyze further the behavior of the third one with respect to the parameters involved in the control strategy, that is, the synchronization delay τ∗ and the period of update δ. This is the subject of the following section.
B. Variation of the upper-bound with respect to the control parameters τ∗ and δ
Lemma 1: The function G defined in (17) is increasing with respect to δ.
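Proof: Consider two positive scalars $\delta_1$ and $\delta_2$ such that $\delta_1\le\delta_2$. Consequently, $\lfloor(T-\tau^*)/\delta_1\rfloor\ge\lfloor(T-\tau^*)/\delta_2\rfloor$. First, if $\lfloor(T-\tau^*)/\delta_1\rfloor=0$, then $\lfloor(T-\tau^*)/\delta_2\rfloor=0$ and $G(\delta_1,\tau^*)=G(\delta_2,\tau^*)=\alpha_2(T)\,\gamma(|u_M-u_m|)$ according to (17). Otherwise, we have that
$$G(\delta_1,\tau^*) = \Pr(\tau\le\tau^*)\sum_{i=0}^{\lfloor(T-\tau^*)/\delta_2\rfloor-1}\Pr(\tau>\tau^*)^i\,\alpha_2(\tau^*+(i+1)\delta_1)\,\gamma(|u_M-u_m|) \ +\ \Pr(\tau>\tau^*)^{\lfloor(T-\tau^*)/\delta_2\rfloor}\,\tilde G(\delta_1,\delta_2,\tau^*), \qquad (20)$$
in which
$$\tilde G(\delta_1,\delta_2,\tau^*) = \gamma(|u_M-u_m|)\Bigg[\Pr(\tau>\tau^*)^{\lfloor\frac{T-\tau^*}{\delta_1}\rfloor-\lfloor\frac{T-\tau^*}{\delta_2}\rfloor}\,\alpha_2(T) + \Pr(\tau\le\tau^*)\sum_{i=0}^{\lfloor\frac{T-\tau^*}{\delta_1}\rfloor-\lfloor\frac{T-\tau^*}{\delta_2}\rfloor-1}\Pr(\tau>\tau^*)^i\,\alpha_2\Big(\tau^*+\Big(i+1+\Big\lfloor\tfrac{T-\tau^*}{\delta_2}\Big\rfloor\Big)\delta_1\Big)\Bigg].$$
One can observe that the second factor in $\tilde G$ is a weighted average of $\alpha_2(\tau^*+(1+\lfloor(T-\tau^*)/\delta_2\rfloor)\delta_1)$, $\alpha_2(\tau^*+(2+\lfloor(T-\tau^*)/\delta_2\rfloor)\delta_1)$, ..., $\alpha_2(T)$. Besides, as $\alpha_2$ is a class $\mathcal{K}$ function, $\alpha_2(\tau^*+(i+1+\lfloor(T-\tau^*)/\delta_2\rfloor)\delta_1)\le\alpha_2(T)$ for $0\le i\le\lfloor\frac{T-\tau^*}{\delta_1}\rfloor-\lfloor\frac{T-\tau^*}{\delta_2}\rfloor-1$.
Consequently, $\tilde G(\delta_1,\delta_2,\tau^*)\le\alpha_2(T)\,\gamma(|u_M-u_m|)$. Further, as $\alpha_2$ is a class $\mathcal{K}$ function, it follows that $\alpha_2(s_1+s_2\delta_1)\le\alpha_2(s_1+s_2\delta_2)$ for all $s_1,s_2\ge0$ and, therefore, from (20),
$$\begin{aligned}
G(\delta_1,\tau^*) &\le \Pr(\tau\le\tau^*)\sum_{i=0}^{\lfloor\frac{T-\tau^*}{\delta_2}\rfloor-1}\Pr(\tau>\tau^*)^i\,\alpha_2(\tau^*+(i+1)\delta_2)\,\gamma(|u_M-u_m|) + \Pr(\tau>\tau^*)^{\lfloor(T-\tau^*)/\delta_2\rfloor}\,\alpha_2(T)\,\gamma(|u_M-u_m|)\\
&\le G(\delta_2,\tau^*).
\end{aligned}$$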
This concludes the proof.
The variations of G with respect to τ∗, instead, are not monotone. Nevertheless, we can establish the following result regarding its asymptotic behavior (for small or large values of τ∗).
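Lemma 2: Consider $\delta\ge0$ and the function $G$ introduced in (17). There exist $\tau_0$ and $\tau_1$ with $0<\tau_0\le\tau_1$ such that $G$ is decreasing for $\tau^*\in[0,\tau_0)$ and increasing for $\tau^*\in(\tau_1,\infty)$.
Proof: Considering (17), one can observe that $G(\cdot,\tau^*)\underset{\tau^*\to0}{\sim}\Pr(\tau>\tau^*)\,\alpha_2(T)\,\gamma(|u_M-u_m|)$, which is a decreasing function of $\tau^*$. Therefore, there exists $\tau_0>0$ such that $G(\cdot,\tau^*)$ is decreasing for $\tau^*\in[0,\tau_0)$.
Similarly, one obtains $G(\cdot,\tau^*)\underset{\tau^*\to\infty}{\sim}\Pr(\tau\le\tau^*)\,\alpha_2(\tau^*)\,\gamma(|u_M-u_m|)$, which is an increasing function of $\tau^*$. This concludes the proof.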
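The behavior of G with respect to its two arguments can also be explored numerically. The sketch below evaluates (17) directly; the choices α2(s) = s, γ(|uM−um|) = 1, the exponential delay distribution and the horizon are assumptions made only for illustration, and the function names are hypothetical.

```python
import math


def bound_G(delta, tau_star, horizon, p, alpha2, gamma_term):
    """G(delta, tau_star) of (17), with p = Pr(tau <= tau_star).

    alpha2 is a class-K function and gamma_term stands for gamma(|u_M - u_m|).
    """
    q = 1.0 - p  # Pr(tau > tau_star)
    updates = math.floor((horizon - tau_star) / delta)
    total = sum(p * q**i * alpha2(tau_star + (i + 1) * delta) for i in range(updates))
    total += q**updates * alpha2(horizon)
    return gamma_term * total


def p_exp(tau_star, rate=1.0):
    """Pr(tau <= tau_star) for an exponentially distributed delay (an assumption)."""
    return 1.0 - math.exp(-rate * tau_star)


T = 10.0        # illustrative horizon
tau_star = 1.0  # illustrative synchronization delay
for delta in (0.5, 1.0, 3.0):
    print(delta, bound_G(delta, tau_star, T, p_exp(tau_star), lambda s: s, 1.0))
```

For this choice of functions the printed values grow with δ, as Lemma 1 states. Sweeping `tau_star` with the same function (and recomputing `p_exp(tau_star)` each time) is one way to look at the non-monotone dependence discussed in Lemma 2.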
IV. CONCLUSION
From Theorem 1, one concludes that the probability of safety Π increases when increasing the synchronization delay τ∗ and the period of update δ. However, the two previous lemmas yield opposite requirements for closed-loop performance. Indeed, in the collision-free case, the bound proposed in Proposition 1 increases with δ and, for sufficiently large τ∗, with τ∗. Therefore, a tradeoff has to be reached between probability of safety and closed-loop performance.
REFERENCES
[1] D. Bresch-Pietri and D. Del Vecchio. Estimation for decentralized safety control under communication delay and measurement uncertainty. Automatica, Provisionally accepted.
[2] H. Khalil. Nonlinear Systems. 3rd Edition, Prentice Hall, 2002.
[3] B. Sterzbach. GPS-based clock synchronization in a mobile, distributed real-time system. Real-Time Systems, 12(1):63–75, 1997.