Evaluation of closed-loop performance of an estimation strategy for decentralized safety controller under communication delay and measurement uncertainty


Delphine Bresch-Pietri and Domitilla Del Vecchio

Abstract

We present here the details of the performance evaluation of an estimation strategy for a decentralized safety controller for two agents, subject to communication delay and imperfect measurements.

The control objective is to ensure safety, meaning that the state of the two-agent system does not enter an undesired set in the state space. Assuming that we know a feedback map designed for the delay-free case, we propose a state estimation strategy which guarantees control agreement between the two agents in the case of bounded communication delay. We extend it to the case of infinitely-distributed communication delays by determining a lower bound for the probability of safety. In the present note, we discuss the performance of the resulting controller.

I. NOTATION

In the following, m and p are positive integers. We denote with a superscript i the variables relative to agent i for i∈ {1,2}, with a superscript L (resp. R) the variables relative to the local agent (resp. the remote one) and with a subscript the coordinate.

D. Bresch-Pietri is with CNRS at GIPSA-lab, Control Department, 11 rue des Mathématiques, 38000 Grenoble, FRANCE. Email address: [email protected]

D. Del Vecchio is with the Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge MA

$|\cdot|$ denotes the Euclidean norm whereas $\|\cdot\|_\infty$ is used for the infinity norm of a signal. The diameter of a set $S$ is written as $D(S) = \sup_{(s_1,s_2)\in S^2} |s_1 - s_2|$. The distance between a point $x$ and a non-empty set $S$ is written as $d(x,S) = \inf_{s\in S} |s - x|$ and the distance between two non-empty sets $S_1$ and $S_2$ as

$$d(S_1,S_2) = \max\left\{ \sup_{s_1\in S_1}\inf_{s_2\in S_2} |s_1 - s_2|,\ \sup_{s_2\in S_2}\inf_{s_1\in S_1} |s_1 - s_2| \right\}.$$

The boundary of a set $S$ is written as $\partial S$ and its closure as $\bar S$.
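The three set quantities defined above can be illustrated on finite point sets. The following is a minimal sketch, assuming sets are given as lists of coordinate tuples; the function names `diameter`, `point_to_set` and `set_to_set` are hypothetical and only serve this illustration.

```python
import math
from itertools import product

def diameter(S):
    """D(S): largest Euclidean distance between two points of a finite set S."""
    return max(math.dist(s1, s2) for s1, s2 in product(S, S))

def point_to_set(x, S):
    """d(x, S): smallest Euclidean distance from the point x to the set S."""
    return min(math.dist(x, s) for s in S)

def set_to_set(S1, S2):
    """d(S1, S2): larger of the two directed sup-inf distances between S1 and S2."""
    forward = max(point_to_set(s1, S2) for s1 in S1)
    backward = max(point_to_set(s2, S1) for s2 in S2)
    return max(forward, backward)

# Two small sets in the plane (illustrative values only).
S1 = [(0.0, 0.0), (1.0, 0.0)]
S2 = [(0.0, 3.0), (4.0, 0.0)]
```

For general (infinite) sets the sup and inf are not attained by enumeration, so this sketch only conveys the structure of the definitions.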

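$C^0_{pw}(S_1,S_2)$ represents the set of piecewise continuous functions defined on the set $S_1$ and taking values in $S_2$. For two vectors $x$ and $\tilde x$ in $\mathbb{R}^p$, we will write $x \le \tilde x$ if $x_i \le \tilde x_i$ for all $1 \le i \le p$. For $S_1 \subset \mathbb{R}^p$, $S_2 \subset \mathbb{R}^p$ and $(\xi,\tilde\xi) \in C^0_{pw}(S_1,S_2)^2$, we will write $\xi \le \tilde\xi$ if $\xi(s) \le \tilde\xi(s)$ for all $s \in S_1$. For two vectors $x$ and $\tilde x$ in $\mathbb{R}^p$ such that $x \le \tilde x$, we write $[x,\tilde x] = [x_1,\tilde x_1]\times[x_2,\tilde x_2]\times\dots\times[x_p,\tilde x_p]$ and $I(\mathbb{R}^p) = \{[x,\tilde x] \mid (x,\tilde x) \in (\mathbb{R}^p)^2\}$.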

ϕ(t,t₀,x₀,u)∈R^p is the flow associated with a given dynamics at time t ≥t₀ corresponding to the initial condition x₀∈R^p at time t₀≥0 driven by the input signal u∈ C⁰_pw([t₀,∞),R^m).

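For a set $S \subset \mathbb{R}^p$, we write $\phi(t,t_0,S,u) = \cup_{x_0\in S}\,\phi(t,t_0,x_0,u)$. When possible, we will simply let $\phi(t,S,u) = \phi(t,0,S,u)$. For $x: \mathbb{R}_+ \to \mathbb{R}^p$ and $0 \le t_1 \le t_2$, we write $x|_{[t_1,t_2]} : s \in [t_1,t_2] \mapsto x(s)$.

When necessary, we write $\phi(t,S,u|_{[\bar t,\bar t+t)})$ for the flow at time $t \ge 0$ driven by a portion of the input signal $u \in C^0_{pw}([0,\infty),\mathbb{R}^m)$, with $\bar t \ge 0$. A scalar continuous function $\alpha: \mathbb{R}_+ \to \mathbb{R}_+$ is said to be of class $K$ if $\alpha(0) = 0$ and $\alpha$ is strictly increasing. A scalar continuous function $\gamma: \mathbb{R}_+ \to \mathbb{R}_+$ is said to be of class $K_\infty$ if it is of class $K$ and if $\gamma(t) \to \infty$ as $t \to \infty$.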

In the sequel, a white noise refers to a stochastic signal with a constant power spectral density for any frequency included in its (potentially infinite) spectrum. We write E(X) for the expected value of a random variable X.

Finally, for $(x,y) \in \mathbb{R}\times(\mathbb{R}\setminus\{0\})$, we write $x \equiv 0 \bmod y$ if there exists $n \in \mathbb{N}$ such that $x = ny$, and $\lfloor x \rfloor = m$ with $m \in \mathbb{N}$ such that $m \le x < m+1$.

II. PROBLEM STATEMENT

A. Agent dynamics

We consider that each agent is governed by the same dynamics¹, namely, for $i \in \{1,2\}$,

$$\dot x^i(t) = f^i(x^i(t), u^i(t)), \tag{1}$$

$$y^i(t) = x^i(t) + \sigma^i(t), \tag{2}$$

with $(x^i,y^i) \in \mathbb{R}^n\times\mathbb{R}^n$, $u^i \in [u_m,u_M] \subset \mathbb{R}^m$ and $\sigma^i \in [\sigma^i_m,\sigma^i_M] \subset \mathbb{R}^n$. Further, in the sequel, we consider the measurement map $h^i : y^i \in \mathbb{R}^n \mapsto [y^i - \sigma^i_M,\ y^i - \sigma^i_m]$, which is bounded set-valued and such that, for any output $y^i(t)$, $x^i(t) \in h^i(y^i(t))$. In other words, for any measurement $y^i$, each agent has access to a bounded set-valued function that returns the set of all states consistent with the current output. Finally, it is assumed that the vector field $f^i$ satisfies the following property.

¹ Note that other output maps could be considered, such as multiplicative bounded uncertainties for example. Provided that a corresponding bounded set-valued measurement map $h^i$ exists, the proposed estimation strategy will hold.
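As an illustration of the measurement map, the following minimal sketch builds the box $[y - \sigma_M,\ y - \sigma_m]$ coordinate by coordinate and checks that the true state lies inside it. The function names and the numerical values are assumptions made for this example only.

```python
def measurement_set(y, sigma_m, sigma_M):
    """h(y) = [y - sigma_M, y - sigma_m], returned as (lower, upper) corner vectors."""
    lower = [yi - sM for yi, sM in zip(y, sigma_M)]
    upper = [yi - sm for yi, sm in zip(y, sigma_m)]
    return lower, upper

def contains(box, x):
    """True if the point x lies in the box, coordinate by coordinate."""
    lower, upper = box
    return all(lo <= xi <= up for lo, xi, up in zip(lower, x, upper))

# Illustrative values: true state, noise sample, and noise bounds.
x = [1.0, 2.0]
sigma = [0.25, -0.5]
sigma_m, sigma_M = [-0.5, -0.5], [0.5, 0.5]

y = [xi + si for xi, si in zip(x, sigma)]      # output equation (2)
box = measurement_set(y, sigma_m, sigma_M)     # set of states consistent with y
```

Whatever noise sample is drawn within the bounds, the true state belongs to the returned box, which is the property used throughout the estimation strategy.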

Assumption 1: For any initial condition x₀ ∈Rⁿ and any input u∈ C_pw(R+,[u_m,u_M]), the solution of (1) is global and unique.

Note that this assumption also applies to the extended dynamics

$$\dot x(t) = \big(f^1(x^1,u^1(t)),\ f^2(x^2,u^2(t))\big), \tag{3}$$

$$y = x + \sigma, \tag{4}$$

in which $x = (x^1,x^2)$ and $\sigma = (\sigma^1,\sigma^2)$. In the sequel, we write $u = (u^1,u^2)$, $h(y) = (h^1(y^1),h^2(y^2))$ and $\phi$ for the flow associated with (3), which is well-defined according to Assumption 1.

B. Delay-free control design

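Given an open set $B \subset \mathbb{R}^{2n}$, define the capture set

$$C = \left\{ S \subset \mathbb{R}^{2n} \ \middle|\ \forall u \in C_{pw}(\mathbb{R}_+,[u_m,u_M]^2)\ \ \exists t \ge 0 \quad \phi(t,S,u)\cap B \neq \emptyset \right\}.$$

Besides, define the operator

$$\Phi : \mathbb{R}_+ \times 2^{\mathbb{R}^{2n}} \times 2^{C_{pw}(\mathbb{R}_+,[u_m,u_M]^2)} \to 2^{\mathbb{R}^{2n}}, \qquad (t,S,U) \mapsto \cup_{u\in U}\,\phi(t,S,u). \tag{5}$$

Assumption 2: There exists a decreasing and Cartesian product-valued feedback law $\pi : 2^{\mathbb{R}^{2n}} \to 2^{[u_m,u_M]} \times 2^{[u_m,u_M]}$ such that, for all $S \subset \mathbb{R}^{2n}$ and $\tilde\pi \in 2^{C_{pw}(\mathbb{R}_+,[u_m,u_M]^2)}$ such that $S \notin C$ and $\tilde\pi(t) \subseteq \pi(\Phi(t,S,\tilde\pi|_{[0,t)}))$ for $t \ge 0$, then $\Phi(t,S,\tilde\pi|_{[0,t)}) \notin C$, $t \ge 0$.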

The map π is decreasing, i.e., for two sets S₁ and S₂ in R²ⁿ such that S₁ ⊆ S₂, one has π(S₁) ⊇ π(S₂). Qualitatively, this property indicates that any input keeping a set outside of the capture set should also keep any subset of it outside of the capture set.


C. Agents communication and delays

From now on, we focus on one of the agents, referred to as the local agent. We introduce notations to outline the information that the local agent receives from the other (remote) agent and the computations that it performs based on both this information and locally available data. To this end, we will use a superscript L (resp. R) for quantities computed by the local (resp. remote) agent with (L,R) ∈ {(1,2),(2,1)}.

We assume that both agents share the same universal time t, obtained from GPS measurements for example², and use it to stamp exchanged data. The information sent by the remote agent at time t is then

$$\tilde z^R(t) = \big(t,\ h^R(y^R(t))\big). \tag{6}$$

Further, we consider that communication delays occur between the two agents, with independent but symmetric communication channels, i.e., the delays share the same model. Namely, defining Z^L(t) the set of information received by the local agent at time t, we have

$$\tilde z^R(t) \in Z^L(t + \tau^{RL}(t)), \tag{7}$$

in which $\tau^{RL} \ge 0$ is a continuous-time white noise process, which can be infinitely-distributed, and $\tau^{RL}(t)$ and $\tau^{LR}(t)$ have the same (time-invariant) probability density function but are independent.

$\tau^{RL}(t)$ represents the time the information takes to travel from the remote agent R to the local agent L ($\tau^{LR}$ represents the converse one). Note that the set $Z^L(t)$ can be empty (if no information is received at time t) or can contain several elements (if more than one piece of information is received at time t). The definition (6) of the exchanged information implies that, for each piece of information, the corresponding delay value is known, as the two agents can determine it by comparing the exchanged time stamp and the current time stamp.
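The recovery of the delay from the time stamp can be sketched as follows. This is a minimal illustration, assuming that a message carries its sending time and an interval box for $h^R(y^R(t))$; the class `Message` and the function `delays` are hypothetical names, not part of the original design.

```python
from dataclasses import dataclass
from typing import List, Tuple

Interval = Tuple[float, float]

@dataclass(frozen=True)
class Message:
    stamp: float                       # sending time t on the shared clock
    state_set: Tuple[Interval, ...]    # h^R(y^R(t)), one interval per coordinate

def delays(inbox: List[Message], now: float) -> List[float]:
    """Delay of each message received at time `now`: current time minus time stamp."""
    return [now - msg.stamp for msg in inbox]

# Two messages arriving at the same instant (illustrative values only).
inbox = [Message(1.0, ((0.0, 1.0),)), Message(1.5, ((0.2, 1.2),))]
observed = delays(inbox, now=2.0)
```

Here the inbox plays the role of $Z^L(t)$ at the receiving time; an empty list corresponds to the case where nothing is received.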

D. Problem under consideration and estimation strategy

The problem at stake here is: given a feedback map that guarantees safety in a context without communication delay and satisfies Assumption 2, design a state estimation strategy that allows that same feedback map to be used to guarantee safety in the presence of communication delay and measurement uncertainties. We formulate it mathematically in the following.

² We consider that the two agents are close enough so we can neglect the difference between the two received GPS signals [3].


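Problem 1: Given systems (3)–(4) subject to communication delay (7), a bad set $B \subset \mathbb{R}^{2n}$, and a feedback map satisfying Assumption 2, determine a state estimation procedure $t \in \mathbb{R}_+ \mapsto (\hat x^1(t),\hat x^2(t)) \subset \mathbb{R}^{2n}\times\mathbb{R}^{2n}$ and an initialization set $X_0 \subset \mathbb{R}^{2n}$ such that, with $U^1 : t \in \mathbb{R} \mapsto \pi(\hat x^1(t))$ and $U^2 : t \in \mathbb{R} \mapsto \pi(\hat x^2(t))$, if $x(0) \in X_0$, $u^1(t) \in U^1_1(t)$ and $u^2(t) \in U^2_2(t)$ for $t \ge 0$, then the solution of (3) satisfies $x(t) = \phi(t,x(0),(u^1,u^2)) \notin B$ for $t \ge 0$.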

In our companion paper [1], we proved the following result for this problem.

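Theorem 1: Consider the plant (1) satisfying Assumption 1, the feedback law π defined in Assumption 2 and the operator Φ defined in (5). Let $\tau^* \ge 0$ and $\delta > 0$, and let $\Phi(t,\emptyset,U) = \emptyset$ for $t \ge 0$ and $U \in 2^{C^0_{pw}(\mathbb{R}_+,[u_m,u_M]^2)}$. For $L \in \{1,2\}$, consider $\hat h^L_{d,syn}(t,\tau^*) = \emptyset$ for $t \in [0,\tau^*)$ and, for $t \ge \tau^*$,

$$\hat h^L_{d,syn}(t,\tau^*) = \begin{cases}
\hat h^L_d(z) & \text{if } t-\tau^* \equiv 0 \bmod \delta \text{ and if there exist } s \le t \text{ and } z \in Z^L(s) \text{ s.t. } t - z_1 = \tau^*, \\[4pt]
\Phi\big(\delta,\ \hat h^L_{d,syn}(t-\delta,\tau^*),\ U^L|_{[t-\delta-\tau^*,\,t-\tau^*)}\big) & \text{if } t-\tau^* \equiv 0 \bmod \delta \text{ and if } t - z_1 \neq \tau^* \text{ for all } z \in Z^L(s) \text{ and } s \le t, \\[4pt]
\Phi\Big(t-\tau^*-\big\lfloor\tfrac{t-\tau^*}{\delta}\big\rfloor\delta,\ \hat h^L_{d,syn}\big(\big\lfloor\tfrac{t-\tau^*}{\delta}\big\rfloor\delta+\tau^*,\tau^*\big),\ U^L|_{[\lfloor (t-\tau^*)/\delta\rfloor\delta,\,t-\tau^*)}\Big) & \text{otherwise,}
\end{cases} \tag{8}$$

and let

$$\hat x^L_d(t) = \Big\{ x \in \mathbb{R}^{2n} \ \Big|\ \exists (x_0,w) \in \hat h^L_{d,syn}(\tau^*,\tau^*)\times U^L \ \text{ s.t. } x = \phi(t-\tau^*,x_0,w) \text{ and, for } s \in [\tau^*,t],\ \phi(s-\tau^*,x_0,w) \in \hat h^L_{d,syn}(s,\tau^*) \Big\}, \quad t \ge \tau^*, \tag{9}$$

$$\hat x^L(t) = \Phi\big(\tau^*,\ \hat x^L_d(t),\ U^L|_{[t-\tau^*,\,t)}\big), \quad t \ge \tau^*, \tag{10}$$

$$U^L(t) = \begin{cases} \pi(\hat x^L(t)) & \text{if } t \ge \tau^* \text{ and } \hat x^L(t) \neq \emptyset, \\ [u_m,u_M]^2 & \text{otherwise.} \end{cases} \tag{11}$$

Provided that $\Phi(\tau^*,h(x(0)+\sigma),[u_m,u_M]^2) \notin C$, and that $u^L(t) \in U^L_L(t)$ for $L \in \{1,2\}$ and $t \ge 0$, then, for $T \ge \tau^*$,

$$\Pr\big(x(t) \notin B,\ t \in [0,T]\big) \ \ge\ p^2\big(p^2 + (1-p^2)(1-p)^2\big)^{\lfloor (T-\tau^*)/\delta \rfloor} \ \triangleq\ \Pi(\delta,\tau^*), \tag{12}$$

with $p = \Pr(\tau^{RL}(t) \le \tau^*) = \Pr(\tau^{LR}(t) \le \tau^*)$.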

One can note that the probability bound Π(δ,τ^∗) is increasing with respect to τ^∗ and with respect to δ. In the sequel, we are interested in the corresponding closed-loop performance and its dependence on those two tuning parameters.
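To get a feel for the bound (12), the following minimal sketch evaluates Π(δ,τ^∗) numerically. It assumes an exponentially distributed delay with a hypothetical rate parameter, which is only one possible delay model; the function names and the numerical values are likewise chosen for illustration.

```python
import math

def prob_delay_below(tau_star, rate):
    """p = Pr(tau <= tau_star) under an assumed exponential delay distribution."""
    return 1.0 - math.exp(-rate * tau_star)

def safety_bound(delta, tau_star, T, rate):
    """Lower bound Pi(delta, tau_star) on the probability of safety, as in (12)."""
    p = prob_delay_below(tau_star, rate)
    n_updates = math.floor((T - tau_star) / delta)
    base = p**2 + (1.0 - p**2) * (1.0 - p)**2
    return p**2 * base**n_updates

T, rate = 10.0, 1.0   # horizon and delay rate (illustrative values)

# Bound for a fixed tau_star = 2 and a growing update period delta.
by_delta = [safety_bound(d, 2.0, T, rate) for d in (0.5, 1.0, 2.0, 4.0)]

# Bound for a fixed delta = 1 and a growing synchronization delay tau_star.
by_tau = [safety_bound(1.0, ts, T, rate) for ts in (1.0, 2.0, 4.0, 8.0)]
```

For these sample values, both lists are non-decreasing, in line with the remark above; other delay models or parameter ranges would need to be checked separately.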


III. EVALUATION OF CLOSED-LOOP PERFORMANCE

In this section, we aim at providing an evaluation of the performance which can be obtained using the proposed technique, i.e., we determine how far from the bad set the trajectory generated by the control law can be with communication delay. With this aim in view, we define $U = \{u \in C_{pw}(\mathbb{R}_+,[u_m,u_M]^2) \mid u_L(t) \in U^L_L(t),\ t \ge \tau^*,\ L = 1,2\}$ and consider the following quantity

$$\inf_{t\in[0,T]}\ \sup_{u\in U}\ d(\phi(t,x_0,u),B), \tag{13}$$

for $T \ge \tau^*$ and a given $x_0 \in \mathbb{R}^{2n}$ such that $\Phi(\tau^*,h(x_0+\sigma),[u_m,u_M]^2) \notin C$. This distance quantifies, for given initial conditions satisfying the assumptions of Theorem 1, how far from the bad set the trajectory generated by the proposed control law can be in the presence of communication delay. However, in the case of an infinitely-distributed delay, the probability of entering the bad set is not equal to zero and therefore the trajectory can actually intersect the bad set. This is why we will restrict the trajectories under consideration to safe ones in the sequel.

A. Upper-bound for closed-loop performance

First, we characterize further the dynamics under consideration.

Assumption 3: There exist a positive increasing scalar continuous function $\kappa$, class $K$ functions $\alpha_1$ and $\alpha_2$ and a class $K_\infty$ function $\gamma$ such that, for arbitrary pairs $(x_1,x_2) \in \mathbb{R}^n\times\mathbb{R}^n$ and $(u_1,u_2) \in C^0_{pw}(\mathbb{R},\mathbb{R}^m)^2$, the corresponding solutions of (1) satisfy, for $t \ge 0$,

$$|\phi(t,x_1,u_1) - \phi(t,x_2,u_2)| \le \kappa(t)\,\alpha_1(|x_1 - x_2|) + \alpha_2(t)\,\gamma(\|u_2 - u_1\|_\infty). \tag{14}$$

This assumption is motivated by Theorem 3.4 in [2]. Indeed, provided that the vector field is continuously differentiable with respect to the state x and to the input u, Assumption 3 holds.
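As a concrete instance of (14), consider the hypothetical scalar system $\dot x = -x + u$ with constant inputs, which is not taken from the paper. Its flow is known in closed form, and the candidate functions $\kappa(t) = 1$ (constant, hence increasing only in the non-strict sense), $\alpha_1(r) = r$, $\alpha_2(t) = 1 - e^{-t}$ and $\gamma(r) = r$ are assumptions made for this toy example. The sketch below checks the inequality on a small grid.

```python
import math
from itertools import product

def flow(t, x0, u):
    """Closed-form flow of the toy system dx/dt = -x + u for a constant input u."""
    return math.exp(-t) * x0 + (1.0 - math.exp(-t)) * u

def kappa(t):
    return 1.0                      # bounds the decay factor exp(-t)

def alpha1(r):
    return r                        # class K

def alpha2(t):
    return 1.0 - math.exp(-t)       # class K: zero at 0, strictly increasing

def gamma(r):
    return r                        # class K-infinity

def bound_holds(t, x1, x2, u1, u2, tol=1e-12):
    """Check inequality (14) for one pair of initial states and constant inputs."""
    lhs = abs(flow(t, x1, u1) - flow(t, x2, u2))
    rhs = kappa(t) * alpha1(abs(x1 - x2)) + alpha2(t) * gamma(abs(u2 - u1))
    return lhs <= rhs + tol

times = [0.0, 0.5, 1.0, 5.0]
values = [-1.0, 0.0, 2.0]
all_hold = all(bound_holds(t, x1, x2, u1, u2)
               for t in times
               for x1, x2, u1, u2 in product(values, repeat=4))
```

For this linear example the bound follows directly from the triangle inequality; for a general nonlinear vector field the functions in (14) would have to be derived from its regularity properties.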

To evaluate the closed-loop performance for the proposed control technique and guarantee that the considered trajectories do not enter the bad set, we consider trajectories such that the following event, denoted $E_0$ in the sequel, holds:

$$\tau^{LR}(0) \le \tau^*,\ \tau^{RL}(0) \le \tau^* \ \text{ and, for } n \in \Big\{1,\dots,\Big\lfloor\tfrac{T-\tau^*}{\delta}\Big\rfloor\Big\},\ \text{ either } \tau^{LR}(n\delta) \le \tau^* \text{ and } \tau^{RL}(n\delta) \le \tau^* \ \text{ or } \ \tau^{LR}(n\delta) > \tau^* \text{ and } \tau^{RL}(n\delta) > \tau^*. \tag{15}$$

As this event is not deterministically given, in the sequel, we consider a conditional expected value of the quantity (13).
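The event (15) can be explored numerically. The sketch below is an illustration under stated assumptions: the two delays sampled at the instants nδ are treated as independent draws, each on time with probability p = Pr(τ ≤ τ^∗), and the names `event_probability` and `simulate` are hypothetical. It compares the resulting product formula with a simple Monte Carlo estimate.

```python
import random

def event_probability(p, n_updates):
    """Probability of event (15) when all delay samples are independent."""
    agree = p**2 + (1.0 - p)**2     # both on time, or both late, at one instant
    return p**2 * agree**n_updates  # factor p**2 accounts for the instant n = 0

def simulate(p, n_updates, trials, seed=0):
    """Monte Carlo estimate of the same probability with a seeded generator."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        lr0 = rng.random() < p      # tau_LR(0) <= tau_star ?
        rl0 = rng.random() < p      # tau_RL(0) <= tau_star ?
        ok = lr0 and rl0
        for _ in range(n_updates):
            lr = rng.random() < p
            rl = rng.random() < p
            ok = ok and (lr == rl)  # both on time or both late
        hits += ok
    return hits / trials

p, n_updates = 0.8, 3               # illustrative values
exact = event_probability(p, n_updates)
estimate = simulate(p, n_updates, trials=20000)
```

Since $p^2 + (1-p^2)(1-p)^2 \le p^2 + (1-p)^2$, this probability is never smaller than the bound Π(δ,τ^∗) of (12) evaluated with the same p and the same number of update instants.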


Proposition 1: Consider the plant (1) satisfying Assumptions 1 and 3 and the assumptions of Theorem 1. Define $U = \{u \in C_{pw}(\mathbb{R}_+,[u_m,u_M]^2) \mid u_L(t) \in U^L_L(t),\ t \ge \tau^*,\ L = 1,2\}$. Then, for $T \ge \tau^*$ and for trajectories originating from $x_0 = x(0)$ such that the event (15) holds for all time, there exist a positive increasing scalar continuous function $\kappa$, class $K$ functions $\alpha_1$ and $\alpha_2$ and a class $K_\infty$ function $\gamma$ such that

$$\inf_{t\in[0,T]}\ \sup_{u\in U}\ E\big(d(\phi(t,x_0,u),B)\,\big|\,E_0\big) \ \le\ \kappa(T)\,\alpha_1\big(\max_y D(h(y))\big) + \inf_{t\in[0,T]}\ \sup_{u\in U}\ E\big(d(\hat x^L(t),B)\,\big|\,E_0\big) + G(\delta,\tau^*), \tag{16}$$

with

$$G(\delta,\tau^*) = \Pr(\tau \le \tau^*) \sum_{i=0}^{\lfloor (T-\tau^*)/\delta \rfloor - 1} \Pr(\tau > \tau^*)^i\,\alpha_2(\tau^* + (i+1)\delta)\,\gamma(|u_M - u_m|) + \Pr(\tau > \tau^*)^{\lfloor (T-\tau^*)/\delta \rfloor}\,\alpha_2(T)\,\gamma(|u_M - u_m|). \tag{17}$$

Proof: First, following the proofs provided in [1], if (15) holds then $\hat x^1(t) = \hat x^2(t) = \hat x(t)$ and $x(t) \in \hat x(t)$, for $t \in [\tau^*,T]$. Further, $\hat h^1_{d,syn}(t,\tau^*) = \hat h^2_{d,syn}(t,\tau^*) = \hat h_{d,syn}(t,\tau^*)$ for $t \in [\tau^*,T]$.

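Consequently, one obtains

$$d(\phi(t,x_0,u),B) \le D(\hat x(t)) + d(\hat x(t),B), \tag{18}$$

and, from the linearity of the expected value, it follows that

$$\begin{aligned}
E\big(d(\phi(t,x_0,u),B)\,\big|\,E_0\big) &\le E\big(D(\hat x(t))\,\big|\,E_0\big) + E\big(d(\hat x(t),B)\,\big|\,E_0\big) \\
&\le E\big(D(\Phi(\tau^*,\hat h_{d,syn}(t,\tau^*),U|_{[t-\tau^*,t]}))\,\big|\,E_0\big) + E\big(d(\hat x(t),B)\,\big|\,E_0\big),
\end{aligned}$$

in which the last inequality follows from the fact that $\hat h_{d,syn}(t,\tau^*) \supseteq \hat x_d(t)$ for $t \ge \tau^*$ using (9)–(10). From (10), we have that

$$E\big(D(\Phi(\tau^*,\hat h_{d,syn}(t,\tau^*),U|_{[t-\tau^*,t]}))\,\big|\,E_0\big) = E\Big(D\Big(\Phi\Big(\tau^* + t - \Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta,\ \hat x_{d,syn}\Big(\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta,\tau^*\Big),\ U|_{[\lfloor (t-\tau^*)/\delta\rfloor\delta-\tau^*,\,t)}\Big)\Big)\,\Big|\,E_0\Big),$$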

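and, by definition of the expected value and from Cases 1-3 in (8), that

$$\begin{aligned}
E\big(D(\Phi(\tau^*,\hat h_{d,syn}(t,\tau^*),U|_{[t-\tau^*,t]}))\,\big|\,E_0\big)
&= \Pr\Big(\tau\Big(\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta - \tau^*\Big) \le \tau^*\Big) \\
&\quad \times D\Big(\Phi\Big(\tau^* + t - \Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta,\ h\Big(y\Big(\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta - \tau^*\Big)\Big),\ U|_{[\lfloor (t-\tau^*)/\delta\rfloor\delta-\tau^*,\,t)}\Big)\Big) \\
&\quad + \Pr\Big(\tau\Big(\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta - \tau^*\Big) > \tau^*\Big) \\
&\quad \times E\Big(D\Big(\Phi\Big(\tau^* - \Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta + t - \delta,\ \hat x_{d,cor}\Big(\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor\delta - \delta\Big),\ U|_{[\lfloor (t-\tau^*)/\delta\rfloor\delta-\tau^*,\,t)}\Big)\Big)\,\Big|\,E_0\Big).
\end{aligned}$$

Using again Cases 1-3 in (8) backward in time iteratively and taking into account the fact that $\Pr(\tau(0) \le \tau^* \mid E_0) = 1$, one obtains

$$\begin{aligned}
E\big(D(\Phi(\tau^*,\hat h_{d,syn}(t,\tau^*),U|_{[t-\tau^*,t]}))\,\big|\,E_0\big)
&= \Pr(\tau \le \tau^*) \sum_{i=0}^{\lfloor (t-\tau^*)/\delta\rfloor - 1} \Pr(\tau > \tau^*)^i \\
&\quad \times D\Big(\Phi\Big(\tau^* + t - \Big(\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor - i\Big)\delta,\ h\Big(y\Big(\Big(\Big\lfloor\tfrac{t-\tau^*}{\delta}\Big\rfloor - i\Big)\delta - \tau^*\Big)\Big),\ U|_{[(\lfloor (t-\tau^*)/\delta\rfloor - i)\delta-\tau^*,\,t)}\Big)\Big) \\
&\quad + \Pr(\tau > \tau^*)^{\lfloor (t-\tau^*)/\delta\rfloor}\, D\big(\Phi(t,h(y(0)),U|_{[0,t)})\big).
\end{aligned} \tag{19}$$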

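Finally, applying (14) and (5), one obtains, for $s, s_0 \ge 0$,

$$\begin{aligned}
D\big(\Phi(s,h(y(s_0)),U|_{[s_0,\,s+s_0)})\big)
&\le \left| \max_{\substack{u:\mathbb{R}_+\to[u_m,u_M]^2 \\ x\in h(y(s_0))}} \phi(s,x,u) \ - \min_{\substack{u:\mathbb{R}_+\to[u_m,u_M]^2 \\ x\in h(y(s_0))}} \phi(s,x,u) \right| \\
&\le \kappa(s)\,\alpha_1\big(\max_y D(h(y))\big) + \alpha_2(s)\,\gamma(|u_M - u_m|),
\end{aligned}$$

in which $\kappa$, $\alpha_1$, $\alpha_2$ and $\gamma$ are introduced in Assumption 3. Then, one can bound the terms in (19) with this last inequality and take the inf and sup of both sides for $t \ge 0$. The result follows noticing that both $t \le T$ and $t - \lfloor (t-\tau^*)/\delta\rfloor\delta \le \delta$ and matching the terms multiplying $\kappa(T)\,\alpha_1(\max_y D(h(y)))$.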

Note that, here, max_yD(h(y)) =|σ_M−σ_m|. We keep such a general formulation max_yD(h(y)) to encapsulate other bounded set-valued measurement maps, for which the previous analysis and in particular Theorem 1 also hold.

According to Proposition 1, the distance of the trajectory from the bad set, i.e., the conservatism of the proposed control strategy, can be quantified by three terms corresponding to three different phenomena: (i) the magnitude of measurement uncertainties; (ii) the performance of the nominal feedback law introduced in Assumption 2; and (iii) the synchronization technique introduced in (8)–(11). Without measurement uncertainties and communication delay, the left-hand side of (16) is equal to the right-hand side, since max_y D(h(y)) = 0 and one can choose δ = τ^∗ = 0 and therefore G(δ,τ^∗) = 0. This bodes well for the tightness of the bound. While the influence of the first two phenomena can be easily characterized, it is necessary to analyze further the behavior of the third one with respect to the parameters involved in the control strategy, that is, the synchronization delay τ^∗ and the period of update δ. This is the subject of the following section.
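The synchronization term G(δ,τ^∗) of (17) is straightforward to evaluate once the delay distribution and the comparison functions are fixed. The following minimal sketch assumes an exponentially distributed delay, α₂(t) = t and γ(|u_M − u_m|) = 1; all of these are illustrative choices rather than quantities taken from the paper, and the function name `G` simply mirrors the notation.

```python
import math

def G(delta, tau_star, T, p, alpha2, gamma_term):
    """Evaluate (17), with p = Pr(tau <= tau_star) and gamma_term = gamma(|u_M - u_m|)."""
    q = 1.0 - p                                     # Pr(tau > tau_star)
    n = math.floor((T - tau_star) / delta)          # number of update instants
    series = sum(q**i * alpha2(tau_star + (i + 1) * delta) for i in range(n))
    return gamma_term * (p * series + q**n * alpha2(T))

def alpha2(t):
    return t                                        # assumed class K function

T, rate = 10.0, 1.0                                 # horizon and delay rate (assumed)

def p_of(tau_star):
    """Pr(tau <= tau_star) for the assumed exponential delay."""
    return 1.0 - math.exp(-rate * tau_star)

# G for tau_star = 2 and a growing update period delta.
by_delta = [G(d, 2.0, T, p_of(2.0), alpha2, 1.0) for d in (0.5, 1.0, 2.0, 4.0)]

# G at three values of tau_star for delta = 1: small, intermediate, close to T.
by_tau = [G(1.0, ts, T, p_of(ts), alpha2, 1.0) for ts in (0.0, 2.0, 9.5)]
```

With these choices, G grows with δ, while in τ^∗ it starts at α₂(T), drops for intermediate values, and returns to α₂(T) once fewer than one update period fits in the horizon.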

B. Variation of the upper-bound with respect to the control parameters τ^∗ and δ

Lemma 1: The function G defined in (17) is increasing with respect to δ.

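Proof: Consider two positive scalars $\delta_1$ and $\delta_2$ such that $\delta_1 \le \delta_2$. Consequently, $\lfloor (T-\tau^*)/\delta_1\rfloor \ge \lfloor (T-\tau^*)/\delta_2\rfloor$. First, if $\lfloor (T-\tau^*)/\delta_1\rfloor = 0$, then $\lfloor (T-\tau^*)/\delta_2\rfloor = 0$ and $G(\delta_1,\tau^*) = G(\delta_2,\tau^*) = \alpha_2(T)\,\gamma(|u_M - u_m|)$ according to (17). Otherwise, we have that

$$G(\delta_1,\tau^*) = \Pr(\tau \le \tau^*) \sum_{i=0}^{\lfloor (T-\tau^*)/\delta_2\rfloor - 1} \Pr(\tau > \tau^*)^i\,\alpha_2(\tau^* + (i+1)\delta_1)\,\gamma(|u_M - u_m|) + \Pr(\tau > \tau^*)^{\lfloor (T-\tau^*)/\delta_2\rfloor}\,\tilde G(\delta_1,\delta_2,\tau^*), \tag{20}$$

in which

$$\tilde G(\delta_1,\delta_2,\tau^*) = \gamma(|u_M - u_m|) \left[ \Pr(\tau > \tau^*)^{\lfloor \frac{T-\tau^*}{\delta_1}\rfloor - \lfloor \frac{T-\tau^*}{\delta_2}\rfloor}\,\alpha_2(T) + \Pr(\tau \le \tau^*) \sum_{i=0}^{\lfloor \frac{T-\tau^*}{\delta_1}\rfloor - \lfloor \frac{T-\tau^*}{\delta_2}\rfloor - 1} \Pr(\tau > \tau^*)^i\,\alpha_2\Big(\tau^* + \Big(i + 1 + \Big\lfloor \tfrac{T-\tau^*}{\delta_2}\Big\rfloor\Big)\delta_1\Big) \right].$$

One can observe that the second factor in $\tilde G$ is a weighted average between $\alpha_2(\tau^* + (1 + \lfloor (T-\tau^*)/\delta_2\rfloor)\delta_1)$, $\alpha_2(\tau^* + (2 + \lfloor (T-\tau^*)/\delta_2\rfloor)\delta_1)$, ..., $\alpha_2(T)$. Besides, as $\alpha_2$ is a class $K$ function, then $\alpha_2(\tau^* + (i + 1 + \lfloor (T-\tau^*)/\delta_2\rfloor)\delta_1) \le \alpha_2(T)$ for $0 \le i \le \lfloor (T-\tau^*)/\delta_1\rfloor - \lfloor (T-\tau^*)/\delta_2\rfloor - 1$.

Consequently, $\tilde G(\delta_1,\delta_2,\tau^*) \le \alpha_2(T)\,\gamma(|u_M - u_m|)$. Further, as $\alpha_2$ is a class $K$ function, it follows that $\alpha_2(s_1 + s_2\delta_1) \le \alpha_2(s_1 + s_2\delta_2)$ for all $s_1, s_2 \ge 0$ and, therefore, from (20),

$$\begin{aligned}
G(\delta_1,\tau^*) &\le \Pr(\tau \le \tau^*) \sum_{i=0}^{\lfloor (T-\tau^*)/\delta_2\rfloor - 1} \Pr(\tau > \tau^*)^i\,\alpha_2(\tau^* + (i+1)\delta_2)\,\gamma(|u_M - u_m|) + \Pr(\tau > \tau^*)^{\lfloor (T-\tau^*)/\delta_2\rfloor}\,\alpha_2(T)\,\gamma(|u_M - u_m|) \\
&\le G(\delta_2,\tau^*).
\end{aligned}$$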

This concludes the proof.

The variations of G with respect to τ^∗, instead, are not monotone. Nevertheless, we can establish the following result regarding its asymptotic behavior (for small or large values of τ^∗).

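Lemma 2: Consider $\delta \ge 0$ and the function $G$ introduced in (17). There exist $\tau_0$ and $\tau_1$ with $0 < \tau_0 \le \tau_1$ such that $G$ is decreasing for $\tau^* \in [0,\tau_0)$ and increasing for $\tau^* \in (\tau_1,\infty)$.

Proof: Considering (17), one can observe that $G(\cdot,\tau^*) \underset{0}{\sim} \Pr(\tau > \tau^*)\,\alpha_2(T)\,\gamma(|u_M - u_m|)$, which is a decreasing function of $\tau^*$. Therefore, there exists $\tau_0 > 0$ such that $G(\cdot,\tau^*)$ is decreasing for $\tau^* \in [0,\tau_0)$.

Similarly, one obtains $G(\cdot,\tau^*) \underset{\infty}{\sim} \Pr(\tau \le \tau^*)\,\alpha_2(\tau^*)\,\gamma(|u_M - u_m|)$, which is an increasing function of $\tau^*$. This concludes the proof.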

IV. CONCLUSION

From Theorem 1, one concludes that the probability of safety Π increases when increasing the synchronization delay τ^∗ and the period of update δ. However, from the two previous lemmas, one obtains opposite requirements for closed-loop performance. Indeed, in the case of no collision, the bound proposed in Proposition 1 increases when δ increases and, for sufficiently large values of τ^∗, when τ^∗ increases. Therefore, a tradeoff has to be reached between probability of safety and closed-loop performance.

REFERENCES

[1] D. Bresch-Pietri and D. Del Vecchio. Estimation for decentralized safety control under communication delay and measurement uncertainty. Automatica, Provisionally accepted.

[2] H. Khalil. Nonlinear Systems. 3rd Edition, Prentice Hall, 2002.

[3] B. Sterzbach. GPS-based clock synchronization in a mobile, distributed real-time system. Real-Time Systems, 12(1):63–75, 1997.