Probabilistic reachability and control synthesis for stochastic switched systems using the tamed Euler method

(1)

HAL Id: hal-02987885

https://hal.archives-ouvertes.fr/hal-02987885

Submitted on 4 Nov 2020

HAL

is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from

L’archive ouverte pluridisciplinaire

HAL, est

destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de

Probabilistic reachability and control synthesis for stochastic switched systems using the tamed Euler

method

Adrien Le Coënt, Laurent Fribourg, Jonathan Vacher, Rafael Wisniewski

To cite this version:

Adrien Le Coënt, Laurent Fribourg, Jonathan Vacher, Rafael Wisniewski. Probabilistic reachability

and control synthesis for stochastic switched systems using the tamed Euler method. Nonlinear

Analysis: Hybrid Systems, Elsevier, 2020, 36 (100860), �10.1016/j.nahs.2020.100860�. �hal-02987885�

(2)

Probabilistic reachability and control synthesis for stochastic switched systems using the tamed Euler

method

Adrien Le Co¨ent^a, Laurent Fribourg^b, Jonathan Vacher^c, Rafael Wisniewski^d

aDepartment of Computer Science, Aalborg University, Selma Lagerløfs Vej 300, Aalborg, Denmark.

bLSV, CNRS, INRIA, ENS Paris-Saclay, 61 Av. du Pdt. Wilson, 94235, Cachan, France.

cAlbert Einstein College of Medicine,

Block Research Pavilion, Department of Systems and Computational Biology, 1301 Morris Park Avenue, 10461 Bronx, NY, USA.

dDepartment of Electronic Systems, Automation and Control, Aalborg University, Fredrik Bajers Vej 7, Aalborg, Denmark.

Abstract

In this paper, we explain how, under the one-sided Lipschitz (OSL) hypothesis, one can find a mean square error bound for a variant of the Euler-Maruyama approximation method for stochastic switched systems. Subsequently, we explain how this bound can be used to control a stochastic switched system in order to make it reach a target zone with guaranteed minimum probability. The method is illustrated on several examples from the literature.

Keywords: Stochastic systems, numerical simulation, control system synthesis, switched control systems, nonlinear control systems.

1. Introduction

Symbolic methods for the verification and control synthesis of hybrid systems (and, particularly, “switched systems”) have received significant attention in the past few years. However, control systems involving stochastic differential equations remain difficult to handle with symbolic methods, and few methods have been developed for these systems. One distinguishes two main classes of symbolic methods for hybrid systems: indirect methods and direct methods [2].

Indirect methods proceed by constructing a finite abstraction of the original system by discretization of the dense state spaceR^d(wheredis the dimension of

This work is supported by the LASSO project financed by an ERC adv. grant; and the DiCyPS project funded by Innovation Fund Denmark.

License CC BY-NC-SA 4.0 : https://creativecommons.org/licenses/by-nc-sa/4.0/

(3)

the state space). Among the indirect methods, one of the most successful proceeds byapproximate bisimulation [10]. This method originally designed for deterministic switched systems has been recently extended forstochastic switched systems[30, 31, 32]. This approach relies on the hypothesis ofincremental sta- bilityof the stochastic switched system (or existence of a common/multiple Lya- punov function). Another method is presented in [7]. The associated tool [25]

allows to generate formal abstractions for discrete-time Markov processes defined over uncountable (continuous) state spaces.

Indirect methods have also been developed for the verification of safety.

Safety verification has been studied using the concept of barrier certificates [23].

The original inquiry has been later modified to cope with stochastic systems [29], and to the design of a controller that makes the closed-loop system safe [28, 24]. Also related is the subject of stability and safety verification is the reach avoidance problem. It has been studied using dynamic programming. This approach has been developed both for deterministic hybrid system in [18], and for discrete-time stochastic hybrid systems in [26].

A direct method proceeds by working directly at the level of the dense state of the systemR^d; it computes “trajectory tubes”, which are over-approximations of the set of all the controlled trajectories starting at a given subregion ofR^d. In previous work, we have followed such a direct approach first in [9], then in [15], using the Euler approximation scheme for calculating over-approximations of tubes of trajectories. We show here how to extend this direct method tostochas- ticswitched systems. The method extends the deterministic approach by replacing the classical Euler approximation scheme with a variant of the stochastic Euler-Maruyama scheme [13]. The correctness of these Euler-based methods does not rely on the hypothesis of incremental stability as in [30, 32], but on the hypothesis of ‘one-sided Lipschitz (OSL)’ condition with constantλPR^d (also called ‘monotonicity’/‘dissipativity’, see [27]). It can be seen that if a stochastic switched system satisfies an OSL condition with λă 0, then the function Vpx, x¹q “ }x´x¹}² is a common incremental Lyapunov functionin the sense of [31], from which it follows that the switched system is incrementally stable, and can be analysed by approximate bisimulation. However, Euler-based methods also apply when the system is not incrementally stable, in which case the constantλis necessarily positive. We thus consider a class of systems different from that of [31].

The plan of the paper is as follows: In Section 3, we give an explicit upper bound on the mean square error of the tamed Euler method for SDEs under OSL condition (Theorem 2). We apply the result in order to ensure reachability properties of stochastic switched systems with guaranteed minimum probability (Section 4). We conclude in Section 5.

This paper is an extended version of a paper (A. Le Co¨ent, L. Fribourg and J. Vacher. Control Synthesis for Stochastic Switched Systems using the Tamed Euler Method. In ADHS’18, IFAC-PapersOnLine 51(16), pages 259- 264. Elsevier Science Publishers, 2018). The main additional material of the extended version is:

(4)

• Section 3.3, which gives analytical upper bounds on expressions occurring in Theorem 2,

• Section 3.4, which gives a lower bound on the probability of reaching a given setR (Proposition 3), and

• Section 4.4, which proposes an optimized method of control synthesis.

2. Preliminaries 2.1. Notations

The symbols N, Ně0, R, Rą0, Rě0 denote the set of natural, nonnegative integer, real, positive, and nonnegative real numbers.

The symbol}¨}denotes the Euclidean norm onR^d. The symbolx¨,¨ydenotes the scalar product of two vectors ofR^d. Given a pointxPR^dand a positive real rą0, the ballBpx, rqof centrexand radiusris the settyPR^d | }x´y} ďru.

Throughout this paper, the set U denotes a finite set of switched modes.

Given a sequence of k modesa“ pa₁¨ ¨ ¨a_kq PU^k, and a sequence of pmodes b“ pb1¨ ¨ ¨bpq PN^p, we denote the concatenation a˚b of aandb the sequence of lengthk`p: a˚b“ pa1¨ ¨ ¨ak¨b1¨ ¨ ¨bpq.

A sequence of non-negative integers is written in bold when it is used as a multi-index,e.g. a“ pa1, . . . , apq P N^pě0, the sum of the components of a is denoted by|a| “a₁`¨ ¨ ¨`a_p, and`_b

a

˘“_a ^b!

1!a2!...ap! is the multinomial coefficient.

2.2. Systems considered and assumptions

Let τ P Rą0 be a fixed real number, let pΩ,F,Pq be a probability space with normal filtration pFtq_tPr0,τs, let d, m P N let W “ pW^p1q, . . . , W^pmqq : r0, τs ˆΩ Ñ R^m be an m-dimensional standard pWtq_tPr0,τs-Brownian motion and letx₀ : ΩÑR^d be anF₀{BpR^dq-measurable mapping with Er}x₀}^ps ă 8 for allpP r1,8q. Moreover, let f : R^d Ñ R^d be a continuously differentiable function whose derivative grows at most polynomially. Formally, let us suppose the existence of constantsDPR^ě0 andqPNsuch that, for allx, yPR^d

}fpxq ´fpyq}²ďD}x´y}²p1` }x}^q` }y}^qq (H1) Letg“ pgi,jqiPt1,...,du,jPt1,...,mu:R^dÑR^dˆmbe a globally Lipschitz continuous function: there existsL_gPRě0such that, for allx, yPR^d

}gpxq ´gpyq} ďLg}x´y} (H2) Finally, let us suppose that f is globally one-sided Lipschitz with constantλPR: DλPR@x, yPR^d: xfpyq ´fpxq, y´xy ďλ}y´x}² (H3) Then consider the Stochastic Differential Equations (SDE):

dXt“fpXtqdt`gpXtqdWt, X0“x0 (1)

(5)

for t P r0, τs. The drift coefficient f is the infinitesimal mean of the process X and the diffusion coefficient g is the infinitesimal standard deviation of the process X. Under the above assumptions, the SDE (1) is known to have a unique strong solution. More formally, there exists an adapted stochastic pro- cesspXt,x0q:r0, τs ˆΩÑR^d with continuous sample paths fulfilling

Xt,x₀ “x0` żt

0

fpXsqds` żt

0

gpXsqdWs

for alltP r0, τsP-a.s. (see, e.g., ([21])).

We denote byXt,x₀ the solution of Equation (1) at timet from initial con- ditionX0,x₀ “x0P-a.s., in whichx0is a random variable that is measurable in F0.

Remark 1. Constantsλ,Lg andD can be computed using (constrained) opti- mization algorithms (see ([15])).

2.3. Overview of the approach

We aim at synthesizing reachability controllers for the class of systems described above. Our approach consists in adapting reachability analysis based control systhesis methods to stochastic systems. In [15], we successfully used the Euler method for guaranteeing controlled reachability of deterministic systems using the Euler method in association to a new error bound relying on the OSL property of the vector field.

Consider equation (1) withg“0. Classically, one knows that, if the function f is Lipschitz continuous with Lipschitz constantL, the solution of the ODE starting at a given initial value exists and is unique. Besides, one has:

}Xt,x₀´Xt,x₁} ďe^Lt}x0´x1}, (2) with two initial valuesxi (i “ 0,1). This gives a rough growth bound, i.e. a function bounding the distance of neighboring trajectories ast evolves. In [6], it is proven that, if f is continuous and OSL with OSL constant λ, then the solution of the ODE starting at a given initial value exists and is unique, and, for allx₀, x₁PRⁿ:

}Xt,x₀´Xt,x₁} ďe^λt}x0´x1}. (3) This gives a more accurate growth bound because a Lipschitz function f is always OSL, and the associated OSL constantλis always less than or equal to its Lipschitz counterpart L. Using the OSL constant λ, it is also possible to bound the error}X_t,x₀´X˜_t,x₁} in function of }x₀´x₁}, where ˜X_t,x₁ denotes theEuler approximate trajectory of the solutionX_t,x₁. We can then conclude that any trajectory starting fromx₀close enough from a given initial condition x₁ (}x₀´x₁} ď δ₀) will remain within some bound (denoted by δ_t,δ₀) of the Euler trajectory ˜Xt,x₁. Our objective is to use a similar approach for stochastic systems, while making as few additional hypotheses on the dynamics as possible.

(6)

Therefore, we first explain how we can use a variant of theEuler-Maruyama scheme called tamed Euler scheme in order to guarantee reachability of the expectation of the state ErXt,x₀s of the stochastic system (1) in a given set.

It consists in computing an approximate Euler-based trajectory ˜X_t,x₁, we then provide an error bound for Er}Xt,x₀ ´X˜t,x₁}²s (Theorem 2) which allows to write a result for the expectation of the non approximated trajectoryErXt,x₀s (Proposition 2). This result can be extended further as a probability result (the probability with which the state reaches the set, Proposition 3).

This reachability analysis method is then used in association with tiling based control synthesis algorithms to ensure that the system is stabilised in a given region of interest with a given probability (Section 4). The first algorithm we use is based on an adaptive state-space decomposition algorithm [15, 14], and uses an exhaustive enumeration of all possible control sequences (up to a given length) in order to find stabilising sequences (see Section 4.3). We then propose a faster approach (in Section 4.4) which relies on state-space discretization and dynamic programming techniques in order to minimize the average distance of the state with some given objective after a given number of steps. The second algorithm is merely an enhancement of the control sequence search of the first one, using a static tiling. Note that with both algorithms, there is no need to compute a finite-state abstraction of the system, which means that computing a controller is done with the application of a single algorithm, unlike [31]

which requires the computation of a finite-state abstraction before computing controllers.

Example 1. Throughout the paper, we illustrate our approach on a 2-mode switched system borrowed from [30]. This case-study is a stochastic version of a well-known system originally introduced as an illustrative example in [16]. The dynamics is given by the system of equations

dx1“ p´0.25x1`px2` p´1q^p0.25qdt`wx1dW_t¹, dx2“ ppp´3qx1´0.25x2` p´1q^pp3´pqqdt`wx2dW_t²,

wherep“1,2 is the mode, and we consider different values of noisewPRą0.

3. Bounding the Error of the tamed Euler method 3.1. Tamed Euler approximation

The standard way to extend the classical Euler method for ordinary differential equations to the SDE (1) is the Euler-Maruyama scheme ([19]). More precisely, givenz: ΩÑR^danF0{BpR^dq-measurable mapping withEr}z}^ps ă 8 for allpP r1,8q, the explicit Euler-Maruyama (EM) method for the SDE (1) is given by the mappingsY_n,z^N : ΩÑR^d,nP t0,1, . . . , Nu, which satisfyY_0,z^N “z and

Y_n`1,z^N “Y_n,z^N ` τ

N ¨fpY_n,z^N q `gpY_n,z^N qpW_pn`1qτ{N ´W_nτ{Nq

for all n P t0,1, . . . , N ´1u and all N P N. See ([19]). Unfortunately, the convergence results for the EM scheme does not hold when the drift function

(7)

f of the SDE (1) is polynomial (not an affine function). In order to consider non affine drift functions, we adopt a refined scheme, which has been proposed recently to overcome this difficulty ([13]). LetX^N_n,z: ΩÑR^d,

X^N_n`1,z “X^N_n,z`

τ

N ¨fpX^N_n,zq 1`_N^τ ¨ }fpX^N_n,zq}

`gpX^N_n,zqpWpn`1qτ N

´W^nτ

Nq

(4)

for allnP t0,1, . . . , N´1uand allN PN. We refer to the numerical method (4) as atamed Euler scheme([13]). In this method the drift term ^τ_n¨fpX^N_n,zq is “tamed” by the factor 1{p1`_N^τ ¨ }fpX^N_n,zq}q for n P t0,1, . . . , N ´1u and N PNin (4). In order to establish results for the time continuous system, we use the following time continuous interpolation of the time discrete numerical approximations (4), also introduced in ([13]). Let ˜X_z^N :r0, τs ˆΩÑR^d,N PN, be a sequence of stochastic processes given by

X˜_t,z^N “X˜_n,z^N `

pt´nτ{Nq ¨fpX˜_n,z^N q

1`τ{N¨ }fpX˜_n,z^N q} `gpX˜_n,z^N qpWt´W^nτ

Nq (5)

for allt P r^nτ_N,^pn`1qτ_N s, n P t0,1. . . , N´1u and all N PN. Note that ˜X_t,z^N : r0, τs ˆΩÑR^d is an adapted stochastic process with continuous sample paths for everyN PN.

We finally introduce a piecewise constant interpolation X^N_t,z of the tamed Euler scheme as follows. It is mainly used in some constants when establishing the error bound and for proving Theorem 2.

X^N_t,z:“X^N_n,z fortP rnτ

N ,pn`1qτ

N q. (6)

Note that ˜X_t,z^N “X^N_t,z“X^N_n,z at timet“ ^nτ_N fornP t0,1, . . . , Nu.

The following theorem is proven in ([13]):

Theorem 1. Let us suppose(H1)-(H2)-(H3). Letz: ΩÑR^dbe anF0{BpR^dq- measurable mapping withEr}z}^ps ă 8for allpP r1,8q. Then, for allpP r1,8q, the tamed Euler scheme (4)satisfies:

sup

NPN

sup

nPt0,1,...,Nu

Er}X^N_n,z}^ps ă 8

This theorem allows to ensure the strong convergence of the tamed Euler method. Any numberN of subsampling steps can thus be used. This number is now left implicit for the sake of simplicity. From Theorem 1, it follows (cf.

Lemma 4.3, ([11])):

Lemma 1. Let us suppose (H1)-(H2)-(H3). Let z : ΩÑR^d be anF0{BpR^dq- measurable mapping with Er}z}^ps ă 8 for all pP r1,8q. Let us consider the time continuous interpolation of the tamed Euler scheme as defined in (5), and

(8)

the piecewise constant interpolation as defined in (6). Then, for any even inte- gerrě2, there exist two constantsEr,z andFr,z such that

sup

0ďtďτE}X_t,z´X˜t,z}^rď p∆tq^r²pEr,zp∆tq^r² `Fr,zdq.

with∆t“τ{N and:

E_r,z “2^rp}fp0q}^r`D2^r`1²

p1`Esup_0ďtďτ}X_t,z}^qrq¹²pEsup_0ďtďτ}X_t,z}^2rq¹²q, Fr,z“2^rp}gp0q}^2r`L^r_gEsup_0ďtďτ}X_t,z}^r²q.

Remark 2. Constants Er,z and Fr,z are computed using the constants λ and Lg (see Remark 1), and the expected values ofsup_0ďtďτ}X_t,z}^p for the required values of pat each time t“0,∆t, 2∆t,. . ., N∆t. These expected values can either be computed numerically by using a Monte Carlo method, or by evaluat- ing analytically the expectationsEsup_0ďtďτ}X_t,z}^p for all required values of p.

Proposition 2 explains how such expectations can be computed for the piecewise linear interpolationX˜t,z, the computation forX^p_t,zis the same but much simpler since there is no time dependent term in (6). In this paper, we use the latter approach.

3.2. Mean square error bounding

The following Theorem holds for SDE (1). This corresponds to a stochastic version of Theorem 1 of ([15]), showing that a similar result holds on average, using the tamed Euler method of ([13]). It is an adaptation of Theorem 4.4 in ([11]).

Theorem 2. Given the SDE system (1) satisfying (H1)-(H2)-(H3). Let δ0 P Rě0. Suppose thatz is a random variable on R^d such that

Er}x₀´z}²s ďδ₀². Then, we have, for allτě0:

Er sup

0ďtďτ

}Xt,x₀´X˜t,z}²s ďδ_τ,δ² ₀, withδ²_τ,δ

0 :“βpτqe^γτ, where:

γ“2p?

∆t`2λ`L²_g`128L⁴_gq, and

βpτq “2δ₀²`2τ∆tL²_gp1`128L²_gqpF2,zd`E2,z∆tq

`4τa

∆_tDpF_4,zd`E_4,z∆²_tq¹² p1`4E sup

0ďtďτ

}X_t,z}^2q`4E sup

0ďtďτ

}X˜t,z}^2qq¹².

(7)

with ∆t“τ{N.

(9)

Proof. The proof closely follows the proof of Theorem 4.4 in ([11]). Let et “ Xt,x₀´X˜t,z. We have, for all 0ďtďτ:

det“ pfpXt,x₀q ´fpzqqdt` pgpXt,x₀q ´gpzqqdWt. (8) Then, by using Equation (8) and the integral version of Itˆo formula applied to functionxÞÑ }x}² we obtain

}et}²“ }e0}²` żt

0

2xes, fpXs,x₀q ´fpX_s,zqyds

` żt

0

}gpXs,x0q ´gpX_s,zq}²ds`Mptq,

(9)

wheree0“x0´z, and Mptq “

żt 0

2xe_s, gpX_s,x₀q ´gpX_s,zqydW_s. So we have using (H2):

}et}²ď }e0}²` żt

0

2xes, fpXs,x₀q ´fpX˜s,zqyds

`L²_g żt

0

}Xs,x₀´X_s,z}²ds

` żt

0

2xe_s, fpX˜_s,zq ´fpX_s,zqyds`Mptq.

(10)

So we have using (H3) and Young’s inequality:

}et}²ď }e0}²` żt

0

p2λ}es}²`L²_g}es}²qds

`L²_g żt

0

}X_s,z´X˜s,z}²ds

` żt

0

p 1

?∆t

}fpX˜_s,zq ´fpX_s,zq}²`a

∆_t}e_s}²qds`Mptq.

(11)

So we have using (H1), for all 0ďtďτ:

}et}²ď }e0}²` pa

∆t`2λ`L²_gq żt

0

}es}²ds

`L²_g żt

0

}X_s,z´X˜s,z}²ds

` D

?∆t

żt 0

p1` }X_s,z}^q` }X˜_s,z}^qq}X_s,z´X˜_s,z}²ds

`Mptq.

(12)

(10)

It follows using Lemma 1 forr“2, and Cauchy-Schwarz inequality:

Er sup

0ďsďt

}es}²s ďE}e0}²` pa

∆t`2λ`L²_gq żt

0

E}es}²ds

`L²_gτ∆tpE2,z∆t`F2,zdq

` D

?∆t

żt 0

pEp1` }X_s,z}^q` }X˜s,z}^qq²q¹²pE}X_s,z´X˜s,z}⁴q¹²ds

`mptq,

(13)

where

mptq “Er sup

0ďsďt

}Mpsq}s.

Hence, using using Lemma 1 forr“4, and inequalitypa`bq^pď2^ppa^p`b^pq:

Er sup

0ďsďt

}es}²s ďE}e0}²` pa

∆t`2λ`L²_gqq żt

0

E}es}²ds

`L²_gτ∆_tpE_2,z∆_t`F_2,zdq

`2Dτa

∆_tpE_4,z∆²_t`F_4,zdq¹² p1`4E sup

0ďtďτ

}X_t,z}^2q`4E sup

0ďtďτ

}X˜t,z}^2qq¹² `mptq.

(14)

On the other hand, from the Burkholder-Davis-Gundy inequality, we get:

mptq ď16Er żt

0

}e_s}²}gpX_s,x₀q ´gpX_s,zq}²dss¹² Hence, using (H2):

mptq ď16L²_gEr sup

0ďsďt

}es}² żt

0

}Xs,x₀´X_s,z}²dss¹² Then, using Young’s inequality (for anyαą0):

mptq ď8L²_gpαEr sup

0ďsďt

}e_s}²s ` 1 αEr

żt 0

}X_s,x₀´X_s,z}²dssq.

Hence, by using Lemma 1 forr“2:

mptq ď8αL²_gEr sup

0ďsďt

}es}²s

`8L²_g α

żt 0

Er sup

0ďrďs

}e_r}²sds

` 8L²_g

α τ∆tpE2,z∆t`F2,zdq.

(15)

(11)

Hence, lettingα“ _16L¹2

g, we have by replacing in (14):

1 2Er sup

0ďsďt

}es}²s ď

δ₀²` pa

∆_t`2λ`L²_g`128L⁴_gq żt

0

Er sup

0ďrďs

}e_r}²sds

`τpL²_g`128L⁴_gq∆tpE2,z∆t`F2,zdq

`τ2Da

∆tpE4,z∆²_t`F4,zdq¹² p1`4E sup

0ďtďτ

}X_t,z}^2q`4E sup

0ďtďτ

}X˜t,z}^2qq¹².

(16)

It results from Gronwall’s inequality:

Er sup

0ďtďτ

}et}²s “βpτqe^γτ, with

γ“2p?

∆_t`2λ`L²_g`128L⁴_gq, and

βpτq “2δ₀²`2τp∆_tL²_gp1`128L²_gqpF_2,zd`E_2,z∆_tq

`4τa

∆tDpF4,zd`E4,z∆²_tq¹² p1`4E sup

0ďtďτ

}X_t,z}^2q`4E sup

0ďtďτ

}X˜_t,z}^2qq¹².

(17)

It follows from Theorem 2 and Jensen’s inequality:

Proposition 1. Consider two pointsx₀andz inR^d,and a positive real number δ₀. Suppose that x₀ P Bpz, δ₀q. Then EX_t,x₀ P BpX˜_t,z, δ_t,δ₀q for all t P r0, τs, whereδ_t,δ₀ is defined in (7).

3.3. Numerical pre-computation of expectations

Theorem 2 requires the knowledge of Esup_0ďtďτ}X˜_t,z}^2q. In this section, we give a pre-computable upper bound of this quantity.

Proposition 2. Let X˜_t,z be the piecewise linear interpolation of the solution to equation (1) as defined in equation (5), written in the form X˜t,z “αpzq ` βpzqt`γpzqW_t with αpzq “ X˜_n,z^N ´ ^pnτ{Nq¨fp

X˜_n,z^N q

1`τ{N¨}fpX˜_n,z^N q}, βpzq “ ^fp^X^˜

N n,zq 1`τ{N¨}fpX˜_n,z^N q}, γpzq “gpX˜_n,z^N q. Then

Er}X˜t,z}^2qs “ ÿ

|k|“q

ˆq k

˙

Ck₁,k₂,k₃pzqµk₄,k₅,k₆pz,1qt^k²^`2k³^`k⁴^`^k⁴^`k² ⁵^`k⁶, (18)

(12)

wherek“ pk1, . . . , k6qand Ck₁,k₂,k₃pzq “

´

αpzq^Tαpzq

¯k₁´

2αpzq^Tβpzq

¯k₂´

βpzq^Tβpzq

¯k₃

, and

µ_κpz, tq “

$

’’

&

’’

%

λ^p1qκ Dk5´k4

2 ,k4,k6pbpzqbpzq^T, Pa,bpzq, Qpzqqt^k⁴^`k² ⁵^`k⁶ if k4ăk5 and k4`k5”0 pmod 2q λ^p2qκ Dk4´k5

2 ,k₅,k₆papzqapzq^T, P_a,bpzq, Qpzqqt^k⁴^`k² ⁵^`k⁶ if k4ąk5 and k4`k5”0 pmod 2q

λ^p3qκ D_k₅_,k₆pP_a,bpzq, Qpzqqt^k⁵^`k⁶ if k₄“k₅ and k₅”0 pmod 2q 0 if k4`k5”1 pmod 2q

whereapzq “2βpzq^Tγpzq,bpzq “2αpzq^Tγpzq,Qpzq “γpzq^TγpzqandD_n₁_,...,n_ipA₁, . . . , A_iq being the normalized Davis polynomials [5, 12] and

P_a,bpzq “apzqbpzq^T `bpzqapzq^T

2 ,

λ^p1q_κ “2^k⁴^`k² ⁵^`k⁶^´1k₄k₆pk₅´k₄q, λ^p2q_κ “2^k⁴^`k² ⁵^`k⁶^´1k5k6pk4´k5q, λ^p3q_κ “2^k⁵^`k⁶k₅k₆.

.

Proof. First, note that we can write ˜Xt,z as the sum of a mean value, a linear drift and a standard Wiener process

X˜_t,z“αpzq `βpzqt`γpzqW_t, forzPR^d. Then, the squared norm of ˜Xt,z is

X˜_t,z^T X˜_t,z“ pαpzq `βpzqt`γpzqW_tq^Tpαpzq `βpzqt`γpzqW_tq

“αpzq^Tαpzq `2pαpzq^Tβpzq `βpzq^TγpzqW_tqt

`βpzq^Tβpzqt²`2αpzq^TγpzqWt`W_t^Tγpzq^TγpzqWt.

The Newton multinomial formula allows to express the 2q^th power of the ˜Xt,z

norm

}X˜t,z}^2q “ ÿ

|k|“q

ˆq k

˙´

αpzq^Tαpzq

¯k₁´

2αpzq^Tβpzqt

¯k₂´

βpzq^Tβpzqt²

¯k₃

´

2βpzq^TγpzqW_tt¯k₄´

2αpzq^TγpzqW_t¯k₅´

W_t^Tγpzq^TγpzqW_t¯k₆

(13)

wherek“ pk1, . . . , k6q. The previous equation can be written as follows }X˜t,z}^2q “ ÿ

|k|“q

ˆq k

˙

Ck₁,k₂,k₃pzqt^k²^`2k³^`k⁴papzq^TWtq^k⁴pbpzq^TWtq^k⁵pW_t^TQpzqWtq^k⁶

whereapzq “2βpzq^Tγpzq,bpzq “2αpzq^Tγpzq,Qpzq “γpzq^Tγpzqand Ck₁,k₂,k₃pzq “

´

αpzq^Tαpzq

¯k₁´

2αpzq^Tβpzq

¯k₂´

βpzq^Tβpzq

¯k₃

. We denote

µκpz, tq^def.“ E“

papzq^TWtq^k⁴pbpzq^TWtq^k⁵pW_t^TQpzqWtq^k⁶‰

where κ“ pk₄, k₅, k₆q. The expectation µ_κpz, tq can always be written as the high-order moment of a product of three quadratic forms [17], therefore we have

µκpz, tq “

$

’’

&

’’

%

λ^p1qκ Dk5´k4

2 ,k₄,k₆pbpzqbpzq^T, P_a,bpzq, Qpzqqt^k⁴^`k² ⁵^`k⁶ if k4ăk5 and k4`k5”0 pmod 2q

λ^p2qκ Dk4´k5

2 ,k₅,k₆papzqapzq^T, Pa,bpzq, Qpzqqt^k⁴^`k² ⁵^`k⁶ if k₄ąk₅ and k₄`k₅”0 pmod 2q

λ^p3qκ Dk5,k6pPa,bpzq, Qpzqqt^k⁵^`k⁶ if k4“k5 and k5”0pmod 2q 0 if k₄`k₅”1pmod 2q

where Dn₁,...,n_ipA1, . . . , Aiq denotes the normalized Davis polynomials [5, 12]

and

Pa,bpzq “apzqbpzq^T `bpzqapzq^T

2 ,

λ^p1q_κ “2^k⁴^`k² ⁵^`k⁶^´1k₄k₆pk₅´k₄q, λ^p2q_κ “2^k⁴^`k² ⁵^`k⁶^´1k5k6pk4´k5q, λ^p3q_κ “2^k⁵^`k⁶k5k6.

Finally, the result holds.

The normalized Davis polynomials Dn₁,...,n_ipA1, . . . , Aiq can be computed efficiently using recursion formulas, see [12].

From Proposition 2, we obtain an upper bound ofEsup_0ďtďτ}X˜_t,z}^2q using the following lemma.

Lemma 2. For every martingale YtPL^p (wherepą1), we have:

Er sup

0ďtďτ

}Yt}^ps ď ˆ p

p´1

˙p

Er}Yτ}^ps.

Proof. See [1].

(14)

We thus have:

Esup_0ďtďτ}X˜t,z}^2q ď ˆ 2q

2q´1

˙2q

ÿ

|k|“q

ˆq k

˙

C_k₁_,k₂_,k₃pzqµ_k₄_,k₅_,k₆pz,1qt^k²^`2k³^`k⁴^`^k⁴^`k² ⁵^`k⁶

whereCk₁,k₂,k₃pzqandµk₄,k₅,k₆pz,1qare given in Proposition 2. The computation of this deterministic upper-bound is much faster than the direct Monte- Carlo approximation of the expectation. Furthermore, we get rid of the approximations inherent to the use of Monte-Carlo simulations.

3.4. Probabilistic reachability

Let us suppose for simplicity thatR is a ball of the formBpO, ρqwhereO is the origin andρą0. Suppose thatx0PBpz, δ0q. We have:

P rpXτ,x₀ RRq “P rp}Xτ,x₀} ąρq.

Furthermore:

Proposition 3. P rpXτ,x₀ PRq ě 1´ 1 ρ²

ˆ δτ,δ₀`

b

Er}X˜τ,z}²s

˙2

. Proof. Using Chebyshev’s inequality and triangular inequality, it follows P rpXτ,x₀ RRq ď 1

ρ²Er}Xτ,x₀}²s ď 1 ρ²

ˆb

Er}X˜τ,z´Xτ,x₀}²s ` b

Er}X˜τ,z}²s

˙2

. We thus get, using the inequality of the previous section (namely, Er}X˜_τ,z ´ X_τ,x₀}²ďδ²_τ,δ

0):

P rpXτ,x₀ RRq ď 1

ρ² pδτ,δ₀q² ` b

Er}X˜τ,z}²s.

Alternatively, we have:

P rpXτ,x₀ PRq ě 1´ 1 ρ²

ˆ δτ,δ₀`

b

Er}X˜τ,z}²s

˙2

.

NB: If we haveR“BpO¹, ρqwhereO¹ isnotthe origin, we have:

P rpX_τ,x₀ PRq ě 1´ 1 ρ²

ˆ δ_τ,δ₀`

b

Er}X˜_τ,z´O¹}²s

˙2

.

(15)

3.5. Implementation

This method has been implemented in the interpreted language Octave, and the experiments performed on a 2.80 GHz Intel Core i7-4810MQ CPU with 8 GB of memory. The implementation is an adaptation of the program described in ([15]) for controlling deterministic switched systems, but makes use of the tamed Euler scheme for SDEs (with the error functionδt,δ₀ given in Theorem 2) instead of the classical Euler scheme.

Example 2. Consider the system of Example 1, for modeu“1andw“0.05:

dx1“ p´0.25x1`x2`0.25qdt`0.05x1dW_t¹ dx2“ p´2x1´0.25x2´2qdt`0.05x2dW_t²

The program gives (for τ “ 1, ∆t “ τ{10⁴): q “ 0, D “ 1.36, Lg “ 0.05, λ“0.25; and forz“ p´4,´3.8q: E_2,z “893.3,E_4,z “2.14¨10⁵,F_2,z “0.002, F_4,z “4.9¨10^´6.

Consider now the system corresponding to Example 1 for mode u“2 and w“0.05:

dx₁“ p´0.25x₁`2x₂´0.25qdt`0.05x₁dW_t¹ dx2“ p´x1´0.25x2`1qdt`0.05x2dW_t²

The program gives (for τ “ 1, ∆t “ τ{10⁴): q “ 0, D “ 1.36, Lg “ 0.05, λ“0.25, and, for z “ p0,3q: E2,z “543.2, E4,z “7.94¨10⁴,F2,z “0.0442, F4,z “0.00178.

Both computations take less than 10s of CPU time. Simulations of the two systems are given in Figure 1 for modeu“1 and starting pointz“ p´4,3.8q, and mode u“ 2 and starting point z “ p0,3q. The initial ball Bpz, δ0qis depicted in black, the final ballBpErX˜τ,zs, δτ,δ₀qin red, and 200 random sampling trajectories in blue fortP r0, τs.

Example 3. We now consider a nonlinear model of a pendulum on a cart borrowed from [4, 31]. The dynamics is given by:

dx₁“x₂dt`0.03x₁dW_t¹

dx₂“ p´^g_lsinx₁´_m^kx₂`_ml¹2uqdt`0.03x₂dW_t²

where the statepx₁, x₂qrepresents the angular position and velocity of the point mass on the pendulum,uis the control input (torque applied to the cart). The parameters are g “ 9.8, l “ 0.5, m “ 0.6, k “ 2. The program gives for control input u “ 0.5, τ “ 0.1, ∆t “ τ{10²: q “ 1, D “ 2.56, Lg “ 0.03, λ“1.446; and forz“ p0,0q: E2,z “47.5,E4,z“1977.2¨10⁵,F2,z “9.8¨10^´4, F4,z “9.73¨10^´7.

The computation takes less than 10s of CPU time. Simulations of the system are given in Figure 2 for control input u “0.5 and starting point z “ p0,0q.

The initial ball Bpz, δ0q is depicted in black, the final ball BpErX˜τ,zs, δτ,δ₀q in red, and 200 random sampling trajectories in blue fortP r0, τs.

(16)

Figure 1: Simulations of Example 2 with modeu“1 and initial ballBpp´4,3.8q,0.5q, and modeu“2 and initial ballBpp0,3q,0.5q;τ“1.

4. Sampled stochastic switched systems

4.1. Stochastic switched system as a finite collection of SDEs

We now consider a finite number of SDEs. Each SDE is referred to as a modej, and the set of modes is referred to asU “ t1, . . . , Mu. We will denote byX_t,x^j ₀ the solution at timetof the system:

dxptq “f_jpxptqq `g_jpxptqqdW_t^j,

xp0q “x0. (19)

where x0 is a random variable that is measurable in F0. Hypotheses (H1)- (H2)-(H3), as defined in Section 3, are naturally extended to every modej of U. Accordingly, constants L_g, λ, F associated to SDE (1) in Section 3, now becomeL_g_j,λ_j,F_j respectively, for eachjPU.

Likewise, for each j PU, the nonnegative real pδt,δ₀q² becomes pδ^j_t,δ

0q² for each modej; the approximate continuous-time solution of (19) starting fromz, is denoted by ˜X_t,z^j , and the approximate staircase solution by X^j_t,z.

4.2. Control patterns

The control laws that we now consider are “piecewise constant of durationτ”

in the sense that, everyτ seconds, they select a given mode (see ([30])). We call

(17)

Figure 2: Simulations of Example 3 with control inputu “0.5, initial ballBpp0,0q,0.01q, τ“0.1.

“(control) pattern of lengthk” a sequence ofk modes (i.e., an element ofU^k).

Each patternπof the formj₁j₂¨ ¨ ¨j_kcorresponds to the selection of modej₁for timetP r0, τq, then mode j₂fortP rτ,2τq, and so on, untilt“kτ. We assume that the solution of the system is continuous at sampling instantst“τ,2τ, . . . (which means that there is no “reset” of the system at sampling instants).

Given a stochastic switched system, a pattern π of length k and an initial random variablez, one constructs the “approximate solution controlled by π”

by composing together the approximations obtained by successive application of the modes of the patternπ. Formally, the “continuous” approximate solution X˜_t,z^π is defined at timetP r0, kτsas follows:

• X˜_t,z^π “X˜_t,z^j ifπ“jPU,k“1 andtP r0, τs, and

• X˜_pk´1qτ`t^π 1,z“X˜_t,z^j 1 withz¹“X˜_pk´1qτ,z^π¹ ifkě2,t¹ P r0, τs, π“π¹˚j for somej PU and π¹PU^k´1.

The “staircase” approximate solution X^π_t,z is defined analogously. Likewise, given an initial error radiusδ₀ą0 and a patternπof lengthkě1, one defines the error radiusδ^π_t,δ

0 using (7) as follows:

• δ^π_t,δ

0 “δ_t,δ^j

0 ifπ“jPU,k“1 andtP r0, τs, and

• δ^π_pk´1qτ`t₁_,δ

0 “δ_t^j1,δ¹ withδ¹ “δ_pk´1qτ,δ^π¹

0, ifkě2,t¹ P r0, τs,π“π¹˚j for somej PU and π¹PU^k´1.

4.3. Controlled probabilistic reachability

Given a ball R “ BpO, ρq Ă R^d with ρ ą 0, we define the problem of

“controlled probabilistic reachability insideR” as follows:

Given a starting pointzPR^dand a positive realδ₀, find a patternπof length ksuch that, at t“kτ, the image of the points located initially inBpz, δ₀qbe, on average, located as close as possible toO, and, in particular:

BpErX˜_t,z^π s, δ^π_t,δ₀q ĎR for t“kτ.

(18)

The program mentioned in Section 3.5, has been extended in order to find, by exhaustive enumeration¹, such a pattern π. This guarantees that, for all x0PBpz, δ0q:

P rpX_kτ,x^π ₀ PRq ě1´ 1 ρ²p

b

Er}X˜_kτ,z^π }²s `δ_kτ,δ^π ₀q², We now give two examples of application of this program.

Example 4. Consider the system of Example 1 withw“0.01:

dx₁“ p´0.25x₁ùx₂` p´1qû0.25qdt`0.01x₁dW_t¹ dx₂“ ppu´3qx₁´0.25x₂` p´1qûp3úqqdt`0.01x₂dW_t² whereu“1,2.

Forτ“0.5,∆t“10^´4, one finds (for all modes u“1,2):

q“0,D “1.36,Lg “0.01,λ“0.25; for z“ p´4,´3.8q: E2,z “893.31, E4,z “2.14¨10⁵, F2,z “ 0.002, F4,z “4.9¨10^´6; and for z “ p0,3q: E2,z “ 543.22,E4,z “7.94¨10⁴,F2,z “0.0442,F4,z “0.00178.

Our program shows the probabilistic reachability insideR“Bpp0,0q, ρq, for all point x0 P Bpp5,4q, δ0q and all point y0 P Bpp´5,´4q, δ0q, with δ0 “ 0.1 and ρ“7. We have, forπ1“ p2¨2¨2¨2q

P rpX_4τ,x^π¹

0 PRq ě1´ 1 ρ²p

b

Er}X˜_4τ,p5,4q^π¹ }²s `δ_4τ,δ^π¹

0q², P rpX_4τ,x^π¹ ₀ PRq ě0.689,

and forπ2“ p1¨1¨1¨1¨1q:

P rpX_5τ,y^π² ₀ PRq ě1´ 1 ρ²p

b

Er}X˜_{5τ,p´5,´4q}^π² }²s `δ_5τ,δ^π²

0q². P rpX_5τ,y^π²

0PRq ě0.713

Figure 3 depicts in black the initial balls (at t “ 0) centered at p5,4q and p´5,´4q; and for each initial ball, the pattern that sends the ball inside R (at timet“kτ); the intermediate balls (at t“τ,2τ, . . . ,pk´1qτ) are depicted in red, and 200 sampling trajectories drawn in black.

Example 5. (the slit problem)

The problem is adapted from ([20]). The controlled dynamics is:

dX “udt`dW, X0“1

with modeuP t´6,´5,´4,´3,´2,1,0,1,2,3,4,5,6u. We have (at t“0.5) a slit at xP r´1,´3s. The objective is thus to control the system so that xptq P R“ r´1,´3s (i.e.,xptq PR“BpO¹, ρq withO¹“ ´2 andρ“1q att“0.5.

1In Section 4.4 a strategy based on the search of anoptimalpattern is suggested rather than an exhaustive search.

(19)

Figure 3: Simulations of Example 4 from the initial ballsBpp5,4q,0.1qandBpp´5,´4q,0.1q using patternsp2¨2¨2¨2qandp1¨1¨1¨1¨1qresp., withτ“0.5.

One has, for all modes: q“0,D“0,Lg“0,λ“0. Forδ0“0.2, an initial pointz“1 and a sampling timeτ“0.5with subsampling∆t“2.5ˆ10^´3, one has for modeu“ ´6: E2,z “144,E4,z “20736,F2,z “4, F4,z “16; and for modeu“0: E2,z “0, E4,z “0, F2,z “4, F4,z“16.

Suppose that all the trajectories start at x0 P Bpz, δ0q, with z “ 1 and δ0 “ 0.2. When there is no control (u “ 0), at time t “ 0.5, the expected value of Xt,x₀ is in Bpz1, δt,δ₀q with z1 “1 and δt,δ₀ “ 0.29. From Markov’s inequality, it follows that the trajectories pass byR “Bp´2,1q att“0.5 with low probability: see Figure 4. On the other hand, with controlu“ ´6, at time t “ τ “ 0.5, the expected value of X_t,x₀ is now in Bpz₁¹, δ_τ,δ¹

0q with z₁¹ “ ´2 and δ_τ,δ¹

0 “0.29. This explains why the trajectories pass by R “Bp´2,1q at t“0.5 with high probability. We have, forx₀PBpz, δ₀qwith z“1, δ₀“0.2, t“τ“0.5,ρ“1,O¹“ ´2:

P rpX_τ,x^u

0 PR“BpO¹, ρqq ě1´ 1 ρ²p

b

Er}X˜_τ,zû Ó¹}²s `δû_τ,δ

0q². (20)

(20)

Figure 4: Simulations of Example 5 without control (π“ p0¨0q) and with control pattern p´6¨0qfrom the initial ballBp1,0.2q. The initial region is in green. The red regions are BpErX˜^π_t,zs, δ_t,δ^π

0q.withz“1,δ0“0.2 for the two patternsπ“ p0¨0q(top) andπ“ p´6¨0q (bottom).

For mode u “ ´6, this gives: δ^u_τ,δ

0 “ 0.29 and Er}X˜_τ,zû Ó¹}²s “ 0.048. It follows: P rpX_τ,xû ₀ PRq ě0.871.

4.4. Optimal control by state space discretization

In this Subsection, we are interested in solving an optimal control problem in order to minimize the average distance of the state reached after a given number of steps to the origin. In order to solve such optimal control problems, it is classical to spatially discretize the set S into a finite number of cells of equal size. Consider a partition of a compact subset S of R^d in cells. For the sake of simplicity, we suppose that each cell C is a hypercube of R^d of side length 2ε{?

d, and we consider the centre of the cell, denoted 7x, as the

“representative” of the cell. Note that any pointz located in the cell is such that }z´ 7x} ďε. In [3] (cf. [8]), it is explained how a discretized procedure,