
Large Deviation Principle for invariant distributions of Memory Gradient Diffusions

HAL Id: hal-00759188

https://hal-univ-tlse3.archives-ouvertes.fr/hal-00759188

Submitted on 30 Nov 2012

Sébastien Gadat, Fabien Panloup, Clément Pellegrini

To cite this version:

Sébastien Gadat, Fabien Panloup, Clément Pellegrini. Large Deviation Principle for invariant distributions of Memory Gradient Diffusions. Electronic Journal of Probability, Institute of Mathematical Statistics (IMS), 2013, vol. 18, paper n° 81, 34 p. ⟨hal-00759188⟩

Sébastien Gadat, Fabien Panloup and Clément Pellegrini

Institut de Mathématiques, Université de Toulouse (UMR 5219), 31062 Toulouse Cedex 9, France

{Sebastien.Gadat;fabien.panloup;clement.pellegrini}@math.univ-toulouse.fr

Abstract

In this paper, we consider a class of diffusion processes based on a memory gradient descent, i.e. whose drift term is built as the average, all along the past of the trajectory, of the gradient of a coercive function $U$. Under some classical assumptions on $U$, this type of diffusion is ergodic and admits a unique invariant distribution. With a view to optimization applications, we want to understand the behaviour of the invariant distribution when the diffusion coefficient goes to 0. In the non-memory case, the invariant distribution is explicit and the so-called Laplace method shows that a Large Deviation Principle (LDP) holds with an explicit rate function, which leads to a concentration of the invariant distribution around the global minima of $U$. Here, except in the linear case, we have no closed formula for the invariant distribution but we show that a LDP can still be obtained. Then, in the one-dimensional case, we get some bounds for the rate function that lead to the concentration around the global minimum under some assumptions on the second derivative of $U$.

Keywords: Large Deviation Principle, Hamilton-Jacobi Equations, Freidlin and Wentzell Theory, small stochastic perturbations, hypoelliptic diffusions

1 Introduction

The aim of this paper is to study some asymptotic properties of a diffusive stochastic model with memory gradient when the noise component vanishes. The evolution is given by the following stochastic differential equation (SDE) on $\mathbb{R}^d$:

$$dX_t^\varepsilon = \varepsilon\,dB_t - \left(\frac{1}{k(t)}\int_0^t k(s)\,\nabla U(X_s^\varepsilon)\,ds\right)dt, \tag{1.1}$$

where $\varepsilon > 0$ and $(B_t)$ is a standard $d$-dimensional Brownian motion. A special feature of such an equation is the integration over the past of the trajectory, weighted by a function $k$ which quantifies the amount of memory. Our work is mainly motivated by optimization considerations.

Indeed, in a recent work, Cabot, Engler, and Gadat (2009b) have shown that the solution of the deterministic dynamical system ($\varepsilon = 0$) converges to the minima of the potential $U$. Without memory, that is without integration over the past of the trajectory, the model (1.1) reduces to the classical gradient descent model and such convergence results are well-known. Even in the deterministic framework, a potential interest of the gradient with memory is the capacity of the solution to avoid some local traps of $U$. Indeed, the solution of (1.1) (when $\varepsilon = 0$) may keep some inertia even when it reaches a local minimum of $U$, which implies a larger exploration of the space than a classical gradient descent, which cannot escape from local minima (see Alvarez (2000) and Cabot (2009)). Usually such a property is obtained by introducing a small noise term.

In the classical case, this leads to the following usual SDE:

$$dX_t^\varepsilon = \varepsilon\,dB_t - \nabla U(X_t^\varepsilon)\,dt. \tag{1.2}$$

As mentioned above, the behaviour of the invariant distribution of this model when $\varepsilon$ goes to 0 is well-known. Using the so-called Laplace method, it can be proved that a Large Deviation Principle (LDP) holds and that the invariant distribution of (1.2) concentrates on the global minima of $U$ when the parameter $\varepsilon \to 0$ (see e.g. Freidlin and Wentzell (1979)).
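The concentration phenomenon for (1.2) can be illustrated numerically. The sketch below relies on the classical fact that the invariant density of (1.2) is proportional to $\exp(-2U(x)/\varepsilon^2)$, and computes the mass this density gives to the well containing the global minimum. The tilted double-well potential `U`, the integration grid and the helper name `gibbs_mass_left` are illustrative assumptions, not objects from the paper.

```python
import math

def U(x):
    # Hypothetical tilted double-well potential (an assumption for illustration):
    # two wells near x = -1 and x = +1, the left one being the global minimum.
    return (x * x - 1.0) ** 2 + 0.3 * x + 1.0

def gibbs_mass_left(eps, lo=-3.0, hi=3.0, n=20001):
    """Mass that the density proportional to exp(-2 U(x) / eps^2) gives to x < 0,
    approximated by a Riemann sum on a uniform grid of [lo, hi]."""
    h = (hi - lo) / (n - 1)
    xs = [lo + i * h for i in range(n)]
    u_min = min(U(x) for x in xs)  # shift by the minimum to avoid underflow
    w = [math.exp(-2.0 * (U(x) - u_min) / eps ** 2) for x in xs]
    total = sum(w)
    left = sum(wi for x, wi in zip(xs, w) if x < 0.0)
    return left / total

# Mass of the global-minimum well for decreasing noise levels.
masses = [gibbs_mass_left(eps) for eps in (2.0, 1.0, 0.5, 0.25)]
for eps, m in zip((2.0, 1.0, 0.5, 0.25), masses):
    print(f"eps = {eps:4.2f}   mass of the left well = {m:.6f}")
```

As $\varepsilon$ decreases, the mass of the well containing the global minimum approaches 1, which is the behaviour that the LDP quantifies on the exponential scale.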

It is then natural to investigate the stochastic memory gradient (1.1) in order to obtain similar results. A major difference with the usual gradient diffusion is that the integration over the past of the trajectory makes the process $(X_t^\varepsilon)_{t\ge 0}$ non-Markov. This can be overcome by introducing an auxiliary process $(Y_t^\varepsilon)$ defined by

$$Y_t^\varepsilon = \frac{1}{k(t)}\int_0^t k(s)\,\nabla U(X_s^\varepsilon)\,ds. \tag{1.3}$$

In general, the couple $Z_t^\varepsilon = (X_t^\varepsilon, Y_t^\varepsilon)$ gives rise to a non-homogeneous Markov process (see Gadat and Panloup (2012)). In order to consider the notion of invariant measure, we concentrate on the case where $k(t) = e^{\lambda t}$, which turns $(Z_t^\varepsilon)$ into a homogeneous Markov process. In this context, Gadat and Panloup (2012) have shown existence and uniqueness of the invariant measure $\nu_\varepsilon$ for $(Z_t^\varepsilon)$.

In the present work, our objective is to obtain some sharp estimations of the asymptotic behaviour of (νε) as ε → 0. More precisely, we shall first show that (νε)ε>0 satisfies a Large Deviation Principle. Then, we will try to obtain some sharp bounds for the associated rate function in order to understand how the invariant probability is distributed as ε → 0. In particular, we will establish the concentration around the global minima of U up to technical hypotheses. In the classical setting of (1.2), this is an essential step towards implementing the strategy of the so-called simulated annealing. Developing a simulated annealing optimization procedure to the memory gradient diffusion is certainly a motivation of the study of (1.1). This will be addressed in a forthcoming work.

The paper is also motivated by extending some results of Large Deviations for invariant distributions to a difficult context where the process is not elliptic and the drift vector field is not the gradient of a potential. These two points, and especially the second one, strongly complicate the problem since explicit computations of the invariant measure are generally impossible. This implies that the works on elliptic Kolmogorov equations by Chiang, Hwang, and Sheu (1987), Miclo (1992) or Holley and Stroock (1988), for instance, cannot be extended to our context. For similar considerations in other non-Markov models, one should also mention the recent works on McKean-Vlasov diffusions by Herrmann and Tugaut (2010) and on self-interacting diffusions (with attractive potential) by Raimond (2009).

Here, in order to obtain a LDP for $(\nu_\varepsilon)_{\varepsilon\ge 0}$, we adapt the strategy of Puhalskii (2003) and Freidlin and Wentzell (1979) to our degenerate context. We shall first show a finite-time LDP for the underlying stochastic process. Second, we prove the exponential tightness of $(\nu_\varepsilon)_{\varepsilon\ge 0}$ by using Lyapunov-type arguments. Finally, we show that the associated rate function, denoted by $W$ in the paper, can be expressed as the solution of a control problem (equivalently, as the solution of a Hamilton-Jacobi equation). However, at first sight the solution of the control problem is not unique. This uniqueness property follows from an adaptation of the results of Freidlin and Wentzell (1979) to our framework. In particular, we obtain a formulation of the rate function in terms of the costs to join stable critical points of our dynamical system. Next, the second step of the paper (sharp estimates of $W$) is carried out by studying the cost to join stable critical points.

The paper is organized as follows. In Section 2, we recall some results about the long-time behaviour of the diffusion when $\varepsilon$ is fixed. Moreover, we provide the main assumptions needed for obtaining the LDP for $(\nu_\varepsilon)$. In Section 3, we prove the exponential tightness of $(\nu_\varepsilon)$ and show that any rate function $W$ associated with a convergent subsequence is a solution of a finite or infinite time control problem. In Section 4, we prove the uniqueness of $W$ by adapting the Freidlin and Wentzell approach to our context (see also the works of Biswas and Budhiraja (2011) and Cerrai and Röckner (2005) for other adaptations of this theory). Since the study of the cost function is quite hard in a general setting, we focus in Section 5 on the case of a double-well potential $U$. In this context, we obtain some upper and lower bounds for the associated quasi-potential function. Then, we provide some conditions on $U$ and on the memory parameter $\lambda$ which allow us to prove the concentration of the invariant distribution around the global minima. Note that, even if our assumptions in this part seem a little bit restrictive, the proofs of the bounds (especially the lower bound) are obtained by an original (and almost optimal) use of some Lyapunov functions associated with the dynamical system.

Acknowledgments: The authors would like to thank Guy Barles for his hospitality, and are grateful to Guy Barles, Laurent Miclo and Christophe Prieur for helpful discussions and comments.

2 Setting and Main Results

2.1 Notations and background on Large Deviation theory

In the paper, the scalar product and the Euclidean norm on $\mathbb{R}^d$ are respectively denoted by $\langle\cdot,\cdot\rangle$ and $|\cdot|$. The space of $d\times d$ real-valued matrices is referred to as $M_d(\mathbb{R})$ and we use the notation $\|\cdot\|$ for the Frobenius norm on $M_d(\mathbb{R})$.

We denote by $H(\mathbb{R}_+,\mathbb{R}^d)$ the Cameron-Martin space, i.e. the set of absolutely continuous functions $\varphi:\mathbb{R}_+\to\mathbb{R}^d$ such that $\varphi(0)=0$ and such that $\dot\varphi\in L^2_{\mathrm{loc}}(\mathbb{R}_+,\mathbb{R}^d)$.

For a $C^2$-function $f:\mathbb{R}^d\to\mathbb{R}$, $\nabla f$ and $D^2 f$ denote respectively the gradient of $f$ and the Hessian matrix of $f$. In the one-dimensional case, we will switch to the notation $f'$ and $f''$ in order to emphasize the difference with $d>1$. Given any $f\in C^2(\mathbb{R}^d\times\mathbb{R}^d,\mathbb{R})$, $\nabla_x f:\mathbb{R}^d\times\mathbb{R}^d\to\mathbb{R}^d$ and $D_x^2 f:\mathbb{R}^d\times\mathbb{R}^d\to M_d(\mathbb{R})$ denote the functions respectively defined by $(\nabla_x f(x,y))_i=\partial_{x_i}f(x,y)$ and $(D_x^2 f(x,y))_{i,j}=\partial_{x_i}\partial_{x_j}f(x,y)$. These notations are naturally extended to $\nabla_y f$, $D_{x,y}^2 f$ and $D_y^2 f$. Finally, for any vector $v\in\mathbb{R}^d$, $v^t$ will refer to the transpose of $v$.

For a measure $\mu$ and a $\mu$-measurable function $f$, we set $\mu(f)=\int f\,d\mu$.

Let us now recall some definitions relative to Large Deviation theory (see Dembo and Zeitouni (2010) for further references on the subject). Let $(E,d)$ denote a metric space. A family of probability measures $(\nu_\varepsilon)_{\varepsilon>0}$ on $E$ satisfies a Large Deviation Principle (shortened as LDP) with speed $r_\varepsilon$ and rate function $I$ if for every open set $O$ and every closed set $F$,

$$\liminf_{\varepsilon\to 0}\, r_\varepsilon\log(\nu_\varepsilon(O))\ \ge\ -\inf_{x\in O} I(x)\quad\text{and}\quad \limsup_{\varepsilon\to 0}\, r_\varepsilon\log(\nu_\varepsilon(F))\ \le\ -\inf_{x\in F} I(x).$$
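A minimal numerical illustration of this definition, under assumptions that are not part of the paper: take $\nu_\varepsilon$ to be the centred Gaussian law with variance $\varepsilon^2$ on $\mathbb{R}$, the normalisation $r_\varepsilon=\varepsilon^2$, and the closed set $F=[a,+\infty)$. The classical Gaussian rate function is $I(x)=x^2/2$, so the scaled log-probability should approach $-a^2/2$. The helper name `scaled_log_tail` is hypothetical.

```python
import math

def scaled_log_tail(eps, a=1.0):
    """Return eps^2 * log P(eps * G >= a) for a standard Gaussian G."""
    tail = 0.5 * math.erfc(a / (eps * math.sqrt(2.0)))
    return eps ** 2 * math.log(tail)

# The values should drift towards -a^2 / 2 = -0.5 as eps decreases.
values = {eps: scaled_log_tail(eps) for eps in (0.5, 0.2, 0.1, 0.05)}
for eps, v in values.items():
    print(f"eps = {eps:4.2f}   eps^2 * log nu_eps([1, +inf)) = {v:.4f}")
```

The convergence is slow because of the polynomial prefactor of the Gaussian tail, which disappears only on the logarithmic scale.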

The function $I$ is said to be good if for any $c\in\mathbb{R}$, the set $\{x\in E,\ I(x)\le c\}$ is compact. In this paper, we will use some classical compactness results in Large Deviation theory. A family of probability measures $(\nu_\varepsilon)_{\varepsilon>0}$ is said to be exponentially tight of order $r_\varepsilon$ if

$$\forall a>0,\quad \exists K_a \text{ compact of } E \text{ such that } \limsup_{\varepsilon\to 0}\, r_\varepsilon\log(\nu_\varepsilon(K_a^c))\le -a.$$

Then, we recall the link between exponential tightness and the Large Deviation Principle (see Feng and Kurtz (2006), chapter 3 for instance).

Proposition 2.1 Let $(S,d)$ be a metric space and $(\nu_\varepsilon)_{\varepsilon\ge 0}$ a sequence of exponentially tight probability measures on the Borel $\sigma$-algebra of $S$ with speed $r_\varepsilon$. Then there exists a subsequence $(\varepsilon_k)_{k\ge 0}$ such that $\varepsilon_k\to 0$ along which the LDP holds with good rate function $I$ and speed $r_{\varepsilon_k}$.

Definition 2.1 Such a subsequence $(\nu_{\varepsilon_k})_{k\ge 1}$ will be called a (LD)-convergent subsequence.

2.2 Averaged gradient diffusions

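Throughout this paper, we denote by $U:\mathbb{R}^d\to\mathbb{R}$ a smooth (at least $C^2$) and coercive function on $\mathbb{R}^d$, i.e.

$$\inf_{x\in\mathbb{R}^d}U(x)>0,\qquad \lim_{|x|\to+\infty}U(x)=+\infty,\qquad\text{and}\qquad \liminf_{|x|\to+\infty}\langle x,\nabla U(x)\rangle>0. \tag{2.1}$$

As announced in the Introduction, with $k(t)=e^{\lambda t}$, we are interested in the stochastic evolution of

$$dX_t^\varepsilon=\varepsilon\,dB_t-\left(\lambda e^{-\lambda t}\int_0^t e^{\lambda s}\,\nabla U(X_s^\varepsilon)\,ds\right)dt,$$

where $\lambda>0$ and $(B_t)_{t\ge 0}$ is a standard $d$-dimensional Brownian motion. The process $(X_t^\varepsilon)_{t\ge 0}$ is not a Markov process but, enlarging the space by defining the auxiliary process $(Y_t^\varepsilon)_{t\ge 0}$ as

$$Y_t^\varepsilon=\lambda e^{-\lambda t}\int_0^t e^{\lambda s}\,\nabla U(X_s^\varepsilon)\,ds,$$

the couple $(Z_t^\varepsilon)_{t\ge 0}:=((X_t^\varepsilon,Y_t^\varepsilon))_{t\ge 0}$ is a Markov process (see Gadat and Panloup (2012) for instance).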

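More precisely, $(Z_t^\varepsilon)_{t\ge 0}$ satisfies:

$$\begin{cases} dX_t^\varepsilon=\varepsilon\,dB_t-Y_t^\varepsilon\,dt,\\ dY_t^\varepsilon=\lambda\big(\nabla U(X_t^\varepsilon)-Y_t^\varepsilon\big)\,dt. \end{cases} \tag{2.2}$$

When necessary, we will denote by $(Z_t^{\varepsilon,z})_{t\ge 0}$ the solution starting from $z\in\mathbb{R}^{2d}$ and by $\mathbb{P}^\varepsilon_z$ the distribution of this process on $C(\mathbb{R}_+,\mathbb{R}^{2d})$. In the sequel, we will also make intensive use of the deterministic system obtained when $\varepsilon=0$ in (2.2): if $(z(t))_{t\ge 0}:=(x(t),y(t))_{t\ge 0}$, the canonical differential system is

$$\dot z(t)=b(z(t))\quad\text{with}\quad b(x,y)=\begin{pmatrix}-y\\ \lambda(\nabla U(x)-y)\end{pmatrix}. \tag{2.3}$$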
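To fix ideas, systems (2.2) and (2.3) can be simulated with a plain Euler-Maruyama scheme. The following is only a sketch in dimension one: the quadratic potential (so that `grad_U(x) = x`), the step size, the horizon and the function name `simulate` are all assumptions made for illustration, and no claim is made about the discretisation used by the authors.

```python
import math
import random

def grad_U(x):
    # Hypothetical one-dimensional potential U(x) = x^2 / 2 + 1, hence grad U(x) = x.
    return x

def simulate(x0, y0, eps, lam=1.0, dt=1e-3, n_steps=20000, seed=0):
    """Euler-Maruyama scheme for dX = eps dB - Y dt, dY = lam (grad U(X) - Y) dt."""
    rng = random.Random(seed)
    x, y = x0, y0
    path = [(x, y)]
    for _ in range(n_steps):
        noise = eps * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x_new = x - y * dt + noise               # noise acts on the X component only
        y_new = y + lam * (grad_U(x) - y) * dt   # Y relaxes towards grad U(X)
        x, y = x_new, y_new
        path.append((x, y))
    return path

# With eps = 0 the scheme follows the deterministic system z' = b(z) of (2.3).
det_path = simulate(x0=2.0, y0=0.0, eps=0.0)
x_end, y_end = det_path[-1]

# A noisy trajectory of (2.2) started from the same point.
noisy_path = simulate(x0=2.0, y0=0.0, eps=0.3)
```

For this quadratic potential and $\lambda=1$, the deterministic trajectory spirals towards the equilibrium $(0,0)$, while the noisy one keeps fluctuating around it; only the $X$ coordinate receives noise, which is the degeneracy referred to in the text.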

2.3 Assumptions

The function $\nabla U$ being not necessarily Lipschitz continuous, we assume throughout the paper that there exists $C>0$ such that for all $x\in\mathbb{R}^d$, $\|D^2U(x)\|\le C\,U(x)$. This assumption ensures the non-explosion (in finite horizon) of $(Z_t^\varepsilon)_{t\ge 0}$ (see Proposition 2.1 of Gadat and Panloup (2012)).

Since $\nabla U$ is locally Lipschitz continuous, existence and uniqueness hold for the solution of (2.2) and $(Z_t^\varepsilon)_{t\ge 0}$ is a Markov process with semi-group denoted by $(P_t^\varepsilon)_{t\ge 0}$ and infinitesimal generator $A^\varepsilon$ defined, for all $f\in C_c^2(\mathbb{R}^d\times\mathbb{R}^d)$, by:

$$A^\varepsilon f(x,y)=-\langle y,\partial_x f\rangle+\lambda\langle\nabla U(x)-y,\partial_y f\rangle+\frac{\varepsilon^2}{2}\,\mathrm{Tr}\big(D_x^2 f\big). \tag{2.4}$$

We first recall some results obtained by Gadat and Panloup (2012) on existence and uniqueness for the invariant distribution of (2.2). To this end, we need to introduce a mean-reverting assumption denoted by (Hmr) and a hypoellipticity assumption (HHypo). The mean-reverting assumption is expressed as follows:

(Hmr): $\lim_{|x|\to+\infty}\langle x,\nabla U(x)\rangle=+\infty$ and $\lim_{|x|\to+\infty}\dfrac{\|D^2U(x)\|}{\langle x,\nabla U(x)\rangle}=0$.

Concerning the second assumption, let us define $E_U$ by

$$E_U=\big\{x\in\mathbb{R}^d,\ \det\big(D^2U(x)\big)\neq 0\big\}, \tag{2.5}$$

and denote by $M_U$ the complementary manifold $M_U=\mathbb{R}^d\setminus E_U$. Assumption (HHypo) is then defined by:

(HHypo): $U$ is $C^\infty(\mathbb{R}^d,\mathbb{R})$, $\lim_{|x|\to+\infty}\dfrac{U(x)}{|x|}=+\infty$ and $\dim(M_U)\le d-1$.

The above assumption implies the uniqueness of the invariant distribution: the smoothness of $U$ and the fact that $\dim(M_U)\le d-1$ ensure that the Hörmander condition is satisfied on a sufficiently large subspace of $\mathbb{R}^{2d}$, whereas the fact that $|x|^{-1}U(x)\to+\infty$ as $|x|\to+\infty$ (which implies that $U$ grows at least linearly) is needed for the topological irreducibility of the semi-group (see Gadat and Panloup (2012) for details). Under these assumptions, we deduce the following proposition from Theorems 2.3 and 3.2 of Gadat and Panloup (2012):

Proposition 2.2 Assume (Hmr). Then, for all $\varepsilon>0$, the solution of (2.2) admits an invariant distribution. Furthermore, if (HHypo) holds, the invariant distribution is unique and admits a $\lambda_{2d}$-a.s. positive density. We denote by $\nu_\varepsilon$ this invariant distribution.

Note that (Hmr) implies Assumption (H1) of Gadat and Panloup (2012) in the particular case $\sigma=I_d$ and $r=\lambda$.

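Our goal is now to obtain a Large Deviation Principle for $(\nu_\varepsilon)_{\varepsilon>0}$ when $\varepsilon\to 0$. To this end, we need to introduce some mean-reverting assumptions that are more constraining than (Hmr):

(HQ+): There exist $\rho\in(0,1)$, $C>0$, $\beta\in\mathbb{R}$ and $\alpha>0$ such that

(i) $-\langle x,\nabla U(x)\rangle\le\beta-\alpha U(x)$, $\forall x\in\mathbb{R}^d$,

(ii) $|\nabla U|^2\le C\big(1+U^{2(1-\rho)}\big)$ and $\lim_{|x|\to+\infty}\dfrac{\|D^2U(x)\|}{U(x)}=0$.

(HQ−): There exist $a\in(1/2,1]$, $C>0$, $\beta\in\mathbb{R}$ and $\alpha>0$ such that

(i) $-\langle x,\nabla U(x)\rangle\le\beta-\alpha|x|^{2a}$, $\forall x\in\mathbb{R}^d$,

(ii) $|\nabla U|^2\le C(1+U)$ and $\sup_{x\in\mathbb{R}^d}\|D^2U(x)\|<+\infty$.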

Remark 2.1 Assumptions (HQ+) and (HQ−) correspond respectively to super-quadratic and sub-quadratic potentials. For instance, assume that $U(x)=(1+|x|^2)^p$. When $p\ge 1$, (HQ+) holds with $\rho\in\big(0,\tfrac{1}{2p}\big)$ and if $p\in(1/2,1]$, (HQ−) holds with $a=p$. These assumptions are adapted to a large class of potentials $U$ with polynomial growth (more than linear). However, they do not cover potentials with exponential growth ((HQ+)(ii) is no longer fulfilled).
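The range of $\rho$ in the example of Remark 2.1 can be recovered by a direct computation. The following lines are a sketch of the gradient bound in (HQ+)(ii) only, for $U(x)=(1+|x|^2)^p$ with $p\ge 1$; the remaining conditions are not checked here.

```latex
% Gradient bound of (HQ+)(ii) for U(x) = (1+|x|^2)^p, p >= 1 (sketch).
\begin{align*}
\nabla U(x) &= 2p\,x\,(1+|x|^2)^{p-1},\\
|\nabla U(x)|^2 &= 4p^2\,|x|^2\,(1+|x|^2)^{2p-2}
   \;\le\; 4p^2\,(1+|x|^2)^{2p-1} \;=\; 4p^2\,U(x)^{\,2-\frac{1}{p}}.
\end{align*}
% Since U >= 1, U^{2-1/p} <= U^{2(1-\rho)} as soon as 2 - 1/p <= 2(1-\rho),
% that is, as soon as \rho <= 1/(2p).
```

In particular every $\rho\in\big(0,\tfrac{1}{2p}\big)$ is admissible for this bound, in agreement with the remark.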

2.4 Main results

2.4.1 Exponential tightness and Hamilton Jacobi equation

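Let $\varphi\in H$. When existence holds, we denote respectively by $z_\varphi:=(z_\varphi(t))_{t\ge 0}$ and by $\tilde z_\varphi:=(\tilde z_\varphi(t))_{t\ge 0}$ a solution of

$$\dot z_\varphi=b(z_\varphi)+\begin{pmatrix}\dot\varphi\\ 0\end{pmatrix}\quad\text{and}\quad \dot{\tilde z}_\varphi=-b(\tilde z_\varphi)+\begin{pmatrix}\dot\varphi\\ 0\end{pmatrix}. \tag{2.6}$$

Note that (HQ+) and (HQ−) ensure the finite-time non-explosion of $z_\varphi$ and $\tilde z_\varphi$ for all $\varphi\in H$ (see e.g. Equation (3.4)). Thus, since $\nabla U$ is locally Lipschitz continuous, for all $z\in\mathbb{R}^{2d}$, the solutions starting from $z$, respectively denoted by $z_\varphi(z,\cdot)$ and $\tilde z_\varphi(z,\cdot)$, exist and are unique.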

Finally, we will also need the following assumption.

(HD): The set of critical points $(x_i)_{i=1,\dots,\ell}$ of $U$ is finite and each $D^2U(x_i)$ is invertible.

This assumption will be necessary to obtain some uniqueness property. We are now able to express our first main result.

Theorem 2.1 Assume that (HHypo) holds and that either (HQ+) or (HQ−) is satisfied. Then,

(i) The family $(\nu_\varepsilon)_{\varepsilon\in(0,1]}$ is exponentially tight on $\mathbb{R}^{2d}$ with speed $\varepsilon^{-2}$.

(ii) Let $(\nu_{\varepsilon_n})_{n\ge 1}$ be a (LD)-convergent subsequence and denote by $W$ the associated (good) rate function. Then, $W$ satisfies for all $t\ge 0$ and any $z\in\mathbb{R}^d\times\mathbb{R}^d$:

$$W(z)=\inf_{\varphi\in H}\left[\frac12\int_0^t|\dot\varphi(s)|^2\,ds+W\big(\tilde z_\varphi(z,t)\big)\right]. \tag{2.7}$$

(iii) Furthermore, assume that (HD) is fulfilled. Then

$$W(z)=\min_{1\le i\le\ell}\ \inf_{\substack{\varphi\in H\\ \tilde z_\varphi(z,+\infty)=z_i}}\left[\frac12\int_0^{\infty}|\dot\varphi(s)|^2\,ds+W(z_i)\right], \tag{2.8}$$

where $\tilde z_\varphi(z,+\infty):=\lim_{t\to+\infty}\tilde z_\varphi(z,t)$ (when it exists) and $z_i=(x_i,0)$ for all $i=1,\dots,\ell$.

Equation (2.7) satisfied by $W$ may be seen as a Hamilton-Jacobi equation (see e.g. Barles (1994) for further details on such equations).

2.4.2 Freidlin and Wentzell estimates

Let us stress that the main problem in the expression (2.8) is that the uniqueness is only available conditionally on the values of $W(z_i)$, $i=1,\dots,\ell$. Thus, in order to obtain a LDP, we now need to show that this uniqueness is not conditional, i.e. that the values of $W(z_i)$ are uniquely determined. We are going to obtain this result by following the Freidlin and Wentzell (1979) approach. To this end, we first recall some useful elements of Freidlin and Wentzell theory.

{i}-Graphs. Following the notations of Theorem 2.1, we denote by $\{z_1,\dots,z_\ell\}$ the finite set of equilibria and we recall here the definition of {i}-Graphs on this set. For any $i\in\{1,\dots,\ell\}$, we denote by $G(i)$ the set of oriented graphs with vertices $\{z_1,\dots,z_\ell\}$ that satisfy the three following properties.

(i) Each state $z_j\neq z_i$ is the initial point of exactly one oriented edge in the graph.

(ii) The graph does not have any cycle.

(iii) For any $z_j\neq z_i$, there exists a (unique) path composed of oriented edges starting at state $z_j$ and leading to the state $z_i$.

$L^2$ control cost between equilibria. We now define for any couple of points $(\xi_1,\xi_2)\in(\mathbb{R}^d\times\mathbb{R}^d)^2$ the minimal $L^2$ cost to go from $\xi_1$ to $\xi_2$ within a finite time $t$ as

$$I_t(\xi_1,\xi_2)=\inf_{\substack{\varphi\in H\\ z_\varphi(\xi_1,t)=\xi_2}}\ \frac12\int_0^t|\dot\varphi(s)|^2\,ds,$$

and also the minimal $L^2$ cost to go from $\xi_1$ to $\xi_2$ within any time:

$$I(\xi_1,\xi_2)=\inf_{t\ge 0}I_t(\xi_1,\xi_2).$$

The function $I$ is usually called the quasipotential. With these definitions, one can obtain the Freidlin and Wentzell estimate, which gives another representation of $W(z_i)$, $i=1,\dots,\ell$.

Theorem 2.2 Assume that (HHypo) holds and that either (HQ+) or (HQ−) is satisfied. If (HD) holds, then, for any (LD)-convergent subsequence $(\nu_{\varepsilon_n})_{n\ge 1}$, the associated rate function $W$ satisfies:

$$\forall i\in\{1,\dots,\ell\},\qquad W(z_i)=\mathcal{W}(z_i)-\min_{j\in\{1,\dots,\ell\}}\mathcal{W}(z_j),$$

where

$$\forall i\in\{1,\dots,\ell\},\qquad \mathcal{W}(z_i):=\min_{IG\in G(i)}\ \sum_{(z_m\to z_n)\in IG}I(z_m,z_n). \tag{2.9}$$

The next corollary follows immediately from Theorem 2.1 and Theorem 2.2.

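Corollary 1 Assume that (HHypo) holds and that either (HQ+) or (HQ−) is satisfied. If (HD) holds, $(\nu_\varepsilon)$ satisfies a large deviation principle with speed $\varepsilon^{-2}$ and good rate function $W$ such that

$$W(z)=\min_{1\le i\le\ell}\ \inf_{\substack{\varphi\in H\\ \tilde z_\varphi(z,+\infty)=z_i}}\left[\frac12\int_0^{\infty}|\dot\varphi(s)|^2\,ds+\mathcal{W}(z_i)\right]-\min_{j\in\{1,\dots,\ell\}}\mathcal{W}(z_j),$$

where $\mathcal{W}(z_i)$ is given by (2.9).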
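When the number of equilibria is small, the minimisation over {i}-graphs in (2.9) can be carried out by brute force. The sketch below enumerates, for a given root, every choice of one outgoing edge per non-root vertex and keeps those from which all vertices reach the root. The cost matrices are hypothetical numbers chosen for illustration (in the paper the entries would be the quasipotential values $I(z_m,z_n)$, which are not computed here), and the function names are assumptions.

```python
from itertools import product

def is_i_graph(succ, root, n):
    """succ[j] is the target of the unique edge leaving j (j != root).
    Return True when every vertex reaches the root, which rules out cycles."""
    for j in range(n):
        seen, k = set(), j
        while k != root:
            if k in seen:      # trapped in a cycle that avoids the root
                return False
            seen.add(k)
            k = succ[k]
    return True

def graph_cost(cost, root):
    """Minimum over {root}-graphs of the sum of cost[m][n] over edges m -> n."""
    n = len(cost)
    others = [j for j in range(n) if j != root]
    best = float("inf")
    for targets in product(range(n), repeat=len(others)):
        if any(j == t for j, t in zip(others, targets)):
            continue           # self-loops are not allowed
        succ = dict(zip(others, targets))
        if is_i_graph(succ, root, n):
            best = min(best, sum(cost[j][succ[j]] for j in others))
    return best

# Two equilibria: the only {0}-graph is the single edge 1 -> 0.
cost2 = [[0, 3],
         [1, 0]]
# Three equilibria with hypothetical transition costs.
cost3 = [[0, 2, 5],
         [1, 0, 1],
         [4, 3, 0]]
print(graph_cost(cost2, 0), graph_cost(cost2, 1), graph_cost(cost3, 0))
```

With two equilibria the minimisation is trivial, since each $G(i)$ contains a single graph; this is the situation of the double-well case treated below. The enumeration is exponential in $\ell$ and is only meant to make the definition concrete.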

Case of a double-well non-convex potential. In the sequel, we are interested in the location of the minimum of $W$. More precisely, we expect that this minimum is located on the set of global minima of $U$. Using Equation (2.8), this point is clear when $U$ is a strictly convex potential.

Regarding now the non-convex case, the situation is more involved. Thus, we only focus on the double-well one-dimensional case. Without loss of generality, we assume that $U$ has two local minima denoted by $x_1$ and $x_2$ with

$$x_1<x^\star<x_2\quad\text{and}\quad U(x_1)<U(x_2), \tag{2.10}$$

where $x^\star$ is the unique local maximum between $x_1$ and $x_2$. We obtain the following result:

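Theorem 2.3 Assume the hypotheses of Corollary 1 and that $U$ satisfies (2.10). Then,

(i) $\mathcal{W}$ satisfies

$$\mathcal{W}(z_1)=I(z_2,z_1)\le 2\,[U(x^\star)-U(x_2)].$$

(ii) For all $\alpha\in[0,2]$, there exists an explicit constant $m_\lambda(\alpha)$ such that

$$\|U''\|_\infty\le m_\lambda(\alpha)\ \Longrightarrow\ \mathcal{W}(z_2)=I(z_1,z_2)\ge\alpha\,[U(x^\star)-U(x_1)].$$

(iii) As a consequence, if $U$ satisfies

$$\|U''\|_\infty<m_\lambda\!\left(\frac{2\,[U(x^\star)-U(x_2)]}{U(x^\star)-U(x_1)}\right),$$

then $\mathcal{W}(z_1)<\mathcal{W}(z_2)$, and finally $(\nu_\varepsilon)_{\varepsilon\ge 0}$ weakly converges towards $\delta_{z_1}$ as $\varepsilon\to 0$.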
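The way item (iii) combines items (i) and (ii) can be sketched as follows, writing $x^\star$ for the local maximum of $U$ between $x_1$ and $x_2$ and $\mathcal{W}$ for the graph quantity of (2.9). This is only a heuristic reading of the statement: it yields the non-strict comparison, the strict one resting on the strict bound on $\|U''\|_\infty$ whose proof is not reproduced here.

```latex
% Sketch: how (iii) follows from (i) and (ii) of Theorem 2.3.
\begin{align*}
\alpha^\star &:= \frac{2\,[U(x^\star)-U(x_2)]}{U(x^\star)-U(x_1)} \in (0,2)
   \qquad\text{since } U(x_1) < U(x_2) < U(x^\star),\\
\mathcal{W}(z_2) &\ge \alpha^\star\,[U(x^\star)-U(x_1)]
   \;=\; 2\,[U(x^\star)-U(x_2)] \;\ge\; \mathcal{W}(z_1).
\end{align*}
% First inequality: item (ii) applied with alpha = alpha^star.
% Last inequality: item (i).
```

Since $W(z_i)=\mathcal{W}(z_i)-\min_j\mathcal{W}(z_j)$, the rate function then vanishes at $z_1=(x_1,0)$, the equilibrium associated with the global minimum of $U$.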

In the next sections, we prove the above statements. Note that throughout the rest of the paper, $C$ will stand for any non-explicit constant. Note also that, except in Section 5, we will prove all the results with $\lambda=1$ for the sake of convenience (one can deduce similar convergences with small modifications for any $\lambda>0$).

3 Large Deviation Principle for invariant measures $(\nu_\varepsilon)_{\varepsilon\in(0,1]}$

This section describes the proof of Theorem 2.1, which contains two important parts. The first one concerns the exponential tightness of the invariant measures $(\nu_\varepsilon)_{\varepsilon\in(0,1]}$ and the second one is a functional equality for any good rate function associated with any (LD)-convergent subsequence $(\nu_{\varepsilon_k})_{k\ge 0}$.

We first establish a LDP for $(Z^\varepsilon)_{\varepsilon>0}$ on $C(\mathbb{R}_+,\mathbb{R}^{2d})$ (the space of continuous functions from $\mathbb{R}_+$ to $\mathbb{R}^{2d}$) and then we detail how one can derive the exponential tightness property of $(\nu_\varepsilon)_{\varepsilon\in(0,1]}$ using suitable Lyapunov functions for our dynamical system. Finally, we show that a functional equality such as (2.7) holds.

3.1 Large Deviation Principle for (Zε)ε>0

The next lemma establishes a LDP for trajectories of ((Ztε)t≥0)ε>0 within a finite time.

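Lemma 3.1 Assume (HQ+) or (HQ−). Let $z\in\mathbb{R}^{2d}$ and $(z_\varepsilon)_{\varepsilon>0}$ be a net of $\mathbb{R}^{2d}$ such that $z_\varepsilon\to z$ as $\varepsilon\to 0$. Then, $(Z^{\varepsilon,z_\varepsilon})_{\varepsilon>0}$ satisfies a LDP on $C(\mathbb{R}_+,\mathbb{R}^{2d})$ (endowed with the topology of uniform convergence on compact sets) with speed $\varepsilon^{-2}$. The corresponding (good) rate function $I_z$ is defined for all absolutely continuous $(\mathbf{z}(t))_{t\ge 0}=(\mathbf{x}(t),\mathbf{y}(t))_{t\ge 0}$ by

$$I_z\big((\mathbf{z}(t))_{t\ge 0}\big)=\inf_{\varphi\in H,\ z_\varphi(z,\cdot)=\mathbf{z}(\cdot)}\ \frac12\int_0^{\infty}|\dot\varphi(s)|^2\,ds=\frac12\int_0^{\infty}|\dot{\mathbf{x}}(s)+\mathbf{y}(s)|^2\,ds,$$

where $z_\varphi(z,\cdot)=\mathbf{z}(\cdot)$ means that for all $t\ge 0$, $z_\varphi(z,t)=\mathbf{z}(t)$. In particular, for all $t\ge 0$, $(P_t^\varepsilon(z_\varepsilon,\cdot))_{\varepsilon>0}$ satisfies a LDP with speed $\varepsilon^{-2}$. The corresponding rate function $I_t(z,\cdot)$ is defined for all $z,z'\in\mathbb{R}^{2d}$ by:

$$I_t(z,z')=\inf_{\mathbf{z}(\cdot)\in Z_t(z,z')}I_z(\mathbf{z}(\cdot)), \tag{3.1}$$

where $Z_t(z,z')$ denotes the set of absolutely continuous functions $\mathbf{z}(\cdot)$ such that $\mathbf{z}(0)=z$ and $\mathbf{z}(t)=z'$. Furthermore, the function $I_t$ can be written as

$$I_t(z,z')=\inf_{\varphi\in H,\ z_\varphi(z,t)=z'}\ \frac12\int_0^t|\dot\varphi(s)|^2\,ds.$$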

Remark 3.1 Note that such a result is quite classical when $z_\varepsilon=z$ and when the coefficients are Lipschitz continuous functions (see e.g. Azencott (1980)). Here, we have to handle the possibly super-linear growth of the drift vector field $b$ (and also the degeneracy of the diffusion).

Proof : We wish to apply Theorem 5.2.12 of Puhalskii (2001). To this end, we need to prove the following four points:

• Uniqueness for the maxingale problem: this step is an identification of the (potential) LD-limits of $(Z^\varepsilon)_{\varepsilon>0}$. More precisely, we need to prove that the idempotent probability $\pi_z(\cdot):=\exp(-I_z(\cdot))$ is the unique solution to the maxingale problem $(z,G)$ where $G:\mathbb{R}^{2d}\times C(\mathbb{R}_+,\mathbb{R}^{2d})\to C(\mathbb{R}_+,\mathbb{R}^{2d})$ is given by:

$$\forall\lambda=(\lambda_1,\lambda_2)\in\mathbb{R}^d\times\mathbb{R}^d,\ \forall\mathbf{z}\in C(\mathbb{R}_+,\mathbb{R}^{2d}),\ \forall t\ge 0,\qquad G_t(\lambda,\mathbf{z})=\int_0^t b(\mathbf{z}(s))\,ds+\frac12\lambda_1^2.$$

The fact that $\pi_z$ solves the maxingale problem follows from Theorem 3.1 and Lemma 3.2 of Puhalskii (2004). Setting $E(x,y)=U(x)+\frac{|y|^2}{2}$, note that Lemma 3.2 can be applied since $\langle\nabla E(x,y),b(x,y)\rangle\le 0$ (see condition (3.6a) of Puhalskii (2004)). Furthermore, since $b$ is locally Lipschitz continuous, for all $\varphi\in H$, the ordinary differential equation

$$\dot z=b(z)+\begin{pmatrix}\dot\varphi\\ 0\end{pmatrix}$$

has a unique solution. Thus, uniqueness for the maxingale problem is a consequence of the second point of Lemma 2.6.17 of Puhalskii (2001) and of Theorem 3.1 of Puhalskii (2004).

• Continuity condition: some continuity conditions must be satisfied for the characteristics of the diffusion. In fact, since the diffusive component is constant, it is enough to focus on the drift component and to show that for all $t\ge 0$ the function $\phi_t$ from $C(\mathbb{R}_+,\mathbb{R}^{2d})$ to $\mathbb{R}^{2d}$ defined by $\phi_t(\mathbf{z})=\int_0^t b(\mathbf{z}(s))\,ds$ is a continuous function of $\mathbf{z}$. Since $b$ is Lipschitz continuous on every compact set of $\mathbb{R}^{2d}$, this point is obvious.

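• Local majoration condition: in this step, we have to check that for all $M>0$, there exists an increasing continuous map $\bar F_M:\mathbb{R}_+\to\mathbb{R}$ such that

$$\forall\, 0\le s\le t,\qquad \sup_{\mathbf{z}\in C(\mathbb{R}_+,\mathbb{R}^{2d}),\ \|\mathbf{z}\|_\infty\le M}\big(\phi_t(\mathbf{z})-\phi_s(\mathbf{z})\big)\le\bar F_M(t)-\bar F_M(s),$$

with $\|\mathbf{z}\|_\infty=\sup_{t\ge 0}|\mathbf{z}(t)|$. Since $b$ is locally bounded, this point is true with

$$\bar F_M(t)=\sup_{z\in\mathbb{R}^{2d},\ |z|\le M}|b(z)|\,t.$$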

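• Non-Explosion condition (NE): The Non-Explosion condition holds if

(i) the function $\pi_z$ defined by $\pi_z(\mathbf{z}):=\exp(-I_z(\mathbf{z}))$ is upper-compact,

(ii) for all $t\ge 0$ and for all $a\in(0,1]$, the set

$$\bigcup_{s\le t}\Big\{\sup_{u\le s}|\mathbf{z}(u)|,\ \pi_{z,s}(\mathbf{z})\ge a\Big\}$$

is bounded, where

$$\forall z\in\mathbb{R}^{2d},\ \forall t\ge 0,\qquad \pi_{z,t}(\mathbf{z})=\exp\left(-\inf_{\varphi\in H,\ z_\varphi(z,\cdot)=\mathbf{z}(\cdot)}\ \frac12\int_0^t|\dot\varphi(s)|^2\,ds\right).$$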

Point (i): The property that $\pi_z$ is upper-compact means that for all $a\in(0,1]$, the set $K_a:=\{\mathbf{z},\ \pi_z(\mathbf{z})\ge a\}$ is a compact set (for the topology of uniform convergence on compact sets). For this, we use the Ascoli Theorem. We first show the boundedness property for the paths of $K_a$. From the definition of $\pi_z$, we observe that for any $\mathbf{z}$ of $K_a$, there exists a control $\varphi\in H$ such that $\mathbf{z}=z_\varphi$ and such that

$$\int_0^{\infty}|\dot\varphi(s)|^2\,ds\le -2\log a+1. \tag{3.2}$$

Using the above defined function $E$, one checks that for all $p>0$,

$$\frac{d}{dt}\big(E^p(\mathbf{z}(t))\big)=p\,E(\mathbf{z}(t))^{p-1}\Big(-|y(t)|^2+\langle\nabla U(x(t)),\dot\varphi(t)\rangle\Big)\le C\Big(E(\mathbf{z}(t))^p+E(\mathbf{z}(t))^{2p-2}|\nabla U(x(t))|^2+|\dot\varphi(t)|^2\Big).$$

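Under (HQ+) or (HQ−), we have respectively $|\nabla U|^2=O(U^{2-2\rho})$ or $|\nabla U|^2=O(U)$. Thus, applying the inequalities with $\bar p=\rho$ (resp. $\bar p=1$) under (HQ+) (resp. (HQ−)) yields:

$$\frac{d}{dt}\big(E^{\bar p}(\mathbf{z}(t))\big)\le C\Big(E^{\bar p}(\mathbf{z}(t))+|\dot\varphi(t)|^2\Big), \tag{3.3}$$

and the Gronwall Lemma implies that

$$\forall t>0,\ \exists C_t>0,\ \forall s\in[0,t],\qquad E^{\bar p}(\mathbf{z}(s))\le C_t\left(E^{\bar p}(z)+C\int_0^s|\dot\varphi(u)|^2\,du\right). \tag{3.4}$$

Finally, Equation (3.2) combined with (3.4) and the fact that $\lim_{|z|\to+\infty}E(z)=+\infty$ yields

$$\sup_{\mathbf{z}\in K_a}\ \sup_{s\in[0,t]}|\mathbf{z}(s)|<+\infty. \tag{3.5}$$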

Now, let us prove that $K_a$ is equicontinuous: for all $t>0$, $u,v\in[0,t]$ with $u\le v$ and $\mathbf{z}\in K_a$, we know that for a suitable constant $\tilde C_{t,a,z}$, the controlled trajectories of $K_a$ are a priori bounded: $\sup_{s\in[0,t]}|\mathbf{z}(s)|\le\tilde C_{t,a,z}$. Since $b$ is continuous, the Cauchy-Schwarz Inequality yields:

$$|\mathbf{z}(v)-\mathbf{z}(u)|\le\int_u^v|b(\mathbf{z}(s))|\,ds+\int_u^v|\dot\varphi(s)|\,ds\le\sup_{|z|\le\tilde C_{t,a,z}}|b(z)|\,(v-u)+\sqrt{1-2\log a}\,\sqrt{v-u}.$$

The two conditions of the Ascoli Theorem being satisfied, the compactness of $K_a$ follows.

Point (ii): We do not detail this item, which easily follows from the controls established in the proof of (i) (see (3.4)). Finally, the other conditions of Theorem 5.2.12 of Puhalskii (2001) being trivially satisfied, the lemma follows.

3.2 Exponential tightness (Proof of (i) of Theorem 2.1)

In the next proposition, we investigate the exponential tightness of $(\nu_\varepsilon)_{\varepsilon\in(0,1]}$. Our approach consists in showing sufficiently sharp estimates for the hitting times of the process $(Z_t^\varepsilon)_{t\ge 0}$.

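Proposition 3.1 Assume (HQ+) or (HQ−). Then there exists a compact set $B$ of $\mathbb{R}^{2d}$ such that the first hitting time $\tau_\varepsilon$ of $B$, defined as $\tau_\varepsilon=\inf\{t>0,\ Z_t^\varepsilon\in B\}$, satisfies the three following properties:

(i) For every compact set $K$ of $\mathbb{R}^{2d}$,

$$\limsup_{\varepsilon\to 0}\ \sup_{z\in K}\ E_z\big[(\tau_\varepsilon)^2\big]<\infty. \tag{3.6}$$

(ii) There exists $\delta>0$ such that for every compact set $K$ of $\mathbb{R}^{2d}$,

$$\limsup_{\varepsilon\to 0}\ \sup_{z\in K}\ \sup_{t\ge 0}\ \Big(E_z\Big[\big|Z^\varepsilon_{t\wedge\tau_\varepsilon}\big|^{\delta/\varepsilon^2}\Big]\Big)^{\varepsilon^2}<+\infty. \tag{3.7}$$

(iii) For every compact set $K$ of $\mathbb{R}^{2d}$ such that $K\cap B=\emptyset$,

$$\liminf_{\varepsilon\to 0}\ \inf_{z\in K}\ E_z[\tau_\varepsilon]>0. \tag{3.8}$$

As a consequence, the family of invariant distributions $(\nu_\varepsilon)_{\varepsilon\in(0,1]}$ is exponentially tight.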

The conclusion of the above proposition follows directly from Lemma 7 of Puhalskii (2003). A fundamental step of the proof of Proposition 3.1 is the next lemma, which shows some mean-reverting properties for the process (with some constants that do not depend on $\varepsilon$). Its technical proof is postponed to the appendix. Note that this lemma uses a key Lyapunov function $V$ which is rather non-standard due to the kinetic form of the coupled process.

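Lemma 3.2 Assume (HQ+) or (HQ−) and let $V:\mathbb{R}^{2d}\to\mathbb{R}$ be defined by

$$V(x,y)=U(x)+\frac{|y|^2}{2}+m\left(\frac{|x|^2}{2}-\langle x,y\rangle\right),$$

with $m\in(0,1)$. For $p>0$, $\delta>0$ and $\varepsilon>0$, set

$$\psi_\varepsilon(x,y)=\exp\left(\frac{\delta V^p(x,y)}{\varepsilon^2}\right).$$

Then, if $p\in(0,1)$ under (HQ+) and $p\in(1-a,a)$ under (HQ−) and $\delta$ is a positive number, there exist $\alpha,\beta,\alpha',\beta'$ positive such that for all $(x,y)\in\mathbb{R}^{2d}$ and $\varepsilon\in(0,1]$,

$$A^\varepsilon V^p(x,y)\le\beta-\alpha V^{\bar p}(x,y)\quad\text{and} \tag{3.9}$$

$$A^\varepsilon\psi_\varepsilon(x,y)\le\frac{\delta}{\varepsilon^2}\,\psi_\varepsilon(x,y)\big(\beta'-\alpha'V^{\bar p}(x,y)\big), \tag{3.10}$$

where $A^\varepsilon$ is the infinitesimal generator of $(X_t^\varepsilon,Y_t^\varepsilon)$ defined in (2.4) and where

$$\bar p=\begin{cases}p & \text{under (HQ+)},\\ p+a-1 & \text{under (HQ$-$)}.\end{cases}$$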
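The role of the restriction $m\in(0,1)$ in the definition of $V$ can be seen by completing the square. The identity below is an elementary computation added as a sketch; it is not quoted from the paper.

```latex
% Completing the square in the quadratic part of V (elementary check).
\begin{align*}
\frac{|y|^2}{2} + m\Big(\frac{|x|^2}{2} - \langle x,y\rangle\Big)
   &= \frac12\,|y-mx|^2 + \frac{m(1-m)}{2}\,|x|^2,\\
V(x,y) &= U(x) + \frac12\,|y-mx|^2 + \frac{m(1-m)}{2}\,|x|^2 .
\end{align*}
% For m in (0,1) both quadratic terms are nonnegative and the quadratic part
% is positive definite, so V >= U > 0 and V(x,y) -> +infinity as |(x,y)| -> +infinity.
```

This is the kind of lower bound that makes $V$ usable as a Lyapunov function despite the cross term $-m\langle x,y\rangle$.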

Proof of Proposition 3.1: For the sake of simplicity, we omit the $\varepsilon$ dependence and write $(X_t,Y_t)$ instead of $(X_t^\varepsilon,Y_t^\varepsilon)$.

• Proof of (i): We use a Lyapunov method to bound the second moment of the hitting time $\tau_\varepsilon$. Let $p\in(0,1)$. By the Itô formula, we have

$$\frac{V^p(X_t,Y_t)}{1+t}=V^p(x,y)+\int_0^t\left(-\frac{V^p(X_s,Y_s)}{(1+s)^2}+\frac{A^\varepsilon V^p(X_s,Y_s)}{1+s}\right)ds+\varepsilon M_t, \tag{3.11}$$

where $(M_t)$ is the local martingale defined by

$$M_t=\int_0^t\frac{p\,V^{p-1}(X_s,Y_s)}{1+s}\,\langle\nabla U(X_s)+m(X_s-Y_s),dB_s\rangle. \tag{3.12}$$

Since $V$ is a positive function, we have that

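$$\frac{1}{\varepsilon^2}\int_0^t\frac{-A^\varepsilon V^p(X_s,Y_s)}{1+s}\,ds-\frac12\Big\langle\frac{M}{\varepsilon},\frac{M}{\varepsilon}\Big\rangle_t\le\frac{1}{\varepsilon^2}V^p(x,y)+\frac{M_t}{\varepsilon}-\frac12\Big\langle\frac{M}{\varepsilon},\frac{M}{\varepsilon}\Big\rangle_t. \tag{3.13}$$

Note that in the above expression, the martingale $(M_t/\varepsilon)_{t\ge 0}$ has been compensated by its stochastic bracket in order to use further exponential martingale properties. The l.h.s. of (3.13) satisfies

$$\frac{1}{\varepsilon^2}\int_0^t\frac{-A^\varepsilon V^p(X_s,Y_s)}{1+s}\,ds-\frac12\Big\langle\frac{M}{\varepsilon},\frac{M}{\varepsilon}\Big\rangle_t=\frac{1}{\varepsilon^2}\int_0^t\frac{1}{1+s}\left(-A^\varepsilon V^p(X_s,Y_s)-\frac{p^2V^{2p-2}(X_s,Y_s)}{2(1+s)}\,|\nabla U(X_s)+m(X_s-Y_s)|^2\right)ds\ \ge\ \frac{1}{\varepsilon^2}\int_0^t\frac{H_{p,\varepsilon}(X_s,Y_s)}{1+s}\,ds,$$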

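with $H_{p,\varepsilon}(x,y)=-A^\varepsilon V^p(x,y)-p^2V^{2p-2}(x,y)\,|\nabla U(x)+m(x-y)|^2$. Then, a localization of $(M_t)$ combined with the Fatou Lemma yields for every stopping time $\tau$

$$E\left[\exp\left(\frac{1}{\varepsilon^2}\int_0^{t\wedge\tau}\frac{H_{p,\varepsilon}(X_s,Y_s)}{1+s}\,ds\right)\right]\le\exp\left(\frac{1}{\varepsilon^2}V^p(x,y)\right).$$

The final step relies on the fact that there exist $p\in(0,1)$ and $M_1>0$ such that:

$$\forall(x,y)\in\bar B(0,M_1)^c\ \text{and}\ \forall\varepsilon\in(0,1],\qquad H_{p,\varepsilon}(x,y)\ge 2. \tag{3.14}$$

Let us prove the above inequality under condition (HQ+) or (HQ−). First, since $m\in(0,1)$, one can check that there exists $C>0$ such that

$$\forall(x,y)\in\mathbb{R}^{2d},\qquad |x|^2+|y|^2\le C\,V(x,y). \tag{3.15}$$

As a consequence, we have

$$\lim_{|(x,y)|\to+\infty}V(x,y)=+\infty. \tag{3.16}$$

Now, owing to the assumptions on $\nabla U$, it follows that

$$V^{2p-2}(x,y)\,|\nabla U(x)+m(x-y)|^2=\begin{cases}O\big(V^{2(p-\rho)}(x,y)\big)+O\big(V^{2p-1}(x,y)\big) & \text{under (HQ+)},\\ O\big(V^{2p-1}(x,y)\big) & \text{under (HQ$-$)}.\end{cases}$$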

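From now on, assume that

$$\begin{cases}0<p<2\rho\wedge 1 & \text{under (HQ+)},\\ 1-a<p<a & \text{under (HQ$-$)}.\end{cases} \tag{3.17}$$

By Lemma 3.2, we then obtain that for all $(x,y)\in\mathbb{R}^{2d}$ and $\varepsilon\in(0,1]$,

$$H_{p,\varepsilon}(x,y)\ge-\beta+\alpha V^{\bar p}(x,y)-O(V^{2p-1}),$$

where $\bar p$ is defined in Lemma 3.2. Under (3.17), one checks that $2p-1<\bar p$ and then, uniformly in $\varepsilon$,

$$\lim_{|(x,y)|\to+\infty}H_{p,\varepsilon}(x,y)=+\infty,$$

and (3.14) follows. Next, we consider (3.6) with $\tau$ being $\tau_\varepsilon=\inf\{t\ge 0,\ Z_t^\varepsilon\in\bar B(0,M_1)\}$, where $M_1$ is such that (3.14) holds. We then have

$$E\left[\exp\left(\frac{1}{\varepsilon^2}\int_0^{t\wedge\tau_\varepsilon}\frac{2}{1+s}\,ds\right)\right]\le E\left[\exp\left(\frac{1}{\varepsilon^2}\int_0^{t\wedge\tau_\varepsilon}\frac{H_{p,\varepsilon}(X_s,Y_s)}{1+s}\,ds\right)\right]\le\exp\left(\frac{V^p(x,y)}{\varepsilon^2}\right).$$

Computing the integral and using the Fatou Lemma, we get

$$E_{(x,y)}\Big[(1+\tau_\varepsilon)^{2/\varepsilon^2}\Big]\le\exp\left(\frac{1}{\varepsilon^2}V^p(x,y)\right).$$

The Jensen Inequality applied to $x\mapsto x^{1/\varepsilon^2}$ yields that for every $(x,y)\in\mathbb{R}^{2d}$ and for all $\varepsilon\in(0,1]$,

$$E_{(x,y)}\big[(1+\tau_\varepsilon)^2\big]\le\exp\big(V^p(x,y)\big).$$

The first statement follows using that $V^p$ is locally bounded.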

• Proof of(ii): Thanks to (3.15), we have for all p >0 and for |(x, y)|large enough, ln(|(x, y)|)≤ 1

2ln(CV(x, y))≤Vp(x, y). (3.18) Multiplying byδ/ε2, this inequality suggests the computation of

E

exp δ

ε2Vp(Xt∧τ, Yt∧τ)

,

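for appropriate $p$ and $\tau$. Applying the Itô formula to the function $\psi_\varepsilon(x,y) = \exp(\delta V^p(x,y)/\varepsilon^2)$, we get for all $t$

$$\psi_\varepsilon(X_t,Y_t) = \psi_\varepsilon(x,y) + \int_0^t \mathcal{A}\psi_\varepsilon(X_s,Y_s)\,ds + M_t, \tag{3.19}$$

where $(M_t)_{t\ge 0}$ is a local martingale that we do not need to make explicit. Let us choose $p \in (0,1)$ such that inequality (3.10) of Lemma 3.2 holds. Since $V(x,y) \to +\infty$ as $|(x,y)| \to +\infty$ and since $\bar{p} > 0$, we deduce that

$$\beta - \alpha V^{\bar{p}}(x,y) \xrightarrow{|(x,y)| \to +\infty} -\infty.$$

As a consequence, for all positive $\delta$, there exists $M_2 > 0$ such that

$$\forall \varepsilon \in (0,1],\ \forall (x,y) \in \bar{B}(0,M_2)^c, \quad \mathcal{A}\psi_\varepsilon(x,y) \le 0.$$

Let $\tau_\varepsilon = \inf\{t \ge 0,\ (X_t,Y_t) \in \bar{B}(0,M_2)\}$. A standard localization argument in (3.19) yields

$$\mathbb{E}_{(x,y)}\left[\psi_\varepsilon(X_{t\wedge\tau_\varepsilon},Y_{t\wedge\tau_\varepsilon})\right] \le \psi_\varepsilon(x,y).$$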


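Without loss of generality, we can assume that $M_2$ is such that (3.18) is valid for all $(x,y) \in \bar{B}(0,M_2)^c$. It follows that for all $\varepsilon \in (0,1]$, $t \ge 0$ and $(x,y) \in \bar{B}(0,M_2)^c$,

$$\left(\mathbb{E}_{(x,y)}\left[|(X_{t\wedge\tau_\varepsilon},Y_{t\wedge\tau_\varepsilon})|^{\frac{\delta}{\varepsilon^2}}\right]\right)^{\varepsilon^2} \le e^{\delta V^p(x,y)}.$$

From the above inequality, we finally deduce (3.7).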
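The way (3.18) and the supermartingale-type bound combine can be made explicit. This is a sketch under the assumption that (3.18) may be applied at the stopped process, that is, that it also holds on the boundary of $\bar{B}(0,M_2)$ where the process sits at time $\tau_\varepsilon$; the shorthand $Z_{t\wedge\tau_\varepsilon}$ for the stopped pair is ours.

```latex
% Notation (ours): Z_{t ^ tau_eps} = (X_{t ^ tau_eps}, Y_{t ^ tau_eps}).
% Step 1: multiply (3.18) by delta/eps^2 and exponentiate, pointwise.
\[
|Z_{t\wedge\tau_\varepsilon}|^{\delta/\varepsilon^2}
= \exp\Big(\frac{\delta}{\varepsilon^2}\ln|Z_{t\wedge\tau_\varepsilon}|\Big)
\le \exp\Big(\frac{\delta}{\varepsilon^2}V^p(Z_{t\wedge\tau_\varepsilon})\Big)
= \psi_\varepsilon(Z_{t\wedge\tau_\varepsilon}).
\]
% Step 2: take expectations and use the localized bound derived from (3.19).
\[
\mathbb{E}_{(x,y)}\big[|Z_{t\wedge\tau_\varepsilon}|^{\delta/\varepsilon^2}\big]
\le \mathbb{E}_{(x,y)}\big[\psi_\varepsilon(Z_{t\wedge\tau_\varepsilon})\big]
\le \psi_\varepsilon(x,y)
= \exp\Big(\frac{\delta}{\varepsilon^2}V^p(x,y)\Big).
\]
% Step 3: raise both sides to the power eps^2.
\[
\Big(\mathbb{E}_{(x,y)}\big[|Z_{t\wedge\tau_\varepsilon}|^{\delta/\varepsilon^2}\big]\Big)^{\varepsilon^2}
\le e^{\delta V^p(x,y)}.
\]
```

As in the proof of (i), the right-hand side no longer depends on $\varepsilon$ or $t$.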

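• Proof of (iii): With the notations of the two previous parts of the proof, the properties (3.6) and (3.7) hold with $\tau_\varepsilon := \inf\{t \ge 0,\ (X_t,Y_t) \in B\}$ for all compact sets $B$ such that $\bar{B}(0, M_1 \vee M_2) \subset B$.

In this last part of the proof, we then set $B = \bar{B}(0,M)$ where $M \ge M_1 \vee M_2$.

Second, remark that it is enough to show that the result holds with $\tau_\varepsilon \wedge 1$ instead of $\tau_\varepsilon$. Now, let $K$ be a compact set of $\mathbb{R}^{2d}$ such that $B \cap K = \emptyset$ and let $(\varepsilon_n, z_n)_{n\ge 1}$ be a sequence such that $\varepsilon_n \to 0$, such that $z_n \in K$ for all $n \ge 1$ and such that

$$\mathbb{E}_{z_n}[\tau_{\varepsilon_n} \wedge 1] \xrightarrow{n \to +\infty} \liminf_{\varepsilon \to 0}\ \inf_{z \in K}\ \mathbb{E}_z[\tau_\varepsilon \wedge 1].$$

Up to an extraction, we can assume that $(z_n)_{n\ge 1}$ is a convergent sequence. Let $\tilde{z}$ denote its limit.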

Lemma 3.1 implies that $\big(\mathcal{L}((Z_t^{\varepsilon_n,z_n})_{t\in[0,1]})\big)_{n\ge 1}$ is exponentially tight and then tight on $\mathcal{C}([0,1],\mathbb{R}^{2d})$.

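Using a second extraction, we can assume that $(Z^{\varepsilon_n,z_n})_{n\ge 1}$ converges in distribution to $Z^{(\infty)}$. Furthermore, since $\varepsilon_n \to 0$, the limit process $Z^{(\infty)}$ is a.s. a solution of the o.d.e. $\dot{z} = b(z)$ starting at $\tilde{z}$. The function $b$ being locally Lipschitz continuous, uniqueness holds for the solutions of this o.d.e. and we can conclude that $(Z^{\varepsilon_n,z_n})_{n\ge 1}$ converges in distribution to $z(\tilde{z},.)$ (where $z(\tilde{z},.)$ denotes the unique solution of $\dot{z} = b(z)$ starting from $\tilde{z}$). The function $z(\tilde{z},.)$ being deterministic, the convergence holds in fact in probability and, at the price of a last extraction, we can assume without loss of generality that $(Z^{\varepsilon_n,z_n})_{n\ge 1}$ converges a.s. to $z(\tilde{z},.)$. In particular, setting $\delta := d(K,B)$ ($\delta > 0$), there exists $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$,

$$\sup_{t\in[0,1]} |Z_t^{\varepsilon_n,z_n} - z(\tilde{z},t)| \le \frac{\delta}{4} \quad \text{a.s.}$$

Setting now

$$\tau_{\tilde{z},\frac{\delta}{2}} := \inf\Big\{t \ge 0,\ d(z(\tilde{z},t),B) \le \frac{\delta}{2}\Big\} \wedge 1,$$

we deduce that for all $n \ge n_0$,

$$\inf_{t\in[0,\tau_{\tilde{z},\frac{\delta}{2}}]} d(Z_t^{\varepsilon_n,z_n},B) \ge \frac{\delta}{4} \implies \tau_{\varepsilon_n} \ge \tau_{\tilde{z},\frac{\delta}{2}} \quad \text{a.s.}$$

Using the Fatou Lemma, we can conclude that

$$\lim_{n\to+\infty} \mathbb{E}_{z_n}[\tau_{\varepsilon_n} \wedge 1] \ge \mathbb{E}_{z_n}\Big[\liminf_{n\to+\infty} \tau_{\varepsilon_n} \wedge 1\Big] \ge \tau_{\tilde{z},\frac{\delta}{2}}.$$

Finally, since $t \mapsto z(\tilde{z},t)$ is a continuous function and since $d(K, \bar{B}(0, M + \frac{\delta}{2})) > 0$, the stopping time $\tau_{\tilde{z},\frac{\delta}{2}}$ is clearly positive. The result follows and this finishes the proof of Proposition 3.1.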

3.3 Hamilton-Jacobi equation (Proof of ii) of Theorem 2.1)

This point is a consequence of the finite time large deviation principle for $(Z^\varepsilon)_{\varepsilon\ge 0}$ (Lemma 3.1) and of the exponential tightness of $(\nu_\varepsilon)_{\varepsilon\ge 0}$ (Proposition 3.1). This is the purpose of the next proposition, which is an adaptation of Corollary 1 of Puhalskii (2003).

Proposition 3.2 For all $\varepsilon > 0$, let $(P_t^\varepsilon(z,.))_{t\ge 0,\, z\in\mathbb{R}^{2d}}$ denote the semi-group associated to (2.2), whose unique invariant distribution is denoted by $\nu_\varepsilon$. Assume that the following assumptions hold:


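(i) $(\nu_\varepsilon)_{\varepsilon>0}$ is exponentially tight of order $\varepsilon^{-2}$ on $\mathbb{R}^{2d}$.

(ii) For all $t \ge 0$ and $z \in \mathbb{R}^{2d}$, there exists a function $I_t(z,.) : \mathbb{R}^{2d} \to \mathbb{R}$ such that, for all $(z_\varepsilon)_{\varepsilon>0}$ such that $z_\varepsilon \to z$ as $\varepsilon \to 0$, $(P_t^\varepsilon(z_\varepsilon,.))_{\varepsilon>0}$ satisfies a LDP with speed $\varepsilon^{-2}$ and rate function $I_t(z,.)$.

Then, $(\nu_\varepsilon)_{\varepsilon>0}$ admits a (LD)-convergent subsequence and, for such a subsequence $(\nu_{\varepsilon_k})_{k\ge 0}$ (with $\varepsilon_k \to 0$ as $k \to +\infty$), the associated rate function $W$ satisfies for all $z_0 \in \mathbb{R}^{2d}$,

$$\forall t \ge 0, \quad W(z_0) = \inf_{z\in\mathbb{R}^{2d}} \big(I_t(z,z_0) + W(z)\big). \tag{3.20}$$

With the terminology of Puhalskii (2003), Equation (3.20) means that $\tilde{W}$ defined for all $\Gamma \in \mathcal{B}(\mathbb{R}^{2d})$ by $\tilde{W}(\Gamma) = \sup_{y\in\Gamma} \exp(-W(y))$ is an invariant deviability for $(P_t^\varepsilon(z,.))_{t\ge 0,\, z\in\mathbb{R}^{2d}}$. In Corollary 1 of Puhalskii (2003), this result is stated with a uniqueness assumption on the invariant deviabilities. The above proposition is in fact an extension of this corollary to the case where uniqueness is not fulfilled. We refer to Appendix A for details.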
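A heuristic reading of (3.20) may help. It is offered only as an informal sketch and not as the argument of Appendix A: the symbol $\asymp$ stands for logarithmic equivalence at speed $\varepsilon^{-2}$ and is our notation, and the measures are treated as if they had densities.

```latex
% Invariance of nu_eps under the semi-group, for every t >= 0:
\[
\nu_\varepsilon(dz_0) = \int_{\mathbb{R}^{2d}} \nu_\varepsilon(dz)\,P_t^\varepsilon(z,dz_0).
\]
% Heuristic form of the two large deviation statements (assumption of the sketch):
\[
\nu_\varepsilon(dz) \asymp e^{-W(z)/\varepsilon^2}\,dz,
\qquad
P_t^\varepsilon(z,dz_0) \asymp e^{-I_t(z,z_0)/\varepsilon^2}\,dz_0.
\]
% Plugging in and applying the Laplace method to the integral over z:
\[
e^{-W(z_0)/\varepsilon^2}
\asymp \int_{\mathbb{R}^{2d}} e^{-\left(W(z)+I_t(z,z_0)\right)/\varepsilon^2}\,dz
\asymp \exp\Big(-\frac{1}{\varepsilon^2}\inf_{z\in\mathbb{R}^{2d}}\big(I_t(z,z_0)+W(z)\big)\Big).
\]
% Comparing exponents gives (3.20):
\[
W(z_0) = \inf_{z\in\mathbb{R}^{2d}}\big(I_t(z,z_0)+W(z)\big).
\]
```

In this reading, $W(z)$ is the cost of being at $z$ under the invariant distribution and $I_t(z,z_0)$ is the cost of moving from $z$ to $z_0$ in time $t$. The exponential tightness in (i) is what prevents the infimum from escaping to infinity.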

Owing to Proposition 3.1 and Lemma 3.1, Proposition 3.2 can be applied with $I_t(z,.)$ defined in (3.1). The rate function $W$ is a solution of (3.20) and Equation (2.7) is satisfied. Thus, the next result proves assertion (ii) of Theorem 2.1.

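Proposition 3.3 Assume that either $(H_{Q+})$ or $(H_{Q-})$ is satisfied. Then any (good) rate function $W$ associated to any (LD)-convergent subsequence $(\varepsilon_n)_{n\ge 1}$ satisfies, for all $t \ge 0$ and for all $z_0 \in \mathbb{R}^{2d}$,

$$W(z_0) = \inf_{\substack{\varphi \in \mathbb{H} \\ \tilde{z}_\varphi(0) = z_0}} \left(\frac{1}{2}\int_0^t |\dot{\varphi}(s)|^2\,ds + W(\tilde{z}_\varphi(t))\right).$$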

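Proof: We know that $W$ satisfies (3.20) and thus for any $z_0 \in \mathbb{R}^{2d}$,

$$W(z_0) = \inf_{v\in\mathbb{R}^{2d}} \big(I_t(v,z_0) + W(v)\big) = \inf_{\substack{v\in\mathbb{R}^{2d},\ \varphi\in\mathbb{H} \\ z_\varphi(0) = v,\ z_\varphi(t) = z_0}} \left(\frac{1}{2}\int_0^t |\dot{\varphi}(s)|^2\,ds + W(z_\varphi(0))\right).$$

Remarking that $g : [0,t] \to \mathbb{R}^{2d}$ defined by $g(s) = z_\varphi(t-s)$ is a controlled trajectory associated to $-b$ and $-\varphi$, we deduce that for all $t \ge 0$,

$$W(z_0) = \inf_{\substack{v\in\mathbb{R}^{2d},\ \varphi\in\mathbb{H} \\ \tilde{z}_{-\varphi}(0) = z_0,\ \tilde{z}_{-\varphi}(t) = v}} \left(\frac{1}{2}\int_0^t |\dot{\varphi}(s)|^2\,ds + W(\tilde{z}_{-\varphi}(t))\right).$$

The result follows from the change of variable $\tilde{\varphi} = -\varphi$.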
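The time-reversal step can be checked by a direct computation. The sketch below assumes that the controlled trajectory solves $\dot{z}_\varphi = b(z_\varphi) + \Sigma\dot{\varphi}$ for some fixed matrix $\Sigma$; this form, the symbol $\Sigma$ and the auxiliary control $\psi$ are hypothetical notations introduced for illustration, the actual definition being the one attached to (3.1).

```latex
% Assumed controlled dynamics (hypothetical form):
%   \dot z_\varphi(s) = b(z_\varphi(s)) + \Sigma \dot\varphi(s).
% Time reversal on [0,t]:
\[
g(s) := z_\varphi(t-s), \qquad g(0) = z_\varphi(t) = z_0, \qquad g(t) = z_\varphi(0) = v.
\]
% Differentiating in s:
\[
\dot g(s) = -\dot z_\varphi(t-s) = -b\big(g(s)\big) - \Sigma\,\dot\varphi(t-s).
\]
% Introduce the time-reversed control psi (ours), which vanishes at s = 0:
\[
\psi(s) := \varphi(t) - \varphi(t-s), \qquad \dot\psi(s) = \dot\varphi(t-s),
\]
% so that g is driven by the drift -b and the control -psi:
\[
\dot g(s) = -b\big(g(s)\big) + \Sigma\,\frac{d}{ds}\big(-\psi\big)(s).
\]
% The energy is unchanged by the reversal (change of variable u = t - s):
\[
\frac12\int_0^t |\dot\psi(s)|^2\,ds = \frac12\int_0^t |\dot\varphi(u)|^2\,du.
\]
```

Under these assumptions, the map $\varphi \mapsto \psi$ is a bijection that preserves the cost, so the infimum over paths ending at $z_0$ for the drift $b$ equals the infimum over paths starting at $z_0$ for the drift $-b$. The constraint on the endpoint $v$ then disappears because $v$ is itself minimized over.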

3.4 Infinite horizon Hamilton-Jacobi equation

The aim of this part is to show that, when there is a finite number of critical points, we can "replace $t$ by $+\infty$" in (2.7). This proof is an adaptation of Theorem 4 of Biswas and Borkar (2009). The main novelty of our proof is the second step. Indeed, using arguments based on asymptotic pseudo-trajectories and Lyapunov functions, we prove that the optimal controlled trajectory is attracted by a critical point of the drift vector field.

Proof of (iii) of Theorem 2.1: The proof is divided into three parts. We first build an optimal path $t \mapsto \tilde{z}_\psi(z,t)$ for the Hamilton-Jacobi equation of interest. Then, in the second step, we focus on its long time behaviour and obtain that $\tilde{z}_\psi(z,t)$ converges to a limit which belongs to $\{z,\ b(z) = 0\}$.
