RATE OF ESTIMATION FOR THE STATIONARY DISTRIBUTION OF STOCHASTIC DAMPING HAMILTONIAN SYSTEMS WITH CONTINUOUS OBSERVATIONS


HAL Id: hal-02455744

https://hal.archives-ouvertes.fr/hal-02455744

Preprint submitted on 26 Jan 2020


Sylvain Delattre, Arnaud Gloter, Nakahiro Yoshida

To cite this version:

Sylvain Delattre, Arnaud Gloter, Nakahiro Yoshida. RATE OF ESTIMATION FOR THE STATIONARY DISTRIBUTION OF STOCHASTIC DAMPING HAMILTONIAN SYSTEMS WITH CONTINUOUS OBSERVATIONS. 2020. hal-02455744


Abstract. We study the problem of non-parametric estimation of the density $\pi$ of the stationary distribution of a stochastic two-dimensional damping Hamiltonian system $(Z_t)_{t\in[0,T]} = (X_t, Y_t)_{t\in[0,T]}$. From the continuous observation of the sample path on $[0,T]$, we study the rate of estimation of $\pi(x_0,y_0)$ as $T\to\infty$. We show that kernel-based estimators can achieve the rate $T^{-v}$ for some explicit exponent $v\in(0,1/2)$. One finding is that the rate of estimation depends on the smoothness of $\pi$ and is completely different from the rate appearing in the standard i.i.d. setting or in the case of two-dimensional non-degenerate diffusion processes. In particular, this rate also depends on $y_0$. Moreover, we obtain a minimax lower bound on the $L^2$-risk for pointwise estimation, with the same rate $T^{-v}$, up to $\log(T)$ terms.

1. Introduction

The class of hypo-elliptic diffusion processes, for which the diffusion coefficient is degenerate, has been the subject of many recent works and is used for modeling in many fields, such as mathematical finance, biology, neuroscience, mechanics, ecology, ... (see e.g. [8], [11], [7] and references therein). In this paper, we focus on the situation of a bi-dimensional hypo-elliptic process, describing the evolution in time of the couple position/velocity of some quantity. The velocity $Y$ is modeled by a non-degenerate one-dimensional diffusion process, while the position $X$ is its integral, and the resulting bi-dimensional process $Z = (X, Y)$ is hypo-elliptic. Our aim is to estimate the density $\pi$ of the stationary measure of this diffusion process, under the assumption of an ergodic setting.

The problem of non-parametric estimation of the stationary measure of a continuous mixing process is a long-standing problem (see for instance N'Guyen [13], or Comte and Merlevede [3] and references therein). Based on a sample of length $T$ of the data, Comte and Merlevede [3] find estimators of the stationary measure converging at a rate that depends on the smoothness of the stationary measure and is slower than $\sqrt{T}$, as is usual in non-parametric problems.

In the specific context where the continuous time process is a one-dimensional diffusion process, observed continuously on some interval $[0, T]$, the finding is different. It is shown that the rate of estimation of the stationary measure is $\sqrt{T}$ (see Kutoyants [10]). The rate of estimation is thus independent of the smoothness of the object that one estimates, in contrast to the typical non-parametric situation.

Date: January 26, 2020.

2010 Mathematics Subject Classification. Primary 62G07, 62G20; secondary 60J60.

Key words and phrases. hypo-elliptic diffusion, non-parametric estimation, stationary measure, minimax rate.


The optimal estimator is very specific to the diffusive nature of the process, as it relies on the local time of the process. Remark that if the process is a diffusion observed discretely on $[0, T]$ with a sufficiently high frequency, it is also possible to estimate with rate $\sqrt{T}$ (see [12], [4]).

The case of multi-dimensional non-degenerate diffusions is treated in Dalalyan and Reiß [6] and Srauss [14]. In that case, the local time process is not available, but it is shown that for a non-degenerate diffusion process of dimension $d = 2$, there exists an estimator of the pointwise values of the stationary measure with rate $\sqrt{T}/\log(T)^2$. This rate of estimation does not depend on the smoothness of the stationary measure for $d = 2$. In the situation of a non-degenerate diffusion with dimension $d \ge 3$, they find estimators whose rate is polynomial in $T$ and depends on both the smoothness of the stationary measure and the dimension $d$. For $d \ge 3$, the rate is strictly slower than $\sqrt{T}$, but faster than the rate appearing in standard multivariate density estimation from $T$ i.i.d. observations. Hence, for $d \ge 3$ as well, the diffusive structure of the process makes it possible to get a faster estimation rate than in the i.i.d. case.

In the case of hypo-elliptic processes, fewer results exist for the estimation of the stationary distribution. In [5], the authors consider the case of a two-dimensional process $Z = (X, Y)$, where $Y$ is a velocity and $X$ a position. Based on a discrete sampling of the path of size $n$, they propose an estimator of the stationary measure which converges at a non-parametric rate depending on the smoothness of the stationary measure. Assuming that the stationary measure has anisotropic regularity $(k_1, k_2)$, the proposed estimator has a rate depending on the harmonic mean $2/(1/k_1 + 1/k_2)$, as is the case for the optimal rate of estimation of a distribution from an i.i.d. sequence.

In this paper, we focus on the situation where the process is observed continuously on $[0, T]$, and our goal is to determine the optimal rate of estimation of $\pi$ in this context. Assuming that the stationary density $(x, y) \mapsto \pi(x, y)$ has an anisotropic Hölder regularity with index $k_1$ with respect to the variable $x$, and $k_2$ with respect to $y$, we construct an estimator of $\pi(x_0, y_0)$ based on $(X_t, Y_t)_{t\in[0,T]}$. This estimator achieves some rate $T^{-v(k_1,k_2)}$, and this rate of convergence depends on the smoothness of $\pi$ in a very specific way. Indeed, the expression of the rate of estimation involves only $k_1$ or $k_2$, depending on the relative positions of these two smoothness indexes. This shows the specificity of the estimation problem for continuous observation of a process $Z = (X, Y)$ with a non-degenerate diffusive velocity $Y$ and a degenerate component $X$. It is noteworthy that we find a rate of estimation slower than in the case of continuous observation of a non-degenerate diffusion in dimension 2. Another interesting finding is that the rate of estimation of $\pi(x_0, y_0)$ depends on the point where one estimates the stationary measure, and is slower for points corresponding to null velocity $y_0 = 0$. A crucial ingredient in the study of the rate of estimation is to derive the variance of a kernel estimator with a choice of bandwidth $(h_1, h_2)$. We show that the variance of the kernel estimator depends in a completely asymmetric way on $h_1$ and $h_2$. As a consequence, the optimal bandwidth choice for the estimator is such that one of the two bandwidths can go almost arbitrarily fast to 0, while the optimal choice for the other bandwidth depends sharply on the smoothness of the stationary measure.

Also, we show a lower bound for the minimax risk of estimation of $\pi(x_0, y_0)$ on a class of hypo-elliptic diffusion models with stationary measure of Hölder regularity $(k_1, k_2)$. This proves that it is impossible to estimate uniformly on this class with a rate faster than $T^{-v(k_1,k_2)}$ (up to a $\log(T)$ term).

The outline of the paper is the following. In Section 2, we present the model and give assumptions that are sufficient to get an ergodic system with stationary measure admitting a density $\pi$. In Section 3, we present the construction of the estimator and state the results on its rate of convergence (in Theorem 1 for $y_0 \ne 0$, and Theorem 2 for $y_0 = 0$). In Section 4, we prove the upper bound on the variance of the kernel estimator. We also illustrate the very specific behaviour of the variance of the kernel estimator for hypo-elliptic diffusion by numerical simulations. In Section 5, we state and prove minimax lower bounds for the risk of estimation. In the Appendix, we prove some technical results used in the proofs of Section 4.

2. Hamiltonian system and mixing property

Let us consider $(\Omega, \mathcal{A}, \mathbb{P})$, some probability space on which a standard one-dimensional Brownian motion $(B_t)_{t\ge 0}$ is defined. We assume that the process $(Z_t)_{t\ge 0} = (X_t, Y_t)_{t\ge 0}$ is a solution of the stochastic differential equation

$$dX_t = Y_t\,dt, \tag{2.1}$$

$$dY_t = a(X_t, Y_t)\,dB_t - [\beta(X_t, Y_t)Y_t + V'(X_t)]\,dt, \tag{2.2}$$

where $(X_0, Y_0)$ is a random variable independent of $(B_t)_t$.
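For intuition, a path of (2.1)–(2.2) can be approximated with an Euler–Maruyama scheme. The sketch below is only illustrative: the coefficients $a \equiv 1$, $\beta \equiv 1$, the potential $V(x) = x^2/2$ (so that $V'(x) = x$), the step size, and the name `simulate_path` are all assumptions made here for simplicity, not choices taken from the paper.

```python
import numpy as np

def simulate_path(T, dt, a=lambda x, y: 1.0, beta=lambda x, y: 1.0,
                  dV=lambda x: x, z0=(0.0, 0.0), seed=0):
    """Euler-Maruyama discretisation of (2.1)-(2.2) on [0, T] with step dt.

    a, beta and dV (the derivative V') are illustrative defaults, not the
    coefficients of any specific model in the text.
    """
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    X = np.empty(n + 1)
    Y = np.empty(n + 1)
    X[0], Y[0] = z0
    dB = rng.normal(0.0, np.sqrt(dt), size=n)  # Brownian increments
    for i in range(n):
        x, y = X[i], Y[i]
        X[i + 1] = x + y * dt                                   # dX = Y dt
        Y[i + 1] = y + a(x, y) * dB[i] - (beta(x, y) * y + dV(x)) * dt
    return X, Y

X, Y = simulate_path(T=50.0, dt=1e-2)
```

Note that the noise enters only through the velocity `Y`; the position `X` is updated deterministically from `Y`, which is the degeneracy discussed in the introduction.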

We introduce the following regularity assumption on the coefficients.

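Assumption HReg:

• The function $a : \mathbb{R}^2 \to \mathbb{R}$ is a $C^\infty$ function and $\overline{a} > a(x, y) \ge \underline{a} > 0$ for some constants $\overline{a} > \underline{a} > 0$.

• The function $\beta : \mathbb{R}^2 \to \mathbb{R}$ is continuously differentiable, and such that $|\beta(x, y)| \le \overline{\beta}$ for all $(x, y) \in \mathbb{R}^2$ and for some constant $\overline{\beta} > 0$. Moreover, we have $\beta(x, y) > \underline{\beta}$, $\forall x \in [l, \infty)$, $y \in \mathbb{R}$, where $l \ge 0$, $\underline{\beta} > 0$ are two constants.

• The function $V : \mathbb{R} \to \mathbb{R}$ is lower bounded, with $C^2$ regularity.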

It is shown in [17] that under the assumption HReg the S.D.E. (2.1)–(2.2) admits a weak solution, which satisfies the Markov property, and the associated semi group $(P_t)_{t\ge 0}$ is strongly Feller [17], which in turn implies that the process is strongly Markovian. Let us stress that the sign condition on $\beta$ for large $x$, together with the existence of a lower bound on $V$, are crucial to ensure that the solution of (2.1)–(2.2) does not explode in finite time. Of course, if we know that $(x, y) \mapsto a(x, y)$, $(x, y) \mapsto y\beta(x, y)$ and $x \mapsto V'(x)$ are globally Lipschitz, the solutions of the S.D.E. exist in the strong sense.

We now introduce an assumption on the potential $V$ of the system that ensures that the process tends to some equilibrium.

Assumption HErg: one has $\lim_{|x|\to\infty} V'(x)\,\mathrm{sign}(x) = +\infty$.

It is shown in [17] that under HReg and HErg, one can construct a Lyapounov function $\Psi \ge 1$, and that a stationary probability $\pi$ exists and is unique for the process $Z = (X, Y)$, and satisfies $\pi(\Psi) < \infty$. It is shown in [17] (see Theorem 2.4) that for some $D > 0$ and $\rho > 0$,

$$\forall t \ge 0,\ \forall z \in \mathbb{R}^2, \quad \sup_{|f| \le \Psi} \left| P_t(f)(z) - \int_{\mathbb{R}^2} f(z')\,\pi(dz') \right| \le D\,\Psi(z)\,e^{-\rho t} \tag{2.3}$$

for any measurable function $f$ such that $f/\Psi$ is bounded on $\mathbb{R}^2$, and where $(P_t)_{t\ge 0}$ is the semi group of $Z$,

$$P_t(f)(z) = E[f(X_t, Y_t) \mid (X_0, Y_0) = z].$$

Remark that under HReg and HErg, it is possible to construct a Lyapounov function such that $\Psi(x, y) \ge \frac{1}{C} \exp\left( \frac{1}{C} [|y|^2 + V(x)] \right)$, for some constant $C > 0$ (see (3.10) in [17]). As a consequence, (2.3) applies to functions $f$ with exponential growth, and the invariant distribution admits finite exponential moments.

Hence, we can state the following proposition.

Proposition 1. Assume that the coefficients of the equation (2.2) satisfy HReg and HErg. Then there exists a stationary solution $Z = (X, Y)$ to the S.D.E. (2.1)–(2.2), and the stationary distribution is unique and admits some density $\pi$. Moreover, there exist constants $D_{erg} > 0$ and $\rho > 0$ such that for any bounded measurable functions $f$, $g$, we have

$$\forall t \ge 0, \quad |\mathrm{cov}(f(Z_0), g(Z_t))| \le D_{erg}\,\|f\|_\infty \|g\|_\infty\, e^{-\rho t}. \tag{2.4}$$

Proposition 1 is a consequence of the results shown in [17]. Indeed, from the fact that the Lyapounov function $\Psi$ is greater than 1 and integrable with respect to the stationary measure, one can check that (2.4) is a consequence of (2.3).

3. Estimator and upper bounds

In this section we introduce the expression for our estimator of the stationary measure $\pi$ of the S.D.E. (2.1)–(2.2) and prove that the estimator achieves some rate of convergence, depending on the smoothness of $\pi$.

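Let $\varphi : \mathbb{R} \to \mathbb{R}$ be a bounded, compactly supported function. For convenience, we suppose that the support of $\varphi$ is $[-1, 1]$. We assume that

$$\int_{\mathbb{R}} \varphi(u)\,du = 1, \qquad \int_{\mathbb{R}} \varphi(u)\,u^l\,du = 0 \ \text{ for } l \in \{1, \dots, L\}, \tag{3.1}$$

where $L \ge 1$.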

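We let $h_1(T) > 0$, $h_2(T) > 0$ be two bandwidths which converge to zero as $T \to \infty$, and we consider a kernel estimator of $\pi$ at the point $(x_0, y_0) \in \mathbb{R}^2$ as

$$\hat\pi_T(x_0, y_0) = \frac{1}{T} \int_0^T \varphi_{h_1(T),h_2(T)}(X_s - x_0, Y_s - y_0)\,ds, \tag{3.2}$$

where

$$\varphi_{h_1,h_2}(x - x_0, y - y_0) = \frac{1}{h_1 h_2}\, \varphi\!\left(\frac{x - x_0}{h_1}\right) \varphi\!\left(\frac{y - y_0}{h_2}\right). \tag{3.3}$$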
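In practice the time integral in (3.2) has to be approximated from a finely sampled path. The following sketch uses a left-endpoint Riemann sum; the Epanechnikov kernel (which satisfies (3.1) only with $L = 1$), the sampling step `dt`, and the names `phi` and `pi_hat` are assumptions introduced here for illustration.

```python
import numpy as np

def phi(u):
    """Epanechnikov kernel: support [-1, 1], integral 1, first moment 0 (L = 1)."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def pi_hat(X, Y, dt, x0, y0, h1, h2):
    """Riemann-sum approximation of (3.2) from a path sampled every dt."""
    # Product kernel (3.3) evaluated along the path.
    w = phi((X - x0) / h1) * phi((Y - y0) / h2) / (h1 * h2)
    T = dt * (len(X) - 1)
    # Left-endpoint rule for (1/T) * integral over [0, T].
    return np.sum(w[:-1]) * dt / T
```

For instance, a path that sits at $(x_0, y_0)$ for the whole observation window gives `pi_hat` equal to $\varphi(0)^2/(h_1 h_2)$, while a path that never enters the window $[x_0 - h_1, x_0 + h_1] \times [y_0 - h_2, y_0 + h_2]$ gives 0.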

We assume that the two bandwidths satisfy,

$$\exists K > 0, \quad h_1(T)^{-1} + h_2(T)^{-1} \le K(1 + T^K), \quad \forall T > 0, \tag{3.4}$$

$$\sqrt{h_1(T)} + h_2(T) \le K (\log(T))^{-3/2} \wedge 1, \quad \forall T > 1. \tag{3.5}$$

The two previous conditions ensure that the bandwidths go to zero faster than the logarithmic rate, by (3.5), but not faster than any polynomial rate, by (3.4).

Actually these two bandwidths will be specified later (see equations (3.12), (3.13), (3.14), (3.15) in the proofs of Theorems 1 and 2).

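We introduce the class of Hölder functions.

Definition 1. For $(k_1, k_2) \in (0, \infty)^2$ and $R > 0$, we denote by $\mathcal{H}^{k_1,k_2}(R)$ the set of functions $f : \mathbb{R}^2 \to \mathbb{R}$ such that $x \mapsto f(x, y)$ and $y \mapsto f(x, y)$ are respectively of class $C^{\lfloor k_1 \rfloor}$ and $C^{\lfloor k_2 \rfloor}$, and satisfy the control, $\forall x, y$ in $\mathbb{R}$ and $h \in [-1, 1]$,

$$|f(x, y)| \le R,$$

$$\left| \frac{\partial^{\lfloor k_1 \rfloor} f}{\partial x^{\lfloor k_1 \rfloor}}(x + h, y) - \frac{\partial^{\lfloor k_1 \rfloor} f}{\partial x^{\lfloor k_1 \rfloor}}(x, y) \right| \le R\,|h|^{k_1 - \lfloor k_1 \rfloor},$$

$$\left| \frac{\partial^{\lfloor k_2 \rfloor} f}{\partial y^{\lfloor k_2 \rfloor}}(x, y + h) - \frac{\partial^{\lfloor k_2 \rfloor} f}{\partial y^{\lfloor k_2 \rfloor}}(x, y) \right| \le R\,|h|^{k_2 - \lfloor k_2 \rfloor}.$$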

We can state the main results on the asymptotic behaviour of the estimator.

This behaviour differs according to whether the point $(x_0, y_0)$ at which we estimate the stationary density corresponds to a null velocity or not.

Theorem 1. Assume that $Z = (X, Y)$ is a stationary solution to (2.1)–(2.2) and that Assumptions HReg, HErg hold true. We assume that the stationary distribution $\pi$ belongs to the set $\mathcal{H}^{k_1,k_2}(R)$ for $k_1 > 0$, $k_2 > 0$, $R > 0$, with $\max(k_1, k_2) \le L$ (recall (3.1)).

Assume that $y_0 \ne 0$. Then, there exist bandwidths $(h_1(T))_T$, $(h_2(T))_T$, depending only on $k_1$ and $k_2$, such that the estimator satisfies:

$$\text{if } k_1 < k_2/2, \quad E\left[ (\hat\pi_T(x_0, y_0) - \pi(x_0, y_0))^2 \right] \le C\,T^{-\frac{2k_2}{2k_2 + 1}}, \tag{3.6}$$

$$\text{if } k_1 \ge k_2/2, \quad E\left[ (\hat\pi_T(x_0, y_0) - \pi(x_0, y_0))^2 \right] \le C\,T^{-\frac{2k_1}{2k_1 + 1/2}}, \tag{3.7}$$

for some constant $C$ independent of $T$.

Theorem 2. Assume that $Z = (X, Y)$ is a stationary solution to (2.1)–(2.2) and that Assumptions HReg, HErg hold true. We assume that the stationary distribution $\pi$ belongs to the set $\mathcal{H}^{k_1,k_2}(R)$ for $k_1 > 0$, $k_2 > 0$, $R > 0$, with $\max(k_1, k_2) \le L$ (recall (3.1)).

Assume that $y_0 = 0$. Then, there exist bandwidths $(h_1(T))_T$, $(h_2(T))_T$, depending only on $k_1$ and $k_2$, such that the estimator satisfies:

$$\text{if } k_1 < k_2/3, \quad E\left[ (\hat\pi_T(x_0, y_0) - \pi(x_0, y_0))^2 \right] \le C \left( \frac{T}{\log T} \right)^{-\frac{2k_2}{2k_2 + 2}}, \tag{3.8}$$

$$\text{if } k_1 \ge k_2/3, \quad E\left[ (\hat\pi_T(x_0, y_0) - \pi(x_0, y_0))^2 \right] \le C\,T^{-\frac{2k_1}{2k_1 + 2/3}}, \tag{3.9}$$

for some constant $C$ independent of $T$.
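To compare the four regimes of Theorems 1 and 2 at a glance, the exponents of $T$ in (3.6)–(3.9) can be tabulated. The helper below is a small sketch; the name `mse_exponent` and the flag `y0_is_zero` are introduced here for illustration, and the logarithmic factor of (3.8) is ignored. The exponents concern the squared error, so the estimation rate $T^{-v}$ corresponds to half of the returned value.

```python
from fractions import Fraction

def mse_exponent(k1, k2, y0_is_zero):
    """Exponent e such that the bounds (3.6)-(3.9) are of order T**(-e), up to logs."""
    k1, k2 = Fraction(k1), Fraction(k2)
    if not y0_is_zero:
        # Theorem 1: cases (3.6) and (3.7).
        if k1 < k2 / 2:
            return 2 * k2 / (2 * k2 + 1)
        return 2 * k1 / (2 * k1 + Fraction(1, 2))
    # Theorem 2: cases (3.8) (which carries an extra log factor) and (3.9).
    if k1 < k2 / 3:
        return 2 * k2 / (2 * k2 + 2)
    return 2 * k1 / (2 * k1 + Fraction(2, 3))
```

With $k_1 = 1$, $k_2 = 4$ this gives $8/9$ when $y_0 \ne 0$ and $4/5$ when $y_0 = 0$, which illustrates the slower rate at null velocity.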

Remark 1. The rates of estimation obtained in Theorems 1–2 differ from the usual ones in several ways. First, they do not depend on the harmonic mean of the smoothness indexes $k_1$, $k_2$, as is usual in the non-parametric setting. Second, the rate depends on the point $(x_0, y_0)$ where the density is estimated. We state in Section 5 a minimax lower bound for the $L^2$ risk of estimation of $\pi(x_0, y_0)$ with the same rates (up to log terms).

The asymptotic behaviour of the estimator relies on the standard bias-variance decomposition. Hence, we need sharp evaluations for the variance of the estimator, that are stated below and will be proved in Section 4.

Proposition 2. Assume that $Z = (X, Y)$ is a solution to (2.1)–(2.2), that Assumptions HReg, HErg hold true and that $\|\pi\|_\infty \le R$ for some $R > 0$.

Assume that $y_0 \ne 0$. Then, there exists some constant $C$ such that for all $T > 0$,

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) \le C \left( \frac{1}{T h_2(T)} \wedge \frac{1}{T \sqrt{h_1(T)}} \right) + C\,\varepsilon(T, h_1(T), h_2(T)),$$

where

$$\varepsilon(T, h_1(T), h_2(T)) \le \frac{|\log(h_1(T) h_2(T))|^C}{T}. \tag{3.10}$$

Proposition 3. Assume that $Z = (X, Y)$ is a solution to (2.1)–(2.2), that Assumptions HReg, HErg hold true and that $\|\pi\|_\infty \le R$ for some $R > 0$.

Assume that $y_0 = 0$. Then, there exists some constant $C$ such that for all $T > 0$,

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) \le C \left( \frac{\ln(T)}{T h_2(T)^2} \wedge \frac{1}{T h_1(T)^{2/3}} \right) + C\,\varepsilon(T, h_1(T), h_2(T)),$$

where

$$\varepsilon(T, h_1(T), h_2(T)) \le \frac{|\log(h_1(T) h_2(T))|^C}{T}.$$

We can now prove that our estimator achieves the rates given in Theorems 1–2.

Proof of Theorem 1. We write the usual bias-variance decomposition,

$$E[(\hat\pi_T(x_0, y_0) - \pi(x_0, y_0))^2] \le |E(\hat\pi_T(x_0, y_0)) - \pi(x_0, y_0)|^2 + \mathrm{Var}(\hat\pi_T(x_0, y_0)). \tag{3.11}$$

Using the stationarity of the process, we can upper bound the bias term as

$$|E(\hat\pi_T(x_0, y_0)) - \pi(x_0, y_0)|^2 = \left| \int_{\mathbb{R}^2} \varphi(u)\varphi(v)\,[\pi(x_0 + u h_1(T), y_0 + v h_2(T)) - \pi(x_0, y_0)]\,du\,dv \right|^2 \le C\,(h_1(T)^{2k_1} + h_2(T)^{2k_2}),$$

where in the last line we used $\pi \in \mathcal{H}^{k_1,k_2}(R)$ with (3.1) (see e.g. [16], or Proposition 1 in [2] for details). We now use the results of Proposition 2 on the variance of the estimator and choose the optimal bandwidths $h_1(T)$, $h_2(T)$.

• Case 1: $k_1 < k_2/2$.

Using Proposition 2,

$$E[(\hat\pi_T(x_0, y_0) - \pi(x_0, y_0))^2] \le C\,(h_1(T)^{2k_1} + h_2(T)^{2k_2}) + \frac{C}{T h_2(T)} + C\,\varepsilon(T, h_1(T), h_2(T)).$$

We now choose to balance $h_2(T)^{2k_2}$ with the main contribution of the variance term and let the contribution of $h_1(T)$ to the bias be smaller. This leads us to set

$$h_2(T) = T^{-1/(2k_2 + 1)}, \qquad h_1(T) = T^{-C_1}, \quad \text{where } C_1 \ge \frac{k_2}{k_1 (2k_2 + 1)}. \tag{3.12}$$

With these choices and recalling (3.10), we get (3.6).

• Case 2: $k_1 \ge k_2/2$.

We use again Proposition 2:

$$E[(\hat\pi_T(x_0, y_0) - \pi(x_0, y_0))^2] \le C\,(h_1(T)^{2k_1} + h_2(T)^{2k_2}) + \frac{C}{T \sqrt{h_1(T)}} + C\,\varepsilon(T, h_1(T), h_2(T)).$$

Balancing the variance and bias terms leads to

$$h_1(T) = T^{-1/(2k_1 + 1/2)}, \qquad h_2(T) = T^{-C_2}, \quad \text{where } C_2 \ge \frac{k_1}{k_2 (2k_1 + 1/2)}, \tag{3.13}$$

and (3.7) follows.

Proof of Theorem 2. We use again the bias/variance decomposition (3.11) and now exploit the results of Proposition 3.

• Case 1: $k_1 < k_2/3$.

We have that

$$E[(\hat\pi_T(x_0, y_0) - \pi(x_0, y_0))^2] \le C\,(h_1(T)^{2k_1} + h_2(T)^{2k_2}) + \frac{C \ln(T)}{T h_2(T)^2} + C\,\varepsilon(T, h_1(T), h_2(T)).$$

We set

$$h_2(T) = \left( \frac{T}{\ln(T)} \right)^{-1/(2k_2 + 2)}, \qquad h_1(T) = T^{-C_1}, \quad \text{where } C_1 \ge \frac{k_2}{k_1 (2k_2 + 2)}, \tag{3.14}$$

and (3.8) follows.

• Case 2: $k_1 \ge k_2/3$.

We have that

$$E[(\hat\pi_T(x_0, y_0) - \pi(x_0, y_0))^2] \le C\,(h_1(T)^{2k_1} + h_2(T)^{2k_2}) + \frac{C}{T h_1(T)^{2/3}} + C\,\varepsilon(T, h_1(T), h_2(T)).$$

The choice

$$h_1(T) = T^{-1/(2k_1 + 2/3)}, \qquad h_2(T) = T^{-C_2}, \quad \text{where } C_2 \ge \frac{k_1}{k_2 (2k_1 + 2/3)}, \tag{3.15}$$

gives (3.9).
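The bandwidth choices (3.12)–(3.15) can be collected in one helper. This is a sketch under stated assumptions: the function name `bandwidths` is hypothetical, and the free constants $C_1$, $C_2$ are set to their smallest admissible values, although any larger value is allowed by the proofs.

```python
import math

def bandwidths(T, k1, k2, y0_is_zero):
    """Return (h1, h2) following (3.12)-(3.15), with C1, C2 at their lower bounds."""
    if not y0_is_zero:
        if k1 < k2 / 2:
            # (3.12): h2 balances bias and variance, h1 only needs to be small enough.
            return T ** (-k2 / (k1 * (2 * k2 + 1))), T ** (-1 / (2 * k2 + 1))
        # (3.13): the roles of h1 and h2 are exchanged.
        return T ** (-1 / (2 * k1 + 0.5)), T ** (-k1 / (k2 * (2 * k1 + 0.5)))
    if k1 < k2 / 3:
        # (3.14): note the logarithmic correction in h2.
        return (T ** (-k2 / (k1 * (2 * k2 + 2))),
                (T / math.log(T)) ** (-1 / (2 * k2 + 2)))
    # (3.15)
    return T ** (-1 / (2 * k1 + 2 / 3)), T ** (-k1 / (k2 * (2 * k1 + 2 / 3)))
```

With the minimal constants, the two bias contributions $h_1^{2k_1}$ and $h_2^{2k_2}$ coincide in the cases without logarithm, and the leading variance term is of the same order as the bias.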

Remark 2. • The optimal choices for the bandwidths are given in (3.12), (3.13), (3.14) and (3.15). In all the situations, we see that one of the two bandwidths $h_1(T)$ or $h_2(T)$ can be chosen "arbitrarily small" (as the constants $C_1$ and $C_2$ in (3.12), (3.13), (3.14), (3.15) can be arbitrarily large). It means that the bias induced by the variation of $\pi$ along one of the two variables $x$ or $y$ can be arbitrarily reduced by the choice of a very thin bandwidth. It explains why the expression of the rate of estimation depends only on one index of smoothness, $k_1$ or $k_2$.

• The fact that one of the two bandwidths can be chosen arbitrarily small is reminiscent of the situation of the estimation of the stationary measure $\pi(z)$ for a one-dimensional diffusion process $(Z_t)_{t\in[0,T]}$ observed continuously. In that case, the efficient estimator is based on the local time of the process (see [10]) and the rate is $\sqrt{T}$ independently of the smoothness of $\pi$. The use of local time is a way to give a rigorous analysis of the quantity $\tilde\pi_T(z) = \frac{1}{T} \int_0^T \delta_{\{z\}}(Z_s)\,ds$, where $\delta_{\{z\}}$ is the Dirac mass located at $z$. We see that $\tilde\pi_T$ is essentially a kernel estimator with bandwidth $h = 0$, for which the bias is reduced to 0.

4. Variance of the kernel estimator

In this section we prove the crucial upper bounds given in Propositions 2–3. We first need to state two lemmas related to the behaviour of the density and the semi group of the process, whose proofs are postponed to Section 6.

Lemma 1 (Corollary 2.12 in [1]). Assume HReg. Then, the process admits a transition density $(p_t)_{t>0}$ which satisfies the following upper bound. For all $K$ compact subset of $\mathbb{R}^2$, $\forall ((x, y), (x', y')) \in K \times K$, $\forall t \in (0, 1)$,

$$p_t((x, y); (x', y')) \le p^G_t((x, y); (x', y')) + p^U_t((x, y); (x', y')), \tag{4.1}$$

where

$$p^G_t((x, y); (x', y')) = \frac{C_G}{t^2} \exp\left( -\frac{1}{C_G} \left[ \frac{(y - y')^2}{t} + \frac{\left(x' - x - \frac{y + y'}{2}\,t\right)^2}{t^3} \right] \right), \tag{4.2}$$

for some $C_G > 0$, and $p^U$ is a measurable non-negative function such that for any compact $K \subset \mathbb{R}^2$ and $(x, y) \in K$ we have, for all $t \in (0, 1)$,

$$\int_{\mathbb{R}^2} p^U_t((x, y); (x', y'))\,dx'\,dy' \le C_U \exp(-t^{-1} C_U^{-1}), \tag{4.3}$$

for $C_U > 0$. The two constants $C_G$ and $C_U$ are independent of $t \in [0, 1]$, but depend on the compact set $K$.

Lemma 1 gives us a control on the short time behaviour of the transition density. For the sequel, we need a control valid for any time. This is the purpose of the following lemma about the semi group of the process.

Lemma 2. Assume HReg, and let $K$ be a compact subset of $\mathbb{R}^2$. Then, there exists a constant $\widetilde{C}_K$ such that for all $0 < D < 1$, $t \ge D$, $\forall z \in \mathbb{R}^2$, and any measurable bounded function $f$ with support in $K$,

$$|P_t(f)(z)| \le \widetilde{C}_K \left[ \frac{\|f\|_{L^1(\mathbb{R}^2)}}{D^2} + \|f\|_\infty\, e^{-1/(\widetilde{C}_K D)} \right]. \tag{4.4}$$

4.1. Proof of Proposition 2. Throughout the proof we suppress in the notation the dependence upon $T$ of $h_1(T)$ and $h_2(T)$. The constant $C$ may change from line to line and is independent of $T$. In the proof, we will use Lemmas 1–2 repeatedly. To this end, we consider a compact set $K$ that contains a ball of radius $\sqrt{2}$ centered at $(x_0, y_0)$, and as a result the support of $(x, y) \mapsto \varphi_{h_1,h_2}(x - x_0, y - y_0)$ is included in this compact.

To prove the proposition, it is sufficient to prove that the following two inequalities both hold, for $T$ large enough,

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) \le C\,\frac{1}{T h_2} + C\,\varepsilon(T, h_1(T), h_2(T)), \tag{4.5}$$

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) \le C\,\frac{1}{T \sqrt{h_1}} + C\,\varepsilon(T, h_1(T), h_2(T)). \tag{4.6}$$

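First step: we prove (4.5).

From (3.2) and the stationarity of the process, we get that

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) = \frac{1}{T^2} \int_0^T \int_0^T \kappa(t - s)\,dt\,ds, \tag{4.7}$$

where

$$\kappa(u) = \mathrm{Cov}\left( \varphi_{h_1,h_2}(X_0 - x_0, Y_0 - y_0),\ \varphi_{h_1,h_2}(X_u - x_0, Y_u - y_0) \right).$$

We deduce that

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) \le \frac{1}{T} \int_0^T |\kappa(s)|\,ds. \tag{4.8}$$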
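The passage from (4.7) to (4.8) can be spelled out as follows. This is the standard computation for the variance of a time average, written here as a sketch; it produces an explicit numerical factor 2, which (4.8) does not track since constants are immaterial for the rates.

```latex
\begin{align*}
\frac{1}{T^2}\int_0^T\!\!\int_0^T \kappa(t-s)\,dt\,ds
  &= \frac{2}{T^2}\int_0^T (T-u)\,\kappa(u)\,du
  && \text{(since $\kappa(-u)=\kappa(u)$ by stationarity)}\\
  &\le \frac{2}{T}\int_0^T |\kappa(u)|\,du
  && \text{(since $0 \le T-u \le T$ on $[0,T]$)}.
\end{align*}
```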

We will find an upper bound for the integral on the right-hand side of the latter expression by splitting the time interval $[0, T]$ into 4 pieces, $[0, T] = [0, \delta) \cup [\delta, D_1) \cup [D_1, D_2) \cup [D_2, T]$, where $\delta$, $D_1$, $D_2$ will be chosen later.

• For $s \in [0, \delta)$, we write from (4.7), using the Cauchy-Schwarz inequality and the stationarity of the process,

$$|\kappa(s)| \le \mathrm{Var}(\varphi_{h_1,h_2}(X_0 - x_0, Y_0 - y_0))^{1/2}\, \mathrm{Var}(\varphi_{h_1,h_2}(X_s - x_0, Y_s - y_0))^{1/2} = \mathrm{Var}(\varphi_{h_1,h_2}(X_0 - x_0, Y_0 - y_0)).$$

This variance is smaller than

$$\int_{\mathbb{R}^2} \varphi_{h_1,h_2}(x - x_0, y - y_0)^2\, \pi(x, y)\,dx\,dy,$$

and using that $\pi$ is bounded and (3.3), we deduce

$$|\kappa(s)| \le \frac{C}{h_1 h_2}. \tag{4.9}$$

In turn, we have

$$\int_0^\delta |\kappa(s)|\,ds \le C\,\frac{\delta}{h_1 h_2}. \tag{4.10}$$

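• For $s \in [\delta, D_1)$, where $\delta < D_1 \le 1$, we write

$$|\kappa(s)| \le E\left[ |\varphi_{h_1,h_2}(X_0 - x_0, Y_0 - y_0)|\,|\varphi_{h_1,h_2}(X_s - x_0, Y_s - y_0)| \right] + E\left[ |\varphi_{h_1,h_2}(X_0 - x_0, Y_0 - y_0)| \right] E\left[ |\varphi_{h_1,h_2}(X_s - x_0, Y_s - y_0)| \right].$$

Using that $(X_t, Y_t)_t$ is stationary with marginal law having a bounded density, we deduce that $E[|\varphi_{h_1,h_2}(X_s - x_0, Y_s - y_0)|] \le C \int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x - x_0, y - y_0)|\,dx\,dy \le C$ from (3.3). This gives

$$|\kappa(s)| \le \int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x - x_0, y - y_0)| \int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x' - x_0, y' - y_0)|\, p_s(x, y; x', y')\,dx'\,dy'\ \pi(x, y)\,dx\,dy + C. \tag{4.11}$$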

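Using now equation (4.1) in Lemma 1, we get that

$$|\kappa(s)| \le \kappa_1(s) + \kappa_2(s) + C \tag{4.12}$$

with

$$\kappa_1(s) := \int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x - x_0, y - y_0)| \int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x' - x_0, y' - y_0)|\, p^G_s(x, y; x', y')\,dx'\,dy'\ \pi(x, y)\,dx\,dy, \tag{4.13}$$

$$\kappa_2(s) := \int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x - x_0, y - y_0)| \int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x' - x_0, y' - y_0)|\, p^U_s(x, y; x', y')\,dx'\,dy'\ \pi(x, y)\,dx\,dy.$$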

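In order to upper bound $\kappa_1(s)$, we show that the Gaussian kernel (4.2) appearing in the expression of $\kappa_1(s)$ takes small values for $s \in [\delta, D_1)$ as soon as $\delta$ is well chosen. Recall that $y_0 \ne 0$, and for simplicity assume that $y_0 > 0$. Then, using that $\varphi$ is compactly supported on $[-1, 1]$, we know that $|\varphi_{h_1,h_2}(x - x_0, y - y_0)\, \varphi_{h_1,h_2}(x' - x_0, y' - y_0)| \ne 0$ implies that

$$|x - x_0| \le h_1, \quad |x' - x_0| \le h_1, \quad |y - y_0| \le h_2, \quad |y' - y_0| \le h_2. \tag{4.14}$$

Let us denote by $K(h_1, h_2)$ the rectangle of $\mathbb{R}^4$ defined by the conditions (4.14). Then,

$$\kappa_1(s) \le C \int_{K(h_1,h_2)} |\varphi_{h_1,h_2}(x - x_0, y - y_0)|\,|\varphi_{h_1,h_2}(x' - x_0, y' - y_0)|\, p^G_s(x, y; x', y')\,dx\,dy\,dx'\,dy', \tag{4.15}$$

where we used that $\pi$ is bounded. On $K(h_1, h_2)$, we have $\frac{y + y'}{2} \ge \frac{y_0}{2} > 0$ if $h_2$ is small enough, and $|x - x'| \le 2h_1$. Hence, if we assume that $s \ge \frac{6h_1}{y_0}$, we have $|x' - x| \le \frac{s y_0}{3}$. It entails $x' - x - \frac{y + y'}{2}\,s \le \frac{s y_0}{3} - \frac{y_0}{2}\,s = -\frac{s y_0}{6}$, and in turn

$$\frac{\left(x' - x - \frac{y + y'}{2}\,s\right)^2}{s^3} \ge \frac{y_0^2}{36}\,\frac{1}{s}.$$

Plugging this into (4.2) yields $p^G_s(x, y; x', y') \le \frac{C}{s^2} \exp\left(-\frac{1}{Cs}\right)$ for some constant $C$ independent of $s$. Using (4.15), we deduce

$$\kappa_1(s) \le \frac{C}{s^2} \exp\left(-\frac{1}{Cs}\right) \int_{K(h_1,h_2)} |\varphi_{h_1,h_2}(x - x_0, y - y_0)|\,|\varphi_{h_1,h_2}(x' - x_0, y' - y_0)|\,dx\,dy\,dx'\,dy',$$

and hence

$$\kappa_1(s) \le \frac{C}{s^2} \exp\left(-\frac{1}{Cs}\right). \tag{4.16}$$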
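The geometric inequality behind (4.16), namely $(x' - x - \frac{y+y'}{2}s)^2/s^3 \ge y_0^2/(36s)$ on $K(h_1, h_2)$ as soon as $s \ge 6h_1/y_0$, can be checked numerically on random points. The values of $x_0$, $y_0$, $h_1$, $h_2$ below are illustrative assumptions (with $h_2 \le y_0/2$), and the check is a sanity test rather than a proof.

```python
import numpy as np

rng = np.random.default_rng(1)
x0, y0, h1, h2 = 0.0, 1.0, 1e-3, 0.2          # illustrative values, h2 <= y0 / 2
n = 10_000

# Random points of the rectangle K(h1, h2) defined by (4.14).
x, xp = x0 + h1 * rng.uniform(-1, 1, (2, n))
y, yp = y0 + h2 * rng.uniform(-1, 1, (2, n))
# Random times in the range s >= 6 h1 / y0 (and s <= 1).
s = rng.uniform(6 * h1 / y0, 1.0, n)

lhs = (xp - x - 0.5 * (y + yp) * s) ** 2 / s**3   # second term of the exponent in (4.2)
rhs = y0**2 / (36 * s)                            # claimed lower bound
gap = np.min(lhs - rhs)                           # non-negative if the bound holds
```

A non-negative `gap` means that, on all sampled configurations, the drift $\frac{y+y'}{2}s$ of the position has moved the process out of the $x$-window of width $h_1$, which is what makes the Gaussian term exponentially small.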

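To control $\kappa_2(s)$, we use (4.3) and $\|\varphi_{h_1,h_2}(\cdot - x_0, \cdot - y_0)\|_\infty \le \frac{C}{h_1 h_2}$ to get that, for all $(x, y)$ in the compact $K$ containing a ball of radius $\sqrt{2}$ centered at $(x_0, y_0)$, we have the upper bound $\int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x' - x_0, y' - y_0)|\, p^U_s(x, y; x', y')\,dx'\,dy' \le \frac{C}{h_1 h_2}\, e^{-\frac{1}{Cs}}$. As a consequence,

$$\kappa_2(s) \le C \int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x - x_0, y - y_0)|\, \frac{1}{h_1 h_2}\, e^{-\frac{1}{Cs}}\, \pi(x, y)\,dx\,dy \le \frac{C}{h_1 h_2}\, e^{-\frac{1}{Cs}}, \tag{4.17}$$

where we used again that $\pi$ is bounded and that the support of $\varphi_{h_1,h_2}(\cdot - x_0, \cdot - y_0)$ is included in $K$. From (4.12), (4.16)–(4.17), we deduce that for $\frac{6h_1}{y_0} \le \delta \le D_1 \le 1$,

$$\begin{aligned}
\int_\delta^{D_1} |\kappa(s)|\,ds &\le \int_\delta^{D_1} \left[ \frac{C}{s^2} \exp\left(-\frac{1}{Cs}\right) + \frac{C}{h_1 h_2} \exp\left(-\frac{1}{Cs}\right) + C \right] ds \\
&\le \int_\delta^{D_1} \left[ \frac{C}{s^2} \exp\left(-\frac{1}{Cs}\right) + \frac{C}{h_1 h_2 s^2} \exp\left(-\frac{1}{Cs}\right) + C \right] ds \\
&\le C \exp\left(-\frac{1}{C D_1}\right) \left[ 1 + \frac{1}{h_1 h_2} \right] + C D_1,
\end{aligned} \tag{4.18}$$

where in the second line we used $D_1 \le 1$.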

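• For $s \in [D_1, D_2)$ with $D_1 \le 1 \le D_2 < T$, we start from the control (4.11), which we write as

$$|\kappa(s)| \le \int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x - x_0, y - y_0)|\, P_s(|\varphi_{h_1,h_2}(\cdot - x_0, \cdot - y_0)|)(x, y)\, \pi(x, y)\,dx\,dy + C.$$

Since $\varphi_{h_1,h_2}(\cdot - x_0, \cdot - y_0)$ vanishes outside the compact neighbourhood $K$ of $(x_0, y_0)$, we can use Lemma 2 to upper bound the semi group term. Hence, we get for some constant $C > 0$,

$$|\kappa(s)| \le C \int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x - x_0, y - y_0)| \times \left[ \frac{\|\varphi_{h_1,h_2}\|_{L^1(\mathbb{R}^2)}}{D_1^2} + \frac{1}{h_1 h_2}\, e^{-1/(C D_1)} \right] \pi(x, y)\,dx\,dy + C,$$

and we deduce $|\kappa(s)| \le C \left[ \frac{1}{D_1^2} + \frac{1}{h_1 h_2}\, e^{-1/(C D_1)} + 1 \right]$. This yields

$$\int_{D_1}^{D_2} |\kappa(s)|\,ds \le C \left[ \frac{D_2}{D_1^2} + \frac{D_2}{h_1 h_2}\, e^{-1/(C D_1)} + D_2 \right], \tag{4.19}$$

for some constant $C > 0$.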

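• For $s \in [D_2, T]$, we use the covariance control (2.4), which allows us to write

$$|\kappa(s)| \le C\,\|\varphi_{h_1,h_2}(\cdot - x_0, \cdot - y_0)\|_\infty^2\, e^{-\rho s} \le C \left( \frac{1}{h_1 h_2} \right)^2 e^{-\rho s},$$

for $C > 0$ and $\rho > 0$. It entails the upper bound

$$\int_{D_2}^T |\kappa(s)|\,ds \le C\,\frac{e^{-\rho D_2}}{(h_1 h_2)^2}. \tag{4.20}$$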

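Collecting together (4.8), (4.10), (4.18), (4.19), (4.20), we deduce

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) \le \frac{C}{T} \left[ \frac{\delta}{h_1 h_2} + \exp\left(-\frac{1}{C D_1}\right) \left[ 1 + \frac{D_2}{h_1 h_2} \right] + D_1 + D_2 + \frac{D_2}{D_1^2} + \frac{e^{-\rho D_2}}{(h_1 h_2)^2} \right],$$

for $C > 0$ some constant. We choose $\delta = \frac{6h_1}{y_0}$, $D_1 = \frac{1}{C\,|\log(h_1 h_2)|}$, $D_2 = \frac{|\ln((h_1 h_2)^2)|}{\rho}$.

By (3.4)–(3.5) we see that this choice is such that $\frac{6h_1}{y_0} = \delta < D_1 < 1 < D_2 < T$ for $T$ large enough. And it yields

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) \le \frac{C}{T} \left[ \frac{6}{y_0}\,\frac{1}{h_2} + h_1 h_2 + D_1 + D_2 + \frac{D_2}{D_1^2} + 1 \right].$$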

Since $h_1 \to 0$, $h_2 \to 0$, and as the value of $C$ may change from line to line, we can write that

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) \le \frac{C}{T} \left[ \frac{1}{h_2} + |\ln(h_1 h_2)|^C \right],$$

and we have shown (4.5).
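The effect of the thresholds $\delta$, $D_1$, $D_2$ chosen in this first step can be checked numerically: $D_1$ is tuned so that $\exp(-1/(C D_1)) = h_1 h_2$, and $D_2$ so that $e^{-\rho D_2}/(h_1 h_2)^2 = 1$. In the sketch below, the values of $h_1$, $h_2$, $y_0$, $\rho$ and $C$ are placeholders chosen for illustration, and the name `thresholds` is hypothetical.

```python
import math

def thresholds(h1, h2, y0, rho, C=1.0):
    """delta, D1, D2 as chosen in the first step of the proof (C and rho are placeholders)."""
    delta = 6.0 * h1 / y0
    D1 = 1.0 / (C * abs(math.log(h1 * h2)))
    D2 = abs(math.log((h1 * h2) ** 2)) / rho
    return delta, D1, D2

h1, h2, rho, C = 1e-4, 1e-2, 0.5, 1.0
delta, D1, D2 = thresholds(h1, h2, y0=1.0, rho=rho, C=C)

middle_factor = math.exp(-1.0 / (C * D1))            # equals h1 * h2 when h1 * h2 < 1
tail_term = math.exp(-rho * D2) / (h1 * h2) ** 2     # equals 1
```

This makes explicit why the terms carrying $1/(h_1 h_2)$ or $1/(h_1 h_2)^2$ in the collected bound reduce to quantities of logarithmic size.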

Second step: we prove (4.6).

We use the same decomposition of $\int_0^T |\kappa(s)|\,ds$ into four terms as for the proof of (4.5), but we treat the contribution of the short time correlations $\int_0^\delta |\kappa(s)|\,ds$ in a different way.

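• Let us find an upper bound for $\kappa(s)$ for $s \in (0, \delta)$ with $\delta < 1$. We recall that, from Lemma 1, the decomposition (4.12) holds true, where $\kappa_1(s)$ is given by (4.13) and $\kappa_2(s)$ is upper bounded by (4.17). We now study $\kappa_1(s)$. To this end, we remark that $p^G_s(x, y; x', y') \le \frac{C}{\sqrt{s}}\, q_s(x' \mid x, y, y')$, where

$$q_s(x' \mid x, y, y') = \frac{C}{s^{3/2}} \exp\left( -C^{-1}\, \frac{\left(x' - x - \frac{y + y'}{2}\,s\right)^2}{s^3} \right).$$

Let us stress that

$$\sup_{s \in (0,1)}\ \sup_{(x, y, y') \in \mathbb{R}^3} \int_{\mathbb{R}} q_s(x' \mid x, y, y')\,dx' \le C < \infty. \tag{4.21}$$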
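The bound (4.21) holds because $q_s$ is, up to a constant, a Gaussian density in $x'$ with standard deviation of order $s^{3/2}$, so its mass equals $C^{3/2}\sqrt{\pi}$ whatever $s$ and the centering. The following sketch checks this numerically; the value $C = 2$, the centering `shift` (standing for $x + \frac{y+y'}{2}s$) and the grid parameters are assumptions made for the illustration.

```python
import numpy as np

def q_mass(s, C=2.0, shift=0.3, half_width=12.0, n=200_001):
    """Numerically integrate q_s(x' | x, y, y') over x'.

    `shift` stands for x + (y + y') * s / 2; C is an arbitrary positive constant.
    """
    scale = np.sqrt(C * s**3)                  # the Gaussian factor has width ~ s**1.5
    xp = np.linspace(shift - half_width * scale, shift + half_width * scale, n)
    q = C / s**1.5 * np.exp(-(xp - shift) ** 2 / (C * s**3))
    return np.sum(q) * (xp[1] - xp[0])         # Riemann sum on a fine grid

masses = [q_mass(s) for s in (0.01, 0.1, 0.9)]
expected = 2.0**1.5 * np.sqrt(np.pi)           # closed form C**1.5 * sqrt(pi) for C = 2
```

The three entries of `masses` agree with `expected` up to quadrature error, which is the uniformity in $s$ used in the sequel.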

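Thus, using (4.13), we have

$$\kappa_1(s) \le \frac{C}{\sqrt{s}} \int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x - x_0, y - y_0)|\, \pi(x, y) \left[ \int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x' - x_0, y' - y_0)|\, q_s(x' \mid x, y, y')\,dx'\,dy' \right] dx\,dy.$$

By (3.3), we have $|\varphi_{h_1,h_2}(x' - x_0, y' - y_0)| \le \frac{C}{h_1} \left| \frac{1}{h_2}\, \varphi\!\left(\frac{y' - y_0}{h_2}\right) \right|$, and thus, using (4.21), we get

$$\begin{aligned}
\int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x' - x_0, y' - y_0)|\, q_s(x' \mid x, y, y')\,dx'\,dy'
&\le \frac{C}{h_1} \int_{\mathbb{R}} \left| \frac{1}{h_2}\, \varphi\!\left(\frac{y' - y_0}{h_2}\right) \right| \left[ \int_{\mathbb{R}} q_s(x' \mid x, y, y')\,dx' \right] dy' \\
&\le \frac{C}{h_1} \int_{\mathbb{R}} \left| \frac{1}{h_2}\, \varphi\!\left(\frac{y' - y_0}{h_2}\right) \right| dy' \le \frac{C}{h_1}.
\end{aligned}$$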

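We deduce that $\kappa_1(s) \le \frac{C}{\sqrt{s}\,h_1} \int_{\mathbb{R}^2} |\varphi_{h_1,h_2}(x - x_0, y - y_0)|\, \pi(x, y)\,dx\,dy \le \frac{C}{\sqrt{s}\,h_1}$. Combining the latter inequality with (4.12) and (4.17) yields

$$\begin{aligned}
\int_0^\delta |\kappa(s)|\,ds &\le \int_0^\delta \left[ \frac{C}{\sqrt{s}\,h_1} + C\,\frac{e^{-\frac{1}{Cs}}}{h_1 h_2} + C \right] ds \\
&\le \int_0^\delta \left[ \frac{C}{\sqrt{s}\,h_1} + C\,\frac{1}{s^2}\,\frac{e^{-\frac{1}{Cs}}}{h_1 h_2} + C \right] ds \\
&\le C \left[ \frac{\sqrt{\delta}}{h_1} + \frac{e^{-\frac{1}{C\delta}}}{h_1 h_2} + \delta \right],
\end{aligned} \tag{4.22}$$

where we have used in the second line that $\delta < 1$.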

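We now gather (4.8), (4.18), (4.19), (4.20), (4.22), to derive

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) \le \frac{C}{T} \left[ \frac{\sqrt{\delta}}{h_1} + \frac{e^{-\frac{1}{C\delta}}}{h_1 h_2} + \delta + \exp\left(-\frac{1}{C D_1}\right) \left[ 1 + \frac{D_2}{h_1 h_2} \right] + D_1 + D_2 + \frac{D_2}{D_1^2} + \frac{e^{-\rho D_2}}{(h_1 h_2)^2} \right].$$

We choose the same thresholds as in the first step, $\delta = \frac{6h_1}{y_0}$, $D_1 = \frac{1}{C\,|\log(h_1 h_2)|}$, $D_2 = \frac{|\ln((h_1 h_2)^2)|}{\rho}$. Recalling (3.4)–(3.5), we have $e^{-(C\,6h_1/y_0)^{-1}} = O(e^{-\varepsilon \log(T)^3}) = o(h_1 h_2)$, with some $\varepsilon > 0$. We derive that

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) \le \frac{C}{T} \left[ \frac{1}{\sqrt{h_1}} + |\ln(h_1 h_2)|^C \right],$$

for some $C > 0$. The second step of the proposition is proved.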

Remark 3. (1) Proposition 2 actually consists of the two upper bounds (4.5)–(4.6) for the variance of the estimator. We see that one of these two bounds is smaller than the other, depending on the relative positions of $h_2$ and $\sqrt{h_1}$. It explains why the expression for the rate of convergence of the estimator in Theorem 1 depends on the relative positions of $k_1$ and $k_2/2$, which determines which one of the two bounds (4.5) or (4.6) is used in the bias/variance decomposition of the estimation error (see the proof of Theorem 1).

(2) The control of $\kappa(s) = \mathrm{Cov}(\varphi_{h_1,h_2}(X_0 - x_0, Y_0 - y_0), \varphi_{h_1,h_2}(X_s - x_0, Y_s - y_0))$ for $s \in [\delta, D_1]$, with $\delta \approx h_1$ and $D_1 \le 1$, depends on the fine structure of the main term (4.2) in the short time expansion of the transition density of the process and on the fact that $y_0 \ne 0$. In the situation $y_0 = 0$, it is impossible to get such a refined result on the covariance, and eventually the bound on the variance of the estimator is larger (see Proposition 3).

4.2. Proof of Proposition 3. We need to prove that the following two inequalities hold true, for $T$ large enough:

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) \le C\,\frac{1}{T h_1^{2/3}} + C\,\varepsilon(T, h_1(T), h_2(T)), \tag{4.23}$$

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) \le C\,\frac{\ln(T)}{T h_2^2} + C\,\varepsilon(T, h_1(T), h_2(T)). \tag{4.24}$$

Again we consider $K$ a compact set of $\mathbb{R}^2$ that contains a ball of radius $\sqrt{2}$ centered at $(x_0, y_0)$.

First step: let us prove (4.23).

We recall the control (4.8) for the variance of $\hat\pi_T(x_0, y_0)$ and split the integral in (4.8) into four pieces corresponding to the partition $[0, T] = [0, \delta) \cup [\delta, D_1) \cup [D_1, D_2) \cup [D_2, T]$, where $\delta$, $D_1$, $D_2$ will be specified later. Let us stress that in the proof of Proposition 2, only the control of $|\kappa(s)|$ for $s \in [\delta, D_1)$ uses the fact that $y_0 \ne 0$.

• For $s \in [0, \delta)$ with $\delta < 1$, we recall the result obtained in (4.22), which states

$$\int_0^\delta |\kappa(s)|\,ds \le C \left[ \frac{\sqrt{\delta}}{h_1} + \frac{e^{-\frac{1}{C\delta}}}{h_1 h_2} + \delta \right].$$

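• For $s \in [\delta, D_1]$ with $0 < \delta < D_1 < 1$, exactly with the same proof as in Proposition 2, we have $|\kappa(s)| \le \kappa_1(s) + \kappa_2(s) + C$, where $\kappa_1(s)$ is given by (4.13) and $\kappa_2(s)$ is upper bounded as in (4.17). We need to find a control on $\kappa_1(s)$ in the situation $y_0 = 0$. Using, from (4.2), that $p^G_s(x, y; x', y') \le \frac{C}{s^2}$, $\forall ((x, y), (x', y')) \in K^2$, and the fact that $\pi$ is bounded, we get

$$\kappa_1(s) \le \frac{C}{s^2} \int_{\mathbb{R}^4} |\varphi_{h_1,h_2}(x - x_0, y - y_0)|\,|\varphi_{h_1,h_2}(x' - x_0, y' - y_0)|\,dx\,dy\,dx'\,dy' \le \frac{C}{s^2}.$$

We deduce

$$\begin{aligned}
\int_\delta^{D_1} |\kappa(s)|\,ds &\le \int_\delta^{D_1} C \left[ \frac{1}{s^2} + \frac{e^{-\frac{1}{Cs}}}{h_1 h_2} + C \right] ds \\
&\le C \left[ \frac{1}{\delta} + \exp\left(-\frac{1}{C D_1}\right) \frac{1}{h_1 h_2} + D_1 \right] \\
&\le C \left[ \frac{1}{\delta} + \exp\left(-\frac{1}{C D_1}\right) \frac{1}{h_1 h_2} \right],
\end{aligned} \tag{4.25}$$

where we used $D_1 \le 1$.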

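• For $s \in [D_1, D_2)$, with $D_1 < 1 < D_2 < T$, we use the control (4.19).

• For $s \in [D_2, T]$, with $D_2 < T$, we use the control (4.20).

Collecting the four previous controls, we deduce

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) \le \frac{C}{T} \left[ \frac{\sqrt{\delta}}{h_1} + \frac{e^{-\frac{1}{C\delta}}}{h_1 h_2} + \frac{1}{\delta} + \frac{\exp\left(-\frac{1}{C D_1}\right)}{h_1 h_2}\,(1 + D_2) + D_1 + D_2 + \frac{D_2}{D_1^2} + \frac{e^{-\rho D_2}}{(h_1 h_2)^2} \right],$$

where $C > 0$, $\rho > 0$. We choose $\delta$ that balances $\sqrt{\delta}/h_1$ with $1/\delta$, namely $\delta = h_1^{2/3}$, which is smaller than 1 for $T$ large enough, recalling (3.5). Next, we choose $D_2 = \frac{|\ln((h_1 h_2)^2)|}{\rho}$ and $D_1 = \frac{C}{|\log(h_1 h_2)|}$. We deduce

$$\mathrm{Var}(\hat\pi_T(x_0, y_0)) \le \frac{C}{T} \left[ \frac{1}{h_1^{2/3}} + |\log(h_1 h_2)|^C \right],$$

for some $C > 0$, and where we have used that, from (3.4)–(3.5), $\exp(-1/(C h_1^{2/3})) = O(\exp(-\varepsilon \log(T)^2)) = o(h_1 h_2)$, with some $\varepsilon > 0$. Hence (4.23) is proved.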
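The choice $\delta = h_1^{2/3}$ solves $\sqrt{\delta}/h_1 = 1/\delta$, and both terms are then equal to $h_1^{-2/3}$. The short sketch below verifies this balance for one illustrative value of $h_1$; the name `balanced_delta` is introduced here for convenience.

```python
def balanced_delta(h1):
    """delta solving sqrt(delta) / h1 == 1 / delta, i.e. delta = h1 ** (2/3)."""
    return h1 ** (2.0 / 3.0)

h1 = 1e-3                                  # illustrative bandwidth
delta = balanced_delta(h1)
short_time_term = delta ** 0.5 / h1        # contribution of [0, delta), from (4.22)
middle_term = 1.0 / delta                  # contribution of [delta, D1), from (4.25)
```

By comparison, when $y_0 \ne 0$ the threshold $\delta = 6h_1/y_0$ can be taken much smaller because the middle term is exponentially small, which is where the better bound $1/\sqrt{h_1}$ of Proposition 2 comes from.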

Second step: we show (4.24).

Compared with the first part of the proposition, we have to modify our upper bound on $\int_0^\delta |\kappa(s)|\,ds$ with $\delta < 1$. We split this integral into two parts, $\int_0^{\delta'} |\kappa(s)|\,ds + \int_{\delta'}^{\delta} |\kappa(s)|\,ds$. On the first part, we use the control (4.9) and get

$$\int_0^{\delta'} |\kappa(s)|\,ds \le C\,\frac{\delta'}{h_1 h_2}. \tag{4.26}$$
