• Aucun résultat trouvé

Geometric ergodicity for families of homogeneous Markov chains

N/A
N/A
Protected

Academic year: 2021

Partager "Geometric ergodicity for families of homogeneous Markov chains"

Copied!
39
0
0

Texte intégral

(1)

HAL Id: hal-00455976

https://hal.archives-ouvertes.fr/hal-00455976v2

Preprint submitted on 8 May 2012

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Geometric ergodicity for families of homogeneous Markov chains

Leonid Galtchouk, Serguei Pergamenchtchikov

To cite this version:

Leonid Galtchouk, Serguei Pergamenchtchikov. Geometric ergodicity for families of homogeneous Markov chains. 2010. �hal-00455976v2�

(2)

Geometric ergodicity for classes of homogeneous Markov chains

L. Galtchouk S. Pergamenshchikov May 8, 2012

Abstract

The paper deals with non asymptotic computable bounds for the geometric convergence rate of homogeneous ergodic Markov processes.

Some sufficient conditions are stated for simultaneous geometric ergod- icity of Markov chain classes. This property is applied to nonparametric estimating in ergodic diffusion processes.

MSC : 60F10, 60J05

Key words and phrases: Homogeneous Markov chain; Geometric ergodicity;

Convergence rate; Coupling renewal processes; Lyapunov function; Renewal theory; Non asymptotic exponential upper bound; Ergodic diffusion processes.

The second author is partially supported by the RFFI-Grant 09-01-00172-a.

IRMA,Department of Mathematics, Strasbourg University, 7, rue R´en`e Descartes, 67084, Strasbourg, France, e-mail: leonid.galtchouk@math.unistra.fr

Laboratoire de Math´ematiques Raphael Salem, Avenue de l’Universit´e, BP. 12, Uni- versit´e de Rouen, F76801, Saint Etienne du Rouvray, Cedex France and Department of Mathematics and Mechanics,Tomsk State University, Lenin str. 36, 634041 Tomsk, Russia, e-mail: Serge.Pergamenchtchikov@univ-rouen.fr

(3)

1 Introduction

In this paper we study a class of homogeneous Markov chains (Φ)n≥0 with values in some measurable space (X, B(X)) defined by a parametrized family of transition probabilities

(Pϑx)x∈X, ϑ∈Θ, (1.1)

where Θ is a parametric set for this family. For each ϑ Θ the sequence Φ = (Φn)n≥0 is a homogeneous Markov chain defined on the measurable space (X ,B(X)) with a transition probability Pϑ, i.e.

P(Φ1 Γ|Φ0 =x) =Pϑx(Γ).

Our main goal is to state geometric ergodicity for this class simultaneously over all values of the parameter ϑΘ.

Geometric ergodicity is studied in a number of papers (see, for example, [1], [16]-[19]). We remind that a chain (Φn)n≥0 on the space (X,B(X)) with a invariant measure π is calledgeometrically ergodicif there exist a X →[1,[ function V(x) and some constants R >0,κ >0 such that, for any n 1,

sup

x∈X

sup

0≤g≤V

1

V(x)|Exg(Φn)π(g)| ≤R e−κn. (1.2) As we shall see later (see, Definition 6.1 below) the function V, providing the drift condition, is given by the Lyapunov functions (see, e.g. [13] in the case of diffusion processes and [7], [11], [12] for Markov chains). For this reason, in the sequel, we shall call such functions by the Lyapunov functions.

The property (1.2) is useful in applied problems related to the identifica- tion of stochastic systems, described by stochastic processes with dependent values, in particular, governed by stochastic difference or stochastic differential equations. Necessity of simultaneous geometric ergodicity appears in statistics, when one studies a minimax risk with respect to some family of distributions related to a statistical experiment. In particular, in this paper we shall ap- ply simultaneous geometric ergodicity to nonparametric estimating the drift coefficient in the stochastic differential equation (see [2]):

dyt=S(yt) dt+σ(yt)dWt, 0tT , (1.3) whereSandσ are unknown functions andShas to be estimated from observa- tions (yt)0≤t≤T. In studying minimax risks for kernel estimators we need to use geometric ergodicity for the process (1.3) simultaneously over all coefficients S and σ from some functional class (see (2.7) below). In this case, ϑ = (S, σ) is the class parameter.

It is clear that to apply the property (1.2) to some distribution family we have to find some explicit expressions for the parameters R > 0 and κ >

(4)

0. It is a well-known problem in the Markov Chain Monte Carlo (MCMC) theory, when a stopping rule for simulations is based on the accuracy of n- step approximations. Therefore, we need to find computable bounds in (1.2).

Note that some explicit expressions for R and κ were calculated in [1], [17], [21] and [22] for ψ-irreducible homogeneous Markov chains. These results are not applicable in our case because it is not clear what does it mean ψ- irreducibility for a class of parametrized Markov chains. Note that in [1] and [17] the parameters R and κ were obtained by making use of the Kendall theorem. In [22] these parameters are obtained through the direct coupling method for the Markov chain. To this end the authors impose in [21]–[22]

some additional assumptions which are not satisfied in the case of the diffusion process (1.3) (see Remark 3.1 in Section 3). Moreover, it should be noted that the upper bound in (1.2) given in [21] is calculated under the assumption that the minorization condition holds on the whole state space. This assumption is never true for the model (1.3).

In the paper we apply the coupling method to the renewal process generated by entrance times of the process into the minorization set. Note, that Meyn and Tweedie use the same approach in order to obtain convergence results (see, [16], chapter 13). Their results imply the power convergence rate. To obtain the geometric rate we make use of the Lyapunov functions method for the related coupling process.

In order to explain the novelty of the method introduced in the paper, we give the scheme of proving the property (1.2). The first step consists in passing to a splitting chain, which yields a chain with an accessible atom.

Then, one makes use of the Regenerative Decomposition for splitting chains in order to evaluate the convergence rate (see [10], [16]). Let us remind that the principal term in this decomposition gives the deviation in the renewal theorem, which may be evaluated thanks to the Kendall renewal theorem that provides a geometric convergence rate. In our case the same convergence rate is obtained thanks to making use of the Lyapunov functions method for the coupling renewal process (see Theorem 4.1 in Section 4). This upper bound enables us to find the explicit non asymptotic exponential upper bound in the ergodic theorem for which we can find the supremum over the transition probability family in (1.1).

In this paper we find some sufficient conditions which provide simultaneous geometric ergodicity of the family (1.1) over all values of the parameterϑ. We check these conditions for the diffusion model (1.3). As corollary, we obtain explicit upper bounds for geometric convergence rate in the ergodic theorem for diffusion processes. These bounds may be used in the Monte Carlo technique to calculate some functionals of ergodic diffusion processes. In that case one can replace these functionals by the corresponding integrals with respect to the invariant density which has a simple explicit form. The accuracy of this

(5)

approximation is given by the explicit non asymptotic bounds in the geometric convergence rate for diffusion processes.

The paper is organized as follows. In the next section the main results are stated. Section 3 provides the explicit formulas for parameters in the geometric convergence rate. Section 4 is devoted to related coupling renewal processes. In section 5 geometric ergodicity is proved for a parametrized class of homogeneous Markov chains. In section 6 we apply this property to stochastic differential equations. Some basic results on homogeneous Markov chains are given in the Appendix.

2 Main results

Assume now, that the family of transition probabilities (Pϑ)ϑ∈Θ satisfies the following conditions

H1) There exist 0< δ <1, some set C ∈ B(X) and some probability measure ν on B(X) with ν(C) = 1 such that, for any A∈ B(X) with ν(A)>0,

x∈Cinf

ϑ∈Θinf Pϑ(x, A)δν(A)

> 0. (2.1)

For the sequel we denote η = inf

x∈C

ϑ∈Θinf Pϑ(x, C)δ

. (2.2)

H2) There exist X → [1,) function V, some constants 0 < ρ < 1, D 1, and a set C from B(X) such that

V = sup

x∈C

V(x) < and, for any x∈ X,

sup

ϑ∈Θ

Eϑx(V1)) (1ρ)V(x) + D1C(x). (2.3)

Here Eϑx means the expectation with respect to the transition probability Pϑ(x,·).

Remark 2.1. Condition H2) is called the uniform drift condition and that of H1) is the uniform minorization condition.

(6)

Theorem 2.1. Assume the conditions H1)–H2) hold true with the same set C ∈ B(X). Then, for each θ Θ, the chainΦadmits an invariant distribution πϑ on B(X). Moreover, for any n2,

sup

ϑ∈Θ

sup

x∈X

sup

0<f≤V

1 V(x)

Eϑxfn) Z

X

f(z)πϑ(dz)

Re−κn, (2.4) where the parameters R = R(ρ, δ, D, η, V) and κ = κ(ρ, δ, D, η1, V) are given in (3.5).

Apply now to the process (1.3). To this end we have to introduce some func- tional class of functions ϑ = (S, σ). First, for some x 1, M > 0 and L > β >0, we denote by V1 the class of functions S from C1(R) such that

sup

|x|≤x

|S(x)|+|S(x)˙ |

M and

L inf

|x|≥x

S(x)˙ sup

|x|≥x

S(x)˙ ≤ −β .

Second, for some fixed reals σmin >0 and σmax > σmin, we denote by V2 the class of functions σ fromC2(R) such that, for allxR,

σmin min (|σ(x)|, |σ(x)˙ |, |σ(x)¨ |)max (|σ(x)|,|σ(x)˙ |,|σ(x)¨ |)σmax. Finally, we set

Θ = V1× V2. (2.5)

Note that (see, for example, [6]), for any function ϑ from Θ, the equation (1.3) admits a unique strong solution, which is an ergodic process having an invariant measure πϑ with the invariant density qϑ defined as

qϑ(x) = σ−2(x) exp{Rx

0 S1(v)dv} R+∞

−∞ σ−2(z) exp{Rz

0 S1(v)dv}dz , (2.6) where S1(v) = 2S(v)/σ2(v).

Theorem 2.2. For any 0< ǫ1/2 and t >0,

sup

ϑ∈Θ

sup

x∈R

sup

0<g≤1

Eϑxg(yt)R

Rg(x)qϑ(x)dx

(1 +x2)ǫ Rǫe−κǫt, (2.7) where the parameters Rǫ >0 and κǫ >0 are given in (3.13).

(7)

Remark 2.2. Note that the property (2.7) is called simultaneous geometric ergodicity. As is shown in Section 6, the function (1 +x2)ǫ is a Lyapunov function. It should be said that in [20], for the process (1.3) with a constant diffusion coefficient (i.e. for σ = 1), an exponential bound for deviation (2.7) was obtained. The proof of that result was based on the coupling method applied directly to the diffusion process (1.3) provided the existence of a Lyapunov function. In contrast with [20], in our paper an explicit family of Lyapunov functions is given that is of help in applications.

Remark 2.3. It should be noted that the inequality (2.7) may be applied to Monte Carlo calculation of the expectation Eϑxg(yt). Indeed, the previous ex- pectation can be replaced with the integral of g with respect to the invariant density (2.6). The precision of such approximation is given in (2.7).

3 Computable bounds for geometric conver- gence rate

In this section we introduce the parametersR andκ which make explicit the upper bound for geometric ergodicity in (2.4). For any 0 < γ <1, we denote

B = ˇU

1 + UˇV 1(1δη1)γ

, (3.1)

where η1 =η/(1δ) and Uˇ = max

1ρ+D

(1δ)(1ρ)1−γ(1(1ρ)γ), V (1ρ)1−γ

.

We remind that the parameter η appears in the condition H1). Moreover, we

put

r = (1γ)2|ln(1ρ)| |ln(1δη1)| ln( ˇUV) +|ln(1δη1)| ; el = 2 +

ln(VB)

r +lneq(1e−r)−1 2r

,

(3.2)

where

e

q = 1Bˇ1

2 and Bˇ1 = min

e−r, δ er1 +δη1

. Here [a] means the integer part of a reel number a >0. Next,

Ae= VB+ 1

(1e−r)(1e−γr) and ̺e1 = (1γ)2er|ln(1eǫ)|

lnAe+rel+|ln(1eǫ)|, (3.3)

(8)

where er =|ln(1eq)| and eǫ =δη1(1δ)el−2. Now we introduce the following parameter

Ae3 = 1γ γ

3 + 2 erAe

1 +Aee rel (1(1eǫ)γ) (er1)

VB+eeκ (3.4)

and

e

κ= (1γ)̺e1r e

̺1+ lnVB .

We define the parameters R and κ in the Theorem 2.1 as

κ =κ(ρ, δ, D, V) = (1γ)̺e1r 2(̺e1+ lnVB); R =R(ρ, δ, D, V) = 2(Ae2 + 1)eκ+ 1

eκ1 V (B)2 .

(3.5)

Further we define the upper bound in the ergodic Theorem 2.2. First of all, we take the set C in the minorization condition H1) as the intervalC = [K, K] for someK >0 and we chose the measure ν as the uniform distribution onC, i.e. for any measurable set A fromB(R),

ν(A) = 1

2K mes(A[K , K]), (3.6) where mes(·) is the Lebesgue measure on R.

In order to define the threshold δ >0 we need the quadratic function

(z) =ω1z2+ω2z+ω3 (3.7) with ω1 = 4(Lσ)2 + + 1, ω2 = σmax +H0 and ω3 = H1 + 2(H0)2, where

H0 = M +σ

σmin , H1 =M(1 +σ) +L+ 2σ2 and σ = σmax σmin .

We chose the parameter δ as

δK = 3Ke−Ω(K)e 2

2πσmax , (3.8)

where Ke =K/σmin. Now we set K0 =p

M1+ 8σmin+ 8ˇσ+ 4q ˇ

σ(M1+M2) + 16ˇσ2, (3.9)

(9)

where ˇσ =σ2max/(β(1e−β)),

M1 = (M+βx)2+σmax2 β

β2 and M2 =M1(1e−β). We set

ηK = 1δK σ(K2+M2)

(K2M1)2 . (3.10)

Note, that for any K K0 δK 3Kee Ke2

2

3

4

and ηK 3(

1) 4

.

Therefore, Propositions A.7–A.8 imply the condition H1) with the parameters δK andηK defined in (3.8) and (3.10) for anyK K0. To check the condition H2) we set

b0 =β(2 +x)(1 +x) + (M + 3)(3 +x) and b1 = b0

. (3.11) Propositions 6.3–6.4 imply, that for any 0 < ǫ 1/2, the diffusion process (1.3) satisfies the drift condition H2) withV(x) = (1 +x2)ǫ,C = [Kǫ, Kǫ],

ρǫ = (1ˇǫ) (1e−2ǫβ) and Dǫ =Vǫe−2ǫβ+b1(1e−2ǫβ), (3.12) where

Kǫ =K0+ sb1

ˇ ǫ

1/ǫ

1 and Vǫ = (1 +Kǫ2)ǫ. Now we define

κǫ =κǫ, δǫ, Dǫ, ηǫ, Vǫ) and Rǫ =Rǫ, δǫ, Dǫ, ηǫ, Vǫ), (3.13) where the functions κ and R are defined in (3.5), δǫ = δK

ǫ, ηǫ = ηK

ǫ and Dǫ =DK

ǫ.

Remark 3.1. Note that we can not apply the bounds for geometric convergence rate from the paper [22] (Theorem 12) and [21] (Theorem 8) since there the bounds were obtained under the condition

(1 +K2)ǫ > 2DK

ρǫ . (3.14)

This condition is not satisfied in our case for sufficiently small values of the parameter ǫ > 0. In fact, we apply the bounds (2.7) in [5] to point-wise esti- mating the drift coefficientS under observations of the process (1.3)at discrete times (tk)k≥1. The question of interest is the behavior of kernel estimators that are nonlinear functionals of observations. In order to study the non asymptotic estimating precision one needs of some concentration inequalities [4] based on the bounds (2.7) with the parameter ǫ >0 no matter how small.

(10)

In order to illustrate the behavior of the geometric rate, we suppose that the parameters satisfy the following conditions:

limδ→0η= 1, limδ→0 1η

δ = +, limρ→1 D

V = 0 and limρ→1 lnV

|ln(1ρ)| = 1.

(3.15)

It should remark that these conditions hold true for the process (1.3) with the parametric set (2.5) as L > β → ∞ and ˇǫ0 for some fixed ǫ >0.

It should be noted, that the geometric rate in (1.2) obtained in [17] satisfies κ =O(δ8) as δ0.

In [22] under the condition (3.14) this rate is “best“, i.e.

κ =O(δ) as δ 0.

Under the conditions (3.15) the coefficient κ defined in (3.5) κ = (c1+o(1))δ13/2+µ0(γ)

|lnδ|2 as δ 0, ρ1, where c1 >0 and µ0(γ)0 as γ 0.

Remark 3.2. Note that the condition L, β → ∞ on the drift function of the process (1.3) concerns the behavior of the function only outside of the interval [x, x], i.e. outside of the informative part of the function S. Remind (see, [3]), that the class (2.5) is used to bound the function S on the interval [x, x] and outside of the interval the conditions are imposed to preserve ergodicity.

Remark 3.3. It should remark that, unfortunately, the bounds from the pa- pers [1], [17] are not applicable, in general case, to classes of Markov chains.

Indeed, irreducibility is one of conditions providing the geometrical rate in [17].

Therefore, in the case of a parametric Markov chain class, irreducibility mea- sure should depend on a class parameter. It is not clear, what will going on this measure when one takes the supremum over the class parameter and how one needs to change the irreducibility condition in order to obtain uniform bounds over the class parameter ϑΘ by using the proof in [17].

4 Coupling Renewal Method

In this Section we shall obtain a non asymptotic upper bound with explicit constants in the renewal theorem by making use of the coupling method. The notions used here can be found in [8], [11], [14].

(11)

Let (Yj)j≥0 and (Yj)j≥0 be two independent sequences of random variables taking values in N. Assume that the initial random variables Y0 and Y0 have distributionsa= (a(k))k≥0 and b= (b(k))k≥0, respectively, i.e. for anyk 0,

P(Y0 =k) =a(k) and P(Y0 =k) =b(k).

The sequences (Yj)j≥1 and (Yj)j≥1 are supposed to be the i.i.d. sequences with the same distribution p= (p(k))k≥0, i.e. for any k 0,

P(Y1 =k) =P(Y1 =k) =p(k).

We assume also that p(0) = P(Y1 = 0) = 0, i.e. the sequences (Yj)j≥1 and (Yj)j≥1 take values in N =N\ {0}. Moreover, we suppose that the distribu- tions a, b and psatisfy the following condition

C) There exists a real number r >0 such that ln max (EerY0, EerY1)

υ(r) and ln

EerY0

υ(r). (4.1) For any n0, we define the following stopping times

tn = inf{k 0 : Xk

i=0

Yi > n} and tn= inf{k 0 : Xk

i=0

Yi > n}.

Further, we set

Wn =

tn

X

j=0

Yjn and Wn =

tn

X

j=0

Yjn . (4.2) It is easy to see that the sequences (Wn)n≥1 and (Wn)n≥1 are homogeneous Markov chains taking values in N such that, for any n, k and l from N,

P Wn =k|Wn−1 =l

=P

Wn =k|Wn−1 =l

=p(k)1{l=1}+1{k=l−1}1{l≥2}. (4.3) Firstly we study the entrance times (sk)k≥0 of the chain (Wn)n≥0 to the state {1} which are defined ass−1 = 0 and for k 0

sk= inf{l sk−1+ 1 : Wl= 1}. (4.4) One can check directly that the stopping times sk, k 1, can be represented as

sk =s0+ Xk

j=1

ςj. (4.5)

(12)

One can check directly that in this case (ςj)j≥1 are i.i.d. random variables independent of s0 and, for any l N,

P(ς1 =l) =P(s0 =l|W0 = 1) =p(l), (4.6) i.e. the random variables (ςj)j≥1 have the same distribution as Y1. Now we study the properties of the stopping time s0.

Proposition 4.1. Assume that the condition C) holds. Then Eers0 3eυ(r).

Proof. First of all, note that, for k2 and l 0, P(s0 =l|W0 =k) = 1{l=k−1}. Moreover, taking into account that, for any k 1,

P(W0 =k) =a(0)p(k) +a(k), (4.7) we obtain that

Eers0 =P(W0 = 1)EerY1+ X k=2

er(k−1)P(W0 =k)

EerY1 +a(0)e−rEerY1 +e−rEerY0 3eυ(r). Hence Proposition 4.1.

Now, we introduce the embedded Markov chain (Zk)k≥0 by Zk =Ws

k (4.8)

and the corresponding entrance time to the state {1}:

̟ = inf{k 1 : Zk= 1}. (4.9) In order to study the property of this stopping time, we need of the following notations

l =l(r) = 2 +

"

ln e(r)(1e−r)−1q(r) 2r

#

, (4.10)

where q(r) = (1 eυ1(r))/2 and the parameter υ1(r) < 0 will be specified below. Moreover, for any 0 < γ , ǫ <1, we set

A1(r) = A(r)1 +A(r)erl

1(1ǫ)γ , (4.11) where

A(r) = 1 +Q(r)erl(r)

(1q(r))1−γ (1(1q(r))γ) and Q(r) = eυ(r) 1e−r .

(13)

Proposition 4.2. Assume the condition C). Then, for any 0 < γ < 1, for any ǫ >0 and υ1(r)≤ −r for which

0< ǫ min

1≤j≤l−1 p(j) and lnEe−rY1 υ1(r), (4.12) one has

Ee̺̟ Q1(r)A1(r), where Q1(r) =eυ(r)+eυ(r)+Q(r) and

̺ =̺(r) = (1γ)2|ln(1q(r))| |ln(1ǫ)|

lnA(r) +rl+|ln(1ǫ)| . (4.13) Proof. Firstly we note that the sequence (4.8) is a homogeneous Markov chain with values in N such that, for any mand l fromN and for anyk 1,

P(Zk=m|Zk−1 =l) =P(Wς

1−l =m|Y0 = 0). (4.14) Note, thatWk =k fork < 0 under the condition Y0 = 0. Therefore, for any positive function V,

E[V(Z1)|Z0 =l] =EV(lς1)1

1<l}+EQ(ς1l)1

1≥l}, (4.15) where Q(n) =E(V(Wn)|Y0 = 0). Using the distribution (4.3) yields

Q(n)E V(Wn−11)|Y0 = 0

+EV(Y1)P Wn−1 = 1|Y0 = 0 . Choosing now V(x) =erx one has

Q(n)e−rQ(n1) +EV(Y1).

From the last inequality, taking into account that Q(0) = EV(Y1), it follows that, for any n 1,

Q(n)e−nrQ(0) +EV(Y1) Xn

j=1

e−(n−j)r Q(r), (4.16) where the upper bound Q(r) is defined in (4.10). This implies that the last term in (4.15) can be estimated as

EQ(ς1l)11≥l} Q(r)eυ(r)−rl. Therefore,

E (V(Z1)|Z0 =l)

V(l) eυ1(r)+Q(r)eυ(r)−2rl.

Références

Documents relatifs

Les deux chapitres suivants présentent les résultats nouveaux obtenus en appliquant cette stratégie : le chapitre 3 présente des bornes à la Berry-Esseen dans le cadre du

For any positive power of the ab- solute error loss and the same H¨older class, the sharp constant for the local minimax risk is given in Galtchouk and Pergamenshchikov (2005) as

In this section, the weak spectral method is applied to the three classical models cited in Section 1: the v-geometrically ergodic Markov chains, the ρ-mixing Markov chains, and

Before presenting our results, we give a brief review of well-known multidimensional renewal theorems in both independent and Markov settings, as well as some general comments

Renewal theory and computable convergence rates for geometrically ergodic Markov chains.. Geometric convergence of Markov chains through a local

Since we have seen in the previous section that the existence of Lyapunov functions furnishes functional inequalities in the symmetric case, in this section we shall study

The aim of this paper is to prove the exponential ergodicity towards the unique invariant probability measure for some solutions of stochastic differential equations with finite

More specifically, studying the renewal in R (see [Car83] and the beginning of sec- tion 2), we see that the rate of convergence depends on a diophantine condition on the measure and