Moderate deviations for stationary sequences of bounded random variables

(1)

www.imstat.org/aihp 2009, Vol. 45, No. 2, 453–476

DOI: 10.1214/08-AIHP169

Moderate deviations for stationary sequences of bounded random variables

Jérôme Dedecker

^a

, Florence Merlevède

^b

, Magda Peligrad

^c,1

and Sergey Utev

^d

aUniversité Paris 6, LSTA, 175 rue du Chevaleret, 75013 Paris, France

bUniversité Paris Est-Marne la Vallée, LAMA and CNRS UMR 8050, 5 Boulevard Descartes, Cité Descartes-Champs sur Marne, 77 454 Marne La Vallée Cedex 2, France.

cDepartment of Mathematical Sciences, University of Cincinnati, P.O. Box 210025, Cincinnati, OH 45221-0025, USA dSchool of Mathematical Sciences, University of Nottingham, Nottingham, NG7 2RD, UK

Received 9 November 2007; accepted 3 March 2008

Abstract. In this paper we derive the moderate deviation principle for stationary sequences of bounded random variables under martingale-type conditions. Applications to functions ofφ-mixing sequences, contracting Markov chains, expanding maps of the interval, and symmetric random walks on the circle are given.

Résumé. Dans cet article, nous établissons un principe de déviation modérée pour des suites stationnaires de variables aléatoires bornées sous différentes conditions projectives. Nous appliquons ces résultats aux suitesφ-mélangeantes, à certaines chaînes de Markov contractantes, aux transformations uniformément dilatantes de l’intervalle, ainsi qu’à la marche aléatoire symétrique sur le cercle.

MSC:60F10; 60G10

Keywords:Moderate deviations; Martingale approximation; Stationary processes

1. Introduction

For the stationary sequence(X_i)_i_∈_Zof centered random variables, define the partial sums and the normalized partial sums process by

S_n= n

j=1

X_j and W_n(t )=n⁻^1/2

[nt]

i=1

X_i.

In this paper we are concerned with the moderate deviation principle for the normalized partial sums processW_n, considered as an element ofD([0,1])(functions on[0,1]with left-hand side limits and continuous from the right), equipped with the Skorohod topology (see Section 14 in [2]) for the description of the topology onD([0,1])). More exactly, we say that the family of random variables{W_n, n >0}satisfies the Moderate Deviation Principle (MDP) in D[0,1]with speedan→0 and good rate functionI (·), if the level sets{x, I (x)≤α}are compact for allα <∞, and for all Borel sets

− inf

t∈Γ⁰

I (t )≤lim inf

n a_nlogP√

a_nW_n∈Γ

≤lim sup

n

a_nlogP√

a_nW_n∈Γ

≤ −inf

t∈ ¯Γ

I (t ). (1)

1Supported in part by a Charles Phelps Taft Memorial Fund grant and NSA Grant H98230-07-1-0016.

(2)

The moderate deviation principle is an intermediate behavior between the central limit theorem (an=a) and large deviations (an=a/n). Usually, MDP has a simpler rate function, inherited from the approximated Gaussian process, and holds for a larger class of dependent random variables than the large deviation principle.

De Acosta and Chen [5] used the renewal theory to derive the MDP for bounded functionals of geometrically ergodic stationary Markov chains. Puhalskii [21] and Dembo [10] applied the stochastic exponential to prove the MDP for martingales. Starting from the martingale case and using the so-called coboundary decomposition due to Gordin [18] (Xk=Mk+Zk−Zk+1, whereMk is a stationary martingale difference), Gao [17] and Djellout [15] obtained the MDP forφ-mixing sequences with summable mixing rate. In the context of Markov chains, the coboundary decomposition is known as the Poisson equation. Starting from this equation, Delyon, Juditsky and Liptser [9] proved the MDP forn⁻^1/2_n

k=1H (Y_k), whereHis a Lipschitz function, andY_n=F (Y_n₋₁, ξ_n), whereF satisfies|F (x, z)− F (y, t )| ≤κ|x−y| +L|z−t|withκ <1, and(ξ_n)_n_≥₁is an iid sequence of random variables independent ofY₀. In their paper, the random variables are not assumed to be bounded: the authors only assume that there exists a positiveδsuch thatE(e^δ^|^ξ¹^|) <∞. They strongly used the Markov structure to derive some appropriate properties for the coboundary (see their Lemma 4.2).

In this paper we propose a modification of the martingale approximation approach that allows to avoid the coboundary decomposition and thus to enlarge the class of dependent sequences known to satisfy the moderate deviation principle. Recent or new exponential inequalities are applied to justify the martingale approximation. The conditions involved in our results are well adapted to a large variety of examples, including regular functionals of linear processes, expanding maps of the interval and symmetric random walks on the circle.

The paper is organized as follows: In Section 2 we state the main results. A discussion of the conditions, clari- fications, and some simple examples and extensions follow. Section 3 describes the applications, while Section 4 is dedicated to the proofs. Several technical lemmas are proved in the Appendix.

2. Results

From now on, we assume that the stationary sequence (X_i)_i_∈_Z is given by X_i =X₀◦Tⁱ, whereT:Ω →Ω is a bijective bimeasurable transformation preserving the probabilityP on (Ω,A). For a subfield F0 satisfying F0⊆ T⁻¹(F0), letFi=T⁻ⁱ(F0). ByX∞we denote theL_∞-norm, that is the smallestusuch thatP(|X|> u)=0.

Our first theorem and its corollary treat the so-called adapted case,X0beingF0-measurable and so the sequence (X_i)_i_∈_Zis adapted to the filtration(Fi)_i_∈_Z.

Theorem 1. Assume thatX₀_∞<∞and thatX₀isF0-measurable.In addition,assume that ∞

n=1

n⁻^3/2E(Sn|F0)

∞<∞, (2)

and that there existsσ²≥0with

nlim→∞n⁻¹E S_n²|F0

−σ²

∞=0. (3)

Then,for all positive sequencesa_nwitha_n→0andna_n→ ∞,the normalized partial sums processesW_n(·)satisfy (1)with the good rate functionIσ(·)defined by

Iσ(h)= 1 2σ²

₁

0

h(u)2

du (4)

if simultaneouslyσ >0,h(0)=0andhis absolutely continuous,andIσ(h)= ∞otherwise.

The following corollary gives simplified conditions for the MDP principle, which will be verified in several examples later on.

(3)

Corollary 2. Assume thatX₀_∞<∞and thatX₀isF0-measurable.In addition,assume that ∞

n=1

n⁻^1/2E(Xn|F0)

∞<∞, (5)

and that for alli, j≥1,

nlim→∞E(XiX_j|F₋n)−E(XiX_j)

∞=0. (6)

Then the conclusion of Theorem1holds withσ²=

k∈ZE(X0Xk).

The next theorem allows to deal with non-adapted sequences and it provides additional applications. LetF_−∞=

n≥0F−nandF∞= _k_∈ZFk.

Theorem 3. Assume thatX₀_∞<∞,E(X0|F_−∞)=0almost surely,andX₀isF_∞-measurable.Define the pro- jection operators byP_j(X)=E(X|Fj)−E(X|Fj−1).Suppose that(6)holds and that

j∈Z

P0(X_j)

∞<∞. (7)

Then the conclusion of Theorem1holds withσ²=

k∈ZE(X0X_k).

2.1. Simple examples,comments and extensions

Comment 4 (φ-mixing sequences). Recall that ifY is a random variable with values in a Polish spaceY and ifM is aσ-field,theφ-mixing coefficient betweenMandσ (Y )is defined by

φ

M, σ (Y )

= sup

A∈B(Y)

P_Y_|M(A)−PY(A)

∞. (8)

For the sequence(X_i)_i_∈_Zand positive integerm,letφ_m(n)=sup_i_m>···>i1≥nφ (M0, σ (X_i₁, . . . , X_i_m))and letφ (k)= φ_∞(k)=limm→∞φ_m(k) be the usual φ-mixing coefficient. It follows from Corollary 2 that if the variables are bounded,the conclusion of Theorem1holds as soon as

k>0

k⁻^1/2φ₁(k) <∞ and lim

k→∞φ₂(k)=0. (9)

The condition(9) improves on the one imposed by Gao[17],that is

k>0φ (k) <∞,to get the MDP for bounded random variables(see his Theorem1.2).

Comment 5 (Application to the functional LIL). Since the variables are bounded,under the assumptions of Theo- rem1or of Theorem3,the MDP also holds inC[0,1]for the Donsker process

Dn(t )=Wn(t )+n⁻^1/2

nt− [nt] X_[nt]+1.

Hence,ifσ²>0,it follows from the proof of Theorem1.4.1in[14],that the process 2σ²log logn₋1/2

D_n(t ): t∈ [0,1]

(10) satisfies the functional law of the iterated logarithm.To be more precise,ifS denotes the subset ofC[0,1]consisting of all absolutely continuous functions with respect to the Lebesgue measure such thath(0)=0and₁

0(h(t ))²dt≤1, then the process defined in(10)is relatively compact with a.s.limit setS.In the case of bounded random variables, we then get new criteria to derive the functional LIL.In particular,the functional LIL holds forφ-mixing bounded random variables satisfying(9).

(4)

Comment 6 (Linear processes). Let(c_i)_i_∈_Zbe a sequence of real numbers in¹(Z)(absolutely summable).Define X_k=

i∈Zc_iε_k₋_i where(ε_k)_k_∈_Zis a strictly stationary sequence satisfying(6)and(7).Then,so does the sequence (X_k)_k_∈_Z,and the conclusion of Theorem3holds.In particular,the result applies ifε₀isF0-measurable,E(ε1|F0)=0 and

nlim→∞E ε²₀|F−n

−E

ε₀²_∞=0.

Comment 7 (Non-mixing in the ergodic sense example). The following simple example shows that Theorem1 is applicable to non-mixing in the ergodic theoretical sense sequences.Moreover it covers a strictly larger class of examples than its Corollary2.For allk∈Z,letQ_k₊1= −Q_k,whereP(Q0= ±1)=1/2andX_k=Q_k+Y_k,where (Y_k)_k_∈_Zis an iid sequence of zero mean and bounded random variables,independent ofQ0.We can easily check that all the conditions of Theorem1hold while the conditions of Corollary2are not satisfied.

Comment 8 (Stationary ergodic martingales that do not satisfy MDP). Let Yk be the stationary discrete Markov chain with the state space N and the transition kernel given by P(Y1=j −1|Y0=j )=1 for all j ≥1 and P(Y1=j|Y0=0)=P(τ=j )for j ∈NwithE(τ ) <∞andP(τ =1) >0which implies that(Yk)is ergodic.Let Xk=ξkI(Y_k=0)where(ξk)is an iid sequence independent of(Yk)and such thatP(ξk= ±1)=1/2.ThenXk is a sta- tionary ergodic martingale difference which is also a bounded function of an ergodic Markov chain.Straightforward computations show that ifτ does not have a finite exponential moment then there exists a positive sequencean→0 withnan→ ∞for which(1)does not hold.Thus the MDP principle is not true in general for the stationary sequences satisfying(5)without a certain form of condition(3).A similar example was suggested in[15],Remark2.6.

Comment 9 (On Var(S_n) and Theorem 1). Note that if _∞

n=1n⁻^3/2E(Sn|F0)2<∞, then, by Peligrad and Utev[19]

nlim→∞

Var(Sn)

n =σ²=E X²₁

+ ∞ j=0

2⁻^jE

S₂j(S₂j+1−S₂j) .

On the other hand,we shall prove later on that condition(2) along with (6) are sufficient for the validity of (3).

Therefore the conclusion of Theorem1holds under(2)and(6)withσ²identified in this remark.

Comment 10 (Sequences that are not strictly stationary). The proof of Theorem3is based on the exponential in- equality from Lemma22,that was established without stationarity assumption.Therefore,Theorem3admits various extensions to non-stationary sequences.The following slight generalization is motivated by the fixed design regres- sion problemZ_k=θ q_k+X_k,where the fixed design points are of the formq_k=1/g(k/n),the error processX_k is a stationary sequence and we analyze the error of the estimatorθˆ=n⁻¹_n

k=1Z_kg(k/n).If{X_i}i∈Zsatisfies the conditions of Theorem3,and ifgis a Lipschitz function,then the processWn= {n⁻^1/2_[nt]

i=1g(i/n)Xi, t ∈ [0,1]}

satisfies(1)with the good rate functionJ (·)defined by J (h)= 1

2σ² ₁

0

h(u) g(u)

2

du, whereσ²=

k∈Z

E(X0X_k).

The proof of this result is omitted.It can be done by following the proof of Theorem3.To be more precise,we start by proving the MDP for the processW_n(t )=n⁻^1/2_[nv⁻¹(t )]

i=1 g(i/n)X_i wherev(t )=σ²_t

0g²(x)dx.ForW_n(·),the rate function isI_σ(·)as in Theorem1.To go back to the processW_n(·),use the change-of-timeW_n=W_n◦v.

3. Applications

In this section we present applications to functions of φ-mixing processes, contracting Markov chains, expanding maps of the interval and symmetric random walks on the circle. The proofs are given in Section 4.

(5)

3.1. Functions ofφ-mixing sequences

In this section, we are partly motivated by Djellout et al. [16], Theorem 2.7, who have proved the MDP for Xk=f (Yk, . . . , Yk−)−E

f (Yk, . . . , Yk−)

, whereYk=

i∈Z

ciεk−i. (11)

In their Theorem 2.7, Djellout et al. [16] assume that:

(i) (ε_i)_i_∈_Zis an iid sequence;

(ii) (condition onci) the spectral density ofYk is continuous on[−π,π[;

(iii) (condition onε0)ε0satisfies the so-called LSI condition, which implies thatE(exp(δε²₀)) <∞for some posi- tiveδ, and that the distributionε₀has an absolutely continuous component with respect to the Lebesgue measure with a strictly positive density on the support ofμ[see their condition (2.1)];

(iv) (condition onf) the functions∂if are Lipschitz fori=0, . . . , .

By applying our main results, we derive the Propositions 11 and 12. In the case whereX_k is given by (11), the Proposition 11 will allow us to obtain the MDP for a large class of functions. However, we require a stronger condition than (ii), that is we assume that the sequence(ci)i∈Zis in1(Z), and instead of (iii), we suppose thatε0takes its values in some compact interval[a, b](this assumption cannot be compared to the LSI condition (iii)). Our method allows to link the regularity off to the behavior of the coefficients(c_i)_i_∈_Z(in that case, the condition (16) given below means that

i∈Zwj(2(b−a)|ci|) <∞for anyj=0, . . . , , wherewjis the modulus of continuity off with respect to the jth coordinate). In addition, our innovations may be dependent: more precisely,(εi)_i∈Zis assumed to be a stationary φ-mixing sequence.

We now describe our general results. Let(ε_i)_i_∈_Z=(ε₀◦Tⁱ)_i_∈_Z be a stationary sequence ofφ-mixing random variables with values in a subsetA of a Polish space X. Starting from the definition (8), we denote byφε(n)the coefficientφ_ε(n)=φ (σ (ε_i, i≤0), σ (εi, i≥n)).

Our first result is for non-adapted sequences, that is satisfying the representation (12). LetH be a function from A^ZtoRsatisfying the condition

C(A): for anyx, yinA^Z, H (x)−H (y)≤

i∈Z

Δ_i1xi=yi, where

i∈Z

Δ_i <∞.

Define the stationary sequenceX_k=X₀◦T^kby X_k=H

(ε_k₋_i)_i_∈_Z

−E H

(ε_k₋_i)_i_∈_Z

. (12)

Note thatXkis bounded in view ofC(A).

Proposition 11. Let(X_k)_k_∈_Zbe defined by(12),for a functionH satisfyingC(A).If

k>0φ_ε(k)is finite,then the conclusion of Theorem1holds withσ²=

k∈ZE(X0X_k).

For adapted sequences, that is satisfying the representation (13), we can assume thatH satisfies another type of condition. LetHbe a function fromA^NtoRsatisfying the condition

C(A): for anyi≥0, sup

x∈A^N,y∈A^N

H (x)−H

x⁽ⁱ⁾y≤Ri, whereRi decreases to 0,

the sequencex⁽ⁱ⁾ybeing defined by(x⁽ⁱ⁾y)j=xjforj < iand(x⁽ⁱ⁾y)j=yjforj≥i. Define the stationary sequence X_k=X₀◦T^kby

Xk=H

(εk−i)i∈N

−E H

(εk−i)i∈N

. (13)

(6)

Proposition 12. Let(X_k)_k_∈_Zbe defined by(13),for a functionH satisfyingC(A).If ∞

=1

R

k≥

φ_ε(k−)

√k <∞, (14)

then the conclusion of Theorem1holds withσ²=

k∈ZE(X0Xk).In particular,the condition(14)holds as soon as

1.

k>0φ_ε(k) <∞and

k>0k⁻^1/2R_k<∞.

2.

k>0R_k<∞and

k>0k⁻^1/2φ_ε(k) <∞.

Application to functions of linear processes

Assume thatε_i takes its values in a compact intervalA= [a, b]ofR, and let(c_i)_i_∈_Zbe a sequence of real numbers in ¹(Z). Letm=inf_x_∈_AZ

i∈Zcixi andM=sup_x_∈_AZ

i∈Zcixi. For a functionf from[m, M]^ZtoR, letwi be the modulus of continuity off with respect to theith coordinate, that is

wi(h)= sup

x∈[m,M]^Z, t∈[m,M],|xi−t|≤h

f (x)−f x^{(i,t )},

the sequencex^{(i,t )}being defined byx_j^{(i,t )}=xj forj =iandx_i^{(i,t )}=t. Assume that for anyx, yin[m, M]^Z, f (x)−f (y)≤

i∈Z

wi

|xi−yi|

<∞. Define the random variablesY_k=

i∈Zc_iε_k−i, and let X_k=f

(Y_k₋_i)_i_∈_Z

−E f

(Y_k₋_i)_i_∈_Z

(15) (note that (15) is a generalization of (11)). Clearly,X_k may be written as in (12), for a functionH fromA^Z toR.

Moreover,H satisfiesC(A)withΔi≤

∈Zw(2(b−a)|ci−|)provided that

i∈Z

∈Z

w

2(b−a)|c_i|

<∞. (16)

From Proposition 11, if

k>0φε(k) <∞and if (16) holds, then the conclusion of Theorem 1 holds. In particular, the condition (16) holds as soon as there exist(b_i)_i_∈_Z in¹(Z)andαin]0,1]such thatw(h)≤b|h|^α and

i∈Z|ci|^α<∞. Two simple examples of such functions are:

1. f (x)=

i∈Zg_i(x_i)for someg_i such that|g_i(x)−g_i(y)| ≤b_i|x−y|^αfor anyx, yin[m, M].

2. f (x)=^q_i₌_ph_i(x_i)for someh_isuch that|h_i(x)−h_i(y)| ≤K_i|x−y|^α for anyx, yin[m, M].

Now, assume thatci=0 fori <0, so thatYk=

i≥0ciεk−i. Iff is in fact a function ofx throughx0only, we simply denote byw=w₀its modulus of continuity over[m, M]. In that caseX_k=f (Y_k)−E(Yk)may be written as in (13) for a functionH satisfying C(A)with R_i ≤w(2|b−a|

k≥i|c_k|). From item 1 of Proposition 12, if

k>0φε(k) <∞and if

n≥1

n⁻^1/2w

2|b−a|

k≥n

|ck|

<∞, (17)

then the conclusion of Theorem 1 holds. In particular, if|ci| ≤Cρⁱ for someC >0 andρ∈ ]0,1[, the condition (17) holds as soon as:

₁

0

w(t ) t√

|logt|dt <∞.

Note that this condition is satisfied as soon asw(t )≤D|log(t )|⁻^γ for someD >0 and someγ >1/2. In particular, it is satisfied iff isα-Hölder for someα∈ ]0,1].

(7)

3.2. Contracting Markov chains

Let (Y_n)_n_≥₀ be a stationary Markov chain of bounded random variables with invariant measure μ and transition kernel K. Denote by · _∞,μ the essential supremum norm with respect to μ. Let Λ1 be the set of 1-Lipschitz functions. Assume that the chain satisfies the two following conditions:

there existC >0 andρ∈ ]0,1[such that sup

g∈Λ1

Kⁿ(g)−μ(g)

∞,μ≤Cρⁿ, (18)

for anyf, g∈Λ₁and anym≥0 lim

n→∞Kⁿ

f K^m(g)

−μ

f K^m(g)_∞,μ=0. (19)

We shall see in the next proposition that if (18) and (19) are satisfied, then the MDP holds inD[0,1]for the sequence

Xn=f (Yn)−μ(f ) (20)

as soon as the functionf belongs to the classLdefined below.

Definition 13. LetLbe the class of functionsf fromRtoRsuch that|f (x)−f (y)| ≤c(|x−y|),for some concave and non-decreasing functioncsatisfying

₁

0

c(t ) t√

|logt|dt <∞. (21)

Note that (21) holds ifc(t )≤D|log(t )|⁻^γ for someD >0 and someγ >1/2. In particular,Lcontains the class of functions from[0,1]toRwhich areα-Hölder for someα∈ ]0,1].

Proposition 14. Assume that the stationary Markov chain(Yn)n≥0 satisfies (18)and (19),and let Xn be defined by(20).Iff belongs toL,then the conclusion of Theorem1holds with

σ²=σ²(f )=μ

f−μ(f )2 +2

n>0

μ

Kⁿ(f )·

f −μ(f ) .

The proof of this proposition is based on the following lemma which has interest in itself.

Lemma 15. Letu_n=sup_g_∈_Λ₁Kⁿ(g)−μ(g)_∞,μ.Let f be a function fromRtoRsuch that|f (x)−f (y)| ≤ c(|x−y|)for some concave and non-decreasing functionc.Then

Kⁿ(f )−μ(f )

∞,μ≤c(un).

Remark 16. Ifu_n≤Cρⁿfor aC >0andρ∈ ]0,1[,and ifc(t )≤D|log(t )|⁻^γ forD >0andγ >0,then Kⁿ(f )−μ(f )

∞,μ=O n⁻^γ

.

We now give two conditions under which (18) and (19) hold. Let [a, b]be a compact interval in which lies the support of μ. For a Lipschitz function f, let Lip(f )=sup_x,y_∈[_a,b_]|f (x)−f (y)|/|x−y|. The chain is said to be Lipschitz contracting if there existκ >0 andρ∈ ]0,1[such that

Lip Kⁿ(f )

≤κρⁿLip(f ). (22)

Let BV be the class of bounded variation functions from [a, b] toR. For anyf ∈BV, denote bydf the total variation norm of the measure df: df =sup{

gdf,g_∞≤1}. The chain is said to be to beBV-contracting if there existκ >0 andρ∈ [0,1[such that

dKⁿ(f )≤κρⁿdf. (23)

(8)

It is easy to see that if either (22) or (23) holds, then (18) and (19) are satisfied (to see that the condition (23) implies (19), it suffices to note that it implies the same property for twoBV functionsf, g (see (52)), and that any Lipshitz function from[a, b]toRis aBV functions).

Application to iterated random functions

The stationary bounded Markov chain(Y_n)_n_≥0with transition kernelKis one-step Lipschitz contracting if there exists ρ∈ ]0,1[such that

Lip K(f )

≤ρLip(f ).

Note that ifK is one-step Lipschitz contracting then (22) obviously holds withκ=1. The one-step contraction is a very restrictive assumption. However, it is satisfied ifY_n=F (Y_n₋₁, ε_n)for some iid sequence(ε_i)_i>0independent ofY0, and some functionF such that

F (x, ε₁)−F (y, ε₁)

1≤ρ|x−y| for anyx, yinR. (24)

Remark 17. Under a more restrictive condition onF than(24),namely

F (x, z)−F (y, t )≤ρ|x−y| +L|z−t|, (25)

Delyon et al. [9]have proved the MDP forX_n=f (Y_n)−μ(f )whenf is a Lipschitz function.In their paper,the chain is not assumed to be bounded.It is only assumed that E(e^δε¹) <∞for someδ >0,which implies the same property forX₁(for a smallerδ)by using the inequality(25).

3.2.1. Application to expanding maps

LetT be a map from[0,1]to[0,1]preserving a probabilityμon[0,1], and let X_k=f ◦Tⁿ⁻^k⁺¹−μ(f ), W_n(t )=W_n(f, t )=n⁻^1/2

[nt]

i=1

f ◦Tⁿ⁻ⁱ⁺¹−μ(f ) .

Define the Perron–Frobenius operatorKfromL²([0,1], μ)toL²([0,1], μ)via the equality ₁

0

(Kh)(x)f (x)μ(dx)= ₁

0

h(x)(f ◦T )(x)μ(dx). (26)

The mapT is said to beBV-contracting if its Perron–Frobenius operator isBV-contracting, that is satisfies (23). As a consequence of Proposition 14, the following corollary holds.

Corollary 18. IfT isBV-contracting,and iff belongs toBV ∪L,then the conclusion of Theorem1holds with σ²=σ²(f )=μ

f −μ(f )2 +2

n>0

μ

f ◦Tⁿ·

f−μ(f ) .

Let us present a large class ofBV-contracting maps. We shall say thatT is uniformly expanding if it belongs to the classCdefined in [4], Section 2.1, p. 11. Recall that ifT is uniformly expanding, then there exists a probability measureμon[0,1], whose densityfμwith respect to the Lebesgue measure is a bounded variation function, and such thatμis invariant byT. Consider now the more restrictive conditions:

(a) T is uniformly expanding.

(b) The invariant measureμis unique and(T , μ)is mixing in the ergodic-theoretic sense.

(c) 1

f_μ1fμ>0is a bounded variation function.

(9)

Starting from Proposition 4.11 in [4], one can prove that ifT satisfies the assumptions (a), (b) and (c) above, then it is BV contracting (see for instance [7], Section 6.3). Some well-known examples of maps satisfying the conditions (a), (b) and (c) are:

1. T (x)=βx− [βx]forβ >1. These maps are calledβ-transformations.

2. I is the finite union of disjoint intervals(I_k)₁_≤_k_≤_n, andT (x)=a_kx+b_konI_k, with|a_k|>1.

3. T (x)=a(x⁻¹−1)− [a(x⁻¹−1)]for somea >0. Fora=1, this transformation is known as the Gauss map.

Remark 19. The case wheref (x)=x(that isX_n=Tⁿ−μ(T ))has already been considered by Dembo and Zeitouni [11].However,in this paper,the assumptions onT are more restrictive than the assumptions(a), (b)and(c)above.

In particular,they assume that there is a finite partition(I_j)₁_≤_j_≤_m of[0,1]on which T restricted toI_k isC¹and infx∈I_k|T(x)|>1,so that their result does not cover the case of the Gauss map(Item3above).

3.3. Symmetric random walk on the circle LetKbe the Markov kernel defined by

Kf (x)=1 2

f (x+a)+f (x−a)

on the torusR/Z, withairrational in[0,1]. The Lebesgue–Haar measuremis the unique probability which is invariant byK. Let(ξ_i)_i_∈_Zbe the stationary Markov chain with transition kernelKand invariant distributionm. Let

Xk=f (ξk)−m(f ), Wn(t )=Wn(f, t )=n⁻^1/2

[nt]

i=1

f (ξi)−m(f )

. (27)

From Derriennic and Lin [13], Section 2, we know that the central limit theorem holds forn⁻^1/2Wn(f,1)as soon as the series of covariances

σ²(f )=m

f−m(f )2 +2

n>0

m f Kⁿ

f−m(f )

(28)

is convergent, and that the limiting distribution isN(0, σ²(f )). In fact the convergence of the series in (28) is equiv- alent to

k∈Z^∗

| ˆf (k)|²

d(ka,Z)²<∞, (29)

wheref (k)ˆ are the Fourier coefficients off. Hence, for any irrational numbera, the criterion (29) gives a class of functionfsatisfying the central limit theorem, which depends on the sequence(d(ka,Z))k∈Z^∗. Note that a functionf such that

lim inf

k→∞ kf (k)ˆ >0, (30)

does not satisfy (29) for any irrational numbera. Indeed, it is well known from the theory of continued fraction that ifpn/qnis thenth convergent ofa, then|pn−qna|< q_n⁻¹, so that d(ka,Z) < k⁻¹for an infinite number of positive integersk. Hence, if (30) holds, then| ˆf (k)|/d(ka,Z)does not even tend to zero asktends to infinity.

Our aim in this section is to give conditions onf and on the properties of the irrational numberaensuring that the MDP holds inD[0,1].

ais said to be badly approximable by rationals if for any positiveε,

(31) the inequality d(ka,Z) <|k|⁻¹⁻^εhas only finitely many solutions fork∈Z.

(10)

From Roth’s theorem the algebraic numbers are badly approximable (cf. Schmidt [22]). Note also that the set of badly approximable numbers in[0,1]has Lebesgue measure 1.

In Section 5.3 of [8], it is proved that the condition (29) (and hence the central limit theorem forn⁻^1/2W_n(f,1)) holds for any badly approximable numbera as soon as

sup

k=0

|k|¹⁺^εf (k)ˆ <∞ for some positiveε. (32)

Note that, in view of (30), one cannot takeε=0 in the condition (32).

In fact, for badly approximable numbers, the condition (32) implies also the MDP inD[0,1]:

Proposition 20. Suppose thatais badly approximable by rationals,i.e.satisfies(31).If the functionf satisfies(32), then the conclusion of Theorem1holds withσ²=σ²(f ).

Note that, under the same conditions, the process{Wn(f, t ), t∈ [0,1]}satisfies the weak invariance principle in D[0,1]. Indeed, to prove Proposition 20, we show that the conditions of Corollary 2 are satisfied, but these conditions imply the weak invariance principle (see for instance [19]). From Comment 5, we also infer that the Donsker process defined in (10) satisfies the functional law of the iterated logarithm.

4. Proofs

Since the proofs of our results are mainly based on some exponential bounds for the deviation probability of the maximum of the partial sums for dependent variables, we present these inequalities, which have interest in themselves.

4.1. Exponential bounds for dependent variables

We state first the exponential bound from Proposition 2 in [20] that we are going to use in the proof of the main theorem.

Lemma 21. Let(Xi)i∈Zbe a stationary sequence of random variables adapted to the filtration(Fi)i∈Z.Then P

1max≤i≤n|S_i| ≥t ≤4√

e exp

−t²/2n

X1_∞+80 n

j=1

j⁻^3/2E(Sj|F0)

∞

2 .

In the next lemma, we bound the maximal exponential moment of the stationary sequence by using the projective criteria.

Lemma 22. Let{Y_k}k∈Zbe a sequence of random variables such that for allj,E(Yj|F_−∞)=0almost surely andY_j isF∞-measurable.Define the projection operators byP_j(X)=E(X|Fj)−E(X|Fj−1).Assume that

P_k₋_j(Y_k)

∞≤p_j and D:=

∞ j=−∞

p_j<∞. Let{g_k, k∈N}be a sequence of numbers and define,

S_k= k

i=1

g_iY_i, M_k= max

1≤j≤kS_j, G²_n= n

i=1

g_i².

Then,

Eexp(t Mn)≤4 exp 1

2G²_nD²t²

.

(11)

In particular, P

1max≤k≤n|S_k| ≥x

≤8 exp

− x² 2G²_nD²

.

Proof. Start with the decomposition Yk=

∞ j=−∞

Pk−j(Yk)= ∞ j=−∞

b_jP_k₋_j(Y_k) b_j ,

wherebj=pj/D≥ Pk−j(Yk)∞/D, for anyj∈Z. Then S_m= ^∞

j=−∞

b_j m

k=1

gkPk−j(Yk) b_j .

Thus,

M_n≤ ^∞

j=−∞

b_j max

1≤m≤n

m

k=1

Pk−j(gkYk)

b_j =: ^∞

j=−∞

b_jM_n^{(j )},

whereMn^{(j )}denotes max1≤m≤n

_m

k=1gkPk−j(Yk)/bj.

Since exp(x)is convex and non-decreasing andbj≥0 with

j∈Zbj=1, Eexp(t Mn)≤Eexp

_∞

j=−∞

bjt Mn^{(j )}

≤ ∞ j=−∞

bjEexp t Mn^{(j )}

.

Consider the martingale differenceUk=gkPk−j(Yk)/bj. SinceZk=exp(t (U1+ · · · +Uk)/2)is a submartingale, Doob’s inequality yields

Eexp t M_n^{(j )}

=E

1max≤k≤nZ_k²

≤4EZ²_n=4Eexp

t (U₁+ · · · +U_n) .

Applying Azuma’s inequality to the right-hand side, and noting that Uk∞= |gk|P_k₋_j(Y_k)_∞

b_j ≤ |gk|D, we infer that

Eexp t M_n^{(j )}

≤4 exp 1

2G²_nD²t²

.

Since

j∈Zb_j=1, we obtain that Eexp(t Mn)≤

j∈Z

b_j4 exp 1

2G²_nD²t²

=4 exp 1

2G²_nD²t²

.

Next, to derive the one-sided probability inequality we use the exponential bound witht=x/(G²_nD²), so P(Mn≥x)≤Eexp(t Mn)exp(−t x)=4 exp

− x² 2G²_nD²

.

(12)

Finally, to derive the two-sided inequality we observe that the stationary sequence{−Y_j}also satisfies the conditions

of the lemma. The proof is complete.

The next technical lemma provides an exponential bound for any random vector plus a correction in terms of conditional expectations (see also [23]).

Lemma 23. Let {X_i}1≤i≤n be a vector of real random variables adapted to the filtration {Fn}n≥1. Denote B=sup₁_≤_i_≤_nX_i∞.Then,for allδ >0andca natural number withcB/n≤δ/2,we have

P

1max≤i≤n

1 n

i

u=1

Xu

≥δ

≤2 exp

− δ²n 64B²c

+P

sup

1≤i≤[n/c]

1 c

ic

j=(i−1)c+1

E(Xj|F(i−1)c) ≥ δ

4

. (33)

Proof. Letcbe a fixed integer andk= [n/c](where, as before,[x]denotes the integer part ofx). The initial step of the proof is to divide the variables in consecutive blocks of sizecand to average the variables in each block

Y_i,c=1 c

ic

j=(i−1)c+1

X_j, i≥1.

Then, for all 1≤i≤kwe construct the martingale, Mi,c=

i

j=1

Yj,c−E(Yj,c|F(j−1)c)

= i

j=1

Dj,c

and we use the decomposition P

1max≤j≤n

1 n

j

u=1

X_u ≥δ

≤P

1max≤i≤k

1 k

i

j=1

Y_j,c

≥δ−cB n

≤P

1max≤i≤k

1 k

i

j=1

Y_j,c ≥ δ

2

≤P

1max≤i≤k

1

k|M_i,c| ≥δ 4

+P

1max≤i≤k

1 k

i

j=1

E(Yj,c|F(j−1)c) ≥δ

4

≤P

1max≤i≤k

1

k|M_i,c| ≥δ 4

+P

1max≤j≤k

E(Yj,c|F(j−1)c)≥δ 4

.

Next, we apply Azuma’s inequality to the martingale part and obtain, P

1max≤i≤k|M_i,c| ≥δk 4

≤2 exp

− δ²k² 32kB²

≤2 exp

− δ²n 64cB²

which implies that P

1max≤i≤n

1 n

i

u=1

Xu

≥δ

≤2 exp

− δ²n 64B²c

+P

1max≤i≤k

E(Yi,c|F(i−1)c)≥δ 4

proving the lemma.

4.2. Some facts about the moderate deviation principle

This paragraph deals with some preparatory material. The following theorem is a result concerning the MDP for a triangular array of martingale differences sequences. It follows from Theorem 3.1 and Lemma 3.1 of [21], (see also [15], Proposition 1 and Lemma 2).