Quasi-compactness and mean ergodicity for Markov kernels acting on weighted supremum normed spaces

(1)

www.imstat.org/aihp 2008, Vol. 44, No. 6, 1090–1095

DOI: 10.1214/07-AIHP145

Quasi-compactness and mean ergodicity for Markov kernels acting on weighted supremum normed spaces

Loïc Hervé

I.R.M.A.R., UMR-CNRS 6625, Institut National des Sciences Appliquées de Rennes, 20, Avenue des Buttes de Couësmes CS 14 315, 35043 Rennes Cedex, France. E-mail: [email protected]

Received 12 December 2006; revised 29 June 2007; accepted 10 September 2007

Abstract. LetP be a Markov kernel on a measurable spaceEwith countably generatedσ-algebra, letw:E→ [1,+∞[such that P w≤CwwithC≥0, and letBw be the space of measurable functions onEsatisfyingfw=sup{w(x)⁻¹|f (x)|, x∈E} <

+∞. We prove thatP is quasi-compact on(Bw, · w)if and only if, for allf ∈Bw,(¹_n_n

k=1P^kf )ncontains a subsequence converging inBw toΠf =_d

i=1μ_i(f )v_i, where thev_i’s are non-negative bounded measurable functions onEand theμ_i’s are probability distributions onE. In particular, when the space ofP-invariant functions inBw is finite-dimensional, uniform ergodicity is equivalent to mean ergodicity.

Résumé. SoitPun noyau markovien sur un espace mesurableEmuni d’une tribu à base dénombrable, soitw:E→ [1,+∞[tel queP w≤Cw, avecC≥0, et soitBwl’espace des fonctionsfmesurables deEdansCtelles quefw=sup{w(x)⁻¹|f (x)|, x∈ E} < +∞. Nous démontrons queP est quasi-compact sur(Bw, · w)si et seulement si, pour toutf ∈Bw,(¹_n_n

k=1P^kf )n contient une sous-suite convergeant dansBwversΠf=_d

i=1μ_i(f )v_i, oùv_iest une fonction mesurable positive bornée surEet μ_iune probabilité surE. En particulier, quand le sous-espace deBwconstitué des fonctionsP-invariantes est de dimension finie, la convergence uniforme des moyennes est équivalente à la convergence ponctuelle.

MSC:37A30; 60J10

Keywords:Markov kernel; Quasi-compactness; Mean ergodicity; Geometrical ergodicity

1. Introduction

Let(E,E)be a measurable space with countably generated σ-algebra, let(B˜, · )denote the space of complex- valued bounded measurable functions onE, equipped with the supremum norm, and letP be a Markov kernel on (E,E). Under some irreducibility conditions,P is quasi-compact onB˜ if and only ifP is mean ergodic with one- dimensional limit projection defined by the uniqueP-invariant distribution. This result was proved in [1] under the Harris condition (see also [11]), and in [8] under the ergodicity condition.¹See also [6].

Now letw:E→ [1,+∞[, and let(Bw, · w)denote the Banach space of complex-valued measurable functions onEsatisfyingfw:=sup{w(x)⁻¹|f (x)|, x∈E}<+∞. AssumingP w≤Cw, withC∈R^∗₊,P acts continuously onBw. This work extends toBwthe equivalence between mean ergodicity with finite rank limit projection and quasi- compactness.

1The equivalence between mean ergodicity and quasi-compactness is not mentioned in [1], but it is an easy consequence of Theorem II.2 in [1]. In [8]Eis not supposed to be countably generated.

(2)

Theorem 1. P is quasi-compact onBw if and only if there existd∈N^∗,linearly independent non-negative functions v₁, . . . , v_dinB,˜ andP-invariant distributionsμ₁, . . . , μ_donEsatisfyingμ_i(w) <+∞such that,for allf∈Bw,the sequence(_n¹_n

k=1P^kf )ncontains a subsequence converging inBwto_d

i=1μi(f )vi.

Observe that the naive idea which consists in applying the similarity transformationP˜:f →w⁻¹P (wf )in order to deduce the theorem from [1,8] does not work becauseP˜ is not Markovian whenP ww>1 (i.e. whenwis not sub-invariant). The proof of Theorem 1 is actually based on a recent work of Hennion [3], which gives criteria for quasi-compactness of kernels acting onBw, on spectral theory [2], and on positive operator theory [12,13]. As in [3], the above theorem does not require any irreducibility or aperiodicity conditions; in this sense, when applied with w=1E, it improves [1,8]. This theorem shows too that a quasi-compact Markov kernel onBw is necessarily power- bounded. This fact was already proved in [4] (Section IV.3), together with the equivalence between quasi-compactness and uniform ergodicity, which also follows from [9].

The above theorem does not hold whenBw is replaced with continuous function spaces. For instance, ifEis a compact metric space andP is uniquely ergodic on the spaceC(E)of all complex-valued continuous functions onE, thenP is mean ergodic [7], but in generalP is not quasi-compact onC(E)(consider irrational rotations of the circle).² We shall present in Section 3 (Corollary 1) a direct application tow-geometrically ergodic Markov chains [10]

whose transition probability is, by definition, quasi-compact onBw, withλ=1 as a simple eigenvalue and the unique peripheral eigenvalue. Many examples of such Markov chains, with unbounded functionsw, are presented in [10].

A simple example is provided by the linear modelX_n=αX_n₋1+ε_n, withα∈ ] −1,1[, where(ε_n)_n_≥1is an i.i.d.

sequence of real-valued random variables, independent of X₀, such thatm=E[|ε₁|]<+∞. In this case the state space isE=Rwith its Lebesgue sets, andP (x, A)=E[1A(αx+ε₁)], which yieldsPf (x)=E[f (αx+ε₁)]. Let w(y)=1+ |y|(y∈R). Then, for anyx∈R, we haveP w(x)=E[w(αx+ε1)] ≤1+ |α||x| +m, soP w≤ |α|w+L, with L=1− |α| +m. From this inequality, called drift condition, one can deduce that, if ε1 has an everywhere positive density, then(X_n)_n isw-geometrically ergodic [10] (Section 15.5.2). Observe thatw is not sub-invariant.

Indeed,P w(0)=1+m > w(0), soP ww>1. Obviously, this conclusion extends to any functionw(y)=a+b|y|, with constantsa, b >0. Actually, in most of the examples ofw-geometrically ergodic Markov chains,wis not sub- invariant when it is unbounded.

Finally we shall see in Corollary 2 that, in the special case of denumerable Markov chains, the above theorem en- ables us to obtain an elementary proof of the above mentioned well-known fact that geometric ergodicity is equivalent to some drift condition.

2. Proof of Theorem 1

Proof of⇒. SupposeP is quasi-compact onBw. It is proved in [4] (Section IV.3) that(_n¹n

k=1P^k)nconverges in the operator norm topology to a finite dimensional projectionΠof the form:Πf=_d

i=1φ_i(f )f_i, where thef_i’s are linearly independent functions inB˜and theφ_i’s are bounded complex measures onEsuch that|φ_i|(w) <+∞, with

|φ_i|the total variation ofφ_i. It remains to prove that one can choosef_i andφ_i such thatf_i≥0 andφ_i is a probability measure onE. Notice thatΠ (Bw)⊂ ˜B,Π≥0 andΠ1E=1E.

Let BR be the subspace of Bw composed of real-valued functions. Then Π (BR) is a Banach lattice which is isomorphic toR^d with the preservation of the order relation [13]. Consequently there exist non-negative functions g1, . . . , g_d inΠ (Bw)and positive linear forme^∗₁, . . . , e^∗_d onΠ (Bw)such thatg=_d

i=1e^∗_i(g)g_i for allg∈Π (Bw).

Let ψ_j =e_j^∗◦Π. The ψ_j’s are positive continuous linear forms onBw, andψ_j =_d

i=1e_j^∗(f_i)φ_i. Thus theψ_j’s are positive bounded measures on Esuch thatψ_j(w) <+∞. Setμ_j =_ψ¹

j(E)ψ_j andv_j =ψ_j(E)g_j. Then Πf = _d

i=1ψ_i(f )g_i=_d

i=1μ_i(f )v_i, and theμ_i’s areP-invariant (useΠ P =Π).

2Also considerE= [0,1]andPf (x)=¹₂[f (^x₂)+f (^x+1₂ )].P is quasi-compact on the space of Lipschitz functions on[0,1], soP is mean ergodic on the space of continuous functions on[0,1], but is not quasi-compact on this space: indeed, for|z|<1,fz=

n≥1zⁿ⁻¹cos(2ⁿπ·)is a continuous function satisfyingPfz=zfz.

(3)

Proof of⇐. We shall denote by (ME) the mean ergodicity (subsequential) condition of Theorem 1. We setΠf = _d

i=1μi(f )vi. IfT is a continuous linear operator onBw, we denote by Tw its operator norm, and byr(T )its spectral radius. We denote by I the identity operator onBw. Givena∈Candρ >0, we set D(a, ρ)= {z: z∈C,

|z−a| ≤ρ}.

SinceP1E=1E, we haver(P )≥1. Besides, there existsn_k +∞such that sup_kn⁻_k¹nk

j=1P^jww<+∞, thus sup_kn⁻_k¹Pⁿ^kww<+∞. SincePⁿw= Pⁿww, one getsr(P )=limnPⁿ^1/nw =1. In particular this yields n≥02⁻⁽ⁿ⁺¹⁾Pⁿw<+∞, so we can define the following bounded operator onBw, which is obviously Markovian:

Q=

n≥0

2⁻⁽ⁿ⁺¹⁾Pⁿ=(2I−P )⁻¹.

Proposition 1. Qis quasi-compact onBw. Proof. Let ν= ¹_d_d

i=1μ_i. Since the σ-algebra E is countably generated, there exist a non-negative measurable functionαon(E×E,E⊗E)and a positive kernelSonEsuch that we haveQ(x,dy)=α(x, y)dν(y)+S(x,dy), withS(x,·)⊥ν, for eachx∈E[11]. Forp∈N^∗, setαp=min{α, p}, and

T_p(x,dy)=α_p(x, y)dν(y), S_p(x,dy)=Q(x,dy)−T_p(x,dy).

Iff ∈Bw, then|Tpf| ≤ fwTpw≤pν(w)fw, soTp(Bw)⊂ ˜B. BesidesTpacts continuously onBw, and so is S_p. In order to apply [3], observe that, for eachp∈N^∗, the functionsα_p^(w)(x,·)=w(x)⁻¹α_p(x,·)w(·),x∈E, are uniformlyν-integrable (useα_p^(w)(x, y)≤pw(y),ν(w) <+∞and Lebesgue’s theorem).

Finally, sinceQ=φ (P )withφ (z)=

n≥02⁻⁽ⁿ⁺¹⁾zⁿandφis analytic onD(0,³₂), the spectral mapping theorem [2] yieldsr(Q)=φ (r(P ))=φ (1)=1. Proposition 1 then follows from [3] and [4] (Section IV) via the following

lemma.

Lemma 1. There existsp≥1such thatr(Sp) <1.

Proof. Suppose thatr(S_p)=1 for allp≥1. SinceS_p≥0, there exists a positive continuous linear form,η_p, onBw

such thatη_p=η_p◦S_pandη_p(w)=1, see [12], p. 267. LetP˜,Q,˜ T˜_p,S˜_p,η˜_pbe the restriction toB˜ofP,Q,T_p,S_p, η_p. Sinceη_p=η_p◦S_p≤η_p◦Qand(η_p◦Q−η_p)(1E)=0, we haveη˜_p= ˜η_p◦ ˜Q, thusη˜_p◦ ˜P= ˜η_p. Moreover we have:

(a) η˜_p=0. Indeed, ifη˜_p=0, then, fromη_p◦Q=η_p◦T_p+η_p◦S_p andT_p(Bw)⊂ ˜B, one would getη_p◦Q= η_p◦S_p=η_p, thusη_p◦P =η_p. Then, by (ME),η_p=_d

i=1η_p(v_i)μ_i would be a positive measure onEsuch that η_p(B˜)= {0}, soη_p=0, which is impossible.

(b) ∀f ∈ ˜B,ηp(f )=_d

i=1ηp(vi)μi(f ). This follows fromη˜p◦ ˜P = ˜ηpand (ME).

Now, from (a) and (b), there exist j ∈ {1, . . . , d} and p_k +∞ such that we have η_p_k(v_j)=0. Besides η_p_k(v_j)μ_j(T_p_k1E)≤η_p_k(T_p_k1E)=η_p_k(Q1E −S_p_k1E)=0, thus μ_j(T_p_k1E)=0. When k→ +∞, this gives α(x, y)dν(y)dμj(x)=0, hence

α(x₀, y)dν(y)=0 for ax₀∈E. SoQ(x₀,·)=S(x₀,·)⊥ν: there existsA∈E such thatQ(x0, A)=0 andν(A)=1.

But:Q(x₀, A)=0 ⇒ ∀n≥1, Pⁿ1A(x₀)=0 ⇒ _d

i=1μ_i(A)v_i(x₀)=0 (by condition (ME)). While: ν(A)=

1 d

_d

i=1μi(A)=1 ⇒ μi(A)=1,i=1, . . . , d. Thus_d

i=1v_i(x0)=0: this is impossible because (ME) gives 1E=_d

i=1v_i.

We shall denote byσ (Q)andσ (P )the spectrum ofQandP when acting onBw. Lemma 2. We haveσ (Q)\ {1} ⊂D(²₃,¹₃)∩D(0,1−ε)for a certainε∈ ]0,1[.

(4)

Proof. We have Q=φ (P ) with φ (z)= ₂₋¹_z, thus σ (Q)=φ (σ (P )) [2]. Since r(P )=1, we get σ (Q)⊂ φ (D(0,1))=D(²₃,¹₃). Soλ=1 is the unique peripheral spectral value ofQ, and Lemma 2 then follows from Propo-

sition 1.

Lemma 3. λ=1is a first order pole forP,with a corresponding finite-rank residue.

Proof. Setψ (z)=2−¹_z,z∈C^∗. Lemma 2 yields 0∈/σ (Q), soQis invertible onBw,ψis analytic on a neighborhood of σ (Q), andP =2I−Q⁻¹=ψ (Q). Thusσ (P )=ψ (σ (Q)), andσ (P )\ {1} =ψ (σ (Q)\ {1})⊂ψ (D(²₃,¹₃))∩ ψ (D(0,1−ε))=D(0,1)∩D(2,₁_−ε¹ )^c. Thusλ=1 is an isolated point inσ (P ). LetAP andAQ be the residue of the resolvent functions of P andQat λ=1. Letχ be an analytic function on a neighborhood of σ (P )such that χ (V0)= {0}and χ (V1)= {1}, where V0 andV1 are disjoint neighborhoods of the sets σ (P )\ {1} and{1}, respectively. We know thatAP =χ (P )[2], thusAP =χ (ψ (Q)). BesidesW0=ψ⁻¹(V0)andW1=ψ⁻¹(V1)are disjoint neighborhoods of respectivelyσ (Q)\ {1}and{1}, andχ◦ψ is an analytic function onW0∪W1such that χ ◦ψ (W0)= {0},χ ◦ψ (W1)= {1}. Thus A_Q=χ ◦ψ (Q), soA_P =A_Q. Since the Markov kernel Q is quasi- compact onBw (Proposition 1) and Qis power-bounded [4] (Theorem IV.3(i)),λ=1 is a first order pole forQ, andA_Q(Bw)=Ker(Q−I )is finite-dimensional by [2] (Theorem VIII.8.3 and Corollary VIII.8.4). By the definition of Qas a series, Pf =f implies Qf =f (f ∈Bw), and the converse holds by usingP =2I −Q⁻¹. Finally A_P(Bw)=A_Q(Bw)=Ker(Q−I )=Ker(P−I )is finite-dimensional, soλ=1 is a first order pole forP (use the

arguments of [2], Theorem VII.4.5).

Lemma 4. {λ∈σ (P ),|λ| =1}is composed of a finite number of first order poles.

Proof. From Lemma 3 and a classical result concerning the peripheral spectrum of positive operators on Banach lattice [13] (Theorem 5.5, p. 331), the set of peripheral spectral values ofP is composed of a finite number of poles forP. Using the Laurent expansions, Lemma 3 implies that they are first order poles.

Lemma 5. For any peripheral poleλofP,we havedim Ker(P−λI )≤dim Ker(P−I ) <+∞.

Proof. We have dim Ker(P−I ) <+∞by (ME). Letλ1=1, λ2, . . . , λmbe the peripheral poles ofP. The previous results show thatBw=Ker(P−I )⊕F⊕H, whereF =_m

i=2Ker(P−λiI ), andHis aP-invariant closed subspace ofBwsuch thatr(P_|_H) <1, withP_|_H the restriction ofP toH. Thus(¹_n_n

k=1P^k)_nconverges in the operator norm topology to the projection onto Ker(P−I ). Then Lemma 5 follows from [9] (Theorem 2).

The quasi-compactness ofP onBw follows from Lemmas 4 and 5.

3. Applications to geometrically ergodic Markov chains

Let(X_n)_n_≥₀ be a Markov chain with state spaceE and transition probabilityP. Recall that(X_n)_n_≥₀ is said to be w-geometrically ergodic if there exist an invariant distributionν onEsuch thatν(w) <+∞, and some constants r <1 andD∈R₊such that for everyf ∈Bw we have

Pⁿf−ν(f )1E

w≤Drⁿfw.

Corollary 1. Assume that(X_n)_n_≥₀is an aperiodic positive Harris Markov chain with stationary distributionν.Then (X_n)_n_≥₀isw-geometrically ergodic if and only if one of the two next conditions holds:

(a) ∀f ∈Bw,Pⁿf →ν(f )1EinBw whenn→ +∞.

(b) For allf ∈Bw,(¹_n_n

k=1P^kf )_ncontains a subsequence converging inBwtoν(f )1E.

Corollary 1 is an easy consequence of Theorem 1. (When (b) is assumed, the aperiodicity condition ensures that λ=1 is the unique peripheral eigenvalue ofP.)

(5)

The reader will find in [10] many examples of geometrically ergodic Markov chains. Geometric ergodicity with a bounded functionwcorresponds to an aperiodic Markov chain satisfying Doeblin’s condition.

Whenwis unbounded and(X_n)_n_≥₀is aperiodic andψ-irreducible w.r.t. to someσ-finite positive measureψonE, w-geometric ergodicity is equivalent to the following drift condition [10] (Chapter 16): there existρ <1,L >0, and a petite setAinEsuch thatP w0≤ρw0+L1A, wherew0is a function onEsuch thatd⁻¹w≤w0≤dwfor some constantd >0. Corollary 1 sheds new light on this fact, at least for countable Markov chains, and as an illustration, let us present a simple proof of the well-known next statement proved in [5].

Corollary 2. Let (Xn)n≥0 be an aperiodic and irreducible Markov chain with state space E=N, and suppose limkw(k)= +∞.Then (Xn)n≥0 isw-geometrically ergodic if and only if there existρ <1 andC >0 such that Pⁿw≤Cρⁿw+Cfor alln≥1.

By using the basic arguments of [10] (Section 16.1.1), one can easily see that the condition in Corollary 2 is equivalent to:∃ρ <1,∃L >0, P w0≤ρw0+L, withw0equivalent tow.

Proof of Corollary 2. If (X_n)_n_≥₀ is w-geometrically ergodic, then Pⁿw≤Drⁿw+ν(w). Conversely, suppose Pⁿw≤Cρⁿw+Cwithρ <1,C >0, independent ofn. Then we have sup_n_≥₁Pⁿw≤2C, and there exists an invariant distributionνsuch thatν(w) <+∞.³SetΠ_n=¹_n_n

k=1P^k, and let¹(ν)be the space ofC-valued sequences (x(n))_n_∈N such that

nν(n)|x(n)|<+∞.P is a contraction of ¹(ν), so for anyf ∈¹(ν),(Π_nf )_n converges in¹(ν), use e.g. [2] (Section VIII.5). The limitα=limnΠ_nf isP-invariant, and by irreducibility, it is constant:

∀i∈N, α(i)=ν(f ). Thus limnΠ_nf (i)=ν(f )for alli∈N.

Now letf ∈Bw, and for convenience assumefw=1 (i.e.|f| ≤w). We have

∀i∈N, P^kf (i)−ν(f )≤P^kw(i)+ν

|f| ≤Cρ^kw(i)+C+ν(w).

Letε >0. Then there existi0≥1,N0≥1 such thatw(i)⁻¹|P^kf (i)−ν(f )| ≤εfor alli > i0andk > N0. By using the fact that sup_k_≥₁P^kww<+∞and

Π_nf (i)−ν(f )=1 n

N₀

k=0

P^kf (i)−ν(f ) +1 n

n

k=N₀+1

P^kf (i)−ν(f ) ,

we easily deduce that there existsN₁≥N₀such thatw(i)⁻¹|Π_nf (i)−ν(f )| ≤2εfor alli > i₀andn > N₁. Finally letN₂≥N₁be such thatw(i)⁻¹|Π_nf (i)−ν(f )| ≤2εfor alli=0, . . . , i0andn > N₂. ThenΠ_nf−ν(f )w≤2ε

for alln > N₂, and Corollary 1 then applies.

Acknowledgments

The author is grateful to Michael Lin for his helpful advice about this work. Also, the author thanks the referee for many very helpful comments which allowed to greatly enhance the content and the presentation of this paper.

References

[1] A. Brunel and D. Revuz. Quelques applications probabilistes de la quasi-compacité.Ann. Inst. H. Poincaré, Sect. B (N.S.)10(1974) 301–337.

MR0373008

[2] N. Dunford and J. T. Schwartz.Linear Operators. Part. I: General Theory. Wiley, New York, 1958. MR1009162

[3] H. Hennion. Quasi-compactness and absolutely continuous kernels.Probab. Theory Related Fields.139(2007) 451–471. MR2322704

3This is a classical fact: consider the distributionsμn(A)= ¹n n

k=1(P^k1A)(x0) (x0∈Eis fixed). FromPⁿw≤Cρⁿw+C, we easily ob- tain sup_n≥1μn(w)≤2Cw(x0) <+∞, so(μn)nis tight (use limkw(k)= +∞), and one can select a subsequence converging to an invariant distributionνsuch thatν(w) <+∞.

(6)

[4] H. Hennion. Quasi-compactness and absolutely continuous kernels. Applications to Markov chains (2006). Available at ArXiv:math.PR/0606680.

[5] A. Hordijk and F. M. Spieksma. On ergodicity and recurrence properties of a Markov chain with an application to an open Jackson network.

Adv. in Appl. Probab.24(1992) 343–376. MR1167263

[6] S. Horowitz. Transition probabilities and contractions ofL_∞.Z. Wahrsch. Verw. Gebiete24(1972) 263–274. MR0331516 [7] U. Krengel.Ergodic Theorems. de Gruyter Studies in Mathematics, de Gruyter, Berlin, 1985. MR0797411

[8] M. Lin. Quasi-compactness and uniform ergodicity of Markov operators. Ann. Inst. H. Poincaré, Sect. B (N.S.)11(1975) 345–354.

MR0402007

[9] M. Lin. Quasi-compactness and uniform ergodicity of positive operators.Israel J. Math.29(1978) 309–311. MR0493502 [10] S. P. Meyn and R. L. Tweedie.Markov Chains and Stochastic Stability. Springer, London, 1993. MR1287609

[11] D. Revuz.Markov Chains. North-Holland, Amsterdam, 1975. MR0758799 [12] H. H. Schaefer.Topological Vector Spaces. Springer, New York, 1971. MR0342978

[13] H. H. Schaefer.Banach Lattices and Positive Operators. Springer, New York, 1974. MR0423039