www.imstat.org/aihp 2008, Vol. 44, No. 6, 1090–1095
DOI: 10.1214/07-AIHP145
© Association des Publications de l’Institut Henri Poincaré, 2008
Quasi-compactness and mean ergodicity for Markov kernels acting on weighted supremum normed spaces
Loïc Hervé
I.R.M.A.R., UMR-CNRS 6625, Institut National des Sciences Appliquées de Rennes, 20, Avenue des Buttes de Couësmes CS 14 315, 35043 Rennes Cedex, France. E-mail: [email protected]
Received 12 December 2006; revised 29 June 2007; accepted 10 September 2007
Abstract. LetP be a Markov kernel on a measurable spaceEwith countably generatedσ-algebra, letw:E→ [1,+∞[such that P w≤CwwithC≥0, and letBw be the space of measurable functions onEsatisfyingfw=sup{w(x)−1|f (x)|, x∈E} <
+∞. We prove thatP is quasi-compact on(Bw, · w)if and only if, for allf ∈Bw,(1nn
k=1Pkf )ncontains a subsequence converging inBw toΠf =d
i=1μi(f )vi, where thevi’s are non-negative bounded measurable functions onEand theμi’s are probability distributions onE. In particular, when the space ofP-invariant functions inBw is finite-dimensional, uniform ergodicity is equivalent to mean ergodicity.
Résumé. SoitPun noyau markovien sur un espace mesurableEmuni d’une tribu à base dénombrable, soitw:E→ [1,+∞[tel queP w≤Cw, avecC≥0, et soitBwl’espace des fonctionsfmesurables deEdansCtelles quefw=sup{w(x)−1|f (x)|, x∈ E} < +∞. Nous démontrons queP est quasi-compact sur(Bw, · w)si et seulement si, pour toutf ∈Bw,(1nn
k=1Pkf )n contient une sous-suite convergeant dansBwversΠf=d
i=1μi(f )vi, oùviest une fonction mesurable positive bornée surEet μiune probabilité surE. En particulier, quand le sous-espace deBwconstitué des fonctionsP-invariantes est de dimension finie, la convergence uniforme des moyennes est équivalente à la convergence ponctuelle.
MSC:37A30; 60J10
Keywords:Markov kernel; Quasi-compactness; Mean ergodicity; Geometrical ergodicity
1. Introduction
Let(E,E)be a measurable space with countably generated σ-algebra, let(B˜, · )denote the space of complex- valued bounded measurable functions onE, equipped with the supremum norm, and letP be a Markov kernel on (E,E). Under some irreducibility conditions,P is quasi-compact onB˜ if and only ifP is mean ergodic with one- dimensional limit projection defined by the uniqueP-invariant distribution. This result was proved in [1] under the Harris condition (see also [11]), and in [8] under the ergodicity condition.1See also [6].
Now letw:E→ [1,+∞[, and let(Bw, · w)denote the Banach space of complex-valued measurable functions onEsatisfyingfw:=sup{w(x)−1|f (x)|, x∈E}<+∞. AssumingP w≤Cw, withC∈R∗+,P acts continuously onBw. This work extends toBwthe equivalence between mean ergodicity with finite rank limit projection and quasi- compactness.
1The equivalence between mean ergodicity and quasi-compactness is not mentioned in [1], but it is an easy consequence of Theorem II.2 in [1]. In [8]Eis not supposed to be countably generated.
Theorem 1. P is quasi-compact onBw if and only if there existd∈N∗,linearly independent non-negative functions v1, . . . , vdinB,˜ andP-invariant distributionsμ1, . . . , μdonEsatisfyingμi(w) <+∞such that,for allf∈Bw,the sequence(n1n
k=1Pkf )ncontains a subsequence converging inBwtod
i=1μi(f )vi.
Observe that the naive idea which consists in applying the similarity transformationP˜:f →w−1P (wf )in order to deduce the theorem from [1,8] does not work becauseP˜ is not Markovian whenP ww>1 (i.e. whenwis not sub-invariant). The proof of Theorem 1 is actually based on a recent work of Hennion [3], which gives criteria for quasi-compactness of kernels acting onBw, on spectral theory [2], and on positive operator theory [12,13]. As in [3], the above theorem does not require any irreducibility or aperiodicity conditions; in this sense, when applied with w=1E, it improves [1,8]. This theorem shows too that a quasi-compact Markov kernel onBw is necessarily power- bounded. This fact was already proved in [4] (Section IV.3), together with the equivalence between quasi-compactness and uniform ergodicity, which also follows from [9].
The above theorem does not hold whenBw is replaced with continuous function spaces. For instance, ifEis a compact metric space andP is uniquely ergodic on the spaceC(E)of all complex-valued continuous functions onE, thenP is mean ergodic [7], but in generalP is not quasi-compact onC(E)(consider irrational rotations of the circle).2 We shall present in Section 3 (Corollary 1) a direct application tow-geometrically ergodic Markov chains [10]
whose transition probability is, by definition, quasi-compact onBw, withλ=1 as a simple eigenvalue and the unique peripheral eigenvalue. Many examples of such Markov chains, with unbounded functionsw, are presented in [10].
A simple example is provided by the linear modelXn=αXn−1+εn, withα∈ ] −1,1[, where(εn)n≥1is an i.i.d.
sequence of real-valued random variables, independent of X0, such thatm=E[|ε1|]<+∞. In this case the state space isE=Rwith its Lebesgue sets, andP (x, A)=E[1A(αx+ε1)], which yieldsPf (x)=E[f (αx+ε1)]. Let w(y)=1+ |y|(y∈R). Then, for anyx∈R, we haveP w(x)=E[w(αx+ε1)] ≤1+ |α||x| +m, soP w≤ |α|w+L, with L=1− |α| +m. From this inequality, called drift condition, one can deduce that, if ε1 has an everywhere positive density, then(Xn)n isw-geometrically ergodic [10] (Section 15.5.2). Observe thatw is not sub-invariant.
Indeed,P w(0)=1+m > w(0), soP ww>1. Obviously, this conclusion extends to any functionw(y)=a+b|y|, with constantsa, b >0. Actually, in most of the examples ofw-geometrically ergodic Markov chains,wis not sub- invariant when it is unbounded.
Finally we shall see in Corollary 2 that, in the special case of denumerable Markov chains, the above theorem en- ables us to obtain an elementary proof of the above mentioned well-known fact that geometric ergodicity is equivalent to some drift condition.
2. Proof of Theorem 1
Proof of⇒. SupposeP is quasi-compact onBw. It is proved in [4] (Section IV.3) that(n1n
k=1Pk)nconverges in the operator norm topology to a finite dimensional projectionΠof the form:Πf=d
i=1φi(f )fi, where thefi’s are linearly independent functions inB˜and theφi’s are bounded complex measures onEsuch that|φi|(w) <+∞, with
|φi|the total variation ofφi. It remains to prove that one can choosefi andφi such thatfi≥0 andφi is a probability measure onE. Notice thatΠ (Bw)⊂ ˜B,Π≥0 andΠ1E=1E.
Let BR be the subspace of Bw composed of real-valued functions. Then Π (BR) is a Banach lattice which is isomorphic toRd with the preservation of the order relation [13]. Consequently there exist non-negative functions g1, . . . , gd inΠ (Bw)and positive linear forme∗1, . . . , e∗d onΠ (Bw)such thatg=d
i=1e∗i(g)gi for allg∈Π (Bw).
Let ψj =ej∗◦Π. The ψj’s are positive continuous linear forms onBw, andψj =d
i=1ej∗(fi)φi. Thus theψj’s are positive bounded measures on Esuch thatψj(w) <+∞. Setμj =ψ1
j(E)ψj andvj =ψj(E)gj. Then Πf = d
i=1ψi(f )gi=d
i=1μi(f )vi, and theμi’s areP-invariant (useΠ P =Π).
2Also considerE= [0,1]andPf (x)=12[f (x2)+f (x+12 )].P is quasi-compact on the space of Lipschitz functions on[0,1], soP is mean ergodic on the space of continuous functions on[0,1], but is not quasi-compact on this space: indeed, for|z|<1,fz=
n≥1zn−1cos(2nπ·)is a continuous function satisfyingPfz=zfz.
Proof of⇐. We shall denote by (ME) the mean ergodicity (subsequential) condition of Theorem 1. We setΠf = d
i=1μi(f )vi. IfT is a continuous linear operator onBw, we denote by Tw its operator norm, and byr(T )its spectral radius. We denote by I the identity operator onBw. Givena∈Candρ >0, we set D(a, ρ)= {z: z∈C,
|z−a| ≤ρ}.
SinceP1E=1E, we haver(P )≥1. Besides, there existsnk +∞such that supkn−k1nk
j=1Pjww<+∞, thus supkn−k1Pnkww<+∞. SincePnw= Pnww, one getsr(P )=limnPn1/nw =1. In particular this yields n≥02−(n+1)Pnw<+∞, so we can define the following bounded operator onBw, which is obviously Markovian:
Q=
n≥0
2−(n+1)Pn=(2I−P )−1.
Proposition 1. Qis quasi-compact onBw. Proof. Let ν= 1dd
i=1μi. Since the σ-algebra E is countably generated, there exist a non-negative measurable functionαon(E×E,E⊗E)and a positive kernelSonEsuch that we haveQ(x,dy)=α(x, y)dν(y)+S(x,dy), withS(x,·)⊥ν, for eachx∈E[11]. Forp∈N∗, setαp=min{α, p}, and
Tp(x,dy)=αp(x, y)dν(y), Sp(x,dy)=Q(x,dy)−Tp(x,dy).
Iff ∈Bw, then|Tpf| ≤ fwTpw≤pν(w)fw, soTp(Bw)⊂ ˜B. BesidesTpacts continuously onBw, and so is Sp. In order to apply [3], observe that, for eachp∈N∗, the functionsαp(w)(x,·)=w(x)−1αp(x,·)w(·),x∈E, are uniformlyν-integrable (useαp(w)(x, y)≤pw(y),ν(w) <+∞and Lebesgue’s theorem).
Finally, sinceQ=φ (P )withφ (z)=
n≥02−(n+1)znandφis analytic onD(0,32), the spectral mapping theorem [2] yieldsr(Q)=φ (r(P ))=φ (1)=1. Proposition 1 then follows from [3] and [4] (Section IV) via the following
lemma.
Lemma 1. There existsp≥1such thatr(Sp) <1.
Proof. Suppose thatr(Sp)=1 for allp≥1. SinceSp≥0, there exists a positive continuous linear form,ηp, onBw
such thatηp=ηp◦Spandηp(w)=1, see [12], p. 267. LetP˜,Q,˜ T˜p,S˜p,η˜pbe the restriction toB˜ofP,Q,Tp,Sp, ηp. Sinceηp=ηp◦Sp≤ηp◦Qand(ηp◦Q−ηp)(1E)=0, we haveη˜p= ˜ηp◦ ˜Q, thusη˜p◦ ˜P= ˜ηp. Moreover we have:
(a) η˜p=0. Indeed, ifη˜p=0, then, fromηp◦Q=ηp◦Tp+ηp◦Sp andTp(Bw)⊂ ˜B, one would getηp◦Q= ηp◦Sp=ηp, thusηp◦P =ηp. Then, by (ME),ηp=d
i=1ηp(vi)μi would be a positive measure onEsuch that ηp(B˜)= {0}, soηp=0, which is impossible.
(b) ∀f ∈ ˜B,ηp(f )=d
i=1ηp(vi)μi(f ). This follows fromη˜p◦ ˜P = ˜ηpand (ME).
Now, from (a) and (b), there exist j ∈ {1, . . . , d} and pk +∞ such that we have ηpk(vj)=0. Besides ηpk(vj)μj(Tpk1E)≤ηpk(Tpk1E)=ηpk(Q1E −Spk1E)=0, thus μj(Tpk1E)=0. When k→ +∞, this gives α(x, y)dν(y)dμj(x)=0, hence
α(x0, y)dν(y)=0 for ax0∈E. SoQ(x0,·)=S(x0,·)⊥ν: there existsA∈E such thatQ(x0, A)=0 andν(A)=1.
But:Q(x0, A)=0 ⇒ ∀n≥1, Pn1A(x0)=0 ⇒ d
i=1μi(A)vi(x0)=0 (by condition (ME)). While: ν(A)=
1 d
d
i=1μi(A)=1 ⇒ μi(A)=1,i=1, . . . , d. Thusd
i=1vi(x0)=0: this is impossible because (ME) gives 1E=d
i=1vi.
We shall denote byσ (Q)andσ (P )the spectrum ofQandP when acting onBw. Lemma 2. We haveσ (Q)\ {1} ⊂D(23,13)∩D(0,1−ε)for a certainε∈ ]0,1[.
Proof. We have Q=φ (P ) with φ (z)= 2−1z, thus σ (Q)=φ (σ (P )) [2]. Since r(P )=1, we get σ (Q)⊂ φ (D(0,1))=D(23,13). Soλ=1 is the unique peripheral spectral value ofQ, and Lemma 2 then follows from Propo-
sition 1.
Lemma 3. λ=1is a first order pole forP,with a corresponding finite-rank residue.
Proof. Setψ (z)=2−1z,z∈C∗. Lemma 2 yields 0∈/σ (Q), soQis invertible onBw,ψis analytic on a neighborhood of σ (Q), andP =2I−Q−1=ψ (Q). Thusσ (P )=ψ (σ (Q)), andσ (P )\ {1} =ψ (σ (Q)\ {1})⊂ψ (D(23,13))∩ ψ (D(0,1−ε))=D(0,1)∩D(2,1−ε1 )c. Thusλ=1 is an isolated point inσ (P ). LetAP andAQ be the residue of the resolvent functions of P andQat λ=1. Letχ be an analytic function on a neighborhood of σ (P )such that χ (V0)= {0}and χ (V1)= {1}, where V0 andV1 are disjoint neighborhoods of the sets σ (P )\ {1} and{1}, respectively. We know thatAP =χ (P )[2], thusAP =χ (ψ (Q)). BesidesW0=ψ−1(V0)andW1=ψ−1(V1)are disjoint neighborhoods of respectivelyσ (Q)\ {1}and{1}, andχ◦ψ is an analytic function onW0∪W1such that χ ◦ψ (W0)= {0},χ ◦ψ (W1)= {1}. Thus AQ=χ ◦ψ (Q), soAP =AQ. Since the Markov kernel Q is quasi- compact onBw (Proposition 1) and Qis power-bounded [4] (Theorem IV.3(i)),λ=1 is a first order pole forQ, andAQ(Bw)=Ker(Q−I )is finite-dimensional by [2] (Theorem VIII.8.3 and Corollary VIII.8.4). By the definition of Qas a series, Pf =f implies Qf =f (f ∈Bw), and the converse holds by usingP =2I −Q−1. Finally AP(Bw)=AQ(Bw)=Ker(Q−I )=Ker(P−I )is finite-dimensional, soλ=1 is a first order pole forP (use the
arguments of [2], Theorem VII.4.5).
Lemma 4. {λ∈σ (P ),|λ| =1}is composed of a finite number of first order poles.
Proof. From Lemma 3 and a classical result concerning the peripheral spectrum of positive operators on Banach lattice [13] (Theorem 5.5, p. 331), the set of peripheral spectral values ofP is composed of a finite number of poles forP. Using the Laurent expansions, Lemma 3 implies that they are first order poles.
Lemma 5. For any peripheral poleλofP,we havedim Ker(P−λI )≤dim Ker(P−I ) <+∞.
Proof. We have dim Ker(P−I ) <+∞by (ME). Letλ1=1, λ2, . . . , λmbe the peripheral poles ofP. The previous results show thatBw=Ker(P−I )⊕F⊕H, whereF =m
i=2Ker(P−λiI ), andHis aP-invariant closed subspace ofBwsuch thatr(P|H) <1, withP|H the restriction ofP toH. Thus(1nn
k=1Pk)nconverges in the operator norm topology to the projection onto Ker(P−I ). Then Lemma 5 follows from [9] (Theorem 2).
The quasi-compactness ofP onBw follows from Lemmas 4 and 5.
3. Applications to geometrically ergodic Markov chains
Let(Xn)n≥0 be a Markov chain with state spaceE and transition probabilityP. Recall that(Xn)n≥0 is said to be w-geometrically ergodic if there exist an invariant distributionν onEsuch thatν(w) <+∞, and some constants r <1 andD∈R+such that for everyf ∈Bw we have
Pnf−ν(f )1E
w≤Drnfw.
Corollary 1. Assume that(Xn)n≥0is an aperiodic positive Harris Markov chain with stationary distributionν.Then (Xn)n≥0isw-geometrically ergodic if and only if one of the two next conditions holds:
(a) ∀f ∈Bw,Pnf →ν(f )1EinBw whenn→ +∞.
(b) For allf ∈Bw,(1nn
k=1Pkf )ncontains a subsequence converging inBwtoν(f )1E.
Corollary 1 is an easy consequence of Theorem 1. (When (b) is assumed, the aperiodicity condition ensures that λ=1 is the unique peripheral eigenvalue ofP.)
The reader will find in [10] many examples of geometrically ergodic Markov chains. Geometric ergodicity with a bounded functionwcorresponds to an aperiodic Markov chain satisfying Doeblin’s condition.
Whenwis unbounded and(Xn)n≥0is aperiodic andψ-irreducible w.r.t. to someσ-finite positive measureψonE, w-geometric ergodicity is equivalent to the following drift condition [10] (Chapter 16): there existρ <1,L >0, and a petite setAinEsuch thatP w0≤ρw0+L1A, wherew0is a function onEsuch thatd−1w≤w0≤dwfor some constantd >0. Corollary 1 sheds new light on this fact, at least for countable Markov chains, and as an illustration, let us present a simple proof of the well-known next statement proved in [5].
Corollary 2. Let (Xn)n≥0 be an aperiodic and irreducible Markov chain with state space E=N, and suppose limkw(k)= +∞.Then (Xn)n≥0 isw-geometrically ergodic if and only if there existρ <1 andC >0 such that Pnw≤Cρnw+Cfor alln≥1.
By using the basic arguments of [10] (Section 16.1.1), one can easily see that the condition in Corollary 2 is equivalent to:∃ρ <1,∃L >0, P w0≤ρw0+L, withw0equivalent tow.
Proof of Corollary 2. If (Xn)n≥0 is w-geometrically ergodic, then Pnw≤Drnw+ν(w). Conversely, suppose Pnw≤Cρnw+Cwithρ <1,C >0, independent ofn. Then we have supn≥1Pnw≤2C, and there exists an in- variant distributionνsuch thatν(w) <+∞.3SetΠn=1nn
k=1Pk, and let1(ν)be the space ofC-valued sequences (x(n))n∈N such that
nν(n)|x(n)|<+∞.P is a contraction of 1(ν), so for anyf ∈1(ν),(Πnf )n converges in1(ν), use e.g. [2] (Section VIII.5). The limitα=limnΠnf isP-invariant, and by irreducibility, it is constant:
∀i∈N, α(i)=ν(f ). Thus limnΠnf (i)=ν(f )for alli∈N.
Now letf ∈Bw, and for convenience assumefw=1 (i.e.|f| ≤w). We have
∀i∈N, Pkf (i)−ν(f )≤Pkw(i)+ν
|f| ≤Cρkw(i)+C+ν(w).
Letε >0. Then there existi0≥1,N0≥1 such thatw(i)−1|Pkf (i)−ν(f )| ≤εfor alli > i0andk > N0. By using the fact that supk≥1Pkww<+∞and
Πnf (i)−ν(f )=1 n
N0
k=0
Pkf (i)−ν(f ) +1 n
n
k=N0+1
Pkf (i)−ν(f ) ,
we easily deduce that there existsN1≥N0such thatw(i)−1|Πnf (i)−ν(f )| ≤2εfor alli > i0andn > N1. Finally letN2≥N1be such thatw(i)−1|Πnf (i)−ν(f )| ≤2εfor alli=0, . . . , i0andn > N2. ThenΠnf−ν(f )w≤2ε
for alln > N2, and Corollary 1 then applies.
Acknowledgments
The author is grateful to Michael Lin for his helpful advice about this work. Also, the author thanks the referee for many very helpful comments which allowed to greatly enhance the content and the presentation of this paper.
References
[1] A. Brunel and D. Revuz. Quelques applications probabilistes de la quasi-compacité.Ann. Inst. H. Poincaré, Sect. B (N.S.)10(1974) 301–337.
MR0373008
[2] N. Dunford and J. T. Schwartz.Linear Operators. Part. I: General Theory. Wiley, New York, 1958. MR1009162
[3] H. Hennion. Quasi-compactness and absolutely continuous kernels.Probab. Theory Related Fields.139(2007) 451–471. MR2322704
3This is a classical fact: consider the distributionsμn(A)= 1n n
k=1(Pk1A)(x0) (x0∈Eis fixed). FromPnw≤Cρnw+C, we easily ob- tain supn≥1μn(w)≤2Cw(x0) <+∞, so(μn)nis tight (use limkw(k)= +∞), and one can select a subsequence converging to an invariant distributionνsuch thatν(w) <+∞.
[4] H. Hennion. Quasi-compactness and absolutely continuous kernels. Applications to Markov chains (2006). Available at ArXiv:math.PR/0606680.
[5] A. Hordijk and F. M. Spieksma. On ergodicity and recurrence properties of a Markov chain with an application to an open Jackson network.
Adv. in Appl. Probab.24(1992) 343–376. MR1167263
[6] S. Horowitz. Transition probabilities and contractions ofL∞.Z. Wahrsch. Verw. Gebiete24(1972) 263–274. MR0331516 [7] U. Krengel.Ergodic Theorems. de Gruyter Studies in Mathematics, de Gruyter, Berlin, 1985. MR0797411
[8] M. Lin. Quasi-compactness and uniform ergodicity of Markov operators. Ann. Inst. H. Poincaré, Sect. B (N.S.)11(1975) 345–354.
MR0402007
[9] M. Lin. Quasi-compactness and uniform ergodicity of positive operators.Israel J. Math.29(1978) 309–311. MR0493502 [10] S. P. Meyn and R. L. Tweedie.Markov Chains and Stochastic Stability. Springer, London, 1993. MR1287609
[11] D. Revuz.Markov Chains. North-Holland, Amsterdam, 1975. MR0758799 [12] H. H. Schaefer.Topological Vector Spaces. Springer, New York, 1971. MR0342978
[13] H. H. Schaefer.Banach Lattices and Positive Operators. Springer, New York, 1974. MR0423039