THE CONVERGENCE OF A FINITE MARKOV CHAIN
ALINA NICOLAIE
We give a sufficient condition for the convergence of a finite state nonhomogeneous Markov chain in terms of convergence of a certain series.
AMS 2000 Subject Classification: 60J10.
Key words: simulated annealing, finite Markov chain.
1. PRELIMINARIES
Consider a finite Markov chain with state space $S = \{1, \dots, r\}$ and transition matrices $(P_n)_{n\ge 1}$. We shall refer to it as the finite Markov chain $(P_n)_{n\ge 1}$. For all integers $m \ge 0$, $n > m$, define
$$P_{m,n} = P_{m+1}P_{m+2}\cdots P_n = \bigl((P_{m,n})_{i,j}\bigr)_{i,j\in S}.$$
Assume that the limit
$$\lim_{n\to\infty} P_n = P \tag{1.1}$$
exists and that the limit matrix $P$ has $p \ge 1$ irreducible aperiodic closed classes and, perhaps, transient states, so that it has the form
$$\begin{pmatrix}
S_1 & 0 & \cdots & 0 & 0\\
0 & S_2 & \cdots & 0 & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \cdots & S_p & 0\\
L_1 & L_2 & \cdots & L_p & T
\end{pmatrix}, \tag{1.2}$$
where the $S_i$ are $r_i \times r_i$ transition matrices, $i = 1, \dots, p$, associated with the $p$ irreducible aperiodic closed classes, $T$ concerns the transitions of the chain as long as it stays in the $r - \sum_{t=1}^{p} r_t$ transient states, and the $L_i$ concern transitions from the transient states into the ergodic sets corresponding to $S_i$, $i = 1, \dots, p$.
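For readers who wish to experiment numerically, the block structure (1.2) can be illustrated with a small stochastic matrix. The following Python sketch builds one such matrix (the particular blocks are an assumption chosen for illustration, not taken from the paper) and checks that it is stochastic.

```python
import numpy as np

# An assumed example of the form (1.2): p = 2 recurrent aperiodic classes,
# r1 = 2, r2 = 1, plus one transient state (so r = 4).
S1 = np.array([[0.5, 0.5],
               [0.7, 0.3]])          # irreducible aperiodic 2x2 block
S2 = np.array([[1.0]])               # absorbing state as a 1x1 block
L1 = np.array([[0.2, 0.3]])          # transient -> class 1
L2 = np.array([[0.1]])               # transient -> class 2
T  = np.array([[0.4]])               # transient -> transient

P = np.block([
    [S1,               np.zeros((2, 1)), np.zeros((2, 1))],
    [np.zeros((1, 2)), S2,               np.zeros((1, 1))],
    [L1,               L2,               T               ],
])
assert np.allclose(P.sum(axis=1), 1.0)   # rows sum to 1: P is stochastic
print(P)
```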
This class of Markov chains was proposed, e.g., as a mathematical model for simulated annealing, a stochastic algorithm for global optimization.
We refer to van Laarhoven and Aarts [8] for a general exposition and historical background.
Definition 1.1. A probability distribution $\mu = (\mu_1, \dots, \mu_r)$ is said to be invariant with respect to an $r \times r$ stochastic matrix $P$ if $\mu P = \mu$.
We shall need the following result.
Theorem 1.2. Consider a finite homogeneous Markov chain with state space $S$ and transition matrix $P$ of the form (1.2). Then
$$\lim_{n\to\infty} P^n = \begin{pmatrix}
\Gamma_1 & 0 & \cdots & 0 & 0\\
0 & \Gamma_2 & \cdots & 0 & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \cdots & \Gamma_p & 0\\
\Omega_1 & \Omega_2 & \cdots & \Omega_p & 0
\end{pmatrix}, \tag{1.3}$$
where
$$\Gamma_i = \begin{pmatrix}
\mu^{(i)}_1 & \cdots & \mu^{(i)}_{r_i}\\
\vdots & & \vdots\\
\mu^{(i)}_1 & \cdots & \mu^{(i)}_{r_i}
\end{pmatrix}, \quad i = 1, \dots, p,$$
are strictly positive $r_i \times r_i$ matrices; each row of the matrix $\Gamma_i$ is the invariant probability vector $\mu^{(i)} = (\mu^{(i)}_1, \dots, \mu^{(i)}_{r_i})$ with respect to the matrix $S_i$, $i = 1, \dots, p$, and
$$\Omega_i = \begin{pmatrix}
\mu^{(i)}_1 z_{r_1+r_2+\cdots+r_p+1,\,i} & \cdots & \mu^{(i)}_{r_i} z_{r_1+r_2+\cdots+r_p+1,\,i}\\
\vdots & & \vdots\\
\mu^{(i)}_1 z_{r,\,i} & \cdots & \mu^{(i)}_{r_i} z_{r,\,i}
\end{pmatrix}$$
are $\bigl(r - \sum_{t=1}^{p} r_t\bigr) \times r_i$ matrices, where $z_{j,i}$ is the probability that the chain will enter, and thus be absorbed in, the set corresponding to $S_i$, given that the initial state is $j$ from the set corresponding to $T$, $j = \sum_{t=0}^{p} r_t, \dots, r$, $i = 1, \dots, p$ (with the convention $r_0 = 1$).
Proof. For the form of $\Gamma_i$, $i = 1, \dots, p$, see, e.g., [5, p. 123], and for $\Omega_i$, $i = 1, \dots, p$, see, e.g., [7, p. 91].
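As a quick numerical illustration of Theorem 1.2 (a sketch only, on an assumed 4-state matrix of the form (1.2), not an example from the paper), one can approximate $\lim_n P^n$ and compare its rows with the invariant vector of $S_1$ and with the absorption probabilities from the transient state.

```python
import numpy as np

# Assumed 4-state example of the form (1.2): classes {1,2} and {3}, state 4 transient.
P = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.2, 0.3, 0.1, 0.4],
])
Pn = np.linalg.matrix_power(P, 200)          # numerical stand-in for lim_n P^n

S1 = P[:2, :2]
w, V = np.linalg.eig(S1.T)                   # left eigenvector of S1 for eigenvalue 1
mu1 = np.real(V[:, np.argmax(np.real(w))])
mu1 /= mu1.sum()                             # invariant probability vector mu^(1)

print("mu^(1)          :", mu1)
print("rows of Gamma_1 :", Pn[0, :2], Pn[1, :2])   # each should equal mu^(1)
print("transient row   :", Pn[3, :])  # (mu_1^(1) z_{4,1}, mu_2^(1) z_{4,1}, z_{4,2}, 0), cf. (1.3)
```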
Remark 1.3. Clearly,
$$z_{j,i} \ge 0, \quad \forall j = \sum_{t=0}^{p} r_t, \dots, r, \ \forall i = 1, \dots, p, \tag{1.4}$$
and
$$\sum_{i=1}^{p} z_{j,i} = 1, \quad \forall j = \sum_{t=0}^{p} r_t, \dots, r. \tag{1.5}$$
Theorem 1.4 (see, e.g., [6]). If P is the transition matrix of a finite Markov chain, then the multiplicity of the eigenvalue 1 is equal to the number of irreducible closed subsets of the chain.
Proof. See, e.g., [6, p. 126].
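A minimal numerical check of Theorem 1.4 might look as follows, again on the assumed 4-state example used above (two irreducible closed classes); the eigenvalue 1 should appear with multiplicity 2.

```python
import numpy as np

# Assumed 4-state example with two irreducible closed classes, {1,2} and {3}.
P = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.2, 0.3, 0.1, 0.4],
])
eigvals = np.linalg.eigvals(P)
mult_one = np.sum(np.isclose(eigvals, 1.0))
print("eigenvalues of P  :", np.round(eigvals, 4))
print("multiplicity of 1 :", mult_one)   # expected: 2
```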
A vector $x \in \mathbb{C}^n$ will be understood as a column vector and $x'$ denotes the transpose of $x$.
Theorem 1.5. Let $A = -I_r + P$ with $P$ of the form (1.2). Then there exists a nonsingular complex $r \times r$ matrix $Q$ such that
$$A = QJQ^{-1}, \tag{1.6}$$
where $J$ is an $r \times r$ Jordan matrix, and $Q$ reads as
$$Q = \begin{pmatrix}
1 & 0 & \cdots & 0 & \cdots\\
\vdots & \vdots & & \vdots & \\
1 & 0 & \cdots & 0 & \cdots\\
0 & 1 & \cdots & 0 & \cdots\\
\vdots & \vdots & & \vdots & \\
0 & 1 & \cdots & 0 & \cdots\\
\vdots & \vdots & & \vdots & \\
0 & 0 & \cdots & 1 & \cdots\\
\vdots & \vdots & & \vdots & \\
0 & 0 & \cdots & 1 & \cdots\\
z_{r_1+r_2+\cdots+r_p+1,\,1} & z_{r_1+r_2+\cdots+r_p+1,\,2} & \cdots & z_{r_1+r_2+\cdots+r_p+1,\,p} & \cdots\\
\vdots & \vdots & & \vdots & \\
z_{r,\,1} & z_{r,\,2} & \cdots & z_{r,\,p} & \cdots
\end{pmatrix},$$
where the first column contains 1 in rows $1, \dots, r_1$, the next $p-1$ columns contain 1 in rows $r_1+\cdots+r_{i-1}+1, \dots, r_1+\cdots+r_i$, $i = 2, \dots, p$, and the last $r-p$ columns comprise complex numbers. The $z_{j,i}$, $j = \sum_{t=0}^{p} r_t, \dots, r$, $i = 1, \dots, p$, have the meaning given in Theorem 1.2. The inverse $Q^{-1}$ has the form
$$Q^{-1} = \begin{pmatrix}
\mu^{(1)}_1 & \cdots & \mu^{(1)}_{r_1} & 0 & \cdots & 0 & 0 & \cdots & 0 & 0 & \cdots & 0\\
\vdots & & & & & & & & & & & \vdots\\
0 & \cdots & 0 & 0 & \cdots & 0 & \mu^{(p)}_1 & \cdots & \mu^{(p)}_{r_p} & 0 & \cdots & 0\\
q_{p+1,1} & & & & & \cdots & & & & & & q_{p+1,r}\\
\vdots & & & & & & & & & & & \vdots\\
q_{r,1} & & & & & \cdots & & & & & & q_{r,r}
\end{pmatrix},$$
where the $\mu^{(i)}$ are the invariant probability vectors with respect to $S_i$, $i = 1, \dots, p$, and the last $r-p$ rows comprise complex numbers.
Proof. For the existence of $Q$ see, e.g., [4, p. 126]. See also [1]. We shall need some spectral properties of $A$. We have that
$$\lambda_1 = 0 \tag{1.7}$$
is an eigenvalue of $A$ whose algebraic multiplicity is equal to its geometric multiplicity and equal to $p$. All other distinct eigenvalues $\lambda_2, \dots, \lambda_{l+s}$ of $A$ satisfy
$$|\lambda_i + 1| < 1, \quad i = 2, \dots, l+s, \tag{1.8}$$
$$\operatorname{Re}\lambda_i < 0, \quad i = 2, \dots, l+s. \tag{1.9}$$
Indeed, by Theorem 1.4, the characteristic polynomial of $P$ has the form
$$\det(P - \vartheta I_r) = (\vartheta - 1)^p R(\vartheta),$$
where $\deg R = r - p$ and $R(1) \ne 0$. Substituting $P = I_r + A$ we obtain
$$\det(A - \lambda I_r) = \lambda^p R(\lambda + 1),$$
where $\lambda = \vartheta - 1$, and so (1.7) follows.
On the other hand, if $\vartheta_i$ denote the other distinct eigenvalues of $P$, different from 1, with left eigenvectors $\varphi_i$, $i = 2, \dots, l+s$, then, because $P$ is stochastic with aperiodic recurrent classes, we have $|\vartheta_i| < 1$, $i = 2, \dots, l+s$, and
$$\varphi_i P = \vartheta_i \varphi_i, \quad i = 2, \dots, l+s.$$
Subtracting $\varphi_i$ from both sides we obtain
$$\varphi_i A = (\vartheta_i - 1)\varphi_i, \quad i = 2, \dots, l+s.$$
Hence $\lambda_i = \vartheta_i - 1$, $i = 2, \dots, l+s$, are eigenvalues of $A$, and (1.8) and (1.9) follow. From (1.7) (see [4, pp. 129–131]) we have
$$J = \begin{pmatrix}
J_1 & 0 & \cdots & 0 & 0 & \cdots & 0\\
0 & J_2 & \cdots & 0 & 0 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots\\
0 & 0 & \cdots & J_l & 0 & \cdots & 0\\
0 & 0 & \cdots & 0 & J_{l+1} & \cdots & 0\\
\vdots & \vdots & & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & 0 & 0 & \cdots & J_{l+s}
\end{pmatrix},$$
where $J_1 = 0_p$, i.e., the $p \times p$ zero matrix, $J_k$ is a diagonal $m_k \times m_k$ matrix with diagonal entries the eigenvalue $\lambda_k$, $k = 2, \dots, l$, whose algebraic and geometric multiplicities are equal, and
$$J_{l+i} = \begin{pmatrix}
\lambda_{l+i} & \varepsilon^{(i)}_1 & 0 & \cdots & 0\\
0 & \lambda_{l+i} & \varepsilon^{(i)}_2 & \cdots & 0\\
\vdots & & \ddots & \ddots & \vdots\\
0 & 0 & \cdots & \lambda_{l+i} & \varepsilon^{(i)}_{m_{l+i}-1}\\
0 & 0 & \cdots & 0 & \lambda_{l+i}
\end{pmatrix}$$
are $m_{l+i} \times m_{l+i}$ matrices corresponding to eigenvalues whose geometric multiplicities are smaller than their algebraic multiplicities, with $\varepsilon^{(i)}_t \in \{0,1\}$, $t = 1, \dots, m_{l+i}-1$, $i = 1, \dots, s$. Clearly, $p + m_2 + \cdots + m_{l+s} = r$.
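The spectral facts (1.7)-(1.9) are easy to check numerically. The following sketch (using the same assumed 4-state matrix as above, chosen purely for illustration) verifies that 0 is an eigenvalue of $A = -I_r + P$ of multiplicity $p$ and that the remaining eigenvalues satisfy $|\lambda+1| < 1$ and $\operatorname{Re}\lambda < 0$.

```python
import numpy as np

# Assumed 4-state example of the form (1.2) with p = 2 closed classes.
P = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.2, 0.3, 0.1, 0.4],
])
A = P - np.eye(4)
lam = np.linalg.eigvals(A)
print("eigenvalues of A :", np.round(lam, 4))
print("multiplicity of 0:", np.sum(np.isclose(lam, 0.0)))       # = p = 2, cf. (1.7)
nonzero = lam[~np.isclose(lam, 0.0)]
print("|lambda + 1| < 1 ?", np.all(np.abs(nonzero + 1) < 1))    # (1.8)
print("Re lambda < 0   ?", np.all(nonzero.real < 0))            # (1.9)
```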
A key step in obtaining the first $p$ columns of $Q$ is the remark that, writing (1.6) as $AQ = QJ$, we see that $A$ applied to the $t$th column of $Q$ yields the zero vector, $\forall t = 1, \dots, p$; hence the vectors corresponding to the first $p$ columns of $Q$ are right eigenvectors associated with the eigenvalue $\lambda_1 = 0$ of $A$.
Now, we shall determine the right eigenspace corresponding to λ1 by solving the linear system
$$(A - 0\cdot I_r)X = 0_r, \tag{1.10}$$
which is equivalent to $PX = X$, where $X = (x_1, \dots, x_r)'$.
The first $\sum_{t=1}^{p} r_t$ rows of the system (1.10) are equivalent to
$$(S_i - I_{r_i})(x_{r_0+r_1+\cdots+r_{i-1}}, \dots, x_{r_1+\cdots+r_i})' = 0, \quad i = 1, \dots, p,$$
with the general solution
$$(x_{r_0+r_1+\cdots+r_{i-1}}, \dots, x_{r_1+\cdots+r_i}) = \alpha_i(1, 1, \dots, 1), \quad \alpha_i \in \mathbb{C}, \ i = 1, \dots, p. \tag{1.11}$$
For the next components $x_{r_1+\cdots+r_p+1}, \dots, x_r$ we note that
$$PX = X \Rightarrow P^k X = X, \quad \forall k \ge 1.$$
For the last $r - \sum_{t=1}^{p} r_t$ components, the above equation amounts to
$$\sum_{j=1}^{r} (P^k)_{r_1+\cdots+r_p+i,\,j}\, x_j = x_{r_1+\cdots+r_p+i}, \quad i = 1, \dots, r - \sum_{t=1}^{p} r_t, \ k \ge 1.$$
Equivalently,
$$\sum_{t=0}^{p-1} \sum_{j=r_0+\cdots+r_t}^{r_1+\cdots+r_{t+1}} (P^k)_{r_1+\cdots+r_p+i,\,j}\, x_j + \sum_{j=r_1+\cdots+r_p+1}^{r} (P^k)_{r_1+\cdots+r_p+i,\,j}\, x_j = x_{r_1+\cdots+r_p+i}.$$
For $j = \sum_{t=0}^{p} r_t, \dots, r$ and $i = 1, \dots, r - \sum_{t=1}^{p} r_t$ we have (see [5, p. 91])
$$\lim_{k\to\infty} (P^k)_{r_1+\cdots+r_p+i,\,j} = 0, \tag{1.12}$$
and for $i = 1, \dots, r - \sum_{t=1}^{p} r_t$, from representation (1.3), we get
$$\lim_{k\to\infty} (P^k)_{r_1+\cdots+r_p+i,\,r_0+\cdots+r_{s-1}+t-1} = \mu^{(s)}_t z_{r_1+\cdots+r_p+i,\,s}, \quad s = 1, \dots, p, \ t = 1, \dots, r_s.$$
So, letting $k \to \infty$ (the last sum vanishes by (1.12)) and using (1.11) together with the fact that the $\mu^{(i)}$, $i = 1, \dots, p$, are probability vectors, we obtain
$$\sum_{j=1}^{p} z_{r_1+r_2+\cdots+r_p+i,\,j}\, \alpha_j = x_{r_1+\cdots+r_p+i}, \quad i = 1, \dots, r - \sum_{t=1}^{p} r_t.$$
We thus obtain the general solution of (1.10),
$$\sum_{j=1}^{p} \alpha_j (0, \dots, 0, 1, \dots, 1, 0, \dots, 0, z_{r_1+\cdots+r_p+1,\,j}, z_{r_1+\cdots+r_p+2,\,j}, \dots, z_{r,\,j})',$$
with $\alpha_j \in \mathbb{C}$, $j = 1, \dots, p$, where the components corresponding to the positions $r_0+\cdots+r_{j-1}, \dots, r_1+\cdots+r_j$ are 1 and those corresponding to the remaining positions from $1, \dots, \sum_{t=1}^{p} r_t$ are 0, $j = 1, \dots, p$.
Next, we pick particular values for the parameters $\alpha_i$, $i = 1, \dots, p$, to obtain $p$ linearly independent vectors; choosing $\alpha_1 = 1$, $\alpha_2 = \cdots = \alpha_p = 0$, then $\alpha_2 = 1$, $\alpha_1 = \alpha_3 = \cdots = \alpha_p = 0$, etc., we obtain the first $p$ columns of the matrix $Q$.
A key step in obtaining the first $p$ rows of $Q^{-1}$ is the remark that writing (1.6) as $Q^{-1}A = JQ^{-1}$ means that the $t$th row of $Q^{-1}$ applied to $A$ is equal to the $t$th row of $J$ applied to $Q^{-1}$, $\forall t = 1, \dots, p$. Since the matrix $JQ^{-1}$ has its first $p$ rows equal to the zero vector, we conclude that the $t$th row of $Q^{-1}$ is a left eigenvector of $A$ associated with $\lambda_1 = 0$, $\forall t = 1, \dots, p$.
Next, we shall determine the left eigenspace corresponding to $\lambda_1$ by solving the linear system
$$Y'(A - 0\cdot I_r) = 0'_r, \tag{1.13}$$
which is equivalent to $Y'P = Y'$, where $Y' = (y_1, \dots, y_r)$. The first $\sum_{t=1}^{p} r_t$ equations of system (1.13) are equivalent (using the fact, established below, that the components $y_{r_1+\cdots+r_p+1}, \dots, y_r$ vanish) to
$$(y_{r_0+r_1+\cdots+r_{i-1}}, \dots, y_{r_1+\cdots+r_i})(S_i - I_{r_i}) = 0, \quad i = 1, \dots, p,$$
with the general solution
$$(y_{r_0+r_1+\cdots+r_{i-1}}, \dots, y_{r_1+\cdots+r_i}) = \beta_i(\mu^{(i)}_1, \dots, \mu^{(i)}_{r_i}), \quad \beta_i \in \mathbb{C}, \ i = 1, \dots, p.$$
For the next components $y_{r_1+\cdots+r_p+1}, \dots, y_r$ we note that
$$Y'P = Y' \Rightarrow Y'P^k = Y', \quad \forall k \ge 1.$$
For the last $r - \sum_{t=1}^{p} r_t$ components, the above equation amounts to
$$\sum_{j=1}^{r} y_j (P^k)_{j,\,r_1+\cdots+r_p+i} = y_{r_1+\cdots+r_p+i}, \quad i = 1, \dots, r - \sum_{t=1}^{p} r_t, \ k \ge 1.$$
Equivalently,
$$\sum_{t=0}^{p-1} \sum_{j=r_0+\cdots+r_t}^{r_1+\cdots+r_{t+1}} y_j (P^k)_{j,\,r_1+\cdots+r_p+i} + \sum_{j=r_1+\cdots+r_p+1}^{r} y_j (P^k)_{j,\,r_1+\cdots+r_p+i} = y_{r_1+\cdots+r_p+i}.$$
By letting $k \to \infty$ in the last equation and using (1.12) and (1.3), we get
$$y_{r_1+\cdots+r_p+i} = 0, \quad i = 1, \dots, r - \sum_{t=1}^{p} r_t.$$
We thus obtain the general solution of (1.13),
$$\sum_{j=1}^{p} \beta_j (0, \dots, 0, \mu^{(j)}_1, \dots, \mu^{(j)}_{r_j}, 0, \dots, 0),$$
with $\beta_j \in \mathbb{C}$, $j = 1, \dots, p$, where the components corresponding to the subscripts $r_0+\cdots+r_{j-1}, \dots, r_1+\cdots+r_j$ are the components of the invariant vector associated with $S_j$ and the others are equal to 0.
Finally, we pick particular values for the parameters $\beta_i$, $i = 1, \dots, p$, to obtain $p$ linearly independent vectors; choosing $\beta_1 = 1$, $\beta_2 = \cdots = \beta_p = 0$, then $\beta_2 = 1$, $\beta_1 = \beta_3 = \cdots = \beta_p = 0$, etc., we obtain the first $p$ rows of the matrix $Q^{-1}$.
If $A = (A_{ij})$ is an $m \times n$ matrix, then (see [4, p. 295]) we define the matrix norm
$$|||A|||_\infty = \max_{i=1,\dots,m} \sum_{j=1}^{n} |A_{ij}|.$$
For $M \subseteq \{1, \dots, m\}$, $N \subseteq \{1, \dots, n\}$, $M, N \ne \emptyset$, we also define $A_{M\times N} = (A_{ij})_{i\in M,\, j\in N}$.
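In code, the norm $||| \cdot |||_\infty$ and the submatrix notation $A_{M\times N}$ used below could be realized as follows (a small helper sketch; the names norm_inf and submatrix are ours, not from any library).

```python
import numpy as np

def norm_inf(A):
    """|||A|||_inf: maximum over rows of the sum of absolute values."""
    return np.max(np.sum(np.abs(A), axis=1))

def submatrix(A, M, N):
    """A_{M x N}: rows in M, columns in N (index sets given 1-based, as in the text)."""
    return A[np.ix_([i - 1 for i in M], [j - 1 for j in N])]

A = np.array([[1.0, -2.0, 0.5],
              [0.0,  3.0, -1.0]])
print(norm_inf(A))                       # max(3.5, 4.0) = 4.0
print(submatrix(A, M=[1, 2], N=[1, 2]))  # the first two columns of A
```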
Theorem 1.6 (see, e.g., [6]). Let $(a_n)_{n\ge 0}$ be a sequence of real numbers convergent to 0 and $\sum_{n=0}^{\infty} b_n$ an absolutely convergent series. Then
$$\lim_{n\to\infty} \sum_{i=0}^{n} a_i b_{n-i} = 0.$$
Proof. See, e.g., [6, pp. 34–36].
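Theorem 1.6 is easy to illustrate numerically; in the sketch below the choices $a_n = 1/(n+1)$ and $b_n = (1/2)^n$ are assumptions made purely for illustration, and the convolution terms visibly tend to 0.

```python
import numpy as np

N = 5000
a = 1.0 / (np.arange(N) + 1.0)   # a_n = 1/(n+1) -> 0
b = 0.5 ** np.arange(N)          # sum_n b_n converges absolutely

def conv_term(n):
    # sum_{i=0}^{n} a_i * b_{n-i}
    return np.sum(a[: n + 1] * b[n::-1])

for n in (10, 100, 1000, 4999):
    print(n, conv_term(n))       # tends to 0 as n grows
```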
Theorem 1.7 (see, e.g., [3]). Let $||| \cdot |||$ be a matrix norm and let $(U_n)_{n\ge 1}$ be a sequence of matrices such that $|||U_n||| \le 1$, $\forall n \ge 1$. Then the following statements are equivalent.
(i) $\prod_{n=r}^{\infty} U_n$ converges, $\forall r \ge 1$.
(ii) $\prod_{n=1}^{\infty} [U_n + A_n]$ converges for all sequences $(A_n)_{n\ge 1}$ with $\sum_{n=1}^{\infty} |||A_n||| < \infty$.
Proof. See, e.g., [3, p. 101].
Theorem 1.8 (see, e.g., [4]). Let $A$ be an $n \times n$ matrix. Then $A$ is invertible if there exists a matrix norm $||| \cdot |||$ such that $|||I_n - A||| < 1$. If so, then
$$A^{-1} = \sum_{k=0}^{\infty} (I_n - A)^k.$$
Proof. See, e.g., [4, p. 301].
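A short numerical sketch of Theorem 1.8 (the $2\times 2$ matrix below is an assumed example): when $|||I_n - A|||_\infty < 1$, the partial sums of the Neumann series approach $A^{-1}$.

```python
import numpy as np

A = np.array([[1.0, 0.2],
              [0.1, 0.9]])
E = np.eye(2) - A
assert np.max(np.sum(np.abs(E), axis=1)) < 1     # |||I - A|||_inf < 1

approx = np.zeros((2, 2))
term = np.eye(2)
for _ in range(200):              # partial sums of sum_k (I - A)^k
    approx += term
    term = term @ E
print(np.allclose(approx, np.linalg.inv(A)))     # True
```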
Theorem 1.9 (see, e.g., [4]). (i) If $||| \cdot |||$ is an arbitrary matrix norm, then $|||I||| \ge 1$.
(ii) Let $A$ be an $n \times n$ matrix. If $||| \cdot |||$ is a matrix norm such that $|||A||| < 1$, then
$$|||(I_n - A)^{-1}||| \le \frac{|||I_n||| - (|||I_n||| - 1)|||A|||}{1 - |||A|||}.$$
Proof. See, e.g., [4, pp. 301–302].
2. A CONVERGENCE RESULT
In this section we give our main result following an idea of a theorem of Gidas [1]. Our result is a sufficient condition for the convergence of a finite state nonhomogeneous Markov chain in terms of convergence of a certain series.
Definition 2.1 ([9]). A finite Markov chain $(P_n)_{n\ge 1}$ is said to be convergent if, $\forall m \ge 0$, the sequence $(P_{m,n})_{n>m}$ converges.
Definition 2.2 ([9]). Two Markov chains $(P_n)_{n\ge 1}$ and $(P'_n)_{n\ge 1}$ are said to be equivalent if
$$\sum_{n\ge 1} |||P_n - P'_n|||_\infty < \infty.$$
Theorem 2.3 ([2]). For two equivalent chains $(P_n)_{n\ge 1}$ and $(P'_n)_{n\ge 1}$ we have
$$\exists \lim_{n\to\infty} P_{m,n} \quad \text{if and only if} \quad \exists \lim_{n\to\infty} P'_{m,n}, \quad \forall m \ge 1.$$
Consequently, two equivalent stochastic chains converge or diverge simultaneously.
Proof. See [2].
The general results from Theorem 2.4 below are new. For the special case p= 1 see also [5, p. 226].
Theorem 2.4. Let $(P_n)_{n\ge 1}$ be a nonhomogeneous Markov chain with state space $S$ such that $P_n \to P$ as $n \to \infty$. Suppose that $P$ has exactly $p \ge 1$ recurrent aperiodic classes $S_i$, $i = 1, \dots, p$, and, possibly, transient states, i.e., $P$ is of the form (1.2). Let $\mu^{(i)}$ be the invariant probability vectors with respect to $S_i$, $i = 1, \dots, p$, and $z_{j,i}$, $j = \sum_{t=1}^{p} r_t + 1, \dots, r$, $i = 1, \dots, p$, as in Theorem 1.2. Let $V_n = P_n - P$, $\forall n \ge 1$, so that $\lim_{n\to\infty} V_n = 0$. Let $Q$ and $Q^{-1}$ be as in Theorem 1.5. Then
$$\lim_{n\to\infty} (P_{m,n})_{i,j} = 0, \quad i \in S, \ j = \sum_{t=0}^{p} r_t, \dots, r, \ \forall m \ge 0.$$
If, moreover,
$$\sum_{n=1}^{\infty} |||(Q^{-1}V_nQ)_{S\times M}|||_\infty < \infty, \quad \text{where } M = \{1, \dots, p\}, \tag{2.1}$$
then the chain $(P_n)_{n\ge 1}$ is convergent.
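Before the proof, here is a hedged numerical sketch of the theorem's conclusions on an assumed 4-state chain (not from the paper), with $P_n = P + V_n$ and $|||V_n|||_\infty$ of order $1/n^2$; since $\sum_n |||V_n|||_\infty < \infty$, condition (2.1) holds for any choice of $Q$, the transient column of $P_{0,n}$ tends to 0, and the products settle down.

```python
import numpy as np

# Assumed example: P of the form (1.2), perturbed by a summable V_n.
P = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.2, 0.3, 0.1, 0.4],
])

def P_n(n):
    """P_n = P + V_n with |||V_n|||_inf = 0.2/n^2; rows stay stochastic for n >= 1."""
    V = np.zeros((4, 4))
    V[0, 0], V[0, 1] = -0.1 / n**2, 0.1 / n**2
    return P + V

prod = np.eye(4)                       # P_{0,n} = P_1 P_2 ... P_n
for n in range(1, 2001):
    prod = prod @ P_n(n)

print(np.round(prod, 4))               # column 4 (transient) ~ 0; the limit exists
```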
Proof. By the Chapman-Kolmogorov equation we have
$$P_{m,n} = P_{m,n-1}P_n, \quad 0 \le m < n$$
(with the convention $P_{n,n} = I_r$). Subtracting $P_{m,n-1}$ from both sides, we obtain
$$P_{m,n} - P_{m,n-1} = P_{m,n-1}[-I_r + P_n], \quad 0 \le m < n. \tag{2.2}$$
Setting
$$x^{(i)}_n = x^{(i)}_n(m) = \bigl((P_{m,n})_{i,1}, \dots, (P_{m,n})_{i,r}\bigr), \quad 0 \le m < n, \tag{2.3}$$
and
$$x^{(i)}_m = e_i, \quad i \in S,$$
where $(e_i)_{i=1,\dots,r}$ is the canonical basis of the linear space $\mathbb{R}^r$, equation (2.2) reads as
$$x^{(i)}_n - x^{(i)}_{n-1} = x^{(i)}_{n-1}[-I_r + P_n], \quad 0 \le m < n, \ i \in S. \tag{2.4}$$
We remark that the $x^{(i)}_n$ defined in (2.3) are solutions of equations of the type
$$x_n - x_{n-1} = x_{n-1}[-I_r + P_n], \quad n \ge 1, \tag{2.5}$$
with
$$x_n = (x_{n,1}, \dots, x_{n,r}), \quad n \ge 0,$$
under the conditions
$$x_{n,i} \ge 0, \ i = 1, \dots, r, \qquad \sum_{i=1}^{r} x_{n,i} = 1, \quad n \ge 0. \tag{2.6}$$
We are interested in the asymptotic behaviour of (2.5) under conditions (2.6).
Setting
$$A = -I_r + P,$$
we can make use of the result given in Theorem 1.5. Also, setting
$$y_n = x_nQ, \quad n \ge 0, \tag{2.7}$$
and
$$\widetilde{V}_n = Q^{-1}V_nQ, \quad n \ge 1, \tag{2.8}$$
equation (2.5) becomes
$$y_n - y_{n-1} = y_{n-1}J + y_{n-1}\widetilde{V}_n, \quad n \ge 1. \tag{2.9}$$
From (2.7) we have $x_n = y_nQ^{-1}$, hence
$$x_{n,i} = \mu^{(1)}_i y_{n,1} + \sum_{j=p+1}^{r} y_{n,j}\, q_{j,i}, \quad i = 1, \dots, r_1, \tag{2.10}$$
$$x_{n,\,r_1+\cdots+r_{t-1}+i} = \mu^{(t)}_i y_{n,t} + \sum_{j=p+1}^{r} y_{n,j}\, q_{j,\,r_1+\cdots+r_{t-1}+i}, \quad t = 2, \dots, p, \ i = 1, \dots, r_t, \tag{2.11}$$
$$x_{n,i} = \sum_{j=1}^{r-p} y_{n,p+j}\, q_{p+j,\,i}, \quad i = \sum_{t=0}^{p} r_t, \dots, r. \tag{2.12}$$
Next, equations (2.9) amount to
$$y_{n,i} - y_{n-1,i} = \sum_{j=1}^{p} y_{n-1,j}(\widetilde{V}_n)_{j,i} + \sum_{j=p+1}^{r} y_{n-1,j}(\widetilde{V}_n)_{j,i}, \quad i = 1, \dots, p,$$
$$y_{n,i} - y_{n-1,i} = \lambda_2 y_{n-1,i} + \sum_{j=1}^{r} y_{n-1,j}(\widetilde{V}_n)_{j,i}, \quad i = p+1, \dots, p+m_2,$$
$$y_{n,i} - y_{n-1,i} = \lambda_t y_{n-1,i} + \sum_{j=1}^{r} y_{n-1,j}(\widetilde{V}_n)_{j,i}, \quad i = p+\sum_{s=2}^{t-1} m_s+1, \dots, p+\sum_{s=2}^{t} m_s, \ t = 3, \dots, l,$$
$$y_{n,\,p+\sum_{s=2}^{l} m_s+1} - y_{n-1,\,p+\sum_{s=2}^{l} m_s+1} = \lambda_{l+1} y_{n-1,\,p+\sum_{s=2}^{l} m_s+1} + \sum_{j=1}^{r} y_{n-1,j}(\widetilde{V}_n)_{j,\,p+\sum_{s=2}^{l} m_s+1},$$
$$y_{n,i} - y_{n-1,i} = \lambda_{l+1} y_{n-1,i} + \varepsilon^{(1)}_k y_{n-1,i-1} + \sum_{j=1}^{r} y_{n-1,j}(\widetilde{V}_n)_{j,i}, \quad i = p+\sum_{t=2}^{l} m_t+2, \dots, p+\sum_{t=2}^{l+1} m_t, \ k = 1, \dots, m_{l+1}-1,$$
and similar equations for $i = p+\sum_{t=2}^{l+1} m_t+1, \dots, r$.
Next, we shall study the asymptotic behaviour of $y_{n,i}$, $i = p+1, \dots, r$. The boundedness of $y_n$, $n \ge 0$, and the fact that $\lim_{n\to\infty} V_n = 0$ allow one to assert that
$$W_{n,i} := \sum_{j=1}^{r} y_{n-1,j}(\widetilde{V}_n)_{j,i} \to 0 \ \text{as } n \to \infty, \quad i = p+1, \dots, r. \tag{2.13}$$
In the above system we shall be concerned with the equations corresponding to $i \in \{p+1, \dots, p+\sum_{t=2}^{l} m_t+1\}$. Equivalently, we can write
$$y_{n,i} = (\lambda_2+1)y_{n-1,i} + W_{n,i}, \quad n \ge 1, \ i = p+1, \dots, p+m_2,$$
respectively,
$$y_{n,i} = (\lambda_t+1)y_{n-1,i} + W_{n,i}, \quad n \ge 1, \ t = 3, \dots, l, \ i = p+\sum_{s=2}^{t-1} m_s+1, \dots, p+\sum_{s=2}^{t} m_s.$$
Therefore,
$$|y_{n,i}| \le |y_{0,i}|\,|\lambda_2+1|^n + \sum_{s=1}^{n} |W_{s,i}|\,|\lambda_2+1|^{n-s}, \quad n \ge 1, \ i = p+1, \dots, p+m_2, \tag{2.14}$$
and similar inequalities hold for $i = p+\sum_{s=2}^{t-1} m_s+1, \dots, p+\sum_{s=2}^{t} m_s$, $t = 3, \dots, l$.
By Theorem 1.6, using $\sum_{n=0}^{\infty} |\lambda_2+1|^n < \infty$ (see (1.8)) and (2.13), the second term on the right-hand side of (2.14) converges to zero as $n \to \infty$. Using also (1.8), we get
$$\lim_{n\to\infty} y_{n,i} = 0, \quad i = p+1, \dots, p+m_2,$$
regardless of the initial data $y_0$ and, similarly,
$$\lim_{n\to\infty} y_{n,i} = 0, \quad i = p+m_2+1, \dots, p+\sum_{t=2}^{l} m_t+1. \tag{2.15}$$
Next, we pay attention to the equations corresponding to $i$ from $p+\sum_{t=2}^{l} m_t+2$ to $p+\sum_{t=2}^{l+1} m_t$. The case $\varepsilon^{(1)}_k = 0$ for some $k = 1, \dots, m_{l+1}-1$ is similar to the preceding one. We are interested now in the cases for which $\varepsilon^{(1)}_k = 1$. As in (2.14), we obtain
$$|y_{n,i}| \le |y_{0,i}|\,|\lambda_{l+1}+1|^n + \sum_{s=1}^{n} \bigl[|y_{s,i-1}| + |W_{s,i}|\bigr]\,|\lambda_{l+1}+1|^{n-s}. \tag{2.16}$$
For $i = p+\sum_{t=2}^{l} m_t+2, \dots, p+\sum_{t=2}^{l+1} m_t$, the first term on the right-hand side of (2.16) converges to 0 as $n \to \infty$. Since $\sum_{n=0}^{\infty} |\lambda_{l+1}+1|^n < \infty$ (see (1.8)), by Theorem 1.6 (see (1.8), (2.13) and (2.15)), the second term from (2.16) converges to 0 as $n \to \infty$ for $i = p+\sum_{t=2}^{l} m_t+2$. Hence
$$\lim_{n\to\infty} y_{n,\,p+m_2+\cdots+m_l+2} = 0.$$
Similarly,
$$\lim_{n\to\infty} y_{n,i} = 0, \quad i = p+\sum_{t=2}^{l} m_t+3, \dots, p+\sum_{t=2}^{l+1} m_t.$$
We can thus conclude that
$$\lim_{n\to\infty} y_{n,i} = 0, \quad i = p+1, \dots, r, \tag{2.17}$$
regardless of the initial data $y_0$.
Setting
$$R_{n,i} = \sum_{j=p+1}^{r} y_{n,j}(\widetilde{V}_{n+1})_{j,i}, \quad i = 1, \dots, p, \ n \ge 0,$$
equation (2.9) for the first $p$ components becomes
$$y_{n,i} - y_{n-1,i} = y_{n-1,1}(\widetilde{V}_n)_{1,i} + y_{n-1,2}(\widetilde{V}_n)_{2,i} + \cdots + y_{n-1,p}(\widetilde{V}_n)_{p,i} + R_{n-1,i}, \quad i = 1, \dots, p. \tag{2.18}$$
Let $m \ge 0$ and $i \in S$ be arbitrarily fixed. Let
$$x_n = x^{(i)}_n, \quad n > m,$$
where the $x^{(i)}_n$ are defined as in (2.3). It follows, using (2.17) and (2.12), that
$$\lim_{n\to\infty} x_{n,k} = 0, \quad k = \sum_{t=0}^{p} r_t, \dots, r, \tag{2.19}$$
which is the first conclusion of the theorem. In matrix notation, system (2.18) becomes
$$\begin{pmatrix} y_{n,1}\\ y_{n,2}\\ \vdots\\ y_{n,p} \end{pmatrix}' = \begin{pmatrix} y_{n-1,1}\\ y_{n-1,2}\\ \vdots\\ y_{n-1,p} \end{pmatrix}' C_n + \begin{pmatrix} R_{n-1,1}\\ R_{n-1,2}\\ \vdots\\ R_{n-1,p} \end{pmatrix}',$$
where
$$C_n = \begin{pmatrix}
1+(\widetilde{V}_n)_{1,1} & (\widetilde{V}_n)_{1,2} & \cdots & (\widetilde{V}_n)_{1,p}\\
(\widetilde{V}_n)_{2,1} & 1+(\widetilde{V}_n)_{2,2} & \cdots & (\widetilde{V}_n)_{2,p}\\
\vdots & \vdots & \ddots & \vdots\\
(\widetilde{V}_n)_{p,1} & (\widetilde{V}_n)_{p,2} & \cdots & 1+(\widetilde{V}_n)_{p,p}
\end{pmatrix}.$$
Setting
$$Y(n) = \begin{pmatrix} y_{n,1}\\ \vdots\\ y_{n,p} \end{pmatrix} \quad \text{and} \quad R(n) = \begin{pmatrix} R_{n,1}\\ \vdots\\ R_{n,p} \end{pmatrix}, \quad n > m,$$
the matrix equation above becomes
$$(Y(n))' = (Y(n-1))'C_n + (R(n-1))', \quad n > m.$$
The solution of this matrix equation is
$$(Y(n))' = (Y(m))'\prod_{s=m+1}^{n} C_s + \Bigl[(R(n-1))' + \sum_{s=m}^{n-2} (R(s))'\prod_{t=s+2}^{n} C_t\Bigr], \quad n > m. \tag{2.20}$$
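As a sanity check, the closed form (2.20) can be verified against direct iteration of the recursion on random data; the sketch below uses small, arbitrarily chosen $C_s$ and $R(s)$ (purely illustrative, not tied to any particular chain).

```python
import numpy as np

# Verify (2.20) numerically: (Y(n))' = (Y(n-1))' C_n + (R(n-1))'.
rng = np.random.default_rng(0)
p, m, n = 2, 0, 6
C = {s: np.eye(p) + 0.1 * rng.standard_normal((p, p)) for s in range(m + 1, n + 1)}
R = {s: rng.standard_normal(p) for s in range(m, n)}
Y = {m: rng.standard_normal(p)}

for k in range(m + 1, n + 1):                 # direct iteration of the recursion
    Y[k] = Y[k - 1] @ C[k] + R[k - 1]

def prod(lo, hi):                             # C_lo C_{lo+1} ... C_hi (identity if empty)
    out = np.eye(p)
    for t in range(lo, hi + 1):
        out = out @ C[t]
    return out

closed = (Y[m] @ prod(m + 1, n) + R[n - 1]
          + sum(R[s] @ prod(s + 2, n) for s in range(m, n - 1)))
print(np.allclose(Y[n], closed))              # True: (2.20) matches the iteration
```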
Setting
$$C_n = I_p + B_n, \quad n > m,$$
we have
$$|||B_n|||_\infty = \max_{k=1,\dots,p} \sum_{j=1}^{p} |(\widetilde{V}_n)_{k,j}|, \quad n > m.$$
The convergence of $\prod_{s=m+1}^{n} C_s$ from (2.20) as $n \to \infty$ follows from Theorem 1.7 (applied with $U_n = I_p$ and $A_n = B_n$, since $\sum_n |||B_n|||_\infty \le \sum_n |||(Q^{-1}V_nQ)_{S\times M}|||_\infty < \infty$ by (2.1)). We shall now prove that the second term from (2.20) converges as $n \to \infty$, too. It follows from (2.1) that
$$\exists\, n_0 > m+2 \ \text{such that} \ |||I_p - C_n|||_\infty = |||B_n|||_\infty < 1, \quad n \ge n_0.$$
So, by Theorem 1.8 we have
$$\exists (C_n)^{-1}, \ \forall n \ge n_0, \quad \text{and} \quad (C_n)^{-1} = \sum_{t=0}^{\infty} [I_p - C_n]^t = \sum_{t=0}^{\infty} [-B_n]^t, \quad \forall n \ge n_0,$$
and we can thus write
$$\sum_{s=m}^{n-2} (R(s))'\prod_{t=s+2}^{n} C_t + (R(n-1))' = \sum_{s=m}^{n_0-2} (R(s))'\prod_{t=s+2}^{n} C_t + \sum_{s=n_0-1}^{n-1} (R(s))'\prod_{l=n_0}^{s+1} (C_{s+n_0+1-l})^{-1}\prod_{t=n_0}^{n} C_t, \quad \forall n \ge n_0.$$
To prove that the expression above converges, it suffices to show that
$$T(n) := \sum_{s=n_0-1}^{n} (R(s))'\prod_{l=n_0}^{s+1} (C_{s+n_0+1-l})^{-1}, \quad n \ge n_0,$$
converges. From the boundedness of $(y_n)_n$ and condition (2.1) we have
$$\sum_{n=1}^{\infty} |||R_n|||_\infty < \infty. \tag{2.21}$$
Let $u \in \mathbb{N}$ and $n \ge n_0$. We have
$$|||T(n+u) - T(n)|||_\infty = \Bigl|\Bigl|\Bigl|(R(n+1))'\prod_{l=n_0}^{n+2} (C_{n+1+n_0+1-l})^{-1} + \cdots + (R(n+u))'\prod_{l=n_0}^{n+u+1} (C_{n+u+n_0+1-l})^{-1}\Bigr|\Bigr|\Bigr|_\infty$$
$$\le |||R(n+1)|||_\infty \prod_{l=n_0}^{n+2} |||(C_l)^{-1}|||_\infty + \cdots + |||R(n+u)|||_\infty \prod_{l=n_0}^{n+u+1} |||(C_l)^{-1}|||_\infty,$$
an expression which, by Theorem 1.9, is smaller than
$$\frac{|||R(n+1)|||_\infty}{\prod_{l=n_0}^{n+2} \bigl[1 - |||B_l|||_\infty\bigr]} + \cdots + \frac{|||R(n+u)|||_\infty}{\prod_{l=n_0}^{n+u+1} \bigl[1 - |||B_l|||_\infty\bigr]} \le \bigl(|||R(n+1)|||_\infty + \cdots + |||R(n+u)|||_\infty\bigr)\frac{1}{c_{n_0}},$$
where $c_{n_0} := \prod_{l=n_0}^{\infty} \bigl(1 - |||B_l|||_\infty\bigr) \in (0,1]$. The inequalities above and (2.21) imply that $(T(n))_n$ is a Cauchy sequence; therefore it is convergent. This implies that $\lim_{n\to\infty} y_{n,t}$ exists, $\forall t = 1, \dots, p$, and, using also (2.10), (2.11) and (2.17), that $\lim_{n\to\infty} x_{n,k}$, $k = 1, \dots, \sum_{t=1}^{p} r_t$, exists. Since $i \in S$ and $m \ge 0$ were arbitrarily chosen, it follows, using also (2.19), that $\lim_{n\to\infty} P_{m,n}$ exists for all $m \ge 0$, i.e., the chain $(P_n)_{n\ge 1}$ is convergent.
Example 2.5. Consider the chain $(P_n)_{n\ge 1}$ given by
$$P_n = \begin{pmatrix}
\frac{1}{2} - \frac{1}{n} & \frac{1}{2} + \frac{1}{n+1} & \frac{1}{n(n+1)}\\[2pt]
1 - \frac{1}{n+1} & \frac{1}{n+1} & 0\\[2pt]
0 & 0 & 1
\end{pmatrix}, \quad \forall n \ge 2,$$
and $P_1 = I_3$. The chain $(P_n)_{n\ge 1}$ is convergent because the conditions of Theorem 2.4 are fulfilled: $P_n \to P$, where $P$ has the two recurrent aperiodic classes $\{1,2\}$ and $\{3\}$ and no transient states, and, since $V_n$ applied to the first two columns of $Q$ (the vectors $(1,1,0)'$ and $(0,0,1)'$) has entries of order $1/n^2$, the series in (2.1) converges.
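A numerical check of this example is sketched below; the matrices $Q$ and $Q^{-1}$ used here were worked out by hand for this particular chain following Theorem 1.5 (with $p = 2$, $r_1 = 2$, $r_2 = 1$ and no transient states), and the code confirms that the partial sums in (2.1) stay bounded and that the products $P_{0,n}$ converge.

```python
import numpy as np

P = np.array([[0.5, 0.5, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])          # limit matrix of Example 2.5

def P_n(n):
    if n == 1:
        return np.eye(3)
    return np.array([
        [0.5 - 1/n,     0.5 + 1/(n + 1), 1/(n*(n + 1))],
        [1 - 1/(n + 1), 1/(n + 1),       0.0          ],
        [0.0,           0.0,             1.0          ],
    ])

# Q as in Theorem 1.5 for this chain: columns (1,1,0)', (0,0,1)' and an
# eigenvector of A = P - I for its nonzero eigenvalue.
Q = np.array([[1.0, 0.0,  1.0],
              [1.0, 0.0, -2.0],
              [0.0, 1.0,  0.0]])
Q_inv = np.linalg.inv(Q)

s, prod = 0.0, np.eye(3)
for n in range(1, 5001):
    W = (Q_inv @ (P_n(n) - P) @ Q)[:, :2]       # (Q^{-1} V_n Q)_{S x M}
    s += np.max(np.sum(np.abs(W), axis=1))      # |||.|||_inf
    prod = prod @ P_n(n)

print("partial sum in (2.1):", s)               # stays bounded (entries are O(1/n^2))
print("P_{0,5000} ~\n", np.round(prod, 4))      # the products settle down
```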
Acknowledgements. The author wishes to thank Dr. Udrea Păun for the interest he showed in this paper.
REFERENCES

[1] B. Gidas, Nonstationary Markov chains and convergence of the annealing algorithm. J. Statist. Phys. 39 (1985), 73–131.
[2] J. Hajnal, Weak ergodicity in nonhomogeneous Markov chains. Proc. Cambridge Philos. Soc. 54 (1958), 233–246.
[3] D.J. Hartfiel, Nonhomogeneous Matrix Products. World Scientific, River Edge, NJ, 2002.
[4] R.A. Horn and C.R. Johnson, Matrix Analysis. Cambridge Univ. Press, New York, 1985.
[5] M. Iosifescu, Finite Markov Processes and Their Applications. Wiley, Chichester and Editura Tehnică, București, 1980; corrected republication by Dover, Mineola, N.Y., 2007.
[6] D.L. Isaacson and R.W. Madsen, Markov Chains: Theory and Applications. Wiley, New York, 1976.
[7] S. Karlin and H.M. Taylor, A First Course in Stochastic Processes. Academic Press, New York, 1975.
[8] P.J.M. van Laarhoven and E.H.L. Aarts, Simulated Annealing: Theory and Applications. Reidel, Dordrecht, 1987.
[9] A. Mukherjea and R. Chaudhuri, Convergence of non-homogeneous stochastic chains, II. Math. Proc. Cambridge Philos. Soc. 90 (1981), 167–182.
Received 16 March 2007

“Transilvania” University of Brașov
Department of Mathematical Analysis and Probability
Str. Iuliu Maniu 50
500157 Brașov, Romania
alinanicolae@unitbv.ro