Quasi-stationary distributions and resilience: what to get from a sample?


Tome 7 (2020), p. 943-980.

<http://jep.centre-mersenne.org/item/JEP_2020__7__943_0>

Some rights reserved.

This article is made available under the terms of the Creative Commons Attribution 4.0 International licence (CC BY 4.0).

https://creativecommons.org/licenses/by/4.0/

Access to the articles of the journal « Journal de l’École polytechnique — Mathématiques » (http://jep.centre-mersenne.org/) implies agreement with its general terms of use (http://jep.centre-mersenne.org/legal/).

Published with the support of the Centre National de la Recherche Scientifique.

Member publication of the Centre Mersenne pour l’édition scientifique ouverte.


by Jean-René Chazottes, Pierre Collet, Servet Martínez & Sylvie Méléard

Abstract. — We study a class of multi-species birth-and-death processes going almost surely to extinction and admitting a unique quasi-stationary distribution (qsd for short). When rescaled by $K$ and in the limit $K \to +\infty$, the realizations of such processes get close, in any fixed finite-time window, to the trajectories of a dynamical system whose vector field is defined by the birth and death rates. Assuming that this dynamical system has a unique attracting fixed point, we previously analyzed the behavior of these processes for finite $K$ and finite times, “interpolating” between the two limiting regimes just mentioned. In the present work, we are mainly interested in the following question: observing a realization of the process, can we determine the so-called engineering resilience? To answer this question, we establish two relations which intermingle the resilience, which is a macroscopic quantity defined for the dynamical system, and the fluctuations of the process, which are microscopic quantities. Analogous relations are well known in nonequilibrium statistical mechanics. To exploit these relations, we need to introduce several estimators which we control for times between $\log K$ (the time scale to converge to the qsd) and $\exp(K)$ (the time scale of the mean time to extinction).

Résumé (Quasi-stationary distributions and resilience: what can be obtained from data?). — We study a class of birth-and-death processes with several species in the situation where extinction is certain and the quasi-stationary distribution is unique. If one fixes a finite time interval and normalizes the realizations of such a process by a scaling parameter $K$, they become arbitrarily close, in the limit $K \to +\infty$, to the trajectories of a certain dynamical system whose vector field is defined from the birth and death rates. When the dynamical system admits a single attracting fixed point, we previously analyzed the behavior of the process for finite values of $K$ and finite times, that is, the intermediate behavior between the two limiting behaviors mentioned above. The main question of interest here is the following: if one observes a realization of the process, can one estimate the engineering resilience? To answer this question, we prove two relations intertwining the resilience, which is a macroscopic quantity defined for the underlying dynamical system, and the fluctuations of the process, which are microscopic quantities. Relations of this kind are well known in nonequilibrium statistical mechanics. In order to exploit these relations, we introduce several empirical estimators that we manage to control for times between $\log K$, which is the time scale for observing convergence to the quasi-stationary distribution, and $\exp(K)$, which is the scale of the mean extinction time.

2020 Mathematics Subject Classification. — 60J28, 92D25.

Keywords. — Birth-and-death process, dynamical system, engineering resilience, quasi-stationary distribution, fluctuation-dissipation relation, empirical estimators.

S. M. has been supported by the Chair “Modélisation Mathématique et Biodiversité” of Veolia Environnement-École Polytechnique-Museum national d’Histoire naturelle-Fondation X. P. C. and S. M. warmly thank the Basal Conicyt CMM AFD170001 project. J.-R. C. and P. C. also acknowledge the hospitality of the Instituto de Física de San Luis Potosí.

Contents

1. Introduction and main results

2. Time evolution of moments of the process and moments of the QSD

3. Controlling time averages of the estimators

4. Fluctuation and correlation relations

5. Variance estimates for the estimators

Appendix A. Proof of the two variance estimates

Appendix B. Counting the number of births

Appendix C. Gaussian limit for the rescaled qsd

References

1. Introduction and main results

1.1. Context and setting. — The ability of an ecosystem to return to its reference state after a perturbation is measured by its resilience, a concept pioneered by Holling. Resilience has several faces and multiple definitions [5]. In the traditional theoretical setting of dynamical systems, that is, differential equations, one of them is the so-called engineering resilience. It is concerned with what happens in the vicinity of a fixed point (equilibrium state) of the system, and is given by minus the real part of the dominant eigenvalue of the Jacobian matrix evaluated at the fixed point. It can also be defined as the reciprocal of the characteristic return time to the fixed point after a (small) perturbation. In this paper, we are interested in how to determine the engineering resilience from the data. But which data? The drawback of the notion of engineering resilience is that we do not observe population densities governed by differential equations. Instead, we count individuals, and these counts are subject to stochastic fluctuations. Can we nevertheless infer the resilience? The subject of this paper is to show that this is possible in the framework of birth-and-death processes which are, in a sense made precise below, close to the solutions of a corresponding differential equation, at certain time and population size scales.

Let us now describe our framework. We consider a population made of $d$ species interacting with one another. Suppose that the state of the process at time $t$, which we denote by $N^K(t) = (N_1^K(t), \dots, N_d^K(t))$, is $n = (n_1, \dots, n_i, \dots, n_d) \in \mathbb{Z}_+^d$, where $n_i$ is the number of individuals of the $i$th species. Then the rate at which the population increases (respectively decreases) by one individual of the $j$th species is $K B_j(n/K)$ (respectively $K D_j(n/K)$), where $K$ is a scaling parameter. Under the assumptions we will make, the process goes extinct with probability one, i.e., $0$ is an absorbing state which is reached almost surely. There are two limiting regimes for the behavior of this process. The first one is to fix $K$ and let $t$ tend to infinity, which leads inevitably to extinction. The second one consists in fixing a time horizon and letting $K$ tend to $+\infty$, after having rescaled the process by $K$. In this limit, the behavior of the rescaled process is governed by a certain differential equation. More precisely, given any $0 < t_H < +\infty$, any $\varepsilon > 0$ and $x_0 \in \mathbb{R}_+^d$, we have

\[
\lim_{K \to +\infty} \mathbb{P}_{\lfloor K x_0 \rfloor}\Big( \sup_{0 \le t \le t_H} \operatorname{dist}\big(N^K(t)/K,\, x(t)\big) > \varepsilon \Big) = 0,
\]

where $\operatorname{dist}(\cdot,\cdot)$ is the Euclidean distance in $\mathbb{R}_+^d$, and $x(t)$ is the solution of the differential equation in $\mathbb{R}_+^d$

\[
\frac{dx}{dt} = B(x) - D(x)
\]

with initial condition $x_0$. We refer to [4, Chap. 11] for a proof. We use the notations $x = (x_1, \dots, x_d)$, $B(x) = (B_1(x), \dots, B_d(x))$, and so on. We will make further assumptions (see Subsection 1.4) on the birth and death rates in order to be in the following situation. The vector field

\[
X = B - D
\]

has a unique attracting fixed point $x^*$ (lying in the interior of $\mathbb{R}_+^d$). We denote by $M^*$ its differential evaluated at $x^*$, namely

\[
M^* = DX(x^*).
\]

We then define the (engineering) resilience as

\[
\rho^* = -\sup\{\operatorname{Re}(z) : z \in \operatorname{Sp}(M^*)\},
\]

where $\operatorname{Sp}(M^*)$ denotes the spectrum (set of eigenvalues) of the matrix $M^*$. Under our assumptions, we have $\rho^* > 0$.
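To make the definition of $\rho^*$ concrete, here is a minimal Python sketch for the case $d = 2$, where the eigenvalues follow from the trace and the determinant. The helper names (`eigenvalues_2x2`, `resilience`) and the matrix `M_star` are hypothetical illustrations and do not come from the paper.

```python
import cmath

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4.0 * det)  # may be purely imaginary
    return ((trace + disc) / 2.0, (trace - disc) / 2.0)

def resilience(m):
    """rho* = -sup{Re(z) : z in Sp(m)} for a 2x2 matrix m."""
    return -max(z.real for z in eigenvalues_2x2(m))

# Hypothetical Jacobian (not from the paper) with eigenvalues -1 +/- 2i:
# the fixed point is a stable focus and only the real part matters.
M_star = [[-1.0, 2.0], [-2.0, -1.0]]
rho_star = resilience(M_star)
print(rho_star)
```

Only the real parts of the eigenvalues enter, so oscillatory returns to the fixed point (complex eigenvalues) and monotone ones are handled in the same way.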

We can now formulate more precisely the goal of this paper. Given a finite-length realization of the process $(N^K(t),\, t \le T)$, with large but finite $K$, we want to build an estimator for $\rho^*$. To this end, we need a good understanding of the behavior of the Markov process $(N^K(t))$ in an intermediate regime between the two limiting regimes described above. This was done in a previous work of ours [3], and it can be roughly summarized as follows. All states $n \neq 0$ are transient and $0$ is absorbing, hence the only stationary distribution is the Dirac measure sitting at $0$. The mean time to extinction behaves like $\exp(\Theta(K))$. (We recall Bachmann-Landau notations below.) If we start in the vicinity of the state $n^* = \lfloor K x^* \rfloor$, that is, if the initial state has its coordinates of size of order $K$, then either the process wanders around $n^*$ or it gets absorbed at $0$. More precisely, there is a unique quasi-stationary distribution (qsd, for short) $\nu_K$ which describes the statistics of the process conditioned to be non-extinct before time $t$. Without this conditioning, the law of the process at time $t$ is well approximated by a mixture of the Dirac measure at $0$ and the qsd $\nu_K$, for times $t \in [cK\log K, \exp(\Theta(K))]$, where $c > 0$, in the sense that the total variation distance between them is exponentially small in $K$, provided that $K$ is large enough.

We will rely on these results that will be recalled precisely later on. We will also need to prove further properties.
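The process described above can be simulated jump by jump. The following is a minimal sketch for one species ($d = 1$) using the standard Gillespie scheme, which is a choice made here for illustration and is not prescribed by the paper. The rates `birth` and `death`, the function name and all parameter values are hypothetical; they give $X(x) = x(1 - x)$, hence $x^* = 1$.

```python
import random

def simulate_birth_death(n0, K, T, birth, death, seed=0):
    """Simulate a one-species birth-and-death process up to time T.

    Jumps +1 occur at rate K*birth(n/K) and jumps -1 at rate K*death(n/K).
    Returns the jump times and the state entered at each of these times.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, states = [0.0], [n0]
    while n > 0:  # 0 is absorbing
        x = n / K
        b_rate, d_rate = K * birth(x), K * death(x)
        total = b_rate + d_rate
        t += rng.expovariate(total)  # exponential waiting time
        if t >= T:
            break
        n += 1 if rng.random() * total < b_rate else -1
        times.append(t)
        states.append(n)
    return times, states

# Hypothetical logistic-type rates (assumption, not from the paper).
def birth(x): return 2.0 * x
def death(x): return x * (1.0 + x)

K = 500
times, states = simulate_birth_death(n0=K, K=K, T=20.0, birth=birth, death=death)
print(len(times), states[-1] / K)
```

Started at $n = K$, the rescaled path $N^K(t)/K$ is expected to fluctuate around $x^* = 1$ on this time window, in line with the intermediate regime summarized above.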

1.2. Main results. — To estimate the engineering resilience $\rho^*$, we will establish a matrix relation involving $M^*$. Let $\mu^K = (\mu^K_1, \dots, \mu^K_d)$ be the vector of species sizes averaged with respect to $\nu_K$, that is,

\[
\mu^K_p = \int n_p \, d\nu_K(n), \qquad p = 1, \dots, d. \tag{1.1}
\]

For each $\tau \ge 0$, define the matrix

\[
\Sigma^K_{p,q}(\tau) = \mathbb{E}_{\nu_K}\Big[ \big(N_p^K(\tau) - \mu^K_p\big)\big(N_q^K(0) - \mu^K_q\big) \Big], \qquad p, q \in \{1, \dots, d\}.
\]

In Section 4.1, we will prove the following result.

Theorem 1.1. — For all $\tau \ge 0$ we have

\[
\Sigma^K(\tau) = e^{\tau M^*} \Sigma^K(0) + O\big(\sqrt{K}\big). \tag{1.2}
\]

Some comments are in order. If $\tau$ is equal to, say, $1/K$, then the estimate becomes useless. More generally, if $\tau$ is too small then $e^{\tau M^*}$ is too close to the identity matrix.

Moreover, we will show later on that $\mu^K$ and $\Sigma^K(\tau)$ are of order $K$. Hence the estimate becomes irrelevant if $\tau$ becomes proportional to $\log K$. Indeed, without knowing the constant of proportionality, $e^{\tau M^*} \Sigma^K(0)$ can be of the same order as the error term.

Before proceeding further, we recall the following classical Bachmann-Landau notations.

Notations. — Given $a \in \mathbb{R}$, the symbol $O(K^a)$ stands for any real-valued function $f(K)$ such that there exist $C > 0$ and $K_0 > 0$ such that, for any $K > K_0$, $|f(K)| \le C K^a$. Note in particular that $O(1)$ will always mean a strictly positive constant that we do not want to specify. Sometimes, we will also use the symbol $\Theta(K^a)$, which stands for any real-valued function $f(K)$ such that there exist $C_1, C_2 > 0$ and $K_0 > 0$ such that, for any $K > K_0$, $C_1 K^a \le f(K) \le C_2 K^a$. One can naturally generalize $\Theta(K^a)$ to vector-valued functions. For instance, for $n \in \mathbb{R}_+^d$ we write $n = \Theta(K^a)$ if $n_i = \Theta(K^a)$ for $i = 1, \dots, d$.

Relation (1.2) allows us to determine $M^*$. Indeed, we have

\[
e^{\tau M^*} = \Sigma^K(\tau)\, \Sigma^K(0)^{-1} + O\big(1/\sqrt{K}\big). \tag{1.3}
\]

This formula suggests that in order to estimate $M^*$, we need estimators for $\Sigma^K(0)$ and $\Sigma^K(\tau)$. Given a finite-length realization $(N^K(t),\, 0 \le t \le T)$ up to some time $T > 0$, we define estimators for $\mu^K_p$ and $\Sigma^K_{p,q}(\tau)$, for $0 \le \tau < T$, $p, q \in \{1, \dots, d\}$, $K \in \mathbb{N}^*$, by

\[
S^{\mu}_p(T, K) = \frac{1}{T} \int_0^T N_p^K(s) \, ds \tag{1.4}
\]

and

\[
S^{C}_{p,q}(T, \tau, K) = \frac{1}{T - \tau} \int_0^{T - \tau} \big(N_p^K(s + \tau) - S^{\mu}_p(T, K)\big)\big(N_q^K(s) - S^{\mu}_q(T, K)\big) \, ds. \tag{1.5}
\]
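The estimators (1.4) and (1.5) can be sketched in code as follows. This is a discrete-time approximation chosen here for simplicity: the path is sampled on a uniform grid of step `h` (an assumption) and the integrals are replaced by averages. All function names are hypothetical.

```python
def sample_on_grid(times, states, T, h):
    """Values of a piecewise-constant path at 0, h, 2h, ... (strictly below T).

    times[i] is the time at which the path enters states[i]; times[0] == 0.
    """
    samples, i = [], 0
    for k in range(int(T / h)):
        s = k * h
        while i + 1 < len(times) and times[i + 1] <= s:
            i += 1
        samples.append(states[i])
    return samples

def mean_estimator(samples):
    """Discrete analogue of S^mu_p(T, K) in (1.4)."""
    return sum(samples) / len(samples)

def covariance_estimator(samples_p, samples_q, lag):
    """Discrete analogue of S^C_{p,q}(T, tau, K) in (1.5), with tau = lag * h."""
    mu_p, mu_q = mean_estimator(samples_p), mean_estimator(samples_q)
    m = len(samples_p) - lag
    return sum((samples_p[k + lag] - mu_p) * (samples_q[k] - mu_q)
               for k in range(m)) / m

# Tiny hand-made path (one species): 2 on [0,1), 4 on [1,3), 2 on [3,4].
samples = sample_on_grid([0.0, 1.0, 3.0], [2, 4, 2], T=4.0, h=1.0)
mu = mean_estimator(samples)
c0 = covariance_estimator(samples, samples, lag=0)
c1 = covariance_estimator(samples, samples, lag=1)
print(samples, mu, c0, c1)
```

As in (1.5), the centering uses the mean over the whole window while the lagged sum runs only over the first `len(samples) - lag` grid points.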

Under suitable conditions on $n$, $K$ and $T$, $S^{\mu}(T, K)$ approximates $\mu^K$ well. More precisely, we will prove an estimate of the following form (see Theorem 3.4 for a precise statement):

\[
\Big| \mathbb{E}_n\big[ S^{\mu}_p(T, K) \big] - \mu^K_p \Big| \le C \big(K + \|n\|_1\big) \Big( \frac{1 + \log K}{T} + e^{-c(\|n\|_1 \wedge K)} + T e^{-c' K} \Big) \tag{1.6}
\]

for every $n \in \mathbb{Z}_+^d$, $p = 1, \dots, d$, where $c$, $c'$, $C$ are positive constants. We use the notation $\|n\|_1 = \sum_{i=1}^d n_i$. Let us comment on this bound. Roughly speaking, when $n$ is, say, of order $K$, it can only be useful if $T$ is much smaller than $\exp(O(1)K)$. For instance, suppose that, for $K$ large enough, we want the bound to be $\Theta(K^{-a})$, for some $a > 0$. One can check that this is possible if $n = \Theta(K)$ and $T = \Theta(K^{a+1} \log K)$. (Note in particular that, in this situation, we have a consistent estimator when $K \to +\infty$.) However, when $T$ becomes $\exp(O(1)K)$ or larger, we know that $\mathbb{E}_n\big[S^{\mu}_p(T, K)\big] \approx 0$, because with high probability, at this time scale the process is absorbed at $0$. This is the manifestation of the fact that the only stationary distribution is the Dirac measure at $0$. Consistently, our bound becomes very bad in that regime.

An estimate of the same kind holds for $S^{C}(T, \tau, K)$, which approximates $\Sigma^K(\tau)$ well in the appropriate ranges.

Remark 1.1. — It is possible to use discrete time instead of continuous time in the above averages. Indeed, the key results (in particular Proposition 3.3) are obtained for discrete times.

We can now define the empirical matrix $M^*_{\mathrm{emp}}(T, \tau, K)$ by

\[
e^{\tau M^*_{\mathrm{emp}}(T, \tau, K)} = S^{C}(T, \tau, K)\, S^{C}(T, 0, K)^{-1}.
\]

We will see later on that, in appropriate regimes, $S^{C}(T, 0, K)$ is near $\Sigma^K(0)$ and $S^{C}(T, \tau, K)$ is near $\Sigma^K(\tau)$ (see Propositions 5.4 and 5.6). The matrix $\Sigma^K(0)$ is invertible as a covariance matrix of a non-constant vector and is $\Theta(K)$ (see Proposition 2.9). Then (1.2) implies that $\Sigma^K(\tau)$ is invertible and the same holds for $S^{C}(T, \tau, K)$. These remarks imply that the matrix $M^*_{\mathrm{emp}}$ is well defined.

We define the empirical resilience by

\[
\rho^*_{\mathrm{emp}}(T, \tau, K) = -\sup\big\{ \operatorname{Re}(z) : z \in \operatorname{Sp}\big(M^*_{\mathrm{emp}}(T, \tau, K)\big) \big\}.
\]

Our main result (Theorem 5.7) is then the following.

Theorem. — For $\tau = \Theta(1)$, $n = \Theta(K)$ (initial state) and $0 < T \ll \exp(\Theta(1)K)$, and $K$ large enough, we have

\[
\big| \rho^*_{\mathrm{emp}}(T, \tau, K) - \rho^* \big| \le O(1) \Big( \frac{K^2}{\sqrt{T}} + \frac{1}{\sqrt{K}} \Big)
\]

with a probability larger than $1 - 1/K$. In particular, if $T \ge K^5$, we have

\[
\big| \rho^*_{\mathrm{emp}}(T, \tau, K) - \rho^* \big| \le O(1)/\sqrt{K}.
\]

Several comments are in order. The dependence on the initial state $n$ is somewhat hidden and involved in the fact that the estimates hold “with a probability larger than $1 - 1/K$”. Indeed, the estimate of this probability results from Chebyshev's inequality and variance estimates in which the process is started in $n$. What the symbol $\ll$ precisely means is not mathematically defined. It means that we need to consider $T$ “much smaller than something exponentially big in $K$”. Indeed, since we do not control explicitly the various constants appearing in exponential terms in $K$, we have to consider $T$ which varies on a scale smaller than $\exp(\Theta(1)K)$, for instance $\exp\big(\Theta(1)\sqrt{K}\big)$. The reader is invited to step through the proof of Theorem 5.7 for the more precise, but cumbersome, bound we obtain.
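The empirical resilience can be computed without forming a matrix logarithm, under the assumption that $M^*_{\mathrm{emp}}$ is obtained as a primary logarithm of $A = S^{C}(T, \tau, K)\, S^{C}(T, 0, K)^{-1}$: the eigenvalues of $\tau M^*_{\mathrm{emp}}$ are then logarithms of those of $A$, and $\operatorname{Re}(\log \lambda) = \log|\lambda|$. The sketch below does this for $d = 2$; the function name and the input matrices are hypothetical.

```python
import cmath
import math

def empirical_resilience_2x2(s_tau, s_0, tau):
    """rho_emp from 2x2 estimates S^C(T, tau, K) and S^C(T, 0, K)."""
    (a, b), (c, d) = s_0
    det0 = a * d - b * c
    inv0 = [[d / det0, -b / det0], [-c / det0, a / det0]]
    # A = S^C(tau) S^C(0)^{-1} plays the role of exp(tau * M_emp).
    A = [[sum(s_tau[i][k] * inv0[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    trace = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    radius = max(abs((trace + disc) / 2.0), abs((trace - disc) / 2.0))
    # sup Re Sp(M_emp) = log(spectral radius of A) / tau.
    return -math.log(radius) / tau

# Synthetic input built from M = diag(-1, -2), tau = 1 and S^C(0) = 400 * Id.
s_0 = [[400.0, 0.0], [0.0, 400.0]]
s_tau = [[400.0 * math.exp(-1.0), 0.0], [0.0, 400.0 * math.exp(-2.0)]]
rho_emp = empirical_resilience_2x2(s_tau, s_0, tau=1.0)
print(rho_emp)
```

With these synthetic inputs the slowest decaying mode has rate 1, which is the value the function should recover.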

1.3. A “fluctuation-dissipation” approach. — The above estimator for the engineering resilience, based on (1.3), is valid for any $d$. In the case $d = 1$ (only one species), we have another, simpler, estimator based on a “fluctuation-dissipation relation”. This relation is in fact true for any $d$ and is of independent interest. Let $D^K$ be the $d \times d$ diagonal matrix given by

\[
D^K_{p,p} = K B_p(x^*) = K D_p(x^*).
\]

We have the following result. We write $\Sigma^K$ instead of $\Sigma^K(0)$, and the transpose of a matrix $M$ is denoted by $M^{\top}$.

Theorem 1.2. — We have

\[
M^* \Sigma^K + \Sigma^K M^{*\top} + 2 D^K = O\big(\sqrt{K}\big). \tag{1.7}
\]

This relation is proved in Section 4.2. For background on fluctuation-dissipation relations in Statistical Physics, we refer to [7, §§2-3]. Note that the matrix $\Sigma^K$ is symmetric, but in general the matrix $M^*$ is not (see [3]). Note also that each term in the left-hand side of (1.7) is of order $K$, as we will see below.

If $\Sigma^K$ and $D^K$ are known, the matrix $M^*$ is not uniquely defined, except for $d = 1$ (see for example [8]). For $d = 1$, (1.7) easily gives the resilience since it becomes a scalar equation:

\[
\rho^* = \frac{K\big(B(x^*) + D(x^*)\big)}{2 \Sigma^K} + O\big(1/\sqrt{K}\big).
\]

Remark 1.2. — The quantity $K(B(x^*) + D(x^*)) = 2KB(x^*)$ is the average total jump rate $K \nu_K\big(B(n/K) + D(n/K)\big)$ up to $O(1)$ terms. This follows from a Taylor expansion of $B(n/K) + D(n/K)$ around $x^*$, Theorem 2.6 and Proposition 2.7.

In the case $d = 1$, an estimator for $D^K$ is

\[
S^{D}(T, K) = \frac{1}{T}\, (\text{number of births up to time } T). \tag{1.8}
\]

In Section 5, we establish a bound for

\[
\Big| \mathbb{E}_n\big[ S^{D}(T, K) \big] - K B(x^*) \Big|
\]

which depends on $T$, $K$ and $\|n\|_1$, and is small in the relevant regimes. The estimator we use for $\Sigma^K$ is

\[
S^{\Sigma}(T, K) = \frac{1}{T} \int_0^T \big(N^K(s) - S^{\mu}(T, K)\big)^2 \, ds. \tag{1.9}
\]

Again, we can control how well this estimator approximates $\Sigma^K$. This provides another estimator for $\rho^*$, with a controlled error.
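For one species, the scalar relation reads $\rho^* \approx K B(x^*) / \Sigma^K$, which suggests plugging in the estimators (1.8) and (1.9). The sketch below assumes this ratio form $S^{D}/S^{\Sigma}$ as the combined estimator and integrates the piecewise-constant path exactly; the function name and the toy path are hypothetical.

```python
def fd_resilience_1d(times, states, T):
    """Fluctuation-dissipation estimate of rho* for one species.

    times[i] is the time at which the path enters states[i] (times[0] == 0);
    the path is constant between jumps and observed on [0, T].
    """
    ends = list(times[1:]) + [T]
    durations = [e - s for s, e in zip(times, ends)]
    # Time average of the path, as in (1.4).
    mean = sum(n * dt for n, dt in zip(states, durations)) / T
    # Time-averaged squared deviation, as in (1.9).
    var = sum((n - mean) ** 2 * dt for n, dt in zip(states, durations)) / T
    # Number of births per unit time, as in (1.8).
    births = sum(1 for a, b in zip(states, states[1:]) if b == a + 1)
    s_d = births / T
    return s_d / var

# Toy path: 2, 3, 2, 3, each held for one unit of time on [0, 4].
rho_fd = fd_resilience_1d([0.0, 1.0, 2.0, 3.0], [2, 3, 2, 3], T=4.0)
print(rho_fd)
```

On a real trajectory the same function can be fed with the jump times and states of a simulated or observed path; only births (jumps of $+1$) are counted in the numerator.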

1.4. Standing assumptions. — The two (regular) vector fields $B(x)$ and $D(x)$ are given in $\mathbb{R}_+^d$. We assume that their components have second partial derivatives which are polynomially bounded. Obviously, we suppose that $B_j(x) \ge 0$ and $D_j(x) \ge 0$ for all $j = 1, \dots, d$ and $x \in \mathbb{R}_+^d$. A dynamical system in $\mathbb{R}_+^d$ is defined by the vector field $X(x) = B(x) - D(x)$, namely

\[
\frac{dx}{dt} = B(x) - D(x) = X(x).
\]

For $x \in \mathbb{R}_+^d$, we use the following standard norms:

\[
\|x\|_1 = \sum_{j=1}^d x_j, \qquad \|x\|_2 = \sqrt{\sum_{j=1}^d x_j^2}.
\]

We now state our hypotheses.

(H.1) The vector fields $B$ and $D$ vanish only at $0$.

(H.2) There exists $x^*$ belonging to the interior of $\mathbb{R}_+^d$ (fixed point of $X$) such that $B(x^*) - D(x^*) = X(x^*) = 0$.

(H.3) Attracting fixed point: there exist $\beta > 0$ and $R > 0$ such that $\|x^*\|_2 < R$, and for all $x \in \mathbb{R}_+^d$ with $\|x\|_2 < R$,

\[
\langle X(x), (x - x^*) \rangle \le -\beta\, \|x\|_2\, \|x - x^*\|_2^2. \tag{1.10}
\]

(H.4) The fixed point $0$ of the vector field $X$ is repelling (locally unstable). Moreover, on the boundary of $\mathbb{R}_+^d$, the vector field $X$ points toward the interior (except at $0$).

(H.5) Define

\[
\widehat{B}(y) = \sup_{\|x\|_1 = y} \sum_{j=1}^d B_j(x), \qquad \widehat{D}(y) = \inf_{\|x\|_1 = y} \sum_{j=1}^d D_j(x),
\]

and for $y > 0$, let

\[
F(y) = \frac{\widehat{B}(y)}{\widehat{D}(y)}.
\]

We assume that there exists $0 < L < R$ such that

\[
\sup_{y \ge L} F(y) < 1/2 \quad \text{and} \quad \lim_{y \to +\infty} F(y) = 0.
\]

(H.6) There exists $y_0 > 0$ such that $\int_{y_0}^{\infty} \widehat{D}(y)^{-1} \, dy < +\infty$ and $y \mapsto \widehat{D}(y)$ is increasing on $[y_0, +\infty)$.

(H.7) There exists $\xi > 0$ such that

\[
\inf_{x \in \mathbb{R}_+^d}\ \inf_{1 \le j \le d}\ \frac{D_j(x)}{\sup_{1 \le \ell \le d} x_\ell} > \xi.
\]

(H.8) Finally, we assume that

\[
\inf_{1 \le j \le d} \partial_{x_j} B_j(0) > 0.
\]

(By $\partial_{x_j}$ we mean the partial derivative with respect to $x_j$.)

Assumptions (H.5) and (H.6) ensure that the time for “coming down from infinity” for the dynamical system is finite. Together with (H.3), this also implies that $x^*$ is a globally attracting stable fixed point. More comments on these assumptions can be found in [3].

1.5. A numerical example. — We consider the two-dimensional vector fields

\[
B(x_1, x_2) = \begin{pmatrix} a x_1 + b x_2 \\ e x_1 + f x_2 \end{pmatrix}
\quad \text{and} \quad
D(x_1, x_2) = \begin{pmatrix} x_1 (c x_1 + d x_2) \\ x_2 (g x_1 + h x_2) \end{pmatrix},
\]

where all the coefficients are positive. This is a model of competition between two species of Lotka-Volterra type. We have taken

\[
a = 0.4569,\ b = 0.2959,\ e = 0.5920,\ f = 0.6449,\ c = 0.9263,\ d = 0.9157,\ g = 0.9971,\ h = 0.2905.
\]

Assumptions (H.1) and (H.4) are easily verified numerically. Assumptions (H.5) and (H.6) are satisfied because $\widehat{B}(y) \le (a + b + e + f)\, y$ and $\widehat{D}(y) \ge (c \wedge h)\, y^2/4$. Concerning (H.2), we checked numerically that there is a unique fixed point inside the positive quadrant, namely $x^* = (0.3567, 1.4855)$. It remains to check (H.3), namely that

\[
-\beta = \sup\{R(x) : x \in \mathbb{R}_+^2\} < 0,
\quad \text{where} \quad
R(x) = \frac{\langle X(x), (x - x^*) \rangle}{\|x\|_2\, \|x - x^*\|_2^2}.
\]

We first checked that the numerator $N(x) = \langle X(x), (x - x^*) \rangle$ is negative and vanishes only at $0$ and $x^*$. It is easy to check that $N(x) < 0$ for $\|x\|_2$ large enough. We have verified numerically that the only solutions of the equations $\partial_{x_1} N = \partial_{x_2} N = 0$ in the closed positive quadrant are $x^*$ and $z = (0.1739, 0.4361)$, with $N(z) = -0.2852$, thus this is a negative local minimum. This implies that $N(x) < 0$ in the closed positive quadrant, except at $0$ and $x^*$ where it vanishes. This implies that $R \le 0$ in the closed positive quadrant. It is easy to check that

\[
\limsup_{\|x\|_2 \to +\infty} R(x) \le -(c \wedge h)/\sqrt{2}.
\]

This implies that $R < 0$ except perhaps at $0$ and $x^*$. Near $0$ we have by Taylor expansion

\[
R(x) = -\frac{\langle DX(0)\, x, x^* \rangle}{\|x\|_2\, \|x^*\|_2^2} \big(1 + O(\|x\|_2)\big)
= -\frac{\langle x, D^{\top}B(0)\, x^* \rangle}{\|x\|_2\, \|x^*\|_2^2} \big(1 + O(\|x\|_2)\big)
\]

and, since the vector $D^{\top}B(0)\, x^*$ (where $D^{\top}B(0)$ is the transpose of the differential of $B$ at $0$) has positive components, there exists $\varrho > 0$ such that for all $x \in \mathbb{R}_+^2$

\[
\langle x, D^{\top}B(0)\, x^* \rangle \ge \varrho\, \|x\|_2.
\]

If $y = x - x^*$ is small, we have by Taylor expansion (since $X(x^*) = 0$)

\[
R(x) = \frac{\langle M^* y, y \rangle}{\|x^*\|_2\, \|y\|_2^2} \big(1 + O(\|y\|_2)\big)
= \frac{\big\langle y, \tfrac{1}{2}\big(M^{*\top} + M^*\big) y \big\rangle}{\|x^*\|_2\, \|y\|_2^2} \big(1 + O(\|y\|_2)\big).
\]

One can check numerically that the two real eigenvalues of the symmetric matrix

\[
M^{*\top} + M^*
\]

are strictly negative, the largest being numerically equal to $-0.786$. This completes the verification of hypothesis (H.3).

Illustrating standard experiments on populations of cells or bacteria, we have chosen $K = 10^5$ and simulated a unique realization of the process with $T = 100$, which contains about $5 \cdot 10^7$ jumps (cell divisions or deaths). The resilience computed from the vector field is numerically equal to $0.547$. We have computed $\rho^*_{\mathrm{emp}}(100, 1, 10^5)$.

The relative error, that is $|\rho^*_{\mathrm{emp}}(100, 1, 10^5) - \rho^*| / \rho^*$, is equal to $0.022$.
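The deterministic quantities quoted in this example can be rechecked with a few lines of Python. The sketch below uses the coefficients and the fixed point given in the text; the Jacobian entries are differentiated by hand here, and the helper `eigenvalues_2x2` is a hypothetical name. It does not reproduce the stochastic simulation.

```python
import cmath

a, b, e, f = 0.4569, 0.2959, 0.5920, 0.6449
c, d, g, h = 0.9263, 0.9157, 0.9971, 0.2905
x1, x2 = 0.3567, 1.4855  # fixed point x* reported in the text

def X(x1, x2):
    """Vector field X = B - D of the example."""
    return (a * x1 + b * x2 - x1 * (c * x1 + d * x2),
            e * x1 + f * x2 - x2 * (g * x1 + h * x2))

# Jacobian M* = DX(x*), differentiated by hand.
M = [[a - 2 * c * x1 - d * x2, b - d * x1],
     [e - g * x2, f - g * x1 - 2 * h * x2]]

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return ((trace + disc) / 2.0, (trace - disc) / 2.0)

residual = max(abs(v) for v in X(x1, x2))          # should be close to 0
rho_star = -max(z.real for z in eigenvalues_2x2(M))

# Symmetric matrix M*^T + M* used in the verification of (H.3).
off = M[0][1] + M[1][0]
sym = [[2 * M[0][0], off], [off, 2 * M[1][1]]]
top_sym = max(z.real for z in eigenvalues_2x2(sym))
print(residual, rho_star, top_sym)
```

Because $x^*$ is only given to four decimals, the residual of $X(x^*)$ is small but not exactly zero.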

Note that the situation we are interested in is completely different from the standard statistical approach, where one can repeat the experiments.

1.6. Organization of the paper. — In Section 2, we will study the time evolution of the moments of the process and we will prove moment estimates for the qsd. In Section 3, we will obtain control on the large time behavior of averages for the process. In Section 4, we will prove the relations (1.2) and (1.7). In Section 5, we will apply these relations to obtain approximate expressions of the engineering resilience in terms of the covariance matrices for the qsd. From the results of Section 3, we will deduce variance bounds for the estimators (1.4), (1.5) and (1.8), starting either in the qsd or from an initial condition of order $K$.

Acknowledgements. — We thank the two anonymous referees for fruitful comments and suggestions.

2. Time evolution of moments of the process and moments of the QSD

2.1. Time evolution of moments starting from anywhere. — The generator $L_K$ of the birth-and-death process $N^K = (N^K(t),\, t \ge 0)$ is defined by

\[
L_K f(n) = K \sum_{\ell=1}^d B_\ell(n/K) \big( f(n + e^{(\ell)}) - f(n) \big) + K \sum_{\ell=1}^d D_\ell(n/K) \big( f(n - e^{(\ell)}) - f(n) \big), \tag{2.1}
\]

where $e^{(\ell)} = (0, \dots, 0, 1, 0, \dots, 0)$, the $1$ being at the $\ell$-th position, and $f : \mathbb{Z}_+^d \to \mathbb{R}$ is a function with bounded support. We denote by $(S_t^K,\, t \ge 0)$ the semigroup of the process $N^K$ acting on bounded functions, that is, for $f : \mathbb{Z}_+^d \to \mathbb{R}$, we have

\[
S_t^K f(n) = \mathbb{E}\big[ f(N^K(t)) \,\big|\, N^K(0) = n \big] = \mathbb{E}_n\big[ f(N^K(t)) \big].
\]
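The action of the generator (2.1) on a test function can be sketched directly. In the snippet below the rates `B` and `D` are hypothetical (they are not the paper's example), states are tuples of integers, and death moves that would leave $\mathbb{Z}_+^d$ are skipped, which is an assumption of this sketch.

```python
def generator(f, n, K, B, D):
    """Apply L_K from (2.1) to f at the state n (a tuple of d integers)."""
    x = tuple(c / K for c in n)
    births, deaths = B(x), D(x)
    total = 0.0
    for l in range(len(n)):
        up = n[:l] + (n[l] + 1,) + n[l + 1:]
        total += K * births[l] * (f(up) - f(n))
        if n[l] > 0:  # assumption: no move out of the positive orthant
            down = n[:l] + (n[l] - 1,) + n[l + 1:]
            total += K * deaths[l] * (f(down) - f(n))
    return total

# Hypothetical rates for two species.
def B(x): return (2.0 * x[0], 2.0 * x[1])
def D(x): return (x[0] * (1.0 + x[0] + x[1]), x[1] * (1.0 + x[0] + x[1]))

# For a coordinate function f(n) = n_1, (2.1) gives K * (B_1 - D_1)(n / K).
value = generator(lambda n: n[0], (10, 20), 10, B, D)
print(value)
```

Here $n/K = (1, 2)$, so $B_1 - D_1 = 2 - 4$ and the printed value should equal $10 \times (-2)$.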

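For $A > 1$, let

\[
T_A = \inf\{ t > 0 : \|N^K(t)\|_1 > A \}. \tag{2.2}
\]

Notice that we will use either $\|\cdot\|_1$ or $\|\cdot\|_2$. They are of course equivalent, but one can be more convenient than the other, depending on the context. We have the following result.

Theorem 2.1. — There exists a constant $C_{(2.1)} > 0$ such that for $K$ large enough, the operator $S_1^K$ extends to exponentially bounded functions and

\[
\sup_{n \in \mathbb{Z}_+^d} S_1^K\big( e^{\|\cdot\|_1} \big)(n) \le e^{C_{(2.1)} K}.
\]

Proof. — Introduce the function $G_K$ defined on $[y_0, +\infty)$ by

\[
G_K(y) = \int_y^{\infty} \frac{dz}{\widehat{D}(z)} + \frac{1}{K \widehat{D}(y)}.
\]

Assumption (H.6) implies that $G_K$ is well defined and decreasing on $[y_0, +\infty)$. We can define its inverse function on $(0, s_0]$ for $s_0 > 0$ small enough (independent of $K$). Take $0 < \eta \le s_0 \wedge (1 - e^{-1})/4$. Then there is a unique positive function $y_K$ defined by

\[
y_K(s) = G_K^{-1}(\eta s), \qquad s \in (0, 1]. \tag{2.3}
\]

Note that $y_K(s) \ge y_0$ and $\lim_{s \downarrow 0} y_K(s) = +\infty$. Let

\[
\varphi_K(s) = \frac{e^{-K y_K(s)}}{K \widehat{D}(y_K(s))}.
\]

Note that

\[
\lim_{s \downarrow 0} \varphi_K(s) = 0.
\]

Using the Lipschitz continuity of $\widehat{D}$ (and then its differentiability almost everywhere) and (2.3), we obtain

\[
\dot{\varphi}_K(s) = \frac{d\varphi_K}{ds}(s) = -\bigg( \frac{e^{-K y_K(s)}}{\widehat{D}(y_K(s))} + \frac{e^{-K y_K(s)}\, \widehat{D}'(y_K(s))}{K \widehat{D}(y_K(s))^2} \bigg) \frac{dy_K}{ds}(s) = \eta\, e^{-K y_K(s)}.
\]

We now consider the function

\[
f_K(t, n) = \varphi_K(t)\, e^{\|n\|_1},
\]

to which we apply Itô's formula at time $t \wedge T_A$. We get

\[
\mathbb{E}_n\Big[ \varphi_K(t \wedge T_A)\, e^{\|N^K(t \wedge T_A)\|_1} \Big] = \mathbb{E}_n\bigg[ \int_0^{t \wedge T_A} \big( \partial_t f_K + L_K f_K \big)(s, N^K(s)) \, ds \bigg].
\]

We have

\[
\partial_t f_K(t, n) + L_K f_K(t, n) = \dot{\varphi}_K(t)\, e^{\|n\|_1} + K \varphi_K(t)\, e^{\|n\|_1} \bigg( (e - 1) \sum_{\ell=1}^d B_\ell(n/K) + (e^{-1} - 1) \sum_{\ell=1}^d D_\ell(n/K) \bigg).
\]

Note that

\[
\begin{aligned}
\partial_t f_K(t, n) + L_K f_K(t, n)
&\le e^{\|n\|_1} \Big( \dot{\varphi}_K(t) + K \varphi_K(t) \big( (e - 1) \widehat{B}(\|n\|_1/K) - (1 - e^{-1}) \widehat{D}(\|n\|_1/K) \big) \Big) \\
&\le e^{\|n\|_1} \Big( \dot{\varphi}_K(t) - K \varphi_K(t) (1 - e^{-1}) \widehat{D}(\|n\|_1/K) \big( 1 - e F(\|n\|_1/K) \big) \Big).
\end{aligned}
\]

It follows from (H.5) that there exists a number $\zeta > y_0$ such that if $y > \zeta$, then $F(y) < (2e)^{-1}$.

If $\|n\|_1 < \zeta K$ we get

\[
\partial_t f_K(t, n) + L_K f_K(t, n) \le O(1)\, e^{\zeta K} \big( \dot{\varphi}_K(t) + K \varphi_K(t) \big).
\]

For $\|n\|_1 \ge K(\zeta \vee y_K(t))$ we have

\[
\partial_t f_K(t, n) + L_K f_K(t, n) \le 0,
\]

since $\dot{\varphi}_K(t) = \eta K \widehat{D}(y_K(t))\, \varphi_K(t)$ and $\widehat{D}(\|n\|_1/K) \ge \widehat{D}(y_K(t))$.

Finally, for $\zeta K \le \|n\|_1 < K y_K(t)$ we get

\[
\partial_t f_K(t, n) + L_K f_K(t, n) \le e^{K y_K(t)}\, \dot{\varphi}_K(t) = \eta.
\]

We deduce that

\[
\mathbb{E}_n\Big[ \varphi_K(1 \wedge T_A)\, e^{\|N^K(1 \wedge T_A)\|_1} \Big] \le O(1)\, e^{\zeta K}.
\]

The result follows by letting $A$ tend to infinity and by monotonicity.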

We deduce moment estimates for the process which are uniform in the starting state, and in time, for times larger than $1$.

Corollary 2.2. — For all $t \ge 1$, the semigroup $(S_t)$ maps functions of polynomially bounded modulus into bounded functions. In particular, for all $q \in \mathbb{N}$, we have

\[
\sup_{t \ge 1}\ \sup_{n \in \mathbb{Z}_+^d} \mathbb{E}_n\big[ \|N^K(t)\|_1^q \big] \le q^q e^{-q} K^q e^{C_{(2.1)}}. \tag{2.4}
\]

Proof. — We have

\[
\mathbb{E}_n\big[ \|N^K(1)\|_1^q \big] = K^q\, \mathbb{E}_n\bigg[ \frac{\|N^K(1)\|_1^q}{K^q}\, e^{-\|N^K(1)\|_1/K}\, e^{\|N^K(1)\|_1/K} \bigg] \le K^q q^q e^{-q}\, \mathbb{E}_n\Big[ e^{\|N^K(1)\|_1/K} \Big],
\]

since for all $x \ge 0$, $x^q e^{-x} \le q^q e^{-q}$. Inequality (2.4) at time $t = 1$ follows from Hölder's inequality and Theorem 2.1. Let us now consider $t > 1$. From the Markov property and by using the previous inequality, we deduce that

\[
\mathbb{E}_n\big[ \|N^K(t)\|_1^q \big] = \mathbb{E}_n\Big[ \mathbb{E}_{N^K(t-1)}\big[ \|N^K(1)\|_1^q \big] \Big] \le q^q e^{-q} K^q e^{C_{(2.1)}}.
\]

The proof is finished.

For times $t$ less than $1$, the moment estimates depend on the initial state.

Proposition 2.3. — For each integer $q$, there exists a constant $c_q > 0$ such that for all $K > 1$, $t \ge 0$ and $n \in \mathbb{Z}_+^d$

\[
\mathbb{E}_n\big[ \|N^K(t)\|_2^q \big] \le c_q K^q + \|n\|_2^q\, \mathbb{1}_{\{t < 1\}}.
\]

Proof. — We have only to study the case $t < 1$, the other case being given in (2.4).

We prove the result for $q$ even, namely $q = 2q'$. The result for $q$ odd follows from the Cauchy-Schwarz inequality. Letting

\[
f_{q'}(n) = \|n\|_2^{2q'},
\]

we have

\[
L_K f_{q'}(n) = K \sum_{\ell=1}^d B_\ell(n/K) \Big( \big( \|n\|_2^2 + 2n_\ell + 1 \big)^{q'} - \|n\|_2^{2q'} \Big) + K \sum_{\ell=1}^d D_\ell(n/K) \Big( \big( \|n\|_2^2 - 2n_\ell + 1 \big)^{q'} - \|n\|_2^{2q'} \Big).
\]

Using (H.5) and the equivalence of the norms, we see that there exists a constant $c_{q'} > 0$ such that if $\|n\|_2 > c_{q'} K$,

\[
L_K f_{q'}(n) < 0.
\]

Moreover, we can take $c_{q'}$ large enough such that for all $n$

\[
L_K f_{q'}(n) \le c_{q'} K^{2q'}.
\]

Applying Itô's formula to $f_{q'}$ we get, as in the proof of Theorem 2.1,

\[
\mathbb{E}_n\Big[ \|N^K(t \wedge T_A)\|_2^{2q'} \Big] \le \|n\|_2^{2q'} + \mathbb{E}_n\bigg[ \int_0^{t \wedge T_A} c_{q'} K^{2q'} \, ds \bigg] \le \|n\|_2^{2q'} + t\, c_{q'} K^{2q'}.
\]

(Recall that $T_A$ is defined in (2.2).) The result follows by letting $A$ tend to infinity.

2.2. Moment estimates for the qsd. — Let us first recall (cf. [3]) that, under the assumptions of Section 1.4, there exists a unique qsd $\nu_K$ with support $\mathbb{Z}_+^d \setminus \{0\}$.

Further, starting from the qsd, the extinction time is distributed according to an exponential law with parameter $\lambda_0(K)$ satisfying ([3, Th. 3.2])

\[
e^{-d_1 K} \le \lambda_0(K) \le e^{-d_2 K}, \tag{2.5}
\]

where $d_1 > d_2 > 0$ are constants independent of $K$. Recall also that for all $t > 0$,

\[
\mathbb{P}_{\nu_K}\big( N^K(t) \in \cdot\,,\ T_0 > t \big) = e^{-\lambda_0(K) t}\, \nu_K(\cdot), \tag{2.6}
\]

where

\[
T_0 = \inf\{ t > 0 : N^K(t) = 0 \}.
\]

Finally, for all $f$ in the domain of the generator

\[
L_K^{\dagger} \nu_K(f) = \nu_K(L_K f) = -\lambda_0(K)\, \nu_K(f) \tag{2.7}
\]

with the notation

\[
\nu_K(f) = \int f(n) \, d\nu_K(n).
\]

We use several notations from [3] that we now recall. Let $n^* = \lfloor K x^* \rfloor$.

For $x \in \mathbb{R}_+^d$ and $r > 0$, $\mathcal{B}(x, r)$ is the ball of center $x$ and radius $r$. We consider the sets

\[
\Delta = \mathcal{B}\big( n^*, \rho_{(4.2)} \sqrt{K} \big), \qquad \mathcal{D} = \mathcal{B}\big( n^*, (\min_j n^*_j)/2 \big) \cap \mathbb{Z}_+^d, \tag{2.8}
\]

where $\rho_{(4.2)} > 0$ is a constant defined in [3, Cor. 4.2]. Note that since $n^*$ is of order $K$, we have $\Delta \subset \mathcal{D}$ for $K$ large enough. The first entrance time in $\Delta$ (resp. $\mathcal{D}$) will be denoted by $T_\Delta$ (resp. $T_{\mathcal{D}}$).

We first prove that the support of the qsd is, for large $K$, almost included in $\mathcal{D}$. (This will be important to control moments later on.)

Proposition 2.4. — There exists a constant $c_{(2.4)} > 0$ such that for all $K$ large enough

\[
\nu_K\big( \mathcal{D}^c \big) \le e^{-c_{(2.4)} K}.
\]

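Proof. — We first recall two results from [3]. From [3, Lem. 1.5], there exist $\gamma > 0$ and $\delta \in (0, 1)$ such that for all $K$ large enough

\[
\sup_{n \in \Delta^c \setminus \{0\}} \mathbb{P}_n\big( T_\Delta > \gamma \log K,\ T_0 > T_\Delta \big) \le \delta. \tag{2.9}
\]

By [3, Sublem. 5.8], there exist two constants $C > 0$ and $c > 0$ such that for all $K$ large enough, and for all $t > 0$,

\[
\sup_{n \in \Delta} \mathbb{P}_n\big( T_{\mathcal{D}^c} < t \big) \le C (1 + t)\, e^{-cK}. \tag{2.10}
\]

Now, for $q \in \mathbb{N} \setminus \{0\}$ define

\[
t_q = q \gamma \log K.
\]

We will first estimate $\sup_n \mathbb{P}_n\big( N^K(t_q) \in \mathcal{D}^c,\ T_0 > t_q \big)$. Note that $N^K(t_q) \in \mathcal{D}^c$ implies $T_{\mathcal{D}^c} \le t_q$. We distinguish two cases according to whether $n \in \Delta$ or $n \in \Delta^c \setminus \{0\}$.

Let $n \in \Delta$. It follows from (2.10) that

\[
\mathbb{P}_n\big( N^K(t_q) \in \mathcal{D}^c \big) \le C (1 + t_q)\, e^{-cK}.
\]

Now let $n \in \Delta^c \setminus \{0\}$. We have

\[
\mathbb{P}_n\big( N^K(t_q) \in \mathcal{D}^c \setminus \{0\} \big) = \mathbb{P}_n\big( N^K(t_q) \in \mathcal{D}^c \setminus \{0\},\ T_\Delta \le t_q \big) + \mathbb{P}_n\big( N^K(t_q) \in \mathcal{D}^c \setminus \{0\},\ T_\Delta > t_q \big).
\]

Using the strong Markov property at time $T_\Delta$ and (2.10) we obtain

\[
\mathbb{P}_n\big( N^K(t_q) \in \mathcal{D}^c \setminus \{0\},\ T_\Delta \le t_q \big) = \mathbb{E}_n\Big[ \mathbb{1}_{\{T_\Delta \le t_q\}}\, \mathbb{P}_{N^K(T_\Delta)}\big( N^K(t_q - T_\Delta) \in \mathcal{D}^c \setminus \{0\} \big) \Big] \le C (1 + t_q)\, e^{-cK}.
\]

We bound the second term recursively in $q$:

\[
\mathbb{P}_n\big( T_\Delta > t_q,\ T_0 > T_\Delta \big) = \mathbb{E}_n\Big[ \mathbb{1}_{\{T_\Delta > t_{q-1}\}}\, \mathbb{1}_{\{T_0 > T_\Delta\}}\, \mathbb{P}_{N^K(t_{q-1})}\big( T_\Delta > t_1,\ T_0 > T_\Delta \big) \Big] \le \delta \sup_{n \in \Delta^c \setminus \{0\}} \mathbb{P}_n\big( T_\Delta > t_{q-1},\ T_0 > T_\Delta \big),
\]

where we used the strong Markov property at time $t_{q-1}$ and (2.9). This implies

\[
\sup_{n \in \Delta^c \setminus \{0\}} \mathbb{P}_n\big( N^K(t_q) \in \mathcal{D}^c \setminus \{0\},\ T_\Delta > t_q \big) \le \sup_{n \in \Delta^c \setminus \{0\}} \mathbb{P}_n\big( T_\Delta > t_q,\ T_0 > T_\Delta \big) \le \delta^q.
\]

Therefore

\[
\sup_{n \neq 0} \mathbb{P}_n\big( N^K(t_q) \in \mathcal{D}^c \setminus \{0\} \big) \le C (1 + t_q)\, e^{-cK} + \delta^q.
\]

Taking $q = \lfloor K \rfloor$ we conclude that there exists a constant $c' > 0$ such that for $K$ large enough

\[
\sup_{n \neq 0} \mathbb{P}_n\big( N^K(t_{\lfloor K \rfloor}) \in \mathcal{D}^c \setminus \{0\} \big) \le e^{-c' K}.
\]

This implies

\[
\mathbb{P}_{\nu_K}\big( N^K(t_{\lfloor K \rfloor}) \in \mathcal{D}^c,\ T_0 > t_{\lfloor K \rfloor} \big) \le e^{-c' K},
\]

but by (2.6)

\[
\mathbb{P}_{\nu_K}\big( N^K(t_{\lfloor K \rfloor}) \in \mathcal{D}^c,\ T_0 > t_{\lfloor K \rfloor} \big) = e^{-\lambda_0(K)\, t_{\lfloor K \rfloor}}\, \nu_K\big( \mathcal{D}^c \big)
\]

and the result follows from (2.5).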

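Corollary 2.5. — For each $q \in \mathbb{N}$, there exists $C_q > 0$ such that for all $K$ large enough

\[
\int_{\mathcal{D}^c} \|n\|_1^q \, d\nu_K(n) \le C_q K^q e^{-c_{(2.4)} K} \quad \text{and} \quad \int \|n\|_1^q \, d\nu_K(n) \le C_q K^q.
\]

Proof. — It follows at once from (2.6) (at time $1$) and Theorem 2.1 that

\[
\int e^{\|n\|_1} \, d\nu_K(n) \le e^{\lambda_0(K)}\, e^{C_{(2.1)} K} \le 2\, e^{C_{(2.1)} K} \tag{2.11}
\]

for $K$ large enough. We have

\[
\int_{\mathcal{D}^c} \|n\|_1^q \, d\nu_K(n) = K^q \int_{\mathcal{D}^c} (\|n\|_1/K)^q\, e^{-\|n\|_1/K}\, e^{\|n\|_1/K} \, d\nu_K(n) \le K^q q^q e^{-q} \int e^{\|n\|_1/K}\, \mathbb{1}_{\mathcal{D}^c}(n) \, d\nu_K(n).
\]

We use Hölder's inequality to get

\[
\int_{\mathcal{D}^c} \|n\|_1^q \, d\nu_K(n) \le K^q q^q e^{-q} \bigg( \int e^{\|n\|_1} \, d\nu_K(n) \bigg)^{1/K} \bigg( \int \mathbb{1}_{\mathcal{D}^c}(n) \, d\nu_K(n) \bigg)^{1 - 1/K}.
\]

The first result follows from (2.11) and Proposition 2.4. The second estimate follows from the first one, and the bound $\sup_{n \in \mathcal{D}} \|n\|_1 \le O(1) K$.

We now estimate centered moments.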

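Theorem 2.6. — For each $q \in \mathbb{Z}_+$, there exists $C_q > 0$ such that for all $K$ large enough

\[
\int \|n - Kx^*\|_2^{2q} \, d\nu_K(n) \le C_q K^q.
\]

Proof. — The proof consists in a recursion over $q$. The bound is trivial for $q = 0$. For $q \in \mathbb{N}$ define the function

\[
f_q(n) = \|n - Kx^*\|_2^{2q}\, \mathbb{1}_{\mathcal{D}_1}(n),
\quad \text{where} \quad
\mathcal{D}_1 = \mathcal{B}\big( Kx^*, (2K/3) \min_j x^*_j \big) \cap \mathbb{Z}_+^d.
\]

Recall that $e^{(j)}$ is the vector with $1$ at the $j$th coordinate and $0$ elsewhere. From the trivial identity

\[
\|n - Kx^* \pm e^{(j)}\|_2^2 = \|n - Kx^*\|_2^2 \pm 2 (n_j - Kx^*_j) + 1 \tag{2.12}
\]

it follows that

\[
\Big| \|n - Kx^* \pm e^{(j)}\|_2^{2q} - \|n - Kx^*\|_2^{2q} \mp 2q (n_j - Kx^*_j) \|n - Kx^*\|_2^{2q-2} \Big| \le 3^q 2^q \big( 1 + \|n - Kx^*\|_2^{2q-2} \big).
\]

Indeed, applying the trinomial expansion to (2.12), we obtain

\[
\Big| \|n - Kx^* \pm e^{(j)}\|_2^{2q} - \|n - Kx^*\|_2^{2q} \mp 2q (n_j - Kx^*_j) \|n - Kx^*\|_2^{2q-2} \Big|
\le q! \sum_{\substack{p_1 \le q-2 \\ p_1 + p_2 + p_3 = q}} \frac{\|n - Kx^*\|_2^{2p_1} \big( 2 \|n - Kx^*\|_2 \big)^{p_2}}{p_1!\, p_2!\, p_3!} + q\, \|n - Kx^*\|_2^{2q-2}.
\]

Observe that if $p_1 \le q - 2$ and $p_1 + p_2 + p_3 = q$, then $2p_1 + p_2 = p_1 + q - p_3 \le 2q - 2 - p_3 \le 2q - 2$, since $p_3 \ge 0$. This implies that

\[
\|n - Kx^*\|_2^{2p_1} \big( 2 \|n - Kx^*\|_2 \big)^{p_2} \le 2^q \big( 1 + \|n - Kx^*\|_2^{2q-2} \big).
\]

It follows that

\[
L_K f_q(n) = 2qK \sum_{j=1}^d X_j(n/K)\, (n_j - Kx^*_j)\, \|n - Kx^*\|_2^{2q-2}\, \mathbb{1}_{\mathcal{D}_1}(n) + R_q(n), \tag{2.13}
\]

where

\[
|R_q(n)| \le O(1) \Big( K 6^q \big( 1 + \|n - Kx^*\|_2^{2q-2} \big) \mathbb{1}_{\mathcal{D}_1}(n) + q K^{2q+1}\, \mathbb{1}_{\mathcal{D}^c}(n) \Big). \tag{2.14}
\]

To get this bound, we used the fact that

\[
\sup_{j = 1, \dots, d} \big| \mathbb{1}_{\mathcal{D}_1}(n \pm e^{(j)}) - \mathbb{1}_{\mathcal{D}_1}(n) \big| \le \mathbb{1}_{\mathcal{D}^c}(n).
\]

Using (1.10) we get

\[
K \sum_{j=1}^d X_j(n/K)\, (n_j - Kx^*_j)\, \|n - Kx^*\|_2^{2q-2}\, \mathbb{1}_{\mathcal{D}_1}(n) \le -\beta' \|n - Kx^*\|_2^{2q}\, \mathbb{1}_{\mathcal{D}_1}(n) = -\beta' f_q(n), \tag{2.15}
\]

where

\[
\beta' = \frac{\beta}{3} \min_j x^*_j.
\]

Integrating the equation (2.13) with respect to $\nu_K$ and using (2.7), (2.14), (2.15) and Proposition 2.4, we obtain

\[
\big( 2q\beta' - \lambda_0(K) \big)\, \nu_K(f_q) \le O(1) \Big( K 6^q \big( 1 + \nu_K(f_{q-1}) \big) + 6^q K^{2q+1} e^{-c_{(2.4)} K} \Big).
\]

Observing that $\nu_K(f_0) \le 1$, it follows by recursion over $q$ that, for each integer $q$, there exists $C'_q > 0$ such that, for all $K$ large enough, $\nu_K(f_q) \le C'_q K^q$. Finally we have

\[
\int \|n - Kx^*\|_2^{2q} \, d\nu_K(n) = \nu_K(f_q) + \int \|n - Kx^*\|_2^{2q}\, \mathbb{1}_{\mathcal{D}_1^c}(n) \, d\nu_K(n) \le \nu_K(f_q) + \int \|n - Kx^*\|_2^{2q}\, \mathbb{1}_{\mathcal{D}^c}(n) \, d\nu_K(n),
\]

since $\mathcal{D} \subset \mathcal{D}_1$. The result follows using the previous estimate and Corollary 2.5.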

The next result gives a more precise estimate for the average of $n$ (instead of an error of order $\sqrt{K}$).

Proposition 2.7. — We have

\[
\mu^K - Kx^* = O(1),
\]

where $\mu^K$ is defined in (1.1). Moreover, since $\|n^* - Kx^*\|_2 = O(1)$, we have

\[
\mu^K - n^* = O(1). \tag{2.16}
\]

Proof. — Define the functions

\[
g_j(n) = \langle n - Kx^*, e^{(j)} \rangle, \qquad 1 \le j \le d.
\]

By Taylor expansion and the polynomial bounds on $B$ and $D$ we get

\[
L_K g_j(n) = K \big( B_j(n/K) - D_j(n/K) \big) = \sum_{m=1}^d \big( \partial_m B_j(x^*) - \partial_m D_j(x^*) \big)\, g_m(n)\, \mathbb{1}_{\mathcal{D}}(n) + O(1)\, \frac{\|n - Kx^*\|_2^2}{K}\, \mathbb{1}_{\mathcal{D}}(n) + O(1)\, \big( K^p + \|n\|_2^p \big)\, \mathbb{1}_{\mathcal{D}^c}(n)
\]

for some positive integer $p$ independent of $K$. Using the Cauchy-Schwarz inequality, identity (2.7), Corollary 2.5 and Proposition 2.4 we get

\[
\int \big( 1 + \|n\|_2^p \big)\, \mathbb{1}_{\mathcal{D}^c}(n) \, d\nu_K(n) = o(1).
\]

From Proposition 2.4, Theorem 2.6 and (2.5) we get

\[
\sum_{m=1}^d \big( \partial_m B_j(x^*) - \partial_m D_j(x^*) \big)\, \nu_K(g_m) = O(1).
\]

The result follows from the invertibility of the $d \times d$ matrix $\big( \partial_m B_j(x^*) - \partial_m D_j(x^*) \big)$, which follows from (H.3). The other inequalities follow immediately.

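Corollary 2.8. — For all $K > 0$, we have

\[
\|\Sigma^K\| \le \int \|n - \mu^K\|_2^2 \, d\nu_K(n) = \int \|n - Kx^*\|_2^2 \, d\nu_K(n) + O(1) \le O(1) K.
\]

Proof. — Combine Proposition 2.7 and Theorem 2.6.

We now show that $\Sigma^K$ is indeed of order $K$.

Proposition 2.9. — There exist two strictly positive constants $c_{(2.9)}$ and $c'_{(2.9)}$ such that for all $K$ large enough, the matrix $\Sigma^K$ satisfies

\[
\Sigma^K \ge c_{(2.9)} K\, \mathrm{Id}
\]

for the order among positive definite matrices, $\mathrm{Id}$ being the identity matrix, and, in particular,

\[
\int \|n - \mu^K\|_2^2 \, d\nu_K(n) \ge c'_{(2.9)} K.
\]

Proof. — We denote by $\widetilde{\Sigma}^K$ the positive definite matrix

\[
\widetilde{\Sigma}^K_{p,q} = \int \big( n_p - n^*_p \big)\big( n_q - n^*_q \big) \, d\nu_K(n).
\]

By (2.16) we have

\[
\big\| \widetilde{\Sigma}^K - \Sigma^K \big\|_2 = O(1). \tag{2.17}
\]

Let $v$ be a unit vector in $\mathbb{R}^d$. We have

\[
\langle v, \widetilde{\Sigma}^K v \rangle = \int \langle v, (n - n^*) \rangle^2 \, d\nu_K(n) \ge \int_{\Delta} \langle v, (n - n^*) \rangle^2 \, d\nu_K(n).
\]

From [3, Lem. 5.3] there exists a constant $c > 0$ such that for all $K$ large enough and all $n \in \Delta$,

\[
\nu_K(\{n\}) \ge c\, U_\Delta(\{n\}),
\]

where $U_\Delta$ is the uniform distribution on $\Delta$. Therefore

\[
\langle v, \widetilde{\Sigma}^K v \rangle \ge c \int_{\Delta} \langle v, (n - n^*) \rangle^2 \, dU_\Delta(n)
\]

and we get

\[
\langle v, \widetilde{\Sigma}^K v \rangle \ge c_{(2.9)} K \|v\|_2^2.
\]

The result follows.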

3. Controlling time averages of the estimators

For $T > 0$, we define the time average of a function $f : \mathbb{Z}_+^d \to \mathbb{R}$ by

\[
S_f(T, K) = \frac{1}{T} \int_0^T f(N^K(s)) \, ds. \tag{3.1}
\]

The goal of this section is to obtain a control of $|S_f(T, K) - \nu_K(f)|$ for a suitable class of functions.

We recall the following result from [3, Th. 3.1].

Theorem 3.1 ([3]). — There exist $a > 0$, $K_0 > 1$ such that, for all $t \ge 0$ and for all $K \ge K_0$, we have

\[
\sup_{n \in \mathbb{Z}_+^d \setminus \{0\}} \big\| \mathbb{P}_n\big( N^K(t) \in \cdot\,,\ t < T_0 \big) - \mathbb{P}_n(t < T_0)\, \nu_K(\cdot) \big\|_{\mathrm{TV}} \le 2\, e^{-at/\log K}. \tag{3.2}
\]

It is also proved in [3] that, for a time much larger than $\log K$ and much smaller than the extinction time (which is of order $\exp(\Theta(1)K)$), the law of the process at time $t$ is close to the qsd. The accuracy of the approximation depends on the initial condition. This suggests studying the distance between the law of the process at time $t$ and the qsd as a function of the initial condition, $K$ and $t$. This will result from (3.2) if $\mathbb{P}_n(T_0 \le t)$ can be estimated. In fact we prove a more general result.

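Lemma 3.2. — For $\gamma > 0$, define $\tau_\gamma = \inf\{ t > 0 : \|N^K(t)\|_1 \le \gamma K \}$. There exist $\delta > 0$, $\alpha > 0$ and $C > 0$ such that for all $n \in \mathbb{Z}_+^d$, $K > 1$, $0 \le \gamma \le 1 \wedge \alpha/\|x^*\|_1$ and $t \ge 0$, we have

\[
\mathbb{P}_n\big( \tau_\gamma \le t \big) \le C \Big( \exp\big( -\delta \big( \zeta ((\|n\|_1/K) \wedge \alpha) - \gamma \|x^*\|_1 \big) K \big) + t \exp\big( -\delta (\alpha - \gamma \|x^*\|_1) K \big) \Big), \tag{3.3}
\]

where

\[
\zeta = \min_{1 \le j \le d} x^*_j > 0. \tag{3.4}
\]

Taking $\gamma = 0$ in (3.3), we get

\[
\mathbb{P}_n\big( T_0 \le t \big) \le C \Big( \exp\big( -\delta \big( \zeta (\|n\|_1/K) \wedge \alpha \big) K \big) + t \exp(-\alpha \delta K) \Big). \tag{3.5}
\]

Proof. — It follows from (H.1) and (H.3) (using Taylor's expansion of $X(x)$ near $0$) that there exists $\alpha_0 \in (0, R)$ (where $R$ was introduced in Assumption (H.3)) such that for all $x \in \mathbb{R}_+^d$ satisfying $\|x\|_2 \le \alpha_0$ we have

\[
\begin{aligned}
\langle X(x), x^* \rangle
&\ge \beta \|x^*\|_2^2 \|x\|_2 - 2\beta \|x\|_2 \langle x, x^* \rangle + \beta \|x\|_2^3 + \langle X(x), x \rangle \\
&\ge \beta \|x^*\|_2^2 \|x\|_2 + O(1) \|x\|_2^2 \ \ge\ \frac{\beta \|x^*\|_2^2}{2}\, \|x\|_2.
\end{aligned} \tag{3.6}
\]

For $\alpha \in (0, \alpha_0]$ and $\delta > 0$ to be chosen later on, we define

\[
\psi(n) = e^{-\delta (\langle n, x^* \rangle \wedge \alpha K)}.
\]

It is easy to verify that if $\langle n, x^* \rangle > \alpha K + \|x^*\|_2$ we have

\[
L_K \psi(n) = 0.
\]

If $\alpha K - \|x^*\|_2 \le \langle n, x^* \rangle \le \alpha K + \|x^*\|_2$ we have

\[
L_K \psi(n) \le O(K)\, e^{-\alpha \delta K}.
\]

For $\langle n, x^* \rangle \le \alpha K - \|x^*\|_2$, we have $\|n\|_1 \le \langle n, x^* \rangle/\zeta \le \alpha K/\zeta$, where $\zeta$ is defined in (3.4), and

\[
L_K \psi(n) = K g(\delta, n/K)\, e^{-\delta \langle n, x^* \rangle},
\]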