
POISSON PERTURBATIONS

Andrew D. Barbour¹ and Aihua Xia²

Abstract. Stein's method is used to prove approximations in total variation to the distributions of integer valued random variables by (possibly signed) compound Poisson measures. For sums of independent random variables, the results obtained are very explicit, and improve upon earlier work of Kruopis (1986) and Čekanavičius (1997); coupling methods are used to derive concrete expressions for the error bounds. An example is given to illustrate the potential for application to sums of dependent random variables.

Résumé. Stein's method is used to approximate, in total variation, the distribution of an integer valued random variable by a (possibly signed) compound Poisson measure. For sums of independent random variables, very explicit results are obtained; the error estimates, constructed by means of coupling, are sharper than those of Kruopis (1986) and Čekanavičius (1997). An example illustrates the potential of the method for sums of dependent random variables.

AMS Subject Classification. 62E17, 60G50, 60F05.

Received January 28, 1999. Revised July 26, 1999.

1. Introduction

In a series of papers, beginning with the pioneering work of Presman (1983) and Kruopis (1986), and continuing with those of Čekanavičius et al. (see, for example, the references in Čekanavičius 1997), it has been shown that signed compound Poisson measures can be used to make very close approximations in total variation to the distributions of sums of independent integer valued random variables. Signed compound Poisson measures $\mu$ on the integers are measures of the form $\mu = \exp\{\lambda(F-E)\}$ for some $\lambda\in\mathbb R$, where $F$ is any probability distribution on the integers and $E$ is the unit mass at 0: multiplication is interpreted as convolution, and the exponential is defined through its power series. Such measures may be signed measures if $\lambda<0$, but it is always the case that $\mu\{\mathbb Z\} = 1$.

In this paper, we confine our attention to the very small subset of such measures of the form $\pi_{\mu,a}$, for $\mu>0$ and $a\in\mathbb R$, having generating function

$$\hat\pi_{\mu,a}(z) := \sum_{r\ge 0} z^r\pi_{\mu,a}\{r\} = \exp\bigl\{\mu(z-1) + \tfrac12 a(z^2-1)\bigr\}, \qquad(1.1)$$

which are concentrated on $\mathbb Z_+$, together with their translates

$$\pi^{(m)}_{\mu,a},\ m\in\mathbb Z;\qquad \pi^{(m)}_{\mu,a}\{s+m\} := \pi_{\mu,a}\{s\},\quad s\in\mathbb Z_+. \qquad(1.2)$$

Keywords and phrases: Stein’s method, signed compound Poisson measure, total variation, coupling.

Andrew Barbour was supported in part by Schweizerischer NF Projekt Nr 20-50686.97.

1 Abteilung für Angewandte Mathematik, Universität Zürich, Winterthurerstrasse 190, 8057 Zürich, Switzerland; e-mail: adb@amath.unizh.ch

2 Department of Statistics, School of Mathematics, The University of New South Wales, Sydney 2052, Australia.

© EDP Sciences, SMAI 1999


The generating function (1.1) is that of the convolution of the Poisson distribution $\mathrm{Po}(\mu)$ and, for $a>0$, the distribution of $2Z$, where $Z\sim\mathrm{Po}(a/2)$; for $a<0$, the measure $\pi_{\mu,a}$ must be a signed measure, in view of Raikov's (1938) theorem. For many purposes, this class is already wide enough to give good approximations; indeed, we often assume that $\mu$ is large but $a$ is bounded, in which case $\pi_{\mu,a}$ is a small perturbation of $\mathrm{Po}(\mu)$, a Poisson distribution with large parameter. In these circumstances, $\pi_{\mu,a}\{r\}$ is positive even when $a<0$, except for very large values of $r$, and the total negative mass is exponentially small in $\mu$: hence the signed measure $\pi_{\mu,a}$ could actually be replaced as an approximation by a probability distribution, with a very small change in total variation.

In order to use measures from the family $\pi_{\mu,a}$ as approximations, we need to have a way of showing how close they are to the distribution of any given random variable. Previous work has largely concentrated on Fourier methods, limiting the scope of applicability. Here, we show how Stein's method can be used. For $a>0$, the measures $\pi_{\mu,a}$ are compound Poisson distributions, and the general theory in Barbour and Utev (1998, 1999) could be invoked. However, the particular form of the $\pi_{\mu,a}$ allows us to prove better bounds on the solutions of the Stein equation by a much easier argument, which is also valid for $a<0$, provided that $\mu$ is large enough by comparison with $|a|$. Indeed, the same method can be used to establish bounds on the solutions of the Stein equation (Th. 2.5) which are of optimal order in $\mu$, for a whole class of compound Poisson distributions. Bounds of this order were previously known only for the Poisson distribution.

We first use Stein's method to demonstrate that the measures $\pi_{\mu,a}$ are often close to other, better known distributions, so that approximation with respect to $\pi_{\mu,a}$ can then be more easily understood. We then show how to use the $\pi_{\mu,a}$ to prove approximation theorems with respect to total variation distance for sums of independent integer valued random variables, under much the same circumstances as are required for the usual central limit theorem, and we give very explicit error bounds. For sums of independent indicator random variables, our bounds have much better constants than those of Kruopis; for more general summands, explicit bounds seem to be new. In particular, we show that the family (1.2), restricted to have $0\le a<1$, can frequently be used to give good approximations. This family consists entirely of probability distributions, and comes as near as possible on $\mathbb Z$ to a family of translates of the Poisson distribution $\mathrm{Po}(\mu)$ by any real displacement; furthermore, for large $\mu$, these distributions are extremely close to the negative binomial distribution with the same mean and variance. Finally, we use a very simple example to illustrate that approximations of this kind can in principle also be obtained for sums of dependent random variables.

2. A Stein equation

It follows directly from the definition in (1.1) that

$$\mu\pi_{\mu,a}\{0\} = \pi_{\mu,a}\{1\};\qquad a\pi_{\mu,a}\{r-2\} + \mu\pi_{\mu,a}\{r-1\} = r\pi_{\mu,a}\{r\},\quad r\ge 2, \qquad(2.1)$$

from which it follows that $\pi_{\mu,a}\{f\} = 0$ for all $f$ of the form

$$f(j) = ag(j+2) + \mu g(j+1) - jg(j) \qquad(2.2)$$

for bounded $g$, where, for a function $f$ and a measure $\mu$, $\mu\{f\} := \sum_r f(r)\mu\{r\}$. This suggests a Stein equation for the measure $\pi_{\mu,a}$. The following lemma gives the necessary properties of the solutions.
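The recursion (2.1) also gives a stable way to compute the masses of $\pi_{\mu,a}$ numerically. A minimal Python sketch (ours, not the paper's; the truncation point $N$, the parameter values and the test function $g$ are arbitrary) builds the masses and checks the characterising identity $\pi_{\mu,a}\{f\} = 0$ for $f$ of the form (2.2):

```python
import math

def pi_masses(mu, a, N):
    """Masses pi_{mu,a}{0..N} via the recursion (2.1);
    pi{0} = exp(-mu - a/2) comes from the generating function (1.1) at z = 0."""
    assert N >= 1
    p = [0.0] * (N + 1)
    p[0] = math.exp(-mu - a / 2.0)
    p[1] = mu * p[0]                                  # mu*pi{0} = pi{1}
    for r in range(2, N + 1):
        p[r] = (mu * p[r - 1] + a * p[r - 2]) / r     # a*pi{r-2} + mu*pi{r-1} = r*pi{r}
    return p

mu, a, N = 10.0, -0.4, 200
p = pi_masses(mu, a, N)
print(sum(p))                                # ~1.0: pi_{mu,a}{Z} = 1, even when a < 0

g = lambda j: 1.0 if j == 5 else 0.0         # any bounded g
f = lambda j: a * g(j + 2) + mu * g(j + 1) - j * g(j)
print(sum(f(j) * p[j] for j in range(N + 1)))   # ~0: pi_{mu,a}{f} = 0, as in (2.2)
```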

Lemma 2.1. For any $\mu>0$, $a\in\mathbb R$ such that $\mu+a>0$ and $\theta = |a|(\mu+a)^{-1} < 1/2$, and for any bounded $f:\mathbb Z_+\to\mathbb R$, there is a solution $g = g_f:\mathbb Z_+\to\mathbb R$ to the Stein equation

$$ag(j+2) + \mu g(j+1) - jg(j) = f(j) - \pi_{\mu,a}\{f\},\quad j\ge 0, \qquad(2.3)$$

which satisfies

$$\|g_f\| \le \frac{2}{1-2\theta}(\mu+a)^{-1/2}\|f\|;\qquad \|\Delta g_f\| \le \frac{2}{1-2\theta}(\mu+a)^{-1}\|f\|, \qquad(2.4)$$

where $\Delta g(j) := g(j+1) - g(j)$ and $\|\cdot\|$ applied to functions denotes the supremum norm.


Proof. We construct $g_f$ as a perturbation of the solution $g_0$ to the Stein equation

$$(\mu+a)g_0(j+1) - jg_0(j) = f(j) - \mathrm{Po}(\mu+a)\{f\},\quad j\ge 0, \qquad(2.5)$$

for the Poisson distribution $\mathrm{Po}(\mu+a)$ with mean $\mu+a$; $g_0$ satisfies (Barbour et al., Lem. 1.1.1)

$$\|g_0\| \le 2(\mu+a)^{-1/2}\|f\|;\qquad \|\Delta g_0\| \le 2(\mu+a)^{-1}\|f\|. \qquad(2.6)$$

To do this, define $Tg$ for any bounded function $g:\mathbb Z_+\to\mathbb R$ to be the solution $\tilde g$ of the equation

$$(\mu+a)\tilde g(j+1) - j\tilde g(j) = f(j) - \mathrm{Po}(\mu+a)\{f\} - a\Delta g(j+1) + a\,\mathrm{Po}(\mu+a)\{\Delta g(\cdot+1)\},\quad j\ge 0. \qquad(2.7)$$

Consider the sequence $(g_n,\ n\ge 0)$ defined by $g_n = Tg_{n-1}$, $n\ge 1$, with $g_0$ as in (2.5). Then, writing $h_n(j) = g_n(j) - g_{n-1}(j)$, we have

$$(\mu+a)h_n(j+1) - jh_n(j) = -a\Delta h_{n-1}(j+1) + a\,\mathrm{Po}(\mu+a)\{\Delta h_{n-1}(\cdot+1)\}, \qquad(2.8)$$

so that, from (2.6),

$$\|h_n\| \le 2|a|(\mu+a)^{-1/2}\|\Delta h_{n-1}\|; \qquad(2.9)$$

$$\|\Delta h_n\| \le 2|a|(\mu+a)^{-1}\|\Delta h_{n-1}\| = 2\theta\|\Delta h_{n-1}\|. \qquad(2.10)$$

From (2.10) and (2.6), it then follows that

$$\|\Delta h_n\| \le (2\theta)^n\|\Delta g_0\| \le 2(2\theta)^n(\mu+a)^{-1}\|f\|;$$

in conjunction with (2.9), this gives

$$\|h_n\| \le 2(2\theta)^n(\mu+a)^{-1/2}\|f\|.$$

Hence $g_f = \lim_{n\to\infty}g_n$ exists uniformly and satisfies $Tg_f = g_f$, and

$$\|g_f\| \le \sum_{n\ge 1}2(2\theta)^n(\mu+a)^{-1/2}\|f\| + \|g_0\| \le \frac{2}{1-2\theta}(\mu+a)^{-1/2}\|f\|.$$

Furthermore, since $Tg_f = g_f$, it follows from (2.6) and (2.7) that

$$\|\Delta g_f\| \le 2(\mu+a)^{-1}\{\|f\| + |a|\,\|\Delta g_f\|\},$$

so that $\|\Delta g_f\| \le \frac{2}{1-2\theta}(\mu+a)^{-1}\|f\|$. Finally, again since $Tg_f = g_f$, it follows from (2.7) that

$$ag_f(j+2) + \mu g_f(j+1) - jg_f(j) = f(j) - \mathrm{Po}(\mu+a)\{f\} + a\,\mathrm{Po}(\mu+a)\{\Delta g_f(\cdot+1)\}, \qquad(2.11)$$

and applying $\pi_{\mu,a}$ to both sides gives

$$0 = \pi_{\mu,a}\{f\} - \mathrm{Po}(\mu+a)\{f\} + a\,\mathrm{Po}(\mu+a)\{\Delta g_f(\cdot+1)\},$$

so that, from (2.11), the function $g_f$ indeed satisfies (2.3).

The following corollary is immediate. The notation $\|\cdot\|$ is used with measures to denote the total variation norm: if $\nu$ is a (signed) measure on $\mathbb Z$, with positive and negative parts $\nu^+$ and $\nu^-$, then $\|\nu\| = \nu^+(\mathbb Z) + \nu^-(\mathbb Z)$.

Corollary 2.2. Suppose that $\mu, a\in\mathbb R$ satisfy $\mu+a>0$ and $\theta = |a|(\mu+a)^{-1} < 1/2$. If $W$ is any random variable on $\mathbb Z_+$ such that

$$|\mathbb E\{ag(W+2) + \mu g(W+1) - Wg(W)\}| \le \varepsilon \qquad(2.12)$$

for all $g = g_f$ as in Lemma 2.1 with $\|f\|\le 1$, then $\|\mathcal L(W) - \pi_{\mu,a}\| \le \varepsilon$. In particular, if

$$|\mathbb E\{ag(W+2) + \mu g(W+1) - Wg(W)\}| \le \varepsilon_0\|g\| + \varepsilon_1\|\Delta g\| \qquad(2.13)$$

for all bounded $g:\mathbb Z_+\to\mathbb R$, it follows that

$$\|\mathcal L(W) - \pi_{\mu,a}\| \le \frac{2}{1-2\theta}\bigl\{(\mu+a)^{-1/2}\varepsilon_0 + (\mu+a)^{-1}\varepsilon_1\bigr\}.$$

If $W$ can take values in the whole of $\mathbb Z$, the corresponding result is a little more complicated.

Corollary 2.3. Suppose that $\mu, a\in\mathbb R$ satisfy $\mu+a>0$ and $\theta = |a|(\mu+a)^{-1} < 1/2$. Let $W$ be a random variable on $\mathbb Z$ such that

$$|\mathbb E\{ag(W+2) + \mu g(W+1) - Wg(W)\}| \le \varepsilon_0\|g\| + \varepsilon_1\|\Delta g\| \qquad(2.14)$$

for all bounded $g:\mathbb Z\to\mathbb R$. Then it follows that

$$\|\mathcal L(W) - \pi_{\mu,a}\| \le \frac{2}{1-2\theta}\bigl\{(\mu+a)^{-1/2}\varepsilon_0 + (\mu+a)^{-1}\varepsilon_1 + (1-\theta)P[W\le -1]\bigr\}. \qquad(2.15)$$

In particular, if also $\mathbb EW = \mu+a$ and $\mathrm{Var}\,W = \mu+2a$, then

$$\|\mathcal L(W) - \pi_{\mu,a}\| \le \frac{2}{1-2\theta}\bigl\{(\mu+a)^{-1/2}\varepsilon_0 + (\mu+a)^{-1}[\varepsilon_1+1]\bigr\}. \qquad(2.16)$$

Proof. For bounded $f:\mathbb Z_+\to\mathbb R$, take $g_f$ as given by Lemma 2.1, and extend to $g_f:\mathbb Z\to\mathbb R$ by setting $g_f(j) = 0$, $j\le 0$. Note that $g_f(0)$ is not actually defined by (2.3), but that, from (2.3) with $j=0$ and from (2.4),

$$(\mu+a)|g_f(1)| = |f(0) - \pi_{\mu,a}\{f\} - a(g_f(2)-g_f(1))| \le (1+\|\pi_{\mu,a}\|)\|f\| + |a|\,\|\Delta g_f\| \le \frac{2}{1-2\theta}\|f\|,$$

so that this extension continues to satisfy (2.4), provided that we have $\|\pi_{\mu,a}\| \le (1-2\theta)^{-1}$.

For $a\ge 0$, it is immediate that $\|\pi_{\mu,a}\| = 1 \le (1-2\theta)^{-1}$. For $a<0$, take any $f$ with $\|f\|\le 1$, and use (2.5) to give

$$\pi_{\mu,a}\{f\} - \mathrm{Po}(\mu+a)\{f\} = \sum_{j\ge 0}\bigl((\mu+a)g_0(j+1) - jg_0(j)\bigr)\pi_{\mu,a}\{j\} = -a\,\pi_{\mu,a}\{\Delta g_0(\cdot+1)\},$$

the last equality coming from (2.2), giving

$$\|\pi_{\mu,a}\| \le 1 + 2|a|(\mu+a)^{-1}\|\pi_{\mu,a}\|$$

from (2.6). Hence, if $a<0$ and $\theta = |a|(\mu+a)^{-1} < 1/2$, then

$$\|\pi_{\mu,a}\| \le (1-2\theta)^{-1}. \qquad(2.17)$$

In terms now of functions on $\mathbb Z$, (2.3) is equivalent to

$$(f(j) - \pi_{\mu,a}\{f\})1_{\{j\ge 0\}} = \{ag_f(j+2) + \mu g_f(j+1) - jg_f(j)\}1_{\{j\ge 0\}}, \qquad(2.18)$$

defining the value of $h_f(j) = ag_f(j+2) + \mu g_f(j+1) - jg_f(j)$ for $j\ge 0$; the definition of $g_f$ then gives $h_f(-1) = ag_f(1)$ for $j = -1$, and $h_f(j) = 0$ for $j\le -2$. Hence, from (2.14), it follows that, for any bounded $f:\mathbb Z\to\mathbb R$,

$$|ag_f(1)P[W=-1] + \mathbb E\{(f(W) - \pi_{\mu,a}\{f\})1_{\{W\ge 0\}}\}| \le \varepsilon_0\|g_f\| + \varepsilon_1\|\Delta g_f\|,$$

and hence that

$$|\mathbb Ef(W) - \pi_{\mu,a}\{f\}| \le 2\|f\|P[W\le -1] + \frac{2}{1-2\theta}|a|(\mu+a)^{-1}\|f\|P[W=-1] + \frac{2}{1-2\theta}\bigl\{(\mu+a)^{-1/2}\varepsilon_0 + (\mu+a)^{-1}\varepsilon_1\bigr\},$$

completing the proof of (2.15). The particular case then follows by applying Chebyshev's inequality.

Remark. The Chebyshev bound for $P[W\le -1]$ can of course be improved, if more information about $W$ is available.

The following corollary can also prove useful.

Corollary 2.4. Suppose that $\mu_i, a_i$ satisfy $\mu_i+a_i>0$ and $\theta_i = |a_i|(\mu_i+a_i)^{-1} < 1/2$, $i=1,2$. Then

$$\|\pi_{\mu_1,a_1} - \pi_{\mu_2,a_2}\| \le k(\mu_1+a_1)^{-1/2}\{|\mu_1-\mu_2| + |a_1-a_2|\},$$

with $k = 2/(1-2\theta_1)$ if $a_2\ge 0$, $k = 2/\{(1-2\theta_1)(1-2\theta_2)\}$ if $a_2<0$.

Proof. Apply $\pi_{\mu_2,a_2}$ to (2.3) with $a = a_1$ and $\mu = \mu_1$, using the fact that $\pi_{\mu_2,a_2}\{f\} = 0$ for all $f$ as given in (2.2) with $a = a_2$ and $\mu = \mu_2$. This, together with (2.4), gives

$$|\pi_{\mu_2,a_2}\{f\} - \pi_{\mu_1,a_1}\{f\}| \le \{|\mu_1-\mu_2| + |a_1-a_2|\}\,\frac{2}{1-2\theta_1}(\mu_1+a_1)^{-1/2}\|f\|\,\|\pi_{\mu_2,a_2}\|.$$

The corollary follows by taking arbitrary $f$ such that $\|f\|\le 1$, and using (2.17) to bound $\|\pi_{\mu_2,a_2}\|$.

The perturbation argument of Lemma 2.1 can easily be modified to cover a more general class of compound Poisson distributions. This class is particularly useful in problems where a crude Poisson approximation is to be refined.

Theorem 2.5. Let $\lambda_i\in\mathbb R_+$, $i\ge 1$, satisfy

$$\theta := \Bigl(\sum_{i\ge 1}i\lambda_i\Bigr)^{-1}\sum_{i\ge 2}i(i-1)\lambda_i < 1/2. \qquad(2.19)$$

Then, for any $A\subset\mathbb Z_+$, the Stein equation

$$\sum_{i\ge 1}i\lambda_ig(j+i) - jg(j) = 1_A(j) - \mathrm{CP}(\lambda)\{A\},\quad j\ge 0, \qquad(2.20)$$

for the compound Poisson distribution $\mathrm{CP}(\lambda)$ has a solution $g = g_A$ satisfying

$$\|g_A\| \le \frac{1}{1-2\theta}\Bigl(\sum_{i\ge 1}i\lambda_i\Bigr)^{-1/2};\qquad \|\Delta g_A\| \le \frac{1}{1-2\theta}\Bigl(\sum_{i\ge 1}i\lambda_i\Bigr)^{-1}. \qquad(2.21)$$

These simple bounds are in stark contrast to the much more complicated behaviour of the solutions to the Stein equation (2.20) which can occur when $\theta > 1/2$. Note also that, by defining

$$\theta := \Bigl(\sum_{i\ge 1}i\lambda_i\Bigr)^{-1}\sum_{i\ge 2}i(i-1)|\lambda_i|,$$

signed compound Poisson measures such that $\sum_{i\ge 1}i\lambda_i > 0$ and $\theta < 1/2$ could also be covered. If only $\lambda_1$ and $\lambda_2$ are non-zero, the definition of $\theta$ reduces to that of Lemma 2.1.


3. Approximating $\pi_{\mu,a}$

The measures $\pi_{\mu,a}$ are not immediately familiar, especially since, for $a<0$, they are not probability measures. Our first use of the results of the previous section is therefore to show that the $\pi_{\mu,a}$ are often close to more widely used distributions. To start with, we show that, at least when $|a|\ll\mu$, the measures $\pi_{\mu,a}$ can be seen as small perturbations of the Poisson distribution $\mathrm{Po}(\mu)$. More precisely, for $\mu\ge 1$ and $c_1,c_2\in\mathbb R$, let $\nu(\mu,c_1,c_2)$ denote the (still possibly signed) measure on $\mathbb Z_+$ defined by

$$\nu(\mu,c_1,c_2)\{s\} = \mathrm{Po}(\mu)\{s\}\Bigl(1 + c_1\mu^{-1}(s-\mu) + \tfrac12c_2\mu^{-2}\{(s-\mu)^2-\mu\}\Bigr),\quad s\in\mathbb Z_+, \qquad(3.1)$$

which satisfies $\nu(\mu,c_1,c_2)\{\mathbb Z_+\} = 1$. As for $\pi_{\mu,a}$, if $\mu$ is large and $c_1, c_2$ remain bounded, there are probability measures differing in total variation from the measure $\nu(\mu,c_1,c_2)$ by an amount which is exponentially small in $\mu$. Under these circumstances, we show in the next two theorems that measures close to $\nu(\mu,c_1,c_2)$ are also close to the measure $\pi_{\mu_0,a_0}$ with

$$\mu_0 = \mu + c_1 - (c_2 - c_1^2) \quad\text{and}\quad a_0 = c_2 - c_1^2. \qquad(3.2)$$

We begin with a lemma, showing that $\nu(\mu,c_1,c_2)$ almost satisfies a Stein equation of the form (2.3).

Lemma 3.1. Define $\mu_0$ and $a_0$ as in (3.2), and suppose that $\mu$, $c_1$ and $c_2$ are such that $\mu_0+a_0\ge\mu/2$. Let $g:\mathbb Z_+\to\mathbb R$ be any function satisfying $\|g\|\le 4(\mu_0+a_0)^{-1/2}$ and $\|\Delta g\|\le 4(\mu_0+a_0)^{-1}$, and define $h$ by

$$h(j) = a_0g(j+2) + \mu_0g(j+1) - jg(j).$$

Then $|\nu(\mu,c_1,c_2)\{h\}| \le k_{(3.1)}(\mu,c_1,c_2)\mu^{-3/2}$, where

$$k_{(3.1)}(\mu,c_1,c_2) := 4\bigl\{(|c_1|\,|2a_0+c_2| + |c_2|) + \mu^{-1/2}(2|a_0|+|c_1|)|c_2| + \mu^{-1}|a_0c_2|\bigr\}.$$

·

Proof. By definition, we have ν(µ, c1, c2){h}=En

1 +c1µ1(Z−µ) +12c2µ2{(Z−µ)2−µ}

×(a0g(Z+ 2) +µ0g(Z+ 1)−Zg(Z)) o

,(3.3) where Z∼Po(µ). NowE(Zg(Z)) =µEg(Z+ 1) for anygfor which the expectations exist; using this identity to eliminate powers ofZ in (3.3) yields, after some computation,

2ν(µ, c1, c2){h}=E

k12g(Z+ 1) +a0c23g(Z+ 1) +c2µ1[a02g(Z+ 1) +c1∆g(Z+ 1)−g(Z+ 1)] ,

where k1 = c1(2a0 +c2), and a0 = c2 −c21, as in (3.2). The last two terms have expectations bounded by 8|c1c22 and 8|c23/2 respectively, because of the bounds on kgkand k∆gkand because µ0+a0 µ/2.

Then the bound onk∆gkalso shows that

|E2g(Z+ 1)| ≤4(µ0+a0)1X

j1

|P[Z=j]−P[Z=j−1]| ≤8

2e1µ3/2, and that

|E3g(Z+ 1)| ≤4(µ0+a0)1X

j1

|P[Z=j]−2P[Z =j−1] +P[Z=j−2]| ≤16µ2,

by Barbouret al., pp. 222 and 224. The lemma now follows by collecting terms.

As a consequence of Lemma 3.1, we have the following approximation.


Theorem 3.2. For $\mu\ge\max\{1, 8|c_2-c_1^2|, 2|c_1|\}$, we have

$$\|\nu(\mu,c_1,c_2) - \pi_{\mu_0,a_0}\| \le k_{(3.1)}(\mu,c_1,c_2)\mu^{-3/2},$$

where $\mu_0$ and $a_0$ are as in (3.2).

Proof. Take any $f:\mathbb Z_+\to\mathbb R$ with $\|f\|\le 1$, and construct $g_f$ using Lemma 2.1 with $\mu_0$ for $\mu$ and $a_0$ for $a$. Note that, by the assumption on $\mu$, $\mu_0+a_0\ge\mu/2$ and $4|a_0|\le(\mu_0+a_0)$; hence, from Lemma 2.1, $\|g_f\|\le 4(\mu_0+a_0)^{-1/2}$ and $\|\Delta g_f\|\le 4(\mu_0+a_0)^{-1}$. Now apply Lemma 3.1 in conjunction with (2.3) to give

$$|\nu(\mu,c_1,c_2)\{f\} - \pi_{\mu_0,a_0}\{f\}| \le k_{(3.1)}(\mu,c_1,c_2)\mu^{-3/2}, \qquad(3.4)$$

from which the theorem follows.

Note that $k_{(3.1)}(\mu,c_1,0) = 8|c_1|^3$ and that $k_{(3.1)}(\mu,0,c_2) = 4|c_2|(1 + (2\mu^{-1/2}+\mu^{-1})|c_2|)$. Thus $\nu(\mu,c_1,0)$ is close to $\pi_{\mu_0,a_0}$ for all $|c_1|\ll\mu^{1/2}$, and $\nu(\mu,0,c_2)$ is close to $\pi_{\mu_0,a_0}$ for all $|c_2|\ll\mu$. Hence, in particular, $\pi_{\mu,a}$ can be reasonably approximated by the member $\nu(\mu+a,0,a)$ of the family $\nu(\mu,c_1,c_2)$, as long as $|a|\ll\mu$.
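Theorem 3.2 is easy to probe numerically, since both $\nu(\mu,c_1,c_2)$ and $\pi_{\mu_0,a_0}$ can be computed exactly on a finite range. The Python sketch below (our illustration, with arbitrary parameter values chosen to satisfy the hypothesis of Theorem 3.2) compares the observed total variation distance with the bound $k_{(3.1)}(\mu,c_1,c_2)\mu^{-3/2}$:

```python
import math

def poisson_masses(mu, N):
    p = [math.exp(-mu)]
    for r in range(1, N + 1):
        p.append(p[-1] * mu / r)
    return p

def pi_masses(mu, a, N):
    # recursion (2.1), as in the earlier sketch
    p = [0.0] * (N + 1)
    p[0] = math.exp(-mu - a / 2.0)
    p[1] = mu * p[0]
    for r in range(2, N + 1):
        p[r] = (mu * p[r - 1] + a * p[r - 2]) / r
    return p

mu, c1, c2 = 40.0, 0.5, 0.8        # mu >= max{1, 8|c2-c1^2|, 2|c1|} holds here
a0 = c2 - c1 ** 2
mu0 = mu + c1 - a0                 # (3.2)
N = 200

po = poisson_masses(mu, N)
nu = [po[s] * (1 + c1 * (s - mu) / mu
               + 0.5 * c2 * ((s - mu) ** 2 - mu) / mu ** 2)
      for s in range(N + 1)]       # (3.1)
pi0 = pi_masses(mu0, a0, N)

tv = sum(abs(nu[s] - pi0[s]) for s in range(N + 1))
k31 = 4 * ((abs(c1) * abs(2 * a0 + c2) + abs(c2))
           + (2 * abs(a0) + abs(c1)) * abs(c2) / math.sqrt(mu)
           + abs(a0 * c2) / mu)
print(tv, k31 * mu ** -1.5)        # observed distance vs. the Theorem 3.2 bound
```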

The theorems given so far are useful when $\pi_{\mu,a}$ is a genuinely small perturbation of $\mathrm{Po}(\mu)$, which is the case if $|a|\ll\mu$. However, it is at times useful to allow $|a|$ to be of the same order of magnitude as $\mu$, and still have simple probability approximations to $\pi_{\mu,a}$. When $a>0$, $\pi_{\mu,a}$ is already a probability measure, and so no further approximation is required. However, this family of compound Poisson distributions is less well known than the negative binomial family, using which good approximations can often also be obtained, as in Corollary 4.8. If $a<0$, the obvious family to use is the binomial $\mathrm{Bi}(n,p)$. Here, the fact that $n$ has to be integral requires some small adjustment. Matching mean and variance requires that $np = \mu+a$ and that $np(1-p) = \mu+2a$, or, equivalently, that $n = |a|^{-1}(\mu+a)^2$ and that $p = |a|(\mu+a)^{-1}$. If $n$ so defined is not integral, choose $\varepsilon$ to satisfy $|a|^{-1}(\mu-\varepsilon+a)^2 = n = \lfloor|a|^{-1}(\mu+a)^2\rfloor$, where $\lfloor x\rfloor$ denotes the largest integer $m\le x$, and set $p = |a|(\mu-\varepsilon+a)^{-1}$. We can then use $\mathrm{Bi}(n,p)$ to approximate $\pi_{\mu-\varepsilon,a}$, and $\mathrm{Bi}(n,p)*\mathrm{Po}(\varepsilon)$ to approximate $\pi_{\mu,a}$, since the convolution $\pi_{\mu-\varepsilon,a}*\mathrm{Po}(\varepsilon)$, for any $0\le\varepsilon\le\mu$, is just $\pi_{\mu,a}$. Our choice of $\varepsilon$ is typically of order $\mu^{-1}|a|$, and thus the convolution $\mathrm{Po}(\varepsilon)*\mathrm{Bi}(n,p)$ is a rather simple distribution with which to approximate $\pi_{\mu,a}$. Binomial approximation is treated later in Theorem 4.1 as a special case of sums of independent indicators.
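The parameter matching just described is mechanical to carry out; the following sketch (the helper name and the numerical values are ours, purely for illustration) computes $n$, $p$ and $\varepsilon$, and checks that $\mathrm{Bi}(n,p)*\mathrm{Po}(\varepsilon)$ has the same mean as $\pi_{\mu,a}$:

```python
import math

def match_binomial(mu, a):
    """For a < 0: n, p, eps with Bi(n,p)*Po(eps) approximating pi_{mu,a},
    via the matching n = |a|^{-1}(mu+a)^2, p = |a|(mu+a)^{-1} and the
    adjustment |a|^{-1}(mu - eps + a)^2 = n = floor(|a|^{-1}(mu+a)^2)."""
    assert a < 0 < mu + a
    n = math.floor((mu + a) ** 2 / abs(a))   # n must be an integer
    eps = (mu + a) - math.sqrt(n * abs(a))   # solves the adjusted equation
    p = abs(a) / (mu - eps + a)
    return n, p, eps

n, p, eps = match_binomial(100.0, -4.7)
print(n, p, eps)                   # eps is small, typically of order |a|/mu
print(n * p + eps, 100.0 - 4.7)    # means of Bi(n,p)*Po(eps) and pi_{mu,a} agree
```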

The binomial and negative binomial approximations are not the only possible choices. Another possibility is to approximate $\pi_{\mu,a}$ by a translate of a compound Poisson probability distribution, $\pi^{(m)}_{\chi,b}$ as defined in (1.2), with $b\ge 0$. To achieve this, we first need to show that translating $\pi_{\mu,a}$ by 1 changes the measure by at most $O(\mu^{-1/2})$ in total variation.

Lemma 3.3. For any $\mu>0$ and $a\in\mathbb R$ such that $\theta = (\mu+a)^{-1}|a| < 1/2$,

$$\bigl\|\pi_{\mu,a} - \pi^{(1)}_{\mu,a}\bigr\| \le k(\mu+a)^{-1/2}, \qquad(3.5)$$

where $k$ may be taken to be $2/(1-2\theta)$ if $a>0$, and $2/(1-2\theta)^2$ otherwise.

Proof. Use Lemma 2.1 and (2.2) to give

$$\pi_{\mu,a}\{f(\cdot+1)\} - \pi_{\mu,a}\{f\} = \sum_{j\ge 0}\bigl(ag_f(j+3) + \mu g_f(j+2) - (j+1)g_f(j+1)\bigr)\pi_{\mu,a}\{j\} = -\pi_{\mu,a}\{g_f(\cdot+1)\},$$

with $\|g_f(\cdot+1)\| \le \frac{2}{1-2\theta}(\mu+a)^{-1/2}\|f\|$. Hence

$$|\pi_{\mu,a}\{f(\cdot+1)\} - \pi_{\mu,a}\{f\}| \le \frac{2}{1-2\theta}(\mu+a)^{-1/2}\|\pi_{\mu,a}\|\,\|f\|.$$

If $a\ge 0$, $\|\pi_{\mu,a}\| = 1$, and if $a<0$ we have $\|\pi_{\mu,a}\| \le 1/(1-2\theta)$ from (2.17), proving the lemma.

We are now in a position to prove our approximation by a translate of a compound Poisson distribution. We use $\lceil x\rceil$ to denote the smallest integer $m\ge x$.


Theorem 3.4. Suppose that $a<0$ and that $\mu\ge 5(|a|+1)$. Define $m = \lceil|a|\rceil$, $b = a+m = \lceil|a|\rceil - |a|$ and $\chi = \mu-2m$. Then

$$\bigl\|\pi_{\mu,a} - \pi^{(m)}_{\chi,b}\bigr\| \le 21(|a|+1)(\mu+a)^{-3/2}.$$

If $a\ge 1$ and $\mu\ge 3a$, take $m = -\lfloor a\rfloor$, $b = a-\lfloor a\rfloor$ and $\chi = \mu-2m$, giving

$$\bigl\|\pi_{\mu,a} - \pi^{(m)}_{\chi,b}\bigr\| \le 16a(\mu+a)^{-3/2}.$$

Remark. If $a\in\mathbb Z$, then $b = 0$ and $\pi^{(m)}_{\chi,b}$ is a shifted Poisson distribution. In all cases, $0\le b<1$, so that $\pi_{\chi,b}$ is close to being Poisson if $\chi$ is large. Corollary 4.8 shows that $\pi_{\chi,b}$ is even closer to the negative binomial distribution with the same mean and variance.

Proof. Let $Z\sim\pi_{\chi,b}$, so that, from (2.2),

$$\mathbb E\{bg(Z+2) + \chi g(Z+1) - Zg(Z)\} = 0 \qquad(3.6)$$

for all bounded $g$. Now take any bounded $f$, and use Lemma 2.1 to show that

$$\mathbb Ef(Z+m) - \pi_{\mu,a}\{f\} = \mathbb E\{ag_f(Z+m+2) + \mu g_f(Z+m+1) - (Z+m)g_f(Z+m)\},$$

where $g_f(\cdot+m)$ is bounded as in (2.4). Applying (3.6), it then follows that

$$\mathbb Ef(Z+m) - \pi_{\mu,a}\{f\} = \mathbb E\{(a-b)g_f(Z+m+2) + (\mu-\chi)g_f(Z+m+1) - mg_f(Z+m)\}$$

$$= \mathbb E\{(a-b+m)g_f(Z+m+2) + (\mu-\chi-2m)g_f(Z+m+1)\} - m\,\mathbb E\Delta^2g_f(Z+m) = -m\,\mathbb E\Delta^2g_f(Z+m),$$

by choice of $\chi$ and $b$. But now

$$|\mathbb E\Delta^2g_f(Z+m)| \le \|\Delta g_f\|\,\bigl\|\pi_{\chi,b} - \pi^{(1)}_{\chi,b}\bigr\| \le 4(\mu+a)^{-1}\|f\|\cdot 4(\chi+b)^{-1/2} \le 16\sqrt{5/3}\,(\mu+a)^{-3/2}\|f\|,$$

from Lemma 2.1 and Lemma 3.3, and because $\chi\ge\mu-2(|a|+1)\ge 3\mu/5$ under the given conditions on $\mu$ and $a$; note also that $\chi+b\ge 4b$ under these conditions, as is needed to apply Lemma 3.3. Since $0\le m<|a|+1$, the first part of the theorem now follows. The proof of the second part is entirely similar, with the rôles of $\pi_{\mu,a}$ and $\pi_{\chi,b}$ interchanged; now $\chi+b\ge\mu+a\ge 4a\ge 4b$, simplifying the final calculation.
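The first bound of Theorem 3.4 can also be checked directly on examples, since both measures are computable from the recursion (2.1). A minimal sketch (ours), with arbitrary values satisfying $\mu\ge 5(|a|+1)$:

```python
import math

def pi_masses(mu, a, N):
    # recursion (2.1), as in the earlier sketches
    p = [0.0] * (N + 1)
    p[0] = math.exp(-mu - a / 2.0)
    p[1] = mu * p[0]
    for r in range(2, N + 1):
        p[r] = (mu * p[r - 1] + a * p[r - 2]) / r
    return p

mu, a = 50.0, -3.2                 # a < 0 and mu >= 5(|a|+1) = 21
m = math.ceil(abs(a))              # m = 4
b, chi = a + m, mu - 2 * m         # 0 <= b < 1, chi = mu - 2m
N = 200

p1 = pi_masses(mu, a, N)
p2 = pi_masses(chi, b, N)          # translate by m when comparing
tv = sum(abs(p1[j] - (p2[j - m] if j >= m else 0.0)) for j in range(N + 1))
print(tv, 21 * (abs(a) + 1) * (mu + a) ** -1.5)   # distance vs. stated bound
```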

4. Applications: Independent summands

Let $W = \sum_{i=1}^n I_i$ be a sum of independent $\mathrm{Be}(p_i)$ random variables. By using a signed compound Poisson measure as approximation and matching the first two moments, Kruopis (1986) showed that an error in total variation of order $\lambda\sigma^{-3}\theta_3$ is obtained, where $\lambda = \sum_{i=1}^n p_i$, $\theta_l = \lambda^{-1}\sum_{i=1}^n p_i^l$ and $\sigma^2 = \sum_{i=1}^n p_i(1-p_i)$. This approximation is extremely accurate. It has an error of at most order $\sigma^{-1}$, as good a rate as for Kolmogorov distance in the usual normal approximation, if the $p_i$ are of order 1, and the error is roughly of order $\lambda^{-1/2}p^2$ if the $p_i$ are small, combining the $\lambda^{-1/2}$ factor with the $O(p^2)$ error of the one term Poisson–Charlier expansion.

Here, we use Stein's method, as developed in Section 2, to sharpen his constants.

Theorem 4.1. With notation as above, take $\mu = \lambda(1+\theta_2)$ and $a = -\lambda\theta_2$, and suppose that $\theta_2 < 1/2$. Then

$$\|\mathcal L(W) - \pi_{\mu,a}\| \le \frac{2\theta_3}{(1-2\theta_2)\sqrt\tau}, \qquad(4.1)$$

where

$$\tau = \sigma^2 - \max_{1\le i\le n}p_i(1-p_i).$$


In particular, if $\mu>0$ and $a<0$ are such that $n := (\mu+a)^2/|a| \in\mathbb Z_+$ and $p := (\mu+a)^{-1}|a| < 1/2$, then

$$\|\mathrm{Bi}(n,p) - \pi_{\mu,a}\| \le \frac{2p^2}{(1-2p)\sqrt{(n-1)p(1-p)}}. \qquad(4.2)$$

Proof. For $W$ as defined above, we have

$$\mathbb EWg(W) = \sum_{i=1}^n p_i\,\mathbb Eg(W_i+1),$$

where $W_i = W - I_i$ is independent of $I_i$. Choosing $\mu$ and $a$ as specified, and for any choices of $\mu_i$ and $a_i$, $1\le i\le n$, such that $\sum_{i=1}^n\mu_i = \mu$ and $\sum_{i=1}^n a_i = a$, we have

$$\mathbb E\{Wg(W) - \mu g(W+1) - ag(W+2)\} = \mathbb E\sum_{i=1}^n\Bigl[p_ig(W_i+1) - \mu_i\{p_ig(W_i+2) + (1-p_i)g(W_i+1)\} - a_i\{p_ig(W_i+3) + (1-p_i)g(W_i+2)\}\Bigr]$$

$$= \mathbb E\Bigl\{\sum_{i=1}^n -a_ip_i\Delta^2g(W_i+1)\Bigr\} + \sum_{i=1}^n\mathbb E\bigl\{g(W_i+1)(p_i - \mu_i(1-p_i) + a_ip_i) + g(W_i+2)(-\mu_ip_i - a_i(1-p_i) - 2a_ip_i)\bigr\}.$$

Taking $\mu_i = p_i + p_i^2$ and $a_i = -p_i^2$, the second expression vanishes, giving

$$|\mathbb E\{ag(W+2) + \mu g(W+1) - Wg(W)\}| = \Bigl|\sum_{i=1}^n p_i^3\,\mathbb E\Delta^2g(W_i+1)\Bigr|. \qquad(4.3)$$

This gives

$$|\mathbb E\{ag(W+2) + \mu g(W+1) - Wg(W)\}| \le \sum_{i=1}^n p_i^3\,\|\Delta g\|\,2\,\mathrm d_{TV}(\mathcal L(W_i),\mathcal L(W_i+1)),$$

and, since the $W_i$ have unimodal distributions, it follows that

$$\mathrm d_{TV}(\mathcal L(W_i),\mathcal L(W_i+1)) \le \max_{j\ge 0}P[W_i=j] \le e^{-\tau_i}I_0(\tau_i) \le \frac{1}{2\sqrt{\tau_i}},$$

from Barbour and Jensen (1989), where $\tau_i = \sum_{j\ne i}p_j(1-p_j)$. The theorem now follows from Corollary 2.2.
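For moderate $n$, the law of $W$ can be found by exact convolution, so the quality of (4.1) can be inspected directly. A small Python sketch (the vector of $p_i$ is purely illustrative):

```python
import math

def pi_masses(mu, a, N):
    # recursion (2.1), as in the earlier sketches
    p = [0.0] * (N + 1)
    p[0] = math.exp(-mu - a / 2.0)
    p[1] = mu * p[0]
    for r in range(2, N + 1):
        p[r] = (mu * p[r - 1] + a * p[r - 2]) / r
    return p

ps = [0.1, 0.2, 0.15, 0.3, 0.25] * 8          # 40 independent Be(p_i) indicators
lam = sum(ps)
th2 = sum(q ** 2 for q in ps) / lam
th3 = sum(q ** 3 for q in ps) / lam
mu, a = lam * (1 + th2), -lam * th2           # matching as in Theorem 4.1

w = [1.0]                                     # exact law of W by convolution
for q in ps:
    nxt = [0.0] * (len(w) + 1)
    for j, x in enumerate(w):
        nxt[j] += (1 - q) * x
        nxt[j + 1] += q * x
    w = nxt

N = 120
pm = pi_masses(mu, a, N)
tv = sum(abs((w[j] if j < len(w) else 0.0) - pm[j]) for j in range(N + 1))

sigma2 = sum(q * (1 - q) for q in ps)
tau = sigma2 - max(q * (1 - q) for q in ps)
print(tv, 2 * th3 / ((1 - 2 * th2) * math.sqrt(tau)))   # observed vs. bound (4.1)
```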

The bound (4.1) is rather neater than that of Kruopis (1986), and for small $\theta_2$ it improves on Kruopis's constant by a factor of about 10. If $\theta_2$ is larger, some of this advantage is lost. However, in such cases, approximation by a translated measure can give some improvement. Here, we approximate using the probability distribution nearest to a Poisson that can be obtained by translation from the family $\pi_{\mu,a}$.

Theorem 4.2. Let $m = \lceil\lambda\theta_2\rceil$, $\mu = \lambda(1+\theta_2) - 2m$ and $0\le a = m - \lambda\theta_2 < 1$. Then

$$\|\mathcal L(W) - \pi^{(m)}_{\mu,a}\| \le \frac{2}{(1 - 2a(\lambda-m)^{-1})(\lambda-m)}\Bigl\{\Bigl[\frac{a\theta_3}{\theta_2} + \Bigl(\lambda + \frac{a}{\theta_2}\Bigr)(\theta_2-\theta_3)\Bigr]\tau^{-1/2} + 1\Bigr\}. \qquad(4.4)$$

(10)

Proof. We compare the distribution of $W-m$ on $\mathbb Z$ with $\pi_{\mu,a}$, using Corollary 2.3. Arguing as for Theorem 4.1, and writing $g_m(w)$ for $g(w-m)$, we obtain

$$\mathbb E\{(W-m)g(W-m) - \mu g(W-m+1) - ag(W-m+2)\}$$

$$= \mathbb E\sum_{i=1}^n\Bigl[\{-a_ip_i\Delta^2g_m(W_i+1) - m_i(1-p_i)\Delta^2g_m(W_i)\} + g_m(W_i+1)\{p_i - 2m_i(1-p_i) - m_ip_i + a_ip_i - \mu_i(1-p_i)\} + g_m(W_i+2)\{-a_i(1-p_i) - 2a_ip_i - \mu_ip_i + m_i(1-p_i)\}\Bigr],$$

where $\sum_{i=1}^n m_i = m$, $\sum_{i=1}^n\mu_i = \mu$ and $\sum_{i=1}^n a_i = a$. Taking $m_i = \beta p_i^2$, $a_i = -(1-\beta)p_i^2$ and $\mu_i = p_i + (1-2\beta)p_i^2$ for any $\beta$ makes the last two terms zero, giving

$$|\mathbb E\{(W-m)g(W-m) - \mu g(W-m+1) - ag(W-m+2)\}|$$

$$\le \Bigl(|1-\beta|\sum_{i=1}^n p_i^3 + |\beta|\sum_{i=1}^n p_i^2(1-p_i)\Bigr)\,2\|\Delta g\|\max_{1\le i\le n}\mathrm d_{TV}(\mathcal L(W_i),\mathcal L(W_i+1)).$$

Now take $\beta = (\lambda\theta_2)^{-1}\lceil\lambda\theta_2\rceil$, giving $m$, $\mu$ and $a$ as in the statement of the theorem, and note that $\theta_3\ge\theta_2^2$; this gives

$$|\mathbb E\{(W-m)g(W-m) - \mu g(W-m+1) - ag(W-m+2)\}| \le \Bigl\{\frac{a\theta_3}{\theta_2} + \Bigl(\lambda + \frac{a}{\theta_2}\Bigr)(\theta_2-\theta_3)\Bigr\}\tau^{-1/2}\|\Delta g\|,$$

and the theorem follows from Corollary 2.3.

For large $\lambda$, this bound is preferable to that of Theorem 4.1 when $\theta_2 > 1/3$. Taking the better of these two bounds always results in an improvement by a factor of at least 3, compared to the bound obtained by Kruopis (1986).

There are analogous results for sums of integer valued random variables which are not restricted to take the values 0 and 1. Suppose that $W = \sum_{i=1}^n Z_i$, where the $Z_i$ are independent and integer valued, and satisfy $\mathbb E|Z_i^3| < \infty$. Define

$$\psi_i := \mathbb E|Z_i(Z_i-1)(Z_i-2)| + |\mathbb EZ_i|\,\mathbb E|Z_i(Z_i-1)| + 2\,\mathbb E|Z_i|\,|\mathrm{Var}\,Z_i - \mathbb EZ_i|; \qquad(4.5)$$

$$d^{(i)}_+ := \mathrm d_{TV}(\mathcal L(W_i),\mathcal L(W_i+1));\qquad d_+ := \mathrm d_{TV}(\mathcal L(W),\mathcal L(W+1)), \qquad(4.6)$$

where $W_i := W - Z_i$.

Theorem 4.3. For any $m\in\mathbb Z$ such that

$$(2/3)\mathrm{Var}\,W < \mathbb EW + m < 2\,\mathrm{Var}\,W, \qquad(4.7)$$

we have

$$\|\mathcal L(W+m) - \pi_{\mu+2m,a-m}\| \le \frac{2}{(1-2\theta_m)(\mathbb EW+m)}\Bigl\{\Bigl(2|m|d_+ + \sum_{i=1}^n\psi_id^{(i)}_+\Bigr) + e_m(W)\Bigr\}, \qquad(4.8)$$

where

$$\mu = 2\mathbb EW - \mathrm{Var}\,W;\qquad a = \mathrm{Var}\,W - \mathbb EW;\qquad \theta_m = |\mathrm{Var}\,W - (\mathbb EW+m)|/(\mathbb EW+m), \qquad(4.9)$$

and where $e_m(W) = 0$ if $W+m\ge 0$ a.s., and $e_m(W) = 1$ otherwise.

Proof. Newton's expansion with remainder gives

$$g(z+l) = g(z+1) + (l-1)\Delta g(z+1) + \eta(g,z,l), \qquad(4.10)$$

where

$$\eta(g,z,l) = \begin{cases}\sum_{s=1}^{l-2}(l-1-s)\Delta^2g(z+s), & l\ge 3;\\ 0, & l = 2, 1;\\ \sum_{s=0}^{-l}(-l-s+1)\Delta^2g(z-s), & l\le 0.\end{cases} \qquad(4.11)$$

Hence, for any bounded $g$, $l\in\mathbb Z$ and $1\le i\le n$, it follows that

$$|\mathbb Eg(W_i+l) - (l-1)\mathbb Eg(W_i+2) - (2-l)\mathbb Eg(W_i+1)| \le \|\Delta g\|(l-1)(l-2)d^{(i)}_+, \qquad(4.12)$$

where we have used the general inequality

$$|\mathbb E\Delta^2g(U+j)| \le 2\|\Delta g\|\,\mathrm d_{TV}(\mathcal L(U),\mathcal L(U+1)). \qquad(4.13)$$

Fix any $1\le i\le n$. Then applying (4.12) with $l=j$ gives

$$\mathbb E\{Z_ig(W)\} = \sum_j jq_{ij}\,\mathbb Eg(W_i+j) = \sum_j j(j-1)q_{ij}\,\mathbb Eg(W_i+2) + \sum_j j(2-j)q_{ij}\,\mathbb Eg(W_i+1) + \phi_{1i}, \qquad(4.14)$$

where $q_{ij} = P[Z_i=j]$ and

$$|\phi_{1i}| \le \|\Delta g\|\,d^{(i)}_+\,\mathbb E|Z_i(Z_i-1)(Z_i-2)|. \qquad(4.15)$$

Then, taking $l=j+1$ and $W_i$ for $z$ in (4.10), we have

$$\mathbb Eg(W+1) = \sum_j q_{ij}\,\mathbb Eg(W_i+j+1) = \sum_j q_{ij}\{j\,\mathbb Eg(W_i+2) + (1-j)\,\mathbb Eg(W_i+1)\} + \phi_{2i}, \qquad(4.16)$$

where

$$\phi_{2i} = \sum_j q_{ij}\,\mathbb E\eta(g,W_i,j+1); \qquad(4.17)$$

whereas, with $l=j+2$, we obtain

$$\mathbb Eg(W+2) = \sum_j q_{ij}\,\mathbb Eg(W_i+j+2) = \sum_j q_{ij}\{(j+1)\,\mathbb Eg(W_i+2) - j\,\mathbb Eg(W_i+1)\} + \phi_{3i}, \qquad(4.18)$$

where

$$\phi_{3i} = \sum_j q_{ij}\,\mathbb E\eta(g,W_i,j+2). \qquad(4.19)$$

Careful calculation now shows that, for any choices of $\mu_i$ and $a_i$,

$$|\mu_i\phi_{2i} + a_i\phi_{3i}| \le \|\Delta g\|\,d^{(i)}_+\{|\mu_i+a_i|\,\mathbb E|Z_i(Z_i-1)| + 2|a_i|\,\mathbb E|Z_i|\}. \qquad(4.20)$$

Thus, from (4.14, 4.16) and (4.18), it follows that

$$a_i\,\mathbb Eg(W+2) + \mu_i\,\mathbb Eg(W+1) - \mathbb E\{Z_ig(W)\} = \mathbb Eg(W_i+2)\bigl\{\mu_i\mathbb EZ_i + a_i(1+\mathbb EZ_i) - \mathbb E\{Z_i(Z_i-1)\}\bigr\} + \mathbb Eg(W_i+1)\bigl\{\mu_i(1-\mathbb EZ_i) - a_i\mathbb EZ_i - \mathbb E\{Z_i(2-Z_i)\}\bigr\} + a_i\phi_{3i} + \mu_i\phi_{2i} - \phi_{1i},$$

and taking

$$a_i = \mathrm{Var}\,Z_i - \mathbb EZ_i;\qquad \mu_i = 2\mathbb EZ_i - \mathrm{Var}\,Z_i,$$

the coefficients of $\mathbb Eg(W_i+2)$ and $\mathbb Eg(W_i+1)$ vanish. This then implies, using (4.15) and (4.20), that

$$|a_i\,\mathbb Eg(W+2) + \mu_i\,\mathbb Eg(W+1) - \mathbb E\{Z_ig(W)\}| \le \|\Delta g\|\,d^{(i)}_+\psi_i. \qquad(4.21)$$

Finally, again from (4.13), we have

$$|m\,\mathbb E\Delta^2g(W)| \le 2\|\Delta g\|\,|m|\,d_+; \qquad(4.22)$$

hence, adding (4.21) over $1\le i\le n$ and then subtracting (4.22), we find that

$$|(a-m)\,\mathbb Eg(W+2) + (\mu+2m)\,\mathbb Eg(W+1) - \mathbb E(W+m)g(W)| \le \|\Delta g\|\Bigl\{2|m|d_+ + \sum_{i=1}^n\psi_id^{(i)}_+\Bigr\}, \qquad(4.23)$$

with $a = \sum_{i=1}^n a_i = \mathrm{Var}\,W - \mathbb EW$ and $\mu = \sum_{i=1}^n\mu_i = 2\mathbb EW - \mathrm{Var}\,W$ as in (4.9). For any $m$ such that (4.7) is satisfied, the quantity $\theta_m$ defined in (4.9) is less than 1/2, and we can apply Corollary 2.3 with $a-m$ for $a$ and $\mu+2m$ for $\mu$, proving the theorem. If $W+m\ge 0$ a.s., then Corollary 2.2 can be applied instead.

There is considerable flexibility inherent in Theorem 4.3, both in the choice of $m$ and in the fact that the $Z_i$ need not be centred. The two corollaries that follow are chosen to illustrate standard situations.

Corollary 4.4. If $(2/3)\mathrm{Var}\,W < \mathbb EW < 2\,\mathrm{Var}\,W$, then

$$\|\mathcal L(W) - \pi_{\mu,a}\| \le \frac{2}{(1-2\theta)\,\mathbb EW}\Bigl\{\Bigl(\sum_{i=1}^n\psi_id^{(i)}_+\Bigr) + e_0(W)\Bigr\}, \qquad(4.24)$$

where $\theta = (\mathbb EW)^{-1}|\mathrm{Var}\,W - \mathbb EW|$ and $\mu$ and $a$ are as defined in (4.9).

Corollary 4.5. If $\mathrm{Var}\,W > 3$, then

$$\|\mathcal L(W+m) - \pi_{\mu+2m,a-m}\| \le \frac{2}{\mathrm{Var}\,W - 3}\Bigl\{\Bigl(2|m|d_+ + \sum_{i=1}^n\psi_id^{(i)}_+\Bigr) + e_m(W)\Bigr\}, \qquad(4.25)$$

where $m = \lfloor\mathrm{Var}\,W - \mathbb EW\rfloor$ and $\mu$ and $a$ are as defined in (4.9). Here, we always have $0\le a-m<1$ and $\mathrm{Var}\,W - 2 \le \mu+2m \le \mathrm{Var}\,W$.

For comparison with the usual central limit theorem, Corollary 4.5 is appropriate. Setting $s_n^2 = \sum_{i=1}^n\mathrm{Var}\,Z_i$ and $\Gamma_n = \sum_{i=1}^n\mathbb E|Z_i - \mathbb EZ_i|^3$ as usual, the numerator in Corollary 4.5 is bounded by $C\Gamma_n + 1$ for some universal constant $C$, so that, when $s_n^2\to\infty$, the error bound is of order $\Gamma_ns_n^{-2}\max_{1\le i\le n}d^{(i)}_+$. This differs from the usual Lyapounov ratio $\Gamma_ns_n^{-3}$ in that $\max_{1\le i\le n}d^{(i)}_+$ replaces a factor of $s_n^{-1}$. It is easy to see that a change is required here, when proving approximation bounds with respect to total variation distance, because, if $Z_i\sim 2\mathrm{Be}(p)$ for all $i$, then $\Gamma_ns_n^{-3} = O(n^{-1/2})$, whereas $W$ is concentrated on the even integers, and is therefore far from any of the measures $\pi_{\mu,a}$. Nonetheless, Proposition 4.6 below shows that $\max_{1\le i\le n}d^{(i)}_+$ is frequently of order $O(n^{-1/2})$, so that the classical rate of approximation is recovered; in particular, this is so if $Z_i\sim F$ for all $i$, for any strongly aperiodic distribution $F$ on the integers.

Corollary 4.4 exploits the fact that the $\psi_i$ are small if the $Z_i$ mostly take the values 0 and 1. For example, suppose that

$$P[Z_i=1] = p_i;\qquad P[Z_i=2] = q_i \le cp_i^2 \text{ for some }c>0;\qquad P[Z_i=0] = 1 - (p_i+q_i),$$

where the $p_i$ and $q_i$ are such that $\sum_{i=1}^n(p_i+q_i)\ge 2$ and that

$$\theta = (\mathbb EW)^{-1}|\mathrm{Var}\,W - \mathbb EW| \le \frac{\sum_{i=1}^n|2q_i - (p_i+2q_i)^2|}{\sum_{i=1}^n(p_i+2q_i)} \le \frac14\,\cdot$$
