• Aucun résultat trouvé

Appendix A: Proof of Theorem 2

N/A
N/A
Protected

Academic year: 2022

Partager "Appendix A: Proof of Theorem 2"

Copied!
13
0
0

Texte intégral

(1)

arXiv:1406.4406v1

Supplementary material for Posterior concentration rates for empirical Bayes procedures with applications to Dirichlet process mixtures

SOPHIE DONNET1,* VINCENT RIVOIRARD1,** JUDITH ROUSSEAU2 and CATIA SCRICCIOLO3

1CEREMADE, Universit´e Paris-Dauphine, Place du Mar´echal de Lattre de Tassigny, 75775 Paris Cedex 16, France.

E-mail:*donnet@ceremade.dauphine;**rivoirard@ceremade.dauphine.fr

2CEREMADE, Universit´e Paris-Dauphine, Place du Mar´echal de Lattre de Tassigny, 75775 Paris Cedex 16, France and CREST-ENSAE, 3 Avenue Pierre Larousse, 92240 Malakoff, France. E-mail:rousseau@ceremade.dauphine.fr

3Department of Decision Sciences, Bocconi University, Via R¨ontgen 1, 20136 Milano, Italy.

E-mail:catia.scricciolo@unibocconi.it

This supplementary material provides the proof of Theorem2, Proposition1and Theorem 3.

Equation numbers refer to the main paper. We denote byCa positive constant that may change from line to line.

Appendix A: Proof of Theorem 2

It is enough to check that assumptions [A1] and [A2] of Theorem1are satisfied. We begin by defining the parameter transformation. Under a DPM prior law with base measure proportional to a Gaussian distribution with parameterγ = (m, s2), we havepF,σ(·) = P

j≥1pjφσ(· −θj) almost surely, with independent sequences (θj)j≥1 and (pj)j≥1, the random variables (θj)j≥1 being independent and identically distributed according to N(m, s2). Hereafter, we use the notation Nγ as shorthand for N(m, s2). We consider a set Kn = [m1, m2]×[s21, s2n], with constants −∞ < m1 ≤ m2 < ∞, s21 > 0 and a positive sequence s2n → ∞ as a power of logn, such that P(n)p0 (ˆγn ∈ Kcn) = o(1). For a positive sequence un → 0 to be suitably chosen, consider a un-covering of [m1, m2] with intervals Ik = [mk, mk+un), for mk =m1+ (k−1)un, k = 1, . . . , Lmn, where Lmn =d1 + (m2−m1)/une, and a covering of [s21, s2n] with intervals Jl = [s2l, s2l+1) = [s21(1 +un)l−1, s21(1 +un)l), forl= 1, . . . , Lsn, whereLsn=d2u−1n log(sn/s1)e.

Fors2∈Jl, l= 1, . . . , Lsn, letρl= (s2/s2l)1/2. Letm∈Ik, k= 1, . . . , Lmn. For any γ0 = (mk, s2l) and γ = (m, s2), ifθj0 ∼Nγ0, then θj = [ρlj0 −mk) +m]∼Nγ,j ∈N.

1

(2)

Therefore, conditionally onσ, forF ∼DP(αRNγ0), ψγ0(pF,σ)(·) =X

j≥1

pjφσ(· −θ0j−[(ρl−1)θj0 −ρlmk+m])

is distributed according to a DPM of Gaussian densities with base measureαRNγ. With abuse of notation, we shall also writeψγ00j) to intend the parameter transformation θ0j7→ρlj0 −mk) +m,

ψγ0j0) =ρlj0 −mk) +m. (A.1) In the sequel, we shall repeatedly use the following inequalities:

1≤ρl<(1 +un)1/2 and −mkun< m−ρlmk≤un. (A.2) We first deal with the ordinary smooth case. To check that condition [A1] is satisfied, letσ∈(σn/2,2σn), with σn =1/βn , and let F =PNσ

j=1pjδθ

j be a mixing distribution such that the Gaussian mixturepFsatisfies both requirements in (3.4) and the minimal distance between any pair of contiguous location pointsθj’s is bounded below byδ=σ2bn, for someb >max{1,(2β)−1}. A partition (Uj)Mj=1ofRcan be constructed following the proof of Theorem 4 in Shen et al. [6] so that (Uj)Kj=1is a partition of [−aσ, aσ], withaσ= a0|logσ|1/τ, composed of intervals [θj−δ/2, θj+δ/2], forj= 1, . . . , Nσ, and of intervals with diameter smaller than or equal to σ to complete [−aσ, aσ]. Then, a partition of (−∞,−aσ)∪(aσ,∞) can be constructed with intervalsUj, forj=K+ 1, . . . , M, such thata1σ2bn ≤αRNγ(Uj)≤1 for some constant a1 >0. Note that, as in Shen et al. [6], M .σ−1(logn)1+1/τ and, for every 1 ≤ j ≤ K, we have Nγ(Uj) &(δ/s)e−2(aσ/s)2 &

σn2bn uniformly inγ∈ Kn. As in Shen et al. [6], defineBn as the set of all (F, σ) such thatσ∈(σn/2,2σn) and

M

X

j=1

|F(Uj)−pj| ≤22bn, min

1≤j≤MF(Uj)≥4bn/2.

Following Lemma 10 of Ghosal and van der Vaart [2], for some constantc >0,

γ∈Kinfn

π(Bn|γ)&exp (−cσ−1n (logn)2+1/τ). (A.3) For every (F, σ)∈Bn, for γ0 = (mk, s2l) and anyγ ∈Ik×Jl, by the parameter trans- formation in (A.1) and the inequalities in (A.2),

ψγ0(pF,σ)(x) =X

j≥1

pjφσ(x−ψγ0j0))

≥ X

j:j0|≤aσ

pjφσ(x−ψγ0j0))

> X

j:j0|≤aσ

pjφσ(x−θj0)

×exp (−4[|x−θ0j|(aσ+ 1)un+ (a2σ+ 1)u2n]/σn2), x∈R.

(3)

Note that (nσn)−1 = 2n. Choose un . n−1σn(logn)−2/τ = 2nσ2n(logn)−2/τ. On the event An ={Pn

i=1|Xi−m0| ≤ τ02nkn}, with kn =O((logn)1/τ), using the inequality logx≥(x−1)/xvalid for everyx >0, we have

`nγ0(pF,σ))−`n(p0)> `n(pFn)−`n(p0) +nlogcσ

−4nσn−2[(aσ+ 1)un(2aσ02kn) + (a2σ+ 1)u2n]

≥`n(pFn)−`n(p0) +nlogcσ−C0n2n

> `n(pFn)−`n(p0)−n2n−C0n2n, whereC0>0 is a large enough constant andpFn(·) :=c−1σ P

j:j0|≤aσpjφσ(· −θ0j), with normalizing constant

cσ:= X

j:j0|≤aσ

pj >1−22bn >1−2n

becauseb >1. The proof of Theorem 4 of Shen et al. [6], together with condition (3.4), implies that condition [A1] is satisfied forkas in part (i) of the statement of Theorem2.

We now check that condition [A2] is satisfied. LetF denote the set of all distribution functions onRand

Fn:=

(

F ∈ F: F =X

j≥1

pjδθj, |θj| ≤√

n ∀1≤j≤Hn, X

j>Hn

pjn

) . We consider the sieve set

Sn:={(F, σ) : (F, σ)∈ Fn×[σn,σ¯n]}, (A.4) withσnn =1/βn , ¯σn= exp(tn2n) for some constantt >0 depending on the param- etersν1, ν2>0 of the inverse-gamma prior distribution onσ, andHn =bn2n/(logn)c.

For some constantx0>0, letan:= 2x0(logn)1/τ. Forγ0= (mk, s2l) and anyγ∈Ik×Jl, if|θ| ≥an and |x| ≤an/2, then |x−θ| ≥ |θ|/2 and we can bound aboveψγ0(pF,σ) as

(4)

follows:

ψγ0(pF,σ)(x) = Z

−∞

φσ(x−ψγ0(θ))dF(θ)

≤ Z

−∞

φσ(x−θ) exp (un|x−θ|(|θ|+ 1)/σ2)dF(θ)

<exp (3a2nun2) Z

|θ|<an

φσ(x−θ)dF(θ) +

Z

|θ|≥an

φσ(x−θ) exp (4un(x−θ)22)dF(θ)

<exp (3a2nun2) Z

|θ|<an

φσ(x−θ)dF(θ) +

Z

|θ|≥an

φσ((x−θ)(1−8un)1/2)dF(θ)

≤max

exp (3a2nun2),(1−8un)−1/2

× Z

|θ|<an

φσ(x−θ) dF(θ) + Z

|θ|≥an

φσ˜n(x−θ)dF(θ)

!

, x∈R, (A.5)

whereF ∼DP(αRNγ0) and ˜σn:=σ(1−8un)−1/2. Now, define the event Ωn:=

−an/2≤ min

1≤i≤nXi≤ max

1≤i≤nXi≤an/2

.

Since by Condition (3.3),P(n)p0 (Ωcn).e−cnaτn, we can replace the supportRof the den- sity ψγ0(pF,σ) with Ωn and, with abuse of the notation introduced in (2.4), define, for all (F, σ)∈ Sn, the densityq(F,σ),γ0 supported on [−an/2, an/2] obtained from the re-normalized restriction to [−an/2, an/2] of the function in the last line of (A.5). Re- placingpF,σ withpF,σ1[−an/2, an/2] we then have thatkq(F,σ),γ0−pF,σk1=o(n). We can therefore consider the same tests as in Corollary 1 of Ghosal and van der Vaart [2] and condition (2.6) is verified, together with (2.7), using Proposition 2 of Shen et al. [6]. This implies that also condition (2.8) is satisfied. Since (A.3) implies condition (2.5), there only remains to verify assumption (2.4). The difficulty is to control q(F,σ),γ0 as σ →0.

For every (F, σ) withσ > σn, the previously defined densityq(F,σ),γ0 can be used as an upper bound on

sup

γ:kγ−γ0k≤un

ψγ0(pF,σ)(·)1[−an/2, an/2](·),

withkγ−γ0k:= (|m−mk|2+|s−sl|2)1/2. Then, for some finite constantC2>0, both integrals

Z

F

Z

σ>¯σn

Q(n)(F,σ),γ0([−an/2, an/2]n)π(dF|γ0)dπ(σ)

(5)

and

Z

Fnc

Z σ¯n

σn

Q(n)(F,σ),γ0([−an/2, an/2]n)π(dF |γ0)dπ(σ) areo(e−C2n2nπ(Bn0)) uniformly in γ0 ∈ Kn. We now study

sup

γ0∈Kn

Z

F

Z σn 0

Q(n)(F,σ),γ0([−an/2, an/2]n)π(dF |γ0)dπ(σ).

Consider the partition F

j=0n2−(j+1), σn2−j) of (0, σn). For everyj ≥0, let un,j :=

n−1enn2−j)2, with en = o(1). For every γ0 ∈ Kn, we consider a un,j-covering of {γ : kγ−γ0k ≤un} with centering points γi, i = 1, . . . , Nj, where Nj .(un/un,j)2. Whenσ∈[σn2−(j+1), σn2−j), for|x| ≤an/2,

sup

γ:kγ−γ0k≤un

ψγ0(pF,σ)(x)≤ max

1≤i≤Nj

sup

γ:kγ−γik≤un,j

Z

−∞

φσ(x−ψγi(θ))dF(θ)

≤ max

1≤i≤Nj

cn,igσ,i(x),

wheregσ,i is the probability density on [−an/2, an/2] proportional to Z

|θ|<an

φσ(x−θ)dF(θ) + Z

|θ|≥an

φσ

x−θ (1 +o(1/n))−1/2

dF(θ), withF ∼DP(αRNγi), and the normalizing constant

cn,i ≤F((−an, an)) exp (12x20n−1en(logn)2/τ) + [1−F((−an, an))]O(1) = 1 +O(1).

This implies that, foren=O((logn)−2/τ), Z

F

Z σn2−j σn2−(j+1)

Q(n)(F,σ),γ0([−an/2, an/2]n)π(dF |γ0)dπ(σ).Njπ([σn2−(j+1), σn2−j)) . u2n

u2n,jπ([σn2−(j+1), σn2−j)) .u2nn2e−2nn2−j)−4e−2j−1n, whence, for a suitable constantC >0,

sup

γ0∈Kn

Z

F

Z σn 0

Q(n)(F,σ),γ0([−an/2, an/2]n)π(dF |γ0)dπ(σ).exp (−Cσ−1n ).exp (−Cn2n) and (2.4) is verified, which completes the proof for the ordinary smooth case.

We now consider the super-smooth case. The main difference with the ordinary smooth case lies in the fact that, since the ratenis nearly parametric, that is,n2n=O((logn)κ)

(6)

for someκ >0, in order for condition (2.3) to be satisfied, in view of the constraint in (2.2), it suffices that, over some setBn, for suitable constantsC, C1>0,

sup

γ0∈Kn

sup

(F,σ)∈Bn

P(n)p0

inf

γ:kγ−γ0k≤un

`nγ0(pF,σ))−`n(p0)<−C1n2n

.e−Cn2n. (A.6) It is known from Lemma 2 of Shen and Wasserman [7] that if, for any fixedα∈(0,1], the pair (F, σ)∈Sn:={(F, σ) : ρα(p0;pF,σ)≤2n}, then, for every constantD >0,

P(n)p0

`n(pF,σ)−`n(p0)<−(1 +D)n2n

.e−αDn2n.

Let σn = O((logn)−1/r). It is known from Lemma 8 of Scricciolo [5] that, for σ ∈ (σn, σn +e−d1(1/σn)r) with d1 a positive constant, there exists a distribution F = PNσ

j=1pjδθ

j, withNσ =O((aσ/σ)2) support points in [−aσ, aσ], whereaσ =O(σ−r/(τ∧2)), such that, for some constantc >0,

max{Pp0log(p0/pF),Pp0log2(p0/pF)}.e−c(1/σ)r.

Inspection of the proof of the above mentioned Lemma 8 reveals that all arguments remain valid to bound above anyρα(p0;pF) divergence for α∈(0,1]. In fact, using the inequality |aα−bα| ≤ |aβ−bβ|α/β valid for all a, b >0 and 0≤α≤ β, if we set β= 1 in our case, then

ρα(p0;pF)≤α−1Pp0|(p0/pF)−1|α.

All bounds used in the proof of Lemma 8 for the various pieces in which the Kullback- Leibler divergencePp0log(p0/pF) is split can be used here to bound aboveρα(p0;pF).

Thus,

ρα(p0;pF).e−c(1/σ)r

for some constantc > 0 not depending onα. Construct a partition (Uj)Nj=0σ of R, with U0 := (SNσ

j=1Uj)c, Uj 3 θj and λ(Uj) = O(e−c1(1/σ)r), j = 1, . . . , Nσ, where here λ denotes Lebesgue measure. Then, infγ∈Knmin1≤j≤NσNγ(Uj)& e−c1(1/σ)r. Defined the set

Bn:=

(F, σ) : σ∈(σn, σn+e−d1(1/σn)r),

Nσ

X

j=1

|F(Uj)−pj| ≤e−c1(1/σ)r

 , for some constantsc2, c3>0 we have

γ∈Kinfn

π(Bn|γ)&exp (−c2Nσn(1/σn)r) = exp (−c3n2n) and, for every (F, σ)∈Bn,

ρα(p0;pF,σ).exp (−(1/σn)r).2n.

(7)

Reasoning as in the ordinary smooth case, for un =O(k−1n σ2n2n(logn)−2/(τ∧2)), on the eventAn :={Pn

i=1|Xi−m0| ≤τ02nkn}, with kn .(logn)1/τ,

`nγ0(pF,σ))−`n(p0)≥`n(pFn)−`n(p0) +n(cσ−1)

−4nσn−2[(a2σ+ 1)u2n+ (aσ+ 1)un(2aσ02kn)]

≥`n(pFn)−`n(p0)−n2n−C0n2n for some constant C0 > 0, withpFn(·) := c−1σ P

j:0j|≤aσpjφσ(· −θj0), where the nor- malizing constant

cσ:= X

j:0j|≤aσ

pj ≥1−e−c1(1/σn)r ≥1−2n.

Lemma 2 of Shen and Wasserman [7] then implies that (A.6) is satisfied. The other parts of the proof of Theorem 2 for the ordinary smooth case go through to this case with modifications. We need to check that the probabilitiesP(n)p0 (Acn) andP(n)p0 (Ωcn) converge to zero at appropriate rates. By a standard concentration inequality for sums of independent random variables, we have that, for a suitable constantc2>0,P(n)p0 (Acn).e−c2nkn2. Also, by assumption (3.3) that p0 has exponentially small tails,P(n)p0 (Ωcn).e−c3naτn.

Appendix B: Proof of Proposition 1

The result relies on the following inversion inequalities that relate theL2-distance be- tween the true mixing density and the (random) approximating mixing density in the sieve setSn, as defined in (A.4) in the proof of Theorem2, to theL2- or theL1-distance between the corresponding mixed densities:

kpY −p0Yk2.

( kK∗pY −K∗p0Ykβ21/(β1+η), ordinary smooth case, (−logkK∗pY −K∗p0Yk1)−β1/r1, super-smooth case.

To our knowledge, the first inequality, which concerns the ordinary smooth case, is new and of potential independent interest; the second one, concerning the super-smooth case, is similar to the one in Theorem 2 of Nguyen [3], which relates the Wasserstein distance between the mixing distributions to the L1-distance between the mixed densities. In what follows, we use “os” and “ss” as short-hands for “ordinary smooth” and “super- smooth”, respectively. To prove these inequalities, we instrumentally use thesinc kernel to characterize regular densities in terms of their approximation properties. We recall that thesinc kernel

sinc(x) =

(sinx)/(πx), if x6= 0, 1/π, if x= 0,

has Fourier transform sinc identically equal to 1 on [−1,d 1] and vanishing outside it.

Forδ >0, let sincδ(·) =δ−1sinc(·/δ) and define gδ as the inverse Fourier transform of

(8)

sincdδ/K, that isˆ

gδ(x) = 1 2π

Z

−∞

e−itxsincdδ(t)

K(t)ˆ dt, x∈R.

Let ˆgδ =sincdδ/Kˆ be the Fourier transform of gδ. So, sincδ = K∗gδ and pY ∗sincδ = (pY ∗K)∗gδ = (K∗pY)∗gδ. Then,

kpY −p0Yk22≤ kpY ∗sincδ−p0Y ∗sincδk22+kpY −pY ∗sincδk22+kp0Y −p0Y ∗sincδk22 .kpY ∗sincδ−p0Y ∗sincδk22+kpY −pY ∗sincδk221

because, by assumption (3.9), kp0Y −p0Y ∗sincδk22=

Z

−∞

|pˆ0Y(t)|2|1−sincdδ(t)|2dt

< δ1 Z

|t|>1/δ

(1 +t2)β1|pˆ0Y(t)|2dt.δ1.

Now, recall thatpY =pF,σ = F∗φσ. For (F, σ) ∈ Fn, σ ≥ σn ∝Cδ(logn)κ2, where 2κ2≥1,C2/2≥(2β1+ 1)/[2(β1+η) + 1] andδ≡δn&n−1/[2(β!+η)+1],

kpY −pY ∗sincδk22≤ Z

|t|>1/δ

|φˆσ(t)|2dt .(σ2/δ)−1e−(σ/δ)2/2

.[δ(logn)2]−1e−C2(logn)2/21. In the ordinary smooth case,

kpY ∗sincδ−p0Y ∗sincδk22=k(K∗pY)∗gδ−(K∗p0Y)∗gδk22

≤δ−2η Z

|t|≤1/δ

|K(t)|ˆ 2|pˆY(t)−pˆ0Y(t)|2dt

≤δ−2ηkK∗pY −K∗p0Yk22.

In the super-smooth case,kpY ∗sincδ−p0Y ∗sincδk22≤ kK∗pY −K∗p0Yk21kgδk22,where kgδk22= 1

2π Z

−∞

|sincdδ(t)|2

|K(t)|ˆ 2 dt= 1 2π

Z

δ|t|≤1

|K(t)|ˆ −2dt.e2%δ−r1. Combining partial results, for (F, σ)∈ Sn,

kpY −p0Yk221+

( kK∗pY −K∗p0Yk22×δ−2η, os case, kK∗pY −K∗p0Yk21×e2%δ−r1, ss case, so that the optimal choice forδ turns out to be

δ=

( O(kK∗pY −K∗p0Yk1/(β2 1+η)), os case, O (−logkK∗pY −K∗p0Yk1)−1/r1

, ss case.

(9)

For any 1≤q≤ ∞,

kK∗pY −K∗p0Ykq =k(K∗F)∗φσ−K∗p0Ykq =kpF∗K,σ−K∗p0Ykq. Then,

kpY −p0Yk2=kpF,σ−p0Yk2.

( kpF∗K,σ−K∗p0Ykβ21/(β1+η), os case, (−logkpF∗K,σ−K∗p0Yk1)−β1/r1, ss case.

For suitable constantsτ1, κ1>0, let ψn =

( n−(β1+η)/[2(β1+η)+1](logn)κ1, os case, n−1/2(logn)τ1, ss case.

Letvn be as in the statement of Proposition1. Then, for all (F, σ)∈ Sn, the following inclusions hold:

( {(F, σ) : kpF∗K,σ−K∗p0Yk2n} ⊆ {(F, σ) : kpY −p0Yk2.vn}, os, {(F, σ) : kpF∗K,σ−K∗p0Yk1n} ⊆ {(F, σ) : kpY −p0Yk2.vn}, ss.

Forq= 2 in the ordinary smooth case andq= 1 in the super-smooth case, by virtue of Theorem 2, we have π({(F, σ)∈ Sn : kpF∗K,σ−K∗p0Ykqn} | ˆγn, X(n))→ 1 in P(n)p0X-probability, which implies thatπ({(F, σ) : kpY −p0Yk2.vn} |ˆγn, X(n))→1 in the same mode of convergence, and the proof is complete.

Appendix C: Proof of Theorem 3

For any intensityλ, we still denote Mλ =R

λ(t)dt and ¯λ= Mλ−1×λ∈ F1. Without loss of generality, we can assume that Ω = [0, T]. To apply Theorem 1, we must first define the transformationψγ,γ0. Note that the parameter γonly influences the prior on

¯λand has no impact on Mλ. As explained in Section 2, we can consider the following transformation: for allγ, γ0∈R+, we set, for anyt,

λ(t) =¯ X

j≥1

pj

1(0, θj)(t) θj

, ψγ,γ0(¯λ)(t) =X

j≥1

pj

1(0, G−1

γ0(Gγj)))(t) G−1γ0 (Gγj)) , with

pj=VjY

l<j

(1−Vl), Vj∼Beta(1, A), θj ∼Gγ independently.

So, if ¯λis distributed according to a DPM of uniform distributions with base measure in- dexed byγ, thenψγ,γ0(¯λ) is distributed according to a DPM of uniform distributions with base measure indexed byγ0. We prove Theorem 3for both types of base measure intro- duced in (4.7). LetGdenote the cdf of a Gamma(a, 1) random variable andgits density.

(10)

For the first type of base measure we haveG−1γ0 (Gγ(θ)) =G−1(G(γθ)G(γ0T)/G(γT))/γ0 ifθ≤T andG−1γ0 (Gγ(θ)) =T ifθ≥T. For anyθ∈[0, T], ifγ0≥γ then

G−1γ0 (Gγ(θ))≤θ and G−1γ0 (Gγ(θ))≥γθ

γ0. (C.1)

The second inequality in (C.1) is straightforward. The first inequality in (C.1) is equiv- alent toG(γθ)G(γ0T)≤G(γ0θ)G(γT) and is deduced from the following argument. Let

∆(θ) =G(γθ)G(γ0T)−G(γ0θ)G(γT). Then, ∆(0) = 0 and ∆(T) = 0. By Rolle’s The- orem, there exists c ∈ (0, T) such that ∆0(c) = 0. We have ∆0(θ) = γg(γθ)G(γ0T)− γ0g(γ0θ)G(γT) which is proportional toθa−1e−γ0θae0−γ)θG(γ0T)−(γ0)aG(γT)]. The function inside brackets is increasing so that ∆0(θ)≤0 forθ≤cand ∆0(θ)≥0 forθ≥c.

Therefore, ∆ is first decreasing and then increasing. Since ∆(0) = ∆(T) = 0, ∆ is neg- ative on (0, T), which achieves the proof of (C.1). For the second type of base measure, forθ≤T, we have that, for everyγ, γ0>0,G−1γ0 (Gγ(θ)) =T γθ/[γ0(T−θ+θγ/γ0)] and (C.1) is straightforward.

We first verify assumption [A1]. At several places, by using (4.1) and (4.4), we use that, underP(n)λ (· |Γn), for any intervalI, the number of points ofN falling inIis controlled by the number of points of a Poisson process with intensity n(1 +α)m2λ falling in I.

Letun= (nlogn)−1so thatun=o(¯2n) and choosek≥6 so thatu−1n =o((n¯2n)k/2) and (2.2) holds (note thatNn(un) is the same order asu−1n ). Using the proof of Corollary 4.1 of Donnet et al. [1], we construct ˜Bn = ˜Bnγ (since in [A1] ˜Bn may depend onγ) as the set ofλ=Mλλ¯ such that|Mλ−Mλ0| ≤¯n andψγ,γ+un(¯λ)∈B¯n, with ¯Bn ={λ¯P0(x) = R

x θ−1dP0(θ) :P0 ∈ N }, whereN is as defined in the proof of Lemma 8 in Appendix A of Salomond [4]. Note that from Lemma 8 in Appendix A of Salomond [4], if ¯λ∈B¯n and|Mλ−Mλ0| ≤¯n,

P(n)λ0 `n(λ)−`n0)≤ −(κ0+ 1)n¯2nn

=O((n¯2n)−k/2(logn)k) (C.2) forκ0 a constant. To prove (2.3), it is enough to control infγ0∈[γ, γ+un]`n(Mλψγ,γ0(¯λ)).

Using (C.1), we have that for anyγ0∈[γ, γ+un], on Γn, G−1γ+un(Gγj))≤G−1γ0 (Gγj)) and

G−1γ0 (Gγj))≤γ+un

γ0 G−1γ+un(Gγj))≤γ+un

γ G−1γ+un(Gγj)).

Therefore, γ γ+un

ψγ,γ+un(¯λ)(t)≤ψγ,γ0(¯λ)(t)

≤ψγ,γ+un(¯λ)(t) +X

j≥1

pj

1(G−1γ+un(Gγj)),γ+unγ G−1

γ+un(Gγj)))(t) G−1γ+u

n(Gγj))

(C.3)

(11)

so that, fornlarge enough, inf

γ0∈[γ, γ+un]

`n(Mλψγ,γ0(¯λ)) = inf

γ0∈[γ, γ+un]

(Z T

0

log(Mλψγ,γ0(¯λ)(t))dNt− Z T

0

Mλψγ,γ0(¯λ)(t)Ytdt )

≥ Z T

0

log Mλψγ,γ+un(¯λ)(t)

dNt+N[0, T] log γ

γ+un

−Mλ

Z T 0

ψγ,γ+un(¯λ)(t)Ytdt−Mλunn(1 +α)m2

γ

≥`n(Mλψγ,γ+un(¯λ))−un

γ [Mλn(1 +α)m2+ 2N[0, T]], where the last line uses log(1−x)≥ −2xforx >0 small enough. By using the Bienaym´e- Chebyshev inequality, ifZis a Poisson variable with parametern(1 +α)m2Mλ0, we have

P(|Z−n(1 +α)m2Mλ0|> n(1−α)m2Mλ0) =o(1).

Then the event{N[0, T]≤2Mλ0m2n}has probability going to 1 and, on this event, inf

γ0∈[γ, γ+un]`n(Mλψγ,γ0(¯λ))

≥`n(Mλψγ,γ+un(¯λ))−nγ−1m2[Mλ(1 +α) + 4Mλ0]un.

(C.4) Combining this lower bound with (C.2), for allλ=Mλψγ,γ+un(¯λ), withψγ,γ+un(¯λ)∈B¯n

and|Mλ−Mλ0| ≤¯n, P(n)λ0

inf

γ0∈[γ, γ+un]

`n(Mλψγ,γ0(¯λ))−`n0)≤ −(κ0+ 2)n¯2nn

=O((n¯2n)−k/2).

The left hand side of the previous inequality is then negligible with respect toun which is the same order asNn(un)−1 and assumption [A1] is satisfied ifC1≥κ0+ 2.

We now verify assumption [A2]. First, note that using (C.3), (2.8) is obviously satisfied.

Mimicking the proof of Lemma 8 of Salomond [4], we have that over any compact subset K0 of (0,∞),

γ∈Kinf0π1n

≥e−Ck2n (C.5)

for some Ck > 0, when n is large enough. By definition of ˜Bnγ, π( ˜Bnγ | γ) = π1( ¯Bn | γ+unM([Mλ0−¯n, Mλ0+ ¯n]) which, together with (C.5), implies that infγ∈Kπ(Bnγ| γ) ≥ e−2Ck2n when n is large enough, so that (2.5) is satisfied as soon as j is large enough. We now define the measureQ(n)λ,γ, with

dQ(n)λ,γ=1Γn× sup

γ0∈[γ, γ+un]

exp(`n(Mλψγ,γ0(¯λ)))dµ

andµthe measure such that underµthe process is an homogeneous Poisson process with intensity 1. Using (C.1) and similarly to (C.4), we obtain that, for allγ0∈[γ, γ+un],

λ(t)¯ −X

j≥1

pj1(γθj0, θj)(t) θj

≤ψγ,γ0(¯λ)(t)≤γ+un γ

¯λ(t)

(12)

and we have Q(n)λ,γ(X(n)) =E(n)λ

"

1Γn sup

γ0∈[γ, γ+un]

exp −Mλ Z T

0

ψγ,γ0(¯λ)(t)Ytdt

+ Z T

0

log Mλψγ,γ0(¯λ)(t) dNt

!#

≤E(n)λ

1Γnexp(nm2(1 +α)Mλγ−1un+ log(1 +un/γ)N[0, T])

≤E(n)λ

1Γnexp(nm2(1 +α)Mλγ−1un+unγ−1N[0, T])

≤exp

nm2(1 +α)Mλγ−1un+ (1 +α)nm2Mλ(eun −1)

≤exp 3nm2(1 +α)γ−1Mλun

whenn is large enough since exp(x)−1 ≤2x forx > 0 small enough. Letφn,j be the tests defined in Proposition 6.2 of Donnet et al. [1] over Sn,jn). Using the previous computations, we have

Q(n)λ,γ[1−φn,j]≤E(n)λ [(1−φn,j) exp(nm2(1 +α)Mλγ−1un+unγ−1N[0, T])1Γn]

≤enm2(1+α)Mλγ−1un(E(n)λ [(1−φn,j)1Γn]E(n)λ [e−1unN[0, T]1Γn])1/2

≤e4nm2(1+α)γ−1Mλunmax{e−cnj2¯2n/2, e−cnj¯n/2}.

As in Salomond [4], we setSn={λ¯: ¯λ(0)≤Mn}, withMn= exp(c12n) andc1>0 is a constant. From Lemma 9 of Salomond [4], there existsa >0 such that supγ∈K0π1(Snc | γ)≤e−c1(a+1)n¯2n, so that whennis large enough,

sup

γ∈K0

Z

R+

Z

Snc

Q(n)λ,γ(X(n))dπ1(¯λ|γ)πM(Mλ)dMλ

.e−c1(a+1)n¯2n Z

R+

eδMλπM(Mλ)dMλ.e−c1(a+1)n¯2n,

with δ that can be chosen as small as needed since nun = o(1). This proves (2.4) by conveniently choosing c1. Combining the above upper bound with Proposition 6.2 of Donnet et al. [1], together with Remark1, achieves the proof of Theorem3.

References

[1] Donnet, S., Rivoirard, V., Rousseau, J., and Scricciolo, C. (2015). Posterior concentration rates for counting processes with Aalen multiplicative intensities.

arXiv:1407.6033v1, to appear in Bayesian Analysis.

[2] Ghosal, S. and van der Vaart, A. (2007). Posterior convergence rates of Dirichlet mixtures at smooth densities. Ann. Statist., 35(2):697–723.

(13)

[3] Nguyen, X. (2013). Convergence of latent mixing measures in finite and infinite mixture models. Ann. Statist., 41:370–400.

[4] Salomond, J.-B. (2014). Concentration rate and consistency of the posterior distribu- tion for selected priors under monotonicity constraints.Electron. J. Statist., 8(1):1380–

1404.

[5] Scricciolo, C. (2014). Adaptive Bayesian density estimation in Lp-metrics with Pitman-Yor or normalized inverse-Gaussian process kernel mixtures. Bayesian Anal- ysis, 9:475–520.

[6] Shen, W., Tokdar, S., and Ghosal, S. (2013). Adaptive Bayesian multivariate density estimation with Dirichlet mixtures. Biometrika, 100:623–640.

[7] Shen, X. and Wasserman, L. (2001). Rates of convergence of posterior distributions.

Ann. Statist., 29:687–714.

Références

Documents relatifs

Donner une valeur approch´ee de la variation relative de f lorsque x augmente de 5% `a partir de

La qualit´ e de r´ edaction et de la pr´ esentation entrera pour une part importante dans l’appr´ eciation des copies.. Le barˆ eme tiendra compte de la longueur

La qualit´ e de la r´ edaction et de la pr´ esentation entreront pour une part importante dans l’appr´ eciation des copies.. Ne pas oublier pas d’indiquer votre num´ ero de

La qualit´ e de r´ edaction et de la pr´ esentation entreront pour une part importante dans l’appr´ eciation des copies.. Ne pas oublier pas de marquer le num´ ero de

La qualit´ e de r´ edaction et de la pr´ esentation entreront pour une part importante dans l’appr´ eciation des copies.. Ne pas oublier de marquer le num´ ero

En sup- posant que l’entreprise tourne `a plein r´egime, trouver la r´epartition maximale entre les mod`eles de type X et Y permettant de maximiser le profit quotidien.. Calculer

´ Etudier les variations de g et en d´eduire qu’elle admet un minimum global dont on donnera la

Montrer que le point critique sous la contrainte est un extremum local, dont on pr´ecisera la nature.. Pr´eciser la valeur de f en