

Chapter 7. On the inference for size constrained Galton-Watson trees

7.2 Inferring $\sigma^{-1}$ from a forest

7.2.1 Adequacy of the Harris path with the expected contour

Let $\tau_n \sim \mathrm{GW}_n(\mu)$ with $\mu$ critical (mean $1$). We assume that the offspring distribution $\mu$ is unknown. By virtue of Theorem 7.1.3, the asymptotic average behavior of the normalized Harris process $(n^{-1/2} H[\tau_n](2nt),\ 0 \le t \le 1)$ is given by $(2\sigma^{-1} E_t,\ 0 \le t \le 1)$, where $\sigma^{-1}$ is of course also unknown. We propose to estimate $\sigma^{-1}$ by minimizing the $L^2$-error defined by

$$\lambda \mapsto \left\| \frac{H[\tau_n](2n\,\cdot)}{\sqrt{n}} - 2\lambda E \right\|_2^2 .$$

The solution of this least-squares problem is well known and is given by

$$\widehat{\lambda}[\tau_n] = \frac{\langle H[\tau_n](2n\,\cdot),\, E \rangle}{2\sqrt{n}\,\|E\|_2^2}. \tag{7.4}$$
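The estimator (7.4) can be sketched numerically as follows. This is only an illustrative sketch under stated assumptions: the Harris path is assumed to be available at the integer times $0, 1, \dots, 2n$, the inner product is approximated by a trapezoidal rule, and $E_t$ is taken to be the mean of the normalized Brownian excursion, $E_t = 2\sqrt{2/\pi}\,\sqrt{t(1-t)}$, a closed form that is not restated in this section. The names `expected_contour`, `trapezoid` and `lambda_hat` are hypothetical.

```python
import numpy as np


def expected_contour(t):
    """Assumed closed form E_t = 2*sqrt(2/pi)*sqrt(t*(1-t)) (mean Brownian excursion)."""
    t = np.asarray(t, dtype=float)
    return 2.0 * np.sqrt(2.0 / np.pi) * np.sqrt(t * (1.0 - t))


def trapezoid(y, dt):
    """Composite trapezoidal rule on a uniform grid with step dt."""
    y = np.asarray(y, dtype=float)
    return dt * (y[:-1] + y[1:]).sum() / 2.0


def lambda_hat(harris, n):
    """Least-squares estimate (7.4) from the values H(0), H(1), ..., H(2n)."""
    harris = np.asarray(harris, dtype=float)
    if harris.shape != (2 * n + 1,):
        raise ValueError("expected 2n + 1 values of the Harris path")
    t = np.arange(2 * n + 1) / (2.0 * n)       # rescaled times in [0, 1]
    dt = 1.0 / (2.0 * n)
    e_bar = expected_contour(t)
    inner = trapezoid(harris * e_bar, dt)      # approximates <H(2n .), E>
    sq_norm = trapezoid(e_bar ** 2, dt)        # approximates ||E||_2^2
    return inner / (2.0 * np.sqrt(n) * sq_norm)
```

Because the numerator and the squared norm use the same quadrature, a path that is exactly $2c\sqrt{n}\,E$ on the grid returns $c$ up to floating-point error.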

Corollary 7.2.1. When $n$ goes to infinity, we have $\widehat{\lambda}[\tau_n] \xrightarrow{(d)} \sigma^{-1}\Lambda$, where the real random variable $\Lambda$ is defined by

$$\Lambda = \frac{\langle e, E \rangle}{\|E\|_2^2}.$$

Proof. The result directly follows from Theorem 7.1.2 because the functional $x \mapsto \langle x, E \rangle$ is continuous on $C([0,1])$. □

Remark 7.2.2. The convergence in distribution stated in Corollary 7.2.1 seems quite unsatisfactory: it means that $\widehat{\lambda}[\tau_n]$ is not a consistent estimator of $\sigma^{-1}$, so the least-squares strategy looks inadequate. Nevertheless, one cannot expect a stronger convergence from the observation of a single stochastic process within a finite window of time. This is why we focus on the estimation of the parameter of interest $\sigma^{-1}$ from a forest of conditioned Galton-Watson trees. This statistical framework is also considered in [9].

Computing $\widehat{\lambda}[\tau_n]$ is only a first step in the estimation of the inverse standard deviation from a large number of conditioned Galton-Watson trees. As a consequence, the distribution of the limit variable $\Lambda$ is of primary importance.

Proposition 7.2.3. The random variable $\Lambda$ admits a density $f_\Lambda$ with respect to the Lebesgue measure. Furthermore,

$$\mathbb{E}[\Lambda] = 1. \tag{7.5}$$

Proof. The existence of a density was already known [70, 71] for the random variable $\int_0^1 e_s\,ds$. In these papers, the study relies on the analysis of the double Laplace transform

$$\lambda \mapsto \int_0^{\infty} \exp(-\lambda t)\, \mathbb{E}\left[ \exp\left( -t \int_0^1 e_s\,ds \right) \right] dt.$$

Thanks to the Feynman-Kac formula, the authors express this quantity in terms of Airy functions. Then, they invert the Laplace transform via analytical methods. Unfortunately, their method does not extend to our case. Indeed, in their case, an expression of the double Laplace transform given above is derived from the Feynman-Kac formula for standard Brownian motion, which tells us that the function

$$u(t,x) = \mathbb{E}_x\left[ f(B_t) \exp\left( \int_0^t B_s\,ds \right) \right], \quad \forall (t,x) \in \mathbb{R}_+ \times \mathbb{R},$$

is the solution of the PDE

$$\begin{cases} \partial_t u(t,x) = \tfrac{1}{2}\Delta u(t,x) + x\,u(t,x), & \forall x \in \mathbb{R},\ t \in \mathbb{R}_+, \\ u(0,x) = f(x), & \forall x \in \mathbb{R}. \end{cases}$$

In this case, taking the Laplace transform in time of $u$ leads to an ODE whose solution can be expressed in terms of Airy functions (see [52]). In our case, the PDE becomes inhomogeneous in time, which makes such a transformation useless. As a consequence, one cannot obtain any information by this method.

That is why we propose a new method using Malliavin calculus and the representation of the Brownian excursion as a three-dimensional Bessel bridge (7.3) to show that $\Lambda$ admits a density.

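We consider the probability space $(C([0,1],\mathbb{R}^3), \mathcal{F}, \mathbb{W})$, where $C([0,1],\mathbb{R}^3)$ is endowed with the topology of uniform convergence, $\mathcal{F}$ is the corresponding Borel $\sigma$-field and $\mathbb{W}$ is the Wiener measure. Let $T$ be the continuous linear operator defined by

$$T : C([0,1],\mathbb{R}^3) \to C([0,1],\mathbb{R}^3), \quad \varphi \mapsto T\varphi, \quad (T\varphi)(s) = \varphi_s - s\,\varphi_1 .$$

Let also $\Gamma$ be the following function,

$$\Gamma : \varphi \mapsto \int_0^1 \|\varphi(s)\|\, E_s\,ds,$$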

where $\|x\|$ denotes the Euclidean norm on $\mathbb{R}^3$. With these notations and (7.3), the pushforward measure of $\mathbb{W}$ through the map

$$F : \varphi \mapsto \Gamma(T\varphi)$$

is the law of $\|E\|_2^2\,\Lambda$. In other words, the random variable $F$ is equal in distribution to $\|E\|_2^2\,\Lambda$. Now, for every $\varphi$ in $C([0,1],\mathbb{R}^3)$ such that $\mathrm{Leb}(\{t \in \mathbb{R}_+ : \varphi(t) = 0\}) = 0$ (where $\mathrm{Leb}$ denotes the Lebesgue measure), $\Gamma$ is Fréchet differentiable at point $\varphi$. Indeed, set

$$D_\varphi\Gamma : C([0,1],\mathbb{R}^3) \to \mathbb{R}, \quad h \mapsto \int_0^1 \frac{\langle \varphi(s), h(s) \rangle}{\|\varphi(s)\|}\, E_s\,ds.$$

Then, some straightforward manipulations give

[...]

Now, the Cauchy-Schwarz inequality entails

[...]

is well-defined (because the integrand is bounded by $2$) and goes to zero as $\|h\|$ goes to zero. This proves that $D_\varphi\Gamma$ is the Fréchet derivative of $\Gamma$ at point $\varphi$. Now, since $T$ is linear, $F$ is Fréchet differentiable at every $\varphi$ such that $\mathrm{Leb}(\{t \in \mathbb{R}_+ : \varphi(t) = 0\}) = 0$, and $D_\varphi F = D_{T\varphi}\Gamma \circ T$. We now show that $F$ belongs to the Malliavin-Sobolev space $\mathbb{D}^{1,2}$ (see [75, p. 25-27] for the definition of this space). Let $h$ be an element of $L^2([0,1],\mathbb{R}^3)$; it is easily seen that

[...]

But in the right-hand side of the last inequality, we have, using Jensen's inequality,

[...]

From this, using the results of [75, p. 35], we have that $F$ belongs to the space $\mathbb{D}^{1,2}$.
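The formula for the Fréchet derivative of $\Gamma$ can be sanity-checked numerically by comparing it with a finite-difference quotient. The sketch below is only illustrative: the path `phi`, the direction `h`, the grid size and the step `eps` are arbitrary choices, the integrals are approximated by a trapezoidal rule, and $E_t = 2\sqrt{2/\pi}\,\sqrt{t(1-t)}$ is assumed for the expected contour (a closed form not restated in this section).

```python
import numpy as np


def expected_contour(t):
    """Assumed closed form E_t = 2*sqrt(2/pi)*sqrt(t*(1-t))."""
    return 2.0 * np.sqrt(2.0 / np.pi) * np.sqrt(t * (1.0 - t))


def trapezoid(y, dt):
    """Composite trapezoidal rule on a uniform grid with step dt."""
    return dt * (y[:-1] + y[1:]).sum() / 2.0


def gamma(phi, t):
    """Gamma(phi) = int_0^1 ||phi(s)|| E_s ds, with phi of shape (len(t), 3)."""
    norms = np.linalg.norm(phi, axis=1)
    return trapezoid(norms * expected_contour(t), t[1] - t[0])


def d_gamma(phi, h, t):
    """Candidate derivative: int_0^1 <phi(s), h(s)> / ||phi(s)|| E_s ds."""
    norms = np.linalg.norm(phi, axis=1)
    integrand = (phi * h).sum(axis=1) / norms * expected_contour(t)
    return trapezoid(integrand, t[1] - t[0])


t = np.linspace(0.0, 1.0, 2001)
# Arbitrary smooth path that never vanishes (its norm is at least 1).
phi = np.stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t), t], axis=1)
# Arbitrary smooth direction.
h = np.stack([t ** 2, np.sin(3 * t), 1.0 - t], axis=1)

eps = 1e-6
finite_diff = (gamma(phi + eps * h, t) - gamma(phi, t)) / eps
exact = d_gamma(phi, h, t)
print(finite_diff, exact)  # the two values should agree up to O(eps)
```

The check only exercises a path that stays away from zero; it says nothing about paths that vanish on a set of positive Lebesgue measure, which the statement excludes.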

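Before going further, let us recall some facts on the Malliavin derivative. When working with the probability space $(C([0,1],\mathbb{R}^3), \mathcal{F}, \mathbb{W})$, it is known (see Section 1.2.1 in [75]) that there exist strong connections between the Malliavin derivative and the Fréchet derivative for a random variable $G$ of $\mathbb{D}^{1,2}$ defined from $(C([0,1],\mathbb{R}^3), \mathcal{F}, \mathbb{W})$ to $\mathbb{R}$. Since the Fréchet derivative $D_\omega G$ of $G$ at point $\omega$ is a continuous linear form from $C([0,1],\mathbb{R}^3)$ into $\mathbb{R}$, it can be identified with a triple $(\mu^\omega_1, \mu^\omega_2, \mu^\omega_3)$ of $\sigma$-finite measures on $\mathbb{R}$ such that

$$D_\omega G\,h = \sum_{i=1}^{3} \int_{[0,1]} h^i_s\, \mu^\omega_i(ds), \quad \forall h \in C([0,1],\mathbb{R}^3).$$

In such a case, the Malliavin derivative of $G$ is the random process belonging to $L^2([0,1],\mathbb{R}^3)$ given by

$$\left\{ \left( \mu^\omega_1(u,1],\ \mu^\omega_2(u,1],\ \mu^\omega_3(u,1] \right),\ u \in [0,1] \right\}.$$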

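In our case, since

$$D_\varphi F\,h = \int_0^1 \left\langle h_s,\ \frac{\varphi_s - s\varphi_1}{\|\varphi_s - s\varphi_1\|}\, E_s\,ds - \left( \int_0^1 \frac{v\,(\varphi_v - v\varphi_1)}{\|\varphi_v - v\varphi_1\|}\, E_v\,dv \right) \delta_1(ds) \right\rangle,$$

it follows that the Malliavin derivative of $F$ is given by

$$DF = \left( \int_0^1 \frac{(\omega_s - s\omega_1)\, E_s}{\|\omega_s - s\omega_1\|}\, \left( \mathbf{1}_{s>u} - s \right) ds,\ u \in [0,1] \right) \in L^2([0,1],\mathbb{R}^3).$$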

Now, since $DF$ is $\mathbb{W}$-almost everywhere not zero (in $L^2([0,1],\mathbb{R}^3)$), we obtain, using [75, Theorem 2.1.2], the existence of a density for the pushforward measure of $\mathbb{W}$ by $F$ with respect to the Lebesgue measure.

□

It should be noted that the weak limit of $\widehat{\lambda}[\tau_n]$ has mean equal to $\sigma^{-1}$ by (7.5). Moreover, it can be shown that the random variable $\Lambda$ is square integrable. Indeed, since the function $E$ is bounded, we have

$$0 \le \Lambda \le C \int_0^1 e_t\,dt,$$

for some positive constant $C$. Now, it is known that the random variable $\int_0^1 e_t\,dt$ admits moments of all orders (see for instance [71]).

The variance of $\Lambda$ can then be evaluated numerically in order to compare our method with other estimators. To this end, we use Monte Carlo simulations to produce a sample with the same law as $\Lambda$. This leads to

$$\mathrm{Var}(\Lambda) \simeq 0.0690785.$$
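A Monte Carlo evaluation of this kind can be sketched as follows. This is not the simulation code used for the value above, only a minimal illustration under stated assumptions: the excursion is sampled through the Bessel bridge representation $e_s = \|B_s - sB_1\|$ for a three-dimensional Brownian motion $B$, the integrals are approximated by a trapezoidal rule on a uniform grid, and $E_t = 2\sqrt{2/\pi}\,\sqrt{t(1-t)}$ is assumed for the expected contour. The function name `sample_lambda` and its parameters are hypothetical.

```python
import numpy as np


def expected_contour(t):
    """Assumed closed form E_t = 2*sqrt(2/pi)*sqrt(t*(1-t))."""
    return 2.0 * np.sqrt(2.0 / np.pi) * np.sqrt(t * (1.0 - t))


def sample_lambda(n_samples, n_steps=200, seed=0):
    """Draw approximate samples of Lambda = <e, E> / ||E||_2^2."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    t = np.linspace(0.0, 1.0, n_steps + 1)

    # Three-dimensional Brownian motion on the grid, started at 0.
    increments = rng.standard_normal((n_samples, n_steps, 3)) * np.sqrt(dt)
    start = np.zeros((n_samples, 1, 3))
    bm = np.concatenate([start, np.cumsum(increments, axis=1)], axis=1)

    # Brownian bridge B_s - s*B_1, then its Euclidean norm (Bessel bridge).
    bridge = bm - t[None, :, None] * bm[:, -1:, :]
    excursion = np.linalg.norm(bridge, axis=2)

    # Trapezoidal weights on the uniform grid.
    weights = np.full(n_steps + 1, dt)
    weights[[0, -1]] = dt / 2.0

    e_bar = expected_contour(t)
    inner = (excursion * e_bar) @ weights   # approximates <e, E>
    sq_norm = (e_bar ** 2) @ weights        # approximates ||E||_2^2
    return inner / sq_norm


if __name__ == "__main__":
    lam = sample_lambda(5000)
    print(lam.mean(), lam.var())
```

With a few thousand samples, the empirical mean should be close to $1$, in line with (7.5), and the empirical variance should be of the same order as the value reported above; the exact digits depend on the seed, the sample size and the grid.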

At this point, it is quite interesting to compare our approach to the one developed in [9]. As in the present paper, the authors of [9] construct estimators for the inverse standard deviation of the offspring distribution of a forest of conditioned critical Galton-Watson trees. Their strategy relies on the distance to the root of a uniformly sampled vertex $v$ of the considered tree $\tau_n \sim \mathrm{GW}_n(\mu)$,

$$\widehat{\delta}[\tau_n] = \frac{h(v)}{\sqrt{n}},$$

where we recall that $h(v)$ is the height of $v$ in the tree. Using Theorem 7.1.2, it has been shown that $\widehat{\delta}[\tau_n]$ converges in law, when the number of nodes $n$ goes to infinity, towards $\sigma^{-1}\Delta$, where the random variable $\Delta$ follows the Rayleigh distribution with scale parameter $1$ [9, Proposition 4], with density

$$\forall x \in \mathbb{R}_+, \quad f(x) = x \exp\left( -\tfrac{1}{2} x^2 \right).$$

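This was not noticed in [9], but we emphasize that $\widehat{\delta}[\tau_n]$ is somewhat biased because $\mathbb{E}[\Delta] = \sqrt{\pi/2} \neq 1$. Nevertheless, one may avoid this issue by considering the rescaled quantity

$$\widehat{\delta}[\tau_n] = \sqrt{\frac{2}{\pi}}\; \delta[\tau_n]$$

(with $\delta[\tau_n]$ on the right-hand side denoting the uncorrected ratio $h(v)/\sqrt{n}$), which converges to $\sigma^{-1}\sqrt{2/\pi}\,\Delta$, which is $\sigma^{-1}$ on average. As a consequence, $\widehat{\lambda}[\tau_n]$ and $\widehat{\delta}[\tau_n]$ are two quantities that are directly computable from the tree $\tau_n$ and that may be used to estimate the inverse standard deviation. We propose to compare them through their respective asymptotic dispersions. A first comparison may be done by computing the variances of $\Lambda$ and $\sqrt{2/\pi}\,\Delta$. One has

$$\mathrm{Var}\left( \sqrt{\frac{2}{\pi}}\,\Delta \right) \simeq 0.2732395 \quad \text{and} \quad \mathrm{Var}(\Lambda) \simeq 0.0690785.$$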

This difference in the dispersions is quite apparent in Figure 7.3, where the densities of $\sqrt{2/\pi}\,\Delta$ and $\Lambda$ have been displayed. Consequently, one may expect better results in terms of dispersion from our strategy.
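The figures for the Rayleigh-based estimator can be recovered in closed form, which gives a quick check of the comparison above. The sketch below uses the standard moments of the Rayleigh distribution with scale $1$, namely $\mathbb{E}[\Delta] = \sqrt{\pi/2}$ and $\mathrm{Var}(\Delta) = (4-\pi)/2$, and adds an optional simulation; the variable names, the seed and the sample size are arbitrary choices.

```python
import math

import numpy as np

# Closed-form moments of Delta ~ Rayleigh(scale=1).
mean_delta = math.sqrt(math.pi / 2.0)
var_delta = (4.0 - math.pi) / 2.0

# Rescaling by sqrt(2/pi) gives mean 1 and variance (4 - pi) / pi.
mean_rescaled = math.sqrt(2.0 / math.pi) * mean_delta
var_rescaled = (2.0 / math.pi) * var_delta

# Optional empirical check by simulation.
rng = np.random.default_rng(0)
sample = math.sqrt(2.0 / math.pi) * rng.rayleigh(scale=1.0, size=200_000)

# Ratio of asymptotic variances, using the value of Var(Lambda) reported above.
var_lambda_reported = 0.0690785
ratio = var_rescaled / var_lambda_reported

print(mean_rescaled, var_rescaled, sample.mean(), sample.var(), ratio)
```

The closed form $(4-\pi)/\pi$ agrees with the reported value $0.2732395$, and the ratio indicates that the asymptotic variance of the rescaled height-based statistic is roughly four times that of $\Lambda$.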