Asymptotics for the <span class="mathjax-formula">$L^p$</span>-deviation of the variance estimator under diffusion

(1)

August 2004, Vol. 8, p. 132–149 DOI: 10.1051/ps:2004005

ASYMPTOTICS FOR THE L

^p

-DEVIATION OF THE VARIANCE ESTIMATOR UNDER DIFFUSION

^∗

Paul Doukhan

¹

and Jos´ e R. Le´ on

²

Abstract. We consider a diﬀusion process Xt smoothed with (small) sampling parameter ε. As in Berzin, Le´on and Ortega (2001), we consider a kernel estimate αε with windowh(ε) of a functionα of its variance. In order to exhibit global tests of hypothesis, we derive here central limit theorems for theL^pdeviations such as

√1 h

h

ε ^p

2

αε−α^p_p−IEαε−α^p_p .

Mathematics Subject Classification. 60F05, 60F25, 60J60, 60H05, 62M02, 62M05.

Received October 22, 2002. Revised October 21, 2003.

1. Introduction and main results

Recently, the problem of statistical inference for integrated diffusions attracted a certain attention. This type of situation appears, for example, when a realization of a process is observed after passage through an electronic filter. Ditlevsenet al. [4] studied the ice core records from Greenland, which can also be modeled as an integrated diffusion process. Gloter [9] considers the integrated Ornstein-Uhlenbeck process to make inference about its parameters. Integrated processes are also important in the so-called realized volatility in finance. In all these works the observations are assumed to be Y_i=_i∆

(i−1)∆X_sϕ_∆(s−(i−1)∆) dswhereX_sis a diﬀusion process andϕ_∆is a density in the interval [0,∆]. Settingϕ_∆(s) =_∆¹ϕ(_∆^s), we haveY_i=₁

0 ϕ(s)X_{s∆+(i−1)∆}ds.

In this work we observed a smoothed and continuous time process deﬁned as X_t^ε=1

ε _∞

−∞ϕ t−u

ε

X_udu= _1/2

−1/2ϕ(s)X_t−εsds, (1)

withX_ta diffusion process,ϕis a density in [−¹₂,¹₂] andεis interpreted as a sampling parameter. Our goal is to make non parametric inference for the diffusion coefficient. Let us briefly introduce our framework.

Keywords and phrases. Variance estimator, kernel,L^p-deviation, central limit theorem.

∗This work was supported by the Project Agenda Petr´oleo: Modelaje Estoc´astico Aplicado of FONACIT Venezuela.

1 Laboratoire de Statistiques LS-CREST, ENSAE, 3 rue Pierre Larousse, France; e-mail:[email protected]

2 Universidad Central de Venezuela, Escuela de Matem´atica, Facultad de Ciencias, AP. 47197, Los Chaguaramos Caracas 1041-A, Venezuela; e-mail:[email protected]

c EDP Sciences, SMAI 2004

(2)

Let (W_t)_t≥0be a standard Brownian motion, andX_t be deﬁned by the equation

dX_t=σ(t) dW_t+b(X_t) dt, σ >0. (2) To assure the existence and uniqueness of the solutionX_t,σ: IR→IR andb: IR→IR are assumed to satisfy the assumptions (1) and (2) of Theorem 1 of [7] p. 40, adapted to our case,i.e.

|b(x)−b(y)| ≤K|x−y|and|b(x)|+σ(x)≤K²

1 +|x|² . (3)

Additional conditions concerningσwill be given later. This shows however that explosive variances cannot be considered in the present work.

In this work we consider the estimation of the functionσ(t). ProcessX_tis not directly available, we assume that we observeX_t^ε as deﬁned in (1). The case in which function σ(·) only depends onX_t has been studied previously by Perera and Wschebor in [12] and [13].

As in [2], we consider a function G ∈ L²(φ) with φ(x) = √¹

2πe^−x²^/2 together with the continuous and symmetric densitiesϕandKwith support in [−¹₂,¹₂]. Ifϕis a diﬀerentiable function we deﬁne ˙X_ε(t) =_dt^dX_ε(t).

For anyq≥1, we deﬁnefq =_∞

−∞|f(t)|^qdt ¹_q

. Leth=h(ε)→0 asε→0 (the dependence ofhonεis implicit throughout the paper) and we set

α_ε(t) = 1 h

_∞

−∞K t−u

h

G √

ε ϕ2

X˙_ε(u)

du. (4)

Soα_ε(t) is the non-parametric kernel estimate of the parameter

α(t) = IE[G(σ(t)Z)], t∈[0,1] (5)

whereZ ∼ N(0,1) will denote a standard normal random variable throughout the paper. Berzinet al.[2] note several interesting special cases:

• ifG(x) =x² thenα(t) =σ²(t) (recall that IE|Z|²= 1);

• ifG(x) =_π

2|x|thenα(t) =σ(t) (recall that IE|Z|= 2

π);

• if G(x) = log|x| −2γ then α(t) = logσ(t). For this, note that the constant γ can be written as γ=_∞

0 logx φ(x) dx= 0.57721566. . .

By using stable convergence, as in [2], we can deduce our results from the case b ≡0. In this particular case our process is a time-changed Brownian motion.

Deﬁne

β_ε(t) =

h/ε(α_ε(t)−IEα_ε(t)). (6)

A pointwise central limit theorem (CLT)

β_ε(t)−→^D ε→0N 0,Σ²(t)

is proved in [2], where Σ²(t) is deﬁned by equation (11). Alternative estimation techniques and some CLT are proposed in Soulier [17], Genon-Catalotet al. [6] and in Brugi`ere [3] under close settings.

Another expression will also be useful

β_ε(t) =

h/ε(α_ε(t)−α(t)). (7)

Ifα∈C² (twice continuously diﬀerentiable) then the bias term veriﬁes IE(α_ε(t))−α(t) =O(h²). In this case the optimal window size ish=ε^1/5. Replacing IE(α_ε(t)) by α(t) in (6) we get again a CLT with no zero mean

(3)

(resp. zero mean) in the optimal case (resp. in the non optimal case). In Proposition 1 below we will precise the asymptotic bias behavior.

In the present paper, our aim is to provide global estimation results for parameterαinL^p forp≥1. So, we consider theL^p deviations

D_p,ε = 1

√h

βε^p_p−IEβε^p_p , and (8)

Dp,ε = √1 h

β_ε^p_p−IEβ_ε^p_p

. (9)

In Theorems 1 and 2 we show that both expressions are asymptotically normal. These results are used to design global tests of hypotheses for the diﬀusion’s variance in the forthcoming section. That test appears to be of special interest in problems in ﬁnance. Using a Poissonization argument, Beirlant and Mason [1]

obtained analogous results for the case of kernel density and regression estimates based on independent samples.

Soulier [17] proves a CLT for the casep= 2 for a wavelet-based estimator of the diﬀusion coeﬃcient.

Remark. The asymptotic behavior ofDp,ε, more convenient for a test of hypothesis, is an easy consequence of the behavior for D_p,ε in the sub optimal window case, nevertheless needs a more detailed analysis in the optimal one (see remark and proof of Th. 2).

Let us expand the (even) functiong_t(x) =G(σ(t)x) in terms of Hermite polynomials g_t(x) =

∞ n=0

a_2n(t)H_2n(x), witha_2n(t) = 1

(2n)!IEG(σ(t)Z)·H_2n(Z). (10) We will often use Mehler’s formula: IE(H_n(X)H_m(Y)) =n!ρⁿδ_n,m, where (X, Y) a two-dimensional standard Gaussian vector having correlationρ, which is a special case of the Diagram formula, see [11].

Letf g stand for the convolution off andg. Fort∈[0,1] andw∈[−1,1], we deﬁne Σ²(t) =K²₂^∞

n=1

a²_2n(t) (2n)!

₁

−1

ϕ ϕ(z) ϕ²₂

_2n

dz, and (11)

Γ(w) = K K(w)

K²₂ ∈[−1,1]. (12)

Let (Z₁, Z₂) be a standard (0, I₂) normal vector, so we deﬁne Σ²_p=

₁

−1Cov

1−Γ²(w)Z₁+ Γ(w)Z₂^p,|Z₂|^p dw·

₁

0

Σ^2p(t) dt. (13)

We have the following result

Theorem 1. Assume that the diﬀusion (2) is such as the functionσis continuous andσ >0 over the compact set[0,1], there exists some q≥4 such as IE|G(σ(t)Z)|^pq<∞and lim

ε→0h= lim

ε→0

ε

h^2(1−1/q) = 0. Then D_p,ε−→^D _ε→0N

0,Σ²_p . Remarks. Using Lemma 5 below proves that the same CLT holds for

D_p,ε= 1

√h

βε^p_p−IE|Z|^p ₁

0

(Σ(t))^pdt

,

where Σ²(t) is deﬁned in equation (11) and it is also the limiting variance in the CLT as proved in [2].

(4)

Letb_2k =_(2k)!¹ IE|Z|^pH_2k(Z), then we write

Σ²_p= ∞ k=1

b²_2k(2k)!

Γ^2k(w) dw

Σ^2p(t) dt.

Inspired by Jacod [10], the proof of Theorem 1 is divided in two steps: ﬁrst one assumes thatb≡0, which mean thatX_tis a time-changed Brownian motion, and shows stable convergence. Secondly, using Girsanov’s formula we consider the caseb= 0.

We ﬁrst provide the asymptotic behaviour of the bias:

Proposition 1. Assume that the even function G is a.s. twice diﬀerentiable and assume that σ > 0 is a C²−function. Let b_ε(t) =IEα_ε(t)−α(t). and lim

ε→0h= lim

ε→0

ε

h = 0, then

ε→0limh⁻² sup

t∈[0,1]

b_ε(t)−h² 2 α(t)

s²K(s) ds = 0.

Besides, lim

ε→0

ε²

h³ = 0and the functionsG, σ areC³, imply that the norming factorh⁻² may be replaced byh⁻³. Remark. As usual, the use of kernelsKwith higher order¹yieldsb_ε(t) =cα^(r)(t)h^r+o(h^r) whereois uniform with respect tot. Moreover the remainder term isO(h^r+δ) if the more restrictive assumption|α^(r)(s)−α^(r)(t)| ≤ c_δ|s−t|^δ is assumed.

We now turn to the asymptotic behaviour ofDp,ε. Assuming that the functionsσ, Garea.s.twice diﬀeren- tiable, then the suboptimal window case, lim_ε→0h⁵/ε= 0 leads to the same result as Theorem 1.

Dp,ε −→D ε→0N(0,Σ²_p) if lim

ε→0

h⁵

ε = 0. (14)

Examining the optimal window caseh=λε¹⁵, we get:

Theorem 2. Assume that the function σ > 0 is C² (twice continuously diﬀerentiable), and that G is a.s.

twice diﬀerentiable and has a second order bounded derivative set h = λε¹⁵ for some constant λ > 0. If IE|G(σ(t)Z)|^2p<∞then

Dp,ε −→D _ε→0N(0, τ_p²), where, as in Theorem 1,

τ_p²= Θ(w, t) Σ^2p(t) dwdt andc(t) =1

2λ⁵²α(t)

s²K(s) ds, Θ(w, t) = Cov

1−Γ²(w)Z₁+ Γ(w)Z₂+c(t)^p,|Z2+c(t)|^p .

Remark. The statisticDp,ε is not well adapted to make a hypothesis test. It can be modiﬁed as Dp,ε,so= √1

h

β_ε^p_p−IE|Z|^p ₁

0

(Σ(t))^pdt

,

1This means that

t^kK(t) dtis 1 for k= 0, vanishes for 0 < k < r and is well deﬁned for k = r; usually one considers bounded and compactly supported kernels and a standard construction of such kernel consists to search polynomialsP such as those conditions hold for the kernelK=P·uwhereudenotes a bounded and compactly supported non-negative density.

(5)

under the suboptimal window case and as Dp,ε,o= 1

√h

β_ε^p

p− ₁

0

|Σ(t)Z+c(t)|^pdt

,

if h = λε¹⁵. In Lemma 5 below we will show that these two statistics have the same asymptotic behaviour that Dp,ε in each case.

Examples. In some special cases of interest, the functionGis homogeneousG(σx) =σ^rG(x) forσ >0, hence Σ²(t) =Aσ^2r(t) for a suitable constant A > 0 only depending on φand on Gand, this makes much simpler the expressions of Σ²_p and τ_p². Examples of these situationsG(x) =_π

2|x| andG(x) =x² have already been sketched.

Analogous considerations are valid for the function G(x) = log|x| −γ for which onlya₀(t) = logσ(t)−γ really depends ont whilea_2n(t) =a_2n = _(2n)!¹ IE log|Z|H_2n(Z) forn > 0, and Σ²(t) = Σ²_ϕ only depends onϕ.

Note that Σ²_p does not depend on the functionσ(·); this however does not hold for the companion varianceτ_p². This paper is organized as follows, Section 1 introduces the problem and gives the main results. Section 2 is devoted to provide an explicit expression for a test of hypothesis useful for various applications. Section 3 is devoted to a series of technical lemmas useful in the proof of the main results. The main results are proved in Section 4, while the proof of the preliminary lemmas is given in Section 5.

2. Application to a test of hypothesis

Assume that we want to provide a test for hypothesisH₀:α=aagainst a family of contiguous alternatives α=a+δAwhere the functions a, A∈L^p are given,δ↓0 as↓0, andAp>0.

From the examples of functionsG, it is clear that such tests can be transferred toσ, testing nowσ=sagainst σ=s+δS wherea(t) = IEG(s(t)Z) andA(t) =S(t)IEG(s(t)Z) in the case of a diﬀerentiable functionG. The interesting casesG(x) =x²and_π

2|x|are straightforward.

We now set

β^a(t) = h

(α(t)−a(t)).

Under the null hypothesis,α=athe remark following Theorem 2 implies Dp,ε,so = 1

√h

β^a^p_p−IE|Z|^p ₁

0

(Σ(t))^pdt

→ N 0,Σ²_p ,

if lim_→0h⁵/= 0. This gives a level for a test, provided we have estimated both expressions Σ²_p,

₁

0

Σ^p(t) dt

through empirical standard procedures. To this aim we only make use of the classical plug-in principle.

Passing now to the alternatives and setting γ=δ

h

, with δ=

h^(1−1/p)

_1/2 .

We haveβ^a(t) =β(t) +γA(t) +OL^p

h²

h

andγ=h^2p¹ . Obtaining

Dp,ε,so= 1

√h

βε^p_p−IE|Z|^p ₁

0

(Σ(t))^pdt

+A^p_p+ 1

√hpγ^p−1

β(t)|A(t)|^p−1sign(A(t)) dt+o(1).

(6)

We observe that

√1

(α(t)−IEα(t))|A(t)|^p−1sign(A(t)) dt→_→0N(0, σ²_a,A), for a suitable constantσ_a,A² >0.All this entails

Dp,ε,so→ N

A^p_p,Σ²_p ,

as is usual under contiguous alternatives. This provides a control of the local power for this procedure.

Even in the case where one considers the optimal window h = λε¹⁵ it is possible to work out a test of hypothesis. In this particular case we must useDp,ε,oinstead ofDp,ε,so, obtaining a similar result. This is based on Lemma 5 and the remark located after Theorem 2.

3. Collecting some facts in the case b ≡ 0

The following simple facts are essentially collected from [2]. Set

˙

σ²_ε(t) = Var ˙X_ε(t). (15)

Lemma 1. We have

Cov ( ˙X_ε(s),X˙_ε(t)) = 1 ε²

ϕ

s−u ε

ϕ

t−u ε

σ²(u) du

= 1 ε

ϕ(x)ϕ

x+t−s ε

σ²(t−εx) dx.

This expression vanishes if|t−s|>2εand we have√

εσ˙_ε(t)→ ϕ2σ(t)asε→0where the previous convergence holds uniformly on [0,1].

We often work with the following “almost” white noise process which we shall denote for simplicity’s sake Z_ε(t) = X˙_ε(t)

˙

σ_ε(t) ∼ N(0,1). (16)

Setting ρ_ε(s, t) = Cov (Z_ε(s), Z_ε(t)), note that the previous lemma implies ρ_ε(s, t) =

ϕ(x)ϕ

x+^t−s_ε σ²(t−εx) dx ϕ²(x)σ²(s−εx) dx

ϕ²(x)σ²(t−εx) dx ,

which yields

IE(β_ε(t))²∼ h

ε K(u)K(v) Cov

G(σ(t)Z_ε(t−uh)), G(σ(t)Z_ε(t−vh)) dudv.

The above covariance is a function of tand ofρ_ε(t−uh, t−vh). Using Mehler’s formula we prove that IE(β_ε(t))²∼h

ε K(u)K(v) ∞ n=1

a_2n(t−uh)a2n(t−vh)(2n)!

ϕ(x)ϕ(x +^h(v−u)_ε )σ²(t−uh−εx) dx ϕ²(x)σ²(t−uh−εx)dx

_2n dudv.

Finally, the change of variablez=^h(v−u)_ε implies IE(β_ε(t))²→Σ²(t) asε→0 where equation (11) deﬁnes Σ(t).

The processβ_ε(t) is not Gaussian (even if asymptotically Gaussian) itsL^p-norm cannot be computed using Mehler’s formula. Another way to proceed is as used in Gin´eet al. [8] using a Gaussian approximation. We ﬁrst note thatβ_ε(t) may be rewritten as the partial sum of 1-dependent random variables.

(7)

Lemma 2. SetN = 2_h

2ε

, then we haveβ_ε(t) = N k=1

ζ_k,ε(t)with

ζ_k,ε(t) = ₋¹

2+_N^k

−¹₂+^k−1_N

K(u)

G √

εσ˙_ε(t−uh)

ϕ2 ·Z_ε(t−uh)

−IEG √

εσ˙_ε(t−uh)

ϕ2 ·Z_ε(t−uh)

du, and the N random variables ζ_k,ε(t)are 1-dependent fork= 1, . . . , N.

Given that the random variablesZ_ε(t) andZ_ε(s) are independent when|s−t|>2ε. We obtain this lemma from relationh/N > ε, which also yields

Lemma 3. SetM =₁

2h

, thenβ_ε^p_p= M

=1

Y_,ε with

Y_,ε= h

ε

_p/2 _/M

(−1)/M

K(u)

G

√εσ˙_ε(t−uh)

ϕ2 ·Z_ε(t−uh)

− IEG √

εσ˙_ε(t−uh)

ϕ2 ·Z_ε(t−uh)

du ^p dt, and the M random variables Y_,ε are2-dependent for = 1, . . . , M if we assume that h >2ε.

Hence, the technique of proof of the main theorem will be based on a Lindeberg central limit theorem for m-dependent random variables. The two first moments of the above random variable are difficult to calculate directly. Thus, in order to avoid this problem we shall proceed as in Giné et al. [8]: by using a Gaussian approximation of the previous sums β_ε(t).

The proof of the main theorem will be based on the following lemmas which will provide (in particular) the asymptoticL² behaviour ofβε^p_p.

Lemma 4 (approximating expectations). Let d ∈ IN and x_1,n, . . . , x_n,n ∈ IR^d be centered at expectation, m-dependent for some integer m ≥ 0. Denoting by Var (_n

k=1x_k,n) the variance covariance matrix of the vector_n

k=1x_k,n, supposing that for some deﬁnited×dcovariance matrix V we have Var

_n

k=1

x_k,n

→n→∞ V, and n

k=1

IExk,n^3∨(dpq) →_n→∞ 0.

Denotingx_j,n= (x⁽⁾_j,n)_1≤≤d. Then, ifZ= (Z⁽¹⁾, . . . , Z^(d))∼ Nd(0, V), there exists a constant c (only depending on dand on the norm · on IR^d) such as

IE

d

=1

n k=1

x⁽⁾_k,n

p

−IE d

=1

Z⁽⁾^p ≤c

_n

k=1

IExk,n³ _δ

,

whereδ= 1−¹_q if d= 1andδ=¹₄

1−^pd+d−1_pqd

ifd≥2.

(8)

Lemma 5. Assume that lim

ε→0h= lim

ε→0

ε

h² = 0. Using notation (11), we have IEβ_ε^p_p=IE|Z|^p

₁

0

(Σ(t))^pdt+o 1

√h

, as ε→0.

If we choose the optimal windowh=λε¹⁵ and considering βˆ_ε^p_p we have IEβˆ_ε^p_p=

₁

0

|Σ(t)Z+c(t)|^pdt+o 1

√h

, asε→0.

In order to provide the asymptotic variance ofD_p,εwe precise the second order properties of the random process (β_ε(t))_t∈[0,1]. We set

β_ε(t) = β_ε(t)

Varβ_ε(t)· (17) To obtain the asymptotic behaviour of VarD_p,ε, we shall need the asymptotic behaviour of Cov (β_ε(s),β_ε(t)), easily deduced from the following lemma.

Lemma 6. Assume that lim

ε→0h= lim

ε→0

ε

h= 0, then Cov (β_ε(s), β_ε(t))∼

¹

2

−¹₂ K(u)K

u+t−s h

du

× ₁

−1

∞ n=1

a_2n(s)a_2n(t)(2n)!

σ(t) σ(s)ϕ²₂

₁

−1ϕ(x)ϕ(x+z) dx 2n

dz.

Mehler’s formula allows computing moments of non linear functionals of a Gaussian process. Hence if processβ_ε was Gaussian then we should be able to derive the asymptotic behaviour ofD_p,ε, but this is not the case. Using a Gaussian approximation of β_ε, the following lemma indicates what the asymptotic behaviour of VarD_p,ε would be. Thus, we consider the centered Gaussian process (B_ε(t))_t∈[0,1], such as

Cov (B_ε(s), B_ε(t)) = Cov (β_ε(s), β_ε(t)), ∀s, t∈[0,1].

Lemma 7. Using notations (11)–(12), we assume that lim

ε→0h= lim

ε→0

ε h = 0.

Let b_2k =_2k!¹ IE|Z|^pH_2k(Z), then

VarBε^p_p∼h ∞ k=1

b_2k(2k)!

Γ^2k(w) dw·

Σ^2p(t) dt.

Remark. Let (Z₁, Z₂) be a standard (0, I₂) normal vector, then the previous expression can be written as VarBε^p_p∼h

Cov

1−Γ²(w)Z₁+ Γ(w)Z₂^p,|Z2|^p dw·

Σ^2p(t) dt.

4. Proofs of the theorems 4.1. Proof of Theorem 1: case b ≡ 0

As quoted in Lemma 3, D_p,ε is a sum of the 2-dependent random variables (Y_k,ε−IEY_k,ε)_1≤k≤M with M =M_ε= ₁

2h

.

(9)

Now let s, t ∈ [0,1] be such as |s−t| ≤ 2ε, then it is simple to deduce from Lemma 2 that the random variable (β_ε(s), β_ε(t))∈IR²can also be written as the sum of 4-dependent vectors,x₁+· · ·+x_N. In cased= 2 Lemma 4 implies

|IE|β_ε(s)|^p|β_ε(t)|^p−IE|B_ε(s)|^p|B_ε(t)|^p| ≤ε h

^δ₂

·

Again applying Lemma 4 with d= 1 allows us to subtract expectations which ﬁnally yields

|Cov (|βε(s)|^p,|βε(t)|^p)−Cov (|Bε(s)|^p,|Bε(t)|^p)| ≤ε h

^δ₂

+o(1),

where δ is provided in Lemma 4, which is 1 ford= 2. In order to compute an approximation of VarD_p,ε we ﬁrst expand

VarD_p,ε= 1

h Cov (|βε(s)|^p,|βε(t)|^p) dsdt,

wheres, t∈[0,1] and using Lemma 3, we check that this is enough to assume|s−t| ≤2ε. We get the bound VarD_p,ε−Σ²_p≤

ε h·o(1), which is enough for our purpose.

Ifq >1 + _2p¹, then Lemma 7 yields

Var (D_p,ε)−→ε→0Σ²_p. The CLT will follow from the Lindeberg condition

η_ε=

Mε

k=1

IE 1

√h(Y_k,ε−IEY_k,ε)

⁴−→_ε→00.

Using again Lemmas 4–6 we prove that ifq≥4

IE|Y_k,ε−IEY_k,ε|⁴=O(h⁴),

because this is the expectation of a quadruple integral on a set with volumeM_ε⁻⁴ and the integrated function has an expectation uniformly bounded by 2⁴sup_t∈[0,1]IEG^4p(σ(t)Z). This yields

η_ε=O

h⁻³·h⁴ . Remark. We set

D_p,ε,t= 1

√h _t

0

(|β_ε(s)|^p−IE|β_ε(s)|^p) ds. (18) The previous proof provides a Donsker type invariance principle (form-dependent sequences, again). Sketching the expression in Theorem 1, we set

Σ²_p(t) = ₁

−1Θ(w)dw· _t

0

Σ^2p(s) ds,

andD_p,t= _t

0

Σ_p(s) dW_s for a standard Brownian motion (W_t)_t∈[0,1], then

D_p,ε,t−→^D _ε→0D_p,t, in the spaceC([0,1]). (19)

(10)

4.2. Proof of Theorem 1: the general case

Notations. To simplify the notation we add the drift parameter as an index in the underlying probability law which we now denote IP^(b) and expectations IE^(b). Hence the expression relative to the time changed Brownian motion (i.e. b≡0) can be written asE_ε⁽⁰⁾(t), and IE⁽⁰⁾, respectively.

An essential lemma links the expectations relative to IE^(b)and IE⁽⁰⁾. Under the conditions Theorem 1 p. 40 of [7]: (3) we have

Lemma 8 (Girsanov formula, e.g., in [7] p. 90). Let H :IR→IR be continuous and bounded then:

IE^(b)H(D_p,ε) =IE⁽⁰⁾

H(D_p,ε) exp ₁

0

b(X_s)

σ(s) dX_s−1 2

₁

0

b²(X_s) σ²(s) ds

·

Remark. The conditions that must verifybare restrictive see (3), however if we deal only with weak solutions we can weaken our assumptions and the Girsanov formula will still holds.

An independence argument called stable convergence, that was developed in [10], entails the convergence in distribution of D_p,ε under the general law IP^(b) with the help of the Cameron-Martin formula (see [7] p. 82), which states that

IE⁽⁰⁾exp ₁

0

b(X_s)

σ(s) dX_s−1 2

₁

0

b²(X_s) σ²(s) ds

= 1.

We thus have to prove that the couple

(X_t)_t∈[0,1], D_p,ε converges inC([0,1])×IR (under the distribution IP⁽⁰⁾) to

(X)_t∈[0,1],Σ_pZ where the Brownian motion with a time change (X)_t∈[0,1] is independent of the standard normalZ.

From now, we will only work under the probability distribution IP⁽⁰⁾. Thus the previous asymptotic independence holds if the process

(E_ε,t)_t∈[0,1]≡(X_t, D_p,ε,t)_t∈[0,1]

(with values in IR²) converges to a process (E_t)_t∈[0,1]≡(X_t, D_p,t)_t∈[0,1] asε→0 such as (X_t)_t∈[0,1] is independent ofD_p,1 (we shall prove it for (D_p,t)_t∈[0,1]).

As the family of distributions (D_p,ε,t)_t∈[0,1] converges under the probability distribution IP⁽⁰⁾ asε→0, this implies its tightness in C([0,1]) hence the process (E_ε,t)_t∈[0,1] is also tight in C([0,1],IR²). Let us consider now any limit point (E_t)_t∈[0,1] ≡(X_t, D_p,t)_t∈[0,1] (in distribution) of this family, asε→0. The random vector (D_p,ε,t−D_p,ε,s, X_t−X_s) is independent (always under IP⁽⁰⁾) from D_p,ε,t −D_p,ε,s (if s≤ t ≤s ≤t satisfy s−t >2ε) and it is also independent ofX_t −X_s because the intervals [s, t] and [s, t] do not overlap. This implies that the process (E_t)_t∈[0,1] has independent increments. This process is also a second order process because each of its coordinates has this property. Hence (E_t)_t∈[0,1] is a Gaussian process.

Independence of E_t coordinates now relies on their orthogonality. The only point we need to prove is thus that under the probability distribution IP⁽⁰⁾, we have

Cov (X_s, D_p,ε,t)−→ε→00, ∀s, t∈[0,1].

Note that

√1 hCov

X_s,

_t

0

|βε,u|^pdu

= 1

√h _t

0

Cov (X_s,|βε,u|^p) du. (20) In order to proceed we ﬁrst write A_ε(s, u) = Cov (X_s,|βε,u|^p) = 0 if u > s+ε. Now we deduce that

√1 h

_s+ε

s−2ε

A_ε(s, u) du=O ε

√h

from the relation sup_t∈[0,1]IE|β_ε(t)|^2p<∞.

(11)

We thus only need to consider 1

√h _s−2ε

0

A_ε(s, u) du. Recall that

β_ε,u= h

ε( ˆα_ε(u)−E( ˆα_ε(u))), where

ˆ α_ε(u) =

_1/2

−1/2

K(u)G √

εσ˙_ε(u−vh)

||ϕ||2 ·Z_ε(u−vh)

dv.

By making the change of variablev=_h^εw, we get ε

h _1/2

−1/2K ε

hw

G √

εσ˙_ε(u−εw)

||ϕ||2

·Z_ε(u−εw)

dw:= ε hα˜_ε(u).

Thus, we can write

β_ε,u= ε

h( ˜α_ε(u)−E( ˜α_ε(u))).

Conditioning w.r.tX_s=xwe obtain:

Z_ε(t−εw) =√

εν_t,w,εx+H_t,w,ε,

for some uniformly bounded and deterministicν_u,w,εand some GaussianH_u,w,ε, because Cov (X_s, Z(u−wh)) = O(√

ε).

We now deﬁne

˜˜ α(u) =

_1/2

−1/2K ε

hw

G √

εσ˙_ε(u−εw)

||ϕ||2

·H_u,w,ε

dw, and

β˜_ε,u= ε

h

α˜˜_ε(u)−Eα˜˜_ε(u) . We are interested in obtaining the asymptotic behavior of

√1 h

Cov (X_s,|βε,u|^p)−Cov

X_s,β˜_ε,u^p, fort < s−2. Note that the second term in the above diﬀerence is 0. Moreover

√1 h

Cov (X_s,|βε,u|^p)−Cov

X_s,β˜_ε,u^p= 1

√h E

X_s

|βε,u|^p−β˜_ε,u^p. (21)

The inequality||x+y|^p− |x|^p| ≤p|y|(|x|^p−1+|y|^p−1), entails (21)≤ p

√hE|Xs|

β_ε,u−β˜_ε,u

|βε,u|^p−1+β_ε,u−β˜_ε,u^p−1

≤ p

√h

E|Xs|²β_ε,u−β˜_ε,u² _1/2

O(1).

Hence we get

√p h

E|Xs|²β_ε,u−β˜_ε,u² _1/2

≤

√2p

√h ε

h _p/2

E

|Xs|²α˜_ε(u)−α˜˜_ε(u)² +

E|Xs|²Eα˜_ε(u)−α˜˜_ε(u)²_1/2 .

(12)

But we have

˜

α_ε(u)−α˜˜_ε(u) =√ ε

_1/2

−1/2K ε

hv

ν_u,v,εX_sG λ₁√

εν_u,v,εX_s+λ₂H_u,v,ε dv.

This yields

E

|X_s|²α˜_ε(u)−α˜˜_ε(u)²

≤εO(1), by using the assumption aboutG(x). Therefore

√1 h

Cov (X_s,|β_ε,u|^p)−Cov

X_s,β˜_ε,u^p≤ε h

^p+1₂ O(1).

This last term tends to 0 when lim

ε→0

ε

h= 0. All this implies

√1 h

_s−2ε

0

A_ε(s, u) du→0.

4.3. Proof of Proposition 1

We write

b_ε(t) =

K(s)IEG

√εσ(t˙ −hs) ϕ2

Z

ds−α(t).

Using Lemma 1, settingθ=σ², we obtain the following uniform estimates √

εσ(t˙ −hs) ϕ2

2

=θ(t)−shθ(t) +1

2s²h²θ(t) +o h² .

Consider the functiong(x) =G(

|x|), thusgis alsoa.s.twice diﬀerentiable and

b_ε(t) =

K(s)IEg √

εσ(t˙ −hs) ϕ2

2

Z²

ds−IEg

θ(t)Z² .

Use of Taylor formula yields b_ε(t) = IE

K(s)

−shθ(t) +1

2s²h²θ(t)

Z²g

θ(t)Z² + 1

2s²h²θ²(t)Z⁴g

θ(t)Z² ds+o(h²).

Using symmetries yields with the relationg(u) =G(u²), b_ε(t) = h²

2

s²K(s) ds·IE

σ(t)ZG(σ(t)Z) +σ²(t)Z²G(σ(t)Z) +o h² .

The remark concerning the C³ case follows from careful statements of the above relation with the bound ε²=o(h³).

4.4. Proof of Theorem 2

As done previously, we make use of the stable convergence argument in order to deal only with the simpler caseb≡0. We assume below that b≡0.

(13)

We writeβ_ε(t) =β_ε(t) +c_ε(t), with c_ε(t) =

h εb_ε(t) =

h⁵ ε

a(t)

s²K(s) ds+o(1)

. (22)

Thus

Dp,ε = 1

√h ₁

0

(|βε(t) +c_ε(t)|^p−IE|βε(t) +c_ε(t)|^p) dt.

Proof of relation (14). Using the bound

|β+c|^p− |β|^p≤p|c|

|β|^p−1+|c|^p−1 , we write the following integral (still with|s−t| ≤2ε)

Var (Dp,ε−D_p,ε)≤ 1

h IE(|β_ε(s)|^p− |βε(s)|^p)(|β_ε(t)|^p− |βε(t)|^p)dsdt.

Assuming lim_ε→0h⁵/ε= 0 we obtain lim_ε→0Var (Dp,ε−Dp,ε) = 0. The following facts: sup_s∈[0,1]βε(s)_2p−1≤

sup_s∈[0,1]βε(s)2p<∞andε/h→0, imply (14).

Proof of Theorem 2. Replacingβ byβ+candc byc_ε−c, we prove as above that

ε→0limVar

Dp,ε−Dp,ε

= 0, where

D_p,ε= 1

√h ₁

0

(|β_ε(t) +c(t)|^p−IE|β_ε(t) +c(t)|^p) dt.

The proof of Theorem 2 follows the same lines as that of Theorem 1 up to very simple changes in Lemma 4.

This lemma was indeed dedicated to the approximation of IEf(x₁+· · ·+x_n) for m-dependent vector valued sequences and for the special functionf(x₁, . . . , x_d) = _d

=1|x|^p. Very small changes entail the same result withf(x₁, . . . , x_d) =_d

=1|x+c|^p for ﬁxed real numbersc₁, . . . , c_d.Indeed, one may easily rewrite a version of this lemma for which the measurable function f only satisﬁes|f(x₁, . . . , x_d)| ≤_d

=1|x|^p∨1.

5. Proofs of the lemmas in Section 3 5.1. Proof of Lemma 4

The proofs are diﬀerent for d= 1 andd≥2.

Case d= 1. . Shergin [16] (Th. 1) proved that

∆_n = sup

x∈IR PI

_n

k=1

x_k,n ≤x

−P(ZI ≤x) ≤c

n k=1

IExk,n³.

Recall that the following relation holds for each random variable inL^p IE|X|^p=p

_∞

0

x^p−1P(|XI |> x) dx,

(14)

hence the diﬀerence of expectations to approximate is an integral over IR = (−∞,∞). Divide it for|x| ≤M ≡

∆⁻

pq1

n and|x|> M. Rosenthal inequality [15] up to orderpq(this also holds withm-dependent sequences since sums may be rewritten asmsum of independent variables) and Markov inequality provide a bound for the the second term while the ﬁrst one is bounded by using the previous result in [16].

Case d≥2. In order to handle the same technique as above, we need to develop a bound analogue as that in [16].

Lemma 9. Assuming the assumptions in Lemma 4, then

∆_n= sup

x∈IR⁺ PI

n k=1

x_k,n ≤x

−P(I Z ≤x) ≤c

_n

k=1

IExk,n³ ¹₄

.

Notation. For simplicity’s sake, set

∆_n(x) = PI

n k=1

x_k,n ≤x

−P(I Z ≤x) .

The proof of Lemma 4 now follows the same lines as for d= 1 up to the following expressions IE|X₁· · ·X_d|^p =p^d

_∞

0

· · · _∞

0

|x₁· · ·x_d|^p−1(1−P(XI ₁≤x₁, . . . , X_d≤x_d)) dx₁· · ·dx_d.

For example using(x₁, . . . , x_d)= max{|x₁|, . . . ,|x_d|}implies that the diﬀerence of product moments to bound is bounded above by

c_p _∞

0

x^pd+d−1∆_n(x) dx,

for a constantc_p>0 only depending onp. Using the same techniques as above, yields the result, by truncating at a levelM >0 such as

M^−4pqd= n k=1

IEx_k,n³.

5.2. Proof of Lemma 9

The proof will use the following lemma which is an easy extension of [14] to a vector valued case.

Lemma 10(Lindeberg-Rio form-dependent sequences). Letd∈IN. Let x₁, . . . , x_n∈IR^d be centered at expec- tation, m-dependent and such as IExk³ <∞ for k= 1, . . . , n. Then there exists an independent succession y₁, . . . , y_n of centered d-dimensional random vectors with the following property. Let f : IR^d → IR be a C³- function with bounded partial derivatives of order 3 (write f_∞= sup_{{s, h}_i ≤1;i=1,2,3}f(s)(h₁, h₂, h₃)), then if we consider

∆_n(f) =IE(f(x₁+· · ·+x_n)−f(y₁+· · ·+y_n)), there exists a constant c >0 such as

|∆n(f)| ≤cf_∞ n k=1

IExk³.

Remarks. In view of the theorem relative to the equivalence of the norms in thed-dimensional space we may choose any norm on IR^d and the constantconly depends on this norm and onm.

A simple use of Taylor formula at the origin and with order 3 proves that expression ∆_n(f) is well deﬁned.