

ON ESTIMATING THE DIFFUSION COEFFICIENT:

PARAMETRIC VERSUS NONPARAMETRIC

Marc HOFFMANN

Laboratoire de Probabilités et Modèles Aléatoires, CNRS-UMR 7599, Université Paris 6 et 7, 16, rue Clisson, 75013 Paris, France
E-mail address: hoffmann@math.jussieu.fr (M. Hoffmann)

Received 2 September 1999, revised 29 August 2000

ABSTRACT. – We consider the following problem: estimate the Lipschitz continuous diffusion coefficient $\sigma^2$ from the path of a 1-dimensional diffusion process sampled at times $i/n$, $i = 0, \ldots, n$, when we believe that $\sigma^2$ actually belongs to a smaller regular parametric set $\Sigma_0$. By introducing random normalizing factors in the risk function, we obtain confidence sets which can be essentially better than the minimax rate $n^{-1/3}$ of estimation for Lipschitz functions in diffusion models. With a prescribed confidence level $\alpha_n$, we show that the best possible attainable (random) rate is $(\log \alpha_n^{-1}/n)^{2/5}$. We construct an optimal estimator and an optimal random normalizing factor in the sense of Lepski (1999). This has some consequences for classical estimation: our procedure is adaptive w.r.t. $\Sigma_0$ and enables us to test the hypothesis that $\sigma^2$ is parametric against a family of local alternatives with prescribed first- and second-type error probabilities. © 2001 Éditions scientifiques et médicales Elsevier SAS

AMS classification: 62G05; 62M05

1. Introduction

In this paper, we study the statistical estimation of the diffusion coefficient when one observes a 1-dimensional diffusion process at times $i/n$, $i = 0, \ldots, n$, and asymptotics are taken as $n \to \infty$. The sample size increases not because of a longer observation period but, rather, because of more frequent observations. This setting has been addressed by several authors, from both a parametric and a nonparametric point of view.

A brief summary of the state of the art yields the following conclusions:

(1) In regular parametric models, the LAMN property holds with rate $1/\sqrt{n}$ (see Dohnal [1], or more recently Gobet [3]), but the MLE is not tractable in general. Computationally fast methods based on contrasts are known and possess good optimality properties as far as rates of convergence are concerned (Genon-Catalot and Jacod [2]).



(2) For nonparametric models, if the diffusion coefficient has smoothness of order $s$ (in a Sobolev or Hölder sense, for instance, but this can easily be embedded in a Besov space framework), estimators based on kernels (see Jacod [8]) or linear wavelet techniques (see [6]) achieve the rate $n^{-s/(1+2s)}$. This holds, of course, under some restrictions which are specific to diffusion processes. Moreover, the rate $n^{-s/(1+2s)}$ has been proved to be optimal in the minimax sense when the diffusion coefficient possesses bounded derivatives up to order 2 (see [6]). However, from a practical point of view, the methods proposed have drawbacks and are not always easily implementable on numerical data.

(3) But for nonparametric models with a low order of smoothness (precisely: with a diffusion coefficient no more regular than Lipschitz continuous), the Nadaraya–Watson estimator, introduced in this context by Florens in [4] – which historically is the first nonparametric estimator of the diffusion coefficient – has good convergence properties and is easy to implement in practice. (In this paper, we also complete Florens' results by showing that the Nadaraya–Watson estimator achieves the rate $n^{-1/3}$ if the diffusion coefficient is Lipschitz continuous and that this rate is optimal in the minimax sense.)

A caricatural synthesis could be the following: theoretically optimal and computationally fast methods are known when the underlying model is either parametric and regular (take then the contrast estimators of Genon-Catalot and Jacod; the optimal rate $1/\sqrt{n}$ is achievable) or nonparametric with a Lipschitz continuous diffusion coefficient (take the Nadaraya–Watson estimator of Florens; the optimal rate $n^{-1/3}$ is achievable).

In this paper, we address the following problem: how can we combine both technologies, and what precise mathematical consequences can we derive? We believe that such a question has some importance in practice: given two different methods, a practitioner – motivated by a specific experiment in, e.g., finance, biology or physics – would legitimately ask which one to choose. Of course, prior knowledge, intuition, suspicion or a guess about the underlying (parametric) structure of the model usually exists, and this should be taken into account, even at a mathematical level. Also, the answer we want to give must be numerically feasible and must quantify precisely the consequences of the choice (parametric versus nonparametric), especially when the initial suspicion turns out to be wrong. Our angle is thus the following: we believe that the diffusion coefficient has a given regular parametric structure, but we wish to take into account the possibility that this prior intuition is wrong, in which case the diffusion coefficient could be any Lipschitz continuous function within a certain nonparametric class.

To formulate and solve this problem mathematically, we use the notion of minimax risk with random normalizing factors, which is based on adaptive estimation and nonparametric testing (Theorems 1 and 2 below). The method is easily tractable on numerical data. The ideas developed here rely heavily on the work of Lepski [11]. However, Lepski considers in [11] a slightly different problem, in the white noise model context. Therefore, both the techniques and the answers given here differ somewhat from his paper, and we borrow his formalism and mathematical devices rather than complete his theory.

Note also that our approach is different from robustness, where misspecified models are allowed. In general, such models are defined around tubular neighbourhoods of the original parametric model, at a distance vanishing as $n \to \infty$, an assumption we do not have to make here.

A by-product of our approach is that we complete Florens’ paper [4] (and also [6]) by showing that her estimator is optimal in the minimax sense under squared-error loss for Lipschitz continuous diffusion coefficients (Proposition 2), but we know from [9] that this is no longer true for a diffusion coefficient with a higher degree of smoothness.

1.1. Statistical setting

We observe $X^n = (X_{i/n},\ i = 0, \ldots, n)$, where $(X_t)_{t \in [0,1]}$ is a 1-dimensional diffusion process of the form
$$X_t = x_0 + \int_0^t b(s, X_s)\,ds + \int_0^t \sigma(X_s)\,dW_s, \qquad t \in [0,1], \tag{1.1}$$
with $x_0 \in \mathbb{R}$, $W$ a standard Brownian motion, $b$ smooth, and $\sigma$ Lipschitz continuous and nonvanishing. Our aim is to estimate the function $(\sigma^2(x),\ x \in I)$ for an arbitrary compact interval $I$. In this setting, the drift $b$ cannot be identified from the data and is a nuisance parameter.

Formally, we take $X$ as the canonical process on the space $\Omega = C([0,1], \mathbb{R})$ of continuous functions equipped with the norm of uniform convergence, endowed with its Borel $\sigma$-field $\mathcal{F}$. We assume that the drift $b$ has linear growth, therefore (1.1) has a unique solution. We further denote by $P_{\sigma^2}$ the probability measure on $(\Omega, \mathcal{F})$ under which $X$ solves (1.1).
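To make the sampling scheme concrete, here is a minimal simulation sketch (ours, not the paper's): it generates discrete data $X_{i/n}$ from a model of the form (1.1) by an Euler–Maruyama scheme. The drift and diffusion functions, the starting point and the sample size are all illustrative choices.

```python
import numpy as np

def sample_path(n, x0=0.0, b=lambda t, x: -x,
                sigma=lambda x: 1.0 + 0.5 * np.cos(x), seed=0):
    """Euler-Maruyama approximation of (1.1) on [0, 1]; returns X_{i/n}, i = 0..n."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n
    X = np.empty(n + 1)
    X[0] = x0
    for i in range(n):
        dW = rng.normal(scale=np.sqrt(dt))  # Brownian increment over [i/n, (i+1)/n]
        X[i + 1] = X[i] + b(i * dt, X[i]) * dt + sigma(X[i]) * dW
    return X

X = sample_path(n=10_000)  # the discrete observations X^n
```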

There are several ways of assessing the quality of an estimation procedure. First, the estimation of $\sigma^2(x)$ at a point $x \in I$ is meaningful only if the process $X$ hits the point $x$ before time 1, or if $L_1^x(X) > 0$, where
$$L_1^x(X) = \lim_{\varepsilon \to 0} \frac{1}{2\varepsilon} \int_0^1 \mathbf{1}_{\{|X_s - x| \le \varepsilon\}}\,ds$$
denotes the local time of $X$ at level $x$ and time 1. So if
$$D(x, \nu) = \big\{\omega:\ L_1^x(X)(\omega) \ge \nu\big\},$$
we shall restrict our attention to the set $D(x, \nu)$ for a given $\nu > 0$, fixed throughout the paper. However, the set $D(x, \nu)$ is not observable, therefore it is better for practical purposes to replace – as in Jacod [8] – the set $D(x, \nu)$ by a set $D_n(x, \nu)$ measurable w.r.t. the $\sigma$-field $\mathcal{G}_n$ generated by the $X_{i/n}$, $i = 0, \ldots, n$, at stage $n$. To do so, we introduce the following empirical local time

$$L_n^x\big(X^n\big) = \frac{1}{2n^{2/3}} \sum_{i=1}^n \mathbf{1}_{\{X_{(i-1)/n} \in [x - n^{-1/3},\, x + n^{-1/3}]\}} \tag{1.2}$$
which converges to $L_1^x(X)$ as $n \to \infty$ (see, e.g., [9]). The choice of the bandwidth $n^{-1/3}$ will prove to be technically useful. Define
$$D_n(x, \nu) = \big\{\omega:\ L_n^x\big(X^n\big)(\omega) \ge \nu\big\}$$

accordingly. We will further restrict our attention to $D_n(x, \nu)$. For $c > 1$, let $\Sigma_c = \{f: \mathbb{R} \to \mathbb{R};\ c^{-1} \le f(x) \le c \text{ for all } x\}$ and define the Lipschitz class
$$\Sigma = \Sigma_c(L) = \big\{f: \mathbb{R} \to \mathbb{R};\ |f(x) - f(y)| \le L|x - y|\big\} \cap \Sigma_c.$$
The space $\Sigma$ describes the minimal smoothness properties we require for the unknown parameter $\sigma^2$.
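As a quick illustration (our code, not the paper's), the empirical local time (1.2) and the event $D_n(x, \nu)$ can be computed directly from the sample; `X` refers to the simulation sketch above, and the threshold `nu` is an arbitrary illustrative value.

```python
def empirical_local_time(X, x):
    """Empirical local time L_n^x(X^n) of (1.2), with bandwidth n^{-1/3}."""
    n = len(X) - 1
    h = n ** (-1.0 / 3.0)
    # count the X_{(i-1)/n}, i = 1..n, falling in [x - h, x + h]
    hits = np.sum(np.abs(X[:-1] - x) <= h)
    return hits / (2.0 * n ** (2.0 / 3.0))

nu = 0.05                                    # illustrative threshold
in_Dn = empirical_local_time(X, 0.0) >= nu   # indicator of the event D_n(0, nu)
```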

An estimator $T_n = (T_n(X^n, x),\ x \in I)$ of $(\sigma^2(x),\ x \in I)$ is a function which is $\mathcal{G}_n \otimes \mathcal{B}(I)$-measurable; we evaluate its performance uniformly over $\Sigma$ by means of its minimax risk
$$\mathcal{R}_n(T_n, \Sigma, \varphi_n) = \sup_{\sigma^2 \in \Sigma} E_{\sigma^2}\bigg[\varphi_n^{-2} \int_I \big(T_n\big(X^n, x\big) - \sigma^2(x)\big)^2\,dx\ \Big|\ D_n^I(\nu)\bigg],$$
where $D_n^I(\nu) = \bigcap_{x \in I} D_n(x, \nu)$ and $\varphi_n > 0$ is a normalizing factor. Of course, the finiteness of $\mathcal{R}_n$ will only be meaningful if $\varphi_n \to 0$ as $n \to \infty$. Here, $E_{\sigma^2}$ means integration with respect to the probability $P_{\sigma^2}$. Thus we measure the quality of estimation in integrated quadratic loss, conditional on the event $D_n^I(\nu)$.

1.2. Statement of the problem and objectives

An estimator $T_n^\star$ is said to attain an optimal rate of convergence $\varphi_n(\Sigma)$ if
$$\limsup_{n \to \infty} \mathcal{R}_n\big(T_n^\star, \Sigma, \varphi_n(\Sigma)\big) < +\infty \tag{1.3}$$
and no estimator can attain a better rate:
$$\liminf_{n \to \infty} \inf_{T_n} \mathcal{R}_n\big(T_n, \Sigma, \varphi_n(\Sigma)\big) > 0, \tag{1.4}$$
where the infimum is taken over all estimators. In Section 2, we show that $\varphi_n(\Sigma) = n^{-1/3}$ is an optimal rate of convergence and prove that the Nadaraya–Watson estimator, introduced in this context by Florens in [4], attains the optimal rate (see also [8]). We understand $\varphi_n(\Sigma)$ as an accuracy of estimation: for any confidence level $\alpha > 0$, we guarantee from (1.3) the existence of an (explicitly computable) $\gamma_\alpha > 0$ such that
$$\sup_{\sigma^2 \in \Sigma} P_{\sigma^2}^{n,\nu}\big(\big\|T_n^\star - \sigma^2\big\|_I \ge \gamma_\alpha\, \varphi_n(\Sigma)\big) \le \alpha, \tag{1.5}$$
where $P_{\sigma^2}^{n,\nu}(\cdot) = P_{\sigma^2}\{\cdot \mid D_n^I(\nu)\}$ and $\|f\|_I = (\int_I f^2(x)\,dx)^{1/2}$. Furthermore, in the optimality sense described by (1.4), this accuracy is the best one achievable uniformly over $\Sigma$.

However, suppose we suspect $\sigma^2$ to actually lie in a smaller parametric set, namely $\sigma^2 \in \Sigma_0 = \Sigma_0(I)$ given by
$$\Sigma_0(I) = \big\{f:\ f(x) = \sigma_0^2(x, \theta),\ \theta \in \Theta,\ x \in I\big\},$$
where $\Theta \subset \mathbb{R}^s$, $s \ge 1$, is given and the function $\sigma_0^2(\cdot, \theta)$ is known up to $\theta$.

Under some regularity assumptions on $\Theta$ and $\sigma_0^2$ (see Section 2 below), an optimal rate of convergence over $\Sigma_0$ is $\varphi_n(\Sigma_0) = n^{-1/2}$ and is attained by the least-squares estimator $T_n(\Sigma_0) = \sigma_0^2(\cdot, \hat\theta_n)$, where
$$\hat\theta_n = \arg\min_{\theta \in \Theta} \sum_{i=1}^n \Big(\big(\Delta_i^n X\big)^2 - \sigma_0^2\big(X_{(i-1)/n}, \theta\big)\Big)^2 \tag{1.6}$$
and where we denote by $\Delta_i^n X = \sqrt{n}\,(X_{i/n} - X_{(i-1)/n})$ the normalized increments of the observed process. Based on the hypothesis $\sigma^2 \in \Sigma_0$, we can hope to improve the accuracy of estimation. A traditional way of improvement is the so-called adaptive approach.

1.2.1. The adaptive approach

Intuitively, a practitioner would presumably: (1) test the hypothesis $\sigma^2 \in \Sigma_0$; (2) upon acceptance of the test, choose the parametric estimator $T_n(\Sigma_0)$; (3) keep the nonparametric estimator $T_n^\star$ otherwise. From a mathematical point of view, such a procedure – call it temporarily $T_n^{(a)} = T_n^{(a)}(x, X^n)$ – is admissible if it adapts to the pair $(\Sigma_0, \Sigma)$ in the following sense: define the adaptive rate

$$\psi_n\big(\sigma^2\big) = \begin{cases} n^{-1/2} & \text{if } \sigma^2 \in \Sigma_0, \\ n^{-1/3} & \text{if } \sigma^2 \in \Sigma \setminus \Sigma_0. \end{cases} \tag{1.7}$$
Then $T_n^{(a)}$ should verify
$$\limsup_{n \to \infty} \mathcal{R}_n\big(T_n^{(a)}, \Sigma, \psi_n(\cdot)\big) < +\infty. \tag{1.8}$$

However, even if we have satisfied the adaptive criterion (1.8), we are unable to state any accuracy of the method: since $\psi_n = \psi_n(\sigma^2)$ depends on the unknown, we cannot provide any confidence set of the type (1.5).

In this paper, we propose an alternative approach by introducing a procedure based on random normalizing factors (r.n.f. for short), following Lepski [11] and [7]. This will enable us to improve the accuracy of estimation in the sense of (1.5). We will even show that a procedure based on r.n.f. can simultaneously give an improvement of accuracy and be adaptive in the sense of (1.8).

1.2.2. Random normalizing factors

We introduce the class of observable normalizing factors (r.n.f.)
$$\mathcal{N}_n = \big\{\rho_n \in \big(0, \varphi_n(\Sigma)\big]:\ \rho_n \text{ is } \mathcal{G}_n\text{-measurable}\big\},$$
where $\mathcal{G}_n$ is the $\sigma$-field generated by the observations $X_{i/n}$, $i = 0, \ldots, n$. Clearly, any estimator $T_n$ satisfying
$$\limsup_{n \to \infty} \mathcal{R}_n(T_n, \Sigma, \rho_n) < \infty \quad \text{for some } \rho_n \in \mathcal{N}_n \tag{1.9}$$
attains the optimal rate of convergence over $\Sigma$. But in contrast to an adaptive estimator, we now guarantee the existence of an explicitly computable $\gamma_\alpha$ from (1.9) such that for any $\alpha > 0$:
$$\sup_{\sigma^2 \in \Sigma} P_{\sigma^2}^{n,\nu}\big(\big\|T_n - \sigma^2\big\|_I \ge \gamma_\alpha\, \rho_n\big) \le \alpha,$$
where we have set – recall (1.5) – $P_{\sigma^2}^{n,\nu}(\cdot) = P_{\sigma^2}\{\cdot \mid D_n^I(\nu)\}$. This provides us with a new (possibly random) accuracy of estimation. The possibility that $\sigma^2$ belongs to $\Sigma_0$ may give $\rho_n$ a value essentially better than $\varphi_n(\Sigma) = n^{-1/3}$ with some probability, while still ensuring a confidence set uniformly over $\Sigma$.

Next, we need a consistent way to compare two r.n.f. in order to define an optimality criterion. Since an r.n.f. is random, we introduce the following (deterministic) characteristic:

DEFINITION 1. – For a given confidence level $0 < \alpha_n < 1$, the characteristic of $\rho_n \in \mathcal{N}_n$ is
$$\chi_n(\rho_n) = \inf\Big\{t \in \big(0, \varphi_n(\Sigma)\big]:\ \inf_{\sigma^2 \in \Sigma_0} P_{\sigma^2}^{n,\nu}(\rho_n \le t) \ge 1 - \alpha_n\Big\}.$$
Note that $\chi_n(\rho_n)$ depends on $\alpha_n$ and on $\Sigma_0$.

Remark. – A heuristic way to understand this definition is the following: let us fix $t > 0$ "small", i.e. at least smaller than $\varphi_n(\Sigma)$. What we require is that a "good" $\rho_n$ will provide an improvement of accuracy if the guess $\sigma^2 \in \Sigma_0$ turns out to be true. This means that under $P_{\sigma^2}^{n,\nu}$, for $\sigma^2 \in \Sigma_0$, the event "$\rho_n \le t$" has a controlled probability. Mathematically, we translate this idea by saying that for a given confidence level $\alpha_n$, we guarantee that
$$\inf_{\sigma^2 \in \Sigma_0} P_{\sigma^2}^{n,\nu}(\rho_n \le t) \ge 1 - \alpha_n. \tag{1.10}$$
Next, the smaller the $t$ we can find such that (1.10) holds, the better $\rho_n$; hence $\chi_n(\rho_n)$ is defined as the infimum of the $t$ providing (1.10).

We now have a canonical way to compare r.n.f. We naturally derive the following optimality criterion:

DEFINITION 2. – $\rho_n^\star \in \mathcal{N}_n$ is optimal (or $\alpha$-optimal) w.r.t. $(\Sigma, \Sigma_0)$ if:
(i) There exists an estimator $T_n^{\star\star}$ such that
$$\limsup_{n \to \infty} \mathcal{R}_n\big(T_n^{\star\star}, \Sigma, \rho_n^\star\big) < +\infty.$$
(ii) For any $\rho_n \in \mathcal{N}_n$ such that
$$\frac{\chi_n(\rho_n)}{\chi_n(\rho_n^\star)} \to 0 \quad \text{as } n \to \infty,$$
we have
$$\liminf_{n \to \infty} \inf_{T_n} \mathcal{R}_n(T_n, \Sigma, \rho_n) = +\infty,$$
where the infimum is taken over all estimators.


Remark 2. – Following Lepski, we call $T_n^{\star\star}$ an $\alpha$-adaptive estimator. Note that by definition, we always have $\rho_n^\star \le \varphi_n(\Sigma)$. Thus an $\alpha$-adaptive estimator is optimal with respect to $\Sigma$. Also, note that an $\alpha$-optimal random normalizing factor may in general depend on the quantity $\alpha_n$.

The aim of this paper is to construct an optimal random normalizing factor w.r.t. $(\Sigma, \Sigma_0)$ following Definition 2, and an $\alpha$-optimal estimator accordingly.

1.3. Organization of the paper

In Section 2, we recall and adapt some facts about the statistical estimation of the diffusion coefficient from discrete observations. The nonparametric kernel estimator of Section 2.2 was introduced in [4] and later generalized to our setting in [8]. However, we give a self-contained proof of the upper bound. The lower bound is new, and follows the same strategy as Proposition 1 in [6], with new technicalities.

Section 3 is devoted to the construction of an optimal r.n.f. for the diffusion coefficient under a parametric hypothesis. We discuss the link to adaptive estimation and show how a slight modification of the optimal ($\alpha$-adaptive) estimator enables us to obtain simultaneously an optimal accuracy of estimation and an adaptive estimator. Links to testing are also mentioned. The proofs are deferred to Section 4.

2. Preliminary results

2.1. Parametric estimation

We need the following regularity assumptions on $\Theta$ and $\sigma_0^2$.

Assumption A. – We have:
(1) The set $\Theta$ is compact in $\mathbb{R}^s = \mathbb{R}$.
(2) For some $M = M_I > 0$, $\sup_{x \in I} |\sigma_0^2(x, \theta_1) - \sigma_0^2(x, \theta_2)| \le M|\theta_1 - \theta_2|$ for $\theta_1, \theta_2 \in \Theta$.
(3) For some $\eta = \eta_I > 0$, $\inf_{(x,\theta) \in I \times \Theta} \big|\tfrac{d}{d\theta}\sigma_0^2(x, \theta)\big| \ge \eta$.
(4) The functions $\tfrac{d^2}{dx^2}\sigma_0^2$ and $\tfrac{d}{dx}\tfrac{d}{d\theta}\sigma_0^2$ are well defined and continuous on $I \times \Theta$.
(5) The equality $\sigma_0^2(x, \theta_1) = \sigma_0^2(x, \theta_2)$ for all $x \in I$ implies $\theta_1 = \theta_2$.

Remark 1. – The above assumptions are standard in parametric estimation but do not claim to be minimal. For instance, A5 can be relaxed, but known extensions are usually difficult to check (see, e.g., [2]).

By Itô's formula, the model given by (1.1) can be recast in a regression setting, having
$$\big(\Delta_i^n X\big)^2 = n\int_{(i-1)/n}^{i/n} \sigma^2(X_s)\,ds + \varepsilon_i^n + \text{a higher order term}, \tag{2.11}$$
where $\varepsilon_i^n = 2n\int_{(i-1)/n}^{i/n}\big(X_s - X_{(i-1)/n}\big)\,dW_s$ is a martingale increment. Thus we observe, on the non-uniform random grid $\big(X_{(i-1)/n},\ i = 1, \ldots, n\big)$, the values $n\int_{(i-1)/n}^{i/n}\sigma^2(X_s)\,ds \approx \sigma^2\big(X_{(i-1)/n}\big)$, contaminated by the noise $\varepsilon_i^n$, plus a negligible drift effect. From this formulation, we readily obtain the least-squares estimator $T_n(\Sigma_0) = \sigma_0^2(\cdot, \hat\theta_n)$, where $\hat\theta_n$ is defined by (1.6).
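On simulated data, the regression analogy (2.11) can be visualized by plotting the squared normalized increments against the lagged states (a sketch under the same illustrative model as in the simulation above):

```python
import matplotlib.pyplot as plt

n = len(X) - 1
plt.scatter(X[:-1], n * np.diff(X) ** 2, s=2, alpha=0.3)  # (X_{(i-1)/n}, (Delta_i^n X)^2)
xs = np.linspace(X.min(), X.max(), 200)
plt.plot(xs, (1.0 + 0.5 * np.cos(xs)) ** 2, "r")          # sigma^2 of the simulated model
plt.xlabel("lagged state")
plt.ylabel("squared normalized increment")
plt.show()
```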

PROPOSITION 1. – Grant Assumption A. Then $\varphi_n(\Sigma_0) = n^{-1/2}$ is an optimal rate of convergence and is attained by $T_n(\Sigma_0)$.

The proof of the upper bound is readily obtained from Theorem 1 in [2]. We simply added ad hoc assumptions in order to obtain the uniformity in $\theta \in \Theta$ for the integrated risk. The proof of the lower bound follows from the LAMN property of the parametric model (see, e.g., [1]).

Remark 2. – Note that $\hat\theta_n$ is not the best available parametric estimator of $\theta$ – it is not equivalent to the MLE – but since we focus on rates of convergence only, this intuitively simple choice is sufficient.

2.2. Nonparametric estimation

We assume (with no loss of generality as far as practical considerations are concerned) that $I$ is on a dyadic scale, namely
$$I = \big[k_0 2^{-j_0}, k_1 2^{-j_0}\big] \quad \text{for some integers } k_0, k_1, j_0.$$
For integers $(k, j)$, let $I_{jk} = [k2^{-j}, (k+1)2^{-j})$ and define, for $j \ge j_0$,
$$V_j^I = \big\{f \in L^2:\ f \text{ is constant on each } I_{jk},\ I_{jk} \subset I\big\},$$
the finite element space of functions which are piecewise constant on $I$ over a grid of mesh $2^{-j}$. Indeed, an orthogonal basis for $V_j^I$ is given by the family
$$\varphi_{jk} = 2^{j/2}\varphi\big(2^j \cdot - k\big), \quad k \text{ such that } I_{jk} \subset I,$$
where $\varphi = \mathbf{1}_{[0,1)}$. We estimate $\sigma^2$ by an element of $V_{j_n}^I$, for some projection level $j_n$ chosen in accordance with the asymptotics in the following way. Let

$$\hat c_{j_n k} = \frac{1}{n\,\hat\mu_{j_n k}} \sum_{i=1}^n \big(\Delta_i^n X\big)^2 \varphi_{j_n k}\big(X_{(i-1)/n}\big)$$
(with $0/0 = 0$), where $\hat\mu_{j_n k} = \frac{2^{j_n}}{n}\sum_{i=1}^n \mathbf{1}_{\{X_{(i-1)/n} \in I_{j_n k}\}}$. Informally, we use the regression analogy
$$\big(\Delta_i^n X\big)^2 \approx \sigma^2\big(X_{(i-1)/n}\big) + \varepsilon_i^n$$
defined in (2.11), and we weight the local average by an approximation $\hat\mu_{j_n k}$ of the time spent by the process $X$ in $I_{j_n k}$. Finally, the nonparametric estimator $T_n^\star(x)$ of $\sigma^2(x)$ is

defined by
$$T_n^\star(x) = \sum_{k:\, I_{j_n k} \subset I} \hat c_{j_n k}\, \varphi_{j_n k}(x)$$
and is specified by the projection level $j_n$. The performance of $T_n^\star$ is summarized in the following result. For $x \in \mathbb{R}$, we denote by $\lfloor x \rfloor$ the integer part of $x$.

PROPOSITION 2. – An optimal rate of convergence over $\Sigma$ is $\varphi_n(\Sigma) = n^{-1/3}$. Moreover, $T_n^\star$, calibrated by $j_n = \lfloor \frac{1}{3}\log_2 n \rfloor \vee j_0$, is optimal for the criterion given by (1.3) and (1.4).
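Concretely (a sketch under our own conventions, not the paper's code), on each dyadic bin the estimator $T_n^\star$ reduces to the local average of the squared normalized increments, since the normalizations of $\varphi_{j_n k}$ and $\hat\mu_{j_n k}$ cancel; the interval `I` below is an illustrative choice:

```python
def T_star(X, I=(-1.0, 1.0), j0=0):
    """Histogram estimator T_n^*: local averages of (Delta_i^n X)^2 on the dyadic
    grid of mesh 2^{-j_n}, with j_n = floor(log2(n)/3) v j0; returns (edges, values)."""
    n = len(X) - 1
    jn = max(int(np.log2(n) / 3.0), j0)
    h = 2.0 ** (-jn)
    edges = np.arange(I[0], I[1] + h / 2, h)  # dyadic bin edges covering I
    incr2 = n * np.diff(X) ** 2               # (Delta_i^n X)^2
    idx = np.digitize(X[:-1], edges) - 1      # bin index of X_{(i-1)/n}
    values = np.full(len(edges) - 1, np.nan)
    for k in range(len(edges) - 1):
        hits = idx == k
        if hits.any():                        # local average on the occupied bin
            values[k] = incr2[hits].mean()
    return edges, values

edges, sigma2_hat = T_star(X)
```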

3. Main result

This section is devoted to the construction of an optimal r.n.f. $\rho_n^\star$ in the sense of Definition 2. Accordingly, we construct an $\alpha$-adaptive estimator w.r.t. $(\Sigma_0, \Sigma)$. Our algorithm can be described as follows:
(1) Estimate the distance $\hat d_n$ (for the $\|\cdot\|_I$ seminorm) between $\sigma^2$ and $\Sigma_0$;
(2) Take $\rho_n^\star = n^{-1/3}$ if $\hat d_n$ is above some threshold level (possibly depending on the confidence level $\alpha_n$);
(3) Take $\rho_n^\star = \varphi_{n,\alpha_n} > 0$ for some normalizing factor $\varphi_{n,\alpha_n}$ (tuned with the asymptotics) otherwise.
We will show that we can take $\varphi_{n,\alpha_n}$ converging to 0 faster than $n^{-1/3}$ by a polynomial power. The value $\varphi_{n,\alpha_n}$ corresponds to the acceptance that $\sigma^2 \in \Sigma_0$ and measures the improvement of the accuracy of estimation. The assumptions on the parametric family $\Sigma_0$ are less stringent than in Section 2.

Assumption B. – We have:
(1) The set $\Theta$ is compact in $\mathbb{R}^s$, $s \ge 1$.
(2) There exist $\mu > 0$ and $M = M_{I,\mu} > 0$ such that $\sup_{x \in I} |\sigma_0^2(x, \theta_1) - \sigma_0^2(x, \theta_2)| \le M|\theta_1 - \theta_2|^\mu$ for $\theta_1, \theta_2 \in \Theta$.
(3) For all $\theta \in \Theta$, $\sigma_0^2(\cdot, \theta) \in \Sigma$.
(4) There exist $\theta_0 \in \Theta$ and $\tilde L < L$ such that $\sigma_0^2(\cdot, \theta_0) \in \Sigma_c(\tilde L)$.

3.1. Construction of $\rho_n$ and main result

For $J_n \ge j_0$ and $\theta \in \Theta$, define
$$d_n(\theta) = \sum_{k:\, I_{J_n k} \subset I} \Big(\hat c_{J_n k} - c_{J_n k}\big(\sigma_0^2(\cdot, \theta)\big)\Big)^2,$$
where
$$c_{J_n k}\big(\sigma_0^2(\cdot, \theta)\big) = 2^{J_n/2} \int_{I_{J_n k}} \sigma_0^2(x, \theta)\,dx.$$

By Assumption B2, there exists $C = C(\mu, M_{I,\mu})$ such that
$$\big|d_n(\theta_1) - d_n(\theta_2)\big| \le C|\theta_1 - \theta_2|^\mu \quad \text{for } \theta_1, \theta_2 \in \Theta.$$
This, together with Assumption B1, ensures the existence of $\theta_n^\star \in \Theta$, $\mathcal{G}_n$-measurable and solution to
$$d_n\big(\theta_n^\star\big) = \inf_{\theta \in \Theta} d_n(\theta).$$

For technical reasons, we need to compensate the variance of $d_n$ as follows. Let
$$C_{J_n k} = \frac{2}{3n^2\,\hat\mu_{J_n k}^2} \sum_{i=1}^n \big(\Delta_i^n X\big)^4 \varphi_{J_n k}^2\big(X_{(i-1)/n}\big)$$
and
$$\hat d_n\big(\theta_n^\star\big) = d_n\big(\theta_n^\star\big) - \sum_{k:\, I_{J_n k} \subset I} C_{J_n k}.$$
The estimate $\hat d_n(\theta_n^\star)$ will determine the following decision rule. For $\alpha_n > 0$, let $J_n = \lfloor \frac{2}{5}\log_2 \frac{n}{\log \alpha_n^{-1}} \rfloor \vee j_0$ and
$$\varphi_{n,\alpha_n} = \bigg(\frac{\log \alpha_n^{-1}}{n}\bigg)^{2/5}.$$
Note that $2^{-J_n}$ and $\varphi_{n,\alpha_n}$ coincide up to a constant. Define now the threshold
$$\lambda = A(10, c, \nu)^{-1/4},$$
where $A(t, c, \nu)$ is specified in (4.17); see the proof of Lemma 3. (The choice of $\lambda$ will become transparent in the proof of Theorem 1 in Section 4 below, where Lemma 3 is used.) The decision rule then takes the form

$$\rho_n^\star = \varphi_{n,\alpha_n}\,\mathbf{1}_{\{\hat d_n(\theta_n^\star) \le \lambda^2 \varphi_{n,\alpha_n}^2\}} + n^{-1/3}\,\mathbf{1}_{\{\hat d_n(\theta_n^\star) > \lambda^2 \varphi_{n,\alpha_n}^2\}}$$
and
$$\bar T_n(x) = \sum_{k:\, I_{J_n k} \subset I} c_{J_n k}\big(\sigma_0^2(\cdot, \theta_n^\star)\big)\,\varphi_{J_n k}(x).$$
Finally, our estimator of $\sigma^2(x)$ is
$$T_n^{\star\star}(x) = \bar T_n(x)\,\mathbf{1}_{\{\rho_n^\star = \varphi_{n,\alpha_n}\}} + T_n^\star(x)\,\mathbf{1}_{\{\rho_n^\star = n^{-1/3}\}},$$
where $T_n^\star$ is the nonparametric estimator of Section 2.2, specified with $j_n = \lfloor \frac{1}{3}\log_2 n \rfloor \vee j_0$.

whereTn!is the nonparametric estimator of Section 2.2 specified withjn= 13lognj0. THEOREM 1. – Grant Assumption B. Assume thatnaαn<24for some arbitrary a >0. Then ρn! is an optimal random normalizing factor w.r.t. (0, ) and Tn!! is α-adaptive.


Remark 1. – In particular, we see that if our parametric assumption is correct, then with prescribed confidence $1 - \alpha_n$ we are able to improve (asymptotically) the accuracy of estimation, i.e., the size of the confidence band we construct, by a factor $\varphi_{n,\alpha_n} = (\log \alpha_n^{-1}/n)^{2/5}$.

Remark 2. – The improvement of the random confidence band is of a polynomial order, but it is lowered by the size of $\alpha_n$. However, the restriction $\alpha_n \ge n^{-a}$ ensures that it is at least of order $(\log n/n)^{2/5}$.

Remark 3. – For practical purposes, it seems more clever to replace $\nu$ in the definition of $\lambda$ by $\inf_{k:\, I_{J_n k} \subset I} \hat\mu_{J_n k}$. For technical reasons, however, we are unable to prove Theorem 1 in this setting. Note also that the practical implementation of $\rho_n^\star$ is easy: the computation of the $\hat c_{J_n k}$ is reasonably fast, and the cardinality (in $k$) of such $\hat c_{J_n k}$ is of order $\log n$; likewise for the $C_{J_n k}$. Eventually, the minimization problem $\arg\min_\theta d_n(\theta)$ has the same complexity as the computation of a standard parametric estimator. Once $\rho_n^\star$ is computed, one readily computes either $\bar T_n$ (which is no more difficult to obtain than $T_n^\star$) or $T_n^\star$ itself, the standard Nadaraya–Watson estimator.
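To fix ideas, here is a minimal sketch of the whole decision rule. The parametric family is the hypothetical `sigma0_sq` from above, the midpoint rule approximates the coefficients $c_{J_n k}(\sigma_0^2(\cdot, \theta))$, and the constant `lam` is a placeholder for $\lambda = A(10, c, \nu)^{-1/4}$, which is only specified in the proofs:

```python
def rho_star(X, alpha_n, lam=1.0, I=(-1.0, 1.0), j0=0):
    """Sketch of the r.n.f. rho_n^*: compare hat d_n(theta_n^*) to (lam * phi)^2."""
    n = len(X) - 1
    log_inv_alpha = np.log(1.0 / alpha_n)
    phi = (log_inv_alpha / n) ** 0.4                     # phi_{n, alpha_n}
    Jn = max(int(0.4 * np.log2(n / log_inv_alpha)), j0)  # level with 2^{-J_n} ~ phi
    h = 2.0 ** (-Jn)
    edges = np.arange(I[0], I[1] + h / 2, h)
    centers = (edges[:-1] + edges[1:]) / 2
    incr2 = n * np.diff(X) ** 2                          # (Delta_i^n X)^2
    idx = np.digitize(X[:-1], edges) - 1
    occupied = [k for k in range(len(centers)) if (idx == k).any()]

    def d_n(theta):
        # sum_k (hat c_{J_n k} - c_{J_n k}(sigma_0^2(., theta)))^2; both coefficients
        # equal 2^{-J_n/2} times a local value on the bin, whence the factor h
        return sum(h * (incr2[idx == k].mean() - sigma0_sq(centers[k], theta)) ** 2
                   for k in occupied)

    Theta = np.linspace(0.1, 5.0, 200)                   # compact parameter set
    d_min = min(d_n(t) for t in Theta)                   # d_n(theta_n^*)
    comp = sum((2 * h / 3) * (incr2[idx == k] ** 2).sum() / (idx == k).sum() ** 2
               for k in occupied)                        # sum_k C_{J_n k}
    d_hat = d_min - comp                                 # variance-compensated distance
    return phi if d_hat <= (lam * phi) ** 2 else n ** (-1.0 / 3.0)
```

Under the parametric hypothesis the sketch typically returns $\varphi_{n,\alpha_n}$; otherwise it falls back to $n^{-1/3}$, in line with Theorem 1.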

3.2. Discussion

3.2.1. Links to adaptive estimation

We show in this paragraph how a simple modification of $T_n^{\star\star}$ provides us with an adaptive estimator – in the usual sense of (1.8) – without losing the optimality in terms of r.n.f. Assumption A is in force here. We consider the estimator $T_n(\Sigma_0) = \sigma_0^2(\cdot, \hat\theta_n)$ introduced in Section 1.2, where $\hat\theta_n$ solves (1.6). Note that $T_n(\Sigma_0)$ is well defined thanks to Assumption A. Let

$$T_n^{(a)}(x) = T_n(\Sigma_0)(x)\,\mathbf{1}_{\{\hat d_n(\hat\theta_n) \le \lambda^2 \varphi_{n,\alpha_n}^2\}} + T_n^\star(x)\,\mathbf{1}_{\{\hat d_n(\hat\theta_n) > \lambda^2 \varphi_{n,\alpha_n}^2\}}.$$
Define the random normalizing factor $\rho_n^{(a)}$ accordingly:
$$\rho_n^{(a)} = \varphi_{n,\alpha_n}\,\mathbf{1}_{\{\hat d_n(\hat\theta_n) \le \lambda^2 \varphi_{n,\alpha_n}^2\}} + n^{-1/3}\,\mathbf{1}_{\{\hat d_n(\hat\theta_n) > \lambda^2 \varphi_{n,\alpha_n}^2\}}.$$
THEOREM 2. – Grant Assumption A. We have:

(i) The estimator $T_n^{(a)}$ is $\alpha$-adaptive w.r.t. $(\Sigma_0, \Sigma)$.
(ii) If moreover $\alpha_n = O(n^{-1})$, the estimator $T_n^{(a)}$ is adaptive in the usual sense w.r.t. $(\Sigma_0, \Sigma)$:
$$\limsup_{n \to \infty} \mathcal{R}_n\big(T_n^{(a)}, \Sigma, \psi_n(\cdot)\big) < \infty,$$
where $\psi_n(\sigma^2)$ denotes the adaptive rate defined by (1.7).

The proof of (i) is deferred to Section 4. The proof of (ii) is a direct consequence of (i) and Proposition 2 in [11].

Remark 1. – The decision rule $\mathbf{1}_{\{\hat d_n(\hat\theta_n) \le \lambda^2 \varphi_{n,\alpha_n}^2\}}$ answers our original question: given a parametric procedure $T_n(\Sigma_0)$ versus a nonparametric one $T_n^\star$, which one shall we use in practice? The answer 1 to our test yields the choice $T_n(\Sigma_0)$, whereas the answer 0 yields $T_n^\star$. The precise mathematical consequences of this choice are described by the r.n.f. $\rho_n^{(a)}$. Moreover, this choice is optimal in the sense of Definition 2.

Remark 2. – Again – see Remark 3 in Section 3.1 – we can see that the practical implementation of $T_n^{(a)}$ is fast and has complexity no worse than that of $T_n^\star$.

3.2.2. Links to testing

We explore in this section another virtue of $\rho_n^\star$, namely the possibility to build a test of the hypothesis $\sigma^2 \in \Sigma_0$ against a family of local alternatives. More precisely, given $h > 0$ and $0 < \alpha < 1$, define
$$W_n(h, I, \alpha) = \Big\{f \in \Sigma_c(L):\ \inf_{\theta \in \Theta}\big\|f - \sigma_0^2(\cdot, \theta)\big\|_I \ge h \cdot \varphi_{n,\alpha}\Big\}.$$

THEOREM 3. – Grant Assumption A. For any $0 < \alpha, \beta < 1$, we have, for large enough $n$:
(i) $\sup_{\sigma^2 \in \Sigma_0(I)} P_{\sigma^2}^{n,\nu}\big(\rho_n^\star = n^{-1/3}\big) \le \alpha$.
(ii) There exists $h(\beta) > 0$ such that
$$\sup_{\sigma^2 \in W_n(h(\beta), I, \alpha)} P_{\sigma^2}^{n,\nu}\big(\rho_n^\star = \varphi_{n,\alpha}\big) \le \beta.$$

In words, the hypothesis $\sigma^2 \in \Sigma_0(I)$ can be tested against the family of local alternatives $\sigma^2 \in W_n(h(\beta), I, \alpha)$ with prescribed first- and second-type error probabilities.

The proof of (i) readily follows from (4.27) in the proof of Theorem 1 below. The proof of (ii) follows from Theorem 1 together with Proposition 3 in [11].

4. Proofs

The proof of Proposition 2 can be read independently of that of Theorems 1 and 2. The reader is, however, invited to first scan the notation and preliminaries section.

4.1. Notation and preliminaries

For $\theta \in \Theta$, we abbreviate $P_{\sigma_0^2(\cdot,\theta)}$ by $P_\theta$ when no confusion is possible. For $\sigma^2 \in \Sigma$, let $\tilde P_{\sigma^2}$ denote the law of the process $Y$ such that $dY_t = \sigma(Y_t)\,dW_t$, $Y_0 = x_0$. Define $\tilde P_{\sigma^2}^{n,\nu}$ accordingly. We denote by $C$ a generic constant, possibly varying from line to line, which may depend on $I$. Any other dependence will be explicitly mentioned.

4.1.1. Preliminary decompositions

(a) For $\sigma^2 \in \Sigma$, define
$$a_k^n\big(\sigma^2\big) = \frac{2^{3J_n/2}}{n\,\hat\mu_{J_n k}} \sum_{i=1}^n \mathbf{1}_{\{X_{(i-1)/n} \in I_{J_n k}\}} \int_{I_{J_n k}} \tau_i^n(x, X)\,dx,$$
$$b_k^n\big(\sigma^2\big) = \frac{2^{J_n/2}}{n\,\hat\mu_{J_n k}} \sum_{i=1}^n \mathbf{1}_{\{X_{(i-1)/n} \in I_{J_n k}\}}\,\varepsilon_i^n,$$

where
$$\tau_i^n(x, X) = \sigma^2(x) - n\int_{(i-1)/n}^{i/n} \sigma^2(X_s)\,ds. \tag{4.12}$$
Define the random variable
$$\Delta_n\big(\sigma^2\big) = \sum_{k:\, I_{J_n k} \subset I} \Big(c_{J_n k}\big(\sigma_0^2(\cdot, \theta_n^\star) - \sigma^2\big) - a_k^n\big(\sigma^2\big)\Big)^2, \tag{4.13}$$
where – recall Section 2 – we denote $c_{jk}(\sigma^2) = 2^{j/2}\int_{I_{jk}} \sigma^2(x)\,dx$. Using that $(\Delta_i^n X)^2 = n\int_{(i-1)/n}^{i/n}\sigma^2(X_s)\,ds + \varepsilon_i^n$ under $\tilde P_{\sigma^2}$, we have
$$\hat c_{J_n k} - c_{J_n k}\big(\sigma^2\big) = a_k^n\big(\sigma^2\big) + b_k^n\big(\sigma^2\big).$$
We thus obtain the following decomposition
$$\hat d_n\big(\theta_n^\star\big) = \Delta_n\big(\sigma^2\big) + M_n\big(\sigma^2\big) + N_n\big(\sigma^2\big),$$

having
$$M_n\big(\sigma^2\big) = 2\sum_{k:\, I_{J_n k} \subset I} \Big(c_{J_n k}\big(\sigma^2 - \sigma_0^2(\cdot, \theta_n^\star)\big) + a_k^n\big(\sigma^2\big)\Big)\,b_k^n\big(\sigma^2\big),$$
$$N_n\big(\sigma^2\big) = \sum_{k:\, I_{J_n k} \subset I} \Big(b_k^n\big(\sigma^2\big)^2 - C_{J_n k}\Big).$$

(b) Define $\psi(x) = \mathbf{1}_{[0,1/2)}(x) - \mathbf{1}_{[1/2,1)}(x)$. If $d_{jk}(\sigma^2) = \int \sigma^2(x)\,\psi_{jk}(x)\,dx$ is the wavelet coefficient of $\sigma^2$ in the Haar basis, from the multiscale decomposition of $\sigma^2$ we have, by Parseval's identity,
$$\big\|\bar T_n - \sigma^2\big\|_I^2 = \sum_{k:\, I_{J_n k} \subset I} c_{J_n k}^2\big(\sigma^2 - \sigma_0^2(\cdot, \theta_n^\star)\big) + \mathcal{E}_{J_n}\big(\sigma^2\big) \le 2\Delta_n\big(\sigma^2\big) + 2\sum_{k:\, I_{J_n k} \subset I} a_k^n\big(\sigma^2\big)^2 + \mathcal{E}_{J_n}\big(\sigma^2\big),$$
where the remainder term $\mathcal{E}_{J_n}(\sigma^2) = \sum_{j \ge J_n}\sum_{k:\, I_{jk} \subset I} d_{jk}^2(\sigma^2)$ satisfies
$$\mathcal{E}_{J_n}\big(\sigma^2\big) \le C(c, L)\,2^{-2J_n} = C(c, L)\,\varphi_{n,\alpha_n}^2,$$
since $\sigma^2 \in \Sigma$. Note that
$$\big|\tau_i^n(x, X)\big|\,\mathbf{1}_{\{X_{(i-1)/n} \in I_{J_n k}\}} \le L\bigg(2^{-J_n} + n\int_{(i-1)/n}^{i/n} \big|X_s - X_{(i-1)/n}\big|\,ds\bigg), \tag{4.14}$$
therefore $|\tau_i^n(x, X)|\,\mathbf{1}_{\{X_{(i-1)/n} \in I_{J_n k}\}} \le L\big(2^{-J_n} + n^{-1/2}\,r_i^n(\sigma^2)\big)$, and since we have $\sup_{\sigma^2 \in \Sigma} \tilde E_{\sigma^2}^{n,\nu}\big(\big|r_i^n(\sigma^2)\big|^p\big) \le C_p$ for all $p \ge 1$ by the Burkholder–Davis–Gundy inequality
