• Aucun résultat trouvé

Definition of the wavelet-based regression estimators of the memory param-

5.2.1 The wavelet setting

The wavelet setting involves two functionsφandψ in L2(R) and their Fourier transforms φ(ξ)b def=

Z

−∞

φ(t)eiξtdt and ψ(ξ)b def= Z

−∞

ψ(t)eiξtdt . (5.2) Let us assume that conditions (W-1)-(W-4) hold. Condition (W-2) ensures that the Fourier transformψbdecreases quickly to zero. Condition (W-3) ensures that ψ oscillates and that its scalar product with continuous-time polynomials up to degreeM−1 vanishes.

It is equivalent to asserting that the firstM−1 derivatives of ψbvanish at the origin and hence

|ψ(λ)b |=O(|λ|M), asλ→0. (5.3) Daubechies wavelets (withM ≥2) and the Coiflets satisfy these conditions, see Moulines et al. [2007a]. Viewing the wavelet ψ(t) as a basic template, define the family {ψj,k, j ∈ Z, k∈Z}of translated and dilated functions

ψj,k(t) = 2j/2ψ(2jt−k), j∈Z, k∈Z. (5.4)

Positive values ofktranslateψ to the right, negative values to the left. Thescale index j dilatesψso that large values ofjcorrespond to coarse scales and hence to low frequencies.

We suppose throughout the paper that

(1 +β)/2−α < d≤M . (5.5)

We now describe how the wavelet coefficients are defined in discrete time, that is for a real-valued sequence {xk, k ∈ Z} and for a finite sample {xk, k = 1, . . . , n}. Using the scaling function φ, we first interpolate these discrete values to construct the following continuous-time functions

xn(t)def= Xn k=1

xkφ(t−k) and x(t)def= X

k∈Z

xkφ(t−k), t∈R. (5.6) Without loss of generality we may suppose that the support of the scaling function φis included in [−T,0] for some integer T≥1. Then

xn(t) =x(t) for all t∈[0, n−T + 1].

We may also suppose that the support of the wavelet function ψ is included in [0,T].

With these conventions, the support of ψj,k is included in the interval [2jk,2j(k+ T)].

The wavelet coefficient Wj,k at scale j ≥0 and locationk ∈Z is formally defined as the scalar product in L2(R) of the function t7→x(t) and the wavelet t7→ψj,k(t):

Wj,k def= Z

−∞

x(t)ψj,k(t) dt= Z

−∞

xn(t)ψj,k(t) dt, j≥0, k∈Z, (5.7) when [2jk,2jk+ T]⊆[0, n−T + 1], that is, for all (j, k)∈ In, where whereIn is defined in (2.32)

If∆MXis stationary, then from (2.43) the process{Wj,k}k∈Z of wavelet coefficients at scalej ≥0 is stationary but the two–dimensional process {[Wj,k, Wj,k]T}kZ of wavelet coefficients at scalesj and j, withj≥j, is not stationary. Here T denotes the transpo-sition. This is why we consider instead the stationary between-scale process

{[Wj,k,Wj,k(j−j)T]T}k∈Z, whereWj,k(j−j) is defined as follows:

Wj,k(j−j)def= h

Wj,2j−j′k, Wj,2j−j′k+1, . . . , Wj,2j−j′k+2j−j′1

iT

. For allj, j ≥1, the covariance function of the between scale process is given by

Cov(Wj,k(j−j), Wj,k) = Z π

π

eiλ(kk)Dj,jj(λ;f)dλ , (5.8) whereDj,jj(λ;f) stands for the cross-spectral density function of this process. The case j=j corresponds to the spectral density function of the within-scale process{Wj,k}k∈Z.

In the sequel, we shall use that the within- and between-scale spectral densities Dj,jj(λ;d) of the processX with memory parameterd∈Rcan be approximated by the corresponding spectral density of the generalized fractional Brownian motionB(d) defined in (2.84).

5.2.2 Definition of the robust estimators of d

Let us now define robust estimators of the memory parameter d of the M(d) process X from the observationsX1, . . . , Xn. These estimators are derived from the Abry and Veitch [1998] construction, and consists in regressing estimators of the scale spectrum

σj2 def= Var(Wj,0) (5.9)

with respect to the scale index j. More precisely, if bσj2 is an estimator of σ2j based on Wj,0:nj1 = (Wj,0, . . . , Wj,nj1) then an estimator of the memory parameterdis obtained by regressing log(bσj2) for a finite number of scale indices j ∈ {J0, . . . , J0 +ℓ} where J0 =J0(n)≥0 is the lower scale and 1 +ℓ≥2 is the number of scales in the regression.

The regression estimator can be expressed formally as dbn(J0,w)def=

JX0+ℓ j=J0

wjJ0log bσj2

, (5.10)

where the vectorwdef= [w0, . . . , w]T of weights satisfiesP

i=0wi= 0 and 2 log(2)P

i=0iwi= 1, see Abry and Veitch [1998] and Moulines et al. [2007b]. For J0 ≥ 1 and ℓ > 1, one may choose for examplew corresponding to the least squares regression matrix, defined byw=DB(BTDB)1b where

bdef= h

0 (2 log(2))1i

, B def=

"

1 1 . . . 1 0 1 . . . ℓ

#T

is the design matrix and D is an arbitrary positive definite matrix. The best choice of D depends on the memory parameterd. However a good approximation of this optimal matrix D is the diagonal matrix with diagonal entries Di,i = 2i, i = 0. . . , ℓ; see Fa¨y et al. [2009] and the references therein. We will use this choice of the design matrix in the numerical experiments. A heuristic justification for this choice is that by [Moulines et al., 2007a, Eq. (28)],

σ2j ∼C22jd , asj→ ∞, (5.11)

whereC is a positive constant.

In the sequel, we shall consider three different estimators ofdbased on three different estimators of the scale spectrum σj2 with respect to the scale index j which are defined below.

Classical scale estimator

This estimator has been considered in the original contribution of Abry and Veitch [1998]

and consists in estimating the scale spectrumσj2 with respect to the scale index j by the empirical variance

b

σCL,j2 = 1 nj

nj

X

i=1

Wj,i2 , (5.12)

where for any j, nj denotes the number of available wavelet coefficients at scale index j defined in (2.32).

Median absolute deviation

This estimator is well-known to be a robust estimator of the scale and as mentioned by Rousseeuw and Croux [1993] it has several appealing properties: it is easy to compute and has the best possible breakdown point (50%). Since the wavelet coefficients Wj,i are centered Gaussian observations, the square of the median absolute deviation ofWj,0:nj1 is defined by

b

σMAD,j2 =

m(Φ) med

0inj1Wj,i

2

, (5.13)

where Φ denotes the c.d.f of a standard Gaussian random variable and

m(Φ) = 1/Φ1(3/4) = 1.4826 . (5.14) The use of the median estimator to estimate the scalogram has been suggested to estimate the memory parameter in Stoev et al. [2005] (see also [Percival and Walden, 2006, p. 420]).

A closely related technique is considered in Coeurjolly [2008a] and Coeurjolly [2008b] to estimate the Hurst coefficient of locally self-similar Gaussian processes. Note that the use of the median of the squared wavelet coefficients has been advocated to estimate the variance at a given scale in wavelet denoising applications; this technique is mentioned in Donoho and Johnstone [1994] to estimate the scalogram of the noise in the i.i.d. context;

Johnstone and Silverman [1997] proposed to use this method in the long-range dependent context; the use of these estimators has not been however rigorously justified.

The Croux and Rousseeuw estimator

This estimator is another robust scale estimator introduced in Rousseeuw and Croux [1993]. Its asymptotic properties in several dependence contexts have been further studied in L´evy-Leduc et al. [2009] and the square of this estimator is defined by

b

σCR,j2 =

c(Φ){|Wj,i−Wj,k|; 0≤i, k≤nj−1}(knj)

2

, (5.15)

wherec(Φ) = 2.21914 and knj =⌊n2j/4⌋. That is, up to the multiplicative constantc(Φ), b

σCR,j is the knjth order statistics of the n2j distances |Wj,i−Wj,k| between all the pairs of observations.