Definition of the wavelet-based regression estimators of the memory param-

5.2.1 The wavelet setting

The wavelet setting involves two functionsφandψ in L²(R) and their Fourier transforms φ(ξ)b ^def=

Z _∞

−∞

φ(t)e⁻^iξtdt and ψ(ξ)b ^def= Z _∞

−∞

ψ(t)e⁻^iξtdt . (5.2) Let us assume that conditions (W-1)-(W-4) hold. Condition (W-2) ensures that the Fourier transformψbdecreases quickly to zero. Condition (W-3) ensures that ψ oscillates and that its scalar product with continuous-time polynomials up to degreeM−1 vanishes.

It is equivalent to asserting that the firstM−1 derivatives of ψbvanish at the origin and hence

|ψ(λ)b |=O(|λ|^M), asλ→0. (5.3) Daubechies wavelets (withM ≥2) and the Coiflets satisfy these conditions, see Moulines et al. [2007a]. Viewing the wavelet ψ(t) as a basic template, define the family {ψ_j,k, j ∈ Z, k∈Z}of translated and dilated functions

ψ_j,k(t) = 2⁻^j/2ψ(2⁻^jt−k), j∈Z, k∈Z. (5.4)

Positive values ofktranslateψ to the right, negative values to the left. Thescale index j dilatesψso that large values ofjcorrespond to coarse scales and hence to low frequencies.

We suppose throughout the paper that

(1 +β)/2−α < d≤M . (5.5)

We now describe how the wavelet coefficients are defined in discrete time, that is for a real-valued sequence {x_k, k ∈ Z} and for a finite sample {x_k, k = 1, . . . , n}. Using the scaling function φ, we first interpolate these discrete values to construct the following continuous-time functions

x_n(t)^def= Xn k=1

x_kφ(t−k) and x(t)^def= X

k∈Z

x_kφ(t−k), t∈R. (5.6) Without loss of generality we may suppose that the support of the scaling function φis included in [−T,0] for some integer T≥1. Then

x_n(t) =x(t) for all t∈[0, n−T + 1].

We may also suppose that the support of the wavelet function ψ is included in [0,T].

With these conventions, the support of ψ_j,k is included in the interval [2^jk,2^j(k+ T)].

The wavelet coefficient W_j,k at scale j ≥0 and locationk ∈Z is formally defined as the scalar product in L²(R) of the function t7→x(t) and the wavelet t7→ψ_j,k(t):

W_j,k ^def= Z _∞

−∞

x(t)ψ_j,k(t) dt= Z _∞

−∞

x_n(t)ψ_j,k(t) dt, j≥0, k∈Z, (5.7) when [2^jk,2^jk+ T]⊆[0, n−T + 1], that is, for all (j, k)∈ In, where whereIn is defined in (2.32)

If∆^MXis stationary, then from (2.43) the process{W_j,k}k∈Z of wavelet coefficients at scalej ≥0 is stationary but the two–dimensional process {[W_j,k, W_j^′_,k]^T}k∈Z of wavelet coefficients at scalesj and j^′, withj≥j^′, is not stationary. Here ^T denotes the transpo-sition. This is why we consider instead the stationary between-scale process

{[W_j,k,W_j,k(j−j^′)^T]^T}k∈Z, whereW_j,k(j−j^′) is defined as follows:

W_j,k(j−j^′)^def= h

W_j′,2^j−j′k, W_j′,2^j−j′k+1, . . . , W_j′,2^j−j′k+2^j−j′−1

. For allj, j^′ ≥1, the covariance function of the between scale process is given by

Cov(W_j,k^′(j−j^′), W_j,k) = Z π

−π

e^iλ(k⁻^k^′⁾D_j,j₋_j^′(λ;f)dλ , (5.8) whereD_j,j₋_j^′(λ;f) stands for the cross-spectral density function of this process. The case j=j^′ corresponds to the spectral density function of the within-scale process{W_j,k}k∈Z.

In the sequel, we shall use that the within- and between-scale spectral densities D_j,j₋_j^′(λ;d) of the processX with memory parameterd∈Rcan be approximated by the corresponding spectral density of the generalized fractional Brownian motionB_(d) defined in (2.84).

5.2.2 Definition of the robust estimators of d

Let us now define robust estimators of the memory parameter d of the M(d) process X from the observationsX₁, . . . , X_n. These estimators are derived from the Abry and Veitch [1998] construction, and consists in regressing estimators of the scale spectrum

σ_j² ^def= Var(W_j,0) (5.9)

with respect to the scale index j. More precisely, if bσ_j² is an estimator of σ²_j based on W_j,0:n_j₋₁ = (W_j,0, . . . , W_j,n_j₋₁) then an estimator of the memory parameterdis obtained by regressing log(bσ_j²) for a finite number of scale indices j ∈ {J0, . . . , J0 +ℓ} where J₀ =J₀(n)≥0 is the lower scale and 1 +ℓ≥2 is the number of scales in the regression.

The regression estimator can be expressed formally as dbn(J0,w)^def=

JX0+ℓ j=J0

wj−J0log bσ_j²

, (5.10)

where the vectorw^def= [w₀, . . . , w_ℓ]^T of weights satisfiesPℓ

i=0w_i= 0 and 2 log(2)Pℓ

i=0iw_i= 1, see Abry and Veitch [1998] and Moulines et al. [2007b]. For J₀ ≥ 1 and ℓ > 1, one may choose for examplew corresponding to the least squares regression matrix, defined byw=DB(B^TDB)⁻¹b where

b^def= h

0 (2 log(2))⁻¹i

, B ^def=

1 1 . . . 1 0 1 . . . ℓ

#^T

is the design matrix and D is an arbitrary positive definite matrix. The best choice of D depends on the memory parameterd. However a good approximation of this optimal matrix D is the diagonal matrix with diagonal entries Di,i = 2⁻ⁱ, i = 0. . . , ℓ; see Fa¨y et al. [2009] and the references therein. We will use this choice of the design matrix in the numerical experiments. A heuristic justification for this choice is that by [Moulines et al., 2007a, Eq. (28)],

σ²_j ∼C2^2jd , asj→ ∞, (5.11)

whereC is a positive constant.

In the sequel, we shall consider three different estimators ofdbased on three different estimators of the scale spectrum σ_j² with respect to the scale index j which are defined below.

Classical scale estimator

This estimator has been considered in the original contribution of Abry and Veitch [1998]

and consists in estimating the scale spectrumσ_j² with respect to the scale index j by the empirical variance

σ_CL,j² = 1 n_j

i=1

W_j,i² , (5.12)

where for any j, n_j denotes the number of available wavelet coefficients at scale index j defined in (2.32).

Median absolute deviation

This estimator is well-known to be a robust estimator of the scale and as mentioned by Rousseeuw and Croux [1993] it has several appealing properties: it is easy to compute and has the best possible breakdown point (50%). Since the wavelet coefficients W_j,i are centered Gaussian observations, the square of the median absolute deviation ofW_j,0:n_j₋₁ is defined by

σ_MAD,j² =

m(Φ) med

0≤i≤nj−1Wj,i

, (5.13)

where Φ denotes the c.d.f of a standard Gaussian random variable and

m(Φ) = 1/Φ⁻¹(3/4) = 1.4826 . (5.14) The use of the median estimator to estimate the scalogram has been suggested to estimate the memory parameter in Stoev et al. [2005] (see also [Percival and Walden, 2006, p. 420]).

A closely related technique is considered in Coeurjolly [2008a] and Coeurjolly [2008b] to estimate the Hurst coefficient of locally self-similar Gaussian processes. Note that the use of the median of the squared wavelet coefficients has been advocated to estimate the variance at a given scale in wavelet denoising applications; this technique is mentioned in Donoho and Johnstone [1994] to estimate the scalogram of the noise in the i.i.d. context;

Johnstone and Silverman [1997] proposed to use this method in the long-range dependent context; the use of these estimators has not been however rigorously justified.

The Croux and Rousseeuw estimator

This estimator is another robust scale estimator introduced in Rousseeuw and Croux [1993]. Its asymptotic properties in several dependence contexts have been further studied in L´evy-Leduc et al. [2009] and the square of this estimator is defined by

σ_CR,j² =

c(Φ){|W_j,i−W_j,k|; 0≤i, k≤n_j−1}(k_nj)

, (5.15)

wherec(Φ) = 2.21914 and k_n_j =⌊n²_j/4⌋. That is, up to the multiplicative constantc(Φ), b

σCR,j is the knjth order statistics of the n²_j distances |Wj,i−W_j,k| between all the pairs of observations.

Dans le document Analysedessérieschronologiquesàmémoirelonguedansledomainedesondelettes OlafKOUAMO Thèse (Page 128-132)