
4.4 Application

4.4.3 Change point test statistic

Let $X_1, \dots, X_n$ be $n$ observations of a time series, and denote by $W_{j,k}$, $(j,k) \in I_n$, with $I_n$ defined in (2.53), the associated wavelet coefficients. If $X$ is an $M(d)$ process and the number $M$ of vanishing moments of the wavelet satisfies $M > d - 1/2$, then $\Delta^M X$ is weakly stationary, and it follows that the wavelet variance at each given scale $j$ should be constant. If this hypothesis is not fulfilled, the wavelet variance can be expected to change, either gradually or abruptly. This motivates a test for the constancy of the variance of the wavelet coefficients. If $\Delta^M X$ is weakly stationary but not centered, it becomes centered by replacing $M$ by $M+1$; hence, without loss of generality, the hypothesis that $\Delta^M X$ is weakly stationary can be replaced by the hypothesis that $\Delta^M X$ is weakly stationary and centered, which is taken as our null hypothesis in the following. Under the null hypothesis, the wavelet coefficients $\{W_{j,k},\, k \in \mathbb{Z}\}$ form a centered, covariance-stationary sequence at each scale $j$. The cumulative sum (CUSUM) of squares used in Inclan and Tiao [1994] to detect change points in the variance can thus be adapted to obtain a test of the null hypothesis based on the CUSUM of squared wavelet coefficients. This idea has been developed in Kouamo et al. [2010], where, moreover, changes in wavelet variances occurring simultaneously at multiple time scales are considered, resulting in a multiple-scale procedure. Let us briefly describe the procedure and the main theoretical result it relies on (see Kouamo et al. [2010] for details). Given $J_1 \le J_2$, define the

multi-scale scalogram by
\[
Y_{J_1,J_2}[i] = \Big( W_{J_2,i}^2,\ \sum_{u=1}^{2} W_{J_2-1,\,2(i-1)+u}^2,\ \dots,\ \sum_{u=1}^{2^{J_2-J_1}} W_{J_1,\,2^{J_2-J_1}(i-1)+u}^2 \Big)^T \tag{4.29}
\]
and denote the corresponding partial sum process by
\[
S_{J_1,J_2}(t) = \frac{1}{\sqrt{n_{J_2}}} \sum_{i=0}^{\lfloor n_{J_2} t \rfloor} Y_{J_1,J_2}[i]. \tag{4.30}
\]
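On a finite sample, $Y_{J_1,J_2}[i]$ pools $2^{J_2-j}$ squared coefficients of scale $j$ per coarsest-scale index $i$, and $S_{J_1,J_2}$ is a normalized cumulative sum. The following minimal NumPy sketch illustrates this; the dict layout `W[j]` and the dyadic alignment of indices are assumptions for illustration, not the notation of the text:

```python
import numpy as np

def multiscale_scalogram(W, J1, J2):
    """Build Y_{J1,J2}[i] from wavelet coefficients.

    W[j] is the 1-D array of coefficients at scale j (hypothetical layout:
    W[j] holds at least 2^(J2-j) * n_{J2} coefficients, dyadically nested).
    Returns an array of shape (n_{J2}, J2-J1+1), one row per index i,
    with columns ordered from scale J2 down to J1 as in (4.29).
    """
    n = len(W[J2])                      # n_{J2}: number of coarsest-scale coeffs
    cols = []
    for j in range(J2, J1 - 1, -1):     # scales J2, J2-1, ..., J1
        r = 2 ** (J2 - j)               # number of squared coeffs pooled per i
        sq = np.asarray(W[j][: n * r], dtype=float) ** 2
        cols.append(sq.reshape(n, r).sum(axis=1))
    return np.column_stack(cols)

def partial_sums(Y):
    """S_{J1,J2}(t) on the grid t = i/n_{J2}: cumulative sums / sqrt(n_{J2})."""
    n = Y.shape[0]
    return np.cumsum(Y, axis=0) / np.sqrt(n)
```

With constant coefficients, each column of `Y` pools `2^(J2-j)` identical squares, which gives a quick sanity check on the layout.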

Then, for a wide class of Gaussian processes, under the null hypothesis and provided that the analyzing wavelet has $M$ vanishing moments, it is shown that
\[
\Gamma_{J_1,J_2}^{-1/2}\big(S_{J_1,J_2}(t) - \mathbb{E}[S_{J_1,J_2}(t)]\big) \xrightarrow{d} B(t) = (B_{J_1}(t), \dots, B_{J_2}(t)), \tag{4.31}
\]
where the weak convergence holds in the space $D^{J_2-J_1+1}[0,1]$ of càdlàg functions defined on $[0,1]$ and valued in $\mathbb{R}^{J_2-J_1+1}$, $\{B_j(t),\, t \ge 0\}_{j=J_1,\dots,J_2}$ are independent Brownian motions, and $\Gamma_{J_1,J_2}$ denotes the asymptotic covariance matrix of $S_{J_1,J_2}(1)$. As is classical in convergence results for càdlàg functions, $D^{J_2-J_1+1}[0,1]$ is equipped with the Skorokhod metric.

Since the matrix $\Gamma_{J_1,J_2}$ is unknown, it has to be estimated. Denote by $\widehat{\Gamma}_{J_1,J_2}$ a consistent estimator of $\Gamma_{J_1,J_2}$. The test is based on the statistic $T_{J_1,J_2} : [0,1] \to \mathbb{R}_+$ defined by
\[
T_{J_1,J_2}(t) \overset{\mathrm{def}}{=} \big(S_{J_1,J_2}(t) - t\, S_{J_1,J_2}(1)\big)^T\, \widehat{\Gamma}_{J_1,J_2}^{-1}\, \big(S_{J_1,J_2}(t) - t\, S_{J_1,J_2}(1)\big), \quad t \in [0,1]. \tag{4.32}
\]
As a consequence of (4.31), $T_{J_1,J_2}(t)$ also converges weakly in the Skorokhod space $D([0,1])$:
\[
T_{J_1,J_2}(t) \xrightarrow{d} \sum_{\ell=J_1}^{J_2} B_\ell^0(t)^2, \tag{4.33}
\]
where $\{B_\ell^0(t),\, t \ge 0\}_{\ell=J_1,\dots,J_2}$ are independent Brownian bridges. For any continuous function $F : D[0,1] \to \mathbb{R}$, the continuous mapping theorem implies that

\[
F\big[T_{J_1,J_2}(\cdot)\big] \xrightarrow{d} F\Big[ \sum_{\ell=1}^{J_2-J_1+1} B_\ell^0(\cdot)^2 \Big].
\]
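On a grid $t = i/n$, the statistic $T_{J_1,J_2}$ in (4.32) is a quadratic form in the recentered partial sums. A sketch, assuming `S` holds the rows $S_{J_1,J_2}(i/n)$ for $i = 1, \dots, n$ and `Gamma_inv` is an estimate of $\Gamma_{J_1,J_2}^{-1}$:

```python
import numpy as np

def cusum_statistic(S, Gamma_inv):
    """T(i/n) = (S(i/n) - (i/n) S(1))^T  Gamma^{-1}  (S(i/n) - (i/n) S(1)).

    S: (n, J2-J1+1) array of partial sums evaluated at t = i/n, i = 1..n.
    Gamma_inv: consistent estimate of the inverse asymptotic covariance.
    Returns the statistic evaluated on the grid t = i/n.
    """
    n = S.shape[0]
    t = np.arange(1, n + 1)[:, None] / n
    D = S - t * S[-1]                    # bridge-like recentering by t * S(1)
    # row-wise quadratic form D[i]^T Gamma_inv D[i]
    return np.einsum("ij,jk,ik->i", D, Gamma_inv, D)
```

Note that $T_{J_1,J_2}(1) = 0$ by construction, since the recentering subtracts $S_{J_1,J_2}(1)$ exactly at $t = 1$.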

We may for example apply either integral or sup functionals, resulting in Cramér-von Mises or Kolmogorov-Smirnov statistics, respectively defined by
\[
\mathrm{CVM}(J_1, J_2) \overset{\mathrm{def}}{=} \int_0^1 T_{J_1,J_2}(t)\, dt , \tag{4.34}
\]
\[
\mathrm{KSM}(J_1, J_2) \overset{\mathrm{def}}{=} \sup_{0 \le t \le 1} T_{J_1,J_2}(t). \tag{4.35}
\]
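Both functionals are straightforward to evaluate on the discrete grid: the integral in (4.34) becomes a Riemann sum (here a mean over equispaced grid points) and the supremum in (4.35) a maximum, a sketch:

```python
import numpy as np

def cvm_ksm(T):
    """Cramer-von Mises and Kolmogorov-Smirnov functionals of T_{J1,J2}
    evaluated on an equispaced grid t = i/n.

    CVM ~ int_0^1 T(t) dt, approximated by the grid mean;
    KSM = sup_t T(t), taken as the grid maximum.
    """
    T = np.asarray(T, dtype=float)
    return T.mean(), T.max()
```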

We refer to Kouamo et al. [2010] for details on the computation of asymptotic quantiles for these statistics.

Our goal here is to provide new estimators of $\Gamma_{J_1,J_2}$. Because wavelet coefficients are correlated along and across scales, it was suggested in Kouamo et al. [2010] to use the Bartlett estimator $\widehat{\Gamma}^B_{J_1,J_2}$ of the covariance matrix of the squared wavelet coefficients for scales $\{J_1, \dots, J_2\}$, which is given by (2.122). The Bartlett estimator corresponds to a nonparametric approach which avoids assuming a precise model under the null hypothesis.

On the other hand, the choice of the window size $q$ in the Bartlett estimator is difficult in practice and may lead to very different estimates. Here we consider alternative estimators of $\Gamma_{J_1,J_2}$. The first is based on an estimator of the asymptotic covariance matrix of the GFBM squared wavelet coefficients defined in (2.130), and the second is based on a pre-estimation of a semi-parametric model. In this second case, once the model is estimated, the estimator of $\Gamma_{J_1,J_2}$ can be obtained in two different ways, based on 1) the auto-covariances and cross-covariances of the wavelet coefficients among scales, or 2) the within-scale and between-scale spectral densities. Indexing $\Gamma_{J_1,J_2}[i,j]$ by the scale indices $i$ and $j$, we have, for $J_1 \le i \le j \le J_2$,

\[
\begin{aligned}
\Gamma_{J_1,J_2}[i,j]
&= \sum_{\tau \in \mathbb{Z}} \operatorname{Cov}\Big( \sum_{l=0}^{2^{J_2-j}-1} W_{j,l}^2,\ \sum_{u=0}^{2^{J_2-i}-1} W_{i,\,2^{J_2-i}\tau+u}^2 \Big) \\
&= 2 \sum_{\tau \in \mathbb{Z}} \sum_{l=0}^{2^{J_2-j}-1} \sum_{u=0}^{2^{J_2-i}-1} \operatorname{Cov}^2\big( W_{j,l},\, W_{i,\,2^{J_2-i}\tau+u} \big) \\
&= 2^{J_2-j+1} \sum_{s \in \mathbb{Z}} \delta_{i,j}^2[s].
\end{aligned}
\]

This expression is useful when the covariance of the estimated model is easy to compute, yielding a fast computation of $\delta_{i,j}$ by Proposition 4.3.1. However, only an approximation of the infinite sum above can be computed, by truncation. The computation can also be done in the frequency domain, using the Parseval identity:

\[
2^{J_2-j+1} \sum_{s \in \mathbb{Z}} \delta_{i,j}^2[s]
= 2^{J_2-j+1} \sum_{t \in \mathbb{Z}} \sum_{u=0}^{2^{j-i}-1} \delta_{i,j}^2\big[2^{j-i}t+u\big]
= \frac{2^{J_2-j}}{\pi} \int_{-\pi}^{\pi} \big\| D_{j,\,j-i}(\lambda; f) \big\|^2\, d\lambda .
\]

Again, the obtained formula cannot be computed exactly, but it can be approximated to an arbitrary level of accuracy by approximating the integral $\int_{-\pi}^{\pi} \|D_{j,\,j-i}(\lambda;f)\|^2\, d\lambda$ with a Riemann sum.
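Both routes can be approximated numerically: truncate the lag sum, or discretize the integral with a Riemann sum. A sketch, where `delta_ij` and `D_norm2` are hypothetical callables standing in for the model's cross-covariance $\delta_{i,j}[s]$ and for $\|D_{j,j-i}(\lambda;f)\|^2$:

```python
import numpy as np

def gamma_time_domain(delta_ij, j, J2, smax):
    """Approximate Gamma[i,j] = 2^(J2-j+1) * sum_{s in Z} delta_{i,j}[s]^2
    by truncating the infinite sum at |s| <= smax.

    delta_ij(s): hypothetical callable returning the cross-covariance of the
    wavelet coefficients between scales i and j at lag s.
    """
    s = np.arange(-smax, smax + 1)
    vals = np.array([delta_ij(k) for k in s], dtype=float)
    return 2.0 ** (J2 - j + 1) * np.sum(vals ** 2)

def gamma_freq_domain(D_norm2, j, J2, ngrid=2048):
    """Approximate Gamma[i,j] = 2^(J2-j)/pi * int_{-pi}^{pi} ||D(lam)||^2 dlam
    by a Riemann sum on an equispaced grid of ngrid frequencies.

    D_norm2(lam): hypothetical callable returning ||D_{j,j-i}(lam; f)||^2.
    """
    lam = np.linspace(-np.pi, np.pi, ngrid, endpoint=False)
    integral = np.sum([D_norm2(x) for x in lam]) * (2 * np.pi / ngrid)
    return 2.0 ** (J2 - j) / np.pi * integral
```

For a model whose $\delta_{i,j}$ decays fast, the truncated sum converges quickly in `smax`; the two routes should agree up to discretization error.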

Alternatively, once a specific model has been estimated, it is legitimate to compute the non-asymptotic covariance matrix of $S_{J_1,J_2}(1)$ in (4.30), given the assumptions on the wavelet coefficients. The latter can be expressed using the covariance function $\delta_{i,j}$, namely,
\[
\widetilde{\Gamma}_{J_1,J_2}[i,j] = 2\,\cdots
\]
Observe that this covariance matrix can be computed exactly, since it involves only a finite number of covariances $\delta_{i,j}[t]$.

Having this expression at hand, suppose now that we are interested in changes of the spectral density at large scales or, equivalently, at small frequencies; that is, take $J_2$ (with $J_1 \le J_2$) large enough. Note, however, that we do not take $J_2 = J_n$ here, since the Gaussian approximation (4.31) is valid as $n_{J_2} \sim n 2^{-J_2} \to \infty$, while $n_{J_n} = O(1)$ by definition. Hence $J_1$, $J_2$ and $n_{J_2}$ have to be large simultaneously. Not surprisingly, detecting changes at low frequencies, that is, when $2^{J_2}$ is large, requires long time series: $n$ has to be even larger. We propose to rely on the approximation of the second-order properties of the wavelet coefficients of $M(d)$ processes at large scales that has already been used for the semiparametric wavelet estimation of $d$ in Section 4.4.1. More precisely, an alternative to the Bartlett estimator is obtained through the following steps.

Step 1. Estimate the memory parameter $d$ using $\widehat{d} = \widehat{d}_{LLW}(J_1, J_2)$, that is, the same estimator as $\widehat{d}_{LLW}(\ell)$ but using the scale indices $J_1 \le j \le J_2$ instead of $\ell \le j \le J_n$. We also estimate the corresponding scaling parameter $\sigma^2$ of the ARFIMA(0, d, 0) model as in (4.24).

Step 2. Compute the auto-covariance $\widehat{\nu}_0[k]$ of the ARFIMA(0, $\widehat{d}-M$, 0) model, with the scaling parameter given by the estimated one. From this, compute the auto-covariance $\widehat{\delta}_j[k]$ for a scale $j$ and the cross-covariance $\widehat{\delta}_{i,j}[k]$ of the wavelet coefficients for two given scales $i$ and $j$ in $\{J_1, \dots, J_2\}$, as in Proposition 4.3.2.
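Step 2 requires the autocovariance of an ARFIMA(0, d, 0) process. For $|d| < 1/2$ this is available in closed form, with a stable recursion in the lag; a sketch using the standard formulas (with $\sigma^2$ the innovation variance):

```python
import math

def arfima0d0_acov(d, sigma2, kmax):
    """Autocovariances nu[0..kmax] of an ARFIMA(0, d, 0) process, |d| < 1/2.

    Uses nu[0] = sigma2 * Gamma(1 - 2d) / Gamma(1 - d)^2 and the standard
    recursion nu[k] = nu[k-1] * (k - 1 + d) / (k - d).
    """
    nu = [sigma2 * math.gamma(1 - 2 * d) / math.gamma(1 - d) ** 2]
    for k in range(1, kmax + 1):
        nu.append(nu[-1] * (k - 1 + d) / (k - d))
    return nu
```

For $d = 0$ the process reduces to white noise, so all lags beyond 0 vanish, which gives a quick sanity check.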

We denote by $\mathrm{KSM}^B(J_1, J_2)$, $\mathrm{KSM}^{(d)}(J_1, J_2)$ and $\mathrm{KSM}^{LLW}(J_1, J_2)$ the test statistic $\mathrm{KSM}(J_1, J_2)$ normalized by $\widehat{\Gamma}^B_{J_1,J_2}$, $\widehat{\Gamma}^{(d)}_{J_1,J_2}$ and $\widehat{\Gamma}^{LLW}_{J_1,J_2}$, respectively. The test statistics $\mathrm{CVM}^B(J_1, J_2)$, $\mathrm{CVM}^{(d)}(J_1, J_2)$ and $\mathrm{CVM}^{LLW}(J_1, J_2)$ are defined accordingly for the CVM statistic. Each corresponding test rejects the null hypothesis if the statistic exceeds the $(1-\alpha)$-quantile of the corresponding asymptotic distribution. We now investigate the finite-sample false alarm probabilities of the six resulting KSM and CVM tests when the level is set to $\alpha = 0.05$ and the observed signal is the same ARFIMA(1, d, 0) as in Section 4.4.2, analyzed with the same Daubechies wavelet with 4 vanishing moments. Here, we consider series of sample length $n = 15000$, which, with the chosen wavelet, gives $J_n = 10$. We set the coarsest scale to $J_2 = 8$ and vary the finest scale $J_1$ from 4 to 6.

The results, obtained through Monte Carlo simulations with 1000 independent replications, are displayed in Table 4.3. It appears that normalizing the test by $\widehat{\Gamma}^{LLW}_{J_1,J_2}$ yields more accurate levels than normalizing by $\widehat{\Gamma}^B_{J_1,J_2}$ or $\widehat{\Gamma}^{(d)}_{J_1,J_2}$. In particular, the accuracy deteriorates more significantly when the process is not stationary ($d > 1/2$) for $\mathrm{CVM}^B$, $\mathrm{KSM}^B$, $\mathrm{CVM}^{(d)}$ and $\mathrm{KSM}^{(d)}$; see Table 4.3. However, when $d = 2$ or $2.6$, the empirical levels are much higher than the asymptotic one in all cases. We also notice that, in general, the empirical levels of the CVM test are more accurate than those of the KSM test.

J2 = 8, n = 15000, ARFIMA(1, d, 0)

        d          -0.8    0      0.2    0.6    1      1.6    2      2.6
J1 = 4  KSM^LLW    0.057  0.072  0.046  0.09   0.087  0.12   0.26   0.334
        KSM^(d)    0.07   0.086  0.063  0.11   0.24   0.26   0.33   0.54
        KSM^B      0.083  0.10   0.22   0.42   0.385  0.52   0.58   0.69
        CVM^LLW    0.03   0.045  0.061  0.053  0.043  0.064  0.091  0.14
        CVM^(d)    0.053  0.056  0.076  0.098  0.109  0.196  0.23   0.42
        CVM^B      0.031  0.068  0.091  0.15   0.29   0.25   0.31   0.49
J1 = 5  KSM^LLW    0.079  0.08   0.061  0.13   0.112  0.17   0.253  0.613
        KSM^(d)    0.09   0.078  0.089  0.21   0.27   0.25   0.45   0.71
        KSM^B      0.13   0.18   0.23   0.26   0.31   0.39   0.43   0.75
        CVM^LLW    0.056  0.049  0.039  0.06   0.087  0.085  0.12   0.26
        CVM^(d)    0.051  0.067  0.064  0.106  0.16   0.197  0.23   0.61
        CVM^B      0.091  0.097  0.094  0.15   0.21   0.26   0.31   0.63
J1 = 6  KSM^LLW    0.08   0.073  0.091  0.16   0.157  0.23   0.218  0.584
        KSM^(d)    0.067  0.089  0.18   0.27   0.31   0.31   0.381  0.786
        KSM^B      0.108  0.123  0.328  0.321  0.519  0.61   0.716  0.857
        CVM^LLW    0.06   0.071  0.062  0.078  0.067  0.096  0.108  0.247
        CVM^(d)    0.078  0.068  0.091  0.11   0.097  0.18   0.23   0.56
        CVM^B      0.101  0.11   0.172  0.328  0.303  0.411  0.515  0.702

Table 4.3: Empirical levels of the KSM and CVM tests on 15000 observations of different ARFIMA(1, d, 0) processes, using three estimators of the scalogram covariance matrix.
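The empirical levels reported in Table 4.3 are rejection frequencies over the Monte Carlo replications; schematically, with `stats` the statistic values across replications and `quantile` the asymptotic $(1-\alpha)$-quantile:

```python
import numpy as np

def empirical_level(stats, quantile):
    """Fraction of Monte Carlo replications whose test statistic exceeds the
    (1 - alpha)-quantile of the asymptotic null distribution."""
    stats = np.asarray(stats, dtype=float)
    return float(np.mean(stats > quantile))
```

Under the null hypothesis and a well-calibrated normalization, this fraction should be close to $\alpha = 0.05$, which is the comparison made in Table 4.3.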

4.4.4 Large scale multiple change detection for stock market data