• Aucun résultat trouvé

Sequential change-point detection in Poisson autoregressive models

N/A
N/A
Protected

Academic year: 2022

Partager "Sequential change-point detection in Poisson autoregressive models"

Copied!
15
0
0

Texte intégral

(1)

Vol. 156No. 4 (2015)

Sequential change-point detection in Poisson autoregressive models

Titre:Détection séquentielle de ruptures dans les models autorégressifs de Poisson

William Kengne1 *

Abstract:We consider the sequential change-point detection in a general class of Poisson autoregressive models. The conditional mean of the process depends on a parameterθ0ΘIRdwhich may change over time as and when data are observed. We propose a closed and open-end procedure based on the maximum likelihood estimator of the parameter. Under the null hypothesis of no change, it is shown that the detector converges to a well know distribution.

The (empirical) power and the efficiency in terms of the detection delay are assessed through a simulation study and a real data example is provided.

Résumé :Nous considérons la détection séquentielle de ruptures dans une classe assez générale de modèles de Poisson autorégressifs de séries temporelles à valeurs entières. La moyenne conditionnelle du processus dépend d’un paramètre θ0ΘIRdsusceptible de changer dans le temps au fur et à mesure que les données sont observées. Nous proposons une procédure séquentielle dont le temps de suivi peut être fini ou infini basée sur l’estimateur du maximum de vraisemblance du paramètre. Sous l’hypothèse nulle selon laquelle aucun changement n’intervient dans le paramètre, la statistique de test converge vers une distribution connue. Des résultats de simulations nous permettent d’évaluer la puissance (empirique) ainsi que l’efficacité en terme du délai de détection et un exemple d’application aux données réelles est fourni.

Keywords:Sequential detection, change-point, time series of counts, Poisson autoregression, likelihood estimation Mots-clés :Detection séquentielle, rupture, séries temporelles à valeurs entières, autoregression de Poisson, estimation par vraisemblance

AMS 2000 subject classifications:62M10, 62L12, 62F05

1. Introduction

Detecting change in time series has become an important research topic in statistics in the last three decades, since it has been known that many data often suffer from structural change. Before any statistical inference, one must test if a change has not occurred in the model during the data generating period. Two approaches are generally considered for solving this problem : the off-line (or retrospective) detection and the on-line (or sequential) detection. The off-line detection is an approach used when all data are available ; see the book ofCsörgö and Horváth(1997) for a large overview andAue et al.(2009a);Bardet et al.(2012);Kengne(2012);Fryzlewicz and Subba Rao (2013);Fokianos et al.(2014) among others, for some recent works that have been done in this setting. The on-line (sequential) detection that we will focus here, refers to the change-point detection as and when new data are observed.

* This work has been developed within the Labex MME-DII (ANR-11-LBX-0023-01) ;http://labex-mme-dii.

u-cergy.fr/.

1 THEMA, Université de Cergy-Pontoise, 33 Boulevard du Port, 95011 Cergy-Pontoise Cedex, France.

E-mail:william.kengne@u-cergy.fr

(2)

Numerous works have been devoted to the sequential change-point detection, we refer to the book ofBasseville et al.(1993) for a large surveys. But the important turning on this topic has been done with the paper ofChu et al.(1996). They have pointed out that repeating the retrospective test as new data are observed will increase the probability of type I error of the procedure. For the regression model, they addressed the sequential change detection as a classical hypothesis testing with a fixed probability of type I error and proposed two monitoring procedures based on cumulative sum (CUSUM) of residuals and recursive parameter fluctuations. Their approach has been generalized and extended in several directions. SeeLeisch et al.(2000);Horváth et al.(2004);

Zeileis et al.(2005);Aue et al.(2006);Horvath et al.(2007);Aue et al.(2009b) for some various procedures for sequential change in linear models with independent and dependent innovations.

Berkes et al.(2004) and Gombay and Serban(2009) focussed respectively on the sequential change-point detection in the parameters of GARCH and linear autoregressive processes ; whereas Na et al.(2011) proposed fluctuation-type test procedure for sequential change detection in a general class of time series models.Bardet and Kengne(2014) considered a large class of causal models (that including AR(∞), ARCH(∞), TARCH(∞),... processes) and developed a sequential procedure where the updated estimator is computed without the historical observations ; the consistency of the procedure has been established.

In this paper, we focus on sequential change detection in a large class of Poisson autoregressive models. More precisely, we consider a processY = (Yt)t∈Zsatisfying :

Yt/Ft−1∼ Poisson(λt)withλt= fθ

0(Yt−1, . . .) (1)

whereFt =σ(Ys, s≤t) is theσ-field generated by the whole past of the process and f is a measurable non-negative function assumed to be know up to some parameterθ0belonging to a compact setΘ⊂IRd.

Let{Nt(·);t=1,2,· · · }be a sequence of independent Poisson processes of unit intensity.Yt can be seen as the number of events ofNt(·)in the time interval[0,λt]. The feedback mechanism in the conditional meanλt provides a large tool for modeling dependence structure in count events (number of new infections, number of transactions per minute, number of defect products,· · ·).

The model with one order autoregression has been addressed byFokianos et al.(2009);Fokianos and Tjøstheim(2012). They investigated the asymptotic properties of the maximum likelihood estimator by using a perturbation approach. WhereasDoukhan et al.(2012) use a weak depen- dence approach to prove the existence of a stationary solution of a class of model larger than (1).

Neumann et al.(2011) focused on the absolute regularity and ergodicity of the model with one order autoregression. For surveys on the change-point and outliers detection in integer-valued autoregressive models and relate, we refer toKang and Lee(2009,2014);Fokianos and Fried (2010,2012);Doukhan and Kengne(2015).

The previous papers have worked out the retrospective change-point detection on variants of the model (1) and related models. Sequential detection seems to be very important in finance, economics or epidemiology. For example, in the real data application in finance (see below), it is crucial to know early if the high variations in the number of transactions are noise effects or are due to a structural change in the dynamic of the transaction process, before taking an appropriate decision.

(3)

Now, assume that a trajectory(Y1,· · ·,Yn)of the processY generated as in (1) with a parameter θ0 has been observed. In the setting of the sequential detection, these available observations are called thehistorical data. New dataXn+1,Xn+2,· · ·,Xk,· · · will be observed and monitoring scheme starts at timen+1. For each new observation, one would like to know if it has been generated by the model depending on θ0 or some other parameterθ1 (withθ16=θ0). More precisely, we consider the following test problem :

H00is constant over the observationsY1,· · ·,Yn,Yn+1,· · · i.e.(Yn)n∈IN satisfying (1) with the parameterθ0;

H1: there existk>n,(θ01)∈Θ2, withθ06=θ1, such that(Y1,· · ·,Yk)satisfying (1) with the parameterθ0and and(Yk+n)n∈INsatisfying (1) withθ1.

It is known that under H0, the maximum likelihood estimator (MLE in the sequel) of the parameterθ0is consistent (seeDoukhan and Kengne,2015). Therefore, as the fluctuation pro- cedure proposed byChu et al.(1996), we define a detector based on the difference between the historical and the updated parameter estimator. More precisely, at the timek>n, we compare the estimator computed on the historical data to that computed with all the data up to the timek(i.e.

X1,· · ·,Xn,Xn+1,· · ·,Xk). Thus, under H0these two quantities are consistent estimators ofθ0, so they will be close and the detector will not be large enough. We will show through simulation study that the detector seems to be large enough under H1. Hence, if the detector is larger than a suitable critical value, thenH0is rejected and a model with a new parameter is considered;

otherwise, the sequential monitoring scheme continues.

2. Assumptions and examples 2.1. Assumptions

We will use the following classical notations:

1. kyk:=

p

P

j=1

|yj|for anyy∈IRp;

2. for any compact setK ⊆IRdand for any functiong:K −→IRd0,kgkK =supθK(kg(θ)k);

3. for any setK ⊆IRd,K˚ denotes the interior ofK.

Throughout the sequel, we will assume that the functionθ7→ fθ is twice continuously differen- tiable onΘ. The following Lipschitz-type condition A0(Θ)is classical to ensure the existence of solution of such model (see for instanceDoukhan and Wintenberger,2008) and the assumptions A1(Θ), A2(Θ) as well as D(Θ), Id(Θ) and Var(Θ) are needed for inference on the model see Doukhan and Kengne(2015).

Fori=0,1,2 and for any compact setK ⊆Θ, define

Assumption Ai(K):k∂ifθ(0)/∂ θikΘ<∞and there exists a sequence of non-negative real numbers (αk(i)(K))k≥1 satisfying P

j=1

αk(0)(K)<1 (wheni=0) and P

j=1

αk(i)(K)<∞(when

(4)

i=1,2) such that

ifθ(y)

∂ θi −∂ifθ(y0)

∂ θi K

X

k=1

αk(i)(K)|yk−y0k| for ally,y0∈(IR+)IN. Assumption D(Θ):∃c>0 such that inf

θ∈Θ(fθ(y))≥cfor ally∈(IR+)IN.

Assumption Id(Θ):For all(θ,θ0)∈Θ2,fθ(Yt−1, . . .) = fθ0(Yt−1, . . .)a.s. for somet∈Z⇒ θ=θ0.

Assumption Var(Θ):For allθ∈Θandt∈Z, the components of the vector∂fθ

∂ θi(Yt−1,...)are a.s.

linearly independent.

2.2. Examples

2.2.1. Linear Poisson autoregression

We consider an integer-valued time series(Yt)t∈Zsatisfying for anyt∈Z Yt/Ft−1∼ Poisson(λt)withλt00) +X

k≥1

φk0)Yt−k (2) whereθ0∈Θ⊂IRd, the functionsθ 7→φk(θ)are positive and satisfyingPk≥1k(θ)kΘ<1.

This model is also called an INARCH(∞), due to its similarity with the classical ARCH(∞) model.

Assumptions A0(Θ)holds automatically. If the functionφkare twice continuous differentiable such thatPk≥1k0(θ)k

Θ<∞andPk≥1k00(θ)k

Θ<∞, then A1(Θ)and A2(Θ)hold. If inf

θ∈Θφ0(θ)>0 then D(Θ) holds. Moreover, if there exists a finite subsetI⊂IN− {0} such that the function θ7→(φk(θ))k∈Iis injective, then assumption Id(Θ)holds and the model (2) is identifiable.

Note that the classical Poisson INGARCH(p,q) (seeFerland et al.,2006orWeiß,2009) obtained with

λt00) +

p

X

k=1

αk0t−k+

q

X

k=1

βk0)Yt−k

is a special case of the model (2) if the conditionPk=1pk(θ)kΘ+Pqk=1k(θ)kΘ<1 is satisfied.

Fokianos and Fried(2010) focussed on the intervention effects (that can generated sudden mean shift or outliers seeFokianos and Fried,2010) in this model whereas the diagnostic checking when p=0 has been addressed byZhu and Wang (2010). The INGARCH(1,1) is known to describe adequately the number of transactions in the stock Ericsson B (seeFokianos et al.,2009, 2013;Doukhan and Kengne,2015).

2.2.2. Poisson exponential autoregressive model

This model, proposed byFokianos et al.(2009) is defined by

Yt/Ft−1∼ Poisson(λt)withλt=Äα01exp(−γ λt−12 )äλt−1+βYt−1 (3) whereα01,β,γ>0 are the parameters of the model. Ifα01+β<1, then the Lipschitz-type conditions above hold. Moreover, the assumptions D(Θ), Id(Θ)and Var(Θ) hold. SeeFokianos et al.(2009) for an application of this model to transactions data.

(5)

3. The sequential procedure and the main results 3.1. Likelihood estimation

For any integer`≥1, denote

T`={1,2,· · ·, `}.

Letk≥n≥1. If(X1,· · ·,Xk)is generated according to (1) with the parameterθ, then for any

`≤k, the conditional (log)-likelihood (up to a constant) computed onT`, is given by L(T`,θ) =

`

X

t=1

ÄYtlogλt(θ)−λt(θ)ä=

`

X

t=1

`t(θ) with `t(θ) =Ytlogλt(θ)−λt(θ) whereλt(θ) = fθ(Yt−1, . . .). In the sequel, we use the notation fθt := fθ(Yt−1, . . .).This (log)- likelihood can be approximated by (see alsoDoukhan and Kengne,2015)

L(Tb `,θ) =

`

X

t=1

ÄYtlogbλt(θ)−λbt(θ)ä (4)

wherebλt(θ):= fbθt := fθ(Yt−1, . . . ,Y1,0, . . .) andbλ1(θ) = fθ(0, . . .). The maximum likelihood estimator of the parameter computed onT`is defined by

θ(Tb `) =argmaxθ∈Θ(bL(T`,θ)). (5) The two following results have been established byDoukhan and Kengne(2015). Under H0, if θ0∈Θ˚ andD(Θ), Id(Θ), Var(Θ) andAi(Θ)i=0,1,2 hold with

X

j≥1

pj×α(i)j (Θ)<∞ (6)

then

θb(Tn)−−−−→a.s.

n→+∞ θ0

and √

n(θ(Tb n)−θ0)−−−−→D

n→+∞ N (0,Σ−1) (7)

whereΣ=EÄ 1

f0

θ 0

(∂ θ fθ0

0)(∂ θ fθ0

0)0äand where0denotes the transpose. Under the above conditions, the matrix

bΣn=1 n

n

X

t=1

1 fbθt

Ä

∂ θ bfθtäÄ

∂ θ fbθtä0

θ=θb(Tn)

is a consistent estimator ofΣ. These results will be the main tools for the following sequential procedure.

(6)

3.2. The sequential procedure

In the sequel,(Y1,· · ·,Yn)is supposed to be the historical available observations generated accord- ing to (1) with the parameterθ0. At a monitoring instantk, we will assess the difference between the two estimatorsθ(Tb k)andθ(Tb n). More precisely, according to (7), we define for anyk>nthe statisticDk, called thedetectorby

Dk:=√

nΣb1/2n Äθ(Tb k)−θb(Tn)ä.

This detector is well defined since the matrixΣbnis asymptotically symmetric and positive definite (seeDoukhan and Kengne,2015). Note that, if change does not occur at timek>n, both the estimatorsθb(Tk)andθb(Tn)are close and the detectorDk is not too large.

LetT >1 (T can be equal to infinity). The sequential monitoring scheme rejectsH0at the first timeksatisfyingn<k≤[T n] +1 such thatDk>cfor a suitably chosen constantc>0, where [x]denote the integer part ofx. The procedure is calledclosed-end method whenT <∞and open-end methodwhenT =∞. The set{n+1,n+2,· · ·,[T n]}is called the monitoring period.

The choice ofT depends on the time that we when to monitor the procedure. Note that, this choice will not affect the efficiency of the procedure ; as we will see below, the critical value of the test takes into account the choice ofT. But in practice, we cannot monitor until infinity thus, we have to chooseT <∞.

To build a detector that is sensitive to detect changes that occur at the beginning of the monitoring and those occur a long time after the beginning of the monitoring, we use the so-called boundary functionb:[1,∞)7→(0,∞), assumed to be continuous and satisfying :

1≤t<∞Inf b(t)>0.

UnlikeBardet and Kengne(2014), the decreasing assumption is not imposed tob.

The monitoring scheme rejectsH0at the first timek(withn<k≤[T n] +1 ) such asDk>b(k/n).

Hence define the stopping time:

τ(n):=Infnn<k≤[T n] +1¿Dk>b(k/n)o with the convention that Inf{/0}=∞. Therefore, we have

P{τ(n)<∞}=Pn Dk,`

b(k/n) >1 for somek betweennand[T n] +1o

=Pn sup

n<k≤[T n]+1

Dk

b(k/n)>1o. (8)

The main aim is to choose a suitable boundary functionb(·)such as for some givenα∈(0,1),

n→∞limPH0{τ(n)<∞}=α (9) andPH1{τ(n)<∞}is asymptotically close to one ; where the hypothesisH0andH1are specified in Section1.

(7)

If the break is suspected to occur soon after the beginning of the monitoring, a boundary function bwith the smallest values in the neighborhood of 1 is appropriate. Otherwise, choose a functionb that takes its largest values in the neighborhood of 1. But, in practice, we do not often know the instant of the break. So, one will choose a constant boundary functionb≡cwithc>0, this leads to compute a thresholdc=cαthat satisfying (9).

Moreover, if a change-point is detected underH1i.e.τ(n)<∞andτ(n)>k, then the detection delay is defined by

dbn=τ(n)−k. (10)

dbn is used to assess the efficiency of the procedure to early detect changes in the model. The smaller is the detection delay, the better is the efficiency under the alternative.

3.3. Main results

UnderH0, the parameterθ0does not change over the new observations. The following theorem displays the asymptotic behavior under the null hypothesis of the detectorDk for the open and closed-end procedure.

Theorem 3.1. Assume D(Θ),Id(Θ), Var(Θ) and Ai(Θ)i=0,1,2hold with X

j≥1

pj×α(i)j (Θ)<∞.

Under H0withθ0∈Θ, for the open-end (T˚ =∞) and closed-end (T<∞) procedure it holds that

n→∞limP{τ(n)<∞}=Pn sup

1≤s≤T

kWd(s)−sWd(1))k s b(s) >1o, where Wd is a d-dimensional standard Brownian motion.

In the case of using the most "natural" boundary functionb(·) =cwithc>0, the following corollary is a direct application of Theorem3.1.

Corollary 3.1. Assume b(t) =c>0for all t≥0. Under the assumptions of Theorem3.1, and with T ∈(1,∞),

n→∞limP{τ(n)<∞}=P{Ud,T>c}

where

Ud,T =»(T−1)/T sup

0≤s≤1

kWd(s)k when 1<T <∞ (11) and

Ud,∞= sup

0≤s≤1

kWd(s)k. (12)

So that, at a nominal levelα∈(0,1), the critical value of interest isc=cαthe(1−α)-quantile of the distribution ofUd,T. Note that, the distribution of sup0<s≤1kWd(s)k is known see for instanceBerkes et al.(2004). The quantiles of this distribution can also be computed through a

(8)

Monte Carlo simulation ; see the values display in Table 1 ofNa et al.(2011) ford=1,2,· · ·,5.

Under the alternative, one would like to show that sup

n<k≤[T n]+1

Dk b(k/n)

−−−−→P

n→+∞ ∞ (13)

which is enough for the consistency under H1. But this seems to be not easy, since the convergence ofθ(Tb k)is not ensured under H1. But we believe that (13) can be obtained with an adapted version of the procedure proposed byBardet and Kengne(2014). This problem is the topic of a different research project. Nevertheless, the following simulation study shows that the procedure based on Dkstill works well under the alternative.

4. Some numerical results

We present some numerical results for sequential change-point detection in Poisson autoregressive models. These results are based on the application of the INGARCH(1,1) model that satisfying

Yt/Ft−1∼ Poisson(λt)withλt01λt−11Yt−1 (14) whereα0>0 andα11≥0. Denoteθ= (α011)the parameter of the model. As we men- tioned above, this model is known to describe adequately the number of transactions in the stock Ericsson B on that we will focus in the real data example. In the sequel, we takeb(s)≡cis constant and deal with the closed-end procedure withT =2 ; hence the procedure is monitored fromk=n+1 tok=2n. The nominal level used isα=0.05. According to Corollary3.1, the critical value of the test satisfiescα =»(T−1)/T c0α wherec0α is the(1−α)-quantile of the distribution of sup0<s≤1kWd(s)k(d=3 in this section). From Table 1 ofNa et al.(2011), we get cα=2.130.

In the sequel, the historical dataX1,· · ·,Xnare generated by the model (14) depending on the parameterθ0. Under the null hypothesis, the new observationsXn+1,· · ·,X[T n]are also generated according toθ0. Under the alternative, the new observationsXn+1,· · ·,Xk are derived from the model that depends onθ0whileXk+1,· · ·,X[T n]depend onθ1.

4.1. An illustration

We consider the above INGARCH(1,1) model and generate the historical data of lengthn= 500 according to the parameterθ0= (0.5,0.7,0.15). Figure1(a) and (b) displays the statistics (Dk)501≤k≤1000for a scenario without change and a scenario with break atk=1.25n=625.

Figure1a-) shows that the detectorDkis under the horizontal line which represents the critical value of the test. In Figure1 b-), before change occurs,Dk is under the horizontal line and increases with a high speed after change. Such growth over a long period indicates that something happening in the model.

(9)

a-) Dk for INGARCH(1,1) without change

k

500 600 700 800 900 1000

0.01.02.03.0

b-) Dk for INGARCH(1,1) with a break at k*=625

k

500 600 700 800 900 1000

0123

FIGURE1. Typical realization of the statistics(Dbk)501≤k≤1000for INGARCH(1,1)with n=500. a-) The parameter θ0= (0.5,0.7,0.15)is constant; b-) the parameterθ0= (0.5,0.7,0.15)changes toθ1= (0.5,0.4,0.3)at k=625.

The horizontal solid line represents the limit of the critical region, the vertical dotted line indicates where the change occurs and the vertical solid line indicates the time where the sequential procedure points a presence of break in the data.

4.2. Sequential change-points detection in INGARCH(1,1) processes

For the problem of the sequential change-points detection in the INGARCH(1,1) model (14), we consider the following scenarios.

– Scenario A: under H00= (0.5,0.7,0.15). Under H10changes toθ1= (0.5,0.4,0.3) atk=1.25n.

– Scenario B: under H00= (1,0.2,0.3). Under H10changes toθ1= (0.4,0.2,0.3)at k=1.25n.

Note that, the scenario A is close to the fitted model obtained from the number of transactions per minute for the stock Ericsson B, which is known to exhibitα11≈0.9 (see for instance Fokianos et al.,2009).

Table1reports the empirical levels and powers based on 200 replications forn=150,300,500.

Some elementary statistics of the empirical detection delay (see (10)) are summarized in Table2.

TABLE1. Empirical levels and powers for sequential change-point detection in INGARCH(1,1) processes according to the scenarios A and B.

n=150 n=300 n=500 Empirical levels : Scenario A 0.090 0.085 0.060

Scenario B 0.085 0.075 0.055 Empirical powers : Scenario A 0.650 0.840 0.980 Scenario B 0.630 0.945 0.995

(10)

TABLE2. Elementary statistics of the empirical detection delay for sequential change-point detection in INGARCH(1,1) processes according to the scenarios A and B.

dbn Mean SD Min Q1 Med Q3 Max

Scenario A n=150 ;k=187 53.96 27.25 4 34 55 78 109 n=300 ;k=375 98.30 48.57 11 59 95 133 218 n=500 ;k=625 102.90 53.67 13 65 94 126 351 Scenario B n=150 ;k=187 65.77 21.64 17 49 68 84 112 n=300 ;k=375 90.13 33.33 19 65 87 109 215 n=500 ;k=625 94.60 29.03 32 74 91 112 206

The results of Table1 show some distortion in the empirical levels. But it decreases asn increases for approaching the nominal level according to Corollary3.1. The empirical powers increases withnand approaching one whenn=500 for both scenarios A and B. Even if the asymptotic power one has not yet been proved, these results show the efficiency of the procedure to detect change-points under the alternative.

Moreover, recall that the detection delaydbnis the random distance between the break time and the stopping time of the procedure. For example, whenn=150 with the break occurred at the timek=187 ; from Table2, this break is detected on average after a delay of 54 for the scenario A. Note that, even if it appears to be symmetric (from Table2and some other simulations not reported here) the distribution ofdbnis unknown. As pointed out byBardet and Kengne(2014), such procedure (that take into account all the historical data in the updated parameter estimate) leads to a detection delay which increases withn; but one can see that the standard deviation is stabilized forn≥300. According to the dependence structure of the model and the numerical difficulties to compute the estimate of the parameter (a minimum sample size of 50 is needed to expect the convergence of the numerical algorithm), the results of Table2are quite satisfactory.

4.3. Real data example

We consider the number of transactions per minute for the stock Ericsson B during July 16, 2002. There are 460 observations which represent trading from 09 : 35 to 17 : 14. These data are displayed in Figure2. Note that, the number of transactions per minute during July 2, 2002 has been studied byFokianos et al.(2009,2013) and are known to be described adequately by the INGARCH(1,1) model.

The INGARCH(1,1) model describes the first 130 observations(from 09 : 35 to 11 : 45) ade- quately according to the goodness-of-fit test proposed byFokianos et al.(2013). Therefore, these observations are considered as the historical data and we assume that the parameter of the model does not change during this period. Hence, the monitoring starts at the timet=131 and the detectorDk is computed at each time where a new observation is supposed to be available.Dk is displayed in Figure3.

Figure3shows that the procedure works well for this real data example, according to the detection delay which is reasonably good. Note that, in practice, once the sequential procedure stops and indicates a break in the observations, a retrospective procedure has to be applied to estimate the

(11)

Number of transactions per minute in stock Ericsson B on July 16, 2002

Time Nbr of transactions 051015

10:00 12:00 14:00 16:00 17:00

FIGURE2. Number of transactions per minute for the stock Ericsson B during July 16, 2002. The vertical line represents the break detected using the retrospective procedure proposed byDoukhan and Kengne(2015).

real time of change. To end this subsection, let us point out that the sequential procedure could be a good tool for off-line multiple change-points detection, seeBardet and Kengne(2014) for an example of causal time series.

5. Proofs of the main results

In the sequel, C denotes a positive constant which the value may differ from one inequality to another.

Under H0, the asymptotic covariance matrix ofθb(Tn)isΣ−1. For anyk>n, denote Dk:=√

nΣ1/2Äθ(Tb k)−θb(Tn)ä.

The following lemma will be useful.

Lemma 5.1. Let T>1(can be equal to∞). Under the assumptions of Theorem3.1, sup

n<k<[T n]+1

1

b(k/n)kDk−Dkk=oP(1) as n→∞.

Proof. FromDoukhan and Kengne(2015), it hold thatΣbn a.s.

−−−→n→∞ Σ,kθb(Tn)−θ0k=OP(1/√ n)

(12)

Dk for the number of transactions

Time

140 160 180 200 220

01234567

FIGURE3. Realizations of the statistics(Dbk)131≤k≤235for number of transactions per minute in the stock Ericsson B during July 16, 2002 ; the historical data considered are the first130observations. The horizontal solid line represents the limit of the critical region, the vertical dotted line indicates the break that has been detected using the retrospective procedure ofDoukhan and Kengne(2015) and the vertical solid line indicates the stopping time of the sequential procedure.

and fork>n,kθ(Tb k)−θ0k=OP(1/√

k)asn→∞. Hence sup

n<k<[T n]+1

1

b(k/n)kDk−Dkk= 1

1≤s≤TInf b(s) sup

n<k<[T n]+1

√nÄΣb1/2n −Σ1/2äÄθ(Tb k)−θb(Tn)ä

≤C sup

n<k<[T n]+1

√nkΣb1/2n −Σ1/2kkθb(Tk)−θ0k

+C√

nkΣb1/2n −Σ1/2kkθb(Tn)−θ0k

=oP(1) +oP(1) =oP(1).

Proof of Theorem3.1

The following proof is for the closed-end procedure (T <∞). The result for open-end (T =∞) procedure is established along similar lines.

According to (8) and Lemma5.1, it suffices to show that sup

n<k≤[T n]+1

Dk b(k/n)

−−−−→D

n→+∞ sup

1≤s≤T

kWd(s)−sWd(1))k s b(s) .

(13)

Fork>n, from the proof of Theorem 4.1 ofDoukhan and Kengne(2015), it hold that Σ(θb(Tk)−θ0) =1

k

∂ θL(Tk0) +oP( 1

k) and Σ(θ(Tb n)−θ0) =1 n

∂ θL(Tn0) +oP( 1

√n).

Hence,

Σ(θb(Tk)−θ(Tb n)) = 1 k

Ä

∂ θL(Tk0)−k n

∂ θL(Tn0)ä+oP( 1

√n).

The latter inequality holds uniformly ink>nby usingoP(1

k) +oP(1n) =oP(1n). Therefore,

√nΣ1/2b(Tk)−θ(Tb n)) = n k

√1

−1/2Ä

∂ θL(Tk0)−k n

∂ θL(Tn0)ä+oP(1)

= n k

√1 nΣ−1/2

Xk

t=1

∂ θ`t0)−k n

n

X

t=1

∂ θ`t0)+oP(1). (15) According toDoukhan and Kengne (2015),Ä`t0),Ft

ä

t∈Z is a stationary ergodic square in- tegrable martingale difference sequence with covariance matrixΣ. Hence, from Cramér-Wold device (cf.Billingsley,1968), it holds that, for anyT >1,

√1 n

[ns]

X

t=1

∂ θ`t0)−−−−→D[1,T]

n→+∞ Wd,Σ

whereWd,Σ is a zero mean d-dimensional Gaussian process satisfying EÄWd,Σ(s)0Wd,Σ(τ)ä= min(s,τ)Σ. Hence

n [ns]

√1 nΣ−1/2

X[ns]

t=1

∂ θ`t0)−[ns]

n

n

X

t=1

∂ θ`t0)−−−−→D[1,T]

n→+∞

Wd(s)−sWd(1)

s (16)

whereWdis a d-dimensional standard Brownian motion. Thus, from (15) and (16), it follows that sup

n<k≤[T n]+1

Dk

b(k/n) = sup

n<k≤[T n]+1

1 b(k/n)

n k

√1 nΣ−1/2

Xk

t=1

∂ θ`t0)−k n

n

X

t=1

∂ θ`t0)

+oP(1)

= sup

1<s≤T

1 b([ns]/n)

n [ns]

√1 nΣ−1/2

[ns]

X

t=1

∂ θ`t0)−[ns]

n

n

X

t=1

∂ θ`t0)

+oP(1)

−−−−→D

n→+∞ sup

1≤s≤T

kWd(s)−sWd(1)k sb(s) .

Proof of Corollary3.1

The following proof is for the closed-end procedure (T <∞). The result for open-end (T =∞) procedure is established along similar lines.

Ifb(s)≡cis constant, then from Theorem3.1it follows that

n→∞limP{τ(n)<∞}=Pn sup

1≤s≤T

kWd(s)−sWd(1))k s >co.

(14)

Hence, it suffices to show that sup

1≤s≤T

kWd(s)−sWd(1))k s

=D »(T−1)/T sup

0≤s≤1

kWd(s)k.

By computing the covariance matrix, one can verify that

Wd(s)−sWd(1))

s ,s≥1©=D Wd(s−1

s ),s≥1©. Thus,

sup

1≤s≤T

kWd(s)−sWd(1))k s

=D sup

1≤s≤T

Wd(s−1 s )

= sup

0≤τ≤1

Wd

ÄT−1 T τä

=D sup

0≤τ≤1

»(T−1)/TkWd(τ)k.

References

Aue, A., Hörmann, S., Horváth, L., Reimherr, M., et al. (2009a). Break detection in the covariance structure of multivariate time series models.The Annals of Statistics, 37(6B):4046–4087.

Aue, A., Horváth, L., Hušková, M., and Kokoszka, P. (2006). Change-point monitoring in linear models. The Econometrics Journal, 9(3):373–403.

Aue, A., Horváth, L., and Reimherr, M. L. (2009b). Delay times of sequential procedures for multiple time series regression models.Journal of Econometrics, 149(2):174–190.

Bardet, J.-M. and Kengne, W. (2014). Monitoring procedure for parameter change in causal time series.Journal of Multivariate Analysis, 125:204–221.

Bardet, J.-M., Kengne, W., Wintenberger, O., et al. (2012). Multiple breaks detection in general causal time series using penalized quasi-likelihood.Electronic Journal of Statistics, 6:435–477.

Basseville, M., Nikiforov, I. V., et al. (1993). Detection of abrupt changes: theory and application, volume 104.

Prentice Hall Englewood Cliffs.

Berkes, I., Gombay, E., Horváth, L., and Kokoszka, P. (2004). Sequential change-point detection in garch (p, q) models.

Econometric Theory, 20(06):1140–1167.

Billingsley, P. (1968). Convergence of probability measures.

Chu, C.-S. J., Stinchcombe, M., and White, H. (1996). Monitoring structural change.Econometrica: Journal of the Econometric Society, pages 1045–1065.

Csörgö, M. and Horváth, L. (1997).Limit theorems in change-point analysis. Wiley New York.

Doukhan, P., Fokianos, K., and Tjøstheim, D. (2012). On weak dependence conditions for poisson autoregressions.

Statistics & Probability Letters, 82(5):942–948.

Doukhan, P. and Kengne, W. (2015). Inference and testing for structural change in general poisson autoregressive models.Electronic Journal of Statistics, 9:1267–1314.

Doukhan, P. and Wintenberger, O. (2008). Weakly dependent chains with infinite memory.Stochastic Processes and their Applications, 118(11):1997–2013.

Ferland, R., Latour, A., and Oraichi, D. (2006). Integer-valued garch process. Journal of Time Series Analysis, 27(6):923–942.

Fokianos, K. and Fried, R. (2010). Interventions in ingarch processes.Journal of Time Series Analysis, 31(3):210–225.

Fokianos, K. and Fried, R. (2012). Interventions in log-linear poisson autoregression.Statistical Modelling, 12(4):299–

322.

(15)

Fokianos, K., Gombay, E., and Hussein, A. (2014). Retrospective change detection for binary time series models.

Journal of Statistical Planning and Inference, 145:102–112.

Fokianos, K., Neumann, M. H., et al. (2013). A goodness-of-fit test for poisson count processes.Electronic Journal of Statistics, 7:793–819.

Fokianos, K., Rahbek, A., and Tjøstheim, D. (2009). Poisson autoregression. Journal of the American Statistical Association, 104(488):1430–1439.

Fokianos, K. and Tjøstheim, D. (2012). Nonlinear poisson autoregression. Annals of the Institute of Statistical Mathematics, 64(6):1205–1225.

Fryzlewicz, P. and Subba Rao, S. (2013). Multiple-change-point detection for auto-regressive conditional heteroscedas- tic processes.Journal of the Royal Statistical Society: series B (statistical methodology).

Gombay, E. and Serban, D. (2009). Monitoring parameter change in time series models. Journal of Multivariate Analysis, 100(4):715–725.

Horváth, L., Hušková, M., Kokoszka, P., and Steinebach, J. (2004). Monitoring changes in linear models.Journal of Statistical Planning and Inference, 126(1):225–251.

Horvath, L., Kokoszka, P., and Steinebach, J. (2007). On sequential detection of parameter changes in linear regression.

Statistics & probability letters, 77(9):885–895.

Kang, J. and Lee, S. (2009). Parameter change test for random coefficient integer-valued autoregressive processes with application to polio data analysis.Journal of Time Series Analysis, 30(2):239–258.

Kang, J. and Lee, S. (2014). Parameter change test for poisson autoregressive models. Scandinavian Journal of Statistics, 41(4):1136–1152.

Kengne, W. C. (2012). Testing for parameter constancy in general causal time-series models.Journal of Time Series Analysis, 33(3):503–518.

Leisch, F., Hornik, K., and Kuan, C.-M. (2000). Monitoring structural changes with the generalized fluctuation test.

Econometric Theory, 16(06):835–854.

Na, O., Lee, Y., and Lee, S. (2011). Monitoring parameter change in time series models. Statistical Methods &

Applications, 20(2):171–199.

Neumann, M. H. et al. (2011). Absolute regularity and ergodicity of poisson count processes.Bernoulli, 17(4):1268–

1284.

Weiß, C. H. (2009). Modelling time series of counts with overdispersion. Statistical Methods and Applications, 18(4):507–519.

Zeileis, A., Leisch, F., Kleiber, C., and Hornik, K. (2005). Monitoring structural change in dynamic econometric models.Journal of Applied Econometrics, 20(1):99–121.

Zhu, F. and Wang, D. (2010). Diagnostic checking integer-valued arch (p) models using conditional residual autocorre- lations.Computational Statistics & Data Analysis, 54(2):496–508.

Références

Documents relatifs

We construct efficient robust truncated sequential estimators for the pointwise estimation problem in nonparametric autoregression models with smooth coeffi- cients.. For

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des

R EMARK 1.6.– It should be emphasized that to estimate the function S in [1.1.1] we use the approach developed in Galtchouk and Pergamenshchikov (2011) for the sequential

Abstract: We construct a robust truncated sequential estimators for the pointwise estimation problem in nonpara- metric autoregression models with smooth coefficients.. For

7 the mass curve are compared to the real data: black circles are the masses of the segmented images, the black line is the mass of the interpolation obtained by an exponential

The test done on the hepatitis dataset provided by the Chiba University Hospital allowed us after a learning step, to extract very short patterns, which were sufficient, in half

We construct a sequential adaptive procedure for estimating the autoregressive function at a given point in nonparametric autoregression models with Gaussian noise.. We make use of

The detection of anomalies is performed utilizing three different algorithms (one algorithm for each type of anomaly) by finding the patterns that are close to the