• Aucun résultat trouvé

Testing stationary processes for independence

N/A
N/A
Protected

Academic year: 2022

Partager "Testing stationary processes for independence"

Copied!
7
0
0

Texte intégral

(1)

www.imstat.org/aihp 2011, Vol. 47, No. 4, 1219–1225

DOI:10.1214/11-AIHP426

© Association des Publications de l’Institut Henri Poincaré, 2011

Testing stationary processes for independence

Gusztáv Morvai

a,1

and Benjamin Weiss

b

aMTA-BME Stochastics Research Group, Egry József utca 1, Building H, Budapest, 1111 Hungary. E-mail:morvai@math.bme.hu bInstitute of Mathematics, Hebrew University of Jerusalem, Jerusalem 91904, Israel. E-mail:weiss@math.huji.ac.il

Received 27 October 2009; revised 21 December 2010; accepted 15 March 2011

Abstract. LetH0denote the class of all real valued i.i.d. processes andH1all other ergodic real valued stationary processes. In spite of the fact that these classes are not countably tight we give a strongly consistent sequential test for distinguishing between them.

Résumé. SoitH0la classe de tous les processus indépendants et équidistribués à valeurs réelles, etH1la classe complémentaire dans l’ensemble des processus ergodiques. Nous donnons un test séquentiel fortement consistant pour les distinguer.

MSC:62M07

Keywords:Independent processes; Hypothesis testing

1. Introduction

In sequential testing for distinguishing between two competing hypotheses, the usual framework consists of two classes of stochastic processesH0andH1, and a sequence ofnobservationsX1, X2, . . . , Xnwhich represent a sam- pling from a single process. A test is usually a sequence of functionsgnofnarguments that take the values zero and one and it is said to be weakly consistentif the sequence converges to 0/1 in probability according to whether the process belongs toH0/H1. It is said to beconsistentif the convergence is pointwise with probability one. Much clas- sical work (see [2,3,5]) was done in the case where the classesHi consisted of i.i.d. processes with the classes being distinguished by the type of the underlying distribution. More recently (see [11], [6], [12] and [10]) the more general problem was considered where the classesHiare no longer restricted to being i.i.d. but are assumed to contain certain ergodic stationary processes. In particular in the latter two papers some general sufficient conditions were given on the classes which ensure the existence of consistent tests.

Perhaps the simplest question of this type that one can put is to distinguish between H0, the class of all i.i.d.

processes and its complement in the class of ergodic stationary processses, namelyH1is taken to be all such processes that are not independent. If we restrict attention to finite valued processes with a fixed number of states then D. Bailey in his thesis [1] gave just such a consistent test that was based on his universal scheme for estimating the entropy of an unknown process. If however we takeH0to be the class of all real valued independent processes then none of these earlier results apply. For example, the main result in [10] assumes that the union of the two classes is contained in a countable union of uniformly tight processes whereas the class of all distribution functions on the real line is not countably tight. Our purpose in this note is to answer just this question by providing a consistent sequential test for deciding whether an arbitrary stationary ergodic process is independent or not. In fact our result applies to vector-

1Supported by OTKA grant No. K75143 and Bolyai János Research Scholarship.

(2)

valued processes as well, indeed all that one needs to assume is that the values taken by the process lie in a countably generated measurable space which is known a priori.

In the first section we shall show how an appropriate quantization enables one to adapt any good set of independence tests for finite valued i.i.d. processes to produce an independence test for general processes. In the second section we shall give a specific example of such a good test. Similar questions can be asked about classes of Markov processes and we have answered some of these in earlier work. In [8] we showed that even when we restrict to binary processes the class of all finite order Markov chains cannot be distinguished from its complement by any weakly consistent test.

On the other hand for anykthe order-kMarkov chains can be distinguished from their complement by a consistent test even in the setting of countable alphabets (see [7] and [9]).

2. A general test for independence

Let{X,F}be a countably generated measurable space, such asRorRdor any Polish space with its Borelσ-algebra.

Our process{Xn}n=−∞will be stationary and ergodic taking values inX. (Note that all stationary time series{Xn}n=0 can be thought to be a two sided time series, that is,{Xn}n=−∞.) The reader can keep in mind real valued processes without loss of generality.

For notational convenience, letXmn =(Xm, . . . , Xn), wheremn. Note that ifm > nthenXmn is the empty string.

LetAkbe a refining sequence of finite partitions ofXand denote byAˆkthe corresponding algebras of sets. Assume that

k=1Aˆk generates the wholeσ-algebraF. For a countably generated measurable space such sequences always exist. For exampleAkcan be the partition ofRinto intervals of the form[j/2k, (j+1)/2k)for all integers|j|< k2k and their complement. We will denote by

Qk(x)=Ai ifxAiAk

the function which assigns tox the set of the partition to which it belongs. For any stationary and ergodicX-valued process{Xn}, let

Yn(k)=Qk(Xn)

denote the process obtained by quantizing at levelk.

Lemma 1. A stationary and ergodicX-valued process{Xn}is independent if and only if the correspondingAk-valued processYn(k)is independent for eachk.

Proof. If the{Xn}is independent then as functions of theXn’s so are all theYn(k). In the opposite direction assume that for allktheYn(k)’s are independent. Denote byAthe union of the finite algebras

k=1Aˆk and byAZthe productσ- algebra generated byA. On thisσ-algebra the probability measure defined by the{Xn}process is a product measure.

Since the algebra

k=1Aˆk generates the fullσ-algebraFand since measures that agree on an algebra agree also on theσ-algebra it generates this implies that the probability measure defined by the{Xn}process is a product measure

onFZas was to be shown.

LetY be a finite set. A sequence of functionsgndefined onYn+1is a consistent test for independence if for any Y-valued stationary and ergodic process{Yn}, eventually almost surely

gn(Y0, . . . , Yn)=IND if the process is independent, DEP otherwise.

Define

En=supP

gn(Y0, . . . , Yn)=DEP ,

where the sup is over allY-valued independent and identically distributed processes.

(3)

If in addition n=0

En<

then we will say thatgnis astrongly consistenttest. In the next section we will show how to construct such tests.

Now letYk =Ak as above. We will show how to use any family of strongly consistent tests{gn(k)}to devise a consistent test for the class of allX-valued processes. For an increasing sequenceb= {bn}such thatbn→ ∞define

TESTbn X0n

=

IND ifgn(k)

Y0(k), . . . , Yn(k)

=INDfor all 1≤kbn, DEP otherwise.

Theorem 1. Let{gn(k)}be a family of strongly consistent tests and definemkbe so large that

k=1

n=mkEn(k)<∞ (e.g.

n=mkEn(k)<2k).Fornm1letbn=max{1≤k: mlnfor all1≤lk}.Then TESTbn(Xn0)is consistent for all stationary and ergodicX-valued processes{Xn}.

Proof. If theX-valued process{Xn}is dependent then by Lemma1for somek,g(k)n (Y0(k), . . . , Yn(k))=DEPeven- tually almost surely and so TESTbn(X0n)=DEPeventually almost surely. Now assume that{Xn}is an independent X-valued process. Then by Lemma1, for eachk,{Yn(k)}is independent and so

n=m1

bn

k=1

En(k)=

k=1

n=mk

En(k)<.

Thus n=m1

P TESTbn

Xn0

=DEP

= n=m1

P g(k)n

Y0(k), . . . , Yn(k)

=DEPfor some 1≤kbn

n=m1

bn

k=1

P gn(k)

Y0(k), . . . , Yn(k)

=DEP

n=m1

bn

k=1

En(k)<.

By the Borel–Cantelli Lemma,g(k)n (Y0(k), . . . , Yn(k))=INDfor all 1≤kbneventually almost surely. The proof of

Theorem1is complete.

3. Construction of strongly consistent tests

In this section we will show how to construct a strongly consistent test for independence for the class of all processes taking values in a fixed finite set Y. Assume {Yn}is a stationary and ergodic process taking values in Y. First let us define a number which will measure the degree of independence of the process. It will be zero if and only if the process is independent.

For convenience let p(y0k+1) andp(y|y0k+1)denote the distribution P (Y0k+1=y0k+1)and the conditional distributionP (Y1=y|Y0k+1=y0k+1), respectively.

(4)

Define Γ = sup

1k< sup

{z0k+1∈Yk,y∈Y:p(z0k+1,y)>0}

p(y)p

y|z0k+1 .

We proceed to define an empirical version of this based on the observation of a finite data segmentY0n. To this end first define the empirical version ofp(y)andp(y|w0k+1)as

ˆ

pn(y)=#{0≤tn−1:Yt+1=y} n

and ˆ pn

y|w0k+1

=#{k−1≤tn−1:Ytt+k1+1=(w0k+1, y)}

#{k−1≤tn−1:Yttk+1=w0k+1} .

These empirical distributions, as well as the sets we are about to introduce are functions ofY0n, but we suppress the dependence to keep the notation manageable.

For a fixed 0< γ <1 letLnk denote the set of strings with lengthkwhich appear more thann1γ times inY0n. That is,

Lnk=

y0k+1Yk: #

k−1≤tn−1:Yttk+1=y0k+1

> n1γ . Finally, define the empirical version ofΓ as follows:

Γˆn=max

y∈Y max

1k< max

z0−k+1∈Lnk

pˆn(y)− ˆpn

y|z0k+1 .

Let us agree by convention that if the set over which we are maximizing is empty then the maximum is zero.

Observe, that by ergodicity, the ergodic theorem implies that almost surely the empirical distributionspˆ converge to the true distributionspand so

lim inf

n→∞ ΓˆnΓ almost surely.

If the process is dependent then clearlyΓ >0. If the process is independent thenΓ =0 and we will show that not justΓˆn→0 almost surely but it converges to zero at a certain rate.

Let 0< β <12γ be arbitrary. Let gn=

IND ifΓˆnnβ, DEP otherwise.

Note thatgndepends onY0n.

Theorem 2. Let{Yn}be an arbitrary stationary and ergodic process taking values from a finite setY.Assume that 0< γ <1and0< β <12γ.If the process is independent then eventually almost surely,gn=IND and if the process is not independent then eventually almost surely,gn=DEP.Furthermore,if the process is independent and identically distributed then forn >21/(1γ2β)

P (gn=DEP)≤14|Y|n4en+1γ/2 and the right-hand side is summable.

Proof. If the process is not independent then there is a string z0k+1Yk and a letter yY such that p(y)=p(y|z0k+1) and p(z0k+1y) >0. By ergodicity, pˆn(y)p(y) and pˆn(y|z0k+1)p(y|z0k+1) and so lim infn→∞Γˆn>0 almost surely which in turn implies thatgn=DEPeventually almost surely.

(5)

Assume that the process is independent and identically distributed. Then ˆn> nβ

=P

maxy∈Y max

1k< max

(z0k+1,y)∈Lnk

pˆn(y)− ˆpn

y|z0k+1 > nβ

P

max

y∈Y pˆn(y)p(y) > nβ/2 +P

maxy∈Y max

1k<n max

z0k+1∈Lnk

p

y|z0k+1

− ˆpn

y|z0k+1 > nβ/2

P

For someyY: pˆn(y)p(y) > nβ/2 +P

For someyY,1≤k < n, k−1≤ln−1:Yllk+1Lnk, pˆn

y|Yllk+1

p

y|Yllk+1 > nβ/2 .

By the union bound this in turn is:

ˆn> nβ

y∈Y

P pˆn(y)p(y) > nβ/2

+

y∈Y n1

k=1 n1

l=k1

P

Yllk+1Lnk, pˆn

y|Yllk+1

p

y|Yllk+1 > nβ/2 .

By Hoeffding’s inequality (cf. Theorem 2 in [4]) for sums of bounded independent random variables, for a given yY,

P pˆn(y)p(y) > nβ/2

≤2e0.5n−2βn. Summing overywe get that

y∈Y

P pˆn(y)p(y) > nβ/2

≤2|Y|e0.5nn.

Now for a given 0≤ln−1, 1≤kn−1 andyY we will give an upper bound on the probability P

Yllk+1Lnk, pˆn

y|Yllk+1

p

y|Yllk+1 > nβ/2 .

Letl+θ+(l, j, i)andlθ(l, j, i)denote the position of theith occurrence of the patternYllj (with lengthj and positionl) going in positive and negative directions respectively. Formally, setθ+(l, j,0)=0,θ(l, j,0)=0 and define

θ+(l, j, i)=θ+(l, j, i−1)+min

t >0: Yll++θθ++(l,j,i(l,j,i1)1)+jt+1+t=Yll++θθ++(l,j,i(l,j,i1)1)j+1 and

θ(l, j, i)=θ(l, j, i−1)+min

t >0: Yllθθ(l,j,i(l,j,i1)1)jt+1t=Yllθθ(l,j,i(l,j,i1)1)j+1 .

For a given 0≤ln−1, 1≤kn−1,yY,r≥0 ands≥0, by Hoeffding’s inequality (cf. Theorem 2 in [4]) for sums of bounded independent random variables,

P

r

h=11{Ylθ−(l,k,h)+1=y}+s

h=01{Yl+θ+(l,k,h)+1=y}

r+s+1 −p

y|Yllk+1 ≥0.5nβ Yllk+1=y0k+1

≤2e0.5n(r+s+1).

(6)

Multiplying both sides byP (Yllk+1=y0k+1)and summing over all possibley0k+1Ykwe get that

P

Yllk+1Lnk,

r

h=11{Yl−θ−(l,k,h)+1=y}+s

h=01{Yl+θ+(l,k,h)+1=y}

r+s+1 −p

y|Yllk+1 >0.5nβ

P

r

h=11{Ylθ−(l,k,h)+1=y}+s

h=01{Yl+θ+(l,k,h)+1=0}

r+s+1 −p

y|Yllk+1 >0.5nβ

≤2e0.5n(r+s+1).

Summing over all 0≤ln−1 and over all pairs(r, s)such thatr≥0,s≥0,r+s+1≥ n1γ we get that

n1

l=0

P

Yllk+1Lnk, pˆn

y|Yllk+1

p

y|Yllk+1 > nβ/2

n h=n1γ

h2e0.5nh.

Thus

y∈Y n1

k=0 n1

l=k1

P

Yllk+1Lnk, pˆn

y|Yllk+1

p

y|Yllk+1 > nβ/2

≤2n2|Y|

h=n1−γ

henh/2.

Now we give an upper bound on the sum on the right-hand side. Observe thathen−2βh/2 is monotone decreasing inh as soon as the derivative enh/2h0.5nhenh/2 is negative forh > n1γ which is the case forn >

21/(1γ2β). Using this fact, we bound the sum by the integral

h=n1γ

henh/2

n1γ

henh/2dh.

Integrating by parts we get that

n1γ

hen−2βh/2dh=

h −1

n/2en−2βh/2

n1−γ

n1γ

−1

n/2en−2βh/2dh

= n1γ

n/2en−2βn1−γ/2− 1

(n/2)2en−2βh/2

n1−γ

= n1γ

n/2en1−γ−2β/2+ 1

(n/2)2en1−γ−2β/2

=

2n1γ++4n

en1γ/2

2n2+4n2

en1γ/2 since by assumption 0< γ <1 and 0< β <12γ.

Combining all these we get that forn >21/(1γ2β) P (gn=DEP)=ˆn> nβ

≤2|Y|e0.5n1+12n4|Y|en1−γ−/2

≤14n4|Y|en1γ/2.

The right-hand side is summable provided 2β+γ <1 and the Borel–Cantelli Lemma yields that ˆnnβ eventually

=1

(7)

and sogn=INDeventually almost surely. The proof of Theorem2is complete.

This test can be used as the input needed for Theorem 1. Now let(X,F)be a countably generated measurable space. As is well known any countably generated measurable space has a refining sequence of finite partitionsAk

such that

k=1Aˆk generates the whole σ-algebraF on X whereAˆk denotes the algebra which is generated by partitionAk. Letg(k)n be the test constructed above for partitionAk. Then choosingbso as to satisfy the condition in Theorem1we have the following corollary.

Corollary 1. For all stationary and ergodicX-valued process{Xn}eventually almost surely,TESTbn(X0n)=IND if the process is independent and eventually almost surely TESTbn(X0n)=DEP if the process is not independent.

References

[1] D. H. Bailey. Sequential schemes for classifying and predicting ergodic processes. Ph.D. thesis, Stanford Univ., 1976.MR2626644 [2] A. Berger. On uniformly consistent tests.Ann. Math. Statist.22(1951) 289–293.MR0042653

[3] A. Dembo and Y. Peres. A topological criterion for hypothesis testing.Ann. Statist.22(1994) 106–117.MR1272078

[4] W. Hoeffding. Probability inequalities for sums of bounded random variables.J. Amer. Statist. Assoc.58(1963) 13–30.MR0144363 [5] W. Hoeffding and J. Wolfowitz. Distinguishability of sets of distributions.Ann. Math. Statist.29(1958) 700–718.MR0095555

[6] Ch. Kraft. Some conditions for consistency and uniform consistency of statistical procedures.Univ. California Publ. Statist.2(1955) 125–141.

MR0073896

[7] G. Morvai and B. Weiss. Order estimation of Markov chains.IEEE Trans. Inform. Theory51(2005) 1496–1497.MR2241507 [8] G. Morvai and B. Weiss. On classifying processes.Bernoulli11(2005) 523–532.MR2146893

[9] G. Morvai and B. Weiss. Estimating the lengths of memory words. IEEE Transactions on Information Theory54(2008) 3804–3807.

MR2451043

[10] A. Nobel. Hypothesis testing for families of ergodic processes.Bernoulli12(2006) 251–269.MR2218555 [11] D. Ornstein and B. Weiss. How sampling reveals a process.Ann. Probab.18(1990) 905–930.MR1062052

[12] B. Weiss. Some remarks on filtering and prediction of stationary processes.Israel J. Math.149(2005) 345–360.MR2191220

Références

Documents relatifs

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des

The remaining seven algorithms are more sophis- ticated in that they use two different regimens: a normal regimen for failure-free segments, with a large checkpointing period; and

For the same notion of consistency, [10] gives sufficient conditions on two families H 0 and H 1 that consist of stationary ergodic real-valued processes, under which a

By a strong Markov process we mean one with stationary tran- sition probabilities, taking values in a locally compact separable metric space, and having almost all paths

Despite the lack or total absence of the optical counterpart, the observations of dark GRBs in this domain may also give useful information about the nature of these events,

• A novel method to detect failure cascades, based on the quantile distribution of consecutive IAT pairs (Section III-B); • The design and comparison of several cascade-aware

Asymmetrically consistent tests for some specific hypotheses, but under the general alternative of stationary ergodic processes, have been proposed in [9, 10, 15, 16], which

API with a non-stationary policy of growing period Following our findings on non-stationary policies AVI, we consider the following variation of API, where at each iteration, instead