INVARIANCE FOR THE TAIL OF HYBRIDS OF EMPIRICAL AND PARTIAL-SUM PROCESSES
SERGIO ALVAREZ-ANDRADE
We consider the hybrids of empirical and partial-sum processes given by A*_n(t) = Σ_{1≤i≤n} H(X_i)1{X_i≤t} ε_i, −∞ < t < ∞, under suitable conditions on the sequences of random variables {X_i}_{i≥1} and {ε_i}_{i≥1} given below. Under moment conditions on the sequence {ε_i}_{i≥1}, we establish an upper bound in a strong approximation result for the process A*_n(a(n)t), where a(n) is a sequence of positive constants such that a(n) = O(log^{2γ} n/n), γ ∈ (0, 1/4).
AMS 2000 Subject Classification: 60F15, 60F17.
Key words: hybrids of empirical and partial-sum processes, tail empirical process.
1. INTRODUCTION
Let U_1, U_2, . . . be a sequence of independent random variables with a uniform distribution on (0,1). Denote by F̃_n(s) = n^{−1} Σ_{i=1}^n 1{U_i≤s}, 0 ≤ s ≤ 1, the empirical distribution function based on the first n of these random variables, where 1{x≤y} denotes the indicator function. Let α_n(s) = n^{1/2}{F̃_n(s) − s}, 0 ≤ s ≤ 1, be the uniform empirical process. For each integer n ≥ 1, the tail empirical process is defined to be a(n)^{−1/2} α_n(a(n)s), 0 ≤ s ≤ 1, where a(n) is a sequence of positive constants such that a(n) → 0 and na(n) → ∞ as n → ∞. The tail empirical process has been considered many times in the literature. In particular, Mason [14] established a strong invariance theorem for the tail empirical process; Csörgő and Mason [2] used tail empirical process results to study intermediate and extreme-sum processes (cf. for instance pp. 62–63); Einmahl [9] studied the almost sure behavior of the weighted tail empirical process, with application to the construction of asymptotic confidence bands for intermediate quantiles from an arbitrary continuous distribution function; Deheuvels [5] established Chung-type functional laws of the iterated logarithm for tail empirical processes, while Deheuvels and Mason [4] showed that the tail empirical process is almost surely relatively compact in an appropriate topological space for a suitable sequence a(n).
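For readers who want to experiment with these objects, the empirical distribution function and the rescaled tail empirical process a(n)^{−1/2}α_n(a(n)s) are straightforward to compute. The sketch below is only an illustration (it is not part of the paper); the sample values and the choice of a(n) are arbitrary.

```python
def empirical_df(sample, s):
    """F~_n(s) = n^{-1} #{i : U_i <= s}, the uniform empirical d.f."""
    return sum(1 for u in sample if u <= s) / len(sample)

def tail_empirical_process(sample, a_n, s):
    """a(n)^{-1/2} alpha_n(a(n)s), with alpha_n(s) = n^{1/2}(F~_n(s) - s)."""
    n = len(sample)
    alpha = n ** 0.5 * (empirical_df(sample, a_n * s) - a_n * s)
    return alpha / a_n ** 0.5

# Deterministic check on a tiny fixed "sample": F~_3(0.5) = 2/3.
u = [0.1, 0.4, 0.7]
print(empirical_df(u, 0.5))  # 2/3 = 0.6666...
```

In practice one would draw `u` from a uniform(0,1) generator and let a(n) → 0 with na(n) → ∞, as required above.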
Our aim is to establish the upper bound in a strong approximation for the tail of hybrids of empirical and partial-sum processes defined below by (1).
REV. ROUMAINE MATH. PURES APPL., 52 (2007), 3, 305–313
In Section 2 we begin by recalling a strong approximation result given in [11], and in Remark 2.1 we recall that it is equivalent to consider a version of A*_n(a(n)t) for t in (0,1). In Section 3 we establish our main results. Finally, Section 4 is devoted to the proofs of the main results.
Denote by

(1) A*_n(t) = Σ_{1≤i≤n} H(X_i) 1{X_i≤t} ε_i, −∞ < t < ∞,

the hybrids of empirical and partial-sum processes, with the following regularity conditions on the sequences of random variables {X_i}_{i≥1} and {ε_i}_{i≥1}:
(A) The sequences of real random variables {X_i}_{i≥1} and {ε_i}_{i≥1} are independent.
(B) {X_i, 1 ≤ i < ∞} are independent identically distributed random variables with common distribution function F.
(C) {ε_i, 1 ≤ i < ∞} are independent identically distributed random variables with E[ε_i] = µ and E[ε_i²] = 1. Without loss of generality, we take µ = 1 when µ ≠ 0.
(D) The function H is positive and of bounded variation on the real line.
(E) ε_1 has a finite moment generating function in a neighborhood of 0.
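As a concrete (purely illustrative, not from the paper) instance of definition (1): the sketch below evaluates A*_n(t) for fixed inputs, with H(x) = 1/(1 + x²), which is positive and of bounded variation as condition (D) requires; in a simulation the ε_i would be drawn i.i.d. with the moments of condition (C).

```python
def hybrid_process(x, eps, H, t):
    """A*_n(t) = sum_{1<=i<=n} H(X_i) 1{X_i <= t} eps_i   (definition (1))."""
    return sum(H(xi) * ei for xi, ei in zip(x, eps) if xi <= t)

# Illustrative choices only: H positive and of bounded variation on R.
H = lambda v: 1.0 / (1.0 + v * v)

x = [-1.0, 0.5, 2.0]      # fixed stand-ins for the X_i
eps = [1.0, -1.0, 2.0]    # fixed stand-ins for the eps_i
a = hybrid_process(x, eps, H, 1.0)   # H(-1)*1 - H(0.5)*1 = 0.5 - 0.8 = -0.3
print(a)
```

Only the indices with X_i ≤ t contribute, so X_3 = 2 is excluded at t = 1.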
Diebolt [6] (see also [8]) and Diebolt and Laib [7] showed that n^{−1/2}A*_n(t) (with µ = 0 in condition (C)) converges weakly to a time-transformed Wiener process (Brownian motion) and obtained upper bounds for the rate of convergence.
The time transformation for the limiting Wiener process given in [6] is

(2) G_n(t) = ∫_{−∞}^t H²(s) dF_n(s),

where F_n(t) = n^{−1} Σ_{1≤i≤n} 1{X_i≤t}, −∞ < t < ∞, is the empirical distribution function of X_1, X_2, . . .. Later on, Horváth [11] showed that the random time change G_n(t) can be replaced by a non-random time change G(t) defined as

(3) G(t) = ∫_{−∞}^t H²(s) dF(s),

where F(x) is the common cumulative distribution function of the X_i, without affecting the rates of approximation given in [6]. The almost sure approximation of the two-time-parameter process {A*_n(t), −∞ < t < ∞, 1 ≤ n < ∞} by a Gaussian process was also stated in [11].
Under some conditions on the sequences of random variables {X_i}_{i≥1} and {ε_i}_{i≥1}, Haeusler and Mason [10] defined the "randomly weighted bootstrap empirical process" (see also [12] for approximations for weighted bootstrap processes and references therein) associated with A*_n(t) (with H ≡ 1) by X_n(t) = Σ_{1≤i≤n} ε_i(1{X_i≤t} − t), 0 ≤ t ≤ 1. They showed that behind this process there is a martingale structure, and established by martingale arguments an O_p approximation of X_n(t) by a standard Brownian bridge.
We can and shall assume, without loss of generality, that all the random variables and processes introduced so far and later on in this paper are defined on the same probability space (cf. Appendix 2 in [1]).
In the following, we set log_1 u = log⁺ u = log(u ∨ e) and log_p u = log⁺(log_{p−1} u) for p ≥ 2.
2. SOME USEFUL RESULTS
In this section we recall some useful results concerning the hybrids of empirical and partial-sum processes.
Replacing condition (E) by the weaker condition

(F) E[|ε_1|^p] < ∞ for p ≥ 5,

Horváth [11] (Theorem 2.2 (ii)) proved the following result.
Theorem 1. Under conditions (A), (B), (C, with µ = 0), (D) and (F), we can define a two-parameter Wiener process {Γ(x, y), 0 ≤ x, y < ∞} such that, with probability one,

sup_{−∞<t<∞} |A*_n(t) − Γ(G(t), n)| = O(n^{1/2−δ(p)}(log n)^{1/2}),

where δ(p) = (p − 2)/(4(p + 1)).
Recall that the two-parameter Wiener process {Γ(x, y), 0 ≤ x, y < ∞} is a Gaussian process (also called Brownian sheet) with E[Γ(x, y)] = 0 and E[Γ(x, y)Γ(x′, y′)] = min(x, x′) min(y, y′) (see [3]).
Remark 2.1. By [11], p. 5, without any loss of generality, there are independent identically distributed random variables {Y_i, 1 ≤ i < ∞}, uniform on [0,1], such that X_i = Q(Y_i), with Q(y) = inf{x : F(x) ≥ y}, i.e., the quantile function of F. Then we can consider

(4) A_n(t) = Σ_{1≤i≤n} V(Y_i) 1{Y_i≤t} ε_i, 0 ≤ t ≤ 1,

in place of A*_n(t), since we have

(5) A*_n(t) = Σ_{1≤i≤n} H(Q(Y_i)) 1{Q(Y_i)≤t} ε_i = Σ_{1≤i≤n} V(Y_i) 1{Y_i≤F(t)} ε_i,

where V(t) = H(Q(t)). By (D) we can assume without any loss of generality that

sup_{t∈[0,1]} |V(t)| < 1.
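The identity underlying (5) is 1{Q(Y) ≤ t} = 1{Y ≤ F(t)} for a continuous strictly increasing F. The sketch below (illustrative only; the exponential F, the function H and all sample values are choices made here, not taken from the paper) checks the equality of the two sums in (5) on fixed inputs.

```python
import math

# Illustrative choice: F exponential, so F and its quantile function
# Q(y) = inf{x : F(x) >= y} are both explicit.
F = lambda x: 1.0 - math.exp(-x) if x >= 0 else 0.0
Q = lambda y: -math.log(1.0 - y)

H = lambda x: 1.0 / (1.0 + x)   # positive, of bounded variation on [0, inf)
V = lambda y: H(Q(y))           # V = H o Q, as in Remark 2.1

y_sample = [0.2, 0.5, 0.9]      # fixed stand-ins for the uniform Y_i
eps = [1.0, -1.0, 2.0]          # fixed stand-ins for the eps_i
t = 1.0

# The two expressions for A*_n(t) in (5):
lhs = sum(H(Q(y)) * (Q(y) <= t) * e for y, e in zip(y_sample, eps))
rhs = sum(V(y) * (y <= F(t)) * e for y, e in zip(y_sample, eps))
print(abs(lhs - rhs))  # the two sums in (5) agree
```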
Remark 2.2. By Remark 2.1, the time change G(t), t ∈ ℝ, must be replaced by J(t) = ∫_0^t V²(s) ds for t ∈ [0,1].
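Remark 2.2 amounts to a change of variables: assuming F is continuous and reading the time change (3) as G(t) = ∫_{−∞}^t H²(s) dF(s), the substitution u = F(s), s = Q(u), gives

```latex
G(t) = \int_{-\infty}^{t} H^{2}(s)\,\mathrm{d}F(s)
     = \int_{0}^{F(t)} H^{2}(Q(u))\,\mathrm{d}u
     = \int_{0}^{F(t)} V^{2}(u)\,\mathrm{d}u
     = J(F(t)),
```

so on the uniform scale of Remark 2.1 the deterministic time change is exactly J.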
3. MAIN RESULTS
In this section our main aim is to give the upper bound in the strong approximation for the process A_n(a(n)t) and to establish the same kind of results for some associated processes. We now state our results.
Proposition 1. Under the same conditions as in Theorem 1, with p ≥ 6 in condition (F), and for a sequence of real numbers a(n) such that a(n) = O(log^{2γ} n/n), γ ∈ (0, 1/4), we have

lim sup_{n→∞} sup_{0≤t≤1} | A_n(a(n)t)/√(2na(n) log_2 n) − Γ(J(a(n)t), n)/√(2na(n) log_2 n) | = 0, a.s.,

where J(t) = ∫_0^t V²(s) ds, 0 ≤ t ≤ 1.
The next result concerns the case where H(x) = 1 for all x, and with µ = 1 in condition (C). Denote by

A¹_n(t) = Σ_{i=1}^n ε_i 1{U_i≤t}, 0 < t ≤ 1,

the corresponding process.
Proposition 2. Under conditions (A), (B), (C, with µ = 1) and (F), we can define a Kiefer process {K(t, y), 0 ≤ t ≤ 1, 0 ≤ y < ∞} and a two-parameter Wiener process {Γ(t, y), 0 ≤ t ≤ 1, 0 ≤ y < ∞} such that for a sequence of real numbers a(n) satisfying a(n) = O(log^{2γ} n/n), γ ∈ (0, 1/4), we have, with probability one,

lim sup_{n→∞} sup_{0≤t≤1} (1/√(2n log_2 n)) | (A¹_n(a(n)t) − na(n)t)/√(a(n)) − (Γ(a(n)t, n) + K(a(n)t, n))/√(a(n)) | = 0.
Recall that the Kiefer process {K(s, t), 0 ≤ s ≤ 1, t ≥ 0} is a continuous two-parameter centered Gaussian process indexed by [0,1] × ℝ₊ with covariance function E[K(s_1, t_1)K(s_2, t_2)] = (min(s_1, s_2) − s_1s_2) min(t_1, t_2), 0 ≤ s_1, s_2 ≤ 1, t_1, t_2 ≥ 0 (see, e.g., [3]).
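A standard way to realize a Kiefer process from a Brownian sheet (cf. [3]) is K(s, t) = Γ(s, t) − sΓ(1, t); the covariance displayed above then follows by bilinearity. The sketch below (an illustration added here, not taken from the paper) verifies this covariance computation on a deterministic grid, using only the covariance formula of the sheet.

```python
import itertools

# Covariance of the Brownian sheet: E[Gamma(x,y)Gamma(x',y')] = min(x,x')min(y,y').
def cov_sheet(x1, y1, x2, y2):
    return min(x1, x2) * min(y1, y2)

# Covariance of K(s,t) = Gamma(s,t) - s*Gamma(1,t), expanded by bilinearity.
def cov_kiefer_from_sheet(s1, t1, s2, t2):
    return (cov_sheet(s1, t1, s2, t2)
            - s2 * cov_sheet(s1, t1, 1.0, t2)
            - s1 * cov_sheet(1.0, t1, s2, t2)
            + s1 * s2 * cov_sheet(1.0, t1, 1.0, t2))

# Target covariance from the text: (min(s1,s2) - s1*s2) * min(t1,t2).
def cov_kiefer(s1, t1, s2, t2):
    return (min(s1, s2) - s1 * s2) * min(t1, t2)

grid = [0.0, 0.25, 0.5, 0.75, 1.0]
times = [0.5, 1.0, 2.0]
ok = all(
    abs(cov_kiefer_from_sheet(s1, t1, s2, t2) - cov_kiefer(s1, t1, s2, t2)) < 1e-12
    for s1, s2 in itertools.product(grid, grid)
    for t1, t2 in itertools.product(times, times)
)
print(ok)  # True
```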
An immediate consequence of Proposition 2 is the following result about the modified empirical process defined as

α_{n,c}(t) = √n (A¹_n(t)/n − t).
Corollary 1. Under conditions (A), (B), (C, with µ = 1) and (F), we can define a sequence of Wiener processes {W_n(t), 0 ≤ t ≤ 1} and a sequence of Brownian bridges {B_n(t), 0 ≤ t ≤ 1} such that for a sequence of real numbers a(n) such that a(n) = O(log^{2γ} n/n), γ ∈ (0, 1/4), we have

lim sup_{n→∞} sup_{0≤t≤1} (1/√(2 log_2 n)) | α_{n,c}(a(n)t)/√(a(n)) − σW_n(a(n)t)/√(a(n)) − B_n(a(n)t)/√(a(n)) | = 0, a.s.
4. PROOFS
The proof of Proposition 1 will be based on the lemmas below (see the proof of Lemma 4 in [14]).
Lemma 1. Under the conditions of Proposition 1, for any integer k ≥ 1 and for t_k = a(n_k)t, 0 ≤ t ≤ 1, we have

(6) lim_{k→∞} sup_{0≤t≤1} | A_{n_k}(t_k)/√(2n_k a(n_k) log_2 n_k) − Γ(J(t_k), n_k)/√(2n_k a(n_k) log_2 n_k) | = 0, a.s.
This lemma is a direct consequence of Theorem 1.
Lemma 2. Under the conditions of Proposition 1, for any integer k ≥ 1, t_k = a(n_k)t, 0 ≤ t ≤ 1, and n_k = λ^k with λ > 1, we have

(7) lim sup_{k→∞} max_{n_k<n≤n_{k+1}} sup_{0≤t≤1} | Σ_{i=1}^{n−n_k} V(Y_{n_k+i}) 1{Y_{n_k+i}≤t_k} ε_{n_k+i} | / √(2n_k a(n_k) log_2 n_k) ≤ (λ − 1)^{1/2}, a.s.
Proof. Let us introduce the process

T(t, i, j) = Σ_{j<l≤i} V(Y_l) 1{Y_l≤t} ε_l.

Then, for 0 = n_0 < n_1 < n_2 < · · ·, the processes {T(t, n_k, n_{k−1}), 0 ≤ t ≤ 1} are independent (see [11]). Next, T(t, n_k, 0) = A_{n_k}(t) = Σ_{i=1}^k T(t, n_i, n_{i−1}). Remark that A_n(t) = A_{n_k}(t) + T(t, n, n_k) for n_k < n ≤ n_{k+1}. Consider
T(t_k, n, n_k) = Σ_{i=1}^{n−n_k} V(Y_{n_k+i}) 1{Y_{n_k+i}≤t_k} ε_{n_k+i}, for n_k < n ≤ n_{k+1}. Note that
• E[T(t, n, n_k)] = 0,
• Var(T(t, n, n_k)) = (n − n_k)J(t),
• E[|T(t, n, n_k)|^p] = (n − n_k)E[|ε_i|^p] ∫_0^t V^p(s) ds.
Then, by using the Fuk–Nagaev inequality (see [15], p. 78), for positive constants C_p and C′_p depending on p we have

P(T(t_k, n_{k+1}, n_k) > x) ≤ C_p(n − n_k)E[|ε_i|^p] ∫_0^{t_k} V^p(s) ds · x^{−p} + exp(−C′_p x²((n − n_k)J(t_k))^{−1}).
Hence

(8) P( T(t_k, n_{k+1}, n_k)/√(2n_k a(n_k) log_2 n_k) > (1 + ε)√(λ − 1) ) ≤
≤ C_p(n − n_k)E[|ε_i|^p] ∫_0^{a(n_k)t} V^p(s) ds / ((1 + ε)^p(λ − 1)^{p/2}(2n_k a(n_k) log_2 n_k)^{p/2}) +
+ exp( −(1 + ε)²(λ − 1)C′_p n_k a(n_k) log_2 n_k / ((n − n_k)a(n_k)) ),

and note that for n_k = λ^k with λ > 1, the order of the right-hand side is O(1/λ^{kβ}), β > 1.
The same result holds when we replace T(t_k, n_{k+1}, n_k) in (8) by sup_{n_k<n≤n_{k+1}} T(t_k, n, n_k) (cf. p. 79 of [15]). By the Borel–Cantelli lemma, (7) now follows.
Lemma 3. Under the conditions of Proposition 1, for any integer k ≥ 1 and for a Brownian sheet Γ(J(t_k), n) defined as in Theorem 1, we have

(9) lim sup_{k→∞} max_{n_k<n≤n_{k+1}} sup_{0≤t≤1} |Γ(J(t_k), n) − Γ(J(t_k), n_k)| / √(2n_k a(n_k) log_2 n_k) ≤ (λ − 1)^{1/2}, a.s.
Proof. By (2.1) of [11] and (1.11) of [3], for n_k < n ≤ n_{k+1} we can define

S_{n−n_k}(J(t_k)) = Σ_{i=1}^{n−n_k} W_{i+n_k}(J(t_k)), t ∈ [0,1],

where the W_m(t) are independent standard Brownian motions such that Γ(J(t_k), n) − Γ(J(t_k), n_k) = S_{n−n_k}(J(t_k)).
For m_{k+1} = n_{k+1} − n_k we have

P( max_{n_k<n≤n_{k+1}} sup_{0≤t≤1} |Γ(J(t_k), n) − Γ(J(t_k), n_k)| > x ) ≤ P( sup_{0≤t≤1} S_{m_{k+1}}(J(t_k)) > x ).

Replacing x by (2(1 + ε)(λ − 1)n_k a(n_k) log_2 n_k)^{1/2}, we obtain

P( sup_{0≤t≤1} S_{m_{k+1}}(J(t_k)) > x ) ≤ 2P( sup_{0≤t≤1} |W(J(t_k))| > (2(1 + ε)(λ − 1)n_k a(n_k) log_2 n_k/m_{k+1})^{1/2} ),

which for all large enough k is less than

8P( W(1) > (2(1 + ε/2) log_2 n_k)^{1/2} ) < exp{−(1 + ε/2) log_2 n_k}.

The preceding inequalities can be obtained as in the proof of Claim 2, p. 499, in [14]. The last inequality and the Borel–Cantelli lemma complete the proof of (9).
Proof of Proposition 1. Put

∆_k = max_{n_k<n≤n_{k+1}} sup_{0≤t≤1} T(t_k, n, n_k)

and

∆*_k = max_{n_k<n≤n_{k+1}} sup_{0≤t≤1} |Γ(J(t_k), n) − Γ(J(t_k), n_k)|.

Choose ε > 0 and 1 < λ < ∞ such that λ − 1 < ε. For any integers k ≥ 1 and n_k < n ≤ n_{k+1} we have

sup_{0≤t≤1} | A_n(a(n)t)/√(2na(n) log_2 n) − Γ(J(a(n)t), n)/√(2na(n) log_2 n) | ≤
≤ sup_{0≤t≤1} | A_{n_k}(a(n_k)t)/√(2n_k a(n_k) log_2 n_k) − Γ(J(a(n_k)t), n_k)/√(2n_k a(n_k) log_2 n_k) | +
+ max_{n_k<n≤n_{k+1}} ∆_k/√(2na(n) log_2 n) + ∆*_k/√(2n_k a(n_k) log_2 n_k).

The stated result follows from Lemmas 1, 2 and 3.
Proof of Proposition 2. Remark that the existence of the two-parameter centered Gaussian process {K(s, t), 0 ≤ s ≤ 1, t ≥ 0} (respectively, {Γ(s, t), 0 ≤ s ≤ 1, t ≥ 0}) is obtained by using Theorem 4.4.3 of [3] (respectively, Theorem 1 above). Moreover, by [13], p. 28, we have

(10) A¹_n(t) − nt = Σ_{i=1}^n ε_i 1{U_i≤t} − nt = Σ_{i=1}^n (ε_i − 1)1{U_i≤t} + √n α_n(t), 0 ≤ t ≤ 1.

By (10), it is sufficient to evaluate
sup_{0≤t≤1} (1/√(2na(n) log_2 n)) | A¹_n(t) − nt − K(t, n) − Γ(t, n) | ≤
≤ (1/√(2na(n) log_2 n)) [ sup_{0≤t≤1} | √n α_n(t) − K(t, n) | + sup_{0≤t≤1} | Σ_{i=1}^n (ε_i − 1)1{U_i≤t} − Γ(t, n) | ] = I + II.
The term I on the right-hand side converges to 0 at an o(n^{−(1−ν)/(2−ε)}) rate for all ε > 0 small enough. This rate is obtained by using Theorem 4.4.3 of [3], which gives the existence of a Kiefer process such that

sup_{0≤t≤1} | √n α_n(t) − K(t, n) | = O((log n)²), a.s.
The term II is a special case of Proposition 1 withJ(t) =t, i.e., V ≡1.
So, Proposition 2 follows from the asymptotic behaviours of I and II.
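The algebraic identity (10) can be checked directly on fixed numbers. The sketch below is an illustration added here (the sample values are arbitrary and stand in for the U_i and ε_i; nothing is simulated):

```python
u = [0.1, 0.4, 0.7]       # fixed points of (0,1), standing in for the U_i
eps = [1.5, 0.2, 1.3]     # fixed values standing in for the eps_i (mu = 1)
n = len(u)
t = 0.5

ind = [1.0 if ui <= t else 0.0 for ui in u]   # the indicators 1{U_i <= t}

# Left-hand side of (10): A^1_n(t) - n t.
lhs = sum(e * i for e, i in zip(eps, ind)) - n * t

# Right-hand side: sum (eps_i - 1) 1{U_i <= t} + sqrt(n) alpha_n(t),
# where sqrt(n) alpha_n(t) = n (F~_n(t) - t) = sum 1{U_i <= t} - n t.
rhs = sum((e - 1.0) * i for e, i in zip(eps, ind)) + (sum(ind) - n * t)

print(abs(lhs - rhs) < 1e-12)  # True: the two sides of (10) agree
```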
Proof of Corollary 1. By Proposition 2 we have to estimate

(1/√(2 log_2 n)) | α_{n,c}(t)/√(a(n)) − σΓ(a(n)t, n)/√(na(n)) − K(a(n)t, n)/√(na(n)) |.

It is not difficult to see that it is sufficient to use the following results:
(a) K(a(n)t, n)/√(na(n)) = B_n(a(n)t)/√(a(n)), where B_n is a sequence of Brownian bridges (see p. 80 of [3]);
(b) Γ(a(n)t, n)/√(na(n)) = W_n(a(n)t)/√(a(n)), where W_n is a sequence of Brownian motions (see p. 58 of [3]).
Corollary 1 now follows by using (a) and (b), and the same arguments as in the proof of Proposition 2.
REFERENCES
[1] M. Csörgő and L. Horváth, Weighted Approximations in Probability and Statistics. Wiley, Chichester, 1993.
[2] S. Csörgő and D.M. Mason, Intermediate and extreme-sum processes. Stochastic Process. Appl. 40 (1992), 55–67.
[3] M. Csörgő and P. Révész, Strong Approximations in Probability and Statistics. Academic Press, New York, 1981.
[4] P. Deheuvels and D.M. Mason, Nonstandard functional laws of the iterated logarithm for tail empirical and quantile processes. Ann. Probab. 18 (1990), 1693–1722.
[5] P. Deheuvels, Chung-type functional laws of the iterated logarithm for tail empirical processes. Ann. Inst. H. Poincaré Probab. Statist. 36 (2000), 583–616.
[6] J. Diebolt, A nonparametric test for the regression function: asymptotic theory. J. Statist. Plann. Inference 44 (1995), 1–17.
[7] J. Diebolt and N. Laib, Un principe d'invariance faible pour l'étude d'un test non-paramétrique relatif à la fonction de régression. C. R. Acad. Sci. Paris Sér. I Math. 312 (1991), 887–891.
[8] J. Diebolt, Testing the functions defining a nonlinear autoregressive time series. Stochastic Process. Appl. 36 (1990), 85–106.
[9] J.H.J. Einmahl, The a.s. behavior of the weighted empirical process and the LIL for the weighted tail empirical process. Ann. Probab. 20 (1992), 681–695.
[10] E. Haeusler and D.M. Mason, Weighted approximations to continuous time martingales with applications. Scand. J. Statist. 26 (1999), 281–295.
[11] L. Horváth, Approximations for hybrids of empirical and partial sums processes. J. Statist. Plann. Inference 88 (2000), 1–18.
[12] L. Horváth, P. Kokoszka and J. Steinebach, Approximations for weighted bootstrap processes with an application. Statist. Probab. Lett. 48 (2000), 59–70.
[13] M. Maumy, Étude du processus empirique composé. Thèse de Doctorat, Université de Paris VI, 2002.
[14] D.M. Mason, A strong invariance theorem for the tail empirical process. Ann. Inst. H. Poincaré Probab. Statist. 24 (1988), 491–506.
[15] V.V. Petrov, Limit Theorems of Probability Theory. Oxford Studies in Probability 4. Oxford Science Publications, Oxford, 1995.
Received 6 March 2006

Université de Technologie de Compiègne
Laboratoire de Mathématiques Appliquées (L.M.A.C.)
BP 529, 60205 Compiègne Cedex, France
sergio.alvarez@utc.fr