Invariance principles for adaptive self-normalized partial sums processes ∗
Alfredas Ra£kauskas
Vilnius university and Institute of Mathematics and Informatics
Department of Mathematics, Vilnius University Naugarduko 24, Lt-2006 Vilnius
Lithuania
Charles Suquet
Université des Sciences et Technologies de Lille
Statistique et Probabilités F.R.E. CNRS 2222 Bât. M2, U.F.R. de Mathématiques F-59655 Villeneuve d'Ascq Cedex France
Revised version of
Invariance principle for self-normalized sums of symmetric random variables
Abstract
Let ζnse be the adaptive polygonal process of self-normalized partial sumsSk=P
1≤i≤kXiof i.i.d. random variables dened by linear interpo- lation between the points(Vk2/Vn2, Sk/Vn),k≤n, whereVk2=P
i≤kXi2. We investigate the weak Hölder convergence ofζnse to the Brownian mo- tionW. We prove particularly that whenX1 is symmetric,ζnse converges toW in each Hölder space supportingW if and only if X1belongs to the domain of attraction of the normal distribution. This contrasts strongly with Lamperti's FCLT where a moment ofX1of orderp >2is requested for some Hölder weak convergence of the classical partial sums process.
We also present some partial extension to the non symmetric case.
Keywords: functional central limit theorem, domain of attraction, Hölder space, randomization.
AMS classications: 60F05, 60B05, 60G17, 60E10.
∗Preprint. Research supported by a cooperation agreement CNRS/LITHUANIA (4714).
1 Introduction and results
Various partial sums processes can be built from the sumsSn =X1+· · ·+Xnof independent identically distributed mean zero random variables. In this paper we focus attention on what we call the adaptive self-normalized partial sums process, denoted ζnse. We investigate its weak convergence to the Brownian motion, trying to obtain it under the mildest integrability assumptions onX1 and in the strongest topological framework. We basically show that in both respects, ζnse behaves better than the classical Donsker-Prohorov partial sum processesξnsr. Self-normalized means here that the classical normalization by
√nis replaced by
Vn= (X12+· · ·+Xn2)1/2.
Adaptive means that the vertices of the corresponding random polygonal line have their abscissas at the random points Vk2/Vn2 (0 ≤k ≤ n) instead of the deterministic equispaced pointsk/n. By this construction the slope of each line adapts itself to the value of the corresponding random variable.
As a lot of dierent partial sums processes will appear throughout the paper, we need to explain our typographical conventions and x notations.
Byζn(respectivelyξn) we denote the random polygonal partial sums process dened on[0,1]by linear interpolation between the vertices (Vk2/Vn2, Sk), k = 0,1, . . . , n(respectively (k/n, Sk),k= 0,1, . . . , n), where
Sk =X1+· · ·+Xk, Vk2=X12+· · ·+Xk2. For the special casek= 0, we putS0= 0,V0= 0.
The upper scriptssr orse mean respectively normalization by square root of nor self-normalization. Hence,
ξnsr= ξn
√n, ξnse= ξn
Vn
, ζnsr = ζn
√n, ζnse= ζn
Vn
.
By convention the random functions ξsen and ζnse are dened to be the null function on the event{Vn= 0}. Finally, the step partial sums processesΞn,Zn, Ξsen etc. are the piecewise constant random càdlàg functions whose jump points are vertices for the polygonal process denoted by the corresponding lowercase Greek letter.
Classical Donsker-Prohorov invariance principle states, that if EX12 = 1, then
ξsrn −→D W, (1)
inC[0,1], where(W(t), t∈[0,1])is a standard Wiener process and−→D denotes convergence in distribution. Since (1) yields the central limit theorem, the niteness of the second moment ofX1therefore is necessary.
Lamperti [14] considered the convergence (1) with respect to a stronger topology. He proved that ifE|X1|p <∞, wherep >2, then (1) takes place in the Hölder spaceHα[0,1], where 0 < α < 1/2−1/p. This result was derived again by Kerkyacharian and Roynette [13] by another method using Ciesielski
[6] analysis of Hölder spaces by triangular functions. Further generalizations were given by Erickson [7], Hamadouche [12], Ra£kauskas and Suquet [19].
Considering a symmetric random variable X1 such that P(X1 ≥ u) = 1/(2up), u ≥ 1, Lamperti [14] noticed that the corresponding sequence (ξsrn) is not tight in Hα[0,1] for α = 1/2−1/p. It is then hopeless in general to look for an invariance principle in Hα[0,1] without some moment assumption beyond the square integrability of X1. Recently, Ra£kauskas and Suquet [19]
proved more precisely that if(ξsrn) satises the invariance principle inHα[0,1]
for some0< α <1/2, then necessarily sup
t>0
tpP(|X1|> t)<∞, (2) for anyp <1/(1/2−α).
Let us see now, how self-normalization and adaptiveness help to improve this situation. Recall that "X1 belongs to the domain of attraction of the normal distribution" ( denoted by X1 ∈ DAN) means that there exists a sequence bn↑ ∞such that
b−1n Sn −→D N(0,1). (3)
According to O'Brien's [16] result: X1∈DAN if and only if Vn−1 max
1≤k≤n|Xk|−→P 0, (4)
where −→P denotes convergence in probability. In the classical framework of C[0,1], we obtain the following improvements of the Donsker Prohorov theorem.
Theorem 1. The convergence
ξsen −→D W (5)
holds in the spaceC[0,1]if and only ifX1∈DAN. Theorem 2. The convergence
ζnse−→D W (6)
holds in the spaceC[0,1]if and only ifX1∈DAN.
Let us remark that the necessity ofX1 ∈DAN in both Theorems 1 and 2 follows from Giné, Götze and Mason [9]. Let us notice also that (5) or (6) both exclude the degenerated caseP(X1= 0) = 1, so that almost surelyVn>0for large enoughn. We have similar results [20] for the step processesΞsen andZsen within the Skorohod spaceD(0,1).
For a modulus of continuity ρ : [0,1] → R, denote by Hρ[0,1] the set of continuous functionsx: [0,1]→Rsuch thatωρ(x,1)<∞,where
ωρ(x, δ) := sup
t,s∈[0,1], 0<|t−s|<δ
|x(t)−x(s)|
ρ(|s−t|) .
The setHρ[0,1]is a Banach space when endowed with the norm kxkρ:=|x(0)|+ωρ(x,1).
Dene
Hρo[0,1] ={x∈Hρ[0,1] : lim
δ→0ωρ(x, δ) = 0}.
Then Hρo[0,1] is a closed separable subspace of Hρ[0,1]. In what follows we assume that the functionρ satises technical conditions (12)to (16)(see Sec- tion 2). These assumptions are fullled particularly whenρ=ρα,β,0< α <1, β∈R, dened by:
ρα,β(h) :=hαlnβ(c/h), 0< h≤1,
for a suitable constant c. We write Hα,β and Hα,βo for Hρ[0,1] and Hρo[0,1], respectively, whenρ=ρα,β and we abbreviateHα,0 inHα.
With respect to this Hölder scale Hα,β, we obtain an optimal result when X1 is symmetric.
Theorem 3. Assume thatρsatises conditions(12)to(16)and
j→∞lim
2jρ2(2−j)
j =∞. (7)
IfX1 is symmetric and X1∈DAN then
ζnse−→D W, (8)
inHρo[0,1].
Corollary 4. IfX1 is symmetric and X1 ∈DAN then (8) holds in the space H1/2,βo , for anyβ >1/2.
It is well known that the Wiener process has a version in the spaceH1/2,1/2 but none in H1/2,1/2o . Hence Corollary 4 gives the best result possible in the scale of the separable Hölder spacesHα,β. In [19] it is proved that if the classical partial sums processξsrn converges inH1/2,βo for someβ >1/2, then kX1kψ
γ <
∞, where kX1kψ
γ is the Orlicz norm related to the Young function ψγ(r) = exp(rγ)−1withγ= 1/β. This shows the striking improvement of weak Hölder convergence due to self-normalization and adaptation.
It seems worth noticing here, that without adaptive construction of the polygonal process, the existence of moments of order bigger than 2 is neces- sary for Hölder weak convergence. Indeed, ifξsen −→D W in Hα, then one can prove thatEX12<∞. Thereforeξnsr −→D W in Hα and the moment restriction (2) is necessary.
Naturally it is very desirable to remove the symmetry assumption in Corol- lary 4. Although the problem remains open, we can propose the following partial results in this direction.
Theorem 5. Letβ >1/2 and suppose that we have P°
max
1≤k≤n
Xk2 Vn2 ≥δn±
−−−−→
n→∞ 0 (9)
and
P°
1≤k≤nmax
¬
¬
¬ Vk2 Vn2 −k
n
¬
¬
¬≥δn
±−−−−→
n→∞ 0, (10)
with
δn=c2−(logn)γ
logn , for some 1
2β < γ <1 and somec >0. (11) Then
ζnse−→D W, in H1/2,βo .
Observe thatn−¯=o(δn)for any¯ >0. This mild convergence rateδn may be obtained as soon asE|X1|2+εis nite.
Corollary 6. If for some ε >0, E|X1|2+ε <∞, then for any β > 1/2, ζnse converges weakly toW in the spaceH1/2,βo .
This result contrasts strongly with the extension of Lamperti's invariance principle in the same functional framework [19].
The present contribution is a new illustration of the now well established fact, that in general, self-normalization improves the asymptotic properties of sums of independent random variables.
A rich litterature is devoted to limit theorems for self-normalized sums. Lo- gan, Mallows, Rice and Shepp [15] investigate the various possible limit distri- butions of self-normalized sums. Giné, Götze and Mason [9] prove thatSn/Vn converges to the Gaussian standard distribution if and only if X1 is in the domain of attraction of the normal distribution (the symmetric case was pre- viously treated in Grin and Mason [11]). Egorov [8] investigates the non identically distributed case. Bentkus and Götze [2] obtain the rate of con- vergence of Sn/Vn when X1 ∈ DAN. Grin and Kuelbs [10] prove the LIL for self-normalized sums when X1 ∈ DAN. Moderate deviations (of Linnik's type) are studied in Shao [22] and Christiakov and Götze [4]. Large devia- tions (of Cramér-Cherno type) are investigated in Shao [21] without moment conditions. Chuprunov [5] gives invariance principles for various partial sums processes under self-normalization inC[0,1]orD[0,1]. Our Theorems 1 and 2 improve on Chuprunov's results in the i.i.d. case.
2 Preliminaries
2.1 Analytical background
In this section we collect some facts about the Hölder spacesHρ[0,1]including the tightness criterion for distributions in these spaces. All these facts may be found e.g. in Ra£kauskas and Suquet [18].
In what follows, we assume that the modulus of smoothness ρsatises the following technical conditions wherec1, c2 andc3 are positive constants:
ρ(0) = 0, ρ(δ)>0, 0< δ≤1; (12) ρis non decreasing on[0,1]; (13)
ρ(2δ)≤c1ρ(δ), 0≤δ≤1/2; (14)
Z δ
0
ρ(u)
u du≤c2ρ(δ), 0< δ≤1; (15) δ
Z 1
δ
ρ(u)
u2 du≤c3ρ(δ), 0< δ≤1. (16) For instance, elementary computations show that the functions
ρ(δ) :=δαlnβ c δ
¡, 0< α <1, β∈R,
satisfy conditions (12) to (16), for a suitable choice of the constant c, namely c≥exp(β/α)ifβ >0andc >exp(−β/(1−α))ifβ <0.
WriteDj for the set of dyadic numbers of level j in [0,1], i.e. D0={0,1}
and forj≥1,
Dj={(2k+ 1)2−j; 0≤k <2j−1}.
For any continuous functionx: [0,1]→R, dene λ0,t(x) :=x(t), t∈D0
and forj≥1,
λj,t(x) :=x(t)−1 2
x(t+ 2−j) +x(t−2−j)¡
, t∈Dj.
The λj,t(x) are the coecients of the expansion of x in a series of triangular functions. Thej-th partial sum Ejxof this series is exactly the polygonal line interpolating x between the dyadic points k2−j(0 ≤ k ≤ 2j). Under (12) to (16), the normkxkρ is equivalent to the sequence norm
kxkseqρ := sup
j≥0
1 ρ(2−j)max
t∈Dj
|λj,t(x)|.
In particular both norms are nite if and only ifxbelongs toHρ. It is easy to check that
kx−Ejxkseqρ = sup
i>j
1 ρ(2−i)max
t∈Di
|λi,t(x)|.
Proposition 7. The sequence (Yn) of random elements in Hρo is tight if and only if the following two conditions are satised:
i) For each t∈[0,1], the sequence(Yn(t))n≥1 is tight on R.
ii) For each ε >0,
j→∞lim sup
n≥1
P(kYn−EjYnkseqρ > ε) = 0.
Remark 8. Condition ii) in Proposition 7 may be replaced by
j→∞lim lim sup
n→∞
P(kYn−EjYnkseqρ > ε) = 0. (17)
2.2 Adaptive time and DAN
We establish here the technical results on the adaptive time whenX1 ∈DAN which will be used throughout the paper. These results rely on the common assumption that X1 is in the domain of normal attraction. This provides the following properties on the distribution ofX1. SinceX1∈DAN, there exists a sequencebn ↑ ∞ such thatb−1n Sn converges weakly to N(0,1). Then Raikov's theorem yields
b−2n Vn2−→P 1. (18)
We have moreover for eachτ >0, puting bn=n−1/2`n, nP(|X1|> τ `n
√n) → 0, (19)
`−2n E(X12;|X1| ≤τ `n√
n) → 1, (20)
nE(X1;|X1| ≤τ `n
√n) → 0, (21)
see for instance Araujo and Giné [1], Chap. 2, Cor. 4.8 (a), Th. 6.17 (i) and Cor. 6.18 (b). Here and in all the paper (X;E) means the product of the random variableX by the indicator function of the eventE.
Lemma 9. IfX1∈DAN, then sup
0≤t≤1
¬
¬
¬ V[nt]2
Vn2 −t¬
¬
¬
−→P 0. (22)
Proof. Consider the truncated random variables
Xn,i:=b−1n (Xi;Xi2≤b2n), i= 1, . . . , n.
DeneVn,0:= 0 andVn,k2 =Xn,12 +· · ·+Xn,k2 fork= 1, . . . , n. Set
νn= sup
0≤t≤1
¬
¬
¬ V[nt]2
Vn2 −t
¬
¬
¬ and eνn = sup
0≤t≤1
¬
¬
¬ Vn,[nt]2
Vn,n2 −t
¬
¬
¬. Then we have forλ >0,
P(νn> λ)≤P(νen> λ) +nP(X12> b2n).
Due to (19) the proof of (22) reduces to the proof of eνn
−→P 0. (23)
SinceVn,k2 ≤Vn,n2 fork= 0, . . . , n, the elementary estimate
¬
¬
¬ Vn,k2 Vn,n2 − k
n
¬
¬
² Vn,k2
Vn,n2 |1−Vn,n2 |+
¬
¬
¬Vn,k2 −k n
¬
¬
¬
leads to
eνn ≤ max
0≤k≤n
¬
¬
¬Vn,k2 −k n
¬
¬
¬+|1−Vn,n2 |+ 1
n. (24)
Noting thatVn,n2 =b−2n Vn2Rn with Rn:= 1 Vn2
n
X
i=1
(Xi2;Xi2≤b2n), we clearly haveRn≤1 a.s. and
P(Rn<1) =P( max
1≤i≤n|Xi|> bn)≤nP(|X1|> bn), which goes to zero by (19). This together with (18) gives
Vn,n2 −→P 1. (25)
Hence the proof of (23) reduces to
0≤k≤nmax
¬
¬
¬Vn,k2 −k n
¬
¬
¬
−→P 0. (26)
For this convergence we have
0≤k≤nmax |Vn,k2 −k/n| ≤ max
0≤k≤n|Vn,k2 −EVn,k2 |+ max
0≤k≤n|EVn,k2 −k/n|.
Noting that
EVn,k2 − k n = k
n
°nb−2n E(X12;X12≤b2n)−1± gives
0≤k≤nmax
¬
¬
¬EVn,k2 −k n
¬
¬
¬≤¬
¬nb−2n E(X12;X12≤b2n)−1¬
¬,
which goes to zero by (20). Hence it remains to prove
0≤k≤nmax |Vn,k2 −EVn,k2 |−→P 0. (27) PuttingTn,k :=Vn,k2 −EVn,k2 , we have by Ottaviani inequality
P
1≤k≤nmax |Tn,k|>2λ¡
≤ P(|Tn,n|> λ)
1−max1≤k≤nP(|Tn,n−Tn,k|> λ). (28)
Due to (25), we are left with the control ofI := max1≤k≤nP(|Tn,k| > λ). By Chebyshev's inequality
I≤λ−2 max
1≤k≤nETn,k2 ≤λ−2nEXn,14
and we have to consider I1 = nEXn,14 = nb−4n E(X14;|X1| ≤ bn). For any 0< τ <1,
E(X14;|X1| ≤bn) ≤ E(X14;|X1| ≤τ bn) +E(X14;τ bn ≤ |X1| ≤bn)
≤ τ2b2nE(X12;|X1| ≤τ bn) +b4nP(|X1| ≥τ bn).
So
I1≤τ2nb−2n E(X12;|X1| ≤τ bn) +nP(|X1| ≥τ bn).
Choosingτ =λ/2 in (19) and (20), we can achieveI≤1/2 fornlarge enough and the proof is complete.
Remark 10. IfX1∈DAN, we also have sup
0≤t≤1
¬
¬
¬ V[nt]+12
Vn2 −t¬
¬
¬
−→P 0. (29)
Indeed, recalling (4), it suces to write V[nt]+12 −V[nt]2
Vn2 =X[nt]+12 Vn2 ≤
² 1
Vn+12 max
1≤k≤n+1Xk2
³Vn+12 Vn2 ,
and observe thatVn+12 /Vn2converges to 1in probability since by Lemma 9,
¬
¬
¬ Vn2
Vn+12 − n n+ 1
¬
¬
² sup
0≤t≤1
¬
¬
¬
V[(n+1)t]2 Vn+12 −t
¬
¬
¬
−→P 0.
Remark 11. For eacht∈[0,1], b2[nt]
b2n →t. (30)
This is a simple by-product of Lemma 9, writing b2[nt]
b2n = Vn2 b2n ×b2[nt]
V[nt]2 ×V[nt]2 Vn2
and noting that for xedt >0andn≥n0 large enough[nt]<[(n+ 1)t]so the sequence(b2[nt]/V[nt]2 )n≥n0 is a subsequence of(b2n/Vn2)n≥n0 which converges in probability to1 by (18).
Dene the random variables
τn(t) = max{k= 0, . . . , n; Vk2≤tVn2}, t∈[0,1], (31)
so that we haveτn(1) =nand for0≤t <1, Vτ2
n(t)
Vn2 ≤t < Vτ2
n(t)+1
Vn2 . (32)
Lemma 12. IfX1∈DAN then sup
t∈[0,1]
|n−1τn(t)−t|−→P 0. (33) Proof. The result will follow from Remark 10, if we check the inclusion of events
n sup
t∈[0,1]
|n−1τn(t)−t|> εo
⊂n sup
u∈[0,1]
¬
¬
¬ V[nu]+12
Vn2 −u¬
¬
¬≥εo
. (34)
The occurrence of the left hand side in (34) is equivalent to the existence of one s∈[0,1]such that|n−1τn(s)−s|> ε, i.e. such that
τn(s)> n(s+ε) (35)
or
τn(s)< n(s−ε). (36)
Observe that under (35), s+ε < 1, while under (36), s−ε > 0. From the denition ofτn, (35) gives an integerk > n(s+ε)such thatVk2/Vn2≤s, whence
V[n(s+ε)]+12
Vn2 ≤s. (37)
On the other hand, under (36), we haveVk2/Vn2> sfor everyk≥n(s−ε)and in particular
V[n(s−ε)]+12
Vn2 > s. (38)
Recasting (37) and (38) under the form V[n(s+ε)]+12
Vn2 −(s+ε), ≤ −ε V[n(s−ε)]+12
Vn2 −(s−ε) > ε,
shows that both (35) and (36) imply the occurrence of the event in the right hand side of (34).
3 Proofs
Proof of Theorem 1. First we prove the convergence of nite dimensional dis- tributions (f.d.d.) of the processξnse to the corresponding f.d.d. of the Wiener processW.
To this aim, consider the process Ξn = (S[nt], t∈[0,1]). By (4) applied to the obvious bound
sup
0≤t≤1
Vn−1|ξn(t)−Ξn(t)| ≤Vn−1 max
1≤k≤n|Xk|, the convergence of f.d.d. ofξnsefollows from those of the process Ξsen.
Let 0 ≤ t1 < t2 < · · · < td ≤1. From (3), independence of the Xi's and Remark 11, we get
b−1n
S[nt1], S[nt2]−S[nt1], . . . , S[ntd]−S[ntd−1]
¡ D
−→
W(t1), W(t2)−W(t1), . . . , W(td)−W(td−1)¡ . Now (18) and the continuity of the map
(x1, x2, . . . , xd)7→(x1, x2+x1, . . . , xd+· · ·+x1)
yields the convergence of f.d.d. of Ξsen. The convergence of nite dimensional distributions of the processξnse is thus established.
To prove the tightness we shall use Theorem 8.3 from Billingsley [3]. Since ξnse(0) = 0, the proof reduces in showing that for allε,η >0there existn0≥1 andδ,0< δ <1, such that
1 δPn
sup
1≤i≤nδ
Vn−1|Sk+i−Sk| ≥εo
≤η, n≥n0, (39) for all1≤k≤n.
Let us introduce the truncated variables
Yi:=`−1n (Xi; Xi2≤τ2b2n), i= 1, . . . , n,
with`n=n−1/2bn as above andτ to be chosen later. Denote bySek andVek the corresponding partial sums with their self-normalizing random variables:
Sek =Y1+· · ·+Yk, Vek = (Y12+· · ·Yn2)1/2, k= 1, . . . , n.
Then we have Pn
sup
1≤i≤nδ
Vn−1|Sk+i−Sk| ≥εo
≤A+B+C, (40) where
A := Pn sup
1≤i≤nδ
|Sek+i−Sek| ≥ε√ n/2o
, B := P{Ven<√
n/2}, C := nP{|X1| ≥τ `n
√n}.
Due to (21) we can choosen1such that√
n|EY1| ≤1/4forn≥n1. Then with n≥n1andδ≤εwe have
A ≤ Pn
1≤i≤nδmax
¬
¬
¬
k+i
X
j=k+1
(Yj−EYj)
¬
¬
¬+nδ|EY1| ≥√ nε/2o
≤ Pn
1≤i≤nδmax
¬
¬
¬
k+i
X
j=k+1
(Yj−EYj)
¬
¬
¬≥√ nε/4o
.
By Chebyshev's inequality and Rosenthal inequality with p > 2, we have for each1≤k≤n
Pn n−1/2
¬
¬
¬
k+nδ
X
j=k+1
(Yj−EYj)
¬
¬
¬≥ ε 8
o ≤ 8p εpnp/2E
¬
¬
¬
k+nδ
X
j=k+1
(Yj−EYj)
¬
¬
¬
p
≤ 8p εpnp/2
¢(nδ)p/2(EY12)p/2+nδE|Y1|p£ . By (20) we can choosen2 such that
3/4≤EY12≤3/2 for n≥n2. (41) Then we have E|Y1|p ≤ 2n(p−2)/2τp−2 and then assuming that τ ≤ δ1/2 we obtain
Pn n−1/2¬
¬
¬
k+nδ
X
j=k+1
(Yj−EYj)¬
¬
¬≥ ε 8
o ≤ 8p εpnp/2
¢2p/2(nδ)p/2+δnp/2τp−2£
≤ 2·16pδp/2 εp . Now by Ottaviani inequality we nd
A≤δη
3 , (42)
providedδp/2≤εp/(4·16p)andδ(p−2)/2≤ηεp/(6·16p).
Next we considerB. Sincen−1EVen2=EY12we have by (41)n−1EVen2≥3/4, forn≥n2. Furthermore
B ≤P{n−1|Ven2−EVen2| ≥1/2} ≤4n−1EY14≤4τ2EY12≤δη/3, (43) providedn≥n2 andτ2≤δη/18.
Finally choose n3 such that C ≤ ηδ/3 when n ≥ n3 and join to that the estimates (42, 43) to conclude (39). The proof is completed.
Proof of Theorem 2. Due to Theorem 1, it suces to check that
Vn−1(ξn−ζn)
∞ goes to zero in probability, where kfk∞ := sup0≤t≤1|f(t)|. To this end let us introduce the random change of timeθn dened as follows. When Vn >0, θn
is the map from[0,1]onto [0,1]which interpolates linearly between the points (k/n, Vk2/Vn2),k= 0,1, . . . , n. When Vn = 0, we simply takeθn =I, the iden- tity on [0,1]. With the usual convention Sk/Vn := 0 for Vn = 0, we always have
ζnse θn(t)¡
=ξsen(t), 0≤t≤1. (44) Clearly for eacht∈[0,1],
¬
¬
¬ V[nt]2
Vn2 −θn(t)
¬
¬
² max
1≤k≤n
Xk2 Vn2. It follows by (4) that
sup
0≤t≤1
¬
¬
¬ V[nt]2
Vn2 −θn(t)¬
¬
¬
−→P 0
and this together with Lemma 9 gives
kθn−Ik∞−→P 0. (45) Letω(f;δ) := sup{|f(t)−f(s)|; |t−s| ≤δ}denote the modulus of continuity off ∈C[0,1]. Then recalling (44) we have
kξnse−ζnsek∞= sup
0≤t≤1
¬¬ξnse θn(t)¡
−ζnse θn(t)¡¬
¬≤ω
ξnse;kθn−Ik∞¡ . It follows that for anyλ >0and0< δ≤1,
P
kξnse−ζnsek∞≥λ¡
≤P
kθn−Ik∞> δ¡ +P
ω(ξnse;δ)≥λ¡
. (46) Now since the Brownian motion has a version inC[0,1], we can nd for each positiveε, some δ∈(0,1]such that P
ω(W;δ)≥λ¡
< ε. As the functionalω is continuous onC[0,1], it follows from Theorem 1 that
lim sup
n→∞
P
ω(ξsen;δ)≥λ¡
≤P
ω(W;δ)≥λ¡ . Hence forn≥n1we haveP
ω(ξsen;δ)≥λ¡
<2ε. Having in mind (45) and (46) we see that the proof is complete.
Proof of Theorem 3. The convergence of nite dimensional distributions is al- ready established in the proof of Theorem 2
It remains to prove tightness of ζnse in the space Hρ[0,1]. To this aim, we have to check the second condition of Proposition 7 only.
Let ε1, . . . , εn, . . . be an independent Rademacher sequence which is inde- pendent on (Xi). By symmetry of X1, both sequences (Xi) and (εiXi) have
the same distribution. Noting also thatε2i = 1 a.s., we have that ζnse has the same distribution as the random processζense which is dened linearly between the points
°Vk2 Vn2, Uk
Vn
± , whereU0= 0andUk =Pk
i=1εiXi,fork≥1.Hence it suces to prove that
J→∞lim sup
n
X
j>J
2j max
0≤k<2jP ¬
¬ eζnse
(k+ 1)2−j)−eζnse(k2−j)¬
¬> ερ(2−j)¡
= 0. (47) To this aim we shall estimate
δ(t, h, r) :=P
|eζnse(t+h)−ζense(t)|> r¡ , uniformly inn. First consider the case, where
0≤Vk−12
Vn2 ≤t < t+h≤ Vk2 Vn2, so
0≤h≤Vk2
Vn2 −Vk−12 Vn2 = Xk2
Vn2. We have then by linear interpolation
|eζnse(t+h)−ζense(t)| = |εkXk| Vn
Vn2 Xk2h
= ° Vn
|Xk|
√ h±√
h
≤ √
h. (48)
Next consider the following conguration:
0≤Vk−12
Vn2 ≤t < Vk2 Vn2 ≤ Vl2
Vn2 ≤t+h < Vl+12 Vn2 . Then we have
|eζnse(t+h)−ζense(t)| ≤δ1+δ2+δ3, where
δ1 := |eζnse(t+h)−ζense(Vl2/Vn2)| ≤ q
t+h−Vl2/Vn2≤√ h, δ2 := |eζnse(Vl2/Vn2)−ζense(Vk2/Vn2)|=Vn−1|Ul−Uk| ≤ |Ul−Uk|
pVl2−Vk2
√ h, δ3 := |eζnse(Vk2/Vn2)−ζnse(t)| ≤q
Vk2/Vn2−t≤√ h.
Hence, for any conguration we obtain
|eζnse(t+h)−ζense(t)| ≤ |Ul−Uk| pVl2−Vk2
√ h+ 2√
h, (49)
if we agree that|Ul−Uk|(Vl2−Vk2)−1/2:= 0 whenk=l. Therefore δ(t, h, r)≤P
|Ul−Uk|/
q
Vl2−Vk2> r/(2√ h)¡
, (50)
providedr >4√
h. Observe that in this formula the indexeslandkare random variables depending on t, h and the sequence (Xi), but independent of the sequence(εi). Thus conditioning onX1, . . . , Xn and applying the well known Hoeding's inequality we obtain
δ(t, h, r)≤cexp{−r2/(8h)}. (51) Now (47) clearly follows if for every¯ >0,
∞
X
j=1
2jexp{−¯2jρ2(2−j)}<∞, (52) which is easily seen to be equivalent to our hypothesis (7). The proof is com- pleted.
Proof of Theorem 5. From (9) and the characterization (4) of DAN, X1 is clearly in the domain of normal attraction. So the convergence of nite di- mensional distributions is already given by Theorem 2.
To establish the tightness we have to prove that
J→∞lim lim sup
n→∞
P
kζnse−EJζnsekseqρ >4ε¡
= 0. (53)
To this end, it suces to prove that with some sequenceJn↑ ∞to be precised later,
lim sup
n→∞
P° sup
j>Jn
max
0≤k<2j
1 ρ(2−j)
¬¬λ0j,k(ζnse)¬
¬> ε±
= 0 (54)
and
J→∞lim lim sup
n→∞
P° sup
J≤j≤Jn
max
0≤k<2j
1 ρ(2−j)
¬¬λ0j,k(ζnse)¬
¬>3ε±
= 0, (55) where
λ0j,k(ζnse) :=ζnse
(k+ 1)2−j¡
−ζnse(k2−j), 0≤k <2j.
To start with (54), following the same steps which led to (49) we obtain with k,l such that
Vk−12
Vn2 < t≤ Vk2
Vn2, Vl−12
Vn2 < t+h≤ Vl2 Vn2,
the upper bound
|ζnse(t+h)−ζnse(t)| ≤
²
2 + |S(l,k]| V(l,k]
³√ h, where we use the notations
S(i,j]:= X
i<k≤j
Xk, V(i,j] :=
² X
i<k≤j
Xk2
³1/2 ,
with the usual convention of null value for a sum indexed by the empty set.
WritingTk,l:= 2 +|S(l,k]|/V(l,k], this gives
|ζnse(t+h)−ζnse(t)| ≤√
h max
1≤k≤l≤nTk,l. (56)
By [9, Th. 2.5], the Tk,l are uniformly subgaussian. It is worth recalling here and for further use, that if the random variablesYi(1≤i≤N) are subgaussian, then so ismax1≤i≤N|Yi|, which more precisely satises
1≤i≤Nmax |Yi|
φ
2
≤a(logN)1/2 max
1≤i≤NkYikφ
2, (57)
where ais an absolute constant and k kφ
2 denotes the Orlicz norm associated to the Young functionφ2(t) := exp(t2)−1. Applying (57) to the n2 random variables Tk,l, we obtain (with constants c, C whose value may vary at each occurence)
P° sup
j>Jn
max
0≤k<2j
1
ρ(2−j)|λ0j,k(ζnse)|> ε±
≤ X
j>Jn
P°
1≤k≤l≤nmax Tk,l> cεjβ±
≤ X
j>Jn
Cexp°−cj2β logn
±
. (58)
Now choose Jn = (logn)γ with 1 > γ > (2β)−1. Then 2β−1/γ is strictly positive and using
j2β=j1/γj2β−1/γ > Jn1/γj2β−1/γ =j2β−1/γlogn, we see that the right hand side in (58) is bounded byP
j>JnCexp
−cj2β−1/γ¡ , whence (54) follows.
To prove (55), we start with P°
J≤j≤Jmaxn max
0≤k<2j
1
ρ(2−j)|λ0j,k(ζnse)|>3ε±
≤P1+P2+P3, (59) withP1,P2andP3 dened below. First introduce the event
An=n sup
t∈[0,1]
¬
¬
¬ Vτ2
n(t)
Vn2 −V[nt]2 Vn2
¬
¬
¬≤δn
o∩n sup
t∈[0,1]
¬
¬
¬ V[nt]2
Vn2 −t
¬
¬
¬≤δn
o .
whereδn is choosen as in (11), keeping the freedom of choice of the constant c. Now we dene
P1 := P(Acn);
P2 := P° An∩n
J≤j≤Jmaxn
max
0≤k<2j
1 ρ(2−j)
|S[(k+1)2−jn]−S[k2−jn]| Vn
> εo±
; P3 := P°
An∩n
J≤j≤Jmaxn max
0≤k<2j
1
ρ(2−j) max
|l−[k2−jn]|≤nδn
h|Sl−S[k2−jn]|
Vn + 2
2j/2 i
>2εo±
. The following easy estimates
sup
t∈[0,1]
¬
¬
¬ V[nt]2
Vn2 −t¬
¬
² max
1≤k≤n
¬
¬
¬ Vk2 Vn2−k
n
¬
¬
¬+ 1 n,
sup
t∈[0,1]
¬
¬
¬ Vτ2
n(t)
Vn2 −V[nt]2 Vn2
¬
¬
² max
1≤k≤n
Xk2
Vn2 + max
1≤k≤n
¬
¬
¬ Vk2 Vn2 −k
n
¬
¬
¬+1 n, lead by (9) and (10) to
P(Acn)→0. (60)
SoP1 will be killed by taking thelim supinn.
To controlP2, rst write with self-explanatory notations
|S[(k+1)2−jn]−S[k2−jn]| Vn
= |S[(k+1)2−jn]−S[k2−jn]| V [k2−jn],[(k+1)2−jn]£
×
V [k2−jn],[(k+1)2−jn]£ Vn
.
Observing that on the eventAn, we have V [k2−jn],[(k+1)2−jn]£
Vn
≤p
2−j+δn and assuming that
δn≤2−Jn, (61)
we get
P2≤ X
J≤j≤Jn
P° max
0≤k<2j
1 ρ(2−j)
|S[(k+1)2−jn]−S[k2−jn]| V [k2−jn],[(k+1)2−jn]£
>√ 2ε2j/2±
.
Since we are dealing now with the maximum of2juniformly subgaussian random variables (theirϕ2norms are bounded by a constant which depends only on the distribution ofX1), this leads to
P2≤ X
J≤j≤Jn
Cexp
−cj2β−1¡
≤
∞
X
j=J
Cexp
−cj2β−1¡
. (62)
To controlP3, we rst get rid of the residual term by noting that 2
ρ(2−j)2j/2 = c
jβ < ε for j ≥J ≥J(ε), uniformly inn. So for J≥J(ε),
P3≤P° An∩n
J≤j≤Jmaxn
0≤k<2maxj 1
ρ(2−j) max
|l−[k2−jn]|≤nδn
|Sl−S[k2−jn]| Vn
> εo±
. On the eventAn we have for anylsuch that|l−[k2−jn]| ≤nδn,
|V[k22 −jn]−Vl2| Vn2 ≤2δn. It follows that
P3≤P°
J≤j≤Jmaxn
max
0≤k<2j max
|l−[k2−jn]|≤nδn
|Sl−S[k2−jn]|
|V[k22 −jn]−Vl2|1/2 >ερ(2−j)
√2δn
± . Using the invariance of distributions under translations onk, we get
P3 ≤ X
J≤j≤Jn
2jP° max
0<l≤[2nδn]
|Sl| Vl
>ερ(2−j)
√2δn
±
≤ X
J≤j≤Jn
2jCexp°
−c2−jj2β δnlogn
±
≤ C X
J≤j≤Jn
2jexp°
− c2−Jn δnlognj2β±
.
Now we see that the following convergence rate (stronger than (61)) δn= 1
2Jnlogn = 2−(logn)γ
logn , with 1
2β < γ <1, is sucient to obtain (55). The proof is complete.
Proof of Corollary 6. As isX1is square integrable,X1is inDAN. The conver- gence rates (9) and (10) required by Theorem 5 are provided by the two following lemmas, recalling that with our choice (11) ofδn, we haven−ε=o(δn)for any ε >0.
Lemma 13. IfE|X1|2+δ <∞ for someδ >0, then almost surely n−c max
1≤k≤n
¬
¬
¬ Vk2 Vn2 −k
n
¬
¬
¬→0, (63)
wherec=δ/(2 + 2δ).
Proof. By Marcinkiewicz SLLN, if the i.i.d. sequence(Yk)satisesE|Y1|p <∞ for some 1 ≤ p < 2, then n−1/p P
k≤nYk−nEY1
¡ goes to 0 almost surely.
Applying this toY1=X12andp= 1 +δ/2 gives Vn2
n = 1 +n1/p−1εn, n≥1,
where the random sequence (εn) goes to zero almost surely. Since we assume P(X1= 0)<1, we haveP(∀n≥1, Vn = 0) = 0. On each event{Vn2>0}, we may write witha= 1−1/p,
Vk2 Vn2 −k
n =k n
°Vk2 k
n Vn2 −1±
= k
n×k−aεk−n−aεn 1 +n−aεn
.
For eachn ≥n0 =n0(ω)large enough, n−aεn >−1/2. Now for an exponent 0< b <1 to be precised later, we have
¬
¬
¬ Vk2 Vn2 −k
n
¬
¬
¬≤4nb−1sup
i≥1
|εi|, for n≥n0, 1≤k≤nb and
¬
¬
¬ Vk2 Vn2 −k
n
¬
¬
¬≤4n−absup
i≥nb
|εi|, for n≥n0, nb< k≤n.
The optimal choice ofb given by1−b=ableads to the announced conclusion withc=a/(a+ 1) =δ/(2 + 2δ).
Lemma 14. IfE|X1|2+δ <∞ for someδ >0, then almost surely nd max
1≤k≤n
Xk2
Vn2 →0, (64)
for anyd < δ/(2 +δ).
Proof. We use the same trick as in O'Brien [16] p. 542. For any positiveεwe have (noting the key role of i.o. in the following inequalities)
P°
1≤k≤nmax Xk2
Vn2 > εn−d, i.o.±
≤ P° Vn2< n
2, i.o.± +P°
1≤k≤nmax Xk2> n
2εn−d, i.o.±
= 0 +P°
Xn2 >n
2εn−d, i.o.± . Now observe that
∞
X
n=1
P°
Xn2> n 2εn−d±
≤°2 ε
±1+δ/2
E|X1|2+δ
∞
X
n=1
1 n(1−d)(1+δ/2). For anydsuch that (1−d)(1 +δ/2)>1, Borel-Cantelli's Lemma leads to
P°
1≤k≤nmax Xk2
Vn2 > εn−d, i.o±
= 0.
Asεis arbitrary, the result is proved.
Acknowledgement: The rst author would like to thank Vidmantas Bentkus for a number of stimulating discussions on the invariance principle for self- normalized sums.
References
[1] Araujo, A. and Giné, E. (1980) The Central Limit Theorem for Real and Banach Valued Random Variables. Wiley, New York.
[2] Bentkus, V. and Götze, F. (1996) The Berry-Esseen bound for student's statistic Annal. Probab. 24, 491503
[3] Billingsley, P. (1968) Convergence of probability measures. Wiley, New York.
[4] Christiakov, G. P. and Götze, F. (1999) Moderate deviations for self- normalized sums. Preprint 99-048, SFB 343, Univ. Bielefeld.
[5] Chuprunov, A. N. (1997) On convergence of random polygonal lines under Student-type normalizations. Theory of Probab. and Appl. 41, 756761.
[6] Ciesielski, Z. (1960). On the isomorphisms of the spacesHα andm. Bull.
Acad. Pol. Sci. Ser. Sci. Math. Phys. 8, 217222.
[7] Erickson, R. V. (1981). Lipschitz smoothness and convergence with appli- cations to the central limit theorem for summation processes. Ann. Probab.
9, 831851.
[8] Egorov, V. A. (1997) On the asymptotic behavior of self-normalized sums of random variables. Theory of Probab. and Appl. 41, 542548.
[9] Giné, E., Götze, F. and Mason, D. M. (1997) When is the Student t-statistic asymptotically standard normal? Ann.Probab. 25, 15141531.
[10] Grin, P. S. and Kuelbs, J. (1989) Self-normalized laws of the iterated logarithm. Ann. Probab. 17, 15711601.
[11] Grin, P. S. and Mason, D. M. (1991) On the asymptotic normality of self-normalized sums. Proc. Cambridge Phil. Soc. 109, 597610.
[12] Hamadouche, D. (1998). Invariance principles in Hölder spaces. Portugaliae Mathematica, 57 (2000), 127151.
[13] Kerkyacharian, G. and Roynette, B. (1991). Une démonstration simple des théorèmes de Kolmogorov, Donsker et Ito-Nisio. C. R. Acad. Sci. Paris Série I 312, 877882.
[14] Lamperti, J. (1962). On convergence of stochastic processes. Trans. Amer.
Math. Soc. 104, 430435.
[15] Logan, B. F., Mallows, C. L., Rice, S. O. and Shepp, L. A. (1973) Limit distributions of self-normalized sums. Ann. Probab. 1, 788809.
[16] O'Brien, G. L. (1980) A limit theorem for sample maxima and heavy branches in Galton-Watson trees. J. Appl. Probab. 17, 539545.
[17] Ra£kauskas, A. and Suquet, Ch. (1999). Central limit theorem in Hölder spaces. Probab. Math. Stat. 19, 155174.
[18] Ra£kauskas, A. and Suquet, Ch., Random elds and central limit theorem in some generalized Hölder spaces, Prob. Theory and Math. Statist., Pro- ceedings of the 7th Vilnius Conference (1998), B. Grigelionis et al. (Eds), 599616, TEV Vilnius and VSP Utrecht (1999).
[19] Ra£kauskas, A. and Suquet, Ch., On the Hölderian functional central limit theorem for i.i.d. random elements in Banach space, Pub. IRMA Lille 50-III, to appear in Proceedings of the Fourth Hungarian Colloquium on Limit Theorems of Probability and Statistics (1999).
[20] Ra£kauskas, A. and Suquet, Ch., Convergence of self-normalized partial sums processes in C[0,1] and D[0,1], Pub. IRMA Lille (2000) 53-VI, preprint.
[21] Shao, Q.-M. (1997) Self-normalized large deviations, Ann. Probab. 25, 285 328.
[22] Shao, Q.-M. (1999) A Cramér type large deviation result for Student's t-statistic. J. Theoret. Probab. 12, 387-398.