A NNALES DE L ’I. H. P., SECTION B
E. DEL B ARRIO C. M ATRÁN
J. A. C UESTA -A LBERTOS
Necessary conditions for the bootstrap of the mean of a triangular array
Annales de l’I. H. P., section B, tome 35, n
o3 (1999), p. 371-386
<http://www.numdam.org/item?id=AIHPB_1999__35_3_371_0>
© Gauthier-Villars, 1999, tous droits réservés.
L’accès aux archives de la revue « Annales de l’I. H. P., section B » (http://www.elsevier.com/locate/anihpb) implique l’accord avec les condi- tions générales d’utilisation (http://www.numdam.org/conditions). Toute uti- lisation commerciale ou impression systématique est constitutive d’une infraction pénale. Toute copie ou impression de ce fichier doit conte- nir la présente mention de copyright.
Article numérisé dans le cadre du programme Numérisation de documents anciens mathématiques
http://www.numdam.org/
371
Necessary conditions for the bootstrap
of the
meanof
atriangular array*
E. del BARRIO and C.
MATRÁN
J. A. CUESTA-ALBERTOS
Departamento de Estadistica e Investigaciôn Operativa,
Universidad de Valladolid, Spain
Departamento de Matemáticas, Estadistica y Computaciôn,
Universidad de Cantabria, Spain
Vol. 35, n° 3, 1999, p. -386 Probabilités et Statistiques
ABSTRACT. -
Although
necessary conditions for thebootstrap
of the meanto work have
already
beengiven, only
the case of an i.i.d. séquence has beenexhaustively
considered. Westudy
such necessary conditions for(not necessarily infinitesimal) triangular
arraysshowing
that the existence of alimit law in
probability
leads toinfinitesimality
and to the Central Limit Theorem to hold for a rescaledsubarray.
Our
setup
is based on atriangular
array of row-wiseindependent identically
distributed random variables and anyresampling
size. Whileour results are similar to those obtained in Arcones and Giné
Henri Poincaré
25(1989) 457-481]
for an i.i.d. séquence, ourproof
isbased on
symmetrizations
and the considération of U-statistics and allowsa unified treatment without moment
assumptions.
@Elsevier,
ParisKey words: necessary condition, bootstrap mean, arbitrary resampling sizes, triangular
arrays, U-statistics, Central Limit Theorem.
* Research partially supported by DGICYT grant PB95-0715-C02-00, 01 and 02
1 Thèse authors have also been supported by the PAPIJCL grant VA08/97 A.M.S. 1991 subject classification: Primary: 62 G 09, 60 F 05. Secondary: 62 E 20.
Annales de l’Institut Henri Poincaré - Probabilités et Statistiques - 0246-0203
Vol. 35/99/03/@ Elsevier, Paris
372 E. DEL BARRIO, J. A. CUESTA-ALBERTOS AND C. MATRÀN
RÉSUMÉ. - Bien que les conditions nécessaires au fonctionnement du
bootstrap
de la moyenne aientdéjà
étédonnées,
seul le cas d’uneséquence
de variables aléatoires
indépendantes
et de même loi(i.i.d.)
a été considéré exhaustivement. Nous étudions de telles conditions nécessaires pour untableau
triangulaire (pas
nécessairementinfinitésimal)
en montrant quel’existence d’une loi limite en
probabilité
conduit à l’infinitésimalité et à la vérification du Théorème Central du Limite pour un sous-tableau normalisé.Notre cadre se base sur un tableau
triangulaire
de variables aléatoiresindépendantes
et de même loi surchaque ligne
et den’importe quel
tailled’enchantillonnage.
Bien que nos résultats sont similaires à ceux obtenus dans Arcones et Giné Henri Poincaré25(1989) 457-481]
pourune
séquence i.i.d.,
notre démonstration se base sur lasymétrisation
et laprise
en considération desU-statistiques
etpermet
un traitement unifié sanshypothèses
sur les moments. @Elsevier,
Paris1. INTRODUCTION
The différences between the
asymptotic
distribution of thebootstrap sample
mean for infinitesimal arrays and for sequences ofindependent identically
distributed(i.i.d.)
random variables(r.v.’s)
have been shown in Cuesta-Albertos and Matràn(1998).
After this work it becomesmathematically
natural to follow itby providing
necessary conditions toassure a
bootstrap
limit law.We would like to
emphasize, however,
the interest of necessary conditions from thepoint
of view of statisticalapplications
of thebootstrap.
Recall thatone of the achievements of the
bootstrap techniques
since their introductionby
Efron(1979)
is that itallows,
via Monte Carlomethod,
toapproximate
the distribution of the statistic of interest. In the
simplest
case of the mean,given
n observationsXi
=..., Xn
of a random variableX,
thisleads to obtain a
large
set ofresamples
of size m~ from...,~}:
...~ ~
withsample mean ~
~ i?
... ,~ ~
withsample mean ~
~ i,..., ~ ~
withsample mean t;
l’Institut Poincaré - Probabilités et Statistiques
with the guess that the
sample
distribution of~,~...~
will be agood approximation
to theasymptotic distribution,
ifany!,
of X.Thus,
as a matter of fact our interest is to ensure that if thebootstrap
distribution
is,
sayapproximately normal,
thenX
will beapproximately
normal. The mathematical
justification
of thisprocedure
must begiven by
a necessary condition.
Let us suppose that
Bi, ..., Bn
is asample
of Bernoulli trials. Twoapproximations
to theprobability
law ofSn
== areusually
considered.
First,
the celebrated De Moivre’s Central Limit Theorem assertsthat,
whenproperly normalized,
the law ofSn
isnearly
normal. On the otherhand,
the(not
lesscelebrated)
Poisson’s Theorem of Rare Events allows to use the Poisson distribution toapproximate
the law ofSn.
As it is well
known,
from the mathematicalpoint
ofview,
with thelaw of convergence of
types
in mindavoiding
both convergences for thesame séquence
(even rescaled),
theproblem
is circunventedthrough
theconsidération of
triangular
arrays of random variables. This allows togive
mathematical
meaning
to theexpression
’rare events’through
acompromise
between the
sample
size and theprobability
of success in the trials.Thèse observations should make clear that necessary conditions obtained in the
setup
of a séquence of i.i.d. random variables canonly give
answersto the
following question:
If
we assume that the distributiontype
summand does notdepend
and wide range
of sample
thedistribution of
thecorresponding bootstrap
sum can beapproximated by a law,
what canwe say about the distribution sum?
However,
from thepoint
of view ofapplications
theinteresting question
should be
7~ sample,
the distributionof
thebootstrap
sum can beapproximated by a law,
what can we say about the distributionof
the sum
of the original sample ?
and the
appropriate setup
to discuss thisproblem
isgiven by triangular
arrays of row-wise i.i.d. random
variables,
which will be thesetup
considered in this paper.The first necessary condition for the
bootstrap
of the mean for i.i.d.sequences and
resampling
sizeequal
to thesample size,
i.e. m~ = n, wasgiven
in Giné and Zinn(1986) showing
that thebootstrap
works a.s. if andonly
if the common distribution of the séquence has finite second moment, while it works inprobability
if andonly
if that distributionbelongs
tothe domain of attraction of the normal law. Hall
(1990) completes
the3-1999.
374 E. DEL BARRIO, J. A. CUESTA-ALBERTOS AND C. MATRÀN
analysis
in thissetup showing
that when there exists abootstrap
limit law(in probability)
then either theparent
distributionbelongs
to the domain ofattraction of the normal law or it has
slowly varying
tails and one of thetwo tails
completely
dominates the other.The interest of
considering resampling
sizes différent to thesample
size was noted among others
by Swanepoel (1986)
andAthreya (1987).
For
général arbitrary resampling sizes,
but still in thesetup
of an i.i.d.séquence, Arcones and Giné
(1989)
obtained:Let
~
l’X£ ~ , bootstrap sample
obtained from 1.i.d.r.v.’ sand
set Denoteby
~ the conditional lawgiven
thesample. Then, provided
that mn/
oo, if there exist a non-degenerated
randomprobability
measure constants{a}~n=1, an /
oo,and r.v’s cn, measurable on the 03C3-fields 03C3(X1, X2,..., Xn), n E
N,
satisfying
/ . ,(a)
there exist aLévy
measure v, andcr~ >
0 such that for every T > 0(b)
If c > 0 andthen,
(c)
Iflimsup
> 0 then v == 0 andcr~
> 0.(d)
If liminf > 0 then Xbelongs
to the domain of attraction of the normal law withnorming
constantsbn
=As observed
earlier,
the fact that thèse results concem a séquence of . i.i.d.r.v.’sprevents
them fromproperly discussing
thequestion
of interestabove. To our best
knowledge, only
Mammen(1992)
addresses theproblem (in
thesetup
of linearstatistics) considering triangular
arrays, but withHenri Poincaré - Probabilités et Statistiques
resampling sizes, equal
to theoriginal sample
sizen).
Ourgénéral
theorem on necessary conditions for thebootstrap
of the mean willcover any
resampling
rate.Before
stating
our mainresult,
we introduce some notation. We will consider atriangular
array, =1,...,~; ~ EN},
oo, of row-wise i.i.d.r.v.’s
(from
now on atriangular array).
X~ i,X~,...,~~(?r~
-~(0)
will be abootstrap sample, i.e.,
an i.i.d.sample given
with law~-
and(bootstrap)
sumS*n
=where 03B4x
denotes Dirac’s measure on x. Asusual,
P*and E* will
respectively
denote the conditionalprobability
andexpectation given
thesample,
while~(~?)
is the distribution of the r.v. Z and/~*(Z)
is the
corresponding
conditional distribution of Z.As in
Araujo
and Giné(1980), N( a, o~) ~ crPois v
is agénéral infinitely
divisible law written as the convolution of a normal law and the
generalized
Poisson law associated to the
Lévy
measure v.Moreover,
for anygiven
r.v.
X,
and 03B4 >0,
we denoteX03B4
:= andX03B4
:=Convergence
indistribution,
in law or weak convergence will be termsindistinctly
usedthrough
the paper, and will be denotedby 1,
while2014~
will mean convergence in
probability.
We state now our main result.
THEOREM 1.1. - Let mn
/
00, and C e[0,0o]. If there probability
constants{~}~=i? ~ ~
oo, and randomvariables
An,
measurable on thea(Xn,j : j
=1,...,~),?~
E Nsatisfying
I ..in
probability,
then:(i)
p isinfinitely
divisible.Thus,
p ==N( a, 03B12)
*c03C4Pois
v.(ii) If
C == 0 then there exist constants{bn}~n=1
such thatThe constants
bn
mustsatisfy
where 7rn is a median
for Xn,j.
Vol. 35, n° 3-1999.
376 E. DEL BARRIO, J. A. CUESTA-ALBERTOS AND C. MATRÀN
(iii) If
c > 0 then v m 0 and there exists{cn}~n=1
such thatThe constants cn must
satisfy
where
03C0n is a
medianof Xn,j.
Remark 1. - Note that the sum in
(ii)
in theprevious
theorem does notinclude all the values in the
sample.
DEven
though
our results are similar to those in Arcones and Giné(1989),
the difficulties that arise in the
triangular
arraysetting
are very différent. In addition to the usualtechniques
in thestudy
of the Central LimitTheorem,
to carry out the
proof
we handle a uniformapproximation
between someU-statistics and their
corresponding projections
madepossible through
Lemma 2.1. The use of U-statistics is a conséquence of the
impossibility
of
avoiding
the use ofsymmetrizations
toget independent
summands insome
steps (see
Remark2).
2. PROOFS
Our
proof
of Theorem 1.1 is based onsymmetrization.
Let~*i,...,
be a newbootstrap sample independent
of~ ~
...,(i.e., X~...,X~
are 1.1.. r.v’s with law and inde-pendent
ofX~, .~~~J.
Let~r - ~
Thenin
probabilité where p
is theprobability
measure such thatp(A)
==p( -A)
for every Borel set A. Let =
1,
...,?r~~, n~ N}
be atriangular
array of row-wise i.i.d. r.v.’s such that = where is an
independent
copy of When the limitresampling
rate, c,is
finite,
we will use(2.1)
to showthat, {Y~j:~=l,...,m~~GN}
verifies the
hypothèses
of the Central Limit Theorem. Then we will obtain the convergence of theoriginal
array. In order to prove theAnnales de l’Institut Henri Poincaré - Probabilités et Statistiques
convergence for =
1,...,
n EN}
we need a lemmaconceming
the
approximation
of a U -statisticby
itsHoeffding projection.
First weintroduce some notation.
Let h : Rm - R be a
symmetric integrable
function. With the notationd(Pi
x ... xPm),
theHoeffding projections
of hare defined as
for xi E R and 1 ~ k m. For
convenience,
we denote03C00 h
= Pmf.
Thèse
projections
induce adécomposition
of the U -statisticinto the sum of U-statistics of orders k m,
namely,
theHoeffding décomposition:
We will use this
decomposition
to prove thefollowing lemma,
whichprovides
a bound for theL2-distance
between a U-statistic of order 2 anda sum of
independent
r.v’s. The main interest of this result relies on the fact that this bound is valid for allsymmetric
kernels.LEMMA 2.1. - Let
Xi, ..., Xn
beindependent identicaly
distributed random variables. Assume that h :1R2 ---*
R is asymmetric function
such that E
X2)
00. With the abovenotation,
letÛn(h)
==U(0)n(03C00h)
+ 2(l)( (
7ri)
ThenProof. - According
toHoeffding’s expansion, i7~(~) = L~(7T2 h).
Note that EUÉ~~ (7r~ h)
=0,
so that it suffices to prove thatNow observe that
~(~ -
We can
easily
check that Var = Varh(Xi, X2) -
2V ar Tri
h(Xi)
and = 0. Therefore35,n°
378 E. DEL BARRIO, J. A. CUESTA-ALBERTOS AND C. MATRÀN
which
completes
theproof.
DNow we prove convergence for
(Yn,j : j
==1,
..., mn, n EN)
whenC 00.
LEMMA 2.2. - Under the
hypotheses of
Theorem 1. 1 andusing
the abovenotation, if
c 00 andTn
==Yn,j
then .Proof - Conditionally given ~(~ - ~)
is a sumof i.i.d r.v’s. From every
subsequence {~~}
we can extract a furthersubsequence
such thatr§,1(S§§,, - S§§1,)
convergesweakly
in aprobability
one set. In that set,{~~(~~j 2014 X§§1 , ~ ) : j
=1 , 2,
... , isinfinitesimal
(see
e.g. Breiman(1968),
p.191). Hence,
we can conclude thatand that p *
p
isinfinitely divisible,
i.e. thereexist ~ ~
0 and asymmetric Lévy
measure, p, such thatp * p
=cr~) ~ crPois
~.By infinitesimality,
necessary conditions in the CLT
(see
e.g.Araujo
and Giné(1980),
p.61) imply
that there exists aLévy
measure ~ such thatfor
every 8
> 0 such that = 0 andfor every séquence
{ï~}~i
such that Tn~
0.The conditional distribution of
X§§ ~ - given
ishence
and we can use
Lebesgue’s
Theorem and(2.2)
to obtainAnnales de l’Institut Henri Poincaré - Probabilités et Statistiques
which
implies
that{Y~j :j=l,...,m~~eN}
is infinitesimal.Now observe
that, using
the same notation as in Lemma 2.1 and the fact that the law ofX~ i -
issymmetric,
where =
~_,~,,~
whilewhere
Thus,
we can rewrite(2.3)
and(2.4)
asfor
every 8
> 0 such that = 0 andfor every
{7~}~
such that Tn~
0.Since
and
we can use Lemma 2.1 to conclude that
and also that
Vol. 35,n° 3-1999.
380 E. DEL BARRIO, J. A. CUESTA-ALBERTOS AND C. MATRÀN
which, together
with(2.5)
and(2.6), provide
for every 8 > 0 such that == 0 and
for
such that Tn ~
0. Note that is a sum ofindependent r.v’s, namely
Since
1 =
1, ..., kn n
~ is an infinitesimal array.By
theCLT
(see
e.g.Araujo
and Giné(1980),
pg.61,
Theorem4.7(iii))
and(2.9), ~
E A similarreasoning
shows that m~ E-
~.
Now(2.7)
and(2.8) imply
for
every 8
> 0 such that = 0 andfor every
{7-~}~
such that Tn~
0.Note that E = >
8)
and that E = VarObserve also that
(2.12) implies
which
completes
theproof.
DNow we are
ready
to prove our main result.l’Institut Henri Poincaré - Probabilités et Statistiques
Proof of Theorem
1.1. - Assume first that c oo. Let be as in Lemma 2.2. If 7rn is a median of thenLévy inequalities (see
e.g.Feller
(1966),
pag.147) imply
thatthat =
1,2,...,
mn n ~ N} is an infinitesimal array. On the otherhand,
sincethen
holds,
whichtogether
withimplies
that p isinfinitely divisible, namely,
p =~V(~,Q;~) ~ crPois
v.By
Lemma 2.2 p *
p
=cr~) ~ crPois Hence, c~~ = ~
and v + ~ == ~.The
infinitesimality
and(2.13) imply
thatfor
every 8
> 0 such that~{2014~}
= 0. ThereforeObserve that the array
{
"~J -7r " ">ò~ j : j
==1,
...,kn
n EN)
isinfinitesimal,
thus we canemploy
the CLT to concludeand
Let
Sn = ~ (Xn,j - ~). By
Lemma2.2,
ifS§j
is anindependent
copy of
Sn
thenVol. 35, n° 3-1999.
382 E. DEL BARRIO, J. A. CUESTA-ALBERTOS AND C. MATRÁN
This means that
Sn
isshift-tight (see Araujo
and Giné(1980), Corollary 4.11,
p.27).
Let{7~}~
be a sequence such that5~
is
shift-convergent. By
theinfinitesimality
of{(X~j 2014
=1, 2, ..., rnn, n
EN}
andusing again
Theorem 4.7 inAraujo
and Giné(1980),
there exist03B2 2
0 and aLévy
measure À such thatfor every 7- such that
~{2014~,7-}
= 0.By (2.14) necessarily
À = v.Furthermore,
which
implies N(0,203B22) * c03C4Pois(03BD+03BD)
=7V(0, 03C32) *c03C4Pois . Uniqueness
in the
Lévy-Khintchine representation, pro vides /?~
=~/2
=o~. Thus,
from every sequence we can extract a
subsequence
suchthat
therefore
which
completes
theproof
of(ii).
Assume now that c E
(0,oo).
Note that(2.15) implies
thatfor every 7- > 0 such that
~{-T, r}
= 0. From( 2.14)
we obtain thatfor
every 8
> 0 such that~{2014~}
= 0.According
to theCLT,
thesefacts
imply
Now, by
Theorem 11 in Cuesta and Matrán(1998),
we obtain thatAnnales de l’Institut Henri Poincaré - Probabilités et Statistiques
in
law,
where N is a Poisson random measure withintensity
measure v.But,
as we have shown in(2.13),
in
probability,
thennecessarily
N = v. This canonly happen
if v = 0.Hence
Finally,
suppose c = oo.Reasoning
as above we can obtain thatand conclude from this that
and
Since oo,
(2.16)
canonly happen
if(passing
tosubsequences)
is
eventually
zero on aprobability
one set. Hencev == 0 and we can rewrite
(2.17)
as follows:Now call
Zn,j
==/f£ (Xn,j - 03C0n) and Z*n,j
== mn kn 1 rn(X*n,j - xn),
j
==1 , 2,
...,kn. Then, by Lévy’s
andChebychev’s inequalities
and(2. 18)
384 E. DEL BARRIO, J. A. CUESTA-ALBERTOS AND C. MATRÁN
Observe that the r.v. >
8)
== follows abinomial distribution with
parameters kn
and >8),
hence thebound above
implies
thatFrom
(2.19)
it is easy toconclude, employing
the CLT thatSince
(2.20) implies
thatNow,
from(2.20)
and(2.21)
we obtain thatin
probability, but,
as shownbefore,
thisimplies
which
completes
theproof.
DRemark 2. - The
symmetrization employed
in theproof
of Theorem 1.1 is necessary to achieve the conclusion. We could use the initialhypotheses
to
obtain,
in a similar fashion as in Lemma2.1,
theinfinitesimality
=
1,2,..., (hence,
the infinitedivisibility
ofp =
A~(~~,c~~) ~ crPois v),
and also the convergencesfor
every 8
> 0 such that~{-~~}
= 0 andAnnales de l’Institut Henri Poincaré - Probabilités et Statistiques
for every 7 > 0 such that
~{2014~,~}
= 0. It is easy to argue as above toget
theinfinitesimality
of =1,2,
The CLTand
(2.22) provide
In order to
complete
theproof
we should show thatfor every T > 0 such that
~{-~ ~}
= 0.However,
we can not use the CLTto conclude this
directly
from(2.23)
becauseis not a sum of
independent
r.v’s. Thisdifficulty
is sorted out in theproofs
in Giné and Zinn
(1989)
and Arcones and Giné(1989)
for thebootstrap
of1.i.d.r.v. ’ s
mainly through
Lemma 2 in Giné and Zinn(1989),
which shows that if the array comes from a sequence of 1,i,d.r.v.’s andEXf
= oo, thenbut a similar result in our
setup
without additional moments restrictions is not available. On the other hand ourproof jointly
handles both cases offinite and infinite variance. D
ACKNOWLEDGMENTS
We thank Professor Evarist Giné for his
suggestions
and commentsconceming !7-statistics,
whichhelped
to write andconsiderably simplify
the
proof
in the final version of Lemma 2.1.Vol. 35, n° 3-1999.
386 E. DEL BARRIO, J. A. CUESTA-ALBERTOS AND C. MATRÁN REFERENCES
[1] ARAUJO A. and GINÉ E., The Central Limit Theorem for Real and Banach Valued Random Variables. Wiley, New York, 1980.
[2] ARCONES M. and GINÉ E., The Bootstrap of the Mean with Arbitrary Bootstrap Sample
Size. Ann. Inst. Henri Poincaré, Prob. Stat., Vol. 25, 1989, pp. 457-481.
[3] ATHREYA K. B., Bootstrap of the Mean in the Infinite Variance Case. Proceedings of the 1st
World Congress of the Bernoulli Society, Y. Prohorov and V. V. Sazonov, Eds., Vol. 2, 1987, pp. 95-98. VNU Science Press, The Netherlands.
[4] BREIMAN L., Probability, Addison-Wesley, Reading, Mass, 1968.
[5] CUESTA-ALBERTOS J. A. and MATRÁN C, The asymptotic distribution of the bootstrap sample
mean of an infinitésimal array. Ann. Inst. Henri Poincaré, Prob. Stat., Vol. 34, 1998, pp. 23-48.
[6] FELLER W., An Introduction to Probability Theory and its Applications, Vol. II, Wiley,
New York, 1966.
[7] GINÉ E. and ZINN J., Necessary conditions for the bootstrap of the mean, Ann. Statist., Vol. 17, 1989, pp. 684-691.
[8] HALL P., Asymptotic Properties of the Bootstrap of Heavy Tailed Distributions, Ann.
Statist., Vol. 18, 1990, pp. 1342-1360.
[9] MAMMEN E., Bootstrap, wild bootstrap, and asymptotic normality, Probab. Theory Relat.
Fields, Vol. 93, 1992, pp. 439-455.
[10] SWANEPOEL J. W. H., A Note in Proving that the (Modified) Bootstrap Works, Commun.
Statist. Theory Meth., Vol. 15, 1986, pp. 3193-3203.
(Manuscript received November 1 7, 1997;
Revised version received July 15, 1998. )
Annales de l’Institut Henri Poincaré - Probabilités et Statistiques