A NNALES DE L ’I. H. P., SECTION B
P ASCAL M ASSART
Rates of convergence in the central limit theorem for empirical processes
Annales de l’I. H. P., section B, tome 22, n
o4 (1986), p. 381-423
<http://www.numdam.org/item?id=AIHPB_1986__22_4_381_0>
© Gauthier-Villars, 1986, tous droits réservés.
L’accès aux archives de la revue « Annales de l’I. H. P., section B » (http://www.elsevier.com/locate/anihpb) implique l’accord avec les condi- tions générales d’utilisation (http://www.numdam.org/conditions). Toute uti- lisation commerciale ou impression systématique est constitutive d’une infraction pénale. Toute copie ou impression de ce fichier doit conte- nir la présente mention de copyright.
Article numérisé dans le cadre du programme Numérisation de documents anciens mathématiques
http://www.numdam.org/
Rates of convergence in the central limit theorem
for empirical processes
Pascal MASSART Universite Paris-Sud
U. A. CNRS 743 « Statistique Appliquee » Mathematiques, Bat. 425,
91405 Orsay (France)
Ann. Inst. Henri Poincaré,
Vol. 22, n° 4, 1986, p. 381-423. Probabilités et Statistiques
SUMMARY . In this paper we
study
the uniform behavior of theempirical
brownian
bridge
over families of functions ~ boundedby
a function F(the
observations areindependent
with common distributionP).
Undersome suitable
entropy
conditions which werealready
usedby
Kolcinskiiand
Pollard,
we proveexponential inequalities
in theuniformly
boundedcase where F is a constant
(the
classical Kiefer’sinequality (1961)
isimproved),
as well as weak andstrong
invarianceprinciples
with rates ofconvergence in the case where F
belongs
to~2 +a(P)
with 6 E] o,1 ] (our
results
improve
onDudley
andPhilipp’s
results(1983)
whenever ~ isa
Vapnik-Cervonenkis
class in theuniformly
bounded case and are newin the unbounded
case).
Key-words and phrases : Invariance principles, empirical processes, gaussian processes,
exponential bounds.
RESUME. - Dans cet article nous etudions le
comportement
uniforme dupont
brownienempirique
sur des familles de fonctions ~ borneespar une fonction F
(les
observations sontindependantes,
de distributioncommune
P).
Sous des conditionsd’entropie
convenablesqui
sontdeja
utilisees par Kolcinskii et
Pollard,
nous prouvons lesinegalites
exponen- tielles dans le cas uniformement borne ou F est une constante(1’inegalite classique
de Kiefer(1961)
estamelioree)
aussi bien que lesprincipes
d’in-Annales de l’Institut Henri Poincaré - Probabilités et Statistiques - 0246-0203 381
382 P. MASSART
variance faibles et forts avec des vitesses de convergence dans le cas ou F
appartient
a L2+a(P)
avec 03B4~] O,
1] (nos
résultats améliorent ceux deDudley
etPhilipp (1983) lorsque
~ est une classe deVapnik-Cervonenkis
dans le cas
uniformement
borne et sont nouveaux dans le cas non-borne.TABLE OF CONTENTS
1. Introduction ... 382
2. Entropy and measurability ... 385
3. Exponential bounds for the empirical brownian bridge ... 388
4. Exponential bounds for the brownian bridge ... 403
5. Weak invariance principles with speeds of convergence ... 405
6. Strong invariance principles with speeds of convergence ... 412
Appendix 1. Proof of lemma 3.1... 419
2. The distribution of the supremum of a d-dimensional parameter brownian bridge ... 420
3. Making an exponential bound explicit ... 420
1. INTRODUCTION
1.1. GENERALITIES. - Let
(X, ~’, P)
be aprobability
space and(Xn)n>
1be some sequence of
independent
andidentically
distributed random variables with lawP,
defined on a richenough probability
spacePr).
Pn
stands for theempirical measure -
and we choose to callempirical
brownianbridge relating
to P the centered and normalized process 03BDn =P).
Our purpose is tostudy
the behavior of theempirical
brownianbridge uniformly
over~ ,
where ~ is some subsetof 22(P).
More
precisely,
wehope
togeneralize
and sometimes toimprove
someclassical results about the
empirical
distribution functions onf~d (here
~is the collection of
quadrants
on[Rd),
in the wayopened by Vapnik, Cervo-
nenkis and
Dudley.
In
particular,
theproblem
is toget
bounds for:Annales de l’Institut Henri Poincaré - Probabilités et Statistiques
383 CENTRAL LIMIT THEOREM FOR EMPIRICAL PROCESSES
where ~ ~ ~ ~ ~ ~
stands for the uniform norm over ~ and to buildstrong
uniformapproximations of 03BDn by
someregular gaussian
process indexedby F
with some
speed
of convergence, say(bn).
First let us recall the main known results about the
subject
in the classicalcase described above.
1 . 2. THE CLASSICAL BIBLIOGRAPHY. - We
only
submit here a succinctbibliography
in order to allow an easycomparison
with our results(for
a more
complete bibliography
see[26 ]). Concerning
the real case(d
=1),
the results mentioned below do not
depend
on P and areoptimal:
~ . 2 . l.
(1.1.1)
is boundedby C exp ( - 2t2),
where C is a universalconstant
according
toDvoretzky,
Kiefer and Wolfowitz[24 ] (C 4/
according
to[17]).
1.2.2. The
strong
invarianceprinciple
holds withaccording
toKomlos, Major
andTusnady [37 ]. V~
In the multidimensional case
(d
>2):
1. 2 . 3.
(1.1.1)
is boundedby C(E)
exp( - (2 - ~)t2),
for any ~ >0, according
to Kiefer[34 ].
In thisexpression
E cannot be removed(see [35 ]
but also
[28]).
’_ i
1. 2 . 4. The
strong
invarianceprinciple
holdswith bn
= n 2 ~Za -1 ~Log (n), according
to Borisov[8 ].
This result is not known to be
optimal,
besides it can beimproved
when P is
uniformly
distributed on[O, 1 ]d.
In this case we have:1. 2. 5. If d =
2,
thestrong
invarianceprinciple
holdswith bn
= ,according
toTusnady [.~0 ]. " /n
1.2.6. If d >
3,
thestrong
invarianceprinciple
holds withaccording
toCsorgo
and Revesz[14 ].
1.2. 5 and 1.2.6 are not known to the
optimal.
Let us note that even the
asymptotic
distributionof )) 03BDn~F
is not wellknown
(the
case where d = 2 and P is the uniform distribution on[0, 1 ]2
is studied in
[12]).
384 P. MASSART
Now we describe the way which has
already
been used to extend theabove results.
1 . 3. THE WORKS OF
VAPNIK, CERVONENKIS,
DUDLEY AND POLLARD.Vapnik
and Cervonenkis introduce in[51 ]
some classes of sets-whichare
generally
called V. C.-classes for whichthey
prove astrong
Glivenko- Cantelli law oflarge
numbers and anexponential
bound for(1.1.1).
P. Assouad studies these classes in detail and
gives
manyexamples
in[3 ] (see
also[40 ] for
a table ofexamples).
The functional P-Donsker classes
(that
is to say thoseuniformly
overwhich some central limit theorem
holds)
were introduced and characte- rized for the first timeby Dudley
in[20] ]
and were studiedby Dudley
himself in
[21 ]
and laterby
Pollard in[44 ].
Some sufficient
(and
sometimes necessary, see[27 ] in
case ~ isuniformly bounded)
conditions for ~ to be a P-Donsker class used in these worksare some kinds of entropy
conditions,
as follows.Conditions where functions are
approximated
from above and below(bracketing,
see[20 ])
are used in case ~ is a P-Donsker class whenever Pbelongs
to some restricted set of laws on X(P
often has a boundeddensity
with
respect
to theLebesgue
measurein
theapplications)
whereas Kol- cinskii and Pollard’s conditions are used in case F is a P-Donsker class whenever Pbelongs
to some set of lawsincluding
any finitesupport
law(the
V. C.-classes are under somemeasurability assumptions-the
classesof sets of this
kind,
see[21 ]).
In our
study
we are interested in the latter kind of the above classes.Let us recall the
already existing
results in thisparticular
direction. When-ever ~ is some V. C.-class and under some
measurability conditions,
we have :
1. 3 . l.
(1.1.1)
is boundedby (2 - E)t2)
for any £ in] O, 1 ], according
to Alexander in[1 ] and
moreprecisely by :
+
t2)2048(D+ 1)
exp(- 2t2),
in[2 ] (1) ,
where D stands for the
integer density
of EF(from
Assouad’sterminology
in
[3 ]).
1. 3 . 2.
(1.1.1)
is boundedby 4e8(03A3(n2 j))
exp( - 2t2), according
toDevroye
in[16 ].
(~)
Our result of the same kind (inequality 3 . 3 .1°) a) in the present work) seems to havebeen announced earlier (in [41 ]) than K. Alexander’s.
Annales de l’Institut Henri Poincaré - Probabilités et Statistiques
385 CENTRAL LIMIT THEOREM FOR EMPIRICAL PROCESSES
_
1
1. 3 . 3. The
strong
invarianceprinciple
holdswith bn
= naccording
toDudley
andPhilipp
in[23 ].
Now let us describe the scope of our work more
precisely.
2. ENTROPY AND MEASURABILITY
From now on we assume the existence of a
non-negative
measurablefunction F such
that F,
for anyf
in F . We use in this work Kol-cinskii’s
entropy
notionfollowing
Pollard[44 ] and
the samemeasurability
condition as
Dudley
in[21 ].
Let us define Kolcinskii’sentropy
notion.Let p
be in[1,
+ oo[. d(X)
stands for the set of laws with finitesupport
and for the set of the lawsmaking
FPintegrable.
2.1. DEFINITIONS. - Let 8 be in
] 0,1 [ and Q
be in~ , Q)
stands for the
maximal cardinality
of a subset % of !F for which :holds for any
f, g
in % withf ~ g (such
a maximalcardinality family
iscalled an 8-net of
(~, F) relating
toQ).
We setNF ~(., ff)
= supN~F ~(. , ~, Q).
QEd(X)
Log (N~F ~( . , ~ ))
is called the(p)-entropy
function of(~ , F).
The finite orinfinite
quantities :
are
respectively
called the(p)-entropy
dimension and(p)-entropy exponent
of(~ , F).
Entropy computations.
We cancompute
theentropy
of ~ from that of auniformly
boundedfamily
as follows. .then :
~ ~
For, given Q
in~(X),
eitherQ(F)
= 0 and so~ ,~ Q)
=1,
orFp
Q(F) > 0,
soO(Fp)
Q
EA(X)
and then :Vol. 22, n° 4-1986.
386 P. MASSART
Some other
properties
of the(p)-entropy
are collected in[40 ].
The mainexamples
ofuniformly
bounded classes with finite(p)-entropy
dimensionor
exponent
are described below.2 . 2. COMPUTING A DIMENSION: THE V.
C.-CLASSES .
-According
toDudley [20 ]
on the one hand and to Assouad[3 ] on
the other we have= pd
whenever L is some V. C.-class with realdensity d (this
notioncan be found in
[3 ]). Concerning
V. C.-classes offunctions,
ananalogous computation
and itsapplications
aregiven
in[45 ].
See also[21 ]
for aconverse.
2.3. COMPUTING AN EXPONENT: THE HOLDERIAN FUNCTIONS. Let d be an
integer
and a be somepositive
real number. Wewrite f3
for thegreatest integer strictly
less than a. Whenever xbelongs
to(Rd
and k toI
stands forki
+ ...+ kd
andDk
for the differentialoperator
...Let ) ) . ))
be a norm on Let be thefamily
of the restrictions to the unit cube of[Rd
ofthe fl-differentiable
functionsf
such that:Then, according
to[36 ] on
the one hand andusing Dudley’s arguments
in[19 ]
on theother,
it is easy to see that : -.a
Measurab ilit y
considerations. Durst andDudley give
in[21 ] an example
of a V. C.-class L such
that )) Pn - P~L
- 1. So somemeasurability
con-dition is needed to
get
any of the results we have in view. So from now on we assume thefollowing measurability
condition(which
is due to Dud-ley [21 ])
to be fulfilled :.
(X, ~’)
is a Suslin space.. There exists some
auxiliary
Suslin space(Y, ~)
and somemapping
Tfrom Y onto ~ such that :
(x, y)
~T( y)(x)
is measurable on(X
x 0J)
and we say that iF is
image
admissible Suslin via(Y, T).
This
assumption
isessentially
usedthrough
one measurable selection theorem which is due to Sion[47] (more
about Suslin spaces isgiven
in
[13 ]).
2 . 4. THEOREM. - Let H be some measurable subset of X x Y. We
Annales de l’Institut Henri Poincaré - Probabilités et Statistiques
CENTRAL LIMIT THEOREM FOR EMPIRICAL PROCESSES 387
write A for its
projection
on X. Then A isuniversally
measurable and there exists auniversally
measurablemapping
from A to Y whosegraph
isincluded in H.
A
trajectory
spacefor
brownianbridges.
We set :lT (~ ) _ ~ h :
~ -~f~ ; hoT
is bounded and measurable on(Y,
We consider
lT (~ )
as a measurable spaceequipped
with the d-field gene- ratedby
the open ballsrelating to )) . )(, (which
isgenerally
distinct fromthe Borel because
lT (~ )
is notseparable).
This
trajectory
space does notdepend
on P any more(as
it was the casein
[20 ])
butonly
on the measurablerepresentation (Y, T)
of ~ .From now on for convenience we set:
where £ stands for the
Lebesgue
measure on[0, 1 ], ~( [0, 1 ])
for the Borela-field on
[o, 1 ] and (X ~, ~’ °~, p:D)
for thecompleted probability
space of the countableproduct (X~, POO)
ofcopies
of(X, ~’, P).
Thefollowing
theorem
points
out howIT (~ )
is convenient as atrajectory
space.L= 1
to
lT (~ ). Moreover, setting ~Cb(~ ) - ~ h :
isuniformly
continuous and bounded on
(~ , pp)},
is included inlT (~ ).
Pro-vided that
(~ , pp)
istotally
bounded this inclusion is measurable. Here5f2(P)
is
given
the distance with6P : f -~ P( f 2)- (P( f ))2.
For a
proof
of2 . 5,
see[21 (sec. 9)
and[40 ]
where it is also shown that many reasonable families(in particular
and the «geometrical »
V.
C.-classes)
fulfill2 . 6. REMARK. - Since % fulfills it follows from
[21 ] (sec. 12) that )) Pn -
0 a. s. wheneverNF ~( . , ~ )
oo and therefore:This
implies
that the local behavior of theentropy
function isunchanged
when
taking
the sup in 2.1 over the set of all reasonable laws.Vol. 22, n° 4-1986.
388 P. MASSART
3. EXPONENTIAL BOUNDS
FOR THE EMPIRICAL BROWNIAN BRIDGE
We assume in this section that for some constants u
and v,
uf
vfor any f
The
following
entropy conditions are considered:Using
asingle
method we find upper bounds for(1.1.1)
that are effectivein the
following
two situations:U~
1°)
Observe that~03C32P~F~ 4; nothing
more is known about the variance over iF. In this case we prove someinequalities
which are ana-logous
toHoeffding’s inequality [30].
2°)
We assume thatI ~ ~P ~ ~~ (J2.
This time ourinequalities
are ana-logous
to Bernstein’sinequality (see
Bennett[5]).
3.1. DESCRIPTION OF THE METHOD. - We randomize from a
sample
whose size is
equal
to N = mn. In Pollard’s[44 ], Dudley’s [20 ] or Vapnik
and Cervonenkis’
[51 ] symmetrization technique, m
= 2 buthere,
fol-lowing
an idea fromDevroye [16 ],
we choose alarge
m.Effecting
thechange
of central law : P -~PN
with thehelp
of a PaulLevy’s type inequality,
we maystudy Pn - PN
instead ofPn -
P wherePn
stands for the randomizedempirical
measure.Choosing
some sequenceof measurably
selected netsrelating
toP~
whose mesh decreases to zero and
controlling
the errors committedby passing
from a net to another via some one dimensionalexponential bounds,
we canevaluate, conditionally
onPN,
thequantity II Pn - PN~F.
Randomization.
Setting
N = nm(m
is aninteger),
let w be some ran-dom one-to-one
mapping
from[l, n ]
into[1, N ]
whose distribution is uniform(the
«sample
w is drawn withoutreplacement »).
The
inequalities
in the next two lemmas are fundamental for what follows :Annales de l’Institut Henri Poincaré - Probabilités et Statistiques
389 CENTRAL LIMIT THEOREM FOR EMPIRICAL PROCESSES
3.1. LEMMA. - For
any ~
in(~N,
we setand
UN
=(
mjx(ji))
-( min (ji)) ;
thefollowing
threequantities
are,~
~ ~ ~ ~ ~’(l ( S~ Sr~ / ))
for any
positive
£, lower bounds for -Log ( 2 Pr - - -
> £ :These bounds
only depend on ~ through
numericalparameters (UN, 6N).
Bound
3°)
is new ;concerning 1 °) (due
toHoeffding [30 ]), Serfling’s
boundis better
(see [46])
butbrings
no moreefficiency
when an islarge.
The
proof
of lemma 3.1 isgiven
in theappendix.
From now on we write
Pn
for the randomizedempirical
process- n
Theinequality allowing
us tostudy
the randomized processi= 1
rather than the initial one is the
following :
3 . 2. LEMMA. - The random
elements II Pn - and )) Pn - P~F
are measurable.
Besides, whenever I p2,
thefollowing
holds :for any
positive
2 and any a in]0,1 [,
where n’ - N - n.For a
proof
of this lemma see[16 ] using Dudley’s measurability
argu-ments in
[21 ] (sec. 12).
390 P. MASSART
Statements
of
the results.3 . 3. THEOREM. The
following quantities
are, for anypositive
t and r~, upper bounds for >t) :
where k
= 03B6( 6 + 03B6 2 + 03B6) (when (
increasesfrom
0 to 2so does k).
%2°) Suppose that )) ~03C32P~F ~ (J2,
with aU,
thenB v .. /
where p = 2((4 - ()
(when increases
from 0 to 2 so does2
The constants
appearing
in these boundsdepend
on :only through NU ~{ . , ~ - u)
and of course on ~.Comments.
- Yukich also used in
[54 Kolcinskii-Pollard entropy
notion to provean
analogue
of theorem3.3,
but our bounds aresharper
because of theuse
of
a randomization from alarge sample
as described in 3.1.- From section
2 . 2,
theassumption d(2)1(F)
oo istypically
fulfilledwhenever % is some V. C.-class with real
density
d.Thus bound
1 °) a)
issharper
than those of1. 3.1 ;
in an other connectionthe factor
0(iF, r~)t6~d +’’)
in1 °) a)
isspecified
in theappendix.
In the classical case
(i.
e. ~ is the collection ofquadrants
inbound
1 °) a) improves
on 1. 2 . 3 but is lesssharp
than 1. 2 .1 in the realAnnales de l’Institut Henri Poincaré - Probabilités et Statistiques
391 CENTRAL LIMIT THEOREM FOR EMPIRICAL PROCESSES
case ; moreover the
optimality
of1 °) a)
is discussed in theappendix
whereit is shown that
So,
there is a gap for thedegree
of thepolynomial
factor in bound3.3.2°) a)
between2(d - 1)
and6(d
+r~).
d
- Suppose
that % =then,
from section 2.3 we havee 12~(~ ) - .
a
In other
respects,
Bakhvalov proves in[4 ] that
if P stands for the uniform distribution on[0, 1 ]~
then :Thus we cannot
get
anyinequality
of the1 °)
or2°) type
in the situation wheree12~(~ )
> 2.The border line case. For any modulus of
continuity
we can intro-duce a
family
of functionsl~~,d
in the same way asAa,d by changing
u - u°‘into ~
anddefining {3
as thegreatest integer
for which -~ 0 asu - 0.
It is an easy
exercise, using
Bakhvalov’smethod,
to show that :provided
that =Ud/2 (log (u-1))y
and P isuniformly
distributed on[0, 1 ]d.
Of coursee 12~(A~,d)
= 2 and we cannotget
bounds such as in theo-rem 3.3.
But the above result is rather
rough
and we want to go further in theanalysis
of the families around the border line.Then the
(2)-entropy plays
the same role forconcerning
the Donskerproperty
as the metricentropy
in a Hilbert space for the Hilbertellipsoids concerning
thepregaussian property,
that is to say that thefollowing
holds :
i )
is a functional P-Donsker class wheneverVol.
392 P. MASSART
ii)
is not a functional £-Donsker class wheneverand in this case we have
(E Log (~)) - 2.
i)
follows from Pollard’s central limit theorem in[44].
ii)
follows from a result of Kahane’s in[32 ] about
Rademachertrigono-
metric series. In
fact, if
we set =u |Log(u)|,
we have from[32 p.
66that: t ~
~nen(t) KnLog(n) belongs
to03C6,1
withprobability pK ~ 1
as K ~ oo, where
(8n)
is a Rademacher sequence and ==2
cos(2xnt).
Let us consider a standard Wiener process on
L2( [0, 1 ]),
we may write(W(en))
as(8n
withbeing independent
of( ~ W(en) ).
Then,
withprobability
more than pK, thefollowing
holds :By
the three series theorem the seriesE 20142014201420142014 diverges
ton
Log (n)
infinity
almostsurely
and therefore W is almostsurely
unbounded onThe same
property
holds for any brownianbridge G,
forf -; G( f )
+ is some Wiener processprovided
thatW(l)
issome
N(0,1)
random variableindependent
of G. So is notpregaussian
and
ii)
isproved.
An upper bound in situation
2°)
is also an oscillation control. If we set~~ _ ~ f - g ; a-p( f - g)
J, it is not difficult to see that:Thus
changing
U into 2U and d into 2d if necessary the upper bounds in situation2°) hold
with~~.
instead of~,
the constantsbeing independent
of 6 because of 3.4.
In
particular
if F is a V. C.-class with realdensity d,
we set :(~)
We write g, when 0 oo.Annales de l’Institut Henri Poincaré - Probabilités et Statistiques
393 CENTRAL LIMIT THEOREM FOR EMPIRICAL PROCESSES
As it is summarized in
[23 ], Dudley
shows in[29] ]
thatA(6,
n,t)
t/ t
B
whenever t is small
enough, 6
=0 and n
>0(t-r)
with r > 8.Applying 3 . 3 . 2°) a) improves
on this evaluation for then:n, _ t)
t whenever t is smallenough, 03C3
=I
andIn order to
specify
in what way the constant in bound2°) a) depends on ~ ,
we indicate thefollowing
variant of3 . 3 . 2°) a).
3.5. PROPOSITION. - If we assume that
NU ~(E, ~ - u) -
for any E in
]0,1 [ and
some Eoin ]0,1 [ and that I I 6P I ~~ - a2
with 6 notexceeding U,
then there exists some E 1 in]0,1 [ depending only
on Eo and a constant Kdepending only
on C such that :From now on L stands for the function x ~ max
(1, Log (x)).
’
3.6. COROLLARY. - Let be some sequence of V. C.-classes ful-
filling
withinteger
densities(D~).
Then(with
the abovenotations)
-~ 0 as n - oo for any
positive
t whenevera(n)2
= anda(n)-2
= ..Provided that
Dn = 03C3(n Ln)
, such a choice of6(n)
does exist . Comment.According
to Le Cam[38 ] (Lemma 2)
andapplying
3 . 6the
f E ~n ~
admits finite dimensionalapproximations
whenever
D,~ _ ~ Lo n and provided
that Le Cam’s assumption (Al)
is fulfilled. "
This result
improves
on Le Cam’scorollary
ofproposition
3 whereDn = 0(n-03B3) for
some y1
is needed.Proof of
3 . 6. Let ~ be a V. C.-class with entiredensity
D and realdensity
d.Using Dudley’s proof
in[20 ] (more
details aregiven
in[40 ])
it394 P. MASSART
is easy to show
that,
for anyw > d (or w > d if d is «
achieved»),
we have :for any £ in
] 0,
1[,
with inparticular
when w =D,
K3
(2D)2D.
Sofrom
Stirling’s
formula weget : N(2)1(~, F)
for any ~ in0,20142014
and some universal constantC1.
Hence, for any s in]0,1 [
wehave:
Thus, applying
3.5 to the classyields
3.6.We propose below another variant of
inequality 3.3.2°) a), providing
an alternative
proof
of a classical result about the estimation of densities.3 . 7. PROPOSITION. - If we assume that
u)
= 2d oo and~03C32P ~F ~ 62
withUV n
6 U for somepositive V,
then there existssome
positive
constant C such that an upper bound for >t) is,
for anypositive t, given by:
In the situation where U is
large
thisinequality
may be more efficientthan
3.3.2°) a).
Application to
the estimationof
densities : minimax risk.Let
K.M
be thefollowing
kernel onf~~ : KM(y) = gi( y’my)
for any y in where~r
is some continuous function with bounded variation from R into -2’ 1 1 2
and M is some k x k matrix. Pollard shows in f~J1
thatthe class
Annales de l’lnstitut Henri Poincaré - Probabilités et Statistiques
395
CENTRAL LIMIT THEOREM FOR EMPIRICAL PROCESSES
is a V. C.-class of functions and so:
where C and w
depend only
on k.Now if we assume that P is
absolutely
continuous withrespect
to theLebesgue
measure on the classical kernel estimator of itsdensity f
is :where K is a
KM
with fixed§
and M so that K2(x)dx oJ . We setf = E( £).
Proposition
3 . 7gives
a control of the randomexpression £ - f by choosing:
So,
if we assume thatn > h-k
>C2,
weget, setting Dn
=f (x) ~ :
:Ln x
for any t in
[1
+ x[ and
somepositive
ex and/~. Hence,
after anintegration :
for any T in
[1,
+ oo[, provided
thatnhk > 4f32.
We choose T =0( Ln),
thus :
Provided that
f belongs
to some subset ofregular
functions8,
the biasexpression f - f
can be evaluated so that the minimax risk associated to the uniform distanceon [Rk
and to 8 can be controlled with the samespeed
of convergence as in[29],
via anappropriate
choice of h.3.8. PROOFS OF
3. 3, 3. 5,
3 . 7 . Lemma 3 .1 isproved
in the appen- dix. Let us prove theorem 3.3.First,
we reduce theproblem
to the casewhere u = 0 and u = 1
by studying
thef
E~~
instead396 P. MASSART
of
F ; G
fulfills(A)
andN(2)U(., F)
=N(2)1(., G). Finally ~03BDn~F
=and II 6P II~ = II~.
In the course of the
proof
we will need to introduce someparameters
such as :(in ] 0, 1 [) ;
r, m(in ~f ) ;
a(in ]1,
+ oo[ ) ; q (in ] 0, 2 ])
andsome
positive s, ~3
and y. Let(r~)
be apositive
sequencedecreasing
tozero. These parameters will be chosen in due
time,
sometimesdifferently
in different cases. We set N = mn, we write
(.)
for theprobability
distribution conditional on
(xi,
...,xN)
insteadof ) ) . )
for short.We set s =
20142014
and Sf =1 - - ( 1 -
A bound for >t)
will
follow,
via lemma3.2,
from a bound forPr(~Pn - PN II > 8’)
whichis at first
performed conditionally
on(xl,
...,xN).
Conditional
approximation by
a seriesof projections.
For each
integer j
aFj
can bemeasurably
selected(with
thehelp
of
2. 4,
see[21 ] p. 120).
So we can define aprojection n j
from % onto/Fj
such that on the one hand
f )2) i~
holds for anyf
in ~and on the other hand
(Pn - II~ belongs
tolT {~ ).
We show that thedevelopment :
holds
uniformly
over ~ and over the realizations ofIn
fact,
because each realization of w is one-to-one we have:Pn(g) mPN(g)
for anypositive function g
defined on X.Hence:
Therefore we
get :
So, provided
that(~j)
is apositive series
such that03A3
q j p, we havej>r+ 1
Annales de l’Institut Henri Poincaré - Probabilités et Statistiques
397 CENTRAL LIMIT THEOREM FOR EMPIRICAL PROCESSES
But the cardinalities of the ranges of
II~
andII~ -
arerespectively
not
greater
thanN J
and(where N~
stands forN 12~(2~, ~ ))
so that :where A and B are the
(xi,
...,xN)-measurable
variables :A is the
principal part
of the above bound and B is the sum of the errors.We use lemma 3 .1 to bound A and B :
inequalities 1°)
or2°)
are neededto control A
according
to whether case1°)
or2°)
isinvestigated. Setting
t’ ==
fi8’,
we use bound3°)
to controlB,
so :The control of the tail of the series in 3 . 8 .1 is
performed
via thefollowing elementary
lemma :3.8.2. LEMMA. -
Let 0/: [r,
+ oo[
-~ f~. Providedthat 0/
is anincreasing
convexfunction,
thefollowing inequality
holds :where
~d
stands for theright-derivative
of~.
In each case it will be
enough
to prove theinequality
for t > to =~ )
for some to. We choose
[3
= 1 underassumption a) a
under
assumption b). B ~/
Proof of
theorem 3 . 3 in case1 °). Applying
3.1.1 °)
weget :
398 P. MASSART
If we set p ==
-, 2
the variance factor 1 -03C12 a2~2n’
in lemma 3 . 2 is stabilized for it is not less than - whenever t > 2.2
-
Besides, whenever t2 ~ 3, t’2 > t2 -
5 hence :Under
assumption a).
2014 If we prove that for anygiven positive
and d’such that
F) ~ C(d’)03BD-2d
holds for any 03BD in]0,1 [,
an upper bound forPr(!!~J!
>~)
isgiven by: K(~~)(l
+~)~~’+~exp(- 2~). Then, setting
= d+ - and ~’ == -
weget 3.3.1°)a).
So, writing « d,~ »
instead of «d’, ~’ »
forshort,
we may assume thatN, Ct2dj2(03B1+1)d.
Wechoose
==t-2
and x = Max(2,1
+2014),
so:In order to evaluate B we
apply
lemma3 . 8 . 2, setting:
then the
condition t2
> 7 +4d(a
+1) (which
can beassumed)
ensuresthat >
1,
henceBut the above estimates of A and B are
deterministic,
so,using
lemma3 . 2,
theorem 3 . 3 isproved
in situation1 °) a).
With the idea of
proving proposition
3. 5 we remarkthat, setting
a =2,
the above method
gives,
under thehypothesis
in3. 5,
thatPr ( ~ ~
>t)
is bounded
by:
with
Ki depending only
onC, whenever t2
> 7 + 12d.Annales de l’Institut Henri Poincaré - Probabilités et Statistiques
399
CENTRAL LIMIT THEOREM FOR EMPIRICAL PROCESSES
Under
assumption b). Using
the samearguments
as above we may suppose that :N~
expWe
set J1
=t- 2y,
then thefollowing inequalities
hold s. :In order to balance the above bound of A we choose y so that
( +
1 +2 a + ~ - Y a -
1~~ 2 1 - y) t Y)
and alarge
g eenough
g for y Y - >- ~
2 and~3
> 1 tohold,
whereY(~)
is the solution of the aboveequation
when a = + o0(Note
that2( r -
=k,
where k is defined in the statement of 3 .3).
SoThe evaluation of B is
performed
via lemma 3 . 8 . 2 with the choice :2
Then, whenever t2
> - + 5 + >1,
hence(since
r -~ o0whenever t -~
oo) f3
We finish the
proof
as underassumption a),
so theorem 3.3 isproved
incase
1 °).
l
Proof ~ of
theorem 3. 3 in case2°).
We set ~p = - and choose m = a =203C6-q, = 03C6-q and
03C4j =03C3 mj-(03B1+03B2).
, If weset
p = o-, the variancefactor 1
- 6 2
in lemma 3 . 2 is stabilized for it is not lessthan -
whenever4. The variable A is this time controlled with the
help
of3.1.2°),
so now the
probleme
is toreplace 03C32N by
Infact,
letN
be the(xl,
...,measurable event :
Each term of the
following
estimate is studied in thesequel:
where A’ = and B’ =
400 P. MASSART
The control
of
Pr reduces to aproblem of
type1 °). For, setting
~ 2 - ~ f 2, f E ~ ~,
we have :Since
N(2)1(., F2) ~ N(2)1(. 2,F) and F2
fulfills(M),
we may use the boundsin
3 . 3 .1 °),
so ~ B ~ ~~~ 0Bwhere ~’ is a
polynomial
orexponential
function(according
to whethercase
a)
orb)
isstudied). Anyway
_The evaluation
of
A’ and B’. Theinequality ( ~ aN I 62
+ s holdson thus
applying
3 .1.2°) :
Under
assumption a).
As in theproof
of3 . 3 .1 °)
we may assume thatN~
> and wechoose q
= 2 and a =Max (2, 1
+2014 )
(recall that /3 == 1).
Then : ~’~
’~ ~
In another connection the control of B’ follows from the control of the
same tail of series as in
1°) (via
a modification ofparameters),
so :Annales de l’Institut Henri Poincaré - Probabilités et Statistiques
CENTRAL LIMIT THEOREM FOR EMPIRICAL PROCESSES 401
whenever
~p2 >
8 +4d(a
+1). Collecting
the estimates of Pr(EN),
A’and
B’,
bound3 . 3 . 2°) a)
is established via lemma 3 . 2 andinequality
3 . 8 . 4.Proof of 3 .
5. - Under thehypothesis
in 3 . 5 we choose this time a =2,
so :hence
whenever
~p2 >_
8 + 12d. The evaluation of Pr(,~N)
isperformed by using
bound 3 . 8 . 3
( changing
~o into~-°
whenstudying ~ 2
at thepoint.
HenceB
g g
2 / 2
whenever 03C62 4 ~ 7 + 12d. Applying inequality
3.8.4 and lemma 3.2gives proposition
3.5.Proof of
3. 3 .2°).
- Underassumption b).
Let us choose q so that :We choose a
large enough for 03B2
> 1 and 1- q 2 p
+ ~(where p
is definedin the statement of theorem
3 . 3)
to hold. We may assume that :Then:
The control of B’ is of the same kind as that of B in situation
1 °),
so :402 P. MASSART
Using inequality
3.8.4 and lemma 3.2gives 3.3.2°) b)
and finishes theproof
of theorem 3.3.Proof of proposition
3. 7. The above method which allows deduction of bounds oftype 2°)
from bounds oftype 1 °)
is iterated here. We shallassume that u = 0 and v = 1.
Inequality 3 . 3 . 2°) a)
may be written(in
view of its
proof) :
whenever 5. Let us define
by
induction thefollowing
sequences3 with ao = 1 and
bo
= 20147=’We suppose that the
following inequality (M~ )
holds whenevert2~62 >_ 5 :
We want to deduce from
(M~ ) by
the same way as3 . 3 . 2°) a)
from3 . 3 .1 °) a). So, let ~
bepositive
so that :Then
according
to remark2. 6,
hence :Moreover it is easy to show that
Annales de l’Institut Henri Poincaré - Probabilités et Statistiques