A NNALES DE L ’I. H. P., SECTION B
P. D OUKHAN P. M ASSART E. R IO
Invariance principles for absolutely regular empirical processes
Annales de l’I. H. P., section B, tome 31, n
o2 (1995), p. 393-427
<http://www.numdam.org/item?id=AIHPB_1995__31_2_393_0>
© Gauthier-Villars, 1995, tous droits réservés.
L’accès aux archives de la revue « Annales de l’I. H. P., section B » (http://www.elsevier.com/locate/anihpb) implique l’accord avec les condi- tions générales d’utilisation (http://www.numdam.org/conditions). Toute uti- lisation commerciale ou impression systématique est constitutive d’une infraction pénale. Toute copie ou impression de ce fichier doit conte- nir la présente mention de copyright.
Article numérisé dans le cadre du programme Numérisation de documents anciens mathématiques
http://www.numdam.org/
393
Invariance principles for absolutely regular empirical processes
P. DOUKHAN, P. MASSART and E. RIO U.R.A., C.N.R.S. 743, Departement de Mathematiques,
Bat. 425, Universite de Paris Sud, 91405 Orsay Cedex, France.
Vol. 31, n° 2, 1995, p. 427 Probabilités et Statistiques
ABSTRACT. - Let be a
strictly stationary
sequence of randomn
variables with
marginal
distribution P. LetZn
=n-1/2 ~(6~~ _ p)
denotei
the centered and normalized
empirical
measure. Assume that the sequence offl-mixing
coefficients of satisfies thesummability
condition
L (3n
+00. Define themixing
rate function/?(.) by (3( t) =
Tt>0
if t >
1, and{3(t)
= 1 otherwise. For any numerical functionf,
we denoteby Q~
thequantile
functionof If(ço)l.
We define a new norm forf by
where
/3-1
denotes thecàdlàg
inverse of the monotonic function/3 ( . ).
Thisnorm coincides with the usual in the
independent
case. Wedenote
by jC2,/?(~)
the class of numerical functions with~/~2,/?
+0oo.Let F be a class of
functions,
F C/~2,/3(~).
In a recent paper, the authors have shown that the finite dimensional convergencef
E0)
to a Gaussian random vector holds.
The main result of this paper is that a functional invariance
principle
in the sense of Donsker holds
f
E0)
if theentropy
withbracketing
of F in the sense ofDudley
with respectto ~. ~2,/3
satisfiessome
integrability
condition. This resultgeneralizes
Ossiander’s theorem( 1987)
forindependent
observations.1980 Mathematics Subject Classification (1985 Revision) : 60 F 17, 62 G 99.
Annales de l’Institut Henri Poincaré - Probabilités et Statistiques - 0246-0203
Vol. 31/95/02/$ 4.00/© Gauthier-Villars
394 P. DOUKHAN, P. MASSART AND E. RIO
Key words: Central limit theorem, {3-mixing processes, Donsker invariance principle, strictly stationary sequences, empirical processes, entropy with bracketing.
RÉSUMÉ. - Soit une suite strictement stationnaire de variables aléatoires de loi
marginale
P. SoitZn
=n-1/2(03B403BEi - P)
la mesure~ ~ i
empirique
normalisée et centree. On suppose que la suite est(3- melangeante,
de coefficients de(3-mélange (3n
en série sommable. On lui associe la fonction demélange /?(.)
définie par,~ (t)
=(3[t] si t 2:
1 et(3(t)
= 1 sinon. Pour toute fonctionnumerique f,
on noteQy
la fonction dequantile
de Nous definissons une nouvelle norme pourf
parou
/3-1 désigne
1’ inversecàdlàg
de la fonction décroissante/?(.).
Cettenorme coincide avec la norme usuelle de
L2(P)
dans le casindependant.
On note alors
L2,03B2(P) l’espace
des fonctionsnumériques
denorme 11.//2,{3
finie. Soit F une classe de fonctions de
~2,/?(P).
Dans un articlerecent,
lesauteurs ont montré la convergence vers un vecteur
gaussien
desmarginales f
E0)
de dimension finie.Dans cet
article,
nous démontrons unprincipe
d’invariance fonctionnel au sens de Donskerpour {~(/) : f
E sous une conditiond’ integrabilite
de
l’entropie
avec crochets de .~’ par rapport a lanorme ~.~2,~.
Ce théorèmegénéralise
celui d’ Ossiander(1987)
pour des variablesindépendantes.
1. INTRODUCTION
Let be a
strictly stationary
sequence of random elements of aPolish space
X ,
with common distribution P. Let us denoteby Pn
then
empirical probability
measurePn = V bf.i
and thenby Zn
the centered%==!
and normalized
empirical
measureZn = P).
Let F be a classof functions of
£2(P).
Independent
observationsLet us assume the variables to be
independent.
Thetheory
of uniform central limit theorems
(CLT)
for theempirical
processAnnales de l’Institut Henri Poincaré - Probabilités
E
0)
washistorically
motivatedby
the classicalexample
.~’
== {][]_oo~]~
tojustify
the heuristic of Doob forexplaining
the convergence of the
Kolmogorov-Smirnov
statistics. The first result in this context is due to Donsker( 1952).
The first attempts atgeneralizing
thistheorem concerned classes of sets, that is when
Dudley (1966)
extended Donsker’s result to the multivariateempirical
distributionfunction,
whichcorresponds
to the case where C is the class ofquadrants (actually, Dudley
alsoprovided
a morerigorous
formulation of Donsker’ stheorem).
Theproblem
is that in the multidimensionalsituation,
severalclasses of sets are candidate to extend the class of intervals of the real line. In that
spirit,
several new results were established in the seventies.Bolthausen
(1978) proved
the uniform CLT for the class of convex subsetsof the unit square, Sun
(1976)
and Révész(1976)
studied the CLT for classes of subsets with a-differentiable boundaries.In his landmark paper,
Dudley (1978)
has unified these various results:he gave a
precise
definition of the uniform CLTproperty
in terms of weak convergence of theempirical
process in the(generally
nonseparable)
Banach space to the Gaussian process with the same covariance
structure and almost sure
uniformly
continuous(with
respect to the covariancepseudo-metric) sample paths.
The existence of such aregular
version of the Gaussian process is now refered to as the
P-pregaussian
property and the existence of a uniform CLT as the P-Donsker property.
He also
provided
two different sufficient conditions on C for the P-Donskerproperty
to hold. The first one is of combinatorial type,namely
C isa
Vapnik-Chervonenkis
class of sets. In that case, the uniform CLT isuniversal in the sense that it holds for all P. The second one is a condition of
integrability
on themetric
entropy with inclusion of C[with respect
to the Since that time the
theory
of CLT’s forempirical
processes has been
developped mainly
in the direction of unbounded classes of functions .F. The P-Donskerproperty
isequivalent
to the fact that .F istotally
bounded andZn
isasymptotically equicontinuous [this tightness
criterion may be found in
Dudley (1984)].
So the mainchallenge
is toprovide
minimalgeometric
conditions on the class .~’ensuring
thetightness
of
Zn .
Gine and Zinn( 1984) pointed
out that F is a P-Donsker class if F isP-pregaussian
and satisfies some extra conditionensuring
that theempirical
process
Zn
may be controlled on the small balls. Two maintechniques
have been
proposed
to solve thisproblem.
Theguiding
idea has beenDudley’s
criterion(1967)
which ensures the stochasticequicontinuity
of aGaussian process
~G(t), t
ET~
if the metric entropy functionH(., T, d)
with respect to the
pseudo-metric d(s, t)
=Var (G(s) - G(t))
satisfiesVol. 31, n° 2-1995.
396 P. DOUKHAN, P. MASSART AND E. RIO
the
integrability
conditionSpecializing
to the Gaussian process with the same covariance structure asZn,
we have T = .~’ andd( f, g)
=dp ( f, g)
= Kolchinskii(1981)
and Pollard(1982) proposed independently
a sufficient conditionanalogous
to( 1.1 )
butinvolving
the universalentropy
function which is defined as theenvelope
function of theH(., 0, d~*~)
whenQ
is anyprobability
measure. On the otherhand,
Pollard(1982)
was able to relaxthe boundedness condition on
0, assuming only
thatsup If I
is boundedby
some function F in£2(P),
Anotherapproach
is to consider entropy withbracketing,
which is the natural extension to functions of entropy withinclusion,
usedby Dudley (1978)
for classes of sets. It has takenquite
along
time before one realizes that the
£2(P)-norm
was still theright
norm toconsider in order to measure the size of the brackets. In fact the first results in that direction were
involving
the[see
for instanceDudley (1984)]. Using
a delicateadaptive
truncationprocedure,
Ossiander(1987) proved
that a sufficient condition for .~’ to be a P-Donsker class is thatwhere
Hj j ( . , 0, d p )
denotes theentropy
withbracketing
function withrespect to
d p. Next,
Andersen et al.(1988)
have weakened thebracketing
condition
[controling
the size of the brackets in a weak/:2(P)-space]
and also used
majorizing
measure instead of metric entropy[actually
itis known since
Talagrand (1987)
that theP-pregaussian
property may be characterized in terms of the existence of amajorizing measure].
We see that the
plays
a crucial role in thistheory
inthe
independent
case. Infact,
on the one hand the finite dimensional convergence ofZn
to a Gaussian random vector holds whenever .~’ is included inL2(P)
and on the other hand the geometry of F with respect to the characterizes theP-pregaussian
property and is involved in thedescription
oftightness
of theempirical
process.Weakly dependent
observationsThere are several notions of weak
dependence.
It isquite
natural whendealing
withempirical
processes to considermixing
type conditions whichare defined in terms of coefficients
measuring
theasymptotic independence
Annales de l’Institut Henri Poincaré - Probabilités
between the a-fields of the past and the future
generated by
Ofcourse different
mixing
coefficients have beenintroduced, corresponding
to different ways of
measuring
theasymptotic independence.
To ourknowledge,
all these coefficients arenondecreasing
with respect to these a-fields.Hence,
has smallermixing
coefficients than the initial sequence.Let us recall the definitions of some classical
mixing
coefficients.Rosenblatt
(1956)
introduced the strongmixing
coefficienta (A, B)
betweentwo a-fields A and B :
The
absolutely regular mixing
coefficient was definedby
Volkonskii and Rozanov
(1959) [see
alsoKolmogorov
and Rozanov(1960)].
Namely
where the supremum is taken over all the finite
partitions (Ai)iEI
andrespectively
A and B measurable. This coefficient is also calledfl-mixing
coefficient between A and B.Ibragimov (1962)
introduced thep-mixing
coefficient between A and B:The
following
relations between these coefficients hold:The Markov chains for instance are
geometrically p-mixing
under theDoblin recurrence condition
[Doob (1953)]
and,Q-mixing
under themilder Harris recurrence condition if the
underlying
space is finite[Davydov (1973)].
Mokkadem(1990)
obtains sufficient conditions forRd-
valued
polynomial autoregressive
processes to be Harris recurrent andgeometrically absolutely regular.
He also shows that these processes may fail to be03C6-mixing.
As in the
independent
case, the first results forweakly dependent
observations were concerned with the
(possibly multivariate) empirical
distribution function. Berkes and
Philipp (1977) [resp. Philipp
andVol. 31, n° 2-1995.
398 P. DOUKHAN, P. MASSART AND E. RIO
Pinzur
(1980)] generalized
Donsker’s theorem for the univariate(resp.
d-dimensional) empirical
distribution function of astrongly mixing stationary
sequence with an =
O(n-a),
a > 4 + 2d.Next,
Yoshihara( 1979)
weakenedthe
strong mixing assumption
of Berkes andPhilipp:
heproved that,
if the strongmixing
coefficients( an ) n > o
of the sequencesatisfy
an =
O(n-a),
for some a > 3, then the uniform CLT for the univariate distribution function holds.Dhompongsa ( 1984)
extended Yoshihara’s result to the multivariateempirical
distribution function. ForRd-valued
randomvariables,
he obtained the uniform CLT for the multivariateempirical
distribution function under the condition an =
0(?~),
a > d + 2, which is the best known result for the class ofquadrants
in thestrong mixing
case.Doukhan,
Leon and Portal(1987)
studied classes of functions embedded in Hilbertian spaces in thestrong mixing
case. Massart(1987) proved
a uniform CLT for
strongly
oruniformly mixing empirical
processes ina more
general
framework: if the class F has a finite dimension d of,C1 (P)-entropy
withbracketing,
and if the strongmixing
coefficientssatisfy
an =
O(n-a)
for some a >32d+ 3,
then the uniform CLT holds.Moreover, nonpolynomial covering
numbers are allowed in thegeometrically mixing
case. Some related results in the
setting
ofpolynomial bracketing covering
numbers
[with
respect to£2(P)]
may be found in Andrews and Pollard(1994). Looking carefully
at these papers, we seethat,
for classes F with finite dimension ofentropy,
somepolynomial
rate ofdecay
of the strongmixing
coefficients isrequired
and that this ratedepends
on theentropy
dimension of F.By
contrast, Arcones and Yu(1994)
have established a uniform CLT forabsolutely regular empirical
processes indexedby Vapnik-Chervonenkis subgraph
classes of functions under somepolynomial
rate ofdecay
ofthe
/3-mixing
coefficients notdepending
on the entropy dimension of the class. This makes feasible the existence of a uniform CLT for03B2- mixing empirical
processes which does notrequire
ageometrical decay
ofthe
mixing
coefficients.Actually
since/?-mixing
allowsdecoupling (see
Berbee’s lemma in section 3
below),
it is much easier to work with this notion rather than withstrong mixing.
Moreover it still covers a wide class ofexamples (from
thispoint
ofview,
thep-mixing
mayhappen
to be toorestrictive). Hence,
in this paper, we shall deal withfl-mixing,
which seemsto be a
good compromise.
Our
approach
consists inexhibiting
a norm whichplays
the role of the£2 (P)-norm
in theindependent setting.
At a firstglance,
one could think that the Hilbertianpseudonorm
associated with thelimiting
covariance ofZn
should be agood
candidate. It has two main defaults.First,
the existenceAnnales de l’Institut Henri Poincaré - Probabilités
of a
limiting
variance forZn(f)
does notimply
the CLT.Secondly
it doesnot allow
bracketing.
We willprovide
a normdepending
on P and onthe
mixing
structure of the sequence ofobservations,
which coincides withthe usual
£2(P)-norm
in theindependent
case.Denoting by ~2,/?(~)
theso-defined normed space, the finite dimensional convergence of
Zn
tosome Gaussian vector with covariance function F holds on r is the
limiting
covariance ofZn
and ismajorized by
the square of theMoreover,
this norm allowsbracketing
and we will obtaina
generalization
of Ossiander’s theoremby simply measuring
the size ofthe brackets with the instead of the
£2(P)-norm.
2. STATEMENT OF RESULTS
Throughout
thesequel,
theunderlying probability
space(S2, T, IP)
isassumed to be rich
enough
in thefollowing
sense: there exists a randomvariable U with uniform distribution over
[0,1], independent
of the sequence For any numericalintegrable
functionf,
we set =/
For any r > 1, let
£r (P)
denote the class of numerical functions on(X, P)
such that == +oo.
Since is a
strictly stationary
sequence, themixing
coefficients of the sequence are definedby j3n
= whereFo
=I 0)
and0n = a(çi :
i >n).
is called aj3-mixing
sequence if lim
j3n
= 0.Examples
of such sequences may be found inDavydov (1973), Bradley (1986)
and Doukhan(1994).
We now need to recall the definitions of entropy and entropy with
bracketing.
Entropy. -
Given a metric set(V, d),
let.IV (6, V, d)
be the minimum ofThe entropy function
H(6, V, d)
is thelogarithm
of.J~( s, V, d).
Bracketing. -
Let V be some linearsubspace
of the space of numerical functions on(~ P).
Assume that there exists someapplication
A : V - R+such
that,
for anyf
and any g inV,
Vol. 31, n° 2-1995.
400 P. DOUKHAN, P. MASSART AND E. RIO
Assume that F C V. A
pair ( f, g]
of elements of V such thatf g
iscalled a bracket of V..F is said to
satisfy (A.1) if,
forany 6
> 0, there exists a finite collection of brackets of V such thatThe
bracketing
numberJU~ ~ (6, .~’)
of .~’ with respect to( V, A)
is theminimal
cardinality
of such collectionsS(6).
Theentropy
withbracketing
(8, 0, A)
is thelogarithm
of(8, 0) .
When A is a norm,denoting by dA
thecorresponding metric,
it follows from(2.1)
and(2.2)
thatWe now define a new norm, which emerges from a covariance
inequality
due to
Rio (1993).
Notations. -
Throughout
thesequel,
we make the convention that/30
= 1.If
(un)n2::0
is anonincreasing
sequence ofnonnegative
realnumbers,
we denote
by u(.)
the rate function definedby u ( t)
= For anynonincreasing
function denote the inverse function ofFor
any f
in we denoteby Qy
thequantile
functionof 1!(ço)l,
which is the inverse of the tail function t -
1P(1!(ço)1 ]
>t).
Let denote the sequence of strong
mixing
coefficients of It follows from(1.5)
thatQ~(i6) ~ /~(2~).
Hence,by
Theorem 1.2 inRio
(1993),
thefollowing
result holds.PROPOSITION 1
[Rio, 1993]. -
Assume that the03B2-mixing coefficients of satisfy
thesummability
conditionLet
G2,~(P)
denote the classof
numericalfunctions f
such thatAnnales de
Then,
for
anyf
in,C~,~ (P),
and
denoting by T ( f, f )
the sumof
the series we tEZhave:
Remark. - Note that
~,~~
=J /~i /3-1(u)du. So,
under condition(2.4),
n>o
°
GZ,p(P)
contains the spaceG~(P)
of bounded functions. Moreover, we will prove in section 6 thefollowing
basic lemma.LEMMA 1. - Assume that the sequence
of 03B2-mixing coefficients of satisfies
condition(2.4).
Then,G2,~(P~, equipped
with~~y~z,~
isa normed
subspace of GZ(P) satisfying (2.1),
andfor
anyf
inG2,~(P),
~
We now define the
corresponding
weak space as follows. LetB(t)
=J /’c 0
For any measurable numerical functionf,
we setThroughout,the sequel,
denotes the space of measurable functionsf
on X such thatA2,~(/)
+00.Clearly ,CZ,p(P)
CAz,p(P)
andNow we compare the so defined spaces with the Orlicz spaces of
functions,
and with the so-called weak Orlicz spaces of functions.Comparison
with Orlicz spaces Let 03A6 be the class ofincreasing
functionsj) ~ (~ : R~ 2014~R~:~~
convex anddifferentiable,
Vol. 31, n° 2-1995.
402 P. DOUKHAN, P. MASSART AND E. RIO
For
any ~
E~,
we denoteby G~,2(P)
the space of functions~,2(~P)
isequipped
with thefollowing
norm :This norm is the usual Orlicz norm associated with the function x - Let us now define the
corresponding
weak Orlicz spaces. Forany cp
in~,
letA~(P)
be the space of measurable functionsf
on X such thator
equivalently
such thatOf course,
by
Markov’sinequality,
The purpose of the
following
lemma is to relate the£2,,B-norm
with1I./lcjJ,2
andLEMMA 2. - For any
element ~ of ~P,
letdefine
the dualfunction 4J* by
~*~y~
= sup~~x~~~ If
x>o
then
and so
then
and so
Annales de l’Institut Henri Poincaré - Probabilités
The main result
In a recent paper,
Doukhan,
Massart and Rio( 1994)
provethat,
forany finite
subset 0
c,C2,~(P),
the convergence of~Zn(g) :
g E~~
to a Gaussian random vector with covariance function r holds true.
Moreover,
acounterexample
shows that fidi convergence may fail to holdif G L2,03B2(P). So,
it isquite
natural to assume that 0 cL2,03B2(P).
Nowthe
question
raises whether anintegrability
condition on the metricentropy
withbracketing
in,C2,~ (P)
is sufficient toimply
the uniform CLT. In fact the answer ispositive,
as shownby
thefollowing
theorem.THEOREM 1. - Let be a
strictly stationary
and,3-mixing
sequenceof
random variables with commonmarginal
distribution P. Assume that the sequencesatisfies (2.4).
Let .~’ be a classof functions f ,
.~ c
,C2"~ (P).
Assume that the entropy withbracketing
with respect to which we denoteby H,~ (b, 0), satisfies
theintegrability
conditionThen,
(i)
the series~
Cov( f (~o), f (~t))
isabsolutely
convergent over .~’ to a tEZnonnegative quadratic form I’( f, f),
and(ii)
There exists a sequenceof
Gaussian processes indexedby
.~’ with covariance
function
r and a. s.uniformly
continuoussample paths
such that
Applications
of Theorem 11. Orlicz spaces. -
Let §
be some element such that themixing
coefficients
/3n satisfy
and suppose that F C
~,2~).
Then(2.9a)
holds[see Rio, 1993].
Henceit follows from Lemma 2 that F satisfies the
assumptions
of Theorem 1 ifVol. 31, n ° 2-1995.
404 P. OOUKHAN, P. MASSART AND E. RIO
the metric
entropy
withbracketing
of 0 in fulfills theintegrability
condition
For
example,
if~(~) ==
xr,G~,z(P) _ !).t!~,2
is the usual normand
(S.I)
means that the series is convergent.n>o
As an
application
of(2.11),
we can derive thefollowing striking
result.Assume that the
mixing
coefficientssatisfy !3k
=O(bk)
for some b in]0, 1 [.
There exists some s > 0 such that
(S.I)
holds withWe notice that
G~,2(P)
is the space(P)
of numerical functionsf
such that oo and it is
equipped
with a norm which isequivalent
to the usual Orlicz norm in this space.Hence, by
Lemma 2 and Theorem1,
the uniform CLT for theempirical
process holds as soon as theentropy
withbracketing
of F inL2
~g+(P)
satisfies the usualintegrability
condition.
2. Weak Orlicz spaces. -
Let §
be some element of 03A6 such that~ 2014~
x-r cjJ(x)
isnondecreasing
for some r > 1.Then, (2.9b)
isequivalent
to the
summability
condition of Herrndorf( 1985)
on themixing
coefficients:It follows from Lemma 2 that F c
11~(P)
satisfies theassumptions
ofTheorem 1 if the
entropy
withbracketing
of F with respect toA~(.)
verifiesSome calculations
(cf. Rio, 1993)
show that(S.2)
is stronger than(S.I).
Forexample,
if03C6(x) = xr
for some r > 1, is the usual weakL2r(P)-
space
equipped
with the usual weak normA2r(.)
and(S.2)
isequivalent
to the convergence of the series while(S. I)
holds_
n>0
iff the series
~ n~~~’‘-~~,C3~
is convergent, which is a weaker condition.n>0
We refer the reader to
application
3 of Theorem 1 in Doukhan, MassartAnnales de l’Institut Henri Poincaré - Probabilités
and Rio
(1994),
for more aboutcomparisons
with theprevious
conditionsof
Ibragimov ( 1962)
and Herrndorf( 1985).
3. Conditions
involving
theenvelope function of
F. -Let G
cbe a class of functions
satisfying
the entropy conditionLet F be some element of
G2,R(P) satisfying
F > 1 and .~’ =={gF :
g E~}.
Both Theorem 1 and the
elementary inequality ~~gF~~z,p imply
the uniform CLT.4. Conditions
involving
theL2(P)-entropy of
0. - In thissubsection,
we consider the
following problem: given
a class .~’ c with metric entropy withbracketing Hz ( . )
with respect to the metric inducedby ~.~2.
we want to find a condition on the
mixing
coefficients and on the£2(P)-
entropy implying (2.10). Throughout application 4,
we assume that theenvelope
function of F is inA2r(P),
for some r where wemake the convention that
Aoo(P)
= We shall prove inappendix
C that condition
(2.10)
is satisfied if any of the threefollowing
conditionsis fulfilled:
In
particular, (2.15)
and(2.16)
are satisfied if/3n
=O(n-b),
0(~"~)
withb(l - ()
>r/(r - 1).
Also(2.17)
holds whenever/3n
= and is the class ofquadrants,
whichimproves
in this
special
case onCorollary
3 of Arcones and Yu(1994).
Vol. 31, n° 2-1995.
406 P. DOUKHAN, P. MASSART AND E. RIO
The outline of the paper is as
follows:
in section3,
thetechnique
ofblocking
formixing
processes isapplied
to establish an upper bound onthe mean of the supremum of
Zn
over a finite class of bounded functions.Next,
in section4,
we show how both the upper bound of section 3 and ageneralization
of Ossiander’s method forproving tightness
of theempirical
processyield
the stochasticequicontinuity
of{~(/) : f
E0)
under the
assumptions
of Theorem 1. In section5,
we weaken thebracketing conditions,
in thespirit
of Andersen et al.( 1988).
The stochasticequicontinuity
ofZn
is ensured in the bounded caseby
thefollowing result,
which is in fact the crucial technical part of the paper. In what
follows,
for anypositive (non necessarily measurable)
random elementX, E(X)
denotes the smallest
expectation
of the(measurable)
random variablesmajorizing
X.THEOREM 2. - Let a be a
positive
number and let C,C2,,~
be a classof functions satisfying
thecondition ~f~2,03B2
afor
anyf
in0a. Suppose
thatfulfills (2.10)
and thatfor
some M >1, f I
Mfor
anyfunction f
in
0a.
Then there exists a constantK, depending only
on () =( ~ ,Q ) 1 /2
n>_0 such
that, for
anypositive integer
q,-
3. A MAXIMAL
INEQUALITY
FOR03B2-MIXING
PROCESSES INDEXED BY FINITE CLASSES OF FUNCTIONSIn order to establish the stochastic
equicontinuity
of theempirical
process~ Zn ( f ) : f
E0),
we need to control the mean of the supremum of theempirical
processZn ( . )
over a finite class of bounded functions. This control isperformed
via theapproximation
of theoriginal
processby conveniently
definedindependent
random variables. The main argument is Berbee’scoupling
lemma.LEMMA
[Berbee (1979)]. -
Let X and Y be two r.v.’staking
theirvalues in . Borel spaces
S1 and S2 respectively
and let U be a r.v. withuniform
distribution over[0, 1], independent of (X, Y). Then,
there exists arandom variable Y* =
f (X, Y, U),
wheref
is a measurablefunction from 81
x52
x[0, 1]
into82,
such that:. Y* is
independent of
X and has the same distribution as Y..
Y* ) _ {3,
where{3
denotes the,~-mixing coefficient
between thea-fields generated by
X and Yrespectively.
Annales de l’Institut Henri Poincaré - Probabilités
The
following
lemma will be usedrepeatedly
in theproof
of Theorem 2.In
fact,
this lemma is theonly thing
one has to know about(3-mixing
sequences in order to prove Theorem 2.
LEMMA 3. - Let a and 6 be
positive
realsand G
be anyfinite
subclassof
G2,R (P~ satisfying
thefollowing assumptions:
(i)
For any g in9, ~(g(~~))
= 0.(ii)
For any g in9,
a and 6.Let
L(G)
= max(1, log 191),
where191
denotes thecardinality of G.
Thereexists some universal
positive
constant C such that,for
any q in[1, rt~,
Proof. -
Theproof
of Lemma 3 ismainly
based on Bernstein’sexponential inequalities
forindependent
r.v.’s and on Berbee’s lemma.Let us now
give
aCorollary
of Berbee’slemma,
whoseproof only
uses thefact that a countable
product
of Polish spaces is a Polish space(the proof
will be
omitted, being elementary).
PROPOSITION 2. - Let sequence
of
random variablestaking
their values in a Polish space X. For anyinteger j
> 0, let i >j)). Then,
there exists a sequence(Xz )Z>o
of independent
random variables such that,for
anypositive integer j, X~
has the same distribution as
Xj X~ ) bj .
Invoke now
Proposition
2 to construct a sequence of real- valued r. v.’ s such that the random vectorsYk = ( ~q ~+ ~ , ... ,
andYk
=(~+1~’’’ Ç;(k+1))
fulfill the conditions below:. For
any k
> 0,Y~*
andYk
have the same distribution and~) ~ {3q.
.. the r.v.’s are
independent,
the r.v .’ s areindependent.
n
Now, let
Z~ = n-1/2 ~(b~i - P)
denote the normalizedempirical
i=l
measure associated with the r.v.’s
~:
First,
wegive
an upper bound on the error term dueto
this substitution.Clearly,
Vol. 31, n° 2-1995.
408 P. DOUKHAN, P. MASSART AND E. RIO
Now,
recallthat,
for anypositive integer
i,# ~) /3q.
It follows thatIt remains to handle the first term on
right
hand in(3.1).
LetThe random variables are centered and each bounded
by
qa.Moreover,
it follows fromProposition
1 thatfor any g in
~. So, recalling
that the r.v.’s areindependent,
and that the
corresponding
r.v.’s with odd index also areindependent,
andapplying
twice Bernstein’sinequality [see
Pollard(1984),
p.193]
we get:there exists some
positive
constant c, such that for anypositive A,
It follows that
Let
Ao
andAi
be thepositive
numbers definedby
theequations
Integrating
the aboveinequality yields
Both
(3.4)
and(3.5)
ensure thatfor some
positive
constantC, which, together
with(3.2), implies
Lemma 3.4. PROOF OF THEOREM 1
We first notice that we may assume
Ep(f)
= 0 for anyf
in .~’. Thisfollows from the
elementary inequality
below(proof omitted).
Throughout
theproof, ~2,/?
stands for~2,/?(~). (i)
follows fromProposition
1. To prove(ii),
we know from Doukhan, Massart, Rio(1994)
that fidi convergence of
Z~
holds. Moreover(2.3)
and(i)
ensure thatthe
integral
entropy condition(2.10) imply
that(0, F)
ispregaussian
viaDudley’s
criterion. It is well known[see
for instance Pollard(1990)]
that(ii)
will be ensuredby
theasymptotic
stochasticequicontinuity
ofZn
withrespect
to the~2,/3-metric.
This will followeasily
from the fundamentalinequality
below, which is in fact acorollary
of Theorem 2.THEOREM 3. - Let a be a
positive
nunber and let0a
C£’2,3
be a classof functions satisfying
thecondition
~for
anyf
inSuppose
that
fulfills (2.10)
and letLet B : R+ ->
[0, B(1)~
be theapplication defined by B(x)
=J ~ ~3-1(t)dt
[note
that03B2-1(t)
= 0if t
>lJ
and,for
any measurablefunction h,
detLet F be some
positive function satisfying
thefollowing
conditions:F > ~ f ~ I for
anyf
inF,
and lim03B4F (~)
= 0. Then there exists somepositive
constante-o
A
depending only
onB(l)
=82
such that,for
anypositive integer
n,where
n)
is theunique
solutionof
theequation
Vol. 31, n° 2-1995.
410 P. DOUKHAN, P. MASSART AND E. RIO
Remark. - Since B is a
nondecreasing
concavefunction, x - x2 /B(x)
is an
increasing
and continuousapplication
from R+ onto R+. Hence theequation defining
has anunique solution,
and lim = 0.Furthermore,
thebracketing
conditionimplies
the existence of anenvelope
function F
belonging
to~2,~
which in turn ensures that limbF (~)
= 0,via
(2.8).
e-oIt follows from the above remark
that,
if n islarge enough
Hence; applying
Theorem 3 to0a
={f - 9 : f, g
E F and!!./’-~2,/3 ~},
we obtain the
asymptotic
stochasticequicontinuity
of{~(/) : f
E0),
which
completes
theproof
of Theorem 1.Proof of Theorem
2. - In order to prove Theorem 2 we shall use achaining
argument with
adaptive
truncatures. Thistechnique
was initiatedby
Bass(1985)
in the context of set-indexedpartial
sum processes and then usedby
Ossiander(1987)
and nextby
Andersen et al.(1988)
in order to proveuniform central limit theorems for function-indexed
empirical
processes.Contrary
to Ossiander and Andersen etal.,
we do not force the bracketsto be
decreasing
allalong
thechaining.
This may be doneusing
a newchaining decomposition (1).
We need to
impose
some additionalmonotonicity
condition on theentropy
function
H{3.
Thefollowing
claim is a direct consequence of Lemma6,
which iselementary
but ofindependent
interest. This lemma will beproved
in section 6.
CLAIM 2. - There exists a
nonincreasing function H(.), majorizing
H~(.,.~’a),
such that x - isnondecreasing
andSince Theorem 2 is trivial when
2-6 y’n (in
that caseTheorem 2 holds with K =
213)
we shall assume that condition(~) The idea of using such a decomposition was given to us by D. Pollard (private communication).
Annales de l’lnstitut Henri Poincaré - Probabilités
is satisfied. We define
80
= a andthen,
for anyinteger k, 8k
=2-~bo.
Now, since
(Al)
holds for.F,
for eachnonnegative integer k,
we may choose acovering
of Fby
bracketsBj,k
==[9j,k, hj,k], ,1 j Jk,
withb~
and Of course we may assumethat 9j,k I
2M . Then in each bracketBj,k
we fix apoint fj,k belonging
to .F . We now define amapping
from .~’to [1, Jk] by
=
min{j
E[1, Jk] : f E Finally
we set03A0kf
= and= comes from these definitions that
with
We need to define some more functions and parameters. For any
positive
6 we set
H(6) = ~~ H(6k).
This function H will be usefull becauses,~ > a
the
mapping (~c/y, ~ ~
ranges in a finite set withcardinality
lessthan or
equal
to The choice of parameters that we propose hereunder will tend to make the three terms of Lemma 3 of the sameorder of
magnitude
whenapplied
to the control of the different terms ofdecomposition (4.8).
We defineq(6)
= min{s
E N* :and the
parameter e(6) by ~e(~) == ((q(6) - 1)
VI)H(6).
Letand
We note that both sequences
(bk)k
and(qk)k
arenonincreasing.
Moreover,it follows from the definitions
of q
and e that{3 -1 ( ê ( 8)) ~ (q(6) - 1)
V 1.So
ne(6) /3-~(c(~))H(~),
which means thatAlso
q(b) 2((q(6) - 1)
V1) , .thus
Let
Vol. 31, n° 2-1995.
412 P. DOUKHAN, P. MASSART AND E. RIO
We note
that,
because of(4.2),
N > 1 . We will check at the end of theproof
of Theorem 2 that n, whichimplies
that 1 for all1~ n.
Finally
for anyf
E.F,
we set’
- -
Let I denote the
identity
operator.Starting
fromand
noting
thatwe get
Let k be an
integer
1 ~ N - 1 . Sincebk,
we havePlugging
this relation in thedecomposition
aboveproduces
thefollowing
relation where the summations have to be taken as 0 if N = 1
The
following inequality
derivesstraightforwardly
from(4.8)
Annales de l’Institut Henri Poincaré - Probabilités
where
C ontrol
of Ei
We note that
IIo
ranges in the finite set of functions As a directapplication
of Lemma 3 and Claim 2 we get that for anyinteger
qIn order to control the
expectations Ei,
i =2, 3, 4
we will need to relate the1-norm
of a function which is truncated from below to the2,03B2-norm
ofthe nontruncated function. This is what is done in the
following
lemma.LEMMA 4. - Let h be some
function
in,Ci{P).
Then,for
any ~ in]0, 1],
the
following inequality
holdsIn
particular, if
then
Vol. 31,~2-1995.
414 P. DOUKHAN, P. MASSART AND E. RIO
Proof of
Lemma 4. -Clearly,
for all t E,Now E.
So,
thus, applying (4.9)
and theconcavity
ofB(.) provides:
proving (a). Noticing
thateR-1(e) B(e),
condition(i) implies,
via(4.9)
that a >
Qh(e). Hence, (b)
follows from(a).
We can use this lemma to
produce
a result which is crucial for what follows.CLAIM 3. - For any
integer k
N - 1, we haveProof of
Claim 3. - We wish toapply
Lemma 4 with a =bk/ qk ,
8 =b~
and e = ek. We have to check that condition
(i)
is satisfied.By (4.6)
and(4.7b)
we havethus,
it follows from(4.7a)
that6k,
hence condition(i)
isfulfilled. Therefore
so,
using (4.7a) again provides
Annales de l’Institut Henri Poincaré - Probabilités