A NNALES DE L ’I. H. P., SECTION B
C RISTINA A NDRIANI
P AOLO B ALDI
Sharp estimates of deviations of the sample mean in many dimensions
Annales de l’I. H. P., section B, tome 33, n
o3 (1997), p. 371-385
<http://www.numdam.org/item?id=AIHPB_1997__33_3_371_0>
© Gauthier-Villars, 1997, tous droits réservés.
L’accès aux archives de la revue « Annales de l’I. H. P., section B » (http://www.elsevier.com/locate/anihpb) implique l’accord avec les condi- tions générales d’utilisation (http://www.numdam.org/conditions). Toute uti- lisation commerciale ou impression systématique est constitutive d’une infraction pénale. Toute copie ou impression de ce fichier doit conte- nir la présente mention de copyright.
Article numérisé dans le cadre du programme Numérisation de documents anciens mathématiques
http://www.numdam.org/
371
Sharp estimates of deviations of the
sample
meanin many dimensions
Cristina ANDRIANI and Paolo BALDI * Dipartimento di Matematica, Universita di Roma-Tor
Vergata, Via della Ricerca Scientifica, 00133 Roma.
e-mail: baldi@.oscar.proba.jussieu.fr
Vol. 33, n° 3, 1997, p. 385. Probabilités et Statistiques
ABSTRACT. - We
give
theasymptotics
as n - oo of theprobability
forthe
empirical
mean of a sequence of i.i.d. random vectors to be in an open domain whose closure doesn’t contain the true value of the mean. Our resultgeneralizes
those of Bahadur and Rao and holds under suitableassumptions
on the
boundary
of the domain(which
needs not however to beconvex)
and on the laws of the random vectors
(a
boundeddensity
isneeded).
RESUME. - On donne le
comportement asymptotique
pour n - oo de laprobability
que la moyenneempirique
d’une suite de variables aleatoires multi-dimensionnelles prenne ses valeurs dans un ouvert ne contenant pas la moyenne dans sa fermeture. Ce resultatgeneralise
celui de Bahadur et Rao et est valable sous certaineshypotheses
sur la frontiere de 1’ ouvert(lequel
ne doit pas necessairement etre
convexe)
et sur les lois communes des v.a.(on
doit supposer un peu moins que l’existence d’une densitebomee).
1. INTRODUCTION
Cramer’s theorem on
large
deviations for theempirical
mean statesthat if
~X.n, ~n
is a sequence of i.i.d. d-dimensional r.v. and we note* Partially supported by research funds of the MURST "Processi Stocastici"
Annales de l’Institut Henri Poincaré - Probabilités et Statistiques - 0246-0203
for every Borel set D C where the functional I is the
Fenchel-Legendre
transform of the
logarithm
of theLaplace
transform of thelaw /L
ofXi.
Of course
( 1.1 ) gives
theasymptotics
oflogP{Xn ~ D}
if the Borel set D is such thatinf o I(x) = infxED I(x).
By "sharp asymptotics"
we mean theasymptotics
ofP{Xn
ED}
(as opposed
to theasymptotics
of itslogarithm).
The first result in this direction was obtainedby
Bahadur and Rao[2]
for the dimension d = 1.If D =
[q,
+00[,
q >E( Xl)
and q lies in the admissible domain of I(see
§2
for theexplanation
ofthis),
thenthey
prove thatThe aim of this paper is to
give
theasymptotics
ofP{Xn
ED}
for r.v.’staking
values in This is a much morecomplicated
situationmainly
because the
boundary
of a Borel set D inf~d
can be morecomplicated.
Forthis reason we have chosen to deal with the
simplest
situation: we assumethat the
Laplace
transformof
is finite in aneighborhood
of theorigin,
and
that
has a boundeddensity (an assumption
which can be howeverweakened,
see Remark3.2).
Moreover we suppose that there exists aunique point
x* E at which theinfx~D I(x)
is attained and that x* is aregular
constrained
point
of minimum for I on 9D(again
see§2
for aprecise definition)
moreoverbelonging
to the admissible domain.It is fair to
acknowledge
that we follow here thepath
of Borovkov andRogozin [7]. Actually
the content of§3
comes from[7]
and is hereonly
in order to make the paper self-contained. Borovkov and
Rogozin ([7],
Theorem
2) give
also anasymptotic
estimate forreducing
it to the
asymptotics
of certainintegrals
whose behaviour is however notexplicitely
studied. In§4
wegive
anexplicit asymptotics (Theorem 4.4)
interms of the Cramer transform
I,
its derivatives and theshape
of 9D at x*.Moreover we are able to
give
a cleargeometric meaning
of the constantappearing
in theasymptotics. By
the way we also correct a mistake in the Theorem 2 of Borovkov andRogozin,
as it is easy to detect if one considers the case ofN(0, 1)-distributed r.v.’s,
a situation in which thecomputations
Annales de l’lnstitut Henri Poincaré - Probabilités et Statistiques
can be made
directly (see [3], [5]
or[12]
or even thefollowing §4,
forinstance).
In
§2
we recall someproperties
of the functional I(which
is sometimes referred to as the Cramer transform of/L)
and otherpreliminary
statements.§3
contains anasymptotic
result for thedensity
ofXn
from which in§4
we derive the
asymptotics.
Some results of
sharp asymptotics
existalready
in thelitterature;
wewish to recall those of R.Azencott
[ 1 ],
in a infinite dimensionalsetting, concerning
the case of small randomperturbations
ofdynamical systems and,
morerecently,
of M. Iltis[10],
in a context similar to the one of thispaper, but where the
geometric meaning
of theasymptotics
is notapparent.
It is also useful to
point
out that we don’t needassumptions
of theexistence of
dominating points,
in the sense ofNey [ 11 ] .
2. CRAMER TRANSFORM
Let /L
be aprobability
measure onIRd.
We shall denoteby v
its(complex) Laplace
transformso that its Fourier transform is
It is well-known that for d = 1 the set of all z E
Cd
such that theintegral
in the definition of v converges, is a
strip,
that is of the form J xiR,
where J is an interval. In
general,
d >1, v
is defined on a set of the form J x where J is a convex subset of We shall refer to this set as thestrip
of convergence. We shall writeDv = {A
E and alsoA(A)
= A EBy
I we denote the Cramer transform of ~:As a supremum of linear-affine
functions,
I isclearly
convex and lowersemi-continuous.
Moreover,
since 03BB ~03BB, x) -
is concave, one cantry
tocompute
the supremumby looking
for the criticalpoint,
e.g.by
solving
in A theequation
If this
equation
has a solution A :=A(x),
thenclearly
(otherwise
the supremum is attained atinfinity).
Ifequation (2.1 )
has ao o
solution A E
VA == Vv
we shall say that x lies in the admissible domain of I. In other words the admissible domain 0 is theimage
of the interior ofV A by
A’. If thesupport of J-L
is not contained in a properhyperplane
of then the covariance matrix of ~c is
positive
definite and thusA"(0)
is invertible. Indeed this
argument implies
thatA"(A)
isalways
invertiblefor A in the domain of v, since
A"(A)
coincides(see
nextparagraph)
withthe covariance matrix of another
probability
stillhaving
itssupport
not contained in a properhyperspace.
Thisimplies
that A isstrictly
convexo
on
D:1,
so that the solution of(2.1 )
isunique
for x in the admissible comain. Moreover theimplicit
function theoremgives immediately
that theadmissible domain is an open set and that x -
A(x)
is C°° on H.This also
implies
that the Cramer transform I is C°° when restricted to the admissible domain.By computing
the derivatives in(2.2)
one hasthe relations
For more on
questions
of convexanalysis
the reader can refer to the books of Rockafellar[13]
and Ellis[8].
3. THE ASYMPTOTICS FOR THE DENSITY Assume that x
belongs
to the admissible domain. Lat us denote theprobability
measure onIRd
defined eitherby
A - +x)
orAn
important
role isgoing
to beplayed by
thefollowing
measureA
straightforward computation gives
that itsLaplace
transform isAnnales de l’lnstitut Henri Poincaré - Probabilités et Statistiques
so that is a
probability
measure. If xbelongs
to the admissibledomain,
then let us set /Lx =
(3.1) gives
then for the Fourier transformjix
of ~c~Computing
the derivatives onegets easily
that ~c~ is centered and that its covariance matrix coincides with the Hessian of Acomputed
at a =As we
already
remarked if thesupport
of /L is not contained in a properhyperplane
of then the covariance matrix of /L is(strictly) positive
definite. Thus
A"(0)
ispositive
definite. But if thesupport
of /L is not contained in a properhyperplane,
the same is true for so that alsol1"(~(x))
isnecessarily positive
definite.By (2.3)
~" isstrictly positive
definite in the admissible domain.
THEOREM 3.1. - Let
{Xn}n
be a sequenceof Rd-valued
r.v.’s;
assumethat their common law M has a bounded
density
withrespect
to theLebesgue
measure and that its
Laplace transform
v isfinite
in aneighborhood of
theorigin.
ThenXn
has adensity
gnfor
whichholds as n - oo
for
every xbelonging
to the admissible domain Q.Moreover the above
expansion
holdsuniformly for
x in any compact subsetof IRd
which is contained in Q.Proof. -
Let us denoteby f n
thedensity
ofxl
+ ... +Xn.
ThenLet A E
IRd
be apoint
in the interior ofPBi
then the Fourier transform of xfn(x)
is t -v(À
+it)n.
Let us prove that it isintegrable
for
large
n. Indeed it suffices to show that x is in LP forsome p >
1,
since this willimply
that t -v(À
+it)
is in Lq for someq >
2, by
theHausdorff-Young inequality
and that t -v ( ~ ~ it) n
is inL1
forlarge
n. Butand the left hand term is
integrable
as soon as p is smallenough
so thatpA
still lies in
Ð A.
Thus forlarge n
the Fourier inversion theoremgives
which means
and
by (3.3)
Since this
relationship
holds forevery A
and x lies in the admissibledomain,
then it holds also for A =
À (x),
so thatThe last
integral
can besplit
U
being
aneighborhood
of theorigin.
Since /Lx isabsolutely continuous,
then ]
1 andby
theRiemann-Lebesgue
lemmalimt~~|x(t)|
= 0.Thus,
for everyneighborhood
U of theorigin
there exists a constant k 1 such thatso that
where q
is such that .Lq. Thus the contribution of theintegral
overUC
goes to 0
exponentially
fast. As for the other term,by
achange
of variableAnnales de l’Institut Henri Poincaré - Probabilités et Statistiques
Since the
probability
measurejix
is centered and has covariance matrixA"(A(~))
=I" ( x ) -1 : - r x, by
the Central Limit TheoremIn order to
apply Lebesgue
theorem we remarkthat,
if U is smallenough
then
for c >
0,
which can be chosen smallenough
so thatrx
= EI isstill
positive
definite. ThusB
and since
1 - y
e-yB /
Lebesgue
theorem nowgives
Since the contribution of the
integral
overUC
goes to 0exponentially fast,
we havefinally
~ B
_
/ B /
We finish
by indicating
thearguments
which lead to theuniformity
of theasymptotics (3.2).
First it is easy to prove that theLaplace
transform v isuniformly
continuous in anystrip
of the form K + where K is anycompact
such that K c c J(recall
that J +ilRd
is the domain ofv).
This
implies
that a constant k 1 can be chosen so that(3.4)
holds ina
neighborhood
of x. Theanaliticity properties
of v moreoverimply
thatthe remainder in the
development
to the second order ofjix
at theorigin
can be controlled with
continuity,
so that the convergence in(3.5)
and themajorization
in(3.6)
are uniform in a smallneighborhood
of x.REMARK 3.2. - A closer look to the
proof of
Theorem 3.1 shows that theassumption of
existenceo.f’a
boundeddensity for
/L can be weakened. Indeed anyassumption ensuring that, for
some n, has adensity
to which theinversion Fourier theorem can be
applied
issufficient; for
instance that*n
has a bounded
density for
some n.4. THE
ASYMPTOTICS
In this section we state our main
result, giving
theasymptotics
forWe
begin by stating
theassumptions.
HYPOTHESIS
(A). - ~
has a boundeddensity
withrespect
to theLebesgue
measure and its
Laplace
transform is finite in aneighborhood
of theorigin.
DEFINITION 4.1. - Let W : R be a smooth
function
and let x* E ~D be a local constrained minimumof w
on ,x* is said to be nondegenerate
or
regular if for
some local systemof
coordinates G: U - o~D the Hessianof W
o G ispositive definite
atG-1 (x* ).
Let M be a
hypersurface
of let z be apoint
on M andMz
thetangent
space to M at z. Let be a C°° unit normal field to M aroundz. If X = is a vector in
M~
consider the transformationwhere
X n is the vectorwhose j-th component
is~ ~ 1 ai a~ ~ ( z ) .
ThenX n is still a vector in
Mz
so that(4.1 )
defines a transformation ofMz
intoitself which is called the
Weingarten
map(see
Hicks[1965], p.21 e.g.).
This map is
closely
related with the curvature of M at z(ibidem p.24).
The
following examples
show how tocompute
theWeingarten
map in twotypical
situations. Indoing
this one must take care because the choice of the C°° normal unit field is notunique,
so that theWeingarten
map is defined up to a constant -1. In thefollowing examples
the C°° normal unit field is chosen so that at xo itpoints
in the direction of thepositive
d-axis.EXAMPLE 4.2. - Let us assume that M is
locally
thegraph
of a functiong. That is Xo E M is such that in a
neighborhood
of xo M consists of thepoints (xl, ... , xa-l, g(~1, ... , xd-1)), g : IRd-1
- Rbeing
asmooth function such that
g’ (xo)
= 0. Then in thesystem
of coordinatesz -
(~ 1, ... , Xd-1)
theWeingarten
map of M at xo isgiven (see,
e.g., Baldi[3], §5).
EXAMPLE 4.3. - Let us assume that M =
{x; F( x)
=0~
whereR is a smooth function such that
F’ (xo ) #
0 and eventhat
F’ (xo) points along
the d-th coordinate.By
theimplicit
functionstheorem there exists a smooth
function g :
R such thatF(~i,...,~_i,~(~i,...~_i)) = 0
andg’(xo)
= 0.Computing
theHessian
of g
in terms of the derivatives of F onegets
Annales de I’Institut Henri Poincaré - Probabilités et Statistiques
for =
1,...,
d - 1. Inparticular
theWeingarten
map of asphere
ofradius R
equals
theidentity
matrix dividedby
R.Also,
for d =2,
theWeingarten
map reduces tomultiplication by
the inverse of the radius of theosculating
circle(multiplication by
0 if the latterequals oo).
HYPOTHESIS
(B). -
The infimum of I overD
is attained at aunique point
x* E which moreover is a
regular
constrained minimum andbelongs
to the admissible domain.
THEOREM 4.4. - Under
Hypotheses (A)
and(B)
thefollowing expansion
holds
where
Ll
andL2
are theWeingarten
maps at x*of
thehypersurfaces {~;7(~/) ~ ~(~)}
and c~Drespectively.
Before the
proof
of Theorem 4.4 it is worth topoint
out that in theasymptotics
the term before theexponential
isalways
of ordern-1/2, regardless
of the dimension d. Also theasymptotics
of Theorem 4.4depend
on the
law only through
the value of its Cramer transform I and of its first two derivatives at x*.It is not difficult to realize that if x* is a
regular point,
thenL1 - L2
ispositive
definite(and viceversa ! )
thusensuring
thatdet(L11(L1 -L2))
> 0.The
quantities appearing
in Theorem 4.4 have an intuitivemeaning
which is easy to
understand; det(L11(L1 - L2 ) ) -1 / 2
is a measure of thecontact between 9D and the level set
{~/;7(?/)
=I (~* ) ~
at theirpoint
oftangency
~* . The more these twohypersurfaces
are "close" the more thesymmetric
matrixL1 - L2
will be small thusgiving
alarge asymptotics.
Conversely
the measures how fast theaction functional I increases as one moves from x* to the interior of D. It should be recalled that the
gradient I’ (~* )
isorthogonal
to 9D andpoints
to the interior of D.
The
quantity det(L11(L1 - L2 ) ) appeared already
in acompletely
different
problem
ofsharp asymptotics (Baldi [3]).
It should also be noticed that it doesn’tdepend
on the choice of the normal field.Also remark that the
assumption
of existence of a boundeddensity
forF is needed
only
in order toapply
Theorem 3.1. ThusHypothesis (A)
canbe weakened
according
to Remark(3.2).
Proof of
Theorem 4.4. - For everyneighborhood
17 of x* we cansplit
Since I is l.s.c. and
I(x)
- +00 as +00(consequence
ofHypothesis (A))
thereexists ~
> 0 such thatand
by
Cramer’ s theoremAs for the estimation of the other term in
(4.2)
one hasLet us denote
by
G a localsystem
of coordinates of 9D around x* andassume for
simplicity
thatG(0)
= x*. Let us denoteby A
the Hessianof I o G at 0
(it
is apositive
definite matrix because ofHypothesis (B)).
Then classical
expansion
formulas(see
e.g. Bleistein and Handelsman[6], (8.3.63)
p.140)
state thatThis
already
proves that thequantity P{Xn
E in(4.2)
isnegligible
because of
(4.4).
In a firststep
we shall assume thatI" (x* )
is theidentity
matrix. We
only
need to prove thatBy
aorthogonal change
of coordinates we can assume thatlocally,
aroundx*,
9D is thegraph
of a smoothfunction g
defined on an open setU C We can assume moreover that 0 E
U,
that x* =(0,..., 0, Ix* I)
and that at x* the normal to 9D is
parallel
to thepositive
d-axis(it
sufficesto choose an
isometry
whichchanges x*
to(0,..., 0,
and thetangent
space to 9D at x* to thehyperplane
xd =~x* ~).
This means inparticular
that the first derivatives
of g
at 0 vanish. We can thus consider the localsystem
of coordinates G definedby
Annales de l’lnstitut Henri Poincaré - Probabilités et Statistiques
Since all the first derivatives
of g
vanish at 0 one hasso that
where denotes the
( d -1 )
x( d -1 ) identity
matrix. Thevanishing
of thefirst derivatives
of g
at 0 alsoimplies, by
astraightforward computation,
that for
z,j’=l,...,d2014l
Since we assume that
I" (x* )
is theidentity
matrixBut in this situation we know that
L2
=-g" (0) (Example 4.2)
whereasLi
=-Ia-1 ( a d (G(o))-1 (Example 4.3).
ThusSince of course
(4.7), (4.8)
and(4.9) give (4.5)
under theassumption
thatI" (x* )
is theidentity
matrix. Let us now remove thisassumption.
We know that the HessianI" (x* )
ispositive
definite. Let C be the(symmetric)
square root ofI" (x* ),
and let us define the r.v.’sZi
= One has thuswhere D = CD. It is immediate to check
that, denoting
with A and Ithe
log-Laplace
and the Cramer transformsrespectively
of the r.v.’sZi,
one has the relations
Moreover the infimum of I over D is attained at z* = Cx* and
Since
7~(~)
is theidentity matrix,
we canapply Step
2 and obtain theasymptotics
where
Li
andL~
are theWeingarten
maps of thehypersurfaces
9D and{~,7(~/)
=I (~* ) ~ respectively.
Thus weonly
have to prove that thequantity det ( L11 (L1 - L2))
is invariantby
linear transformations. This isproved
in the next statement which concludes theproof
of Theorem 4.4.PROPOSITION 4.5. - Let
Ml , M2 be
smoothhypersurfaces of IRd
whichare tangent at xo. Let C be an inversible linear
transformation of (~d
and let
Mi, M2
be theimages through
Cof Ml
andM2 respectively. If
Li,L2?Li,L2
are theWeingarten
mapsof Ml
andM2
at xo andof 1Ml
and
M2 at
Yo =Cxo respectively,
then thequantities L2))
and
L2 ) )
areequal.
Proof. -
We shallgive
asgranted
that the statement is true if C isorthogonal.
Otherwise let F : I~ be a smooth function such that M =~~;; F(x)
=0~
and 0.By
anorthogonal
transformationwe can
always
assume thatF’ (xo ) points along
the d-th coordinate. Thenwe know from
Example
4.3 thatLi
isgiven by HF,
theprincipal
minorof order d - 1 of the Hessian of
F,
dividedby
the modulus ofF’(xo).
If
n (x)
is a vector which is normal toMi
at x, thenC* -1 n
is normal toM at y
= Cx. Let 0 be anorthogonal
matrix such thatpoints along
the d-th coordinate: since = and for theorthogonal
transformations the statement is true we can assume thatG’* -17z
still
points along
the d-th coordinate. SinceM
={y;FoC-1(y)
=0~
andthe
gradient
of F oC-1 points along
the d-thcoordinate,
we needonly
tocompute
theprincipal
minor of order d - 1 of the Hessian of F oC-1
at y, that we shall denoteby Hp.
Now it is known thatAnnales de l ’lnstitut Henri Poincaré - Probabilités et Statistiques
However, by
theassumptions
wemade,
has the formso that
Since the same
computation
holds forL2,
we havefinally
so that the determinants of
~2)
andL11 (L1 - L 2 )
areequal.
Remark. - It is
possible
to prove(4.6) directly,
withoutsplitting
theproof
into two
parts according
to theassumption
that7~(~)
is theidentity
matrixor not as we did. The way we have chosen
points
out a useful invarianceproperty
of the formL2 ) ) .
EXAMPLE 4.6. - Let us assume that the distribution of
Xl
is theproduct
of two double
exponentials,
that is it hasdensity
Then
straightforward,
if notamusing, computations give
theLaplace
transform
The
system
has solution
(the
admisible domain coincides with so thatIf D is the ball of radius 1 centered at
(0,2),
thenusing Lagrange multipliers
one findseasily
that the minimum of I on 9D is attained at x* _(0,1),
as thesymmetries
of the situationsuggest.
Wealready
knowthat the
Weingarten
map of 9D at x* is1,
whereas for the level set{~;7(~) = 7(~)}
at x* isSo that
L2))-1~2 = (2v"2- 1)’~.
Anotherstraightforward computation gives
Annales de l’Institut Henri Poincaré - Probabilités et Statistiques
so
that,
since7’(~) = ~(.r*) _ (0,V~- 1),
Finally
theasymptotics
is[1] R. AZENCOTT, Petites perturbations aléatoires des systèmes dynamiques: développements asymptotiques, Bull. Sci. Math.
(2ème
série), Vol. 109, 1985, pp. 253-308.[2] R. R. BAHADUR and RAO, On deviations of the sample mean, Ann. Math. Statist., Vol. 31, 1960, pp. 1015-1027.
[3] P. BALDI, On the exact equivalent of the exit time of a Brownian motion from an open set,
Prépublication Laboratoire de Probabilités, Université Paris 6, 1991, 71-91.
[4] P. BALDI, Exact asymptotics for the probability of exit from a domain and applications to simulation, Ann. Probab., Vol. 23, 1995, pp. 1644-1670.
[5] C. BELLAÏCHE, Comportement asymptotyque de
p(t,,
x,y)
quand t ~ 0 (points éloignés).In: Gódesiques et diffusions en temps petit, Astérisque, Vol. 84-85, 1981, pp. 151-187.
[6] N. BLEISTEIN and R. A. HANDELSMAN, Asymptotic expansion of integrals, Dover, New York.
[7] A. A. BOROVKOV and B. A. ROGOZIN, On the multidimensional central limit theorem, Th.
Probab. Appl., Vol. 10, 1965, pp. 55-62.
[8] R. S. ELLIS, Entropy Large Deviations and Statistical Mechanics, Springer Verlag, New York, Berlin, Heidelberg, Tokyo.
[9] N. J. HICKSO, Notes On Differential Geometry, Van Nostrand Math. Studies 3, 1965.
[10] M. ILTIS, Sharp Asymptotics of Large Deviations in
Rd,
J. Theor. Probab., Vol. 8, 1995, pp. 501-522.[11] P. NEY, Dominating points and the asymptotics of large deviations for random walk on
Rd,
Ann. Probab., Vol. 11, 1983, pp. 158-167.[12] W. D. RICHTER, Laplace-Gauß Integrals, Gaussian Measure Asymptotic Behaviour and
Probability of moderate Deviations, Z. Anal. ihre Anwendugen, Vol. 4, 1985, pp. 257- 267.
[13] T. R. ROCKAFELLAR, Convex Analysis, Princeton University Press, Princeton N.J., 1970.
(Manuscript received 1th September, 1995;
Revised 15 October, 1995. ) REFERENCES