
Ann. Inst. Henri Poincaré Probab. Stat. 2013, Vol. 49, No. 1, 1–12. www.imstat.org/aihp

DOI: 10.1214/11-AIHP462. In the Public Domain

Asymmetric covariance estimates of Brascamp–Lieb type and related inequalities for log-concave measures

Eric A. Carlen^{a,1}, Dario Cordero-Erausquin^{b} and Elliott H. Lieb^{c,2}

^a Department of Mathematics, Hill Center, Rutgers University, 110 Frelinghuysen Road, Piscataway, NJ 08854-8019, USA. E-mail: carlen@math.rutgers.edu

^b Institut de Mathématiques de Jussieu, Université Pierre et Marie Curie (Paris 6), 4 place Jussieu, 75252 Paris, France

^c Departments of Mathematics and Physics, Jadwin Hall, Princeton University, P.O. Box 708, Princeton, NJ 08542-0708, USA

Received 15 June 2011; revised 24 October 2011; accepted 4 November 2011

Abstract. An inequality of Brascamp and Lieb provides a bound on the covariance of two functions with respect to log-concave measures. The bound estimates the covariance by the product of the $L^2$ norms of the gradients of the functions, where the magnitude of the gradient is computed using an inner product given by the inverse Hessian matrix of the potential of the log-concave measure. Menz and Otto [Uniform logarithmic Sobolev inequalities for conservative spin systems with super-quadratic single-site potential. (2011) Preprint] proved a variant of this with the two $L^2$ norms replaced by $L^1$ and $L^\infty$ norms, but only for $\mathbb{R}^1$. We prove a generalization of both by extending these inequalities to $L^p$ and $L^q$ norms and on $\mathbb{R}^n$, for any $n \ge 1$. We also prove an inequality for integrals of divided differences of functions in terms of integrals of their gradients.


MSC: 26D10

Keywords: Convexity; Log-concavity; Poincaré inequality

1. Introduction

Let $f$ be a $C^2$ strictly convex function on $\mathbb{R}^n$ such that $e^{-f}$ is integrable. By strictly convex, we mean that the Hessian matrix $\operatorname{Hess} f$ of $f$ is everywhere positive. Adding a constant to $f$, we may suppose that

$$\int_{\mathbb{R}^n} e^{-f(x)}\,\mathrm{d}^n x = 1.$$

1 Work partially supported by U.S. National Science Foundation Grant DMS 0901632.

2 Work partially supported by U.S. National Science Foundation Grant PHY 0965859.


Let $\mathrm{d}\mu$ denote the probability measure

$$\mathrm{d}\mu := e^{-f(x)}\,\mathrm{d}^n x \qquad (1.1)$$

and let $\|\cdot\|_p$ denote the corresponding $L^p(\mu)$-norm.

For any two real-valued functions $g, h \in L^2(\mu)$, the covariance of $g$ and $h$ is the quantity

$$\operatorname{cov}(g,h) := \int_{\mathbb{R}^n} gh\,\mathrm{d}\mu - \int_{\mathbb{R}^n} g\,\mathrm{d}\mu \int_{\mathbb{R}^n} h\,\mathrm{d}\mu \qquad (1.2)$$

and the variance of $h$ is $\operatorname{var}(h) = \operatorname{cov}(h,h)$.

The Brascamp–Lieb (BL) inequality [4] for the variance of $h$ is

$$\operatorname{var}(h) \le \int_{\mathbb{R}^n} \big(\nabla h, (\operatorname{Hess} f)^{-1}\nabla h\big)\,\mathrm{d}\mu, \qquad (1.3)$$

where $(x,y)$ denotes the inner product in $\mathbb{R}^n$. (We shall also use $x \cdot y$ to denote this same inner product in simpler expressions where it is more convenient.)

Since $(\operatorname{cov}(g,h))^2 \le \operatorname{var}(g)\operatorname{var}(h)$, an immediate consequence of (1.3) is

$$\operatorname{cov}(g,h)^2 \le \int_{\mathbb{R}^n} \big(\nabla g, (\operatorname{Hess} f)^{-1}\nabla g\big)\,\mathrm{d}\mu \int_{\mathbb{R}^n} \big(\nabla h, (\operatorname{Hess} f)^{-1}\nabla h\big)\,\mathrm{d}\mu. \qquad (1.4)$$
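As a quick numerical sanity check of the variance bound (1.3) (not part of the paper's argument), one can take the standard Gaussian potential $f(x) = |x|^2/2$, for which $\operatorname{Hess} f = \mathrm{Id}$; the test function $h(x) = x^3$, the sample size, and the random seed below are illustrative choices.

```python
import numpy as np

# Monte Carlo check of the Brascamp-Lieb bound (1.3) for the standard
# Gaussian on R: f(x) = x^2/2, so Hess f = 1 and the bound reads
#   var(h) <= E[ h'(x)^2 / f''(x) ].
# For h(x) = x^3 the exact values are var(h) = E[x^6] = 15 and
# E[h'^2] = 9 E[x^4] = 27.
rng = np.random.default_rng(0)
x = rng.standard_normal(2_000_000)

h = x**3
lhs = h.var()                    # var(h); exact value 15
rhs = np.mean((3 * x**2)**2)     # E[h'^2 / f''] with f'' = 1; exact value 27

assert lhs <= rhs
```

The gap between 15 and 27 shows that (1.3) is an inequality, not an identity, for this non-linear test function; equality holds for linear $h$.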

The one-dimensional variant of (1.4), due to Otto and Menz [12], is

$$\operatorname{cov}(g,h) \le \|\nabla g\|_1\,\big\|(\operatorname{Hess} f)^{-1}\nabla h\big\|_\infty = \sup_x \frac{|h'(x)|}{f''(x)} \int_{\mathbb{R}} \big|g'(x)\big|\,\mathrm{d}\mu(x) \qquad (1.5)$$

for functions $g$ and $h$ on $\mathbb{R}^1$. They call this an asymmetric Brascamp–Lieb inequality. Note that it is asymmetric in two respects: One respect is to take an $L^1$ norm of $\nabla g$ and an $L^\infty$ norm of $\nabla h$, instead of $L^2$ and $L^2$. The second respect is that the $L^\infty$ norm is weighted with the inverse Hessian – which here is simply a number – while the $L^1$ norm is not weighted.

Our first result is the following theorem, which generalizes both (1.4) and (1.5).

Theorem 1.1 (Asymmetric BL inequality). Let $\mathrm{d}\mu(x)$ be as in (1.1) and let $\lambda_{\min}(x)$ denote the least eigenvalue of $\operatorname{Hess} f(x)$. For any locally Lipschitz functions $g$ and $h$ on $\mathbb{R}^n$ that are square integrable with respect to $\mathrm{d}\mu$, and for $2 \le p \le \infty$, $1/p + 1/q = 1$,

$$\operatorname{cov}(g,h) \le \big\|(\operatorname{Hess} f)^{-1/p}\nabla g\big\|_q\,\big\|\lambda_{\min}^{(2-p)/p}(\operatorname{Hess} f)^{-1/p}\nabla h\big\|_p. \qquad (1.6)$$

This is sharp in the sense that (1.6) cannot hold, generally, with a constant smaller than 1 on the right side.

For $p = 2$, (1.6) is (1.4). Note that (1.6) implies in particular that for Lipschitz functions $g, h$ on $\mathbb{R}^n$,

$$\operatorname{cov}(g,h) \le \big\|\lambda_{\min}^{-1/p}\nabla g\big\|_q\,\big\|\lambda_{\min}^{-1/q}\nabla h\big\|_p.$$

For $p = \infty$ and $q = 1$, the latter is

$$\operatorname{cov}(g,h) \le \|\nabla g\|_1\,\big\|\lambda_{\min}^{-1}\nabla h\big\|_\infty, \qquad (1.7)$$

which for $n = 1$ reproduces exactly (1.5).
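A one-dimensional quadrature illustration of (1.7) (not from the paper; the potential $f(x) = x^2/2 + x^4/4$, so $f''(x) = 1 + 3x^2$, and the test functions $g(x)=x$, $h(x)=\sin x$ are assumed choices):

```python
import numpy as np

# Check of cov(g,h) <= ||g'||_1 * sup_x |h'(x)| / f''(x) on R, for the
# log-concave measure dmu proportional to exp(-x^2/2 - x^4/4).
x = np.linspace(-8.0, 8.0, 200_001)
dx = x[1] - x[0]
w = np.exp(-(x**2 / 2 + x**4 / 4))
w /= w.sum() * dx                      # normalize dmu to a probability measure

mean = lambda u: np.sum(u * w) * dx

g, dg = x, np.ones_like(x)             # g(x) = x,      g' = 1
h, dh = np.sin(x), np.cos(x)           # h(x) = sin(x), h' = cos(x)

cov = mean(g * h) - mean(g) * mean(h)
rhs = mean(np.abs(dg)) * np.max(np.abs(dh) / (1 + 3 * x**2))

assert 0 < cov <= rhs
```

Here $\|g'\|_1 = 1$ and the supremum is attained at $x = 0$, so the right-hand side equals $1$, comfortably above the computed covariance.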

We also prove the following theorem. In addition to its intrinsic interest, it gives rise to an alternative proof, which we give later, of Theorem 1.1 in the case $p = \infty$ (though this proof only yields the sharp constant for $\mathbb{R}^1$, which is the original Otto–Menz case (1.5)).

Theorem 1.2 (Divided differences and gradients). Let $\mu$ be a probability measure with log-concave density (1.1). For any locally Lipschitz function $h$ on $\mathbb{R}^n$,

$$\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \frac{|h(x)-h(y)|}{|x-y|}\,\mathrm{d}\mu(x)\,\mathrm{d}\mu(y) \le 2^n \int_{\mathbb{R}^n} \big|\nabla h(x)\big|\,\mathrm{d}\mu. \qquad (1.8)$$

Remark 1.3. The constant $2^n$ is not optimal, as indicated by the examples in Section 4 (we will actually briefly mention how to reach the constant $2^{n/2}$). We do not know whether the correct constant grows with $n$ (and then how), or is bounded uniformly in $n$. We do know that for $n = 1$, the constant is at least $2\ln 2$. We will return to this later.

The rest of the paper is organized as follows: Section 2 contains the proof of Theorem 1.1, and Section 3 contains the proof of Theorem 1.2, as well as an explanation of the connection between the two theorems. Section 4 contains comments and examples concerning the constant and optimizers in Theorem 1.2. Section 5 contains a discussion of an application that motivated Otto and Menz, and finally, the Appendix provides some additional details on the original proof of the Brascamp–Lieb inequalities, which proceeds by induction on the dimension, and has an interesting connection with the application discussed in Section 5.

We end this introduction by expressing our gratitude to D. Bakry and M. Ledoux for fruitful exchanges on the preliminary version of our work. We originally proved (1.7) with the constant $n2^n$ using Theorem 1.2, as explained in Section 3. Bakry and Ledoux pointed out to us that using a stochastic representation of the gradient along the semi-group associated to $\mu$ (sometimes referred to as the Bismut formula), one could derive inequality (1.7) with the right constant 1. This provided evidence that something more algebraic was at stake. It was confirmed by our general statement Theorem 1.1 and by its proof below.

2. Bounds on covariance

The starting point of the proof we now give for Theorem 1.1 is a classical dual representation for the covariance which, in the somewhat parallel setting of plurisubharmonic potentials, goes back to the work of Hörmander. We shall then adapt to our $L^p$ setting Hörmander's $L^2$ approach [8] to spectral estimates.

Let $g$ and $h$ be smooth and compactly supported on $\mathbb{R}^n$. Define the operator $L$ by

$$L = \Delta - \nabla f \cdot \nabla \qquad (2.1)$$

and note that

$$\int_{\mathbb{R}^n} g(x)\,Lh(x)\,\mathrm{d}\mu(x) = -\int_{\mathbb{R}^n} \nabla g(x)\cdot\nabla h(x)\,\mathrm{d}\mu(x), \qquad (2.2)$$

so that $L$ is self-adjoint on $L^2(\mu)$. Let us (temporarily) add $\varepsilon|x|^2$ to $f$ to make it uniformly convex, so that the Hessian of $f$ is invertible and so that the operator $L$ has a spectral gap. (Actually, $L$ always has a spectral gap since $\mu$ is a log-concave probability measure, as noted in [1,9]. Our simple regularization makes our proof independent of these deep results.)

Then provided

$$\int_{\mathbb{R}^n} h(x)\,\mathrm{d}\mu(x) = 0, \qquad (2.3)$$

$$u := -\int_0^\infty e^{tL}h\,\mathrm{d}t \qquad (2.4)$$

exists and is in the domain of $L$, and satisfies $Lu = h$.

Thus, assuming (2.3), and by standard approximation arguments,

$$\operatorname{cov}(g,h) = \int_{\mathbb{R}^n} g(x)h(x)\,\mathrm{d}\mu(x) = \int_{\mathbb{R}^n} g(x)\,Lu(x)\,\mathrm{d}\mu(x) = -\int_{\mathbb{R}^n} \nabla g(x)\cdot\nabla u(x)\,\mathrm{d}\mu(x). \qquad (2.5)$$
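The representation (2.4)–(2.5) becomes fully explicit in the Ornstein–Uhlenbeck case $f(x) = x^2/2$, where $L = \mathrm{d}^2/\mathrm{d}x^2 - x\,\mathrm{d}/\mathrm{d}x$: for $h(x) = x$ one has $Lx = -x$, so $e^{tL}h = e^{-t}x$, $u = -x$, and (2.5) reduces to Stein's identity $\operatorname{cov}(g,x) = \mathbb{E}[g'(x)]$ for the standard Gaussian. The sketch below checks this numerically; the test function $g = \tanh$, the sample size, and the seed are assumed choices.

```python
import numpy as np

# Ornstein-Uhlenbeck illustration of (2.5): for f(x) = x^2/2 and h(x) = x,
# the solution of Lu = h is u(x) = -x, and (2.5) becomes Stein's identity
#   cov(g, x) = E[g'(x)]   for x ~ N(0,1).
rng = np.random.default_rng(1)
x = rng.standard_normal(4_000_000)

g, dg = np.tanh(x), 1.0 / np.cosh(x)**2     # g and its derivative
cov = np.mean(g * x) - np.mean(g) * np.mean(x)
stein = np.mean(dg)

assert abs(cov - stein) < 5e-3
```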


This representation for the covariance is the starting point of the proof we now give for Theorem 1.1.

Proof of Theorem 1.1. Fix $2 \le p < \infty$, and let $q = p/(p-1)$, as in the statement of the theorem. Suppose $h$ satisfies (2.3), and define $u$ by (2.4) so that $Lu = h$. Then from (2.5),

$$\begin{aligned}
\operatorname{cov}(g,h) &\le \int_{\mathbb{R}^n} \big|\nabla g(x)\cdot\nabla u(x)\big|\,\mathrm{d}\mu(x) \\
&\le \int_{\mathbb{R}^n} \big|(\operatorname{Hess} f)^{-1/p}\nabla g(x)\big|\,\big|(\operatorname{Hess} f)^{1/p}\nabla u(x)\big|\,\mathrm{d}\mu(x) \\
&\le \big\|(\operatorname{Hess} f)^{-1/p}\nabla g\big\|_q\,\big\|(\operatorname{Hess} f)^{1/p}\nabla u\big\|_p. \qquad (2.6)
\end{aligned}$$

Thus, to prove (1.6) for $2 \le p < \infty$, it suffices to prove the following $W^{1,p}$–$W^{1,p}$ type estimate:

$$\big\|(\operatorname{Hess} f)^{1/p}\nabla u\big\|_p \le \big\|\lambda_{\min}^{(2-p)/p}(\operatorname{Hess} f)^{-1/p}\nabla h\big\|_p. \qquad (2.7)$$

Toward this end, we compute

$$L\big(|\nabla u|^p\big) = p|\nabla u|^{p-2}\,\big(L(\nabla u)\big)\cdot\nabla u + p|\nabla u|^{p-2}\operatorname{Tr}\big(\operatorname{Hess}^2 u\big) + p(p-2)|\nabla u|^{p-4}\big|\operatorname{Hess} u\,\nabla u\big|^2 \ge p|\nabla u|^{p-2}\,\big(L(\nabla u)\big)\cdot\nabla u, \qquad (2.8)$$

where we have used the fact that $p \ge 2$, and where the notation $L(\nabla u)$ refers to the coordinate-wise action $(L\partial_1 u, \ldots, L\partial_n u)$ of $L$.

Then, using the commutation formula (see the remark below)

$$L(\nabla u) = \nabla(Lu) + \operatorname{Hess} f\,\nabla u, \qquad (2.9)$$

we obtain

$$0 = \int_{\mathbb{R}^n} L\big(|\nabla u|^p\big)\,\mathrm{d}\mu(x) \ge p\int_{\mathbb{R}^n} |\nabla u|^{p-2}\,\nabla u\cdot\nabla h\,\mathrm{d}\mu(x) + p\int_{\mathbb{R}^n} |\nabla u|^{p-2}\,\nabla u\cdot\operatorname{Hess} f\,\nabla u\,\mathrm{d}\mu(x),$$

and hence

$$\int_{\mathbb{R}^n} |\nabla u|^{p-2}\big|(\operatorname{Hess} f)^{1/2}\nabla u\big|^2\,\mathrm{d}\mu(x) \le \int_{\mathbb{R}^n} |\nabla u|^{p-2}\big|(\operatorname{Hess} f)^{1/p}\nabla u\big|\,\big|(\operatorname{Hess} f)^{-1/p}\nabla h\big|\,\mathrm{d}\mu(x). \qquad (2.10)$$

We now observe that for any positive $n\times n$ matrix $A$ and any vector $v \in \mathbb{R}^n$,

$$\big|A^{1/p}v\big|^p \le |v|^{p-2}\big|A^{1/2}v\big|^2.$$

To see this, note that we may suppose $|v| = 1$. Then in the spectral representation of $A$, by Jensen's inequality (applied to the concave function $t \mapsto t^{2/p}$ with the probability weights $v_j^2$),

$$\big|A^{1/p}v\big| = \Bigg(\sum_{j=1}^n \big(\lambda_j^{1/p}v_j\big)^2\Bigg)^{1/2} \le \Bigg(\sum_{j=1}^n \big(\lambda_j^{1/2}v_j\big)^2\Bigg)^{1/p}.$$

Using this on the left side of (2.10), and using the obvious estimate

$$|\nabla u| \le \lambda_{\min}^{-1/p}\big|(\operatorname{Hess} f)^{1/p}\nabla u\big|$$

on the right, we have

$$\big\|(\operatorname{Hess} f)^{1/p}\nabla u\big\|_p^p \le \int_{\mathbb{R}^n} \big|(\operatorname{Hess} f)^{1/p}\nabla u\big|^{p-1}\,\lambda_{\min}^{(2-p)/p}\big|(\operatorname{Hess} f)^{-1/p}\nabla h\big|\,\mathrm{d}\mu(x). \qquad (2.11)$$


Then by Hölder's inequality we obtain (2.7).

It is now obvious that we can take the limit in which $\varepsilon$ tends to zero, so that we obtain the inequality without any additional hypotheses on $f$. Our calculations so far have required $2 \le p < \infty$; however, having obtained the inequality for such $p$, by taking the limit in which $p$ goes to infinity, we obtain the $p = \infty$, $q = 1$ case of the theorem.

Finally, considering the case in which

$$\mathrm{d}\mu(x) = (2\pi)^{-n/2}e^{-|x|^2/2}\,\mathrm{d}x$$

and $g = h = x_1$, we have that $\operatorname{Hess} f = \mathrm{Id}$, and so

$$\lambda_{\min} = \big|(\operatorname{Hess} f)^{-1/p}\nabla g\big| = \big|(\operatorname{Hess} f)^{-1/p}\nabla h\big| = 1$$

for all $x$, and equality holds in (1.6); the constant is sharp, as claimed.
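The matrix inequality $|A^{1/p}v|^p \le |v|^{p-2}|A^{1/2}v|^2$ used in the proof can be spot-checked numerically; the random positive definite matrix, the vector, and the exponent $p = 3.7$ below are arbitrary test choices.

```python
import numpy as np

# Check of |A^{1/p} v|^p <= |v|^{p-2} |A^{1/2} v|^2 for a random positive
# definite matrix A and p >= 2, forming fractional powers via the spectral
# decomposition A = Q diag(lam) Q^T.
rng = np.random.default_rng(2)
n, p = 5, 3.7
M = rng.standard_normal((n, n))
A = M @ M.T + np.eye(n)                 # positive definite
lam, Q = np.linalg.eigh(A)

def frac_pow(a):
    """A^a via the spectral representation."""
    return Q @ np.diag(lam**a) @ Q.T

v = rng.standard_normal(n)
lhs = np.linalg.norm(frac_pow(1 / p) @ v)**p
rhs = np.linalg.norm(v)**(p - 2) * np.linalg.norm(frac_pow(0.5) @ v)**2

assert lhs <= rhs + 1e-12
```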

Remark 2.1. Many special cases and variants of the commutation relation (2.9) are well known under different names. Perhaps most directly relevant here is the case in which $f(x) = |x|^2/2$. Then $\partial_j$ and its adjoint in $L^2(\mu)$, $\partial_j^* = x_j - \partial_j$, satisfy the canonical commutation relations, and the operator $L = -\sum_{j=1}^n \partial_j^*\partial_j$ is (minus) the harmonic oscillator Hamiltonian in the ground state representation. This special case of (2.9), in which the Hessian on the right is the identity, is the basis of the standard determination of the spectrum of the quantum harmonic oscillator using "raising and lowering operators."

In the setting of Riemannian manifolds, a commutation relation analogous to (2.9), in which $L$ is the Laplace–Beltrami operator and the Hessian is replaced by Ric, the Ricci curvature tensor, is known as the Bochner–Lichnerowicz formula. Both the Hessian version (2.9) and the Bochner–Lichnerowicz version have been used a number of times to prove inequalities related to those we consider here, for instance in the work of Bakry and Émery on logarithmic Sobolev inequalities.

We note that our proof immediately extends, word for word, to the Riemannian setting if we use, in place of (2.9), the commutation satisfied by the operator $L$ given by (2.1), where $f$ is a (smooth) potential on the manifold; that is, with some abuse of notation, $L(\nabla u) = \nabla(Lu) + \operatorname{Hess} f\,\nabla u + \operatorname{Ric}\nabla u$, or rather, more rigorously,

$$L\big(|\nabla u|^p\big) \ge p|\nabla u|^{p-2}\big(\nabla(Lu)\cdot\nabla u + \operatorname{Hess} f\,\nabla u\cdot\nabla u + \operatorname{Ric}\nabla u\cdot\nabla u\big).$$

Thus, an analog of Theorem 1.1 holds on a Riemannian manifold $M$ equipped with a probability measure

$$\mathrm{d}\mu(x) = e^{-f(x)}\,\mathrm{d}\operatorname{vol}(x),$$

where $\mathrm{d}\operatorname{vol}$ is the Riemannian element of volume and $f$ a smooth function on $M$, provided $\operatorname{Hess} f$ at each point $x$ is replaced in the statement by the symmetric operator

$$H_x = \operatorname{Hess} f(x) + \operatorname{Ric}_x$$

defined on the tangent space. Of course, the convexity condition on $f$ is accordingly replaced by the assumption that $H_x > 0$ at every point $x \in M$.

3. Bounds on differences

Proof of Theorem 1.2. Since $h(x) - h(y) = \int_0^1 \nabla h(x_t)\cdot(x-y)\,\mathrm{d}t$, we have

$$\big|h(x) - h(y)\big| \le |x-y|\int_0^1 \big|\nabla h(x_t)\big|\,\mathrm{d}t, \quad \text{where } x_t := tx + (1-t)y. \qquad (3.1)$$

Next, by the convexity of $f$,

$$e^{-f(x)}e^{-f(y)} = e^{-(1-t)f(x)}e^{-tf(y)}\,e^{-tf(x)}e^{-(1-t)f(y)} \le e^{-f(x_t)}\,e^{-(1-t)f(x)}e^{-tf(y)}. \qquad (3.2)$$


Introduce the variables

$$w = tx + (1-t)y, \qquad z = x - y. \qquad (3.3)$$

A simple computation of the Jacobian shows that this change of variables is a measure preserving transformation for all $0 \le t \le 1$, and hence

$$\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \frac{|h(x)-h(y)|}{|x-y|}\,\mathrm{d}\mu(x)\,\mathrm{d}\mu(y) \le \int_0^1 \Bigg(\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \big|\nabla h(w)\big|\,e^{-(1-t)f(w+(1-t)z)}e^{-tf(w-tz)}\,\mathrm{d}z\,\mathrm{d}\mu(w)\Bigg)\mathrm{d}t. \qquad (3.4)$$

We estimate the right side of (3.4). By Hölder's inequality,

$$\int_{\mathbb{R}^n} e^{-(1-t)f(w+(1-t)z)}e^{-tf(w-tz)}\,\mathrm{d}^n z \le \Bigg(\int_{\mathbb{R}^n} e^{-f(w+(1-t)z)}\,\mathrm{d}z\Bigg)^{1-t}\Bigg(\int_{\mathbb{R}^n} e^{-f(w-tz)}\,\mathrm{d}z\Bigg)^{t}. \qquad (3.5)$$

But

$$\int_{\mathbb{R}^n} e^{-f(w+(1-t)z)}\,\mathrm{d}z = (1-t)^{-n} \quad\text{and}\quad \int_{\mathbb{R}^n} e^{-f(w-tz)}\,\mathrm{d}z = t^{-n},$$

and finally, $(1-t)^{-n(1-t)}t^{-nt} = e^{-n(t\log t + (1-t)\log(1-t))} \le 2^n$.

A corollary of Theorem 1.2 is a proof of Theorem 1.1 for the special case of $q = 1$ and $p = \infty$. This proof is not only restricted to this case, it also has the defect that the constant is not sharp, except in one dimension. We give it, nevertheless, because it establishes a link between the two theorems.
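The final elementary bound in the proof — that $t^{-t}(1-t)^{-(1-t)} \le 2$, i.e. that the binary entropy $H(t) = -t\log t - (1-t)\log(1-t)$ is at most $\log 2$, with equality at $t = 1/2$ — can be checked numerically (an illustrative check, not part of the proof):

```python
import numpy as np

# The proof of Theorem 1.2 ends with (1-t)^{-n(1-t)} t^{-nt} = e^{n H(t)},
# where H is the binary entropy.  H attains its maximum log 2 at t = 1/2,
# which is where the constant 2^n in (1.8) comes from.
t = np.linspace(1e-9, 1 - 1e-9, 100_001)   # grid contains t = 0.5 exactly
H = -t * np.log(t) - (1 - t) * np.log1p(-t)

assert abs(H.max() - np.log(2)) < 1e-8
assert abs(t[H.argmax()] - 0.5) < 1e-4
```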

Alternative proof of Theorem 1.1 for $q = 1$. We shall use the identity

$$\operatorname{cov}(g,h) = \frac{1}{2}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \big(g(x)-g(y)\big)\big(h(x)-h(y)\big)\,\mathrm{d}\mu(x)\,\mathrm{d}\mu(y), \qquad (3.6)$$

and estimate the differences on the right in different ways.

Fix any $x \ne y$ in $\mathbb{R}^n$, define the vector $v := x - y$, and for $0 \le t \le 1$, define $x_t = y + tv = tx + (1-t)y$. Then for any Lipschitz function $h$,

$$h(x) - h(y) = \int_0^1 v\cdot\nabla h(x_t)\,\mathrm{d}t. \qquad (3.7)$$

Now note that

$$\frac{\mathrm{d}}{\mathrm{d}t}\,v\cdot\nabla f(x_t) = \big(v, \operatorname{Hess} f(x_t)v\big) \ge |x-y|^2\lambda_{\min}(x_t) > 0. \qquad (3.8)$$

Integrating this in $t$ from 0 to 1, we obtain

$$\big(x-y, \nabla f(x) - \nabla f(y)\big) = \int_0^1 \big(v, \operatorname{Hess} f(x_t)v\big)\,\mathrm{d}t > 0, \qquad (3.9)$$

which expresses the well-known monotonicity of gradients of convex functions.
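This monotonicity can be illustrated numerically; the potential below (a quadratic plus a log-sum-exp term, an assumed test choice) is smooth and strictly convex on $\mathbb{R}^4$:

```python
import numpy as np

# Check of gradient monotonicity (3.9): for strictly convex f,
#   (x - y) . (grad f(x) - grad f(y)) > 0  whenever x != y.
# Test potential: f(x) = |x|^2/2 + log(sum_j exp(x_j)).
rng = np.random.default_rng(3)

def grad_f(x):
    e = np.exp(x - x.max())        # stable softmax = grad of log-sum-exp
    return x + e / e.sum()

vals = []
for _ in range(100):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    vals.append(np.dot(x - y, grad_f(x) - grad_f(y)))
min_dot = min(vals)

assert min_dot > 0
```

The strict positivity here comes from the quadratic part, which contributes $|x-y|^2$ to each inner product, matching the lower bound in (3.8).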


Next, multiplying and dividing by $(v, \operatorname{Hess} f(x_t)v)$ in (3.7), we obtain

$$\begin{aligned}
\big|h(x)-h(y)\big| &= \bigg|\int_0^1 \big(v, \operatorname{Hess} f(x_t)v\big)\,\big(v, \operatorname{Hess} f(x_t)v\big)^{-1}\,v\cdot\nabla h(x_t)\,\mathrm{d}t\bigg| \\
&\le \int_0^1 \big(v, \operatorname{Hess} f(x_t)v\big)\,\big(v, \operatorname{Hess} f(x_t)v\big)^{-1}\,\big|v\cdot\nabla h(x_t)\big|\,\mathrm{d}t \\
&\le \int_0^1 \big(v, \operatorname{Hess} f(x_t)v\big)\,\big(\lambda_{\min}(x_t)\big)^{-1}|x-y|^{-2}\,\big|v\cdot\nabla h(x_t)\big|\,\mathrm{d}t \\
&\le \sup_{z\in\mathbb{R}^n}\frac{|\nabla h(z)|}{\lambda_{\min}(z)}\;|x-y|^{-1}\int_0^1 \big(v, \operatorname{Hess} f(x_t)v\big)\,\mathrm{d}t \\
&= \sup_{z\in\mathbb{R}^n}\frac{|\nabla h(z)|}{\lambda_{\min}(z)}\;|x-y|^{-1}\big(x-y, \nabla f(x)-\nabla f(y)\big). \qquad (3.10)
\end{aligned}$$

Define

$$C := \sup_{z\in\mathbb{R}^n}\frac{|\nabla h(z)|}{\lambda_{\min}(z)}$$

and use (3.10) in (3.6):

$$\begin{aligned}
\operatorname{cov}(g,h) &\le \frac{1}{2}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \big|g(x)-g(y)\big|\,\big|h(x)-h(y)\big|\,\mathrm{d}\mu(x)\,\mathrm{d}\mu(y) \\
&\le \frac{C}{2}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \big|g(x)-g(y)\big|\,\frac{1}{|x-y|}(x-y)\cdot\big(\nabla f(x)-\nabla f(y)\big)\,\mathrm{d}\mu(x)\,\mathrm{d}\mu(y) \\
&= \frac{C}{2}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \big|g(x)-g(y)\big|\,\frac{1}{|x-y|}(x-y)\cdot\big(\nabla_y e^{-f(y)}\,e^{-f(x)} - \nabla_x e^{-f(x)}\,e^{-f(y)}\big)\,\mathrm{d}^n x\,\mathrm{d}^n y \\
&= -C\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \big|g(x)-g(y)\big|\,\frac{1}{|x-y|}(x-y)\cdot\nabla_x e^{-f(x)}\,e^{-f(y)}\,\mathrm{d}^n x\,\mathrm{d}^n y,
\end{aligned}$$

where, in the last line, we have used symmetry in $x$ and $y$.

Now integrate by parts in $x$. Suppose first that $n > 1$. Then

$$\operatorname{div}\bigg(\frac{z}{|z|}\bigg) = \frac{n-1}{|z|},$$

and $\big|\nabla_x|g(x)-g(y)|\big| = |\nabla_x g(x)|$ almost everywhere. Hence we obtain

$$\operatorname{cov}(g,h) \le C\Bigg(\int_{\mathbb{R}^n} \big|\nabla g(x)\big|\,\mathrm{d}\mu(x) + (n-1)\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \frac{|g(x)-g(y)|}{|x-y|}\,\mathrm{d}\mu(x)\,\mathrm{d}\mu(y)\Bigg). \qquad (3.11)$$

For $n = 1$, $\operatorname{div}(z/|z|) = 2\delta_0(z)$ and (3.11) is still valid since $|g(x)-g(y)|\,\delta_0(x-y) = 0$.

Now, for $n = 1$, (3.11) reduces directly to (1.5). For $n > 1$, it reduces to (1.7) upon application of Theorem 1.2, but with the constant $n2^n$ instead of 1.

4. Examples and remarks on optimizers in Theorem 1.2

Our first examples address the question of the importance of log-concavity.

(1) Some restriction on $\mu$ is necessary: If a measure $\mathrm{d}\mu(x) = F(x)\,\mathrm{d}x$ on $\mathbb{R}$ has $F(a) = 0$ for some $a \in \mathbb{R}$, and $F$ has positive mass to the left and right of $a$, then inequality (1.8) cannot possibly hold with any constant. The choice of $h$ to be the Heaviside step function with jump at $a$ shows this: the right side vanishes since $F(a) = 0$, while the left side is positive for this $\mu$.


(2) Unimodality is not enough: Take $\mathrm{d}\mu(x) = F(x)\,\mathrm{d}x$, with $F(x) = 1/(4\varepsilon)$ on $(-\varepsilon, \varepsilon)$ and $F(x) = 1/(4(1-\varepsilon))$ otherwise on the interval $(-1,1)$, and $F(x) = 0$ for $|x| > 1$. Let $g(x) = 1$ for $|x| < \varepsilon + \delta$ and $g(x) = 0$ otherwise. When $\delta$ is positive but small,

$$\int_{\mathbb{R}} |\nabla g|\,\mathrm{d}\mu(x) = \frac{1}{2(1-\varepsilon)} \quad\text{while}\quad \int_{\mathbb{R}}\int_{\mathbb{R}} \frac{|g(x)-g(y)|}{|x-y|}\,\mathrm{d}\mu(x)\,\mathrm{d}\mu(y) = O\big(-\ln(\varepsilon)\big).$$

(3) For $n = 1$, the best constant in (1.8) is at least $2\ln 2$: Take $\mathrm{d}\mu(x) = F(x)\,\mathrm{d}x$, with $F(x) = 1/2$ on $(-1,1)$ and $F(x) = 0$ for $|x| > 1$. Let $g(x) = 1$ for $x \ge 0$ and $g(x) = 0$ for $x < 0$. All integrals are easily computed.
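Example (3) can also be reproduced numerically. The midpoint-rule computation below groups grid pairs by the value of $x - y$ to keep memory linear in the grid size $N$ (an arbitrary choice); the double integral evaluates to $\log 2$ and the gradient term to $F(0) = 1/2$, giving the ratio $2\log 2$:

```python
import numpy as np, math

# Example (3): dmu uniform on (-1,1) (density F = 1/2), h = indicator of
# {x >= 0}.  On the midpoint grids x_i = (i+1/2)/N and y_j = -(j+1/2)/N,
# x_i - y_j = (i+j+1)/N depends only on s = i+j+1, which occurs with
# multiplicity min(s, 2N-s).
N = 200_000
s = np.arange(1, 2 * N)
cnt = np.minimum(s, 2 * N - s)

# |h(x)-h(y)| = 1 exactly when x, y have opposite signs; weight
# F(x)F(y) = 1/4, and the two opposite-sign quadrants contribute equally.
lhs = 2 * 0.25 * np.sum(cnt / s) / N    # converges to log 2
rhs_factor = 0.5                        # int |h'| dmu = F(0) = 1/2

assert abs(lhs - math.log(2)) < 1e-3
```

The ratio `lhs / rhs_factor` is approximately $2\log 2 \approx 1.386$, confirming the lower bound on the best constant for $n = 1$.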

(4) The best constant is achieved for characteristic functions: When seeking the best constant in (1.8), it suffices, by a standard truncation argument, to consider bounded Lipschitz functions $h$. Then, since neither side of the inequality is affected if we add a constant to $h$, it suffices to consider non-negative Lipschitz functions. We use the layer-cake representation [11]:

$$h(x) = \int_0^\infty \chi_{\{h>t\}}(x)\,\mathrm{d}t.$$

Then

$$\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \frac{|h(x)-h(y)|}{|x-y|}\,\mathrm{d}\mu(x)\,\mathrm{d}\mu(y) \le \int_0^\infty \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \frac{|\chi_{\{h>t\}}(x)-\chi_{\{h>t\}}(y)|}{|x-y|}\,\mathrm{d}\mu(x)\,\mathrm{d}\mu(y)\,\mathrm{d}t. \qquad (4.1)$$

Define $C_n$ to be the best constant for characteristic functions of sets $A$ and log-concave measures $\mu$:

$$C_n := \sup_{f,A}\; \frac{\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} |\chi_A(x)-\chi_A(y)|/|x-y|\,\mathrm{d}\mu(x)\,\mathrm{d}\mu(y)}{\int_{\partial A} e^{-f(x)}\,\mathrm{d}\mathcal{H}^{n-1}(x)}, \qquad (4.2)$$

where $\mathcal{H}^{n-1}$ denotes $n-1$ dimensional Hausdorff measure. Apply this to (4.1) to conclude that

$$\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \frac{|h(x)-h(y)|}{|x-y|}\,\mathrm{d}\mu(x)\,\mathrm{d}\mu(y) \le C_n \int_0^\infty \int_{\partial\{h>t\}} e^{-f(x)}\,\mathrm{d}\mathcal{H}^{n-1}(x)\,\mathrm{d}t = C_n \int_{\mathbb{R}^n} \big|\nabla h(x)\big|\,\mathrm{d}\mu(x), \qquad (4.3)$$

where the co-area formula was used in the last line. Thus, inequality (1.8) holds with the constant $C_n$; in short, it suffices to consider characteristic functions as trial functions. Note that the argument is also valid at the level of each measure $\mu$ individually, although we are interested here in uniform bounds.
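The layer-cake representation used above can be checked numerically on an assumed test function, discretizing the $t$-integral:

```python
import numpy as np

# Check of the layer-cake representation h(x) = int_0^inf 1_{h>t}(x) dt for
# a nonnegative bounded function, with the t-integral replaced by a Riemann
# sum on a uniform grid of spacing dt.
x = np.linspace(-3.0, 3.0, 1001)
h = np.exp(-x**2) * (2 + np.sin(5 * x))        # nonnegative test function

t = np.linspace(0.0, h.max(), 20_001)
dt = t[1] - t[0]
layer = (h[:, None] > t[None, :]).sum(axis=1) * dt   # dt * #{t_k < h(x)}

err = np.max(np.abs(layer - h))
assert err < 2 * dt                            # error is O(dt), as expected
```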

With characteristic functions in mind, let us consider the case that $g$ is the characteristic function of a half-space in $\mathbb{R}^n$. Without loss of generality let us take this to be $\{x: x_1 < 0\}$. Clearly, the left side of (1.8) is less than the integral with $|x-y|^{-1}$ replaced by $|x_1-y_1|^{-1}$. Since the marginal (obtained by integrating over $x_2, \ldots, x_n$) of a log-concave function is log-concave, we see that our inequality reduces to the one-dimensional case. In other words, the constant $C_n$ in (4.2) would equal $C_1$, independent of $n$, if the supremum were restricted to half-spaces instead of to arbitrary measurable sets.

(5) Improved constants and geometry of log-concave measures: With additional assumptions on the measure one can see that the constant is not only bounded in $n$, but of order $1/\sqrt{n}$. We are grateful to F. Barthe and M. Ledoux for discussions and improvements in particular cases concerning the constant in Theorem 1.2. This relies on the Cheeger constant $\alpha(\mu)^{-1} > 0$ associated to the log-concave probability measure $\mathrm{d}\mu$, which is defined to be the best constant in the inequality

$$\forall A \subset \mathbb{R}^n \text{ (regular enough)}, \qquad \mu(A)\big(1-\mu(A)\big) \le \alpha(\mu)\int_{\partial A} e^{-f(x)}\,\mathrm{d}\mathcal{H}^{n-1}(x).$$


M. Ledoux suggested the following procedure. Split the function $|x-y|^{-1}$ into two pieces according to whether $|x-y|$ is less than or greater than $R$, for some $R > 0$. With $h$ being the characteristic function of $A$, the contribution to the left side of (4.3) for $|x-y| > R$ is bounded above by $2R^{-1}\alpha(\mu)\int_{\partial A} e^{-f(x)}\,\mathrm{d}\mathcal{H}^{n-1}(x)$. The contribution for $|x-y| \le R$ is bounded above in the same manner as in the proof of Theorem 1.2, but this time we only have to integrate $z$ over the domain $|z| \le R$ in each of the integrals in (3.5). Thus, our bound $2^n$ is improved by a factor, which is the $\mathrm{d}\mu$ volume of the ball $B_R = \{|z| \le R\}$, once we use the Brunn–Minkowski inequality for the bound

$$\mu\big((1-t)B_R + w\big)^{1-t}\,\mu\big(tB_R + w'\big)^{t} \le \mu\big(B_R + (1-t)w + tw'\big) \le \overline{\mu}(B_R) := \sup_x \mu(B_R + x).$$

The final step is to optimize the sum of the contributions of the two terms with respect to $R$. Thus, if we denote by $C_n(\mu)$ the best constant in the inequality (1.8) of Theorem 1.2 for a fixed measure $\mu$, we have

$$C_n(\mu) \le \inf_{R>0}\Big(2^n\,\overline{\mu}(B_R) + 2R^{-1}\alpha(\mu)\Big) \le 2^n. \qquad (4.4)$$

Note that if $\mu$ is symmetric (i.e. if $f$ is even), then the Brunn–Minkowski inequality ensures that $\overline{\mu}(B_R) = \mu(B_R)$.

Unlike in (1.8), this improved bound depends on $\mu$, but there are situations where this gives optimal estimates, as pointed out to us by F. Barthe. As an example, consider the case where $\mu$ is the standard Gaussian measure on $\mathbb{R}^n$. Using the known value of the Cheeger constant for this $\mu$, and linear trial functions, one finds that the constant is bounded above and below by a constant times $n^{-1/2}$.

Actually, we can use (4.4) to improve the constant from $2^n$ to $2^{n/2}$ for arbitrary measures using some recent results from the geometry of log-concave measures. Without loss of generality, we can assume, by translation of $\mu$, that

$$\int |x|\,\mathrm{d}\mu(x) = \inf_v \int |x+v|\,\mathrm{d}\mu(x) =: M_\mu.$$

It was proved in [1,9] that for every log-concave measure on $\mathbb{R}^n$,

$$\alpha(\mu) \le c\,M_\mu,$$

where $c > 0$ is some numerical constant (meaning a possibly large, but computable, constant, in particular independent of $n$ and $\mu$, of course). On the other hand, it was proved by Guédon [7] that for every log-concave measure $\nu$ on $\mathbb{R}^n$,

$$\nu(B_R) \le C\,\frac{R}{\int |x|\,\mathrm{d}\nu}$$

for some numerical constant $C > 0$. In the case $\mu$ is not symmetric, we pick $v$ such that $\mu(B_R + v) = \overline{\mu}(B_R)$, and then we apply the previous bound to $\nu(\cdot) = \mu(\cdot + v)$ in order to get that $\overline{\mu}(B_R) \le CR/M_\mu$. Using these two estimates in (4.4), with $s$ proportional to $R/M_\mu$, we see that

$$C_n(\mu) \le \inf_{s>0}\big(C2^n s + c/s\big) = \kappa\,2^{n/2}$$

for some numerical constant $\kappa > 0$.
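The elementary optimization in the last display — $\inf_{s>0}(as + b/s) = 2\sqrt{ab}$ for positive constants $a$, $b$, attained at $s = \sqrt{b/a}$, which with $a = C2^n$ produces the factor $2^{n/2}$ — can be checked numerically ($a$ and $b$ below are arbitrary test values):

```python
import numpy as np

# inf over s > 0 of a*s + b/s equals 2*sqrt(a*b), attained at s = sqrt(b/a).
a, b = 7.0, 3.0
s = np.linspace(1e-3, 10.0, 1_000_001)
vals = a * s + b / s
inf_num = vals.min()

assert abs(inf_num - 2 * np.sqrt(a * b)) < 1e-4
```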

The Brascamp–Lieb inequality (1.3), as well as inequality (1.8), have connections with the geometry of convex bodies. It was observed in [2] that (1.3) can be deduced from the Prékopa–Leindler inequality (which is a functional form of the Brunn–Minkowski inequality). But the converse is also true: the Prékopa theorem follows, by a local computation, from the Brascamp–Lieb inequality (see [5] where the procedure is explained in the more general complex setting). To sum up, the Brascamp–Lieb inequality (1.3) can be seen as the local form of the Brunn–Minkowski inequality for convex bodies.

5. Application to conditional expectations

Otto and Menz were motivated to prove (1.5) for an application that involves a large amount of additional structure that we cannot go into here. We shall however give an application of Theorem 1.1 to a type of estimate that is related to one of the central estimates in [12].

We use the notation in [4], which is adapted to working with a partitioned set of variables. Write a point $x \in \mathbb{R}^{n+m}$ as $x = (y,z)$ with $y \in \mathbb{R}^m$ and $z \in \mathbb{R}^n$. For a function $A$ on $\mathbb{R}^{n+m}$, let $\langle A\rangle_z(y)$ denote the conditional expectation of $A$ given $y$, with respect to $\mu$. For a function $B$ of $y$ alone, $\langle B\rangle_y$ is the expected value of $B$, with respect to $\mu$. As in [4], a subscript $y$ or $z$ on a function denotes differentiation with respect to $y$ or $z$, while a subscript $y$ or $z$ on a bracket denotes integration. For instance, for a function $g$ on $\mathbb{R}^{n+m}$, $g_y$ denotes the vector $(\partial g/\partial y_i)_{i\le m}$ in $\mathbb{R}^m$, and for $i \le m$, $g_{y_i z}$ denotes the vector $(\partial^2 g/\partial y_i\partial z_j)_{j\le n}$ in $\mathbb{R}^n$. Finally, $(g_{yz})$ denotes the matrix having the previous vectors as rows.

Let $h$ be non-negative with $\langle h\rangle_x = 1$, so that $h(x)\,\mathrm{d}\mu(x)$ is a probability measure, and so is $\langle h\rangle_z(y)\,\mathrm{d}\nu(y)$, where $\mathrm{d}\nu(y)$ is the marginal distribution of $y$ under $\mathrm{d}\mu(x)$.

A problem that frequently arises [3,6,10,12,13] is to estimate the Fisher information of $\langle h\rangle_z(y)\,\mathrm{d}\nu(y)$ in terms of the Fisher information of $h(x)\,\mathrm{d}\mu(x)$ by proving an estimate of the form

$$\Bigg\langle \frac{|(\langle h\rangle_z)_y|^2}{\langle h\rangle_z}\Bigg\rangle_y \le C\,\Bigg\langle \frac{|h_x|^2}{h}\Bigg\rangle_x. \qquad (5.1)$$

Direct differentiation under the integral sign in the variable $y_i$ gives

$$\big(\langle h\rangle_z\big)_{y_i} = \langle h_{y_i}\rangle_z - \operatorname{cov}_z(h, f_{y_i}),$$

where $\operatorname{cov}_z$ denotes the conditional covariance of $h(y,z)$ and $f_{y_i}(y,z)$, integrating in $z$ for each fixed $y$. Let $u = (u_1, \ldots, u_m)$ be any unit vector in $\mathbb{R}^m$. Then, for each $y$,

$$\big(\langle h\rangle_z\big)_y\cdot u = \sum_{i=1}^m \big(\langle h\rangle_z\big)_{y_i} u_i = \sum_{i=1}^m \langle h_{y_i}\rangle_z u_i - \sum_{i=1}^m \operatorname{cov}_z(h, f_{y_i})u_i = \langle h_y\cdot u\rangle_z - \operatorname{cov}_z(h, f_y\cdot u),$$

and hence, choosing $u$ to maximize the left hand side,

$$\big|\big(\langle h\rangle_z\big)_y\big|^2 \le 2\,\langle h_y\cdot u\rangle_z^2 + 2\,\big(\operatorname{cov}_z(h, f_y\cdot u)\big)^2. \qquad (5.2)$$

By (1.6),

$$\big|\operatorname{cov}_z(h, f_y\cdot u)\big| \le \big\langle |h_z|\big\rangle_z\,\big\|\lambda_{\min}^{-1}(f_y\cdot u)_z\big\|_\infty. \qquad (5.3)$$

Note that the least eigenvalue of the $n\times n$ block $f_{zz}$ is at least as large as the least eigenvalue $\lambda_{\min}(y,z)$ of the full Hessian, by the variational principle. Hence, while we are entitled to use the least eigenvalue of the $n\times n$ block $f_{zz}$ of the full $(n+m)\times(n+m)$ Hessian matrix $f_{xx}$, and this would be important in the application in the one-dimensional case made in [12], here, without any special structure to take advantage of, we simply use the least eigenvalue of the full matrix in our bound.

Next note that

$$\big|(f_y\cdot u)_z\big|^2 = \big\langle u, f_{yz}f_{yz}^{T}u\big\rangle,$$

where $f_{yz}$ denotes the upper right corner block of the Hessian matrix, so that $\sum_{j=1}^n (f_{y_i,z_j})^2$ is the $i,i$ entry of $f_{yz}f_{yz}^{T}$. The matrix $f_{yz}f_{yz}^{T}$ is dominated by the corresponding corner block of the square of the full Hessian matrix, whose quadratic form is, in turn, no greater than $\lambda_{\max}^2$. Then, since $u$ is a unit vector, we have $|(f_y\cdot u)_z| \le \lambda_{\max}$. Using this in (5.3), we obtain

$$\big|\operatorname{cov}_z(h, f_y\cdot u)\big| \le \big\langle |h_z|\big\rangle_z\,\frac{\lambda_{\max}}{\lambda_{\min}}, \qquad (5.4)$$

and then from (5.2),

$$\big|\big(\langle h\rangle_z\big)_y\big|^2 \le 2\,\langle h_y\cdot u\rangle_z^2 + 2\,\bigg(\frac{\lambda_{\max}}{\lambda_{\min}}\bigg)^2\big\langle |h_z|\big\rangle_z^2. \qquad (5.5)$$

Then the Cauchy–Schwarz inequality yields

$$\big\langle |h_z|\big\rangle_z^2 \le \Bigg\langle \frac{|h_z|^2}{h}\Bigg\rangle_z\,\langle h\rangle_z. \qquad (5.6)$$

Use this in (5.5), divide both sides by $\langle h\rangle_z$, and integrate in $y$. The joint convexity in $A$ and $\alpha > 0$ of $A^2/\alpha$ yields (5.1) with the constant $C = 2(\lambda_{\max}/\lambda_{\min})^2$.

The bound we have obtained becomes useful when $\lambda_{\max}(x)/\lambda_{\min}(x)$ is bounded uniformly. Suppose that $f(x)$ has the form $f(x) = \varphi(|x|^2)$. Then the eigenvalues of the Hessian of $f$ are $2\varphi'(|x|^2)$, with multiplicity $m+n-1$, and $4\varphi''(|x|^2)|x|^2 + 2\varphi'(|x|^2)$, with multiplicity 1. Then both eigenvalues are positive, and the ratio is bounded, whenever $\varphi'$ is positive and, for some $c < 1 < C < \infty$,

$$c\,\varphi'(s) \le \varphi'(s) + 2s\,\varphi''(s) \le C\,\varphi'(s).$$
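The eigenvalue computation above can be verified numerically; the choice $\varphi(s) = s + s^2/4$ below is an assumed test potential, not one from the paper:

```python
import numpy as np

# For f(x) = phi(|x|^2) the Hessian is 2 phi'(s) Id + 4 phi''(s) x x^T with
# s = |x|^2, so its eigenvalues are 2 phi'(s) (multiplicity d-1) and
# 2 phi'(s) + 4 phi''(s) s (multiplicity 1).  Test: phi(s) = s + s^2/4,
# so phi'(s) = 1 + s/2 and phi''(s) = 1/2.
rng = np.random.default_rng(4)
d = 6
x = rng.standard_normal(d)
s = x @ x
phi1, phi2 = 1 + s / 2, 0.5

H = 2 * phi1 * np.eye(d) + 4 * phi2 * np.outer(x, x)
eig = np.sort(np.linalg.eigvalsh(H))

assert np.allclose(eig[:-1], 2 * phi1)              # d-1 equal eigenvalues
assert np.isclose(eig[-1], 2 * phi1 + 4 * phi2 * s) # one larger eigenvalue
```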

Remark 5.1 (Other asymmetric variants of the BL inequality). Together, (5.3) and (5.6) yield

$$\frac{(\operatorname{cov}_z(h, f_y\cdot u))^2}{\langle h\rangle_z} \le \Bigg\langle \frac{|h_z|^2}{h}\Bigg\rangle_z\,\big\|\lambda_{\min}^{-1}(f_y\cdot u)_z\big\|_\infty^2.$$

A weaker inequality is

$$\frac{(\operatorname{cov}_z(h, f_y\cdot u))^2}{\langle h\rangle_z} \le \Bigg\langle \frac{|h_z|^2}{h}\Bigg\rangle_z\,\big\|\lambda_{\min}^{-1}\big\|_\infty^2\,\big\|(f_y\cdot u)_z\big\|_\infty^2. \qquad (5.7)$$

In the context of the application in [12], finiteness of $\|(f_y\cdot u)_z\|_\infty$ limits $f$ to quadratic growth at infinity. A major contribution of [12] is to remove this limitation in applications of (5.1). The success of this application of (1.5) depended on the full weight of the inverse Hessian being allocated to the $L^\infty$ term.

Nonetheless, once the topic of asymmetric BL inequalities is raised, one might enquire whether an inequality of the type

$$\operatorname{cov}(g,h) \le C\,\|\nabla g\|_\infty\,\big\|(\operatorname{Hess} f)^{-1}\nabla h\big\|_1 \qquad (5.8)$$

can hold for any constant $C$. There is no such inequality, even in one dimension. To see this, suppose that for some $a \in \mathbb{R}$ and some $\varepsilon > 0$, $f_{xx} > M$ on $(a-\varepsilon, a+\varepsilon)$. Take $h(x) = 1$ for $x > a$ and $h(x) = 0$ for $x \le a$. Take $g(x) = x - a$. Suppose that $f$ is even about $a$. Then $\operatorname{cov}(g,h) = \int_a^\infty (x-a)e^{-f(x)}\,\mathrm{d}x$, while $\|(\operatorname{Hess} f)^{-1}\nabla h\|_1 \le M^{-1}$, and $f$ can be chosen to make $M$ arbitrarily large while keeping $\|\nabla g\|_\infty \le 1$ and $\operatorname{cov}(g,h)$ bounded away from zero.

Appendix

We recall that the original proof of (1.3), Theorem 4.1 of [4], used dimensional induction, though interesting non-inductive proofs have since been provided [2].

The starting point for the inductive proof is that the proof for $n = 1$ is elementary. The proof of the inductive step is more involved, and we take this opportunity to provide more detail about the passage from Eq. (4.9) of [4] to Eq. (4.10) of [4]. There is an interesting connection with the application discussed in the previous section, which also concerns $\langle h_y\rangle_z - \operatorname{cov}_z(h, f_y)$. We continue using the notation introduced there, but now $m = 1$ (i.e. $y \in \mathbb{R}$).

Equation (4.9) reads $\operatorname{var}(h) \le \langle B\rangle_y$, where

$$B = \operatorname{var}_z(h) + \frac{\big[\langle h_y\rangle_z - \operatorname{cov}_z(h, f_y)\big]^2}{\langle f_{yy}\rangle_z - \operatorname{var}_z f_y}. \qquad (\mathrm{A.1})$$


Our goal is to prove

$$B \le \big\langle \big(h_z, f_{zz}^{-1}h_z\big)\big\rangle_z + \frac{\big[\langle h_y\rangle_z - \langle (h_z, f_{zz}^{-1}f_{yz})\rangle_z\big]^2}{\langle f_{yy}\rangle_z - \langle (f_{yz}, f_{zz}^{-1}f_{yz})\rangle_z}. \qquad (\mathrm{A.2})$$

To do this, use the inductive hypothesis; i.e., for any function $H$ of $z$ alone,

$$\operatorname{var}_z(H) \le \big\langle \big(H_z, f_{zz}^{-1}H_z\big)\big\rangle_z. \qquad (\mathrm{A.3})$$

Apply this to an arbitrary linear combination $H = \lambda h + \mu f_y$ to conclude the $2\times 2$ matrix inequality

$$\begin{pmatrix} \operatorname{var}_z(h) & \operatorname{cov}_z(h, f_y) \\ \operatorname{cov}_z(h, f_y) & \operatorname{var}_z(f_y) \end{pmatrix} \le \begin{pmatrix} \langle (h_z, f_{zz}^{-1}h_z)\rangle_z & \langle (h_z, f_{zz}^{-1}f_{yz})\rangle_z \\ \langle (f_{yz}, f_{zz}^{-1}h_z)\rangle_z & \langle (f_{yz}, f_{zz}^{-1}f_{yz})\rangle_z \end{pmatrix}.$$

Take the determinant of the difference to find that

$$\big\langle (h_z, f_{zz}^{-1}h_z)\big\rangle_z - \operatorname{var}_z(h) \ge \frac{\big[\langle (h_z, f_{zz}^{-1}f_{yz})\rangle_z - \operatorname{cov}_z(h, f_y)\big]^2}{\langle (f_{yz}, f_{zz}^{-1}f_{yz})\rangle_z - \operatorname{var}_z(f_y)}. \qquad (\mathrm{A.4})$$

Combine (A.1) and (A.4) to obtain

$$B \le \big\langle (h_z, f_{zz}^{-1}h_z)\big\rangle_z + \frac{\big[\langle h_y\rangle_z - \operatorname{cov}_z(h, f_y)\big]^2}{\langle f_{yy}\rangle_z - \operatorname{var}_z(f_y)} - \frac{\big[\langle (h_z, f_{zz}^{-1}f_{yz})\rangle_z - \operatorname{cov}_z(h, f_y)\big]^2}{\langle (f_{yz}, f_{zz}^{-1}f_{yz})\rangle_z - \operatorname{var}_z(f_y)}. \qquad (\mathrm{A.5})$$

Since $a^2/\alpha$ is jointly convex in $a$ and $\alpha > 0$, and is homogeneous of degree one, for all $\alpha > \beta > 0$ and all $a$ and $b$,

$$\frac{a^2}{\alpha} \le \frac{b^2}{\beta} + \frac{(a-b)^2}{\alpha-\beta}.$$

That is, $a^2/\alpha - b^2/\beta \le (a-b)^2/(\alpha-\beta)$. Use this on the right side of (A.5) to obtain (A.2), noting that the positivity of $\alpha - \beta = \langle f_{yy}\rangle_z - \langle (f_{yz}, f_{zz}^{-1}f_{yz})\rangle_z$ is a consequence of the positivity of the Hessian of $f$.
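The elementary inequality just used can be spot-checked numerically on random values (an illustrative check only):

```python
import numpy as np

# Check of a^2/alpha <= b^2/beta + (a-b)^2/(alpha-beta) for alpha > beta > 0,
# the consequence of the joint convexity and degree-one homogeneity of
# (a, alpha) -> a^2/alpha used in the Appendix.
rng = np.random.default_rng(5)
a, b = rng.standard_normal(2)
beta = rng.uniform(0.1, 1.0)
alpha = beta + rng.uniform(0.1, 1.0)     # ensures alpha > beta > 0

gap = b**2 / beta + (a - b)**2 / (alpha - beta) - a**2 / alpha
assert gap >= 0
```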

References

[1] S. G. Bobkov. Isoperimetric and analytic inequalities for log-concave probability measures. Ann. Probab. 27 (1999) 1903–1921. MR1742893

[2] S. Bobkov and M. Ledoux. From Brunn–Minkowski to Brascamp–Lieb and to logarithmic Sobolev inequalities. Geom. Funct. Anal. 10 (2000) 1028–1052. MR1800062

[3] T. Bodineau and B. Helffer. The log-Sobolev inequality for unbounded spin systems. J. Funct. Anal. 166 (1999) 168–178. MR1704666

[4] H. J. Brascamp and E. H. Lieb. On extensions of the Brunn–Minkowski and Prékopa–Leindler theorems, including inequalities for log-concave functions, and with an application to the diffusion equation. J. Funct. Anal. 22 (1976) 366–389. MR0450480

[5] D. Cordero-Erausquin. On Berndtsson's generalization of Prékopa's theorem. Math. Z. 249 (2005) 401–410. MR2115450

[6] N. Grunewald, F. Otto, C. Villani and M. G. Westdickenberg. A two-scale approach to logarithmic Sobolev inequalities and the hydrodynamic limit. Ann. Inst. Henri Poincaré Probab. Stat. 45 (2009) 302–351. MR2521405

[7] O. Guédon. Kahane–Khinchine type inequalities for negative exponent. Mathematika 46 (1999) 165–173. MR1750653

[8] L. Hörmander. $L^2$ estimates and existence theorems for the $\bar\partial$ operator. Acta Math. 113 (1965) 89–152. MR0179443

[9] R. Kannan, L. Lovász and M. Simonovits. Isoperimetric problems for convex bodies and a localization lemma. Discrete Comput. Geom. 13 (1995) 541–559. MR1318794

[10] C. Landim, G. Panizo and H. T. Yau. Spectral gap and logarithmic Sobolev inequality for unbounded conservative spin systems. Ann. Inst. Henri Poincaré Probab. Stat. 38 (2002) 739–777. MR1931585

[11] E. H. Lieb and M. Loss. Analysis, 2nd edition. Amer. Math. Soc., Providence, RI, 2001. MR1817225

[12] G. Menz and F. Otto. Uniform logarithmic Sobolev inequalities for conservative spin systems with super-quadratic single-site potential. Preprint, 2011.

[13] F. Otto and M. G. Reznikoff. A new criterion for the logarithmic Sobolev inequality and two applications. J. Funct. Anal. 243 (2007) 121–157. MR2291434
