Wasserstein geometry of porous medium equation

(1)

Ann. I. H. Poincaré – AN 29 (2012) 217–232

www.elsevier.com/locate/anihpc

Wasserstein geometry of porous medium equation

^✩

Asuka Takatsu

^a,b,c,^∗

aGraduate School of Mathematics, Nagoya University, Nagoya 464-8602, Japan bInstitut des Hautes Études Scientifiques, Bures-sur-Yvette 91440, France cMax-Planck-Intitut für Mathematik, Vivatsgasse 7, 53111 Bonn, Germany

Received 18 February 2010; received in revised form 23 August 2011; accepted 11 October 2011 Available online 19 October 2011

Abstract

We study the porous medium equation with emphasis onq-Gaussian measures, which are generalizations of Gaussian measures by using power-law distribution. On the space ofq-Gaussian measures, the porous medium equation is reduced to an ordinary differential equation for covariance matrix. We introduce a set of inequalities among functionals which gauge the difference between pairs of probability measures and are useful in the analysis of the porous medium equation. We show that anyq-Gaussian measure provides a nontrivial pair attaining equality in these inequalities.

MSC:60D05; 46E27

Keywords:q-Gaussian measure; Porous medium equation; Functional inequality

1. Introduction

Aq-Gaussian measure is one of power-law distributions onR^d, which is described by mean, covariance matrix parameters and theq-exponential functiongiven by

exp_q(t ):=

1+(1−q)t1−q¹

+ forq∈Qd:=(0,1)∪

1,d+4 d+2

,

where we put[x]+:=max{x,0}and by convention 0^a:= ∞for any negative number a (see (3.1) for the precise definition of q-Gaussian measures). The q-Gaussian measure is considered as an approximation of the Gaussian measure since theq-exponential function recovers the usual exponential function in the limitq→1. This paper aims to demonstrate that theq-Gaussian measure inherits some features of the Gaussian measure. To do this, let us start by recalling properties of the Gauss measure.

For any v∈R^d andV ∈Sym⁺(d,R), which is the set of symmetric positive definite matrices of size d, the Gaussian measureN (v, V )with meanvand covariance matrixV is given by

✩ This work was partially supported by Research Fellowships of the Japan Society for the Promotion of Science for Young Scientists and by JSPS–IHÉS (EPDI) Fellowship.

* Correspondence to: Graduate School of Mathematics, Nagoya University, Nagoya 464-8602, Japan.

E-mail address:[email protected].

doi:10.1016/j.anihpc.2011.10.003

(2)

N (v, V ):=exp

−1 2

x−v, V⁻¹(x−v)

det(2π V ) ⁻

1 2L^d,

whereL^d stands for the Lebesgue measure onR^d. For example, the density ofN (0,2t Id), whereId is the identity matrix of sizedandt >0, is the heat kernel which is a self-similar solution to the heat equation

∂

∂tρ=ρ.

As well as the heat kernel, a solution to the heat equation with an initial data being a Gaussian density remains Gaussian densities for all future time, that is, the space of Gaussian measures is stable under the heat equation. On the space of Gaussian measures, the heat equation is reduced to an ordinary differential equation for covariance matrix.

It simplifies not only the analysis of the heat equation but also the analysis of the Wasserstein space to restrict arguments to the space of Gaussian measures. TheWasserstein spaceis the spaceP2of Borel probability measures onR^dhaving finite second moments equipped with a distance functionW₂defined by

W₂(μ, ν):=inf

R^d×R^d

|x−y|²dπ(x, y) ¹

2 π: a transport plan ofμandν

,

where atransport planofμandνis a probability measure onR^d×R^dsuch thatπ[B×R^d] =μ[B]andπ[R^d×B] = ν[B]for any Borel setB⊂R^d. A transport plan is said to beoptimalif it attains the infimum above, which always exits (see [24, Chapter 4]). Though the explicit expression of an optimal transport plan is not usually obtained, the explicit expression of an optimal transport plan of a pair of Gaussian measures is known, which guarantees the convexity of the space of Gaussian measures in Wasserstein geometry. The restriction to the space makes it possible to analyze Wasserstein geometry in detail (see [17]).

We furthermore focus on the fact that the Gaussian measure satisfies a set of inequalities among the relative en- tropyH, the Fisher informationI and the Wasserstein distance functionW2, all of which gauge differences between pairs of probability measures and are defined by

H (μ|ν):= ^R^d ^dμ^dν ln^dμ_dνdν ifμis absolutely continuous with respect toν,

+∞ otherwise,

I (μ|ν):= ^R^d|∇ln^dμ_dν|²dμ ifμis absolutely continuous with respect toν,

+∞ otherwise.

We say that a probability measureνsatisfies the logarithmic Sobolev inequalitywith constantλ, in short LS(λ) if we have

H (μ|ν) 1 2λI (μ|ν)

for all absolutely continuous probability measure μ with respect toν. Similarly, the probability measureν is said tosatisfy the Talagrand inequalitywith constantλ, in short T(λ) (resp. theHWI inequalitywith constantλ, in short HWI(λ)) if we have

W₂(μ, ν) 2

λH (μ|ν)

resp.H (μ|ν)W₂(μ, ν)

I (μ|ν)−λ

2W₂(μ, ν)²

for all absolutely continuous probability measureμwith respect to ν. Any Gaussian measure satisfies LS(λ), T(λ) and HWI(λ), whereλ⁻¹is the largest eigenvalues of the covariance matrix, and moreover provides a nontrivial pair attaining equality in these inequalities. Note that criteria for a probability measure to satisfy LS(λ), T(λ) and HWI(λ) are known, for example, LS(λ) with some convexity condition forνimplies T(λ) (see [15] for the details). The key of the proof that LS(λ) implies T(λ) is to analyze asymptotic behaviors of the Fokker–Planck equation of the form

∂

∂tρ=ρ+div(ρ∇Ψ ),

whereΨ is a function onR^d, on the contrary, these inequalities are applied to analyze asymptotic behaviors of the Fokker–Planck equation. In particular, Otto [14] investigated the asymptotic behaviors in the case thatΨ (x)= |x|²/4 using Wasserstein gradient structure, where the scaling

(3)

ρ(t, x)=t⁻^d²ρˆ

lnt, t⁻¹²x

provides a one-to-one correspondence between solutionsρ to the heat equation and the solutionsρˆ to the Fokker–

Planck equation.

We show that such features are inherited to the space ofq-Gaussian measures after some preliminaries on the q-exponential function and the Wasserstein geometry in Section 2. Section 3 is devoted to the convexity of the space ofq-Gaussian measures in the Wasserstein space (Theorem A). In Section 4, we consider the features ofq-Gaussian measures as solutions to the porous medium equation

∂

∂tρ=

ρ²⁻^q . (1.1)

Since the stability of the space ofq-Gaussian measures has been already shown by Ohara and Wada [12], it is possible to restrict the porous medium equation to the space ofq-Gaussian measures. We introduce the ordinary differential equation for covariance matrix obtained by restricting the porous medium equation to the space ofq-Gaussian measures (Theorem B). Section 5 is concerned with generalizations of the logarithmic Sobolev inequality, the Talagrand inequality and the HWI inequality, which play crucial roles in analyzing the asymptotic behaviors of the nonlinear Fokker–Planck equation

∂

∂tρ=

ρ²⁻^q +div(ρ∇Ψ ).

We give criteria for a probability measure to satisfy these inequalities and show that anyq-Gaussian measure provides a nontrivial pair attaining equality in these inequalities ifq >1 (Corollaries C, D). We finally prove that a variant of the logarithmic Sobolev inequality implies a variant of the Talagrand inequality using the nonlinear Fokker–Planck equation (Theorem E).

We refer to the preceding results in the literature. Generalizations of the logarithmic Sobolev inequality, the Tala- grand inequality and the HWI inequality have been studied by several authors. Carrillo, Jüngel, Markowich, Toscani and Untterreiter [6] studied these functional inequalities for parabolic systems using entropy dissipation methods.

Carrillo, McCann and Villani [7] also investigated these functional inequalities and they also estimated in [8] the con- traction rate of nonlinear evolution equations in the Wasserstein distance function. Agueh, Ghoussoub and Kang [1]

and Cordero-Erausquin, Gangbo and Houdré [9] introduced variants of the relative entropy and the Fisher information, through modifying the Boltzmann entropy and the quadratic transport cost. However, none of them referred to theq-Gaussian measures and our result is the first one concerning the importance of theq-Gaussian measures in the porous medium equation. See also [13], where Ohta and the author investigated the nonlinear Fokker–Planck equation on a (weighted) Riemannian manifold using the Wasserstein gradient structure and theq-exponential function.

2. Preliminaries

2.1. q-exponential function andq-logarithmic function

We first summarize the q-calculus, see [21] for further discussion. Take q ∈Q_d and fix it. We define the q- logarithmic functionlnqby

lnq(t ):=t¹⁻^q−1 1−q

fort >0. Since the function lnq is monotone increasing, there exists its inverse function on the image. This inverse function is called theq-exponential functionexp_qand is naturally extended to all ofRas

exp_q(t ):=

1+(1−q)t₁₋¹_q

+ .

Note that the functions lnq and exp_q recover the usual logarithmic function and the usual exponential function asq tends to 1, respectively.

Let us define functionals on the space P^ac of absolutely continuous probability measures with respect to the Lebesgue measure using theq-logarithmic function. TheTsallis entropyE_qis defined by

(4)

E_q(μ):= −

R^d

f^qlnq(f ) dL^d= −

R^d

f^q−f q−1 dL^d

forμ=fL^d∈P^ac, which is regarded as an approximation of the Boltzmann entropyEsince we have E_q(μ)−−−→^q^→¹ E(μ)= −

R^d

fln(f ) dL^d.

We also define theq-relative entropyH_qand theq-Fisher informationby H_q(μ|ν):= 1

2−q

R^d

flnq(f )−glnq(g)−(2−q)lnq(g)(f −g) dL^d,

I_q(μ|ν):=

R^d

∇

lnq(f )−lnq(g)²dμ

forμ=fL^d, ν=gL^d∈P^ac, both of which are non-negative and gauge the difference between pairs of probability measures. Note that limq→1Hq(μ|ν)=H (μ|ν)and limq→1Iq(μ|ν)=I (μ|ν)hold. The square root of theq-relative entropyH_q is considered as a generalization of the distance function in the context of information geometry since this satisfies a generalized Pythagorean relation (see [2,3] for information geometry). We refer to the generalized Pythagorean relation in Remark 3.2. It will be demonstrated in (5.15) that−I_q(μ|ν)is the first variation ofH_q(·|ν) atμ.

2.2. Wasserstein geometry

We briefly recall some results on the Wasserstein space. See [23,24] and references therein for the details and more information on Wasserstein geometry.

As mentioned in the introduction, an optimal transport of any pair inP2always exits. Though the explicit expression of an optimal transport is usually not obtained, it has been known since the work of Brenier [5] that an optimal transport is characterized by a push forward measure of the gradient of a convex function. Apush forward measureof a probability measureμby a mapT, denoted byT μ, is defined byT μ[B] :=μ[T⁻¹(B)]for all Borel setsB⊂R^d. Given two probability measuresμ=fL^d,ν=gL^dand a differentiable mapT onR^d,ν=T μis equivalent to that

f =g(T )det(dT ) (2.1)

holds forμ-almost everywhere, wheredT is the total differential ofT. We denote by id the identity map onR^d. Theorem 2.1. (See [5].) Letμ, ν∈P2 be such thatμ does not give mass to sets of Hausdorff dimension at most (d−1). Then there exists a convex function such that∇ϕpushesμforward toνand[id× ∇ϕ] μis a unique optimal transport plan of them. Moreover,{[(1−t )id+t∇ϕ] μ}t∈[0,1]is a unique geodesic fromμtoν in the Wasserstein space.

The converse situation also holds true, that is, if a support of a transport plan is almost contained in the subdiffer- ential of a convex function, then the transport plan is optimal. This is known as Knott–Smith optimality criterion.

Theorem 2.2. (See [23, Theorem 2.16].) Let ϕ be a proper lower semicontinuous convex function on R^d. For μ, ν∈P2, letπbe a transport plan of them such that

R^d×R^d

ϕ(x)+sup

z∈R^d

y, z −ϕ(z) − x, y

dπ(x, y)ε.

Then we have

R^d×R^d

|x−y|²dπ(x, y)W₂(μ, ν)²+2ε.

(5)

We finally give the explicit expression of the optimal transport between a pair of Gaussian measures, which helps us to understand Wasserstein geometry (see [17] and references therein for more details). Given anyX∈Sym⁺(d,R), we define a symmetric positive definite matrixX^1/2=√

Xsuch thatX^1/2·X^1/2=X.

For any pair of Gaussian measuresN (v, V )andN (u, U ), we define the symmetric positive definite matrixT and the associated functionT by

T :=U¹²

U¹²V U¹² ⁻

1

2U¹², T(x):=1

2

x−v, T (x−v)

+ x, u, (2.2)

which provides an optimal transport plan ofN (v, V )andN (u, U ). In other words,[id× ∇T] N (v, V )is the optimal transport plan of them and then the Wasserstein distance between them is given by

W2

N (v, V ), N (u, U ) ²= |v−u|²+trV +trU−2 tr

U¹²V U¹².

A unique geodesic fromN (v, V )toN (u, U )is given by{N (w_t, W_t)}t∈[0,1], where the time-dependent vectorw_t and the time-dependent matrixW_tare defined by

wt:=(1−t )v+t u, Wt:=

(1−t )Id+t T V

(1−t )Id+t T

. (2.3)

3. q-Gaussian measure

Let us summarize the definition ofq-Gaussian measures and then discuss Wasserstein geometry of the space of q-Gaussian measures. Background onq-Gaussian measures is found in [20] and [21].

A probability measure Nq(v, V ) is called theq-Gaussian measurewith mean v and covariance matrixV if it maximizes the Tsallis entropy Eq amongμ∈P^ac with meanv and covariance matrixV. It is known that the q- Gaussian measureN_q(v, V )is given by

N_q(v, V )=C₀(detV )⁻¹²exp_q

−1 2C₁

x−v, V⁻¹(x−v)

L^d, (3.1)

whereC0andC1are the positive constants given by

C₀=C₀(q, d):=

⎧⎪

⎪⎨

⎪⎪

⎩

Γ (²₁⁻₋^q_q+^d2)

Γ (²₁⁻₋^q_q) (⁽¹⁻_2π^q)C¹)^d² if 0< q <1,

Γ (_q₋¹₁)

Γ (_q₋¹₁−^d₂)(^(q⁻_2π^1)C¹)^d² if 1< q <^d_d⁺₊⁴₂, C₁=C₁(q, d):= 2

2+(d+2)(1−q)

andΓ (·)is theΓ-function (see [22]). In some cases, theq-Gaussian measureN_q^∗(v, V )is obtained as a maximizer of the Tsallis entropyEqunder theq-meanvconstraint and theq-covarianceV constraint, that is,N_q^∗(v, V )maximizes the Tsallis entropyE_qamongμ=fL^d∈P^acsatisfying

⎧⎪

⎪⎪

⎨

⎪⎪

⎪⎩

Q(f )=

R^d

f^qdL^d,

R^d

xf (x)^qdL^d(x)=Q(f )v,

R^d

(x−v)^T(x−v)f (x)^qdL^d(x)=Q(f )V ,

where vectors inR^dare column and^Tx stands for the transpose ofx. A relation betweenN_q(v, V )andN_q^∗(v, V )is given by

N_q^∗(v, V )=Nq

v, 2+d(1−q) 2+(d+2)(1−q)V

.

(6)

We use the usual mean and covariance constraint condition throughout this paper. In this case, theq-Gaussian measure is well defined forq∈Q_d and theq-Gaussian measureN_q(v, V )recovers the Gaussian measureN (v, V )asqtends to 1. We denote byn_q(v, V )the density ofN_q(v, V )with respect to the Lebesgue measure.

The space ofq-Gaussian measures is convex in the Wasserstein space.

Theorem A.For anyq∈Q_d, the space ofq-Gaussian measures is convex and isometric to the space of Gaussian measures with respect to Wasserstein geometry.

Proof. For the functionT given in (2.2), we have nq(v, V )=nq(u, U )(∇T)(det HessT)

forNq(v, V )-almost everywhere, which implies[∇T] Nq(v, V )=Nq(u, U )by (2.1). Theorem 2.2 with the convexity ofT ensures the optimality of[id× ∇T] Nq(v, V )and we have

W₂

N_q(v, V ), N_q(u, U ) ²= |v−u|²+trV+trU−2 tr

U¹²V U¹²

=W₂

N (v, V ), N (u, U ) ².

Hence the map from the space of Gaussian measures to the space of q-Gaussian measures sending N (v, V ) to N_q(v, V )is an isometry with respect toW₂.

Moreover, for the time-dependent vectorw_t and the time-dependent matrixW_t given in (2.3),{N_q(w_t, W_t)}t∈[0,1]

is a unique geodesic fromN_q(v, V )toN_q(u, U ), which shows the convexity of the space ofq-Gaussian measures in the Wasserstein space. 2

Remark 3.1. The convexity of the space of q-Gaussian measures is due to the characterization by the mean and covariance matrix parameter rather than the q-exponential function, which suggests the existence of other convex spaces (see [19]).

Remark 3.2. We briefly explain that the square root of the q-relative entropy satisfies a generalized Pythagorean relation. Givenμ∈P^ac, letN_q(v, V )be a minimizingq-Gaussian measure for the variational problem

min H_q

μN_q(u, U ) (u, U )∈R^d×Sym⁺(d,R) . Then the following Pythagorean relation

H_qμN_q(u, U ) =H_qμN_q(v, V ) +H_q

N_q(v, V )N_q(u, U )

holds for anyq-Gaussian measureN_q(u, U ). For the proof, see [12, Proposition 3].

4. q-Gaussian measures as solutions to porous medium equation

It is known that the porous medium equation (1.1) allows for a self-similar solution of the form ρ_q(x, t ):=

At⁻^dα(1⁻^q)−B|x|²t⁻¹_1−q¹

+ =

A−B|x|²t⁻^2α_1−q¹

+ t⁻^dα, where the constantsαandBare given by

α=α(q, d):= 1

d(1−q)+2, B=B(q, d):=(1−q)α 2(2−q).

The other constantA=A(q, d)is defined by the total mass of the solution and we normalized it such that

R^d

ρ_q(x, t ) dL^d=1.

(7)

To be precise,Ais given by A:=C₀^2α(1⁻^q)

α (2−q)C₁

dα(1−q)

.

The solutionρq was discovered by Barenblatt [4] and Pattle [16] and is called theBarenblatt–Pattle solution. It is easy to check the relation

ρq(x, t )=nq

0, Ct^2αId (x), where the constantC=C(q, d)is given by

C:=(2−q)C₁

α A.

As well as the Barenblatt–Pattle solution, it was proved by Ohara and Wada [12, Proposition 5] that a solution to the porous medium solution with an initial data being aq-Gaussian density remainsq-Gaussian densities for all future time. This fact implies that a solution to the porous medium equation on the space ofq-Gaussian measures can be explicitly solved [12, Remark 2]. To do this, we introduce the mapΘon Sym⁺(d,R)such that

Θ(V ):=(detV )⁻^α(1⁻^q)V .

Note thatρ_q(x, t )=n_q(0, CΘ(t Id))(x)holds.

Theorem B.For anyq∈QdandV ∈Sym⁺(d,R), we set the time-dependent matrixVtas Θ(Vt)=Θ(V )+σ (t )Id, d

dtσ (t )=2α

detΘ(Vt) ⁻

1−q 2 . Thenn_q(v, CΘ(V_t))is a solution to the porous medium equation(1.1).

Remark 4.1.The assertion also holds true forq=1.

Proof of Theorem B. For simplicity, we use the following notations

|x|²_V :=

x, V⁻¹x

, Θ_t:=Θ(V_t), F (t, x):=

A−B|x−v|²_Θ_t

+. Then we haveρ(t, x):=F^1/(1⁻^q)(detΘt)⁻^1/2=nq(v, CΘ(Vt))(x)and calculate

ρ²⁻^q(t, x) =αF

1−qq (t, x)(detΘ_t)⁻²⁻²^q 2B

1−q|x−v|²_Θ2

t −F (t, x)tr Θ_t⁻¹ . Combining well-known equations for a time-dependent invertible matrixX_t

d dt

X⁻_t ¹ = −X⁻_t ¹ d

dtX_t

X⁻_t ¹, d

dtdetX_t=(detX_t)tr

X_t⁻¹d dtX_t

with the assumption d

dtΘt=2α(detΘt)⁻¹⁻²^qId, we obtain

∂

∂t

ρ(t, x) =αF

1−qq (t, x)(detΘ_t)⁻^2−q² 2B

1−q|x−v|²_Θ2

t −F (t, x)tr Θ_t⁻¹ . We thus have

∂

∂tρ= ρ²⁻^q ,

proving thatρ(t, x)=n_q(c, CΘ(V_t))(x)is a solution to (1.1). 2

(8)

Remark 4.2.It is known that the scaling ρ(t, x)=t⁻^dαρˆ

lnt, t⁻^αx

provides a one-to-one correspondence between solutions ρ to the porous medium equation and solutionsρˆ to the nonlinear Fokker–Planck equation of the form

∂

∂tρˆ= ˆ ρ²⁻^q +

ˆ ρ

∇α 2|x|²

.

In particular, the Barenblatt–Pattle solution corresponds to the stationary solutionn_q(0, CId)to the nonlinear Fokker–

Planck equation (for instance, see [14]). Since the solutions obtained by Theorem B contain nonself-similar solutions, we obtain nonequilibrium solutions to the nonlinear Fokker–Planck equation. Such solutions help us to understand the asymptotic behavior of solutions to the nonlinear Fokker–Planck equation. As an example, the author [18] demonstrated by using elemental calculations thatNq(0, CId)satisfies theq-logarithmic Sobolev inequality, theq-Talagrand inequality and theq-HWI inequality for anyq-Gaussian measures, which are defined in Section 5 and are key ingredients to analyze the asymptotic behavior of solutions to the nonlinear Fokker–Planck equation.

5. Functional inequalities

In this section, we study relations among the functionalsH_q,I_qandW₂in the form of inequality and discuss the importance ofq-Gaussian measures in the inequalities. We deal with the three inequalities, called theq-logarithmic Sobolev inequality,q-Talagrand inequality andq-HWI inequality, of the form

Hq(μ|ν) 1

2λIq(μ|ν), W₂(μ, ν)

2

λH_q(μ|ν), H_q(μ|ν)

I_q(μ|ν)W₂(μ, ν)−λ

2W₂(μ, ν)².

Though criteria for a probability measure to satisfy the three inequalities have been already provided even in a more general setting (for instance see [1,7,9]), we show criteria to emphasize the importance of theq-Gaussian measure.

We use the same symbol∇for the distributional gradient.

Lemma 5.1.Givenμ1=f1L^d, μ2=f2L^d∈P2∩P^ac, letT be a map such that[id×T]μ1is an optimal transport of them. For aC²-functionΨ onR^dsuch thatHessΨ is bounded below by some numberK, we have

R^d

(f₂−f₁)Ψ dL^d

R^d

∇Ψ, T−idf₁dL^d+K

2W₂(μ₁, μ₂)². (5.1)

If we moreover assume thatf₁has a weak derivative, then we have

R^d

f2lnq(f2)−f1lnq(f1) dL^d

R^d

T −id,∇

f₁²⁻^q dL^d

=

R^d

(2−q)

T−id,∇lnq(f₁)

dμ₁. (5.2)

Proof. Since the assumption HessΨ Kprovides Ψ

T (x) −Ψ (x)

∇Ψ (x), T (x)−x +K

2x−T (x)²,

we obtain (5.1) by integrating it with respect toμ₁, where we use the optimality of[id×T] μ₁.

We next prove (5.2). Since the equality is trivial, we only prove the inequality. Due to Theorem 2.1 and (2.1), there exists a convex functionϕsuch that

(9)

∇ϕ=T , Hessϕ=dT , f₁=f₂(T )det(dT )

hold forμ1-almost everywhere. LetU be the maximal subset whereϕ has the second derivative. Then the change of variable formula fory=T (x)implies

R^d

f₂lnq(f₂) dL^d=

U

f₁lnq

f₂(T ) dL^d

and

R^d

f2lnq(f2)−f1lnq(f1) dL^d=

U

lnq

f2(T ) −lnq(f1) f1dL^d. Fixx∈U and set

b(t ):=lnq

f1(x)

det[Id+t (Hessϕ(x)−Id)]

,

which is convex fort∈ [0,1](see [11, Theorem 2.2]). Hence we have b(1)−b(0)=lnq

f₂ T (x)

−lnq

f₁(x) b(0)= −f₁¹⁻^q(x)

ϕ(x)−|x|² 2

and by integrating it with respect toμ1=f1L^d, we obtain

R^d

f₂lnq(f₂)−f₁lnq(f₁)

dL^d−

U

f₁¹⁻^q(x)

ϕ(x)−|x|² 2

f₁(x) dL^d(x)

−

R^d

f₁¹⁻^q(x)_D

ϕ(x)−|x|² 2

f₁(x) dL^d(x)

=

R^d

∇(f₁)²⁻^q,∇ϕ−id dL^d,

where_D is the distributional Laplacian. In the second inequality, we use the Aleksandrov theorem [10] which states that _Dϕ coincides with ϕ onU, and the non-negativity of _Dϕ on the interior of the domain of ϕ, which is derived from the convexity ofϕ. Thus the lemma is proved. 2

We now give a criterion for a probability measure to satisfy the q-logarithmic Sobolev inequality and the q- Talagrand inequality. Forν∈P2, we set

P2^ac(ν):=

μ∈P2∩P^acμis absolutely continuous with respect toν .

Proposition 5.2.For ν:=exp_q(−Ψ )L^d∈P2withq ∈Q_d, we assume that Ψ isC²and that HessΨ is bounded below by a positive numberλ.

(1) Theq-logarithmic Sobolev inequality H_q(μ|ν) 1

2λI_q(μ|ν) (5.3)

holds for allμ∈P₂^ac(ν)such that the density ofμhas a first weak derivative.

(2) Theq-Talagrand inequality W2(μ, ν)

2

λHq(μ|ν) (5.4)

holds for allμ∈P₂^ac(ν).

(10)

Proof. (1) Lemma 5.1 with(f₁, f₂)=(dμ/dL^d,exp_q(−Ψ ))provides

−H_q(μ|ν)= −1 2−q

R^d

f₁lnq(f₁)−f₂lnq(f₂)−(2−q)lnq(f₂)(f₁−f₂) dL^d

= −1 2−q

R^d

f₁lnq(f₁)−f₂lnq(f₂)+(2−q)Ψ (f₁−f₂) dL^d

R^d

T −id,∇

lnq(f1)+Ψ +λ

2|id−T|²

f1dL^d

−

R^d

1 2λ∇

lnq(f₁)−lnq(f₂) ²f₁dL^d

= − 1

2λI_q(μ|ν),

where we use the assumptionμ∈P₂^ac(ν)in the second line and complete the square in the second inequality.

(2) Setting(μ₁, μ₂)=(ν, μ)and adding up the inequality (5.1), (5.2), we have λ

2W2(μ, ν)²Hq(μ|ν). 2

Let us demonstrate that anyq-Gaussian measure provides a nontrivial pair which attains equality in (5.3) and (5.4).

Corollary C. ForV ∈Sym⁺(d,R), letσ be the largest eigenvalue ofV andv^∗be the corresponding eigenvector.

Then for anyq >1andv∈R^d, a pair(μ, ν)=(N_q(v+v^∗, V ), N_q(v, V ))attains equality in the inequalities(5.3) and(5.4)forλsatisfying

λσ =C₀¹⁻^qC1(detV )⁻¹⁻²^q.

Remark 5.3.For the functionΨ satisfying exp_q(−Ψ )=nq(v, V ), we have HessΨ =C₀¹⁻^qC1(detV )⁻⁽¹⁻^q)/2V⁻¹, that is, the Hessian ofΨ is bounded below byλ. The assumptionq >1 guaranteesP₂^ac(ν)=P2∩P^ac.

Proof of Corollary C. By the proof of Proposition 5.2, equality holds for a pair (μ, ν)in (5.3) if and only if the inequalities in (5.1), (5.2) are equalities and we have

λ

x−T (x) = ∇

lnq

dμ dL^d

+Ψ (x)

(5.5) forμ-almost everywhere, whereT is an optimal transport map ofμandν. Similarly, equality holds for a pair(μ, ν) in (5.4) if and only if the inequalities in (5.1) and (5.2) are equalities. To have equality in (5.2),T must be a translation map, that is, T (x)=x +u for some u∈R^d (see [11, Theorem 2.2]). In the case ofT (x)=x +u, equality in each of (5.1) and (5.5) is equivalent to that ∇Ψ (T (x))− ∇Ψ (x)=λu holds for μ-almost everywhere. The pair (N_q(v+t v^∗, V ), N_q(v, V ))satisfies these conditions, hence we have equality in (5.3) and (5.4). 2

Remark 5.4.The proof of Corollary C is needed in order to consider conditions to establish equality in (5.3) and (5.4).

We directly show that the pair (μ, ν)=(Nq(v+v^∗, V ), Nq(v, V )) given in Corollary C attains equality in (5.3) and (5.4) by computing

H_q(μ|ν)=λ

2v^∗², I_q(μ|ν)=λ²v^∗², W₂(μ, ν)=v^∗.

In this case, the both sides of (5.3) coincide withλ|v^∗|/2 and the both sides of (5.4) are equal to|v^∗|.

(11)

Remark 5.5.Since the square root ofH_qbehaves like the distance function in information geometry, theq-Talagrand inequality provides the comparison between the two different geometries, namely, Wasserstein geometry and information geometry.

We next provide a criterion for a probability measure to satisfy theq-HWI inequality using not Lemma 5.1 but the geodesic equation. For a geodesic{μ_t}t∈[0,1]in the Wasserstein space, there exists a family{Φ_t}t∈[0,1] of functions such that

⎧⎪

⎨

⎪⎩

∂

∂tμ_t+div(μt∇Φ_t)=0,

∂

∂tΦt+1

2|∇Φt|²=0,

(5.6)

where the functionΦ_t serves as a “tangent vector field” on the Wasserstein space (a heuristics can be found in [15]

and also see [23,24]). In short, the relation between this “tangent vector field” Φ_t and an optimal transport mapT fromμ₀toμ₁is

∇Φ_t= [T−id] ◦

T_t⁻¹ , T_t:=id+t[T−id].

Proposition 5.6.Forν:=exp_q(−Ψ )L^d∈P2 withq∈Q_d andq (d+1)/d, we assume thatΨ isC² and that HessΨ is bounded below by some numberK. Suppose that the support ofνis convex. Then theq-HWI inequality

H_q(μ|ν)W₂(μ, ν)

I_q(μ|ν)−K

2W₂(μ, ν)² (5.7)

holds for anyμ∈P₂^ac(ν)such that the density ofμhas a first weak derive.

Remark 5.7.IfKis positive, then the support ofνis automatically convex (see [13, Lemma 2.5]). The convexity of the support ofνprovides the convexity ofP₂^ac(ν)in the Wasserstein space (see [8, Corollary 4.3]).

Proof of Proposition 5.6. Let{μ_t=f_tL^d}t∈[0,1]be a geodesic fromμ∈P₂^ac(ν)toνand{Φ_t}t∈[0,1]be the function satisfying (5.6). We calculate

d

dtHq(μt|ν)=

R^d

∇Φt,∇

lnq(ft)−lnq(f1)

dμt, (5.8)

d²

dt²H_q(μ_t|ν)=

R^d

HessΨ (∇Φ_t,∇Φ_t)+f_t¹⁻^q 2−q

(1−q)(Φ_t)²+

ij

(HessΦ_t)²_ij

dμ_t, (5.9)

d dt

R^d

|∇Φ_t|²dμ_t=0, (5.10)

where we use the condition that−lnq(exp_q(−Ψ ))=Ψ holds on the support ofμ_t∈P₂^ac(ν). Combining the inequality (1−q)(Φ_t)²+

ij

(HessΦ_t)²_ij

1−q+1 d

(Φ_t)²0 (5.11)

provided by the Cauchy–Schwarz inequality and the assumptionq(d+1)/d, with the assumption HessΨ K, we deduce from (5.9) and (5.10) that

d²

dt²Hq(μt|ν)

R^d

K|∇Φt|²dμt=KW2(μ0, μ1)²=KW2(μ, ν)². (5.12) It follows that

(12)

−H_q(μ|ν)=H_q(μ₁|ν)−H_q(μ₀|ν)

= 1 0

t 0

d²

ds²Hq(μs|ν) ds+ d ds

s=0

Hq(μs|ν)

dt

1 0

t 0

KW2(μ0, μ1)²ds+ d ds

s=0

Hq(μs|ν)

dt

=K

2W₂(μ₀, μ₁)²+

R^d

∇Φ₀,∇

lnq(f₀)−lnq(f₁) dμ_t

K

2W2(μ0, μ1)²−

!

R^d

|∇Φ0|²dμ_t

R^d

∇

lnq(f0)−lnq(f1)²dμ_t

=K

2W2(μ, ν)²−W2(μ, ν)

Iq(μ|ν), (5.13)

where the third line follows from (5.12) and the forth line follows from (5.8). In the fifth line, we apply the Cauchy–

Schwarz inequality. This concludes the proof of Proposition 5.6. 2

The pair ofq-Gaussian measure given in Corollary C also attains equality in the inequality (5.7).

Corollary D.ForV ∈Sym⁺(d,R), let σ be the largest eigenvalue ofV andv^∗be the corresponding eigenvector.

Then for anyq >1andv∈R^d, a pair(μ, ν)=(N_q(v+v^∗, V ), N_q(v, V ))attains equality in the inequality(5.7)for λsatisfying

λσ =C₀¹⁻^qC₁(detV )⁻¹⁻²^q.

Proof. To establish equality in (5.7), HessΨ (∇Φt,∇Φt)=K|∇Φt|²holds and the inequalities in (5.11), (5.13) need to be equalities, which is equivalent to∇Φt ≡u, whereuis the difference between the mean ofν andμ. The given pair satisfies these conditions and we have equality in (5.7). 2

We finally give a new relation between theq-logarithmic inequality and theq-Talagrand inequality. To do this, we deal with the following evolution equation given by

⎧⎨

⎩

∂

∂tρ= 1 2−q

ρ²⁻^q +div(ρ∇Ψ ), x∈Ω,

ρ=0, x∈Ω^c,

(5.14) whereΩis an open set defined by

Ω:=

x∈R^d(1−q)Ψ (x) <1 .

For technical reasons, we need to require the solutionsρof (5.14) to satisfy the following conditions (I), (II) and (III).

(I) ρis non-negative and smooth.

(II) ρconserves the total mass and has a finite third moment, that is,

R^d

ρ dL^d=

Ω

ρ dL^d=1,

R^d

|x|³ρ dL^d<∞.

(III) Forξt(x):=lnq(ρ)−lnq(exp_q(−Ψ )), there exists a locally bounded functiona(t )such that (a) |∇(ξt(x)−ξt(y))|a(t )|x−y|.

(b) |∇(ξ_t²(x))|a(t )(1+ |x|³).

(13)

For a functionΨ, we set P(Ψ ):=

fL^d∈P^acthe solution to (5.14) with initial dataf satisfies (I)–(III) .

Remark 5.8.LetΨ (x)=λ|x|²/2 with someλ >0. If 1< q < (d+5)/(d+3), then anyq-Gaussian measure belongs toP(Ψ )(anyq-Gaussian measure has a finite third moment ifq < (d+5)/(d+3)). Forq <1,N_q(0, aId)∈P(Ψ ) ifλ^2αa < (C₀¹⁻^qC₁)^2α, which ensures that the support ofN_q(0, aId)is contained inΩ.

Theorem E.Forν:=exp_q(−Ψ )L^d∈P2withq∈Q_d, we assume thatΨ isC²and thatHessΨ is bounded below by a positive numberK. If there exists some positive numberλsuch that

H_q(μ|ν) 1

2λI_q(μ|ν)

holds for anyμ∈P₂^ac(ν)∩P(Ψ ), then we also have W₂(μ, ν)

2 λI_q(μ|ν) for anyμ∈P₂^ac(ν)∩P(Ψ ).

Remark 5.9.Proposition 5.2 says that the condition HessΨ K >0 implies that the q-Talagrand inequality (5.4) holds with the constantK. However,λis independent of K in Theorem E and the estimate in theq-Talagrand inequality is improved ifλ > K.

The basic strategy for proving Theorem E is similar to the proof of [15, Theorem 1], where the key ingredients are some variations along the flow (5.14) and the convergence of the solution in the sense ofHqandW2.

Forμ=fL^d, letft be the solution to (5.14) with the initial dataf0=f. Thenftsatisfies

∂

∂tft= 1 2−q

f_t²⁻^q +div(ft∇Ψ )=div(ft∇ξt)

and the conditions (I) and (II) guarantee μ_t :=f_tL^d ∈P₂^ac(ν). We first consider the variations of H_q(μ_t|ν) and W₂(μ, μ_t)along the solution to (5.14).

Lemma 5.10.

d

dtHq(μt|ν)= −Iq(μt|ν), (5.15)

d⁺

dtW2(μ, μt)=lim sup

s↓0

W₂(μ, μ_t₊_s)−W₂(μ, μ_t)

s

Iq(μt|ν). (5.16)

Proof. Setg:=exp_q(−Ψ ). For any smooth functionηwith compact support, we have d

dt 1

2−q

R^d

f_tlnq(f_t)−glnq(g)−(2−q)lnq(g)(f_t−g) η dL^d

= −

R^d

f_t

"

∇ξ_t,∇

ξ_t+ 1 2−q

η

# dL^d

= −1 2

R^d

∇η,∇

ξ_t² dμ_t− 1 2−q

R^d

∇η,∇ξ_tdμ_t−

R^d

η|∇ξ_t|²dμ_t,