• Aucun résultat trouvé

The heat equation on manifolds as a gradient flow in the Wasserstein space

N/A
N/A
Protected

Academic year: 2022

Partager "The heat equation on manifolds as a gradient flow in the Wasserstein space"

Copied!
23
0
0

Texte intégral

(1)

www.imstat.org/aihp 2010, Vol. 46, No. 1, 1–23

DOI:10.1214/08-AIHP306

© Association des Publications de l’Institut Henri Poincaré, 2010

The heat equation on manifolds as a gradient flow in the Wasserstein space

Matthias Erbar

Rheinische Friedrich Wilhelms Universität Bonn, Germany. E-mail:matthias.erbar@web.de Received 23 June 2008; revised 27 October 2008; accepted 12 November 2008

Abstract. We study the gradient flow for the relative entropy functional on probability measures over a Riemannian manifold.

To this aim we present a notion of a Riemannian structure on the Wasserstein space. If the Ricci curvature is bounded below we establish existence and contractivity of the gradient flow using a discrete approximation scheme. Furthermore we show that its trajectories coincide with solutions to the heat equation.

Résumé. Nous étudions les flux gradients dans l’espace des mesures de probabilité sur une variété Riemannienne pas nécessaire- ment compacte. Dans ce but nous munissons l’espace de Wasserstein avec une sorte de structure Riemannienne. Si la courbure de Ricci de la variété est bornée inférieurement nous démontrons qu’il existe un flux gradient contractif pour l’entropie relative. Il est construit explicitement en utilisant une approximation variationelle discrète. De plus ses trajectoires Coïncident avec les solutions à l’équation de la chaleur.

MSC:35A15; 58J35; 60J60

Keywords:Gradient flow; Wasserstein metric; Relative entropy; Heat equation

1. Introduction and statement of the main results

Since the work of Otto (see e.g. [7]) it is known that many equations of diffusion type onRnmight be interpreted as gradient flows for an appropriate functional on the Wasserstein space of probability measures equipped with the L2-Wasserstein metric. In this context the heat equation corresponds to the relative entropy,H (μ)=

logρdμ if μ=ρ·vol. This interpretation has been made rigorous since giving precise meaning to the notion of gradient flow in an abstract metric space. We refer to the book of Ambrosio, Gigli and Savaré [1] for a comprehensive treatment of gradient flows in metric spaces. As a standard example they show that there exists a unique gradient flow of the relative entropy functional on the Wasserstein space P2(Rn)whose trajectories coincide with solutions to the heat equation. These results have partly been extended from the Euclidean to the Riemannian setting. Savaré [11]

showed that there exists a unique gradient flow for certain functionals on metric spaces whose squared distance satisfies a concavity property. As special case this includes energy functionals on the Wasserstein space over a compact Riemannian manifold. In an independent related work Ohta [10] established existence of a unique gradient flow for (geodesically-)semiconvex functionals on Wasserstein spaces over compact Alexandrov spaces. In the special case of a compact Riemannian manifold with nonnegative sectional curvature he proves coincidence of the trajectories of the gradient flow for the free energy with solutions to the Fokker–Planck equation. However if the manifold is non-compact the picture is less complete. If the Ricci curvature is bounded below Villani ([14], Chapter 23) shows that smooth sufficiently integrable solutions to a large class of diffusive PDE’s constitute trajectories of the gradient flow for the associated functional. The assumption of a lower bound on the Ricci curvature is essential to guarantee

(2)

that the entropy functional is displacementK-convex, i.e.K-convex along geodesics in the Wasserstein space (cf.

Definition3.4).

In this paper we present a notion of Riemannian structure on the Wasserstein space to define gradient flows adopting the approach in [1]. We will study in detail the gradient flow for the relative entropy functional under the assumption of a lower bound on the Ricci curvature. Building on the results of Villani we show that its trajectories actually coincide with solutions to the heat equation. This yields contractivity of the gradient flow in the Wasserstein distance.

Furthermore we provide a time discrete approximation of the gradient flow, thus explicitly constructing its trajectories and establishing existence.

Throughout this paper letMbe a smooth (of classC3), connected and complete Riemannian manifold with Rie- mannian volumem. We denote byP2(M)the Wasserstein space of probability measures onM equipped with the L2-Wasserstein distancedW (see Section2). We shall assume that the Ricci curvature is bounded below, i.e.RicK for someK∈Rin the sense thatRicx(ξ,ξ)K|ξ|2for allxM,ξTxM.In the first part of this paper we will develop a notion of Riemannian structure on the Wasserstein space which provides the framework to give a precise definition of gradient flows inP2(M)(see Definition3.7). For the present purpose of this introduction say that a tra- jectory of the gradient flow for the relative entropyH is an absolutely continuous curvet)t0inP2(M)satisfying the evolution variational inequality

d dt

1

2dW2t, σ )+K

2dW2t, σ )H (σ )H (μt)σD(H ), (1) whereD(H )= {μ∈P2(M): H (μ) <∞}. We will see later (4.5) that the gradient flow ofH satisfies this property.

Our main results are the following:

Theorem 1. There exists a unique gradient flowσ:[0,∞)×P2(M)D(H )⊂P2(M)for the relative entropyH and it satisfies

dWt, νt)≤eKtdW0, ν0) for allμ0, ν0∈P2(M), t>0, whereμt=σ (t, μ0), νt=σ (t, ν0).

Theorem 2. Let(μt)t0be a continuous curve inP2(M).Then the following are equivalent:

(i) t)t0is a trajectory of the gradient flow for the relative entropyH,

(ii) μt is given byμt(dx)=ρt(x)·m(dx)fort >0,where(ρt)t >0is a solution to the heat equation

tρt(x)=ρt(x) on(0,)×M satisfying the conditions

H (ρt·m) <∞ ∀t >0 and s1

s0

M

|∇ρt|2

ρt dmdt <∞ ∀0< s0< s1. (2) The structure of this paper is as follows. In Sections2and3 we will lay down the foundation to define gradient flows inP2(M)without any assumption on the Ricci curvature. We shall adopt the approach of [1] which they apply in the case ofRn. We will use the Riemannian structure of the underlying space to endow the Wasserstein space with a notion of a Riemannian differentiable structure which gives the theory of gradient flows an appealing formal analogy to the classical setting. In Section2we will introduce a notion of tangent bundle toP2(M)which allows us to assign a tangent vector to an absolutely continuous curvet)t inP2(M). Heuristically this is the vector field governing the evolution ofμtvia the continuity equation, chosen in such a way that the kinetic energy is minimal. We will establish a subdifferential calculus for functionals onP2(M)in Section3. Our definition of gradient flows inP2(M)is modeled on classical gradient flows of smooth functionals on a Riemannian manifold by demanding that the tangent to the curve belongs to the subdifferential of the functional. We transpose the arguments and constructions of [1], Chapters 8 and 10, fromRn to the Riemannian setting. From then on we will focus on the relative entropy functional and assume that RicK. Building on the results in [14] we compute explicitly its subdifferential in Section4. This

(3)

will yield coincidence of the gradient flow with the heat flow as stated in Theorem2and make it possible to derive contractivity of the gradient flow and thus uniqueness. To establish existence we will introduce in Section5the time discrete variational approximation scheme of Otto. By this mean we will construct trajectories of the gradient flow for initial measures with finite entropy. In Section6we will use the contractivity to extend the gradient flow semigroup to arbitrary initial measures and thus conclude the proof of Theorem1.

2. Metric and Riemannian structure ofP2(M)

Recall that (M,·,·)is a smooth (of classC3) connected complete Riemannian manifold. Letm denote the Rie- mannian volume anddthe Riemannian distance onM. The Wasserstein space overMis defined as the space of Borel probability measures onMwith finite second moment:

P2(M):=

μ∈P(M)

M

d2(x0, x)dμ(x) <∞for some (hence all)x0M

.

Givenμ, ν∈P2(M)theirL2-Wasserstein distance is defined by dW(μ, ν):=inf

M×M

d2(x, y)dπ(x, y)πis a coupling ofμandν 1/2

. (3)

Here a probability measure π ∈P(M×M) is called a coupling of μand ν if its marginals are μ andν, i.e.

π(A×M)=μ(A), π(M×A)=ν(A)for all Borel setsAM. For anyμ, ν∈P2(M)there is at least one minimizer of (3) called an optimal coupling [14], Theorem 4.1. Formula (3) defines a metric onP2(M)(see [14], Definition 6.1), in fact(P2(M), dW)is a polish space. We denote the subspace of probability measures absolutely continuous w.r.t. the volume measurembyP2ac(M). The variational problem (3) is a particular instance of the Monge–Kantorovich mass transfer problem for the cost functiond2. See [14], Part I, for a comprehensive treatment of optimal transportation on Riemannian manifolds. Let us recall the following results.

Proposition 2.1. Letμ∈P2ac(M)andν∈P2(M).Then there exists a unique minimizerπin(3)andπ=(id, F )#μ for aμ-a.e.uniquely determined Borel mapF:MM.

Moreover,π-a.e.pair(x, y)is connected by a unique minimizing geodesic.Hence there exists a μ-a.s.unique vector fieldΨνμsuch that forμ-a.e.xM:

(i) F (x)=expxνμ(x)),

(ii) t→expx(tΨνμ(x)), t∈ [0,1]is a minimizing geodesic.

Proof. In the case of a compact Riemannian manifold (or compactly supportedμ andν) these results are due to McCann (cf. [9], Theorem 9). For a proof that the optimal coupling is unique and deterministic in the noncompact case see [14], Theorem 10.41. In fact the proof also shows thatπ-a.e. pair(x, y)is connected by a unique minimizing

geodesic (cf. also [14], (10.32)).

The vector field Ψνμ will be referred to as the optimal transport vector field. Note that we have the identity:

dW2(μ, ν)=

νμ|2dμ .One easily checks that tμt=exp

νμ

#μ, t∈ [0,1],

is a (constant speed) geodesic inP2(M)connectingμtoν. It turns out that this is the unique geodesic connectingμto νand that any two measures inP2(M)can be connected by geodesic (cf. [14], Corollaries 7.22, 7.23). By uniqueness of the optimal transport vector field it is immediate thatΨμμt =t·Ψνμ.

We say that a sequenceμnof probability measures onMconverges weakly toμif for every bounded and continu- ous functionfCb0(M):

fn

fdμ.It can be shown ([14], Theorem 6.9) that convergence in the metricdW is equivalent to weak convergence of probability measures plus convergence of second moments.

When considering curves in the metric spaceP2(M)our usual regularity assumption will be absolute continuity.

(4)

Definition 2.2 (Locally absolutely continuous curves). Let(X, d)be a metric space andI ⊂Ran open interval.A curvec:I−→X is said to be locally absolutely continuous of orderp(denoted bycAClocp (I, X))if there exists uLploc(I )such that

d

c(s), c(t)

t

s

u(τ )dτ ∀stI. (4)

It can be shown ([1], Theorem 1.1.2) that for every locally absolutely continuous curvecthe metric derivative c(t):=lim

h0

d(c(t+h), c(t))

|h|

exist for a.e.tI and is the minimaluin (4).

We now endow P2(M) with a kind of Riemannian differentiable structure and construct the tangent space TμP2(M)to the Wasserstein space at a given measureμ. We employ the approach of Ambrosio–Gigli–Savaré ([1], Chapter 8) and transpose their construction for the Euclidean caseP2(Rn)to our Riemannian setting. Let us denote byL2(μ, T M)the Hilbert space of measurable vector fieldsw:M−→T M,w(x)TxMwith finite 2-norm:

wL2(μ,T M)=

M

w(x),w(x)

xdμ(x) 1/2

.

Definition 2.3. Letμ∈P2(M).We defineTμP2(M):= {∇ϕ|ϕCc(M)}L2(μ,T M).

HereCc(M)denotes the space of smooth compactly supported functions onM. TμP2(M)is a Hilbert space endowed with the natural L2 scalar product which is our formal analogue to a Riemannian metric. We have the following characterization of tangent vector fields in terms of a minimizing property.

Lemma 2.4. Letμ∈P2(M)andvL2(μ, T M).Then the following are equivalent:

(i) vTμP2(M),

(ii) v+wL2(μ,T M)≥ vL2(μ,T M)∀w∈L2(μ, T M)such thatdiv(wμ)=0.

If equality holds in(ii)for somewL2(μ, T M)withdiv(wμ)=0thenw=0.

Herediv(wμ)=0is to be understood in the sense of distributions,i.e.

M

ϕ,wdμ=0 ∀ϕCc(M).

Proof. Note that{wL2(μ, T M)|div(wμ)=0}is the orthogonal complement of the closed subspaceTμP2(M)in

L2(μ, T M). Then the assertions are straightforward.

Note that if we denote byΠ the orthogonal projection ontoTμP2(M)then by (ii) we have thatwL2(μ,T M)ΠwL2(μ,T M)for allwL2(μ, T M).

Our definition of tangent bundle toP2(M)is justified by the following proposition.

Proposition 2.5. LetI be an open interval and(μt)tACloc2 (I,P2(M))with metric derivative|μ| ∈L2loc(I ).Then there exists a vector fieldv:I×MT M, (t, x)vt(x)TxMwithvtL2t,T M)L2loc(I )such that

vtTμtP2(M) for a.e.tI (5)

and the continuity equation

tμt+div(vtμt)=0 inI×M (6)

(5)

holds in the sense of distributions,i.e.

I

M

tϕ(t, x)+

vt(x),ϕ(t, x)

t(x)dt=0 ∀ϕCc(I×M).

The vector fieldvt is uniquely determined inL2t, T M)by(5)and(6)for a.e.tI and we havevtL2t,T M)=

|μ|(t).

Conversely,let(μt)tIbe a curve inP2(M)satisfying(6)for some vector field(vt)t with r

s

vt2L2t,T M)dt <∞ ∀s < rI.

Then(μt)t is locally absolutely continuous inI and|μ|(t)≤ vtL2t,T M).

We will view the vector fieldvt characterized by the previous proposition as the tangent vector to the curvet)t at timet. Note that among all vector fields satisfying (6) this one is also unique such thatvtL2t,T M)is minimal for a.e.tI. By Lemma2.4this minimality is characterized byvtTμtP2(M). Intuitively, if we pictureμt as a cloud of gas, the tangent vector field is the unique velocity field governing the evolution ofμt via the continuity equation with minimal kinetic energy.

By Proposition2.5we recover in our Riemannian setting the Benamou–Brenier formula (7) [3], which shows that our formal Riemannian structure onP2(M)is consistent with the metric space structure. For anyμ0, μ1∈P2(M), we have

dW20, μ1)=min 1

0

wt2L2t,T M)dt

, (7)

where the minimum is taken over all absolutely continuous curvest)t∈[0,1]connectingμ0toμ1with tangent vector field(wt)t. Just recall that|μ|(t)≤ wtL2t,T M)for any admissible curve to see thatdW20, μ1)is dominated by the right-hand side. Choose a geodesic and its tangent vector field to obtain equality.

Proof of Proposition2.5. It suffices to prove the case, where|μ| ∈L2(I ). Indeed, in the general case exhaustI by compact intervals. By uniqueness the resulting vector fields yield a well defined vector field onI. So note first that for everyϕCc(I×M)the maptμt(ϕ)is inAC2(I,R). Indeed choosing an optimal couplingπs,t ofμs and μtand applying the Hölder inequality we have

μt(ϕ)μs(ϕ)=

M×M

ϕ(x)ϕ(y)s,t(x, y)

≤Lip(ϕ)dWs, μt)

which implies absolute continuity. To estimate the metric derivative ofμt(ϕ)consider the upper semicontinuous and bounded map

H (x, y):= ∇ϕ(x) ifx=y,

|ϕ(x)ϕ(y)|

d(x,y) ifx=y

and setπh:=πs+h,s. By Hölder inequality we obtain

|μs+h(ϕ)μs(ϕ)|

|h| ≤ 1

|h|

M×M

d(x, y)H (x, y)hdWs+h, μs)

|h| M×M

H2(x, y)h

1/2

.

Lett)t be metrically differentiable ins. Ash→0 the marginals ofπh converge weakly toμs and therefore πh must converge weakly to(id, id)#μs the unique optimal coupling ofμs andμs (cf. [14], Theorem 5.20). Hence, we have

lim sup

h0

|μs+h(ϕ)μs(ϕ)|

|h| ≤μ(s)

M

|∇ϕ|2s

1/2

=μ(s)ϕL2s,T M). (8)

(6)

Now setQ:=I×Mand define a measure onQbyλ=

Iμtdt. LetϕCc(Q). Then we have using Fatou’s lemma and (8)

Q

sϕdλ =lim

h0

Q

ϕ(s, x)ϕ(sh, x)

h dλ(s, x)

=lim

h0

I

1 h

μs ϕ(s,·)

μs+h ϕ(s,·)

ds

I

μ(s)

M

|∇ϕ|2(s, x)s(x) 1/2

ds≤

I

μ2(s)ds 1/2

ϕL2(λ,T M).

LetV:= {∇ϕ|ϕCc (Q)}L2(λ,T M). The previous estimate shows that the linear functional L(ϕ):= −

Q

sϕ

can be uniquely extended to a bounded functional onV. By the Riesz representation theorem we obtain a uniquevV withv2L2(λ,T M)≤ |μ|L2(I )such that

Qv,ϕdλ=L(ϕ)= −

Q

sϕdλ ∀ϕCc(Q).

Settingvt =v(t,·)we obtain the continuity equation (6). Repeating the argument above withJI the uniqueness of the Riesz representation yields thatv:=v|J×M represents the restriction ofLto{∇ϕ|ϕCc(J ×M)}L2(λ,T M) and in particular

J

vs2L2s,T M)ds=v2

L2(λ,T M)

J

μ2(s)ds.

SinceJ is arbitrary we obtainvtL2t,T M)≤ |μ|(t)for a.e.tI. Equality will follow from the converse implica- tion.

Let us now show thatvtTμtP2(M)for a.e.tI. LetϕnCc(Q)such that∇ϕnvinL2(λ, T M)asn→ ∞.

Then there is a subsequence(nk)ksuch that for a.e.tI:∇ϕnk(t,·)vt inL2t, T M). This yields the claim.

LetwL2(λ, T M)be another vector field satisfying (5) and (6). Then div((wtvtt)=0 for a.e. t. Hence Lemma2.4yieldswtL2t,T M)= vtL2t,T M).

Furtheru:=12v+12walso satisfies (6) andutTμtP2(M). By the strict convexity of theL2-norm we infer that vt=wt for a.e.tI.

It remains to show the converse implication. So lett)tIbe a curve inP2(M)satisfying (6) for a square integrable vector field(vt)t. In estimating the differencedWs, μt)a recent result of Bernard ([4], Theorem 5.8) is very useful as pointed out to the author by Savaré. It states that the solutiont)t to the continuity equation for the vector field (vt)t can be represented as a superposition of random solutions to the equation

˙ γ (t)=vt

γ (t)

. (9)

Precisely there exists a probability measureΠon the space of continuous curvesC0(I, M)such that (et)#Π=μttI,

whereet is the evaluation map. Moreover,Π-a.e.γC0(I, M)is differentiable at a.e.tI and satisfies (9). Since (es, et)#Π is a coupling ofμs andμt we can now estimate using Jensen’s inequality:

dW2s, μt)

d2

γ (s), γ (t)

dΠ (γ )≤ t

s | ˙γ|(r)dr 2

dΠ (γ )

(ts)

t s

vr

γ (r)2drdΠ (γ )=(ts) t

s

M|vr|2rdr.

(7)

This implies thatt)t is absolutely continuous and|μ|(t)≤ vtL2t,T M for a.e.tI. To conclude this section we shed some more light on the relation between tangent vector fields and optimal transport maps. We observe that optimal transport vector fieldsΨνμ (see Proposition2.1) are tangent to P2(M)at the initial measure. Moreover, we recover the tangent to a curve inP2(M)as the limit of secants made up of transport maps.

Lemma 2.6. Letμ∈P2ac(M), ν∈P2(M)and letΨνμbe the optimal transport vector field.ThenΨνμTμP2(M).

Proof. Approximateμandνby compactly supported measuresμn, νn. Proposition 13.2 of [14] shows that this can be done in such a way, namely by compactly exhausting the set of geodesics along which mass is being transported, that the optimal transport vector fieldsΨνμnnandΨνμcoincide a.e. on the support ofμn. Possibly after settingΨνμnn=0 outside supp(μn)we clearly haveΨνμnnΨνμinL2(μ, T M). In the compactly supported caseΨνμnnis given as∇ψ, whereψis a function satisfying the so-calledd22-convexity property which is Lipschitz on supp(μn)([14], Theorems 10.41 and 10.26). By settingψ=0 outside supp(μn)and regularizing (e.g. in local charts) we can findψεCc(M) such that∇ψε→ ∇ψ m-a.e. Sinceμis absolutely continuous the dominated convergence theorem yields convergence

inL2(μ, T M). This showsΨνμTμP2(M).

Lemma 2.7. Let(μt)tACloc2 (I,P2ac(M))and(vt)tthe tangent vector field characterized by Proposition2.5.Then we have for a.e.tI:

hlim0

1

hΨμμtt+h=vt inL2t, T M). (10)

Proof. Choose a countable subsetD⊂Cc(M)which is dense w.r.t. the norm ϕC1:=sup

xM

ϕ(x)+sup

xM

ϕ(x)

x.

Fixϕ∈Dand letηCc(I ). By the continuity equation (6) and a change of variables we have 0=

I

M

tη(t )·ϕ+ ∇ϕ,vtη(t )tdt

=

I

−lim

h0

μt+h(ϕ)μt(ϕ)

h +

M

ϕ,vtt

η(t )dt.

Sinceηis arbitrary we must have that for a.e.tI:

hlim0

μt+h(ϕ)μt(ϕ)

h =

Mϕ,vtt. (11)

SinceDis countable and the metric derivative exists a.e. we can find a negligible set of timesN such that for every tI\N we have that (11) holds for everyϕ∈Dandsμs is metrically differentiable att. Now fixtI\N and set

Ψh:=1 hΨμμtt+h.

LetΨ0be a weak limit point ofΨhinL2t, T M)ash→0. Forϕ∈Dwe have using Taylor expansion 1

h

μt+h(ϕ)μt(ϕ)

=1 h

M

ϕ◦exp(hΨh)ϕt=

M

ϕ,Ψht+O(h).

Combining this with (11) we obtain

M

ϕ,Ψ0t=

M

ϕ,vtt.

(8)

By the density ofDwe conclude that in the sense of distributions div

0vt)·μt

=0. (12)

Since theL2-norm is weakly lower semicontinuous we have that Ψ0L2t,T M)≤lim inf

h0 ΨhL2t,T M)=lim

h0

1

hdWt+h, μt)

(t)= vtL2t,T M). (13)

From (12) and (13) we infer with Lemma2.4thatΨ0=vt. Since weak convergence and convergence of the norm

imply strong convergence this yields the claim.

3. Subdifferentials and gradient flows

LetΦ:P2(M)(−∞,+∞]be a functional on the Wasserstein space. Denote by D(Φ):=

μ∈P2(M)|Φ(μ) <

the proper domain ofΦ. We will establish a notion of (sub-)differential ofΦ modeled on the classical one in linear spaces. Recall that for a functionalF:X→R∪ {∞}on a Hilbert spaceXthe subdifferential at a pointxD(F )is defined by

v∂F (x): ⇔ F (y)F (x)v, yx +o

yx .

To transpose this definition to the Wasserstein space we proceed as follows: ifμ is the reference point we intend the scalar product to be taken inL2(μ, T M)and the displacement vectoryxcorresponds to the optimal transport vector fieldΨνμ, which is defined ifμ∈P2ac(M). Hence, we shall assume that

D(Φ)⊂P2ac(M).

Note that this is the case for the relative entropy functional.

Definition 3.1 (Subdifferential). LetμD(Φ)and wL2(μ, T M).We say thatwbelongs to the subdifferential

∂Φ(μ)if

Φ(ν)Φ(μ)

M

w,Ψνμ dμ+o

dW(μ, ν)

ν∈P2(M).

A vector fieldw∂Φ(μ)is said to be a strong subdifferential if it also satisfies Φ

exp(Ψ)#μ

Φ(μ)

M

w,Ψdμ+o

ΨL2(μ,T M)

∀Ψ ∈L2(μ, T M).

Note that the vectorwin the definition of subdifferential only acts on tangent vectors by Lemma2.6. We conclude that the orthogonal projectionΠwonTμP2(M)is in∂Φ(μ)wheneverwis. The following lemma shows that subd- ifferentials inTμP2(M)are strong subdifferentials. It is a direct adaption of [2], Proposition 4.2, to the Riemannian setting.

Lemma 3.2. LetμD(Φ)andw∂Φ(μ)TμP2(M).Thenwis a strong subdifferential.

Proof. We argue by contradiction. Supposew is not a strong subdifferential. Then there isδ >0 and a sequence (un)n∈NL2(μ, T M)such that

εn:= unL2(μ,T M)→0 and Φ(μn)Φ(μ)

M

w,undμ≤ −δεn, (14)

(9)

whereμn:=exp(un)#μ. SettingΨn:=Ψμμnwe have

ΨnL2(μ,T M)=dW(μ, μn)≤ unL2(μ,T M)=εn→0. (15)

Hence by the definition of subdifferential there isN such that Φ(μn)Φ(μ)

M

w,Ψndμ−δ

2εnn > N. (16)

Combining (14) and (16) yields:

Mw,Ψnundμ≤ −δ

2εnn > N. (17)

Note that by (14) and (15)nn)n and(unn)nare bounded sequences inL2(μ, T M). Hence up to extracting a subsequence we can assume that

Ψn

εn Ψ, un

εn u weakly inL2(μ, T M).

Dividing byεnin (17) and passing to the limit asn→ ∞yields

Mw,Ψudμ≤ −δ

2. (18)

Now letϕCc(M). Asun,Ψn →0 we obtain by Taylor expansion that for some constantC and allnlarge enough:

0=

M

ϕ expx

Ψn(x)

ϕ expx

un(x)

M

ϕ,Ψnundμ+C

Ψn2+ un2

. (19)

Now dividing byεnand passing to the limit in (19) yields:

Mϕ,Ψudμ≥0 ∀ϕCc(M).

Sincewbelongs toTμP2(M)in which gradients of smooth functions are dense we also have

M

w,Ψudμ≥0

in contradiction to (18). Hencewmust be a strong subdifferential.

As a metric substitute for the norm of the gradient of a functional we define the metric slope.

Definition 3.3 (Metric slope). ForμD(Φ)we define:

|∂Φ|(μ):=lim sup

νμ

(Φ(μ)Φ(ν))+

dW(μ, ν) . (20)

Note that ifw∂Φ(μ)it is immediate from the definitions that |∂Φ|(μ)wL2(μ,T M). In the sequel we will exploit that the functional we consider enjoys a certain convexity property along geodesics inP2(M). More precisely:

(10)

Definition 3.4. LetK∈R.The functionalΦis called(displacement)K-convex,if for each pairμ0, μ1D(Φ)there exists a geodesic(μt)t∈[0,1]inP2(M)such that for eacht∈ [0,1]:

Φ(μt)t Φ(μ1)+(1t )Φ(μ0)t (1t )K

2dW20, μ1). (21)

Note that in our case there is exactly one geodesic connectingμ0toμ1by Proposition2.1since we assume that D(Φ)⊂P2ac(M). This notion of convexity for functionals on the Wasserstein space has been introduced by McCann in [8]. ForK-convex functionals the subdifferential can be characterized by a variational inequality.

Lemma 3.5. LetΦ beK-convex and letμD(Φ),wL2(μ, T M).Thenw∂Φ(μ)iff Φ(ν)Φ(μ)

M

w,Ψνμ

dμ+K

2dW2(μ, ν)ν∈P2(M). (22)

The metric slope can be represented as

|∂Φ|(μ)=sup

ν=μ

Φ(μ)Φ(ν) dW(ν, μ) +K

2dW(ν, μ) +

. (23)

Proof. The if implication is obvious. So letw∂Φ(μ)andνD(Φ)(otherwise, there is nothing to prove). Let t)t∈[0,1]be the geodesic betweenμandν.K-convexity (21) yields:

Φ(μt)Φ(μ)

tΦ(ν)Φ(μ)(1t )K

2dW2(μ, ν). (24)

By the definition of subdifferential we also have:

lim inf

t0

Φ(μt)Φ(μ)

t ≥lim inf

t0

1 t

M

w,Ψμμt dμ=

M

w,Ψνμ dμ,

where we have used the fact that Ψμμt =t·Ψνμ and dW(μ, μt)=t·dW(μ, ν). To obtain the nontrivial inequality for the metric slope we use again (24) and

lim inf

t0

Φ(μt)Φ(μ)

t =lim inf

t0

Φ(μt)Φ(μ)

dW(μ, μt) dW(μ, ν)≥ −|∂Φ|(μ)·dW(μ, ν).

Proposition 3.6 (Chain rule). LetΦ beK-convex.Let(a, b)be an open interval and(μt)tAC2((a, b), D(Φ)) with tangent vector field(vt)t.Assume that

a b

|∂Φ|t)·μ(t)dt <. (25)

Then the maptΦ(μt)is absolutely continuous in(a, b)and we have for a.e.t: d

dtΦ(μt)=

M

ξ,vtt ∀ξ∈∂Φ(μt). (26)

Proof. For absolute continuity of the maptΦ(μt)see the proof of [1], Corollary 2.4.10, which relies on (23). In view of (25) and Lemma2.7we have for a.e.t

|∂Φ|t) <, 1 hΨμμtt+h

h0

vt inL2t), sΦ(μs)is differentiable att.

(11)

Let us compute the derivative ofΦ(μs)at such a timet: forξ∂Φ(μt)we can estimate Φ(μt+h)Φ(μt)

M

ξ,Ψμμtt+h

t+K

2dW2t, μt+h)

=h

M

ξ,vtt+h

M

ξ,1

hΨμμt+htvt

t+K

2dW2t, μt+h)

=h

M

ξ,vtt+o(h).

Dividing byhand taking left and right limits yields the claim d

dt

+

Φ(μt)

M

ξ,vtt≥ d dt

Φ(μt).

With the notions of tangent and subdifferential at hand we can now define gradient flows in the Wasserstein space P2(M)in analogy to gradient flows of smooth functionals on Riemannian manifolds.

Definition 3.7 (Gradient flow). Let(μt)t0be a continuous curve inP2(M)which belongs toACloc2 ((0,),P2(M)).

Let(vt)tbe the tangent vector field characterized by Proposition2.5.We say that(μt)t0is a trajectory of the gradient flow forΦif it satisfies the gradient flow equation

−vt∂Φ(μt) for a.e.t >0. (27)

Observe that the gradient flow equation implies that|μ|(t)· |∂Φ|t)≤ vt2L2t,T M). The right-hand side is in L1loc(R+)so Proposition3.6yields that the maptΦ(μt)is locally absolutely continuous in (0,)and for a.e.

t >0:

−d

dtΦ(μt)= vt2L2t,T M). (28)

In particularΦ(μt) <∞for allt >0. Integrating (28) we obtain the so called energy identity:

Φ(μb)Φ(μa)= − b

a

vt2L2t,T M)dt ∀0< a < b. (29)

4. Identification of the gradient flow for the relative entropy

In this section we shall turn to the relative entropy functional and compute its subdifferential. This will yield the proof of Theorem2.

Recall that the relative entropyH:P2(M)→ [−∞,∞]is defined by H (μ):=

M

ρlogρdm, (30)

provided thatμis absolutely continuous with densityρ andρ(logρ)+is integrable. Otherwise setH (μ):= ∞.In particular D(H )⊂P2ac(M). Convexity properties of this functional are intimately related to the curvature ofM.

Sturm and others (see [12], Theorem 1.3) showed thatHis displacementK-convex if and only if

Ricx(ξ,ξ)K· ξ2xM, ξTxM. (31)

So let us assume from now on that the Ricci curvature ofMis bounded below by someK∈Rin the sense of (31) which we will denote in short byRicK. Under this assumption the relative entropy functionalH takes values in

(12)

(−∞,∞], precisely we prove an estimate which will be useful later on. Fix a pointoMand denote the second moment by

m2(μ):=

M

d2(o, x)dμ(x). (32)

Lemma 4.1. Let RicK.For any ε >0 there exists a constant Cε >0 depending only on ε such that for any μ∈P2(M):

H (μ)≥ −Cεε·m2(μ) >−∞. (33)

Proof. Letε >0. We can assume thatμ=ρ·m∈P2ac(M), otherwise there is nothing to prove. Let c >0 to be chosen later and observe thatr|logr| ≤√

rfor allr∈ [0,1]. Using this in the second step we can boundH (μ)as

M

logρ)dm=

{ρecd(o,·)}ρ|logρ|dm+

{ecd(o,·)1}ρ|logρ|dm

M

ec/2d(o,x)dm(x)+c

M

d(o, x)ρ(x)dm(x)

M

ec/2d(o,x)dm(x)+ε·m2(μ)+c2,

where we used the elementary inequalitycyεy2+c2/(4ε). By the Bishop volume comparison theorem ([5], Theo- rem 3.9), the volume of geodesic ballsBr(o)inMgrows at most exponentially, i.e.m(Br(o))a(ebr−1)for suitable constanta, b >0. Hence we conclude that forc >2b

M

ec/2d(o,x)dm(x)= 1

0

m

ec/2d(o,x)t

dt <∞,

which proves the claim.

We now compute explicitly the subdifferential of the relative entropy functional starting with the following lemma which shows that we can take “directional derivatives” of the entropy along the geodesic flow of smooth vector fields.

Let us denote byCc(M, T M)the space of smooth compactly supported vector fields onM.

Lemma 4.2. LetξCc(M, T M)andμ=ρ·mD(H ).We setTt :=exp(tξ)fort∈Randμt:=Tt#μ.Then we have:

tlim0

H (μt)H (μ)

t = −

M

ρ·divξdm. (34)

Proof. Fort sufficiently smallTt is a diffeomorphism. We denote byJt:=det(dTt)its Jacobian determinant. Since ξ is compactly supported andJ0=1 there is c >0 such that 1/c≤Jt(x)cfor xM, t∈ [−ε, ε]andεsmall enough. By the change of variables formulaμt is again absolutely continuous with densityρt=·Jt1)Tt1and

H (μt)=

M

F (ρt)dm=

M

F ρ

Jt

Jtdm <∞, whereF (r):=r·log(r). Direct computation yields:

d

dt F ρ(x) Jt(x)

Jt(x)

= −ρ(x)J˙t(x) Jt(x).

Références

Documents relatifs

It may be expected that this solution concept, at least in the case of energy convergence, satisfies a weak-strong uniqueness principle similar to the ones in the forthcoming works

Plus précisément on se propose de : – montrer que pour tout pair il existe un cycle homocline qui possède points d’équilibre ; – montrer que le seul cycle avec un nombre impair

Keywords: Constant vector fields, measures having divergence, Levi-Civita connection, parallel translations, Mckean-Vlasov equations..

Le but de cette thèse est d’étendre cette théorie pour étudier des équations ne provenant pas de flots gradient dans l’espace de Wasserstein, par exemple, les systèmes

Keywords: Wasserstein space; Geodesic and Convex Principal Component Analysis; Fréchet mean; Functional data analysis; Geodesic space;.. Inference for family

Moreover, under some regularity assumption on the initial datum u 0 , such a variational problem allows to define a weak formulation of the Hele-Shaw flow [9, 11] (see also [12] for

This numerical study, which may model any Lagrangian evolution of orientation, shows that the scalar gradient does not align with the equilibrium direction if nonstationary ef-