Optimal dynamic transport with mass consumption


HAL Id: hal-02516695

https://hal.archives-ouvertes.fr/hal-02516695v3

Preprint submitted on 9 Apr 2020



Romain Hug

To cite this version:

Romain Hug. Optimal dynamic transport with mass consumption. 2020. ⟨hal-02516695v3⟩

Research report^∗: Optimal dynamic transport with mass consumption

Romain Hug^†

The problem of optimal transport remained largely academic for a long time, at least in the formulation that G. Monge gave of it, and took on a more concrete aspect with L. Kantorovich's idea of formulating it as an optimal allocation of resources. More recently, works such as those by R. J. McCann [13] and by Y. Brenier and J.-D. Benamou [1] made it possible to give a dynamic formulation of this type of problem (mainly for quadratic Euclidean transport costs). This formulation opens the way to easier numerical treatment, since it allows classical numerical methods from variational and constrained convex optimization to be applied. For absolutely continuous measures with $L^2$-densities, we can cite in particular the augmented Lagrangian method [1] and proximal splitting or primal-dual methods [8, 14, 3]. Our present work is based on the augmented Lagrangian formulation.

An important tool from Optimal Transportation theory is the Wasserstein distance, a metric that quantifies the difference between two probability measures (or, more generally, between two measures with the same "mass"). This distance represents a "global transportation cost" between two measures, assuming that the local cost of displacing mass depends on the distance and linearly on the local mass.

One of the main limitations of Optimal Transport is that it can only be applied between measures with the same "total mass". This is why many research works have proposed various extensions of these Wasserstein distances to spaces of measures with potentially different "masses" [2, 17, 16].

The use which is made of Optimal Transport for interpolation problems has also encouraged research for new optimal "unbalanced" mass transport models, and in particular dynamic models.

Indeed, in addition to offering a variational framework better suited to numerical processing, those models provide a continuous time evolution (a "geodesic interpolation" for the Wasserstein metric in the classical case) between the source and target measures. The dynamic formulation proposed in [1] represents the constraint of mass conservation during the time-continuous transport with a conservative continuity equation. Their model uses absolutely continuous measures with $L^2$-densities: if $\rho(t,x)$ represents the time evolution of the interpolated density, and $v(t,x)$ represents the velocity field derived from the transportation map, we have $\partial_t\rho + \operatorname{div}(\rho v) = 0$.

Many models that aim to generalize these interpolations to measures or densities with different masses use a source term in the continuity equation (which becomes $\partial_t\rho + \operatorname{div}(\rho v) = f$), and this term has to be taken into account in the energy functional (see [9, 10], or [6, 5, 15, 11, 12], which also include applications to image processing, or [7] for a use in data assimilation).

∗Research report produced within the team Optimization and Optimal Control (RICAM, Johannes Kepler University, 4040 Linz, Austria, 2017-2018)

†Université d’Artois, LML, 62300 Lens, France


The model we propose to study in the present work starts from this idea of introducing a source term in the continuity equation and adapting the energy term. However, it does not currently aim to create an interpolation between two different measures or densities, but rather goes back directly to an optimal allocation problem under the constraint of a displacement cost. A non-standard feature of our model is that the cost directly represents a loss of the transported mass: the objective of the problem is therefore to minimize this mass loss. In this model, the initial density does not need to be entirely transported, and the question is to determine the correct allocation for a part of this initial mass, taking into account its dissipation along the way.

In this report, we will limit ourselves to the proof of the equivalence between the resolution of the strict statement of this model (but whose dynamic formulation is reduced to a non-convex optimization problem), and the resolution of a relaxed version of it (which is a convex problem).

1 Brief statement of the problem

Consider a production area (for instance of coal) and an area to supply. The production area produces more coal than the area to be supplied needs. However, transport also requires coal consumption. The objective is to determine an optimal allocation map that ensures supply while minimizing the consumption of coal required for transportation.

[Figure: transport from the production area to the consumption area, showing the resource to deliver and the resource consumed by transportation.]


We choose a dynamic model (inspired by [1]) and we denote by $\rho$ the mass to be transported and by $m$ the momentum. In this first model, we assume that consumption is proportional to the energy dissipated by solid friction (for example the friction of the train on the rails), and therefore that it is proportional to the norm of the momentum $|m|$, with a coefficient of dispersion $k$. Here, we choose to neglect the consumption due to the initial impulse (even though it may be interesting to look into). It is clear that several solutions are possible for such a model. We are looking for a dynamic model which does not take into account the mass that is not displaced (which is useless), and which does not allow trajectories to overlap. That is why, in addition to the energy term $|m|$, we add a penalization term $(\varepsilon/2)\left(|m|^2/(2\rho) + \rho\right)$ with $\varepsilon > 0$ fixed.

Solving this first problem opens the way to the asymptotic analysis as $\varepsilon \to 0$ (preferably in $L^2$ space).

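We set the domain $Q = (0,1)\times\Omega$, with $\Omega$ a convex bounded (sufficiently regular) domain of $\mathbb{R}^d$. Let $\rho_0$ and $\rho_1$ be two non-negative densities of $L^2(\Omega)$, with compact supports in $\Omega$, such that $\int_\Omega \rho_0\,dx \ge \int_\Omega \rho_1\,dx$.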

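We define the following kinetic energy term $(1/2)J$:

$$
J(\alpha,\beta) =
\begin{cases}
\dfrac{|\beta|^2}{\alpha} & \text{if } \alpha > 0,\\[4pt]
0 & \text{if } (\alpha,\beta) = (0,0),\\[2pt]
+\infty & \text{else.}
\end{cases}
\tag{1-1}
$$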
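As a purely illustrative aid, the pointwise definition (1-1) can be transcribed into a few lines of Python. The function name `kinetic_J` and the representation of $\beta$ as a tuple of floats are assumptions made for this sketch; they are not part of the model.

```python
import math


def kinetic_J(alpha, beta):
    """Pointwise value of J(alpha, beta) as in (1-1).

    alpha is a float (density value), beta a sequence of d floats
    (momentum value). Both conventions are assumptions of this sketch.
    """
    beta_sq = sum(component * component for component in beta)
    if alpha > 0:
        return beta_sq / alpha
    if alpha == 0 and beta_sq == 0:
        return 0.0
    # Negative density, or momentum carried by zero density.
    return math.inf


print(kinetic_J(2.0, (1.0, 1.0)))
print(kinetic_J(0.0, (0.0, 0.0)))
print(kinetic_J(0.0, (1.0, 0.0)))
```

The value $+\infty$ outside the admissible region is what forces $\rho \ge 0$ and $m = 0$ wherever $\rho = 0$ for any pair of finite energy.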

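We also define the space of constraints $\mathcal{C}$:

$$
\mathcal{C} = \left\{ (\rho,m)\in L^2(Q)^{d+1},\ \rho(0,\cdot)\le\rho_0,\ \rho(1,\cdot)=\rho_1,\ \partial_t\rho + \operatorname{div}_x(m) = -k|m|,\ \langle m,\vec{n}\rangle_{\partial\Omega} = 0 \right\}
\tag{1-2}
$$

We want to solve the following problem:

$$
\inf_{(\rho,m)\in\mathcal{C}} \left\{ \frac{\varepsilon}{2}\int_Q \bigl(J(\rho,m)+\rho\bigr)\,dx\,dt + \int_Q |m|\,dx\,dt \right\}
\tag{1-3}
$$

In order to obtain a dual formulation of this problem more easily, we choose to enlarge the space of constraints as follows: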

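$$
\mathcal{C}^{re} = \left\{ (\rho,m)\in L^2(Q)^{d+1},\ \rho(0,\cdot)\le\rho_0,\ \rho(1,\cdot)\ge\rho_1,\ \partial_t\rho + \operatorname{div}_x(m) \le -k|m|,\ \langle m,\vec{n}\rangle_{\partial\Omega} \ge 0 \right\}
\tag{1-4}
$$

We therefore study the existence and uniqueness of a density-momentum pair $(\rho^*, m^*)$ such that

$$
(\rho^*, m^*) = \operatorname*{argmin}_{(\rho,m)\in\mathcal{C}^{re}} \left\{ \frac{\varepsilon}{2}\int_Q \bigl(J(\rho,m)+\rho\bigr)\,dx\,dt + \int_Q |m|\,dx\,dt \right\}
\tag{1-5}
$$

and we will prove (in section 3) that such a pair ultimately satisfies $(\rho^*, m^*)\in\mathcal{C}$.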


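To model the constraint space $\mathcal{C}^{re}$, we introduce the Lagrange multiplier $\varphi$ as follows:

$$
\begin{aligned}
&\inf_{(\rho,m)\in\mathcal{C}^{re}} \int_Q \left[\frac{\varepsilon}{2}\bigl(J(\rho,m)+\rho\bigr) + |m|\right] dx\,dt \\
&\quad= \inf_{(\rho,m)\in L^2(Q)}\ \sup_{0\le\varphi\in H^1(Q)} \biggl\{ \int_Q \left[\frac{\varepsilon}{2}\bigl(J(\rho,m)+\rho\bigr) + |m|\right] dx\,dt
 + \int_Q \varphi\,\bigl(\partial_t\rho + \operatorname{div}_x(m) + k|m|\bigr)\,dx\,dt \\
&\qquad\qquad + \int_\Omega \bigl(\rho(0,\cdot)-\rho_0\bigr)\varphi(0,\cdot)\,dx - \int_\Omega \bigl(\rho(1,\cdot)-\rho_1\bigr)\varphi(1,\cdot)\,dx \biggr\} \\
&\quad= \inf_{(\rho,m)\in L^2(Q)}\ \sup_{0\le\varphi\in H^1(Q)} \biggl\{ \int_Q \left[\frac{\varepsilon}{2}\bigl(J(\rho,m)+\rho\bigr) + (k\varphi+1)|m|\right] dx\,dt \\
&\qquad\qquad + \int_Q \nabla_{t,x}\varphi\cdot(\rho,m)\,dx\,dt + \int_\Omega \bigl[\varphi(1,\cdot)\rho_1 - \varphi(0,\cdot)\rho_0\bigr]\,dx \biggr\}
\end{aligned}
\tag{1-6}
$$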

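By defining $\mu = (\rho,m)\in L^2(Q)^{d+1}$, $G(\varphi) = \int_\Omega \bigl[\varphi(1,\cdot)\rho_1 - \varphi(0,\cdot)\rho_0\bigr]\,dx$, and using the Legendre transformation

$$
\int_Q \left[\frac{\varepsilon}{2}\bigl(J(\rho,m)+\rho\bigr) + (k\varphi+1)|m|\right] dx\,dt = \sup_{q}\ \bigl[q\cdot\mu - F(\varphi,q)\bigr]
\tag{1-7}
$$

with

$$
F(\varphi,q) = i_{\mathcal{P}_\varphi}(q) =
\begin{cases}
0 & \text{if } q\in\mathcal{P}_\varphi,\\
+\infty & \text{else,}
\end{cases}
\tag{1-8}
$$

and

$$
\mathcal{P}_\varphi = \left\{ q=(a,b)\in L^2(Q)\times L^2(Q)^d,\ a + \frac{1}{2\varepsilon}\bigl(\max\{|b|,\,k\varphi+1\} - k\varphi - 1\bigr)^2 \le \frac{\varepsilon}{2}\ \text{a.e.} \right\},
\tag{1-9}
$$

we then have

$$
-\inf_{(\rho,m)\in\mathcal{C}^{re}} \int_Q \left[\frac{\varepsilon}{2}\bigl(J(\rho,m)+\rho\bigr) + |m|\right] dx\,dt
= \inf_{0\le\varphi\in H^1}\ \inf_{q\in L^2}\ \sup_{\mu\in L^2}\ \bigl[F(\varphi,q) + G(\varphi) + \langle\mu,\nabla_{t,x}\varphi - q\rangle_{L^2}\bigr]
\tag{1-10}
$$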
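The Legendre transformation (1-7) can be checked pointwise on a small numerical example. The sketch below is only a sanity check under stated assumptions: it works in the scalar case $d = 1$ with $\rho > 0$, freezes $c = k\varphi + 1$ at a single point, writes `eps` for the fixed penalization parameter, and takes the constraint set in the form $a + \frac{1}{2\,\mathrm{eps}}(\max\{|b|, c\} - c)^2 \le \mathrm{eps}/2$. The function names and the grid parameters are hypothetical.

```python
def legendre_rhs(rho, m, c, eps):
    """Closed form (eps/2) * (m^2 / rho + rho) + c * |m|, valid for rho > 0."""
    return 0.5 * eps * (m * m / rho + rho) + c * abs(m)


def legendre_sup(rho, m, c, eps, b_max=50.0, steps=200001):
    """Grid approximation of sup over (a, b) in P_phi of a * rho + b * m.

    For rho > 0 the best a saturates the constraint, so only b is scanned
    on a uniform grid of [-b_max, b_max].
    """
    best = -float("inf")
    for i in range(steps):
        b = -b_max + 2.0 * b_max * i / (steps - 1)
        excess = max(abs(b), c) - c
        a = 0.5 * eps - excess * excess / (2.0 * eps)
        best = max(best, a * rho + b * m)
    return best


rho, m, k, phi, eps = 2.0, -3.0, 0.5, 1.0, 0.1
c = k * phi + 1.0
print(legendre_rhs(rho, m, c, eps), legendre_sup(rho, m, c, eps))
```

The two printed values should agree up to the resolution of the grid; the maximizing $b$ has norm $c + \mathrm{eps}\,|m|/\rho$ and points in the direction of $m$.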

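Therefore, we search for a saddle point $(\varphi^*, q^*, \mu^*)$ in $S = H^1_+(Q)\times L^2(Q)^{d+1}\times L^2(Q)^{d+1}$ (where $H^1_+(Q) = \{\varphi\in H^1(Q),\ \varphi\ge 0\}$) of the Lagrangian $L$ defined by

$$
L(\varphi,q,\mu) = F(\varphi,q) + G(\varphi) + \langle\mu,\nabla_{t,x}\varphi - q\rangle_{L^2(Q)},
\tag{1-11}
$$

i.e. such that $L(\varphi^*,q^*,\mu) \le L(\varphi^*,q^*,\mu^*) \le L(\varphi,q,\mu^*)$ for all $(\varphi,q,\mu)\in S$.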

In the following, when there is no ambiguity, we will write $(\varphi^*, q^*, \mu^*)$ instead of $(\varphi^*_\varepsilon, q^*_\varepsilon, \mu^*_\varepsilon)$ in order to lighten the notation.

First of all, we would like to study the relevance of an augmented Lagrangian method (inspired by the algorithm developed in [1]): in this new formulation, where the old primal variable $\mu = (\rho,m)$ has become the Lagrange multiplier and, conversely, the old Lagrange multiplier $\varphi$ has become a primal variable, the new (linear) constraint is $\nabla_{t,x}\varphi = q$. Hence, we introduce, for all $r\ge 0$, the augmented Lagrangian $L_r$ by:

$$
L_r(\varphi,q,\mu) = F(\varphi,q) + G(\varphi) + \langle\mu,\nabla_{t,x}\varphi - q\rangle_{L^2(Q)} + \frac{r}{2}\,\|\nabla_{t,x}\varphi - q\|^2_{L^2(Q)}.
\tag{1-12}
$$

Remark 1.1. In this Lagrangian formulation, we choose a partial dualisation by including the non-linear part of the constraint in the energy term (which then depends on the Lagrange multiplier $\varphi$). This choice seemed easier to us with regard to the purely theoretical study of the properties of the solutions of the model we propose here.

For numerical purposes, when applying an augmented Lagrangian method (as for instance in [1]), we would rather choose a total dualisation (thus not including the non-linear part of the constraint in the energy term), in order to obtain a Lagrangian formulation equivalent to the first one but easier to exploit numerically. We could use the following Lagrangian:

$$
L(\varphi,q,\mu) = F(q) + G(\varphi) + \langle\mu,\nabla_{t,x}\varphi - k\varphi\,c(b) - q\rangle_{L^2(Q)},
$$

where we define $c(b)(t,x) = \begin{pmatrix} 0 \\ \dfrac{b(t,x)}{|b(t,x)|} \end{pmatrix}$ if $b(t,x)\neq 0$ and $c(b)(t,x) = 0$ if $b(t,x) = 0$, and also $F(q) = \mathrm{Ind}_{\mathcal{P}_0}(q)$, with

$$
\mathcal{P}_0 = \left\{ q=(a,b)\in L^2(Q)\times L^2(Q)^d,\ a + \frac{1}{2\varepsilon}\bigl(\max\{|b|-1,\,0\}\bigr)^2 \le \frac{\varepsilon}{2}\ \text{a.e.} \right\}.
$$

The corresponding augmented Lagrangian becomes:

$$
L_r(\varphi,q,\mu) = F(q) + G(\varphi) + \langle\mu,\nabla_{t,x}\varphi - k\varphi\,c(b) - q\rangle_{L^2(Q)} + \frac{r}{2}\,\|\nabla_{t,x}\varphi - k\varphi\,c(b) - q\|^2_{L^2(Q)}.
$$

2 Some theoretical results on the augmented Lagrangian formulation

Let us characterize the saddle points of the Lagrangians $L_r$, which are, as we will see, the same as those of $L$.

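Proposition 2.1. Let $\varepsilon > 0$ and $(\varphi^*, q^*, \mu^*)\in S$. Then, for all $r\ge 0$, the triplet $(\varphi^*, q^*, \mu^*)$ is a saddle point of $L_r$ if and only if it satisfies the following conditions:

1. $q^* = \nabla_{t,x}\varphi^*$,

2. $q^* = (a^*, b^*)\in\mathcal{P}_{\varphi^*}$,

3. the vector $\mu^* = (\rho^*, m^*)$ is orthogonal to the paraboloid $\mathcal{P}_{\varphi^*}$ at $q^*$, i.e. $\langle\mu^*, q - q^*\rangle_{L^2}\le 0$ for all $q\in\mathcal{P}_{\varphi^*}$,

4. $\mu^* = (\rho^*, m^*)\in\mathcal{C}^{re}$: more precisely, we have

$$
\forall\varphi\in H^1(Q),\ \varphi\ge 0,\qquad \int_Q \bigl(\partial_t\varphi\,\rho^* + \nabla_x\varphi\cdot m^*\bigr)\,dx\,dt - k\int_Q |m^*|\,\varphi\,dx\,dt + G(\varphi)\ge 0.
\tag{2-13}
$$

In particular, for $\varphi = \varphi^*$, we have:

$$
\int_Q \bigl(\partial_t\varphi^*\,\rho^* + \nabla_x\varphi^*\cdot m^*\bigr)\,dx\,dt - k\int_Q |m^*|\,\varphi^*\,dx\,dt + G(\varphi^*) = 0.
\tag{2-14}
$$

Since this is true for any $r\ge 0$, the saddle points are common to all the Lagrangians $L_r$, and in particular to $L$ (the case $r = 0$).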

Before proving this proposition, we need to state (and prove) another one, which characterizes more specifically the optimal density-momentum pair $\mu^* = (\rho^*, m^*)$.

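Proposition 2.2. Let $\mu^* = (\rho^*, m^*)\in L^2(Q)^{d+1}$ and $q^* = (a^*, b^*)\in\mathcal{P}_{\varphi^*}$, such that $\mu^*$ is orthogonal to $\mathcal{P}_{\varphi^*}$ at $q^*$, i.e. $\langle\mu^*, q - q^*\rangle_{L^2}\le 0$ for all $q\in\mathcal{P}_{\varphi^*}$. Then:

1. We have $\rho^*\ge 0$.

2. For almost all $(t,x)\in Q$,

$$
m^*(t,x) =
\begin{cases}
\left[\dfrac{\rho^*}{\varepsilon}\left(1 - \dfrac{k\varphi^*+1}{|b^*|}\right) b^*\right](t,x) & \text{if } |b^*|(t,x)\ge k\varphi^*(t,x)+1\ge 1,\\[6pt]
0 & \text{else.}
\end{cases}
\tag{2-15}
$$

In other words,

$$
m^* = w^*\rho^* b^*,\quad\text{with}\quad w^* = \max\left\{0,\ \frac{1}{\varepsilon}\left(1 - \frac{k\varphi^*+1}{|b^*|}\right)\right\}\quad\text{(we consider } 1/0 = +\infty\text{)}.
\tag{2-16}
$$

3. We have $a^* = -\dfrac{\varepsilon\,(w^*|b^*|)^2}{2} + \dfrac{\varepsilon}{2}$ on the support of $\rho^*$, i.e.

$$
\rho^*\left(a^* + \frac{\varepsilon\,(w^*|b^*|)^2}{2} - \frac{\varepsilon}{2}\right) = 0.
$$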
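To make the statement concrete, the following sketch evaluates the pointwise formulas of Proposition 2.2 at a single point $(t,x)$. It assumes that the fixed parameter is written `eps`, that $w^* = \max\{0, (1 - (k\varphi^*+1)/|b^*|)/\mathrm{eps}\}$, and that $b^*$ is given as a tuple of floats; the function names `optimal_momentum` and `boundary_a` are hypothetical.

```python
import math


def optimal_momentum(rho, b, phi, k, eps):
    """Return (w, m) with m = w * rho * b, following (2-15) and (2-16)."""
    b_norm = math.sqrt(sum(x * x for x in b))
    threshold = k * phi + 1.0
    if b_norm <= threshold:
        # Below the threshold (including b = 0, with 1/0 = +inf): no transport.
        return 0.0, tuple(0.0 for _ in b)
    w = (1.0 - threshold / b_norm) / eps
    return w, tuple(w * rho * x for x in b)


def boundary_a(w, b, eps):
    """Value a = -eps * (w * |b|)^2 / 2 + eps / 2 from item 3."""
    b_norm_sq = sum(x * x for x in b)
    return 0.5 * eps * (1.0 - w * w * b_norm_sq)


w, m = optimal_momentum(rho=2.0, b=(3.0, 4.0), phi=1.0, k=1.0, eps=0.5)
a = boundary_a(w, (3.0, 4.0), eps=0.5)
print(w, m, a)
```

With these sample values, $|b^*| = 5$ exceeds the threshold $k\varphi^* + 1 = 2$, so the momentum is non-zero and the pair $(a, b^*)$ returned here saturates the inequality defining the constraint set.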

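Proof:

(I) [$\rho^*\ge 0$]: Let $f:\mathbb{R}\to\mathbb{R}$ be defined as follows:

$$
f : y\in\mathbb{R}\mapsto
\begin{cases}
\dfrac{y}{|y|}, & \text{if } y\neq 0,\\[4pt]
0, & \text{if } y = 0.
\end{cases}
$$

For all $y\in\mathbb{R}$, $y f(y) = |y|$, hence $\rho^* f(\rho^*) = |\rho^*|$ and $-1\le f(\rho^*)\le 1$. We define $q = (a,b)$ with $a = f(\rho^*) - 1 + a^*$ and $b = b^*$. Therefore, we have

$$
a + \frac{1}{2\varepsilon}\bigl(\max\{|b|, k\varphi^*+1\} - k\varphi^* - 1\bigr)^2 = f(\rho^*) - 1 + a^* + \frac{1}{2\varepsilon}\bigl(\max\{|b^*|, k\varphi^*+1\} - k\varphi^* - 1\bigr)^2 \le \frac{\varepsilon}{2},
$$

i.e. $q\in\mathcal{P}_{\varphi^*}$, and then

$$
0\ge\langle\mu^*, q - q^*\rangle_{L^2} = \int_Q \bigl(|\rho^*| - \rho^*\bigr)\,dx\,dt.
$$

As $|\rho^*| - \rho^*\ge 0$, we conclude that $|\rho^*| = \rho^*$ almost everywhere, that is to say $\rho^*\ge 0$.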

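(II) [$m^* = w^*\rho^* b^*$]: Firstly, let us remark that for all $\vec{u}\in L^\infty((0,1)\times\Omega)$ and for all $\delta > 0$, defining $b_\delta = b^* + \delta\vec{u}$, we have:

$$
\begin{aligned}
(|b_\delta| - k\varphi^* - 1)^2 &= |b_\delta|^2 - 2(k\varphi^*+1)|b_\delta| + (k\varphi^*+1)^2 \\
&= \bigl(|b^*|^2 + 2\delta\langle b^*,\vec{u}\rangle + \delta^2|\vec{u}|^2\bigr) - 2(k\varphi^*+1)|b_\delta| + (k\varphi^*+1)^2 \\
&= (|b^*| - k\varphi^* - 1)^2 + 2\delta\langle b^*,\vec{u}\rangle + \delta^2|\vec{u}|^2 - 2(k\varphi^*+1)\bigl(|b_\delta| - |b^*|\bigr).
\end{aligned}
\tag{2-17}
$$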
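The expansion (2-17) is purely algebraic, so it can be checked numerically at a single point. In the sketch below the last term is taken with the factor $c = k\varphi^* + 1$, which is what expanding the square yields; the helper names `norm` and `expansion_gap` and the sample values are assumptions made for illustration.

```python
import math


def norm(v):
    """Euclidean norm of a sequence of floats."""
    return math.sqrt(sum(x * x for x in v))


def expansion_gap(b, u, delta, c):
    """Left side minus right side of (2-17) at one point, with c = k*phi + 1."""
    b_delta = [bi + delta * ui for bi, ui in zip(b, u)]
    dot = sum(bi * ui for bi, ui in zip(b, u))
    lhs = (norm(b_delta) - c) ** 2
    rhs = ((norm(b) - c) ** 2
           + 2.0 * delta * dot
           + delta ** 2 * norm(u) ** 2
           - 2.0 * c * (norm(b_delta) - norm(b)))
    return lhs - rhs


print(expansion_gap(b=(3.0, 4.0), u=(1.0, -2.0), delta=0.1, c=2.0))
```

The printed gap should vanish up to floating-point rounding for any choice of $b^*$, $\vec{u}$, $\delta$ and $c$.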

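Let $\vec{u}\in L^\infty((0,1)\times\Omega)$ and let $\lambda > 0$.

• We define

$$
A_\lambda = \{(t,x)\in(0,1)\times\Omega,\ |b^*| - k\varphi^* - 1\ge\lambda\}.
$$

Let $\vec{u}_0 = \mathbf{1}_{A_0}\vec{u}$ (we assume $\vec{u}_0\neq 0$). Let $b_\delta = b^* + \delta\vec{u}_0$ and

$$
a_\delta = -\frac{1}{2\varepsilon}\bigl(\max\{|b_\delta|, k\varphi^*+1\} - k\varphi^* - 1\bigr)^2 + \frac{\varepsilon}{2}.
$$

For all $\delta\le\dfrac{\lambda}{2\|\vec{u}_0\|_\infty}$ (so that $|b_\delta|\ge k\varphi^*+1$ on $A_\lambda$), we have

$$
\begin{aligned}
\int_{A_\lambda} (|b_\delta| - k\varphi^* - 1)^2\,dx\,dt ={}& \int_{A_\lambda} (|b^*| - k\varphi^* - 1)^2\,dx\,dt \\
&+ \delta\int_{A_\lambda}\left[2\langle b^*,\vec{u}_0\rangle - 2(k\varphi^*+1)\,\frac{2\langle b^*,\vec{u}_0\rangle + \delta|\vec{u}_0|^2}{|b_\delta| + |b^*|}\right] dx\,dt + \delta^2\int_{A_\lambda} |\vec{u}_0|^2\,dx\,dt
\end{aligned}
\tag{2-18}
$$

(we recall that $(|b_\delta| + |b^*|)(t,x)\ge |b^*|(t,x)\ge k\varphi^*(t,x)+1\ge 1$ on $A_\lambda$).

We have $q_\delta = (a_\delta, b_\delta)\in\mathcal{P}_{\varphi^*}$. Then

$$
\begin{aligned}
0\ge{}& \langle\mu^*, q_\delta - q^*\rangle_{L^2((0,1)\times\Omega)} \\
={}& \int_{A_\lambda} (a_\delta - a^*)\rho^*\,dx\,dt + \int_{A_\lambda}\langle b_\delta - b^*, m^*\rangle\,dx\,dt \\
={}& \int_{A_\lambda}\left[-a^*\rho^* - \frac{\rho^*}{2\varepsilon}(|b_\delta| - k\varphi^* - 1)^2 + \frac{\varepsilon}{2}\rho^* + \delta\langle m^*,\vec{u}_0\rangle\right] dx\,dt \\
={}& -\int_{A_\lambda}\left[a^* + \frac{1}{2\varepsilon}(|b^*| - k\varphi^* - 1)^2 - \frac{\varepsilon}{2}\right]\rho^*\,dx\,dt + \delta^2\int_{A_\lambda}\frac{\rho^*|\vec{u}_0|^2}{2\varepsilon}\left[\frac{2(k\varphi^*+1)}{|b_\delta| + |b^*|} - 1\right] dx\,dt \\
& - \delta\int_{A_\lambda}\left[\frac{\rho^*}{\varepsilon}\left(\langle b^*,\vec{u}_0\rangle - \frac{2(k\varphi^*+1)}{|b_\delta| + |b^*|}\langle b^*,\vec{u}_0\rangle\right) - \langle m^*,\vec{u}_0\rangle\right] dx\,dt \\
\ge{}& \delta^2\int_{A_\lambda}\frac{\rho^*|\vec{u}_0|^2}{2\varepsilon}\left[\frac{2(k\varphi^*+1)}{|b_\delta| + |b^*|} - 1\right] dx\,dt - \delta\int_{A_\lambda}\left\langle\frac{\rho^*}{\varepsilon}\left(1 - \frac{2(k\varphi^*+1)}{|b_\delta| + |b^*|}\right) b^* - m^*,\ \vec{u}_0\right\rangle dx\,dt
\end{aligned}
\tag{2-19}
$$

where the last inequality uses $\rho^*\ge 0$ and $q^*\in\mathcal{P}_{\varphi^*}$. Then, dividing by $\delta$,

$$
\delta\int_{A_\lambda}\frac{\rho^*|\vec{u}_0|^2}{2\varepsilon}\left[\frac{2(k\varphi^*+1)}{|b_\delta| + |b^*|} - 1\right] dx\,dt - \int_{A_\lambda}\left\langle\frac{\rho^*}{\varepsilon}\left(1 - \frac{2(k\varphi^*+1)}{|b^* + \delta\vec{u}_0| + |b^*|}\right) b^* - m^*,\ \vec{u}_0\right\rangle dx\,dt\le 0.
\tag{2-20}
$$

Therefore, when $\delta\to 0$, we have

$$
\int_{A_\lambda}\left\langle\frac{\rho^*}{\varepsilon}\left(1 - \frac{k\varphi^*+1}{|b^*|}\right) b^* - m^*,\ \vec{u}_0\right\rangle dx\,dt\ge 0.
\tag{2-21}
$$

When $\lambda\to 0$, we finally have

$$
\int_Q\left\langle\frac{\rho^*}{\varepsilon}\left(1 - \frac{k\varphi^*+1}{|b^*|}\right) b^* - m^*,\ \vec{u}_0\right\rangle dx\,dt\ge 0.
\tag{2-22}
$$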


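• We define

$$
B_\lambda = \{(t,x)\in(0,1)\times\Omega,\ |b^*| - k\varphi^* - 1\le -\lambda\}.
$$

Let $\vec{v}_0 = \mathbf{1}_{B_0}\vec{u}$ (we assume $\vec{v}_0\neq 0$). Let $b_\delta = b^* + \delta\vec{v}_0$ and

$$
a_\delta = -\frac{1}{2\varepsilon}\bigl(\max\{|b_\delta|, k\varphi^*+1\} - k\varphi^* - 1\bigr)^2 + \frac{\varepsilon}{2}.
$$

For all $\delta\le\dfrac{\lambda}{2\|\vec{v}_0\|_\infty}$, we have $|b_\delta|\le k\varphi^*+1$ on $B_\lambda$, i.e. $a_\delta = \varepsilon/2$ (and $a^*\le\varepsilon/2$) on $B_\lambda$, and then $q_\delta = (a_\delta, b_\delta)\in\mathcal{P}_{\varphi^*}$.

Thus, for all $\delta\le\dfrac{\lambda}{2\|\vec{v}_0\|_\infty}$, we have

$$
\begin{aligned}
0\ge\langle\mu^*, q_\delta - q^*\rangle_{L^2((0,1)\times\Omega)} &= \int_{B_\lambda} (a_\delta - a^*)\rho^*\,dx\,dt + \int_{B_\lambda}\langle b_\delta - b^*, m^*\rangle\,dx\,dt \\
&= \int_{B_\lambda}\left(\frac{\varepsilon}{2} - a^*\right)\rho^*\,dx\,dt + \delta\int_{B_\lambda}\langle\vec{v}_0, m^*\rangle\,dx\,dt \ge \delta\int_{B_\lambda}\langle\vec{v}_0, m^*\rangle\,dx\,dt.
\end{aligned}
\tag{2-23}
$$

Therefore, when $\lambda\to 0$, we have

$$
\int_Q\langle\vec{v}_0, m^*\rangle\,dx\,dt\le 0.
$$

As $\mathbf{1}_{B_0} + \mathbf{1}_{A_0} = 1$, and then $\vec{v}_0 + \vec{u}_0 = \vec{u}$, we finally have, for all $\vec{u}\in L^\infty((0,1)\times\Omega)$,

$$
\int_Q\langle w^*\rho^* b^* - m^*,\ \vec{u}\rangle\,dx\,dt\ge 0,
\tag{2-24}
$$

with

$$
w^* = \max\left\{0,\ \frac{1}{\varepsilon}\left(1 - \frac{k\varphi^*+1}{|b^*|}\right)\right\},
$$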

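and we then conclude that $m^* = w^*\rho^* b^*$.

(III) [$\rho^*\left(a^* + \varepsilon(w^*|b^*|)^2/2 - \varepsilon/2\right) = 0$]: We define the vector $q = (a,b)\in\mathcal{P}_{\varphi^*}$, with $b = b^*$ and $a = -\frac{1}{2\varepsilon}\bigl(\max\{|b^*|, k\varphi^*+1\} - k\varphi^* - 1\bigr)^2 + \frac{\varepsilon}{2}$, whence

$$
0\ge\langle\mu^*, q - q^*\rangle_{L^2} = -\int_Q\rho^*\left[a^* + \frac{1}{2\varepsilon}\bigl(\max\{|b^*|, k\varphi^*+1\} - k\varphi^* - 1\bigr)^2 - \frac{\varepsilon}{2}\right] dx\,dt.
$$

As $\rho^*\left[a^* + \frac{1}{2\varepsilon}\bigl(\max\{|b^*|, k\varphi^*+1\} - k\varphi^* - 1\bigr)^2 - \frac{\varepsilon}{2}\right]\le 0$, we then have

$$
\rho^*\left[a^* + \frac{\varepsilon\,(w^*|b^*|)^2}{2} - \frac{\varepsilon}{2}\right] = \rho^*\left[a^* + \frac{1}{2\varepsilon}\bigl(\max\{|b^*|, k\varphi^*+1\} - k\varphi^* - 1\bigr)^2 - \frac{\varepsilon}{2}\right] = 0
\tag{2-25}
$$

almost everywhere.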


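Proof of Proposition 2.1:

• Firstly, let us assume that the triplet $(\varphi^*, q^*, \mu^*)$ satisfies the four conditions of the statement.

Let $r\ge 0$. Let $\varphi\in H^1(Q)$ be such that $\varphi\ge 0$, and let $q\in\mathcal{P}_\varphi$, i.e. such that $F(\varphi,q) = 0$. Then, according to (2-13), we have

$$
\begin{aligned}
L_r(\varphi,q,\mu^*) &= F(\varphi,q) + G(\varphi) + \langle\mu^*,\nabla_{t,x}\varphi - q\rangle_{L^2(Q)} + \frac{r}{2}\,\|\nabla_{t,x}\varphi - q\|^2_{L^2(Q)} \\
&\ge k\int_Q |m^*|\,\varphi\,dx\,dt - \langle\mu^*, q\rangle_{L^2(Q)}.
\end{aligned}
\tag{2-26}
$$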

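We define the sets $A$ and $B$ by

$$
A = \{(t,x)\in(0,1)\times\Omega,\ |b(t,x)| - k\varphi(t,x) - 1 > 0\}
$$

and

$$
B = \{(t,x)\in(0,1)\times\Omega,\ |b(t,x)| - k\varphi(t,x) - 1\le 0\}.
$$

According to (2-15), we have $m^* = w^*\rho^* b^*$, and then

$$
\begin{aligned}
& k\int_Q |m^*|\,\varphi\,dx\,dt - \langle\mu^*, q\rangle_{L^2(Q)} \\
&\quad= -\int_Q |m^*|\,dx\,dt + \int_Q (k\varphi+1)|m^*|\,dx\,dt - \int_Q a\rho^*\,dx\,dt - \int_Q b\cdot m^*\,dx\,dt \\
&\quad\ge -\int_Q |m^*|\,dx\,dt - \int_Q (|b| - k\varphi - 1)|m^*|\,dx\,dt - \int_Q a\rho^*\,dx\,dt \\
&\quad\ge -\int_Q |m^*|\,dx\,dt + \int_B (k\varphi + 1 - |b|)|m^*|\,dx\,dt - \int_B a\rho^*\,dx\,dt \\
&\qquad - \int_A\left[a + \frac{1}{2\varepsilon}(|b| - k\varphi - 1)^2\right]\rho^*\,dx\,dt - \frac{\varepsilon}{2}\int_A (w^*|b^*|)^2\rho^*\,dx\,dt \\
&\quad\ge -\int_Q |m^*|\,dx\,dt - \frac{\varepsilon}{2}\int_B\rho^*\,dx\,dt - \frac{\varepsilon}{2}\int_A\rho^*\,dx\,dt - \frac{\varepsilon}{2}\int_A (w^*|b^*|)^2\rho^*\,dx\,dt \\
&\quad\ge -\int_Q |m^*|\,dx\,dt - \frac{\varepsilon}{2}\int_Q\bigl[1 + (w^*|b^*|)^2\bigr]\rho^*\,dx\,dt.
\end{aligned}
\tag{2-27}
$$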

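We have $b^*\cdot m^* = b^*\cdot(w^*\rho^* b^*) = w^*\rho^*|b^*|^2 = |b^*|\cdot|m^*|$, and moreover, by definition of $w^*$ (see (2-16)), we have $(|b^*| - k\varphi^* - 1)|m^*| = \varepsilon\,(w^*|b^*|)^2\rho^*$. Thus,

$$
\begin{aligned}
& k\int_Q |m^*|\,\varphi^*\,dx\,dt - \langle\mu^*, q^*\rangle_{L^2(Q)} \\
&\quad= -\int_Q |m^*|\,dx\,dt - \int_Q a^*\rho^*\,dx\,dt + \int_Q (k\varphi^*+1)|m^*|\,dx\,dt - \int_Q b^*\cdot m^*\,dx\,dt \\
&\quad= -\int_Q |m^*|\,dx\,dt - \int_Q a^*\rho^*\,dx\,dt - \int_Q (|b^*| - k\varphi^* - 1)|m^*|\,dx\,dt \\
&\quad= -\int_Q |m^*|\,dx\,dt - \int_Q\bigl[a^* + \varepsilon\,(w^*|b^*|)^2\bigr]\rho^*\,dx\,dt.
\end{aligned}
\tag{2-28}
$$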