HAL Id: hal-03116036
https://hal.archives-ouvertes.fr/hal-03116036
Submitted on 20 Jan 2021
HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
A damped Newton algorithm for generated Jacobian equations
Anatole Gallouët, Quentin Merigot, Boris Thibert
To cite this version:
Anatole Gallouët, Quentin Merigot, Boris Thibert. A damped Newton algorithm for generated Ja-
cobian equations. [Research Report] Université Grenoble Alpes; Université Paris Sud. 2021. �hal-
03116036�
JACOBIAN EQUATIONS
ANATOLE GALLOUËT, QUENTIN MÉRIGOT, AND BORIS THIBERT
Abstract. Generated Jacobian Equations have been introduced by Trudinger [Disc. cont. dyn. sys (2014), pp. 1663–1681] as a gener- alization of Monge-Ampère equations arising in optimal transport. In this paper, we introduce and study a damped Newton algorithm for solving these equations in the semi-discrete setting, meaning that one of the two measures involved in the problem is finitely supported and the other one is absolutely continuous. We also present a numerical application of this algorithm to the near-field parallel refractor problem arising in non-imaging problems.
1. Introduction
This paper is concerned with the numerical resolution of Generated Jaco- bian equations, introduced by N. Trudinger [19] as a generalization of Monge- Ampère equations arising in optimal transport. Generated Jacobian equa- tions were originally motivated by inverse problems arising in non-imaging optics in the near-field case [12, 7, 8] but they also apply to problems arising in economy [16, 5]. A survey on these equations and their applications was recently written by N. Guillen [6]. The input for a generated Jacobian equa- tions are two probability measures µ and ν over two spaces X and Y , and a generating function G : X × Y × R → R . Loosely speaking, a scalar function ψ on Y is an Alexandrov solution to the generated jacobian equation if the map T
ψdefined by
T
ψ(x) ∈ arg max
y∈Y
G(x, y, ψ(y))
transports µ onto ν , i.e. ν is the image of the measure µ under T
ψ, denoted T
ψ#µ = ν.
Note that one needs to impose some conditions on µ and G ensuring that the map T
ψis well-defined µ-almost everywhere. One can describe the meaning of this equation using an economic metaphor. We consider X as a set of customers, Y as a set of products and G(x, y, ψ(y)) corresponds to the utility of the product y for the customer x given a price ψ(y). The probability measure µ and ν describe the distribution of customers and products. The map T
ψcan be described as the “best response” of customers given a price menu ψ : Y → R : each customer x ∈ X tries to maximize its own utility G(x, y, ψ(y)) over all products y ∈ Y : the maximizer, if it exists and is unique, is denoted T
ψ(x). Then, ψ is a solution to the generated jacobian
Date: January 20, 2021.
1
equation if the best response map T
ψpushes the distribution of customers to the distribution of available products ν.
In this article, we are interested in algorithms for solving the semi-discrete case, where the source measure µ is absolutely continuous with respect to the Lebesgue measure on X ⊆ R
dand the target measure ν is finitely sup- ported. Such discretization can be traced back to Minkowski, but have been used more recently to solve Monge-Ampère equations [17], problems from non-imaging optics [4], more general optimal transport problems [10], but also generated Jacobian equations [1]. In all the cited papers, the methods are coordinate-wise algorithms with minimal increment and are similar to the algorithm introduced by Oliker-Prussner [17]. The number of iterations of these algorithms scales more than cubicly (N
3, where N is the size of the support of ν), making them limited to fairly small discretizations. More recently, Newton methods have been introduced to solve semi-discrete op- timal transport problems [11, 14]. In this paper, we show that newtonian techniques can also be applied to Generated Jacobian equations under mild conditions on the generating function G.
Semi-discrete optimal transport. The semi-discrete setting refers to the case where one is given an absolutely continuous probability measure µ (with respect to the Lebesgues measure) supported on a domain X of R
dand a dis- crete probability measure ν = P
y
ν
yδ
ysupported on a finite set Y . Given a cost function c : X ×Y → R , the optimal transport problem amounts to find- ing a function T : X → Y that minimizes the total cost R
X
c(x, T (x))dµ(x) under the condition µ(T
−1(y)) = ν
yfor any y ∈ Y . This problem can be recast, using Kantorovitch duality under some mild conditions on the cost c, into finding a dual potential ψ : Y → R that satisfies
∀y ∈ Y µ(Lag
y(ψ)) = ν
y(MA) where Lag
y(ψ) are the Laguerre cells defined by
Lag
y(ψ) = {x ∈ X | ∀z ∈ Y, c(x, y) + ψ(y) ≤ c(x, z) + ψ(z)} .
The application T
ψdefined for x ∈ X by T
ψ(x) = y if x ∈ Lag
y(ψ) is then an optimal transport map between µ and ν for the cost c, and satisfies in particular T
ψ#µ = ν. Equation (MA) can be regarded as a discrete version of the Monge-Ampère type equation arising in optimal transport. We refer for instance to [2, §2] for more details in the case c(x, y) = −hx|yi.
Generated Jacobian equation. The Generated Jacobian equation in the semi-discrete setting has a very similar form. The problem also amounts to finding a function ψ : Y → R that satisfies Equation (MA), but the Laguerre cells have a more general form and read
Lag
y(ψ) = {x ∈ X | ∀z ∈ Y, G(x, y, ψ(y)) ≥ G(x, z, ψ(z))}
where G is called a generating function. When G is linear in the last variable, i.e. when G(x, y, v) = −c(x, y) − v, one obviously recovers the Laguerre cells from optimal transport.
Note that the lack of linearity in the generating function G adds several theoretical and practical difficulties. To see this, consider the mass function
H : R
Y→ R
Y, ψ 7→ (µ(Lag
y(ψ)))
y∈Y.
In the optimal transport case, the function H is invariant under the addition of a constant (i.e. H(ψ + c) = H(ψ) for any c ∈ R ), which entails under mild assumptions that the kernel of DH(ψ) has rank one and coincides with the vector space of constant functions on Y [11]. Furthermore, as a consequence of Kantorovitch duality, the function H is the gradient of a functional, called Kantorovitch functional in [11]. This implies that the differential DH(ψ) is symmetric. In the case of generated Jacobian equations, these two properties do not hold anymore: the differential DH(ψ) is not necessarily symmetric and its kernel is not reduced to the set of constant functions in general.
In this article, we generalize the damped Newton algorithm proposed in [11] to solve generated Jacobian equations. Note that unlike [11] we do not require any Ma-Trudinger-Wang type condition to prove the convergence of our algorithm. In Section 2 we recall the notion of generating function and its properties, and introduce the generated Jacobian equation in the semi- discrete setting. Section 3 is entirely dedicated to the numerical resolution of the generated Jacobian equation. In Section 4, we apply our algorithm to numerically solve the Near Field Parallel Reflector problem. Note that F. Abedin and C. Gutierrez also consider this problem [1], but their algo- rithm requires a strong condition, called Visibility Condition, that implies the Twist condition (defined hereafter) of the generating function G. We show that under a much weaker assumption, this twist condition holds for a subset of dual potential ψ : Y → R on which we can apply our algorithm. It is very likely that our assumption could also be adapted to [1].
2. Semi-discrete generated Jacobian equation
In this section, we recall the notions introduced by N. Trudinger in order to define the generated Jacobian equation [19] in the semi-discrete setting.
Let Ω be an open bounded domain of R
d, let X be a compact subset of Ω and let Y be a finite set of R
d. Let µ be a measure on Ω, which is absolutely continuous with respect to the Lebesgue measure, with non-negative density ρ supported on X (i.e. spt(ρ) ⊂ X), and let ν = P
y∈Y
ν
yδ
ybe a measure on the finite set Y such that all ν
yare positive (ν
y> 0). These two measures must satisfy the mass balance condition µ(X) = ν(Y ) and it is not restrictive to view them as probability measures:
Z
X
ρ(x)dx = X
y∈Y
ν
y= 1
Notations. We denote by H
kthe k-dimensional Hausdorff measure in R
d. In particular H
dis the Lebesgue measure in R
d. The set of functions from Y to R is denoted by R
Y. We denote by h·|·i the Euclidean scalar product, by k · k the Euclidean norm, by B(x, r) the Euclidean ball of center x and radius r, by χ
A: R
d→ {0, 1} the indicator function of a set A. The image and kernel of a matrix M are respectively denoted by im(M ) and ker(M ).
We denote by span(u) the linear space spanned by a vector u, by ∇
xG the
gradient of a function G with respect to x and by ∂
vG its scalar derivative
with respect to v. Finally, for N ∈ N , we denote J 1, N K = {1, . . . , N}.
2.1. Generating function. We recall below the notion of generating func- tion and G-convexity in the semi-discrete setting [19, 1].
Definition 1 (Generating function). Let a, b ∈ R ∪ {−∞, +∞} with a < b and I =]a, b[. A function G : Ω × Y × I → R is called a generating function.
We assume that it satisfies the following properties:
• Regularity condition: (x, y, v) 7→ G(x, y, v) is continuously differen- tiable in x and v, and
∀α < β ∈ I, sup
(x,y,v)∈Ω×Y×[α,β]
|∇
xG(x, y, v)| < +∞ (Reg)
• Monotonicity condition:
∀(x, y, v) ∈ Ω × Y × I : ∂
vG(x, y, v) < 0 (Mono)
• Twist condition:
∀x ∈ Ω, (y, v) 7→ (G(x, y, v), ∇
xG(x, y, v)) is injective on Y × I (Twist)
• Uniform Convergence condition:
∀y ∈ Y, lim
v→a
inf
x∈Ω
G(x, y, v) = +∞ (UC)
Remark 2 (Range of G). Through the whole paper we can and will consider that I = R . Indeed suppose that G : Ω×Y ×I → R satisfies the assumptions of the above definition. Considering a strictly increasing C
1diffeomorphism ζ : R → I and setting G(x, y, v) = e G(x, y, ζ(v)), we get a generating function G ˜ : Ω × Y × R → R , which also satisfies the conditions above. Moreover, up to reparametrization, the generated Jacobian equations associated to G and G ˜ are equivalent.
Remark 3. Note that F. Abedin and C. Gutierrez [1] impose a slightly more restrictive inequality in condition (Reg): their supremum is taken over Ω × Y ×] − ∞, α] for any α instead of Ω × Y × [α, β]. We changed here this condition in order to handle the Near Field point reflector problem in the last section.
Definition 4 (G-convexity). Let ϕ : Ω → R be a function. If ϕ ≥ G(·, y
0, λ
0) for all x ∈ Ω with equality at x = x
0, we say that the function G(·, y
0, λ
0) supports ϕ at x
0. A function ϕ : Ω → R is said to be G-convex if it is supported at every point, i.e. for all x
0∈ Ω,
∃(y
0, λ
0) ∈ Y × R s.t.
( ∀x ∈ Ω, ϕ(x) ≥ G(x, y
0, λ
0)
ϕ(x
0) = G(x
0, y
0, λ
0) (2.1)
Remark 5 (Relation with convexity). The notion of G-convexity generalizes
in a certain sense the notion of convexity. Intuitively, it amounts to replacing
the supporting hyperplanes by functions of the form G(·, y, λ). If G(·, y, λ)
is convex for any y ∈ Y and any λ ∈ R , then a G-convex function is always
convex. Moreover, if the generating function G is affine (i.e G(x, y, λ) =
hx, yi + λ) and if Y = R
d, then the notions of G-convexity and convexity are
equivalent.
Definition 6 (G-subdifferential). Let ϕ be a G-convex function and let x
0∈ Ω. The G-subdifferential ∂
Gϕ of ϕ at x
0is defined by
∂
Gϕ(x
0) = {y ∈ Y | ∃λ
0∈ R s.t. G(·, y, λ
0) supports ϕ at x
0} (2.2) The following lemma (Lemma 2.1 in [1]) shows that the ∂
Gϕ is single- valued almost everywhere, and induces a measurable
Lemma 7. [1, Lemma 2.1] Let ϕ be G-convex with G satisfying (Reg), (Mono) and (Twist). Then, there exists a measurable map S
ϕ: Ω → Y s.t.
for a.e. x ∈ Ω, ∂
Gϕ(x) = {S
ϕ(x)}.
We can define the notion of generated Jacobian equation.
Definition 8 (Brenier solution to the GJE). A function ϕ : X → R is a Brenier solution to the generated Jacobian equation between a probability density µ on Ω and a probability measure ν = P
y∈Y
ν
yδ
yon Y if it satisfies ( ϕ is G-convex
∀y ∈ Y, µ(S
ϕ−1({y})) = ν
y(GenJac) 2.2. G-transform. The goal in this section is to write a dual formulation of the generated Jacobian equation, using the notion of G-transform introduced by Trudinger [19].
Definition 9. The G-transform ψ
G: Ω → R of ψ : Y → R is defined by
∀x ∈ Ω, ψ
G(x) = max
y∈Y
G(x, y, ψ(y)). (2.3) Proposition 10. Assume G satisfies (Reg), (Mono) and (Twist) and let ϕ : Ω → R be a G-convex function. Then there exists ψ : S
ϕ(Ω) → R s.t.
∀x ∈ Ω, ϕ(x) = max
y∈Sϕ(Ω)
G(x, y, ψ(y))
Proof. Let y ∈ S
ϕ(Ω), then for any x
0∈ S
ϕ−1(y) there exists λ
0∈ R such that ϕ(x
0) = G(x
0, y, λ
0). Since ϕ is G-convex we also have for any x ∈ Ω that ϕ(x) ≥ G(x, y, λ
0). Specifically for x
1∈ S
ϕ−1(y), we get ϕ(x
1) = G(x
1, y, λ
1) ≥ G(x
1, y, λ
0) and since ∂
vG(x, y, v) < 0 then λ
1≤ λ
0. By symmetry we have λ
1= λ
0. We can deduce that there exists a unique ψ(y) ∈ R such that for any x ∈ S
ϕ−1(y), ϕ(x) = G(x, y, ψ(y)). This defines a map ψ : S
ϕ(Ω) → R satisfying
∀x ∈ Ω,
( ∀y ∈ S
ϕ(Ω), ϕ(x) ≥ G(x, y, ψ(y))
∃y ∈ S
ϕ(Ω), ϕ(x) = G(x, y, ψ(y)) As a conclusion we have ϕ(x) = max
y∈Sϕ(Ω)
G(x, y, ψ(y)).
Corollary 11. Let ϕ be a G-convex function such that S
ϕ(Ω) = Y , then there exists ψ : Y → R such that ϕ = ψ
G.
Remark 12 (G-convex functions are not always G-transforms). Without any
additional assumptions on the generating function, we cannot guarantee that
any G-convex function ϕ on X is the G-transform of a function ψ on Y . Define for instance
Ω = (1, 2), Y = {0, 1}, G(x, y, v) =
( xe
−vif y = 0
−xv if y = 1 .
and consider the function ϕ on Ω defined by ϕ(x) = G(x, 1, 1) = −x, which is G-convex by definition. Yet for any v ∈ R and any x ∈ Ω,
max(G(x, 0, v), G(x, 1, 1)) = max(xe
−v, −x) = xe
−v,
thus implying that that there does not exist any ψ : Y → R such that ϕ is the G-transform of ψ.
Suppose that ϕ is a solution of (GenJac) and that for all y ∈ Y , the mass ν
yis positive. Then, for any y ∈ Y one has µ(S
ϕ−1(y)) = ν
y> 0, which guarantees that S
ϕ(Ω) = Y . Therefore by Corollary 11 there exists a function ψ on Y such that ϕ = ψ
G. This means that we can reparametrize the problem (GenJac) by assuming that the solution ϕ is the G-transform of some function ψ. The sets S
ψ−1G({y}), which appear in (GenJac) will be called generalized Laguerre cells.
Definition 13 (Generalized Laguerre cells). The generalized Laguerre cells associated to a function ψ : Y → R are defined for every y ∈ Y by
Lag
y(ψ) := S
ψ−1G({y})
= {x ∈ Ω | ∀z ∈ Y, G(x, y, ψ(y)) ≥ G(x, z, ψ(z))} . (2.4) Note that by Lemma 7, the intersection of two generalized Laguerre cells has zero Lebesgue measure, ensuring that the sets Lag
y(ψ) form a partition of Ω up to a µ-negligible set.
Definition 14 (Alexandrov solution to GJE). A function ψ : Y → R is an Alexandrov solution to the generated Jacobian equation between generated Jacobian equation between a probability density µ on Ω and a probability measure ν = P
y∈Y
ν
yδ
yon Y if ψ
Gis a Brenier solution (Definition 8) to the same GJE, or equalently if
∀y ∈ Y, H
y(ψ) = ν
y, where H
y(ψ) = µ(Lag
y(ψ)).
Setting H(ψ) = (H
y(ψ))
y∈Yand considering ν as a function over Y , we can even rewrite this equation as
H(ψ) = ν. (GenJacD)
3. Resolution of the generated Jacobian equation
The goal of this section is to introduce and study a Newton algorithm to solve the semi discrete generated Jacobian equation (GenJacD). Before doing so, we study the regularity of the mass function H : R
Y→ R
Yin Section 3.1 and establish a non-degeneracy property of its differential DH in Section 3.2, under a connectedness assumption on the support of the source measure. We present the algorithm and prove its convergence in Section 3.3.
For simplicity, we will number the points in Y , i.e. we assume that
Y = {y
1, . . . , y
N},
where the points y
iare distinct. This allows us to identify the set of functions R
Ywith R
N, by setting ψ
i= ψ(y
i). We also denote (e
i)
1≤i≤Nthe canonical basis of R
N. Finally, we introduce a shortened notation for Laguerre cells and intersections thereof
Lag
i(ψ) = Lag
yi(ψ), Lag
ij(ψ) = Lag
i(ψ) ∩ Lag
j(ψ).
Throughout this section, we assume that the generating function G satisfies all the conditions of Definition 1.
3.1. C
1-regularity of H. The differentiability of H is established under a (mild) genericity hypothesis on the cost function, ensuring in particular that the intersection between three distinct Laguerre cells is negligible with respect to the (d − 1)-dimensional Hausdorff measure, denoted H
d−1. To write this hypothesis, we denote for three distinct indices i, j, k in J 1, N K ,
Γ
ij(ψ) = {x ∈ Ω | G(x, y
i, ψ
i) = G(x, y
j, ψ
j)}, Γ
ijk(ψ) = Γ
ij(ψ) ∩ Γ
ik(ψ).
Definition 15 (Genericity of the generating function.). The generating func- tion G is generic with respect to Ω and Y if for any distinct indices i, j, k in J 1, N K and any ψ ∈ R
Nwe have
H
d−1(Γ
ijk(ψ)) = 0. (Gen
YΩ) The generating function G is generic with respect to the boundary ∂X and Y if for any distinct indices i, j in J 1, N K and any ψ ∈ R
Nwe have
H
d−1(Γ
ij(ψ) ∩ ∂X ) = 0. (Gen
Y∂X) Proposition 16. Assume that
• G ∈ C
2(Ω×Y × R ) satisfies (Reg), (Mono), (Twist), (Gen
YΩ), (Gen
Y∂X),
• X ⊆ Ω is compact and that ρ is a continuous probability density on X.
Then the mass function H : R
N→ R
Ndefined by H(ψ) = (µ(Lag
i(ψ)))
1≤i≤Nhas class C
1. We have for ψ ∈ R
Nand i ∈ J 1, N K
∂H
j∂ψ
i(ψ) = Z
Lagij(ψ)
ρ(x) |∂
vG(x, y
i, ψ
i)|
k∇
xG(x, y
j, ψ
j) − ∇
xG(x, y
i, ψ
i)k dH
d−1(x) ≥ 0 for j 6= i
∂H
i∂ψ
i(ψ) = − X
j6=i
∂H
j∂ψ
i(ψ)
(3.5) Proof. Let ψ ∈ R
Nand i, j ∈ J 1, N K be fixed indices such that i 6= j. We want to compute ∂H
j/∂ψ
i(ψ). For this purpose, we introduce ψ
t= ψ + te
ifor t ≥ 0. From (Mono), we obviously have Lag
j(ψ) ⊆ Lag
j(ψ
t). Therefore H
j(ψ
t) − H
j(ψ) = µ(Lag
j(ψ
t)) − µ(Lag
j(ψ)) = µ(Lag
j(ψ
t) \ Lag
j(ψ)) We introduce the set L obtained by removing one inequality in the definition of the generalized Laguerre cell Lag
j(ψ):
L = {x ∈ Ω | ∀k 6= i, G(x, y
j, ψ
j) ≥ G(x, y
k, ψ
k)}.
We have in particular Lag
j(ψ) ⊆ L and more precisely Lag
j(ψ
t) \ Lag
j(ψ) = G
0<s≤t
L ∩ Γ
ij(ψ
t).
We will use this formula to get another expression of H
j(ψ
t) − H
j(ψ).
Step 1. Construction of u
ijsuch that Γ
ij(ψ
t) = u
−1ij({t}). To construct such a function u
ij: Ω → R , we first consider the function f
ij: Ω × R → R defined by
f
ij(x, t) = G(x, y
j, ψ
j) − G(x, y
i, ψ
i+ t)
This function f
ijis of class C
1on Ω × R by hypothesis on G and we have
∀(x, t) ∈ Ω × R, ∂f
ij∂t (x, t) = −∂
vG(x, y
i, ψ
i+ t) > 0.
This implies that a fixed x ∈ Ω, the function f
ij(x, ·) is strictly increasing, so that equation f
ij(x, t) = 0 has at most one solution. Denoting
V
ij= {x ∈ Ω | ∃t ∈ R , f
ij(x, t) = 0} = [
t∈R
Γ
ij(ψ
t), one can therefore define a function u
ij: V
ij→ R which satisfies
∀x ∈ V
ij, f
ij(x, t) = 0 ⇐⇒ u
ij(x) = t.
By the implicit function theorem, the set V
ijis open and the function u
ijis C
1on V
ij. In order to apply the co-area formula, we need to compute the gradient of u
ij. For any point x in V
ij, we have by definition
f
ij(x, u
ij(x)) = G(x, y
j, ψ
j) − G(x, y
i, ψ
i+ u
ij(x)) = 0.
Differentiating this expression with respect to x, we obtain
∇u
ij(x) = ∇
xG(x, y
j, ψ
j) − ∇
xG(x, y
i, ψ
i+ u
ij(x))
∂
vG(x, y
i, ψ
i+ u
ij(x))
which is well defined since ∂
vG(x, y
i, v) < 0 on Ω × Y × R by the (Mono) hypothesis. The (Twist) condition guarantees that for all x ∈ V
ij, the map (y, v) 7→ (G(x, y, v), ∇
xG(x, y, v)) is injective. By definition of u
ijwe have f
ij(x, u
ij(x)), so that
G(x, y
j, ψ
j) = G(x, y
i, ψ
i+ u
ij(x)).
The (Twist) condition then entails
∇
xG(x, y
j, ψ
j) 6= ∇
xG(x, y
i, ψ
i+ u
ij(x)), implying that the gradient ∇u
ij(x) does not vanish.
Step 2. Computation of the partial derivatives. We can write the difference between Laguerre cells using the function u
ij:
Lag
j(ψ
t) \ Lag
j(ψ) = [
0<s≤t
Lag
ij(ψ
s)
= {x ∈ Ω, ∃s ∈]0, t], f
ij(x, s) = 0} ∩ L
= {x ∈ Ω, ∃s ∈]0, t], u
ij(x) = s} ∩ L
= u
−1ij(]0, t]) ∩ L, giving directly
H
j(ψ
t) − H
j(ψ) = µ(L ∩ u
−1ij(]0, t])) = Z
L∩u−1ij (]0,t])
ρ(x)dx.
Then the co-area formula gives us H
j(ψ
t) − H
j(ψ)
t = 1
t Z
L∩u−1ij (]0,t])
ρ(x)dx = 1 t
Z
t 0H
ij(ψ
s)ds, where we introduced
H
ij(ψ) = Z
Lagij(ψ)
ρ(x)
k∇u
ij(x)k dH
d−1(x). (3.6) Note that thanks to the computations above, we already know that the gradient ∇u
ij(x) does not vanish. Moreover, for any x in Lag
ij(ψ) ⊆ Γ
ij(ψ), one has u
ij(x) = 0. Thus,
∇u
ij(x) = (∇
xG(x, y
j, ψ
j) − ∇
xG(x, y
i, ψ
i))/(∂
vG(x, y
i, ψ
i)).
We can therefore rewrite H
ij(ψ) =
Z
Lagij(ψ)
ρ(x)|∂
vG(x, y
i, ψ
i)|
|∇
xG(x, y
j, ψ
j) − ∇
xG(x, y
i, ψ
i)| dH
d−1(x). (3.7) As shown in Proposition 17 below, H
ijis continuous on R
N. We deduce that
∂H
j∂ψ
i(ψ) = lim
t→0,t>0
H
j(ψ
t) − H
j(ψ)
t = H
ij(ψ) ≥ 0. (3.8) The case t < 0 can be treated similarly by replacing Lag
j(ψ) ⊆ Lag
j(ψ
t) with Lag
j(ψ
t) ⊆ Lag
j(ψ). We thus get the desired expression for the partial derivative ∂H
j/∂ψ
ifor i 6= j.
To compute the partial derivative for j = i, we use the mass conservation property P
1≤i≤N
H
i(ψ) = 1 to deduce that
∂H
i∂ψ
i(ψ) = − X
j6=i
∂H
j∂ψ
i(ψ).
It remains to show that the functions H
ijused in the proof of Proposition 16 are continuous.
Proposition 17. Under the assumptions of Proposition 16, for every i, j ∈ J 1, N K , the function H
ijdefined in (3.6) is continuous on R
N.
Proof. We introduce the function g : Ω × R
N→ R defined by g(x, ψ) = ¯ ρ(x) |∂
vG(x, y
i, ψ
i)|
k∇
xG(x, y
j, ψ
j) − ∇
xG(x, y
i, ψ
i)k
where ρ ¯ is a continuous extension of the probability density ρ
|Xon Ω. For a given ψ ∈ R
N, the (Twist) hypothesis guarantees that for any x ∈ Γ
ij(ψ),
∇
xG(x, y
j, ψ
j) 6= ∇
xG(x, y
i, ψ
i). This implies that g is continuous on a neighborhood of the set {(x, ψ) ∈ Ω × R
N|x ∈ Γ
ij(ψ)}. We introduced in Proposition 16 the function
H
ij(ψ) = Z
Lagij(ψ)∩X
g(x, ψ)dH
d−1(x).
Let ψ
∞∈ R
Nand ψ
na sequence converging towards ψ
∞.The main diffi-
culty for proving that H
ij(ψ
n) converges to H
ij(ψ
∞) as n → +∞ is that
the integrals in the definition of H
ij(ψ
n) and H
ij(ψ
∞) are over different hy- persurfaces, namely Γ
ij(ψ
n) and Γ
ij(ψ
∞). Our first step will therefore be to construct a diffeomorphism between (subsets) of these hypersurfaces. We introduce f : R × R × Ω → R the function defined by
f(a, b, x) = G(x, y
j, ψ
j∞+ a) − G(x, y
i, ψ
∞i+ b)
We put a
n= ψ
jn− ψ
∞jand b
n= ψ
in− ψ
∞i, so that a
n→ 0 and b
n→ 0 as n tends to +∞. We also have
Γ
ij(ψ
∞) = (f(0, 0, ·))
−1(0), Γ
ij(ψ
n) = (f (a
n, b
n, ·))
−1(0).
Step 1: Construction of a map F
nbetween Γ
ij(ψ
∞) and Γ
ij(ψ
n).
This map is constructed using the composition of the flows associated to two vector fields X
aand X
b. Let Ω e ⊂ Ω an open domain containing X.
The (Twist) hypothesis guarantees that there exists a neighborhood V e of the set {(a, b, x) ∈ R
2× Ω| e f(a, b, x) = 0 } such that we have for any v ∈ V e ,
∇
xf (v) 6= 0. We can then define two vector fields X
a, X
bon V e by X
a(a, b, x) =
1, 0, −∂
af(a, b, x) ∇
xf(a, b, x) k∇
xf (a, b, x)k
2X
b(a, b, x) =
0, 1, −∂
bf (a, b, x) ∇
xf (a, b, x) k∇
xf (a, b, x)k
2Since f is of class C
2, X
aand X
bare both of class C
1on V e . We then consider Φ
aand Φ
bthe flows associated respectively to X
aand X
bdefined for (t, v) ∈ [−ε, ε]
2× V e by
( Φ
a(0, v) = v
∂
tΦ
a(t, v) = X
a(Φ(t, v)) and
(
Φ
b(0, v) = v
∂
tΦ
b(t, v) = X
b(Φ(t, v))
The vector fields X
aand X
bare continuously differentiable on V e which implies that both Φ
a(t, ·) and Φ
b(t, ·) converge pointwise in the C
1sense toward the identity as t → 0. Let (t, v) ∈ [−ε, ε] × V e . Denoting ∇f(v) = (∂
af, ∂
bf, ∇
xf)(v), we then have
f (Φ
a(t, v)) = f (Φ
a(0, v)) + Z
t0
∂
∂s s 7→ f (Φ
a(s, v)) ds
= f (v) + Z
t0
h∇f(Φ
a(s, v))|∂
tΦ
a(s, v)ids
= f (v) + Z
t0
h∇f(Φ
a(s, v))|X
a(Φ
a(s, v))ids
= f (v)
Similarly one has f(Φ
b(t, v)) = f (v). Let Π : V e → Ω the projection of V e ⊆ R
2× Ω on Ω, and let F
n: Γ
ij(ψ
∞) ∩ Ω e → Ω be the function defined by
F
n(x) = Π Φ
a(a
n, Φ
b(b
n, (0, 0, x)))
.
For x ∈ Γ
ij(ψ
∞) and v = (0, 0, x) ∈ V e , we have
Φ
a(a
n, Φ
b(b
n, v)) = (a
n, b
n, F
n(x)) and from the previous equality we deduce that
f(Φ
a(a
n, Φ
b(b
n, v))) = f (v) = 0
This means that for x ∈ Γ
ij(ψ
∞), F
n(x) ∈ Γ
ij(ψ
n). Moreover Φ
a(a
n, ·) and Φ
b(b
n, ·) are both invertible of inverse Φ
a(−a
n, ·) and Φ
b(−b
n, ·). Thus F
nis also invertible of inverse
F
n−1(x) = Π Φ
b(−b
n, Φ
a(−a
n, (a
n, b
n, x)))
Since both Φ
a(a
n, ·) and Φ
b(b
n, ·) converge pointwise in the C
1toward the identity as n → +∞, we have for x ∈ Γ
ij(ψ
∞) ∩ Ω e
n→+∞
lim F
n(x) = x,
n→+∞
lim J F
n(x) = 1,
where J F
nis the absolute value of the determinant of the Jacobian matrix of F
n.
Step 2: Convergence of H
ij(ψ
n) toward H
ij(ψ
∞).
We let L
∞= Lag
ij(ψ
∞) and L
n= F
n−1(Lag
ij(ψ
n) ∩ Ω). Denoting by e χ
Athe indicator function of a set A, we have
H
ij(ψ
∞) = Z
Γij(ψ∞)
g(x, ψ
∞)χ
X(x)χ
L∞(x)dH
d−1(x)
= Z
Γij(ψ∞)∩Ωe
g(x, ψ
∞)χ
X(x)χ
L∞(x)dH
d−1(x) because Γ
ij(ψ
∞) ∩ X ⊂ Ω. We also have e
H
ij(ψ
n) = Z
Γij(ψn)
g(x, ψ
n)χ
X(x)χ
Lagij(ψn)
(x)dH
d−1(x) By a change of variable from x to F
n(x), the latter equality becomes
H
ij(ψ
n) = Z
Γij(ψ∞)∩eΩ
g(F
n(x), ψ
n)J F
n(x)χ
X(F
n(x))χ
Ln(x)dH
d−1(x) where J F
n(x) denotes the determinant of the Jacobian matrix of F
n. We already have the pointwise convergences F
n(x) → x and J F
n(x) → 1 as n → ∞. If we can show that
n→+∞
lim χ
X(F
n(x))χ
Ln(x) = χ
X(x)χ
L∞(x)
for H
d−1almost every point x, then using Lebesgue’s dominated convergence theorem, we will obtain that H
ij(ψ
n) → H
ij(ψ
∞).
We first show that lim
n→+∞χ
Ln(x) → χ
L∞(x) H
d−1-almost everywhere
on Γ
ij(ψ
∞) ∩ Ω. We first consider the superior limit: given e x ∈ Γ
ij(ψ
∞) ∩ Ω, e
we prove that lim sup
n→∞χ
Ln(x) ≤ χ
L∞(x). The limsup is non-zero if and
only if there exists a subsequence (σ(n))
n∈Nsuch that ∀n ∈ N , x ∈ L
σ(n). In
this case we have F
σ(n)(x) ∈ F
σ(n)(L
σ(n)) = Lag
ij(ψ
σ(n)) ∩ Ω. This means e that for any k 6= i, j
G(F
σ(n)(x), y
i, ψ
σ(n)i) = G(F
σ(n)(x), y
j, ψ
jσ(n)) ≤ G(x, y
k, ψ
kσ(n)) Since G is continuous the previous inequality passes to the limit n → ∞, showing that x ∈ L
∞, and that
lim sup
n→∞
χ
Ln(x) ≤ χ
L∞(x)
We now want to show lim inf
n→∞χ
Ln(x) ≥ χ
L∞(x). If x / ∈ L
∞the result is straightforward. Let us consider the set
S
ij=
[
k6=i,j
Γ
ijk(ψ
∞)
∪ (Γ
ij(ψ
∞) ∩ ∂X) (3.9) By the genericity hypothesis (Definition 15) we have H
d−1(S
ij) = 0. If x ∈ L
∞\ S
ij, by definition we get for every k 6∈ {i, j} that x does not belong to Γ
jk(ψ
∞). This implies a strict inequality
G(x, y
i, ψ
i∞) = G(x, y
j, ψ
∞j) < G(x, y
k, ψ
∞k).
Since F
n(x) converges to x and since ψ
nconverges to ψ
∞, we get for n large enough
( G(F
n(x), y
i, ψ
ni) < G(F
n(x), y
k, ψ
kn) G(F
n(x), y
j, ψ
jn) < G(F
n(x), y
k, ψ
nk).
Moreover since x ∈ Γ
ij(ψ
∞), F
n(x) ∈ Γ
ij(ψ
n). Combining the inequalities above, this shows that F
n(x) belongs to Lag
ij(ψ
n) ∩ Ω = e F
n(L
n), i.e. x ∈ L
n. This gives us
∀x 6∈ S
ij, lim inf
n→∞
χ
Ln(x) ≥ χ
L∞(x).
Consider x 6∈ S
ij. For such x, we already know that χ
Ln(x) → χ
L∞(x) as n → +∞. Thus, if x does not belong to L
∞, we directly have
n→+∞
lim χ
Ln(x)χ
X(F
n(x)) = χ
L∞χ
X(x) = 0.
We may now assume that x belongs to L
∞\ S
ij. By definition of S
ij, this implies that x / ∈ ∂X. We can directly deduce that χ
Xis continuous at x and that χ
X(F
n(x)) → χ
X(x) when n → +∞.
In conclusion we have that H
ij(ψ
n) → H
ij(ψ
∞), so that H
ijis continuous.
3.2. Kernel and image of DH. The goal of this section is to prove Propo- sition 18 that gives properties on the differential of the mass function H. We consider the admissible set
S
+=
ψ ∈ R
N| ∀i ∈ J 1, N K , H
i(ψ) > 0 . (3.10) Proposition 18. In addition to the assumptions of Proposition 16, we as- sume that
int(X) ∩ {ρ > 0} is path-connected ,
where int(X) is the interior of X. Then we have for any ψ ∈ S
+• The differential DH(ψ) has rank N − 1;
• The image of DH is im(DH(ψ)) = 1
⊥where 1 = (1, · · · , 1) ∈ R
N;
• For any w ∈ ker(DH(ψ)) \ {0}, we have for all i ∈ J 1, N K , w
i6= 0 and all w
ihave the same sign.
The next two lemmas have already been included in the recent survey on optimal transport involving the second and third authors [15], but we include them here for completeness. The proof of Proposition 18 is different from the previous work in optimal transport because H is not symmetric.
Lemma 19. Let U ⊂ R
dbe a path-connected open set, and S ⊂ R
dbe a closed set such that H
d−1(S) = 0. Then, U \ S is path-connected.
Proof. It suffices to treat the case where U is an open ball, the general case will follow by standard connectedness arguments. Let x, y ∈ U \ S be distinct points. Since U \ S is open, there exists r > 0 such that B(x, r) and B(y, r) are included in U \ S. Consider the hyperplane H orthogonal to the segment [x, y], and Π
Hthe projection on H. Then, since Π
His 1-Lipschitz, H
d−1(Π
HS) ≤ H
d−1(S) = 0, so that H \ Π
HS is dense in the hyperplane H.
In particular, there exists a point z ∈ Π
H(B(x, r)) \ S = Π
H(B(y, r)) \ S.
By construction the line z + R (y − x) avoids S and passes through the balls B(x, r) ⊂ U \ S and B(y, r) ⊂ U \ S. This shows that the points x, y can be
connected in U \ S.
We define for ψ ∈ R
Nthe graph G
ψ= (V, E) with vertex set V = {1, . . . , N} with edges
E =
(i, j) ∈ V
2| ∂H
i∂ψ
j(ψ) > 0
We have the following result.
Lemma 20. Under the assumptions of Proposition 18 and for ψ ∈ S
+, the graph G
ψis connected.
Proof. Let Z = int(X) ∩ {ρ > 0}, and S = S
ij
S
ijwhere S
ijis defined in (3.9). From Lemma 19 the set Z \ S is path connected, we also have µ(Z \ S) = 1 since µ(∂X ) = µ(S) = 0. Suppose that G
ψis not connected.
Let i
0∈ J 1, N K , and let I
0be the connected component of i
0in the graph G
ψ. We thus have i
0∈ I
06= J 1, N K . We consider the two non-empty sets
U
1= [
i∈I0
Lag
i(ψ) ∩ (Z \ S) and U
2= [
i /∈I0
Lag
i(ψ) ∩ (Z \ S),
which partition Z \ S up to a Lebesgue-negligible set. Moreover, since ψ ∈ S
+,
U
1∪ U
2= Z \ S, 0 < µ(U
1) < 1, 0 < µ(U
2) < 1.
By construction U
1and U
2are closed sets in Z \ S. Since µ(U
i) > 0 we can
pick x and y in Z \ S such that x ∈ U
1and y ∈ U
2. The Z \ S being path-
connected, we know that there exists a path γ ∈ C
0([0, 1], Z \ S) satisfying
γ (0) = x and γ(1) = y. We let t = max{s ∈ [0, 1]|γ(s) ∈ U
1} and we are
going to show that γ(t) ∈ U
1∩ U
2. By construction, γ(t) obviously belongs
to U
1. Now if t = 1 we have γ (t) = y ∈ U
2. If not, we have for all > 0 that
γ (t + ) ∈ U
2. Since U
2is relatively closed in Z \ and since γ is continuous, we have γ (t) ∈ U
2. Naming z = γ (t), there exists i ∈ I
0, j / ∈ I
0such that z ∈ Lag
i(ψ) ∩ Lag
j(ψ). Moreover, since z / ∈ S we get that for any k / ∈ {i, j},
G(z, y
i, ψ
i) = G(z, y
j, ψ
j) > G(z, y
k, ψ
k).
By continuity of G we can deduce that there exists an open ball of radius r > 0 such that
∀x ∈ B(z, r), ∀k / ∈ {i, j}, G(x, y
i, ψ
i) > G(x, y
k, ψ
k) This implies that
B(z, r) ∩ Γ
ij(ψ) ⊂ Lag
ij(ψ)
where Γ
ij(ψ) is defined in Definition 15. By (Twist) condition and the inver- sion function theorem, we know that Γ
ij(ψ) is a d − 1 dimensional manifold and z ∈ Γ
ij(ψ). Moreover we have ρ(z) > 0 because z ∈ Z and ρ is continu- ous on Z ⊂ X by hypothesis. We now have
∂H
i∂ψ
j(ψ) = Z
Lagij(ψ)
ρ(x) |∂
vG(x, y
i, ψ
i)|
k∇
xG(x, y
j, ψ
j) − ∇
xG(x, y
i, ψ
i)k dH
d−1(x)
≥ Z
B(z,r)∩Γij(ψ)
ρ(x) |∂
vG(x, y
i, ψ
i)|
k∇
xG(x, y
j, ψ
j) − ∇
xG(x, y
i, ψ
i)k dH
d−1(x) > 0 which is a contradiction with the hypothesis that i and j are not connected
in the graph G
ψ.
Proof of Proposition 18. We note the matrix M = DH(ψ), with coefficients m
i,j= ∂H
i/∂ψ
j(ψ). We first show that ker(M
T) = span(1). The inclusion 1 ∈ ker(M
T) follows from
N
X
i=1
m
i,j= ∂
∂ψ
j NX
i=1
H
i(ψ)
!
= 0.
Consider now v ∈ ker(M
T), and pick an index i
0where v is maximum, i.e.
i
0∈ arg max
1≤i≤Nv
i. We have 0 = (M
Tv)
i0=
N
X
i=1
m
i,i0v
i= X
i6=i0
m
i,i0v
i+ m
i0,i0v
i0= X
i6=i0
m
i,i0(v
i− v
i0).
Since ψ ∈ S
+, we have by Proposition 16 that for i 6= i
0, m
i,i0≥ 0. By definition of i
0we also have v
i− v
i0≤ 0. From all this we deduce that v
i= v
i0for any i 6= i
0satisfying m
i,i0> 0, i.e. any vertex i adjacent to i
0in the graph G
ψ. By connectedness of G
ψ, we conclude that v = v
i01, thus showing ker(M
T) = span(1).
We can deduce from this result that M is of rank N − 1 because rk(M ) = rk(M
T) = N − 1. Moreover for any u ∈ R
N,
h1, M ui = (M u)
T1 = u
TM
T1 = 0.
Since the spaces im(M ) and 1
⊥have the same dimension, we immediately get im(M ) = 1
⊥.
Let w ∈ ker(M ) \ {0}, we now want to show that for all i ∈ J 1, N K , w
i6= 0 and that all of the w
ihave the same sign. The proof consists in two steps:
• Step 1: we show that w ≥ 0 (or −w ≥ 0).
• Step 2: we show that for i ∈ J 1, N K , w
i> 0.
We define λ = max
i|m
i,i| and A = λI +M . With these definitions, v belongs to ker(M ) if and only if Av = λv. Moreover, for any i, j ∈ J 1, N K , one has a
i,j≥ 0 and
N
X
k=1
a
k,j= λ.
Step 1: Assume that there exists i
0∈ J 1, N K such that w
i0≥ 0 (we can do this without loss on generality, by working on −w otherwise). Suppose that there exists j 6= i
0such that a
i0,j> 0 and w
j< 0, then since Aw = λw, we have λw
i0= P
Nj=1
a
i0,jw
jand thus λ|w
i0| < P
Nj=1
a
i,j|w
j|. We also have for any i ∈ J 1, N K , λ|w
i| ≤ P
Nj=1
a
i,j|w
j|. By summing this inequality on i and since the inequality is strict when i = i
0, we obtain
N
X
i=1
λ|w
i| <
N
X
i=1 N
X
j=1
a
i,j|w
j| =
N
X
j=1
|w
j|
N
X
i=1
a
i,j=
N
X
j=1
λ|w
j|,
which is a contradiction, so we can affirm that there exists no index j 6= i
0such that w
j< 0 and a
i0,j> 0. Since A = M + λI, for j 6= i
0, a
i0,j= m
i0,j. We thus have ∀j ∈ J 1, N K , m
i0,j> 0 = ⇒ w
j≥ 0. By connectedness of G we deduce w ≥ 0.
Step 2: If there exists i ∈ J 1, N K such that w
i= 0, then P
j