A new class of costs for optimal transport problems
Thierry Champion
Laboratoire IMATH, Université de Toulon
Joint work with G. Bouchitté and J.J. Alibert (IMATH)
Classical optimal transport problem
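- $X, Y$ convex, compact sets (in some $\mathbb{R}^d$)
- cost function $c : X \times Y \to \mathbb{R} \cup \{+\infty\}$, lower semicontinuous
- $\mu \in \mathcal{P}(X)$, $\nu \in \mathcal{P}(Y)$ Borel probabilities on $X$, $Y$

The classical Monge-Kantorovich problem associated to $c$:
\[
(MK) \qquad \inf \left\{ \int_{X \times Y} c(x,y)\, d\gamma(x,y) \;:\; \gamma \in \Pi(\mu,\nu) \right\}
\]
$\Pi(\mu,\nu)$: set of transport plans from $\mu$ to $\nu$,
\[
\gamma \in \Pi(\mu,\nu)
\iff \begin{cases} \forall A,\ \gamma(A \times Y) = \mu(A) \\ \forall B,\ \gamma(X \times B) = \nu(B) \end{cases}
\iff \forall \varphi, \psi,\ \int_{X \times Y} \big( \varphi(x) + \psi(y) \big)\, d\gamma = \int_X \varphi\, d\mu + \int_Y \psi\, d\nu
\]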
Classical optimal transport problem
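Discrete: if $\mu = \sum_i \mu_i \delta_{x_i}$ and $\nu = \sum_j \nu_j \delta_{y_j}$, then $\gamma = \sum_{i,j} \gamma_{i,j}\, \delta_{(x_i,y_j)}$ belongs to $\Pi(\mu,\nu)$ whenever
\[
\mu_i = \sum_j \gamma_{i,j} \quad \text{and} \quad \nu_j = \sum_i \gamma_{i,j}.
\]
Note: $\gamma_{i,j}$ = amount of mass moved from $x_i$ to $y_j$.

Product: $\gamma = \mu \times \nu$ belongs to $\Pi(\mu,\nu)$.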
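The discrete marginal conditions can be checked mechanically. The following is a minimal sketch in plain Python, assuming a plan is stored as a nested list `gamma[i][j]`; the helper names `is_transport_plan` and `product_plan` are illustrative and do not come from the talk.

```python
from fractions import Fraction as Fr

def is_transport_plan(gamma, mu, nu):
    """Check nonnegativity and the two marginal conditions for gamma[i][j]."""
    rows_ok = all(sum(gamma[i]) == mu[i] for i in range(len(mu)))
    cols_ok = all(sum(gamma[i][j] for i in range(len(mu))) == nu[j]
                  for j in range(len(nu)))
    nonneg = all(g >= 0 for row in gamma for g in row)
    return rows_ok and cols_ok and nonneg

def product_plan(mu, nu):
    """Product coupling: gamma[i][j] = mu[i] * nu[j]."""
    return [[m * n for n in nu] for m in mu]

mu = [Fr(1, 2), Fr(1, 2)]
nu = [Fr(1, 4), Fr(1, 2), Fr(1, 4)]
gamma = product_plan(mu, nu)
print(is_transport_plan(gamma, mu, nu))  # True
```

Exact fractions are used so that the equalities of marginals are tested without rounding issues.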
Classical optimal transport problem
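Transport maps: if $T_\#\mu = \nu$ then $(\mathrm{id} \times T)_\#\mu \in \Pi(\mu,\nu)$, where $T_\#\mu(A) := \mu(T^{-1}(A))$ for all $A$.

Discrete: if $\mu = \sum_i \mu_i \delta_{x_i}$ and $\nu = \sum_j \nu_j \delta_{y_j}$, then
\[
T_\#\mu = \nu \iff \forall j,\ \nu_j = \sum_{i \,:\, x_i \in T^{-1}(y_j)} \mu_i
\quad \text{and} \quad
(\mathrm{id} \times T)_\#\mu = \sum_i \mu_i\, \delta_{(x_i, T(x_i))}.
\]
(MK) is the relaxed version of the Monge problem
\[
(M) \qquad \inf \left\{ \int_X c(x, T(x))\, d\mu(x) \;:\; T_\#\mu = \nu \right\}
\]
References for (MK)-(M): Villani (2003, 2009), Santambrogio (2015)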
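For discrete measures, the push-forward and the induced plan are simple bookkeeping. The sketch below assumes the atoms `xs` are distinct and that the map `T` is an ordinary Python function; the names `push_forward`, `induced_plan` and `monge_cost` are hypothetical helpers chosen for this illustration.

```python
from collections import defaultdict
from fractions import Fraction as Fr

def push_forward(xs, mu, T):
    """Return T#mu as a dict {y: mass}, summing mu_i over the atoms with T(x_i) = y."""
    image = defaultdict(Fr)
    for x, m in zip(xs, mu):
        image[T(x)] += m
    return dict(image)

def induced_plan(xs, mu, T):
    """Plan (id x T)#mu as a dict {(x, T(x)): mass}."""
    return {(x, T(x)): m for x, m in zip(xs, mu)}

def monge_cost(xs, mu, T, c):
    """Integral of c(x, T(x)) against mu."""
    return sum(m * c(x, T(x)) for x, m in zip(xs, mu))

xs = [0, 1, 2, 3]
mu = [Fr(1, 4)] * 4
T = lambda x: x // 2          # sends {0, 1} to 0 and {2, 3} to 1
c = lambda x, y: abs(x - y)

print(push_forward(xs, mu, T))   # mass 1/2 at 0 and mass 1/2 at 1
print(monge_cost(xs, mu, T, c))  # 1
```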
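Disintegration of γ
Take $\gamma \in \Pi(\mu,\nu)$.
Write $\gamma = \gamma^x \otimes \mu$, the disintegration of $\gamma$ with respect to $\mu$: $\gamma^x \in \mathcal{P}(Y)$ for $\mu$-a.e. $x$, and
\[
\forall f \in C_b(X \times Y), \quad \langle \gamma, f \rangle = \int_X \left( \int_Y f(x,y)\, d\gamma^x(y) \right) d\mu(x).
\]
Discrete: if $\mu = \sum_i \mu_i \delta_{x_i}$, $\nu = \sum_j \nu_j \delta_{y_j}$ and $\gamma = \sum_{i,j} \gamma_{i,j}\, \delta_{(x_i,y_j)}$, then
\[
\gamma^{x_i} = \sum_j \frac{\gamma_{i,j}}{\mu_i}\, \delta_{y_j}.
\]
Transport map: if $\gamma = (\mathrm{id} \times T)_\#\mu$ then $\gamma^x = \delta_{T(x)}$ for a.e. $x$.
Product: if $\gamma = \mu \times \nu$ then $\gamma^x = \nu$ for a.e. $x$.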
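In the discrete case the disintegration is just a row normalization of the matrix $(\gamma_{i,j})$. A minimal sketch, assuming every $\mu_i$ is positive; `disintegrate` and `recombine` are illustrative names:

```python
from fractions import Fraction as Fr

def disintegrate(gamma, mu):
    """Conditional laws gamma^{x_i} = (gamma[i][j] / mu[i])_j, assuming mu[i] > 0."""
    return [[g / m for g in row] for row, m in zip(gamma, mu)]

def recombine(kernels, mu):
    """Second marginal: sum_i mu[i] * gamma^{x_i}."""
    n_targets = len(kernels[0])
    return [sum(m * k[j] for k, m in zip(kernels, mu)) for j in range(n_targets)]

mu = [Fr(1, 3), Fr(2, 3)]
gamma = [[Fr(1, 3), Fr(0)],          # all the mass at x_0 goes to y_0
         [Fr(1, 6), Fr(1, 2)]]       # the mass at x_1 is split
kernels = disintegrate(gamma, mu)
print(kernels[1])                    # [Fraction(1, 4), Fraction(3, 4)]
print(recombine(kernels, mu))        # [Fraction(1, 2), Fraction(1, 2)]
```

The first row is a Dirac mass (the transport-map situation), while the second row is a genuinely split conditional law.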
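Disintegration of γ
The classical Monge-Kantorovich problem now reads:
\[
(MK) \qquad \inf \left\{ \int_X \int_Y c(x,y)\, d\gamma^x(y)\, d\mu(x) \;:\; \int_X \gamma^x\, d\mu(x) = \nu \right\}
\]
About the constraint $\int_X \gamma^x\, d\mu(x) = \nu$:

Discrete: if $\mu = \sum_i \mu_i \delta_{x_i}$, $\nu = \sum_j \nu_j \delta_{y_j}$ and $\gamma \in \Pi(\mu,\nu)$, then $\gamma^{x_i} = \sum_j \frac{\gamma_{i,j}}{\mu_i}\, \delta_{y_j}$ with $\nu_j = \sum_i \gamma_{i,j}$, and
\[
\int_X \gamma^x\, d\mu(x) = \sum_i \mu_i \sum_j \frac{\gamma_{i,j}}{\mu_i}\, \delta_{y_j} = \sum_j \nu_j\, \delta_{y_j} = \nu.
\]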
Continuous: if $\nu = \nu(y)\, dy$ and $\gamma^x = \gamma^x(y)\, dy$ for a.e. $x$, then
\[
\int_X \gamma^x(y)\, d\mu(x) = \nu(y) \quad \text{for a.e. } y.
\]
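(MK) can be rewritten as
\[
(MK) \qquad \inf \left\{ \int_X G(x,\gamma^x)\, d\mu(x) \;:\; \int_X \gamma^x\, d\mu(x) = \nu \right\}
\]
with
\[
G : (x,p) \in X \times \mathcal{P}(Y) \mapsto G(x,p) = \int_Y c(x,y)\, dp(y).
\]
Note: $G$ is linear in $p$.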
New class of costs
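In this talk, we are interested in the following generalization of (MK):
\[
F(\mu,\nu) = \inf \left\{ \int_X G(x,\gamma^x)\, d\mu(x) \;:\; \int_X \gamma^x\, d\mu(x) = \nu \right\}
\]
with
\[
G : (x,p) \in X \times \mathcal{P}(Y) \mapsto G(x,p) = \int_Y c(x,y)\, dp(y) + H(x,p)
\]
where $H : X \times \mathcal{P}(Y) \to [0,+\infty]$ is an entropy / perturbation cost.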
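For discrete data, the generalized functional can be evaluated directly from the conditional laws. The sketch below is only an illustration under stated assumptions: the plan is a nested list `gamma[i][j]`, and the penalty is passed as a callable `H(x, ys, p)` (a signature chosen here for convenience, not taken from the talk).

```python
from fractions import Fraction as Fr

def generalized_cost(xs, ys, mu, gamma, c, H):
    """E(gamma) = sum_i mu_i * G(x_i, gamma^{x_i}), where
    G(x, p) = sum_j c(x, y_j) p_j + H(x, ys, p)."""
    total = Fr(0)
    for x, m, row in zip(xs, mu, gamma):
        if m == 0:
            continue                      # gamma^x is only defined mu-a.e.
        p = [g / m for g in row]          # conditional law gamma^x
        linear = sum(c(x, y) * pj for y, pj in zip(ys, p))
        total += m * (linear + H(x, ys, p))
    return total

xs, ys = [0, 1], [0, 1]
mu = [Fr(1, 2), Fr(1, 2)]
gamma = [[Fr(1, 4), Fr(1, 4)], [Fr(1, 4), Fr(1, 4)]]   # product plan
c = lambda x, y: abs(x - y)
no_penalty = lambda x, ys, p: 0

# With H = 0 the value is the usual transport cost sum_ij c(x_i, y_j) gamma_ij.
print(generalized_cost(xs, ys, mu, gamma, c, no_penalty))  # 1/2
```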
New class of costs
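- Cardinal cost: $H(x,p) = \#(\operatorname{supp}(p)) - 1$.
  Note:
  \[
  H(x,\gamma^x) = 0 \text{ a.e.} \iff \gamma^x = \delta_{T(x)} \text{ a.e.} \iff \gamma = (\mathrm{id} \times T)_\#\mu
  \]
  so that $F(\mu,\nu) = \inf(M) = \min(MK)$ when $\mu$ has no atoms.
  Hence the infimum defining $F(\mu,\nu)$ may fail to be attained, even though $H$ is l.s.c. on $\mathcal{P}(Y)$.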
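A small discrete illustration of the equivalence between "zero cardinal cost" and "plan induced by a map", assuming finitely supported laws stored as lists of weights (the helper names are hypothetical):

```python
from fractions import Fraction as Fr

def cardinal_cost(p):
    """H(p) = #supp(p) - 1 for a discrete probability vector p."""
    return sum(1 for pj in p if pj > 0) - 1

def is_induced_by_map(gamma, mu):
    """True when every conditional law gamma^{x_i} (with mu_i > 0) is a Dirac mass."""
    return all(cardinal_cost([g / m for g in row]) == 0
               for row, m in zip(gamma, mu) if m > 0)

mu = [Fr(1, 2), Fr(1, 2)]
map_plan = [[Fr(1, 2), Fr(0)], [Fr(0), Fr(1, 2)]]        # x_i is sent to y_i
split_plan = [[Fr(1, 4), Fr(1, 4)], [Fr(1, 4), Fr(1, 4)]]

print(is_induced_by_map(map_plan, mu))    # True
print(is_induced_by_map(split_plan, mu))  # False
```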
New class of costs
- “Variance” cost: $H(x,p) = \operatorname{var}(p) = \int_Y |y|^2\, dp(y) - |[p]|^2$, where $[p] = \int_Y y\, dp(y)$ denotes the barycenter of $p$.
  Note:
  \[
  \int_X H(x,\gamma^x)\, d\mu(x) = \int_Y |y|^2\, d\nu(y) - \int_X |[\gamma^x]|^2\, d\mu(x).
  \]
  $H$ is not convex in $p$, and $F(\mu,\nu)$ may have no solution.
- Variance cost: $H(x,p) = -\operatorname{var}(p)$ or $H(x,p) = |[p]|^2$. Then $H$ is l.s.c. and convex on $\mathcal{P}(Y)$.
  $H$ favours the spreading of $p$ (max. of variance).
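The identity relating the integrated variance to the second moment of $\nu$ can be checked on a small discrete plan. This is a sketch on the real line with made-up data; `barycenter` and `variance` are illustrative helpers.

```python
from fractions import Fraction as Fr

def barycenter(ys, p):
    """[p] = sum_j y_j p_j."""
    return sum(y * pj for y, pj in zip(ys, p))

def variance(ys, p):
    """var(p) = int |y|^2 dp - |[p]|^2 for a discrete law p on the real line."""
    return sum(y * y * pj for y, pj in zip(ys, p)) - barycenter(ys, p) ** 2

ys = [-1, 0, 1]
mu = [Fr(1, 2), Fr(1, 2)]
gamma = [[Fr(1, 4), Fr(1, 4), Fr(0)],     # row i: mass sent from x_i to each y_j
         [Fr(0), Fr(1, 4), Fr(1, 4)]]
kernels = [[g / m for g in row] for row, m in zip(gamma, mu)]
nu = [sum(row[j] for row in gamma) for j in range(len(ys))]

lhs = sum(m * variance(ys, k) for m, k in zip(mu, kernels))
rhs = (sum(y * y * n for y, n in zip(ys, nu))
       - sum(m * barycenter(ys, k) ** 2 for m, k in zip(mu, kernels)))
print(lhs, rhs)  # 1/4 1/4
```

Since the first term on the right depends only on $\nu$, minimizing the integrated variance amounts to maximizing $\int_X |[\gamma^x]|^2\, d\mu$.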
New class of costs
- Barycenter constraint:
  \[
  H(x,p) = \chi_{[p]=x} = \begin{cases} 0 & \text{if } [p] = x \\ +\infty & \text{otherwise} \end{cases}
  \]
For the cost $c(x,y) = -|y-x|$, $F(\mu,\nu)$ is related to model-independent pricing in mathematical finance [Hobson Neuberger 2012] and [Beiglböck Henry-Labordère Penkner 2013].
Existence of a particular solution $\gamma$: [Beiglböck Juillet –]
Note: $F(\mu,\nu) < +\infty \iff \mu \preceq \nu$ for the convex order.
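The barycenter constraint is easy to test on discrete plans. The sketch below uses two toy configurations on the real line (not taken from the talk): spreading a Dirac mass symmetrically is admissible, while concentrating two atoms onto their midpoint is not, in line with the convex order condition.

```python
import math
from fractions import Fraction as Fr

def barycenter_penalty(x, ys, p):
    """chi_{[p]=x}: 0 if the barycenter of p equals x, +inf otherwise."""
    bary = sum(y * pj for y, pj in zip(ys, p))
    return 0 if bary == x else math.inf

def total_penalty(xs, ys, mu, gamma):
    """Integral of chi_{[gamma^x]=x} against mu. The penalty takes values
    in {0, +inf}, so the positive weights mu_i do not change the result."""
    total = 0
    for x, m, row in zip(xs, mu, gamma):
        if m > 0:
            total += barycenter_penalty(x, ys, [g / m for g in row])
    return total

# mu = delta_0 spread to nu = (delta_{-1} + delta_1)/2: barycenter preserved.
spread = total_penalty([0], [-1, 1], [Fr(1)], [[Fr(1, 2), Fr(1, 2)]])
# mu = (delta_{-1} + delta_1)/2 concentrated onto nu = delta_0: not admissible.
concentrate = total_penalty([-1, 1], [0], [Fr(1, 2), Fr(1, 2)],
                            [[Fr(1, 2)], [Fr(1, 2)]])
print(spread, concentrate)  # 0 inf
```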
Existence result
Main hypotheses
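(H1) $c : X \times Y \to \mathbb{R} \cup \{+\infty\}$ is lower semicontinuous,
(H2) $H : X \times \mathcal{P}(Y) \to \mathbb{R} \cup \{+\infty\}$ satisfies
- $H$ is lower semicontinuous on $X \times \mathcal{P}(Y)$,
- for every $x \in X$, $p \mapsto H(x,p)$ is convex.
Theorem
Assume (H1) and (H2), and recall
\[
F(\mu,\nu) = \inf \left\{ \int_X G(x,\gamma^x)\, d\mu(x) \;:\; \int_X \gamma^x\, d\mu(x) = \nu \right\}.
\]
Then $F$ is lower semicontinuous on $\mathcal{M}_b^+(X) \times \mathcal{M}_b^+(Y)$.
Moreover, if $F(\mu,\nu) < +\infty$ then there is at least one minimizer.
Here $F$ is extended by $F(\mu,\nu) = +\infty$ whenever $\mu(X) \neq \nu(Y)$.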
Lower semicontinuity property
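Set $E(\gamma) = \int_X G(x,\gamma^x)\, d\mu$ whenever $\gamma \in \Pi(\mu,\nu)$.
Lemma (lower semicontinuity of $E$)
Assume (H1) and (H2). If $(\gamma_n)_n = (\gamma_n^x \otimes \mu_n)_n$ weakly converges in $\mathcal{M}_b(X \times Y)$ to $\gamma = \gamma^x \otimes \mu$, then
\[
\liminf_{n \to +\infty} \int_X G(x,\gamma_n^x)\, d\mu_n \ge \int_X G(x,\gamma^x)\, d\mu.
\]
Note: convexity of $p \mapsto H(x,p)$ is necessary. Counterexamples follow for the cardinal cost $H(x,p) = \#(\operatorname{supp}(p)) - 1$ when $\inf(M) = \min(MK)$ and (M) is not attained.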
Lower semicontinuity property
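Let $G^*(x,\cdot)$ denote the Fenchel conjugate of the convex function $G(x,\cdot)$:
\[
\forall \psi \in C(Y), \quad G^*(x,\psi) = \sup \left\{ \int_Y \psi\, dp - G(x,p) \;:\; p \in \mathcal{P}(Y) \right\}.
\]
Then one has:
- Upper semicontinuity: if $\psi \in C(Y)$ then $x \mapsto G^*(x,\psi)$ is upper semicontinuous.
- Bounds: denote $m_G = \inf G$; then $\inf_Y \psi - m_G \le G^*(x,\psi) \le \sup_Y \psi - m_G$.
- Lipschitz property: for every $x \in X$, $G^*(x,\cdot)$ satisfies
  \[
  |G^*(x,\psi_1) - G^*(x,\psi_2)| \le \sup_Y |\psi_1 - \psi_2|.
  \]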
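The Lipschitz property follows directly from the definition of the conjugate. One possible derivation, for $\psi_1, \psi_2 \in C(Y)$ and a fixed $x$ such that $G(x,\cdot)$ is not identically $+\infty$ (an assumption added here so that both conjugates are finite), uses only that each $p$ is a probability measure:

```latex
\begin{align*}
G^*(x,\psi_1)
  &= \sup_{p \in \mathcal{P}(Y)} \Big\{ \int_Y \psi_2\, dp + \int_Y (\psi_1 - \psi_2)\, dp - G(x,p) \Big\} \\
  &\le \sup_{p \in \mathcal{P}(Y)} \Big\{ \int_Y \psi_2\, dp - G(x,p) \Big\} + \sup_Y |\psi_1 - \psi_2| \\
  &= G^*(x,\psi_2) + \sup_Y |\psi_1 - \psi_2| .
\end{align*}
```

Exchanging the roles of $\psi_1$ and $\psi_2$ gives the reverse inequality.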
Lower semicontinuity property
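Let $(\psi_k)_k$ be a dense sequence in $C(Y)$.
Since $G(x,\cdot)$ is convex and l.s.c.:
\[
\forall p \in \mathcal{P}(Y), \quad G(x,p) = \sup_k \left\{ \int_Y \psi_k\, dp - G^*(x,\psi_k) \right\} = \sup_k G_k(x,p).
\]
Then, for disjoint open sets $(\Omega_k)_{0 \le k \le m}$,
\[
\int_X G(x,\gamma_n^x)\, d\mu_n(x)
\ge \sum_{k=0}^m \int_{\Omega_k} G_k(x,\gamma_n^x)\, d\mu_n(x)
= \sum_{k=0}^m \left[ \int_{\Omega_k \times Y} \psi_k(y)\, d\gamma_n(x,y) + \int_{\Omega_k} -G^*(x,\psi_k)\, d\mu_n(x) \right].
\]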
Lower semicontinuity property
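Then one gets
\[
\liminf_{n \to +\infty} \int_X G(x,\gamma_n^x)\, d\mu_n(x)
\ge \sum_{k=0}^m \left[ \int_{\Omega_k \times Y} \psi_k(y)\, d\gamma(x,y) + \int_{\Omega_k} -G^*(x,\psi_k)\, d\mu(x) \right]
= \sum_{k=0}^m \int_{\Omega_k} G_k(x,\gamma^x)\, d\mu(x).
\]
Taking the supremum over $m$ and over the open partitions yields:
\[
\liminf_{n \to +\infty} \int_X G(x,\gamma_n^x)\, d\mu_n(x) \ge \int_X G(x,\gamma^x)\, d\mu(x).
\]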
Dual problem and optimality conditions
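Recall
\[
F(\mu,\nu) = \inf \left\{ \int_X G(x,\gamma^x)\, d\mu(x) \;:\; \int_X \gamma^x\, d\mu(x) = \nu \right\}
\]
extended by 1-homogeneity on $\mathcal{M}_b^+(X) \times \mathcal{M}_b^+(Y)$.
Convexity and lower semicontinuity then give the following duality result. Assume (H1) and (H2); then
\[
F(\mu,\nu) = \sup \left\{ \int_Y \psi(y)\, d\nu - \int_X G^*(x,\psi)\, d\mu(x) \;:\; \psi \in C_0(Y) \right\}
\]
and equality holds in $[0,+\infty]$.
Moreover, the pair $(\gamma,\psi)$ is optimal whenever $\psi \in \partial G(x,\gamma^x)$ for $\mu$-a.e. $x$.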
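The inequality "inf $\ge$ sup" in this duality is elementary. A short sketch, assuming only the definition of $G^*(x,\cdot)$ as a Fenchel conjugate, a test function $\psi \in C_0(Y)$, and an admissible family $(\gamma^x)$ with $\int_X \gamma^x\, d\mu = \nu$:

```latex
\begin{align*}
\int_X G(x,\gamma^x)\, d\mu(x)
  &\ge \int_X \Big( \int_Y \psi\, d\gamma^x - G^*(x,\psi) \Big)\, d\mu(x)
     && \text{(Fenchel-Young inequality, for each } x \text{)} \\
  &= \int_Y \psi\, d\nu - \int_X G^*(x,\psi)\, d\mu(x)
     && \text{(marginal constraint)} .
\end{align*}
```

Equality in the first line for $\mu$-a.e. $x$ is exactly the condition $\psi \in \partial G(x,\gamma^x)$, which is why it characterizes optimal pairs. The reverse inequality is where convexity and lower semicontinuity are needed.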
Dual problem and optimality conditions
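- if $H = 0$, then $G(x,p) = \int_Y c(x,y)\, dp(y)$ and
  \[
  G^*(x,\psi) = \sup_{p \in \mathcal{P}(Y)} \int_Y \big( \psi(y) - c(x,y) \big)\, dp
  = \sup_{y \in Y} \big( \psi(y) - c(x,y) \big) = -\psi^c(x)
  \]
  so that one recovers the classical Kantorovich dual problem
  \[
  F(\mu,\nu) = \sup \left\{ \int_Y \psi(y)\, d\nu + \int_X \psi^c(x)\, d\mu(x) \;:\; \psi \in C_0(Y) \right\}.
  \]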
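On finite supports the $c$-transform is a minimum over atoms, and weak duality can be observed numerically. The sketch below uses toy data and the convention $\psi^c(x) = \min_y \{c(x,y) - \psi(y)\} = -G^*(x,\psi)$ read off from the formula above; all function names are illustrative.

```python
def c_transform(psi, xs, ys, c):
    """psi^c(x) = min_y { c(x, y) - psi(y) }, i.e. -G*(x, psi) when H = 0."""
    return [min(c(x, y) - p for y, p in zip(ys, psi)) for x in xs]

def dual_value(psi, xs, ys, mu, nu, c):
    """int psi dnu + int psi^c dmu for discrete mu and nu."""
    psi_c = c_transform(psi, xs, ys, c)
    return (sum(p * n for p, n in zip(psi, nu))
            + sum(q * m for q, m in zip(psi_c, mu)))

def primal_value(gamma, xs, ys, c):
    """sum_ij c(x_i, y_j) gamma_ij."""
    return sum(c(x, y) * g for x, row in zip(xs, gamma) for y, g in zip(ys, row))

xs, ys = [0.0, 1.0], [0.0, 1.0]
mu, nu = [0.5, 0.5], [0.5, 0.5]
c = lambda x, y: (x - y) ** 2

identity_plan = [[0.5, 0.0], [0.0, 0.5]]     # cost 0, hence optimal here
product_plan = [[0.25, 0.25], [0.25, 0.25]]  # cost 1/2

# Weak duality: every dual value is below the cost of every admissible plan.
for psi in ([0.0, 0.0], [0.3, -0.2], [1.0, 2.0]):
    d = dual_value(psi, xs, ys, mu, nu, c)
    assert d <= primal_value(identity_plan, xs, ys, c) + 1e-12
    assert d <= primal_value(product_plan, xs, ys, c) + 1e-12

print(dual_value([0.0, 0.0], xs, ys, mu, nu, c))  # 0.0
```

In this toy case $\psi = 0$ already reaches the primal value 0, so there is no duality gap.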
Dual problem and optimality conditions
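- if $c = 0$ and $H(x,p) = \chi_{[p]=x}$, then
  \[
  G^*(x,\psi) = \sup_{p \in \mathcal{P}(Y)} \left\{ \int_Y \psi(y)\, dp \;:\; [p] = x \right\}
  = -\inf_{p \in \mathcal{P}(Y)} \left\{ \int_Y -\psi(y)\, dp \;:\; [p] = x \right\}
  = -(-\psi)^{**}(x)
  \]
  so that (here $X = Y$):
  \[
  F(\mu,\nu) = \sup \left\{ -\int_X (-\psi)\, d\nu + \int_X (-\psi)^{**}\, d\mu(x) \;:\; \psi \in C_0(X) \right\}
  \]
  and then we recover [Strassen 1965]:
  \[
  F(\mu,\nu) < +\infty \iff \int \psi\, d\mu \le \int \psi\, d\nu \quad \forall \psi \text{ convex}.
  \]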
Dual problem and optimality conditions
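- if $H(x,p) = h(x,[p])$, then
  \[
  G^*(x,\psi) = -\inf_{z \in \mathbb{R}^d} \left\{ (c(x,\cdot) - \psi)^{**}(z) + h(x,z) \right\}
  \]
  and the optimality condition reads: for $\mu$-a.e. $x$,
  \[
  0 \in \partial h(x,[\gamma^x]) + \partial (c(x,\cdot) - \psi)^{**}([\gamma^x])
  \]
  and
  \[
  \int_\Omega \big( c(x,y) - \psi(y) \big)\, \gamma^x(dy) = (c(x,\cdot) - \psi)^{**}([\gamma^x]).
  \]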
Model case: $h(x,[p]) = \chi_{[p]=x}$; then $G^*(x,\psi) = -(c(x,\cdot) - \psi(\cdot))^{**}(x)$.
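Model case: $c(x,y) = \lambda |y-x|^2$, $h(x,[p]) = |[p]|^2$; then
\[
G^*(x,\psi) = -\big[ (\lambda |\cdot|^2 - \psi)^{**} \mathbin{\triangledown} |\cdot|^2 \big](\lambda x) - \lambda (1-\lambda) |x|^2,
\]
where $\triangledown$ denotes the infimal convolution.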
Existence of a solution ψ for dual problem
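Difficult task: in [Beiglböck Henry-Labordère Penkner 2013], counterexample for $c(x,y) = -|y-x|$ and $H(x,p) = \chi_{[p]=x}$, for some discrete $\mu$ on $[0,2]$ and $\nu = \tfrac12\, dx \llcorner [0,2]$.
Back to the classical dual case:
\[
\sup \left\{ \int_Y \psi(y)\, d\nu + \int_X \psi^c(x)\, d\mu(x) \;:\; \psi \in C_0(Y) \right\}
= \sup \left\{ \int_Y (\psi^c)^c(y)\, d\nu + \int_X \psi^c(x)\, d\mu(x) \;:\; \psi \in C_0(Y) \right\}
\]
with $\psi^c(x) = \inf_{y \in Y} \{ c(x,y) - \psi(y) \} = -G^*(x,\psi)$ and $((\psi^c)^c)^c = \psi^c$.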
Existence of a solution ψ for dual problem
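Classical dual case, $c$ subadditive ($c(x,z) \le c(x,y) + c(y,z)$): with $\psi^c = -G^*(\cdot,\psi)$ as in the case $H = 0$,
\[
\sup \left\{ \int_Y (\psi^c)^c(y)\, d\nu + \int_X \psi^c(x)\, d\mu(x) \;:\; \psi \in C_0(Y) \right\}
= \sup \left\{ -\int_Y \psi^c(y)\, d\nu + \int_X \psi^c(x)\, d\mu(x) \;:\; \psi \in C_0(Y) \right\}
\]
i.e. $(\psi^c)^c = -\psi^c$ → look for a solution of the form $-\psi^c = G^*(\cdot,\psi)$.
Framework: $X = Y$, $c$ subadditive ($c(x,z) \le c(x,y) + c(y,z)$).
Goal: find conditions for which $G^*(\cdot, G^*(\cdot,\psi)) = G^*(\cdot,\psi)$.
First: if $c(x,x) = 0$ and $H(x,\delta_x) = 0$ then $G(x,\delta_x) = 0$, so that $G^*(x,\psi) \ge \psi(x)$ for all $x$, $\psi$.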
Existence of a solution ψ for dual problem
Proposition
If $c$ is subadditive, and
- $H(x,p) = h([p])$ with $h$ l.s.c. convex, $h(0) = 0$, $h \ge 0$, or
- $H(x,p) = h([p] - x)$ with $h$ as above and subadditive,
then $G^*(\cdot, G^*(\cdot,\psi)) = G^*(\cdot,\psi)$ for any $\psi \in C(X)$.
Applies in particular to $H(p) = |[p]|^2$ and $H(x,p) = \chi_{[p]=x}$.
Example : barycenter constraint
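Take $c(x,y) = |y-x|$ and $H(x,p) = \chi_{[p]=x}$, and set $\mu = \tfrac12\, dx \llcorner [-1,1]$ and $\nu = \tfrac14 \delta_{-1} + \tfrac12 \delta_0 + \tfrac14 \delta_1$. Then
\[
\gamma^x = \frac{|x| - x}{2}\, \delta_{-1} + (1 - |x|)\, \delta_0 + \frac{|x| + x}{2}\, \delta_1
\]
and $F(\mu,\nu) = \tfrac13$ while $\inf(M) = \tfrac14$.
Set $\psi(0) = 0$; then
\[
\int_\Omega \big( c(x,y) - \psi(y) \big)\, \gamma^x(dy) = (c(x,\cdot) - \psi)^{**}([\gamma^x]) = -G^*(x,\psi)
\]
implies
\[
\psi(x) = -(|x - \cdot| - \psi)^{**}(x) =
\begin{cases}
2x(1+x) - \psi(-1)\, x & \text{if } x \le 0, \\
2x(x-1) + \psi(1)\, x & \text{if } x \ge 0.
\end{cases}
\]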
Example : barycenter constraint
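Set $\psi(1) = \psi(-1) = 0$; then a solution of the dual problem is
\[
\psi(x) =
\begin{cases}
2x(x+1) & \text{if } x \le 0, \\
2x(x-1) & \text{if } x \ge 0.
\end{cases}
\]
[Figure: graphs of $|x_0 - \cdot| - \psi$ and of $(|x_0 - \cdot| - \psi(\cdot))^{**}$ on $[-1,1]$, with the point $x_0$ marked.]
\[
\int_\Omega \big( c(x_0,y) - \psi(y) \big)\, \gamma^{x_0}(dy) = (c(x_0,\cdot) - \psi)^{**}(x_0) = -G^*(x_0,\psi)
\]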
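The numbers in this example can be checked numerically. The sketch below is an illustration only: it uses a midpoint rule for the integrals against $\mu$, relies on the identity $G^*(\cdot,\psi) = \psi$ read off from the slides, takes "send $x$ to the nearest atom" as one admissible map for (M), and approximates the convex envelope by two-point laws on a grid (an upper bound in general, exact here because the optimal atoms lie on the grid).

```python
def gamma_x(x):
    """Weights of gamma^x on the atoms (-1, 0, 1), as given in the example."""
    return ((abs(x) - x) / 2, 1 - abs(x), (abs(x) + x) / 2)

def psi(x):
    """Candidate dual solution with psi(-1) = psi(0) = psi(1) = 0."""
    return 2 * x * (x + 1) if x <= 0 else 2 * x * (x - 1)

def integrate_mu(f, n=20000):
    """Midpoint rule for the integral of f against mu = (1/2) dx on [-1, 1]."""
    h = 2.0 / n
    return sum(f(-1 + (k + 0.5) * h) for k in range(n)) * h / 2

def envelope_at(f, x0, grid):
    """Convex envelope of f at x0, using laws carried by two grid points."""
    best = f(x0)
    for a in grid:
        for b in grid:
            if a < b and a <= x0 <= b:
                t = (x0 - a) / (b - a)
                best = min(best, (1 - t) * f(a) + t * f(b))
    return best

atoms, nu = (-1, 0, 1), (0.25, 0.5, 0.25)

# Primal value: integral of sum_j |y_j - x| gamma^x_j against mu.
primal = integrate_mu(lambda x: sum(abs(y - x) * w for y, w in zip(atoms, gamma_x(x))))

# Dual value int psi dnu - int G*(x, psi) dmu, with G*(., psi) = psi.
dual = sum(psi(y) * n for y, n in zip(atoms, nu)) - integrate_mu(psi)

# Cost of one admissible map: send x to the nearest atom.
nearest = lambda x: min(atoms, key=lambda y: abs(y - x))
monge = integrate_mu(lambda x: abs(nearest(x) - x))

# Check -G*(x0, psi) = (|x0 - .| - psi)**(x0) = -psi(x0) at x0 = 0.3.
grid = [k / 20 - 1 for k in range(41)]
x0 = 0.3
gap = envelope_at(lambda y: abs(x0 - y) - psi(y), x0, grid) + psi(x0)

print(round(primal, 6), round(dual, 6), round(monge, 6))
```

The primal and dual values agree (both close to 1/3), and the map cost is close to 1/4.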
Example : variance cost
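Take $c(x,y) = \lambda |y-x|^2$ and $H(x,p) = |[p]|^2$, and set $\mu = \tfrac12 \delta_0 + \tfrac12 \delta_1$ and $\nu = dx \llcorner [0,1]$. Then, for $\lambda \ge \tfrac12$, $\gamma^0 = 2\, dx \llcorner [0,\tfrac12]$ and $\gamma^1 = 2\, dx \llcorner [\tfrac12,1]$.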
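As a sanity check (not a proof of optimality), one can compare the energy of this plan with that of the product plan $\gamma^x = \nu$. The sketch below assumes the conditional laws are given by densities on $[0,1]$ and uses a midpoint rule; `energy` is a hypothetical helper.

```python
def energy(density0, density1, lam, n=2000):
    """E = 1/2 G(0, gamma^0) + 1/2 G(1, gamma^1), with
    G(x, p) = lam * int |y - x|^2 dp + |[p]|^2, by the midpoint rule on [0, 1]."""
    h = 1.0 / n
    ys = [(k + 0.5) * h for k in range(n)]
    total = 0.0
    for x, dens in ((0.0, density0), (1.0, density1)):
        transport = sum(lam * (y - x) ** 2 * dens(y) for y in ys) * h
        bary = sum(y * dens(y) for y in ys) * h
        total += 0.5 * (transport + bary ** 2)
    return total

split0 = lambda y: 2.0 if y < 0.5 else 0.0    # gamma^0 = 2 dx on [0, 1/2]
split1 = lambda y: 2.0 if y >= 0.5 else 0.0   # gamma^1 = 2 dx on [1/2, 1]
uniform = lambda y: 1.0                        # product plan: gamma^x = nu

lam = 1.0
e_split = energy(split0, split1, lam)      # close to lam/12 + 5/16
e_product = energy(uniform, uniform, lam)  # close to lam/3 + 1/4
print(e_split < e_product)  # True
```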