A new class of costs for optimal transport problems

(1)

Thierry Champion

Laboratoire IMATH, Université de Toulon. Joint work with G. Bouchitté and J.J. Alibert (IMATH)

(2)

Classical optimal transport problem

- $X$, $Y$ convex, compact sets (in some $\mathbb{R}^d$)

- cost function $c : X \times Y \to \mathbb{R} \cup \{+\infty\}$, lower semicontinuous

- $\mu \in \mathcal{P}(X)$, $\nu \in \mathcal{P}(Y)$ Borel probabilities on $X$, $Y$

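The classical Monge-Kantorovich problem associated to $c$:

\[ (MK) \quad \inf \left\{ \int_{X \times Y} c(x,y)\, d\gamma(x,y) \;:\; \gamma \in \Pi(\mu,\nu) \right\} \]

$\Pi(\mu,\nu)$: set of transport plans from $\mu$ to $\nu$:

\[ \gamma \in \Pi(\mu,\nu) \iff \begin{cases} \forall A,\ \gamma(A \times Y) = \mu(A) \\ \forall B,\ \gamma(X \times B) = \nu(B) \end{cases} \iff \forall \varphi, \psi,\ \int_{X \times Y} \big(\varphi(x) + \psi(y)\big)\, d\gamma = \int_X \varphi\, d\mu + \int_Y \psi\, d\nu \]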

(3)

Classical optimal transport problem

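Discrete: if $\mu = \sum_i \mu_i \delta_{x_i}$ and $\nu = \sum_j \nu_j \delta_{y_j}$, then $\gamma = \sum_{i,j} \gamma_{i,j} \delta_{(x_i,y_j)}$ belongs to $\Pi(\mu,\nu)$ whenever

\[ \mu_i = \sum_j \gamma_{i,j} \quad \text{and} \quad \nu_j = \sum_i \gamma_{i,j}. \]

Note: $\gamma_{i,j}$ = amount of mass moved from $x_i$ to $y_j$.

Product: $\gamma = \mu \times \nu$ belongs to $\Pi(\mu,\nu)$.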
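The discrete marginal conditions can be checked directly. The following is a minimal Python sketch, assuming a plan is stored as a nested list `gamma[i][j]`; the helper names `is_transport_plan` and `product_plan` are illustrative and not part of the talk.

```python
from fractions import Fraction as Fr

def is_transport_plan(gamma, mu, nu):
    """Check gamma_ij >= 0, sum_j gamma_ij = mu_i and sum_i gamma_ij = nu_j."""
    if len(gamma) != len(mu) or any(len(row) != len(nu) for row in gamma):
        return False
    nonneg = all(g >= 0 for row in gamma for g in row)
    rows_ok = all(sum(row) == m for row, m in zip(gamma, mu))
    cols_ok = all(sum(col) == n for col, n in zip(zip(*gamma), nu))
    return nonneg and rows_ok and cols_ok

def product_plan(mu, nu):
    """Product coupling: gamma_ij = mu_i * nu_j."""
    return [[m * n for n in nu] for m in mu]

# Illustrative data (exact rationals avoid rounding issues).
mu = [Fr(1, 2), Fr(1, 2)]
nu = [Fr(1, 4), Fr(1, 2), Fr(1, 4)]
gamma = product_plan(mu, nu)
print(is_transport_plan(gamma, mu, nu))
```

Exact fractions are used so that the marginal equalities can be tested with `==` rather than a tolerance.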


(5)

Classical optimal transport problem

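Transport maps: if $T_\#\mu = \nu$ then $(\mathrm{id} \times T)_\#\mu \in \Pi(\mu,\nu)$, where $T_\#\mu(A) := \mu(T^{-1}(A))$ for all $A$.

Discrete: if $\mu = \sum_i \mu_i \delta_{x_i}$ and $\nu = \sum_j \nu_j \delta_{y_j}$, then

\[ T_\#\mu = \nu \iff \forall j,\ \nu_j = \sum_{i \,:\, x_i \in T^{-1}(y_j)} \mu_i \quad \text{and} \quad (\mathrm{id} \times T)_\#\mu = \sum_i \mu_i \delta_{(x_i, T(x_i))}. \]

(MK) is the relaxed version of the Monge problem

\[ (M) \quad \inf \left\{ \int_X c(x, T(x))\, d\mu(x) \;:\; T_\#\mu = \nu \right\} \]

References (MK)-(M): Villani (2003, 2009), Santambrogio (2015)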
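To make the discrete push-forward concrete, here is a small Python sketch. It assumes a discrete measure is stored as a dict `{x_i: mu_i}`; the map `T` and the function names are hypothetical choices for illustration only.

```python
from collections import defaultdict
from fractions import Fraction as Fr

def pushforward(mu, T):
    """T_# mu: the mass at y is the sum of mu_i over all x_i with T(x_i) = y."""
    nu = defaultdict(Fr)  # Fr() is the exact zero
    for x, m in mu.items():
        nu[T(x)] += m
    return dict(nu)

def plan_from_map(mu, T):
    """(id x T)_# mu = sum_i mu_i * delta_{(x_i, T(x_i))}."""
    return {(x, T(x)): m for x, m in mu.items()}

def monge_cost(mu, T, c):
    """Objective of (M) for a given map: sum_i mu_i * c(x_i, T(x_i))."""
    return sum(m * c(x, T(x)) for x, m in mu.items())

mu = {0: Fr(1, 4), 1: Fr(1, 4), 2: Fr(1, 2)}
T = lambda x: x // 2               # hypothetical map: 0, 1 -> 0 and 2 -> 1
nu = pushforward(mu, T)            # mass 1/2 at 0 and 1/2 at 1
cost = monge_cost(mu, T, lambda x, y: abs(y - x))
print(nu, cost)
```

Note that a map cannot split the mass sitting at one atom, which is why (M) may have no admissible map between discrete measures while (MK) always has admissible plans.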


(7)

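Disintegration of γ

Take $\gamma \in \Pi(\mu,\nu)$.

Write $\gamma = \gamma^x \otimes \mu$, the disintegration of $\gamma$ with respect to $\mu$: $\gamma^x \in \mathcal{P}(Y)$ for $\mu$-a.e. $x$, and

\[ \forall f \in C_b(X \times Y), \quad \langle \gamma, f \rangle = \int_X \left( \int_Y f(x,y)\, d\gamma^x(y) \right) d\mu(x). \]

Discrete: if $\mu = \sum_i \mu_i \delta_{x_i}$, $\nu = \sum_j \nu_j \delta_{y_j}$ and $\gamma = \sum_{i,j} \gamma_{i,j} \delta_{(x_i,y_j)}$, then

\[ \gamma^{x_i} = \sum_j \frac{\gamma_{i,j}}{\mu_i}\, \delta_{y_j}. \]

Transport map: if $\gamma = (\mathrm{id} \times T)_\#\mu$ then $\gamma^x = \delta_{T(x)}$ for a.e. $x$.

Product: if $\gamma = \mu \times \nu$ then $\gamma^x = \nu$ for a.e. $x$.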
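In the discrete case the disintegration is simply a row normalization. The sketch below assumes every $\mu_i > 0$ and uses illustrative names (`disintegrate`, `mix`); it also checks that averaging the kernels against $\mu$ gives back the second marginal.

```python
from fractions import Fraction as Fr

def disintegrate(gamma):
    """Return (mu, kernels) with mu_i = sum_j gamma_ij and
    kernels[i][j] = gamma_ij / mu_i (assumes every mu_i > 0)."""
    mu = [sum(row) for row in gamma]
    kernels = [[g / m for g in row] for row, m in zip(gamma, mu)]
    return mu, kernels

def mix(mu, kernels):
    """Second marginal: sum_i mu_i * gamma^{x_i}, as weights on the y_j."""
    n_cols = len(kernels[0])
    return [sum(m * k[j] for m, k in zip(mu, kernels)) for j in range(n_cols)]

# Illustrative plan between two atoms x_i and three atoms y_j.
gamma = [[Fr(1, 4), Fr(1, 4), 0],
         [0, Fr(1, 4), Fr(1, 4)]]
mu, kernels = disintegrate(gamma)
nu = mix(mu, kernels)
print(mu, kernels, nu)
```

Each row of `kernels` is a probability vector on the $y_j$, which is the discrete counterpart of $\gamma^x \in \mathcal{P}(Y)$.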


(11)

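Disintegration of γ

The classical Monge-Kantorovich problem now reads:

\[ (MK) \quad \inf \left\{ \int_X \int_Y c(x,y)\, d\gamma^x(y)\, d\mu(x) \;:\; \int_X \gamma^x\, d\mu(x) = \nu \right\} \]

About $\int_X \gamma^x\, d\mu(x) = \nu$:

Discrete: if $\mu = \sum_i \mu_i \delta_{x_i}$, $\nu = \sum_j \nu_j \delta_{y_j}$, $\gamma \in \Pi(\mu,\nu)$, then

\[ \gamma^{x_i} = \sum_j \frac{\gamma_{i,j}}{\mu_i}\, \delta_{y_j} \quad \text{with} \quad \nu_j = \sum_i \gamma_{i,j} \]

and

\[ \int_X \gamma^x\, d\mu(x) = \sum_i \mu_i \left( \sum_j \frac{\gamma_{i,j}}{\mu_i}\, \delta_{y_j} \right) = \nu. \]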

(12)

Disintegration of γ

About $\int_X \gamma^x\, d\mu(x) = \nu$:

Continuous: if $\nu = \nu(y)\, dy$ and $\gamma^x = \gamma^x(y)\, dy$ for a.e. $x$, then

\[ \int_X \gamma^x(y)\, d\mu(x) = \nu(y) \quad \text{for a.e. } y. \]

(13)

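Disintegration of γ

The classical Monge-Kantorovich problem

\[ (MK) \quad \inf \left\{ \int_X \int_Y c(x,y)\, d\gamma^x(y)\, d\mu(x) \;:\; \int_X \gamma^x\, d\mu(x) = \nu \right\} \]

can be rewritten

\[ (MK) \quad \inf \left\{ \int_X G(x, \gamma^x)\, d\mu(x) \;:\; \int_X \gamma^x\, d\mu(x) = \nu \right\} \]

with

\[ G : (x,p) \in X \times \mathcal{P}(Y) \mapsto G(x,p) = \int_Y c(x,y)\, dp(y). \]

Note: $G$ is linear in $p$.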
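The equivalence of the two forms of (MK) and the linearity of $G$ in $p$ can be checked on a small discrete instance. This is a sketch under illustrative assumptions: the cost $c(x,y) = |y - x|$, the atoms and the plan are made up for the example.

```python
from fractions import Fraction as Fr

def G(c_row, p):
    """G(x, p) = sum_j c(x, y_j) * p_j for a fixed x, with c_row[j] = c(x, y_j)."""
    return sum(cj * pj for cj, pj in zip(c_row, p))

def cost_from_plan(c, gamma):
    """Integral of c against gamma: sum_ij c_ij * gamma_ij."""
    return sum(cij * gij for c_row, g_row in zip(c, gamma)
               for cij, gij in zip(c_row, g_row))

def cost_from_kernels(c, mu, kernels):
    """Integral of G(x, gamma^x) against mu: sum_i mu_i * G(x_i, gamma^{x_i})."""
    return sum(m * G(c_row, k) for m, c_row, k in zip(mu, c, kernels))

xs, ys = [0, 1], [0, 1, 2]
c = [[abs(y - x) for y in ys] for x in xs]       # illustrative cost |y - x|
gamma = [[Fr(1, 4), Fr(1, 4), 0],
         [0, Fr(1, 4), Fr(1, 4)]]
mu = [sum(row) for row in gamma]
kernels = [[g / m for g in row] for row, m in zip(gamma, mu)]

plan_cost = cost_from_plan(c, gamma)
kernel_cost = cost_from_kernels(c, mu, kernels)
print(plan_cost, kernel_cost)
```

Both numbers coincide because $\mu_i \cdot \gamma_{i,j}/\mu_i = \gamma_{i,j}$ term by term.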

(14)

New class of costs

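In this talk, we are interested in the following generalization of (MK):

\[ F(\mu,\nu) = \inf \left\{ \int_X G(x, \gamma^x)\, d\mu(x) \;:\; \int_X \gamma^x\, d\mu(x) = \nu \right\} \]

with

\[ G : (x,p) \in X \times \mathcal{P}(Y) \mapsto G(x,p) = \int_Y c(x,y)\, dp(y) + H(x,p), \]

where $H : X \times \mathcal{P}(Y) \to [0,+\infty]$ is an entropy / perturbation cost.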

(15)

New class of costs

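- Cardinal cost $H(x,p) = \#(\mathrm{support}(p)) - 1$.

Note:

\[ H(x, \gamma^x) = 0 \ \text{a.e.} \iff \gamma^x = \delta_{T(x)} \ \text{a.e.} \iff \gamma = (\mathrm{id} \times T)_\#\mu, \]

so that $F(\mu,\nu) = \inf(M) = \min(MK)$ when $\mu$ has no atoms.

Then $F(\mu,\nu)$ may have no solution even though $H$ is l.s.c. on $\mathcal{P}(Y)$.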
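One way to see what goes wrong with the cardinal cost is that it is not convex in $p$: mixing two Dirac masses raises the cost. A minimal Python sketch, assuming discrete measures stored as dicts `{y: mass}` (the names are illustrative):

```python
from fractions import Fraction as Fr

def cardinal_cost(p):
    """H(p) = #support(p) - 1 for a discrete probability p = {y: mass}."""
    return sum(1 for m in p.values() if m > 0) - 1

def mixture(t, p, q):
    """Convex combination t*p + (1-t)*q of two discrete measures."""
    keys = set(p) | set(q)
    return {y: t * p.get(y, 0) + (1 - t) * q.get(y, 0) for y in keys}

delta_a = {0: Fr(1)}
delta_b = {1: Fr(1)}
half = mixture(Fr(1, 2), delta_a, delta_b)

# Convexity would require H(half) <= (H(delta_a) + H(delta_b)) / 2 = 0.
print(cardinal_cost(delta_a), cardinal_cost(delta_b), cardinal_cost(half))
```

The mixture has two atoms, so its cost is 1 while both endpoints cost 0.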

(16)

New class of costs

- "Variance" cost

\[ H(x,p) = \mathrm{var}(p) = \int_Y |y|^2\, dp(y) - |[p]|^2, \quad \text{where } [p] = \int_Y y\, dp(y) \text{ denotes the barycenter of } p. \]

Note

\[ \int_X H(x, \gamma^x)\, d\mu(x) = \int_Y |y|^2\, d\nu(y) - \int_X |[\gamma^x]|^2\, d\mu(x). \]

$H$ is not convex in $p$; $F(\mu,\nu)$ may have no solution.

- Variance cost $H(x,p) = -\mathrm{var}(p)$ or $H(x,p) = |[p]|^2$. Then $H$ is l.s.c. and convex on $\mathcal{P}(Y)$.

$H$ favours the spreading of $p$ (max. of variance).
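The identity for the integrated variance cost can be verified on a discrete plan. The sketch below works in dimension one with made-up atoms and weights; all names are illustrative.

```python
from fractions import Fraction as Fr

def barycenter(ys, p):
    """[p] = sum_j y_j * p_j."""
    return sum(y * w for y, w in zip(ys, p))

def variance(ys, p):
    """var(p) = sum_j y_j^2 * p_j - [p]^2."""
    return sum(y * y * w for y, w in zip(ys, p)) - barycenter(ys, p) ** 2

ys = [-1, 0, 1]
gamma = [[Fr(1, 4), Fr(1, 4), 0],
         [0, Fr(1, 4), Fr(1, 4)]]
mu = [sum(row) for row in gamma]
nu = [sum(col) for col in zip(*gamma)]
kernels = [[g / m for g in row] for row, m in zip(gamma, mu)]

# Left-hand side: integral of var(gamma^x) against mu.
lhs = sum(m * variance(ys, k) for m, k in zip(mu, kernels))
# Right-hand side: second moment of nu minus the integral of |[gamma^x]|^2.
rhs = (sum(y * y * n for y, n in zip(ys, nu))
       - sum(m * barycenter(ys, k) ** 2 for m, k in zip(mu, kernels)))
print(lhs, rhs)
```

Since the second moment of $\nu$ is fixed by the constraint, only the term $\int_X |[\gamma^x]|^2\, d\mu$ depends on the plan.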


(18)

New class of costs

- Barycenter constraint

\[ H(x,p) = \chi_{[p]=x} = \begin{cases} 0 & \text{if } [p] = x, \\ +\infty & \text{otherwise.} \end{cases} \]

For the cost $c(x,y) = -|y-x|$, $F(\mu,\nu)$ is related to model-independent pricing in mathematical finance [Hobson Neuberger 2012] and [Beiglböck Henry-Labordère Penkner 2013].

Existence of a particular solution $\gamma$: [Beiglböck Juillet –]

Note: $F(\mu,\nu) < +\infty \iff \mu \preceq \nu$ for the convex order.

(19)

Existence result

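Main hypotheses

(H1) $c : X \times Y \to \mathbb{R} \cup \{+\infty\}$ is lower semicontinuous,

(H2) $H : X \times \mathcal{P}(Y) \to \mathbb{R} \cup \{+\infty\}$ satisfies

- $H$ is lower semicontinuous on $X \times \mathcal{P}(Y)$.

- for every $x \in X$, $p \mapsto H(x,p)$ is convex.

Theorem

Assume (H1) and (H2), and recall

\[ F(\mu,\nu) = \inf \left\{ \int_X G(x, \gamma^x)\, d\mu(x) \;:\; \int_X \gamma^x\, d\mu(x) = \nu \right\}; \]

then $F$ is lower semicontinuous on $\mathcal{M}_b^+(X) \times \mathcal{M}_b^+(Y)$.

Moreover, if $F(\mu,\nu) < +\infty$ then there is at least one minimizer.

Here $F(\mu,\nu)$ is extended by $F(\mu,\nu) = +\infty$ whenever $\mu(X) \neq \nu(Y)$.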

(20)

Lower semicontinuity property

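Set $E(\gamma) = \int_X G(x, \gamma^x)\, d\mu$ whenever $\gamma \in \Pi(\mu,\nu)$.

Lemma – Lower semicontinuity of $E$

Assume (H1) and (H2). If $(\gamma_n)_n = (\gamma_n^x \otimes \mu_n)_n$ weakly converges in $\mathcal{M}_b(X \times Y)$ to $\gamma = \gamma^x \otimes \mu$, then

\[ \liminf_{n \to +\infty} \int_X G(x, \gamma_n^x)\, d\mu_n \ \ge\ \int_X G(x, \gamma^x)\, d\mu. \]

Note: convexity of $p \mapsto H(x,p)$ is necessary. Counterexamples follow for the cardinal cost $H(x,p) = \#(\mathrm{support}(p)) - 1$ when $\inf(M) = \min(MK)$ and (M) is not attained.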

(21)

Lower semicontinuity property

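Let $G^*(x,\cdot)$ denote the Fenchel conjugate of the convex function $G(x,\cdot)$:

\[ \forall \psi \in C(Y), \quad G^*(x,\psi) = \sup \left\{ \int_Y \psi\, dp - G(x,p) \;:\; p \in \mathcal{P}(Y) \right\}. \]

Then one has:

- Upper semicontinuity: if $\psi \in C(Y)$ then $x \mapsto G^*(x,\psi)$ is upper semicontinuous.

- Bounds: denote $m_G = \inf G$; then $\inf_Y \psi - m_G \le G^*(x,\psi) \le \sup_Y \psi - m_G$.

- Lipschitz property: for every $x \in X$, $G^*(x,\cdot)$ satisfies

\[ |G^*(x,\psi_1) - G^*(x,\psi_2)| \le \sup_Y |\psi_1 - \psi_2|. \]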

(22)

Lower semicontinuity property

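Let $(\psi_k)_k$ be a dense sequence in $C(Y)$.

Since $G(x,\cdot)$ is convex and l.s.c.:

\[ \forall p \in \mathcal{P}(Y), \quad G(x,p) = \sup_k \left\{ \int \psi_k\, dp - G^*(x,\psi_k) \right\} = \sup_k G_k(x,p). \]

Then for $(\Omega_k)_{1 \le k \le m}$ disjoint open sets

\[ \int_X G(x, \gamma_n^x)\, d\mu_n(x) \ \ge\ \sum_{k=0}^m \int_{\Omega_k} G_k(x, \gamma_n^x)\, d\mu_n(x) = \sum_{k=0}^m \left[ \int_{\Omega_k \times Y} \psi_k(y)\, d\gamma_n(x,y) + \int_{\Omega_k} -G^*(x,\psi_k)\, d\mu_n(x) \right] \]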

(23)

Lower semicontinuity property

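Then one gets

\[ \liminf_{n \to +\infty} \int_X G(x, \gamma_n^x)\, d\mu_n(x) \ \ge\ \sum_{k=0}^m \left[ \int_{\Omega_k \times Y} \psi_k(y)\, d\gamma(x,y) + \int_{\Omega_k} -G^*(x,\psi_k)\, d\mu(x) \right] = \sum_{k=0}^m \int_{\Omega_k} G_k(x, \gamma^x)\, d\mu(x). \]

Taking the sup over $m$ and the open partitions yields:

\[ \liminf_{n \to +\infty} \int_X G(x, \gamma_n^x)\, d\mu_n(x) \ \ge\ \int_X G(x, \gamma^x)\, d\mu(x). \]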

(24)

Dual problem and optimality conditions

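Recall

\[ F(\mu,\nu) = \inf \left\{ \int_X G(x, \gamma^x)\, d\mu(x) \;:\; \int_X \gamma^x\, d\mu(x) = \nu \right\} \]

extended by 1-homogeneity on $\mathcal{M}_b^+(X) \times \mathcal{M}_b^+(Y)$.

From convexity and lower semicontinuity one obtains the following. Assume (H1) and (H2); then

\[ F(\mu,\nu) = \sup \left\{ \int_Y \psi(y)\, d\nu - \int_X G^*(x,\psi)\, d\mu(x) \;:\; \psi \in C^0(Y) \right\} \]

and equality holds in $[0,+\infty]$.

Moreover the dual pair $(\gamma,\psi)$ is optimal whenever $\psi \in \partial G(x, \gamma^x)$ $\mu$-a.e.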

(25)

Dual problem and optimality conditions

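- if $H = 0$, then $G(x,p) = \int_Y c(x,y)\, dp(y)$ and

\[ G^*(x,\psi) = \sup_{p \in \mathcal{P}(Y)} \int_Y \big(\psi(y) - c(x,y)\big)\, dp = \sup_{y \in Y} \big(\psi(y) - c(x,y)\big) = -\psi^c(x), \]

so that one recovers the classical Kantorovich dual problem

\[ F(\mu,\nu) = \sup \left\{ \int_Y \psi(y)\, d\nu + \int_X \psi^c(x)\, d\mu(x) \;:\; \psi \in C^0(Y) \right\} \]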
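For $H = 0$ and discrete data, the conjugate reduces to a maximum over the atoms $y_j$, so the dual value is easy to evaluate. The sketch below uses an illustrative instance (cost $|y-x|$, hand-picked measures and potentials, hypothetical function names) and compares the dual value with the cost of one admissible plan.

```python
from fractions import Fraction as Fr

def conjugate_H0(c_row, psi):
    """G*(x, psi) = max_j (psi_j - c(x, y_j)) when H = 0: a linear
    function of p on P(Y) attains its sup at a Dirac mass."""
    return max(pj - cj for pj, cj in zip(psi, c_row))

def dual_value(c, mu, nu, psi):
    """Integral of psi against nu minus integral of G*(x, psi) against mu."""
    return (sum(pj * nj for pj, nj in zip(psi, nu))
            - sum(m * conjugate_H0(c_row, psi) for m, c_row in zip(mu, c)))

xs, ys = [0, 1], [0, 1, 2]
c = [[abs(y - x) for y in ys] for x in xs]
mu = [Fr(1, 2), Fr(1, 2)]
nu = [Fr(1, 4), Fr(1, 2), Fr(1, 4)]

# One admissible plan (monotone rearrangement) and its cost.
gamma = [[Fr(1, 4), Fr(1, 4), 0],
         [0, Fr(1, 4), Fr(1, 4)]]
primal = sum(cij * gij for c_row, g_row in zip(c, gamma)
             for cij, gij in zip(c_row, g_row))

dual_zero = dual_value(c, mu, nu, [0, 0, 0])   # trivial potential
dual_id = dual_value(c, mu, nu, [0, 1, 2])     # potential psi(y) = y
print(primal, dual_zero, dual_id)
```

Any potential gives a lower bound on the cost of any plan; here $\psi(y) = y$ reaches the cost of the chosen plan, so both are optimal for this instance.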

(26)

Dual problem and optimality conditions

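- if $c = 0$ and $H(x,p) = \chi_{[p]=x}$ then

\[ G^*(x,\psi) = \sup_{p \in \mathcal{P}(Y)} \left\{ \int_Y \psi(y)\, dp \;:\; [p] = x \right\} = -\inf_{p \in \mathcal{P}(Y)} \left\{ \int_Y -\psi(y)\, dp \;:\; [p] = x \right\} = -(-\psi)^{**}(x) \]

so that (here $X = Y$):

\[ F(\mu,\nu) = \sup \left\{ -\int_X -\psi\, d\nu + \int_X (-\psi)^{**}\, d\mu(x) \;:\; \psi \in C^0(X) \right\} \]

and then we recover [Strassen 1965]:

\[ F(\mu,\nu) < +\infty \iff \int \psi\, d\mu \le \int \psi\, d\nu \quad \forall \psi \text{ convex.} \]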

(27)

Dual problem and optimality conditions

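- if $H(x,p) = h(x,[p])$ then

\[ G^*(x,\psi) = -\inf_{z \in \mathbb{R}^d} \left\{ (c(x,\cdot) - \psi)^{**}(-z) + h(x,z) \right\} \]

and the optimality condition reads: for $\mu$-a.e. $x$,

\[ 0 \in \partial h(x, [\gamma^x]) + \partial (c(x,\cdot) - \psi)^{**}([\gamma^x]) \]

and

\[ \int_\Omega \big(c(x,y) - \psi(y)\big)\, \gamma^x(dy) = (c(x,\cdot) - \psi)^{**}([\gamma^x]). \]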


Model case: $h(x,[p]) = \chi_{[p]=x}$; then $G^*(x,\psi) = -(c(x,\cdot) - \psi(\cdot))^{**}(x)$.


model case : c(x,y) =λ|y−x|², h(x,[p]) =λ|[p]|² then G^∗(x, ψ) =−(| · | −ψ)^∗∗O| · |(λx)−λ(1−λ)|x|²

(30)

Existence of a solution ψ for dual problem

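Difficult task: in [Beiglböck Henry-Labordère Penkner 2013], counterexample for $c(x,y) = -|y-x|$ and $H(x,p) = \chi_{[p]=x}$, for some discrete $\mu$ on $[0,2]$ and $\nu = \tfrac12\, dx \llcorner [0,2]$.

Back to the classical dual case:

\[ \sup \left\{ \int_Y \psi(y)\, d\nu + \int_X \cdots \right\} = \sup \left\{ \int_Y (\psi^c)^c(y)\, d\nu + \int_X \cdots \right\} \]

with $\psi^c(x) = \sup_{y \in Y} \{\psi(y) - c(x,y)\}$ and $((\psi^c)^c)^c = \psi^c$.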


(32)

Existence of a solution ψ for dual problem

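Classical dual case, $c$ subadditive ($c(x,z) \le c(x,y) + c(y,z)$):

\[ \sup \left\{ \int_Y (\psi^c)^c(y)\, d\nu + \int_X \cdots \right\} = \sup \left\{ \int_Y \psi^c(y)\, d\nu - \int_X \cdots \right\} \]

i.e. $(\psi^c)^c = \psi^c$ → look for a solution of the form $\psi^c$.

Framework: $X = Y$, $c$ subadditive ($c(x,z) \le c(x,y) + c(y,z)$).

Goal: find conditions for which $G^*(\cdot, G^*(\cdot,\psi)) = G^*(\cdot,\psi)$.

First: if $c(x,x) = 0$ and $H(x,\delta_x) = 0$ then $G(x,\delta_x) = 0$, so that $G^*(x,\psi) \ge \psi(x)$ for all $x$, $\psi$.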

(33)

Existence of a solution ψ for dual problem

Proposition

If $c$ is subadditive, and

- $H(x,p) = h([p])$ with $h$ l.s.c. convex, $h(0) = 0$, $h \ge 0$, or
- $H(x,p) = h([p] - x)$ with $h$ as above and subadditive,

then $G^*(\cdot, G^*(\cdot,\psi)) = G^*(\cdot,\psi)$ for any $\psi \in C(X)$.

Applies in particular to $H(p) = |[p]|^2$ and $H(x,p) = \chi_{[p]=x}$.

(34)

Example : barycenter constraint

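Take $c(x,y) = |y-x|$ and $H(x,p) = \chi_{[p]=x}$, set $\mu = \tfrac12\, dx \llcorner [-1,1]$ and $\nu = \tfrac14 \delta_{-1} + \tfrac12 \delta_0 + \tfrac14 \delta_1$. Then

\[ \gamma^x = \frac{|x| - x}{2}\, \delta_{-1} + (1 - |x|)\, \delta_0 + \frac{|x| + x}{2}\, \delta_1 \]

and $F(\mu,\nu) = \tfrac13$ while $\inf(M) = \tfrac14$.

Set $\psi(0) = 0$; then

\[ \int_\Omega \big(c(x,y) - \psi(y)\big)\, \gamma^x(dy) = (c(x,\cdot) - \psi)^{**}([\gamma^x]) = -G^*(x,\psi) \]

implies

\[ \psi(x) = -(|x - \cdot| - \psi)^{**}(x) = \begin{cases} 2x(1+x) - \psi(-1)\, x & \text{if } x \le 0, \\ 2x(x-1) + \psi(1)\, x & \text{if } x \ge 0. \end{cases} \]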
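The values in this example can be checked numerically. The Python sketch below uses a simple midpoint rule (an implementation choice, not part of the talk) to verify that the stated kernel has barycenter $x$, second marginal $\nu$ and total cost $\tfrac13$. It also evaluates one candidate Monge map, chosen here for illustration, whose cost is $\tfrac14$.

```python
def kernel(x):
    """Weights of gamma^x on the atoms (-1, 0, 1), as stated in the example."""
    return ((abs(x) - x) / 2, 1 - abs(x), (abs(x) + x) / 2)

def midpoint(f, a, b, n=2000):
    """Composite midpoint rule on [a, b] with n panels."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

atoms = (-1.0, 0.0, 1.0)
density = 0.5                     # mu = (1/2) dx on [-1, 1]

def local_cost(x):
    """Integral of |y - x| against gamma^x."""
    return sum(w * abs(y - x) for w, y in zip(kernel(x), atoms))

def T(x):
    """Illustrative Monge map sending mu to nu: nearest atom."""
    return -1.0 if x < -0.5 else (1.0 if x > 0.5 else 0.0)

F_value = midpoint(lambda x: density * local_cost(x), -1.0, 1.0)
nu = [midpoint(lambda x, j=j: density * kernel(x)[j], -1.0, 1.0) for j in range(3)]
monge_value = midpoint(lambda x: density * abs(T(x) - x), -1.0, 1.0)
bary_gap = max(abs(sum(w * y for w, y in zip(kernel(x), atoms)) - x)
               for x in (-0.9, -0.3, 0.0, 0.4, 1.0))
print(F_value, nu, monge_value, bary_gap)
```

The barycenter constraint makes the problem more expensive than the unconstrained Monge problem, which is consistent with $\tfrac13 > \tfrac14$.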

(35)

Example : barycenter constraint

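Set $\psi(1) = \psi(-1) = 0$; then a solution of the dual problem is

\[ \psi(x) = \begin{cases} 2x(x+1) & \text{if } x \le 0, \\ 2x(x-1) & \text{if } x \ge 0. \end{cases} \]

[Figure: graphs of $|x_0 - \cdot| - \psi$ and of its convex envelope $(|x_0 - \cdot| - \psi(\cdot))^{**}$ on $[-1,1]$, with the point $x_0$ marked.]

\[ \int_\Omega \big(c(x_0,y) - \psi(y)\big)\, \gamma^{x_0}(dy) = (c(x_0,\cdot) - \psi)^{**}(x_0) = -G^*(x_0,\psi) \]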
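As a worked check of the absence of a duality gap in this example, the dual value can be computed by hand. This sketch assumes the identity $G^*(x,\psi) = \psi(x)$ on $[-1,1]$, which is what the optimality relation gives for this $\psi$, and uses that $\psi$ is even with $\psi(-1) = \psi(0) = \psi(1) = 0$.

```latex
\begin{aligned}
\int_X \psi \, d\nu - \int_X G^*(x,\psi)\, d\mu(x)
  &= \tfrac14 \psi(-1) + \tfrac12 \psi(0) + \tfrac14 \psi(1)
     - \tfrac12 \int_{-1}^{1} \psi(x)\, dx \\
  &= 0 - \int_0^1 2x(x-1)\, dx
   = \int_0^1 2x(1-x)\, dx
   = 2\left(\tfrac12 - \tfrac13\right)
   = \tfrac13 .
\end{aligned}
```

This agrees with the primal value $F(\mu,\nu) = \tfrac13$ stated for the same data.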

(36)

Example : variance cost

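Take $c(x,y) = \lambda |y-x|^2$ and $H(x,p) = |[p]|^2$, set $\mu = \tfrac12 \delta_0 + \tfrac12 \delta_1$ and $\nu = dx \llcorner [0,1]$. Then for $\lambda \ge \tfrac12$, $\gamma^0 = dx \llcorner [0, \tfrac12]$ and $\gamma^1 = dx \llcorner [\tfrac12, 1]$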