Minimization of entropy functionals


HAL Id: hal-00177361

https://hal.archives-ouvertes.fr/hal-00177361

Submitted on 7 Oct 2007


Christian Léonard

To cite this version:

Christian Léonard. Minimization of entropy functionals. Journal of Mathematical Analysis and Applications, Elsevier, 2008, 346, pp.183-204. ⟨hal-00177361⟩


Abstract. Entropy functionals (i.e. convex integral functionals) and extensions of these functionals are minimized on convex sets. This paper is aimed at reducing as much as possible the assumptions on the constraint set. Dual equalities and characterizations of the minimizers are obtained with weak constraint qualifications.

1. Introduction

2. Presentation of the minimization problems $(P_C)$ and $(\overline{P}_C)$

3. Preliminary results

4. Solving $(\overline{P}_C)$

5. Solving $(P_C)$

6. Examples

References

1. Introduction

1.1. The entropy minimization problem. Let $R$ be a positive measure on a space $Z$. Take a $[0,\infty]$-valued measurable function $\gamma^*$ on $Z\times\mathbb{R}$ such that $\gamma^*(z,\cdot) := \gamma^*_z$ is convex and lower semicontinuous for all $z\in Z$. Denote by $M_Z$ the space of all signed measures $Q$ on $Z$. The entropy functional to be considered is defined by

\[
I(Q) = \begin{cases} \int_Z \gamma^*_z\big(\tfrac{dQ}{dR}(z)\big)\,R(dz) & \text{if } Q\prec R\\ +\infty & \text{otherwise}\end{cases},\qquad Q\in M_Z \tag{1.1}
\]

where $Q\prec R$ means that $Q$ is absolutely continuous with respect to $R$. Assume that for each $z$ there exists a unique $m(z)$ which minimizes $\gamma^*_z$ with

\[
\gamma^*_z(m(z)) = 0,\quad \forall z\in Z. \tag{1.2}
\]

Then, I is [0, ∞]-valued, its unique minimizer is mR and I(mR) = 0.

This paper is concerned with the minimization problem

\[
\text{minimize } I(Q) \text{ subject to } T_oQ\in C,\quad Q\in M_Z \tag{1.3}
\]

where $T_o : M_Z\to X_o$ is a linear operator which takes its values in a vector space $X_o$ and $C$ is a convex subset of $X_o$.
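As an illustration of (1.3), the following sketch treats a finite space $Z$ with a single real-valued constraint. Everything specific in it is an assumption made for the example and not taken from the text: the integrand $\gamma^*(t)=t\log t-t+1$ (so that $m\equiv 1$), the weights `r`, the function `theta`, the target `x`, and the use of bisection. It relies on the classical fact that, for this integrand, the minimizer has the exponential form $q_i=r_i e^{y\theta_i}$, and it compares the primal value with the value of the conjugate (dual) expression.

```python
import math

def gamma_star(t):
    """Assumed integrand t*log(t) - t + 1 on [0, inf), +inf for t < 0."""
    if t > 0:
        return t * math.log(t) - t + 1.0
    return 1.0 if t == 0 else math.inf

def entropy(q, r):
    """I(Q) = sum_i r_i * gamma_star(q_i / r_i) on a finite space, r_i > 0."""
    return sum(ri * gamma_star(qi / ri) for qi, ri in zip(q, r))

def moment(y, r, theta):
    """sum_i theta_i * r_i * exp(y * theta_i); nondecreasing in y."""
    return sum(ti * ri * math.exp(y * ti) for ri, ti in zip(r, theta))

def solve(r, theta, x, lo=-50.0, hi=50.0, iters=200):
    """Bisection on y so that q_i = r_i * exp(y * theta_i) meets sum theta_i q_i = x."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if moment(mid, r, theta) < x:
            lo = mid
        else:
            hi = mid
    y = 0.5 * (lo + hi)
    q = [ri * math.exp(y * ti) for ri, ti in zip(r, theta)]
    return y, q

def dual_value(y, r, theta, x):
    """y*x - sum_i r_i * (exp(y * theta_i) - 1), the conjugate side of the problem."""
    return y * x - sum(ri * (math.exp(y * ti) - 1.0) for ri, ti in zip(r, theta))

# Hypothetical data: three atoms, constraint sum_i theta_i * q_i = 1.
r = [0.5, 0.3, 0.2]
theta = [0.0, 1.0, 2.0]
y, q = solve(r, theta, x=1.0)
print(y, q, entropy(q, r), dual_value(y, r, theta, 1.0))
```

In this toy case the constraint reads $0.3u+0.4u^2=1$ with $u=e^y$, so $u=1.25$, and the primal and dual values coincide, which is the finite-dimensional shadow of the dual equalities discussed in this paper.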

Date: October 07.

2000 Mathematics Subject Classification. 46E30, 46N10, 49K22, 49N15, 49N45.

Key words and phrases. entropy, convex optimization, constraint qualification, convex conjugate, Orlicz spaces.


1.2. Presentation of the results. Our aim is to reduce as much as possible the restrictions on the convex set $C$. Denoting by $\hat{Q}$ the minimizer of (1.3), the geometric picture is that some level set of $I$ is tangent at $\hat{Q}$ to the constraint set $T_o^{-1}C$. Since these sets are convex, they are separated by some affine hyperplane, and the analytic description of this separation yields the characterization of $\hat{Q}$. Of course, the Hahn-Banach theorem is the key.

Standard approaches require $C$ to be open with respect to some given topology in order to apply it. In the present paper, one chooses to use a topological structure which is designed for the level sets of $I$ to “look like” open sets, so that the Hahn-Banach theorem can be applied without assuming too much on $C$.

This strategy is implemented in [17] in an abstract setting suitable for several applications. It is a refinement of the standard saddle-point method [22] where convex conjugates play an important role. The proofs of the present article are applications of the general results of [17].

Clearly, for the problem (1.3) to be attained, $T_o^{-1}C$ must share a supporting hyperplane with some level set of $I$. This is the reason why it is assumed to be closed with respect to the above mentioned topological structure. This will be the only restriction to be kept together with the interior specification (1.4) below.

Dual equalities and primal attainment are obtained under the weakest possible assumption:

\[
C\cap T_o\,\mathrm{dom}\,I \neq \emptyset
\]

where $\mathrm{dom}\,I := \{Q\in M_Z;\ I(Q)<\infty\}$ is the effective domain of $I$ and $T_o\,\mathrm{dom}\,I$ is its image by $T_o$. The main result of this article is the characterization of the minimizers of (1.3) in the interior case which is specified by

\[
C\cap \mathrm{icor}\,(T_o\,\mathrm{dom}\,I) \neq \emptyset \tag{1.4}
\]

where $\mathrm{icor}\,(T_o\,\mathrm{dom}\,I)$ is the intrinsic core of $T_o\,\mathrm{dom}\,I$. The notion of intrinsic core does not rely on any topology; it gives the largest possible interior set. For comparison, a usual form of constraint qualification required for the representation of the minimizers of (1.3) is

\[
\mathrm{int}\,(C)\cap T_o\,\mathrm{dom}\,I \neq \emptyset \tag{1.5}
\]

where int (C) is the interior of C with respect to a topology which is not directly connected to the “geometry” of I. In particular, int (C) must be nonempty; this is an important restriction. The constraint qualification (1.4) is weaker.

An extension of Problem (1.3) is also investigated. One considers an extension $\bar{I}$ of the entropy $I$ to a vector space $L_Z$ which contains $M_Z$ and may also contain singular linear forms which are not σ-additive. The extended problem is

\[
\text{minimize } \bar{I}(\ell) \text{ subject to } T_o\ell\in C,\quad \ell\in L_Z \tag{1.6}
\]

Even if $I$ is strictly convex, $\bar{I}$ isn’t strictly convex in general, so that (1.6) may admit several minimizers. There are situations where (1.3) is not attained in $M_Z$ while (1.6) is attained in $L_Z$. Other relations between these minimization problems are investigated by the author in [18] with probabilistic questions in mind.

1.3. Literature about entropy minimization. Entropy minimization problems appear in many areas of applied mathematics and sciences. The literature about the minimization of entropy functionals under convex constraints is considerable: many papers are concerned with an engineering approach, working on the implementation of numerical procedures in specific situations. In fact, entropy minimization is a popular method to solve ill-posed inverse problems.


Rigorous general results on this topic are quite recent. Let us cite, among others, the main contribution of Borwein and Lewis: [1], [2], [3], [4], [5], [6] together with the paper [23] by Teboulle and Vajda. In these papers, topological constraint qualifications of the type of (1.5) are required. Such restrictions are removed here.

With a geometric point of view, Csiszár [8, 9] provides a complete treatment of (1.3) with the relative entropy (see Section 6.1) under the weak assumption (1.4). The behavior of minimizing sequences of general entropy functionals is studied in [10].

By means of a method different from the saddle-point approach, the author has already studied in [15, 16] entropy minimization problems under affine constraints (corresponding to $C$ reduced to a single point) and more restrictive assumptions on $\gamma^*$.

The present article extends these results.

Outline of the paper. The minimization problems (1.3) and (1.6) are described in detail in Section 2. In Section 3, the main results of [17] about the extended saddle-point method are recalled. Section 4 is devoted to the extended problem (1.6) and Section 5 to (1.3). Important examples of entropies and constraints are presented in Section 6.1.

Notation. Let $X$ and $Y$ be topological vector spaces. The algebraic dual space of $X$ is $X^*$, the topological dual space of $X$ is $X'$. The topology of $X$ weakened by $Y$ is $\sigma(X,Y)$ and one writes $\langle X,Y\rangle$ to specify that $X$ and $Y$ are in separating duality.

Let $f : X\to[-\infty,+\infty]$ be an extended numerical function. Its convex conjugate with respect to $\langle X,Y\rangle$ is $f^*(y)=\sup_{x\in X}\{\langle x,y\rangle-f(x)\}\in[-\infty,+\infty]$, $y\in Y$. Its subdifferential at $x$ with respect to $\langle X,Y\rangle$ is $\partial_Y f(x)=\{y\in Y;\ f(x+\xi)\ge f(x)+\langle y,\xi\rangle,\ \forall\xi\in X\}$. If no confusion occurs, one writes $\partial f(x)$.

The intrinsic core of a subset $A$ of a vector space is $\mathrm{icor}\,A=\{x\in A;\ \forall x'\in\mathrm{aff}\,A,\ \exists t>0,\ [x,x+t(x'-x)[\,\subset A\}$ where $\mathrm{aff}\,A$ is the affine space spanned by $A$. $\mathrm{icordom}\,f$ is the intrinsic core of the effective domain of $f$: $\mathrm{dom}\,f=\{x\in X;\ f(x)<\infty\}$.

The indicator of a subset $A$ of $X$ is defined by

\[
\iota_A(x)=\begin{cases} 0, & \text{if } x\in A\\ +\infty, & \text{otherwise}\end{cases},\qquad x\in X.
\]

The support function of $A\subset X$ is $\iota_A^*(y)=\sup_{x\in A}\langle x,y\rangle$, $y\in Y$.

One writes $I_\varphi(u) := \int_Z\varphi(z,u(z))\,R(dz)=\int_Z\varphi(u)\,dR$ and $I=I_{\gamma^*}$ for short, instead of (1.1).

2. Presentation of the minimization problems $(P_C)$ and $(\overline{P}_C)$

The problem (1.3) and its extension (1.6) are introduced. Their correct mathematical statements necessitate the notion of Orlicz spaces. The definitions of good and bad constraints are given and the main assumptions are collected at the end of this section.

2.1. Orlicz spaces. To state the minimization problem (1.3) and its extension correctly, one will need to talk in terms of Orlicz spaces related to the function $\gamma^*$.

Let us recall some basic definitions and results. A set $Z$ is furnished with a σ-finite nonnegative measure $R$ on a σ-field which is assumed to be $R$-complete. A function $\rho : Z\times\mathbb{R}\to[0,\infty]$ is said to be a Young function if for $R$-almost every $z$, $\rho(z,\cdot)$ is a convex even $[0,\infty]$-valued function on $\mathbb{R}$ such that $\rho(z,0)=0$ and there exists a measurable function $z\mapsto s_z>0$ such that $0<\rho(z,s_z)<\infty$.

In the sequel, every numerical function on Z is supposed to be measurable.


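Definitions 2.1 (The Orlicz spaces $\mathcal{L}_\rho$, $\mathcal{E}_\rho$, $L_\rho$ and $E_\rho$). The Orlicz space associated with $\rho$ is defined by $\mathcal{L}_\rho(Z,R)=\{u : Z\to\mathbb{R};\ \|u\|_\rho<+\infty\}$ where the Luxemburg norm $\|\cdot\|_\rho$ is defined by

\[
\|u\|_\rho=\inf\Big\{\beta>0;\ \int_Z\rho(z,u(z)/\beta)\,R(dz)\le 1\Big\}.
\]

Hence,

\[
\mathcal{L}_\rho(Z,R)=\Big\{u : Z\to\mathbb{R};\ \exists\alpha_o>0,\ \int_Z\rho\big(z,\alpha_o u(z)\big)\,R(dz)<\infty\Big\}.
\]

A subspace of interest is

\[
\mathcal{E}_\rho(Z,R)=\Big\{u : Z\to\mathbb{R};\ \forall\alpha>0,\ \int_Z\rho\big(z,\alpha u(z)\big)\,R(dz)<\infty\Big\}.
\]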
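To make the Luxemburg norm concrete, here is a minimal sketch on a finite space, where $R$ is a list of nonnegative weights and $\rho$ does not depend on $z$. The function name `luxemburg_norm`, the bisection strategy and the two test integrands are assumptions of this illustration. It uses the fact that $\beta\mapsto\int\rho(u/\beta)\,dR$ is nonincreasing, so the infimum can be bracketed and refined.

```python
def luxemburg_norm(u, weights, rho, tol=1e-12):
    """Luxemburg norm of u on a finite space with measure `weights`.

    Finds inf{beta > 0 : sum_i w_i * rho(u_i / beta) <= 1} by bisection,
    assuming rho is a finite Young function (convex, even, rho(0) = 0).
    """
    def modular(beta):
        return sum(w * rho(v / beta) for v, w in zip(u, weights))

    # u vanishes R-almost everywhere: the norm is zero.
    if all(v == 0 or w == 0 for v, w in zip(u, weights)):
        return 0.0

    lo, hi = 0.0, 1.0
    while modular(hi) > 1.0:      # grow the bracket until the constraint holds
        hi *= 2.0
    while hi - lo > tol * hi:     # relative-precision bisection
        mid = 0.5 * (lo + hi)
        if modular(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi

# With rho(s) = s^2 the norm is the weighted L2 norm; with rho(s) = |s|, the L1 norm.
print(luxemburg_norm([3.0, 4.0], [1.0, 1.0], lambda s: s * s))
print(luxemburg_norm([3.0, 4.0], [0.5, 0.25], abs))
```

For integrands with exponential growth, `rho` may overflow for very small trial values of `beta`; a production version would need to guard against that.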

Now, let us identify the $R$-a.e. equal functions. The corresponding spaces of equivalence classes are denoted $L_\rho(Z,R)$ and $E_\rho(Z,R)$.

Of course $E_\rho\subset L_\rho$. Note that if $\rho$ doesn’t depend on $z$ and $\rho(s_o)=\infty$ for some $s_o>0$, $E_\rho$ reduces to the null space and if in addition $R$ is bounded, $L_\rho$ is $L_\infty$. On the other hand, if $\rho$ is a finite function which doesn’t depend on $z$ and $R$ is bounded, $E_\rho$ contains all the bounded functions.

Duality in Orlicz spaces is intimately linked with convex conjugacy. The convex conjugate $\rho^*$ of $\rho$ is defined by $\rho^*(z,t)=\sup_{s\in\mathbb{R}}\{st-\rho(z,s)\}$. It is also a Young function so that one may consider the Orlicz space $L_{\rho^*}$.
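The conjugacy between Young functions can be checked numerically. The sketch below is only an illustration under stated assumptions: the Young function $\rho(s)=e^{|s|}-|s|-1$ is chosen here as an example with exponential growth, its conjugate $\rho^*(t)=(|t|+1)\log(|t|+1)-|t|$ follows from a direct computation, and the supremum is approximated on a uniform grid (the helper `conjugate_on_grid` is hypothetical).

```python
import math

def rho(s):
    """Assumed Young function with exponential growth."""
    return math.exp(abs(s)) - abs(s) - 1.0

def rho_star_exact(t):
    """Closed-form conjugate of rho, obtained by solving t = exp(s) - 1 for s >= 0."""
    a = abs(t)
    return (a + 1.0) * math.log(a + 1.0) - a

def conjugate_on_grid(f, t, s_max=10.0, n=100001):
    """Approximate sup_s {s*t - f(s)} over a uniform grid on [-s_max, s_max]."""
    best = -math.inf
    for k in range(n):
        s = -s_max + 2.0 * s_max * k / (n - 1)
        best = max(best, s * t - f(s))
    return best

for t in (0.5, 2.0, -3.0):
    print(t, conjugate_on_grid(rho, t), rho_star_exact(t))
```

The grid value never exceeds the exact conjugate, in line with Young’s inequality $st\le\rho(s)+\rho^*(t)$.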

Theorem 2.2 (Representation of $E_\rho'$). Suppose that $\rho$ is a finite Young function. Then, the dual space of $E_\rho$ is isomorphic to $L_{\rho^*}$.

Proof. For a proof of this result, see ([12], Thm 4.8).

A continuous linear form $\ell\in L_\rho'$ is said to be singular if for all $u\in L_\rho$, there exists a decreasing sequence of measurable sets $(A_n)$ such that $R(\cap_n A_n)=0$ and for all $n\ge 1$, $\langle\ell,u\mathbf{1}_{Z\setminus A_n}\rangle=0$. Let us denote by $L_\rho^s$ the subspace of $L_\rho'$ of all singular forms.

Theorem 2.3 (Representation of $L_\rho'$). Let $\rho$ be any Young function. The dual space of $L_\rho$ is isomorphic to the direct sum $L_\rho'=(L_{\rho^*}\cdot R)\oplus L_\rho^s$. This implies that any $\ell\in L_\rho'$ is uniquely decomposed as

\[
\ell=\ell^a+\ell^s \tag{2.4}
\]

with $\ell^a\in L_{\rho^*}\cdot R$ and $\ell^s\in L_\rho^s$.

Proof. When $L_\rho=L_\infty$ this result is the usual representation of $L_\infty'$. When $\rho$ is a finite function, this result is ([13], Theorem 2.2). The general result is proved in [19], with $\rho$ not depending on $z$, but the extension to a $z$-dependent $\rho$ is obvious.

In the decomposition (2.4), $\ell^a$ is called the absolutely continuous part of $\ell$ while $\ell^s$ is its singular part.

Proposition 2.5. Let us assume that $\rho$ is finite. Then, $\ell\in L_\rho'$ is singular if and only if $\langle\ell,u\rangle=0$ for all $u$ in $E_\rho$.

Proof. This result is ([13], Proposition 2.1).

The function $\rho$ is said to satisfy the $\Delta_2$-condition if

\[
\text{there exist } C>0,\ s_o\ge 0 \text{ such that } \forall s\ge s_o,\ \rho(2s)\le C\rho(s) \tag{2.6}
\]

If $s_o=0$, the $\Delta_2$-condition is said to be global. When $R$ is bounded, in order that $E_\rho=L_\rho$, it is enough that $\rho$ satisfies the $\Delta_2$-condition. When $R$ is unbounded, this equality still holds if the $\Delta_2$-condition is global. Consequently, if $\rho$ satisfies the $\Delta_2$-condition we have $L_\rho'=L_{\rho^*}\cdot R$ so that $L_\rho^s$ reduces to the null vector space.
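The $\Delta_2$-condition (2.6) can be explored numerically by looking at the doubling ratio $\rho(2s)/\rho(s)$. The sketch below is an illustration only: the helper `doubling_ratios` and the two sample Young functions are assumptions of this example, and a finite sample of ratios can suggest, but never prove, that the condition holds or fails.

```python
import math

def doubling_ratios(rho, points):
    """Return rho(2s) / rho(s) at each sample point s > 0."""
    return [rho(2.0 * s) / rho(s) for s in points]

def quadratic(s):
    """rho(s) = s^2: the ratio is constantly 4, so (2.6) holds globally."""
    return s * s

def exponential(s):
    """rho(s) = exp(|s|) - |s| - 1: the ratio is unbounded, so (2.6) fails."""
    return math.exp(abs(s)) - abs(s) - 1.0

points = [1.0, 2.0, 5.0, 10.0]
print(doubling_ratios(quadratic, points))
print(doubling_ratios(exponential, points))
```

The second list grows without bound as $s$ increases, which is the behavior behind the "bad constraints" discussed later in this section.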


2.2. The minimization problem $(P_C)$. Before introducing an extended minimization problem, let us state properly the basic problem (1.3).

Relevant Orlicz spaces. Since $\gamma_z^*$ is closed convex for each $z$, it is the convex conjugate of some closed convex function $\gamma_z$. Defining

\[
\lambda(z,s)=\gamma(z,s)-m(z)s,\quad z\in Z,\ s\in\mathbb{R}
\]

where $m$ satisfies (1.2), one sees that for $R$-a.e. $z$, $\lambda_z$ is a nonnegative convex function and it vanishes at 0. Hence,

\[
\lambda_\diamond(z,s)=\max[\lambda(z,s),\lambda(z,-s)]\in[0,\infty],\quad z\in Z,\ s\in\mathbb{R}
\]

is a Young function. We shall use Orlicz spaces associated with $\lambda_\diamond$ and $\lambda_\diamond^*$.

We denote the space of $R$-absolutely continuous signed measures having a density in the Orlicz space $L_{\lambda_\diamond^*}$ by $L_{\lambda_\diamond^*}R$. The effective domain of $I$ is included in $mR+L_{\lambda_\diamond^*}R$.

Constraint. In order to define the constraint, take $X_o$ a vector space and a function $\theta : Z\to X_o$. One wants to give a meaning to the formal constraint $\int_Z\theta\,dQ=x$ with $Q\in L_{\lambda_\diamond^*}R$ and $x\in X_o$. Suppose that $X_o$ is the algebraic dual space of some vector space $Y_o$ and define for all $y\in Y_o$,

\[
T_o^*y(z) := \langle y,\theta(z)\rangle_{Y_o,X_o},\quad z\in Z. \tag{2.7}
\]

Assuming that

\[
T_o^*Y_o\subset L_{\lambda_\diamond}, \tag{2.8}
\]

Hölder’s inequality in Orlicz spaces allows one to define the constraint operator $T_o\ell := \int_Z\theta\,d\ell$ for each $\ell\in L_{\lambda_\diamond^*}R$ by

\[
\Big\langle y,\int_Z\theta\,d\ell\Big\rangle_{Y_o,X_o}=\int_Z\langle y,\theta(z)\rangle_{Y_o,X_o}\,\ell(dz),\quad\forall y\in Y_o. \tag{2.9}
\]

Minimization problem. Consider the minimization problem

\[
\text{minimize } I(Q) \text{ subject to } \int_Z\theta\,d(Q-mR)\in C_o,\quad Q\in mR+L_{\lambda_\diamond^*}R \tag{$P_{C_o}$}
\]

where $C_o$ is a convex subset of $X_o$. One sees with $\gamma_z^*(t)=\lambda_z^*(t-m(z))$ that $I_{\gamma^*}(Q)=I_{\lambda^*}(Q-mR)$. Therefore, the problem $(P_{C_o})$ is equivalent to

\[
\text{minimize } I_{\lambda^*}(\ell) \text{ subject to } \int_Z\theta\,d\ell\in C_o,\quad \ell\in L_{\lambda_\diamond^*}R \tag{2.10}
\]

with $\ell=Q-mR$. If the function $m$ satisfies $m\in L_{\lambda_\diamond^*}$, one sees with (2.8) and Hölder’s inequality in Orlicz spaces that the vector $x_o=\int_Z\theta m\,dR\in X_o$ is well-defined in the weak sense. Therefore, $(P_{C_o})$ is

\[
\text{minimize } I(Q) \text{ subject to } \int_Z\theta\,dQ\in C,\quad Q\in L_{\lambda_\diamond^*}R \tag{$P_C$}
\]

with $C=x_o+C_o$.


2.3. The extended minimization problem $(\overline{P}_C)$. If the Young function $\lambda_\diamond$ doesn’t satisfy the $\Delta_2$-condition (2.6), for instance if it has an exponential growth at infinity as in (6.1) or even worse as in (6.3), the small Orlicz space $E_{\lambda_\diamond}$ may be a proper subset of $L_{\lambda_\diamond}$. Consequently, for some functions $\theta$, the integrability property

\[
T_o^*Y_o\subset E_{\lambda_\diamond} \tag{2.11}
\]

or equivalently

\[
\forall y\in Y_o,\quad \int_Z\lambda(\langle y,\theta\rangle)\,dR<\infty \tag{$A_\theta^\forall$}
\]

may not be satisfied while the weaker property (2.8): $T_o^*Y_o\subset L_{\lambda_\diamond}$, or equivalently

\[
\forall y\in Y_o,\ \exists\alpha>0,\quad \int_Z\lambda(\alpha\langle y,\theta\rangle)\,dR<\infty \tag{$A_\theta^\exists$}
\]

holds. In this situation, analytical complications occur (see Section 4). This is the reason why constraints satisfying $(A_\theta^\forall)$ are called good constraints, while constraints satisfying $(A_\theta^\exists)$ but not $(A_\theta^\forall)$ are called bad constraints.

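If the constraint is bad, it may happen that $(P_C)$ is not attained in $L_{\lambda_\diamond^*}R$. This is the reason why it is worth introducing its extension $(\overline{P}_C)$ which may admit minimizers and is defined by

\[
\text{minimize } \bar{I}(\ell) \text{ subject to } \langle\theta,\ell\rangle\in C,\quad \ell\in L_{\lambda_\diamond}' \tag{$\overline{P}_C$}
\]

where $L_{\lambda_\diamond}'$ is the topological dual space of $L_{\lambda_\diamond}$; $\bar{I}$ and $\langle\theta,\ell\rangle$ are defined below.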

The dual space $L_{\lambda_\diamond}'$ admits the representation $L_{\lambda_\diamond}'\simeq L_{\lambda_\diamond^*}R\oplus L_{\lambda_\diamond}^s$. This means that any $\ell\in L_{\lambda_\diamond}'$ is uniquely decomposed as $\ell=\ell^a+\ell^s$ where $\ell^a\in L_{\lambda_\diamond^*}R$ and $\ell^s\in L_{\lambda_\diamond}^s$ are respectively the absolutely continuous part and the singular part of $\ell$, see Theorem 2.3.

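The extension $\bar{I}$ has the following form

\[
\bar{I}(\ell)=I(\ell^a)+\iota^*_{\mathrm{dom}\,I_\gamma}(\ell^s),\quad \ell\in L_{\lambda_\diamond}' \tag{2.12}
\]

It will be shown that $\bar{I}$ is the greatest convex $\sigma(L_{\lambda_\diamond}',L_{\lambda_\diamond})$-lower semicontinuous extension of $I$ to $L_{\lambda_\diamond}'\supset L_{\lambda_\diamond^*}$. In a similar way to (2.9), the assumption $(A_\theta^\exists)$ allows one to define $T_o\ell=\langle\theta,\ell\rangle$ for all $\ell\in L_{\lambda_\diamond}'$ by

\[
\big\langle y,\langle\theta,\ell\rangle\big\rangle_{Y_o,X_o}=\big\langle\langle y,\theta\rangle,\ell\big\rangle_{L_{\lambda_\diamond},L_{\lambda_\diamond}'},\quad\forall y\in Y_o.
\]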

Important examples of entropies with $\lambda_\diamond$ not satisfying the $\Delta_2$-condition are the usual (Boltzmann) entropy and its variants, see Section 6.1 and (6.1) in particular.

When $\lambda_\diamond$ satisfies the $\Delta_2$-condition (2.6), $(\overline{P}_C)$ is $(P_C)$.

2.4. Assumptions. Let us collect the assumptions on $R$, $\gamma^*$ and $\theta$.

Assumptions (A).

$(A_R)$ It is assumed that the reference measure $R$ is a σ-finite nonnegative measure on a space $Z$ endowed with some $R$-complete σ-field.

$(A_{\gamma^*})$ Assumptions on $\gamma^*$.

(1) $\gamma^*(\cdot,t)$ is $z$-measurable for all $t$ and for $R$-almost every $z\in Z$, $\gamma^*(z,\cdot)$ is a lower semicontinuous strictly convex $[0,+\infty]$-valued function on $\mathbb{R}$ which attains its (unique) minimum at $m(z)$ with $\gamma^*(z,m(z))=0$.

(2) $\int_Z\lambda^*(\alpha m)\,dR+\int_Z\lambda^*(-\alpha m)\,dR<\infty$, for some $\alpha>0$.

$(A_\theta)$ Assumptions on $\theta$.

(1) for any $y\in Y_o$, the function $z\in Z\mapsto\langle y,\theta(z)\rangle\in\mathbb{R}$ is measurable;

(2) for any $y\in Y_o$, $\langle y,\theta(\cdot)\rangle=0$, $R$-a.e. implies that $y=0$;

(∃) $\forall y\in Y_o$, $\exists\alpha>0$, $\int_Z\lambda(\alpha\langle y,\theta\rangle)\,dR<\infty$.

Remarks 2.13. Some technical remarks about the assumptions.

(a) Since $\gamma_z^*$ is a convex function on $\mathbb{R}$, it is continuous on the interior of its domain. Under our assumptions, $\gamma^*$ is (jointly) measurable, and so are $\gamma$ and $m$. Hence, $\lambda$ is also measurable.

(b) As $\gamma_z^*$ is strictly convex, $\gamma_z$ is differentiable.

(c) Assumption $(A^2_{\gamma^*})$ is $m\in L_{\lambda_\diamond^*}$. It allows one to consider Problem $(P_C)$ rather than $(P_{C_o})$. If this assumption is not satisfied, our results still hold for $(P_{C_o})$, but their statement is a little heavier, see Remark 4.10-d below.

(d) Since $X_o$ and $Y_o$ are in separating duality, $(A^2_\theta)$ states that the vector space spanned by the range of $\theta$ “is essentially” $X_o$. This is not an effective restriction.

3. Preliminary results

The aim of this section is to recall for the convenience of the reader some results of [14, 16, 17].

3.1. Convex minimization problems under weak constraint qualifications. The main results of [17] are presented.

Basic diagram. Let $U_o$ be a vector space, $L_o=U_o^*$ its algebraic dual space, $\Phi$ a $(-\infty,+\infty]$-valued convex function on $U_o$ and $\Phi^*$ its convex conjugate for the duality $\langle U_o,L_o\rangle$:

\[
\Phi^*(\ell) := \sup_{u\in U_o}\{\langle u,\ell\rangle-\Phi(u)\},\quad \ell\in L_o
\]

Let $Y_o$ be another vector space, $X_o=Y_o^*$ its algebraic dual space and $T_o : L_o\to X_o$ a linear operator. We consider the convex minimization problem

\[
\text{minimize } \Phi^*(\ell) \text{ subject to } T_o\ell\in C,\quad \ell\in L_o \tag{$P_o$}
\]

where $C$ is a convex subset of $X_o$.

This will be used later with $\Phi=I_\lambda$ on the Orlicz space $U_o=E_{\lambda_\diamond}(Z,R)$ or $U_o=L_{\lambda_\diamond}(Z,R)$.

It is useful to define the constraint operator $T_o$ by means of its adjoint $T_o^* : Y_o\to L_o^*$ for each $\ell\in L_o$, by $\langle T_o^*y,\ell\rangle_{L_o^*,L_o}=\langle y,T_o\ell\rangle_{Y_o,X_o}$, $\forall y\in Y_o$.

Hypotheses. Let us give the list of the main hypotheses.

$(H_\Phi)$ 1- $\Phi : U_o\to[0,+\infty]$ is $\sigma(U_o,L_o)$-lower semicontinuous, convex and $\Phi(0)=0$

2- $\forall u\in U_o$, $\exists\alpha>0$, $\Phi(\alpha u)<\infty$

3- $\forall u\in U_o$, $u\neq 0$, $\exists t\in\mathbb{R}$, $\Phi(tu)>0$

$(H_T)$ 1- $T_o^*(Y_o)\subset U_o$

2- $\ker T_o^*=\{0\}$

$(H_C)$ $C\cap X$ is a convex $\sigma(X,Y)$-closed subset of $X$

The definitions of the vector spaces $X$ and $Y$ which appear in the last assumption are stated below. For the moment, let us only say that if $C$ is convex and $\sigma(X_o,Y_o)$-closed, then $(H_C)$ holds.

Several primal and dual problems. These variants are expressed below in terms of new spaces and functions. Let us first introduce them.

- The norms $|\cdot|_\Phi$ and $|\cdot|_\Lambda$. Let $\Phi_\pm(u)=\max(\Phi(u),\Phi(-u))$. By $(H_{\Phi 1})$ and $(H_{\Phi 2})$, $\{u\in U_o;\ \Phi_\pm(u)\le 1\}$ is a convex absorbing balanced set. Hence its gauge functional, which is defined for all $u\in U_o$ by $|u|_\Phi := \inf\{\alpha>0;\ \Phi_\pm(u/\alpha)\le 1\}$, is a seminorm. Thanks to hypothesis $(H_{\Phi 3})$, it is a norm.

Taking $(H_{T1})$ into account, one can define

\[
\Lambda_o(y) := \Phi(T_o^*y),\quad y\in Y_o. \tag{3.1}
\]

Let $\Lambda_\pm(y)=\max(\Lambda_o(y),\Lambda_o(-y))$. The gauge functional on $Y_o$ of the set $\{y\in Y_o;\ \Lambda_\pm(y)\le 1\}$ is $|y|_\Lambda := \inf\{\alpha>0;\ \Lambda_\pm(y/\alpha)\le 1\}$, $y\in Y_o$. Thanks to $(H_\Phi)$ and $(H_T)$, it is a norm and

\[
|y|_\Lambda=|T_o^*y|_\Phi,\quad y\in Y_o.
\]

- The spaces. Let

$U$ be the $|\cdot|_\Phi$-completion of $U_o$ and let

$L := (U_o,|\cdot|_\Phi)'$ be the topological dual space of $(U_o,|\cdot|_\Phi)$.

Of course, we have $(U,|\cdot|_\Phi)'\cong L\subset L_o$ where any $\ell$ in $U'$ is identified with its restriction to $U_o$. Similarly, we introduce

$Y$ the $|\cdot|_\Lambda$-completion of $Y_o$ and

$X := (Y_o,|\cdot|_\Lambda)'$ the topological dual space of $(Y_o,|\cdot|_\Lambda)$.

We have $(Y,|\cdot|_\Lambda)'\cong X\subset X_o$ where any $x$ in $Y'$ is identified with its restriction to $Y_o$. We also have to consider the algebraic dual spaces $L^*$ and $X^*$ of $L$ and $X$.

- The operators $T$ and $T^*$. Let us denote by $T$ the restriction of $T_o$ to $L\subset L_o$. One can show that under $(H_{\Phi\&T})$, $T_oL\subset X$. Hence $T : L\to X$. Let us define its adjoint $T^* : X^*\to L^*$ for all $\omega\in X^*$ by: $\langle\ell,T^*\omega\rangle_{L,L^*}=\langle T\ell,\omega\rangle_{X,X^*}$, $\forall\ell\in L$. We have the inclusions $Y_o\subset Y\subset X^*$. The adjoint operator $T_o^*$ is the restriction of $T^*$ to $Y_o$.

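- The functionals. They are:

\[
\begin{aligned}
\bar\Phi(\zeta) &:= \sup_{\ell\in L}\{\langle\zeta,\ell\rangle-\Phi^*(\ell)\}, && \zeta\in L^*\\
\Lambda(y) &:= \bar\Phi(T^*y), && y\in Y\\
\bar\Lambda(\omega) &:= \bar\Phi(T^*\omega), && \omega\in X^*\\
\Lambda_o^*(x) &:= \sup_{y\in Y_o}\{\langle y,x\rangle-\Lambda_o(y)\}, && x\in X_o\\
\Lambda^*(x) &:= \sup_{y\in Y}\{\langle y,x\rangle-\Lambda(y)\}, && x\in X
\end{aligned}
\]

- The optimization problems. They are:

\[
\text{minimize } \Phi^*(\ell) \text{ subject to } T_o\ell\in C,\quad \ell\in L_o \tag{$P_o$}
\]

\[
\text{minimize } \Phi^*(\ell) \text{ subject to } T\ell\in C,\quad \ell\in L \tag{$P$}
\]

\[
\text{maximize } \inf_{x\in C\cap X}\langle y,x\rangle-\Lambda(y),\quad y\in Y \tag{$D$}
\]

\[
\text{maximize } \inf_{x\in C\cap X}\langle x,\omega\rangle-\bar\Lambda(\omega),\quad \omega\in X^* \tag{$\overline{D}$}
\]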

Statement of the results. It is assumed that $(H_\Phi)$, $(H_T)$ and $(H_C)$ hold.

Theorem 3.2 (Primal attainment and dual equality).

(a) The problems $(P_o)$ and $(P)$ are equivalent: they have the same solutions and $\inf(P_o)=\inf(P)\in[0,\infty]$.

(b) We have the dual equalities

\[
\inf(P_o)=\inf(P)=\sup(D)=\sup(\overline{D})=\inf_{x\in C}\Lambda_o^*(x)=\inf_{x\in C\cap X}\Lambda^*(x)\in[0,\infty]
\]

(c) If in addition $\{\ell\in L_o;\ T_o\ell\in C\}\cap\mathrm{dom}\,\Phi^*\neq\emptyset$, then $(P_o)$ is attained in $L$. Moreover, any minimizing sequence for $(P_o)$ has $\sigma(L,U)$-cluster points and every such cluster point solves $(P_o)$.

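Theorem 3.3 (Dual attainment and representation. Interior convex constraint). Assume that $C\cap\mathrm{icor}\,(T_o\,\mathrm{dom}\,\Phi^*)\neq\emptyset$.

Then, the primal problem $(P_o)$ is attained in $L$ and the extended dual problem $(\overline{D})$ is attained in $X^*$. Any solution $\hat\ell\in L$ of $(P_o)$ is characterized by the existence of some $\bar\omega\in X^*$ such that

\[
\begin{cases}
\text{(a)}\ T\hat\ell\in C\\
\text{(b)}\ \langle T^*\bar\omega,\hat\ell\rangle\le\langle T^*\bar\omega,\ell\rangle \text{ for all } \ell\in\{\ell\in L;\ T\ell\in C\}\cap\mathrm{dom}\,\Phi^*\\
\text{(c)}\ \hat\ell\in\partial_L\bar\Phi(T^*\bar\omega)
\end{cases} \tag{3.4}
\]

Moreover, $\hat\ell\in L$ and $\bar\omega\in X^*$ satisfy (3.4) if and only if $\hat\ell$ solves $(P_o)$ and $\bar\omega$ solves $(\overline{D})$.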

The assumption $C\cap\mathrm{icor}\,(T_o\,\mathrm{dom}\,\Phi^*)\neq\emptyset$ is equivalent to $C\cap\mathrm{icordom}\,\Lambda_o^*\neq\emptyset$ and the representation formula (3.4-c) is equivalent to Young’s identity

\[
\Phi^*(\hat\ell)+\bar\Phi(T^*\bar\omega)=\langle\bar\omega,T\hat\ell\rangle=\Lambda^*(\hat{x})+\bar\Lambda(\bar\omega). \tag{3.5}
\]

Formula (3.4-c) can be made a little more precise by means of the following regularity result.

Theorem 3.6. Any solution $\bar\omega$ of $(\overline{D})$ shares the following properties

(a) $\bar\omega$ is in the $\sigma(X^*,X)$-closure of $\mathrm{dom}\,\Lambda$;

(b) $T^*\bar\omega$ is in the $\sigma(L^*,L)$-closure of $T^*(\mathrm{dom}\,\Lambda)$.

If in addition the level sets of $\Phi$ are $|\cdot|_\Phi$-bounded, then

(a’) $\bar\omega$ is in $Y''$. More precisely, it is in the $\sigma(Y'',X)$-closure of $\mathrm{dom}\,\Lambda$;

(b’) $T^*\bar\omega$ is in $U''$. More precisely, it is in the $\sigma(U'',L)$-closure of $T^*(\mathrm{dom}\,\Lambda)$

where $Y''$ and $U''$ are the topological bidual spaces of $Y$ and $U$. This occurs if $\Phi$, or equivalently $\Phi^*$, is an even function.

3.2. Convex conjugates in a Riesz space. The following results are taken from [14, 16]. For the basic definitions and properties of Riesz spaces, see [7, Chapter 2].

Let $U$ be a Riesz vector space for the order relation $\le$. Since $U$ is a Riesz space, any $u\in U$ admits a nonnegative part: $u_+ := u\vee 0$, and a nonpositive part: $u_- := (-u)\vee 0$. Of course, $u=u_+-u_-$ and as usual, we state: $|u|=u_++u_-$.

Remark 3.7. Recall that there is a natural order on the algebraic dual space $E^*$ of a Riesz vector space $E$ which is defined by: $e^*\le f^*\Leftrightarrow\langle e^*,e\rangle\le\langle f^*,e\rangle$ for any $e\in E$ with $e\ge 0$. A linear form $e^*\in E^*$ is said to be relatively bounded if for any $f\in E$, $f\ge 0$, we have $\sup_{e:|e|\le f}|\langle e^*,e\rangle|<+\infty$. Although $E^*$ may not be a Riesz space in general, the vector space $E^b$ of all the relatively bounded linear forms on $E$ is always a Riesz space. In particular, the elements of $E^b$ admit a decomposition in positive and negative parts $e^*=e^*_+-e^*_-$.

Let $\Phi$ be a $[0,\infty]$-valued function on $U$ which satisfies the following conditions:

\[
\forall u\in U,\quad \Phi(u)=\Phi(u_+-u_-)=\Phi(u_+)+\Phi(-u_-) \tag{3.8}
\]

\[
\forall u,v\in U,\quad \begin{cases} 0\le u\le v\ \Longrightarrow\ \Phi(u)\le\Phi(v)\\ u\le v\le 0\ \Longrightarrow\ \Phi(u)\ge\Phi(v)\end{cases} \tag{3.9}
\]

Clearly (3.8) implies $\Phi(0)=0$, (3.8) and (3.9) imply that for any $u\in U$, $\Phi(u)=\Phi(u_+)+\Phi(-u_-)\ge\Phi(0)+\Phi(0)=0$. Therefore, $\Phi^*$ is $[0,\infty]$-valued and $\Phi^*(0)=0$.

For all $u\in U$, $\Phi_+(u)=\Phi(|u|)$, $\Phi_-(u)=\Phi(-|u|)$. The convex conjugates of $\Phi$, $\Phi_+$ and $\Phi_-$ with respect to $\langle U,U^*\rangle$ are denoted $\Phi^*$, $\Phi^*_+$ and $\Phi^*_-$. Let $L$ be the vector space spanned by $\mathrm{dom}\,\Phi^*$. The convex conjugates of $\Phi^*$, $\Phi^*_+$ and $\Phi^*_-$ with respect to $\langle L,L^*\rangle$ are denoted $\bar\Phi$, $\bar\Phi_+$ and $\bar\Phi_-$. The spaces of relatively bounded linear forms on $U$ and $L$ are denoted by $U^b$ and $L^b$, whenever $L$ is a Riesz space.

One writes $a_\pm\in A_\pm$ for [$a_+\in A_+$ and $a_-\in A_-$].

Proposition 3.10. Assume (3.8) and (3.9) and suppose that L is a Riesz space.

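(a) For all $\ell\in U^*$,

\[
\Phi^*(\ell)=\begin{cases}\Phi^*_+(\ell_+)+\Phi^*_-(\ell_-) & \text{if } \ell\in U^b\\ +\infty & \text{otherwise}\end{cases}
\]

(b) Denoting by $L_+$ and $L_-$ the vector subspaces of $L$ spanned by $\mathrm{dom}\,\Phi^*_+$ and $\mathrm{dom}\,\Phi^*_-$, we have

\[
\bar\Phi(\zeta)=\begin{cases}\bar\Phi_+(\zeta_{+|L_+})+\bar\Phi_-(\zeta_{-|L_-}) & \text{if } \zeta\in L^b\\ +\infty & \text{otherwise}\end{cases}
\]

which means that $\bar\Phi_\pm(\zeta_\pm)=\bar\Phi_\pm(\zeta'_\pm)$ if $\zeta_\pm$ and $\zeta'_\pm$ match on $L_\pm$.

(c) Let $\ell\in L$, $\zeta\in L^*$ be such that $\ell\in\partial_L\bar\Phi(\zeta)$. Then, $\ell_\pm\in\partial_{L_\pm}\bar\Phi_\pm(\zeta_{\pm|L_\pm})\subset L_\pm$.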

Proof. (a) and (b) are proved at [14, Proposition 4.4] under the additional assumption that for all $u\in U$ there exists $\lambda>0$ such that $\Phi(\lambda u)<+\infty$. But it can be removed. Indeed, if for instance $\Phi_-$ is null, $\Phi^*_-$ is the convex indicator of $\{0\}$ whose domain is in $U^b$. The statement about $\bar\Phi$ is an iteration of this argument.

The last statement of (b) about $\zeta_{\pm|L_\pm}$ directly follows from $\mathrm{dom}\,\Phi^*_\pm\subset L_\pm$.

For (c), see the proof of [16, Proposition 4.5].

4. Solving $(\overline{P}_C)$

The general assumptions (A) are imposed and we study $(\overline{P}_C)$.

4.1. Several function spaces and cones. To state the extended dual problem $(\overline{D}_C)$ below, notation is needed. If $\lambda$ is not an even function, one has to consider

\[
\begin{cases}\lambda_+(z,s)=\lambda(z,|s|)\\ \lambda_-(z,s)=\lambda(z,-|s|)\end{cases} \tag{4.1}
\]

which are Young functions and the corresponding Orlicz spaces.
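To see what the splitting (4.1) does in a concrete case, the sketch below takes the $z$-independent function $\lambda(s)=e^s-s-1$. This choice is an assumption of the illustration (it corresponds to $\gamma(s)=e^s-1$ and $m=1$, the relative entropy case); the function names are hypothetical. For this $\lambda$, the even function $\lambda_+$ grows exponentially while $\lambda_-$ grows only linearly, and $\lambda_\diamond=\max(\lambda_+,\lambda_-)$ coincides with $\lambda_+$.

```python
import math

def lam(s):
    """Assumed example: lambda(s) = exp(s) - s - 1 (not an even function)."""
    return math.exp(s) - s - 1.0

def lam_plus(s):
    """lambda_+(s) = lambda(|s|), as in (4.1)."""
    return lam(abs(s))

def lam_minus(s):
    """lambda_-(s) = lambda(-|s|), as in (4.1)."""
    return lam(-abs(s))

def lam_diamond(s):
    """lambda_diamond(s) = max(lambda(s), lambda(-s))."""
    return max(lam(s), lam(-s))

for s in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(s, lam_plus(s), lam_minus(s), lam_diamond(s))
```

Both $\lambda_+$ and $\lambda_-$ are even, vanish at 0 and are convex, which is what makes them Young functions in this example.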

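Definitions 4.2. For any relatively bounded linear form $\zeta$ on $L_{\lambda_\diamond}'$, i.e. $\zeta\in L_{\lambda_\diamond}'^{\,b}$, one writes:

• $\zeta\in K_\lambda''$ to specify that $\zeta_{\pm|L_{\lambda_\pm}'\cap L_{\lambda_\diamond}'}\in L_{\lambda_\pm}''$

• $\zeta\in K_{\lambda^*}'$ to specify that $\zeta_{\pm|L_{\lambda^*_\pm}R\cap L_{\lambda_\diamond}'}\in L_{\lambda^*_\pm}'$

• $\zeta\in K_\lambda$ to specify that $\zeta_{\pm|L_{\lambda^*_\pm}R\cap L_{\lambda_\diamond}'}\in L_{\lambda_\pm}$

• $\zeta\in K_{\lambda^*}^s$ to specify that $\zeta_{\pm|L_{\lambda^*_\pm}R\cap L_{\lambda_\diamond}'}\in L_{\lambda^*_\pm}^s$

• $\zeta\in K_\lambda^{s\prime}$ to specify that $\zeta_{\pm|L_{\lambda_\pm}^s\cap L_{\lambda_\diamond}'}\in L_{\lambda_\pm}^{s\prime}$

where $\lambda_\pm$ are defined at (4.1) and $\zeta_{\pm|L_\pm\cap L_{\lambda_\diamond}'}\in L_\pm'$ means that the restriction of $\zeta_\pm$ to $L_\pm\cap L_{\lambda_\diamond}'$ is continuous with respect to the relative topology generated by the strong topology of $L_\pm$ on $L_\pm\cap L_{\lambda_\diamond}'$.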

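(1) The sets $K_\lambda''$, $K_{\lambda^*}'$, $K_\lambda$, $K_{\lambda^*}^s$ and $K_\lambda^{s\prime}$ are defined to be the corresponding subsets of $L_{\lambda_\diamond}'^{\,b}$. They are not vector spaces in general but convex cones with vertex 0.

(2) The $\sigma(K_\lambda'',K_\lambda')$-closure $\bar{A}$ of a set $A$ is defined as follows: $\zeta\in L_{\lambda_\diamond}'^{\,b}$ is in $\bar{A}$ if $\zeta_{\pm|L_{\lambda_\pm}'\cap L_{\lambda_\diamond}'}$ is in the $\sigma(L_{\lambda_\pm}''\cap L_{\lambda_\diamond}'^{\,b},L_{\lambda_\pm}'\cap L_{\lambda_\diamond}')$-closure of $A_\pm=\{\zeta_\pm;\ \zeta\in A\}$. Clearly, $\bar{A}_\pm=\{\zeta_\pm;\ \zeta\in\bar{A}\}$.

One defines similarly the $\sigma(K_{\lambda^*}',K_{\lambda^*})$, $\sigma(K_\lambda,K_\lambda')$, $\sigma(K_{\lambda^*}^s,K_{\lambda^*})$ and $\sigma(K_\lambda^{s\prime},K_\lambda^s)$-closures.

(3) Let $A$ be a subset of $L_{\lambda_\diamond}$. Its strong closure s-cl $A$ in $K_\lambda$ is the set of all measurable functions $u$ such that $u_\pm$ is in the $\|\cdot\|_{\lambda_\pm}$-closure of $A_\pm=\{v_\pm;\ v\in A\}$.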

Let $\rho$ be a Young function. By Theorem 2.3, we have $L_\rho''=[L_\rho\cdot R\oplus L_{\rho^*}^s]\oplus L_\rho^{s\prime}$. For any $\zeta\in L_\rho''=(L_{\rho^*}R\oplus L_\rho^s)'$, let us denote the restrictions $\zeta_1=\zeta_{|L_{\rho^*}R}$ and $\zeta_2=\zeta_{|L_\rho^s}$. Since $(L_{\rho^*}R)'\simeq L_\rho\oplus L_{\rho^*}^s$, one sees that any $\zeta\in L_\rho''$ is uniquely decomposed into

\[
\zeta=\zeta_1^a+\zeta_1^s+\zeta_2 \tag{4.3}
\]

with $\zeta_1=\zeta_1^a+\zeta_1^s\in L_{\rho^*}'$, $\zeta_1^a\in L_\rho$, $\zeta_1^s\in L_{\rho^*}^s$ and $\zeta_2\in L_\rho^{s\prime}$. With our definitions, $K_\lambda''=[K_\lambda\oplus K_{\lambda^*}^s]\oplus K_\lambda^{s\prime}$ and the decomposition (4.3) holds for any $\zeta\in K_\lambda''$ with

\[
\zeta_1=\zeta_1^a+\zeta_1^s\in K_\lambda\oplus K_{\lambda^*}^s=K_{\lambda^*}',\qquad \zeta_2\in K_\lambda^{s\prime}.
\]

4.2. The ingredients of the saddle-point method. One applies the abstract results of Section 3.1 with

\[
\Phi(u)=I_\lambda(u) := \int_Z\lambda(u)\,dR,\quad u\in U_o := L_{\lambda_\diamond} \tag{4.4}
\]

This gives $U=L_{\lambda_\diamond}$ with the Orlicz norm $|u|_\Phi=\|u\|_{\lambda_\diamond}$ and $L=L_{\lambda_\diamond}'=L_{\lambda_\diamond^*}R\oplus L_{\lambda_\diamond}^s$, by Theorem 2.3. The space $Y$ is the completion of $Y_o$ endowed with the norm $|y|_\Lambda=\|\langle y,\theta\rangle\|_{\lambda_\diamond}$. One denotes $Y=Y_L$. It is isomorphic to the closure of the subspace $\{\langle y,\theta\rangle;\ y\in Y_o\}$ in $L_{\lambda_\diamond}$, see assumption $(A_\theta^\exists)$. With some abuse of notation, one still denotes $T^*y=\langle y,\theta\rangle$ for $y\in Y_L$. Remark that this can be interpreted as a dual bracket between $X_o^*$ and $X_o$ since $T^*y=\langle\tilde{y},\theta\rangle$ $R$-a.e. for some $\tilde{y}\in X_o^*$. The topological dual space $X_L=Y_L'$ is identified with $L_{\lambda_\diamond}'/\ker T$ and its norm is given by $|x|_\Lambda^*=\inf\{\|\ell\|_{\lambda_\diamond}^*;\ \ell\in L_{\lambda_\diamond}' : T(\ell)=x\}$.

This last identity is a dual equality as in Theorem 3.2-b with $\Phi=\iota_B$ where $B$ is the unit ball of $L_{\lambda_\diamond}$ and $C=\{x\}$.

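The assumption $(H_C)$ that $C$ is $\sigma(X_L,Y_L)$-closed convex is equivalent to

\[
T_o^{-1}C\cap L_{\lambda_\diamond}'=\bigcap_{y\in Y}\big\{\ell\in L_{\lambda_\diamond}';\ \langle\langle y,\theta\rangle,\ell\rangle\ge a_y\big\} \tag{4.5}
\]

for some subset $Y\subset Y_L$ and some function $y\in Y\mapsto a_y\in\mathbb{R}$. For comparison, note that if $C$ is only supposed to be convex, $\bigcap_{(y,a)\in A}\big\{\ell\in L_{\lambda_\diamond}';\ \langle\langle y,\theta\rangle,\ell\rangle>a\big\}$ with $A\subset Y\times\mathbb{R}$ is the general shape of $T^{-1}C$.

4.3. The main result. Let us define

\[
\Gamma^*(x)=\sup_{y\in Y_o}\{\langle y,x\rangle-I_\gamma(\langle y,\theta\rangle)\},\quad x\in X_o
\]

which is the convex conjugate of $\Gamma(y)=I_\gamma(\langle y,\theta\rangle)$, $y\in Y_o$. The dual problem $(D)$ associated with $(P_C)$ and $(\overline{P}_C)$ is

\[
\text{maximize } \inf_{x\in C\cap X}\langle y,x\rangle-I_\gamma(\langle y,\theta\rangle),\quad y\in Y \tag{$D_C$}
\]

The extended dual problem is

\[
\text{maximize } \inf_{x\in C\cap X_L}\langle\omega,x\rangle-\Big(I_\lambda\big([T^*\omega]_1^a\big)+\iota^*_{\mathrm{dom}\,I_{\lambda^*}}\big([T^*\omega]_1^s\big)+\iota_D\big([T^*\omega]_2\big)\Big),\quad \omega\in Y \tag{$\overline{D}_C$}
\]

where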

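• $T^* : X_L^*\to L_{\lambda_\diamond}'^{\,*}$ is the extension of $T_o^*$ which is defined at Section 3.1,

• $D$ is the $\sigma(K_\lambda^{s\prime},K_\lambda^s)$-closure of $\mathrm{dom}\,I_\lambda$ and

• $Y$ is the cone of all $\omega\in X_L^*$ such that $T^*\omega\in K_\lambda''$.

Clearly, $\iota^*_{\mathrm{dom}\,I_{\lambda^*}}(\zeta_1^s)=\iota^*_{\mathrm{dom}\,I_{\lambda^*_+}}(\zeta_{1+}^s)+\iota^*_{\mathrm{dom}\,I_{\lambda^*_-}}(\zeta_{1-}^s)$ and $\iota_D(\zeta_2)=\iota_{D_+}(\zeta_{2+})+\iota_{D_-}(\zeta_{2-})$ where $D_\pm$ is the $\sigma(L_{\lambda_\pm}^{s\prime}\cap L_{\lambda_\diamond}'^{\,b},L_{\lambda_\pm}^s\cap L_{\lambda_\diamond}')$-closure of $\mathrm{dom}\,I_{\lambda_\pm}$.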

As $R$ is assumed to be $\sigma$-finite, there exists a measurable partition $(\mathcal{Z}_k)_{k \ge 1}$ of $\mathcal{Z}$: $\bigsqcup_k \mathcal{Z}_k = \mathcal{Z}$, such that $R(\mathcal{Z}_k) < \infty$ for each $k \ge 1$.

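Theorem 4.6. Suppose that

(1) the assumptions (A) are satisfied;

(2) for each $k \ge 1$, $L_{\lambda_\diamond}(\mathcal{Z}_k, R_{|\mathcal{Z}_k})$ is dense in $L_{\lambda_+}(\mathcal{Z}_k, R_{|\mathcal{Z}_k})$ and $L_{\lambda_-}(\mathcal{Z}_k, R_{|\mathcal{Z}_k})$ with respect to the topologies associated with $\|\cdot\|_{\lambda_+}$ and $\|\cdot\|_{\lambda_-}$;

(3) $C$ satisfies (4.5) with $\langle y, \theta\rangle \in L_{\lambda_\diamond}$ for all $y \in Y$.

Then:

(a) The dual equality for $(\overline{P}_C)$ is

$$\inf(\overline{P}_C) = \inf_{x \in C} \Gamma^*(x) = \sup(D_C) = \sup(\overline{D}_C) \in [0, \infty].$$

(b) If $C \cap \mathrm{dom}\, \Gamma^* \ne \emptyset$ or equivalently $C \cap T_o\, \mathrm{dom}\, \bar{I} \ne \emptyset$, then $(\overline{P}_C)$ admits solutions in $L'_{\lambda_\diamond}$, any minimizing sequence admits $\sigma(L'_{\lambda_\diamond}, L_{\lambda_\diamond})$-cluster points and every such point is a solution to $(\overline{P}_C)$.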

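Suppose that in addition we have

$$C \cap \operatorname{icordom} \Gamma^* \ne \emptyset \tag{4.7}$$

or equivalently $C \cap \operatorname{icor}(T_o\, \mathrm{dom}\, \bar{I}) \ne \emptyset$. Then:

(c) Let us denote $\hat{x} \triangleq T\hat{\ell}$. There exists $\bar{\omega} \in \overline{\mathcal{Y}}$ such that

$$\begin{cases}
\text{(a)}\ \ \hat{x} \in C \cap \mathrm{dom}\, \Gamma^* \\
\text{(b)}\ \ \langle \bar{\omega}, \hat{x}\rangle_{\mathcal{X}_L^*, \mathcal{X}_L} \le \langle \bar{\omega}, x\rangle_{\mathcal{X}_L^*, \mathcal{X}_L}, \quad \forall x \in C \cap \mathrm{dom}\, \Gamma^* \\
\text{(c)}\ \ \hat{\ell} \in \gamma'_z\big([T^*\bar{\omega}]^a_1\big)\, R + D^\perp\big([T^*\bar{\omega}]_2\big)
\end{cases} \tag{4.8}$$

where

$$D^\perp(\eta) = \{k \in L^s_{\lambda_\diamond};\ \forall h \in L_{\lambda_\diamond},\ \eta + h \in D \Rightarrow \langle h, k\rangle \le 0\}$$

is the outer normal cone of $D$ at $\eta$.

$T^*\bar{\omega}$ is in the $\sigma(K''_\lambda, K'_\lambda)$-closure of $T^*(\mathrm{dom}\, \Lambda)$ and there exists some $\tilde{\omega} \in \mathcal{X}_o^*$ such that

$$[T^*\bar{\omega}]^a_1 = \langle \tilde{\omega}, \theta(\cdot)\rangle_{\mathcal{X}_o^*, \mathcal{X}_o}$$

is a measurable function in the strong closure of $T^*(\mathrm{dom}\, \Lambda)$ in $K_\lambda$.

Furthermore, $\hat{\ell} \in L'_{\lambda_\diamond}$ and $\bar{\omega} \in \overline{\mathcal{Y}}$ satisfy (4.8) if and only if $\hat{\ell}$ solves $(\overline{P}_C)$ and $\bar{\omega}$ solves $(\overline{D}_C)$.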

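(d) Of course, (4.8-c) implies $\hat{x} = \int_{\mathcal{Z}} \theta\, \gamma'(\langle \tilde{\omega}, \theta\rangle)\, dR + \langle \theta, \hat{\ell}^s\rangle$. Moreover,

1. $\hat{x}$ minimizes $\Gamma^*$ on $C$,

2. $\bar{I}(\hat{\ell}) = \Gamma^*(\hat{x}) = \int_{\mathcal{Z}} \gamma^* \circ \gamma'(\langle \tilde{\omega}, \theta\rangle)\, dR + \sup\{\langle u, \hat{\ell}^s\rangle;\ u \in \mathrm{dom}\, I_\gamma\} < \infty$ and

3. $\bar{I}(\hat{\ell}) + \int_{\mathcal{Z}} \gamma(\langle \tilde{\omega}, \theta\rangle)\, dR = \int_{\mathcal{Z}} \langle \tilde{\omega}, \theta\rangle\, d\hat{\ell}^a + \langle [T^*\bar{\omega}]_2, \hat{\ell}^s\rangle_{K^{s\prime}_\lambda, K^s_\lambda}$.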

Proposition 4.9. For the assumption (2) of Theorem 4.6 to be satisfied, it is enough that one of the following conditions holds:

(i) $\lambda$ is even or more generally $0 < \liminf_{t \to \infty} \frac{\lambda_+}{\lambda_-}(t) \le \limsup_{t \to \infty} \frac{\lambda_+}{\lambda_-}(t) < +\infty$;

(ii) $\lim_{t \to \infty} \frac{\lambda_+}{\lambda_-}(t) = +\infty$ and $\lambda_-$ satisfies the $\Delta_2$-condition (2.6).

Proof. It is enough to work with a bounded measure $R$.

Condition (i) is equivalent to $L_{\lambda_+} = L_{\lambda_-} = L_{\lambda_\diamond}$ and the result follows immediately.

Condition (ii) says that $\lambda_+ = \lambda_\diamond$ and $L_{\lambda_-} = E_{\lambda_-}$. As $\gamma^*$ is assumed to be strictly convex, zero is in the interior of $\mathrm{dom}\, \lambda$ and $L_{\lambda_\diamond}$ contains the space $B$ of all bounded measurable functions. But $B$ is dense in $E_{\lambda_-}$ and the result follows.
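As an illustration of condition (ii), here is a worked example under assumptions that are not taken from this section: we assume the usual conventions $\lambda_+(t) = \lambda(t)$ and $\lambda_-(t) = \lambda(-t)$ for $t \ge 0$, we read the $\Delta_2$-condition (2.6) as $\rho(2t) \le K \rho(t)$ for all large $t$, and we pick the sample function $\lambda(s) = e^s - s - 1$ (which corresponds to the relative entropy integrand $\gamma^*(t) = t\log t - t + 1$ with $m = 1$).

```latex
\begin{align*}
\lambda_+(t) &= e^{t} - t - 1, \qquad \lambda_-(t) = e^{-t} + t - 1, \qquad t \ge 0, \\
\frac{\lambda_+}{\lambda_-}(t) &\ge \frac{e^{t} - t - 1}{t} \xrightarrow[t \to \infty]{} +\infty
  \qquad \text{since } \lambda_-(t) \le t, \\
\lambda_-(2t) &= e^{-2t} + 2t - 1 \le 2t \le 4\Bigl(t - 1\Bigr) \le 4\,\lambda_-(t), \qquad t \ge 2 .
\end{align*}
```

Under these assumptions the ratio $\lambda_+/\lambda_-$ tends to $+\infty$ and $\lambda_-$ satisfies a $\Delta_2$-type bound with $K = 4$, so this sample $\lambda$ would fall under case (ii) rather than case (i).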

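Remarks 4.10. General remarks about Theorem 4.6.

(a) The assumption (3) is equivalent to: $C$ is $\sigma(\mathcal{X}_L, \mathcal{Y}_L)$-closed convex.

(b) The dual equality with $C = \{x\}$ gives for all $x \in \mathcal{X}_o$

$$\Gamma^*(x) = \inf\{\bar{I}(\ell);\ \ell \in L'_{\lambda_\diamond},\ \langle \theta, \ell\rangle = x\}.$$

(c) Note that $\bar{\omega}$ does not necessarily belong to $\mathcal{Y}_o$. Therefore, the Young equality $\langle \bar{\omega}, \hat{x}\rangle = \Gamma^*(\hat{x}) + \Gamma(\bar{\omega})$ is meaningless. Nevertheless, there exists a natural extension $\overline{\Gamma}$ of $\Gamma$ such that $\langle \hat{x}, \bar{\omega}\rangle = \Gamma^*(\hat{x}) + \overline{\Gamma}(\bar{\omega})$ holds, see (3.5). This gives the statement (d-3).

(d) Removing the assumption $(A^2_{\gamma^*})$: $m \in L_{\lambda_\diamond^*}$, one can still consider the minimization problem

$$\text{minimize } \bar{I}(\ell) \ \text{ subject to } \ \langle \theta, \ell - mR\rangle \in C_o, \qquad \ell \in mR + L'_{\lambda_\diamond} \tag{$\overline{P}_{C_o}$}$$

instead of $(\overline{P}_C)$. The transcription of Theorem 4.6 is as follows. Denote

$$\Lambda^*(x) = \sup_{y \in \mathcal{Y}_o} \Big\{ \langle y, x\rangle - \int_{\mathcal{Z}} \lambda(\langle y, \theta\rangle)\, dR \Big\}, \qquad x \in \mathcal{X}_o$$

and replace respectively $(\overline{P}_C)$, $C$, $\Gamma^*$, $\hat{x}$ and $\gamma$ by $(\overline{P}_{C_o})$, $C_o$, $\Lambda^*$, $\tilde{x}$ and $\lambda$ where $\tilde{x} = \langle \theta, \hat{\ell} - mR\rangle$ is well-defined.

The statement (b) must be replaced by the following one: If $C_o \cap \mathrm{dom}\, \Lambda^* \ne \emptyset$, then $(\overline{P}_{C_o})$ admits solutions in $mR + L'_{\lambda_\diamond}$, any minimizing sequence $(\ell_n)_{n \ge 1}$ is such that $(\ell_n - mR)_{n \ge 1}$ admits cluster points $\hat{\ell} - mR$ in $L'_{\lambda_\diamond}$ with respect to the topology $\sigma(L'_{\lambda_\diamond}, L_{\lambda_\diamond})$ and $\hat{\ell}$ is a solution of $(\overline{P}_{C_o})$.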
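The dual equality of Theorem 4.6-a, in the form of Remark 4.10-b, can be checked numerically in a finite toy model. The sketch below is only an illustration under assumptions that do not come from the text: $\mathcal{Z}$ has three points, $R$ has weights `r`, the constraint function `theta` is scalar-valued, $C = \{x\}$, and $\gamma(s) = e^s - 1$, so that $\gamma^*(t) = t\log t - t + 1$. All names are hypothetical. On a finite set there are no singular parts, so the candidate minimizer is simply the measure with density $\gamma'(\langle y, \theta\rangle) = e^{y\theta}$ with respect to $R$.

```python
import math

# Hypothetical finite model: Z = {0, 1, 2}, reference weights r, scalar constraint theta.
r = [0.5, 0.3, 0.2]
theta = [0.0, 1.0, 2.0]
x = 1.0  # prescribed moment, i.e. C = {x}

def gamma(s):
    # gamma(s) = exp(s) - 1; its convex conjugate is gamma_star below
    return math.exp(s) - 1.0

def gamma_star(t):
    # gamma*(t) = t log t - t + 1 for t > 0, extended by gamma*(0) = 1
    return 1.0 if t == 0.0 else t * math.log(t) - t + 1.0

def dual_objective(y):
    # <y, x> - I_gamma(<y, theta>)
    return y * x - sum(rz * gamma(y * tz) for rz, tz in zip(r, theta))

def solve_dual(y=0.0, tol=1e-13, max_iter=100):
    # Newton's method on the derivative of the smooth concave dual objective
    for _ in range(max_iter):
        grad = x - sum(rz * tz * math.exp(y * tz) for rz, tz in zip(r, theta))
        hess = -sum(rz * tz * tz * math.exp(y * tz) for rz, tz in zip(r, theta))
        step = grad / hess
        y -= step
        if abs(step) < tol:
            break
    return y

y_hat = solve_dual()
# Candidate primal solution: density dQ/dR = gamma'(<y_hat, theta>) = exp(y_hat * theta)
density = [math.exp(y_hat * tz) for tz in theta]
moment = sum(rz * dz * tz for rz, dz, tz in zip(r, density, theta))
primal_value = sum(rz * gamma_star(dz) for rz, dz in zip(r, density))
dual_value = dual_objective(y_hat)
print(moment, primal_value, dual_value)
```

Since the primal value of any feasible measure is at least the dual value of any `y` (Young's inequality), the agreement of `primal_value` and `dual_value` at a feasible candidate shows that both are optimal in this toy model.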

Proof of Theorem 4.6. It is an application of Theorems 3.2 and 3.3. We use the notation and framework of Section 3.1.

With (4.4) and Theorem 3.2-a, $\mathrm{dom}\, \Phi^* \subset \mathcal{L} = L'_{\lambda_\diamond}$. For all $\ell \in L'_{\lambda_\diamond}$,

$$\begin{aligned}
\Phi^*(\ell) &\overset{(a)}{=} \Phi^*_+(\ell_+) + \Phi^*_-(\ell_-) \\
&\overset{(b)}{=} \inf\{I_{\lambda^*_+}(k^a) + \iota^*_{\mathrm{dom}\, \lambda_+}(k^s);\ k \in L'_{\lambda_+} : k \ge 0,\ k_{|L_{\lambda_\diamond}} = \ell_+\} \\
&\quad + \inf\{I_{\lambda^*_-}(k^a) + \iota^*_{\mathrm{dom}\, \lambda_-}(k^s);\ k \in L'_{\lambda_-} : k \ge 0,\ k_{|L_{\lambda_\diamond}} = \ell_-\}.
\end{aligned}$$

Equality (a) comes from Proposition 3.10-a and equality (b) is a dual equality of the type of Theorem 3.2-b applied with

$$I^*_\rho(k) = I_{\rho^*}(k^a) + \iota^*_{\mathrm{dom}\, \rho}(k^s), \qquad k \in L'_\rho \tag{4.11}$$

which holds for any Young function $\rho$. This identity is proved by Fougères, Giner, Kozek and Rockafellar [11, 13, 21] under the assumptions $(A_R)$ and $(A^1_{\gamma^*})$. The function $I^*_\rho$ is strongly continuous on $\operatorname{icordom} I^*_\rho \subset L'_\rho$, see [14, Lemma 2.1]. Hence, under the assumption (2), we obtain that

$$\bar{I}(\ell) = \Phi^*(\ell - mR), \qquad \ell \in L'_{\lambda_\diamond} \tag{4.12}$$

taking advantage of the direct sum $\ell = \oplus_k\, \ell_{|\mathcal{Z}_k}$ acting on $u = (u_{|\mathcal{Z}_k})_{k \ge 1}$, which leads to the nonnegative series $\Phi(u) = \sum_k \Phi(u_{|\mathcal{Z}_k})$ and $\Phi^*(\ell) = \sum_k \Phi^*(\ell_{|\mathcal{Z}_k})$.

• Reduction to $m = 0$. We have seen at (2.10) that the transformation $Q \rightsquigarrow \ell = Q - mR$ corresponds to the transformations $\gamma \rightsquigarrow \lambda$ and $(P_C) \rightsquigarrow$ (2.10). This still works with $(\overline{P}_C)$ and one can assume from now on without loss of generality that $m = 0$ and $\gamma = \lambda$.

The assumption $(A^2_{\gamma^*})$ will not be used during the rest of the proof. This allows Remark 4.10-d.
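The reduction step above rests on a shift rule for convex conjugates. The following short derivation is a sketch under an assumption not restated in this section, namely that $\lambda$ is obtained from $\gamma$ by $\lambda(z, s) = \gamma(z, s) - m(z)s$, with $\gamma^*_z$ the convex conjugate of $\gamma_z$.

```latex
\begin{align*}
\lambda_z^*(t) &= \sup_{s \in \mathbb{R}} \bigl\{ st - \gamma_z(s) + m(z)s \bigr\}
               = \sup_{s \in \mathbb{R}} \bigl\{ s\,(t + m(z)) - \gamma_z(s) \bigr\}
               = \gamma_z^*\bigl(t + m(z)\bigr), \\
\int_{\mathcal{Z}} \gamma_z^*\Bigl(\tfrac{dQ}{dR}(z)\Bigr) R(dz)
               &= \int_{\mathcal{Z}} \lambda_z^*\Bigl(\tfrac{dQ}{dR}(z) - m(z)\Bigr) R(dz)
               = I_{\lambda^*}(Q - mR), \qquad Q \prec R .
\end{align*}
```

Under this assumption, minimizing the entropy of $Q$ is the same as minimizing $I_{\lambda^*}$ at $\ell = Q - mR$, and $\lambda^*_z$ attains its minimum at $0$ whenever $\gamma^*_z$ attains its minimum at $m(z)$, which is why one may take $m = 0$.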

• Verification of $(H_\Phi)$ and $(H_T)$. Suppose that $W = \{z \in \mathcal{Z};\ \lambda(z, s) = 0,\ \forall s \in \mathbb{R}\}$ is such that $R(W) > 0$. Then, any $\ell$ such that $\langle u\mathbf{1}_W, \ell\rangle > 0$ for some $u \in L_{\lambda_\diamond}$ satisfies $\Phi^*(\ell) = +\infty$. Therefore, one can remove $W$ from $\mathcal{Z}$ without loss of generality. Once this is done, the hypothesis $(H_\Phi)$ is satisfied under the assumption $(A^1_{\gamma^*})$. The hypothesis $(H_{T1})$ is $(A^\exists_\theta)$ while $(H_{T2})$ is $(A^2_\theta)$.

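• The computation of $\bar{\Phi}$ in the case where $\lambda$ is even. Since $\Phi$ is even, Theorem 3.6 tells us that $\mathrm{dom}\, \bar{\Phi}$ is included in the $\sigma(L''_\lambda, L'_\lambda)$-closure of $\mathrm{dom}\, \Phi$. Thanks to (4.11) and the decomposition (4.3), the extension $\bar{\Phi}$ is given for each $\zeta \in L''_\lambda$ by

$$\begin{aligned}
\bar{\Phi}(\zeta) &= (\bar{I}_{\lambda^*})^*(\zeta_1, \zeta_2) \\
&= \sup_{f \in L_{\lambda^*_\diamond},\ k \in L^s_{\lambda_\diamond}} \{\langle \zeta_1, fR\rangle + \langle \zeta_2, k\rangle - I_{\lambda^*}(fR) - \iota^*_{\mathrm{dom}\, I_\lambda}(k)\} \\
&= I^*_{\lambda^*}(\zeta_1) + \iota^{**}_{\mathrm{dom}\, I_\lambda}(\zeta_2) \\
&= \bar{I}_\lambda(\zeta_1) + \iota_D(\zeta_2) \\
&= I_\lambda(\zeta_1^a) + \iota^*_{\mathrm{dom}\, I_{\lambda^*}}(\zeta_1^s) + \iota_D(\zeta_2)
\end{aligned}$$

where $D$ is the $\sigma(L^{s\prime}_\lambda, L^s_\lambda)$-closure of $\mathrm{dom}\, I_\lambda$ and we dropped the restrictions $\zeta_{|L}$ for simplicity.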

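• Extension to the case where $\lambda$ is not even. By Proposition 3.10-b, we have $\bar{\Phi}(\zeta) = \bar{\Phi}_+(\zeta_{+|L'_{\lambda_+} \cap L'_{\lambda_\diamond}}) + \bar{\Phi}_-(\zeta_{-|L'_{\lambda_-} \cap L'_{\lambda_\diamond}})$ if $\zeta \in L'^{\,b}_{\lambda_\diamond}$ and $+\infty$ otherwise. It follows that

$$\bar{\Phi}(\zeta) = \bar{I}^*_\lambda(\zeta) = I_\lambda(\zeta_1^a) + \iota^*_{\mathrm{dom}\, I_{\lambda^*}}(\zeta_1^s) + \iota_D(\zeta_2) \tag{4.13}$$

if $\zeta \in K''_\lambda$ and $+\infty$ otherwise. In particular, we have

$$\Lambda(y) = I_\lambda(\langle y, \theta\rangle), \qquad y \in \mathcal{Y},$$

$$\bar{\Lambda}(\omega) = \begin{cases}
I_\lambda\big([T^*\omega]^a_1\big) + \iota^*_{\mathrm{dom}\, I_{\lambda^*}}\big([T^*\omega]^s_1\big) + \iota_D\big([T^*\omega]_2\big) & \text{if } \omega \in \overline{\mathcal{Y}} \\
+\infty & \text{otherwise}
\end{cases}, \qquad \omega \in \mathcal{X}_L^*.$$

This provides us with the dual problems $(D_C)$ and $(\overline{D}_C)$.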

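• Proof of (a) and (b). Apply Theorem 3.2. $\square$

Let us go on with the proof of (c). By Theorem 3.3, $(\overline{P}_C, \overline{D}_C)$ admits a solution in $L'_{\lambda_\diamond} \times \overline{\mathcal{Y}}$ and $(\hat{\ell}, \bar{\omega}) \in L'_{\lambda_\diamond} \times \overline{\mathcal{Y}}$ solves $(\overline{P}_C, \overline{D}_C)$ if and only if

$$\begin{cases}
\text{(a)}\ \ \hat{x} \in C \cap \mathrm{dom}\, \Gamma^* \\
\text{(b)}\ \ \langle \bar{\omega}, \hat{x}\rangle \le \langle \bar{\omega}, x\rangle, \quad \forall x \in C \cap \mathrm{dom}\, \Gamma^* \\
\text{(c)}\ \ \hat{\ell} \in \partial_{L'_{\lambda_\diamond}} \bar{\Phi}(T^*\bar{\omega})
\end{cases} \tag{4.14}$$

where $\hat{x} \triangleq T\hat{\ell}$ is defined in the weak sense with respect to the duality $\langle \mathcal{Y}_L, \mathcal{X}_L\rangle$. Since $\mathrm{dom}\, \Gamma^* \subset \mathcal{X}_L$, the above dual brackets are meaningful.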

• The computation of $\partial_{L'_\lambda} \bar{\Phi}(\zeta)$. Let us first assume that $\lambda$ is even. For all $u \in L_\lambda$, $u_1^a = u_2 = u$ and $u_1^s = 0$. This gives

$$\bar{\Phi}(\zeta + u) - \bar{\Phi}(\zeta) = I_\lambda(\zeta_1^a + u_1) - I_\lambda(\zeta_1^a) + \iota_D(\zeta_2 + u_2) - \iota_D(\zeta_2)$$

where $u_1 = u$ and $u_2 = u$ act respectively on $L_{\lambda^*} R$ and $L^s_\lambda$. This direct sum structure leads us to

$$\partial_{L'_\lambda} \bar{\Phi}(\zeta) = \partial_{L_{\lambda^*} R}\, I_\lambda(\zeta_1^a) + \partial_{L^s_\lambda}\, \iota_D(\zeta_2) \tag{4.15}$$

which again is the direct sum of the absolutely continuous and singular components of $\partial_{L'_\lambda} \bar{\Phi}(\zeta)$. Differentiating in the directions of $\mathcal{U} = L_\lambda$, one obtains $\partial_{L_{\lambda^*} R}\, I_\lambda(\zeta_1^a) = \{\lambda'(\zeta_1^a) R\}$.

The computation of $\partial_{L^s_\lambda}\, \iota_D(\zeta_2)$ is standard: $\partial_{L^s_\lambda}\, \iota_D(\zeta_2) = D^\perp(\zeta_2)$ is the outer normal cone of $D$ at $\zeta_2$.

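Now, consider a general $\lambda$. By Proposition 3.10-a, $\hat{\ell}_+ \in \partial_{L'_{\lambda_+} \cap L'_{\lambda_\diamond}} \bar{\Phi}_+([T^*\bar{\omega}]_+)$ and $\hat{\ell}_- \in \partial_{L'_{\lambda_-} \cap L'_{\lambda_\diamond}} \bar{\Phi}_-([T^*\bar{\omega}]_-)$. Therefore, (4.15) becomes

$$\partial_{L'_{\lambda_\diamond}} \bar{\Phi}(\zeta) = \partial_{K_{\lambda^*} R\, \cap\, L_{\lambda^*_\diamond} R}\, I_\lambda(\zeta_1^a) + \partial_{K^s_\lambda \cap L^s_{\lambda_\diamond}}\, \iota_D(\zeta_2).$$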

• Representation of $[T^*\bar{\omega}]^a_1$. One still has to prove that

$$[T^*\bar{\omega}]^a_1(z) = \langle \theta(z), \tilde{\omega}\rangle \tag{4.16}$$

for $R$-a.e. $z \in \mathcal{Z}$ and some linear form $\tilde{\omega}$ on $\mathcal{X}_o$.

If $W_- := \{z \in \mathcal{Z};\ \lambda(z, s) = 0,\ \forall s \le 0\}$ satisfies $R(W_-) > 0$, $\mathrm{dom}\, \bar{I}$ is a set of linear forms which are nonnegative on $W_-$ and $\gamma'_z(s) = 0$ for all $s \le 0$, $z \in W_-$. Hence, one can take any function for the restriction to $W_-$ of $[T^*\bar{\omega}]^a_{1-}$ without modifying (4.14)-c. As a symmetric remark holds for $W_+ = \{z \in \mathcal{Z};\ \lambda(z, s) = 0,\ \forall s \ge 0\}$, it remains to consider the situation where for $R$-a.e. $z$, there are $s_-(z) < 0 < s_+(z)$ such that $\lambda(z, s_\pm(z)) > 0$. This implies that $\lim_{s \to \pm\infty} \lambda(z, s)/s > 0$.

By Theorem 3.3, $T^*\bar{\omega}$ is in the $\sigma(K''_\lambda, K'_\lambda)$-closure of $T^*(\mathrm{dom}\, \Lambda)$. Therefore, $[T^*\bar{\omega}]^a_1$ is in the $\sigma(K_\lambda, K'_\lambda)$-closure of $T^*(\mathrm{dom}\, \Lambda)$. As $T^*(\mathrm{dom}\, \Lambda)$ is convex, this closure is its strong closure in $K_\lambda$. Since there exists a finite measurable function $c(z)$ such that $0 < c(z) \le \lim_{s \to \infty} \lambda(z, s)/s$, one can consider the nontrivial Young function $\rho(z, s) = c(z)|s|$ and the corresponding Orlicz spaces $L_\rho$ and $L'_\rho = L_{\rho^*}$. If $R$ is a bounded measure, we have $L_{\lambda_\diamond} \subset L_\rho$ and $L_{\rho^*} \subset L_{\lambda^*_\diamond}$, so that $[T^*\bar{\omega}]^a_1$ is in the strong closure of $T^*(\mathrm{dom}\, \Lambda)$ in $L_\rho$. As a consequence, $[T^*\bar{\omega}]^a_1$ is the pointwise limit of a sequence $(T^* y_n)_{n \ge 1}$ with $y_n \in \mathcal{Y}$. As $T^* y_n(z) = \langle y_n, \theta(z)\rangle$, we see that $[T^*\bar{\omega}]^a_1(z) = \langle \theta(z), \tilde{\omega}\rangle$ for some linear form $\tilde{\omega}$ on $\mathcal{X}_o$. If $R$ is unbounded, it is still assumed to be $\sigma$-finite: there exists a sequence $(\mathcal{Z}_k)$ of measurable subsets of $\mathcal{Z}$ such that $\cup_k \mathcal{Z}_k = \mathcal{Z}$ and $R(\mathcal{Z}_k) < \infty$ for each $k$. Hence, for each $k$ and all $z \in \mathcal{Z}_k$, $(T^*\bar{\omega})^a(z) = \langle \theta(z), \tilde{\omega}^k\rangle$ for some linear form $\tilde{\omega}^k$ on $\mathcal{X}_o$, from which (4.16) follows.

• Proof of (c). It follows from the previous considerations and Theorem 3.3.

• Proof of (d). Statement (d)-1 follows from Theorem 3.2. Statement (d)-2 is immediately
