HAL Id: hal-01558522
https://hal.archives-ouvertes.fr/hal-01558522v2
Submitted on 12 Jun 2018
On a decomposition formula for the proximal operator of the sum of two convex functions
Samir Adly, Loïc Bourdin, Fabien Caubet
To cite this version:
Samir Adly, Loïc Bourdin, Fabien Caubet. On a decomposition formula for the proximal operator of the sum of two convex functions. Journal of Convex Analysis, Heldermann, 2019, 26 (2), pp.699-718.
⟨hal-01558522v2⟩
On a decomposition formula for the proximal operator of the sum of two convex functions
Samir Adly∗, Loïc Bourdin†, Fabien Caubet‡

June 6, 2018
Abstract
The main result of the present theoretical paper is an original decomposition formula for the proximal operator of the sum of two proper, lower semicontinuous and convex functions f and g. For this purpose, we introduce a new operator, called the f-proximal operator of g and denoted by prox^f_g, that generalizes the classical notion. Then we prove the decomposition formula prox_{f+g} = prox_f ∘ prox^f_g. After collecting several properties and characterizations of prox^f_g, we prove that its values coincide with the fixed points of a generalized version of the classical Douglas-Rachford operator. This relationship is used to construct a weakly convergent algorithm that computes this new operator prox^f_g numerically, and thus, through the decomposition formula, allows one to compute prox_{f+g} numerically. It turns out that this algorithm was already considered and implemented in previous works, showing that prox^f_g is already present (in a hidden form) and useful for numerical purposes in the existing literature.
However, to the best of our knowledge, it has never been explicitly expressed in a closed formula, nor deeply studied from a theoretical point of view. The present paper contributes to filling this gap in the literature. Finally, we give an illustration of the usefulness of the decomposition formula in the context of sensitivity analysis of linear variational inequalities of the second kind in a Hilbert space.
Keywords: convex analysis; proximal operator; Douglas-Rachford operator; Forward-Backward operator.
AMS Classification: 46N10; 47N10; 49J40; 49Q12.
Contents
1 Introduction, notations and basics
1.1 Introduction
1.2 Notations and basics
2 The f-proximal operator
2.1 Definition and main result
2.2 Properties
∗Institut de recherche XLIM. UMR CNRS 7252. Université de Limoges, France. samir.adly@unilim.fr
†Institut de recherche XLIM. UMR CNRS 7252. Université de Limoges, France. loic.bourdin@unilim.fr
‡Institut de Mathématiques de Toulouse. UMR CNRS 5219. Université de Toulouse, France. fabien.caubet@math.univ-toulouse.fr
3 Relations with the Douglas-Rachford operator
3.1 Several characterizations of prox^f_g
3.2 A weakly convergent algorithm that computes prox^f_g numerically
3.3 Recovering a classical result from the decomposition formula
4 Some other applications and forthcoming works
4.1 Relations with the Forward-Backward operator
4.2 Application to sensitivity analysis for variational inequalities
A A nonexistence result for a closed formula
1 Introduction, notations and basics
1.1 Introduction
The proximal operator (also known as the proximity operator) of a proper, lower semicontinuous, convex and extended-real-valued function was first introduced by J.-J. Moreau in 1962 in [11, 12], and can be viewed as an extension of the projection operator onto a nonempty closed and convex subset of a Hilbert space. This wonderful tool plays an important role, from both theoretical and numerical points of view, in applied mathematics and engineering sciences. This paper fits within the wide theoretical literature dealing with the proximal operator. For the rest of this introduction, we use standard notations of convex analysis. We refer the reader who is not acquainted with convex analysis to Section 1.2 for notations and basics.
Motivations from a sensitivity analysis. The present work was initially motivated by the sensitivity analysis, with respect to a nonnegative parameter t ≥ 0, of a parameterized linear variational inequality of the second kind in a Hilbert space H, with a corresponding function h ∈ Γ0(H), where Γ0(H) is the set of proper, lower semicontinuous and convex functions from H into R ∪ {+∞}.
More precisely, for all t ≥ 0, we consider the problem of finding u(t) ∈ H such that

⟨u(t), z − u(t)⟩ + h(z) − h(u(t)) ≥ ⟨r(t), z − u(t)⟩,   (1)

for all z ∈ H, where r : R+ → H is assumed to be given and smooth enough. In that framework, the solution u(t) ∈ H (which depends on the parameter t) can be expressed in terms of the proximal operator of h, denoted by prox_h. Precisely, it holds that u(t) = prox_h(r(t)) for all t ≥ 0. As a consequence, the differentiability of u(·) at t = 0 is strongly related to the regularity of prox_h. If h is a smooth function, one can easily compute (from the classical inverse mapping theorem for instance) the differential of prox_h, and then the sensitivity analysis can be achieved. In that smooth case, note that the variational inequality (1) can actually be reduced to an equality. On the other hand, if h = ι_K is the indicator function of a nonempty closed and convex subset K ⊂ H, then prox_h = proj_K is the classical projection operator onto K. In that case, the work of F. Mignot in [10, Theorem 2.1 p.145] (see also the work of A. Haraux in [8, Theorem 2 p.620]) provides an asymptotic expansion of prox_h = proj_K and permits one to obtain a differentiability result on u(·) at t = 0.
In a parallel work (in progress) of the authors on some shape optimization problems with unilateral contact and friction, the considered variational inequality (1) involves the sum of two functions.
Precisely, h = f + g, where f = ι_K (K being a nonempty closed and convex set of constraints), and where g ∈ Γ0(H) is a smooth function (derived from the regularization of the friction functional in view of a numerical treatment). Despite the regularity of g, note that the variational inequality (1) cannot be reduced to an equality, due to the presence of the constraint set K. In that framework, in order to get an asymptotic expansion of prox_h = prox_{f+g}, a first and natural strategy would be to look for a convenient explicit expression of prox_{f+g} in terms of prox_f and prox_g. Unfortunately, this theoretical question still remains an open challenge in the literature. Let us mention that Y.-L. Yu provides in [18] some necessary and/or sufficient conditions on general functions f, g ∈ Γ0(H) under which prox_{f+g} = prox_f ∘ prox_g. Unfortunately, as underlined by the author himself, these conditions are very restrictive and are not satisfied in most cases (see, e.g., [18, Example 2] for a counterexample).
Before coming to the main topic of this paper, we recall that a wide literature is already concerned with the sensitivity analysis of parameterized (linear and nonlinear) variational inequalities. We refer for instance to [3, 8, 13, 17] and references therein, where the results are established in very general frameworks. We emphasize that our original objective was to look for a simple and compact formula for the derivative u′(0) in the very particular case described above, that is, in the context of a linear variational inequality and with h = f + g, where f is an indicator function and g is a smooth function. For this purpose, we were led to consider the proximal operator of the sum of two functions in Γ0(H), to introduce a new operator, and finally to prove the results presented in this paper.
Introduction of the f-proximal operator and main result. Let us consider general functions f, g ∈ Γ0(H). In order to avoid trivialities, we will assume in the whole paper that dom(f) ∩ dom(g) ≠ ∅ when dealing with the sum f + g.
Section 2 is devoted to the introduction (see Definition 2.1) of a new operator prox^f_g : H ⇒ H, called the f-proximal operator of g and defined by

prox^f_g := (I + ∂g ∘ prox_f)^{−1}.   (2)
This new operator can be seen as a generalization of prox_g in the sense that, if f is constant for instance, then prox^f_g = prox_g. More general sufficient (and necessary) conditions under which prox^f_g = prox_g are provided in Propositions 2.13 and 2.15. We prove in Proposition 2.5 that the domain of prox^f_g satisfies D(prox^f_g) = H if and only if ∂(f + g) = ∂f + ∂g. Note that prox^f_g is a priori a set-valued operator. We provide in Proposition 2.18 some sufficient conditions under which prox^f_g is single-valued. Some examples illustrate all the previous results throughout the section (see Examples 2.2, 2.3, 2.4, 2.7 and 2.17).
Finally, if the additivity condition ∂(f + g) = ∂f + ∂g is satisfied, the main result of the present paper (see Theorem 2.8) is the original decomposition formula

prox_{f+g} = prox_f ∘ prox^f_g.   (3)
It is well known in the literature that obtaining a theoretical formula for prox_{f+g} is not an easy task in general, even if prox_f and prox_g are known. We give in Appendix A a more precise description of the difficulty of obtaining an easily computable formula for prox_{f+g}: it claims that there is no closed formula, independent of f and g, allowing one to write prox_{f+g} as a linear combination of compositions of linear combinations of I, prox_f, prox_g, prox_f^{−1} and prox_g^{−1}. In the decomposition formula (3), it should be noted that the difficulty of computing prox_{f+g} is only transferred to the computation of prox^f_g, which is not an easier task. Note that other rewritings, which are not suitable for an easy computation of prox_{f+g} either, can be considered, such as

prox_{f+g} = (prox_f^{−1} + prox_g^{−1} − I)^{−1} = (prox_{2f}^{−1} + prox_{2g}^{−1})^{−1} ∘ 2I,

the second equality being provided in [2, Corollary 25.35 p.458]. However, we show in this paper that our decomposition formula (3) is of theoretical interest: it allows us to prove in a concise and elegant way almost all the other new statements of this paper, and also to recover in a simple way some well-known results (see Sections 3.3 and 4.1 for instance), making it central in our work. We provide an illustration of this feature in the next paragraph about the classical Douglas-Rachford algorithm. Moreover, as explained in the last paragraph of this introduction, we also prove in this paper the usefulness of the decomposition formula (3) in the context of sensitivity analysis of the variational inequality (1) (see Section 4.2).
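To make the decomposition formula (3) concrete, here is a minimal numerical sketch on H = R with an example of our own choosing (not taken from the paper): f = ι_{[0,+∞)} and g(x) = x². Here ∂g ∘ prox_f(y) = 2 max(y, 0), so a direct computation gives prox^f_g(x) = x/3 for x ≥ 0 and prox^f_g(x) = x for x < 0, while prox_{f+g}(x) = max(x/3, 0); the additivity condition holds by the Moreau-Rockafellar theorem since g is everywhere finite.

```python
# Sanity check of the decomposition prox_{f+g} = prox_f o prox^f_g
# on H = R with f = indicator of [0, +inf) and g(x) = x^2 (illustrative choice).

def prox_f(y):
    return max(y, 0.0)          # projection onto [0, +inf)

def prox_f_plus_g(x):
    return max(x / 3.0, 0.0)    # argmin_{u >= 0} u^2 + (u - x)^2 / 2

def f_prox_g(x):
    # closed form of prox^f_g = (I + dg o prox_f)^{-1} in this example
    return x / 3.0 if x >= 0 else x

for x in [-5.0, -0.5, 0.0, 1.0, 4.0]:
    assert abs(prox_f(f_prox_g(x)) - prox_f_plus_g(x)) < 1e-12
print("decomposition formula verified on the sample points")
```

In this example prox^f_g happens to be single-valued; Examples 2.3 and 2.4 below show that this is not the case in general.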
Relationship with the classical Douglas-Rachford operator. Recall that the proximal operator prox_{f+g} is strongly related to the minimization problem

argmin(f + g),

since the set of solutions is exactly the set of fixed points of prox_{f+g}, denoted by Fix(prox_{f+g}). In the sequel, we will assume that the above minimization problem admits at least one solution. The classical Douglas-Rachford operator, introduced in [6] and denoted here by DR_{f,g} (see Section 3 for details), provides an algorithm x_{n+1} = DR_{f,g}(x_n) that is weakly convergent to some x* ∈ H satisfying

prox_f(x*) ∈ argmin(f + g).

Even if the Douglas-Rachford algorithm is not a proximal point algorithm in general, in the sense that DR_{f,g} is not equal to prox_ϕ for some ϕ ∈ Γ0(H) in general, it is a very powerful tool, since it allows one to solve the above minimization problem with only the knowledge of prox_f and prox_g. We refer to [2, Section 28.3 p.517] for more details.
Section 3 deals with the relations between the Douglas-Rachford operator DR_{f,g} and the f-proximal operator prox^f_g introduced in this paper. Precisely, we prove in Proposition 3.2 that

prox^f_g(x) = Fix(DR_{f,g}(x, ·)),

for all x ∈ H, where DR_{f,g}(x, ·) denotes an x-dependent generalization of the classical Douglas-Rachford operator DR_{f,g}, in the sense that DR_{f,g}(y) = DR_{f,g}(prox_f(y), y) for all y ∈ H. We refer to Section 3 for the precise definition of DR_{f,g}(x, ·), which only depends on the knowledge of prox_f and prox_g.
Let us show that the above statements, in particular the decomposition formula (3), allow us to recover in a concise way the well-known inclusion

prox_f(Fix(DR_{f,g})) ⊂ argmin(f + g) = Fix(prox_{f+g}).   (4)

Indeed, if x* ∈ Fix(DR_{f,g}), then x* ∈ Fix(DR_{f,g}(prox_f(x*), ·)) = prox^f_g(prox_f(x*)). From the decomposition formula (3), we conclude that

prox_f(x*) = prox_f ∘ prox^f_g(prox_f(x*)) = prox_{f+g}(prox_f(x*)).

This proof of only a few lines is an illustration of the theoretical interest of the decomposition formula (3). Note that the above inclusion (4) is, as is well known, an equality (see Section 3.3 and Proposition 3.8 for details).
The f-proximal operator prox^f_g introduced in this paper is also of interest from a numerical point of view. Indeed, if x ∈ D(prox^f_g), we prove in Theorem 3.3 that the fixed-point algorithm y_{k+1} = DR_{f,g}(x, y_k), denoted by (A1), weakly converges to some y* ∈ prox^f_g(x). Moreover, if the additivity condition ∂(f + g) = ∂f + ∂g is satisfied, we get from the decomposition formula (3) that prox_f(y*) = prox_{f+g}(x). In that situation, we conclude that Algorithm (A1) allows one to compute prox_{f+g}(x) numerically with only the knowledge of prox_f and prox_g. It turns out that Algorithm (A1) was already considered, up to some translations, and implemented in previous works (see, e.g., [4, Algorithm 3.5]), showing that the f-proximal operator prox^f_g is already present (in a hidden form) and useful for numerical purposes in the existing literature. However, to the best of our knowledge, it has never been explicitly expressed in a closed formula such as (2), nor deeply studied from a theoretical point of view. The present paper contributes to filling this gap in the literature.
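Algorithm (A1) can be sketched in a few lines. The precise definition of DR_{f,g}(x, ·) is given in Section 3 and not reproduced in this introduction; the operator used below, DR_{f,g}(x, y) := y − prox_f(y) + prox_g(x − y + prox_f(y)), is an assumption on our part — one choice consistent with both the fixed-point characterization prox^f_g(x) = Fix(DR_{f,g}(x, ·)) and the relation DR_{f,g}(y) = DR_{f,g}(prox_f(y), y) — and may differ from the paper's exact formulation. The test functions f = ι_{[0,+∞)} and g(x) = x² are purely illustrative.

```python
# Sketch of Algorithm (A1): iterate y_{k+1} = DR_{f,g}(x, y_k) to approximate
# some y* in prox^f_g(x), then recover prox_{f+g}(x) = prox_f(y*).
# The generalized operator below, DR_{f,g}(x, y) = y - prox_f(y)
# + prox_g(x - y + prox_f(y)), is an assumed form consistent with the
# fixed-point characterization above; the paper's definition is in Section 3.

def prox_f(y):
    return max(y, 0.0)       # projection onto [0, +inf)

def prox_g(v):
    return v / 3.0           # prox of v -> v^2: argmin u^2 + (u - v)^2 / 2

def algorithm_A1(x, y0=0.0, iterations=200):
    y = y0
    for _ in range(iterations):
        y = y - prox_f(y) + prox_g(x - y + prox_f(y))
    return y

for x in [-3.0, 0.5, 3.0]:
    y_star = algorithm_A1(x)
    # here prox_{f+g}(x) = max(x/3, 0), computed directly
    assert abs(prox_f(y_star) - max(x / 3.0, 0.0)) < 1e-8
```

Only evaluations of prox_f and prox_g are used, matching the stated appeal of Algorithm (A1).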
Some other applications and forthcoming works. Section 4 can be seen as a conclusion of the paper. Its aim is to provide a glimpse of some other applications of our main result (Theorem 2.8) and to raise open questions for forthcoming works. This section is split into two parts.
In Section 4.1 we consider the framework where f, g ∈ Γ0(H) with g differentiable on H. In that framework, we prove from the decomposition formula (3) that prox_{f+g} is related to the classical Forward-Backward operator (see [5, Section 10.3 p.191] for details), denoted by FB_{f,g}. Precisely, we prove in Proposition 4.1 that

prox_{f+g}(x) = Fix(FB_{f,g}(x, ·)),

for all x ∈ H, where FB_{f,g}(x, ·) denotes an x-dependent generalization of the classical Forward-Backward operator FB_{f,g}. We refer to Section 4.1 for the precise definition of FB_{f,g}(x, ·), which only depends on the knowledge of prox_f and ∇g. From this point, one can develop a strategy similar to the one of Section 3. Precisely, for all x ∈ H, one can consider the algorithm y_{k+1} = FB_{f,g}(x, y_k), denoted by (A2), in order to compute prox_{f+g}(x) numerically, with only the knowledge of prox_f and ∇g.
A convergence proof (under some assumptions on f and g) for (A2) should be the topic of a future work.
In Section 4.2 we turn back to our initial motivation, namely the sensitivity analysis of the parameterized variational inequality (1). Precisely, under some assumptions (see Proposition 4.3 for details), we derive from the decomposition formula (3) that if

u(t) := prox_{f+g}(r(t)),

for all t ≥ 0, where f := ι_K (where K ⊂ H is a nonempty closed convex subset) and where g ∈ Γ0(H) and r : R+ → H are smooth enough, then

u′(0) = prox_{ϕ_f + ψ_g}(r′(0)),

where ϕ_f := ι_C (where C is a nonempty closed convex subset of H related to K) and where ψ_g(x) := (1/2)⟨D²g(u(0))(x), x⟩ for all x ∈ H. It should be noted that the assumptions of Proposition 4.3 are quite restrictive, raising open questions about their relaxation (see Remark 4.5). This also should be the subject of a forthcoming work.
1.2 Notations and basics
In this section we introduce some notations used throughout the paper and we recall some basics of convex analysis. We refer to standard books like [2, 9, 14] and references therein.
Let H be a real Hilbert space and let ⟨·,·⟩ (resp. ‖·‖) be the corresponding scalar product (resp. norm). For every subset S of H, we denote respectively by int(S) and cl(S) its interior and its closure. In the sequel we denote by I : H → H the identity operator and by L_x : H → H the affine operator defined by

L_x(y) := x − y, for all x, y ∈ H.
For a set-valued map A : H ⇒ H, the domain of A is given by

D(A) := {x ∈ H | A(x) ≠ ∅}.

We denote by A^{−1} : H ⇒ H the set-valued map defined by

A^{−1}(y) := {x ∈ H | y ∈ A(x)},

for all y ∈ H. Note that y ∈ A(x) if and only if x ∈ A^{−1}(y), for all x, y ∈ H. The range of A is given by

R(A) := {y ∈ H | A^{−1}(y) ≠ ∅} = D(A^{−1}).

We denote by Fix(A) the set of all fixed points of A, that is, the set given by

Fix(A) := {x ∈ H | x ∈ A(x)}.

Finally, if A(x) is a singleton for all x ∈ D(A), we say that A is single-valued.
For an extended-real-valued function g : H → R ∪ {+∞}, the domain of g is given by

dom(g) := {x ∈ H | g(x) < +∞}.

Recall that g is said to be proper if dom(g) ≠ ∅.
Let g : H → R ∪ {+∞} be a proper extended-real-valued function. We denote by g* : H → R ∪ {+∞} the conjugate of g defined by

g*(y) := sup_{z∈H} {⟨y, z⟩ − g(z)},

for all y ∈ H. Clearly g* is lower semicontinuous and convex.
We denote by Γ0(H) the set of all extended-real-valued functions g : H → R ∪ {+∞} that are proper, lower semicontinuous and convex. If g ∈ Γ0(H), we recall that g* ∈ Γ0(H) and that the Fenchel-Moreau equality g** = g holds. For all g ∈ Γ0(H), we denote by ∂g : H ⇒ H the Fenchel-Moreau subdifferential of g defined by

∂g(x) := {y ∈ H | ⟨y, z − x⟩ ≤ g(z) − g(x), ∀z ∈ H},

for all x ∈ H. It is easy to check that ∂g is a monotone operator and that, for all x ∈ H, 0 ∈ ∂g(x) if and only if x ∈ argmin g. Moreover, for all x, y ∈ H, it holds that y ∈ ∂g(x) if and only if x ∈ ∂g*(y). Recall that, if g is differentiable on H, then ∂g(x) = {∇g(x)} for all x ∈ H.
Let A : H → H be a single-valued operator defined everywhere on H, and let g ∈ Γ0(H). We denote by VI(A, g) the variational inequality which consists of finding y ∈ H such that

−A(y) ∈ ∂g(y),

or equivalently,

⟨A(y), z − y⟩ + g(z) − g(y) ≥ 0,

for all z ∈ H. Then we denote by SolVI(A, g) the set of solutions of VI(A, g). Recall that if A is Lipschitzian and strongly monotone, then VI(A, g) admits a unique solution, i.e., SolVI(A, g) is a singleton.
Let g ∈ Γ0(H). The classical proximal operator of g is defined by

prox_g := (I + ∂g)^{−1}.

Recall that prox_g is a single-valued operator defined everywhere on H. Moreover, it can be characterized as follows:

prox_g(x) = argmin(g + (1/2)‖· − x‖²) = SolVI(−L_x, g),

for all x ∈ H. It is also well known that

Fix(prox_g) = argmin g.

The classical Moreau envelope M_g : H → R of g is defined by

M_g(x) := min(g + (1/2)‖· − x‖²),

for all x ∈ H. Recall that M_g is convex and differentiable on H with ∇M_g = prox_{g*}. Let us also recall the classical Moreau decompositions

prox_g + prox_{g*} = I   and   M_g + M_{g*} = (1/2)‖·‖².

Finally, it is well known that if g = ι_K is the indicator function of a nonempty closed and convex subset K of H, that is, ι_K(x) = 0 if x ∈ K and ι_K(x) = +∞ otherwise, then prox_g = proj_K, where proj_K denotes the classical projection operator onto K.
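As a quick illustration of these basics with examples of our own choosing on H = R: for g = |·|, prox_g is the soft-thresholding map, the conjugate g* is the indicator of [−1, 1] so that prox_{g*} is the projection onto [−1, 1], and Moreau's decomposition prox_g + prox_{g*} = I can be checked pointwise.

```python
# Numerical illustration of the basics (illustrative one-dimensional examples):
# for g = |.| the proximal operator is soft-thresholding, g* is the indicator
# of [-1, 1], and Moreau's decomposition prox_g + prox_{g*} = I holds.

def soft_threshold(x):          # prox of g(x) = |x|
    return max(abs(x) - 1.0, 0.0) * (1.0 if x > 0 else -1.0)

def clip_unit(x):               # prox of g*(x) = indicator of [-1, 1]
    return min(max(x, -1.0), 1.0)

for x in [-2.5, -0.3, 0.0, 0.7, 4.0]:
    # Moreau's decomposition: prox_g(x) + prox_{g*}(x) = x
    assert abs(soft_threshold(x) + clip_unit(x) - x) < 1e-12
    # prox_g(x) minimizes z -> g(z) + |z - x|^2 / 2 (coarse grid check)
    grid = [i / 1000.0 for i in range(-6000, 6001)]
    best = min(grid, key=lambda z: abs(z) + 0.5 * (z - x) ** 2)
    assert abs(best - soft_threshold(x)) < 1e-2
# Fix(prox_g) = argmin g = {0}
assert soft_threshold(0.0) == 0.0
```

The grid check is only a crude confirmation of the argmin characterization; the closed forms are classical.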
2 The f-proximal operator
2.1 Definition and main result
Let f, g ∈ Γ0(H). In this section we introduce (see Definition 2.1) a new operator, denoted by prox^f_g, generalizing the classical proximal operator prox_g. Assuming that dom(f) ∩ dom(g) ≠ ∅, and under the additivity condition ∂(f + g) = ∂f + ∂g, we prove in Theorem 2.8 that prox_{f+g} can be written as the composition of prox_f with prox^f_g.

Definition 2.1 (f-proximal operator). Let f, g ∈ Γ0(H). The f-proximal operator of g is the set-valued map prox^f_g : H ⇒ H defined by

prox^f_g := (I + ∂g ∘ prox_f)^{−1}.   (5)
Note that prox^f_g can be seen as a generalization of prox_g since prox^c_g = prox_g for every constant function c ∈ R.

Example 2.2. Let us assume that H = R. We consider f = ι_{[−1,1]} and g(x) = |x| for all x ∈ R. In that case we obtain that ∂g ∘ prox_f = ∂g and thus prox^f_g = prox_g.
Example 2.2 provides a simple situation where prox^f_g = prox_g while f is not constant. We provide in Propositions 2.13 and 2.15 some general sufficient (and necessary) conditions under which prox^f_g = prox_g.

Example 2.3. Let us assume that H = R. We consider f = ι_{{0}} and g(x) = |x| for all x ∈ R. In that case we obtain that ∂g ∘ prox_f(x) = [−1, 1] for all x ∈ R. As a consequence, prox^f_g(x) = [x − 1, x + 1] for all x ∈ R. See Figure 1 for graphical representations of prox_g and prox^f_g in that case.
Figure 1: Example 2.3, graph of prox_g in bold line, and graph of prox^f_g in gray.
Example 2.4. Let us assume that H = R. We consider f(x) = g(x) = |x| for all x ∈ R. In that case we obtain that ∂g ∘ prox_f(x) = {−1} for all x < −1, ∂g ∘ prox_f(x) = [−1, 1] for all x ∈ [−1, 1] and ∂g ∘ prox_f(x) = {1} for all x > 1. As a consequence, prox^f_g(x) = {x + 1} for all x ≤ −2, prox^f_g(x) = [−1, x + 1] for all x ∈ [−2, 0], prox^f_g(x) = [x − 1, 1] for all x ∈ [0, 2] and prox^f_g(x) = {x − 1} for all x ≥ 2. See Figure 2 for graphical representations of prox_g and prox^f_g in that case.
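The piecewise formula of Example 2.4 can be cross-checked against Theorem 2.8: for f = g = |·| we have prox_{f+g} = prox_{2|·|} (soft-thresholding by 2), so every element of prox^f_g(x) must be mapped to prox_{f+g}(x) by prox_f. The sketch below tests the interval endpoints of each branch.

```python
# Check of Example 2.4 via the decomposition formula: for f = g = |.|,
# prox_{f+g} = prox_{2|.|} is soft-thresholding by 2, and every selection of
# the piecewise formula for prox^f_g must map to it through prox_f.

def soft(x, t):                 # prox of t * |.|
    return max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0)

def f_prox_g_selections(x):
    # endpoints of prox^f_g(x) as computed in Example 2.4
    if x <= -2:
        return [x + 1.0]
    if x <= 0:
        return [-1.0, x + 1.0]
    if x <= 2:
        return [x - 1.0, 1.0]
    return [x - 1.0]

for x in [-4.0, -2.0, -1.0, 0.0, 1.5, 3.0]:
    for z in f_prox_g_selections(x):
        assert abs(soft(z, 1.0) - soft(x, 2.0)) < 1e-12
```

Checking only the endpoints is enough here because, by Theorem 2.8, prox_f sends the whole interval prox^f_g(x) to the single point prox_{f+g}(x).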
Examples 2.3 and 2.4 provide simple illustrations where prox^f_g is not single-valued. In particular, it follows that prox^f_g cannot be written as a proximal operator prox_ϕ for some ϕ ∈ Γ0(H). We provide in Proposition 2.18 some sufficient conditions under which prox^f_g is single-valued. Moreover, Examples 2.3 and 2.4 provide simple situations where ∂g ∘ prox_f is not a monotone operator. As a consequence, it may happen that D(prox^f_g) ⊊ H. In the next proposition, a necessary and sufficient condition under which D(prox^f_g) = H is derived.
Figure 2: Example 2.4, graph of prox_g in bold line, and graph of prox^f_g in gray.
Proposition 2.5. Let f, g ∈ Γ0(H) be such that dom(f) ∩ dom(g) ≠ ∅. It holds that D(prox^f_g) = H if and only if the additivity condition

∂(f + g) = ∂f + ∂g,   (C1)

is satisfied.
Proof. We first assume that ∂(f + g) = ∂f + ∂g. Let x ∈ H. Defining w = prox_{f+g}(x) ∈ H, we obtain that x ∈ w + ∂(f + g)(w) = w + ∂f(w) + ∂g(w). Thus, there exist w_f ∈ ∂f(w) and w_g ∈ ∂g(w) such that x = w + w_f + w_g. We define y = w + w_f ∈ w + ∂f(w). In particular we have w = prox_f(y). Moreover, we obtain x = y + w_g ∈ y + ∂g(w) = y + ∂g(prox_f(y)). We conclude that y ∈ prox^f_g(x).
Without any additional assumption and directly from the definition of the subdifferential, one can easily see that the inclusion ∂f(w) + ∂g(w) ⊂ ∂(f + g)(w) is always satisfied for every w ∈ H.
Now let us assume that D(prox^f_g) = H. Let w ∈ H and let z ∈ ∂(f + g)(w). We consider x = w + z ∈ w + ∂(f + g)(w). In particular it holds that w = prox_{f+g}(x). Since D(prox^f_g) = H, there exists y ∈ prox^f_g(x), and thus it holds that x ∈ y + ∂g(prox_f(y)). Moreover, since y ∈ prox_f(y) + ∂f(prox_f(y)), we get that x ∈ prox_f(y) + ∂f(prox_f(y)) + ∂g(prox_f(y)) ⊂ prox_f(y) + ∂(f + g)(prox_f(y)). Thus it holds that prox_f(y) = prox_{f+g}(x) = w. Moreover, since x ∈ prox_f(y) + ∂f(prox_f(y)) + ∂g(prox_f(y)), we obtain that x ∈ w + ∂f(w) + ∂g(w). We have proved that z = x − w ∈ ∂f(w) + ∂g(w). This concludes the proof.
In most of the present paper, we will assume that Condition (C1) is satisfied. It is not our aim here to discuss the weakest qualification conditions ensuring that condition. A wide literature already deals with this topic (see, e.g., [1, 7, 14]). However, we recall in the following remark the classical Moreau-Rockafellar sufficient condition under which Condition (C1) holds true (see, e.g., [2, Corollary 16.48 p.277]), and we provide a simple example where Condition (C1) does not hold and D(prox^f_g) ⊊ H.
Remark 2.6 (Moreau-Rockafellar theorem). Let f, g ∈ Γ0(H) be such that dom(f) ∩ int(dom(g)) ≠ ∅. Then ∂(f + g) = ∂f + ∂g.

Example 2.7. Let us assume that H = R. We consider f = ι_{R−} and g(x) = ι_{R+}(x) − √x for all x ∈ R. In that case, one can easily check that dom(f) ∩ dom(g) = {0} ≠ ∅, ∂f(0) + ∂g(0) = ∅ ⊊ R = ∂(f + g)(0) and D(prox^f_g) = ∅ ⊊ H.
We are now in a position to state and prove the main result of the present paper.

Theorem 2.8. Let f, g ∈ Γ0(H) be such that dom(f) ∩ dom(g) ≠ ∅. If ∂(f + g) = ∂f + ∂g, then the decomposition formula

prox_{f+g} = prox_f ∘ prox^f_g   (6)

holds true. In other words, for every x ∈ H, we have prox_{f+g}(x) = prox_f(z) for all z ∈ prox^f_g(x).
Proof. Let x ∈ H and let y ∈ prox^f_g(x) be constructed as in the first part of the proof of Proposition 2.5. In particular it holds that prox_f(y) = prox_{f+g}(x). Let z ∈ prox^f_g(x). We know that x − y ∈ ∂g(prox_f(y)) and x − z ∈ ∂g(prox_f(z)). Since ∂g is a monotone operator, we obtain that

⟨(x − y) − (x − z), prox_f(y) − prox_f(z)⟩ ≥ 0.

From the cocoercivity (see for instance [2, Definition 4.10 p.72]) of the proximal operator, we obtain that

0 ≥ ⟨y − z, prox_f(y) − prox_f(z)⟩ ≥ ‖prox_f(y) − prox_f(z)‖² ≥ 0.

We deduce that prox_f(z) = prox_f(y) = prox_{f+g}(x). The proof is complete.
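The cocoercivity inequality used in this proof, ⟨y − z, prox_f(y) − prox_f(z)⟩ ≥ ‖prox_f(y) − prox_f(z)‖², can be checked numerically; the sketch below does so on H = R for the soft-thresholding map (an illustrative choice of f, namely f = |·|).

```python
# Numerical check of the cocoercivity of the proximal operator used in the
# proof of Theorem 2.8: <y - z, prox_f(y) - prox_f(z)> >= |prox_f(y) - prox_f(z)|^2,
# illustrated on H = R with prox_f the soft-thresholding map (f = |.|).

def prox_f(y):
    return max(abs(y) - 1.0, 0.0) * (1.0 if y > 0 else -1.0)

points = [i / 7.0 for i in range(-30, 31)]
for y in points:
    for z in points:
        d = prox_f(y) - prox_f(z)
        assert (y - z) * d >= d * d - 1e-12
```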
Remark 2.9. Let f, g ∈ Γ0(H) with dom(f) ∩ dom(g) ≠ ∅ and such that ∂(f + g) = ∂f + ∂g, and let x ∈ H. Theorem 2.8 states that, even if prox^f_g(x) is not a singleton, all elements of prox^f_g(x) have the same image under the proximal operator prox_f, and this value is equal to prox_{f+g}(x).
Remark 2.10. Let f, g ∈ Γ0(H) with dom(f) ∩ dom(g) ≠ ∅. Note that the additivity condition ∂(f + g) = ∂f + ∂g is not only sufficient, but also necessary, for the validity of the equality prox_{f+g} = prox_f ∘ prox^f_g. Indeed, from Proposition 2.5, if ∂f + ∂g ⊊ ∂(f + g), then there exists x ∈ H such that prox^f_g(x) = ∅ and thus prox_{f+g}(x) ≠ prox_f ∘ prox^f_g(x).
Remark 2.11. Let f, g ∈ Γ0(H) with dom(f) ∩ dom(g) ≠ ∅ and such that ∂(f + g) = ∂f + ∂g. From Theorem 2.8, we deduce that R(prox_{f+g}) ⊂ R(prox_f) ∩ R(prox_g). If the additivity condition ∂(f + g) = ∂f + ∂g is not satisfied, this remark does not hold true anymore. Indeed, in the framework of Example 2.7, we have R(prox_{f+g}) = {0} while 0 ∉ R(prox_g).
Example 2.12. Following the idea of Y.-L. Yu in [18, Example 2], let us consider H = R and f(x) = (1/2)x² for all x ∈ R. Since prox_{γf} = (1/(1+γ)) I for all γ ≥ 0 and prox^f_f = (2/3) I, we retrieve that

(1/3) I = prox_{2f} = prox_{f+f} = prox_f ∘ prox^f_f ≠ prox_f ∘ prox_f = (1/4) I,

which illustrates Theorem 2.8.
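Example 2.12 can be verified in a few lines; the closed forms below follow directly from the computation above.

```python
# Numerical check of Example 2.12: on H = R with f(x) = x^2 / 2 we have
# prox_f = I/2, prox^f_f = (2/3) I and prox_{f+f} = prox_{2f} = I/3, so the
# decomposition formula holds while prox_f o prox_f = I/4 differs.

def prox_gamma_f(x, gamma):
    return x / (1.0 + gamma)    # prox of gamma * x^2 / 2

def f_prox_f(x):
    return (2.0 / 3.0) * x      # (I + df o prox_f)^{-1} = (I + I/2)^{-1}

for x in [-2.0, 0.0, 1.0, 5.0]:
    lhs = prox_gamma_f(f_prox_f(x), 1.0)             # prox_f o prox^f_f
    assert abs(lhs - prox_gamma_f(x, 2.0)) < 1e-12   # equals prox_{2f}(x)
    if x != 0.0:
        # prox_f o prox_f = I/4 differs away from the origin
        assert abs(prox_gamma_f(prox_gamma_f(x, 1.0), 1.0) - lhs) > 0.0
```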
2.2 Properties
Let f, g ∈ Γ0(H). We know that prox^f_g is a generalization of prox_g, in the sense that prox^f_g = prox_g if f is constant for instance. In the next proposition, our aim is to provide more general sufficient (and necessary) conditions under which prox^f_g = prox_g. We will base our discussion on the following conditions:

∀x ∈ H, ∂g(x) ⊂ ∂g(prox_f(x)),   (C2)

∀x ∈ H, ∂g(prox_f(x)) ⊂ ∂g(x).   (C3)

Note that Condition (C2) was introduced by Y.-L. Yu in [18] as a sufficient condition under which prox_{f+g} = prox_f ∘ prox_g.
Proposition 2.13. Let f, g ∈ Γ0(H) with dom(f) ∩ dom(g) ≠ ∅.

(i) If Condition (C2) is satisfied, then prox_g(x) ∈ prox^f_g(x) for all x ∈ H.

(ii) If Conditions (C1) and (C3) are satisfied, then prox^f_g(x) = prox_g(x) for all x ∈ H.

In both cases, Condition (C1) is satisfied and the equality prox_{f+g} = prox_f ∘ prox_g holds true.
Proof. Let x ∈ H. If Condition (C2) is satisfied, considering y = prox_g(x), we get that x ∈ y + ∂g(y) ⊂ y + ∂g(prox_f(y)), and thus y ∈ prox^f_g(x). In particular, it holds that D(prox^f_g) = H, and thus Condition (C1) is satisfied from Proposition 2.5. Secondly, if Conditions (C1) and (C3) are satisfied, then D(prox^f_g) = H from Proposition 2.5. Considering y ∈ prox^f_g(x), we get that x ∈ y + ∂g(prox_f(y)) ⊂ y + ∂g(y), and thus y = prox_g(x). The last assertion of Proposition 2.13 directly follows from Theorem 2.8.
In the first item of Proposition 2.13, if prox^f_g is set-valued, we are in the situation where prox_g is a selection of prox^f_g. Proposition 2.15 specifies this selection in the case where ∂(f + g) = ∂f + ∂g.
Lemma 2.14. Let f, g ∈ Γ0(H) with dom(f) ∩ dom(g) ≠ ∅. Then prox^f_g(x) is a nonempty closed and convex subset of H for all x ∈ D(prox^f_g).

Proof. The proof of Lemma 2.14, which requires Proposition 3.2, is provided after the proof of that proposition.
Proposition 2.15. Let f, g ∈ Γ0(H) with dom(f) ∩ dom(g) ≠ ∅ and such that ∂(f + g) = ∂f + ∂g, and let x ∈ H. If prox_g(x) ∈ prox^f_g(x), then

prox_g(x) = proj_{prox^f_g(x)}(prox_{f+g}(x)).

Proof. If prox_g(x) ∈ prox^f_g(x), then x ∈ D(prox^f_g) and thus prox^f_g(x) is a nonempty closed and convex subset of H from Lemma 2.14. Let z ∈ prox^f_g(x). In particular we have prox_f(z) = prox_{f+g}(x) from Theorem 2.8. Using the facts that x − prox_g(x) ∈ ∂g(prox_g(x)) and x − z ∈ ∂g(prox_f(z)) = ∂g(prox_{f+g}(x)), together with the monotonicity of ∂g, we obtain that

⟨prox_{f+g}(x) − prox_g(x), z − prox_g(x)⟩ = ⟨prox_{f+g}(x) − prox_g(x), (x − prox_g(x)) − (x − z)⟩ ≤ 0.

Since prox_g(x) ∈ prox^f_g(x), we conclude the proof from the characterization of proj_{prox^f_g(x)}.
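Proposition 2.15 can be illustrated on Example 2.4 (f = g = |·| on H = R), where prox^f_g(x) is a closed interval and Condition (C2) can be checked to hold: prox_g(x) lies in prox^f_g(x) and coincides with the projection of prox_{f+g}(x) onto that interval. A sketch:

```python
# Check of Proposition 2.15 on Example 2.4 (f = g = |.| on H = R):
# prox_g(x) belongs to prox^f_g(x) and coincides with the projection of
# prox_{f+g}(x) onto the closed interval prox^f_g(x).

def soft(x, t):                 # prox of t * |.|
    return max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0)

def f_prox_g_interval(x):       # prox^f_g(x) from Example 2.4, as (lo, hi)
    if x <= -2:
        return (x + 1.0, x + 1.0)
    if x <= 0:
        return (-1.0, x + 1.0)
    if x <= 2:
        return (x - 1.0, 1.0)
    return (x - 1.0, x - 1.0)

for x in [-3.0, -1.5, -0.2, 0.5, 1.5, 3.0]:
    lo, hi = f_prox_g_interval(x)
    pg = soft(x, 1.0)                       # prox_g(x)
    assert lo - 1e-12 <= pg <= hi + 1e-12   # prox_g(x) in prox^f_g(x)
    proj = min(max(soft(x, 2.0), lo), hi)   # projection of prox_{f+g}(x)
    assert abs(proj - pg) < 1e-12
```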