Evolution equations for maximal monotone operators: asymptotic analysis in continuous and discrete time

Juan Peypouquet

Departamento de Matemática, Universidad Técnica Federico Santa María, Av. España 1680, Valparaíso, Chile

[email protected]

Sylvain Sorin

Équipe Combinatoire et Optimisation, CNRS FRE 3232, Faculté de Mathématiques, Université P. et M. Curie - Paris 6, 175 Rue du Chevaleret, 75013 Paris

and Laboratoire d'Économétrie, École Polytechnique, France

[email protected]

November 5, 2009

Abstract

This survey is devoted to the asymptotic behavior of solutions of evolution equations generated by maximal monotone operators in Hilbert spaces. The emphasis is on the comparison of continuous-time trajectories with sequences generated by implicit or explicit discrete-time schemes. The analysis covers weak convergence of the average process, weak convergence of the process itself, and strong convergence. The aim is to highlight the main ideas and to unify the proofs. Furthermore, a connection is made with the analysis in terms of almost orbits, which allows for a broader scope.

Introduction

Discrete and continuous dynamical systems governed by maximal monotone operators have a great number of applications in optimization, equilibrium, fixed-point theory and partial differential equations, among others. We are especially concerned with the connection between continuous-time and discrete-time models.

This connection occurs at two levels:

1. On a compact interval, one approximates continuous-time trajectories by interpolation of sequences computed via discretization. By considering vanishing step sizes, this construction is used to prove existence results and to approximate the trajectories numerically.

2. Another approximation is in the long term, where we compare asymptotic properties of a continuous trajectory to similar asymptotic properties of a given path defined inductively through a sequence of values and step sizes.

It is important to mention that some estimates (e.g. of Kobayashi type) can be useful for both purposes.

The literature on this subject is huge, but many of the arguments turn out to be essentially the same. Therefore, we intend to give a concise yet complete compendium of the available results, with an emphasis on the techniques and the way they enter the proofs.

Most of the properties will be established in the framework of Hilbert spaces since our aim is to underline unity in terms of tools and approach. A lot of results can be extended but, in most cases, additional specific assumptions are needed. With no aim for completeness, we have included several references to the corresponding results in Banach spaces that we think might be useful.

The paper is organized as follows. In Section 1 we recall the basic properties of maximal monotone operators along with some examples. Section 2 deals with the associated dynamic approach: we present the existence results for the differential inclusion $\dot u \in -Au$ and global properties of implicit and explicit discretizations. Section 3 establishes the convergence of the value $f(u)$ in the case of an operator of the form $A = \partial f$. In Section 4 we describe general results on weak convergence: tools, arguments, and characterization of the weak limits. Section 5 is devoted to weak convergence in average and Section 6 is concerned with weak convergence, especially for demipositive operators. In Section 7 we present the (mostly geometric) conditions ensuring that the convergence is strong. Section 8 deals with asymptotic equivalence and explains some apparently hidden relationships between certain continuous- and discrete-time dynamical systems. Finally, Section 9 contains some concluding remarks.

1 Preliminaries

The purpose of this section is to introduce notations and to recall basic results.

1.1 Monotone operators

Let $H$ be a real Hilbert space with inner product $\langle\cdot,\cdot\rangle$ and norm $\|\cdot\|$. An operator is a set-valued mapping $A : H \rightrightarrows H$ whose domain

$$D(A) = \{u \in H : Au \neq \emptyset\}$$

is nonempty. For convenience of notation, sometimes we will identify $A$ with its graph by writing $[u, u^*] \in A$ for $u^* \in Au$. The operator $A^{-1}$ is defined by its graph: $[u, u^*] \in A^{-1}$ if, and only if, $[u^*, u] \in A$.

An operator $A : H \rightrightarrows H$ is monotone if one has

$$\langle x^* - y^*, x - y\rangle \ge 0 \tag{1}$$

for all $[x, x^*], [y, y^*] \in A$.

A monotone operator is maximal if its graph is not properly contained in the graph of any other monotone operator. Observe that if $A$ is monotone (resp. maximal monotone) then so are $A^{-1}$ and $\lambda A$ for $\lambda > 0$.

Lemma 1 Let $A$ be a maximal monotone operator. A point $[x, x^*] \in H \times H$ belongs to the graph of $A$ if, and only if,

$$\langle x^* - u^*, x - u\rangle \ge 0 \quad \text{for all } [u, u^*] \in A.$$

Proof. If $[x, x^*] \in A$ the inequality holds by monotonicity. Conversely, if the inequality holds but $[x, x^*] \notin A$, then the set $A \cup \{[x, x^*]\}$ is the graph of a monotone operator that properly extends $A$, which contradicts maximality.

An operator $A : H \rightrightarrows H$ is nonexpansive if one has

$$\|x^* - y^*\| \le \|x - y\| \tag{2}$$

for all $[x, x^*], [y, y^*] \in A$. Observe that a nonexpansive operator is single-valued on its domain.

Let $I$ be the identity mapping on $H$. For $\lambda > 0$, the resolvent of $A$ is the operator $J_\lambda^A = (I + \lambda A)^{-1}$.

Theorem 2 Let $A : H \rightrightarrows H$. Then

i) $A$ is monotone if, and only if, $J_\lambda^A$ is nonexpansive for each $\lambda > 0$.

ii) A monotone operator $A$ is maximal if, and only if, $I + \lambda A$ is surjective for each $\lambda > 0$.

Proof.

i) Let $A$ be monotone, $[x, x^*], [y, y^*] \in A$ and $\lambda > 0$. Inequality (1) implies

$$\|x - y\| \le \|x - y + \lambda(x^* - y^*)\| \quad \forall \lambda \ge 0, \tag{3}$$

which is the nonexpansiveness of $J_\lambda^A$.

Conversely, squaring (3) leads to

$$2\lambda\langle x^* - y^*, x - y\rangle + \lambda^2\|x^* - y^*\|^2 \ge 0,$$

which implies (1) upon dividing by $\lambda$ and letting $\lambda \to 0$.

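ii) It is enough to prove the result for $\lambda = 1$. Given $z_0 \in H$, we will find $x_0 \in H$ such that $\langle x^* - (z_0 - x_0), x - x_0\rangle \ge 0$ for all $[x, x^*] \in A$, so that maximality of $A$ implies $z_0 - x_0 \in Ax_0$. For $[x, x^*] \in A$, define the weakly compact set $C_{x,x^*}$ by

$$C_{x,x^*} = \{x_0 \in H : \langle x^* + x_0 - z_0, x - x_0\rangle \ge 0\}.$$

It suffices to show that the family $\{C_{x,x^*}\}_{[x,x^*] \in A}$ has the finite intersection property. To this end take $[x_i, x_i^*] \in A$ for $i = 1, \dots, n$. Let $\Delta = \{(\lambda_1, \dots, \lambda_n) : \lambda_i \ge 0,\ \sum_{i=1}^n \lambda_i = 1\}$ denote the $n$-dimensional simplex and consider the function $f : \Delta \times \Delta \to \mathbb{R}$ given by

$$f(\lambda, \mu) = \sum_{i=1}^n \mu_i \langle x_i^* + x(\lambda) - z_0, x(\lambda) - x_i\rangle \quad \text{with } x(\lambda) = \sum_{i=1}^n \lambda_i x_i.$$

Clearly $f(\cdot, \mu)$ is convex and continuous while $f(\lambda, \cdot)$ is linear. The Min-Max Theorem (see, for instance, Theorem 1.1 in [19, Brézis]) implies the existence of $\lambda_0 \in \Delta$ such that

$$\max_{\mu \in \Delta} f(\lambda_0, \mu) = \max_{\mu \in \Delta} \min_{\lambda \in \Delta} f(\lambda, \mu) \le \max_{\mu \in \Delta} f(\mu, \mu).$$

Now monotonicity of $A$ implies

$$\begin{aligned}
f(\mu, \mu) &= \sum_{i=1}^n \mu_i \langle x_i^*, x(\mu) - x_i\rangle + \langle x(\mu) - z_0, x(\mu) - x(\mu)\rangle \\
&= \sum_{i,j=1}^n \mu_i \mu_j \langle x_i^*, x_j - x_i\rangle \\
&= \tfrac{1}{2} \sum_{i,j=1}^n \mu_i \mu_j \langle x_i^* - x_j^*, x_j - x_i\rangle \le 0,
\end{aligned}$$

so that $f(\lambda_0, \mu) \le 0$ for all $\mu \in \Delta$. Taking for $\mu$ the extreme points of $\Delta$ we get

$$\langle x_i^* + x(\lambda_0) - z_0, x(\lambda_0) - x_i\rangle \le 0 \quad \text{for all } i,$$

which is $x(\lambda_0) \in \bigcap_{i=1}^n C_{x_i, x_i^*}$.

Conversely, take $[u, u^*] \in H \times H$ such that $\langle u^* - v^*, u - v\rangle \ge 0$ for all $[v, v^*] \in A$. Since $I + A$ is surjective, there is $[v, v^*] \in A$ such that $v + v^* = u + u^*$. Then

$$\langle u^* - v^*, u - v\rangle = -\|u - v\|^2 \ge 0,$$

which implies $u = v$, $u^* = v^*$ and $[u, u^*] \in A$.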

Comments

The study of monotone operators started in [47, Minty]. See also [37, Kato] for part i) in Banach spaces. The if part in ii) holds in Banach spaces, essentially by the same arguments. The proof presented above for the only if part can be found in [19, Brézis]. This result does not hold in general Banach spaces (see [36, Hirsch]).

1.2 Examples and properties

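Example 1 Let $\Gamma_0(H)$ denote the set of all proper, lower-semicontinuous convex functions $f : H \to \mathbb{R} \cup \{+\infty\}$. For $f \in \Gamma_0(H)$, the subdifferential of $f$ is the operator $\partial f : H \rightrightarrows H$ defined by

$$\partial f(x) = \{x^* \in H : f(z) \ge f(x) + \langle x^*, z - x\rangle \text{ for all } z \in H\}.$$

To see that it is monotone, take $x^* \in \partial f(x)$ and $y^* \in \partial f(y)$. Thus

$$f(y) \ge f(x) + \langle x^*, y - x\rangle, \qquad f(x) \ge f(y) + \langle y^*, x - y\rangle,$$

and adding these two inequalities we obtain $\langle x^* - y^*, x - y\rangle \ge 0$.

For maximality, according to Theorem 2 it suffices to prove that for each $y \in H$ and each $\lambda > 0$ there is $x_\lambda \in D(\partial f)$ such that $y \in x_\lambda + \lambda \partial f(x_\lambda)$. Indeed, consider the Moreau-Yosida approximation of $f$ at $y$, which is the function $f_\lambda$ defined by

$$f_\lambda(x) = f(x) + \frac{1}{2\lambda}\|x - y\|^2. \tag{4}$$

It is proper, lower-semicontinuous, strongly convex and coercive (due to the quadratic term and the fact that $f$ has an affine minorant). Its unique minimizer $x_\lambda$ satisfies

$$0 \in \partial f_\lambda(x_\lambda) = \partial f(x_\lambda) + \frac{1}{\lambda}(x_\lambda - y).$$

That is, $y \in x_\lambda + \lambda \partial f(x_\lambda)$.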
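As a concrete illustration of the inclusion just obtained, the following Python sketch takes $H = \mathbb{R}$ and $f(x) = |x|$. This choice is ours and is not part of the survey; in this case the minimizer of (4) is the soft-thresholding map. The helper names `prox_abs`, `in_subdiff_abs` and `satisfies_inclusion` are hypothetical.

```python
def prox_abs(y: float, lam: float) -> float:
    """Minimizer of |x| + (x - y)**2 / (2 * lam) over the real line."""
    if y > lam:
        return y - lam
    if y < -lam:
        return y + lam
    return 0.0


def in_subdiff_abs(x: float, g: float, tol: float = 1e-9) -> bool:
    """True if g belongs (up to tol) to the subdifferential of |.| at x."""
    if x > 0:
        return abs(g - 1.0) <= tol
    if x < 0:
        return abs(g + 1.0) <= tol
    return -1.0 - tol <= g <= 1.0 + tol


def satisfies_inclusion(y: float, lam: float) -> bool:
    """Check that y lies in x + lam * subdiff f(x) for x = prox_abs(y, lam)."""
    x = prox_abs(y, lam)
    return in_subdiff_abs(x, (y - x) / lam)
```

Calling `satisfies_inclusion(y, lam)` on a few pairs is a quick sanity check that the minimizer of the Moreau-Yosida approximation does solve $y \in x_\lambda + \lambda\partial f(x_\lambda)$ in this one-dimensional case.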

Example 2 Let $A$ be monotone, single-valued and continuous on $D(A) = H$. Then $A$ is maximal.

Indeed, from $\langle u - Ay, x - y\rangle \ge 0$ for all $y \in H$ one deduces, with $y = x - tw$, that $\langle u - A(x - tw), w\rangle \ge 0$ for all $t > 0$ and all $w \in H$. By letting $t \to 0$ we obtain $\langle u - Ax, w\rangle \ge 0$ for all $w \in H$, so that $u = Ax$.

Example 3 Let $C$ be a nonempty subset of $H$ and let $T : C \to H$ be nonexpansive, thus single-valued on $C$. The operator $A = I - T$ is monotone because

$$\begin{aligned}
\langle Ax - Ay, x - y\rangle &= \|x - y\|^2 - \langle Tx - Ty, x - y\rangle \\
&\ge \|x - y\| \big[\|x - y\| - \|Tx - Ty\|\big] \\
&\ge 0.
\end{aligned}$$

If $C = H$ maximality is given in Example 2. Otherwise, $T$ can be extended to a nonexpansive function defined on all of $H$, so that $A$ is not maximal. If $C$ is closed and convex this extension is easily constructed by considering $\tilde T = T \circ P_C$, where $P_C$ denotes the orthogonal projection onto $C$. Notice that if $T : C \to C$ then $\tilde T$ has no fixed points outside of $C$. Pioneering works on the extension of Lipschitz functions on general sets are [46, 38, 66, 67], but the interested reader can also consult [31] for an updated survey on the topic.

It is important to point out that this lack of maximality when $C \subsetneq H$ is not a serious drawback, as we shall see later on (see, for instance, Remark 5).

The set of zeroes of $A$ is

$$S = A^{-1}0 = \{x \in H : 0 \in Ax\}.$$

This set is relevant in optimization and fixed-point theory:

• If $A = I - T$, where $T$ is a nonexpansive mapping, then $S$ is the set of fixed points of $T$.

• If $A = \partial f$, where $f$ is a proper lower-semicontinuous convex function, then $S$ is the set of minimizers of $f$.

Let us describe some topological consequences of maximal monotonicity.

Proposition 3 Let $A$ be maximal monotone. For each $x \in H$, the set $Ax$ is closed and convex. In particular, $S$ is closed and convex.

Proof. Lemma 1 implies that

$$Ax = \{x^* \in H : \langle x^* - u^*, x - u\rangle \ge 0 \text{ for all } [u, u^*] \in A\},$$

hence $Ax$ is closed and convex. Since $A^{-1}$ is maximal monotone and $S = A^{-1}0$, the set $S$ is closed and convex.

Proposition 4 Let $A$ be a maximal monotone operator. Then $A$ is sequentially weak-strong and strong-weak closed.

Proof. Take sequences $\{x_n\}$ and $\{x_n^*\}$ in $H$ such that $[x_n, x_n^*] \in A$ for each $n \in \mathbb{N}$ and suppose that $x_n \to x$ and $x_n^* \rightharpoonup x^*$ as $n \to \infty$ (consider $A^{-1}$ for the other case). To prove that $[x, x^*] \in A$, recall that by monotonicity, for all $[u, u^*] \in A$ and all $n \in \mathbb{N}$, we have $\langle x_n^* - u^*, x_n - u\rangle \ge 0$. Letting $n \to \infty$, the convergence assumptions imply that $\langle x^* - u^*, x - u\rangle \ge 0$ for all $[u, u^*] \in A$. Hence $[x, x^*] \in A$ by Lemma 1.

Remark 5 If $C \subset H$ is closed and convex, $T : C \to C$ is nonexpansive and $A = I - T$, the conclusions in Propositions 3 and 4 are true, even if $A$ is not maximal ($C \subsetneq H$).

2 Dynamic approach

The forthcoming sections address, among other things, the issue of finding zeroes of a (maximal) monotone operator $A$. The strategy is the following: we shall consider some continuous and discrete dynamical systems whose trajectories may converge, in some sense and under some conditions, to points in $S = A^{-1}0$. In this section we present these systems along with some relevant properties.

From now on we assume that A is a maximal monotone operator.

2.1 Differential inclusion

Let us take $x \in D(A)$ and consider the following differential inclusion:

$$\begin{cases} -\dot u(t) \in Au(t) & \text{a.e. on } (0, \infty) \\ u(0) = x. \end{cases} \tag{5}$$

A solution of (5) is an absolutely continuous function $u$ from $\mathbb{R}^+$ to $H$ satisfying these two conditions.

Observe that $S$ is precisely the set of rest points of (5).

Monotonicity implies the following dissipative property:

Lemma 6 Let $u_1$ and $u_2$ be absolutely continuous functions satisfying $\dot u_i(t) \in -Au_i(t)$ almost everywhere on $(0, T)$. Then the function $t \mapsto \|u_1(t) - u_2(t)\|$ is decreasing on $(0, T)$.

Proof. For $t \in (0, T)$ define $\theta(t) = \frac{1}{2}\|u_1(t) - u_2(t)\|^2$. The hypotheses give $\dot\theta(t) = \langle \dot u_1(t) - \dot u_2(t), u_1(t) - u_2(t)\rangle \le 0$ for almost every $t$.

Immediate consequences are the following:

Corollary 7 Let $y \in S$ and $u$ be a solution of (5). Then $\lim_{t \to \infty} \|u(t) - y\|$ exists.

Corollary 8 There is at most one solution of (5).

Another aspect of dissipativity is the next property:

Proposition 9 The speed $\|\dot u(t)\|$ is decreasing.

Proof. Lemma 6 implies that for any $h > 0$ and $s < t$

$$\|u(t + h) - u(t)\| \le \|u(s + h) - u(s)\|.$$

We conclude by dividing by $h$ and taking the limit as $h \to 0$.

A basic inequality is the following:

Proposition 10 Let $u$ satisfy (5) and $[v, w] \in A$. Then

$$\|u(t) - v\|^2 - \|u(0) - v\|^2 \le 2\int_0^t \langle w, v - u(s)\rangle\, ds. \tag{6}$$

Proof. Write

$$\|u(t) - v\|^2 - \|u(0) - v\|^2 = 2\int_0^t \langle \dot u(s), u(s) - v\rangle\, ds.$$

By monotonicity, we have $\langle \dot u(s), u(s) - v\rangle \le \langle -w, u(s) - v\rangle$, whence the result.

This is the idea behind the definition of integral solution introduced in [17] (see the proof of Theorem 19).

We shall present two approaches for the existence of a solution of (5). The first one uses theYosida approximation and is the best-known in the theory of optimization in Hilbert spaces. The second one uses proximal sequences to approximate the function u. It is popular in the field of partial differential equations since it works naturally in arbitrary Banach spaces. Since it is less known in the optimization community we present it in detail.

But before doing so, and assuming for a moment that the differential inclusion (5) does have a solution, observe that by Lemma 6, for each $t \ge 0$ the mapping $x \mapsto u(t)$ defines a nonexpansive function from $D(A)$ to itself that can be continuously extended to a map $S_t$ from $\overline{D(A)}$ to itself.

The family $\{S_t\}_{t \ge 0}$ is the semi-group generated by $A$ and satisfies:

i) $S_0 = I$ and $S_t \circ S_r = S_{t+r}$;

ii) $\|S_t x - S_t y\| \le \|x - y\|$;

iii) $\lim_{t \to 0} \|x - S_t x\| = 0$.

Reciprocally, given a continuous semi-group of contractions, i.e. satisfying i), ii) and iii), from a closed convex subset $C$ to itself, there exists a generator, namely a maximal monotone operator $A$ with $C = \overline{D(A)}$ such that $S_t x$ coincides with $u(t)$ for $x \in D(A)$; see [19, Brézis].

We will use hereafter both notations $u(t)$ and $S_t x$.

2.2 Approach through the Yosida approximation.

2.2.1 The Yosida approximation

Recall that the resolvent is $J_\lambda^A = (I + \lambda A)^{-1}$. The Yosida approximation of $A$ is the single-valued maximal monotone operator $A_\lambda$, $\lambda > 0$, defined by

$$A_\lambda = \frac{1}{\lambda}(I - J_\lambda^A).$$

Since $J_\lambda^A$ is nonexpansive and everywhere defined, $A_\lambda$ is monotone (see Example 3 above) and maximal (using Lemma 1). It is also clear that $A_\lambda$ is Lipschitz-continuous with constant $2/\lambda$. Observe that $S = A^{-1}0 = A_\lambda^{-1}0$ for all $\lambda > 0$.

Recall that $P_C x$ denotes the orthogonal projection of a point $x \in H$ onto a nonempty closed convex set $C \subset H$. The minimal section of $A$ is the operator $A^0$ defined by $A^0 x = P_{Ax}0$, which is clearly monotone but not necessarily maximal.

The following results summarize the main properties of the resolvent and the Yosida approximation. They can be found in [19, Brézis] (see also [13, Barbu] for Banach spaces).

Proposition 11 With the notation introduced above we have the following:

1. $A_\lambda x \in AJ_\lambda^A x$.

2. $\|A_\lambda x\| \le \|A^0 x\|$, $\|A_\lambda x\|$ is nonincreasing in $\lambda$ and $\lim_{\lambda \to 0} \|A_\lambda x\| = \|A^0 x\|$.

3. $\lim_{\lambda \to 0} J_\lambda^A x = x$.

4. If $x_\lambda \to x$ and $A_\lambda x_\lambda$ remains bounded as $\lambda \to 0$, then $x \in D(A)$. Moreover, if $y$ is a cluster point of $A_\lambda x_\lambda$ as $\lambda \to 0$, then $y \in Ax$.

5. $A^0$ characterizes $A$ in the following sense: if $A$ and $B$ are maximal monotone with common domain and $A^0 = B^0$, then $A = B$.

6. $\lim_{\lambda \to 0} A_\lambda x = A^0 x$ and $\overline{D(A)}$, the (strong) closure of $D(A)$, is convex.
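To make the resolvent, the Yosida approximation and the minimal section tangible, here is a small Python sketch for the illustrative choice $H = \mathbb{R}$ and $A = \partial|\cdot|$ (our assumption, not an example taken from the survey). In that case $A_\lambda x$ is $x/\lambda$ clipped to $[-1, 1]$ and $A^0 x$ is the sign of $x$. The function names are hypothetical.

```python
import math


def resolvent_abs(x: float, lam: float) -> float:
    """J_lam x = (I + lam * A)^{-1} x for A = subdifferential of |.| on the real line."""
    return math.copysign(max(abs(x) - lam, 0.0), x)


def yosida_abs(x: float, lam: float) -> float:
    """Yosida approximation A_lam x = (x - J_lam x) / lam."""
    return (x - resolvent_abs(x, lam)) / lam


def minimal_section_abs(x: float) -> float:
    """A^0 x: the element of least absolute value in the subdifferential of |.| at x."""
    if x == 0:
        return 0.0
    return math.copysign(1.0, x)
```

Evaluating `yosida_abs(0.3, lam)` for decreasing values of `lam` shows the behavior described in item 2: the absolute value grows as `lam` shrinks, never exceeds `abs(minimal_section_abs(0.3))`, and reaches it (up to rounding) once `lam` is below `0.3`.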

2.2.2 The existence result

The main result is the following:

Theorem 12 There exists a unique absolutely continuous function $u : [0, +\infty) \to H$ satisfying (5). Moreover,

1. $\dot u \in L^\infty(0, \infty; H)$ with $\|\dot u(t)\| \le \|A^0 x\|$ almost everywhere.

2. $u(t) \in D(A)$ for all $t \ge 0$ and $\|A^0 u(t)\|$ decreases.

3. $A^0 u(t)$ is continuous from the right and $u(t)$ admits a right-hand derivative for all $t \ge 0$; namely $\dot u(t^+) = -A^0 u(t)$ (lazy behavior).

The problem of finding a trajectory satisfying (5) was first posed and studied in [41, Komura] and [30, Crandall and Pazy]. The classical proof of Theorem 12 above can be found in [19, Brézis]. The idea is to consider the differential inclusion (5) with $A$ replaced by $A_\lambda$, which has a solution $u_\lambda$ by virtue of the Cauchy-Lipschitz-Picard Theorem. Then one proves first that, as $\lambda \to 0$, $u_\lambda$ converges uniformly on compact intervals to some $u$, and then that $u$ satisfies (5) for the original $A$. The following estimate plays a crucial role in the proof and is interesting in its own right:

$$\|u_\lambda(t) - u(t)\| \le 2\|A^0 x\|\sqrt{\lambda t}. \tag{7}$$

Finally $u$ is proved to have the properties enumerated in Theorem 12.

Comments

The same method can be extended to Banach spaces $X$ such that both $X$ and $X^*$ are uniformly convex (see [37, Kato]).

2.3 Approach through proximal sequences.

2.3.1 Proximal sequences

Given a sequence $\{\lambda_n\}$ of positive numbers, or step sizes, a sequence $\{x_n\}$ is proximal if it satisfies

$$\begin{cases} \dfrac{x_n - x_{n-1}}{\lambda_n} \in -Ax_n & \text{for all } n \ge 1 \\ x_0 \in H. \end{cases} \tag{8}$$

In other words,

$$x_n = (I + \lambda_n A)^{-1} x_{n-1} = J_{\lambda_n}^A x_{n-1}. \tag{9}$$

If $A$ is maximal monotone, the existence of such a sequence follows from Theorem 2. Observe that the first inclusion in (8) can be seen as an implicit discretization of the differential inclusion (5), also called a backward scheme. The velocity at stage $n$ is

$$y_n = \frac{x_n - x_{n-1}}{\lambda_n}. \tag{10}$$

Comments

The notion of proximal sequences and the term proximal were introduced in [49, Moreau] for $f \in \Gamma_0(H)$ and $A = \partial f$. In that case, finding $x_n$ corresponds to minimizing the Moreau-Yosida approximation

$$f_{\lambda_n}(x) = f(x) + \frac{1}{2\lambda_n}\|x - x_{n-1}\|^2$$

of $f$ at $x_{n-1}$ (see (4)).

Monotonicity implies the following properties:

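Lemma 13 The sequence $\|y_n\|$ is decreasing.

Proof. Since $-y_n \in Ax_n$ and $-y_{n-1} \in Ax_{n-1}$, monotonicity gives $\langle y_n - y_{n-1}, x_n - x_{n-1}\rangle \le 0$. As $x_n - x_{n-1} = \lambda_n y_n$, this implies $\langle y_n - y_{n-1}, y_n\rangle \le 0$ and therefore $\|y_n\| \le \|y_{n-1}\|$.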

This is the counterpart of Proposition 9, which states that the speed of the continuous-time trajectory given by (5) decreases.

Proposition 14 For any $[x, y] \in A$

$$\|x_{n-1} - x\|^2 \ge \|x_{n-1} - x_n\|^2 + \|x_n - x\|^2 + 2\lambda_n \langle y, x_n - x\rangle. \tag{11}$$

Proof. Simply observe that

$$\|x_{n-1} - x\|^2 = \|x_{n-1} - x_n\|^2 + \|x_n - x\|^2 + 2\langle x_{n-1} - x_n, x_n - x\rangle \tag{12}$$

and $\langle x_{n-1} - x_n, x_n - x\rangle \ge \langle \lambda_n y, x_n - x\rangle$ by monotonicity.

This is the counterpart of (6).

In particular one has:

Lemma 15 Let $x \in S$. Then $\|x_n - x\|^2 + \lambda_n^2 \|y_n\|^2 \le \|x_{n-1} - x\|^2$.

An immediate consequence is the following:

Corollary 16 Let $x \in S$. The sequence $\|x_n - x\|^2$ is decreasing, thus convergent.

Notice the similarity with Corollary 7.
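The monotonicity properties of proximal sequences stated above can be observed numerically. The sketch below is only an illustration under assumptions of our own: we identify $\mathbb{R}^2$ with the complex plane (real inner product $\langle u, v\rangle = \mathrm{Re}(u\bar v)$) and take the linear operator $Az = cz$ with $\mathrm{Re}\, c \ge 0$, which is monotone, continuous and everywhere defined, so that its resolvent is simply division by $1 + \lambda c$ and, for $c \ne 0$, its only zero is the origin. The function names are hypothetical.

```python
def proximal_sequence(c: complex, x0: complex, steps):
    """Proximal (backward) iterates x_n = (I + lam_n A)^{-1} x_{n-1} for A z = c * z."""
    xs = [x0]
    for lam in steps:
        xs.append(xs[-1] / (1 + lam * c))
    return xs


def velocities(xs, steps):
    """Discrete velocities y_n = (x_n - x_{n-1}) / lam_n for n = 1, 2, ..."""
    return [(xs[n] - xs[n - 1]) / steps[n - 1] for n in range(1, len(xs))]
```

For instance, with `c = 0.2 + 1j` and steps `1/n`, the moduli of the velocities returned by `velocities` should be nonincreasing, and the squared distances to the origin should decrease by at least `lam_n**2 * abs(y_n)**2` at each step.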

2.3.2 Kobayashi inequality

The following inequality, from [39, Kobayashi], provides an estimate for the distance between two proximal sequences $\{x_k\}$ and $\{\hat x_l\}$, with step sizes $\{\lambda_k\}$ and $\{\hat\lambda_l\}$, respectively.

We use the following notation throughout the paper:

$$\sigma_k = \sum_{i=1}^k \lambda_i \quad \text{and} \quad \tau_k = \sum_{i=1}^k \lambda_i^2$$

(similarly for $\hat\sigma_l$ and $\hat\tau_l$).

Proposition 17 (Kobayashi inequality) Let $\{x_k\}$ and $\{\hat x_l\}$ be two proximal sequences. If $u \in D(A)$, then

$$\|x_k - \hat x_l\| \le \|x_0 - u\| + \|\hat x_0 - u\| + \|A^0 u\|\sqrt{(\sigma_k - \hat\sigma_l)^2 + \tau_k + \hat\tau_l}. \tag{13}$$

We first prove the following auxiliary result:

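Lemma 18 Let $[u_1, v_1], [u_2, v_2] \in A$ and $\lambda, \mu > 0$. Then

$$(\lambda + \mu)\|u_1 - u_2\| \le \lambda\|u_2 + \mu v_2 - u_1\| + \mu\|u_1 + \lambda v_1 - u_2\|.$$

Proof. Write $\Delta u = u_1 - u_2$. Then

$$\begin{aligned}
(\lambda + \mu)\|u_1 - u_2\|^2 &= \lambda\langle u_2 - u_1, -\Delta u\rangle + \mu\langle u_1 - u_2, \Delta u\rangle \\
&= \lambda\langle u_2 + \mu v_2 - u_1, -\Delta u\rangle + \mu\langle u_1 + \lambda v_1 - u_2, \Delta u\rangle + \lambda\mu\langle v_2 - v_1, u_1 - u_2\rangle \\
&\le \big[\lambda\|u_2 + \mu v_2 - u_1\| + \mu\|u_1 + \lambda v_1 - u_2\|\big]\,\|u_1 - u_2\|
\end{aligned}$$

by monotonicity and the Cauchy-Schwarz inequality.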

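Proof of Proposition 17. To simplify notation set $c_{k,l} = \sqrt{(\sigma_k - \hat\sigma_l)^2 + \tau_k + \hat\tau_l}$. The proof will use induction on the pair $(k, l)$.

First, let us establish inequality (13) for the pair $(k, 0)$ with $k \ge 0$. Monotonicity implies, using (3), that for any $u \in D(A)$

$$\|x_1 - u\| \le \|x_1 - u + \lambda_1(-y_1 - A^0 u)\| = \|x_0 - u - \lambda_1 A^0 u\|,$$

so that

$$\|x_1 - u\| \le \|x_0 - u\| + \lambda_1\|A^0 u\|.$$

Inductively we obtain

$$\|x_k - u\| \le \|x_0 - u\| + \sigma_k\|A^0 u\|.$$

Thus

$$\begin{aligned}
\|x_k - \hat x_0\| &\le \|x_k - u\| + \|u - \hat x_0\| \\
&\le \|x_0 - u\| + \sigma_k\|A^0 u\| + \|\hat x_0 - u\| \\
&\le \|x_0 - u\| + \|\hat x_0 - u\| + c_{k,0}\|A^0 u\|
\end{aligned}$$

because $\sigma_k \le c_{k,0}$. In a similar fashion we prove the inequality for $(0, l)$ with $l \ge 0$.

Now suppose (13) holds for $(k-1, l)$ and $(k, l-1)$. According to Lemma 18, applied to $[x_k, -y_k]$ and $[\hat x_l, -\hat y_l]$,

$$(\lambda_k + \hat\lambda_l)\|x_k - \hat x_l\| \le \lambda_k\|\hat x_l - \hat\lambda_l \hat y_l - x_k\| + \hat\lambda_l\|x_k - \lambda_k y_k - \hat x_l\|.$$

Since $x_k - \lambda_k y_k = x_{k-1}$ and $\hat x_l - \hat\lambda_l \hat y_l = \hat x_{l-1}$ by (10), setting $\alpha_{k,l} = \frac{\hat\lambda_l}{\lambda_k + \hat\lambda_l}$ and $\beta_{k,l} = 1 - \alpha_{k,l} = \frac{\lambda_k}{\lambda_k + \hat\lambda_l}$ we have

$$\begin{aligned}
\|x_k - \hat x_l\| &\le \alpha_{k,l}\|x_{k-1} - \hat x_l\| + \beta_{k,l}\|\hat x_{l-1} - x_k\| \\
&\le \alpha_{k,l}\big[\|x_0 - u\| + \|\hat x_0 - u\| + c_{k-1,l}\|A^0 u\|\big] + \beta_{k,l}\big[\|x_0 - u\| + \|\hat x_0 - u\| + c_{k,l-1}\|A^0 u\|\big] \\
&= \|x_0 - u\| + \|\hat x_0 - u\| + \big[\alpha_{k,l}c_{k-1,l} + \beta_{k,l}c_{k,l-1}\big]\|A^0 u\|.
\end{aligned} \tag{14}$$

It only remains to verify that

$$\alpha_{k,l}c_{k-1,l} + \beta_{k,l}c_{k,l-1} \le c_{k,l}. \tag{15}$$

The Cauchy-Schwarz inequality implies

$$\begin{aligned}
\alpha_{k,l}c_{k-1,l} + \beta_{k,l}c_{k,l-1} &= \alpha_{k,l}^{1/2}\big(\alpha_{k,l}^{1/2}c_{k-1,l}\big) + \beta_{k,l}^{1/2}\big(\beta_{k,l}^{1/2}c_{k,l-1}\big) \\
&\le (\alpha_{k,l} + \beta_{k,l})^{1/2}\big(\alpha_{k,l}c_{k-1,l}^2 + \beta_{k,l}c_{k,l-1}^2\big)^{1/2} \\
&= \big(\alpha_{k,l}c_{k-1,l}^2 + \beta_{k,l}c_{k,l-1}^2\big)^{1/2}.
\end{aligned}$$

On the other hand, notice that $c_{k-1,l}^2 = c_{k,l}^2 - 2\lambda_k(\sigma_k - \hat\sigma_l)$, while $c_{k,l-1}^2 = c_{k,l}^2 + 2\hat\lambda_l(\sigma_k - \hat\sigma_l)$. Hence, since $\alpha_{k,l}\lambda_k = \beta_{k,l}\hat\lambda_l$,

$$\begin{aligned}
\big(\alpha_{k,l}c_{k-1,l} + \beta_{k,l}c_{k,l-1}\big)^2 &\le \alpha_{k,l}c_{k-1,l}^2 + \beta_{k,l}c_{k,l-1}^2 \\
&= \alpha_{k,l}c_{k,l}^2 + \beta_{k,l}c_{k,l}^2 - 2(\alpha_{k,l}\lambda_k - \beta_{k,l}\hat\lambda_l)(\sigma_k - \hat\sigma_l) \\
&= c_{k,l}^2.
\end{aligned}$$

Inequalities (14) and (15) give (13).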
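The Kobayashi inequality (13) can be checked numerically on a toy problem. The following Python sketch rests on assumptions of our own: the plane is identified with the complex numbers and $Az = cz$ with $\mathrm{Re}\, c \ge 0$, a single-valued maximal monotone operator for which $A^0 u = cu$. The names `proximal_sequence` and `kobayashi_slack` are hypothetical.

```python
import math
from itertools import accumulate


def proximal_sequence(c, x0, steps):
    """Proximal iterates x_n = x_{n-1} / (1 + lam_n * c) for A z = c * z."""
    xs = [x0]
    for lam in steps:
        xs.append(xs[-1] / (1 + lam * c))
    return xs


def kobayashi_slack(c, x0, xh0, steps, steps_hat, u):
    """Smallest value of (right-hand side - left-hand side) of (13) over all pairs (k, l)."""
    xs = proximal_sequence(c, x0, steps)
    xhs = proximal_sequence(c, xh0, steps_hat)
    # sigma_k and tau_k are the partial sums of the steps and of their squares.
    sigma = [0.0] + list(accumulate(steps))
    tau = [0.0] + list(accumulate(s * s for s in steps))
    sigma_h = [0.0] + list(accumulate(steps_hat))
    tau_h = [0.0] + list(accumulate(s * s for s in steps_hat))
    norm_A0u = abs(c * u)  # A is single-valued here, so A^0 u = A u = c * u
    slack = math.inf
    for k in range(len(xs)):
        for l in range(len(xhs)):
            radius = math.sqrt((sigma[k] - sigma_h[l]) ** 2 + tau[k] + tau_h[l])
            rhs = abs(x0 - u) + abs(xh0 - u) + norm_A0u * radius
            slack = min(slack, rhs - abs(xs[k] - xhs[l]))
    return slack
```

A nonnegative return value (up to rounding) means that (13) held for every pair of indices tested. The pure rotation `c = 1j` is a convenient test case because the operator is then not strongly monotone.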

Comments

Kobayashi’s original inequality also accounts for possible errors in the determination of the proximal sequence, see [39, Kobayashi]. Nonautonomous versions of the inequality can be found in [40, Kobayasi, Kobayashi and Oharu] or [2, Alvarez and Peypouquet].

2.3.3 The existence result

In general Banach spaces, existence and uniqueness of a solution of (5) can also be derived by the following method from [29, Crandall and Liggett], based on the resolvent.

Take $t \in [0, T]$, $m \in \mathbb{N}$ and consider a proximal sequence with constant step sizes $\lambda_k \equiv t/m$. The $m$-th iteration defines a function

$$u_m(t) = \Big(I + \frac{t}{m}A\Big)^{-m} x.$$

Repeat the procedure for each $m$ to obtain a sequence $\{u_m(t)\}$ of functions from $[0, T]$ to $H$.

Theorem 19 The sequence $\{u_m(t)\}$ defined above converges to some $u(t)$ uniformly on every compact interval $[0, T]$. Moreover, the function $t \mapsto u(t)$ satisfies (5).

Proof. Instead of the original proof from [29, Crandall and Liggett] we present an easier one using Kobayashi's inequality (13)¹. Fix $N, M \in \mathbb{N}$ and $t, s \in [0, T]$ with $T > 0$. Consider two proximal sequences with $\lambda_k = t/N$ and $\hat\lambda_l = s/M$ for all $k, l$. Initialize $x_k$ and $\hat x_l$ both at $x$. Note that $x_N = u_N(t)$ and $\hat x_M = u_M(s)$, hence

$$\|u_N(t) - u_M(s)\| \le \|A^0 x\|\sqrt{(t - s)^2 + \frac{T^2}{N} + \frac{T^2}{M}}.$$

Thus the sequence $\{u_n\}$ converges uniformly on $[0, T]$ to a function $u$, which is globally Lipschitz-continuous with constant $\|A^0 x\|$.

In order to prove that the function $u$ satisfies (5) it suffices to verify that it is an integral solution in the sense of [17, Bénilan] (see Proposition 10), which means that for all $[x, y] \in A$ and $t > s \ge 0$ we have

$$\frac{1}{2}\big[\|u(t) - x\|^2 - \|u(s) - x\|^2\big] \le \int_s^t \langle y, x - u(\tau)\rangle\, d\tau. \tag{16}$$

Since $u$ is absolutely continuous and $A$ is maximal monotone, (16) implies $\dot u(t) \in -Au(t)$ almost everywhere on $[0, T]$.

Monotonicity of $A$ implies that for any proximal sequence $\{x_k\}$ one has $\langle x_{k-1} - x_k - \lambda_k y, x_k - x\rangle \ge 0$. But $\|x_k - x\|^2 - \|x_{k-1} - x\|^2 \le 2\langle x_{k-1} - x_k, x - x_k\rangle$ and so

$$\|x_k - x\|^2 - \|x_{k-1} - x\|^2 \le 2\lambda_k \langle y, x - x_k\rangle.$$

Summing up for $k = 1, \dots, n$ we obtain

$$\|x_n - x\|^2 - \|x_0 - x\|^2 \le 2\sum_{k=1}^n \lambda_k \langle y, x - x_k\rangle.$$

Setting $x_0 = u(s)$ and passing to the limit appropriately we get (16). Notice that $u(t) \in D(A)$ by maximality.

A consequence of Proposition 17 and Theorem 19 is the following:

Corollary 20 The following statements hold:

i) For each $z \in D(A)$ we have

$$\|x_n - u(t)\| \le \|x_0 - z\| + \|u(0) - z\| + \|A^0 z\|\sqrt{(\sigma_n - t)^2 + \tau_n}.$$

ii) For trajectories $u$ and $v$ we get

$$\|v(s) - u(t)\| \le \|v(0) - z\| + \|u(0) - z\| + \|A^0 z\|\,|s - t|.$$

iii) The unique function $u$ satisfying (5) is Lipschitz-continuous with $\|u(s) - u(t)\| \le \|A^0 u(0)\|\,|s - t|$.

iv) $\dot u \in L^\infty(0, \infty; H)$ with $\|\dot u(t)\| \le \|A^0 x\|$ almost everywhere.

Proposition 17 was used to construct a continuous trajectory by considering finer and finer discretizations on a compact interval. By controlling the distance between two discrete schemes it is possible to obtain bounds for the distance between a limit trajectory and a discrete scheme. As a consequence, one can estimate the distance between two trajectories as well.

¹ In fact, Kobayashi's proof is based on a simplification of Crandall and Liggett's method.

2.4 Euler sequences

Assume $A$ maps $D(A)$ into itself (this is a strong assumption, so the range of applications of this discretization method is limited compared to proximal sequences). Let $\{\lambda_n\}$ be a sequence of numbers in $(0, 1]$ (the step sizes). Define an Euler sequence $\{z_n\}$ recursively by

$$\begin{cases} \dfrac{z_n - z_{n-1}}{\lambda_{n-1}} \in -Az_{n-1} & \text{for all } n \ge 1 \\ z_0 \in D(A). \end{cases} \tag{17}$$

A remarkable feature of this scheme is that the terms of the sequence can be computed explicitly (forward scheme).

Observe that if $A = I - T$ with $T : C \to C$ nonexpansive and $\lambda_n \equiv 1$ then $z_n = T^n z_0$. This particular case has been studied extensively by several authors in the search for fixed points of $T$. Some of their results will be presented in the forthcoming sections.

Notice also that in this framework, $A = I - T$ with $T$ nonexpansive, a Kobayashi-type inequality holds too, namely

$$\|z_k - \hat z_l\| \le \|z_0 - u\| + \|\hat z_0 - u\| + \|u - T(u)\|\sqrt{(\sigma_k - \hat\sigma_l)^2 + \tau_k + \hat\tau_l}, \tag{18}$$

where $u$ is any point in $H$. This fact was recently established by [68, Vigeral].

Let us define the velocity at stage $n$ as

$$w_n = \frac{z_{n+1} - z_n}{\lambda_n} \in -Az_n. \tag{19}$$

Lemma 21 If $[u, v] \in A$ then

$$\|z_{n+1} - u\|^2 \le \|z_n - u\|^2 + 2\lambda_n \langle v, u - z_n\rangle + \lambda_n^2\|w_n\|^2. \tag{20}$$

Proof. For any $u \in H$ one has

$$\|z_{n+1} - u\|^2 = \|z_n - u\|^2 + 2\lambda_n \langle w_n, z_n - u\rangle + \lambda_n^2\|w_n\|^2. \tag{21}$$

The desired inequality follows from monotonicity since $\langle w_n, z_n - u\rangle \le \langle v, u - z_n\rangle$ for $[u, v] \in A$.

This is the counterpart of (6) and (11). In particular one has:

Lemma 22 If $u \in S$ then $\|z_{n+1} - u\|^2 \le \|z_n - u\|^2 + \lambda_n^2\|w_n\|^2$.

Observe the similarity and the difference with (5) and (8). The dissipativity condition in Lemma 22 is much weaker than the corresponding ones in Lemmas 6 and 15.

An immediate consequence is the following:

Corollary 23 Assume $\sum \|z_{n+1} - z_n\|^2 < \infty$. For each $u \in S$ the sequence $\|z_n - u\|$ is convergent.

Proof. It suffices to observe from Lemma 22 that the sequence $\|z_n - u\|^2 + \sum_{m=n}^{+\infty} \|z_{m+1} - z_m\|^2$ is decreasing.

Comments

The hypothesis in the previous result holds if $\{\lambda_n\} \in \ell^2$ and $\{w_n\}$ is bounded.

Notice the similarity with Corollaries 7 and 16.

The main drawback of Euler sequences is that they can be quite unstable. Most convergence results need regularity assumptions such as $\{\lambda_n\} \in \ell^2$ and the boundedness of the sequence $\{w_n\}$, or at least that $\sum \|z_{n+1} - z_n\|^2 < \infty$.
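The instability of Euler sequences can be seen on a very small example. The sketch below is an illustration under assumptions of our own: the plane is identified with the complex numbers and $Az = iz$ is the rotation by a quarter turn, which is monotone because $\langle Az, z\rangle = 0$ and whose only zero is the origin. For this operator each Euler step multiplies the squared norm by $1 + \lambda_n^2$, so Lemma 22 with $u = 0$ holds with equality. The function name `euler_sequence` is hypothetical.

```python
def euler_sequence(c: complex, z0: complex, steps):
    """Forward Euler iterates z_{n+1} = z_n - lam_n * A z_n for the linear operator A z = c * z."""
    zs = [z0]
    for lam in steps:
        zs.append(zs[-1] - lam * c * zs[-1])
    return zs
```

With `c = 1j`, a constant step such as `0.5` makes the norm grow geometrically, whereas the square-summable (but not summable) steps `1 / (n + 1)` keep the iterates bounded, in line with the role played by the condition $\{\lambda_n\} \in \ell^2$.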

An important result involving an operator $A$ of the form $I - T$ is the following, see [19, Brézis]:

Proposition 24 (Chernoff's estimate) Let $T$ be nonexpansive from $H$ to itself and $\lambda > 0$. If $v$ satisfies

$$\dot v(t) = -\frac{1}{\lambda}(I - T)v(t) \quad \text{with } v(0) = v_0,$$

then

$$\|v(t) - T^n v_0\| \le \|\dot v(0)\|\sqrt{\lambda t + (n\lambda - t)^2}. \tag{22}$$

Proof. It is enough to consider the case $\lambda = 1$.

Define $\varphi_n(t) = \|v(t) - T^n v_0\|$ and $\gamma_n(t) = \|\dot v(0)\|\sqrt{t + [n - t]^2}$. We shall prove inductively that $\varphi_n(t) \le \gamma_n(t)$. For $n = 0$ simply observe that

$$\|v(t) - v_0\| \le \int_0^t \|\dot v(s)\|\, ds \le \|\dot v(0)\|\, t \le \gamma_0(t)$$

by Proposition 9.

Now let us assume $\varphi_{n-1} \le \gamma_{n-1}$ and prove $\varphi_n \le \gamma_n$. Multiplying $\dot v(t) + v(t) = Tv(t)$ by $e^t$ and integrating we obtain $v(t) = v_0 e^{-t} + \int_0^t e^{s-t}\, Tv(s)\, ds$, so that

$$\begin{aligned}
\varphi_n(t) &= \Big\| e^{-t}(v_0 - T^n v_0) + \int_0^t e^{s-t}\,[Tv(s) - T^n v_0]\, ds \Big\| \\
&\le e^{-t}\|v_0 - T^n v_0\| + \int_0^t e^{s-t}\varphi_{n-1}(s)\, ds.
\end{aligned}$$

Noticing that $\|v_0 - T^n v_0\| \le \sum_{i=1}^n \|T^{i-1}v_0 - T^i v_0\| \le n\|v_0 - Tv_0\| = n\|\dot v(0)\|$ and using the induction hypothesis we deduce

$$\varphi_n(t) \le e^{-t}\Big[ n\|\dot v(0)\| + \int_0^t e^s \gamma_{n-1}(s)\, ds \Big].$$

Hence it suffices to establish the inequality

$$n + \int_0^t e^s \sqrt{s + [(n-1) - s]^2}\, ds \le e^t \sqrt{t + [n - t]^2}.$$

Since this holds trivially for $t = 0$, it suffices to prove the inequality for the derivatives

$$e^t \sqrt{t + [(n-1) - t]^2} \le e^t \left[ \sqrt{t + [n - t]^2} + \frac{1 - 2[n - t]}{2\sqrt{t + [n - t]^2}} \right].$$

This is easily verified by squaring both sides.
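Chernoff's estimate (22) can be tested on an example where everything is explicit. In the sketch below, which relies on assumptions of our own, the plane is identified with the complex numbers and $T$ is the rotation $Tz = e^{i\theta}z$, an isometry and hence nonexpansive. The differential equation is then linear and its solution is $v(t) = \exp(-t(1 - e^{i\theta})/\lambda)\, v_0$. The function name `chernoff_slack` is hypothetical.

```python
import cmath
import math


def chernoff_slack(theta: float, v0: complex, lam: float, t: float, n: int) -> float:
    """Right-hand side minus left-hand side of (22) for T z = exp(i * theta) * z."""
    T = cmath.exp(1j * theta)          # rotation by the angle theta
    rate = (1 - T) / lam               # the equation reads v'(t) = -rate * v(t)
    v_t = cmath.exp(-rate * t) * v0    # explicit solution of the linear equation
    speed0 = abs(rate * v0)            # norm of v'(0)
    lhs = abs(v_t - T ** n * v0)
    rhs = speed0 * math.sqrt(lam * t + (n * lam - t) ** 2)
    return rhs - lhs
```

A nonnegative value (up to rounding) confirms (22) for the chosen parameters. Note that the bound is attained at `t = 0`, `n = 1`, where both sides equal the distance between `v0` and its image under the rotation.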

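In particular, setting $T = J_\lambda^A$ we get $v = u_\lambda$ as in (7). Combining inequalities (7) and (22) we deduce that

$$\begin{aligned}
\|(I + \lambda A)^{-n}x - u(t)\| &\le \|(J_\lambda^A)^n x - u_\lambda(t)\| + \|u_\lambda(t) - u(t)\| \\
&\le \|A^0 x\|\Big[2\sqrt{\lambda t} + \sqrt{\lambda t + (n\lambda - t)^2}\Big].
\end{aligned} \tag{23}$$

Taking $\lambda = t/n$ we obtain the following exponential approximation

$$\Big\|\Big(I + \frac{t}{n}A\Big)^{-n}x - u(t)\Big\| \le \frac{3\|A^0 x\|\, t}{\sqrt{n}}. \tag{24}$$

Therefore, this discretization also approximates the continuous-time trajectory. Moreover, the approximation is uniform on bounded intervals.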
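The exponential approximation (24) can be observed on a linear toy problem. The sketch below assumes (our choice, not the survey's) that the plane is identified with the complex numbers and that $Az = cz$ with $\mathrm{Re}\, c \ge 0$, so that the continuous trajectory is $u(t) = e^{-ct}x$ and $A^0 x = cx$. The function names are hypothetical.

```python
import cmath
import math


def exponential_formula_error(c: complex, x: complex, t: float, n: int) -> float:
    """Distance between (I + (t/n) A)^{-n} x and u(t) = exp(-c t) x for A z = c * z."""
    discrete = x / (1 + (t / n) * c) ** n
    exact = cmath.exp(-c * t) * x
    return abs(discrete - exact)


def bound_24(c: complex, x: complex, t: float, n: int) -> float:
    """Right-hand side of (24): 3 * |A^0 x| * t / sqrt(n), with A^0 x = c * x."""
    return 3 * abs(c * x) * t / math.sqrt(n)
```

Comparing `exponential_formula_error` with `bound_24` for increasing `n` illustrates the estimate. In this smooth linear example the actual error is typically much smaller than the bound, which is designed to hold for arbitrary maximal monotone operators.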

2.5 Further remarks

2.5.1 Discrete to continuous

Given a sequence $\{x_n\}$ in $H$ along with a strictly increasing sequence $\{\sigma_n\}$ of positive numbers with $\sigma_0 = 0$ and $\sigma_n \to \infty$ as $n \to \infty$, one can construct a "continuous-time" trajectory $x$ by interpolation: for $t \in [\sigma_n, \sigma_{n+1}]$, take $x(t)$ anywhere on the segment $[x_n, x_{n+1}]$. It is easy to see that any trajectory defined this way converges to some $\bar x$ if, and only if, the sequence $\{x_n\}$ converges to $\bar x$.

Observe that if the interpolation is chosen to be piecewise constant in each subinterval $[\sigma_n, \sigma_{n+1})$, then

$$\frac{1}{t}\int_0^t x(\xi)\, d\xi = \frac{1}{\sigma_n}\sum_{k=1}^n \lambda_k x_k,$$

where $\lambda_k = \sigma_k - \sigma_{k-1}$. The sum on the right-hand side of the previous equality represents an average of the points $\{x_n\}$ that is weighted by the sequence $\{\lambda_n\}$ and will be denoted by $\bar x_n$. Observe also that the convergence of these weighted averages is equivalent to the convergence of the continuous-time interpolation.

From now on we will consider only proximal or Euler sequences with step sizes $\{\lambda_n\} \notin \ell^1$.

2.5.2 Asymptotic analysis to be carried out in the following sections

The next sections are devoted to the asymptotic analysis. We start by considering the sequences of values in the case $f \in \Gamma_0(H)$ and $A = \partial f$ in Section 3. The rest deals with the behavior of trajectories and sequences themselves. Section 4 presents general tools related to weak convergence and properties of weak limit points. These last properties hold under weaker assumptions for the averages, which are studied in Section 5. In Section 6 we present weak convergence, in particular in the framework of demipositive operators. Section 7 introduces different geometrical conditions that are sufficient for strong convergence. Section 8 is devoted to almost orbits and describes equivalence classes that allow one to recover previous results from a new perspective and to extend them to nonautonomous processes.

3 Convex optimization and convergence of the values

This section is devoted to the case where $A = \partial f$ is the subdifferential of a proper lower-semicontinuous convex function. We evaluate $f$ on trajectories and discuss the behavior of its values.

3.1 Continuous dynamics

When $A = \partial f$ with $f \in \Gamma_0(H)$, the differential inclusion (5) is a generalization of the gradient method to nondifferentiable functions. In what follows let $u : [0, \infty) \to H$ be the solution of the differential inclusion

$$\dot u(t) \in -\partial f(u(t)), \tag{25}$$

whose existence is given in Theorem 12. Let $f^* = \inf_{x \in H} f(x) \in \mathbb{R} \cup \{-\infty\}$.

The following result and its proof are essentially from [19, Brézis] (see [34, Güler]).

Proposition 25 The function $t \mapsto f(u(t))$ is decreasing and $\lim_{t \to \infty} f(u(t)) = f^*$.

Proof. The subdifferential inequality is

$$f(u(t)) - f(u(s)) \le -\langle \dot u(t), u(t) - u(s)\rangle.$$

Thus

$$\limsup_{s \to t^-} \frac{f(u(t)) - f(u(s))}{t - s} \le -\|\dot u(t)\|^2$$

and so the function $t \mapsto f(u(t))$ is decreasing.

For each $z \in H$ and $s \in [0, t]$ the subdifferential inequality then gives

$$f(z) \ge f(u(s)) + \langle \dot u(s), u(s) - z\rangle \ge f(u(t)) + \frac{1}{2}\frac{d}{ds}\|u(s) - z\|^2.$$

Integrating on $[0, t]$ we obtain that

$$t f(z) \ge t f(u(t)) + \frac{1}{2}\|u(t) - z\|^2 - \frac{1}{2}\|u(0) - z\|^2$$

and so

$$f(u(t)) + \frac{\|u(t) - z\|^2}{2t} \le f(z) + \frac{\|u(0) - z\|^2}{2t} \tag{26}$$

for every $z \in H$. We conclude by letting $t \to \infty$.

Comments

By inequality (26), if $S \neq \emptyset$ then $f(u(t))$ converges to $f^*$ at a rate of $O(1/t)$. However, if the trajectory $u(t)$ is known to have a strong limit, then the rate improves to $o(1/t)$ (see [34, Güler]).
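Inequality (26) can be checked by hand in a one-dimensional case. The sketch below assumes $H = \mathbb{R}$ and $f(x) = |x|$ (an illustrative choice of ours), for which the solution of (25) moves toward the origin at unit speed and then stays there. The function names are hypothetical.

```python
def flow_abs(x0: float, t: float) -> float:
    """Solution at time t of u' in -subdiff|u|, u(0) = x0: unit speed toward 0, then rest."""
    size = max(abs(x0) - t, 0.0)
    return size if x0 >= 0 else -size


def slack_26(x0: float, z: float, t: float) -> float:
    """Right-hand side minus left-hand side of (26) for f = |.| and t > 0."""
    u_t = flow_abs(x0, t)
    lhs = abs(u_t) + (u_t - z) ** 2 / (2 * t)
    rhs = abs(z) + (x0 - z) ** 2 / (2 * t)
    return rhs - lhs
```

Taking `z = 0`, the unique minimizer, `slack_26` being nonnegative gives in particular $f(u(t)) - f^* \le |x_0|^2/(2t)$, which is the $O(1/t)$ rate mentioned in the comments. In this example the trajectory reaches the minimizer in finite time, so the bound is far from tight for large `t`.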

3.2 Proximal sequences

Let $\{x_n\}$ be a proximal sequence associated to $A = \partial f$. The following result is due to [33, Güler]:

Proposition 26 The sequence $f(x_n)$ is decreasing and $\lim_{n \to \infty} f(x_n) = f^*$.

Proof. Recall that $-y_n = -\frac{x_n - x_{n-1}}{\lambda_n} \in \partial f(x_n)$. The subdifferential inequality implies

$$f(x_{n-1}) - f(x_n) \ge \lambda_n\|y_n\|^2 \tag{27}$$

so that $f(x_n)$ is decreasing. Convergence of $f(x_n)$ to $f^*$ follows from Lemma 27 below since $\sigma_n \to \infty$.

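Lemma 27 Let $u \in \operatorname{dom} f$. Then

$$f(x_n) - f(u) \le \frac{\|u - x_0\|^2}{2\sigma_n} - \frac{\|u - x_n\|^2}{2\sigma_n} - \frac{\sigma_n}{2}\|y_n\|^2.$$

Proof. The subdifferential inequality gives

$$f(u) - f(x_n) \ge \langle u - x_n, -y_n\rangle = \frac{\langle u - x_n, x_{n-1} - x_n\rangle}{\lambda_n}$$

for all $u$ in the domain of $f$. Thus

$$2\lambda_n(f(u) - f(x_n)) \ge \|u - x_n\|^2 + \lambda_n^2\|y_n\|^2 - \|u - x_{n-1}\|^2.$$

Summation from 1 to $n$ leads to

$$2\sigma_n f(u) - 2\sum_{k=1}^n \lambda_k f(x_k) \ge \|u - x_n\|^2 + \sum_{k=1}^n \lambda_k^2\|y_k\|^2 - \|u - x_0\|^2. \tag{28}$$

Multiplying (27) by $\sigma_{n-1}$ and rearranging we get

$$\sigma_{n-1}f(x_{n-1}) - \sigma_n f(x_n) + \lambda_n f(x_n) \ge \lambda_n\sigma_{n-1}\|y_n\|^2,$$

from which we derive

$$-\sigma_n f(x_n) + \sum_{k=1}^n \lambda_k f(x_k) \ge \sum_{k=1}^n \lambda_k\sigma_{k-1}\|y_k\|^2$$

by summation. Adding twice this inequality to (28) we obtain

$$2\sigma_n(f(u) - f(x_n)) \ge \|u - x_n\|^2 - \|u - x_0\|^2 + \sum_{k=1}^n \lambda_k^2\|y_k\|^2 + 2\sum_{k=1}^n \lambda_k\sigma_{k-1}\|y_k\|^2.$$

Recall from Lemma 13 that $\|y_n\|$ is decreasing. We get

$$\begin{aligned}
\|y_n\|^2\sigma_n^2 &= \|y_n\|^2(\sigma_{n-1} + \lambda_n)^2 = \|y_n\|^2(\lambda_n^2 + 2\lambda_n\sigma_{n-1} + \sigma_{n-1}^2) \\
&= \|y_n\|^2\sum_{k=1}^n (\lambda_k^2 + 2\lambda_k\sigma_{k-1}) \le \sum_{k=1}^n (\lambda_k^2 + 2\lambda_k\sigma_{k-1})\|y_k\|^2
\end{aligned}$$

and the result follows at once by rearranging the terms.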

Comments

If $S\neq\emptyset$, Lemma 27 gives

$$\|y_n\| \le \frac{d(x_0,S)}{\sigma_n}. \tag{29}$$

A similar estimation had been proved in [20, Brézis and Lions] but the right-hand side is $\sqrt{2}$ times larger.

The fact that $f(x_n)\to f^*$ had first been proved in [45, Martinet] when $f$ is coercive and $\lambda_n\equiv\lambda$.

By Lemma 27, if $S\neq\emptyset$ the rate of convergence of $f(x_n)$ to $f^*$ can be estimated at $O(1/\sigma_n)$.

Moreover, (29) and the subdifferential inequality together give

$$f(x_n) - f^* \le \langle x^*-x_n,\, y_n\rangle \le \|x^*-x_n\|\,\|y_n\| \le \frac{d(x_0,S)\,\|x^*-x_n\|}{\sigma_n}$$

for all $x^*\in S$. Therefore, if the sequence $\{x_n\}$ is known to converge strongly, then $|f(x_n)-f^*| = o(1/\sigma_n)$. This was proved in [33, Güler] using a clever but unnecessarily sophisticated argument instead of inequality (29).
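The estimates of Lemma 27 and inequality (29) can be checked numerically on a toy problem. The sketch below assumes $f(x) = |x|$ on the real line (so $S = \{0\}$ and $f^* = 0$), whose resolvent is the soft-thresholding map; this choice, the step sizes and the helper names `prox_abs` and `proximal_sequence` are illustrative assumptions and are not taken from the text.

```python
def f(x):
    # Illustrative choice (an assumption): f(x) = |x|, so S = {0} and f* = 0.
    return abs(x)

def prox_abs(x, lam):
    # Resolvent of the subdifferential of |.| with step lam (soft-thresholding).
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def proximal_sequence(x0, steps):
    # Returns tuples (x_n, y_n, sigma_n) with y_n = (x_n - x_{n-1}) / lambda_n
    # and sigma_n = lambda_1 + ... + lambda_n.
    out, x, sigma = [], x0, 0.0
    for lam in steps:
        x_new = prox_abs(x, lam)
        sigma += lam
        out.append((x_new, (x_new - x) / lam, sigma))
        x = x_new
    return out

x0 = 5.0
steps = [0.5, 1.0, 0.25, 2.0, 1.0, 1.5, 0.75]  # arbitrary positive step sizes
run = proximal_sequence(x0, steps)
dist0 = abs(x0)  # d(x_0, S) for S = {0}

for n, (x, y, sigma) in enumerate(run, start=1):
    # Right-hand side of Lemma 27 with u = 0.
    bound_27 = dist0 ** 2 / (2 * sigma) - x ** 2 / (2 * sigma) - sigma * y ** 2 / 2
    print(n, f(x), bound_27, abs(y), dist0 / sigma)
```

Each printed row lists $f(x_n)$ next to the bound of Lemma 27 (with $u = 0$) and $\|y_n\|$ next to $d(x_0,S)/\sigma_n$. In this particular example the first bound is attained as long as the iterates have not reached $0$, which suggests that the estimate of Lemma 27 cannot be improved in general.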

3.3 Euler sequences

Let $\{z_n\}$ be an Euler sequence associated to $A=\partial f$. In this case the sequence $f(z_n)$ need not be decreasing. However, we have the following:

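Lemma 28 If either i) $\sum \|z_{n+1}-z_n\|^2 < \infty$ or ii) $\lim_{n\to\infty}\lambda_n\|w_n\|^2 = 0$, then $\liminf_{n\to\infty} f(z_n) = f^*$.

Proof. Assume i). Since $-w_n\in\partial f(z_n)$, the subdifferential inequality and (21) together imply

$$\|z_{n+1}-y\|^2 \le \|z_n-y\|^2 + 2\lambda_n\big(f(y)-f(z_n)\big) + \lambda_n^2\|w_n\|^2 \tag{30}$$

for each $y\in H$. If $\sum\|z_{n+1}-z_n\|^2 < \infty$ then $\sum\lambda_n\big(f(z_n)-f(y)\big) < \infty$ (possibly $-\infty$). Since $\{\lambda_n\}\notin\ell^1$ one must have $\liminf_{n\to\infty} f(z_n) \le f(y)$ for each $y\in H$.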

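Consider now ii). Inequality (30) can be rewritten as

$$\lambda_n\Big[2\big(f(z_n)-f(y)\big) - \lambda_n\|w_n\|^2\Big] \le \|z_n-y\|^2 - \|z_{n+1}-y\|^2$$

so that

$$\sum\lambda_n\Big[2\big(f(z_n)-f(y)\big) - \lambda_n\|w_n\|^2\Big] < \infty$$

and $\liminf_{n\to\infty} f(z_n) \le f(y)$ for each $y\in H$.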

Part of the ideas in the proof of the preceding result (under hypothesis ii)) are from [64, Shor], where we can also find the following:

Proposition 29 Let $\dim(H) < \infty$ and assume $S$ is nonempty and compact. If $\lim_{n\to\infty}\lambda_n = 0$ and the sequence $w_n$ is bounded then $\lim_{n\to\infty} f(z_n) = f^*$.

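Proof. By continuity, it suffices to prove that $\operatorname{dist}(z_n,S) = \inf_{y\in S}\|z_n-y\|$ tends to 0 as $n\to\infty$.

For $\gamma > f^*$ define $L_\gamma = \{x : f(x) = \gamma\}$ and denote by $L^{co}_\gamma$ its convex hull. Both sets are compact.

Take $\varepsilon > 0$ and define

$$\delta(\varepsilon) = \operatorname{dist}(S, L_{f^*+\varepsilon}) \quad\text{and}\quad d(\varepsilon) = \max_{u\in L^{co}_{f^*+\varepsilon}} \operatorname{dist}(u,S).$$

Observe that $0 < \delta(\varepsilon) \le d(\varepsilon) \to 0$ as $\varepsilon\to 0$. By hypothesis and Lemma 28 there is $N\in\mathbb{N}$ such that $f(z_N) \le f^*+\varepsilon$ and $\lambda_n\|w_n\| \le \delta(\varepsilon)$ for all $n\ge N$. We shall prove that $\operatorname{dist}(z_n,S) \le 2d(\varepsilon)$ for all $n\ge N$. Since $\varepsilon > 0$ is arbitrary this shows that $\lim_{n\to\infty}\operatorname{dist}(z_n,S) = 0$.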

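Indeed, if $f(z_n) \le f^*+\varepsilon$ (this holds for $n = N$) then $z_n\in L^{co}_{f^*+\varepsilon}$ and $\operatorname{dist}(z_n,S) \le d(\varepsilon)$. Hence $\operatorname{dist}(z_{n+1},S) \le d(\varepsilon)+\delta(\varepsilon) \le 2d(\varepsilon)$. On the other hand, if $f(z_n) > f^*+\varepsilon$ then $\operatorname{dist}(z_{n+1},S) \le \operatorname{dist}(z_n,S)$. To see this, notice that if $y\in S$ then $\big\langle \frac{w_n}{\|w_n\|},\, y-z_n\big\rangle$ is the distance from $y$ to the hyperplane $\Pi_n = \{x : \langle w_n, z_n-x\rangle = 0\}$, which is a supporting hyperplane for the set $L^{co}_{f(z_n)}$ at the point $z_n$. Therefore we have

$$\langle w_n,\, y-z_n\rangle \ge \|w_n\|\operatorname{dist}(S,\Pi_n) \ge \|w_n\|\operatorname{dist}(S, L_{f(z_n)}) \ge \|w_n\|\,\delta(\varepsilon),$$

where the second inequality follows from convexity and the last one is true whenever $f(z_n) > f^*+\varepsilon$.

Using (21) and recalling that $\lambda_n\|w_n\| \le \delta(\varepsilon)$ we deduce that

$$\operatorname{dist}(z_{n+1},S)^2 \le \operatorname{dist}(z_n,S)^2 - \lambda_n\|w_n\|\,\delta(\varepsilon),$$

proving that $\operatorname{dist}(z_{n+1},S) \le \operatorname{dist}(z_n,S)$.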

Observe that this result does not require the stabilizing summability condition but it is necessary to make a very strong assumption on the set $S$.
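The behavior described in this subsection can be observed on a one-dimensional toy problem. The sketch below assumes the update $z_{n+1} = z_n + \lambda_n w_n$ with $-w_n\in\partial f(z_n)$ (our reading of the sign convention used above, since the defining relation (21) is not restated here), the illustrative function $f(x) = |x|$ and the step sizes $\lambda_n = 1/n$, which tend to 0 and are not summable. The helpers `minus_subgradient`, `euler_sequence` and `gap_30` are hypothetical names.

```python
def f(x):
    # Illustrative choice (an assumption): f(x) = |x|, so S = {0} and f* = 0.
    return abs(x)

def minus_subgradient(x):
    # Returns w with -w in the subdifferential of |.| at x (w = 0 at x = 0).
    if x > 0:
        return -1.0
    if x < 0:
        return 1.0
    return 0.0

def euler_sequence(z0, n_steps):
    # Assumed update: z_{n+1} = z_n + lambda_n * w_n with lambda_n = 1/n.
    zs = [z0]
    for n in range(1, n_steps + 1):
        z = zs[-1]
        zs.append(z + (1.0 / n) * minus_subgradient(z))
    return zs

def gap_30(z, z_next, lam, w, y):
    # Right-hand side minus left-hand side of inequality (30); expected >= 0.
    rhs = (z - y) ** 2 + 2 * lam * (f(y) - f(z)) + lam ** 2 * w ** 2
    return rhs - (z_next - y) ** 2

zs = euler_sequence(0.3, 2000)
values = [f(z) for z in zs]
increases = sum(1 for a, b in zip(values, values[1:]) if b > a)

gaps = []
for n in range(1, len(zs)):
    lam, z = 1.0 / n, zs[n - 1]
    w = minus_subgradient(z)
    gaps += [gap_30(z, zs[n], lam, w, y) for y in (0.0, 1.0, -2.0)]

print("steps where f(z_n) increased:", increases)
print("final value f(z_n):", values[-1])
print("smallest gap in (30):", min(gaps))
```

Here $f(z_n)$ increases at many steps, so the sequence of values is not monotone, yet it still tends to $f^* = 0$, in line with Proposition 29: the space is finite dimensional, $S = \{0\}$ is compact, $\lambda_n\to 0$ and $|w_n|\le 1$.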


4 General tools for weak convergence

We denote by $\Omega[u(t)]$ (resp. $\Omega[x_n]$) the set of weak cluster points of a trajectory $u(t)$ as $t\to\infty$ (resp. of a sequence $\{x_n\}$ as $n\to\infty$).

Given a trajectory $u(t)$ we define

$$\bar u(t) = \frac{1}{t}\int_0^t u(\xi)\,d\xi.$$

Similarly, given a sequence $\{x_n\}$ in $H$ along with step sizes $\{\lambda_n\}$, we introduce

$$\bar x_n = \frac{1}{\sigma_n}\sum_{k=1}^n \lambda_k x_k.$$
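To make the averaging concrete, here is a minimal sketch computing $\bar x_n$ for real-valued sequences. The alternating sequence and the unit step sizes are illustrative assumptions (the sequence is not generated by any operator from the text), and `weighted_averages` is a hypothetical helper name.

```python
def weighted_averages(xs, lams):
    # Returns [xbar_1, ..., xbar_n], where xbar_n = (1 / sigma_n) * sum_{k<=n} lam_k * x_k
    # and sigma_n = lam_1 + ... + lam_n. Entries of xs are real numbers.
    out, sigma, acc = [], 0.0, 0.0
    for x, lam in zip(xs, lams):
        sigma += lam
        acc += lam * x
        out.append(acc / sigma)
    return out

# Illustrative data (an assumption): x_k = (-1)^k with unit step sizes.
xs = [(-1.0) ** k for k in range(1, 1001)]
lams = [1.0] * len(xs)
avgs = weighted_averages(xs, lams)

print("last terms of the sequence:", xs[-2:])
print("last averages:", avgs[-2:])
```

The sequence itself oscillates between $-1$ and $1$ and has two cluster points, while its averages tend to $0$. This is the kind of situation in which information is available only on the cluster points of the averages.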

4.1 Existence of the limit

Most of the results on weak convergence that exist in the literature rely on the combination of two types of properties involving a subset $F\subset H$ (in all that follows $F$ will be closed and convex):

The first one is a kind of “Lyapounov condition” on the sequence or the trajectory like

(a1) $\|x_n-u\|$ converges to some $\ell(u)$ for each $u\in F$, or

(a2) $P_F(x_n)$ converges strongly.

These properties imply that the sequence is somehow “anchored” to the set $F$.

The second one is a global one, concerning the set of weak cluster points of the sequence or trajectory:

(b) $\Omega[x_n]\subset F$.

However, it is sometimes available only for the averages:

(b’) $\Omega[\bar x_n]\subset F$.

The following result is a very useful tool for proving weak convergence of a sequence on the basis of (a1) and (b) above. It is known, especially in Hilbert spaces, as Opial’s Lemma [51].

Lemma 30 (Opial’s Lemma) Let $\{x_n\}$ be a sequence in $H$ and let $F\subset H$. Assume

1. $\|x_n-u\|$ has a limit as $n\to\infty$ for each $u\in F$; and

2. $\Omega[x_n]\subset F$.

Then $x_n$ converges weakly to some $x^*\in F$.

Proof. Since $\{x_n\}$ is bounded it suffices to prove that it has only one weak cluster point. Let $x, y\in\Omega[x_n]\subset F$ so that $\|x_n-x\|$ converges to $\ell(x)$ and similarly for $y$. From

$$\|x_n-y\|^2 = \|x_n-x\|^2 + \|x-y\|^2 + 2\langle x_n-x,\, x-y\rangle \tag{31}$$

one deduces by choosing appropriate subsequences

$$\ell(y)^2 = \ell(x)^2 + \|x-y\|^2 \qquad (x_{\varphi(n)}\rightharpoonup x)$$

and

$$\ell(y)^2 = \ell(x)^2 - \|x-y\|^2 \qquad (x_{\psi(n)}\rightharpoonup y),$$

hence $x=y$.

Comments

A Banach space $X$ satisfies Opial’s condition if it is reflexive and

$$\limsup_{n\to\infty}\|x_n-x\| < \limsup_{n\to\infty}\|x_n-y\| \quad\text{whenever } x_n\rightharpoonup x\neq y. \tag{32}$$

Any uniformly convex Banach space having a weakly continuous duality mapping (in particular, any Hilbert space) satisfies Opial’s condition (see [51, Opial]). Opial’s Lemma holds in any Banach space satisfying Opial’s condition.

Following [52, Passty], one obtains a more general result:

Lemma 31 Let $\{x_n\}$ be a sequence in $H$ with step sizes $\{\lambda_n\}$ and let $F\subset H$. Assume (a1): the sequence $\|x_n-u\|$ has a limit as $n\to\infty$ for each $u\in F$. Then the sets $\Omega[x_n]\cap F$ and $\Omega[\bar x_n]\cap F$ each contain at most one point. In particular, if $\Omega[x_n]\subset F$ (resp. $\Omega[\bar x_n]\subset F$), then $x_n$ (resp. $\bar x_n$) converges weakly as $n\to\infty$. A similar result holds for trajectories.

Proof. By (31), $\langle x_n, x-y\rangle$ converges to some $m(x,y)$ for any $x, y\in F$. If $u$ and $v$ belong to $\Omega[x_n]\cap F$ one obtains $\langle u, u-v\rangle = \langle v, u-v\rangle$, hence $u=v$. Similarly $\langle \bar x_n, x-y\rangle$ converges to $m(x,y)$.

Thus both $\Omega[x_n]\cap F$ and $\Omega[\bar x_n]\cap F$ contain at most one point.

An alternative proof using (a2) and either (b) or (b’) is as follows:

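Lemma 32 Let $\{x_n\}$ be a bounded sequence in $H$ with step sizes $\{\lambda_n\}$ and let $F\subset H$ be closed and convex. Assume (a2): $P_F x_n\to\zeta$ as $n\to\infty$. Then

$$\Omega[x_n]\cap F\subset\{\zeta\} \quad\text{and}\quad \Omega[\bar x_n]\cap F\subset\{\zeta\}.$$

In particular, if $\Omega[x_n]\subset F$ (resp. $\Omega[\bar x_n]\subset F$), then $x_n$ (resp. $\bar x_n$) converges weakly to $\zeta$. A similar result is true for trajectories.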

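Proof. By definition of the projection, for each $u\in F$ one has

$$\langle x_n - P_F x_n,\, u - P_F x_n\rangle \le 0.$$

Since $x_n$ is bounded we deduce that

$$\langle x_n-\zeta,\, u-\zeta\rangle \le \rho_n$$

with $\lim_{n\to\infty}\rho_n = 0$. This implies $\Omega[x_n]\cap F\subset\{\zeta\}$ (if $v\in\Omega[x_n]\cap F$, take $u=v$). Similarly

$$\langle \bar x_n-\zeta,\, u-\zeta\rangle \le \bar\rho_n,$$

which gives $\Omega[\bar x_n]\cap F\subset\{\zeta\}$.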

A slightly more demanding assumption is the Fejér property:

(a3) $\|u(t)-p\|$ decreases for each $p\in F$, or

(a3’) There exists $\{\varepsilon_n\}\in\ell^1$ such that $\|x_{n+1}-u\|^2 \le \|x_n-u\|^2 + \varepsilon_n$ for all $u\in F$.

Then one has the following, from [27, Combettes]:

Lemma 33 Any trajectory satisfying (a3) also satisfies (a2).

Any sequence satisfying (a3’) also satisfies (a2).

Proof. Let $u(t)$ satisfy (a3) and let $v(t) = P_F u(t)$. Note first that, using the projection property and (a3),

$$\|v(t+h)-u(t+h)\|^2 \le \|v(t)-u(t+h)\|^2 \le \|v(t)-u(t)\|^2,$$

hence $\|v(t)-u(t)\|$ decreases, hence converges.

The parallelogram equality gives

$$\|v(t+h)-v(t)\|^2 + 4\left\|\frac{v(t+h)+v(t)}{2} - u(t+h)\right\|^2 = 2\|v(t+h)-u(t+h)\|^2 + 2\|v(t)-u(t+h)\|^2.$$

$F$ convex implies

$$\left\|\frac{v(t+h)+v(t)}{2} - u(t+h)\right\|^2 \ge \|v(t+h)-u(t+h)\|^2,$$

hence

$$\|v(t+h)-v(t)\|^2 \le 2\Big[\|v(t)-u(t)\|^2 - \|v(t+h)-u(t+h)\|^2\Big]$$

so that $v(t)$ has a strong limit $v$ as $t\to\infty$.

Now let $\{x_n\}$ satisfy (a3’) and write $y_n = P_F x_n$. As before, one has

$$\|y_{n+1}-x_{n+1}\|^2 \le \|y_n-x_{n+1}\|^2 \le \|y_n-x_n\|^2 + \varepsilon_n$$

so that $\|y_n-x_n\|^2 + \sum_{m=n}^{+\infty}\varepsilon_m$ is decreasing, hence $\|y_n-x_n\|^2$ converges as well.
