Regularization of linear least squares problems by total bounded variation

(1)

REGULARIZATION OF LINEAR LEAST SQUARES

PROBLEMS BY TOTAL BOUNDED VARIATION

G. CHAVENT AND K. KUNISCH

Abstract. We consider the problem (^P) Minimize 12^j^T^u^;^z^j²^Y +2 ^ju^j²^L²+

Z

jruj over ^u ² ^K^\^X where 0, ^> 0, ^K is a closed convex subset of ^L²(), and the last additive term denotes the BV-seminorm of^u,^T is a linear operator from ^L²^\^BV into the observation space ^Y. We formulate necessary optimality conditions for (^P). Then we show that (^P) admits, for given regularization parameters and, solutions which depend in a stable manner on the data^z. Finally we study the asymptotic behavior when

=^!0. The regularized solutions ^^u of (^P) converge to the^L²^\

BV minimal norm solution of the unregularized problem.The rate of convergence is¹² when the minimum-norm solution ^^uis smooth enough.

1. Problem statement and functions of bounded variation We consider the problem of nding approximate solutions to

Tu=^z (1.1)

where ^T is a bounded linear operator from ^L²() to a Hilbert space ^Y by using regularization techniques based on bounded variation functionals.

Here is a bounded domain in ^Rⁿwith Lipschitzian boundary and ^z²^Y. The operator ^T is not assumed to be injective and it may be compact. In this case approximate solutions that are stable with respect to ^z can be obtained by solving the regularized problem.

min 12^jT^u^;^zj²^Y +2^juj²^L² (1.2) or

min 12^jT^u^;^zj²^Y +2^jruj²^L²ⁿ (1.3) for example. In (1.2) and (1.3) one refers to ^> 0 as the regularization parameter. Regularization by the square of a Hilbertian norm or seminorm as in (1.2) and (1.3) is well-studied. We refer to 2], 7], 8] and the references given there. While (1.2) and (1.3) have attractive analytical properties concerning the stability of their solution with respect to perturbations in the data ^z and their asymptotic behavior as ^!0 they can have serious prac- tical disadvantages. The ^L²-regularization term in (1.2) is not satisfactory

CEREMADE, Universite Paris-Dauphine and INRIA, Rocquencourt.

Fachbereich Mathematik, Technische Universitat Berlin and Institut fur Mathematik, Universitat Graz.

Received by the journal January 6, 1997. Accepted for publication October 23, 1997.

c

Societe de Mathematiques Appliquees et Industrielles. Typeset by L^ATEX.

(2)

to dampen undesired oscillations in the solution ^u which result from highly noisy data and/or compactness of ^T. Regularization with the square of the gradient as in (1.3) reduces these oscillations but it overregularizes the solution in the neighborhood of edges and corners. This is well-known in the context of image enhancement, see for example 11] and the references given there, and one can quickly convince oneself of this fact by a short MATLAB-code even with ^T = identity. Due to these shortcomings with (1.2) and (1.3) regularization based on total bounded variation functionals has recently been investigated in several publications 3], 11], 13], for example. In this case the regularized problems are given by (variations of)

min 12^jT^u^;^zj²^Y +

Z

jruj (1.4)

where ^R^jruj denotes the bounded variation seminorm, which will be de- scribed at the end of this section. For the moment we may think of the term

R

jruj as the ^W¹¹() seminorm. In the context of image enhancement (1.4) has proved to be extremely eective. Comparing (1.4) to (1.2) or (1.3) we immediately recognize that while (1.2) and (1.3) are quadratic, (1.4) is not. In fact, (1.4) is highly nonlinear and this results in interesting problems both analytically and numerically. In 3] basic facts concerning existence of solutions to (1.4), the optimality system, approximation of (1.4) by nite dimensional problems and algorithms that are specically adapted for the nonlinear structure of problem (1.4) were presented. In the present analysis we focus on the problem of stability of the solutions to (1.4) with respect to perturbations in the data^z and on the asymptotic behavior as ^!0⁺.

Let us now specify the problem to be investigated:

min 12^j^Tu^;^z^j²^Y + 2 ^ju^j²^L²+

Z

jrujover^u²^K^\^X (P) where 0 ^> 0^K is a closed, convex subset of ^L²()^X = ^L²()^\

BV(), and^BV() denotes the space of bounded variation functions to be dened below. The space^Xendowed with the norm^juj^X =^juj^L²+^juj^BV is a Banach space. We shall see that due to the assumption that is bounded ^X and ^BV() are equivalent if ⁿ2. The use of the quadratic regularization term serves two purposes. First, for^>0 it provides a coercive term for the subspace of constant functions which are in the kernel of the^r-operator (and can be in the kernel of^T) and secondly it gives the possibility to distinguish the structure of stability results for quadratic regularization terms from that of the nonquadratic ^BV-term.

Let us now summarize results from the theory of functions of bounded variation that are relevant to our work. We refer to 3], 6], 12] for de- tails. As mentioned above is a bounded domain in ^Rⁿwith Lipschitzian boundary. A function ^u²^L¹() is said to be of bounded variation if

sup

Z

u(^x) div^v(^x)^dx:^v²^C⁰¹()ⁿ ^jv(^x)^j¹1 ^x²

<1: (1.5) Here ^j^j¹ denotes the ^l¹-norm on ^Rⁿ. The sup in (1.5) will be denoted by ^R^jruj. The space of all functions in ^L¹() with bounded variation is

Esaim: Cocv, December 1997, Vol. 2, pp. 359{376

(3)

abbreviated by ^BV(). Endowed with

juj

BV =^juj^L¹+^Z

jruj (1.6)

BV() is a Banach space. Equivalently ^BV() can be introduced in terms of measures. Let ^M() denote the space of real Borel measures on which is a Banach space when endowed with the norm

jj() =

Z

jj 2M()

jj=⁺+^; being the total variation measure associated toand ^jj() is the total variation ofin , 10]. It is known that^M() is the dual space of ^C⁰(), the space of continuous function vanishing on the boundary of and

Z

jj= sup

Z

v(^x)^d(^x) :^v²^C⁰() ^jvj^C()1

:

Thus ^BV() can equivalently be expressed as the space of functions ^u ²

L

1() for which ^@^xⁱ^u²^M(), for every 1ⁱⁿ, with the partial deriva- tives^@^xⁱ understood in the distributional sense. If the vector valued measure space ^Mⁿ() is endowed with the ^l¹-norm, in the sense that

Z

j

*

j=^Xⁿ

i=1 Z

j

*

i j

*

2M n() then

juj

BV =^juj^L¹ +

Z

jruj=^juj^L¹+^Xⁿ

i=1 Z

j@

x

i

uj for^u²^BV()^: If instead of the ^l¹-norm on^Rⁿin (1.5) one uses the ^l^p-norm, 1^p ^<¹, then this induces an equivalent norm on ^BV().

Proposition 1.1.

a) If ^fu^j^g¹^j=1 ^BV() and lim^u^j =^u in ^L¹() then

Z

jrujliminf

Z

jru

j j:

b) For every^u²^BV()^\L^r()^r²1¹) there exists^fu^j^g¹^j=1 ^C¹() such that

lim^u^j =^u in^L^r() and lim

Z

j@

x

i u

j j=

Z

j@

x

i

uj 1ⁱ^n:

c) For every bounded sequence ^fu^j^g¹^j=1 ^BV() there exists a subsequence^fu^j^k^g¹^{k =1} and û ²^BV() such that limû^j^k = û in ^L^p()^p ² 1^n;1ⁿ ), if ⁿ2, and limû^j^k =û in ^L^p^p²1¹), if ⁿ= 1. (Com- pactness).

d) There exists a constant^C=^C(ⁿ) such that

Z

ju;^uj~ ^n;1ⁿ ^dx

n;1

n

C Z

jruj for all^u²^BV()

with ^u~ = ^jj¹ ^R^u^dx. (Sobolev inequality). If ⁿ= 1, then the ^Lⁿ⁼^n;1- norm is understood to be the ^L¹-norm.

(4)

As a consequence of Proposition 1.1 d) we have the following

Corollary 1.2. If ⁿ= 1 or 2 then ^BV() and ^X are equivalent Banach spaces.

Sketch of proof of Proposition 1.1. The assertions of Proposition 1.1 are small modication of results given in 6]. The semicontinuity of the bounded variation functional asserted in a) is proved in (6], Theorem 1.9). As for b) the proof of Theorem 1.17 in 6] can be generalized from ^r= 1 to arbitrary

r 2 1¹). An application of Rellich's Theorem, see (6], Theorem 1.19) implies c). Finally d) is well-known for ^u ² ^C¹() 9]. For arbitrary

u 2BV() there exists as a consequence of b) a sequenceû^j ²^C¹() such thatû^j ^!ûin ^L¹() and^R^jru^j^j^!^R^jruj. Applying d) to the sequence

fu

j

gwe obtain that^fu^j^g¹^j=1 is bounded in ^L^n;1ⁿ () and hence û^j converges weakly to ûin ^L^n;1ⁿ () and lim ~û^j = ~û. It follows that

ju;^uj~L n

n;1

lim^ju^j^;^u~^j^jlim^C

Z

jru

j j=^C

Z

jruj:

The subsequent sections are organized as follows. In Section 2 sucient conditions for existence of a unique solution to (P) and the optimality system are given. Section 3 is dedicated to the stability of the solutions with respect to the data ^z. The asymptotic behavior of the solutions to (P) as

!0 ^!0 is analyzed in Section 4.

2. Existence and optimality conditions We shall use the following hypothesis.

One of the following properties holds:

(i) ^>0,

(ii) ⁿ= 1 orⁿ= 2, and const²⁼ ker^T, (H)

(iii) ^K is bounded in ^L²().

Theorem 2.1. If (H) holds, then there exists a solution to (P). If moreover

>0 or ^T is injective then the solution is unique.

Proof. Let ^fu^j^g¹^j=1 be a minimizing sequence. If (H) (i) or (iii) hold, then

fu

j

g is bounded in ^L²(). In the case of (H) (ii) boundedness of ^fu^j^g in

L

2() follows from Proposition 1.1 d) and the fact that^>0. Thus in either case (H) and the form of the cost functional imply that ^fu^j^gis bounded in

X. By Proposition 1.1 c) there exists a subsequence of^fu^j^gdenoted by the same symbol and û ²^X, such that ^fu^j^g converges to û strongly in ^L¹() and weakly in ^L²(). It follows that û ² ^K. Moreover, ^T maps weakly convergent sequence in ^L²() into weakly convergent sequences in ^Y. By Proposition 1.1 a) we nd

12^jT^u^;^zj²+2 ^j^u^j²+

Z

jr^u^j

liminf

1

2^j^T^u^j^;^z^j²+2 ^j^u^j^j²^L²+

Z

jru

j j

inf

1

2^j^T^u^;^z^j²+ 2^j^u^j²^L² +

Z

jruj:^u²^K

(5)

and thus û is a solution to (P). Concerning uniqueness we note that in the case ^> 0 the cost functional of (P) is strictly convex and hence û is unique. In the case of injectivity of ^T the cost functional of (P) is again strictly convex and uniqueness of û follows.

We now specify the optimality system for (P).

Theorem 2.2. Let ^u² ^K^\^X. Then ^u is a solution of (P) if and only if there exists in the dual space of ^Mⁿ() such that

(^T(^Tû^;^z) +û û^;û)^L²⁽⁾^;div û^;û^BV^BV 0 (2.1) for all û²^K^\^X

^;^r^u^Mⁿ^Mⁿ+

Z

jr^uj

Z

jj for all²^Mⁿ()^: (2.2) Here ^T denotes the adjoint of ^T ² ^L(^L²()^Y) and - div stands for the conjugate of ^r:^BV()^!^Mⁿ(), i.e.

- div ^u^BV^BV =^ru

M n

M

n for ^u²^BV()^:

The proof can be obtained with only minor modications from that of (3], Theorem 2.2) and it is therefore omitted. Let us note that (2.2) is equivalent to

^r^u=

Z

jr^uj (2.3)

and

^Z

jjfor all ²^Mⁿ()^: (2.4) Alternatively (2.2) can be expressed as ² ^@'(^r^u) where ^@'is the subd- ierential of ^':^Mⁿ()^!^Rgiven by

'() =^Xⁿ

i=1 Z

j

i j:

Note that (2.4) implies that the restriction to ^L¹ⁿ of ^! can be identied with an element in ^L¹ⁿ () with^j^j^L¹

n 1.

The Lagrange multiplier of Theorem 2.2 is not unique. In the following theorem we give a necessary and sucient optimality system that involves a unique Lagrange multiplier ^q ²^H¹(). Let us recall some function spaces involving the divergence operator 5]:

H(div) =^v²^L²ⁿ() : div ^v²^L²() which is a Hilbert space for the norm

jvj

H(div⁾=^jvj²^L²n+^jdiv ^vj²^L²¹⁼² and the closed subspace of ^H(div):

H

0(div) =^D()ⁿ

where^D()ⁿdenotes the space of innitely dierentiable vector-valued functions with support in and the closure is taken in the norm of^H(div). The trace operator(^v) = ^vⁿ^j; can be extended as continuous linear operator from ^H(div) to^H^;1=2(;) (5],^x2). Hereⁿdenotes the outer unit normal to

(6)

the Lipschitzian boundary ; of . The space ^H⁰(div) can be characterized as

H

0(div) =^fv²^H(div) :(^v) = 0^g^:

Theorem 2.3. Let^K =^Xand let^ube a solution to (P). Then there exists a unique ^q² ^H¹() with ^R^q^dx = 0 such that for ~ = ^r^q we have ~ ²

H

0(div)^j^j~L 2

n

pmeas(),

(^T^T +)û^;^T^z= div ~in ^L²() (2.5) (^;div ~û^;û)^L²⁽⁾+

Z

jr^u^j

Z

jrujfor all û²^{X :} (2.6) Conversely, if there exists a pair (û~)²^X^H⁰(div) such that (2.5), (2.6) hold, then û is a solution to (P).

Remark 2.4. From the proof it will follow that ~ of Theorem 2.3 is obtained from the Lagrange multiplier of Theorem 2.2 by restriction of the functional ^!^hⁱ^Mⁿ^Mⁿ to^L²ⁿ() and subsequent projection of onto the orthogonal complement in ^L²ⁿ() of the divergence-free space. We recall the decomposition of ^L²ⁿ() as

L 2

n() =^H^H^?

where ^H = ^fv ² ^H⁰(div) : div ^v = 0^g, and ^H^? denotes the orthogonal complement of ^H in ^L²ⁿ(). Accordingly the restriction to ^L²ⁿ() of of Theorem 2.2 can be decomposed as ¹ + ² ² ^H ^H^? and ² can be identied with the Lagrange multiplier ~ of Theorem 2.3.

Proof of Theorem 2.3. By Theorem 2.2 there exists ² ^Mⁿ(), the dual of ^Mⁿ(), such that (2.1) and (2.2) hold. Since ^K=^X we nd

((^T^T+) ^u^;^T^z^v)^L² =^;^rv

M n

M

n for all ^v²^{X :} (2.7) Let us consider the restriction of the functional^!^hⁱ^Mⁿ^Mⁿ to^L²ⁿ().

By the Riesz representation theorem it can be identied with an element of

L 2

n() which we denote by ^. Due to (2.4) we nd that^j^^jL 2

n

pmeas ().

Taking the distributional derivative in

((^T^T +) ^u^;^T^z^v)^L² =^;(^^rv)^L²ⁿ for all ^v²^H¹()

it follows that div ^ can be considered as bounded linear functional on

L

2(). Hence ^can be identied with an element of^H (div) and

(^T^T+) ^u^;^T^z= div ^in ^L²()^: (2.8) Combining (2.7) and (2.8) it follows that

div ^^v

L

2 =^; ^rv

M n

M

n for all ^v²^{X :} (2.9) Let us argue that ^²^H⁰(div). By Green's formula and (2.9) we have

div ^^v

L

2+^^rv

L 2 =

Z

;

v^ⁿ^ds= 0 for every ^v²^D()^:

(7)

Hence the trace of ^ satises (^) = 0 and ^ ² ^H⁰(div). Let ^H = ^fv ²

H

0(div) : div ^v= 0^g. Since ^H is a closed subspace of ^L²ⁿ(), we have

L 2

n=^H^H^? (2.10)

with ^H^? the orthogonal complement of^H in ^L²ⁿ(). It is well-known that

H

?=^frq:^q ²^H¹()^g5]. Let us decompose

^

=¹+² ²^H^H^?

with div ¹ = 0 and ² = ^r^q. The element ^q is unique when normalized by ^R^q^dx= 0. Then (2.5) follows from (2.8). To verify (2.6) let us observe that by (2.2)

^r(^u^;^u)^Mⁿ^Mⁿ+

Z

jr^u^j

Z

jrujfor all ^u²^X and by (2.9)

(^;div ²^u^;^u)^L² +

Z

jr^u^j

Z

jrujfor all ^u²^{X :}

Finally concerning the ^L²ⁿ-norm of ² we have due to the orthogonal decomposition (2.10)

j

2 j

L 2

n

j^jL 2

n

pmeas ()^: This concludes the rst part of the proof with ~=².

Conversely assume that (^u~) ² ^X^H⁰(div) satises (2.5), (2.6), and let

u be an arbitrary element in ^X. Let ^F denote the cost functional in (P).

Then by (2.5), (2.6)

F(û)^;^F(û)((^T^T+) ûû^;û)^L²+

Z

jruj; Z

jr^u^j

=(div ~û^;û)^;(div ~û^;û)0 and û is a solution to (P).

Corollary 2.5. Under the assumptions of Theorem 2.3, ^qis the unique variational solution satisfying ^R^q^dx= 0 to the Neumann problem

; q = (^T^T+)^u+^T^z in

@q

@n = 0 on ;^: (2.11)

The compatibility condition is clearly satised since

Z

((^T^T+) ^u^;^T^z)^dx= 0

by (2.7) with ^v= 1. If the boundary of is ^C¹¹-smooth then ^q²^H²().

Corollary 2.6. Under the assumption of Theorem 2.3

;div ~^u

L 2

()

=

Z

jr^u^j (2.12)

and

div ~^u

L 2

()

Z

jruj for all û²^{X :} (2.13) Equality (2.12) follows from (2.6) with û= 0 and û = 2û. By (2.6) and (2.12) one obtains (2.13).

(8)

3. Stability with respect to data

This section is dedicated to the analysis of the stability of the solution to (P) with respect to the data ^z. It will be convenient to denote (P) by (^P^z) and solutions to (^P^z) by ^u^z. Let us rst give the stability result that can be attained from a-priori estimates.

Theorem 3.1. Assume that (H) holds, let ^fzⁿ^g be a sequence of observa- tions in ^Y converging to ^z, and let ^fu^zn^g denote a sequence of solutions to (^P^zⁿ). Then there exists a subsequence^fu^zⁿ^k^gof^fu^zⁿ^gand^u^z ²^L²() such that

lim ^u^zⁿ^k =^u^z weakly in^L²() and lim

Z

jru

z

n

k j=

Z

jru

z

j (3.1) and every such sequence has the property that û^z is a solution to (P). If (H) (i) holds then limû^zn^k =û^z strongly in ^L²().

Proof. For simplicity let us denoteûⁿ=û^zn. We have for every û²^K 12^jTûⁿ^;^zⁿ^j²^Y + 2 ^juⁿ^j²^L²+

Z

jru

n j

12^jT^u^;^zⁿ^j²^Y (3.2) +2 ^j^u^j²^L²+

Z

jruj:

If (H) (i) or (iii) hold then it follows directly that ^fuⁿ^g is bounded in^X. In the case of (H) (ii) boundedness of^fuⁿ^gin^Xfollows from Proposition 1.1 d).

Thus there exists a subsequence of^fuⁿ^gdenoted, for simplicity by the same symbol and ^u^z²^L²() such that

lim ûⁿ=û^z weakly in ^L²() and limûⁿ=û^z in ^L¹()^: (3.3) It follows thatû^z²^K and

lim ^Tuⁿ=^T^u^z weakly in ^Y and ^Z

jru

z

jlim^Z

jru

n j:

We can now pass to the limit in (3.2) to obtain 12^jT^u^z^;^z^j²^Y +2^j^u^z^j²^L² +

Z

jru

z j

12^j^T^u^;^z^j²^Y +2 ^j^u^j²^L²+

Z

jruj

for every ^u ² ^K and hence ^u^z is a solution to (^P^z). Let us show next the second convergence statement in (3.1). We have

12^j^T^u^z^;^z^j²+2 ^ju^z^j²+ lim

Z

jru

n j

lim 12^j^Tuⁿ^;^zⁿ^j²+ lim 2^j^uⁿ^j²+ lim

Z

jru

n j

lim

1

2^jT^uⁿ^;^zⁿ^j²+ 2 ^j^uⁿ^j²+

Z

jru

n j

lim

1

2^jT^u^z^;^zⁿ^j²+2 ^j^u^z^j²+

Z

jru

z j

= 12^j^T^u^z^;^z^j²+2 ^j^u^z^j²+

Z

jru

z j

(9)

and hence lim ^jruⁿ^j ^jru^z^j, which together with ^jru^z^j

lim^R^jruⁿ^jimplies the second statement in (3.1). If (H) (i) holds then one shows similarly that lim ^jûⁿ^j^L² =^jû^z^j, which together with weak convergence implies strong convergence in ^L²() of ûⁿ toû^z.

The following lemma will be useful.

Lemma 3.2. For every pair of solutions^u^z and^u^z with associated Lagrange multipliers ^z and^z we have

h

z

;

z

r(^u^z^;^u^z)ⁱ^Mⁿ^Mⁿ 0^:

Proof. Since ^z ² ^@'(^ru^z) and ^z ² ^@'(^ru^z) the Lemma follows from a wellknown property of subdierentials.

Theorem 3.3. Let (H) (i) hold. Then for every pair ^z^z²^Y

ju

z

;u

z j

L 2

1

kT k

L(L 2

Y)

jz;^z^j^Y^: Proof. From (2.1) of Theorem 2.2 we have

h(^T^T+)û^z^;^T^z^; div ^zû^z^;û^zⁱ0

h(^T^T+)û^z^;^T^z^; div^zû^z^;û^zⁱ0^:

The duality pairing ^hⁱmust be interpreted as in (2.1) and hence

h(^T^T+)(û^z^;û^z)^;^T(^z^;^z)^; div (^z^;^z)û^z^;û^zⁱ0^: Lemma 3.2 implies that

h(^T^T +)(û^z^;û^z)^;^T(^z^;^z)û^z^;û^zⁱ^h^z^;^z^r(û^z^;û^z)ⁱ0 and hence

ju

z

;u

z j

2

kT

k ju

z

;u

z

jj^z^;^z^j and the result follows.

Let ^B^r denote the ball with center at the origin and radius ^rin ^Y. Theorem 3.4. Let (H) (i) hold and let ^r^>0. Then

j Z

jru

z j;

Z

jru

z

jj r

2 + 1

max (1^k^T ^k⁴)^jz^;^z^j for every ^z^z²^B^r.

Proof. From Theorem 3.3 we conclude that

ju

z j

L 2

r

kT k for every^z²^B^r^: (3.4) Utilizing the fact that ^u^z is a solution to (^P^z) we nd

Z

jru

z j;

Z

jru

z j

12

jTu

z

;zj 2

;jTu

z

;zj

2+^ju^z^j²^;ju^z^j²^: Similarly, since ^u^z is a solution to (^P^z)

Z

jru

z

j;

Z

jru

z j

12

jTu

z

;^zj²^;jT^u^z^;^zj²+^ju^z^j²^;ju^z^j²^: