REGULARIZATION OF LINEAR LEAST SQUARES
PROBLEMS BY TOTAL BOUNDED VARIATION
G. CHAVENT AND K. KUNISCH
Abstract. We consider the problem (P) Minimize 12jTu;zj2Y +2 juj2L2+
Z
jruj over u 2 K\X where 0, > 0, K is a closed convex subset of L2(), and the last additive term denotes the BV-seminorm ofu,T is a linear operator from L2\BV into the observation space Y. We formulate necessary optimality conditions for (P). Then we show that (P) admits, for given regularization parameters and, solutions which depend in a stable manner on the dataz. Finally we study the asymptotic behavior when
=!0. The regularized solutions ^u of (P) converge to theL2\
BV minimal norm solution of the unregularized problem.The rate of convergence is12 when the minimum-norm solution ^uis smooth enough.
1. Problem statement and functions of bounded variation We consider the problem of nding approximate solutions to
Tu=z (1.1)
where T is a bounded linear operator from L2() to a Hilbert space Y by using regularization techniques based on bounded variation functionals.
Here is a bounded domain in Rnwith Lipschitzian boundary and z2Y. The operator T is not assumed to be injective and it may be compact. In this case approximate solutions that are stable with respect to z can be obtained by solving the regularized problem.
min 12jTu;zj2Y +2juj2L2 (1.2) or
min 12jTu;zj2Y +2jruj2L2n (1.3) for example. In (1.2) and (1.3) one refers to > 0 as the regularization parameter. Regularization by the square of a Hilbertian norm or seminorm as in (1.2) and (1.3) is well-studied. We refer to 2], 7], 8] and the references given there. While (1.2) and (1.3) have attractive analytical properties con- cerning the stability of their solution with respect to perturbations in the data z and their asymptotic behavior as !0 they can have serious prac- tical disadvantages. The L2-regularization term in (1.2) is not satisfactory
CEREMADE, Universite Paris-Dauphine and INRIA, Rocquencourt.
Fachbereich Mathematik, Technische Universitat Berlin and Institut fur Mathematik, Universitat Graz.
Received by the journal January 6, 1997. Accepted for publication October 23, 1997.
c
Societe de Mathematiques Appliquees et Industrielles. Typeset by LATEX.
to dampen undesired oscillations in the solution u which result from highly noisy data and/or compactness of T. Regularization with the square of the gradient as in (1.3) reduces these oscillations but it overregularizes the solution in the neighborhood of edges and corners. This is well-known in the context of image enhancement, see for example 11] and the references given there, and one can quickly convince oneself of this fact by a short MATLAB-code even with T = identity. Due to these shortcomings with (1.2) and (1.3) regularization based on total bounded variation function- als has recently been investigated in several publications 3], 11], 13], for example. In this case the regularized problems are given by (variations of)
min 12jTu;zj2Y +
Z
jruj (1.4)
where Rjruj denotes the bounded variation seminorm, which will be de- scribed at the end of this section. For the moment we may think of the term
R
jruj as the W11() seminorm. In the context of image enhancement (1.4) has proved to be extremely eective. Comparing (1.4) to (1.2) or (1.3) we immediately recognize that while (1.2) and (1.3) are quadratic, (1.4) is not. In fact, (1.4) is highly nonlinear and this results in interesting problems both analytically and numerically. In 3] basic facts concerning existence of solutions to (1.4), the optimality system, approximation of (1.4) by nite dimensional problems and algorithms that are specically adapted for the nonlinear structure of problem (1.4) were presented. In the present analysis we focus on the problem of stability of the solutions to (1.4) with respect to perturbations in the dataz and on the asymptotic behavior as !0+.
Let us now specify the problem to be investigated:
min 12jTu;zj2Y + 2 juj2L2+
Z
jrujoveru2K\X (P) where 0 > 0K is a closed, convex subset of L2()X = L2()\
BV(), andBV() denotes the space of bounded variation functions to be dened below. The spaceXendowed with the normjujX =jujL2+jujBV is a Banach space. We shall see that due to the assumption that is bounded X and BV() are equivalent if n2. The use of the quadratic regularization term serves two purposes. First, for>0 it provides a coercive term for the subspace of constant functions which are in the kernel of ther-operator (and can be in the kernel ofT) and secondly it gives the possibility to distinguish the structure of stability results for quadratic regularization terms from that of the nonquadratic BV-term.
Let us now summarize results from the theory of functions of bounded variation that are relevant to our work. We refer to 3], 6], 12] for de- tails. As mentioned above is a bounded domain in Rnwith Lipschitzian boundary. A function u2L1() is said to be of bounded variation if
sup
Z
u(x) divv(x)dx:v2C01()n jv(x)j11 x2
<1: (1.5) Here jj1 denotes the l1-norm on Rn. The sup in (1.5) will be denoted by Rjruj. The space of all functions in L1() with bounded variation is
Esaim: Cocv, December 1997, Vol. 2, pp. 359{376
abbreviated by BV(). Endowed with
juj
BV =jujL1+Z
jruj (1.6)
BV() is a Banach space. Equivalently BV() can be introduced in terms of measures. Let M() denote the space of real Borel measures on which is a Banach space when endowed with the norm
jj() =
Z
jj 2M()
jj=++; being the total variation measure associated toand jj() is the total variation ofin , 10]. It is known thatM() is the dual space of C0(), the space of continuous function vanishing on the boundary of and
Z
jj= sup
Z
v(x)d(x) :v2C0() jvjC()1
:
Thus BV() can equivalently be expressed as the space of functions u 2
L
1() for which @xiu2M(), for every 1in, with the partial deriva- tives@xi understood in the distributional sense. If the vector valued measure space Mn() is endowed with the l1-norm, in the sense that
Z
j
*
j=Xn
i=1 Z
j
*
i j
*
2M n() then
juj
BV =jujL1 +
Z
jruj=jujL1+Xn
i=1 Z
j@
x
i
uj foru2BV(): If instead of the l1-norm onRnin (1.5) one uses the lp-norm, 1p <1, then this induces an equivalent norm on BV().
Proposition 1.1.
a) If fujg1j=1 BV() and limuj =u in L1() then
Z
jrujliminf
Z
jru
j j:
b) For everyu2BV()\Lr()r211) there existsfujg1j=1 C1() such that
limuj =u inLr() and lim
Z
j@
x
i u
j j=
Z
j@
x
i
uj 1in:
c) For every bounded sequence fujg1j=1 BV() there exists a subse- quencefujkg1k =1 and u 2BV() such that limujk = u in Lp()p 2 1n;1n ), if n2, and limujk =u in Lpp211), if n= 1. (Com- pactness).
d) There exists a constantC=C(n) such that
Z
ju;uj~ n;1n dx
n;1
n
C Z
jruj for allu2BV()
with u~ = jj1 Rudx. (Sobolev inequality). If n= 1, then the Ln=n;1- norm is understood to be the L1-norm.
Esaim: Cocv, December 1997, Vol. 2, pp. 359{376
As a consequence of Proposition 1.1 d) we have the following
Corollary 1.2. If n= 1 or 2 then BV() and X are equivalent Banach spaces.
Sketch of proof of Proposition 1.1. The assertions of Proposition 1.1 are small modication of results given in 6]. The semicontinuity of the bounded variation functional asserted in a) is proved in (6], Theorem 1.9). As for b) the proof of Theorem 1.17 in 6] can be generalized from r= 1 to arbitrary
r 2 11). An application of Rellich's Theorem, see (6], Theorem 1.19) implies c). Finally d) is well-known for u 2 C1() 9]. For arbitrary
u 2BV() there exists as a consequence of b) a sequenceuj 2C1() such thatuj !uin L1() andRjrujj!Rjruj. Applying d) to the sequence
fu
j
gwe obtain thatfujg1j=1 is bounded in Ln;1n () and hence uj converges weakly to uin Ln;1n () and lim ~uj = ~u. It follows that
ju;uj~L n
n;1
limjuj;u~jjlimC
Z
jru
j j=C
Z
jruj:
The subsequent sections are organized as follows. In Section 2 sucient conditions for existence of a unique solution to (P) and the optimality sys- tem are given. Section 3 is dedicated to the stability of the solutions with respect to the data z. The asymptotic behavior of the solutions to (P) as
!0 !0 is analyzed in Section 4.
2. Existence and optimality conditions We shall use the following hypothesis.
One of the following properties holds:
(i) >0,
(ii) n= 1 orn= 2, and const2= kerT, (H)
(iii) K is bounded in L2().
Theorem 2.1. If (H) holds, then there exists a solution to (P). If moreover
>0 or T is injective then the solution is unique.
Proof. Let fujg1j=1 be a minimizing sequence. If (H) (i) or (iii) hold, then
fu
j
g is bounded in L2(). In the case of (H) (ii) boundedness of fujg in
L
2() follows from Proposition 1.1 d) and the fact that>0. Thus in either case (H) and the form of the cost functional imply that fujgis bounded in
X. By Proposition 1.1 c) there exists a subsequence offujgdenoted by the same symbol and u 2X, such that fujg converges to u strongly in L1() and weakly in L2(). It follows that u 2 K. Moreover, T maps weakly convergent sequence in L2() into weakly convergent sequences in Y. By Proposition 1.1 a) we nd
12jTu;zj2+2 juj2+
Z
jruj
liminf
1
2jTuj;zj2+2 jujj2L2+
Z
jru
j j
inf
1
2jTu;zj2+ 2juj2L2 +
Z
jruj:u2K
Esaim: Cocv, December 1997, Vol. 2, pp. 359{376
and thus u is a solution to (P). Concerning uniqueness we note that in the case > 0 the cost functional of (P) is strictly convex and hence u is unique. In the case of injectivity of T the cost functional of (P) is again strictly convex and uniqueness of u follows.
We now specify the optimality system for (P).
Theorem 2.2. Let u2 K\X. Then u is a solution of (P) if and only if there exists in the dual space of Mn() such that
(T(Tu;z) +u u;u)L2();div u;uBVBV 0 (2.1) for all u2K\X
;ruMnMn+
Z
jruj
Z
jj for all2Mn(): (2.2) Here T denotes the adjoint of T 2 L(L2()Y) and - div stands for the conjugate of r:BV()!Mn(), i.e.
- div uBVBV =ru
M n
M
n for u2BV():
The proof can be obtained with only minor modications from that of (3], Theorem 2.2) and it is therefore omitted. Let us note that (2.2) is equivalent to
ru=
Z
jruj (2.3)
and
Z
jjfor all 2Mn(): (2.4) Alternatively (2.2) can be expressed as 2 @'(ru) where @'is the subd- ierential of ':Mn()!Rgiven by
'() =Xn
i=1 Z
j
i j:
Note that (2.4) implies that the restriction to L1n of ! can be identied with an element in L1n () withjjL1
n 1.
The Lagrange multiplier of Theorem 2.2 is not unique. In the following theorem we give a necessary and sucient optimality system that involves a unique Lagrange multiplier q 2H1(). Let us recall some function spaces involving the divergence operator 5]:
H(div) =v2L2n() : div v2L2() which is a Hilbert space for the norm
jvj
H(div)=jvj2L2n+jdiv vj2L21=2 and the closed subspace of H(div):
H
0(div) =D()n
whereD()ndenotes the space of innitely dierentiable vector-valued func- tions with support in and the closure is taken in the norm ofH(div). The trace operator(v) = vnj; can be extended as continuous linear operator from H(div) toH;1=2(;) (5],x2). Herendenotes the outer unit normal to
Esaim: Cocv, December 1997, Vol. 2, pp. 359{376
the Lipschitzian boundary ; of . The space H0(div) can be characterized as
H
0(div) =fv2H(div) :(v) = 0g:
Theorem 2.3. LetK =Xand letube a solution to (P). Then there exists a unique q2 H1() with Rqdx = 0 such that for ~ = rq we have ~ 2
H
0(div)jj~L 2
n
pmeas(),
(TT +)u;Tz= div ~in L2() (2.5) (;div ~u;u)L2()+
Z
jruj
Z
jrujfor all u2X : (2.6) Conversely, if there exists a pair (u~)2XH0(div) such that (2.5), (2.6) hold, then u is a solution to (P).
Remark 2.4. From the proof it will follow that ~ of Theorem 2.3 is ob- tained from the Lagrange multiplier of Theorem 2.2 by restriction of the functional !hiMnMn toL2n() and subsequent projection of onto the orthogonal complement in L2n() of the divergence-free space. We recall the decomposition of L2n() as
L 2
n() =HH?
where H = fv 2 H0(div) : div v = 0g, and H? denotes the orthogonal complement of H in L2n(). Accordingly the restriction to L2n() of of Theorem 2.2 can be decomposed as 1 + 2 2 H H? and 2 can be identied with the Lagrange multiplier ~ of Theorem 2.3.
Proof of Theorem 2.3. By Theorem 2.2 there exists 2 Mn(), the dual of Mn(), such that (2.1) and (2.2) hold. Since K=X we nd
((TT+) u;Tzv)L2 =;rv
M n
M
n for all v2X : (2.7) Let us consider the restriction of the functional!hiMnMn toL2n().
By the Riesz representation theorem it can be identied with an element of
L 2
n() which we denote by ^. Due to (2.4) we nd thatj^jL 2
n
pmeas ().
Taking the distributional derivative in
((TT +) u;Tzv)L2 =;(^rv)L2n for all v2H1()
it follows that div ^ can be considered as bounded linear functional on
L
2(). Hence ^can be identied with an element ofH (div) and
(TT+) u;Tz= div ^in L2(): (2.8) Combining (2.7) and (2.8) it follows that
div ^v
L
2 =; rv
M n
M
n for all v2X : (2.9) Let us argue that ^2H0(div). By Green's formula and (2.9) we have
div ^v
L
2+^rv
L 2 =
Z
;
v^nds= 0 for every v2D():
Esaim: Cocv, December 1997, Vol. 2, pp. 359{376
Hence the trace of ^ satises (^) = 0 and ^ 2 H0(div). Let H = fv 2
H
0(div) : div v= 0g. Since H is a closed subspace of L2n(), we have
L 2
n=HH? (2.10)
with H? the orthogonal complement ofH in L2n(). It is well-known that
H
?=frq:q 2H1()g5]. Let us decompose
^
=1+2 2HH?
with div 1 = 0 and 2 = rq. The element q is unique when normalized by Rqdx= 0. Then (2.5) follows from (2.8). To verify (2.6) let us observe that by (2.2)
r(u;u)MnMn+
Z
jruj
Z
jrujfor all u2X and by (2.9)
(;div 2u;u)L2 +
Z
jruj
Z
jrujfor all u2X :
Finally concerning the L2n-norm of 2 we have due to the orthogonal decom- position (2.10)
j
2 j
L 2
n
jjL 2
n
pmeas (): This concludes the rst part of the proof with ~=2.
Conversely assume that (u~) 2 XH0(div) satises (2.5), (2.6), and let
u be an arbitrary element in X. Let F denote the cost functional in (P).
Then by (2.5), (2.6)
F(u);F(u)((TT+) uu;u)L2+
Z
jruj; Z
jruj
=(div ~u;u);(div ~u;u)0 and u is a solution to (P).
Corollary 2.5. Under the assumptions of Theorem 2.3, qis the unique variational solution satisfying Rqdx= 0 to the Neumann problem
; q = (TT+)u+Tz in
@q
@n = 0 on ;: (2.11)
The compatibility condition is clearly satised since
Z
((TT+) u;Tz)dx= 0
by (2.7) with v= 1. If the boundary of is C11-smooth then q2H2().
Corollary 2.6. Under the assumption of Theorem 2.3
;div ~u
L 2
()
=
Z
jruj (2.12)
and
div ~u
L 2
()
Z
jruj for all u2X : (2.13) Equality (2.12) follows from (2.6) with u= 0 and u = 2u. By (2.6) and (2.12) one obtains (2.13).
Esaim: Cocv, December 1997, Vol. 2, pp. 359{376
3. Stability with respect to data
This section is dedicated to the analysis of the stability of the solution to (P) with respect to the data z. It will be convenient to denote (P) by (Pz) and solutions to (Pz) by uz. Let us rst give the stability result that can be attained from a-priori estimates.
Theorem 3.1. Assume that (H) holds, let fzng be a sequence of observa- tions in Y converging to z, and let fuzng denote a sequence of solutions to (Pzn). Then there exists a subsequencefuznkgoffuznganduz 2L2() such that
lim uznk =uz weakly inL2() and lim
Z
jru
z
n
k j=
Z
jru
z
j (3.1) and every such sequence has the property that uz is a solution to (P). If (H) (i) holds then limuznk =uz strongly in L2().
Proof. For simplicity let us denoteun=uzn. We have for every u2K 12jTun;znj2Y + 2 junj2L2+
Z
jru
n j
12jTu;znj2Y (3.2) +2 juj2L2+
Z
jruj:
If (H) (i) or (iii) hold then it follows directly that fung is bounded inX. In the case of (H) (ii) boundedness offunginXfollows from Proposition 1.1 d).
Thus there exists a subsequence offungdenoted, for simplicity by the same symbol and uz2L2() such that
lim un=uz weakly in L2() and limun=uz in L1(): (3.3) It follows thatuz2K and
lim Tun=Tuz weakly in Y and Z
jru
z
jlimZ
jru
n j:
We can now pass to the limit in (3.2) to obtain 12jTuz;zj2Y +2juzj2L2 +
Z
jru
z j
12jTu;zj2Y +2 juj2L2+
Z
jruj
for every u 2 K and hence uz is a solution to (Pz). Let us show next the second convergence statement in (3.1). We have
12jTuz;zj2+2 juzj2+ lim
Z
jru
n j
lim 12jTun;znj2+ lim 2junj2+ lim
Z
jru
n j
lim
1
2jTun;znj2+ 2 junj2+
Z
jru
n j
lim
1
2jTuz;znj2+2 juzj2+
Z
jru
z j
= 12jTuz;zj2+2 juzj2+
Z
jru
z j
Esaim: Cocv, December 1997, Vol. 2, pp. 359{376
and hence lim jrunj jruzj, which together with jruzj
limRjrunjimplies the second statement in (3.1). If (H) (i) holds then one shows similarly that lim junjL2 =juzj, which together with weak con- vergence implies strong convergence in L2() of un touz.
The following lemma will be useful.
Lemma 3.2. For every pair of solutionsuz anduz with associated Lagrange multipliers z andz we have
h
z
;
z
r(uz;uz)iMnMn 0:
Proof. Since z 2 @'(ruz) and z 2 @'(ruz) the Lemma follows from a wellknown property of subdierentials.
Theorem 3.3. Let (H) (i) hold. Then for every pair zz2Y
ju
z
;u
z j
L 2
1
kT k
L(L 2
Y)
jz;zjY: Proof. From (2.1) of Theorem 2.2 we have
h(TT+)uz;Tz; div zuz;uzi0
h(TT+)uz;Tz; divzuz;uzi0:
The duality pairing himust be interpreted as in (2.1) and hence
h(TT+)(uz;uz);T(z;z); div (z;z)uz;uzi0: Lemma 3.2 implies that
h(TT +)(uz;uz);T(z;z)uz;uzihz;zr(uz;uz)i0 and hence
ju
z
;u
z j
2
kT
k ju
z
;u
z
jjz;zj and the result follows.
Let Br denote the ball with center at the origin and radius rin Y. Theorem 3.4. Let (H) (i) hold and let r>0. Then
j Z
jru
z j;
Z
jru
z
jj r
2 + 1
max (1kT k4)jz;zj for every zz2Br.
Proof. From Theorem 3.3 we conclude that
ju
z j
L 2
r
kT k for everyz2Br: (3.4) Utilizing the fact that uz is a solution to (Pz) we nd
Z
jru
z j;
Z
jru
z j
12
jTu
z
;zj 2
;jTu
z
;zj
2+juzj2;juzj2: Similarly, since uz is a solution to (Pz)
Z
jru
z
j;
Z
jru
z j
12
jTu
z
;zj2;jTuz;zj2+juzj2;juzj2:
Esaim: Cocv, December 1997, Vol. 2, pp. 359{376