Mesh-independence and preconditioning for solving parabolic control problems with mixed control-state constraints

(1)

DOI:10.1051/cocv:2008042 www.esaim-cocv.org

MESH-INDEPENDENCE AND PRECONDITIONING

FOR SOLVING PARABOLIC CONTROL PROBLEMS WITH MIXED CONTROL-STATE CONSTRAINTS

Michael Hinterm¨ uller

¹

, Ian Kopacka

²

and Stefan Volkwein

²

Abstract. Optimal control problems for the heat equation with pointwise bilateral control-state constraints are considered. A locally superlinearly convergent numerical solution algorithm is proposed and its mesh independence is established. Further, for the efficient numerical solution reduced space and Schur complement based preconditioners are proposed which take into account the active and inactive set structure of the problem. The paper ends by numerical tests illustrating our theoretical findings and comparing the efficiency of the proposed preconditioners.

Mathematics Subject Classification.49K20, 65K05.

Received December 13, 2006. Revised November 7, 2007 and March 31, 2008.

Published online July 19, 2008.

1. Introduction

In this work we study a locally superlinearly convergent algorithm for computing the solution of constrained distributed optimal control problems for processes governed by the linear heat equation

y_t−αΔy =f +u inQ, y(0) =y_◦ on Ω, (1.1) where Ω⊂R^d represents the domain of interest andQ is the space-time cylinder. Furthermore,f is a given source,y_◦ denotes the initial temperature, and α >0 is a given constant reﬂecting heat conduction properties.

We consider (1.1) together with homogeneous Dirichlet boundary conditions. In what follows we callythe state anduthe control (variable), respectively.

Recently, there has been signiﬁcant interest in optimally controlling (1.1) subject to pointwise mixed control- state constraints of the type

a≤y+cu≤b almost everywhere (a.e.) inQ, (1.2)

Keywords and phrases.Bilateral control-state constraints, heat equation, mesh independence, optimal control, PDE-constrained optimization, semismooth Newton method.

1 University of Sussex Department of Mathematics Mantell Building Falmer, Brighton BN1 9RF, UK.

michael.hintermueller@uni-graz.at

2Karl-Franzens-University of Graz Department of Mathematics and Scientific Computing Heinrichstrasse 36, 8010 Graz, Austria.

ian.kopacka@uni-graz.at; stefan.volkwein@uni-graz.at

Article published by EDP Sciences c EDP Sciences, SMAI 2008

(2)

where a, b∈L^q(Q), for some q >2 and with a < b, andc ∈L^∞(Q), withc ≥ε_c >0 a.e. inQor c≤ε_c <0 a.e. in Q. Mixed control-state constraints are of interest in several respects: (i) They occur in Lavrentiev-type regularized state constrained optimal control problems, where typicallyc≡ >0. In contrast to the measure- valuedness of the Lagrange multiplier associated with pure state constraints (i.e.,c≡0 in (1.2)), the multiplier pertinent to the mixed control-state constraints enjoys L²(Q)-regularity; see, e.g., [20]. This makes analytical investigations as well as the development of fast numerical solution methods amenable. (ii) On the other hand, mixed control-state constraints may appear in their own right. For instance, forc <0 (1.2) can be interpreted as to restrict the control by some multiple of the state. In thermal processes this might be intended to avoid material tensions due to a signiﬁcant diﬀerence between the state (temperature) and the control (heating or cooling) action.

Based on earlier experience [10,12,13], here we propose a primal-dual active set or, equivalently, semismooth Newton method for the numerical solution of the underlying constrained optimal control problems. It turns out that the method converges locally at a superlinear rate in function space as well as in finite dimensions after discretization. In addition we prove that the convergence is mesh-independent. In both cases we extend currently available work for the control of elliptic partial differential equations [9,11] to the parabolic case. This is of particular interest with respect to the mesh-independence, since there are no such results available even in the case where the mixed control-state constraints are replaced by the more accessible pointwise control constraintsa≤u≤b a.e. inQ. We emphasize that our findings for the mixed control-state constraints readily carry over to the case of pure control constraints.

As the discretization of time-dependent PDE-constrained optimization problems naturally results in an ex- tremely large scale problem, preconditioned iterative solvers for the resulting subsystems have to be employed.

For research papers on reliable preconditioning in (unconstrained) optimal control of PDEs we refer to,e.g., [2–4].

The literature on preconditioning techniques in the case of additional inequality constraints and, in particular, in connection with active set solvers is relatively scarce. Therefore, another goal of the present work is to introduce a preconditioning technique which is tailored to the active respectively inactive set structure of our solver and, hence, is able to handle additional pointwise inequality constraints eﬃciently.

The subsequent sections are organized as follows. In Section 2 we introduce the optimal control problem under consideration and present first-order necessary and sufficient optimality conditions. Section3is devoted to the development and convergence analysis of our solution algorithms. Then, in Section 4, we prove mesh independence of our method. Finally, numerical examples illustrating the efficient performance of our solver are discussed in Section 5. This section also contains an investigation of appropriate preconditioners for the iterative solvers considered.

2. The optimal control problem

In this section we formulate the optimal control problem with mixed pointwise control-state constraints and review ﬁrst-order necessary optimality conditions.

2.1. The constrained optimal control problem

Suppose that Ω is an open and bounded subset ofR^d,d∈ {2,3}, with Lipschitz boundary Γ =∂Ω. ForT >0 we setQ= (0, T)×Ω and Σ = (0, T)×Γ. Moreover, byL²(0, T;H₀¹(Ω)) we denote the space of (equivalence classes) of measurable abstract functionsϕ: [0, T]→H₀¹(Ω), which are square integrable,i.e.,

_T

0 ϕ(t)²_H1

0(Ω)dt <∞.

For the deﬁnition of Sobolev spaces we refer the reader, e.g., to [1,6]. In particular, H⁻¹(Ω) stands for the dual space of H₀¹(Ω). Note that the space H₀¹(Ω) is continuously embedded in L⁶(Ω) for spatial dimension d≤3. Whent is ﬁxed, the expression ϕ(t) stands for the function ϕ(t,·) considered as a function in Ω only.

(3)

Recall that

W(0, T) =

ϕ∈L²(0, T;H₀¹(Ω)) :ϕ_t∈L²(0, T;H⁻¹(Ω))

is a Hilbert space supplied with its common inner product; see [5], p. 473, for instance. Since H₀¹(Ω) is continuously embedded inL^r(d)(Ω) with r(1)∈[2,+∞],r(2)∈[2,+∞) andr(d)∈[2,2d/(d−2)] for alld≥3, we have thatW(0, T) is continuously embedded inL²(0, T;L^r(d)(Ω)). Further, it is well-known thatW(0, T) is continuously embedded inC([0, T];L²(Ω)); see [21], Theorem 3.10. Notice thatL²(0, T;L²(Ω)) can be identiﬁed withL²(Q). By Aubin’s lemma [18], W(0, T) is compactly embedded intoL²(Q). Moreover, it follows from

_T

0

Ω|ϕ(t,x)|³dxdt≤ _T

0 Ω|ϕ(t,x)|²dx _1/2

Ω|ϕ(t,x)|⁴dx _1/2

dt

= _T

0 ϕ(t)_L2(Ω)ϕ(t)²_L4(Ω)dt

≤ ϕ_C([0,T_];L2(Ω))ϕ²_L2(0,T;L⁴(Ω))

for anyϕ∈W(0, T) thatW(0, T) is continuously embedded intoL³(Q) for spatial dimensiond≤3.

We consider a distributed optimal control problem for the heat equation with mixed pointwise control-state constraints. The goal is to minimize the cost functionJ :W(0, T)×L²(Q)→[0,∞) given by

J(y, u) =1 2

_T

0

Ωα_Q|y−z_Q|²dxdt+1 2

Ωα_Ω|y(T)−z_Ω|²dx +κ

2 _T

0

Ω|u−u_d|²dxdt,

(2.1)

where the statey and the controluare coupled by the linear boundary value problem

y_t(t,x)−αΔy(t,x) =f(t,x) +u(t,x) for almost all (t,x)∈Q, (2.2a) y(t,s) = 0 for almost all (t,s)∈Σ, (2.2b)

y(0,x) =y_◦(x) for almost allx∈Ω. (2.2c)

In (2.1) we assume that α_Q and α_Ω are non-negative weights satisfying α_Q ∈ L^∞(Q) and α_Ω ∈ L^∞(Ω), respectively. The desired states z_Q ∈L²(Q),z_Ω∈L²(Ω) and the nominal control u_d ∈ L³(Q) are given, and κ >0 denotes a regularization parameter. For the data in (2.2) we suppose that the inhomogeneityf belongs to L²(0, T;H⁻¹(Ω)), α > 0 holds true, and the initial state satisﬁes y_◦ ∈ L²(Ω). It is well-known that for any u∈L²(Q) there exists a unique solutiony∈W(0, T) of the state equation (2.2). Moreover, the mapping u→y(u) is continuous fromL²(Q) toW(0, T); see [14,21]. If, in addition,y_◦∈H₀¹(Ω),f ∈L²(Q) hold and Ω is suﬃciently smooth (e.g., Ω is convex with Lipschitz-continuous boundary), then the statey belongs even to L²(0, T;H²(Ω)∩H₀¹(Ω))∩H¹(0, T;L²(Ω)).

We also impose bilateral pointwise control-state constraints. For that purpose let a, b ∈ L³(Q) be given lower and upper bounds, respectively. Moreover, letc∈L^∞(Q) satisfyc≥ε_c>0 (orc≤ε_c<0) for almost all (f.a.a.) (t,x)∈Q. We deﬁne the two Banach spaces

X =W(0, T)×L²(Q), Y =L²(0, T;H₀¹(Ω)),

and denote the common compact embedding operators by ı : W(0, T) → L²(Q) and j : L²(Q) → Y. Then admissible state-control pairs (y, u) are required to belong to the closed convex set

X_ad=

(y, u)∈Xa≤ıy+cu≤ba.e. inQ

. (2.3)

(4)

For a compact formulation of the optimal control problem we introduce the aﬃne linear mappinge:X →Y by

e(y, u), ϕ_Y,Y = _T

0 yt(t)−f(t), ϕ(t)_H−1(Ω),H¹₀(Ω)dt+

Ωα∇y· ∇ϕ−uϕdxdt

for (y, u)∈ X and ϕ∈ Y, where the dualY of Y is identiﬁed with L²(0, T;H⁻¹(Ω)), and ·,·_H−1(Ω),H¹₀(Ω)

denotes the duality pairing betweenH₀¹(Ω) and its dualH⁻¹(Ω). The feasible set is given by Φ(P) =

x= (y, u)∈X_ade(x) = 0 inY and y(0) =y_◦ inL²(Ω) . Throughout the paper we assume that Φ(P)=∅.

Our inﬁnite dimensional optimal control problem now reads

minJ(x) subject to (s.t.) x∈Φ(P). (P)

Since Φ(P)= ∅ by assumption, there exists a unique solutionx^∗ = (y^∗, u^∗) of (P). The uniqueness follows from the strict convexity properties of the objective functional. If y_◦ ∈ H₀¹(Ω) and f ∈ L²(Q) hold, then y^∗∈L²(0, T;H²(Ω)∩H₀¹(Ω))∩H¹(0, T;L²(Ω)).

The ﬁrst-order optimality conditions of (P) are stated in the next theorem.

Theorem 2.1. Suppose thatΦ(P)=∅ and thatx^∗ = (y^∗, u^∗)∈Φ(P)is the solution of (P). Then there exists a unique Lagrange multiplier pair(p^∗, λ^∗)∈W(0, T)×L²(Q)satisfying, together with(y^∗, u^∗), the dual system (here written in its strong form)

−p^∗_t−αΔp^∗+λ^∗ =−α_Q(y^∗−z_Q) inQ, (2.4a)

p^∗= 0 onΣ, (2.4b)

p^∗(T) =−α_Ω(y^∗(T)−z_Ω) inΩ, (2.4c)

κ(u^∗−u_d)−p^∗+cλ^∗= 0 inQ, (2.4d)

λ^∗= max(0, λ^∗+σ(y^∗+cu^∗−b)) + min(0, λ^∗+σ(y^∗+cu^∗−a)) onQ, (2.4e) where σ is an arbitrary function in L^∞(Q) with σ(t,x) ≥ σ > 0 f.a.a. (t,x) ∈ Q. In (2.4e) the min- and max-operations are interpreted in the pointwise almost everywhere sense.

Proof. The proof follows from arguments analogous to those given in [9], Section 2.

Note that (2.4e) is a nonlinear complementarity problem (NCP) function based reformulation of the complementarity systems

λ^∗_a ≥0, a−y^∗−cu^∗≤0, λ^∗_a(a−y^∗−cu^∗) = 0, a.e. in Q, (2.5) λ^∗_b ≥0, y^∗+cu^∗−b≤0, λ^∗_b(y^∗+cu^∗−b) = 0, a.e. inQ, (2.6) withλ^∗=λ^∗_b −λ^∗_a.

2.2. The reduced optimal control problem

Since, for anyu∈L²(Q), there exists a unique solutiony=y(u)∈W(0, T) of the state equation (2.2), we can deﬁne the bounded and aﬃne linear solution operator

S:L²(Q)→W(0, T), u→ A⁻¹(f+ju),

(5)

whereA= (_∂t^∂ −αΔ) :W(0, T)→Y is linear and bounded. For later use, letA^∗ denote the adjoint operator ofA. Next we introduce the so-called reduced cost functional

Jˆ(u) =J(S(u), u).

Further, we deﬁne the set of admissible controls U_ad=

u∈L²(Q)a≤ıS(u) +cu≤b a.e. inQ , which has the following properties.

Proposition 2.2. The setU_ad is convex, closed and bounded in L²(Q).

Proof. Closedness and convexity follow immediately. Therefore, we only focus on the boundedness ofU_ad. Since y=S(u),i.e.,Ay=f+ju, we havea≤ıy+c u≤ba.e. inQ. Settingv:=ıy+c uwe obtain

A+jc⁻¹ıid y=f +jc⁻¹v and a≤v≤ba.e. inQ.

From this we infer y²_L2(Q)

cL^∞(Q) ≤1

2y_◦²_L2(Ω)+

fL²(Q)+|_c|⁻¹max(aL²(Q),bL²(Q)) yL²(Q).

Hence, y is bounded in L²(Q). Since c ∈L^∞(Q) with|c| ≥_c >0 a.e. in Q, we infer that u is bounded in

L²(Q). This proves the assertion.

Remark 2.3. In the context of Lavrentiev-type regularization of state constrained optimal control problems one is interested in setting c ≡ _n > 0 and studying _n ↓ 0. In this case, our arguments in the proof of Proposition2.2yield the boundedness ofy=y_n inL²(Q) uniformly with respect to _n.

Problem (P) can be equivalently expressed as

min ˆJ(u) s.t. u∈U_ad. ( ˆP)

For accessing the gradient of ˆJ we have to guarantee diﬀerentiability of S. As in [21], one argues thatS is continuously diﬀerentiable as a mapping from L²(Q) toW(0, T). The action of the derivative S(u) (we also write y(u)) on some v ∈ L²(Q), i.e., S(u)v =w, is characterized by the solution w of the initial-boundary value problem

w_t−αΔw=jv inQ,

w= 0 on Σ, (2.7)

w(0) = 0 in Ω.

Considering the adjoint equations (2.4a)–(2.4c) with y = y(u), we ﬁnd that the adjoint state depends on u and λ, i.e., p =p(u, λ). Similarly, we obtain the diﬀerentiability of the adjoint statep(u, λ) considered as a function ofuandλ.

The derivative of ˆJ at a pointu∈L²(Q) is represented by Jˆ(u) =S(u)^∗∂J(S(u), u)

∂y +∂J(S(u), u)

∂u =−p(u) +κ(u−u_d) inQ,

(6)

wherep(u) solves the equation

−pt−αΔp=α_Q(z_Q−y(u)) in Q,

p= 0 on Σ,

p(T) =α_Ω(z_Ω−y(u)(T)) in Ω.

The ﬁrst-order necessary optimality condition of (P) is given by the variational inequalityˆ Jˆ(u^∗), u−u^∗_L2(Q)≥0 for allu∈U_ad.

This is equivalent to

κ(u^∗−u_d)−p^∗+cλ^∗= 0, (2.8a)

λ^∗= max (0, λ^∗+σ(y(u^∗) +cu^∗−b)) + min (0, λ^∗+σ(y(u^∗) +cu^∗−a)) (2.8b) in Q for some arbitrarily ﬁxed, positive σ ∈ L^∞(Q), and with p^∗ = p(u^∗) solving (2.4a)–(2.4c). Thus, the ﬁrst-order necessary optimality conditions for (P) are given by (2.8) together with (2.4a)–(2.4c).ˆ

To express (P) as a bilateral control constrained problem we set ˜ˆ a=a−ı₃A⁻¹f, ˜b =b−ı₃A⁻¹f, with ı₃ denoting the continuous embedding operator fromW(0, T) intoL³(Q) ford≤3. Moreover, we deﬁne the linear and bounded operatorsT =ıA⁻¹j:L²(Q)→L²(Q) andF=T +cid :L²(Q)→L²(Q). By assumptionc= 0 is satisﬁed. Since ıis compact andjas well asA⁻¹ are continuous, the operatorT is compact. If

−cis no eigenvalue of T, (2.9)

we infer from the Fredholm theory that the linear operator F admits a (unique) inverse. Thus, (P) can beˆ expressed equivalently as a bilaterally control constrained problem for the new control variablev:=Fu

min ˜J(v) s.t. v∈V_ad=

v∈L²(Q)˜a≤v≤˜b a.e. inQ

, ( ˜P)

with ˜J = ˆJ ◦ F⁻¹. Notice that (P) is a minimization problem with bilateral control constraints, but with no˜ equality constraints. Of course,v^∗=Fu^∗is the solution of (P). We will make use of (˜ P) when establishing a˜ mesh-independence principle of our algorithm in Section4.

Remark 2.4. Note that the smoothness of the bounds ˜aand ˜b depends on the smoothness ofa,b andA⁻¹f. In particular, if aand b are constant andf ≡0 holds, (P) is an optimal control problem with constant box˜ constraints. On the other hand, higher regularity properties can be ensured by proper assumptions on a, b andf. In our numerical test examples carried out in Section5 we have ˜a, ˜b∈C(Ω)∩C^∞(Ω).

In order to ease the notation, in what follows we frequently neglect the embedding operators.

3. The semismooth Newton method

In this section, to solve (P) numerically a Newton-type algorithm is applied to the first-order necessary optimality conditions of (P). For the generalized (Newton) differentiation of the min- and max-operators inˆ function space we rely on the following definition which is due to [12].

Definition 3.1. LetV,W be two Banach spaces,S⊂V a non-empty open set,F :S→W a given mapping, and v^∗ ∈S. If there exists a neighborhoodN(v^∗)⊂S and a family of mappingsG:N(v^∗)→ L(V, W) such that

vlim_V↓0

1

v_V F(v^∗+v)−F(v^∗)−G(v^∗+v)v_W = 0, (3.1)

(7)

thenF is calledNewton-diﬀerentiable at v^∗, andG(v^∗) is said to be ageneralized derivative (or Newton map) for F atv^∗.

Here,L(V, W) denotes the Banach space of all bounded and linear operators fromV toW endowed with the common norm. Moreover, we writeL(V) =L(V, V).

Remark 3.2. The function max :L^p(Q)→L^s(Q) is Newton differentiable for 1≤s < p≤ ∞(see [12]). If F :L^r(Q)→L^p(Q) is Fréchet differentiable for some 1≤r≤ ∞, then the function

(t,x)→χ_A(t,x)· ∇F(u(t,x)), (t,x)∈Q, (3.2) is a generalized derivative of max(0, F(·)) : L^r(Q)→ L^s(Q). Here, χ_A denotes the characteristic function of the set A⊂Q, whereF(u(·)) is positive,i.e., χ_A(t,x) = 1 ifF(u(t,x))>0 andχ_A(t,x) = 0 otherwise. From min(0, F(·)) =−max(0,−F(·)), we see that an analogous diﬀerentiation formula holds true for the min-function.

Next observe that (2.8a) in (2.4a)–(2.4c) withy^∗=y(u^∗) yields

(A^∗+c⁻¹id)p^∗=α_Q(z_Q−y(u^∗)) +κc⁻¹(u^∗−u_d). (3.3) Consequently, assuming that

−c⁻¹ is no eigenvalue ofA^∗, (3.4)

we havep^∗=p(u^∗)∈Y uniquely. Parabolic regularity results yieldp(u^∗)∈W(0, T). This relation holds true whenevery =y(u) in the right hand side of the adjoint system andκ(u−u_d)−p+cλ= 0 on Σ. Further we concludeλ=λ(u).

Choosingσ=κ/c²∈L^∞(Q) withσ≥κ/ε²_c >0 in (2.8b) and taking into account (2.8a), we obtain c⁻¹

κ(u_d−u^∗) +p^∗(u^∗) −max

0, c⁻¹(κu_d+p^∗(u^∗)) +κc⁻²(y(u^∗)−b)

−min

0, c⁻¹(κu_d+p^∗(u^∗)) +κc⁻²(y(u^∗)−a) = 0.

(3.5)

Thus, we introduce the mappingF :L²(Q)→L²(Q) by F(u) =c⁻¹

κ(u_d−u) +p(u) −max

0, c⁻¹(κu_d+p(u)) +κc⁻²(y(u)−b)

−min

0, c⁻¹(κu_d+p(u)) +κc⁻²(y(u)−a) . (3.6) Then, (3.5) becomes the nonsmooth operator equation

F(u^∗) = 0 in L²(Q). (3.7)

Now suppose ¯u∈L²(Q) is some given approximation ofu^∗. Then, sincey(¯u) as well asp(¯u) are continuously diﬀerentiable from L²(Q) to W(0, T) → L³(Q) and u_d, a, b ∈ L³(Q) by assumption, Remark 3.2 provides Newton diﬀerentiability of the min- and max-terms in (3.5), respectively. A particular Newton map ofF at ¯u in directionu∈L²(Q) is given by

G(¯u)u= 1 c

p(¯u)u−κu − 1 c²χ

λ(¯u)+_c^κ2(y(¯u)+c¯u−b)>0cp(¯u)u+κy(¯u)u

− 1 c²χ

λ(¯u)+_c^κ₂(y(¯u)+c¯u−a)<0cp(¯u)u+κy(¯u)u ,

(3.8)

(8)

whereδy=y(¯u)u∈W(0, T) solves the linearized state equations

δy_t−αΔδy=u inQ, (3.9a)

δy= 0 on Σ, (3.9b)

δy(0) = 0 in Ω, (3.9c)

andδp=p(¯u)u∈W(0, T) solves the linearized adjoint system

−δpt−αΔδp+c⁻¹δp=−αQδy+κc⁻¹u inQ, (3.10a)

δp= 0 on Σ, (3.10b)

δp(T) =−αΩδy(T) in Ω. (3.10c)

It follows by standard arguments that for everyu∈L²(Q) there exist uniquely determinedδy∈W(0, T) and δp∈W(0, T) solving (3.9) and (3.10), respectively.

In Algorithm1we formulate the corresponding generalized Newton method for ﬁndingu^∗∈L²(Q) such that (3.7) holds true. Due to the additional regularity ofF implied by (3.1) we call it a semismooth Newton method.

Algorithm 1(semismooth Newton method).

1: Chooseu⁰∈L²(Q), and setk= 0.

2: repeat

3: ComputeG(u^k) according to (3.8) and solve forδu^k: G(u^k)δu^k =−F(u^k) withF given by (3.6).

4: Setu^k+1 =u^k+δu^k, and k=k+ 1.

5: untilsome stopping rule is satisﬁed.

We have the following convergence result; see [12].

Theorem 3.3. Let {u^k}_k∈Nbe a sequence generated by Algorithm1. Then,{u^k}_k∈Nconverges to the solution u^∗∈U_ad of (P) at aq-superlinear rate provided thatu⁰∈L²(Q) is suﬃciently close tou^∗.

Algorithm 1 can be expressed equivalently as a primal-dual active-set strategy. In fact, using (3.2) and deﬁning λ^k =c⁻¹

κ(u_d−u^k) +p(u^k) and A^k_b =

(t,x)∈Qλ^k+ κ

c²(y^k(u^k) +cu^k−b)>0 a.e.

, A^k_a =

(t,x)∈Qλ^k+ κ

c²(y^k(u^k) +cu^k−a)<0 a.e.

, I^k =Q\A^k, withA^k =A^k_b∪A^k_a,

(3.11)

we obtain the following linearization of (3.5) at (u, y(u), p(u)) with respect to the independent variableu:

1 c

κ(u_d−u^k−δu^k) +p(u^k) +p(u^k)δu^k

−1 c²χ_Ak

b

c(κu_d+p(u^k) +p(u^k)δu^k) +κ(y(u^k) +y(u^k)δu^k−b)

−1 c²χ_Ak

a

c(κu_d+p(u^k) +p(u^k)δu^k) +κ(y(u^k) +y(u^k)δu^k−a)

= 0.

(3.12)

(9)

Hereδu∈L²(Q) represents the increment. A closer look reveals:

y(u^k) +y(u^k)δu^k+cu^k+1=aonA^k_a, (3.13a) y(u^k) +y(u^k)δu^k+cu^k+1=bonA^k_b, (3.13b) κ(u^k+1−u_d)−p(u^k)−p(u^k)δu^k= 0 onI^k, (3.13c) whereu^k+1=u^k+δu^k. Note that (3.13c) can be viewed as

λ^k+δλ^k = 0 onI^k, (3.13d)

withδλ^k =−κδu^k+p(u^k)δu^k. The active respectively inactive set behavior of the variables in (3.13) motivates Algorithm2.

Algorithm 2(primal-dual active set strategy).

1: Choose starting valuesλ⁰ andu⁰, computey⁰=S(u⁰) andp⁰ satisfying (2.4a)–(2.4d). Setk= 0.

2: repeat

3: Determine the active setsA^k_a,A^k_b and the inactive setI^k according to (3.11).

4: Compute the (unique) solutionx^k = (u^k, y^k) with pertinent multiplierλ^k and adjoint state p^k of

⎧⎨

⎩

min J(x) over x= (y, u)∈X s.t. e(x) = 0, y(0) =y_◦,

y+cu=aonA^k_a andy+cu=bonA^k_b.

(3.14)

5: Setk=k+ 1.

Utilizing the setting onA^k_a andA^k_b, note that the feasible set of the minimization problem (3.14) in step 4 of Algorithm2becomes

y_t−αΔy+c⁻¹χ_Ak

a∪A^k_by=f +χ_Iku+c⁻¹(χ_Ak

aa+χ_Ak bb), y= 0 on Σ, y(0) =y_◦ in Ω.

Givenu, this system admits a unique solution. The radial unboundedness ofJ with respect touon the feasible set now guarantees the existence of a unique solution to (3.14).

Letδy^k=y(u^k)δu^k andδp^k=p(u^k)δu^k. Theny^k+1=y^k+δy^k solves

y^k+1_t −αΔy^k+1=f+u^k+1 in Q, (3.15a)

y^k+1= 0 on Σ, (3.15b)

y^k+1(0) =y_◦ in Ω. (3.15c)

Furthermore,p^k+1=p^k+δp^k satisﬁes

−p^k+1_t −αΔp^k+1+λ^k+1=−αQ(y^k+1−z_Q) inQ, (3.16a)

p^k+1= 0 on Σ, (3.16b)

p^k+1(T) =−αΩ(y^k+1(T)−z_Ω) in Ω. (3.16c)

(10)

Utilizing (3.13) we derive from (3.15a) y^k+1_t −αΔy^k+1+1

cχ_Aky^k+1−1

κχ_Ikp^k+1=f+1 c

χ_Ak

aa+χ_Ak bb

+χ_Iku_d (3.17) withA^k =A^k_a∪A^k_b. Since

λ^k+1=λ^k+δλ^k =1 c

p^k+1+κ(u_d−u^k+1) (see (2.4d)), we conclude from (3.13d) that

−p^k+1_t +

−αΔ + 1 cχ_Ak

p^k+1+

α_Q+ κ c²χ_Ak

y^k+1=α_Qz_Q−κ c

χ_Aku_d−1 c

χ_Ak

aa+χ_Ak

bb . (3.18) Summarizing, (3.17) and (3.18) can be written compactly as

α_Q+_c^κ2χ_Ak −_∂t^∂ −αΔ + ¹_cχ_Ak

∂t∂ −αΔ +¹_cχ_Ak −¹_κχ_Ik

y^k+1 p^k+1

=

α_Qz_Q−^κ_c

χ_Aku_d−¹_c(χ_Ak

aa+χ_Ak bb) f +¹_c

χ_Akaa+χ_Ak

bb +χ_Iku_d

. (3.19) This is the reduced form of the Newton system which is used in our numerics; compare (5.1) in Section 5.

4. Mesh-independence

In this section we give suﬃcient conditions for the mesh-independent convergence of Algorithm2. The proof technique is based on a combination of arguments in [15] and [9]. Throughout we assumey_◦, z_Q,z_Ω, α_Q, and α_Ωare suﬃciently regular; see [15], Table 1.

We proceed as in [15], Section 3, and [8]. Let Ω_h be a family of grids depending on the parameterh >0.

On these grids,P_h⁰ andP_h¹are the spaces of all piecewise constant respectively piecewise linear ﬁnite elements.

Furthermore, we deﬁne the operators of orthogonal projections byRⁱ_h:L²(Ω)→P_hⁱ fori∈ {0,1}. We suppose that the ﬁnite element grids are chosen in such a way that

ϕ− Rⁱ_hϕ_Hr(Ω)≤c_Ωh^s−rϕ_Hs(Ω) for allϕ∈H^s(Ω),

where r∈[0, i],s∈[r, i+ 1], c_Ω >0, and i∈ {0,1}. Next we introduce approximations for functions deﬁned onQ. Lett_j=jh_t, 0≤j≤n_t, be a chosen grid in [0, T] with step sizeh_t=T /n_t. To simplify the presentation, i.e., to avoid terms of the typeO(ht+h²) in our error analysis, we couple the time and spatial discretization in the following manner: we suppose that there are constants 0< c₁≤c₂ such that

c₁h²≤h_t≤c₂h²; (4.1)

see,e.g., [8].

Next we deﬁne the ﬁnite dimensional spaces S_h⁰ =

ϕ_h:Q→Rϕ_h(t) =ϕ_h(t_j)∈P_h⁰, t∈[t_j−1, t_j), 1≤j≤n_t , S_h¹ =

ϕ_h:Q→Rϕ_h(t) =P_j^j−1(ϕ_h)(t), t∈[t_j−1, t_j], ϕ_h(t_j)∈P_h¹, 1≤j≤n_t

, where

P_j^j−1(ϕ_h)(t) =t_j−t

h_t ϕ_h(t_j−1) +t−t_j−1

h_t ϕ_h(t_j) fort∈[t_j−1, t_j].

The corresponding restriction operators are G_hⁱ : L²(Q) → S_hⁱ with i ∈ {0,1}. Note that the elements of S_h⁰ are piecewise constant in space and time (on intervals [t_j−1, t_j)), and the elements ofS_h¹ are piecewise linear in space and time.

(11)

The operatorA⁻¹is approximated byA⁻¹_h :Y→S_h¹as follows: for everyg∈Ythe elementy_h=A⁻¹_h g∈S_h¹

solves

Ω

y_h(t_j)−y_h(t_j−1)

h_t ϕ_h+α∇yh(t_j)· ∇ϕhdx= 1

h_t _t_j

tj−1

g(t) dt, ϕ_h

H⁻¹(Ω),H¹₀(Ω)

(4.2a) for all 1≤j≤n_tandϕ_h∈P_h¹ together with the boundary and initial conditions

y_h(t_j,·) = 0 on Γ, 1≤j≤n_t, and y_h(0) =R¹_hy_◦in Ω. (4.2b) Hence, we have applied the implicit Euler method for the time integration and the piecewise linear ﬁnite element method for the spatial discretization.

By Sh : L²(Q) → S_h¹ ⊂ W(0, T), u → A⁻¹_h (f +u) we introduce an approximation of the aﬃne linear operatorS introduced in Section2.2. The discretized set of admissible controls is deﬁned as

U_ad^h =

u_h∈S_h⁰a≤ı_hSh(u_h) +cu_h≤b a.e. inQ ,

where ı_h denotes the (orthogonal) projection of Y ontoS_h⁰. We may considerj|_S_h⁰ =ı^∗_h, where ı^∗_h denotes the adjoint ofı_h. Further, for simplicity we assumea, b, c, u_d, α_Q ∈S_h⁰; otherwise a corresponding discretization has to be considered.

Then, (P) is discretized by the family of ﬁnite-dimensional problemsˆ

min ˆJ_h(u_h) s.t. u_h∈U_ad^h ( ˆP_h) with ˆJ_h(u_h) =J(Sh(u_h), u_h) foru_h∈S_h⁰.

Proceeding as in Section 3 we formulate a semismooth Newton method for (Pˆ_h). For that purpose we introduce the operatorF_h:S_h⁰→S_h⁰,

F_h(u_h) =c⁻¹

κ(u_d−u_h) +ı_hp_h(u_h)

−max

0, c⁻¹(κu_d+ı_hp_h(u_h)) +κc⁻²(ı_hy_h(u_h)−b)

−min

0, c⁻¹(κu_d+ı_hp_h(u_h)) +κc⁻²(ı_hy_h(u_h)−a) ,

(4.3)

wherey_h(u_h) =A⁻¹_h (f+ı^∗_hu_h)∈S_h¹ andp_h(u_h)∈S_h¹solves the adjoint equation of (4.2):

Ω

p_h(t_j)−p_h(t_j−1)

h_t ϕ_h−α∇ph(t_j−1)· ∇ϕhdx= 1 h_t

_t_j

tj−1

(α_Q(t)(y_h(t)−z_Q(t)), ϕ_h)_L2(Ω)dt (4.4a) for all 1≤j≤n_tandϕ_h∈P_h¹ together with the boundary and initial conditions

p_h(t_j,·) = 0 on Γ, 0≤j≤n_t−1, p_h(0) =R¹_h

α_Ω(z_Ω−y_h(T)) in Ω. (4.4b)

As a mapping between ﬁnite dimensional spaces,F_h is Newton-diﬀerentiable. The generalized derivative ofF_h at a point ¯u_h∈S⁰_hin directionu_his given by

G_h(¯u_h)u_h=c⁻¹

ı_hp_h(¯u_h)u_h−κu_h

−c⁻²χ

λh(¯uh)+_c^κ₂(ıhyh(¯uh)+c¯uh−b)>0cı_hp_h(¯u_h)u_h+κı_hy_h(¯u_h)u_h

−c⁻²χ

λh(¯uh)+_c^κ₂(ıhyh(¯uh)+c¯uh−a)<0cı_hp_h(¯u_h)u_h+κı_hy_h(¯u_h)u_h ,

(4.5)

(12)

wherey_h=y_h(¯u_h)u_h∈S_h¹ solves (compactly written)

Ahy_h=ı^∗_hu_h (4.6a)

together with the boundary and initial conditions

y_h(t_j,·) = 0 on Γ, 1≤j≤n_t, and y_h(0) = 0 in Ω (4.6b) for all 1≤j≤n_t, andp_h=p_h(¯u_h)u_h∈S_h¹is the solution to

Ω

p_h(t_j)−p_h(t_j−1)

h_t ϕ_h−α∇p_h(t_j−1)· ∇ϕ_h−c(t_j−1)⁻¹p_h(t_j−1)ϕ_hdx= 1

h_t _t_j

tj−1

(α_Q(t)y_h(t)−κc(t)⁻¹u_h(t), ϕ_h)_L2(Ω)dt (4.7a) for all 1≤j≤n_tandϕ_h∈P_h¹ together with the boundary and initial conditions

p_h(t_j,·) = 0 on Γ, 0≤j ≤n_t−1, and p_h(0) =−R¹_h

α_Ωy_h(T) in Ω. (4.7b) The discretized version of Algorithm 1is given by Algorithm3.

Algorithm 3(discretized semismooth Newton method).

1: Chooseu⁰_h∈S_h⁰, and setk= 0.

2: repeat

3: ComputeG_h(u^k_h) according to (4.5) and solve forδu^k_h: G_h(u^k_h)δu^k_h=−Fh(u^k_h), withF_h given by (4.3).

4: Setu^k+1_h =u^k_h+δu^k_h, andk=k+ 1.

We have the following convergence theorem; see [7,12].

Theorem 4.1. Let {u^k_h}_k∈N be a sequence in S_h⁰ generated by Algorithm 3. Then, {u^k_h}_k∈N converges to the solution u^∗_h∈U_ad^h of (Pˆ_h)at a superlinear rate provided thatu⁰_h∈L²(Q)is suﬃciently close tou^∗_h.

To apply the results in [15] we have to introduce an approximation for the bilaterally control constrained problem (P). For that purpose we set˜ Th=ı_hA⁻¹_h ı^∗_h:S_h⁰→S_h⁰ andFh= (Th+cid).

Note that due to the existence of a unique solution of (4.2) and [8], Corollary 3.1, we have Th_L(S0

h,L²(Q))= sup

uh_S0

h=1ıhA⁻¹_h ı^∗_hu_h_L2(Q)≤c_A and further

Fh_L(S0

h,L²(Q))≤c_A+c_L^∞_(Q)=:c_F.

Letu∈L²(Q) be given. Fory=Tu_h andy_h=Thu_hwe have the estimate [8], Corollary 3.1,

y−y_h_L2(Q)≤c_Th²y_L2(0,T;H²(Ω))∩H¹(0,T;L²(Ω)) (4.8)