
HAL Id: cel-00392170

https://cel.archives-ouvertes.fr/cel-00392170

Submitted on 5 Jun 2009

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


Optimal control of ordinary differential equations

Frédéric Bonnans

To cite this version:

Frédéric Bonnans. Optimal control of ordinary differential equations. 3rd cycle. Castro Urdiales (Espagne), 2006, pp.81. ⟨cel-00392170⟩


Optimal control of ordinary differential equations¹

J. Frédéric Bonnans²

August 11, 2006

¹ Lecture notes, CIMPA School on Optimization and Control, Castro Urdiales, August 28 - September 8, 2006.

² INRIA-Futurs and Centre de Mathématiques Appliquées (CMAP), École Polytechnique, 91128 PALAISEAU Cedex, France. Email: Frederic.Bonnans@inria.fr.


Contents

1 Linear quadratic control and control constrained problems
  1.1 Unconstrained problems
    1.1.1 Critical points of quadratic functionals
    1.1.2 Shooting function and Hamiltonian flow
    1.1.3 Riccati equation
    1.1.4 Expression of the critical value
    1.1.5 Legendre forms and minima of quadratic functions
    1.1.6 Spectral analysis
  1.2 Polyhedric constraints
    1.2.1 Overview
    1.2.2 Second-order necessary optimality conditions
    1.2.3 Polyhedric sets
    1.2.4 Stability of solutions
    1.2.5 Sensitivity analysis
    1.2.6 Bound constraints in spaces of summable square
  1.3 Convex constraints on control variables
    1.3.1 Framework
    1.3.2 First-order necessary optimality conditions
    1.3.3 Second-order necessary optimality conditions
  1.4 Notes

2 Nonlinear optimal control
  2.1 Unconstrained nonlinear optimal control
    2.1.1 Setting
    2.1.2 First-order optimality conditions
    2.1.3 Pontryagin's principle
    2.1.4 Legendre-Clebsch conditions
    2.1.5 Abstract second-order necessary optimality conditions
    2.1.6 Specific second-order necessary optimality condition
    2.1.7 Second-order sufficient optimality conditions
  2.2 Control constrained problems
    2.2.1 Bound constraints: necessary conditions
    2.2.2 General sufficient second-order conditions


Foreword

These notes give an introduction to the theory of optimal control of ordinary differential equations, and to some related algorithmic questions. We put the emphasis on the question of well-posedness (or not) of a local minimum.

For a system of nonlinear equations, the main tool for checking well-posedness of a local solution is the implicit function theorem. We are sometimes able to reduce optimality conditions to this setting. However, there are situations when we cannot, and then several concepts of well-posedness may be used, based on the stability or uniqueness of local minimizers, or of solutions of optimality conditions, at different rates (strong regularity, strong stability, Hölder stability, etc.). In addition, a number of functional analysis tools are needed: characterization of dual spaces, separation theorems, convex analysis.

The point of view taken in these notes is, starting from “concrete situations” (i.e. optimal control problems), to introduce gradually the theoretical concepts needed for either a numerical resolution or a sensitivity analysis of the problem. So in some sense we take the point of view of a (mathematical) engineer, but without being afraid of using abstract tools when necessary. Two chapters have been written for the occasion of the course, and the notes also include the papers [6] and [9], coauthored with A. Hermant and J. Laurent-Varin, respectively. Let me also mention the related papers [3, 8, 7].

These notes are in some sense a continuation of the book [10], written with A. Shapiro, devoted to the sensitivity analysis of general optimization problems. Many papers have since clarified the link between optimization theory and optimal control problems. A classical and still useful reference is Ioffe and Tihomirov [17]. A more recent book on optimal control is Milyutin and Osmolovskii [21].

I thank Eduardo Casas and Michel Théra for giving me the opportunity to present this material, and hope that these notes will motivate students to enter this field and obtain new results. All remarks are welcome.


Chapter 1

Linear quadratic control and control constrained problems

Linear quadratic optimal control problems occur in several situations:

(i) linearization of the dynamics around a stationary point (where the derivative is zero) and stabilization around that point;

(ii) study of the optimality conditions of a critical point of an optimal control problem;

(iii) sensitivity analysis of a local solution of an optimal control problem.

In the first section of this chapter we present the theory of critical points, including the shooting formulation and the Riccati equation. Then we relate the notion of Legendre form to the case when we have to solve a minimization problem.

In the second section we present a no-gap theory of second-order optimality conditions as well as a sensitivity analysis, in an abstract framework: nonlinear cost function and polyhedric constraints. We show how this applies to linear quadratic optimal control problems with bound constraints.

In the third section we study the case of nonlinear local constraints on the control, of the form

U = {u ∈ R^m ; g_i(u) ≤ 0, i = 1, . . . , r},   (1.0.1)

where the functions g_i are convex and continuous. Then the curvature of these functions has to be taken into account.

Notations. We denote the Euclidean norm of x ∈ R^n by |x|. The transpose of a matrix A is denoted A^T.

1.1 Unconstrained problems

1.1.1 Critical points of quadratic functionals

Consider the following dynamical system

ẏ_t = A_t y_t + B_t u_t,  t ∈ [s, T];  y_s = x,   (1.1.2)

where s ≤ T, and the matrices A_t and B_t, measurable and essentially bounded functions of time, are of size n × n and n × m respectively. Denote the control and state spaces by

U := L^2(0, T; R^m);  Y := H^1(0, T; R^n).

We know that with each u∈ U is associated a unique solution in Y of (1.1.2), called the state and denoted y(u). Define the criterion

F(u, y) := (1/2) ∫_s^T [y_t·C_t y_t + 2u_t·D_t y_t + u_t·R_t u_t] dt + (1/2) y_T·M y_T.   (1.1.3)

The matrices C_t, D_t and R_t are measurable, essentially bounded functions of time, of appropriate dimensions. The function F is therefore well defined from U × Y to R. Denote

f (u) := F (u, y(u)).

Being quadratic and continuous, f has a gradient and the latter is an affine function of u. We say that u is a critical point of f if Df (u) = 0.

In order to compute the gradient, let us introduce the adjoint state (or costate) equation

−ṗ_t = A_t^T p_t + C_t y_t + D_t^T u_t,  t ∈ [s, T];  p_T = M y_T.   (1.1.4)

The costate p ∈ Y associated with the control u ∈ U is defined as the unique solution of (1.1.4), where y = y(u).

Remark 1.1 A general method for finding the costate equation is as follows: let

L(u, y, p) := F(u, y) + ∫_s^T p_t · (A_t y_t + B_t u_t − ẏ_t) dt

denote the Lagrangian associated with the cost function F and state equation (1.1.2). Then the costate equation is obtained by setting to zero the derivative of the Lagrangian with respect to the state.

Proposition 1.2 The quadratic mapping u ↦ f(u) is of class C^∞ from U to R, and its gradient satisfies

Df(u)_t = B_t^T p_t + R_t u_t + D_t y_t,  t ∈ [s, T],   (1.1.5)

where y and p are the state and costate associated with u.

The stationary points of f are therefore characterized by the (algebraic-differential) two-point boundary value problem (TPBVP)

ẏ_t = A_t y_t + B_t u_t,  t ∈ [s, T];  y_s = x,   (1.1.6)
−ṗ_t = A_t^T p_t + C_t y_t + D_t^T u_t,  t ∈ [s, T];  p_T = M y_T,   (1.1.7)
0 = B_t^T p_t + R_t u_t + D_t y_t.   (1.1.8)
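The gradient formula (1.1.5) can be checked numerically on a coarse Euler discretization of a scalar instance. The sketch below is not from the notes; all parameter values are illustrative assumptions, and the backward recursion is the discrete counterpart of the costate equation (1.1.4).

```python
# Illustrative scalar instance (assumed values): dy/dt = a y + b u,
# cost (1/2)∫ (c y^2 + 2 d u y + r u^2) dt + (1/2) M y_T^2, Euler-discretized.
a, b, c, d, r, M, x, T = -0.5, 1.0, 1.0, 0.2, 1.0, 1.0, 1.0, 1.0
N = 50
h = T / N

def cost(u):
    """Discrete cost of the control vector u (length N)."""
    y, J = x, 0.0
    for k in range(N):
        J += h * 0.5 * (c*y*y + 2*d*u[k]*y + r*u[k]*u[k])
        y += h * (a*y + b*u[k])
    return J + 0.5 * M * y * y

def gradient(u):
    """Gradient via the discrete costate, mirroring (1.1.4)-(1.1.5):
    g_k = h (b p_{k+1} + r u_k + d y_k)."""
    ys = [x]
    for k in range(N):                       # forward pass: store the state
        ys.append(ys[-1] + h*(a*ys[-1] + b*u[k]))
    lam = M * ys[N]                          # discrete p_T = M y_T
    g = [0.0] * N
    for k in reversed(range(N)):             # backward pass: costate
        g[k] = h * (b*lam + r*u[k] + d*ys[k])
        lam = lam*(1 + h*a) + h*(c*ys[k] + d*u[k])
    return g

u = [0.3 - 0.1*k/N for k in range(N)]
g = gradient(u)

# Central differences are exact for a quadratic cost, up to roundoff.
max_err, delta = 0.0, 1e-5
for k in (0, N//2, N-1):
    up, um = u[:], u[:]
    up[k] += delta; um[k] -= delta
    max_err = max(max_err, abs((cost(up) - cost(um))/(2*delta) - g[k]))
```

With these (assumed) values, max_err stays at roundoff level, confirming that one backward sweep of the costate reproduces the derivative of the discretized cost.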

In the sequel we will often assume R_t uniformly invertible:

u·R_t u ≥ α|u|², for all u ∈ R^m, for a.a. t ∈ (0, T), for some α > 0.   (1.1.9)

Eliminating the control variable from relation (1.1.8), we then obtain that the triple (u, y, p) is a solution of (1.1.6)-(1.1.8) iff (y, p) is a solution of the differential two-point boundary value problem

ẏ_t = (A_t − B_t R_t^{-1} D_t) y_t − B_t R_t^{-1} B_t^T p_t,  t ∈ [s, T];   (1.1.10)
−ṗ_t = (C_t − D_t^T R_t^{-1} D_t) y_t + (A_t^T − D_t^T R_t^{-1} B_t^T) p_t,  t ∈ [s, T];   (1.1.11)
y_s = x,  p_T = M y_T.   (1.1.12)

Equations (1.1.10)-(1.1.12) may be rewritten as Ψ(y, p) = 0 (by putting all expressions on the right-hand side), the mapping Ψ being linear and continuous from Y × Y to L^2(0, T; R^n) × L^2(0, T; R^n) × R^{2n}. The only nonhomogeneous term is due to the given initial point x. Therefore the set of stationary points is a closed affine space, and there exists at most one stationary point iff the above system, with x = 0, has y = 0 and p = 0 as its only solution.

1.1.2 Shooting function and Hamiltonian flow

Let us introduce the shooting function

S_{s,T} : R^n → R^n;  q ↦ p_T − M y_T,

where (y, p) ∈ Y × Y is the solution of (1.1.10)-(1.1.11) with initial condition (x, q) at time s. We can easily see that

Lemma 1.3 Assume that (1.1.9) holds. Then the control function u is a stationary point of f iff the associated costate p is such that p_s is a zero of S_{s,T}.

The problem of finding the critical points of f therefore reduces to solving a linear equation in R^n.
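This linearity can be exploited directly. The sketch below (not from the notes; all parameter values are illustrative assumptions, with D = 0 and scalar data) evaluates the affine shooting function by two integrations of (1.1.10)-(1.1.11) and solves S(q*) = 0 in closed form:

```python
# A minimal shooting sketch for a scalar instance (assumed values):
# dy/dt = a y + b u, cost (1/2)∫_s^T (c y^2 + r u^2) dt + (1/2) M y_T^2, D = 0.

def hamiltonian_flow(q, x=1.0, s=0.0, T=1.0, a=0.0, b=1.0, c=1.0, r=1.0, n=2000):
    """RK4 integration of (1.1.10)-(1.1.11) from (y_s, p_s) = (x, q); returns (y_T, p_T)."""
    h = (T - s) / n
    def rhs(y, p):
        return a*y - (b*b/r)*p, -(c*y + a*p)
    y, p = x, q
    for _ in range(n):
        k1y, k1p = rhs(y, p)
        k2y, k2p = rhs(y + 0.5*h*k1y, p + 0.5*h*k1p)
        k3y, k3p = rhs(y + 0.5*h*k2y, p + 0.5*h*k2p)
        k4y, k4p = rhs(y + h*k3y, p + h*k3p)
        y += h/6*(k1y + 2*k2y + 2*k3y + k4y)
        p += h/6*(k1p + 2*k2p + 2*k3p + k4p)
    return y, p

def shooting(q, M=1.0):
    """Shooting function S_{s,T}(q) = p_T - M y_T."""
    yT, pT = hamiltonian_flow(q)
    return pT - M*yT

# S_{s,T} is affine in q, so two evaluations determine it; solve S(q*) = 0.
S0, S1 = shooting(0.0), shooting(1.0)
q_star = -S0 / (S1 - S0)
residual = shooting(q_star)
```

For these particular values one can check by hand that the associated Riccati equation has the constant solution P ≡ 1, so the computed q_star = p_s should equal P_s x = 1.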

Denote by Φ_{s,t} the “flow” associated with (1.1.10)-(1.1.11). In other words, Φ_{s,t} associates with (x, q) the value (y_t, p_t) obtained by integrating (1.1.10)-(1.1.11) over [s, t]. Denote by Φ^y_{s,t} and Φ^p_{s,t} the first n and last n components of Φ_{s,t}. We have

d/dt Φ_{s,t} = ( A_t − B_t R_t^{-1} D_t        −B_t R_t^{-1} B_t^T
               −C_t + D_t^T R_t^{-1} D_t    −A_t^T + D_t^T R_t^{-1} B_t^T ) Φ_{s,t}.   (1.1.13)

The Hamiltonian function H : R^m × R^n × R^n × R → R associated with the original system is

H(u, y, p, t) := (1/2)(y·C_t y + 2u·D_t y + u·R_t u) + p·(A_t y + B_t u).   (1.1.14)

By substituting u = −R_t^{-1}(B_t^T p + D_t y), we obtain the reduced Hamiltonian

H(y, p, t) := (1/2) y·C_t y + p·A_t y − (1/2)(B_t^T p + D_t y)·R_t^{-1}(B_t^T p + D_t y).   (1.1.15)


The matrix in (1.1.13), denoted by M^H_t, is called the Hamiltonian matrix associated with the critical point problem. It satisfies the relation

M^H_t = (  ∂²H/∂p∂y (y, p, t)    ∂²H/∂p∂p (y, p, t)
          −∂²H/∂y∂y (y, p, t)   −∂²H/∂y∂p (y, p, t) )   (1.1.16)

We may write the shooting equation in the form

Φ^p_{s,T}(x, p_s) = M Φ^y_{s,T}(x, p_s).   (1.1.17)

Since Φ_{s,t} is linear, this can be rewritten as

Φ^p_{s,T}(0, p_s) − M Φ^y_{s,T}(0, p_s) = −Φ^p_{s,T}(x, 0) + M Φ^y_{s,T}(x, 0).   (1.1.18)

Lemma 1.4 Assume that (1.1.9) holds. Then, when s is close to T, S_{s,T} is invertible, i.e., there exists a unique stationary point of f.

Proof. It is easy to check that S_{s,T} is a continuous function of s, and that S_{s,T}(q) → q − Mx when s ↑ T. Therefore S_{s,T} is invertible for s close to T. The conclusion follows. ∎

Definition 1.5 We say that s < T is a conjugate point of T if Ss,T is not invertible. Denote by T the set of times s < T which are not conjugate, i.e., for which Ss,T is invertible.

Obviously T is an open set. If all matrices are (real) analytic functions of time (i.e., locally expandable in power series), then the shooting function is also an analytic function, and there are at most finitely many conjugate points in any bounded interval. To see this, observe that the determinant of the Jacobian of the shooting function is a nonzero analytic function of time, so that it may have only a finite number of zeroes over a bounded interval of R. Now T is the set of times for which this determinant does not vanish.

We say that (y, p) is a singular solution of the two-point boundary value problem (1.1.10)-(1.1.12) if it is a nonzero solution of (1.1.10)-(1.1.12) with x = 0. We can express the fact that a time is a conjugate point using singular solutions.

Lemma 1.6 A time τ is a conjugate point of T iff there exists a singular solution of (1.1.10)-(1.1.12).

Proof. We have that τ is a conjugate point iff the shooting equation has a nonzero solution q with zero initial condition x. Integrating (1.1.10)-(1.1.12) with initial condition (0, q), we derive the conclusion. ∎

1.1.3 Riccati equation

Let s ∈ T. Since S_{s,T} is affine, with a right-hand side that is a linear function of x, p_s is a linear function of x. So we may write

p_s = P_s x,

where P_s is a square matrix of size n. For all σ ∈ T ∩ ]s, T[, the solution (y, p) of (1.1.10)-(1.1.12), restricted to [σ, T], is a stationary point with initial condition y_σ, and so

p_σ = P_σ y_σ.

By standard results on ordinary differential equations, S_t, and hence P_t, are differentiable functions of t. Substituting P_t y_t for p_t in (1.1.11), and factoring out y_t, we get

0 = P_t ẏ_t + [Ṗ_t + (C_t − D_t^T R_t^{-1} D_t) + (A_t^T − D_t^T R_t^{-1} B_t^T) P_t] y_t,  t ∈ T.   (1.1.19)

Using the expression of ẏ_t in (1.1.10) with p_t = P_t y_t, we obtain

0 = [Ṗ_t + P_t A_t + A_t^T P_t + C_t − (P_t B_t + D_t^T) R_t^{-1} (B_t^T P_t + D_t)] y_t,  t ∈ T.   (1.1.20)

Since this must be satisfied for all possible values of y_t (take s = t and then y_t = x is arbitrary), we obtain that P is a solution of the Riccati equation

0 = Ṗ_t + P_t A_t + A_t^T P_t + C_t − (P_t B_t + D_t^T) R_t^{-1} (B_t^T P_t + D_t),  t ∈ T.   (1.1.21)

Denote by τ0 the largest conjugate point (i.e., the first one starting backwards from T). If no conjugate point exists, we set τ0 = −∞.
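A minimal numerical sketch (not from the notes; parameter values are assumptions, with D = 0 and scalar data, so that (1.1.21) reduces to a scalar ODE): integrate backward from the final condition P_T = M and compare with a closed-form solution.

```python
# With D = 0 and scalar data, (1.1.21) reads dP/dt = -2 a P - c + (b^2/r) P^2.

def riccati_backward(M, s, T, a, b, c, r, n=2000):
    """Return P_s, computed by RK4 integration of (1.1.21) backward from P_T = M."""
    h = (T - s) / n
    def rhs(P):
        return -2*a*P - c + (b*b/r)*P*P
    P = M
    for _ in range(n):
        k1 = rhs(P)
        k2 = rhs(P - 0.5*h*k1)
        k3 = rhs(P - 0.5*h*k2)
        k4 = rhs(P - h*k3)
        P -= h/6*(k1 + 2*k2 + 2*k3 + k4)
    return P

# With a = c = 0 and b = r = 1 the equation is dP/dt = P^2, whose solution
# with P_T = M is P_t = 1/(1/M + T - t); for M = T = 1 this gives P_0 = 1/2.
P0 = riccati_backward(M=1.0, s=0.0, T=1.0, a=0.0, b=1.0, c=0.0, r=1.0)
exact = 0.5
```

The matrix-valued case is handled the same way, stepping (1.1.21) backward from P_T = M componentwise.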

Lemma 1.7 The Riccati operator P_t (defined on T) is symmetric.

Proof. (i) We have that P_t is symmetric on (τ0, T], since the final condition is symmetric, and the derivative is symmetric on the subspace of symmetric matrices.

(ii) We approximate the data by convolution with a smooth kernel (so as to obtain C^∞ data), and then by polynomials. In that case Φ_{s,T} is an analytic function of time, and hence so is the solution p_s of (1.1.18). Since each column of P_s is the solution of (1.1.18) when x is a basis vector, we obtain that P_s is also an analytic function of time. Being symmetric for s close to T, it must be symmetric everywhere. ∎

Lemma 1.8 The Riccati equation (1.1.21), with final condition P_T = M, has a unique solution over (τ0, T], which, if τ0 is finite, satisfies lim_{t↓τ0} ‖P_t‖ = +∞.

Proof. It is a standard result of the theory of ODEs that, since (1.1.21) is a differential equation with locally Lipschitz dynamics, it has a unique solution over a maximal interval of the form (τ1, T], and that, if τ1 is finite, lim_{t↓τ1} ‖P_t‖ = +∞.

Since (1.1.21) has a solution over T, we obtain that τ1 ≤ τ0. If τ0 = −∞ the conclusion follows. Otherwise, assume that lim sup_{t↓τ0} ‖P_t‖ < +∞. Then (1.1.21) would have a solution over [τ0, T]. But then p_t = P_t y_t would be a solution of the two-point boundary value problem over [τ0, T], for any initial condition x. This contradicts the non-invertibility of the shooting mapping. ∎

Remark 1.9 Let τ be a (necessarily isolated) conjugate point. Then

lim_{s→τ} ‖P_s‖ = +∞;

otherwise P_τ would be well-defined, and p = P_τ x would provide a solution of the shooting equations, for arbitrary x, in contradiction with the definition of a conjugate point.
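This blow-up can be observed numerically. For the scalar instance below (an assumed example: a = 0, b = r = 1, c = −1, M = 0), the backward solution of (1.1.21) is P_t = −tan(T − t), which blows up exactly at the conjugate point τ0 = T − π/2:

```python
import math

def blow_up_time(M, T, a, b, c, r, h=1e-4, bound=1e6):
    """Integrate dP/dt = -2aP - c + (b^2/r)P^2 backward from P_T = M (RK4)
    and return the first time at which |P| exceeds `bound`."""
    def rhs(P):
        return -2*a*P - c + (b*b/r)*P*P
    t, P = T, M
    while abs(P) < bound and t > T - 10.0:   # the guard avoids an endless loop
        k1 = rhs(P)
        k2 = rhs(P - 0.5*h*k1)
        k3 = rhs(P - 0.5*h*k2)
        k4 = rhs(P - h*k3)
        P -= h/6*(k1 + 2*k2 + 2*k3 + k4)
        t -= h
    return t

tau0 = blow_up_time(M=0.0, T=2.0, a=0.0, b=1.0, c=-1.0, r=1.0)
# tau0 should be close to T - pi/2 = 2 - pi/2
```

The detected time agrees with T − π/2 up to the step size; past that time the two-point boundary value problem ceases to have a unique solution.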


1.1.4 Expression of the critical value

With every critical point u at time s is associated the critical value f(u). The latter has, when s ∈ T, a simple expression involving P_s. Since

y_T·M y_T = y_T·p_T = x·p_s + ∫_s^T (ẏ_t·p_t + y_t·ṗ_t) dt,   (1.1.22)

we obtain, combining with (1.1.2) and (1.1.4), that

y_T·M y_T = x·p_s + ∫_s^T [p_t·B_t u_t − y_t·C_t y_t − y_t·D_t^T u_t] dt.   (1.1.23)

Using (1.1.23) and (1.1.8) to evaluate the critical value as a function of x, denoted f(x), we obtain

f(x) = (1/2) x·p_s.   (1.1.24)

In particular, if s ∈ T, then

f(x) = (1/2) x·P_s x.   (1.1.25)

Consequently, the nonnegativity of f is equivalent to the positive semidefiniteness of P_s.

1.1.5 Legendre forms and minima of quadratic functions

We consider in this section the problem of minimizing the quadratic cost f. A local minimum ū satisfies the second-order necessary condition

Df(ū) = 0 and D²f(ū) ⪰ 0.   (1.1.26)

Since D²f(·) is constant, this means that ū is a stationary point of f and that f is convex. In that case we know that critical points coincide with global minima.

The next step is to study the well-posedness of local minima. The latter may be defined as the invertibility of D²f(ū), so that the implicit function theorem applies to a smooth perturbation of the critical point equation Df(ū) = 0. The following is proved in [10, Lemma 4.124].

Lemma 1.10 Assume that D²f(ū) ⪰ 0. Then D²f(ū) is invertible iff it is uniformly positive, in the following sense: there exists α > 0 such that

D²f(ū)(h, h) ≥ α‖h‖².   (1.1.27)

Since f is quadratic, its Hessian is uniformly positive iff f satisfies the following quadratic growth condition.

Definition 1.11 Let ū be a stationary point of f. We say that the quadratic growth property is satisfied if there exists α > 0 such that f(u) ≥ f(ū) + α‖u − ū‖²_U, for all u in some neighborhood of ū.

Let us now relate these notions to that of Legendre forms [10, Sections 3.3.2 and 3.4.3].


Definition 1.12 Let X be a Hilbert space. We say that Q : X → R is a Legendre form if it is a weakly lower semicontinuous (w.l.s.c.) quadratic form over X such that, if y^k → y weakly in X and Q(y^k) → Q(y), then y^k → y strongly.

Set w^k := y^k − y. Using

Q(y^k) = Q(y) + DQ(y)w^k + Q(w^k),

and since DQ(y)w^k → 0 as w^k → 0 weakly, we have that Q is a Legendre form iff, for any sequence w^k weakly converging to 0: Q(w^k) → 0 iff w^k → 0 strongly.

The following examples apply easily to the quadratic costs for optimal control problems:

Example 1.13 Let Q be a quadratic form over a Hilbert space X.

(i) Let Q(y) = ‖y‖² be the square of the norm. Then obviously Q(w^k) → 0 iff w^k → 0 strongly. Therefore Q is a Legendre form.

(ii) Assume that Q is nonnegative, and that y ↦ √Q(y) is a norm equivalent to the one of X. Then (the weak topology being invariant under an equivalent norm) Q is a Legendre form.

(iii) Assume that Q(y) = Q_1(y) + Q_2(y), where Q_1 is a Legendre form and Q_2 is weakly continuous. Then Q is a Legendre form.

The notions of quadratic growth and Legendre form are related in the following way:

Lemma 1.14 Let Q : X → R be a Legendre form, and C a closed convex cone of X. Then the two statements below are equivalent:

Q(h) > 0, for all h ∈ C \ {0};   (1.1.28)

∃ α > 0: Q(h) ≥ α‖h‖², for all h ∈ C.   (1.1.29)

Lemma 1.15 The functional f is w.l.s.c. over U iff R_t ⪰ 0 a.e., and D²f is a Legendre form iff there exists α > 0 such that R_t ⪰ αId a.e.

Proof. (i) We can decompose f as f = f_1 + f_2, where f_1 is the part that does not depend on the state (obtained by setting C_t and D_t to 0) and f_2 = f − f_1. It is easily checked that f_2 is weakly continuous. Therefore f is w.l.s.c. iff f_1 is w.l.s.c.

(ii) If R_t ⪰ 0 a.e., then f_1, being convex and continuous, is w.l.s.c. If not, it is easily shown that there exist β > 0 and a measurable set I ⊂ (s, T) of positive measure such that

h·R_t h ≤ −β|h|², for all h ∈ R^m, a.e. t ∈ I.   (1.1.30)

Let U_I be the subset of U of functions that are zero a.e. outside I. Since U_I is infinite dimensional, there is an orthonormal sequence u^k in U_I. We have that u^k → 0 weakly in U, whereas

lim sup_k f(u^k) = lim sup_k f_1(u^k) ≤ −β < 0 = f(0).   (1.1.31)

Therefore f is w.l.s.c. iff R_t ⪰ 0 a.e.

(iii) If R_t ⪰ αId a.e., then √f_1 defines a norm equivalent to the one of U, and since f_2 is weakly continuous, D²f is a Legendre form (see case (iii) of example 1.13).


Otherwise, there exists an orthonormal sequence u^k such that a := lim sup_k f_1(u^k) ≤ 0. Since u^k → 0 weakly, either a < 0, contradicting the weak l.s.c. of f_1, or a = 0, so that f_1(u^k) → f_1(0), but u^k does not converge strongly to 0, contradicting the definition of a Legendre form. ∎

1.1.6 Spectral analysis

In this section, for simplicity, we assume that all matrices in the definition of the quadratic problem are constant over time, and that R is positive definite. We can make the change of variable on R^m,

v = Lu, where L^T L = R,

so that |v|² = u·Ru.

The corresponding change of variables on U has the effect of reducing R to the identity. So in the sequel we assume that R is the identity matrix. Also for simplicity we assume that D = 0. So we may write f = f_1 + f_2, with

f_1(u) = (1/2) ∫_s^T |u_t|² dt = (1/2)‖u‖²   (1.1.32)

and

f_2(u) = (1/2) ∫_s^T y_t·C y_t dt + (1/2) y_T·M y_T.   (1.1.33)

Let H_s denote the Hessian of f_2, and Q_s the associated quadratic form. If X, Y are Banach spaces, an operator A ∈ L(X, Y) is said to be compact if the image of B_X (the unit ball) by A has a compact closure. The following lemma is classical (see e.g. Dunford and Schwartz [13]).

Lemma 1.16 The operator H_s is self-adjoint and compact. Consequently, there is an orthonormal basis of U_s composed of eigenvectors of H_s.

Proof. The first statement is a consequence of the compactness of the mapping U_s → Y_s, v ↦ z, where z is the unique solution of the linearized equation

ż = Az + Bv;  z(s) = 0.   (1.1.34)

The second statement comes from the well-known theory of compact operators; see e.g. Balakrishnan [2, Section 3.3]. ∎

Lemma 1.17 We have that

lim_{s↑T} sup_{v≠0} H_s(v, v)/‖v‖²_{U_s} = 0.   (1.1.35)

Proof. The conclusion follows easily from the inequalities below, which are consequences of Gronwall's lemma and the Cauchy-Schwarz inequality:

‖z‖_∞ ≤ C ∫_s^T |v_t| dt ≤ C √(T − s) ‖v‖_{U_s}.  ∎

For s close to T, the above lemma implies that the Hessian of f, i.e., Id + H_s, is uniformly positive; hence f is strongly convex, and has a unique critical point, which is a minimum point. Therefore the first conjugate point τ0 is the first time s, starting backwards from T, for which H_s has an eigenvalue equal to −1.
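This eigenvalue criterion can be illustrated on a discretization of an assumed scalar instance: ẏ = u, y(0) = 0, quadratic form ∫_0^ℓ (|u|² − |y|²) dt, whose first conjugate point is at horizon ℓ = π/2. Discretizing with an Euler state map K, the form Id + H_s becomes the matrix I − KᵀK, and its positive definiteness can be tested by an attempted Cholesky factorization (a sketch, not from the notes):

```python
def hessian_matrix(ell, n):
    """Matrix of the discretized quadratic form u -> ∫_0^ell (|u|^2 - |y|^2) dt,
    with y' = u, y(0) = 0: I - K^T K, where (K u)_i = h * sum_{j<=i} u_j."""
    h = ell / n
    # (K^T K)[i][j] = h^2 * (n - max(i, j)) for the lower-triangular K above
    return [[(1.0 if i == j else 0.0) - h*h*(n - max(i, j))
             for j in range(n)] for i in range(n)]

def is_positive_definite(A):
    """Attempt a Cholesky factorization; return False on a nonpositive pivot."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                L[i][j] = s ** 0.5
            else:
                L[i][j] = s / L[j][j]
    return True

# Below the conjugate point pi/2 ≈ 1.5708 the discretized Id + H_s is
# positive definite; above it, a nonpositive pivot appears.
pd_short = is_positive_definite(hessian_matrix(1.2, 150))
pd_long = is_positive_definite(hessian_matrix(2.0, 150))
```

The crossing of the pivot sign as ℓ grows through π/2 mirrors an eigenvalue of H_s reaching −1.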

1.2 Polyhedric constraints

1.2.1 Overview

Here we study problems of the form

Min f(x);  x ∈ K,   (P)

with K a closed convex subset of the Hilbert space X, and f : X → R of class C². The essential hypothesis is that the set K is polyhedric (definition 1.22). It allows a rather complete theory of second-order optimality conditions and sensitivity analysis.

Although the cost function is not necessarily quadratic, the application we have in view is linear quadratic optimal control problems with bound constraints on the control variable. Dealing with nonquadratic cost functions has its own interest, since it suggests how to deal with nonquadratic optimal control problems (where, as we will see, two norms are to be used for the control space).

1.2.2 Second-order necessary optimality conditions

In the statements below, X is a Hilbert space and f : X → R is of class C². Define the (abstract) critical cone as

C(x) := {h ∈ T_K(x); Df(x)h ≤ 0}.

A second-order necessary optimality condition is as follows.

Proposition 1.18 Let x̄ be a local solution of (P). Then x̄ satisfies the first-order necessary optimality condition

Df(x̄)h = 0, for all h ∈ C(x̄).   (1.2.37)

In addition,

D²f(x̄)(h, h) ≥ 0, for all h ∈ R_K(x̄) ∩ Df(x̄)^⊥.   (1.2.38)

Proof. Relation (1.2.37) follows from the well-known first-order optimality condition

Df(x̄)(x − x̄) ≥ 0, for all x ∈ K,   (1.2.39)

and the definition of the critical cone. If in addition h ∈ R_K(x̄) ∩ Df(x̄)^⊥, then x̄ + th ∈ K for t > 0 small enough, and hence

0 ≤ lim_{t↓0} [f(x̄ + th) − f(x̄)] / ((1/2)t²) = D²f(x̄)(h, h).  ∎


Remark 1.19 The conclusion holds even if K is nonconvex.

We now introduce a second-order sufficient optimality condition.

Proposition 1.20 Let x̄ ∈ K satisfy the first-order necessary optimality condition (1.2.37). Assume that D²f(x̄) is a Legendre form, and that

D²f(x̄)(h, h) > 0, for all h ∈ C(x̄), h ≠ 0.   (1.2.40)

Then x̄ is a local solution of (P) that satisfies the quadratic growth condition.

Proof. If the conclusion does not hold, then there exists a sequence x^k → x̄, with x^k ≠ x̄ for all k, such that

f(x^k) ≤ f(x̄) + o(‖x^k − x̄‖²).   (1.2.41)

Denote t_k := ‖x^k − x̄‖ and h^k := t_k^{-1}(x^k − x̄). Then x^k = x̄ + t_k h^k, and hence

f(x^k) = f(x̄) + t_k Df(x̄)h^k + (1/2) t_k² D²f(x̄)(h^k, h^k) + o(t_k²).   (1.2.42)

Combining with (1.2.41), we get

Df(x̄)h^k + (1/2) t_k D²f(x̄)(h^k, h^k) ≤ o(t_k).   (1.2.43)

Extracting if necessary a subsequence, we may assume that h^k weakly converges to some h̄, and so Df(x̄)h^k converges to Df(x̄)h̄, so that with (1.2.43), Df(x̄)h̄ ≤ 0. On the other hand, h̄ ∈ T_K(x̄) (since a closed convex set is weakly closed), and hence h̄ is a critical direction.

By the first-order optimality condition Df(x̄)h^k ≥ 0, so that with (1.2.43),

D²f(x̄)(h^k, h^k) ≤ o(1),

and passing to the limit, D²f(x̄)(h̄, h̄) ≤ 0. Condition (1.2.40) then implies

D²f(x̄)(h̄, h̄) = 0,   (1.2.44)

and so D²f(x̄)(h̄, h̄) = lim_k D²f(x̄)(h^k, h^k). Since D²f(x̄) is a Legendre form, this implies the strong convergence of h^k towards h̄, and so ‖h̄‖ = 1. Then (1.2.44) gives a contradiction with (1.2.40). ∎

1.2.3 Polyhedric sets

It seems that there is an important gap between the previous necessary and sufficient second-order conditions, since they involve directions in the sets R_K(x̄) ∩ Df(x̄)^⊥ and C(x̄), respectively. These two sets may be quite far from each other, as the next example shows.

Example 1.21 Take X = R², K the closed unit ball, and f(x) = x_2. At the minimum point x̄ = (0, −1)^T, we have R_K(x̄) ∩ Df(x̄)^⊥ = {0}, whereas C(x̄) = R × {0}.

That said, these two sets coincide in some important cases. Note that the first-order optimality condition may be written as

−Df(x̄) ∈ N_K(x̄).

Definition 1.22 Let x ∈ K and q ∈ N_K(x). We say that K is polyhedric at x w.r.t. the normal direction q if

T_K(x) ∩ q^⊥ = cl(R_K(x) ∩ q^⊥).   (1.2.46)

If that property holds for all x ∈ K and q ∈ N_K(x), we say that K is polyhedric.

We will check that this applies to the case of bound constraints on the control; see section 1.2.6.

Proposition 1.23 Assume that K is polyhedric and that x̄ ∈ K is such that D²f(x̄) is a Legendre form. Then x̄ is a local minimum of (P) satisfying the quadratic growth condition iff it satisfies (1.2.37) and (1.2.40).

Proof. By proposition 1.20, (1.2.37) and (1.2.40) imply local optimality with quadratic growth. Conversely, assume that the quadratic growth condition holds. Then x̄ satisfies the first-order condition (1.2.37), and is, for α > 0 small enough, a local minimum of the problem

Min f(x) − (1/2)α‖x − x̄‖²;  x ∈ K.

Proposition 1.18 therefore implies the relation

D²f(x̄)(h, h) − α‖h‖² ≥ 0, for all h ∈ R_K(x̄) ∩ Df(x̄)^⊥,

which, by polyhedricity and continuity of the quadratic form, extends to all h ∈ C(x̄), implying (1.2.40). ∎

1.2.4 Stability of solutions

Consider now a family of optimization problems of the form

Min f(x, u);  x ∈ K,   (P_u)

with X a Hilbert space and U a Banach space, K a nonempty, closed and convex subset of X, and f : X × U → R of class C². We assume that D²_xx f(x̄, ū) is a Legendre form, and that x̄ is a local solution of (P_ū) satisfying the second-order sufficient condition

D_x f(x̄, ū)h = 0 and D²_xx f(x̄, ū)(h, h) > 0, for all h ∈ C(x̄, ū), h ≠ 0,   (1.2.47)

where C(x̄, ū) denotes the critical cone

C(x̄, ū) := {h ∈ T_K(x̄); D_x f(x̄, ū)h ≤ 0}.   (1.2.48)

By proposition 1.20 the quadratic growth condition is satisfied. More precisely, define the local problem (around x̄)

Min f(x, u);  x ∈ K, ‖x − x̄‖ ≤ θ,   (P_{u,θ})

with θ > 0. Then, for θ > 0 small enough (we assume that this holds in the sequel), x̄ is the unique solution of (P_{ū,θ}), and there exists α > 0 such that

f(x, ū) ≥ f(x̄, ū) + α‖x − x̄‖², for all x ∈ K with ‖x − x̄‖ ≤ θ.   (1.2.49)

Let us show the stability of the local solution of (P_u) w.r.t. a perturbation.

Proposition 1.24 Assume that f is w.l.s.c., that D²_xx f(x̄, ū) is a Legendre form, that the second-order condition (1.2.47) is satisfied, and let θ > 0 be such that (1.2.49) holds. Then, for all u ∈ U, the local problem (P_{u,θ}) has at least one solution, and if x_u ∈ S(P_{u,θ}), we have

‖x_u − x̄‖ = O(‖u − ū‖).   (1.2.50)

Proof. A minimizing sequence of problem (P_{u,θ}) is bounded. Since X is a Hilbert space, it has a limit-point x_u for the weak topology. The set K is weakly closed and f is w.l.s.c.; therefore x_u ∈ S(P_{u,θ}). Combining the relations

f(x_u, ū) = f(x_u, u) + ∫_0^1 D_u f(x_u, u + σ(ū − u))(ū − u) dσ,
f(x̄, ū) = f(x̄, u) + ∫_0^1 D_u f(x̄, u + σ(ū − u))(ū − u) dσ,

with the quadratic growth condition (1.2.49), we get

α‖x_u − x̄‖² ≤ f(x_u, ū) − f(x̄, ū)
  ≤ f(x_u, ū) − f(x_u, u) + f(x̄, u) − f(x̄, ū)
  = ∫_0^1 [D_u f(x_u, u + σ(ū − u)) − D_u f(x̄, u + σ(ū − u))](ū − u) dσ
  = O(‖x_u − x̄‖ ‖u − ū‖),

implying (1.2.50). ∎

1.2.5 Sensitivity analysis

Let t ↦ u(t) be a mapping R_+ → U and let d ∈ U be such that

u(t) = ū + td + r(t);  ‖r(t)‖ = o(t).   (1.2.51)

Set v(t) := val(P_{u(t),θ}), where θ > 0 is such that (1.2.49) is satisfied. Define the subproblem

Min_{h ∈ C(x̄)} D²f(x̄, ū)((h, d), (h, d)).   (SP)

Theorem 1.25 Assume that K is polyhedric, that f is weakly l.s.c., that D²_xx f(x̄, ū) is a Legendre form, and that the second-order condition (1.2.47) is satisfied. Then the value function may be expanded as follows:

v(t) = v(0) + D_u f(x̄, ū)(u(t) − ū) + (1/2) t² val(SP) + o(t²).   (1.2.52)

In addition, any weak limit-point h̄ of (x_t − x̄)/t is a strong limit-point, and satisfies h̄ ∈ S(SP). If (SP) has the unique solution h̄, then the following expansion of solutions holds:

x_t = x̄ + t h̄ + o(t).   (1.2.53)

Proof. a) Upper estimate. Let ε > 0. Since K is polyhedric, there exists h ∈ R_K(x̄) ∩ Df(x̄)^⊥ such that

D²f(x̄, ū)((h, d), (h, d)) ≤ val(SP) + ε.

The following holds:

f(x̄ + th, u(t)) = f(x̄, ū) + D_u f(x̄, ū)(u(t) − ū) + (1/2) t² D²f(x̄, ū)((h, d), (h, d)) + o(t²).   (1.2.54)

Since x̄ + th ∈ K for t > 0 small enough, we have

v(t) ≤ f(x̄ + th, u(t)) ≤ f(x̄, ū) + D_u f(x̄, ū)(u(t) − ū) + (1/2) t² (val(SP) + ε) + o(t²).   (1.2.55)

This being true for any ε > 0, we obtain

v(t) ≤ f(x̄, ū) + D_u f(x̄, ū)(u(t) − ū) + (1/2) t² val(SP) + o(t²).   (1.2.56)

b) Lower estimate. Let x_t ∈ S(P_{u(t),θ}). By proposition 1.24, we know that

‖x_t − x̄‖ = O(‖u(t) − ū‖) = O(t),

and h_t := (x_t − x̄)/t is therefore bounded. Let h̄ be a weak limit-point of h_t. We have

f(x_t, u(t)) = f(x̄ + t h_t, u(t)) = f(x̄, ū) + Df(x̄, ū)(x_t − x̄, u(t) − ū) + (1/2) t² D²f(x̄, ū)((h_t, d), (h_t, d)) + o(t²).

Comparing with (1.2.56), we obtain after division by (1/2)t²:

2 t^{-1} D_x f(x̄, ū)h_t + D²f(x̄, ū)((h_t, d), (h_t, d)) ≤ val(SP) + o(1).   (1.2.57)

This implies D_x f(x̄, ū)h_t ≤ o(t), and hence D_x f(x̄, ū)h̄ ≤ 0. Since h_t ∈ R_K(x̄), we have h̄ ∈ T_K(x̄); therefore h̄ is a critical direction. On the other hand, h_t ∈ R_K(x̄) combined with the first-order necessary condition implies D_x f(x̄, ū)h_t ≥ 0. Using the weak l.s.c. of D²f(x̄, ū), we get with (1.2.57)

D²f(x̄, ū)((h̄, d), (h̄, d)) ≤ lim inf_{t↓0} D²f(x̄, ū)((h_t, d), (h_t, d)) ≤ val(SP).

As h̄ ∈ C(x̄), this implies h̄ ∈ S(SP) and hence

D²f(x̄, ū)((h_t, d), (h_t, d)) → D²f(x̄, ū)((h̄, d), (h̄, d)).

Since h̄ is a weak limit-point of h_t, this implies D²_xx f(x̄, ū)(h_t, h_t) → D²_xx f(x̄, ū)(h̄, h̄). Since D²_xx f(x̄, ū) is a Legendre form, we deduce that h̄ is a limit-point of h_t for the strong convergence. In particular, if (SP) has a unique solution, then h_t → h̄, implying (1.2.53). ∎


1.2.6 Bound constraints in spaces of summable square

In this section we apply the above results to the case when Ω is an open subset of R^n, X := L²(Ω) is the Hilbert space of square summable functions over Ω, and K := L²(Ω)_+ is the set of a.e. nonnegative functions of X. We recall the following result, due to Lebesgue.

Theorem 1.26 (Dominated convergence) Let x_n be a sequence of elements of L²(Ω). Suppose that there exists g ∈ L²(Ω) such that |x_n(ω)| ≤ g(ω) a.e., and that, for almost all ω, x_n(ω) converges. Set x(ω) := lim_n x_n(ω). Then x ∈ L²(Ω), and x_n → x in L²(Ω).

Given x ∈ L²(Ω), denote by

I(x) := {ω ∈ Ω; x(ω) = 0};  J(x) := {ω ∈ Ω; x(ω) > 0}

the contact set and its complement, defined up to a null measure set. The lemma below states the essential properties for the sequel.

Lemma 1.27 (i) The cone K is a closed subset of L²(Ω). (ii) Its dual cone is K^− = L²(Ω)_−, the set of functions of X that are nonpositive a.e. (iii) Let x ∈ K. Then

T_K(x) = {h ∈ X; h ≥ 0 a.e. on I(x)},   (1.2.58)
N_K(x) = {h ∈ L²(Ω)_−; h = 0 a.e. on J(x)}.   (1.2.59)

In addition, let q ∈ N_K(x). Then

T_K(x) ∩ q^⊥ = {h ∈ X; h ≥ 0 a.e. on I(x), h(ω)q(ω) = 0 a.e.}.   (1.2.60)

(iv) The positive cone of L²(Ω) is polyhedric.

Proof. (i) Let xn → ¯x in L2(Ω), with xn nonnegative a.e. The function yn(ω) := min(0, xn(ω)) is identically zero, and converges in L2(Ω) towards min(0, ¯x) in view of the dominated convergence theorem. Therefore min(0, ¯x) = 0 in L2(Ω), so that ¯x ≥ 0 a.e., as was to be shown.

(ii) If y ∈ L2(Ω)−, then clearly ∫Ω y(ω)x(ω)dω ≤ 0 for all x ∈ K, and hence L2(Ω)− ⊂ K−. Conversely, let y ∈ K−, and let x ∈ L2(Ω) be defined by x(ω) := max(0, y(ω)) a.e.; then x ∈ K, and hence 0 ≥ ∫Ω y(ω)x(ω)dω = ∫Ω (max(0, y(ω)))2 dω. Therefore y(ω) ≤ 0 a.e., proving (ii).

(iii) The expression of the normal cone is a direct consequence of the formula for normal cones when the set K is a cone, see e.g. [10, Example 2.62]:

NK(x) = K− ∩ x⊥. (1.2.61)

The expression of the tangent cone follows, using the relation TK(x) = NK(x)−, and the latter implies (1.2.60).

(iv) Let h ∈ TK(x) ∩ q⊥. Set, for ε > 0, hε := ((x + εh)+ − x)/ε. Then x + εhε = (x + εh)+ ∈ K, and hence hε ∈ RK(x). By the dominated convergence theorem, hε → h in L2(Ω). Relation (1.2.60) implies that hε(ω)q(ω) is zero for almost all ω, and hence hε ∈ q⊥. We have shown that K is polyhedric. □
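The construction used in the proof of (iv) is easy to check numerically on a discretized Ω; the grid, the quadrature weight, and the sample functions below are illustrative choices of ours, not objects from the notes.

```python
import numpy as np

# Discretize Omega = (0,1) into N cells; the L2 norm is a weighted Euclidean norm.
N, w = 1000, 1.0 / 1000
rng = np.random.default_rng(0)

x = np.maximum(0.0, rng.standard_normal(N))     # x in K = L2(Omega)_+
# h in T_K(x): h >= 0 on the contact set I(x) = {x = 0}, arbitrary elsewhere
h = np.where(x == 0.0, np.abs(rng.standard_normal(N)), rng.standard_normal(N))

# h_eps := ((x + eps h)_+ - x)/eps is a feasible direction, and h_eps -> h in L2
for eps in [1e-1, 1e-2, 1e-3]:
    h_eps = (np.maximum(0.0, x + eps * h) - x) / eps
    assert np.all(x + eps * h_eps >= -1e-12)    # x + eps*h_eps stays in K
err = np.sqrt(w * np.sum((h_eps - h) ** 2))     # discrete L2 distance to h
assert err < 0.2
```

The two differ only on the small set where x is positive but smaller than ε|h|, which is what the dominated convergence argument exploits.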

For the problem

Min x∈L2(Ω)+ f(x),

with f : L2(Ω) → R of class C2, the second-order sufficient optimality condition (1.2.40) writes, taking into account the previous lemma, when D2f(¯x) is a Legendre form:

Df(¯x)(ω) ≥ 0 and Df(¯x)(ω)¯x(ω) = 0, a.e.;
D2f(¯x)(h, h) > 0, for all nonzero h such that h ≥ 0 a.e. on I(¯x) and Df(¯x)(ω)h(ω) = 0 a.e. (1.2.62)
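A minimal discretized sketch (f and the data a are our illustrative choices, not taken from the notes): for f(x) = ‖x − a‖²/2 over the nonnegative cone, the minimizer is the pointwise positive part of a, and the first line of (1.2.62) can be checked directly.

```python
import numpy as np

# Minimize f(x) = ||x - a||^2/2 over the (discretized) cone K = L2(Omega)_+.
# The solution is xbar = max(0, a), and Df(xbar) = xbar - a.
rng = np.random.default_rng(1)
a = rng.standard_normal(500)
xbar = np.maximum(0.0, a)
grad = xbar - a
assert np.all(grad >= -1e-15)                 # Df(xbar)(w) >= 0 a.e.
assert np.max(np.abs(grad * xbar)) < 1e-15    # complementarity Df(xbar) xbar = 0 a.e.
# D2f is the identity, a positive definite Legendre form, so (1.2.62) holds.
```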

1.3

Convex constraints on control variables

1.3.1

Framework

In this section we assume that the state equation is linear and that the cost function is quadratic, given by (1.1.2) and (1.1.3) respectively. The problem is

Min u f(u); u ∈ K. (P)

The novelty is that we now have control constraints of the form u ∈ K, where

K := {u ∈ U; g(u(t)) ≤ 0, a.e. t ∈ (0, T)}. (1.3.63)

The function g : Rm → Rng, with convex components, is assumed to be of class C2. For simplicity we assume that

g(0) = 0. (1.3.64)

1.3.2

First-order necessary optimality conditions

Let ¯u be a local solution of the problem Min u f(u); u ∈ K. Since K is convex, a first-order necessary optimality condition is

Df(¯u)(u − ¯u) ≥ 0, for all u ∈ K, (1.3.65)

or equivalently

Df(¯u) + NK(¯u) ∋ 0. (1.3.66)

We can prove the following smoothness result for the optimal control (for which no qualification condition is needed). We denote by ¯y, ¯p the state and costate associated with a solution or critical point ¯u.


Lemma 1.28 Assume that Rt is uniformly positive:

∃ α > 0; u · Rtu ≥ α|u|2, for all u ∈ Rm and almost all t ∈ (0, T). (1.3.67)

Then any solution of the first-order necessary optimality conditions is essentially bounded.

Proof. Let ¯u be such a solution. Combining proposition 1.2 and (1.3.65), we obtain that the following holds:

(Bt⊤¯pt + Rt¯ut + Dt¯yt) · (v − ¯ut) ≥ 0, for all v ∈ g−1(]−∞, 0]ng), a.e. t ∈ [0, T]. (1.3.68)

In view of (1.3.64), we may take v = 0. Using (1.3.67), the Cauchy-Schwarz inequality, and the fact that Bt, Dt, ¯p, ¯y are essentially bounded, we obtain

α|¯ut|2 ≤ ¯ut · Rt¯ut ≤ −(Bt⊤¯pt + Dt¯yt) · ¯ut ≤ c|¯ut|, a.e. t ∈ [0, T], (1.3.69)

for some constant c. Dividing by |¯ut|, we get |¯ut| ≤ c/α for a.a. t. □

Again without any qualification condition, we can show the local (pointwise in time) nature of the tangent and normal cones to K. Denote

Kg := g−1(Rng−).

Lemma 1.29 Let u ∈ K. Then

TK(u) = {v ∈ U; vt ∈ TKg(ut) for almost all t ∈ (0, T)}, (1.3.70)
NK(u) = {µ ∈ U; µt ∈ NKg(ut) for almost all t ∈ (0, T)}. (1.3.71)

Proof. Denote by PK the orthogonal projection onto K (well defined since K is a closed convex subset of the Hilbert space U). We have that v ∈ TK(u) iff, for ε > 0, vε := ε−1(PK(u + εv) − u) is such that vε → v in U when ε ↓ 0. Obviously

vεt = ε−1(PKg(ut + εvt) − ut), a.e. t ∈ (0, T). (1.3.72)

Since PKg is nonexpansive and PKg(ut) = ut, we have |vεt| ≤ |vt| a.e.; therefore the dominated convergence theorem implies that vε → v in U when ε ↓ 0 iff vεt → vt a.e. The latter holds iff vt ∈ TKg(ut) a.e.; relation (1.3.70) follows, and (1.3.71) is an easy consequence of (1.3.70). □

We need however a qualification condition in order to relate the Lagrange multipliers to g(u) and Dg(u). So let us assume that

∃ β > 0 and u0 ∈ Rm; g(u0) < −β. (1.3.73)

In that case it is well known that, for all u ∈ Rm:

TKg(u) = {v ∈ Rm; Dgi(u)v ≤ 0 for all i such that gi(u) = 0}, (1.3.74)
NKg(u) = {Σi=1,...,ng λiDgi(u); λ ∈ Rng+; λi = 0 for all i such that gi(u) < 0}. (1.3.75)

Denote the set of active constraints at a point u ∈ U (defined up to a null measure set) by

It(u) := {1 ≤ i ≤ ng; gi(ut) = 0}, a.e. t ∈ (0, T). (1.3.76)
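Lemma 1.29 and the bound of lemma 1.28 are easy to visualize on a time grid. The sketch below uses a scalar control with Kg = [−1, 1] (i.e. g(w) = (w − 1, −w − 1)) and data of our choosing: the projection onto K acts pointwise by clipping, and the pointwise variational inequality (1.3.68) is solved by clipping the unconstrained stationary point.

```python
import numpy as np

# Scalar control, K_g = [-1, 1], grid of 200 times; R_t = 2, so alpha = 2 in (1.3.67).
lo, hi, R = -1.0, 1.0, 2.0
q = 4.0 * np.sin(np.linspace(0.0, 10.0, 200))   # q_t := B_t' p_t + D_t y_t (bounded)

# The projection onto K acts pointwise (lemma 1.29) and is nonexpansive:
u1, u2 = 3.0 * np.cos(np.linspace(0, 7, 200)), np.zeros(200)
assert np.all(np.abs(np.clip(u1, lo, hi) - np.clip(u2, lo, hi)) <= np.abs(u1 - u2) + 1e-15)

# The pointwise VI (q_t + R u_t)(v - u_t) >= 0 for all v in K_g is solved by clipping:
u = np.clip(-q / R, lo, hi)
for v in np.linspace(lo, hi, 9):
    assert np.all((q + R * u) * (v - u) >= -1e-9)
# essential bound of lemma 1.28: |u_t| <= |q_t| / alpha
assert np.all(np.abs(u) <= np.abs(q) / R + 1e-12)
```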


Lemma 1.30 Let u ∈ U, and assume that the qualification condition (1.3.73) holds. Then

TK(u) = {v ∈ U; Dgi(ut)vt ≤ 0, for a.a. t ∈ (0, T) and all i ∈ It(u)}, (1.3.77)
NK(u) = {µ ∈ U; µt = Σi=1,...,ng λi,tDgi(ut), with λt ∈ Rng+ and λi,t = 0 for all i such that gi(ut) < 0, a.e. t ∈ (0, T)}. (1.3.78)

In addition, if λ satisfies (1.3.78), then

Σi |λi,t| ≤ β−1|µt| |u0 − ut|. (1.3.79)

Proof. Relations (1.3.77) and (1.3.78) are immediate consequences of the above relations. If λ satisfies (1.3.78) then, since each gi is convex, we have a.e., for all i ∈ It(u):

−β ≥ gi(u0) ≥ gi(ut) + Dgi(ut)(u0 − ut) = Dgi(ut)(u0 − ut). (1.3.80)

Multiplying by λi,t and summing over i (the contribution of non active constraints being zero), we get

−β Σi λi,t ≥ µt · (u0 − ut) ≥ −|µt| |u0 − ut|, (1.3.81)

from which (1.3.79) follows, since λi,t ≥ 0. □

Remark 1.31 Note that the above λ is not necessarily measurable. A measurable λ can be constructed as follows. Given J ⊂ {1, . . . , ng}, denote the (measurable) set of times for which the set of active constraints is J (defined up to a null measure set) by

TJ := {t ∈ (0, T); It(u) = J}. (1.3.82)

Next, denote by ϕJ(ηt, γt) the solution of the following problem:

Min λ∈Rng+ |λ|; ηt = Σi∈J λiγi; λi = 0, i ∉ J. (1.3.83)

When t ∈ TJ, ηt = µt and γt = Dg(ut), this problem has a unique solution which (in view of the qualification condition) depends continuously on (ηt, γt); observe also that if η = 0, the solution is λ = 0. Now the minimum-norm λ can be expressed as

λt := ΣJ⊂{1,...,ng} ϕJ(1t∈TJ µt, Dg(ut)). (1.3.84)

Being a sum of continuous functions of measurable mappings, it is a measurable function.

Denote the set of Lagrange multipliers by

Λ(u) := {λ ∈ L2(0, T; Rng); λt ∈ NRng−(g(ut)) a.e. (i.e., λt ≥ 0 and λt · g(ut) = 0 a.e.); Df(u)t + Σi=1,...,ng λi,tDgi(ut) = 0, a.e. t ∈ (0, T)}. (1.3.85)


Lemma 1.32 The point ¯u satisfies the first-order necessary optimality conditions iff Λ(¯u) is nonempty. If in addition Rt is uniformly positive, then Λ(¯u) is a bounded and weakly∗ closed subset of L∞(0, T; Rng).

Proof. The expression of the set of Lagrange multipliers is a consequence of the expressions of the normal cone to K given before.

The L∞ boundedness follows from lemma 1.28 and (1.3.79). It remains to show that Λ(¯u) is weakly∗ closed. Since a half-space of the form

Hψ,β := {γ ∈ L∞(0, T; Rng); ∫₀ᵀ γt · ψt dt ≤ β} (1.3.86)

is weakly∗ closed whenever ψ ∈ L1(0, T; Rng), it suffices to show that Λ(¯u) is an intersection of such half-spaces. Obviously Df(¯u)t + Σi=1,...,ng λi,tDgi(¯ut) = 0 a.e. iff

∫₀ᵀ (Df(¯u)t + Σi=1,...,ng λi,tDgi(¯ut)) · ψt dt = 0, for all ψ ∈ L1(0, T; Rm). (1.3.87)

That λ ≥ 0 holds iff ∫₀ᵀ λt · ψt dt ≥ 0, for all ψ ∈ L1(0, T; Rng)+. Finally, the complementarity condition can be written as ∫₀ᵀ λi,t gi(¯ut) dt = 0, i = 1, . . . , ng. □

1.3.3

Second-order necessary optimality conditions

The essential ingredient here is to build paths that are "second-order feasible". The set of "strongly active" constraints is defined as

It+(u) := {1 ≤ i ≤ ng; λi,t > 0, for all λ ∈ Λ(u)}. (1.3.88)

The critical cone is as follows:

C(u) := {v ∈ TK(u); Dgi(ut)vt = 0, i ∈ It+(u), a.a. t}. (1.3.89)

Let, for ε > 0, the ε-"almost active" constraints be defined by

Itε(u) := {1 ≤ i ≤ ng; −ε ≤ gi(ut) < 0}. (1.3.90)

Denote by Cε(¯u) the cone of pseudo-feasible and essentially bounded critical directions, in the following sense:

Cε(¯u) := {v ∈ C(¯u); ‖v‖∞ ≤ 1/ε; vt = 0 if Itε(¯u) ≠ ∅, for a.a. t}. (1.3.91)

Lemma 1.33 The set ∪ε>0 Cε(¯u) is a dense subset of C(¯u).

Proof. Let v be a critical direction. Let v1,ε be the truncation

vt1,ε := max(−1/ε, min(1/ε, vt)), for all t ∈ (0, T), (1.3.92)

and let vε be defined by

vtε := 0 if Itε(¯u) ≠ ∅; vtε := vt1,ε otherwise. (1.3.93)

Obviously vε ∈ Cε(¯u). Since the set of times t for which Itε(¯u) ≠ ∅ for every ε > 0 has zero measure, we have that vε → v a.e. when ε ↓ 0. Since |vtε| ≤ |vt| a.e., the dominated convergence theorem implies that vε → v in U. The result follows. □
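The truncation (1.3.92)-(1.3.93) is easy to visualize numerically. In the sketch below, the critical direction v and the constraint values along ¯u are illustrative choices of ours (a single constraint, almost active near t = 0).

```python
import numpy as np

# Density argument of lemma 1.33: truncate a critical direction and zero it on
# the set of times with eps-almost active constraints.
N = 4000
t = np.linspace(0.0, 1.0, N, endpoint=False)
v = 1.0 / np.sqrt(t + 1e-3)        # in L2 but not essentially bounded
gbar = -t                          # g(ubar_t): almost active near t = 0

l2 = lambda z: np.sqrt(np.mean(z ** 2))
errs = []
for eps in [0.3, 0.1, 0.03, 0.01]:
    v1 = np.minimum(v, 1.0 / eps)              # truncation (1.3.92) (here v >= 0)
    veps = np.where(gbar >= -eps, 0.0, v1)     # (1.3.93): zero where -eps <= g < 0
    assert np.max(np.abs(veps)) <= 1.0 / eps + 1e-12
    errs.append(l2(veps - v))
assert all(a > b for a, b in zip(errs, errs[1:]))   # L2 error decreases with eps
```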

Let us see now how, for given v ∈ Cε(¯u), to build a "second-order feasible" path (this corresponds to the "primal form" of the second-order necessary conditions).

Lemma 1.34 Given ε > 0 and v ∈ Cε(¯u), let w ∈ L∞(0, T; Rm) be such that

Dgi(¯ut)wt + D2gi(¯ut)(vt, vt) ≤ −ε, i ∈ It(¯u) ∪ Itε(¯u), a.a. t. (1.3.94)

Then for θ > 0 small enough, the path uθ defined below is contained in K:

uθ := ¯u + θv + ½θ2w. (1.3.95)

Proof. This is an immediate consequence of a second-order expansion of g(utθ), combined with the definitions of It(¯u) and Itε(¯u). □

Define the set of "ε-augmented Lagrange multipliers" as

Λε(u) := {λ ∈ L2(0, T; Rng); λt ≥ 0 a.e.; λi,t = 0 for i ∉ It(¯u) ∪ Itε(¯u); Df(u)t + Σi∈It(¯u)∪Itε(¯u) λi,tDgi(ut) = 0, a.e.}. (1.3.96)

The qualification condition (1.3.73) implies that these sets are uniformly bounded when ε < β, and we have that Λ(u) = ∩ε>0 Λε(u).

Define the Lagrangian of problem (P) as L : U × L2(0, T; Rng) → R,

L(u, λ) := f(u) + ∫₀ᵀ λt · g(ut) dt. (1.3.97)

Theorem 1.35 Let ¯u be a local solution of (P). Then for any critical direction v, there exists a Lagrange multiplier λ ∈ Λ(¯u) such that

D2uuL(¯u, λ)(v, v) ≥ 0. (1.3.98)

Proof. a) Given ε > 0, let v ∈ Cε(¯u). Consider the subproblem

Min w∈U Df(¯u)w + D2f(¯u)(v, v); Dgi(¯ut)wt + D2gi(¯ut)(vt, vt) ≤ −ε, i ∈ It(¯u) ∪ Itε(¯u). (SPε)

We choose L2(0, T; Rng) as constraint space. By lemma 1.34, for any feasible w ∈ F(SPε) ∩ L∞(0, T; Rm), the path uθ defined in (1.3.95) is feasible. Since v is a critical direction, Df(¯u)v = 0. Using the fact that ¯u is a local minimum of (P), we get

0 ≤ lim θ↓0 (f(uθ) − f(¯u)) / (½θ2) = Df(¯u)w + D2f(¯u)(v, v). (1.3.99)

Now let w ∈ F(SPε). For γ > 0, let wγ ∈ F(SPε) ∩ L∞(0, T; Rm) be the unique solution of

Min w′∈U ∫₀ᵀ |w′t − wt|2 dt; ‖w′‖∞ ≤ 1/γ; Dgi(¯ut)w′t + D2gi(¯ut)(vt, vt) ≤ −ε, i ∈ It(¯u) ∪ Itε(¯u). (1.3.100)

Let us show that, for ε < ½β and small enough γ, this problem is feasible. Indeed, if i ∈ It(¯u) ∪ Itε(¯u), then gi(¯ut) ≥ −ε, so that

−β > gi(u0) ≥ gi(¯ut) + Dgi(¯ut)(u0 − ¯ut) ≥ −ε + Dgi(¯ut)(u0 − ¯ut). (1.3.101)

That is, setting ŵt := u0 − ¯ut, we have ŵ ∈ L∞(0, T; Rm) and

Dgi(¯ut)ŵt ≤ −β + ε ≤ −½β, for a.a. t ∈ (0, T). (1.3.102)

Since ¯u is essentially bounded, this proves that the constraints of (1.3.100) may be satisfied by some essentially bounded w such that ‖w‖∞ ≤ c(‖v‖∞2 + ε), for some c > 0 depending neither on v nor on ε. Finally, if 1/γ ≥ c(‖v‖∞2 + ε), feasibility of (1.3.100) holds.

Now wγt = wt if |wt| ≤ 1/γ, and |wγt| ≤ |wt| a.e.; it follows that wγ → w in U when γ ↓ 0. Passing to the limit in (1.3.99) (written with wγ instead of w) we obtain that

Df(¯u)w + D2f(¯u)(v, v) ≥ 0, for all w ∈ F(SPε). (1.3.103)

In other words, (SPε) has a nonnegative value.

b) The dual (in the sense of convex analysis) of (SPε) is the problem

Max λ∈Λε(¯u) D2uuL(¯u, λ)(v, v) + ε‖λ‖L1. (SDε)

The problem obtained by an additive perturbation of the constraints, i.e.,

Min w∈U Df(¯u)w + D2f(¯u)(v, v); Dgi(¯ut)wt + D2gi(¯ut)(vt, vt) ≤ −ε + ηi,t, i ∈ It(¯u) ∪ Itε(¯u), (1.3.104)

where η ∈ L2(0, T; Rng), is feasible; indeed, using ŵ satisfying (1.3.102), it suffices to take w of the form

wt = c(1 + |ηt|)ŵt, for large enough c > 0. (1.3.105)

It follows that the primal and dual values are equal. In addition, we know that the set of dual solutions is bounded and weakly∗ compact. In view of step a), we obtain that val(SDε) ≥ 0.

c) It is easily checked that Λε(¯u) is bounded in L∞(0, T; Rng). We may check that it is a weakly∗ compact subset of L∞(0, T; Rng), using arguments similar to those of the proof of lemma 1.32.

d) Let v ∈ C(¯u), and for ε > 0 let vε ∈ Cε(¯u) be such that vε → v in U. It follows that D2f(¯u)(vε, vε) → D2f(¯u)(v, v), and that D2g(¯u)(vε, vε) → D2g(¯u)(v, v) in L1(0, T; Rng). For each ε > 0 there exists λε ∈ Λε(¯u) such that D2uuL(¯u, λε)(vε, vε) + ε‖λε‖L1 ≥ 0. Given ε0 > 0, λε belongs to Λε0(¯u) when ε < ε0. Since Λε0(¯u) is weakly∗ compact, there exist a sequence εk ↓ 0 and multipliers λk ∈ Λεk(¯u) that weakly∗ converge to some ¯λ; denoting by vk the corresponding sequence extracted from vε, we have

D2uuL(¯u, λk)(vk, vk) + εk‖λk‖L1 ≥ 0.

We obtain that ¯λ ∈ Λε(¯u) for all ε > 0, hence ¯λ ∈ Λ(¯u), and

D2uuL(¯u, ¯λ)(v, v) = limk D2uuL(¯u, λk)(vk, vk) ≥ 0. (1.3.106)

□


1.4

Notes

The theory of unconstrained linear quadratic problems is classical and can be found in many textbooks. We have taken the point of view of studying the critical points, and we emphasize the role of Legendre forms in the case of minimization problems. The concept of polyhedricity is due to Haraux [16] and Mignot [20]. Our presentation in section 1.2 follows [5]. Various extensions are presented in [10]. Section 1.3 is an adaptation, to the case of the control of ODEs, of results obtained when dealing with the optimal control of a semilinear elliptic system [4].


Chapter 2

Nonlinear optimal control

2.1

Unconstrained nonlinear optimal control

2.1.1

Setting

We consider in this section unconstrained optimal control problems, with nonlinear dynamics and cost functions. For this reason we restrict the analysis to the case of essentially bounded control variables. So the function spaces for the control and state variables will be

U := L∞(0, T; Rm); Y := W1,∞(0, T; Rn). (2.1.1)

The optimal control problem is as follows:

(P) Min (u,y)∈U×Y F(u, y) := ∫₀ᵀ ℓ(u(t), y(t))dt + φ(y(T)) (2.1.2)

subject to

ẏ(t) = f(u(t), y(t)), a.e. t ∈ (0, T); y(0) = y0. (2.1.3)

The functions involved in this setting, all of class C∞, are:

• ℓ : Rm × Rn → R, the distributed cost;
• φ : Rn → R, the final cost;
• f : Rm × Rn → Rn, the dynamics (assumed to be Lipschitz).

Remark 2.1 The existence of solutions in this setting is a difficult question. A coercivity hypothesis on ℓ of the type

∃ β ∈ R, α > 0; ℓ(u, y) ≥ α|u|2 − β (2.1.4)

implies that minimizing sequences are bounded in L2(0, T; Rm), so that a subsequence converges weakly. However, we cannot pass to the limit in the state equation within the above functional framework. One has to rely on the theory of relaxed controls, see e.g. Ekeland and Temam [14]. In the sequel we assume the existence of a (locally) optimal control.


2.1.2

First-order optimality conditions

We may apply the implicit function theorem to the state equation, viewed as an equation in the space L∞(0, T; Rn). It follows that the mapping u ↦ yu (solution of the state equation) is of class C∞, U → Y. Denote the cost function, expressed as depending on the control only, by

J(u) := ∫₀ᵀ ℓ(u(t), yu(t))dt + φ(yu(T)). (2.1.5)

Then J(·) is of class C∞ over U. We next show how to compute its first derivative. We define first the Hamiltonian function H : Rm × Rn × Rn → R by

H(u, y, p) := ℓ(u, y) + p · f(u, y). (2.1.6)

Observe that the state equation may be written as

ẏ(t) = Hp(u(t), y(t), p(t)) = f(u(t), y(t)), a.e. t ∈ [0, T]; y(0) = y0. (2.1.7)

Next, the adjoint state equation is defined as

−ṗ(t) = Hy(u(t), y(t), p(t)), a.e. t ∈ [0, T]; p(T) = Dφ(y(T)). (2.1.8)

Introduce the linearized state equation

ż(t) = Df(u(t), y(t))(v(t), z(t)), a.e. t ∈ [0, T]; z(0) = 0. (2.1.9)

Then for all u and v in U, using the chain rule:

DJ(u)v = ∫₀ᵀ Dℓ(u(t), yu(t))(v(t), z(t))dt + Dφ(yu(T))z(T). (2.1.10)

Use (since z(0) = 0)

Dφ(yu(T))z(T) = p(T)z(T) = ∫₀ᵀ [ṗ(t)z(t) + p(t)ż(t)]dt
= ∫₀ᵀ [−Hy(u(t), y(t), p(t))z(t) + p(t)Df(u(t), y(t))(v(t), z(t))]dt
= ∫₀ᵀ [−ℓy(u(t), y(t))z(t) + p(t)Duf(u(t), y(t))v(t)]dt. (2.1.11)

We deduce that

DJ(u)v = ∫₀ᵀ Hu(u(t), y(t), p(t))v(t)dt. (2.1.12)

In other words, Hu(u(t), y(t), p(t)) is the derivative of J at the point u. Therefore:

Proposition 2.2 Let J attain a local minimum at the point u ∈ U. Then, denoting by y and p the state and costate associated with u, we have

DuH(u(t), y(t), p(t)) = 0, a.e. t ∈ (0, T). (2.1.13)
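Formula (2.1.12) is what makes gradients cheap in practice: one forward pass for the state, one backward pass for the costate. The discrete sketch below (forward Euler, with a toy dynamics and cost of our own choosing, not from the notes) checks the adjoint gradient against a finite difference.

```python
import numpy as np

# Toy data: ydot = -y + u, y(0) = 0, l(u, y) = (u^2 + y^2)/2, phi = 0.
T, N = 1.0, 200
dt = T / N

def solve_state(u):
    y = np.zeros(N + 1)
    for k in range(N):
        y[k + 1] = y[k] + dt * (-y[k] + u[k])
    return y

def cost(u):
    y = solve_state(u)
    return dt * np.sum(0.5 * (u**2 + y[:N]**2))

u = np.sin(np.linspace(0, T, N))
y = solve_state(u)

# Backward sweep: discrete version of -pdot = H_y = y - p, p(T) = 0.
p = np.zeros(N + 1)
for k in range(N - 1, -1, -1):
    p[k] = p[k + 1] * (1.0 - dt) + dt * y[k]

grad = dt * (u + p[1:])        # discrete H_u = u + p, cf. (2.1.12)

# J is quadratic in u, so a central difference matches to rounding error:
v = np.cos(np.linspace(0, T, N))
eps = 1e-6
fd = (cost(u + eps * v) - cost(u - eps * v)) / (2 * eps)
assert abs(fd - np.dot(grad, v)) < 1e-8
```

The backward recursion is the exact adjoint of the forward Euler scheme, which is why the match is at rounding level rather than at discretization level.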

Remark 2.3 The above relations are reminiscent of the classical Hamiltonian systems introduced by Hamilton in [15]. The latter are defined as follows. Given a smooth function (the Hamiltonian) H : Rn × Rn → R, the associated (dynamical) Hamiltonian system is

ẏ(t) = Hp(y(t), p(t)); −ṗ(t) = Hy(y(t), p(t)). (2.1.14)

An obvious invariant of the Hamiltonian system is the value of the Hamiltonian itself, since d/dt H(y(t), p(t)) = Hy(y(t), p(t))ẏ(t) + Hp(y(t), p(t))ṗ(t) = 0. For conservative mechanical systems, the Hamiltonian function represents the mechanical energy (sum of potential and kinetic energy). In (2.1.7)-(2.1.8) we have the additional "algebraic" variable u, and, if u is locally optimal, the additional "algebraic" relation (2.1.13). We show in section 2.1.4 that in some cases u can be eliminated from the algebraic relation.

2.1.3

Pontryagin's principle

Let z ∈ L1(0, T). We say that t0 ∈ ]0, T[ is a Lebesgue point of z if

z(t0) = lim γ↓0 (2γ)−1 ∫[t0−γ, t0+γ] z(t)dt. (2.1.15)

This property is satisfied almost everywhere; see e.g. Rudin [24, theorem 7.7].

Definition 2.4 We say that (u, y) ∈ U × Y is a Pontryagin extremal if the following holds:

u(t) ∈ argmin w∈Rm H(w, y(t), p(t)), a.e. t ∈ (0, T). (2.1.16)

Theorem 2.5 Let ¯u and ¯y be an optimal control and the associated optimal state. Then (¯u, ¯y) is a Pontryagin extremal.

Proof.

a) Let v be a feasible control, with associated state y. Denote w := y − ¯y. Since f is Lipschitz, we have that

|ẇ(t)| ≤ |f(v(t), y(t)) − f(¯u(t), y(t))| + |f(¯u(t), y(t)) − f(¯u(t), ¯y(t))| ≤ O(|v(t) − ¯u(t)|) + O(|w(t)|).

By Gronwall's lemma, we deduce that

‖yv − ¯y‖∞ = O(‖v − ¯u‖1). (2.1.17)

b) Denote by ¯p the costate associated with ¯u. Let v be a feasible control, with associated state y. Set ∆ := J(v) − J(¯u). Adding to ∆ the null amount

∫₀ᵀ ¯p(t) · [f(v(t), y(t)) − ẏ(t) + ˙¯y(t) − f(¯u(t), ¯y(t))] dt,

we obtain ∆ = A + B, where

A := ∫₀ᵀ [H(v(t), ¯y(t), ¯p(t)) − H(¯u(t), ¯y(t), ¯p(t))] dt,
B := ∫₀ᵀ [H(v(t), y(t), ¯p(t)) − H(v(t), ¯y(t), ¯p(t))] dt + ∫₀ᵀ ¯p(t) · [˙¯y(t) − ẏ(t)] dt + φ(y(T)) − φ(¯y(T)).

Since −˙¯p(t) = Hy(¯u(t), ¯y(t), ¯p(t)) and ¯p(T) = Dφ(¯y(T)), integrating by parts the term ∫₀ᵀ ¯p(t) · [˙¯y(t) − ẏ(t)] dt, we can write B = B1 + B2, with

B1 = ∫₀ᵀ [H(v(t), y(t), ¯p(t)) − H(v(t), ¯y(t), ¯p(t)) − Hy(¯u(t), ¯y(t), ¯p(t))(y(t) − ¯y(t))] dt
= ∫₀ᵀ [Hy(v(t), ŷ(t), ¯p(t)) − Hy(¯u(t), ¯y(t), ¯p(t))](y(t) − ¯y(t)) dt,
B2 = φ(y(T)) − φ(¯y(T)) − Dφ(¯y(T))(y(T) − ¯y(T)) = (Dφ(ỹ(T)) − Dφ(¯y(T)))(y(T) − ¯y(T)),

where (by the mean value theorem) ŷ(t) ∈ [¯y(t), y(t)] for all t, and ỹ(T) ∈ [¯y(T), y(T)]. By (2.1.17), |B2| = o(‖v − ¯u‖1). On the other hand, by Lebesgue's theorem, Hy(v, ŷ, ¯p) → Hy(¯u, ¯y, ¯p) in L1(0, T). Combining with (2.1.17), we get

|B1| ≤ ‖Hy(v, ŷ, ¯p) − Hy(¯u, ¯y, ¯p)‖1 ‖y − ¯y‖∞ = o(‖v − ¯u‖1).

We have proved that

∆ = A + o(‖v − ¯u‖1). (2.1.18)

c) Consider now spike perturbations: fix γ > 0, t0 ∈ ]0, T[ and w ∈ Rm, and set

vγ(t) := w if |t − t0| ≤ γ; vγ(t) := ¯u(t) otherwise.

Then

A = ∫[t0−γ, t0+γ] [H(w, ¯y(t), ¯p(t)) − H(¯u(t), ¯y(t), ¯p(t))] dt, and ‖vγ − ¯u‖1 = O(γ).

Almost every t0 ∈ ]0, T[ is a Lebesgue point of t ↦ H(¯u(t), ¯y(t), ¯p(t)). Therefore, by (2.1.18), we have, for almost all t0 ∈ ]0, T[,

0 ≤ lim γ↓0 (J(vγ) − J(¯u))/(2γ) = H(w, ¯y(t0), ¯p(t0)) − H(¯u(t0), ¯y(t0), ¯p(t0)), (2.1.19)

as was to be proved. □

In addition, it is easy to prove that along each Pontryagin extremal the Hamiltonian is constant over the trajectory:

Lemma 2.6 Let (u, y) be a Pontryagin extremal, and let p be the associated costate. Then t ↦ H(u(t), y(t), p(t)) is a constant function (up to a set of measure zero).

Proof. Set g(t) := min w∈Rm H(w, y(t), p(t)). For R > ‖u‖∞, let BR be the closed ball of radius R and center 0 in Rm; by (2.1.16), g(t) = min w∈BR H(w, y(t), p(t)) for a.a. t. Using

|g(t0) − g(t)| ≤ sup w∈BR |H(w, y(t0), p(t0)) − H(w, y(t), p(t))| ≤ c (|y(t0) − y(t)| + |p(t0) − p(t)|), (2.1.20)

(with c independent of t and t0), as well as the absolute continuity of y and p, we deduce that g is absolutely continuous. So there exists a set T ⊂ [0, T], of full measure in [0, T], such that (2.1.16) is satisfied, and y, p and g are differentiable, for all t ∈ T. Let t0 ∈ T. By (2.1.16), for t > t0, we have

(g(t) − g(t0))/(t − t0) ≤ (H(u(t0), y(t), p(t)) − H(u(t0), y(t0), p(t0)))/(t − t0),

and so, with the state and costate equations:

ġ(t0) ≤ lim t↓t0 (H(u(t0), y(t), p(t)) − H(u(t0), y(t0), p(t0)))/(t − t0) = Hy(u(t0), y(t0), p(t0))ẏ(t0) + Hp(u(t0), y(t0), p(t0))ṗ(t0) = 0.

Taking t < t0, we would prove in a similar way that ġ(t0) ≥ 0. Therefore ġ(t) = 0 a.e., which, since g is absolutely continuous, implies that g is constant. □

Remark 2.7 We have stated Pontryagin's principle for a global minimum. However, the proof indicates that it also holds for a local minimum in the topology of L1(0, T; Rm). It also holds for a strong relative minimum in the sense of the calculus of variations, i.e., a point at which the cost function is not greater than for any other control whose associated state is close in the uniform topology.
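Lemma 2.6 can be checked in closed form on an analytically solvable example of our choosing: min ∫₀¹ (u² + y²)/2 dt, ẏ = u, y(0) = 1. The PMP system ẏ = u = −p, −ṗ = y, p(1) = 0 gives y(t) = cosh(t−1)/cosh 1 and p(t) = −sinh(t−1)/cosh 1, along which H is constant.

```python
import numpy as np

# Closed-form Pontryagin extremal of min \int (u^2+y^2)/2, ydot = u, y(0) = 1:
t = np.linspace(0.0, 1.0, 101)
y = np.cosh(t - 1.0) / np.cosh(1.0)
p = -np.sinh(t - 1.0) / np.cosh(1.0)
u = -p                                    # pointwise minimizer of w -> H(w, y, p)
H = 0.5 * (u**2 + y**2) + p * u           # H(u, y, p) = (u^2+y^2)/2 + p u
assert np.allclose(H, H[0])               # constant along the trajectory (lemma 2.6)
assert abs(H[0] - 0.5 / np.cosh(1.0)**2) < 1e-12
```

Indeed H reduces to (y² − u²)/2 = (cosh² − sinh²)(t−1)/(2 cosh² 1) = 1/(2 cosh² 1).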

2.1.4

Legendre-Clebsch conditions

If (¯u, y¯u) is a Pontryagin extremal, denoting ¯y = y¯u and ¯p = p¯u, then obviously the so-called weak Legendre-Clebsch condition holds:

D2uuH(¯u(t), ¯y(t), ¯p(t)) ⪰ 0, a.e. (2.1.21)

It is easily seen that this condition also holds for local minima in U.

We say that a stationary point ¯u of J satisfies the strong Legendre-Clebsch condition whenever

∃ α > 0; D2uuH(¯u(t), ¯y(t), ¯p(t))(v, v) ≥ α|v|2, for all v ∈ Rm, a.e. t ∈ (0, T). (2.1.22)

From the proof of Pontryagin's principle it can be checked that the strong Legendre-Clebsch condition is a necessary condition for quadratic growth (in the sense of proposition 2.16).


Another consequence of the strong Legendre-Clebsch condition is that we can apply the implicit function theorem (IFT) to the stationarity equation

DuH(¯u(t), ¯y(t), ¯p(t)) = 0. (2.1.23)

Since the IFT has a local nature, the strong Legendre-Clebsch condition allows the control to have large jumps, but not small ones. Therefore the following holds.

Proposition 2.8 Let ¯u be a stationary point of J satisfying the strong Legendre-Clebsch condition. Then there exists ε > 0 such that, for all t0 ∈ [0, T], setting Vε(t0) := [t0 − ε, t0 + ε] ∩ [0, T], there exists a C∞ function Υ : Rn × Rn → Rm such that either ¯u(t) = Υ(¯y(t), ¯p(t)) for a.a. t ∈ Vε(t0), or ess sup{|¯u(t) − ¯u(t′)|; t, t′ ∈ Vε(t0)} > ε.

Remark 2.9 If in addition H(·, ¯y(t), ¯p(t)) is pseudo-convex (i.e., has convex level sets) for all t ∈ [0, T], then we obtain that t ↦ ¯u(t) is of class C∞.

2.1.5

Abstract second-order necessary optimality conditions

For the sake of clarity, we introduce first the second-order optimality conditions in an abstract setting. Let, in this subsection, U, Y and W be arbitrary Banach spaces. Consider a C2 mapping A : U × Y → W. Define the state equation as

A(u, y) = 0. (2.1.24)

Let (u0, y0) be a zero of A (a solution of (2.1.24)). Assume that DyA(u0, y0) is invertible. Then by the implicit function theorem, (2.1.24) is locally equivalent to y = yu, where the mapping u ↦ yu, U → Y, is of class C2, and we have for all v ∈ U

yu0+v = y0 + z + o(‖v‖), (2.1.25)

where z ∈ Y is the unique solution of

DA(u0, y0)(v, z) = DuA(u0, y0)v + DyA(u0, y0)z = 0. (2.1.26)

Consider a C2 cost function F : U × Y → R. In a neighborhood of u0, the reduced cost function J(u) := F(u, yu) is well defined. Let the Lagrangian function be defined as

L(u, y, p) := F(u, y) + ⟨p, A(u, y)⟩, (2.1.27)

with here p ∈ W∗. Let the costate pu ∈ W∗ be defined as the unique solution of

0 = DyL(u, yu, pu) = DyF(u, yu) + DyA(u, yu)⊤pu. (2.1.28)

Locally, J(u + v) is well defined and equal to L(u + v, yu+v, pu). It follows that

J(u + v) = L(u + v, yu+v, pu) = L(u, yu, pu) + DuL(u, yu, pu)v + o(‖v‖), (2.1.29)

and therefore an expression of the derivative of J is

DJ(u) = DuL(u, yu, pu). (2.1.30)

In particular, if J attains a local minimum over a convex set K at the point ¯u, then the following first-order necessary optimality condition holds:

DJ(¯u)(u − ¯u) ≥ 0, for all u ∈ K. (2.1.31)

Remark 2.10 We easily recover of course, as a particular case, the results of the previous section. We proved there a very interesting regularity result: the derivative of the cost function happens to be (identifiable with) a function in U, instead of a general element of the dual space U∗.

Now we compute second-order expansions. Using again J(u + v) = L(u + v, yu+v, pu), (2.1.25), and the convention ((x))2 ≡ (x, x):

J(u + v) = L(u, yu, pu) + DuL(u, yu, pu)v + ½D2((u,y))2 L(u, yu, pu)((v, yu+v − yu))2 + o(‖v‖2)
= J(u) + DuL(u, yu, pu)v + ½D2((u,y))2 L(u, yu, pu)((v, z))2 + o(‖v‖2). (2.1.32)

Therefore:

Lemma 2.11 The second-order derivative of J is characterized by

D2J(u)(v, v) = D2((u,y))2 L(u, yu, pu)((v, z))2, for all v ∈ U. (2.1.33)

An immediate consequence is the following second-order necessary optimality condition:

Proposition 2.12 Let J attain a local (unconstrained) minimum at ¯u. Then for all v ∈ U and z solution of (2.1.26), the following holds:

D2((u,y))2 L(¯u, y¯u, p¯u)((v, z))2 ≥ 0. (2.1.34)

Of course this is nothing else than the condition D2J(¯u) ⪰ 0, where "⪰ 0" means that the associated quadratic form is nonnegative.
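Formula (2.1.33) can be verified by hand on a one-dimensional instance of the abstract setting; the state equation and cost below are our illustrative choices, not taken from the notes.

```python
# One-dimensional instance: A(u, y) = y - u^2 = 0, F(u, y) = y + u,
# so explicitly J(u) = u^2 + u.
u = 0.7
yu = u**2                   # state equation
p = -1.0                    # costate (2.1.28): D_y F + D_y A * p = 1 + p = 0
v = 1.3
z = 2.0 * u * v             # linearized state (2.1.26): -2 u v + z = 0
# (2.1.30): DJ(u) = D_u L = D_u F + p D_u A = 1 - 2 p u must equal J'(u) = 2u + 1
assert abs((1.0 + p * (-2.0 * u)) - (2.0 * u + 1.0)) < 1e-12
# (2.1.33): only A has nonzero second derivatives, D^2 A((v, z))^2 = -2 v^2, so
hess = p * (-2.0) * v**2
assert abs(hess - 2.0 * v**2) < 1e-12   # matches D^2 J(u)(v, v), since J''(u) = 2
```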

Remark 2.13 As is well known, a second-order sufficient optimality condition is that there exists α > 0 such that, for all v ∈ U and z solution of (2.1.26), the following holds:

D2((u,y))2 L(u, yu, pu)((v, z))2 ≥ α‖v‖2. (2.1.35)

Note however that then v ↦ (D2((u,y))2 L(u, yu, pu)((v, z))2)1/2 is a norm equivalent to the one of U. This means that U is Hilbertisable (i.e., endowed with an equivalent norm, it is a Hilbert space). So we see that (2.1.35) never holds for a non-Hilbertisable space like Ls for s ≠ 2. In particular, it never holds in our application to optimal control! We will have to rely on two-norm second-order sufficient optimality conditions.
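The obstruction is elementary to see on spikes (the construction below is a standard example of our choosing): v_n = √n · 1_(0,1/n) has unit L2 norm but sup norm √n, so no estimate ‖v‖∞ ≤ C‖v‖2 can hold, which is exactly why the L2 norm measures growth while the L∞ norm defines the neighborhood.

```python
import numpy as np

# Spikes with unit L2 norm and unbounded sup norm on (0, 1).
for n in [10, 100, 10000]:
    N = 10 * n                      # grid resolving the spike
    v = np.zeros(N)
    v[: N // n] = np.sqrt(n)        # support of measure 1/n, height sqrt(n)
    l2 = np.sqrt(np.sum(v**2) / N)  # discrete L2(0,1) norm
    assert abs(l2 - 1.0) < 1e-9
    assert v.max() == np.sqrt(n)    # sup norm grows without bound
```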

2.1.6

Specific second-order necessary optimality conditions

We just apply the previous results. The expression of the Lagrangian is

L(u, y, p) = F(u, y) + ∫₀ᵀ p(t) · (f(u(t), y(t)) − ẏ(t))dt = ∫₀ᵀ H(u(t), y(t), p(t))dt + φ(y(T)) − ∫₀ᵀ p(t) · ẏ(t)dt. (2.1.36)

Here we may take the multiplier p in L∞(0, T; Rn), since we know that the costates associated with control variables are in this space. The last term in the r.h.s. of (2.1.36), being linear in y, has no contribution to the Hessian of the Lagrangian, and it remains

D2J(u)(v, v) = ∫₀ᵀ D2((u,y))2 H(u(t), yu(t), pu(t))((v(t), z(t)))2 dt + D2φ(yu(T))(z(T), z(T)). (2.1.37)

Therefore the expression of the second-order necessary optimality condition is as follows:

Proposition 2.14 Let J attain a local (unconstrained) minimum at ¯u. Then for all v ∈ U, z being the solution of the linearized state equation (2.1.9), the expression in the r.h.s. of (2.1.37) is nonnegative.
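Formula (2.1.37) can be checked numerically on the discretized toy problem already used above (ẏ = −y + u, ℓ = (u² + y²)/2, φ = 0, y(0) = 0, all choices of ours): there the Hessian of H in (u, y) is the identity, so the formula predicts D2J(u)(v, v) = ∫(v² + z²)dt, with z the linearized state.

```python
import numpy as np

T, N = 1.0, 400
dt = T / N

def cost(u):
    y, J = 0.0, 0.0
    for uk in u:                    # forward Euler, ydot = -y + u, y(0) = 0
        J += dt * 0.5 * (uk**2 + y**2)
        y += dt * (-y + uk)
    return J

rng = np.random.default_rng(0)
u = rng.standard_normal(N)
v = rng.standard_normal(N)

# Formula (2.1.37): here D^2 H = identity, so the quadratic form is
# sum dt*(v^2 + z^2) along the linearized state zdot = -z + v, z(0) = 0.
z, quad = 0.0, 0.0
for vk in v:
    quad += dt * (vk**2 + z**2)
    z += dt * (-z + vk)

# J is quadratic in u, so a second central difference is exact up to rounding:
t = 1e-3
fd = (cost(u + t * v) + cost(u - t * v) - 2.0 * cost(u)) / t**2
assert abs(fd - quad) < 1e-6
```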

2.1.7

Second-order sufficient optimality conditions

We know that u ↦ J(u) is of class C∞, U → R. Therefore, we may write

J(u + v) = J(u) + DJ(u)v + ½D2J(u)(v, v) + r(u, v), (2.1.38)

where, denoting by ‖·‖s the norm in Ls (s ∈ [1, +∞]), we have for fixed u:

r(u, v) = O(‖v‖∞3). (2.1.39)

For the theory of second-order sufficient conditions we need to check that (under appropriate hypotheses) the second-order term of the expansion of J dominates the remainder r(u, v). Since this second-order term involves "integrals of squares", it is of the order of the L2 norm. Therefore it is useful to check that r(u, v) is small with respect to the L2 norm of v. Note that (2.1.39) gives no guarantee in this respect, since no inequality of the type ‖·‖∞ ≤ C‖·‖2 holds.

Lemma 2.15 For any M > 0, there exists cM > 0 such that, if ‖u‖∞ ≤ M and ‖v‖∞ ≤ M, then

|r(u, v)| ≤ cM‖v‖33 ≤ cM‖v‖∞‖v‖22. (2.1.40)

Proof. The last inequality being obvious, we just have to prove the first one. In the sequel we use Gronwall's lemma several times, and often omit the time argument. Using Taylor expansions up to order q with integral remainders, and since derivatives of any order are Lipschitz on bounded sets, we see that the remainder over a bounded set is uniformly of order q + 1.

We first obtain an expansion of the mapping u ↦ yu. Set δ := (v, yu+v − yu) and δy := yu+v − yu. Since

˙δy(t) = f(u + v, yu+v) − f(u, yu) = O(|v(t)| + |yu+v(t) − yu(t)|), (2.1.41)

(with O(·) ≤ c|·| uniformly whenever ‖u‖∞ ≤ M and, say, ‖v‖∞ ≤ 1), we obtain that

‖yu+v − yu‖∞ = O(‖v‖1). (2.1.42)

Next, set δyz := yu+v − yu − z, where z is the solution of the linearized state equation (2.1.9).

We have that

˙δyz = f(u + v, yu+v) − f(u, yu) − Df(u, yu)(v, z)
= f(u + v, yu+v) − f(u, yu) − Df(u, yu)(v, yu+v − yu) + Dyf(u, yu)δyz
= Dyf(u, yu)δyz + ½D2f(u, yu)((v, δy))2 + O(|v(t)|3 + |yu+v(t) − yu(t)|3). (2.1.43)

This proves that

yu+v = yu + z + zv,v + rv,v, (2.1.44)

where zv,v is the solution of

żv,v = Dyf(u, yu)zv,v + ½D2f(u, yu)((v, δy))2, zv,v(0) = 0, (2.1.45)

and

rv,v(t) = O(|v(t)|3 + ‖v‖13). (2.1.46)

Note that, since v ↦ zv,v is a quadratic mapping, zv,v is nothing but (one half of) the second derivative of yu in the direction v. Omitting the time argument, we get

ℓ(u + v, yu+v) = ℓ(u, yu) + Dℓ(u, yu)(v, z + zv,v) + ½D2ℓ(u, yu)((v, z))2 + rℓ(u, v), (2.1.47)

where rℓ(u, v) is the remainder in the second-order expansion (it contains no linear or quadratic term), and satisfies

rℓ(u, v)(t) = O(|v(t)|3 + ‖v‖13) = O(|v(t)|3 + ‖v‖33). (2.1.48)

Integrating the above relation over time (the final cost φ(yu+v(T)) is handled in the same way), we obtain the desired result. □

Proposition 2.16 Let u ∈ U satisfy the second-order sufficient condition: for some α > 0,

DJ(u) = 0 and D2J(u)(v, v) ≥ α‖v‖22, for all v ∈ U. (2.1.49)

Then for all α′ < α, there exists ε > 0 such that u satisfies the (two-norm) quadratic growth property

J(u + v) ≥ J(u) + ½α′‖v‖22, for all v such that ‖v‖∞ ≤ ε. (2.1.50)

Remark 2.17 The statement of the second-order sufficient condition uses two norms: the L2 norm for the estimate of increase of the cost function, and the L∞ norm for the neighborhood.

Remark 2.18 The above results correspond to the following abstract situation. Let the Banach space U be included in a Hilbert space X, and denote by ‖·‖U, ‖·‖X the norms of U and X respectively. Assume that J is a C2 function over U, and set r(u, v) := J(u + v) − J(u) − DJ(u)v − ½D2J(u)(v, v). If ¯u ∈ U is such that DJ(¯u) = 0, and there exist constants α > 0, ε ∈ (0, α), and ε0 > 0 such that

D2J(¯u)(v, v) ≥ α‖v‖X2, for all v ∈ U; |r(¯u, v)| ≤ ½(α − ε)‖v‖X2, when ‖v‖U < ε0, (2.1.51)

then J has a local minimum at ¯u, and the following quadratic growth condition is satisfied:

J(¯u + v) ≥ J(¯u) + ¼ε‖v‖X2, when ‖v‖U < ε0.

