

In the document Communications and Control Engineering (pages 118-126)

Stability and Suboptimality Using Stabilizing Constraints

Theorem 5.22 Consider the optimal control problem (5.15) and the infinite horizon optimal control problem (OCP_∞^n) with the same running cost ℓ and constraints X and U(x). Let the assumptions of Theorem 4.16 hold and assume that there is N_0 ∈ ℕ such that the assumptions of Theorem 5.13 hold for N = N_0. Assume in addition that X_0 contains a ball B_ν(x_*) and that the terminal cost F satisfies

F(x) − V_∞(x) ≤ ε.


with μ_α from Theorem 4.16, i.e., satisfying (4.18). This control sequence lies in U^N_{X_0}(x_0) since

Inserting this inequality into (5.29) yields V_N(x)

5.4 Suboptimality and Inverse Optimality

Example 5.23 We illustrate Theorem 5.21 by Examples 5.6 and 5.18. Observing that X_N = ℝ holds for N ≥ 1, the dynamic programming equation (3.15) for K = 1 and N ≥ 2 becomes

V_N(x) = inf_{u ∈ ℝ} { x² + u² + V_{N−1}(x + u) }.

Using this equation in order to iteratively compute V_N starting from V_1(x) = 2x², cf. Example 5.6, we obtain the (approximate) values

V_1(x) = 2x²,            V_2(x) = 1.666666667x²,
V_3(x) = 1.625x²,         V_4(x) = 1.619047619x²,
V_5(x) = 1.618181818x²,   V_6(x) = 1.618055556x²,
V_7(x) = 1.618037135x²,   V_8(x) = 1.618034448x²,
V_9(x) = 1.618034056x²,   V_10(x) = 1.618033999x²,

cf. Problem 4. Since, as computed in Example 5.18, the infinite horizon optimal value function is given by

V_∞(x) = (1/2)(1 + √5)x² ≈ 1.618033988x²,

this shows that, e.g., for R = 1, the inequality V_N(x) ≤ V_∞(x) + ε holds for ε = 2.2·10⁻⁵ for N = 6, for ε = 4.6·10⁻⁷ for N = 8 and for ε = 1.1·10⁻⁸ for N = 10.
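These values can be reproduced numerically. The following sketch is not part of the book: it carries out the minimization in the dynamic programming equation under the ansatz V_N(x) = C_N x², which gives the recursion C_N = 1 + C_{N−1}/(1 + C_{N−1}) (algebraically the same as the quotient form in Problem 4(a)); variable names are ours.

```python
import math

# Ansatz V_N(x) = C_N * x^2 in V_N(x) = inf_u { x^2 + u^2 + V_{N-1}(x+u) }.
# The minimizer is u* = -C_{N-1} x / (1 + C_{N-1}), which yields the
# recursion C_N = 1 + C_{N-1} / (1 + C_{N-1}) with C_1 = 2.
C = {1: 2.0}
for N in range(2, 11):
    C[N] = 1.0 + C[N - 1] / (1.0 + C[N - 1])

# Infinite horizon coefficient: V_inf(x) = (1/2)(1 + sqrt(5)) x^2.
phi = (1.0 + math.sqrt(5.0)) / 2.0

# Suboptimality gaps for R = 1 at the horizons quoted in the text.
eps = {N: C[N] - phi for N in (6, 8, 10)}
```

For N = 6, 8, 10 the computed gaps agree, up to rounding, with the values ε = 2.2·10⁻⁵, 4.6·10⁻⁷ and 1.1·10⁻⁸ quoted above.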

We end this section by investigating the inverse optimality of the NMPC feedback law μ_N. While the suboptimality estimates provided so far in this section give bounds on the infinite horizon performance of μ_N, inverse optimality denotes the fact that μ_N is in fact an infinite horizon optimal feedback law, not for the running cost ℓ but for a suitably adjusted running cost ℓ̃. The motivation for such a result stems from the fact that optimal feedback laws have desirable robustness properties. This can be made precise for continuous time control affine systems

ẋ(t) = g_0(x) + g_1(x)u   (5.30)

with x ∈ ℝ^d, u ∈ ℝ^m, g_0: ℝ^d → ℝ^d and g_1: ℝ^d → ℝ^{d×m}. For these systems it is known that a stabilizing (continuous time) infinite horizon optimal feedback law μ has a sector margin (1/2, ∞), which means that u = μ(x) stabilizes not only (5.30) but also

ẋ(t) = g_0(x) + g_1(x)φ(u)   (5.31)

for any φ: ℝ^m → ℝ^m satisfying ‖u‖₂²/2 ≤ uᵀφ(u) < ∞ for all u ∈ ℝ^m; see Magni and Sepulchre [9] for details.

Although we are not aware of analogous discrete time results in the literature, it seems reasonable to expect that this robustness is inherited in an approximate way for optimal control of sampled data systems with sufficiently fast sampling. This justifies the investigation of inverse optimality also in the discrete time setting.

For the NMPC schemes presented in this chapter we can make the following inverse optimality statement.


Theorem 5.24 Consider the optimal control problem (5.5) or (5.15) for some N ∈ ℕ with the usual constraints x ∈ X and u ∈ U(x). Let the assumptions of the respective Theorem 5.5 or 5.13 hold for this N. Then on the set X_N the feedback μ_N equals the infinite horizon optimal feedback law for (OCP_n) with running cost

ℓ̃(x, u) := ℓ(x, u) + V_{N−1}(f(x, u)) − V_N(f(x, u)).   (5.32)

Proof First observe that the assumptions of Theorem 5.5 or 5.13 imply (5.4) and V_N(x_*) = 0. Hence, (5.32) satisfies ℓ̃ ≥ ℓ, is of the form (3.2), and the inequality for ℓ in (5.2) remains valid for ℓ̃. We denote the infinite horizon optimal value function of (OCP_n) with running cost ℓ̃ by Ṽ_∞ and the corresponding optimal feedback law by μ̃.
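The core of the argument can be condensed into a single identity. Assuming, as reconstructed from the surrounding proof, that the adjusted cost in (5.32) has the form ℓ̃(x, u) = ℓ(x, u) + V_{N−1}(f(x, u)) − V_N(f(x, u)), the dynamic programming equation for ℓ̃ has V_N as a fixed point:

```latex
% The V_N(f(x,u)) terms cancel, leaving the finite horizon
% dynamic programming equation for the original cost:
\begin{align*}
\min_{u\in U(x)}\bigl\{\tilde\ell(x,u) + V_N(f(x,u))\bigr\}
  &= \min_{u\in U(x)}\bigl\{\ell(x,u) + V_{N-1}(f(x,u))\bigr\}\\
  &= V_N(x),
\end{align*}
```

where the last equality is the dynamic programming equation (3.15) with K = 1. Hence V_N satisfies the infinite horizon optimality equation for ℓ̃, which is exactly what the proof below exploits.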

From the dynamic programming principle (3.15) with K = 1 and the definition of ℓ̃ we get

V_N(x_0) = ℓ(x_0, μ_N(x_0)) + V_{N−1}(x_{μ_N}(1, x_0)),
ℓ̃(x_0, μ_N(x_0)) = ℓ(x_0, μ_N(x_0)) + V_{N−1}(x_{μ_N}(1, x_0)) − V_N(x_{μ_N}(1, x_0)).

From these two equations, by induction for each K ∈ ℕ we get

V_N(x_0) = Σ_{k=0}^{K−1} ℓ̃( x_{μ_N}(k, x_0), μ_N(x_{μ_N}(k, x_0)) ) + V_N(x_{μ_N}(K, x_0)).   (5.34)

either grows unboundedly or converges to some finite value. Since ℓ̃(x, u) ≥ α_3(|x|_{x_*}), convergence is only possible if x_u(k, x_0) converges to x_* as k → ∞, i.e., if

for all u ∈ U^∞(x_0), which implies V_N(x_0) ≤ Ṽ_∞(x_0).

On the other hand, since μ_N asymptotically stabilizes the system, in (5.34) we get V_N(x_{μ_N}(K, x_0)) → 0 as K → ∞ and thus letting K → ∞ in (5.34) yields

V_N(x_0) = Σ_{k=0}^{∞} ℓ̃( x_{μ_N}(k, x_0), μ_N(x_{μ_N}(k, x_0)) )   (5.35)

which implies V_N(x_0) ≥ Ṽ_∞(x_0). Consequently, we get V_N(x_0) = Ṽ_∞(x_0) and from (5.35) it follows that μ̃ = μ_N is the infinite horizon optimal feedback law for running cost ℓ̃. □

Observe that for the inverse optimality statement to hold we need to replace the constraints x ∈ X in (OCP_n) by the in general tighter constraints x ∈ X_{N−1}, where X_{N−1} is the feasible set for (5.5) or (5.15) with horizon N − 1. This is because by (3.19) the feedback μ_N is obtained by minimization with respect to these constraints. Thus, it cannot in general be optimal for the infinite horizon problem with the usually weaker original constraints x ∈ X.

5.5 Notes and Extensions

Most of the results in this chapter are classical and can be found in several places in the NMPC literature. In view of the sheer volume of this literature, we do not attempt to give a comprehensive list of references but restrict ourselves to the literature from which we learned the results presented in this chapter.

While the proofs in the NMPC literature are similar to the proofs given here, the relaxed dynamic programming arguments outlined in Sect. 5.1 are usually applied in a more ad hoc manner. The reason we have put more emphasis on this approach and, in particular, used Theorem 4.11 in the stability proofs is that the analysis of NMPC schemes without stabilizing terminal constraints in the following Chap. 6 will also be based on Theorem 4.11. Hence, proceeding this way we can highlight the similarities in the analysis of these different classes of NMPC schemes.

For discrete time NMPC schemes with equilibrium terminal constraints as featured in Sect. 5.2, a version of Theorem 5.5 was published by Keerthi and Gilbert [8] in 1988, even for the more general case in which the optimization horizon may vary with time. Their approach was inspired by earlier results for linear systems; for more information on these linear results we refer to the references in [8]. Even earlier, in 1982, Chen and Shaw [1] proved stability of an NMPC scheme with equilibrium terminal constraint in continuous time; however, in their setting the whole optimal control function on the optimization horizon is applied to the plant, as opposed to only the first part. Continuous time and sampled data versions of Theorem 5.5 were given by Mayne and Michalska [10] in 1990, using, however, a differentiability assumption on the optimal value function which is quite restrictive in the presence of state constraints.

The "quasi-infinite horizon" idea of imposing regional terminal constraints X_0 plus a terminal cost satisfying Assumption 5.9 as presented in Sect. 5.3 came up in the second half of the 1990s in papers by De Nicolao, Magni and Scattolini [3, 4], Magni and Sepulchre [9] or Chen and Allgöwer [2], both in discrete and continuous time. Typically, these papers provide specific constructions of F and X_0 satisfying Assumption 5.9 rather than imposing this assumption in an abstract way as we did here. The abstract formulation of these conditions given here was inspired by the survey article by Mayne, Rawlings, Rao, and Scokaert [11], which also contains a comparative discussion of the approaches in some of the cited papers. For a continuous time version of such abstract conditions we refer to Fontes [5]. A terminal cost meeting Assumption 5.9 was already used before by Parisini and Zoppoli [14], however, without terminal constraint; we will investigate this setting in Sect. 7.1.

The construction of F and X_0 in Remark 5.15 is similar to the construction in [2] and [14]. A related NMPC variant which may have motivated some of the authors cited above was proposed by Michalska and Mayne [12]. In this so-called dual mode NMPC the prediction horizon length is an additional optimization variable and the prediction is stopped once the set X_0 is reached. Inside this set, the control value u_x from Assumption 5.9(ii) is used.

Establishing the existence of a suitable upper bound of V_N is essential for being able to use V_N as a Lyapunov function. The argument used here in the proofs of Propositions 5.7(ii) and 5.14(ii) was adopted from Rawlings and Mayne [16, Proposition 2.18]. Of course, this is not the only way to obtain an upper bound on V_N. Other sufficient conditions, like, e.g., the controllability condition "C" in Keerthi and Gilbert [8, Definition 3.2], may be used as well.

Regarding the suboptimality results in Sect. 5.4, for the special case of equilibrium terminal constraints X_0 = {x_*} and F ≡ 0, a version of the suboptimality result in Theorem 5.21 was given by Keerthi and Gilbert [8]. For the case of general X_0 and F we are not aware of a result similar to Theorem 5.21, although we would not be surprised to learn that such a result exists in the huge body of NMPC literature. Theorem 5.22 is a variant of Grüne and Rantzer [6, Theorem 6.2] and extends Theorem 2 of Hu and Linnemann [7] in which the case F = V_∞ is considered.

Inverse optimality was extensively investigated already for linear MPC, leading to the famous "fake" algebraic Riccati equation introduced by Poubelle, Bitmead and Gevers [15]. For nonlinear systems in continuous time this property was proved by Magni and Sepulchre [9]. While the discrete time nonlinear version given in Theorem 5.24 is used in an ad hoc manner in several papers (e.g., in Nešić and Grüne [13]), we were not able to find it in the literature in the general form presented here.

5.6 Problems

1. Consider the scalar control system

x⁺ = x + u,   x(0) = x_0

with x ∈ X = ℝ, u ∈ U = ℝ, which shall be controlled via NMPC using the quadratic running cost

ℓ(x, u) = x² + u²

and the stabilizing endpoint constraint x_u(N, x_0) = x_* = 0. For the horizon N = 2, compute an estimate for the closed-loop cost J_∞(x, μ_2(·)).
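As a numerical cross-check for this problem (an illustrative sketch, not a worked solution; the grid search and all names are ours), note that the endpoint constraint x_u(2, x_0) = 0 fixes u(1) = −(x_0 + u(0)), so only u(0) remains free and V_2(x_0) can be found by a one-dimensional search:

```python
def cost(x0, u0):
    # J_2(x0, u) = x0^2 + u0^2 + x1^2 + u1^2 with x1 = x0 + u0 and the
    # endpoint constraint x2 = x1 + u1 = 0, i.e. u1 = -x1.
    x1 = x0 + u0
    u1 = -x1
    return x0 ** 2 + u0 ** 2 + x1 ** 2 + u1 ** 2

x0 = 1.0
# crude grid search over u0 in [-2, 2]; calculus gives u0 = -2*x0/3
grid = [i / 1000.0 - 2.0 for i in range(4001)]
V2 = min(cost(x0, u0) for u0 in grid)
```

By the suboptimality estimates of this chapter, V_2(x_0) then serves as an upper bound for the closed-loop cost asked for in the problem.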

2. Consider the setting of Remark 5.15 and prove the following properties.

(a) There exists a constant E > 0 such that |r(x, u)| + |ℓ̃(x, u)| ≤ E‖x‖³ holds for each x ∈ ℝ^d with ‖x‖ sufficiently small and u = u(x).

(b) For each σ > 1 there exists δ > 0 such that ‖x‖ ≤ δ implies

−ℓ(x, u) + ℓ̃(x, u) + r(x, u) ≤ −ℓ(x, u)/σ for u = u(x).

Hint for (b): Look at the hints for Problem 2 in Chap. 4.

3. Consider f, ℓ, X_0 and F satisfying the assumptions of Proposition 5.14(i) and (ii). Prove the following properties.

(a) The running cost satisfies ℓ(x, u_x) ≤ α̃_2(|x|_{x_*}) for x ∈ X_0, u_x from Assumption 5.9(ii) and α̃_2 from the assumption of Proposition 5.14(ii).

(b) For the feedback law μ(x) := u_x with u_x from Assumption 5.9(ii) the closed-loop system x⁺ = f(x, μ(x)) is asymptotically stable on X_0.

4. Consider the setting from Problem 1. Prove without using Theorem 5.21 that for

all ε > 0 and R > 0 there exists N_ε ∈ ℕ such that

V_N(x) ≤ V_∞(x) + ε

holds for all N ≥ N_ε and all x ∈ ℝ with |x| ≤ R. Proceed as follows:

(a) Use dynamic programming in order to show V_N(x) = C_N x² with C_1 = 2 and

C_N = (8C_{N−1}² + 12C_{N−1} + 4) / (4C_{N−1}² + 8C_{N−1} + 4).

(b) Use the expression from (a) to conclude that C_N → (1/2)(1 + √5) holds as N → ∞.

(c) Use the exact expression for V_∞ from Example 5.23 in order to prove the claim.

5. Consider Example 2.3, i.e.,

f(x, u) := (x_1⁺, x_2⁺) = ( sin(ϑ(x) + u), cos(ϑ(x) + u)/2 )

with

ϑ(x) = arccos(2x_2)        for x_1 ≥ 0,
ϑ(x) = 2π − arccos(2x_2)   for x_1 < 0,

initial value (0, 1/2) and running cost ℓ(x, u) = ‖x − x_*‖² + u² with x_* = (0, −1/2). The control values are restricted to the set U = [0, 0.2], which allows the car to only move clockwise on the ellipse

X = { x ∈ ℝ² : x_1² + (2x_2)² = 1 }.

Perform the following numerical simulations for this problem.

(a) Implement the NMPC closed loop for N = 8 and confirm that the closed-loop trajectory does not converge toward x_*.

(b) Modify the NMPC problem by introducing the terminal constraint X_0 = {x_*}. Again considering the horizon length N = 8, verify that now x(n) → x_*.

(c) Check the control constraints for each NMPC iterate from (b) more closely, verify that they are violated at some sampling instants, and explain why this happens. Determine by simulations how large N needs to be such that these violations vanish.

Hint: Instead of implementing the problem from scratch you may suitably modify the MATLAB code for Example 6.26, cf. Sect. A.1.
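As a starting point for these simulations, the model can be coded directly from the formulas above. The following is a minimal Python sketch of the dynamics and cost only (the book's reference implementation is the MATLAB code mentioned in the hint; the function names here are ours, and the NMPC optimization itself is not included):

```python
import math

def theta(x):
    # angle parametrization of the ellipse x1^2 + (2 x2)^2 = 1;
    # the clamp guards against rounding pushing 2*x2 outside [-1, 1]
    a = math.acos(max(-1.0, min(1.0, 2.0 * x[1])))
    return a if x[0] >= 0 else 2.0 * math.pi - a

def f(x, u):
    # dynamics of Example 2.3: the car moves along the ellipse
    t = theta(x) + u
    return (math.sin(t), math.cos(t) / 2.0)

def running_cost(x, u, xstar=(0.0, -0.5)):
    # l(x, u) = ||x - x*||^2 + u^2
    return (x[0] - xstar[0]) ** 2 + (x[1] - xstar[1]) ** 2 + u ** 2

# open-loop check: applying the maximal admissible control u = 0.2
# keeps the state on the ellipse
x = (0.0, 0.5)
for _ in range(20):
    x = f(x, 0.2)
```

An NMPC loop for parts (a)-(c) would wrap a horizon-N minimization of the summed running cost around f, subject to u ∈ [0, 0.2] and, for part (b), the terminal constraint x_u(N, x_0) = x_*.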

References

1. Chen, C.C., Shaw, L.: On receding horizon feedback control. Automatica 18(3), 349–352 (1982)

2. Chen, H., Allgöwer, F.: A quasi-infinite horizon nonlinear model predictive control scheme with guaranteed stability. Automatica J. IFAC 34(10), 1205–1217 (1998)

3. De Nicolao, G., Magni, L., Scattolini, R.: Stabilizing nonlinear receding horizon control via a nonquadratic terminal state penalty. In: CESA’96 IMACS Multiconference: Computational Engineering in Systems Applications, Lille, France, pp. 185–187 (1996)

4. De Nicolao, G., Magni, L., Scattolini, R.: Stabilizing receding-horizon control of nonlinear time-varying systems. IEEE Trans. Automat. Control 43(7), 1030–1036 (1998)

5. Fontes, F.A.C.C.: A general framework to design stabilizing nonlinear model predictive controllers. Systems Control Lett. 42(2), 127–143 (2001)

6. Grüne, L., Rantzer, A.: On the infinite horizon performance of receding horizon controllers. IEEE Trans. Automat. Control 53, 2100–2111 (2008)

7. Hu, B., Linnemann, A.: Toward infinite-horizon optimality in nonlinear model predictive control. IEEE Trans. Automat. Control 47(4), 679–682 (2002)

8. Keerthi, S.S., Gilbert, E.G.: Optimal infinite-horizon feedback laws for a general class of constrained discrete-time systems: stability and moving-horizon approximations. J. Optim. Theory Appl. 57(2), 265–293 (1988)

9. Magni, L., Sepulchre, R.: Stability margins of nonlinear receding-horizon control via inverse optimality. Systems Control Lett. 32(4), 241–245 (1997)

10. Mayne, D.Q., Michalska, H.: Receding horizon control of nonlinear systems. IEEE Trans. Automat. Control 35(7), 814–824 (1990)

11. Mayne, D.Q., Rawlings, J.B., Rao, C.V., Scokaert, P.O.M.: Constrained model predictive control: Stability and optimality. Automatica 36(6), 789–814 (2000)

12. Michalska, H., Mayne, D.Q.: Robust receding horizon control of constrained nonlinear systems. IEEE Trans. Automat. Control 38(11), 1623–1633 (1993)

13. Nešić, D., Grüne, L.: A receding horizon control approach to sampled-data implementation of continuous-time controllers. Systems Control Lett. 55, 660–672 (2006)

14. Parisini, T., Zoppoli, R.: A receding-horizon regulator for nonlinear systems and a neural approximation. Automatica 31(10), 1443–1451 (1995)

15. Poubelle, M.A., Bitmead, R.R., Gevers, M.R.: Fake algebraic Riccati techniques and stability. IEEE Trans. Automat. Control 33(4), 379–381 (1988)

16. Rawlings, J.B., Mayne, D.Q.: Model Predictive Control: Theory and Design. Nob Hill Publishing, Madison (2009)

17. Wang, L.: Model Predictive Control System Design and Implementation Using MATLAB. Advances in Industrial Control. Springer, Berlin (2009)

Chapter 6
Stability and Suboptimality Without Stabilizing Constraints
