HAL Id: hal-00654159
https://hal-mines-paristech.archives-ouvertes.fr/hal-00654159
Submitted on 21 Dec 2011HAL is a multi-disciplinary open access
archive for the deposit and dissemination of sci-entific research documents, whether they are pub-lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
Averaging and Deterministic Optimal Control
François Chaplais
To cite this version:
François Chaplais. Averaging and Deterministic Optimal Control. SIAM Journal on Control and Optimization, Society for Industrial and Applied Mathematics, 1986, 25 (3), pp.767-780. �10.1137/0325044�. �hal-00654159�
AVERAGING AND DETERMINISTIC OPTIMAL
CONTROL*
F. CHAPLAIS"Abstract. Averagingisoften used inordinarydifferential equations when dealingwithfast periodic phenomena.Itisshownhere thatitcanbe usedefficientlyin optimal control.Astheperiodtendstozero,
alimit or"averaged" problemis defined.The open loop optimal controlof thelimitprobleminduces a costwhich isoptimalup tothe secondorderwhen evaluatedthroughtheoriginal dynamics.The definition
of theaveraged problemisthengeneralizedtothe nonperiodic case. Itisshownthat theBellmanfunction
of the original "fast" problem tends uniformlyonany compactset tothatof the averagedproblem.
Keywords, averaging, optimalcontrol,timescales, perturbationsincontrol,Hamilton-Jacobiequations
AMS(MOS)subject classifications. 34C29, 41A60, 4900,49C20
Introduction: A perturbation approach. Averaging can be seen as part of the
perturbation theory of differential equations. Consider dx
--=f(x,t),
x(O)=xo,
x(t)R
n,
t[0,
T].
(1)
dt
Regular perturbations correspond to the situation when
f
has a limit inC(Rn
[0,
T], R)
as e tends to zero. Singular perturbations[6]
can be seen asf
having a limit inL2(R"
x [0,T], R").
Roughly speaking, averaging is the case whenf
has a limit inL2(R"
x[0, T], Rn)
inthe weaktopology.As
an example, considerdx
f(x,
where
f
isperiodicinthelastvariable. Definef(x, t) =f(x,
t,t/e); clearlyf
hasno limiteitherpointwiseorinL;
yetf tendstof
definedbyfO(x,
t)
1/
io
f(x,
t,O)
dO,weaklyin L
.
Itiswell known
1]
thatthe solution of(2)
canbe approximatedbythesolution of(3)
=fO(y,
t),
y(0) Xodt
with an erroroforder eto,provided that
f
be regular enough. Solving(3)
instead of(2)
isknownas "averaging".Whatisgoodfor differential equationsisoftengoodfor optimal control. Regulm’
and singular perturbations have been extensivelystudied
([2],
[6]).
Asfarasaveraging isconcerned,it has beenstudied inthe contextof partialdifferentialequations[3]
or stochasticoptimal control[5].
Wepresenthere some approximation results in deter-ministic optimal control.1. Theperiodiccase.
1.1. The averaged problem. Let
f:
RxRPx[0, T]xR->R
,
(x,
u,t,O)->f(x,
u, t,0),
* Receivedbythe editorsFebruary 3, 1986; acceptedforpublication (inrevisedform) May5, 1986. f Centred’AutomatiqueetInformatique del’Ecole NationaleSup6rieure desMinesde Paris, 35, rue
Saint-Honor6,77305 FontainebleauCedex,France. 767
L:
R"
xRP
x[0,T] xR-R,
(x,
u,t,O) L(x,
u, t, 0),with
f
andLperiodicin 0 with a period to independent ofx,u and (regularitywill beconsidered in1.2).
Let Uaabeaconstraint domain and Wad{u
L2([0,
T],RP),
u(t)
Uaa for almost everyt}
the setofadmissible controls.We definethe problem(P)
as"d--’f
f
x( t), u( t),
t,x(O)
Xo,(4)
Minimize L
x( t), u( t),
t, dtWedefinethe associatedproblem Pas follows:
(5)
in Wad
vaa={vL2([O,
T]
x [0, to],RP), v(t, 0)
Uada.e.
(t, 0)},
dy_
1f(y(t),
v(t, 0),
t,O)
dO,y(O)
Xo, dt toMinimize
L(y(t),
v(t, 0),
t,O)
dO in Vad,
Problem
(4)
canbe seen as the perturbation of(5)
bya fast oscillating inputofnull average.Noticethat iff
isLipschitzinx andmeasurable in u, t, 0,then, for v in Vad,
the average
f off
as defined in(5)
is Lipschitzinx andmeasurablein t.Hence,
both(4)
and(5)
have aunique solution over [0, T].Remark 1.
At
firstsight,itwouldseemreasonabletoconsider in(5)
the averages off
and L in 0independently ofu,that is, u beinga constant vector in Rp andnot a functionin
L2([0,
to],RP).
This amounts torestrictingVadtocontrolswhich are constantover[0,
to]
attime t.As
weshallsee,thismay leadto asevereloss of optimality when e tendsto 0. In short, it isnecessaryto have afeedbackonthe fasttime 0. Consider forinstancethe following problem: n p 1, UadR,
f(x, u,t,O)
-x+
u sin(0)
andL(x,
u, t,O)=x2+(u2/6).
If u is independent of 0, the averageoff
is equal to -x,which is itselfindependent of u. Within this class of functions, the optimal cost for
(P)
isasymptoticallyequalto.(0),
if.(u)
denotes thecostofuin(/5).
(0)
isequal toXo2(1
e-2T)/2.
Nowuse ourdefinition of the averagedproblemtocompute abetter control.Let
q bethe solutionof theRiccatiequationofthe averaged problem: dq/dt 3q2
+
2q 1,q(T) 0; thenq(0)
(1
e-4r)/(3 + e-4r).
Define zbydz/dt
-z(1
+
3q),z(0)
x(0);
z isthe optimal trajectoryfor the averaged problem. Now usethe open-loop control
u(t)=-6q(t)z(t)
sin(t/e)
in problem(P).
Thenit is easyto see that x driven by u in(4)
is asymptoticto z. Hence the cost of u in(P)
is asymptotic too
Tz2(t)x
(1
+ 3q2(t))dt.
Thanks to the Riccati equation, the latter quantity is equalto q(0)x2,
which issmallerthan
J(0);
thus foresmallenough,u isbetter than any"slow" control. Notice that thesamephenomenoncanbeobservedin stochastic optimalcontrol,where it is well known that open-loop controls alone are not enough to ensure optimality. In both cases this is due to the presence ofaverages w.r.t, the events or fasttime in theproblem
(the
same canbe saidof the "ordinary,""slow"time, ofcourse).
Remark2. Eventhough the controls of problem
(P)
arein an infinite dimensional state, the minimumprinciple and dynamic programming apply; the important factis that the state is finite-dimensional. One also checks that the Bellmanfunction isthe viscosity solutionof the Hamilton-Jacobi-Bellman equation [7].Let
f(x,
v,t)
denote the average off
and L the average ofL. The adjoint state equations for theproblem(P)
are"dp
of
dr-
OxpOx’
p(T)
Oandif
v*
is anoptimal control for the problem(P),
theminimumprinciple saysthatv*(t,.)
minimizes the Hamiltonian of(P),
that is,p(t)rf(x,
v,t)+,(x,
v,t),
with respect to v in Vaa.
This is equivalent to’v*(t, O)
minimizesp(t)rf(x,
v,t,0)+
L(x,
v,t,O)
with respectto v in U"d,
this for almost every 0. We see thatthecontrol appears naturally as afeedback onthe fasttime(cf.
Remark1).
Remark 3.
(P)
is better conditioned numerically than(P),
since(P)
involves a time grid of order 8t/e, while(P)
involves only a grid of orderSt,
at least in the simulation part. A time grid int/e
is still needed to compute the averages and to minimizethe Hamiltonian.1.2.
An
approximation theorem. The averaged problem is used here to compute anearoptimal control for theproblem(P),
with an errorofordere2.
The proof and assumptions are close tothose usedin[2].
Assumptions.
(H1)
Uad Rp" noconstraint.(H2)
(H3)
(n4)
Ldoesnot depend on 0;
f
and Lare C,
of class C2in x, and u; the first-and second-order derivatives of
f
are bounded, and Lipschitz; the second-orderderivatives ofL arebounded, and Lipschitz.(P)
has asolution, withoptimalcontrol Uo,trajectory y andadjointstate q.Let
H(p,x,
u, t,O)=prf(x,
u,t,O)+L(x,
u,t).
There exists/3>0
such that,forall
(x,
u, t,0),
02H/Ov
2isgreater
than/3
Identityatpoint(q(t),
x, u,t,0),
in the sense of the cone of positive semidefinite matrices.
(H5)
Onehas>=0 at(q(t), x, u,t,
O)
forall(x,
u, t,0).
(H6)
f(y(t),Uo(t, 0),
t,O)
and Lipschitz derivative.y(t),
Uo(t,
0),
t,O)
are C in with aAssumptions
(H4)
and(H5)
ensure sufficientregularity of the control withrespectto thecost. Assumption(H6)
is specificto averaging: ifit did not hold,there wouldnot be a true separation oftime scales.THEOREM 1. Let
f
and L meet Assumptions(H1)-(H6).
Define
u byu(t)
Uo(t, t/e).
Thenu isnearoptimalfor
theproblem(P)
with an error onthecostof
order2
E
Sketch
of
proof.(A
detailedproof canbe found in[4].)
Wewill proceed in two stages. First,wewill exhibitalower boundofthecostfor any control u."Wewillthen show thatthe control u induces a cost whichapproximatesthislowerbound with an errorofordere2.
Itisthen easytoconcludethatu isnearoptimal for theproblem(P).
Remark 4. Unless otherwise stated, all partialderivatives willbetaken attime t, with x being
y(t),
u beingUo(t, t/e)= u(t),
0 beingt/e
andthe adjoint statebeingof 0, we will denote byg(tr) or g(tr,. or av(g(tr,.)) the average of g in 0.Wethen definethe operator
II
onperiodicfunctions g,byII(g(tr,.))
being the only primitive in 0 ofg-g
with a null average.II
plays a keyrole in all developments of integrals orsolutions ofdifferentialequations involving aperiodicity infast time. Finally, the superscript Tdenotes transposition.LEMMA 1. Let
X2(t,
O)=II(f(y(t),Uo(t, "),
t,.)(0).
Thereexists Co>0, k>-O, suchthat,
for
any controlu and anye ]0, eo[, x being the trajectory drivenby u in(4),
the following estimationholds:Io
L(x(t), u(t), t)
dt>- L y(t),Uo t, dt+
av-0-;
(y(t, q(t,Uo(t,.
,
,.
lx(,.
t
eq(O)
TX2(0
0)
ke2.
Proof of
Lemma 1. Wewill denote by:
and the followingerrors"( t)
x( t)
y(t)
ex2(
t,),
( t)
u( t)
Uo(
t,).
It should benoticed that, if
x
is definedbydx,
a
f
f
X "l---
Xl(O
"l-x2(O O)
0(6)
dt OX OXX2,
and if =0
(that
is, x is the trajectory driven byu),
theny+
eXl+ ex2(t, t/e)
is a uniform approximation of x with an e2 error.Wewill usedevelopments offunctions withintegralremains.Tothisend,wewill usethe following notation: for
a
in[0,1],/z
in[0,1],
F,(A,/z)
willdenote theHessian symmetric operator associated withthe second-order derivatives off
in x and u, at point(y(t),alx(ex2(t,t/e)+(t)), Uo(t, t/e)+
a(t),
t,t/e). L,(A,
)
willdenote the analogue for L andH(A,
g)qr(t)F(A,
)+ L(A, )
will denote the Hessian ofHatthesame point.
We will also make use of the following linear quadratic oscillatory
"tangent"
problem: dz
0,
dt 8x
,
,
OuMino
(zL
vr)H,(O, O)(zL vr)
r+p
t,z+Vsu
dr,whereP2
=-H(SHr/Sx).
P2 is the analogue ofx2for the adjoint state.Thanksto
(H4)
and(H5),
(TP)
hasauniquesolution withtrajectory y, optimal controlv
and adjointstate q.Moreover,
Ily,
I1
and o,I1
arebounded whene rangeswithin aneighbourhood ofzero. Finallyestimates willbear onthe quantity
z=
Ariad II&(x,
)IIL
wherer
1
(
z(x,
)(t)=,.,(t)+
L\-DT
/
o,oxj
r(t)+
ex2 t,thederivatives being taken atthe sameinterpolation points as for
Ft(A,/z), L,, (A,/z)
and Ht(A,
I).
At last, will denote an approximationwith an e2error.
LEMMA 1.1.
;+
ex2(t,
t/e))
u
Proof of
Lemma 1.1. The cost is expanded atthe second order usingan integral remain. Sincethereis noconstraint, q’(cgf/gu)
+
(cL/u)
0; theHamiltonianappears aftera classical integration by parts. Wethen neglect all integrals depending on fast time when of order larger thanor equal to e2.
LEMMA 1.2. Thereexistsk
>=
0 such thatdt dt
e"
2 t, e dt(8)
=f
r+y+ex2+eyl,v+tto+el)l,t,--f
y,Uo t, ,t,--e(y+x2)
e-v,-e
n
(f(y,uo(,"
t,"OU
since
/OO[H(f)]
=f-
The second expression isequal tof(
r+
y+ ex
+
ey, v+ uo +
eV, t,)
f(
y+ ex +
eyluo
+
eV t,)
Io’
v then appears as a difference in the controls and isreplaced by
z(x,
)-
xj(+
x:).
Usint
the Gronwall lemma, and neglecting theintetrals
offast periodicfunctions at thesecon
order, weet
te
estimate onI111.
The estimation onIlvll
is otainethroughthedefinition of
Lza 1.3. rca, be pproximated uniformly with an error
of
order@
rl, with(9)
dt rl+
+
x2+
y V+
Uo+
V t, y+ exx +
y Uo+
eV t,r(O)
=0.IIIIL-<_
k(-+z)
Proof of
Lemma 1.2. dr dt 2 andIloll_-<
k(+z).
Proof of
Lemma 1.3. Notethat(9)
isclose to(8).
Wethen get the result by using the Gronwalllemma. Wewill now use rl rather than r.LEMMA 1.4. Thereexists k >-0 such that,
for
e]0,
1]:
Ox
-07
S
dt >-,
p2
t,+
Ofv
dt- ke2
kez.
LOx
and
7
r +
y,Hence,
integrating bypas,
and nglting sond order trms,But
r
+Of
re p t,
r
+ z)
ou
byLemma 1.2. On the otherhand,
io
(Irll
2+
I 1:)
at
--ke:
IIr,
ll+
L
by averaging estimations. But, from
(9)
and Lemmas 1.2 and 1.3, one sees that there exists k>
0 such thatiidr/dtll,
k(1
+ z);
thiscompletes theproof.We are now going to study the second orderterm with
Ht(A, )
in Lemma 1.1.LEPTA 1.5.
ere
exists k>
0 such that:io’
d dio
d(
+ x:,
)H,(X,
)
(+:)
x:
d d
d(y,+u)H,(,)
rx:
_k+z].
oProof of
Lemma 1.5.By
substitutingZ
-[(02H/Ov2)
-1OH
Ox](r+ ex2)
to v in the integral and using Assumptions(H4)
and(H5),
oneshows that the expressionon the left-handside isgreater than1o
dtjl
oAdAJl
odl fl]Z(A, )(t)[
,
thatis,fizZ.
2
This willbe theonlypositiveterm in z; the otherswill be of the form
-kez.
Note that Lemma 1.4 already displays one
pa
of the cost to be minimized in problem(TP).
The otherpa
will appear by replacingHt(A,
)
byHt(O, O)
in the estimates ofLemma 1.5.LEMMA 1.6.
ere
existsk 0 such that:io
io
io
2e dt A dA
d(y, v)Ht(A,
)
r x2oof of
Lemma 1.6. Sincethe second derivatives off
and L are Lipschitz,H,(A,
)-
Hi(0,o)11
g(lx
-Y=I
+
lu
Uo12)
1/2gA(Ir +
ex2
+
ey,=
+
Iv
v,12)
1/2.
As
x2,y andv
arebounded,there exists k>
0 such that:io
Io’
e dt AdA
dg(y+v)[Ht(A,)-Ht(O,O)]
rx
o
<-
ke(e2+
rll
+
Weget the result from Lemma2.
LEMMA 1.7. There existsk>-0 such that:
Io
L(x,
u,t)
dt >-Io
L(x,
u,
t)
dt+
eio
o
--x
x2 dt--eq(O)
Tx2(O,
O)
+e
p
t,[Ofrl+Ofv]
dtLOx Ou .i
oof of
Lemma 1.7. Combine the results ofLemmas 1.1, 1.4, 1.5 and 1.6.Weare going now to completethe estimate by using the problem
(TP ).
LEMMA 1.8.
ere
existsk 0 such that:io
()[o
av
o
3.
o(+:)
oof of
Lemma 1.8. Transformations usingthe adjoint equations of(TP )
and the explicit value ofv
as afeedback yield the following estimate for the expression onthe left-handside:io
{()
(
)
o
o]
,
qT
i
x,,t,-i
x-,,,-,t,-,,-
dtwhichis clearly of second order in
(llrlll:+
IIII)1/=;
from Lemmas I.and 1.3 weget the result.From Lemmas 1.8 and 1.7the estimation proposed in Lemma 1 isproved for e sufficiently small to make
flz2
dominant against -ke. This completes the proof ofLemma 1.
LEMMA 2. Weare goingto estimate thecost inducedby U
.
Letnow u u,
that is O. Then"Io
L(x,
u,
t)
dtIo
L(y,u,
t)
dteqT
(O)x2(O, O)
+
e-x
Xadt.Proof of
Lemma 2. We canuseLemma 1.1 toget afirst estimate:L(x,
ut)
dtL(,
u)
de- eqr(0)x(0, 0)+
e dt OXX2 r OH+
dt+
dt dd(
+x),)(+x)
where
02H/x
2is computedatthe samepointas forH(A,
).
Since isof orderone in e, the last integral is of ordertwo.Proceeding as in Lemma 1.4,we also show that the integral before that one is of order two. This completes the proofof Lemma 2. Theorem 1 follows from Lemmas 1 and 2.COROLLARY 1.
If
U is a "better" control than ufor
theproblem(P),
then"Proof of Corolla
1. Denote byJ (u)
the cost in problem(P).
For e]0,
1],
one has
J (u)
J (u )
ke2
+
(fl
ke)z,
with k 0. If u is better than u,
then2<
2k/fie
2.
The result follows fromLemma1.2.(
ke)z
ke2Takee
< /2k;
thenz
2. The
nonerioic
case.2.1.
An
ergaic thearem an O.D.E. LetR"
x[0,T]xR+
f:
(x,
t,o) f(x,
t,o)
meeting the following assumptions"
(H7)
f
is Lipschitzin(x, t)
with Lipschitz constantA,
and integrablein 0.(HS)
f
has an averagef
inthe sense that,for any boundB,
one has:1 t+
Sup
f(x, t,O)
dO-f
(x,
t)
O.IxlB +
te[0,T]
t0
As before,
(H7)
ensures that onehas a trueseparation oftime scales.Let x andy be definedby:
(lO)
dt
-f
x’
t,(0)=Xo,
t[o, T],
(11)
dy=f(y,
),
y(0)=Xo, [0,r].
Then
Sup
Ix
t)
y(t){
o.
t[0,T] e0
Proof.
At
afasttimescale,the slowtime canbeconsidered asconstant, aswell asx(t)
andy(t). Hencethedynamicsf
canbe approximated byf;
integration yields the result.More precisely, letD
N*
and,for [0, T], let k--kt/D.
Then:Ix(t)-y(t)l<-A
Ix(s)-y(s)l
ds+f
y(s),s,(tk),
tk dsk=0 dt D-1
(
+
f(y(s), s)-(y(t),
)1
ds k=0 ol+
2
f
y(tk),tk,--f(y(tk),
tk)
ds k=0 dtfrO
t2
De[
A
]x(s)-y(s)l
ds+Al+
Sup
f(x,
t,O)
dO t[0,T]t0
-f(x,t)
>0.Choose D
D(e)
such thatD(e)
--
_ooo andeD(e)
--_o0; use of the Gronwalllemmayields theresult.
2.2. The averaged problem. Section 2.1 can be viewed as a generalization of averaging techniquesto the nonperiodic case. We are now going to use it to define the averagedproblem in the nonperiodiccase.
Let
f
and L be as in 1.1, except that nowthey need not be periodicin 0, and defineproblem(P)
accordingly. The important pointinthedefinitionof the averaged problem isthat oftheset Wdofadmissiblecontrols. Wad
will betheset offunctions u from [0, T]xR/ to R
p,
with values in Uadfor almost every
(t, 0),
and such thatf(x,
u(t,
0),t,O)
andL(x,
u(t,
0),t,O)
have an average in 0for every(x, t).
Note that Wad
may be empty.
However,
we will show that, if averaging can reasonably be expectedto be used (i.e., theminimized Hamiltonianhas an average),then Wad
isnonempty
(see
2.3).
For u in Wad
we definethe averaged problem:
dy_
f
(y, u(t, ),
t,),
y(O)=Xo,dt
Minimize L(y,
u(
t,.),
t,. dt,t[0, T],
(P).
If
f
and L are periodic,we findthe same definition as in 1.1.2.3. The Hamiltonian of the averaged problem. In this section, we will omitthe mention x and t. All assumptions made willbe supposedto hold for every x and t.
Definethe pseudo-Hamiltonian h
(p,
u,0)
by h(p,
u,0)
pTf(u,
0)
+
L(u, 0)
and let H(p,0)=
Minuuod
h(p, u,O)
whenit exists.We are goingto show thatifH has. anaverage forany p, then its average isthe
minimum ofthe Hamiltonian of the averaged problem. We will use the following
assumptions"
(H9)
For any bounded part B of Rp,
f
and L are bounded and uniformly continuous on BxR+
(i.e.,f
andLare in BUC(B, R+)).
(H10)
Forany(p,0),
h(p, u,0)
has a minimum on Uad.
Moreover,
for any bounded domain BinRpthereexists acompactsetKin Uad
such that theminimum can bereached in K for any (p,
0)
in B xR+.
(H11)
H has an averageH,
i.e., 1H(p,
O)
dO(p)
for all p inR".
"1"
THEOREM2. Let
f
andLmeeting (Hg),(H10)
and(Hll).
Then Wad isnonempty and
ffI(p)
MinvwOdprf(v)+
(v).
Proof
Denotepr(
v)
+
C(
v)
byg(p,
v)
forvin WadandletE
={pR"
=lv Waa,
151(p)
(
p,v)}. As
obviously/(
p,v)
=>
D
(p)for any vinWad,
Theorem2 isequivalent to E R".
We are goingto showthat E isclosed and thatR"-
E is ofnull measure(Lebesgue).
(i) E is closed.
IfE
,
this is true.IfE,
letp. be asequenceinE,
converginginR",
withH(p.) h(p.,
v.).
Thanks to(H9)
and(H10),
f(v.)
andL(v.)
arebounded;letf
andL be two cluster points, and
w.
a subsequence such thatf(w.)--*._.oof
and/2(w.
._.ooL-.
Wearegoingtoexhibit acontrol v suchthatprf
+
r_.
prf(v)
+/S(v).
Let
-.
be anincreasing sequenceinR/ such that-.
=>
n!and suchthat, for1_
f(w,,(O), O)
dO-?(w,)
1z, exists since w, is in
Wad.
Define v byv(0)= w,,(O)
for 0 in[z,, ’,+1[;
it is then easyto check that v is in W"dand that p
rf
+/S=/(p,
v).
(ii)
R"-E
is of null measure.H is locally Lipschitz, thanks to Assumptions
(H9)
and(H10),
and thus, if Fdenotestheset ofdifferentiability points of
H, R"
Fisofnullmeasure.Wearegoing to show thatFcE.Noticethat,thanksto
(H9)
and(H10),
there existsameasurable functionu fromR"
xR+
to Uaa such thatH(p,O)
h(p, u(p,0), 0).
Letp beasmall positivenumber,p in Fand q a direction in
R".
Then"H(p+ pq)- H(p,
O)
<=
qrf(p,
u(p,0),O)
<--H(p, 0)-H(p-pq,
O)
P P
Let be aclusterpoint of
1/r
Io
qrf(p,
u(p,
0), O)
dO as z+oo;
existsthanks to(H9)
and(H10).
Then let-
oefirst,then p 0 intheabove inequalities.Weconclude that =OH/Op in the direction q.As
this is true for any clusterpoint andany direction q,we have lim1 f(p, u(p,0),
0)
dOOff-It
->o
.
OpSince
f
and H have an average, L has an average andu(p,
O)
is inWad,
withRemark 5. The concavity ofH inp is essential. Let H(p,
0)=
sin(p,
0).
H 0, yetOH/Op has noaverage.2.4. A limittheorem on the Bellman function. Assumption
(Hl
l)
has given sense to the averaged problem by ensuring that Waa is nonempty. It also yields a limit theorem onthe Bellman function.We
will considerthe latteras the unique viscosity solutionof the Hamilton-Jacobi-Bellman equation(see [7]).
Assumptions. Let
R"
xR"
[0,T]
R+-->
R,
H"
(p, x,t,
0)
H(p,x,
t,0).
H may represent, forinstance, the minimized Hamiltonian of 2.3. The assumptions arethe following:
(H12)
(H13)
HsBUC(BxR"x[0, T]xR/)
for any bounded part B of R",
andIn(p,
x,
t,o)- n(p,
y, t,0)1_-
<C(1
+lpl)(lx- yl).
H has anaverage H suchthat, for any p, x, t,
Sup
_
H(p, x,
t,0) dO-H(p,
x,t)
O.0 +o
THEOREM3. Let Vbe the viscosity solution
of:
(12)
O___V+ot
ffI(OV’\-x
X,t)
=0, Let V be the viscosity solutionof:
+H
,x,
t, =0,Ot
V(x, T)O,
xR",
t[0, T].
V(x, T)=-O,
xR",
tel0,T].
ThenV convergestoVasetendstozero,uniformlyonany compact subset
of
R"
x[0,
T].
Proof.
We
will makeuse of the followingnotation:Lnf(R
x [O,T])
(
4
SLo(R
X[O, T], R),
Supf
yII
t[O,T]
I(x, t)l"
dx dt<
/In unif
"
X[0,
T])
isthesetof functionsb
inLPnif(R" x[0,
T])
suchthatOdp/Ot,Ocb/Ox
and
02b/0x
2exist and are in P uz2,1.,Lunif.
nir isthe analogueof W2’1" except thatU’
is replaced byLunif.P
We can usethe norm on uz2.1.p,,unif definedbythe sup of W2,’p
norms over all B(y,
1)x [0,
T], whereB(y,
1)
istheclosed ball ofcentery andradius 1.ii12,1,p We willdenote,for a
>
0, byV,,
the unique solution infqp_o
unif of(14)
OV+aAV+H
,x,t, =0,V(x,
T)=-O
Ot and
V
the analoguefor(15)
OV+aAV+fft(OV
)
O
\-x
X, =0,V(x, T) =-
O.Itis well known
[7]
thatMoreover,
k doesnotdependonthebehaviourofH
intime, that is,inparticular k does not depend on e.Hence,
the theorem will be proved if, a being fixed,V
convergesto
V,
for e in]0,
1].
Estimates onthe derivatives ofV
ensurethat the arebounded in(and
thus form aweakly relatively compact subsetof)
IITE’I’Pvv
unif for anyp>0. Thus, the
(V,
OV/Ox)
are relatively compact inC(K
x[0, T],R),
K being any compact subset ofR".
Let(V,OV/Ox)
a cluster point and(V,OV./Ox)
a sequence converging to(V,
0V/Ox)
uniformly on any compact, withV
,
V
Z2,1,p
in ,if weakly.
Proving the theorem thusamountsto showing that
V
is a solutionof(15)
inthe weak sense. Let afunction inC(R"
x[0, T])
with a compact domain.at
dx+
aA9
+
B
(x, t)
t Otk
’x’at
dx+
aA9
aAV:
(x, t)
+
dt dxX’
x, t, x, t,(x, t)
Ill2,1,2As e tends to zero,thefirst expression has limit zero by weak convergencein The secondonetendstozerothankstotheLebesguedominatedconvergencetheorem.
A discretization scheme similar to that used in 2.1 ensures that the third one has limitzero,thanks to Assumption
(H13).
Remark 6. It can be proved
[4]
that, if H is locally Lipschitz in p and iftheconvergencein Assumption
(H13)
is uniformfor x inR",
then the convergence of Vto V is uniform on
R"x
[0,T].
Onemay wonder when
(H12)
is true with H=prf+
L. It is trueif, for instance,f
andLare BUCand Lipschitzin x.However,
thiscanbe extendedtothecasewheref
isuniformly continuous, Lipschitzin xwithIf]
=
k(1
+
Ix]
+
}u]);
and Lis uniformly continuous, locally Lipschitz in x,]o /oxl k(l+lxl+l.I)
andI l=k(l+lxl=+lul=),
This includes the linearquadraticcase; actually, wemightcall this the"sublinear quadraticcase." With a suitable truncationoff
and Lonthephase domain,we may keep V and V unchanged on a potion ofR"x
[0, T], while retrieving Assumption(H12).
The convergence is still uniform on any compact.3. Perspectives. Wehave provenhere,atleast in theperiodiccase, that averaging canbe usedas an efficienttool in deterministicoptimal control. Itis efficientfortwo reasons.
First, the averaged problem
(P)
is easier to solve numerically than the originalproblem.
(P),
since a "fast" time grid is no longer needed in the simulation part. Gains should also be expected on the state space grid, since it isoften relatedto the former one;itis animportant pointif onethinks, for instance, of dynamic programming.Second,andthankstoTheorem 1, thesolution of
(P)
isknowntobenearoptimal for(P);
hence,we do notlosemuchby solvingtheaveraged probleminsteadofthe original one.We have shown that use of the averaged problem can also be expected to be efficient inthe nonperiodic case, as the optimal cost of
(P)
is close to that of(P)
However,
practicalproblemsareseldom under the form(P),
beit inthe periodic ornonperiodiccase.Mostof the time, there is, for instance,no explicit separation of the time scales under the form(t, 0),
with 0 ranging from 0 to+oo;
in particular, periodicity or averaging assumptions cannot be checked directly. Nevertheless, the results presented here provideanimportanttheoreticalbackgroundfor developments of both theoretical andpractical interest.From thetheoretical point of view, it is reasonableto expect problem
(TP)
inLemma 1 to provide further expansions of the cost
(J),
as similar methods havealready been used withsuccess in the caseof regular and singular perturbations
[2].
A co.mplete expansion of the Bellman function in thelinear quadratic periodic case has already been obtained
[4].
In particular, the terms of order higher thantwo are inthe.form V(x, t, t/e,(T-t/e)),withperiodicity inboth the forward and backward fast times. This is probably related to the existence of the terms x2 and P2 in the expansion of the primalanddual trajectories. Thesamephenomenonexists insingular perturbations with boundary layersinstead of phaseterms.Wehaveseen thatx2and P2 are definedthrough the operatorII. In fact,
II
appears inany expansion ofanintegralwith anintegrand periodicinfasttime.A
generalization ofII would be welcome if we hope to find some results equivalent to Theorem 1 in thenonperiodic case.Atlast,linksshouldbedevelopedwithsingular perturbations.Fromthe practical point of view, we have seen that the assumptions in Theorems 1 and 3 cannot be checked directly. It should be noticed (especially in the nonperiodic
case)
that the questionis not somuch that the assumptions mightnothold, asitisrathertoimmerse theoptimizationprobleminthe "right" family of problems(P). Moreover,
theideas aresufficiently simpleandgeneraltobeusedinheuristics.Onecanthink,for instance, of separating the time scales through the use of "movingaverages."
Heuristics can also be developedto generalize theoperatorII
to the nonperiodic case and use it to improve performances. We aregoing to experiment numericallyonthese ideas.Conversely, averaging has been often used empirically by engineers in practical problems. The results and notions presented here may provide them someguidelines in further applications.
Wehavediscussedhow averaging couldbeused practically.Weshalldiscuss now when it couldbeused. Heuristically, averaging canbe expected toyield goodresults andperformanceswhenthe dynamicsof asystem dependon afast"erratic"exogenous phenomenon. A goodexampleis given by weatherdisturbances.
However,
these phenomena are often modelized by stochastic processes.As
wementioned before, both approaches are similar in the sense that they make use of
averages;theirtheoreticalbackgroundis,however,quitedifferent.
Moreover,
averaging has also been used in stochastic control([5],
for instance). In all cases, the original dataconsistsoftenof a finite number ofphysicalmeasures,in noway probabilisticor two-time scaled in nature. Thus, neither approach is justified a priori. Therefore, it should prove quite interestingto experiment all methods onvarious sets ofdata. Weplan to conduct such experiments.
REFERENCES
[1] V. ARNOLD, Chapitres suppldmentaires de la thorie des quations diffrentielles ordinaires, French translation fromthe Russian, Mir,Moscow,1980.
[2] A. BENSOUSSAN, SingularPerturbationsinSystemsandControl,MarkArdema, ed., Springer-Verlag, Berlin-Heidelberg-NewYork,1983, pp. 169-185.
[3] A. BENSOUSSAN,J. L. LIONS ANDG. PAPANICOLAOU,Asymptotic AnalysisforPeriodicStructures, North-Holland,Amsterdam, 1978.
[4] F.CHAPLAIS,Averagingetcontrleoptimaldd.terministe,ThsedeDocteur-Ing6nieur, Ecolenationale
Sup6rieure desMinesde Paris, Paris,1984.
[5] F. DELBECQUEANDJ. P. QUADRAT,ContributionofStochasticControl SingularPerturbationAveraging andTeam Theoryto anExampleofLarge-ScaleSystems: ManagementofHydropower Production, IEEETrans. Automat.Control, AC-23, April 1978, pp.209-221.
[6] P.V. KOKOTOVIC, R. E.O’MALLEY,JR.ANDP.SANNUTI,Singular perturbationsandorderreduction incontroltheory:Anoverview,Automatica, 12(1976),pp.123-132.