HAL Id: inria-00439543
https://hal.inria.fr/inria-00439543
Submitted on 7 Dec 2009
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Hamilton-Jacobi-Bellman approach
Emiliano Cristiani, Pierre Martinon
To cite this version:
Emiliano Cristiani, Pierre Martinon. Initialization of the shooting method via the
Hamilton-Jacobi-Bellman approach. Journal of Optimization Theory and Applications, Springer Verlag, 2010, 146 (2),
pp.321-346. �10.1007/s10957-010-9649-6�. �inria-00439543�
a p p o r t
d e r e c h e r c h e
N
0
2
4
9
-6
3
9
9
IS
R
N
IN
R
IA
/R
R
--7
1
3
9
--F
R
+
E
N
G
Initialization of the shooting method via the
Hamilton-Jacobi-Bellman approach
Emiliano Cristiani — Pierre Martinon
N° 7139
Centre de recherche INRIA Saclay – Île-de-France
Parc Orsay Université
EmilianoCristiani
∗
,Pierre Martinon
†
Thème: Modélisation,optimisationet ontrledesystèmesdynamiques Équipe-ProjetCommands
Rapportdere her he n°7139Dé embre200925pages
Abstra t: Theaimof thispaperistoinvestigatefromthenumeri alpointof viewthe possibilityof ouplingthe Hamilton-Ja obi-Bellman(HJB)approa h andthePontryagin'sMinimumPrin iple(PMP)tosolvesome ontrolproblems. We show that an approximation of the valuefun tion omputed by the HJB method onroughgrids anbeused toobtainagoodinitial guessforthePMP method. The advantage of our approa h over other initialization te hniques (su h as ontinuationordire tmethods) isto provideaninitial guess loseto theglobalminimum. Numeri altestsinvolvingmultipleminima,dis ontinuous ontrol,singular ar sand state onstraintsare onsidered. The CPUtime for the proposed method isless than four minutes upto dimension four, without odeparallelization.
Key-words: optimal ontrol problem, minimumtimeproblem, Pontryagin's minimumprin iple
∗
CEMSAC,Fis iano(SA)andIAC-CNR,Rome,Italy
†
Résumé: Theaim ofthispaperisto investigatefrom thenumeri alpointof viewthepossibility of oupling theHamilton-Ja obi-Bellman(HJB) approa h andthePontryagin'sMinimumPrin iple(PMP)tosolvesome ontrolproblems. We show that an approximation of the value fun tion omputed by the HJB methodonroughgrids anbeusedtoobtainagoodinitialguess forthePMP method. The advantage of our approa h over other initialization te hniques (su h as ontinuationordire t methods)is to provideaninitial guess loseto theglobalminimum. Numeri altestsinvolvingmultipleminima,dis ontinuous ontrol,singularar sand state onstraintsare onsidered. TheCPU timefor theproposed method is lessthan four minutesupto dimensionfour, without odeparallelization.
Mots- lés : optimal ontrolproblem, minimum timeproblem, Pontryagin's minimumprin iple
1 Introdu tion
The Hamilton-Ja obi-Bellman (HJB) theory and the Pontryagin's Minimum Prin iple(PMP)areusually onsideredtwoseparateworldsalthoughtheydeal with thesamekindofproblems. Thetheoreti al onne tionsbetweenthetwo approa hesarewellknown[11,7,8,9℄,but oupledusageofthetwote hniques isnot ommonandnot ompletelyexplored.
Inthispaperwewilldealwiththefollowing ontrolleddynami s
˙y(t) = f (y(t), u(t)),
t > 0
y(0) = x,
x ∈ R
d
(1)where the ontrol variable
u(·) ∈ U := {u : R
+
→ U, u
measurable
}
andU ⊂ R
m
(
m ≥ 1
). We will denote byy
x
(t; u)
the solution of the system (1) startingfromthepointx
with ontrolu
. LetC ⊂ R
d
beagiventarget. Forany given ontrol
u
wedenote byt
f
(x, u)
thersttime thetraje toryy
x
(t; u)
hitsC
(wesett
f
(x, u) = +∞
ifthetraje toryneverhitsthetarget). Wealsodene a ostfun tionalJ
asJ(x, u) :=
Z
t
f
(x,u)
0
ℓ(y
x
(t; u), u(t))dt.
(2)Thenalgoalisto nd
u
∗
∈ U
su hthatJ(x, u
∗
) = min
u∈U
J(x, u)
(3) andthento omputetheasso iatedoptimaltraje toryy
∗
x
(t; u
∗
)
. Wealsodene thevaluefun tionT (x) := J(x, u
∗
) ,
x ∈ R
d
.
Choosing
ℓ ≡ 1
in (2)weobtainthe lassi alminimum timeproblem.TheHJBapproa hisbasedontheDynami ProgrammingPrin iple [3℄. It onsists in hara terizingthe valuefun tion asso iated tothe ontrol problem by meansof arst-order non-linearpartial dierential equation. On e an ap-proximationof the valuefun tion is omputed, we aneasily re onstru t the optimal ontrol
u
∗
in feedba k form and, by a dire t integration, the optimal traje toriesfor any startingpoint
x
. The method is greatlyadvantageous be- auseitis abletorea h theglobalminimumofthe ostfun tional, evenifthe problemisnot onvex.TheHJBapproa hallowsalsotohaveaglobaloverview ofthesetoftheoptimaltraje toriesandoftherea hableset(or apturebasin) i.e. theset ofthepointsfromwhi h itispossibletorea hthetargetinagiven time.Beside all the advantages listed above, the HJB approa h suers the well known urseofdimensionality,soingeneralitisrestri tedtoproblemsinlow dimension(
d ≤ 3
).The PMP approa h onsists in nding traje tories that satisfy the ne es-sary onditions stated by Pontryagin's Minimum Prin iple. This is done in pra ti e by sear hing a zero of a ertain shooting fun tion, typi ally with a (quasi-)Newton method. This method is wellknown andis used in many ap-pli ations,see[21,23,12℄ andreferen estherein. Themain advantagesof this approa hliein itsa ura yand itsnumeri al omplexity. It isworthto re all
thatthedimension ofthenonlinearsystemfortheshootingmethod isusually
2d
, whered
is the statedimension. This isin pra ti equitelowfor this kind ofproblem,thereforefast onvergen eisexpe tedin aseofsu ess,espe ially iftheinitial guessis lose totherightvalue. Unfortunately, ndingasuitable initial guess an be extremely di ult in pra ti e. Thealgorithm may either not onvergeatall,or onvergetoalo alminimumofthe ostfun tional.Inthispaperwe ouplethetwomethodsinsu hawaywe anpreservethe respe tiveadvantages. TheideaistosolvetheproblemviatheHJBmethodon a oarsegridto havein short timearst approximationofthe valuefun tion and thestru ture ofthe optimaltraje tory. Then, weusethis information to initializethePMPmethod and omputeapre ise approximationoftheglobal minimum. Toourknowledgethisistherstattempttoexploit the onne tion betweentheHJBandPMPtheoriesfrom thenumeri alpointofview.
Comparedtotheuseof ontinuationte hniquesordire tmethodstoobtain anestimateoftheinitial ostate,themainadvantageoftheapproa hpresented hereisthat theHJBmethodprovidesaninitial guess loseto theglobal min-imum. Themain limitationis therestri tionwith respe tto thedimensionof thestate.
We onsidersomeknown ontrolproblemswithdierentspe i di ulties: severallo alminima,dis ontinuous ontrol,presen eofsingularar s,andstate onstraints. Inalltheseproblems, weshowthat ombiningPMPmethodwith HJBapproa hleadstoaverye ientalgorithm.
2 Preliminaries
Consideroptimal ontrolproblemsinthegeneralBolzaform,autonomous ase, withaxedorfreenal time.
(P )
min J(x, u) =
R
t
f
(x,u)
0
ℓ(y(t), u(t)) dt
Obje tive˙y(t) = f (y(t), u(t))
Dynami su(t) ∈ U
fora.e.t ∈ (0, t
f
(x, u))
AdmissibleControlsy(0) = x
Initial Conditionsy(t
f
(x, u)) ∈ C
TerminalConditionsHere
U
is a ompa t set ofR
m
and the following lassi al assumptions are satised:
-
f : R
d
× U → R
d
and
ℓ : R
d
× U → R
d
are ontinuous,andareof lass
C
1
withrespe tto therstvariable. -
C
isa losedsubsetofR
d
forwhi hthepropertyave torisnormalto
C
atapointofC
makessense. Forinstan e,C
anbedes ribedbyanite set of equalities{c
i
(x) = 0}
i
or inequalities{c
i
(x) ≤ 0}
i
, with thec
′
i
s
beingof lass
C
1
andthe lassi al onstraintquali ationassumptions.
2.1 Pontryagin's Minimum Prin iple approa h
We give here a brief overview of the so alled indire t methods for optimal ontrol problems[24, 6,22℄. Weintrodu ethe ostate
p
, ofsamedimensiond
asthestate
x
,and denetheHamiltonianH(y, p, u, p
0
) = p
0
ℓ(y, u)+ < p, f (y, u) > .
Undertheassumptionson
f
andℓ
introdu edabove,thePontryagin's Min-imum Prin iple statesthat if(y
∗
x
, u
∗
, t
∗
f
)
is asolution of (P
) thenthere exists(p
0
, p
∗
) 6= 0
absolutely ontinuoussu hthat˙y
∗
(t) = H
p
(y
x
∗
(t), p
∗
(t), u
∗
(t), p
0
),
y
x
∗
(0) = x,
(4a)˙p
∗
(t) = −H
x
(y
x
∗
(t), p
∗
(t), u
∗
(t), p
0
),
(4b)p
∗
(t
∗
f
) ⊥ T
C
(y
∗
x
(t
∗
f
)),
(4 )u
∗
(t) = arg min
v∈U
H(y
∗
x
(t), p
∗
(t), v, p
0
)
fora.e.t ∈ [0, t
∗
f
],
(4d)where
T
C
(ξ)
denotesthe ontingent oneofC
atξ
. Moreover,ifthenal timet
∗
f
isnotxedandisanoptimaltime,thenwehavetheadditional ondition:H(y
∗
x
(t), p
∗
(t), u
∗
(t), p
0
) = 0,
fort ∈ (0, t
∗
f
).
(5)Two ommon asesare
C = {y
f
}
withp
∗
(t
∗
f
)
free,andC = R
d
withp(t
∗
f
) = 0.
Nowweassumethat minimizingtheHamiltonian providesthe ontrol asa fun tion
γ
ofthestateand ostate. Foragivenvalueofp(0)
,we anintegrate(x, p)
byusingthe ontrolu = γ(x, p)
on[0, t
f
]
. Wedenetheshootingfun tionS
that maps the unknownsp(0)
to the value of the nal and transversality onditions at(x(t
f
), p(t
f
))
. Finding azeroofS
givesatraje tory(x, u)
that satisesthene essary onditionsfortheproblem(P
). Thisistypi allydonein pra ti ebyapplyinga(quasi-)Newtonmethod.Remark 2.1 The multiplier
p
0
ould be equal to0
. In that ase, the PMP is saidanormal,itssolution(y
∗
, u
∗
, p
∗
)
orrespondstoasingular extremalwhi h doesnot dependon the ostfun tion
ℓ
. Several workshave been devotedtothe existen e (or nonexisten e) of su h extremal urves [4 , 10 ℄. For numeri s, in generalweassumethatp
0
6= 0
whi hleadstosolvethePMPsystemwithp
0
= 1
. In thesequel,wewill alwaysassume thatweareinthe normal ase(p
0
= 1
). Singular ar s. Asingularar o urs whenminimizingtheHamiltonianfails to determine the optimal ontrolu
∗
on a whole time interval. The typi al ontextiswhen
H
islinearwithrespe ttou
,withanadmissiblesetof ontrols of the formU = [u
low
, u
up
]
. In this parti ular ase, thefun tion(x, u, p) 7−→
H
u
(x, u, p)
does notdepend onthe ontrol variable. We dene the swit hing fun tionψ(x, p) = H
u
(x, u, p)
andhavethefollowingbang-bang ontrollaw:
ifψ(x, p) > 0
thenu
∗
= u
low
ifψ(x, p) < 0
thenu
∗
= u
up
if
ψ(x, p) = 0
thenswit hingorsingular ontrol.
Asingularar then orrespondstoatimeintervalwheretheswit hingfun tion
ψ
iszero. Theusualwaytoobtainthesingular ontrolistodierentiateψ
with re-spe ttot
untilthe ontrolexpli itlyappears,whi hleadstosolvinganequation oftheformψ
(2k)
(x, p) = 0
depending onthe problem. Moreover,it isalso requiredto makeassumptions aboutthe ontrol stru ture, morepre isely to xthenumber ofsingularar s. Ea h expe ted singular ar adds two shooting unknowns
(t
entry
, t
exit
)
, with the orresponding jun tion onditionsψ(t
entry
) = ˙
ψ(t
entry
) = 0
oralternatelyψ(t
entry
) = ψ(t
exit
) = 0
. The problem studied in se tion 4.3 presentssu h a singularar .State onstraints. We onsiderastatevariableinequality onstraint
g(x(t)) ≤ 0
. Wedenotebyq
thesmallestordersu hthatg
(q)
dependsexpli itlyonthe on-trol
u
;q
is alledtheorderofthe onstraintg
. TheHamiltonianisdenedwith anadditionaltermforthe onstraintH(x, p, u) = ℓ(x, u)+ < p, f (x, u) > +µg
(q)
(x, u)
withthesign ondition
µ = 0
ifg < 0
µ ≥ 0
ifg = 0.
Whenthe onstraintis ina tiveweare in thesamesituation asfor an un on-strainedproblem. Overa onstrainedar where
g(x) = 0
,weobtainthe ontrol from theequationg
(q)
(x, u) = 0
,and
µ
from theequationH
u
= 0
. Asin the singularar ase, weneed to makeassumptions on erning the ontrol stru -ture, namely the number of onstrained ar s. Ea h expe ted onstrained ar addstwoshooting unknowns(t
entry
, t
exit
)
with theHamiltonian ontinuityas orresponding onditions. Wealsohavetheso alledtangen y onditionatthe entrypointN (x(t
entry
)) = (g(x(t
entry
)), . . . , g
(q−1)
(x(t
entry
))) = 0,
withthe ostatedis ontinuity
p(t
+
entry
) = p(t
−
entry
) − πN
x
|t
entry
where
π ∈ R
q
isanothermultiplieryieldingtoanadditionalshootingunknown. Remark2.2 Thetangen y ondition an also be enfor edat the exit time, in this asethe ostate jump o ursatthe exit timeaswell.
2.2 Hamilton-Ja obi-Bellmanapproa h Considerthe value fun tion
T : R
d
→ R
, whi h maps everyinitial ondition
x ∈ R
d
to the minimal value of the problem (
P
). It is well known (see for example[1℄fora omprehensiveintrodu tion)thatthevaluefun tionT
satises aDynami ProgrammingPrin ipleandtheKruºkovtransformofT
,denedbyv(x) := 1 − e
−T (x)
istheuniquesolutionofthefollowingHJBequation,invis ositysense [1℄:
(
v(x) + sup
u∈U
{−f (x, u) · Dv(x) − ℓ(x, u) + (ℓ(x, u) − 1)v(x))} = 0 x ∈ R
d
\C
v(x) = 0
x ∈ C.
(6)Obtaininganumeri alapproximationofthefun tion
v
isadi ulttaskmainly be ausev
is not always dierentiable. Several numeri al s hemes have been studiedintheliterature. Inthispaperwewillusearst-ordersemi-Lagrangian (SL) s heme, werefer to [13, 14℄ for asurveyon these kindof s hemes. This hoi e ismotivated bythefa t that SLs heme seemsthebest onein orderto approximate the gradientof thevalue fun tion,this beingour goalaswewill see in the nextse tion. Wex a(numeri al)bounded domainΩ ⊃ C
and we introdu e in it aregular gridG = {x
i
, i = 1, . . . , N
G
}
whereN
G
is the total numberofnodes. Wedenotebyv(x; h, k, Ω)
e
thefullydis reteapproximationofv
,h
andk
beingtwodis retizationparameters(therstone anbeinterpreted asatimesteptointegratealong hara teristi sandthese ondoneistheusual spa e step). We impose state onstraint boundary onditions on∂Ω
. The dis reteversionof(6)ise
v(x
i
) = e
H[ev](x
i
)
x
i
∈ (Ω\C) ∩ G
e
v(x
i
) = 0
x ∈ C ∩ G
(7) wheree
H[ev](x
i
) := min
u∈U
{P
1
e
v; x
i
+ hf (x
i
, u)
+ hℓ(x
i
, u)(1 − ev(x
i
))}
(8) andP
1
v; x
e
i
+ hf (x
i
, u)
denotes the value of
e
v
at the pointx
i
+ hf (x
i
, u)
obtained bylinear interpolation using the known valuesofev
onG
(note that the pointx
i
+ hf (x
i
, u)
is not in general sittingon the grid). The numeri al s heme onsistsin iteratinge
v
(n+1)
= e
H[ev
(n)
]
n = 1, 2, . . .
(9)until onvergen e,startingfrom
ev
(0)
(x
i
) = 0
onC
and1elsewhere. Toa elerate the onvergen e we use the Fast Sweeping te hnique [27℄. The fun tionev
is then extendedto thewhole spa ebylinearinterpolation. On e thefun tione
v
is omputed,wegeteasilythe orrespondingapproximationT
e
ofT
,and then theoptimal ontrollawinfeedba kform,see[13,14℄fordetails.Itis usefulto notethat theequation (6) analso model afront (interfa e) propagationproblem. Followingthisinterpretation,theboundaryofthetarget
∂C
isthefrontat initialtimet = 0
,andthelevelset{x : T (x) = t}
represents thefrontat anytimet > 0
.3 Coupling HJB and PMP 3.1 Main onne tion
It is known [7℄ that for a general ontrol problem with free end-point, if the value fun tion is dierentiable at somepoint
x ∈ R
d
then it is dierentiable along theoptimal traje tory startingat
x
. A tually, the gradient of the value fun tion isequal tothe ostateof the Pontryagin'sprin iple.Inthe ontextofminimumtimeproblems(withtarget onstraint),thelink be-tweentheminimumtimefun tion andthePontryagin'sprin iplehasbeenalso investigatedinseveralpapers[9,8℄, provingthesame onne tion.
On e the valuefun tion
T
is omputed by solving theHJB equation, we ap-proximateDT (x)
(x
being the starting point) by standard rst-order nite dieren es,andthenweuseitasinitialguessforp(0)
.Inthe ase
T /
∈ C
1
(R
d
)
itis provedin [8℄ that a onne tionbetweenthe two approa hesstillexists. Morepre isely, undersomeadditionalassumptions,we have
p
∗
(t) ∈ D
+
T (y
∗
x
(t))
fort ∈ [0, T (x)],
whereD
+
T (x)
isthesuperdierential of
T
atx
dened byD
+
T (x) :=
η ∈ R
d
: lim sup
y→x
T (y) − T (x) − η · (y − x)
|y − x|
≤ 0
.
(10)Intherestof thisse tionweassumethat
D
+
T (x) 6= ∅
. Itisplainthat we an notusenite dieren e approximationin order to ompute
p(0)
at thepoints wherethevaluefun tionT
isnotdierentiable. Ratherthanthat,wewilltryto approximatethedire tionξ
∗
whi hisorthogonaltothelevelsetsof
T
,pointing towardthedire tion of maximal de rease. This dire tion,in the ase whenT
isdierentiable,isgivenby:ξ
∗
= −DT (x).
(11)Here,we omputeanapproximationof
ξ
∗
as:ξ
∗
= arg min
ξ∈B(0,1)
T (x + δξ) − T (x)
δ
,
(12)where
δ > 0
is asmall positiveparameter, andB(0, 1)
denotes the ballinR
d
enteredat
0
withradius1
.Letusexplainonasimpleexamplewhywe hoosethedenition(12). Con-siderthe ase
C = {(3, 0)} ∪ {(−3, 0)}
,ℓ ≡ 1
,f = u
andU = B(0, 1)
(eikonal equation). On the line{x = 0}
the fun tionT
is not dierentiable (see Fig. 1-left). This line orresponds to a zone where two globally optimaltraje to-−5
0
5
−5
−4
−3
−2
−1
0
1
2
3
4
5
−5
−4
−3
−2
−1
0
1
2
3
4
5
−5
−4
−3
−2
−1
0
1
2
3
4
5
Figure1: two rossingfrontswithandwithoutsuperimposition. Arrows orre-spondto the(two)ve tor(s)
ξ
∗
se tion 2.2)herewehavetwofrontswhi h hit ea h other at theline
{x = 0}
. Thevis ositysolutionoftheHJBequationsele tsautomati allytherstarrival time so weneversee the two rossing fronts, but we ould in prin iple follow the propagations of the two frontsseparately (see Fig. 1-right). Considering thetwofrontsseparately,bymeansof(12),we aneasilyapproximatethetwo dire tionsξ
∗
1
andξ
∗
2
ofmaximal de reaseofthefun tionT
(and thenthetwo gradients−ξ
∗
1
and−ξ
∗
2
ofT
)usingonlythevaluefun tionT
.Inthepresentexample,fo usingonthepoint
(0, 0)
,weeasily omputethe twodire tionsofmaximalde reaseas(−1, 0)
and(1, 0)
. Itiseasytoshowthat these twove tors oin idewiththetwoextremal ve torsinD
+
T (x)
, namely theve tors
η
verifyinglim sup
y→x
T (y) − T (x) − η · (y − x)
|y − x|
= 0.
(13)Althoughthisrelationshipisnottrueforeveryfun tion
T
su hthatD
+
T (x) 6=
∅
,it iseasy to seethat it istruewheneverthe urveof non-dierentiability is dueto the ollisionoftwoormorefronts(asin Problem1,Se tion 4.1).Inthis paper, weproposeto investigatenumeri ally therelevan e of using the HJB approa h to ompute
−ξ
∗
and then using it as initial guess for the initial ostate
p(0)
intheshootingmethod.3.2 Convergen e of
DT
Many papers(see for example[2, 26℄ in the ontext of dierential games) in-vestigated the onvergen e of the approximate value fun tion
e
v(· ; h, k, Ω)
to theexa tsolutionv
whentheparametersh, k
tendto zeroandΩ
tendstoR
d
. These resultswerequite di ultto beobtainedbe ausethe fun tion
v
is not in generaldierentiable.Letusdenoteby
D = ( e
e
D
1
, . . . , e
D
d
)
thedis retegradient omputedby entered nitedieren eswithstepz > 0
e
D
i
T (x) :=
T (x + ze
i
) − T (x − ze
i
)
2z
,
i = 1, . . . , d
where
{e
i
}
i=1,...,d
isthestandardbasisofR
d
.
Toourpurposeswehavetogofurtherprovingthe onvergen eof
T (·; h, k, Ω) =
e
− ln(1 − ev(· ; h, k, Ω))
and then the onvergen e ofD e
e
T (· ; h, k, Ω)
be ause the latterwillbeusedbythePMPmethodasinitialguess.Letusassumethat
k = C
1
h
forsome onstantC
1
. Givenageneri estimateof theformkev(· ; h, R
d
) − v(·)k
L
∞
(R
d
)
≤ Ch
α
,
C, α > 0
(14) wehavethefollowingTheorem3.1 Assumethat
T ∈ C
1
(Ω)
andthere exists
T
max
> 0
su hthat0 ≤ T (x) ≤ T
max
for allx ∈ Ω.
LetusdeneE(x) := k e
D e
T (x; h, Ω) − DT (x)k
∞
.
Then thereexists
Ω
′
⊂ Ω
su hthat
kE(·)k
L
∞
(Ω
′
For theSLs hemeweuse here,anestimate oftheform (14)in theparti ular ase
ℓ ≡ 1
(underassumptionsweakerthanthoseused inTheorem3.1) anbe foundin[26℄. TheproofofthetheoremispostponedintheAppendix.4 Numeri al experiments
Wehave tested the feasibilityand relevan e of ombining the HJBand PMP methodsonfouroptimal ontrolproblems. Ea hoftheseproblemshighlightsa parti ulardi ultyfromthe ontrolpointofview.
Problem 1 (se tion 4.1) is a simple minimum time target problem in di-mension twopresentinglo al and global minima. We will see in this example thattheshootingmethod isverysensitivewithrespe tto theinitial guess(as usual). WheninitializedbyusingtheHJBapproa h,shootingmethodre overs theoptimalsolution.
Problem2(se tion4.2)isa ontrolledVanderPolos illator,alsoof dimen-siontwo,with ontrolswit hings.
Problem 3 (se tion4.3) is the well-known Goddard problem with singular ar s,intheone-dimensional ase(totalstatedimensionisthree).
Problem4(se tion4.4)is anothersimpleminimumtimetargetproblem in dimensionfour, witharst-orderstate onstraint.
Details for HJB implementation. The algorithmis written in C++ and it runs on a PC with an Intel Core 2 Duo pro essor at 2.00 GHz and 4GB RAM. Note that the ode is notparallelized. Theindi ated CPU time isthe timeneededforthe omputationofthevaluefun tionandsavingtheresulton le. Thetimeneededtore onstru ttheoptimaltraje toryisnot onsidered(is almost0).
The numeri al domain
Ω
is dis retized by a regular grid withN
1
× . . . × N
d
nodes. Thesetofadmissible ontrolsU
isdis retizedinN
C
equispa eddis rete ontrolsu
1
, . . . , u
N
C
. The stop riterion for the xed point iterations (9) iskev
(n+1)
− ev
(n)
k
L
∞
(Ω)
< ε = 1e − 5
.DetailsforPMPimplementation. TheshootingmethodiswritteninF or-tran90andrunsonaPCwithanIntelCore2Duopro essorat2.33GHzand 2GBRAM.WeusedtheShoot
1
softwarewhi himplementsashootingmethod withtheHybrd[19℄ solver. Forthefour problemsstudiedwesettheODE in-tegrationmethodtoabasi 4th-orderRunge-Kuttawith100steps.
4.1 Minimum time target problem
Therstexampleillustrateshowalo alsolution anae ttheshootingmethod. We onsiderasimpleminimumtimeproblemwherewewanttorea ha ertain position on the plane by ontrolling the angle of the speed. We hoose the velo ityinorderto reatemultipleminimaofthe ostfun tional.
1
(P
1
)
min t
f
˙y
1
(t) = c(y
1
(t), y
2
(t)) cos(u(t))
˙y
2
(t) = c(y
1
(t), y
2
(t)) sin(u(t))
u(t) ∈ [0, 2π)
fora.e.t ∈ (0, t
f
)
y(0) = x = (−2.5, 0)
y(t
f
) = (3, 0)
withc(y
1
, y
2
) =
1
ify
2
≤ 1
(y
2
− 1)
2
+ 1
ify
2
> 1.
Dueto theexpressionof
c
,wehaveat leasttwominima. Thesimplestone orrespondstoastraightlinetraje tory(−
)alongthey
1
axiswithy
2
= 0
. The otheronehasa urvedtraje tory(∩
)thattakesadvantageofthelargervalues ofy
2
.4.1.1 PMPand shooting method
We rst try to solve the problem with the PMP and the shooting method. Thereforeweseekazerooftheshootingfun tiondened by
S
1
:
t
f
p
1
(0)
p
2
(0)
7→
y
1
(t
f
) − 3
y
2
(t
f
)
p
3
(t
f
) − 1
.
Globaland lo alsolutions. Dependingonthestartingpoint,theshooting method an onvergeto alo alorglobalsolution(Fig. 2). Themore ommon lo al solution is the straight line traje tory from
x
toC := {(3, 0)}
, with a onstant ontrolu = 0
andanaltimeT
local
= 5.5
. Theglobalsolutionhasan ar hshapedtraje torythatbenetsfromthehigherspeedforin reasingvalues ofy
2
,withanaltimet
∗
f
= T
global
= 4.868
.4
5
6
7
8
CONTROL
X1
X2
−5
0
5
STATE
0
5
10
3
4
5
6
−2
−1
0
1
COSTATE
−1
0
1
0
0.5
1
5
5.5
6
6.5
7
7.5
CONTROL
X1
X2
−5
0
5
STATE
−0.1
0
0.1
4
5
6
7
−2
−1
0
COSTATE
−1
0
1
0
0.5
1
Figure2:
(P
1
)
-Globalsolution( urvedtraje tory)andlo alsolution(straight traje tory)foundbytheshootingmethod.Sensitiveness with respe t to the starting point. Even for this simple problem,theshootingmethodisverysensitivetothestartingpoint. Numeri al tests indi ate that it onverges in most ases to lo al solutions. We run the shooting method with abat h of 441valuesof
p(0) ∈ [−10, 10]
2
on a
21 × 21
grid,withdierentstartingguessesforthenaltime(Fig. 3). Weobservethat forthe bat h with thet
f
= 1
initialization, 11%of the shootings onvergeto theglobalsolution,60%to thestraightlinelo alsolution,and24%toanother lo al solution with an even worse nal time (t
f
= 6.06
). For the bat h with thet
f
= 10
initialization, 9%of theshootings onvergeto theglobal solution, and50%and29%to thetwolo alsolutions. Obviously,just taking arandom startingpointisnotareliablewaytondtheglobalsolution.−10
−5
0
5
10
−10
−8
−6
−4
−2
0
2
4
6
8
10
P
1
(0)
P
2
(0)
CONVERGENCE STUDY FOR P(0)
∈
[−10,10]
2
AND T = 1
GLOBAL SOLUTION (T=4.868): 49 [11%]
LOCAL SOLUTION (T=5.5): 265 [60%]
LOCAL SOLUTION (T=6.06): 107 [24%]
−10
−5
0
5
10
−10
−8
−6
−4
−2
0
2
4
6
8
10
P
1
(0)
P
2
(0)
CONVERGENCE STUDY FOR P(0)
∈
[−10,10]
2
AND T = 10
GLOBAL SOLUTION (T=4.868): 38 [9%]
LOCAL SOLUTION (T=5.5): 222 [50%]
LOCAL SOLUTION (T=6.06): 127 [29%]
Figure3:
(P
1
)
-Convergen etotheglobalsolutionfromarandominitialization ishazardousdue tothepresen eof alo alsolution.4.1.2 Solving the problemwith the HJBapproa h
InFig. 4,weshowthelevelsetsoftheminimumtimefun tion
T
asso iatedto the ontrolproblem(P
1
)
. These levelsetsare obtainedbysolvingnumeri ally theHJBequation. Asit anbeeasilyseeninFig. 4,theminimumtimefun tion isnotdierentiableeverywhere. The urveof thedis ontinuityofthegradient representsherethesetoftheinitialpointsasso iatedtotwooptimaltraje tories. 4.1.3 Coupling the HJBand PMPapproa hesWenowusethedataprovidedbytheHJBapproa hto obtainastartingpoint losetotheglobalsolution. TheHJBsolutionprovidesanestimateofthenal time, and alsoan approximationof the ostate
p(0)
by omputing adire tion ofmaximalde reaseoftheminimumtimefun tionaty(0) = x
. InTable1,we summarizethe resultsobtainedby solvingtheHJBequation on several grids, andgivetheobtainedminimaltimetorea hthetargetstartingfromthepositionx = (−2.5, 0)
. Aswe ansee, evenona oarsegrid(25 × 25
nodes),weobtain a good approximation ofp(0)
in a very short time (the CPU times in Table 1in lude the numeri al resolution of the HJB equation and the omputation ofp(0)
). As weexpe ted, theshooting method immediately onvergesto the global solutionwhen using the startingpointobtained from the HJBmethod (Table2).SL−FS
−6
−4
−2
0
2
4
6
−6
−4
−2
0
2
4
6
Figure 4:
(P
1
)
- Levelsets of theminimum time fun tionT
, theoptimal tra-je tory starting from(−2.5, 0)
and the twooptimal traje tories startingfrom(−1.835, 0)
.nodes
N
C
−ξ
∗
t
∗
f
CPUtime(se )25 × 25
16 (-0.049,-1.000) 4.895 0.0850 × 50
16 (-0.048,-1.000) 4.895 0.37200 × 200
32 (-0.051,-1.000) 4.878 20.25Table 1:
(P
1
)
- HJB approa h: the optimal minimal time starting fromx =
(−2.5, 0)
, and the approximation−ξ
∗
of the initial ostate asso iated to the optimaltraje tory InitializationfromHJB
t
∗
f
= 4.89
−ξ
∗
= (−0.05, −1)
SolutionbyPMPt
∗
f
= 4.868
p(0) = (−5.552 × 10
−2
, −9.985 × 10
−1
)
Table2:
(P
1
)
-Initializationfrom HJBandsolutionfromPMP.We an he kthatthe onvergen eoftheshootingmethodismu hbetterin aneighbourhoodoftheHJBinitialization. Comparedtothepreviousgridwith
p(0) ∈ [−10, 10]
2
, we test initial points with
p(0) ∈ [−0.1, 0] × [−2, 0]
, whi h orrespondstoa100%rangearoundtheHJBinitialization−ξ
∗
= (−0.05, −1)
; wealsoset
t
f
= 4.89
. This timetheshootingmethod ndstheglobalsolution for76%
ofthepoints,andonly12%
and9%
forthelo alsolutions(Fig. 5).InTable3 (see also Fig. 4), we onsider the ase of astarting point very lose to the urve where the minimal time fun tion is notdierentiable:
x =
(−1.835, 0)
. Here the omputation ofp(0)
by HJB gives the two dire tionsp(0) = (−0.05, −1.00)
andp(0) = (−0.99, 0.00)
. Using these two values to initialize the shooting method, we obtain the two distin t solutions with the apandstraighttraje tories(Table4). Forthisproblem,thestartingpoints where the minimal time fun tion is not dierentiable orrespond to the ase−0.1
−0.08
−0.06
−0.04
−0.02
0
−2
−1.8
−1.6
−1.4
−1.2
−1
−0.8
−0.6
−0.4
−0.2
0
P
1
(0)
P
2
(0)
CONVERGENCE STUDY NEAR HJB INITIALIZATION
GLOBAL SOLUTION (T=4.868): 337 [76%]
LOCAL SOLUTION (T=5.5): 53 [12%]
LOCAL SOLUTION (T=6.06): 39 [9%]
Figure5:
(P
1
)
-Convergen etotheglobalsolutionismu heasierneartheHJB initialization.wherethelo al (straight)solutionbe omesglobaland hasthesameminimal timeastheglobal( ap)solution. Noti ethatheretheminimaltimefun tion remainsdierentiablealongea htraje tory. Wewillseeinse tion4.2adierent aseofnondierentiabilityforthevaluefun tion.
nodes
N
C
−ξ
∗
t
∗
f
CPUtime(se )300 × 300
32 (-0.05,-1.00)and(-0.99,0.00) 4.84 39.98Table3:
(P
1
)
-HJBapproa hforaninitial positionx = (−1.835, 0)
.t
∗
f
p(0)
HJB4.84
(−0.05, −1)
and(−0.99, 0)
PMP(∩)
4.8246
(−7.67 × 10
−2
, −9.97 × 10
−1
)
PMP(−)
4.835
(−1, −6.2137 × 10
−16
)
Table 4:
(P
1
)
- Lo al solution be omes global for astarting point where the minimaltimefun tionisnotdierentiable.4.2 Van der Pol os illator
These ondtestproblemisa ontrolledVanderPolos illator. Herewewantto rea hthesteadystate
(y
1
, y
2
) = (0, 0)
inminimumtime. Itiswellknownthat theoptimaltraje tories, forthis problem, areasso iated tobang-bang ontrolvariables.
(P
2
)
min t
f
˙y
1
(t) = y
2
(t)
˙y
2
(t) = −y
1
(t) + y
2
(t)(1 − y
1
(t)
2
) + u(t)
u(t) ∈ [−1, 1]
y(0) = x = (1, −0.8)
y(t
f
) = (0, 0)
4.2.1 PMPand shooting method
Here,theHamiltonianislinearwithrespe tto
u
,thereforewehaveabang-bang ontrolwiththeswit hingfun tionψ(x, p) = H
u
(x, p, u) = p
2
.Theshooting fun tionisdened by
S
2
:
t
f
p
1
(0)
p
2
(0)
7→
y
1
(t
f
)
y
2
(t
f
)
p
3
(t
f
) − 1
.
Wetest theshootingmethod withthesameinitialpointsasforproblem
(P
1
)
. The onvergen eresultsareevenworseinthis ase: forthet
f
= 1
initialization, only9%oftheshootings onvergetotheglobalsolution,and0.5%forthet
f
= 10
initialization.4.2.2 Solvingthe problem with the HJBapproa h
Here weusetheHJBapproa hto omputetheminimaltimefun tion. InFig. 6,weshowthenumeri alsolutionobtainedby arryingout omputationsona
200 × 200
gridandN
C
= 2
.SL−FS
−2
−1.5
−1
−0.5
0
0.5
1
1.5
2
−2
−1.5
−1
−0.5
0
0.5
1
1.5
2
Figure 6:
(P
2
)
- Level sets of fun tionT
and the optimal traje tory starting from(1, −0.8)
T
.
4.2.3 Couplingthe HJBand PMPapproa hes
Thenumeri alsolutionoftheHJBequationprovidessomeusefuldata,namely an approximation of the nal time
t
f
and aninitial ostatep(0)
. This infor-mation is used here to start the shooting algorithm. On e again, the HJBinitializationgivesanimmediatea urate onvergen eto theoptimalsolution, seeTable5and Fig. 7. Inthisexample,the ontroldis ontinuitieshinderthe
InitializationfromHJB
t
∗
f
= 4.2
−ξ
∗
= (1.2, −4.2)
SolutionfromPMPt
∗
f
= 3.837
p(0) = (1.249, −3.787)
Table5:
(P
2
)
-InitializationfromHJBandsolutionfrom PMP.onvergen e by testing dierent integration s hemes for the stateand ostate pair
(x, p)
. Usingaxedstepintegrator(4th orderRunge-Kutta)withoutany pre autionsgivesaverypoor onvergen ewithanormof≈ 10
−3
forthe shoot-ing fun tion. Using either a variable step integrator (Dopri, see [17℄) or a swit hingdete tionmethod forthexedstepintegrator(see[15℄)wegetmu h betterresults(
≈ 10
−11
fortheshootingfun tion norm).
−2
−1.5
−1
−0.5
0
0.5
1
1.5
2
−2
−1.5
−1
−0.5
0
0.5
1
1.5
2
TRAJECTORY
0
0.2
0.4
0.6
0.8
1
−1
0
1
CONTROL
0
0.5
1
−1
−0.5
0
0.5
1
STATE
0
0.5
1
−2
−1
0
1
2
COSTATE
0
0.5
1
−1
−0.5
0
0.5
1
0
0.5
1
−4
−2
0
2
Figure7:
(P
2
)
-Solutionwithoneswit hfortheVanderPolos illator(shooting method).Wenowtesttwostartingpointsforwhi htheminimaltimefun tion isnot dierentiable. Inthepreviousproblemthenondierentiabilitywas ausedbya lo alsolution be oming global(followingthefrontpropagation interpretation, twofrontsarehitting). Herethenondierentiabilityhasadierentnature. It annotbeseenasthe urveof ollisionbetweenfronts,and orrespondstothe pointswhere the ontrolswit hesbetween
−1
and+1
. Taking su h astarting pointwehaveasolutionwitha onstant ontrolu = ±1
and noswit hes. We testthe twostarting pointsx = (1.5, −0.67)
andx = (1, −0.57)
that are lose tothenondierentiable urve(seeFig. 6). Computationofξ
∗
is performedas beforeinthe ase
T
isnotdierentiable. Weobservethattheshootingmethod nds solutionswith a swit h immediately after the initial time or just before thenaltime. Here theHJBinitializationis notas loseto theinitial ostatep(0)
,butissu ienttoobtain onvergen e. Also,theminimumtimesgivenby HJBarestill losetotheexa tones (Table6).4.3 Goddard problem
The third exampleis the well-known Goddard problem (see for instan e [16, 18,20, 28,25, 5℄), toillustrate the aseofsingularar s. Thisproblem models theas ent ofa ro ket through theatmosphere, and werestri t hereourselves toverti al(monodimensional)traje tories. Thestatevariablesarethealtitude, speedandmassofthero ketduring theight,foratotaldimensionof3. The
x
methodp(0)
t
f
(1.5, −0.67)
HJB(1.62, −0.87)
2.96
PMP(1.487, 2.309 × 10
−3
)
2.9594
(1, −0.57)
HJB(1.96, −0.10)
2.2
PMP(1.715, 1.111 × 10
−2
)
2.1351
Table6:
(P
2
)
- Solutions with no swit hesfor starting point where the value fun tion isnotdierentiable.ro ketis subje tto gravity,thrustanddrag for es. Thenal timeisfree, and theobje tiveisto rea ha ertainaltitudewithaminimalfuel onsumption.
(P
3
)
min J(u) =
R
t
f
0
bT
max
u
˙r = v
˙v = −
r
1
2
+
1
m
(T
max
u − D(r, v))
˙
m = −bT
max
u
u(t) ∈ [0, 1]
r(0) = 1, v(0) = 0, m(0) = 1,
r(t
f
) ≥ 1.01
with the parameters used for instan e in [20℄:
b = 7
,T
max
= 3.5
and dragD(r, v) = 310v
2
e
−500(r−1)
.
4.3.1 PMPand shooting method
As for
(P
2
)
, the Hamiltonianis linearwith respe ttou
, andwehavea bang-bang ontrol withpossibleswit hingsorsingularar s. Theswit hingfun tion isψ(x, p) = H
u
(x, p, u) = T
max
((1 − p
m
)b +
p
v
m
)
,andthesingular ontrol anbe obtainedby formallysolvingψ = 0
¨
. Themain di ulty, however,is to deter-minethestru tureoftheoptimal ontrol,namelythenumberandapproximate lo ation of singularar s. The HJBapproa h isable to provide su h informa-tion,in additionto theinitial ostatep(0)
. Assumingforinstan eone interior singularar ,theshootingfun tionisdened byS
3
:
t
f
, p
1
(0), p
t
entry
2
(0), p
3
(0)
t
exit
7→
r(t
f
) − 1.01, p
2
(t
f
), p
3
(t
f
), p
4
(t
f
)
ψ(x(t
entry
), p(t
entry
))
˙
ψ(x(t
entry
, p(t
entry
))
.
4.3.2 Solvingthe problem with the HJBapproa h
GoddardproblemisalsohardtosolvewiththeHJBapproa h,spe iallybe ause the omputation of the value fun tion needs a huge number of iterations to onvergeandthesolutionisquitesensibletothe hoi eofthenumeri albox
Ω
in whi hthevaluefun tionis omputed. InFig. 8,weshowthe optimaltraje toryand theoptimal ontrol omputedbyHJB onaroughgrid. Aswe ansee,theHJBapproa hdoesnotgiveagood approx-imation of theoptimal ontrol (verti al lines orrespond to strongos illations ofthesolution). TheHJBformulation ansuggestnotonlythevaluesfor
p(0)
andt
f
,but alsothelo ationofthesingularar .0
0.05
0.1
0.15
0.2
1
1.005
1.01
1.015
x1
0
0.05
0.1
0.15
0.2
0
0.05
0.1
0.15
0.2
x2
0
0.05
0.1
0.15
0.2
0.7
0.8
0.9
1
x3
0
0.05
0.1
0.15
0.2
0
0.2
0.4
0.6
0.8
1
u
Figure8:
(P
3
)
-Goddardproblem,solutionbyHJBapproa h(rstline: altitude andvelo ity. Se ondline: massand ontrol).4.3.3 Coupling the HJBand PMPapproa hes
Wenow tryto initializethe shooting method dire tly from the resultsof the HJBapproa h. As for problems
(P
1
)
and(P
2
)
, theHJBsolution providesan estimateofthenal timet
∗
f
andinitial ostatep(0)
. Moreover,examiningthe statevariables on the HJBsolution also givesagood ideaof the stru ture of the ontrol: the hangeof slopeonthespeed learlyvisiblein Fig. 8indi ates aninteriorsingularar at(t
entry
, t
exit
) ≈ (0.02, 0.06)
. On e againweobtaina qui k onvergen etothe orre tsolutionwiththeexpe tedsingularar (Table 7andFig. 9)).t
∗
f
(t
entry
, t
exit
)
−ξ
∗
andp(0)
InitializationfromHJB0.17
(0.02, 0.06)
(−7.79, −0.31, 0.04)
SolutionfromPMP0.1741
(0.02351, 0.06685)
(−7.275, −0.2773, 0.04382)
Table7:
(P
3
)
-InitializationfromHJBandsolutionfrom PMP.4.4 Minimumtimetarget problemwitha state onstraint Thisfourthexampleaimstoillustrate the aseofastate onstraint,aswellas afour-dimensionalproblemfortheHJBapproa h. We hoseasimpleproblem wherewewanttomoveapointontheplane,fromasteadyinitialpositiontoa targetposition,withanullinitialand nalspeed. The ontrol isthedire tion ofa eleration,andtheobje tiveistominimizethenaltime. Weaddastate onstraintwhi hlimitsthevelo ityofthepointalongthe
x
-axis.0
0.5
1
CONTROL
0
0.5
1
SWITCHING FUNCTION
1
1.005
1.01
STATE
0
0.05
0.1
0.7
0.8
0.9
0
0.1
0.2
−10
−5
0
COSTATE
−0.5
0
0.5
−0.05
0
0.05
−5
0
5
x 10
−8
Figure 9:
(P
3
)
-Goddard problem,solutionbyPMPmethod.(P
4
)
min J(x, u) = t
f
˙y
1
= y
3
˙y
2
= y
4
˙y
3
= cos(u)
˙y
4
= sin(u)
u(t) ∈ [0, 2π)
y(0) = x = (−3, −4, 0, 0)
y(t
f
) = (3, 4, 0, 0)
y
3
(t) ≤ 1
t ∈ (0, t
f
)
Letus writethe state onstraintsas
g(y(t)) ≤ 0
, withg
dened byg(y) =
y
3
− 1
. The ontrol appears expli itlyin the rst time derivativeofg
, sothe onstraintis oforder 1,andwehave:˙g(y(t)) = cos(u(t)),
g
y
(y) = (0, 0, 1, 0).
Whenthe onstraintisnota tive,minimizingtheHamiltoniangivestheoptimal ontrol
u
∗
via(cos(u
∗
), sin(u
∗
)) = −
p
(p
3
, p
4
)
p
2
3
+ p
2
4
.
Overa onstrainedar where
g(x) = 0
,theequation˙g(x, u) = 0
andminimizing theHamiltonianH
leadstou
∗
= −sign(p
4
)
π
2
.
Thenthe ondition
H
u
= 0
givesthevalueforthe onstraintmultiplierµ = −p
3
. Attheentrypointwehaveajump onditionforthe ostate:p(t
+
entry
) = p(t
−
entry
) − π
entry
g
x
,
with
π
entry
∈ R
an additional shooting unknown. Compared to the un on-strainedproblem,wehavethreemoreunknownst
entry
, t
exit
andπ
entry
. The or-respondingequationsaretheHamiltonian ontinuityatt
entry
andt
exit
(whi hboilsdownto
p
3
= 0
),andthetangentialentry onditiong(x(t
entry
)) = 0
. The shootingfun tionis denedbyS
4
:
p
1...4
t
f
(0)
t
entry
, t
exit
, π
entry
7→
y
1...4
(t
f
p
) − (−3, −4, 0, 0)
5
(t
f
) − 1
p
3
(t
entry
), p
4
(t
entry
), g(y(t
entry
))
.
0
2
4
6
−5
0
5
x
0
2
4
6
−5
0
5
y
0
2
4
6
−2
0
2
4
Vx
0
2
4
6
−2
0
2
4
Vy
Figure10:
(P
4
)
-Solutionwitha onstrainedar bytheHJBapproa h.In g. 10, weshowthe numeri al solutionobtainedby using theHJB ap-proa h. This approa hprovides alsoapproximationsofthe optimalnal time andtheinitial ostate. ExaminingtheHJBsolutionalsogivesanestimateofthe bounds for the onstrainedar where
y
3
= 1
. The onlyshooting unknownfor whi hwewerenotabletoobtainrelevantinformationisthemultiplierπ
entry
for the ostatejump att
entry
. Thereforeweusedπ
entry
= 0.1
asastartingguess, whi h turned out to be su ient for the shooting method to onverge prop-erly(Table8). Fig. 11 showsthe orrespondingsolution, mu h leanerthan theHJBsolutionbutwith thesamestru ture. We he kedthat the onditionµ ≥ 0
wassatisedovertheboundaryar asp
3
isnegative,andp
3
= 0
atboth entryandexitofthear asrequestedbytheHamiltonian ontinuity onditions. Thea tualvalueofthemultiplierforthejumponp
3
isπ
entry
= 4.1294
.t
∗
f
(t
entry
, t
exit
)
−ξ
∗
andp(0)
InitializationfromHJB7.5
(1.35, 5.6)
(−0.51, −0.24, −0.89, −0.61)
SolutionfromPMP7.0356
(1.1370, 5.8986)
(−0.8672, −0.0474, −0.9860, −0.1667)
Table8:
(P
4
)
-InitializationfromHJBandsolutionfrom PMP.CPU times. In Table 9 we nally summarize the CPU times needed for omputations.
−3
−2
−1
0
1
2
CONTROL
−4
−2
0
2
4
TRAJECTORY
−5
0
5
STATE
−5
0
5
0
0.5
1
0
1
2
3
7.03
7.04
7.05
−2
−1
0
COSTATE
−1
0
1
−4
−2
0
2
−0.2
0
0.2
0
0.5
1
Figure11:
(P
4
)
-Solutionwitha onstrainedar byPMPapproa h.Problem1 Problem2 Problem3 Problem4 HJBapproa hwithroughdis retization
8 × 10
−2
2.98
211
182
PMPapproa hwithHJBinitialization
3 × 10
−3
7 × 10
−3
3 × 10
−2
2 × 10
−2
Shootingfun tionnormforPMP
2.82 × 10
−16
8.14 × 10
−11
1.12 × 10
−7
6.68 × 10
−11
Table9:SummaryofCPUtimesfornumeri alexperiments(se onds)and shoot-ingfun tion norm
5 Con lusions
The known relation between the gradient of the value fun tion in the HJB approa h and the ostate in the PMP approa h makes it possible to use the HJBresultsto initializeashooting method. With this ombinedmethod, one anhopetobenetfromtheoptimalityofHJBandthehighpre isionofPMP. Themainlimitationisonthestatedimensionimposed byHJB.
We have tested this approa h on four ontrol problems presenting some spe i di ulties: lo alandglobalsolutions(Problem1),dis ontinuous bang-bang ontrol(Problem2),singularar s(Problem3),state onstraint(Problem 4). Thenumeri al tests alsoin luded two ases where thevalue fun tion was notdierentiable.
Forthesefourproblems,theHJBapproa hprovidesanapproximatesolution withsomeadditionalinformation,su hasanestimateoftheinitial ostate
p(0)
, optimalnaltimet
f
,stru tureoftheoptimalsolutionwithrespe ttosingular or onstrainedsubar s. Inea h asethisinformationallowedustosu essfully initializetheshootingmethod. Thefa t thattheoptimal ontrolre onstru ted byHJBwassometimesfarfromtheexa t ontroldidnotseemtobeproblemati for the shooting method initialization. The total omputational time for the ombined HJB-PMP approa h did not ex eed four minutes, up to dimension four.Appendix
Proof ofTheorem3.1. Giventhenumeri aldomain
Ω
wedenethesetΩ
′
as
Ω
′
:= {x ∈ R
d
: ev(x; h, Ω) ≤ min
x
′
∈∂Ω
e
v(x
′
; h, Ω)}.
Theset
Ω
is the boxin whi h the approximate solution is a tually omputed andΩ
′
representsthe subsetof
Ω
in whi h the solutionis notae ted bythe titiousboundary onditionsweneedto impose at∂Ω
to make omputation. Fromthefrontpropagationpointofview,∂Ω
′
representsthefrontatthetime ittou hes
∂Ω
fortheveryrsttime.Letus dene
v
max
:= (1 − e
−T
max
)
andx
x ∈ Ω
′
. Wehave
T (x) ≤ T
max
< +∞
andv(x) ≤ v
max
< 1.
By(14)wehave
e
v(x; h) ≤ v(x) + Ch
α
≤ v
max
+ Ch
α
.
Sin e
v
max
< 1
thereexistsh
0
> 0
su hthatv
max
+ Ch
α
< 1
forall0 < h ≤ h
0
thenwe andene
ev
max
:= v
max
+ Ch
α
0
< 1
andwehave
v(x) ≤ v
max
≤ ev
max
ande
v(x; h) ≤ ev
max
forallx ∈ Ω
′
,
0 < h ≤ h
0
.
Foranyxed
x ∈ Ω
′
,itexists
ξ
x
∈ [min{v(x), ev(x; h)}, max{v(x), ev(x; h)}]
su h that| e
T (x) − T (x)| =
ln 1 − v(x)
− ln 1 − ev(x; h)
=
1 − ξ
1
x
|v(x) − e
v(x; h)|.
Sin e
ξ
x
≤ ev
max
,wehave| e
T (x) − T (x)| ≤
Ch
α
1 − ev
max
forallx ∈ Ω
′
and0 < h ≤ h
0
andthenitexists apositive onstant
C
2
whi hdependsbytheproblem'sdata andonΩ
su hthatk e
T − T k
L
∞
(Ω
′
)
≤ C
2
h
α
forall0 < h ≤ h
0
.
(15) Weare now ready to re overan estimate on thegradient of the approximate solutionT
e
. By(15)weknowthat,foranyi = 1, . . . , d
e
T (x + ze
i
) = T (x + ze
i
) + E
1
with|E
1
| ≤ C
2
h
α
ande
T (x − ze
i
) = T (x − ze
i
) + E
2
with|E
2
| ≤ C
2
h
α
.
Sowehavee
D
i
T (x) =
e
T (x + ze
i
) + E
1
− (T (x − ze
i
) + E
2
))
2z
= e
D
i
T (x) +
E
1
− E
2
2z
sothat
| e
D
i
T (x) − e
e
D
i
T (x)| ≤
E
1
− E
2
2z
≤ C
2
h
α
z
andthenk e
D e
T (x) − e
DT (x)k
∞
≤ C
2
h
α
z
.
Wenallyobtain,for
x ∈ Ω
′
and0 < h ≤ h
0
,k e
D e
T (x)−DT (x)k
∞
≤ k e
D e
T (x)− e
DT (x)k
∞
+k e
DT (x)−DT (x)k
∞
= O
h
α
z
+O(z
2
)
andthe on lusionfollows.
Referen es
[1℄ Bardi,M.,CapuzzoDol etta,I.: Optimal ontrolandvis ositysolutionsof Hamilton-Ja obi-Bellmanequations.Birkhäuser,Boston(1997)
[2℄ Bardi, M., Fal one, M., Soravia, P.: Fully dis retes hemesfor the value fun tion of pursuit-evasion games. InT. Basarand A. Haurie(eds), Ad-van esin Dynami Games and Appli ations,Annals of theInternational So ietyofDynami Games1,89-105(1994)
[3℄ Bellman,R.E.: ThetheoryofDynami Programming.Bull.Amer.Math. So .60,503515(1954)
[4℄ Bettiol,P.,Frankowska,H.: Normalityof themaximumprin iplefor non- onvex onstrainedBolzaproblems.J.DierentialEquations243,256269 (2007)
[5℄ Bonnans, F., Martinon, P., Trélat, E.: Singular ar s in the generalized Goddard'sproblem.J.Optim.TheoryAppl.139,439461(2008)
[6℄ Bryson,A.E.,Ho,Y.-C.: Appliedoptimal ontrol.HemispherePublishing, New-York(1975)
[7℄ Cannarsa,P.,Frankowska,H.: Some hara terizationsofoptimal traje to-riesin ontroltheory.SIAMJ.ControlOptim.28,13221347(1991) [8℄ Cannarsa, P., Frankowska, H., Sinestrari, C.: Optimality onditions and
synthesisfortheminimumtimeproblem. SetValuedAnalysis8, 127148 (2000)
[9℄ Cernea, A., Frankowska,H.: A onne tionbetweenthe maximum prin i-pleand dynami programmingfor onstrained ontrol problems.SIAMJ. ControlOptim. 44,673703(2006)
[10℄ Chitour, Y., Jean, F., Trélat, E.: Generi ity results for singular urves. JournalofDierentialGeometry,73, 4573(2006)
[11℄ Clarke,F.,Vinter,R.B.: Therelationshipbetweenthemaximumprin iple anddynami programming.SIAMJ.ControlOptim.25,12911311(1987)
[12℄ Deuhard,P.: NewtonMethodsforNonlinearProblems.SpringerSeriesin ComputationalMathemati s,35(2004)
[13℄ Fal one,M.: Numeri alsolutionof dynami programmingequations. Ap-pendixA in[1℄.
[14℄ Fal one,M.: Numeri almethodsfordierentialgamesbasedonpartial dif-ferentialequations.InternationalGameTheory Review8,231272(2006) [15℄ Gergaud,J.,Martinon.P.: Usingswit hingdete tionandvariational
equa-tions for the shooting method. Optim. Control Appl. Meth. 28, 95116 (2007)
[16℄ Goddard,R.H.: AMethodofrea hingextremealtitudes.SmithsonianInst. Mis .Coll.71,(1919)
[17℄ Hairer, E., Nørsett, S.P., Wanner, G.: Solvingordinarydierential equa-tionsI.SpringerSeriesinComputationalMathemati s8,Springer-Verlag, Berlin(1993)
[18℄ Maurer,H.: Numeri alsolutionofsingular ontrolproblemsusingmultiple shootingte hniques.J.Optim.TheoryAppl.18,235257(1976)
[19℄ More, J.J.,Garbow,B.S., Hillstrom, K.E.: UserGuidefor MINIPACK-1. Argonne NationalLaboratoryReportANL-80-74(1980)
[20℄ Oberle,H.J.: Numeri al omputationofsingular ontrolfun tionsintra je -toryoptimization.JournalofGuidan e,ControlandDynami s13,153159 (1990)
[21℄ Pes h, H.J.: A pra ti alguide to the solution of real-lifeoptimal ontrol problems. ControlandCyberneti s23,760(1994)
[22℄ Pes h, H.J.: Real-time omputation of feedba k ontrols for onstrained optimal ontrolproblemsII:a orre tionmethodbasedonmultiple shoot-ing.OptimalControl,Appli ationsandMethods10,147-171(1989) [23℄ Pes h, H.J., Plail, M.: The maximumprin ipleof optimal ontrol: a
his-tory ofingeniousideaandmissed opportunities.to appearin Controland Cyberneti s.
[24℄ Pontryagin,L.S., Boltyanski,V.G., Gamkrelidze,R.V. Misht henko,E.F.: The mathemati al theory of optimal pro esses. Wiley Inters ien e, New York (1962)
[25℄ Seywald,H.,Cli,E.M.: Goddardproblemin presen eofadynami pres-surelimit.JournalofGuidan e,Control,andDynami s16,776781(1993) [26℄ Soravia, P.: Estimates of onvergen e of fully dis rete s hemes for the Isaa s equation of pursuit-evasion dierential games via maximum prin- iple.SIAMJ.ControlOptim. 36,111(1998)
[27℄ Tsai,Y.R.,Cheng,L.T.,Osher,S.,Zhao,H.: Fastsweepingalgorithmsfor a lass of Hamilton-Ja obiequations. SIAMJ.Numer.Anal. 41, 673694 (2003)
[28℄ Tsiotras,P.,Kelley,H.J.: Drag-lawee tsintheGoddardproblem. Auto-mati a27,481-490(1991)