Initialization of the shooting method via the Hamilton-Jacobi-Bellman approach

(1)

HAL Id: inria-00439543

https://hal.inria.fr/inria-00439543

Submitted on 7 Dec 2009

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

Hamilton-Jacobi-Bellman approach

Emiliano Cristiani, Pierre Martinon

To cite this version:

Emiliano Cristiani, Pierre Martinon. Initialization of the shooting method via the

Hamilton-Jacobi-Bellman approach. Journal of Optimization Theory and Applications, Springer Verlag, 2010, 146 (2),

pp.321-346. �10.1007/s10957-010-9649-6�. �inria-00439543�

(2)

a p p o r t

d e r e c h e r c h e

N

0

2

4

9 -6

3

9

9 IS

R

N

IN

R

IA

/R

R

--7

1

3

9 --F

R

+

E

N

G

Initialization of the shooting method via the

Hamilton-Jacobi-Bellman approach

Emiliano Cristiani — Pierre Martinon

N° 7139

(3)

(4)

Centre de recherche INRIA Saclay – Île-de-France

Parc Orsay Université

EmilianoCristiani

∗

,Pierre Martinon

†

Thème: Modélisation,optimisationet ontrledesystèmesdynamiques Équipe-ProjetCommands

Rapportdere her he n°7139Dé embre200925pages

Abstra t: Theaimof thispaperistoinvestigatefromthenumeri alpointof viewthe possibilityof ouplingthe Hamilton-Ja obi-Bellman(HJB)approa h andthePontryagin'sMinimumPrin iple(PMP)tosolvesome ontrolproblems. We show that an approximation of the valuefun tion omputed by the HJB method onroughgrids anbeused toobtainagoodinitial guessforthePMP method. The advantage of our approa h over other initialization te hniques (su h as ontinuationordire tmethods) isto provideaninitial guess loseto theglobalminimum. Numeri altestsinvolvingmultipleminima,dis ontinuous ontrol,singular ar sand state onstraintsare onsidered. The CPUtime for the proposed method isless than four minutes upto dimension four, without odeparallelization.

Key-words: optimal ontrol problem, minimumtimeproblem, Pontryagin's minimumprin iple

∗

CEMSAC,Fis iano(SA)andIAC-CNR,Rome,Italy

†

(5)

Résumé: Theaim ofthispaperisto investigatefrom thenumeri alpointof viewthepossibility of oupling theHamilton-Ja obi-Bellman(HJB) approa h andthePontryagin'sMinimumPrin iple(PMP)tosolvesome ontrolproblems. We show that an approximation of the value fun tion omputed by the HJB methodonroughgrids anbeusedtoobtainagoodinitialguess forthePMP method. The advantage of our approa h over other initialization te hniques (su h as ontinuationordire t methods)is to provideaninitial guess loseto theglobalminimum. Numeri altestsinvolvingmultipleminima,dis ontinuous ontrol,singularar sand state onstraintsare onsidered. TheCPU timefor theproposed method is lessthan four minutesupto dimensionfour, without odeparallelization.

Mots- lés : optimal ontrolproblem, minimum timeproblem, Pontryagin's minimumprin iple

(6)

1 Introdu tion

The Hamilton-Ja obi-Bellman (HJB) theory and the Pontryagin's Minimum Prin iple(PMP)areusually onsideredtwoseparateworldsalthoughtheydeal with thesamekindofproblems. Thetheoreti al onne tionsbetweenthetwo approa hesarewellknown[11,7,8,9℄,but oupledusageofthetwote hniques isnot ommonandnot ompletelyexplored.

Inthispaperwewilldealwiththefollowing ontrolleddynami s

˙y(t) = f (y(t), u(t)),

t > 0

y(0) = x,

x ∈ R

d

(1)

where the ontrol variable

u(·) ∈ U := {u : R

+

_{→ U, u}

measurable

}

and

U ⊂ R

m

(

m ≥ 1

). We will denote by

y

x

(t; u)

the solution of the system (1) startingfromthepoint

x

with ontrol

u

. Let

C ⊂ R

d

beagiventarget. Forany given ontrol

u

wedenote by

t

f

(x, u)

thersttime thetraje tory

y

x

(t; u)

hits

C

(weset

t

f

(x, u) = +∞

ifthetraje toryneverhitsthetarget). Wealsodene a ostfun tional

J

as

J(x, u) :=

Z

t

f

(x,u)

0 ℓ(y

x

(t; u), u(t))dt.

(2)

Thenalgoalisto nd

u

∗

_{∈ U}

su hthat

J(x, u

∗

_{) = min}

u∈U

J(x, u)

(3) andthento omputetheasso iatedoptimaltraje tory

y

∗

x

(t; u

∗

)

. Wealsodene thevaluefun tion

T (x) := J(x, u

∗

) ,

x ∈ R

d

_.

Choosing

ℓ ≡ 1

in (2)weobtainthe lassi alminimum timeproblem.

TheHJBapproa hisbasedontheDynami ProgrammingPrin iple [3℄. It onsists in hara terizingthe valuefun tion asso iated tothe ontrol problem by meansof arst-order non-linearpartial dierential equation. On e an ap-proximationof the valuefun tion is omputed, we aneasily re onstru t the optimal ontrol

u

∗

in feedba k form and, by a dire t integration, the optimal traje toriesfor any startingpoint

x

. The method is greatlyadvantageous be- auseitis abletorea h theglobalminimumofthe ostfun tional, evenifthe problemisnot onvex.TheHJBapproa hallowsalsotohaveaglobaloverview ofthesetoftheoptimaltraje toriesandoftherea hableset(or apturebasin) i.e. theset ofthepointsfromwhi h itispossibletorea hthetargetinagiven time.

Beside all the advantages listed above, the HJB approa h suers the well known urseofdimensionality,soingeneralitisrestri tedtoproblemsinlow dimension(

d ≤ 3

).

The PMP approa h onsists in nding traje tories that satisfy the ne es-sary onditions stated by Pontryagin's Minimum Prin iple. This is done in pra ti e by sear hing a zero of a ertain shooting fun tion, typi ally with a (quasi-)Newton method. This method is wellknown andis used in many ap-pli ations,see[21,23,12℄ andreferen estherein. Themain advantagesof this approa hliein itsa ura yand itsnumeri al omplexity. It isworthto re all

(7)

thatthedimension ofthenonlinearsystemfortheshootingmethod isusually

2d

, where

d

is the statedimension. This isin pra ti equitelowfor this kind ofproblem,thereforefast onvergen eisexpe tedin aseofsu ess,espe ially iftheinitial guessis lose totherightvalue. Unfortunately, ndingasuitable initial guess an be extremely di ult in pra ti e. Thealgorithm may either not onvergeatall,or onvergetoalo alminimumofthe ostfun tional.

Inthispaperwe ouplethetwomethodsinsu hawaywe anpreservethe respe tiveadvantages. TheideaistosolvetheproblemviatheHJBmethodon a oarsegridto havein short timearst approximationofthe valuefun tion and thestru ture ofthe optimaltraje tory. Then, weusethis information to initializethePMPmethod and omputeapre ise approximationoftheglobal minimum. Toourknowledgethisistherstattempttoexploit the onne tion betweentheHJBandPMPtheoriesfrom thenumeri alpointofview.

Comparedtotheuseof ontinuationte hniquesordire tmethodstoobtain anestimateoftheinitial ostate,themainadvantageoftheapproa hpresented hereisthat theHJBmethodprovidesaninitial guess loseto theglobal min-imum. Themain limitationis therestri tionwith respe tto thedimensionof thestate.

We onsidersomeknown ontrolproblemswithdierentspe i di ulties: severallo alminima,dis ontinuous ontrol,presen eofsingularar s,andstate onstraints. Inalltheseproblems, weshowthat ombiningPMPmethodwith HJBapproa hleadstoaverye ientalgorithm.

2 Preliminaries

Consideroptimal ontrolproblemsinthegeneralBolzaform,autonomous ase, withaxedorfreenal time.

(P )











min J(x, u) =

R

t

f

(x,u)

0 ℓ(y(t), u(t)) dt

Obje tive

˙y(t) = f (y(t), u(t))

Dynami s

u(t) ∈ U

fora.e.

t ∈ (0, t

f

(x, u))

AdmissibleControls

y(0) = x

Initial Conditions

y(t

f

(x, u)) ∈ C

TerminalConditions

Here

U

is a ompa t set of

R

m

and the following lassi al assumptions are satised:

-

f : R

d

_{× U → R}

d

and

ℓ : R

d

_{× U → R}

d

are ontinuous,andareof lass

C

1

withrespe tto therstvariable. -

C

isa losedsubsetof

R

d

forwhi hthepropertyave torisnormalto

C

atapointof

C

makessense. Forinstan e,

C

anbedes ribedbyanite set of equalities

{c

i

(x) = 0}

i

or inequalities

{c

i

(x) ≤ 0}

i

, with the

c

′

i

s

beingof lass

C

1

andthe lassi al onstraintquali ationassumptions.

2.1 Pontryagin's Minimum Prin iple approa h

We give here a brief overview of the so alled indire t methods for optimal ontrol problems[24, 6,22℄. Weintrodu ethe ostate

p

, ofsamedimension

d

(8)

asthestate

x

,and denetheHamiltonian

H(y, p, u, p

0 ) = p

0 ℓ(y, u)+ < p, f (y, u) > .

Undertheassumptionson

f

and

ℓ

introdu edabove,thePontryagin's Min-imum Prin iple statesthat if

(y

∗

x

, u

∗

, t

∗

f

)

is asolution of (

P

) thenthere exists

(p

0 , p

∗

) 6= 0

absolutely ontinuoussu hthat

˙y

∗

(t) = H

p

(y

x

∗

(t), p

∗

(t), u

∗

(t), p

0 ),

y

x

∗

(0) = x,

(4a)

˙p

∗

(t) = −H

x

(y

x

∗

(t), p

∗

_{(t), u}

∗

_{(t), p}

0 ),

(4b)

p

∗

(t

∗

f

) ⊥ T

C

(y

∗

x

(t

∗

f

)),

(4 )

u

∗

(t) = arg min

v∈U

H(y

∗

x

(t), p

∗

_{(t), v, p}

0 )

fora.e.

t ∈ [0, t

∗

f

],

(4d)

where

T

C

(ξ)

denotesthe ontingent oneof

C

at

ξ

. Moreover,ifthenal time

t

∗

f

isnotxedandisanoptimaltime,thenwehavetheadditional ondition:

H(y

∗

x

(t), p

∗

(t), u

∗

(t), p

0 ) = 0,

for

t ∈ (0, t

∗

f

).

(5)

Two ommon asesare

C = {y

f

}

with

p

∗

_(t

∗

f

)

free,and

C = R

d

with

p(t

∗

f

) = 0.

Nowweassumethat minimizingtheHamiltonian providesthe ontrol asa fun tion

γ

ofthestateand ostate. Foragivenvalueof

p(0)

,we anintegrate

(x, p)

byusingthe ontrol

u = γ(x, p)

on

[0, t

f

]

. Wedenetheshootingfun tion

S

that maps the unknowns

p(0)

to the value of the nal and transversality onditions at

(x(t

f

), p(t

f

))

. Finding azeroof

S

givesatraje tory

(x, u)

that satisesthene essary onditionsfortheproblem(

P

). Thisistypi allydonein pra ti ebyapplyinga(quasi-)Newtonmethod.

Remark 2.1 The multiplier

p

0

ould be equal to

0

. In that ase, the PMP is saidanormal,itssolution

(y

∗

_{, u}

∗

_{, p}

∗

₎

orrespondstoasingular extremalwhi h doesnot dependon the ostfun tion

ℓ

. Several workshave been devotedtothe existen e (or nonexisten e) of su h extremal urves [4 , 10 ℄. For numeri s, in generalweassumethat

p

0 6= 0

whi hleadstosolvethePMPsystemwith

p

0 = 1

. In thesequel,wewill alwaysassume thatweareinthe normal ase(

p

0 = 1

). Singular ar s. Asingularar o urs whenminimizingtheHamiltonianfails to determine the optimal ontrol

u

∗

on a whole time interval. The typi al ontextiswhen

H

islinearwithrespe tto

u

,withanadmissiblesetof ontrols of the form

U = [u

low

, u

up

]

. In this parti ular ase, thefun tion

(x, u, p) 7−→

H

u

(x, u, p)

does notdepend onthe ontrol variable. We dene the swit hing fun tion

ψ(x, p) = H

u

(x, u, p)

andhavethefollowingbang-bang ontrollaw:







if

ψ(x, p) > 0

then

u

∗

_{= u}

low

if

ψ(x, p) < 0

then

u

∗

_{= u}

up

if

ψ(x, p) = 0

thenswit hingorsingular ontrol

.

Asingularar then orrespondstoatimeintervalwheretheswit hingfun tion

ψ

iszero. Theusualwaytoobtainthesingular ontrolistodierentiate

ψ

with re-spe tto

t

untilthe ontrolexpli itlyappears,whi hleadstosolvinganequation oftheform

ψ

(2k)

_{(x, p) = 0}

(9)

depending onthe problem. Moreover,it isalso requiredto makeassumptions aboutthe ontrol stru ture, morepre isely to xthenumber ofsingularar s. Ea h expe ted singular ar adds two shooting unknowns

(t

entry

, t

exit

)

, with the orresponding jun tion onditions

ψ(t

entry

) = ˙

ψ(t

entry

) = 0

oralternately

ψ(t

entry

) = ψ(t

exit

) = 0

. The problem studied in se tion 4.3 presentssu h a singularar .

State onstraints. We onsiderastatevariableinequality onstraint

g(x(t)) ≤ 0

. Wedenoteby

q

thesmallestordersu hthat

g

(q)

dependsexpli itlyonthe on-trol

u

;

q

is alledtheorderofthe onstraint

g

. TheHamiltonianisdenedwith anadditionaltermforthe onstraint

H(x, p, u) = ℓ(x, u)+ < p, f (x, u) > +µg

(q)

_{(x, u)}

withthesign ondition

µ = 0

if

g < 0

µ ≥ 0

if

g = 0.

Whenthe onstraintis ina tiveweare in thesamesituation asfor an un on-strainedproblem. Overa onstrainedar where

g(x) = 0

,weobtainthe ontrol from theequation

g

(q)

_{(x, u) = 0}

,and

µ

from theequation

H

u

= 0

. Asin the singularar ase, weneed to makeassumptions on erning the ontrol stru -ture, namely the number of onstrained ar s. Ea h expe ted onstrained ar addstwoshooting unknowns

(t

entry

, t

exit

)

with theHamiltonian ontinuityas orresponding onditions. Wealsohavetheso alledtangen y onditionatthe entrypoint

N (x(t

entry

)) = (g(x(t

entry

)), . . . , g

(q−1)

(x(t

entry

))) = 0,

withthe ostatedis ontinuity

p(t

+

entry

) = p(t

−

entry

) − πN

x

|t

entry

where

π ∈ R

q

isanothermultiplieryieldingtoanadditionalshootingunknown. Remark2.2 Thetangen y ondition an also be enfor edat the exit time, in this asethe ostate jump o ursatthe exit timeaswell.

2.2 Hamilton-Ja obi-Bellmanapproa h Considerthe value fun tion

T : R

d

_{→ R}

, whi h maps everyinitial ondition

x ∈ R

d

to the minimal value of the problem (

P

). It is well known (see for example[1℄fora omprehensiveintrodu tion)thatthevaluefun tion

T

satises aDynami ProgrammingPrin ipleandtheKruºkovtransformof

T

,denedby

v(x) := 1 − e

−T (x)

istheuniquesolutionofthefollowingHJBequation,invis ositysense [1℄:

(

v(x) + sup

u∈U

{−f (x, u) · Dv(x) − ℓ(x, u) + (ℓ(x, u) − 1)v(x))} = 0 x ∈ R

d

_\C

v(x) = 0

x ∈ C.

(6)

(10)

Obtaininganumeri alapproximationofthefun tion

v

isadi ulttaskmainly be ause

v

is not always dierentiable. Several numeri al s hemes have been studiedintheliterature. Inthispaperwewillusearst-ordersemi-Lagrangian (SL) s heme, werefer to [13, 14℄ for asurveyon these kindof s hemes. This hoi e ismotivated bythefa t that SLs heme seemsthebest onein orderto approximate the gradientof thevalue fun tion,this beingour goalaswewill see in the nextse tion. Wex a(numeri al)bounded domain

Ω ⊃ C

and we introdu e in it aregular grid

G = {x

i

, i = 1, . . . , N

G

}

where

N

G

is the total numberofnodes. Wedenoteby

v(x; h, k, Ω)

e

thefullydis reteapproximationof

v

,

h

and

k

beingtwodis retizationparameters(therstone anbeinterpreted asatimesteptointegratealong hara teristi sandthese ondoneistheusual spa e step). We impose state onstraint boundary onditions on

∂Ω

. The dis reteversionof(6)is

e

v(x

i

) = e

H[ev](x

i

)

x

i

∈ (Ω\C) ∩ G

e

v(x

i

) = 0

x ∈ C ∩ G

(7) where

e

H[ev](x

i

) := min

u∈U

{P

1 e

v; x

i

+ hf (x

i

, u)

+ hℓ(x

i

, u)(1 − ev(x

i

))}

(8) and

P

1 v; x

e

i

+ hf (x

i

, u)

denotes the value of

e

v

at the point

x

i

+ hf (x

i

, u)

obtained bylinear interpolation using the known valuesof

ev

on

G

(note that the point

x

i

+ hf (x

i

, u)

is not in general sittingon the grid). The numeri al s heme onsistsin iterating

e

v

(n+1)

= e

H[ev

(n)

]

n = 1, 2, . . .

(9)

until onvergen e,startingfrom

ev

(0)

_(x

i

) = 0

on

C

and1elsewhere. Toa elerate the onvergen e we use the Fast Sweeping te hnique [27℄. The fun tion

ev

is then extendedto thewhole spa ebylinearinterpolation. On e thefun tion

e

v

is omputed,wegeteasilythe orrespondingapproximation

T

e

of

T

,and then theoptimal ontrollawinfeedba kform,see[13,14℄fordetails.

Itis usefulto notethat theequation (6) analso model afront (interfa e) propagationproblem. Followingthisinterpretation,theboundaryofthetarget

∂C

isthefrontat initialtime

t = 0

,andthelevelset

{x : T (x) = t}

represents thefrontat anytime

t > 0

.

3 Coupling HJB and PMP 3.1 Main onne tion

It is known [7℄ that for a general ontrol problem with free end-point, if the value fun tion is dierentiable at somepoint

x ∈ R

d

then it is dierentiable along theoptimal traje tory startingat

x

. A tually, the gradient of the value fun tion isequal tothe ostateof the Pontryagin'sprin iple.

Inthe ontextofminimumtimeproblems(withtarget onstraint),thelink be-tweentheminimumtimefun tion andthePontryagin'sprin iplehasbeenalso investigatedinseveralpapers[9,8℄, provingthesame onne tion.

(11)

On e the valuefun tion

T

is omputed by solving theHJB equation, we ap-proximate

DT (x)

(

x

being the starting point) by standard rst-order nite dieren es,andthenweuseitasinitialguessfor

p(0)

.

Inthe ase

T /

∈ C

1 _(R

d

₎

itis provedin [8℄ that a onne tionbetweenthe two approa hesstillexists. Morepre isely, undersomeadditionalassumptions,we have

p

∗

(t) ∈ D

+

T (y

∗

x

(t))

for

t ∈ [0, T (x)],

where

D

+

_{T (x)}

isthesuperdierential of

T

at

x

dened by

D

+

T (x) :=

η ∈ R

d

_{: lim sup}

y→x

T (y) − T (x) − η · (y − x)

|y − x|

≤ 0

.

(10)

Intherestof thisse tionweassumethat

D

+

_{T (x) 6= ∅}

. Itisplainthat we an notusenite dieren e approximationin order to ompute

p(0)

at thepoints wherethevaluefun tion

T

isnotdierentiable. Ratherthanthat,wewilltryto approximatethedire tion

ξ

∗

whi hisorthogonaltothelevelsetsof

T

,pointing towardthedire tion of maximal de rease. This dire tion,in the ase when

T

isdierentiable,isgivenby:

ξ

∗

= −DT (x).

(11)

Here,we omputeanapproximationof

ξ

∗

as:

ξ

∗

= arg min

ξ∈B(0,1)

T (x + δξ) − T (x)

δ

,

(12)

where

δ > 0

is asmall positiveparameter, and

B(0, 1)

denotes the ballin

R

d

enteredat

0

withradius

1

.

Letusexplainonasimpleexamplewhywe hoosethedenition(12). Con-siderthe ase

C = {(3, 0)} ∪ {(−3, 0)}

,

ℓ ≡ 1

,

f = u

and

U = B(0, 1)

(eikonal equation). On the line

{x = 0}

the fun tion

T

is not dierentiable (see Fig. 1-left). This line orresponds to a zone where two globally optimal

traje to-−5

0

5 −5

−4

−3

−2

−1

0

1

2

3

4

5 −5

−4

−3

−2

−1

0

1

2

3

4

5 −5

−4

−3

−2

−1

0

1

2

3

4

5

Figure1: two rossingfrontswithandwithoutsuperimposition. Arrows orre-spondto the(two)ve tor(s)

ξ

∗

(12)

se tion 2.2)herewehavetwofrontswhi h hit ea h other at theline

{x = 0}

. Thevis ositysolutionoftheHJBequationsele tsautomati allytherstarrival time so weneversee the two rossing fronts, but we ould in prin iple follow the propagations of the two frontsseparately (see Fig. 1-right). Considering thetwofrontsseparately,bymeansof(12),we aneasilyapproximatethetwo dire tions

ξ

∗

1

and

ξ

∗

2

ofmaximal de reaseofthefun tion

T

(and thenthetwo gradients

−ξ

∗

1

and

−ξ

∗

2

of

T

)usingonlythevaluefun tion

T

.

Inthepresentexample,fo usingonthepoint

(0, 0)

,weeasily omputethe twodire tionsofmaximalde reaseas

(−1, 0)

and

(1, 0)

. Itiseasytoshowthat these twove tors oin idewiththetwoextremal ve torsin

D

+

_{T (x)}

, namely theve tors

η

verifying

lim sup

y→x

T (y) − T (x) − η · (y − x)

|y − x|

= 0.

(13)

Althoughthisrelationshipisnottrueforeveryfun tion

T

su hthat

D

+

_{T (x) 6=}

∅

,it iseasy to seethat it istruewheneverthe urveof non-dierentiability is dueto the ollisionoftwoormorefronts(asin Problem1,Se tion 4.1).

Inthis paper, weproposeto investigatenumeri ally therelevan e of using the HJB approa h to ompute

−ξ

∗

and then using it as initial guess for the initial ostate

p(0)

intheshootingmethod.

3.2 Convergen e of

DT

Many papers(see for example[2, 26℄ in the ontext of dierential games) in-vestigated the onvergen e of the approximate value fun tion

e

v(· ; h, k, Ω)

to theexa tsolution

v

whentheparameters

h, k

tendto zeroand

Ω

tendsto

R

d

. These resultswerequite di ultto beobtainedbe ausethe fun tion

v

is not in generaldierentiable.

Letusdenoteby

D = ( e

e

D

1 , . . . , e

D

d

)

thedis retegradient omputedby entered nitedieren eswithstep

z > 0

e

D

i

T (x) :=

T (x + ze

i

) − T (x − ze

i

)

2z

,

i = 1, . . . , d

where

{e

i

}

i=1,...,d

isthestandardbasisof

R

d

.

Toourpurposeswehavetogofurtherprovingthe onvergen eof

T (·; h, k, Ω) =

e

− ln(1 − ev(· ; h, k, Ω))

and then the onvergen e of

D e

e

T (· ; h, k, Ω)

be ause the latterwillbeusedbythePMPmethodasinitialguess.

Letusassumethat

k = C

1 h

forsome onstant

C

1

. Givenageneri estimateof theform

kev(· ; h, R

d

_{) − v(·)k}

L

∞

(R

d

₎

≤ Ch

α

,

C, α > 0

(14) wehavethefollowing

Theorem3.1 Assumethat

T ∈ C

1 _(Ω)

andthere exists

T

max

> 0

su hthat

0 ≤ T (x) ≤ T

max

for all

x ∈ Ω.

Letusdene

E(x) := k e

D e

T (x; h, Ω) − DT (x)k

∞

.

Then thereexists

Ω

′

_{⊂ Ω}

su hthat

kE(·)k

L

∞

(Ω

′

(13)

For theSLs hemeweuse here,anestimate oftheform (14)in theparti ular ase

ℓ ≡ 1

(underassumptionsweakerthanthoseused inTheorem3.1) anbe foundin[26℄. TheproofofthetheoremispostponedintheAppendix.

4 Numeri al experiments

Wehave tested the feasibilityand relevan e of ombining the HJBand PMP methodsonfouroptimal ontrolproblems. Ea hoftheseproblemshighlightsa parti ulardi ultyfromthe ontrolpointofview.

Problem 1 (se tion 4.1) is a simple minimum time target problem in di-mension twopresentinglo al and global minima. We will see in this example thattheshootingmethod isverysensitivewithrespe tto theinitial guess(as usual). WheninitializedbyusingtheHJBapproa h,shootingmethodre overs theoptimalsolution.

Problem2(se tion4.2)isa ontrolledVanderPolos illator,alsoof dimen-siontwo,with ontrolswit hings.

Problem 3 (se tion4.3) is the well-known Goddard problem with singular ar s,intheone-dimensional ase(totalstatedimensionisthree).

Problem4(se tion4.4)is anothersimpleminimumtimetargetproblem in dimensionfour, witharst-orderstate onstraint.

Details for HJB implementation. The algorithmis written in C++ and it runs on a PC with an Intel Core 2 Duo pro essor at 2.00 GHz and 4GB RAM. Note that the ode is notparallelized. Theindi ated CPU time isthe timeneededforthe omputationofthevaluefun tionandsavingtheresulton le. Thetimeneededtore onstru ttheoptimaltraje toryisnot onsidered(is almost0).

The numeri al domain

Ω

is dis retized by a regular grid with

N

1 × . . . × N

d

nodes. Thesetofadmissible ontrols

U

isdis retizedin

N

C

equispa eddis rete ontrols

u

1 , . . . , u

N

C

. The stop riterion for the xed point iterations (9) is

kev

(n+1)

_{− ev}

(n)

_k

L

∞

(Ω)

< ε = 1e − 5

.

DetailsforPMPimplementation. TheshootingmethodiswritteninF or-tran90andrunsonaPCwithanIntelCore2Duopro essorat2.33GHzand 2GBRAM.WeusedtheShoot

1

softwarewhi himplementsashootingmethod withtheHybrd[19℄ solver. Forthefour problemsstudiedwesettheODE in-tegrationmethodtoabasi 4th-orderRunge-Kuttawith100steps.

4.1 Minimum time target problem

Therstexampleillustrateshowalo alsolution anae ttheshootingmethod. We onsiderasimpleminimumtimeproblemwherewewanttorea ha ertain position on the plane by ontrolling the angle of the speed. We hoose the velo ityinorderto reatemultipleminimaofthe ostfun tional.

1

(14)

(P

1 )











min t

f

˙y

1 (t) = c(y

1 (t), y

2 (t)) cos(u(t))

˙y

2 (t) = c(y

1 (t), y

2 (t)) sin(u(t))

u(t) ∈ [0, 2π)

fora.e.

t ∈ (0, t

f

)

y(0) = x = (−2.5, 0)

y(t

f

) = (3, 0)

with

c(y

1 , y

2 ) =

1

if

y

2 ≤ 1

(y

2 − 1)

2 + 1

if

y

2 > 1.

Dueto theexpressionof

c

,wehaveat leasttwominima. Thesimplestone orrespondstoastraightlinetraje tory(

−

)alongthe

y

1

axiswith

y

2 = 0

. The otheronehasa urvedtraje tory(

∩

)thattakesadvantageofthelargervalues of

y

2

.

4.1.1 PMPand shooting method

We rst try to solve the problem with the PMP and the shooting method. Thereforeweseekazerooftheshootingfun tiondened by

S

1 :





t

f

p

1 (0)

p

2 (0)



 7→





y

1 (t

f

) − 3

y

2 (t

f

)

p

3 (t

f

) − 1



 .

Globaland lo alsolutions. Dependingonthestartingpoint,theshooting method an onvergeto alo alorglobalsolution(Fig. 2). Themore ommon lo al solution is the straight line traje tory from

x

to

C := {(3, 0)}

, with a onstant ontrol

u = 0

andanaltime

T

local

= 5.5

. Theglobalsolutionhasan ar hshapedtraje torythatbenetsfromthehigherspeedforin reasingvalues of

y

2

,withanaltime

t

∗

f

= T

global

= 4.868

.

4

5

6

7

8 CONTROL

X1

X2

−5

0

5 STATE

0

5

10

3

4

5

6 −2

−1

0

1 COSTATE

−1

0

1

0

0.5

1

5

5.5

6

6.5

7

7.5 CONTROL

X1

X2

−5

0

5 STATE

−0.1

0

0.1

4

5

6

7 −2

−1

0 COSTATE

−1

0

1

0

0.5

1

Figure2:

(P

1 )

-Globalsolution( urvedtraje tory)andlo alsolution(straight traje tory)foundbytheshootingmethod.

(15)

Sensitiveness with respe t to the starting point. Even for this simple problem,theshootingmethodisverysensitivetothestartingpoint. Numeri al tests indi ate that it onverges in most ases to lo al solutions. We run the shooting method with abat h of 441valuesof

p(0) ∈ [−10, 10]

2

on a

21 × 21

grid,withdierentstartingguessesforthenaltime(Fig. 3). Weobservethat forthe bat h with the

t

f

= 1

initialization, 11%of the shootings onvergeto theglobalsolution,60%to thestraightlinelo alsolution,and24%toanother lo al solution with an even worse nal time (

t

f

= 6.06

). For the bat h with the

t

f

= 10

initialization, 9%of theshootings onvergeto theglobal solution, and50%and29%to thetwolo alsolutions. Obviously,just taking arandom startingpointisnotareliablewaytondtheglobalsolution.

−10

−5

0

5

10 −10

−8

−6

−4

−2

0

2

4

6

8

10 P

1 (0)

P

2 (0)

CONVERGENCE STUDY FOR P(0)

∈

[−10,10]

2 AND T = 1

GLOBAL SOLUTION (T=4.868): 49 [11%]

LOCAL SOLUTION (T=5.5): 265 [60%]

LOCAL SOLUTION (T=6.06): 107 [24%]

−10

−5

0

5

10 −10

−8

−6

−4

−2

0

2

4

6

8

10 P

1 (0)

P

2 (0)

CONVERGENCE STUDY FOR P(0)

∈

[−10,10]

2 AND T = 10

GLOBAL SOLUTION (T=4.868): 38 [9%]

LOCAL SOLUTION (T=5.5): 222 [50%]

LOCAL SOLUTION (T=6.06): 127 [29%]

Figure3:

(P

1 )

-Convergen etotheglobalsolutionfromarandominitialization ishazardousdue tothepresen eof alo alsolution.

4.1.2 Solving the problemwith the HJBapproa h

InFig. 4,weshowthelevelsetsoftheminimumtimefun tion

T

asso iatedto the ontrolproblem

(P

1 )

. These levelsetsare obtainedbysolvingnumeri ally theHJBequation. Asit anbeeasilyseeninFig. 4,theminimumtimefun tion isnotdierentiableeverywhere. The urveof thedis ontinuityofthegradient representsherethesetoftheinitialpointsasso iatedtotwooptimaltraje tories. 4.1.3 Coupling the HJBand PMPapproa hes

WenowusethedataprovidedbytheHJBapproa hto obtainastartingpoint losetotheglobalsolution. TheHJBsolutionprovidesanestimateofthenal time, and alsoan approximationof the ostate

p(0)

by omputing adire tion ofmaximalde reaseoftheminimumtimefun tionat

y(0) = x

. InTable1,we summarizethe resultsobtainedby solvingtheHJBequation on several grids, andgivetheobtainedminimaltimetorea hthetargetstartingfromtheposition

x = (−2.5, 0)

. Aswe ansee, evenona oarsegrid(

25 × 25

nodes),weobtain a good approximation of

p(0)

in a very short time (the CPU times in Table 1in lude the numeri al resolution of the HJB equation and the omputation of

p(0)

). As weexpe ted, theshooting method immediately onvergesto the global solutionwhen using the startingpointobtained from the HJBmethod (Table2).

(16)

SL−FS

−6

−4

−2

0

2

4

6 −6

−4

−2

0

2

4

6

Figure 4:

(P

1 )

- Levelsets of theminimum time fun tion

T

, theoptimal tra-je tory starting from

(−2.5, 0)

and the twooptimal traje tories startingfrom

(−1.835, 0)

.

nodes

N

C

−ξ

∗

_t

∗

f

CPUtime(se )

25 × 25

16 (-0.049,-1.000) 4.895 0.08

50 × 50

16 (-0.048,-1.000) 4.895 0.37

200 × 200

32 (-0.051,-1.000) 4.878 20.25

Table 1:

(P

1 )

- HJB approa h: the optimal minimal time starting from

x =

(−2.5, 0)

, and the approximation

−ξ

∗

of the initial ostate asso iated to the optimaltraje tory InitializationfromHJB

t

∗

f

= 4.89

−ξ

∗

= (−0.05, −1)

SolutionbyPMP

t

∗

f

= 4.868

p(0) = (−5.552 × 10

−2

, −9.985 × 10

−1

)

Table2:

(P

1 )

-Initializationfrom HJBandsolutionfromPMP.

We an he kthatthe onvergen eoftheshootingmethodismu hbetterin aneighbourhoodoftheHJBinitialization. Comparedtothepreviousgridwith

p(0) ∈ [−10, 10]

2

, we test initial points with

p(0) ∈ [−0.1, 0] × [−2, 0]

, whi h orrespondstoa100%rangearoundtheHJBinitialization

−ξ

∗

_{= (−0.05, −1)}

; wealsoset

t

f

= 4.89

. This timetheshootingmethod ndstheglobalsolution for

76%

ofthepoints,andonly

12%

and

9%

forthelo alsolutions(Fig. 5).

InTable3 (see also Fig. 4), we onsider the ase of astarting point very lose to the urve where the minimal time fun tion is notdierentiable:

x =

(−1.835, 0)

. Here the omputation of

p(0)

by HJB gives the two dire tions

p(0) = (−0.05, −1.00)

and

p(0) = (−0.99, 0.00)

. Using these two values to initialize the shooting method, we obtain the two distin t solutions with the apandstraighttraje tories(Table4). Forthisproblem,thestartingpoints where the minimal time fun tion is not dierentiable orrespond to the ase

(17)

−0.1

−0.08

−0.06

−0.04

−0.02

0 −2

−1.8

−1.6

−1.4

−1.2

−1

−0.8

−0.6

−0.4

−0.2

0 P

1 (0)

P

2 (0)

CONVERGENCE STUDY NEAR HJB INITIALIZATION

GLOBAL SOLUTION (T=4.868): 337 [76%]

LOCAL SOLUTION (T=5.5): 53 [12%]

LOCAL SOLUTION (T=6.06): 39 [9%]

Figure5:

(P

1 )

-Convergen etotheglobalsolutionismu heasierneartheHJB initialization.

wherethelo al (straight)solutionbe omesglobaland hasthesameminimal timeastheglobal( ap)solution. Noti ethatheretheminimaltimefun tion remainsdierentiablealongea htraje tory. Wewillseeinse tion4.2adierent aseofnondierentiabilityforthevaluefun tion.

nodes

N

C

−ξ

∗

_t

∗

f

CPUtime(se )

300 × 300

32 (-0.05,-1.00)and(-0.99,0.00) 4.84 39.98

Table3:

(P

1 )

-HJBapproa hforaninitial position

x = (−1.835, 0)

.

t

∗

f

p(0)

HJB

4.84 (−0.05, −1)

and

(−0.99, 0)

PMP

(∩)

4.8246

(−7.67 × 10

−2

_{, −9.97 × 10}

−1

₎

PMP

(−)

4.835 (−1, −6.2137 × 10

−16

₎

Table 4:

(P

1 )

- Lo al solution be omes global for astarting point where the minimaltimefun tionisnotdierentiable.

4.2 Van der Pol os illator

These ondtestproblemisa ontrolledVanderPolos illator. Herewewantto rea hthesteadystate

(y

1 , y

2 ) = (0, 0)

inminimumtime. Itiswellknownthat theoptimaltraje tories, forthis problem, areasso iated tobang-bang ontrol

(18)

variables.

(P

2 )











min t

f

˙y

1 (t) = y

2 (t)

˙y

2 (t) = −y

1 (t) + y

2 (t)(1 − y

1 (t)

2 ) + u(t)

u(t) ∈ [−1, 1]

y(0) = x = (1, −0.8)

y(t

f

) = (0, 0)

Here,theHamiltonianislinearwithrespe tto

u

,thereforewehaveabang-bang ontrolwiththeswit hingfun tion

ψ(x, p) = H

u

(x, p, u) = p

2

.

Theshooting fun tionisdened by

S

2 :





t

f

p

1 (0)

p

2 (0)



 7→





y

1 (t

f

)

y

2 (t

f

)

p

3 (t

f

) − 1



 .

Wetest theshootingmethod withthesameinitialpointsasforproblem

(P

1 )

. The onvergen eresultsareevenworseinthis ase: forthe

t

f

= 1

initialization, only9%oftheshootings onvergetotheglobalsolution,and0.5%forthe

t

f

= 10

initialization.

4.2.2 Solvingthe problem with the HJBapproa h

Here weusetheHJBapproa hto omputetheminimaltimefun tion. InFig. 6,weshowthenumeri alsolutionobtainedby arryingout omputationsona

200 × 200

gridand

N

C

= 2

.

SL−FS

−2

−1.5

−1

−0.5

0

0.5

1

1.5

2 −2

−1.5

−1

−0.5

0

0.5

1

1.5

2

Figure 6:

(P

2 )

- Level sets of fun tion

T

and the optimal traje tory starting from

(1, −0.8)

T

.

4.2.3 Couplingthe HJBand PMPapproa hes

Thenumeri alsolutionoftheHJBequationprovidessomeusefuldata,namely an approximation of the nal time

t

f

and aninitial ostate

p(0)

. This infor-mation is used here to start the shooting algorithm. On e again, the HJB

(19)

initializationgivesanimmediatea urate onvergen eto theoptimalsolution, seeTable5and Fig. 7. Inthisexample,the ontroldis ontinuitieshinderthe

InitializationfromHJB

t

∗

f

= 4.2

−ξ

∗

= (1.2, −4.2)

SolutionfromPMP

t

∗

f

= 3.837

p(0) = (1.249, −3.787)

Table5:

(P

2 )

-InitializationfromHJBandsolutionfrom PMP.

onvergen e by testing dierent integration s hemes for the stateand ostate pair

(x, p)

. Usingaxedstepintegrator(4th orderRunge-Kutta)withoutany pre autionsgivesaverypoor onvergen ewithanormof

≈ 10

−3

forthe shoot-ing fun tion. Using either a variable step integrator (Dopri, see [17℄) or a swit hingdete tionmethod forthexedstepintegrator(see[15℄)wegetmu h betterresults(

≈ 10

−11

fortheshootingfun tion norm).

−2

−1.5

−1

−0.5

0

0.5

1

1.5

2 −2

−1.5

−1

−0.5

0

0.5

1

1.5

2 TRAJECTORY

0

0.2

0.4

0.6

0.8

1 −1

0

1 CONTROL

0

0.5

1 −1

−0.5

0

0.5

1 STATE

0

0.5

1 −2

−1

0

1

2 COSTATE

0

0.5

1 −1

−0.5

0

0.5

1

0

0.5

1 −4

−2

0

2

Figure7:

(P

2 )

-Solutionwithoneswit hfortheVanderPolos illator(shooting method).

Wenowtesttwostartingpointsforwhi htheminimaltimefun tion isnot dierentiable. Inthepreviousproblemthenondierentiabilitywas ausedbya lo alsolution be oming global(followingthefrontpropagation interpretation, twofrontsarehitting). Herethenondierentiabilityhasadierentnature. It annotbeseenasthe urveof ollisionbetweenfronts,and orrespondstothe pointswhere the ontrolswit hesbetween

−1

and

+1

. Taking su h astarting pointwehaveasolutionwitha onstant ontrol

u = ±1

and noswit hes. We testthe twostarting points

x = (1.5, −0.67)

and

x = (1, −0.57)

that are lose tothenondierentiable urve(seeFig. 6). Computationof

ξ

∗

is performedas beforeinthe ase

T

isnotdierentiable. Weobservethattheshootingmethod nds solutionswith a swit h immediately after the initial time or just before thenaltime. Here theHJBinitializationis notas loseto theinitial ostate

p(0)

,butissu ienttoobtain onvergen e. Also,theminimumtimesgivenby HJBarestill losetotheexa tones (Table6).

4.3 Goddard problem

The third exampleis the well-known Goddard problem (see for instan e [16, 18,20, 28,25, 5℄), toillustrate the aseofsingularar s. Thisproblem models theas ent ofa ro ket through theatmosphere, and werestri t hereourselves toverti al(monodimensional)traje tories. Thestatevariablesarethealtitude, speedandmassofthero ketduring theight,foratotaldimensionof3. The

(20)

x

method

p(0)

t

f

(1.5, −0.67)

HJB

(1.62, −0.87)

2.96

PMP

(1.487, 2.309 × 10

−3

₎

_2.9594

(1, −0.57)

HJB

(1.96, −0.10)

2.2

PMP

(1.715, 1.111 × 10

−2

₎

_2.1351

Table6:

(P

2 )

- Solutions with no swit hesfor starting point where the value fun tion isnotdierentiable.

ro ketis subje tto gravity,thrustanddrag for es. Thenal timeisfree, and theobje tiveisto rea ha ertainaltitudewithaminimalfuel onsumption.

(P

3 )











min J(u) =

R

t

f

0 bT

max

u

˙r = v

˙v = −

_r

1

2 +

1 m

(T

max

u − D(r, v))

˙

m = −bT

max

u

u(t) ∈ [0, 1]

r(0) = 1, v(0) = 0, m(0) = 1,

r(t

f

) ≥ 1.01

with the parameters used for instan e in [20℄:

b = 7

,

T

max

= 3.5

and drag

D(r, v) = 310v

2 _e

−500(r−1)

.

As for

(P

2 )

, the Hamiltonianis linearwith respe tto

u

, andwehavea bang-bang ontrol withpossibleswit hingsorsingularar s. Theswit hingfun tion is

ψ(x, p) = H

u

(x, p, u) = T

max

((1 − p

m

)b +

p

v

m

)

,andthesingular ontrol anbe obtainedby formallysolving

ψ = 0

¨

. Themain di ulty, however,is to deter-minethestru tureoftheoptimal ontrol,namelythenumberandapproximate lo ation of singularar s. The HJBapproa h isable to provide su h informa-tion,in additionto theinitial ostate

p(0)

. Assumingforinstan eone interior singularar ,theshootingfun tionisdened by

S

3 :





t

f

, p

1 (0), p

t

entry

2 (0), p

3 (0)

t

exit



 7→





r(t

f

) − 1.01, p

2 (t

f

), p

3 (t

f

), p

4 (t

f

)

ψ(x(t

entry

), p(t

entry

))

˙

ψ(x(t

entry

, p(t

entry

))



 .

4.3.2 Solvingthe problem with the HJBapproa h

GoddardproblemisalsohardtosolvewiththeHJBapproa h,spe iallybe ause the omputation of the value fun tion needs a huge number of iterations to onvergeandthesolutionisquitesensibletothe hoi eofthenumeri albox

Ω

in whi hthevaluefun tionis omputed. InFig. 8,

weshowthe optimaltraje toryand theoptimal ontrol omputedbyHJB onaroughgrid. Aswe ansee,theHJBapproa hdoesnotgiveagood approx-imation of theoptimal ontrol (verti al lines orrespond to strongos illations ofthesolution). TheHJBformulation ansuggestnotonlythevaluesfor

p(0)

and

t

f

,but alsothelo ationofthesingularar .

(21)

0

0.05

0.1

0.15

0.2

1

1.005

1.01

1.015 x1

0

0.05

0.1

0.15

0.2

0

0.05

0.1

0.15

0.2 x2

0

0.05

0.1

0.15

0.2

0.7

0.8

0.9

1 x3

0

0.05

0.1

0.15

0.2

0

0.2

0.4

0.6

0.8

1 u

Figure8:

(P

3 )

-Goddardproblem,solutionbyHJBapproa h(rstline: altitude andvelo ity. Se ondline: massand ontrol).

4.3.3 Coupling the HJBand PMPapproa hes

Wenow tryto initializethe shooting method dire tly from the resultsof the HJBapproa h. As for problems

(P

1 )

and

(P

2 )

, theHJBsolution providesan estimateofthenal time

t

∗

f

andinitial ostate

p(0)

. Moreover,examiningthe statevariables on the HJBsolution also givesagood ideaof the stru ture of the ontrol: the hangeof slopeonthespeed learlyvisiblein Fig. 8indi ates aninteriorsingularar at

(t

entry

, t

exit

) ≈ (0.02, 0.06)

. On e againweobtaina qui k onvergen etothe orre tsolutionwiththeexpe tedsingularar (Table 7andFig. 9)).

t

∗

f

(

t

entry

, t

exit

)

−ξ

∗

and

p(0)

0.17 (0.02, 0.06)

(−7.79, −0.31, 0.04)

SolutionfromPMP

0.1741

(0.02351, 0.06685)

(−7.275, −0.2773, 0.04382)

Table7:

(P

3 )

4.4 Minimumtimetarget problemwitha state onstraint Thisfourthexampleaimstoillustrate the aseofastate onstraint,aswellas afour-dimensionalproblemfortheHJBapproa h. We hoseasimpleproblem wherewewanttomoveapointontheplane,fromasteadyinitialpositiontoa targetposition,withanullinitialand nalspeed. The ontrol isthedire tion ofa eleration,andtheobje tiveistominimizethenaltime. Weaddastate onstraintwhi hlimitsthevelo ityofthepointalongthe

x

-axis.

(22)

0

0.5

1 CONTROL

0

0.5

1 SWITCHING FUNCTION

1

1.005

1.01 STATE

0

0.05

0.1

0.7

0.8

0.9

0

0.1

0.2 −10

−5

0 COSTATE

−0.5

0

0.5 −0.05

0

0.05 −5

0

5 x 10

−8

Figure 9:

(P

3 )

-Goddard problem,solutionbyPMPmethod.

(P

4 )











min J(x, u) = t

f

˙y

1 = y

3 ˙y

2 = y

4 ˙y

3 = cos(u)

˙y

4 = sin(u)

u(t) ∈ [0, 2π)

y(0) = x = (−3, −4, 0, 0)

y(t

f

) = (3, 4, 0, 0)

y

3 (t) ≤ 1

t ∈ (0, t

f

)

Letus writethe state onstraintsas

g(y(t)) ≤ 0

, with

g

dened by

g(y) =

y

3 − 1

. The ontrol appears expli itlyin the rst time derivativeof

g

, sothe onstraintis oforder 1,andwehave:

˙g(y(t)) = cos(u(t)),

g

y

(y) = (0, 0, 1, 0).

Whenthe onstraintisnota tive,minimizingtheHamiltoniangivestheoptimal ontrol

u

∗

via

(cos(u

∗

), sin(u

∗

)) = −

p

(p

3 , p

4 )

p

2

3 + p

2

4 .

Overa onstrainedar where

g(x) = 0

,theequation

˙g(x, u) = 0

andminimizing theHamiltonian

H

leadsto

u

∗

= −sign(p

4 )

π

2 .

Thenthe ondition

H

u

= 0

givesthevalueforthe onstraintmultiplier

µ = −p

3

. Attheentrypointwehaveajump onditionforthe ostate:

p(t

+

entry

) = p(t

−

entry

) − π

entry

g

x

,

with

π

entry

∈ R

an additional shooting unknown. Compared to the un on-strainedproblem,wehavethreemoreunknowns

t

entry

, t

exit

and

π

entry

. The or-respondingequationsaretheHamiltonian ontinuityat

t

entry

and

t

exit

(whi h

(23)

boilsdownto

p

3 = 0

),andthetangentialentry ondition

g(x(t

entry

)) = 0

. The shootingfun tionis denedby

S

4 :





p

1...4

t

f

(0)

t

entry

, t

exit

, π

entry



 7→





y

1...4

(t

f

p

) − (−3, −4, 0, 0)

5 (t

f

) − 1

p

3 (t

entry

), p

4 (t

entry

), g(y(t

entry

))



 .

0

2

4

6 −5

0

5 x

0

2

4

6 −5

0

5 y

0

2

4

6 −2

0

2

4 Vx

0

2

4

6 −2

0

2

4 Vy

Figure10:

(P

4 )

-Solutionwitha onstrainedar bytheHJBapproa h.

In g. 10, weshowthe numeri al solutionobtainedby using theHJB ap-proa h. This approa hprovides alsoapproximationsofthe optimalnal time andtheinitial ostate. ExaminingtheHJBsolutionalsogivesanestimateofthe bounds for the onstrainedar where

y

3 = 1

. The onlyshooting unknownfor whi hwewerenotabletoobtainrelevantinformationisthemultiplier

π

entry

for the ostatejump at

t

entry

. Thereforeweused

π

entry

= 0.1

asastartingguess, whi h turned out to be su ient for the shooting method to onverge prop-erly(Table8). Fig. 11 showsthe orrespondingsolution, mu h leanerthan theHJBsolutionbutwith thesamestru ture. We he kedthat the ondition

µ ≥ 0

wassatisedovertheboundaryar as

p

3

isnegative,and

p

3 = 0

atboth entryandexitofthear asrequestedbytheHamiltonian ontinuity onditions. Thea tualvalueofthemultiplierforthejumpon

p

3

is

π

entry

= 4.1294

.

t

∗

f

(

t

entry

, t

exit

)

−ξ

∗

and

p(0)

7.5 (1.35, 5.6)

(−0.51, −0.24, −0.89, −0.61)

SolutionfromPMP

7.0356

(1.1370, 5.8986)

(−0.8672, −0.0474, −0.9860, −0.1667)

Table8:

(P

4 )

CPU times. In Table 9 we nally summarize the CPU times needed for omputations.

(24)

−3

−2

−1

0

1

2 CONTROL

−4

−2

0

2

4 TRAJECTORY

−5

0

5 STATE

−5

0

5

0

0.5

1

0

1

2

3

7.03

7.04

7.05 −2

−1

0 COSTATE

−1

0

1 −4

−2

0

2 −0.2

0

0.2

0

0.5

1

Figure11:

(P

4 )

-Solutionwitha onstrainedar byPMPapproa h.

Problem1 Problem2 Problem3 Problem4 HJBapproa hwithroughdis retization

8 × 10

−2

2.98

211

182

PMPapproa hwithHJBinitialization

3 × 10

−3

7 × 10

−3

3 × 10

−2

2 × 10

−2

Shootingfun tionnormforPMP

2.82 × 10

−16

8.14 × 10

−11

1.12 × 10

−7

6.68 × 10

−11

Table9:SummaryofCPUtimesfornumeri alexperiments(se onds)and shoot-ingfun tion norm

5 Con lusions

The known relation between the gradient of the value fun tion in the HJB approa h and the ostate in the PMP approa h makes it possible to use the HJBresultsto initializeashooting method. With this ombinedmethod, one anhopetobenetfromtheoptimalityofHJBandthehighpre isionofPMP. Themainlimitationisonthestatedimensionimposed byHJB.

We have tested this approa h on four ontrol problems presenting some spe i di ulties: lo alandglobalsolutions(Problem1),dis ontinuous bang-bang ontrol(Problem2),singularar s(Problem3),state onstraint(Problem 4). Thenumeri al tests alsoin luded two ases where thevalue fun tion was notdierentiable.

Forthesefourproblems,theHJBapproa hprovidesanapproximatesolution withsomeadditionalinformation,su hasanestimateoftheinitial ostate

p(0)

, optimalnaltime

t

f

,stru tureoftheoptimalsolutionwithrespe ttosingular or onstrainedsubar s. Inea h asethisinformationallowedustosu essfully initializetheshootingmethod. Thefa t thattheoptimal ontrolre onstru ted byHJBwassometimesfarfromtheexa t ontroldidnotseemtobeproblemati for the shooting method initialization. The total omputational time for the ombined HJB-PMP approa h did not ex eed four minutes, up to dimension four.

(25)

Appendix

Proof ofTheorem3.1. Giventhenumeri aldomain

Ω

wedenetheset

Ω

′

as

Ω

′

_{:= {x ∈ R}

d

_{: ev(x; h, Ω) ≤ min}

x

′

_∈∂Ω

e

v(x

′

_{; h, Ω)}.}

Theset

Ω

is the boxin whi h the approximate solution is a tually omputed and

Ω

′

representsthe subsetof

Ω

in whi h the solutionis notae ted bythe titiousboundary onditionsweneedto impose at

∂Ω

to make omputation. Fromthefrontpropagationpointofview,

∂Ω

′

representsthefrontatthetime ittou hes

∂Ω

fortheveryrsttime.

Letus dene

v

max

:= (1 − e

−T

max

₎

andx

x ∈ Ω

′

. Wehave

T (x) ≤ T

max

< +∞

and

v(x) ≤ v

max

< 1.

By(14)wehave

e

v(x; h) ≤ v(x) + Ch

α

_{≤ v}

max

+ Ch

α

.

Sin e

v

max

< 1

thereexists

h

0 > 0

su hthat

v

max

+ Ch

α

< 1

forall

0 < h ≤ h

0

thenwe andene

ev

max

:= v

max

+ Ch

α

0 < 1

andwehave

v(x) ≤ v

max

≤ ev

max

and

e

v(x; h) ≤ ev

max

forall

x ∈ Ω

′

_,

_{0 < h ≤ h}

0 .

Foranyxed

x ∈ Ω

′

,itexists

ξ

x

∈ [min{v(x), ev(x; h)}, max{v(x), ev(x; h)}]

su h that

| e

T (x) − T (x)| =

_{ln 1 − v(x)}

_{− ln 1 − ev(x; h)}

₌

_{1 − ξ}

1 _x

|v(x) − e

v(x; h)|.

Sin e

ξ

x

≤ ev

max

,wehave

| e

T (x) − T (x)| ≤

Ch

α

1 − ev

max

forall

x ∈ Ω

′

and

0 < h ≤ h

0

andthenitexists apositive onstant

C

2

whi hdependsbytheproblem'sdata andon

Ω

su hthat

k e

T − T k

L

∞

(Ω

′

)

≤ C

2 h

α

forall

0 < h ≤ h

0 .

(15) Weare now ready to re overan estimate on thegradient of the approximate solution

T

e

. By(15)weknowthat,forany

i = 1, . . . , d

e

T (x + ze

i

) = T (x + ze

i

) + E

1

with

|E

1 | ≤ C

2 h

α

and

e

T (x − ze

i

) = T (x − ze

i

) + E

2

with

|E

2 | ≤ C

2 h

α

_.

Sowehave

e

D

i

T (x) =

e

T (x + ze

i

) + E

1 − (T (x − ze

i

) + E

2 ))

2z

= e

D

i

T (x) +

E

1 − E

2 2z

(26)

sothat

| e

D

i

T (x) − e

e

D

i

T (x)| ≤

E

1 − E

2 2z

≤ C

2 h

α

z

andthen

k e

D e

T (x) − e

DT (x)k

∞

≤ C

2 h

α

z

.

Wenallyobtain,for

x ∈ Ω

′

and

0 < h ≤ h

0

,

k e

D e

T (x)−DT (x)k

∞

≤ k e

D e

T (x)− e

DT (x)k

∞

+k e

DT (x)−DT (x)k

∞

= O

_h

α

z

+O(z

2 ₎

andthe on lusionfollows.

Referen es

[1℄ Bardi,M.,CapuzzoDol etta,I.: Optimal ontrolandvis ositysolutionsof Hamilton-Ja obi-Bellmanequations.Birkhäuser,Boston(1997)

[2℄ Bardi, M., Fal one, M., Soravia, P.: Fully dis retes hemesfor the value fun tion of pursuit-evasion games. InT. Basarand A. Haurie(eds), Ad-van esin Dynami Games and Appli ations,Annals of theInternational So ietyofDynami Games1,89-105(1994)

[3℄ Bellman,R.E.: ThetheoryofDynami Programming.Bull.Amer.Math. So .60,503515(1954)

[4℄ Bettiol,P.,Frankowska,H.: Normalityof themaximumprin iplefor non- onvex onstrainedBolzaproblems.J.DierentialEquations243,256269 (2007)

[5℄ Bonnans, F., Martinon, P., Trélat, E.: Singular ar s in the generalized Goddard'sproblem.J.Optim.TheoryAppl.139,439461(2008)

[6℄ Bryson,A.E.,Ho,Y.-C.: Appliedoptimal ontrol.HemispherePublishing, New-York(1975)

[7℄ Cannarsa,P.,Frankowska,H.: Some hara terizationsofoptimal traje to-riesin ontroltheory.SIAMJ.ControlOptim.28,13221347(1991) [8℄ Cannarsa, P., Frankowska, H., Sinestrari, C.: Optimality onditions and

synthesisfortheminimumtimeproblem. SetValuedAnalysis8, 127148 (2000)

[9℄ Cernea, A., Frankowska,H.: A onne tionbetweenthe maximum prin i-pleand dynami programmingfor onstrained ontrol problems.SIAMJ. ControlOptim. 44,673703(2006)

[10℄ Chitour, Y., Jean, F., Trélat, E.: Generi ity results for singular urves. JournalofDierentialGeometry,73, 4573(2006)

[11℄ Clarke,F.,Vinter,R.B.: Therelationshipbetweenthemaximumprin iple anddynami programming.SIAMJ.ControlOptim.25,12911311(1987)

(27)

[12℄ Deuhard,P.: NewtonMethodsforNonlinearProblems.SpringerSeriesin ComputationalMathemati s,35(2004)

[13℄ Fal one,M.: Numeri alsolutionof dynami programmingequations. Ap-pendixA in[1℄.

[14℄ Fal one,M.: Numeri almethodsfordierentialgamesbasedonpartial dif-ferentialequations.InternationalGameTheory Review8,231272(2006) [15℄ Gergaud,J.,Martinon.P.: Usingswit hingdete tionandvariational

equa-tions for the shooting method. Optim. Control Appl. Meth. 28, 95116 (2007)

[16℄ Goddard,R.H.: AMethodofrea hingextremealtitudes.SmithsonianInst. Mis .Coll.71,(1919)

[17℄ Hairer, E., Nørsett, S.P., Wanner, G.: Solvingordinarydierential equa-tionsI.SpringerSeriesinComputationalMathemati s8,Springer-Verlag, Berlin(1993)

[18℄ Maurer,H.: Numeri alsolutionofsingular ontrolproblemsusingmultiple shootingte hniques.J.Optim.TheoryAppl.18,235257(1976)

[19℄ More, J.J.,Garbow,B.S., Hillstrom, K.E.: UserGuidefor MINIPACK-1. Argonne NationalLaboratoryReportANL-80-74(1980)

[20℄ Oberle,H.J.: Numeri al omputationofsingular ontrolfun tionsintra je -toryoptimization.JournalofGuidan e,ControlandDynami s13,153159 (1990)

[21℄ Pes h, H.J.: A pra ti alguide to the solution of real-lifeoptimal ontrol problems. ControlandCyberneti s23,760(1994)

[22℄ Pes h, H.J.: Real-time omputation of feedba k ontrols for onstrained optimal ontrolproblemsII:a orre tionmethodbasedonmultiple shoot-ing.OptimalControl,Appli ationsandMethods10,147-171(1989) [23℄ Pes h, H.J., Plail, M.: The maximumprin ipleof optimal ontrol: a

his-tory ofingeniousideaandmissed opportunities.to appearin Controland Cyberneti s.

[24℄ Pontryagin,L.S., Boltyanski,V.G., Gamkrelidze,R.V. Misht henko,E.F.: The mathemati al theory of optimal pro esses. Wiley Inters ien e, New York (1962)

[25℄ Seywald,H.,Cli,E.M.: Goddardproblemin presen eofadynami pres-surelimit.JournalofGuidan e,Control,andDynami s16,776781(1993) [26℄ Soravia, P.: Estimates of onvergen e of fully dis rete s hemes for the Isaa s equation of pursuit-evasion dierential games via maximum prin- iple.SIAMJ.ControlOptim. 36,111(1998)

[27℄ Tsai,Y.R.,Cheng,L.T.,Osher,S.,Zhao,H.: Fastsweepingalgorithmsfor a lass of Hamilton-Ja obiequations. SIAMJ.Numer.Anal. 41, 673694 (2003)

(28)

[28℄ Tsiotras,P.,Kelley,H.J.: Drag-lawee tsintheGoddardproblem. Auto-mati a27,481-490(1991)

(29)

Parc Orsay Université - ZAC des Vignes

4, rue Jacques Monod - 91893 Orsay Cedex (France)

Centre de recherche INRIA Bordeaux – Sud Ouest : Domaine Universitaire - 351, cours de la Libération - 33405 Talence Cedex

Centre de recherche INRIA Grenoble – Rhône-Alpes : 655, avenue de l’Europe - 38334 Montbonnot Saint-Ismier

Centre de recherche INRIA Lille – Nord Europe : Parc Scientifique de la Haute Borne - 40, avenue Halley - 59650 Villeneuve d’Ascq

Centre de recherche INRIA Nancy – Grand Est : LORIA, Technopôle de Nancy-Brabois - Campus scientifique

615, rue du Jardin Botanique - BP 101 - 54602 Villers-lès-Nancy Cedex

Centre de recherche INRIA Paris – Rocquencourt : Domaine de Voluceau - Rocquencourt - BP 105 - 78153 Le Chesnay Cedex

Centre de recherche INRIA Rennes – Bretagne Atlantique : IRISA, Campus universitaire de Beaulieu - 35042 Rennes Cedex

Centre de recherche INRIA Sophia Antipolis – Méditerranée : 2004, route des Lucioles - BP 93 - 06902 Sophia Antipolis Cedex

Éditeur

INRIA - Domaine de Voluceau - Rocquencourt, BP 105 - 78153 Le Chesnay Cedex (France)

Initialization of the shooting method via the Hamilton-Jacobi-Bellman approach

HAL Id: inria-00439543

https://hal.inria.fr/inria-00439543

Submitted on 7 Dec 2009

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

Hamilton-Jacobi-Bellman approach

Emiliano Cristiani, Pierre Martinon

To cite this version:

Emiliano Cristiani, Pierre Martinon. Initialization of the shooting method via the

Hamilton-Jacobi-Bellman approach. Journal of Optimization Theory and Applications, Springer Verlag, 2010, 146 (2),

pp.321-346. �10.1007/s10957-010-9649-6�. �inria-00439543�

a p p o r t

d e r e c h e r c h e

N

0

2

4

9

-6

3

9

9

IS

R

N

IN

R

IA

/R

R

--7

1

3

9

--F

R

+

E

N

G