DOI: 10.1051/cocv/2011187, www.esaim-cocv.org

DYNAMIC PROGRAMMING PRINCIPLE FOR STOCHASTIC RECURSIVE OPTIMAL CONTROL PROBLEM WITH DELAYED SYSTEMS

Li Chen¹ and Zhen Wu²

Abstract. In this paper, we study a class of stochastic recursive optimal control problems for systems described by stochastic differential equations with delay (SDDE). In our framework, not only the dynamics of the system but also the recursive utility depend on the past path segment of the state process in a general form. We give the dynamic programming principle for this kind of optimal control problem and show that the value function is the viscosity solution of the corresponding infinite-dimensional Hamilton-Jacobi-Bellman partial differential equation.

Mathematics Subject Classification. 49L20, 60H10, 93E20.

Received June 7, 2010. Revised May 6, 2011. Published online 16 January 2012.

1. Introduction

The classical stochastic control system is governed by a nonlinear stochastic differential equation (SDE). This kind of stochastic optimal control problem has been studied extensively, both by the dynamic programming approach and by the Pontryagin stochastic maximum principle. In our paper, we are concerned with the dynamic programming principle. There are many works on this subject, such as Yong and Zhou [13] for the classical stochastic control system, and Peng [10,11] and Wu and Yu [12] for the stochastic recursive case.

Research on many natural and social phenomena shows that the future development of many processes depends not only on their present state but also, essentially, on their previous history. Such processes can be described by stochastic differential delayed equations (SDDE). Many examples can be found in Mohammed [8,9].

Although the dynamic programming principle can also be extended to stochastic control problems with delay (see e.g. [6]), most problems remain practically intractable because of the complex infinite-dimensional state space framework. In [7], Larssen and Risebro consider a class of optimal consumption problems for stochastic delayed systems in some special cases, which shows the financial applicability of this kind of dynamic programming principle.

Keywords and phrases. Stochastic differential equation with delay, recursive optimal control problem, dynamic programming principle, Hamilton-Jacobi-Bellman equation.

This work is partly supported by the Natural Science Foundation of P.R. China (10921101) and Shandong Province (JQ200801 and 2008BS01024), the National Basic Research Program of P.R. China (973 Program, No. 2007CB814904) and the Science Fund for Distinguished Young Scholars of Shandong University (2009JQ004), the Fundamental Research Funds for the Central Universities (2010QS05).

1 Department of Mathematics, China University of Mining & Technology, Beijing 100083, P.R. China. chenli@cumtb.edu.cn

2 School of Mathematics, Shandong University, Jinan 250100, P.R. China. wuzhen@sdu.edu.cn

Article published by EDP Sciences. © EDP Sciences, SMAI 2012.


However, Duffie and Epstein [2] showed that the personal utility at time t is not only a function of the instantaneous consumption rate but also of the future utility (corresponding to future consumption). Also, from [3], we know that this kind of recursive utility can be described by a backward stochastic differential equation (BSDE), and that stochastic differential recursive utility is an extension of the standard additive one. So one of our goals in this paper is to obtain the dynamic programming principle for the delayed stochastic optimal control problem with recursive utility.

For this, we study a class of delayed stochastic recursive optimal control problems whose cost functional is described by the solution of a BSDE. We prove that the celebrated dynamic programming principle still holds for this kind of optimal control problem. Because of the absence of an Itô formula for functions of the history of the state, and the infinite-dimensional difficulty, it is not easy to obtain the Hamilton-Jacobi-Bellman (HJB) equation satisfied by the optimal value function. To overcome this difficulty, with the help of the theory of the generator of the associated semigroup and the semigroup properties, we obtain the corresponding HJB equation and show that the value function of the recursive optimal control problem is the viscosity solution of the HJB equation.

The paper is organized as follows. In Section 2, we present some notation used throughout the paper and formulate the recursive optimal control problem with delay. In Section 3, we prove that the celebrated dynamic programming principle still holds in our framework. In Section 4, we give some properties of the optimal value function, obtain the corresponding HJB equation, and show that the optimal value function is the viscosity solution of the HJB equation. Moreover, in a special case, we obtain uniqueness of the viscosity solution.

2. Notations and formulation of the problem

Let $(\Omega,\mathcal{F},P)$ be a probability space, and let $0\le T<\infty$ denote a fixed terminal time. Let $0\le\delta<\infty$ be a fixed constant, and let $[-\delta,0]$ be the duration of the bounded delay of the systems considered in our paper. For the sake of simplicity, denote by $C:=C([-\delta,0];\mathbb{R}^n)$ the Banach space of continuous paths $\gamma:[-\delta,0]\to\mathbb{R}^n$ with norm $\|\gamma\|_C:=\sup_{-\delta\le s\le0}|\gamma(s)|$, where $|\cdot|$ is the Euclidean norm on $\mathbb{R}^n$. Let $n,d\ge1$ be integers.

If $\varphi\in C([-\delta,T];\mathbb{R}^n)$ and $0\le t\le T$, let $\varphi_t$ be defined by
$$\varphi_t(\theta)=\varphi(t+\theta),\qquad-\delta\le\theta\le0.\tag{2.1}$$
We note that $\varphi_t$ is the segment of the path of $\varphi$ from $t-\delta$ to $t$.

Throughout the paper, let $\{W(t)\}_{0\le t\le T}$ be a $d$-dimensional Brownian motion on the completed filtered probability space $(\Omega,\mathcal{F},P;\{\mathcal{F}_t\}_{t\ge0})$, where $\mathcal{F}_t$ is the natural filtration of $\{W(t)\}$ and $\mathcal{F}_0$ contains all $P$-null sets of $\mathcal{F}$.

We also use the following notations in our paper:

$$L^2(\Omega,\mathcal{F},P)=\big\{\xi\ :\ \xi\text{ is an }\mathcal{F}_T\text{-measurable random variable s.t. }E|\xi|^2<+\infty\big\},$$
$$H^2_{\mathcal{F}}(0,T)=\Big\{(\psi_t)_{0\le t\le T}\ :\ \psi\text{ is an adapted process s.t. }E\int_0^T|\psi_t|^2\,dt<+\infty\Big\},$$
$$S^2_{\mathcal{F}}(0,T)=\Big\{(\psi_t)_{0\le t\le T}\ :\ \psi\text{ is an adapted process s.t. }E\sup_{0\le t\le T}|\psi_t|^2<+\infty\Big\},$$
$$L^2(\Omega,C;\mathcal{F}_t)=\big\{\varphi:\Omega\to C\ :\ \varphi\text{ is }\mathcal{F}_t\text{-measurable s.t. }\|\varphi\|^2_{L^2(\Omega,C)}:=E[\|\varphi(\omega)\|^2_C]<\infty\big\}.$$

Moreover, we denote by $(\cdot|\cdot)$ the inner product in $L^2(\Omega,C;\mathcal{F}_t)$ and by $\langle\cdot,\cdot\rangle$ the inner product in $\mathbb{R}^n$. Given $\varphi$ and $\phi$ in $C$, we define
$$(\varphi|\phi)=\int_{-\delta}^{0}\langle\varphi(\theta),\phi(\theta)\rangle\,d\theta,\qquad\|\varphi\|_2=(\varphi|\varphi)^{1/2}.$$


We introduce the admissible control set $\mathcal{U}_{ad}$ defined by
$$\mathcal{U}_{ad}:=\{v(\cdot)\in H^2_{\mathcal{F}}(0,T)\mid v(\cdot)\text{ takes values in }U\subset\mathbb{R}^k\}.$$
Here $U$ is a compact subset of $\mathbb{R}^k$; an element of $\mathcal{U}_{ad}$ is called an admissible control.

For a given admissible control, we consider the following system of controlled stochastic functional differential equations with bounded memory:
$$\begin{cases} dX(t)=b(t,X_t,v(t))\,dt+\sigma(t,X_t,v(t))\,dW(t), & t\in[s,T],\\ X_s=\varphi_s. \end{cases}\tag{2.2}$$
Here $X_s$ and $\varphi_s$ are defined as in (2.1), i.e., for any $0\le s\le T$, $X_s(\theta)=X(s+\theta)$, $\varphi_s(\theta)=\varphi(s+\theta)$, $-\delta\le\theta\le0$, and $\varphi\in L^2(\Omega,C;\mathcal{F}_s)$ is regarded as the initial path. The maps
$$b:[0,T]\times C\times U\to\mathbb{R}^n,\qquad\sigma:[0,T]\times C\times U\to\mathbb{R}^{n\times d}$$
satisfy the following assumptions:

(A1) $b$ and $\sigma$ are continuous in $t$;

(A2) for each integer $m\ge1$, there is a constant $L_m>0$ (independent of $t$) such that
$$|b(t,\phi,v)-b(t,\hat\phi,\hat v)|+|\sigma(t,\phi,v)-\sigma(t,\hat\phi,\hat v)|\le L_m\big(\|\phi-\hat\phi\|_C+|v-\hat v|\big)$$
for any $0\le t\le T$, $\phi,\hat\phi\in C$ with $\|\phi\|_C\le L_m$, $\|\hat\phi\|_C\le L_m$ and $v,\hat v\in U$;

(A3) there is a constant $K>0$ such that
$$|b(t,\phi,v)|+|\sigma(t,\phi,v)|\le K\big(1+\|\phi\|_C+|v|\big)$$
for any $0\le t\le T$, $\phi\in C$, $v\in U$.

Under the above assumptions, for any $v(\cdot)\in\mathcal{U}_{ad}$, the control system (2.2) with aftereffect has a unique strong solution $\{X^{s,\varphi;v}(t),\ 0\le s\le t\le T\}$, and we have the following estimates by the existence and uniqueness theorem for SDDE in Mohammed [8,9]:

Proposition 2.1. For all $s\in[0,T]$, $\varphi,\hat\varphi\in L^2(\Omega,C;\mathcal{F}_s)$, $v(\cdot),\hat v(\cdot)\in\mathcal{U}_{ad}$,
$$E^{\mathcal{F}_s}\big[\|X^{s,\varphi;v}_t\|^2_C\big]\le\Lambda(1+\|\varphi\|^2_C).\tag{2.3}$$
If we define the mapping
$$T^s_t:L^2(\Omega,C;\mathcal{F}_s)\times U\to L^2(\Omega,C;\mathcal{F}_t),\qquad(\varphi,v)\mapsto X^{s,\varphi;v}_t,$$
then we have
$$E^{\mathcal{F}_s}\big[\|T^s_t(\varphi,v)-T^s_t(\hat\varphi,\hat v)\|^2_C\big]\le\Lambda\Big(\|\varphi-\hat\varphi\|^2_C+E^{\mathcal{F}_s}\int_s^T|v(t)-\hat v(t)|^2\,dt\Big).\tag{2.4}$$

Remark 2.2. From now on, let $\Lambda$ be a generic constant which may change from line to line throughout our paper.
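As a purely illustrative aside (not part of the paper), the controlled state dynamics (2.2) can be simulated with an Euler-Maruyama scheme in which the discretized segment $X_t$ is carried along as a sliding window over the trajectory. The function name, the linear delayed drift and the constant diffusion below are our own choices, and the control is folded into the coefficients for brevity:

```python
import numpy as np

def simulate_sdde(b, sigma, phi, s, T, delta, dt, rng):
    """Euler-Maruyama scheme for a delayed SDE of the form (2.2):
        dX(t) = b(t, X_t) dt + sigma(t, X_t) dW(t),   X_s = phi_s,
    where X_t is the segment {X(t + theta) : -delta <= theta <= 0}.
    phi: initial path on [s - delta, s], sampled on the dt-grid.
    Returns the whole trajectory on [s - delta, T]."""
    n_hist = int(round(delta / dt))            # grid points in the memory window
    n_steps = int(round((T - s) / dt))
    X = np.empty(n_hist + n_steps + 1)
    X[: n_hist + 1] = phi                      # initial segment phi_s
    for k in range(n_steps):
        t = s + k * dt
        seg = X[k : k + n_hist + 1]            # discretized segment X_t
        dW = rng.normal(0.0, np.sqrt(dt))
        X[n_hist + k + 1] = X[n_hist + k] + b(t, seg) * dt + sigma(t, seg) * dW
    return X

# illustrative run: drift -X(t - delta) (seg[0] is the oldest point), sigma = 0.2
rng = np.random.default_rng(0)
dt, delta, s, T = 0.01, 0.5, 0.0, 1.0
phi = np.ones(int(round(delta / dt)) + 1)      # constant initial path
path = simulate_sdde(lambda t, seg: -seg[0], lambda t, seg: 0.2,
                     phi, s, T, delta, dt, rng)
```

Here `seg[0]` is $X(t-\delta)$ and `seg[-1]` is $X(t)$, so arbitrary functionals of the past segment (pointwise delays, moving averages) fit the same interface.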

Now, for any admissible control $v(\cdot)\in\mathcal{U}_{ad}$ and $0\le s\le t\le T$, we consider the following BSDE:
$$Y^{s,\varphi;v}(t)=\Phi(X^{s,\varphi;v}_T)+\int_t^T f\big(r,X^{s,\varphi;v}_r,Y^{s,\varphi;v}(r),Z^{s,\varphi;v}(r),v(r)\big)\,dr-\int_t^T Z^{s,\varphi;v}(r)\,dW(r),\tag{2.5}$$
where
$$f:[0,T]\times C\times\mathbb{R}\times\mathbb{R}^d\times U\to\mathbb{R},\qquad\Phi:C\to\mathbb{R}$$
satisfy the following conditions:

(A4) $f$ is $\mathcal{F}_t$-measurable and continuous in $t$;

(A5) for some constant $L>0$ and for any $\phi,\hat\phi\in C$, $y,\hat y\in\mathbb{R}$, $z,\hat z\in\mathbb{R}^d$, $v,\hat v\in U$, a.s.,
$$|f(t,\phi,y,z,v)-f(t,\hat\phi,\hat y,\hat z,\hat v)|+|\Phi(\phi)-\Phi(\hat\phi)|\le L\big(\|\phi-\hat\phi\|_C+|y-\hat y|+|z-\hat z|+|v-\hat v|\big),$$
and for all $(\phi,v)\in C\times U$ we have
$$|f(t,\phi,0,0,v)|+|\Phi(\phi)|\le K(1+\|\phi\|_C).$$

Then, from the classical BSDE theory, BSDE (2.5) has a unique solution pair $(Y^{s,\varphi;v},Z^{s,\varphi;v})\in S^2_{\mathcal{F}}(0,T)\times H^2_{\mathcal{F}}(0,T)$. Moreover, we have the following estimates for the solution of (2.5).

Proposition 2.3. Let (A1)–(A5) hold. Then we have
$$E^{\mathcal{F}_s}\Big[\sup_{s\le t\le T}|Y^{s,\varphi;v}(t)|^2+\int_s^T|Z^{s,\varphi;v}(t)|^2\,dt\Big]\le\Lambda(1+\|\varphi\|^2_C)\tag{2.6}$$
and
$$E^{\mathcal{F}_s}\Big[\sup_{s\le t\le T}|Y^{s,\varphi;v}(t)-Y^{s,\hat\varphi;\hat v}(t)|^2+\int_s^T|Z^{s,\varphi;v}(t)-Z^{s,\hat\varphi;\hat v}(t)|^2\,dt\Big]\le\Lambda\|\varphi-\hat\varphi\|^2_C+\Lambda E^{\mathcal{F}_s}\int_s^T|v(t)-\hat v(t)|^2\,dt.\tag{2.7}$$

Given a control process $v(\cdot)\in\mathcal{U}_{ad}$, we introduce the associated cost functional
$$J(s,\varphi;v(\cdot)):=Y^{s,\varphi;v}(t)\big|_{t=s},\qquad(s,\varphi)\in[0,T]\times C.\tag{2.8}$$
For each initial datum $(s,\varphi)\in[0,T]\times C$, the optimal control problem is to find $v(\cdot)\in\mathcal{U}_{ad}$ so as to maximize the objective function $J$.

Remark 2.4. The problem formulated above is a stochastic recursive optimal control problem with delay. In the financial market, $X^{s,\varphi;v}(t)$ can represent the wealth of the investor and $Y^{s,\varphi;v}(t)$ the recursive utility cost functional (see, for example, Ref. [3]).
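For intuition only (our own sketch, not from the paper): in the special additive case where the driver $f$ does not depend on $(Y,Z)$, the solution of BSDE (2.5) at time $s$ reduces to the conditional expectation $Y(s)=E[\Phi(X(T))+\int_s^T f(r,X(r))\,dr]$, so the cost functional $J(s,\varphi;v)$ can be estimated by plain Monte Carlo. The segment dependence is suppressed here; general recursive utilities require a genuine BSDE solver.

```python
import numpy as np

def recursive_cost_additive(f, Phi, paths, times):
    """Monte Carlo estimate of J(s, phi; v) = Y^{s,phi;v}(s) when the driver
    f in BSDE (2.5) does not depend on (Y, Z), so that
        Y(s) = E[ Phi(X(T)) + int_s^T f(r, X(r)) dr ].
    paths: array (n_sims, len(times)) of simulated state values."""
    dt = np.diff(times)
    running = (f(times[:-1], paths[:, :-1]) * dt).sum(axis=1)  # time integral
    return float(np.mean(running + Phi(paths[:, -1])))

# sanity check: X = W (Brownian motion), f = 0, Phi(x) = x^2  =>  J = T = 1
rng = np.random.default_rng(1)
times = np.linspace(0.0, 1.0, 101)
dW = rng.normal(0.0, 0.1, size=(20000, 100))   # sqrt(dt) = 0.1
paths = np.concatenate([np.zeros((20000, 1)), dW.cumsum(axis=1)], axis=1)
J = recursive_cost_additive(lambda t, x: 0.0 * x, lambda x: x**2, paths, times)
```

The sanity check uses $E[W(1)^2]=1$, which the estimate recovers up to Monte Carlo error.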

3. Dynamic programming principle of the problem

In this section, we will prove that the dynamic programming principle still holds for the above optimization problem.

We define the value function of the optimal control problem

$$u(s,\varphi):=\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}J(s,\varphi;v(\cdot)),\qquad(s,\varphi)\in[0,T]\times C.\tag{3.1}$$
For each $s>0$, we denote by $\{\mathcal{F}^s_t,\ s\le t\le T\}$ the natural filtration of the Brownian motion $\{W(t)-W(s),\ s\le t\le T\}$, augmented by the $P$-null sets of $\mathcal{F}$. We also introduce the following subspaces of $\mathcal{U}_{ad}$:
$$\mathcal{U}^s_{ad}:=\{v(\cdot)\in\mathcal{U}_{ad}\mid v(t)\text{ is }\mathcal{F}^s_t\text{-progressively measurable},\ \forall\,s\le t\le T\},$$
$$\bar{\mathcal{U}}^s_{ad}:=\Big\{v(t)=\sum_{j=1}^N v_j(t)1_{A_j}\ \Big|\ v_j(\cdot)\in\mathcal{U}^s_{ad},\ \{A_j\}_{j=1}^N\text{ is a partition of }(\Omega,\mathcal{F}_s),\ N=1,2,\dots\Big\}.$$


Lemma 3.1.
$$X^{s,\varphi;\sum_{j=1}^N v_j1_{A_j}}_t=\sum_{j=1}^N 1_{A_j}X^{s,\varphi;v_j}_t,$$
$$Y^{s,\varphi;\sum_{j=1}^N v_j1_{A_j}}(t)=\sum_{j=1}^N 1_{A_j}Y^{s,\varphi;v_j}(t),$$
$$Z^{s,\varphi;\sum_{j=1}^N v_j1_{A_j}}(t)=\sum_{j=1}^N 1_{A_j}Z^{s,\varphi;v_j}(t).$$

Proof. The main idea of the proof is essentially the same as that in Peng [11], hence we only give a sketch of the points that differ. If $v=\sum_{j=1}^N 1_{A_j}v_j$, for each $j$ we denote
$$(X^j(t),Y^j(t),Z^j(t))=(X^{s,\varphi;v}(t),Y^{s,\varphi;v}(t),Z^{s,\varphi;v}(t))\big|_{v=v_j},$$
and $X^j_r=X_r|_{v=v_j}$. We can see that $X^j(t)$, $Y^j(t)$, $Z^j(t)$ are the solutions of the following SDDE and BSDE, respectively:
$$X^j(t)=\varphi(s)+\int_s^t b(r,X^j_r,v_j(r))\,dr+\int_s^t\sigma(r,X^j_r,v_j(r))\,dW(r),\qquad t\in[s,T],$$
$$Y^j(t)=\Phi(X^j_T)+\int_t^T f(r,X^j_r,Y^j(r),Z^j(r),v_j(r))\,dr-\int_t^T Z^j(r)\,dW(r),\qquad t\in[s,T].$$

Multiplying both sides of the above equations by $1_{A_j}$ and summing over $j$, we have
$$\sum_{j=1}^N 1_{A_j}X^j(t)=\sum_{j=1}^N 1_{A_j}\varphi(s)+\int_s^t b\Big(r,\sum_{j=1}^N 1_{A_j}X^j_r,\sum_{j=1}^N 1_{A_j}v_j(r)\Big)dr+\int_s^t\sigma\Big(r,\sum_{j=1}^N 1_{A_j}X^j_r,\sum_{j=1}^N 1_{A_j}v_j(r)\Big)dW(r),$$
$$\sum_{j=1}^N 1_{A_j}Y^j(t)=\Phi\Big(\sum_{j=1}^N 1_{A_j}X^j_T\Big)+\int_t^T f\Big(r,\sum_{j=1}^N 1_{A_j}X^j_r,\sum_{j=1}^N 1_{A_j}Y^j(r),\sum_{j=1}^N 1_{A_j}Z^j(r),\sum_{j=1}^N 1_{A_j}v_j(r)\Big)dr-\int_t^T\sum_{j=1}^N 1_{A_j}Z^j(r)\,dW(r),$$

by virtue of $\sum_j\Phi(x_j)1_{A_j}=\Phi\big(\sum_j x_j1_{A_j}\big)$. Since $\sum_{j=1}^N 1_{A_j}\varphi=\varphi$ and by the uniqueness of solutions for the SDDE and the BSDE, we have
$$X^{s,\varphi;v}(t)=X^{s,\varphi;\sum_{j=1}^N 1_{A_j}v_j}(t)=\sum_{j=1}^N 1_{A_j}X^j(t),$$
$$Y^{s,\varphi;v}(t)=Y^{s,\varphi;\sum_{j=1}^N 1_{A_j}v_j}(t)=\sum_{j=1}^N 1_{A_j}Y^j(t),$$
$$Z^{s,\varphi;v}(t)=Z^{s,\varphi;\sum_{j=1}^N 1_{A_j}v_j}(t)=\sum_{j=1}^N 1_{A_j}Z^j(t).$$
It also follows that
$$X^{s,\varphi;v}_t=X^{s,\varphi;\sum_{j=1}^N 1_{A_j}v_j}_t=\sum_{j=1}^N 1_{A_j}X^j_t.\qquad\square$$

Proposition 3.2. Under the assumptions (A1)–(A5), the value function $u(s,\varphi)$ defined in (3.1) is a deterministic function.

Proof. Firstly, we will prove that
$$\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}J(s,\varphi;v(\cdot))=\operatorname*{ess\,sup}_{v(\cdot)\in\bar{\mathcal{U}}^s_{ad}}J(s,\varphi;v(\cdot)).\tag{3.2}$$
From the fact that $\bar{\mathcal{U}}^s_{ad}$ is a subset of $\mathcal{U}_{ad}$, we have
$$\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}J(s,\varphi;v(\cdot))\ge\operatorname*{ess\,sup}_{v(\cdot)\in\bar{\mathcal{U}}^s_{ad}}J(s,\varphi;v(\cdot)).$$
Consequently, we only need to prove the reverse inequality. For any $v(\cdot),\hat v(\cdot)\in\mathcal{U}_{ad}$, we have
$$E\{|Y^{s,\varphi;v}(s)-Y^{s,\varphi;\hat v}(s)|\}\le\Lambda E\int_s^T|v(t)-\hat v(t)|^2\,dt$$
by Proposition 2.3. Note that $\bar{\mathcal{U}}^s_{ad}$ is dense in $\mathcal{U}_{ad}$ under the norm of $H^2_{\mathcal{F}}(0,T)$, so we know that, for each $v(\cdot)\in\mathcal{U}_{ad}$, there exists a sequence $\{v_n(\cdot)\}_{n=1}^\infty\subset\bar{\mathcal{U}}^s_{ad}$ such that
$$\lim_{n\to\infty}E\{|Y^{s,\varphi;v_n}(s)-Y^{s,\varphi;v}(s)|^2\}=0.$$
Hence, there exists a subsequence, which we also denote by $\{v_n(\cdot)\}_{n=1}^\infty$, such that
$$\lim_{n\to\infty}Y^{s,\varphi;v_n}(s)=Y^{s,\varphi;v}(s)\quad\text{a.s.},$$
i.e.,
$$\lim_{n\to\infty}J(s,\varphi;v_n(\cdot))=J(s,\varphi;v(\cdot))\quad\text{a.s.}$$
By the arbitrariness of $v(\cdot)$ and the definition of the essential supremum, we get
$$\operatorname*{ess\,sup}_{v(\cdot)\in\bar{\mathcal{U}}^s_{ad}}J(s,\varphi;v(\cdot))\ge\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}J(s,\varphi;v(\cdot)).$$
Then (3.2) is proved.

Secondly, we want to prove
$$\operatorname*{ess\,sup}_{v(\cdot)\in\bar{\mathcal{U}}^s_{ad}}J(s,\varphi;v(\cdot))=\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}^s_{ad}}J(s,\varphi;v(\cdot)).\tag{3.3}$$
Obviously,
$$\operatorname*{ess\,sup}_{v(\cdot)\in\bar{\mathcal{U}}^s_{ad}}J(s,\varphi;v(\cdot))\ge\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}^s_{ad}}J(s,\varphi;v(\cdot)).$$
On the other hand, from Lemma 3.1, we have
$$J(s,\varphi;v(\cdot))=J\Big(s,\varphi;\sum_{j=1}^N 1_{A_j}v_j(\cdot)\Big)=\sum_{j=1}^N 1_{A_j}J(s,\varphi;v_j(\cdot))$$
for all $v(\cdot)\in\bar{\mathcal{U}}^s_{ad}$. Note that the $v_j(\cdot)$ $(j=1,2,\dots,N)$ are $\{\mathcal{F}^s_t\}$-progressively measurable, so $J(s,\varphi;v_j(\cdot))$ $(j=1,2,\dots,N)$ are deterministic. Without loss of generality, we assume that
$$J(s,\varphi;v_1(\cdot))\ge J(s,\varphi;v_j(\cdot)),\qquad\forall\,j=2,3,\dots,N.$$

Then
$$J(s,\varphi;v(\cdot))\le J(s,\varphi;v_1(\cdot))\le\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}^s_{ad}}J(s,\varphi;v(\cdot)).$$
Since $v(\cdot)$ is arbitrary, we obtain
$$\operatorname*{ess\,sup}_{v(\cdot)\in\bar{\mathcal{U}}^s_{ad}}J(s,\varphi;v(\cdot))\le\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}^s_{ad}}J(s,\varphi;v(\cdot)).$$
The desired equality (3.3) is obtained.

From the definition of $\mathcal{U}^s_{ad}$, we know that the cost functional $J(s,\varphi;v(\cdot))$ is deterministic when $v(\cdot)\in\mathcal{U}^s_{ad}$. Hence we get that
$$u(s,\varphi)=\sup_{v(\cdot)\in\mathcal{U}^s_{ad}}J(s,\varphi;v(\cdot))$$
is deterministic, which is our desired result. □

Next, we explore the continuity of the value function $u(s,\varphi)$ with respect to $\varphi$.

Lemma 3.3. For each $s\in[0,T]$ and $\varphi,\hat\varphi\in C$, we have
(i) $|u(s,\varphi)-u(s,\hat\varphi)|^2\le\Lambda\|\varphi-\hat\varphi\|^2_C$;
(ii) $|u(s,\varphi)|\le\Lambda(1+\|\varphi\|_C)$.

Proof. From Proposition 2.3, for each admissible control $v(\cdot)\in\mathcal{U}_{ad}$, we have
$$|J(s,\varphi;v(\cdot))|\le\Lambda(1+\|\varphi\|_C),\tag{3.4}$$
$$|J(s,\varphi;v(\cdot))-J(s,\hat\varphi;v(\cdot))|^2\le\Lambda\|\varphi-\hat\varphi\|^2_C.\tag{3.5}$$
On the other hand, for each $\varepsilon>0$, there exist $v(\cdot),\hat v(\cdot)\in\mathcal{U}_{ad}$ s.t.
$$J(s,\varphi;\hat v(\cdot))\le u(s,\varphi)\le J(s,\varphi;v(\cdot))+\varepsilon,\qquad J(s,\hat\varphi;v(\cdot))\le u(s,\hat\varphi)\le J(s,\hat\varphi;\hat v(\cdot))+\varepsilon.$$
Then from (3.4), we get
$$-\Lambda(1+\|\varphi\|_C)\le J(s,\varphi;\hat v(\cdot))\le u(s,\varphi)\le J(s,\varphi;v(\cdot))+\varepsilon\le\Lambda(1+\|\varphi\|_C)+\varepsilon.$$
Then (ii) follows from the arbitrariness of $\varepsilon$. Similarly,
$$J(s,\varphi;\hat v(\cdot))-J(s,\hat\varphi;\hat v(\cdot))-\varepsilon\le u(s,\varphi)-u(s,\hat\varphi)\le J(s,\varphi;v(\cdot))-J(s,\hat\varphi;v(\cdot))+\varepsilon,$$
$$|u(s,\varphi)-u(s,\hat\varphi)|\le\max\{|J(s,\varphi;v(\cdot))-J(s,\hat\varphi;v(\cdot))|,\ |J(s,\varphi;\hat v(\cdot))-J(s,\hat\varphi;\hat v(\cdot))|\}+\varepsilon,$$
$$|u(s,\varphi)-u(s,\hat\varphi)|^2\le2\max\{|J(s,\varphi;v(\cdot))-J(s,\hat\varphi;v(\cdot))|^2,\ |J(s,\varphi;\hat v(\cdot))-J(s,\hat\varphi;\hat v(\cdot))|^2\}+2\varepsilon^2\le2\Lambda\|\varphi-\hat\varphi\|^2_C+2\varepsilon^2.$$
So (i) holds. □


From (2.8), we know $Y^{s,\varphi;v}(s)=J(s,\varphi;v(\cdot))$ for deterministic $\varphi\in C$. For $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$, we need the following result:

Lemma 3.4. For any bounded $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$, we can find a sequence
$$\eta_m=\sum_{j=1}^m 1_{A_j}\varphi_j\tag{3.6}$$
which converges to $\zeta$ in $L^2(\Omega,C;\mathcal{F}_s)$, where $\varphi_j$ is a function defined on $[-\delta,0]$ for $1\le j\le m$ and $\{A_j\}_{j=1}^m$ is a partition of $(\Omega,\mathcal{F}_s)$.

Proof. For any bounded $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$, we also write it as $\zeta(r,\omega)$, $r\in[s-\delta,s]$. We define
$$\zeta^m(r,\omega)=\sum_{i=-2^m}^{-1}\zeta\Big(\frac{i}{2^m}\delta,\omega\Big)1_{[\frac{i}{2^m}\delta,\frac{i+1}{2^m}\delta)}(r),\qquad-\delta\le r\le0.$$
For every $\omega\in\Omega$, since the trajectory is continuous on the closed interval $[s-\delta,s]$, the continuity is uniform. We have
$$\lim_{m\to\infty}\|\zeta^m(\cdot,\omega)-\zeta(\cdot,\omega)\|_C=0\quad\text{for any }\omega\in\Omega.$$
Since $\zeta^m$ is uniformly bounded, by Lebesgue's dominated convergence theorem we have
$$\lim_{m\to\infty}\|\zeta^m-\zeta\|^2_{L^2(\Omega,C)}=\lim_{m\to\infty}E\big[\|\zeta^m(\cdot,\omega)-\zeta(\cdot,\omega)\|^2_C\big]=0.$$
So, for any given $\varepsilon>0$, there exists $M>0$ such that, whenever $m>M$, we have
$$\|\zeta^m-\zeta\|_{L^2(\Omega,C)}\le\frac{\varepsilon}{3}.\tag{3.7}$$

For each $i$ and $m$, let us analyze the $\mathcal{F}_s$-measurable random variable $\zeta(\frac{i}{2^m}\delta,\omega)$. It is easy to see that there exists a partition $\{A^{i,m}_j\}_{j=1}^{N_{i,m}}$ of $(\Omega,\mathcal{F}_s)$ and a sequence $\phi^{i,m}_j\in\mathbb{R}^n$, $j=1,2,\dots,N_{i,m}$, such that
$$\bigg|\sum_{j=1}^{N_{i,m}}1_{A^{i,m}_j}(\omega)\phi^{i,m}_j-\zeta\Big(\frac{i}{2^m}\delta,\omega\Big)\bigg|\le\frac{\varepsilon}{3}\quad\text{for any }(i,m).\tag{3.8}$$

We notice that the partitions above depend on $(i,m)$. Without loss of generality, we may use a finer partition $\{A^m_j\}_{j=1}^{N_m}$ of $(\Omega,\mathcal{F}_s)$, generated by all the partitions $\{A^{i,m}_j\}_{j=1}^{N_{i,m}}$, so that (3.8) holds for all $i$; the advantage of this refined partition is that it depends only on $m$. Then we have
$$\bigg|\sum_{j=1}^{N_m}1_{A^m_j}(\omega)\phi^{i,m}_j-\zeta\Big(\frac{i}{2^m}\delta,\omega\Big)\bigg|\le\frac{\varepsilon}{3}\quad\text{for any }(i,m).$$

We define
$$\eta_m(r,\omega)=\sum_{i=-2^m}^{-1}\sum_{j=1}^{N_m}\phi^{i,m}_j1_{A^m_j}(\omega)1_{[\frac{i}{2^m}\delta,\frac{i+1}{2^m}\delta)}(r)=\sum_{j=1}^{N_m}1_{A^m_j}(\omega)\sum_{i=-2^m}^{-1}\phi^{i,m}_j1_{[\frac{i}{2^m}\delta,\frac{i+1}{2^m}\delta)}(r)=\sum_{j=1}^{N_m}1_{A^m_j}(\omega)\,\tilde\varphi^m_j(r).$$


For any $m$, we calculate
$$|\eta_m(r,\omega)-\zeta^m(r,\omega)|=\bigg|\sum_{i=-2^m}^{-1}\sum_{j=1}^{N_m}\phi^{i,m}_j1_{A^m_j}(\omega)1_{[\frac{i}{2^m}\delta,\frac{i+1}{2^m}\delta)}(r)-\sum_{i=-2^m}^{-1}\zeta\Big(\frac{i}{2^m}\delta,\omega\Big)1_{[\frac{i}{2^m}\delta,\frac{i+1}{2^m}\delta)}(r)\bigg|$$
$$\le\sum_{i=-2^m}^{-1}\bigg|\sum_{j=1}^{N_m}\phi^{i,m}_j1_{A^m_j}(\omega)-\zeta\Big(\frac{i}{2^m}\delta,\omega\Big)\bigg|1_{[\frac{i}{2^m}\delta,\frac{i+1}{2^m}\delta)}(r)\le\frac{\varepsilon}{3}\sum_{i=-2^m}^{-1}1_{[\frac{i}{2^m}\delta,\frac{i+1}{2^m}\delta)}(r)=\frac{\varepsilon}{3}.$$
Moreover,
$$\|\eta_m-\zeta^m\|_{L^2(\Omega,C)}\le\frac{\varepsilon}{3}.\tag{3.9}$$
Combining (3.7) and (3.9), we obtain the conclusion: for any given $\varepsilon>0$, there exists $M>0$ such that, whenever $m>M$, $\|\eta_m-\zeta\|_{L^2(\Omega,C)}\le\varepsilon$. We complete the proof. □
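The dyadic freezing used in this proof is easy to visualize numerically. In the sketch below (our own illustration, with a smooth deterministic path standing in for a trajectory of $\zeta$), the sup-norm error of the piecewise-constant approximation shrinks as $m$ grows, exactly as uniform continuity predicts:

```python
import numpy as np

def dyadic_step_approx(zeta, m, delta=1.0):
    """Piecewise-constant approximation of a continuous path zeta on
    [-delta, 0], as in the proof of Lemma 3.4: on each dyadic cell
    [i*delta/2^m, (i+1)*delta/2^m) the path is frozen at its left endpoint."""
    def zeta_m(r):
        i = np.floor(r * 2**m / delta)         # dyadic cell index
        i = np.clip(i, -2**m, -1)              # r = 0 falls into the last cell
        return zeta(i * delta / 2**m)
    return zeta_m

zeta = lambda r: np.sin(3.0 * r)               # stand-in continuous path
grid = np.linspace(-1.0, 0.0, 2001)
errs = [float(np.max(np.abs(dyadic_step_approx(zeta, m)(grid) - zeta(grid))))
        for m in (2, 4, 6)]
```

The errors decay roughly like the modulus of continuity evaluated at the cell width $\delta/2^m$.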

Lemma 3.5. We have
$$J(s,\zeta;v(\cdot))=Y^{s,\zeta;v}(s)\tag{3.10}$$
for any $s\in[0,T]$, $v(\cdot)\in\mathcal{U}_{ad}$ and any $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$.

Proof. First, we consider the simple case $\zeta=\sum_{j=1}^N 1_{A_j}\varphi_j$, where $\{A_j\}_{j=1}^N$ is a finite partition of $(\Omega,\mathcal{F}_s)$ and $\varphi_j\in C$ for $1\le j\le N$. By an argument similar to that of Lemma 3.1, we deduce that
$$Y^{s,\zeta;v}(s)=\sum_{j=1}^N 1_{A_j}Y^{s,\varphi_j;v}(s)=\sum_{j=1}^N 1_{A_j}J(s,\varphi_j;v(\cdot))=J\Big(s,\sum_{j=1}^N 1_{A_j}\varphi_j;v(\cdot)\Big)=J(s,\zeta;v(\cdot)).$$
This implies that (3.10) holds in the above simple case.

Next, given a bounded $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$, we can choose a sequence of simple processes $\{\zeta_j\}$ of the form (3.6) such that $\{\zeta_j\}$ converges to $\zeta$ in $L^2(\Omega,C;\mathcal{F}_s)$. Then from Proposition 2.3 we can derive
$$E[|Y^{s,\zeta;v}(s)-Y^{s,\zeta_j;v}(s)|^2]\le\Lambda\|\zeta-\zeta_j\|^2_{L^2(\Omega,C)}\to0\quad\text{as }j\to\infty.$$
At the same time, from (3.5) we have the following result about $J$:
$$E[|J(s,\zeta;v(\cdot))-J(s,\zeta_j;v(\cdot))|^2]\le\Lambda\|\zeta-\zeta_j\|^2_{L^2(\Omega,C)}\to0\quad\text{as }j\to\infty.$$
Since $Y^{s,\zeta_j;v}(s)=J(s,\zeta_j;v(\cdot))$, we get our desired conclusion. If $\zeta$ is unbounded, we can define $\zeta_n=(\zeta\wedge n)\vee(-n)$; then $\zeta_n$ is bounded and
$$E[|Y^{s,\zeta;v}(s)-Y^{s,\zeta_n;v}(s)|^2]\le\Lambda\|\zeta-\zeta_n\|^2_{L^2(\Omega,C)}\to0\quad\text{as }n\to\infty.$$
Repeating the argument of the bounded case, we obtain the desired result. □


In order to prove the dynamic programming principle, we need the following lemma:

Lemma 3.6. For each $v(\cdot)\in\mathcal{U}_{ad}$, fixed $s\in[0,T)$ and $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$, we have
$$u(s,\zeta)\ge Y^{s,\zeta;v}(s).\tag{3.11}$$
On the other hand, for each $\varepsilon>0$, there is an admissible control $v(\cdot)\in\mathcal{U}_{ad}$ such that
$$u(s,\zeta)\le Y^{s,\zeta;v}(s)+\varepsilon,\quad\text{a.s.}\tag{3.12}$$

Proof. As in the proof of Lemma 3.5, we first consider the case $\zeta=\sum_{j=1}^N 1_{A_j}\varphi_j$. Then, for each $v(\cdot)\in\mathcal{U}_{ad}$, we have
$$Y^{s,\zeta;v}(s)=Y^{s,\sum_{j=1}^N 1_{A_j}\varphi_j;v}(s)=\sum_{j=1}^N 1_{A_j}Y^{s,\varphi_j;v}(s)\le\sum_{j=1}^N 1_{A_j}u(s,\varphi_j)=u(s,\zeta).$$

Furthermore, if $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$, we can again choose a sequence of simple processes $\{\zeta_j\}$ which converges to $\zeta$ in $L^2(\Omega,C;\mathcal{F}_s)$. So
$$E[|Y^{s,\zeta;v}(s)-Y^{s,\zeta_j;v}(s)|^2]\to0,\qquad E[|u(s,\zeta)-u(s,\zeta_j)|^2]\to0\quad(j\to\infty).$$
Then there exists a subsequence, which we also denote by $\{\zeta_j\}$, such that
$$\lim_{j\to\infty}Y^{s,\zeta_j;v}(s)=Y^{s,\zeta;v}(s)\ \text{a.s.},\qquad\lim_{j\to\infty}u(s,\zeta_j)=u(s,\zeta)\ \text{a.s.}$$
With the help of $Y^{s,\zeta_j;v}(s)\le u(s,\zeta_j)$, $j=1,2,\dots$, we get (3.11).

We can use a similar method to prove (3.12). We first consider $\zeta\in L^\infty(\Omega,C;\mathcal{F}_s)$ with $\|\zeta\|_C\le M$; then we can construct a random variable $\eta\in L^\infty(\Omega,C;\mathcal{F}_s)$ of the form $\eta=\sum_{j=1}^N 1_{A_j}\varphi_j$, with $\varphi_j\in C$ and $\{A_j\}_{j=1}^N$ a partition of $(\Omega,\mathcal{F}_s)$, such that $\|\zeta-\eta\|_{L^2(\Omega,C)}$ is so small that, for any $v(\cdot)\in\mathcal{U}_{ad}$, we have
$$|Y^{s,\zeta;v}(s)-Y^{s,\eta;v}(s)|\le\frac{\varepsilon}{3},\qquad|u(s,\zeta)-u(s,\eta)|\le\frac{\varepsilon}{3}.\tag{3.13}$$
Then, for each $\varphi_j$, we can choose an $\mathcal{F}^s_t$-adapted admissible control $v_j(\cdot)$ such that
$$u(s,\varphi_j)\le Y^{s,\varphi_j;v_j}(s)+\frac{\varepsilon}{3}.$$
If we set $v(\cdot)=\sum_{j=1}^N 1_{A_j}v_j(\cdot)$, we can derive the following with (3.13):
$$Y^{s,\zeta;v}(s)\ge-|Y^{s,\eta;v}(s)-Y^{s,\zeta;v}(s)|+Y^{s,\eta;v}(s)\ge-\frac{\varepsilon}{3}+\sum_{j=1}^N 1_{A_j}Y^{s,\varphi_j;v_j}(s)$$
$$\ge-\frac{\varepsilon}{3}+\sum_{j=1}^N 1_{A_j}\Big(u(s,\varphi_j)-\frac{\varepsilon}{3}\Big)=-\frac{2\varepsilon}{3}+\sum_{j=1}^N 1_{A_j}u(s,\varphi_j)=-\frac{2\varepsilon}{3}+u(s,\eta)\ge-\varepsilon+u(s,\zeta).$$
That is to say, (3.12) holds for any $\zeta\in L^\infty(\Omega,C;\mathcal{F}_s)$.

For the general case $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$, we know that $\zeta$ can be decomposed as
$$\zeta=\sum_{j=1}^\infty 1_{A_j}\zeta_j,$$
where $\{A_j\}_{j=1}^\infty$ is a partition of $(\Omega,\mathcal{F}_s)$ and $\zeta_j\in L^\infty(\Omega,C;\mathcal{F}_s)$ with $\|\zeta_j\|_C\le j$, $j=1,2,\dots$. Hence, for each $\zeta_j$, there exists $v_j(\cdot)\in\mathcal{U}_{ad}$ such that
$$u(s,\zeta_j)\le Y^{s,\zeta_j;v_j}(s)+\varepsilon.$$
Denoting $v(\cdot)=\sum_{j=1}^\infty 1_{A_j}v_j(\cdot)$, we get
$$u(s,\zeta)=u\Big(s,\sum_{j=1}^\infty 1_{A_j}\zeta_j\Big)=\sum_{j=1}^\infty 1_{A_j}u(s,\zeta_j)\le\sum_{j=1}^\infty 1_{A_j}\big(Y^{s,\zeta_j;v_j}(s)+\varepsilon\big)=\sum_{j=1}^\infty 1_{A_j}Y^{s,\zeta_j;v_j}(s)+\varepsilon=Y^{s,\zeta;v}(s)+\varepsilon.$$
This completes the proof. □

Next, we introduce a family of backward semigroups, following Peng [11]. Given the initial condition $(s,\varphi)\in[0,T)\times C$ and an admissible control $v(\cdot)\in\mathcal{U}_{ad}$, if $\tau\le T-s$ is a positive number and $\zeta\in L^2(\Omega,\mathcal{F}_{s+\tau})$, we denote
$$G^{s,\varphi;v}_{s,s+\tau}[\zeta]:=Y^{s,\varphi;v}(s),$$
where $(Y^{s,\varphi;v}(t),Z^{s,\varphi;v}(t))_{s\le t\le s+\tau}$ is the solution of the following BSDE:
$$\begin{cases}-dY^{s,\varphi;v}(t)=f(t,X^{s,\varphi;v}_t,Y^{s,\varphi;v}(t),Z^{s,\varphi;v}(t),v(t))\,dt-Z^{s,\varphi;v}(t)\,dW(t), & t\in[s,s+\tau],\\ Y^{s,\varphi;v}(s+\tau)=\zeta.\end{cases}$$
Obviously,
$$G^{s,\varphi;v}_{s,T}\big[\Phi(X^{s,\varphi;v}_T)\big]=G^{s,\varphi;v}_{s,s+\tau}\big[Y^{s,\varphi;v}(s+\tau)\big].\tag{3.14}$$
Then we can state the main result of this section, the generalized dynamic programming principle for our stochastic recursive optimal control problem with delayed system.

Theorem 3.7 (the generalized dynamic programming principle). Let (A1)–(A5) hold. Then the value function $u(s,\varphi)$ defined by (3.1) for our optimal control problem with the delayed system has the following property: for each $0\le\tau\le T-s$,
$$u(s,\varphi)=\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}G^{s,\varphi;v}_{s,s+\tau}\big[u(s+\tau,X^{s,\varphi;v}_{s+\tau})\big].\tag{3.15}$$

Proof. We have
$$u(s,\varphi)=\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}G^{s,\varphi;v}_{s,T}\big[\Phi(X^{s,\varphi;v}_T)\big]=\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}G^{s,\varphi;v}_{s,s+\tau}\big[Y^{s,\varphi;v}(s+\tau)\big]=\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}G^{s,\varphi;v}_{s,s+\tau}\big[Y^{s+\tau,X^{s,\varphi;v}_{s+\tau};v}(s+\tau)\big].$$
From Lemma 3.6 and the comparison theorem of BSDEs, we have
$$u(s,\varphi)\le\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}G^{s,\varphi;v}_{s,s+\tau}\big[u(s+\tau,X^{s,\varphi;v}_{s+\tau})\big].$$

Moreover, also by Lemma 3.6, we can conclude that for every $\varepsilon>0$ there exists an admissible control $\hat v(\cdot)\in\mathcal{U}_{ad}$ such that
$$u(s+\tau,X^{s,\varphi;v}_{s+\tau})\le Y^{s+\tau,X^{s,\varphi;v}_{s+\tau};\hat v}(s+\tau)+\varepsilon.$$
For each $v(\cdot)\in\mathcal{U}_{ad}$, we denote $\bar v(t)=1_{\{t\le s+\tau\}}v(t)+1_{\{t>s+\tau\}}\hat v(t)$. From the above inequality and the comparison theorem, we get
$$Y^{s+\tau,X^{s,\varphi;\bar v}_{s+\tau};\bar v}(s+\tau)\ge u(s+\tau,X^{s,\varphi;\bar v}_{s+\tau})-\varepsilon,$$
$$u(s,\varphi)\ge\operatorname*{ess\,sup}_{\bar v(\cdot)\in\mathcal{U}^{\hat v}_{ad}}G^{s,\varphi;\bar v}_{s,s+\tau}\big[u(s+\tau,X^{s,\varphi;\bar v}_{s+\tau})-\varepsilon\big],$$
where
$$\mathcal{U}^{\hat v}_{ad}:=\{\bar v(\cdot)\in\mathcal{U}_{ad}\mid\bar v(t)=1_{\{t\le s+\tau\}}v(t)+1_{\{t>s+\tau\}}\hat v(t)\text{ for some }v(\cdot)\in\mathcal{U}_{ad}\}.$$
From the estimates for BSDEs, there exists a positive constant $\Lambda_0$ such that
$$u(s,\varphi)\ge\operatorname*{ess\,sup}_{\bar v(\cdot)\in\mathcal{U}^{\hat v}_{ad}}G^{s,\varphi;\bar v}_{s,s+\tau}\big[u(s+\tau,X^{s,\varphi;\bar v}_{s+\tau})\big]-\Lambda_0\varepsilon.$$
Letting $\varepsilon\to0$, we obtain
$$u(s,\varphi)\ge\operatorname*{ess\,sup}_{\bar v(\cdot)\in\mathcal{U}^{\hat v}_{ad}}G^{s,\varphi;\bar v}_{s,s+\tau}\big[u(s+\tau,X^{s,\varphi;\bar v}_{s+\tau})\big].$$
From the definition of $\bar v(\cdot)$ and the arbitrariness of $v(\cdot)\in\mathcal{U}_{ad}$, we know
$$u(s,\varphi)\ge\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}G^{s,\varphi;v}_{s,s+\tau}\big[u(s+\tau,X^{s,\varphi;v}_{s+\tau})\big].$$
Then (3.15) is obtained. □

Remark 3.8. Equality (3.15) shows that the value function u(s, φ) obeys the dynamic programming principle, which also holds in the usual optimization case without recursive utility (see Refs. [6,7]).
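To see the mechanics of (3.15) in the simplest possible setting (a toy sketch of our own, not from the paper): drop the delay, take driver $f=0$ so that the backward semigroup $G^{s,\varphi;v}_{s,s+\tau}[\cdot]$ degenerates to a plain expectation, and put the one-dimensional state on a grid. One backward step of the resulting Bellman recursion then reads:

```python
import numpy as np

def bellman_step(u_next, x_grid, controls, tau, nodes, weights):
    """One backward step of the DPP (3.15) in the degenerate case f = 0,
    where the backward semigroup reduces to an expectation:
        u(s, x) = max_v E[ u(s + tau, x + v*tau + sqrt(tau)*N) ],
    with N standard normal, approximated by quadrature nodes/weights."""
    u_now = np.empty_like(u_next)
    for i, x in enumerate(x_grid):
        best = -np.inf
        for v in controls:
            x_next = x + v * tau + np.sqrt(tau) * nodes
            best = max(best, float(weights @ np.interp(x_next, x_grid, u_next)))
        u_now[i] = best
    return u_now

# toy check: terminal payoff -x^2, unit volatility; at x = 0 the best
# control is v = 0 and one step gives u = -tau (the accumulated variance)
nodes = np.array([-np.sqrt(3.0), 0.0, np.sqrt(3.0)])   # 3-point Gauss-Hermite
weights = np.array([1.0, 4.0, 1.0]) / 6.0              # matches E[N^2] = 1
x_grid = np.linspace(-2.0, 2.0, 401)
u0 = bellman_step(-x_grid**2, x_grid, [-1.0, 0.0, 1.0], 0.1, nodes, weights)
```

With terminal payoff $-x^2$ and unit volatility, the optimal one-step value at $x=0$ is $-\tau$ (keep $v=0$ and pay the variance), which the grid recursion reproduces.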

4. The Hamilton-Jacobi-Bellman equations

We call (3.15) the dynamic programming equation. It seems impossible to solve such an equation directly. In this section, we will prove that, under some smoothness conditions, the value function $u(\cdot,\cdot)$ defined by (3.1) satisfies a kind of partial differential equation: the HJB equation.

4.1. The notions and definitions

We first introduce some notation and definitions, which are also used in [1,4,5,9].

Let $C_b$ be the Banach space of all bounded uniformly continuous functions $\Phi:C\to\mathbb{R}$ with the sup norm
$$\|\Phi\|_{C_b}:=\sup_{\eta\in C}|\Phi(\eta)|,\qquad\Phi\in C_b.$$
Define the operator $P_t:C_b\to C_b$, $t\ge0$, on $C_b$ by
$$P_t(\Phi)(\eta):=E[\Phi(X^\eta_t)],\qquad t\ge0,\ \Phi\in C_b,\ \eta\in C.$$
For any $\Phi\in C_b$ and any finite Borel measure $\mu$ on $C$, define the pairing
$$\langle\Phi,\mu\rangle:=\int_{\eta\in C}\Phi(\eta)\,d\mu(\eta).$$
We also define a generator $A:D(A)\subset C_b\to C_b$ of $\{P_t\}_{t\ge0}$ by the weak limit
$$A(\Phi)(\eta):=w\text{-}\lim_{t\to0^+}\frac{P_t(\Phi)-\Phi}{t},$$
where $\Phi$ belongs to the domain $D(A)$ of $A$ if and only if the above weak limit exists in $C_b$.


Then we can easily obtain:

Lemma 4.1.
$$\frac{d}{dt}P_t(\Phi)=A(P_t(\Phi))=P_t(A(\Phi)),\qquad t\ge0,$$
for any $\Phi\in D(A)$.

Let $F_n:=\{\kappa1_{\{0\}}:\kappa\in\mathbb{R}^n\}$ and $C\oplus F_n:=\{\eta+\kappa1_{\{0\}}:\eta\in C,\ \kappa\in\mathbb{R}^n\}$ with norm $\|\eta+\kappa1_{\{0\}}\|:=\|\eta\|_C+|\kappa|$ for $\eta\in C$, $\kappa\in\mathbb{R}^n$, where $1_{\{0\}}:[-\delta,0]\to\mathbb{R}$ is defined by
$$1_{\{0\}}(\theta)=\begin{cases}0, & \theta\in[-\delta,0),\\ 1, & \theta=0.\end{cases}$$
For a Borel measurable function $\Phi:C\to\mathbb{R}$, we also define
$$S(\Phi)(\eta):=\lim_{t\to0}\frac{\Phi(\tilde\eta_t)-\Phi(\eta)}{t}$$
for all $\eta\in C$, where $\tilde\eta:[-\delta,T]\to\mathbb{R}^n$ is the extension of $\eta$ defined by
$$\tilde\eta(t):=\begin{cases}\eta(t), & t\in[-\delta,0),\\ \eta(0), & t\ge0,\end{cases}$$
and $\tilde\eta_t\in C$ is defined by $\tilde\eta_t(\theta):=\tilde\eta(t+\theta)$, $\theta\in[-\delta,0]$. Let $\hat D(S)$, the domain of $S$, be the set of $\Phi:C\to\mathbb{R}$ such that the above limit exists for each $\eta\in C$. Define $D(S)$ as the set of all functions $\Phi:[0,T]\times C\to\mathbb{R}$ such that $\Phi(t,\cdot)\in\hat D(S)$ for all $t\in[0,T]$.

In addition, for each sufficiently smooth function $\Phi$, we denote its first and second Fréchet derivatives with respect to $\eta\in C$ by $D\Phi$ and $D^2\Phi$. Let $C^{1,2}_{lip}([0,T]\times C)$ be the set of functions $\Phi:[0,T]\times C\to\mathbb{R}$ such that $\frac{\partial\Phi}{\partial t}$, $D\Phi$ and $D^2\Phi$ exist and are globally bounded and Lipschitz.
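The operator $S$ can be probed numerically by a finite difference on the frozen extension (our own illustration, with an arbitrary quadratic functional as a stand-in). For $\Phi(\eta)=\int_{-\delta}^0\eta(\theta)^2\,d\theta$, a direct computation from the definition gives $S(\Phi)(\eta)=\eta(0)^2-\eta(-\delta)^2$, and the sketch below reproduces this value:

```python
import numpy as np

def S_finite_difference(Phi, eta, grid, t):
    """Finite-difference approximation of
        S(Phi)(eta) = lim_{t -> 0} (Phi(eta~_t) - Phi(eta)) / t,
    where eta~ extends eta by freezing the value eta(0) for times >= 0 and
    eta~_t(theta) = eta~(t + theta).  eta is sampled on grid over [-delta, 0];
    Phi consumes (values, grid)."""
    shifted = grid + t
    eta_shift = np.where(shifted < 0.0,
                         np.interp(shifted, grid, eta),
                         eta[-1])                    # frozen at eta(0)
    return (Phi(eta_shift, grid) - Phi(eta, grid)) / t

def sq_integral(vals, g):
    """Phi(eta) = int over [-delta, 0] of eta(theta)^2 (trapezoid rule)."""
    f = vals ** 2
    return float(np.sum((f[:-1] + f[1:]) * 0.5 * np.diff(g)))

grid = np.linspace(-1.0, 0.0, 4001)
eta = np.cos(2.0 * grid)
approx = S_finite_difference(sq_integral, eta, grid, t=1e-3)
exact = eta[-1] ** 2 - eta[0] ** 2                  # eta(0)^2 - eta(-delta)^2
```

The agreement degrades like $O(t)$ plus the quadrature error, as expected of a one-sided difference.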

Then we can derive a formula for the generator $A$.

Theorem 4.2 ([9]). Suppose that $\Phi\in C^{1,2}_{lip}([0,T]\times C)\cap D(S)$. Let $v(\cdot)\in\mathcal{U}_{ad}$, and let $\{X(t),\ t\in[s,T]\}$ be the $C$-valued Markov solution process of equation (2.2) with the initial data $(s,\eta_s)\in[0,T]\times C$. Then for $\Phi\in D(A)$,
$$A(\Phi)(X_t)=\frac{\partial}{\partial t}\Phi(t,X_t)+S(\Phi)(X_t)+\overline{D\Phi}(t,X_t)\big(b(t,X_t,v(t))1_{\{0\}}\big)+\frac12\sum_{i=1}^d\overline{D^2\Phi}(t,X_t)\big(\sigma(t,X_t,v(t))e_i1_{\{0\}},\,\sigma(t,X_t,v(t))e_i1_{\{0\}}\big),\tag{4.1}$$
where $\overline{D\Phi}:C\oplus F_n\to\mathbb{R}$ and $\overline{D^2\Phi}:(C\oplus F_n)\times(C\oplus F_n)\to\mathbb{R}$ are the continuous linear and bilinear extensions of $D\Phi$ and $D^2\Phi$ respectively, and $\{e_i\}_{i=1}^d$ is the standard basis of $\mathbb{R}^d$.

4.2. Some properties of the value function

We now explore some further properties of the value function defined in (3.1).

Lemma 4.3. Under the assumptions (A1)–(A5), for any $(s,\zeta)\in[0,T)\times L^2(\Omega,C;\mathcal{F}_s)$ and $v\in\mathcal{U}_{ad}$, we have
$$\lim_{t\to s^+}\|X^{s,\zeta;v}_t-\zeta\|_{L^2(\Omega,C)}=0.\tag{4.2}$$

Proposition 4.4. Suppose (A1)–(A5) hold; then the value function $u(s,\varphi)$ is continuous in $(s,\varphi)\in[0,T]\times C$.
