DOI: 10.1051/cocv/2011187, www.esaim-cocv.org

DYNAMIC PROGRAMMING PRINCIPLE FOR STOCHASTIC RECURSIVE OPTIMAL CONTROL PROBLEM WITH DELAYED SYSTEMS

Li Chen¹ and Zhen Wu²

Abstract. In this paper, we study a class of stochastic recursive optimal control problems for systems described by stochastic differential equations with delay (SDDE). In our framework, not only the dynamics of the system but also the recursive utility depend on the past path segment of the state process in a general form. We give the dynamic programming principle for this kind of optimal control problem and show that the value function is the viscosity solution of the corresponding infinite-dimensional Hamilton-Jacobi-Bellman partial differential equation.

Mathematics Subject Classification. 49L20, 60H10, 93E20.

Received June 7, 2010. Revised May 6, 2011. Published online 16 January 2012.

1. Introduction

The classical stochastic control system is governed by a nonlinear stochastic differential equation (SDE). This kind of stochastic optimal control problem has been studied extensively, both by the dynamic programming approach and by the Pontryagin stochastic maximum principle. In our paper, we are concerned with the dynamic programming principle. There are many works on this subject, such as Yong and Zhou [13] for the classical stochastic control system, and Peng [10,11] and Wu and Yu [12] for the stochastic recursive case.

Research on many natural and social phenomena shows that the future development of many processes depends not only on their present state but also, essentially, on their previous history. Such processes can be described by stochastic differential delayed equations (SDDE). Many examples can be found in Mohammed [8,9].

Although the dynamic programming principle can also be extended to stochastic control problems with delay (see e.g. [6]), most problems remain practically intractable because of the complex infinite-dimensional state space framework. In [7], Larssen and Risebro consider a class of optimal consumption problems for stochastic delayed systems in some special cases, which shows the financial applicability of this kind of dynamic programming principle.

Keywords and phrases. Stochastic differential equation with delay, recursive optimal control problem, dynamic programming principle, Hamilton-Jacobi-Bellman equation.

This work is partly supported by the Natural Science Foundation of P.R. China (10921101) and Shandong Province (JQ200801 and 2008BS01024), the National Basic Research Program of P.R. China (973 Program, No. 2007CB814904) and the Science Fund for Distinguished Young Scholars of Shandong University (2009JQ004), the Fundamental Research Funds for the Central Universities (2010QS05).

1 Department of Mathematics, China University of Mining & Technology, Beijing 100083, P.R. China. chenli@cumtb.edu.cn

2 School of Mathematics, Shandong University, Jinan 250100, P.R. China. wuzhen@sdu.edu.cn

Article published by EDP Sciences. © EDP Sciences, SMAI 2012.


However, Duffie and Epstein [2] showed that the personal utility at time t is not only a function of the instantaneous consumption rate but also of the future utility (corresponding to future consumption). Also, from [3], we know that this kind of recursive utility can be described by a backward stochastic differential equation (BSDE), and that stochastic differential recursive utility is an extension of the standard additive one. So one of our goals in this paper is to obtain the dynamic programming principle for the delayed stochastic optimal control problem with recursive utility.

For this, we study a class of delayed stochastic recursive optimal control problems whose cost functional is described by the solution of a BSDE. We prove that the celebrated dynamic programming principle still holds for this kind of optimal control problem. Because of the absence of an Itô formula for functions of the history of the state, and the infinite-dimensional difficulty, it is not easy to obtain the Hamilton-Jacobi-Bellman (HJB) equation satisfied by the optimal value function. To overcome this difficulty, with the help of the theory of the generator of the associated semigroup and the semigroup properties, we obtain the corresponding HJB equation and show that the value function of the recursive optimal control problem is the viscosity solution of the HJB equation.

The paper is organized as follows. In Section 2, we present some notation used throughout the paper and formulate the recursive optimal control problem with delay. In Section 3, we prove that the celebrated dynamic programming principle still holds in our framework. In Section 4, we give some properties of the optimal value function, obtain the corresponding HJB equation, and show that the optimal value function is the viscosity solution of the HJB equation. Moreover, in a special case, we obtain uniqueness of the viscosity solution.

2. Notations and formulation of the problem

Let $(\Omega,\mathcal{F},P)$ be a probability space, and let $0\le T<\infty$ denote a fixed terminal time. Let $0\le\delta<\infty$ be a fixed constant, and let $[-\delta,0]$ be the duration of the bounded delay of the systems considered in our paper. For the sake of simplicity, denote by $C:=C([-\delta,0];\mathbb{R}^n)$ the Banach space of continuous paths $\gamma:[-\delta,0]\to\mathbb{R}^n$ with norm $\|\gamma\|_C:=\sup_{-\delta\le s\le0}|\gamma(s)|$, where $|\cdot|$ is the Euclidean norm on $\mathbb{R}^n$. Let $n,d\ge1$ be integers.

If $\varphi\in C([-\delta,T];\mathbb{R}^n)$ and $0\le t\le T$, let $\varphi_t$ be defined by
$$\varphi_t(\theta)=\varphi(t+\theta),\qquad-\delta\le\theta\le0.\tag{2.1}$$
We note that $\varphi_t$ is the segment of the path of $\varphi$ from $t-\delta$ to $t$.

Throughout the paper, let $\{W(t)\}_{0\le t\le T}$ be a $d$-dimensional Brownian motion on the completed filtered probability space $(\Omega,\mathcal{F},P;\{\mathcal{F}_t\}_{t\ge0})$, where $\mathcal{F}_t$ is the natural filtration of $\{W(t)\}$ and $\mathcal{F}_0$ contains all $P$-null sets of $\mathcal{F}$.

We also use the following notations in our paper:

$$L^2(\Omega,\mathcal{F},P)=\big\{\xi\ :\ \xi\text{ is an }\mathcal{F}_T\text{-measurable random variable s.t. }E|\xi|^2<+\infty\big\},$$
$$H^2_{\mathcal{F}}(0,T)=\Big\{(\psi_t)_{0\le t\le T}\ :\ \psi\text{ is an adapted process s.t. }E\int_0^T|\psi_t|^2\,dt<+\infty\Big\},$$
$$S^2_{\mathcal{F}}(0,T)=\Big\{(\psi_t)_{0\le t\le T}\ :\ \psi\text{ is an adapted process s.t. }E\sup_{0\le t\le T}|\psi_t|^2<+\infty\Big\},$$
$$L^2(\Omega,C;\mathcal{F}_t)=\big\{\varphi:\Omega\to C\ :\ \varphi\text{ is }\mathcal{F}_t\text{-measurable s.t. }\|\varphi\|^2_{L^2(\Omega,C)}:=E[\|\varphi(\omega)\|^2_C]<\infty\big\}.$$

Moreover, we denote by $(\cdot|\cdot)$ the inner product in $L^2(\Omega,C;\mathcal{F}_t)$ and by $\langle\cdot,\cdot\rangle$ the inner product in $\mathbb{R}^n$. Given $\varphi$ and $\phi$ in $C$, we define
$$(\varphi|\phi)=\int_{-\delta}^{0}\langle\varphi(\theta),\phi(\theta)\rangle\,d\theta,\qquad\|\varphi\|_2=(\varphi|\varphi)^{1/2}.$$


We introduce the admissible control set $\mathcal{U}_{ad}$ defined by
$$\mathcal{U}_{ad}:=\{v(\cdot)\in H^2_{\mathcal{F}}(0,T)\mid v(\cdot)\text{ takes values in }U\subset\mathbb{R}^k\}.$$
Here $U$ is a compact subset of $\mathbb{R}^k$; an element of $\mathcal{U}_{ad}$ is called an admissible control.

For a given admissible control, we consider the following system of controlled stochastic functional differential equations with bounded memory:
$$\begin{cases} dX(t)=b(t,X_t,v(t))\,dt+\sigma(t,X_t,v(t))\,dW(t), & t\in[s,T],\\ X_s=\varphi_s. \end{cases}\tag{2.2}$$
Here $X_s$ and $\varphi_s$ are defined as in (2.1), i.e., for any $0\le s\le T$, $X_s(\theta)=X(s+\theta)$, $\varphi_s(\theta)=\varphi(s+\theta)$, $-\delta\le\theta\le0$, and $\varphi\in L^2(\Omega,C;\mathcal{F}_s)$ is regarded as the initial path. The maps
$$b:[0,T]\times C\times U\to\mathbb{R}^n,\qquad\sigma:[0,T]\times C\times U\to\mathbb{R}^{n\times d}$$
satisfy the following assumptions:

(A1) $b$ and $\sigma$ are continuous in $t$;

(A2) for each integer $m\ge1$, there is a constant $L_m>0$ (independent of $t$) such that
$$|b(t,\phi,v)-b(t,\hat\phi,\hat v)|+|\sigma(t,\phi,v)-\sigma(t,\hat\phi,\hat v)|\le L_m\big(\|\phi-\hat\phi\|_C+|v-\hat v|\big)$$
for any $0\le t\le T$, $\phi,\hat\phi\in C$ with $\|\phi\|_C\le L_m$, $\|\hat\phi\|_C\le L_m$ and $v,\hat v\in U$;

(A3) there is a constant $K>0$ such that
$$|b(t,\phi,v)|+|\sigma(t,\phi,v)|\le K\big(1+\|\phi\|_C+|v|\big)$$
for any $0\le t\le T$, $\phi\in C$, $v\in U$.

Under the above assumptions, for any $v(\cdot)\in\mathcal{U}_{ad}$, the control system (2.2) with aftereffect has a unique strong solution $\{X^{s,\varphi;v}(t),\ 0\le s\le t\le T\}$, and we have the following estimates by the existence and uniqueness theorem for SDDE in Mohammed [8,9]:

Proposition 2.1. For all $s\in[0,T]$, $\varphi,\hat\varphi\in L^2(\Omega,C;\mathcal{F}_s)$, $v(\cdot),\hat v(\cdot)\in\mathcal{U}_{ad}$,
$$E^{\mathcal{F}_s}\big[\|X^{s,\varphi;v}_t\|^2_C\big]\le\Lambda(1+\|\varphi\|^2_C).\tag{2.3}$$
If we define the mapping
$$T^s_t:L^2(\Omega,C;\mathcal{F}_s)\times U\to L^2(\Omega,C;\mathcal{F}_t),\qquad(\varphi,v)\mapsto X^{s,\varphi;v}_t,$$
then we have
$$E^{\mathcal{F}_s}\big[\|T^s_t(\varphi,v)-T^s_t(\hat\varphi,\hat v)\|^2_C\big]\le\Lambda\Big(\|\varphi-\hat\varphi\|^2_C+E^{\mathcal{F}_s}\int_s^T|v(t)-\hat v(t)|^2\,dt\Big).\tag{2.4}$$

Remark 2.2. From now on, let $\Lambda$ be a generic constant which may change from line to line throughout our paper.
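As a purely illustrative aside (not part of the paper), the controlled state dynamics (2.2) can be simulated with an Euler-Maruyama scheme in which the discretized segment $X_t$ is carried along as a sliding window over the trajectory. The function name, the linear delayed drift and the constant diffusion below are our own choices, and the control is folded into the coefficients for brevity:

```python
import numpy as np

def simulate_sdde(b, sigma, phi, s, T, delta, dt, rng):
    """Euler-Maruyama scheme for a delayed SDE of the form (2.2):
        dX(t) = b(t, X_t) dt + sigma(t, X_t) dW(t),   X_s = phi_s,
    where X_t is the segment {X(t + theta) : -delta <= theta <= 0}.
    phi: initial path on [s - delta, s], sampled on the dt-grid.
    Returns the whole trajectory on [s - delta, T]."""
    n_hist = int(round(delta / dt))            # grid points in the memory window
    n_steps = int(round((T - s) / dt))
    X = np.empty(n_hist + n_steps + 1)
    X[: n_hist + 1] = phi                      # initial segment phi_s
    for k in range(n_steps):
        t = s + k * dt
        seg = X[k : k + n_hist + 1]            # discretized segment X_t
        dW = rng.normal(0.0, np.sqrt(dt))
        X[n_hist + k + 1] = X[n_hist + k] + b(t, seg) * dt + sigma(t, seg) * dW
    return X

# illustrative run: drift -X(t - delta) (seg[0] is the oldest point), sigma = 0.2
rng = np.random.default_rng(0)
dt, delta, s, T = 0.01, 0.5, 0.0, 1.0
phi = np.ones(int(round(delta / dt)) + 1)      # constant initial path
path = simulate_sdde(lambda t, seg: -seg[0], lambda t, seg: 0.2,
                     phi, s, T, delta, dt, rng)
```

Here `seg[0]` is $X(t-\delta)$ and `seg[-1]` is $X(t)$, so arbitrary functionals of the past segment (pointwise delays, moving averages) fit the same interface.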

Now, for any admissible control $v(\cdot)\in\mathcal{U}_{ad}$ and $0\le s\le t\le T$, we consider the following BSDE:
$$Y^{s,\varphi;v}(t)=\Phi(X^{s,\varphi;v}_T)+\int_t^T f\big(r,X^{s,\varphi;v}_r,Y^{s,\varphi;v}(r),Z^{s,\varphi;v}(r),v(r)\big)\,dr-\int_t^T Z^{s,\varphi;v}(r)\,dW(r),\tag{2.5}$$
where
$$f:[0,T]\times C\times\mathbb{R}\times\mathbb{R}^d\times U\to\mathbb{R},\qquad\Phi:C\to\mathbb{R}$$
satisfy the following conditions:

(A4) $f$ is $\mathcal{F}_t$-measurable and continuous in $t$;

(A5) for some constant $L>0$ and for any $\phi,\hat\phi\in C$, $y,\hat y\in\mathbb{R}$, $z,\hat z\in\mathbb{R}^d$, $v,\hat v\in U$, a.s.,
$$|f(t,\phi,y,z,v)-f(t,\hat\phi,\hat y,\hat z,\hat v)|+|\Phi(\phi)-\Phi(\hat\phi)|\le L\big(\|\phi-\hat\phi\|_C+|y-\hat y|+|z-\hat z|+|v-\hat v|\big),$$
and for all $(\phi,v)\in C\times U$ we have
$$|f(t,\phi,0,0,v)|+|\Phi(\phi)|\le K(1+\|\phi\|_C).$$

Then, from the classical BSDE theory, BSDE (2.5) has a unique solution pair $(Y^{s,\varphi;v},Z^{s,\varphi;v})\in S^2_{\mathcal{F}}(0,T)\times H^2_{\mathcal{F}}(0,T)$. Moreover, we have the following estimates for the solution of (2.5).

Proposition 2.3. Let (A1)–(A5) hold. Then we have
$$E^{\mathcal{F}_s}\Big[\sup_{s\le t\le T}|Y^{s,\varphi;v}(t)|^2+\int_s^T|Z^{s,\varphi;v}(t)|^2\,dt\Big]\le\Lambda(1+\|\varphi\|^2_C)\tag{2.6}$$
and
$$E^{\mathcal{F}_s}\Big[\sup_{s\le t\le T}|Y^{s,\varphi;v}(t)-Y^{s,\hat\varphi;\hat v}(t)|^2+\int_s^T|Z^{s,\varphi;v}(t)-Z^{s,\hat\varphi;\hat v}(t)|^2\,dt\Big]\le\Lambda\|\varphi-\hat\varphi\|^2_C+\Lambda E^{\mathcal{F}_s}\int_s^T|v(t)-\hat v(t)|^2\,dt.\tag{2.7}$$

Given a control process $v(\cdot)\in\mathcal{U}_{ad}$, we introduce the associated cost functional
$$J(s,\varphi;v(\cdot)):=Y^{s,\varphi;v}(t)\big|_{t=s},\qquad(s,\varphi)\in[0,T]\times C.\tag{2.8}$$
For each initial datum $(s,\varphi)\in[0,T]\times C$, the optimal control problem is to find $v(\cdot)\in\mathcal{U}_{ad}$ so as to maximize the objective function $J$.

Remark 2.4. The problem formulated above is a stochastic recursive optimal control problem with delay. In the financial market, $X^{s,\varphi;v}(t)$ can represent the wealth of the investor and $Y^{s,\varphi;v}(t)$ the recursive utility cost functional (see, for example, Ref. [3]).
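For intuition only (our own sketch, not from the paper): in the special additive case where the driver $f$ does not depend on $(Y,Z)$, the solution of BSDE (2.5) at time $s$ reduces to the conditional expectation $Y(s)=E[\Phi(X(T))+\int_s^T f(r,X(r))\,dr]$, so the cost functional $J(s,\varphi;v)$ can be estimated by plain Monte Carlo. The segment dependence is suppressed here; general recursive utilities require a genuine BSDE solver.

```python
import numpy as np

def recursive_cost_additive(f, Phi, paths, times):
    """Monte Carlo estimate of J(s, phi; v) = Y^{s,phi;v}(s) when the driver
    f in BSDE (2.5) does not depend on (Y, Z), so that
        Y(s) = E[ Phi(X(T)) + int_s^T f(r, X(r)) dr ].
    paths: array (n_sims, len(times)) of simulated state values."""
    dt = np.diff(times)
    running = (f(times[:-1], paths[:, :-1]) * dt).sum(axis=1)  # time integral
    return float(np.mean(running + Phi(paths[:, -1])))

# sanity check: X = W (Brownian motion), f = 0, Phi(x) = x^2  =>  J = T = 1
rng = np.random.default_rng(1)
times = np.linspace(0.0, 1.0, 101)
dW = rng.normal(0.0, 0.1, size=(20000, 100))   # sqrt(dt) = 0.1
paths = np.concatenate([np.zeros((20000, 1)), dW.cumsum(axis=1)], axis=1)
J = recursive_cost_additive(lambda t, x: 0.0 * x, lambda x: x**2, paths, times)
```

The sanity check uses $E[W(1)^2]=1$, which the estimate recovers up to Monte Carlo error.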

3. Dynamic programming principle of the problem

In this section, we will prove that the dynamic programming principle still holds for the above optimization problem.

We define the value function of the optimal control problem

$$u(s,\varphi):=\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}J(s,\varphi;v(\cdot)),\qquad(s,\varphi)\in[0,T]\times C.\tag{3.1}$$
For each $s>0$, we denote by $\{\mathcal{F}^s_t,\ s\le t\le T\}$ the natural filtration of the Brownian motion $\{W(t)-W(s),\ s\le t\le T\}$, augmented by the $P$-null sets of $\mathcal{F}$. We also introduce the following subspaces of $\mathcal{U}_{ad}$:
$$\mathcal{U}^s_{ad}:=\{v(\cdot)\in\mathcal{U}_{ad}\mid v(t)\text{ is }\mathcal{F}^s_t\text{-progressively measurable},\ \forall\,s\le t\le T\},$$
$$\bar{\mathcal{U}}^s_{ad}:=\Big\{v(t)=\sum_{j=1}^N v_j(t)1_{A_j}\ \Big|\ v_j(\cdot)\in\mathcal{U}^s_{ad},\ \{A_j\}_{j=1}^N\text{ is a partition of }(\Omega,\mathcal{F}_s),\ N=1,2,\dots\Big\}.$$


Lemma 3.1.
$$X^{s,\varphi;\sum_{j=1}^N v_j1_{A_j}}_t=\sum_{j=1}^N 1_{A_j}X^{s,\varphi;v_j}_t,$$
$$Y^{s,\varphi;\sum_{j=1}^N v_j1_{A_j}}(t)=\sum_{j=1}^N 1_{A_j}Y^{s,\varphi;v_j}(t),$$
$$Z^{s,\varphi;\sum_{j=1}^N v_j1_{A_j}}(t)=\sum_{j=1}^N 1_{A_j}Z^{s,\varphi;v_j}(t).$$

Proof. The main idea of the proof is essentially the same as that in Peng [11], hence we only give a sketch of the points that differ. If $v=\sum_{j=1}^N 1_{A_j}v_j$, for each $j$ we denote
$$(X^j(t),Y^j(t),Z^j(t))=(X^{s,\varphi;v}(t),Y^{s,\varphi;v}(t),Z^{s,\varphi;v}(t))\big|_{v=v_j},$$
and $X^j_r=X_r|_{v=v_j}$. We can see that $X^j(t)$, $Y^j(t)$, $Z^j(t)$ are the solutions of the following SDDE and BSDE, respectively:
$$X^j(t)=\varphi(s)+\int_s^t b(r,X^j_r,v_j(r))\,dr+\int_s^t\sigma(r,X^j_r,v_j(r))\,dW(r),\qquad t\in[s,T],$$
$$Y^j(t)=\Phi(X^j_T)+\int_t^T f(r,X^j_r,Y^j(r),Z^j(r),v_j(r))\,dr-\int_t^T Z^j(r)\,dW(r),\qquad t\in[s,T].$$

Multiplying both sides of the above equations by $1_{A_j}$ and summing over $j$, we have
$$\sum_{j=1}^N 1_{A_j}X^j(t)=\sum_{j=1}^N 1_{A_j}\varphi(s)+\int_s^t b\Big(r,\sum_{j=1}^N 1_{A_j}X^j_r,\sum_{j=1}^N 1_{A_j}v_j(r)\Big)dr+\int_s^t\sigma\Big(r,\sum_{j=1}^N 1_{A_j}X^j_r,\sum_{j=1}^N 1_{A_j}v_j(r)\Big)dW(r),$$
$$\sum_{j=1}^N 1_{A_j}Y^j(t)=\Phi\Big(\sum_{j=1}^N 1_{A_j}X^j_T\Big)+\int_t^T f\Big(r,\sum_{j=1}^N 1_{A_j}X^j_r,\sum_{j=1}^N 1_{A_j}Y^j(r),\sum_{j=1}^N 1_{A_j}Z^j(r),\sum_{j=1}^N 1_{A_j}v_j(r)\Big)dr-\int_t^T\sum_{j=1}^N 1_{A_j}Z^j(r)\,dW(r),$$

by virtue of $\sum_j\Phi(x_j)1_{A_j}=\Phi\big(\sum_j x_j1_{A_j}\big)$. Since $\sum_{j=1}^N 1_{A_j}\varphi=\varphi$ and by the uniqueness of solutions for the SDDE and the BSDE, we have
$$X^{s,\varphi;v}(t)=X^{s,\varphi;\sum_{j=1}^N 1_{A_j}v_j}(t)=\sum_{j=1}^N 1_{A_j}X^j(t),$$
$$Y^{s,\varphi;v}(t)=Y^{s,\varphi;\sum_{j=1}^N 1_{A_j}v_j}(t)=\sum_{j=1}^N 1_{A_j}Y^j(t),$$
$$Z^{s,\varphi;v}(t)=Z^{s,\varphi;\sum_{j=1}^N 1_{A_j}v_j}(t)=\sum_{j=1}^N 1_{A_j}Z^j(t).$$
It also follows that
$$X^{s,\varphi;v}_t=X^{s,\varphi;\sum_{j=1}^N 1_{A_j}v_j}_t=\sum_{j=1}^N 1_{A_j}X^j_t.\qquad\square$$

Proposition 3.2. Under the assumptions (A1)–(A5), the value function $u(s,\varphi)$ defined in (3.1) is a deterministic function.

Proof. Firstly, we will prove that
$$\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}J(s,\varphi;v(\cdot))=\operatorname*{ess\,sup}_{v(\cdot)\in\bar{\mathcal{U}}^s_{ad}}J(s,\varphi;v(\cdot)).\tag{3.2}$$
From the fact that $\bar{\mathcal{U}}^s_{ad}$ is a subset of $\mathcal{U}_{ad}$, we have
$$\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}J(s,\varphi;v(\cdot))\ge\operatorname*{ess\,sup}_{v(\cdot)\in\bar{\mathcal{U}}^s_{ad}}J(s,\varphi;v(\cdot)).$$
Consequently, we only need to prove the reverse inequality. For any $v(\cdot),\hat v(\cdot)\in\mathcal{U}_{ad}$, we have
$$E\{|Y^{s,\varphi;v}(s)-Y^{s,\varphi;\hat v}(s)|\}\le\Lambda E\int_s^T|v(t)-\hat v(t)|^2\,dt$$
by Proposition 2.3. Note that $\bar{\mathcal{U}}^s_{ad}$ is dense in $\mathcal{U}_{ad}$ under the norm of $H^2_{\mathcal{F}}(0,T)$, so we know that, for each $v(\cdot)\in\mathcal{U}_{ad}$, there exists a sequence $\{v_n(\cdot)\}_{n=1}^\infty\subset\bar{\mathcal{U}}^s_{ad}$ such that
$$\lim_{n\to\infty}E\{|Y^{s,\varphi;v_n}(s)-Y^{s,\varphi;v}(s)|^2\}=0.$$
Hence, there exists a subsequence, which we also denote by $\{v_n(\cdot)\}_{n=1}^\infty$, such that
$$\lim_{n\to\infty}Y^{s,\varphi;v_n}(s)=Y^{s,\varphi;v}(s)\quad\text{a.s.},$$
i.e.,
$$\lim_{n\to\infty}J(s,\varphi;v_n(\cdot))=J(s,\varphi;v(\cdot))\quad\text{a.s.}$$
By the arbitrariness of $v(\cdot)$ and the definition of the essential supremum, we get
$$\operatorname*{ess\,sup}_{v(\cdot)\in\bar{\mathcal{U}}^s_{ad}}J(s,\varphi;v(\cdot))\ge\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}J(s,\varphi;v(\cdot)).$$
Then (3.2) is proved.

Secondly, we want to prove
$$\operatorname*{ess\,sup}_{v(\cdot)\in\bar{\mathcal{U}}^s_{ad}}J(s,\varphi;v(\cdot))=\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}^s_{ad}}J(s,\varphi;v(\cdot)).\tag{3.3}$$
Obviously,
$$\operatorname*{ess\,sup}_{v(\cdot)\in\bar{\mathcal{U}}^s_{ad}}J(s,\varphi;v(\cdot))\ge\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}^s_{ad}}J(s,\varphi;v(\cdot)).$$
On the other hand, from Lemma 3.1, we have
$$J(s,\varphi;v(\cdot))=J\Big(s,\varphi;\sum_{j=1}^N 1_{A_j}v_j(\cdot)\Big)=\sum_{j=1}^N 1_{A_j}J(s,\varphi;v_j(\cdot))$$
for all $v(\cdot)\in\bar{\mathcal{U}}^s_{ad}$. Note that the $v_j(\cdot)$ $(j=1,2,\dots,N)$ are $\{\mathcal{F}^s_t\}$-progressively measurable, so $J(s,\varphi;v_j(\cdot))$ $(j=1,2,\dots,N)$ are deterministic. Without loss of generality, we assume that
$$J(s,\varphi;v_1(\cdot))\ge J(s,\varphi;v_j(\cdot)),\qquad\forall\,j=2,3,\dots,N.$$

Then
$$J(s,\varphi;v(\cdot))\le J(s,\varphi;v_1(\cdot))\le\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}^s_{ad}}J(s,\varphi;v(\cdot)).$$
Since $v(\cdot)$ is arbitrary, we obtain
$$\operatorname*{ess\,sup}_{v(\cdot)\in\bar{\mathcal{U}}^s_{ad}}J(s,\varphi;v(\cdot))\le\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}^s_{ad}}J(s,\varphi;v(\cdot)).$$
The desired equality (3.3) is obtained.

From the definition of $\mathcal{U}^s_{ad}$, we know that the cost functional $J(s,\varphi;v(\cdot))$ is deterministic when $v(\cdot)\in\mathcal{U}^s_{ad}$. Hence we get that
$$u(s,\varphi)=\sup_{v(\cdot)\in\mathcal{U}^s_{ad}}J(s,\varphi;v(\cdot))$$
is deterministic, which is our desired result. □

Next, we explore the continuity of the value function $u(s,\varphi)$ with respect to $\varphi$.

Lemma 3.3. For each $s\in[0,T]$ and $\varphi,\hat\varphi\in C$, we have
(i) $|u(s,\varphi)-u(s,\hat\varphi)|^2\le\Lambda\|\varphi-\hat\varphi\|^2_C$;
(ii) $|u(s,\varphi)|\le\Lambda(1+\|\varphi\|_C)$.

Proof. From Proposition 2.3, for each admissible control $v(\cdot)\in\mathcal{U}_{ad}$, we have
$$|J(s,\varphi;v(\cdot))|\le\Lambda(1+\|\varphi\|_C),\tag{3.4}$$
$$|J(s,\varphi;v(\cdot))-J(s,\hat\varphi;v(\cdot))|^2\le\Lambda\|\varphi-\hat\varphi\|^2_C.\tag{3.5}$$
On the other hand, for each $\varepsilon>0$, there exist $v(\cdot),\hat v(\cdot)\in\mathcal{U}_{ad}$ s.t.
$$J(s,\varphi;\hat v(\cdot))\le u(s,\varphi)\le J(s,\varphi;v(\cdot))+\varepsilon,\qquad J(s,\hat\varphi;v(\cdot))\le u(s,\hat\varphi)\le J(s,\hat\varphi;\hat v(\cdot))+\varepsilon.$$
Then from (3.4), we get
$$-\Lambda(1+\|\varphi\|_C)\le J(s,\varphi;\hat v(\cdot))\le u(s,\varphi)\le J(s,\varphi;v(\cdot))+\varepsilon\le\Lambda(1+\|\varphi\|_C)+\varepsilon.$$
Then (ii) follows from the arbitrariness of $\varepsilon$. Similarly,
$$J(s,\varphi;\hat v(\cdot))-J(s,\hat\varphi;\hat v(\cdot))-\varepsilon\le u(s,\varphi)-u(s,\hat\varphi)\le J(s,\varphi;v(\cdot))-J(s,\hat\varphi;v(\cdot))+\varepsilon,$$
$$|u(s,\varphi)-u(s,\hat\varphi)|\le\max\{|J(s,\varphi;v(\cdot))-J(s,\hat\varphi;v(\cdot))|,\ |J(s,\varphi;\hat v(\cdot))-J(s,\hat\varphi;\hat v(\cdot))|\}+\varepsilon,$$
$$|u(s,\varphi)-u(s,\hat\varphi)|^2\le2\max\{|J(s,\varphi;v(\cdot))-J(s,\hat\varphi;v(\cdot))|^2,\ |J(s,\varphi;\hat v(\cdot))-J(s,\hat\varphi;\hat v(\cdot))|^2\}+2\varepsilon^2\le2\Lambda\|\varphi-\hat\varphi\|^2_C+2\varepsilon^2.$$
So (i) holds. □


From (2.8), we know $Y^{s,\varphi;v}(s)=J(s,\varphi;v(\cdot))$ for deterministic $\varphi\in C$. For $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$, we need the following result:

Lemma 3.4. For any bounded $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$, we can find a sequence
$$\eta_m=\sum_{j=1}^m 1_{A_j}\varphi_j\tag{3.6}$$
which converges to $\zeta$ in $L^2(\Omega,C;\mathcal{F}_s)$, where $\varphi_j$ is a function defined on $[-\delta,0]$ for $1\le j\le m$ and $\{A_j\}_{j=1}^m$ is a partition of $(\Omega,\mathcal{F}_s)$.

Proof. For any bounded $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$, we also write it as $\zeta(r,\omega)$, $r\in[s-\delta,s]$. We define
$$\zeta^m(r,\omega)=\sum_{i=-2^m}^{-1}\zeta\Big(\frac{i}{2^m}\delta,\omega\Big)1_{[\frac{i}{2^m}\delta,\frac{i+1}{2^m}\delta)}(r),\qquad-\delta\le r\le0.$$
For every $\omega\in\Omega$, since the trajectory is continuous on the closed interval $[s-\delta,s]$, the continuity is uniform. We have
$$\lim_{m\to\infty}\|\zeta^m(\cdot,\omega)-\zeta(\cdot,\omega)\|_C=0\quad\text{for any }\omega\in\Omega.$$
Since $\zeta^m$ is uniformly bounded, by Lebesgue's dominated convergence theorem we have
$$\lim_{m\to\infty}\|\zeta^m-\zeta\|^2_{L^2(\Omega,C)}=\lim_{m\to\infty}E\big[\|\zeta^m(\cdot,\omega)-\zeta(\cdot,\omega)\|^2_C\big]=0.$$
So, for any given $\varepsilon>0$, there exists $M>0$ such that, whenever $m>M$, we have
$$\|\zeta^m-\zeta\|_{L^2(\Omega,C)}\le\frac{\varepsilon}{3}.\tag{3.7}$$

For each $i$ and $m$, let us analyze the $\mathcal{F}_s$-measurable random variable $\zeta(\frac{i}{2^m}\delta,\omega)$. It is easy to see that there exists a partition $\{A^{i,m}_j\}_{j=1}^{N_{i,m}}$ of $(\Omega,\mathcal{F}_s)$ and a sequence $\phi^{i,m}_j\in\mathbb{R}^n$, $j=1,2,\dots,N_{i,m}$, such that
$$\bigg|\sum_{j=1}^{N_{i,m}}1_{A^{i,m}_j}(\omega)\phi^{i,m}_j-\zeta\Big(\frac{i}{2^m}\delta,\omega\Big)\bigg|\le\frac{\varepsilon}{3}\quad\text{for any }(i,m).\tag{3.8}$$

We notice that the partitions above depend on $(i,m)$. Without loss of generality, we may use a finer partition $\{A^m_j\}_{j=1}^{N_m}$ of $(\Omega,\mathcal{F}_s)$, generated by all the partitions $\{A^{i,m}_j\}_{j=1}^{N_{i,m}}$, so that (3.8) holds for all $i$; the advantage of this refined partition is that it depends only on $m$. Then we have
$$\bigg|\sum_{j=1}^{N_m}1_{A^m_j}(\omega)\phi^{i,m}_j-\zeta\Big(\frac{i}{2^m}\delta,\omega\Big)\bigg|\le\frac{\varepsilon}{3}\quad\text{for any }(i,m).$$

We define
$$\eta_m(r,\omega)=\sum_{i=-2^m}^{-1}\sum_{j=1}^{N_m}\phi^{i,m}_j1_{A^m_j}(\omega)1_{[\frac{i}{2^m}\delta,\frac{i+1}{2^m}\delta)}(r)=\sum_{j=1}^{N_m}1_{A^m_j}(\omega)\sum_{i=-2^m}^{-1}\phi^{i,m}_j1_{[\frac{i}{2^m}\delta,\frac{i+1}{2^m}\delta)}(r)=\sum_{j=1}^{N_m}1_{A^m_j}(\omega)\,\tilde\varphi^m_j(r).$$


For any $m$, we calculate
$$|\eta_m(r,\omega)-\zeta^m(r,\omega)|=\bigg|\sum_{i=-2^m}^{-1}\sum_{j=1}^{N_m}\phi^{i,m}_j1_{A^m_j}(\omega)1_{[\frac{i}{2^m}\delta,\frac{i+1}{2^m}\delta)}(r)-\sum_{i=-2^m}^{-1}\zeta\Big(\frac{i}{2^m}\delta,\omega\Big)1_{[\frac{i}{2^m}\delta,\frac{i+1}{2^m}\delta)}(r)\bigg|$$
$$\le\sum_{i=-2^m}^{-1}\bigg|\sum_{j=1}^{N_m}\phi^{i,m}_j1_{A^m_j}(\omega)-\zeta\Big(\frac{i}{2^m}\delta,\omega\Big)\bigg|1_{[\frac{i}{2^m}\delta,\frac{i+1}{2^m}\delta)}(r)\le\frac{\varepsilon}{3}\sum_{i=-2^m}^{-1}1_{[\frac{i}{2^m}\delta,\frac{i+1}{2^m}\delta)}(r)=\frac{\varepsilon}{3}.$$
Moreover,
$$\|\eta_m-\zeta^m\|_{L^2(\Omega,C)}\le\frac{\varepsilon}{3}.\tag{3.9}$$
Combining (3.7) and (3.9), we obtain the conclusion: for any given $\varepsilon>0$, there exists $M>0$ such that, whenever $m>M$, $\|\eta_m-\zeta\|_{L^2(\Omega,C)}\le\varepsilon$. We complete the proof. □
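The dyadic freezing used in this proof is easy to visualize numerically. In the sketch below (our own illustration, with a smooth deterministic path standing in for a trajectory of $\zeta$), the sup-norm error of the piecewise-constant approximation shrinks as $m$ grows, exactly as uniform continuity predicts:

```python
import numpy as np

def dyadic_step_approx(zeta, m, delta=1.0):
    """Piecewise-constant approximation of a continuous path zeta on
    [-delta, 0], as in the proof of Lemma 3.4: on each dyadic cell
    [i*delta/2^m, (i+1)*delta/2^m) the path is frozen at its left endpoint."""
    def zeta_m(r):
        i = np.floor(r * 2**m / delta)         # dyadic cell index
        i = np.clip(i, -2**m, -1)              # r = 0 falls into the last cell
        return zeta(i * delta / 2**m)
    return zeta_m

zeta = lambda r: np.sin(3.0 * r)               # stand-in continuous path
grid = np.linspace(-1.0, 0.0, 2001)
errs = [float(np.max(np.abs(dyadic_step_approx(zeta, m)(grid) - zeta(grid))))
        for m in (2, 4, 6)]
```

The errors decay roughly like the modulus of continuity evaluated at the cell width $\delta/2^m$.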

Lemma 3.5. We have
$$J(s,\zeta;v(\cdot))=Y^{s,\zeta;v}(s)\tag{3.10}$$
for any $s\in[0,T]$, $v(\cdot)\in\mathcal{U}_{ad}$ and any $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$.

Proof. First, we consider the simple case $\zeta=\sum_{j=1}^N 1_{A_j}\varphi_j$, where $\{A_j\}_{j=1}^N$ is a finite partition of $(\Omega,\mathcal{F}_s)$ and $\varphi_j\in C$ for $1\le j\le N$. By an argument similar to that of Lemma 3.1, we deduce that
$$Y^{s,\zeta;v}(s)=\sum_{j=1}^N 1_{A_j}Y^{s,\varphi_j;v}(s)=\sum_{j=1}^N 1_{A_j}J(s,\varphi_j;v(\cdot))=J\Big(s,\sum_{j=1}^N 1_{A_j}\varphi_j;v(\cdot)\Big)=J(s,\zeta;v(\cdot)).$$
This implies that (3.10) holds in the above simple case.

Next, given a bounded $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$, we can choose a sequence of simple processes $\{\zeta_j\}$ of the form (3.6) such that $\{\zeta_j\}$ converges to $\zeta$ in $L^2(\Omega,C;\mathcal{F}_s)$. Then from Proposition 2.3 we can derive
$$E[|Y^{s,\zeta;v}(s)-Y^{s,\zeta_j;v}(s)|^2]\le\Lambda\|\zeta-\zeta_j\|^2_{L^2(\Omega,C)}\to0\quad\text{as }j\to\infty.$$
At the same time, from (3.5) we have the following result about $J$:
$$E[|J(s,\zeta;v(\cdot))-J(s,\zeta_j;v(\cdot))|^2]\le\Lambda\|\zeta-\zeta_j\|^2_{L^2(\Omega,C)}\to0\quad\text{as }j\to\infty.$$
Since $Y^{s,\zeta_j;v}(s)=J(s,\zeta_j;v(\cdot))$, we get our desired conclusion. If $\zeta$ is unbounded, we can define $\zeta_n=(\zeta\wedge n)\vee(-n)$; then $\zeta_n$ is bounded and
$$E[|Y^{s,\zeta;v}(s)-Y^{s,\zeta_n;v}(s)|^2]\le\Lambda\|\zeta-\zeta_n\|^2_{L^2(\Omega,C)}\to0\quad\text{as }n\to\infty.$$
Repeating the argument of the bounded case, we obtain the desired result. □


In order to prove the dynamic programming principle, we need the following lemma:

Lemma 3.6. For each $v(\cdot)\in\mathcal{U}_{ad}$, fixed $s\in[0,T)$ and $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$, we have
$$u(s,\zeta)\ge Y^{s,\zeta;v}(s).\tag{3.11}$$
On the other hand, for each $\varepsilon>0$, there is an admissible control $v(\cdot)\in\mathcal{U}_{ad}$ such that
$$u(s,\zeta)\le Y^{s,\zeta;v}(s)+\varepsilon,\quad\text{a.s.}\tag{3.12}$$

Proof. As in the proof of Lemma 3.5, we first consider the case $\zeta=\sum_{j=1}^N 1_{A_j}\varphi_j$. Then, for each $v(\cdot)\in\mathcal{U}_{ad}$, we have
$$Y^{s,\zeta;v}(s)=Y^{s,\sum_{j=1}^N 1_{A_j}\varphi_j;v}(s)=\sum_{j=1}^N 1_{A_j}Y^{s,\varphi_j;v}(s)\le\sum_{j=1}^N 1_{A_j}u(s,\varphi_j)=u(s,\zeta).$$

Furthermore, if $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$, we can again choose a sequence of simple processes $\{\zeta_j\}$ which converges to $\zeta$ in $L^2(\Omega,C;\mathcal{F}_s)$. So
$$E[|Y^{s,\zeta;v}(s)-Y^{s,\zeta_j;v}(s)|^2]\to0,\qquad E[|u(s,\zeta)-u(s,\zeta_j)|^2]\to0\quad(j\to\infty).$$
Then there exists a subsequence, which we also denote by $\{\zeta_j\}$, such that
$$\lim_{j\to\infty}Y^{s,\zeta_j;v}(s)=Y^{s,\zeta;v}(s)\ \text{a.s.},\qquad\lim_{j\to\infty}u(s,\zeta_j)=u(s,\zeta)\ \text{a.s.}$$
With the help of $Y^{s,\zeta_j;v}(s)\le u(s,\zeta_j)$, $j=1,2,\dots$, we get (3.11).

We can use a similar method to prove (3.12). We first consider $\zeta\in L^\infty(\Omega,C;\mathcal{F}_s)$ with $\|\zeta\|_C\le M$; then we can construct a random variable $\eta\in L^\infty(\Omega,C;\mathcal{F}_s)$ of the form $\eta=\sum_{j=1}^N 1_{A_j}\varphi_j$, with $\varphi_j\in C$ and $\{A_j\}_{j=1}^N$ a partition of $(\Omega,\mathcal{F}_s)$, such that $\|\zeta-\eta\|_{L^2(\Omega,C)}$ is so small that, for any $v(\cdot)\in\mathcal{U}_{ad}$, we have
$$|Y^{s,\zeta;v}(s)-Y^{s,\eta;v}(s)|\le\frac{\varepsilon}{3},\qquad|u(s,\zeta)-u(s,\eta)|\le\frac{\varepsilon}{3}.\tag{3.13}$$
Then, for each $\varphi_j$, we can choose an $\mathcal{F}^s_t$-adapted admissible control $v_j(\cdot)$ such that
$$u(s,\varphi_j)\le Y^{s,\varphi_j;v_j}(s)+\frac{\varepsilon}{3}.$$
If we set $v(\cdot)=\sum_{j=1}^N 1_{A_j}v_j(\cdot)$, we can derive the following with (3.13):
$$Y^{s,\zeta;v}(s)\ge-|Y^{s,\eta;v}(s)-Y^{s,\zeta;v}(s)|+Y^{s,\eta;v}(s)\ge-\frac{\varepsilon}{3}+\sum_{j=1}^N 1_{A_j}Y^{s,\varphi_j;v_j}(s)$$
$$\ge-\frac{\varepsilon}{3}+\sum_{j=1}^N 1_{A_j}\Big(u(s,\varphi_j)-\frac{\varepsilon}{3}\Big)=-\frac{2\varepsilon}{3}+\sum_{j=1}^N 1_{A_j}u(s,\varphi_j)=-\frac{2\varepsilon}{3}+u(s,\eta)\ge-\varepsilon+u(s,\zeta).$$
That is to say, (3.12) holds for any $\zeta\in L^\infty(\Omega,C;\mathcal{F}_s)$.

For the general case $\zeta\in L^2(\Omega,C;\mathcal{F}_s)$, we know that $\zeta$ can be decomposed as
$$\zeta=\sum_{j=1}^\infty 1_{A_j}\zeta_j,$$
where $\{A_j\}_{j=1}^\infty$ is a partition of $(\Omega,\mathcal{F}_s)$ and $\zeta_j\in L^\infty(\Omega,C;\mathcal{F}_s)$ with $\|\zeta_j\|_C\le j$, $j=1,2,\dots$. Hence, for each $\zeta_j$, there exists $v_j(\cdot)\in\mathcal{U}_{ad}$ such that
$$u(s,\zeta_j)\le Y^{s,\zeta_j;v_j}(s)+\varepsilon.$$
Denoting $v(\cdot)=\sum_{j=1}^\infty 1_{A_j}v_j(\cdot)$, we get
$$u(s,\zeta)=u\Big(s,\sum_{j=1}^\infty 1_{A_j}\zeta_j\Big)=\sum_{j=1}^\infty 1_{A_j}u(s,\zeta_j)\le\sum_{j=1}^\infty 1_{A_j}\big(Y^{s,\zeta_j;v_j}(s)+\varepsilon\big)=\sum_{j=1}^\infty 1_{A_j}Y^{s,\zeta_j;v_j}(s)+\varepsilon=Y^{s,\zeta;v}(s)+\varepsilon.$$
This completes the proof. □

Next, we introduce a family of backward semigroups, following Peng [11]. Given the initial condition $(s,\varphi)\in[0,T)\times C$ and an admissible control $v(\cdot)\in\mathcal{U}_{ad}$, if $\tau\le T-s$ is a positive number and $\zeta\in L^2(\Omega,\mathcal{F}_{s+\tau})$, we denote
$$G^{s,\varphi;v}_{s,s+\tau}[\zeta]:=Y^{s,\varphi;v}(s),$$
where $(Y^{s,\varphi;v}(t),Z^{s,\varphi;v}(t))_{s\le t\le s+\tau}$ is the solution of the following BSDE:
$$\begin{cases}-dY^{s,\varphi;v}(t)=f(t,X^{s,\varphi;v}_t,Y^{s,\varphi;v}(t),Z^{s,\varphi;v}(t),v(t))\,dt-Z^{s,\varphi;v}(t)\,dW(t), & t\in[s,s+\tau],\\ Y^{s,\varphi;v}(s+\tau)=\zeta.\end{cases}$$
Obviously,
$$G^{s,\varphi;v}_{s,T}\big[\Phi(X^{s,\varphi;v}_T)\big]=G^{s,\varphi;v}_{s,s+\tau}\big[Y^{s,\varphi;v}(s+\tau)\big].\tag{3.14}$$
Then we can state the main result of this section, the generalized dynamic programming principle for our stochastic recursive optimal control problem with delayed system.

Theorem 3.7 (the generalized dynamic programming principle). Let (A1)–(A5) hold. Then the value function $u(s,\varphi)$ defined by (3.1) for our optimal control problem with the delayed system has the following property: for each $0\le\tau\le T-s$,
$$u(s,\varphi)=\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}G^{s,\varphi;v}_{s,s+\tau}\big[u(s+\tau,X^{s,\varphi;v}_{s+\tau})\big].\tag{3.15}$$

Proof. We have
$$u(s,\varphi)=\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}G^{s,\varphi;v}_{s,T}\big[\Phi(X^{s,\varphi;v}_T)\big]=\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}G^{s,\varphi;v}_{s,s+\tau}\big[Y^{s,\varphi;v}(s+\tau)\big]=\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}G^{s,\varphi;v}_{s,s+\tau}\big[Y^{s+\tau,X^{s,\varphi;v}_{s+\tau};v}(s+\tau)\big].$$
From Lemma 3.6 and the comparison theorem of BSDEs, we have
$$u(s,\varphi)\le\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}G^{s,\varphi;v}_{s,s+\tau}\big[u(s+\tau,X^{s,\varphi;v}_{s+\tau})\big].$$

Moreover, also by Lemma 3.6, we can conclude that for every $\varepsilon>0$ there exists an admissible control $\hat v(\cdot)\in\mathcal{U}_{ad}$ such that
$$u(s+\tau,X^{s,\varphi;v}_{s+\tau})\le Y^{s+\tau,X^{s,\varphi;v}_{s+\tau};\hat v}(s+\tau)+\varepsilon.$$
For each $v(\cdot)\in\mathcal{U}_{ad}$, we denote $\bar v(t)=1_{\{t\le s+\tau\}}v(t)+1_{\{t>s+\tau\}}\hat v(t)$. From the above inequality and the comparison theorem, we get
$$Y^{s+\tau,X^{s,\varphi;\bar v}_{s+\tau};\bar v}(s+\tau)\ge u(s+\tau,X^{s,\varphi;\bar v}_{s+\tau})-\varepsilon,$$
$$u(s,\varphi)\ge\operatorname*{ess\,sup}_{\bar v(\cdot)\in\mathcal{U}^{\hat v}_{ad}}G^{s,\varphi;\bar v}_{s,s+\tau}\big[u(s+\tau,X^{s,\varphi;\bar v}_{s+\tau})-\varepsilon\big],$$
where
$$\mathcal{U}^{\hat v}_{ad}:=\{\bar v(\cdot)\in\mathcal{U}_{ad}\mid\bar v(t)=1_{\{t\le s+\tau\}}v(t)+1_{\{t>s+\tau\}}\hat v(t)\text{ for some }v(\cdot)\in\mathcal{U}_{ad}\}.$$
From the estimates for BSDEs, there exists a positive constant $\Lambda_0$ such that
$$u(s,\varphi)\ge\operatorname*{ess\,sup}_{\bar v(\cdot)\in\mathcal{U}^{\hat v}_{ad}}G^{s,\varphi;\bar v}_{s,s+\tau}\big[u(s+\tau,X^{s,\varphi;\bar v}_{s+\tau})\big]-\Lambda_0\varepsilon.$$
Letting $\varepsilon\to0$, we obtain
$$u(s,\varphi)\ge\operatorname*{ess\,sup}_{\bar v(\cdot)\in\mathcal{U}^{\hat v}_{ad}}G^{s,\varphi;\bar v}_{s,s+\tau}\big[u(s+\tau,X^{s,\varphi;\bar v}_{s+\tau})\big].$$
From the definition of $\bar v(\cdot)$ and the arbitrariness of $v(\cdot)\in\mathcal{U}_{ad}$, we know
$$u(s,\varphi)\ge\operatorname*{ess\,sup}_{v(\cdot)\in\mathcal{U}_{ad}}G^{s,\varphi;v}_{s,s+\tau}\big[u(s+\tau,X^{s,\varphi;v}_{s+\tau})\big].$$
Then (3.15) is obtained. □

Remark 3.8. Equality (3.15) shows that the value function u(s, φ) obeys the dynamic programming principle, which also holds in the usual optimization case without recursive utility (see Refs. [6,7]).
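To see the mechanics of (3.15) in the simplest possible setting (a toy sketch of our own, not from the paper): drop the delay, take driver $f=0$ so that the backward semigroup $G^{s,\varphi;v}_{s,s+\tau}[\cdot]$ degenerates to a plain expectation, and put the one-dimensional state on a grid. One backward step of the resulting Bellman recursion then reads:

```python
import numpy as np

def bellman_step(u_next, x_grid, controls, tau, nodes, weights):
    """One backward step of the DPP (3.15) in the degenerate case f = 0,
    where the backward semigroup reduces to an expectation:
        u(s, x) = max_v E[ u(s + tau, x + v*tau + sqrt(tau)*N) ],
    with N standard normal, approximated by quadrature nodes/weights."""
    u_now = np.empty_like(u_next)
    for i, x in enumerate(x_grid):
        best = -np.inf
        for v in controls:
            x_next = x + v * tau + np.sqrt(tau) * nodes
            best = max(best, float(weights @ np.interp(x_next, x_grid, u_next)))
        u_now[i] = best
    return u_now

# toy check: terminal payoff -x^2, unit volatility; at x = 0 the best
# control is v = 0 and one step gives u = -tau (the accumulated variance)
nodes = np.array([-np.sqrt(3.0), 0.0, np.sqrt(3.0)])   # 3-point Gauss-Hermite
weights = np.array([1.0, 4.0, 1.0]) / 6.0              # matches E[N^2] = 1
x_grid = np.linspace(-2.0, 2.0, 401)
u0 = bellman_step(-x_grid**2, x_grid, [-1.0, 0.0, 1.0], 0.1, nodes, weights)
```

With terminal payoff $-x^2$ and unit volatility, the optimal one-step value at $x=0$ is $-\tau$ (keep $v=0$ and pay the variance), which the grid recursion reproduces.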

4. The Hamilton-Jacobi-Bellman equations

We call (3.15) the dynamic programming equation. It seems impossible to solve such an equation directly. In this section, we will prove that, under some smoothness conditions, the value function $u(\cdot,\cdot)$ defined by (3.1) satisfies a kind of partial differential equation: the HJB equation.

4.1. The notions and definitions

We first introduce some notation and definitions, which are also used in [1,4,5,9].

Let $C_b$ be the Banach space of all bounded uniformly continuous functions $\Phi:C\to\mathbb{R}$ with the sup norm
$$\|\Phi\|_{C_b}:=\sup_{\eta\in C}|\Phi(\eta)|,\qquad\Phi\in C_b.$$
Define the operator $P_t:C_b\to C_b$, $t\ge0$, on $C_b$ by
$$P_t(\Phi)(\eta):=E[\Phi(X^\eta_t)],\qquad t\ge0,\ \Phi\in C_b,\ \eta\in C.$$
For any $\Phi\in C_b$ and any finite Borel measure $\mu$ on $C$, define the pairing
$$\langle\Phi,\mu\rangle:=\int_{\eta\in C}\Phi(\eta)\,d\mu(\eta).$$
We also define a generator $A:D(A)\subset C_b\to C_b$ of $\{P_t\}_{t\ge0}$ by the weak limit
$$A(\Phi)(\eta):=w\text{-}\lim_{t\to0^+}\frac{P_t(\Phi)-\Phi}{t},$$
where $\Phi$ belongs to the domain $D(A)$ of $A$ if and only if the above weak limit exists in $C_b$.


Then we can easily obtain:

Lemma 4.1.
$$\frac{d}{dt}P_t(\Phi)=A(P_t(\Phi))=P_t(A(\Phi)),\qquad t\ge0,$$
for any $\Phi\in D(A)$.

Let $F_n:=\{\kappa1_{\{0\}}:\kappa\in\mathbb{R}^n\}$ and $C\oplus F_n:=\{\eta+\kappa1_{\{0\}}:\eta\in C,\ \kappa\in\mathbb{R}^n\}$ with norm $\|\eta+\kappa1_{\{0\}}\|:=\|\eta\|_C+|\kappa|$ for $\eta\in C$, $\kappa\in\mathbb{R}^n$, where $1_{\{0\}}:[-\delta,0]\to\mathbb{R}$ is defined by
$$1_{\{0\}}(\theta)=\begin{cases}0, & \theta\in[-\delta,0),\\ 1, & \theta=0.\end{cases}$$
For a Borel measurable function $\Phi:C\to\mathbb{R}$, we also define
$$S(\Phi)(\eta):=\lim_{t\to0}\frac{\Phi(\tilde\eta_t)-\Phi(\eta)}{t}$$
for all $\eta\in C$, where $\tilde\eta:[-\delta,T]\to\mathbb{R}^n$ is the extension of $\eta$ defined by
$$\tilde\eta(t):=\begin{cases}\eta(t), & t\in[-\delta,0),\\ \eta(0), & t\ge0,\end{cases}$$
and $\tilde\eta_t\in C$ is defined by $\tilde\eta_t(\theta):=\tilde\eta(t+\theta)$, $\theta\in[-\delta,0]$. Let $\hat D(S)$, the domain of $S$, be the set of $\Phi:C\to\mathbb{R}$ such that the above limit exists for each $\eta\in C$. Define $D(S)$ as the set of all functions $\Phi:[0,T]\times C\to\mathbb{R}$ such that $\Phi(t,\cdot)\in\hat D(S)$ for all $t\in[0,T]$.

In addition, for each sufficiently smooth function $\Phi$, we denote its first and second Fréchet derivatives with respect to $\eta\in C$ by $D\Phi$ and $D^2\Phi$. Let $C^{1,2}_{lip}([0,T]\times C)$ be the set of functions $\Phi:[0,T]\times C\to\mathbb{R}$ such that $\frac{\partial\Phi}{\partial t}$, $D\Phi$ and $D^2\Phi$ exist and are globally bounded and Lipschitz.
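The operator $S$ can be probed numerically by a finite difference on the frozen extension (our own illustration, with an arbitrary quadratic functional as a stand-in). For $\Phi(\eta)=\int_{-\delta}^0\eta(\theta)^2\,d\theta$, a direct computation from the definition gives $S(\Phi)(\eta)=\eta(0)^2-\eta(-\delta)^2$, and the sketch below reproduces this value:

```python
import numpy as np

def S_finite_difference(Phi, eta, grid, t):
    """Finite-difference approximation of
        S(Phi)(eta) = lim_{t -> 0} (Phi(eta~_t) - Phi(eta)) / t,
    where eta~ extends eta by freezing the value eta(0) for times >= 0 and
    eta~_t(theta) = eta~(t + theta).  eta is sampled on grid over [-delta, 0];
    Phi consumes (values, grid)."""
    shifted = grid + t
    eta_shift = np.where(shifted < 0.0,
                         np.interp(shifted, grid, eta),
                         eta[-1])                    # frozen at eta(0)
    return (Phi(eta_shift, grid) - Phi(eta, grid)) / t

def sq_integral(vals, g):
    """Phi(eta) = int over [-delta, 0] of eta(theta)^2 (trapezoid rule)."""
    f = vals ** 2
    return float(np.sum((f[:-1] + f[1:]) * 0.5 * np.diff(g)))

grid = np.linspace(-1.0, 0.0, 4001)
eta = np.cos(2.0 * grid)
approx = S_finite_difference(sq_integral, eta, grid, t=1e-3)
exact = eta[-1] ** 2 - eta[0] ** 2                  # eta(0)^2 - eta(-delta)^2
```

The agreement degrades like $O(t)$ plus the quadrature error, as expected of a one-sided difference.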

Then we can derive a formula for the generator $A$.

Theorem 4.2 ([9]). Suppose that $\Phi\in C^{1,2}_{lip}([0,T]\times C)\cap D(S)$. Let $v(\cdot)\in\mathcal{U}_{ad}$, and let $\{X(t),\ t\in[s,T]\}$ be the $C$-valued Markov solution process of equation (2.2) with the initial data $(s,\eta_s)\in[0,T]\times C$. Then for $\Phi\in D(A)$,
$$A(\Phi)(X_t)=\frac{\partial}{\partial t}\Phi(t,X_t)+S(\Phi)(X_t)+\overline{D\Phi}(t,X_t)\big(b(t,X_t,v(t))1_{\{0\}}\big)+\frac12\sum_{i=1}^d\overline{D^2\Phi}(t,X_t)\big(\sigma(t,X_t,v(t))e_i1_{\{0\}},\,\sigma(t,X_t,v(t))e_i1_{\{0\}}\big),\tag{4.1}$$
where $\overline{D\Phi}:C\oplus F_n\to\mathbb{R}$ and $\overline{D^2\Phi}:(C\oplus F_n)\times(C\oplus F_n)\to\mathbb{R}$ are the continuous linear and bilinear extensions of $D\Phi$ and $D^2\Phi$ respectively, and $\{e_i\}_{i=1}^d$ is the standard basis of $\mathbb{R}^d$.

4.2. Some properties of the value function

We now explore some further properties of the value function defined in (3.1).

Lemma 4.3. Under the assumptions (A1)–(A5), for any $(s,\zeta)\in[0,T)\times L^2(\Omega,C;\mathcal{F}_s)$ and $v\in\mathcal{U}_{ad}$, we have
$$\lim_{t\to s^+}\|X^{s,\zeta;v}_t-\zeta\|_{L^2(\Omega,C)}=0.\tag{4.2}$$

Proposition 4.4. Suppose (A1)–(A5) hold; then the value function $u(s,\varphi)$ is continuous in $(s,\varphi)\in[0,T]\times C$.
