• Aucun résultat trouvé

A verification theorem for optimal stopping problems with expectation constraints

N/A
N/A
Protected

Academic year: 2021

Partager "A verification theorem for optimal stopping problems with expectation constraints"

Copied!
33
0
0

Texte intégral

(1)

HAL Id: hal-01229024

https://hal.archives-ouvertes.fr/hal-01229024

Preprint submitted on 16 Nov 2015

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub-

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non,

A verification theorem for optimal stopping problems with expectation constraints

Stefan Ankirchner, Maike Klein, Thomas Kruse

To cite this version:

Stefan Ankirchner, Maike Klein, Thomas Kruse. A verification theorem for optimal stopping problems

with expectation constraints. 2015. �hal-01229024�

(2)

A verification theorem for optimal stopping problems with expectation

constraints

Stefan Ankirchner Maike Klein Thomas Kruse November 16, 2015

We consider the problem of optimally stopping a continuous-time process with a stopping time satisfying a given expectation cost constraint. We show, by introducing a new state variable, that one can transform the problem into an unconstrained control problem and hence derive a dynamic programming principle. We characterize the value function in terms of the dynamic pro- gramming equation, which turns out to be an elliptic, fully non-linear partial differential equation of second order. We prove a classical verification theo- rem and illustrate its applicability with several examples.

Introduction

Let (X

t

)

t∈R+

be an n-dimensional stochastic state process that satisfies a stochastic differential equation driven by a d-dimensional Brownian motion W . Denote by (F

t

) the filtration that is generated by W and extended by null sets so as to satisfy the usual conditions. Let f : R

n

→ R be a payoff function and H

t

= R

t

0

h(X

s

)ds an increasing cost process, where h : R

n

→ (0, ∞).

We denote by S(T ) the set of (F

t

)-stopping times satisfying the constraint E[H

τ

] ≤ T ∈ R

+

. In this article we consider the optimal stopping problem

maximize E[f (X

τ

)] subject to τ ∈ S(T ). (0.1)

Stefan Ankirchner, Institute for Mathematics, University of Jena, Ernst-Abbe-Platz 2, 07745 Jena, Germany. Email: [email protected], Phone: +49 (0)3641 946275.

Maike Klein, Institute for Mathematics, University of Jena, Ernst-Abbe-Platz 2, 07745 Jena, Ger- many. Email: [email protected], Phone: +49 (0)3641 946186.

Thomas Kruse, Faculty of Mathematics, University of Duisburg-Essen, Thea-Leymann-Str. 9, 45127

Essen, Germany. Email: [email protected], Phone: +49 (0)201 183 3911.

(3)

By choosing h(t) = 1 for all t ∈ R

+

, we obtain as a special case the stopping problem over all stopping times with the expectation constraint E[H

τ

] = E[τ ] ≤ T .

The problem (0.1) captures situations where there is an average time/cost limit for any stopping rule. Whenever a stopping rule is applied repeatedly, an average constraint seems to be more appropriate than a sharp constraint requiring that any stopping time τ satisfies H

τ

≤ T , a.s. Besides, an expec- tation constraint on the stopping time is a suitable counterpart for a gain functional that is itself an expectation.

What makes the stopping problem (0.1) difficult is that there is no simple dependence of the constraint on time. The expectation constraint has to be turned into a scenario-dependent constraint. A first attempt to eliminate the constraint is to follow a Lagrange approach and to consider, for every λ > 0, the unconstrained stopping problem

w(λ) = sup{E[f (X

τ

) − λ(τ − T )] : τ stopping time with E[τ ] < ∞}. (0.2) Notice that (0.2) is an infinite horizon stopping problem that does not in- volve a discount factor. Therefore it is often impossible to characterize w as the unique solution of a dynamic programming equation (cf. Section 6). Disregard this for a moment and assume that we can identify an opti- mal stopping time τ

(λ) for (0.2) and that w is absolutely continuous with

∂w

∂λ

(λ) = −E[τ

(λ)]. If there exists b λ such that

∂w∂λ

(b λ) = −T , then the stop- ping time τ

(b λ) is optimal for the original problem (0.1). It can happen, however, that the function w is not absolutely continuous (see Section 6 for an example). Even if w is differentiable, then it can be involved and error- prone to invert the derivative

∂w∂λ

and to determine the appropriate Lagrange multiplier b λ.

In this article we propose a new approach for solving stopping problems of the type (0.1). Our basic idea is to extend the state space by the conditional expectation process of H

τ

. Assuming a Brownian set-up, the predictable representation property allows to interpret the new state variable as a mar- tingale with controlled diffusion coefficient. One can thus transform the stopping problem (0.1) into a problem with a controlled state and time hori- zon. The advantage of the transformed problem is that it allows to formulate a dynamic programming principle (DPP). With a DPP at hand, we can char- acterize the value function V (T, x) = sup{E[f(X

τ

)] : τ ∈ S(T ) and X

0

= x}

as a solution of the dynamic programming equation (DPE). In order to obtain a classical verification theorem, we consider also the auxiliary stopping prob- lem U (T, x) = sup{E[f(X

τ

)] : X

0

= x, τ a stopping time with E[H

τ

] = T }.

One can show that V (T, x) = U (T ∧ T ˜ (x), x), where T ˜ (x) = inf {T ≥ 0 :

∂U

∂T

(T, x) ≤ 0}. Consequently, U fully determines V . The DPE for U turns

(4)

out to be the partial differential equation (PDE) h(x)U

T

(T, x) − LU (T, x) +

σ

>

(x) · ∇

x

U

T

(T, x)

2

2U

T T

(T, x) = 0, (0.3) with initial condition U (0, x) = f(x); here L is the generator of X and σ its diffusion matrix. We give sufficient conditions for the value function U to be a solution of (0.3). Moreover, we provide a verification theorem that allows to verify whether a solution of (0.3) coincides with U . Since V is determined by U , this allows further to identify an optimal stopping time for the original problem (0.1).

If no closed-form solution is at hand, then one can try to solve the PDE (0.3) numerically. Related is the question whether there is a comparison principle for viscosity solutions. We do not discuss these issues in the present article, but leave them for future research.

The idea to extend the state space by a conditional expectation process in order to make a constraint more tangible can be found already in the control literature. Bouchard, Elie and Touzi [2] consider the problem of attaining a possibly stochastic target with a given probability. They extend the state space by a conditional probability process in order to reduce the problem to a standard stochastic target problem. Bokanowski, Picarelli and Zidani [1]

reformulate a stochastic control problem with a state constraint as a target problem by introducing a conditional expectation process as a new controlled variable.

There are only few articles in the literature that deal with stopping prob- lems of the type (0.1). Kennedy [7] considers the problem of stopping a discrete time process with the constraint that the expectation of any stop- ping time is bounded by some given constant. He uses Lagrangian techniques for determining optimal stopping rules. Horiguchi [5] considers optimal stop- ping of a finite state process that, in addition, can be controlled with finitely many actions. Optimal stopping rules satisfying an expectation constraint are determined with mathematical programming techniques. Palczewski and Stettner [8] consider an undiscounted optimal stopping problem with infinite time horizon of the type (0.3) under the additional assumption that X is an ergodic, time-homogeneous weak Feller process. They state sufficient conditions guaranteeing that the set of stopping times can be restricted to those with bounded expectation. This boundary is in general not global but depends on the initial value of X.

The paper is organized as follows. Section 1 states the precise assump-

tions. In Section 2 we establish a one-to-one correspondence between the set

S(T ) and a class of controlled martingales. The correspondence allows us

to transform the constrained stopping problem so that one can formulate a

dynamic programming principle (see Section 3). Section 4 deals with the dy-

namic programming equation and Section 5 provides a classical verification

(5)

theorem. In Section 6 we briefly compare the new method with the Lagrange approach. Finally, in Section 7 we discuss several examples illustrating the scope of our results.

1 Optimal stopping with expectation constraints

Let (W

t

)

t≥0

be a d-dimensional Brownian motion on a probability space (Ω, F , P ) and denote by (F

t

)

t≥0

its augmented natural filtration. Let µ : R

n

→ R

n

and σ : R

n

→ R

n×d

be Lipschitz-continuous functions and assume that for every x ∈ R

n

the matrix (σσ

>

)(x) ∈ R

n×n

is positive definite. Then there exists a unique R

n

-valued strong solution (X

tx,r

)

t≥r

of the SDE

dX

tx,r

= µ(X

tx,r

)dt + σ(X

tx,r

) · dW

t

, X

rx,r

= x, (1.1) for every r ∈ R

+

and x ∈ R

n

. Moreover, recall that X fulfills the strong Markov property, cf. [6], Chapter 5.

Let h : R

n

→ (0, ∞) be Borel-measurable and define the process (H

tx,r

)

t≥0

by

H

tx,r

= Z

r+t

r

h(X

sx,r

)ds

for x ∈ R

n

and r ∈ R

+

fixed. Denote by T (T ) = T (T, x) and S(T ) = S(T, x) the set of all (F

t

)-stopping times τ with E[H

τx,0

] = T resp. E[H

τx,0

] ≤ T . Note that Assumption (A) guarantees that T (T ) is nonempty; e.g. the stopping time τ = inf{s ≥ 0|H

sx,0

> T } satisfies H

τx,0

= T , and hence lies in T (T ). In the following we sometimes refer to T (T ) as the set of admissible stopping times.

Standing Assumption. Throughout we assume that for all x ∈ R

n

and r ∈ R

+

H

tx,r

< ∞ and lim

t→∞

H

tx,r

= ∞, P − a.s. (A)

Note that if h is bounded and bounded away from zero, i.e. h : R

n

→ [δ, C ] with 0 < δ < C , then Condition (A) is satisfied.

In order to simplify notation, in the following we often write X

tx

and H

tx

instead of X

tx,0

resp. H

tx,0

. In addition, if the starting value x is clear from the context, we omit it.

For a measurable function f : R

n

→ R we consider the following optimal stopping problem with constraint function h

V (T, x) = sup

τ∈S(T)

E[f(X

τx

)], (1.2)

(6)

where T ∈ R

+

and x ∈ R

n

. Here we use the convention that E[f(X

τx

)] = −∞

if both the negative and the positive part of f (X

τx

) have infinite expectation.

Notice that by Assumption (A) every τ ∈ S(T ) is finite a.s. and hence X

τx

is well-defined.

It turns out to be useful to study also the stopping problem with the equality constraint E[H

τ

] = T . We therefore introduce

U (T, x) = sup

τ∈T(T)

E[f (X

τx

)]. (1.3)

Observe that by the very definition the function T 7→ V (T, x) is non-decreasing.

Let T ˜ (x) be the infimum of all time points T at which T 7→ V (T, x) is not increasing (set T ˜ (x) = ∞ if the mapping is strictly increasing everywhere).

Then, for all T < T ˜ (x) we have V (T, x) = U(T, x). Suppose for a moment that T 7→ U (T, x) is concave. Then this implies that also T 7→ V (T, x) is concave. Consequently, if T ˜ (x) < ∞, then V (T, x) = V ( ˜ T (x), x) for all T ≥ T ˜ (x). This means that V is completely determined by U, namely we have

V (T, x) = U(T ∧ T ˜ (x), x), (T, x) ∈ R

+

× R

n

. (1.4) In the next sections we transform problem (1.3) into a control problem with an extended state space and derive a DPE for U . Moreover, we provide a verification theorem that allows to check whether a solution of the DPE coincides with the value function U . The link (1.4) allows us then to identify the value function V and to obtain an optimal stopping time for the original problem (1.2).

In the above derivation of (1.4) we have assumed that T 7→ U (T, x) is concave. Conveniently, concavity turns out to be a consequence of the veri- fication theorem. Notice, however, that one can heuristically show concavity of T 7→ U(T, x) as follows: let τ

1

∈ T (T

1

) and τ

2

∈ T (T

2

). Flip a coin with probability α ∈ (0, 1) for head, and choose τ

1

if head and τ

2

if tail appears.

With the randomized stopping time we can show V (αT

1

+ (1 − α)T

2

, x) ≥ αV (T

1

, x) + (1 − α)V (T

2

, x).

2 Every admissible stopping time is a first hitting time

In this section we establish a one-to-one correspondence between the set of stopping times T (m) and a class of (F

t

)-martingales solving a specific type of SDE with initial value m, where m ∈ R

+

. This correspondence allows us to transform the stopping problems (1.2) and (1.3).

For every τ ∈ T (m), the process (M

t

)

0≤t≤∞

defined by

M

t

= E [H

τ

| F

t

]

(7)

is a continuous (F

t

)-martingale with M

= H

τ

and M

0

= E[H

τ

] = m. Thus, the martingale representation theorem implies

M

t

= m + Z

t

0

α

s

· dW

s

,

where (α

t

)

t≥0

= (α

1t

, ..., α

dt

)

t≥0

∈ L

2loc

(W ), i.e. (α

t

)

t≥0

is measurable, (F

t

)- adapted and there exists a sequence (σ

n

)

n∈N

of (F

t

)-stopping times with σ

n

% ∞ a. s. such that for all n ∈ N

E Z

σn

0

s

|

2

ds

< ∞.

Then, the stopping time τ can be characterized as the first time when the process of conditional expectations falls below the process H

t

.

Lemma 2.1. For an (F

t

)-stopping time τ such that H

τ

is integrable and M

t

:= E[H

τ

| F

t

], t ≥ 0, we have

τ = inf{t ≥ 0 | M

t

≤ H

t

}.

Proof. Notice that

M

t

= E[H

τ

| F

t

] = 1

{τ≤t}

H

τ

+ 1

{τ >t}

E[H

τ

| F

t

]. (2.1)

Therefore, we have M

t

= H

τ

≤ H

t

on {τ ≤ t}, which implies τ ≥ inf {t ≥ 0 | M

t

≤ H

t

}.

On the other hand it holds true that

1

{τ >t}

H

τ

≥ 1

{τ >t}

H

t

and, hence, on {τ > t} it follows from (2.1) that a.s.

M

t

= E[H

τ

| F

t

] > H

t

. From this we deduce that τ ≤ inf {t ≥ 0 | M

t

≤ H

t

}.

Lemma 2.1 implies that for τ ∈ T (m) the following holds true M

s

> H

s

for s < τ = inf{t ≥ 0 | M

t

≤ H

t

}, M

s

= M

τ

for s ≥ τ = inf{t ≥ 0 | M

t

≤ H

t

}.

Therefore, the process M satisfies

dM

t

= 1

{τ >t}

α

t

· dW

t

= 1

{∀s≤t:Ms>Hs}

α

t

· dW

t

.

On {M

t

> H

t

} we have t < τ = inf{s ≥ 0 | M

s

≤ H

s

}, which implies M

s

> H

s

for all s ≤ t. Therefore, it follows that

dM

t

= 1

{Mt>Ht}

α

t

· dW

t

, M

0

= m. (2.2)

We have thus shown that any stopping time τ ∈ T (m) coincides with the

first time a process solving (2.2) hits H

t

. Indeed, there is a one-to-one cor-

respondence between (F

t

)-stopping times with E[H

τ

] = m and processes

(M

t

) satisfying (2.2). To establish this correspondence we need the following

lemma.

(8)

Lemma 2.2. Let (α

t

)

t≥0

= (α

1t

, ..., α

dt

)

t≥0

∈ L

2loc

(W ) and m ∈ R

+

. Then there exists a unique solution M of (2.2). This solution is a non-negative martingale.

Proof. Let (α

t

)

t≥0

∈ L

2loc

(W ), m ∈ R

+

and Y

t

= m + R

t

0

α

s

· dW

s

. Then, τ = inf{t ≥ 0 | Y

t

≤ H

t

}

defines an (F

t

)-stopping time and the stopped process M

t

:= Y

t∧τ

satisfies dM

t

= 1

{τ >t}

α

t

· dW

t

= 1

{∀s≤t:Ys>Hs}

α

t

· dW

t

= 1

{∀s≤t:Ms=YsandMs>Hs}

α

t

· dW

t

.

Notice that M

t

> H

t

implies M

s

> H

s

for all s ≤ t he definition of τ . Hence, {∀s ≤ t : M

s

= Y

s

and M

s

> H

s

} = {M

t

> H

t

}

and (M

t

) solves (2.2).

We next show that M is the unique solution of (2.2). To this end let (N

t

) be another solution of (2.2). Let ρ = inf{t ≥ 0|N

t

≤ H

t

}. Then N

t∧ρ

= m + R

ρ∧t

0

α

s

· dW

s

, which implies ρ = τ and hence N

t

= M

t

on [0, τ ].

After τ we have N

t

= N

τ

= M

τ

= M

t

. Therefore, N = M .

We now show that M is a martingale. First notice that by definition M is a continuous, non-negative local martingale. Hence, it is a supermartingale and the limit

M

= lim

t→∞

M

t

exists almost surely with M

∈ L

1

(Ω).

Note that Fatou’s lemma implies E [M

] ≤ m. In order to prove that M is a true martingale, it suffices to show that E[M

] ≥ m. To this end observe that monotone convergence implies

E[M

] = lim

n→∞

E[M

1

{M≤n}

].

Define the stopping times τ

n

by

τ

n

= inf{t ≥ 0 | M

t

≤ H

t

or M

t

≥ n}.

Then, τ

n

≤ τ almost surely. The stopped processes (M

t∧τn

)

t≥0

are martin- gales with

E[M

τn

] = lim

t→∞

E[M

τn∧t

] = m

(9)

by the dominated convergence theorem. Notice that τ

n

= τ on {M

τ

≤ n}.

This, together with M

= M

τ

, gives

E[M

1

{M≤n}

] = E[M

τn

1

{Mτ≤n}

]

= E[M

τn

] − E[M

τn

1

{Mτ>n}

]

= m − n P [M

τ

> n]

≥ m − E[M

τ

1

{Mτ>n}

]

= m − E[M

1

{M>n}

].

To sum up, we have

E[M

] ≥ m and hence M is a true martingale.

Let M(m) be the set of all solutions M of (2.2) with (α

t

)

t≥0

∈ L

2loc

(W ).

The results obtained so far show that one can identify T (m) with M(m).

Proposition 2.3. There is a one-to-one correspondence between T (m) and M(m) given by

M

t

= E[H

τ

|F

t

], t ≥ 0, and τ = inf{t ≥ 0 | M

t

≤ H

t

}, where τ ∈ T (m) and M ∈ M(m). Moreover, H

τ

= M

τ

= M

.

Proof. The statements follow from Lemma 2.1 and 2.2, and the discussion preceding Lemma 2.2.

Remark 2.4. To emphasize the dependence on α and m, in the following we often write M

α,m

instead of M.

Example 2.5. Let (W

t

)

t≥0

be a Brownian motion in R

d

. For x ∈ R

d

let X

tx

= x + W

t

, t ≥ 0, and the constraint function h is given by h(y) = 1, y ∈ R

d

, hence, we have H

t

= t. Denote by τ

R

the first exit time of the ball around 0 with radius R > |x|, i.e. τ

R

= inf{t ≥ 0 | |X

tx

| ≥ R}. Then, the expected value of τ

R

is given by

E[τ

R

] = R

2

− |x|

2

d ,

see Chapter 4.2.E in [6]. Hence, on {τ

R

> t} the process of conditional expectations M

t

is given by

M

t

= E[τ

R

|F

t

] = t + R

2

− |X

tx

|

2

d

= R

2

− |x|

2

d + 1

d Z

t

0

−2X

sx

· dW

s

.

Thus, α

s

= −2X

sx

/d, s ≥ 0, with localizing sequence τ

n

= n and we conclude

that τ

R

= inf{t ≥ 0 | M

tα,(R2−|x|2)/d

≤ t} from Proposition 2.3.

(10)

In dimension 1 we can extend the above example to exit times of intervals (a, b), a < x < b, instead of intervals (−R, R) with R > |x|.

Example 2.6. For x ∈ R let X

tx

= x + W

t

, t ≥ 0, where (W

t

)

t≥0

is a one-dimensional Brownian motion. Again let h(y) = 1, y ∈ R , and, thus, H

t

= t. The first exit time τ of an interval (a, b), a < x < b, has expectation (b − x)(x − a). The associated process of conditional expectations M

t

on {M

t

> t} is given by

M

t

= (b − x)(x − a) + Z

t

0

α

s

dW

s

with α

s

= −2X

sx

+ b + a, s ≥ 0, and (α

s

) ∈ L

2loc

(W ).

The stopping problem as an optimal control problem

The one-to-one correspondence of Proposition 2.3 allows to reformulate the optimal stopping problems (1.2) and (1.3) as optimal control problems. To do so, observe first that t 7→ H

tx,r

is continuous and strictly increasing almost surely. Moreover, note that Condition (A) guarantees that the pathwise inverse of H

x,r

,

G

x,r

(t) = inf {s ≥ 0 | H

sx,r

≥ t} ,

is defined for t, r ∈ R

+

and x ∈ R

n

. In order to simplify notation, we write G

x

(t) instead of G

x,0

(t).

Notice that if τ ∈ T (m) and M ∈ M(m) such that H

τ

= M

, then τ = G(M

). Therefore, by Proposition 2.3 we have

V (T, x) = sup n E h

f X

x

Gx

(

Mα,m

) i

m ∈ [0, T ], α ∈ L

2loc

(W ) o

(V ) and

U (T, x) = sup n E h

f X

Gxx

(

Mα,T

) i

α ∈ L

2loc

(W ) o

, (U )

where

dM

tα,m

= 1

{Mtα,m>Ht}

α

t

· dW

t

, M

0α,m

= m.

At a first glance, the reformulations (V ) and (U ) look more cumbersome

than the original formulations of the stopping problem. We show in the

next section, however, that the new formulations allow to obtain a dynamic

programming principle.

(11)

3 Derivation of a dynamic programming principle

The aim of this section is to derive a dynamic programming principle (DPP) for U under suitable measurablility assumptions. For the derivation we need that the value function U is finite. We start, therefore, with a subsection providing sufficient conditions for finiteness.

3.1 Finiteness of the value functions

Note that if the payoff function f is bounded, then the value functions are also bounded. We next give a more general condition, in the one-dimensional case n = d = 1, guaranteeing that the value functions V and U are finite.

Let d = n = 1 and denote by I ⊆ R the interior of the state space of X.

By assumption we have σ

2

(x) > 0 for all x ∈ I. Furthermore, we assume that (1 + |µ(x)|)/σ

2

(x) is locally integrable on I (see conditions (ND)

0

and (LI)

0

in Section 5.5.C of [6]). Let X

x

be a solution of (1.1) with X

0x

= x and define the scale function s by

s

x

(y) = Z

y

x

exp

− Z

z

x

2µ(w) σ

2

(w) dw

dz.

Then the process Z

t

:= s

x

(X

tx

), t ≥ 0, is a local martingale starting in 0 with dZ

t

= η(Z

t

) dW

t

,

where η = (s

0x

σ) ◦ s

−1x

. Hence we can convert the optimal stopping problem with reward function f for the process X into an optimal stopping problem with reward function f ◦ s

−1

for Z. Let

q

x

(y) = Z

y

0

Z

z 0

2 h(s

−1x

(w)) η

2

(w) dw dz.

In the following we show that if f ◦ s

−1x

is bounded from above by q

x

, x ∈ I, then V (T, x) < ∞ for all T ∈ R

+

. More precisely,

Proposition 3.1. Let x ∈ I . If there exists C > 0 such that f(s

−1x

(y)) ≤ C(1 + q

x

(y)) for all y ∈ I, then ¯ U (T, x) ≤ V (T, x) < ∞. If, in addition

|E [f (X

G(Tx )

)]| < ∞ for all x ∈ R

n

, then |U(T, x)| < ∞.

Proof. First notice that the definition of V implies V (T, x) ≥ f (x) for all x ∈ I.

The function q

x

is convex, because q

x00

(y) = 2h(s

−1x

(y))/η

2

(y) > 0 and hence

we can apply Itô’s formula for convex functions to show that q

x

(Z

t

) − H

tx

,

t ≥ 0, is a local martingale. Moreover, we have E[q

x

(Z

τ

)] ≤ T for all

(12)

τ ∈ S (T ): Let (τ

n

)

n∈N

be a localizing sequence for Z . Then Fatou’s lemma and the monotone convergence theorem imply that

E[q

x

(Z

τx

)] = E [lim inf

n→∞

q

x

(Z

τ∧τx n

)] ≤ lim inf

n→∞

E[q

x

(Z

τ∧τx n

)] = lim inf

n→∞

E[H

τ∧τx n

] = E[H

τx

] ≤ T.

Therefore, we have for T ∈ R

+

U (T, x) ≤ V (T, x) = sup

τ∈S(T)

E[f(X

τx

)]

= sup

τ∈S(T)

E[f(s

−1x

(Z

τ

))]

1 + sup

τ∈S(T)

E[q

x

(Z

τ

)]

≤ C(1 + T ).

If, in addition, E[f (X

G(Tx )

)] < ∞, then |U (T, x)| < ∞.

The following example shows that the condition from Proposition 3.1 is sharp if X is a Brownian motion.

Example 3.2. For a one-dimensional Brownian motion W we have q

0

(y) = y

2

. Consider the optimal stopping problem (1.2) for f(y) = |y|

2+ε

, ε > 0, constraint function h(x) = 1, i.e. H

t

= t, and the Brownian motion W . For every T > 0 the first time H(a, −T /a), a > 0, when W hits a or −T /a has expectation T . Hence,

V (T, 0) ≥ U(T, 0) ≥ sup

a>0

E[f(W

H(a,−T /a)

)] = sup

a>0

a

2+ε

T

a

2

+ T + a

2

a

2+ε

T

2+ε

a

2

+ T

= ∞.

3.2 Dynamic programming principle for U

In this subsection we prove a DPP for U under the following assumptions.

Assumption.

(U0) |U (T, x)| < ∞ for all x ∈ R

n

, T ∈ R

+

. (U1) U is measurable.

(U2) For all T ≥ 0, any family (θ

α

)

α∈L2

loc(W)

of (F

t

)-stopping times, for any ε > 0 and ω ∈ Ω with M

θα,Tα

(ω) > H

θxα

(ω) there exists β = β

ε,ω

∈ L

2loc

(W ) with

E

f

X

Xθαx α

GXxθα,θα

(

M

)

F

θα

≥ U

M

θα,Tα

(ω) − H

θxα

(ω), X

θxα

(ω)

− ε,

(3.1)

(13)

where M

= M

ε,ω

= M

β,M

α,T

θα (ω)−Hθαx (ω)

, such that the process α ˆ defined by

ˆ α

s

(ω) =

( α

s

(ω) if s ∈ [0, θ

α

(ω)] or (s > θ

α

(ω) and M

θα,Tα

(ω) ≤ H

θxα

(ω)), β

s−θα

(ω) if s > θ

α

(ω) and M

θα,Tα

(ω) > H

θxα

(ω),

(3.2) satisfies α ˆ ∈ L

2loc

(W ).

Proposition 3.3 (DPP for U). Let the Assumptions (U0) − (U2) be sat- isfied. For any family of (F

t

)-stopping times (θ

α

)

α∈L2

loc(W)

we have U(T, x) = sup

α∈L2loc(W)

E h 1

{Mα,T

θα ≤Hθαx }

f X

x

Gx

(

Mθαα,T

)

+ 1

{Mα,T

θα >Hθαx }

U

M

θα,Tα

− H

θxα

, X

θxα

i . (3.3)

Remark 3.4. Using the upper- and lower-semicontinuous envelopes of U one can formulate a weak DPP, in the fashion of [3], that does not require Assumption (U1) and the measurable selection assumption (U2).

Proof. Fix (T, x) ∈ (0, ∞) × R

n

and let (θ

α

)

α∈L2

loc(W)

be a family of (F

t

)- stopping times. Then for every α ∈ L

2loc

(W ) we have

E h

f

X

Gxx

(

Mα,T

) i

= E h

1

{Mθαα,T≤Hxθα}

f

X

Gxx

(

Mα,T

)

+ 1

{Mθαα,T>Hxθα}

f

X

Gxx

(

Mα,T

) i

= E h

1

{Mθαα,T≤Hxθα}

f

X

Gxx

(

Mθαα,T

)

+ 1

{Mθαα,T>Hxθα}

f

X

Gxx

(

Mα,T

)

i , where we use that M

α,T

and M

θα,Tα

coincide on {M

θα,Tα

≤ H

θxα

}. By con-

ditioning the second summand on F

θα

, the strong Markov property of X

x

implies

E h 1

{Mα,T

θα >Hθαx }

f X

x

Gx

(

Mα,T

) i

= E h E h

1

{Mα,T

θα >Hθαx }

f X

x

Gx

(

Mα,T

) F

θα

ii

= E

1

{Mθαα,T>Hθαx }

E

f

X

X

x θαα Gx

(

Mα,T

)

−θα

F

θα

. The expectation constraint conditioned on F

θα

results in

E

H

X

x θαα Gx

(

Mα,T

)

−θα

F

θα

= E

"

Z

Gx

(

Mα,T

)

θα

h X

X

x θαα s

ds

F

θα

#

= E

"

Z

Gx

(

Mα,T

)

θα

h (X

sx

) ds

F

θα

#

= E h H

Gxx

(

Mα,T

) − H

θxα

F

θα

i

= E

M

α,T

|F

θα

− H

θxα

= M

θα,Tα

− H

θxα

,

(14)

because (M

tα,T

) is a martingale (Lemma 2.2). Thus, the conditional expec- tation of the remaining constraint equals M

θα,Tα

− H

θxα

. Hence,

U(T, x) ≤ sup

α∈L2loc(W)

E h 1

{Mα,T

θα ≤Hθαx }

f X

x

Gx

(

Mθαα,T

)

+ 1

{Mα,T

θα >Hθαx }

U

M

θα,Tα

− H

θxα

, X

θxα

i ,

where we use that U is measurable (Assumption (U1)).

For the reverse inequality let α ∈ L

2loc

(W ). For every ε > 0 and every ω ∈ Ω with M

θα,Tα

(ω) > H

θxα

(ω), choose β = β

ε,ω

∈ L

2loc

(W ) such that (3.1) holds and such that the control α ˆ defined in (3.2) is in L

2loc

(W ), which exists by Assumption (U2). Thus,

U(T, x) ≥ E

f

X

x

Gx

(

Mα,Tˆ

)

= E

1

{Mθαα,Tˆ ≤Hθαx }

f

X

x

Gx

(

Mθαα,Tˆ

)

+ 1

{Mθαα,Tˆ >Hθαx }

E

f

X

x

Gx

(

Mα,Tˆ

)

F

θα

= E

1

{Mα,T

θα ≤Hθαx }

f X

x

Gx

(

Mθαα,T

)

+ 1

{Mα,T

θα >Hθαx }

E

f

X

Xθαx α

Gx

(

Mα,Tˆ

)

−θα

F

θα

by the strong Markov property and the fact that M

sα,Tˆ

(ω) = M

sα,T

(ω) for all s ∈ [0, θ

α

(ω)]. Notice that G

x

M

α,Tˆ

−θ

α

and G

Xθαx α

M

= G

Xθαx α

M

β,M

α,T θα −Hθαx

both conditioned on {M

θα,Tα

> H

θxα

} have the same distribution. Therefore, U(T, x) ≥ E

1

{Mα,T

θα ≤Hθαx }

f X

x

Gx

(

Mθαα,T

)

+ 1

{Mα,T

θα >Hθαx }

E

f

X

Xθαx α

GXxθα,θα

(

M

)

F

θα

≥ E h

1

{Mθαα,T≤Hθαx }

f

X

Gxx

(

Mθαα,T

)

+ 1

{Mθαα,T>Hθαx }

U

M

θα,Tα

− H

θxα

, X

θxα

− ε i

, which implies a DPP for U by the arbitrariness of α ∈ L

2loc

(W ) and ε > 0.

4 The dynamic programming equation for U

The DPP from Proposition 3.3 allows to derive a dynamic programming equation for U. In order to do so, we denote by L the generator of the Markov process X, i.e.

Lu(x) = 1 2

n

X

i=1 n

X

j=1

2

u

∂x

i

∂x

j

(x)

d

X

l=1

σ

il

(x)σ

jl

(x) +

n

X

i=1

b

i

(x) ∂u

∂x

i

(x)

(15)

for suitable functions u ∈ C

2

( R

n

, R ). For a function u ∈ C

2

((0, ∞) × R

n

, R ) we use the following notation:

u

T

(T, x) = ∂u

∂T (T, x), u

T T

(T, x) = ∂

2

u

∂T

2

(T, x), u

T xi

(T, x) = ∂

2

u

∂T ∂x

i

(T, x),

x

u(T, x) = ∂u

∂x

i

(T, x)

n

i=1

,

x

u

T

(T, x) = u

T xi

(T, x)

n i=1

and for a matrix A ∈ R

k×l

, k, l ∈ N , its transpose is denoted by A

>

.

Proposition 4.1. Assume that h is continuous. If U ∈ C

2

((0, ∞) × R

n

) and U satisfies the DPP (3.3), then

1. U is a supersolution to

h(x)U

T

(T, x) − LU(T, x) +

σ

>

(x) · ∇

x

U

T

(T, x)

2

2U

T T

(T, x) = 0 (4.1) on (0, ∞) × R

n

with initial condition U (0, x) = f (x). Moreover, U is concave in T and

(T, x)| U

T T

(T, x) = 0 ⊆ (T, x)

x

U

T

(T, x) = 0 ∈ R

n

. Here we set

σ

>

(x) · ∇

x

U

T

(T, x)

2

/U

T T

(T, x) = 0 if both the numerator and the denominator equal 0.

2. If, in addition,

σ

>

· ∇

x

U

T

2

/U

T T

is continuous on (0, ∞) × R

n

, then U is a solution to (4.1) on A ¯ ∪ (0, ∞) × {x ∈ R

n

| ∃ T > 0 : (T, x) ∈ A}, where ¯ A ¯ denotes the closure of A := {(T, x) | U

T T

(T, x) < 0} in (0, ∞) × R

n

.

Proof. 1. The initial condition is satisfied, because T = 0 is equivalent to stopping directly. In order to prove that U is a supersolution of (4.1) let (T, x) ∈ (0, ∞) × R

n

, write X instead of X

x

in the following and consider the constant control α ≡ a, with a ∈ R

d

. Then, α ∈ L

2loc

(W ). Let

θ

a

:= inf n s ≥ 0

|X

s

− x| ≥ 1 or M

sa,T

− H

s

∈ / [T /2, 2T ] o

.

(16)

The DPP for U and Itô’s formula imply that for t > 0 0 ≥ E h

U (M

t∧θa,Ta

− H

t∧θa

, X

t∧θa

) − U(T, x) i

≥ E

Z

t∧θa 0

−h (X

sx

) U

T

+ LU + |a|

2

2 U

T T

+ (∇

x

U

T

)

>

· σ(X

s

) · a

(M

sa,T

− H

s

, X

s

)ds

+ E

Z

t∧θa 0

U

T

(M

sa,T

− H

s

, X

s

) a + ∇

x

U (M

sa,T

− H

s

, X

s

)

>

· σ(X

s

)

· dW

s

. (4.2)

The stochastic integral has expectation 0, because the integrand is bounded on the stochastic interval [0, t ∧ θ

a

].

By the pathwise continuity of X

s

and M

sa,T

and the boundedness of the integrand in the Lebesgue integral we obtain, after first dividing by t and then taking the limit t ↓ 0, that

−h(x)U

T

(T, x) + LU (T, x) + |a|

2

2 U

T T

(T, x) + ∇

x

U

T

(T, x)

>

· σ(x) · a ≤ 0 for all a ∈ R

d

. Thus,

−h(x)U

T

(T, x) + LU (T, x) + sup

a∈Rd

|a|

2

2 U

T T

(T, x) + ∇

x

U

T

(T, x)

>

· σ(x) · a

≤ 0.

(4.3) In particular, the supremum is finite which shows the concavity of U in T and

{(T, x)|U

T T

(T, x) = 0} ⊆ n

(T, x)

x

U

T

(T, x)

>

· σ(x) = 0 ∈ R

d

o

= (T, x)

x

U

T

(T, x) = 0 ∈ R

n

.

The last conclusion follows from the fact that (σσ

>

)(x) is positive definite.

Inequality (4.3) simplifies to

−h(x)U

T

(T, x) + LU (T, x) −

σ

>

(x) · ∇

x

U

T

(T, x)

2

2U

T T

(T, x) ≤ 0

if U

T T

(T, x) < 0 resp. −h(x)U

T

(T, x) + LU (T, x) if U

T T

(T, x) = 0. To con- clude, U is a supersolution to (4.1), if we set

σ

>

(x) · ∇

x

U

T

(T, x)

2

/U

T T

(T, x) = 0 in the case that both expressions are 0.

2. In order to prove that U is a subsolution to (4.1) on A ¯ if

σ

>

· ∇

x

U

T

2

/U

T T

is continuous, we first assume that there exists (T

0

, x

0

) ∈ (0, ∞) × R

n

with U

T T

(T

0

, x

0

) < 0 and

h(x

0

)U

T

(T

0

, x

0

) − LU (T

0

, x

0

) +

σ

>

(x

0

) · ∇

x

U

T

(T

0

, x

0

)

2

2U

T T

(T

0

, x

0

) > 0. (4.4)

(17)

Define for (T, x) ∈ (0, ∞) × R

n

ϕ(T, x) = U (T, x) + |T − T

0

|

4

+ |x − x

0

|

4

.

Then ϕ ∈ C

2

((0, ∞) × R

n

) and (4.4) holds also if U is replaced with ϕ.

Moreover, the continuity of the derivatives implies that h(x)ϕ

T

(T, x) − Lϕ(T, x) +

σ

>

(x) · ∇

x

ϕ

T

(T, x)

2

T T

(T, x) > 0 (4.5) and

ϕ

T T

(T, x) < 0. (4.6)

Now let α ∈ L

2loc

(W ) and (τ

n

)

n∈N

be a localizing sequence for α. In the following we write M

α

and X instead of M

α,T0

resp. X

x0

. Define the stopping times

θ

α

= inf t ≥ 0

(M

tα

− H

t

, X

t

) ∈ N /

r

, θ

αn

= θ

α

∧ τ

n

∧ n, n ∈ N .

Applying Itô’s formula to ϕ leads to U(T

0

, x

0

) = ϕ(T

0

, x

0

)

= E ϕ(M

θαα

n

− H

θαn

, X

θαn

) + E

Z

θαn 0

h (X

s

) ϕ

T

− Lϕ − |α

s

|

2

2 ϕ

T T

− (∇

x

ϕ

T

)

>

· σ(X

s

) · α

s

(M

sα

− H

s

, X

s

)ds

≥ E ϕ(M

θαα

n

− H

θnα

, X

θαn

) + E

Z

θαn 0

h (X

s

) ϕ

T

− Lϕ − sup

a∈Rd

|a|

2

2 ϕ

T T

− (∇

x

ϕ

T

)

>

· σ(X

s

) · a)

(M

sα

− H

s

, X

s

)ds

. Here we use that the stochastic integral is a martingale. By (4.6) the previous

supremum is given by

σ

>

(X

s

) · ∇

x

ϕ

T

(M

sα

− H

s

, X

s

)

2

T T

(M

sα

− H

s

, X

s

) . Therefore, we conclude from (4.5) that

U(T

0

, x

0

) ≥ E[ϕ(M

θαα

n

− H

θαn

, X

θnα

)]

for all n ∈ N . Since ϕ is bounded on N

r

, taking the limit n → ∞ results in U(T

0

, x

0

) ≥ E[ϕ(M

θαα

− H

θα

, X

θα

)]

≥ η + E[U (M

θαα

− H

θα

, X

θα

)],

(18)

where η := min

(T ,x)∈∂Nr

(ϕ − U)(T, x) > 0. Since α was arbitrary and η >

0 is independent of α, this contradicts the DPP for U . Hence, we have shown that U (T, x) is a subsolution to (4.1) on A := {(T, x) ∈ (0, ∞) × R

n

| U

T T

(T, x) < 0}. The continuity of h, U

T

, LU and

σ

>

· ∇

x

U

T

2

/U

T T

implies that U satisfies

h(x)U

T

(T, x) − LU (T, x) +

σ

>

(x) · ∇

x

U

T

(T, x)

2

U

T T

(T, x) ≤ 0 for (T, x) ∈ A. ¯

Finally, let (T

0

, x

0

) ∈ A ¯

c

∩ (0, ∞) × {x ∈ R

n

| ∃ T > 0 : (T, x) ∈ A}. ¯ Then there exists a neighborhood N of (T

0

, x

0

) with N ⊆ (0, ∞) × R

n

such that U

T T

= 0 on N and (0, ∞) × {x

0

} ∩ N ∩ A ¯ 6= ∅. In particular, we have

x

U

T

= 0 on N by the first part of the proof. Hence, the value function U is linear or constant in T and there exist c ∈ R and g ∈ C

2

( R

n

) such that

U (T, x) = cT + g(x), (T, x) ∈ N . Therefore,

h(x

0

)U

T

(T

0

, x

0

) − LU (T

0

, x

0

) = c h(x

0

) − Lg(x

0

). (4.7) On the other hand there exists T > 0 such that (T, x

0

) ∈ N ∩ A ¯ with U(T, x

0

) = cT + g(x

0

) by the continuity of U and

h(x

0

)U

T

(T, x

0

) − LU (T, x

0

) = c h(x

0

) − Lg(x

0

) ≤ 0 (4.8) by the previous part. Combining (4.7) and (4.8) results in

h(x

0

)U

T

(T

0

, x

0

) − LU (T

0

, x

0

) ≤ 0.

To sum up, this together with the first part of the Proposition implies that U is a solution to (4.1) on A ¯ ∪ (0, ∞) × {x ∈ R

n

| ∃ T > 0 : (T, x) ∈ A}. ¯ Remark 4.2.

(a) In general U ∈ C

2

(0, ∞) × R

n

is not a solution to (4.1) on the set A ¯

c

∩ (0, ∞) × {x ∈ R

n

| (T, x) ∈ / A ¯ for all T > 0}, see Subsection 7.2 for a counterexample. One can show, however, if there exists an optimal stopping time for any (T, x), then U is a solution to (4.1) on the whole set R

+

× R

n

.

(b) Let h be continuous. Then the continuity condition in the second part

of Proposition 4.1 is necessary: Let u ∈ C

2

(0, ∞) × R

n

be a solution of

(4.1) and let (T, x) ∈ (0, ∞) × R

n

. If u

T T

(T, x) 6= 0, then the quotient is

continuous in (T, x). Now assume that u

T T

(T, x) = 0 and observe that

(19)

σ

>

(y) · ∇

x

u

T

(t, y)

2

u

T T

(t, y) =

( 2 Lu(t, y) − h(y)u

T

(t, y)

, if u

T T

(t, y) 6= 0,

0, if u

T T

(t, y) = 0

for all (t, y) ∈ (0, ∞) × R

n

. Thus, for every sequence (T

n

, x

n

)

n∈N

⊆ (0, ∞) × R

n

with (T

n

, x

n

) → (T, x) we have

σ

>

(x

n

) · ∇

x

u

T

(T

n

, x

n

)

2

u

T T

(T

n

, x

n

)

≤ 2 |Lu(T

n

, x

n

) − h(x

n

)u

T

(T

n

, x

n

)|

−−−→

n→∞

2 |Lu(T, x) − h(x)u

T

(T, x)| = 0, because u solves (4.1) and u

T T

(T, x) = 0. Hence,

n→∞

lim

σ

>

(x

n

) · ∇

x

u

T

(T

n

, x

n

)

2

u

T T

(T

n

, x

n

) = 0 =

σ

>

(x) · ∇

x

u

T

(T, x)

2

u

T T

(T, x) , which implies the continuity of

σ

>

(x) · ∇

x

u

T

(T, x)

2

/u

T T

(T, x).

In Example 7.1 the continuity condition is not satisfied and the value function is only a supersolution to (4.1).

Remark 4.3. The convention that

σ

>

(x) · ∇

x

U

T

(T, x)

2

/U

T T

(T, x) = 0 if U

T T

(T, x) = 0 is justified by the proof of the first part of Proposition 4.1.

There it is shown that sup

a∈Rd

|a|

2

2 U

T T

(T, x) + ∇

x

U

T

(T, x)

>

· σ(x) · a

=

− |

σ>(x)·∇xUT(T ,x)

|

2

2UT T(T ,x)

, if U

T T

(T, x) < 0,

0, if U

T T

(T, x) = 0.

Remark 4.4. If one replaces Equation (4.1) with h(x)U

T

(T, x) − LU(T, x) − sup

a∈Rd

|a|

2

2 U

T T

(T, x) + ∇

x

U

T

(T, x)

>

· σ(x) · a

= 0, then all the statements of Proposition 4.1 remain true.

Remark 4.5. If we can show that for X

tx

= x + W

t

and h(y) = 1, y ∈ R

d

, then the optimal control α

for U (T, x) is given by α

s

= −2X

sx

, i.e. the optimal stopping time is the first exit time of the ball around 0 with radius R

x

for some R

x

> |x| (cf. with Example 2.5). In this case the PDE (4.1) simplifies to

h(x)U

T

(T, x) − 1

2 ∆

x

U(T, x) + x

>

· ∇

x

U

T

(T, x) = 0.

(20)

5 Verification

In this section we state a classical verification theorem for U and V . Recall Remark 4.2 (a): if for all (T, x) there exists an optimal stopping time for problem (1.3), then U is a solution of (4.1) on the whole domain R

+

× R

n

. Therefore, for a verification it is natural to look for a solution of the PDE (4.1) on the whole set R

+

× R

n

.

We first show that if u ∈ C

2

((0, ∞) × R

n

) ∩ C(([0, ∞) × R

n

) is a solution of the PDE (4.1), and some additional mild conditions are satisfied, then u coincides with the value function of the optimal control problem (U ). The relation (1.4) allows us then to identify the value function V .

Theorem 5.1.

1. Let u ∈ C

2

((0, ∞)× R

n

)∩C([0, ∞)× R

n

) be a supersolution of (4.1) which is concave in T , satisfies {(T, x)|u

T T

(T, x) = 0} ⊆

(T, x)

x

u

T

(T, x) = 0 ∈ R

n

, u(0, x) = f (x) and has linear growth in T and polynomial growth in x. More precisely, there exists C > 0 and p ≥ 1 such that

|u(T, x)| ≤ C(1 + T + |x|

p

). (5.1) If X

τx

∈ L

p

(Ω) for all τ ∈ T (T ), then u(T, x) ≥ U (T, x).

2. Let u ∈ C

2

((0, ∞) × R

n

) ∩ C([0, ∞) × R

n

) be a solution of (4.1) which is concave in T and satisfies {(T, x)|u

T T

(T, x)}} ⊆

(T, x)

x

u

T

(T, x) = 0 ∈ R

n

, u(0, x) = f (x) and (5.1). Furthermore, assume that

α

s

:= − 1

{uT T<0}

(∇

x

u

T

)

>

· σ u

T T

(M

s

− H

s

, X

sx

) ∈ L

2loc

(W ), where

dM

s

= − 1

{Ms>Hs}

1

{uT T<0}

(∇

x

u

T

)

>

· σ u

T T

(M

s

− H

s

, X

sx

) · dW

s

, M

0

= T.

If X

τx

∈ L

p

(Ω) for all τ ∈ T (T ), then u(T, x) = U (T, x) and (α

s

) is an optimal control.

The corresponding optimal stopping time τ

in (1.2) is given by τ

= {t ≥ 0 | M

t

≤ H

t

} .

3. If the assumptions in 2. are satisfied for all s ≤ T , then V (T, x) = sup

s∈[0,T]

u(s, x) = u(T ∧ T e (x), x),

where T e (x) = inf {t ≥ 0 | u

T

(t, x) ≤ 0}; and an optimal control (α

s

, m

) for V (T, x) is given by

α

s

= − 1

{uT T<0}

(∇

x

u

T

)

>

· σ u

T T

(M

s

− H

s

, X

sx

), m

= T ∧ T e (x),

Références

Documents relatifs

To reduce the constrained stochastic control problem (1.1) to the above abstract framework, we use a weak formulation of the control problem by considering the law of

For second-order state constraints, a shooting approach can be used for the stability analysis if all the touch points are reducible, see [20, 8], but not in presence of

Interestingly, the continuity order of the saturation function plays an important role for the existence of bounded trajectories which represent inverse images of the

For the numerical algorithm we use a gradient projection method in order to deal with the geometric constraint Ω ⊂ Ê N + ; see the textbooks [20,25] for details on the method.

In order to prove the convergence in distribution to the solution of a stochastic diusion equation while removing a boundedness assumption on the driving random process, we adapt

The agents will use a higher fraction of the initial wealth to finance investment in favorably financial market conditions since they are risk averse.. This in turn will make

A classical approach to solve the free boundary problem is to penalize one of the boundary conditions in the over- determined system (1)-(4) within a shape optimization approach to

Wittbold, The Cauchy problem for a conservation law with a multiplicative stochastic perturbation, Journal of Hyperbolic Differential Equations 09 (04) (2012) 661–709..