Simulation of conditioned diffusions


HAL Id: hal-00019329

https://hal.archives-ouvertes.fr/hal-00019329

Preprint submitted on 21 Feb 2006



Bernard Delyon, Ying Hu

To cite this version:

Bernard Delyon, Ying Hu. Simulation of conditioned diffusions. 2006. ⟨hal-00019329⟩


ccsd-00019329, version 1 - 21 Feb 2006


IRMAR, Université Rennes 1

Campus de Beaulieu, 35042 Rennes Cedex, France

Emails: [email protected], [email protected]

February 21, 2006

Abstract

In this paper, we propose algorithms for simulating the distribution of certain diffusions conditioned on their terminal point. We prove that the conditional distribution is absolutely continuous with respect to the distribution of another diffusion which is easy to simulate, and we give an explicit formula for the density.

1 Introduction

The aim of this paper is to propose algorithms for the simulation of the distribution of a diffusion

$$dx_t = b(t, x_t)\,dt + \sigma(t, x_t)\,dw_t, \quad x_0 = u, \quad 0 \le t \le T,$$

conditioned on $x_T = v$, where $b$ and $\sigma$ are given functions with appropriate dimensions, and $w$ is a standard Brownian motion.

From the point of view of applications, this makes posterior sampling possible when the diffusion is observed at instants $\{t_1, \cdots, t_n\} \subset [0, T]$.

Let us recall that in the usual conditioning (see, e.g., [7]), the distribution of the diffusion $x$ conditioned on $x_T = v$ is the same as that of another diffusion $y$ satisfying

$$dy_t = \tilde b(t, y_t)\,dt + \sigma(t, y_t)\,dw_t, \quad y_0 = u, \quad 0 \le t \le T,$$

where

$$\tilde b(t, x) = b(t, x) + [\sigma\sigma^*](t, x)\,\nabla_x\big(\log p(t, x; T, v)\big),$$

and $p(s, u; t, z)$ is the density of $x^{s,u}_t$. However, this is not suitable for simulation because, in general, one does not know the transition density $p$.

We will prove that, in certain cases, the conditional distribution of the diffusion is absolutely continuous with respect to the distribution of another diffusion which is easy to simulate, and we give the explicit formula for the density. This leads to an efficient simulation algorithm.

Two different cases will be considered:

1. The matrix $\sigma(t, x)$ depends only on $t$, and $b$ has the form $b(t, x) = b_0(t) + A(t)x + \sigma(t)b_1(t, x)$.

2. The matrix $\sigma(t, x)$ is uniformly invertible.

This simulation algorithm can be applied to a problem of parameter estimation for a discretely observed diffusion as in [8].

This paper is organized as follows: In Sect. 2, we recall a Girsanov theorem for unbounded drift which is essential for our simulation algorithm. In Sect. 3, we consider Case 1, and in Sect. 4, we consider Case 2.

2 A Girsanov theorem for unbounded drifts

This section gives a slightly generalized Girsanov theorem which will be used in the following sections. We call a measurable function $F(t, x)$ from $\mathbb{R}_+ \times \mathbb{R}^d$ to $\mathbb{R}^n$ locally Lipschitz with respect to $x$ if, for any $R > 0$, there exists a constant $C_R > 0$ such that, for any $(t, x, y) \in \mathbb{R}_+ \times \mathbb{R}^d \times \mathbb{R}^d$ with $|x| \le R$, $|y| \le R$,

$$|F(t, x) - F(t, y)| \le C_R |x - y|.$$

On the metric space $C([0, T]; \mathbb{R}^m)$, we define the filtration $\{\mathcal{F}_t\}_t$ to be the natural filtration of the coordinate process.

Theorem 1 Let $b(t, x)$, $h(t, x)$, $\sigma(t, x)$ be measurable functions from $\mathbb{R}_+ \times \mathbb{R}^d$ to $\mathbb{R}^d$, $\mathbb{R}^m$, and $\mathbb{R}^{d \times m}$, which are locally Lipschitz with respect to $x$; consider the following stochastic differential equations:

$$dx_t = b(t, x_t)\,dt + \sigma(t, x_t)\,dw_t, \tag{1}$$

$$dy_t = \big(b(t, y_t) + \sigma(t, y_t)h(t, y_t)\big)\,dt + \sigma(t, y_t)\,dw_t, \quad y_0 = x_0, \tag{2}$$

on the finite interval $[0, T]$. We assume the existence of a strong solution for each equation. We assume in addition that $h$ is bounded on compact sets. Then the Girsanov formula holds: for any non-negative Borel function $f(x, w)$ defined on $C([0, T]; \mathbb{R}^d) \times C([0, T]; \mathbb{R}^m)$, one has

$$E_y[f(y, w^h)] = E_x\left[f(x, w)\exp\left(\int_0^T h^*(t, x_t)\,dw_t - \frac{1}{2}\int_0^T |h(t, x_t)|^2\,dt\right)\right], \tag{3}$$

$$E_x[f(x, w)] = E_y\left[f(y, w^h)\exp\left(-\int_0^T h^*(t, y_t)\,dw_t - \frac{1}{2}\int_0^T |h(t, y_t)|^2\,dt\right)\right], \tag{4}$$

where $w^h_t = w_t + \int_0^t h(s, y_s)\,ds$, and $h^*$ stands for the transpose of $h$.

Proof: We assume first that the positive supermartingale

$$M_t = \exp\left(\int_0^t h^*(s, x_s)\,dw_s - \frac{1}{2}\int_0^t |h(s, x_s)|^2\,ds\right)$$

is a martingale under $P_x$, which will be proved later. In this case $\tilde w_t = w_t - \int_0^t h(s, x_s)\,ds$ is a Brownian motion under $M_T P_x$, leading to a solution $(x, \tilde w)$ of (2):

$$dx_t = b(t, x_t)\,dt + \sigma(t, x_t)h(t, x_t)\,dt + \sigma(t, x_t)\,d\tilde w_t.$$

As b(t, x), h(t, x), σ(t, x) are locally Lipschitz with respect to x, pathwise uniqueness holds for (1) and (2). The standard Girsanov theorem implies that (3) holds.


We prove now that $M_t$ is a martingale. For any $R > 0$, consider the stopping time

$$\tau_R = \inf\{t \ge 0 : |x_t| \ge R\} \wedge T.$$

Since $h$ is locally bounded, the Girsanov theorem for bounded drift gives

$$P_y|_{\mathcal{F}_{\tau_R}} = M_{\tau_R}\,P_x|_{\mathcal{F}_{\tau_R}}.$$

Hence

$$E_x[M_T] \ge E_x[1_{\tau_R = T}M_T] = E_x[1_{\tau_R = T}M_{\tau_R}] = P_y[\tau_R = T],$$

which converges to 1 as $R \to \infty$. This implies that $E_x[M_T] = 1$, and $M$ is a martingale.

Finally, (4) follows in the same way.
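As a numerical sanity check of formula (3), the following minimal sketch treats the simplest possible setting, which is an assumption made here for illustration only: $d = m = 1$, $b = 0$, $\sigma = 1$, $x_0 = 0$ and a constant $h$, so that $x = w$ and $y_t = w_t + ht$. With $f(x, w) = x_T$, the left-hand side of (3) is exactly $hT$, and the right-hand side is estimated by Monte Carlo. The function name `girsanov_check` and all parameter values are hypothetical.

```python
import numpy as np

def girsanov_check(h=0.7, T=1.0, n_steps=100, n_paths=20_000, seed=0):
    """Monte Carlo check of (3) for b = 0, sigma = 1, constant h, f(x, w) = x_T."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    dw = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    w_T = dw.sum(axis=1)                      # x_T = w_T because x = w here
    # Discretised exponent of M_T: sum_i h dw_i - 0.5 * h^2 * T
    log_M = (h * dw).sum(axis=1) - 0.5 * h**2 * T
    M_T = np.exp(log_M)
    lhs_exact = h * T                         # E_y[y_T], since y_t = w_t + h t
    rhs_mc = np.mean(w_T * M_T)               # estimate of E_x[x_T M_T]
    return M_T.mean(), rhs_mc, lhs_exact
```

The first returned value estimates $E_x[M_T]$, which should be close to 1 when $M$ is a martingale; the second and third values should agree up to Monte Carlo error.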

3 Case when σ is independent of x

We assume here that $x_t$ has the specific form

$$dx_t = \big(\sigma_t h(t, x_t) + A_t x_t + b_t\big)\,dt + \sigma_t\,dw_t, \quad x_0 = u, \tag{5}$$

where $\sigma_t$ and $A_t$ are time-dependent deterministic matrices and $h(t, x)$, $b_t$ are vector valued with appropriate dimensions.

An example is the 2-dimensional process $(x, y)$ which satisfies the following SDE:

$$dx_t = y_t\,dt, \tag{6}$$

$$dy_t = b(t, x_t, y_t)\,dt + \sigma\,dw_t, \tag{7}$$

and which is the noisy version of $\ddot x_t = b(t, x_t, \dot x_t)$; see, e.g., [1].

We shall prove the following result:

Theorem 2 Assume that $A_t$, $b_t$ and $\sigma_t$ are bounded measurable functions of $t$ with values in $\mathbb{R}^{d \times d}$, $\mathbb{R}^d$ and $\mathbb{R}^{d \times m}$, respectively. Assume also that $h(t, x)$ is locally Lipschitz with respect to $x$ uniformly with respect to $t$, with values in $\mathbb{R}^m$, and locally bounded; and that the SDE (5) has a strong solution.

Moreover, we assume that $\sigma$ admits a measurable left inverse almost everywhere¹, denoted by $\sigma^+$; and that $h$, $A$, $b$ and $\sigma^+$ are left continuous with respect to $t$. Then,

(i) the covariance matrix $R_{st}$ of the Gaussian process $\xi_t$ corresponding to (5) with $h = 0$ is given by

$$R_{st} = P_s\left(\int_0^{\min(s,t)} P_u^{-1}\sigma_u\sigma_u^*P_u^{-*}\,du\right)P_t^*,$$

where

$$\frac{dP_t}{dt} = A_tP_t, \quad P_0 = \mathrm{Id},$$

and $P_u^{-*} = (P_u^{-1})^*$;

¹ This essentially requires that $\sigma^*\sigma > 0$ almost everywhere.


(ii) the distribution of the process

$$p_t = \xi_t - R_{tT}R_{TT}^+(\xi_T - v) \tag{8}$$

is the same as the distribution of $\xi$ conditioned on $\xi_T = v$ ($M^+$ stands for the left pseudo-inverse² of $M$). For any nonnegative measurable function $f$,

$$E[f(x) \mid x_0 = u, x_T = v] = C\,E\left[f(p)\exp\left(\int_0^T h^*(t, p_t)\big(\sigma_t^+\,dp_t - \sigma_t^+(A_tp_t + b_t)\,dt\big) - \frac{1}{2}\int_0^T \|h(t, p_t)\|^2\,dt\right)\right], \tag{9}$$

where $C$ is a constant depending on $u$, $v$ and $T$.
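To illustrate the construction (8), here is a minimal sketch for the scalar case $A = 0$, $b = 0$, $\sigma = 1$ (an assumption chosen for illustration), where $\xi_t = u + w_t$ is a Brownian motion, $R_{tT} = t$ and $R_{TT} = T$, so that (8) reduces to $p_t = \xi_t - \frac{t}{T}(\xi_T - v)$. The function name `brownian_bridge_from_path` and the grid parameters are hypothetical.

```python
import numpy as np

def brownian_bridge_from_path(u, v, T=1.0, n_steps=100, seed=0):
    """Build p_t = xi_t - R_tT R_TT^+ (xi_T - v) for scalar Brownian motion."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, T, n_steps + 1)
    dw = rng.normal(0.0, np.sqrt(T / n_steps), size=n_steps)
    xi = u + np.concatenate(([0.0], np.cumsum(dw)))   # xi_t = u + w_t
    # For Brownian motion, R_tT = min(t, T) = t and R_TT = T
    p = xi - (t / T) * (xi[-1] - v)
    return t, p
```

The returned path starts at $u$ and ends at $v$; no rejection step is needed because the correction term is a deterministic linear function of the unconditioned path.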

Proof: (i) The formula for $R_{st}$ is classical and comes from $\xi_t = P_t\int_0^t P_u^{-1}(b_u\,du + \sigma_u\,dw_u) + P_t\xi_0$; see, e.g., [6].

(ii) Let us first recall that if $(Y, Z)$ is a Gaussian vector, the distribution of $Y$ conditioned on $Z = z_0$ coincides with the distribution of another Gaussian vector $Y - R_{YZ}R_{ZZ}^+(Z - z_0)$, where $R_{ZZ}^+$ is the left pseudo-inverse of $R_{ZZ}$; its covariance is $R_{YY} - R_{YZ}R_{ZZ}^+R_{ZY}$. Taking $Y$ as the vector $(\xi_{t_1}, \cdots, \xi_{t_k})$ and $Z = \xi_T$, we observe that, defining the process $p$ by (8), $(p_{t_1}, \cdots, p_{t_k})$ has the same distribution as $(\xi_{t_1}, \cdots, \xi_{t_k})$ conditioned on $\xi_T = v$. The covariance of $p_t$ is $C_{st} = R_{st} - R_{sT}R_{TT}^+R_{Tt}$.

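Denote by $p^v_t$ the process (8); in particular, for any nonnegative measurable function $\varphi(\cdot)$, $E[\varphi(\xi)] = \int E[\varphi(p^v)]\,\mu_T(dv)$, where $\mu_T$ is the distribution of $\xi_T$. For any nonnegative measurable functions $f$ and $g$,

$$\begin{aligned}
E[f(x)g(x_T)] &= E\left[f(\xi)g(\xi_T)\exp\left(\int_0^T h^*(t, \xi_t)\,dw_t - \frac{1}{2}\int_0^T \|h(t, \xi_t)\|^2\,dt\right)\right] \\
&= E\left[f(\xi)g(\xi_T)\exp\left(\int_0^T h^*(t, \xi_t)\big(\sigma_t^+\,d\xi_t - \sigma_t^+(A_t\xi_t + b_t)\,dt\big) - \frac{1}{2}\int_0^T \|h(t, \xi_t)\|^2\,dt\right)\right]. 
\end{aligned} \tag{10}$$

Given a sequence of partitions $(\Delta_n)_{n \ge 1}$ of $[0, T]$:

$$\Delta_n = \{t^n_0 < t^n_1 < \cdots < t^n_{k_n} = T\}$$

with $|\Delta_n| = \max_{0 \le i \le k_n - 1}(t^n_{i+1} - t^n_i) \to 0$, and a continuous stochastic process $X$, we define:

$$S_n(X) = \sum_{i=0}^{k_n - 1} h^*(t^n_i, X_{t^n_i})\,\sigma^+_{t^n_i}\,(X_{t^n_{i+1}} - X_{t^n_i}).$$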

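Then

$$E[|S_n(\xi) - S_m(\xi)| \wedge 1] = \int_{\mathbb{R}^d} E[|S_n(p^v) - S_m(p^v)| \wedge 1]\,\mu_T(dv),$$

which implies that $S_n(p^v)$ converges in probability $P \otimes \mu_T$. Hence, we can define $\int_0^T h^*(t, p^v_t)\sigma_t^+\,dp^v_t$ as the limit (in probability $P \otimes \mu_T$) of the sequence $S_n(p^v)$. This limit does not depend on the choice of the sequence of partitions $(\Delta_n)_n$ satisfying $|\Delta_n| \to 0$.

Finally, defining the continuous function $\Theta_N(x) = N \wedge x$, $x \ge 0$, we have

$$\begin{aligned}
&E\left[\Theta_N\left(f(\xi)g(\xi_T)\exp\left(S_n(\xi) - \int_0^T h^*(t, \xi_t)\sigma_t^+(A_t\xi_t + b_t)\,dt - \frac{1}{2}\int_0^T \|h(t, \xi_t)\|^2\,dt\right)\right)\right] \\
&\quad = \int_{\mathbb{R}^d} E\left[\Theta_N\left(f(p^v)\exp\left(S_n(p^v) - \int_0^T h^*(t, p^v_t)\sigma_t^+(A_tp^v_t + b_t)\,dt - \frac{1}{2}\int_0^T \|h(t, p^v_t)\|^2\,dt\right)g(v)\right)\right]\mu_T(dv).
\end{aligned}$$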

² $M^+ = (M^*M)^{-1}M^*$, where the symmetric matrix $M^*M$ is inverted by diagonalisation with the convention $1/0 = 0$.


Taking the limit first in $n$ and then in $N$, and returning to (10), we deduce:

$$E[f(x)g(x_T)] = \int_{\mathbb{R}^d} E\left[f(p^v)\exp\left(\int_0^T h^*(t, p^v_t)\big(\sigma_t^+\,dp^v_t - \sigma_t^+(A_tp^v_t + b_t)\,dt\big) - \frac{1}{2}\int_0^T \|h(t, p^v_t)\|^2\,dt\right)\right]g(v)\,\mu_T(dv),$$

which implies (9), where $C$ is the value at $v$ of the density of $\mu_T$ with respect to the distribution of $x_T$.
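The weight in (9) can be approximated on a grid by replacing the stochastic integral with the left-point sum $S_n$ defined in the proof. The following is a minimal sketch under assumptions made for illustration: $d = m = 1$, $A = 0$, $b = 0$, $\sigma = 1$ (so $\sigma^+ = 1$ and $p$ is a Brownian bridge), a uniform grid, and self-normalised weights so that the unknown constant $C$ cancels. The names `bridge_log_weights` and `self_normalised_weights` are hypothetical.

```python
import numpy as np

def bridge_log_weights(h, u, v, T=1.0, n_steps=200, n_paths=1000, seed=0):
    """Simulate bridges p via (8) and the discretised exponent of (9), scalar case."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)
    dw = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    xi = u + np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dw, axis=1)], axis=1)
    p = xi - (t / T) * (xi[:, -1:] - v)            # (8) with R_tT / R_TT = t / T
    h_left = h(t[:-1], p[:, :-1])                  # h at the left endpoints t_i
    S_n = np.sum(h_left * np.diff(p, axis=1), axis=1)   # sum_i h_i (p_{i+1} - p_i)
    log_w = S_n - 0.5 * np.sum(h_left**2, axis=1) * dt
    return p, log_w

def self_normalised_weights(log_w):
    """Normalise exp(log_w) to sum to one, subtracting the max for stability."""
    w = np.exp(log_w - log_w.max())
    return w / w.sum()
```

An estimate of $E[f(x) \mid x_0 = u, x_T = v]$ is then the weighted average of $f$ over the simulated bridges. For a constant $h$ the weights are all equal, which is consistent with the fact that a Brownian motion with constant drift, conditioned on its endpoint, is again a Brownian bridge.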

As for the Brownian bridge, we have:

Proposition 3 Let us assume that $M_t = \int_t^T P_u^{-1}\sigma_u\sigma_u^*P_u^{-*}\,du$ is positive definite for any $t \in [0, T)$. Then the distribution of the process $p$ is the same as that of $q$, the solution to the following linear SDE:

$$dq_t = A_tq_t\,dt + b_t\,dt + \sigma_t\sigma_t^*P_t^{-*}M_t^{-1}\big(P_t^{-1}(E[\xi_t] - q_t) - P_T^{-1}(E[\xi_T] - v)\big)\,dt + \sigma_t\,dw_t, \tag{11}$$

with $q_0 = u$.

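Proof: The matrix $Q_t = P_tM_t$ is a solution to $\dot Q_t = (A_t - \sigma_t\sigma_t^*P_t^{-*}M_t^{-1}P_t^{-1})Q_t$, implying that the covariance of $q_t$ can be rewritten as follows: for $s < t$,

$$\begin{aligned}
Q_s\left(\int_0^s Q_u^{-1}\sigma_u\sigma_u^*Q_u^{-*}\,du\right)Q_t^* &= Q_s\left(\int_0^s M_u^{-1}P_u^{-1}\sigma_u\sigma_u^*P_u^{-*}M_u^{-1}\,du\right)Q_t^* \\
&= Q_s\big(M_s^{-1} - M_0^{-1}\big)Q_t^* \\
&= P_sM_s\big(M_s^{-1} - M_0^{-1}\big)M_tP_t^* \\
&= P_s(M_0 - M_s)\big(\mathrm{Id} - M_0^{-1}(M_0 - M_t)\big)P_t^* \\
&= C_{st}.
\end{aligned}$$

On the other hand, from (8), the expectation $\bar p_t$ of the process $p_t$ satisfies

$$\frac{d}{dt}\bar p_t - A_t\bar p_t - b_t = -\sigma_t\sigma_t^*P_t^{-*}P_T^*R_{TT}^{-1}(E[\xi_T] - v).$$

Elementary algebra shows $P_T^*R_{TT}^{-1} = -Q_t^{-1}(R_{tT}R_{TT}^{-1} - P_tP_T^{-1})$, hence

$$\begin{aligned}
\frac{d}{dt}\bar p_t - A_t\bar p_t - b_t &= \sigma_t\sigma_t^*P_t^{-*}Q_t^{-1}\big(R_{tT}R_{TT}^{-1} - P_tP_T^{-1}\big)(E[\xi_T] - v) \\
&= \sigma_t\sigma_t^*P_t^{-*}Q_t^{-1}\big(E[\xi_t] - \bar p_t - P_tP_T^{-1}(E[\xi_T] - v)\big),
\end{aligned}$$

which is the equation satisfied by $E[q_t]$. The conclusion follows by noting that both $p$ and $q$ are Gaussian processes.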

Remark. $M_t$ is positive definite for any $t \in [0, T)$ if and only if the pair of functions $(A, \sigma)$ is controllable on $[t, T]$ for any $t \in [0, T)$. See, e.g., [6] for some discussion.
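The positivity condition on $M_t$ can be checked numerically. The sketch below approximates $M_t$ by a midpoint rule and is meant only as an illustration; the function name `gramian`, the quadrature rule and the number of nodes are assumptions, and the caller is expected to supply $P_u^{-1}$ in closed form.

```python
import numpy as np

def gramian(P_inv, sigma, t, T, n=2000):
    """Midpoint-rule approximation of M_t = int_t^T P_u^{-1} sigma sigma^* P_u^{-*} du.

    P_inv : callable u -> (d, d) array, the inverse of P_u
    sigma : (d, m) array, assumed constant in time for this sketch
    """
    d = sigma.shape[0]
    du = (T - t) / n
    mids = t + (np.arange(n) + 0.5) * du      # midpoints of the n sub-intervals
    M = np.zeros((d, d))
    for s in mids:
        G = P_inv(s) @ sigma                  # P_s^{-1} sigma, shape (d, m)
        M += G @ G.T * du
    return M
```

For the system (6)-(7) with unit noise, $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $\sigma = (0, 1)^*$, so $P_u^{-1} = \begin{pmatrix} 1 & -u \\ 0 & 1 \end{pmatrix}$; calling `gramian(lambda s: np.array([[1.0, -s], [0.0, 1.0]]), np.array([[0.0], [1.0]]), 0.0, 1.0)` and inspecting `np.linalg.eigvalsh` of the result shows both eigenvalues positive. With $A = 0$ and the same $\sigma$ the smallest eigenvalue is zero, reflecting the loss of controllability.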

Example. Consider the 2-dimensional stochastic differential equation defined by (6)-(7), where $\sigma \ne 0$. Let us assume that $b$ is locally Lipschitz with respect to $(x, y)$, and that this equation admits a strong solution (a strong solution exists if there exists a Lyapunov function; see, e.g., [1]). Then we have:

$$E[f(x, y) \mid (x_0, y_0) = u, (x_T, y_T) = v] = C\,E\left[f(p, q)\exp\left(\sigma^{-2}\int_0^T b(t, p_t, q_t)\,dq_t - \frac{1}{2\sigma^2}\int_0^T b(t, p_t, q_t)^2\,dt\right)\right], \tag{12}$$


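where $(p, q)$ is the following bridge starting from $(p_0, q_0) = u$:

$$\begin{pmatrix} p_t \\ q_t \end{pmatrix} = \begin{pmatrix} z_t \\ \dot z_t \end{pmatrix} - \frac{t}{T^3}\begin{pmatrix} t(3T - 2t) & -tT(T - t) \\ 6(T - t) & T(3t - 2T) \end{pmatrix}\begin{pmatrix} z_T - v_1 \\ \dot z_T - v_2 \end{pmatrix}, \tag{13}$$

with $z_t = u_1 + tu_2 + \sigma\int_0^t w_s\,ds$, $\dot z_t = u_2 + \sigma w_t$; or $(p, q)$ can be chosen as:

$$dp_t = q_t\,dt, \quad dq_t = \left(-6\,\frac{p_t - v_1}{(T - t)^2} - 2\,\frac{2q_t + v_2}{T - t}\right)dt + \sigma\,dw_t. \tag{14}$$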
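The explicit bridge (13) is straightforward to implement on a grid. The following is a minimal sketch, in which the uniform grid and the left-point Riemann sum used for $z_t$ are assumptions introduced here; because of that discretisation, the simulated pair is only approximately distributed as the continuous-time bridge, although it matches the endpoints exactly. The function name `integrated_bm_bridge` is hypothetical.

```python
import numpy as np

def integrated_bm_bridge(u, v, sigma=1.0, T=1.0, n_steps=1000, seed=0):
    """Bridge (p, q) from u = (u1, u2) to v = (v1, v2) via formula (13)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)
    w = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n_steps))))
    zdot = u[1] + sigma * w
    # z_t = u1 + int_0^t zdot_s ds, approximated by a left-point Riemann sum
    z = u[0] + np.concatenate(([0.0], np.cumsum(zdot[:-1] * dt)))
    e1, e2 = z[-1] - v[0], zdot[-1] - v[1]    # terminal mismatch of (z, zdot)
    c = t / T**3
    p = z - c * (t * (3 * T - 2 * t) * e1 - t * T * (T - t) * e2)
    q = zdot - c * (6 * (T - t) * e1 + T * (3 * t - 2 * T) * e2)
    return t, p, q
```

At $t = T$ the matrix in (13), multiplied by $t/T^3$, reduces to the identity, so the correction removes the terminal mismatch exactly; at $t = 0$ the correction vanishes and the path starts at $u$.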


4 σ invertible, general b

4.1 Bounded drift

Let us consider the following SDEs:

$$dx_t = b(t, x_t)\,dt + \sigma(t, x_t)\,dw_t, \quad x_0 = u, \tag{15}$$

$$dy_t = b(t, y_t)\,dt - \frac{y_t - v}{T - t}\,dt + \sigma(t, y_t)\,dw_t, \quad y_0 = u. \tag{16}$$

Remark. If $b = 0$ and $\sigma = \mathrm{Id}$, then $x$ is a Brownian motion. It is well known (see, e.g., [6]) that the law of the Brownian motion $x$ conditioned on $x_T = v$ is the same as that of the Brownian bridge $y$ satisfying the following SDE:

$$dy_t = -\frac{y_t - v}{T - t}\,dt + dw_t, \quad y_0 = u.$$

The form of SDE (16) is inspired by the above SDE, so that it fits the simplest case, the Brownian bridge.
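The proposal process (16) can be simulated with a plain Euler scheme. The sketch below covers the scalar case only; the scheme, the uniform grid and the name `simulate_guided` are assumptions made for illustration and are not prescribed by the text. The last Euler step starts at $T - \Delta t$, so the pulling term is never evaluated at $t = T$, and the simulated endpoint equals $v$ only up to an error of order $\sqrt{\Delta t}$.

```python
import numpy as np

def simulate_guided(b, sigma, u, v, T=1.0, n_steps=500, n_paths=2000, seed=0):
    """Euler scheme for dy = b dt - (y - v)/(T - t) dt + sigma dw (scalar case).

    b, sigma : callables (t, y) -> array with the same shape as y
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)
    y = np.empty((n_paths, n_steps + 1))
    y[:, 0] = u
    for k in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        # drift of (16): original drift plus the term pulling the path towards v
        drift = b(t[k], y[:, k]) - (y[:, k] - v) / (T - t[k])
        y[:, k + 1] = y[:, k] + drift * dt + sigma(t[k], y[:, k]) * dw
    return t, y
```

With `b = lambda t, y: np.zeros_like(y)` and `sigma = lambda t, y: np.ones_like(y)` this reduces to an Euler approximation of the Brownian bridge from $u$ to $v$.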

The objective of this section is to prove that the distribution of $x$ (the solution of (15)) conditioned on $x_T = v$ is absolutely continuous with respect to that of $y$ (the solution of (16)), with an explicit density. We shall assume some regularity conditions on $b$ and $\sigma$ here.

Assumption 4.1 The functions $b(t, x)$ and $\sigma(t, x)$ are $C^{1,2}$ with values in $\mathbb{R}^d$ and $\mathbb{R}^{d \times d}$ respectively; and the functions $b$, $\sigma$, together with their derivatives, are bounded. Moreover, $\sigma$ is invertible with a bounded inverse.

Let $x^{s,u}$ be the solution of (15) starting from $u$ at time $s \in [0, T]$. Under Assumption 4.1, $x$ is a strong Markov process with positive transition density. For $(s, u) \in [0, T] \times \mathbb{R}^d$, we denote by $p(s, u; t, z)$ the density of $x^{s,u}_t$. Then there exist constants $m, \lambda, M, \Lambda > 0$ such that the density function $p(s, u; t, z)$ satisfies Aronson's estimate [2]: for $t > s$,

$$m(t - s)^{-\frac{d}{2}}\,e^{-\frac{\lambda|z - u|^2}{t - s}} \le p(s, u; t, z) \le M(t - s)^{-\frac{d}{2}}\,e^{-\frac{\Lambda|z - u|^2}{t - s}}.$$

We first study SDE (16).

Lemma 4 Let Assumption 4.1 hold. Then the SDE (16) admits a unique solution on $[0, T)$. Moreover, $\lim_{t \to T} y_t = v$ a.s., and $|y_t - v|^2 \le C(T - t)\log\log[(T - t)^{-1} + e]$ a.s., where $C$ is a positive random variable.

Proof: The fact that the SDE (16) admits a unique solution on $[0, T)$ is classical. Applying Itô's formula to $\frac{y_t - v}{T - t}$, we easily deduce the following:

$$\frac{y_t - v}{T - t} = \frac{u - v}{T} + \int_0^t (T - s)^{-1}b(s, y_s)\,ds + \int_0^t (T - s)^{-1}\sigma(s, y_s)\,dw_s.$$

For each $i$,

$$\left\{\left(\int_0^t (T - s)^{-1}\sigma(s, y_s)\,dw_s\right)^i,\ t \ge 0\right\} = \left\{\sum_{j=1}^d \int_0^t (T - s)^{-1}\sigma_{ij}(s, y_s)\,dw^j_s,\ t \ge 0\right\}$$

is a continuous local martingale, and its quadratic variation process $\tau_t = \int_0^t \sum_{j=1}^d (T - s)^{-2}\sigma_{ij}^2(s, y_s)\,ds$ satisfies $\tau_t \to \infty$ as $t \to T$, and $\tau_t \le \frac{c}{T - t}$ for a constant $c > 0$. Applying the Dambis-Dubins-Schwarz theorem, for each $i$, there exists a standard one-dimensional Brownian motion $B^i$ such that

$$\left(\int_0^t (T - s)^{-1}\sigma(s, y_s)\,dw_s\right)^i = B^i(\tau_t), \quad t \ge 0.$$

Using the law of the iterated logarithm for the Brownian motion $B^i$, the conclusion follows easily.

Now we can state the main theorem of this section.

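Theorem 5 Let Assumption 4.1 hold. Then

$$E[f(x) \mid x_T = v] = C\,E\left[f(y)\exp\left\{-\int_0^T \frac{2\tilde y_t^*A_t(y_t)b_t(y_t)\,dt + \tilde y_t^*(dA_t(y_t))\tilde y_t + \sum_{ij} d\langle A^{ij}_t(y_t), \tilde y^i_t\tilde y^j_t\rangle}{2(T - t)}\right\}\right], \tag{17}$$

where $A(t, y) = (\sigma(t, y)^*)^{-1}\sigma(t, y)^{-1}$, $\tilde y_t = y_t - v$, and $\langle\cdot,\cdot\rangle$ is the quadratic variation of semimartingales.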

Remark. From Lemma 4, the integral in (17) is well defined.

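Proof: Let $f(x)$ be an $\mathcal{F}_t$-measurable nonnegative function, $t < T$. Then

$$E[f(y)] = E\left[f(x)\exp\left\{-\int_0^t \frac{(\sigma_s^{-1}(x_s)(x_s - v))^*}{T - s}\,dw_s - \frac{1}{2}\int_0^t \left\|\frac{\sigma_s^{-1}(x_s)(x_s - v)}{T - s}\right\|^2 ds\right\}\right]. \tag{18}$$

On the other hand, Itô's formula gives:

$$\begin{aligned}
d\left(\frac{\|\sigma^{-1}(t, x_t)(x_t - v)\|^2}{T - t}\right) ={}& \frac{2(x_t - v)^*A(t, x_t)\,dx_t}{T - t} + \frac{\|\sigma^{-1}(t, x_t)(x_t - v)\|^2}{(T - t)^2}\,dt + \frac{d \cdot dt}{T - t} \\
&+ \frac{(x_t - v)^*(dA(t, x_t))(x_t - v)}{T - t} + \frac{\sum_{ij} d\langle A^{ij}(t, x_t), (x^i_t - v^i)(x^j_t - v^j)\rangle}{T - t}.
\end{aligned}$$

Combining the above equation with (18), we deduce that

$$\begin{aligned}
E[f(y)] = CC_t\,E\Bigg[f(x)\exp\Bigg\{&-\frac{\|\sigma_t^{-1}(x_t)(x_t - v)\|^2}{2(T - t)} + \int_0^t \frac{(x_s - v)^*A_s(x_s)b_s(x_s)}{T - s}\,ds \\
&+ \frac{1}{2}\int_0^t \frac{(x_s - v)^*(dA_s(x_s))(x_s - v) + \sum_{ij} d\langle A^{ij}_s(x_s), (x^i_s - v^i)(x^j_s - v^j)\rangle}{T - s}\Bigg\}\Bigg],
\end{aligned}$$

where $C > 0$ is a constant and $C_t = (T - t)^{-\frac{d}{2}}$.

Or equivalently,

$$E[f(y)\varphi_t] = CC_t\,E\left[f(x)\exp\left\{-\frac{\|\sigma(t, x_t)^{-1}(x_t - v)\|^2}{2(T - t)}\right\}\right], \tag{19}$$

where

$$\varphi_t = \exp\left\{-\int_0^t \frac{\tilde y_s^*A_s(y_s)b_s(y_s)}{T - s}\,ds - \frac{1}{2}\int_0^t \frac{\tilde y_s^*(dA_s(y_s))\tilde y_s + \sum_{ij} d\langle A^{ij}_s(y_s), \tilde y^i_s\tilde y^j_s\rangle}{T - s}\right\}. \tag{20}$$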


Note that $\{\varphi_t, t \in [0, T]\}$ is a well-defined continuous process, thanks to Lemma 4.

Putting $f = 1$ in (19), we then deduce:

$$\frac{E[f(y)\varphi_t]}{E[\varphi_t]} = \frac{E\left[f(x)\exp\left\{-\frac{\|\sigma(t, x_t)^{-1}(x_t - v)\|^2}{2(T - t)}\right\}\right]}{E\left[\exp\left\{-\frac{\|\sigma(t, x_t)^{-1}(x_t - v)\|^2}{2(T - t)}\right\}\right]}. \tag{21}$$

Assuming that $f(x)$ takes the form $f(x) = g(x_{t_1}, \cdots, x_{t_N})$, $0 < t_1 < t_2 < \cdots < t_N < T$, $g \in C_b(\mathbb{R}^{Nd})$, and letting $t \to T$, from Lemmas 7 and 8 in the Appendix, we get:

$$\frac{E[f(y)\varphi_T]}{E[\varphi_T]} = E[f(x) \mid x_T = v].$$

This completes the proof of the theorem.
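On a time grid $0 = t_0 < \cdots < t_n = T$, the exponent in (17) can be approximated path by path. The sketch below is restricted to $d = 1$, where $A = \sigma^{-2}$; it uses a left-point rule for the drift term and the increment sum $\sum_k \tilde y_{t_k}^2\big(A(t_k, y_{t_k}) - A(t_{k-1}, y_{t_{k-1}})\big)/(2(T - t_k))$ for the $dA$ and bracket terms, stopping at $k = n - 1$ to avoid dividing by zero at $t = T$. These discretisation choices and the function name `log_weight_thm5` are assumptions made for illustration.

```python
import numpy as np

def log_weight_thm5(t, y, v, b, sigma):
    """Discretised exponent of (17) for scalar paths.

    t : (n+1,) grid with t[-1] = T;  y : (n_paths, n+1) simulated paths of (16)
    b, sigma : callables (t, y) -> array broadcastable to y's shape
    """
    T = t[-1]
    dt = np.diff(t)
    y_tilde = y - v
    A = 1.0 / sigma(t, y) ** 2                     # A = sigma^{-2} when d = 1
    # int_0^T y~ A b / (T - t) dt, left-point rule over k = 0..n-1
    drift_term = np.sum(
        (y_tilde * A * b(t, y))[:, :-1] * dt / (T - t[:-1]), axis=1
    )
    # sum over k = 1..n-1 of y~_k^2 (A_k - A_{k-1}) / (2 (T - t_k))
    dA = np.diff(A, axis=1)[:, :-1]
    var_term = np.sum(y_tilde[:, 1:-1] ** 2 * dA / (2.0 * (T - t[1:-1])), axis=1)
    return -(drift_term + var_term)
```

Exponentiating and normalising these values over the simulated paths gives self-normalised importance weights, so that the constant $C$ never has to be computed. When $b = 0$ and $\sigma$ is constant the exponent vanishes, consistent with $y$ then being exactly the Brownian bridge.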

Remark. For practical implementation, it is useful to note that the second and third terms of the integral in (17) are the limit of

$$\sum_k \tilde y_{t_k}^*\big(A(t_k, y_{t_k}) - A(t_{k-1}, y_{t_{k-1}})\big)\tilde y_{t_k}\,\frac{1}{2(T - t_k)}.$$

4.2 Unbounded drift

Let us now consider the following SDE:

$$dx_t = b(t, x_t)\,dt + \sigma(t, x_t)\,dw_t, \quad x_0 = u, \tag{22}$$

where the drift $b$ can be unbounded. We assume instead:

Assumption 4.2 The function $\sigma(t, x)$ is $C^{1,2}$ with values in $\mathbb{R}^{d \times d}$; the function $\sigma$ together with its derivatives is bounded; and $\sigma$ is invertible with a bounded inverse. The function $b$ is locally Lipschitz with respect to $x$ and is locally bounded. Moreover, the SDE (22) admits a strong solution.

Combining Theorems 1 and 5, we are able to prove the following.

Theorem 6 Let Assumption 4.2 hold, and let $y$ be the solution of

$$dy_t = -\frac{y_t - v}{T - t}\,dt + \sigma(t, y_t)\,dw_t, \quad y_0 = u. \tag{23}$$

Then,

$$\begin{aligned}
E[f(x) \mid x_T = v] = C\,E\Bigg[f(y)\exp\Bigg\{&-\int_0^T \frac{\tilde y_t^*(dA(t, y_t))\tilde y_t + \sum_{ij} d\langle A^{ij}(t, y_t), \tilde y^i_t\tilde y^j_t\rangle}{2(T - t)} \\
&+ \int_0^T (b^*A)(t, y_t)\,dy_t - \frac{1}{2}\int_0^T |\sigma^{-1}b|^2(t, y_t)\,dt\Bigg\}\Bigg],
\end{aligned}$$

where $A(t, y) = \sigma(t, y)^{-*}\sigma(t, y)^{-1}$, $\tilde y_t = y_t - v$, and $\langle\cdot,\cdot\rangle$ is the quadratic variation of semimartingales.


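Proof: Let $\bar x$ be the solution of:

$$d\bar x_t = \sigma(t, \bar x_t)\,dw_t, \quad \bar x_0 = u. \tag{24}$$

Then, from Theorem 1, for nonnegative measurable functions $f$ and $g$,

$$\begin{aligned}
E[f(x)g(x_T)] &= E\left[f(\bar x)g(\bar x_T)\exp\left(\int_0^T (b^*A)(t, \bar x_t)\,d\bar x_t - \frac{1}{2}\int_0^T |\sigma^{-1}b|^2(t, \bar x_t)\,dt\right)\right] \\
&= \int_{\mathbb{R}^d} E\left[f(\bar x)\exp\left(\int_0^T (b^*A)(t, \bar x_t)\,d\bar x_t - \frac{1}{2}\int_0^T |\sigma^{-1}b|^2(t, \bar x_t)\,dt\right)\,\Bigg|\,\bar x_T = v\right]g(v)\,dv.
\end{aligned}$$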

It remains to apply Theorem 5.

Remark. If the drift $b$ is bounded, the formulas in Theorems 5 and 6 both apply. Unfortunately, it is difficult to compare the efficiency of the simulations based on these two formulas.