Stochastic Approximation Finite Element method: analytical formulas for multidimensional diffusion process


HAL Id: hal-00842608

https://hal.archives-ouvertes.fr/hal-00842608

Preprint submitted on 9 Jul 2013


Romain Bompis, Emmanuel Gobet

To cite this version:

Romain Bompis, Emmanuel Gobet. Stochastic Approximation Finite Element method: analytical formulas for multidimensional diffusion process. 2013. ⟨hal-00842608⟩


R. BOMPIS∗ AND E. GOBET†

Abstract. We derive an analytical weak approximation of a multidimensional diffusion process as coefficients or time are small. Our methodology combines the use of Gaussian proxies to approximate the law of the diffusion and a Finite Element interpolation of the terminal function applied to the diffusion. We call this method the Stochastic Approximation Finite Element (SAFE for short) method. We provide error bounds for our global approximation depending on the diffusion process coefficients, the time horizon and the regularity of the terminal function. Then we give estimates of the computational cost of our algorithm. This shows an improved efficiency compared to Monte-Carlo methods in small and medium dimensions (up to 10), which is confirmed by numerical experiments.

Key words. weak approximation, diffusion processes, Malliavin calculus, finite elements

AMS subject classifications. 60H35, 60H07, 65M60, 65Cxx

1. Introduction.

Motivation and contribution of the paper. We consider, for $d \ge 1$, a $d$-dimensional stochastic differential equation (SDE) defined by:

\[ X_t = x_0 + \sum_{j=1}^{q} \int_0^t \sigma_j(s, X_s)\, dW_s^j + \int_0^t b(s, X_s)\, ds, \]

where $(W_t)_{t\ge0}$ is a standard Brownian motion in $\mathbb{R}^q$ on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge0}, \mathbb{P})$ with the usual assumptions on the filtration $(\mathcal{F}_t)_{t\ge0}$. Here, $\sigma$ is a $d\times q$ matrix and $b$ is a $d$-dimensional vector, their entries being regular and bounded functions. We are interested in deriving analytical approximations of

\[ \mathbb{E}[h(X_T)] \tag{1.1} \]

for a given function $h$, at least Lipschitz continuous, and a fixed time horizon $T > 0$. An explicit computation of (1.1) is most of the time impossible because the marginal law of the diffusion $X$ is not known and because of the general form of the function $h$. Hence it is usual to resort to a numerical method. For low dimension (say $d \le 3$), we may use PDE schemes since $(x_0,T) \mapsto \mathbb{E}[h(X_T)]$ solves a linear parabolic PDE, but the complexity increases very quickly with the dimension $d$. For higher dimension, Monte-Carlo methods are preferred but, although almost insensitive to the dimension, they only evaluate the above expectation for a single $(x_0,T)$. The aim of this work is to provide an alternative numerical method, based on analytical approximation, and we highlight an approach well suited to general functions $h$ without specific form (under reasonable conditions) and to rather general diffusion models.

The quick and efficient approximation of SDE distributions is fundamental: it is widely used as a cornerstone of probabilistic algorithms related to dynamic programming problems, which require the evaluation of many nested conditional expectations (for instance, see [1] for optimal stopping problems and [16] for Backward SDEs).

∗ CMAP, Ecole Polytechnique, Route de Saclay, 91128 Palaiseau cedex, France. Email: romain.bompis@polytechnique.edu.

† Email: emmanuel.gobet@polytechnique.edu. Corresponding author.

Our subsequent numerical method (called Stochastic Approximation Finite Element, SAFE for short) relies both on the weak approximation of the marginal law of $X_T$ and on the approximation of the function $h$. Firstly, to approximate the law of $X$, we consider the Gaussian proxy process obtained by freezing the diffusion coefficients at $x = x_0$:

\[ X^P_t = x_0 + \sum_{j=1}^{q} \int_0^t \sigma_j(s, x_0)\, dW_s^j + \int_0^t b(s, x_0)\, ds. \tag{1.2} \]

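Using the Proxy principle of [9], we derive a weak approximation in the form (see Theorem 2.1):

\[ \mathbb{E}[h(X_T)] \approx \mathbb{E}[h(X_T^P)] + \sum_{|\alpha|\le 3} w_{\alpha,T}\, \partial^{|\alpha|}_{\epsilon_{\alpha_1}\dots\epsilon_{\alpha_{|\alpha|}}} \mathbb{E}[h(X_T^P+\epsilon)]\big|_{\epsilon=0}, \tag{1.3} \]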

where $\alpha \in \{1,\dots,d\}^{|\alpha|}$ is a multi-index, $w_{\alpha,T}$ are weights depending explicitly on the SDE coefficients and where the sensitivities $\partial^{|\alpha|}_{\epsilon_{\alpha_1}\dots\epsilon_{\alpha_{|\alpha|}}} \mathbb{E}[h(X_T^P+\epsilon)]\big|_{\epsilon=0}$ are well defined as soon as the law of $X_T^P$ is non degenerate. Apart from a few specific cases of functions $h$ (for example if $h$ has separable variables combined with the independence of the $X_T^P$ components), the representation (1.3) cannot be directly computed in closed form; however, it can be rewritten in a simple expectation form suitable for simple and direct Monte-Carlo simulations (see Theorem 2.3). To obtain fully analytical formulas, another ingredient is needed. The second step is to approximate the function $h$ by a local interpolation based on suitable shape functions of Finite Element Methods (see Theorems 2.4-2.5-2.8). Denoting by $\hat h$ the resulting interpolation of $h$, the final structure of the approximation becomes

\[ \mathbb{E}[h(X_T)] \approx \mathbb{E}[\hat h(X_T^P)] + \sum_{|\alpha|\le 3} w_{\alpha,T}\, \partial^{|\alpha|}_{\epsilon_{\alpha_1}\dots\epsilon_{\alpha_{|\alpha|}}} \mathbb{E}[\hat h(X_T^P+\epsilon)]\big|_{\epsilon=0}, \]

whose accuracy and complexity are given in Theorems 2.6-2.8 and Corollaries 2.7-2.9. The convergence holds as $b$, $\sigma$ or $T$ go to 0 in a suitable sense. The key feature of this methodology is that the interpolation procedure is done in such a way that the computation of the above expectations is fully explicit and reduces to computations involving the c.d.f. of a one-dimensional Gaussian r.v. and its derivatives (see Subsection 2.2). The flexibility and the accuracy of our formulas allow their use as they stand; alternatively, they could serve as a control variate tool to improve Monte-Carlo methods.

Background results. We briefly describe the main known approaches to approximate the distribution of an SDE. Time discretization schemes are broadly described in [13]: they consist in replacing $X$ by an approximation $\hat X$ that is easier to simulate; the evaluation of $\mathbb{E}(h(\hat X_T))$ is then made using Monte-Carlo simulations. The balance between discretization and integration errors is described in [6].

Alternatively, the cubature on Wiener space by Kusuoka-Lyons-Victoir [15, 18] is a well-established theory. It is based on a smart discrete approximation of the Wiener measure, which leads to solving ODEs in order to approximate $X$. The splitting method by Ninomiya-Victoir [19] also reduces to solving ODEs. Clearly, these approaches are different from ours.

The quantization method [10] is aimed at approximating the distribution of $X_T$ with a fixed number of points, optimally w.r.t. an $L^p$-norm; for applications to stochastic processes, see for instance [1]. This differs from the current work.

The use of asymptotic methods has been much developed in recent years, mostly in the field of mathematical finance. As opposed to our setting, the related works deal mainly with one-dimensional processes and specific $h$. Mathematical approaches are numerous, see [7, 3, 17] among others. Here, we address the multi-dimensional case with general functions $h$, considerably extending the setting of previous references.


Organization of the paper. In the following, we introduce notations and assumptions that are used throughout the paper. We state in Section 2 the main results of the paper:

• We first provide in Theorem 2.1 a second order weak approximation of $\mathbb{E}[h(X_T)]$ using the Gaussian proxy, the magnitude of the error being estimated w.r.t. the SDE coefficients and the time horizon $T$. This approximation, which involves correction terms given by expectation sensitivities, has an interesting representation as a simple expectation, well suited to direct Monte-Carlo simulations; see Theorem 2.3.

• We then perform a suitable multilinear interpolation of the function $h$ in Theorem 2.4, the accuracy results being given in Theorem 2.5 according to the regularity of $h$. The resulting formulas are fully explicit.

• We finally establish a final approximation combining both the weak expansion and the interpolation of $h$ in Theorem 2.6, providing tight error estimates as well as a complexity analysis (Corollary 2.7).

• Results are extended in Theorem 2.8 and Corollary 2.9 considering multiquadratic finite elements.

The proofs of the error estimates of Theorems 2.1 and 2.5 are given in Sections 3 and 4 respectively. Numerical experiments illustrating the performance of our algorithm in comparison to Monte-Carlo methods are presented in Section 5. Appendix A is devoted to the explicit derivation of the corrective terms of the weak expansion provided in Theorem 2.1.

Notations, definitions and assumptions.

Linear algebra. The $j$-th column of a matrix $A$ will be denoted by $A_j$ (or $A_{j,t}$ if $A$ is a time-dependent matrix) and its $i$-th row by $A^i$. $A^*$ denotes the transpose of $A$ and, if it is a square matrix, $\det(A)$ stands for its determinant. $I_m$ denotes the $m$-dimensional identity matrix, $\langle\cdot,\cdot\rangle$ the inner product on $\mathbb{R}^m$ and $|\cdot|$ is the Euclidean norm on $\mathbb{R}^m$.

Functions. As usual, $\mathcal{C}^k(O_1,O_2)$ stands for the set of functions $g : O_1 \to O_2$ that are $k$-times continuously differentiable, where $O_1, O_2$ are some subsets of Euclidean spaces. Let $p_1, p_2$ be in $\mathbb{N}\setminus\{0\}$. For any function $g = (g_1,\dots,g_{p_2})^* : [0,T]\times\mathbb{R}^{p_1} \to \mathbb{R}^{p_2}$, we denote $|g|_\infty = \sup_{(t,x)\in[0,T]\times\mathbb{R}^{p_1}} |g(t,x)|$. If $g$ is sufficiently differentiable w.r.t. the variable $x$, its gradient, which takes values in $\mathbb{R}^{p_2}\otimes\mathbb{R}^{p_1}$, is simply denoted $\nabla g(t,x) = (\partial_{x_1} g(t,x),\dots,\partial_{x_{p_1}} g(t,x))$; when $p_2 = 1$, we denote its Hessian matrix by $H(g)(t,x) = (\partial^2_{x_i,x_j} g(t,x))_{i,j\in\{1,\dots,p_1\}}$. Furthermore, we often use the short notation $\partial^\alpha g(t,x)$ for $\partial^{|\alpha|}_{x_{\alpha_1}\dots x_{\alpha_{|\alpha|}}} g(t,x)$, i.e. the partial derivative of $g$ w.r.t. a multi-index $\alpha$ according to the space variable. We denote by $\mathrm{Lip}(\mathbb{R}^d,\mathbb{R})$ the space of Lipschitz functions $h : \mathbb{R}^d \to \mathbb{R}$ satisfying

\[ C_{\mathrm{Lip},h} := \sup_{x,y\in\mathbb{R}^d,\, x\neq y} \frac{|h(x)-h(y)|}{|x-y|} < +\infty. \]

About the Gaussian proxy. Whenever unambiguous, we use the notations $\sigma_t := \sigma(t,x_0)$ and $b_t := b(t,x_0)$ for any $t\in[0,T]$ and we denote by $\Sigma_t := \sigma_t\sigma_t^*$ the $d$-dimensional non-negative definite covariance matrix at time $t$ associated to the Gaussian process $X^P$ defined in (1.2). We start with an easy result, whose notations are used throughout the paper.

Property 1.1.

1. The distribution of $X_T^P$ is normal with mean $m_T^P = x_0 + \int_0^T b_t\,dt$ and covariance matrix $V_T^P = \int_0^T \Sigma_t\,dt$.

2. There is a $d$-dimensional orthogonal matrix $U_V$ such that $V_T^P = U_V D_T^P U_V^{-1}$, where $D^P_T := \mathrm{diag}(\lambda_1^2 T, \dots, \lambda_d^2 T)$ is a $d$-dimensional diagonal matrix containing the eigenvalues of $V^P_T$.
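As a concrete illustration of Property 1.1, the following sketch computes $m_T^P$, $V_T^P$ and the decomposition $V_T^P = U_V D_T^P U_V^{-1}$ in the special case $d = 2$ with frozen coefficients that do not depend on time, so that the time integrals reduce to multiplications by $T$. The function names, the numerical values and the closed-form $2\times2$ eigen-decomposition are assumptions made for this example; they are not taken from the paper.

```python
import math

def proxy_moments_2d(x0, sigma0, b0, T):
    """Mean and covariance of X_T^P when sigma(t, x0) and b(t, x0) are constant in t.

    sigma0 is the 2 x q matrix sigma(., x0) given as a list of rows,
    b0 is the drift vector b(., x0).
    """
    q = len(sigma0[0])
    m = [x0[i] + b0[i] * T for i in range(2)]
    # V = T * sigma0 sigma0^*
    V = [[T * sum(sigma0[i][k] * sigma0[j][k] for k in range(q))
          for j in range(2)] for i in range(2)]
    return m, V

def eig_sym_2x2(V):
    """Return (U, D) with V = U diag(D) U^T for a symmetric 2 x 2 matrix V."""
    a, b, c = V[0][0], V[0][1], V[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - c)      # principal-axis angle
    cs, sn = math.cos(theta), math.sin(theta)
    U = [[cs, -sn], [sn, cs]]                     # rotation, hence orthogonal
    mean, rad = 0.5 * (a + c), math.hypot(0.5 * (a - c), b)
    return U, [mean + rad, mean - rad]

# Illustrative values (hypothetical)
x0 = [1.0, 0.5]
sigma0 = [[0.2, 0.0], [0.1, 0.3]]
b0 = [0.05, -0.02]
T = 2.0

m_P, V_P = proxy_moments_2d(x0, sigma0, b0, T)
U_V, D_P = eig_sym_2x2(V_P)
lam = [math.sqrt(D_P[i] / T) for i in range(2)]   # D_T^P = diag(lam_i^2 T)
```

In higher dimension one would typically rely on a numerical symmetric eigen-solver instead of the closed-form rotation used here.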

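Assumption $(H_{x_0})$ on $\sigma$ and $b$.

$(H_{x_0})$-i) $\sigma$ and $b$ are bounded measurable functions from $[0,T]\times\mathbb{R}^d$ to $\mathbb{R}^{d\times q}$ and $\mathbb{R}^d$ respectively; they are twice continuously differentiable w.r.t. $x$, with uniformly bounded derivatives, and their second derivatives are locally $\alpha\in(0,1]$-Hölder continuous w.r.t. $x$. We set:

\[ M_1(\sigma,b) = \sum_{\alpha:\,1\le|\alpha|\le2} \big(|\partial^\alpha\sigma|_\infty + |\partial^\alpha b|_\infty\big) \quad\text{and}\quad M_0(\sigma,b) = \max\big(|\sigma|_\infty, |b|_\infty, M_1(\sigma,b)\big). \]

To avoid uninteresting situations, we assume $M_0(\sigma,b) > 0$.

$(H_{x_0})$-ii) There is a constant $C_V \ge 1$ such that

\[ C_V M_0(\sigma,b) \ge \max_{i\in\{1,\dots,d\}} \lambda_i \ge \min_{i\in\{1,\dots,d\}} \lambda_i \ge (C_V)^{-1} M_0(\sigma,b). \]

In particular, the matrix $V_T^P$ is positive definite.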

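From $(H_{x_0})$ and Property 1.1 we easily deduce

Property 1.2.

1. The distribution of $X_T^P$ has a density

\[ f^P(x) = \frac{e^{-\frac12 (x-m_T^P)^*(V_T^P)^{-1}(x-m_T^P)}}{(2\pi)^{d/2}\sqrt{\det(V^P_T)}}, \]

such that for any multi-index $\alpha$

\[ |\partial^\alpha f^P(x)| \le C_{\alpha,d}\,\big(M_0(\sigma,b)\sqrt{T}\big)^{-(d+|\alpha|)} \exp\Big(-\frac{|x-m^P_T|^2}{C_{\alpha,d}[M_0(\sigma,b)]^2 T}\Big), \tag{1.4} \]

for a constant $C_{\alpha,d} > 0$ that depends in a non-decreasing way on $T$, $M_0(\sigma,b)$ and $C_V$.

2. For any measurable function $\varphi : \mathbb{R}^d \to \mathbb{R}$ exponentially bounded, define $\varphi^P : \epsilon\in\mathbb{R}^d \mapsto \varphi^P(\epsilon) = \mathbb{E}[\varphi(X_T^P+\epsilon)]$. Then $\varphi^P$ is of class $\mathcal{C}^\infty$ and all the derivatives $\partial^{|\alpha|}_{\epsilon_{\alpha_1}\dots\epsilon_{\alpha_{|\alpha|}}}\varphi^P(0) := \partial^{|\alpha|}_{\epsilon_{\alpha_1}\dots\epsilon_{\alpha_{|\alpha|}}}\varphi^P(\epsilon)\big|_{\epsilon=0}$ exist for any multi-index $\alpha\in\{1,\dots,d\}^{|\alpha|}$.

Miscellaneous. We use the following notations to state our error estimates throughout the paper:

• "$A = O(B)$" means that $|A| \le CB$ where $C$ stands for a generic constant that is a non-negative non-decreasing function of the parameters $d$, $T$, $M_0(\sigma,b)$, $M_1(\sigma,b)$ and $C_V$. Unless made explicit, a generic constant may depend on the test function $h$.

• Similarly, if $A$ is non-negative, $A \le_c B$ means that $A \le CB$ for a generic constant $C$.

Lastly, for a r.v. $Y\in\mathbb{R}^m$ ($m\ge1$) and for $p\ge1$, $\|Y\|_p = (\mathbb{E}|Y|^p)^{1/p}$ stands for its $L^p$-norm.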

2. Main results.

2.1. Second order weak approximation. The proxy model has the advantage of having an explicit Gaussian law, and the accuracy of the approximations $\sigma(t,X_t)\approx\sigma(t,x_0)$ and $b(t,X_t)\approx b(t,x_0)$ can be justified if $M_1(\sigma,b)$, $M_0(\sigma,b)$ and $T$ are globally small enough (see Lemma 3.1). Nevertheless, we cannot reasonably expect $\mathbb{E}[h(X_T)] \approx h^P(0) = \mathbb{E}[h(X_T^P)]$ to be accurate enough on its own, and we provide correction terms. To derive them, we make intensive use of the following interpolated process:

\[ X^\eta_t = x_0 + \sum_{j=1}^{q}\int_0^t \sigma_j(s, \eta X^\eta_s + (1-\eta)x_0)\,dW_s^j + \int_0^t b(s, \eta X^\eta_s + (1-\eta)x_0)\,ds, \quad \eta\in[0,1], \tag{2.1} \]

so that $X^{\eta=1} = X$ and $X^{\eta=0} = X^P$. Under $(H_{x_0})$-i), almost surely for any $t$, $\eta \mapsto X_t^\eta$ is $\mathcal{C}^2([0,1],\mathbb{R}^d)$, see [14]. The dynamics of the first two derivatives $(\dot X^\eta_t := \partial_\eta X^\eta_t)_{t\ge0}$ and $(\ddot X_t^\eta := \partial^2_{\eta^2} X^\eta_t)_{t\ge0}$ are obtained by a direct differentiation of the SDE satisfied by $X^\eta$:

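\[ \dot X^\eta_t = \sum_{j=1}^{q}\int_0^t \nabla\sigma_j(s, x_0+\eta(X^\eta_s - x_0))\,(X^\eta_s - x_0 + \eta\dot X^\eta_s)\,dW_s^j + \int_0^t \nabla b(s, x_0+\eta(X^\eta_s - x_0))\,(X^\eta_s - x_0 + \eta\dot X^\eta_s)\,ds, \tag{2.2} \]

\[
\begin{aligned}
(\ddot X_t^\eta)^i ={}& \sum_{j=1}^{q}\int_0^t \Big[(X^\eta_s - x_0 + \eta\dot X^\eta_s)^* H(\sigma^i_j)(s, x_0+\eta(X^\eta_s - x_0))\,(X^\eta_s - x_0 + \eta\dot X^\eta_s) \\
&\qquad\quad + \nabla\sigma^i_j(s, x_0+\eta(X^\eta_s - x_0))\,(2\dot X^\eta_s + \eta\ddot X^\eta_s)\Big]\,dW_s^j \\
&+ \int_0^t \Big[(X^\eta_s - x_0 + \eta\dot X^\eta_s)^* H(b^i)(s, x_0+\eta(X^\eta_s - x_0))\,(X^\eta_s - x_0 + \eta\dot X^\eta_s) \\
&\qquad\quad + \nabla b^i(s, x_0+\eta(X^\eta_s - x_0))\,(2\dot X^\eta_s + \eta\ddot X^\eta_s)\Big]\,ds, \quad \forall i\in\{1,\dots,d\}.
\end{aligned} \tag{2.3}
\]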

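Setting $\sigma'_{j,t} := \nabla\sigma_j(t,x_0)$, $\Sigma'_{j,t} := \nabla\Sigma_j(t,x_0)$ and $b'_t := \nabla b(t,x_0)$, the process $\dot X := \dot X^{\eta=0}$ is the solution of the SDE:

\[ \dot X_t = \sum_{j=1}^{q}\int_0^t \sigma'_{j,s}(X^P_s - x_0)\,dW_s^j + \int_0^t b'_s(X_s^P - x_0)\,ds. \tag{2.4} \]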

Then, combining Taylor expansions for the interpolated process $X^\eta$ and the function $h$ (here assumed to be smooth enough for the sake of brevity), we propose the following weak stochastic approximation:

\[ \mathbb{E}[h(X_T)] = \mathbb{E}[h(X_T^P)] + \mathbb{E}[\nabla h(X_T^P)\,\dot X_T] + \mathrm{Error}^{SA}_{2,h}, \tag{2.5} \]

where the explicit computation of the corrective term $\mathbb{E}[\nabla h(X_T^P)\,\dot X_T]$ is performed in Proposition A.2 whereas the estimate of the error $\mathrm{Error}^{SA}_{2,h}$ is postponed to Section 3. This leads to the following Theorem (stated for merely Lipschitz functions $h$).

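Theorem 2.1. (Second order weak approximation using the Gaussian proxy).

Assume $(H_{x_0})$ and suppose that $h\in\mathrm{Lip}(\mathbb{R}^d,\mathbb{R})$. Then we have:

\[ \mathbb{E}[h(X_T)] = \mathbb{E}[h(X^P_T)] + \mathrm{Cor}_{2,h} + \mathrm{Error}^{SA}_{2,h}, \tag{2.6} \]

where:

\[
\begin{aligned}
\mathrm{Cor}_{2,h} ={}& \nabla h^P(0)\int_0^T b'_t\int_0^t b_s\,ds\,dt \\
&+ \sum_{i,j=1}^{d}\partial^2_{\epsilon_i,\epsilon_j}h^P(0)\Big[\int_0^T (b^i_t)'\int_0^t \Sigma_{j,s}\,ds\,dt + \frac12\int_0^T (\Sigma^i_{j,t})'\int_0^t b_s\,ds\,dt\Big] \\
&+ \frac12\sum_{i,j,k=1}^{d}\partial^3_{\epsilon_i,\epsilon_j,\epsilon_k}h^P(0)\int_0^T (\Sigma^i_{j,t})'\int_0^t \Sigma_{k,s}\,ds\,dt,
\end{aligned} \tag{2.7}
\]

recalling $b'_t = \nabla b(t,x_0)$ and $(\Sigma^i_{j,t})' = \nabla[\sigma\sigma^*]^i_j(t,x_0)$. The stochastic approximation error term is estimated as follows:

\[ |\mathrm{Error}^{SA}_{2,h}| \le_c C_{\mathrm{Lip},h}\,M_1(\sigma,b)\,[M_0(\sigma,b)]^2\,T^{3/2}. \tag{2.8} \]

Recall that $\mathrm{Cor}_{2,h}$ is well defined whatever the smoothness of $h$ (see Property 1.2).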

Remark 2.2. The weak approximation consists of a leading order term $h^P(0)$ plus a sum of weighted sensitivities, i.e. derivatives of $h^P$ at zero, up to the third order. The error is of order 3 w.r.t. the standard deviation $M_0(\sigma,b)\sqrt{T}$ and is null if $M_1(\sigma,b) = 0$ or if $C_{\mathrm{Lip},h} = 0$ (i.e. $h$ is constant). That justifies the label of second order weak approximation.

When $h(x) = \varphi\big(\sum_{i=1}^d \eta_i x_i\big)$ with $\eta_i \ge 0$, the above expansion coincides with that of [9, Theorem 2.1] related to averaged diffusions.

Although the density of the Gaussian proxy is known, the approximation formula (2.6) does not reduce to fully explicit calculations, due to the general form of $h$. Nevertheless, we can derive another representation as an expectation of $h(X_T^P)$ modified by an explicit weight: this is easily obtained by transferring the $\epsilon$-differentiation of the expectation $h^P(\epsilon)$ (associated to correction terms) into a differentiation of the proxy Gaussian density $f^P$. This is a somewhat standard argument, in particular in applications of the Malliavin calculus [20, Section 6.2], so we skip the details of the derivation. The advantage of this representation as an expectation is that it can be evaluated by standard Monte-Carlo methods involving only simulations of the Gaussian proxy $X_T^P$; see our subsequent numerical experiments.

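Theorem 2.3. Under the notations and assumptions of Theorem 2.1, the main terms of the stochastic approximation are

\[ \mathbb{E}[h(X_T^P)] + \mathrm{Cor}_{2,h} = \mathbb{E}\Big[h(X_T^P)\Big\{1 + \mathcal{W}[\Sigma,b;x_0]_0^T\big([V_0^T]^{-1}(X_T^P - m_T^P)\big)\Big\}\Big], \tag{2.9} \]

where we set, for $Y\in\mathbb{R}^d$,

\[
\begin{aligned}
\mathcal{W}[\Sigma,b;x_0]_0^T(Y) ={}& \Big\langle Y, \int_0^T b'_t\int_0^t b_s\,ds\,dt\Big\rangle \\
&+ \sum_{i,j=1}^{d}\Big\{Y^iY^j - ([V_0^T]^{-1})^i_j\Big\}\Big[\int_0^T (b^i_t)'\int_0^t \Sigma_{j,s}\,ds\,dt + \frac12\int_0^T (\Sigma^i_{j,t})'\int_0^t b_s\,ds\,dt\Big] \\
&+ \frac12\sum_{i,j,k=1}^{d}\Big\{Y^iY^jY^k - Y^k([V_0^T]^{-1})^i_j - Y^j([V_0^T]^{-1})^i_k - Y^i([V_0^T]^{-1})^j_k\Big\}\int_0^T (\Sigma^i_{j,t})'\int_0^t \Sigma_{k,s}\,ds\,dt.
\end{aligned}
\]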
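The polynomial factors in $Y$ appearing in the weight above come from differentiating the Gaussian density instead of $h$. The following sketch checks this mechanism numerically in dimension one, under the assumption that $V_0^T$ denotes the proxy variance $V$ and with $Y = (x-m)/V$: the three weights $Y$, $Y^2 - 1/V$ and $Y^3 - 3Y/V$ should reproduce the first three derivatives of $\epsilon\mapsto\mathbb{E}[h(X+\epsilon)]$ at $0$. The test function $h(x) = x^3$, the parameter values and the trapezoidal quadrature are illustrative choices, not part of the paper.

```python
import math

def gaussian_expectation(g, m, V, n=20001, width=10.0):
    """Trapezoidal approximation of E[g(X)] for a scalar X ~ N(m, V)."""
    s = math.sqrt(V)
    a, b = m - width * s, m + width * s
    step = (b - a) / (n - 1)
    total = 0.0
    for k in range(n):
        x = a + k * step
        w = 0.5 if k in (0, n - 1) else 1.0
        total += w * g(x) * math.exp(-0.5 * (x - m) ** 2 / V)
    return total * step / math.sqrt(2.0 * math.pi * V)

m, V = 0.3, 0.5          # hypothetical proxy mean and variance

def h(x):
    return x ** 3        # smooth, polynomially bounded test function

def sensitivity(order):
    """E[h(X) * weight_order(Y)] with Y = (X - m) / V, for order 1, 2 or 3."""
    def integrand(x):
        Y = (x - m) / V
        weights = (Y, Y * Y - 1.0 / V, Y ** 3 - 3.0 * Y / V)
        return h(x) * weights[order - 1]
    return gaussian_expectation(integrand, m, V)

sens = [sensitivity(k) for k in (1, 2, 3)]

# Closed form for comparison: E[(X + eps)^3] = (m + eps)^3 + 3 (m + eps) V,
# whose derivatives at eps = 0 are listed below.
exact = [3.0 * m * m + 3.0 * V, 6.0 * m, 6.0]
```

Because only $h$ itself is evaluated, the same identity applies to merely Lipschitz $h$, which is why a Monte-Carlo evaluation of (2.9) needs simulations of the Gaussian proxy only.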

As an alternative to a Monte-Carlo evaluation based on (2.9), we provide in the following Subsection a new numerical method to approximate the expansion formula (2.6), taking advantage of a multilinear interpolation with hat functions, whose theoretical accuracy is given according to the smoothness of $h$. The extension to multiquadratic interpolation is presented afterwards.

2.2. An efficient algorithm using multilinear finite elements. We define the hat function $\Lambda^\mu_z$ with center $z\in\mathbb{R}$ and size parameter $\mu > 0$ by:

\[ \Lambda^\mu_z(y) = \frac{y-(z-\mu)}{\mu}\,\mathbf{1}_{y\in[z-\mu,z[} + \frac{z+\mu-y}{\mu}\,\mathbf{1}_{y\in[z,z+\mu]}. \tag{2.10} \]

Observe that $\mathbb{E}(\Lambda^\mu_z(G_1))$ is known in explicit form when $G_1$ is a scalar Gaussian r.v. (like the proxy): therefore, replacing $h$ by a linear interpolation $\hat h$ (using the $\Lambda^\mu_z$-function) leads to a fully explicit formula for (2.6). To extend to the $d$-dimensional case, we wish to use tensor products of such a function basis in all directions to provide an interpolation of $h$.

However, remark that for any Gaussian vector $(G_1,G_2)$ and any $z_1, z_2, \mu_1, \mu_2$, the computation of $\mathbb{E}\big[\Lambda^{\mu_1}_{z_1}(G_1)\Lambda^{\mu_2}_{z_2}(G_2)\big]$ is not tractable, except in the case of zero correlation; thus an additional ingredient is necessary to maintain explicit formulas. In order to be placed in a situation of uncorrelated Gaussian r.v., we introduce an affine transformation $\mathcal{A}$ of the space, composed of a rotation using the $d$-dimensional orthogonal matrix $U_V$ (involved in the diagonal decomposition of $V^P_T$) and a translation of vector $m_T^P$ (the expectation of $X_T^P$). The following presentation is aimed at providing the construction of the right grid (nodes, directions, size).

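Description of the methodology. We consider a finite product grid in $\mathbb{R}^d$ defined by:

\[ \mathcal{Y} = (y_i^j)_{(i,j)\in\{1,\dots,d\}\times\{0,\dots,N\}}, \quad y_i^j = -R_i + j\delta_i, \quad R_i = R\lambda_i\sqrt{T}, \quad \delta_i = \delta\lambda_i\sqrt{T}, \quad \delta = \frac{2R}{N}, \tag{2.11} \]

where we recall that $\lambda_i^2 T$ are the eigenvalues of the covariance matrix $V^P_T$ and where the grid parameters $R$ and $\delta$ are to be specified according to the final approximation accuracy desired. We assume $N\in\mathbb{N}^*$. The grid $\mathcal{Y}$ contains $N^d$ small hypercubes and their vertices are the nodes with coordinates $(y_1^{j_1},\dots,y_d^{j_d})^*$ for any $j_1,\dots,j_d\in\{0,\dots,N\}$. Then we define a new grid $\mathcal{X} = (x^{j_1,\dots,j_d})_{j_1,\dots,j_d\in\{0,\dots,N\}}$, image of $\mathcal{Y}$ by the transformation $\mathcal{A} : x \mapsto \mathcal{A}x = m^P_T + U_V x$:

\[ x^{j_1,\dots,j_d} = (x_1^{j_1,\dots,j_d},\dots,x_d^{j_1,\dots,j_d})^* := \mathcal{A}(y_1^{j_1},\dots,y_d^{j_d})^*. \]

The convex hull of $\mathcal{Y}$ is the hypercube $\mathcal{D}^P$, and let us introduce its image $\tilde{\mathcal{D}}^P$ by $\mathcal{A}$:

\[ \mathcal{D}^P = [-R_1,R_1]\times\dots\times[-R_d,R_d], \qquad \tilde{\mathcal{D}}^P := \mathcal{A}(\mathcal{D}^P). \tag{2.12} \]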

Then we define the multilinear interpolation of $h$ based on the grid $\mathcal{X}$ by setting, for any $x\in\mathbb{R}^d$,

\[ h(x) \approx \hat h(x) := \sum_{j_1,\dots,j_d\in\{0,\dots,N\}} h(x^{j_1,\dots,j_d}) \prod_{i=1}^{d} \Lambda^{\delta_i}_{y_i^{j_i}}\big((U_V^{-1}(x - m^P_T))^i\big). \tag{2.13} \]

Notice that $\hat h$ is continuous, vanishes outside the domain $\mathcal{A}\big([-R_1-\delta_1, R_1+\delta_1]\times\dots\times[-R_d-\delta_d, R_d+\delta_d]\big)$, and the restriction of $\hat h$ to $\mathcal{D}^P$ is Lipschitz continuous with a Lipschitz constant at most equal to $C_{\mathrm{Lip},h}$. The above construction is very similar to multilinear Lagrange finite elements on $d$-parallelotopes, see [4] for a general reference.

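Explicit approximation. Using (2.13) and taking the expectation, we get for the leading order of the expansion (2.6):

\[ \mathbb{E}[h(X^P_T)] \approx \mathbb{E}[\hat h(X_T^P)] = \sum_{j_1,\dots,j_d\in\{0,\dots,N\}} h(x^{j_1,\dots,j_d})\, \mathbb{E}\Big[\prod_{i=1}^{d} \Lambda^{\delta_i}_{y_i^{j_i}}\big((U_V^{-1}(X_T^P - m^P_T))^i\big)\Big], \tag{2.14} \]

where $U_V^{-1}(X_T^P - m^P_T)$ has the centered Gaussian law with independent components. Thus, combining independence and a scaling argument, setting

\[ y_0 = (y_0^j)_{j\in\{-1,\dots,N+1\}} := (-R + j\delta)_{j\in\{-1,\dots,N+1\}} \tag{2.15} \]

and using (2.10)-(2.11)-(2.15), a straightforward computation leads to

\[ \mathbb{E}\Big[\prod_{i=1}^{d} \Lambda^{\delta_i}_{y_i^{j_i}}\big((U_V^{-1}(X^P_T - m^P_T))^i\big)\Big] = \prod_{i=1}^{d} \mathbb{E}\big[\Lambda^{\delta_i}_{y_i^{j_i}}(\lambda_i\sqrt{T}\,W_1^1)\big] = \prod_{i=1}^{d} \mathbb{E}\big[\Lambda^{\delta}_{y_0^{j_i}}(W_1^1)\big] = \prod_{i=1}^{d} \beta^\delta_{j_i}(y_0), \tag{2.16} \]

where

\[ \beta^\delta_j(y_0) := \frac{\beta(y_0^{j+1}) - 2\beta(y_0^j) + \beta(y_0^{j-1})}{\delta}, \qquad \beta(x) := x\mathcal{N}(x) + \mathcal{N}'(x), \qquad \mathcal{N}(x) = \int_{-\infty}^{x} \frac{e^{-y^2/2}}{\sqrt{2\pi}}\,dy = \beta'(x). \tag{2.17} \]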
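To make (2.16)-(2.17) concrete, here is a minimal sketch that evaluates the weights $\beta^\delta_j(y_0)$ with the standard library and compares one of them with a brute-force quadrature of $\mathbb{E}[\Lambda^\delta_{y_0^j}(W_1^1)]$. The helper names, the grid values $R = 6$, $N = 60$ and the midpoint rule are assumptions chosen for the illustration.

```python
import math

def N(x):
    """Standard Gaussian c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Standard Gaussian density, i.e. N'(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def beta(x):
    """beta(x) = x N(x) + N'(x), an antiderivative of N."""
    return x * N(x) + phi(x)

def beta_weight(j, R, delta):
    """beta_j^delta(y0) from (2.17) on the grid y0^j = -R + j * delta."""
    y = -R + j * delta
    return (beta(y + delta) - 2.0 * beta(y) + beta(y - delta)) / delta

def hat(y, z, mu):
    """Hat function (2.10) with centre z and size mu."""
    return max(0.0, 1.0 - abs(y - z) / mu)

def hat_expectation_quadrature(z, mu, n=4000):
    """Midpoint-rule value of E[hat(W)] for W ~ N(0, 1); used only as a check."""
    step = 2.0 * mu / n
    total = 0.0
    for k in range(n):
        y = z - mu + (k + 0.5) * step
        total += hat(y, z, mu) * phi(y)
    return total * step

R, n_cells = 6.0, 60
delta = 2.0 * R / n_cells
weights = [beta_weight(j, R, delta) for j in range(n_cells + 1)]

j_check = 37                                   # arbitrary node index
closed_form = weights[j_check]
brute_force = hat_expectation_quadrature(-R + j_check * delta, delta)
```

Since the hat functions form a partition of unity on $[-R,R]$, the weights sum to a value very close to 1 when $R$ is large, which gives a cheap sanity check of an implementation.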

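Next, for the corrective terms (2.7), we similarly replace $h$ by $\hat h$, which gives $\mathrm{Cor}_{2,h} \approx \mathrm{Cor}_{2,\hat h}$. For any multi-index $\alpha\in\{1,\dots,d\}^{|\alpha|}$, we get for the derivatives of $\hat h^P$, using (2.14)-(2.16):

\[ \partial^{|\alpha|}_{\epsilon_{\alpha_1},\dots,\epsilon_{\alpha_{|\alpha|}}}\hat h^P(\epsilon) = \sum_{j_1,\dots,j_d\in\{0,\dots,N\}} h(x^{j_1,\dots,j_d})\, \partial^{|\alpha|}_{\epsilon_{\alpha_1},\dots,\epsilon_{\alpha_{|\alpha|}}} \prod_{i=1}^{d} \mathbb{E}\big[\Lambda^{\delta_i}_{y_i^{j_i}}\big(\lambda_i\sqrt{T}\,W_1^1 + (U_V^{-1}\epsilon)^i\big)\big], \]

\[ \mathbb{E}\big[\Lambda^{\delta_i}_{y_i^{j_i}}\big(\lambda_i\sqrt{T}\,W_1^1 + (U_V^{-1}\epsilon)^i\big)\big] = \mathbb{E}\Big[\Lambda^{\delta}_{y_0^{j_i} - \frac{(U_V^{-1}\epsilon)^i}{\lambda_i\sqrt{T}}}(W_1^1)\Big] = \beta^\delta_{j_i}\Big(y_0 - \frac{(U_V^{-1}\epsilon)^i}{\lambda_i\sqrt{T}}\Big). \]

Thus it is sufficient to compute the perturbed coefficients $\beta^\delta$ as in (2.17) according to the new translated grid $y_0 - \frac{(U_V^{-1}\epsilon)^i}{\lambda_i\sqrt{T}}$ and then to differentiate w.r.t. $\epsilon$ at $\epsilon = 0$, which leads to explicit calculations. The next result summarizes the previous analysis, in combination with Theorem 2.1.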
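The differentiation step just described can be sketched in dimension one, where $U_V = 1$ and the grid shift is $\epsilon/(\lambda\sqrt{T})$. Since $\beta' = \mathcal{N}$, the derivative of a shifted weight is a second difference of the Gaussian c.d.f. The function names, the scalar setting and the affine test function below are assumptions made for the illustration only.

```python
import math

def N(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def beta(x):
    return x * N(x) + phi(x)

def beta_weight(j, R, delta, shift=0.0):
    """beta_j^delta evaluated on the translated grid y0 - shift."""
    y = -R + j * delta - shift
    return (beta(y + delta) - 2.0 * beta(y) + beta(y - delta)) / delta

def beta_weight_shift_derivative(j, R, delta):
    """Derivative of beta_weight w.r.t. shift at shift = 0, using beta' = N."""
    y = -R + j * delta
    return -(N(y + delta) - 2.0 * N(y) + N(y - delta)) / delta

def first_sensitivity_1d(h, m, s, R, n_cells):
    """d/d(eps) of hhat^P(eps) at eps = 0 in dimension one.

    m is the proxy mean and s = lambda * sqrt(T) its standard deviation,
    so that the grid shift equals eps / s.
    """
    delta = 2.0 * R / n_cells
    total = 0.0
    for j in range(n_cells + 1):
        node = m + s * (-R + j * delta)           # x^j = A(y^j) when d = 1
        total += h(node) * beta_weight_shift_derivative(j, R, delta)
    return total / s

m, s, R, n_cells = 0.3, 0.5, 6.0, 120
delta = 2.0 * R / n_cells

# For an affine h the interpolation is exact inside the grid, so the
# sensitivity should be close to the slope of h.
slope = first_sensitivity_1d(lambda x: 2.0 * x + 1.0, m, s, R, n_cells)

# Finite-difference check of the analytical shift derivative at one node.
j_check, eps = 50, 1e-5
fd = (beta_weight(j_check, R, delta, eps) - beta_weight(j_check, R, delta, -eps)) / (2.0 * eps)
analytic = beta_weight_shift_derivative(j_check, R, delta)
```

Higher-order sensitivities follow the same pattern, with second differences of $\mathcal{N}'$ and $\mathcal{N}''$ in place of $\mathcal{N}$.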

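Theorem 2.4. (SAFE method with multilinear finite elements).

Assume $(H_{x_0})$ and suppose that $h\in\mathrm{Lip}(\mathbb{R}^d,\mathbb{R})$. Define

\[ \hat h^P(\epsilon) := \sum_{j_1,\dots,j_d\in\{0,\dots,N\}} h(x^{j_1,\dots,j_d}) \prod_{i=1}^{d} \beta^\delta_{j_i}\Big(y_0 - \frac{(U_V^{-1}\epsilon)^i}{\lambda_i\sqrt{T}}\Big), \]

where the weight functions $\beta^\delta_{j_i}$ and the grid $y_0$ are respectively defined in (2.17) and (2.15). Then we have

\[ \mathbb{E}[h(X_T)] = \hat h^P(0) + \mathrm{Cor}_{2,\hat h} + \mathrm{Error}^{SA}_{2,h} + \mathrm{Error}^{FEL}_h, \tag{2.18} \]

where $\mathrm{Cor}_{2,\hat h}$ is obtained by replacing $h^P$ by $\hat h^P$ in $\mathrm{Cor}_{2,h}$ (defined in (2.7)) and where the error using the multilinear finite elements approximation is defined by:

\[ \mathrm{Error}^{FEL}_h := h^P(0) - \hat h^P(0) + \mathrm{Cor}_{2,h} - \mathrm{Cor}_{2,\hat h}. \tag{2.19} \]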
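As an illustration of the leading term $\hat h^P(0)$ in Theorem 2.4, the sketch below evaluates it in dimension $d = 2$ for a correlated Gaussian proxy and compares it with a closed-form value. Everything specific here is an assumption made for the example: the proxy mean, the rotation angle defining $U_V$, the values of $\lambda_i\sqrt{T}$, the grid parameters and the call-type test function $h(x) = \max(x_1 + x_2 - K, 0)$, chosen because $x_1 + x_2$ is Gaussian so that $\mathbb{E}[h(X_T^P)]$ is explicit. The correction terms $\mathrm{Cor}_{2,\hat h}$ are not computed.

```python
import math

def N(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def beta(x):
    return x * N(x) + phi(x)

def beta_weight(j, R, delta):
    y = -R + j * delta
    return (beta(y + delta) - 2.0 * beta(y) + beta(y - delta)) / delta

def safe_leading_term_2d(h, m, U, lam_sqrtT, R, n_cells):
    """hhat^P(0) = sum over nodes of h(x^{j1,j2}) beta_{j1} beta_{j2}, x^j = m + U y^j."""
    delta = 2.0 * R / n_cells
    w = [beta_weight(j, R, delta) for j in range(n_cells + 1)]
    total = 0.0
    for j1 in range(n_cells + 1):
        y1 = lam_sqrtT[0] * (-R + j1 * delta)
        for j2 in range(n_cells + 1):
            y2 = lam_sqrtT[1] * (-R + j2 * delta)
            x = (m[0] + U[0][0] * y1 + U[0][1] * y2,
                 m[1] + U[1][0] * y1 + U[1][1] * y2)
            total += h(x) * w[j1] * w[j2]
    return total

# Hypothetical proxy: mean m, V = U diag((lam_i sqrt(T))^2) U^T
m = (1.0, 0.5)
theta = 0.4
U = [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]
lam_sqrtT = (0.5, 0.2)
K = 1.4

def h(x):
    return max(x[0] + x[1] - K, 0.0)

approx = safe_leading_term_2d(h, m, U, lam_sqrtT, R=6.0, n_cells=120)

# Closed form: x1 + x2 ~ N(mu, s^2) and E[(mu + s W - K)_+] = s * beta((mu - K) / s)
c1 = math.cos(theta) + math.sin(theta)
c2 = math.cos(theta) - math.sin(theta)
mu = m[0] + m[1]
s = math.sqrt((c1 * lam_sqrtT[0]) ** 2 + (c2 * lam_sqrtT[1]) ** 2)
exact = s * beta((mu - K) / s)
```

The double loop visits $(N+1)^d$ nodes, which is the source of the dimension-dependent cost discussed in Subsection 2.3.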

Accuracy results. The accuracy of multilinear interpolation depends on the smoothness of the function $h$ to approximate. Our goal is not to be exhaustive in this respect but rather to give a few settings relevant for the practical applications that we have in mind. For a detailed exposition of Finite Elements accuracy, see [4]. We distinguish three kinds of increasingly strong assumptions:

(H1): $h\in\mathrm{Lip}(\mathbb{R}^d,\mathbb{R})$.

(H2): $h\in\mathrm{Lip}(\mathbb{R}^d,\mathbb{R})$, piecewise $\mathcal{C}^2$, in the sense that there are an integer $N_h\in\mathbb{N}^*$ and $N_h$ domains (non empty open connected sets of $\mathbb{R}^d$) $(D^i)_{i\in\{1,\dots,N_h\}}$, such that:

1. $\forall i\in\{1,\dots,N_h\}$, either the domain $D^i$ has a compact boundary $\partial D^i$ of class $\mathcal{C}^2$, or $D^i$ is a half-space,

2. $\mathbb{R}^d = \bigcup_{i=1}^{N_h} D^i$,

3. $\forall i\in\{1,\dots,N_h\}$, the restriction of $h$ to $D^i$, which is denoted by $h_i$, is a $\mathcal{C}^2(D^i,\mathbb{R}^d)$-function with bounded derivatives.

(H3): $h\in\mathcal{C}^2(\mathbb{R}^d,\mathbb{R})$ with bounded derivatives.

We state the accuracy results in the following Theorem, whose proof is postponed to Section 4.

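Theorem 2.5. (Accuracy of SAFE method with multilinear finite elements).

Assume $(H_{x_0})$ and suppose that $h$ satisfies at least (H1). Recalling the density $f^P$ of $X_T^P$ in (1.4) and the domain $\tilde{\mathcal{D}}^P$ in (2.12), for any multi-index $\alpha$ set

\[ G^{\alpha,T}_h = \int_{\mathbb{R}^d} \mathbf{1}_{y\notin\tilde{\mathcal{D}}^P}\, h(y)\, \partial^\alpha_\epsilon\big(f^P(y-\epsilon)\big)\big|_{\epsilon=0}\,dy, \qquad G^{\alpha,I}_h = \int_{\mathbb{R}^d} \mathbf{1}_{y\in\tilde{\mathcal{D}}^P}\, h(y)\, \partial^\alpha_\epsilon\big(f^P(y-\epsilon)\big)\big|_{\epsilon=0}\,dy. \tag{2.20} \]

Then, define $\mathrm{Cor}^T_{2,h}$ (respectively $\mathrm{Cor}^I_{2,h}$) by replacing in $\mathrm{Cor}_{2,h}$ the sensitivities $\partial^\alpha h^P(0)$ by $G^{\alpha,T}_h$ (respectively $G^{\alpha,I}_h$), so that $\mathrm{Cor}_{2,h} = \mathrm{Cor}^T_{2,h} + \mathrm{Cor}^I_{2,h}$; proceed similarly with $G^{\alpha,T}_{\hat h}$, $G^{\alpha,I}_{\hat h}$, $\mathrm{Cor}^T_{2,\hat h}$ and $\mathrm{Cor}^I_{2,\hat h}$. Then the multilinear finite elements error (2.19) is decomposed as

\[ \mathrm{Error}^{FEL}_h = \mathrm{Error}^{FEL,T}_h + \mathrm{Error}^{FEL,I}_h, \]

where the Truncation Error

\[ \mathrm{Error}^{FEL,T}_h := \mathbb{E}\big[(h(X_T^P) - \hat h(X_T^P))\,\mathbf{1}_{X_T^P\notin\tilde{\mathcal{D}}^P}\big] + \mathrm{Cor}^T_{2,h} - \mathrm{Cor}^T_{2,\hat h} \]

strongly depends on the size parameter $R$ introduced in (2.11), and where the Interpolation Error on $\tilde{\mathcal{D}}^P$

\[ \mathrm{Error}^{FEL,I}_h := \mathbb{E}\big[(h(X^P_T) - \hat h(X_T^P))\,\mathbf{1}_{X_T^P\in\tilde{\mathcal{D}}^P}\big] + \mathrm{Cor}^I_{2,h} - \mathrm{Cor}^I_{2,\hat h} \]

depends on the grid mesh $\delta$. On the one hand, for $h\in\mathrm{Lip}(\mathbb{R}^d,\mathbb{R})$, the truncation error is such that

\[ |\mathrm{Error}^{FEL,T}_h| \le_c \big(|h(m_T^P)| + C_{\mathrm{Lip},h}\big)\exp(-R^2/4). \tag{2.21} \]

On the other hand, the Interpolation Error is estimated as follows, according to the regularity of $h$:

\[ |\mathrm{Error}^{FEL,I}_h| \le_c \begin{cases} C_{\mathrm{Lip},h}\,\delta\,M_0(\sigma,b)\sqrt{T} & \text{under (H1)}, \\ \Big\{C_{\mathrm{Lip},h} + \max_{i\le N_h,\ \alpha:|\alpha|=2}|\partial^\alpha h_i|_\infty\Big\}\,\delta\,M_0(\sigma,b)\sqrt{T}\,\big[\delta + M_0(\sigma,b)\sqrt{T}\big] & \text{under (H2)}, \\ \sup_{\alpha:|\alpha|=2}|\partial^\alpha h|_\infty\,\delta^2\,[M_0(\sigma,b)\sqrt{T}]^2 & \text{under (H3)}, \end{cases} \tag{2.22} \]

where the generic constant $c$ in case (H2) depends on the domains.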

2.3. Final approximation and complexity of the SAFE algorithm based on multilinear finite elements. Combining Theorems 2.1 (weak approximation with the Gaussian proxy) and 2.5 (suitable interpolation of $h$ using the hat functions), we derive a final approximation of $\mathbb{E}[h(X_T)]$ with a choice of parameters $R$ and $\delta$ (see (2.11)) that yields a global error of order at most equal to $\mathcal{E} = [M_0(\sigma,b)\sqrt{T}]^3$. The proof of the following Theorem is left to the reader.

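Theorem 2.6. Assume $(H_{x_0})$ and suppose that $h$ satisfies at least (H1). Consider the local approximation $\hat h$ of $h$ defined in (2.13) with parameters $R$ and $\delta$ set as follows:

\[ R := 2\sqrt{\log(1/\mathcal{E})}, \qquad \delta := c\,\begin{cases} [\max_i \lambda_i\sqrt{T}]^2 & \text{under (H1)}, \\ \max_i \lambda_i\sqrt{T} & \text{under (H2)}, \\ [\max_i \lambda_i\sqrt{T}]^{1/2} & \text{under (H3)}, \end{cases} \]

for an arbitrary fixed constant $c$. Then, the global error is of order 3 w.r.t. $M_0(\sigma,b)\sqrt{T}$:

\[ \mathbb{E}[h(X_T)] = \mathbb{E}[\hat h(X_T^P)] + \mathrm{Cor}_{2,\hat h} + O\big([M_0(\sigma,b)\sqrt{T}]^3\big). \]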

Now, let us analyze the algorithm complexity w.r.t. the target error $\mathcal{E}$, according to the regularity of $h$. Denote by $\mathcal{C}(d)$ the computational cost for the elementary operations at each node $x^{j_1,\dots,j_d}$ of the local approximation: apart from the evaluation of $h(x^{j_1,\dots,j_d})$, computations are mainly dedicated to the calculation of the $\beta$ weights defined in (2.17) and their derivatives, which is simple and can even be done off-line. Therefore, the total computational cost of the algorithm is $\mathcal{C}^{FEL}_{calculus} = O\big(\mathcal{C}(d)(N+1)^d\big)$. Since $N = 2R/\delta$, the complexity of the algorithm to reach the target error $\mathcal{E} = [M_0(\sigma,b)\sqrt{T}]^3$ can be evaluated in the following manner.

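Corollary 2.7. With the previous notations and assumptions, as $\mathcal{E}\to0$ we have

\[ \mathcal{C}^{FEL}_{calculus} = \begin{cases} O\big([\log(1/\mathcal{E})]^{d/2}\,\mathcal{E}^{-2d/3}\big) & \text{under (H1)}, \\ O\big([\log(1/\mathcal{E})]^{d/2}\,\mathcal{E}^{-d/3}\big) & \text{under (H2)}, \\ O\big([\log(1/\mathcal{E})]^{d/2}\,\mathcal{E}^{-d/6}\big) & \text{under (H3)}. \end{cases} \tag{2.23} \]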
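The parameter choice of Theorem 2.6 and the node count behind (2.23) can be tabulated with a few lines of code. The sketch below is only indicative: it assumes $c = 1$, it assumes $\max_i\lambda_i\sqrt{T} = M_0(\sigma,b)\sqrt{T} = \mathcal{E}^{1/3}$ (i.e. $C_V = 1$), and it rounds $N = 2R/\delta$ up to an integer, none of which is prescribed by the paper.

```python
import math

def safe_grid_parameters(target_error, max_lam_sqrtT, regularity, c=1.0):
    """Return (R, delta, N) following the choices of Theorem 2.6.

    regularity is one of "H1", "H2", "H3"; N = 2R / delta is rounded up.
    """
    R = 2.0 * math.sqrt(math.log(1.0 / target_error))
    power = {"H1": 2.0, "H2": 1.0, "H3": 0.5}[regularity]
    delta = c * max_lam_sqrtT ** power
    n_cells = math.ceil(2.0 * R / delta)
    return R, delta, n_cells

def node_count(n_cells, d):
    """Number of grid nodes (N + 1)^d visited by the multilinear SAFE sum."""
    return (n_cells + 1) ** d

std = 0.2                      # hypothetical value of M0 * sqrt(T)
target = std ** 3              # target error E = [M0 sqrt(T)]^3
table = {}
for reg in ("H1", "H2", "H3"):
    R, delta, n_cells = safe_grid_parameters(target, std, reg)
    table[reg] = (n_cells, [node_count(n_cells, d) for d in (1, 2, 3)])
```

For a fixed target error, the smoother regimes allow a coarser mesh, so the node count grows much more slowly with the dimension under (H3) than under (H1).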

Let us briefly discuss the theoretical efficiency of our algorithm in comparison to a direct Monte-Carlo method. If we perform $M$ simulations of the diffusion $X_T$ via an Euler scheme $X_T^{\Delta t}$ with time step $\Delta t$, the total computational cost is of order $M\times(T/\Delta t)$, whereas the mean square error is (see [6])

\[ \mathrm{Var}[h(X_T^{\Delta t})]\,M^{-1} + \big(\mathbb{E}[h(X_T)] - \mathbb{E}[h(X_T^{\Delta t})]\big)^2. \]

The first term (statistical error) is approximately equal to

\[ \mathrm{Var}[h(X_T)]\,M^{-1} = \mathrm{Var}[h(X_T) - h(x_0)]\,M^{-1} = O\big([C_{\mathrm{Lip},h}\,M_0(\sigma,b)\sqrt{T}]^2\big)\,M^{-1}. \]

The second error term (discretization error) is a bit delicate to analyze under our assumptions: Kebaier shows in [12, Proposition 2.2] that for any $\alpha\in(\frac12,1]$, there is a Lipschitz function $h$ so that the discretization error is $O((\Delta t)^\alpha)$. In [21, 2], the order $\alpha = 1$ is established for smooth $h$ or for uniformly (hypo-)elliptic $\sigma$, but these assumptions are not fulfilled in our setting. To encompass general results, we rather use strong convergence estimates, and we specialize them to our setting:

\[ \big|\mathbb{E}[h(X_T)] - \mathbb{E}[h(X_T^{\Delta t})]\big| \le C_{\mathrm{Lip},h}\,\mathbb{E}|X_T - X_T^{\Delta t}| \le_c C_{\mathrm{Lip},h}\,[M_0(\sigma,b)]^2\sqrt{T}\sqrt{\Delta t}. \]

We now tune the parameters to achieve an $L^2$-error of the same order as the SAFE method when $\mathcal{E}\to0$, i.e. to have a mean square error $\mathcal{E}^2 = [M_0(\sigma,b)\sqrt{T}]^6$: this is achieved by taking $M^{-1}\sim[M_0(\sigma,b)\sqrt{T}]^4 = \mathcal{E}^{4/3}$ and $\Delta t\sim[M_0(\sigma,b)]^2T^2 = \mathcal{E}^{2/3}T$. Therefore, the computational cost is $\mathcal{C}^{MC}_{calculus} = O(\mathcal{E}^{-2})$, independently of the dimension. Thus, in view of (2.23) and neglecting logarithmic factors, the SAFE method with multilinear finite elements is (in the sense of this theoretical comparison) more competitive than Monte-Carlo methods up to the dimension

\[ d = 3 \text{ under (H1)}, \qquad d = 6 \text{ under (H2)}, \qquad d = 12 \text{ under (H3)}. \]

Subsequent experiments are consistent with these rules.
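The arithmetic behind this comparison can be reproduced with a short sketch. It sets all generic constants to 1 and ignores the logarithmic factor of (2.23), so the numbers are orders of magnitude only; the function names are hypothetical.

```python
def monte_carlo_tuning(target_error, T):
    """Number of paths M, Euler step dt and cost M * T / dt matching a target error E."""
    M = target_error ** (-4.0 / 3.0)          # M^{-1} ~ E^{4/3}
    dt = target_error ** (2.0 / 3.0) * T      # dt ~ E^{2/3} T
    cost = M * (T / dt)
    return M, dt, cost

def breakeven_dimension(regularity):
    """Dimension at which the SAFE cost E^{-kappa d} equals the Monte-Carlo cost E^{-2}."""
    kappa = {"H1": 2.0 / 3.0, "H2": 1.0 / 3.0, "H3": 1.0 / 6.0}[regularity]
    return 2.0 / kappa

E, T = 0.008, 1.0                              # illustrative values
M, dt, cost = monte_carlo_tuning(E, T)
thresholds = {reg: breakeven_dimension(reg) for reg in ("H1", "H2", "H3")}
```

With these choices the product of the cost and $\mathcal{E}^2$ stays equal to 1 whatever the horizon $T$, which is the dimension-free $O(\mathcal{E}^{-2})$ rate quoted above.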


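2.4. SAFE with multiquadratic finite elements. For smooth functions at least of class $\mathcal{C}^3(\mathbb{R}^d,\mathbb{R})$, we propose an extension using multiquadratic finite elements [4, Section 3.5], which makes it possible to reach a given accuracy with a lower computational cost. We define two basis functions with center $z\in\mathbb{R}$ and size parameter $\mu > 0$:

\[ \zeta_z^\mu(y) = \frac{y-(z-\mu)}{\mu}\Big(2\,\frac{y-(z-\mu)}{\mu} - 1\Big)\mathbf{1}_{y\in[z-\mu,z[} + \frac{z+\mu-y}{\mu}\Big(2\,\frac{z+\mu-y}{\mu} - 1\Big)\mathbf{1}_{y\in[z,z+\mu]}, \]

\[ \Xi^\mu_z(y) = \frac{y-(z-\mu)}{\mu}\,\frac{z+\mu-y}{\mu}\,\mathbf{1}_{y\in[z-\mu,z+\mu]}. \]

To obtain a multiquadratic interpolation of $h$, for any $i\in\{1,\dots,d\}$ and any $j\in\{0,\dots,N-1\}$ we denote by $y_0^{j+\frac12} = -R + (j+\frac12)\delta$ and $y_i^{j+\frac12} = -R_i + (j+\frac12)\delta_i$ the midpoints of the segments $[y_0^j, y_0^{j+1}]$ and $[y_i^j, y_i^{j+1}]$ respectively. Then we naturally extend $\mathcal{X} = (x^{j_1,\dots,j_d})_{j_1,\dots,j_d\in\{0,\frac12,\dots,N-\frac12,N\}} = \big(\mathcal{A}(y_1^{j_1},\dots,y_d^{j_d})^*\big)_{j_1,\dots,j_d\in\{0,\frac12,\dots,N-\frac12,N\}}$ and the multiquadratic interpolation of $h$ is defined by:

\[ Qh(x) := \sum_{j_1,\dots,j_d\in\{0,\frac12,\dots,N-\frac12,N\}} h(x^{j_1,\dots,j_d}) \prod_{i=1}^{d}\Big[\zeta^{\delta_i}_{y_i^{j_i}}\big((U_V^{-1}(x-m^P_T))^i\big)\,\mathbf{1}_{2j_i\equiv0\,[2]} + \Xi^{\delta_i/2}_{y_i^{j_i}}\big((U_V^{-1}(x-m_T^P))^i\big)\,\mathbf{1}_{2j_i\equiv1\,[2]}\Big], \tag{2.24} \]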

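which is a continuous approximation, piecewise quadratic in all directions on $\tilde{\mathcal{D}}^P$ and vanishing outside the domain $\mathcal{A}\big([-R_1-\delta_1, R_1+\delta_1]\times\dots\times[-R_d-\delta_d, R_d+\delta_d]\big)$. A direct computation (a little tedious but without mathematical difficulty) yields:

\[ \mathbb{E}\big[\zeta^{\delta_i}_{y_i^{j_i}}\big((U_V^{-1}(X_T^P - m^P_T))^i\big)\big] = 4\mathcal{B}^\delta_{j_i}(y_0) - \beta^\delta_{j_i}(y_0), \qquad \mathbb{E}\big[\Xi^{\delta_i/2}_{y_i^{j_i}}\big((U_V^{-1}(X_T^P - m_T^P))^i\big)\big] = -2\mathcal{B}^{\delta/2}_{j_i}(y_0) + 2\beta^{\delta/2}_{j_i}(y_0), \]

where the function $\beta$ is defined in (2.17) and where:

\[ \mathcal{B}^\delta_j(y_0) := \frac{\mathcal{B}(y_0^{j+1}) - \mathcal{B}(y_0^{j-1}) - 2\delta\,\mathcal{B}'(y_0^j)}{\delta^2}, \qquad \mathcal{B}(x) := \int_{-\infty}^{x}\beta(u)\,du = \frac{(x^2+1)\mathcal{N}(x) + x\mathcal{N}'(x)}{2}, \tag{2.25} \]

\[ \beta^{\delta/2}_j(y_0) := \frac{\beta(y_0^{j+1/2}) - 2\beta(y_0^j) + \beta(y_0^{j-1/2})}{\delta/2}, \qquad \mathcal{B}^{\delta/2}_j(y_0) := \frac{\mathcal{B}(y_0^{j+1/2}) - \mathcal{B}(y_0^{j-1/2}) - 2(\delta/2)\,\mathcal{B}'(y_0^j)}{(\delta/2)^2}. \]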
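The closed forms above can be checked numerically. The sketch below evaluates $4\mathcal{B}^\delta - \beta^\delta$ and $-2\mathcal{B}^{\delta/2} + 2\beta^{\delta/2}$ at a single centre $z$ (standing for a node $y_0^j$) and compares them with a midpoint-rule quadrature of $\mathbb{E}[\zeta^\delta_z(W_1^1)]$ and $\mathbb{E}[\Xi^{\delta/2}_z(W_1^1)]$. The helper names, the centre $z = 0.4$ and the mesh $\delta = 0.2$ are assumptions made for the illustration; the code uses $\mathcal{B}' = \beta$.

```python
import math

def N(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def beta(x):
    return x * N(x) + phi(x)

def B(x):
    """Antiderivative of beta given in (2.25)."""
    return 0.5 * ((x * x + 1.0) * N(x) + x * phi(x))

def beta_w(z, mu):
    """Second difference of beta with step mu, divided by mu."""
    return (beta(z + mu) - 2.0 * beta(z) + beta(z - mu)) / mu

def B_w(z, mu):
    """(B(z + mu) - B(z - mu) - 2 mu B'(z)) / mu^2, with B' = beta."""
    return (B(z + mu) - B(z - mu) - 2.0 * mu * beta(z)) / (mu * mu)

def zeta(y, z, mu):
    """Vertex basis function: u (2u - 1), where u is the hat function value."""
    u = max(0.0, 1.0 - abs(y - z) / mu)
    return u * (2.0 * u - 1.0)

def xi(y, z, mu):
    """Midpoint (bubble) basis function, zero outside [z - mu, z + mu]."""
    return max(0.0, (y - (z - mu)) * (z + mu - y)) / (mu * mu)

def gauss_mean(f, z, mu, n=4000):
    """Midpoint-rule value of E[f(W, z, mu)] for W ~ N(0, 1); check only."""
    step = 2.0 * mu / n
    total = 0.0
    for k in range(n):
        y = z - mu + (k + 0.5) * step
        total += f(y, z, mu) * phi(y)
    return total * step

z, delta = 0.4, 0.2
closed_zeta = 4.0 * B_w(z, delta) - beta_w(z, delta)
closed_xi = -2.0 * B_w(z, delta / 2.0) + 2.0 * beta_w(z, delta / 2.0)
quad_zeta = gauss_mean(zeta, z, delta)
quad_xi = gauss_mean(xi, z, delta / 2.0)
```

Both identities follow from writing the two basis functions in terms of the hat function $\Lambda$, namely $\zeta = 2\Lambda^2 - \Lambda$ and $\Xi = 2\Lambda - \Lambda^2$, which is a convenient way to test an implementation.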

We are now in a position to state the following Theorem, whose proof is very similar to those of Theorems 2.4-2.5 and is thus left to the reader.

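Theorem 2.8. (SAFE using multiquadratic finite elements).

Assume $(H_{x_0})$ and suppose that $h\in\mathcal{C}^3(\mathbb{R}^d,\mathbb{R})$ with bounded derivatives. Define

\[
\begin{aligned}
Qh^P(\epsilon) := \sum_{j_1,\dots,j_d\in\{0,\frac12,\dots,N-\frac12,N\}} h(x^{j_1,\dots,j_d}) \prod_{i=1}^{d}\Big[&\Big(4\mathcal{B}^\delta_{j_i}\Big(y_0 - \frac{(U_V^{-1}\epsilon)^i}{\lambda_i\sqrt{T}}\Big) - \beta^\delta_{j_i}\Big(y_0 - \frac{(U_V^{-1}\epsilon)^i}{\lambda_i\sqrt{T}}\Big)\Big)\mathbf{1}_{2j_i\equiv0\,[2]} \\
&+ \Big(-2\mathcal{B}^{\delta/2}_{j_i}\Big(y_0 - \frac{(U_V^{-1}\epsilon)^i}{\lambda_i\sqrt{T}}\Big) + 2\beta^{\delta/2}_{j_i}\Big(y_0 - \frac{(U_V^{-1}\epsilon)^i}{\lambda_i\sqrt{T}}\Big)\Big)\mathbf{1}_{2j_i\equiv1\,[2]}\Big],
\end{aligned}
\]

where the weight functions $\mathcal{B}^\delta_{j_i}$, $\mathcal{B}^{\delta/2}_{j_i}$, $\beta^\delta_{j_i}$ and $\beta^{\delta/2}_{j_i}$ and the grid $y_0$ are defined in (2.25), (2.17) and (2.15). Then

\[ \mathbb{E}[h(X_T)] = Qh^P(0) + \mathrm{Cor}_{2,Qh} + O\big(C_{\mathrm{Lip},h}\,M_1(\sigma,b)\,[M_0(\sigma,b)]^2\,T^{3/2}\big) + \mathrm{Error}^{FEQ,T}_h + \mathrm{Error}^{FEQ,I}_h, \]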