HAL Id: hal-01131685
https://hal.archives-ouvertes.fr/hal-01131685v2
Preprint submitted on 12 Aug 2015
Asymptotic equivalence for pure jump Lévy processes with unknown Lévy density and Gaussian white noise
Ester Mariucci
Laboratoire LJK, Université Joseph Fourier UMR 5224, 51 Rue des Mathématiques, Saint Martin d'Hères, BP 53, 38041 Grenoble Cedex 09
Corresponding author: Ester.Mariucci@imag.fr
Abstract
The aim of this paper is to establish a global asymptotic equivalence between the experiments generated by the discrete (high frequency) or continuous observation of a path of a Lévy process and a Gaussian white noise experiment observed up to a time T, with T tending to ∞. These approximations are given in the sense of the Le Cam distance, under some smoothness conditions on the unknown Lévy density. All the asymptotic equivalences are established by constructing explicit Markov kernels that can be used to reproduce one experiment from the other.
Keywords: Nonparametric experiments, Le Cam distance, asymptotic equivalence, Lévy processes.
2000 MSC: 62B15, (62G20, 60G51).
1. Introduction
Lévy processes are a fundamental tool in modelling situations, like the dynamics of asset prices and weather measurements, where sudden changes in values may happen. For that reason they are widely employed, among many other fields, in mathematical finance. To name a simple example, the price of a commodity at time t is commonly given as an exponential function of a Lévy process. In general, exponential Lévy models are proposed for their ability to take into account several empirical features observed in the returns of assets such as heavy tails, high kurtosis and asymmetry (see [15] for an introduction to financial applications).
From a mathematical point of view, Lévy processes are a natural extension of the Brownian motion which preserves the tractable statistical properties of its increments, while relaxing the continuity of paths. The jump dynamics of a Lévy process is dictated by its Lévy density, say f. If f is continuous, its value at a point x_0 determines how frequently jumps of size close to x_0 occur per unit time. Concretely, if X is a pure jump Lévy process with Lévy density f, then the function f is such that
\[
\int_A f(x)\,dx = \frac{1}{t}\,\mathbb{E}\bigg[\sum_{s\leq t}\mathbb{I}_A(\Delta X_s)\bigg],
\]
for any Borel set A and t > 0. Here, ∆X_s ≡ X_s − X_{s−} denotes the magnitude of the jump of X at time s and 𝕀_A is the indicator function of A. Thus, the Lévy measure
\[
\nu(A) := \int_A f(x)\,dx
\]
is the average number of jumps (per unit time) whose magnitudes fall in the set A. Understanding the jump behavior therefore requires estimating the Lévy measure. Several recent works have treated this problem; see e.g. [2] for an overview.
When the available data consist of the whole trajectory of the process during a time interval [0, T], the problem of estimating f may be reduced to estimating the intensity function of an inhomogeneous Poisson process (see, e.g., [23, 42]). However, a continuous-time sampling is never available in practice and thus the relevant problem is that of estimating f based on discrete sample data X_{t_0}, ..., X_{t_n} during a time interval [0, T_n]. In that case the jumps are latent (unobservable) variables, which clearly adds to the difficulty of the problem. From now on we place ourselves in a high-frequency setting, that is, we assume that the sampling interval ∆_n = t_i − t_{i−1} tends to zero as n goes to infinity. Such a high-frequency based statistical approach has played a central role in the recent literature on nonparametric estimation for Lévy processes (see e.g. [22, 13, 14, 1, 19]). Moreover, in order to make consistent estimation possible, we also ask the observation time T_n to tend to infinity, so as to allow the identification of the jump part in the limit.
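The high-frequency sampling scheme just described can be sketched in a short simulation. This is an illustrative example, not taken from the paper: it assumes the simplest pure jump Lévy process, a compound Poisson process with unit intensity and uniform jumps, observed only on the grid t_i = iT_n/n with ∆_n = T_n/n small; all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_compound_poisson_grid(T, n, intensity, jump_sampler, rng):
    """Simulate a compound Poisson path on [0, T] and return its values at the
    high-frequency grid t_i = i*T/n; the jumps themselves stay unobserved."""
    n_jumps = rng.poisson(intensity * T)            # number of jumps on [0, T]
    jump_times = np.sort(rng.uniform(0.0, T, size=n_jumps))
    jump_sizes = jump_sampler(n_jumps)
    grid = np.linspace(0.0, T, n + 1)               # t_0, ..., t_n
    # X_{t_i} = sum of all jumps that occurred up to time t_i
    cum = np.concatenate([[0.0], np.cumsum(jump_sizes)])
    X = cum[np.searchsorted(jump_times, grid, side="right")]
    return grid, X

# Levy density f = indicator of [0, 1]: unit intensity, Uniform(0, 1) jumps
T_n, n = 100.0, 10_000                              # Delta_n = T_n / n = 0.01
grid, X = sample_compound_poisson_grid(T_n, n, 1.0,
                                       lambda k: rng.uniform(0, 1, k), rng)
increments = np.diff(X)
# in the high-frequency regime most sampling intervals contain no jump at all
print("fraction of empty intervals:", np.mean(increments == 0.0))
```

With ∆_n = 0.01 and unit intensity, a proportion close to e^{−∆_n} ≈ 0.99 of the increments is exactly zero, which illustrates why the jumps are latent in the discrete model.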
Our aim is to prove that, under suitable hypotheses, estimating the Lévy density f is equivalent to estimating the drift of an adequate Gaussian white noise model. In general, asymptotic equivalence results for statistical experiments provide a deeper understanding of statistical problems and allow one to single out their main features. The idea is to pass, via asymptotic equivalence, to another experiment which is easier to analyze. By definition, two sequences of experiments P_{1,n} and P_{2,n}, defined on possibly different sample spaces but with the same parameter set, are asymptotically equivalent if the Le Cam distance ∆(P_{1,n}, P_{2,n}) tends to zero. For P_i = (X_i, A_i, {P_{i,θ} : θ ∈ Θ}), i = 1, 2, ∆(P_1, P_2) is the symmetrization of the deficiency δ(P_1, P_2), where
\[
\delta(\mathscr{P}_1, \mathscr{P}_2) = \inf_K \sup_{\theta \in \Theta} \| K P_{1,\theta} - P_{2,\theta} \|_{TV}.
\]
Here the infimum is taken over all randomizations from (X_1, A_1) to (X_2, A_2) and ‖·‖_{TV} denotes the total variation distance. Roughly speaking, the Le Cam distance quantifies how much one fails to reconstruct (with the help of a randomization) one model from the other, and vice versa. Thus, ∆(P_1, P_2) = 0 can be interpreted as "the models P_1 and P_2 contain the same amount of information about the parameter θ." The general definition of randomization is quite involved but, in the most frequent examples (namely, when the sample spaces are Polish and the experiments dominated), it reduces to that of a Markov kernel. One of the most important features of the Le Cam distance is that it can also be interpreted in terms of statistical decision theory (see [32, 33]; a short review is presented in the Appendix).
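As a toy illustration of these definitions (not taken from the paper), consider observing a pair of i.i.d. Bernoulli(θ) draws versus observing their sum, a Binomial(2, θ): the sum is a sufficient statistic, the uniform "splitting" kernel reconstructs the law of the pair exactly, so both deficiencies vanish and the Le Cam distance between the two experiments is zero. A minimal sketch, with all names hypothetical:

```python
import itertools

# Experiment P1: observe (B1, B2), i.i.d. Bernoulli(theta).
# Experiment P2: observe S = B1 + B2 ~ Binomial(2, theta).
# The map (b1, b2) -> s is a deterministic Markov kernel P1 -> P2; the kernel
# K(s, .) = uniform law on {(b1, b2) : b1 + b2 = s} goes the other way and
# transports P2 exactly onto P1, so delta = 0 in both directions.

def p1(theta):  # law of the pair (B1, B2)
    return {(b1, b2): theta ** (b1 + b2) * (1 - theta) ** (2 - b1 - b2)
            for b1, b2 in itertools.product([0, 1], repeat=2)}

def p2(theta):  # law of the sum S
    return {s: [1, 2, 1][s] * theta ** s * (1 - theta) ** (2 - s)
            for s in range(3)}

K = {s: {pair: 1.0 / [1, 2, 1][s]
         for pair in itertools.product([0, 1], repeat=2) if sum(pair) == s}
     for s in range(3)}

def kernel_image(K, law):  # (K law)(.) = sum_s law(s) K(s, .)
    out = {}
    for s, ps in law.items():
        for x, kx in K[s].items():
            out[x] = out.get(x, 0.0) + ps * kx
    return out

for theta in (0.1, 0.5, 0.9):
    img = kernel_image(K, p2(theta))
    tv = 0.5 * sum(abs(img.get(x, 0.0) - px) for x, px in p1(theta).items())
    print(f"theta={theta}: TV(K P2, P1) = {tv:.2e}")
```

The total variation distance is zero (up to float rounding) for every θ, which is the finite-sample analogue of ∆(P_1, P_2) = 0.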
As a consequence, saying that two statistical models are equivalent means that any statistical inference procedure can be transferred from one model to the other in such a way that the asymptotic risk remains the same, at least for bounded loss functions. Also, as soon as two models P_{1,n} and P_{2,n} that share the same parameter space Θ are proved to be asymptotically equivalent, the same result automatically holds for the restrictions of both P_{1,n} and P_{2,n} to a smaller subclass of Θ.
Historically, the first results of asymptotic equivalence in a nonparametric context date from 1996 and are due to [5] and [39]. The first two authors showed the asymptotic equivalence of nonparametric regression and a Gaussian white noise model, while the third showed that of density estimation and white noise. Over the years many generalizations of these results have been proposed, such as [3, 28, 43, 11, 10, 40, 12, 37, 46] for nonparametric regression or [9, 31, 4] for nonparametric density estimation models. Another very active field of study is that of diffusion experiments. The first result of equivalence between diffusion models and the Euler scheme was established in 1998, see [38]. In later papers generalizations of this result have been considered (see [24, 35]). Among others, we can also cite equivalence results for generalized linear models [27], time series [29, 38], diffusion models [18, 25, 16, 17], GARCH models [7], functional linear regression [36], spectral density estimation [26] and volatility estimation [41]. Negative results are somewhat harder to come by; the most notable among them are [20, 6, 48]. There is, however, a lack of equivalence results concerning processes with jumps. A first result in this sense is [34], in which global asymptotic equivalences between the experiments generated by the discrete or continuous observation of a path of a Lévy process and a Gaussian white noise experiment are established. More precisely, in that paper we have shown that estimating the drift function h from a continuously or discretely (high frequency) observed time inhomogeneous jump-diffusion process:
\[
X_t = \int_0^t h(s)\,ds + \int_0^t \sigma(s)\,dW_s + \sum_{i=1}^{N_t} Y_i, \quad t \in [0, T_n], \qquad (1)
\]
is asymptotically equivalent to estimating h in the Gaussian model:
\[
dy_t = h(t)\,dt + \sigma(t)\,dW_t, \quad t \in [0, T_n].
\]
Here we try to push the analysis further and we focus on the case in which the considered parameter is the Lévy density and X = (X_t) is a pure jump Lévy process (see [8] for the interest of such a class of processes when modelling asset returns). More in detail, we consider the problem of estimating the Lévy density (with respect to a fixed, possibly infinite, Lévy measure ν_0 concentrated on I ⊆ ℝ) f := dν/dν_0 : I → ℝ from a continuously or discretely observed pure jump Lévy process X with possibly infinite Lévy measure. Here I ⊆ ℝ denotes a possibly infinite interval and ν_0 is supposed to be absolutely continuous with respect to the Lebesgue measure with a strictly positive density g := dν_0/dLeb. In the case where ν is of finite variation one may write:
\[
X_t = \sum_{0 < s \leq t} \Delta X_s \qquad (2)
\]
or, equivalently, X has a characteristic function given by:
\[
\mathbb{E}\big[e^{iuX_t}\big] = \exp\bigg( -t \int_I \big(1 - e^{iuy}\big)\,\nu(dy) \bigg).
\]
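The characteristic function formula above can be checked by simulation in the compound Poisson case. A hedged sketch (illustrative, not from the paper), assuming ν = Lebesgue on [0, 1], i.e. f ≡ 1, for which ∫_0^1 e^{iuy} dy = (e^{iu} − 1)/(iu):

```python
import numpy as np

rng = np.random.default_rng(1)
t, u, n_mc = 2.0, 1.5, 50_000

# nu = Lebesgue on [0, 1]: compound Poisson with unit intensity and
# Uniform(0, 1) jumps.  Monte Carlo estimate of E[exp(i u X_t)].
n_jumps = rng.poisson(t, size=n_mc)
X_t = np.array([rng.uniform(0, 1, k).sum() for k in n_jumps])
empirical = np.exp(1j * u * X_t).mean()

# Closed form: exp(-t * integral_0^1 (1 - e^{iuy}) dy).
integral = 1.0 - (np.exp(1j * u) - 1.0) / (1j * u)
theoretical = np.exp(-t * integral)
print("|empirical - theoretical| =", abs(empirical - theoretical))
```

The discrepancy is of the Monte Carlo order 1/√n_mc, which numerically confirms the Lévy–Khintchine formula in this simple finite-activity case.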
We suppose that the function f belongs to some a priori set F, nonparametric in general. The discrete observations are of the form X_{t_i}, where t_i = iT_n/n, i = 0, ..., n, with T_n = n∆_n → ∞ and ∆_n → 0 as n goes to infinity.
We will denote by P_n^{ν_0} the statistical model associated with the continuous observation of a trajectory of X until time T_n (which is supposed to go to infinity as n goes to infinity) and by Q_n^{ν_0} the one associated with the observation of the discrete data (X_{t_i})_{i=0}^n. The aim of this paper is to prove that, under adequate hypotheses on F (for example, f must be bounded away from zero and infinity; see Section 2.1 for a complete definition), the models P_n^{ν_0} and Q_n^{ν_0} are both asymptotically equivalent to a sequence of Gaussian white noise models of the form:
\[
dy_t = \sqrt{f(t)}\,dt + \frac{dW_t}{2\sqrt{T_n\,g(t)}}, \quad t \in I.
\]
As a corollary, we then get the asymptotic equivalence between P_n^{ν_0} and Q_n^{ν_0}. The main results are precisely stated as Theorems 2.5 and 2.6. A particular case of special interest arises when X is a compound Poisson process, ν_0 ≡ Leb([0,1]) and F ⊆ F^I_{(γ,K,κ,M)} where, for fixed γ ∈ (0, 1] and strictly positive constants K, κ, M, F^I_{(γ,K,κ,M)} is a class of continuously differentiable functions on I defined as follows:
\[
\mathscr{F}^I_{(\gamma,K,\kappa,M)} = \Big\{ f : \kappa \leq f(x) \leq M,\ |f'(x) - f'(y)| \leq K|x-y|^{\gamma},\ \forall x, y \in I \Big\}. \qquad (3)
\]
In this case, the statistical models P_n^{ν_0} and Q_n^{ν_0} are both equivalent to the Gaussian white noise model:
\[
dy_t = \sqrt{f(t)}\,dt + \frac{1}{2\sqrt{T_n}}\,dW_t, \quad t \in [0, 1].
\]
See Example 3.1 for more details. By a theorem of Brown and Low in [5], we obtain, a posteriori, an asymptotic equivalence with the regression model
\[
Y_i = \sqrt{f\Big(\frac{i}{T_n}\Big)} + \frac{1}{2\sqrt{T_n}}\,\xi_i, \quad \xi_i \sim \mathscr{N}(0,1), \quad i = 1, \dots, [T_n].
\]
Note that a similar form of a Gaussian shift was found to be asymptotically equivalent to a nonparametric density estimation experiment, see [39]. Let us mention that we also treat some explicit examples where ν_0 is neither finite nor compactly supported (see Examples 3.2 and 3.3).
Without entering into any detail, we remark here that the methods are very different from those in [34]. In particular, since f pertains to the discontinuous part of a Lévy process rather than to its continuous part, the Girsanov-type changes of measure are irrelevant here. We thus need new instruments, such as the Esscher changes of measure.
Our proof is based on the construction, for any given Lévy measure ν, of two adequate approximations ν̂_m and ν̄_m of ν; the idea of discretizing the Lévy density already appeared in an earlier work with P. Étoré and S. Louhichi, [21]. The present work is also inspired by the papers [9] (for a multinomial approximation), [4] (for passing from independent Poisson variables to independent normal random variables) and [34] (for a Bernoulli approximation). This method allows us to construct explicit Markov kernels that lead from one model to the other; these may be applied in practice to transfer minimax estimators.
The paper is organized as follows: Sections 2.1 and 2.2 are devoted to making the parameter space and the considered statistical experiments precise. The main results are given in Section 2.3, followed by Section 3 in which some examples can be found. The proofs are postponed to Section 4. The paper includes an Appendix recalling the definition and some useful properties of the Le Cam distance as well as of Lévy processes.
2. Assumptions and main results

2.1. The parameter space
Consider a (possibly infinite) Lévy measure ν_0 concentrated on a possibly infinite interval I ⊆ ℝ, admitting a density g > 0 with respect to the Lebesgue measure. The parameter space of the experiments we are concerned with is a class of functions F = F^{ν_0,I} defined on I that form a class of Lévy densities with respect to ν_0: for each f ∈ F, let ν (resp. ν̂_m) be the Lévy measure having f (resp. f̂_m) as a density with respect to ν_0, where, for every f ∈ F, f̂_m(x) is defined as follows.
Suppose first x > 0. Given a positive integer m = m_n depending on n, let J_j := (v_{j−1}, v_j], where v_1 = ε_m ≥ 0 and the v_j are chosen in such a way that
\[
\mu_m := \nu_0(J_j) = \frac{\nu_0\big( (I \setminus [0, \varepsilon_m]) \cap \mathbb{R}_+ \big)}{m - 1}, \quad \forall j = 2, \dots, m. \qquad (4)
\]
In the sequel, for the sake of brevity, we will only write m without making the dependence on n explicit. Define x*_j := (∫_{J_j} x ν_0(dx))/μ_m and introduce a sequence of functions 0 ≤ V_j ≤ 1/μ_m, j = 2, ..., m, supported on [x*_{j−1}, x*_{j+1}] if j = 3, ..., m−1, on [ε_m, x*_3] if j = 2 and on (I \ [0, x*_{m−1}]) ∩ ℝ_+ if j = m. The V_j's are defined recursively in the following way.
• V_2 is equal to 1/μ_m on the interval (ε_m, x*_2]; on the interval (x*_2, x*_3] it is chosen so that it is continuous (in particular, V_2(x*_2) = 1/μ_m), ∫_{x*_2}^{x*_3} V_2(y) ν_0(dy) = ν_0((x*_2, v_2])/μ_m and V_2(x*_3) = 0.
• For j = 3, ..., m−1 define V_j as the function 1/μ_m − V_{j−1} on the interval [x*_{j−1}, x*_j]. On [x*_j, x*_{j+1}] choose V_j continuous and such that ∫_{x*_j}^{x*_{j+1}} V_j(y) ν_0(dy) = ν_0((x*_j, v_j])/μ_m and V_j(x*_{j+1}) = 0.
• Finally, let V_m be the function supported on (I \ [0, x*_{m−1}]) ∩ ℝ_+ such that
\[
V_m(x) = \frac{1}{\mu_m} - V_{m-1}(x) \ \text{ for } x \in [x^*_{m-1}, x^*_m], \qquad V_m(x) = \frac{1}{\mu_m} \ \text{ for } x \in (I \setminus [0, x^*_m]) \cap \mathbb{R}_+.
\]
(It is immediate to check that such a choice is always possible.) Observe that, by construction,
\[
\sum_{j=2}^m V_j(x)\,\mu_m = 1, \quad \forall x \in (I \setminus [0, \varepsilon_m]) \cap \mathbb{R}_+ \qquad \text{and} \qquad \int_{(I \setminus [0, \varepsilon_m]) \cap \mathbb{R}_+} V_j(y)\,\nu_0(dy) = 1.
\]
Analogously, define μ_{−m} = ν_0((I \ [−ε_m, 0]) ∩ ℝ_−)/(m−1) and J_{−m}, ..., J_{−2} such that ν_0(J_{−j}) = μ_{−m} for all j. Then, for x < 0, x*_{−j} is defined as x*_j, by using J_{−j} and μ_{−m} instead of J_j and μ_m, and the V_{−j}'s are defined with the same procedure as the V_j's, starting from V_{−2} and proceeding by induction.
Define
\[
\hat{f}_m(x) = \mathbb{I}_{[-\varepsilon_m, \varepsilon_m]}(x) + \sum_{j=2}^m \bigg( V_j(x) \int_{J_j} f(y)\,\nu_0(dy) + V_{-j}(x) \int_{J_{-j}} f(y)\,\nu_0(dy) \bigg). \qquad (5)
\]
The definitions of the V_j's above are modeled on the following example:
Example 2.1. Let ν_0 be the Lebesgue measure on [0, 1] and ε_m = 0. Then v_j = (j−1)/(m−1) and x*_j = (2j−3)/(2m−2), j = 2, ..., m. The standard choice for V_j (based on the construction by [9]) is given by the piecewise linear functions interpolating the values in the points x*_j specified above:

[Figure: the piecewise linear functions V_2, V_j and V_m on [0, 1], each with maximum value m − 1: V_2 is constant on [0, x*_2] and decreases to 0 at x*_3; V_j is the triangle with peak at x*_j supported on [x*_{j−1}, x*_{j+1}]; V_m increases from 0 at x*_{m−1} to m − 1 at x*_m and stays constant up to 1.]

Remark 2.2. The function f̂_m has been defined in such a way that the rate of convergence of the L_2 norm between the restrictions of f and f̂_m to I \ [−ε_m, ε_m] is compatible with the rate of convergence of the other quantities appearing in the statements of Theorems 2.5 and 2.6. For that reason, as in [9], we have not chosen a piecewise constant approximation of f but an approximation that is, at least in the simplest cases, piecewise linear. Such a choice allows us to gain an order of magnitude on the convergence rate of ‖f − f̂_m‖_{L_2(ν_0|_{I\[−ε_m,ε_m]})}, at least when F is a class of sufficiently smooth functions.
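The two structural identities of the construction (the V_j's form a partition of unity after multiplication by μ_m, and each V_j has unit ν_0-integral) can be checked numerically in the setting of Example 2.1. A minimal sketch, assuming ν_0 = Leb on [0, 1], ε_m = 0 and the standard triangular choice; helper names are ours:

```python
import numpy as np

m = 20                                    # number of nodes; nu_0 = Leb[0,1], eps_m = 0
mu = 1.0 / (m - 1)                        # mu_m = nu_0(J_j) = 1/(m-1)
xs = lambda j: (2 * j - 3) / (2 * m - 2)  # x*_j, j = 2, ..., m

def V(j, x):
    """Piecewise linear V_j of Example 2.1 (sup V_j = 1/mu_m = m - 1)."""
    if j == 2:                            # flat on [0, x*_2], down to 0 at x*_3
        nodes, vals = [0.0, xs(2), xs(3)], [m - 1.0, m - 1.0, 0.0]
    elif j == m:                          # 0 at x*_{m-1}, flat on [x*_m, 1]
        nodes, vals = [xs(m - 1), xs(m), 1.0], [0.0, m - 1.0, m - 1.0]
    else:                                 # triangle with peak at x*_j
        nodes, vals = [xs(j - 1), xs(j), xs(j + 1)], [0.0, m - 1.0, 0.0]
    return np.interp(x, nodes, vals, left=0.0, right=0.0)

x = np.linspace(0.0, 1.0, 2001)
partition = sum(V(j, x) * mu for j in range(2, m + 1))
print("max |sum_j V_j(x) mu_m - 1| =", np.abs(partition - 1.0).max())

grid = np.linspace(0.0, 1.0, 100_001)     # trapezoidal check of int_0^1 V_j dx = 1
integrals = []
for j in (2, m // 2, m):
    y = V(j, grid)
    integrals.append(float(((y[1:] + y[:-1]) / 2 * np.diff(grid)).sum()))
print("integrals of V_2, V_{m/2}, V_m:", integrals)
```

Both checks hold up to floating-point and quadrature error, matching the displayed identities below (4)–(5).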
We now explain the assumptions we will need to make on the parameter f ∈ F = F^{ν_0,I}. The superscripts ν_0 and I will be suppressed whenever this can lead to no confusion. We require that:

(H1) There exist constants κ, M > 0 such that κ ≤ f(y) ≤ M, for all y ∈ I and all f ∈ F.
For every integer m = m_n, we can consider \(\widehat{\sqrt{f}}_m\), the approximation of √f constructed as f̂_m above, i.e.
\[
\widehat{\sqrt{f}}_m(x) = \mathbb{I}_{[-\varepsilon_m, \varepsilon_m]}(x) + \sum_{\substack{j = -m, \dots, m \\ j \neq -1, 0, 1}} V_j(x) \int_{J_j} \sqrt{f(y)}\,\nu_0(dy),
\]
and introduce the quantities:
\[
A_m^2(f) := \int_{I \setminus [-\varepsilon_m, \varepsilon_m]} \Big( \widehat{\sqrt{f}}_m(y) - \sqrt{f(y)} \Big)^2 \nu_0(dy), \qquad
B_m^2(f) := \sum_{\substack{j = -m, \dots, m \\ j \neq -1, 0, 1}} \bigg( \int_{J_j} \frac{\sqrt{f(y)}}{\sqrt{\nu_0(J_j)}}\,\nu_0(dy) - \sqrt{\nu(J_j)} \bigg)^2,
\]
\[
C_m^2(f) := \int_{-\varepsilon_m}^{\varepsilon_m} \Big( \sqrt{f(t)} - 1 \Big)^2 \nu_0(dt).
\]
The conditions defining the parameter space F are expressed by asking that the quantities introduced above converge quickly enough to zero. To state the assumptions of Theorem 2.5 precisely, we will assume the existence of sequences of discretizations m = m_n → ∞, of positive numbers ε_m = ε_{m_n} → 0 and of functions V_j, j = ±2, ..., ±m, such that:
\[
\textbf{(C1)} \quad \lim_{n \to \infty} n\Delta_n \sup_{f \in \mathscr{F}} \int_{I \setminus (-\varepsilon_m, \varepsilon_m)} \big( f(x) - \hat{f}_m(x) \big)^2 \nu_0(dx) = 0.
\]
\[
\textbf{(C2)} \quad \lim_{n \to \infty} n\Delta_n \sup_{f \in \mathscr{F}} \big( A_m^2(f) + B_m^2(f) + C_m^2(f) \big) = 0.
\]
Remark in particular that Condition (C2) implies the following:
\[
\textbf{(H2)} \quad \sup_{f \in \mathscr{F}} \int_I \big( \sqrt{f(y)} - 1 \big)^2 \nu_0(dy) \leq L,
\]
where L = sup_{f∈F} ∫_{−ε_m}^{ε_m} (√f(x) − 1)² ν_0(dx) + (√M + 1)² ν_0(I \ (−ε_m, ε_m)), for any choice of m such that the quantity in the limit appearing in Condition (C2) is finite.
Theorem 2.6 has slightly stronger hypotheses, defining possibly smaller parameter spaces: we will assume the existence of sequences m_n, ε_m and V_j, j = ±2, ..., ±m (possibly different from the ones above), such that Condition (C1) is verified and the following stronger version of Condition (C2) holds:
\[
\textbf{(C2')} \quad \lim_{n \to \infty} n\Delta_n \sup_{f \in \mathscr{F}} \big( A_m^2(f) + B_m^2(f) + n\,C_m^2(f) \big) = 0.
\]
Finally, some of our results have a more explicit statement under the hypothesis of finite variation, which we state as:
\[
\textbf{(FV)} \quad \int_I (|x| \wedge 1)\,\nu_0(dx) < \infty.
\]
Remark 2.3. Condition (C1) and those involving the quantities A_m(f) and B_m(f) all concern similar but slightly different approximations of f. In concrete examples, they may all be expected to have the same rate of convergence, but to keep the greatest generality we preferred to state them separately. On the other hand, conditions on the quantity C_m(f) are purely local around zero, requiring the parameters f to converge quickly enough to 1.
Examples 2.4. To get a grasp on Conditions (C1), (C2) we analyze here three different examples according to the different behavior of ν_0 near 0 ∈ I. In all of these cases the parameter space F^{ν_0,I} will be a subclass of F^I_{(γ,K,κ,M)} defined as in (3). Recall that the conditions (C1), (C2) and (C2') depend on the choice of the sequences m_n, ε_m and of the functions V_j. For the first two of the three examples, where I = [0, 1], we will make the standard choice for V_j of triangular and trapezoidal functions, similarly to those in Example 2.1. Namely, for j = 3, ..., m−1 we have
\[
V_j(x) = \mathbb{I}_{(x^*_{j-1}, x^*_j]}(x)\,\frac{x - x^*_{j-1}}{x^*_j - x^*_{j-1}}\,\frac{1}{\mu_m} + \mathbb{I}_{(x^*_j, x^*_{j+1}]}(x)\,\frac{x^*_{j+1} - x}{x^*_{j+1} - x^*_j}\,\frac{1}{\mu_m}; \qquad (6)
\]
the two extremal functions V_2 and V_m are chosen so that V_2 ≡ 1/μ_m on (ε_m, x*_2] and V_m ≡ 1/μ_m on (x*_m, 1]. In the second example, where ν_0 is infinite, one is forced to take ε_m > 0 and to keep in mind that the x*_j are not uniformly distributed on [ε_m, 1]. Proofs of all the statements here can be found in Section 5.2.
1. The finite case: ν_0 ≡ Leb([0, 1]).
In this case we are free to choose F^{Leb,[0,1]} = F^{[0,1]}_{(γ,K,κ,M)}. Indeed, as ν_0 is finite, there is no need to single out the first interval J_1 = [0, ε_m], so that C_m(f) does not enter in the proofs and the definitions of A_m(f) and B_m(f) involve integrals on the whole of [0, 1]. Also, the choice of the V_j's as in (6) guarantees that ∫_0^1 V_j(x) dx = 1. Then, the quantities ‖f − f̂_m‖_{L_2([0,1])}, A_m(f) and B_m(f) all have the same rate of convergence, which is given by:
\[
\sqrt{\int_0^1 \big( f(x) - \hat{f}_m(x) \big)^2 \nu_0(dx)} + A_m(f) + B_m(f) = O\big( m^{-\gamma-1} + m^{-\frac{3}{2}} \big),
\]
uniformly in f. See Section 5.2 for a proof.
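The displayed rate can be probed numerically for a smooth f. A sketch under stated assumptions (f(x) = 1 + 0.5 sin(2πx), chosen for illustration, so γ = 1 and the edge term m^{−3/2} dominates; helper names are ours) that builds f̂_m from (5) with the triangular V_j of (6) and reports the L_2 error as m doubles:

```python
import numpy as np

f = lambda x: 1.0 + 0.5 * np.sin(2 * np.pi * x)   # a smooth Levy density, gamma = 1

def l2_error(m, n_grid=40_000):
    """L2 distance between f and the f-hat_m of (5) built from the triangular
    V_j of Example 2.1 (nu_0 = Leb[0,1], eps_m = 0)."""
    grid = np.linspace(0.0, 1.0, n_grid + 1)
    fg = f(grid)
    # cumulative integral of f, to get the bin integrals int_{J_j} f(y) dy
    F = np.concatenate([[0.0], np.cumsum((fg[1:] + fg[:-1]) / 2 * np.diff(grid))])
    v = np.linspace(0.0, 1.0, m)                  # endpoints v_1 = 0, ..., v_m = 1
    bin_int = np.interp(v[1:], grid, F) - np.interp(v[:-1], grid, F)
    xstar = (2 * np.arange(2, m + 1) - 3) / (2 * m - 2)
    # with triangular V_j, f-hat_m interpolates the values (m-1)*int_{J_j} f
    # linearly at the x*_j and is constant on the two edge pieces
    fhat = np.interp(grid, xstar, (m - 1) * bin_int)
    d2 = (fg - fhat) ** 2
    return float(np.sqrt(((d2[1:] + d2[:-1]) / 2 * np.diff(grid)).sum()))

errs = {m: l2_error(m) for m in (10, 20, 40, 80)}
for m, e in errs.items():
    print(m, e)
```

For this f the error should shrink roughly like m^{−3/2} (a factor of about 2.8 per doubling of m), consistent with the bound above; the exact constants depend on f and are not asserted here.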
2. The finite variation case: dν_0/dLeb(x) = x^{−1} 𝕀_{[0,1]}(x).
In this case, the parameter space F^{ν_0,[0,1]} is a proper subset of F^{[0,1]}_{(γ,K,κ,M)}. Indeed, as we are obliged to choose ε_m > 0, we also need to impose that C_m(f) = o(1/√(n∆_n)), with constants uniform with respect to f, that is, that all f ∈ F converge to 1 quickly enough as x → 0. Choosing ε_m = m^{−1−α}, α > 0, we have that μ_m = ln(ε_m^{−1})/(m−1), v_j = ε_m^{(m−j)/(m−1)} and x*_j = (v_j − v_{j−1})/μ_m. In particular, max_j |v_{j−1} − v_j| = |v_m − v_{m−1}| = O(ln m / m). Also in this case one can prove that the standard choice of V_j described above leads to ∫_{ε_m}^1 V_j(x) dx/x = 1. Again, the quantities ‖f − f̂_m‖_{L_2(ν_0|_{I\[0,ε_m]})}, A_m(f) and B_m(f) have the same rate of convergence, given by:
\[
\sqrt{\int_{\varepsilon_m}^1 \big( f(x) - \hat{f}_m(x) \big)^2 \nu_0(dx)} + A_m(f) + B_m(f) = O\bigg( \Big( \frac{\ln m}{m} \Big)^{\gamma+1} \sqrt{\ln(\varepsilon_m^{-1})} \bigg), \qquad (7)
\]
uniformly in f. The condition on C_m(f) depends on the behavior of f near 0. For example, it is ensured if one considers a parametric family of the form f(x) = e^{−λx} with a bounded λ > 0. See Section 5.2 for a proof.
3. The infinite variation, non-compactly supported case: dν_0/dLeb(x) = x^{−2} 𝕀_{ℝ_+}(x).
This example involves significantly more computations than the preceding ones, since the classical triangular choice for the functions V_j would not have integral equal to 1 (with respect to ν_0), and the support is not compact. The parameter space F^{ν_0,[0,∞)} can still be chosen as a proper subclass of F^{[0,∞)}_{(γ,K,κ,M)}, again by imposing that C_m(f) converges to zero quickly enough (more details about this condition are discussed in Example 3.3). We divide the interval [0, ∞) in m intervals J_j = [v_{j−1}, v_j) with:
\[
v_0 = 0; \quad v_1 = \varepsilon_m; \quad v_j = \frac{\varepsilon_m (m-1)}{m-j}; \quad v_m = \infty; \quad \mu_m = \frac{1}{\varepsilon_m (m-1)}.
\]
To deal with the non-compactness problem, we choose some "horizon" H(m) that goes to infinity slowly enough as m goes to infinity and we bound the L_2 distance between f and f̂_m for x > H(m) by 2 sup_{x≥H(m)} f(x)² / H(m). We have:
\[
\|f - \hat{f}_m\|^2_{L_2(\nu_0|_{I \setminus [0,\varepsilon_m]})} + A_m^2(f) + B_m^2(f) = O\bigg( \frac{H(m)^{3+4\gamma}}{(\varepsilon_m m)^{2+2\gamma}} + \frac{\sup_{x \geq H(m)} f(x)^2}{H(m)} \bigg).
\]
In the general case where the best estimate for sup_{x≥H(m)} f(x)² is simply given by M², an optimal choice for H(m) is √(ε_m m), which gives a rate of convergence:
\[
\|f - \hat{f}_m\|^2_{L_2(\nu_0|_{I \setminus [0,\varepsilon_m]})} + A_m^2(f) + B_m^2(f) = O\bigg( \frac{1}{\sqrt{\varepsilon_m m}} \bigg),
\]
independently of γ. See Section 5.2 for a proof.
2.2. Definition of the experiments
Let (x_t)_{t≥0} be the canonical process on the Skorokhod space (D, 𝒟) and denote by P^{(b,0,ν)} the law induced on (D, 𝒟) by a Lévy process with characteristic triplet (b, 0, ν). We will write P_t^{(b,0,ν)} for the restriction of P^{(b,0,ν)} to the σ-algebra 𝒟_t generated by {x_s : 0 ≤ s ≤ t} (see Appendix A.2 for the precise definitions). Let Q_t^{(b,0,ν)} be the marginal law at time t of a Lévy process with characteristic triplet (b, 0, ν). In the case where ∫_{|y|≤1} |y| ν(dy) < ∞ we introduce the notation γ^ν := ∫_{|y|≤1} y ν(dy); then, Condition (H2) guarantees the finiteness of γ^{ν−ν_0} (see Remark 33.3 in [44] for more details).
Recall that we introduced the discretization t_i = iT_n/n of [0, T_n] and denote by Q_n^{(γ^{ν−ν_0},0,ν)} the laws of the n + 1 marginals of (x_t)_{t≥0} at the times t_i, i = 0, ..., n. We will consider the following statistical models, depending on a fixed, possibly infinite, Lévy measure ν_0 concentrated on I (clearly, the models with the subscript FV are meaningful only under the assumption (FV)):
\[
\mathscr{P}^{\nu_0}_{n,FV} = \Big( D, \mathscr{D}_{T_n}, \big\{ P^{(\gamma^{\nu},0,\nu)}_{T_n} : f := \tfrac{d\nu}{d\nu_0} \in \mathscr{F}^{\nu_0,I} \big\} \Big), \qquad
\mathscr{Q}^{\nu_0}_{n,FV} = \Big( \mathbb{R}^{n+1}, \mathscr{B}(\mathbb{R}^{n+1}), \big\{ Q_n^{(\gamma^{\nu},0,\nu)} : f := \tfrac{d\nu}{d\nu_0} \in \mathscr{F}^{\nu_0,I} \big\} \Big),
\]
\[
\mathscr{P}^{\nu_0}_n = \Big( D, \mathscr{D}_{T_n}, \big\{ P^{(\gamma^{\nu-\nu_0},0,\nu)}_{T_n} : f := \tfrac{d\nu}{d\nu_0} \in \mathscr{F}^{\nu_0,I} \big\} \Big), \qquad
\mathscr{Q}^{\nu_0}_n = \Big( \mathbb{R}^{n+1}, \mathscr{B}(\mathbb{R}^{n+1}), \big\{ Q_n^{(\gamma^{\nu-\nu_0},0,\nu)} : f := \tfrac{d\nu}{d\nu_0} \in \mathscr{F}^{\nu_0,I} \big\} \Big).
\]
Finally, let us introduce the Gaussian white noise model that will appear in the statement of our main results. For that, let us denote by (C(I), 𝒞) the space of continuous mappings from I into ℝ endowed with its standard filtration, and by g the density of ν_0 with respect to the Lebesgue measure. We will require g > 0 and let W_n^f be the law induced on (C(I), 𝒞) by the stochastic process satisfying:
\[
dy_t = \sqrt{f(t)}\,dt + \frac{dW_t}{2\sqrt{T_n\,g(t)}}, \quad t \in I, \qquad (8)
\]
where (W_t)_{t∈ℝ} denotes a Brownian motion on ℝ with W_0 = 0. Then we set:
\[
\mathscr{W}^{\nu_0}_n = \big( C(I), \mathscr{C}, \{ W_n^f : f \in \mathscr{F}^{\nu_0,I} \} \big).
\]
Observe that when ν_0 is a finite Lévy measure, W_n^{ν_0} is equivalent to the statistical model associated with the continuous observation of a process (ỹ_t)_{t∈I} defined by:
\[
d\tilde{y}_t = \sqrt{f(t)\,g(t)}\,dt + \frac{dW_t}{2\sqrt{T_n}}, \quad t \in I.
\]
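The limit experiment (8) is straightforward to simulate by Euler discretization; the following sketch is illustrative only, with ν_0 = Leb on I = [0, 1] (so g ≡ 1) and an arbitrary f chosen by us for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_white_noise(f, g, T_n, n_steps=10_000):
    """Euler scheme for dy_t = sqrt(f(t)) dt + dW_t / (2 sqrt(T_n g(t))) on [0,1]."""
    t = np.linspace(0.0, 1.0, n_steps + 1)
    dt = np.diff(t)
    dW = rng.normal(0.0, np.sqrt(dt))                 # Brownian increments
    dy = np.sqrt(f(t[:-1])) * dt + dW / (2.0 * np.sqrt(T_n * g(t[:-1])))
    return t, np.concatenate([[0.0], np.cumsum(dy)])

f = lambda t: 1.0 + 0.5 * np.sin(2 * np.pi * t)
g = lambda t: np.ones_like(t)
t, y = simulate_white_noise(f, g, T_n=1000.0)
# as T_n grows, y_1 concentrates around int_0^1 sqrt(f(t)) dt
print("y_1 =", y[-1])
```

The noise level 1/(2√(T_n g(t))) shrinks as T_n → ∞, so the signal √f becomes identifiable in the limit, mirroring the role of the growing observation horizon in the Lévy models.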
2.3. Main results
Using the notation introduced in Section 2.1, we now state our main results. For brevity of notation, we will denote by H(f, f̂_m) (resp. L_2(f, f̂_m)) the Hellinger distance (resp. the L_2 distance) between the Lévy measures ν and ν̂_m restricted to I \ [−ε_m, ε_m], i.e.:
\[
H^2(f, \hat{f}_m) := \int_{I \setminus [-\varepsilon_m, \varepsilon_m]} \Big( \sqrt{f(x)} - \sqrt{\hat{f}_m(x)} \Big)^2 \nu_0(dx), \qquad
L_2(f, \hat{f}_m)^2 := \int_{I \setminus [-\varepsilon_m, \varepsilon_m]} \big( f(y) - \hat{f}_m(y) \big)^2 \nu_0(dy).
\]
Observe that Condition (H1) implies (see Lemma 5.1)
\[
\frac{1}{4M}\,L_2(f, \hat{f}_m)^2 \leq H^2(f, \hat{f}_m) \leq \frac{1}{4\kappa}\,L_2(f, \hat{f}_m)^2.
\]
Theorem 2.5. Let ν_0 be a known Lévy measure concentrated on a (possibly infinite) interval I ⊆ ℝ and having strictly positive density with respect to the Lebesgue measure. Let us choose a parameter space F^{ν_0,I} such that there exist a sequence m = m_n of integers, functions V_j, j = ±2, ..., ±m, and a sequence ε_m → 0 as m → ∞ such that Conditions (H1), (C1), (C2) are satisfied for F = F^{ν_0,I}. Then, for n big enough we have:
\[
\Delta(\mathscr{P}^{\nu_0}_n, \mathscr{W}^{\nu_0}_n) = O\bigg( \sqrt{n\Delta_n} \sup_{f \in \mathscr{F}} \big( A_m(f) + B_m(f) + C_m(f) \big) \bigg) + O\bigg( \sqrt{n\Delta_n} \sup_{f \in \mathscr{F}} L_2(f, \hat{f}_m) + \sqrt{\frac{m}{n\Delta_n}\Big( \frac{1}{\mu_m} + \frac{1}{\mu_{-m}} \Big)} \bigg). \qquad (9)
\]
Theorem 2.6. Let ν_0 be a known Lévy measure concentrated on a (possibly infinite) interval I ⊆ ℝ and having strictly positive density with respect to the Lebesgue measure. Let us choose a parameter space F^{ν_0,I} such that there exist a sequence m = m_n of integers, functions V_j, j = ±2, ..., ±m, and a sequence ε_m → 0 as m → ∞ such that Conditions (H1), (C1), (C2') are satisfied for F = F^{ν_0,I}. Then, for n big enough we have:
\[
\Delta(\mathscr{Q}^{\nu_0}_n, \mathscr{W}^{\nu_0}_n) = O\bigg( \nu_0\big( I \setminus [-\varepsilon_m, \varepsilon_m] \big) \sqrt{n\Delta_n^2} + \frac{m \ln m}{\sqrt{n}} + \sqrt{n\sqrt{\Delta_n}} \sup_{f \in \mathscr{F}} C_m(f) \bigg) + O\bigg( \sqrt{n\Delta_n} \sup_{f \in \mathscr{F}} \big( A_m(f) + B_m(f) + H(f, \hat{f}_m) \big) \bigg). \qquad (10)
\]
Corollary 2.7. Let ν_0 be as above and let us choose a parameter space F^{ν_0,I} so that there exist sequences m'_n, ε'_m, V'_j and m''_n, ε''_m, V''_j such that:
• Conditions (H1), (C1) and (C2) hold for m'_n, ε'_m, V'_j, and \(\frac{m'_n}{n\Delta_n}\big( \frac{1}{\mu_{m'}} + \frac{1}{\mu_{-m'}} \big)\) tends to zero.
• Conditions (H1), (C1) and (C2') hold for m''_n, ε''_m, V''_j, and \(\nu_0\big( I \setminus [-\varepsilon_{m''}, \varepsilon_{m''}] \big) \sqrt{n\Delta_n^2} + \frac{m'' \ln m''}{\sqrt{n}}\) tends to zero.
Then the statistical models P_n^{ν_0} and Q_n^{ν_0} are asymptotically equivalent:
\[
\lim_{n \to \infty} \Delta(\mathscr{P}^{\nu_0}_n, \mathscr{Q}^{\nu_0}_n) = 0.
\]
If, in addition, the Lévy measures have finite variation, i.e. if we assume (FV), then the same results hold replacing P_n^{ν_0} and Q_n^{ν_0} by P_{n,FV}^{ν_0} and Q_{n,FV}^{ν_0}, respectively (see Lemma Appendix A.14).
3. Examples
We will now analyze three different examples, underlining the different behaviors of the Lévy measure ν_0 (respectively: finite, infinite with finite variation, and infinite with infinite variation). The three chosen Lévy measures are 𝕀_{[0,1]}(x) dx, 𝕀_{[0,1]}(x) dx/x and 𝕀_{ℝ_+}(x) dx/x². In all three cases we assume the parameter f to be uniformly bounded and with uniformly γ-Hölder derivatives: we will describe adequate subclasses F^{ν_0,I} ⊆ F^I_{(γ,K,κ,M)} defined as in (3). It seems very likely that the same results that are highlighted in these examples hold true for more general Lévy measures; however, we limit ourselves to these examples in order to be able to explicitly compute the quantities involved (v_j, x*_j, etc.) and hence estimate the distance between f and f̂_m as in Examples 2.4.
In the first of the three examples, where ν_0 is the Lebesgue measure on I = [0, 1], we are considering the statistical models associated with the discrete and continuous observation of a compound Poisson process with Lévy density f. Observe that W_n^{Leb} reduces to the statistical model associated with the continuous observation of a trajectory from:
\[
dy_t = \sqrt{f(t)}\,dt + \frac{1}{2\sqrt{T_n}}\,dW_t, \quad t \in [0, 1].
\]
In this case we have:
Example 3.1. (Finite Lévy measure). Let ν_0 be the Lebesgue measure on I = [0, 1] and let F = F^{Leb,[0,1]} be any subclass of F^{[0,1]}_{(γ,K,κ,M)} for some strictly positive constants K, κ, M and γ ∈ (0, 1]. Then:
\[
\lim_{n \to \infty} \Delta(\mathscr{P}^{Leb}_{n,FV}, \mathscr{W}^{Leb}_n) = 0 \quad \text{and} \quad \lim_{n \to \infty} \Delta(\mathscr{Q}^{Leb}_{n,FV}, \mathscr{W}^{Leb}_n) = 0.
\]
More precisely,
\[
\Delta(\mathscr{P}^{Leb}_{n,FV}, \mathscr{W}^{Leb}_n) =
\begin{cases}
O\big( (n\Delta_n)^{-\frac{\gamma}{4+2\gamma}} \big) & \text{if } \gamma \in \big( 0, \tfrac{1}{2} \big], \\[4pt]
O\big( (n\Delta_n)^{-\frac{1}{10}} \big) & \text{if } \gamma \in \big( \tfrac{1}{2}, 1 \big].
\end{cases}
\]
In the case where ∆_n = n^{−β}, 1/2 < β < 1, an upper bound for the rate of convergence of ∆(Q^{Leb}_{n,FV}, W^{Leb}_n) is
\[
\Delta(\mathscr{Q}^{Leb}_{n,FV}, \mathscr{W}^{Leb}_n) =
\begin{cases}
O\big( n^{-\frac{\gamma+\beta}{4+2\gamma}} \ln n \big) & \text{if } \gamma \in \big( 0, \tfrac{1}{2} \big] \text{ and } \tfrac{2+2\gamma}{3+2\gamma} \leq \beta < 1, \\[4pt]
O\big( n^{\frac{1}{2}-\beta} \ln n \big) & \text{if } \gamma \in \big( 0, \tfrac{1}{2} \big] \text{ and } \tfrac{1}{2} < \beta < \tfrac{2+2\gamma}{3+2\gamma}, \\[4pt]
O\big( n^{-\frac{1+2\beta}{10}} \ln n \big) & \text{if } \gamma \in \big( \tfrac{1}{2}, 1 \big] \text{ and } \tfrac{3}{4} \leq \beta < 1, \\[4pt]
O\big( n^{\frac{1}{2}-\beta} \ln n \big) & \text{if } \gamma \in \big( \tfrac{1}{2}, 1 \big] \text{ and } \tfrac{1}{2} < \beta < \tfrac{3}{4}.
\end{cases}
\]
See Section 5.3 for a proof.
Example 3.2. (Infinite Lévy measure with finite variation). Let X be a truncated Gamma process with (infinite) Lévy measure of the form:
\[
\nu(A) = \int_A \frac{e^{-\lambda x}}{x}\,dx, \quad A \in \mathscr{B}([0, 1]).
\]
Here F^{ν_0,I} is a 1-dimensional parametric family in λ, assuming that there exists a known constant λ_0 such that 0 < λ ≤ λ_0 < ∞, f(t) = e^{−λt} and dν_0(x) = dx/x. In particular, the f are Lipschitz, i.e. F^{ν_0,[0,1]} ⊂ F^{[0,1]}_{(γ=1,K,κ,M)}. The discrete or continuous observation (up to time T_n) of X is asymptotically equivalent to W_n^{ν_0}, the statistical model associated with the observation of a trajectory of the process (y_t):
\[
dy_t = \sqrt{f(t)}\,dt + \frac{\sqrt{t}\,dW_t}{2\sqrt{T_n}}, \quad t \in [0, 1].
\]
More precisely, in the case where ∆_n = n^{−β}, 1/2 < β < 1, an upper bound for the rate of convergence of ∆(Q_{n,FV}^{ν_0}, W_n^{ν_0}) is
\[
\Delta(\mathscr{Q}^{\nu_0}_{n,FV}, \mathscr{W}^{\nu_0}_n) =
\begin{cases}
O\big( n^{\frac{1}{2}-\beta} \ln n \big) & \text{if } \tfrac{1}{2} < \beta \leq \tfrac{9}{10}, \\[4pt]
O\big( n^{-\frac{1+2\beta}{7}} \ln n \big) & \text{if } \tfrac{9}{10} < \beta < 1.
\end{cases}
\]
Concerning the continuous setting we have:
\[
\Delta(\mathscr{P}^{\nu_0}_{n,FV}, \mathscr{W}^{\nu_0}_n) = O\big( n^{\frac{\beta-1}{6}} (\ln n)^{\frac{5}{2}} \big) = O\big( T_n^{-\frac{1}{6}} (\ln T_n)^{\frac{5}{2}} \big).
\]
See Section 5.4 for a proof.
Example 3.3. (Infinite Lévy measure, infinite variation). Let X be a pure jump Lévy process with infinite Lévy measure of the form:
\[
\nu(A) = \int_A \frac{2 - e^{-\lambda x^3}}{x^2}\,dx, \quad A \in \mathscr{B}(\mathbb{R}_+).
\]
Again, we are considering a parametric family in λ > 0, assuming that the parameter stays bounded by a known constant λ_0. Here, f(t) = 2 − e^{−λt³}, hence 1 ≤ f(t) ≤ 2 for all t ≥ 0, and f is Lipschitz, i.e. F^{ν_0,ℝ_+} ⊂ F^{ℝ_+}_{(γ=1,K,κ,M)}. The discrete or continuous observations (up to time T_n) of X are asymptotically equivalent to the statistical model associated with the observation of a trajectory of the process (y_t):
\[
dy_t = \sqrt{f(t)}\,dt + \frac{t\,dW_t}{2\sqrt{T_n}}, \quad t \geq 0.
\]
More precisely, in the case where ∆_n = n^{−β}, 0 < β < 1, an upper bound for the rate of convergence of ∆(Q_n^{ν_0}, W_n^{ν_0}) is
\[
\Delta(\mathscr{Q}^{\nu_0}_n, \mathscr{W}^{\nu_0}_n) =
\begin{cases}
O\big( n^{\frac{1}{2}-\frac{2}{3}\beta} \big) & \text{if } \tfrac{3}{4} < \beta < \tfrac{12}{13}, \\[4pt]
O\big( n^{-\frac{1}{6}+\frac{1}{8}\beta} (\ln n)^{\frac{7}{6}} \big) & \text{if } \tfrac{12}{13} \leq \beta < 1.
\end{cases}
\]
In the continuous setting, we have
\[
\Delta(\mathscr{P}^{\nu_0}_n, \mathscr{W}^{\nu_0}_n) = O\big( n^{\frac{3\beta-3}{34}} (\ln n)^{\frac{7}{6}} \big) = O\big( T_n^{-\frac{3}{34}} (\ln T_n)^{\frac{7}{6}} \big).
\]
See Section 5.5 for a proof.
4. Proofs of the main results
In order to simplify notations, the proofs will be presented in the case I ⊆ ℝ_+. Nevertheless, this allows us to present all the main difficulties, since they can only appear near 0. To prove Theorems 2.5 and 2.6 we need to introduce several intermediate statistical models. In that regard, let us denote by Q_j^f the law of a Poisson random variable with mean T_n ν(J_j) (see (4) for the definition of J_j). We will denote by L_m the statistical model associated with the family of probabilities {⊗_{j=2}^m Q_j^f : f ∈ F}:
\[
\mathscr{L}_m = \bigg( \bar{\mathbb{N}}^{m-1}, \mathscr{P}(\bar{\mathbb{N}}^{m-1}), \Big\{ \bigotimes_{j=2}^m Q_j^f : f \in \mathscr{F} \Big\} \bigg). \qquad (11)
\]
By N_j^f we mean the law of a Gaussian random variable 𝒩(2√(T_n ν(J_j)), 1) and by N_m the statistical model associated with the family of probabilities {⊗_{j=2}^m N_j^f : f ∈ F}:
\[
\mathscr{N}_m = \bigg( \mathbb{R}^{m-1}, \mathscr{B}(\mathbb{R}^{m-1}), \Big\{ \bigotimes_{j=2}^m N_j^f : f \in \mathscr{F} \Big\} \bigg). \qquad (12)
\]
For each $f\in\mathscr F$, let $\bar\nu_m$ be the measure having $\bar f_m$ as a density with respect to $\nu_0$ where, for every $f\in\mathscr F$, $\bar f_m$ is defined as follows:
$$\bar f_m(x):=\begin{cases}1 & \text{if } x\in J_1,\\ \dfrac{\nu(J_j)}{\nu_0(J_j)} & \text{if } x\in J_j,\ j=2,\dots,m.\end{cases}\qquad(13)$$
Furthermore, define
$$\bar{\mathscr P}_n^{\nu_0}=\Big(D,\mathscr D_{T_n},\Big\{P_{T_n}^{(\gamma^{\bar\nu_m-\nu_0},0,\bar\nu_m)} : \frac{d\bar\nu_m}{d\nu_0}\in\mathscr F\Big\}\Big).\qquad(14)$$
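The piecewise density $\bar f_m$ of (13) replaces $f$ on each bin $J_j$ by the ratio $\nu(J_j)/\nu_0(J_j)$, i.e. the $\nu_0$-weighted average of $f$ over the bin. A minimal numerical sketch, using the densities of Example 3.3 ($\nu_0(dx)=x^{-2}dx$, $f(x)=2-e^{-\lambda x^3}$ with $\lambda=1$); the bin edges and the midpoint rule are illustrative choices, not the paper's construction:

```python
import numpy as np

def bin_ratios(edges, lam=1.0, n_sub=20_000):
    """Approximate nu(J_j)/nu_0(J_j) for the bins J_j = (edges[j-1], edges[j]]."""
    ratios = []
    for a, b in zip(edges[:-1], edges[1:]):
        x = np.linspace(a, b, n_sub + 1)
        x = 0.5 * (x[:-1] + x[1:])            # midpoint rule on each bin
        w = x**-2.0                           # Lebesgue density of nu_0
        f = 2.0 - np.exp(-lam * x**3)         # f = d nu / d nu_0
        ratios.append(np.sum(f * w) / np.sum(w))
    return np.array(ratios)

edges = np.array([0.2, 0.5, 1.0, 2.0, 4.0])   # illustrative bins J_2, ..., J_5
r = bin_ratios(edges)
# Since 1 <= f <= 2 and f is increasing, the ratios lie in [1, 2] and increase.
```

Each ratio is a weighted average of $f$ over its bin, so $\bar f_m$ inherits the bounds $\kappa\le\bar f_m\le M$ satisfied by $f$.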
4.1. Proof of Theorem 2.5
We begin with a series of lemmas that will be needed in the proof. Before doing so, let us outline the scheme of the proof. We recall that the goal is to prove that estimating $f=\frac{d\nu}{d\nu_0}$ from the continuous observation of a Lévy process $(X_t)_{t\in[0,T_n]}$ without Gaussian part and having Lévy measure $\nu$ is asymptotically equivalent to estimating $f$ from the Gaussian white noise model:
$$dy_t=\sqrt{f(t)}\,dt+\frac{1}{2\sqrt{T_n g(t)}}\,dW_t,\quad g=\frac{d\nu_0}{d\mathrm{Leb}},\quad t\in I.$$
Also, recall the definition of $\hat\nu_m$ given in (5) and read $\mathscr P_1\overset{\Delta}{\Longleftrightarrow}\mathscr P_2$ as "$\mathscr P_1$ is asymptotically equivalent to $\mathscr P_2$". Then, we can outline the proof in the following way.
• Step 1: $P_{T_n}^{(\gamma^{\nu-\nu_0},0,\nu)}\overset{\Delta}{\Longleftrightarrow}P_{T_n}^{(\gamma^{\hat\nu_m-\nu_0},0,\hat\nu_m)}$;
• Step 2: $P_{T_n}^{(\gamma^{\hat\nu_m-\nu_0},0,\hat\nu_m)}\overset{\Delta}{\Longleftrightarrow}\bigotimes_{j=2}^m\mathscr P(T_n\nu(J_j))$ (Poisson approximation). Here $\bigotimes_{j=2}^m\mathscr P(T_n\nu(J_j))$ represents a statistical model associated with the observation of $m-1$ independent Poisson random variables of parameters $T_n\nu(J_j)$;
• Step 3: $\bigotimes_{j=2}^m\mathscr P(T_n\nu(J_j))\overset{\Delta}{\Longleftrightarrow}\bigotimes_{j=2}^m\mathscr N\big(2\sqrt{T_n\nu(J_j)},1\big)$ (Gaussian approximation);
• Step 4: $\bigotimes_{j=2}^m\mathscr N\big(2\sqrt{T_n\nu(J_j)},1\big)\overset{\Delta}{\Longleftrightarrow}(y_t)_{t\in I}$.
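The Gaussian approximation in Step 3 rests on the classical variance-stabilizing fact that, for $N\sim\mathscr P(\lambda)$ with $\lambda$ large, $2\sqrt N$ is approximately $\mathscr N(2\sqrt\lambda,1)$. A quick Monte Carlo check of this standard fact ($\lambda$ and the sample size are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 200.0                        # plays the role of T_n * nu(J_j)
z = 2.0 * np.sqrt(rng.poisson(lam, size=100_000))

# Sample mean should be close to 2*sqrt(lam), sample variance close to 1.
mean_err = abs(z.mean() - 2.0 * np.sqrt(lam))
var_err = abs(z.var() - 1.0)
```

This is why the Poisson experiment with means $T_n\nu(J_j)$ is compared to Gaussians $\mathscr N(2\sqrt{T_n\nu(J_j)},1)$ rather than to Gaussians with matching mean and variance.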
Lemmas 4.1–4.3, below, are the key ingredients of Step 2.
Lemma 4.1. Let $\bar{\mathscr P}_n^{\nu_0}$ and $\mathscr L_m$ be the statistical models defined in (14) and (11), respectively. Under Assumption (H2) we have:
$$\Delta(\bar{\mathscr P}_n^{\nu_0},\mathscr L_m)=0,\quad\text{for all } m.$$
Proof. Denote by $\bar{\mathbb N}$ the set $\mathbb N\cup\{\infty\}$ and consider the statistic $S:(D,\mathscr D_{T_n})\to\big(\bar{\mathbb N}^{m-1},\mathcal P(\bar{\mathbb N}^{m-1})\big)$ defined by
$$S(x)=\big(N_{T_n}^{x;2},\dots,N_{T_n}^{x;m}\big)\quad\text{with}\quad N_{T_n}^{x;j}=\sum_{r\le T_n}\mathbb I_{J_j}(\Delta x_r).\qquad(15)$$
An application of Theorem Appendix A.12 to $P_{T_n}^{(\gamma^{\bar\nu_m-\nu_0},0,\bar\nu_m)}$ and $P_{T_n}^{(0,0,\nu_0)}$ yields
$$\frac{dP_{T_n}^{(\gamma^{\bar\nu_m-\nu_0},0,\bar\nu_m)}}{dP_{T_n}^{(0,0,\nu_0)}}(x)=\exp\bigg(\sum_{j=2}^m\ln\Big(\frac{\nu(J_j)}{\nu_0(J_j)}\Big)N_{T_n}^{x;j}-T_n\int_I\big(\bar f_m(y)-1\big)\nu_0(dy)\bigg).$$
Hence, by means of the Fisher factorization theorem, we conclude that $S$ is a sufficient statistic for $\bar{\mathscr P}_n^{\nu_0}$. Furthermore, under $P_{T_n}^{(\gamma^{\bar\nu_m-\nu_0},0,\bar\nu_m)}$, the random variables $N_{T_n}^{x;j}$ have Poisson distributions $Q_j^f$ with means $T_n\nu(J_j)$. Then, by means of Property Appendix A.7, we get $\Delta(\bar{\mathscr P}_n^{\nu_0},\mathscr L_m)=0$, for all $m$.
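Concretely, the sufficient statistic $S$ of (15) retains only, for each bin $J_j$, the number of jumps of the observed path falling in it. A sketch on a simulated compound Poisson path; the jump intensity, jump law and bin edges are illustrative and not part of the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(2)
T_n = 100.0
n_jumps = rng.poisson(5.0 * T_n)                  # number of jumps up to T_n
jump_sizes = rng.exponential(1.0, size=n_jumps)   # i.i.d. jump sizes (the Delta x_r)
edges = np.array([0.1, 0.5, 1.0, 2.0, 5.0])       # consecutive bins J_2, ..., J_m
S, _ = np.histogram(jump_sizes, bins=edges)       # S(x) = (N^{x;2}, ..., N^{x;m})
# Under the Levy-process law, each count N^{x;j} is Poisson with mean T_n * nu(J_j).
```

Jumps smaller than the first edge or larger than the last are simply discarded, mirroring the fact that $S$ ignores the small jumps in $J_1$.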
Let us denote by $\hat Q_j^f$ the law of a Poisson random variable with mean $T_n\int_{J_j}\hat f_m(y)\nu_0(dy)$ and let $\hat{\mathscr L}_m$ be the statistical model associated with the family of probabilities $\big\{\bigotimes_{j=2}^m\hat Q_j^f : f\in\mathscr F\big\}$.
Lemma 4.2.
$$\Delta(\mathscr L_m,\hat{\mathscr L}_m)\le\sup_{f\in\mathscr F}\sqrt{\frac{T_n}{\kappa}\int_{I\setminus[0,\varepsilon_m]}\big(f(y)-\hat f_m(y)\big)^2\nu_0(dy)}.$$
Proof. By means of Facts Appendix A.2–Appendix A.4, we get:
$$\Delta(\mathscr L_m,\hat{\mathscr L}_m)\le\sup_{f\in\mathscr F}H\Big(\bigotimes_{j=2}^m Q_j^f,\bigotimes_{j=2}^m\hat Q_j^f\Big)\le\sup_{f\in\mathscr F}\sqrt{\sum_{j=2}^m 2H^2\big(Q_j^f,\hat Q_j^f\big)}$$
$$=\sup_{f\in\mathscr F}\sqrt2\,\sqrt{\sum_{j=2}^m\bigg[1-\exp\bigg(-\frac{T_n}{2}\bigg(\sqrt{\int_{J_j}\hat f_m(y)\nu_0(dy)}-\sqrt{\int_{J_j}f(y)\nu_0(dy)}\bigg)^{\!2}\,\bigg)\bigg]}.$$
By making use of the fact that $1-e^{-x}\le x$ for all $x\ge0$ and the equality $\sqrt a-\sqrt b=\frac{a-b}{\sqrt a+\sqrt b}$, combined with the lower bound $f\ge\kappa$ (which also implies $\hat f_m\ge\kappa$) and finally the Cauchy-Schwarz inequality, we obtain:
$$1-\exp\bigg(-\frac{T_n}{2}\bigg(\sqrt{\int_{J_j}\hat f_m(y)\nu_0(dy)}-\sqrt{\int_{J_j}f(y)\nu_0(dy)}\bigg)^{\!2}\,\bigg)\le\frac{T_n}{2}\bigg(\sqrt{\int_{J_j}\hat f_m(y)\nu_0(dy)}-\sqrt{\int_{J_j}f(y)\nu_0(dy)}\bigg)^{\!2}$$
$$\le\frac{T_n}{2}\,\frac{\Big(\int_{J_j}\big(f(y)-\hat f_m(y)\big)\nu_0(dy)\Big)^2}{\kappa\,\nu_0(J_j)}\le\frac{T_n}{2\kappa}\int_{J_j}\big(f(y)-\hat f_m(y)\big)^2\nu_0(dy).$$
Hence,
$$H\Big(\bigotimes_{j=2}^m Q_j^f,\bigotimes_{j=2}^m\hat Q_j^f\Big)\le\sqrt{\frac{T_n}{\kappa}\int_{I\setminus[0,\varepsilon_m]}\big(f(y)-\hat f_m(y)\big)^2\nu_0(dy)}.$$
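The first display in the proof uses the closed form of the Hellinger distance between Poisson laws, $H^2(\mathscr P(\lambda),\mathscr P(\mu))=1-\exp\big(-(\sqrt\lambda-\sqrt\mu)^2/2\big)$, applied with $\lambda=T_n\int_{J_j}\hat f_m\,d\nu_0$ and $\mu=T_n\int_{J_j}f\,d\nu_0$. A numerical sanity check of this identity against the defining series $1-\sum_k\sqrt{p_kq_k}$; the values of $\lambda,\mu$ and the truncation level are arbitrary:

```python
import math

def hellinger2_series(lam, mu, kmax=500):
    """1 - sum_k sqrt(p_k q_k) for two Poisson pmfs, truncated at kmax."""
    s = 0.0
    for k in range(kmax):
        log_pk = -lam + k * math.log(lam) - math.lgamma(k + 1)
        log_qk = -mu + k * math.log(mu) - math.lgamma(k + 1)
        s += math.exp(0.5 * (log_pk + log_qk))
    return 1.0 - s

def hellinger2_closed(lam, mu):
    """Closed form H^2 = 1 - exp(-(sqrt(lam) - sqrt(mu))**2 / 2)."""
    return 1.0 - math.exp(-0.5 * (math.sqrt(lam) - math.sqrt(mu)) ** 2)

lam, mu = 7.3, 9.1
# The two expressions agree up to truncation error, and the elementary bound
# 1 - exp(-x) <= x gives exactly the first inequality used in the proof.
```

Working with logarithms of the probability mass functions avoids overflow in $k!$ for moderate truncation levels.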
Lemma 4.3. Let $\hat\nu_m$ and $\bar\nu_m$ be the Lévy measures defined as in (5) and (13), respectively. For every $f\in\mathscr F$, there exists a Markov kernel $K$ such that
$$K P_{T_n}^{(\gamma^{\bar\nu_m-\nu_0},0,\bar\nu_m)}=P_{T_n}^{(\gamma^{\hat\nu_m-\nu_0},0,\hat\nu_m)}.$$
Proof. By construction, $\bar\nu_m$ and $\hat\nu_m$ coincide on $[0,\varepsilon_m]$. Let us denote by $\bar\nu_m^{res}$ and $\hat\nu_m^{res}$ the restrictions to $I\setminus[0,\varepsilon_m]$ of $\bar\nu_m$ and $\hat\nu_m$, respectively; then it is enough to prove:
$$K P_{T_n}^{(\gamma^{\bar\nu_m^{res}-\nu_0},0,\bar\nu_m^{res})}=P_{T_n}^{(\gamma^{\hat\nu_m^{res}-\nu_0},0,\hat\nu_m^{res})}.$$
First of all, let us observe that the kernel $M$:
$$M(x,A)=\sum_{j=2}^m\mathbb I_{J_j}(x)\int_A V_j(y)\nu_0(dy),\quad x\in I\setminus[0,\varepsilon_m],\ A\in\mathcal B(I\setminus[0,\varepsilon_m]),$$
is defined in such a way that $M\bar\nu_m^{res}=\hat\nu_m^{res}$. Indeed, for all $A\in\mathcal B(I\setminus[0,\varepsilon_m])$,
$$M\bar\nu_m^{res}(A)=\sum_{j=2}^m\int_{J_j}M(x,A)\,\bar\nu_m^{res}(dx)=\sum_{j=2}^m\int_{J_j}\Big(\int_A V_j(y)\nu_0(dy)\Big)\bar\nu_m^{res}(dx)$$
$$=\sum_{j=2}^m\Big(\int_A V_j(y)\nu_0(dy)\Big)\nu(J_j)=\int_A\hat f_m(y)\nu_0(dy)=\hat\nu_m^{res}(A).\qquad(16)$$
Observe that $(\gamma^{\bar\nu_m^{res}-\nu_0},0,\bar\nu_m^{res})$ and $(\gamma^{\hat\nu_m^{res}-\nu_0},0,\hat\nu_m^{res})$ are Lévy triplets associated with compound Poisson processes, since $\bar\nu_m^{res}$ and $\hat\nu_m^{res}$ are finite Lévy measures. The Markov kernel $K$ interchanging the laws of the Lévy processes is constructed explicitly in the case of compound Poisson processes.
Indeed, if $\bar X$ is the compound Poisson process having Lévy measure $\bar\nu_m^{res}$, then $\bar X_t=\sum_{i=1}^{N_t}\bar Y_i$, where $N_t$ is a Poisson process of intensity $\iota_m:=\bar\nu_m^{res}(I\setminus[0,\varepsilon_m])$ and the $\bar Y_i$ are i.i.d. random variables with probability law $\frac{1}{\iota_m}\bar\nu_m^{res}$. Moreover, given a trajectory of $\bar X$, both the trajectory $(n_t)_{t\in[0,T_n]}$ of the Poisson process $(N_t)_{t\in[0,T_n]}$ and the realizations $\bar y_i$ of $\bar Y_i$, $i=1,\dots,n_{T_n}$, are uniquely determined. This allows us to construct $n_{T_n}$ i.i.d. random variables $\hat Y_i$ as follows:
for every realization $\bar y_i$ of $\bar Y_i$, we define the realization $\hat y_i$ of $\hat Y_i$ by drawing it according to the probability law $M(\bar y_i,\cdot)$. Hence, thanks to (16), the $(\hat Y_i)_i$ are i.i.d. random variables with probability law $\frac{1}{\iota_m}\hat\nu_m^{res}$. The desired Markov kernel $K$ (defined on the Skorokhod space) is then given by:
$$K:(\bar X_t)_{t\in[0,T_n]}\longmapsto\Big(\hat X_t:=\sum_{i=1}^{N_t}\hat Y_i\Big)_{t\in[0,T_n]}.$$
Finally, observe that, since
$$\iota_m=\int_{I\setminus[0,\varepsilon_m]}\bar f_m(y)\nu_0(dy)=\int_{I\setminus[0,\varepsilon_m]}f(y)\nu_0(dy)=\int_{I\setminus[0,\varepsilon_m]}$$