
Robust model selection for a semimartingale continuous time regression from discrete data


HAL Id: hal-02334868

https://hal.archives-ouvertes.fr/hal-02334868

Submitted on 29 Oct 2019


Victor Konev, Serguei Pergamenchtchikov

To cite this version:

Victor Konev, Serguei Pergamenchtchikov. Robust model selection for a semimartingale continuous time regression from discrete data. Stochastic Processes and their Applications, Elsevier, 2015, 125, pp. 294-326. ⟨10.1016/j.spa.2014.08.003⟩. ⟨hal-02334868⟩


Abstract

The paper considers the problem of estimating a periodic function in a continuous time regression model observed under a general semimartingale noise with an unknown distribution, in the case when continuous observation cannot be provided and only discrete time measurements are available. Two specific types of noise are studied in detail: a non-Gaussian Ornstein-Uhlenbeck process and a time-varying linear combination of a Brownian motion and a compound Poisson process. We develop new analytical tools to treat adaptive estimation problems from discrete data. A lower bound for the sampling frequency needed for the efficiency of the procedure constructed from discrete observations is found. Sharp non-asymptotic oracle inequalities for the robust quadratic risk are derived. New convergence rates for the efficient procedures are obtained. An example of the regression with a martingale noise shows that the minimax robust convergence rate may be either higher or lower than the minimax rate for the "white noise" model. The results of Monte-Carlo simulations are given.

Keywords: Semimartingale regression; Estimation from discrete data; Robust risk estimation; Model selection; Sharp oracle inequalities.

AMS 2000 Subject Classifications: Primary: 62G08; Secondary: 62G05

This study was partially supported by the Russian Science Foundation (research project No. 14-49-00079) and by the National Research Tomsk State University.

Department of Applied Mathematics and Cybernetics, Tomsk State University and Department of High Mathematics and Mathematical Physics, Tomsk Polytechnical University, Tomsk, Russia, e-mail: [email protected]

Laboratoire de Mathématiques Raphaël Salem, UMR 6085 CNRS-Université de Rouen, France and National Research University - Higher School of Economics, Laboratory of Quantitative Finance, Moscow, Russia, e-mail: [email protected].


1 Introduction

Consider a regression model in continuous time

$$dy_t = S(t)\,dt + d\xi_t, \quad 0 \le t \le n, \tag{1.1}$$

where $S$ is an unknown function which belongs to the linear space $\mathcal{V}_1$ of 1-periodic $\mathbb{R} \to \mathbb{R}$ càdlàg functions; $\xi = (\xi_t)_{t \ge 0}$ is an unobservable semimartingale noise with values in the Skorokhod space $D[0, n]$ such that for each function $f$ from $L_2[0, n]$ the stochastic integral

$$I_n(f) = \int_0^n f(s)\,d\xi_s \tag{1.2}$$

is well defined and has the properties

$$\mathbf{E}_Q I_n(f) = 0 \quad \text{and} \quad \mathbf{E}_Q I_n^2(f) \le \kappa_Q \int_0^n f^2(s)\,ds, \tag{1.3}$$

where $\kappa_Q > 0$ is some positive constant depending on the noise distribution $Q$ on the space $D[0, n]$, and $\mathbf{E}_Q$ denotes the expectation under the distribution $Q$. The noise distribution $Q$ is unknown and assumed to belong to some distribution family $\mathcal{Q}_n$ specified below. All necessary tools concerning stochastic calculus can be found, for example, in [18].

The problem is to estimate the unknown function S in the model (1.1) on the basis of observations

$$(y_{t_j})_{0 \le j \le np}, \quad t_j = j\Delta, \quad \Delta = \frac{1}{p}, \tag{1.4}$$

where $p \ge 3$ is an odd number depending on $n$. Such an assumption about discrete time observations arises if continuous observation of the process (1.1) cannot be provided. Many papers are devoted to similar nonparametric estimation problems for the regression model (1.1) and other continuous time processes on the basis of the discrete observations (1.4).

Gobet, Hoffmann and Reiss [14] and Comte, Genon-Catalot and Rozenholc [5] studied the problem of estimating the coefficients of a diffusion process from discrete data. Hoffmann, Munk and Schmidt-Hieber [15] investigated nonparametric estimation of diffusion coefficients from discrete data when the observations are blurred by an additive noise. Comte and Genon-Catalot [4] studied a nonparametric estimation problem for a pure jump Lévy process in the model (1.1) based on discrete time observations.

In this paper we consider the estimation problem in an adaptive setting, i.e. when the regularity of $S$ is unknown. The quality of an estimate $\widetilde{S}_n$ (any 1-periodic $\mathbb{R} \to \mathbb{R}$ function constructed from the discrete data (1.4)) will be measured by the quadratic risk

$$\mathcal{R}_Q(\widetilde{S}_n, S) = \mathbf{E}_Q \|\widetilde{S}_n - S\|^2, \tag{1.5}$$

where $\|f\|^2 = \int_0^1 f^2(t)\,dt$. Since the noise distribution $Q$ is unknown, it seems reasonable to introduce the robust risk of the form

$$\mathcal{R}_n(\widetilde{S}_n, S) = \sup_{Q \in \mathcal{Q}_n} \mathcal{R}_Q(\widetilde{S}_n, S), \tag{1.6}$$

which enables one to take into account the information that $Q \in \mathcal{Q}_n$ and ensures the quality of an estimate $\widetilde{S}_n$ for all distributions in the family $\mathcal{Q}_n$. Similar criteria in problems of nonparametric estimation were used in a number of papers (see, for example, the papers by Galtchouk and Pergamenchtchikov [10], by the authors [24] and the references therein).

The goal of this paper is to develop a model selection method for estimating a continuous time semimartingale regression (1.1) from discrete data (1.4). The origin of the model selection method goes back to the early seventies with the pioneering papers by Akaike [1] and Mallows [25], who suggested introducing a penalty in a log-likelihood type criterion. Barron, Birgé, Massart [3], Massart [26] and Kneip [19] developed a non-asymptotic model selection method which enables one to derive non-asymptotic oracle inequalities for nonparametric regression models with Gaussian disturbances. Model selection procedures for regression schemes with dependent noises and unknown distributions were studied by Comte and Genon-Catalot [4], Fourdrinier and Pergamenchtchikov [8], Galtchouk and Pergamenchtchikov [9], and by the authors (see [23], [21] for details and further references). The interest in model selection procedures can be explained by the fact that they provide adaptive solutions for regression models in the non-asymptotic setting through sharp non-asymptotic oracle inequalities. As is known (see, for example, [3]), an oracle inequality gives an upper bound for the risk via the minimal risk corresponding to a chosen family of estimates. In constructing an adaptive model selection procedure in this paper, we use an approach close to that of the papers [12], [11], [13], developed for a heteroscedastic regression model in discrete time. The key idea of the method in [12] is to combine the Barron-Birgé-Massart non-asymptotic penalization method [3] and the Pinsker weighted least squares method of minimizing the asymptotic risk [27], [28]. The advantage of this method is two-fold: it deals with a robust risk and enables one to construct asymptotically efficient procedures.

The rest of the paper is organized as follows. Section 2 gives two specific examples of the noise in the model (1.1): first, a non-Gaussian Ornstein-Uhlenbeck process driven by a finite intensity Lévy process, and second, a deterministic but time-varying linear combination of a Brownian motion and a compound Poisson process. In Section 3 we construct a model selection procedure based on the weighted least squares estimators and, under general moment conditions on the distribution of $\xi$ in (1.1), obtain sharp non-asymptotic oracle inequalities for the risks (1.5) and (1.6). Moreover, we check these conditions for the examples given in Section 2. Section 4 illustrates the performance of the proposed model selection procedure through numerical simulations. It turns out that the sample convergence rate is close to that obtained in [12] for heteroscedastic regression models in discrete time. Section 5 deals with the asymptotic properties of the proposed procedure under the additional assumption that the unknown function $S$ in (1.1) belongs to a Sobolev ball. In this section we obtain an asymptotic sharp lower bound for the robust risk (1.6) and we calculate the Pinsker constant for the model (1.1). Then we find the lower bound for the sampling frequency $p$ which is needed for constructing efficient estimation procedures from discrete data. It is shown that the limit of the risk (1.6) for the proposed model selection procedure, normalized by the minimax rate, equals the Pinsker constant, i.e. the procedure is efficient. Section 6 gives the proofs of all the theorems. In the Appendix some technical results are established.

2 Examples

2.1 Non-Gaussian Ornstein-Uhlenbeck process

First we consider an example of the disturbance $(\xi_t)_{t \ge 0}$ in (1.1) given by a non-Gaussian Ornstein-Uhlenbeck process with a Lévy subordinator. Such processes are used in the financial Black-Scholes market models with stepwise randomly evolving coefficients (see [2] for details and further references). Let the noise process in (1.1) obey the equation

$$d\xi_t = a\,\xi_t\,dt + du_t, \quad \xi_0 = 0, \tag{2.1}$$

where $a \le 0$, $u_t = \varrho_1 w_t + \varrho_2 z_t$, $\varrho_1$ and $\varrho_2$ are unknown constants, $(w_t)_{t \ge 0}$ is a standard Brownian motion, and $(z_t)_{t \ge 0}$ is a compound Poisson process of the form

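$$z_t = \sum_{j=1}^{N_t} Y_j, \tag{2.2}$$

where $(N_t)_{t \ge 0}$ is a standard homogeneous Poisson process with unknown intensity $\lambda > 0$ and $(Y_j)_{j \ge 1}$ is an i.i.d. sequence of random variables with

$$\mathbf{E}Y_1 = 0, \quad \mathbf{E}Y_1^2 = 1 \quad \text{and} \quad \mathbf{E}Y_1^4 < \infty. \tag{2.3}$$

Let $(T_k)_{k \ge 1}$ denote the arrival times of the process $(N_t)_{t \ge 0}$, that is,

$$T_k = \inf\{t \ge 0 : N_t = k\}.$$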

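The parameters $\lambda$, $a$, $\varrho_1$ and $\varrho_2$ of the process (2.1) are assumed to satisfy the following conditions

$$-a^* \le a \le 0, \quad \lambda \le \lambda^*, \quad \varrho_{\min} \le \varrho_1 \le \varrho_{\max}, \quad \varrho_{\min} \le \varrho \le \varrho_{\max}, \tag{2.4}$$

where $\varrho = \varrho_1^2 + \lambda \varrho_2^2$. For this noise model the family $\mathcal{Q}_n$ consists of all distributions on the Skorokhod space $D[0, n]$ of the process (2.1) with the parameters satisfying the inequalities (2.4) with unknown bounds $\lambda^*$, $a^*$, $\varrho_{\min}$ and $\varrho_{\max}$.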

Remark 2.1. The estimation problem from continuous data for models of type (1.1), (2.1) has been studied in a parametric setting in [6], [16], [17] and [20].
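To make the noise model (2.1)-(2.3) concrete, here is a minimal Python sketch that simulates $\xi$ on the lattice $t_j = j/p$ with an Euler step. It is an illustration under stated assumptions rather than the authors' simulation code: the function name `simulate_ou_levy_noise` is hypothetical, the jumps $Y_j$ are taken to be standard normal (one admissible law with zero mean, unit variance and finite fourth moment), and the Euler scheme only approximates the exact dynamics.

```python
import numpy as np

def simulate_ou_levy_noise(n, p, a=-1.0, rho1=1.0, rho2=1.0, lam=1.0, seed=0):
    """Euler approximation of d xi_t = a xi_t dt + d u_t, xi_0 = 0, on t_j = j/p.

    u_t = rho1 * w_t + rho2 * z_t with w a Brownian motion and z a compound
    Poisson process of intensity lam. Jumps are N(0, 1) here (an assumption).
    """
    rng = np.random.default_rng(seed)
    delta = 1.0 / p
    steps = n * p
    # Brownian increments over each interval of length delta.
    dw = rng.normal(0.0, np.sqrt(delta), size=steps)
    # Number of Poisson arrivals in each interval.
    counts = rng.poisson(lam * delta, size=steps)
    # A sum of `counts` independent N(0, 1) jumps is N(0, counts).
    dz = np.sqrt(counts) * rng.normal(size=steps)
    du = rho1 * dw + rho2 * dz
    xi = np.zeros(steps + 1)
    for k in range(steps):
        xi[k + 1] = xi[k] + a * xi[k] * delta + du[k]
    return xi
```

Calling `simulate_ou_levy_noise(n=5, p=11)` returns the $np + 1 = 56$ values $\xi_{t_0}, \dots, \xi_{t_{np}}$; adding the integral of a chosen $S$ over $[0, t_j]$ would give synthetic observations of the form (1.4).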

2.2 Martingale noise

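Next we consider a martingale noise in (1.1) obeying the equation

$$d\xi_t = \sqrt{\varrho_1(t)}\,dw_t + \sqrt{\varrho_2(t)}\,dz_t, \tag{2.5}$$

where $\varrho_1$ and $\varrho_2$ are positive, twice continuously differentiable, nonrandom $\mathbb{R}_+ \to \mathbb{R}_+$ functions; the process $(z_t)_{t \ge 0}$ is as defined in (2.2)–(2.3). We assume that $\lambda \le \lambda^*$ for some unknown $\lambda^* > 0$. Moreover, denoting

$$\widetilde{\varrho}(t) = \varrho_1(t) + \lambda \varrho_2(t), \tag{2.6}$$

we assume that there exist continuous $\mathbb{R}_+ \to \mathbb{R}_+$ functions $\varrho_{\min}(\cdot)$ and $\varrho_{\max}(\cdot)$ such that for $t \ge 0$

$$\varrho_{\min}(t) \le \widetilde{\varrho}(t) \le \varrho_{\max}(t) \tag{2.7}$$

and, for any $\delta > 0$,

$$\lim_{t \to \infty} t^{\delta} \varrho_{\min}(t) = +\infty \quad \text{and} \quad \lim_{t \to \infty} \frac{\varrho_{\max}(t)}{t^{\delta}} = 0. \tag{2.8}$$

In addition, let the first two derivatives of the function $\widetilde{\varrho}(\cdot)$ satisfy the following conditions for some positive constants $\varrho'$ and $\varrho''$:

$$\sup_{t \ge 0}\,(t + 1) \left| \frac{d}{dt} \widetilde{\varrho}(t) \right| \le \varrho', \quad \sup_{t \ge 0} \left| \frac{d^2}{dt^2} \widetilde{\varrho}(t) \right| \le \varrho''. \tag{2.9}$$

In this case $\mathcal{Q}_n$ is the family of all distributions of the process (2.5) on $D[0, n]$ satisfying the conditions (2.7) and (2.9) for some fixed unknown parameters $\lambda^* > 0$, $\varrho_{\min}(\cdot)$, $\varrho_{\max}(\cdot)$, $\varrho'$ and $\varrho''$.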

3 Oracle inequalities

In this section we construct a model selection procedure for estimating a function S in (1.1) by the discrete time observations (1.4) and establish the oracle inequalities for its risk. Let (φj)j≥1 be the standard trigonometric basis in L2[0,1] defined as

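$$\phi_1 = 1, \quad \phi_j(x) = \begin{cases} \sqrt{2}\cos(\varpi_j x) & \text{for even } j \ge 2; \\ \sqrt{2}\sin(\varpi_j x) & \text{for odd } j \ge 3, \end{cases} \tag{3.1}$$

where $\varpi_j = 2\pi[j/2]$ and $[x]$ denotes the integer part of $x$. The restrictions of the functions $\{\phi_1, \dots, \phi_p\}$, $p \ge 3$, to the sampling lattice

$$\mathcal{T}_p = \{t_1, \dots, t_p\}, \quad t_j = \frac{j}{p},$$

form an orthonormal basis in the Hilbert space $\mathbb{R}^{\mathcal{T}_p}$ with the inner product

$$(x, y)_p = \frac{1}{p} \sum_{j=1}^{p} x(t_j)\,y(t_j) \quad \text{for } x, y \in \mathbb{R}^{\mathcal{T}_p} \tag{3.2}$$

and the norm $\|x\|_p = \sqrt{(x, x)_p}$. It is clear that the space $\mathbb{R}^{\mathcal{T}_p}$ is isometric to $\mathbb{R}^p$. One can check directly that for any odd $p$

$$(\phi_j, \phi_i)_p = \chi_{\{i = j\}}. \tag{3.3}$$

This implies that the function $S$ on the lattice $\mathcal{T}_p$ coincides with its discrete Fourier expansion, i.e.

$$S(t) = \sum_{j=1}^{p} \theta_{j,p}\,\phi_j(t), \quad \text{if } t \in \mathcal{T}_p, \tag{3.4}$$

where $\theta_{j,p} = (S, \phi_j)_p$. The first step in constructing the model selection procedure consists in estimating the coefficients $\theta_{j,p}$ for $S$ in (1.1) from the discrete data by the formulae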

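$$\widehat{\theta}_{j,p} = \frac{1}{n} \int_0^n \psi_{j,p}(t)\,dy_t, \quad \psi_{j,p} = D_p(\phi_j). \tag{3.5}$$

Here $D_p$ stands for the linear mapping given by the equation

$$D_p(f)(t) = \sum_{k=1}^{np} f(t_k)\,\chi_{\{t_{k-1} < t \le t_k\}}. \tag{3.6}$$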
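The discrete orthonormality (3.3) is easy to check numerically. The sketch below builds the Gram matrix of the first $p$ trigonometric functions of (3.1) on the lattice $t_j = j/p$ with respect to the inner product (3.2). The helper names `trig_basis` and `gram_matrix` are illustrative and do not come from the paper.

```python
import numpy as np

def trig_basis(j, x):
    """Trigonometric basis function phi_j of (3.1); j is 1-based."""
    x = np.asarray(x, dtype=float)
    if j == 1:
        return np.ones_like(x)
    freq = 2.0 * np.pi * (j // 2)          # varpi_j = 2 * pi * [j / 2]
    if j % 2 == 0:
        return np.sqrt(2.0) * np.cos(freq * x)
    return np.sqrt(2.0) * np.sin(freq * x)

def gram_matrix(p):
    """Matrix of inner products (phi_j, phi_i)_p on the lattice t_j = j / p."""
    t = np.arange(1, p + 1) / p
    basis = np.stack([trig_basis(j, t) for j in range(1, p + 1)])  # shape (p, p)
    return basis @ basis.T / p
```

For odd $p$ the result is the identity matrix up to rounding, for instance `np.allclose(gram_matrix(7), np.eye(7))` holds. For even $p$ the last basis function is aliased on the lattice and the identity fails, which is one way to see why the sampling frequency is taken odd.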

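Furthermore, we note that the system of functions $(\psi_{j,p})_{1 \le j \le p}$ is orthonormal in $L_2[0, 1]$ because $(\psi_{j,p}, \psi_{i,p}) = \int_0^1 \psi_{j,p}(t)\,\psi_{i,p}(t)\,dt = (\phi_j, \phi_i)_p = \chi_{\{i = j\}}$. In the sequel we need the Fourier coefficients of the function $S$ with respect to these functions, which can be written as

$$\overline{\theta}_{j,p} = (S, \psi_{j,p}) = \int_0^1 S(t)\,\psi_{j,p}(t)\,dt = \theta_{j,p} + h_{j,p}, \tag{3.7}$$

where $h_{j,p} = h_{j,p}(S) = \sum_{l=1}^{p} \int_{t_{l-1}}^{t_l} \phi_j(t_l)\,(S(t) - S(t_l))\,dt$. Substituting (1.1) in (3.5) yields

$$\widehat{\theta}_{j,p} = \overline{\theta}_{j,p} + \frac{1}{\sqrt{n}}\,\xi_{j,p} \quad \text{and} \quad \xi_{j,p} = \frac{I_n(\psi_{j,p})}{\sqrt{n}}. \tag{3.8}$$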
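Because $\psi_{j,p}$ is piecewise constant on the intervals $(t_{k-1}, t_k]$, the stochastic integral in (3.5) reduces to a finite sum over the observed increments: $\widehat{\theta}_{j,p} = \frac{1}{n} \sum_{k=1}^{np} \phi_j(t_k)\,(y_{t_k} - y_{t_{k-1}})$. The following Python sketch computes all $p$ coefficients this way. The function name `fourier_coeff_estimates` and the array layout (`y[k]` holding $y_{t_k}$ for $k = 0, \dots, np$) are assumptions made for the illustration.

```python
import numpy as np

def trig_basis(j, x):
    """Trigonometric basis function phi_j of (3.1); j is 1-based."""
    x = np.asarray(x, dtype=float)
    if j == 1:
        return np.ones_like(x)
    freq = 2.0 * np.pi * (j // 2)
    if j % 2 == 0:
        return np.sqrt(2.0) * np.cos(freq * x)
    return np.sqrt(2.0) * np.sin(freq * x)

def fourier_coeff_estimates(y, n, p):
    """Estimates theta_hat_{j,p}, j = 1..p, from y[k] = y_{t_k}, k = 0..n*p."""
    y = np.asarray(y, dtype=float)
    if y.shape != (n * p + 1,):
        raise ValueError("y must contain n*p + 1 observations")
    increments = np.diff(y)                  # y_{t_k} - y_{t_{k-1}}, k = 1..np
    t = np.arange(1, n * p + 1) / p          # right end points t_k = k / p
    theta_hat = np.empty(p)
    for j in range(1, p + 1):
        # psi_{j,p} equals phi_j(t_k) on (t_{k-1}, t_k], so the integral is a sum.
        theta_hat[j - 1] = np.sum(trig_basis(j, t) * increments) / n
    return theta_hat
```

In the noise-free case the output equals $\overline{\theta}_{j,p}$ of (3.7), so for a smooth $S$ it differs from the exact Fourier coefficient only by a discretization term that shrinks as $p$ grows.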

Besides, we will impose additional conditions on the distribution of the noise $(\xi_t)_{t \ge 0}$ in (1.1). For any $x = (x_1, \dots, x_p)$ from $\mathbb{R}^p$, we set

$$\#(x) = \sum_{j=1}^{p} \mathbf{1}_{\{|x_j| > 0\}} \quad \text{and} \quad |x|^2 = \sum_{j=1}^{p} x_j^2. \tag{3.9}$$

We need the following conditions.

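C1) There exists a variance proxy $\sigma_Q \ge 0$, which may depend on $n \ge 1$, such that the sequence $\zeta_{j,p} = \mathbf{E}_Q\,\xi_{j,p}^2 - \sigma_Q$ satisfies the following inequality

$$L_{1,Q}(n) = \sup_{p \ge 3}\, \sup_{x \in \mathcal{H}_1} \left| \sum_{j=1}^{p} x_j\,\zeta_{j,p} \right| < \infty,$$

where $\mathcal{H}_1 = \{x \in [-1, 1]^p : \#(x) \le n\}$.

C2) Assume that

$$L_{2,Q}(n) = \sup_{p \ge 3}\, \sup_{x \in \mathcal{H}_2} \mathbf{E}_Q \left( \sum_{j=1}^{p} x_j\,\widetilde{\xi}_{j,p} \right)^2 < \infty,$$

where $\mathcal{H}_2 = \{x \in \mathbb{R}^p : |x| \le 1,\ \#(x) \le n\}$ and $\widetilde{\xi}_{j,p} = \xi_{j,p}^2 - \mathbf{E}_Q\,\xi_{j,p}^2$.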

Remark 3.1. As is shown in the paper [24], which considers the estimation problem for the model (1.1) under continuous observations, one needs some stability conditions on the noise variances in (3.8) and the boundedness of their deviations to derive the oracle inequalities for the robust risk. In the estimation problem from discrete data the noise properties depend highly on the frequency of observations as well. The additional supremum over $p \ge 3$ in the conditions C1) and C2) means that the stability of the noise variances and their deviations holds uniformly in $p$. Further restrictions on the functionals $L_{1,Q}(n)$ and $L_{2,Q}(n)$ will be imposed in Section 3.2.

As will be shown in Section 3.3, the conditions C1) and C2) are satisfied for the model (1.1) with the noises (2.1) and (2.5). Now we introduce a weighted least squares estimate for $S(t)$ as

$$\widehat{S}_\gamma(t) = \sum_{j=1}^{p} \gamma(j)\,\widehat{\theta}_{j,p}\,\psi_{j,p}(t), \tag{3.10}$$

where $\gamma = (\gamma(1), \dots, \gamma(p))' \in [0, 1]^p$ is the vector of weight coefficients. The model selection procedure will be chosen from a finite family of such estimates $(\widehat{S}_\gamma)_{\gamma \in \Gamma}$. The set of weight sequences $\Gamma$ will be given below. We will need the following characteristics of this set and the corresponding system of weight sequences

$$\nu_p = \mathrm{card}\,\Gamma \quad \text{and} \quad \mu_p = \max_{\gamma \in \Gamma} \#(\gamma). \tag{3.11}$$

We assume that

$$\mu_p \le n. \tag{3.12}$$

Remark 3.2. The last inequality means that we cannot use more than $n$ nonzero terms in the estimate (3.10). We need this condition because the noise variance in (3.8) tends to zero as $n^{-1}$.

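Now note that, in view of the definition (3.7), the empirical squared error of the estimator (3.10) can be represented as

$$\mathrm{Err}(\gamma) = \|\widehat{S}_\gamma - S\|^2 = \sum_{j=1}^{p} \gamma^2(j)\,\widehat{\theta}_{j,p}^{\,2} - 2 \sum_{j=1}^{p} \gamma(j)\,\widehat{\theta}_{j,p}\,\overline{\theta}_{j,p} + \|S\|^2. \tag{3.13}$$

Since the Fourier coefficients $(\overline{\theta}_{j,p})_{j \ge 1}$ are unknown, the weight coefficients $(\gamma(j))_{1 \le j \le p}$ cannot be determined by minimizing this quantity. To circumvent this difficulty we replace the terms $\widehat{\theta}_{j,p}\,\overline{\theta}_{j,p}$ by

$$\widetilde{\theta}_{j,p} = \widehat{\theta}_{j,p}^{\,2} - \frac{\widehat{\sigma}_n}{n}, \tag{3.14}$$

where $\widehat{\sigma}_n$ is an estimator for the variance proxy $\sigma_Q$ in the condition C1). We will need the following characteristics of this estimate

$$r_Q(\widehat{\sigma}_n) = \mathbf{E}_Q |\widehat{\sigma}_n - \sigma_Q| \quad \text{and} \quad r_n(\widehat{\sigma}_n) = \sup_{Q \in \mathcal{Q}_n} \mathbf{E}_Q |\widehat{\sigma}_n - \sigma_Q|. \tag{3.15}$$

For replacing the terms $\widehat{\theta}_{j,p}\,\overline{\theta}_{j,p}$ by the estimates (3.14) on the right-hand side of the empirical squared error (3.13), one has to pay some penalty. Thus, one comes to the cost function of the form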

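$$J(\gamma) = \sum_{j=1}^{p} \gamma^2(j)\,\widehat{\theta}_{j,p}^{\,2} - 2 \sum_{j=1}^{p} \gamma(j)\,\widetilde{\theta}_{j,p} + \rho\,\widehat{P}(\gamma), \tag{3.16}$$

where $\rho$ is some positive constant and $\widehat{P}(\gamma)$ is the penalty term defined as

$$\widehat{P}(\gamma) = \frac{\widehat{\sigma}_n\,|\gamma|^2}{n}, \tag{3.17}$$

where $|\gamma|^2 = \sum_{j=1}^{p} \gamma^2(j)$. If $\sigma_Q$ is known, one can put $\widehat{\sigma}_n = \sigma_Q$ and

$$\widehat{P}(\gamma) = P_Q(\gamma) = \frac{\sigma_Q\,|\gamma|^2}{n}. \tag{3.18}$$

Substituting in (3.10) the weight coefficients $\gamma(j)$ that minimize the cost function (3.16), that is

$$\widehat{\gamma} = \mathrm{argmin}_{\gamma \in \Gamma}\, J(\gamma), \tag{3.19}$$

yields the model selection procedure

$$\widehat{S}_* = \widehat{S}_{\widehat{\gamma}}. \tag{3.20}$$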
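The selection step (3.14)-(3.19) amounts to evaluating the cost function on each candidate weight vector and taking the minimizer. A minimal Python sketch follows, assuming the family $\Gamma$ is supplied as a two-dimensional array with one weight vector per row. The function name `select_weights` and its signature are hypothetical.

```python
import numpy as np

def select_weights(theta_hat, sigma_hat, n, gammas, rho=0.1):
    """Return (index of the minimizer of J over the rows of gammas, all costs).

    theta_hat : estimates of the p Fourier coefficients, shape (p,)
    sigma_hat : estimate (or known value) of the variance proxy
    gammas    : candidate weight vectors, shape (card Gamma, p)
    rho       : penalty constant; the oracle inequalities use 0 < rho < 1/3
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    gammas = np.asarray(gammas, dtype=float)
    # (3.14): surrogate for the unknown products theta_hat * theta_bar.
    theta_tilde = theta_hat ** 2 - sigma_hat / n
    # (3.17): penalty sigma_hat * |gamma|^2 / n for every candidate.
    penalty = sigma_hat * np.sum(gammas ** 2, axis=1) / n
    # (3.16): cost function J(gamma).
    cost = (gammas ** 2) @ (theta_hat ** 2) - 2.0 * (gammas @ theta_tilde) + rho * penalty
    return int(np.argmin(cost)), cost
```

With the selected row $\widehat{\gamma}$, the estimate (3.20) is obtained by plugging $\widehat{\gamma}(j)\,\widehat{\theta}_{j,p}$ into the expansion (3.10).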

Our first goal is to obtain the oracle inequalities for the quadratic risk (1.5) of the estimate (3.20). To state the result we introduce the sequence

$$\Psi_Q(n, p) = 2 \left( 3 L_{1,Q}(n) + \nu_p\,L_{2,Q}(n)\,\sigma_Q^{-1} + \nu_p\,\kappa_Q \right). \tag{3.21}$$

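Proposition 3.1. Let the conditions (1.3), C1) and C2) imposed on the distribution of the noise $(\xi_t)_{t \ge 0}$ in (1.1) hold. Then, for any $n \ge 2$, $p \ge 3$, $0 < \rho < 1/3$ and any set $\Gamma$ with the property (3.12), the estimator (3.20) satisfies the following oracle inequality

$$\mathcal{R}_Q(\widehat{S}_*, S) \le \frac{1 + 3\rho}{1 - 3\rho} \min_{\gamma \in \Gamma} \mathcal{R}_Q(\widehat{S}_\gamma, S) + \frac{1}{n}\,\frac{\Psi_Q(n, p)}{(1 - 3\rho)\rho} + \frac{6 \mu_p\,r_Q(\widehat{\sigma}_n)}{n (1 - 3\rho)}. \tag{3.22}$$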

The proof of Proposition 3.1 is given in Section 6.

If the variance proxy $\sigma_Q$ in the condition C1) is known, then one comes to the following result.

Corollary 3.2. Suppose that the conditions of Proposition 3.1 hold with known $\sigma_Q > 0$. Then, for any $n \ge 2$, $p \ge 3$, $0 < \rho < 1/3$ and any set $\Gamma$ with the property (3.12), the estimator (3.20) satisfies the oracle inequality

$$\mathcal{R}_Q(\widehat{S}_*, S) \le \frac{1 + 3\rho}{1 - 3\rho} \min_{\gamma \in \Gamma} \mathcal{R}_Q(\widehat{S}_\gamma, S) + \frac{1}{n}\,\frac{\Psi_Q(n, p)}{(1 - 3\rho)\rho}. \tag{3.23}$$

Remark 3.3. It should be noted that Comte and Genon-Catalot [4] use an oracle inequality of type (3.22) to find an adaptive convergence rate for the density estimation problem for Lévy processes from discrete data. In this paper we will use this inequality to study efficiency properties of the procedure (3.19).

3.1 Estimation of σQ

In this section we consider the case of an unknown variance proxy $\sigma_Q$ in the condition C1) and derive the oracle inequalities for the continuous time estimate (3.10). We additionally assume that the unknown function $S(t)$ in (1.1) has an absolutely integrable derivative. First we have to estimate $\sigma_Q$ and find an upper bound for $r_Q(\widehat{\sigma}_n)$ in (3.22). One can use the following estimate for $\sigma_Q$

$$\widehat{\sigma}_n = \frac{n}{\check{p}} \sum_{j = l_0}^{\check{p}} \widehat{\theta}_{j,p}^{\,2} \quad \text{and} \quad \check{p} = \min(p, n), \tag{3.24}$$

where $l_0 \ge 1$ will be specified later. We set $\widehat{\sigma}_n = 0$ for $l_0 > \check{p}$.
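The estimate (3.24) is a scaled sum of squared high-frequency coefficients, where the signal contribution is small and the noise dominates. Below is a short Python sketch. The function name `variance_proxy_estimate` is hypothetical, and the cut-off `l0` is left as a required argument since its choice is discussed separately in the text.

```python
import numpy as np

def variance_proxy_estimate(theta_hat, n, l0):
    """Estimate (3.24): (n / p_check) * sum_{j=l0}^{p_check} theta_hat_j^2."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    p = theta_hat.size
    p_check = min(p, n)
    if l0 < 1:
        raise ValueError("l0 must be at least 1")
    if l0 > p_check:
        return 0.0                          # convention stated after (3.24)
    # Indices j = l0..p_check are 1-based; shift by one for the array slice.
    return n / p_check * float(np.sum(theta_hat[l0 - 1:p_check] ** 2))
```

For example, with ten coefficients all equal to one, `n = 4` and `l0 = 2`, the sum runs over $j = 2, 3, 4$ and the function returns `3.0`.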

Lemma 3.3. Assume that the conditions (1.3), C1) and C2) hold and the unknown function $S(t)$ is differentiable for $0 \le t \le 1$ and such that for some $r > 0$

$$\|S\|^2 + \|\dot{S}\|^2 = \int_0^1 \left( |S(t)|^2 + |\dot{S}(t)|^2 \right) dt \le r. \tag{3.25}$$

Then, for any $n \ge 1$ and $p \ge 3$, the quantity $r_Q(\widehat{\sigma}_n)$, defined in (3.15), satisfies the inequality

$$r_Q(\widehat{\sigma}_n) \le K_Q(n) \left( \frac{n}{\check{p}\,l_0} + \frac{l_0}{\check{p}} + \frac{1}{\sqrt{\check{p}}} \right), \tag{3.26}$$

where $K_Q(n) = 16 r + 8 \sqrt{r \kappa_Q} + L_{1,Q}(n) + \sigma_Q + \sqrt{L_{2,Q}(n)}$.

The proof of Lemma 3.3 is given in Section 6.

Minimizing the right-hand side of the last inequality with respect to $l_0$, we find the appropriate value for $l_0$ in (3.24), namely

$$l_0 = [\sqrt{n}], \tag{3.27}$$

which yields

$$r_Q(\widehat{\sigma}_n) \le 3 K_Q(n) \left( \frac{\sqrt{n}}{\check{p}} + \frac{1}{\sqrt{\check{p}}} \right) := 3 K_Q(n)\,g_{p,n}. \tag{3.28}$$

Combining Proposition 3.1 and Lemma 3.3 leads to the following result.

Proposition 3.4. Assume that the conditions (1.3), C1) and C2) hold and the function $S$ satisfies the inequality (3.25) for some $r > 0$. Then, for any $n \ge 1$, $p \ge 3$, $0 < \rho < 1/3$ and any set $\Gamma$ with the property (3.12), the estimate (3.20) obeys the oracle inequality

$$\mathcal{R}_Q(\widehat{S}_*, S) \le \frac{1 + 3\rho}{1 - 3\rho} \min_{\gamma \in \Gamma} \mathcal{R}_Q(\widehat{S}_\gamma, S) + \frac{B_Q(n, p)}{n (1 - 3\rho)\rho}, \tag{3.29}$$

where $B_Q(n, p) = \Psi_Q(n, p) + 18 K_Q(n)\,\mu_p\,g_{p,n}$.

Remark 3.4. Note that the oracle inequality (3.29) involves the term $B_Q(n, p)$, which depends on $n$ and $p$. Our goal is to find conditions on the noise distribution $Q$ and the frequency $p$ under which this term is bounded by any power of $n$ as $n \to \infty$. As will be seen later, this property implies the efficiency of the model selection procedure (3.20). To this end we need to study the robust risks for this estimate.

3.2 Robust estimation

In order to obtain the oracle inequality for the robust risk (1.6) of the estimate (3.10) we will impose additional conditions on the distribution family $\mathcal{Q}_n$ of the noise $\xi$ in the equation (1.1). These conditions are also needed for the subsequent study of the asymptotic properties of the procedure (3.20) when both the number of observation periods $n$ and the observation frequency $p$ tend to infinity. Let $\mathcal{P}_n$ denote the class of all probability measures on the space $D[0, n]$ and $\mathcal{P}_n^*$ be its subclass defined as

$$\mathcal{P}_n^* = \left\{ Q \in \mathcal{P}_n : L_{1,Q}(n) \le L_{1,n},\ L_{2,Q}(n) \le L_{2,n} \right\}, \tag{3.30}$$

where $L_{1,Q}(n)$ and $L_{2,Q}(n)$ are the functionals from the conditions C1), C2) and $L_{1,n}$ and $L_{2,n}$ are numerical sequences such that for any $\delta > 0$

$$\lim_{n \to \infty} \frac{L_{1,n} + L_{2,n}}{n^{\delta}} = 0.$$

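H1) Assume that each distribution in the family $\mathcal{Q}_n$ belongs to the class (3.30), i.e. $\mathcal{Q}_n \subseteq \mathcal{P}_n^*$ for each $n \ge 1$. Besides, the constant $\kappa_Q$ in (1.3) and the variance proxy $\sigma_Q$ from the condition C1) are such that for $n = 1, 2, \dots$

$$\kappa^*(n) := \sup_{Q \in \mathcal{Q}_n} \kappa_Q < \infty, \quad \sigma^*(n) := \sup_{Q \in \mathcal{Q}_n} \sigma_Q < \infty, \quad \sigma_*(n) := \inf_{Q \in \mathcal{Q}_n} \sigma_Q > 0 \tag{3.31}$$

and for any $\delta > 0$

$$\lim_{n \to \infty} \frac{\kappa^*(n)}{n^{\delta}} = 0, \quad \lim_{n \to \infty} \frac{\sigma^*(n)}{n^{\delta}} = 0 \quad \text{and} \quad \lim_{n \to \infty} n^{\delta} \sigma_*(n) = +\infty. \tag{3.32}$$

Taking into account these notations, we get the following upper bounds for the functions $\Psi_Q(n, p)$ and $K_Q(n)$ in (3.21) and (3.26)

$$\Psi_n(p) = 2 \left( 2 L_{1,n} + \nu_p\,L_{2,n}\,(\sigma_*(n))^{-1} + \nu_p\,\kappa^*(n) \right)$$

and

$$K_n = 16 r + 8 \sqrt{r \kappa^*(n)} + L_{1,n} + \sqrt{L_{2,n}} + \sigma^*(n).$$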

Now proceeding from Proposition 3.4 we come to the following result.

Theorem 3.5. Assume that the conditions (1.3) and H1) hold and the function $S$ satisfies the inequality (3.25) for some $r > 0$. Then, for any $n \ge 1$, $p \ge 3$, $0 < \rho < 1/3$ and any set $\Gamma$ with the property (3.12), the estimate (3.20) satisfies the oracle inequality for the robust risk (1.6), i.e.

$$\mathcal{R}_n(\widehat{S}_*, S) \le \frac{1 + 3\rho}{1 - 3\rho} \min_{\gamma \in \Gamma} \mathcal{R}_n(\widehat{S}_\gamma, S) + \frac{B_n(p)}{n (1 - 3\rho)\rho}, \tag{3.33}$$

where $B_n(p) = \Psi_n(p) + 18 K_n\,\mu_p\,g_{p,n}$.

3.3 Specification of weights in the model selection

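Now we specify the weight coefficients $(\gamma(j))_{1 \le j \le p}$ in the way proposed in [12]. Consider a numerical grid of the form

$$\mathcal{A} = \{1, \dots, k\} \times \{t_1, \dots, t_m\}, \tag{3.34}$$

where $t_i = i\varepsilon$ and $m = [1/\varepsilon^2]$. We assume that both parameters $k \ge 1$ and $n^{-2} \le \varepsilon \le 1$ are functions of $n$, i.e. $k = k(n)$ and $\varepsilon = \varepsilon(n)$, such that

$$\lim_{n \to \infty} k(n) = +\infty, \quad \lim_{n \to \infty} \frac{k(n)}{\ln n} = 0, \quad \lim_{n \to \infty} \sigma^*(n)\,\varepsilon(n) = 0 \quad \text{and} \quad \lim_{n \to \infty} n^{\delta} \varepsilon(n) = +\infty \tag{3.35}$$

for any $\delta > 0$. One can take, for example, for $n \ge 2$

$$\varepsilon(n) = \frac{1}{(\sigma^*(n) \ln n) \wedge n^2} \quad \text{and} \quad k(n) = k_0 + \sqrt{\ln n}, \tag{3.36}$$

where $a \wedge b = \min(a, b)$ and $k_0 \ge 0$ is some fixed constant. For each $\alpha = (\beta, t) \in \mathcal{A}$, we introduce the weight sequence

$$\gamma_\alpha = (\gamma_\alpha(j))_{1 \le j \le p}$$

with the elements

$$\gamma_\alpha(j) = \mathbf{1}_{\{1 \le j < j_*\}} + \left( 1 - (j/\omega_\alpha)^{\beta} \right) \mathbf{1}_{\{j_* \le j \le \omega_\alpha\}}, \tag{3.37}$$

where $j_* = [\omega_\alpha / \ln(n + 1)]$, $\omega_\alpha = (\tau_\beta\,t\,n)^{1/(2\beta + 1)}$ and

$$\tau_\beta = \frac{(\beta + 1)(2\beta + 1)}{\pi^{2\beta} \beta}.$$

Now we specify the set $\Gamma$ in (3.19) as

$$\Gamma = \{\gamma_\alpha,\ \alpha \in \mathcal{A}\}. \tag{3.38}$$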

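Note that for any $p \ge 1$ the number of elements in $\Gamma$ equals $\nu_p = km$. Therefore, in view of the assumptions (3.35), for any $\delta > 0$, one has

$$\lim_{n \to \infty} \frac{km}{n^{\delta}} = 0. \tag{3.39}$$

Taking into account that $\tau_\beta < 1$ for $\beta \ge 1$ and that $\varepsilon \ge n^{-2}$, we obtain

$$\omega^* = \sup_{\alpha \in \mathcal{A}} \omega_\alpha \le (n/\varepsilon)^{1/3} \le n. \tag{3.40}$$

Therefore, for any $p \ge 1$ and $n \ge 1$

$$\mu_p = \sup_{\alpha \in \mathcal{A}} \sum_{j=1}^{p} \mathbf{1}_{\{\gamma_\alpha(j) > 0\}} \le \omega^* \le n,$$

i.e. the condition (3.12) holds. Moreover, the last condition in (3.35) yields

$$\lim_{n \to \infty} \frac{\sup_{p \ge 1} \mu_p}{n^{1/3 + \delta}} = 0 \quad \text{for any } \delta > 0.$$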
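The weight family (3.34)-(3.38) can be generated explicitly and the bound $\mu_p \le (n/\varepsilon)^{1/3} \le n$ checked on an example. The Python sketch below rests on one reading of the formulas, namely $t_i = i\varepsilon$, $\tau_\beta = (\beta+1)(2\beta+1)/(\pi^{2\beta}\beta)$ and $j_* = [\omega_\alpha/\ln(n+1)]$. The function name `pinsker_weight_family` is hypothetical, and `k` and `eps` are passed in directly rather than computed from $n$.

```python
import math
import numpy as np

def pinsker_weight_family(n, p, k, eps):
    """Weight vectors gamma_alpha for alpha = (beta, t), beta = 1..k, t = i*eps."""
    # m = [1 / eps^2]; the small tolerance guards against rounding such as
    # 1 / 0.1**2 evaluating to 99.999... in floating point.
    m = math.floor(1.0 / eps ** 2 + 1e-9)
    js = np.arange(1, p + 1)
    family = []
    for beta in range(1, k + 1):
        tau = (beta + 1) * (2 * beta + 1) / (math.pi ** (2 * beta) * beta)
        for i in range(1, m + 1):
            t = i * eps
            omega = (tau * t * n) ** (1.0 / (2 * beta + 1))
            j_star = int(omega / math.log(n + 1))
            # Weight 1 on low frequencies, polynomial taper up to omega, 0 beyond.
            gamma = np.where(js < j_star, 1.0, 0.0)
            taper = (js >= j_star) & (js <= omega)
            gamma[taper] = 1.0 - (js[taper] / omega) ** beta
            family.append(gamma)
    return np.array(family)
```

For `n = 100`, `p = 101`, `k = 2` and `eps = 0.5` the family has $km = 8$ members, all weights lie in $[0, 1]$, and no member has more than $(n/\varepsilon)^{1/3} \approx 5.8$ nonzero entries, consistent with (3.40).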

Our main goal is to bound asymptotically (as $n \to \infty$) the term $B_n(p)$ in (3.33) by any power of $n$. To this end we note that if

$$p \ge p_* = 2[n^{5/6}/2] + 3, \tag{3.41}$$

then the upper bound (3.28) can be estimated as $g_{p,n} \le 2 n^{-1/3}$, i.e.

$$\lim_{n \to \infty} \frac{1}{n^{\delta}} \sup_{p \ge p_*} \mu_p\,g_{p,n} = 0, \quad \delta > 0. \tag{3.42}$$

Therefore, using these properties in (3.33) one comes to the following result.

Theorem 3.6. Assume that the conditions (1.3) and H1) hold and the function $S$ satisfies the inequality (3.25) for some $r > 0$. Then, for any $n \ge 3$, $p \ge 3$ and $0 < \rho < 1/3$, the estimate (3.20) with the weight coefficients (3.38) satisfies the oracle inequality (3.33), in which, for any $\delta > 0$,

$$\lim_{n \to \infty} \frac{1}{n^{\delta}} \sup_{p \ge p_*} B_n(p) = 0. \tag{3.43}$$
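The frequency threshold (3.41) and the bound $g_{p,n} \le 2 n^{-1/3}$ used above can be verified numerically. The sketch below reads (3.28) as $g_{p,n} = \sqrt{n}/\check{p} + 1/\sqrt{\check{p}}$ with $\check{p} = \min(p, n)$ and (3.41) as $p_* = 2[n^{5/6}/2] + 3$. The function names `p_star` and `g` are illustrative.

```python
import math

def p_star(n):
    """Smallest admissible odd sampling frequency, read from (3.41)."""
    return 2 * int(n ** (5.0 / 6.0) / 2.0) + 3

def g(p, n):
    """The quantity g_{p,n} from (3.28), with p_check = min(p, n)."""
    p_check = min(p, n)
    return math.sqrt(n) / p_check + 1.0 / math.sqrt(p_check)
```

Since $p_* > n^{5/6}$ and $n \ge n^{5/6}$, one has $\check{p} \ge n^{5/6}$ whenever $p \ge p_*$, so $g_{p,n} \le n^{-1/3} + n^{-5/12} \le 2 n^{-1/3}$. For example, `p_star(100)` gives 49 observations per period, and `g(49, 100)` is about 0.35, below $2 \cdot 100^{-1/3} \approx 0.43$.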

Remark 3.5. Note that the assertion of Theorem 3.6 holds true, generally speaking, for any random process $(\xi_t)_{t \ge 0}$ in (1.1) for which the stochastic integral (1.2) is well defined and satisfies the conditions (1.3) and H1). The class of square integrable semimartingales, in our opinion, is the most appropriate one for estimation problems with the quadratic risks (1.5) and (1.6). As an alternative, one could consider, for example, non-semimartingale regression models based on fractional Brownian motions with the Hurst parameter $H \ne 1/2$. However, in the latter case, the stochastic integral is defined not for functions from $L_2[0, n]$, but for some special spaces (see [7] for details). This does not agree with the definition (1.5) and, moreover, the first limiting relation in (3.32) fails to hold.
