Efficient adaptive nonparametric estimation in heteroscedastic regression models


HAL Id: hal-00129707

https://hal.archives-ouvertes.fr/hal-00129707

Preprint submitted on 8 Feb 2007


Leonid Galtchouk, Sergei Pergamenshchikov

To cite this version:

Leonid Galtchouk, Sergei Pergamenshchikov. Efficient adaptive nonparametric estimation in heteroscedastic regression models. 2005. ⟨hal-00129707⟩

Efficient adaptive nonparametric estimation in heteroscedastic regression models ^∗

L. Galtchouk¹, S. Pergamenshchikov²

¹ IRMA, Département de Mathématiques, Université de Strasbourg, 7 rue René Descartes, F67084, Strasbourg Cedex, France. e-mail: [email protected]

² Laboratoire de Mathématiques Raphaël Salem, UMR CNRS 6085, Avenue de l’Université, BP. 12, UFR Sciences, Université de Rouen, F76801, Saint Etienne du Rouvray, Cedex, France. e-mail: [email protected]

Abstract

An adaptive nonparametric estimation procedure is constructed for the heteroscedastic regression problem. A non-asymptotic upper bound for the quadratic risk (the Oracle inequality) is obtained.

Asymptotic efficiency of this procedure is proved, i.e. Pinsker’s constant is found in the asymptotic lower bound for the risk. It is shown that the asymptotic quadratic risk of the constructed procedure coincides with this constant.

Key words: adaptive estimation, asymptotic bounds, efficient estimation, heteroscedastic regression, nonparametric estimation, non-asymptotic estimation, Oracle inequality, Pinsker’s constant.

AMS (1991) Subject Classification : primary 62G08; secondary 62G05, 62G20

∗ The second author is partially supported by the RFFI-Grant 04-01-00855.

1 Introduction

Suppose we are given observations $(y_j)_{1\le j\le n}$ which obey the heteroscedastic regression equation

$$y_j = S(x_j) + \sigma_j(S)\,\xi_j, \tag{1.1}$$

where the design points are $x_j = j/n$, $S(\cdot)$ is an unknown function to be estimated, $(\xi_j)_{1\le j\le n}$ is a sequence of i.i.d. random variables, and $(\sigma_j(S))_{1\le j\le n}$ are unknown scale functionals depending on the unknown regression function $S$ and the design points. Our goal is to estimate $S$ in the mean quadratic sense in both the non-asymptotic and asymptotic setups when the smoothness of $S$ is unknown.
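As an illustration of the observation scheme (1.1), the following minimal sketch draws one sample at the design points $x_j = j/n$. The regression function `S`, the scale function `g` and the Gaussian noise are hypothetical choices made only for this example; in particular the scale here depends on $x$ alone, whereas the paper allows $\sigma_j$ to depend on $S$ as well.

```python
import math
import random

def simulate(S, g, n, seed=0):
    """Draw y_j = S(x_j) + g(x_j) * xi_j at the design points x_j = j/n."""
    rng = random.Random(seed)
    xs = [j / n for j in range(1, n + 1)]
    # xi_j are i.i.d. with mean 0 and variance 1 (standard normal here, an assumption).
    ys = [S(x) + g(x) * rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys

# Hypothetical regression and scale functions, for illustration only.
S = lambda x: math.sin(2 * math.pi * x)
g = lambda x: 0.5 + x  # noise level grows along the design

xs, ys = simulate(S, g, n=200)
```

Fixing the seed makes the draw reproducible, which is convenient when comparing estimators on the same sample.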

Note that heteroscedastic regressions with this type of scale functionals have been encountered in consumer budget studies utilizing observations on individuals with diverse incomes and in analyses of the investment behavior of firms of different sizes (see, for example, Goldfeld and Quandt (1972)). Models of this kind were also considered by Gunst and Mason (1980), Efroimovich (1999), and Akritas and Van Keilegom (2001).

The problem of minimax estimation of a nonparametric regression function in the homoscedastic case (i.e. $\sigma_j \equiv \sigma$) in the asymptotic setup has been studied in a number of papers. The optimal rate for $L_2$-losses and regression functions from an $L_2$-Sobolev or a Hölder space was studied by Ibragimov and Hasminskii (1982), Speckman (1985), Donoho et al. (1995); the case of regression functions from Triebel and Besov spaces was investigated by Donoho and Johnstone (1998); the optimal rate for model (1.1) in $L_q$-losses and adaptive estimates for Sobolev spatial regression were considered by Nemirovskii (2000).

Efficient linear estimators for the $L_2$-risk over linear estimators and some other risks were given by Donoho and Liu (1991), Donoho (1994).

The notion of asymptotic optimality is usually associated with the convergence rate of the minimax risk (see, for example, Ibragimov and Hasminskii (1981), Stone (1982)). An important question in the development of nonparametric estimation is to study the exact asymptotic behaviour of the minimax risk and to find an efficient estimator, i.e. an estimator which achieves this asymptotic behaviour.

The optimal constant and efficient estimators for $L_2$-losses were obtained by Nussbaum (1985), Golubev (1992), Golubev and Nussbaum (1993), and for sup-norm losses by Korostelev (1993). For the absolute error loss and the estimation of the regression function $S$ at a fixed point, the optimal constant and efficient estimators were found by Galtchouk and Pergamenshchikov (2005).

A non-asymptotic approach to the nonparametric estimation problem in the model (1.1) with $\sigma_j \equiv \sigma$ was studied in a few papers. A non-asymptotic upper bound for the quadratic risk over thresholding estimators is given by Kalifa and Mallat (2003). Barron et al. (1999) and Massart (2004) have constructed an adaptive model selection procedure based on least squares estimators and have obtained a non-asymptotic upper bound for the quadratic risk which is best in the principal term for the given class of estimators. Upper bounds of this type are called Oracle inequalities.

It should be noted that, to obtain efficient estimators (in the asymptotic setup), Nussbaum (1985) and later Golubev and Nussbaum (1993) used, in contrast with least squares estimators, a special class of weighted least squares estimators, where the weights depend on the regularity of the unknown function and are chosen specially to obtain the optimal constant in the quadratic risk (the so-called Pinsker constant).

For reasons of asymptotic efficiency, we make use of the Golubev-Nussbaum estimators to obtain a non-asymptotic upper bound for the quadratic risk in this paper. The main distinction between our approach and that of Barron-Birgé-Massart is the following. We choose a family of estimators such that every estimator is optimal when the regularity of the unknown function is fixed and a basis is given. In contrast, in model selection theory one chooses a family of bases (models) for which a least squares estimator is constructed.

In this paper we propose an adaptive procedure based on weighted least squares estimators and obtain a non-asymptotic upper bound for the quadratic risk which is best in the principal term for the chosen family of estimators. At the same time, the asymptotic properties of the proposed procedure are studied as $n \to \infty$. It turns out that this procedure is asymptotically efficient, that is, the asymptotic quadratic risk coincides with the lower bound over all estimators, i.e. with the Pinsker constant.

The paper is organized as follows. In the next section we formulate the problem and give the main results. In Section 3 we construct an adaptive estimation procedure based on Nussbaum’s estimators and obtain a non-asymptotic upper bound for the quadratic risk of this procedure. In Section 4 we prove the sharp asymptotic upper bound for the quadratic risk of this estimator. In Section 5 the sharp asymptotic lower bound for the minimax risk is obtained. An appendix contains some technical results.

2 Problems and Results

First we consider the model (1.1) in which the sequence $(\xi_j)_{1\le j\le n}$ is i.i.d. with

$$\mathbf{E}\,\xi_1 = 0, \quad \mathbf{E}\,\xi_1^2 = 1 \quad\text{and}\quad \mathbf{E}\,\xi_1^4 = \xi^* < \infty. \tag{2.1}$$

We introduce the Sobolev class $W_r^q$ as

$$W_r^q = \Big\{ S \in C^{q-1}[0,1] \,:\, \sum_{j=0}^{q} \|S^{(j)}\|^2 \le r \Big\}, \tag{2.2}$$

where $S^{(j)}$ is the $j$-th derivative of $S$, $r$ is some positive constant, $q \ge 1$ is an integer, and

$$\|S\|^2 = \int_0^1 S^2(x)\,dx.$$

For any estimate $\hat S_n$ of $S$, based on observations $(y_j)_{1\le j\le n}$, we use the following quadratic risk

$$\mathcal{R}_n(\hat S_n, S) = \mathbf{E}_S \|\hat S_n - S\|_n^2, \tag{2.3}$$

where

$$\|S\|_n^2 = \frac{1}{n}\sum_{j=1}^{n} S^2(x_j).$$

In the next section we construct an adaptive procedure (see (3.8) below) for which we obtain a non-asymptotic upper bound for the quadratic risk.

To construct this procedure, we make use of the estimator family introduced by Nussbaum (1985), see (3.4)-(3.5) below, and the approach proposed by Golubev and Nussbaum (1993) for the homoscedastic case, i.e. $\sigma_l \equiv \sigma$. The key idea in the construction of this procedure is the following. We replace the unknown variance $\sigma^2$ in the Golubev-Nussbaum estimation procedure by some estimator of $n^{-1}\sum_{j=1}^{n}\sigma_j^2$. For this procedure we obtain a non-asymptotic upper bound for the risk (2.3) which is best for the chosen family of estimators (see Theorem 3.3). This type of upper bound is called an Oracle inequality.

To study asymptotic properties, we need to impose some additional conditions on the sequence $(\sigma_j)$ in (1.1).

$H_1$) $\sigma_j = g(x_j, S)$ for some unknown function $g : [0,1] \times L_1[0,1] \to \mathbb{R}_+$, which is square integrable with respect to $x$ and such that

$$\lim_{n\to\infty}\, \sup_{S\in W_r^q} \left| \frac{1}{n}\sum_{j=1}^{n} g^2(x_j, S) - \varsigma(S) \right| = 0, \tag{2.4}$$

where $\varsigma(S) := \int_0^1 g^2(x,S)\,dx$. Moreover there exist some reals $0 < g_* \le g^* < \infty$ for which

$$g_* \le \inf_{0\le x\le 1}\,\inf_{S\in W_r^q} g(x,S) \le \sup_{0\le x\le 1}\,\sup_{S\in W_r^q} g(x,S) \le g^*. \tag{2.5}$$

$H_2$) The function $g(x,S)$ is differentiable in the Fréchet sense with respect to $S$ in $L_1[0,1]$ uniformly over $0 \le x \le 1$, i.e. for any $S$, $S_0$ from $L_1[0,1]$

$$g(x,S) = g(x,S_0) + L_{x,S_0}(S - S_0) + \Upsilon(x,S_0,S),$$

where the linear operator $L_{x,S_0}$ (the Fréchet derivative) is bounded uniformly over $0 \le x \le 1$ in $L_1[0,1]$, i.e. for any $S_0$ from $L_1[0,1]$ there exists some positive constant $C^* = C^*(S_0)$ such that

$$\sup_{0\le x\le 1}\ \sup_{S\in L_1[0,1],\, S\ne 0} \frac{|L_{x,S_0}(S)|}{\|S\|_1} \le C^* \tag{2.6}$$

and the residual term $\Upsilon(x,S_0,S)$ satisfies the following property

$$\lim_{\|S\|_1\to 0}\ \sup_{0\le x\le 1} \frac{|\Upsilon(x,S_0,S)|}{\|S\|_1} = 0,$$

where $\|S\|_1 = \int_0^1 |S(x)|\,dx$.

$H_3$) The function $g_0(x) = g(x,0)$ is continuous on the interval $[0,1]$.

Now we formulate the main asymptotic results. We set

$$\gamma_q(S) = C_q^*\, r^{1/(2q+1)}\, (\varsigma(S))^{2q/(2q+1)}, \tag{2.7}$$

with

$$C_q^* = (2q+1)^{1/(2q+1)} \left(\frac{q}{\pi(q+1)}\right)^{2q/(2q+1)}.$$

It is well known (see, for example [19]) that the optimal rate is $\varphi_n = n^{2q/(2q+1)}$ in the case when $S \in W_r^q$.
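To make the constants in (2.7) concrete, the short sketch below evaluates $C_q^*$, $\gamma_q(S)$ and the rate $\varphi_n$ numerically. The function names are hypothetical, and the argument `varsigma` stands for the value $\varsigma(S)=\int_0^1 g^2(x,S)\,dx$, which is assumed to be supplied by the user.

```python
import math

def pinsker_prefactor(q):
    """C_q^* from the display following (2.7)."""
    return ((2 * q + 1) ** (1 / (2 * q + 1))
            * (q / (math.pi * (q + 1))) ** (2 * q / (2 * q + 1)))

def gamma_q(q, r, varsigma):
    """gamma_q(S) in (2.7); varsigma is the integrated squared scale function."""
    return (pinsker_prefactor(q)
            * r ** (1 / (2 * q + 1))
            * varsigma ** (2 * q / (2 * q + 1)))

def rate(n, q):
    """Normalising rate phi_n = n^{2q/(2q+1)}."""
    return n ** (2 * q / (2 * q + 1))
```

For instance, `gamma_q(1, r, varsigma)` gives the constant for once-differentiable functions, where the risk decays like $n^{-2/3}$.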

Theorem 2.1. Assume that in the model (1.1)-(2.1) the sequence $(\sigma_j)$ fulfils the condition $H_1$). Then the estimator $S_n^*$ defined by (3.8) with $\varepsilon = 1/\ln(n+1)$ satisfies the inequality

$$\limsup_{n\to\infty}\ \varphi_n \sup_{S\in W_r^q} \frac{1}{\gamma_q(S)}\, \mathcal{R}_n(S_n^*, S) \le 1. \tag{2.8}$$

Moreover we will show that the estimator $S_n^*$ is efficient in the following sense.

Theorem 2.2. Assume that in the model (1.1) the sequence $(\xi_j)$ is i.i.d. $\sim \mathcal{N}(0,1)$ and the sequence $(\sigma_j)$ fulfils the conditions $H_1$)–$H_3$). Then the risk (2.3) admits the following asymptotic lower bound

$$\liminf_{n\to\infty}\ \inf_{\hat S_n}\ \varphi_n \sup_{S\in W_r^q} \frac{1}{\gamma_q(S)}\, \mathcal{R}_n(\hat S_n, S) \ge 1. \tag{2.9}$$

Remark 2.1. Inequalities (2.8) and (2.9) imply that $\gamma_q(S)$ is the so-called Pinsker constant, i.e. the sharp asymptotic lower bound for the quadratic risk.

3 Non-asymptotic estimation

In this section we construct the estimation procedure for the model (1.1)-(2.1). We use the trigonometric basis $\{\phi_j,\ j \ge 1\}$ in $L_2[0,1]$ with

$$\phi_1(x) = 1, \quad \phi_j(x) = \sqrt{2}\, Tr_j(2\pi [j/2]\, x), \quad j \ge 2, \tag{3.1}$$

where $[a]$ is the integer part of a real $a$,

$$Tr_j(x) = \begin{cases} \cos x & \text{for even } j, \\ \sin x & \text{for odd } j. \end{cases}$$

It is easy to see that for this basis

$$(\phi_i, \phi_j)_n = \frac{1}{n}\sum_{l=1}^{n} \phi_i(x_l)\,\phi_j(x_l) = 0$$

for $i \ne j$ and $1 \le i, j \le n$. Moreover, $\|\phi_j\|_n^2 = 1$ for $j \le n-1$ and $\|\phi_n\|_n^2 = \nu_n$ with

$$\nu_n = \begin{cases} 2 & \text{for even } n, \\ 1 & \text{for odd } n. \end{cases} \tag{3.2}$$
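The discrete orthogonality just stated can be checked numerically. The sketch below (helper names are hypothetical) builds the Gram matrix of $\phi_1,\dots,\phi_n$ with respect to the empirical inner product $(\cdot,\cdot)_n$ for one even and one odd sample size.

```python
import math

def phi(j, x):
    """Trigonometric basis (3.1): phi_1 = 1, cosines for even j, sines for odd j."""
    if j == 1:
        return 1.0
    arg = 2 * math.pi * (j // 2) * x
    return math.sqrt(2) * (math.cos(arg) if j % 2 == 0 else math.sin(arg))

def inner_n(u, v, n):
    """Empirical inner product (u, v)_n over the design points x_l = l/n."""
    return sum(u(l / n) * v(l / n) for l in range(1, n + 1)) / n

def gram(n):
    """Matrix of (phi_i, phi_j)_n for 1 <= i, j <= n."""
    return [[inner_n(lambda x: phi(i, x), lambda x: phi(j, x), n)
             for j in range(1, n + 1)]
            for i in range(1, n + 1)]

G8, G9 = gram(8), gram(9)  # even and odd sample sizes
```

Up to rounding, `G9` is the identity matrix, while `G8` is diagonal with last entry $\nu_8 = 2$, in agreement with (3.2).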

Now by making use of the discrete Fourier transformation we reduce the model (1.1) to the model

$$\hat\theta_{j,n} = \theta_{j,n} + \frac{1}{\sqrt{n}}\,\xi_{j,n} \tag{3.3}$$

with

$$\hat\theta_{j,n} = (y, \phi_j)_n, \quad \theta_{j,n} = (S, \phi_j)_n, \quad \xi_{j,n} = \frac{1}{\sqrt{n}}\sum_{l=1}^{n} \sigma_l\, \xi_l\, \phi_j(x_l).$$

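For any given $\varepsilon > 0$, let us define a 2-dimensional parameter $\tau = (\beta, t)$ with values in the set $\mathcal{A}_\varepsilon = \{1, \dots, \beta_*\} \times \{t_1, \dots, t_m\}$, where $\beta_* = [1/\sqrt{\varepsilon}]$ and $t_i = i\varepsilon$ for $1 \le i \le m = [1/\varepsilon^2]$. Notice that the number of elements in the set $\mathcal{A}_\varepsilon$ is $\upsilon_\varepsilon = \mathrm{card}(\mathcal{A}_\varepsilon) = \beta_* \cdot [1/\varepsilon^2]$.

For any $\tau \in \mathcal{A}_\varepsilon$ we set

$$\rho_\tau(j) = \begin{cases} 1 & \text{for } 1 \le j \le k_0(\tau), \\ \Psi_\beta(j\, w_\tau^{-1}) & \text{for } j > k_0(\tau), \end{cases} \tag{3.4}$$

where

$$\Psi_\beta(z) = (1 - z^\beta)\,\mathbf{1}_{(|z|\le 1)}, \quad w_\tau = n^{1/(2\beta+1)}\, c_\tau,$$

$$c_\tau = \pi^{-2\beta/(2\beta+1)}\, (t/\iota_\beta)^{1/(2\beta+1)}, \quad \iota_\beta = \frac{\beta}{(\beta+1)(2\beta+1)}, \quad k_0(\tau) = w_\tau/\ln(n+2).$$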

For every $\tau \in \mathcal{A}_\varepsilon$, we define the following estimator

$$\hat S_\tau(x) = \sum_{j=1}^{n} \rho_\tau(j)\, \frac{1}{\|\phi_j\|_n^2}\, \hat\theta_{j,n}\, \phi_j(x). \tag{3.5}$$

In the sequel we use the following weighted Fourier transformation of $S$:

$$S_\tau(x) = \sum_{j=1}^{n} \rho_\tau(j)\, \frac{1}{\|\phi_j\|_n^2}\, \theta_{j,n}\, \phi_j(x).$$
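The weights (3.4) and the estimator (3.5) translate directly into code. The following is a minimal sketch in plain Python, not an optimised implementation: the function names are hypothetical, the coefficients are computed by direct summation (quadratic cost in $n$), and the norms $\|\phi_j\|_n^2$ are taken from (3.2).

```python
import math

def phi(j, x):
    """Trigonometric basis (3.1)."""
    if j == 1:
        return 1.0
    arg = 2 * math.pi * (j // 2) * x
    return math.sqrt(2) * (math.cos(arg) if j % 2 == 0 else math.sin(arg))

def weights(beta, t, n):
    """Weights rho_tau(j), j = 1..n, from (3.4) for tau = (beta, t)."""
    iota = beta / ((beta + 1) * (2 * beta + 1))
    c_tau = math.pi ** (-2 * beta / (2 * beta + 1)) * (t / iota) ** (1 / (2 * beta + 1))
    w_tau = n ** (1 / (2 * beta + 1)) * c_tau
    k0 = w_tau / math.log(n + 2)
    # For j > k0 the weight is Psi_beta(j / w_tau) = (1 - z^beta) on z <= 1, else 0.
    return [1.0 if j <= k0 else max(0.0, 1.0 - (j / w_tau) ** beta)
            for j in range(1, n + 1)]

def fourier_coefficients(ys):
    """Empirical coefficients theta_hat_{j,n} = (y, phi_j)_n and norms ||phi_j||_n^2."""
    n = len(ys)
    theta = [sum(ys[l - 1] * phi(j, l / n) for l in range(1, n + 1)) / n
             for j in range(1, n + 1)]
    norms = [1.0] * n
    if n % 2 == 0:
        norms[-1] = 2.0  # nu_n = 2 for even n, see (3.2)
    return theta, norms

def estimate(ys, beta, t, x):
    """Weighted least squares estimator S_hat_tau(x) from (3.5)."""
    n = len(ys)
    theta, norms = fourier_coefficients(ys)
    rho = weights(beta, t, n)
    return sum(rho[j - 1] * theta[j - 1] / norms[j - 1] * phi(j, x)
               for j in range(1, n + 1))
```

Note that for small $n$ the threshold $k_0(\tau)$ can fall below 1, in which case even the first coefficient is shrunk.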

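To estimate the unknown function $S$ by estimators (3.4), we should find $\tau$ from $\mathcal{A}_\varepsilon$ which minimizes the loss function

$$\|\hat S_\tau - S\|_n^2 = \sum_{j=1}^{n} \rho_\tau^2(j)\, \frac{\hat\theta_{j,n}^2}{\|\phi_j\|_n^2} - 2\sum_{j=1}^{n} \rho_\tau(j)\, \frac{\hat\theta_{j,n}\,\theta_{j,n}}{\|\phi_j\|_n^2} + \sum_{j=1}^{n} \frac{\theta_{j,n}^2}{\|\phi_j\|_n^2} \tag{3.6}$$

or equivalently, minimizes the function

$$\sum_{j=1}^{n} \rho_\tau^2(j)\, \frac{\hat\theta_{j,n}^2}{\|\phi_j\|_n^2} - 2\sum_{j=1}^{n} \rho_\tau(j)\, \frac{\hat\theta_{j,n}\,\theta_{j,n}}{\|\phi_j\|_n^2}.$$

Since the coefficients $\theta_{j,n}$ are unknown, we will find $\tau$ which minimizes the function

$$J_n(\tau) = \sum_{j=1}^{n} \rho_\tau^2(j)\, \frac{\hat\theta_{j,n}^2}{\|\phi_j\|_n^2} - 2\sum_{j=1}^{n} \rho_\tau(j)\, \frac{\hat\theta_{j,n}^2 - \hat\varsigma_n/n}{\|\phi_j\|_n^2},$$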

where $\hat\varsigma_n$ is some estimator of

$$\varsigma_n = \frac{1}{n}\sum_{l=1}^{n} \sigma_l^2. \tag{3.7}$$

The estimator $\hat\varsigma_n$ will be constructed below. Denote by $\hat\tau = (\hat\beta, \hat t)$ the argmin of the function $J_n(\tau)$, i.e.

$$J_n(\hat\tau) = \min_{\tau\in\mathcal{A}_\varepsilon} J_n(\tau).$$

We denote

$$S_{n,\varepsilon}^* = \hat S_{\hat\tau}. \tag{3.8}$$

In this section we study this estimator in the non-asymptotic setup.
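The selection rule can be sketched as a grid search over $\mathcal{A}_\varepsilon$. The code below is a hedged illustration, not the authors' implementation: it assumes that $\hat\varsigma_n$ is the tail estimator (3.11) introduced later in this section, recomputes everything by direct summation, and uses hypothetical function names.

```python
import math

def phi(j, x):
    """Trigonometric basis (3.1)."""
    if j == 1:
        return 1.0
    arg = 2 * math.pi * (j // 2) * x
    return math.sqrt(2) * (math.cos(arg) if j % 2 == 0 else math.sin(arg))

def weights(beta, t, n):
    """Weights rho_tau(j), j = 1..n, from (3.4)."""
    iota = beta / ((beta + 1) * (2 * beta + 1))
    c_tau = math.pi ** (-2 * beta / (2 * beta + 1)) * (t / iota) ** (1 / (2 * beta + 1))
    w_tau = n ** (1 / (2 * beta + 1)) * c_tau
    k0 = w_tau / math.log(n + 2)
    return [1.0 if j <= k0 else max(0.0, 1.0 - (j / w_tau) ** beta)
            for j in range(1, n + 1)]

def integer_cube_root(n):
    """Largest integer k with k**3 <= n (guards against floating-point rounding)."""
    k = round(n ** (1 / 3))
    while k ** 3 > n:
        k -= 1
    while (k + 1) ** 3 <= n:
        k += 1
    return k

def select_tau(ys, eps):
    """Return (J_n(tau_hat), beta_hat, t_hat), the minimiser of J_n over A_eps."""
    n = len(ys)
    theta = [sum(ys[l - 1] * phi(j, l / n) for l in range(1, n + 1)) / n
             for j in range(1, n + 1)]
    norms = [1.0] * n
    if n % 2 == 0:
        norms[-1] = 2.0  # nu_n = 2 for even n, see (3.2)
    # Variance proxy (3.11): sum over j = l_n + 1, ..., n with l_n = [n^{1/3} + 1].
    l_n = integer_cube_root(n) + 1
    varsigma_hat = sum(theta[j] ** 2 / norms[j] for j in range(l_n, n))
    beta_max = int(1 / math.sqrt(eps))  # beta_* = [1 / sqrt(eps)]
    m = int(1 / eps ** 2)               # m = [1 / eps^2]
    best = None
    for beta in range(1, beta_max + 1):
        for i in range(1, m + 1):
            t = i * eps
            rho = weights(beta, t, n)
            J = sum((rho[j] ** 2 * theta[j] ** 2
                     - 2 * rho[j] * (theta[j] ** 2 - varsigma_hat / n)) / norms[j]
                    for j in range(n))
            if best is None or J < best[0]:
                best = (J, beta, t)
    return best
```

The selected pair $(\hat\beta,\hat t)$ is then plugged into (3.5) to obtain $S^*_{n,\varepsilon}$. With $\varepsilon = 1/\ln(n+1)$ the grid has $\upsilon_\varepsilon$ elements, so the cost of this naive search grows quickly with $n$.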

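Theorem 3.1. Assume that in the model (1.1)–(2.1) the function $S$ belongs to $\bigcup_{\beta\ge 1,\, 0<r\le r_*} W_r^\beta$, where $r_*$ is an unknown constant. Then, for any $n \ge 1$ and $\varepsilon > 0$, the estimate $S_{n,\varepsilon}^*$ satisfies the following inequality

$$\mathbf{E}_S \|S_{n,\varepsilon}^* - S\|_n^2 \le \frac{1}{1 - 4\varrho\sigma_*}\, \inf_{\tau\in\mathcal{A}_\varepsilon} \mathbf{E}_S \|\hat S_\tau - S\|_n^2 + \frac{2\, c^*\, \upsilon_\varepsilon\, B_1(\varrho)}{n^{2/3}}\, \mathbf{E}_S |\hat\varsigma_n - \varsigma_n| + B_n(\varrho, \varepsilon), \tag{3.9}$$

where $0 < \varrho < 1/(4\sigma_*)$ is an arbitrary constant and

$$B_n(\varrho,\varepsilon) = \frac{2\sqrt{2(\xi^*-1)}\,\sigma_*\, c^*\, B_1(\varrho)}{n^{7/6}}\, \upsilon_\varepsilon + \frac{7\sigma_*\, 2^{\beta^*} B_1(\varrho) + B_2(\varrho)}{n}\, \upsilon_\varepsilon,$$

$$B_1(\varrho) = \frac{2(1 - 2\varrho\sigma_*\nu_n)}{1 - 4\varrho\sigma_*\nu_n}, \quad B_2(\varrho) = \frac{2(1+\xi^*)(1 - 2\varrho\sigma_*\nu_n)}{(1 - 4\varrho\sigma_*\nu_n)\,\varrho}, \tag{3.10}$$

$$\sigma_* = \max_{1\le l\le n} \sigma_l^2, \quad c^* = \sup_{\tau\in\mathcal{A}_\varepsilon} c_\tau.$$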

To estimate $\varsigma_n$, we make use of the following estimator

$$\hat\varsigma_n = \sum_{j=l_n+1}^{n} \frac{\hat\theta_{j,n}^2}{\|\phi_j\|_n^2}, \quad n \ge 3, \tag{3.11}$$

with $l_n = [n^{1/3} + 1]$. This estimator satisfies the following inequality.

Proposition 3.2.

$$\sup_{\beta\ge 1,\, r\le r_*}\ \sup_{S\in W_r^\beta} \mathbf{E}_S |\hat\varsigma_n - \varsigma_n| \le \frac{1}{\sqrt{n}}\, T_n^*,$$

where

$$T_n^* = \sqrt{2(\xi^*-1)\,\nu_n}\;\sigma_* + \frac{2r_* + \sigma_*}{n^{1/6}} + \frac{2\sqrt{2 r_*\, \sigma_*\, \nu_n}}{n^{1/3}} + \frac{5\sigma_*}{\sqrt{n}}.$$

The proof of this proposition is given in the Appendix.

Theorem 3.1 and Proposition 3.2 immediately imply the following result.

Theorem 3.3. Assume that in the model (1.1)–(2.1) the function $S$ belongs to $\bigcup_{\beta\ge 1,\, 0<r\le r_*} W_r^\beta$, where $r_*$ is an unknown constant. Then, for any $n \ge 1$ and $\varepsilon > 0$, the estimate $S_{n,\varepsilon}^*$ satisfies the following inequality

$$\mathbf{E}_S \|S_{n,\varepsilon}^* - S\|_n^2 \le \frac{1}{1 - 4\varrho\sigma_*}\, \inf_{\tau\in\mathcal{A}_\varepsilon} \mathbf{E}_S \|\hat S_\tau - S\|_n^2 + D_n(\varrho, \varepsilon), \tag{3.12}$$

where

$$D_n(\varrho,\varepsilon) = \frac{2\, c^*\, \upsilon_\varepsilon\, B_1(\varrho)}{n^{7/6}}\, T_n^* + B_n(\varrho,\varepsilon)$$

and $0 < \varrho < 1/(4\sigma_*)$ is an arbitrary constant.

Remark 3.1. Note that the principal term on the right-hand side of (3.12) is best in the class of estimators $(\hat S_\tau,\ \tau \in \mathcal{A}_\varepsilon)$. Inequalities of this type are usually called Oracle inequalities.

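Proof of Theorem 3.1. First, by (3.6)–(3.7) we get the following equality

$$\|\hat S_\tau - S\|_n^2 = J_n(\tau) + 2\sum_{j=1}^{n} \frac{\rho_\tau(j)}{\|\phi_j\|_n^2} \Big( \hat\theta_{j,n}^2 - \frac{\hat\varsigma_n}{n} - \hat\theta_{j,n}\,\theta_{j,n} \Big) + \sum_{j=1}^{n} \frac{\theta_{j,n}^2}{\|\phi_j\|_n^2}.$$

Further, from the model (3.3) we obtain that

$$\hat\theta_{j,n}^2 - \frac{\hat\varsigma_n}{n} - \hat\theta_{j,n}\,\theta_{j,n} = \theta_{j,n}\, \frac{\xi_{j,n}}{\sqrt{n}} + \frac{\tilde\xi_{j,n}}{n} + \frac{\tilde m_{j,n}}{n} + \frac{1}{n}(\varsigma_n - \hat\varsigma_n),$$

where

$$\tilde\xi_{j,n} = \xi_{j,n}^2 - m_{j,n}, \quad m_{j,n} = \mathbf{E}\,\xi_{j,n}^2 = \frac{1}{n}\sum_{l=1}^{n} \sigma_l^2\, \phi_j^2(x_l),$$

$$\tilde m_{j,n} = m_{j,n} - \varsigma_n = \frac{1}{n}\sum_{l=1}^{n} \sigma_l^2\, \overline{\phi}_j(x_l), \quad \overline{\phi}_j(x_l) = \phi_j^2(x_l) - 1. \tag{3.13}$$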

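Therefore, for any $\tau \in \mathcal{A}_\varepsilon$,

$$\|\hat S_\tau - S\|_n^2 = J_n(\tau) + \frac{2}{\sqrt{n}}\sum_{j=1}^{n} \frac{\rho_\tau(j)\,\theta_{j,n}}{\|\phi_j\|_n^2}\, \xi_{j,n} + 2N_n(\tau) + \sum_{j=1}^{n} \frac{\theta_{j,n}^2}{\|\phi_j\|_n^2}, \tag{3.14}$$

where $N_n(\tau) = \sum_{i=1}^{3} N_{i,n}(\tau)$ with

$$N_{1,n}(\tau) = n^{-1}\sum_{j=1}^{n} \frac{\rho_\tau(j)}{\|\phi_j\|_n^2}\, \tilde\xi_{j,n},$$

$$N_{2,n}(\tau) = n^{-1}\sum_{j=1}^{n} \frac{\rho_\tau(j)}{\|\phi_j\|_n^2}\, \tilde m_{j,n},$$

$$N_{3,n}(\tau) = n^{-1}\sum_{j=1}^{n} \frac{\rho_\tau(j)}{\|\phi_j\|_n^2}\, (\varsigma_n - \hat\varsigma_n).$$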

Let $\tau_1$ be some fixed value of the parameter from $\mathcal{A}_\varepsilon$. Then by (3.14) we find

$$\|\hat S_{\hat\tau} - S\|_n^2 - \|\hat S_{\tau_1} - S\|_n^2 = J_n(\hat\tau) - J_n(\tau_1) + 2Z_{1,n}(\hat\tau) + 2V_n(\hat\tau), \tag{3.15}$$

where

$$Z_{1,n}(\tau) = \frac{1}{\sqrt{n}}\sum_{j=1}^{n} \frac{\rho_\tau(j) - \rho_{\tau_1}(j)}{\|\phi_j\|_n^2}\, \theta_{j,n}\, \xi_{j,n}, \quad V_n(\tau) = N_n(\tau) - N_n(\tau_1).$$

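Lemma 6.4 implies that

$$\mathbf{E}\, Z_{1,n}^4(\tau) \le \xi^*\, d_{1,\tau}^4, \tag{3.16}$$

where

$$d_{1,\tau}^2 = \mathbf{E}\, Z_{1,n}^2(\tau) \le \frac{\sigma_*\nu_n}{n}\sum_{j=1}^{n} \frac{(\rho_\tau(j) - \rho_{\tau_1}(j))^2}{\|\phi_j\|_n^4}\, \theta_{j,n}^2 \le \frac{\sigma_*\nu_n}{n}\, \|S_\tau - S_{\tau_1}\|_n^2,$$

and $\nu_n$ is defined by (3.2).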

Taking into account in (3.15) that $J_n(\hat\tau) - J_n(\tau_1) \le 0$, we obtain

$$\|\hat S_{\hat\tau} - S\|_n^2 - \|\hat S_{\tau_1} - S\|_n^2 \le 2Z_{1,n}(\hat\tau) + 2V_n^*,$$

where $V_n^* = \max_{\tau\in\mathcal{A}_\varepsilon} |V_n(\tau)|$.

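We set, for $x > 0$, $\varrho > 0$ and fixed $\tau \in \mathcal{A}_\varepsilon$,

$$\tilde Z_{1,n}(\tau, x) = \frac{2\, Z_{1,n}(\tau)\, \varrho n}{(\varrho n)^2 d_{1,\tau}^2 + x} \quad\text{and}\quad \Gamma_{1,n}(x) = \Big\{ \sup_\tau |\tilde Z_{1,n}(\tau, x)| \le 1 \Big\}.$$

Notice now that $(\varrho n)^2 d_{1,\tau}^2 + x \ge 2 d_{1,\tau}\sqrt{x}\,\varrho n$. Therefore by Chebyshev’s inequality and (3.16), we obtain, for any $x > 0$ and $\tau \in \mathcal{A}_\varepsilon$,

$$\mathbf{P}\big( |\tilde Z_{1,n}(\tau, x)| > 1 \big) \le \mathbf{P}\big( |Z_{1,n}(\tau)| > \sqrt{x}\, d_{1,\tau} \big) \le e^*(x),$$

where $e^*(x) = \mathbf{1}_{\{x\le 1\}} + \xi^*\, x^{-2}\, \mathbf{1}_{\{x>1\}}$. Hence

$$\mathbf{P}\big( \Gamma_{1,n}^c(x) \big) \le \sum_{\tau\ne\tau_1} \mathbf{P}\big( |Z_{1,n}(\tau)| > \sqrt{x}\, d_{1,\tau} \big) \le \upsilon_\varepsilon\, e^*(x). \tag{3.17}$$

Thus on the set $\Gamma_{1,n}(x)$

$$\|\hat S_{\hat\tau} - S\|_n^2 \le \|\hat S_{\tau_1} - S\|_n^2 + \varrho n\, d_{1,\hat\tau}^2 + \frac{x}{\varrho n} + 2V_n^* \le \|\hat S_{\tau_1} - S\|_n^2 + \varrho\sigma_*\nu_n\, \|S_{\hat\tau} - S_{\tau_1}\|_n^2 + \frac{x}{\varrho n} + 2V_n^*. \tag{3.18}$$

Let us now estimate $\|S_{\hat\tau} - S_{\tau_1}\|_n^2$. We have

$$\|S_{\hat\tau} - S_{\tau_1}\|_n^2 = \|\hat S_{\hat\tau} - \hat S_{\tau_1}\|_n^2 + \big( \|S_{\hat\tau} - S_{\tau_1}\|_n^2 - \|\hat S_{\hat\tau} - \hat S_{\tau_1}\|_n^2 \big)$$

$$= \|\hat S_{\hat\tau} - \hat S_{\tau_1}\|_n^2 + \sum_{j=1}^{n} \frac{(\rho_{\hat\tau}(j) - \rho_{\tau_1}(j))^2}{\|\phi_j\|_n^2}\, \big( \theta_{j,n}^2 - \hat\theta_{j,n}^2 \big)$$

$$\le \|\hat S_{\hat\tau} - \hat S_{\tau_1}\|_n^2 - 2Z_{2,n}(\hat\tau),$$

where

$$Z_{2,n}(\tau) = \frac{1}{\sqrt{n}}\sum_{j=1}^{n} \frac{(\rho_\tau(j) - \rho_{\tau_1}(j))^2}{\|\phi_j\|_n^2}\, \theta_{j,n}\, \xi_{j,n}$$

with

$$d_{2,\tau}^2 = \mathbf{E}\, Z_{2,n}^2(\tau) \le \frac{\sigma_*\nu_n}{n}\sum_{j=1}^{n} \frac{(\rho_\tau(j) - \rho_{\tau_1}(j))^4}{\|\phi_j\|_n^4}\, \theta_{j,n}^2 \le \frac{2\sigma_*\nu_n}{n}\, \|S_\tau - S_{\tau_1}\|_n^2.$$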

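In the same way as before, for $x > 0$, we set $\Gamma_{2,n}(x) = \{ \sup_\tau |\tilde Z_{2,n}(\tau, x)| \le 1 \}$, where

$$\tilde Z_{2,n}(\tau, x) = \frac{2\, Z_{2,n}(\tau)\, \varrho n}{(\varrho n)^2 d_{2,\tau}^2 + x}.$$

Similarly to (3.17), one shows that

$$\mathbf{P}\big( \Gamma_{2,n}^c(x) \big) \le \upsilon_\varepsilon\, e^*(x).$$

On the set $\Gamma_{2,n}(x)$ we obtain that

$$\|S_{\hat\tau} - S_{\tau_1}\|_n^2 \le \|\hat S_{\hat\tau} - \hat S_{\tau_1}\|_n^2 + \varrho n\, d_{2,\hat\tau}^2 + \frac{x}{\varrho n} \le \|\hat S_{\hat\tau} - \hat S_{\tau_1}\|_n^2 + 2\varrho\sigma_*\nu_n\, \|S_{\hat\tau} - S_{\tau_1}\|_n^2 + \frac{x}{\varrho n}.$$

Hence

$$\|S_{\hat\tau} - S_{\tau_1}\|_n^2 \le v^*(\varrho)\, \|\hat S_{\hat\tau} - \hat S_{\tau_1}\|_n^2 + \frac{v^*(\varrho)}{\varrho n}\, x, \tag{3.19}$$

where $v^*(\varrho) = (1 - 2\varrho\sigma_*\nu_n)^{-1}$. Thus on the set $\Gamma_n(x) = \Gamma_{1,n}(x) \cap \Gamma_{2,n}(x)$, in view of (3.18) and (3.19), we get

$$\|\hat S_{\hat\tau} - S\|_n^2 \le \|\hat S_{\tau_1} - S\|_n^2 + \varrho\sigma_*\nu_n\, v^*(\varrho)\, \|\hat S_{\hat\tau} - \hat S_{\tau_1}\|_n^2 + \frac{1 + \varrho\sigma_*\nu_n\, v^*(\varrho)}{\varrho n}\, x + 2V_n^*.$$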

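Estimating here the term $\|\hat S_{\hat\tau} - \hat S_{\tau_1}\|_n^2$ by $2\|\hat S_{\hat\tau} - S\|_n^2 + 2\|S - \hat S_{\tau_1}\|_n^2$ yields the inequality

$$\|\hat S_{\hat\tau} - S\|_n^2 \le \frac{1}{1 - 4\varrho\sigma_*\nu_n}\, \|\hat S_{\tau_1} - S\|_n^2 + \frac{2(1 - 2\varrho\sigma_*\nu_n)}{1 - 4\varrho\sigma_*\nu_n}\, V_n^* + \frac{1 - \varrho\sigma_*\nu_n}{(1 - 4\varrho\sigma_*\nu_n)\,\varrho n}\, x.$$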

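Applying Lemma 6.5 with $M(x) = 2 e^*(x)\,\upsilon_\varepsilon$ gives

$$\mathbf{E}_S \|\hat S_{\hat\tau} - S\|_n^2 \le \frac{1}{1 - 4\varrho\sigma_*\nu_n}\, \mathbf{E}_S \|\hat S_{\tau_1} - S\|_n^2 + B_1(\varrho)\, \mathbf{E}_S V_n^* + \upsilon_\varepsilon\, B_2(\varrho)\, \frac{1}{n}, \tag{3.20}$$

where $B_1(\varrho)$ and $B_2(\varrho)$ are defined by (3.10). In Appendix 6.2 we show that

$$\mathbf{E}_S V_n^* \le \upsilon_\varepsilon \left( \frac{2\sqrt{2(\xi^*-1)}\,\sigma_*\, c^*}{n^{7/6}} + \frac{7\sigma_*\, 2^{\beta^*}}{n} + \frac{2 c^*}{n^{2/3}}\, \mathbf{E}_S |\hat\varsigma_n - \varsigma_n| \right). \tag{3.21}$$

Therefore the inequality (3.9) follows immediately from (3.20) and (3.21).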

4 Asymptotic upper bound

In this section we prove Theorem 2.1.

We start with the estimation problem (1.1) under the condition that $S \in W_r^q$ with the known parameters $q$, $r$ and $\varsigma(S)$ defined in (2.4). In this case we use the estimator $\tilde S_n = \hat S_{\tau_\varepsilon}$ defined in (3.5) with $\tau_\varepsilon = (q, r_\varepsilon)$ from $\mathcal{A}_\varepsilon$, where $r_\varepsilon = \inf\{ i \ge 1 : i\varepsilon \ge r/\varsigma(S) \}$ and $\varepsilon = \varepsilon_n = 1/\ln(n+1)$.

Theorem 4.1. Under the condition $H_1$)

$$\limsup_{n\to\infty}\ \varphi_n \sup_{S\in W_r^q} \frac{1}{\gamma_q(S)}\, \mathcal{R}(\tilde S_n, S) \le 1, \tag{4.1}$$

where $\varphi_n = n^{2q/(2q+1)}$.

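Proof. First, note that from (3.5), taking into account that $\|\phi_j\|_n \ge 1$, we obtain that

$$\mathbf{E}_S \|\tilde S_n - S\|_n^2 \le \sum_{j=1}^{n} (1 - \tilde\rho_j)^2\, \frac{\theta_{j,n}^2}{\|\phi_j\|_n^2} + \frac{1}{n}\sum_{j=1}^{n} \tilde\rho_j^2\, m_{j,n},$$

where $\tilde\rho_j = \rho_{\tau_\varepsilon}(j)$ and $m_{j,n}$ is defined in (3.13). Moreover, putting $\tilde k_0 = [k_0(\tau_\varepsilon)]$ and $\tilde k_1 = [w_{\tau_\varepsilon} \cdot \ln(n+1)]$, we estimate the risk $\mathcal{R}(\tilde S_n, S)$ as

$$\mathcal{R}(\tilde S_n, S) \le \sum_{j=\tilde k_0}^{\tilde k_1 - 1} (1 - \tilde\rho_j)^2\, \theta_{j,n}^2 + \varsigma_n\, \frac{1}{n}\sum_{j=1}^{n} \tilde\rho_j^2 + \Delta_1(n) + \Delta_2(n)$$

with

$$\Delta_1(n) = \sum_{j=\tilde k_1}^{n} \frac{\theta_{j,n}^2}{\|\phi_j\|_n^2}$$

and

$$\Delta_2(n) = \frac{1}{n}\sum_{j=1}^{n} \tilde\rho_j^2\, \tilde m_{j,n} = \frac{1}{n^2}\sum_{d=1}^{n} \sigma_d^2 \sum_{j=1}^{n} \tilde\rho_j^2\, \overline{\phi}_j(x_d),$$

where $\overline{\phi}_j(x) = \phi_j^2(x) - 1$ as in (3.13). The last inequality implies that, for any fixed $0 < \delta < 1$,

$$\mathcal{R}(\tilde S_n, S) \le (1+\delta) \sum_{j=\tilde k_0}^{\tilde k_1 - 1} (1 - \tilde\rho_j)^2\, \theta_j^2 + \varsigma_n\, \frac{1}{n}\sum_{j=1}^{n} \tilde\rho_j^2 + \Delta_1(n) + \Delta_2(n) + (1 + 1/\delta)\, \Delta_3(n),$$

where $\Delta_3(n) = \sum_{j=\tilde k_0}^{\tilde k_1 - 1} (\theta_{j,n} - \theta_j)^2$ and

$$\theta_j = \int_0^1 S(x)\, \phi_j(x)\, dx. \tag{4.2}$$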

Lemmas 6.1–6.3 immediately imply that

$$\lim_{n\to\infty}\ \sup_{S\in W_r^q}\ \varphi_n \sum_{l=1}^{3} |\Delta_l(n)| = 0.$$

Therefore, denoting $\varphi_n \Delta_l(n)$ as $o(1)$ yields

$$\varphi_n\, \mathcal{R}(\tilde S_n, S) \le (1+\delta)\, \varphi_n \sum_{j=\tilde k_0}^{\tilde k_1 - 1} (1 - \tilde\rho_j)^2\, \theta_j^2 + \varsigma_n\, n^{-1/(2q+1)} \sum_{j=1}^{n} \tilde\rho_j^2 + o(1).$$

Further, for any $S \in W_r^q$, denoting $a_j = \sum_{i=0}^{q} (2\pi [j/2])^{2i}$, we have that

$$\sum_{j\ge 1} a_j\, \theta_j^2 \le r. \tag{4.3}$$

Therefore, in view of (3.4),

$$\limsup_{n\to\infty}\ \sup_{j\ge \tilde k_0}\ \varphi_n\, \frac{(1 - \tilde\rho_j)^2}{a_j} \le \frac{1}{\pi^{2q}\, c_0^{2q}},$$

where $c_0 = c_{\tau_0}$ with $\tau_0 = (q, r/\varsigma(S))$. From here, by taking into account the conditions (2.4)–(2.5), we obtain, for sufficiently large $n$, the following upper bound for the quadratic risk

$$\varphi_n\, \mathcal{R}(\tilde S_n, S) \le (1 + 2\delta) \left( \frac{r}{\pi^{2q}\, c_0^{2q}} + \varsigma(S)\, c_0 \int_0^1 \Psi_q^2(z)\, dz \right) + o(1) = (1 + 2\delta)\, \gamma_q(S) + o(1).$$

This implies the inequality (4.1). Hence Theorem 4.1.
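The last equality in the proof, namely that the bias term $r/(\pi^{2q}c_0^{2q})$ plus the variance term $\varsigma(S)\,c_0\int_0^1\Psi_q^2(z)\,dz$ equals $\gamma_q(S)$, can be verified numerically. The sketch below uses the closed form $\int_0^1 (1-z^q)^2\,dz = 2q^2/((q+1)(2q+1))$; the function names and the sample values of $r$ and $\varsigma$ in the usage note are arbitrary choices for illustration.

```python
import math

def gamma_q(q, r, varsigma):
    """Pinsker constant gamma_q(S) from (2.7)."""
    c_star = ((2 * q + 1) ** (1 / (2 * q + 1))
              * (q / (math.pi * (q + 1))) ** (2 * q / (2 * q + 1)))
    return c_star * r ** (1 / (2 * q + 1)) * varsigma ** (2 * q / (2 * q + 1))

def limiting_risk(q, r, varsigma):
    """Bias plus variance term from the end of the proof of Theorem 4.1."""
    iota = q / ((q + 1) * (2 * q + 1))
    # c_0 = c_tau evaluated at tau_0 = (q, r / varsigma), see (3.4).
    c0 = (math.pi ** (-2 * q / (2 * q + 1))
          * (r / (varsigma * iota)) ** (1 / (2 * q + 1)))
    bias = r / (math.pi ** (2 * q) * c0 ** (2 * q))
    psi_sq_integral = 2 * q * q / ((q + 1) * (2 * q + 1))  # integral of (1 - z^q)^2 on [0, 1]
    variance = varsigma * c0 * psi_sq_integral
    return bias + variance
```

For example, `limiting_risk(2, 2.5, 0.7)` and `gamma_q(2, 2.5, 0.7)` agree up to rounding, which is exactly the statement that the weights (3.4) with $t = r/\varsigma(S)$ balance bias and variance at the Pinsker constant.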

Now Theorem 3.1 and Theorem 4.1 imply immediately Theorem 2.1.

5 Asymptotic lower bound

In this section we prove Theorem 2.2. For this we need the next auxiliary result.

Lemma 5.1. For any $0 < \varepsilon < 1$ and any estimate $\hat S_n$ of $S$,

$$\|\hat S_n - S\|_n^2 \ge (1 - \varepsilon)\, \|T_n(\hat S) - S\|^2 - (\varepsilon^{-1} - 1)\, r/n^2,$$

where $T_n(\hat S)(x) = \sum_{k=1}^{n} \hat S_n(x_k)\, \mathbf{1}_{(x_{k-1}, x_k]}(x)$.

The proof of this Lemma is given in Appendix 6.5.

From this Lemma we deduce that to prove (2.9), it suffices to show that

$$\liminf_{n\to\infty}\ \inf_{\hat S_n}\ \varphi_n\, \mathcal{R}_0(\hat S_n) \ge 1, \tag{6.1}$$

where

$$\mathcal{R}_0(\hat S_n) = \sup_{S\in W_r^q} \frac{\mathbf{E}_S \|\hat S_n - S\|^2}{\gamma_q(S)}.$$

For $\eta > 0$ and $x \in \mathbb{R}$, denote

$$I_\eta(x) = \eta^{-1} \int_{\mathbb{R}} \mathbf{1}_{(|u|\le 1-\eta)}\, G\Big( \frac{u - x}{\eta} \Big)\, du,$$

where $\mathbf{1}_A$ is the indicator of a set $A$ and the kernel $G \in C^\infty(\mathbb{R})$ is such that

$$G(-u) = G(u) \ \text{for } |u| \le 1, \quad G(u) = 0 \ \text{for } |u| \ge 1, \quad \int_{-1}^{1} G(u)\, du = 1.$$

It is easy to see that the function $I_\eta(x)$ possesses the following properties: $I_\eta$ is symmetric,

$$I_\eta(x) = \begin{cases} 1 & \text{for } |x| \le 1 - 2\eta, \\ 0 & \text{for } |x| \ge 1, \end{cases}$$

and

$$\int_{\mathbb{R}} f(x)\, I_\eta(x)\, dx \to \int_{-1}^{1} f(x)\, dx, \qquad \int_{\mathbb{R}} f(x)\, I_\eta^2(x)\, dx \to \int_{-1}^{1} f(x)\, dx, \qquad \eta \to 0, \tag{6.2}$$

uniformly over all functions bounded by any fixed constant $c > 0$, i.e. $\sup_f \sup_{-1\le x\le 1} |f(x)| \le c$.

We choose the trigonometric basis in $L_2[-1,1]$ as follows

$$e_1(x) = 1/\sqrt{2}, \quad e_2(x) = \cos(\pi x), \quad e_3(x) = \sin(\pi x), \ \dots, \quad e_{2i}(x) = \cos(i\pi x), \quad e_{2i+1}(x) = \sin(i\pi x), \ \dots. \tag{6.3}$$

For any array $z = \{ z_{m,j},\ 1 \le m \le M,\ 1 \le j \le N \}$ with $M = [1/(2h)] - 1$ and $N$ a positive integer, we denote

$$S_z(x) = \sum_{m=1}^{M}\sum_{j=1}^{N} z_{m,j}\, D_{m,j}(x), \tag{6.4}$$

where

$$D_{m,j}(x) = e_j(v_m(x))\, I_\eta(v_m(x)), \quad v_m(x) = (x - a_m)/h, \quad a_m = 2mh, \quad h = L_n\, n^{-1/(2q+1)}, \quad L_n = \ln(n+1).$$

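To construct an a priori distribution on the family of arrays, we choose the following random array $\vartheta = \{ \vartheta_{m,j},\ 1 \le m \le M,\ 1 \le j \le N \}$ with

$$\vartheta_{m,j} = \delta_{m,j}\, \hat\xi_{m,j}, \quad \delta_{m,j} = \frac{g_0(a_m)\, \kappa_j}{\sqrt{nh}}, \tag{6.5}$$

where

$$\kappa_j^2 = \Big( \Big( \frac{N}{j} \Big)^q - 1 \Big)_+, \quad N = [L_n N_\varepsilon^*], \quad g_0(\cdot) = g(\cdot, 0),$$

$$N_\varepsilon^* = 2 \left( \frac{\tilde r_\varepsilon\, (q+1)(2q+1)}{q\, \pi^{2q}} \right)^{1/(2q+1)}, \quad \tilde r_\varepsilon = \frac{(1-\varepsilon)\, r}{\varsigma(0)}, \tag{6.6}$$

$\varepsilon$ is a small positive real, $0 < \varepsilon < 1$, $\varsigma(0) = \int_0^1 g^2(x, 0)\, dx$, and $(\hat\xi_{m,j})$ are i.i.d. bounded random variables with a density $\rho_d(x)$ such that

$$\mathbf{E}\, \hat\xi_{m,j} = 0, \quad \mathbf{E}\, \hat\xi_{m,j}^2 = 1, \quad |\hat\xi_{m,j}| \le d, \quad i_d = \int_{-d}^{d} \frac{(\dot\rho_d(x))^2}{\rho_d(x)}\, dx \to 1 \quad\text{as } d \to \infty. \tag{6.7}$$

The construction of such a density $\rho_d$ is given in Appendix 6.6.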

For any estimator $\hat S_n$, we denote by $\hat S_n^0$ its projection on $W_r^q$, i.e. $\hat S_n^0 = \mathrm{Pr}_{W_r^q}(\hat S_n)$. Since $W_r^q$ is a convex set we get that $\|\hat S_n - S\|^2 \ge \|\hat S_n^0 - S\|^2$. Therefore, for sufficiently large $n$,

$$\mathcal{R}_0(\hat S_n) = \sup_{S\in W_r^q} \frac{\mathbf{E}_S \|\hat S_n - S\|^2}{\gamma_q(S)} \ge \sup_{S\in W_r^q} \frac{\mathbf{E}_S \|\hat S_n^0 - S\|^2}{\gamma_q(S)}$$

$$\ge \int_{\{z\,:\,S_z\in W_r^q\}} \frac{\mathbf{E}_z \|\hat S_n^0 - S_z\|^2}{\gamma_q(S_z)}\, \mu_\vartheta(dz) \ge \frac{1}{\gamma_q^*(\varepsilon)} \int_{\{z\,:\,S_z\in W_r^q\}} \mathbf{E}_z \|\hat S_n^0 - S_z\|^2\, \mu_\vartheta(dz),$$

where $\gamma_q^*(\varepsilon) = \sup_{\|S\|\le\varepsilon} \gamma_q(S)$ (the last inequality is true since $\|S_\vartheta\| \to 0$ as $n \to \infty$). Here we denote by $\mathbf{E}_z$ the expectation with respect to the distribution of the process (1.1) with $S = S_z$, and the measure $\mu_\vartheta$ denotes the distribution of $\vartheta$ in $\mathbb{R}^\iota$ ($\iota = MN$), which is the a priori distribution for the Bayes risk. The last inequality implies that

$$\mathcal{R}_0(\hat S_n^0) \ge \frac{1}{\gamma_q^*(\varepsilon)}\, \tilde{\mathcal{R}}_0(\hat S_n^0) - \frac{2}{\gamma_q^*(\varepsilon)}\, \omega_n, \tag{6.8}$$

where

$$\tilde{\mathcal{R}}_0(\hat S_n^0) = \int \mathbf{E}_z \|\hat S_n^0 - S_z\|^2\, \mu_\vartheta(dz), \quad \omega_n = \int_{\{z\,:\,S_z\notin W_r^q\}} (r + \|S_z\|^2)\, \mu_\vartheta(dz).$$

To reduce the nonparametric problem to a parametric one, we replace the functions $\hat S_n^0$ and $S$ by their Fourier series with respect to the basis $(\tilde e_{m,i})$, where

$$\tilde e_{m,i}(x) = \frac{1}{\sqrt{h}}\, e_i(v_m(x))\, I(v_m(x)), \quad 1 \le m \le M, \quad 1 \le i \le N,$$

and $I(x) = \mathbf{1}_{(|x|\le 1)}$. We estimate the term $\|\hat S_n^0 - S_z\|^2$ as

$$\|\hat S_n^0 - S_z\|^2 \ge \sum_{m=1}^{M}\sum_{j=1}^{N} \big( \hat\lambda_{m,j} - \lambda_{m,j}(z) \big)^2.$$

Here $\hat\lambda_{m,j}$ and $\lambda_{m,j}$ denote the Fourier coefficients of the functions $\hat S_n^0$ and $S$, respectively, i.e.

$$\hat\lambda_{m,j} = \int_0^1 \hat S_n^0(x)\, \tilde e_{m,j}(x)\, dx \quad\text{and}\quad \lambda_{m,j}(z) = \int_0^1 S_z(x)\, \tilde e_{m,j}(x)\, dx.$$

Due to the definition of the function $\lambda_{m,j}(z)$ we obtain that

$$\lambda_{m,j}(z) = \sum_{l=1}^{M}\sum_{i=1}^{N} z_{l,i} \int_0^1 D_{l,i}(x)\, \tilde e_{m,j}(x)\, dx$$

$$= \sum_{l=1}^{M}\sum_{i=1}^{N} \frac{z_{l,i}}{\sqrt{h}} \int_0^1 e_i(v_l(x))\, e_j(v_m(x))\, I_\eta(v_l(x))\, I(v_m(x))\, dx$$

$$= \sqrt{h} \sum_{i=1}^{N} z_{m,i} \int_{-1}^{1} e_i(u)\, e_j(u)\, I_\eta(u)\, du.$$

Therefore

$$\Lambda_{m,j} = \frac{\partial \lambda_{m,j}}{\partial z_{m,j}} = \sqrt{h} \int_{-1}^{1} e_j^2(v)\, I_\eta(v)\, dv = \sqrt{h}\, (e_j^2, I_\eta).$$

By the van Trees inequality (see Appendix 6.7) one gets

$$\tilde{\mathcal{R}}_0(\hat S_n^0) \ge \sum_{m=1}^{M}\sum_{j=1}^{N} \frac{\Lambda_{m,j}^2}{A_{m,j} + B_{m,j} + I_{m,j}} = \sum_{m=1}^{M}\sum_{j=1}^{N} \frac{h\, (e_j^2, I_\eta)^2}{A_{m,j} + B_{m,j} + (\delta_{m,j})^{-2}\, i_d}, \tag{6.9}$$

where $I_{m,j}$ is the Fisher information relative to the random variable $\vartheta_{m,j}$,

$$A_{m,j} = \sum_{k=1}^{n} \int_{\mathbb{R}^\iota} \frac{1}{g^2(x_k, S_z)} \left( \frac{\partial S_z(x_k)}{\partial z_{m,j}} \right)^2 \mu_\vartheta(dz),$$

$$B_{m,j} = \sum_{k=1}^{n} \int_{\mathbb{R}^\iota} \left( \frac{\tilde L_{m,j}(x_k, S_z)}{g(x_k, S_z)} \right)^2 \mu_\vartheta(dz), \quad \tilde L_{m,j}(x, S_z) = L_{x,S_z}\Big( \frac{\partial S_z}{\partial z_{m,j}} \Big).$$

First note that from (6.4) we obtain that

$$A_{m,j} = \sum_{k=1}^{n} D_{m,j}^2(x_k) \int_{\mathbb{R}^\iota} \frac{1}{g^2(x_k, S_z)}\, \mu_\vartheta(dz) = (1 + o(1)) \sum_{k=1}^{n} \frac{1}{g_0^2(x_k)}\, D_{m,j}^2(x_k)$$

$$= n h\, (1 + o(1))\, \frac{1}{g_0^2(a_m)} \int_{-1}^{1} e_j^2(x)\, I_\eta^2(x)\, dx = n h\, (1 + o(1))\, \frac{1}{g_0^2(a_m)}\, (e_j^2, I_\eta^2).$$

Now we study the behaviour of $B_{m,j}$. Due to the inequality (2.6) and the definition (6.4), we obtain that

$$|\tilde L_{m,j}(x, S_z)| \le C^* \Big\| \frac{\partial S_z}{\partial z_{m,j}} \Big\|_1 = C^*\, \|D_{m,j}\|_1 \le C^* h.$$

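Therefore by the condition $H_3$) we get

$$|B_{m,j}| \le C^*\, \frac{1}{g_*^2}\, n h^2.$$

Taking into account this inequality, the inequality (6.9) and the definition of $\delta_{m,j}$ in (6.5), we evaluate the Bayes risk as

$$\frac{1}{\gamma_q^*(\varepsilon)}\, \tilde{\mathcal{R}}_0(\hat S_n^0) \ge \frac{1}{n\, \gamma_q^*(\varepsilon)} \sum_{m=1}^{M} g_0^2(a_m) \sum_{j=1}^{N} \frac{\alpha_j(\eta)}{(1 + o(1))\, \beta_j(\eta) + (\kappa_j)^{-2}\, i_d}$$

$$\ge \frac{(1 + o(1))\, \varsigma(0)}{2 n h\, \gamma_q^*(\varepsilon)} \sum_{j=1}^{N} \frac{\alpha_j(\eta)}{(1 + o(1))\, \beta_j(\eta) + (\kappa_j)^{-2}\, i_d}$$

$$= \frac{(1 + o(1))\, \varsigma(0)}{2\, \gamma_q^*(\varepsilon)\, n^{2q/(2q+1)}}\, \frac{1}{L_n} \sum_{j=1}^{N} \frac{\alpha_j(\eta)}{(1 + o(1))\, \beta_j(\eta) + (\kappa_j)^{-2}\, i_d},$$

where we denote $\alpha_j(\eta) = (e_j^2, I_\eta)^2$ and $\beta_j(\eta) = (e_j^2, I_\eta^2)$.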

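In Appendix 6.9 we show that

$$\lim_{n\to\infty} \varphi_n\, \omega_n = 0. \tag{6.10}$$

This result and the previous inequality imply

$$\liminf_{n\to\infty}\ \inf_{\hat S_n}\ \varphi_n\, \mathcal{R}_0(\hat S_n) \ge \frac{\varsigma(0)\, N_\varepsilon^*}{2\, \gamma_q^*(\varepsilon)}\, \Omega_*(\eta, d)\, \liminf_{n\to\infty} \frac{1}{N}\sum_{j=1}^{N} \frac{\kappa_j^2}{\kappa_j^2 + 1}$$

$$= \frac{\varsigma(0)\, N_\varepsilon^*}{2\, \gamma_q^*(\varepsilon)}\, \Omega_*(\eta, d)\, \lim_{n\to\infty} \frac{1}{N}\sum_{j=1}^{N} \big( 1 - (j/N)^q \big)$$

$$= \frac{\varsigma(0)\, N_\varepsilon^*}{2\, \gamma_q^*(\varepsilon)}\, \frac{q}{q+1}\, \Omega_*(\eta, d) = (1 - \varepsilon)^{1/(2q+1)}\, \frac{\gamma_q^0}{\gamma_q^*(\varepsilon)}\, \Omega_*(\eta, d), \tag{6.11}$$

where $\gamma_q^0 = \gamma_q(0)$, $\Omega_*(\eta, d) = \liminf_{N\to\infty} \Omega_N(\eta, d)$ and

$$\Omega_N(\eta, d) = \frac{\sum_{j=1}^{N} \big( \alpha_j(\eta)\, \kappa_j^2 \big) \big( \beta_j(\eta)\, \kappa_j^2 + i_d \big)^{-1}}{\sum_{j=1}^{N} \kappa_j^2\, (\kappa_j^2 + 1)^{-1}}. \tag{6.12}$$

In Appendix 6.8 we show that

$$\lim_{\eta\to 0}\ \lim_{d\to\infty}\ \Omega_*(\eta, d) = 1. \tag{6.13}$$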
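Two elementary facts are used in the chain of equalities (6.11): the Riemann-sum limit $\frac1N\sum_{j\le N}(1-(j/N)^q)\to q/(q+1)$, and the identity $\varsigma(0)\,N_\varepsilon^*\,q/(2(q+1)) = (1-\varepsilon)^{1/(2q+1)}\gamma_q(0)$, which follows from (2.7) and (6.6). The sketch below checks both numerically; the function names and parameter values are illustrative assumptions.

```python
import math

def gamma_q(q, r, varsigma):
    """Pinsker constant (2.7)."""
    c_star = ((2 * q + 1) ** (1 / (2 * q + 1))
              * (q / (math.pi * (q + 1))) ** (2 * q / (2 * q + 1)))
    return c_star * r ** (1 / (2 * q + 1)) * varsigma ** (2 * q / (2 * q + 1))

def n_star(q, r, varsigma0, eps):
    """N_eps^* from (6.6), with r_tilde = (1 - eps) * r / varsigma(0)."""
    r_tilde = (1 - eps) * r / varsigma0
    return 2 * (r_tilde * (q + 1) * (2 * q + 1)
                / (q * math.pi ** (2 * q))) ** (1 / (2 * q + 1))

def riemann_mean(q, N):
    """(1/N) * sum_{j=1}^{N} (1 - (j/N)^q), which tends to q / (q + 1)."""
    return sum(1 - (j / N) ** q for j in range(1, N + 1)) / N

def lower_bound_constant(q, r, varsigma0, eps):
    """varsigma(0) * N_eps^* / 2 * q / (q + 1), the constant in (6.11)."""
    return varsigma0 * n_star(q, r, varsigma0, eps) / 2 * q / (q + 1)
```

For instance, `lower_bound_constant(2, 1.5, 0.8, 0.1)` coincides with `(1 - 0.1) ** (1 / 5) * gamma_q(2, 1.5, 0.8)` up to rounding, so the lower bound indeed approaches the Pinsker constant as $\varepsilon \to 0$.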


The condition $H_1$) implies $\gamma_q^*(\varepsilon) \to \gamma_q^0$ as $\varepsilon \to 0$. Therefore passing to the limit $\varepsilon \to 0$, $\eta \to 0$, $d \to \infty$ in (6.11) yields

$$\liminf_{n\to\infty}\ \inf_{\hat S_n}\ \varphi_n\, \tilde{\mathcal{R}}_0(\hat S_n) \ge 1.$$

Thus (6.8) implies the inequality (6.1), from which (2.9) follows through Lemma 5.1.

6 Appendix

6.1 Properties of trigonometric basis

Lemma 6.1. Let θ

j

,θ

_j,n

be coefficients from (3.3) and (4.2), respectively.

Then for 1 ≤ j ≤ n and n ≥ 2 sup

S∈Wr^q

| θ

_j,n

− θ

_j

| ≤ 2π √ r j

n . (A.1)

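Proof. Indeed, we have

$$
\begin{aligned}
|\theta_{j,n} - \theta_j|
&= \left| \sum_{l=1}^{n} \int_{x_{l-1}}^{x_l} \bigl( S(x_l)\phi_j(x_l) - S(x)\phi_j(x) \bigr)\, dx \right| \\
&\le \frac{1}{n} \sum_{l=1}^{n} \int_{x_{l-1}}^{x_l} \bigl( |\dot S(z)\phi_j(z)| + |S(z)\dot\phi_j(z)| \bigr)\, dz \\
&= \frac{1}{n} \int_0^1 \bigl( |\dot S(z)|\,|\phi_j(z)| + |S(z)|\,|\dot\phi_j(z)| \bigr)\, dz .
\end{aligned}
$$

By the Bunyakovsky-Cauchy-Schwarz inequality we get

$$
|\theta_{j,n} - \theta_j| \le \frac{1}{n} \bigl( \|\dot S\|\, \|\phi_j\| + \|\dot\phi_j\|\, \|S\| \bigr) \le \frac{1}{n} \bigl( \|\dot S\| + \pi j\, \|S\| \bigr) .
$$

The definition of the class $W_r^q$ in (2.2) implies (A.1). Hence Lemma 6.1.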
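The intermediate bound $|\theta_{j,n} - \theta_j| \le (\|\dot S\| + \pi j \|S\|)/n$ from the proof above can be checked numerically. The sketch below is only an illustration under stated assumptions: the basis is taken to be $\phi_1 = 1$, $\phi_j = \sqrt{2}\cos(2\pi[j/2]x)$ for even $j$ and $\sqrt{2}\sin(2\pi[j/2]x)$ for odd $j \ge 3$ (the form used in the proof of Lemma 6.2), the design points are assumed to be $x_l = l/n$, and the test function $S(x) = x(1-x)$ is a hypothetical choice that is not claimed to belong to $W_r^q$.

```python
import math


def phi(j: int, x: float) -> float:
    """Assumed trigonometric basis on [0, 1] (see the lead-in)."""
    if j == 1:
        return 1.0
    arg = 2.0 * math.pi * (j // 2) * x
    return math.sqrt(2.0) * (math.cos(arg) if j % 2 == 0 else math.sin(arg))


def S(x: float) -> float:
    """Hypothetical test function, chosen only for illustration."""
    return x * (1.0 - x)


NORM_S = math.sqrt(1.0 / 30.0)   # L2[0,1] norm of S(x) = x(1 - x)
NORM_DS = math.sqrt(1.0 / 3.0)   # L2[0,1] norm of S'(x) = 1 - 2x


def theta(j: int, M: int = 5000) -> float:
    """Continuous coefficient int_0^1 S(x) phi_j(x) dx (midpoint rule, M cells)."""
    return sum(S((i + 0.5) / M) * phi(j, (i + 0.5) / M) for i in range(M)) / M


def theta_n(j: int, n: int) -> float:
    """Discrete coefficient (1/n) sum_l S(x_l) phi_j(x_l) with x_l = l/n."""
    return sum(S(l / n) * phi(j, l / n) for l in range(1, n + 1)) / n


def check(n: int):
    """Return (j, |theta_{j,n} - theta_j|, bound) for j = 1, ..., n."""
    rows = []
    for j in range(1, n + 1):
        diff = abs(theta_n(j, n) - theta(j))
        bound = (NORM_DS + math.pi * j * NORM_S) / n
        rows.append((j, diff, bound))
    return rows


if __name__ == "__main__":
    for j, diff, bound in check(10):
        print(f"j={j:2d}  difference={diff:.5f}  bound={bound:.5f}")
```

For this smooth test function the observed differences are far below the bound, which is expected since the bound only uses first-derivative information.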

Lemma 6.2. For any $m \ge 0$,

$$
\sup_{N}\, \sup_{x \in [0,1]} N^{-m} \left| \sum_{l=2}^{N} l^m \bigl( \phi_l^2(x) - 1 \bigr) \right| \le 2^m . \tag{A.2}
$$


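Proof. By the properties of the trigonometric functions we get

$$
\begin{aligned}
\sum_{l=2}^{N} l^m \bigl( \phi_l^2(x) - 1 \bigr)
&= \sum_{1 \le l \le N/2} (2l)^m \bigl[ 2\cos^2(2\pi l x) - 1 \bigr] + \sum_{1 \le l \le (N-1)/2} (2l+1)^m \bigl( 2\sin^2(2\pi l x) - 1 \bigr) \\
&= \sum_{1 \le l \le N/2} (2l)^m \cos(4\pi l x) - \sum_{1 \le l \le (N-1)/2} (2l+1)^m \cos(4\pi l x) .
\end{aligned}
$$

This yields

$$
\begin{aligned}
\left| \sum_{l=2}^{N} l^m \bigl( \phi_l^2(x) - 1 \bigr) \right|
&\le \left| \sum_{1 \le l \le (N-1)/2} \bigl( (2l+1)^m - (2l)^m \bigr) \cos(4\pi l x) \right| + N^m \\
&\le \sum_{1 \le l \le (N-1)/2} \bigl| (2l+1)^m - (2l)^m \bigr| + N^m \\
&= \sum_{1 \le l \le (N-1)/2}\ \sum_{j=0}^{m-1} \binom{m}{j} (2l)^j + N^m .
\end{aligned}
$$

This implies (A.2).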
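A brute-force numerical check of (A.2) can be sketched as follows. It assumes the trigonometric basis in the form read off from the proof ($\phi_1 = 1$, even indices give $\sqrt{2}\cos$, odd indices give $\sqrt{2}\sin$), reads the left-hand side of (A.2) in absolute value, and searches only over a finite range of $N$ and a finite grid in $x$; the names `normalized_sum` and `worst_case` are hypothetical helpers.

```python
import math


def phi(j: int, x: float) -> float:
    """Assumed trigonometric basis on [0, 1] (see the lead-in)."""
    if j == 1:
        return 1.0
    arg = 2.0 * math.pi * (j // 2) * x
    return math.sqrt(2.0) * (math.cos(arg) if j % 2 == 0 else math.sin(arg))


def normalized_sum(m: int, N: int, x: float) -> float:
    """Return N^{-m} * | sum_{l=2}^{N} l^m (phi_l(x)^2 - 1) |."""
    total = sum(l ** m * (phi(l, x) ** 2 - 1.0) for l in range(2, N + 1))
    return abs(total) / N ** m


def worst_case(m: int, N_max: int = 40, grid: int = 200) -> float:
    """Largest value of normalized_sum over 2 <= N <= N_max and x = i/grid."""
    return max(
        normalized_sum(m, N, i / grid)
        for N in range(2, N_max + 1)
        for i in range(grid + 1)
    )


if __name__ == "__main__":
    for m in range(4):
        print(f"m={m}  observed maximum={worst_case(m):.4f}  bound 2^m={2 ** m}")
```

A finite search of this kind cannot replace the proof; it only confirms that the bound $2^m$ is not violated on the grid that was examined.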

Lemma 6.3. For any function $S \in W_r^q$,

$$
\sup_{n \ge 1}\ \sup_{1 \le m \le n-1} m^{2q} \left( \sum_{j=m+1}^{n} \theta_{j,n}^2\, \|\phi_j\|_n^2 \right) \le \frac{2r}{\pi^{2(q-1)}}\,, \tag{A.3}
$$

where $\nu_n$ is defined by (3.2).

Proof. First, note that any function $S$ from $W_r^q$ can be represented by its Fourier series, i.e. $S = \sum_{j=1}^{\infty} \theta_j \phi_j$ with the coefficients defined by (4.2).

Denoting the residual term for $S$ by

$$
\Delta_m(x) = S(x) - \sum_{j=1}^{m} \theta_j \phi_j(x) = \sum_{j=m+1}^{\infty} \theta_j \phi_j(x) ,
$$

we obtain that

$$
\sum_{j=m+1}^{n} \theta_{j,n}^2\, \|\phi_j\|_n^2 = \inf_{\alpha_1, \ldots, \alpha_m} \Bigl\| S - \sum_{j=1}^{m} \alpha_j \phi_j \Bigr\|_n^2 \le \|\Delta_m\|_n^2 .
$$


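Moreover,

$$
\begin{aligned}
\|\Delta_m\|_n^2 = \frac{1}{n} \sum_{k=1}^{n} \Delta_m^2(x_k)
&= \sum_{k=1}^{n} \int_{x_{k-1}}^{x_k} \Delta_m^2(x_k)\, dx \\
&\le 2 \int_0^1 \Delta_m^2(x)\, dx + 2 \sum_{k=1}^{n} \int_{x_{k-1}}^{x_k} \bigl( \Delta_m(x_k) - \Delta_m(x) \bigr)^2 dx .
\end{aligned}
$$

We estimate the last term in this inequality as

$$
\bigl( \Delta_m(x_k) - \Delta_m(x) \bigr)^2 = \left( \int_x^{x_k} \dot\Delta_m(z)\, dz \right)^2 \le \frac{1}{n} \int_{x_{k-1}}^{x_k} \bigl( \dot\Delta_m(z) \bigr)^2 dz .
$$

Therefore

$$
\|\Delta_m\|_n^2 \le 2\, \|\Delta_m\|^2 + \frac{2}{n^2}\, \|\dot\Delta_m\|^2 = 2 \sum_{j=m+1}^{\infty} \theta_j^2 + \frac{2}{n^2} \sum_{j=m+1}^{\infty} \theta_j^2\, \|\dot\phi_j\|^2 .
$$

Taking into account the inequality (4.3), we obtain (A.3). Hence Lemma 6.3.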

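Lemma 6.4. Let $\xi_{j,n}$ be defined in (3.3) for the model (1.1)–(2.1). Then, for any real numbers $v_1, \ldots, v_n$,

$$
E \left( \sum_{j=1}^{n} v_j \xi_{j,n} \right)^4 \le \xi^* \left( E \left( \sum_{j=1}^{n} v_j \xi_{j,n} \right)^2 \right)^2 , \qquad
E \left( \sum_{j=1}^{n} v_j \xi_{j,n} \right)^2 \le \sigma_*\, \nu_n \sum_{j=1}^{n} v_j^2 , \tag{A.4}
$$

where $\sigma_* = \max_{1 \le j \le n} \sigma_j^2$ and $\nu_n$ is defined in (3.2).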

Proof. By the definition of $\xi_{j,n}$ we obtain that $\sum_{j=1}^{n} v_j \xi_{j,n} = \sum_{l=1}^{n} \sigma_l \tilde v_l \xi_l$ with $\tilde v_l = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} v_j \phi_j(x_l)$. This immediately implies the first inequality in (A.4). Moreover,

$$
E \left( \sum_{j=1}^{n} v_j \xi_{j,n} \right)^2 = \sum_{l=1}^{n} \sigma_l^2\, \tilde v_l^2 \le \sigma_* \sum_{l=1}^{n} \tilde v_l^2 = \sigma_* \sum_{i,j=1}^{n} v_i v_j\, (\phi_i, \phi_j)_n .
$$


The orthogonality properties of the basis $(\phi_j)$ and (3.2) imply the second inequality in (A.4). Hence Lemma 6.4.
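The second inequality in (A.4) can be illustrated numerically. The sketch below makes several assumptions that are not confirmed by this section: the trigonometric basis has the form used in the proof of Lemma 6.2, the design points are $x_l = l/n$, and the largest diagonal entry of the empirical Gram matrix is used as a stand-in for $\nu_n$ from (3.2). The variances `sigma2` and weights `v` are arbitrary illustrative values, and `gram` and `second_moment` are hypothetical helper names.

```python
import math


def phi(j: int, x: float) -> float:
    """Assumed trigonometric basis on [0, 1] (see the lead-in)."""
    if j == 1:
        return 1.0
    arg = 2.0 * math.pi * (j // 2) * x
    return math.sqrt(2.0) * (math.cos(arg) if j % 2 == 0 else math.sin(arg))


def gram(n: int):
    """Empirical Gram matrix (phi_i, phi_j)_n = (1/n) sum_l phi_i(x_l) phi_j(x_l)."""
    xs = [l / n for l in range(1, n + 1)]
    return [
        [sum(phi(i, x) * phi(j, x) for x in xs) / n for j in range(1, n + 1)]
        for i in range(1, n + 1)
    ]


def second_moment(v, sigma2):
    """sum_l sigma_l^2 * v_tilde_l^2, the second moment computed in the proof."""
    n = len(v)
    total = 0.0
    for l in range(1, n + 1):
        v_tilde = sum(v[j - 1] * phi(j, l / n) for j in range(1, n + 1)) / math.sqrt(n)
        total += sigma2[l - 1] * v_tilde ** 2
    return total


if __name__ == "__main__":
    for n in (6, 7):
        G = gram(n)
        nu = max(G[j][j] for j in range(n))           # stand-in for nu_n
        v = [(-1) ** j * j for j in range(1, n + 1)]  # illustrative weights
        sigma2 = [1.0 + l / n for l in range(1, n + 1)]
        lhs = second_moment(v, sigma2)
        rhs = max(sigma2) * nu * sum(x * x for x in v)
        print(f"n={n}  diagonal={[round(G[j][j], 6) for j in range(n)]}")
        print(f"      second moment={lhs:.4f}  bound={rhs:.4f}")
```

Under these assumptions the Gram matrix is diagonal, with unit entries except for the last one when $n$ is even, which equals 2 because $\phi_n(x_l) = \sqrt{2}\cos(\pi l) = \pm\sqrt{2}$ on the grid.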

6.2 Proof of (3.21)

Indeed, we have

$$
E_S V_n^*(\tau) \le \sum_{i=1}^{3} \sum_{\tau \ne \tau_1} E_S\, |N_{i,n}(\tau) - N_{i,n}(\tau_1)| \le \upsilon_\varepsilon \sum_{i=1}^{3} \sup_{\tau \in A^\varepsilon} E_S\, |N_{i,n}(\tau) - N_{i,n}(\tau_1)| . \tag{A.5}
$$

We start with the first term on the right-hand side. We have

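$$
E_S\, |N_{1,n}(\tau) - N_{1,n}(\tau_1)| = \frac{1}{n}\, E_S \left| \sum_{j=1}^{n} \bigl( \rho_\tau(j) - \rho_{\tau_1}(j) \bigr)\, \tilde\xi_{j,n} \right| \le \frac{1}{n} \sum_{j=1}^{n} \bigl( \rho_\tau(j) + \rho_{\tau_1}(j) \bigr) \bigl( E\, \tilde\xi_{j,n}^2 \bigr)^{1/2} ,
$$

where $\tilde\xi_{j,n}$ is defined in (3.13). Moreover,

$$
\begin{aligned}
E\, \tilde\xi_{j,n}^2 = E \bigl( \xi_{j,n}^2 - m_{j,n} \bigr)^2
&= E\, \xi_{j,n}^4 - m_{j,n}^2 \\
&= \frac{1}{n^2} \sum_{r,s=1}^{n} \sigma_r^2 \sigma_s^2\, \phi_j^2(x_r)\, \phi_j^2(x_s)\, E\, \xi_r^2 \xi_s^2 - m_{j,n}^2 \\
&= (\xi^* - 1)\, \frac{1}{n^2} \sum_{r=1}^{n} \sigma_r^4\, \phi_j^4(x_r) \le \frac{2 (\xi^* - 1)\, \sigma_*^2\, \nu_n}{n}\,.
\end{aligned}
\tag{A.6}
$$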

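Thus, for any fixed $\tau$ and $\tau_1$ from $A^\varepsilon$,

$$
E_S\, |N_{1,n}(\tau) - N_{1,n}(\tau_1)| \le \frac{\sqrt{2 (\xi^* - 1)\, \nu_n}\; \sigma_*}{n \sqrt{n}}\, (w_\tau + w_{\tau_1}) ,
$$

where $w_\tau = n^{1/(2\beta+1)} c_\tau \le c^* n^{1/3}$ since $\beta \ge 1$. Therefore

$$
\sup_{\tau \in A^\varepsilon} E_S\, |N_{1,n}(\tau) - N_{1,n}(\tau_1)| \le \frac{2 \sqrt{2 (\xi^* - 1)\, \nu_n}\; \sigma_*\, c^*}{n^{7/6}}\,. \tag{A.7}
$$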


To estimate the second term on the right-hand side of (A.5), note that

$$
N_{2,n}(\tau) = \frac{1}{n^2} \sum_{d=1}^{n} \sigma_d^2 \sum_{j=1}^{n} \rho_\tau(j)\, \phi_j(x_d) - \frac{1}{n^2} \sum_{d=1}^{n} \sigma_d^2\, \rho_\tau(n) \Bigl( 1 - \frac{1}{\nu_n} \Bigr) \phi_n(x_d) .
$$

Let us show that, for any $n \ge 2$,

$$
\sup_x \left| \sum_{j=1}^{n} \rho_\tau(j)\, \phi_j(x) \right| \le 3 \cdot 2^{\beta} . \tag{A.8}
$$

Indeed, by definition (3.4), $\rho_\tau(j) = 1$ for $j \le k_0(\tau)$. Therefore the inequality (A.8) follows immediately from Lemma 6.2 for $n \le k_0(\tau)$. Let us show (A.8) for $n > k_0(\tau)$. In this case we can represent the sum on the left-hand side of (A.8) as

$$
\begin{aligned}
\sum_{j=1}^{n} \rho_\tau(j)\, \phi_j(x) &= \sum_{j=2}^{n} \rho_\tau(j)\, \phi_j(x) \\
&= \sum_{2 \le j \le w_\tau \wedge n} \phi_j(x) - \frac{1}{(w_\tau)^{\beta}} \sum_{2 \le j \le w_\tau \wedge n} j^{\beta} \phi_j(x) + \frac{1}{(w_\tau)^{\beta}} \sum_{2 \le j \le k_0(\tau)} j^{\beta} \phi_j(x) ,
\end{aligned}
$$

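where $a \wedge b = \min(a, b)$. Now it is easy to see that the inequality (A.8) follows directly from Lemma 6.2. Therefore, by (A.8),

$$
\sup_{\tau \in A^\varepsilon} E\, |N_{2,n}(\tau) - N_{2,n}(\tau_1)| \le \frac{6 \cdot 2^{\beta^*}}{n^2} \sum_{d=1}^{n} \sigma_d^2 + \frac{2}{n^2} \sum_{d=1}^{n} \sigma_d^2 \Bigl( 1 - \frac{1}{\nu_n} \Bigr) \le \frac{7\, \sigma_*\, 2^{\beta^*}}{n}\,.
$$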

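For the last term in (A.5), in view of (3.4), we get

$$
\begin{aligned}
E\, |N_{3,n}(\tau) - N_{3,n}(\tau_1)|
&\le \frac{w_\tau + w_{\tau_1}}{n}\, E\, |\hat\varsigma_n - \varsigma_n| \\
&\le \frac{c^* \bigl( n^{1/(2\beta+1)} + n^{1/(2\beta_1+1)} \bigr)}{n}\, E\, |\hat\varsigma_n - \varsigma_n| \\
&\le \frac{2 c^*}{n^{2/3}}\, E\, |\hat\varsigma_n - \varsigma_n| .
\end{aligned}
$$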
