On adaptive wavelet estimation of the regression function and its derivatives in an errors-in-variables model


HAL Id: hal-00466830
https://hal.archives-ouvertes.fr/hal-00466830v2
Preprint submitted on 29 Jun 2010

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

On adaptive wavelet estimation of the regression function and its derivatives in an errors-in-variables model

Christophe Chesneau

To cite this version: Christophe Chesneau. On adaptive wavelet estimation of the regression function and its derivatives in an errors-in-variables model. 2010. hal-00466830v2

On adaptive wavelet estimation of the regression function and its derivatives in an errors-in-variables model

Christophe Chesneau

University of Caen (LMNO), Caen, France. Email: chesneau@math.unicaen.fr

Abstract

We consider a regression model with errors-in-variables: we observe pairs $(Y, X)$, where $Y = f(Z) + \xi$ and $X = Z + W$. Our goal is to estimate the unknown regression function $f$ and its derivatives under mild assumptions on $\xi$ (only finite moments of order 2 are required). To reach this goal, we develop a new adaptive wavelet estimator based on a hard thresholding rule. Taking the minimax approach under the mean integrated squared error over Besov balls, we prove that it attains a sharp rate of convergence.

Keywords: Errors-in-variables model, Derivatives function estimation, Minimax approach, Wavelets, Hard thresholding.

2000 Mathematics Subject Classification: 62G07, 62G20.

1 Motivations

We observe $n$ independent pairs of random variables $(Y_1, X_1), \ldots, (Y_n, X_n)$ in the following errors-in-variables model: for any $v \in \{1, \ldots, n\}$,
$$
\begin{cases}
Y_v = f(Z_v) + \xi_v, \\
X_v = Z_v + W_v,
\end{cases}
\qquad (1.1)
$$
where $f$ is an unknown function, $X_1, \ldots, X_n$ are $n$ i.i.d. random variables having the uniform distribution on $[0,1]$, $W_1, \ldots, W_n$ are $n$ i.i.d. unobserved random variables and $\xi_1, \ldots, \xi_n$ are $n$ i.i.d. unobserved zero mean random variables. We assume that all these random variables are independent, the density function of $W_1$, denoted $g$, is known and $\xi_1$ admits finite moments of order 2. We aim to estimate $f$ and its derivatives from $(Y_1, X_1), \ldots, (Y_n, X_n)$.
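For intuition, the observation scheme (1.1) is straightforward to simulate: draw $X_v$ uniform on $[0,1]$, draw $W_v$ from the known density $g$, set $Z_v = X_v - W_v$ and $Y_v = f(Z_v) + \xi_v$. The sketch below does this in Python; the choices of $f$ (a 1-periodic sine), of $g$ (a Laplace density) and of the noise law for $\xi_v$ (Gaussian) are illustrative assumptions, not requirements of the model beyond zero mean and finite second moment.

```python
import math
import random

random.seed(0)

def f(z):
    # illustrative 1-periodic regression function (an assumption)
    return math.sin(2.0 * math.pi * z)

def sample_pairs(n, b=0.05, sigma=0.1):
    """Simulate n pairs (Y_v, X_v) from model (1.1):
    X_v ~ Uniform[0,1], W_v ~ Laplace(0, b) (density g, known),
    Z_v = X_v - W_v, Y_v = f(Z_v) + xi_v with xi_v ~ N(0, sigma^2)."""
    pairs = []
    for _ in range(n):
        x = random.random()
        # Laplace(0, b) draw via its inverse CDF
        u = random.random() - 0.5
        w = -b * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)
        z = x - w
        y = f(z) + random.gauss(0.0, sigma)
        pairs.append((y, x))
    return pairs

data = sample_pairs(1000)
```

Any of the estimators of Section 4 would take such pairs `data` as input; only $(Y_v, X_v)$ is observed, while $Z_v$ and $W_v$ stay hidden.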

The estimation of $f$ from the model (1.1) has received a lot of attention. See e.g. [6, 11–13, 15, 18, 19]. In this paper, we focus on a more general problem: the adaptive estimation of the $d$-th derivative of $f$: $f^{(d)}$. This is of interest to detect possible bumps, concavity or convexity properties of $f$. The estimation of $f^{(d)}$ has been investigated in several papers for various models (but never (1.1)), starting with [1]. For references using wavelet methods, see e.g. [3–5, 23].

This work is supported by ANR grant NatImages, ANR-08-EMER-009.

Another feature of the study concerns $\xi_1, \ldots, \xi_n$: to estimate $f^{(d)}$ (including $f = f^{(0)}$), we only assume that $\xi_1$ has finite moments of order 2; thus we do not need to know the distribution of $\xi_1$. Moreover, this relaxes the assumption on $\xi_1$ in [6], where finite moments of order $> 6$ are required.

Under this general framework, considering the ordinary smooth case on $g$ (see (2.2)), we estimate $f^{(d)}$ by a new wavelet estimator based on a hard thresholding rule. Its originality is to combine a singular value decomposition (SVD) approach similar to the one of [17] with some technical tools introduced in wavelet estimation theory by [7].

We evaluate its performance by taking the minimax approach under the mean integrated squared error (MISE) over a wide class of functions: the Besov balls $B^s_{p,r}(M)$ (to be defined in Section 3). We prove that our estimator attains the rate of convergence $v_n = (\ln n / n)^{2s/(2s+2\delta+2d+1)}$, where $\delta$ is a factor related to the ordinary smooth case. This rate of convergence is sharp in the sense that it is, up to a logarithmic term, the one attained by the best non-realistic linear wavelet estimator.

The paper is organized as follows. Assumptions on the model and some notations are introduced in Section 2. Section 3 briefly describes the periodized wavelet bases on $[0,1]$ and the Besov balls. The estimators are presented in Section 4. The results are stated in Section 5. The proofs are gathered in Section 6.

2 Assumptions and notations

We assume in the sequel that $f^{(d)}$ and $g$ belong to $\mathbb{L}^2_{per}([0,1])$, the space of periodic functions of period one that are square-integrable on $[0,1]$:
$$\mathbb{L}^2_{per}([0,1]) = \Big\{ h;\ h \text{ is } 1\text{-periodic and } \|h\|_2 = \Big( \int_0^1 h^2(x)\,dx \Big)^{1/2} < \infty \Big\}.$$
We assume that there exists a known constant $C_* > 0$ such that
$$\|f\|_\infty = \sup_{x \in [0,1]} |f(x)| \le C_* < \infty. \qquad (2.1)$$
Any function $h \in \mathbb{L}^2_{per}([0,1])$ can be represented by its Fourier series
$$h(t) = \sum_{\ell \in \mathbb{Z}} \mathcal{F}(h)(\ell) e^{2i\pi\ell t}, \qquad t \in [0,1],$$
where the equality is intended in the mean-square convergence sense, and $\mathcal{F}(h)(\ell)$ denotes the Fourier coefficient given by
$$\mathcal{F}(h)(\ell) = \int_0^1 h(x) e^{-2i\pi\ell x}\,dx, \qquad \ell \in \mathbb{Z},$$


whenever this integral exists. The notation $\bar{\cdot}$ will be used for the complex conjugate.

We consider the ordinary smooth case on $g$: there exist three constants $c_g > 0$, $C_g > 0$ and $\delta > 1$ such that, for any $\ell \in \mathbb{Z}$, the Fourier coefficient of $g$, i.e. $\mathcal{F}(g)(\ell)$, satisfies
$$\frac{c_g}{(1+\ell^2)^{\delta/2}} \le |\mathcal{F}(g)(\ell)| \le \frac{C_g}{(1+\ell^2)^{\delta/2}}. \qquad (2.2)$$
This assumption controls the decay of the Fourier coefficients of $g$, and thus the smoothness of $g$. It is a standard hypothesis usually adopted in the field of nonparametric estimation for deconvolution problems. See e.g. [14, 17, 21].
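As a concrete example (an illustrative assumption, not used elsewhere in the paper), a Laplace density with scale $b$, wrapped onto $[0,1]$, satisfies (2.2) with $\delta = 2$: its $\ell$-th Fourier coefficient equals the Laplace characteristic function at $2\pi\ell$, namely $1/(1+4\pi^2 b^2 \ell^2)$. The sketch below verifies this numerically with a Riemann sum.

```python
import cmath
import math

def wrapped_laplace(x, b=0.05, terms=3):
    """Density on [0,1] of W mod 1, where W ~ Laplace(0, b)."""
    return sum(
        math.exp(-abs(x + l) / b) / (2.0 * b) for l in range(-terms, terms + 1)
    )

def fourier_coeff(h, ell, npts=2000):
    """F(h)(ell) = int_0^1 h(x) e^{-2 i pi ell x} dx, by a left Riemann sum."""
    return sum(
        h(k / npts) * cmath.exp(-2j * math.pi * ell * k / npts) for k in range(npts)
    ) / npts

b = 0.05
for ell in range(1, 6):
    numeric = abs(fourier_coeff(lambda x: wrapped_laplace(x, b), ell))
    exact = 1.0 / (1.0 + (2.0 * math.pi * b * ell) ** 2)  # decays like ell^{-2}
    assert abs(numeric - exact) < 1e-3
```

Since $1/(1+4\pi^2 b^2 \ell^2)$ is sandwiched between constant multiples of $(1+\ell^2)^{-1}$, this density is ordinary smooth with $\delta = 2$ for suitable $c_g$, $C_g$.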

3 Wavelets and Besov balls

3.1 Periodized Meyer Wavelets

We consider an orthonormal wavelet basis generated by dilations and translations of a "father" Meyer-type wavelet $\phi$ and a "mother" Meyer-type wavelet $\psi$. The main features of such wavelets are:

1. they are bandlimited, i.e. the Fourier transforms of $\phi$ and $\psi$ have compact supports respectively included in $[-4\pi/3, 4\pi/3]$ and $[-8\pi/3, -2\pi/3] \cup [2\pi/3, 8\pi/3]$;

2. for any frequency in $[-2\pi, -\pi] \cup [\pi, 2\pi]$, there exists a constant $c > 0$ such that the magnitude of the Fourier transform of $\psi$ is lower bounded by $c$;

3. the functions $(\phi, \psi)$ are $C^\infty$ as their Fourier transforms have compact support, and $\psi$ has an infinite number of vanishing moments as its Fourier transform vanishes in a neighborhood of the origin, i.e. for any $u \in \mathbb{N}$, $\int_{-\infty}^{+\infty} x^u \psi(x)\,dx = 0$.

For the purposes of this paper, we use the periodized Meyer wavelet bases on the unit interval. For any $x \in [0,1]$, any integer $j$ and any $k \in \{0, \ldots, 2^j-1\}$, let
$$\phi_{j,k}(x) = 2^{j/2} \phi(2^j x - k), \qquad \psi_{j,k}(x) = 2^{j/2} \psi(2^j x - k)$$
be the elements of the wavelet basis, and
$$\phi^{per}_{j,k}(x) = \sum_{l \in \mathbb{Z}} \phi_{j,k}(x - l), \qquad \psi^{per}_{j,k}(x) = \sum_{l \in \mathbb{Z}} \psi_{j,k}(x - l)$$
their periodized versions. There exists an integer $j_*$ such that the collection
$$\mathcal{B} = \big\{ \phi^{per}_{j_*,k},\ k \in \{0, \ldots, 2^{j_*}-1\};\ \psi^{per}_{j,k},\ j \in \mathbb{N} - \{0, \ldots, j_*-1\},\ k \in \{0, \ldots, 2^j-1\} \big\}$$
forms an orthonormal basis of $\mathbb{L}^2_{per}([0,1])$. In what follows, the superscript "per" will be dropped to lighten the notation.


Let $j_c$ be an integer such that $j_c \ge j_*$. A function $h \in \mathbb{L}^2_{per}([0,1])$ can be expanded into a wavelet series as
$$h(x) = \sum_{k=0}^{2^{j_c}-1} \alpha_{j_c,k} \phi_{j_c,k}(x) + \sum_{j=j_c}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k} \psi_{j,k}(x), \qquad x \in [0,1],$$
where
$$\alpha_{j,k} = \int_0^1 h(x) \phi_{j,k}(x)\,dx, \qquad \beta_{j,k} = \int_0^1 h(x) \psi_{j,k}(x)\,dx. \qquad (3.1)$$
See [20, Vol. 1 Chapter III.11] for a detailed account of periodized orthonormal wavelet bases.
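To make the expansion (3.1) concrete, the sketch below evaluates $\alpha_{j,k}$ and $\beta_{j,k}$ numerically for a toy function. For readability it uses the periodized Haar pair instead of the Meyer pair (Haar is not bandlimited, so this only illustrates the expansion itself, not the estimators of Section 4).

```python
import math

def phi_haar(x):
    """Haar father wavelet on [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def psi_haar(x):
    """Haar mother wavelet on [0, 1)."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def wav_jk(wav, j, k, x):
    """Periodized wavelet 2^{j/2} wav(2^j x - k); for Haar with j >= 0 and
    0 <= k < 2^j, a single period already covers x in [0, 1)."""
    return 2.0 ** (j / 2.0) * wav(2 ** j * (x % 1.0) - k)

def coeff(h, wav, j, k, npts=4096):
    """alpha_{j,k} or beta_{j,k} from (3.1), by a left Riemann sum on [0, 1]."""
    return sum(h(i / npts) * wav_jk(wav, j, k, i / npts) for i in range(npts)) / npts

h = lambda x: math.cos(2.0 * math.pi * x)
alpha = [coeff(h, phi_haar, 2, k) for k in range(4)]  # approximation coefficients
beta = [coeff(h, psi_haar, 2, k) for k in range(4)]   # detail coefficients
```

For this $h$, $\alpha_{2,0} = \int_0^{1/4} 2\cos(2\pi x)\,dx = 1/\pi$, which the Riemann sum reproduces to a few decimals.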

3.2 Besov balls

Let $M > 0$, $s > 0$, $p \ge 1$ and $r \ge 1$. Set $\beta_{j_*-1,k} = \alpha_{j_*,k}$. A function $h$ belongs to the Besov ball $B^s_{p,r}(M)$ if and only if the wavelet coefficients (3.1) satisfy
$$\Bigg( \sum_{j=j_*-1}^{\infty} \Bigg( 2^{j(s+1/2-1/p)} \Bigg( \sum_{k=0}^{2^j-1} |\beta_{j,k}|^p \Bigg)^{1/p} \Bigg)^r \Bigg)^{1/r} \le M.$$
For particular choices of the parameters $s$, $p$ and $r$, these sets contain the Hölder and Sobolev balls. See [20].

4 Estimators

Wavelet coefficient estimators. The first step to estimate $f^{(d)}$ consists in expanding $f^{(d)}$ on $\mathcal{B}$ and estimating its unknown wavelet coefficients.

For any integer $j \ge j_*$ and any $k \in \{0, \ldots, 2^j-1\}$, we estimate $\alpha_{j,k} = \int_0^1 f^{(d)}(x) \phi_{j,k}(x)\,dx$ by
$$\hat{\alpha}_{j,k} = \frac{1}{n} \sum_{v=1}^{n} \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d \mathcal{F}(\phi_{j,k})(\ell)}{\mathcal{F}(g)(\ell)} Y_v e^{-2i\pi\ell X_v}, \qquad (4.1)$$
where $C_j = \operatorname{supp}(\mathcal{F}(\phi_{j,0})) = \operatorname{supp}(\mathcal{F}(\phi_{j,k}))$, and we estimate $\beta_{j,k} = \int_0^1 f^{(d)}(x) \psi_{j,k}(x)\,dx$ by
$$\hat{\beta}_{j,k} = \frac{1}{n} \sum_{v=1}^{n} G_v 1_{\{|G_v| \le \eta_j\}}, \qquad (4.2)$$
where
$$G_v = \sum_{\ell \in D_j} \frac{(2i\pi\ell)^d \mathcal{F}(\psi_{j,k})(\ell)}{\mathcal{F}(g)(\ell)} Y_v e^{-2i\pi\ell X_v},$$
$D_j = \operatorname{supp}(\mathcal{F}(\psi_{j,0})) = \operatorname{supp}(\mathcal{F}(\psi_{j,k}))$, for any random event $A$, $1_A$ is the indicator function of $A$, and the threshold $\eta_j$ is defined by
$$\eta_j = \theta 2^{(\delta+d)j} \sqrt{\frac{n}{\ln n}}, \qquad (4.3)$$
where $\theta = \sqrt{C_{**}(C_*^2 + E(\xi_1^2))}$, $C_*$ is the constant in (2.1) and $C_{**} = 2^{\delta-1}(2(2\pi)^{2d}/c_g^2)(8\pi/3)^{2(\delta+d)}$. The estimators $\hat{\alpha}_{j,k}$ and $\hat{\beta}_{j,k}$ are constructed via an SVD approach similar to the one of [17].

The idea of the thresholding in (4.2) is to operate a selection on the observations: when, for $v \in \{1, \ldots, n\}$, the observation $(Y_v, X_v)$ is judged "too noisy" because of $\xi_v$ (i.e. $|G_v| > \eta_j$), it is neglected. Such a technique has been introduced by [7] in wavelet estimation. Statistical properties of $\hat{\alpha}_{j,k}$ and $\hat{\beta}_{j,k}$ are given in Propositions 6.1, 6.2 and 6.3.

We consider two wavelet estimators for $f^{(d)}$: a linear estimator and a hard thresholding estimator.

Linear estimator. Assuming that $f^{(d)} \in B^s_{p,r}(M)$ with $p \ge 2$, we define the linear estimator $\hat{f}^L_d$ by
$$\hat{f}^L_d(x) = \sum_{k=0}^{2^{j_0}-1} \hat{\alpha}_{j_0,k} \phi_{j_0,k}(x), \qquad (4.4)$$
where $\hat{\alpha}_{j,k}$ is defined by (4.1) and $j_0$ is the integer satisfying
$$2^{-1} n^{1/(2s+2\delta+2d+1)} < 2^{j_0} \le n^{1/(2s+2\delta+2d+1)}.$$
It is not adaptive since it depends on $s$, the smoothness parameter of $f^{(d)}$.

Hard thresholding estimator. We define the hard thresholding estimator $\hat{f}^H_d$ by
$$\hat{f}^H_d(x) = \sum_{k=0}^{2^{j_*}-1} \hat{\alpha}_{j_*,k} \phi_{j_*,k}(x) + \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} \hat{\beta}_{j,k} 1_{\{|\hat{\beta}_{j,k}| \ge \kappa\lambda_j\}} \psi_{j,k}(x), \qquad (4.5)$$
where $\hat{\alpha}_{j,k}$ and $\hat{\beta}_{j,k}$ are defined by (4.1) and (4.2), $j_1$ is the integer satisfying
$$2^{-1} n^{1/(2\delta+2d+1)} < 2^{j_1} \le n^{1/(2\delta+2d+1)},$$
$\kappa \ge 8/3 + 2 + 2\sqrt{16/9 + 4}$ and $\lambda_j$ is the threshold
$$\lambda_j = \theta 2^{(\delta+d)j} \sqrt{\frac{\ln n}{n}}. \qquad (4.6)$$
The definitions of $\eta_j$ and $\lambda_j$ are chosen to minimize the MISE of $\hat{f}^H_d$ and to make it adaptive. Further statistical results on the hard thresholding estimator for the standard regression model can be found in [8–10].
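Putting the tuning together: $j_1$ is the largest integer with $2^{j_1} \le n^{1/(2\delta+2d+1)}$, $\lambda_j$ comes from (4.6), and (4.5) keeps a coefficient only when $|\hat{\beta}_{j,k}| \ge \kappa\lambda_j$. The sketch below implements this arithmetic; the values of $n$, $\theta$, $\delta$, $d$ and the sample coefficient vector are hypothetical, chosen for illustration.

```python
import math

def resolution_level(n, delta, d):
    """Largest integer j1 with 2^{j1} <= n^{1/(2*delta + 2*d + 1)}."""
    return int(math.floor(math.log2(n ** (1.0 / (2 * delta + 2 * d + 1)))))

def lam(j, n, theta, delta, d):
    """Threshold lambda_j = theta 2^{(delta+d) j} sqrt(ln(n)/n), eq. (4.6)."""
    return theta * 2.0 ** ((delta + d) * j) * math.sqrt(math.log(n) / n)

def hard_threshold(beta_hat, j, n, theta, delta, d, kappa):
    """Keep-or-kill rule of (4.5): zero out coefficients below kappa*lambda_j."""
    t = kappa * lam(j, n, theta, delta, d)
    return [b if abs(b) >= t else 0.0 for b in beta_hat]

n, theta, delta, d = 10000, 1.0, 2.0, 1.0
kappa = 8.0 / 3.0 + 2.0 + 2.0 * math.sqrt(16.0 / 9.0 + 4.0)  # smallest admissible value
j1 = resolution_level(n, delta, d)          # here 2^{j1} <= n^{1/7}, so j1 = 1
kept = hard_threshold([0.5, 0.1, -0.4, 0.01], 0, n, theta, delta, d, kappa)
```

With these numbers $\kappa\lambda_0 \approx 0.29$, so only the two large coefficients survive the keep-or-kill rule.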


5 Results

Theorem 5.1. Consider (1.1) under the assumptions of Section 2. Suppose that $f^{(d)} \in B^s_{p,r}(M)$ with $s > 0$, $p \ge 2$ and $r \ge 1$. Let $\hat{f}^L_d$ be (4.4). Then there exists a constant $C > 0$ such that
$$E\Bigg( \int_0^1 \Big( \hat{f}^L_d(x) - f^{(d)}(x) \Big)^2 dx \Bigg) \le C n^{-2s/(2s+2\delta+2d+1)}.$$

The proof of Theorem 5.1 uses a moment inequality on (4.1) and a suitable decomposition of the MISE.

Since the distribution of $\xi_1$ is unknown, we cannot use the likelihood function related to the model, and the optimal lower bound seems difficult to determine (see e.g. [16, 25]). For this reason, our benchmark will be the rate of convergence attained by the "optimal non-realistic" $\hat{f}^L_d$, i.e. $v_n = n^{-2s/(2s+2\delta+2d+1)}$. Note that, under the standard Gaussian assumption on $\xi_1$, one can prove that $v_n$ is optimal in the minimax sense.

Theorem 5.2. Consider (1.1) under the assumptions of Section 2. Let $\hat{f}^H_d$ be (4.5). Suppose that $f^{(d)} \in B^s_{p,r}(M)$ with $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta+2d+1)/p\}$. Then there exists a constant $C > 0$ such that
$$E\Bigg( \int_0^1 \Big( \hat{f}^H_d(x) - f^{(d)}(x) \Big)^2 dx \Bigg) \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$

The proof of Theorem 5.2 is based on several probability results (moment inequalities, a concentration inequality, ...) and a suitable decomposition of the MISE.

Theorem 5.2 proves that $\hat{f}^H_d$ attains $v_n$ up to the logarithmic term $(\ln n)^{2s/(2s+2\delta+2d+1)}$.
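Numerically, the rate of Theorem 5.2 degrades as $\delta$ (the ill-posedness of the deconvolution) and $d$ (the derivative order) grow. The sketch below evaluates $(\ln n / n)^{2s/(2s+2\delta+2d+1)}$ for a few hypothetical parameter choices.

```python
import math

def mise_rate(n, s, delta, d):
    """Rate (ln n / n)^{2s/(2s + 2*delta + 2*d + 1)} from Theorem 5.2."""
    return (math.log(n) / n) ** (2.0 * s / (2.0 * s + 2.0 * delta + 2.0 * d + 1.0))

n, s = 10**6, 2.0
r_f = mise_rate(n, s, delta=1.5, d=0)   # estimating f itself
r_f1 = mise_rate(n, s, delta=1.5, d=1)  # estimating f', same smoothness
r_f2 = mise_rate(n, s, delta=1.5, d=2)  # estimating f''
```

Each additional derivative order adds 2 to the denominator of the exponent, so the bound grows: `r_f < r_f1 < r_f2` (slower convergence for higher derivatives).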

Conclusion and perspectives. We have constructed a new adaptive estimator $\hat{f}^H_d$ for $f^{(d)}$ under mild assumptions on $\xi_1$. It is based on wavelets and a hard thresholding rule. It has "near-optimal" minimax properties for a wide class of functions $f^{(d)}$. Possible perspectives of this work are:

- to potentially improve the estimation of $f^{(d)}$ by considering other kinds of thresholding rules, such as the block thresholding rule introduced by [2];

- to investigate the random design case where the distribution of $X_1$ is unknown.

6 Proofs

In this section, C represents a positive constant which may differ from one term to another.


6.1 Auxiliary results

Proposition 6.1. For any integer $j \ge j_*$ and any $k \in \{0, \ldots, 2^j-1\}$, let $\alpha_{j,k}$ be the wavelet coefficient (3.1) of $f^{(d)}$ and $\hat{\alpha}_{j,k}$ be (4.1). Then there exists a constant $C > 0$ such that
$$E\big( (\hat{\alpha}_{j,k} - \alpha_{j,k})^2 \big) \le C 2^{2(\delta+d)j} \frac{1}{n}.$$

Proof of Proposition 6.1. For any $v \in \{1, \ldots, n\}$, let us set
$$H_v = \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d \mathcal{F}(\phi_{j,k})(\ell)}{\mathcal{F}(g)(\ell)} Y_v e^{-2i\pi\ell X_v}.$$

Since $X_1$, $W_1$ and $\xi_1$ are independent, using the convolution product between $f$ and $g$, i.e. $(f \star g)(x) = \int_0^1 f(x-y) g(y)\,dy$, we have
$$E\big( Y_1 e^{-2i\pi\ell X_1} \big) = E\big( f(X_1 - W_1) e^{-2i\pi\ell X_1} \big) + E(\xi_1) E\big( e^{-2i\pi\ell X_1} \big) = E\big( f(X_1 - W_1) e^{-2i\pi\ell X_1} \big)$$
$$= \int_0^1 \int_0^1 f(x-y) g(y) e^{-2i\pi\ell x}\,dx\,dy = \int_0^1 (f \star g)(x) e^{-2i\pi\ell x}\,dx = \mathcal{F}(f \star g)(\ell) = \mathcal{F}(f)(\ell) \mathcal{F}(g)(\ell).$$
Moreover, since $f$ is 1-periodic, for any $u \in \{0, \ldots, d\}$, $f^{(u)}$ is 1-periodic and $f^{(u)}(0) = f^{(u)}(1)$. By $d$ integrations by parts, for any $\ell \in \mathbb{Z}$, we have
$$(2i\pi\ell)^d \mathcal{F}(f)(\ell) = \mathcal{F}\big( f^{(d)} \big)(\ell).$$
The Parseval-Plancherel theorem gives
$$E(H_1) = \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d \mathcal{F}(\phi_{j,k})(\ell)}{\mathcal{F}(g)(\ell)} E\big( Y_1 e^{-2i\pi\ell X_1} \big) = \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d \mathcal{F}(\phi_{j,k})(\ell)}{\mathcal{F}(g)(\ell)} \mathcal{F}(f)(\ell) \mathcal{F}(g)(\ell)$$
$$= \sum_{\ell \in C_j} \mathcal{F}(\phi_{j,k})(\ell) (2i\pi\ell)^d \mathcal{F}(f)(\ell) = \sum_{\ell \in C_j} \mathcal{F}(\phi_{j,k})(\ell) \mathcal{F}\big( f^{(d)} \big)(\ell) = \int_0^1 \phi_{j,k}(x) f^{(d)}(x)\,dx = \alpha_{j,k}. \qquad (6.1)$$
Hence $E(\hat{\alpha}_{j,k}) = \alpha_{j,k}$. So
$$E\big( (\hat{\alpha}_{j,k} - \alpha_{j,k})^2 \big) = V(\hat{\alpha}_{j,k}) = V\Bigg( \frac{1}{n} \sum_{v=1}^{n} H_v \Bigg) = \frac{1}{n^2} \sum_{v=1}^{n} V(H_v) = \frac{1}{n} V(H_1) \le \frac{1}{n} E\big( H_1^2 \big). \qquad (6.2)$$


Since $X_1$, $W_1$ and $\xi_1$ are independent with $E(\xi_1) = 0$ and $|f(X_1 - W_1)| \le \|f\|_\infty \le C_* < \infty$, we have
$$E\big( H_1^2 \big) = E\Bigg( f^2(X_1 - W_1) \Bigg| \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d \mathcal{F}(\phi_{j,k})(\ell)}{\mathcal{F}(g)(\ell)} e^{-2i\pi\ell X_1} \Bigg|^2 \Bigg) + 2 E(\xi_1) E\Bigg( f(X_1 - W_1) \Bigg| \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d \mathcal{F}(\phi_{j,k})(\ell)}{\mathcal{F}(g)(\ell)} e^{-2i\pi\ell X_1} \Bigg|^2 \Bigg) + E\big( \xi_1^2 \big) E\Bigg( \Bigg| \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d \mathcal{F}(\phi_{j,k})(\ell)}{\mathcal{F}(g)(\ell)} e^{-2i\pi\ell X_1} \Bigg|^2 \Bigg).$$
Since $E(\xi_1) = 0$, the middle term vanishes, so
$$E\big( H_1^2 \big) \le \big( C_*^2 + E(\xi_1^2) \big) E\Bigg( \Bigg| \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d \mathcal{F}(\phi_{j,k})(\ell)}{\mathcal{F}(g)(\ell)} e^{-2i\pi\ell X_1} \Bigg|^2 \Bigg). \qquad (6.3)$$
The assumption (2.2) implies
$$\sup_{\ell \in C_j} \frac{(2\pi\ell)^{2d}}{|\mathcal{F}(g)(\ell)|^2} \le \frac{(2\pi)^{2d}}{c_g^2} \sup_{\ell \in C_j} \ell^{2d} (1+\ell^2)^{\delta} \le 2^{\delta-1} \frac{(2\pi)^{2d}}{c_g^2} \sup_{\ell \in C_j} \ell^{2d} (1+\ell^{2\delta}) \le 2^{\delta-1} \frac{2(2\pi)^{2d}}{c_g^2} \Big( \frac{8\pi}{3} \Big)^{2(\delta+d)} 2^{2(\delta+d)j} = C_{**} 2^{2(\delta+d)j}. \qquad (6.4)$$
Using the fact that $(e^{-2i\pi\ell x})_{\ell \in \mathbb{Z}}$ is an orthonormal basis of $\mathbb{L}^2_{per}([0,1])$, (6.4) and the Parseval-Plancherel theorem, we obtain
$$E\Bigg( \Bigg| \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d \mathcal{F}(\phi_{j,k})(\ell)}{\mathcal{F}(g)(\ell)} e^{-2i\pi\ell X_1} \Bigg|^2 \Bigg) = \int_0^1 \Bigg| \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d \mathcal{F}(\phi_{j,k})(\ell)}{\mathcal{F}(g)(\ell)} e^{-2i\pi\ell x} \Bigg|^2 dx = \sum_{\ell \in C_j} \frac{(2\pi\ell)^{2d} |\mathcal{F}(\phi_{j,k})(\ell)|^2}{|\mathcal{F}(g)(\ell)|^2}$$
$$\le C_{**} 2^{2(\delta+d)j} \sum_{\ell \in C_j} |\mathcal{F}(\phi_{j,k})(\ell)|^2 = C_{**} 2^{2(\delta+d)j} \int_0^1 |\phi_{j,k}(x)|^2 dx = C_{**} 2^{2(\delta+d)j}. \qquad (6.5)$$
Putting (6.3) and (6.5) together, we obtain
$$E\big( H_1^2 \big) \le \theta^2 2^{2(\delta+d)j}. \qquad (6.6)$$
It follows from (6.2) and (6.6) that
$$E\big( (\hat{\alpha}_{j,k} - \alpha_{j,k})^2 \big) \le C 2^{2(\delta+d)j} \frac{1}{n}.$$

Proposition 6.2. For any integer $j \ge j_*$ and any $k \in \{0, \ldots, 2^j-1\}$, let $\beta_{j,k}$ be the wavelet coefficient (3.1) of $f^{(d)}$ and $\hat{\beta}_{j,k}$ be (4.2). Then there exists a constant $C > 0$ such that
$$E\Big( \big( \hat{\beta}_{j,k} - \beta_{j,k} \big)^4 \Big) \le C 2^{4(\delta+d)j} \frac{(\ln n)^2}{n^2}.$$

Proof of Proposition 6.2. Proceeding as in (6.1) (with $\psi$ instead of $\phi$), we have
$$\beta_{j,k} = \int_0^1 f^{(d)}(x) \psi_{j,k}(x)\,dx = E(G_v) = E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big) + E\big( G_1 1_{\{|G_1| > \eta_j\}} \big). \qquad (6.7)$$
We have
$$E\Big( \big( \hat{\beta}_{j,k} - \beta_{j,k} \big)^4 \Big) = E\Bigg( \Bigg( \frac{1}{n} \sum_{v=1}^{n} \Big( G_v 1_{\{|G_v| \le \eta_j\}} - E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big) \Big) - E\big( G_1 1_{\{|G_1| > \eta_j\}} \big) \Bigg)^4 \Bigg) \le 8(A + B), \qquad (6.8)$$
where
$$A = E\Bigg( \Bigg( \frac{1}{n} \sum_{v=1}^{n} \Big( G_v 1_{\{|G_v| \le \eta_j\}} - E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big) \Big) \Bigg)^4 \Bigg), \qquad B = \Big( E\big( |G_1| 1_{\{|G_1| > \eta_j\}} \big) \Big)^4.$$
Let us bound $A$ and $B$ in turn. To bound $A$, we need the Rosenthal inequality presented in the lemma below (see [24]).

Lemma 6.1 (Rosenthal's inequality). Let $p \ge 2$, $n \in \mathbb{N}$ and $(U_v)_{v \in \{1,\ldots,n\}}$ be $n$ zero mean i.i.d. random variables such that $E(|U_1|^p) < \infty$. Then there exists a constant $C > 0$ such that
$$E\Bigg( \Bigg| \sum_{v=1}^{n} U_v \Bigg|^p \Bigg) \le C \max\Big( n E(|U_1|^p),\ \big( n E(U_1^2) \big)^{p/2} \Big).$$


Applying the Rosenthal inequality with $p = 4$ and, for any $v \in \{1, \ldots, n\}$,
$$U_v = G_v 1_{\{|G_v| \le \eta_j\}} - E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big),$$
we obtain
$$A = \frac{1}{n^4} E\Bigg( \Bigg( \sum_{v=1}^{n} U_v \Bigg)^4 \Bigg) \le C \frac{1}{n^4} \max\Big( n E\big( U_1^4 \big),\ \big( n E\big( U_1^2 \big) \big)^2 \Big).$$
Using (6.6) (with $\psi$ instead of $\phi$), we have, for any $a \in \{2, 4\}$,
$$E(U_1^a) \le 2^a E\big( G_1^a 1_{\{|G_1| \le \eta_j\}} \big) \le 2^a \eta_j^{a-2} E\big( G_1^2 \big) \le 2^a \theta^2 \eta_j^{a-2} 2^{2(\delta+d)j}.$$
Hence
$$A \le C \frac{1}{n^4} \max\Big( \eta_j^2 \theta^2 2^{2(\delta+d)j} n,\ \big( \theta^2 2^{2(\delta+d)j} n \big)^2 \Big) = C \frac{1}{n^4} \max\Bigg( 2^{4(\delta+d)j} \frac{n^2}{\ln n},\ 2^{4(\delta+d)j} n^2 \Bigg) = C 2^{4(\delta+d)j} \frac{1}{n^2}. \qquad (6.9)$$
Let us now bound $B$. Using again (6.6) (with $\psi$ instead of $\phi$), we obtain
$$E\big( |G_1| 1_{\{|G_1| > \eta_j\}} \big) \le \frac{E\big( G_1^2 \big)}{\eta_j} \le \frac{1}{\theta 2^{(\delta+d)j}} \sqrt{\frac{\ln n}{n}}\ \theta^2 2^{2(\delta+d)j} = \theta 2^{(\delta+d)j} \sqrt{\frac{\ln n}{n}}. \qquad (6.10)$$
Hence
$$B \le C 2^{4(\delta+d)j} \frac{(\ln n)^2}{n^2}. \qquad (6.11)$$
Combining (6.8), (6.9) and (6.11), we have
$$E\Big( \big( \hat{\beta}_{j,k} - \beta_{j,k} \big)^4 \Big) \le C \Bigg( 2^{4(\delta+d)j} \frac{1}{n^2} + 2^{4(\delta+d)j} \frac{(\ln n)^2}{n^2} \Bigg) \le C 2^{4(\delta+d)j} \frac{(\ln n)^2}{n^2}.$$

Proposition 6.3. For any integer $j \ge j_*$ and any $k \in \{0, \ldots, 2^j-1\}$, let $\beta_{j,k}$ be the wavelet coefficient (3.1) of $f^{(d)}$ and $\hat{\beta}_{j,k}$ be (4.2). Then, for any $\kappa \ge 8/3 + 2 + 2\sqrt{16/9 + 4}$,
$$P\Big( |\hat{\beta}_{j,k} - \beta_{j,k}| \ge \kappa\lambda_j/2 \Big) \le 2 n^{-2}.$$

Proof of Proposition 6.3. Using (6.7), we have
$$|\hat{\beta}_{j,k} - \beta_{j,k}| = \Bigg| \frac{1}{n} \sum_{v=1}^{n} \Big( G_v 1_{\{|G_v| \le \eta_j\}} - E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big) \Big) - E\big( G_1 1_{\{|G_1| > \eta_j\}} \big) \Bigg| \le \Bigg| \frac{1}{n} \sum_{v=1}^{n} \Big( G_v 1_{\{|G_v| \le \eta_j\}} - E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big) \Big) \Bigg| + E\big( |G_1| 1_{\{|G_1| > \eta_j\}} \big).$$


Using (6.10), we obtain
$$E\big( |G_1| 1_{\{|G_1| > \eta_j\}} \big) \le \theta 2^{(\delta+d)j} \sqrt{\frac{\ln n}{n}} = \lambda_j.$$
Hence
$$S = P\Big( |\hat{\beta}_{j,k} - \beta_{j,k}| \ge \kappa\lambda_j/2 \Big) \le P\Bigg( \Bigg| \frac{1}{n} \sum_{v=1}^{n} \Big( G_v 1_{\{|G_v| \le \eta_j\}} - E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big) \Big) \Bigg| \ge (\kappa/2 - 1)\lambda_j \Bigg).$$
Now we need the Bernstein inequality presented in the lemma below (see [22]).

Lemma 6.2 (Bernstein's inequality). Let $n \in \mathbb{N}$ and $(U_v)_{v \in \{1,\ldots,n\}}$ be $n$ zero mean i.i.d. random variables such that there exists a constant $M > 0$ satisfying, for any $v \in \{1, \ldots, n\}$, $|U_v| \le M < \infty$. Then, for any $\lambda > 0$,
$$P\Bigg( \Bigg| \sum_{v=1}^{n} U_v \Bigg| \ge \lambda \Bigg) \le 2 \exp\Bigg( -\frac{\lambda^2}{2\big( n E(U_1^2) + \lambda M/3 \big)} \Bigg).$$
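Lemma 6.2 can be sanity-checked by simulation: for bounded centered variables, the empirical tail frequency must sit below the Bernstein bound (up to Monte Carlo error). The sketch below uses centered uniform variables, an arbitrary illustrative choice for the $U_v$.

```python
import math
import random

random.seed(1)

def bernstein_bound(lam, n, var1, m):
    """Right-hand side of Lemma 6.2: 2 exp(-lam^2 / (2 (n E(U_1^2) + lam M/3)))."""
    return 2.0 * math.exp(-lam ** 2 / (2.0 * (n * var1 + lam * m / 3.0)))

n, m, trials, lam = 100, 1.0, 2000, 25.0
var1 = 1.0 / 3.0  # E(U_1^2) for U_1 ~ Uniform[-1, 1]
hits = sum(
    1
    for _ in range(trials)
    if abs(sum(random.uniform(-1.0, 1.0) for _ in range(n))) >= lam
)
empirical = hits / trials
bound = bernstein_bound(lam, n, var1, m)
```

Here the bound is about 1.1e-3 while the true tail probability is far smaller, so `empirical <= bound` holds comfortably.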

Let us set, for any $v \in \{1, \ldots, n\}$,
$$U_v = G_v 1_{\{|G_v| \le \eta_j\}} - E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big).$$
Then $E(U_1) = 0$,
$$|U_v| \le \big| G_v 1_{\{|G_v| \le \eta_j\}} \big| + E\big( |G_v| 1_{\{|G_v| \le \eta_j\}} \big) \le 2\eta_j$$
and, using again (6.6) (with $\psi$ instead of $\phi$),
$$E\big( U_1^2 \big) = V\big( G_1 1_{\{|G_1| \le \eta_j\}} \big) \le E\big( G_1^2 \big) \le \theta^2 2^{2(\delta+d)j}.$$
It follows from the Bernstein inequality that
$$S \le 2 \exp\Bigg( -\frac{n^2 (\kappa/2 - 1)^2 \lambda_j^2}{2\Big( \theta^2 n 2^{2(\delta+d)j} + \frac{2 n (\kappa/2 - 1) \lambda_j \eta_j}{3} \Big)} \Bigg).$$
Since
$$\lambda_j \eta_j = \theta 2^{(\delta+d)j} \sqrt{\frac{\ln n}{n}}\ \theta 2^{(\delta+d)j} \sqrt{\frac{n}{\ln n}} = \theta^2 2^{2(\delta+d)j}, \qquad \lambda_j^2 = \theta^2 2^{2(\delta+d)j} \frac{\ln n}{n},$$
we have, for any $\kappa \ge 8/3 + 2 + 2\sqrt{16/9 + 4}$,
$$S \le 2 \exp\Bigg( -\frac{(\kappa/2 - 1)^2 \ln n}{2\Big( 1 + \frac{2(\kappa/2 - 1)}{3} \Big)} \Bigg) \le 2 n^{-2}.$$


6.2 Proofs of the main results

Proof of Theorem 5.1. We expand the function $f^{(d)}$ as
$$f^{(d)}(x) = \sum_{k=0}^{2^{j_0}-1} \alpha_{j_0,k} \phi_{j_0,k}(x) + \sum_{j=j_0}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k} \psi_{j,k}(x),$$
where
$$\alpha_{j_0,k} = \int_0^1 f^{(d)}(x) \phi_{j_0,k}(x)\,dx, \qquad \beta_{j,k} = \int_0^1 f^{(d)}(x) \psi_{j,k}(x)\,dx.$$

We have
$$\hat{f}^L_d(x) - f^{(d)}(x) = \sum_{k=0}^{2^{j_0}-1} \big( \hat{\alpha}_{j_0,k} - \alpha_{j_0,k} \big) \phi_{j_0,k}(x) - \sum_{j=j_0}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k} \psi_{j,k}(x).$$
Hence
$$E\Bigg( \int_0^1 \Big( \hat{f}^L_d(x) - f^{(d)}(x) \Big)^2 dx \Bigg) = A + B,$$
where
$$A = \sum_{k=0}^{2^{j_0}-1} E\Big( \big( \hat{\alpha}_{j_0,k} - \alpha_{j_0,k} \big)^2 \Big), \qquad B = \sum_{j=j_0}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k}^2.$$
Using Proposition 6.1, we obtain
$$A \le C 2^{j_0(1+2\delta+2d)} \frac{1}{n} \le C n^{-2s/(2s+2\delta+2d+1)}.$$
Since $p \ge 2$, we have $B^s_{p,r}(M) \subseteq B^s_{2,\infty}(M)$. Hence
$$B \le C 2^{-2j_0 s} \le C n^{-2s/(2s+2\delta+2d+1)}.$$
So
$$E\Bigg( \int_0^1 \Big( \hat{f}^L_d(x) - f^{(d)}(x) \Big)^2 dx \Bigg) \le C n^{-2s/(2s+2\delta+2d+1)}.$$
The proof of Theorem 5.1 is complete.

Proof of Theorem 5.2. We expand the function $f^{(d)}$ as
$$f^{(d)}(x) = \sum_{k=0}^{2^{j_*}-1} \alpha_{j_*,k} \phi_{j_*,k}(x) + \sum_{j=j_*}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k} \psi_{j,k}(x),$$
where
$$\alpha_{j_*,k} = \int_0^1 f^{(d)}(x) \phi_{j_*,k}(x)\,dx, \qquad \beta_{j,k} = \int_0^1 f^{(d)}(x) \psi_{j,k}(x)\,dx.$$

We have
$$\hat{f}^H_d(x) - f^{(d)}(x) = \sum_{k=0}^{2^{j_*}-1} \big( \hat{\alpha}_{j_*,k} - \alpha_{j_*,k} \big) \phi_{j_*,k}(x) + \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} \Big( \hat{\beta}_{j,k} 1_{\{|\hat{\beta}_{j,k}| \ge \kappa\lambda_j\}} - \beta_{j,k} \Big) \psi_{j,k}(x) - \sum_{j=j_1+1}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k} \psi_{j,k}(x).$$
Hence
$$E\Bigg( \int_0^1 \Big( \hat{f}^H_d(x) - f^{(d)}(x) \Big)^2 dx \Bigg) = R + S + T, \qquad (6.12)$$
where
$$R = \sum_{k=0}^{2^{j_*}-1} E\Big( \big( \hat{\alpha}_{j_*,k} - \alpha_{j_*,k} \big)^2 \Big), \qquad S = \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} E\Big( \big( \hat{\beta}_{j,k} 1_{\{|\hat{\beta}_{j,k}| \ge \kappa\lambda_j\}} - \beta_{j,k} \big)^2 \Big)$$
and
$$T = \sum_{j=j_1+1}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k}^2.$$
Let us bound $R$, $T$ and $S$, in turn.

Using Proposition 6.1, we have
$$R \le C 2^{j_*(1+2\delta+2d)} \frac{1}{n} \le C \frac{1}{n} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}. \qquad (6.13)$$
For $r \ge 1$ and $p \ge 2$, we have $B^s_{p,r}(M) \subseteq B^s_{2,\infty}(M)$. So
$$T \le C \sum_{j=j_1+1}^{\infty} 2^{-2js} \le C 2^{-2j_1 s} \le C n^{-2s/(2\delta+2d+1)} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$
For $r \ge 1$ and $p \in [1,2)$, we have $B^s_{p,r}(M) \subseteq B^{s+1/2-1/p}_{2,\infty}(M)$. Since $s > (2\delta+2d+1)/p$, we have $(s+1/2-1/p)/(2\delta+2d+1) > s/(2s+2\delta+2d+1)$. So
$$T \le C \sum_{j=j_1+1}^{\infty} 2^{-2j(s+1/2-1/p)} \le C 2^{-2j_1(s+1/2-1/p)} \le C n^{-2(s+1/2-1/p)/(2\delta+2d+1)} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$
Hence, for $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta+2d+1)/p\}$, we have
$$T \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}. \qquad (6.14)$$

The term $S$ can be decomposed as
$$S = e_1 + e_2 + e_3 + e_4, \qquad (6.15)$$
where
$$e_1 = \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} E\Big( \big( \hat{\beta}_{j,k} - \beta_{j,k} \big)^2 1_{\{|\hat{\beta}_{j,k}| \ge \kappa\lambda_j\}} 1_{\{|\beta_{j,k}| < \kappa\lambda_j/2\}} \Big),$$
$$e_2 = \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} E\Big( \big( \hat{\beta}_{j,k} - \beta_{j,k} \big)^2 1_{\{|\hat{\beta}_{j,k}| \ge \kappa\lambda_j\}} 1_{\{|\beta_{j,k}| \ge \kappa\lambda_j/2\}} \Big),$$
$$e_3 = \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} E\Big( \beta_{j,k}^2 1_{\{|\hat{\beta}_{j,k}| < \kappa\lambda_j\}} 1_{\{|\beta_{j,k}| \ge 2\kappa\lambda_j\}} \Big)$$
and
$$e_4 = \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} E\Big( \beta_{j,k}^2 1_{\{|\hat{\beta}_{j,k}| < \kappa\lambda_j\}} 1_{\{|\beta_{j,k}| < 2\kappa\lambda_j\}} \Big).$$
Let us analyze each term $e_1$, $e_2$, $e_3$ and $e_4$ in turn.

Upper bounds for $e_1$ and $e_3$. We have
$$\big\{ |\hat{\beta}_{j,k}| < \kappa\lambda_j,\ |\beta_{j,k}| \ge 2\kappa\lambda_j \big\} \subseteq \big\{ |\hat{\beta}_{j,k} - \beta_{j,k}| > \kappa\lambda_j/2 \big\},$$
$$\big\{ |\hat{\beta}_{j,k}| \ge \kappa\lambda_j,\ |\beta_{j,k}| < \kappa\lambda_j/2 \big\} \subseteq \big\{ |\hat{\beta}_{j,k} - \beta_{j,k}| > \kappa\lambda_j/2 \big\}$$
and
$$\big\{ |\hat{\beta}_{j,k}| < \kappa\lambda_j,\ |\beta_{j,k}| \ge 2\kappa\lambda_j \big\} \subseteq \big\{ |\beta_{j,k}| \le 2 |\hat{\beta}_{j,k} - \beta_{j,k}| \big\}.$$
So
$$\max(e_1, e_3) \le C \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} E\Big( \big( \hat{\beta}_{j,k} - \beta_{j,k} \big)^2 1_{\{|\hat{\beta}_{j,k} - \beta_{j,k}| > \kappa\lambda_j/2\}} \Big).$$
It follows from the Cauchy-Schwarz inequality and Propositions 6.2 and 6.3 that
$$E\Big( \big( \hat{\beta}_{j,k} - \beta_{j,k} \big)^2 1_{\{|\hat{\beta}_{j,k} - \beta_{j,k}| > \kappa\lambda_j/2\}} \Big) \le \Big( E\Big( \big( \hat{\beta}_{j,k} - \beta_{j,k} \big)^4 \Big) \Big)^{1/2} \Big( P\big( |\hat{\beta}_{j,k} - \beta_{j,k}| > \kappa\lambda_j/2 \big) \Big)^{1/2} \le C 2^{2(\delta+d)j} \frac{\ln n}{n^2}.$$
Hence
$$\max(e_1, e_3) \le C \frac{\ln n}{n^2} \sum_{j=j_*}^{j_1} 2^{j(1+2\delta+2d)} \le C \frac{\ln n}{n^2} 2^{j_1(1+2\delta+2d)} \le C \frac{\ln n}{n} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}. \qquad (6.16)$$

Upper bound for the term $e_2$. Using Proposition 6.2 and the Cauchy-Schwarz inequality, we obtain
$$E\Big( \big( \hat{\beta}_{j,k} - \beta_{j,k} \big)^2 \Big) \le \Big( E\Big( \big( \hat{\beta}_{j,k} - \beta_{j,k} \big)^4 \Big) \Big)^{1/2} \le C 2^{2(\delta+d)j} \frac{\ln n}{n}.$$
Hence
$$e_2 \le C \frac{\ln n}{n} \sum_{j=j_*}^{j_1} 2^{2(\delta+d)j} \sum_{k=0}^{2^j-1} 1_{\{|\beta_{j,k}| > \kappa\lambda_j/2\}}.$$
Let $j_2$ be the integer defined by
$$2^{-1} \Big( \frac{n}{\ln n} \Big)^{1/(2s+2\delta+2d+1)} < 2^{j_2} \le \Big( \frac{n}{\ln n} \Big)^{1/(2s+2\delta+2d+1)}. \qquad (6.17)$$
We have
$$e_2 \le e_{2,1} + e_{2,2},$$
where
$$e_{2,1} = C \frac{\ln n}{n} \sum_{j=j_*}^{j_2} 2^{2(\delta+d)j} \sum_{k=0}^{2^j-1} 1_{\{|\beta_{j,k}| > \kappa\lambda_j/2\}}, \qquad e_{2,2} = C \frac{\ln n}{n} \sum_{j=j_2+1}^{j_1} 2^{2(\delta+d)j} \sum_{k=0}^{2^j-1} 1_{\{|\beta_{j,k}| > \kappa\lambda_j/2\}}.$$
We have
$$e_{2,1} \le C \frac{\ln n}{n} \sum_{j=j_*}^{j_2} 2^{j(1+2\delta+2d)} \le C \frac{\ln n}{n} 2^{j_2(1+2\delta+2d)} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$
For $r \ge 1$ and $p \ge 2$, since $B^s_{p,r}(M) \subseteq B^s_{2,\infty}(M)$,
$$e_{2,2} \le C \frac{\ln n}{n} \sum_{j=j_2+1}^{j_1} 2^{2(\delta+d)j} \frac{1}{\lambda_j^2} \sum_{k=0}^{2^j-1} \beta_{j,k}^2 = C \sum_{j=j_2+1}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k}^2 \le C 2^{-2j_2 s} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$


For $r \ge 1$, $p \in [1,2)$ and $s > (2\delta+2d+1)/p$, since $B^s_{p,r}(M) \subseteq B^{s+1/2-1/p}_{2,\infty}(M)$ and $(2s+2\delta+2d+1)(2-p)/2 + (s+1/2-1/p+\delta+d-2(\delta+d)/p)p = 2s$, we have
$$e_{2,2} \le C \frac{\ln n}{n} \sum_{j=j_2+1}^{j_1} 2^{2(\delta+d)j} \frac{1}{\lambda_j^p} \sum_{k=0}^{2^j-1} |\beta_{j,k}|^p \le C \Big( \frac{\ln n}{n} \Big)^{(2-p)/2} \sum_{j=j_2+1}^{\infty} 2^{j(\delta+d)(2-p)} 2^{-j(s+1/2-1/p)p}$$
$$\le C \Big( \frac{\ln n}{n} \Big)^{(2-p)/2} 2^{-j_2(s+1/2-1/p+\delta+d-2(\delta+d)/p)p} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$
So, for $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta+2d+1)/p\}$,
$$e_2 \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}. \qquad (6.18)$$

Upper bound for the term $e_4$. We have
$$e_4 \le \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} \beta_{j,k}^2 1_{\{|\beta_{j,k}| < 2\kappa\lambda_j\}}.$$
Let $j_2$ be the integer defined in (6.17). We have
$$e_4 \le e_{4,1} + e_{4,2},$$
where
$$e_{4,1} = \sum_{j=j_*}^{j_2} \sum_{k=0}^{2^j-1} \beta_{j,k}^2 1_{\{|\beta_{j,k}| < 2\kappa\lambda_j\}}, \qquad e_{4,2} = \sum_{j=j_2+1}^{j_1} \sum_{k=0}^{2^j-1} \beta_{j,k}^2 1_{\{|\beta_{j,k}| < 2\kappa\lambda_j\}}.$$
We have
$$e_{4,1} \le C \sum_{j=j_*}^{j_2} 2^j \lambda_j^2 = C \frac{\ln n}{n} \sum_{j=j_*}^{j_2} 2^{j(1+2\delta+2d)} \le C \frac{\ln n}{n} 2^{j_2(1+2\delta+2d)} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$
For $r \ge 1$ and $p \ge 2$, since $B^s_{p,r}(M) \subseteq B^s_{2,\infty}(M)$, we have
$$e_{4,2} \le \sum_{j=j_2+1}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k}^2 \le C 2^{-2j_2 s} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$


For $r \ge 1$, $p \in [1,2)$ and $s > (2\delta+2d+1)/p$, since $B^s_{p,r}(M) \subseteq B^{s+1/2-1/p}_{2,\infty}(M)$ and $(2-p)(2s+2\delta+2d+1)/2 + (s+1/2-1/p+\delta+d-2(\delta+d)/p)p = 2s$, we have
$$e_{4,2} \le C \sum_{j=j_2+1}^{j_1} \lambda_j^{2-p} \sum_{k=0}^{2^j-1} |\beta_{j,k}|^p = C \Big( \frac{\ln n}{n} \Big)^{(2-p)/2} \sum_{j=j_2+1}^{j_1} 2^{j(\delta+d)(2-p)} \sum_{k=0}^{2^j-1} |\beta_{j,k}|^p$$
$$\le C \Big( \frac{\ln n}{n} \Big)^{(2-p)/2} \sum_{j=j_2+1}^{\infty} 2^{j(\delta+d)(2-p)} 2^{-j(s+1/2-1/p)p} \le C \Big( \frac{\ln n}{n} \Big)^{(2-p)/2} 2^{-j_2(s+1/2-1/p+\delta+d-2(\delta+d)/p)p} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$
So, for $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta+2d+1)/p\}$,
$$e_4 \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}. \qquad (6.19)$$
It follows from (6.15), (6.16), (6.18) and (6.19) that
$$S \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}. \qquad (6.20)$$
Combining (6.12), (6.13), (6.14) and (6.20), we have, for $r \ge 1$, $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta+2d+1)/p\}$,
$$E\Bigg( \int_0^1 \Big( \hat{f}^H_d(x) - f^{(d)}(x) \Big)^2 dx \Bigg) \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$
The proof of Theorem 5.2 is complete.


References

[1] P.K. Bhattacharya, Estimation of a probability density function and its derivatives, Sankhya Ser. A, 29 (1967), 373-382.

[2] T. Cai, Adaptive wavelet estimation: a block thresholding and oracle inequality approach, The Annals of Statistics, 27 (1999), 898-924.

[3] Y.P. Chaubey and H. Doosti, Wavelet based estimation of the derivatives of a density for m-dependent random variables, Journal of the Iranian Statistical Society, 4 (2) (2005), 97-105.

[4] Y.P. Chaubey, H. Doosti and B.L.S. Prakasa Rao, Wavelet based estimation of the derivatives of a density with associated variables, International Journal of Pure and Applied Mathematics, 27 (1) (2006), 97-106.

[5] C. Chesneau, Wavelet estimation of the derivatives of an unknown function from a convolution model, Preprint (2009).

[6] F. Comte and M.-L. Taupin, Nonparametric estimation of the regression function in an errors-in-variables model, Statistica Sinica, 17 (3) (2007), 1065-1090.

[7] B. Delyon and A. Juditsky, On minimax wavelet estimators, Applied and Computational Harmonic Analysis, 3 (1996), 215-228.

[8] D.L. Donoho and I.M. Johnstone, Ideal spatial adaptation by wavelet shrinkage, Biometrika, 81 (1994), 425-455.

[9] D.L. Donoho and I.M. Johnstone, Adapting to unknown smoothness via wavelet shrinkage, Journal of the American Statistical Association, 90 (432) (1995), 1200-1224.

[10] D.L. Donoho and I.M. Johnstone, Minimax estimation via wavelet shrinkage, The Annals of Statistics, 26 (1998), 879-921.

[11] J. Fan and E. Masry, Multivariate regression estimation with errors-in-variables: asymptotic normality for mixing processes, J. Multivariate Anal., 43 (1992), 237-272.

[12] J. Fan and Y.K. Truong, Nonparametric regression with errors in variables, The Annals of Statistics, 21 (4) (1993), 1900-1925.

[13] J. Fan, Y.K. Truong and Y. Wang, Nonparametric function estimation involving errors-in-variables, Nonparametric Functional Estimation and Related Topics (1991), 613-627.

[14] J. Fan and J.Y. Koo, Wavelet deconvolution, IEEE Transactions on Information Theory, 48 (2002), 734-747.

[15] D.A. Ioannides and P.D. Alevizos, Nonparametric regression with errors in variables and applications, Statist. Probab. Lett., 32 (1997), 35-43.

[16] W. Härdle, G. Kerkyacharian, D. Picard and A. Tsybakov, Wavelets, Approximation and Statistical Applications, Lecture Notes in Statistics 129, Springer Verlag, New York, 1998.

[17] I.M. Johnstone, G. Kerkyacharian, D. Picard and M. Raimondo, Wavelet deconvolution in a periodic setting, Journal of the Royal Statistical Society, Series B, 66 (3) (2004), 547-573.

[18] J.Y. Koo and K.W. Lee, B-spline estimation of regression functions with errors in variables, Statist. Probab. Lett., 40 (1998), 57-66.

[19] E. Masry, Multivariate regression estimation with errors-in-variables for stationary processes, Nonparametric Statistics, 3 (1993), 13-36.

[20] Y. Meyer, Ondelettes et Opérateurs, Hermann, Paris, 1990.

[21] M. Pensky and B. Vidakovic, Adaptive wavelet estimator for nonparametric density deconvolution, The Annals of Statistics, 27 (1999), 2033-2053.

[22] V.V. Petrov, Limit Theorems of Probability Theory: Sequences of Independent Random Variables, Clarendon Press, Oxford, 1995.

[23] B.L.S. Prakasa Rao, Nonparametric estimation of the derivatives of a density by the method of wavelets, Bull. Inform. Cyb., 28 (1996), 91-100.

[24] H.P. Rosenthal, On the subspaces of $L^p$ ($p \ge 2$) spanned by sequences of independent random variables, Israel Journal of Mathematics, 8 (1970), 273-303.

[25] A.B. Tsybakov, Introduction à l'estimation non-paramétrique, Springer, 2004.
