HAL Id: hal-01054877
https://hal.archives-ouvertes.fr/hal-01054877v2
Preprint submitted on 25 Feb 2015
Estimation of the derivatives of a function in a convolution regression model with random design
Christophe Chesneau, Maher Kachour
To cite this version: Christophe Chesneau, Maher Kachour. Estimation of the derivatives of a function in a convolution regression model with random design. 2015. hal-01054877v2.
Estimation of the derivatives of a function in a convolution regression
model with random design
Christophe Chesneau
Laboratoire de Mathématiques Nicolas Oresme, Université de Caen, BP 5186, F-14032 Caen Cedex, FRANCE. e-mail: christophe.chesneau@unicaen.fr
Maher Kachour
École supérieure de commerce IDRAC, 47, rue Sergent Michel Berthet CP 607, 69258 Lyon Cedex 09, FRANCE. e-mail: maher.kachour@idraclyon.com
Abstract: A convolution regression model with random design is considered. We investigate the estimation of the derivatives of an unknown function, element of the convolution product. We introduce new estimators based on wavelet methods and provide theoretical guarantees on their good performances.
AMS 2000 subject classifications: 62G07, 62G20.
Keywords and phrases: Nonparametric regression, Deconvolution, Derivatives function estimation, Wavelets, Hard thresholding.
1. Introduction
We consider the convolution regression model with random design described as follows. Let $(Y_1, X_1), \ldots, (Y_n, X_n)$ be $n$ i.i.d. random variables defined on a probability space $(\Omega, \mathcal{A}, \mathbb{P})$, where
\[
Y_v = (f \star g)(X_v) + \xi_v, \qquad v = 1, \ldots, n, \tag{1.1}
\]
$(f \star g)(x) = \int_{-\infty}^{\infty} f(t) g(x - t) \, dt$, $f : \mathbb{R} \to \mathbb{R}$ is an unknown function, $g : \mathbb{R} \to \mathbb{R}$ is a known function, $X_1, \ldots, X_n$ are $n$ i.i.d. random variables with common density $h : \mathbb{R} \to [0, \infty)$, and $\xi_1, \ldots, \xi_n$ are $n$ i.i.d. random variables such that $E(\xi_1) = 0$ and $E(\xi_1^2) < \infty$. Throughout this paper, we assume that $f$, $g$ and $h$ are compactly supported with $\mathrm{supp}(f) = [a, b]$, $\mathrm{supp}(g) = [a_0, b_0]$, $\mathrm{supp}(h) = [a_*, b_*]$, $(a, b, a_0, b_0) \in \mathbb{R}^4$, $a < b$, $a_0 < b_0$, $a_* = a + a_0$, $b_* = b + b_0$, $f$ is $m$ times differentiable with $m \in \mathbb{N}$, $g$ is integrable and ordinary smooth (the precise definition is given by (K2) in Subsection 3.1), and $X_v$ and $\xi_v$ are independent for any $v = 1, \ldots, n$.
We aim to estimate the unknown function $f$ and its $m$-th derivative, denoted by $f^{(m)}$, from the sample $(Y_1, X_1), \ldots, (Y_n, X_n)$.
The motivation of this problem is the deconvolution of a signal $f$ from $f \star g$ perturbed by noise and randomly observed. The function $g$ can represent a driving force that was applied to a physical system. Such situations naturally
appear in various applied areas, such as astronomy, optics, seismology and biology.
The model (1.1) can also be viewed as a natural extension of some 1-periodic convolution regression models, as those considered by, e.g., Cavalier and Tsybakov (2002), Pensky and Sapatinas (2010) and Loubes and Marteau (2012). In the form (1.1), it has been considered in Birke and Bissantz (2008) and Birke et al. (2010) with a deterministic design, and in Hildebrandt et al. (2014) with a random design. These last works focus on kernel methods and establish their asymptotic normality. The estimation of $f^{(m)}$, which generalizes that of $f = f^{(0)}$, is of interest to examine possible bumps and to study the convexity and concavity properties of $f$ (see, for instance, Prakasa Rao (1983), for standard statistical models).
In this paper, we introduce new estimators for $f^{(m)}$ based on wavelet methods. Through the use of a multiresolution analysis, these methods enjoy local adaptivity against discontinuities and provide efficient estimators for a wide variety of unknown functions $f^{(m)}$. Basics on wavelet estimation can be found in, e.g., Antoniadis (1997), Härdle et al. (1998) and Vidakovic (1999). Results on the wavelet estimation of $f^{(m)}$ in other regression frameworks can be found in, e.g., Cai (2002), Petsa and Sapatinas (2011) and Chesneau (2014).
The first part of the study is devoted to the case where $h$, the common density of $X_1, \ldots, X_n$, is known. We develop a linear wavelet estimator and an adaptive nonlinear wavelet estimator. The second one uses the double hard thresholding technique introduced by Delyon and Juditsky (1996). It does not depend on the smoothness of $f^{(m)}$ in its construction; it is adaptive. We exhibit their rates of convergence via the mean integrated squared error (MISE) and the assumption that $f^{(m)}$ belongs to Besov balls. The obtained rates of convergence coincide with existing results for the estimation of $f^{(m)}$ in the 1-periodic convolution regression models (see, for instance, Chesneau (2010)).
The second part is devoted to the case where $h$ is unknown. We construct a new linear wavelet estimator using a plug-in approach for the estimation of $h$. Its construction follows the idea of the "NES linear wavelet estimator" introduced by Pensky and Vidakovic (2001) in another regression context. Then we investigate its MISE properties when $f^{(m)}$ belongs to Besov balls, which naturally depend on the MISE of the considered estimator for $h$. Furthermore, let us mention that all our results are proved with only moments of order 2 on $\xi_1$, which provides another theoretical contribution to the subject.
The remaining part of this paper is organized as follows. In Section 2 we describe some basics on wavelets and Besov balls, and present our wavelet estimation methodology. Section 3 is devoted to our estimators and their performances. The proofs are carried out in Section 4.
2. Preliminaries
This section is devoted to the presentation of the considered wavelet basis, the Besov balls and our wavelet estimation methodology.
2.1. Wavelet basis
Let us briefly present the wavelet basis on the interval $[a, b]$, $(a, b) \in \mathbb{R}^2$, introduced by Cohen et al. (1993). Let $\phi$ and $\psi$ be the initial wavelet functions of the Daubechies wavelets family db2N with $N \ge 1$ (see, e.g., Daubechies (1992)). These functions have the distinction of being compactly supported and of belonging to the class $\mathcal{C}^a$ for $N > 5a$. For any $j \ge 0$ and $k \in \mathbb{Z}$, we set
\[
\phi_{j,k}(x) = 2^{j/2} \phi(2^j x - k), \qquad \psi_{j,k}(x) = 2^{j/2} \psi(2^j x - k).
\]
With appropriate treatments at the boundaries, there exist an integer $\tau$ and a set of consecutive integers $\Lambda_j$ of cardinality proportional to $2^j$ (both depending on $a$, $b$ and $N$) such that, for any integer $\ell \ge \tau$,
\[
\mathcal{B} = \{\phi_{\ell,k},\ k \in \Lambda_\ell;\ \psi_{j,k},\ j \in \mathbb{N} - \{0, \ldots, \ell - 1\},\ k \in \Lambda_j\}
\]
forms an orthonormal basis of the space of squared integrable functions on $[a, b]$, i.e.,
\[
L^2([a, b]) = \left\{ u : [a, b] \to \mathbb{R};\ \left( \int_a^b |u(x)|^2 \, dx \right)^{1/2} < \infty \right\}.
\]
For the case $a = 0$ and $b = 1$, $\tau$ is the smallest integer satisfying $2^\tau \ge 2N$, and $\Lambda_j = \{0, \ldots, 2^j - 1\}$.
For any integer $\ell \ge \tau$ and $u \in L^2([a, b])$, we have the following wavelet expansion:
\[
u(x) = \sum_{k \in \Lambda_\ell} c_{\ell,k} \phi_{\ell,k}(x) + \sum_{j=\ell}^{\infty} \sum_{k \in \Lambda_j} d_{j,k} \psi_{j,k}(x), \qquad x \in [a, b],
\]
where
\[
c_{j,k} = \int_a^b u(x) \phi_{j,k}(x) \, dx, \qquad d_{j,k} = \int_a^b u(x) \psi_{j,k}(x) \, dx. \tag{2.1}
\]
An interesting feature of the wavelet basis is to provide a sparse representation of $u$; only a few wavelet coefficients $d_{j,k}$, characterized by a high magnitude, reveal the main details of $u$. See, e.g., Cohen et al. (1993) and Mallat (2009).
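To make the sparsity claim concrete, here is a small numerical sketch with the Haar pair (db2, i.e. $N = 1$ in the above notation) on $[0, 1]$; the test function, the grid sizes and the 0.05 magnitude cutoff are our own illustrative choices, not quantities from the paper:

```python
import numpy as np

# Haar scaling/wavelet pair on [0, 1] (db2, the case N = 1 of db2N);
# the test function and the 0.05 cutoff below are illustrative choices.
def phi(x):
    return np.where((x >= 0) & (x < 1), 1.0, 0.0)

def psi(x):
    return phi(2 * x) - phi(2 * x - 1)

def d_coeff(u_vals, j, k, grid, dx):
    # d_{j,k} = \int_0^1 u(x) psi_{j,k}(x) dx, by a Riemann sum
    return np.sum(u_vals * 2 ** (j / 2) * psi(2 ** j * grid - k)) * dx

grid = np.linspace(0.0, 1.0, 20001)[:-1]  # left endpoints
dx = 1.0 / 20000
u_vals = np.sin(2 * np.pi * grid) + (grid > 0.3)  # smooth part + a jump

details = np.array([d_coeff(u_vals, j, k, grid, dx)
                    for j in range(3, 8) for k in range(2 ** j)])

# Sparsity: only a handful of the 248 coefficients (mostly those whose
# support contains the jump at x = 0.3) have non-negligible magnitude.
large = int(np.sum(np.abs(details) > 0.05))
print(large, details.size)
```

Most detail coefficients of the smooth part decay rapidly with the level $j$; the few large ones track the discontinuity, which is the local adaptivity exploited by the estimators below.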
2.2. Besov balls
We say that a function $u \in L^2([a, b])$ belongs to the Besov ball $B^s_{p,r}(M)$ with $s > 0$, $p \ge 1$, $r \ge 1$ and $M > 0$ if there exists a constant $C > 0$ such that $c_{j,k}$ and $d_{j,k}$ (2.1) satisfy
\[
2^{\tau(1/2 - 1/p)} \left( \sum_{k \in \Lambda_\tau} |c_{\tau,k}|^p \right)^{1/p} + \left( \sum_{j=\tau}^{\infty} \left( 2^{j(s + 1/2 - 1/p)} \left( \sum_{k \in \Lambda_j} |d_{j,k}|^p \right)^{1/p} \right)^r \right)^{1/r} \le C,
\]
with the usual modifications if $p = \infty$ or $r = \infty$.
The interest of Besov balls is to contain various kinds of homogeneous and inhomogeneous functions $u$. See, e.g., Meyer (1992), Donoho et al. (1996) and Härdle et al. (1998).
2.3. Wavelet estimation
Let $f$ be the unknown function in (1.1) and $\mathcal{B}$ the considered wavelet basis taken with $N > 5m$ (to ensure that $\phi$ and $\psi$ belong to the class $\mathcal{C}^m$). Suppose that $f^{(m)}$ exists with $f^{(m)} \in L^2([a, b])$.
The first step in the wavelet estimation consists in expanding $f^{(m)}$ on $\mathcal{B}$ as
\[
f^{(m)}(x) = \sum_{k \in \Lambda_\ell} c^{(m)}_{\ell,k} \phi_{\ell,k}(x) + \sum_{j=\ell}^{\infty} \sum_{k \in \Lambda_j} d^{(m)}_{j,k} \psi_{j,k}(x), \qquad x \in [a, b], \tag{2.2}
\]
where $\ell \ge \tau$ and
\[
c^{(m)}_{j,k} = \int_a^b f^{(m)}(x) \phi_{j,k}(x) \, dx, \qquad d^{(m)}_{j,k} = \int_a^b f^{(m)}(x) \psi_{j,k}(x) \, dx. \tag{2.3}
\]
The second step is the estimation of $c^{(m)}_{j,k}$ and $d^{(m)}_{j,k}$ using $(Y_1, X_1), \ldots, (Y_n, X_n)$.
The idea of the third step is to exploit the sparse representation of $f^{(m)}$ by selecting the most interesting wavelet coefficient estimators. This selection can be of different natures (truncation, thresholding, ...). Finally, we reconstruct from these wavelet coefficient estimators on $\mathcal{B}$, providing an estimator $\hat f^{(m)}$ for $f^{(m)}$.
In this study, we evaluate the performance of $\hat f^{(m)}$ by studying the asymptotic properties of its MISE under the assumption that $f^{(m)} \in B^s_{p,r}(M)$. More precisely, we aim to determine the sharpest rate of convergence $\omega_n$ such that
\[
E\left( \int_a^b |\hat f^{(m)}(x) - f^{(m)}(x)|^2 \, dx \right) \le C \omega_n,
\]
where $C$ denotes a constant independent of $n$.
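The expansion/estimation/selection/reconstruction steps can be sketched schematically as follows; to keep the sketch self-contained we simulate noisy coefficients directly (with noise level $1/\sqrt{n}$) instead of computing them from a regression sample, and the threshold value is an illustrative universal-type choice of ours:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Step 1 (schematic): true wavelet coefficients of some f^(m): a few large
# ones and many zeros, mimicking a sparse representation on B.
true = np.zeros(256)
true[[0, 3, 17, 64]] = [2.0, -1.5, 1.0, 0.7]

# Step 2: empirical coefficients; the estimation error is of order 1/sqrt(n).
hat = true + rng.normal(0.0, 1.0 / np.sqrt(n), size=true.size)

# Step 3: selection by hard thresholding (illustrative universal-type level).
lam = np.sqrt(np.log(n) / n)
selected = np.where(np.abs(hat) >= lam, hat, 0.0)

# Step 4: by orthonormality of B, the MISE equals the l2 coefficient error.
mise_linear = float(np.sum((hat - true) ** 2))
mise_thresh = float(np.sum((selected - true) ** 2))
print(mise_linear, mise_thresh)
```

Thresholding removes the many pure-noise coefficients while keeping the few large ones, which is why the thresholded error is smaller than the error of the raw (linear) coefficients.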
3. Rates of convergence
In this section, we list the assumptions on the model, present our wavelet es- timators and determine their rates of convergence under the MISE over Besov balls.
3.1. Assumptions
Let us recall that $f$ and $g$ are the functions in (1.1) and $h$ is the density of $X_1$. We formulate the following assumptions:
(K1) We have $f^{(q)}(a) = f^{(q)}(b) = 0$ for any $q \in \{0, \ldots, m\}$, $f^{(m)} \in L^2([a, b])$ and there exists a known constant $C_1 > 0$ such that $\sup_{x \in [a, b]} |f(x)| \le C_1$.
(K2) First of all, let us define the Fourier transform of an integrable function $u$ by
\[
\mathcal{F}(u)(x) = \int_{-\infty}^{\infty} u(y) e^{-ixy} \, dy, \qquad x \in \mathbb{R}.
\]
The notation $\overline{\,\cdot\,}$ will be used for the complex conjugate.
We have $g \in L^2([a_0, b_0])$ and there exist two constants, $c_1 > 0$ and $\delta \ge 0$, such that
\[
|\mathcal{F}(g)(x)| \ge \frac{c_1}{(1 + x^2)^{\delta/2}}, \qquad x \in \mathbb{R}. \tag{3.1}
\]
(K3) There exists a constant $c_2 > 0$ such that
\[
c_2 \le \inf_{x \in [a_*, b_*]} h(x).
\]
The assumptions (K1) and (K3) are standard in a nonparametric regression framework (see, for instance, Tsybakov (2004)). Remark that we do not need $f(a) = f(b) = 0$ for the estimation of $f = f^{(0)}$. The assumption (K2) is the so-called "ordinary smooth case" on $g$. It is common in the deconvolution estimation of densities (see, e.g., Fan and Koo (2002) and Pensky and Vidakovic (1999)). An example of a compactly supported function $g$ satisfying (K2) is $g(x) = \int_1^2 y \max(1 - |x|/y, 0) \, dy$. Then $\mathrm{supp}(g) = [-2, 2]$, $g \in L^2([-2, 2])$ and (K2) is satisfied with $\delta = 2$ and $c_1 = \min(\inf_{x \in [-2\pi, 2\pi]} |\mathcal{F}(g)(x)|, 1/(4\pi^2)) > 0$.
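As a quick numerical sanity check of this example: the closed form of $g$ below follows from evaluating the inner integral (the integrand equals $y - |x|$ on $[\max(1, |x|), 2]$); the grid sizes and the lower bound tested are arbitrary choices of ours:

```python
import numpy as np

# g(x) = \int_1^2 y * max(1 - |x|/y, 0) dy, in closed form:
# 3/2 - |x| on [-1, 1] and (2 - |x|)^2 / 2 on 1 < |x| <= 2.
def g(x):
    ax = np.abs(x)
    return np.where(ax <= 1.0, 1.5 - ax,
                    np.where(ax <= 2.0, 0.5 * (2.0 - ax) ** 2, 0.0))

# |F(g)(t)| on a grid, via direct quadrature of \int g(y) e^{-ity} dy
y = np.linspace(-2.0, 2.0, 4001)
dy = y[1] - y[0]
ts = np.linspace(-20.0, 20.0, 801)
Fg = np.array([np.sum(g(y) * np.exp(-1j * t * y)) * dy for t in ts])

# Ordinary smoothness with delta = 2: |F(g)(t)| * (1 + t^2) stays
# bounded away from 0 on the grid (here F(g)(0) = \int g = 7/3).
ratio = np.abs(Fg) * (1.0 + ts ** 2)
print(ratio.min())
```

One can check analytically that $\mathcal{F}(g)(t) = (4/t^2) \int_1^2 \sin^2(ty/2) \, dy > 0$ for $t \ne 0$, which is what the grid computation reflects.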
3.2. When $h$, the $X_i$'s common density, is known
3.2.1. Linear wavelet estimator
We define the linear wavelet estimator $\hat f_1^{(m)}$ by
\[
\hat f_1^{(m)}(x) = \sum_{k \in \Lambda_{j_0}} \hat c^{(m)}_{j_0,k} \phi_{j_0,k}(x), \qquad x \in [a, b], \tag{3.2}
\]
where
\[
\hat c^{(m)}_{j,k} = \frac{1}{n} \sum_{v=1}^{n} \frac{Y_v}{h(X_v)} \frac{1}{2\pi} \int_{-\infty}^{\infty} (ix)^m \frac{\overline{\mathcal{F}(\phi_{j,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_v} \, dx, \tag{3.3}
\]
and $j_0$ is an integer chosen a posteriori.
Proposition 3.1 presents an elementary property of $\hat c^{(m)}_{j,k}$.
Proposition 3.1. Let $\hat c^{(m)}_{j,k}$ be (3.3) and $c^{(m)}_{j,k}$ be (2.3). Suppose that (K1) holds. Then we have
\[
E(\hat c^{(m)}_{j,k}) = c^{(m)}_{j,k}.
\]
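This unbiasedness rests on the identity $E\big( (Y_1/h(X_1)) e^{-ixX_1} \big) = \mathcal{F}(f)(x)\mathcal{F}(g)(x)$, established as (4.1) in the proofs. Here is a Monte Carlo sanity check of that identity in a toy setting; all choices are ours ($f = g = 1_{[0,1]}$, so $f \star g$ is the triangle on $[0, 2]$, $X$ uniform on $[0, 2]$ so $h = 1/2$, Gaussian noise):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Toy choices (ours, not the paper's): f = g = 1_[0,1], so f*g is the
# triangle on [0,2]; X ~ Uniform[0,2] gives h = 1/2 on [0,2].
tri = lambda x: np.where(x < 1.0, x, 2.0 - x)
X = rng.uniform(0.0, 2.0, n)
Y = tri(X) + rng.normal(0.0, 0.1, n)          # xi: mean 0, finite variance

x0 = 1.3
mc = np.mean(Y / 0.5 * np.exp(-1j * x0 * X))  # E[(Y/h(X)) e^{-i x0 X}]

Ff = (1.0 - np.exp(-1j * x0)) / (1j * x0)     # F(f)(x0) = F(g)(x0)
target = Ff * Ff                              # F(f)(x0) F(g)(x0)
print(abs(mc - target))
```

The Monte Carlo average matches $\mathcal{F}(f)(x_0)\mathcal{F}(g)(x_0)$ up to the expected $O(1/\sqrt{n})$ fluctuation.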
Theorem 3.1 below investigates the performance of $\hat f_1^{(m)}$ in terms of rates of convergence under the MISE over Besov balls.
Theorem 3.1. Suppose that (K1)-(K3) are satisfied and that $f^{(m)} \in B^s_{p,r}(M)$ with $M > 0$, $p \ge 1$, $r \ge 1$, $s \in (\max(1/p - 1/2, 0), N)$ and $N > 5(m + \delta + 1)$. Let $\hat f_1^{(m)}$ be defined by (3.2) with $j_0$ such that
\[
2^{j_0} = \left[ n^{1/(2 s_* + 2m + 2\delta + 1)} \right], \tag{3.4}
\]
$s_* = s + \min(1/2 - 1/p, 0)$ ($[a]$ denotes the integer part of $a$). Then there exists a constant $C > 0$ such that
\[
E\left( \int_a^b |\hat f_1^{(m)}(x) - f^{(m)}(x)|^2 \, dx \right) \le C n^{-2 s_*/(2 s_* + 2m + 2\delta + 1)}.
\]
Note that the rate of convergence $n^{-2 s_*/(2 s_* + 2m + 2\delta + 1)}$ corresponds to the one obtained in the estimation of $f^{(m)}$ in the 1-periodic white noise convolution model with an adapted linear wavelet estimator (see, e.g., Chesneau (2010)).
The considered estimator $\hat f_1^{(m)}$ depends on $s$ (the smoothness parameter of $f^{(m)}$); it is not adaptive. This aspect, as well as the rate of convergence $n^{-2 s_*/(2 s_* + 2m + 2\delta + 1)}$, can be improved with thresholding methods. The next paragraph is devoted to one of them: the hard thresholding method.
3.2.2. Hard thresholding wavelet estimator
Suppose that (K2) is satisfied. We define the hard thresholding wavelet estimator $\hat f_2^{(m)}$ by
\[
\hat f_2^{(m)}(x) = \sum_{k \in \Lambda_\tau} \hat c^{(m)}_{\tau,k} \phi_{\tau,k}(x) + \sum_{j=\tau}^{j_1} \sum_{k \in \Lambda_j} \hat d^{(m)}_{j,k} 1_{\{|\hat d^{(m)}_{j,k}| \ge \kappa \lambda_j\}} \psi_{j,k}(x), \qquad x \in [a, b], \tag{3.5}
\]
where $\hat c^{(m)}_{j,k}$ is defined by (3.3),
\[
\hat d^{(m)}_{j,k} = \frac{1}{n} \sum_{v=1}^{n} \frac{Y_v}{h(X_v)} \frac{1}{2\pi} \int_{-\infty}^{\infty} (ix)^m \frac{\overline{\mathcal{F}(\psi_{j,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_v} \, dx \; 1_{\left\{ \left| \frac{Y_v}{h(X_v)} \frac{1}{2\pi} \int_{-\infty}^{\infty} (ix)^m \frac{\overline{\mathcal{F}(\psi_{j,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_v} \, dx \right| \le \varsigma_j \right\}}, \tag{3.6}
\]
$1$ is the indicator function, $\kappa > 0$ is a large enough constant, $j_1$ is the integer satisfying
\[
2^{j_1} = \left[ n^{1/(2m + 2\delta + 1)} \right],
\]
$\delta$ refers to (3.1),
\[
\varsigma_j = \theta_\psi 2^{mj} 2^{\delta j} \sqrt{\frac{n}{\ln n}}, \qquad \lambda_j = \theta_\psi 2^{mj} 2^{\delta j} \sqrt{\frac{\ln n}{n}}
\]
and
\[
\theta_\psi = \sqrt{ \frac{1}{\pi c_2 c_1^2} \left( C_1^2 \left( \int_{-\infty}^{\infty} |g(x)| \, dx \right)^2 + E(\xi_1^2) \right) \int_{-\infty}^{\infty} x^{2m} (1 + x^2)^{\delta} |\mathcal{F}(\psi)(x)|^2 \, dx }.
\]
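For concreteness, the levels and thresholds scale as follows; a minimal sketch with $\theta_\psi$ set to 1 purely for illustration (the true constant depends on $g$, $c_1$, $c_2$, $C_1$ and $E(\xi_1^2)$), and $m$, $\delta$ chosen arbitrarily:

```python
import numpy as np

n = 10_000
m, delta = 1, 2    # illustrative derivative order / ill-posedness degree
theta_psi = 1.0    # placeholder; the true theta_psi depends on the model

# largest integer j1 with 2^{j1} <= n^{1/(2m+2delta+1)}
j1 = int(np.floor(np.log2(n) / (2 * m + 2 * delta + 1)))

# lambda_j grows by a factor 2^{m+delta} per level; sigma_j * lambda_j
# equals theta_psi^2 * 2^{2(m+delta)j} for every j.
lambdas = [theta_psi * 2 ** ((m + delta) * j) * np.sqrt(np.log(n) / n)
           for j in range(j1 + 1)]
sigmas = [theta_psi * 2 ** ((m + delta) * j) * np.sqrt(n / np.log(n))
          for j in range(j1 + 1)]
print(j1, [round(l, 4) for l in lambdas])
```

The small value of $j_1$ here reflects how the deconvolution degree $\delta$ limits the usable resolution levels.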
The construction of $\hat f_2^{(m)}$ uses the double hard thresholding technique introduced by Delyon and Juditsky (1996) and recently improved by Chaubey et al. (2014). The main interest of the thresholding using $\lambda_j$ is to make $\hat f_2^{(m)}$ adaptive; the construction (and performance) of $\hat f_2^{(m)}$ does not depend on the knowledge of the smoothness of $f^{(m)}$. The role of the thresholding using $\varsigma_j$ in (3.6) is to relax some usual restrictions on the model. More specifically, it enables us to suppose only that $\xi_1$ admits finite moments of order 2 (with known $E(\xi_1^2)$ or a known upper bound on $E(\xi_1^2)$), relaxing the standard assumption $E(|\xi_1|^k) < \infty$ for any $k \in \mathbb{N}$.
Further details on the constructions of hard thresholding wavelet estimators can be found in, e.g., Donoho and Johnstone (1994, 1995), Donoho et al. (1995, 1996), Delyon and Juditsky (1996) and Härdle et al. (1998).
Theorem 3.2 below investigates the performance of $\hat f_2^{(m)}$ in terms of rates of convergence under the MISE over Besov balls.
Theorem 3.2. Suppose that (K1)-(K3) are satisfied and that $f^{(m)} \in B^s_{p,r}(M)$ with $M > 0$, $r \ge 1$, $\{p \ge 2,\ s \in (0, N)\}$ or $\{p \in [1, 2),\ s \in ((2m + 2\delta + 1)/p, N)\}$, and $N > 5(m + \delta + 1)$. Let $\hat f_2^{(m)}$ be defined by (3.5). Then there exists a constant $C > 0$ such that
\[
E\left( \int_a^b |\hat f_2^{(m)}(x) - f^{(m)}(x)|^2 \, dx \right) \le C \left( \frac{\ln n}{n} \right)^{2s/(2s + 2m + 2\delta + 1)}.
\]
The proof of Theorem 3.2 is an application of a general result established by (Chaubey et al., 2014, Theorem 6.1). Let us mention that $(\ln n/n)^{2s/(2s + 2m + 2\delta + 1)}$ corresponds to the rate of convergence obtained in the estimation of $f^{(m)}$ in the 1-periodic white noise convolution model with an adapted hard thresholding wavelet estimator (see, e.g., Chesneau (2010)). In the case $m = 0$ and $\delta = 0$, this rate of convergence becomes the optimal one in the minimax sense for the standard density and regression estimation problems (see Härdle et al. (1998)).
In comparison to Theorem 3.1, note that
• for the case $p \ge 2$, corresponding to the homogeneous zone of Besov balls, $(\ln n/n)^{2s/(2s + 2m + 2\delta + 1)}$ is equal to the rate of convergence attained by $\hat f_1^{(m)}$, up to a logarithmic term,
• for the case $p \in [1, 2)$, corresponding to the inhomogeneous zone of Besov balls, it is significantly better in terms of power.
3.3. When $h$, the $X_i$'s common density, is unknown
In the case where $h$ is unknown, we propose a plug-in technique which consists in estimating $h$ in the construction of $\hat f_1^{(m)}$ (3.2). This yields the linear wavelet estimator $\hat f_3^{(m)}$ defined by
\[
\hat f_3^{(m)}(x) = \sum_{k \in \Lambda_{j_2}} \tilde c^{(m)}_{j_2,k} \phi_{j_2,k}(x), \qquad x \in [a, b], \tag{3.7}
\]
where
\[
\tilde c^{(m)}_{j,k} = \frac{1}{a_n} \sum_{v=1}^{a_n} \frac{Y_v}{\hat h(X_v)} 1_{\{|\hat h(X_v)| \ge c_2/2\}} \frac{1}{2\pi} \int_{-\infty}^{\infty} (ix)^m \frac{\overline{\mathcal{F}(\phi_{j,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_v} \, dx,
\]
$a_n = [n/2]$, $j_2$ is an integer chosen a posteriori, $c_2$ refers to (K3) and $\hat h$ is an estimator of $h$ constructed from the random variables $U_n = (X_{a_n+1}, \ldots, X_n)$.
There are numerous possibilities for the choice of $\hat h$. For instance, $\hat h$ can be a kernel density estimator or a wavelet density estimator (see, e.g., Donoho et al. (1996), Härdle et al. (1998) and Juditsky and Lambert-Lacroix (2004)).
The estimator $\hat f_3^{(m)}$ is derived from the "NES linear wavelet estimator" introduced by Pensky and Vidakovic (2001) and recently revisited in a simpler form by Chesneau (2014).
Theorem 3.3 below determines an upper bound of the MISE of $\hat f_3^{(m)}$.
Theorem 3.3. Suppose that (K1)-(K3) are satisfied, $h \in L^2([a_*, b_*])$ and that $f^{(m)} \in B^s_{p,r}(M)$ with $M > 0$, $p \ge 1$, $r \ge 1$, $s \in (\max(1/p - 1/2, 0), N)$ and $N > 5(m + \delta + 1)$. Let $\hat f_3^{(m)}$ be defined by (3.7) with $j_2$ such that $2^{j_2} \le n$. Then there exists a constant $C > 0$ such that
\[
E\left( \int_a^b |\hat f_3^{(m)}(x) - f^{(m)}(x)|^2 \, dx \right) \le C \left( 2^{(2m + 2\delta + 1) j_2} \max\left( E\left( \int_{a_*}^{b_*} |\hat h(x) - h(x)|^2 \, dx \right), \frac{1}{n} \right) + 2^{-2 j_2 s_*} \right),
\]
with $s_* = s + \min(1/2 - 1/p, 0)$.
The proof follows the idea of (Chesneau, 2014, Theorem 3) and uses technical operations on Fourier transforms.
From Theorem 3.3,
• if we choose $\hat h = h$ and $j_2 = j_0$ (3.4), we obtain Theorem 3.1,
• if $\hat h$ and $h$ satisfy: there exist $\upsilon \in [0, 1]$ and a constant $C > 0$ such that
\[
E\left( \int_{a_*}^{b_*} |\hat h(x) - h(x)|^2 \, dx \right) \le C n^{-\upsilon},
\]
then the optimal integer $j_2$ is such that $2^{j_2} = [n^{\upsilon/(2 s_* + 2m + 2\delta + 1)}]$ and we obtain the following rate of convergence for $\hat f_3^{(m)}$:
\[
E\left( \int_a^b |\hat f_3^{(m)}(x) - f^{(m)}(x)|^2 \, dx \right) \le C n^{-2 s_* \upsilon/(2 s_* + 2m + 2\delta + 1)}.
\]
Naturally, the estimation of $h$ has a negative impact on the performance of $\hat f_3^{(m)}$. In particular, if $h \in B^{s_0}_{p_0,r_0}(M_0)$, then the standard linear wavelet density estimator $\hat h$ attains the rate of convergence $n^{-\upsilon}$ with $\upsilon = 2 s_\circ/(2 s_\circ + 1)$, $s_\circ = s_0 + \min(1/2 - 1/p_0, 0)$ (and it is optimal in the minimax sense for $p_0 \ge 2$, see Härdle et al. (1998)). With this choice, the rate of convergence for $\hat f_3^{(m)}$ becomes $n^{-4 s_* s_\circ/((2 s_\circ + 1)(2 s_* + 2m + 2\delta + 1))}$. Let us mention that $\hat f_3^{(m)}$ is not adaptive since it depends on $s$. However, $\hat f_3^{(m)}$ remains an acceptable first approach for the estimation of $f^{(m)}$ with unknown $h$.
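To quantify how the estimation of $h$ degrades the exponent, the rates of Theorems 3.1 and 3.3 can be tabulated; all parameter values below are illustrative choices of ours:

```python
# Rate exponents from Theorems 3.1 and 3.3; parameter values are illustrative.
s, p, m, delta = 3.0, 2.0, 1, 2
s_star = s + min(0.5 - 1.0 / p, 0.0)   # here s_star = s since p >= 2

# Theorem 3.1 (h known): MISE <= C n^(-known_h_exponent)
known_h_exponent = 2 * s_star / (2 * s_star + 2 * m + 2 * delta + 1)

# Plug-in (h unknown) with E||h_hat - h||^2 <= C n^(-upsilon); taking the
# linear wavelet density rate upsilon = 2 s_circ / (2 s_circ + 1):
s_circ = 2.0                            # smoothness of h, our choice
upsilon = 2 * s_circ / (2 * s_circ + 1)
unknown_h_exponent = upsilon * known_h_exponent

print(known_h_exponent, unknown_h_exponent)
```

Since $\upsilon < 1$, the plug-in exponent is always a strict contraction of the known-$h$ exponent.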
Conclusion and perspectives. This study considers the estimation of $f^{(m)}$ from (1.1). Depending on whether $h$ is known or not, we propose wavelet methods and prove that they attain fast rates of convergence under the MISE over Besov balls. Among the perspectives of this work, we retain:
• The relaxation of the assumption (K2), perhaps by considering (K2'): there exist four constants, $C_1 > 0$, $\omega \in \mathbb{N}$, $\eta > 0$ and $\delta \ge 0$, such that
\[
|\mathcal{F}(g)(x)|^{-1} \le C_1 \left| \sin\left( \frac{\pi x}{\eta} \right) \right|^{-\omega} (1 + |x|)^{\delta}, \qquad x \in \mathbb{R}.
\]
This condition was first introduced by Delaigle and Meister (2011) in a context of deconvolution estimation of functions. It is implied by (K2) and has the advantage of covering some functions $g$ whose Fourier transform has zeros, as is the case for numerous kinds of compactly supported functions.
• The construction of an adaptive version of $\hat f_3^{(m)}$ through the use of a thresholding method.
• The extension of our results to the $L^p$ risk with $p \ge 1$.
All these aspects need further investigations that we leave for future works.
4. Proofs
In this section, $C$ denotes any constant that does not depend on $j$, $k$ and $n$. Its value may change from one term to another and may depend on $\phi$ or $\psi$.
Proof of Proposition 3.1. By the independence between $X_1$ and $\xi_1$, $E(\xi_1) = 0$, $\mathrm{supp}(f \star g) = \mathrm{supp}(h) = [a_*, b_*]$ and $\mathcal{F}(f \star g)(x) = \mathcal{F}(f)(x) \mathcal{F}(g)(x)$, we have
\begin{align*}
E\left( \frac{Y_1}{h(X_1)} e^{-ixX_1} \right) & = E\left( \frac{(f \star g)(X_1)}{h(X_1)} e^{-ixX_1} \right) + E(\xi_1) E\left( \frac{1}{h(X_1)} e^{-ixX_1} \right) \\
& = E\left( \frac{(f \star g)(X_1)}{h(X_1)} e^{-ixX_1} \right) = \int_{a_*}^{b_*} \frac{(f \star g)(y)}{h(y)} e^{-ixy} h(y) \, dy \\
& = \int_{-\infty}^{\infty} (f \star g)(y) e^{-ixy} \, dy = \mathcal{F}(f \star g)(x) = \mathcal{F}(f)(x) \mathcal{F}(g)(x). \tag{4.1}
\end{align*}
It follows from (K1) and $m$ integrations by parts that $(ix)^m \mathcal{F}(f)(x) = \mathcal{F}(f^{(m)})(x)$. Using this equality, (4.1) and the Parseval identity, we obtain
\begin{align*}
E(\hat c^{(m)}_{j,k}) & = E\left( \frac{Y_1}{h(X_1)} \frac{1}{2\pi} \int_{-\infty}^{\infty} (ix)^m \frac{\overline{\mathcal{F}(\phi_{j,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_1} \, dx \right) \\
& = \frac{1}{2\pi} \int_{-\infty}^{\infty} (ix)^m \frac{\overline{\mathcal{F}(\phi_{j,k})(x)}}{\mathcal{F}(g)(x)} E\left( \frac{Y_1}{h(X_1)} e^{-ixX_1} \right) dx \\
& = \frac{1}{2\pi} \int_{-\infty}^{\infty} (ix)^m \frac{\overline{\mathcal{F}(\phi_{j,k})(x)}}{\mathcal{F}(g)(x)} \mathcal{F}(f)(x) \mathcal{F}(g)(x) \, dx \\
& = \frac{1}{2\pi} \int_{-\infty}^{\infty} (ix)^m \mathcal{F}(f)(x) \overline{\mathcal{F}(\phi_{j,k})(x)} \, dx \\
& = \frac{1}{2\pi} \int_{-\infty}^{\infty} \mathcal{F}(f^{(m)})(x) \overline{\mathcal{F}(\phi_{j,k})(x)} \, dx = \int_a^b f^{(m)}(x) \phi_{j,k}(x) \, dx = c^{(m)}_{j,k}.
\end{align*}
Proposition 3.1 is proved.
Proof of Theorem 3.1. We expand the function $f^{(m)}$ on $\mathcal{B}$ as (2.2) at the level $\ell = j_0$. Since $\mathcal{B}$ forms an orthonormal basis of $L^2([a, b])$, we get
\[
E\left( \int_a^b |\hat f_1^{(m)}(x) - f^{(m)}(x)|^2 \, dx \right) = \sum_{k \in \Lambda_{j_0}} E\left( |\hat c^{(m)}_{j_0,k} - c^{(m)}_{j_0,k}|^2 \right) + \sum_{j=j_0}^{\infty} \sum_{k \in \Lambda_j} (d^{(m)}_{j,k})^2. \tag{4.2}
\]
Using Proposition 3.1, the fact that $(Y_1, X_1), \ldots, (Y_n, X_n)$ are i.i.d., the inequalities $V(D) \le E(|D|^2)$ for any complex random variable $D$ and $(x + y)^2 \le 2(x^2 + y^2)$, $(x, y) \in \mathbb{R}^2$, and (K1) and (K3), we have
\begin{align*}
E\left( |\hat c^{(m)}_{j_0,k} - c^{(m)}_{j_0,k}|^2 \right) & = V\left( \hat c^{(m)}_{j_0,k} \right) = \frac{1}{n} V\left( \frac{Y_1}{h(X_1)} \frac{1}{2\pi} \int_{-\infty}^{\infty} (ix)^m \frac{\overline{\mathcal{F}(\phi_{j_0,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_1} \, dx \right) \\
& \le \frac{1}{(2\pi)^2 n} E\left( \left| \frac{Y_1}{h(X_1)} \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_0,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_1} \, dx \right|^2 \right) \\
& \le \frac{2}{(2\pi)^2 n} E\left( \frac{((f \star g)(X_1))^2 + \xi_1^2}{(h(X_1))^2} \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_0,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_1} \, dx \right|^2 \right) \\
& \le \frac{1}{n} \frac{2}{(2\pi)^2 c_2} \left( C_1^2 \left( \int_{-\infty}^{\infty} |g(x)| \, dx \right)^2 + E(\xi_1^2) \right) E\left( \frac{1}{h(X_1)} \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_0,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_1} \, dx \right|^2 \right). \tag{4.3}
\end{align*}
The Parseval identity yields
\begin{align*}
E\left( \frac{1}{h(X_1)} \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_0,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_1} \, dx \right|^2 \right) & = \int_{a_*}^{b_*} \frac{1}{h(y)} \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_0,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixy} \, dx \right|^2 h(y) \, dy \\
& \le \int_{-\infty}^{\infty} \left| \mathcal{F}\left( x^m \frac{\overline{\mathcal{F}(\phi_{j_0,k})(x)}}{\mathcal{F}(g)(x)} \right)(y) \right|^2 dy = 2\pi \int_{-\infty}^{\infty} \left| x^m \frac{\mathcal{F}(\phi_{j_0,k})(x)}{\mathcal{F}(g)(x)} \right|^2 dx. \tag{4.4}
\end{align*}
Using (K2), $|\mathcal{F}(\phi_{j_0,k})(x)| = 2^{-j_0/2} |\mathcal{F}(\phi)(x/2^{j_0})|$ and a change of variables, we obtain
\begin{align*}
\int_{-\infty}^{\infty} \left| x^m \frac{\mathcal{F}(\phi_{j_0,k})(x)}{\mathcal{F}(g)(x)} \right|^2 dx & \le \frac{1}{c_1^2} \int_{-\infty}^{\infty} x^{2m} (1 + x^2)^{\delta} |\mathcal{F}(\phi_{j_0,k})(x)|^2 \, dx \\
& = \frac{1}{c_1^2} 2^{-j_0} \int_{-\infty}^{\infty} x^{2m} (1 + x^2)^{\delta} |\mathcal{F}(\phi)(x/2^{j_0})|^2 \, dx \\
& = \frac{1}{c_1^2} \int_{-\infty}^{\infty} 2^{2 j_0 m} x^{2m} (1 + 2^{2 j_0} x^2)^{\delta} |\mathcal{F}(\phi)(x)|^2 \, dx \\
& \le \frac{1}{c_1^2} 2^{(2m + 2\delta) j_0} \int_{-\infty}^{\infty} x^{2m} (1 + x^2)^{\delta} |\mathcal{F}(\phi)(x)|^2 \, dx \le C 2^{(2m + 2\delta) j_0}. \tag{4.5}
\end{align*}
(Let us mention that $\int_{-\infty}^{\infty} x^{2m} (1 + x^2)^{\delta} |\mathcal{F}(\phi)(x)|^2 \, dx$ is finite thanks to $N > 5(m + \delta + 1)$.)
Putting (4.3), (4.4) and (4.5) together, we have
\[
E\left( |\hat c^{(m)}_{j_0,k} - c^{(m)}_{j_0,k}|^2 \right) \le C 2^{(2m + 2\delta) j_0} \frac{1}{n}.
\]
For the integer $j_0$ satisfying (3.4), it holds that
\[
\sum_{k \in \Lambda_{j_0}} E\left( |\hat c^{(m)}_{j_0,k} - c^{(m)}_{j_0,k}|^2 \right) \le C 2^{(2m + 2\delta + 1) j_0} \frac{1}{n} \le C n^{-2 s_*/(2 s_* + 2m + 2\delta + 1)}. \tag{4.6}
\]
Let us now bound the last term in (4.2). Since $f^{(m)} \in B^s_{p,r}(M) \subseteq B^{s_*}_{2,\infty}(M)$ (see (Härdle et al., 1998, Corollary 9.2)), we obtain
\[
\sum_{j=j_0}^{\infty} \sum_{k \in \Lambda_j} (d^{(m)}_{j,k})^2 \le C 2^{-2 j_0 s_*} \le C n^{-2 s_*/(2 s_* + 2m + 2\delta + 1)}. \tag{4.7}
\]
Owing to (4.2), (4.6) and (4.7), we have
\[
E\left( \int_a^b |\hat f_1^{(m)}(x) - f^{(m)}(x)|^2 \, dx \right) \le C n^{-2 s_*/(2 s_* + 2m + 2\delta + 1)}.
\]
Theorem 3.1 is proved.
Proof of Theorem 3.2. Observe that, for $\gamma \in \{\phi, \psi\}$, any integer $j \ge \tau$ and $k \in \Lambda_j$:
(a1) using arguments similar to those in Proposition 3.1, we obtain
\[
E\left( \frac{1}{n} \sum_{v=1}^{n} \frac{Y_v}{h(X_v)} \frac{1}{2\pi} \int_{-\infty}^{\infty} (ix)^m \frac{\overline{\mathcal{F}(\gamma_{j,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_v} \, dx \right) = \int_a^b f^{(m)}(x) \gamma_{j,k}(x) \, dx.
\]
(a2) using (4.3), (4.4) and (4.5) with $\gamma$ instead of $\phi$, we have
\[
\sum_{v=1}^{n} E\left( \left| \frac{Y_v}{h(X_v)} \frac{1}{2\pi} \int_{-\infty}^{\infty} (ix)^m \frac{\overline{\mathcal{F}(\gamma_{j,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_v} \, dx \right|^2 \right) = n E\left( \left| \frac{Y_1}{h(X_1)} \frac{1}{2\pi} \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\gamma_{j,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_1} \, dx \right|^2 \right) \le C_*^2 n 2^{(2m + 2\delta) j},
\]
with
\[
C_*^2 = \frac{1}{\pi c_2 c_1^2} \left( C_1^2 \left( \int_{-\infty}^{\infty} |g(x)| \, dx \right)^2 + E(\xi_1^2) \right) \int_{-\infty}^{\infty} x^{2m} (1 + x^2)^{\delta} |\mathcal{F}(\gamma)(x)|^2 \, dx.
\]
Thanks to (a1) and (a2), we can apply (Chaubey et al., 2014, Theorem 6.1) (see Appendix) with $\mu_n = \upsilon_n = n$, $\sigma = m + \delta$, $\theta_\gamma = C_*$, $W_v = (Y_v, X_v)$,
\[
q_v(\gamma, (y, z)) = \frac{y}{h(z)} \frac{1}{2\pi} \int_{-\infty}^{\infty} (ix)^m \frac{\overline{\mathcal{F}(\gamma_{j,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixz} \, dx
\]
and $f^{(m)} \in B^s_{p,r}(M)$ with $M > 0$, $r \ge 1$, either $\{p \ge 2$ and $s \in (0, N)\}$ or $\{p \in [1, 2)$ and $s \in ((2m + 2\delta + 1)/p, N)\}$; we thus prove the existence of a constant $C > 0$ such that
\[
E\left( \int_a^b |\hat f_2^{(m)}(x) - f^{(m)}(x)|^2 \, dx \right) \le C \left( \frac{\ln n}{n} \right)^{2s/(2s + 2m + 2\delta + 1)}.
\]
Theorem 3.2 is proved.
Proof of Theorem 3.3. We expand the function $f^{(m)}$ on $\mathcal{B}$ as (2.2) at the level $\ell = j_2$. Since $\mathcal{B}$ forms an orthonormal basis of $L^2([a, b])$, we get
\[
E\left( \int_a^b |\hat f_3^{(m)}(x) - f^{(m)}(x)|^2 \, dx \right) = \sum_{k \in \Lambda_{j_2}} E\left( |\tilde c^{(m)}_{j_2,k} - c^{(m)}_{j_2,k}|^2 \right) + \sum_{j=j_2}^{\infty} \sum_{k \in \Lambda_j} (d^{(m)}_{j,k})^2. \tag{4.8}
\]
Using $f^{(m)} \in B^s_{p,r}(M) \subseteq B^{s_*}_{2,\infty}(M)$ (see (Härdle et al., 1998, Corollary 9.2)), we have
\[
\sum_{j=j_2}^{\infty} \sum_{k \in \Lambda_j} (d^{(m)}_{j,k})^2 \le C 2^{-2 j_2 s_*}. \tag{4.9}
\]
Let $\hat c^{(m)}_{j_2,k}$ be (3.3) with $n = a_n$ and $j = j_2$. The elementary inequality $(x + y)^2 \le 2(x^2 + y^2)$, $(x, y) \in \mathbb{R}^2$, yields
\[
\sum_{k \in \Lambda_{j_2}} E\left( |\tilde c^{(m)}_{j_2,k} - c^{(m)}_{j_2,k}|^2 \right) \le 2(S_1 + S_2), \tag{4.10}
\]
where
\[
S_1 = \sum_{k \in \Lambda_{j_2}} E\left( |\tilde c^{(m)}_{j_2,k} - \hat c^{(m)}_{j_2,k}|^2 \right), \qquad S_2 = \sum_{k \in \Lambda_{j_2}} E\left( |\hat c^{(m)}_{j_2,k} - c^{(m)}_{j_2,k}|^2 \right).
\]
Upper bound for $S_2$. Proceeding as in (4.6), we get
\[
S_2 \le C 2^{(2m + 2\delta + 1) j_2} \frac{1}{a_n} \le C 2^{(2m + 2\delta + 1) j_2} \frac{1}{n}. \tag{4.11}
\]
Upper bound for $S_1$. The triangular inequality gives
\[
|\tilde c^{(m)}_{j_2,k} - \hat c^{(m)}_{j_2,k}| \le \frac{1}{2\pi a_n} \sum_{v=1}^{a_n} |Y_v| \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_2,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_v} \, dx \right| \left| \frac{1}{\hat h(X_v)} 1_{\{|\hat h(X_v)| \ge c_2/2\}} - \frac{1}{h(X_v)} \right|.
\]
Owing to the triangular inequality, the indicator function, (K3), $\{|\hat h(X_v)| < c_2/2\} \subseteq \{|\hat h(X_v) - h(X_v)| > c_2/2\}$ and the Markov inequality (applied to the indicator: $1_{\{|\hat h(X_v) - h(X_v)| > c_2/2\}} \le (2/c_2) |\hat h(X_v) - h(X_v)|$), we have
\begin{align*}
\left| \frac{1}{\hat h(X_v)} 1_{\{|\hat h(X_v)| \ge c_2/2\}} - \frac{1}{h(X_v)} \right| & = \frac{1}{h(X_v)} \left| \frac{h(X_v) - \hat h(X_v)}{\hat h(X_v)} 1_{\{|\hat h(X_v)| \ge c_2/2\}} - 1_{\{|\hat h(X_v)| < c_2/2\}} \right| \\
& \le \frac{1}{h(X_v)} \left( \frac{2}{c_2} |\hat h(X_v) - h(X_v)| + 1_{\{|\hat h(X_v) - h(X_v)| > c_2/2\}} \right) \le \frac{4}{c_2} \frac{|\hat h(X_v) - h(X_v)|}{h(X_v)}.
\end{align*}
Therefore $|\tilde c^{(m)}_{j_2,k} - \hat c^{(m)}_{j_2,k}| \le C A_{j_2,k,n}$, where
\[
A_{j,k,n} = \frac{1}{a_n} \sum_{v=1}^{a_n} |Y_v| \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_v} \, dx \right| \frac{|\hat h(X_v) - h(X_v)|}{h(X_v)}.
\]
Let us now consider $U_n = (X_{a_n+1}, \ldots, X_n)$. For any complex random variable $D$, we have the equality
\[
E(D^2) = E(E(D^2 \,|\, U_n)) = E(V(D \,|\, U_n)) + E((E(D \,|\, U_n))^2),
\]
where $E(D \,|\, U_n)$ denotes the expectation of $D$ conditionally on $U_n$ and $V(D \,|\, U_n)$ the variance of $D$ conditionally on $U_n$. Therefore
\[
S_1 \le C \sum_{k \in \Lambda_{j_2}} E(A^2_{j_2,k,n}) = C (W_{j_2,n} + Z_{j_2,n}), \tag{4.12}
\]
where
\[
W_{j_2,n} = \sum_{k \in \Lambda_{j_2}} E(V(A_{j_2,k,n} \,|\, U_n)), \qquad Z_{j_2,n} = \sum_{k \in \Lambda_{j_2}} E\left( (E(A_{j_2,k,n} \,|\, U_n))^2 \right).
\]
Let us now observe that, owing to the independence of $(Y_1, X_1), \ldots, (Y_n, X_n)$, the random variables
\[
|Y_v| \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_2,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_v} \, dx \right| \frac{|\hat h(X_v) - h(X_v)|}{h(X_v)}, \qquad v = 1, \ldots, a_n,
\]
are independent conditionally on $U_n$. Using this property with the inequalities $V(D \,|\, U_n) \le E(D^2 \,|\, U_n)$ for any complex random variable $D$ and $(x + y)^2 \le 2(x^2 + y^2)$, $(x, y) \in \mathbb{R}^2$, the independence between $X_1$ and $\xi_1$, (K1) and (K3), we get
\begin{align*}
V(A_{j_2,k,n} \,|\, U_n) & = \frac{1}{a_n} V\left( |Y_1| \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_2,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_1} \, dx \right| \frac{|\hat h(X_1) - h(X_1)|}{h(X_1)} \,\Big|\, U_n \right) \\
& \le \frac{1}{a_n} E\left( Y_1^2 \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_2,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_1} \, dx \right|^2 \left( \frac{\hat h(X_1) - h(X_1)}{h(X_1)} \right)^2 \,\Big|\, U_n \right) \\
& \le \frac{1}{a_n} \frac{2}{c_2} \left( C_1^2 \left( \int_{-\infty}^{\infty} |g(x)| \, dx \right)^2 + E(\xi_1^2) \right) E\left( \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_2,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_1} \, dx \right|^2 \frac{|\hat h(X_1) - h(X_1)|^2}{h(X_1)} \,\Big|\, U_n \right) \\
& = \frac{2}{c_2} \left( C_1^2 \left( \int_{-\infty}^{\infty} |g(x)| \, dx \right)^2 + E(\xi_1^2) \right) \frac{1}{a_n} \int_{a_*}^{b_*} \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_2,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixy} \, dx \right|^2 \frac{|\hat h(y) - h(y)|^2}{h(y)} h(y) \, dy \\
& \le C \frac{1}{n} \int_{a_*}^{b_*} \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_2,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixy} \, dx \right|^2 |\hat h(y) - h(y)|^2 \, dy.
\end{align*}
Owing to (K2), $|\mathcal{F}(\phi_{j_2,k})(x)| = 2^{-j_2/2} |\mathcal{F}(\phi)(x/2^{j_2})|$ and a change of variables, we obtain
\begin{align*}
\left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_2,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixy} \, dx \right| & \le \int_{-\infty}^{\infty} \frac{|x|^m |\mathcal{F}(\phi_{j_2,k})(x)|}{|\mathcal{F}(g)(x)|} \, dx \le \frac{1}{c_1} \int_{-\infty}^{\infty} |x|^m (1 + x^2)^{\delta/2} |\mathcal{F}(\phi_{j_2,k})(x)| \, dx \\
& = \frac{1}{c_1} 2^{-j_2/2} \int_{-\infty}^{\infty} |x|^m (1 + x^2)^{\delta/2} |\mathcal{F}(\phi)(x/2^{j_2})| \, dx \\
& = \frac{1}{c_1} 2^{j_2/2} \int_{-\infty}^{\infty} 2^{j_2 m} |x|^m (1 + 2^{2 j_2} x^2)^{\delta/2} |\mathcal{F}(\phi)(x)| \, dx \\
& \le \frac{1}{c_1} 2^{(m + \delta + 1/2) j_2} \int_{-\infty}^{\infty} |x|^m (1 + x^2)^{\delta/2} |\mathcal{F}(\phi)(x)| \, dx \le C 2^{(m + \delta + 1/2) j_2}.
\end{align*}
Therefore, using $\mathrm{Card}(\Lambda_{j_2}) \le C 2^{j_2}$ and $2^{j_2} \le n$, we obtain
\[
W_{j_2,n} \le C 2^{(2m + 2\delta + 1) j_2} 2^{j_2} \frac{1}{n} E\left( \int_{a_*}^{b_*} |\hat h(y) - h(y)|^2 \, dy \right) \le C 2^{(2m + 2\delta + 1) j_2} E\left( \int_{a_*}^{b_*} |\hat h(y) - h(y)|^2 \, dy \right). \tag{4.13}
\]
Now, by the Hölder inequality for conditional expectations and arguments similar to (4.3), (4.4) and (4.5), we get
\begin{align*}
E(A_{j_2,k,n} \,|\, U_n) & = E\left( |Y_1| \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_2,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_1} \, dx \right| \frac{|\hat h(X_1) - h(X_1)|}{h(X_1)} \,\Big|\, U_n \right) \\
& \le \left( E\left( \frac{Y_1^2}{h(X_1)} \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_2,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_1} \, dx \right|^2 \,\Big|\, U_n \right) \right)^{1/2} \left( E\left( \frac{|\hat h(X_1) - h(X_1)|^2}{h(X_1)} \,\Big|\, U_n \right) \right)^{1/2} \\
& = \left( E\left( \frac{Y_1^2}{h(X_1)} \left| \int_{-\infty}^{\infty} x^m \frac{\overline{\mathcal{F}(\phi_{j_2,k})(x)}}{\mathcal{F}(g)(x)} e^{-ixX_1} \, dx \right|^2 \right) \right)^{1/2} \left( \int_{a_*}^{b_*} \frac{|\hat h(y) - h(y)|^2}{h(y)} h(y) \, dy \right)^{1/2} \\
& \le C 2^{(m + \delta) j_2} \left( \int_{a_*}^{b_*} |\hat h(y) - h(y)|^2 \, dy \right)^{1/2}.
\end{align*}
Hence
\[
Z_{j_2,n} \le C 2^{(2m + 2\delta + 1) j_2} E\left( \int_{a_*}^{b_*} |\hat h(y) - h(y)|^2 \, dy \right). \tag{4.14}
\]
It follows from (4.12), (4.13) and (4.14) that
\[
S_1 \le C 2^{(2m + 2\delta + 1) j_2} E\left( \int_{a_*}^{b_*} |\hat h(y) - h(y)|^2 \, dy \right). \tag{4.15}
\]
Putting (4.10), (4.11) and (4.15) together, we get
\[
\sum_{k \in \Lambda_{j_2}} E\left( |\tilde c^{(m)}_{j_2,k} - c^{(m)}_{j_2,k}|^2 \right) \le C 2^{(2m + 2\delta + 1) j_2} \max\left( E\left( \int_{a_*}^{b_*} |\hat h(y) - h(y)|^2 \, dy \right), \frac{1}{n} \right). \tag{4.16}
\]
Combining (4.8), (4.9) and (4.16), we obtain the desired result, i.e.,
\[
E\left( \int_a^b |\hat f_3^{(m)}(x) - f^{(m)}(x)|^2 \, dx \right) \le C \left( 2^{(2m + 2\delta + 1) j_2} \max\left( E\left( \int_{a_*}^{b_*} |\hat h(y) - h(y)|^2 \, dy \right), \frac{1}{n} \right) + 2^{-2 j_2 s_*} \right).
\]
Theorem 3.3 is proved.
Appendix
Let us now present in detail the general result of (Chaubey et al., 2014, Theorem 6.1) used in the proof of Theorem 3.2.
We consider the wavelet basis presented in Section 2 and a general form of the hard thresholding wavelet estimator, denoted by $\hat f_H$, for estimating an unknown function $f \in L^2([a, b])$ from $n$ independent random variables $W_1, \ldots, W_n$:
\[
\hat f_H(x) = \sum_{k \in \Lambda_\tau} \hat \alpha_{\tau,k} \phi_{\tau,k}(x) + \sum_{j=\tau}^{j_1} \sum_{k \in \Lambda_j} \hat \beta_{j,k} 1_{\{|\hat \beta_{j,k}| \ge \kappa \vartheta_j\}} \psi_{j,k}(x), \tag{4.17}
\]
where
\[
\hat \alpha_{j,k} = \frac{1}{\upsilon_n} \sum_{i=1}^{n} q_i(\phi_{j,k}, W_i), \qquad \hat \beta_{j,k} = \frac{1}{\upsilon_n} \sum_{i=1}^{n} q_i(\psi_{j,k}, W_i) 1_{\{|q_i(\psi_{j,k}, W_i)| \le \varsigma_j\}},
\]
\[
\varsigma_j = \theta_\psi 2^{\sigma j} \frac{\upsilon_n}{\sqrt{\mu_n \ln \mu_n}}, \qquad \vartheta_j = \theta_\psi 2^{\sigma j} \sqrt{\frac{\ln \mu_n}{\mu_n}},
\]
$\kappa \ge 2 + 8/3 + 2\sqrt{4 + 16/9}$ and $j_1$ is the integer satisfying $2^{j_1} = [\mu_n^{1/(2\sigma + 1)}]$.
Here, we suppose that there exist
• $n$ functions $q_1, \ldots, q_n$ with $q_i : L^2([a, b]) \times W_i(\Omega) \to \mathbb{C}$ for any $i \in \{1, \ldots, n\}$,
• two sequences of real numbers $(\upsilon_n)_{n \in \mathbb{N}}$ and $(\mu_n)_{n \in \mathbb{N}}$ satisfying $\lim_{n \to \infty} \upsilon_n = \infty$ and $\lim_{n \to \infty} \mu_n = \infty$,
such that, for $\gamma \in \{\phi, \psi\}$,
(A1) for any integer $j \ge \tau$ and any $k \in \Lambda_j$,
\[
E\left( \frac{1}{\upsilon_n} \sum_{i=1}^{n} q_i(\gamma_{j,k}, W_i) \right) = \int_a^b f(x) \gamma_{j,k}(x) \, dx.
\]
(A2) there exist two constants, $\theta_\gamma > 0$ and $\sigma \ge 0$, such that, for any integer $j \ge \tau$ and any $k \in \Lambda_j$,
\[
\sum_{i=1}^{n} E\left( |q_i(\gamma_{j,k}, W_i)|^2 \right) \le \theta_\gamma^2 2^{2\sigma j} \frac{\upsilon_n^2}{\mu_n}.
\]
Let $\hat f_H$ be (4.17) under (A1) and (A2). Suppose that $f \in B^s_{p,r}(M)$ with $r \ge 1$, $\{p \ge 2$ and $s \in (0, N)\}$ or $\{p \in [1, 2)$ and $s \in ((2\sigma + 1)/p, N)\}$. Then there exists a constant $C > 0$ such that
\[
E\left( \int_a^b |\hat f_H(x) - f(x)|^2 \, dx \right) \le C \left( \frac{\ln \mu_n}{\mu_n} \right)^{2s/(2s + 2\sigma + 1)}.
\]
Acknowledgments
The authors are thankful to the reviewers for their comments, which have helped in improving the presentation.
References
Antoniadis, A. (1997). Wavelets in statistics: a review (with discussion), Journal of the Italian Statistical Society, Series B, 6, 97-144.
Birke, M. and Bissantz, N. (2008). Asymptotic normality and confidence intervals for inverse regression models with convolution-type operators, Journal of Multivariate Analysis, 100, 10, 2364-2375.
Birke, M., Bissantz, N. and Holzmann, H. (2010). Confidence bands for inverse regression models, Inverse Problems, 26, 115020.
Cai, T. (2002). On adaptive wavelet estimation of a derivative and other related linear inverse problems, Journal of Statistical Planning and Inference, 108, 329-349.
Cavalier, L. and Tsybakov, A. (2002). Sharp adaptation for inverse problems with random noise, Probability Theory and Related Fields, 123, 323-354.
Chaubey, Y.P., Chesneau, C. and Doosti, H. (2014). Adaptive wavelet estimation of a density from mixtures under multiplicative censoring, preprint: http://hal.archives-ouvertes.fr/hal-00918069.
Chesneau, C. (2010). Wavelet estimation of the derivatives of an unknown function from a convolution model, Current Development in Theory and Applications of Wavelets, 4, 2, 131-151.
Chesneau, C. (2014). A note on wavelet estimation of the derivatives of a regression function in a random design setting, International Journal of Mathematics and Mathematical Sciences, Volume 2014, Article ID 195765, 8 pages.
Cohen, A., Daubechies, I., Jawerth, B. and Vial, P. (1993). Wavelets on the interval and fast wavelet transforms, Applied and Computational Harmonic Analysis, 1, 1, 54-81.
Daubechies, I. (1992). Ten lectures on wavelets, SIAM.
Delaigle, A. and Meister, A. (2011). Nonparametric function estimation under Fourier-oscillating noise, Statistica Sinica, 21, 1065-1092.
Delyon, B. and Juditsky, A. (1996). On minimax wavelet estimators, Applied and Computational Harmonic Analysis, 3, 215-228.
Donoho, D. L. and Johnstone, I. M. (1994). Ideal spatial adaptation by wavelet shrinkage, Biometrika, 81, 425-455.