HAL Id: hal-00466830
https://hal.archives-ouvertes.fr/hal-00466830v2
Preprint submitted on 29 Jun 2010
On adaptive wavelet estimation of the regression function and its derivatives in an errors-in-variables model

Christophe Chesneau

University of Caen (LMNO), Caen, France. Email: chesneau@math.unicaen.fr
Abstract
We consider a regression model with errors-in-variables: we observe $(Y, X)$, where $Y = f(Z) + \xi$ and $X = Z + W$. Our goal is to estimate the unknown regression function $f$ and its derivatives under mild assumptions on $\xi$ (only finite moments of order 2 are required). To reach this goal, we develop a new adaptive wavelet estimator based on a hard thresholding rule. Taking the minimax approach under the mean integrated squared error over Besov balls, we prove that it attains a sharp rate of convergence.
Keywords: Errors-in-variables model, Derivative function estimation, Minimax approach, Wavelets, Hard thresholding.
2000 Mathematics Subject Classification: 62G07, 62G20.
1 Motivations
We observe $n$ independent pairs of random variables $(Y_1, X_1), \ldots, (Y_n, X_n)$ in the following errors-in-variables model: for any $v \in \{1, \ldots, n\}$,

$$\begin{cases} Y_v = f(Z_v) + \xi_v, \\ X_v = Z_v + W_v, \end{cases} \qquad (1.1)$$

where $f$ is an unknown function, $X_1, \ldots, X_n$ are $n$ i.i.d. random variables having the uniform distribution on $[0,1]$, $W_1, \ldots, W_n$ are $n$ i.i.d. unobserved random variables, and $\xi_1, \ldots, \xi_n$ are $n$ i.i.d. unobserved zero-mean random variables. We assume that all these random variables are independent, that the density function of $W_1$, denoted $g$, is known, and that $\xi_1$ admits finite moments of order 2. We aim to estimate $f$ and its derivatives from $(Y_1, X_1), \ldots, (Y_n, X_n)$.
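As a concrete illustration, data from model (1.1) can be simulated as follows. This is a toy sketch and every choice in it is an illustrative assumption, not a specification from the paper: $f$ is a sine, $W$ is Laplace noise, $\xi$ is Gaussian, and $X_v$ is reduced modulo 1 so that the design stays in the periodic setting on $[0,1]$ (with $Z_v$ uniform, this also makes $X_v$ uniform).

```python
# Toy simulation of model (1.1); all distributional choices are illustrative.
import math
import random

random.seed(1)

def f(z):
    return math.sin(2 * math.pi * z)  # hypothetical regression function

def draw(n, b=0.05, sigma_xi=0.1):
    data = []
    for _ in range(n):
        z = random.random()                                           # Z_v ~ U[0,1]
        w = random.expovariate(1.0 / b) * random.choice([-1.0, 1.0])  # W_v ~ Laplace(0, b)
        xi = random.gauss(0.0, sigma_xi)                              # zero-mean noise
        data.append((f(z) + xi, (z + w) % 1.0))                       # (Y_v, X_v)
    return data

sample = draw(1000)
print(len(sample))  # 1000 pairs (Y_v, X_v)
```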
The estimation of f from the model (1.1) has received a lot of attention. See e.g.
[6, 11–13, 15, 18, 19]. In this paper, we focus on a more general problem: the adaptive estimation of the $d$-th derivative of $f$, denoted $f^{(d)}$. This is of interest to detect possible bumps, or concavity and convexity properties, of $f$. The estimation of $f^{(d)}$ has been investigated in several papers for various models (but never for (1.1)), starting with [1]. For references using wavelet methods, see e.g. [3–5, 23].
Another feature of the study concerns $\xi_1, \ldots, \xi_n$: to estimate $f^{(d)}$ (including $f = f^{(0)}$), we only assume that $\xi_1$ has finite moments of order 2. Thus we do not need to know the distribution of $\xi_1$. Moreover, this relaxes the assumption on $\xi_1$ in [6], where finite moments of order $> 6$ are required.
Under this general framework, considering the ordinary smooth case on $g$ (see (2.2)), we estimate $f^{(d)}$ by a new wavelet estimator based on a hard thresholding rule. Its originality is to combine a singular value decomposition (SVD) approach similar to the one of [17] with some technical tools introduced in wavelet estimation theory by [7].
We evaluate its performance by taking the minimax approach under the mean integrated squared error (MISE) over a wide class of functions: the Besov balls $B^s_{p,r}(M)$ (to be defined in Section 3). We prove that our estimator attains the rate of convergence

$$v_n = (\ln n / n)^{2s/(2s+2\delta+2d+1)},$$

where $\delta$ is a factor related to the ordinary smooth case. This rate of convergence is sharp in the sense that it is the one attained by the best non-realistic linear wavelet estimator, up to a logarithmic term.
The paper is organized as follows. Assumptions on the model and some notations are introduced in Section 2. Section 3 briefly describes the periodized wavelet basis on $[0,1]$ and the Besov balls. The estimators are presented in Section 4. The results are stated in Section 5. The proofs are gathered in Section 6.
2 Assumptions and notations
We assume in the sequel that $f^{(d)}$ and $g$ belong to $L^2_{per}([0,1])$, the space of periodic functions of period one that are square-integrable on $[0,1]$:

$$L^2_{per}([0,1]) = \Big\{ h;\ h \text{ is } 1\text{-periodic and } \|h\|_2 = \Big( \int_0^1 h^2(x)\,dx \Big)^{1/2} < \infty \Big\}.$$

We assume that there exists a known constant $C_* > 0$ such that

$$\|f\|_\infty = \sup_{x \in [0,1]} |f(x)| \le C_* < \infty. \qquad (2.1)$$

Any function $h \in L^2_{per}([0,1])$ can be represented by its Fourier series
$$h(t) = \sum_{\ell \in \mathbb{Z}} F(h)(\ell) e^{2i\pi\ell t}, \qquad t \in [0,1],$$

where the equality is intended in the mean-square convergence sense, and $F(h)(\ell)$ denotes the Fourier coefficient given by

$$F(h)(\ell) = \int_0^1 h(x) e^{-2i\pi\ell x}\,dx, \qquad \ell \in \mathbb{Z},$$

whenever this integral exists. The notation $\overline{\,\cdot\,}$ will be used for the complex conjugate.
We consider the ordinary smooth case on $g$: there exist three constants $c_g > 0$, $C_g > 0$ and $\delta > 1$ such that, for any $\ell \in \mathbb{Z}$, the Fourier coefficient of $g$, i.e. $F(g)(\ell)$, satisfies

$$\frac{c_g}{(1+\ell^2)^{\delta/2}} \le |F(g)(\ell)| \le \frac{C_g}{(1+\ell^2)^{\delta/2}}. \qquad (2.2)$$
This assumption controls the decay of the Fourier coefficients of $g$, and thus the smoothness of $g$. It is a standard hypothesis in nonparametric estimation for deconvolution problems. See e.g. [14, 17, 21].
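As an illustration of (2.2), suppose the noise $W$ is Laplace with scale $b$; the Fourier coefficient of its wrapped density on $[0,1]$ is the characteristic function of $W$ at $2\pi\ell$, namely $F(g)(\ell) = 1/(1 + b^2(2\pi\ell)^2)$. Choosing $b = 1/(2\pi)$ (an illustrative value) gives $F(g)(\ell) = 1/(1+\ell^2)$, so (2.2) holds with $c_g = C_g = 1$ and $\delta = 2$. The snippet checks this decay numerically:

```python
# Check that wrapped Laplace noise is "ordinary smooth" in the sense of (2.2).
import math

def F_g(l, b=1.0 / (2.0 * math.pi)):
    # Fourier coefficient of the wrapped Laplace(0, b) density:
    # the characteristic function of W evaluated at 2*pi*l.
    return 1.0 / (1.0 + (2.0 * math.pi * b * l) ** 2)

delta = 2
# The ratio |F(g)(l)| * (1 + l^2)^{delta/2} must stay between fixed constants.
ratios = [F_g(l) * (1 + l * l) ** (delta / 2) for l in range(-50, 51)]
print(min(ratios), max(ratios))  # both equal 1: (2.2) holds with c_g = C_g = 1
```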
3 Wavelets and Besov balls
3.1 Periodized Meyer Wavelets
We consider an orthonormal wavelet basis generated by dilations and translations of a "father" Meyer-type wavelet $\phi$ and a "mother" Meyer-type wavelet $\psi$. The main features of such wavelets are:

1. they are bandlimited, i.e. the Fourier transforms of $\phi$ and $\psi$ have compact supports included in $[-4\pi/3, 4\pi/3]$ and $[-8\pi/3, -2\pi/3] \cup [2\pi/3, 8\pi/3]$ respectively;
2. there exists a constant $c > 0$ such that the magnitude of the Fourier transform of $\psi$ is bounded below by $c$ on $[-2\pi, -\pi] \cup [\pi, 2\pi]$;
3. the functions $(\phi, \psi)$ are $C^\infty$ since their Fourier transforms have compact support, and $\psi$ has an infinite number of vanishing moments since its Fourier transform vanishes in a neighborhood of the origin, i.e. for any $u \in \mathbb{N}$, $\int_{-\infty}^{\infty} x^u \psi(x)\,dx = 0$.
For the purposes of this paper, we use the periodized Meyer wavelet basis on the unit interval. For any $x \in [0,1]$, any integer $j$ and any $k \in \{0, \ldots, 2^j - 1\}$, let

$$\phi_{j,k}(x) = 2^{j/2} \phi(2^j x - k), \qquad \psi_{j,k}(x) = 2^{j/2} \psi(2^j x - k)$$

be the elements of the wavelet basis, and

$$\phi^{per}_{j,k}(x) = \sum_{l \in \mathbb{Z}} \phi_{j,k}(x - l), \qquad \psi^{per}_{j,k}(x) = \sum_{l \in \mathbb{Z}} \psi_{j,k}(x - l)$$

their periodized versions. There exists an integer $j_*$ such that the collection

$$\mathcal{B} = \big\{ \phi^{per}_{j_*,k},\ k \in \{0, \ldots, 2^{j_*} - 1\};\ \psi^{per}_{j,k},\ j \in \mathbb{N} - \{0, \ldots, j_* - 1\},\ k \in \{0, \ldots, 2^j - 1\} \big\}$$

forms an orthonormal basis of $L^2_{per}([0,1])$. In what follows, the superscript "per" will be dropped to lighten the notation.
Let $j_c$ be an integer such that $j_c \ge j_*$. A function $h \in L^2_{per}([0,1])$ can be expanded into a wavelet series as

$$h(x) = \sum_{k=0}^{2^{j_c}-1} \alpha_{j_c,k} \phi_{j_c,k}(x) + \sum_{j=j_c}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k} \psi_{j,k}(x), \qquad x \in [0,1],$$

where

$$\alpha_{j,k} = \int_0^1 h(x) \phi_{j,k}(x)\,dx, \qquad \beta_{j,k} = \int_0^1 h(x) \psi_{j,k}(x)\,dx. \qquad (3.1)$$

See [20, Vol. 1, Chapter III.11] for a detailed account of periodized orthonormal wavelet bases.
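The expansion (3.1) can be made concrete with a short computation. Meyer wavelets have no simple closed form, so this sketch substitutes the Haar basis (also an orthonormal wavelet basis of $L^2([0,1])$); the structure of the expansion is identical. A function represented by its values on $2^J$ dyadic intervals is reconstructed exactly from $\alpha_{0,0}$ and the $\beta_{j,k}$, $j = 0, \ldots, J-1$:

```python
# Wavelet expansion (3.1) with the Haar basis (a stand-in for the Meyer basis).
J = 3
N = 2 ** J
h = [1.0, 4.0, -2.0, 0.5, 3.0, 3.0, -1.0, 2.0]  # h(x) on [k/8, (k+1)/8)

def inner(u, v):
    # <u, v> = integral of u*v over [0,1] for piecewise-constant functions
    return sum(a * b for a, b in zip(u, v)) / N

def phi_jk(j, k):
    # phi_{j,k} = 2^{j/2} on [k 2^{-j}, (k+1) 2^{-j}), 0 elsewhere
    step = N >> j
    return [2 ** (j / 2) if step * k <= i < step * (k + 1) else 0.0
            for i in range(N)]

def psi_jk(j, k):
    # psi_{j,k} = +2^{j/2} on the first half of its interval, -2^{j/2} on the second
    step = N >> j
    out = [0.0] * N
    for i in range(step * k, step * (k + 1)):
        out[i] = 2 ** (j / 2) * (1.0 if i < step * k + step // 2 else -1.0)
    return out

# alpha_{0,0} and beta_{j,k} as in (3.1), then the wavelet series itself
recon = [inner(h, phi_jk(0, 0)) * p for p in phi_jk(0, 0)]
for j in range(J):
    for k in range(2 ** j):
        b = inner(h, psi_jk(j, k))
        recon = [r + b * p for r, p in zip(recon, psi_jk(j, k))]
print(recon)  # equals h: the expansion reconstructs the function exactly
```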
3.2 Besov balls
Let $M > 0$, $s > 0$, $p \ge 1$ and $r \ge 1$. Set $\beta_{j_*-1,k} = \alpha_{j_*,k}$. A function $h$ belongs to the Besov ball $B^s_{p,r}(M)$ if and only if there exists a constant $M_* > 0$ such that the wavelet coefficients (3.1) satisfy

$$\Bigg( \sum_{j=j_*-1}^{\infty} \Bigg( 2^{j(s+1/2-1/p)} \Big( \sum_{k=0}^{2^j-1} |\beta_{j,k}|^p \Big)^{1/p} \Bigg)^r \Bigg)^{1/r} \le M_*.$$

For particular choices of the parameters $s$, $p$ and $r$, these sets contain the Hölder and Sobolev balls. See [20].
4 Estimators
Wavelet coefficient estimators. The first step in estimating $f^{(d)}$ consists in expanding $f^{(d)}$ on $\mathcal{B}$ and estimating its unknown wavelet coefficients.
For any integer $j \ge j_*$ and any $k \in \{0, \ldots, 2^j - 1\}$, we estimate $\alpha_{j,k} = \int_0^1 f^{(d)}(x) \phi_{j,k}(x)\,dx$ by

$$\widehat{\alpha}_{j,k} = \frac{1}{n} \sum_{v=1}^{n} \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d F(\phi_{j,k})(\ell)}{F(g)(\ell)} Y_v e^{-2i\pi\ell X_v}, \qquad (4.1)$$

where $C_j = \mathrm{supp}(F(\phi_{j,0})) = \mathrm{supp}(F(\phi_{j,k}))$, and $\beta_{j,k} = \int_0^1 f^{(d)}(x) \psi_{j,k}(x)\,dx$ by

$$\widehat{\beta}_{j,k} = \frac{1}{n} \sum_{v=1}^{n} G_v 1_{\{|G_v| \le \eta_j\}}, \qquad (4.2)$$

where

$$G_v = \sum_{\ell \in D_j} \frac{(2i\pi\ell)^d F(\psi_{j,k})(\ell)}{F(g)(\ell)} Y_v e^{-2i\pi\ell X_v},$$

$D_j = \mathrm{supp}(F(\psi_{j,0})) = \mathrm{supp}(F(\psi_{j,k}))$, for any random event $A$, $1_A$ is the indicator function of $A$, and the threshold $\eta_j$ is defined by

$$\eta_j = \theta 2^{(\delta+d)j} \sqrt{\frac{n}{\ln n}}, \qquad (4.3)$$

with $\theta = \sqrt{C_{**}(C_*^2 + E(\xi_1^2))}$, where $C_*$ is as in (2.1) and $C_{**} = 2^{\delta-1} \big( 2 (2\pi)^{2d} / c_g^2 \big) (8\pi/3)^{2(\delta+d)}$. The estimators $\widehat{\alpha}_{j,k}$ and $\widehat{\beta}_{j,k}$ are constructed via an SVD approach similar to the one of [17].
The idea of the thresholding in (4.2) is to operate a selection on the observations: when, for $v \in \{1, \ldots, n\}$, the observation $(Y_v, X_v)$ is too strongly contaminated by $\xi_v$, it is neglected. Such a technique was introduced in wavelet estimation by [7]. Statistical properties of $\widehat{\alpha}_{j,k}$ and $\widehat{\beta}_{j,k}$ are given in Propositions 6.1, 6.2 and 6.3.
We consider two wavelet estimators for $f^{(d)}$: a linear estimator and a hard thresholding estimator.
Linear estimator. Assuming that $f^{(d)} \in B^s_{p,r}(M)$ with $p \ge 2$, we define the linear estimator $\widehat{f}^L_d$ by

$$\widehat{f}^L_d(x) = \sum_{k=0}^{2^{j_0}-1} \widehat{\alpha}_{j_0,k} \phi_{j_0,k}(x), \qquad (4.4)$$

where $\widehat{\alpha}_{j,k}$ is defined by (4.1) and $j_0$ is the integer satisfying

$$2^{-1} n^{1/(2s+2\delta+2d+1)} < 2^{j_0} \le n^{1/(2s+2\delta+2d+1)}.$$

It is not adaptive since it depends on $s$, the smoothness parameter of $f^{(d)}$.
Hard thresholding estimator. We define the hard thresholding estimator $\widehat{f}^H_d$ by
$$\widehat{f}^H_d(x) = \sum_{k=0}^{2^{j_*}-1} \widehat{\alpha}_{j_*,k} \phi_{j_*,k}(x) + \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} \widehat{\beta}_{j,k} 1_{\{|\widehat{\beta}_{j,k}| \ge \kappa\lambda_j\}} \psi_{j,k}(x), \qquad (4.5)$$

where $\widehat{\alpha}_{j,k}$ and $\widehat{\beta}_{j,k}$ are defined by (4.1) and (4.2), $j_1$ is the integer satisfying

$$2^{-1} n^{1/(2\delta+2d+1)} < 2^{j_1} \le n^{1/(2\delta+2d+1)},$$

$\kappa \ge 8/3 + 2 + 2\sqrt{16/9 + 4}$, and $\lambda_j$ is the threshold

$$\lambda_j = \theta 2^{(\delta+d)j} \sqrt{\frac{\ln n}{n}}. \qquad (4.6)$$
The definitions of $\eta_j$ and $\lambda_j$ are chosen to minimize the MISE of $\widehat{f}^H_d$ and to make it adaptive. Further statistical results on the hard thresholding estimator for the standard regression model can be found in [8–10].
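The selection rule in (4.5) can be sketched in a few lines: an estimated detail coefficient is kept only when its magnitude exceeds $\kappa\lambda_j$, with $\lambda_j$ as in (4.6). In the snippet, $\theta$, $\delta$, $d$ and the coefficients are illustrative stand-ins, not values computed from data.

```python
# Minimal sketch of the hard thresholding rule in (4.5)-(4.6).
import math

KAPPA = 8 / 3 + 2 + 2 * math.sqrt(16 / 9 + 4)  # smallest admissible kappa

def lambda_j(j, n, theta=0.1, delta=1.5, d=0):
    # lambda_j = theta * 2^{(delta + d) j} * sqrt(ln n / n); parameters illustrative
    return theta * 2 ** ((delta + d) * j) * math.sqrt(math.log(n) / n)

def hard_threshold(beta_hat, j, n, kappa=KAPPA):
    thr = kappa * lambda_j(j, n)
    return [b if abs(b) >= thr else 0.0 for b in beta_hat]

beta_hat = [0.9, -0.02, 0.35, 0.001]  # toy estimated coefficients at level j = 2
kept = hard_threshold(beta_hat, j=2, n=10_000)
print(kept)  # [0.9, 0.0, 0.35, 0.0]: small coefficients are killed
```

The level dependence $2^{(\delta+d)j}$ makes the threshold grow with the resolution level, reflecting the larger variance of the deconvolved coefficients at fine scales.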
5 Results
Theorem 5.1. Consider (1.1) under the assumptions of Section 2. Suppose that $f^{(d)} \in B^s_{p,r}(M)$ with $s > 0$, $p \ge 2$ and $r \ge 1$. Let $\widehat{f}^L_d$ be (4.4). Then there exists a constant $C > 0$ such that

$$E\Big( \int_0^1 \big( \widehat{f}^L_d(x) - f^{(d)}(x) \big)^2\,dx \Big) \le C n^{-2s/(2s+2\delta+2d+1)}.$$
The proof of Theorem 5.1 uses a moment inequality on (4.1) and a suitable decomposition of the MISE.
Since the distribution of $\xi_1$ is unknown, we cannot use the likelihood function related to the model, and the optimal lower bound seems difficult to determine (see e.g. [16, 25]). For this reason, our benchmark will be the rate of convergence attained by the "optimal non-realistic" $\widehat{f}^L_d$, i.e. $v_n = n^{-2s/(2s+2\delta+2d+1)}$. Note that, under the standard Gaussian assumption on $\xi_1$, one can prove that $v_n$ is optimal in the minimax sense.
Theorem 5.2. Consider (1.1) under the assumptions of Section 2. Let $\widehat{f}^H_d$ be (4.5). Suppose that $f^{(d)} \in B^s_{p,r}(M)$ with $r \ge 1$ and either $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta + 2d + 1)/p\}$. Then there exists a constant $C > 0$ such that

$$E\Big( \int_0^1 \big( \widehat{f}^H_d(x) - f^{(d)}(x) \big)^2\,dx \Big) \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$
The proof of Theorem 5.2 is based on several probability results (moment inequalities, a concentration inequality, ...) and a suitable decomposition of the MISE.
Theorem 5.2 shows that $\widehat{f}^H_d$ attains $v_n$ up to the logarithmic term $(\ln n)^{2s/(2s+2\delta+2d+1)}$.
Conclusion and perspectives. We have constructed a new adaptive estimator $\widehat{f}^H_d$ for $f^{(d)}$ under mild assumptions on $\xi_1$. It is based on wavelets and thresholding, and it has "near-optimal" minimax properties for a wide class of functions $f^{(d)}$. Possible perspectives of this work are
- to potentially improve the estimation of $f^{(d)}$ by considering other kinds of thresholding rules, such as the block thresholding rule introduced by [2];
- to investigate the random design case where the distribution of $X_1$ is unknown.
6 Proofs
In this section, C represents a positive constant which may differ from one term to another.
6.1 Auxiliary results
Proposition 6.1. For any integer $j \ge j_*$ and any $k \in \{0, \ldots, 2^j - 1\}$, let $\alpha_{j,k}$ be the wavelet coefficient (3.1) of $f^{(d)}$ and $\widehat{\alpha}_{j,k}$ be (4.1). Then there exists a constant $C > 0$ such that

$$E\big( (\widehat{\alpha}_{j,k} - \alpha_{j,k})^2 \big) \le C 2^{2(\delta+d)j} \frac{1}{n}.$$

Proof of Proposition 6.1. For any $v \in \{1, \ldots, n\}$, let us set

$$H_v = \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d F(\phi_{j,k})(\ell)}{F(g)(\ell)} Y_v e^{-2i\pi\ell X_v}.$$
Since $X_1$, $W_1$ and $\xi_1$ are independent, using the (periodic) convolution product between $f$ and $g$, i.e. $(f \star g)(x) = \int_0^1 f(x-y) g(y)\,dy$, we have

$$E\big( Y_1 e^{-2i\pi\ell X_1} \big) = E\big( f(X_1 - W_1) e^{-2i\pi\ell X_1} \big) + E(\xi_1) E\big( e^{-2i\pi\ell X_1} \big) = E\big( f(X_1 - W_1) e^{-2i\pi\ell X_1} \big)$$
$$= \int_0^1 \int_0^1 f(x-y) g(y) e^{-2i\pi\ell x}\,dx\,dy = \int_0^1 (f \star g)(x) e^{-2i\pi\ell x}\,dx = F(f \star g)(\ell) = F(f)(\ell) F(g)(\ell).$$

Moreover, since $f$ is 1-periodic, for any $u \in \{0, \ldots, d\}$, $f^{(u)}$ is 1-periodic and $f^{(u)}(0) = f^{(u)}(1)$. By $d$ integrations by parts, for any $\ell \in \mathbb{Z}$, we have

$$(2i\pi\ell)^d F(f)(\ell) = F\big( f^{(d)} \big)(\ell).$$
The Parseval–Plancherel theorem gives

$$E(H_1) = \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d F(\phi_{j,k})(\ell)}{F(g)(\ell)} E\big( Y_1 e^{-2i\pi\ell X_1} \big) = \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d F(\phi_{j,k})(\ell)}{F(g)(\ell)} F(f)(\ell) F(g)(\ell)$$
$$= \sum_{\ell \in C_j} F(\phi_{j,k})(\ell) (2i\pi\ell)^d F(f)(\ell) = \sum_{\ell \in C_j} F(\phi_{j,k})(\ell) F\big( f^{(d)} \big)(\ell) = \int_0^1 \phi_{j,k}(x) f^{(d)}(x)\,dx = \alpha_{j,k}. \qquad (6.1)$$
Hence $E(\widehat{\alpha}_{j,k}) = \alpha_{j,k}$. So

$$E\big( (\widehat{\alpha}_{j,k} - \alpha_{j,k})^2 \big) = V(\widehat{\alpha}_{j,k}) = V\Big( \frac{1}{n} \sum_{v=1}^{n} H_v \Big) = \frac{1}{n^2} \sum_{v=1}^{n} V(H_v) = \frac{1}{n} V(H_1) \le \frac{1}{n} E\big( H_1^2 \big). \qquad (6.2)$$
Since $X_1$, $W_1$ and $\xi_1$ are independent with $E(\xi_1) = 0$ and $|f(X_1 - W_1)| \le \|f\|_\infty \le C_* < \infty$, setting

$$S_1 = \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d F(\phi_{j,k})(\ell)}{F(g)(\ell)} e^{-2i\pi\ell X_1}$$

(so that $H_1 = Y_1 S_1$), we have

$$E\big( H_1^2 \big) = E\big( f^2(X_1 - W_1) |S_1|^2 \big) + 2 E(\xi_1) E\big( f(X_1 - W_1) |S_1|^2 \big) + E\big( \xi_1^2 \big) E\big( |S_1|^2 \big)$$
$$= E\big( f^2(X_1 - W_1) |S_1|^2 \big) + E\big( \xi_1^2 \big) E\big( |S_1|^2 \big) \le \big( C_*^2 + E\big( \xi_1^2 \big) \big) E\big( |S_1|^2 \big). \qquad (6.3)$$

The assumption (2.2) implies
$$\sup_{\ell \in C_j} \frac{(2\pi\ell)^{2d}}{|F(g)(\ell)|^2} \le \frac{(2\pi)^{2d}}{c_g^2} \sup_{\ell \in C_j} \ell^{2d} (1+\ell^2)^{\delta} \le 2^{\delta-1} \frac{(2\pi)^{2d}}{c_g^2} \sup_{\ell \in C_j} \ell^{2d} (1+\ell^{2\delta})$$
$$\le 2^{\delta-1}\, 2\, \frac{(2\pi)^{2d}}{c_g^2} \Big( \frac{8\pi}{3} \Big)^{2(\delta+d)} 2^{2(\delta+d)j} = C_{**} 2^{2(\delta+d)j}. \qquad (6.4)$$

Using the fact that $(e^{-2i\pi\ell x})_{\ell \in \mathbb{Z}}$ is an orthonormal basis of $L^2_{per}([0,1])$, (6.4) and the Parseval–Plancherel theorem, we obtain
$$E\Bigg( \Big| \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d F(\phi_{j,k})(\ell)}{F(g)(\ell)} e^{-2i\pi\ell X_1} \Big|^2 \Bigg) = \int_0^1 \Big| \sum_{\ell \in C_j} \frac{(2i\pi\ell)^d F(\phi_{j,k})(\ell)}{F(g)(\ell)} e^{-2i\pi\ell x} \Big|^2\,dx = \sum_{\ell \in C_j} \frac{(2\pi\ell)^{2d} |F(\phi_{j,k})(\ell)|^2}{|F(g)(\ell)|^2}$$
$$\le C_{**} 2^{2(\delta+d)j} \sum_{\ell \in C_j} |F(\phi_{j,k})(\ell)|^2 = C_{**} 2^{2(\delta+d)j} \int_0^1 |\phi_{j,k}(x)|^2\,dx = C_{**} 2^{2(\delta+d)j}. \qquad (6.5)$$
Putting (6.3) and (6.5) together, we obtain

$$E\big( H_1^2 \big) \le \theta^2 2^{2(\delta+d)j}. \qquad (6.6)$$

It follows from (6.2) and (6.6) that

$$E\big( (\widehat{\alpha}_{j,k} - \alpha_{j,k})^2 \big) \le C 2^{2(\delta+d)j} \frac{1}{n}.$$
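The Parseval step used in (6.5) admits a direct numerical check: for a trigonometric polynomial $\sum_\ell c_\ell e^{-2i\pi\ell x}$, the integral of its squared modulus over $[0,1]$ equals $\sum_\ell |c_\ell|^2$. The coefficients below are arbitrary illustrative values.

```python
# Numerical check of the Parseval identity behind (6.5).
import cmath

coeffs = {-2: 0.5 + 0.1j, -1: -1.0j, 0: 2.0, 1: 0.3, 3: -0.7 + 0.2j}

def poly(x):
    return sum(c * cmath.exp(-2j * cmath.pi * l * x) for l, c in coeffs.items())

# A Riemann sum on N points is exact for trigonometric polynomials of degree < N/2.
N = 4096
integral = sum(abs(poly(k / N)) ** 2 for k in range(N)) / N
parseval = sum(abs(c) ** 2 for c in coeffs.values())
print(integral, parseval)  # the two values coincide
```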
Proposition 6.2. For any integer $j \ge j_*$ and any $k \in \{0, \ldots, 2^j - 1\}$, let $\beta_{j,k}$ be the wavelet coefficient (3.1) of $f^{(d)}$ and $\widehat{\beta}_{j,k}$ be (4.2). Then there exists a constant $C > 0$ such that

$$E\Big( \big( \widehat{\beta}_{j,k} - \beta_{j,k} \big)^4 \Big) \le C 2^{4(\delta+d)j} \frac{(\ln n)^2}{n^2}.$$
Proof of Proposition 6.2. Proceeding as in (6.1) (with $\psi$ instead of $\phi$), we have

$$\beta_{j,k} = \int_0^1 f^{(d)}(x) \psi_{j,k}(x)\,dx = E(G_1) = E\big( G_1 1_{\{|G_1| \le \eta_j\}} \big) + E\big( G_1 1_{\{|G_1| > \eta_j\}} \big). \qquad (6.7)$$

We have

$$E\Big( \big( \widehat{\beta}_{j,k} - \beta_{j,k} \big)^4 \Big) = E\Bigg( \Big( \frac{1}{n} \sum_{v=1}^{n} \big[ G_v 1_{\{|G_v| \le \eta_j\}} - E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big) \big] - E\big( G_1 1_{\{|G_1| > \eta_j\}} \big) \Big)^4 \Bigg) \le 8(A + B), \qquad (6.8)$$

where

$$A = E\Bigg( \Big( \frac{1}{n} \sum_{v=1}^{n} \big[ G_v 1_{\{|G_v| \le \eta_j\}} - E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big) \big] \Big)^4 \Bigg), \qquad B = \Big( E\big( |G_1| 1_{\{|G_1| > \eta_j\}} \big) \Big)^4.$$

Let us bound $A$ and $B$ in turn. To bound $A$, we need the Rosenthal inequality presented in the lemma below (see [24]).
Lemma 6.1 (Rosenthal's inequality). Let $p \ge 2$, $n \in \mathbb{N}^*$ and $(U_v)_{v \in \{1,\ldots,n\}}$ be $n$ zero-mean i.i.d. random variables such that $E(|U_1|^p) < \infty$. Then there exists a constant $C > 0$ such that

$$E\Bigg( \Big| \sum_{v=1}^{n} U_v \Big|^p \Bigg) \le C \max\Big( n E(|U_1|^p),\ \big( n E(U_1^2) \big)^{p/2} \Big).$$
Applying the Rosenthal inequality with $p = 4$ and, for any $v \in \{1, \ldots, n\}$,

$$U_v = G_v 1_{\{|G_v| \le \eta_j\}} - E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big),$$

we obtain

$$A = \frac{1}{n^4} E\Bigg( \Big( \sum_{v=1}^{n} U_v \Big)^4 \Bigg) \le C \frac{1}{n^4} \max\Big( n E\big( U_1^4 \big),\ \big( n E\big( U_1^2 \big) \big)^2 \Big).$$

Using (6.6) (with $\psi$ instead of $\phi$), we have, for any $a \in \{2, 4\}$,

$$E(U_1^a) \le 2^a E\big( G_1^a 1_{\{|G_1| \le \eta_j\}} \big) \le 2^a \eta_j^{a-2} E\big( G_1^2 \big) \le 2^a \theta^2 \eta_j^{a-2} 2^{2(\delta+d)j}.$$

Hence

$$A \le C \frac{1}{n^4} \max\Big( \eta_j^2 \theta^2 2^{2(\delta+d)j} n,\ \big( \theta^2 2^{2(\delta+d)j} n \big)^2 \Big) = C \frac{1}{n^4} \max\Big( 2^{4(\delta+d)j} \frac{n^2}{\ln n},\ 2^{4(\delta+d)j} n^2 \Big) = C 2^{4(\delta+d)j} \frac{1}{n^2}. \qquad (6.9)$$

Let us now bound $B$. Using again (6.6) (with $\psi$ instead of $\phi$), we obtain
$$E\big( |G_1| 1_{\{|G_1| > \eta_j\}} \big) \le \frac{E\big( G_1^2 \big)}{\eta_j} \le \frac{1}{\theta 2^{(\delta+d)j}} \sqrt{\frac{\ln n}{n}}\ 2^{2(\delta+d)j} \theta^2 = \theta 2^{(\delta+d)j} \sqrt{\frac{\ln n}{n}}. \qquad (6.10)$$

Hence

$$B \le C 2^{4(\delta+d)j} \frac{(\ln n)^2}{n^2}. \qquad (6.11)$$
Combining (6.8), (6.9) and (6.11), we have

$$E\Big( \big( \widehat{\beta}_{j,k} - \beta_{j,k} \big)^4 \Big) \le C \Big( 2^{4(\delta+d)j} \frac{1}{n^2} + 2^{4(\delta+d)j} \frac{(\ln n)^2}{n^2} \Big) \le C 2^{4(\delta+d)j} \frac{(\ln n)^2}{n^2}.$$
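For intuition on Lemma 6.1 in the case $p = 4$ actually used above, one can expand $(\sum_v U_v)^4$ for zero-mean i.i.d. variables: all cross terms containing a factor $E(U_1) = 0$ vanish, leaving $E\big((\sum_v U_v)^4\big) = n E(U_1^4) + 3n(n-1) E(U_1^2)^2$, so Rosenthal's bound holds here with the explicit constant $C = 3$. The snippet checks this with the moments of Uniform$[-1,1]$, an illustrative choice:

```python
# Hands-on check of Rosenthal's inequality for p = 4 and U ~ Uniform[-1, 1].
EU2, EU4 = 1.0 / 3.0, 1.0 / 5.0  # E(U^2) and E(U^4) for Uniform[-1, 1]
for n in range(2, 1001):
    lhs = n * EU4 + 3 * n * (n - 1) * EU2 ** 2  # exact fourth moment of the sum
    rhs = 3 * max(n * EU4, (n * EU2) ** 2)      # Rosenthal bound with C = 3
    assert lhs <= rhs
print("Rosenthal bound with C = 3 verified for n = 2, ..., 1000")
```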
Proposition 6.3. For any integer $j \ge j_*$ and any $k \in \{0, \ldots, 2^j - 1\}$, let $\beta_{j,k}$ be the wavelet coefficient (3.1) of $f^{(d)}$ and $\widehat{\beta}_{j,k}$ be (4.2). Then, for any $\kappa \ge 8/3 + 2 + 2\sqrt{16/9 + 4}$,

$$P\Big( |\widehat{\beta}_{j,k} - \beta_{j,k}| \ge \kappa\lambda_j/2 \Big) \le 2 n^{-2}.$$

Proof of Proposition 6.3. Using (6.7), we have
$$|\widehat{\beta}_{j,k} - \beta_{j,k}| = \Big| \frac{1}{n} \sum_{v=1}^{n} \big[ G_v 1_{\{|G_v| \le \eta_j\}} - E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big) \big] - E\big( G_1 1_{\{|G_1| > \eta_j\}} \big) \Big|$$
$$\le \Big| \frac{1}{n} \sum_{v=1}^{n} \big[ G_v 1_{\{|G_v| \le \eta_j\}} - E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big) \big] \Big| + E\big( |G_1| 1_{\{|G_1| > \eta_j\}} \big).$$

Using (6.10), we obtain

$$E\big( |G_1| 1_{\{|G_1| > \eta_j\}} \big) \le \theta 2^{(\delta+d)j} \sqrt{\frac{\ln n}{n}} = \lambda_j.$$

Hence
$$S = P\Big( |\widehat{\beta}_{j,k} - \beta_{j,k}| \ge \kappa\lambda_j/2 \Big) \le P\Bigg( \Big| \frac{1}{n} \sum_{v=1}^{n} \big[ G_v 1_{\{|G_v| \le \eta_j\}} - E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big) \big] \Big| \ge (\kappa/2 - 1)\lambda_j \Bigg).$$
Now we need the Bernstein inequality presented in the lemma below (see [22]).
Lemma 6.2 (Bernstein's inequality). Let $n \in \mathbb{N}^*$ and $(U_v)_{v \in \{1,\ldots,n\}}$ be $n$ zero-mean i.i.d. random variables such that there exists a constant $M > 0$ satisfying $|U_v| \le M < \infty$ for any $v \in \{1, \ldots, n\}$. Then, for any $\lambda > 0$,

$$P\Bigg( \Big| \sum_{v=1}^{n} U_v \Big| \ge \lambda \Bigg) \le 2 \exp\Bigg( -\frac{\lambda^2}{2\big( n E(U_1^2) + \lambda M/3 \big)} \Bigg).$$
Let us set, for any $v \in \{1, \ldots, n\}$,

$$U_v = G_v 1_{\{|G_v| \le \eta_j\}} - E\big( G_v 1_{\{|G_v| \le \eta_j\}} \big).$$

Then $E(U_1) = 0$,

$$|U_v| \le \big| G_v 1_{\{|G_v| \le \eta_j\}} \big| + E\big( |G_v| 1_{\{|G_v| \le \eta_j\}} \big) \le 2\eta_j,$$

and, using again (6.6) (with $\psi$ instead of $\phi$),

$$E\big( U_1^2 \big) = V\big( G_1 1_{\{|G_1| \le \eta_j\}} \big) \le E\big( G_1^2 \big) \le \theta^2 2^{2(\delta+d)j}.$$

It follows from the Bernstein inequality that

$$S \le 2 \exp\Bigg( -\frac{n^2 (\kappa/2 - 1)^2 \lambda_j^2}{2\big( \theta^2 n 2^{2(\delta+d)j} + \frac{2 n (\kappa/2 - 1) \lambda_j \eta_j}{3} \big)} \Bigg).$$
Since

$$\lambda_j \eta_j = \theta 2^{(\delta+d)j} \sqrt{\frac{\ln n}{n}}\ \theta 2^{(\delta+d)j} \sqrt{\frac{n}{\ln n}} = \theta^2 2^{2(\delta+d)j}, \qquad \lambda_j^2 = \theta^2 2^{2(\delta+d)j} \frac{\ln n}{n},$$

we have, for any $\kappa \ge 8/3 + 2 + 2\sqrt{16/9 + 4}$,

$$S \le 2 \exp\Bigg( -\frac{(\kappa/2 - 1)^2 \ln n}{2\big( 1 + \frac{2(\kappa/2 - 1)}{3} \big)} \Bigg) \le 2 n^{-2}.$$
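The Bernstein bound of Lemma 6.2 can also be illustrated by simulation: for bounded zero-mean i.i.d. variables, the empirical tail of $|\sum_v U_v|$ stays below the exponential bound. The Uniform$[-1,1]$ variables ($M = 1$, $E(U_1^2) = 1/3$) and the values of $n$ and $\lambda$ are illustrative choices.

```python
# Monte Carlo illustration of Bernstein's inequality (Lemma 6.2).
import math
import random

random.seed(0)
n, lam, trials = 200, 25.0, 5_000
M, var = 1.0, 1.0 / 3.0  # bound and variance of Uniform[-1, 1]

hits = sum(
    1 for _ in range(trials)
    if abs(sum(random.uniform(-1.0, 1.0) for _ in range(n))) >= lam
)
empirical = hits / trials
bound = 2.0 * math.exp(-lam ** 2 / (2.0 * (n * var + lam * M / 3.0)))
print(empirical, "<=", bound)  # the empirical tail stays below the bound
```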
6.2 Proofs of the main results
Proof of Theorem 5.1. We expand the function $f^{(d)}$ as

$$f^{(d)}(x) = \sum_{k=0}^{2^{j_0}-1} \alpha_{j_0,k} \phi_{j_0,k}(x) + \sum_{j=j_0}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k} \psi_{j,k}(x),$$

where

$$\alpha_{j_0,k} = \int_0^1 f^{(d)}(x) \phi_{j_0,k}(x)\,dx, \qquad \beta_{j,k} = \int_0^1 f^{(d)}(x) \psi_{j,k}(x)\,dx.$$
We have

$$\widehat{f}^L_d(x) - f^{(d)}(x) = \sum_{k=0}^{2^{j_0}-1} (\widehat{\alpha}_{j_0,k} - \alpha_{j_0,k}) \phi_{j_0,k}(x) - \sum_{j=j_0}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k} \psi_{j,k}(x).$$

Hence

$$E\Big( \int_0^1 \big( \widehat{f}^L_d(x) - f^{(d)}(x) \big)^2\,dx \Big) = A + B,$$

where
$$A = \sum_{k=0}^{2^{j_0}-1} E\big( (\widehat{\alpha}_{j_0,k} - \alpha_{j_0,k})^2 \big), \qquad B = \sum_{j=j_0}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k}^2.$$

Using Proposition 6.1, we obtain

$$A \le C 2^{j_0(1+2\delta+2d)} \frac{1}{n} \le C n^{-2s/(2s+2\delta+2d+1)}.$$

Since $p \ge 2$, we have $B^s_{p,r}(M) \subseteq B^s_{2,\infty}(M)$. Hence

$$B \le C 2^{-2j_0 s} \le C n^{-2s/(2s+2\delta+2d+1)}.$$

So

$$E\Big( \int_0^1 \big( \widehat{f}^L_d(x) - f^{(d)}(x) \big)^2\,dx \Big) \le C n^{-2s/(2s+2\delta+2d+1)}.$$

The proof of Theorem 5.1 is complete.
Proof of Theorem 5.2. We expand the function $f^{(d)}$ as

$$f^{(d)}(x) = \sum_{k=0}^{2^{j_*}-1} \alpha_{j_*,k} \phi_{j_*,k}(x) + \sum_{j=j_*}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k} \psi_{j,k}(x),$$

where

$$\alpha_{j_*,k} = \int_0^1 f^{(d)}(x) \phi_{j_*,k}(x)\,dx, \qquad \beta_{j,k} = \int_0^1 f^{(d)}(x) \psi_{j,k}(x)\,dx.$$
We have

$$\widehat{f}^H_d(x) - f^{(d)}(x) = \sum_{k=0}^{2^{j_*}-1} (\widehat{\alpha}_{j_*,k} - \alpha_{j_*,k}) \phi_{j_*,k}(x) + \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} \big( \widehat{\beta}_{j,k} 1_{\{|\widehat{\beta}_{j,k}| \ge \kappa\lambda_j\}} - \beta_{j,k} \big) \psi_{j,k}(x) - \sum_{j=j_1+1}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k} \psi_{j,k}(x).$$

Hence

$$E\Big( \int_0^1 \big( \widehat{f}^H_d(x) - f^{(d)}(x) \big)^2\,dx \Big) = R + S + T, \qquad (6.12)$$
where

$$R = \sum_{k=0}^{2^{j_*}-1} E\big( (\widehat{\alpha}_{j_*,k} - \alpha_{j_*,k})^2 \big), \qquad S = \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} E\Big( \big( \widehat{\beta}_{j,k} 1_{\{|\widehat{\beta}_{j,k}| \ge \kappa\lambda_j\}} - \beta_{j,k} \big)^2 \Big)$$

and

$$T = \sum_{j=j_1+1}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k}^2.$$

Let us bound $R$, $T$ and $S$ in turn.
Using Proposition 6.1, we have

$$R \le C 2^{j_*(1+2\delta+2d)} \frac{1}{n} \le C \frac{1}{n} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}. \qquad (6.13)$$
For $r \ge 1$ and $p \ge 2$, we have $B^s_{p,r}(M) \subseteq B^s_{2,\infty}(M)$. So

$$T \le C \sum_{j=j_1+1}^{\infty} 2^{-2js} \le C 2^{-2j_1 s} \le C n^{-2s/(2\delta+2d+1)} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$

For $r \ge 1$ and $p \in [1,2)$, we have $B^s_{p,r}(M) \subseteq B^{s+1/2-1/p}_{2,\infty}(M)$. Since $s > (2\delta + 2d + 1)/p$, we have $(s + 1/2 - 1/p)/(2\delta + 2d + 1) > s/(2s + 2\delta + 2d + 1)$. So

$$T \le C \sum_{j=j_1+1}^{\infty} 2^{-2j(s+1/2-1/p)} \le C 2^{-2j_1(s+1/2-1/p)} \le C n^{-2(s+1/2-1/p)/(2\delta+2d+1)} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$

Hence, for $r \ge 1$ and either $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta + 2d + 1)/p\}$, we have

$$T \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}. \qquad (6.14)$$
The term $S$ can be decomposed as

$$S = e_1 + e_2 + e_3 + e_4, \qquad (6.15)$$

where

$$e_1 = \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} E\Big( \big( \widehat{\beta}_{j,k} - \beta_{j,k} \big)^2 1_{\{|\widehat{\beta}_{j,k}| \ge \kappa\lambda_j\}} 1_{\{|\beta_{j,k}| < \kappa\lambda_j/2\}} \Big),$$

$$e_2 = \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} E\Big( \big( \widehat{\beta}_{j,k} - \beta_{j,k} \big)^2 1_{\{|\widehat{\beta}_{j,k}| \ge \kappa\lambda_j\}} 1_{\{|\beta_{j,k}| \ge \kappa\lambda_j/2\}} \Big),$$

$$e_3 = \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} E\Big( \beta_{j,k}^2 1_{\{|\widehat{\beta}_{j,k}| < \kappa\lambda_j\}} 1_{\{|\beta_{j,k}| \ge 2\kappa\lambda_j\}} \Big)$$

and

$$e_4 = \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} E\Big( \beta_{j,k}^2 1_{\{|\widehat{\beta}_{j,k}| < \kappa\lambda_j\}} 1_{\{|\beta_{j,k}| < 2\kappa\lambda_j\}} \Big).$$

Let us analyze each term $e_1$, $e_2$, $e_3$ and $e_4$ in turn.
Upper bounds for $e_1$ and $e_3$. We have

$$\big\{ |\widehat{\beta}_{j,k}| < \kappa\lambda_j,\ |\beta_{j,k}| \ge 2\kappa\lambda_j \big\} \subseteq \big\{ |\widehat{\beta}_{j,k} - \beta_{j,k}| > \kappa\lambda_j/2 \big\},$$
$$\big\{ |\widehat{\beta}_{j,k}| \ge \kappa\lambda_j,\ |\beta_{j,k}| < \kappa\lambda_j/2 \big\} \subseteq \big\{ |\widehat{\beta}_{j,k} - \beta_{j,k}| > \kappa\lambda_j/2 \big\}$$

and

$$\big\{ |\widehat{\beta}_{j,k}| < \kappa\lambda_j,\ |\beta_{j,k}| \ge 2\kappa\lambda_j \big\} \subseteq \big\{ |\beta_{j,k}| \le 2 |\widehat{\beta}_{j,k} - \beta_{j,k}| \big\}.$$

So

$$\max(e_1, e_3) \le C \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} E\Big( \big( \widehat{\beta}_{j,k} - \beta_{j,k} \big)^2 1_{\{|\widehat{\beta}_{j,k} - \beta_{j,k}| > \kappa\lambda_j/2\}} \Big).$$

It follows from the Cauchy–Schwarz inequality and Propositions 6.2 and 6.3 that

$$E\Big( \big( \widehat{\beta}_{j,k} - \beta_{j,k} \big)^2 1_{\{|\widehat{\beta}_{j,k} - \beta_{j,k}| > \kappa\lambda_j/2\}} \Big) \le \Big( E\Big( \big( \widehat{\beta}_{j,k} - \beta_{j,k} \big)^4 \Big) \Big)^{1/2} \Big( P\big( |\widehat{\beta}_{j,k} - \beta_{j,k}| > \kappa\lambda_j/2 \big) \Big)^{1/2} \le C 2^{2(\delta+d)j} \frac{\ln n}{n^2}.$$

Hence

$$\max(e_1, e_3) \le C \frac{\ln n}{n^2} \sum_{j=j_*}^{j_1} 2^{j(1+2\delta+2d)} \le C \frac{\ln n}{n^2} 2^{j_1(1+2\delta+2d)} \le C \frac{\ln n}{n} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}. \qquad (6.16)$$
Upper bound for the term $e_2$. Using Proposition 6.2 and the Cauchy–Schwarz inequality, we obtain

$$E\Big( \big( \widehat{\beta}_{j,k} - \beta_{j,k} \big)^2 \Big) \le \Big( E\Big( \big( \widehat{\beta}_{j,k} - \beta_{j,k} \big)^4 \Big) \Big)^{1/2} \le C 2^{2(\delta+d)j} \frac{\ln n}{n}.$$

Hence

$$e_2 \le C \frac{\ln n}{n} \sum_{j=j_*}^{j_1} 2^{2(\delta+d)j} \sum_{k=0}^{2^j-1} 1_{\{|\beta_{j,k}| > \kappa\lambda_j/2\}}.$$

Let $j_2$ be the integer defined by

$$2^{-1} \Big( \frac{n}{\ln n} \Big)^{1/(2s+2\delta+2d+1)} < 2^{j_2} \le \Big( \frac{n}{\ln n} \Big)^{1/(2s+2\delta+2d+1)}. \qquad (6.17)$$

We have $e_2 \le e_{2,1} + e_{2,2}$, where

$$e_{2,1} = C \frac{\ln n}{n} \sum_{j=j_*}^{j_2} 2^{2(\delta+d)j} \sum_{k=0}^{2^j-1} 1_{\{|\beta_{j,k}| > \kappa\lambda_j/2\}}, \qquad e_{2,2} = C \frac{\ln n}{n} \sum_{j=j_2+1}^{j_1} 2^{2(\delta+d)j} \sum_{k=0}^{2^j-1} 1_{\{|\beta_{j,k}| > \kappa\lambda_j/2\}}.$$

We have

$$e_{2,1} \le C \frac{\ln n}{n} \sum_{j=j_*}^{j_2} 2^{j(1+2\delta+2d)} \le C \frac{\ln n}{n} 2^{j_2(1+2\delta+2d)} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$
For $r \ge 1$ and $p \ge 2$, since $B^s_{p,r}(M) \subseteq B^s_{2,\infty}(M)$,

$$e_{2,2} \le C \frac{\ln n}{n} \sum_{j=j_2+1}^{j_1} 2^{2(\delta+d)j} \frac{1}{\lambda_j^2} \sum_{k=0}^{2^j-1} \beta_{j,k}^2 = C \sum_{j=j_2+1}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k}^2 \le C 2^{-2j_2 s} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$

For $r \ge 1$, $p \in [1,2)$ and $s > (2\delta + 2d + 1)/p$, since $B^s_{p,r}(M) \subseteq B^{s+1/2-1/p}_{2,\infty}(M)$ and

$$(2s + 2\delta + 2d + 1)(2 - p)/2 + (s + 1/2 - 1/p + \delta + d - 2(\delta+d)/p) p = 2s,$$

we have

$$e_{2,2} \le C \frac{\ln n}{n} \sum_{j=j_2+1}^{j_1} 2^{2(\delta+d)j} \frac{1}{\lambda_j^p} \sum_{k=0}^{2^j-1} |\beta_{j,k}|^p \le C \Big( \frac{\ln n}{n} \Big)^{(2-p)/2} \sum_{j=j_2+1}^{\infty} 2^{j(\delta+d)(2-p)} 2^{-j(s+1/2-1/p)p}$$
$$\le C \Big( \frac{\ln n}{n} \Big)^{(2-p)/2} 2^{-j_2(s+1/2-1/p+\delta+d-2(\delta+d)/p)p} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$

So, for $r \ge 1$ and either $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta + 2d + 1)/p\}$,

$$e_2 \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}. \qquad (6.18)$$
Upper bound for the term $e_4$. We have

$$e_4 \le \sum_{j=j_*}^{j_1} \sum_{k=0}^{2^j-1} \beta_{j,k}^2 1_{\{|\beta_{j,k}| < 2\kappa\lambda_j\}}.$$

Let $j_2$ be the integer defined in (6.17). We have $e_4 \le e_{4,1} + e_{4,2}$, where

$$e_{4,1} = \sum_{j=j_*}^{j_2} \sum_{k=0}^{2^j-1} \beta_{j,k}^2 1_{\{|\beta_{j,k}| < 2\kappa\lambda_j\}}, \qquad e_{4,2} = \sum_{j=j_2+1}^{j_1} \sum_{k=0}^{2^j-1} \beta_{j,k}^2 1_{\{|\beta_{j,k}| < 2\kappa\lambda_j\}}.$$

We have

$$e_{4,1} \le C \sum_{j=j_*}^{j_2} 2^j \lambda_j^2 = C \frac{\ln n}{n} \sum_{j=j_*}^{j_2} 2^{j(1+2\delta+2d)} \le C \frac{\ln n}{n} 2^{j_2(1+2\delta+2d)} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$
For $r \ge 1$ and $p \ge 2$, since $B^s_{p,r}(M) \subseteq B^s_{2,\infty}(M)$, we have

$$e_{4,2} \le \sum_{j=j_2+1}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k}^2 \le C 2^{-2j_2 s} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$

For $r \ge 1$, $p \in [1,2)$ and $s > (2\delta + 2d + 1)/p$, since $B^s_{p,r}(M) \subseteq B^{s+1/2-1/p}_{2,\infty}(M)$ and

$$(2 - p)(2s + 2\delta + 2d + 1)/2 + (s + 1/2 - 1/p + \delta + d - 2(\delta+d)/p) p = 2s,$$

we have

$$e_{4,2} \le C \sum_{j=j_2+1}^{j_1} \lambda_j^{2-p} \sum_{k=0}^{2^j-1} |\beta_{j,k}|^p = C \Big( \frac{\ln n}{n} \Big)^{(2-p)/2} \sum_{j=j_2+1}^{j_1} 2^{j(\delta+d)(2-p)} \sum_{k=0}^{2^j-1} |\beta_{j,k}|^p$$
$$\le C \Big( \frac{\ln n}{n} \Big)^{(2-p)/2} \sum_{j=j_2+1}^{\infty} 2^{j(\delta+d)(2-p)} 2^{-j(s+1/2-1/p)p} \le C \Big( \frac{\ln n}{n} \Big)^{(2-p)/2} 2^{-j_2(s+1/2-1/p+\delta+d-2(\delta+d)/p)p} \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$

So, for $r \ge 1$ and either $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta + 2d + 1)/p\}$,

$$e_4 \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}. \qquad (6.19)$$

It follows from (6.15), (6.16), (6.18) and (6.19) that

$$S \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}. \qquad (6.20)$$
Combining (6.12), (6.13), (6.14) and (6.20), we have, for $r \ge 1$ and either $\{p \ge 2$ and $s > 0\}$ or $\{p \in [1,2)$ and $s > (2\delta + 2d + 1)/p\}$,

$$E\Big( \int_0^1 \big( \widehat{f}^H_d(x) - f^{(d)}(x) \big)^2\,dx \Big) \le C \Big( \frac{\ln n}{n} \Big)^{2s/(2s+2\delta+2d+1)}.$$

The proof of Theorem 5.2 is complete.
This work is supported by ANR grant NatImages, ANR-08-EMER-009.
References
[1] P.K. Bhattacharya, Estimation of a probability density function and its derivatives, Sankhya Ser. A 29 (1967), 373–382.
[2] T. Cai, Adaptive wavelet estimation: a block thresholding and oracle inequality approach, The Annals of Statistics 27 (1999), 898–924.
[3] Y.P. Chaubey and H. Doosti, Wavelet based estimation of the derivatives of a density for m-dependent random variables, Journal of the Iranian Statistical Society 4 (2) (2005), 97–105.
[4] Y.P. Chaubey, H. Doosti and B.L.S. Prakasa Rao, Wavelet based estimation of the derivatives of a density with associated variables, International Journal of Pure and Applied Mathematics 27 (1) (2006), 97–106.
[5] C. Chesneau, Wavelet estimation of the derivatives of an unknown function from a convolution model, Preprint (2009).
[6] F. Comte and M.-L. Taupin, Nonparametric estimation of the regression function in an errors-in-variables model, Statistica Sinica 17 (3) (2007), 1065–1090.
[7] B. Delyon and A. Juditsky, On minimax wavelet estimators, Applied and Computational Harmonic Analysis 3 (1996), 215–228.
[8] D.L. Donoho and I.M. Johnstone, Ideal spatial adaptation by wavelet shrinkage, Biometrika 81 (1994), 425–455.
[9] D.L. Donoho and I.M. Johnstone, Adapting to unknown smoothness via wavelet shrinkage, Journal of the American Statistical Association 90 (432) (1995), 1200–1224.
[10] D.L. Donoho and I.M. Johnstone, Minimax estimation via wavelet shrinkage, The Annals of Statistics 26 (1998), 879–921.
[11] J. Fan and M. Masry, Multivariate regression estimation with errors-in-variables: asymptotic normality for mixing processes, J. Multivariate Anal. 43 (1992), 237–272.
[12] J. Fan and Y.K. Truong, Nonparametric regression with errors-in-variables, The Annals of Statistics 21 (4) (1993), 1900–1925.
[13] J. Fan, Y.K. Truong and Y. Wang, Nonparametric function estimation involving errors-in-variables, Nonparametric Functional Estimation and Related Topics (1991), 613–627.
[14] J. Fan and J.Y. Koo, Wavelet deconvolution, IEEE Transactions on Information Theory 48 (2002), 734–747.
[15] D.A. Ioannides and P.D. Alevizos, Nonparametric regression with errors in variables and applications, Statist. Probab. Lett. 32 (1997), 35–43.
[16] W. Härdle, G. Kerkyacharian, D. Picard and A. Tsybakov, Wavelets, Approximation and Statistical Applications, Lecture Notes in Statistics 129, Springer, New York, 1998.
[17] I.M. Johnstone, G. Kerkyacharian, D. Picard and M. Raimondo, Wavelet deconvolution in a periodic setting, Journal of the Royal Statistical Society Series B 66 (3) (2004), 547–573.
[18] J.Y. Koo and K.W. Lee, B-spline estimation of regression functions with errors in variables, Statist. Probab. Lett. 40 (1998), 57–66.
[19] E. Masry, Multivariate regression estimation with errors-in-variables for stationary processes, Nonparametric Statistics 3 (1993), 13–36.
[20] Y. Meyer, Ondelettes et Opérateurs, Hermann, Paris, 1990.
[21] M. Pensky and B. Vidakovic, Adaptive wavelet estimator for nonparametric density deconvolution, The Annals of Statistics 27 (1999), 2033–2053.
[22] V.V. Petrov, Limit Theorems of Probability Theory: Sequences of Independent Random Variables, Clarendon Press, Oxford, 1995.
[23] B.L.S. Prakasa Rao, Nonparametric estimation of the derivatives of a density by the method of wavelets, Bull. Inform. Cyb. 28 (1996), 91–100.
[24] H.P. Rosenthal, On the subspaces of $L^p$ ($p \ge 2$) spanned by sequences of independent random variables, Israel Journal of Mathematics 8 (1970), 273–303.
[25] A.B. Tsybakov, Introduction à l'estimation non-paramétrique, Springer, 2004.