• Aucun résultat trouvé

Strong consistency of kernel estimator in a semiparametric regression model

N/A
N/A
Protected

Academic year: 2022

Partager "Strong consistency of kernel estimator in a semiparametric regression model"

Copied!
20
0
0

Texte intégral

(1)

HAL Id: hal-01897055

https://hal.archives-ouvertes.fr/hal-01897055

Preprint submitted on 16 Oct 2018

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires

Strong consistency of kernel estimator in a semiparametric regression model

Emmanuel de Dieu Nkou, Guy Martial Nkiet

To cite this version:

Emmanuel de Dieu Nkou, Guy Martial Nkiet. Strong consistency of kernel estimator in a semipara- metric regression model. 2018. �hal-01897055�

(2)

Strong consistency of kernel estimator in a semiparametric regression model

Emmanuel de Dieu NKOU and Guy Martial NKIET

Universit´e des Sciences et Techniques de Masuku, Franceville, Gabon;

ARTICLE HISTORY Compiled September 23, 2018

ABSTRACT

Estimating the effective dimension reduction (EDR) space, related to the semipara- metric regression model introduced by Li [9], is based on the estimation of the covariance matrix Λ of the conditional expectation of the vector of predictors given the response. An estimatorΛbnof Λ based on kernel method was introduced by Zhu and Fang [17] who then derived, under some conditions, the asymptotic distribution of

n

ΛbnΛ

, asn+∞. In this paper, we obtain, under specified conditions, the almost sure convergence ofΛbnto Λ, asn+∞.

KEYWORDS

Strong consistency; Kernel estimator; Semiparametric; Regression

1. Introduction

Given a univariate response variableY, we consider the regression model:

Y =F(β1TX, ..., βNTX, ε), (1) whereX is a d-dimensional random vector with covariance matrix assumed, without loss of generality, to be the identity matrix,N is an integer of N such that N < d, β1,· · · , βN are vectors inRd, andεis a real random variable that is independent ofX, and F is an arbitrary unknown function onRN+1. This model, introduced by Li [9], permits to achieve dimension reduction since the numberN of variables to be consid- ered for estimatingF is less than the initial dimension dof the regressor vectorX. It expresses the fact that the projection ofX onto theN-dimensional subspace spanned byβ1TX, ..., βNTX, named the effective dimension reduction (EDR) space, contains all information about the response variableY. Estimating N and the EDR space is then a crucial issue that has been tackled in several works (e.g., [1,2,3,4,8,9,10,11,12,13,16]).

Since the directionsβ1,· · ·, βK are, under some conditions, characterized as eigenvec- tors of the covariance matrix Λ ofE(X|Y), the aforementioned estimation problem is based on estimation of Λ. The most popular method for doing that is based on slicing the range of Y and leads to the well known sliced inverse regression (SIR) method that was introduced by Li [9] (see also [8]). An alternative method was introduced

(3)

by Zhu and Fang [17]; in this work an estimator Λbn of Λ based on kernel method is proposed and the limiting distribution of √

n

Λbn−Λ

, as n → +∞, is derived under some conditions. Since this result just implies weak consistency of Λbn, that is the convergence in probability ofΛbn to Λ as n→+∞, it is natural to wonder if one could obtain strong consistency forΛbn.

In this paper, we tackle this problem and we prove, under some conditions, the almost sure convergence ofΛbn to Λ, as n →+∞. The paper is organized as follows:

Section 2 is devoted to the presentation of the used estimator, that is the estimator given in [17]. In Section 3, the assumptions needed for our results are given, and the main theorems that establish the aforementioned consistency are given. Then, the proofs of all lemmas and theorems are postponed in Section 4.

2. Preliminaries and notations

Lettingf be the density ofY, we suppose that, for all y∈R, we havef(y)>0; then, for any j= 1,· · ·, d, we consider

Rj(y) =E(Xj|Y =y) = gj(y)

f(y) where gj(y) = Z

R

zf(Xj ,Y)(z, y)dz, f(Xj ,Y) being the density of the pair (Xj, Y). Then, we consider the random vector

R(Y) =

R1(Y), ..., Rd(Y) T

=

E(X1|Y), ...,E(Xd|Y) T

=E

X|Y

and its covariance matrix Λ =Cov

E(X|Y)

which is of great importance since the EDR space is obtained from its spectral analysis (e.g. [9]). It cannot be computed in practice since it depends on the distribution of (X, Y) which is generally unknown;

that is why approaches for its estimation have been investigated by several authors.

Li [9] considered an estimation method based on slicing the range ofY, so introducing sliced inverse regression, whereas Zhu and Fang [17] introduced a kernel estimator.

More precisely, considering an i.i.d. sample (Yi, Xi)i=1,...nof the pair (Y, X) of random variables connected according to model (1) and putting

Xi = (Xi1,· · ·, Xid)T , we define kernel estimates off and thegj’s by:

fbn(y) = 1 n

n

X

i=1

1 hn

K

y−Yi hn

, bgj,n(y) = 1 n

n

X

i=1

Xij 1 hn

K

y−Yi hn

,

wherehn is a bandwidth andK(·) is a kernel function. In order to avoid small values in the denominator, Zhu and Fang (1996) proposed to consider

fbn(y) = max f(y), bn

and fbbn(y) = max

fbn(y), bn ,

(4)

where (bn)n∈N is a sequence of positive real numbers that satisfies the property:

limn→+∞(bn) = 0. Then, the Rbn,j’s defined by Rbn,j(y) = gj(y)

fbn(y) are estimated by

Rbbn,j(y) = bgj,n(y) fbbn(y) and putting

Rbbn(y) =

Rbbn,1(y), ...,Rbbn,d(y) T

,

we take as estimator of Λ the random matrix:

Λbn= 1 n

n

X

i=1

Rbbn(Yi) Rbbn(Yi)T

.

This estimator was considered in [17] who then proved that√ n

Λbn−Λ

converges in distribution, as n → +∞, to a normal distribution. This result implies that Λbn converges in probability, as n → +∞, to Λ, that is weak consistency of the estima- tor. In the following section, we establish, under specified conditions, the almost sure convergence ofΛbn to Λ as n→+∞.

3. Assumptions and main results

In this section, we present our assumptions, then we give the main results that establish almost sure convergence ofΛbn to Λ asn→+∞.

Assumption 1. The random variableXis bounded, i.e. there existsG >0 such that kXkd≤G, wherek · kd is the usual Euclidean norm of Rd.

Assumption 2. The random variableY has a bounded density f.

Assumption 3. Thegj’s andf are 3-times differentiable and their third derivatives satisfy the following condition: there exists a neighborhood of the origin, say U, and a constantc >0 such that, for any u∈U,

f(3)(y+u)−f(3)(u)

≤c|u| and

gj(3)(y+u)−g(3)j (u)

≤c|u|, forj= 1,· · ·, d.

Assumption 4. For any pair (k, `) such that 1≤k, `≤d, and any u∈U,

|Rk(y+u)R`(y+u)−Rk(y)R`(y)| ≤c|u|.

(5)

Assumption 5. There exists an integerr >6 such that, for any j∈ {1,· · ·, d}, the functiongj belongs to the set

Σ (r, L, α) =n

g∈ Dr / ∀(x, y),

g(r)(x)−g(r)(y)

≤L|x−y|αo ,

where α ∈]0,1], L > 0 and β := r+α satisfies β > 7, and Dr denotes the space of r-times differentiable functions.

Assumption 6. (i) The kernel K is continuous and its support is the interval [−1,1];

(ii) K is symmetric about 0;

(iii) The kernelK is bounded, that is: supu∈R|K(u)|=D <+∞.

(iv) The kernelK is of order r, that is Z

ukK(u)du= 0 for k∈ {1,2,· · ·, r}; (v)

Z

|K(u)|du <+∞ and Z

|u|β|K(u)|du <+∞.

Assumption 7. When n is large enoughhn ∼ n−c1 and bn ∼n−c2 where c1 and c2 are numbers satisfyingc1 >0, 0< c2 <1/10 and 1/8 +c2/4< c1 <1/4−c2.

Assumption 8. The eigenvaluesλ1,· · ·, λd of Λ verify: λ1 >· · ·> λd>0.

The assumptions 3, 4, 6-(i), 6-(ii) and 7 was introduced in [17] and are necessary here to use some results of this paper. Assumption 2 concerns the density ofY and is classical since it is satisfied for the usual probability distributions. The assumptions 5, 6-(iv) and 6-(v) are classical assumptions of nonparametric statistics literature (see, e.g., [15]). Assumption 6-(iii) is satisfied, for instance, by the gaussian kernel.

Remark 1. For overcoming technical difficulties due to small values in the denomi- nator, Zhu and Fang (1996) introduced the modified versionfbbn = max(fbn, bn) of the kernel estimatefbnof the densityf. But this approach does not guarantee that we get a good estimator off. Indeed, if we takebn =n−1/11, then bn is still larger than 1/2 for very large values ofn (for example n= 2000). So, every value of fbn could be cut off and, therefore, fbbn would have a constant value. This is an undesirable property that makesfbbn a bad estimator of the density. To overcome this problem, we can take bn = min(a, n−c2), where ais a fixed strictly positive number. When a is sufficiently small fbbn is near fromfbn and is, therefore, a good estimate of f. Indeed, it is easy to check that supx∈R|fbbn(x)−fbn(x)| ≤a. Finally, by takingbn= min(a, n−c2), we obtain a good estimate of the density and we still havebn∼n−c2 as required in Assumption 7.

For a symmetric (d×d) matrixA= (ak,`)1≤k,`≤d, we denote byV ech(A) thed(d+1)/2- dimensional vector

(a11,· · · , ad1, a22, a32,· · · , ad2, a33, a43,· · · , ad3,· · ·, add)T.

(6)

For a vectorV = (v1, v2,· · ·, vm)∈Rm, we denotekVk= max1≤i≤m|vi|.

Now, we give results which establish strong consistency forΛbn as estimator of Λ.

Theorem 1. Under the assumptions 1, 6 and 7 we have

V ech

Λbn−E

Λbn

=Oa.s.

logn n

ν

withν = 1/2−2(c1+c2).

Putting

Λ = (λk,`)1≤k,`≤d and Λbn=

(n)k,`

1≤k,`≤d, we have:

Theorem 2. Under the assumptions 1 to 7, we have for any 1≤k, `≤d:

n→+∞lim E

(n)k,`

k,`.

The following theorem is our main result; it results from Theorems 1 and 2.

Theorem 3. Under the assumptions 1 to 7, Λbn converges almost surely to Λ, as n→+∞.

As a consequence of this theorem, we can deduce strong consistency for estimators of theβk’s. Since the covariance matrix ofX is assumed to be equal to thed×didentity matrixId , thenβk (fork= 1,· · ·, N) is an eigenvector of Λ associated with thek-th largest eigenvalueλk (see [9]). We consider the empirical covariance matrix

Σbn= 1 n

n

X

i=1

Xi−Xn

Xi−XnT

where Xn= 1 n

X

i=1

Xi,

and we denote byηbk an eigenvector ofΛbn associated with thek-th largest eigenvalue bλk. Clearly, from strong law of large numbers, Σbn converges almost surely to Id as n→+∞; thenΣbnis also invertible for large values ofn, and we can take as estimator ofβk the vectorβbk=Σb−1/2n ηbk. Then, we have:

Corollary 1. Under the assumptions 1 to 8, for any k ∈ {1,· · ·, N}, βbk converges almost surely toβk, asn→+∞.

(7)

4. Proofs

4.1. Preliminary results

First, we recall below a lemma given in [17] (see p. 1058) and which will be useful for proving other results.

Lemma 1. Under the assumptions 1, 3, 4 and 7, we have almost surely:

sup

y∈R

fbn(y)−f(y) =O

h4n+n−1/2h−1n logn

,

asn→+∞.

Lemma 2. Under assumptions 1, 2, 3, 4 and 7, we have for any j ∈ {1,· · · , d}:

E

g2j(Y)

<+∞.

Proof.According to [17] (see p. 1059), we have:

sup

y∈R

|E(gbj,n(y))−gj(y)|=O(h4n).

Then, there existsM1 >0 such that supy∈R|E(bgj,n(y))−gj(y)| ≤M1 for anyn∈N. On the other hand,

|E(bgj,n(y))| ≤ Gh−1n E

K

y−Y1

hn

=Gh−1n Z

K

y−t hn

f(t)dt

= G

Z

|K(u)|f(hny−u) du≤Gkfk Z

|K(u)|du, wherekfk= supt∈Rf(t). Therefore, for anyy∈R,

|gj(y)| ≤ |E(bgj,n(y))−gj(y)|+|E(bgj,n(y))|

≤ M1+Gkfk Z

|K(u)|du.

This shows thatgj(Y) is a bounded real random variable and, therefore,E

g2j(Y)

<

+∞.

Lemma 3. Under the assumptions 5 and 6, we have for any y ∈ R and any j ∈ {1,· · · , d}:

Z

gj(y−uhn)K(u)du−gj(y)

≤C hβn, whereC >0.

(8)

Proof.By a Taylor expansion, we have:

gj(y−uhn) =gj(y) +

r−1

X

k=1

g(k)(y)

k! (−1)kukhkn + (−1)rurhrn

r! g(r)j (y−θuhn), whereθ∈]0,1[. Thus,

Z

gj(y−uhn)K(u)du = Z

gj(y)K(u)du +

r−1

X

k=1

g(k)(y)

k! (−1)kukhkn Z

ukK(u)du

+ (−1)rhrn r!

Z

g(r)j (y−θuhn)urK(u)du,

= gj(y) + (−1)rhrn r!

Z

g(r)j (y−θuhn)urK(u)du.

Furthermore, since Z

gj(r)(y)urK(u)du=g(r)j (y) Z

urK(u)du= 0, it follows:

Z

gj(y−uhn)K(u)du−gj(y) = (−1)rhrn r!

Z

gj(r)(y−θuhn)urK(u)du− Z

gj(r)(y)urK(u)du

= (−1)rhrn r!

Z

g(r)j (y−θuhn)−gj(r)(y)

urK(u)du.

Thus, under Assumption 5,

Z

gj(y−uhn)K(u)du−gj(y)

≤ hrn r!

Z

gj(r)(y−θuhn)−gj(r)(y)

|u|r|K(u)|du

≤ hrn r!

Z

L θα|u|αhαn|u|r|K(u)|du

≤ hr+αn L r!

Z

|u|r+α|K(u)|du

≤ hβnL r!

Z

|u|β|K(u)|du,

what gives the required inequality withC = Lr!R

|u|β|K(u)|du.

Lemma 4. Considering Ej,n=

Z f(y)

Z

gj(y−uhn)K(u)du 2

dy and Ej = Z

f(y)g2j(y)dy=E g2j(Y) ,

then, under the assumptions 1 to 7, we have : |Ej,n− Ej| ≤C2hn +2C hβnE

|gj(Y)|

.

(9)

Proof.Using the equalitya2−b2 = (a−b)2+ 2b(a−b), we obtain : Ej,n− Ej =

Z f(y)

"

Z

gj(y−uhn)K(u)du 2

gj(y) 2#

dy

= Z

f(y)

"

Z

gj(y−uhn)K(u)du−gj(y) 2

+ 2gj(y) Z

gj(y−uhn)K(u)du−gj(y)

dy.

Thus

|Ej,n− Ej| ≤ Z

f(y)

"

Z

gj(y−uhn)K(u)du−gj(y) 2

+ 2 |gj(y)|

Z

gj(y−uhn)K(u)du−gj(y)

dy.

Then, from Lemma 3, it follows

|Ej,n− Ej| ≤ Z

f(y)h

C2hn + 2|gj(y)|C hβni dy

= C2hn Z

f(y)dy+ 2Chβn Z

|gj(y)|f(y)dy

= C2hn + 2ChβnE(|gj(Y)|).

Lemma 5. Putting δn = nhn

1− 1n

Ej,n− Ej

, we have under the assumptions 1 to 7, limn→+∞δn= 0.

Proof.First,

|Ej,n| ≤ |Ej,n− Ej|+|Ej| ≤C2hn + 2C hβnE

|gj(Y)|

+E gj2(Y) .

Therefore,

n| = nhn

(Ej,n− Ej)− 1 nEj,n

≤nhn

|Ej,n− Ej|+ 1 n|Ej,n|

≤ nhβ+1n

C2hβn+ 2CE

|gj(Y)|

+1 n

E gj2(Y)

+C2hn + 2C hβnE

|gj(Y)|

.

Clearly,

n→+∞lim 1

n

E gj2(Y)

+C2hn + 2C hβnE

|gj(Y)| = 0.

(10)

On the other hand, sincehn∼n−c1, it follows thatnhβ+1n ∼n1−(β+1)c1. Further, from β >7 we deduce thatβ+11 < 18 < c1, that is 1−(β+1)c1<0. Thus limn→+∞nhβ+1n = 0 and, therefore,

n→+∞lim

nhβ+1n

C2hβn+ 2CE

|gj(Y)| = 0.

Finally, limn→+∞δn= 0.

Lemma 6. Under the assumptions 1 to 7, we have:

E

(bgj,n(Y)−gj(Y))2

=O 1

n hn

.

Proof.Considering the random variable Wi,j,n =XijK

Y−Yi

hn

, we have:

E bgj,n(Y)2

= 1

nh2nE W1,j,n2 + 1

h2n

1− 1 n

E(W1,j,nW2,j,n). (2) Clearly, E

W1,j,n2

= hnJn, where Jn = R R 1

hnK2 u

hn

V(y − u)f(u)dudy with V(y) = R

z2f(X,Y)(z, y)dz, and from Theorem 2.1.1 in [11] it is known that limn→∞Jn=J :=R R

V(y)K2(u)f(u)du dy. Furthermore, E(W1,j,nW2,j,n) =

Z Z K

u hn

gj(y−u)du Z

K v

hn

gj(y−v)dv

f(y)dy

= Z

f(y) Z

K u

hn

gj(y−u)du 2

dy.

Puttingt= hu

n, we obtain E(W1,j,nW2,j,n) =h2n

Z f(y)

Z

K(t)gj(y−thn)dt 2

dy=h2nEj,n. On the other handE gj(Y)2

=Ej. Then, we deduce from (2) that E bgj,n(Y)2

−E gj(Y)2

= 1

nh2nhnJn+ 1 h2n

1− 1

n

h2nEj,n− Ej = 1 nhn

(Jnn). On the other hand, it is easy to check that

E

[bgj,n(Y)−gj(Y)]2

= E bgj,n(Y)2

−E gj(Y)2

−2

E

bgj,n(Y)gj(Y)

−E gj(Y)2

= 1

nhn(Jnn)−2∆j,n,

(11)

where ∆j,n=E

bgj,n(Y)gj(Y)

−E gj(Y)2

. We have:

E

bgj,n(Y)gj(Y)

= E

"

gj(Y) 1 nhn

n

X

i=1

XijK

Y −Yi hn

#

= 1

hnE

gj(Y)X1jK

Y −Y1 hn

= 1

hn Z

(3)

z gj(y)K

y−u hn

f(Xj ,Y

1,Y)(z, u, y)dz du dy

= 1

hn

Z

(3)

z gj(y)K

y−u hn

f(Xj ,Y1)(z, u)fY(y)dz du dy

= 1

hn

Z

gj(y)f(y) Z

K

y−u hn

Z

zf(Xj ,Y

1)(z, u)

du

dy

= 1

hn Z

gj(y)f(y) Z

K

y−u hn

gj(u)du

dy.

Puttingt= y−uh

n , we obtain E

gj(Y)bgj,n(Y)

= Z

gj(y)f(y) Z

gj(y−thn)K(t)dt

dy.

Hence

j,n = Z

gj(y)f(y) Z

gj(y−thn)K(t)dt

dy− Z

g2j(y)f(y);

= Z

gj(y)f(y) Z

gj(y−thn)K(t)dt−gj(y)

dy

and

|∆j,n| ≤ Z

|gj(y)|f(y)

Z

gj(y−thn)K(t)dt−gj(y)

dy

≤ Chβn Z

|gj(y)|f(y)dy=ChβnE(|gj(Y)|), the second inequality coming from Lemma 3. Then, we have:

E

[gbj,n(Y)−gj(Y)]2

≤ 1

nhn

|Jnn|+ 2|∆j,n|

≤ 1

nhn n

|Jnn|+ 2Cnhβ+1n E(|gj(Y)|)o .

Moreover,nhβ+1n ∼n1−(β+1)c1 and, sinceβ+ 1>8 and 18 < c1 we have the inequality 1−(β+ 1)c1 <0 which implies limn→∞nhβ+1n = 0. Consequently,

n→+∞lim

|Jnn|+ 2Cnhβ+1n E(|gj(Y)|)

=|J|,

(12)

from what we deduce thatE

(bgj,n(Y)−gj(Y))2

=O

1 nhn

.

Lemma 7. Under the assumptions 1, 2, 3, 4 and 7, we have:

n→+∞lim E

"

bgj,n(Y)−gj(Y)fbbn(Y) fbn(Y)

#2

= 0.

Proof.We have:

E

"

bgj,n(Y)−gj(Y)fbbn(Y) fbn(Y)

#2

= E

"

bgj,n(Y)−gj(Y)

+gj(Y)

1−fbbn(Y) fbn(Y)

#2

≤ 2E

bgj,n(Y)−gj(Y) 2!

+ 2E gj2(Y)

1−fbbn(Y) fbn(Y)

2!

. (3)

Equation 4.4 in [17] and Lemma 1 allow to obtain, almost surely, the inequality sup

y∈R

|fbbn(y)−fbn(y)| ≤sup

y∈R

|fbn(y)−f(y)| ≤M2

h4n+n−1/2h−1n logn ,

whereM2 is a positive constant. Therefore, almost surely,

1−fbbn(Y) fbn(Y)

2

=

fbbn(Y)−fbn(Y) fbn(Y)

2

M2b−1n

h4n+n−1/2h−1n logn 2

,

and, consequently, E g2j(Y)

1−fbbn(Y) fbn(Y)

2!

M2b−1n

h4n+n−1/2h−1n logn 2

E g2j(Y)

. (4) Clearly, b−1n h4n ∼ nc2−4c1 and b−1n n−1/2h−1n ∼ nc1+c2−1/2 as n → +∞. Since, un- der assumption 4, we have c2 −4c1 < 0 and c1 +c2 −1/2 < 0, it follows that limn→+∞ b−1n h4n+n−1/2h−1n logn

= 0. Then, from (4), (3) and Lemma 6, we de-

duce the required result.

4.2. Proof of Theorem 1 Since the class of functions

Hn=

h(k,l) :y7−→hk,l(y) = 1 n

bgk,n(y)bgl,n(y)

fbb2n(y) ,1≤k, l≤d

,

(13)

is finite, we deduce from Lemma 3 in [6] that it is a Vapnik- ˇCervonenkis (VC) class of functions with respect to the envelope

h= max{|hk,l|:hk,l ∈ Hn,1≤k, l≤d}. The related covering number N

Hn,k·kL2(P), εkhkL2(P)

satisfies, for all ε ∈]0,1[

and for all probability measuresP on (S,S), N

Hn,k·kL2(P), εkhkL2(P)

≤ A

ε ν

,

whereAandν are postive constant named the VC characteristics ofHn. Assumptions 1 and 6 implyh≤ nhD22G2

nb2n, then we obtain, for allh∈ Hn, E(h(Y))≤ D2G2

h2nnb2n and E h2(Y)

≤ D4G4 h4nn2b4n.

Takingµn= nbD22G2

nh2n andσn2 = hD44G4

nn2b4n, we can apply Talagrand’s inequality (see [14] and Proposition 2.2 in [7]): there exist positive constantsK1 and K2, depending only on Aand ν, such that for allt>K1

"

µnlogσn+√ nσ

q logσn

n

# ,

P (

sup

h∈Hn

n

X

i=1

h(Yi)−E

h(Y)

> t )

≤ K2exp





− 1 K2

t µn

log

1 + tµn

K2

nn q

logσn

n

2



 ,

that is,

P (

sup

1≤k,l≤d

1 n

n

X

i=1

bgk,n(Yi)bgl,n(Yi) fbb2

n(Yi) −E bgk,n(Y)bgl,n(Y) fbb2

n(Y)

!

> t )

≤ K2exp (

− 1 K2

tnb2nh2n

D2G2 log 1 + th2nb2n K2D2G2 1 +√

logA2

!)

. (5)

Since hn∼n−c1 and bn∼n−c2 , we have limn→∞ hn

n−c1

=B1 and limn→∞ bn

n−c2

= B2, where B1 >0 and B2 >0. Thus for ε such as 0 < ε < min (1, B1, B2) and n is large enough, we have

B1−ε < hn

n−c1 < B1+ε and B2−ε < bn

n−c2 < B2+ε, that is

n−c1(B1−ε)< hn< n−c1(B1+ε) and n−c2(B2−ε)< bn< n−c2(B2+ε).

(14)

Then, puttingδ= (B1−ε)2(B2−ε)2, we deduce from (5) that

P (

sup

1≤k,l≤d

1 n

n

X

i=1

bgk,n(Yi)bgl,n(Yi) fbb2

n(Yi) −E bgk,n(Y)gbl,n(Y) fbb2

n(Y)

!

> t )

(6)

≤ K2exp (

− 1 K2

δ t

D2G2n1−2(c1+c2)×log 1 + δ t K2D2G2 1 +√

logA2n−2(c1+c2)

!)

Let us put tn =

logn n

1/2−2(c1+c2)

; Assumption 7 implies 0 < c1 +c2 < 1/4 and, consequently, thatα= 1/2−2(c1+c2) is strictly positive. Then, limn→+∞(logn)α = +∞ and, therefore, putting U = K1G D√

logA we have for nlarge enough (logn)α>2U =U(1 + 1)>U 1 +

rlogA n

!

=U √

n+√ logA

√n

,

that is

logn n

1

2−2(c1+c2)

> K1

D2G2 nb2nh2n ×p

logA √

n+p logA

what means that

tn>K1

"

µnlogAµn σn

+√ nσ

r

logAµn σn

# .

Then, (6) can be applied totn and we obtain

P (

sup

1≤k,l≤d

1 n

n

X

i=1

gbk,n(Yi)bgl,n(Yi) fbb2

n(Yi) −E bgk,n(Y)bgl,n(Y) fbb2

n(Y)

!

> tn )

≤vn

where

vn=K2exp (

− 1 K2

δ√

n(logn)α

D2G2 log 1 + δ(logn)α K2D2G2

n 1 +√ logA2

!) .

Clearly,vn∼wnas n→+∞, where wn=K2exp

(

− δ2(logn) K22D4G4 1 +√

logA2

) ,

and since P+∞

n=1wn < +∞, we deduce that P+∞

n=1vn < +∞. Then from the above inequality it follows that

+∞

X

n=1

P (

sup

1≤k,l≤d

1 n

n

X

i=1

bgk,n(Yi)bgl,n(Yi) fbb2

n(Yi) −E bgk,n(Y)bgl,n(Y) fbb2

n(Y)

!

> tn

)

<+∞,

(15)

and the required result is obtained from Borel Cantelli’s lemma.

4.3. Proof of Theorem 2 Let us consider

Rbn,j(y) = gj(y)

fbn(y), Ikl(1)(y) = gk(y)gl(y) fb2

n(y) =Rbn,k(y)Rbn,l(y),

Ikl(2)(y) = gk(y)bgl,n(y) +gl(y)bgk,n(y) fb2

n(y) = Rbn,k(y)bgl,n(y)

fbn(y) +Rbn,l(y)bgk,n(y) fbn(y) , and

Ikl(3)(y) = 2Rbn,k(y)Rbn,l(y)fbbn(y) fbn(y).

Denoting bybλ(n)k,l the element at the k-th row and the l-th column of the matrixΛbn , it is known from [17] (see pp. 1059-1060) that

(n)k,l = 1 n

n

X

i=1

bgk,n(Yi)bgl,n(Yi) fbb2

n(Yi)

!

= 1

n

n

X

i=1

n

Ikl(1)(Yi) +Ikl(2)(Yi)−Ikl(3)(Yi)o

−An+Bn+Cn−Dn, (7) where

An= 1 n

n

X

i=1

gk(Yi)

gbl,n(Yi)−gl(Yi)

+gl(Yi)

bgk,n(Yi)−gk(Yi)

fbb2

n(Yi)−fb2

n(Yi) fbb2

n(Yi)fb2

n(Yi)

! ,

Bn= 1 n

n

X

i=1

(bgk,n(Yi)−gk(Yi)) (bgl,n(Yi)−gl(Yi)) fbb2

n(Yi) ,

Cn= 1 n

n

X

i=1

R(bn,k)(Yi)R(bn,l)(Yi)

fbb2

n(Yi)−fb2

n(Yi)2

fbb2

n(Yi)fb2

n(Yi) , and

Dn= 1 n

n

X

i=1

fbb2n(Yi)−fb2n(Yi) 2

Rbn,k(Yi)Rbn,l(Yi) fb2

n(Yi) .

(16)

First, we will obtain the rates of convergence of the sequencesE(An),E(Bn), E(Cn) andE(Dn) to 0 as n→+∞. Clearly,E(An) =E

A(k,l)n

+E

A(l,k)n

, where

A(k,l)n =gk(Y)

gbl,n(Y)−gl(Y) fbb2

n(Y)−fb2

n(Y) fbb2

n(Y)fb2

n(Y)

! .

Further,

E

A(k,l)n

≤ 1 b4nE

gk(Y)

bgl,n(Y)−gl(Y)

fbb2n(Y)−fb2n(Y)

≤ 1 b4nE

(

gk(Y)

bgl,n(Y)−gl(Y)

fbbn(Y)−fbn(Y) 2

+ 2fbn(Y)

fbbn(Y)−fbn(Y)

≤ 1 b4nE

gk(Y)

bgl,n(Y)−gl(Y)

fbbn(Y)−fbn(Y)2

+ 2

b4nE

gk(Y)

bgl,n(Y)−gl(Y)

fbn(Y)

fbbn(Y)−fbn(Y)

≤ 1 b4nE

gk(Y)

bgl,n(Y)−gl(Y)

fbn(Y)−f(Y)2

(8)

+ 2

b4nE

gk(Y)

bgl,n(Y)−gl(Y)

fbn(Y)

fbn(Y)−f(Y)

.

Putting αn = h4n+n−1/2h−1n logn and using Lemma 1, Cauchy-Schwartz inequality and Lemma 6, we obtain

E

gk(Y)

bgl,n(Y)−gl(Y)

fbb2n(Y)−fb2n(Y)

≤ α2n v u u tE

bgl,n(Y)−gl(Y) 2!

q

E gk2(Y)

≤M3 α2nλ1/2n q

E gk2(Y) ,

where λn = n−1h−1n and M3 is a positive constant. On the other hand, since for n large enoughfbn(Y)≤ kfk, it follows

E

gk(Y)

bgl,n(Y)−gl(Y)

fbn(Y)

fbn(Y)−f(Y)

≤ αnkfk v u u tE

bgl,n(Y)−gl(Y) 2!

q

E g2k(Y)

≤M3αnλ1/2n kfk q

E gk2(Y) .

Therefore, from (8) we deduce that

|E(An)|=O

b−4n αnλ1/2n

=O(βn) (9)

(17)

whereβn=b−4n αnλ1/2n . In addition, E(Bn) = E

fbb−2

n (Y) (bgk,n(Y)−gk(Y)) (bgl,n(Y)−gl(Y))

≤ b−2n s

E

(bgk,n(Y)−gk(Y))2 s

E

(bgl,n(Y)−gl(Y))2

≤ M32b−2n λn; thus

|E(Bn)|=O bn−2λn

. (10)

Next, we have

|E(Cn)| ≤ b−4n E

|Rk(Y)Rl(Y)|

fbb2n(Y)−fb2n(Y)2

≤ b−4n E |Rk(Y)Rl(Y)|

fbbn(Y)−fbn(Y) 4!

+ 4b−4n E |Rk(Y)Rl(Y)|fbn(Y)

fbbn(Y)−fbn(Y) 3!

+ 4b−4n E |Rk(Y)Rl(Y)|fb2n(Y)

fbbn(Y)−fbn(Y) 2!

≤ b−4n α4n+ 4kfkα3n+ 4kfk2α2n

E(|Rk(Y)Rl(Y)|). Thus

|E(Cn)|=O b−4n α2n

. (11)

Similarly, we have

|E(Dn)| ≤ b−2n E

fbb2n(Y)−fb2n(Y)

2

R(bn,k)(Y)R(bn,l)(Y)

= b−2n E

"

|Rk(Y)Rl(Y)|

fbbn(Y)−fbn(Y) 4#

+ 4b−2n E

"

|Rk(Y)Rl(Y)|fbn(Y)

fbbn(Y)−fbn(Y) 3#

+ 4b−2n E

"

|Rk(Y)Rl(Y)|fb2n(Y)

fbbn(Y)−fbn(Y) 2#

≤ b−2n α4n+ 4kfkα3n+ 4kfk2α2n

E(|Rk(Y)Rl(Y)|), what implies

|E(Dn)|=O α2nb−2n

. (12)

(18)

From Eq. (9) to Eq. (12), we obtain

|E(−An+Bn+Cn−Dn)|=O

n−(1−4c1−2c2)logn

.

Then, from Eq.(7) we deduce that E

(n)k,l

=E h

Ikl(1)(Y) +Ikl(2)(Y)−Ikl(3)(Y)i

+ ∆n, (13)

where|∆n|=O n−(1−4c1−2c2)logn

. On the other hand,

E

Ikl(2)(Y)−Ikl(3)

≤E

Rbn,k(Y)bgl,n(Y)

fbn(Y) −Rbn,k(Y)Rbn,l(Y)fbbn(Y) fbn(Y)

!

+E

Rbn,l(Y)bgk,n(Y)

fbn(Y) −Rbn,k(Y)Rbn,l(Y)fbbn(Y) fbn(Y)

! (14) and

E

Rbn,k(Y)bgl,n(Y)

fbn(Y) −Rbn,k(Y)Rbn,l(Y)fbbn(Y) fbn(Y)

!

= E

Rbn,k(Y) fbn(Y)

bgl,n(Y)−Rbn,l(Y)fbbn(Y)

= E

Rbn,k(Y) fbn(Y)

bgl,n(Y)−gl(Y)fbbn(Y) fbn(Y)

!

≤ E

Rk(Y) f(Y)

bgl,n(Y)−gl(Y)fbbn(Y) fbn(Y)

! .

Then, using Cauchy-Schwarz inequality and Lemma 7 we obtain

n→+∞lim E

Rbn,k(Y)bgl,n(Y)

fbn(Y) −Rbn,k(Y)Rbn,l(Y)fbbn(Y) fbn(Y)

!

= 0.

Since, from similar arguments, we also obtain

n→+∞lim E

Rbn,l(Y)bgk,n(Y)

fbn(Y) −Rbn,k(Y)Rbn,l(Y)fbbn(Y) fbn(Y)

!

= 0, we deduce from (14) that

n→+∞lim E

Ikl(2)(Y)−Ikl(3)

= 0. (15) Moreover, since

Ikl,n(1) (Y) ≤

gk(Y)gl(Y) f2(Y)

=|Rk(Y)Rl(Y)|, we can apply the dominated convergence theorem that gives:

n→+∞lim E

Ikl(1)(Y)

=E

gk(Y)gl(Y) f2(Y)

=E

Rk(Y)Rl(Y)

k,l. (16)

Références

Documents relatifs

A new approach is proposed based on sliced inverse regression (SIR) for estimating the effective dimension reduction (EDR) space without requiring a prespecified para- metric

Keywords: Kernel estimator; Spatial regression; Random fields; Strong mixing coef- ficient; Dimension reduction; Inverse Regression.. The set S can be discret, continuous or the set

In the infinite dimensional framework, the problem of functional estimation for nonparametric regression operator under some strong mixing conditions on the functional data

Inspired by the component selection and smoothing operator (COSSO) (Lin and Zhang, 2006), which is based on ANOVA (ANalysis Of VAriance) decomposition, and the wavelet kernel

La réinvention des valeurs confucéennes et la proclamation de l’harmonie sont supposée manifester l’intérêt des dirigeants pour la paix sociale et le bien-être du

Based on the inverse mean E(z|y), [35] proposed Sliced Inverse Regression (SIR) for dimension reduction in regression.. It is realized that SIR can not recover the symmetric

Lastly, we propose a model selection strategy relying on a Goldenschluger and Lepski [21] method and prove a risk bound for the nal estimator: this result holds for