
Tikhonov Regularization for Nonparametric Instrumental Variable Estimators

GAGLIARDINI, P., SCAILLET, Olivier


GAGLIARDINI, P., SCAILLET, Olivier. Tikhonov Regularization for Nonparametric Instrumental Variable Estimators. 2006

Available at:

http://archive-ouverte.unige.ch/unige:5742

Disclaimer: layout of this document may differ from the published version.


2006.08

FACULTE DES SCIENCES ECONOMIQUES ET SOCIALES
HAUTES ETUDES COMMERCIALES

TIKHONOV REGULARIZATION FOR NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATORS

P. GAGLIARDINI
O. SCAILLET


TIKHONOV REGULARIZATION FOR NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATORS

P. Gagliardini* and O. Scaillet†

This version: December 2009‡ (First version: May 2006)

* University of Lugano and Swiss Finance Institute. Corresponding author: Patrick Gagliardini, University of Lugano, Faculty of Economics, Via Buffi 13, CH-6900 Lugano, Switzerland. Tel.: ++41 58 666 4660. Fax: ++41 58 666 4734. Email: patrick.gagliardini@usi.ch.

† HEC Université de Genève and Swiss Finance Institute.

‡ We thank the editor, the associate editor, and the two referees for helpful comments. An earlier version of this paper circulated under the title "Tikhonov regularization for functional minimum distance estimators". Both authors received support from the Swiss National Science Foundation through the National Center of Competence in Research: Financial Valuation and Risk Management (NCCR FINRISK). We also thank Joel Horowitz for providing the dataset of the empirical section and many valuable suggestions, as well as Manuel Arellano, Xiaohong Chen, Victor Chernozhukov, Jean-Pierre Florens, Oliver Linton, Enno Mammen, and seminar participants at the University of Geneva, Catholic University of Louvain, University of Toulouse, Princeton University, Columbia University, ECARES, MIT/Harvard, CREST, Queen Mary's College, Maastricht University, Carlos III University, ESRC 2006 Annual Conference in Bristol, SSES 2007 Annual Meeting in St. Gallen, the Workshop on Statistical Inference for Dependent Data in Hasselt, ESAM 2007 in Brisbane and ESEM 2007 in Budapest for helpful comments.


Tikhonov Regularization for Nonparametric Instrumental Variable Estimators

Abstract

We study a Tikhonov Regularized (TiR) estimator of a functional parameter identified by conditional moment restrictions in a linear model with both exogenous and endogenous regressors. The nonparametric instrumental variable estimator is based on a minimum distance principle with penalization by the norms of the parameter and its derivative. After showing its consistency in the Sobolev norm we derive the expression of the asymptotic Mean Integrated Square Error. The convergence rate with optimal value of the regularization parameter is characterized in two examples. We illustrate our theoretical findings and the small sample properties with simulation results. Finally, we provide an empirical application to estimation of an Engel curve, and discuss a data driven selection procedure for the regularization parameter.

Keywords and phrases: Minimum Distance, Nonparametric Estimation, Ill-posed Inverse Problems, Tikhonov Regularization, Endogeneity, Instrumental Variable, Engel curve.

JEL classification: C13, C14, C15, D12.

AMS 2000 classification: 62G08, 62G20.


1 Introduction

Kernel and sieve estimators provide inference tools for nonparametric regression in empirical economic analysis. Recently, several suggestions have been made to correct for endogeneity in such a context, mainly motivated by functional instrumental variable (IV) estimation of structural equations. Newey and Powell (NP, 2003) consider nonparametric estimation of a function, which is identified by conditional moment restrictions given a set of instruments.

Ai and Chen (AC, 2003) opt for a similar approach to estimate semiparametric specifications.

Darolles, Florens and Renault (DFR, 2003) and Hall and Horowitz (HH, 2005) concentrate on nonparametric IV estimation of a regression function. Horowitz (2007) shows pointwise asymptotic normality for an asymptotically negligible bias. Horowitz and Lee (2007) extend HH to nonparametric IV quantile regression (NIVQR). Florens (2003) and Blundell and Powell (2003) give further background on endogenous nonparametric regressions.

There is a growing recent literature in econometrics extending the above methods and considering empirical applications. Blundell, Chen and Kristensen (BCK, 2007) investigate application of index models to Engel curve estimation with endogenous total expenditure. As argued, e.g., in Blundell and Horowitz (2007), the knowledge of the shape of an Engel curve is a key ingredient of any consumer behaviour analysis. Chen and Pouzo (2008, 2009) consider a general semiparametric setting including partially linear quantile IV regression, and apply their results to sieve estimation of Engel curves. Further, Chen and Ludvigson (2009) consider asset pricing models with functional specifications of habit preferences; Chernozhukov, Imbens and Newey (2007) estimate nonseparable models for quantile regression analysis; Loubes and Vanhems (2004) discuss the estimation of the solution of a differential equation with endogenous variables for microeconomic applications. Other related works include Chernozhukov and Hansen (2005), Florens, Johannes and Van Bellegem (2005), Horowitz (2006), Hoderlein and Holzmann (2007), and Hu and Schennach (2008).

The main theoretical difficulty in nonparametric estimation with endogeneity is overcoming ill-posedness of the associated inverse problem (see Kress (1999), and Carrasco, Florens and Renault (CFR, 2007) for overviews). It occurs since the mapping of the reduced form parameter (that is, the distribution of the data) into the structural parameter (the instrumental regression function) is not continuous. We need a regularization of the estimation to recover consistency. For instance, DFR and HH adopt an $L^2$ regularization technique resulting in a kind of ridge regression in a functional setting.

The aim of this paper is to introduce a new minimum distance estimator for a functional parameter identified by conditional moment restrictions in a linear model with both exogenous and endogenous regressors. We consider a penalized extremum estimator which minimizes $Q_T(\varphi) + \lambda_T G(\varphi)$, where $Q_T(\varphi)$ is a minimum distance criterion in the functional parameter $\varphi$, $G(\varphi)$ is a penalty function, and $\lambda_T$ is a positive sequence converging to zero. The penalty function $G(\varphi)$ exploits the Sobolev norm of function $\varphi$, which involves the $L^2$ norms of both $\varphi$ and its derivative $\nabla\varphi$. The basic idea is that the penalty term $\lambda_T G(\varphi)$ damps highly oscillating components of the estimator. These oscillations are otherwise unduly amplified by the minimum distance criterion $Q_T(\varphi)$ because of ill-posedness. Parameter $\lambda_T$ tunes the regularization. We call our estimator a Tikhonov Regularized (TiR) estimator by reference to the pioneering papers of Tikhonov (1963a,b), where regularization is achieved via a penalty term incorporating the function and its derivative (Groetsch (1984)). The TiR estimator admits a closed form and is numerically tractable.
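As a toy illustration of this mechanism (not the paper's estimator, and with purely illustrative constants), consider a finite-dimensional least-squares problem whose design matrix mimics a compact operator through fast-decaying singular values; the unpenalized minimum distance solution amplifies noise, while a small Tikhonov penalty stabilizes it:

```python
import numpy as np

# Hedged sketch: min ||A phi - r||^2 vs. min ||A phi - r||^2 + lam*||phi||^2,
# where A has singular values decaying to zero (a finite-dimensional stand-in
# for a compact operator). All constants below are illustrative choices.
rng = np.random.default_rng(0)
n = 30
U, _ = np.linalg.qr(rng.normal(size=(n, n)))
V, _ = np.linalg.qr(rng.normal(size=(n, n)))
s = 0.8 ** np.arange(n)                       # singular values -> 0
A = U @ np.diag(s) @ V.T

phi0 = np.sin(np.linspace(0.0, np.pi, n))     # "true" parameter values
r = A @ phi0 + 0.05 * rng.normal(size=n)      # noisy observed image

naive = np.linalg.solve(A.T @ A, A.T @ r)     # unpenalized minimum distance
lam = 1e-2
tikh = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ r)  # penalized

err_naive = np.linalg.norm(naive - phi0)
err_tikh = np.linalg.norm(tikh - phi0)
assert err_tikh < err_naive                   # penalization reduces the error
```

The small eigenvalues of $A'A$ turn the noise in $r$ into huge spurious oscillations of the unpenalized solution; adding $\lambda I$ bounds the inverse and trades a little bias for a large variance reduction, which is exactly the role of $\lambda_T G(\varphi)$ above.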

The key contribution of our paper is the computation of an explicit asymptotic expression for the mean integrated squared error (MISE) of a Sobolev penalized estimator in an NIVR setting with both exogenous and endogenous regressors. Such a sharp result extends the asymptotic bounds of HH obtained under an $L^2$ penalty. Our other specific contributions are consistency of the TiR estimator in the Sobolev norm (and as a consequence uniform consistency), and a detailed analytic treatment of two examples yielding the optimal value of the regularization parameter.

Our paper is related to different contributions in the literature. To address ill-posedness, NP and AC propose to introduce bounds on the norms of the functional parameter of interest and of its derivatives. This amounts to imposing compactness of the parameter space. This approach does not yield a closed-form estimator because of the inequality constraint on the functional parameter. In their empirical application, BCK compute a penalized estimator similar to ours. Their estimation relies on series estimators instead of the kernel smoothers that we use. Chen and Pouzo (2008, 2009) examine the convergence rate of a sieve approach for an implementation as in BCK.

In defining the estimator directly on a function space, we follow the route of Horowitz and Lee (2007) and the suggestion of NP, p. 1573 (see also Gagliardini and Gouriéroux (2007), Chernozhukov, Gagliardini, and Scaillet (CGS, 2006)). Working directly over an infinite-dimensional parameter space (and not over finite-dimensional parameter spaces of increasing dimensions) allows us to develop a well-defined theoretical framework which uses the penalization parameter as the single regularization parameter. In a sieve approach, either the number of sieve terms, or both the number of sieve terms and the penalization coefficient, are regularization parameters that need to be controlled (see Chen and Pouzo (2008, 2009) for a detailed treatment). As in the implementation of a sieve approach, our computed estimator uses a projection on a finite-dimensional basis of polynomials. The approximation error is of a purely numerical nature, and not of a statistical nature as in a sieve approach, where the number of sieve terms can be used as a regularization parameter. The dimension of the basis should be selected large enough to get a small approximation error. In some cases, for example when the parameter of interest is close to a line, a few basis functions are enough to successfully implement our approach. We cannot see our approach as a sieve approach with an infinite number of terms, and the two asymptotic theoretical treatments do not nest each other (see CGS for similar comments in the quantile regression case). However, we expect an asymptotic equivalence between our approach and a sieve approach under a number of sieve terms growing sufficiently fast to dominate the decay of the penalization term. The proof of such an equivalence is left for future research.

While the regularization approach in DFR and HH can be viewed as a Tikhonov regularization, their penalty term involves the $L^2$ norm of the function only (without any derivative). By construction, this penalization dispenses with a differentiability assumption on the function $\varphi$. To avoid confusion, we refer to the DFR and HH estimators as regularized estimators with $L^2$ norm. In our Monte-Carlo experiments and in an analytic example, we find that the use of the Sobolev penalty substantially enhances the performance of the regularized estimator relative to the use of the $L^2$ penalty. Finally, CGS focus on a feasible asymptotic normality theorem for a TiR estimator in an NIVQR setting. Their results can be easily specialized to the linear setting of this paper, and are not further considered here.

In Section 2 we discuss ill-posedness in nonparametric IV regression. We introduce the TiR estimator in Section 3 and prove its consistency in Sobolev norm in Section 4. In Section 5, we derive the exact asymptotic MISE of the TiR estimator. In Section 6 we discuss optimal rates of convergence in two examples, and provide an analytic comparison with $L^2$ regularization. We discuss the numerical implementation in Section 7, and we present the Monte-Carlo results in Section 8. In Section 9 we provide an empirical example where we estimate an Engel curve nonparametrically, and discuss a data driven selection procedure for the regularization parameter. Gagliardini and Scaillet (GS, 2006) give further simulation results and implementation details. The set of regularity conditions and the proofs of propositions are gathered in the Appendices. Omitted proofs of technical Lemmas are collected in a Technical Report, which is available online at our web pages.

2 Ill-posedness in nonparametric regression

Let $\{(Y_t, X_t, Z_t) : t = 1, \ldots, T\}$ be i.i.d. copies of the vector $(Y, X, Z)$, where the vectors $X$ and $Z$ are decomposed as $X := (X_1, X_2)$ and $Z := (Z_1, X_1)$. Let the supports of $X$ and $Z$ be $\mathcal{X} := \mathcal{X}_1 \times \mathcal{X}_2$ and $\mathcal{Z} := \mathcal{Z}_1 \times \mathcal{X}_1$, where $\mathcal{X}_i := [0,1]^{d_{X_i}}$, $i = 1, 2$, and $\mathcal{Z}_1 = [0,1]^{d_{Z_1}}$, while the support of $Y$ is $\mathcal{Y} \subset \mathbb{R}$. The parameter of interest is a function $\varphi_0$ defined on $\mathcal{X}$ which satisfies the NIVR:

$$E[Y - \varphi_0(X) \mid Z] = 0. \qquad (1)$$

The subvectors $X_1$ and $X_2$ correspond to exogenous and endogenous regressors, respectively, while $Z$ is a vector of instruments. The conditional moment restriction (1) is equivalent to:

$$m_{x_1}(\varphi_{x_1,0}, Z_1) := E[Y - \varphi_{x_1,0}(X_2) \mid Z_1, X_1 = x_1] = 0, \quad \text{for all } x_1 \in \mathcal{X}_1,$$

where $\varphi_{x_1,0}(\cdot) := \varphi_0(x_1, \cdot)$. For any given $x_1 \in \mathcal{X}_1$, the function $\varphi_{x_1,0}$ satisfies a NIVR with endogenous regressors $X_2$ only. Parameter $\varphi_0$ is such that, for all $x_1 \in \mathcal{X}_1$, the function $\varphi_{x_1,0}$ belongs to a subset $\Theta$ of the Sobolev space $H^1(\mathcal{X}_2)$ of order 1, i.e., the completion of the linear space $\{\psi \in C^1(\mathcal{X}_2) \mid \nabla^\alpha \psi \in L^2(\mathcal{X}_2), |\alpha| \leq 1\}$ with respect to the scalar product $\langle \psi_1, \psi_2 \rangle_{H^1(\mathcal{X}_2)} := \sum_{|\alpha| \leq 1} \langle \nabla^\alpha \psi_1, \nabla^\alpha \psi_2 \rangle_{L^2(\mathcal{X}_2)}$, where $\langle \psi_1, \psi_2 \rangle_{L^2(\mathcal{X}_2)} := \int_{\mathcal{X}_2} \psi_1(u)\,\psi_2(u)\, du$ and $\alpha \in \mathbb{N}^{d_{X_2}}$ is a multi-index. See CGS for the use of Sobolev spaces of higher order. The Sobolev space $H^1(\mathcal{X}_2)$ is a Hilbert space w.r.t. the scalar product $\langle \cdot, \cdot \rangle_{H^1(\mathcal{X}_2)}$, and the corresponding Sobolev norm is denoted by $\|\cdot\|_{H^1(\mathcal{X}_2)} := \langle \cdot, \cdot \rangle_{H^1(\mathcal{X}_2)}^{1/2}$. We denote the $L^2$ norm by $\|\cdot\|_{L^2(\mathcal{X}_2)} := \langle \cdot, \cdot \rangle_{L^2(\mathcal{X}_2)}^{1/2}$. Further, we assume the following identification condition.

Assumption 1: $\varphi_{x_1,0}$ is the unique function $\varphi_{x_1} \in \Theta$ that satisfies the conditional moment restriction $m_{x_1}(\varphi_{x_1}, Z_1) = 0$, for all $x_1 \in \mathcal{X}_1$.

We refer to NP, Theorems 2.2-2.4, for sufficient conditions ensuring Assumption 1. In particular, the order condition $d_{Z_1} \geq d_{X_2}$ is not a necessary condition for identification. Since we work below with a penalized quadratic criterion in the parameter of interest, we do not need further assumptions on the parameter set $\Theta$, such as compactness. See Chen (2007), Horowitz and Lee (2007), and Chen and Pouzo (2008, 2009) for similar noncompact settings.

Let us now consider a given $x_1 \in \mathcal{X}_1$ and a nonparametric minimum distance approach for $\varphi_{x_1,0}$. This relies on $\varphi_{x_1,0}$ minimizing

$$Q_{x_1,\infty}(\varphi_{x_1}) := E\Big[\Omega_{x_1,0}(Z_1)\, m_{x_1}(\varphi_{x_1}, Z_1)^2 \,\Big|\, X_1 = x_1\Big], \quad \varphi_{x_1} \in \Theta, \qquad (2)$$

where $\Omega_{x_1,0}$ is a nonnegative function on $\mathcal{Z}_1$. The conditional moment function $m_{x_1}(\varphi_{x_1}, z_1)$ can be written as:

$$m_{x_1}(\varphi_{x_1}, z_1) = (A_{x_1}\varphi_{x_1})(z_1) - r_{x_1}(z_1) = (A_{x_1}\Delta\varphi_{x_1})(z_1), \qquad (3)$$

where $\Delta\varphi_{x_1} := \varphi_{x_1} - \varphi_{x_1,0}$, the linear operator $A_{x_1}$ is defined by $(A_{x_1}\varphi_{x_1})(z_1) := \int \varphi_{x_1}(x_2)\, f_{X_2|Z}(x_2|z)\, dx_2$, and $r_{x_1}(z_1) := \int y\, f_{Y|Z}(y|z)\, dy$, where $f_{X_2|Z}$ and $f_{Y|Z}$ are the conditional densities of $X_2$ given $Z$, and of $Y$ given $Z$. Assumption 1 on identification of $\varphi_{x_1,0}$ holds if and only if the operator $A_{x_1}$ is injective for all $x_1 \in \mathcal{X}_1$. Further, we assume that $A_{x_1}$ is a bounded operator from $L^2(\mathcal{X}_2)$ to $L^2_{x_1}(\mathcal{Z}_1)$, where $L^2_{x_1}(\mathcal{Z}_1)$ is the $L^2$ space of square integrable functions of $Z_1$ defined by the scalar product $\langle \psi_1, \psi_2 \rangle_{L^2_{x_1}(\mathcal{Z}_1)} = E[\Omega_{x_1,0}(Z_1)\, \psi_1(Z_1)\, \psi_2(Z_1) \mid X_1 = x_1]$. The limit criterion (2) becomes

$$Q_{x_1,\infty}(\varphi_{x_1}) = \langle A_{x_1}\Delta\varphi_{x_1},\, A_{x_1}\Delta\varphi_{x_1} \rangle_{L^2_{x_1}(\mathcal{Z}_1)} = \langle \Delta\varphi_{x_1},\, A_{x_1}^* A_{x_1} \Delta\varphi_{x_1} \rangle_{H^1(\mathcal{X}_2)} = \langle \Delta\varphi_{x_1},\, \tilde{A}_{x_1} A_{x_1} \Delta\varphi_{x_1} \rangle_{L^2(\mathcal{X}_2)}, \qquad (4)$$

where $A_{x_1}^*$, resp. $\tilde{A}_{x_1}$, denotes the adjoint operator of $A_{x_1}$ w.r.t. the scalar products $\langle \cdot, \cdot \rangle_{H^1(\mathcal{X}_2)}$, resp. $\langle \cdot, \cdot \rangle_{L^2(\mathcal{X}_2)}$, and $\langle \cdot, \cdot \rangle_{L^2_{x_1}(\mathcal{Z}_1)}$.


Assumption 2: The linear operator $A_{x_1}$ from $L^2(\mathcal{X}_2)$ to $L^2_{x_1}(\mathcal{Z}_1)$ is compact for all $x_1 \in \mathcal{X}_1$.

Assumption 2 on compactness of operator $A_{x_1}$ holds under mild conditions on the conditional density $f_{X_2|Z}$ and the weighting function $\Omega_{x_1,0}$ (see Assumptions B.3 (i) and B.6 in Appendix 1). Then, operator $A_{x_1}^* A_{x_1}$ is compact and self-adjoint in $H^1(\mathcal{X}_2)$, while $\tilde{A}_{x_1} A_{x_1}$ is compact and self-adjoint in $L^2(\mathcal{X}_2)$. We denote by $\{\psi_{x_1,j} : j \in \mathbb{N}\}$ an orthonormal basis in $H^1(\mathcal{X}_2)$ of eigenfunctions of operator $A_{x_1}^* A_{x_1}$, and by $\nu_{x_1,1} \geq \nu_{x_1,2} \geq \cdots > 0$ the corresponding eigenvalues (see Kress (1999), Section 15.3, for the spectral decomposition of compact, self-adjoint operators). Similarly, $\{\tilde{\psi}_{x_1,j} : j \in \mathbb{N}\}$ is an orthonormal basis in $L^2(\mathcal{X}_2)$ of eigenfunctions of operator $\tilde{A}_{x_1} A_{x_1}$ for eigenvalues $\tilde{\nu}_{x_1,1} \geq \tilde{\nu}_{x_1,2} \geq \cdots > 0$. By compactness of $A_{x_1}^* A_{x_1}$ and $\tilde{A}_{x_1} A_{x_1}$, the eigenvalues are such that $\nu_{x_1,j}, \tilde{\nu}_{x_1,j} \to 0$ as $j \to \infty$, for any given $x_1 \in \mathcal{X}_1$. The limit criterion $Q_{x_1,\infty}(\varphi_{x_1})$ can be minimized by a sequence $(\varphi_{x_1,n})$ in $\Theta$ such that

$$\varphi_{x_1,n} = \varphi_{x_1,0} + \varepsilon\, \tilde{\psi}_{x_1,n}, \quad n \in \mathbb{N}, \qquad (5)$$

for $\varepsilon > 0$, which does not converge to $\varphi_{x_1,0}$ in the $L^2$ norm $\|\cdot\|_{L^2(\mathcal{X}_2)}$. Indeed, we have $Q_{x_1,\infty}(\varphi_{x_1,n}) = \varepsilon^2 \langle \tilde{\psi}_{x_1,n}, \tilde{A}_{x_1} A_{x_1} \tilde{\psi}_{x_1,n} \rangle_{L^2(\mathcal{X}_2)} = \varepsilon^2 \tilde{\nu}_{x_1,n} \to 0$ as $n \to \infty$, but $\|\varphi_{x_1,n} - \varphi_{x_1,0}\|_{L^2(\mathcal{X}_2)} = \varepsilon$, $\forall n$. Since $\varepsilon > 0$ is arbitrary, the usual "identifiable uniqueness" assumption (e.g., White and Wooldridge (1991))

$$\inf_{\varphi_{x_1} \in \Theta :\, \|\varphi_{x_1} - \varphi_{x_1,0}\|_{L^2(\mathcal{X}_2)} \geq \varepsilon} Q_{x_1,\infty}(\varphi_{x_1}) > 0 = Q_{x_1,\infty}(\varphi_{x_1,0}), \quad \text{for } \varepsilon > 0, \qquad (6)$$

is not satisfied. In other words, function $\varphi_{x_1,0}$ is not identified in $\Theta$ as an isolated minimum of $Q_{x_1,\infty}$. This is the identification problem of minimum distance estimation with functional parameter and endogenous regressors. Failure of Condition (6) despite validity of Assumption 1 comes from $0$ being a limit point of the eigenvalues of operator $\tilde{A}_{x_1} A_{x_1}$ (and $A_{x_1}^* A_{x_1}$). This shows that the minimum distance problem for any given $x_1 \in \mathcal{X}_1$ is ill-posed. The minimum distance estimator of $\varphi_{x_1,0}$ which minimizes the empirical counterpart of criterion $Q_{x_1,\infty}(\varphi_{x_1})$ over the set $\Theta$ is not consistent w.r.t. the $L^2$ norm $\|\cdot\|_{L^2(\mathcal{X}_2)}$.
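The spectral argument above can be mimicked numerically. In this hedged sketch, a discretized Gaussian smoothing kernel stands in for the compact operator (the kernel, bandwidth and grid are our own illustrative choices, not the paper's construction):

```python
import numpy as np

# Discretize an integral smoothing operator A on [0,1]; A'A is then a
# symmetric matrix whose eigenvalues decay toward zero, mirroring the
# compactness that breaks identifiable uniqueness in (5)-(6).
n = 200
x = np.linspace(0.0, 1.0, n)
K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / 0.1) ** 2)
A = K / n                                    # quadrature weight 1/n

nu, Psi = np.linalg.eigh(A.T @ A)            # eigh returns ascending order
nu, Psi = nu[::-1], Psi[:, ::-1]             # reorder eigenpairs descending

# Along phi_n = phi_0 + eps * psi_n the criterion is eps^2 * nu_n -> 0,
# while ||phi_n - phi_0|| = eps stays fixed (psi_n are unit vectors),
# so the infimum in (6) is zero: the problem is ill-posed.
assert nu[0] > 1e3 * abs(nu[50])             # eigenvalues sink to (numerical) zero
assert abs(np.linalg.norm(Psi[:, 50]) - 1.0) < 1e-8
```

High-index eigenfunctions are highly oscillatory, which is why the minimum distance criterion is asymptotically flat along exactly those rough directions.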

To conclude this section, let us further discuss the link between function $\varphi_0$ and the functions $\varphi_{x_1,0}$, $x_1 \in \mathcal{X}_1$. First, $\varphi_0 \in L^2(\mathcal{X})$. Indeed, the set $\mathcal{P} := \{\varphi : \varphi_{x_1} \in \Theta, \ \forall x_1 \in \mathcal{X}_1\}$ is a subset of $L^2(\mathcal{X})$, since $\|\varphi\|^2_{L^2(\mathcal{X})} = \int_{\mathcal{X}_1} \|\varphi_{x_1}\|^2_{L^2(\mathcal{X}_2)}\, dx_1$. Second, Assumption 1 implies identification of $\varphi_0 \in \mathcal{P}$. Third, minimizing $Q_{x_1,\infty}$ w.r.t. $\varphi_{x_1} \in \Theta$ for all $x_1 \in \mathcal{X}_1$ is equivalent to minimizing the global criterion:

$$Q_\infty(\varphi) := E\big[\Omega_0(Z)\, m(\varphi, Z)^2\big] = E\big[Q_{X_1,\infty}(\varphi_{X_1})\big],$$

w.r.t. $\varphi \in \mathcal{P}$, where $m(\varphi, z) := E[Y - \varphi(X) \mid Z = z]$ and $\Omega_0(z) = \Omega_{x_1,0}(z_1)$. Under Assumptions B.3 (i) and B.6, ill-posedness of the minimum distance approach for $\varphi_{x_1}$, $x_1 \in \mathcal{X}_1$, transfers by the Lebesgue theorem to ill-posedness of the minimum distance approach for $\varphi$. Indeed, the sequence $(\varphi_n)$ induced by (5) yields $Q_\infty(\varphi_n) \to 0$ and $\varphi_n \nrightarrow \varphi_0$ as $n \to \infty$. Finally, Assumption 2 cannot hold for the conditional expectation operator of $X$ given $Z$. Indeed, as discussed in DFR, this operator is not compact in the presence of exogenous regressors. This explains why we work $x_1$ by $x_1$ as in HH to estimate $\varphi_0$.


3 The Tikhonov Regularized (TiR) estimator

We address ill-posedness by Tikhonov regularization (Tikhonov (1963a,b); see Kress (1999), Chapter 16). We consider a penalized criterion $\mathcal{L}_{x_1,T}(\varphi_{x_1}) := Q_{x_1,T}(\varphi_{x_1}) + \lambda_{x_1,T} \|\varphi_{x_1}\|^2_{H^1(\mathcal{X}_2)}$, where $Q_{x_1,T}(\varphi_{x_1})$ is an empirical counterpart of $Q_{x_1,\infty}(\varphi_{x_1})$ defined by

$$Q_{x_1,T}(\varphi_{x_1}) = \int_{\mathcal{Z}_1} \hat{\Omega}_{x_1}(z_1)\, \hat{m}_{x_1}(\varphi_{x_1}, z_1)^2\, \hat{f}_{Z_1|X_1}(z_1|x_1)\, dz_1, \qquad (7)$$

and $\hat{\Omega}_{x_1}$ is a sequence of positive functions converging in probability to $\Omega_{x_1,0}$. In (7) we estimate the conditional moment nonparametrically with

$$\hat{m}_{x_1}(\varphi_{x_1}, z_1) = \int \varphi_{x_1}(x_2)\, \hat{f}_{X_2|Z}(x_2|z)\, dx_2 - \int y\, \hat{f}_{Y|Z}(y|z)\, dy =: (\hat{A}_{x_1}\varphi_{x_1})(z_1) - \hat{r}_{x_1}(z_1),$$

where $\hat{f}_{X_2|Z}$ and $\hat{f}_{Y|Z}$ denote kernel estimators of the density of $X_2$ given $Z$, and of $Y$ given $Z$. We use a common kernel $K$ and two different bandwidths: $h_T$ for $Y$, $X_2$, $Z_1$, and $h_{x_1,T}$ for $X_1$.

Definition 1: The Tikhonov Regularized (TiR) minimum distance estimator for $\varphi_{x_1,0}$ is defined by

$$\hat{\varphi}_{x_1} := \arg\inf_{\varphi_{x_1} \in \Theta} \mathcal{L}_{x_1,T}(\varphi_{x_1}), \qquad (8)$$

where $\lambda_{x_1,T} > 0$ and $\lambda_{x_1,T} \to 0$, for any $x_1 \in \mathcal{X}_1$. The TiR estimator $\hat{\varphi}$ for $\varphi_0$ is defined by $\hat{\varphi}(x) := \hat{\varphi}_{x_1}(x_2)$, $x \in \mathcal{X}$.

To emphasize the difference between $\hat{\varphi}_{x_1}$ for a given $x_1 \in \mathcal{X}_1$, and $\hat{\varphi}$, we refer to the former as a local estimator, and to the latter as a global estimator.

From the proof of Proposition 1 in CGS, we know that sequences $(\varphi_{x_1,n})$ such that $Q_{x_1,\infty}(\varphi_{x_1,n}) \to 0$ and $\varphi_{x_1,n} \nrightarrow \varphi_{x_1,0}$ have the property $\limsup_{n\to\infty} \|\nabla\varphi_{x_1,n}\|_{L^2(\mathcal{X}_2)} = \infty$. This explains why we prefer in definition (8) to use a Sobolev penalty $\lambda_{x_1,T}\|\varphi_{x_1}\|^2_{H^1(\mathcal{X}_2)}$ instead of an $L^2$ penalty $\lambda_{x_1,T}\|\varphi_{x_1}\|^2_{L^2(\mathcal{X}_2)}$ to dampen the highly oscillating components in the estimated function. Without penalization, oscillations are unduly amplified, since ill-posedness yields a criterion $Q_{x_1,T}(\varphi_{x_1})$ asymptotically flat along some directions. The tuning parameter $\lambda_{x_1,T}$ in Definition 1 controls the amount of regularization, and how this depends on point $x_1$ and sample size $T$. Its rate of convergence to zero affects that of $\hat{\varphi}_{x_1}$.
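Why the Sobolev penalty targets oscillations can be checked directly: the components $\sin(j\pi x)$ all share the same $L^2$ norm, while their $H^1$ norm grows linearly in the frequency $j$. A small numerical check (the grid and frequencies are illustrative):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 10001)
dx = x[1] - x[0]

def l2_norm(f):
    return np.sqrt(np.sum(f ** 2) * dx)      # Riemann-sum L2 norm on [0,1]

norms = {}
for j in (1, 10, 100):
    f = np.sin(j * np.pi * x)                # oscillating component
    df = j * np.pi * np.cos(j * np.pi * x)   # its exact derivative
    norms[j] = (l2_norm(f), np.sqrt(l2_norm(f) ** 2 + l2_norm(df) ** 2))

# L2 norms are all close to 1/sqrt(2); the H1 norm grows roughly like j*pi/sqrt(2).
assert abs(norms[1][0] - norms[100][0]) < 1e-2
assert norms[100][1] > 50.0 * norms[1][1]
```

An $L^2$ penalty charges every component equally, whereas the derivative term in the $H^1$ norm charges a frequency-$j$ component on the order of $j^2$ in the squared penalty, which is what damps the rough directions left unconstrained by the criterion.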

The TiR estimator admits a closed form expression. The objective function in (8) can be rewritten as (see Lemma A.2 (i) in Appendix 2)

$$\mathcal{L}_{x_1,T}(\varphi_{x_1}) = \langle \varphi_{x_1}, \hat{A}_{x_1}^* \hat{A}_{x_1} \varphi_{x_1} \rangle_{H^1(\mathcal{X}_2)} - 2\langle \varphi_{x_1}, \hat{A}_{x_1}^* \hat{r}_{x_1} \rangle_{H^1(\mathcal{X}_2)} + \lambda_{x_1,T} \langle \varphi_{x_1}, \varphi_{x_1} \rangle_{H^1(\mathcal{X}_2)}, \qquad (9)$$

up to a term independent of $\varphi_{x_1}$. Operator $\hat{A}_{x_1}^*$ is given by

$$\hat{A}_{x_1}^* = D^{-1}\hat{\tilde{A}}_{x_1}, \qquad (\hat{\tilde{A}}_{x_1}\psi)(x_2) := \int_{\mathcal{Z}_1} \hat{\Omega}_{x_1}(z_1)\, \hat{f}_{X_2,Z_1|X_1}(x_2, z_1|x_1)\, \psi(z_1)\, dz_1, \qquad (10)$$

where $D^{-1}$ denotes the inverse of the operator $D : H^2_0(\mathcal{X}_2) \to L^2(\mathcal{X}_2)$ with $D := 1 - \sum_{i=1}^{d_{X_2}} \nabla_i^2$ and $H^2_0(\mathcal{X}_2) = \{\psi \in H^2(\mathcal{X}_2) \mid \nabla_i \psi(x_2) = 0 \text{ for } x_{2,i} = 0, 1, \text{ and } i = 1, \ldots, d_{X_2}\}$. The space $H^2(\mathcal{X}_2)$ is the Sobolev space of order 2, i.e., the completion of the linear space $\{\psi \in C^2(\mathcal{X}_2) \mid \nabla^\alpha \psi \in L^2(\mathcal{X}_2), |\alpha| \leq 2\}$ w.r.t. the scalar product $\langle \psi_1, \psi_2 \rangle_{H^2(\mathcal{X}_2)} := \sum_{|\alpha| \leq 2} \langle \nabla^\alpha \psi_1, \nabla^\alpha \psi_2 \rangle_{L^2(\mathcal{X}_2)}$. Operators $\hat{A}_{x_1}^*$ and $\hat{\tilde{A}}_{x_1}$ are the empirical counterparts of $A_{x_1}^*$ and $\tilde{A}_{x_1}$, which are linked by $A_{x_1}^* = D^{-1}\tilde{A}_{x_1}$ (see Lemma A.1 in Appendix 2). The boundary conditions $\nabla_i \psi(x_2) = 0$ for $x_{2,i} = 0, 1$ and $i = 1, \ldots, d_{X_2}$ in the definition of $H^2_0(\mathcal{X}_2)$ are not restrictive since they concern the estimate $\hat{\varphi}_{x_1}$, but not the true function $\varphi_{x_1,0}$. More precisely, we study in Propositions 1-4 below the properties of $\hat{\varphi}_{x_1}$ in the $L^2$ and uniform

From Lemma A.2 (ii), operator A^x

1

A^x1 is compact, and hence T + ^Ax

1

A^x1 is invertible (Kress (1999), Theorem 3.4). Then, Criterion (9) admits a global minimum'^x1 onH1(X2), which solves the …rst order condition

x1;T + ^Ax

1

A^x1 'x1 = ^Ax

1r^x1: (11)

This is an integro-di¤erential Fredholm equation of Type II (see e.g. Mammen, Linton and Nielsen (1999), Linton and Mammen (2005), Gagliardini and Gouriéroux (2007), Linton and Mammen (2008), and the survey by CFR for other examples). The transformation of the ill-posed problem (1) in the well-posed estimating equation (11) is induced by the penalty term involving the Sobolev norm. The TiR estimator of 'x1;0 is the explicit solution of Equation (11):

^ 'x

1 = x1;T + ^Ax

1

A^x1 1A^x

1^rx1: (12)

4 Consistency

Equation (12) can be rewritten as (see Appendix 3):

$$\hat{\varphi}_{x_1} - \varphi_{x_1,0} = \big(\lambda_{x_1,T} + A_{x_1}^* A_{x_1}\big)^{-1} A_{x_1}^* \hat{\zeta}_{x_1} + B^r_{x_1,T} + \big(\lambda_{x_1,T} + A_{x_1}^* A_{x_1}\big)^{-1} A_{x_1}^* \zeta_{x_1} + R_{x_1,T} =: V_{x_1,T} + B^r_{x_1,T} + B^e_{x_1,T} + R_{x_1,T}, \qquad (13)$$


where

$$\hat{\zeta}_{x_1}(z_1) := \int \big(y - \varphi_{x_1,0}(x_2)\big)\, \frac{\hat{f}_{W,Z}(w,z) - E\big[\hat{f}_{W,Z}(w,z)\big]}{f_Z(z)}\, dw, \qquad \zeta_{x_1}(z_1) := \int \big(y - \varphi_{x_1,0}(x_2)\big)\, \frac{E\big[\hat{f}_{W,Z}(w,z)\big] - f_{W,Z}(w,z)}{f_Z(z)}\, dw, \qquad (14)$$

and $W := (Y, X_2) \in \mathcal{W} := \mathcal{Y} \times \mathcal{X}_2$. In Equation (13) the first three terms $V_{x_1,T}$, $B^r_{x_1,T} := (\lambda_{x_1,T} + A_{x_1}^* A_{x_1})^{-1} A_{x_1}^* A_{x_1}\varphi_{x_1,0} - \varphi_{x_1,0} =: \varphi_{x_1,\lambda} - \varphi_{x_1,0}$, and $B^e_{x_1,T}$ are the leading terms asymptotically, while $R_{x_1,T}$ is a remainder term given in (26). The stochastic term $V_{x_1,T}$ has mean zero and contributes to the variance. The deterministic term $B^e_{x_1,T}$ corresponds to the kernel estimation bias. The deterministic term $B^r_{x_1,T}$ corresponds to the regularization bias in the theory of Tikhonov regularization (Kress (1999), Groetsch (1984)). Indeed, function $\varphi_{x_1,\lambda}$ minimizes the penalized limit criterion $Q_{x_1,\infty}(\varphi_{x_1}) + \lambda_{x_1,T}\|\varphi_{x_1}\|^2_{H^1(\mathcal{X}_2)}$ w.r.t. $\varphi_{x_1} \in \Theta$. Thus, $B^r_{x_1,T}$ is the asymptotic bias term arising from introducing the penalty $\lambda_{x_1,T}\|\varphi_{x_1}\|^2_{H^1(\mathcal{X}_2)}$ in the criterion. To control $B^r_{x_1,T}$ we introduce a source condition (see DFR).

Assumption 3: The function $\varphi_{x_1,0}$ satisfies $\displaystyle\sum_{j=1}^{\infty} \frac{\langle \psi_{x_1,j}, \varphi_{x_1,0} \rangle^2_{H^1(\mathcal{X}_2)}}{\nu_{x_1,j}^{2\delta_{x_1}}} < \infty$ for $\delta_{x_1} \in (0,1]$.

As in the proof of Proposition 3.11 in CFR, Assumption 3 implies:

$$\big\| B^r_{x_1,T} \big\|_{H^1(\mathcal{X}_2)} = O\big(\lambda_{x_1,T}^{\delta_{x_1}}\big). \qquad (15)$$

By bounding the Sobolev norms of the other terms $V_{x_1,T}$, $B^e_{x_1,T}$, and $R_{x_1,T}$ (see Appendix 3), we get the following consistency result. The relation $a_T \sim b_T$, for positive sequences $a_T$ and $b_T$, means that $a_T/b_T$ is bounded away from $0$ and $\infty$ as $T \to \infty$.


Proposition 1: Let the bandwidths $h_T \sim T^{-\eta}$ and $h_{x_1,T} \sim T^{-\eta_{x_1}}$ and the regularization parameter $\lambda_{x_1,T} \sim T^{-\gamma_{x_1}}$ be such that:

$$\eta > 0, \quad \eta_{x_1} > 0, \quad \gamma_{x_1} > 0, \qquad (16)$$

$$\gamma_{x_1} + d_{X_1}\eta_{x_1} + (d_{Z_1} + d_{X_2})\eta < 1, \qquad (17)$$

and:

$$\gamma_{x_1} < \min\Big\{ m\eta_{x_1},\; m\eta,\; \frac{1 - d_{X_1}\eta_{x_1} - \max\{d_{Z_1}, d_{X_2}\}\eta}{2} \Big\}, \qquad (18)$$

where $m \geq 2$ is the order of differentiability of the joint density of $(W, Z)$. Under Assumptions 1-3, B.1-B.3, B.6, B.7 (i)-(ii): $\|\hat{\varphi}_{x_1} - \varphi_{x_1,0}\|_{H^1(\mathcal{X}_2)} = o_p(1)$.
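Conditions (16)-(18) can also be explored numerically. The following hedged sketch grid-searches the powers $(\eta, \eta_{x_1}, \gamma_{x_1})$ in the illustrative configuration $d_{X_1} = d_{X_2} = d_{Z_1} = 1$ and $m = 2$ (our own choice of a simple case), confirming that the constraint region is non-empty:

```python
import itertools

import numpy as np

# Brute-force feasibility check of the rate conditions (16)-(18) for an
# illustrative configuration d_X1 = d_X2 = d_Z1 = 1, m = 2.
d_X1, d_X2, d_Z1, m = 1, 1, 1, 2
grid = np.linspace(0.02, 0.4, 20)            # positive powers, cf. (16)

feasible = [
    (e, ex, g)
    for e, ex, g in itertools.product(grid, repeat=3)
    if g + d_X1 * ex + (d_Z1 + d_X2) * e < 1                               # (17)
    and g < min(m * ex, m * e, (1 - d_X1 * ex - max(d_Z1, d_X2) * e) / 2)  # (18)
]
assert feasible                              # the admissible region is non-empty
```

For example, $\eta = \eta_{x_1} = 0.1$ and $\gamma_{x_1} = 0.02$ satisfies all three constraints in this configuration, consistent with the discussion that (16)-(18) are not mutually exclusive.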

Proposition 1 shows that the powers $\eta_{x_1}$, $\gamma_{x_1}$, and $\eta$ need to be sufficiently small for large dimensions $d_{X_1}$, $d_{X_2}$, and $d_{Z_1}$ and a small order of differentiability $m$ to ensure consistency. An analysis of $\eta_{x_1}$, $\gamma_{x_1}$, and $\eta$ close to the origin reveals that conditions (16)-(18) are not mutually exclusive, and that these conditions do not yield an empty region. Consistency of $\hat{\varphi}_{x_1}$ in the Sobolev norm $H^1(\mathcal{X}_2)$ implies consistency of both $\hat{\varphi}_{x_1}$ and $\nabla\hat{\varphi}_{x_1}$ in the norm $L^2(\mathcal{X}_2)$. Lemma C.1 in CGS states that for any $\varphi \in H^1(\mathcal{X}_2)$, $\sup_{x_2 \in \mathcal{X}_2} |\varphi(x_2)| \leq 2\|\varphi\|_{H^1(\mathcal{X}_2)}$. Hence we also get uniform consistency of $\hat{\varphi}_{x_1}$, i.e. $\sup_{x_2 \in \mathcal{X}_2} |\hat{\varphi}_{x_1}(x_2) - \varphi_{x_1,0}(x_2)| = o_p(1)$, for a given $x_1 \in \mathcal{X}_1$.

Building on the bounds for the terms $V_{x_1,T}$, $B^e_{x_1,T}$, and $R_{x_1,T}$ in the proof of Proposition 1, we can further show uniform consistency of the global estimator $\hat{\varphi}$ (and as a consequence consistency in the $L^2(\mathcal{X})$ norm) if we introduce a strengthening of the source condition.


Assumption 3 bis: The function $\varphi_0$ satisfies $\displaystyle\sup_{x_1 \in \mathcal{X}_1} \sum_{j=1}^{\infty} \frac{\langle \psi_{x_1,j}, \varphi_{x_1,0} \rangle^2_{H^1(\mathcal{X}_2)}}{\nu_{x_1,j}^{2\delta_{x_1}}} < \infty$, for $\delta_{x_1} \in (0,1]$, $x_1 \in \mathcal{X}_1$.

Assumption 3 bis implies:

$$\sup_{x_1 \in \mathcal{X}_1} \big\| B^r_{x_1,T} \big\|^2_{H^1(\mathcal{X}_2)} = O\Big(\sup_{x_1 \in \mathcal{X}_1} \lambda_{x_1,T}^{2\delta_{x_1}}\Big), \qquad (19)$$

and we get the next uniform consistency result.

Proposition 2: Let the bandwidths $h_T \sim T^{-\eta}$ and $h_{x_1,T} \sim T^{-\eta_{x_1}}$ and the regularization parameter $\lambda_{x_1,T} \sim T^{-\gamma_{x_1}}$ be such that:

$$\eta > 0, \quad \eta_{x_1} \geq \varepsilon, \quad \gamma_{x_1} \geq \varepsilon, \quad \Big|\frac{\partial \gamma_{x_1}}{\partial x_1}\Big| \leq \bar{\varepsilon},$$

$$\gamma_{x_1} + d_{X_1}\eta_{x_1} + (d_{Z_1} + d_{X_2})\eta \leq 1 - \varepsilon,$$

and:

$$\gamma_{x_1} \leq \min\Big\{ m\eta_{x_1},\; m\eta,\; \frac{1 - d_{X_1}\eta_{x_1} - \max\{d_{Z_1}, d_{X_2}\}\eta}{2} \Big\} - \varepsilon,$$

for some $\varepsilon, \bar{\varepsilon} > 0$ and any $x_1 \in \mathcal{X}_1$. Under Assumptions 1-3 bis, B.1-B.3, B.6, B.7 (i)-(ii):

$$\sup_{x_1 \in \mathcal{X}_1} \big\| \hat{\varphi}_{x_1} - \varphi_{x_1,0} \big\|_{H^1(\mathcal{X}_2)} = o_p(1).$$

Again from Lemma C.1 in CGS, Proposition 2 implies consistency of the global estimator $\hat{\varphi}$ in the sup-norm: $\sup_{x \in \mathcal{X}} |\hat{\varphi}(x) - \varphi_0(x)| = o_p(1)$. This in turn implies the $L^2$-consistency $\|\hat{\varphi} - \varphi_0\|_{L^2(\mathcal{X})} = o_p(1)$.


5 Mean Integrated Square Error

As in AC, Assumption 4.1, we assume the following choice of the weighting matrix.

Assumption 4: The asymptotic weighting matrix is $\Omega_0(z) = V[Y - \varphi_0(X) \mid Z = z]^{-1}$.

In a semiparametric setting, AC show that this choice of the weighting matrix yields efficient estimators of the finite-dimensional component. Here, Assumption 4 is used to derive the exact asymptotic expansion of the MISE of the TiR estimator provided in the next proposition.

Proposition 3: Under Assumptions 1-4, Assumptions B, the conditions (16)-(18) and

$$\frac{1}{T h_{x_1,T}^{d_{X_1}} h_T^{d_{Z_1}+d_{X_2}}} + h_{x_1,T}^{2m} + h_T^{2m} = o\big(\lambda_{x_1,T}\, b(\lambda_{x_1,T}, h_{x_1,T})\big), \qquad \frac{h_T h_{x_1,T}^{m-1} + h_T^m}{\sqrt{\lambda_{x_1,T}}} = o\big(b(\lambda_{x_1,T}, h_{x_1,T})\big), \qquad (20)$$

the MISE of $\hat{\varphi}_{x_1}$ is given by

$$E\Big[\big\|\hat{\varphi}_{x_1} - \varphi_{x_1,0}\big\|^2_{L^2(\mathcal{X}_2)}\Big] = M_{x_1,T}(\lambda_{x_1,T}, h_{x_1,T})\,(1 + o(1)), \qquad (21)$$

where

$$M_{x_1,T}(\lambda_{x_1,T}, h_{x_1,T}) := \frac{1}{T h_{x_1,T}^{d_{X_1}}}\, \sigma^2_{x_1}(\lambda_{x_1,T}) + b_{x_1}(\lambda_{x_1,T}, h_{x_1,T})^2, \qquad (22)$$

and:

$$\sigma^2_{x_1}(\lambda_{x_1,T}) := \omega^2 f_{X_1}(x_1) \sum_{j=1}^{\infty} \frac{\nu_{x_1,j}}{(\lambda_{x_1,T} + \nu_{x_1,j})^2}\, \big\|\psi_{x_1,j}\big\|^2_{L^2(\mathcal{X}_2)}, \qquad b_{x_1}(\lambda_{x_1,T}, h_{x_1,T}) := \Big\| B^r_{x_1,T} + h_{x_1,T}^m \big(\lambda_{x_1,T} + A_{x_1}^* A_{x_1}\big)^{-1} A_{x_1}^* \xi_{x_1} \Big\|_{L^2(\mathcal{X}_2)},$$

with $\omega^2 = \int K(x_1)^2\, dx_1$ and

$$\xi_{x_1}(z_1) := \frac{1}{m!} \sum_{|\alpha|=m} \int \big(y - \varphi_{x_1,0}(x_2)\big)\, \frac{\nabla^\alpha_{X_1} f_{W,Z}(w,z)}{f_Z(z)}\, dw.$$
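The bias-variance structure of (22) can be evaluated numerically for illustrative inputs. In the sketch below, geometric eigenvalues $\nu_j$, polynomial weights, and a regularization bias of order $\lambda^{\delta}$ (cf. (15) under the source condition) are our own stand-ins, not quantities from the paper:

```python
import numpy as np

# Illustrative MISE profile over lambda at a fixed bandwidth:
# variance uses the regularized inverse eigenvalues nu_j/(lam+nu_j)^2,
# bias is taken of order lam^delta (source condition). Constants are toy values.
nu = 0.5 ** np.arange(1, 40)         # geometrically decaying eigenvalues
w = 1.0 / np.arange(1, 40) ** 2      # illustrative weights ||psi_j||^2
delta, T, C = 0.5, 1000.0, 1.0

def mise(lam):
    variance = (nu / (lam + nu) ** 2 * w).sum() / T
    bias2 = (C * lam ** delta) ** 2
    return variance + bias2

lams = np.logspace(-6, 0, 200)
vals = np.array([mise(l) for l in lams])
best = lams[vals.argmin()]

# The profile is U-shaped: variance explodes as lam -> 0, bias dominates
# as lam grows, so the minimizer is interior.
assert vals.min() < vals[0] and vals.min() < vals[-1]
assert lams[0] < best < lams[-1]
```

This is the trade-off resolved by the optimal choice of the regularization parameter discussed in the two examples of Section 6 and by the data driven selection procedure of Section 9.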

The asymptotic expansion (22) of the MISE consists of one bias component and one variance component, which we comment on in turn.

(i) The bias function $b_{x_1}(\lambda_{x_1,T}, h_{x_1,T})$ is the $L^2$ norm of the sum of two contributions, namely the Tikhonov regularization bias $B^r_{x_1,T}$ and the function $h_{x_1,T}^m (\lambda_{x_1,T} + A_{x_1}^* A_{x_1})^{-1} A_{x_1}^* \xi_{x_1}$. The latter contribution corresponds to a population Tikhonov regression applied to the function $h_{x_1,T}^m \xi_{x_1}$. Function $h_{x_1,T}^m \xi_{x_1}$ arises from smoothing the exogenous regressors $X_1$ and is derived by a standard Taylor expansion w.r.t. $X_1$ of the kernel estimation bias $E[\hat{f}_{W,Z}(w,z)] - f_{W,Z}(w,z)$ in $B^e_{x_1,T}$ (see (14)).

(ii) The variance term is $\mathcal{V}_{x_1,T} := \frac{1}{T h_{x_1,T}^{d_{X_1}}}\, \sigma^2_{x_1}(\lambda_{x_1,T})$. The ratio $1/(T h_{x_1,T}^{d_{X_1}})$ and the multiplicative factor $\omega^2 f_{X_1}(x_1)$ are standard for kernel regression in dimension $d_{X_1}$ and are induced by smoothing $X_1$. The coefficient $\sigma^2_{x_1}(\lambda_{x_1,T})$ involves a weighted sum of the regularized inverse eigenvalues $\nu_{x_1,j}/(\lambda_{x_1,T} + \nu_{x_1,j})^2$ of operator $A_{x_1}^* A_{x_1}$, with weights $\|\psi_{x_1,j}\|^2_{L^2(\mathcal{X}_2)}$ (since $\nu_{x_1,j}/(\lambda_{x_1,T} + \nu_{x_1,j})^2 \leq 1/\nu_{x_1,j}$, the infinite sum converges under Assumption B.8 (ii) in Appendix 1). To have an interpretation, note that the inverse of operator $A_{x_1}^* A_{x_1}$ corresponds to the standard asymptotic variance matrix $(Q_{XZ} V_0^{-1} Q_{ZX})^{-1}$ of the 2-Stage Least Squares (2SLS) estimator of the finite-dimensional parameter $\theta$ in the instrumental regression $Y = X'\theta + U$ with $E[U|Z] = 0$, where $Q_{ZX} = E[ZX']$ and $V_0 = E[U^2 ZZ']$. In the ill-posed
