
Three essays in econometrics: robust model and moment selection in GMM and an application of semi-parametric Taylor rules


Academic year: 2022


Thesis

Reference

Three essays in econometrics: robust model and moment selection in GMM and an application of semi-parametric Taylor rules

DE PORRES ORTIZ DE URBINA, Carlos

Abstract

The first two chapters of this thesis develop a new methodology in the Generalized Method of Moments. Typically, researchers assume that the data come from an unknown ideal distribution. In the first two chapters, we relax this assumption by assuming shrinking neighborhoods of this ideal distribution. We show the conditions that are needed for GMM estimators to have a stable behaviour in these neighborhoods and propose a robust GMM estimator based on these conditions. Finally, we show how to perform Moment and Model Selection based on the robust GMM estimators. The third chapter is an extension of the framework introduced by J. Taylor in 1993. We propose a more flexible setting that can capture asymmetric preferences of the Central Bank between the macroeconomic fundamentals as well as a possibly nonlinear economy.

DE PORRES ORTIZ DE URBINA, Carlos. Three essays in econometrics: robust model and moment selection in GMM and an application of semi-parametric Taylor rules. Doctoral thesis: Univ. Genève, 2016, no. GSEM 15

DOI : 10.13097/archive-ouverte/unige:85623

Available at:

http://archive-ouverte.unige.ch/unige:85623

Disclaimer: layout of this document may differ from the published version.


Three essays in Econometrics:

Robust Model and Moment Selection in GMM and an application of semi-parametric

Taylor Rules

by

Carlos de Porres Ortiz de Urbina

A thesis submitted to the

Geneva School of Economics and Management, University of Geneva, Switzerland,

in fulfillment of the requirements for the degree of PhD in Econometrics

Members of the thesis committee:

Prof. Elvezio Ronchetti, Adviser, University of Geneva

Prof. Stefan Sperlich, Chair, University of Geneva

Prof. Alastair R. Hall, University of Manchester

Thesis No. 15 July 2016


Acknowledgements

I would like to start by thanking my adviser, Prof. Elvezio Ronchetti. His knowledge on Robust Statistics is truly an inspiration.

I would also like to thank the chair of my thesis committee, Prof. Stefan Sperlich, as well as the external reader, Prof. Alastair Hall. I am grateful for all the comments and constructive critiques they made.

I have met many professors during my studies I could thank one way or another. There is one who deserves a special mention. It is Prof. Jaya Krishnakumar. She taught me most of what I know about GMM and IV techniques.

During all my years at the University of Geneva I have met many researchers who have taught me many lessons and with whom I have shared very good moments. I would like to mention the most special ones: William Aeberhard, Estefanía Amer and Juan Téllez.

I would like to thank all my friends, in particular: Andrés Olleta and Sylvain Selleger.

Last but not least I truly thank my mother and Mark. You have made this thesis possible. Without you I never would have finished this work.

Thank you all!


Abstract

The first two chapters of this thesis develop a new methodology in the Generalized Method of Moments (GMM) class of estimators. The latter was introduced by [Hansen, 1982].

This method is widely used in economics; see Table 1.1 in [Hall, 2005] for an extensive list of references in different fields of economics. GMM is also used in other fields: medicine, sociology, political science, management, etc.

GMM estimators can be seen as a more general class than M-estimators in the sense that the number of estimating equations can be greater than the number of parameters to be estimated. GMM can also be considered a semi-parametric method, since we do not need to impose an explicit underlying distribution for the data. The estimators are based on moments which, in general, can be satisfied by many underlying distributions.

The theory of [Hansen, 1982] considered GMM estimators whose moments were asymptotically zero and whose number was fixed. Both assumptions were relaxed by [Andrews, 1999], who considered sets of moments whose expectation could be non-zero and allowed for different numbers of moments. He performed inference by proposing Moment Selection Criteria (MSC) that, under suitable conditions, lead to inference asymptotically identical to the one based on the maximum number of moments whose expectation is asymptotically zero.

[Andrews, 1999] assumed that the data come from an unknown underlying distribution. In the first chapter of this thesis we relax that assumption by assuming shrinking neighborhoods of this ideal underlying distribution. This reasoning falls within the paradigm of Robust Statistics; see [Huber, 1981], 2nd edition by [Huber and Ronchetti, 2009], [Hampel et al., 1986], [Maronna et al., 2006]. We show the conditions that are needed for GMM estimators to have a stable behaviour in these neighborhoods and propose new Robust GMM (RGMM) estimators based on these conditions. We also provide the robust counterparts of MSC (RMSC) and show that under some mild conditions they asymptotically behave as the MSC of [Andrews, 1999].

In the second chapter, we base our procedure on work by [Andrews and Lu, 2001], who extended the framework of [Andrews, 1999] to model and moment selection. Furthermore, we transfer some concepts of higher-order robustness in M-estimation, [La Vecchia et al., 2012], to the GMM setting.

The third chapter is an extension of the framework introduced by Taylor (1993). He developed the so-called Taylor Rule, which consists of a linear relationship between the interest rate fixed by a Central Bank on one side and the inflation and output gap expectations on the other side. The Taylor Rule aims at explaining the Central Bank’s behaviour in a simple and easy way. However, the underlying mechanism of interest rate fixing is certainly more complex. Indeed, the literature contains many articles proposing new and very sophisticated methods to better track the Central Bank’s behaviour. Our paper tries to keep the simplicity of the original Taylor Rule by using only the inflation and output gap expectations. However, we propose a more flexible setting, since we can capture asymmetric preferences of the Central Bank between the macroeconomic fundamentals as well as a possibly nonlinear economy (IS-LM and Phillips curve models).

Résumé

Inference with GMM (Generalized Method of Moments) estimators depends on the properties of the moments used to define the estimator. In the first chapter, we study moment selection criteria (MSC), which include MSC-AIC, MSC-BIC, MSC-HQIC, DT and UT. Moment selection criteria are used to discriminate between consistent and inconsistent moments. We investigate the local robustness properties of both consistent and inconsistent moments. We show that most GMM estimators in the literature do not satisfy the local robustness properties. This lack of robustness is associated with an unbounded influence function and can lead to an erroneous choice of moments. Building on the local robustness conditions, we propose a general way of constructing robust moment selection criteria (RMSC). We present an explicit case of this methodology in the context of linear instrumental variables (IV). Finally, we compare MSC with RMSC in simulations and in an application.

In the second chapter, we study the local robustness properties of model and moment selection criteria (MMSC). We transfer some concepts of higher-order robustness from M-estimation to GMM. We construct robust model and moment selection criteria (RMMSC) and show that RMMSC has second-order local robustness properties in the linear IV case.

In the third chapter, we investigate the empirical specification of the reaction function of Central Banks in a semi-parametric framework. The standard Taylor rule framework is nested in the proposed framework. We estimate a standard Taylor rule and an augmented one for 8 OECD countries over the period 1976-2010 at quarterly frequency. The augmented Taylor rule has an interaction term between the macroeconomic fundamentals. Semi-parametric Taylor rules are easy to estimate and to interpret. The estimates provide statistical evidence in favor of a nonlinear response of the reaction function of the Central Banks studied. Moreover, the reaction function that best captures this behavior is the augmented semi-parametric Taylor rule; this is confirmed by nonlinearity tests.

Contents

Acknowledgements
Abstract
Résumé

1 Robust Moment Selection in GMM with an application to Instrumental Variables Models
1.1 Introduction
1.2 MSC and its local robustness properties
1.3 Robust Moment Selection
1.4 Properties of RMSC
1.5 Computational aspects
1.6 Robustness and consistency in moment selection
1.7 Instrumental variables case
1.8 Simulation study
1.8.1 Simultaneous Equations Model (SEM)
1.8.2 Empirical example: wage equation
1.9 Assumptions
1.10 Some asymptotic results of RGMM
1.11 Proofs

2 Robust Moment and Model Selection in GMM
2.1 Introduction
2.2 MMSC and its robustness properties
2.3 Robust Model and Moment Selection Criteria
2.4 Computational issues
2.5 Instrumental variables case
2.6 Connections to Higher-Order local Robustness
2.7 Potential extensions: Robust Generalized Empirical Likelihood and Exponential Tilting
2.8 Simulation study
2.9 Assumptions
2.10 Proofs

3 Is the Taylor rule outdated? Evidence from simple semiparametric policy rules
3.1 Introduction
3.2 Data and methodology
3.2.1 Data
3.2.2 Methodology
3.3 Empirical evidence
3.3.1 Australia
3.3.2 Japan
3.3.3 New Zealand
3.3.4 Norway
3.3.5 Sweden
3.3.6 Switzerland
3.3.7 USA
3.3.8 Canada
3.4 Nonlinearity tests
3.5 Output gap instead of output growth
3.6 Appendix
3.6.1 Data description
3.6.2 Estimation tables, OECD forecasts
3.6.3 Nonlinearity tests
3.6.4 Figures

Conclusion


To my mom.


Chapter 1

Robust Moment Selection in GMM with an application to Instrumental Variables Models¹

¹ This chapter is co-authored by Elvezio Ronchetti.

1.1 Introduction

In the GMM framework, researchers rely on the assumption that the moments they use are valid, and they test this assumption by means of Hansen’s over-identification test. If the null hypothesis of consistency is rejected, it becomes very challenging to find the maximum number $d_0$ of available moments that are asymptotically zero. There exist many articles dealing with this issue in the econometric literature. [Ghysels and Hall, 1990] present a test to discriminate between two non-nested sets of Euler conditions based on GMM estimators. [Smith, 1992] provides a procedure to compare non-nested GMM estimators. [Andrews, 1999] derives a class of Moment Selection Criteria (MSC), including MSC-BIC, MSC-HQIC, Downward Testing (DT) and Upward Testing (UT), which asymptotically maximize the number of moments that are asymptotically zero. This approach is extended by [Andrews and Lu, 2001] to moment and model selection and dynamic panel data models. [Donald and Newey, 2001] derive the optimal set of instruments in the instrumental variables setting based on the asymptotic approximation of the MSE for the corresponding estimator. In the same spirit as in [Andrews, 1999], [Hong et al., 2003] derive a procedure based on the Generalized Empirical Likelihood estimator. Following the concept of canonical correlations, [Hall and Peixe, 2003] develop the Canonical Correlation Information Criterion (CCIC), which finds the non-redundant instruments among the valid moments, as defined by [Breusch et al., 1999]. This idea is extended by [Hall et al., 2007] to derive a procedure based on long run canonical correlations that can choose moments avoiding redundancy and weak identification, where the latter is defined in [Nelson and Startz, 1990]. [Caner, 2009] derives a LASSO-type GMM estimator that, under suitable conditions, chooses the right number of non-zero parameters and estimates them given a set of moments. [Hall and Pelletier, 2011] study the behavior of some measures of goodness of fit to discriminate between non-nested GMM estimators and possibly local misspecification. [Carrasco, 2012] considers a possibly infinite number of instruments in the linear IV setting and proposes three types of regularizations that are asymptotically efficient and normally distributed. [Belloni et al., 2012] consider LASSO-type estimators in the linear IV framework that can handle a large number of instruments, even larger than the sample size. Their estimators can also handle non-Gaussian and heteroscedastic errors. [Liao, 2013] develops a penalized criterion that detects invalid and redundant moments and provides estimation of the structural parameters simultaneously. [Cheng and Liao, 2015] propose LASSO-based GMM estimators that select the valid and relevant moments with a possibly divergent number of moments. The selection and estimation of the structural parameters are done simultaneously.

In spite of the properties mentioned above, small departures from the underlying data generating process can drastically affect classical MSC. We define classical MSC as any criterion relying on established GMM estimators in the econometric literature. In this chapter, we focus on the methodology developed by [Andrews, 1999] and investigate its robustness properties by applying the notion of local robustness in the context of GMM moment selection. In the context of IV, we provide evidence that the classical consistent procedures are badly affected by the presence of a small contamination in the sample. For example, with a 5% contamination, the probability of choosing $d_0$ for consistent criteria drops by more than 50% compared to the non-contaminated case (from around 90% to over 30%); see section 1.8.1. In contrast, the robust criteria developed in this chapter are less vulnerable to local misspecifications and still select $d_0$ with about 90% probability in the presence of a 5% contamination.

Nowadays, the need for robust statistical methods for estimation and testing is well recognized both in the statistical and econometric literature; cf. for instance, the books [Huber, 1981], 2nd edition by [Huber and Ronchetti, 2009], [Hampel et al., 1986], [Maronna et al., 2006] in the statistical literature and [Peracchi, 1990], [Peracchi, 1991], [Ronchetti and Trojani, 2001], [Gagliardini et al., 2005], and [Koenker, 2005] in the econometric literature. More specifically, some papers propose robust methods for model selection in the setting of M-estimators; cf. for instance [Ronchetti, 1985], [Ronchetti, 1997], [Hurvich and Tsai, 1990], [Machado, 1993], [Ronchetti and Staudte, 1994], [Victoria-Feser, 1997], [Shi and Tsai, 1998], [Cantoni et al., 2005].

This chapter combines both strands of the literature in the following directions.

First, we investigate the local robustness properties of Moment Selection Criteria by studying their influence function (IF) [Hampel, 1974]. An unbounded IF implies an unbounded behavior of the MSC when the underlying distribution lies in a small neighborhood of the reference model. Therefore, a small outlying proportion of the data can drastically change the moment selection. On the other hand, boundedness of the IF implies that the MSC will display a stable behavior not only at the model but also in a neighborhood of it. In the Moment Selection framework, the boundedness of the IF prevents a small fraction of the data from changing the rank between $d_0$ and any other set of orthogonality conditions.

Secondly, we derive robust moment selection criteria, in particular robust versions of MSC-AIC, MSC-BIC, MSC-HQIC, DT and UT, and show that, at the model, they asymptotically behave like the classical criteria. In contrast to MSC, RMSC display a stable behavior under departures from the reference model. This stability is achieved by imposing a bound on the influence function underlying the corresponding GMM estimator.

The results are quite general and the new robust moment selection techniques can be applied in many contexts in econometrics including, for instance, nonlinear and linear instrumental variables and some time series models.

The approach of this chapter to moment selection is rather empirically based, in the sense that there is not necessarily an economic theory supporting the choice of moments. This methodology does not apply when we want to test an economic theory as in [Hansen and Singleton, 1982]. In their framework there is an underlying set of moments that will give rise to the most efficient and consistent estimation of the structural parameters. If this set of moments turns out to be invalid, then the economic theory behind the model is not verified. In this chapter, we do not make any assumption on any theory behind the choice of set of moments. We aim at maximizing the number of moments that are asymptotically zero by taking into account possible departures from the ideal underlying distribution.

This chapter is organized as follows. In the next section, we define the setup of MSC and derive the local robustness properties for valid and invalid moments. In section 1.3, we define a robust GMM estimator (RGMM) based on the local robustness results for both types of moments. In addition, we derive the corresponding RMSC and in section 1.4 we study their properties. In section 1.5, we propose an algorithm to perform robust moment selection. Section 1.6 sheds more light on the relationship between robustness and the properties of the estimators associated with MSC and RMSC. We then apply our methodology in section 1.7 in the linear IV case. Section 1.8 compares the behavior of different MSC and RMSC in a linear IV setup through Monte-Carlo simulations and in an empirical application. Additional assumptions and proofs of the Propositions are collected in the last sections of the chapter.

1.2 MSC and its local robustness properties

Let $y_1, \dots, y_n$ be a sequence of random variables drawn from a probability distribution $P_0$ belonging to some class $\mathcal{P}$. Let $d$ be the selection vector which denotes the elements of the candidate set that are included in a particular moment condition. Hence, $d_j = 1$ indicates that the $j$th element is included in the set of moments and $d_j = 0$ indicates that this element is excluded. We denote by $|d|$ the number of elements included and by $h_{\max}$ the maximum number of orthogonality conditions. Let $(W_n(d))_{n \in \mathbb{N}}$ be a sequence of random weight matrices that converges to a positive definite matrix $W_0(d)$ for each set of moments $d$, and let $h : \mathcal{Y} \times \Theta \times \mathcal{D} \to \mathbb{R}^{|d|}$ be the random vector of orthogonality conditions, $\mathcal{D}$ being defined in Assumption A.1.5. The GMM estimator of $\theta_0$, denoted by $\hat\theta_n(\hat P; d)$, is defined by

$$\hat\theta_n(\hat P; d) = \arg\min_{\theta \in \Theta} E_{\hat P} h^T(Y, \theta; d)\, W_n(d)\, E_{\hat P} h(Y, \theta; d),$$

where $\hat P$ is the empirical distribution. Under regularity conditions [Hansen, 1982], the GMM estimator exists, is consistent and asymptotically normally distributed at the model, for $d \in \mathcal{Z}_0$ (defined in Assumption A.1.5). The asymptotic variance-covariance matrix is given by

$$\Sigma_0(W_0(d); d) = S_0\, E_{P_0}\!\left[\frac{\partial h^T}{\partial \theta}(Y, \theta_0; d)\right] W_0(d)\, V_0\, W_0(d)\, E_{P_0}\!\left[\frac{\partial h}{\partial \theta^T}(Y, \theta_0; d)\right] S_0,$$

where $V_0 = V_0(d) = E_{P_0}[h(Y, \theta_0; d)\, h^T(Y, \theta_0; d)]$ and

$$S_0 = S_0(W_0(d); d) = \left\{ E_{P_0}\!\left[\frac{\partial h^T}{\partial \theta}(Y, \theta_0; d)\right] W_0(d)\, E_{P_0}\!\left[\frac{\partial h}{\partial \theta^T}(Y, \theta_0; d)\right] \right\}^{-1}.$$

The corresponding functional form of the GMM estimator is given by:

$$\hat\theta(P; d) = \arg\min_{\theta \in \Theta} E_P h^T(Y, \theta; d)\, W_0(d)\, E_P h(Y, \theta; d). \tag{1.1}$$

[Andrews, 1999] proposes a family of procedures to asymptotically select the most efficient set of moments among sets of moment conditions that provide consistent estimates of the structural parameters given a model. Let us summarize these criteria. Consider that the functional form is fixed, i.e. the parameter space remains the same for all possible sets of orthogonality conditions. For any empirical measure $\hat P$ satisfying Assumption A.1.6, Andrews proposes to find the set of moments $\hat d_{MSC}$ in the class $\mathcal{D}$ which minimizes

$$MSC(\hat P; d) = nJ(\hat P; d) - p(|d|)\,\kappa_n, \tag{1.2}$$

where $J(\hat P; d) = E_{\hat P} h^T(Y, \hat\theta_n; d)\, W_n(d)\, E_{\hat P} h(Y, \hat\theta_n; d)$.
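To make the definitions of the GMM estimator and of $J(\hat P; d)$ concrete, here is a minimal sketch for the linear IV case, where the minimizer has a closed form. It assumes the moment function $h(y_i, \theta; d) = z_i (y_i - x_i^T \theta)$; the function name `gmm_linear_iv`, the default weight matrix and the simulated design are illustrative choices, not taken from the thesis.

```python
import numpy as np


def gmm_linear_iv(y, X, Z, W=None):
    """GMM for the linear IV moments h(y_i, theta; d) = z_i * (y_i - x_i' theta).

    y: (n,) outcomes, X: (n, p) regressors, Z: (n, |d|) selected instruments.
    Returns the estimate of theta and the statistic J (not multiplied by n).
    """
    n = len(y)
    if W is None:
        # Hypothetical default weight: inverse second moment of the instruments.
        W = np.linalg.inv(Z.T @ Z / n)
    Szx = Z.T @ X / n          # sample analogue of E[z x']
    Szy = Z.T @ y / n          # sample analogue of E[z y]
    # The objective is quadratic in theta, so the minimizer has a closed form.
    theta = np.linalg.solve(Szx.T @ W @ Szx, Szx.T @ W @ Szy)
    g = Szy - Szx @ theta      # sample mean of the moments at the estimate
    return theta, float(g @ W @ g)


# Simulated example (assumed design): one endogenous regressor, three instruments.
rng = np.random.default_rng(0)
n = 500
Z = rng.normal(size=(n, 3))
u = rng.normal(size=n)
x = Z @ np.array([1.0, 0.5, 0.5]) + 0.5 * u + rng.normal(size=n)
y = 2.0 * x + u
theta_hat, J = gmm_linear_iv(y, x[:, None], Z)
print(theta_hat, n * J)
```

Selecting a subset of moments then amounts to passing only the columns of `Z` with $d_j = 1$.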

Assumption 1.2.1 (Penalty properties for MSC).

1. $p(|d|)$ is a strictly increasing function.

2. $\kappa_n = o(n)$ and $\kappa_n \to \infty$.

Assumption 1.2.1 is needed for MSC to be consistent. Some specific MSC based on the corresponding classical information criteria are defined by:

MSC-AIC: $nJ(\hat P; d) - 2(|d| - p)$

MSC-BIC: $nJ(\hat P; d) - (|d| - p) \log n$

MSC-HQIC: $nJ(\hat P; d) - H(|d| - p) \log\log n$, $H > 2$,

where $|d| - p$ is the number of over-identifying restrictions, $p$ being the dimension of $\theta$.

Note that the penalty for MSC-AIC does not satisfy the assumption $\kappa_n \to \infty$; hence it does not choose with probability one the set of moments which maximizes the number of orthogonality conditions that are asymptotically zero as $n \to \infty$. For MSC with $\kappa_n \to \infty$, under some regularity conditions given in [Andrews, 1999], $\hat d_{MSC}(\hat P) \xrightarrow{P} d_0$ and $\sqrt{n}\,(\hat\theta_n(\hat P; \hat d_{MSC}) - \theta_0) \xrightarrow{D} N(0, \Sigma_0(W_0(d_0); d_0))$, where $\xrightarrow{P}$ and $\xrightarrow{D}$ denote convergence in probability and in distribution respectively. These results hold when $d_0$ is uniquely defined. If it is not, the limiting behavior of MSC will be a random variable with a probability associated with each element belonging to $\mathcal{MZ}_0$, with $\mathcal{MZ}_0$ defined in A.1.5.
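The three criteria above differ only in their penalty, which a short sketch can make explicit. The code below assumes that the values of $nJ(\hat P; d)$ have already been computed for each candidate selection vector; the helper names `msc` and `select_moments`, the value `H = 2.1` and the numbers in `candidates` are hypothetical.

```python
import math


def msc(nJ, n_moments, n_params, n, kind="BIC", H=2.1):
    """MSC value for one candidate set: n*J minus the penalty term."""
    over_id = n_moments - n_params            # |d| - p
    if kind == "AIC":
        return nJ - 2.0 * over_id
    if kind == "BIC":
        return nJ - over_id * math.log(n)
    if kind == "HQIC":
        return nJ - H * over_id * math.log(math.log(n))   # requires H > 2
    raise ValueError(f"unknown criterion: {kind}")


def select_moments(nJ_by_set, n_params, n, kind="BIC"):
    """Return the selection vector d that minimizes the chosen MSC."""
    return min(nJ_by_set, key=lambda d: msc(nJ_by_set[d], sum(d), n_params, n, kind))


# Hypothetical n*J values for three candidate selection vectors (p = 1, n = 1000).
candidates = {(1, 1, 0, 0): 0.9, (1, 1, 1, 0): 2.1, (1, 1, 1, 1): 45.0}
print(select_moments(candidates, n_params=1, n=1000, kind="BIC"))
```

In this illustration the fourth moment inflates $nJ$ far beyond what the penalty rewards, so the criterion keeps the three-moment set.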

Finally, [Andrews, 1999] also defines two testing procedures for moment selection, called Downward Testing and Upward Testing. The former starts from the set of moments that maximizes the number of orthogonality conditions and chooses the greatest number of moments for which some $J(\hat P; d)$ test, $d \in \mathcal{D}$, does not reject at a given significance level $\alpha_n$. The latter starts from the set of moments that minimizes the number of over-identifying conditions and stops when the $J(\hat P; d)$ test does not reject for $d \in \mathcal{D}$ at a given significance level $\alpha_n$, with $|d|$ being maximal.
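A rough sketch of the Downward Testing idea follows. It simplifies the procedure in two ways that are assumptions of this example: the significance level is fixed at 5% rather than a sequence $\alpha_n$, and the critical values are hard-coded chi-square quantiles for up to four over-identifying restrictions. The function name and the input numbers are hypothetical.

```python
# Upper 5% points of the chi-square distribution for 1..4 degrees of freedom.
CHI2_95 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def downward_testing(nJ_by_set, n_params, critical=CHI2_95):
    """Return the largest candidate set whose J test is not rejected."""
    sizes = sorted({sum(d) for d in nJ_by_set}, reverse=True)
    for k in sizes:                                   # largest |d| first
        same_size = [d for d in nJ_by_set if sum(d) == k]
        best = min(same_size, key=nJ_by_set.get)      # smallest n*J at this size
        df = k - n_params                             # over-identifying restrictions
        if df <= 0 or nJ_by_set[best] <= critical[df]:
            return best
    return None


# Hypothetical n*J values for candidate selection vectors, with p = 1.
nJ_values = {
    (1, 1, 1, 1): 45.0,
    (1, 1, 1, 0): 2.1,
    (1, 1, 0, 1): 30.0,
    (1, 1, 0, 0): 0.9,
}
print(downward_testing(nJ_values, n_params=1))
```

Upward Testing would traverse the same sizes in increasing order.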

These results hold when the data generating process is $P_0$. In order to investigate the behavior of the functionals

$$J(P; d) = E_P h^T(Y, \hat\theta; d)\, W_0(d)\, E_P h(Y, \hat\theta; d)$$

$$MSC(P; d) = nJ(P; d) - p(|d|)\,\kappa_n \tag{1.3}$$

outside the reference model, we define the following neighborhood of $P_0$:

$$\mathcal{U}_{\varepsilon,n}(P_0) = \left\{ P_{\varepsilon,n} = \left(1 - \frac{\varepsilon}{\sqrt{n}}\right) P_0 + \frac{\varepsilon}{\sqrt{n}}\, Q \;\middle|\; Q \text{ arbitrary} \right\}. \tag{1.4}$$

Notice that any distribution in this neighborhood is at most at distance $\varepsilon/\sqrt{n}$ (in the sense of Kolmogorov) from the reference model $P_0$. This is a common choice in robustness investigations to formalize deviations from the reference model (see for instance [Ronchetti and Trojani, 2001, p. 51]), but other types of contaminations could be considered.
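A small simulation sketch can help fix ideas about what sampling from such a neighborhood looks like. It assumes a contamination proportion of $\varepsilon/\sqrt{n}$; the function name `sample_contaminated` and the two normal distributions standing in for $P_0$ and $Q$ are purely illustrative.

```python
import numpy as np


def sample_contaminated(n, eps, draw_p0, draw_q, rng):
    """Draw n points from (1 - eps/sqrt(n)) * P0 + (eps/sqrt(n)) * Q."""
    frac = min(eps / np.sqrt(n), 1.0)         # contamination proportion
    from_q = rng.random(n) < frac             # which observations come from Q
    sample = draw_p0(n, rng)
    sample[from_q] = draw_q(int(from_q.sum()), rng)
    return sample, from_q


rng = np.random.default_rng(1)
n, eps = 10_000, 5.0                           # eps / sqrt(n) = 0.05
data, flagged = sample_contaminated(
    n, eps,
    draw_p0=lambda m, g: g.normal(0.0, 1.0, size=m),    # hypothetical P0
    draw_q=lambda m, g: g.normal(10.0, 0.1, size=m),    # hypothetical Q
    rng=rng,
)
print(flagged.mean())
```

With these values roughly 5% of the sample comes from $Q$, and that share shrinks as $n$ grows for fixed $\varepsilon$.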

The purpose of robust statistics is to construct statistical procedures that exhibit a stable behavior in such a neighborhood of the reference model. It is convenient to study this behavior by formalizing the statistical procedure as a functional. Then the local robustness of the statistical functional in a neighborhood like (1.4) is achieved by imposing a bound on the influence function of the functional. A bounded influence function implies a uniform bound on the possible values of any statistic in the neighborhood defined by (1.4); see [Hampel, 1974], [Hampel et al., 1986].

For instance, the influence function of the GMM functional $\hat\theta$, defined in (1.1), is given by

$$IF(y, \hat\theta, P_0; d) = \lim_{\varepsilon \to 0} \frac{\hat\theta((1-\varepsilon)P_0 + \varepsilon\delta_y; d) - \hat\theta(P_0; d)}{\varepsilon}.$$

In our setting, for a given set of moments $d \in \mathcal{D}$, the influence function is given implicitly by:

$$\begin{aligned}
& S_0 \Gamma_d \left(I_{|d|} \otimes IF(y, \hat\theta, P_0; d)\right) W_0\, E_{P_0} h(Y, \theta_0; d) + IF(y, \hat\theta, P_0; d) \\
&\quad = 2 S_0\, E_{P_0}\!\left[\frac{\partial h^T}{\partial\theta}(Y, \theta_0; d)\right] W_0(d)\, E_{P_0}[h(Y, \theta_0; d)] \\
&\qquad - S_0\, \frac{\partial h^T}{\partial\theta}(y, \theta_0; d)\, W_0\, E_{P_0}[h(Y, \theta_0; d)] - S_0\, E_{P_0}\!\left[\frac{\partial h^T}{\partial\theta}(Y, \theta_0; d)\right] W_0\, h(y, \theta_0; d),
\end{aligned} \tag{1.5}$$

where $\delta_y$ is a point mass at $y$, $\Gamma_d = E_{P_0}\!\left[\frac{\partial^2 h^{(1)}}{\partial\theta\,\partial\theta^T} \cdots \frac{\partial^2 h^{(|d|)}}{\partial\theta\,\partial\theta^T}\right]$, $h^{(i)}$ is the $i$th element of the vector $h(\cdot)$ and $I_{|d|}$ is the identity matrix of size $|d|$. From (1.5) we can draw the following conclusions.

• A necessary condition from (1.5) to achieve local robustness for both valid and invalid moments is to have bounded functions $h(\cdot, \theta_0; d)$ and $\frac{\partial h}{\partial\theta^T}(\cdot, \theta_0; d)$. It is interesting to notice that the same conditions are required in a different context to obtain second-order robustness of M-estimators; see [La Vecchia et al., 2012].

• For valid sets of moments, $\mathcal{Z}_0$ in Assumption A.1.5 in the Appendix, we obtain the influence function for GMM estimators developed in [Ronchetti and Trojani, 2001, Eq. (14)].

• In some particular cases we can obtain an explicit form for the influence function for both valid and invalid moments. One example is when $\Gamma_d = 0$, which corresponds to linear regression and linear IV.

• The necessary conditions that lead to local robustness are not satisfied by most of the existing GMM estimators in the literature. It will be shown in section 1.7 that all linear IV estimators have an unbounded IF; see [Ronchetti and Trojani, 2001] for more examples.
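The unboundedness mentioned in the last point can be illustrated numerically. For valid moments $E_{P_0} h = 0$, and in linear IV $\Gamma_d = 0$, so (1.5) reduces to $IF = -S_0 E_{P_0}[\partial h^T / \partial\theta] W_0 h(y, \theta_0; d)$. The sketch below assumes the linear IV moment $h = z(y - x^T\theta)$; the function name and all numerical values are hypothetical.

```python
import numpy as np


def iv_influence(y0, x0, z0, theta0, Szx, W0):
    """IF of the linear IV GMM functional at the point (y0, x0, z0), valid moments.

    Uses IF = -S0 E[dh'/dtheta] W0 h(y0) with h = z (y - x' theta), so that
    E[dh'/dtheta] = -E[x z'] = -Szx'.
    """
    S0 = np.linalg.inv(Szx.T @ W0 @ Szx)
    h0 = z0 * (y0 - x0 @ theta0)              # moment function at the point
    return S0 @ Szx.T @ W0 @ h0


Szx = np.array([[1.0], [0.5]])                # hypothetical E[z x'], |d| = 2, p = 1
W0 = np.eye(2)
theta0 = np.array([2.0])
x0, z0 = np.array([0.0]), np.array([1.0, 1.0])
for y0 in (1.0, 10.0, 100.0):
    print(y0, iv_influence(y0, x0, z0, theta0, Szx, W0))
```

The influence grows linearly with the size of the outlying response, so a single extreme observation can move the estimate arbitrarily far.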

The influence function is a key element in the application of von Mises expansions to functionals; see [Von Mises, 1947]. The expansions provide an approximation of the behavior of the functional in a neighborhood of the ideal distribution. In the context of Moment Selection we provide the asymptotic expansion in the neighborhood (1.4) of the reference model of any MSC defined in (1.3).

Proposition 1.2.2. Let $\hat\theta$ be a GMM functional induced by an orthogonality function $h$ and MSC a Moment Selection Criterion defined by (1.3) associated with $\hat\theta$ for a given set of moments $d \in \mathcal{Z}_0$. Then,

$$\lim_{n \to \infty} MSC(P_{\varepsilon,n}; d) - MSC(P_0; d) = \varepsilon^2 \left\| \int IF(y, U, P_0; d)\, dQ(y) \right\|^2 + o(\varepsilon^2), \tag{1.6}$$

where $P_{\varepsilon,n}$ is defined in (1.4), $U = U(P; d) = W_0(d)^{1/2} E_P h(Y, \hat\theta; d)$, and

$$IF(y, U, P_0; d) = W_0^{1/2} \left[ -E_{P_0}(h(Y, \theta_0; d)) + E_{P_0}\!\left[\frac{\partial h}{\partial\theta^T}(Y, \theta_0; d)\right] IF(y, \hat\theta, P_0; d) + h(y, \theta_0; d) \right].$$

For $d \in \mathcal{D} \setminus \mathcal{Z}_0$ we have

$$\frac{\partial}{\partial\varepsilon} J(P_{\varepsilon,n}; d) \Big|_{\varepsilon=0} = \frac{2}{\sqrt{n}} \left[ \int IF(y, U, P_0; d)\, (dQ(y) - dP_0(y)) \right]^T E_{P_0} U. \tag{1.7}$$

Proposition 1.2.2 shows that MSC can be arbitrarily changed by a small perturbation if the corresponding orthogonality function $h$ is unbounded with respect to $y$ for $d \in \mathcal{Z}_0$, and if $\frac{\partial h}{\partial\theta^T}(\cdot)$ is unbounded together with the moment conditions for $d \in \mathcal{D} \setminus \mathcal{Z}_0$. An unbounded IF of the corresponding orthogonality conditions will have important consequences in the choice of the most efficient set of moments among the sets of orthogonality conditions that provide consistent estimates of the structural parameters. Hence, if we bound the orthogonality function $h(\cdot)$ and $\frac{\partial h}{\partial\theta^T}(\cdot)$, the effect of any small contamination on MSC will be bounded, leading to a stable choice of the set of moments. We provide this construction in the next section. Note that we do not give the asymptotic behavior of MSC for $d \in \mathcal{D} \setminus \mathcal{Z}_0$, since the von Mises expansion is dominated by the first term, implying a degenerate behavior of $nJ(P_{\varepsilon,n}; d)$ as $n \to \infty$. Therefore, we provide $\frac{\partial}{\partial\varepsilon} J(P_{\varepsilon,n}; d)|_{\varepsilon=0}$.

1.3 Robust Moment Selection

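[Ronchetti and Trojani, 2001] propose a robust estimator that ensures boundedness of the orthogonality function $h$. This approach leads to locally robust estimators for $d \in \mathcal{Z}_0$. However, this condition is not enough, in general, for $d \in \mathcal{D} \setminus \mathcal{Z}_0$. The estimator is defined by a modification of the orthogonality condition by means of the so-called Huber function:

$$H_{c_1} : \mathbb{R}^{|d|} \to \mathbb{R}^{|d|}, \quad y \mapsto y \cdot \omega_{c_1}(y),$$

where $\omega_{c_1}(y) = 1$ if $\|y\|_2 < c_1$ and $\omega_{c_1}(y) = \frac{c_1}{\|y\|_2}$ otherwise, with $c_1 > \sqrt{|d|}$. Then, new orthogonality conditions $h^{A,\tau}_{c_1} : \mathbb{R}^N \times \Theta \times \mathcal{D} \to \mathbb{R}^{|d|}$ are defined by

$$h^{A,\tau}_{c_1}(y, \theta; d) = H_{c_1}\!\left(A(\theta; d)\,[h(y, \theta; d) - \tau(\theta; d)]\right),$$

where the nonsingular matrix $A \in \mathbb{R}^{|d| \times |d|}$ and the vector $\tau \in \mathbb{R}^{|d|}$ are determined through the implicit equations:

$$E_{P_0}\!\left[h^{A,\tau}_{c_1}(Y, \theta_0; d)\right] = 0 \tag{1.8}$$

$$\frac{1}{n} \sum_{i=1}^{n} \left[h^{A,\tau}_{c_1}(y_i, \theta_0; d)\right] \cdot \left[h^{A,\tau}_{c_1}(y_i, \theta_0; d)\right]^T = I_{|d|}. \tag{1.9}$$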
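The Huber function used in this construction is simple to write down. The sketch below implements $H_c(y) = y \cdot \omega_c(y)$ for a single vector; the function names are illustrative, and the standardization by $A$ and $\tau$ is left out.

```python
import numpy as np


def huber_weight(v, c):
    """omega_c(v): 1 inside the ball of radius c, c / ||v||_2 outside."""
    norm = np.linalg.norm(v)
    return 1.0 if norm < c else c / norm


def huber(v, c):
    """H_c(v) = v * omega_c(v); the result never has norm above c."""
    return np.asarray(v, dtype=float) * huber_weight(v, c)


print(huber([0.3, 0.4], c=2.5))    # norm 0.5 < 2.5, left unchanged
print(huber([30.0, 40.0], c=2.5))  # norm 50, rescaled to norm 2.5
```

Vectors inside the ball of radius $c$ pass through unchanged, while larger ones are shrunk onto its surface, which is what bounds the orthogonality function.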

This construction will make all original invalid moments valid under the truncating scheme, as can be seen in (1.8). This is why, in the construction we propose, we drop $\tau$: we still want to discriminate between valid and invalid moments even after the truncation. Moreover, this estimator ensures boundedness of $h_{c_1}(y, \theta; d)$ but not necessarily of $\frac{\partial}{\partial\theta^T} h_{c_1}(y, \theta; d)$.

Indeed, in order to have a bounded derivative, the following two conditions need to be satisfied:

$$\frac{\partial}{\partial\theta^T} h(y, \theta; d) \ \text{is bounded for } \|z\|_2 < c_1, \text{ and} \tag{1.10}$$

$$\frac{\partial}{\partial\theta^T} h(y, \theta; d)\, \|z\|_2^{-1} \ \text{is bounded for } \|z\|_2 \ge c_1; \tag{1.11}$$

with $z = A(\theta; d)\,[h(y, \theta; d) - \tau(\theta; d)]$ and $\|\cdot\|_2$ being the $L_2$-norm; see [Lô and Ronchetti, 2012] in the context of empirical likelihood-type estimators and [La Vecchia et al., 2012], formulas (3.8) and (3.9), in the framework of second-order robustness for M-estimators.

These conditions need to be checked for a given $h$ and they are not satisfied, for instance, in linear regression. To construct a robust GMM estimator which satisfies these conditions, we proceed along the same lines as in [La Vecchia et al., 2012]. For a given set of orthogonality conditions, we assume the existence of two functions $q_i : \mathcal{Y} \to \mathbb{R}^+$, $i = 1, 2$, such that $\frac{\partial h^T}{\partial\theta} \|h\|_2^{-1} = O(q_1)$ and $\frac{\partial h^T}{\partial\theta} = O(q_2)$, $\forall d \in \mathcal{D}$, and denote by $r := \max(q_1; q_2)$ the maximum of these functions. We can then construct a robust GMM estimator whose first derivative will be bounded. The robust orthogonality conditions are based on the original unbounded moments in the following way:

$$h_{c_1,c_2}(y, \theta; d) := h(y, \theta; d)\, \min\!\left(1; \frac{c_1}{\|h(y, \theta; d)\|_2}\right) \min\!\left(1; \frac{c_2}{r}\right) \tag{1.12}$$
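The two-weight truncation in (1.12) can be sketched as follows for a batch of observations. The code assumes that the moment functions and the values of the bounding function $r$ have already been evaluated and are passed in as arrays; the function name and the numerical inputs are hypothetical.

```python
import numpy as np


def robust_moments(h, r, c1, c2):
    """Row-wise h_{c1,c2} = h * min(1, c1/||h||_2) * min(1, c2/r).

    h: (n, |d|) moment functions evaluated at each observation.
    r: (n,) values of the bounding function r = max(q1, q2), supplied by the user.
    """
    floor = 1e-300                                   # guards against division by zero
    norms = np.linalg.norm(h, axis=1)
    w1 = np.minimum(1.0, c1 / np.maximum(norms, floor))
    w2 = np.minimum(1.0, c2 / np.maximum(r, floor))
    return h * (w1 * w2)[:, None]


h = np.array([[0.3, 0.4], [30.0, 40.0]])
r = np.array([1.0, 8.0])
print(robust_moments(h, r, c1=2.5, c2=4.0))
print(robust_moments(h, r, c1=np.inf, c2=np.inf))    # classical moments
```

Setting both constants to infinity switches the weights off and returns the original moments.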

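It is then straightforward to define the RGMM estimator as well as its functional:

$$\hat\theta_{n,c_1,c_2}(\hat P; d) := \arg\min_{\theta \in \Theta} E_{\hat P} h^T_{c_1,c_2}(Y, \theta; d)\, W_n(d)\, E_{\hat P} h_{c_1,c_2}(Y, \theta; d)$$

$$\hat\theta_{c_1,c_2}(P; d) := \arg\min_{\theta \in \Theta} E_P h^T_{c_1,c_2}(Y, \theta; d)\, W_0(d)\, E_P h_{c_1,c_2}(Y, \theta; d)$$

where the optimal weight matrix, $W_{opt}$, can be easily computed and is given by

$$W_{opt}(d) = E_{P_0}\!\left[h_{c_1,c_2}(Y, \hat\theta_{c_1,c_2}(P_0; d); d)\, h_{c_1,c_2}(Y, \hat\theta_{c_1,c_2}(P_0; d); d)^T\right]^{-1}.$$

As suggested by [Andrews, 1999], one could use $\tilde W_n(d)$ as the sample weight matrix, defined by:

$$\tilde W_n(d) = \left[ \frac{1}{n} \sum_{i=1}^{n} \left(\tilde h(y_i, \hat\theta_{n,c_1,c_2}; d) - \bar{\tilde h}(\hat\theta_{n,c_1,c_2}; d)\right) \left(\tilde h(y_i, \hat\theta_{n,c_1,c_2}; d) - \bar{\tilde h}(\hat\theta_{n,c_1,c_2}; d)\right)^T \right]^{-1}, \tag{1.13}$$

with

$$\tilde h(y_i, \hat\theta_{n,c_1,c_2}; d) = h(y_i, \hat\theta_{n,c_1,c_2}; d)\, \min\!\left(1; \frac{c_1}{\|h(y_i, \hat\theta_{n,c_1,c_2}; d)\|_2}\right) \min\!\left(1; \frac{c_2}{r_i}\right)$$

and $\bar{\tilde h}(\hat\theta_{n,c_1,c_2}; d) = \frac{1}{n} \sum_{i=1}^{n} \tilde h(y_i, \hat\theta_{n,c_1,c_2}; d)$. This weight matrix is an appropriate choice in the presence of invalid moments. Other choices could be considered in the presence of heteroscedasticity and auto-correlation.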
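The centering in this weight matrix is the part that matters when some moments are invalid, since their sample mean does not vanish. A minimal sketch, assuming the truncated moments have already been evaluated at a preliminary estimate and stacked row-wise (the function name and the simulated input are hypothetical):

```python
import numpy as np


def sample_weight_matrix(h_tilde):
    """Inverse of the centered sample second-moment matrix of the moments.

    h_tilde: (n, |d|) truncated moments evaluated at a preliminary estimate.
    """
    centered = h_tilde - h_tilde.mean(axis=0)        # subtract the sample mean
    return np.linalg.inv(centered.T @ centered / len(h_tilde))


rng = np.random.default_rng(2)
h_tilde = rng.normal(size=(200, 3)) + 5.0            # hypothetical, non-zero mean
W = sample_weight_matrix(h_tilde)
print(np.round(W, 2))
```

Because of the centering, shifting every moment by a constant leaves the weight matrix unchanged.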

The corresponding Hansen’s statistic and RMSC (Robust Moment Selection Criteria) are then defined by

$$J_{c_1,c_2}(P; d) = E_P h^T_{c_1,c_2}(Y, \hat\theta_{c_1,c_2}; d)\, W_{opt}(d)\, E_P h_{c_1,c_2}(Y, \hat\theta_{c_1,c_2}; d)$$

$$RMSC_{c_1,c_2}(P; d) = nJ_{c_1,c_2}(P; d) - p(|d|)\,\kappa_n$$

and the robust moment selection criteria are, by analogy to the classical case, defined by:

$$\hat d_{RMSC}(P) = \arg\min_{d \in \mathcal{D}} RMSC_{c_1,c_2}(P; d)$$

with the robust counterparts of MSC-AIC, MSC-BIC and MSC-HQIC given by:

RMSC-AIC: $nJ_{c_1,c_2}(P; d) - 2(|d| - p)$

RMSC-BIC: $nJ_{c_1,c_2}(P; d) - (|d| - p) \log n$

RMSC-HQIC: $nJ_{c_1,c_2}(P; d) - H(|d| - p) \log\log n$, $H > 2$

Some remarks about the new GMM estimator are in order.

• (1.12) ensures boundedness of both $h(\cdot, \theta_0; d)$ and $\frac{\partial h}{\partial\theta^T}(\cdot, \theta_0; d)$. However, the explicit form of the latter depends on each scenario and has to be determined on a case by case basis.

• The [Ronchetti and Trojani, 2001] estimator, with $\tau = 0$ and $A = I$, is a special case of this estimator if we let $c_2 \to \infty$ and $W_{opt}(d) = I_{|d|}$.

• Finally, when both $c_1$ and $c_2$ go to infinity, we obtain the classical GMM estimator.

1.4 Properties of RMSC

By truncating the original moments with the weights in (1.12) we ensure local robustness for valid and invalid moments. However, we lose consistency in estimation, a property available in the classical case. In general, $\hat\theta_{c_1,c_2}(P_0; d) \neq \hat\theta(P_0; d)$ for $d \in \mathcal{D}$. Hence, the RGMM estimate will depend on both tuning constants $c_1$ and $c_2$ and will be inconsistent in most settings. It is clear that by weighting the original moments in this way we cannot have consistency in estimation, even when the underlying distribution holds true.

Although this looks, in principle, like an important limitation we can still have consistency in moment selection if the next two assumptions are satisfied:

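Assumption 1.4.1. $\exists!\,\theta_{c_1,c_2}$ such that

$$E_{P_0}\left[h(Y, \theta_{c_1,c_2}; d)\,\min\left(1; \frac{c_1}{\|h(Y, \theta_{c_1,c_2}; d)\|_2}\right)\min\left(1; \frac{c_2}{r}\right)\right] = 0$$

for $d \in \mathcal{Z}_0$.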


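Assumption 1.4.2.

$$E_{P_0}\left[h(Y, \theta; d)\,\min\left(1; \frac{c_1}{\|h(Y, \theta; d)\|_2}\right)\min\left(1; \frac{c_2}{r}\right)\right] \neq 0$$

for all $\theta \in \Theta$ and $d \in \mathcal{D} \setminus \mathcal{Z}_0$.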

Assumptions 1.4.1 and 1.4.2 adapt the definitions of valid and invalid moments, respectively, considered in the econometric literature. Note that the classical definitions of valid and invalid moments are nested as a special case. We will provide more insight on this issue in the linear instrumental variable setting in Section 1.7. Given these assumptions, we can use Hansen's statistic to discriminate between valid and invalid moments. This remains true with the truncated moments: the original valid moments remain valid after the truncation, and the invalid moments remain invalid.

Proposition 1.4.3. Let assumptions 1.2.1, 1.4.1 and 1.4.2 hold as well as Assumptions A.1.1-A.1.10. Then,

$$Pr\left(\hat d_{RMSC}(\hat P) \in \mathcal{MZ}_0\right) \xrightarrow{n \to \infty} 1$$

where $\mathcal{MZ}_0$ is defined in A.1.5.

This result shows that we do not need consistency in estimation to get consistency in moment selection, i.e., $Pr(\hat d_{RMSC}(\hat P) \in \mathcal{MZ}_0) \xrightarrow{n \to \infty} 1$. This opens up the possibility of considering moments with different properties than the ones that are usually used in the literature. In this case, we use moments that could be potentially inconsistent in estimation, but that satisfy what we call an identification condition (Assumptions 1.4.1 and 1.4.2), which allows us to pinpoint $d_0$ with probability 1 as $n \to \infty$.

Similarly to the DT and UT, we can propose robust analogues, denoted by RDT and RUT respectively. Let $\hat d_{RDT}$ and $\hat d_{RUT}$ denote the selection vectors chosen by RDT and RUT respectively. We provide a sketch of the algorithm for both RDT and RUT. Let $\mathcal{D}_k = \{d \in \mathcal{D} : |d| = k\}$, $k = p, \dots, h_{max}$; then $\hat d_{RDT}$ is calculated as follows:

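Set $k = h_{max}$

if $\min_{d \in \mathcal{D}_k} nJ_{c_1,c_2}(P; d) > \gamma_{n,|d|}$ then $k = k - 1$

otherwise $\hat d_{RDT} = \arg\min_{d \in \mathcal{D}_k} nJ_{c_1,c_2}(P; d)$

where $\gamma_{n,|d|} = \chi^2_{|d|-p}(\lambda_n)$ and $\chi^2_{|d|-p}(\lambda_n)$ denotes the $1 - \lambda_n$ quantile of a chi-squared distribution with $|d| - p$ degrees of freedom. Similarly, RUT can be implemented in the following fashion:

Set $k = p$

if $\min_{d \in \mathcal{D}_k} nJ_{c_1,c_2}(P; d) < \gamma_{n,|d|}$ then $k = k + 1$

otherwise $\hat d_{RUT} = \arg\min_{d \in \mathcal{D}_{k-1}} nJ_{c_1,c_2}(P; d)$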

In words, RDT starts from the sets of moments with the largest $|d|$ and decreases $|d|$ by one at a time until some J-test does not reject at the given level $\lambda_n$. If more than one test does not reject, it picks the set associated with the minimum J-statistic (equivalently, the largest p-value, since all sets in $\mathcal{D}_k$ share the same degrees of freedom).

Similarly, RUT starts from the sets of moments with the smallest $|d|$ and increases $|d|$ by one at a time until all J-tests reject for a given $|\check d|$ and a given level $\lambda_n$. RUT then picks the set of moments at the level $|\check d| - 1$ associated with the minimum J-statistic. Note that we need the assumption that at each integer $|d| \le |\hat d_{RUT}|$ there is at least one $d \in \mathcal{Z}_0$ in order for RUT to be consistent. In our setting, this assumption is satisfied by construction.
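The two testing procedures can be sketched as short loops over the sizes $|d|$. The code below assumes the $nJ$ statistics have already been computed for every candidate set and that the caller supplies the critical values through a function of the degrees of freedom. All names are hypothetical. The fixed 95% chi-squared quantiles are used purely for illustration, whereas the consistency results require critical values that grow with $n$.

```python
CHI2_95 = {1: 3.841, 2: 5.991, 3: 7.815}   # 95% chi-squared quantiles, df = 1, 2, 3

def _best_at_size(n_j_by_set, k):
    """Return the set of size k with the smallest nJ, together with that nJ."""
    level = {d: v for d, v in n_j_by_set.items() if len(d) == k}
    best = min(level, key=level.get)
    return best, level[best]

def downward_testing(n_j_by_set, p, critical_value):
    """RDT: start from the largest sets and stop at the first non-rejection."""
    for k in sorted({len(d) for d in n_j_by_set}, reverse=True):
        best, stat = _best_at_size(n_j_by_set, k)
        if stat <= critical_value(k - p):
            return best
    return None                             # every size was rejected

def upward_testing(n_j_by_set, p, critical_value):
    """RUT: start from the smallest sets and stop once all tests at a size reject."""
    selected = None
    for k in sorted({len(d) for d in n_j_by_set}):
        best, stat = _best_at_size(n_j_by_set, k)
        if stat >= critical_value(k - p):
            break
        selected = best                     # remember the last accepted size
    return selected

# Illustrative nJ values with p = 1 parameter; moment 2 is invalid
stats = {(0, 1): 0.8, (0, 2): 9.0,
         (0, 1, 2): 12.0, (0, 1, 3): 2.5,
         (0, 1, 2, 3): 30.0}
d_rdt = downward_testing(stats, p=1, critical_value=CHI2_95.get)
d_rut = upward_testing(stats, p=1, critical_value=CHI2_95.get)
```

The example starts RUT at $|d| = p + 1$ rather than $|d| = p$, since with zero degrees of freedom the statistic of an exactly identified set carries no information.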

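Proposition 1.4.4. Let assumptions 1.2.1, 1.4.1 and 1.4.2 hold as well as Assumptions A.1.1-A.1.10 with $\gamma_{n,|d|} \to \infty$ and $\gamma_{n,|d|} = o(n)$. Then,

$$Pr\left(\hat d_{RDT}(\hat P) \in \mathcal{MZ}_0\right) \xrightarrow{n \to \infty} 1, \qquad Pr\left(\hat d_{RUT}(\hat P) \in \mathcal{MZ}_0\right) \xrightarrow{n \to \infty} 1$$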

where $\mathcal{MZ}_0$ is defined in A.1.5.


The assumptions $\gamma_{n,|d|-|b|} \to \infty$ and $\gamma_{n,|d|-|b|} = o(n)$ are taken from [Potscher, 1983]. These results prove consistency in moment selection of the robust procedures we present. It is also convenient to derive at this stage the post-selection properties of the estimates associated with these procedures.

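Proposition 1.4.5. Let assumptions 1.4.1 and 1.4.2 hold as well as Assumptions A.1.1-A.1.10 together with $\gamma_{n,|d|} \to \infty$ and $\gamma_{n,|d|} = o(n)$. Let $m \in \{\mathrm{RMSC}, \mathrm{RUT}, \mathrm{RDT}\}$. Moreover, let $W_n(d) \xrightarrow{P} W_0^{opt}(d)$ for $d \in \mathcal{D}$. Then,

$$\sqrt{n}\left(\hat\theta_{c_1,c_2}(\hat P; \hat d_m) - \theta_{c_1,c_2}\right) \xrightarrow{D} N\left(0, \Sigma_0^{RGMM}(d_0)\right),$$

where

$$\Sigma_0^{RGMM}(d_0) = \left(E_{P_0}\left[\frac{\partial h^T_{c_1,c_2}(d_0)}{\partial \theta}\right] E_{P_0}\left[h_{c_1,c_2}(d_0)\, h^T_{c_1,c_2}(d_0)\right]^{-1} E_{P_0}\left[\frac{\partial h_{c_1,c_2}(d_0)}{\partial \theta^T}\right]\right)^{-1}$$

and

$$h_{c_1,c_2}(d_0) = h(y, \theta_{c_1,c_2}; d_0)\,\min\left(1; \frac{c_1}{\|h(y, \theta_{c_1,c_2}; d_0)\|_2}\right)\min\left(1; \frac{c_2}{r}\right).$$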

1.5 Computational aspects

The construction of the RMSC requires choosing the tuning constants $c_1$ and $c_2$. Moreover, the estimator does not have an explicit form even in the linear case. Hence, we provide some insight on how to choose the tuning constants, as well as an algorithm to perform RMSC given some potential sets of moments.

The algorithm is presented here.

1. Fix $c_1 > 0$ and $c_2 > 0$ and a starting value $\theta^{ini}(d)$ for each set of moments $d \in \mathcal{D}$.

2. Compute $\alpha_i^{ini}(d) = \min\left(1; \frac{c_1}{\|h(y_i, \theta^{ini}(d); d)\|_2}\right)$ and $\beta_i = \min\left(1; \frac{c_2}{\|r_i\|_2}\right)$.

3. Compute the RGMM estimator with the set of orthogonality conditions given by $h_{c_1,c_2}(y_i, \theta^{ini}(d); d) = h(y_i, \theta^{ini}(d); d)\,\alpha_i^{ini}(d)\,\beta_i$, denoted by $\theta^{upd}(d)$.

4. Replace $\alpha_i^{ini}(d)$ by $\alpha_i^{upd}(d) = \min\left(1; \frac{c_1}{\|h(y_i, \theta^{upd}(d); d)\|_2}\right)$, and iterate the second and third steps until a convergence criterion is satisfied for each set of orthogonality conditions. We can consider convergence criteria in the parameter space and in the objective function space.

5. Compute $J^{full}_{c_1,c_2}(d)$, where

$$J^{full}_{c_1,c_2}(d) = \left[\sum_{i=1}^{n} \alpha_i^{full}(d)\,\beta_i\, h^T(y_i, \theta^{full}(d); d)\right]\left(\sum_{i=1}^{n} \alpha_i^{2,full}(d)\,\beta_i^2\, h(y_i, \theta^{full}(d); d)\, h^T(y_i, \theta^{full}(d); d)\right)^{-1}\left[\sum_{i=1}^{n} \alpha_i^{full}(d)\,\beta_i\, h(y_i, \theta^{full}(d); d)\right],$$

and $\alpha_i^{full}(d) = \min\left(1; \frac{c_1}{\|h(y_i, \theta^{full}(d); d)\|_2}\right)$ and $\theta^{full}$ are the fully iterated weights and the structural parameters computed in steps 2 and 3. Then, fix $p(|d|)$ and $\kappa_n$ and compute $\mathrm{RMSC}^{full}_{c_1,c_2}(d)$ for each set of orthogonality conditions. A mean-corrected version of the weight matrix could also be used.

6. For a given $p(|d|)$ and $\kappa_n$, take the smallest value among $\mathrm{RMSC}^{full}_{c_1,c_2}(d)$, $d \in \mathcal{D}$. The GMM estimator associated with the selected set of orthogonality conditions is the GMM estimator induced by the truncated orthogonality conditions with the smallest RMSC.
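The six steps above can be sketched for the linear instrumental variable moments $h(y_i, \theta; d) = z_i(y_i - x_i^T\theta)$, for which step 3 has a closed form once the weights are held fixed. This is a minimal illustration assuming NumPy, not the implementation used in the thesis. The function names, the uncentered weight matrix used inside the loop, the bound `r` and the simulated data are all assumptions made for the example.

```python
import numpy as np

def rgmm_linear_iv(y, X, Z, r, c1, c2, theta_init, tol=1e-8, max_iter=200):
    """Iterate steps 2-4 for linear IV moments, then compute step 5.

    Returns (theta_full, nJ_full). r holds user-supplied bounds r_i > 0.
    """
    n = len(y)
    beta = np.minimum(1.0, c2 / r)                  # beta_i, fixed across iterations

    def weighted_moments(theta):
        h = Z * (y - X @ theta)[:, None]            # raw moments, one row per i
        norms = np.maximum(np.linalg.norm(h, axis=1), 1e-12)
        w = np.minimum(1.0, c1 / norms) * beta      # alpha_i * beta_i
        return h * w[:, None], w

    theta = np.asarray(theta_init, dtype=float)
    for _ in range(max_iter):
        g, w = weighted_moments(theta)
        W = np.linalg.inv(g.T @ g / n)              # assumed (uncentered) weight matrix
        Zw = Z * w[:, None]
        B, a = Zw.T @ X / n, Zw.T @ y / n           # mean moment equals a - B theta
        theta_new = np.linalg.solve(B.T @ W @ B, B.T @ W @ a)   # step 3
        converged = np.max(np.abs(theta_new - theta)) < tol     # parameter-space rule
        theta = theta_new
        if converged:
            break
    g, _ = weighted_moments(theta)                  # fully iterated weights
    s = g.sum(axis=0)
    return theta, float(s @ np.linalg.solve(g.T @ g, s))        # step 5

def select_by_rmsc_bic(y, X, Z, r, c1, c2, candidate_sets):
    """Step 6 with penalty (|d| - p) log n, i.e. the RMSC-BIC choice."""
    n, p = X.shape
    scores = {}
    for d in candidate_sets:
        _, n_j = rgmm_linear_iv(y, X, Z[:, list(d)], r, c1, c2, np.zeros(p))
        scores[d] = n_j - (len(d) - p) * np.log(n)
    return min(scores, key=scores.get), scores

# Simulated example; every number below is illustrative
rng = np.random.default_rng(0)
n = 200
Z = rng.normal(size=(n, 3))
v = rng.normal(size=n)
x = Z.sum(axis=1) + v
y = 2.0 * x + 0.5 * v + rng.normal(size=n)
X = x[:, None]
r = np.linalg.norm(Z, axis=1) * np.abs(x) + 1.0     # hypothetical bound r_i
best, scores = select_by_rmsc_bic(y, X, Z, r, c1=4.0, c2=4.0,
                                  candidate_sets=[(0, 1), (0, 1, 2)])
```

With `c1` and `c2` set to infinity and as many instruments as parameters, the loop reduces to the classical IV estimator, which gives a simple check of the code.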

Concerning the choice of c2, no unique method is available. Based on the suggestion by [La Vecchia et al., 2012] and the simulation results for the linear IV, we set a relative efficiency loss in terms of MSE with respect to the classical estimator when the ideal distribution holds and this seems to work well. Formally, let η be the relative efficiency we want to achieve at the model. Then we can choose c2 given c1 so that:

$$\eta = \frac{\mathrm{MSE}(\hat\theta)}{\mathrm{MSE}(\hat\theta_{c_1,c_2})}$$

1.6 Robustness and consistency in moment selection

[Andrews, 1999] proposes MSC that maximize the number of orthogonality conditions that are asymptotically zero. This property is very powerful at the model, since we get a more efficient estimation of the parameters the more valid moments we add among sets of moments that give rise to consistent estimates. Thus, this property makes inference optimal in the sense that we minimize the asymptotic variance among sets of valid moments and at the same time we keep consistency of the estimates, at least with the unbounded moments used in the econometric literature. The next proposition formalizes this reasoning:

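Proposition 1.6.1. Let $\tilde\theta_1$ be defined by $E_{P_0}[h_1(Y, \theta_0)] = 0$, where $h_1$ is a vector with $|d_1| \ge p$ moments, and let $\tilde\theta_2$ be defined by $E_{P_0}[(h_1^T(Y, \theta_0)\ h_2^T(Y, \theta_0))^T] = 0$, where $h_2$ is a $|d_2| \times 1$ vector. Let $W_1 = V_1^{-1}$ and $W_2 = V_2^{-1}$ be the optimal weight matrices for $\tilde\theta_1$ and $\tilde\theta_2$ respectively. Then,

$$V^{-1}_{P_0}(\tilde\theta_2) - V^{-1}_{P_0}(\tilde\theta_1) \text{ is positive semi-definite.}$$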

In our setting we allow for contaminated distributions and it is worth thinking about the properties that still hold when the underlying distribution is close to, but not exactly, the ideal distribution. In the case of moment selection, we can judge whether the consistency in moment selection is a good property, that is, whether we still keep the efficiency gain in estimation when adding valid moments. In order to provide some insight on this issue, we define the “Change of Efficiency Function”:

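Definition 1.6.2. (Change of Efficiency Function). Given two GMM estimators denoted by $\tilde\theta_1$ and $\tilde\theta_2$ and based on $E_{P_0}[h_1(Y, \theta_0)] = 0$ and $E_{P_0}[(h_1^T(Y, \theta_0)\ h_2^T(Y, \theta_0))^T] = 0$ respectively, we define the Change of Efficiency Function, CEF, as:

$$\mathrm{CEF}(y, h_1, h_2, P_0) = \left[\frac{\partial}{\partial \varepsilon}\left(V^{-1}_{\tilde P_\varepsilon}(\tilde\theta_2) - V^{-1}_{\tilde P_\varepsilon}(\tilde\theta_1)\right)\right]_{\varepsilon = 0}$$

where $\tilde P_\varepsilon = (1 - \varepsilon)P_0 + \varepsilon\delta_y$, $\delta_y$ is a point mass at $y$, and $V_P(\tilde\theta_j)$ is the asymptotic variance for the GMM estimator $\tilde\theta_j$ with the optimal weight matrix, $j = 1, 2$, under the distribution $P_0$.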
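One way to read this definition, offered here as an interpretive sketch rather than a result stated in the text: since the derivative is taken along the point-mass contamination $\tilde P_\varepsilon$, the CEF is the difference between the influence functions (written $\mathrm{IF}$, a notation introduced only for this remark) of the two inverse asymptotic variance functionals. Assuming the derivative exists, a first-order expansion then shows how the efficiency gain from adding $h_2$ changes under a small contamination:

```latex
\begin{aligned}
\mathrm{CEF}(y, h_1, h_2, P_0)
  &= \mathrm{IF}\big(y;\, V^{-1}(\tilde\theta_2),\, P_0\big)
   - \mathrm{IF}\big(y;\, V^{-1}(\tilde\theta_1),\, P_0\big), \\
V^{-1}_{\tilde P_\varepsilon}(\tilde\theta_2) - V^{-1}_{\tilde P_\varepsilon}(\tilde\theta_1)
  &= \Big[V^{-1}_{P_0}(\tilde\theta_2) - V^{-1}_{P_0}(\tilde\theta_1)\Big]
   + \varepsilon\,\mathrm{CEF}(y, h_1, h_2, P_0) + o(\varepsilon).
\end{aligned}
```

The bracketed term is the efficiency gain at the ideal distribution, so an unbounded CEF in $y$ would indicate that a single outlying observation can distort this gain arbitrarily.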
