• Aucun résultat trouvé

A test of endogeneity in quantiles

N/A
N/A
Protected

Academic year: 2021

Partager "A test of endogeneity in quantiles"

Copied!
6
0
0

Texte intégral

(1)

HAL Id: inria-00494786

https://hal.inria.fr/inria-00494786

Submitted on 24 Jun 2010

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

A test of endogeneity in quantiles

Tae-Hwan Kim, Christophe Muller

To cite this version:

Tae-Hwan Kim, Christophe Muller. A test of endogeneity in quantiles. 42èmes Journées de Statistique,

2010, Marseille, France, France. �inria-00494786�

(2)

A Test for Endogeneity in Quantiles

Tae-Hwan Kim a and Christophe Muller b

b Department of Economics, Yonsei University, Seoul 120-749, Korea

Tae-Hwan.Kim@yonsei.ac.kr; thkim8316@gmail.com

b DEFI, University of Aix-Marseille,

14, avenue Jules Ferry, F - 13621 Aix-en-Provence cedex, France.

Email: Christophe.muller@u-cergy.fr; christophe.muller@univmed.fr

Keyword: Quantile regression, endogeneity test, Hausman test. Mots clé de la conférence: Econométrie, Modèles semi et non paramétriques

Abstract: In this paper we develop a test to detect the presence of endogeneity in di¤erent quantiles in the conditional distribution of a variable of interest. This Hausman test type is based on one estimator consistent only under no endogeneity at the examined quantile and another estimator consistent in both the null and the alternative hypotheses.

We derive the asymptotic distribution of the test statistic. Moreover, we study the …nite sample properties of this test with Monte Carlo simulations of which results exhibit sub- stantial power in the studied cases. Finally, we apply our test to Engel curve estimation with UK data. We …nd that the pattern of the endogenenity of the total expenditure for various commodities (food, alcohol, fuel, transport, services) is complex when examining it across quantiles.

Résumé: Dans ce papier nous développons un test pour détecter la présence d’endogénéité en di¤érents quantiles de la distribution conditionnelle d’une variables d’intérêt. Ce test a la d’Hausman est basé sur un estimateur convergent seulement sous absence d’endogénéité et un autre estimateur convergent a la fois dans les hypothèses nulles et alternatives. Nous déerivons la distribution asymptotique de la statistique de test. De plus nous étudions les propriétés à distance …nie de ce test à l’aide de simulations de Monte Carlo dont les réesultats montrent une puissance considérable dans les cas étudiés. Finalement, nous appliquons notre test à l’estimation de courbes d’Engel avec des données britaniques.

Nous trouvons que la structure de l’endogénéité de la dépense totale pour divers biens (alimentation, alcool, fuel, transport, services) est complexe lorsqu’elle est examinée à travers les quantiles.

In recent years quantile regression has become a popular estimation method among

applied economists. The main reason of this popularity is the ‡exibility of this method

which, unlike the traditional mean-like estimators such as OLS, MLE or GMM, allows

researchers to investigate every single corner of the conditional distribution of a variable

of interest.

(3)

The issue of endogeneity in the context of quantile regression has been well recognized and di¤erent methods to deal with such a issue have been proposed, including the friendly two-stage …tted-value procedure. There are two trends in the literature about quantile regressions with endogeneity problem. The …rst one corresponds to models speci…ed in terms of the conditional quantile of the structural equation (the ‘structural conditional quantile’). The second set of works is anchored on conditional quantile restrictions applied to the reduced-form equation (the ‘…tted-value approach’) and has been employed by some empirical researchers. This allows the replacement of the endogenous regressors in the second stage with their …tted values obtained in the …rst stage. We follow this approach in this paper.

We are interested in the parameter ( 0 ) in the following equation for T observations:

y t = x 0 1t 0 + Y t 0 0 + u t (1)

= Z t 0 0 + u t

where [y t ; Y t 0 ] is a (G + 1) row vector of endogenous variables, x 0 1t is a K 1 row vector of exogenous variables, Z t = [x 0 1t ; Y t 0 ] 0 , 0 = [ 0 0 ; 0 0 ] 0 is assumed bounded in R K

1

+G , and u t is an error term, t = 1; : : : ; T vector. We denote by x 0 2t the row vector of K 2 (= K K 1 ) exogenous variables absent from (1).

We shall discuss later what we mean by exogeneity for given estimation problems.

Although weak exogeneity is an issue here, more speci…c exogeneity notions can be used for given estimators, e.g. implying that the regressors are orthogonal to the errors for LS methods.

Estimating 0 at the th -conditional distribution quantile of the dependent variable y can be achieved through the following minimization program:

min X T

t=1

(y t Z t 0 ) (2)

and (z) = z (z) where (z) = 1 [z 0] and 1 [:] is the Kronecker index. The solution from (2), denoted by ~, will be called the one-step quantile estimator for 0 . The one-step estimator ~ is consistent if the following zero conditional expectation condition holds:

E( (u t ) j Z t ) = 0 (3)

This condition is the assumption that zero is the given th -quantile of the conditional distribution of u t . It identi…es the coe¢ cients of the model. Since this identifying condi- tion depends on the chosen , the parameter 0 in the model depends on this too and would vary when considering other quantiles.

In this paper we develop a procedure to test for possible endogeneity in Y t . As op-

posed to traditional endogeneity Hausman tests based on LS estimators, our speci…cation

allows for non-constant e¤ects across quantile. This implies that we test a more general

hypothesis of endogeneity than usual.

(4)

We assume that Y t can be linearly predicted from the exogenous variables:

Y t 0 = x 0 t 0 + V t 0 (4)

where x 0 t = [x 0 1t ; x 0 2t ] is a K-rows vector, 0 is a K G matrix of unknown parameters and V t 0 is a G-rows vector of unknown error terms. By assumption the …rst element of x 1t is 1. Using (1) and (4), y can also be expressed as:

y t = x 0 t 0 + v t (5)

where

0 = H( 0 ) 0 with H( 0 ) = I K

1

0 ; 0 (6)

and v t = u t + V t 0 0 :

So far, we did not mention any restriction on errors. The error restrictions will be introduced below in Assumptions 2(vi). We now specify the data generating process.

Assumption 1 The sequence f (Y t 0 ; x 0 t ; u t ; v t ; V t 0 ) g is independently and identically distrib- uted (iid). Random vectors Y t 0 ; x 0 t ; u t ; v t ; and V t 0 are the t th elements in Y; x; u; v; and V respectively.

More speci…cally, ^ and ^ j (the j th column of ^ ; j = 1; : : : ; G) are …rst stage estimators obtained by:

min X T

t=1

(y t x 0 t ) (7)

min

j

X T t=1

(Y jt x 0 t j ) (8)

where and j are K 1 vectors and Y jt is the (j; t) th element of Y . Based on these

…rst-stage estimator, the second-stage estimator ^ is de…ned and obtained as follows:

min X T

t=1

(y t x 0 t H( ^ ) ):

In order to obtain the asymptotic distributions of ~ and ^, we impose the following reg- ularity conditions. Let h( j x), f ( j x) and g j ( j x) be the conditional densities respectively for u t , v t and V jt .

Assumption 2 (i) E( jj x t jj 3 ) < 1 and E( jj Y t jj 3 ) < 1 where jj a jj = (a 0 a) 1=2 .

(ii) H( 0 ) is of full column rank.

(5)

(iii) There is no hetero-altitudinality. That is: h( j x) = h( ), f ( j x) = f( ) and g j ( j x) = g j ( ). Moreover, h( ), f( ) and g j ( ) are continuous.

(iii) When evaluated at zero, all densities are positive; h(0) > 0, f (0) > 0 and g j (0) > 0.

(iv) All densities are bounded above; that is, there exist constants h , f , and j such that h( ) < h , f( ) < f and g j ( ) < j :

(v) The matrices Q x = E(x t x 0 t ) and Q z = E(Z t Z t 0 ) are …nite and positive de…nite.

(vi) E f (v t ) j x t g = 0 and E f (V jt ) j x t g = 0 (j = 1; : : : ; G):

For the one-step quantile estimator ~ we have

T 1=2 (~ 0 ) ! d N (0; 11 Q z 1 )

where 1t = h(0) 1 (u t ); 11 = E( 2 1t ) = h(0) 2 (1 ) and Q z = E(Z t Z t 0 ). The covari- ance estimator 11 Q z 1 can be consistently estimated by ^ 11 Q ^ z 1 where Q ^ z = T 1 P T

t=1 Z t Z t 0 and ^ 11 = T 1=2 P T

t=1 ^ 2 1t = ^ h(0) 2 (1 ) with ^ 1t = ^ h(0) 1 (^ u t ); u ^ t = y t Z t ^. Here

^ h(0) can be any kernel-type non-parametric estimator of the density h at zero.

A similar result can be obtained for the second-stage estimator ^ (Kim and Muller, 2004):

T 1=2 (^ 0 ) ! d N (0; 22 Q zz 1 );

where Q zz = H( 0 ) 0 Q x H( 0 ); Q x = E(x t x 0 t ) and 2t = f(0) 1 (v t ) P G

i=1 0i g i (0) 1 (V it ),

22 = E( 2 2t ). As before, 22 and Q zz can be consistently estimated: Q ^ zz = H( ^ ) 0 Q ^ x H( ^ ) with Q ^ x = T 1 P T

t=1 x t x 0 t and ^ 22 = T 1=2 P T

t=1 ^ 2 2t with ^ 2t = ^ f(0) 1 (^ v t ) P G

i=1 ^ 0i g ^ i (0) 1 ( ^ V it ) where f ^ (0) and ^ g i (0) are kernel-type estimators of f(0) and g i (0) respectively and v ^ t and

V ^ it are the residuals from the …rst-stage regressions in (7) and (8).

The null hypothesis we wish to test is

H 0 : E( (u t ) j Z t ) = 0, for a given (9) Theorem 1 Suppose that Assumptions 1 and 2 hold. Then, under the null of no endo- geneity, we have

T (~ ^)[RC 1 R 0 ] 1 (~ ^) ! d 2 (K 1 + G);

where R = I K

1

+G : I K

1

+G ;

C = 11 Q z 1 12 Q z 1 Q zx H( 0 )Q zz 1

12 Q zz 1 H( 0 ) 0 Q 0 zx Q z 1 22 Q zz 1 ; Q zx = E(Z t x 0 t ) and 12 = E( 1t 2t ):

C can be replaced with a consistent estimator C ^ T without a¤ecting the limiting dis- tribution. We use the plug-in principle to propose:

C ^ T = ^ 11 Q ^ z 1 ^ 12 Q ^ z 1 Q ^ zx H( ^ ) ^ Q zz 1

^ 12 Q ^ zz 1 H( ^ ) 0 Q ^ 0 zx Q ^ z 1 ^ 22 Q ^ zz 1 ;

(6)

Q ^ zx = T 1 X T

t=1

Z t x 0 t and ^ 12 = T 1 X T

t=1

^ 1t ^ 2t :

A …rst try of test statistics is: = T (~ ^)[R C ^ T 1 R 0 ] 1 (~ ^);which follows asymp- totically 2 (K 1 + G) if ~ and ^ are both consistent under the null.

However, the intercept estimator in ^ is not consistent for all values of . It is because the semi-parametric restrictions E f (v t ) g = 0 and E f (V jt ) g = 0 implied by Assump- tion 2(vi) are …rst imposed for a starting value 0 . Then, they will not be satis…ed for other values of . For another such , the quantile regression admits an asymptotic bias of F 1 ( 0 ) on the intercept, while the Double-Stage quantile regression admits a bias F 1 ( 0 ) + 0 G 1 ( 0 ) , still on the intercept coe¢ cient. to check

On the other hand, the slope estimator is consistent regardless of the value of . Hence, in order to propose a test for any value of , we use the slope estimators only to construct the test statistic (denoted by KM ) and the null distribution is 2 (K 1 + G 1):

Speci…cally, let 0(1) and 0(2) be the intercept and slope coe¢ cients respectively and we also decompose the quantile estimators ~ and ^ accordingly; that is, ~ 0 = (~ (1) ; ~ 0 (2) ) and

~ 0 = (~ (1) ; ~ 0 (2) ). Let R (2) be the matrix composed of the last (K 1 + G 1) rows in R; that is R (2) = [0 I] where the zero vector 0 and the identity matrix I are of size K 1 + G 1.

Then,

Theorem 2

KM = T (~ (2) ^ (2) )[R (2) C 1 R 0 (2) ] 1 (~ (2) ^ (2) ) ! d 2 (K 1 + G 1);

Bibliographie

References

[1] [1] Chernozhukov, V. and C. Hansen (2005), “An IV Model of Quantile Treatment E¤ects,” Econometrica, Vol. 73, No. 1, 245-261.

[2] Hausman, J.A. (1978), “Speci…cation Tests in Econometrics,”Econometrica, 1251- 1271.

[3] Kim, T. and C. Muller (2004), “Two-Stage Quantile Regressions when the First Stage is Based on Quantile Regressions,” The Econometrics Journal, 7, 218-231.

[4] Powell, J. (1983). The asymptotic normality of two-stage least absolute deviations

estimators. Econometrica 51, 1569-75.

Références

Documents relatifs

If the subspace method reaches the exact solution and the assumption is still violated, we avoid such assumption by minimizing the cubic model using the ` 2 -norm until a

GAW15 simulated data on which we removed some genotypes to generate different levels of missing data, we show that this strategy might lead to an important loss in power to

Here we consider the effects of both grain size distribution and dynamical grain charging to a test charge moving slowly in a dusty plasma by expressing the plasma dielectric

In order to check that a parametric model provides acceptable tail approx- imations, we present a test which compares the parametric estimate of an extreme upper quantile with

According to the TCGA batch effects tool (MD Anderson Cancer Center, 2016), the between batch dispersion (DB) is much smaller than the within batch dis- persion (DW) in the RNA-Seq

MICROSCOPE is a French Space Agency mission that aims to test the Weak Equivalence Principle in space down to an accuracy of 10 − 15.. This is two orders of magnitude better than

Our test is a Hausman-type test based on the distance between two estimators, of which one is consistent only under no endogeneity while the other is consistent regardless of

A travers cette étude, plusieurs microstructures et courbes de dureté ont été présentées, où nous avons montré les conditions des traitements thermomécaniques qui permettent