2.3 Minimum wage settings, data sources and methodology

2.3.6 Quantile regression: From conditional to unconditional marginal

In OLS, the conditional mean is obtained by minimizing the sum of squared residuals, whereas conditional median estimates are obtained by minimizing the sum of absolute residuals. This idea extends to other conditional quantiles by minimizing a sum of asymmetrically weighted absolute residuals, with different weights for positive and negative residuals at each quantile (Koenker and Bassett [1978]). In our case, the quantile regression approach lets us estimate a different impact at each point of the wage distribution, i.e. it allows for a non-linear association between the minimum wage and the log of real wages.

In the same way as for a mean squared error problem, one can define the conditional quantile function as the solution to the following minimization problem (Koenker [2005]):

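$$Q_\tau(\ln(W) \mid X, MW) = \arg\min_{q(X, MW)} E\left[\rho_\tau\big(\ln(W) - q(X, MW)\big)\right], \qquad (2.3)$$

where $\rho_\tau(\cdot)$ places asymmetric weights on positive and negative residuals, as follows:

$$\rho_\tau(u) = \mathbf{1}(u > 0)\,\tau\,|u| + \mathbf{1}(u \le 0)\,(1 - \tau)\,|u|,$$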

where $\mathbf{1}(\cdot)$ is the indicator function and $\tau$ is the quantile of interest. Substituting a linear model for $q(X, MW)$, i.e. $q(X, MW) = X^T\beta + MW\alpha$, transforms this minimization into an easier linear optimization problem, which can be solved iteratively. This gives:

$$\big(\hat\beta_\tau, \hat\alpha_\tau\big) = \arg\min_{\beta, \alpha} E\left[\rho_\tau\big(\ln(W) - X^T\beta - MW\alpha\big)\right].$$

Quantile regression reduces to median regression when $\tau = 0.5$. As $\tau$ increases from 0 to 1, one traces out the entire distribution of the outcome conditional on $(X, MW)$. Quantile regression thus provides the effect at different points of the conditional distribution.
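As a small numerical check of the claim that minimizing the asymmetrically weighted absolute loss yields a quantile, the sketch below evaluates the check function on simulated data and compares its minimizer with the sample quantile. The simulated data, the seed, the choice $\tau = 0.25$ and the names `check_loss` and `q_hat` are illustrative assumptions and are not part of the analysis in this chapter.

```python
import numpy as np

def check_loss(u, tau):
    """Check function rho_tau(u): weight tau on positive residuals, 1 - tau otherwise."""
    u = np.asarray(u, dtype=float)
    return np.where(u > 0, tau * np.abs(u), (1 - tau) * np.abs(u))

# Simulated outcome (illustrative only; not the wage data used in the chapter).
rng = np.random.default_rng(0)
y = rng.normal(size=1001)
tau = 0.25

# The objective is piecewise linear in q, so a minimiser is attained at a data point.
candidates = np.sort(y)
avg_loss = np.array([check_loss(y - q, tau).mean() for q in candidates])
q_hat = candidates[np.argmin(avg_loss)]

print(q_hat, np.quantile(y, tau))
```

The two printed values should agree closely, which is the sample analogue of the minimization problem above with no covariates.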

We use the same Mincerian wage equation in the quantile regression framework:

$$\mathrm{Quant}_\tau(\ln(W) \mid X, MW) = X^T\hat\beta_\tau + MW\,\hat\alpha_\tau. \qquad (2.4)$$

The $\beta_\tau$'s and $\alpha_\tau$ are the coefficients of the wage regression at the $\tau$th conditional quantile of $\ln(W)$ given all explanatory variables.
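To illustrate how a linear specification such as (2.4) can be fitted "through iterations", the following sketch uses a majorise-minimise scheme in which each step is a weighted least-squares solve. This is one possible iterative solver and not necessarily the algorithm behind Stata's qreg. The function name `quantile_regression_mm`, the smoothing constant `eps`, the iteration count and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def quantile_regression_mm(X, y, tau, n_iter=200, eps=1e-4):
    """Linear quantile regression by a majorise-minimise iteration.

    Uses rho_tau(r) = |r| / 2 + (tau - 1/2) * r and majorises |r| by a
    quadratic around the current residuals, so each step reduces to a
    weighted least-squares solve.
    """
    coef = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS starting values
    shift = (tau - 0.5) * X.sum(axis=0)              # linear part of the loss
    for _ in range(n_iter):
        resid = y - X @ coef
        w = 1.0 / (2.0 * (np.abs(resid) + eps))      # eps avoids division by zero
        Xw = X * w[:, None]
        coef = np.linalg.solve(X.T @ Xw, Xw.T @ y + shift)
    return coef

# Simulated data in which the spread of log wages grows with the minimum wage,
# so the coefficient on mw should rise with tau (illustrative only).
rng = np.random.default_rng(1)
n = 3000
mw = rng.uniform(0.0, 2.0, size=n)
log_wage = 1.0 + 0.5 * mw + (0.5 + 0.5 * mw) * rng.normal(size=n)
X = np.column_stack([np.ones(n), mw])

effects = {tau: quantile_regression_mm(X, log_wage, tau)[1] for tau in (0.1, 0.5, 0.9)}
print(effects)
```

In this simulated design the coefficient on `mw` differs across quantiles, which is exactly the kind of heterogeneity that a single OLS coefficient would average away.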

The marginal effects estimated from model (2.4) at each 0.05 quantile are plotted against the corresponding quantiles in Figures 2.2, 2.3, 2.4 and 2.5 for each country under the title CQR. The main limitation of the CQR approach lies in the interpretation of the marginal effects: with this approach we estimate the impact of minimum wages on the conditional quantiles of the wage distribution.

That is to say, our analysis is valid for within-group changes, where "group" refers to the sub-population of individuals who share the same values of the covariates in the model. We refer to the marginal effects estimated from (2.4) as Conditional Quantile Partial Effects (CQPE). Confidence intervals for the estimated CQPEs can be obtained either from the asymptotic theory of quantile regression estimators or from a variety of re-sampling schemes; see Koenker and Hallock [2001] for a comparison of the different methods. The authors argue that the discrepancies among competing methods are slight and that inference for quantile regression is quite robust. In this chapter we use the analytical confidence intervals implemented in Stata's qreg command. This method yields more conservative confidence intervals than those obtained from bootstrapping, but the general inferences and conclusions do not change whether analytical or bootstrap confidence intervals are used for the CQPEs.

When the marginal effects (CQPE) are plotted against their quantiles, a negative (positive) slope indicates a reduction (increase) in within-group inequality as a result of an increase in minimum wages. Consider the case where all the marginal effects are positive. A negative slope means that, following a positive location shift in MW, the wages of workers (from the same group) at lower quantiles increase more than those at upper quantiles, and as a result inequality decreases. The same intuition holds when the marginal effects have different signs. These results, although valuable, tell us little about between-group inequality (the inequality measure calculated using the unconditional distribution of wages), which is important from a policy point of view.
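The slope argument for within-group inequality can be made explicit under the linear specification (2.4). For two quantiles $\tau_2 > \tau_1$, the effect of MW on the gap between the corresponding conditional quantiles is the difference of the two coefficients. The numerical values in the second line are hypothetical and serve only to illustrate the sign.

```latex
\frac{\partial}{\partial MW}\Big[\mathrm{Quant}_{\tau_2}(\ln(W) \mid X, MW)
  - \mathrm{Quant}_{\tau_1}(\ln(W) \mid X, MW)\Big]
  = \alpha_{\tau_2} - \alpha_{\tau_1}, \qquad \tau_2 > \tau_1 .

% Hypothetical example: \alpha_{0.1} = 0.30 and \alpha_{0.9} = 0.10 give
\alpha_{0.9} - \alpha_{0.1} = 0.10 - 0.30 = -0.20 < 0 ,
% so the conditional 90-10 log-wage gap narrows as MW rises.
```

A downward-sloping CQPE curve therefore corresponds to a shrinking gap between upper and lower conditional quantiles, and an upward-sloping curve to a widening one.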

As discussed earlier, the main challenge in analyzing the effect of minimum wages on wages, whether on average or over the entire distribution, is dealing with potential endogeneity and the resulting bias in the estimated effect.

Endogeneity can arise from different sources: omitted variables, measurement error and simultaneity. In the analysis of minimum wages, the argument is that common macro-level economic factors can simultaneously influence an individual's wage and the applicable minimum wage (simultaneity). Some researchers have also argued that spill-over effects of minimum wages on workers not covered by minimum wages must be taken into account as measurement error, which can be another source of endogeneity bias (Autor [2010]).

As a remedy, some empirical studies have used time and state or region dummies to control for endogenous changes (Bell [1997]; Neumark et al. [2004]). Gindling and Terrell [2004] include dummy variables for each year to control for endogenous changes in yearly average minimum wages and in the timing of minimum wage changes.

Others have used instruments to address the endogeneity problem: Autor et al. [2016] use two-stage least squares, instrumenting the effective minimum with the statutory minimum wage in each state and year. The assumption is that this instrument captures exogenous variation in the effective minimum that is uncorrelated with the measurement error in state medians.

We have taken the complex setting of minimum wages into consideration to avoid the former problem. In these cases the exogeneity of the minimum wage is plausible, since minimum wage settings are not merely a regular yearly increase but are the result of a number of other factors.⁷ In our study we use detailed minimum wages at the sectoral and industry level instead of simple national-level minimum wages. The mapping technique we employed is time-consuming, but it provides desirable variation in minimum wages at the individual level. This is especially true for India, Indonesia and Mexico, where there are multiple minimum wages at the state, sectoral and occupational level. Therefore, we do not find it essential to simulate minimum wage variation using state- or industry-specific median wages. Further, to test for endogeneity, we repeated the analysis using the lag of the minimum wage.⁸ The results are quite consistent with those without lags (both in the direction and in the magnitude of the marginal effects), which to an extent reassures us that the threat of endogeneity can be ignored, as it is likely that minimum wages and wages are not determined in a similar way, like the case in Lee [1999] and Lemos [2003]. In addition, dummy variables for industry categories and state are included as controls in the regressions. We have run the analysis for all wage workers, yet we acknowledge that the estimated impacts are for an intent-to-treat sample.

⁷ To rule out the confounding effect of inflation on minimum wages and wages, all the analysis is based on real hourly minimum wages and real hourly wages.

Although conditional quantile regressions give the model some functional-form flexibility by allowing the effects of the covariates to differ at each quantile of the response, the results are not easily interpretable. Conditional quantile regressions are nonetheless widely used in the econometrics and statistics literature and in practice.

If the question of interest, however, is the impact of the covariates on quantiles of the unconditional (marginal) distribution of the response variable, then computing it from the conditional quantile regression can be complicated. Firpo et al. [2009] (FFL hereafter) proposed an approach, Unconditional Quantile Regression (UQR), to answer such policy questions, which overcomes this limitation of the CQR approach. The results and their interpretation differ between the two techniques (CQR and UQR);⁹ however, the two are closely linked, as one can be expressed as a weighted average of the other. Having both sets of results provides additional insight. Here we explain the UQR method proposed by FFL, which we use in our analysis.

Assume that $Y$ is the vector of observed responses in the presence of covariates $X$ (to simplify notation, MW is included in the vector $X$ and appears separately only when needed). The joint distribution $F_{Y,X}(\cdot, \cdot) : \mathbb{R} \times \mathcal{X} \to [0, 1]$ exists, where $\mathcal{X}$ is the domain of $X$. In many cases, especially from a policy perspective, we are interested in evaluating the effect of an infinitesimal (small) location shift in the distribution of a particular covariate, e.g. MW, on a statistical functional (e.g. the $\tau$th quantile) of the unconditional distribution of $Y$.

We can write the unconditional distribution of $Y$ as:

$$F_Y(y) = \int F_{Y|X}(y \mid X = x)\, dF_X(x). \qquad (2.5)$$

A small location shift in the distribution of the explanatory variable can be defined as a directional shift from $F_X(x)$ toward $G_X(x)$. In this case, under the assumption that this small change leaves the conditional distribution $F_{Y|X}(\cdot)$ intact, the unconditional distribution of $Y$ moves from $F_Y$ in the direction of $G_Y$, where:

$$G_Y(y) = \int F_{Y|X}(y \mid X = x)\, dG_X(x).$$

The idea of UQR rests on the Influence Function (IF), introduced by Hampel [1974] in the field of robust statistics. While $F_X$ shifts in the direction of $G_X$, we have the mixing distribution in the neighborhood of $F_Y(y)$ that moves in the direction of $G_Y(y)$.

⁸ The estimates with the lag are computed only for Brazil, as its minimum wage mechanism is comparatively simpler than in the other countries and it is easy to implement within a short period of time.

⁹ More specifically, in cases where the effects vary across conditional quantiles.

By definition, the influence function, $IF(y; T, F_Y)$, of the statistical functional $T(F)$ is:

$$IF(y; T, F_Y) = \lim_{t \to 0} \frac{T\{(1 - t)F_Y + t\,\delta_y\} - T(F_Y)}{t},$$

where $\delta_y$ places a point mass of 1 at the point $y$,

$$\delta_y(u) = \begin{cases} 0 & \text{if } u < y, \\ 1 & \text{if } u \ge y. \end{cases}$$

If $IF(y; T, F_Y)$ exists, it represents the influence of an individual observation on the statistical functional $T(F)$. Adding the statistical functional back to its influence function gives what FFL refer to as the recentered influence function (RIF):

$$RIF(y; T, F_Y) = T(F_Y) + IF(y; T, F_Y). \qquad (2.6)$$

Integrating both sides of equation (2.6) over the domain of $Y$ gives:

$$\int RIF(y; T, F_Y)\, dF_Y(y) = T(F_Y) + \int IF(y; T, F_Y)\, dF_Y(y).$$

Since $\int IF(y; T, F_Y)\, dF_Y(y) = 0$ by definition, and referring to equation (2.5), we have:

$$\begin{aligned}
T(F_Y) &= \int RIF(y; T, F_Y)\, dF_Y(y) \\
&= \iint RIF(y; T, F_Y)\, dF_{Y|X}(y \mid X = x)\, dF_X(x) \\
&= \int E\big[RIF(Y; T, F_Y) \mid X = x\big]\, dF_X(x).
\end{aligned} \qquad (2.7)$$

This links the statistical functional $T(F_Y)$ to the conditional expectation of $RIF(Y; T, F_Y)$.
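The definitions above can be checked numerically for a functional whose influence function has a simple closed form. The sketch below uses the variance, for which $IF(y) = (y - \mu)^2 - \sigma^2$, approximates the limit in the definition of the IF with a small finite $t$, and confirms that the RIF averages to $T(F_Y)$. The variance functional is chosen purely for illustration (the chapter works with quantiles), and the function names, the value of `t` and the simulated sample are assumptions.

```python
import numpy as np

def variance_functional(values, weights):
    """T(F) = Var_F(Y) for a discrete distribution with the given weights."""
    mean = np.sum(weights * values)
    return np.sum(weights * (values - mean) ** 2)

def numerical_if(sample, y, t=1e-6):
    """Finite-t version of [T((1 - t) F + t * delta_y) - T(F)] / t."""
    n = sample.size
    base_w = np.full(n, 1.0 / n)
    t_f = variance_functional(sample, base_w)
    mixed_values = np.append(sample, y)           # empirical F plus a point mass at y
    mixed_w = np.append((1.0 - t) * base_w, t)
    return (variance_functional(mixed_values, mixed_w) - t_f) / t

rng = np.random.default_rng(2)
sample = rng.normal(loc=1.0, scale=2.0, size=500)
mu, sigma2 = sample.mean(), sample.var()

# Closed form for the variance functional: IF(y) = (y - mu)^2 - sigma^2.
y0 = 4.0
if_numeric = numerical_if(sample, y0)
if_exact = (y0 - mu) ** 2 - sigma2

# RIF = T(F) + IF; averaging it over the sample recovers T(F), as in (2.7).
rif = sigma2 + ((sample - mu) ** 2 - sigma2)
print(if_numeric, if_exact, rif.mean(), sigma2)
```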

If we are interested in the impact of small changes in the distribution of the covariates on any statistical functional, we simply need to integrate the partial derivative of $E\big[RIF(Y; T, F_Y) \mid X\big]$ with respect to the covariate of interest over the domain of the covariates.

From this we can define the unconditional partial effect of a small location shift in the distribution of a continuous covariate, e.g. MW, that belongs to the vector of covariates $X$, on the statistical functional $T(F)$ (Corollary 1 in FFL):

$$\alpha(T) = \int \frac{dE\big[RIF(Y; T, F_Y) \mid X = x\big]}{dx}\, dF_X(x). \qquad (2.8)$$

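Consider the specific case where the quantiles $q_\tau$ are the statistical functional of interest. The $\tau$th quantile of the distribution $F_Y$ is defined as $q_\tau = T(F_Y) = \inf_q\{q : F_Y(q) \ge \tau\}$, and its influence function can be expressed as:

$$IF(y; q_\tau, F_Y) = \frac{\tau - \mathbf{1}(y \le q_\tau)}{f_Y(q_\tau)},$$

where $\mathbf{1}(\cdot)$ is the indicator function and $f_Y(q_\tau)$ is the density of $Y$ at $q_\tau$. It follows from (2.6) that:

$$RIF(y; q_\tau, F_Y) = q_\tau + \frac{\tau - \mathbf{1}(y \le q_\tau)}{f_Y(q_\tau)} = c_{1,\tau}\,\mathbf{1}(y > q_\tau) + c_{2,\tau}, \qquad (2.9)$$

where $c_{1,\tau} = 1 / f_Y(q_\tau)$ and $c_{2,\tau} = q_\tau - c_{1,\tau}(1 - \tau)$. We can therefore write the conditional expectation of the RIF given the set of covariates $X$ as:

$$E\big[RIF(Y; q_\tau, F_Y) \mid X\big] = c_{1,\tau}\,\Pr\big[Y > q_\tau \mid X\big] + c_{2,\tau},$$

and the unconditional partial effect of a small location shift in the distribution of a continuous covariate on the $\tau$th quantile is:

$$\alpha(q_\tau) = c_{1,\tau} \int \frac{d\Pr\big[Y > q_\tau \mid X = x\big]}{dx}\, dF_X(x). \qquad (2.10)$$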

FFL call $\alpha(q_\tau)$ the Unconditional Quantile Partial Effect (UQPE). The problem of estimating marginal effects now boils down to estimating the average marginal effect of the probability model $\Pr[Y > q_\tau \mid X]$, the population quantile $q_\tau$ and the density of $Y$ at this point, $f_Y(q_\tau)$. The latter two can easily be replaced by their sample counterparts:

$$\hat q_\tau = \arg\min_q \sum_{i=1}^{n} \big(\tau - \mathbf{1}\{y_i - q \le 0\}\big)(y_i - q)$$

from Koenker and Bassett [1978], and the kernel density estimator:

$$\hat f_Y(\hat q_\tau) = \frac{1}{n} \sum_{i=1}^{n} K_h(y_i - \hat q_\tau),$$

where $K_h(\cdot)$ is a kernel function and $h$ is the optimal bandwidth chosen as in (2.1).
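The sample counterparts above can be sketched in a few lines. The code below computes $\hat q_\tau$, a Gaussian kernel density estimate at $\hat q_\tau$ and the implied RIF value for each observation. The Gaussian kernel and Silverman's rule-of-thumb bandwidth are assumptions standing in for the kernel and the bandwidth rule (2.1) used in the chapter, and the function names and simulated log wages are hypothetical.

```python
import numpy as np

def gaussian_kde_at(point, data, bandwidth):
    """Kernel density estimate (1/n) * sum K_h(y_i - point) with a Gaussian kernel."""
    z = (data - point) / bandwidth
    return np.mean(np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)) / bandwidth

def rif_quantile(y, tau, bandwidth=None):
    """RIF(y; q_tau) = q_tau + (tau - 1{y <= q_tau}) / f_Y(q_tau), with sample plug-ins."""
    y = np.asarray(y, dtype=float)
    if bandwidth is None:
        # Silverman's rule of thumb: an assumption, not the chapter's rule (2.1).
        bandwidth = 1.06 * y.std(ddof=1) * y.size ** (-0.2)
    q_hat = np.quantile(y, tau)
    f_hat = gaussian_kde_at(q_hat, y, bandwidth)
    rif = q_hat + (tau - (y <= q_hat).astype(float)) / f_hat
    return rif, q_hat, f_hat

# Simulated log hourly wages (illustrative only).
rng = np.random.default_rng(4)
log_wage = rng.normal(loc=2.0, scale=0.6, size=2000)

rif, q_hat, f_hat = rif_quantile(log_wage, tau=0.25)
print(q_hat, f_hat, rif.mean())
```

The RIF takes only two distinct values, one for observations above $\hat q_\tau$ and one for those at or below it, and its sample mean is close to $\hat q_\tau$, in line with (2.7).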

$\Pr[Y > q_\tau \mid X]$ can be estimated using different methods. FFL discuss the use of the Linear Probability Model (LPM), Generalized Linear Models (GLM: Logit or Probit) and a non-parametric model to estimate this conditional probability. Under the assumption that $\Pr[Y > q_\tau \mid X]$ is linear in $X$ (LPM), the OLS regression of $\widehat{RIF}(Y; \hat q_\tau, F_Y)$ on $X$ provides a consistent estimate of the marginal effects, $\alpha(q_\tau)$. This is what Firpo et al. [2009] refer to as RIF-OLS, and it is the basis of our analysis of the impact of a small location shift in the distribution of minimum wages on the unconditional quantiles of the wage distribution. Changing the notation of the outcome from $Y$ to $W$ for wages¹⁰ and separating MW from the vector $X$, we first estimate the RIF for each observation as:

$$\widehat{RIF}(W; \hat q_\tau, F_W) = \hat c_{1,\tau}\,\mathbf{1}\{W > \hat q_\tau\} + \hat c_{2,\tau}.$$

Here, the same set of covariates is used as for the conditional quantile regression. We are merely interested in the impact of the minimum wage (in real hourly terms) (MW) on different quantiles of the unconditional wage distribution. We estimate the following model at each 0.05 quantile, $\tau$, using RIF-OLS:

$$E\big[\widehat{RIF}(W; \hat q_\tau, F_W) \mid X, MW\big] = X^T\hat\beta(\tau) + MW\,\hat\alpha(\tau).$$

¹⁰ As in the case of OLS and conditional quantile regression, our response of interest, $W$, is the vector of log hourly real wages, denoted $\ln(w)$, which prevents us from finding an effect that mixes the impact on hours of work and/or inflation.

The confidence intervals for the estimated parameters are in this case obtained through an (x, y)-pair bootstrap, that is, by re-sampling observations with replacement and refitting the models. The estimated marginal effects, $\hat\alpha(\tau)$, are plotted against the corresponding quantiles in Figures 2.2, 2.3, 2.4 and 2.5 for each country under the title UQR.
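The RIF-OLS step and the (x, y)-pair bootstrap described above can be sketched as follows. This is a minimal illustration on simulated data, not the code used for the chapter's estimates. The Gaussian kernel, Silverman's bandwidth, the percentile form of the interval, the number of bootstrap replications and all function names are assumptions. The data come from a linear location-shift model with a coefficient of 0.5 on `mw`, so an estimate near 0.5 would be expected at any quantile.

```python
import numpy as np

def rif_quantile(y, tau):
    """Sample RIF of the tau-th quantile; Gaussian kernel, Silverman bandwidth (assumed)."""
    h = 1.06 * y.std(ddof=1) * y.size ** (-0.2)
    q_hat = np.quantile(y, tau)
    z = (y - q_hat) / h
    f_hat = np.mean(np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)) / h
    return q_hat + (tau - (y <= q_hat).astype(float)) / f_hat

def rif_ols(X, y, tau):
    """OLS of the estimated RIF on the design matrix X (first column is a constant)."""
    return np.linalg.lstsq(X, rif_quantile(y, tau), rcond=None)[0]

def pairs_bootstrap_ci(X, y, tau, n_boot=200, level=0.95, seed=0):
    """Percentile interval from resampling (x, y) rows with replacement and refitting."""
    rng = np.random.default_rng(seed)
    n = y.size
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample whole rows
        draws[b] = rif_ols(X[idx], y[idx], tau)   # quantile and density are re-estimated
    lo, hi = np.percentile(draws, [50 * (1 - level), 100 - 50 * (1 - level)], axis=0)
    return lo, hi

# Simulated linear location-shift model with coefficient 0.5 on mw (illustrative).
rng = np.random.default_rng(3)
n = 2000
mw = rng.normal(size=n)
log_wage = 1.0 + 0.5 * mw + rng.normal(size=n)
X = np.column_stack([np.ones(n), mw])

alpha_hat = rif_ols(X, log_wage, tau=0.5)[1]
ci_lo, ci_hi = pairs_bootstrap_ci(X, log_wage, tau=0.5)
print(alpha_hat, ci_lo[1], ci_hi[1])
```

Re-estimating the quantile and the density inside each bootstrap replication lets the interval reflect the sampling variability of the RIF itself, not only that of the OLS step.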

If the linearity assumption holds, i.e. referring to equation (2.2), if $h(X, MW, \varepsilon) = X^T\beta + MW\alpha + \varepsilon$, then the UQPE and the CQPE are both equal to the coefficients of the structural form. However, this is not the case in our analysis, as we observe that the partial effects estimated through the two methods differ, and so does the interpretation of the results. To interpret the estimated UQPE at each quantile, the same intuition holds as for the CQPE. However, these results are more general than in the case of CQR. Therefore, one can argue that where the graph of the marginal effects against their quantiles has a negative (positive) slope, wage inequality decreases (increases) as a result of a small increase in the location of the minimum wages. This implies that minimum wages have a favorable (adverse) effect on inequality. Inequality here refers to between-group inequality (dispersion), as opposed to the within-group inequality from CQR. This provides room for policy advice that concerns the entire population under study rather than only the specific "group" that shares the same values of the covariates $X$ (Firpo et al. [2009]).

As mentioned earlier, one of the main assumptions behind UQR, which can appear restrictive, is that the conditional distribution of the outcome, $F_{Y|X}(\cdot)$, must remain unaffected by a small location shift in the distribution of $X$. This is guaranteed if $X$ and $\varepsilon$ are independent, which means that the vector of covariates can contain only exogenous variables, and the endogeneity problem cannot easily be dealt with in this framework.

Finally, the use of the Linear Probability Model (LPM) to estimate $\Pr[Y > q_\tau \mid X]$ might seem quite restrictive. Firpo et al. [2009] also suggest other appropriate models such as Logit or Probit, or even a non-parametric estimator. They show in their empirical example that the additional gain from these more complicated methods is very small, whereas RIF-OLS has the great advantage of being computationally very easy.