Contributions to the robust analysis of structural models

RANJBAR AKBARZADEH, Setareh

Abstract

This thesis looks at different aspects of robustness in the analysis of structural models. The first chapter illustrates the importance of functional form specification for correct inferences on the non-parametric components of a semi-parametric model. Empirical results are provided using the German Socio Economic Panel data. The second chapter takes quantile regression, conditional and unconditional, as a robust method to analyze the effect of minimum wages on the entire wage distribution for four developing countries: Brazil, Mexico, Indonesia and India. In the last chapter we focus on the robust estimation of non-linear parameters in the context of Small Area Estimation and we propose new bias calibration methods. The performance of these methods is compared with that of the existing competitors in the literature. These methods mostly outperform the others, especially when we are dealing with a heavy-tailed and highly skewed distribution, such as the distribution of income or expenditure.

RANJBAR AKBARZADEH, Setareh. Contributions to the robust analysis of structural models. Thèse de doctorat : Univ. Genève, 2018, no. GSEM 55

URN : urn:nbn:ch:unige-1069286

DOI : 10.13097/archive-ouverte/unige:106928

Available at:

http://archive-ouverte.unige.ch/unige:106928

Disclaimer: layout of this document may differ from the published version.


Contributions to the Robust Analysis of Structural Models

by

Setareh Ranjbar

A thesis submitted to the

Geneva School of Economics and Management, University of Geneva, Switzerland,

in fulfillment of the requirements for the degree of PhD in Statistics

Members of the thesis committee:

Prof. Maria-Pia Victoria-Feser, Chair, University of Geneva

Prof. Elvezio Ronchetti, Co-adviser, University of Geneva

Prof. Stefan Sperlich, Co-adviser, University of Geneva

Prof. Nicola Salvati, University of Pisa

Thesis No. 55 May 2018


THE DEAN

Uni Mail - 40 bd du Pont d’Arve - CH-1211 Genève 4 www.unige.ch

To whom it may concern

IMPRIMATUR

I, the undersigned, Professor Marcelo OLARREAGA, Dean of the Faculty of Economics and Management, confirm that Ms. Setareh RANJBAR AKBARZADEH is granted the imprimatur for her thesis No. 55, following its public defense on 16 May 2018, for the degree of Doctor in Statistics.

Prof. Marcelo OLARREAGA, Dean

Geneva, 7 June 2018

MO/GK/KR


Acknowledgements

Foremost, I am very grateful to my co-advisers, Prof. Elvezio Ronchetti and Prof. Stefan Sperlich, for their precious guidance. They were both very patient with me when my learning was slow. They both trusted me to bring in and try my new ideas. Prof. Sperlich always had his door open and welcomed questions and discussions generously. Working under the supervision of Prof. Ronchetti was a great pleasure and advantage for me. Apart from “Robust statistics”, I have learned under his supervision how to think in an organized and scientific way.

Furthermore, I would like to express my gratitude to Dr. Uma Rani Amara, who is my coauthor for the 2nd chapter of this thesis. I profited a lot from her experience in empirical research, and she taught me the principles of conducting honest scientific research.

I would like to thank Prof. Maria-Pia Victoria-Feser, the chair of my jury, for taking interest in this thesis and for reading it with care and providing us with useful and constructive comments on the proposed methods and approaches.

I would like to thank Prof. Nicola Salvati, the external expert, who read this thesis most carefully and gave me constructive and detailed comments on all chapters. It was he who generously provided me with the data for the example in the 3rd chapter. I have learned, and am still learning, a lot from his expertise in Small Area Estimation. He is also a very kind and friendly colleague; fortunately, my collaboration with him continues beyond the scope of this thesis in the field of Small Area Estimation.

Additionally, I would like to thank all my past and present colleagues at the University of Geneva, who always remain my friends. My special thanks, in chronological order, go to Marianne Furrer, Giuditta Rusconi, Dr. Elena Sarti, Dr. Judite Goncalves, Dr. Liliana Foletti, Dr. Mark Hannay, Dr. Pierre-Yves Deleamont, Dr. Daniel Flores, Justine Falciola, Ingrid Vargas Yanez and many more whom I could not name here due to the limited space.

Special thanks to Mattia Branca, who shared the office with me for five years. He was always a great moral support in what we called “downward research periods”.

Also special thanks go to Linda Mhalla for being such a caring colleague and supportive friend. She is very insightful in her field and our 5th floor corridor discussions were of great help for the first chapter of this thesis.

I would also like to thank Katarzyna Reluga, a very enthusiastic colleague and friend of mine, who spent time reading and commenting on the very first draft of this thesis.

I would like to show my sincere gratitude to my family. My parents and my three sisters have always supported me even from far away. In the same spirit I would like to thank Dr. Fatemeh Valamanesh, my aunt without whom I would not have come to Geneva in the first place. She allowed me to have a fighting chance to get my PhD.

Finally, I would like to dedicate this thesis to my supportive husband and sweet daughter, who have contributed to its completion with their patience and passion.


Abstract

In this thesis we look at different aspects of robustness in the analysis of structural models. The first chapter illustrates the importance of functional form specification for correct inferences on the non-parametric components of a semi-parametric model. The motivating examples for the issues discussed in this chapter are the studies of life-satisfaction. Empirical results are provided using the German Socio Economic Panel data. The second chapter takes quantile regression, conditional and unconditional, as a robust method to analyze the effect of minimum wages on the entire wage distribution. We provide a complete analysis for four developing countries: Brazil, Mexico, Indonesia and India, and we discuss the policy implications. In the last chapter we focus on the robust estimation of non-linear parameters in the context of Small Area Estimation and we propose two new calibration methods to account for the bias. Through extensive simulations the performance of these methods is compared with that of the existing competitors in the literature. These methods mostly outperform the others, especially when we are dealing with a heavy-tailed and highly skewed distribution, such as the distribution of income or expenditure. An empirical example is provided using the Acid Neutralizing Capacity of the lakes in northeastern USA.


Résumé

In this thesis, we examine different aspects of robustness in the analysis of structural models. The first chapter illustrates the importance of the specification of the functional form for correct inferences on the non-parametric components of a semi-parametric model. The discussions in this chapter are motivated by examples of studies on life satisfaction. The empirical results are derived from data of the “German Socio Economic Panel”. The second chapter considers quantile regression, conditional and unconditional, as robust methods for analyzing the effect of a minimum wage on the entire wage distribution. We provide a complete analysis for four developing countries: Brazil, Mexico, Indonesia and India, and we discuss the policy implications. In the last chapter, we focus on the robust estimation of non-linear parameters in the context of small area estimation, and we propose two new calibration methods to correct the bias. Extensive simulations allow us to compare the performance of these methods with that of the alternatives from the literature. Our methods generally outperform the latter, in particular in cases where we are dealing with an asymmetric, heavy-tailed distribution, such as that of income or expenditure. These methods are illustrated using data on the capacity of lakes in the northeastern United States to neutralize their acidity.


Contents

Acknowledgements
Abstract
Résumé
Introduction
1 Unhappy with semi-parametrics? On the effects of functional form misspecification.
1.1 Data generating processes and regression models
1.1.1 A DGP for life-satisfaction over the life span
1.1.2 Simulation study
1.2 Life-Satisfaction in Germany: Empirical results
1.2.1 General class of models
1.2.2 Semi-parametric additive model
1.2.3 Semi-parametric additive model with interaction
1.2.4 GAM with Poisson link function for (dis-)satisfaction
1.2.5 An ordered logit model with cubic age-function
1.3 Conclusion of the chapter
2 Robust analysis of wage distribution
2.1 The effect of minimum wages on wage distribution and inequality
2.2 Methodological developments: The impact of minimum wages
2.3 Minimum wage settings, data sources and methodology
2.3.1 Minimum wage settings in the countries under analysis
2.3.2 Data sources
2.3.3 Methodology
2.3.4 Change in unconditional wage distribution and inequality measures
2.3.5 Marginal effect of minimum wages
2.3.6 Quantile regression: From conditional to unconditional marginal effects
2.4 Empirical Evidences
2.4.1 Descriptive Statistics
2.4.2 Conditional and Unconditional QR marginal effects
2.5 Conclusion of the chapter
3 Robust analysis of non-linear small-area parameters
3.1 General framework and notations
3.2 Robust estimation of the model parameters
3.2.1 BLUP, EBLUP and REBLUP
3.3 Estimation of non-linear parameters for small domains
3.3.1 Conditional CDF
3.3.2 New calibration approach: Complying with the shape of the outcome distribution
3.3.3 Calibration of non-linear population parameters: Linearization through the Influence Function
3.3.4 Full calibration vs. Partial calibration
3.4 Simulation Results
3.5 Choice of tuning parameters via estimating the MSE
3.6 Application: Acid Neutralizing Capacity of the lakes in northeastern USA
3.7 Conclusion of the chapter
A Diagnostics on GSOEP data analysis
B Data sources, variable names and definitions - heterogeneity of marginal effects
B.1 Comparison of the partial effects
References


Introduction

This thesis consists of three chapters, each of them self-contained with a separate introduction and conclusion.

Recent developments in the field of econometrics and statistics provide us with such a vast variety of analytical tools that caution is called for when applying them to empirical studies. The first chapter, “Unhappy with semi-parametrics?”, focuses on the consequences of functional form misspecification when one is fitting a semi-parametric model. This problem is explained for the situation in which we are dealing with self-reported, bounded, discrete responses in surveys. In this chapter we look at the bias transfer between the components of a semi-parametric model when the link function between the discrete responses and the latent model is misspecified. In statistics, semi-parametric techniques were introduced as a way of providing more flexibility in the model specification. In econometrics they became popular because of their ability to prefix parts of the model, e.g. to incorporate prior knowledge. The elements about which economic theory is not specific can be passed in good conscience to the non-parametric part. In practice researchers tend to model parametrically the parts they want to interpret. All the rest is called “nuisance parameters” (distribution, scedasticity function, impact of control variables, etc.). A more recent approach is to leave unspecified exactly that part of the model that is of interest, for the sake of having more flexibility in its functional form. But this estimate inherits the misspecification of the parametric part and, because of its flexibility, it is even more susceptible to it. The existing problems in the literature and our findings are explained using the example of life-satisfaction analysis and the controversial concept of the mid-life crisis. However, the concepts that are discussed in this chapter are more general. They concern all situations where the respondents are not neutral towards discrete scales and their specific link between the latent and the observed discrete choice is not correctly specified beforehand. In the simulations we show that the functional form of life-satisfaction in age can be estimated with a great deal of bias if the link function is not correctly specified. This can even erroneously create the hollow of a mid-life crisis when it does not exist. An empirical study is also provided using the German Socio-Economic Panel (GSOEP), to explore the effect of age on the overall life satisfaction of people in Germany between 1986 and 2007.

The second chapter of this thesis is an empirical study of the effect of minimum wages on the entire wage distribution or, more specifically, on wage inequality. As said by Koenker and Bassett [1978]:

“Statement of the Gauss-Markov theorem too often seems to imply that linearity in y and unbiasedness are added virtues of the least square estimator instead of restriction on the class of its potential competitors.” [p. 35]


Therefore, in this chapter we use the quantile regression approach as a robust alternative for estimation in the location model. However, the results of Conditional Quantile Regression (CQR), proposed by Koenker and Bassett [1978], are quite restrictive when it comes to the interpretation of the marginal effects. Indeed, estimated model parameters cannot be directly interpreted to obtain the marginal effect of a location shift in a covariate on the unconditional distribution of the outcome. Using these results, either we can only draw conclusions about inequality changes within a group of individuals who share the same characteristics, or we have to deal with the computationally expensive calculation of multiple integrals over the domain of all covariates. To overcome this problem Firpo et al. [2009] have proposed an approach called Unconditional Quantile Regression (UQR). This method directly provides an estimate of the marginal effect of a location shift in the covariate of interest on the unconditional distribution of the outcome. We apply their method to data from four developing countries, namely Brazil, Mexico, Indonesia and India, to study the effect of minimum wage changes on the wage inequality of the working class. Interesting insights are obtained for these countries by comparing the results from the conditional and unconditional quantile regressions.
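To make the idea behind UQR concrete, the following minimal sketch regresses the recentered influence function (RIF) of a quantile, $q_\tau + (\tau - 1\{y \le q_\tau\}) / f_Y(q_\tau)$, on the covariates by ordinary least squares. The Gaussian kernel, the rule-of-thumb bandwidth, the function names and the simulated data are assumptions made for illustration; they are not taken from the thesis.

```python
import numpy as np

def rif_quantile(y, tau, bandwidth=None):
    """Recentered influence function of the tau-th quantile, evaluated at each y."""
    y = np.asarray(y, dtype=float)
    q = np.quantile(y, tau)
    if bandwidth is None:
        # Rule-of-thumb bandwidth (an assumption of this sketch).
        bandwidth = 1.06 * y.std(ddof=1) * y.size ** (-0.2)
    # Gaussian kernel density estimate of f_Y at the sample quantile q.
    kernel = np.exp(-0.5 * ((y - q) / bandwidth) ** 2) / np.sqrt(2.0 * np.pi)
    density = kernel.mean() / bandwidth
    return q + (tau - (y <= q)) / density

def uqr_coefficients(y, X, tau):
    """OLS of the RIF on an intercept and the covariates (RIF regression)."""
    rif = rif_quantile(y, tau)
    design = np.column_stack([np.ones(len(rif)), X])
    coef, *_ = np.linalg.lstsq(design, rif, rcond=None)
    return coef

# Hypothetical data: a pure location-shift model, so a unit shift in x
# moves every unconditional quantile of y by about 2.
rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = 1.0 + 2.0 * x + rng.normal(size=5000)
coef_median = uqr_coefficients(y, x, tau=0.5)  # [intercept, slope on x]
```

In this simulated location-shift design the slope should lie close to 2 at every quantile; with real wage data the slopes would typically differ across quantiles, which is what makes the comparison informative.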

Today the availability of rich sample surveys provides a ground for researchers and policy makers to pursue more ambitious objectives. This information, together with auxiliary data coming through administrative channels, is used for a better prediction or estimation of social and economic indices, e.g. inequality or poverty measures, which can help to determine their target domains more precisely. The domains for which the sample size is not large enough to provide an acceptable direct estimate are referred to as “small areas”. The existence of outliers in the sample data can significantly harm the estimation for areas in which they occur, especially where the domain sample size is small. Chambers (1986) discussed the robust estimation of the finite population total and mean in the presence of outliers, and Welsh and Ronchetti (1998) extended the results to the cumulative distribution. Chambers et al. (2014) provide a comprehensive review. In chapter three, based on a robust EBLUP, we propose two new approaches to calibrate for the bias of non-linear functionals, such as the Gini index, when the so-called “representative outliers” come from a skewed, heavy-tailed distribution. For the estimation of the domain-specific non-linear parameters we either use the estimate of the Cumulative Distribution Function of each area or we use the linearized approximation of the non-linear parameter. Then we propose an asymmetric Huber-type function for calibrating these results. The optimum tuning constants of this function are found through a bootstrapping procedure. In all cases this calibration approach can outperform the conventional symmetric calibration, and it contains the symmetric Huber function as a special case.
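As a rough illustration of what an asymmetric Huber-type function can look like, the sketch below clips residuals at two different tuning constants, one for each tail. The exact form used in Chapter 3 is not given in this introduction, so the clipping form, the function name and the numerical constants are assumptions; setting both constants equal recovers the usual symmetric Huber function.

```python
import numpy as np

def asymmetric_huber(u, c_lower, c_upper):
    """Huber-type psi function with separate (positive) tuning constants per tail.

    Hypothetical form: identity on [-c_lower, c_upper], constant outside.
    """
    return np.clip(np.asarray(u, dtype=float), -c_lower, c_upper)

residuals = np.array([-5.0, -1.0, 0.0, 2.0, 10.0])

# Equal constants: the classical symmetric Huber function
# (1.345 is a conventional choice of tuning constant).
symmetric = asymmetric_huber(residuals, 1.345, 1.345)

# A larger upper constant downweights large positive residuals less,
# which may suit right-skewed outcomes such as income.
asymmetric = asymmetric_huber(residuals, 1.345, 4.0)
```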


Chapter 1

Unhappy with semi-parametrics? On the effects of functional form misspecification.

Functional form misspecification has always been an important issue in econometrics. Non-parametric techniques came to partially answer this issue by relaxing the assumption of any explicit functional form and considering an infinite-dimensional parameter space. However, various practical problems, like the curse of dimensionality or the lack of explicit parameter estimates from which to make direct inferences on marginal effects, make these techniques less appealing to empirical economists.

In statistics, semi-parametric techniques were introduced as a way of circumventing the curse of dimensionality while providing some flexibility. In econometrics they became popular (at least in econometric theory) because of their ability to prefix parts of the model, e.g. to incorporate prior knowledge, or say “to model what you want to model”. The elements about which economic theory is not specific can be passed in good conscience to the non-parametric part. In practice researchers tended to model parametrically the parts they wanted to interpret. All the rest was called “nuisance parameters” (distribution, impact of control variables, etc.). More recent is the trend to leave unspecified exactly that part of the model that is of interest. For example, in studies of life-satisfaction researchers are particularly interested in studying the impact of age, whose correct parametric specification has therefore been at the center of controversy. Consequently, a seemingly attractive remedy was to estimate the impact of age non-parametrically, while keeping the rest of the model parametric.

Let us have a closer look at the regression problem. Having discrete responses $y_i$ for individuals $i = 1, \ldots, n$ (that typically run from 0 or 1 to 7 or 10), one tries to explain these by some observed individual characteristics $x_i \in \mathbb{R}^d$ and $age_i$ via a (generalized) linear model (GLM):

$$E[Y \mid X = x, Age = age] = G(\eta(x, age)) = G(\beta_0 + x^T \beta_1 + age\,\beta_a).$$

In a correctly specified parametric model the link function, $G$, corresponds to the error distribution in the latent variable model. However, $G$ is often set equal to the identity for convenience, with various justifications offered for this choice. When $Y$ can take considerably more than just two values, a necessary prerequisite for making this a reasonable choice is that either $Y$ is a cardinal variable or the index function $\eta$ is sufficiently flexible. It is clear that no link function $G$ is necessary if this index function is entirely non-parametric. However, its estimation runs into the curse of dimensionality problem, and may render the interpretation of the impact of age pretty hard. Therefore, the semi-parametric model

$$E[Y \mid X = x, Age = age] = G(\eta(x, age)) = x^T \beta + m(age), \tag{1.1}$$

in which $m$ is a one-dimensional non-parametric function, could seem to be an enticing compromise. $G$ is the identity with a quite flexible $\eta$, but the impact of age is easy to illustrate and interpret.¹ This was done quite recently in Wunder et al. [2013] when studying the overall life-satisfaction (LS) with data from the German Socio-Economic Panel (SOEP) and the British Household Panel Survey. When they estimated model (1.1), applying P-splines to calculate $m$, they found a clear indication of a mid-life crisis at the age of around 50. We show, however, that this hollow of $m$ at age 50 can equally well be caused by a natural but non-linear link $G$ or by non-linearities in the impact of $x$. By “natural” we mean that people’s responses regress to the mean, i.e. they tend not to give the most extreme values (who wants to rank himself as totally unsatisfied or totally satisfied?). This simple shift toward unequal scaling can be captured by a proper choice of $G$; ignoring it can result in a cubic-shaped estimate of $m$. More generally, in the setting of a Generalized Partial Linear Model (GPLM):²

$$E[Y \mid X = x, Age = age] = G(\eta(x, age)) = G(x^T \beta + m(age)),$$

with $x$ and $Age$ being vectors of potentially correlated covariates and $G$ not being the identity, a misspecification of $G$ or the non-linearity of the impact of $x$ on the index $\eta$ will be reflected in the estimate of $m$. This leads to the seemingly peculiar though evident situation that, in a semi-parametric model, the non-parametric estimate has to be interpreted with a lot of care. “Peculiar” in that many people think that, as the non-parametric part is free from any model misspecification, its estimate would be so, too. But this estimate inherits the misspecification of the parametric part and, because of its flexibility, it is even more susceptible to it than a parametric $m$ would be.

Such an issue is, for example, of high importance in the analysis of data with bounded, discrete responses, even if ordered. It is often controversially discussed whether the variable of interest (like LS) should be considered as ordinal or cardinal. Imposing assumptions on how such a response must be treated pre-defines to some extent the functional form of the link function $G$. However, except for certain cases, such problems are not necessarily identifiable. Consequently, a proper empirical analysis should study a set of possible models or, say, do robustness checks of its findings. The general contribution of this chapter is to raise awareness among empirical researchers of the bias transfer between the components of a semi-parametric model when there is a functional form misspecification in the link between the latent and the observed variable or in the parametric part. We have observed this problem in the analysis of life-satisfaction or happiness studies. Therefore, the specific contribution of this chapter is to explain a possible source of controversy in this line of literature, when the interest is in the functional form of life-satisfaction in age. We do not provide any new techniques; rather, we point out a problem that is unidentifiable unless extra assumptions (restrictions) are put in place or additional information is acquired on the link between the latent and the observed variable. That means that solving the issues stated in this chapter requires imposing extra assumptions on the model or running a pre-analysis study on the behavior of individuals towards discrete choice surveys.

¹ If an appropriate GLM, rather than the functional form of the impact of age, is of interest, see Studer and Winkelmann [2011].

² For future reference in this chapter, one must keep in mind that the GPLM is a special case of a Generalized Additive Model, in which there is only one non-parametric component in the model and the rest of the covariates are fitted parametrically. In this case one can achieve the parametric convergence rate for the parametric part of the model under some general conditions.

In Section 1.1 we first introduce and formalize the aforesaid problem. Through a simulation study in Section 1.1.2 we illustrate the consequences of functional form misspecification for the transfer of the resulting bias between the components of the semi-parametric model. Section 1.2 provides an empirical study on the SOEP data set, where we explore to what extent different model specifications lead to different conclusions on the functional form of life-satisfaction in age. Finally, Section 1.3 concludes the chapter.

1.1 Data generating processes and regression models

Above we have already indicated some sources of potential problems, namely the trap of taking for granted that the non-parametric part in semi-parametric models is not subject to biases due to misspecification, and the scaling problem of bounded or limited responses. To illustrate this general problem, let us consider the mentioned studies of overall life-satisfaction, which amount to a rich body of literature in economics, social science and psychology. Although we also present a replication analysis, we start out from a reflection on the underlying data generating process (DGP), followed by simulations. The reason is that we can only see and understand what our methods do to the data if we know the true model (or say, the DGP). With this knowledge we can reason back to the DGP of real data when applying the method to them.

1.1.1 A DGP for life-satisfaction over the life span

Historically, methodological approaches to analyzing life-satisfaction (LS) evolved differently in each field. For example, Ferrer-I-Carbonell and Frijters [2004] point out in their literature review that in most economics studies the life-satisfaction measure is taken as an ordinal variable, whereas in studies by psychologists it is assumed to be cardinal. Their common ground is that there is a direct link between (a kind of) the individual’s utility $U_i$ and the individual’s realization of his or her life-satisfaction, such that one can write

$$LS_i = \tilde{G}(U_i); \qquad U_i = \sum_j f_j(x_{ij}). \tag{1.2}$$

Moreover, the individual’s utility can be written as a sum of fundamental components such as wealth, health, etc., which in turn can be expressed as functions of personal characteristics and socio-economic indicators summarized in $x_i$. Clearly, the utility is a latent variable, i.e. not observable, so that its realization is measured through indicators, of which self-reported LS is the most popular one. This is typically recorded on a discrete scale, say from 0 to $K$. $\tilde{G}$ is now the link between utility and LS. It can take different forms, according to the assumptions that the researcher finds plausible to explain the behavior of an individual in choosing categorical numbers that correspond to his or her feelings. The formulation of overall life-satisfaction is then:

$$LS_i = C_k \quad \text{iff} \quad a_k \le U_i < a_{k+1}, \tag{1.3}$$

where $C_k$ is the level of LS chosen by the individual whose perceived utility lies between $a_k$ and $a_{k+1}$. We can set $a = \{a_0, a_1, \ldots, a_{K+1}\}$ with $a_0 = -\infty$ and $a_{K+1} = \infty$. As $\tilde{G}$ is an increasing step function, you might think of it as a composition $D \circ S$ of, first, a monotone link, say $S$, that changes the scale of utility such that $S(a_{k+1}) = S(a_k) + 1$ for $1 \le k \le K$ with $S(a_1) = 1$, and, second, the discretization $D: \sum_{k=1}^{K} 1\{S(U_i) \ge k\}$ that maps the responses to the discrete scale with support $\{0, 1, 2, \ldots, K-1, K\}$. If all the $f_j$’s are linear we end up with a GLM. In practice, however, $G$ (the link function in the GLM) is often set to the identity, especially if the extreme values, 0 and $K$ in our case, are not frequently observed.
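The composition of a monotone rescaling $S$ with the discretization $D$ can be sketched in a few lines. The logistic form of `S`, the value of `K` and the example utilities are hypothetical choices made only for illustration; the sketch checks that applying $D$ to $S(U_i)$ gives the same category as locating $U_i$ among the finite thresholds $a_1 < \ldots < a_K$ with $S(a_k) = k$.

```python
import numpy as np

K = 10  # top category; responses live on {0, 1, ..., K}

def S(u):
    """Monotone rescaling of utility to (0, K + 1); hypothetical logistic form."""
    return (K + 1) / (1.0 + np.exp(-np.asarray(u, dtype=float)))

def D(s):
    """Discretization: sum over k = 1..K of the indicator 1{s >= k}."""
    levels = np.arange(1, K + 1)
    return (np.asarray(s, dtype=float)[..., None] >= levels).sum(axis=-1)

# Finite thresholds a_1 < ... < a_K solving S(a_k) = k for this choice of S.
k = np.arange(1, K + 1)
thresholds = np.log(k / (K + 1 - k))

u = np.array([-4.0, -1.0, 0.0, 0.3, 2.5, 6.0])  # hypothetical utilities
ls = D(S(u))                                     # LS via the composition D(S(u))
ls_from_thresholds = np.searchsorted(thresholds, u, side="right")  # LS via (1.3)
```

Both routes count how many thresholds lie at or below the utility, so they agree whenever the utility does not fall exactly on a threshold.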

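Consider a simple setting where utility is a sum of two fundamental components $f_1$ and $f_2$, which are themselves functions of personal characteristics $X_1$ and $X_2$ respectively. In addition, imagine that $X_1$ and $X_2$ can be correlated but the derivative of $f_2$ with respect to $X_1$, $\partial f_2 / \partial X_1$, is zero. Then:

$$U = f_1(X_1) + f_2(X_2), \qquad LS = \tilde{G}(U),$$

where $LS$ is the life-satisfaction self-reported by individuals and $\tilde{G}$ is the true link between the latent variable $U$ and the observed $LS$. We are interested in the effect of $X_1$ on utility:

$$\frac{\partial U}{\partial X_1} = f_1'(X_1).$$

If $\tilde{G}$ is misspecified in its estimation as $G$, the implied utility $\hat{U}$ satisfies:

$$\hat{U} = G^{-1}(LS) = G^{-1}\big(\tilde{G}(f_1(X_1) + f_2(X_2))\big) = G_2\big(f_1(X_1) + f_2(X_2)\big), \quad \text{where } G^{-1} \circ \tilde{G} \equiv G_2,$$

$$\frac{\partial \hat{U}}{\partial X_1} = G_2'\big(f_1(X_1) + f_2(X_2)\big) \cdot f_1'(X_1).$$

The relative bias in estimating the effect due to misspecification of the link between the latent and the observed variable is then:

$$E\left[\frac{\dfrac{\partial \hat{U}}{\partial X_1} - \dfrac{\partial U}{\partial X_1}}{\dfrac{\partial U}{\partial X_1}}\right] = E\big[G_2'\big(f_1(X_1) + f_2(X_2)\big) - 1\big] = E\big[G_2'\big(f_1(X_1) + f_2(X_2)\big)\big] - 1.$$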
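The relative bias expression, $E[G_2'(f_1(X_1) + f_2(X_2))] - 1$, can be checked numerically. In the sketch below the component functions, the smooth composed link `G2` and the distribution of the covariates are all hypothetical; the code compares a finite-difference version of the relative bias with the closed-form expectation.

```python
import numpy as np

# Hypothetical components of utility and a hypothetical smooth composed link G2.
def f1(x1):
    return 0.8 * x1 + 0.1 * x1 ** 3

def f1_prime(x1):
    return 0.8 + 0.3 * x1 ** 2

def f2(x2):
    return -0.5 * x2

def G2(u):
    return 2.0 * np.tanh(u / 2.0)

def G2_prime(u):
    return 1.0 - np.tanh(u / 2.0) ** 2

# Correlated covariates X1 and X2.
rng = np.random.default_rng(1)
cov = [[1.0, 0.6], [0.6, 1.0]]
x1, x2 = rng.multivariate_normal([0.0, 0.0], cov, size=20000).T

# Effect of X1 on the implied utility G2(f1 + f2), by central differences.
h = 1e-5
implied_effect = (G2(f1(x1 + h) + f2(x2)) - G2(f1(x1 - h) + f2(x2))) / (2.0 * h)
true_effect = f1_prime(x1)

relative_bias_fd = np.mean((implied_effect - true_effect) / true_effect)
relative_bias_formula = np.mean(G2_prime(f1(x1) + f2(x2))) - 1.0
```

Because this particular `G2` has slope at most one, the relative bias is negative here: the misspecified link attenuates the estimated effect of $X_1$.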

When considering the function that transforms the continuous utility to the individual discrete choices the implicit underlying assumptions for the model can be categorized at three levels, increasing in restrictiveness:

1. LS is a discretized monotone transformation of utility. A complete heterogeneity of individuals’ opinions toward happiness is respected, and the error terms can follow a different distribution for each individual.

2. LS is ordinally comparable. Then individuals share a common opinion of what happiness is, but the gaps between consecutive categories are not necessarily equidistant.

3. LS follows a cardinal scale, meaning that the difference between satisfaction levels of, say, 4 and 5 is the same as the difference between 8 and 9. Then $G$ being the identity would be a reasonable link function.

Under the first setting the model is “under-specified”, so that we could just estimate a non-parametric relation between the vector $x_i$ and $LS_i$. The two other settings are more restrictive and suggest different functional forms for $G(\cdot)$. For the fully non-parametric version there is not much to add but, for reasons discussed in the introduction, practitioners do not like this. Therefore we concentrate on the second and third sets of assumptions.

To understand the impact of the assumption on whether LS is ordinal or cardinal, we plotted in Figure 1.1 four “typical” transformations, $S$, that link the utility to the discrete scale of LS. Clearly, each of them refers to a different “typical” population. In the first one (upper left) individuals are aware of the effective range of utility, are neutral towards the discrete scale, and convert their utility into discrete numbers by dividing the effective range of utility into intervals of equal length. In this situation an equidistant transformation, i.e. the identity link, would be appropriate. The most common finding in the related literature, however, is that people have an aversion to extremes, such that they tend to choose more grades from the center of the scale, producing a link as shown in the upper right. In the third population (lower left) individuals tend to be optimistic and thus exhibit an aversion to extremely negative responses; they choose more from the upper scale, producing a right skewed link. Imagine finally the contrary, i.e. a rather pessimistic population, averse to extremely positive responses. They choose more from the lower scale, resulting in a right skewed distribution of the responses (lower right).

Figure 1.1: Transformation, or so-called link function, $S$ related to four typical populations. Panels (each plotting $S(a)$ against $a$): Neutral (upper left), Extreme averse (upper right), Optimistic (lower left), Pessimistic (lower right).
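The four populations can be mimicked with simple monotone maps on a utility range rescaled to $[0, 1]$. The functional forms below are hypothetical stand-ins chosen only to match the verbal descriptions (they are not the curves of Figure 1.1), and the response shares are computed for utilities spread evenly over the effective range.

```python
import numpy as np

K = 10  # responses on {0, 1, ..., K}

# Hypothetical monotone maps g: [0, 1] -> [0, 1]; S(t) = (K + 1) * g(t).
def g_neutral(t):
    return t

def g_extreme_averse(t):
    # Flat in the center, steep at the ends: central categories are chosen more.
    return 0.5 * t + 0.5 * (0.5 + 4.0 * (t - 0.5) ** 3)

def g_optimistic(t):
    # Steep at low utility: low categories are quickly left behind.
    return np.sqrt(t)

def g_pessimistic(t):
    # Flat at low utility: responses concentrate on the lower scale.
    return t ** 2

def response_shares(g, n_grid=100001):
    """Share of each category when utility is evenly spread over [0, 1]."""
    t = np.linspace(0.0, 1.0, n_grid)
    ls = np.clip(np.floor((K + 1) * g(t)), 0, K).astype(int)
    return np.bincount(ls, minlength=K + 1) / n_grid

shares = {
    "neutral": response_shares(g_neutral),
    "extreme_averse": response_shares(g_extreme_averse),
    "optimistic": response_shares(g_optimistic),
    "pessimistic": response_shares(g_pessimistic),
}
mean_response = {name: float(np.arange(K + 1) @ s) for name, s in shares.items()}
```

Under these assumed forms, the neutral population uses all categories equally often, the extreme-averse one piles up in the middle, and the optimistic and pessimistic ones shift the average response up and down respectively.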

There is a keen interest in the literature of LS studies in estimating the effect of age on LS (sometimes referred to as “happiness”). The results have been controversially discussed and are not consistent across similar case studies. For instance, Alesina et al. [2004] find that happiness increases with age up to some point and then decreases, whereas Blanchflower and Oswald [2004] find a convex curvature for this relationship. Ferrer-I-Carbonell and Frijters [2004] claim that this ambiguity in the literature comes from a high correlation of age with all other observed and unobserved factors. That criticism points to a potential omitted variable bias, but less to the problematic choice of $G$ ($S$ and the distribution $F(u \mid x, age)$) or a functional misspecification of the index $\eta(x, age)$. Some researchers try to overcome the latter by means of semi-parametric modeling. The motivation behind it is the hope that flexibility in the functional form of age would provide a correct inference on the marginal effect of age on LS.

For a better understanding of the DGP, imagine now that we observe self-reported LS, age, and a measure of health, say illness. Consider the simplified model that represents the utility with these two components at the individual level (to then link it with LS via (1.3)),

\[ U_i(Age) = W_i(Age) - I_i(Age), \tag{1.4} \]

where $U_i$, $W_i$, and $I_i$ denote the individual’s utility, relative wealth, and illness. It is worth mentioning that we do not intend to take age as a proxy for any missing factor such as health or wealth. On the contrary, we try to depict the channel through which age can affect one’s utility and, consequently, life-satisfaction. This setting does not prevent us from studying the impact of age on LS.³ Imagine relative wealth to be slightly decreasing in age (e.g. by a cohort effect or household size, see Easterlin [2001]) but illness increasing.

It is standard to impose a separability restriction on the utility function, additivity being the most common one.

The main interest then is to study the impact of age on LS for people between 20 and 80 years old, while controlling for the impact of illness because it is the strongest factor for LS in the developed countries. In order to be flexible in age, Wunder et al. [2013] proposed a semi-parametric Partial Linear Model (PLM) of the form (1.1). Now, even if W and I were linear in age and utility were as in equation (1.4), a link as in Figure 1.1 for the second population would cause a non-linearity from LS to illness and age, respectively.

Even more obviously, if we faced the first population (identity link), a non-linear effect of age on wealth or of illness on utility would again cause a non-linearity of age on LS.

As this might be less obvious for the former case, we illustrate the bias transfer that occurs in a PLM in Figure 1.2. Imagine we face S(·) as in the upper right panel of Figure 1.1 and a utility function as in (1.4) with $W_i$ and $I_i$ being linear in age. Then the solid lines in Figure 1.2 are the marginal impacts of illness and age on LS. However, a PLM would restrict illness to have a linear impact (dashed line), with a positive bias for lower levels of illness and a negative one for higher levels. Yet illness is strongly correlated with age, whose impact is estimated non-parametrically. Therefore, the non-parametric estimator tries to (and does) account for the bias obtained in the left panel. This results in the dashed line of the right panel of Figure 1.2. There we can already see a hollow that is simply due to functional misspecification.

It is clear that we would observe the same kind of bias-transfer if the link function had been indeed the identity but illness had a non-linear impact on LS. In both cases the forced linearity of the illness component would cause a bias when estimating the age effect.

1.1.2 Simulation study

While Figure 1.2 already gives an idea of what a PLM fit can provide using the data, the exact final outcome has to be verified in a simulation study. In order to reproduce the effect of having different types of population, or say different link functions S as in Figure 1.1, we generate a random data set with

\[ I_i(age) = -8 + 0.1 \times age_i + \varepsilon_i^h; \quad W_i(age) = 4.5 - 0.03 \times age_i + \varepsilon_i^w; \quad Age \sim U(20, 80), \]

³ We take ’relative wealth’ because people tend to compare their economic situation not with their own past situation but with their own present distribution quantile. See McBride [2001].


[Figure 1.2 panels: LS against Illness (left) and LS against Age (right), with the sign of the bias (+/-) marked along each curve.]

Figure 1.2: Bias transfer from the parametric to the non-parametric part when estimating the components of a PLM. Solid lines indicate the true marginal impacts, dashed lines the expected PLM estimates. Left panel: bias when illness is forced to have a linear impact. Right panel: bias transfer coming from the left panel when age is positively correlated with illness.

of size $n = 1500$, where $\varepsilon^h$ and $\varepsilon^w$ are i.i.d. error terms with standard normal distribution. Then the latent variable $U_i$ is built using equation (1.4). To transform these values into individual responses, we choose four different inverse links $a$ that correspond to the cases in Figure 1.1. In each case this inverse link assigns to the elements of $a$ the equidistant responses {1, 2, 3, ..., 9, 10}; see Figure 1.3.

• Neutral population: a={1.0,2.0,3.0, . . . ,9.0,10.0},

• Extreme averse population: a={1.0,1.4,1.9,2.8,4.2,6.8,8.2,9.1,9.6,10.0},

• Optimistic population: a={1.0,1.2,1.6,2.1,2.7,3.6,4.9,6.4,8.2,10.0},

• Pessimistic population: a={1.0,2.8,4.6,6.2,7.3,8.2,8.8,9.3,9.7,10.0}.
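The data-generating process and the four transformation sets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the seed is arbitrary, and the rule that turns a latent utility into a response (count the cut points in a that do not exceed the utility, then clip to the range 1 to 10) is one plausible reading of the inverse link, not necessarily the exact rule used for Figure 1.3.

```python
import bisect
import random

# Cut points a for the four populations, copied from the list above.
CUT_POINTS = {
    "neutral": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    "extreme_averse": [1.0, 1.4, 1.9, 2.8, 4.2, 6.8, 8.2, 9.1, 9.6, 10.0],
    "optimistic": [1.0, 1.2, 1.6, 2.1, 2.7, 3.6, 4.9, 6.4, 8.2, 10.0],
    "pessimistic": [1.0, 2.8, 4.6, 6.2, 7.3, 8.2, 8.8, 9.3, 9.7, 10.0],
}


def simulate(rng, n=1500):
    """Draw n triples (age, illness, utility) following the design above."""
    sample = []
    for _ in range(n):
        age = rng.uniform(20, 80)
        illness = -8 + 0.1 * age + rng.gauss(0, 1)
        wealth = 4.5 - 0.03 * age + rng.gauss(0, 1)
        utility = wealth - illness  # equation (1.4)
        sample.append((age, illness, utility))
    return sample


def to_response(utility, cuts):
    """Map a latent utility to a response in 1..10.

    ASSUMPTION: the response is the number of cut points that do not
    exceed the utility, clipped to the range of the scale.
    """
    k = bisect.bisect_right(cuts, utility)
    return min(max(k, 1), 10)


if __name__ == "__main__":
    rng = random.Random(1)  # arbitrary seed, for reproducibility only
    data = simulate(rng)
    for name, cuts in CUT_POINTS.items():
        responses = [to_response(u, cuts) for _, _, u in data]
        print(name, sum(responses) / len(responses))
```

With these cut points a utility of 5.5 maps to different grades depending on the population, which is the mechanism that makes the observed LS a non-linear function of illness and age even when utility is linear in both.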

We then fit the PLM in equation (1.1) to the data using a cubic regression spline basis. Recall that this was proposed in the literature to solve the problem of contradictory outcomes on the curvature of the age effect on LS. The resulting outcome is shown in Figure 1.4. The interesting point of these simulations is that the different impacts found in the literature, namely the linearly decreasing, the U-shaped, the inverse U-shaped and, most importantly, the one with the hollow of a mid-life crisis, can all be found in the same scenario, depending on the true link function.

Looking closer at the second case in Figure 1.4, the graph depicts what is referred to as the ’mid-life crisis’, the change in the curvature of the overall LS around the age of 50. However, in our simulation this outcome is nothing but the transferred bias caused by the misspecification of the parametric part, i.e. by neglecting the extreme aversion in the observed responses.

To stop and conclude here would result in a negative output and a somewhat destructive criticism. In order to turn it into a more constructive contribution that is not limited to pointing out a misunderstanding of semi-parametric methods and misleading conclusions, we discuss in the next section how such an empirical study could be completed. We do not speak of ’solving the problem’ because, as long as the true link function is unknown, the safest way of studying the marginal impact of age on LS (conditional on health or other factors) seems to be the use of purely non-parametric methods. However, if the objective of selecting G(·) is to correct for situations like extreme aversion, i.e. something that has nothing to do with the perceived utility but with how LS is recorded (one might think of a measurement error that is systematically positive for low utilities but negative for large ones), then even non-parametric methods cannot help. In such a case, identification requires the imposition of (additional) model assumptions, which should be based on studies that investigate this potential measurement error, i.e. the response behavior of people when facing discrete bounded scales. For instance, Mullainathan and Bertrand [2001] have pointed out the necessity of studying people’s attitude toward subjective and categorical survey questions; see also Krueger and Schkade [2008] regarding the reliability of self-reported subjective well-being or LS. Later, Frijters and Beatton [2012] and Kassenboehmer and Haisken-DeNew [2012] also pointed to the problem of controversial results for the functional form of LS in age, and proposed, in different ways, to solve it by including panel fixed effects in the model to account for differences in individual behavior. Following the same line of reasoning, Wooden and Li [2014] argue that the availability of a panel structure is necessary for the study of subjective well-being (life-satisfaction). Chadi [2013] provides evidence on the effect of interviewer encounters on the responses in panel studies and suggests that in studies of life-satisfaction an interviewer fixed effect must be included in the model. We argue that these proposed solutions can all be regarded as robustness checks in the analysis of life-satisfaction, but they do not guarantee an unbiased estimation of the functional form of life-satisfaction in age. The main reason is that misspecification in the link function and/or the index is not separable unless additional assumptions (restrictions) are imposed.

Figure 1.3: Simulated link function

While focusing on the semi-parametric specification of the life-satisfaction models and letting age behave non-parametrically, the role of bandwidth selection should also be mentioned. It is true that the choice of bandwidth for each non-parametric component of the model determines its degree of smoothness, but it affects the functional form only to this extent, i.e. it can make a slope flatter but not change its direction. Typically, these choices are made by optimizing the trade-off between the variance and the bias of the estimator; in contrast, at least to our knowledge, there exists no bandwidth choice procedure that prevents a bias transfer between the components of a semi-parametric model. In all the analyses of this chapter, the Generalized Cross Validation (GCV) method is used to choose the optimal bandwidths for the non-parametric components.

Figure 1.4: Results of the fitted PLM model, in cases where each of the four links is the true one.
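For reference, the GCV criterion mentioned above can be written in its usual form for a linear smoother with identity link. The notation ($\lambda$ for the smoothing parameter, $\hat{m}_\lambda$ for the fitted function, $A_\lambda$ for the smoother or hat matrix) is ours and does not appear in the original text; the display is a sketch of the standard criterion, not necessarily the exact variant implemented in the software.

```latex
\[
\mathrm{GCV}(\lambda)
  = \frac{n \sum_{i=1}^{n} \bigl( y_i - \hat{m}_\lambda(x_i) \bigr)^2}
         {\bigl( n - \operatorname{tr}(A_\lambda) \bigr)^2},
\qquad
\hat{\lambda} = \arg\min_{\lambda} \mathrm{GCV}(\lambda).
\]
```

The numerator rewards goodness of fit, while the trace of $A_\lambda$ (the effective degrees of freedom) penalizes wiggliness. The criterion only balances variance against bias within the fitted model and contains no term that could detect a bias transferred from a misspecified parametric part.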

1.2 Life-Satisfaction in Germany: Empirical results

We first replicate the study of Wunder et al. [2013] and afterwards explore to what extent other model specifications would lead to different conclusions. That is, we are interested in whether the conclusion of having detected the mid-life crisis is a by-product of misspecification of the model and thus a transmission of the bias to the non-parametric part.

The data at hand come from the German Socio-Economic Panel (SOEP, hereafter). Although 30 rounds of data are available, we follow Wunder et al. [2013] and use only the data from 1986 to 2007, without the years 1990 and 1993, for which the number of nights in hospital was not registered. The response of interest is the “Overall Life-Satisfaction” (LS). It is a discrete categorical response, evaluated by the respondents on a scale from 0 to 10, where ‘0’ corresponds to ‘completely dissatisfied’ and ‘10’ to ‘completely satisfied’. A histogram of the responses, pooling all the data over the course of this study, is shown in Figure 1.5; it illustrates that the unconditional distribution of the responses is strongly skewed to the left, having its mode at 8.

Figure 1.5: Observed responses to “overall life satisfaction”

The main interest is to estimate the marginal impact of age on LS, where age ranges from 16 to 100 with a mean of around 46. The other covariates included are personal characteristics that are typically used in similar studies, such as illness (nights in hospital, disabilities), relative wealth (household income and size, education, nationality, employment status), and family situation. We first fit a PLM as in (1.1), i.e. where the link function is considered to be the identity. The graph in Figure 1.6 then presents the marginal effect of age on LS. A diagnostic check of this model is presented in Figure A.1 (left panel) in Appendix A; it indicates problems at the tails, possibly due to extreme aversion.

Figure 1.6: Life-Satisfaction over the life span (age) in Germany using the PLM.

Given our discussions, theoretical derivations and simulation results in the previous sections, one may be tempted to directly disregard this outcome as a bias effect transmitted from ignoring the extreme aversion of respondents, or from potential non-linearities in the index. Fortunately, nowadays it is no longer that hard to check whether a more flexible modeling will produce a different outcome. In the next section we explain a general class of models that we used to perform a robustness check on the functional form of life-satisfaction in age for different model specifications.


1.2.1 General class of models

To be more specific, let us group the covariates in two sets, namely one comprising the dummy variables, D, and one comprising the others, X, and consider

\[ E[LS \mid Age, D, X] = G\left( m(Age) + m_x(X) + m_I(X, Age) + D^T \beta \right), \tag{1.5} \]

with $m_x$ being either non-parametric or linear, and $m_I$ a non-parametric interaction term (in some specifications set to zero). As an alternative to the flexibility we concede to our index function $\eta(\cdot)$, we can allow for flexibility in the link function G.

We are aware of the fact that there exist many more model specifications than those comprised in (1.5), e.g. generalized varying coefficient models; for a review see e.g. Roca-Pardinas and Sperlich [2007]. But the objective is not to find the optimal data fit or a model specification for which one can no longer replicate the shape of Figure 1.6; the aim is to check whether intuitively reasonable modifications of (1.1) make the hollow, interpreted as ’mid-life crisis’, disappear. In this respect we fit a series of plausible models as a robustness check for potential misspecification.

The special cases arising from equation (1.5) are as follows:⁴

Generalized Linear Model (GLM):

Using equation (1.5) we can also express a fully parametric model. Here we consider a GLM with G being an arbitrary monotone link, i.e. an Ordered Logit Model, with $m(Age) = \sum_{j=0}^{3} \gamma_j Age^j$ with unknown coefficients $\gamma_j$, and $m_x$ a linear function of the continuous covariates. In this case $m_I(\cdot)$ is explicitly set to zero.

Partial Linear Model (PLM):

In the class of semi-parametric models, the PLM arises when m(Age) is estimated non-parametrically, mI(.) is set to zero, mx(.) is restricted to be linear in X, and G is the identity function.

Generalized Partial Linear Model (GPLM):

As in the PLM, we consider m(Age) to be a smooth function and the rest of the covariates to enter linearly, with no interactions. However, in this case G is no longer the identity. We consider a Poisson link function with Life Dissatisfaction (instead of Life Satisfaction) as the dependent variable. We justify this choice more precisely in Section 1.2.4.

Additive Models (AM):

G is again set to the identity; however, the linear model is now a sum of smooth functions of the covariates. That is, not only is m(Age) a smooth function, but mx is also a sum of smooth functions, one for each continuous covariate. To fit such a model, one can represent each smooth function as a linear combination of orthogonal basis functions, and the fitting algorithm estimates the coefficient of each basis function. In the original setting of the AM we consider no interaction, that is to say mI(.) = 0. Later we also consider the presence of interactions between covariates by modeling the interaction of Age with all the other continuous variables. To do so, one needs to use tensor product bases.⁵

Generalized Additive Models (GAM):

Generalized Additive Models follow the AM in the sense that the index is still a sum of smooth functions of the covariates; however, the response may follow any exponential family distribution. To fit the model, a penalized iteratively re-weighted least squares algorithm must be used. For equation (1.5), this means that m(Age) and mx(·) are a smooth function and a sum of smooth functions, respectively. In our settings we consider mI(.) to be zero. G is now a known link function. As in the GPLM, we consider a Poisson link on the mirror of the responses.

⁴ For the sake of consistency and coherence, all the semi-parametric models introduced in this chapter are fitted using the function gam() from the mgcv package in R. Only the ordered logit model is fitted using polr() from MASS.

⁵ The estimation of the non-parametric components in this section is done using cubic regression splines. The smoothing parameter or the number of knots is chosen using Generalized Cross Validation (GCV).

The strategy for the remainder of this chapter is to run the following models: a semi-parametric Additive Model (AM), then one enriched by interaction terms, a Generalized Additive Model (GAM) with pre-specified link function, and finally a GLM with arbitrary monotone link G.

1.2.2 Semi-parametric additive model

In a first step we set G = identity but give full flexibility to the impact of each covariate under the restriction of additive separability. As most covariates are dummy variables, we apply spline functions only to Age, nights in hospital (NinH), years of education (YofEdu), log of net household income (LNHI), and log of household size (LHS) to estimate

\[ E[LS \mid Age, \cdots, D] = m(Age) + m_1(NinH) + m_2(YofEdu) + m_3(LNHI) + m_4(LHS) + D^T \beta. \]

The estimate of m(Age) is plotted in Figure 1.7. We can see that the curvature of the age effect is flatter in this setting but shows the same behaviour as in the simpler PLM. The other estimates are in compliance with the existing literature, cf. Ferrer-i-Carbonell [2005], Frey and Stutzer [2002] and Di Tella et al. [2003]; for instance, there is a clear downward slope for LHS.⁶

Figure 1.7: Life-Satisfaction over the life span (age) in Germany using the Additive Model.

1.2.3 Semi-parametric additive model with interaction

The non-parametric additive second-order interaction model that we estimate is of the form

\[ E[LS \mid Age, \cdots, D] = m(Age) + m_1(NinH) + m_2(YofEdu) + m_3(LNHI) + m_4(LHS) + m_5(Age \times NinH) + \cdots + D^T \beta, \]

⁶ The family’s limited resources must be divided between members; an increase in the family size leads to a decrease in the available resources per capita and thereby to a downward trend in happiness.


where the interactions of all continuous variables with age are included. To fit this model we use tensor product smoothers. A tensor product interaction within a non-parametric additive model is considered an appropriate choice when main effects as well as interaction effects are present, whereas thin-plate smoothers are recommended in cases where the one-dimensional functions are suppressed. The resulting estimate of m(Age) is given in Figure 1.8, left panel. It can be seen that there is a change in the curvature of the marginal effect function m(Age), but the hollow around 50 is still clearly visible.

It could be argued that for this model the marginal impact of interest is $m(Age) + \int \sum_j m_{4+j}(Age \times X_j) \, dF_X(x)$, with $F_X(\cdot)$ being the cumulative joint distribution of the X’s; but given the results for the purely additive model above, it is not surprising that this does not make the valley around the age of 50 disappear, see Figure 1.8 (right panel).
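In practice the integral over $F_X$ is typically replaced by an average over the observed covariates. The following sketch shows that computation for given (already estimated) functions. All names are hypothetical, the toy functions in the usage lines are invented for illustration only, and the empirical average is our assumption about how the integral would be approximated, not a description of the implementation behind Figure 1.8.

```python
def averaged_age_effect(age, m_age, interactions, x_sample):
    """Return m(age) plus the sample average of the interaction terms.

    m_age:        function age -> float, the main effect m(Age)
    interactions: list of functions (age, x_j) -> float, one per covariate,
                  playing the role of m_{4+j}
    x_sample:     list of covariate tuples (x_1, ..., x_J), one per individual
    """
    n = len(x_sample)
    total = 0.0
    for x in x_sample:
        # Sum over j of m_{4+j}(age, x_j) for this individual.
        total += sum(m_j(age, x_j) for m_j, x_j in zip(interactions, x))
    return m_age(age) + total / n


if __name__ == "__main__":
    # Toy functions, purely illustrative (not estimates from the SOEP data).
    main_effect = lambda a: -0.01 * a
    inter = [lambda a, x: 0.001 * a * x]
    covariates = [(0.0,), (2.0,), (4.0,)]
    print(averaged_age_effect(50.0, main_effect, inter, covariates))
```

Evaluating this function on a grid of ages gives the curve that corresponds to the right panel, while m_age alone corresponds to the left panel.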

Figure 1.8: Marginal effect of age in the additive model with interaction: m(Age) on the left, and $m(Age) + \int \sum_j m_{4+j}(Age \times X_j) \, dF_X(x)$ on the right.

An alternative interaction specification could be to let gender interact with all variables. This is motivated by the various studies suggesting that the model of life satisfaction differs a lot by gender. Figure 1.9 shows the outcomes for m(Age) in a PLM as in (1.1), but separated by gender. It can be seen that the valley is indeed much deeper for men than for women. The shape, however, is basically the same.

Figure 1.9: Marginal effect of age in the PLM for samples separated by gender: m(age) for female on the left, and for men on the right, respectively.

1.2.4 GAM with Poisson link function for (dis-)satisfaction

The obvious alternative to relaxing the index function $\eta$ is to allow for more flexibility in the link G. However, when considering a generalized PLM, i.e. model (1.5) with $m_x$ being a linear function and $m_I \equiv 0$, it is advisable to specify the link parametrically. Although there exist estimators even for generalized additive models with unknown (monotone) link G, they are numerically extremely unstable unless the sample size is huge and the impacts of all covariates are almost independent of each other. From Figure 1.5 one would guess that a mirrored Poisson distribution might do. When we fit a GAM with the Poisson link function on the mirror of the responses (the response is now the ‘life dissatisfaction’ defined by $DS = 10 - LS$), we obtain for m(Age) the function plotted in Figure 1.10. Again, the resulting outcome could be interpreted as a justification of the existence of the mid-life crisis. Some diagnostic graphs of the model are shown in Figure A.1 (right panel) in Appendix A. We now observe a much better behavior at the tails than for the original PLM. This indicates that the extreme aversion problem has been solved by this link. Interestingly, the mirrored m(Age) estimate has basically the same shape as we found for the PLM, cf. Figure 1.6, though the hollow (here a bump) has slightly shifted towards 55.
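To see why a hollow in life-satisfaction appears as a bump in this specification, it helps to write the model out. The display below is a sketch that assumes the canonical log link of the Poisson family and $m_I \equiv 0$; the text itself only speaks of a “Poisson link”, so the exact link is our assumption.

```latex
\begin{align*}
DS &= 10 - LS, \\
\log E[DS \mid Age, X, D] &= m(Age) + m_x(X) + D^T \beta, \\
E[LS \mid Age, X, D] &= 10 - \exp\bigl\{ m(Age) + m_x(X) + D^T \beta \bigr\}.
\end{align*}
```

Since the exponential is increasing, a larger value of m(Age) implies a larger expected dissatisfaction and therefore a smaller expected LS. A bump of m(Age) on the dissatisfaction scale thus corresponds to a hollow on the original LS scale.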

Figure 1.10: Marginal effect of age using the Poisson link with mirrored responses: m(Age) on the left, and $E_{X,D}\left[ G\left( m(Age) + m_x(X) + D^T \beta \right) \right]$ on the right.

One could argue that the function of interest is rather $E_{X,D}\left[ G\left( m(Age) + m_x(X) + D^T \beta \right) \right]$. This, however, is not necessarily true, as we discussed above: if extreme aversion is not related to real life-satisfaction but just to reporting it, then G only corrects for the reporting error. And as we said, the question of whether it is a ’reporting error’ or not is an identification problem that cannot be answered from these data.

1.2.5 An ordered logit model with cubic age-function

On the grounds of the above arguments we now fit an ordered logit model. The link G is thus defined via the logistic distribution of the residual terms of the latent equation for utility $U_i$ in model (1.3), and the unknown $a_k$. As the $a_k$ can be freely chosen with the sole restriction $a_{k-1} \le a_k$, monotone transformations are automatically captured. As for the impact of age, the model adopts a cubic function, see Figure 1.11 (left panel). Due to the parametric restriction we had to use here, the estimate cannot be the same as those we saw for the PLM or the GAM with Poisson link; but the main finding and conclusion are the same. Here, too, the potential criticism applies that the function of interest might rather be $E_{X,D}\left[ G\left( m(Age) + m_x(X) + D^T \beta \right) \right]$ (with $m_x$ being linear); but this time the problem is even more complex because G eventually captures both extreme aversion and functional (mis-)specification of the index $\eta(\cdot)$.
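The role of the cut points can be made concrete with a small sketch of the ordered logit response probabilities. It assumes the common parametrization P(LS ≤ k) = Λ(a_k − η), with Λ the logistic distribution function, which is the convention used by polr() in MASS. The function names and all numerical values are hypothetical and are not estimates from the SOEP data.

```python
import math


def logistic_cdf(z):
    """Standard logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def cubic_index(age, coefs):
    """Cubic age index c1*age + c2*age^2 + c3*age^3.

    The intercept is omitted because it is absorbed by the cut points.
    """
    c1, c2, c3 = coefs
    return c1 * age + c2 * age**2 + c3 * age**3


def ordered_logit_probs(eta, cuts):
    """Category probabilities for len(cuts) + 1 ordered categories.

    ASSUMPTION: P(response <= k) = logistic_cdf(cuts[k] - eta).
    """
    if any(b < a for a, b in zip(cuts, cuts[1:])):
        raise ValueError("cut points must be non-decreasing")
    cdf = [logistic_cdf(a - eta) for a in cuts] + [1.0]
    # Differences of consecutive cumulative probabilities.
    return [cdf[0]] + [cdf[k] - cdf[k - 1] for k in range(1, len(cdf))]


if __name__ == "__main__":
    # Ten hypothetical cut points for the eleven grades 0..10.
    cuts = [-4.0, -3.2, -2.5, -1.9, -1.2, -0.4, 0.5, 1.6, 3.0, 4.5]
    eta = cubic_index(0.45, (1.0, -2.0, 1.5))  # age rescaled, illustrative
    print([round(p, 3) for p in ordered_logit_probs(eta, cuts)])
```

Because only the ordering of the cut points is restricted, unequal spacing between them is what allows the fitted link to depart from the identity.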


Figure 1.11: Estimates of ordered logit model: impact of age on LS on the left, and a study of the link on the right (along the estimated cut points).

The estimated cut points correspond neither to the link functions we considered in our simulation study, nor to the identity link function that is generally assumed in the related literature. These cut points actually suggest a link function that yields a left-skewed conditional distribution of the responses. As said, this unfortunately interacts with the modeling of the index function $\eta$. To provide a better ground for comparison, we depict in Figure 1.11 (right panel) the identity link and use the estimated cut points (of the ordered logit model) as a transformation set (recall a for the extreme averse population) to plot the corresponding link. This provides further evidence that the link function differs from the identity; moreover, individuals in this data set might correspond to the optimistic population group introduced in Figure 1.1.

At this point one can go even further by considering a non-parametric generalized additive model with an unknown link, as proposed by Horowitz [2001] as a less restrictive dimension reduction approach. However, this method requires a location, sign and scale normalization, which makes the results more challenging to interpret. Besides, this method is not implemented in existing software, so it is of little use to practitioners.

1.3 Conclusion of the chapter

Originally, in statistics semi-parametric techniques were introduced as a way of circumventing the curse of dimensionality of their non-parametric counterparts, still providing some sort of functional form flexibility. In econometrics they were quite welcome to avoid functional misspecification in nuisance parts of the model, but also because their outcomes were much easier to interpret than those of purely non-parametric models. In more recent literature, semi-parametric methods are also used in a so far less common way, relaxing the functional form in the interesting parts of the model but using stiff modeling approaches for the nuisance parameter.

In the first part of this chapter we recall that the non-parametric part of the model is unfortunately not free from the underlying model assumptions in the parametric part.

Along with the popular example of life-satisfaction analysis, we show how biases caused by a misspecification in the nuisance parameter transfer to the part of interest. Moreover, the flexibility of the non-parametric part in such a scenario becomes a disadvantage, as it tries to absorb the bias coming from the parametric part. We have shown in a simulation study on life-satisfaction that the concept of mid-life crisis could merely be a result of such a bias transfer.

We consider the recently published study of Wunder et al. [2013]. Applying a PLM to SOEP data, they found a hollow at the age of 45 to 50, interpreted as a mid-life crisis. We first replicate their numerical findings and afterwards critically discuss their conclusions with reference to our findings in the previous sections: this finding could be a consequence of their model specification. We show how an appropriate robustness check could be performed and, if applicable, how the model and the empirical analysis could be improved. We could verify that the mid-life crisis (the change in the curvature of the marginal effect of age) is observed across all our different models. That is, even though our simulations show that Wunder et al. [2013] were overconfident about their estimation outcome when concluding to have found empirical evidence for the mid-life crisis, our robustness check confirms their conclusion.

In the existing literature, economists and sociologists provide justifications for the existence of a mid-life crisis. Blanchflower and Oswald [2008] consider a U-shaped function of life-satisfaction in age, which is due to the underlying assumptions of their model, but the first argument that they provide is that individuals learn how to adapt to their environment and at mid-life leave behind their infeasible dreams. This partially and for some time compensates the continuous negative effect of aging (e.g. health reduction) and creates the hollow that is regarded as the mid-life crisis. It is worth noting that our analysis also confirmed that an identity link function (as is usually assumed in such studies) is inappropriate.


Chapter 2

Robust analysis of wage distribution¹

In empirical analysis there is a tendency to focus on the estimation of marginal impacts at the mean or median of the outcome distribution. The reason is that these methods are well established and mathematically much easier to handle. Consider the case of the linear model $E(Y \mid X) = X\beta$. We can fit the model to the data by using OLS. The law of iterated expectations provides $E(Y) = E_X[E(Y \mid X)] = E(X)\beta$. This allows for an unconditional mean interpretation of $\beta$, i.e. $\beta$ is the effect of an increase in the mean of X on the unconditional mean of Y. If the data generating process is Gaussian, then the least squares estimator is the most efficient, yet it is well known that such an estimator is extremely sensitive to the presence of outliers. This makes such estimators very poor in many non-Gaussian, especially heavy-tailed, situations, see Koenker and Bassett [1978]. Therefore, while it is very popular to look at the marginal effects of the variables of interest at the mean of the outcome, in many cases this is neither sufficient nor appropriate.
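Both points of this paragraph can be checked on a toy example: averaging group means with group shares reproduces the overall mean (the law of iterated expectations), and a single outlier moves the mean far more than the median. The numbers below are invented for illustration and the function name is hypothetical.

```python
import statistics


def mean_of_group_means(groups):
    """Weighted average of group means, i.e. the sample analogue of E_X[E(Y | X)]."""
    n = sum(len(g) for g in groups)
    return sum(len(g) * statistics.fmean(g) for g in groups) / n


if __name__ == "__main__":
    # Two groups of outcomes, e.g. wages for two values of a covariate.
    groups = [[1.0, 2.0, 3.0], [10.0, 20.0]]
    pooled = [y for g in groups for y in g]
    print(mean_of_group_means(groups), statistics.fmean(pooled))

    # One contaminated observation: the mean reacts, the median does not.
    clean = [10.0, 11.0, 12.0, 13.0, 14.0]
    contaminated = [10.0, 11.0, 12.0, 13.0, 1000.0]
    print(statistics.fmean(clean), statistics.fmean(contaminated))
    print(statistics.median(clean), statistics.median(contaminated))
```

The same weighted averaging applied to group medians would in general not return the pooled median, which is the reason why quantile effects need a separate treatment.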

A robust alternative to least squares is least absolute error estimation, which minimizes the sum of absolute values of the residuals and analyzes the results for the median. Moreover, in cases such as the analysis of income or expenditure data it can be of interest to study the effect of a particular variable at different quantiles of the outcome. These quantile marginal effects are non-linear parameters for which unconditional interpretations are no longer straightforward. For instance, the law of iterated expectations does not apply in the case of quantiles, i.e. $Q_\tau \neq E_X[Q_\tau(X)] = E(X)\beta_\tau$. Firpo et al. [2009] have proposed an alternative regression approach, based on the Re-centered Influence Function, that allows for computing the marginal effects on the unconditional functional statistics of interest, given that the closed form influence function is provided under some general assumptions.
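For a quantile, the re-centered influence function has the well-known closed form RIF(y; q_τ) = q_τ + (τ − 1{y ≤ q_τ}) / f_Y(q_τ), where f_Y is the density of the outcome. The sketch below computes it for a sample. The Gaussian kernel density estimate, the bandwidth, and all function names are illustrative assumptions on our side and are not taken from the chapter.

```python
import math


def sample_quantile(ys, tau):
    """Smallest order statistic whose empirical distribution value reaches tau."""
    ordered = sorted(ys)
    k = max(math.ceil(tau * len(ordered) - 1e-12) - 1, 0)
    return ordered[k]


def gaussian_kde_at(ys, point, bandwidth):
    """Gaussian kernel density estimate of the outcome density at one point."""
    scale = 1.0 / (len(ys) * bandwidth * math.sqrt(2.0 * math.pi))
    return scale * sum(
        math.exp(-0.5 * ((point - y) / bandwidth) ** 2) for y in ys
    )


def rif_quantile(ys, tau, bandwidth):
    """Re-centered influence function of the tau-quantile for each observation."""
    q = sample_quantile(ys, tau)
    f = gaussian_kde_at(ys, q, bandwidth)
    return [q + (tau - (1.0 if y <= q else 0.0)) / f for y in ys]


if __name__ == "__main__":
    wages = [float(v) for v in range(1, 101)]  # toy data, not real wages
    rif = rif_quantile(wages, tau=0.5, bandwidth=5.0)
    # The sample mean of the RIF recovers the quantile itself.
    print(sample_quantile(wages, 0.5), sum(rif) / len(rif))
```

In a RIF regression these values replace the outcome in a regression on the covariates. Because the mean of the RIF equals the quantile, the law of iterated expectations can then be applied to the regression of the RIF, which is what gives the coefficients their unconditional interpretation.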

This chapter presents empirical evidence on the impact of minimum wages on different quantiles of the wage distribution in developing countries. We use conditional quantile regression, by Koenker and Bassett [1978], and unconditional quantile regression, by Firpo et al. [2009], on data from four developing countries, namely Brazil, India, Indonesia and Mexico. The findings obtained when taking the institutional impact into consideration in the conditional and unconditional quantile regressions are then discussed. The purpose of this analysis is to estimate the effect of minimum wages on the wage distribution of those workers who are covered by minimum wage legislation. This study does not aim

¹ This chapter is based on joint work with Dr. Uma Rani Amara, Senior Development Economist, Research Department, International Labour Office Geneva. However, all possible errors or omissions in this script are solely my own responsibility.
