Discrete choice pseudo panel data models

BALDE, Thierno

Abstract

Panel data are today of prime importance for analyzing the behavior of micro-units. In many countries, however, such data do not yet exist. Researchers can use repeated surveys instead. In that case, since the same unit cannot be followed over time, the analysis moves to the cohort level while individual heterogeneity is introduced into the model. Cohorts built according to homogeneity criteria are the "individuals" of the new panel. This is known as the pseudo-panel data approach. The thesis comprises three chapters. The first deals with the estimation of binary choice models with individual effects when pseudo-panel data are available. The second proposes an approximation, by the beta distribution, of the exact theoretical distribution obtained in the first chapter. The third analyzes the impact of women's autonomy on health care use in India using pseudo-panel data.

BALDE, Thierno. Discrete choice pseudo panel data models. Doctoral thesis: Univ. Genève, 2014, no. SES 873

URN : urn:nbn:ch:unige-482371

DOI : 10.13097/archive-ouverte/unige:48237

Available at:

http://archive-ouverte.unige.ch/unige:48237

Disclaimer: layout of this document may differ from the published version.

Discrete Choice Pseudo Panel Data Models

Thesis presented to the Faculty of Economic and Social Sciences of the University of Geneva

by Thierno BALDE

for the degree of

Doctor of Economic and Social Sciences, specialization in Econometrics

Members of the thesis jury

Prof. Jaya Krishnakumar, thesis supervisor, University of Geneva
Prof. Stefan Sperlich, chair of the jury, University of Geneva
Dr. Jean-Paul Chaze, University of Geneva
Prof. Jean-Marie Dufour, McGill University, Canada

Geneva, 17 December 2014
Thesis No. 873

… has authorized the printing of this thesis, without thereby expressing any opinion on the propositions stated in it, which are the sole responsibility of their author.

Geneva, 17 December 2014

The Dean

Bernard MORARD

Printed from the author's manuscript.

Contents

Acknowledgements
Abstract
Résumé

1 Estimation of a model with grouped binary dependent variables
1.1 Introduction
1.2 Model specification
1.3 Maximum likelihood estimator
1.4 The genetic algorithm procedure
1.5 Monte Carlo Study
1.5.1 Experimental design
1.5.2 Measures of performance
1.5.3 Simulation results
1.6 Conclusion
1.7 Appendix
1.8 Reference

2 Study of the probability distribution of grouped binary variables
2.1 Introduction
2.2 Construction of the model
2.2.1 A binary choice model at the individual level
2.2.2 Probability distribution at the individual level
2.2.3 Probability distribution of cohort mean
2.3 Comparison of the exact distribution with the beta approximation
2.3.1 Monte Carlo study
2.3.2 Goodness-of-Fit (Kolmogorov-Smirnov) Test for the two distributions
2.3.3 Goodness-of-Fit (Cramer-Von Mises) Test for the two distributions
2.3.4 Kullback-Leibler Divergence
2.3.5 Results
2.3.6 Comments
2.4 Conclusion
2.5 Reference

3 The effect of women’s autonomy on health care use by children
3.1 Introduction
3.2 Research Question
3.3 Women’s autonomy: Concepts and Issues in the Literature
3.3.1 Women’s autonomy and its relationship to children’s access to health care
3.4 Data
3.5 Criteria for Construction of the Pseudo-Panel Database
3.6 Variable Specification
3.6.1 Dependent variable
3.6.2 Independent Variables
3.7 Econometric Approach
3.8 Results and Discussion
3.8.1 Pseudo-panel results by the woman’s age group
3.8.2 Pseudo-panel results by geographical region
3.8.3 Results of a Probit model
3.8.4 Further results
3.9 Conclusion
3.11 References

Conclusion Thesis

Acknowledgements

I finally reach the end of this endeavour, which has been such a big part of my life.

The time has come to look back over the road travelled and reflect on the obstacles I had to overcome, but even more on the individuals who helped me along the way.

First and foremost I wish to express my profound gratitude to my thesis director, professor Jaya Krishnakumar. She played a central role in mapping out the path I was to follow when I attended her econometrics classes as a student. She nurtured my passion for econometrics, and her expertise in that field was decisive in shaping the direction taken by this work. I am thankful for her human and scientific strengths, her priceless help, and the confidence, kindness, and support she showed me during all these years of working on my thesis. Her many readings, comments, suggestions, and corrections made writing this paper possible. Without her help, I would have been unable to produce this thesis. My intellectual debt to her is immense and can never be repaid: I will always be grateful.

I also wish to thank professor Stefan Sperlich, who presided over the jury and whose comments allowed me to improve the quality of this work. I was his assistant and thank him for his confidence in me. He can be assured of my deepest respect and profound gratitude.

I had near-daily interactions with Doctor Jean-Paul Chaze, who agreed to sit on the jury. I wish to especially extend my sincere gratitude for the time he spent reading my work: his comments and corrections were of great help when I was finalizing this paper. I am also very thankful for the pleasant moments we shared over a coffee... and will not soon forget them!

My deepest gratitude also goes out to professor Jean-Marie Dufour of McGill University (Canada), who agreed to sit on my thesis committee despite his busy agenda.

Just a few years ago it would have been unimaginable to me that he would be on my thesis committee. This was truly my good fortune and a great honour. I would also like to thank him for his constructive comments and insightful suggestions during the pre-defense. He, too, can rest assured of my deep gratitude!

These years of working on the thesis are intimately linked with the teaching I gave as Assistant at the University of Geneva. I would like to thank the various professors with whom I worked, including: professor Christian Gouriéroux, professor Gérard Antille, professor Daniel Royer, and professor Gilbert Ritschard.

server at Uni Dufour (University of Geneva), particularly Yan Sagon and Jean-Luc Falcone. Without this powerful computer I would have been unable to run all my Matlab programs.

I am further grateful to all my friends in Geneva, Guinea, and Canada, who were generous in encouraging me and always ready to lend an ear. I particularly wish to thank my best friend, David Kasdas, for his loyal companionship. Our discussions on all subjects, our jokes, our outings, and our trips were like deep breaths of fresh air for me. He was always there when I needed him, especially during the most trying periods in my personal life while I was writing this thesis . . . I am in his debt. Thank you from the bottom of my heart!

I was fortunate enough to be accompanied on this journey by my dear family and, especially, by my parents, my brothers and sisters. Despite the fact that I was absent for many years, their trust, their caring, their love, and their unfailing support always enabled me to overcome obstacles. Thanks to my father and mother for having made me the person I am today. I also think of my maternal grandmother, who never ceased encouraging me in my studies and with whom I would like to share the end of this journey.

Finally, I conclude with my most heart-felt thanks, which must naturally be for my dear companion, Anne, for her support, caring, strength, humour, joy, kindness, originality, love . . . and especially for the understanding she always showed me during the most difficult times. So great was her contribution that this is, truly, our thesis. Thank you for everything!

Abstract

Panel data are important for analyzing micro-level behavior. Why? They enable us to model and estimate individual heterogeneity. However, even today, these data do not exist for many countries. Instead, the researcher may find annual household surveys based on a large random sample of the population. In such a case, given the impossibility of following the same unit over time, heterogeneity is introduced by following cohorts rather than individuals. Deaton (1985) suggests building cohorts according to chosen criteria of homogeneity and treating these cohorts as “individuals” in a new panel. This approach is called the pseudo-panel data approach. The literature in this area has mainly focused on linear models (static and dynamic) with individual fixed (random) effects. However, many economic investigations require nonlinear models. This research is therefore devoted to nonlinear pseudo-panel models (Discrete Choice Pseudo Panel Data Models).

The motivation of this thesis is firstly to take account of individual heterogeneity in a model when we only have pseudo panels, and secondly, to deal with situations where individual data are not available for the response variable but only at the aggregate level.

In the first paper, we derive a probability distribution for the mean of binary variables from cohort data following a discrete approach. We propose an appropriate estimation method and study the properties of estimators by Monte Carlo experiments.

Given the complexity of the distribution of the aggregate variable, in the second paper, we make a comparative study of the discrete approach with a continuous approach based on a beta law. This is in order to investigate to what extent our exact discrete distribution can be approximated by the (fitted) beta distribution.

The third paper is an empirical application of the methodology developed, analyzing the impact of women’s autonomy on the use of health care by children in India.

Résumé

Panel data are today of prime importance for analyzing the behavior of micro-units. Why? Because they make it possible to account for individual heterogeneity. In many countries, however, such data still do not exist. Researchers can instead use annual surveys based on a large random sample of the population. In that case, since the same unit cannot be followed over time, how can individual heterogeneity be introduced into the model? By moving to the cohort level. Deaton (1985) recommended building cohorts according to homogeneity criteria and treating these cohorts as the “individuals” of the new panel. This is known as the pseudo-panel data approach. The literature in this area has mainly focused on linear pseudo-panel models (static and dynamic) with fixed (random) individual effects, whereas many economic situations today call for nonlinear models. This thesis therefore deals with nonlinear pseudo-panel models (discrete choice models).

The motivation of this thesis is therefore:

- To account for individual heterogeneity in the model when we have pseudo panels.

- To handle situations in which individual data are not available for the response variable, but only data at the aggregate level.

In a first paper, we derive, by a discrete approach, the distribution of the mean of the choices of a given cohort conditional on the individual explanatory variables of that cohort. We propose an appropriate estimation method and study the properties of the estimators through Monte Carlo experiments.

Given the complexity of the distribution of the aggregate variable, in a second paper we compare this discrete approach with a continuous approach based on a beta distribution. The aim is to investigate how well the exact (generic) theoretical distribution is approximated by the beta distribution.

Finally, the third paper is an empirical application of the methodology developed, analyzing the impact of women’s autonomy on the use of health care by children in India.

Chapter 1

Estimation of a model with grouped binary dependent variables from repeated cross-sections data

Thierno BALDE

University of Geneva, Switzerland

Abstract

In this paper we discuss the estimation of a binary choice model with individual effects using a time series of independent cross-sections. We propose a new approach to parametrizing the individual effects that accounts for ‘cohort effects’ as well as purely individual effects. Drawing on Mundlak’s (1978) approach, and postulating certain conditions, we express the ‘cohort effect’ as a linear function of the means of the explanatory variables. In a first setting, we assume that individuals’ choices are not observed and derive the probability that a certain number of individuals choose ‘1’ among the total number of individuals in a given cohort. Then we go on to the special case in which the individual choices are assumed to be observed.

Based on the probabilities of cohort means, we estimate the model using the maximum likelihood method and implement it using a heuristic optimization technique (genetic algorithm). Finally, we carry out Monte Carlo simulations to analyze the finite-sample properties of our estimators, in terms of both bias and mean squared error (MSE).

I would like to express my sincere gratitude to Professor Jaya Krishnakumar for her constructive comments. I remain solely responsible for any errors or omissions.

1.1 Introduction

In this paper we analyze a binary choice model with individual effects in the context of clustered data drawn from repeated cross-sectional data. The issues we encounter in this study include the nonlinearity of the models, a nonlinearity that is exacerbated by group-level analysis of the dependent variable. This is attributable to the absence of true panel data, which are typically preferred in econometric studies.

Binary choice models are widely used today in the fields of economics, social sciences, political science, and also medical research. For example, assume we wish to study the participation of married women in the labor market. In this case, the dependent variable assumes one of two values, one (1) if the married woman is in the labor market and zero (0) if she is not. Researchers in labor economics suggest that the decision to participate is partly a function of observable characteristics, either of the individual, such as the education level and family income, or of the economy, such as the unemployment rate, and partly a function of factors that are not observable by the researcher. If these unobservable effects are correlated with the explanatory variables, the model cannot be identified without resorting to external instruments in a simple cross-sectional setting. However, if they are time invariant, the model can be identified using panel data. In another example, consider presidential elections in the United States, where there are two dominant political parties: Republicans and Democrats. The dependent variable is the choice of which of these two parties to vote for. Let us say that it assumes the value of one (1) if the chosen candidate is a Democrat, and zero (0) for a Republican. This issue has been the focus of much work by the economist Ray Fair of Yale University, with the publication “Econometrics and Presidential Elections,” and by other political scientists. Variables that are often used in voter choice models include the individual characteristics of voters and the inflation and unemployment rates.

Despite widespread interest in these models, the binary dependent variable is often not available at the individual level. This might be because information on individual choices is very costly to obtain or because the data provided to researchers are restricted owing to privacy concerns. Only aggregate data on choices at the group level are typically published. The most common types of aggregate data are sums and proportions. In political science the behavior of voters in an election is frequently cited as an example. Because of voters’ privacy rights, their individual choices are masked and we are only given information on the number of votes obtained by each candidate. In medical research, individual-level hospitalization data are usually protected, and only aggregate data (proportions) are accessible. If the explanatory variables for all individuals in a given group are the same, then aggregated binary choice models are easy to estimate (Greene, 2004; Maddala, 1983).

Conversely, if the explanatory variables assume different values from one individual in a group to the next, Miller and Plantinga (1999) recommend using group means of all variables to estimate the model. In this case, we lose some information by using the means of the variables. Also, the estimated parameters are interpreted at the group level rather than at the individual level. Consequently, this approach does not allow us to make inferences at the individual level. A situation in which […] at the individual level was illustrated by the Pennsylvania gubernatorial election in 2006. In this election an incumbent Democratic governor faced off against a black Republican candidate. The data were collected by the Inter-University Consortium for Political and Social Research (ICPSR) using questionnaires that were completed by voters on the day of the election. For each voter who participated in the survey, several characteristics such as race, sex, and age were observed. Moreover, Pennsylvania is divided into five (5) geographical districts. For each one we have information on the number of survey participants, the number of electors who voted for the Democratic candidate, and the individual characteristics of each participant, but not how he or she voted.

Panel data currently offer a wide variety of benefits for analyzing behavior at the micro level, but they are not available for many countries. Instead, there are annual household surveys that are based on a large sample of the population, such as the British Family Expenditure Survey or the Labor Supply Survey. In the case of these repeated cross-sectional surveys we are unable to follow a specific household over time, as would be required for a true panel. Thus, the estimation methods currently used for analyzing panel data are inapplicable. To address this problem, Deaton (1985) suggests using cohorts to estimate linear models. His approach is based on aggregating individuals or households into cohorts and treating the population means of these cohorts as “individuals.” In this fashion, this new panel (called a “pseudo panel”) allows us to track a representative sample from the same cohort of individuals or households over time. The “pseudo panel” approach has been used not only in applied microeconomics, such as for studying income and savings (see, for example, Beach and Finnie, 2004; Bourguignon et al., 2004; Baldini and Mazzaferro, 1999), but also in many areas of research in the social sciences, including healthcare, education, and employment (e.g. Garner et al., 2002; Glied, 2002; Lauer, 2003; Anderson and Hussey, 2000; Weir, 2003).

To construct pseudo-panel datasets, cohorts need to be defined on the basis of a certain number of shared characteristics. The control variable has to be constant for each individual at all points in time or the individual will cease to belong to the group. Also, the control variable must be observable for all individuals in the sample: year of birth, sex, level of education, and geographical region are all good criteria for the formation of cells (Dargay and Vythoulkas, 1999). In short, the construction of these cells has significant ramifications for the magnitude of the bias and the variance of the estimators in a finite sample (Verbeek and Nijman, 1992). The problem with the cohort approach is that we have to replace the cohort population means with empirical observations from the samples, which introduces measurement error in the variables. In the case of linear models, the classic “within” estimator is then biased. Since the variance of the measurement errors can be estimated from individual data, Deaton (1985) and Collado (1997) suggest corrections that account for measurement error. Fuller (1987) proposes a more general estimator that includes Deaton’s as a special case. Verbeek and Nijman (1992) demonstrate that we can ignore measurement error if we have a large number of observations per cohort.
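The cell construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the thesis: the record fields `year`, `birth_year`, and `income`, the five-year cohort width, and the data values are all assumptions made for the example. Different households appear in each survey year, yet the cell means can still be followed over time.

```python
from collections import defaultdict

def cohort_of(birth_year, width=5):
    """Map a birth year to the first year of its band (a time-invariant criterion)."""
    return birth_year - birth_year % width

def pseudo_panel(records, width=5):
    """Return {(cohort, year): (cell size, mean income)} from repeated cross-sections."""
    cells = defaultdict(list)
    for r in records:
        cells[(cohort_of(r["birth_year"], width), r["year"])].append(r["income"])
    return {key: (len(v), sum(v) / len(v)) for key, v in cells.items()}

# Hypothetical data: a fresh sample of households is drawn in each survey year.
records = [
    {"year": 2000, "birth_year": 1961, "income": 30.0},
    {"year": 2000, "birth_year": 1963, "income": 34.0},
    {"year": 2000, "birth_year": 1972, "income": 20.0},
    {"year": 2001, "birth_year": 1960, "income": 33.0},
    {"year": 2001, "birth_year": 1964, "income": 35.0},
    {"year": 2001, "birth_year": 1971, "income": 22.0},
]
panel = pseudo_panel(records)
for key in sorted(panel):
    print(key, panel[key])
```

The cell size is returned alongside the mean because, as noted above, the number of observations per cohort governs how much measurement error the sample mean carries.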

In the case of linear models with individual effects in a true panel, the classical estimation method consists of transforming the model by using the deviation from the mean to eliminate individual effects (see, for example, Hsiao [1986] or Arellano and Bover [1990]). This method is not relevant to our study, since our model is “strongly” nonlinear. For nonlinear models, Mundlak (1978) and Chamberlain (1984) suggest parameterizing individual effects as a linear function of the explanatory variables. Once again, this will not work for us because we do not have observations on the same individuals over time. Collado (1998) demonstrates how to overcome this difficulty in parameterizing these effects by using cohort means. In her approach, we have a model with measurement error on the variables, causing the error term to be correlated with the explanatory variables. The covariance between the errors and the explanatory variables is a function of the variance of the measurement error, which can be estimated from individual data. To be able to estimate her model, Collado imposes a further restriction that the joint distribution of the error terms and the cohort means be normal. In this fashion she is able to estimate her binary choice model using pseudo-maximum likelihood and the minimum distance method. In our analysis we circumvent this problem of parameterization by expressing the individual-specific effect as a function of the cohort-specific effect while still following the general thrust of Mundlak’s approach. Moreover, unlike Collado, we estimate a model whose binary dependent variable is an aggregate, but whose explanatory variables are at the individual level, thus capitalizing on all the information available for individuals.

Our paper is organized as follows. In Section 1.2, we specify the model of our study starting from a binary model with individual effects. In Section 1.3 we construct the likelihood function for the variable of interest (the aggregate dependent variable) and perform the estimation using maximum likelihood. Section 1.4 presents the optimization, which uses the genetic algorithm. In Section 1.5 we report the results of our Monte Carlo simulations. Conclusions are presented in Section 1.6.

1.2 Model specification

When studying discrete choice with panel data the following linear model is often postulated for the underlying latent variable:

$$y^{*}_{it} = x_{it}'\beta + \alpha_i + u_{it}, \qquad i = 1,\dots,N, \quad t = 1,\dots,T \tag{1.1}$$

where $y^{*}_{it}$ is an unobserved variable, $x_{it}$ is a $(k \times 1)$ vector of observed explanatory variables, $\alpha_i$ is the unobserved individual effect, and $u_{it}$ is the error term. In specification (1.1) we start from the assumption that we retain the same individuals over time, i.e. individual $i$ in period 1 is the same person as individual $i$ in period 2, and so forth. If we assume that $x_{it}$, $\alpha_i$, $u_{it}$ are pairwise independent, we have a random effects model. Conversely, if $\alpha_i$ and $x_{it}$ are correlated, we can adopt the parametric approach developed by Chamberlain or Mundlak. This reparametrization allows us to establish a relationship between the fixed-effects and the random-effects models, especially in a linear context.

According to the Mundlak (1978) approach, the individual effects and the explanatory variables are related as follows:

$$\alpha_i = \bar{x}_i'\gamma + w_i \tag{1.2}$$

where $E(w_i \mid x_{it}, t = 1,\dots,T) = 0$ and $\bar{x}_i = \frac{1}{T}(x_{i1} + x_{i2} + \dots + x_{iT})$.

The approach taken by Chamberlain (1980, 1985) is more general: $\alpha_i$ and $x_{it}$ stand in the following linear relationship:

$$\alpha_i = \sum_{t=1}^{T} x_{it}'\gamma_t + w_i \tag{1.3}$$

with $E(w_i \mid x_{it}, t = 1,\dots,T) = 0$.

This last parameterization has its drawbacks, including the fact that it is cumbersome to estimate the $\gamma_t$ in the presence of a large number of observations.
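One way to see in what sense Chamberlain's form is the more general one, offered here as a short check rather than as part of the original derivation: if the period-specific coefficients in (1.3) are restricted to be equal, $\gamma_t = \gamma / T$ for every $t$, the time average $\bar{x}_i$ appears and Mundlak's form is recovered.

```latex
\alpha_i=\sum_{t=1}^{T} x_{it}'\gamma_t+w_i
\quad\text{with}\quad \gamma_t=\tfrac{1}{T}\,\gamma
\;\Longrightarrow\;
\alpha_i=\Big(\tfrac{1}{T}\sum_{t=1}^{T}x_{it}\Big)'\gamma+w_i
        =\bar{x}_i'\gamma+w_i .
```

The restricted form has $k$ free coefficients instead of $kT$, which is the source of the estimation burden mentioned above.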

To avoid confusion with true panel data, we adopt the notation of R. Moffitt (1993)¹. Many other authors have adopted this notation in the pseudo-panel literature, for example Ainhoa Oguiza Tovar (2012)². We formulate our model as follows:

$$y^{*}_{i(t)t} = x_{i(t)t}'\beta + \alpha_{i(t)} + u_{i(t)t}, \qquad i(t) = 1(t),\dots,N(t), \quad t = 1,\dots,T, \tag{1.4}$$

where we assume that $E(u_{i(t)t} \mid x_{i(t)t}) = 0$ for each $i(t)$ and $t$, i.e. the variables $x_{i(t)t}$ are exogenous. Thus, to eliminate any chance of confusion, we index the $i$th individual at time $t$ with $i(t)$. This individual will not be the same from one period to the next. For example, the second individual in period 1, $2(1)$, will not be the same person as the second individual in period 2, $2(2)$. The number of observations can differ from one period to the next, as is indicated by $N(t)$.

Rather than committing to a random effect (RE) or fixed effect (FE) specification at the outset, we allow $\alpha_{i(t)}$ to be either correlated with the explanatory variables $x_{i(t)t}$ (FE) or uncorrelated with them (RE). We prefer the first situation because it is quite possible that the individual effect $\alpha_{i(t)}$ is correlated with $x_{i(t)t}$. In this case, the Mundlak method presented in (1.2) and the Chamberlain method in (1.3) are inapplicable: we cannot estimate the coefficients of these equations since we do not have observations on the same individuals from one period to the next.

The variable $y^{*}_{i(t)t}$ being latent, we observe:

$$y_{i(t)t} = \begin{cases} 1 & \text{if } y^{*}_{i(t)t} > 0 \\ 0 & \text{otherwise.} \end{cases} \tag{1.5}$$

Thus,

$$p_{i(t)t} \equiv P(y_{i(t)t} = 1) = \Pr\big(x_{i(t)t}'\beta + \alpha_{i(t)} + u_{i(t)t} > 0\big). \tag{1.6}$$

Working from the model in (1.1) we can use standard estimation techniques, such as maximum likelihood (after making some preliminary assumptions on $u_{it}$ and $\alpha_i$), to estimate the parameters of the model.
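To make (1.6) concrete, the sketch below assumes a standard normal error term, so that the probability is $\Phi(x'\beta + \alpha)$ (the probit case, which the text uses later in its hospitalization example). The values of `beta`, `x`, and `alpha` are invented purely for illustration.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def choice_probability(x, beta, alpha):
    """P(y = 1) = P(x'beta + alpha + u > 0) = Phi(x'beta + alpha) when u ~ N(0, 1)."""
    index = sum(xj * bj for xj, bj in zip(x, beta)) + alpha
    return norm_cdf(index)

# Hypothetical values, chosen only for illustration.
beta = [0.5, -1.0]
x = [1.0, 0.25]
alpha = -0.25
p = choice_probability(x, beta, alpha)
print(round(p, 4))
```

With these numbers the index $x'\beta + \alpha$ is zero, so the probability is one half; a positive individual effect would push it above one half for the same covariates.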

¹ R. Moffitt (1993), Identification and estimation of dynamic models with a time series of repeated cross-sections.

² Ainhoa Oguiza Tovar, Inmaculada Gallastegui Zulaica, Vicente Nunez-Anton (2012), Analysis of pseudo-panel data with dependent samples.

If we do not have a “true” panel, as in (1.4), the estimates obtained will not be consistent. Collado (1998), drawing on the approach in Chamberlain (1984) for individual effects, shows how to obtain consistent results using cohort averages.

In our model, given the absence of a true panel and possibly the unavailability of the dependent variable at the individual level, we work on the level of aggregates (proportions, sums) while retaining explanatory variables at the individual level.

Assume that we have a repeated cross-sectional dataset on binary choices for $N$ individuals with $N = N(1) + N(2) + \dots + N(T)$. Let the cohorts be homogeneous and of variable sizes $n_{ct}$, so that $N(t) = \sum_{c=1}^{C} n_{ct}$. In each cohort $c$ ($c = 1,\dots,C$), individual $i(t)$ gives the response $y_{i(t)t} = 1$ with probability $p_{i(t)t}$, or the response $y_{i(t)t} = 0$ with probability $1 - p_{i(t)t}$. $y_{i(t)t}$ is thus a Bernoulli variable conditional on $p_{i(t)t}$, i.e. $f(y_{i(t)t} \mid p_{i(t)t}) = (p_{i(t)t})^{y_{i(t)t}} (1 - p_{i(t)t})^{1 - y_{i(t)t}}$ where $0 < p_{i(t)t} < 1$. For the reasons given above, we are interested in the cohort mean $\bar{y}_{ct} = \frac{1}{n_{ct}} \sum_{i=1}^{n_{ct}} y_{i(t)t}$, where $0 \le \bar{y}_{ct} \le 1$, $c = 1,\dots,C$. We can tabulate the distribution of $\bar{y}_{ct}$ as follows:

$$
\begin{array}{l|ccccccc}
\bar{y}_{ct} & \frac{0}{n_{ct}} = 0 & \frac{1}{n_{ct}} & \frac{2}{n_{ct}} & \dots & \frac{k}{n_{ct}} & \dots & \frac{n_{ct}}{n_{ct}} = 1 \\
\sum_{i=1}^{n_{ct}} y_{i(t)t} & 0 & 1 & 2 & \dots & k & \dots & n_{ct} \\
P\big(\bar{y}_{ct} = \frac{k}{n_{ct}}\big) = P\big(\sum_{i=1}^{n_{ct}} y_{i(t)t} = k\big) & P_0 & P_1 & P_2 & \dots & P_k & \dots & P_{n_{ct}}
\end{array}
$$

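Since the observations are independent (by assumption), we have:

$$
\begin{aligned}
P_0 = P(\bar{y}_{ct} = 0) &= P\Big(\sum_{i=1}^{n_{ct}} y_{i(t)t} = 0\Big) = P\big(y_{i(t)t} = 0,\ \forall i \in c\big) \\
&= p^{0}_{1(t)t} \cdots p^{0}_{n_c(t)t} = \prod_{i=1}^{n_{ct}} p^{0}_{i(t)t} = \prod_{i=1}^{n_{ct}} \big(1 - p^{1}_{i(t)t}\big)
\end{aligned}
$$

where $p^{1}_{i(t)t}$ is the probability that individual $i$ in cohort $c$ chooses one (1) at time $t$, and $p^{0}_{i(t)t}$ the probability of the complementary event.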

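$$
\begin{aligned}
P_1 = P\Big(\bar{y}_{ct} = \frac{1}{n_{ct}}\Big) &= P\Big(\sum_{i=1}^{n_{ct}} y_{i(t)t} = 1\Big) = P\big(y_{i(t)t} = 1,\ y_{j(t)t} = 0,\ \forall j \neq i,\ i, j \in c\big) \\
&= p^{1}_{1(t)t}\, p^{0}_{2(t)t} \cdots p^{0}_{n_c(t)t} + p^{0}_{1(t)t}\, p^{1}_{2(t)t}\, p^{0}_{3(t)t} \cdots p^{0}_{n_c(t)t} + \dots + p^{0}_{1(t)t} \cdots p^{0}_{n_c-1(t)t}\, p^{1}_{n_c(t)t} \\
&= \sum_{i=1}^{n_{ct}} p^{1}_{i(t)t} \prod_{j=1,\, j \neq i}^{n_{ct}} p^{0}_{j(t)t} \\
&= \sum_{i=1}^{n_{ct}} p^{1}_{i(t)t} \prod_{j=1,\, j \neq i}^{n_{ct}} \big(1 - p^{1}_{j(t)t}\big).
\end{aligned}
$$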

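Example: Assume that there are three individuals per cohort ($n = 3$). What is the probability that one of the three individuals is hospitalized at time $t$?

$$
\begin{aligned}
P\Big(\bar{y}_{ct} = \frac{1}{3}\Big) &= \Pr(\text{1 of the 3 individuals is hospitalized}) \\
&= \Pr(\text{1 hospitalized, 2 and 3 not hospitalized}) \\
&\quad + \Pr(\text{2 hospitalized, 1 and 3 not hospitalized}) \\
&\quad + \Pr(\text{3 hospitalized, 1 and 2 not hospitalized}) \\
&= \big[\text{if } u_{i(t)t} \sim N(0,1)\big] \\
&= \Phi(x_{1(t)t}'\beta + \alpha_1)\,[1 - \Phi(x_{2(t)t}'\beta + \alpha_2)]\,[1 - \Phi(x_{3(t)t}'\beta + \alpha_3)] \\
&\quad + \Phi(x_{2(t)t}'\beta + \alpha_2)\,[1 - \Phi(x_{1(t)t}'\beta + \alpha_1)]\,[1 - \Phi(x_{3(t)t}'\beta + \alpha_3)] \\
&\quad + \Phi(x_{3(t)t}'\beta + \alpha_3)\,[1 - \Phi(x_{1(t)t}'\beta + \alpha_1)]\,[1 - \Phi(x_{2(t)t}'\beta + \alpha_2)]
\end{aligned}
$$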
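A numerical version of the three-person example may help fix ideas. The sketch assumes the probit case and uses made-up values for the three indices $x_{i(t)t}'\beta + \alpha_i$; only the structure of the sum (one term per individual who could be the single "1") comes from the text.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical probit indices x_i'beta + alpha_i for the three cohort members.
indices = [-1.0, 0.0, 0.5]
p = [norm_cdf(v) for v in indices]

# P(exactly one of the three is hospitalized): individual i is the "1",
# and the other two are "0", summed over i = 1, 2, 3.
prob_one = sum(
    p[i] * math.prod(1.0 - p[j] for j in range(3) if j != i)
    for i in range(3)
)
print(round(prob_one, 4))
```

Because the three probabilities differ, the three terms of the sum differ as well, which is why the result is not a binomial probability.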

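$$
\begin{aligned}
P_2 = P\Big(\bar{y}_{ct} = \frac{2}{n_{ct}}\Big) &= P\Big(\sum_{i=1}^{n_{ct}} y_{i(t)t} = 2\Big) \\
&= P\big(y_{i_1(t)t} = 1,\ y_{i_2(t)t} = 1,\ y_{j(t)t} = 0\ \ \forall\, i_1 \neq i_2 \neq j,\ i_1, i_2, j \in c\big) \\
&= p^{1}_{1(t)t}\, p^{1}_{2(t)t}\, p^{0}_{3(t)t} \cdots p^{0}_{n_c(t)t} + p^{1}_{1(t)t}\, p^{0}_{2(t)t}\, p^{1}_{3(t)t}\, p^{0}_{4(t)t} \cdots p^{0}_{n_c(t)t} + \dots + p^{1}_{1(t)t}\, p^{0}_{2(t)t} \cdots p^{0}_{n_c-1(t)t}\, p^{1}_{n_c(t)t} \\
&\quad + p^{0}_{1(t)t}\, p^{1}_{2(t)t}\, p^{1}_{3(t)t}\, p^{0}_{4(t)t} \cdots p^{0}_{n_c(t)t} + \dots + p^{0}_{1(t)t}\, p^{1}_{2(t)t} \cdots p^{0}_{n_c-1(t)t}\, p^{1}_{n_c(t)t} \\
&\quad + \dots + p^{0}_{1(t)t}\, p^{0}_{2(t)t} \cdots p^{1}_{n_c-1(t)t}\, p^{1}_{n_c(t)t} \\
&= \sum_{i_1=1}^{n_{ct}} \sum_{i_2 > i_1} p^{1}_{i_1(t)t}\, p^{1}_{i_2(t)t} \prod_{j=1,\, j \neq i_1, i_2}^{n_{ct}} p^{0}_{j(t)t} \\
&= \sum_{i_1=1}^{n_{ct}} \sum_{i_2 > i_1} p^{1}_{i_1(t)t}\, p^{1}_{i_2(t)t} \prod_{j=1,\, j \neq i_1, i_2}^{n_{ct}} \big(1 - p^{1}_{j(t)t}\big).
\end{aligned}
$$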

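Thus, by deduction, we can write the generic probability of the cohort’s average choice as:

$$
\begin{aligned}
P_k = P\Big(\bar{y}_{ct} = \frac{k}{n_{ct}}\Big) &= P\Big(\sum_{i=1}^{n_{ct}} y_{i(t)t} = k\Big) \\
&= \sum_{i_1=1}^{n_{ct}} \sum_{i_2 > i_1} \cdots \sum_{i_k > i_{k-1}} p^{1}_{i_1(t)t} \cdots p^{1}_{i_k(t)t} \prod_{j=1,\, j \neq i_1, \dots, i_k}^{n_{ct}} p^{0}_{j(t)t} \\
&= \sum_{i_1=1}^{n_{ct}} \sum_{i_2 > i_1} \cdots \sum_{i_k > i_{k-1}} p^{1}_{i_1(t)t} \cdots p^{1}_{i_k(t)t} \prod_{j=1,\, j \neq i_1, \dots, i_k}^{n_{ct}} \big(1 - p^{1}_{j(t)t}\big).
\end{aligned}
$$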
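The distribution of a sum of independent Bernoulli variables with unequal probabilities, which is what $P_k$ describes, is known in the statistics literature as the Poisson binomial distribution. The sketch below evaluates it in two ways for a small cohort: by direct enumeration of the $k$-subsets, mirroring the nested sums above, and by a standard recursion that adds one individual at a time. The recursion is not taken from the text and is shown only as a cross-check; the probabilities in `p` are hypothetical.

```python
from itertools import combinations
import math

def pk_enumeration(p, k):
    """P(sum y_i = k): sum over every k-subset of individuals who choose 1."""
    n = len(p)
    total = 0.0
    for ones in combinations(range(n), k):
        chosen = set(ones)
        total += math.prod(p[i] if i in chosen else 1.0 - p[i] for i in range(n))
    return total

def pk_recursion(p):
    """All of P_0..P_n at once, adding one individual at a time (O(n^2) work)."""
    dist = [1.0]  # with no individuals, the sum is 0 with probability 1
    for pi in p:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - pi)   # this individual chooses 0
            new[k + 1] += mass * pi       # this individual chooses 1
        dist = new
    return dist

# Hypothetical individual probabilities p^1_i for a cohort of size 5.
p = [0.1, 0.4, 0.5, 0.7, 0.9]
by_enum = [pk_enumeration(p, k) for k in range(len(p) + 1)]
by_rec = pk_recursion(p)
print([round(v, 4) for v in by_rec])
```

The enumeration has $\binom{n_{ct}}{k}$ terms for each $k$, so its cost grows quickly with the cohort size, whereas the recursion needs only on the order of $n_{ct}^2$ operations for the whole distribution.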

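When the data on the individual choices $y_{i(t)t}$ are available, the generic probability becomes:

$$
\begin{aligned}
P_k = P\Big(\sum_{i=1}^{n_{ct}} y_{i(t)t} = k\Big) &= p^{1}_{i_1(t)t} \cdots p^{1}_{i_k(t)t} \prod_{j=1,\, j \neq i_1, \dots, i_k}^{n_{ct}} p^{0}_{j(t)t} \\
&= p^{1}_{i_1(t)t} \cdots p^{1}_{i_k(t)t} \prod_{j=1,\, j \neq i_1, \dots, i_k}^{n_{ct}} \big(1 - p^{1}_{j(t)t}\big),
\end{aligned}
$$

where $i_1, \dots, i_k$ now denote the $k$ individuals observed to choose one (1).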

Here we have the likelihood of the mean of the observations on cohort $c$ at time $t$ as a function of the parameters $\beta$, the individual explanatory variables $x_{i(t)t}$, and the individual effects $\alpha_{i(t)}$, expressed in terms of the individual probabilities $p_{i(t)t}$. In this formulation, if we let the probability of choosing one (1) be the same for each individual in a given cohort (even though this is not always the case), we end up with a very common probability distribution: the binomial.
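The binomial special case can be checked in one line. Assume a common probability $p$ for every member of the cohort (the symbol $p$ is introduced here only for this check) and unobserved individual choices. Each of the $\binom{n_{ct}}{k}$ terms in the sum defining the generic probability $P_k$ then takes the same value:

```latex
P_k=\sum_{i_1<i_2<\dots<i_k} p^{k}(1-p)^{n_{ct}-k}
   =\binom{n_{ct}}{k}\,p^{k}(1-p)^{n_{ct}-k},
   \qquad k=0,1,\dots,n_{ct}.
```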

This specification creates a major problem in the parametrization of individual- specific effects. In the absence of a true panel, we cannot directly apply the ap- proaches developed by Mundlak (1978) or Chamberlain (1984) for dealing with individual effects. Consequently, we decompose the individual effectsαi(t)into two parts: αctis a “cohort-specific effect” representing the mean of the individual effects of thepopulationin cohortc at timet, and ξi(t)is the deviation ofαi(t)from that mean:

$$
\alpha_{i(t)} = \alpha_{ct} + \underbrace{\left(\alpha_{i(t)} - \alpha_{ct}\right)}_{\xi_{i(t)}} = \alpha_{ct} + \xi_{i(t)}. \tag{1.7}
$$

The mean of the individual effects of the sample in cohort $c$ at time $t$, $\bar{\alpha}_{ct}$, can be decomposed into the population mean in cohort $c$ at time $t$, $\alpha_{ct}$, and a deviation from this mean attributable to sampling error in the data, denoted $v_{c(t)}$:

$$
\bar{\alpha}_{ct} = \alpha_{ct} + v_{c(t)}. \tag{1.8}
$$

At this stage we can introduce a first-order autoregressive scheme, AR(1), in the sampling error, i.e.

$$
v_{c(t)} = \rho\, v_{c(t-1)} + \eta_{c(t)},
$$

where $\rho$ is the same for all cohorts.

Substituting (1.8) into (1.7) yields:

$$
\alpha_{i(t)} = \bar{\alpha}_{ct} - v_{c(t)} + \xi_{i(t)}. \tag{1.9}
$$

If we assume that populations change little from one period to the next (a very important hypothesis for the construction of our model), then the population mean $\alpha_{ct}$ is invariant in time, i.e. $\alpha_{ct} = \alpha_c$. If we further assume that the cohort size is sufficiently large, i.e. $n_{ct} \simeq n_c \to \infty$, then the sampling error tends toward zero ($v_{c(t)} \to 0$). The two preceding assumptions together have the effect that $\alpha_c \simeq \bar{\alpha}_c$, and thus

$$
\alpha_{i(t)} = \bar{\alpha}_c + \xi_{i(t)}. \tag{1.10}
$$
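The decomposition in (1.7) to (1.9) and the large-cohort argument can be checked on simulated data. The sketch below is only an illustration under stated assumptions: the cohort "population" is a hypothetical list of normally distributed individual effects, and the helper `decompose`, the seed, and the sample sizes are choices made for this example.

```python
import random
from statistics import fmean

random.seed(0)  # arbitrary seed, used only for reproducibility

# Hypothetical cohort population of individual effects alpha_i.
population = [random.gauss(1.0, 2.0) for _ in range(100_000)]
alpha_ct = fmean(population)  # population mean of the cohort


def decompose(sample):
    """Return (alpha_bar_ct, v_ct, xi) for one sample drawn from the cohort."""
    alpha_bar_ct = fmean(sample)         # sample mean of the individual effects
    v_ct = alpha_bar_ct - alpha_ct       # sampling error, as in (1.8)
    xi = [a - alpha_ct for a in sample]  # individual deviations, as in (1.7)
    return alpha_bar_ct, v_ct, xi


small_sample = random.sample(population, 50)
large_sample = random.sample(population, 50_000)

for name, sample in (("n = 50", small_sample), ("n = 50000", large_sample)):
    alpha_bar_ct, v_ct, xi = decompose(sample)
    # Identity (1.9): alpha_i = alpha_bar_ct - v_ct + xi_i for every sampled i.
    max_gap = max(abs(a - (alpha_bar_ct - v_ct + x)) for a, x in zip(sample, xi))
    print(f"{name}: sampling error v = {v_ct:+.4f}, max identity gap = {max_gap:.1e}")
```

The identity holds for any sample size, while the sampling error is expected to be much closer to zero for the large sample, which is the sense in which $v_{c(t)} \to 0$ as the cohort size grows.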

Now consider the Mundlak (1978) approach:

$$
\alpha_{i(t)} = \bar{x}_i'\gamma + w_{i(t)}, \tag{1.11}
$$

with $E(w_{i(t)} \mid x_{i(t)t},\ t = 1, \dots, T) = 0$, $w_{i(t)} \sim \text{iid}(0, \sigma_w^2)$ and $\bar{x}_i$ (average of the observations over time) unobserved.

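A simple algebraic manipulation of (1.11) yields:

$$
\alpha_{i(t)} = x_{i(t)t}'\gamma + \left(\bar{x}_i - x_{i(t)t}\right)'\gamma + w_{i(t)}. \tag{1.12}
$$

We can now take the mean of the sample observations for each cohort in each period:

$$
\underbrace{\frac{1}{n_{ct}} \sum_{i=1}^{n_{ct}} \alpha_{i(t)}}_{\bar{\alpha}_{ct}}
= \underbrace{\frac{1}{n_{ct}} \sum_{i=1}^{n_{ct}} x_{i(t)t}'}_{\bar{x}_{ct}'}\, \gamma
+ \Biggl( \underbrace{\frac{1}{n_{ct}} \sum_{i=1}^{n_{ct}} \bar{x}_i}_{\bar{x}_c}
- \underbrace{\frac{1}{n_{ct}} \sum_{i=1}^{n_{ct}} x_{i(t)t}}_{\bar{x}_{ct}} \Biggr)' \gamma
+ \underbrace{\frac{1}{n_{ct}} \sum_{i=1}^{n_{ct}} w_{i(t)}}_{\bar{w}_{ct}},
$$

$$
\bar{\alpha}_{ct} = \bar{x}_{ct}'\gamma + \left(\bar{x}_c - \bar{x}_{ct}\right)'\gamma + \bar{w}_{ct}.
$$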
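The cohort-averaged identity above can be verified numerically. The following sketch is a simplified illustration, not the thesis's procedure: it assumes a single scalar regressor and, unlike a real pseudo-panel, a simulated cohort whose members are all observed in every period, so that $\bar{x}_i$ can actually be computed. All names and parameter values (`T`, `n_c`, `gamma`, the chosen period `t`) are hypothetical.

```python
import random
from statistics import fmean

random.seed(1)  # arbitrary seed, used only for reproducibility

T, n_c, gamma = 4, 200, 0.8  # hypothetical number of periods, cohort size, coefficient

# x[i][t]: scalar regressor of individual i in period t (simulated full panel).
x = [[random.gauss(2.0, 1.0) for _ in range(T)] for _ in range(n_c)]
w = [random.gauss(0.0, 0.5) for _ in range(n_c)]

x_bar_i = [fmean(row) for row in x]                       # time average per individual
alpha = [x_bar_i[i] * gamma + w[i] for i in range(n_c)]   # Mundlak relation (1.11)

t = 2  # any period of the simulated panel
alpha_bar_ct = fmean(alpha)                # cohort mean of the individual effects
x_bar_ct = fmean([row[t] for row in x])    # cohort mean of x in period t
x_bar_c = fmean(x_bar_i)                   # cohort mean of the time averages
w_bar_ct = fmean(w)                        # cohort mean of the Mundlak errors

# Right-hand side of the cohort-level identity.
rhs = x_bar_ct * gamma + (x_bar_c - x_bar_ct) * gamma + w_bar_ct
print(alpha_bar_ct, rhs)
print("x_bar_c - x_bar_ct =", x_bar_c - x_bar_ct, " w_bar_ct =", w_bar_ct)
```

The two printed values on the first line agree up to floating-point error. The second line shows the two terms that the text argues become negligible for large cohorts drawn from similar populations.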

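Assume that we can estimate the conditional expectation (or the true mean) $E(x_{i(t)t} \mid i \in c) = \mu_{ct} = \mu_c \ \forall i, t$ from $\bar{x}_{ct}$. When $n_{ct} \simeq n_c \to \infty$ and for similar populations, we have:

$$
\bar{x}_{ct} = \frac{1}{n_c} \sum_{i \in c} x_{i(t)t} \to \mu_{ct} = \mu_c,
\qquad
\frac{1}{n_c} \sum_{i \in c} \bar{x}_i = \frac{1}{n_c} \frac{1}{T} \sum_{i \in c} \sum_{t} x_{i(t)t} \to \mu_c.
$$

Thus,

$$
\left( \frac{1}{n_c} \sum_{i \in c} \bar{x}_i - \bar{x}_{ct} \right) \to 0.
$$

We also have

$$
\bar{w}_{ct} = \frac{1}{n_c} \sum_{i \in c} w_{i(t)} \to 0,
\qquad
\bar{\alpha}_{ct} \to \bar{\alpha}_c.
$$

Consequently,

$$
\bar{\alpha}_c \simeq \bar{x}_c'\gamma.
$$

Substituting all these results into (1.10),

$$
\alpha_{i(t)} = \bar{x}_c'\gamma + \xi_{i(t)}.
$$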


Thus, our latent variable postulated in (1.4) becomes:

$$
y^{*}_{i(t)t} = x_{i(t)t}'\beta + \bar{x}_c'\gamma + \xi_{i(t)} + u_{i(t)t}, \tag{1.13}
$$

with $\xi_{i(t)}$ and $u_{i(t)t}$ having expectation and covariance zero (since the individuals change from one period to the next) and being independent of $x_{i(t)t}$ and $\bar{x}_c$. Note that there is no heterogeneity in the coefficients $\theta = (\beta, \gamma)$: they are fixed, not random.

This model is thus valid for large $n_{ct}$ ($n_{ct} \simeq n_c$).

For nonlinear models, and specifically in our case, the structural parameters only provide information on the relative magnitude of the change in $E(y_{i(t)t} \mid x_{i(t)t})$ resulting from a one-unit change in $x_{i(t)t}$, while the marginal effects provide the absolute magnitude of the change. This is why, when estimating binary choice models, we are often interested in:

i. The signs and statistical significance of the coefficients.

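ii. The marginal effects. For example, in a Probit model, $E(y_{i(t)t} \mid x_{i(t)t}) = \Phi(x_{i(t)t}'\beta + \bar{x}_c'\gamma)$ and the marginal effects are computed as

$$
ME = \frac{\partial E(y_{i(t)t} \mid x_{i(t)t})}{\partial x_{i(t)t}} = \left( \beta + \frac{1}{n_{ct}}\gamma \right) \phi\!\left( x_{i(t)t}'\beta + \bar{x}_c'\gamma \right) \tag{1.14}
$$

for a continuous variable $x_{i(t)t}$. We see that, unlike in the case of the linear model, here the marginal effect is the product of two factors: the total effect of the explanatory variable on the latent variable, $\beta + \gamma/n_{ct}$, and the derivative of the normal cumulative distribution function, i.e. the density $\phi$, evaluated at the index $x_{i(t)t}'\beta + \bar{x}_c'\gamma$. Furthermore, if we consider

$$
E(y_{i(t)t} \mid x_{i(t)t}, d_{i(t)t}) = \Phi\!\left( x_{i(t)t}'\beta + \bar{x}_c'\gamma + \delta d_{i(t)t} \right),
$$

the marginal effects are

$$
ME = \Phi\!\left( x_{i(t)t}'\beta + \bar{x}_c'\gamma + \delta \right) - \Phi\!\left( x_{i(t)t}'\beta + \bar{x}_c'\gamma \right) \tag{1.15}
$$

for a discrete variable $d_{i(t)t}$.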
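The two marginal-effect formulas (1.14) and (1.15) can be evaluated as follows. This is a minimal sketch for a single scalar regressor; the function names and all numerical values are hypothetical and chosen only to make the example runnable.

```python
from math import erf, exp, pi, sqrt


def norm_cdf(z):
    """Standard normal cumulative distribution function, Phi."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density, phi."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def me_continuous(x, x_bar_c, beta, gamma, n_ct):
    """Marginal effect of a continuous regressor, following (1.14)."""
    index = x * beta + x_bar_c * gamma
    return (beta + gamma / n_ct) * norm_pdf(index)


def me_discrete(x, x_bar_c, beta, gamma, delta):
    """Marginal effect of a discrete regressor switching from 0 to 1, following (1.15)."""
    index = x * beta + x_bar_c * gamma
    return norm_cdf(index + delta) - norm_cdf(index)


# Illustrative values only (not estimates from the thesis).
x, x_bar_c, beta, gamma, delta, n_ct = 0.4, 0.1, 0.9, 0.5, 0.3, 25
print("continuous ME:", me_continuous(x, x_bar_c, beta, gamma, n_ct))
print("discrete ME:  ", me_discrete(x, x_bar_c, beta, gamma, delta))
```

Both effects depend on the point at which the index is evaluated, so in practice they are usually reported at chosen values of the regressors or averaged over the sample.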

1.3 Maximum likelihood estimator

How can we use the maximum likelihood method to estimate the parameters $\beta$ and $\gamma$ in this nonlinear model? Assume that we have a “pseudo-panel” of dimension $\sum_{c=1}^{C} \sum_{t=1}^{T} n_{ct}$, where $C$ is the number of cohorts, $n_{ct}$ the size of cohort $c$ at time $t$, and $T$ the number of periods. The maximum likelihood estimates of $\beta$ and $\gamma$ are the vectors $\hat{\beta}$ and $\hat{\gamma}$ that give the highest probability of obtaining $\{\bar{y}_{11}, \dots, \bar{y}_{1T}, \dots, \bar{y}_{C1}, \dots, \bar{y}_{CT}\}$ conditional on the explanatory individual variables. This joint probability is written:

$$
L(\beta, \gamma; \bar{y}_{11}, \dots, \bar{y}_{1T}, \dots, \bar{y}_{C1}, \dots, \bar{y}_{CT}) = P(\bar{y}_{11}, \dots, \bar{y}_{1T}, \dots, \bar{y}_{C1}, \dots, \bar{y}_{CT}; \beta, \gamma).
$$

By construction of the pseudo-panel, the observations are independent of each other, and so the likelihood is:

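$$
\begin{aligned}
L(\beta, \gamma; \bar{y}_{11}, \dots, \bar{y}_{1T}, \dots, \bar{y}_{C1}, \dots, \bar{y}_{CT})
&= \prod_{t=1}^{T} \prod_{c=1}^{C} P(\bar{y}_{ct}) \\
&= \prod_{t=1}^{T} P(\bar{y}_{1t})\, P(\bar{y}_{2t}) \cdots P(\bar{y}_{Ct}) \\
&= \prod_{t=1}^{T} \Biggl[\,
\underbrace{\sum_{i_1=1}^{n_{1t}} \sum_{i_2>i_1} \cdots \sum_{i_k>i_{k-1}} p^{1}_{i_1(t)t} \cdots p^{1}_{i_k(t)t} \prod_{\substack{j=1 \\ i_1 \neq \dots \neq i_k \neq j}}^{n_{1t}} \left(1 - p^{1}_{j(t)t}\right)}_{\text{cohort } c = 1 \text{ at time } t} \\
&\qquad\quad \cdots\
\underbrace{\sum_{i_1=1}^{n_{Ct}} \sum_{i_2>i_1} \cdots \sum_{i_k>i_{k-1}} p^{1}_{i_1(t)t} \cdots p^{1}_{i_k(t)t} \prod_{\substack{j=1 \\ i_1 \neq \dots \neq i_k \neq j}}^{n_{Ct}} \left(1 - p^{1}_{j(t)t}\right)}_{\text{cohort } c = C \text{ at time } t}
\,\Biggr].
\end{aligned}
$$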

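To simplify the calculations, and because the logarithm is a monotonic function, it is advisable to work with the log-likelihood function. Thus:

$$
\begin{aligned}
\log L(\beta, \gamma; \bar{y}_{11}, \dots, \bar{y}_{1T}, \dots, \bar{y}_{C1}, \dots, \bar{y}_{CT})
&= \sum_{t=1}^{T} \left[ \log P(\bar{y}_{1t}) + \log P(\bar{y}_{2t}) + \dots + \log P(\bar{y}_{Ct}) \right] \\
&= \sum_{t=1}^{T} \Biggl\{
\log \underbrace{\sum_{i_1=1}^{n_{1t}} \sum_{i_2>i_1} \cdots \sum_{i_k>i_{k-1}} p^{1}_{i_1(t)t} \cdots p^{1}_{i_k(t)t} \prod_{\substack{j=1 \\ i_1 \neq \dots \neq i_k \neq j}}^{n_{1t}} \left(1 - p^{1}_{j(t)t}\right)}_{\text{cohort } c = 1 \text{ at time } t} \\
&\qquad\quad + \dots +
\log \underbrace{\sum_{i_1=1}^{n_{Ct}} \sum_{i_2>i_1} \cdots \sum_{i_k>i_{k-1}} p^{1}_{i_1(t)t} \cdots p^{1}_{i_k(t)t} \prod_{\substack{j=1 \\ i_1 \neq \dots \neq i_k \neq j}}^{n_{Ct}} \left(1 - p^{1}_{j(t)t}\right)}_{\text{cohort } c = C \text{ at time } t}
\Biggr\}.
\end{aligned}
$$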
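The log-likelihood above can be evaluated cell by cell, one (cohort, period) pair at a time. The sketch below is an illustration under assumptions that are not taken from the text: a single scalar regressor, a Probit link $p^{1}_{i(t)t} = \Phi(x_{i(t)t}\beta + \bar{x}_c\gamma)$ (which treats the composite error as standard normal), and a hypothetical data layout in which each cell is a tuple `(k, xs, x_bar_c)` holding the number of ones, the individual regressors, and the cohort mean. The inner sums over index sets are computed with a one-individual-at-a-time recursion rather than by enumeration.

```python
from math import comb, erf, log, sqrt


def norm_cdf(z):
    """Standard normal cumulative distribution function, Phi."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def count_distribution(p):
    """Distribution of the number of ones among independent choices with probabilities p."""
    dist = [1.0]
    for p_i in p:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p_i)  # individual chooses 0
            new[k + 1] += mass * p_i      # individual chooses 1
        dist = new
    return dist


def cohort_log_likelihood(cells, beta, gamma):
    """Sum of log P(y_bar_ct) over all cells; each cell is (k, xs, x_bar_c)."""
    total = 0.0
    for k, xs, x_bar_c in cells:
        # Assumed Probit probabilities of choosing 1 for each individual in the cell.
        p = [norm_cdf(x * beta + x_bar_c * gamma) for x in xs]
        total += log(count_distribution(p)[k])
    return total


# Two hypothetical cells: (number of ones, individual regressors, cohort mean).
cells = [
    (2, [0.1, -0.3, 0.8], 0.2),
    (1, [0.5, 0.5], 0.5),
]
ll = cohort_log_likelihood(cells, beta=1.0, gamma=0.5)
print("log-likelihood:", ll)

# Sanity check: identical regressors within a cell give a binomial probability.
p_same = norm_cdf(0.5 * 1.0 + 0.5 * 0.5)
binomial_ll = log(comb(2, 1) * p_same * (1.0 - p_same))
print("second cell:", cohort_log_likelihood(cells[1:], 1.0, 0.5), "binomial:", binomial_ll)
```

A function of this kind could be passed to a numerical optimizer to search over $(\beta, \gamma)$; the optimizer and the treatment of vector-valued regressors are left out of this sketch.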
