
Parametric and nonparametric analysis of simultaneous equation models with latent variables: theory and applications

TELLEZ, Juan Manuel

Abstract

This doctoral thesis consists of three chapters. Its main contribution is to present two theoretical frameworks, one parametric and one nonparametric, for comparing and evaluating unobservable theoretical concepts used in contexts such as well-being, development, health and inequality, among others. The first chapter presents the parametric model together with an estimation method and a numerical example with simulations. The second chapter applies this type of model to French data in the context of health, operationalizing Amartya Sen's capability approach. Finally, the third chapter develops a nonparametric alternative, accompanied by an application to Indian data in the context of well-being, to test for first-order multivariate stochastic dominance.

TELLEZ, Juan Manuel. Parametric and nonparametric analysis of simultaneous equation models with latent variables: theory and applications. Doctoral thesis: Univ. Genève, 2014, no. SES 858

URN : urn:nbn:ch:unige-404528

DOI : 10.13097/archive-ouverte/unige:40452

Available at:

http://archive-ouverte.unige.ch/unige:40452

Disclaimer: layout of this document may differ from the published version.

Thesis presented to the Faculté des Sciences Économiques et Sociales of the Université de Genève

by Juan Tellez

for the degree of

Docteur ès Sciences Économiques et Sociales, specialization in Econometrics

Members of the thesis jury

Prof. Jaya Krishnakumar, thesis supervisor, Université de Genève
Prof. Gérard Antille, president of the jury, Université de Genève
Prof. Eva Cantoni Renaud, Université de Genève
Prof. Jeffrey Racine, McMaster University, Ontario, Canada

Geneva, 11 August 2014
Thesis No. 858

The Faculty of Economic and Social Sciences, on the recommendation of the jury, has authorized the printing of this thesis, without thereby intending to express any opinion on the propositions stated in it, which are the sole responsibility of their author.

Geneva, 11 August 2014

The Dean

Bernard MORARD

Contents

Acknowledgements
Abstract
Résumé

1 FIML estimation of SEM with latent variables and dynamic effects
1.1 Introduction
1.2 Literature Review
1.3 The model
1.4 Assumptions
1.5 The Maximum Likelihood Estimation
1.6 Statistical Properties of the likelihood estimator
1.7 The ML and EM algorithm solution
1.8 Results
1.9 Further Developments
1.10 Conclusions
1.11 References
1.12 Appendix A
1.13 Appendix B
1.14 Appendix C
1.15 Appendix D

2 Alzheimer’s Capabilities
2.1 Introduction
2.2 Literature review
2.3 Database
2.4 Capabilities and Functionings
2.5 Exogenous variables
2.6 The Model
2.7 Identification of the model
2.8 Model estimation
2.9 Estimation Results
2.9.1 1st variant
2.9.2 2nd variant
2.9.3 3rd variant
2.10 Alzheimers vs Non-Alzheimers
2.11 Conclusions
2.12 References
2.13 Appendix

3 Nonparametric Multivariate Stochastic Dominance
3.1 Introduction
3.2 Literature Review
3.3 Nonparametric Functions
3.4 Bandwidth Selection and Conditional Density Estimation
3.5 Testing for Stochastic Dominance
3.6 Data
3.7 Results
3.8 Conclusions
3.9 References
3.10 Appendix

Acknowledgements

I have incurred more debts than I can fully acknowledge in such a short space, so must begin by expressing warm, general thanks to all those who helped to bring this project to fruition. I owe an especial debt of gratitude to my dedicated thesis supervisor, Professor Jaya Krishnakumar, who has acted as a mentor since I began my first undergraduate research project. Professor Krishnakumar helped to nurture my interest in Econometrics and she has always given so generously of her time, sharing with me her expertise in this and other related fields. She has helped to guide not only my doctoral research but also my career choices, and I am deeply grateful to her. I would also like to extend my sincerest thanks to Professor Jeffrey Racine, who agreed to act as the external member of my thesis panel, and who supervised my work while I was on an exchange semester at McMaster University in Canada. I learnt a great deal from Professor Racine; he shared his valuable expertise with me while helping me to improve my work method, research skills, and academic writing. I would also like to thank the other members of my thesis panel: Professor Gérard Antille, the president of my thesis panel, who did a fine job managing the administrative side of my thesis defense, and Professor Eva Cantoni, who was always available to answer my questions. I am grateful to my thesis panel for taking the time to read my thesis, and for their precious comments, corrections, and feedback.

Dr. Catherine Le Galès and Dr. Martine Bungener gave me the opportunity to work at INSERM (the French Institute of Health and Medical Research) as a researcher, and I am thankful for this opportunity and the chance to produce research that informed a chapter of my thesis. Thanks also to Professor Elvezio Ronchetti, who has been a great support and has acted as an invaluable advisor for my career choices. I would especially like to thank Gerda Cabej for all the help provided in the programming part of the first chapter, as well as Dr. Emma Depledge for proofreading my thesis and improving my written English without changing my meaning or encroaching on my personal style.

On a personal note, the following friends have provided advice and assistance at several stages of this project: Sergio Alcocer, Dr. Estefania Amer, Dr. Paola Ballon, Carlos de Porres, Dr. Jorge Davalos and Dr. Irina Irintcheeva. The list of friends and colleagues who have given me moral support during this time is too long to mention but they know who they are, and I thank them from the bottom of my heart. In particular, I would like to mention Tarek Houdrouge, Rayan Kaouk and Philippe Turrian, who became family, for their true, solid and beautiful friendship, great companionship, and unlimited support over the last four years. I could not ask for more wonderful friends.

I would like to extend sincerest thanks to my family who have been endlessly supportive and who have helped me in a multitude of different ways over the years, for their love and support during the composition of this thesis. Last, but by no means least, I would like to express my wholehearted gratitude to my parents and my sister, who have given me everything since the beginning, offering unconditional support at every stage in my career. Knowing that I can always count on them gave me the strength and courage to keep fighting for my goals and dreams, no matter how unrealistic they seemed. My parents believed in me even more than I believed in myself, and they taught me that success is impossible without self-belief and hard work. They helped me to make my dreams come true and for that I am eternally grateful. I have always thought that the best gift life has to give is the people who surround us. As these acknowledgements attest, I am truly blessed and have been surrounded by many, wonderful and loyal people. I remain forever grateful to you all.

Abstract

My PhD dissertation consists of three chapters. My contribution lies in suggesting new theoretical frameworks, parametric and nonparametric, that prove useful for making comparisons and evaluations in contexts of development, health and inequality. The first chapter proposes a new theoretical framework of Structural Equation Models with latent variables and panel data, with individual effects as well as dynamic effects given by the observed endogenous variables. An estimation procedure using Full Information Maximum Likelihood with the EM algorithm is presented, together with a numerical example with simulations. The main objective is to contribute a theoretical extension for this type of model, whose use has increased intensively in several areas including development and health.

The second chapter uses the theoretical framework of the first chapter (adapted to the available data) to examine the well-being of persons aged 60 and above in France from the point of view of their ability to accomplish certain personal care and domestic activities. The theoretical underpinning of our analysis is the capability approach, developed by Amartya Sen, which advocates that well-being should be evaluated in terms of freedoms and opportunities. We used a latent variable modeling framework to assess the ‘freedom to be’ and ‘to do’ of our group. As the indicators in our sample combine the autonomy aspect with the satisfaction with help received, we had to construct a functionings vector based on assumed orderings of the reported situation in terms of its value to the individual, and to perform robustness checks. Our results have interesting policy implications: for instance, living as a couple or receiving visits from children has a positive impact, whereas living with children may not. People with Alzheimer’s disease are particularly disadvantaged, even when compared to people with other serious impairments.

The last chapter presents an appropriate theoretical framework to evaluate the influence of education and social group on Indian women’s well-being, which is measured by the Wealth Index and the level of hemoglobin. We selected only women who were head of household, as a sign already of female empowerment, and we estimated nonparametric conditional cumulative distributions using the four variables mentioned above. We then tested first-order stochastic dominance within each social group between individuals with a high and a low level of education, and we did the same holding the level of education fixed and changing the social group.

Using this methodology, we found strong evidence of a positive correlation between education and well-being in all social groups, and that some inequalities of opportunity still exist between them. The conclusions are not the same if the tests are performed by estimating the conventional functions.

Résumé

This doctoral thesis consists of three chapters. Its main contribution is to present two theoretical frameworks, parametric and nonparametric, for comparing and evaluating unobservable theoretical concepts used in contexts such as well-being, development, health and inequality, among others.

The first chapter proposes a new theoretical framework of structural equation models with latent variables and panel data, with individual effects as well as dynamic effects coming from the observed endogenous variables. We derive the maximum likelihood estimator of this model using the EM algorithm and we implement this method in a numerical example studied with simulations. The aim of this extension is to provide a more adequate framework for representing the phenomena studied in the contexts mentioned above, where the use of this type of model is growing rapidly.

The second chapter uses the theoretical framework of the first chapter (adapted to the available data) to examine the well-being of persons aged 60 and above in France from the point of view of their ability to accomplish certain personal care and domestic activities. The theoretical foundation of our analysis is the ‘capability’ approach, developed by Amartya Sen, which advocates that an individual’s well-being should be evaluated in terms of the freedoms and opportunities available to him or her. We used a latent variable modeling framework to evaluate this freedom to be and to do for the selected individuals. The well-being indicators that we constructed for these individuals combine the autonomy aspect with the satisfaction with the help received (or not) to accomplish these activities, providing an ordered vector of ‘functionings’ that can be analyzed by our econometric model. Our results have interesting public policy implications: for example, living as a couple or receiving visits from children has a positive impact, whereas living with children does not have the same effect on ‘capabilities’. In addition, people with Alzheimer’s disease are particularly disadvantaged, even in comparison with people with other serious health impairments. We checked the robustness of our results with respect to the way the ‘functionings’ are constructed.

The last chapter presents a nonparametric theoretical framework to evaluate the well-being of women in India, measured by multiple indicators (here the wealth index for economic well-being and the hemoglobin level for health well-being), and to study the influence of the level of education and of the social group on well-being thus defined. We selected only women who were heads of household, in order to have a certain homogeneity in the situation of these persons, and we estimated the nonparametric conditional distribution functions using the four variables mentioned above. We then tested first-order stochastic dominance with respect to the level of education, holding the social group constant, and vice versa. Using this methodology, we found that educated women dominate less educated women (as would be expected) and, more importantly, that there is dominance between social groups for the same level of education, which shows certain inequalities of opportunity between them. The conclusions are not the same if the tests are performed using the conventional functions.

Full Information Maximum Likelihood Estimation of Structural Equation Models with Latent Variables and dynamic observed outcomes using panel data

In this chapter we present an innovative framework for Structural Equation Models (SEM) with latent variables using cross-sectional and time-series data, in the presence of dynamic effects in the structural part of the model given by the indicators of the latent variables, as well as individual random effects in both parts.

We present an estimation procedure using Full Information Maximum Likelihood with the EM algorithm, treating the unobserved variables as missing values. We provide the calculations as well as all the necessary steps for this context in order to obtain a consistent way of estimating the parameters. We present a numerical example based on simulations and finally we draw some conclusions.

1.1 Introduction

Latent variables can be defined as random variables which may be inferred but not directly observed. They are the opposite of “manifest variables”, which one can measure or observe (cf. Skrondal and Rabe-Hesketh 2004). Latent variables generally represent concepts, such as a person’s level of well-being, which are hard to gauge. Although well-being is often discussed in economic and social studies, there is currently no set way of measuring it. In fact, in order to assess an individual’s well-being, different aspects of his or her life must be considered. These include their overall health, education, the economic resources to which they have access, their social relations, political freedom, and so on (Sen 1999). Most often, one observes several achievement or outcome indicators in the above-mentioned dimensions, which can all be considered as different manifestations of the underlying latent concept that is well-being.

The latent variable approach postulates that observed variables are imperfect measures of the corresponding underlying concepts. It is well known that, in the presence of measurement errors on the explanatory variables of a structural model, classical least squares procedures give inconsistent estimators. It is therefore necessary to resort to maximum likelihood and method-of-moments type estimators in order to estimate both the unknown parameters and the latent factors.

This chapter will focus on Structural Equation Models (SEM), a class of methodologies “that seeks to represent hypothesis about summary statistics derived from empirical measurements in terms of a smaller number of structural parameters defined by a hypothesized underlying model” (Kaplan 2009). The widespread use of these models in several areas, such as well-being measurement (Kuklys 2005; Di Tommaso 2007; Krishnakumar 2007; Krishnakumar and Ballon 2008; Krishnakumar 2008; Anand, Krishnakumar and Ngoc 2011), underground economy (Breusch 2005), corruption (Dreher, Kotsogiannis and McCorriston 2009), finance (Papadopoulos and Amemiya 2005; Baranoff, Papadopoulos and Sager 2007) and marketing (Verworn 2009), among others, shows the importance and versatility of this framework. The cross-disciplinary use of the framework also highlights the need to extend current theories so that new insights can be gained.

Some extensions exist for the use of multilevel data and latent variable models. These include hierarchical modeling and latent growth models. For example, latent growth models capture time effects with a trend variable and allow different parameters for each time unit (see Bollen and Curran 2006). Our approach differs from the two mentioned above in that we propose an extension of the original SEM to a panel data framework using random effects in both parts: the measurement equations and the structural equations. We additionally take into account lagged effects coming from past values of the observed endogenous variables, something which the literature to date has not addressed. By combining all these new factors in a single theoretical model we provide a consistent estimation procedure for this new model (assuming that all functional form assumptions are valid), while also offering a computational procedure that allows practitioners to calculate the estimators of the parameters.

Well-being provides an apt example. Suppose we have a framework in which well-being in different dimensions is represented by latent variables, each measured by multiple indicators. Higher achievements in one period often influence well-being in the next period, owing to the conducive environment created by better outcomes and the spreading of well-being effects over many periods. In the context of social policy evaluation, where the effort a government puts into designing policy is taken as the latent variable, the outcome of a social policy programme (such as universal health insurance) in a particular period is bound to have an impact on the government’s input in the same area in the following periods.

In the first part of this chapter we develop an SEM for panel data with dynamic effects coming from the observed endogenous variables. This is followed by a literature review of existing methodologies before we introduce the model and assumptions. We then explain the estimation methods and end by suggesting a procedure to find the solution and showing a numerical example.

1.2 Literature Review

Structural Equation Models can be described as a combination of two well-known models: factor analysis (in which a set of observed variables are indicators of a latent or unobserved variable) and simultaneous equation models (multiple equations in which variables can appear in several equations, thus affecting each other mutually). The former originates from the field of psychology while the latter was mostly developed by econometricians. Galton (1889) and Pearson and Lee (1903) were the first to use factor analysis (in genetics), but Spearman (1904) is credited with its development in his work on the measurement of intelligence.

A great deal of work has been done on latent variable models, and the field continues to blossom.¹ Research on latent variable models has been extensive since these techniques are applied in many fields, such as economics, psychology and political science. We are particularly interested in the extended version of the SEM with quantitative outcomes and exogenous variables in the measurement equations. Jöreskog (1969) and Sörbom (1974) helped to popularize this model. A popular variant of the SEM is the Multiple Indicators and Multiple Causes model (MIMIC), originally introduced by Jöreskog and Goldberger (1975), and an even simpler version is the Factor Analysis model (FA).

When panel data (or longitudinal data) are available for estimation, one can introduce both heterogeneity and dynamics of behaviour in the specification of the structural (simultaneous) model, incorporating the dependence of the current endogenous variables on their past values. In addition, there may be feedback in the form of the impact the lagged outcome variable has on the current value of the latent variable. Very little has been written about dynamic panel data models with latent variables. Heckman (1981), Honoré (1992), and Honoré and Kyriazidou (2000) are to be credited for their work on dynamic binary choice and limited dependent variable models with fixed effects. However, to our knowledge, similar studies for dynamic simultaneous equation models with latent variables in the presence of panel data (with or without qualitative outcome variables) do not exist.

¹ See Bollen (1989), Bartholomew & Knott (1999) and Skrondal & Rabe-Hesketh (2004) for detailed discussion of latent variable models, and Huber, Ronchetti, Victoria-Feser (2004) and Moustaki, Knott (2000) for recent developments in the field.

Dupuis and Ryan (1996) built on Laird and Ware’s work (1982) by presenting a latent variable model with multiple outcomes and allowing covariates in the measurement equations (the factor analysis part) that explain the observed endogenous variables. The structural part (the interdependent system of equations of latent variables) contains a set of different exogenous variables that affect the unobserved variables. They estimate the parameters by maximum likelihood and restricted maximum likelihood with the Expectation/Conditional Maximization Either (ECME) algorithm proposed by Liu and Rubin (1994). This in turn is an extension of the work of Dempster et al. (1977). The authors present an application based on birth defects data in order to evaluate the side effects of anticonvulsant medication taken during pregnancy.

Roy and Lin (2000) propose a MIMIC model of a single latent variable with repeated measures over time. They introduce random effects on both parts and estimate the whole model by maximum likelihood using the EM algorithm. The authors applied their method to data from a U.S. national panel study in order to assess methadone treatment results, their variations over time, and which variables are the best predictors of effectiveness.

Huber, Ronchetti and Victoria-Feser (2004) develop a new estimator for Generalized Linear Latent Variable Models (GLLVMs) as an alternative procedure, obtained by applying a Laplace approximation to the likelihood function. Not only are its asymptotic properties known (since it can be considered as an M-estimator); the authors also present a simulation study showing its accuracy in finite samples. Their application to household wealth characteristics using Swiss consumption data illustrates the advantages of their method.

Moustaki and Knott (2000); Muthén (2002); Moustaki (2003); Moustaki, Jöreskog and Mavridis (2004); Muthén (2007); Tsonaka and Moustaki (2007); Rizopoulos and Moustaki (2008); and Cagnone, Moustaki and Vasdekis (2009) demonstrate a large range of significant, recent theoretical developments in the field, especially for longitudinal data, binary, ordinal and nominal outcomes, interdependencies between items, parameter constraints, and non-linearity.

1.3 The model

We propose a special dynamic SEM for cross-sectional and time-series data (panel data). The first part of the model is called the “structural model”, an interdependent system of equations in unobserved variables, where latent endogenous variables are influenced by other latent endogenous variables and by latent exogenous variables. We extend the original model to cover cases in which observed endogenous variables influence the latent endogenous variables. More specifically, we add one lag of the indicators to the equations.

Let us assume that we have $n$ individuals observed over $T$ periods. The model can be written for individual $i$ at time $t$ as follows:

$$y^*_{it} = \alpha_i + A\,y^*_{it} + D\,y_{i,t-1} + C\,x^*_{it} + \varepsilon_{it}, \tag{1.1}$$

where $y^*_{it}$ denotes the $(M\times 1)$ vector of latent endogenous variables (unobserved variables), which in our case will represent the “true” or “potential” level that individuals could reach; $y_{i,t-1}$ is a $(P\times 1)$ vector of lagged observed endogenous variables (indicators) of the respective latent variables, intended to capture the direct dependence over time; $x^*_{it}$ denotes an $(S\times 1)$ vector of latent exogenous variables, which includes all known and available explanatory factors; $\alpha_i$ is an $(M\times 1)$ vector containing individual-specific random effects for each latent endogenous variable, which captures the heterogeneity of behaviour of individuals; and $\varepsilon_{it}$ is an $(M\times 1)$ vector containing random error terms, which are the measurement errors of the equations. $A$, $D$ and $C$ are coefficient matrices of appropriate dimensions.

The second part of the model consists of measurement equations in which both latent endogenous variables and latent exogenous variables are observed through a set of indicators. This is described as follows:

$$y_{it} = b_i + \Lambda\,y^*_{it} + \Theta\,w_{it} + u_{it} \tag{1.2}$$

$$x_{it} = \delta_i + \Gamma\,x^*_{it} + e_{it}, \tag{1.3}$$

where $y_{it}$ denotes the $(P\times 1)$ vector containing the indicators that allow us to measure the latent variables; $w_{it}$ is a $(K\times 1)$ vector of covariates explaining the observed endogenous variables, which means that this set of variables influences the “potential” level on its way to becoming the “true” level; $b_i$ and $\delta_i$ are $(P\times 1)$ and $(Q\times 1)$ vectors of random intercepts; and $u_{it}$ and $e_{it}$ are $(P\times 1)$ and $(Q\times 1)$ disturbance vectors. A constant term for each part can be included in the matrix of exogenous variables.

We have the following properties for the model:

$$E(\alpha_i)=0,\; E(\varepsilon_{it})=0,\; E(b_i)=0,\; E(u_{it})=0,\; E(\delta_i)=0,\; E(e_{it})=0,$$

meaning that all random effects and disturbances have zero expectation, and

$$V(\alpha_i)=\Psi,\; V(\varepsilon_{it})=\Omega,\; V(b_i)=B,\; V(u_{it})=\Upsilon,\; V(\delta_i)=\Xi,\; V(e_{it})=\Phi.$$

All random effects and error terms are uncorrelated with $x^*_{it}$, $x_{it}$ and $w_{it}$. It is common to assume $\Omega$, $\Upsilon$, $\Xi$ and $\Phi$ to be diagonal matrices and $\Psi$ and $B$ to be full matrices. If there are identification problems, one can assume $B$ to be diagonal (since correlations between the equations can all be captured by the latent variables) and therefore keep only $\Psi$ as a full matrix. However, with our dynamic effects we are guaranteed to find correlation between latent variables even if $\Psi$ is not kept as a full matrix.

Let us outline three possible cases for the second group of measurement equations:

Case 1: $\Gamma \neq I$

In this case the latent exogenous variables are observed through a set of manifest variables.

Case 2: $\Gamma = I$

This case corresponds to measurement error: the manifest variable equals the latent variable plus a random term specific to each individual and variable, plus an error term.

Case 3: $\Gamma = I$ and $\delta_i = 0$, $e_{it} = 0$ $\forall\, i, t$

In this case the exogenous variables are directly observed, meaning that we can drop this part of the model and focus on the endogenous part. In Economics we are often in the presence of cases 2 and 3. From now on we will focus on case 3, in which the exogenous variables are observed, i.e. $x_i = x_i^*$.
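To make the data-generating process under Case 3 concrete, the following is a minimal simulation sketch of equations (1.1) and (1.2). It is an illustration only and not the simulation design used in this chapter: the function name `simulate_panel`, the standard normal draws for the exogenous variables, the initial condition $y_{i,0}=0$ and all parameter values are assumptions made here. The structural equation is solved in reduced form, $y^*_{it} = (I_M - A)^{-1}(\alpha_i + D y_{i,t-1} + C x_{it} + \varepsilon_{it})$, which requires $I_M - A$ to be invertible.

```python
import numpy as np

def simulate_panel(n, T, A, D, C, Lam, Theta, Psi, Omega, B, Ups, rng):
    """Simulate (y, y_star, x, w) from equations (1.1)-(1.2) under Case 3."""
    P, M = Lam.shape
    S, K = C.shape[1], Theta.shape[1]
    H_inv = np.linalg.inv(np.eye(M) - A)   # reduced form needs (I_M - A)^{-1}
    x = rng.normal(size=(n, T, S))         # observed exogenous variables (assumed N(0,1))
    w = rng.normal(size=(n, T, K))         # covariates of the measurement part (assumed N(0,1))
    alpha = rng.multivariate_normal(np.zeros(M), Psi, size=n)  # structural random effects
    b = rng.multivariate_normal(np.zeros(P), B, size=n)        # measurement random effects
    y_star = np.zeros((n, T, M))
    y = np.zeros((n, T, P))
    y_lag = np.zeros((n, P))               # assumption: y_{i,0} = 0
    for t in range(T):
        eps = rng.multivariate_normal(np.zeros(M), Omega, size=n)
        u = rng.multivariate_normal(np.zeros(P), Ups, size=n)
        rhs = alpha + y_lag @ D.T + x[:, t] @ C.T + eps
        y_star[:, t] = rhs @ H_inv.T                                 # equation (1.1)
        y[:, t] = b + y_star[:, t] @ Lam.T + w[:, t] @ Theta.T + u   # equation (1.2)
        y_lag = y[:, t]
    return y, y_star, x, w

# Hypothetical parameter values, chosen only for illustration.
rng = np.random.default_rng(42)
M, P, S, K = 2, 4, 1, 1
A = np.array([[0.0, 0.3], [0.2, 0.0]])
D = 0.1 * np.ones((M, P))
C = np.ones((M, S))
Lam = np.vstack([np.eye(M), 0.5 * np.ones((2, M))])
Theta = 0.5 * np.ones((P, K))
y, y_star, x, w = simulate_panel(
    200, 5, A, D, C, Lam, Theta,
    Psi=0.5 * np.eye(M), Omega=np.eye(M), B=0.2 * np.eye(P), Ups=0.3 * np.eye(P),
    rng=rng,
)
print(y.shape, y_star.shape)
```

In an estimation exercise only `y`, `x` and `w` would be treated as observed; `y_star` is returned here so that estimates of the latent variables can be compared with the simulated values.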

In figure 1.1 we can see the path diagram of our model, which is the graphical representation of the relationships between the variables of the whole model - the structural part as well as the measurement equations.


Figure 1.1: Path diagram

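Let us stack all $T$ periods for one individual in the “structural part” of the model. This part can then be presented in matrix form as:

$$y_i^* = (\iota_T\otimes\alpha_i) + (I_T\otimes A)\,y_i^* + (I_T\otimes D)\,y_{i,-1} + (I_T\otimes C)\,x_i + \varepsilon_i$$

Let us do the same in the “measurement part”:

$$y_i = (\iota_T\otimes b_i) + (I_T\otimes\Lambda)\,y_i^* + (I_T\otimes\Theta)\,w_i + u_i$$

The whole system can be written in an alternative way by using the vec operator:

$$y_i^* = (\iota_T\otimes I_M)\,\alpha_i + (Y_i^{*\prime}\otimes I_M)\operatorname{vec}(A) + (Y_{i,-1}'\otimes I_M)\operatorname{vec}(D) + (X_i'\otimes I_M)\operatorname{vec}(C) + \varepsilon_i$$

or

$$y_i^* = H^{-1}\big[(\iota_T\otimes I_M)\,\alpha_i + (Y_{i,-1}'\otimes I_M)\operatorname{vec}(D) + (X_i'\otimes I_M)\operatorname{vec}(C) + \varepsilon_i\big]$$

$$y_i = (\iota_T\otimes I_P)\,b_i + (Y_i^{*\prime}\otimes I_P)\operatorname{vec}(\Lambda) + (W_i'\otimes I_P)\operatorname{vec}(\Theta) + u_i$$

where $H = (I_T\otimes[I_M - A])$, $y_i^* = \operatorname{vec}(Y_i^*)$, $y_i = \operatorname{vec}(Y_i)$ and $x_i = \operatorname{vec}(X_i)$, the columns of $Y_i^*$, $Y_i$ and $X_i$ being the period-$t$ vectors.

We have therefore assumed that the observations are centered, that all error terms have an expected value equal to zero, and that measurement errors are uncorrelated with one another and with the latent variables, be they endogenous or exogenous. The hypothesis of no correlation between latent exogenous variables and latent errors has also been outlined above. Latent errors are assumed homoscedastic and non-autocorrelated. Thus, the corresponding covariance matrix is diagonal.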
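The passage from the stacked form to the vec form rests on the identity $\operatorname{vec}(AY) = (I_T\otimes A)\operatorname{vec}(Y) = (Y'\otimes I_M)\operatorname{vec}(A)$. The short check below verifies it numerically on random matrices. It is only a sanity check; the helper `vec` and the dimensions are illustrative choices, and the matrix `Y_star` is assumed to hold the period-$t$ vectors in its columns.

```python
import numpy as np

def vec(mat):
    """Stack the columns of a matrix into one vector (column-major order)."""
    return mat.reshape(-1, order="F")

rng = np.random.default_rng(0)
M, T = 3, 4
A = rng.normal(size=(M, M))
Y_star = rng.normal(size=(M, T))   # column t holds the period-t latent vector
alpha = rng.normal(size=M)

target = vec(A @ Y_star)
stacked_form = np.kron(np.eye(T), A) @ vec(Y_star)   # (I_T kron A) y*_i
vec_form = np.kron(Y_star.T, np.eye(M)) @ vec(A)     # (Y*_i' kron I_M) vec(A)

# The random-effect term can be written in two equivalent ways as well.
iota = np.ones((T, 1))
re_stacked = np.kron(iota, alpha.reshape(-1, 1)).ravel()   # iota_T kron alpha_i
re_vec = np.kron(iota, np.eye(M)) @ alpha                  # (iota_T kron I_M) alpha_i

print(np.allclose(target, stacked_form), np.allclose(target, vec_form))
print(np.allclose(re_stacked, re_vec))
```

Writing the coefficients as $\operatorname{vec}(A)$, $\operatorname{vec}(D)$ and $\operatorname{vec}(C)$ makes the model linear in the unknown parameter vectors, which is what the estimating equations of the maximum likelihood section exploit.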

1.4 Assumptions

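We assume a multivariate normal distribution of $y_i$ and $y_i^*$.² Therefore we posit that:

$$\begin{pmatrix} y_i \\ y_i^* \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_{y_i} \\ \mu_{y_i^*} \end{pmatrix}, \begin{pmatrix} V_{y_i} & V_{y_i,y_i^*} \\ V_{y_i^*,y_i} & V_{y_i^*} \end{pmatrix} \right),$$

and by using the posterior means we can express:

$$E(y_i^*\mid y_i) \equiv \mu_c = \mu_{y_i^*} + V_{y_i^*,y_i}\,V_{y_i}^{-1}\,(y_i - \mu_{y_i})$$

and

$$V(y_i^*\mid y_i) \equiv V_c = V_{y_i^*} - V_{y_i^*,y_i}\,V_{y_i}^{-1}\,V_{y_i,y_i^*}$$

We calculate the following expressions (see details in Appendix B):

$$\operatorname{vec} V_{y_i^*} = \big(I_{(MT)^2} - G\big)^{-1}\operatorname{vec} C$$

$$\operatorname{vec} V_{y_i} = \big[(\iota_T\otimes I_P)\otimes(\iota_T\otimes I_P)\big]\operatorname{vec} B + \big[(I_T\otimes\Lambda)\otimes(I_T\otimes\Lambda)\big]\operatorname{vec} V_{y_i^*} + \operatorname{vec}(I_T\otimes\Upsilon)$$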
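The conditional moments $\mu_c$ and $V_c$ are the quantities an E-step needs. A minimal sketch of their computation is given below; the function name `conditional_moments` and the toy factor model used to exercise it (a single period, no random effects, no covariates) are assumptions introduced here for illustration and are not the full model of this chapter.

```python
import numpy as np

def conditional_moments(y, mu_y, mu_s, V_y, V_s, V_sy):
    """Return E(y*|y) and V(y*|y) for jointly normal (y, y*).

    V_sy is the covariance of y* with y, of shape (dim y*, dim y).
    """
    gain = np.linalg.solve(V_y, V_sy.T).T   # V_sy V_y^{-1}, using symmetry of V_y
    mu_c = mu_s + gain @ (y - mu_y)
    V_c = V_s - gain @ V_sy.T
    return mu_c, V_c

# Toy check on y = Lam y* + u with y* ~ N(0, V_s) and u ~ N(0, Ups).
rng = np.random.default_rng(3)
Lam = rng.normal(size=(4, 2))
V_s = np.array([[1.0, 0.3], [0.3, 2.0]])
Ups = np.diag([0.5, 0.7, 0.9, 1.1])
V_y = Lam @ V_s @ Lam.T + Ups
V_sy = V_s @ Lam.T
y_obs = rng.normal(size=4)

mu_c, V_c = conditional_moments(y_obs, np.zeros(4), np.zeros(2), V_y, V_s, V_sy)

# For this toy model the same posterior variance has a known precision form.
V_c_precision = np.linalg.inv(np.linalg.inv(V_s) + Lam.T @ np.linalg.inv(Ups) @ Lam)
print(np.allclose(V_c, V_c_precision))
```

Solving the linear system instead of forming $V_{y_i}^{-1}$ explicitly is a numerical convenience; both routes give the same moments.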

Vyi,yi=H1(ITD)Bee +Vyi(ITΛ) +

T2 t=1

Het1H1(ITD)Υt

² We also assume multivariate normality of the individual random effects $\alpha_i$ and $b_i$.

1.5 The Maximum Likelihood Estimation

Since individuals are independent we can write the complete data likelihood3 as:

L(a,c,d, λ, θ,Ψ,Ω,B,Υ) =

N i=1

f(yi|yi, αi,bi, λ, θ,Υ)f(yii,bi,a,c,d,Ω)f(αi|Ψ)f(bi|B)

wherea=vec(A),c=vec(C),d=vec(D),λ=vec(Λ),θ=vec(Θ), and we as- sumeαiandbito be independent. Exogenous variables are not indicated explicitly as given in the second part of the equality in order to avoid over notation.

Given equations (1.1) and (1.2), the likelihood can be rewritten as:

L(.) =

N i=1

f(yi,yi, αi,bi|wi,xi)

L(.) =

N i=1

f(yi,yi|wi,xi,bi, αi)f(αi)f(bi)

L(.) =

N i=1

f(yi|yi,wi,xi,bi, αi)f(yi|wi,xi,bi, αi)f(αi)f(bi)

and by removing the exogenous wiand xi(just in the notation) we will have:

L(.) =

N i=1

f(yi|yi,bi)f(yii)f(αi)f(bi)

and applying log on both sides:

logL(.) =log[

N i=1

f(yi|yi,bi)f(yii)f(αi)f(bi)]

logL(.) =

N i=1

[logf(yi|yi,bi) +logf(yii) +logfi) +logf(bi)]

3The complete data likelihood contains all datayi,yi,αi,bi,wiandxi; the observed data areyi,wiand xi. If we we want to calculate the marginal likelihood ofyi we would need to integrate all the other unobserved terms of the complete data likelihood.


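$$\begin{aligned}
\log L(\cdot) = -\frac{1}{2}\sum_{i=1}^{N}\Big[\, & \text{const} + \log|I_T \otimes \Upsilon| \\
& + \big(y_i - (\iota_T \otimes I_P)b_i - (Y_i^{*\prime} \otimes I_P)\mathrm{vec}\,\Lambda - (W_i' \otimes I_P)\mathrm{vec}\,\Theta\big)'\,(I_T \otimes \Upsilon)^{-1} \\
& \quad \times \big(y_i - (\iota_T \otimes I_P)b_i - (Y_i^{*\prime} \otimes I_P)\mathrm{vec}\,\Lambda - (W_i' \otimes I_P)\mathrm{vec}\,\Theta\big) \\
& + \log|I_T \otimes \Omega| + \log|H| \\
& + \big(y_i^* - (\iota_T \otimes I_M)\alpha_i - (Y_i^{*\prime} \otimes I_M)\mathrm{vec}\,A - (Y_{i,-1}' \otimes I_M)\mathrm{vec}\,D - (X_i' \otimes I_M)\mathrm{vec}\,C\big)'\,(I_T \otimes \Omega)^{-1} \\
& \quad \times \big(y_i^* - (\iota_T \otimes I_M)\alpha_i - (Y_i^{*\prime} \otimes I_M)\mathrm{vec}\,A - (Y_{i,-1}' \otimes I_M)\mathrm{vec}\,D - (X_i' \otimes I_M)\mathrm{vec}\,C\big) \\
& + \log|\Psi| + \alpha_i'\Psi^{-1}\alpha_i + \log|B| + b_i'B^{-1}b_i \Big]
\end{aligned}$$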

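Alternatively, the random effects can be combined with the disturbances into a new composite error term, which gives the following likelihood:

$$\begin{aligned}
\log L(\cdot) = -\frac{1}{2}\sum_{i=1}^{N}\Big[\, & \text{const} + \log\big|(\iota_T \otimes I_P)B(\iota_T \otimes I_P)' + (I_T \otimes \Upsilon)\big| \\
& + \big(y_i - (Y_i^{*\prime} \otimes I_P)\mathrm{vec}\,\Lambda - (W_i' \otimes I_P)\mathrm{vec}\,\Theta\big)'\,\big((\iota_T \otimes I_P)B(\iota_T \otimes I_P)' + (I_T \otimes \Upsilon)\big)^{-1} \\
& \quad \times \big(y_i - (Y_i^{*\prime} \otimes I_P)\mathrm{vec}\,\Lambda - (W_i' \otimes I_P)\mathrm{vec}\,\Theta\big) \\
& + \log\big|(\iota_T \otimes I_M)\Psi(\iota_T \otimes I_M)' + (I_T \otimes \Omega)\big| + \log|H| \\
& + \big(y_i^* - (Y_i^{*\prime} \otimes I_M)\mathrm{vec}\,A - (Y_{i,-1}' \otimes I_M)\mathrm{vec}\,D - (X_i' \otimes I_M)\mathrm{vec}\,C\big)'\,\big((\iota_T \otimes I_M)\Psi(\iota_T \otimes I_M)' + (I_T \otimes \Omega)\big)^{-1} \\
& \quad \times \big(y_i^* - (Y_i^{*\prime} \otimes I_M)\mathrm{vec}\,A - (Y_{i,-1}' \otimes I_M)\mathrm{vec}\,D - (X_i' \otimes I_M)\mathrm{vec}\,C\big) \Big]
\end{aligned}$$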

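This means that we can rewrite the model as:

$$y_i = (Y_i^{*\prime} \otimes I_P)\,\mathrm{vec}(\Lambda) + (W_i' \otimes I_P)\,\mathrm{vec}(\Theta) + \tilde{u}_i$$

$$y_i^* = (Y_i^{*\prime} \otimes I_M)\,\mathrm{vec}(A) + (Y_{i,-1}' \otimes I_M)\,\mathrm{vec}(D) + (X_i' \otimes I_M)\,\mathrm{vec}(C) + \tilde{\varepsilon}_i$$

where

$$\tilde{u}_i = (\iota_T \otimes I_P)\,b_i + u_i$$

$$\tilde{\varepsilon}_i = (\iota_T \otimes I_M)\,\alpha_i + \varepsilon_i$$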
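The covariance of the composite measurement error has a simple block pattern: every diagonal block equals B + Υ and every off-diagonal block equals B. A minimal sketch with hypothetical 2 x 2 values for B and Υ and T = 3 periods makes this visible.

```python
import numpy as np

T, P = 3, 2  # hypothetical number of periods and indicators
B = np.array([[1.0, 0.3], [0.3, 0.5]])  # random-effect covariance (made up)
Ups = np.diag([0.4, 0.2])               # measurement error covariance (made up)

J = np.kron(np.ones((T, 1)), np.eye(P))            # kron(iota_T, I_P)
V_u = J @ B @ J.T + np.kron(np.eye(T), Ups)        # covariance of the composite error

# View the (PT x PT) matrix as a T x T grid of P x P blocks:
# blocks[s, :, t, :] is the covariance between periods s and t.
blocks = V_u.reshape(T, P, T, P)
same_period = blocks[0, :, 0, :]   # B + Ups
cross_period = blocks[0, :, 2, :]  # B
```

The shared random effect is what links the periods: without it (B = 0) the matrix would reduce to the block-diagonal form used in the first likelihood.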


By looking at the log likelihood expression it is clear that we assume no correlation between individual random effects and disturbances. As we said above, this is merely another way to write the likelihood, and we will keep the first form to derive the calculations in the rest of the chapter.

By maximizing the first log-likelihood function with respect to the unknown parameters we find the following results (see Appendix C):

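1.

$$\mathrm{vec}\,\hat{\Theta} = \left\{ \sum_{i=1}^{N} E\left[(W_i \otimes I_P)(I_T \otimes \Upsilon^{-1})(W_i' \otimes I_P) \mid y_i\right] \right\}^{-1} \sum_{i=1}^{N} E\left\{(W_i \otimes I_P)(I_T \otimes \Upsilon^{-1})\left[y_i - (\iota_T \otimes I_P)b_i - (Y_i^{*\prime} \otimes I_P)\mathrm{vec}\,\Lambda\right] \mid y_i\right\}$$

2.

$$\mathrm{vec}\,\hat{\Lambda} = \left\{ \sum_{i=1}^{N} E\left[(Y_i^* \otimes I_P)(I_T \otimes \Upsilon^{-1})(Y_i^{*\prime} \otimes I_P) \mid y_i\right] \right\}^{-1} \sum_{i=1}^{N} E\left\{(Y_i^* \otimes I_P)(I_T \otimes \Upsilon^{-1})\left[y_i - (\iota_T \otimes I_P)b_i - (W_i' \otimes I_P)\mathrm{vec}\,\Theta\right] \mid y_i\right\}$$

3.

$$\mathrm{vec}\,\hat{D} = \left\{ \sum_{i=1}^{N} E\left[(Y_{i,-1} \otimes I_M)(I_T \otimes \Omega^{-1})(Y_{i,-1}' \otimes I_M) \mid y_i\right] \right\}^{-1} \sum_{i=1}^{N} E\left\{(Y_{i,-1} \otimes I_M)(I_T \otimes \Omega^{-1})\left[y_i^* - (\iota_T \otimes I_M)\alpha_i - (Y_i^{*\prime} \otimes I_M)\mathrm{vec}\,A - (X_i' \otimes I_M)\mathrm{vec}\,C\right] \mid y_i\right\}$$

4.

$$\mathrm{vec}\,\hat{C} = \left\{ \sum_{i=1}^{N} E\left[(X_i \otimes I_M)(I_T \otimes \Omega^{-1})(X_i' \otimes I_M) \mid y_i\right] \right\}^{-1} \sum_{i=1}^{N} E\left\{(X_i \otimes I_M)(I_T \otimes \Omega^{-1})\left[y_i^* - (\iota_T \otimes I_M)\alpha_i - (Y_i^{*\prime} \otimes I_M)\mathrm{vec}\,A - (Y_{i,-1}' \otimes I_M)\mathrm{vec}\,D\right] \mid y_i\right\}$$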
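Estimators 1 to 4 all share a generalized least squares structure. As a hedged illustration, the sketch below implements the update for Θ in the simplified situation where the latent terms are treated as known, so that the conditional expectations disappear; in the actual EM iteration they would be replaced by their E-step counterparts. The function name `update_theta`, the dimensions and the data are all hypothetical.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one vector (column-major order)."""
    return M.reshape(-1, order="F")

def update_theta(W_list, R_list, Ups):
    """GLS-type update for Theta with complete data.

    W_list : list of K x T exogenous matrices W_i
    R_list : list of P x T partial residuals, i.e. Y_i minus the random
             effect and latent-variable terms (assumed known here)
    Ups    : P x P measurement error covariance
    """
    P = Ups.shape[0]
    K, T = W_list[0].shape
    weight = np.kron(np.eye(T), np.linalg.inv(Ups))  # kron(I_T, Ups^{-1})
    lhs = np.zeros((P * K, P * K))
    rhs = np.zeros(P * K)
    for W, R in zip(W_list, R_list):
        Z = np.kron(W, np.eye(P))                    # kron(W_i, I_P)
        lhs += Z @ weight @ Z.T
        rhs += Z @ weight @ vec(R)
    return np.linalg.solve(lhs, rhs).reshape(P, K, order="F")

# Noise-free check with made-up data: the update should return Theta exactly.
rng = np.random.default_rng(0)
P, K, T, N = 3, 2, 6, 5
Theta_true = rng.normal(size=(P, K))
Ups = np.diag([0.5, 1.0, 2.0])
W_list = [rng.normal(size=(K, T)) for _ in range(N)]
R_list = [Theta_true @ W for W in W_list]
Theta_hat = update_theta(W_list, R_list, Ups)
```

The same function body would serve for Λ, D and C after swapping the regressor matrix, the residual and the weighting covariance.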


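5. The solution for $\mathrm{vec}\,\hat{A}$ is given by the following equation:

$$\sum_{i=1}^{N} \Big\{ T\,\mathrm{vec}\big([(I_M - A)']^{-1}\big) + 2E\big[(Y_i^* \otimes I_M)(I_T \otimes \Omega^{-1})\big(y_i^* - (\iota_T \otimes I_M)\alpha_i - (Y_{i,-1}' \otimes I_M)\mathrm{vec}\,D - (X_i' \otimes I_M)\mathrm{vec}\,C\big) \mid y_i\big] - 2E\big[(Y_i^* \otimes I_M)(I_T \otimes \Omega^{-1})(Y_i^{*\prime} \otimes I_M) \mid y_i\big]\,\mathrm{vec}\,A \Big\} = 0$$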

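6.

$$\hat{\Upsilon} = \frac{1}{NT} \sum_{i=1}^{N} E(U_i U_i' \mid y_i)$$

7.

$$\hat{\Omega} = \frac{1}{NT} \sum_{i=1}^{N} E(\varepsilon_i \varepsilon_i' \mid y_i)$$

8.

$$\hat{\Psi} = \frac{1}{N} \sum_{i=1}^{N} E(\alpha_i \alpha_i' \mid y_i)$$

9.

$$\hat{B} = \frac{1}{N} \sum_{i=1}^{N} E(b_i b_i' \mid y_i)$$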
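The conditional second moments that appear in estimators 6 to 9 can be built from conditional means and variances through a standard identity. It is written here for $b_i$ purely as an illustration; whether Appendix D proceeds in exactly this way is an assumption.

$$E(b_i b_i' \mid y_i) = V(b_i \mid y_i) + E(b_i \mid y_i)\,E(b_i \mid y_i)'$$

The update for the covariance of $b_i$ is then the average over individuals of the posterior variance plus the outer product of the posterior mean, and the same decomposition applies to $\alpha_i$.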

Intermediate computations are given in Appendices A, B, C and D for almost all of the estimators, except where one can be derived by analogy from the others. The likelihood function should include the parameter constraints specific to each case, whether they are identification constraints or exclusion restrictions. In those cases it is necessary either to take these restrictions into account in the form of the estimator, or to write the model directly with the constraints included (this chapter treats the general case).

1.6 Statistical Properties of the likelihood estimator

Let $\delta$ be the vector that contains all the parameters of the model, $\hat{\delta}_{ML}$ the vector containing the solutions and $\tilde{X}$ the matrix containing all the data for the likelihood.

The ML estimator can be expressed as an M-estimator (Huber 1964, 1981), which is defined as the solution of:

$$\sum_{i=1}^{N} \Psi(\tilde{X}_i; \delta) = 0$$


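The $\Psi(\tilde{X}_i; \hat{\delta}_{ML})$ functions are given by the first derivatives of the log-likelihood function with respect to the parameters. Moreover, the estimator $\hat{\delta}_{ML}$ is asymptotically normal under some conditions, given by Huber (1981), as follows:

$$\sqrt{n}\,(\hat{\delta}_{ML} - \delta) \xrightarrow{D} N\left[0,\; M(\delta)^{-1}\,\Sigma(\delta)\,M(\delta)'^{-1}\right]$$

where:

$$M(\delta) = -E\left\{ \frac{\partial \Psi(\tilde{X}; \delta)}{\partial \delta'} \right\}$$

and

$$\Sigma(\delta) = E\left[\Psi(\tilde{X}; \delta)\,\Psi(\tilde{X}; \delta)'\right]$$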

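In our particular case of maximum likelihood, $M(\delta)$ and $\Sigma(\delta)$ both coincide with the Fisher information matrix, denoted $J(\delta)$:

$$J(\delta) = E\left\{ \left[\frac{\partial \log f_\delta(\tilde{X})}{\partial \delta}\right] \left[\frac{\partial \log f_\delta(\tilde{X})}{\partial \delta}\right]' \right\} = -E\left[ \frac{\partial^2 \log f_\delta(\tilde{X})}{\partial \delta\, \partial \delta'} \right]$$

so the asymptotic distribution becomes:

$$\sqrt{n}\,(\hat{\delta}_{ML} - \delta) \xrightarrow{D} N\left[0,\; M(\delta)^{-1}\right]$$

where $M(\delta)^{-1} = J(\delta)^{-1}$ is equal to the Cramér-Rao bound, which is the minimum possible variance for an unbiased estimator.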
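As a hedged one-parameter illustration (not taken from the thesis), take $X_1, \dots, X_n$ i.i.d. $N(\mu, \sigma^2)$ with $\sigma^2$ known, so that $\delta = \mu$. The score function is $\Psi(x; \mu) = (x - \mu)/\sigma^2$, and

$$\frac{\partial \Psi(x;\mu)}{\partial \mu} = -\frac{1}{\sigma^2}, \qquad \Sigma(\mu) = E\left[\Psi(X;\mu)^2\right] = \frac{1}{\sigma^2}, \qquad J(\mu) = \frac{1}{\sigma^2}.$$

Whatever sign convention is adopted for $M(\mu)$, the sign cancels in the sandwich, so $M(\mu)^{-1}\,\Sigma(\mu)\,M(\mu)^{-1} = \sigma^2 = J(\mu)^{-1}$. The solution of the estimating equation is the sample mean $\bar{X}$, and $\sqrt{n}\,(\bar{X} - \mu)$ has variance exactly $\sigma^2$, which is the Cramér-Rao bound in this case.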

1.7 The ML and EM algorithm solution

Before proceeding to the model estimation it is important to check the model identification. One can use the two-step rule which consists of performing the identification in two separate steps: first, measurement equations are identified as a traditional factor analysis, and then one identifies the simultaneous equations where latent variables are treated as observed variables.

Since the latent variables $y^*$ and the random effects $\alpha$ and $b$ are not observed, we can use the EM algorithm (Dempster, Laird and Rubin, 1977) for the maximum likelihood estimation. This algorithm can be implemented in several steps. The main idea is to treat the latent variables and random effects as missing variables. The algorithm starts from a consistent estimate of these variables (assuming that the normal distribution is the true distribution) and then iterates until convergence to find all parameters.

The EM algorithm provides a framework for problems involving missing data, and unobserved variables are therefore treated as missing data. Dupuis and Ryan (1996), Roy and Lin (2000), Moustaki (2003), Cagnone, Moustaki and Vasdekis (2009) and Irincheeva, Cantoni and Genton (2012) have used the EM algorithm, or an extended version of it, to estimate the parameters of models involving latent variables.

The steps for the EM algorithm are:

Step 1: Choose initial estimates for the model parameters $a, c, d, \lambda, \theta, \Psi, \Omega, B, \Upsilon$ (from separate factor analysis and OLS fits, or arbitrarily).

Step 2: Calculate the conditional expectations of the sufficient statistics involving latent variables and random effects (see Appendix D): $E(y_i^* \mid y_i)$, $E(\alpha_i \mid y_i)$, $E(b_i \mid y_i)$, etc. (E-step).

Step 3: Obtain new estimates of the parameters by solving the complete maximum likelihood function (M-step).

Step 4: Return to step 2 and iterate until convergence.
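The four steps can be sketched on a deliberately simplified stand-in: a scalar random-intercept model y_it = b_i + u_it with b_i ~ N(0, B) and u_it ~ N(0, ups). This is far smaller than the model of this chapter and is not the MATLAB implementation used here; the function name `em_random_intercept`, the starting values and the simulated data are all assumptions made for illustration. The M-step mirrors the form of the variance estimators above, with averages of conditional second moments.

```python
import numpy as np

def em_random_intercept(Y, B=1.0, ups=1.0, max_iter=500, tol=1e-10):
    """EM for y_it = b_i + u_it with b_i ~ N(0, B), u_it ~ N(0, ups).

    Y is an N x T array; B and ups are the starting values (Step 1).
    """
    N, T = Y.shape
    ybar = Y.mean(axis=1)
    for _ in range(max_iter):
        # Step 2 (E-step): posterior variance and mean of b_i given y_i.
        v = B * ups / (T * B + ups)
        m = (T * B / (T * B + ups)) * ybar
        # Step 3 (M-step): averages of the conditional second moments.
        B_new = np.mean(m**2 + v)                    # mean of E(b_i^2 | y_i)
        ups_new = np.mean((Y - m[:, None])**2) + v   # mean of E(u_it^2 | y_i)
        # Step 4: stop once the parameters no longer move.
        done = abs(B_new - B) + abs(ups_new - ups) < tol
        B, ups = B_new, ups_new
        if done:
            break
    return B, ups

# Simulated data with made-up true values B = 1.0 and ups = 0.5.
rng = np.random.default_rng(1)
N, T = 1000, 10
b = rng.normal(scale=1.0, size=N)
Y = b[:, None] + rng.normal(scale=np.sqrt(0.5), size=(N, T))
B_hat, ups_hat = em_random_intercept(Y)
```

In this toy case the E-step has a closed form; in the full model the corresponding quantities come from the conditional normal moments of the latent variables and random effects.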

1.8 Results

We performed the simulations in MATLAB, programming all the theoretical functions needed to implement the EM algorithm. Our empirical model contains two latent variables ($y_{i1}^*$ and $y_{i2}^*$), five observed endogenous variables ($y_{i1}$, $y_{i2}$, $y_{i3}$, $y_{i4}$ and $y_{i5}$), two exogenous variables for the structural part of the model ($x_{i1}$, $x_{i2}$) and one exogenous variable for the measurement part of the model ($w_i$). In order to simplify the example and avoid constraints on the loading parameters, we have selected the case where all variance-covariance matrices are known. In other words, our focus is on the coefficients of the model. This choice does not affect our objective, since the main idea is to investigate how our procedure behaves.

We decided to work with a small sample of 50 individuals and 10 time periods, and we ran the EM algorithm until convergence on 200 samples. Our sample may be too small for this type of model, since such models are known to behave better asymptotically, but we wanted to study small samples as a first step. Figures 1.2, 1.3 and 1.4 show the results for the estimators of the structural part of the model, and Figures 1.5 and 1.6 show the estimators for the measurement part of the model. Note that Figure 1.2 has only one parameter, since the diagonal of the matrix A is equal to zero (each latent variable has no effect on itself) and we assumed simultaneity for only one of the latent variables. The stars in the graphs are the true values of the parameters and the red crosses are outliers. As the boxplots show, most of the true values lie within the interquartile range, and the median of several estimators is close to the true value. These results seem to indicate consistency under asymptotic conditions.
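A data-generating process with the same dimensions (2 latent variables, 5 indicators, 2 structural and 1 measurement exogenous variables, 50 individuals, 10 periods) can be sketched as below. The simulations reported here were run in MATLAB; this Python version is only an illustration. Every parameter value, the zero initial condition, the unit error variances and the reading of the lagged endogenous variables as the lagged observed indicators are assumptions, and the function name `simulate_panel` is hypothetical.

```python
import numpy as np

def simulate_panel(N=50, T=10, seed=0):
    """Simulate one panel from a hypothetical parameterization of the model."""
    rng = np.random.default_rng(seed)
    M, P, K = 2, 5, 2
    # Made-up coefficients: zero diagonal in A, simultaneity in one equation only.
    A = np.array([[0.0, 0.4], [0.0, 0.0]])
    D = rng.uniform(-0.1, 0.1, size=(M, P))    # lagged indicators
    C = rng.uniform(-0.2, 0.2, size=(M, K))    # structural exogenous variables
    Lam = rng.uniform(-1.0, 1.0, size=(P, M))  # loadings
    theta = rng.uniform(-0.05, 0.05, size=(P, 1))
    H_inv = np.linalg.inv(np.eye(M) - A)       # reduced-form multiplier

    X = rng.normal(size=(N, T, K))
    W = rng.normal(size=(N, T, 1))
    Y = np.zeros((N, T, P))
    Ystar = np.zeros((N, T, M))
    for i in range(N):
        alpha = rng.normal(scale=0.5, size=M)  # structural random effect
        b = rng.normal(scale=0.5, size=P)      # measurement random effect
        y_prev = np.zeros(P)                   # assumed initial condition
        for t in range(T):
            eps = rng.normal(size=M)
            u = rng.normal(size=P)
            # Structural part, solved for the latent variables.
            Ystar[i, t] = H_inv @ (D @ y_prev + C @ X[i, t] + alpha + eps)
            # Measurement part.
            Y[i, t] = Lam @ Ystar[i, t] + theta @ W[i, t] + b + u
            y_prev = Y[i, t]
    return Y, Ystar, X, W

Y, Ystar, X, W = simulate_panel()
```

Calling `simulate_panel` with 200 different seeds would give the kind of replicated samples on which the estimator distributions are summarized.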

Figure 1.2: Estimated parameters of the latent variables in the structural part (boxplot of a_12).

Figure 1.3: Estimated parameters of the lagged endogenous variables in the structural part (boxplots of d_11, d_21, d_12, d_22, d_13, d_23, d_14, d_24, d_15, d_25).


Figure 1.4: Estimated parameters of the exogenous variables in the structural part (boxplots of c_11, c_12, c_21, c_22).

Figure 1.5: Estimated parameters of the latent variables in the measurement equations (boxplots of La_11, La_21, La_31, La_41, La_51, La_12, La_22, La_32, La_42, La_52).

Figure 1.6: Estimated parameters of the exogenous variables in the measurement equations (boxplots of theta_1, theta_2, theta_3, theta_4, theta_5).
