Ch 3. Discrete Choice Panel Data Models 2015-16

Topics in Applied Econometrics : Panel Data

Ch 3. Discrete Choice Panel Data Models

Pr. Philippe Polomé, Université Lumière Lyon 2

M2 Equade & M2 GAEXA

2015 – 2016

Outline

Introduction to Binary Choices

RE Model

FE Model

Random Parameters (“Mixed”) Multinomial Model

Introduction to Binary Choices

Model

• Underlying latent model (usual notation)

$$y_{it}^* = x_{it}'\beta + \alpha_i + \epsilon_{it} \qquad (1)$$

• Assume $\epsilon_{it}$ i.i.d., independent of the $x$'s
  - and with a symmetric distribution function $F(\cdot)$
  - $y_{it}^*$ is not observed
  - $y_{it} = 1$ if $y_{it}^* > 0$, i.e. if $x_{it}'\beta + \alpha_i + \epsilon_{it} > 0$
  - else $y_{it} = 0$

• Estimation is by maximum likelihood
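As an illustration of the latent model (1), the following minimal sketch simulates a small panel. It assumes a single scalar regressor, logistic $\epsilon_{it}$ and normal $\alpha_i$; the function name `simulate_panel` and all numerical values are hypothetical choices made for the example, not part of the course material.

```python
import math
import random

def simulate_panel(n, t, beta, sigma_alpha, seed=0):
    """Simulate y_it = 1[x_it*beta + alpha_i + eps_it > 0] (illustrative assumptions)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        alpha_i = rng.gauss(0.0, sigma_alpha)       # individual effect, assumed normal
        for period in range(t):
            x = rng.gauss(0.0, 1.0)                 # scalar regressor
            u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
            eps = math.log(u / (1.0 - u))           # inverse-CDF draw, standard logistic
            y_star = x * beta + alpha_i + eps       # latent variable, never observed
            rows.append((i, period, x, 1 if y_star > 0 else 0))
    return rows

data = simulate_panel(n=200, t=5, beta=1.0, sigma_alpha=1.0)
share_ones = sum(row[3] for row in data) / len(data)
```

Only the tuples `(i, t, x, y)` would be available to the econometrician; `y_star` and `alpha_i` are discarded, which is what makes the estimation problem non-trivial.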

Incidental Parameters $\alpha$

• Log-likelihood function

$$\ln L(\beta, \alpha_1, \ldots, \alpha_N) = \sum_{i,t} y_{it} \ln F(\alpha_i + x_{it}'\beta) + \sum_{i,t} (1 - y_{it}) \ln\left(1 - F(\alpha_i + x_{it}'\beta)\right)$$

• For fixed $T$ and $N \to \infty$, the estimators are inconsistent because the number of parameters $\to \infty$
  - Incidental parameter problem: the inconsistency of $\hat{\alpha}$ carries over to the estimator for $\beta$
  - The problem was also there with linear models, but we could eliminate the $\alpha$ by differencing
  - Can we carry that over to a non-linear model?
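A classical textbook illustration of how severe the problem can be, added here as a worked example rather than taken from the slides: take the logit case with $T = 2$, a scalar regressor with $x_{i1} = 0$ and $x_{i2} = 1$, and maximise the log-likelihood above jointly over $\beta$ and all the $\alpha_i$. Here $\Lambda(v) = e^v / (1 + e^v)$ denotes the logistic distribution function.

```latex
% Individuals with y_{i1} = y_{i2} give \hat\alpha_i = \pm\infty and contribute nothing.
% For "switchers" (y_{i1} + y_{i2} = 1), the first-order condition in \alpha_i is
\Lambda(\hat\alpha_i) + \Lambda(\hat\alpha_i + \hat\beta) = 1
\quad\Longrightarrow\quad \hat\alpha_i = -\hat\beta / 2 .
% The first-order condition in \beta, averaged over the n switchers, is then
\frac{1}{n} \sum_{i \,\in\, \text{switchers}} y_{i2} = \Lambda(\hat\beta / 2),
% while the true share of (0,1) sequences among switchers is \Lambda(\beta). Hence
\operatorname*{plim}_{N \to \infty} \hat\beta = 2\beta .
```

In this special case the joint ML estimator converges to twice the true value, however large $N$ is.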

RE Model

RE probability

• Assume
  - Zero correlation between $\alpha_i$ and $x_{it}$
  - $\alpha_i \sim \text{iid}\,(0, \sigma_\alpha^2)$
  - Also independence across $i$, and $\epsilon_{it}$ iid

• The (conditional) probability of observing a certain outcome for $i$ is

$$f(y_{i1}, \ldots, y_{iT} \mid x_{i1}, \ldots, x_{iT}, \beta, \alpha_i)$$

  - This is the whole sequence for $i$, over all periods $t = 1, \ldots, T$

• Assume independence between the $\epsilon_{it}$ within each individual, then

$$f(y_{i1}, \ldots, y_{iT} \mid x_{i1}, \ldots, x_{iT}, \beta, \alpha_i) = \prod_t F(y_{it} \mid x_{it}, \beta, \alpha_i)$$

Maximum Likelihood

• If $\alpha_i$ were known, say it was $c_i$
  - then we could estimate the parameters of that distribution ($\beta$) from the classical binary choice likelihood

$$\prod_{i=1}^{N} \prod_{t=1}^{T} \left[F(x_{it}'\beta + c_i)\right]^{y_{it}} \left[1 - F(x_{it}'\beta + c_i)\right]^{1 - y_{it}}$$

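Conditional Maximum Likelihood

• But $c_i$ is not known and we cannot estimate $\alpha_i$ consistently
  - so instead we want to get rid of it
  - The way of getting rid of $\alpha_i$ is to integrate it out
  - That is, take the expectation over $\alpha_i$, whose density is $g(\alpha_i)$

• The resulting likelihood (for one $i$) is called Conditional Maximum Likelihood

$$f(y_{i1}, \ldots, y_{iT} \mid x_{i1}, \ldots, x_{iT}, \beta) = \int_{-\infty}^{+\infty} \left[\prod_t F(y_{it} \mid x_{it}, \beta, \alpha_i)\right] g(\alpha_i)\, d\alpha_i$$

• For the sample, the log-likelihood function of the binary choice model is the sum over $i$ of the logarithm of this integral
  - The product over $t$ stays inside the integral, so the log-likelihood does not split further into a sum over $t$
  - It is still conditional on the $x_{it}$, although it is now unconditional on $\alpha_i$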

Integration

• The integral usually has no closed form
  - It cannot be solved analytically
  - We use numerical integration techniques

• Gauss-Hermite quadrature is very common
  - It is just a way of choosing where the rectangles are placed and how wide (how heavily weighted) they are

• Later, we will see simulation techniques

Integration in Stata

• In principle quadrature works with any distribution for $\alpha$ and $\epsilon$
  - In Stata, both the xtlogit and xtprobit commands use the normal distribution for $\alpha$
  - If $\alpha$ is in fact distributed differently ...
  - The choice in Stata is on the distribution for $\epsilon$: Logit or Probit
  - In the RE model, that may not have much importance

• The number of points of integration
  - i.e. the number of rectangles
  - can be adjusted
  - It should not be too small
  - Increase the number of these points until the results are stable
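The quadrature idea can be sketched for one individual's integrated likelihood. This is a minimal illustration, assuming a logit link, a scalar regressor and $\alpha_i \sim N(0, \sigma^2)$; the 3-point and 5-point Gauss-Hermite nodes and weights are the standard tabulated values for the weight function $e^{-z^2}$, and the data and function names are hypothetical. It is not a description of how Stata implements the integral.

```python
import math

SQRT_PI = math.sqrt(math.pi)

# (node, weight) pairs of the Gauss-Hermite rule for the weight exp(-z^2)
GH3 = [(-math.sqrt(1.5), SQRT_PI / 6.0), (0.0, 2.0 * SQRT_PI / 3.0),
       (math.sqrt(1.5), SQRT_PI / 6.0)]
GH5 = [(-2.0201828704560856, 0.01995324205904591),
       (-0.9585724646138185, 0.3936193231522412),
       (0.0, 0.9453087204829419),
       (0.9585724646138185, 0.3936193231522412),
       (2.0201828704560856, 0.01995324205904591)]

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def seq_prob(y, x, beta, alpha):
    """prod_t F(y_it | x_it, beta, alpha) for one individual."""
    p = 1.0
    for y_t, x_t in zip(y, x):
        f = logistic(alpha + x_t * beta)
        p *= f if y_t == 1 else 1.0 - f
    return p

def re_likelihood(y, x, beta, sigma, rule):
    """Integrate alpha out; alpha = sqrt(2)*sigma*z maps N(0, sigma^2) to exp(-z^2)."""
    total = 0.0
    for z, w in rule:
        total += w * seq_prob(y, x, beta, math.sqrt(2.0) * sigma * z)
    return total / SQRT_PI

y_i, x_i = [1, 0, 1], [0.5, -1.0, 2.0]   # hypothetical sequence, T = 3
lik3 = re_likelihood(y_i, x_i, 1.0, 1.0, GH3)
lik5 = re_likelihood(y_i, x_i, 1.0, 1.0, GH5)
```

Comparing `lik3` and `lik5` mimics the advice above: if the value still moves noticeably when points are added, the number of integration points is too small.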

Serial Correlation

• When independence between the $\epsilon_{it}$ cannot be assumed within each individual
  - We cannot write $f(y_{i1}, \ldots, y_{iT} \mid x_{i1}, \ldots, x_{iT}, \beta, \alpha_i)$ as $\prod_t F(y_{it} \mid x_{it}, \beta, \alpha_i)$ in the Conditional Likelihood Function

• That does not invalidate the conditional approach, but
  - There is now a multiple integral, to integrate over all the periods jointly
  - Above 3 such integrals (4 periods and more), quadrature does not work well

• Instead, use simulation-based methods
  - They are not implemented in Stata
  - Stata does not appear to allow correlation between the $\epsilon_{it}$
  - Possibly, use R instead
  - The mlogit package below does not do that

FE Model

Logit Probability

• Fixed effects estimation is possible for the panel logit model, using the conditional MLE
  - But not for other binary panel models such as panel probit
  - Strict exogeneity (of the $x$) must hold
  - Independence of the $\epsilon_{it}$ across $t$ must hold
  - And of course across $i$ as usual

• For the logit model, it is possible to show that the (conditional) probability of observing a certain outcome for $i$ (over all periods $t$) is

$$f(y_{i1}, \ldots, y_{iT} \mid x_{i1}, \ldots, x_{iT}, \beta, \alpha_i) = \frac{\exp\left(\alpha_i \sum_t y_{it}\right) \exp\left(\left(\sum_t y_{it} x_{it}'\right)\beta\right)}{\prod_t \left[1 + \exp(\alpha_i + x_{it}'\beta)\right]}$$

Unchanging Behavior

• If $\sum_t y_{it} = 0$ or $= T$, then substituting in the above conditional probability shows that
  - Changes in $x_{it}$ from $t$ to $t'$ cannot explain choices for $y$
  - since such choices do not change over $t$
  - $\alpha_i$, being time-invariant, is sufficient to explain the choice for either $y_{it} = 0\ \forall t$ or $y_{it} = 1\ \forall t$

• Such $i$ can be dropped from the likelihood
  - as they add no information on $\beta$
  - Stata does it automatically

• So, only $i$ that change status/behavior at least once are relevant for estimating $\beta$

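Changing Behavior: Take $T = 2$ and $\sum_t y_{it} = 1$

• Then only the sequences $\{0, 1\}$ or $\{1, 0\}$ are possible:

$$\Pr\left\{(0,1) \,\Big|\, \sum_t y_{it} = 1, \alpha_i, \beta\right\} = \frac{\Pr\{(0,1) \mid \alpha_i, \beta\}}{\Pr\{(0,1) \mid \alpha_i, \beta\} + \Pr\{(1,0) \mid \alpha_i, \beta\}}$$

• $\Pr\{(0,1) \mid \alpha_i, \beta\} = \Pr\{y_{i1} = 0 \mid \alpha_i, \beta\} \, \Pr\{y_{i2} = 1 \mid \alpha_i, \beta\}$

• With the logistic:
  - $\Pr\{y_{it} = 1 \mid \alpha_i, \beta\} = \dfrac{\exp(\alpha_i + x_{it}'\beta)}{1 + \exp(\alpha_i + x_{it}'\beta)}$
  - $\Pr\{y_{it} = 0 \mid \alpha_i, \beta\} = \dfrac{1}{1 + \exp(\alpha_i + x_{it}'\beta)}$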

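Changing Behavior: Take $T = 2$ and $\sum_t y_{it} = 1$

• So,

$$\Pr\{(0,1) \mid \alpha_i, \beta\} = \frac{1}{1 + \exp(\alpha_i + x_{i1}'\beta)} \cdot \frac{\exp(\alpha_i + x_{i2}'\beta)}{1 + \exp(\alpha_i + x_{i2}'\beta)}$$

• And

$$\Pr\left\{(0,1) \,\Big|\, \sum_t y_{it} = 1, \alpha_i, \beta\right\} = \frac{\exp(\alpha_i + x_{i2}'\beta)}{\exp(\alpha_i + x_{i2}'\beta) + \exp(\alpha_i + x_{i1}'\beta)} = \frac{\exp(x_{i2}'\beta)}{\exp(x_{i2}'\beta) + \exp(x_{i1}'\beta)} = \frac{\exp\left((x_{i2} - x_{i1})'\beta\right)}{1 + \exp\left((x_{i2} - x_{i1})'\beta\right)}$$

• Thus $\alpha_i$ drops out of the model
  - Much like in a first-difference linear model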
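The cancellation of $\alpha_i$ can be checked numerically. The sketch below assumes a scalar regressor and uses arbitrary illustrative values; the function name `cond_prob_01` is hypothetical.

```python
import math

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def cond_prob_01(x1, x2, beta, alpha):
    """Pr{(0,1) | y_i1 + y_i2 = 1, alpha, beta}, built from the unconditional pieces."""
    p1 = logistic(alpha + x1 * beta)   # Pr{y_i1 = 1}
    p2 = logistic(alpha + x2 * beta)   # Pr{y_i2 = 1}
    p01 = (1.0 - p1) * p2              # sequence (0, 1)
    p10 = p1 * (1.0 - p2)              # sequence (1, 0)
    return p01 / (p01 + p10)

x1, x2, beta = 0.3, 1.1, 0.8
values = [cond_prob_01(x1, x2, beta, alpha) for alpha in (-3.0, 0.0, 2.0)]
closed_form = logistic((x2 - x1) * beta)   # last expression on the slide
```

All entries of `values` coincide with `closed_form` up to rounding, whatever value of `alpha` is plugged in.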

Estimation

• This means that we can estimate the FE logit model for $T = 2$ using a standard logit
  - with $x_{i2} - x_{i1}$ (a first difference!) as explanatory variables and
  - the change in $y_{it}$ as the endogenous event (1 for a positive change, 0 for a negative one)

• For the case with larger $T$ it is more cumbersome to derive all the necessary conditional probabilities
  - but in principle it is a straightforward extension of the above case
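The $T = 2$ recipe can be sketched on simulated data: keep the switchers, regress the direction of the change on $x_{i2} - x_{i1}$ with a logit without constant. The data-generating process (a scalar regressor correlated with $\alpha_i$), the Newton iterations and all names are assumptions made for this illustration only.

```python
import math
import random

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def simulate_switchers(n, beta, seed=1):
    """Return (x_i2 - x_i1, 1[sequence is (0,1)]) for individuals who change status."""
    rng = random.Random(seed)
    dx, d = [], []
    for _ in range(n):
        alpha = rng.gauss(0.0, 1.0)
        x = [alpha + rng.gauss(0.0, 1.0) for _ in range(2)]   # x correlated with alpha_i
        y = [1 if rng.random() < logistic(alpha + x_t * beta) else 0 for x_t in x]
        if y[0] + y[1] == 1:          # non-switchers carry no information on beta
            dx.append(x[1] - x[0])
            d.append(y[1])            # 1 for a positive change, 0 for a negative one
    return dx, d

def fit_fd_logit(dx, d, iterations=25):
    """Newton-Raphson for a one-regressor logit without constant."""
    b = 0.0
    for _ in range(iterations):
        grad = sum((d_i - logistic(b * x_i)) * x_i for x_i, d_i in zip(dx, d))
        hess = sum(logistic(b * x_i) * (1.0 - logistic(b * x_i)) * x_i * x_i for x_i in dx)
        b += grad / hess
    return b

dx, d = simulate_switchers(n=5000, beta=1.0)
beta_hat = fit_fd_logit(dx, d)
```

Although `alpha` is correlated with the regressor by construction, `beta_hat` should land near the true value of 1, because the individual effect has been conditioned out.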

Other FE Binary Panel Issues

• A similar transformation exists for a dynamic panel binary model
  - Provided there are at least four time periods
  - It imposes conditions on the $y_{it}$ time series that may be difficult to test
  - See Cameron & Trivedi 23.4.4
  - Not implemented in Stata

• The elimination of the individual effect $\alpha_i$
  - Changes the interpretation
  - e.g. a one-unit difference in $x_{it}$ versus $x_{i,t-1}$ induces a change in the probability of the sequence $\{y_{it}, y_{i,t-1}\}$
  - compared to a certain probability if $x_{it} = x_{i,t-1}$

To sum up

• If it can be assumed that $\alpha_i$ is independent of $x_{it}$
  - RE ML estimator
  - Probit or Logit, it does not matter

• Otherwise, we only have the FE Logit estimator
  - With a first-difference interpretation
  - Only $i$ that change status/behavior at least once are relevant for estimating $\beta$

• These approaches rely on independence of the $\epsilon_{it}$
  - Not of the $y_{it}$
  - It is essential that the $\alpha_i$ and $x_{it}$ “filter out” any correlation in the $y_{it}$
  - In general, any remaining correlation will cause inconsistency

• Packages
  - Stata: xtlogit or menu stat → Panel → Binary outcomes
  - R: mlogit

Example: Unionization of Women in the US

• Setup: webuse union
  - Loads the dataset from the Stata website: >4000 women observed 1 to 12 times

• Random-effects logit model (the default for xtlogit)
  - xtlogit union age grade not_smsa south##c.year
  - The last term (##) is there so that each variable and their product are in the regression
  - south is a dichotomous variable
  - south##year would induce one dummy for each year and one for each (south=1 and year=t): long
  - south##c.year treats year as continuous

• Fixed-effects logit model
  - xtlogit union age grade not_smsa south##c.year, fe
  - 2744 groups (14165 obs) dropped because of all positive or all negative outcomes

Random Parameters (“Mixed”) Multinomial Model

Random Utility Models

• We are interested in individual discrete choices among $J$ exclusive alternatives, $j = 1, \ldots, J$
  - We assume that each alternative $j$ provides a certain utility to the individual
  - who then compares the $J$ alternatives on that basis
  - e.g. transport mode choice

• The utility, and therefore the choice, is purely deterministic from the individual's point of view
  - It is random from the researcher’s point of view
  - because some of the determinants of the utility are unobserved,
  - which implies that the choice can only be analyzed in terms of probabilities.

Random Utility in Cross-section

• $i$’s utility for alternative $j$ is

$$U_{ij} = \alpha_j + \beta x_{ij} + \gamma_j z_i + \delta_j w_{ij} + \epsilon_{ij}$$

where there may be
  - alternative-specific variables $x_{ij}$
    - with a generic coefficient $\beta$
    - e.g. transport mode time to work
  - individual-specific variables $z_i$
    - with alternative-specific coefficients $\gamma_j$
    - e.g. age
  - alternative-specific variables $w_{ij}$
    - with an alternative-specific coefficient $\delta_j$
    - e.g. transport mode safety

Utility Differences

• Utility being ordinal, only utility differences are relevant to model the choice of one alternative
  - The difference between the utility of two different alternatives $j$ and $k$ is

$$U_{ij} - U_{ik} = (\alpha_j - \alpha_k) + \beta (x_{ij} - x_{ik}) + (\gamma_j - \gamma_k) z_i + (\delta_j w_{ij} - \delta_k w_{ik}) + (\epsilon_{ij} - \epsilon_{ik})$$

so that the $z_i$ terms drop out when $\gamma_j = \gamma_k$
  - For $J$ alternatives, there are $J - 1$ such utility differences

Utility Differences

• Moreover, only differences of these coefficients are relevant and may be identified
  - For example, with three alternatives 1, 2 and 3,
  - the three coefficients associated with an individual-specific variable cannot be identified,
  - but only two linear combinations of them.
  - Therefore, a choice of normalization is necessary
  - the simplest one is $\gamma_1 = 0$

• Coefficients for alternative-specific variables may (or may not) be alternative-specific
  - For example, transport time is alternative-specific
  - and individual-specific, since it depends on location
  - So there could be a constant $\beta$
  - But maybe 10 min in public transport do not have the same value as 10 min in a car
  - So that would be a $\delta_j$

Conditional Probabilities

• If alternative $l$ is chosen,
  - Rewrite the utility differences as $U_l - U_j = V_l - V_j + \epsilon_l - \epsilon_j$, where $V_j = \alpha_j + \beta x_{ij} + \gamma_j z_i + \delta_j w_{ij}$
  - Possibly without $z_i$ in earlier models

• The general expression of the probability of choosing alternative $l$ is then:

$$P_l \mid \epsilon_l = \Pr\{U_l > U_1, \ldots, U_l > U_J\} = F_{-l}(\epsilon_1 < V_l - V_1 + \epsilon_l, \ldots, \epsilon_J < V_l - V_J + \epsilon_l)$$

where $F_{-l}$ is the multivariate distribution of the $J - 1$ error terms other than $\epsilon_l$
  - $F_{-l}$ is a $J - 1$ dimensional integral that depends on the unobservable $\epsilon_l$

Unconditional Probabilities

• Since $\epsilon_l$ is unobserved,
  - We must first remove it from $P_l$

• This is done by taking the expectation

$$P_l = \int (P_l \mid \epsilon_l) \, f_l(\epsilon_l) \, d\epsilon_l$$

that is, integrating $F_{-l}$ over $\epsilon_l$
  - A $J$-dimensional integral

Log-Likelihood Function

• Write $y_{ij}$ equal to 1 if $i$ chose $j$, and zero otherwise
  - For any given choice situation, there are $J$ such variables

• If the $\epsilon_{ij}$ are independent across alternatives $j$,
  - the probability of the choice made by $i$ is:

$$P_i = \prod_j P_{ij}^{y_{ij}}$$

which collapses to $P_{il}$ for any particular choice $l$
  - that is the $P_l$ from above
  - with an added $i$ index in the context of a sample
  - in logs: $\ln P_i = \sum_j y_{ij} \ln P_{ij}$

• Over a sample of independent observations:

$$\ln L = \sum_i \ln P_i = \sum_i \sum_j y_{ij} \ln P_{ij}$$

• We seek to maximize this function over the set of parameters of the utility differences

Standard Multinomial Logit

• The standard multinomial logit model probability $P_{il}$ is

$$P_{il} = \frac{\exp(\beta' x_{il})}{\sum_j \exp(\beta' x_{ij})}$$

• This is due to McFadden 1974,
  - Assuming a “Gumbel” distribution and iid for $\epsilon$
  - And non-random coefficients
  - Alternative-invariant regressors are impossible unless $\beta$ becomes $\beta_j$
  - That is the next model

• I assume you know this model
  - If not → Cameron & Trivedi, Chapter 15
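A minimal sketch of the multinomial logit probability and of the log-likelihood $\sum_i \sum_j y_{ij} \ln P_{ij}$ follows. The attribute values (a travel time and a comfort dummy per alternative) and the function names are hypothetical; subtracting the largest utility before exponentiating is a standard numerical safeguard, not something the slides prescribe.

```python
import math

def mnl_probs(x, beta):
    """x[j] is the attribute list of alternative j; returns P_i1, ..., P_iJ."""
    v = [sum(b * a for b, a in zip(beta, x_j)) for x_j in x]
    v_max = max(v)                       # only utility differences matter
    e = [math.exp(v_j - v_max) for v_j in v]
    total = sum(e)
    return [e_j / total for e_j in e]

def log_likelihood(choices, xs, beta):
    """Sum over individuals of ln P_il for the chosen alternative l."""
    return sum(math.log(mnl_probs(x, beta)[l]) for l, x in zip(choices, xs))

x_i = [[10.0, 1.0], [25.0, 0.0], [15.0, 1.0]]    # (time, comfort) for 3 alternatives
beta = [-0.1, 0.5]
p = mnl_probs(x_i, beta)
shifted = [[time + 7.0, comfort] for time, comfort in x_i]
p_shifted = mnl_probs(shifted, beta)             # same shift for every alternative
ll = log_likelihood([0, 2], [x_i, shifted], beta)
```

`p` and `p_shifted` are identical: adding the same amount to a regressor for every alternative leaves the probabilities unchanged, which is why an alternative-invariant regressor cannot have a generic coefficient.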

Random Parameter (“Mixed”) Multinomial Logit

• The mixed logit model is

$$P_{il} \mid \beta_i = \frac{\exp(\beta_i' x_{il})}{\sum_j \exp(\beta_i' x_{ij})}$$

where the $\beta_i$ coefficients are treated as random
  - Just as the $\alpha_i$ in panel linear models
  - Therefore, the probability becomes conditional on the vector of random coefficients: $P_{il} \mid \beta_i$

• See Train 2003 for a complete presentation
  - The $\epsilon_{ij}$ are already integrated out, as in the McFadden formula
  - Random parameters are one way to address the Independence of Irrelevant Alternatives issue

Unconditional Mixed Logit Probabilities

• As earlier, the conditional probability must be made unconditional by taking expectations
  - This time, over the $\beta_i$:

$$P_{il} = E\left[P_{il} \mid \beta_i\right] = \int (P_{il} \mid \beta_i) \, f(\beta_i) \, d\beta_i$$

where the integration
  - is multiple, over all the elements of $\beta$, with possible correlations
  - is over the support of $\beta$, usually $(-\infty, +\infty)$
  - implies that we assume a distribution $f$ for $\beta$

• The question is how to compute this integral

Panel Discrete Choice

• In a panel context, the iid hypothesis for the $y_{it}$ is untenable
  - Since successive observations of a single individual are likely correlated
  - But independence of the $\epsilon_{it}$ will be assumed as before
  - The $\alpha_i$ (individual-specific constant) also have to be integrated out
  - Which is only possible by assuming $E(\alpha_i) = \alpha\ \forall i$
  - But here, that is also the case for any slope coefficient $\beta_i$
  - So only in a RE model

• More specifically, we compute one probability for each $i$, and it is this probability that enters the log-likelihood function

Panel Discrete Choice Probabilities

• For a given vector of coefficients $\beta_i$, the probability that alternative $l$ is chosen for the $t$th observation of $i$ is

$$P_{itl} = \frac{\exp(\beta_i' x_{itl})}{\sum_j \exp(\beta_i' x_{itj})}$$

  - Across all alternatives, that is $P_{it} = \prod_l (P_{itl})^{y_{itl}}$
  - The joint probability for the $T$ observations of individual $i$ is

$$P_i = \prod_t \prod_l P_{itl}^{y_{itl}}$$

  - In this formulation, the $\epsilon_{itj}$ are independent over time
  - But correlation in the behavior is modelled because the $\beta_i$ coefficients are constant in time
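The joint probability of one individual's sequence of choices, for a given $\beta_i$, can be sketched as follows. The nested-list data layout (`x_i[t][l]` holding the attributes of alternative `l` in choice situation `t`) and the function names are assumptions made for this example.

```python
import math
from itertools import product

def choice_probs(x_t, beta_i):
    """Logit probabilities P_itl over the alternatives of one choice situation."""
    v = [sum(b * a for b, a in zip(beta_i, x_alt)) for x_alt in x_t]
    v_max = max(v)
    e = [math.exp(val - v_max) for val in v]
    total = sum(e)
    return [val / total for val in e]

def sequence_prob(x_i, chosen_i, beta_i):
    """P_i = prod_t P_{it,l(t)}, where l(t) is the alternative chosen at t."""
    p = 1.0
    for x_t, l in zip(x_i, chosen_i):
        p *= choice_probs(x_t, beta_i)[l]
    return p

# two choice situations, two alternatives, two attributes each (hypothetical values)
x_i = [[[1.0, 0.5], [0.2, 1.5]],
       [[0.3, 0.1], [0.9, 0.4]]]
beta_i = [-1.0, 0.5]
p_observed = sequence_prob(x_i, [0, 1], beta_i)
p_all = sum(sequence_prob(x_i, list(seq), beta_i) for seq in product([0, 1], repeat=2))
```

Because the same `beta_i` enters every period, sequences are correlated across time once $\beta_i$ is integrated out, even though the product form treats periods as independent given $\beta_i$.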

Panel Discrete Choice

• Panel data in this case are often survey data
  - where several similar choice situations are presented to respondents
  - e.g. choice of transport modes under different attributes
  - Prices, frequency, duration...
  - As indicated, experimental data are also suitable

• Lagged dependent variables can be added to mixed logit
  - without adjusting the probability formula
  - or the simulation method
  - Provided they “behave”
  - This is not made explicit in the literature

• We now focus on the integration technique

Maximum Simulated Likelihood Principle

Simulation

• The probabilities for the random parameter logit
  - are integrals with no closed form
  - the dimension of integration is high
  - Quadrature techniques become intractable

• Instead, simulation techniques are used
  - i.e. the expected value is replaced by an arithmetic mean

• Simulations of a random variable (rv) are pseudo-random draws from that rv
  - Most computer packages have a routine for the Uniform
  - e.g. rand() in Excel, runif in R
  - From the uniform, there exist formulas to obtain draws from the other distributions

• Application of the ideas on simulation to ML estimation
  - Key result: simulation can lead to an estimator with the same distribution as the MLE
  - Provided the number of simulation draws made to compute the probability for each observation $\to \infty$
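The two ideas above (draws obtained from the uniform, and an expectation replaced by an arithmetic mean) can be sketched with the inverse-CDF transformation. The rate, the number of draws and the function names are arbitrary choices for this illustration.

```python
import math
import random
from statistics import NormalDist

def uniform_to_exponential(u, rate):
    """Inverse CDF of the exponential distribution."""
    return -math.log(1.0 - u) / rate

def uniform_to_normal(u, mu=0.0, sigma=1.0):
    """Inverse CDF of the normal distribution (requires 0 < u < 1)."""
    return NormalDist(mu, sigma).inv_cdf(u)

def monte_carlo_mean(func, draws):
    """Arithmetic mean standing in for an expected value."""
    return sum(func(d) for d in draws) / len(draws)

rng = random.Random(42)
uniforms = [u for u in (rng.random() for _ in range(20000)) if 0.0 < u < 1.0]
normals = [uniform_to_normal(u) for u in uniforms]
exponentials = [uniform_to_exponential(u, rate=2.0) for u in uniforms]

second_moment = monte_carlo_mean(lambda a: a * a, normals)   # approximates E[a^2] = 1
exp_mean = monte_carlo_mean(lambda e: e, exponentials)       # approximates 1/rate = 0.5
```

With 20000 draws both means are close to their theoretical values; the approximation error shrinks at rate $1/\sqrt{S}$ in the number of draws $S$.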

Numerical Approximation Maximum Likelihood

• Assume:
  - independence over observations
  - and that $y$ has conditional density $f(y \mid x, \theta)$
  - or probabilities for the discrete choice case
  - but $f(y \mid x, \theta)$ has no closed-form expression
  - an intractable integral

• Replace the integral by a numerical approximation $\tilde{f}(y \mid x, \theta)$,

• maximize $\ln \hat{L}_N(\theta) = \sum_{i=1}^{N} \ln \tilde{f}(y_i \mid x_i, \theta)$ with respect to $\theta$

Numerical Approximation Maximum Likelihood Properties

• The estimator will be
  - consistent and have the same asymptotic distribution as ML
  - if $\tilde{f}(y \mid x, \theta)$ is a good approximation

• The resulting first-order conditions
  - are usually nonlinear
  - and are solved by iterative methods
  - but we do not discuss that here

Simulator

• Let $f(y \mid x, \theta)$ take the following general form

$$f(y_i \mid x_i, \theta) = \int h(y_i \mid x_i, \theta, u_i) \, g(u_i) \, du_i$$

without a closed-form solution
  - and $u_i$ is unobservable
  - so the estimated parameter vector $\theta$ cannot depend on it
  - We say $u_i$ must be integrated out (taking expectations)

• The direct simulator for $f(y_i \mid x_i, \theta)$ is the Monte Carlo sum

$$\tilde{f}(y_i \mid x_i, u_{iS}, \theta) = \frac{1}{S} \sum_{s=1}^{S} h(y_i \mid x_i, \theta, u_i^s)$$

where $u_{iS}$ is a vector of $S$ draws $u_i^s$, $s = 1, \ldots, S$
  - that are independent simulated draws from $g(u_i)$
  - We assume that the distribution of the unobserved $u_i$ is $g(u_i)$
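A minimal sketch of the direct simulator, applied to the RE logit integral from earlier in the chapter: here $u_i$ is a standard normal draw and $\alpha_i = \sigma u_i$, which is an assumption made for the example. A fine midpoint-rule integral is included only as a check on the Monte Carlo value; the data and function names are hypothetical.

```python
import math
import random

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def h(y, x, beta, sigma, u):
    """Sequence probability given the unobservable u (alpha_i = sigma * u)."""
    p = 1.0
    for y_t, x_t in zip(y, x):
        f = logistic(sigma * u + x_t * beta)
        p *= f if y_t == 1 else 1.0 - f
    return p

def direct_simulator(y, x, beta, sigma, draws):
    """(1/S) * sum_s h(y | x, theta, u^s)."""
    return sum(h(y, x, beta, sigma, u) for u in draws) / len(draws)

def reference_integral(y, x, beta, sigma, m=4000, width=8.0):
    """Midpoint-rule value of the same integral over [-width, width]."""
    step = 2.0 * width / m
    total = 0.0
    for k in range(m):
        u = -width + (k + 0.5) * step
        density = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
        total += h(y, x, beta, sigma, u) * density * step
    return total

rng = random.Random(7)
draws = [rng.gauss(0.0, 1.0) for _ in range(20000)]   # S independent draws from g
y_i, x_i = [1, 0, 1], [0.5, -1.0, 2.0]
approx = direct_simulator(y_i, x_i, 1.0, 1.0, draws)
exact = reference_integral(y_i, x_i, 1.0, 1.0)
```

In one dimension the reference integral is cheap, so simulation brings nothing; its value lies in higher dimensions, where the same averaging works unchanged while grids become infeasible.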

Simulators Properties

• Such $\tilde{f}_{iS}$ is unbiased, and consistent for $f_i$
  - as the number of draws $S \to \infty$
  - So we drop $u_{iS}$ from the notation

• The direct simulator is one case of simulator
  - Other simulators exist
  - in some cases doing a better job at approximating $f_i$
  - depending on the distribution $g(u_i)$

• Generally we want the simulator $\tilde{f}_i$ to be differentiable
  - so that gradient methods may be used to optimize the likelihood function
  - Gradient methods: based on first-order (or second-order) derivatives

Maximum Simulated Likelihood

• In general, the Maximum Simulated Likelihood estimator is simply the $\hat{\theta}_{MSL}$ that maximises

$$\ln \hat{L}_N(\theta) = \sum_{i=1}^{N} \ln \tilde{f}(y_i \mid x_i, \theta) = \sum_{i=1}^{N} \ln \frac{1}{S} \sum_{s=1}^{S} h(y_i \mid x_i, \theta, u_i^s)$$

• To eliminate “chatter” caused by simulation and help numerical convergence
  - the underlying Monte Carlo draws used to construct $\tilde{f}_i$ should not be redrawn
  - as $\theta$ changes across the optimization iterations
  - Reminder: $u_{iS}$ is a vector of $S$ draws $u_i^s$

Random Parameter MN Logit Max Simulated Likelihood

• More precisely, for the multinomial mixed logit case

1. Make an initial hypothesis about the distribution of the random parameter
   - e.g. $\beta_i \sim \text{normal}(\mu, \sigma)$
2. Draw $R$ numbers from that distribution
   - And keep them throughout
3. For each draw $\beta_i^r$, compute the probability

$$P_{ilr} = \frac{\exp(\beta_i^{r\prime} x_{il})}{\sum_j \exp(\beta_i^{r\prime} x_{ij})}$$

4. Compute the average of these probabilities: $\bar{P}_{il} = \sum_{r=1}^{R} P_{ilr} / R$
5. Plug these simulated probabilities into the log-likelihood
   - Which is then a “pseudo”-likelihood
6. Numerical maximization of this $\ln L$ as usual
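Steps 1 to 5 can be sketched for a single random coefficient on a scalar attribute, in the panel version where the probability of the whole sequence is averaged over draws. One implementation choice is assumed here and is not stated in the slides: the stored draws are standard normal and are rescaled as $\beta_i^r = \mu + \sigma u_i^r$, so the same draws can be kept while $\mu$ and $\sigma$ change. Data and function names are hypothetical, and step 6 is left to any numerical optimizer.

```python
import math
import random

def choice_probs(x_t, beta):
    """Logit probabilities over alternatives; x_t[l] is a scalar attribute."""
    v = [beta * x_l for x_l in x_t]
    v_max = max(v)
    e = [math.exp(val - v_max) for val in v]
    total = sum(e)
    return [val / total for val in e]

def simulated_prob(x_i, chosen_i, mu, sigma, std_draws):
    """Steps 3-4: average over draws of the probability of i's observed sequence."""
    total = 0.0
    for u in std_draws:
        beta_r = mu + sigma * u            # draw from normal(mu, sigma)
        p = 1.0
        for x_t, l in zip(x_i, chosen_i):
            p *= choice_probs(x_t, beta_r)[l]
        total += p
    return total / len(std_draws)

def simulated_loglik(data, mu, sigma, draws_by_i):
    """Step 5: plug the simulated probabilities into the log-likelihood."""
    return sum(math.log(simulated_prob(x_i, c_i, mu, sigma, d))
               for (x_i, c_i), d in zip(data, draws_by_i))

# hypothetical data: 2 individuals, 2 choice situations, 2 alternatives (price only)
data = [([[1.0, 2.0], [3.0, 1.5]], [0, 1]),
        ([[2.0, 2.5], [1.0, 4.0]], [1, 0])]
rng = random.Random(123)
draws_by_i = [[rng.gauss(0.0, 1.0) for _ in range(200)] for _ in data]   # step 2, kept fixed

ll_a = simulated_loglik(data, mu=-1.0, sigma=0.5, draws_by_i=draws_by_i)
ll_b = simulated_loglik(data, mu=-1.0, sigma=0.5, draws_by_i=draws_by_i)
ll_fixed = simulated_loglik(data, mu=-1.0, sigma=0.0, draws_by_i=draws_by_i)
```

With the draws held fixed, `ll_a` and `ll_b` are identical, so the simulated log-likelihood is an ordinary deterministic function of $(\mu, \sigma)$; with `sigma=0.0` it reduces to the standard logit log-likelihood.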

Application: Transport Mode Choice

In R, with package mlogit installed:

• library("mlogit")

• data("Train", package = "mlogit")
  - Loads the data in memory

• Tr <- mlogit.data(Train, shape = "wide", varying = 4:11, choice = "choice", sep = "", opposite = c("price", "time", "change", "comfort"), alt.levels = c("choice1", "choice2"), id = "id")
  - Reshapes the data into a form suitable for the mlogit command

• ml <- mlogit(choice ~ price + time + change + comfort, Tr, panel = TRUE, rpar = c(time = "cn", change = "n", comfort = "ln"), correlation = TRUE, R = 20, tol = 10, halton = NA)
  - Regresses “choice” on 4 regressors
  - With random parameters for the last 3
  - With distributions cn (censored normal), n (normal), ln (log-normal)
  - Correlation between parameters allowed
