(1)

M2R “Development Economics”

Empirical Methods in Development Economics

Université Paris 1 Panthéon-Sorbonne

Panel data

Rémi Bazillier

remi.bazillier@univ-paris1.fr

Semester 1, Academic year 2016-2017

(2)

Outline

Panel data

Panel data and endogeneity

Panel production functions

A panel macro production function

A panel micro production function

Panel estimators

The micro panel production function extended

(3)

The magic of Panel Data

“Panel Data allows us to observe the unobservable”

- Unobservable effects
  - They are one of the main reasons why it is difficult to argue for causality: we always face the objection that something unobserved is driving the outcome we observe
  - We can use panel data to control for time-invariant unobservables
  - If the unobservables correlated with our explanatory variables vary over time, we need to take another step → Instrumental Variables (next chapter)

(4)

Panel data

- Panel data combine a cross-sectional dimension with a time-series dimension
  - But not all data that combine these two dimensions are panel data: independently pooled cross sections are sampled randomly from a population at different points in time
  - Panel data cover the same individuals (or the same households, firms, countries...) over time
- The structure of the panel
  - N individuals followed over T time periods
  - In microeconomic datasets, N is large while T is short
  - In macroeconomic datasets, T can be large → you may have to take into account the time-series properties of the data
  - We focus here on static models
  - Lagged dependent variables in panels create problems (→ you need to use dynamic panel data models)

(5)

Balanced and unbalanced Panels

- Balanced panel: same T for all individuals
- Unbalanced panel: T differs across individuals
- In practice, balanced and unbalanced panels are treated in the same way
- One exception: when the panel is unbalanced for reasons that are not random
  - Example: firms with relatively low levels of productivity have relatively high exit rates (smaller T)
  - In that case, you have to model the selection process

(6)


(7)

Panel data and endogeneity

- Let us set out the simplest version of a panel model

$$y_{it} = \beta x_{it} + u_{it} \tag{1}$$

$$u_{it} = c_i + e_{it} \tag{2}$$

- Two elements in the error term ($c_i$ and $e_{it}$)
  - $c_i$ is the time-invariant unobservable determinant of $y_{it}$
  - Example: individual "ability", if we think of it as something with which individuals are endowed and which does not change over time
  - Example: the quality of management (if time-invariant)
- The Least Squares Dummy Variable (LSDV) model:
  - Individual dummies control for time-invariant unobservables

$$y_{it} = \beta x_{it} + c_i + e_{it} \tag{3}$$
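To make equations (1) to (3) concrete, here is a minimal simulation sketch. The data-generating process, the sample sizes and all variable names are assumptions chosen for illustration, not taken from the slides. It simulates a panel in which $x_{it}$ is correlated with $c_i$ and compares pooled OLS with the LSDV regression.

```python
import numpy as np

rng = np.random.default_rng(42)
N, T, beta = 200, 5, 1.0                 # hypothetical panel dimensions and true slope
ids = np.repeat(np.arange(N), T)         # individual identifier for each observation

c = rng.normal(size=N)                               # time-invariant unobservable c_i
x = 0.8 * c[ids] + rng.normal(size=N * T)            # x_it is correlated with c_i
y = beta * x + c[ids] + rng.normal(size=N * T)       # y_it = beta*x_it + c_i + e_it

# Pooled OLS: c_i is left in the error term, so the slope is biased here
X_pooled = np.column_stack([np.ones(N * T), x])
b_pooled = np.linalg.lstsq(X_pooled, y, rcond=None)[0][1]

# LSDV: one dummy per individual absorbs c_i (no separate constant)
D = (ids[:, None] == np.arange(N)[None, :]).astype(float)
X_lsdv = np.column_stack([x, D])
b_lsdv = np.linalg.lstsq(X_lsdv, y, rcond=None)[0][0]

print(f"pooled OLS slope: {b_pooled:.3f}, LSDV slope: {b_lsdv:.3f}")
```

Under these assumptions the pooled slope should drift well above the true value of 1, while the LSDV slope should stay close to it.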

(8)

- Assumption of strict exogeneity in the LSDV model:

$$y_{it} = \beta x_{it} + c_i + e_{it} \tag{4}$$

  - It implies that the explanatory variables in each time period are uncorrelated with the idiosyncratic error (the time-varying part of the error) in every period
- Pooled OLS estimator (estimation without $c_i$):

$$y_{it} = \beta x_{it} + v_{it}^{OLS} \tag{5}$$

$$v_{it}^{OLS} = c_i + e_{it} \tag{6}$$

  - The $x_{it}$ should be uncorrelated with both the time-invariant unobservables $c_i$ and the time-varying unobservables $e_{it}$

(9)

- However, even if both the $e_{it}$ and the $c_i$ are uncorrelated with the $x_{it}$, the $v_{it}^{OLS}$ will be autocorrelated, because the same $c_i$ enters the error of individual $i$ in every period
  - Standard errors from OLS will not be correct
  - We need standard errors that are robust to heteroskedasticity and autocorrelation → option cluster in Stata
  - ... or a different estimator that allows for autocorrelation in the error term
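The point about standard errors can be illustrated with a short sketch of the cluster-robust "sandwich" variance. This is an illustrative NumPy implementation under simulated data, not the exact computation Stata performs: no finite-sample correction is applied, and the function name `ols_cluster_se` and the data-generating process are assumptions.

```python
import numpy as np

def ols_cluster_se(X, y, groups):
    """OLS coefficients with conventional and cluster-robust standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Conventional variance, valid only for iid errors
    V_iid = XtX_inv * (resid @ resid) / (n - k)
    # Sandwich "meat": outer products of the summed scores of each cluster
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        idx = groups == g
        score = X[idx].T @ resid[idx]
        meat += np.outer(score, score)
    V_cl = XtX_inv @ meat @ XtX_inv      # no finite-sample correction applied
    return beta, np.sqrt(np.diag(V_iid)), np.sqrt(np.diag(V_cl))

rng = np.random.default_rng(0)
N, T = 500, 5
ids = np.repeat(np.arange(N), T)
# Regressor with a persistent individual component, uncorrelated with c_i
x = np.repeat(rng.normal(size=N), T) + 0.5 * rng.normal(size=N * T)
c = np.repeat(rng.normal(size=N), T)     # c_i stays in the pooled OLS error
y = 1.0 + 0.5 * x + c + rng.normal(size=N * T)

X = np.column_stack([np.ones(N * T), x])
beta, se_iid, se_cl = ols_cluster_se(X, y, ids)
print(beta[1], se_iid[1], se_cl[1])
```

In this setup the slope remains consistent, but the conventional standard error should understate the sampling uncertainty relative to the clustered one.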

(10)


(11)

A panel macro production function

$$\log\frac{V_{it}}{L_{it}} = \alpha \log\frac{K_{it}}{L_{it}} + (1-\alpha)\,\phi(E_{it}) + (1-\alpha)\log A_{it} + u_{it} \tag{7}$$

- We assume that the level of productivity differs by country and is time-invariant ($\log A_i$), and that the change in productivity is common across all countries (Year)

$$\log A_{it} = \log A_i + \lambda\,\mathrm{Year} \tag{8}$$

- The production function is therefore:

$$\log\frac{V_{it}}{L_{it}} = \alpha \log\frac{K_{it}}{L_{it}} + (1-\alpha)\,\phi(E_{it}) + (1-\alpha)\log A_i + (1-\alpha)\lambda\,\mathrm{Year} + u_{it} \tag{9}$$

(12)

- In Hall and Jones (1999), residuals are interpreted as differences in underlying productive efficiency across countries
- Here: some time-invariant differences ($\log A_i$) and a time trend (Year)
  - This approach treats $A_i$ as fixed (rather than random), and the resulting estimates are fixed effects estimates
  - Here, the fixed effects are the time-invariant differences in productivity across countries

(13)

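Two waves (1 and 2)

$$\log\frac{V_{i1}}{L_{i1}} = \alpha \log\frac{K_{i1}}{L_{i1}} + (1-\alpha)\,\phi(E_{i1}) + (1-\alpha)\log A_i + (1-\alpha)\lambda\,\mathrm{Year}_1 + u_{i1} \tag{10}$$

$$\log\frac{V_{i2}}{L_{i2}} = \alpha \log\frac{K_{i2}}{L_{i2}} + (1-\alpha)\,\phi(E_{i2}) + (1-\alpha)\log A_i + (1-\alpha)\lambda\,\mathrm{Year}_2 + u_{i2} \tag{11}$$

We can difference the two equations. With the waves one period apart ($\mathrm{Year}_2 - \mathrm{Year}_1 = 1$), the fixed effect $\log A_i$ drops out and the trend becomes a constant:

$$\Delta\log\frac{V_{it}}{L_{it}} = (1-\alpha)\lambda + \alpha\,\Delta\log\frac{K_{it}}{L_{it}} + (1-\alpha)\,\Delta\phi(E_{it}) + \Delta u_{it} \tag{12}$$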

(14)

Pooled OLS

Data: PWT 1980 - 2000

(15)

Some remarks

- Coefficient on the natural log of capital per capita: 0.57 (still much higher than 0.3)
- Coefficient on education: 0.08
  - The correlation between education and income at the macro level is much weaker than in the micro data
  - A review of the evidence linking micro and macro data: Pritchett (2006), "Does learning to add up add up? The returns to schooling in aggregate data", in Handbook of the Economics of Education, vol. 1
  - See also Cohen and Soto (2007) (today's presentation)
- The measure of the change in productivity: negative and significant (but small)

(16)

Fixed effects (1)

Data: PWT 1980 - 2000

(17)

Fixed effects (2)

(18)

Remarks

- Two ways of estimating fixed effects (dummies or xtreg)
  - The coefficients are identical
  - The standard errors differ slightly (the finite-sample correction differs between the two methods in Stata)
- Lower coefficient on lkp (but still much higher than expected)
- The return to education becomes insignificant!
- Change in productivity: not significant

(19)

First difference

Data: PWT 1980 - 2000

(20)

Remarks

- With two waves, the model in first differences gives exactly the same results as the method using dummy variables
- This no longer holds once we have more than two waves of data (the estimates differ)
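The equivalence with two waves can be checked numerically. The sketch below uses simulated data (all numbers and names are assumptions, not the PWT data used in the slides): it estimates the slope and the wave effect by the within transformation with a wave dummy, and by first differences with a constant.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100                                           # hypothetical number of individuals
c = rng.normal(size=N)                            # fixed effect c_i
x = rng.normal(size=(N, 2)) + c[:, None]          # columns = wave 1, wave 2
y = 0.7 * x + c[:, None] + rng.normal(size=(N, 2))
y[:, 1] += 0.2                                    # common change between the two waves

# Fixed effects: demean by individual, include a (demeaned) wave-2 dummy
x_dm = x - x.mean(axis=1, keepdims=True)
y_dm = y - y.mean(axis=1, keepdims=True)
d_dm = np.tile([0.0, 1.0], (N, 1)) - 0.5
X_fe = np.column_stack([x_dm.ravel(), d_dm.ravel()])
b_fe = np.linalg.lstsq(X_fe, y_dm.ravel(), rcond=None)[0]

# First differences: regress the change in y on the change in x and a constant
dx, dy = x[:, 1] - x[:, 0], y[:, 1] - y[:, 0]
X_fd = np.column_stack([dx, np.ones(N)])
b_fd = np.linalg.lstsq(X_fd, dy, rcond=None)[0]

print(b_fe, b_fd)   # [slope, wave effect] from each method
```

With T = 2 the demeaned observations are simply plus or minus half of the differenced ones, which is why both regressions solve the same normal equations.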

(21)


(22)

A panel micro production function

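- Firm-level micro panel data from Ghana (Söderbom and Teal 2004)

$$\log V_{it} = \alpha_0 + \alpha_1 \log K_{it} + \alpha_2 \log L_{it} + \alpha_3 HC_{it} + \alpha_4 \mathrm{Sector}_i + \alpha_5 \mathrm{Location}_i + \alpha_6 \mathrm{Ownership}_i + \alpha_7 \mathrm{FirmAge}_{it} + \alpha_8 \mathrm{Wave7} + u_{it} \tag{13}$$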

(23)

Pooled OLS

(24)

Fixed effects

(25)

- Factor inputs are not significant when controlling for unobservable time-invariant characteristics (fixed effects)
- The time dummy is not significant either:
  - It measures changes in total factor productivity (TFP): since we control for observed inputs, what is left is TFP
- In fixed effects estimation, we control for the fixed effects
  - But the fixed effects themselves are not consistently estimated → do not interpret their estimated coefficients!

(26)


(27)

Panel estimators

$$y_{it} = \beta_0 + \beta_1 x_{1it} + \beta_2 x_{2it} + \dots + \beta_K x_{Kit} + (c_i + u_{it}) \tag{14}$$

- Pooled OLS is consistent if $c_i + u_{it}$ is uncorrelated with the explanatory variables
- Within transformation
  - Transformation of the original equation by "demeaning" the variables ($y_{it} - \bar{y}_i$)
  - It eliminates $c_i$ from the equation
  - Resulting estimator: the within estimator is unbiased if $E(u_{it} - \bar{u}_i \mid x_{1it} - \bar{x}_{1i}, \dots, x_{Kit} - \bar{x}_{Ki}) = 0$, and consistent if $u_{it} - \bar{u}_i$ is uncorrelated with each $x_{kit} - \bar{x}_{ki}$
  - We do not have to assume anything about the relationship between $c_i$ and the explanatory variables

  - However, we need to assume that the explanatory variables are strictly exogenous
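The within transformation described above can be sketched in a few lines. The helper `within_estimator`, the simulated data and the coefficient values are assumptions for illustration. The example also checks the result against the dummy-variable (LSDV) regression, which yields the same slopes.

```python
import numpy as np

def within_estimator(y, X, ids):
    """Demean y and each column of X by individual, then run OLS."""
    _, inv = np.unique(ids, return_inverse=True)
    counts = np.bincount(inv)

    def demean(v):
        means = np.bincount(inv, weights=v) / counts   # individual means
        return v - means[inv]

    y_dm = demean(y)
    X_dm = np.column_stack([demean(X[:, j]) for j in range(X.shape[1])])
    return np.linalg.lstsq(X_dm, y_dm, rcond=None)[0]

rng = np.random.default_rng(7)
N, T = 50, 4                                      # hypothetical panel dimensions
ids = np.repeat(np.arange(N), T)
c = rng.normal(size=N)
X = rng.normal(size=(N * T, 2)) + c[ids][:, None]  # both regressors correlated with c_i
y = X @ np.array([0.5, -0.3]) + c[ids] + rng.normal(size=N * T)

b_within = within_estimator(y, X, ids)

# LSDV with one dummy per individual gives the same slope coefficients
D = (ids[:, None] == np.arange(N)[None, :]).astype(float)
b_lsdv = np.linalg.lstsq(np.column_stack([X, D]), y, rcond=None)[0][:2]
print(b_within, b_lsdv)
```

Demeaning avoids building N dummy columns, which matters when N is large; the slopes are numerically the same either way.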

(28)

- The fixed effects (FE) and first differences (FD) estimators are exactly equivalent when T = 2
- This is generally not the case when T > 2
  - Under the null hypothesis that strict exogeneity holds, FE and FD differ only because of sampling error
  - If FE and FD are significantly different (so that the differences in the estimates cannot be attributed to sampling error), this raises concerns about the validity of the strict exogeneity assumption (see Wooldridge 2010, ch. 10)

(29)

Random Effects estimators

- POLS is consistent if $c_i$ is uncorrelated with all explanatory variables and the time-varying part of the residual $u_{it}$ is uncorrelated with all explanatory variables
- But POLS is not efficient, because $c_i + u_{it}$ is serially correlated
- Random effects (RE) models:
  - Like POLS, the RE estimator leaves $c_i$ in the error term
  - But RE explicitly deals with the serial correlation in the equation residual ($c_i + u_{it}$)
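One standard way to see how RE deals with this serial correlation is the quasi-demeaning form of the estimator for a balanced panel. The notation $\sigma_c^2$ and $\sigma_u^2$ (the variances of $c_i$ and $u_{it}$) and $v_{it} = c_i + u_{it}$ is introduced here for illustration and does not appear in the slides.

```latex
y_{it} - \theta \bar{y}_i
  = \beta_0 (1 - \theta)
  + \beta_1 (x_{1it} - \theta \bar{x}_{1i}) + \dots
  + \beta_K (x_{Kit} - \theta \bar{x}_{Ki})
  + (v_{it} - \theta \bar{v}_i),
\qquad
\theta = 1 - \sqrt{\frac{\sigma_u^2}{\sigma_u^2 + T \sigma_c^2}}
```

When $\theta = 0$ (no variance in $c_i$) the transformation reduces to POLS, and as $\theta \to 1$ it approaches the within (FE) transformation, so RE sits between the two.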

(30)

Key assumptions for consistency

- POLS: $c_i$ and $u_{it}$ have to be uncorrelated with the explanatory variables. Strict exogeneity is not required.
- FE: the residual in each time period has to be uncorrelated with the explanatory variables in all time periods (strict exogeneity). The individual effects $c_i$ can be correlated with the explanatory variables
- FD: the residuals in t − 1, t and t + 1 have to be uncorrelated with the explanatory variables in period t. The individual effects $c_i$ can be correlated with the explanatory variables
- RE: strict exogeneity and zero correlation between the individual effect and the explanatory variables

(31)

Model selection

- If strict exogeneity holds:
  - If the individual effect $c_i$ is correlated with the explanatory variables, use FD or FE
  - If the individual effect $c_i$ is not correlated with the explanatory variables, use RE (more efficient)
- Testing for correlation between the $c_i$ and the explanatory variables
  - Hausman test
    - Null hypothesis not rejected: RE is consistent and should be used
    - Null hypothesis rejected: RE is not consistent, FE or FD should be used
  - The Mundlak-based approach (see next section)
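As an illustration of the Hausman test, the statistic compares the FE and RE coefficient vectors, weighted by the difference of their covariance matrices, and is compared with a chi-squared distribution whose degrees of freedom equal the number of coefficients tested. The function name and all the numbers below are hypothetical, chosen only to show the arithmetic.

```python
import numpy as np

def hausman(b_fe, b_re, V_fe, V_re):
    """Hausman statistic (b_FE - b_RE)' [V_FE - V_RE]^{-1} (b_FE - b_RE) and its d.o.f."""
    diff = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    V_diff = np.asarray(V_fe, dtype=float) - np.asarray(V_re, dtype=float)
    stat = float(diff @ np.linalg.solve(V_diff, diff))
    return stat, diff.size

# Hypothetical estimates for two time-varying regressors (not from the slides)
b_fe = [0.50, 0.10]
b_re = [0.45, 0.12]
V_fe = np.diag([0.0025, 0.0004])
V_re = np.diag([0.0016, 0.0003])

stat, dof = hausman(b_fe, b_re, V_fe, V_re)
print(stat, dof)   # compare stat with the chi-squared critical value for dof degrees of freedom
```

With these made-up numbers the statistic exceeds 5.99, the 5% critical value of a chi-squared with 2 degrees of freedom, so the null that RE is consistent would be rejected. In finite samples the matrix difference may fail to be positive definite, in which case the statistic is unreliable.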

(32)

- Testing for the presence of an unobserved effect
  - If the regressors are strictly exogenous and $u_{it}$ is non-autocorrelated and homoskedastic, then POLS and RE are both efficient if the unobserved effect $c_i$ is constant across all individuals
    - → In that case there are no time-invariant unobserved effects
  - If $c_i$ varies across individuals, then only RE is efficient
  - Lagrange multiplier (LM) test due to Breusch and Pagan (1980):
    - xttest0 following RE estimation (Stata)
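For intuition, the Breusch-Pagan LM statistic for a balanced panel can be computed by hand from pooled OLS residuals. The sketch below is an illustration on simulated data, not a reproduction of what xttest0 reports; the function name and the data-generating process are assumptions. Under the null of no individual effects the statistic is asymptotically chi-squared with one degree of freedom.

```python
import numpy as np

def breusch_pagan_lm(resid, N, T):
    """LM statistic for random individual effects (balanced panel, data sorted by individual)."""
    e = np.asarray(resid).reshape(N, T)            # rows = individuals, columns = periods
    ratio = (e.sum(axis=1) ** 2).sum() / (e ** 2).sum()
    return N * T / (2 * (T - 1)) * (ratio - 1) ** 2

rng = np.random.default_rng(5)
N, T = 200, 5
ids = np.repeat(np.arange(N), T)
x = rng.normal(size=N * T)
c = rng.normal(size=N)                             # individual effects are present
y = 1.0 + 0.5 * x + c[ids] + rng.normal(size=N * T)

# Pooled OLS residuals
X = np.column_stack([np.ones(N * T), x])
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

lm = breusch_pagan_lm(resid, N, T)
print(lm)   # compare with 3.84, the 5% critical value of a chi-squared(1)
```

Because the simulated $c_i$ has substantial variance, the statistic should be far above the critical value, leading to rejection of the null of no unobserved effects.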

(33)


(34)

Pooled OLS

- Now six waves of data, using the firm-level data from Ghana

(35)

Fixed effects

(36)

Random effects

(37)

Choosing between POLS and RE (Breusch-Pagan)

- We reject the hypothesis that there are no time-invariant unobserved effects
- RE is more efficient than POLS

(38)

Choosing between FE and RE (Hausman test)

- The RE specification can be justified, since the p-value associated with the Hausman test is higher than 0.10
- However, 'V_b-V_B is not positive definite', and the RE standard error associated with wave 3 is higher than its FE counterpart, which is not in line with the [...]

(39)

Choosing between FE and RE (Mundlak-based approach)

- Method:
  - Add the individual-specific means of the explanatory variables to the baseline specification and estimate by OLS
  - Then test the joint significance of these means (using an F-test)
  - If we reject the hypothesis that the coefficients on all the mean variables are equal to zero → we reject the hypothesis that $c_i$ is uncorrelated with the explanatory variables $x_{jit}$ → FE instead of RE
- Here: we reject the hypothesis that $c_i$ is uncorrelated with the explanatory variables → FE estimates
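The Mundlak-based method can be sketched as follows with simulated data; the data-generating process and variable names are assumptions, not the Ghanaian firm data. With a single regressor the joint F-test reduces to a t-test on the coefficient of the individual mean; cluster-robust standard errors are used here as one reasonable choice, without finite-sample correction.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 300, 4                                      # hypothetical panel dimensions
ids = np.repeat(np.arange(N), T)
c = rng.normal(size=N)
x = 0.7 * c[ids] + rng.normal(size=N * T)          # x_it correlated with c_i
y = 1.0 + 0.5 * x + c[ids] + rng.normal(size=N * T)

# Step 1: add the individual-specific mean of x and estimate by OLS
x_bar = (np.bincount(ids, weights=x) / T)[ids]
X = np.column_stack([np.ones(N * T), x, x_bar])
coef = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ coef

# Step 2: cluster-robust variance (clusters = individuals)
XtX_inv = np.linalg.inv(X.T @ X)
meat = np.zeros((3, 3))
for i in range(N):
    s = X[ids == i].T @ resid[ids == i]
    meat += np.outer(s, s)
V = XtX_inv @ meat @ XtX_inv

# Step 3: test the coefficient on the mean (t-test = F-test with one restriction)
t_mean = coef[2] / np.sqrt(V[2, 2])

# For comparison: the within (FE) slope equals the coefficient on x above
x_dm = x - x_bar
b_within = (x_dm @ y) / (x_dm @ x_dm)
print(coef[1], b_within, t_mean)
```

A large t-statistic on the mean signals that $c_i$ is correlated with $x_{it}$, pointing to FE. The coefficient on $x_{it}$ in the augmented regression coincides with the within estimate, which is one reason the approach is convenient.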

(40)

What drives the productivity of Ghanaian firms?

- If you believe that RE or POLS are consistent and efficient: labour productivity is driven primarily by differences in capital intensity. Human capital plays some role, although its level of significance varies across estimators
- If you believe that only FE can be used: only time-invariant unobservables drive labour productivity
  - It is an economist's way of saying, "I don't know"!
- In this chapter, we assume that the $x_{it}$ are uncorrelated with the $u_{it}$
  - If this is not the case: endogeneity → Instrumental Variables → next chapter
