Topics in Applied Econometrics : Panel Data

Ch 1. Linear Non Dynamic Panel Data Models

Pr. Philippe Polomé, Université Lumière Lyon 2

M2 Equade & M2 GAEXA

2015 – 2016

Overview of Ch. 1

- Panel Data Models
  - Fixed Effects & Random Effects
- Panel Data Estimators
  - Pooled OLS Estimator
  - Between Estimator
  - Within Estimator
  - First-Differences Estimator
  - Random Effects Estimator
  - Fixed vs. Random Effects
  - Hours & Wages Example
- Panel Data Inference
  - Panel-Robust Inference
  - Bootstrap Standard Errors
  - Hours & Wages Example
- Fixed Effects vs. Random Effects
  - Non-Test Elements of Choice
  - Hausman Test
- Unbalanced Panel Data

Models & Estimators

- Wider range of models and estimators than with cross-section data
- 3 standard models
  - Presented in this section
  - Several estimators are presented in the next section
  - The same logic applies to more sophisticated models
- The different estimators may be applied to the different models
  - With varying results
  - Summarized in a table later in the chapter

General Panel Data Model

- A very general linear model for panel data
  - Intercept and slope coefficients vary over both $i$ and $t$

    $$y_{it} = \alpha_{it} + x_{it}'\beta_{it} + u_{it}$$

    $i = 1, \dots, N$: individual (or firm, or country); $t = 1, \dots, T$: time
  - $y_{it}$: scalar dependent variable
  - $x_{it}$: $K \times 1$ vector of independent variables
  - $u_{it}$: scalar disturbance term
- Too general
  - Not estimable: more parameters to estimate than observations
- Further restrictions are needed
  - on the extent to which $\alpha_{it}$ and $\beta_{it}$ vary with $i$ and $t$
  - on the behavior of the error $u_{it}$

Pooled Model

- The most restrictive model is a pooled model that specifies constant coefficients

  $$y_{it} = \alpha + x_{it}'\beta + u_{it} \qquad (1)$$

- If this model is correctly specified
  - and the regressors are uncorrelated with the error,
  - then it can be consistently estimated as a cross-section
  - That is: just with OLS

Individual and Time Dummies

- A simple variant of the pooled model (1) has
  - Intercepts that vary across individuals and over time
  - Constant slopes

  $$y_{it} = \alpha_i + \gamma_t + x_{it}'\beta + u_{it} \qquad (2)$$

  or

  $$y_{it} = \sum_{j=1}^{N} \alpha_j d_{j,it} + \sum_{s=2}^{T} \gamma_s d_{s,it} + x_{it}'\beta + u_{it}$$

  where
  - the $N$ individual dummies $d_{j,it} = 1$ if $i = j$ and $= 0$ otherwise
  - the $T-1$ time dummies $d_{s,it} = 1$ if $t = s$ and $= 0$ otherwise

Individual and Time Dummies

- $x_{it}$ does not include an intercept
  - If an intercept is included, then one of the $N$ individual dummies must be dropped
  - Many packages do that automatically
- Focus on short panels, where $N \to \infty$ but $T$ does not
  - Then the time intercepts $\gamma_t$ can be consistently estimated
    - At least in the sense that there is a finite number of them
    - The $T-1$ time dummies are simply incorporated into the regressors $x_{it}$
  - But inserting the full set of $N$ individual dummies $d_{j,it}$ would cause problems as $N \to \infty$
    - We cannot consistently estimate an infinite number of parameters
    - Information on the $\alpha_i$ does not increase as $N$ increases
- Challenge: estimating the parameters $\beta$
  - while controlling for the $N$ individual intercepts $\alpha_i$

Individual-Specific Effects Model

- Individual-specific effects model:
  - each cross-sectional unit has a different intercept term, but all slopes are the same

    $$y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it} \qquad (3)$$

    where $\varepsilon_{it}$ is iid over $i$ and $t$
- A more parsimonious way to express the previous model (2)
  - Time dummies are included in the regressors $x_{it}$
- The "standard" linear non-dynamic panel data model
  - no $y_{i,t-s}$ in $x_{it}$
- The $\alpha_i$ are random variables
  - They capture unobserved heterogeneity
    - = unobserved time-invariant individual characteristics
  - In effect: a random parameter model

Reminder: Unobserved Heterogeneity

- The correct model is

  $$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

- But the estimated model is

  $$Y = \beta_0 + \beta_1 x_1 + \nu$$

- The effect of the missing regressor on $Y$ is absorbed into the error of the estimated model:

  $$\nu = \beta_2 x_2 + \varepsilon$$

  - = unobserved heterogeneity: unobserved (individual) factors influence the LHS variable
- If the missing regressor is correlated with an included regressor
  - Then $\nu$ is correlated with at least one included regressor
  - LS is inconsistent
- Furthermore, possibly:
  - Heteroscedasticity if $\mathrm{var}(x_{2t}) \neq \mathrm{var}(x_{2s})$, $t \neq s$
  - Autocorrelation if $\mathrm{corr}(x_{2t}, x_{2s}) \neq 0$, $t \neq s$

Reminder: Unobserved Heterogeneity

[Figure: unobserved heterogeneity with the same slopes]

Exogeneity

- Throughout this chapter: assume strong (strict) exogeneity

  $$E[\varepsilon_{it} \mid \alpha_i, x_{i1}, \dots, x_{iT}] = 0, \quad t = 1, \dots, T \qquad (4)$$

- So $\varepsilon_{it}$ is assumed to have mean zero conditional on past, current, and future values of the regressors
  - Zero covariance
  - Nothing is said about the relation between the random term $\alpha_i$ and $x_i$
- Strong exogeneity rules out models with lagged dependent variables or with endogenous variables as regressors (Ch. 2)
  - $y_{i,t-1} = \alpha_i + x_{i,t-1}'\beta + \varepsilon_{i,t-1}$: it is often hard to maintain that $E(\varepsilon_{it}\,\varepsilon_{i,t-1}) = 0$

Fixed Effects Model

- 2 variants of model (3), according to the hypotheses on $\alpha_i$
  - Both are models with "2" errors, $\alpha_i$ and $\varepsilon_{it}$
    - Error component models
  - Both variants treat $\alpha_i$ as an unobserved random variable
- Variant 1 of model (3): fixed effects (FE) model
  - $\alpha_i$ is potentially correlated with the (time-invariant part of the) observed regressors $x_{it}$
  - A form of unobserved heterogeneity
  - "Fixed" because early treatments treated the $\alpha_i$ as (non-random) parameters to be estimated

Random Effects Model

- Variant 2 of model (3): random effects (RE) model
  - $\alpha_i$ is distributed independently of $x$
  - Usually makes the additional assumptions that both the random effects $\alpha_i$ and the error term $\varepsilon_{it}$ in (3) are iid:

    $$\alpha_i \sim [\alpha, \sigma^2_\alpha], \qquad \varepsilon_{it} \sim [0, \sigma^2_\varepsilon] \qquad (5)$$

- No distribution has been specified in (5)
- $\varepsilon_{it}$ may show autocorrelation
  - Often it is assumed that $\mathrm{cov}(\varepsilon_{it}, \varepsilon_{is}) \neq 0$
  - While both $\mathrm{cov}(\varepsilon_{it}, \varepsilon_{jt}) = 0$ and $\mathrm{cov}(\alpha_i, \alpha_j) = 0$ are assumed
    - Except in spatial models
- $\alpha$ can be treated as the intercept of the model

Random Effects Model

- Other names for this model:
  - One-way individual-specific effects model
    - Two-way = inclusion of time dummies or time-specific random effects
  - Random intercept model
    - To distinguish it from more general random effects models, e.g. with random slopes
  - Random components model
    - because the error term is $\alpha_i + \varepsilon_{it}$
- The term "fixed effect" is potentially misleading
  - As said, the effects are in fact random in both models
  - The random effects of the RE model are "purely" random effects: uncorrelated with the regressors

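Equicorrelated Random Effects Model

- RE model $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$
  - can be viewed as a regression of $y_{it}$ on $x_{it}$
  - with composite error term $u_{it} = \alpha_i + \varepsilon_{it}$
- The RE hypothesis (5) ($\alpha_i$ and $\varepsilon_{it}$ iid) implies that

  $$\mathrm{Cov}[(\alpha_i + \varepsilon_{it}), (\alpha_i + \varepsilon_{is})] = \begin{cases} \sigma^2_\alpha, & t \neq s \\ \sigma^2_\alpha + \sigma^2_\varepsilon, & t = s \end{cases} \qquad (6)$$

- The RE model thus imposes the constraint that the composite error $u_{it}$ is equicorrelated
  - Since $\mathrm{Cor}[u_{it}, u_{is}] = \sigma^2_\alpha / [\sigma^2_\alpha + \sigma^2_\varepsilon]$ for $t \neq s$ does not vary with the time difference $t - s$
  - The RE model is also called the equicorrelated model or exchangeable errors model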
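
As a rough illustration of (6), the sketch below builds the $T \times T$ covariance block of the composite error for one individual and the implied correlation. The function names and the variance values (0.5 and 1.0) are assumptions chosen for illustration; they do not come from the course data.

```python
def re_error_cov(sigma2_alpha, sigma2_eps, T):
    """T x T covariance block of u_it = alpha_i + eps_it for one individual, as in (6)."""
    return [
        [sigma2_alpha + (sigma2_eps if t == s else 0.0) for s in range(T)]
        for t in range(T)
    ]


def re_error_corr(sigma2_alpha, sigma2_eps):
    """Correlation between u_it and u_is for any t != s (no dependence on t - s)."""
    return sigma2_alpha / (sigma2_alpha + sigma2_eps)


# Illustrative (assumed) variance components.
block = re_error_cov(0.5, 1.0, T=3)
rho = re_error_corr(0.5, 1.0)
for row in block:
    print(row)
print(rho)
```

Every off-diagonal entry of the block is the same, which is exactly the equicorrelation restriction.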

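Synthesis of Panel Data Models

- Pooled model: $y_{it} = \alpha + x_{it}'\beta + u_{it}$ (1), with $u_{it} \sim [0, \sigma^2_u]$
- Fixed-effects model: $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$ (3), with $\mathrm{Cov}(\alpha_i, x_{it}) \neq 0$
- Random-effects model: (3) with $\alpha_i \sim [\alpha, \sigma^2_\alpha]$ and $\varepsilon_{it} \sim [0, \sigma^2_\varepsilon]$ (5)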

Panel Data Estimators

- Several commonly used panel data estimators of $\beta$
  - In this non-dynamic, no-endogeneity context: LS variants
- They differ in the extent to which cross-section and time-series variation in the data are used
  - Their properties vary according to which model is appropriate
- A regressor $x_{it}$ may be either
  - time-invariant, $x_{it} = x_i$ for $t = 1, \dots, T$,
  - or time-varying
  - For some estimators, only the coefficients of time-varying regressors are identified

Pooled OLS Estimator: Definition & Properties

- Stack the data over $i$ and $t$ into one long regression with $NT$ observations
- Estimate $y_{it} = \alpha + x_{it}'\beta + u_{it}$ by OLS
- Pooled OLS is consistent (when $N \to \infty$, $T$ constant) if
  - $\mathrm{Cov}[u_{it}, x_{it}] = 0$ and
  - the pooled model (1) is appropriate, or
  - the RE model is appropriate
- The OLS variance matrix based on iid errors is not appropriate
  - as the errors for a given individual $i$ are almost certainly positively correlated over $t$

Variance Matrix

- For a given $i$ we expect correlation in $y$ over time:
  - $\mathrm{Cor}[y_{it}, y_{is}]$ is high
  - Pooled model $y_{it} = \alpha + x_{it}'\beta + u_{it}$
  - Even after inclusion of regressors, $\mathrm{Cor}[u_{it}, u_{is}]$ may remain $\neq 0$
  - Call $\mathrm{Cov}[u_{it}, u_{is}] = \sigma_{its}$
    - When $t = s$, $\sigma_{its} = \sigma^2_{it}$

Panel Block-Diagonal Var-Cov Matrix of the Errors $\Sigma$

$$\Sigma = \begin{pmatrix} \Sigma_1 & 0 & \cdots & 0 \\ 0 & \Sigma_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \Sigma_N \end{pmatrix}, \qquad \Sigma_i = \begin{pmatrix} \sigma^2_{i1} & \sigma_{i12} & \cdots & \sigma_{i1T} \\ & \sigma^2_{i2} & \ddots & \vdots \\ & & \ddots & \sigma_{i(T-1)T} \\ \text{SYM} & & & \sigma^2_{iT} \end{pmatrix}$$

Variance Matrix

- The RE model accommodates (partly) this correlation
  - From (6):

    $$\mathrm{Cov}[(\alpha_i + \varepsilon_{it}), (\alpha_i + \varepsilon_{is})] = \begin{cases} \sigma^2_\alpha, & t \neq s \\ \sigma^2_\alpha + \sigma^2_\varepsilon, & t = s \end{cases}$$

- OLS output treats each of the $T$ years as independent information, but
  - The information content is less than this
  - given the positive error correlation
  - OLS therefore tends to overstate estimator precision
- Use panel-corrected standard errors when OLS is applied in a panel
  - Many corrections are possible, depending on the assumed correlation and heteroskedasticity, and on whether the panel is short or long

FE Model

- Pooled OLS is inconsistent if the true model is the FE model
- Rewrite $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$ as

  $$y_{it} = \alpha + x_{it}'\beta + (\alpha_i - \alpha + \varepsilon_{it})$$

- Then Pooled OLS of $y_{it}$ on $x_{it}$ and an intercept leads to an inconsistent estimator of $\beta$ if the individual effect $\alpha_i$ is correlated with $x_{it}$
  - Since such correlation implies that the combined error term $(\alpha_i - \alpha + \varepsilon_{it})$ is correlated with the regressors

Between Estimator: Definition

- Pooled OLS uses variation over both time and cross-sectional units to estimate $\beta$
- The between estimator uses just the cross-sectional variation
  - Individual-specific effects model (3): $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$
  - Average over all years: $\bar y_i = \alpha_i + \bar x_i'\beta + \bar\varepsilon_i$
    - arithmetic means over time, per individual
- Between estimator = OLS estimator from the regression of $\bar y_i$ on an intercept and $\bar x_i$
  - so implicitly on the between model

    $$\bar y_i = \alpha + \bar x_i'\beta + (\alpha_i - \alpha + \bar\varepsilon_i), \quad i = 1, \dots, N \qquad (7)$$

Between Estimator: Properties

- Uses variation between different individuals
  - Is the analogue of a cross-section regression
  - Variation within individuals is discarded
- Between is consistent if the regressors $\bar x_i$ are independent of the composite error $(\alpha_i - \alpha + \bar\varepsilon_i)$ in (7)
  - True for the pooled model (1) and the RE model
  - Between is inconsistent for the FE model
    - as $\alpha_i$ is then correlated with $x_{it}$ and hence with $\bar x_i$
- Between is not normally used in applications, as it throws away a lot of information
  - But it is didactically useful
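
The between regression (7) can be sketched for a single regressor as follows. This is a minimal illustration on a made-up toy panel (the data and the function name `between_estimator` are hypothetical), using only the closed-form simple-regression formulas.

```python
from statistics import mean

# Hypothetical toy panel: individual id -> list of (x_it, y_it) over time.
panel = {
    1: [(1.0, 2.1), (2.0, 2.9), (3.0, 4.2)],
    2: [(2.0, 5.0), (3.0, 6.1), (4.0, 6.8)],
    3: [(0.0, 0.2), (1.0, 1.1), (2.0, 1.9)],
}


def between_estimator(panel):
    """OLS of the individual means y-bar_i on an intercept and x-bar_i (scalar x)."""
    xbar = [mean(x for x, _ in obs) for obs in panel.values()]
    ybar = [mean(y for _, y in obs) for obs in panel.values()]
    mx, my = mean(xbar), mean(ybar)
    sxx = sum((x - mx) ** 2 for x in xbar)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xbar, ybar))
    beta = sxy / sxx
    alpha = my - beta * mx
    return alpha, beta


alpha_b, beta_b = between_estimator(panel)
print(alpha_b, beta_b)
```

Only the $N = 3$ individual means enter the regression; all within-individual variation is discarded.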

Within Model

- Principle: individual-specific deviations of the dependent variable from its time-averaged value
  - are explained by
  - individual-specific deviations of the regressors from their time-averaged values
- Individual-specific effects model (3): $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$
  - Average over time: $\bar y_i = \alpha_i + \bar x_i'\beta + \bar\varepsilon_i$
  - Subtract: the $\alpha_i$ terms cancel = the within model

    $$y_{it} - \bar y_i = (x_{it} - \bar x_i)'\beta + (\varepsilon_{it} - \bar\varepsilon_i), \quad i = 1, \dots, N, \; t = 1, \dots, T \qquad (8)$$

Within / Fixed Effects Estimator

- Within estimator = OLS estimator of

  $$y_{it} - \bar y_i = (x_{it} - \bar x_i)'\beta + (\varepsilon_{it} - \bar\varepsilon_i)$$

  - Consistent for $\beta$ in the FE model
- Called the fixed effects estimator by analogy with the FE model
  - This does not imply that the $\alpha_i$ are fixed
- Each $i$ must be observed at least twice in the sample
  - Else $x_{it} - \bar x_i = 0$
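
A minimal sketch of the within transformation (8) for one regressor, on a hypothetical toy panel in which individuals with higher intercepts also have higher $x$ (the data and the name `within_estimator` are assumptions for illustration):

```python
from statistics import mean

# Hypothetical toy panel: individual id -> list of (x_it, y_it) over time.
panel = {
    1: [(1.0, 2.1), (2.0, 2.9), (3.0, 4.2)],
    2: [(2.0, 5.0), (3.0, 6.1), (4.0, 6.8)],
    3: [(0.0, 0.2), (1.0, 1.1), (2.0, 1.9)],
}


def within_estimator(panel):
    """OLS slope on data demeaned individual by individual (scalar x, no intercept)."""
    sxy = sxx = 0.0
    for obs in panel.values():
        xbar = mean(x for x, _ in obs)
        ybar = mean(y for _, y in obs)
        # Demeaning removes alpha_i, whatever its correlation with x.
        sxy += sum((x - xbar) * (y - ybar) for x, y in obs)
        sxx += sum((x - xbar) ** 2 for x, _ in obs)
    return sxy / sxx


beta_w = within_estimator(panel)
print(beta_w)
```

An individual observed only once would contribute zero to both sums, which is why at least two observations per $i$ are needed.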

Consistency of Fixed Effects Estimator

- FE treats the $\alpha_i$ as nuisance parameters
  - They can be ignored when interest lies in $\beta$
  - They do not need to be consistently estimated to obtain consistent estimates of the slope parameters $\beta$
  - This result need not carry over to nonlinear FE models
- Consistency further requires $E(\varepsilon_{it} - \bar\varepsilon_i \mid x_{it} - \bar x_i) = 0$ in the within model $y_{it} - \bar y_i = (x_{it} - \bar x_i)'\beta + (\varepsilon_{it} - \bar\varepsilon_i)$
  - Because of the averages, that requires more than $E(\varepsilon_{it} \mid x_{it}) = 0$
  - It requires the strict exogeneity assumption (4): $E[\varepsilon_{it} \mid \alpha_i, x_{i1}, \dots, x_{iT}] = 0$, $t = 1, \dots, T$

Fixed Effects Estimates

- If the fixed effects $\alpha_i$ are of interest, they can also be estimated as

  $$\hat\alpha_i = \bar y_i - \bar x_i'\hat\beta$$

  - Unbiased estimator of $\alpha_i$
  - In short (small $T$) panels the $\hat\alpha_i$ are always inconsistent, because information never accumulates for them
  - Their distribution, or their variation with a key variable, may be informative
- If $N$ is not too large, an alternative way to compute Within is Least-Squares Dummy Variable (LSDV) estimation
  - Directly estimates $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$ by OLS of $y_{it}$ on $x_{it}$ and $N$ individual dummy variables
  - Yields the Within estimator for $\beta$, along with estimates of the $N$ fixed effects

Time-Invariant Regressors

- Major limitation of Within
  - The coefficients of time-invariant regressors are not identified
  - Since if $x_{it} = x_i$ for all $t$, then $\bar x_i = x_i$, so $(x_{it} - \bar x_i) = 0$
- Many studies seek to estimate the effect of time-invariant regressors
  - For example, in panel wage regressions: the effect of gender or race
- For this reason many practitioners prefer not to use the within estimator
- Pooled OLS or RE estimators permit estimation of the coefficients of time-invariant regressors
  - but they are inconsistent if the FE model is the correct model

First-Differences Model

- Principle: individual-specific one-period changes in the dependent variable
  - are explained by
  - individual-specific one-period changes in the regressors
- Individual-specific effects model (3)
  - Lag one period: $y_{i,t-1} = \alpha_i + x_{i,t-1}'\beta + \varepsilon_{i,t-1}$
  - Subtract = the first-differences model

    $$y_{it} - y_{i,t-1} = (x_{it} - x_{i,t-1})'\beta + (\varepsilon_{it} - \varepsilon_{i,t-1}), \quad i = 1, \dots, N, \; t = 2, \dots, T \qquad (9)$$

First-Differences Estimator

- The first-differences estimator is OLS in the first-differences model (9)
- Consistent estimates of $\beta$ in the FE model
  - The coefficients of time-invariant regressors are not identified
- First-differences is less efficient than the within estimator
  - if $\varepsilon_{it}$ is iid (for $T > 2$)
- However, it may safeguard against I(1) variables
  - That would otherwise lead to inconsistency
  - See the time-series course
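
The first-differences regression (9) can be sketched in the same spirit: difference each individual's series, then run OLS without an intercept on the stacked differences. The toy data and the name `first_difference_estimator` are hypothetical.

```python
# Hypothetical toy panel: individual id -> time-ordered list of (x_it, y_it).
panel = {
    1: [(1.0, 2.0), (3.0, 4.5), (4.0, 5.0)],
    2: [(2.0, 5.0), (2.5, 5.2), (5.0, 8.0)],
}


def first_difference_estimator(panel):
    """OLS slope of (y_it - y_i,t-1) on (x_it - x_i,t-1), scalar x, no intercept."""
    sxy = sxx = 0.0
    for obs in panel.values():
        # Pair each observation with its predecessor within the same individual.
        for (x_prev, y_prev), (x_cur, y_cur) in zip(obs, obs[1:]):
            dx, dy = x_cur - x_prev, y_cur - y_prev
            sxy += dx * dy
            sxx += dx * dx
    return sxy / sxx


beta_fd = first_difference_estimator(panel)
print(beta_fd)
```

Differencing never crosses individuals, so each $i$ contributes $T - 1$ observations.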

Random Effects Model

- Individual-specific effects model (3)
  - Assume the RE model with iid $\alpha_i$ and $\varepsilon_{it}$, as in the RE hypothesis (5):

    $$\alpha_i \sim [\alpha, \sigma^2_\alpha], \qquad \varepsilon_{it} \sim [0, \sigma^2_\varepsilon]$$

- Pooled OLS is consistent
  - But pooled GLS will be more efficient

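Reminder: GLS in a cross-section

- When all the hypotheses of the linear model are satisfied, but the error covariance matrix $\Sigma$ is not proportional to the identity, then
  - OLS is consistent
  - but it is not efficient if we know $\Sigma$
- Take the classical linear (cross-section) model $y = X\beta + \varepsilon$ with $E(\varepsilon\varepsilon') = \Sigma \neq \sigma^2 I$
  - Let $P'P = \Sigma^{-1}$
    - Non-unique Cholesky-type decomposition, which exists for a real symmetric positive definite matrix
  - Premultiply the linear model by $P$: $Py = PX\beta + P\varepsilon$
    - i.e. $y^* = X^*\beta + \varepsilon^*$
  - Then

    $$\mathrm{Var}(\varepsilon^*) = E(P\varepsilon\varepsilon'P') = P\,E(\varepsilon\varepsilon')\,P' = P\Sigma P' = P(P'P)^{-1}P' = PP^{-1}(P')^{-1}P' = I$$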

Reminder: GLS in a cross-section

- So the transformed model has spherical disturbances
  - Applying OLS to the transformed data is an efficient estimator
  - That is GLS
- Since $\Sigma$ is unknown in practice, we need an estimate
  - Any consistent estimate $\hat\Sigma$ of $\Sigma$ yields the feasible (consistent) GLS estimator

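Panel Block-Diagonal Var-Cov Matrix of the Errors $\Sigma$

$$\Sigma = \begin{pmatrix} \Omega & 0 & \cdots & 0 \\ 0 & \Omega & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \Omega \end{pmatrix}, \qquad \Omega = \begin{pmatrix} \sigma^2_\alpha + \sigma^2_\varepsilon & \sigma^2_\alpha & \cdots & \sigma^2_\alpha \\ \sigma^2_\alpha & \sigma^2_\alpha + \sigma^2_\varepsilon & \ddots & \vdots \\ \vdots & \ddots & \ddots & \sigma^2_\alpha \\ \sigma^2_\alpha & \cdots & \sigma^2_\alpha & \sigma^2_\alpha + \sigma^2_\varepsilon \end{pmatrix}$$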

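Random Effects Estimator

- The feasible GLS estimator of the RE model can be calculated from OLS estimation of the transformed model:

  $$y_{it} - \hat\lambda \bar y_i = (1 - \hat\lambda)\mu + (x_{it} - \hat\lambda \bar x_i)'\beta + \nu_{it} \qquad (10)$$

  where $\nu_{it} = (1 - \hat\lambda)\alpha_i + (\varepsilon_{it} - \hat\lambda \bar\varepsilon_i)$ is asymptotically iid, and
  - $\hat\lambda$ is consistent for

    $$\lambda = 1 - \frac{\sigma_\varepsilon}{\sqrt{\sigma^2_\varepsilon + T\sigma^2_\alpha}} \qquad (11)$$

- Called the RE estimator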

Random Effects Estimator

- The nonrandom scalar intercept $\mu$ is added to normalize the random effects $\alpha_i$ to have zero mean
  - as in the RE hypothesis
- Cameron & Trivedi provide a derivation of (10) and ways to estimate $\sigma^2_\alpha$ and $\sigma^2_\varepsilon$, and hence to estimate $\lambda$
  - Not detailed here
- Note
  - $\hat\lambda = 0$ corresponds to pooled OLS
  - $\hat\lambda = 1$ corresponds to within estimation
  - $\hat\lambda \to 1$ as $T \to \infty$ (look at the formula)
- This is a two-step estimator of $\beta$
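
The limiting cases of the weight $\lambda = 1 - \sigma_\varepsilon / \sqrt{\sigma^2_\varepsilon + T\sigma^2_\alpha}$ can be checked numerically. The sketch below uses assumed, illustrative variance components and a hypothetical helper `re_lambda`; in practice the variances would be replaced by consistent estimates.

```python
from math import sqrt


def re_lambda(sigma2_eps, sigma2_alpha, T):
    """Quasi-demeaning weight of the RE transformation for a balanced panel."""
    return 1.0 - sqrt(sigma2_eps) / sqrt(sigma2_eps + T * sigma2_alpha)


# No individual effect variance: the transformation leaves the data unchanged (pooled OLS).
lam_pooled = re_lambda(1.0, 0.0, T=10)
# Equal variances with T = 3: 1 - 1/sqrt(4) = 0.5.
lam_mid = re_lambda(1.0, 1.0, T=3)
# Very long panel: the weight approaches 1 (within estimation).
lam_long = re_lambda(1.0, 1.0, T=10**8)
print(lam_pooled, lam_mid, lam_long)
```

For intermediate values, RE subtracts only a fraction $\hat\lambda$ of the individual means, which is why its estimates fall between those of pooled OLS and within.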

Random Effects Estimator Properties

- The RE estimator is
  - Fully efficient under the RE model
    - The efficiency gain compared to Pooled OLS (applied to the RE model) need not be great
  - Possibly still inefficient if the equicorrelation hypothesis is not true
    - In particular, under AR(1) processes
  - Inconsistent if the FE model is correct, since then $\alpha_i$ is correlated with $x_{it}$

RE Discussion

- Most disciplines in applied statistics other than microeconometrics treat any unobserved individual heterogeneity as being distributed independently of the regressors
  - Then the effects are random effects
  - or rather: purely random effects
- Compared to FE models, this stronger assumption has the advantage of permitting consistent estimation of all parameters
  - Including the coefficients of time-invariant regressors
  - However, RE and Pooled OLS are inconsistent if the true model is FE
- Economists often view the assumptions of the RE model as being unsupported by the data

Identification of the Individual-Specific Effects

- In $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$, the individual effect is a random variable (random coefficient) in both the fixed and the random effects models
  - Both models assume that $E[y_{it} \mid \alpha_i, x_{it}] = \alpha_i + x_{it}'\beta$
  - $\alpha_i$ is unknown and cannot be consistently estimated
    - Unless $T \to \infty$
  - So we cannot estimate $E[y_{it} \mid \alpha_i, x_{it}]$
    - Contrary to what we usually do with OLS
    - That is reasonable, as $\alpha_i$ includes unobserved individual characteristics
    - Possibly with a non-zero mean
- But take the expectation conditional on $x_{it}$ only:

  $$E[y_{it} \mid x_{it}] = E[\alpha_i \mid x_{it}] + x_{it}'\beta$$

  - That is, what is the (conditional) expected value of $\alpha_i$?
  - FE and RE have different takes on this expectation

- RE: it is assumed that $E[\alpha_i \mid x_{it}] = \alpha$, so $E[y_{it} \mid x_{it}] = \alpha + x_{it}'\beta$
  - Hence $E[y_{it} \mid x_{it}]$ is identified
    - Since we consistently estimate a single intercept as $NT \to \infty$
  - But the key RE assumption that $E[\alpha_i \mid x_{it}]$ is constant across $i$ might not hold in many microeconometric applications
- FE: $E[\alpha_i \mid x_{it}]$ varies with $x_{it}$, and it is not known how it varies
  - So we cannot identify $E[y_{it} \mid x_{it}]$
  - Nonetheless, the Within and First-Differences estimators consistently estimate $\beta$ with short panels
  - Thus they identify the marginal effect $\beta = \partial E[y_{it} \mid \alpha_i, x_{it}] / \partial x_{it}$
    - e.g. identify the effect on earnings of 1 additional year of schooling
  - But only for time-varying regressors
    - so the marginal effect of race or gender, for example, is not identified
  - And not the expected individual $y_{it}$, as we do not know the individual effect $\alpha_i$

Random Effects vs. Fixed Effects

- The two models have different focuses
- RE
  - Time-series structure
  - Efficiency
- FE
  - Endogeneity of unobserved heterogeneity
  - Consistency

Summary Models & Estimators

Table: Linear Panel Model: Common Estimators and Models

                                          Model
Estimator of β               Pooled (1)   Random Effects (3) & (5)   Fixed Effects (3)
Pooled OLS (1)               Consistent   Consistent                 Inconsistent
Between (7)                  Consistent   Consistent                 Inconsistent
Within (Fixed Effects) (8)   Consistent   Consistent                 Consistent
First Differences (9)        Consistent   Consistent                 Consistent
Random Effects (10)          Consistent   Consistent                 Inconsistent

This table considers only the consistency of estimators of β. For the correct computation of standard errors, see the next section.

The only fully efficient estimator is RE under the RE model.

Hours & Wages Example: Effect of Wage on Labor Supply

- Labor economics: responsiveness of labor supply to wages
- The standard textbook model of labor supply suggests that, for people already working, the effect of a wage increase on labor supply is ambiguous
  - An income effect pushing in the direction of less work offsets a (leisure-) substitution effect in the direction of more work

Cross-Section & Panel

- Cross-section analysis for adult males finds a relatively small positive response of hours worked to wages
  - However, it is possible that this association is spurious
  - Reflecting a greater unobserved desire to work being positively associated with higher wages
    - e.g. those who like to work get better or faster promotion (or similar): that is $\alpha_i$
- Panel data analysis can control for this
  - Under the assumption that the unobserved desire to work is time-invariant
  - For example, Within: measuring the extent to which an individual works above-average hours in periods with above-average wages

Data

- Data on 532 males for each of the 10 years from 1979 to 1988
  - File provided on Cameron & Trivedi's website
  - mom9.dta file on the course page
  - Balanced panel: don't do that in your report
- From the Panel Study of Income Dynamics (PSID)
  - Used by Ziliak (1997)
- 5 320 observations; sample means of lnhrs and lnwg: 7.66 and 2.61, respectively
  - Geometric means of 2 120 hours and $13.60 per hour (geometric means, since these are exponentials of means of logs)
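
The geometric means quoted above follow from exponentiating the sample means of the logs. A quick check, using only the two rounded means reported on the slide (so the result matches the slide's figures only up to rounding):

```python
from math import exp

mean_lnhrs = 7.66  # sample mean of log annual hours, as reported above
mean_lnwg = 2.61   # sample mean of log hourly wage, as reported above

# exp(mean of logs) is the geometric mean of the underlying variable.
geo_hours = exp(mean_lnhrs)
geo_wage = exp(mean_lnwg)
print(round(geo_hours), round(geo_wage, 2))
```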

Panel Study of Income Dynamics

- Begun in 1968, the PSID is a longitudinal study of a representative sample of U.S. individuals (men, women, and children)
  - And of the family units in which they reside
  - Dynamic aspects of economic and demographic behavior
- Low attrition rates, success in following young adults as they form their own families, and recontact efforts (of those declining an interview in prior years)
  - Sample size has grown from 4,800 families in 1968 to more than 7,000 families in 2001
- At the conclusion of the 2003 data collection, the PSID had collected information about more than 65,000 individuals, spanning as much as 36 years of their lives

Model

$$\ln hrs_{it} = \alpha_i + \beta \ln wg_{it} + \varepsilon_{it}$$

- $\ln hrs$: natural logarithm of annual hours worked
- Single explanatory variable: $\ln wg$, natural log of the hourly wage
- Individual-specific effect $\alpha_i$
  - Unobserved individual time-invariant characteristics
  - e.g. education, abilities
- $\beta$ measures the wage elasticity of labor supply
- $\varepsilon_{it}$ is assumed to be independent over $i$, but may be correlated over $t$ for given $i$

Model

- Ziliak (1997) additionally included age², number of children, an indicator for bad health, and year dummies
  - This makes a small difference to the estimate of $\beta$ and its standard error
  - For simplicity, they are omitted here
- Ch. 2: more general models
  - Endogenous $\ln wg$
    - FE: $\alpha_i$ correlated with $\ln wg_{it}$
    - Endogeneity: $\varepsilon_{it}$ correlated with $\ln wg_{it}$
  - Lags of $\ln hrs$ as regressors
    - If you work more, you will earn a higher hourly wage

Stata

- Load the data
  - .dta: double-click
  - limited import capacity: .csv
- Declare the dataset to be panel
  - Menu: longitudinal / panel data
  - ID = i
  - Time = t

Obtaining the Results in Stata 1/2

- Pooled OLS: $\hat\alpha$, $\hat\beta$ and other statistics are directly in the output
- The Between model (OLS regression on the averages per individual) is obtained similarly to POLS
- Within model (= fixed effects)
  - Individual $\hat\alpha_i$ estimates can be recovered after estimation (they are not consistent)
- Stata presents an intercept in the FE estimates
  - Rewriting model (3) $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$ as $y_{it} = \alpha + \alpha_i + x_{it}'\beta + \varepsilon_{it}$ leads to perfect multicollinearity
  - We need to normalize
    - In theory, we chose $\alpha = 0$ for simplicity
    - Instead, Stata has chosen $\sum_i \alpha_i = 0$, because of the analogous assumption in RE: $E(\alpha_i) = 0$
  - In all cases, it has no bearing on the estimates of $\beta$

Obtaining the Results in Stata 2/2

- The First-Differences estimator is not readily available in Stata
  - In my version at least
  - Define the first differences first, then apply POLS
  - Lag 1 period in Stata: `by i: gen lnwgL1 = lnwg[_n-1]` (`_n` indexes observations, `by i` indicates to lag by individual); similarly for lnhrs
  - Then `by i: gen lnwgD1 = lnwg-lnwgL1` for the first difference
- RE: 2 versions
  - GLS (OLS estimation of the transformed model, as seen in the section on estimators)
  - ML (not detailed here)

Linear Panel Data Estimates

        POLS   Between   Within   First Diff   RE-GLS   RE-MLE
α       7.44   7.48      7.22     .001         7.35     7.35
β       .083   .067      .168     .109         .119     .12
λ̂       .000   –         .624     –            .585     .586
N       5320   532       5320     4788         5320     5320

- $\hat\lambda$ is the one from the RE estimator (10)

  $$y_{it} - \hat\lambda \bar y_i = (1 - \hat\lambda)\mu + (x_{it} - \hat\lambda \bar x_i)'\beta + \nu_{it}$$

  - It can be inferred with some other estimators

Slope Parameter Estimates

- The estimates of the slope parameter $\beta$ differ across the different estimation methods
- The between estimate, which uses only cross-section variation, is less than the pooled OLS estimate
- The within (= fixed effects) estimate of 0.168 is much higher than the pooled OLS estimate of 0.083
- The first-differences estimate of 0.109 is also higher than that of pooled OLS
  - but is considerably less than the within estimate
- The RE estimates of 0.119 or 0.120 lie between the between and within estimates
  - This is expected, as RE estimates can be shown to be a weighted average of the between and within estimates
  - The two RE estimates are very close to each other

Which estimates are preferred?

- The within and first-differences estimators are consistent under all models (pooled, RE, and FE)
  - The other estimators are inconsistent under the FE model
- The most robust estimates are therefore the within or first-differences estimates of 0.168 or 0.109
- Efficiency loss in using these more robust estimators: next section
- Hausman test (following the next section): whether or not the FE model is appropriate
  - It turns out that the Hausman test rejects the null hypothesis of RE
  - That seems natural because of the large difference between the coefficient estimates

First-difference vs. Fixed-effects

- Both are consistent under all models (pooled, RE, and FE)
- If $T = 2$ they are identical
- If $u_{it}$ has no serial correlation, FE is in principle better
  - Because it does not throw away one period of data
- If $u_{it}$ is a random walk, FD is in principle better
  - Because it transforms the series to order 0
- If there is correlation between $x_{it}$ and $u_{it}$ (endogeneity)
  - Both FE and FD become inconsistent
- Testing is complicated
  - More details require introducing time-series issues
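
The claim that the two estimators coincide when $T = 2$ can be verified numerically. With two periods, each demeaned observation equals plus or minus half the first difference, so both estimators reduce to $\sum_i \Delta x_i \Delta y_i / \sum_i \Delta x_i^2$. The sketch below uses a hypothetical two-period toy panel and illustrative function names.

```python
from statistics import mean

# Hypothetical two-period panel: individual id -> [(x_i1, y_i1), (x_i2, y_i2)].
panel = {
    1: [(1.0, 2.0), (3.0, 5.0)],
    2: [(2.0, 1.0), (2.5, 2.0)],
    3: [(0.0, 4.0), (4.0, 6.0)],
}


def within_beta(panel):
    """Slope from OLS on individually demeaned data (scalar x)."""
    sxy = sxx = 0.0
    for obs in panel.values():
        xbar = mean(x for x, _ in obs)
        ybar = mean(y for _, y in obs)
        sxy += sum((x - xbar) * (y - ybar) for x, y in obs)
        sxx += sum((x - xbar) ** 2 for x, _ in obs)
    return sxy / sxx


def fd_beta(panel):
    """Slope from OLS on first differences, no intercept (scalar x)."""
    sxy = sxx = 0.0
    for obs in panel.values():
        for (x0, y0), (x1, y1) in zip(obs, obs[1:]):
            sxy += (x1 - x0) * (y1 - y0)
            sxx += (x1 - x0) ** 2
    return sxy / sxx


b_within = within_beta(panel)
b_fd = fd_beta(panel)
print(b_within, b_fd)
```

With $T > 2$ the two functions generally return different values, as in the hours and wages estimates (0.168 versus 0.109).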

Panel-Robust Statistical Inference

- The various panel models include error terms: $u_{it}$, $\varepsilon_{it}$, $\alpha_i$
- In many microeconometric applications:
  - It is reasonable to assume independence over $i$
- The errors are potentially
  1. serially correlated (correlated over $t$ for given $i$)
  2. heteroskedastic (at least across $i$)
- Valid statistical inference requires controlling for both of these factors

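Block-diagonal error covariance matrix with individual-specific idiosyncratic variances $\sigma^2_{\varepsilon_i}$:

$$\Sigma = \begin{pmatrix} \Sigma_1 & 0 & \cdots & 0 \\ 0 & \Sigma_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \Sigma_N \end{pmatrix}, \qquad \Sigma_i = \begin{pmatrix} \sigma^2_\alpha + \sigma^2_{\varepsilon_i} & \sigma^2_\alpha & \cdots & \sigma^2_\alpha \\ \sigma^2_\alpha & \sigma^2_\alpha + \sigma^2_{\varepsilon_i} & \ddots & \vdots \\ \vdots & \ddots & \ddots & \sigma^2_\alpha \\ \sigma^2_\alpha & \cdots & \sigma^2_\alpha & \sigma^2_\alpha + \sigma^2_{\varepsilon_i} \end{pmatrix}$$

- The White heteroskedasticity-consistent estimator can be extended to short panels
  - since for the $i$-th individual the error variance matrix $\Sigma_i$ is of finite dimension $T \times T$, while $N \to \infty$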

Reminder: The White heteroskedastic-consistent estimator

- Classical linear model $y = X\beta + \varepsilon$ with $E(\varepsilon\varepsilon') = \Sigma \neq \sigma^2 I$
  - OLS is unbiased and consistent
  - $\mathrm{Var}(\hat\beta_{OLS}) = (X'X)^{-1} X'\Sigma X (X'X)^{-1} \neq \sigma^2 (X'X)^{-1}$
- For pure heteroskedasticity, White (1980) shows that

  $$S = \frac{1}{N} \sum_{i=1}^{N} \hat\varepsilon_i^2 x_i x_i'$$

  - where $\hat\varepsilon_i$ is the OLS residual
  - is a consistent estimate of $N^{-1} X'\Sigma X$ under general conditions
- The formula can be extended to autocorrelation
  - But autocorrelation often reveals time-series properties
  - That need to be investigated in more detail

Panel-Robust Statistical Inference

- Panel-robust standard errors can thus be obtained
  - without assuming specific functional forms for within-individual error correlation or heteroskedasticity
- So we use inefficient estimators
  - but at least we get their variance right
  - Only the RE estimator in the RE model is efficient
  - More efficient estimators using GMM: Ch. 2
- The panel commands in many computer packages calculate default standard errors assuming iid errors
  - This leads to erroneous inference
  - Ignoring the error correlation can lead to underestimated standard errors and overestimated t-statistics
- FE or RE tend to reduce the serial correlation in the errors, but do not eliminate it

Derivation of the White heteroskedastic-consistent estimator

- Rewrite the panel estimators as OLS estimation of $\theta$ in

  $$\tilde y_{it} = \tilde w_{it}'\theta + \tilde u_{it} \qquad (12)$$

- $\tilde y_{it}$ is a known function of only $y_{i1}, \dots, y_{iT}$; similarly for $\tilde w_{it}$ and $w_{it}' = [1 \; x_{it}']$, and for $\tilde u_{it}$ and $u_{it}$
  - Pooled OLS: no transformation, $\theta = [\alpha \; \beta']'$
  - Within: $\tilde y_{it} = y_{it} - \bar y_i$, $\tilde w_{it} = x_{it} - \bar x_i$ (only time-varying regressors)
    - $\theta$: coefficients of the time-varying regressors
  - ...
- !! Such transformations will induce serial correlation even if the underlying errors are uncorrelated !!

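Notation

- Stack the observations over time periods for a given individual:
  - $\tilde y_i = \tilde W_i \theta + \tilde u_i$, where
  - $\tilde y_i$: $T \times 1$
    - for the first-differences model, $(T-1) \times 1$
  - $\tilde W_i$: $T \times q$
- OLS estimator

  $$\hat\theta_{OLS} = \left[ \sum_{i=1}^{N} \tilde W_i' \tilde W_i \right]^{-1} \sum_{i=1}^{N} \tilde W_i' \tilde y_i$$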

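OLS Variance

- The asymptotic variance of $\hat\theta_{OLS}$ is

  $$V[\hat\theta_{OLS}] = \left[ \sum_{i=1}^{N} \tilde W_i' \tilde W_i \right]^{-1} \sum_{i=1}^{N} \tilde W_i' \, E[\tilde u_i \tilde u_i' \mid \tilde W_i] \, \tilde W_i \left[ \sum_{i=1}^{N} \tilde W_i' \tilde W_i \right]^{-1}$$

- = variance of the OLS estimates of the transformed (tilde) model
  - We need a consistent estimate of it to make classical inference, e.g. a t-test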

Panel-Robust "Sandwich" Variance

- Consistent estimation of $V[\hat\theta_{OLS}]$ in this panel setting
  - Analogous to the cross-section problem of obtaining a consistent estimate of $V[\hat\theta_{OLS}]$ that is robust to heteroskedasticity of unknown form
  - The complication is the vector $\tilde u_i$ rather than a scalar $u_i$
- Panel-robust estimate of $V[\hat\theta_{OLS}]$
  - Controlling for both serial correlation and heteroskedasticity

  $$\hat V[\hat\theta_{OLS}] = \left[ \sum_{i=1}^{N} \tilde W_i' \tilde W_i \right]^{-1} \sum_{i=1}^{N} \tilde W_i' \hat{\tilde u}_i \hat{\tilde u}_i' \tilde W_i \left[ \sum_{i=1}^{N} \tilde W_i' \tilde W_i \right]^{-1} \qquad (13)$$

  where $\hat{\tilde u}_i = \tilde y_i - \tilde W_i \hat\theta$

Panel-Robust "Sandwich" Variance

- Estimator (13) assumes independence over $i$ and $N \to \infty$
  - but permits $V[u_{it}]$ and $\mathrm{Cov}[u_{it}, u_{is}]$ to vary with $i$, $t$, and $s$
  - the case for short panels
- Panel-robust standard errors based on (13) can be computed with a regular OLS command
  - if the command has a cluster-robust standard error option
  - as in Stata: cluster on the individual $i$
- Common error: estimating $\tilde y_{it} = \tilde w_{it}'\theta + \tilde u_{it}$ by OLS using the standard robust standard error option
  - This only adjusts for heteroskedasticity
  - In practice, in a panel, it is more important to correct for serial correlation
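
To see the difference between the two options, the sketch below computes, for a single transformed regressor, both the cluster-robust variance in the spirit of (13) and the heteroskedasticity-only variance. The data are hypothetical and already transformed, the function name is illustrative, and no finite-sample correction is applied (software packages typically add one, so their numbers will differ).

```python
# Hypothetical, already-transformed data: individual id -> list of (w_it, y_it).
blocks = {
    1: [(-1.0, -1.2), (1.0, 1.2)],
    2: [(-2.0, -1.0), (2.0, 1.0)],
}


def panel_variances(blocks):
    """Return (theta, cluster-robust variance, heteroskedasticity-only variance)."""
    sww = sum(w * w for obs in blocks.values() for w, _ in obs)
    swy = sum(w * y for obs in blocks.values() for w, y in obs)
    theta = swy / sww
    meat_cluster = 0.0
    meat_het = 0.0
    for obs in blocks.values():
        # Score contributions w_it * residual_it for this individual.
        scores = [w * (y - w * theta) for w, y in obs]
        meat_cluster += sum(scores) ** 2           # sums within i before squaring
        meat_het += sum(s * s for s in scores)     # treats every (i, t) as independent
    return theta, meat_cluster / sww ** 2, meat_het / sww ** 2


theta_hat, v_cluster, v_het = panel_variances(blocks)
print(theta_hat, v_cluster, v_het)
```

In this toy example the scores are positively correlated within each individual, so the heteroskedasticity-only variance is half the cluster-robust one.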

Reminder: The Bootstrap

- Bootstrap hypothesis: if we could resample the population under the same conditions, we would observe something similar to a resampling with replacement of the observed sample
  - "Mediocrity principle"
  - Not the same as representativity, as our sample might not be representative
- Principle
  - Sample the current sample of size $n$ with replacement
    - make $n$ draws, each with probability $1/n$
    - Called "bootstrap pairs", as both $y$ and $X$ are sampled
  - Replicate that process $B$ times: $B$ different pseudo-samples
  - For each pseudo-sample $\langle Y_b, X_b \rangle$: one vector $\hat\theta_b$

Reminder: The Bootstrap 2

- To construct a confidence interval for one element $\theta_k$ of $\theta$
  - We have $B$ estimates $\hat\theta_{kb}$
  - Take $B = 10\,000$ and order those estimates from smallest to largest
  - Then estimates number 250 and 9750 are the lower and upper bounds, respectively, of the 95% confidence interval

Why is that interesting?

1. No distributional hypothesis
   1.1 Although there must be no correlation between observations
   1.2 Therefore, in panels, resampling is on $i$ only, using all $t$ for each $i$ in the new sample
2. Confidence intervals can be calculated
   2.1 for any function of the estimated parameters, including non-linear ones
   2.2 for parameters estimated from models without exact finite-sample properties
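
The resampling scheme described above (draw whole individuals with replacement, keep all their periods, re-estimate, take percentiles) can be sketched as follows. The panel is simulated under assumed parameters (30 individuals, 4 periods, true slope 0.5), the within estimator stands in for any panel estimator, and $B = 1000$ is used instead of 10 000 to keep the run short, so the 95% bounds are the 25th and 975th ordered estimates.

```python
import random
from statistics import mean

rng = random.Random(0)

# Hypothetical simulated panel: one list of (x_it, y_it) per individual.
panel = []
for _ in range(30):
    a = rng.gauss(0, 1)                    # individual effect, correlated with x
    obs = []
    for _ in range(4):
        x = a + rng.gauss(0, 1)
        y = a + 0.5 * x + rng.gauss(0, 0.3)
        obs.append((x, y))
    panel.append(obs)


def within_beta(blocks):
    """Within (fixed effects) slope for a scalar regressor."""
    sxy = sxx = 0.0
    for obs in blocks:
        xb = mean(x for x, _ in obs)
        yb = mean(y for _, y in obs)
        sxy += sum((x - xb) * (y - yb) for x, y in obs)
        sxx += sum((x - xb) ** 2 for x, _ in obs)
    return sxy / sxx


B = 1000
# Each pseudo-sample draws N individuals with replacement, keeping all their periods.
draws = sorted(
    within_beta([rng.choice(panel) for _ in panel]) for _ in range(B)
)
lower, upper = draws[24], draws[974]       # 25th and 975th ordered estimates
print(lower, upper)
```

Resampling individual rows $(i, t)$ instead of whole individuals would break the within-individual correlation that the procedure is meant to preserve.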

Panel Bootstrap Variance

- For each of the $B$ pseudo-samples: OLS of $\tilde y_{it}$ on $\tilde w_{it}$
  - $B$ estimates $\hat\theta_b$, $b = 1, \dots, B$
- Panel bootstrap "empirical" estimate of the variance matrix:

  $$\hat V_{Boot}[\hat\theta] = \frac{1}{B-1} \sum_{b=1}^{B} (\hat\theta_b - \bar{\hat\theta})(\hat\theta_b - \bar{\hat\theta})' \qquad (14)$$

  where $\bar{\hat\theta} = B^{-1} \sum_b \hat\theta_b$
- May be slow: see e.g. Cameron & Trivedi
- Given independence over $i$
  - Consistent as $N \to \infty$
  - Asymptotically equivalent to the panel-robust "sandwich"
    - For any form of heteroskedasticity or autocorrelation (as with White)
  - Can be applied to any panel estimator that can be written in the transformed (tilde) form
