• Aucun résultat trouvé

Panel data estimation of the intergenerational correlation of incomes

N/A
N/A
Protected

Academic year: 2022

Partager "Panel data estimation of the intergenerational correlation of incomes"

Copied!
28
0
0

Texte intégral

(1)

Book Chapter

Reference

Panel data estimation of the intergenerational correlation of incomes

ABUL NAGA, Ramses, KRISHNAKUMAR, Jaya

ABUL NAGA, Ramses, KRISHNAKUMAR, Jaya. Panel data estimation of the intergenerational correlation of incomes. In: Jaya Krishnakumar, Elvezio Ronchetti. Panel data econometrics : future directions : papers in honour of professor Pietro Balestra . Amsterdam :

North-Holland, 2000. p. 85-106

Available at:

http://archive-ouverte.unige.ch/unige:42045

Disclaimer: layout of this document may differ from the published version.

(2)

Intergenerational Correlation of Incomes 1

by

Ramses Abul Naga 2

and Jaya Krishnakumar 3

July 1999

Abstract

Weconsidertheproblemofestimatingtheintergenerationalcorrelationof

incomes inthe context of apanel data frameworkwith measurement errors.

We present single equation estimation methods as well as system methods

under various assumptions regarding the serial correlation of the error term

and taking into account possible correlations among children of the same

family. ApplicationtoasampleofUSparentsandchildrenleadstoestimates

of the orderof 0.42 to0.60 for the coeÆcient of income transmission.

Keywords: Intergenerational mobility,panel data, errors invariables

JEL Classication Codes: I3, J6,C3

1

Wewouldliketo thankPietro Balestra,ChristianSchluterandDirk Vandegaerfor

theircarefulreadingandvaluablecommentsonanearlierversionofthepaper.Suggestions

from the participants of the Economics Seminar at the University of St.Gallen are also

acknowledged.

2

DEEPandIEMS,UniversityofLausanne,Switzerland

3

DepartmentofEconometrics,UniversityofGeneva,Switzerland

(3)

Ever since President Johnson's declaration of the "War on Poverty" in the

United Statesinthe year 1964,economists haveincreasinglyturnedtheirat-

tention and eort tounderstanding the mechanismsof incometransmission.

There are several importantequity issues underlyingthe generalquestion of

the intensity of intergenerationalinheritance of income status.

Equityconcerns canbeformulatedalongseveral lines. Onequestionthat

oftenarises concerns the extent towhich childrenraised inpoorfamiliesare

themselveslikelytoendupinthelowerendofthedistributionofincomedur-

ing their adult lives (Atkinson et al.(1983), ch. 1). The continuity problem

in low incomes is often also formulated by contrasting the likelihoods that

childrenraisedindierentsocialenvironmentsface, ofendingup inpoverty.

Economists and social scientists also seek to assess the separate contribu-

tions ofenvironment-relatedfactorsand geneticallytransmittedtraitsinthe

overall transmission of income status ( the famous 'nature versus nurture'

debate).

In immigration countries such as the United States, another question

arises about the rate at which newcomers to the host country are likely

to be assimilatedin terms of economic and educational attainment (Stokey

(1996)). To answer such questions, one needs to make progress both at the

theory level, i.e. in the formulation of models of income transmission and

the dynamicsof wealthdistribution (e.g. Galorand Zeira(1993)),aswellas

in the directionof the empirical analysis of longitudinalincome data.

Theestimationofthe intergenerationalcorrelationofincomesisapartic-

ularly complex task, since the variables of interests, namely the permanent

incomes of parents and children, are typically unobservable quantities. In-

stead, the researcher usually possesses a short time-series (in the order of 3

or 4 observations)on some indicatorof economic attainment such as hourly

earnings, or a family income measure. The resulting measurement error

problem has been shown by previous researchers in the area to render the

OLS estimator biased towards zero (see e.g. Bowles (1972), Atkinson et al.

(1983), Behrman and Taubman(1990)).

Of the various approaches proposed in order to deal with this mea-

surement error problem, two methods have received considerable attention.

(4)

planatoryvariable)overtheseveral yearsofavailabledatainordertoreduce

theratiooftransitorytototalincomevariance,and(ii)theuseofoutofequa-

tion instruments(typicallythe father's education)for theincome pertaining

to the parents' generation (cf. for instance Solon(1992)).

The purpose of the present study is to estimate the intergenerational

correlation ofincomeswithinapaneldata framework. Itis rathersurprising

that noexplicit treatment of this problem has been considered in the panel

data perspective 4

,sincemostrecentstudies(atleast theUSones) have been

based on samplesextracted fromlongitudinal earningssurveys 5

.

There are indeed several good reasons for considering this estimation

problem within a panel data framework. Firstly, as shown by Griliches and

Hausman (1986), Hsiao and Taylor (1991) and others, a wide range of er-

rors in variables models may be identied without having to appeal to out

of equation instruments. Secondly, by controlling for unobserved popula-

tion heterogeneity,paneldata estimatorswillusuallyhavemoreexplanatory

power than their cross-section counterparts. Finally, because we have re-

peatedmeasurementsonthe incomesof parentsand children, weare able to

estimate our various models under alternative scenarios regarding the serial

correlation in the transitorycomponents of incomeand test these scenarios.

This latter point extends beyond eÆciency considerations, since the choice

of instruments itselfis dictated by the assumptions regarding the serial cor-

relation inerrors of measurements.

Our study provides a discussion of two further issues: the pooling of

instrumental variable regressions from the various cross-sections into a sys-

tem (of equations) estimator, and the consequences of allowing for multiple

children from the same parent family (entailing cross-sectional correlations

across observations) for eÆcient estimation. When applied to a US sample

of parents and children, the various models and estimators considered here

provide estimates in the rangeof 0.42 to0.60.

The structure of the paper isthe following: section2 discusses the data,

section 3 deals with inference issues, section 4 contains the empiricalappli-

4

ThoughsomeinsightsintothisproblemareprovidedinZimmerman(1992).

5

See for instance Behrman and Taubman (1990), Altonji and Dunn (1991), Solon

(1992),Zimmerman(1992)andBjrklundandJantti(1997)

(5)

research.

2 Data

The empirical applications contained in this study draw evidence from the

panel studyof incomedynamics(PSID).The PSIDis aUSlongitudinalsur-

vey of5000 familieswhich hasbeen runningannually since1968. A detailed

description of the history of the panel, and its main features, can be found

inHill (1993). Our sample wasextracted from the mergedfamily-individual

tapeofwaveXXI(1968-1988)ofthe PSID.Asummaryofthecharacteristics

of our sample can be found in Abul Naga (1998). Here we discuss three

sampling aspects of our data which entail direct consequences for the esti-

mation strategies to be pursued in section3 below. These are (i)the extent

to which our data meet the standard denition of a panel, (ii) the problem

of measuring long-run economic status, or permanent income, and (iii) the

presence of multiplechildren from the same parent family. We take up each

of these pointsin turn.

We have a sample of 1426 parent-child pairs. Incomes of parents and

children are observed respectively over the ve year intervals 1967-71 and

1983-87. On the one hand data ondependent and explanatory variables are

not contemporaneously observed, a feature which is not typicalof the stan-

dard framework adopted in panel data analysis (for instance Hsiao (1986),

Chap.3). On the other hand, the incomedata have anatural time ordering,

which is essential for modelling and testing for serial correlation in the dis-

turbances, and alsofor selectingvalidinstruments for explanatoryvariables

(see below). As willbecome apparent in the next section, the estimators we

adopt do not require incomes of parents and children to be contemporane-

ously observed. The estimators will be shown to have obvious panel data

interpretations,but more generally, they may alsobeconsidered within the

framework of multiple(simultaneous) equations systems.

Our main income concept is the total annual family income normalized

by the family's needs, and deated to 1967 dollars (the rst year of the

panel). The needs standard is the oÆcial Orshansky scale which is used

in the US in order to quantify the extent of poverty. The price deator

used is the consumer price index. Such income measures may be treated

(6)

(1986), as random draws from a variable with a population mean equal to

the household's permanent income. Thus, in section 3, we propose several

instrumental variables procedures in order to deal with the measurement

error problem inherentin our data.

Astheincomeconcept wehavechosen touseisbased onthe totalfamily

income,we did not nd itnecessary torestrictour sample tofather and son

pairs. Thus, for a daughter froman initialsamplefamily, the income of her

household is matched tothat of the parents' whether or not she happens to

bethe main incomeearner. We have alsoallowed formultiple children from

the same parent family. We note that because of unobserved family back-

ground components, observations on siblings are expected to be correlated.

Under such circumstances, eÆcient estimation will require the adoption of

generalized least squares methods, which are also discussed in some length

in the followingsection.

Table 1 : Summary Statistics

year Parents Children

mean coef.variation mean coef.variation

1 1.73 0.80 2.86 0.70

2 1.88 0.85 3.00 0.71

3 2.00 0.84 3.05 0.76

4 2.15 0.83 3.15 0.77

5 2.31 0.83 3.25 0.83

1. TheresourceconceptisthefamilyincomenormalizedbytheOrshanskyneedsstandard.

2. Therstobservationyear(t=1)pertainsto1967forparents,andto1983forchildren.

Table1 providessome summarystatisticsforthe incomesof parents and

children. Parents appear tobe onthe wholepoorer than the US population

in the late 1960s. This may be explained by noting that approximately a

half of the parents originate from the SEO le (a sample of families whose

(7)

havefound thatdisadvantagedgroupsweremore likelytoexitthe paneland

that on the whole, the PSID has evolved towards a sample of middle class

families. This nding may help in accounting for the seemingly lower level

of income dispersion experienced by the sample of children, as well as the

growth in their average income.

3 The Model and the Estimation Methods

We are interested in estimating a relationshipof the form

c;i

=

p;i +

i

(1)

where

c;i

representsthe logarithmofthe permanentincomeofthe child,

p;i

the logarithmof the the permanent income of the parents and

i

arandom

disturbance term specic tothe individual. If correctmeasurements of both

the variablesare available,then a consistent estimationof would begiven

byOLS.Howeverthevariablesinquestionarenotdirectlyobservableandthe

only information one generally has are the actual income of the parent and

the childat dierent pointsintime. Obviouslythese snapshots docontain a

component equaltothe permanent incomebut alsoinclude othertransitory

components. Hence these observations can be thought of as measurements

of thepermanentincomesubjecttoerror. Denotingtheobserved incomesas

y

it and x

it

, we can write:

y

it

=

c;i +u

it

(2)

x

it

=

p;i +v

it

(3)

whereu

it andv

it

areuncorrelatedrandomerrortermswithzeromeans. Thus

the relationshipbetween the observables is

y

it

=x

it +

i +(u

it v

it

) (4)

or

y

it

=x

it +

i +"

it

(5)

with

"

it

=u

it v

it

(6)

and

(8)

it i c it p it it it

Let us note that if only the dependent variable is measured with error

(

c;i

in our case), OLS would still consistently estimate , whereas if the

explanatoryvariablehasameasurementerrorthenOLSproducesinconsistent

estimates(cf. Greene(1993)e.g.). Thougharemedytothisproblemisgiven

by the instrumentalvariable(IV) technique, inthe caseof purecrosssection

data, oneneedstoresorttoextraneous instrumentsand this maypose areal

practical diÆculty. However there is an easier way out in the case of panel

data andGrilichesand Hausman(1986)wereamongstthe rstonestopoint

out thatthemodelcanbeidentiedwithouttheuse ofexternalinformation.

SeealsoHsiaoandTaylor(1991),WansbeekandKoning(1991),Biorn(1998)

in this regard. Indeed, the observations onthe explanatory variablerelating

to dierent time periods provide appropriate instruments for the current

period, the gap to be chosen depending on the nature of serial correlation

assumed for the errors.

Ingeneral,twoapproaches havebeen proposed inthe literatureforpanel

datamodelswithmeasurementerrors. Eithertheequationcanberstdier-

enced (toeliminatethe specic eect

i

) andthe instrumentsused inlevels,

or the equation is kept in levels and the instruments dierenced (see Biorn

(1998)).

For our model, special care needs to be taken while choosing the in-

struments as the true unobserved variable is time invariant and hence any

dierencing operationover time willleave us with onlythe dierence of the

corresponding v

it

'sand any \instrument"correlated withthis dierence will

necessarilybecorrelatedwiththesamedierenceappearingintheerrorterm.

For instance, uponrst dierencing we get

y

it y

i;t 1

=(x

it x

i;t 1 )+"

it

"

i;t 1

(8)

But from (3) we have

x

it x

i;t 1

=v

it v

i;t 1

(9)

which also appears in "

it

"

i;t 1

. Thus any variable correlated with (x

it

x

i;t 1

)(condition tobesatised by a validinstrument)willalwaysbecorre-

lated with ("

it

"

i;t 1 ).

(9)

case and we will therefore only consider estimating the equation in levels.

For the same reasons, dierencing of instruments is alsoinappropriate.

Theset ofvalidinstrumentsvaries accordingtothetypeofserial correla-

tionassumedfor"

it

. Letusdistinguishtwocases,bothassumingstationarity:

(i) iiderrors and (ii)MA(1) errors. More generalMA processes will alsobe

considered in the application section. The case of AR(1) errors oers no

immediate solution and hencewill not be pursued here.

Case 1: iid "

it 's

Assumption A1:

"

it

iid(0;

2

"

);

i

iid(0;

2

); E(

j

"

it

)=0 8i;j;t

In terms of the individual componentsthe assumptions would be

E(u

it u

js )=Æ

ij Æ

st

2

u

8i;j;s;t E(v

it v

js )=Æ

ij Æ

st

2

v

8i;j;s;t

E(

j u

it

)=0 8i;j;t E(

j v

it

)=0 8i;j;t

where Æ

ij

is the Kronecker delta. However it should be noted that only

the overall variance 2

"

= 2

u +

2

2

v

can beidentied.

Inthiscase,foranyperiodt,alltheobservationsatleastoneperiodapart

are valid instruments. Let y

t

;x

t

; and "

t

denote (N 1) vectors obtained

by piling all the individuals for period t, and dene r

t

= +"

t

. So for the

period1 equationwritten as

y

1

=x

1

++"

1 x

1 +r

1

(10)

z

1

= [x

2 x

3

::: x

T

] is a validinstrument matrix, assuming that we have

observationsfor T time periods. Furthermore,

V(r

1

)=V( )+V("

1 )=

2

I

N +

2

"

I

N

2

r I

N

Giventhis, twomethodsarepossible: applyingtheIV techniqueequationby

equation orcombining all periods into a system and then applying IV with

the whole set of instruments for all the periods together. Letting z

t

denote

the instrument set for the t-th periodequation, the rst methodleads to

^

(t)

=[x 0

t z

t (z

0

t z

t )

1

z 0

t x

t ]

1

x 0

t z

t (z

0

t z

t )

1

z 0

t y

t

(11)

(10)

as V(r

t )=

r I

N .

The system estimatorwillbe obtained as follows combining allt's:

y

1

= x

1

++"

1

:

:

:

y

T

= x

T

++"

T

or using appropriatenotations

y=X+(

T I

N

)+"X+r (12)

where

T

is a vector of ones, and y;X and r are (NT 1) vectors. The

resulting system variance covariancematrix isgiven by

V(r)= 2

(

T

0

T I

N )+

2

"

I

NT

Writing the whole set of instrumentsas

Z = 0

B

B

B

@ z

1

0 0

0 z

2 0

: : :

0 0 z

T 1

C

C

C

A

(13)

we willhave

^

(gl s)

=[X 0

Z(Z 0

Z) 1

Z 0

X]

1

X 0

Z(Z 0

Z) 1

Z 0

y (14)

This estimator can be shown to be a weighted average of the \individ-

ual"

^

(t)

'sobtained by (11) for each period,with theweights being inversely

proportional to the corresponding variance matrix. Needless to say that

the same estimatorcan bederived using a Generalised Method of Moments

(GMM)approach(cf. Greene(1993)e.g.) basedonthe orthogonalitycondi-

tions between the instruments and the combined error term 6

. Note that we

could alsowrite a system OLS estimatoras follows:

6

Analternativeestimatoristheso-calledGeneralisedInstrumentalVariablesEstimator

(GIVE)whichisobtainedbyusingtheinstrumentmatrix 1

2

ZandthenapplyingGLSon

themodeltransformedbytheinstrumentmatrix. TherelativeeÆciencyofthisestimator

with respect to the one given in (14) cannot be established in general. We have not

considered the GIVE in our empirical analysis due to the computational complications

involvedinitsimplementation.

(11)

^

(ol s)

=[X Z(ZZ) Z X] XZ(ZZ) Z y (15)

Sofar we have assumed independence across dierent individuals in the

sample. Howeverour sampleof parents andchildrencontains siblings andit

willbemore realisticto assumethat the individualeects of children of the

same parents are correlated. This would implynon zero correlationbetween

the specic eects

i

and

j

if i and j belong to the same family. Let us

takethis intoaccount by means of a variable d

ij

such that

d

ij

= (

1 if i,j are siblings

0 if not

(16)

Let usfurther assume that

E(

i

j )=d

ij c

This modiesour variancematrix of as follows:

V( )= 0

B

B

B

B

B

B

B

B

@

2

d

12 c

::: d

1N c

d

21 c

2

::: d

2N c

: : : :

: : : :

: : : :

d

N1 c

:: :::

2

1

C

C

C

C

C

C

C

C

A

(17)

Then

V(r

t )=

+

2

"

I

N

0

(18)

and

V(r)

=(

T

0

T

)+

2

"

I

NT

= 0

B

B

B

B

B

@

0

:::

: : : :

: : : :

: : : :

:::

0 1

C

C

C

C

C

A

(19)

Thus the single equation estimator of accounting for correlations across

siblings would be given by

^

(t)

=[x 0

t z

t (z

0

t

0 z

t )

1

z 0

t x

t ]

1

x 0

t z

t (z

0

t

0 z

t )

1

z 0

t y

t

(20)

and the system estimatorby

(12)

^

(gl s)

=[X Z(Z Z) Z X] XZ(Z Z) Z y (21)

Nowlet usproceed to the MA(1)case.

Case 2: MA(1) errors

Assumption A2: "

it

=w

it +

"

w

i;t 1

Here the underlyingassumptions onthe componentsare:

u

it

=

it +

u

i;t 1

(22)

v

it

=

it +

v

i;t 1

(23)

but onceagain only the parameters describing the overall error term can be

identied. Letus denote

1

E("

it

"

i;t+1 )

and we know for anMA(1) process

E("

it

"

is

)=0 for js tj2

Theinstrumentset has tobemodiedaccording tothe newassumptions

asobservationsone periodapart arenolonger validinstruments. Theyhave

to be at least two periodsaway. Thus we would have (taking T=5 as is the

case in our study):

~ z

1

= [x

3 x

4 x

5 ]

~ z

2

= [x

4 x

5 ]

~ z

3

= [x

1 x

5 ]

~ z

4

= [x

1 x

2 ]

~ z

5

= [x

1 x

2 x

3 ]

Havingmade these changes and assumingforthe momentthat there are

no siblings, the expression of the single equation estimator

~

(t)

remains the

same ( equation(11) ) where the new instrument z~

t

is to be used instead of

z

t

. In the system estimation procedure, the varianceof the vector r changes

as we now have non zero correlation between r

t

and r

t 1 or r

t+1

. Again

assuming nosiblings,we have

E(r

t r

0

t )=

2

I

N +

2

"

I

N

0

(24)

(13)

E(r

t r

t+1 )=

I

N +

1 I

N

1

(25)

and

E(r

t r

0

s )=

2

I

N

2

for js tj2 (26)

Thus

E(rr 0

)= 0

B

B

B

B

B

B

B

B

@

0

1

2

:::

2

1

0

1

:::

2

: : : : :

: : : : :

: : : : :

2

2

:::

1

0 1

C

C

C

C

C

C

C

C

A

~

(27)

Thus the modied system estimatorwould be written as

~

gl s

=[X 0

~

Z(

~

Z 0

~

~

Z) 1

~

Z 0

X]

1

X 0

~

Z(

~

Z 0

~

~

Z) 1

~

Z 0

y (28)

where

~

Z now contains the new set of instruments.

Let usnow introduce correlations amongsiblings. We get

E(r

t r

0

t

) =

+

2

"

I

N

0

(29)

E(r

t r

0

t+1

) =

+

1 I

N

1

(30)

E(r

t r

0

t+2

) =

2

(31)

Thus

E(rr 0

)= 0

B

B

B

B

B

B

B

B

@

0

1

2

:::

2

1

0

1

:::

2

: : : : :

: : : : :

: : : : :

2

2

:::

1

0 1

C

C

C

C

C

C

C

C

A

~

(32)

Now, the single equation and system estimators taking account of such cor-

relations are given respectively by

~

(t)

=[x 0

t

~ z

t (~z

0

t

0

~ z

t )

1

~ z 0

t x

t ]

1

x 0

t

~ z

t (~z

0

t

0

~ z

t )

1

~ z 0

t y

t

(33)

(14)

~

(gl s)

=[X 0

~

Z(

~

Z 0

~

~

Z) 1

~

Z 0

X]

1

X 0

~

Z(

~

Z 0

~

~

Z) 1

~

Z 0

y (34)

Variance - covariance estimators

In order to implement all the above methods in practice, we need prior

estimates ofthevariousvariance-covarianceparametersappearinginthedif-

ferent blocksof V(r). We give below the various sample moments that con-

sistently estimatethe relevant parameters.

As

r

it r

i:

="

it

"

i:

(35)

and

V("

it

"

i:

)=

(T 1)

T

2

"

(36)

where a dot in the place of a subscript indicates the average taken over it,

2

"

can be estimated by

^ 2

"

= 1

N(T 1) X

i X

t (^r

it

^ r

i:

) 2

(37)

wherer^

it

=y

it x

it

^

and

^

canbeeitherthecorrespondingestimatorforthat

time period (given by equation (11)) or a rst stage system OLS estimator

(given by (15)).

Using

V(r

i:

)=V(

i

)+V("

i:

)= 2

+

2

"

=T (38)

we can estimate 2

as follows:

^ 2

= 1

N X

i

^ r 2

i:

1

T

^ 2

"

(39)

Finally, anestimatorof c

is given by

^ c

= 1

N

J X

i;j2J (^r

i:

~ r

::

)(^r

j:

~ r

::

) (40)

(15)

J

personsin J and r~

::

isthe mean overthis group.

In the MA(1)case, we need

1

in additiontothe above. Knowing

Cov(r

it

;r

i;t+1 )=

2

+

1

and

Cov(r

it r

i:

;r

i;t+1 r

i:

)=

(T 1)

T

1

we can obtain

^

1

= T

(T 1) 1

(NT) X

i X

t (^r

it

^ r

i:

)(^r

i;t+1

^ r

i:

)

= 1

(T 1) 1

N X

i X

t (^r

it

^ r

i:

)(^r

i;t+1

^ r

i:

)

The above procedures (single equation and system) can be generalised

to MA(2), MA(3),... errors in an exactly analogous way from iid to MA(1)

without much diÆculty. In fact these possibilitieswill alsobe explored em-

pirically.

Specication tests

Before nishingwith the methodology, let us look at the dierent speci-

cation tests that can be carried out inthis context. The rst type of tests

that one can think of are those that concern the nature of serial correlation

assumed for the errors r

t

. For instance, in order to test the assumption of

serial independence, denoted as MA(0), against the alternative MA(1), a

Hausmantest statisticcanbe constructedbasedonthedierencebetween

^

and

~

denotingtheestimatorunderthenull(timeindependence)as

^

andthe

one under the alternative(MA(1)) as

~

. Under the null,both are consistent

with

^

being more eÆcient and under the alternative only

~

is consistent.

Let us consider the single equation case say for t = 1. The instrument will

be z

1

= [x

2 x

3 x

4 x

5

] under H

0

and z~

1

= [x

3 x

4 x

5

] under H

a . The

twoestimators of are given by:

^

=[x 0

1 z

1 (z

0

1

^

0 z

1 )

1

z 0

1 x

1 ]

1

x 0

1 z

1 (z

0

1

^

0 z

1 )

1

z 0

1 y

1

(41)

and

(16)

~

=[x

1

~ z

1 (~z

1

^

0

~ z

1 ) z~

1 x

1 ] x

1

~ z

1 (~z

1

^

0

~ z

1 ) z~

1 y

1

(42)

where

^

0

is the estimator of the variance of r

1

under MA(1). It is easily

shown that

V(

^

)=[x 0

1 z

1 (z

0

1

^

0 z

1 )

1

z 0

1 x

1 ]

1

(43)

V(

~

)=[x 0

1

~ z

1 (~z

0

1

^

0

~ z

1 )

1

~ z 0

1 x

1 ]

1

(44)

with V(

^

)<V(

~

)

and

Cov(

^

;

~

)=V(

^

) (45)

Hence

V(

~

^

)=V(

~

) V(

^

) (46)

The test statisticis therefore

(

~

^

) 0

[V(

~

) V(

^

)]

1

(

~

^

)

2

1

with large values rejecting independence.

Similarly, MA(1) can be tested against MA(2) by constructing the test

statistic based on the dierence between the two corresponding estimators.

In this case, the instrument sets for the rst period equation are z

1

=

[x

3 x

4 x

5

] under H

0

and z~

1

= [x

4 x

5

] under H

a

. Other tests such as

MA(2) against MA(3) or MA(1) against MA(3), MA(0) against MA(2) or

MA(0) againstMA(3) can all bedevised similarly.

The same resultshold for the system estimators too,the basicdierence

being the dimension of the variables involved. Let us illustrate the case iid

against MA(1).

We have the system (see (12))

y=X+r (47)

and

(17)

Z = B

B

B

@ z

1

0 0

0 z

2 0

: : :

0 0 z

T C

C

C

A

(48)

for the nulland

~

Z = 0

B

B

B

@

~ z

1

0 0

0 z~

2 0

: : :

0 0 z~

T 1

C

C

C

A

(49)

for the alternative. The expressions ofthe estimators willbe

^

=[X 0

Z(Z 0

Z) 1

Z 0

X]

1

X 0

Z(Z 0

Z) 1

Z 0

y (50)

and

~

=[X 0

~

Z(

~

Z 0

~

Z) 1

~

Z 0

X]

1

X 0

~

Z(

~

Z 0

~

Z) 1

~

Z 0

y (51)

Finally the test statisticwillbe given by

(

~

^

) 0

[V(

~

) V(

^

)]

1

(

~

^

)

2

K

Another important category of tests is the so-called overidentication

tests ofSargan(1976). These testsuse covariancesbetween IV residualsand

asetof instrumentsthatneednothavebeen usedinthe estimationtodetect

any model misspecication regarding the validity of the instruments. Under

the nullof nomisspecication,the covariance between the residuals and the

instrumentset in questionwould be closetozero. More explicitly,letus say

we have amodel writtenas

y=X+" (52)

and estimatedby IV usingW asinstruments. Nowsupposewehaveanother

potentialinstrumentset

~

W oforderm(

~

W mayormaynotoverlapwithW).

Then thetest statisticusingthe covariance matrix

^

of "^ 0

~

W is given by (see

Godfrey(1988))

^

"

0

~

W

^

1

~

W"^ 2

m

Now if

~

W is a subset of W (Pagan-Hall (1983) suggestion) then the above

test isequivalent totesting Æ=0in the augmented regression

(18)

y=X+WÆ+" (53)

as the above test statistic can be shown to be the same as that of Gallant

and Jorgenson (1979) for testing Æ = 0 in the above model. The latter is

basedonthedierencebetweentheoptimisationcriterionintheunrestricted

model (53) and the restricted model(with Æ=0). It is given by

~

S

^

S

2

m

where

~

S ="~ 0

W(W

^

W) 1

W 0

~

"; "~=y X

~

~

W

~

Æ

^

S ="^ 0

W(W

^

W) 1

W 0

^

"; "^=y X

^

and both models are estimated using W as instruments and

^

calculated

using ".~

Coming to our model say for the single equation t = 1, in the iid case, one

cantestthevalidityof

~

W =[x

2

]asinstrumentsbyestimatingtheaugmented

model

y

1

=x

1 +x

2 Æ+r

1

(54)

and testing Æ = 0. If the test is rejected then x

2

is not a valid instrument

and only [x

3 x

4 x

5

] should be used as the instrument matrix. This can

be repeated for

~

W = [x

2 x

3 ] or

~

W = [x

2 x

3 x

4

]. In the MA(1) case,

W =[x

3 x

4 x

5 ] and

~

W can be either [x

3 ] or[x

3 x

4 ].

Once again the procedure is easily extended for testing other MA struc-

tures and to the case of system estimators.

Letusnowgoontolookattheempiricalresultsonthevariousestimators

and tests discussed above.

(19)

This empirical sectionis divided intothree parts. Firstly we examine single

equationestimation of theincometransmissionequation. Then,we consider

a system analysis where all the equations are pooled to arrive at a unique

estimator. Finally, in the third sub-section we carry out specication tests

on the system estimates in aneort to reconcile the various ndingsand to

arriveat generalconclusions pertainingto the serial correlation in the error

term, and hence the appropriate choice of instruments.

All the regressions results reported below include as explanatory vari-

ables ameasurementontheparentfamily'sannualincome,age controls(age

and `age-squared' ofthe parent and childfamilyheads)together with acon-

stant. All estimationstakeaccountofcorrelations acrosssiblings,andwhere

appropriate, are robust toserial correlation up to three lags(orleads).

4.1 Single equation results

For the single equation analysis, we can report estimates for up to 5 years

(year one pertains to observations on the child's 1983 income and her/his

parents' 1967 income, while year 5 pertains to the child's 1987 income and

the parents' 1971 resources). Single equation analysis is very popularin the

empirical literature on intergenerational mobility (see for instance tables 2

and 3of Solon (1992)) and table 2of Bjrklund and Jantti(1997)). We note

that potentially we could estimatea total of 25 equations, not just 5. This

is so, because, in practice, nothingprecludes us fromusing each of x

1 to x

5

toexplainy

1

(i.e. ve separateregressions), andtousethe samevariablesto

explain y

2

etc. Our choicehere toregressy

t onx

t

ispurely expositional,and

has been motivated by our interest in formulating our estimation problem

in the context of the panel data framework. Below we report results for the

rst and last year of the panel only, since these will suÆce to motivate the

need for proceedingin the directionof system levelestimation.

(20)

instr: set :

^

x

5 x

4 x

3 x

2

Sargantest

0:550(0:035)

0:544(0:033) 0:511

0:539(0:032) 0:525

0:532(0:031) 1:680

1. The dependent variable is y

1

(the child's 1983 income). The set of explanatory

variablescontainsx

1

(theparents'1967income)togetherwithagecontrolsforthe

parentandchildfamilyheads.

2. Thesingleequationestimatorofisasdenedinequation(20). Amarkindicates

that the corresponding variableis used asan instrument. Theset of instruments

(x

2 to x

5

)containsleadsontheparentfamily'sincomefrom 1968to1971.

3. Standarderrorsappearinsideparentheses.

Intable2wereportresultsforperiodoneofthepanel. Whenx

5

,parental

income in 1971, is used to instrument x

1

, (the corresponding 1967 variable)

the coeÆcientofincomeinheritanceisestimated as0.55. Asabenchmark,

the OLSestimateforthis parameteris0.485withastandarderrorof 0.0245.

As thislatterestimatehoweveris downwardlybiased(and itsstandarderror

is incorrect), it is expected that IV estimates of would be higher; this is

indeed the case for all the guresreported in table 2.

Allfourestimatesoftable 2arewithinthe narrowbracketof 0.53to0.55

(less than one standarderror fromone another). Withasample sizeof 1426

observations, adding extra instruments can generally be recommended on

eÆciencygrounds, sincetheresultingnitesamplebias(whichincreaseswith

the number of instruments included in the regression) is small. The results

of table 2 conrm this pattern: standard errors fall from 0.035 to 0.031 as

the set of instruments is expanded from x

5

to include x

5 to x

2

. Finally, the

Sargantests (lastcolumnofthe table)donot rejectany ofthe specications

at the 5% levelwhen x

4 tox

2

are included inthe set of instruments.

(21)

in the disturbances, the story somewhat changes when we turn to period 5

results, reported intable 3.

Table 3: IV results for period ve of the panel

instr: set :

^

x

1 x

2 x

3 x

4

Sargantest

0:599(0:034)

0:577(0:033) 1:796

0:564(0:032) 2:166

0:545(0:031) 14:040

1. The dependent variable is y

5

(the child's 1987 income). The set of explanatory

variablescontainsx

5

(theparents'1971income) togetherwithagecontrols.

2. A *markindicates thatthe Sargantestof over-identifying restrictionsisrejected

atthe5%level.

The estimates of table 3 use lags of family income as instruments, while

thoseoftable2usedleadsofparentalincome. Noterstthattheestimatesof

intable 3 are generally larger(rangingbetween 0.55 and 0.60) than those

of table 2, though these dierences are not statistically signicant. On the

basis of the Sargan tests statistics we may also conclude that the ndings

reject the validity of a one period lagged income as an instrument at the

5% (also atthe 1%) signicance level. Conditional upon x

1

(incomelagged

four periods) being a valid instrument, x

2

, and [x

2

;x

3

] are not rejected as

over-identifying instruments, while the inclusion of x

4

, leads to a rejection

of the orthogonality conditions. Thus, period 5 results suggest an MA(1)

specication for the regression error term, while those of period 1 give no

indication of any potential serial correlation in the disturbances. It is for

this reason that we will also subject our system-level estimators to various

specication tests, given that the set of instruments may be called todoubt

in the light of our ndings.

Weclose thissub-section byreportingsome relatedndingsinthelitera-

ture. Usingeducation of the household head asan instrument, Solon(1992)

(22)

mates this same parameter at 0.63 (also using education asan instrumental

variableforasub-sampleofthePSID).SomeUKresultsprovidedbyDearden

et al. (1997) estimate to be in the order of 0.55 to 0.6 when education is

used to instrument parental income. They also report aninteresting nding

that theincomesof daughters tendtobemore correlatedwith theirparents'

economic outcomes than inthe case of sons.

4.2 System estimation

Movingfromthe single equationframework tothe system estimatorisanal-

ogous to the transitionfrom two-stage to three stage least squares methods

in the general simultaneous equations framework. This allows for potential

eÆciency gainsprovidedthe specicationsconsideredarevalidatedusingap-

propriate diagnostics tests.

Table 4 : System estimators

Model System OLS System GLS

MA(0) 0:498(0:026) 0:417(0:022)

MA(1) 0:520(0:027) 0:475(0:024)

MA(2) 0:538(0:027) 0:489(0:025)

MA(3) 0:565(0:030) 0:522(0:027)

1. MA(0) denotesamodel withserially uncorrelatedresiduals, whileMA(q) denotes

amodelwithamovingaveragedisturbancetermoforderq.

2. Thesystem-OLSand system-GLSestimatorsareasdened in equations(14)and

(21)or(34)respectively.

In comparison to the single equation results of tables 2 and 3, we may

note thatthe systemGLSestimateissubstantiallysmallerundertheMA(0)

assumption-noserialcorrelationintheerrors-(avalueof0.42)butthatthis

estimate increases to 0.52 with a standard error of 0.027 under the MA(3)

(23)

sitive to the time-series structure of the error term than the system OLS

estimator, thoughwith the OLS estimator we alsond that the estimate of

increases as we move from the MA(0) specication to the MA(3) model,

with valuesranging from0.50 to 0.57.

Asdiscussedinsection3,theexistenceofserialcorrelationintheresiduals

reduces the set of potential instruments for parental income. Thus, while

undertheMA(0)andMA(1)assumptionsallequationscanbeused toarrive

at an estimate of , this is nolonger the case under the MA(2) and MA(3)

specications. Under the MA(2) specication, x

3

cannot be instrumented

usinglagsandleadsofx(whenthepanelspansaperiodof5years). Likewise,

under the MA(3) structure, x

2 , x

3

and x

4

will not possess within-equation

instruments. Accordingly,for the MA(2)specication the period3equation

hasbeendeletedfromthesystemestimator. Likewise,theMA(3)modelonly

poolsperiod one and period 5data toestimate . The resulting patternfor

the standarderrors ofthesystem estimatoris totakeonincreasing valuesas

we movefrom the MA(0) tothe MA(3)models (for both OLS and GLS).

Finally, note that the variance of the individual eect accounts for ap-

proximately 70% of the total variance in the error term:

2

is estimated at

0.354,while 2

"

isestimatedat0.166(resultsnot showninthetable). Viewed

underthisperspective,itisclearthatpaneldatamodelswillprovideabetter

t for microdata with a substantialcomponent of unobserved heterogeneity

than pure cross-section regressions.

4.3 Hausman specication tests

Ournaltaskinthisempiricalsectionofthepaperconsistsintestingvarious

scenariosregardingthetime-seriesstructureofthedisturbancesofthesystem

estimator. Such tests are required in order to provide a sound basis for

discriminating between the various estimates of in table 4, with values

ranging from0.42 to 0.57.

(24)

Specication System OLS System GLS

MA(2) vs: MA(3) 11:525 16:539

MA(1) vs: MA(3) 16:942 19:897

MA(0) vs: MA(3) 25:232 52:449

MA(1) vs: MA(2) 8:649 7:199

MA(0) vs: MA(2) 18:661 46:828

MA(0) vs: MA(1) 0:0127 49:246

A *(**)indicates thatthenullhypothesis isrejectedat a5%(1%) probabilityoftype-I

error.

Our benchmark assumption is that of an MA(3) process: the resulting

systemestimatorwillbeconsistent(butineÆcient)undertheMA(2),MA(1),

or MA(0) specications, while, the other way around, the MA(q) estimator

is inconsistent under the MA(r) scenario, for r>q.

Specicationtestsarecarriedoutusingboththesystem-OLSandsystem-

GLSestimators,as,inpractice,asub-optimalweighting schemeforthe GLS

estimatorneednotresultiseÆciencygainsoverOLS(Szroeter(1994)). Com-

paringourteststatisticstoa 2

variablewith6degreesoffreedom,ourresults

(table5)canbesummarizedasfollows: theMA(1)andMA(0)specications

are rejected atthe 1% levelagainst the nullhypothesisof anMA(3) process

using both the system OLS and GLS estimators. The MA(2) hypothesis

(rst lineoftable 5)isnot rejected usingthe system OLS estimator, though

it is rejected at the 5% level (but not at 1%) using GLS. In this sense, the

MA(1) and MA(0) assumptions appear to be inadequate, while, provided

the MA(3) hypothesis is a valid one, the MA(2) model appears to be more

plausible.

Thelast threelines of table 5providefurther resultswhen the nullspec-

ication is modied. A test of the MA(1) specication against the MA(2)

(25)

assumption isrejectedagainstMA(2)atthe 1% level. Finally,the GLSesti-

matorrejectstheassumptionofnoserialcorrelationintheerrorterm,against

theMA(1)model,whileOLSfailstorejecttheMA(0)modelagainstMA(1).

The overall pattern, then, is that both estimators reject the MA(q-2) model

againstMA(q),whilesystem-OLSneverrejectstheMA(q-1)scenarioagainst

MA(q). Adopting a more global perspective, the system-GLS estimator is

perhapsto be recommended oversystem-OLS onsuch grounds.

5 Conclusions

In this study, we have carried out the estimation of the intergenerational

correlationofpermanentincomeswithintheframeworkofapaneldatamodel

with measurement errors. This has enabledus toimplementnew estimators

aswellasspecicationtests,astheproblemhasnotbeenexplicitlyconsidered

from this perspective sofar.

The availability of time-series on a cross-section of families has allowed

us to exploit within-sample information to deal with the errors in variables

problemas wellas todevelop system estimators for the parameters of inter-

est. Thepanel data contexthas alsobeen useful intesting various scenarios

regarding the time-series structure of the error term and its implications

for the choice of instruments. Our estimators have also taken into account

possiblecorrelations among childrenbelongingtothe same family.

Forthevariousmethodsusedinthis study, wehaveestimated the coeÆ-

cientofincomeinheritancetobeintherangeof0.42to0.60. Singleequation

estimates varybetween 0.53and 0.60. Results varydepending onthe choice

of instrumentsfor a given time period,but alsofrom one periodto another.

System estimates were found to be between 0.42 and 0.57, with the system

OLS yielding higher values than the GLS counterpart. On the basis of our

specication tests we nd that MA(0)(no serial correlation) isdenitely re-

jected compared to MA(3), MA(2) and MA(1) (at least in the system-GLS

case). TurningtoMA(1),itisrejectedcomparedtoMA(3)butnotcompared

to MA(2). Finally MA(2) is accepted against MA(3). Given that MA(3) is

the least restricted model that our data permit and hence using this as a

reference for comparing the other scenarios, we can say that there is some

supporting evidencefor an MA(2)structure of the errors.

(26)

ityfromonegenerationtothenext. Aslopevalue inthe orderof0.5to0.55,

while indicative of a process of regression towards the mean, is nonetheless

far remote froman idealizedequalopportunities situation.

In general, panel data models allow the researcher to relax the indepen-

dence assumption between the specic eect and the explanatory variable.

However, this was not possible inour study because the variable of interest

subject to errors (namely permanent income of the parent family) is time-

invariant. Furthermore,wehavenotconsideredotherdynamicstructuresfor

the error term such AR or ARMA processes, which could provide a useful

directionfor further work.

References

Abul Naga, R. (1998), \ Estimating the Intergenerational Correlation of

Incomes: anErrorsinVariablesFramework,"DiscussionpaperNo.9812,

DEEP, University of Lausanne.

Altonji, J. and T. Dunn (1991), \Relationships Among the Family In-

comes and Labor Market Outcomes of Relatives," Research in Labor

Economics, 12,269-310.

Atkinson, A., A. Maynard and C. Trinder (1983)ParentsandChildren:

Incomes in Two Generations, London,Heinemann.

Behrman, J. and P. Taubman (1990), \The Intergenerational Correla-

tion betweenChildren's AdultEarningsandtheirParentsIncome: Re-

sultsfromtheMichiganPanel ofIncomeDynamics,"Reviewof Income

and Wealth, 36,115-127.

Bicketti, S., W. Gould, L. Lillard and F. Welch (1988),\ThePanelStudy

of Income Dynamics after Fourteen Years: an Evaluation,"Journal of

Labor Economics, 6,472-92.

Biorn, E. (1998)\PanelDatawithMeasurementErrors: InstrumentalVari-

ablesandGMMProceduresCombiningLevelsandDierences",Invited

PaperattheEighthInternationalConferenceonPanelData,Goteberg.

(27)

ity in Sweden Compared to the United States," American Economic

Review 87, 1009-1018.

Bowles, S. (1972), \Schooling and Inequality from Generation to Genera-

tion," Journal of Political Economy,80, S219-S251.

Dearden, L., S. Machin and H. Reed (1997),\IntergenerationalMobil-

ity inBritain," Economic Journal 107, 47-66.

Gallant, A. and D. Jorgenson (1979), \ Statistical Inference for a Sys-

tem of Simultaneous, Non-linear,Implicit Equationsin the Context of

Instrumental VariableEstimation," Journal of Econometrics, 11, 275-

302.

Galor, O. and J. Zeira (1993),\IncomeDistributionandMacroeconomics,

" Review of Economic Studies,60, 35-52.

Godfrey, L. (1988), Misspecication Tests in Econometrics, Cambridge,

Cambridge University Press.

Greene, W.H. (1993), Econometric Analysis, Second Edition, Macmillan

PublishingCompany, New York.

Griliches, Z. (1986), \Economic Data Issues" in Z. Griliches and M. In-

triligator (eds): Handbook of Econometrics, Vol. 3, North Holland,

Amsterdam.

Griliches, Z. and J. Hausman (1986),\ErrorsinVariablesinPanelData,"

Journal of Econometrics, 31, 93-118.

Hill, M. (1993),The Panel Study of Income Studies: a User's Guide, New

York,Sage Publications.

Hsiao, C. (1986), Analysis of Panel Data, Cambridge University Press,

Cambridge.

Hsiao, C. and G. Taylor (1991),\Some Remarks onMeasurement Error

and the Identication of Panel Data Models," Statistica Neerlandica

45, 187-194.

(28)

Econometric Reviews, 2, 159-218.

Sargan, D. (1976),\TestingforMisspecication afterEstimatingusingIn-

strumental Variables",mimeo, London Schoolof Economics.

Solon, G. (1992),\IntergenerationalIncomeMobilityintheUnitedStates,"

American Economic Review, 82,393-408.

Stokey, N. (1996), \Shirtsleeves to Shirtsleeves: The Economics of Social

Mobility",reprintedinD.Jacobs,E.KalaiandM.Kamian(eds)(1998):

Frontiersof Research in EconomicTheory: Nancy L. Schwartz Memo-

rial Lectures 1983-1997, CambridgeUniversity Press, Cambridge.

Szroeter, J. (1994), \Exact Finite-Sample Relative EÆciency of Subop-

timally Weighted Least Squares Estimators in Models with Ordered

Heteroscedasticity", Journal of Econometrics, 64,29-43.

Wansbeek, T. and R. Koning (1991), \Measurement Error and Panel

Data,"Statistica Neerlandica,45, 85-92.

Zimmerman, D. (1992),\RegressionTowardMediocrityinEconomicSta-

tus," American Economic Review, 82,409-429.

Références

Documents relatifs

A two step approach is used: first the lithofacies proportions in well-test investigation areas are estimated using a new kriging algorithm called KISCA, then these estimates

White LZ focused on the asymmetric functional role of pesticides as a source of bias is estimates , our focus is on the possibility that over-estimation of

Second, the results suggest that the emigration of skilled workers from developing to rich countries tends to exert a positive impact on the long-run level of human capital of

As the plots in Figure 2 show, using bias-corrected estimates of the coefficient matrices yields estimated impulse-response functions that suffer from much less bias.. The

Our study of satisfaction surveys has shown, in particular, that such data cannot simultaneously remove individual …xed e¤ects linked to adaptation and give us good information

For the linear dynamic model we applied a degree-of-freedom correction to account for the estimation of the error variance and, for the half-panel jackknife estimates of θ 0 ,

This paper discusses a data assimilation method that simultaneously estimates motion from image data and the inaccuracy in the evolution equation used to model the dynamics of

In summary, the Cramér-Rao lower bound can be used to assess the minimum expected error of a maximum likelihood estimate based on the inverse of the expected Fisher information