Lectures in Applied Econometrics: Amazonian Deforestation

Pr. Philippe Polomé, Université Lumière Lyon 2

M1 APE (Analyse des Politiques Économiques), M1 RISE (Gouvernance des Risques Environnementaux)

2016 – 2017

Outline

Introduction
Time-series Theory
Deforestation Data & Analysis
References

Definition and scope

- The United States Environmental Protection Agency defines deforestation as the "permanent removal of standing forests."
- Amazonian deforestation has been monitored by Landsat since 1975
  - Google publishes some images

Why is this an important issue?

- Biodiversity reservoir
  - Habitat loss
- Carbon sink
  - Old-growth forests are (net) carbon sinks
  - In addition, deforesting emits carbon
    - By burning
    - Released from soil
- Changes moisture in the air
  - Causes droughts further south

Why is this an important issue?

- Social issue 1: not development
  - Deforestation is mostly due to agriculture
    - Mostly cattle (about 80%), on planted pasture
    - The Amazon basin appears generally not well suited to crops, soybean in particular [9].
    - 70% of formerly forested land in the Amazon, and 91% of land deforested since 1970, is used for livestock pasture
  - This in turn causes soil erosion and flash floods
- Social issue 2: Indigenous people

Deforestation Time Profile

Source: Landsat images interpreted by the PRODES project of the Instituto Nacional de Pesquisas Espaciais since 1975. Values for some years are linearly interpolated.

- This is the "Legal Amazon" deforestation
  - Why has it been declining since the mid-2000s?

Outline

Introduction
  Causes of Deforestation
  Environmental Kuznets Curves
Time-series Theory
  Trends
  I(0)
  Autoregressive Errors of Order 1, AR(1)
  Stationarity: integration of order 1, I(1)
  Deciding if a time-series is I(1)
  Cointegration
  Error correction models
  Johansen Test of Cointegration
Deforestation Data & Analysis
  Analysis: EKC, I(1) tests and cointegration
  The Estimated Relation
  Alternative Theory
  Other Econometric Issues
  Other Explanatory Variables
  Discussion and conclusions
References

Causes of Deforestation

- Are certainly complex
  - but primarily driven by human action
  - Hence economics might be a factor
  - And thus deforestation may compete with other economic activities
- The deforestation of the 1970s and 1980s was induced by government policies and subsidies
  - Slash-and-burn agriculture appears much less prevalent than it was

World Bank 2004 [9]: Deforestation in the 90s and early 00s

- Attributed mainly to cattle ranching
  - Soybean to a much lesser extent
  - Grass does not deplete the soil so much
  - The 1995 peak was attributed to accidental forest fires
- Agriculture and cattle ranching may be more profitable in the Amazon due to
  - weak land titling, land grabbing, irregular labor contracts,
  - and the continuous process of opening up new forest areas
    - The latter is carried out at low cost by small farmers
    - who prepare the land for the medium- and large-scale cattle ranches that follow them
- Small farmers are blamed less than they once were

Causes of Deforestation

- Weinhold and Reis [12]
  - analyse the way road creation induces deforestation
  - it turns out that it does so only in areas that have not yet seen deforestation
  - but it reduces deforestation in areas where land is already cleared
- NASA Earth Observatory¹ states:

  This pattern follows one of the most common deforestation trajectories in the Amazon. Legal and illegal roads penetrate a remote part of the forest, and small farmers migrate to the area. They claim land along the road and clear some of it for crops. Within a few years, heavy rains and erosion deplete the soil, and crop yields fall. Farmers then convert the degraded land to cattle pasture, and clear more forest for crops. Eventually the small land holders, having cleared much of their land, sell it or abandon it to large cattle holders, who consolidate the plots into large areas of pasture.

¹ Anonymous, 2012 data, accessed October 2015 at http://earthobservatory.nasa.gov/Features/WorldOfChange/deforestation.php

Causes of Deforestation

- "Geography": Kauppi et al. 2006 [7]
  - Above a certain level of income, countries stop deforesting
  - Evidence is essentially a worldwide cross-section
- This points to an explanation economists are familiar with:
  - Deforestation as a worldwide cross-section follows an Environmental Kuznets Curve

Introduction: Environmental Kuznets Curves

Hypothesis: Environmental Kuznets Curves (EKC)

- S. Kuznets (1955) suggested an inverted U-shaped relationship between economic growth and income inequality
  - At first, economic development induces major inequalities between the rich and the poor
  - As income (per capita) rises, inequalities would become more intolerable and disappear
  - Possibly because of money transfers, better opportunities or better education / health care / public goods
- Environmental KC suggested by Grossman & Krueger [4][5]
  - Environmental damage first worsens and then recovers as income per capita rises
  - Emissions, deforestation, ...

Formal EKC

- Beyond a certain point, larger levels of per capita income are associated with gradually lower levels of pollutants

  $y_t = b_0 + b_1 GDPh_t + b_2 GDPh_t^2 + g x_t + e_t$
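As a worked consequence of the quadratic specification, and assuming $b_1 > 0$ and $b_2 < 0$ (the inverted-U case; these signs are an assumption, not an estimate from these slides), the income level at which the damage indicator peaks follows from the first-order condition:

```latex
\frac{\partial y_t}{\partial GDPh_t} = b_1 + 2 b_2\, GDPh_t = 0
\quad\Longrightarrow\quad
GDPh^{*} = -\frac{b_1}{2 b_2}
```

For illustration only, with hypothetical values $b_1 = 4$ and $b_2 = -0.5$, the turning point would be $GDPh^{*} = 4$.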

Evidence for EKC

- Cross-section or panel studies found an EKC on occasion
  - Is that causal or spurious?
- Stern 2004 [10]
  - The EKC for CO₂ and CO₂-equivalent emissions is an artefact (= spurious) of the analysis
  - Instead, the apparent EKC is a mixture of effects:
    1. Pollution increases roughly monotonically (linearly) with income
    2. But "time" reduces pollution, that is, income-independent policies
    3. In rapidly growing middle-income countries, the income effect (rising pollution) overwhelms the time effect
    4. In wealthy countries, growth is slower, and pollution reduction efforts can overcome the income effect
  - That is what causes an apparent EKC effect in cross-section or panel data sets

Time-series

- Stern 2004 [10] and others clearly identify the EKC as a time-series issue
  - as a cross-section forces all countries onto the same path
  - and a panel only allows a different starting point but the same curvature
- In other words, Kauppi et al. [7]
  - make the same mistake as earlier papers identifying an EKC in CO₂ emissions
  - Their results could then be an artefact

Comparing Deforestation across Countries

- Barbier and Burgess (2001)
  - Survey of the economics of tropical deforestation
  - Indicate that even if countries might follow an EKC,
  - they are unlikely all to follow the same path
- Lambin & Meyfroidt 2011 [8]
  - using forest cover evidence in a more "geographical" study
  - indicate that "there is no default forest transition pathway"
- Both these results argue against resorting to cross-sections or even panel studies to test the EKC

Meta-analyses

- Lack of an EKC is generally NOT clearly established for damages / emissions other than CO₂
  - For deforestation, the evidence is mixed
- Choumert et al. [2]
  - Review 69 papers on the Environmental Kuznets Curve for deforestation
  - They find only one paper using time-series
    - Probably Shafik, N. & Bandyopadhyay, S., 1992
    - It does not use cointegration
  - The economics literature does not appear to supply an explanation for the current decreasing trend in deforestation
  - But a more "geography-oriented" literature does not hesitate to point to economic factors

Objectives

- This paper proposes to test the EKC for Brazilian deforestation
  - Because there is a well-documented and relatively long time-series
  - Currently 40 years
- Regression analysis of deforestation on GDPh and its square?
  - A number of issues arise
  - But the cointegration issue appears both essential and untreated

EKC Econometric Issues in a Nutshell

- Deforestation is a time-series
  - Non-stationary: "stochastic trend"
    - No stable expectation or variance
    - Several well-known statistical tests
  - But no "deterministic trend" $y_t = a + bt + e_t$
    - So the decline in deforestation cannot be "only time"
- GDPh is also a non-stationary series
  - A regression of a non-stationary series on another non-stationary series is spurious
  - Unless there is cointegration
    - A difference between the two series is stationary
    - Large literature in econometrics / finance
    - But not used in deforestation EKC studies (roughly 70 studies)
  - Could the EKC be the cointegration relation?
- The cointegration relation could be more complex
  - Other series should be considered

Time-series Theory

Time Series

- Are very common
  - Most macroeconomic data: GDP, inflation, unemployment...
  - Individual (or population aggregate) employment, wage, consumption...
  - Stock quotes: yearly, monthly, daily, real-time...
  - Exchange rates
  - Sales / purchases in a firm
- Time-series are often autocorrelated
  - The present is influenced by the past
- This section is mostly based on
  - Wooldridge [13]
  - the Gretl User's Guide [3]

Time-series vs. cross-section

- Time-series observations are naturally ordered
  - Cross-section data has no natural order
  - except geo-localised data
- Time-series observations proceed from a random stochastic process
  - Cross-section data proceed from a random sample
- Time-series models are usually indexed by $t$:

  $y_t = b_0 + b_1 x_{1t} + \dots + b_k x_{kt} + e_t$

Distributed lags model

- The model $y_t = b_0 + d_0 x_t + e_t$ is said to be static
  - Classical Phillips curve: $inflation_t = b_0 + b_1 unemployment_t + e_t$
- Finite distributed lags (of the regressor) models
  - one or several $x$ impact $y$ with one or more lags
  - $gf_t = b_0 + d_0 te_t + d_1 te_{t-1} + d_2 te_{t-2} + e_t$
    - $gf$: "general (average) fertility" (re-used later)
    - $te$: "tax exemption"
    - this is an "order 2" distributed lag
  - $d_0$ = immediate (short-term) impact of $x$ on $y$
  - The set $d_0, d_1, \dots, d_q$ describes the long-term relation between $x$ and $y$

Shocks

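- Order 2 model: $y_t = b_0 + d_0 x_t + d_1 x_{t-1} + d_2 x_{t-2} + e_t$
- Transient shock of size $\Delta$ (lasting one period) on constant $x$ at time $t$
  - $y_t = b_0 + d_0 (x + \Delta) + d_1 x + d_2 x + e_t$
  - $y_{t+1} = b_0 + d_0 x + d_1 (x + \Delta) + d_2 x + e_{t+1}$
  - $y_{t+2} = b_0 + d_0 x + d_1 x + d_2 (x + \Delta) + e_{t+2}$
- Permanent shock of size $\Delta$ (starting from time $t$) on constant $x$
  - $y_t = b_0 + d_0 (x + \Delta) + d_1 x + d_2 x + e_t$
  - $y_{t+1} = b_0 + d_0 (x + \Delta) + d_1 (x + \Delta) + d_2 x + e_{t+1}$
  - $y_{t+2} = b_0 + d_0 (x + \Delta) + d_1 (x + \Delta) + d_2 (x + \Delta) + e_{t+2}$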
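The two shock scenarios of the order 2 distributed-lag model can be traced numerically. The sketch below is only an illustration: the lag coefficients and the shock size are made-up values, and `response_path` is a hypothetical helper name, not something from the course material.

```python
def response_path(d, shock, permanent, horizon):
    """Change in y, period by period, after a shock to an otherwise constant x.

    d         : lag coefficients [d0, d1, ..., dq] of the distributed-lag model
    shock     : size of the shock to x (illustrative value)
    permanent : True if x stays higher from time t on, False for a one-period shock
    """
    path = []
    for h in range(horizon):            # h periods after the shock date t
        effect = 0.0
        for j, dj in enumerate(d):
            periods_since_shock = h - j  # date of x_{t+h-j} relative to t
            if permanent:
                hit = periods_since_shock >= 0
            else:
                hit = periods_since_shock == 0
            if hit:
                effect += dj * shock
        path.append(effect)
    return path


d = [0.5, 0.3, 0.1]  # assumed d0, d1, d2
transient = response_path(d, 1.0, permanent=False, horizon=5)
permanent = response_path(d, 1.0, permanent=True, horizon=5)
print("transient:", transient)
print("permanent:", permanent)
```

With a transient shock the effect on y traces out d0, d1, d2 and then vanishes; with a permanent shock it accumulates to d0 + d1 + d2, the long-run effect.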

Time-series Theory: Trends

Trend models

- The model $y_t = b_0 + d_0 t + e_t$ is a trend
  - $y_t$ "follows" the flow of time with a stochastic noise $e$
- Several specifications
  - Computer-simulated in Trend and RndWalk.ods on the website
  - Monte-Carlo

Linear trend: $y_t = b_0 + b_1 z_t + b_2 t + e_t$, $t = 1, 2, \dots, T$

Quadratic trend: $y_t = b_0 + b_1 z_t + b_2 t + b_3 t^2 + e_t$

- Not easy to spot or to differentiate from a log (ln) trend

Exponential trend: $y_t = \exp(b_0 + b_1 z_t + b_2 t + e_t)$

- $\Delta \ln(y_t) = \ln(y_t) - \ln(y_{t-1}) \approx \dfrac{y_t - y_{t-1}}{y_{t-1}}$
  - The log-difference approximately equals the growth rate
  - For rather small rates
- An exponential trend without regressor is then $\ln y_t = b_0 + b_2 t + e_t$
  - This happens when $y$ has the same growth rate every $t$
  - $\Delta \ln(y_t) = b_2 + \Delta e_t$: constant growth rate + zero-expectation error
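A quick numerical check of the log-difference approximation may help to see what "rather small rates" means. The rates below are arbitrary illustrative values.

```python
import math


def growth_rate(y_prev, y_now):
    """Exact growth rate (y_t - y_{t-1}) / y_{t-1}."""
    return (y_now - y_prev) / y_prev


def log_difference(y_prev, y_now):
    """Log-difference ln(y_t) - ln(y_{t-1})."""
    return math.log(y_now) - math.log(y_prev)


for rate in (0.01, 0.05, 0.20, 0.50):
    y_prev, y_now = 100.0, 100.0 * (1 + rate)
    exact = growth_rate(y_prev, y_now)
    approx = log_difference(y_prev, y_now)
    print(f"growth {exact:.2f}  log-difference {approx:.4f}  gap {exact - approx:.4f}")
```

The gap is negligible at 1% and still small at 5%, but it becomes sizeable for growth rates of 20% or more.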

Spurious Regression

- Economic chronological variables may have a temporal trend
  - Regressing a trend on a trend often seems like a good idea:
    - $R^2$ and t-statistics are often high
  - However, variables unobserved by the econometrician may actually be causing the trends
    - 3 examples below
  - The unobserved variables may be controlled for by introducing a deterministic time trend
    - the significance of the other regressors might then be brought back to its correct level
    - the time trend may be only a proxy
    - So: it does not explain anything

Spurious Example 1: The storks and the babies

- Fisher, 1936, Copenhagen, post WWII decade
  - $B = b_0 + b_1 S + e$
  - $\hat b_1 = .15$ with t-stat 5.98
  - What is up?

Spurious Example 1: The storks and the babies

- Fisher, 1936, Copenhagen, post WWII decade
- More likely: reconstruction + rural migration to the city
  - Assuming migration and construction are linear: time trend
  - $B = b_0 + b_1 S + b_2 t + e$
  - $\hat b_1 = .03$ with t-stat 0.34
  - However, there are few degrees of freedom

Note on storks and babies

- Storks are birds that leave Northern & Central Europe in autumn and come back in early April
  - That is about 9 months after the summer solstice (21 June / Saint John)
- The summer solstice was an important pagan (and later Christian) festival
  - at which many people would marry...

Spurious Example 2: property investment and prices

- Gretl
  - File → Open data → Sample file
  - Wooldridge tab
  - hseinv.gdt
  - data from [13]
- 1947-88 series
  - General info under Data → Dataset info
  - housing investment per capita
  - housing price
  - ...
- Data → Dataset structure
  - time-series
  - indicate
    - periodicity
    - start period

Spurious Example 2: property investment and prices

- Model menu: OLS regression

  $\widehat{\ln(invpc)} = -.55 + 1.24 \ln(price)$

- The elasticity of household property investment with respect to price is significantly different from zero but not from one
  - A change in price appears to be completely passed on to investment
  - But both series follow a trend

Spurious Example 2: property investment and prices

- Adding a trend (Data menu)

  $\widehat{\ln(invpc)} = -.91 - .38 \ln(price) + .0098 t$

- Price is no longer significant
  - But (real) investment grows by about 1% yearly
  - Possibly, this might be due to omitted regressors
- The previous result was spurious
- If $y$ and $x$ have opposing trends
  - introducing a trend may increase the significance of $x$
  - The t-stat of a trend is not necessarily correct, as we will see in the section on I(1)

Spurious Example 3: simulated data

- Trend and RndWalk.ods, tab "Trend on trend"

De-trending

- To purge the data of the trend (de-trend), instead of introducing a linear trend in the regression:
  1. Regress each variable of the model on a trend
  2. Use the residuals from each equation as new variables
     - like a redefinition
- For example, $y_t = b_0 + b_1 z_t + e_t$
  1. Create the de-trended variables

     $y_t = g_0 + g_1 t + \zeta_t \;\rightarrow\; y_t^d = \hat\zeta_t$

     $z_t = q_0 + q_1 t + \xi_t \;\rightarrow\; z_t^d = \hat\xi_t$

  2. Regress the de-trended variables

     $y_t^d = l z_t^d + n_t$

     - The intercept is no longer useful since $E(y_t^d) = E(z_t^d) = 0$
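The two-step procedure can be compared with the regression that includes the trend directly. The sketch below uses simulated data with made-up parameter values and a small hand-written OLS routine (`ols` and `residuals` are hypothetical helper names). In a linear model, the Frisch-Waugh-Lovell theorem implies that the slope on the de-trended regressor coincides with the slope obtained when the trend is included; what differs between the two approaches is the R² and, through the degrees of freedom, the reported standard errors.

```python
import random


def ols(y, columns):
    """OLS coefficients of y on the given regressor columns (normal equations)."""
    k, n = len(columns), len(y)
    # Augmented matrix [X'X | X'y]
    a = [[sum(columns[i][t] * columns[j][t] for t in range(n)) for j in range(k)]
         + [sum(columns[i][t] * y[t] for t in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for i in range(k):
        p = max(range(i, k), key=lambda row: abs(a[row][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [u - f * v for u, v in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


def residuals(y, columns):
    """Residuals of the OLS regression of y on the given columns."""
    b = ols(y, columns)
    return [y[t] - sum(bj * c[t] for bj, c in zip(b, columns)) for t in range(len(y))]


random.seed(0)
T = 200
const = [1.0] * T
trend = [float(t) for t in range(1, T + 1)]
# Simulated data: both z and y trend upwards (illustrative parameters)
z = [0.05 * t + random.gauss(0, 1) for t in trend]
y = [1.0 + 0.5 * zt + 0.02 * t + random.gauss(0, 1) for zt, t in zip(z, trend)]

# Approach 1: include the trend as a regressor
b_with_trend = ols(y, [const, z, trend])[1]

# Approach 2: de-trend y and z, then regress residuals on residuals, no intercept
y_d = residuals(y, [const, trend])
z_d = residuals(z, [const, trend])
b_detrended = ols(y_d, [z_d])[0]

print(b_with_trend, b_detrended)
```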

De-trending (2)

- Introducing a trend and using de-trended variables are in principle equivalent approaches
- But de-trending is a 2-step method
  - it introduces a measurement error in the 2nd step
  - The de-trended variables are constructed on the basis of estimated parameters
  - The residual $\hat\xi_t$ of the regression of $z_t$ on the trend is a measurement of $z_t^d$ with error
  - Thus the results are not identical (in a linear model the slope estimate coincides, but the standard errors and the $R^2$ differ)
- Why use de-trending?
  - Time-series regressions often have a high $R^2$
  - mostly because of the trend, which does not explain anything
  - So such an $R^2$ does not reflect the real explanatory power of the estimated model
  - The $R^2$ of the regression using de-trended variables is likely a better measure of the true explanatory power of the model

Time-series Theory: I(0)

Stationarity

- A stochastic process is stationary
  - when its distribution does not change through time
  - parameters included
  - Stationarity is similar to "identically distributed"
- A trend is not stationary since its expectation changes with time
- A stochastic process is said to be covariance-stationary
  - if its expectation and its variance are constant through time
  - and if the covariance between 2 periods depends only on the number of periods between them
- Stationary processes are covariance-stationary
  - unless the covariance is infinite

Integration

- A stationary process is integrated of order zero, I(0), if
  - $x_t$ and $x_{t+h}$ are "nearly independent" when $h \to \infty$
  - We also say weakly dependent for I(0)
- A similar definition exists for a non-stationary process
  - I(0) is similar to "independently distributed"
- A covariance-stationary series is I(0) if
  - the correlation between $x_t$ and $x_{t+h}$ $\to 0$ when $h \to \infty$
- I(0) implies that some law of large numbers and central limit theorem may be applied
  - It replaces the (simple) random sample hypothesis, that is "iid"
  - I(0) is a sufficient condition to use a time series in regression

MA(1) : moving average process of order 1

- MA(1): $x_t = e_t + a e_{t-1}$, $t = 1, 2, \dots$
  - $\{e_t : t = 0, 1, \dots\}$ is an i.i.d. sequence with mean zero and variance $s_e^2$
  - $e_t$ is a white noise
- An MA(1) is I(0)
  - Adjacent terms (in a sequence) are correlated
  - As soon as there are 2 periods between 2 terms of an MA(1), the correlation falls to zero since $e_t$ is i.i.d.
  - Since $e_t$ is i.i.d., an MA(1) is stationary
  - Clearly an MA(1) is covariance-stationary
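The statements about the correlation of adjacent and distant terms can be verified directly from the MA(1) definition, using only the i.i.d. property of $e_t$:

```latex
\begin{aligned}
Var(x_t) &= Var(e_t + a e_{t-1}) = (1 + a^2)\, s_e^2 \\
Cov(x_t, x_{t-1}) &= Cov(e_t + a e_{t-1},\; e_{t-1} + a e_{t-2}) = a\, s_e^2 \\
Corr(x_t, x_{t-1}) &= \frac{a}{1 + a^2} \\
Cov(x_t, x_{t-h}) &= 0 \quad \text{for } h \ge 2
\end{aligned}
```

None of these moments depends on $t$, which is why the MA(1) is covariance-stationary.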

Time-series Theory: Autoregressive Errors of Order 1, AR(1)

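- The AR(1) $e_t = r e_{t-1} + \mu_t$ is said to be stable when $|r| < 1$ [vs. explosive]
  - $\mu_t \sim iid(0, s_\mu^2)$, a white noise
  - Expectation 0, constant variance and covariance 0
- We can write $e_t = \mu_t + r \mu_{t-1} + r^2 \mu_{t-2} + \dots$
  - So $var(e_t) = s_e^2 = s_\mu^2 + r^2 s_\mu^2 + r^4 s_\mu^2 + \dots = \dfrac{s_\mu^2}{1 - r^2}$
  - And $cov(e_t, e_{t-1}) = cov(r e_{t-1} + \mu_t, e_{t-1}) = r s_e^2 = \dfrac{r s_\mu^2}{1 - r^2}$
- Substituting successively in the AR(1)

  $e_t = r e_{t-1} + \mu_t = r (r e_{t-2} + \mu_{t-1}) + \mu_t = r^2 e_{t-2} + r \mu_{t-1} + \mu_t = \dots = r^s e_{t-s} + \sum_{i=0}^{s-1} r^i \mu_{t-i}$

  Thus $cov(e_t, e_{t-s}) = \dfrac{r^s s_\mu^2}{1 - r^2} = r^s s_e^2$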

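Variance-covariance matrix of AR(1) errors

$$
\Sigma_e = s_e^2
\begin{pmatrix}
1 & r & r^2 & \cdots & r^{T-1} \\
  & 1 & r & \cdots & r^{T-2} \\
  &   & \ddots & \ddots & \vdots \\
  &   &   & 1 & r \\
sym &   &   &   & 1
\end{pmatrix}
= s_e^2 I_T + s_e^2
\begin{pmatrix}
0 & r & r^2 & \cdots & r^{T-1} \\
  & 0 & r & \cdots & r^{T-2} \\
  &   & \ddots & \ddots & \vdots \\
  &   &   & 0 & r \\
sym &   &   &   & 0
\end{pmatrix}
= s_e^2 I_T + s_e^2 \Psi
$$

where $\Psi$ is the symmetric matrix with zeros on the diagonal and $r^{|i-j|}$ off the diagonal.

A stable AR(1) is I(0)

- Stationary since $\mu_t$ is i.i.d.
- Cov $\to 0$ when the time between periods $\to \infty$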
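The structure of this matrix follows from $cov(e_t, e_{t-s}) = r^s s_e^2$ with $s_e^2 = s_\mu^2 / (1 - r^2)$. A minimal sketch that builds it for a small $T$ is given below; the function name and the parameter values are illustrative assumptions.

```python
def ar1_covariance(r, s_mu2, T):
    """T x T variance-covariance matrix of stable AR(1) errors.

    Element (i, j) is s_e^2 * r^|i-j|, with s_e^2 = s_mu^2 / (1 - r^2).
    """
    if abs(r) >= 1:
        raise ValueError("the AR(1) must be stable: |r| < 1")
    s_e2 = s_mu2 / (1 - r ** 2)  # var(e_t)
    return [[s_e2 * r ** abs(i - j) for j in range(T)] for i in range(T)]


sigma = ar1_covariance(r=0.5, s_mu2=3.0, T=4)
for row in sigma:
    print([round(v, 3) for v in row])
```

The covariances shrink geometrically as the distance between periods grows, which is the I(0) property stated on the slide.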

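Variance-covariance matrix of the OLS coefficients with AR(1) errors

- $y = Xb + e$ with $e_t = r e_{t-1} + \mu_t$

$$
\begin{aligned}
\Sigma_{\hat b} &= (X'X)^{-1} X' \Sigma_e X (X'X)^{-1} \\
&= (X'X)^{-1} X' \left[ s_e^2 I_T + s_e^2 \Psi \right] X (X'X)^{-1} \\
&= s_e^2 (X'X)^{-1} + s_e^2 (X'X)^{-1} X' \Psi X (X'X)^{-1}
\end{aligned}
$$

where $\Psi$ is the symmetric matrix with zeros on the diagonal and $r^{|i-j|}$ off the diagonal.

- It cannot be shown whether it is larger than $\Sigma_{\hat b} = s_e^2 (X'X)^{-1}$
  - Thus, it is not known whether the t-stats will be over- or under-evaluated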

Time-series Theory: Stationarity, integration of order 1, I(1)

Random walk definition

- In an AR(1), the hypothesis $|r| < 1$ is crucial for the series to be I(0)
- Many economic time-series are better described by an AR(1) where $|r| = 1$:
  - $y_t = y_{t-1} + e_t$, called a random walk
- Prediction
  - Since $E(e_{t+j} \mid y_t) = 0 \;\forall j \ge 1$, we have $E(y_{t+h} \mid y_t) = y_t \;\forall h \ge 1$
  - So that whatever the time difference $h$, the best prediction for $y_{t+h}$ is $y_t$

$y_t = y_{t-1} + e_t$ with $e \sim N(0, 4)$ and $y_0 = 0$

Computer-simulated data, to show the random walk profile (tab "Rnd walk")

Random walk and OLS

- The variance of a random walk increases linearly with time (in theory)
- A random walk (an AR(1) with $r = 1$) is thus non-stationary
  - since its distribution changes with time
- It can be shown that it is not I(0) either
  - $x_t$ and $x_{t+h}$ do not become nearly independent when $h \to \infty$
- So the OLS hypotheses for time-series (i.i.d. equivalent) are not satisfied
  - OLS has unknown properties
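The linear growth of the variance can be illustrated by simulation. The sketch below draws many independent random walks with $e \sim N(0, 4)$ and $y_0 = 0$ and compares the cross-path variance at two dates with the theoretical value $t \cdot s_e^2$; the number of paths and the seed are arbitrary choices.

```python
import random


def simulate_random_walk(T, sd, rng):
    """One path of y_t = y_{t-1} + e_t, with y_0 = 0 and e_t ~ N(0, sd^2)."""
    y, path = 0.0, []
    for _ in range(T):
        y += rng.gauss(0.0, sd)
        path.append(y)
    return path


def variance(values):
    """Sample variance."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


rng = random.Random(42)
paths = [simulate_random_walk(100, 2.0, rng) for _ in range(2000)]

var_t25 = variance([p[24] for p in paths])    # theory: 25 * 4 = 100
var_t100 = variance([p[99] for p in paths])   # theory: 100 * 4 = 400
print(round(var_t25, 1), round(var_t100, 1))
```

The variance at $t = 100$ comes out roughly four times the variance at $t = 25$, so there is no constant variance to speak of and the process cannot be stationary.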

I(1)

- A random walk is one particular case of a unit root or I(1) process
- Such an I(1) process is "strongly persistent" or "long memory"
  - "Trend" $\neq$ "strongly persistent"
  - Series like interest rates, inflation or unemployment are often considered "long memory"
  - but have no clear trend
- But in many other cases, a long memory series also has a clear trend
  - e.g. a random walk with drift: $y_t = \Omega + y_{t-1} + e_t$
  - $\Omega$ is the drift
  - See plot on the next page

$y_t = \Omega + y_{t-1} + e_t$ with $e \sim N(0, 4)$, $y_0 = 0$ and $\Omega = .05$

Drift: $y_t = \Omega + y_{t-1} + e_t = 2\Omega + y_{t-2} + e_{t-1} + e_t = \dots$

Computer-simulated data, to show the profile of a random walk with drift (tab "Rnd walk")

Regression between I(1)

- A simple regression between 2 independent I(1) series will often result in a significant t-stat
  - Even without a trend in any variable
- Take 2 random walks $y_t = y_{t-1} + e_t$ and $x_t = x_{t-1} + a_t$
  - Specify $y_t = b_0 + b_1 x_t + \xi_t$,
  - Then $H_0: b_1 = 0$ is true,
  - but $\xi_t$ contains $y_{t-1}$, which is a random walk,
  - Then the t-stat associated with $\hat b_1$, $t_{\hat b_1} \to \infty$ when $T \to \infty$
  - The limit distribution of $t_{\hat b_1}$ is not normal
  - So we are led to think $x$ is a significant regressor for $y$
- Simulated example
  - Trend and RndWalk.ods, tab "Spurious I(1)"
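A small Monte-Carlo experiment shows how often the usual t-test rejects $H_0: b_1 = 0$ when two independent random walks are regressed on each other. The sample size, the number of replications and the seed below are arbitrary, and 1.96 is the usual 5% normal critical value, which is exactly what is misapplied here.

```python
import math
import random


def random_walk(T, rng):
    """One path of a pure random walk with N(0, 1) innovations."""
    y, path = 0.0, []
    for _ in range(T):
        y += rng.gauss(0.0, 1.0)
        path.append(y)
    return path


def slope_t_stat(y, x):
    """t-statistic of b1 in the OLS regression y = b0 + b1*x + error."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(ssr / (n - 2) / sxx)
    return b1 / se_b1


rng = random.Random(1)
reps, T = 500, 100
rejections = sum(
    abs(slope_t_stat(random_walk(T, rng), random_walk(T, rng))) > 1.96
    for _ in range(reps)
)
rejection_rate = rejections / reps
print(rejection_rate)
```

If the t-statistic were valid, the rejection rate would be close to 5%; with independent I(1) series it is far higher, which is the spurious regression problem.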

Remark: the types of spurious regressions

1. In a cross-section
   - Spurious regression may be due to unobserved heterogeneity
   - 2 variables are unrelated, but both are correlated with a third
   - Regressing the first on the second, it appears that the relation is significant
   - but when the third variable is inserted, the second loses its significance
   - This phenomenon may also occur in time-series
     - e.g. storks & babies
2. A spurious relation also occurs between series that share a trend
   - Both series have a positive trend or a negative one
   - This issue may be solved by inserting a trend in the model
   - but not always
3. Two I(1) series often appear in a spurious relation

First Differences

- The first difference of a unit root series $y_t$: $\Delta y_t = y_t - y_{t-1}$
  - is I(0): $\Delta y_t$ and $\Delta y_{t+h}$ become nearly independent when $h \to \infty$
  - and is often stationary
    - its distribution does not change with time
  - The series is then said to be difference-stationary
- Many series $y_t$ that are $> 0 \;\forall t$ are such that $\ln(y_t)$ is I(1)
  - Then we can often use $\ln(y_t) - \ln(y_{t-1})$ in an OLS regression
  - Since $\ln(y_t) - \ln(y_{t-1}) \approx \dfrac{y_t - y_{t-1}}{y_{t-1}}$, the interpretation is in terms of growth rates
  - That is: growth rates are often I(0)
- Differencing a time-series also removes any linear trend

Time-series Theory: Deciding if a time-series is I(1)

Correlation

- Let $r_1 = Corr(y_t, y_{t-1})$
  - It is called the first-order autocorrelation of $\{y_t\}$
  - $r_1$ can be estimated from the sample correlation between $y_t$ and $y_{t-1}$
  - $\hat r_1 = \sum_{t=2}^{T} (y_t - \bar y)(y_{t-1} - \bar y_{-1}) / (T - 2)$, where $\bar y$ and $\bar y_{-1}$ are the sample means of $y_t$ and $y_{t-1}$ over $t = 2, \dots, T$ (for series scaled to unit variance; otherwise divide by the two sample standard deviations)
- However, the sampling distribution of $\hat r_1$ is very different when $r_1$ is close to 1 than when $r_1$ is far from 1
  - When $r_1$ is close to 1, $\hat r_1$ may have a large downward bias
  - Otherwise, the sample correlation is unbiased and consistent
  - As a rule of thumb, to "counter" this downward bias, the series should be differenced as soon as $\hat r_1 > .8$, at worst $\hat r_1 > .9$
- When the series has a clear trend
  - it is first de-trended and then $r_1$ is estimated
  - Otherwise, $\hat r_1$ tends to be over-estimated

Unit Root Test

- AR(1) model: $y_t = a + r y_{t-1} + e_t$
- Dickey-Fuller (DF) test: $H_0: r = 1$ against $H_1: r < 1$
  - Subtract $y_{t-1}$ on each side
  - $\Delta y_t = a + q y_{t-1} + e_t$ with $q = r - 1$
  - Under $H_0: q = 0$ (so $r = 1$), $y_{t-1}$ is I(1)
  - So that the associated t-stat in an OLS regression does not converge to a normal
  - but to a Dickey-Fuller distribution
- We test $q = 0$ (so $r = 1$) by calculating the usual t-stat
  - but compare it with the tabulated values of the Dickey-Fuller distribution
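The mechanics of the test can be sketched on simulated data: regress $\Delta y_t$ on a constant and $y_{t-1}$, take the usual t-statistic of $q$, and compare it with a Dickey-Fuller critical value rather than a normal one. The value of about -2.86 used below is the commonly tabulated asymptotic 5% critical value for the specification with constant and no trend; the simulated series and function names are illustrative assumptions.

```python
import math
import random


def df_t_stat(y):
    """t-statistic of q in  dy_t = a + q * y_{t-1} + e_t  (constant, no lags)."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    lag = y[:-1]
    n = len(dy)
    mx, my = sum(lag) / n, sum(dy) / n
    sxx = sum((x - mx) ** 2 for x in lag)
    q = sum((x - mx) * (d - my) for x, d in zip(lag, dy)) / sxx
    a = my - q * mx
    ssr = sum((d - a - q * x) ** 2 for x, d in zip(lag, dy))
    se_q = math.sqrt(ssr / (n - 2) / sxx)
    return q / se_q


def simulate_ar1(r, T, rng):
    """Path of y_t = r * y_{t-1} + e_t with N(0, 1) innovations and y_0 = 0."""
    y, path = 0.0, []
    for _ in range(T):
        y = r * y + rng.gauss(0.0, 1.0)
        path.append(y)
    return path


DF_CRITICAL_5PCT = -2.86  # asymptotic value, constant and no trend (tabulated)

rng = random.Random(7)
t_stationary = df_t_stat(simulate_ar1(0.5, 500, rng))  # I(0) series
t_walk = df_t_stat(simulate_ar1(1.0, 500, rng))        # unit root series

print("stable AR(1):", round(t_stationary, 2), "reject H0:", t_stationary < DF_CRITICAL_5PCT)
print("random walk :", round(t_walk, 2), "reject H0:", t_walk < DF_CRITICAL_5PCT)
```

The test is one-sided: the unit root is rejected only for t-statistics that are more negative than the critical value.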

Augmented DF Test

- Same test as DF for $r = 1$ but in the model
  - $\Delta y_t = a + q y_{t-1} + g_1 \Delta y_{t-1} + g_2 \Delta y_{t-2} + \dots + g_p \Delta y_{t-p} + e_t$
  - This is the most often used: the "ADF" test
- The test can be specified
  - Without constant: $\Delta y_t = q y_{t-1} + g_1 \Delta y_{t-1} + \dots + g_p \Delta y_{t-p} + e_t$
  - With a trend: $\Delta y_t = a + bt + q y_{t-1} + g_1 \Delta y_{t-1} + \dots + g_p \Delta y_{t-p} + e_t$
- How to choose?
  - $a = 0$ and $b = 0$: "pure" random walk
  - $a \neq 0$ and $b = 0$: random walk with drift
  - $a \neq 0$ and $b \neq 0$: random walk with drift and trend
  - These cases are discussed below

Trend and I(1)

- For series that have clear time trends, the test is
  - $\Delta y_t = a + bt + q y_{t-1} + g_1 \Delta y_{t-1} + \dots + g_p \Delta y_{t-p} + e_t$
- A trend-stationary process
  - which has a linear trend in its mean but is I(0) about its trend
  - can be mistaken for a unit root process
  - if we do not control for a time trend in the test [Wooldridge [13]]
  - Compare how a random walk with drift looks like a trended I(0) series
- The usual DF or ADF test on a trending but I(0) series
  - (that is, not including a trend term)
  - has little power for rejecting a unit root
  - power = probability of rejecting the null hypothesis of a unit root when there is not one
  - the trend makes us believe there is a unit root
- BUT, if we include an unneeded trend, we lose power
  - So avoid including the trend whenever possible

Notes on DF

- When we include a time trend in the regression, the critical values of the test change.
- Omitting the intercept $a$ in the DF equation is rarely done because of the biases induced if $a \neq 0$
- Allowing for more complicated time trends, such as quadratic ones, is possible but also seldom done.

How many lags ?

- The inclusion of the lagged changes is intended to "clean up" serial correlation in $\Delta y_t$
- The more lags,
  - the more initial observations we lose
  - the smaller the power of the test
- With too few lags,
  - the size of the test will be incorrect, even asymptotically,
  - size = probability of rejecting the null hypothesis of a unit root when there is one
  - because the validity of the DF critical values relies on the dynamics being completely modeled
- Often,
  - annual data: one or two lags usually suffice [Wooldridge [13]]
  - monthly data: 12 lags may be used
  - large sample size: you may experiment

One application of the DF test

- $r3_t$: (annualised) interest rate (or yield) on 3-month treasury bills
  - "Bond equivalent yields", in the financial pages
- In Gretl, data in INTQRT.gdt, from Wooldridge [13]
  - Change the structure of the dataset: monthly, initial date unknown
- Estimate $\Delta y_t = a + q y_{t-1} + e_t$
  - OLS of cr3 on a constant (0) and r3_1
  - The coefficient of r3_1 is -0.0907, so $\hat r = 0.9093$
  - The t-stat of r3_1 is -2.47, but it does not follow a t distribution
- On the r3 variable
  - Menu "Variable" → "Unit root test" → "Augmented..."
  - No lag (so: simple DF test), with constant, without trend
  - This produces the same results as the regression, with a correct p-value of .12, so we do not reject $H_0$: there is a unit root

Time-series Theory: Cointegration

Motivation & Definition

- Taking first differences of I(1) series before regressing them is a "safe strategy"
  - but it limits the analysis to short-term relations
  - That is: one-period changes explained by one-period changes
  - Cointegration may give back its meaning to regressions between I(1) series in levels (or logs)
- If $\{y_t\}$ and $\{x_t\}$ are I(1), then in general $y_t - b x_t$ is I(1) $\forall b$
  - However, it is possible for some $b \neq 0$ that $y_t - b x_t$ is
    - I(0): asymptotically uncorrelated with its own past
    - Stationary: constant expectation & variance
  - When such a $b$ exists, we say that $\{y_t\}$ and $\{x_t\}$ are cointegrated
  - $b$ is the cointegration parameter
- $\{y_t\}$ and $\{x_t\}$ cannot move far apart from each other in the long run

Example: Treasury bills Interest Rates

- $r6_t$: (annualised) interest rate series of the 6-month treasury bill (T-bill)
  - $r3_t$: idem but 3-month
- Data in INTQRT.gdt from Wooldridge [13]
  - We saw earlier that $r3_t$ had a unit root
  - That is also true of $r6_t$
- Let $Spr_t = r6_t - r3_t$ (spr for spread)
  - $b = 1$: we know the cointegration parameter
  - Test whether Spr has a unit root
  - DF stat -7.71 with a corresponding near-zero p-value
  - thus we reject $H_0$: spr has a unit root
  - so $r6_t$ and $r3_t$ are cointegrated with parameter 1
- Interpretation: if the rates moved apart, one of the two would become a relatively more attractive investment than the other
  - therefore, investors would pay more for it, and its price would rise
  - since the interest rate is the return of the bond divided by its price, it would decrease automatically

Cointegration test

- When we know the value of the cointegration coefficient $b$
  - then we test whether $y_t - b x_t$ has a unit root: DF or ADF
- Usually, we do not know $b$
  - If $y_t$ and $x_t$ are cointegrated
    - OLS is consistent for $b$ in $y_t = a + b x_t + u_t$
    - otherwise, OLS yields spurious results and $b$ is falsely significant
  - Engle-Granger test = Dickey-Fuller on $\hat u_t = y_t - \hat a - \hat b x_t$
  - Regress $\Delta \hat u_t$ on $\hat u_{t-1}$ with a constant, without lag

    $\Delta \hat u_t = d + g \hat u_{t-1} + \xi_t$

  - If the coefficient on $\hat u_{t-1}$ is significantly negative, the unit root is rejected and $\hat u_t$ is I(0)
  - Then $y_t$ and $x_t$ are cointegrated
  - Again, the test uses a special distribution, not a $t$
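The two steps of the Engle-Granger procedure can be sketched on simulated data: first an OLS regression in levels, then a Dickey-Fuller regression on its residuals. The data-generating values and function names are illustrative assumptions, and -3.34 is the asymptotic 5% critical value commonly tabulated for this test with two variables, a constant and no trend (for instance in Wooldridge's textbook table for cointegration tests).

```python
import math
import random


def simple_ols(y, x):
    """OLS of y on a constant and x: returns (b0, b1, se_b1, residuals)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    se_b1 = math.sqrt(sum(u * u for u in resid) / (n - 2) / sxx)
    return b0, b1, se_b1, resid


def engle_granger_t(y, x):
    """t-statistic of g in  du_t = d + g * u_{t-1} + error,  u = levels residuals."""
    _, _, _, u = simple_ols(y, x)                   # step 1: regression in levels
    du = [u[t] - u[t - 1] for t in range(1, len(u))]
    _, g, se_g, _ = simple_ols(du, u[:-1])          # step 2: DF regression on residuals
    return g / se_g


def random_walk(T, rng):
    y, path = 0.0, []
    for _ in range(T):
        y += rng.gauss(0.0, 1.0)
        path.append(y)
    return path


EG_CRITICAL_5PCT = -3.34  # asymptotic, constant, no trend (tabulated)

rng = random.Random(3)
T = 200
x = random_walk(T, rng)
y_coint = [2.0 + 0.8 * xi + rng.gauss(0.0, 1.0) for xi in x]  # cointegrated with x
y_indep = random_walk(T, rng)                                   # unrelated I(1) series

t_coint = engle_granger_t(y_coint, x)
t_indep = engle_granger_t(y_indep, x)
print("cointegrated pair:", round(t_coint, 2), "reject unit root:", t_coint < EG_CRITICAL_5PCT)
print("independent pair :", round(t_indep, 2), "reject unit root:", t_indep < EG_CRITICAL_5PCT)
```

Rejecting the unit root in the residuals is the evidence for cointegration; failing to reject it means the levels regression may be spurious.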

Engle-Granger Test

- If the lag order, k, is greater than 0,
  - then k lags of the dependent variable are included on the right-hand side of the test regression
  - Gretl allows "test down from maximum lag"
    - From a selected lag order taken as a maximum,
    - the actual lag order used is obtained by testing down
    - AIC can be used to compare the different lag levels
- If $y_t$ or $x_t$ has a trend, it must be modeled
  - See Wooldridge 2012 p. 648 [13]
  - Where the trend is improperly called a drift

Engle and Granger: 2003 Nobel Prize in Economics

Robert F. Engle: "for methods of analyzing economic time series with time-varying volatility (ARCH)"

Clive Granger: "for methods of analyzing economic time series with common trends (cointegration)"

Example: cointegration between fertility and taxation

- In the USA, the "personal exemption" is a tax break on household income
  - Among other things, the more children the household has, the bigger the tax break
  - The amount is relatively small, but changes arbitrarily through time
  - One can then imagine testing for a link between the exemption and the number of births

Example: cointegration between fertility and taxation

- Data in Gretl, Fertil3.gdt, from Wooldridge [13]
  - Modify the dataset structure for a time-series, annual, beginning 19??
  - gfr: births / 1000 women 15-44 years old
    - DF: p-value .80, so we do not reject $H_0$: unit root
  - pe: "personal exemption", in real $
    - DF: p-value .45, so we do not reject $H_0$: unit root
- Regressions
  - In levels: $gfr_t = a + b\, pe_t + u_t$
  - In first differences: $\Delta gfr_t = a + b\, \Delta pe_t + \Delta u_t$

gfr and pe

Each cell reports the coefficient with its p-value in parentheses.

Dependent variable: gfr_t (levels)

              (1)            (2)            (3)
Cst           99.4 (0)       92.9 (0)       108.6 (0)
pe_t          .05 (.40)      -.06 (.36)     .03 (.66)
pe_{t-1}                     -.02 (.83)     -.04 (.72)
pe_{t-2}                     .11 (.07)      .13 (.11)
pe_{t-3}                     -.005 (.93)    -.01 (.88)
pe_{t-4}                     .08 (.16)      .02 (.04)
Pill (63)     -27.8 (0)      -30.9 (0)      .38 (.97)
t                                           -1.17 (0)
DW            .12            .17            .25
T             72             68             68

Dependent variable: Δgfr_t (first differences)

              (1)            (2)            (3)
Cst           -.08 (.92)     -.32 (.68)     -3.45 (0)
Δpe_t         -.05 (.27)     -.05 (.17)     -.05 (.19)
Δpe_{t-1}                    -.01 (.69)     -.009 (.75)
Δpe_{t-2}                    .09 (0)        .09 (0)
Δpe_{t-3}                    .04 (.17)      .04 (.15)
Δpe_{t-4}                    -.04 (.04)     -.36 (.05)
Pill (63)     -2.23 (.07)    -1.78 (.14)    -5.43 (.005)
t                                           .11 (.01)
DW            1.44           1.34           1.57
T             71             67             67

The differences between the models in levels and in first differences suggest testing for cointegration, because if the series are not cointegrated, the regressions in levels are spurious.

gfr and pe

- Cointegration test
  - Gretl: "Model" → "Time Series" → "Coint Test" → "Engle-Granger"
  - Variables: gfr and pe, without lag since we test $\Delta \hat u_t = a + b \hat u_{t-1} + e_t$
  - Complete output
    - DF for gfr and pe: each is I(1)
    - OLS of gfr on pe
    - OLS residuals: we do not reject $H_0: b = 0$
    - So we do not reject $H_0: r = 1 + b = 1$: the residuals are I(1)
    - Thus gfr and pe are NOT cointegrated
- Control for a possible common trend between gfr and pe
  - Same procedure, but select "constant and trend"
  - Same conclusion
- Thus, the relation in levels is spurious (Pill!)
  - The one in first differences reflects only the short run

Time-series Theory: Error correction models

Definition

- If $y_t$ and $x_t$ are I(1)
  - One can only estimate a model in first differences
  - a "VAR": Vector Autoregressive Model
  - e.g. $\Delta y_t = a_0 + a_1 \Delta y_{t-1} + g_0 \Delta x_t + g_1 \Delta x_{t-1} + u_t$
- But if $y_t$ and $x_t$ are cointegrated
  - We can introduce additional I(0) variables
  - Let $s_t = y_t - b x_t$, which is I(0)
  - For simplicity, assume $E(s_t) = 0$
  - In the simplest case, we insert a lag of $s_t$
  - $\Delta y_t = a_0 + a_1 \Delta y_{t-1} + g_0 \Delta x_t + g_1 \Delta x_{t-1} + d s_{t-1} + u_t$
  - The $d s_{t-1}$ term is called the error correction
  - As is the whole model

Discussion

- An error correction model (ECM) allows us to analyse the short-run dynamics between $y_t$ and $x_t$
  - Usually, $b$ has to be estimated
  - OLS is consistent under cointegration
  - There are other models (Leads and Lags)
- For simplicity, a model without lags of $\Delta y_t$ or $\Delta x_t$
  - $\Delta y_t = a_0 + g_0 \Delta x_t + d s_{t-1} + u_t$
  - $\Delta y_t = a_0 + g_0 \Delta x_t + d (y_{t-1} - b x_{t-1}) + u_t$

Discussion

- Then it should be that $d < 0$
  - If $y_{t-1} - b x_{t-1} > 0$ then $y$ has overshot the equilibrium in $t-1$
  - Cointegration imposes that we return to the equilibrium
  - Since $d < 0$, the error correction tends to reduce $y_t$
  - Which brings us back to the equilibrium
  - Likewise when $y_{t-1} - b x_{t-1} < 0$
- However, the ECM can also be seen as a context for the estimation of a cointegration relation
  - In which short-run terms in $\Delta y_t$ or $\Delta x_t$ are introduced to reduce the unexplained noise
  - That is $y_t = p_0 + p_1 \Delta x_t + p_2 \Delta y_t + p_3 x_t + \xi_t$
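The requirement $d < 0$ can be made explicit by writing the dynamics of the equilibrium error $s_t = y_t - b x_t$ implied by the simple ECM $\Delta y_t = a_0 + g_0 \Delta x_t + d s_{t-1} + u_t$. This is a short derivation under that specification only:

```latex
\begin{aligned}
s_t &= y_t - b x_t
     = (y_{t-1} - b x_{t-1}) + \Delta y_t - b \Delta x_t \\
    &= s_{t-1} + a_0 + g_0 \Delta x_t + d\, s_{t-1} + u_t - b \Delta x_t \\
    &= a_0 + (1 + d)\, s_{t-1} + (g_0 - b)\, \Delta x_t + u_t
\end{aligned}
```

The equilibrium error therefore follows an AR(1) with coefficient $1 + d$, which is stable when $|1 + d| < 1$, that is when $-2 < d < 0$. With $d = 0$ the error has a unit root and there is no return to equilibrium.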

Vector Error Correction Models

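- Consider an n-variate process of order p
  - $y_t = (y_{1t}, \dots, y_{nt})'$, that is n endogenous variables
  - $y_t = \mu_t + A_1 y_{t-1} + \dots + A_p y_{t-p} + e_t$
  - In real life, we don't know p
  - $\mu_t$ may include exogenous variables
- Rewrite
  - tautology: $y_{t-s} = y_{t-1} - (\Delta y_{t-1} + \Delta y_{t-2} + \dots + \Delta y_{t-s+1})$
  - so $\Delta y_t = \mu_t + \Pi y_{t-1} + \sum_{s=1}^{p-1} \Gamma_s \Delta y_{t-s} + e_t$
  - with $\Pi = \sum_{s=1}^{p} A_s - I$ and $\Gamma_s = -\sum_{h=s+1}^{p} A_h$
  - called the VECM representation of $y_t$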
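The rewriting is easiest to follow in the smallest non-trivial case, $p = 2$, where the tautology reduces to $y_{t-2} = y_{t-1} - \Delta y_{t-1}$:

```latex
\begin{aligned}
y_t &= \mu_t + A_1 y_{t-1} + A_2 y_{t-2} + e_t \\
    &= \mu_t + A_1 y_{t-1} + A_2 \left( y_{t-1} - \Delta y_{t-1} \right) + e_t \\
\Delta y_t &= \mu_t + (A_1 + A_2 - I)\, y_{t-1} - A_2\, \Delta y_{t-1} + e_t
\end{aligned}
```

so that $\Pi = A_1 + A_2 - I$ and $\Gamma_1 = -A_2$, in line with the general formulas.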

Vector Error Correction Models

- The important things are
  - The $y_{t-1}$ term looks like the expression for the Engle-Granger test
  - Plus terms in $\Delta y_{t-s}$ that look like error corrections
- The interpretation of $\Delta y_t = \mu_t + \Pi y_{t-1} + \sum_{s=1}^{p-1} \Gamma_s \Delta y_{t-s} + e_t$
  - depends on the rank of $\Pi$
    - called r
  - If $r = 0$: all the elements of $y_t$ are I(1)
    - and not cointegrated
  - If $r = n$: all the elements of $y_t$ are I(0)
  - $s$ is the lag order of the VECM²
  - Note $\Delta y_{t-s} = y_{t-s} - y_{t-s-1}$

² In Gretl, $s$ is the chosen lag order minus 1, because Gretl first computes a VAR of that lag order, while the VECM is in first differences, so one lag order less.

Cointegration

- Occurs when $0 < r < n$
  - Then $\Pi$ can be written as $ab'$
  - $y_t$ is I(1)
  - but $z_t = ab' y_t$ is I(0)
- For example
  - Assume $b_1 = -1$ and $r = 1$
  - Then $\exists\, b$ s.t. $z_t = -y_{1t} + b_2 y_{2t} + \dots + b_n y_{nt}$ is I(0)
  - That is, $y_{1t} = b_2 y_{2t} + \dots + b_n y_{nt} - z_t$ is a long-run relation
  - $z_t$ may be non-zero but is stationary
- In practice
  - We do not know $b$
  - We estimate it first and then the rest

Time-series Theory: Johansen Test of Cointegration

Johansen Test of Cointegration

- Works by computing the eigenvalues of a matrix closely related to $\Pi$
  - $l$ is the vector of (real) eigenvalues of $\Pi$ if $\det(\Pi - lI) = 0$
  - So that $(\Pi - lI) n = 0$ has a non-zero solution $n$
  - We can guess the relation with the VECM representation
- Count the number of eigenvalues different from zero
  - If all are significantly $\neq 0$
    - then all the processes are I(0) (stationary)
  - If there is at least one zero eigenvalue (but not all are zero)
    - then $y_t$ is I(1)
    - but some linear combination $b' y_t$ is stationary
  - If no eigenvalues are significantly $\neq 0$
    - then $y_t$ is I(1)
    - and so is any linear combination $b' y_t$
    - SO: no cointegration