Montréal
February 2001
Scientific Series
2001s-11
Autoregression-Based
Estimators for ARFIMA Models
CIRANO
CIRANO is a private non-profit organization incorporated under the Québec Companies Act. Its infrastructure and
research activities are funded through fees paid by member organizations, an infrastructure grant from the
Ministère de la Recherche, de la Science et de la Technologie, and grants and research mandates obtained by its
research teams.
The Partner Organizations
•École des Hautes Études Commerciales
•École Polytechnique
•Université Concordia
•Université de Montréal
•Université du Québec à Montréal
•Université Laval
•Université McGill
•MEQ
•MRST
•Alcan Aluminium Ltée
•AXA Canada
•Banque du Canada
•Banque Laurentienne du Canada
•Banque Nationale du Canada
•Banque Royale du Canada
•Bell Québec
•Bombardier
•Bourse de Montréal
•Développement des ressources humaines Canada (DRHC)
•Fédération des caisses populaires Desjardins de Montréal et de l’Ouest-du-Québec
•Hydro-Québec
•Imasco
•Industrie Canada
•Pratt & Whitney Canada Inc.
•Raymond Chabot Grant Thornton
•Ville de Montréal
© 2001 John W. Galbraith and Victoria Zinde-Walsh. All rights reserved.
Short sections may be quoted without explicit permission, if full credit, including the © notice, is given to the source.
ISSN 1198-8177
This paper presents preliminary research carried out at CIRANO and aims at
encouraging discussion and comment. The observations and viewpoints expressed are the
sole responsibility of the authors. They do not necessarily represent positions of CIRANO
or its partners.
Autoregression-Based Estimators for ARFIMA Models*
John W. Galbraith†, Victoria Zinde-Walsh‡
Abstract
This paper describes a parameter estimation method for both stationary
and non-stationary ARFIMA (p,d,q) models, based on autoregressive
approximation. We demonstrate consistency of the estimator for -½ < d < 1, and
in the stationary case we provide a Normal approximation to the finite-sample
distribution which can be used for inference. The method provides good
finite-sample performance, comparable with that of ML, and stable performance across
a range of stationary and non-stationary values of the fractional differencing
parameter. In addition, it appears to be relatively robust to mis-specification of
the ARFIMA model to be estimated, and is computationally straightforward.
Keywords: ARFIMA model, autoregression, fractional integration, long memory
JEL: C12, C22
* Corresponding Author: John Galbraith, Department of Economics, McGill University, 888 Sherbrooke Street West, Montréal, Qc, Canada H3A 2T7. Tel.: (514) 985-4008; Fax: (514) 985-4039; email: galbraij@cirano.qc.ca
We thank Javier Hidalgo, Aman Ullah and seminar participants at Alberta, Meiji Gakuin, Montréal, Queen's and
York universities, and the 1997 Econometric Society European Meeting, for valuable comments. Thanks also to the
Fonds pour la formation de chercheurs et l'aide à la recherche (Québec) and the Social Sciences and Humanities
Research Council of Canada for financial support. The program for computing GPH estimates was written by Pedro
de Lima; the code for ML estimation of ARFIMA models was based on programs to compute the autocovariances of
ARFIMA models written by Ching-Fan Chung. Nigel Wilkins kindly provided additional code for ML estimation.
† McGill University and CIRANO
‡ McGill University
Long memory processes, and in particular models based on fractional integration as in Granger and Joyeux (1980) and Hosking (1981), have come to play an increasing role in time series analysis as longer time series have become available; financial time series have yielded an especially large number of applications. There is a correspondingly substantial literature on the estimation of such models, both in the frequency domain and the time domain; these contributions can be further sub-divided into those which estimate a fractional integration (FI) parameter jointly with the standard ARMA parameters of an ARFIMA model, and those which estimate the FI parameter alone, leaving any ARMA or other parameters for possible estimation in a second stage.
Important contributions to frequency-domain estimation of the FI parameter include those of Geweke and Porter-Hudak (1983) and Robinson (1995). In the time domain, Hosking (1981), Li and MacLeod (1986) and Haslett and Raftery (1989) suggested estimation strategies based on first-stage estimation of the long-memory parameter alone. A potential finite-sample problem arises in processes that have a short-memory component, which can lead to bias as the short-memory components project onto the long-memory parameter in the first stage; see Agiakloglou et al. (1992) for a discussion of this bias in the context of the Geweke and Porter-Hudak (hereafter GPH) estimator.
Sowell (1992a) and Tieslau et al. (1996) treat joint estimation of fractional integration and ARMA parameters, in the former case via Maximum Likelihood, and in the latter via a minimum-distance estimator based on estimated autocorrelations, which requires that a non-stationary process (given knowledge of the non-stationarity) be transformed to stationarity. Martin and Wilkins (1999) use the indirect inference estimator, which uses simulation to obtain the function to which distance is minimized.
The present paper offers an alternative estimation method for joint estimation of a set of ARFIMA parameters which is applicable to both stationary and non-stationary processes. The method is based on autoregressive approximation, as considered by Galbraith and Zinde-Walsh (1994, 1997), and has several advantages in addition to ease of computation. First, it offers stable performance across a range of stationary (including `anti-persistent') and non-stationary values of the long-memory parameter d, that is −1/2 < d < 1, and therefore does not require prior knowledge of the non-stationarity or transformation to the stationarity region. In general, the autoregressive method performs well in the finite-sample cases that we examine, yielding root mean squared errors comparable with those of exact time-domain ML in the cases for which that estimator is applicable. Perhaps most importantly, this method appears to be relatively robust to mis-specification, being based on AR approximations which can represent quite general processes. We offer simulation evidence on each of these points.
In Section 2 we briefly review estimation of ARFIMA models in the time domain and autoregressive approximation of ARFIMA processes, and present estimators based on this approximation. Section 3 considers both asymptotic and finite-sample properties of these estimators.
2.1 Assumptions and the autoregressive representation
Consider an ARFIMA(p,d,q) process, defined as

P(L)(1−L)^d y_t = Q(L) ε_t,   (2.1.1)

where {y_t}_{t=1}^T is a set of observations on the process of interest and {ε_t}_{t=1}^T forms a stationary innovation sequence such that, for the σ-algebra F_{t−1} generated by {ε_τ, τ ≤ t−1}: E(ε_t | F_{t−1}) = 0 a.s., E(ε_t² | F_{t−1}) = σ² > 0 a.s., and E(ε_t⁴) < ∞. Let P(L), Q(L) be polynomials of degrees p and q, such that P(L) = 1 − φ_1 L − ... − φ_p L^p and Q(L) = 1 + θ_1 L + ... + θ_q L^q. Assume that Q(L) has all roots outside the unit circle, so that the moving average part of the process is invertible; we can then write

[Q(L)]^{−1} P(L)(1−L)^d y_t = ε_t.   (2.1.2)
We will assume also that Q(L) and P(L) have no common factors, and that the roots of P(L) are outside the unit circle. However, we will consider non-stationarity arising through values of d greater than 1/2, and provide both analytic and simulation results for those cases. We treat the process as having zero mean, although we will note below the effect of using an estimated mean, and will use an estimated mean in simulation evaluation of the estimator.
The term (1−L)^d may be expanded as

(1−L)^d = Σ_{k=0}^∞ (d choose k) (−L)^k = Σ_{k=0}^∞ b_k L^k,   (2.1.3)

with b_0 = 1, b_1 = −d, b_2 = −d(1−d)/2, and b_j = (b_{j−1}/j)(j−1−d) for j ≥ 3. If |d| < 1/2, then Σ_{k=0}^∞ (d choose k)² < ∞, and (2.1.3) defines a stationary process. For d > −1/2, (1−L)^d is invertible, and expression (2.1.2) can therefore be used to obtain the coefficients of the infinite autoregressive representation of the ARFIMA process y:
(1 − Σ_{i=1}^∞ α_i L^i) y_t = ε_t,   (2.1.4)

with α_i = −b_i − Σ_{j=1}^q θ_j α_{i−j} + Σ_{j=1}^p φ_j b_{i−j}, and Σ α_i² < ∞ where |d| < 1/2.

2.2 Time-domain estimation
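Before turning to the estimators, note that the coefficients b_k of (2.1.3), and hence the AR(∞) coefficients of (2.1.4), can be generated recursively. A minimal Python sketch (function names are ours; the pure ARFIMA(0,d,0) case is assumed, for which matching (2.1.3) with (2.1.4) gives α_i = −b_i):

```python
def frac_diff_coeffs(d, m):
    """Coefficients b_0, ..., b_m of (1 - L)^d = sum_k b_k L^k,
    via the recursion b_0 = 1, b_j = b_{j-1} * (j - 1 - d) / j."""
    b = [1.0]
    for j in range(1, m + 1):
        b.append(b[-1] * (j - 1 - d) / j)
    return b

def ar_inf_coeffs(d, m):
    """alpha_1, ..., alpha_m of the AR(inf) form (2.1.4) for a pure
    ARFIMA(0,d,0): (1 - L)^d y_t = e_t implies alpha_i = -b_i."""
    return [-bk for bk in frac_diff_coeffs(d, m)[1:]]

d = 0.3
b = frac_diff_coeffs(d, 50)      # expansion of (1 - L)^d
binv = frac_diff_coeffs(-d, 50)  # expansion of (1 - L)^{-d}
# convolving the two expansions should recover the identity filter
conv = [sum(b[j] * binv[k - j] for j in range(k + 1)) for k in range(10)]
```

Here conv[0] equals 1 and the remaining terms vanish to rounding error, illustrating the invertibility of (1−L)^d for d > −1/2.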
Time-domain estimators have been proposed by, among others, Hosking (1981), Li and MacLeod (1986), Sowell (1992a) and Tieslau et al. (1996). The latter two allow joint estimation of all ARFIMA parameters.

Sowell (1992a) gives an exact Maximum Likelihood algorithm for stationary ARFIMA models with distinct roots in the AR polynomial. As Baillie (1996) notes, ML is computationally burdensome here, since substantial calculation (including T×T matrix inversion) is required at each iteration of the numerical optimization. Below we compare the properties of exact ML with those of an estimator based on the coefficients of an autoregressive approximation, analogous to the ARMA estimation methods of Saikkonen (1986) and Galbraith and Zinde-Walsh (1997). To obtain the estimator we will use estimators of the coefficients of a long autoregression, and the autoregressive expansion of the ARFIMA(p,d,q) model, to obtain the estimates of the long-memory parameter d together with parameters which characterize the "short memory" parts of the model. Below we will give three estimators for the coefficients of an autoregressive approximation, on which ARFIMA estimates can later be based. Each of the three has the same asymptotic limit in the stationarity region.¹ One of the three, the OLS estimator, can be used in the non-stationarity region as well; for that reason OLS would be used in applications, where stationarity is not normally known a priori to apply. However other estimators, in particular the Yule-Walker, are convenient for obtaining theoretical properties in the stationarity region.
The OLS estimator ã_OLS solves

ã_OLS = argmin_a (1/T) Σ_{t=k+1}^T (y_t − a_1 y_{t−1} − ... − a_k y_{t−k})².   (2.2.1)
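A minimal self-contained sketch of (2.2.1) (Python; `ols_long_ar` is our name, and we solve the normal equations by direct Gaussian elimination, adequate for the small k used here):

```python
import random

def ols_long_ar(y, k):
    """OLS estimator of (2.2.1): regress y_t on y_{t-1}, ..., y_{t-k}
    over t = k+1, ..., T, solving the normal equations X'X a = X'y."""
    T = len(y)
    XtX = [[sum(y[t - 1 - i] * y[t - 1 - j] for t in range(k, T))
            for j in range(k)] for i in range(k)]
    Xty = [sum(y[t - 1 - i] * y[t] for t in range(k, T)) for i in range(k)]
    # Gaussian elimination with partial pivoting on the augmented system
    A = [row[:] + [rhs] for row, rhs in zip(XtX, Xty)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= f * A[col][c]
    a = [0.0] * k
    for r in range(k - 1, -1, -1):
        a[r] = (A[r][k] - sum(A[r][c] * a[c] for c in range(r + 1, k))) / A[r][r]
    return a

# illustration: a long AR fitted to AR(1) data with coefficient 0.5
random.seed(0)
y, prev = [], 0.0
for _ in range(5000):
    prev = 0.5 * prev + random.gauss(0, 1)
    y.append(prev)
a_hat = ols_long_ar(y, 4)
```

The first fitted coefficient is close to 0.5 and the remaining ones close to zero, as expected for this data-generating process.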
The second estimator is the spectral estimator, asymptotic properties of which were examined by Yajima (1992) for Gaussian errors. We denote it ã_sp; it follows from Yajima (1992) that, in the present notation,

ã_sp = argmin_a (1/T) [ y_1² + (y_2 − a_1 y_1)² + ... + (y_k − a_1 y_{k−1} − ... − a_{k−1} y_1)² + Σ_{t=k+1}^T (y_t − a_1 y_{t−1} − ... − a_k y_{t−k})² ].   (2.2.2)
The third is the Yule-Walker estimator. Define γ_j = E(y_t y_{t−j}) and γ̂_j = (1/(T−k)) Σ_{t=k+1}^T y_t y_{t−j}; Σ(k) as the k×k matrix with {Σ(k)}_{ij} = γ_{|i−j|}; and Σ̂(k) as the matrix with elements {Σ̂(k)}_{ij} = γ̂_{|i−j|}. The Yule-Walker estimator is then ã_YW, which solves¹

Σ̂(k) ã_YW = γ̂(k),   (2.2.3)

where γ̂(k) = (γ̂_1, γ̂_2, ..., γ̂_k)′. Consider also the population analogue of (2.2.3),

Σ(k) a(k) = γ(k),   (2.2.4)

where a(k) is the solution to the system.

¹ Each of course depends on the AR order parameter, k, but to simplify notation this dependence will not be indicated explicitly.
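Because Σ̂(k) is Toeplitz, (2.2.3) can be solved in O(k²) operations by the Levinson-Durbin recursion rather than a general k×k solve. A sketch (Python; the function name is ours), assuming the sample autocovariances have already been computed:

```python
def yule_walker(gammas):
    """Solve the Yule-Walker system (2.2.3)-(2.2.4), Sigma(k) a = gamma(k),
    given gammas = [gamma_0, gamma_1, ..., gamma_k], by the Levinson-Durbin
    recursion, which exploits the Toeplitz structure of Sigma(k)."""
    k = len(gammas) - 1
    a = [gammas[1] / gammas[0]]       # order-1 solution
    v = gammas[0] * (1 - a[0] ** 2)   # prediction-error variance
    for m in range(2, k + 1):
        ref = (gammas[m] - sum(a[j] * gammas[m - 1 - j]
                               for j in range(m - 1))) / v
        a = [a[j] - ref * a[m - 2 - j] for j in range(m - 1)] + [ref]
        v *= 1 - ref ** 2
    return a

# check: for an AR(1) with coefficient 0.6, gamma_j = 0.6**j * gamma_0,
# and the Yule-Walker solution at any order k is (0.6, 0, ..., 0)
g = [0.6 ** j for j in range(5)]
a = yule_walker(g)
```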
The terms in the first-order conditions that ã_OLS and ã_sp solve differ from (2.2.3), for ã_YW, by terms of order in probability at most O_p(kT^{−1}); it follows that all of the estimators above differ by at most O_p(k²T^{−1}).² Therefore each one has the same asymptotic limit a(k), and the choice of k as O(ln T) implies that all of the estimators have the same asymptotic distributions. We will therefore use the notation ã to denote any one of these estimators; no distinction among them need be made in considering asymptotic properties of ARFIMA parameter estimators based on such autoregressive estimates, for −1/2 < d < 1/2. Below we use the Yule-Walker estimator in derivations for the stationary case; we use OLS for derivations in non-stationary cases and in the Monte Carlo experiments reported in subsection 3.3.
The values a(k) of (2.2.4) are related to the coefficients of the infinite autoregressive representation of the stationary ARFIMA process. If we denote by α_{[1,∞)} the vector of coefficients of the infinite autoregression (2.1.4), then that vector solves

Σ(∞) α_{[1,∞)} = γ(∞).   (2.2.5)
² The only non-standard case here is that of −1/2 < d < 0, where the eigenvalues of Σ^{−1} could grow at a rate as high as k^{−2d}.
Partition α_{[1,∞)} as (α_{[1,k]} : α_{[k+1,∞)}), and denote the top-right sub-matrix of Σ(∞), partitioned conformably, by Σ_{[k+1,∞)}. Then Σ(k) α_{[1,k]} + Σ_{[k+1,∞)} α_{[k+1,∞)} = γ(k), and therefore

a(k) − α_{[1,k]} = (Σ(k))^{−1} Σ_{[k+1,∞)} α_{[k+1,∞)}.   (2.2.6)

Thus an autoregressive estimator of α_{[1,k]} will include a deterministic bias represented by (2.2.6). We demonstrate below that this bias goes to zero as k, T → ∞.
2.3 ARFIMA parameter estimation

Now that we have introduced estimators of autoregressive coefficients for the ARFIMA process, we can proceed to define estimators for the full set of ARFIMA parameters based on any of the above estimators. We define the vector of all the parameters of the ARFIMA model to be estimated by ω = (d, φ′, θ′), where φ′ = (φ_1, ..., φ_p) and θ′ = (θ_1, ..., θ_q). Let ã be any of the autoregressive coefficient estimators introduced above, and let α(ω) denote the vector containing the coefficients of the infinite autoregressive representation of the process, viewed as functions of ω.

We will examine a minimum-distance estimator of the form

ω̃ = argmin_ω (ã − α(ω))′ Λ (ã − α(ω)),   (2.2.7)

constructed using any of the estimators ã of (2.2.1)-(2.2.3), where α(ω) is given explicitly in (2.1.4); Λ represents a k×k weighting matrix. In all simulations below, Λ is chosen as the inverse of the covariance matrix of ã, which down-weights imprecisely estimated lags and contributes to the efficiency of the estimator.³
It is the fact that this estimator uses autoregressive parameters, instead of autocorrelations, that allows its use for non-stationary processes.⁴ Note finally that estimates can be obtained for multiple specifications of an ARFIMA model from a single autoregressive fit to the data, which is convenient in model selection (information criteria, for example, can be computed based on the residuals from each of several models).
3. Properties and performance of the estimators
In this section we consider asymptotic and finite-sample properties of ARFIMA parameter estimates based on (2.2.7), obtained using any of the various preliminary autoregressive estimators (OLS, spectral, Yule-Walker) discussed in Section 2. We will also note the corresponding properties of some of the existing ML, MDE and spectral estimators.

Maximum likelihood produces asymptotically Normal estimates of ARFIMA parameters which converge at the usual T^{1/2} rate for stationary Gaussian processes (Dahlhaus 1989). The Quasi-MLE has this property as well for a range of assumptions on the error process.
³ If Λ = I, the k×k identity matrix, we have an unweighted form of the estimator. In simulations we found substantial contributions to finite-sample efficiency from the use of Λ = cov(ã)^{−1}, and this form alone is used in the results presented below.
⁴ It is noteworthy that for the pure ARFIMA(0,d,0) case, an estimator can be based on the first coefficient of the approximating AR model, since that first coefficient converges to −d in this case, which corresponds to (2.1.3). This point has a parallel in the use of the same estimator by Galbraith and Zinde-Walsh (1994) for the pure MA(1) model (and also in the fact, noted by Tieslau et al., that a consistent estimator of d can be based on the first autocorrelation; that is, d̂ = ρ̂_1/(1 + ρ̂_1)). The fact that the same estimate can serve for either of two different models underlines, of course, the importance of model selection in determining the character of the estimated representation.
The minimum-distance estimator of Tieslau et al. (1996), based on sample autocorrelations, converges at the standard rate to an asymptotic normal distribution for d ∈ (−1/2, 1/4); at d = 1/4, convergence is to the normal distribution at rate (T/log(T))^{1/2}; and for d ∈ (1/4, 1/2) convergence is at rate T^{(1/2 − d)} and the limiting distribution is non-normal. For the Geweke and Porter-Hudak estimator, based on least squares estimates of a regression with dependent variable given by the harmonic ordinates of the periodogram, and for a generalized version which discards a number of lower frequencies, the asymptotic properties are obtained by Robinson (1995). Asymptotic properties of the indirect inference estimator for long-memory models (Martin and Wilkins 1999) have not been established.
Before considering properties of ARFIMA parameter estimates based on autoregression, we examine the estimates ã of the autoregressive parameters themselves. In subsection 3.1 we show consistency of ã in both stationary and non-stationary cases, and show that a Normal approximation to the asymptotic distribution can be used. Consistency and distributional results for the estimates ω̃ of the ARFIMA parameters are given in 3.2. Simulation results describing finite-sample performance appear in 3.3.
3.1 Asymptotic properties of the estimators ã
The first set of results concerns consistency of the autoregressive parameter estimates, from which parametric model parameter estimates are later deduced, in both stationary and non-stationary cases. As noted above, we use the Yule-Walker estimator for ã in the stationary case; the same properties then hold for OLS and spectral estimates. Theorem 1 establishes that ã is a consistent estimator of α_{[1,k]} as T → ∞, k → ∞ in the stationarity region; the cases 0 < d < 1/2 and −1/2 < d ≤ 0 receive separate treatment. Theorem 2 establishes the result, using OLS, in the non-stationary case 1/2 < d < 1. For d = 1, the process contains a unit root; see for example Phillips (1987) for asymptotic results. For brevity we omit the case where d = 1/2; a proof of consistency can be constructed similarly to that of Theorem 2, using the rates appropriate to the d = 1/2 case.

Theorem 1. For d ∈ (−1/2, 1/2) and a(k) as defined in (2.2.4), ‖a(k) − α_{[1,k]}‖ = O(k^{|d|−1/2}) as k → ∞. Under the assumptions in Section 2.1, for any δ, ε there exist k, T̃ such that Pr(‖ã(k) − α_{[1,k]}‖ > ε) < δ ∀ T > T̃.
Proof: See Appendix A.
Next consider the nonstationary case 1/2 < d < 1. Shimotsu and Phillips (1999) discuss two possible characterizations of non-stationary I(d) processes for 1/2 < d < 3/2: one defines y_t as a partial sum of a stationary fractionally integrated process z_t, so that y_t = y_0 + Σ_{i=1}^t z_i, while the other defines y_t via the expansion of the function of the lag operator,

(y_t − y_0) = Σ_{i=0}^{t−1} (1/i!) (Γ(d+i)/Γ(d)) ε_{t−i}.

Each leads to an expression of the form (1−L)^d (y_t − y_0) = ε_t; Shimotsu and Phillips show that the essential difference lies in the fact that the first definition involves the presence of pre-sample values. Here we treat y_t as a partial sum and assume that y_0 and all pre-sample values are zeroes; the same results would hold if we assumed that max{y_t : t ≤ 0} = O_p(1). Note that while Theorems 2 and 4 use the partial-sum representation, this estimator does not rely on differenced data in the non-stationary case, so that a priori knowledge of the range in which d lies is not necessary.
Theorem 2. For d ∈ (1/2, 1), let the differenced process z_t = y_t − y_{t−1}, t = 2, ..., T, be a stationary ARFIMA process satisfying the assumptions of Section 2.1, with d̄ = d − 1 < 0, and let y_{−i} = 0 ∀ i ≥ 0. Then as T → ∞, k → ∞, kT^{−1/2} → 0, we have

‖ã − α_{[1,k]}‖ = O_p(k^{−3d+3/2}) if 1/2 < d < 3/4;  O_p(k^{−d}) if d ≥ 3/4.

Proof: See Appendix B.
Next we characterize the asymptotic distributions of the estimators ã(k) of α in the stationarity region, as T, k → ∞.

For fixed k and −1/2 < d < 1/4, the estimator ã(k) has an asymptotic Normal distribution with the usual convergence rate of T^{1/2} and asymptotic mean of a(k) (Yajima 1992). For the coefficients of the infinite autoregression, the difference ã(k) − α_{[1,k]} can be represented as the sum of (ã(k) − a(k)) and (a(k) − α_{[1,k]}); the first of these terms has an asymptotic Normal distribution, and the second goes to zero, by Theorem 1, as k → ∞. The Normal distribution can therefore be used for inference in large samples, for −1/2 < d < 1/4.

For 1/4 < d < 1/2, the asymptotic distribution of ã(k) is not Normal, but is related to the Rosenblatt distribution; the convergence rate is non-standard (Yajima 1992), and if an estimated mean is subtracted the asymptotic distribution changes. Nonetheless, we can again represent ã(k) − α_{[1,k]} as a sum of two terms, one of which has an asymptotic Normal distribution, and the second of which is a `correction' term which can be made arbitrarily small in probability as T, k → ∞. This representation applies to all −1/2 < d < 1/2, and is based on Hosking (1996), where asymptotic normality is established for differences of sample covariances, for all stationary ARFIMA processes, under the conditions given in Section 2.1. The result below thus provides not an asymptotic distribution, but rather a Normal approximation to the finite-sample distribution, for which the covariance and closeness to the true distribution are governed by the choice of k for sufficiently large T. The result is therefore not in conflict with Yajima's asymptotic result, but nonetheless indicates that it is possible to conduct approximate inference using the Normal distribution, for d in this range.
Theorem 3. Under the conditions of Theorem 1, the difference ã(k) − α_{[1,k]} can be represented as the sum ξ(k) + η(k), where for any positive ε, δ there exists k sufficiently large that Pr(‖η(k)‖ < ε) > 1 − δ, and T^{1/2} ξ(k) has a limiting Normal distribution of the form N(0, W(k)).
Proof: See Appendix C. The asymptotic covariance matrix W(k) is given in the proof of Theorem C in Appendix C.
As noted earlier, we suggest choosing k = O(ln T); a particular rule is given below. Figure 1(a-d) illustrates the approximation provided by the Normal by depicting the simulated densities of the first and second autoregressive coefficients in estimated AR representations of ARFIMA(0,d,0) models, d = {0.4, 0.5, 0.7, 0.9}, T = 100 (k = 8). In each case the Normal approximation is very close to the true distribution of the AR coefficients; small but clearly discernible departures from the Normal are visible for larger values of d, particularly near the means of the distributions. Note that these simulations include values of d ≥ 0.5, to which Theorem 3 does not apply; nonetheless the Normal approximation remains close in these cases as well.
3.2 ARFIMA parameter estimates

In order to discuss consistency of the ARFIMA parameter estimates ω̃ given in (2.2.7) above, we need an additional condition.

Condition 3.2. There exists a non-stochastic infinite-dimensional matrix Π corresponding to a bounded norm operator on the space L₂ such that ‖Λ − Π_k‖ → 0, where Π_k is the k×k principal sub-matrix of Π (convergence is in probability if Λ is stochastic).

If the estimator is used in unweighted form (Λ = I), then all of the matrices involved are identity matrices and this condition is trivially satisfied. If the weighting matrix is an inverse covariance matrix (known or estimated) of a stationary and invertible process, then Π corresponds to the Toeplitz form of the inverse process.
The next theorem shows that ω̃ is a consistent estimator of the true vector of parameters ω₀, and Theorem 5 shows that the difference between the two can again be represented as a sum of two terms, one with an asymptotically Normal distribution, and one which goes to zero in probability as k, T → ∞. We denote the quadratic form (ã − α(ω))′ Λ (ã − α(ω)) by Q_{T,k}(ã, ω).

Theorem 4. Under the conditions of Theorems 1 and 2 and Condition 3.2, and where T → ∞, k → ∞ and kT^{−1/2} → 0, the minimum-distance estimator ω̃ is consistent for ω₀, the true parameter vector of a correctly-specified ARFIMA model. The corresponding distance function min_ω Q_{T,k}(ã, ω) goes to zero in probability.
The difference ω̃ − ω₀ can again be given an asymptotically Normal representation, as Theorem 5 indicates.
Theorem 5. Under the conditions of Theorem 1, the difference ω̃ − ω₀ can be represented as the sum ζ(k) + b(k), where for each k the correction term is

b(k) = [ (∂α(ω₀)′/∂ω) Λ (∂α(ω₀)/∂ω′) ]^{−1} (∂α(ω₀)′/∂ω) Λ η(k),

with η(k) as given in Theorem 3, and T^{1/2} ζ(k) is asymptotically distributed as N(0, V(k)) as T → ∞, for each k.
Proof: See Appendix C. The asymptotic covariance matrix V(k) is given in the proof.
3.3 Performance in finite samples
To investigate the performance of these estimators in small samples, we generate simulated samples from a selection of ARFIMA(0,d,0), (1,d,0) and (1,d,1) processes, and compare the results by parameter root mean squared error (RMSE). ML and autoregressive estimators, along with GPH in the (0,d,0) cases, are compared. In order to examine the impact of mis-specification we consider several cases of (0,d,0) models of (1,d,0) processes; in these cases performance is evaluated by the out-of-sample RMSE of 1-step forecasts.

In all of the following simulations, the mean is treated as unknown and the sample mean is removed from the process prior to ARFIMA parameter estimation. In the frequency domain, removal of an estimated mean is not required, since the zero frequency is excluded; time- and frequency-domain methods perform comparably in moderately large samples, and in fact time-domain ML shows lower MSE in small samples.
Simulated fractional noise is obtained by transformation of (Gaussian) white noise using the Cholesky decomposition of the exact covariance of the ARFIMA process, obtained using the results of Chung (1994); see also Diebold and Rudebusch (1989). In each case estimators are compared using the same sets of starting values (zero in (0,d,0) cases; the better of two alternatives is chosen in multiple-parameter cases). Optimization is performed using the Powell algorithm. Throughout, we use the rule of thumb that the AR lag length can be chosen as 8 + 3 ln(T/100), T ≥ 100, rounded to the nearest integer. This rule is approximately equivalent to 3 ln(T) − 6; we express it in the former way to emphasize T = 100 as the `base' sample size.
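The lag-length rule of thumb is trivial to implement; a sketch (Python; the function name is ours):

```python
import math

def ar_lag_length(T):
    """Rule of thumb from the text: k = 8 + 3*ln(T/100) for T >= 100,
    rounded to the nearest integer (approximately 3*ln(T) - 6)."""
    if T < 100:
        raise ValueError("rule stated for T >= 100")
    return round(8 + 3 * math.log(T / 100))

sizes = {T: ar_lag_length(T) for T in (100, 400, 1000, 10000)}
```

For the sample sizes used below this gives k = 8 at T = 100 and k = 12 at T = 400.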
At least 1000 replications are used in each case, more in the (0,d,0) cases. For multiple-parameter models we use a sample size of 100, in common with much of the literature containing results on exact and approximate ML; the computational cost of the repeated inversion of T×T covariance matrices associated with ML becomes prohibitive for simulation at large sample sizes. In the (0,d,0) cases we report results for T = {100, 400}.
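The simulation design above, Cholesky draws from the exact covariance of fractional noise, can be sketched as follows; the closed-form autocovariance and its recursion are standard for ARFIMA(0,d,0) (Hosking 1981), while the function names and the unit innovation variance are our choices:

```python
import math, random

def arfima0d0_autocov(d, n):
    """Exact autocovariances gamma_0, ..., gamma_{n-1} of ARFIMA(0,d,0)
    with unit innovation variance: gamma_0 = Gamma(1-2d)/Gamma(1-d)^2,
    and gamma_j = gamma_{j-1} * (j - 1 + d) / (j - d)."""
    g = [math.gamma(1 - 2 * d) / math.gamma(1 - d) ** 2]
    for j in range(1, n):
        g.append(g[-1] * (j - 1 + d) / (j - d))
    return g

def simulate_fractional_noise(d, n, seed=0):
    """One sample path: the Cholesky factor of the Toeplitz covariance
    matrix [gamma_|i-j|] applied to Gaussian white noise."""
    g = arfima0d0_autocov(d, n)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = g[abs(i - j)] - sum(L[i][m] * L[j][m] for m in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    return [sum(L[i][m] * z[m] for m in range(i + 1)) for i in range(n)]

y = simulate_fractional_noise(0.3, 100)
```

The implied first-order autocorrelation is d/(1−d), the quantity inverted by the ρ̂₁-based estimator mentioned in footnote 4.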
(i) ARFIMA(0,d,0)

We begin by comparing estimators in the pure fractionally-integrated case, which has been the most thoroughly examined in the literature to date. Figures 2a/b plot the RMSE's from ML, GPH and the AR estimator, on the grid of values from d = −0.4 to d = 0.9 at an interval of 0.1.⁵ Both ML and AR show lower RMSE than GPH throughout the stationarity region; non-stationary values of d are considered as well. The AR estimator has lower RMSE than ML at moderate absolute values of d, slightly higher RMSE at d = 0.3, 0.4, and substantially higher at d = −0.4. However, ML is constrained by the optimization algorithm to lie in d ∈ (−0.5, 0.5), whereas AR is not constrained in this way, being usable outside this interval; ML therefore benefits near −0.5 and 0.5 from being constrained to lie in a region around the true value which is fairly tightly bounded on one side.
Qualitative results do not differ greatly between the two sample sizes of Figures 2a and 2b. At T = 400, the RMSE of the AR estimator is almost completely insensitive to the value of d, except at −0.4 where it rises to almost the value produced by GPH.
(ii) ARFIMA(1,d,0) and ARFIMA(1,d,1)

The first multiple-parameter cases that we consider are ARFIMA(1,d,0), a simple model which allows short-run and long-run components. In these cases, one element of the estimation error arises through the difficulty of discriminating between these two components at small sample sizes, since AR and fractional integration parameters can imply similar patterns of autocorrelation for the first few lags. Table 1 gives results for a few cases similar to the ARFIMA(1,d,1) processes that we will address below: d = {−0.3, 0.3} and φ = {0.7, −0.2}.

⁵ ML estimation for non-stationary values of d is applied to differenced data, implying knowledge of the need for differencing. The maximum number of periodogram ordinates is included for GPH, which is optimal for this case in which there is no short-memory component.
Table 1
RMSE's of ARFIMA(1,d,0) parameter estimates
T = 100; 2000 replications (ML), 10000 replications (AR)

   d      φ     d̂(AR)   d̂(ML)   φ̂(AR)   φ̂(ML)
 -0.3    0.7    0.266    0.186    0.287    0.195
 -0.3   -0.2    0.110    0.120    0.137    0.139
  0.3    0.7    0.179    0.237    0.150    0.116
  0.3   -0.2    0.209    0.233    0.217    0.241
Neither estimator dominates in these examples; ML is markedly better at (−0.3, 0.7), and AR is better at (0.3, −0.2). The two are very similar at (−0.3, −0.2). For the case (0.3, 0.7), possibly the most interesting in combining positive d with positive short-memory autocorrelation, d is better estimated by the AR estimator, and φ by ML.
The ARFIMA(1,d,1) parameterizations examined in Table 2 are those used by Chung and Baillie (1993). Note that in each case the short-memory parameters φ and θ have the same sign, so that the corresponding terms (1−φL) and (1+θL) in the lag polynomials do not approach cancellation; at φ = 0.5, θ = −0.5, for example, cancellation would take place and the apparent ARFIMA(1,d,1) process would in fact be ARFIMA(0,d,0). In cases of near-cancellation, the process may be well approximated by a process with substantially different parameter values, but similar roots of the short-memory polynomials, leading to difficulty in evaluating the estimates. Such cases are ruled out by parameter values such as those considered here.

Table 2
RMSE's of ARFIMA(1,d,1) parameter estimates
T = 100; 1000 replications (ML), 5000 replications (AR)

 (d, φ, θ)            d̂(AR)   d̂(ML)   φ̂(AR)   φ̂(ML)   θ̂(AR)   θ̂(ML)
 (-0.3,  0.5,  0.2)   0.251    0.163    0.350    0.220    0.213    0.163
 (-0.3,  0.2,  0.5)   0.207    0.151    0.295    0.237    0.152    0.154
 (-0.3, -0.5, -0.2)   0.339    0.162    0.136    0.139    0.391    0.254
 (-0.3, -0.2, -0.5)   0.333    0.179    0.183    0.173    0.291    0.250
 ( 0.0,  0.5,  0.2)   0.285    0.298    0.307    0.235    0.198    0.154
 ( 0.0,  0.2,  0.5)   0.291    0.285    0.332    0.313    0.144    0.139
 ( 0.0, -0.5, -0.2)   0.228    0.187    0.154    0.142    0.328    0.265
 ( 0.0, -0.2, -0.5)   0.243    0.263    0.220    0.202    0.384    0.381
 ( 0.3,  0.5,  0.2)   0.375    0.408    0.285    0.271    0.178    0.152
 ( 0.3,  0.2,  0.5)   0.375    0.390    0.378    0.381    0.132    0.124
 ( 0.3, -0.5, -0.2)   0.231    0.209    0.156    0.149    0.324    0.285
 ( 0.3, -0.2, -0.5)   0.286    0.314    0.238    0.207    0.434    0.430
The RMSE's of ML estimates are generally somewhat better than those reported by Chung and Baillie for the approximate CSS estimator. The pattern of relative RMSE's for the AR and ML estimators is broadly similar to that observed in lower-order cases. Neither estimator dominates; ML gives lower RMSE for d in seven of the twelve cases, AR in five. Four of the cases in which ML is superior are the four cases with negative d, where ML's performance is markedly better. For non-negative values, performance of the two for d is similar. In estimation of the short-memory parameters, ML shows an advantage throughout the parameter space, although the differences in RMSE between the techniques are generally not large.
In the final set of simulations we consider ML and AR estimates of d in processes that are ARFIMA(1,d,1), but where the model used is ARFIMA(0,d,0). These results serve to illustrate the point that the ARFIMA model estimators need not be seen as purely `parametric' estimators, in the sense of depending crucially on a correct parameterization of the process; this is true in particular of the AR estimator, where the autoregressive structure which extracts statistical information from the data is capable of fitting a very general class of processes, and for some purposes useful estimates can also be obtained from mis-specified models. In this sense this estimator may be thought of as semi-parametric: the long-memory component is captured via the parametric fractionally-integrated model with parameter d, and the short-memory component via a variable number of AR (or ARMA) terms, which may be increased with sample size to detect increasingly subtle short-memory features as sample information accumulates.
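The autoregressive-approximation idea can be illustrated with a small numerical sketch. This is not the authors' exact estimator: the truncated (type-II) simulation scheme, the AR order k = 20, and the grid-search matching of fitted AR(k) coefficients to the theoretical AR(∞) coefficients of (1 - L)^d are all choices made here purely for illustration.

```python
import numpy as np

def frac_ar_coeffs(d, k):
    """First k AR(infinity) coefficients of (1-L)^d x_t = e_t, i.e.
    x_t = sum_j a_j x_{t-j} + e_t with a_j = -pi_j, where
    pi_0 = 1 and pi_j = pi_{j-1} * (j - 1 - d) / j."""
    pi = np.empty(k + 1)
    pi[0] = 1.0
    for j in range(1, k + 1):
        pi[j] = pi[j - 1] * (j - 1 - d) / j
    return -pi[1:]

def simulate_arfima0d0(d, T, rng):
    """Simulate ARFIMA(0,d,0) recursively from its truncated AR form."""
    a = frac_ar_coeffs(d, T)
    x = np.zeros(T)
    e = rng.standard_normal(T)
    for t in range(T):
        # sum_{j=1}^{t} a_j x_{t-j}: reverse the history so a_1 meets x_{t-1}
        x[t] = a[:t] @ x[:t][::-1] + e[t]
    return x

def fit_ar(x, k):
    """OLS fit of an AR(k): regress x_t on x_{t-1}, ..., x_{t-k}."""
    T = len(x)
    X = np.column_stack([x[k - j:T - j] for j in range(1, k + 1)])
    return np.linalg.lstsq(X, x[k:], rcond=None)[0]

def estimate_d(x, k):
    """Match fitted AR(k) coefficients to theoretical ones by grid search over d."""
    grid = np.linspace(-0.45, 0.45, 181)
    ahat = fit_ar(x, k)
    sse = [np.sum((ahat - frac_ar_coeffs(d, k)) ** 2) for d in grid]
    return grid[int(np.argmin(sse))]

rng = np.random.default_rng(0)
x = simulate_arfima0d0(0.3, 2000, rng)
dhat = estimate_d(x, 20)
print(dhat)
```

With T = 2000 the grid search recovers d near its true value of 0.3; adding fitted short-memory terms, as in the paper's estimator, would refine the match in ARFIMA(p,d,q) cases.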
In mis-specified cases the obvious criterion of evaluation, accuracy of parameter estimates, is not applicable; some parameters are missing, and in such cases it will typically be optimal to deviate from the `true' values of parameters for some purposes. For this exercise, we therefore evaluate the mis-specified models by the accuracy (RMSE) of 1-step out-of-sample forecasts of the true process. We consider d = {-0.3, 0, 0.3} and (φ, θ) = {(0.7, 0.5), (0, 0), (0.2, 0.2)}, for a total of nine cases (cases in which (φ, θ) = (0, 0) are of course correctly specified, and are included for comparison). We again use T = 100 in each case, and the noise variance is unity. Results are presented in Table 3. This small model may be of particular interest in cases where the true process form is unknown.
Table 3
RMS Forecast Errors in ARFIMA(0,d,0) Models of ARFIMA(1,d,1) Processes
T = 100; 5000 replications

 d     φ    θ     AR      ML
-0.3   0    0     1.0026  1.0016
 0     0    0     1.0028  1.0036
 0.3   0    0     1.0103  1.0103
-0.3   0.2  0.2   1.0263  1.0262
 0     0.2  0.2   1.0265  1.0256
 0.3   0.2  0.2   1.0280  1.0351
-0.3   0.7  0.5   1.1378  1.1566
 0     0.7  0.5   1.1382  1.2728
 0.3   0.7  0.5   1.1381  1.5823
In the correctly-specified comparison cases, these RMS forecast error results mirror the parameter RMSE results of Figure 2a; ML is superior at d = -0.3, AR at d = 0, and the two are very close at d = 0.3. Where (φ, θ) = (0.2, 0.2), ML produces slightly better results at d = -0.3 and 0. However, the AR forecasts are substantially better at d = 0.3, a phenomenon which appears more strongly at (φ, θ) = (0.7, 0.5): in all of the latter cases, AR estimates produce markedly better 1-step forecasts. Note that in cases such as these, where there is substantial autocorrelation from the short-memory parameters, the AR estimator is able to compensate for the lack of short-memory components in the estimated model. Comparable forecasts could also be produced by imposing a differenced representation; however, to do so it is again necessary to make a determination of an optimal degree of differencing, either a priori or based on features of the data, which the AR estimator does not require.
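The 1-step forecasts compared above take the usual linear-predictor form; given any fitted AR(k) coefficients (from either estimation method), the forecast is a weighted sum of the k most recent observations. A minimal sketch, with hypothetical coefficient values chosen only for the example:

```python
import numpy as np

def ar_one_step_forecast(x, a):
    """1-step forecast from an AR(k) fit: x_hat_{T+1} = sum_j a_j x_{T+1-j}.
    `a` holds a_1, ..., a_k, with a_1 applied to the most recent value."""
    k = len(a)
    return float(np.dot(a, x[-1:-k - 1:-1]))

# Toy series and hypothetical fitted AR(2) coefficients:
x = np.array([0.4, -0.1, 0.3, 0.2, 0.5])
a = np.array([0.6, -0.2])            # a_1 = 0.6, a_2 = -0.2
print(ar_one_step_forecast(x, a))    # 0.6*0.5 - 0.2*0.2
```

The tabulated RMS forecast errors are then the square root of the mean squared difference between such forecasts and the realized values, taken across replications.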
4. Concluding remarks

Estimation of long-memory models by autoregressive approximation is feasible even in quite small samples, and is consistent across a range of stationary and non-stationary values of the long-memory parameter, so that estimation does not require knowledge of an appropriate transformation to apply.
Autoregressive estimation, and model selection based on these estimates, is often convenient in choosing starting values for techniques such as ML; estimation based on this principle has been used for ARMA models as far back as Durbin (1960). While this is one potential use of autoregressive estimates, such estimates seem to have desirable features beyond their ability to produce good starting values quickly. Their relative insensitivity to the stationarity of the process, and their ability, for a wide class of processes, to provide an arbitrarily good statistical approximation as sample size and order increase, make them well suited to the treatment of processes of unknown form. This may be especially valuable in economic data, for which the ARFIMA class itself will typically be only an approximation.

References
Agiakloglou, C., P. Newbold and M. Wohar (1992) Bias in an Estimator of the Fractional Difference Parameter. Journal of Time Series Analysis 14, 235-246.
Amemiya, T. (1985) Advanced Econometrics. Harvard University Press, Cambridge, MA.
Berk, K.N. (1974) Consistent Autoregressive Spectral Estimates. Annals of Statistics 2, 489-502.
Cheung, Y.W. and F.X. Diebold (1994) On Maximum-Likelihood Estimation of the Differencing Parameter of Fractionally-Integrated Noise with Unknown Mean. Journal of Econometrics 62, 311-316.
Chung, C.-F. (1994) A Note on Calculating the Autocovariances of the Fractionally Integrated ARMA Models. Economics Letters 45, 293-297.
Chung, C.-F. and R.T. Baillie (1993) Small Sample Bias in Conditional Sum of Squares Estimators of Fractionally-Integrated ARMA Models. Empirical Economics 18, 791-806.
Dahlhaus, R. (1989) Efficient Parameter Estimation for Self-Similar Processes. Annals of Statistics 17, 1749-1766.
Davies, R.B. and D.S. Harte (1987) Tests for Hurst Effect. Biometrika 74, 95-101.
Diebold, F.X. and G.D. Rudebusch (1989) Long Memory and Persistence in Aggregate Output. Journal of Monetary Economics 24, 189-209.
Durbin, J. (1960) The Fitting of Time Series Models. Review of the International Statistical Institute 28, 233-243.
Galbraith, J.W. and V. Zinde-Walsh (1994) A Simple, Non-Iterative Estimator for Moving-Average Models. Biometrika 81, 143-155.
Galbraith, J.W. and V. Zinde-Walsh (1997) On Some Simple, Autoregression-Based Estimation and Identification Techniques for ARMA Models. Biometrika 84, 685-696.
Geweke, J. and S. Porter-Hudak (1983) The Estimation and Application of Long Memory Time Series Models. Journal of Time Series Analysis 4, 221-238.
Giraitis, L. and D. Surgailis (1990) A Central Limit Theorem for Quadratic Forms in Strongly Dependent Linear Variables and its Application to Asymptotic Normality of Whittle's Estimate. Probability Theory and Related Fields 86, 87-104.
Granger, C.W.J. (1980) Long Memory Relationships and the Aggregation of Dynamic Models. Journal of Econometrics 14, 227-238.
Granger, C.W.J. and R. Joyeux (1980) An Introduction to Long-Memory Time Series Models and Fractional Differencing. Journal of Time Series Analysis 1, 15-39.
Grenander, U. and G. Szegő (1958) Toeplitz Forms and Their Applications. University of California Press, Berkeley.
Hosking, J.R.M. (1981) Fractional Differencing. Biometrika 68, 165-176.
Hosking, J.R.M. (1996) Asymptotic Distributions of the Sample Mean, Autocovariances, and Autocorrelations of Long-Memory Time Series. Journal of Econometrics 73, 261-284.
Hurst, H.E. (1951) Long-Term Storage Capacity of Reservoirs. Transactions of the American Society of Civil Engineers 116, 770-799.
Janacek, G.J. (1982) Determining the Degree of Differencing for Time Series via the Long Spectrum. Journal of Time Series Analysis 3, 177-183.
Li, W.K. and A.I. McLeod (1986) Fractional Time Series Modelling. Biometrika 73, 217-221.
Martin, V.L. and N.P. Wilkins (1999) Indirect Estimation of ARFIMA and VARFIMA Models. Journal of Econometrics 93, 149-175.
Phillips, P.C.B. (1987) Time Series Regression with a Unit Root. Econometrica 55, 277-301.
Priestley, M.B. (1981) Spectral Analysis and Time Series. Academic Press, New York.
Robinson, P.M. (1991) Testing for Strong Serial Correlation and Dynamic Heteroskedasticity in Multiple Regression. Journal of Econometrics 47, 67-84.
Robinson, P.M. (1995) Log-Periodogram Regression of Time Series with Long-Range Dependence. Annals of Statistics 23, 1048-1072.
Saikkonen, P. (1986) Asymptotic Properties of Some Preliminary Estimators for Autoregressive Moving Average Time Series Models. Journal of Time Series Analysis 7, 133-155.
Shimotsu, K. and P.C.B. Phillips (1999) Modified Local Whittle Estimation of the Memory Parameter in the Nonstationary Case. Working paper, Yale University.
Sowell, F.B. (1992) Maximum Likelihood Estimation of Stationary Univariate Fractionally Integrated Time Series Models. Journal of Econometrics 53, 165-188.
Tanaka, K. (1999) Nonstationary Fractional Unit Roots. Econometric Theory 15, 549-582.
Tieslau, M.A., P. Schmidt and R.T. Baillie (1996) A Minimum-Distance Estimator for Long-Memory Processes. Journal of Econometrics 71, 249-264.
Yajima, Y. (1992) Asymptotic Properties of Estimates in Incorrect ARMA Models for Long-Memory Time Series. Proc. IMA, Springer-Verlag, 375-382.
We also consider estimators that allow us to use Hosking's (1996) results on asymptotic normality of the differences between sample autocovariances. An estimator with these properties that is similar in form to ã_YW can be constructed in the following way. We define the vector γ(k,0) = (γ_1 - γ_2, γ_2 - γ_3, ..., γ_k - γ_{k+1}) and its sample analogue γ̂(k,0), and the matrix Γ(k,0) with typical element {Γ(k,0)}_{ij} = γ_{|i-j|} - γ_{|i-j+1|}, and its sample analogue Γ̂(k,0). Define the estimator â(k,0) as the solution to

  Γ̂(k,0) â(k,0) = γ̂(k,0),   (2.2.5)

and let a(k,0) be the solution to

  Γ(k,0) a(k,0) = γ(k,0).   (2.2.6)

As plim(ã) is denoted by a(k), here a(k,0) = plim(â(k,0)).

An estimator with a simpler form was proposed by Hosking, who suggested that ratios of differences of autocorrelations can be used. Indeed, consider the vector â_H with components

  â_H(i) = (1 - ρ̂_i) / (1 - ρ̂_{k+1}),  i = 1, ..., k.

This vector converges at rate T^{-1/2} to its population analogue a_H and has an asymptotically normal distribution for all d ∈ (-1/2, 1/2).
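The ratio form above is straightforward to compute from sample autocorrelations. A minimal sketch, using the standard biased sample autocovariance (the choice of that estimator, and the illustrative white-noise input, are assumptions made here, not part of the paper's design):

```python
import numpy as np

def sample_acf(x, max_lag):
    """Biased sample autocorrelations rho_hat_1, ..., rho_hat_max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    g0 = np.dot(x, x) / len(x)
    return np.array([np.dot(x[lag:], x[:-lag]) / len(x) / g0
                     for lag in range(1, max_lag + 1)])

def hosking_ratio_vector(x, k):
    """a_hat_H(i) = (1 - rho_hat_i) / (1 - rho_hat_{k+1}), i = 1, ..., k."""
    rho = sample_acf(x, k + 1)
    return (1.0 - rho[:k]) / (1.0 - rho[k])

rng = np.random.default_rng(1)
x = rng.standard_normal(500)     # illustrative input only
ah = hosking_ratio_vector(x, 4)
print(ah.shape, np.all(np.isfinite(ah)))
```

In practice x would be the observed long-memory series, and the components of â_H would feed the subsequent estimation of d.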
Proof of Theorem 1.

Consider the expression in (2.2.6). For the norm of the left-hand-side vector we can write

  ‖a(k) - α_{[1,k]}‖ ≤ ‖Γ(k)^{-1}‖ · ‖Γ_{[k+1,∞)} α_{[k+1,∞)}‖.

We begin by evaluating the norm of Γ_{[k+1,∞)} α_{[k+1,∞)}. Denote the j-th component of this vector by s_j, where

  s_j = Σ_{i=1}^{∞} γ_{k+i-j} α_{k+i},  j = 1, 2, ..., k.

Recall that we can bound |γ_l| and |α_l| as follows: |γ_l| < c_γ l^{2d-1} and |α_l| < c_α l^{-d-1} for some c_γ, c_α > 0. Then for -1/2 < d < 0 we have

  s_j ≤ c_γ c_α Σ_{i=1}^{∞} (k - j + i)^{2d-1} (k + i)^{-d-1} ≤ c̃ k^{-d-1} (k - j)^{2d},

where c̃ > 0 is some constant. The last inequality is obtained by noting that Σ_{i=1}^{∞} (k - j + i)^{2d-1} is O((k - j)^{2d}). For 0 < d < 1/2,

  s_j ≤ c_γ c_α Σ_{i=1}^{∞} (k - j + i)^{2d-1} (k + i)^{-d-1} ≤ c̃ k^{-d} (k - j)^{2d-1},

where c̃ > 0 is some constant. Here the last inequality similarly follows from Σ_{i=1}^{∞} (k + i)^{-d-1} = O(k^{-d}).

The norm of the vector with components s_j is ‖s‖ = (Σ_{j=1}^{k} s_j²)^{1/2}, and thus

  ‖s‖ ≤ c̃ (k^{-2d-2} Σ_{j=1}^{k} j^{4d})^{1/2} = O(k^{(2d-1)/2})  if -1/2 < d < 0;
  ‖s‖ ≤ c̃ (k^{-2d} Σ_{j=1}^{k} j^{4d-2})^{1/2} = O(k^{(2d-1)/2})  if 0 ≤ d < 1/2.   (A.1)

Next consider ‖Γ(k)^{-1}‖. It is well known that if all the eigenvalues of Γ(k) are greater than some λ(k) > 0, then ‖Γ(k)^{-1}‖ < (λ(k))^{-1}. Consider the spectral density function associated with γ; denote by f(x) the spectral density multiplied by 2π. Here

  f(x) = [P(e^{ix}) P(e^{-ix})]^{-1} (2 - 2 cos x)^{-d} Q(e^{ix}) Q(e^{-ix}).

The polynomials P, Q, with none of their roots on the unit circle, are bounded there from above and below, providing bounds on f(x):

  B_L (2 - 2 cos x)^{-d} < f(x) < B_U (2 - 2 cos x)^{-d}.

By Grenander and Szegő (p. 64) this implies that the eigenvalues associated with Γ(k) are bounded from below by the eigenvalues associated with the function f_L(x) = B_L (2 - 2 cos x)^{-d}.

In the case 0 < d < 1/2 the lower bound of f_L(x) is B_L 4^{-d}; it follows that ‖a(k) - α_{[1,k]}‖ = O(k^{d-1/2}).

When d < 0 the lower bound of f_L(x) is zero. We use a different approach to obtain a tighter bound for the eigenvalues of Γ(k) from below, and evaluate the rate at which this lower bound goes to zero. To do this we use properties of eigenvalues of symmetric matrices and of circulant matrices.

Property 1 (Separation Theorem; Wilkinson (1965), pp. 103-104). If S_n is a symmetric matrix and is partitioned as

  S_n = [ S_{n-1}  s ; s′  σ_1 ],

then the eigenvalues λ_i(S_{n-1}) of S_{n-1}, numbered in order of magnitude, separate the eigenvalues λ_i(S_n) of S_n:

  λ_1(S_n) ≤ λ_1(S_{n-1}) ≤ λ_2(S_n) ≤ ... ≤ λ_n(S_n).

Property 2 (Priestley (1981), p. 261). If C_k is a circulant matrix (and also a Hermitian Toeplitz matrix) of order 2k+1 with first row (γ_0, γ_1, ..., γ_k, γ_k, ..., γ_1), then its eigenvalues (here not subscripted by order of magnitude) are given by

  λ_l(C_k) = Σ_{r=-k}^{k} γ_r exp(-i ω_l r),  where ω_l = 2πl/(2k+1).   (A.2)

We shall evaluate the eigenvalues of Γ(k) by embedding this matrix in a circulant C_k. By Property 1 the smallest eigenvalue of Γ(k) is bounded from below by the smallest eigenvalue of C_k, which will be given by λ_l(C_k) in (A.2) for an appropriate ω_l; moreover, by applying the separation theorem repeatedly we can see that the smallest eigenvalue of Γ(k) is bounded from below by any of the k smallest eigenvalues of C_k.

For large k, λ_l(C_k) is approximated by f(ω_l) = Σ_{r=-∞}^{∞} γ_r exp(-i ω_l r), which without loss of generality we can take to equal (1 - cos ω_l)^{-d}; moreover for l = 1 we have that

  λ_1(C_k) - f(ω_1) = -2 Σ_{r=k+1}^{∞} γ_r cos(ω_1 r) > 0

by comparing magnitudes of groups of positive and negative terms in the sum; thus

  λ_1(C_k) ≥ f(ω_1) = (1 - cos(2π/(2k+1)))^{-d} ≍ O(k^{2d}),

where ≍ indicates that λ_1(C_k) declines at exactly the rate O(k^{2d}). Next we show that λ_1(C_k) is among the k smallest eigenvalues. Indeed, for any l,

  |λ_l(C_k) - f(ω_l)| = |2 Σ_{r=k+1}^{∞} γ_r cos(ω_l r)| = O(k^{2d}).

On the other hand, if k/2 > l > k^β with β < 1,

  f(ω_l) = (1 - cos(2πl/(2k+1)))^{-d} = O(k^{2d(1-β)});

thus for large enough k the corresponding λ_l(C_k) > λ_1(C_k).

It follows that ‖Γ(k)^{-1}‖ = O(k^{-2d}). Combining this with (A.1) we get that

  ‖a(k) - α_{[1,k]}‖ = O(k^{-2d + d - 1/2}) = O(k^{-d - 1/2}).

Finally, for any δ, ε choose k such that ‖a(k) - α_{[1,k]}‖ < ε/2; then for that value of k, by consistency of ã for a(k) (Yajima, 1992), find T̃ such that Pr(‖ã - a(k)‖ > ε/2) < δ for samples of size T > T̃.
Appendix B.

We begin with a lemma which will be useful for the proof of Theorem 2.

Lemma B. Under the assumptions in Theorem 2,

  T^{-2d} Σ_{t=k}^{T} y_t² = O_p(1), but not o_p(1);   (B.1)

  E(z_i y_j) = O(1) if i ≤ j, and O((i - j)^{2d-2}) if i > j;   (B.2)

  E(z_i y_j)² = O(j);   (B.3)

  E(z_i y_{i-l} z_j y_{j-l}) = O(1) for i ≠ j;   (B.4)

  E(T^{-1} Σ_t z_{t-l} y_t)² = O(1);   (B.5)

  T^{-1} Σ_t z_{t-l} y_t = O_p(1).   (B.6)

Proof. For y_t defined here as the partial sum of a stationary fractionally integrated process, y_t = Σ_{i=1}^{t} z_i, it was shown by Shimotsu and Phillips (1999) that the order of magnitude of y_t is similar to that obtained when the process is defined directly via the expansion of the operator (1 - L)^{-d}. Thus, similarly, by Lemma 2.13 in Shimotsu and Phillips we have that E(y_T²) = O(T^{2d-1}), and from Tanaka (1999) T^{-2d} Σ_{t=k}^{T} y_{t-1}² converges in law to a functional of an integrated Brownian motion and is thus bounded in probability, which gives (B.1).

Since y_t is a partial sum of z, where E(z_i z_j) = γ_{i-j}, we have that E(z_i y_j) = Σ_{l=1}^{j} γ_{i-l}. Recall that γ_m for the process z is O(m^{2d-3}); (B.2) follows.

Next we consider the MA representation z_t = Σ_{l=0}^{∞} ψ_l e_{t-l}, and recall that ψ_l = O(l^{d-2}) and that e has bounded fourth moments by the assumptions in 2.1. We can express

  E(z_{i1} z_{i2} z_{i3} z_{i4}) = E(e⁴) Σ_{l=0}^{∞} ψ_l ψ_{l+i1-i2} ψ_{l+i1-i3} ψ_{l+i1-i4}
    + (E(e²))² [γ_{i1-i2} γ_{i3-i4} + γ_{i1-i3} γ_{i2-i4} + γ_{i1-i4} γ_{i2-i3}]
  = O(|i1-i2|^{d-2} |i1-i3|^{d-2} |i1-i4|^{d-2} + |i1-i2|^{2d-3} |i3-i4|^{2d-3}
    + |i1-i3|^{2d-3} |i2-i4|^{2d-3} + |i1-i4|^{2d-3} |i2-i3|^{2d-3})

for all different subscripts on z, and similarly E(z_i² z_l²) = O(1) and E(z_j² z_l z_m) = O(|j-l|^{3d-3} |j-m|^{2d-3}).

Then (B.3) is obtained by expressing

  E(z_i y_j)² = Σ_{l=1}^{j} E(z_i² z_l²) + 2 Σ_{l=1}^{j} Σ_{m=l+1}^{j} E(z_i² z_l z_m),

substituting the expectation for each term from the relations above and evaluating the orders. Note that it is the terms E(z_i² z_l²) that provide the largest contribution to the sum.

A similar substitution into the expression

  E(z_i y_{i-l} z_j y_{j-l}) = Σ_{n=1}^{i-l} Σ_{m=1}^{j-l} E(z_i z_n z_j z_m)

provides (B.4); note that this expression has no terms of the form E(z_i² z_j²). The expression in (B.5) immediately follows, and (B.6) then follows by Chebyshev's inequality.

Proof of Theorem 2.

Consider the AR(∞) representation and write

  y_t - α_1 y_{t-1} - ... - α_k y_{t-k} = Σ_{i=1}^{∞} α_{k+i} y_{t-k-i} + ε_t,  t = k+1, ..., T.   (B.7)

Recall that y with a non-positive subscript is zero; thus on the right-hand side we have

  Σ_{i=1}^{t-k} α_{k+i} y_{t-k-i} + ε_t;

denote this quantity, which represents the error in the k-th order AR representation, by u_t. Compare the OLS estimator â′ = (â_1, ..., â_k) of the model

  y_t = α_1 y_{t-1} + ... + α_k y_{t-k} + u_t   (B.8)

with the OLS estimator β̂′ = (β̂_1, ..., β̂_k) of

  y_t = X_t β + u_t   (B.9)

with X_t = (Z_t, y_{t-1}) = (z_{t-1}, ..., z_{t-k+1}, y_{t-1}) and β′ = (β_1, ..., β_k). We note that the one-to-one relation α_1 = β_k + β_1, α_l = β_{l-1} - β_l for l = 2, ..., k produces the same residuals in each model; thus we get â_1 = β̂_k + β̂_1 and â_l = β̂_{l-1} - β̂_l for k ≥ l > 1. If we can establish that β̂ is a consistent estimator of β, consistency of â as an estimator of α will follow.

Consider (B.9); we can write

  β̂ - β = [Σ_{t=k}^{T} X_t′ X_t]^{-1} [Σ_{t=k}^{T} X_t′ u_t].

Partition

  Γ = Σ_{t=k}^{T} X_t′ X_t = [ A  v ; v′  w ]

with A = Σ_{t=k}^{T} Z_t′ Z_t, v = Σ_{t=k}^{T} Z_t′ y_{t-1}, w = Σ_{t=k}^{T} y_{t-1}². Define a diagonal k × k normalizing matrix Υ = diag(T^{1/2}, ..., T^{1/2}, T^{d}); then

  Υ^{-1} Γ Υ^{-1} = [ T^{-1} A   T^{-1/2-d} v ; T^{-1/2-d} v′   T^{-2d} w ].

Since Z_t is stationary, and as was demonstrated for d₀ = d - 1 < 0 the covariance matrix of the process (z) is such that ‖Γ(z)^{-1}‖ = O(k^{-2d₀}), it can be shown from Lemma 3 of Berk (1974) that as T → ∞, for k such that k² T^{-1} → 0,

  ‖(T^{-1} A)^{-1} - Γ(z)^{-1}‖ = o_p(k^{-2d₀}).

From (B.6), T^{-1/2-d} v = o_p(1), and from (B.1), T^{-2d} w = O_p(1) and is bounded in probability away from zero. Thus the inverse of the partitioned matrix is

  (Υ^{-1} Γ Υ^{-1})^{-1} = [ Γ(z)^{-1} (1 + o_p(1))   o_p(1) ; o_p(1)   O_p(1) ].

Finally, evaluate the order of the terms in the suitably normalized vector Σ_{t=k}^{T} X_t′ u_t. We can ignore the ε_t part of u_t and need to concentrate on evaluating the terms T^{-1} Σ_{t=k}^{T} z_{t-l} Σ_{i=1}^{t-k-1} α_{k+i} y_{t-k-i} for the subvector T^{-1} Σ_{t=k}^{T} Z_t′ u_t of the first k - 1 components, and T^{-2d} Σ_{t=k}^{T} y_{t-1} Σ_{i=1}^{t-k-1} α_{k+i} y_{t-k-i} for the k-th component.

Consider first T^{-1} Σ_{t=k}^{T} z_{t-l} Σ_{i=1}^{t-k-1} α_{k+i} y_{t-k-i} for l = 1, ..., k-1; denote this by W(l). Then W(l) = T^{-1} Σ_{i=1}^{T-k} w_i with w_i = α_{k+i} Σ_{t=k+i}^{T} z_{t-l} y_{t-k-i}. From (B.2),

  E(w_i) = α_{k+i} Σ_{t=k+i}^{T} O((k + i - l)^{2d-2}) = O(k^{-d-1} T^{2d-1}).

Using (B.3), E(w_i²) = O(k^{-2d-2} T²), and from (B.4), for i ≠ j, E(w_i w_j) = O(k^{-2d-2} T). Thus

  E(T^{-1} Σ_{i=1}^{T-k} w_i)² = O(k^{-2d-2}),

and for each l, W(l) = T^{-1} Σ_{i=1}^{T-k} w_i = O_p(k^{-d-1}). Then

  ‖T^{-1} Σ_{t=k}^{T} Z_t′ u_t‖ = (Σ_l W(l)²)^{1/2} = O_p(k^{-d-1/2})

and

  ‖Γ(z)^{-1} T^{-1} Σ_{t=k}^{T} Z_t′ u_t‖ = O_p(k^{-3d+3/2}).

Finally,

  |T^{-2d} Σ_i α_{k+i} Σ_{t=k+i}^{T} y_{t-1} y_{t-k-i}| ≤ T^{-2d} Σ_i |α_{k+i}| Σ_{t=k+i}^{T} y_{t-1}² = O_p(k^{-d}),

and the statement of the Theorem obtains, since k^{-3d+3/2} < k^{-d} if d ≥ 3/4, and k^{-3d+3/2} exceeds k^{-d} for 1/2 < d < 3/4.
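The decay rate |α_l| = O(l^{-d-1}) invoked throughout the proofs can be illustrated numerically. The sketch below (an illustration made here, not part of the paper) uses the pure fractional filter (1 - L)^d, whose expansion coefficients π_j satisfy π_j = Γ(j-d)/(Γ(j+1)Γ(-d)) ~ j^{-d-1}/Γ(-d) by Stirling's approximation:

```python
import math

def frac_diff_coeffs(d, n):
    """Coefficients pi_j of (1-L)^d = sum_j pi_j L^j via the recursion
    pi_0 = 1, pi_j = pi_{j-1} * (j - 1 - d) / j."""
    pi = [1.0]
    for j in range(1, n + 1):
        pi.append(pi[-1] * (j - 1 - d) / j)
    return pi

d = 0.6
pi = frac_diff_coeffs(d, 2000)
# Compare pi_j against its asymptotic form j^(-d-1)/Gamma(-d): the ratio tends to 1.
ratio = pi[2000] * 2000 ** (d + 1) * math.gamma(-d)
print(ratio)
```

The recursion avoids evaluating Γ(j - d) directly, which would overflow for large j even though π_j itself is tiny.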
Appendix C. Proof of Theorem 3.

Define for some 1 < n < T the matrix Γ̂(n,k) = Γ̂(k) - γ̂_n ιι′ and the vector γ̂(n,k) = γ̂(k) - γ̂_n ι, where ι is a vector of ones, and consider the estimator â(n,k) that solves

  Γ̂(n,k) â(n,k) = γ̂(n,k).

Similarly defined matrices and vectors for the population quantities will be denoted by the same symbols but without the hats. Also define

  D_l = T^{1/2} [(γ̂_0 - γ_0) - (γ̂_l - γ_l)].

To prove Theorem 3 we prove the following theorem, which provides an explicit description of the asymptotics for the terms δ(k) = â(n,k) - a(n,k) and a bound for ‖a(n,k) - α_{[1,k]}‖.

Theorem C. Under the conditions of Theorem 3, as T → ∞, for any fixed k, n the limiting distribution of T^{1/2}(â(n,k) - a(n,k)) is multivariate normal with mean 0 and covariance matrix

  W(k) = Γ(n,k)^{-1} A M A′ Γ(n,k)^{-1}.

The k × (k+1) matrix A has elements

  {A}_{ij} = a_{i-j} + a_{i+j}  for 1 ≤ i, j ≤ k, i ≠ j;
           = a_{2i} - 1         for i = j;
           = -a_i               for j = k + 1,   (C.1)

where the a_l are the components of the vector a(n,k) for 1 ≤ l ≤ k, and a_0 = 1, a_l = 0 if l < 0. The (k+1) × (k+1) matrix M is the limit covariance matrix of the vector G with l-th component G_l = D_l for l = 1, 2, ..., k and G_{k+1} =