Quantile-Based Inference and Estimation of Heavy-Tailed Distributions

Academic year: 2021

Université libre de Bruxelles

Solvay Brussels School of Economics and Management

European Center for Advanced Research in Economics and Statistics

Quantile-Based Inference and Estimation of

Heavy-Tailed Distributions

Yves Dominicy

Thesis submitted to obtain the degree of Doctor of Economic Sciences and Management

Advisor: Prof. Dr. David Veredas


Committee

Prof. Dr. Marc Hallin (Université libre de Bruxelles and Princeton University)

Prof. Dr. Davy Paindaveine (Université libre de Bruxelles)

Prof. Dr. Gennady Samorodnitsky (Cornell University)

Prof. Dr. David Veredas (Université libre de Bruxelles)


Acknowledgements

This thesis concludes five years of research, which I never could have achieved without the support of several persons.

First of all, I would like to express my deep and sincere gratitude to my advisor, David Veredas. I thank him for proposing that I pursue a Ph.D. thesis in econometrics at the end of my Master’s degree, and for the subject, which I enjoyed a lot. David was more than just an advisor; he was a teacher, a mentor and a friend. His availability, his continuous support, his constant concern and his careful reading of my various manuscripts have greatly contributed to the work realized here, and I cannot thank him enough for this.

I would further like to address my thanks to Marc Hallin, Davy Paindaveine, Gennady Samorodnitsky and Chen Zhou for having accepted to be part of my committee, and for their useful comments during the private defense.

I wish to thank Siegfried Hörmann for having contributed to this work as one of my co-authors; it was nice working with you. To Hiroaki Ogata, for the joint projects, his kindness and the invitation to Waseda University: you were a great host. To Pauliina Ilmonen, for the joint research, the discussions about the Hill estimator and the off-research conversations, which were a lot of fun. I wish you all the best.

During my Ph.D., I was privileged to spend three months in the USA at Cornell University. Thank you, Gennady Samorodnitsky, for the invitation, your availability and for the interesting discussions we had.

The past five years, I benefited from financial support from the IAP (Interuniversity Attraction Poles) and F.R.I.A. (Fonds pour la formation à la Recherche dans l’Industrie et dans l’Agriculture) scholarships. Moreover, I was a teaching assistant at the Université libre de Bruxelles for a short period of time for Marjorie Gassner. It was a pleasure to work for you.

Thanks to all the colleagues of the H building who spiced up my daily Ph.D. life. I keep good memories of the various breaks, be it colleagues passing by my office or discussions in the lounge or the hallway. I do not forget the ones I know from the other campus, mainly those of the NO building. You played a big part in the after-work life, in particular Germain with his game evenings. Thanks to the various colleagues who shared an office with me. I also acknowledge the secretaries, Claude, Nancy, Claire and Pierre, for their efficiency.

Furthermore, I would as well like to express my thanks to the former and present members of the “Brussels Summer School of Mathematics”. It was and still is, at least for this year, a pleasure to be a part of the organization.

I thank all the persons, be it the trainers or the playing partners, of the Ixelles Tennis Club. It was great playing there and getting to know all of you.


Thanks to all the people I have met at the different house parties, street celebrations, or while hanging out in Moonies or Dunbar’s, you have made my stay memorable.

Furthermore, I am thankful for my friends’ help and support. Carine and Kim, our lunches and conversations at Bastoche will always remain my favourite lunch breaks. Thank you, Christiane, for all the moments and laughter shared; you were a great support. Florence, you have been a great lunch partner these last years, and the board game evenings at your place were really nice.

Thank you, Benjamin, for being not just an office mate, but a real good friend. All the moments we shared at the university, at restaurants or anywhere else in Brussels (without forgetting Ostend) are written in my mind. Some call us “Chip ’n’ Dale”, and it is true that, like them, we have a lot of adventures, but mostly a lot of fun.

And how could I forget the intense friendship with Christophe? Impossible. Thank you for being a great friend, and this for over 10 years already. It would take too long to summarize all the moments and memories we share. Therefore, I will just acknowledge you for something that is a hallmark of yours: your phone calls. Sometimes serious, sometimes relaxing, sometimes urgent, sometimes not so important, sometimes down to earth, sometimes out of this world, but always great moments of a true friendship.

Last but not least, I would like to express my thanks to my parents for their encouragement and love. They have shown me their understanding and unconditional support when I decided to take the path of academia, although it is an adventure full of uncertainties. Thank you for everything you have done during all those years. All the moments shared together are unforgettable.


Contents

Committee i

Acknowledgements ii

Table of contents iv

List of figures vii

List of tables viii

Introduction 2

1 Motivation . . . 2

2 Heavy-Tailed Distributions . . . 3

2.1 Examples of Heavy-Tailed Distributions . . . 4

3 Elliptical distributions . . . 6

3.1 Examples . . . 7

4 Order Statistics and Quantiles . . . 9

5 Estimation Methods . . . 9

5.1 Maximum Likelihood Estimation . . . 9

5.2 Generalized Method of Moments . . . 11

5.3 Indirect Inference . . . 12

5.4 Minimum Distance Estimation . . . 13

5.5 Estimation of the tail-index α . . . 13



I The Method of Simulated Quantiles 18

1 Introduction . . . 20

2 The Method of Simulated Quantiles . . . 23

2.1 Asymptotic Properties . . . 28

3 Monte-Carlo Simulations . . . 30

4 An Illustration . . . 32

5 Conclusion . . . 39

II Simulated Quantile Method for Elliptical Distributions 44

1 Introduction . . . 46

2 Elliptical Distributions . . . 48

3 Quantile–based Inference . . . 49

3.1 Locations, dispersions and tail index . . . 49

3.2 Co–dispersions . . . 50

3.3 The optimization and asymptotic properties . . . 53

3.4 Going fast . . . 56

4 Assessing the finite sample properties . . . 57

5 Testing for level contours fit . . . 62

6 Illustration . . . 66

7 Conclusions . . . 70

III A Multivariate Hill Estimator 74

1 Introduction . . . 76

2 A review of the univariate Hill estimator . . . 77

2.1 Maximum domain of attraction . . . 78

2.2 Three derivations of the Hill estimator . . . 79

2.3 Asymptotics . . . 81

2.4 Optimal choice of order statistics . . . 83



3 The regularly varying elliptical family . . . 88

4 The multivariate Hill estimator . . . 90

5 Simulation Study . . . 94

6 Empirical illustration . . . 100

7 Conclusion . . . 104

IV On Sample Marginal Quantiles for Stationary Processes 111

1 Introduction . . . 113

2 Setup and main result . . . 114

3 Monte Carlo study . . . 116

4 Conclusions . . . 117


List of Figures

1.1 Returns (top panel) and adjusted returns (bottom panel). . . 35

1.2 World portfolio conditional location, annualized volatility and asymmetry . . . 39

2.1 A diagrammatic representation of the function of quantiles for the co–dispersion . . 51

2.2 Locations . . . 58

2.3 Dispersion matrices: Dimension 20 . . . 59

2.4 Dispersion matrices: Dimension 200 . . . 60

2.5 Dispersion matrices: Dimension 2000 . . . 61

2.6 Hypothesis testing scenarios . . . 64

2.7 Sample correlations . . . 67

2.8 Estimated standardized dispersions matrices . . . 68

3.1 Kernel densities - dimension 10 and α = 3 . . . 99

3.2 Tail dependence coefficients – full sample . . . 101

3.3 Tail dependence coefficients – sub-samples . . . 103

4.1 VAR(1) with α = 10 . . . 121

4.2 VAR(1) with α = 20 . . . 122

4.3 CCC-GARCH(1,1) with α = 10 . . . 123


List of Tables

1.1 Univariate Monte Carlo study . . . 31

1.2 5–dimensional Monte Carlo study . . . 32

1.3 10–dimensional Monte Carlo study . . . 33

1.4 Descriptive Statistics . . . 34

1.5 Estimated parameters . . . 36

2.1 Estimated tail indexes and computational time . . . 62

2.2 Hypothesis testing scenarios . . . 65

2.3 Sizes and upper bounds of ξN . . . 66

2.4 Testing results . . . 69

3.1 Univariate Monte Carlo study . . . 95

3.2 Bivariate Monte Carlo study . . . 96

3.3 10 dimensional Monte Carlo study . . . 97

3.4 50 dimensional Monte Carlo study . . . 98

3.5 Tail descriptive statistics . . . 101


Introduction

1 Motivation

Many techniques in finance rest upon the assumption that the random variables under investigation follow a Gaussian distribution. However, observations often deviate from the Gaussian model. In particular, the joint and marginal distributions are heavy-tailed, meaning that the probability of extreme observations, be it losses or gains, is much higher than what the Gaussian distribution predicts. For instance, the latter is unable to describe the crashes, bubbles, and crises observed in financial markets.

Heavy tails have been detected in almost every financial return series since the publication of Mandelbrot’s (1963) seminal analysis of cotton price changes. Heavy-tailed distributions are of paramount importance in finance: for instance, investors holding some portion of their wealth in risky assets need a realistic estimate of the probability of major losses. Furthermore, such distributions have been successfully applied to the analysis of stock returns, excess bond returns, foreign exchange rates and commodity price returns (see, for example, McCulloch (1996), Embrechts et al. (1997) and Rachev and Mittnik (2000)). Popular heavy-tailed distributions are the Pareto, the Student-t, the hyperbolic, the normal inverse Gaussian, and the α-stable distribution, to name a few. These probability laws often fit the empirical data under consideration well. In particular, they are relevant for samples containing extreme events, like natural catastrophes or market crashes.

Estimating the parameters of an econometric or economic parametric model is a first-order concern. However, heavy tails affect traditional estimation methods, due to the presence of extreme observations, calling for new estimation techniques. When the probability law that governs the random variables is known, Maximum Likelihood Estimation (MLE) is the benchmark technique. If we relax the assumption of a known distribution but still have information about the moments, the Generalized Method of Moments (GMM) becomes the standard approach. However, many models cannot easily be estimated with these classical methods, such as stochastic volatility models, models with stochastic regime switches, or models involving expected utilities, to name a few. For those models the likelihood function may not be available analytically or may be difficult to estimate, and/or the moments may not exist.


Before describing the different chapters of the thesis, we briefly introduce the notions of univariate heavy-tailed distributions and elliptical (heavy-tailed) distributions, as well as the concept of quantiles, and give a brief summary of the above-mentioned estimation methods. We conclude this section with the structure of the thesis.

2 Heavy-Tailed Distributions

Heavy-tailed models attract more and more attention as they play an important role in various areas of application, e.g. computer science, telecommunications, insurance, finance, geology, climatology, and biostatistics, to name a few.

Heavy-tailed distributions are probability distributions whose tails are heavier than those of the exponential distribution, meaning laws whose exponential moments are infinite: E(e^{λX}) = ∞ for all λ > 0. There exists some discrepancy over the terminology heavy-tailed, as it varies according to the area of interest. In the literature, some authors use the term heavy-tailed to refer to distributions that do not have all their power moments finite, and others use it to define those distributions that do not have a finite variance. Occasionally, heavy-tailed distributions are defined as the probability laws that have heavier tails than the Gaussian distribution.

Let X be a random variable with cumulative distribution function F, and denote its tail function by F̄(x) = 1 − F(x) ≡ Pr[X > x]. X is said to be heavy-tailed if

lim_{x→∞} e^{λx} F̄(x) = ∞ for all λ > 0.

That is the case if and only if F fails to possess any positive exponential moment. In other words, F is heavy-tailed if and only if its tail function F̄ fails to be bounded by any exponentially decreasing function.

Sub-families of heavy-tailed distributions that are often encountered in the literature are the regularly varying distributions and the sub-exponential distributions. We briefly introduce both distribution classes here.

A regularly varying distribution is defined in the following way.

Definition 2.1. A distribution F on (0, ∞) is called regularly varying at infinity with index −α, α > 0, if, for all x > 0,

lim_{t→∞} F̄(tx)/F̄(t) = x^{−α},

where α is called the tail index.
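As a quick numerical illustration (added here; not part of the original text), the defining limit can be checked for a Pareto-type tail F̄(x) = L(x) x^{−α} with the slowly varying factor L(x) = ln x, which drops out of the ratio as t grows. The parameter values and function names below are illustrative choices only.

```python
import numpy as np

# Pareto-type tail F(x) = L(x) * x^(-alpha) with slowly varying L(x) = ln(x);
# alpha = 2.5 is an arbitrary illustrative choice.
alpha, x = 2.5, 3.0

def tail(v):
    v = np.asarray(v, dtype=float)
    return np.log(v) * v ** (-alpha)

# The ratio tail(t*x)/tail(t) approaches x^(-alpha) as t grows (Definition 2.1).
for t in (1e1, 1e3, 1e6, 1e12):
    print(f"t = {t:>7.0e}:  ratio = {tail(t * x) / tail(t):.5f}")
print(f"limit x^(-alpha) = {x ** -alpha:.5f}")
```

The slowly varying factor makes convergence visibly slow, which is precisely why tail-index estimation (Section 5.5) is delicate in practice.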


The class of regularly varying distribution functions is included in the class of sub-exponential distributions. Their name arises from one of their properties, namely that their tails decrease more slowly than the exponential tail. This entails that extreme observations can occur in a data set with non-negligible probability, and makes the sub-exponential family a candidate for modelling such behaviour. Chistyakov (1964) was the first to investigate sub-exponential distributions, which are defined in terms of convolutions of probability distributions.

Definition 2.2. Let X_i, i ∈ N, be independent and identically distributed (i.i.d.) random variables with distribution function F such that F(x) < 1 for all x > 0. Denote by F̄(x) = 1 − F(x) the tail function of F and by F̄^{N∗}(x) = 1 − F^{N∗}(x) = P(X_1 + … + X_N > x) the tail function of the N-fold convolution of F. F is a sub-exponential distribution function if one of the following equivalent conditions holds:

1) lim_{x→∞} F̄^{N∗}(x)/F̄(x) = N for some N ≥ 2,

2) lim_{x→∞} F̄^{N∗}(x)/P(max(X_1, …, X_N) > x) = 1 for some N ≥ 2.

In particular, all commonly used heavy-tailed distributions are sub-exponential. Those that are one-tailed include, among others, the Pareto, the log-normal, the Lévy, the Burr and the log-gamma distribution. Those that are two-tailed include, for example, the Student-t and the α-stable distribution. The Cauchy distribution is a special case of both of the latter, with the tail index or degree of freedom equal to 1. The family of α-stable distributions is heavy-tailed, except when α = 2, which corresponds to the Gaussian distribution. Moreover, some α-stable distributions are one-sided, for instance the Lévy distribution with tail index α = 1/2 and skewness parameter β = 1.

2.1 Examples of Heavy-Tailed Distributions

We briefly describe the three heavy-tailed distributions that occur in the different chapters of the thesis.

Student-t distribution

The Student-t distribution is a continuous probability law. Its density function is symmetric and resembles the bell shape of a Gaussian density with mean 0 and variance 1, except that it has heavier tails.

The Student-t probability density function is given by

f(x) = Γ((α + 1)/2) / (√(απ) Γ(α/2)) · (1 + x²/α)^{−(α+1)/2}, x ∈ ℝ,

where α > 0 is the degrees-of-freedom parameter.
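As a sanity check (ours, not in the original text), the density formula can be transcribed directly and compared against `scipy.stats.t.pdf`; the helper name `student_t_pdf` is an illustrative choice.

```python
import numpy as np
from math import gamma, pi, sqrt
from scipy.stats import t as student_t

def student_t_pdf(x, alpha):
    """Student-t density with alpha degrees of freedom, coded from the formula above."""
    const = gamma((alpha + 1) / 2) / (sqrt(alpha * pi) * gamma(alpha / 2))
    return const * (1 + np.asarray(x, dtype=float) ** 2 / alpha) ** (-(alpha + 1) / 2)

xs = np.linspace(-4, 4, 9)
# Maximum absolute discrepancy against the reference implementation:
print(np.max(np.abs(student_t_pdf(xs, 3.0) - student_t.pdf(xs, df=3.0))))  # ~ 0
```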


In the literature, this probability law takes its name from Gosset’s (1908) paper in Biometrika, published under the alias “Student”. The distribution gained in popularity due to the work of Fisher (1925), who referred to it as “Student’s distribution”.

α-stable distribution

In finance, it is usually argued that asset returns are the cumulative result of a large number of small terms, e.g. movements in the price of a stock (McCulloch (1996), Rachev and Mittnik (2000)). Due to this phenomenon, they are often assumed to follow a Gaussian distribution, as the Gaussian law is supported by the Central Limit Theorem. It states that the sum of a large number of i.i.d. variables with finite variance tends to be distributed following a Gaussian distribution. However, as already mentioned, in finance we often deal with asset returns that have heavier tails. In response to the empirical evidence, Mandelbrot (1963) and Fama (1965) proposed as an alternative the α-stable distribution, which accommodates heavy tails. There is at least one very good reason for modelling financial variables with this class of distributions, namely that it is supported by the Generalized Central Limit Theorem. It states that stable laws are the only possible non-trivial limits of normalized and centred sums of i.i.d. random variables. Moreover, the α-stable distribution is able to capture skewness, which is another feature of financial series. The stable laws were characterized by Paul Lévy in 1925.

Let X be a random variable distributed following an α-stable distribution, denoted X ∼ S_α(σ, β, µ). The parameter α ∈ (0, 2] denotes the tail index and governs the existence of moments: E[X^κ] < ∞ for all κ < α. Asymmetry is captured by β ∈ [−1, 1]. The dispersion parameter σ ∈ ℝ₊ expands or contracts the distribution, and the parameter µ ∈ ℝ controls the location of the distribution. The α-stable distribution possesses the property of stability, which states that a random variable is stable if a linear combination of two independent copies of the variable has the same distribution, up to location and scale parameters.

The probability density function of the α-stable distribution does not have a closed form. However, the characteristic function has a known and manageable closed form:

E[e^{itX}] = exp{−σ^α|t|^α (1 − iβ sign(t) tan(πα/2) (|σt|^{1−α} − 1)) + iµt}   if α ≠ 1,

E[e^{itX}] = exp{−σ|t| (1 + iβ(2/π) sign(t) ln(σ|t|)) + iµt}   if α = 1.
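A direct transcription of this characteristic function (our sketch; the function name is an illustrative choice) can be sanity-checked against the two symmetric special cases, where all stable parameterizations coincide: the Gaussian (α = 2, characteristic function exp(−σ²t² + iµt) under this scaling) and the Cauchy (α = 1, β = 0, exp(−σ|t| + iµt)).

```python
import numpy as np

def stable_cf(t, alpha, sigma, beta, mu):
    """Characteristic function of S_alpha(sigma, beta, mu), transcribed from above."""
    t = np.asarray(t, dtype=float)
    if alpha != 1:
        exponent = (-sigma**alpha * np.abs(t)**alpha
                    * (1 - 1j * beta * np.sign(t) * np.tan(np.pi * alpha / 2)
                       * (np.abs(sigma * t)**(1 - alpha) - 1))
                    + 1j * mu * t)
    else:
        exponent = (-sigma * np.abs(t)
                    * (1 + 1j * beta * (2 / np.pi) * np.sign(t)
                       * np.log(sigma * np.abs(t)))
                    + 1j * mu * t)
    return np.exp(exponent)

# Symmetric special cases (beta = 0), where every parameterization agrees:
t = np.array([-2.0, -0.5, 0.5, 1.0, 2.5])  # avoid t = 0 (log in the alpha = 1 branch)
sigma, mu = 1.5, 0.3
gauss = np.exp(-sigma**2 * t**2 + 1j * mu * t)     # alpha = 2
cauchy = np.exp(-sigma * np.abs(t) + 1j * mu * t)  # alpha = 1
print(np.max(np.abs(stable_cf(t, 2.0, sigma, 0.0, mu) - gauss)))   # ~ 0
print(np.max(np.abs(stable_cf(t, 1.0, sigma, 0.0, mu) - cauchy)))  # ~ 0
```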

Pareto distribution

Let X be a random variable following a Pareto distribution. The probability that X is greater than some value x is given by

F̄(x) = Pr(X > x) = (u/x)^α if x ≥ u, and 1 if x < u,


where u > 0 is the minimum possible value of X and α > 0 is the tail index, which regulates the existence of moments: E[X^κ] < ∞ for all κ < α. Other definitions are closely related to the Pareto law, namely the strict Pareto distribution, whose tail function is given by F̄(x) = x^{−α}, and the Pareto-type distribution, which is characterized by F̄(x) = x^{−α}L(x), where L is a slowly varying function.¹

This power law probability distribution is named after the Italian civil engineer and economist Vilfredo Pareto (1964). He originally used it to describe the allocation of wealth among individuals, showing fairly well that a large percentage of the wealth of a community is owned by a smaller portion of the individuals in that same community.
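Since Chapter III reviews the Hill estimator, a small simulation (our sketch, with ad hoc sample size and an ad hoc number k of upper order statistics) illustrates both inverse-transform sampling from the Pareto tail above and recovery of α from the largest observations:

```python
import numpy as np

rng = np.random.default_rng(0)

# Inverse-transform sampling: F(x) = (u/x)^alpha  =>  X = u * U**(-1/alpha),
# with U ~ Uniform(0, 1).
u, alpha, N = 1.0, 2.0, 200_000
X = u * rng.uniform(size=N) ** (-1.0 / alpha)

# Hill estimator of alpha from the k largest order statistics (k chosen ad hoc):
# alpha_hat = [ (1/k) * sum of log(X_(N-i+1) / X_(N-k)) ]^(-1).
k = 2_000
Xs = np.sort(X)
log_excess = np.log(Xs[-k:]) - np.log(Xs[-k - 1])
alpha_hill = 1.0 / np.mean(log_excess)
print(f"true alpha = {alpha}, Hill estimate = {alpha_hill:.3f}")
```

For a strict Pareto sample the estimate is close to the true α for a wide range of k; for Pareto-type tails the choice of k matters much more, which is the topic of Section 2.4 of Chapter III.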

3 Elliptical distributions

Due to their well-known properties and statistical convenience, most approaches in multivariate analysis rest upon the multivariate normal assumption. However, this probability law poorly approximates the reality of many empirical analyses. In particular, it does not admit heavy tails, a widely accepted stylized fact of financial data.

A natural generalization of the class of multivariate normal distributions is the family of elliptical distributions.² This class of distributions was introduced by Kelker (1970) and further investigated by Cambanis et al. (1981) and Fang et al. (1990). The family of elliptical distributions possesses a number of useful properties, among which that the conditional and marginal distributions are also elliptical, and closure under affine transformations and aggregation (Fang et al. (1990)). Let Z be a spherically distributed random vector represented by Z = RU, where R is a positive real variable called the generating variate of Z, stochastically independent of U, which is uniformly distributed over the unit hypersphere. A random vector X with an elliptical density is obtained as a linear transformation of the random vector Z.

Definition 3.1. The J-dimensional random vector X is elliptically distributed if it has the following stochastic representation:

X =_d µ + RΛU.

U is a J-dimensional random vector uniform on the unit sphere S^{J−1} = {u ∈ ℝ^J : ‖u‖₂ = 1}. The J × J scaling matrix Λ produces the ellipticity and is such that Σ = ΛΛ′ is a positive definite symmetric dispersion matrix of rank J.³ R, the non-negative random variable, is the generating variate and is stochastically independent of U. It determines the distribution’s shape, in particular the tailedness of the distribution. The J-dimensional vector µ re-allocates the center of the distribution.
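Definition 3.1 translates directly into a sampling recipe. The sketch below (our code; helper names are illustrative) draws U uniformly on the sphere by normalizing Gaussian vectors, and picks R ∼ χ_J so that the resulting X is multivariate normal N(µ, ΛΛ′), a case where location and dispersion are easy to verify empirically.

```python
import numpy as np

rng = np.random.default_rng(1)

def elliptical_sample(n, mu, Lam, draw_R):
    """Draw n vectors X = mu + R * Lam @ U following Definition 3.1 (sketch)."""
    J = len(mu)
    G = rng.standard_normal((n, J))
    U = G / np.linalg.norm(G, axis=1, keepdims=True)  # uniform on S^{J-1}
    R = draw_R(n)                                     # generating variate
    return mu + R[:, None] * (U @ Lam.T)

# With R ~ chi(J) (i.e. R**2 ~ chi^2_J), X is exactly N(mu, Lam Lam').
J = 3
mu = np.array([1.0, -2.0, 0.5])
Lam = np.array([[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [0.2, -0.3, 0.8]])
X = elliptical_sample(500_000, mu, Lam, lambda n: np.sqrt(rng.chisquare(J, n)))
print(np.round(X.mean(axis=0), 2))              # ~ mu
print(np.round(np.cov(X.T) - Lam @ Lam.T, 2))   # ~ 0
```

Swapping `draw_R` for a heavier-tailed generating variate (e.g. one in the Fréchet domain of attraction, as assumed later) yields heavy-tailed elliptical samples with the same dispersion structure.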

¹A positive function L on (0, ∞) is slowly varying at ∞ if lim_{t→∞} L(tx)/L(t) = 1 for all x > 0.

²They are often also called elliptically symmetric distributions. We drop the attribute symmetric in this work as we will always consider the symmetric case.


The density function of an elliptical distribution can be derived simply from the density function of R, provided it is absolutely continuous. The probability density function of X is given by

f_X(x) = √|Σ^{−1}| g_R((x − µ)′Σ^{−1}(x − µ)),

where

g_R(x) = (Γ(J/2) / (2π^{J/2})) (√x)^{−(J−1)} f_R(√x)

is the density generator, and f_R is the probability density function of the generating variate R.

The elliptically distributed random vector can alternatively be characterized in terms of the characteristic function:

φ_X(ξ) = exp(iξ′µ) ϕ(ξ′Σξ),

where ξ ∈ ℝ^J is a J × 1 vector of frequencies and ϕ is the characteristic generator of X. This result is implied by the uniqueness theorem (Shorack (2000)), which also suggests a one-to-one relation between the cumulative distribution function F_R of the generating variate and ϕ (Fang et al. (1990)).

Let us further assume that the generating variate R belongs to the maximum domain of attraction of the Fréchet distribution (see Embrechts et al. (1997)), i.e. F̄_R(x) = L(x)x^{−α} for all x > 0, where α > 0 is the tail index and L is a slowly varying function. A result by Hult and Lindskog (2002) states that the tail index α of F̄_R corresponds to the tail index of the regularly varying random vector X. Hence the family of elliptical distributions admits heavy tails, while it retains the simple linear dependence structure known from the multivariate normal distribution.

Various multivariate distributions that are relevant in theory and practice belong to the elliptical class, among others the Gaussian, the Laplace, the symmetric generalized hyperbolic distribution, the Student-t, the Elliptical Stable (and hence the Cauchy with α = 1), and the Kotz distribution.

3.1 Examples

We briefly describe the multivariate Student-t and the Elliptical Stable distribution, as those are the two elliptical distributions used in the next chapters.

Multivariate Student-t distribution

The multivariate Student-t distribution is a generalization to random vectors of the univariate Student-t distribution. It is a probability law for random vectors of correlated variables, where each element has a univariate Student-t distribution.

Similar to the univariate Student-t distribution, the multivariate one can be constructed by dividing a multivariate normal random vector by a transformed univariate chi-square random variable. Let Y and Q be independent and distributed as a multivariate normal distribution N(0, Σ) and a chi-squared distribution χ²_α with α degrees of freedom, respectively, and set X = µ + Y/√(Q/α);


then the J-dimensional random vector X is distributed as a multivariate Student-t distribution, with density

f_X(x) = Γ[(α + J)/2] / (Γ(α/2) α^{J/2} π^{J/2} |Σ|^{1/2}) · [1 + (1/α)(x − µ)′Σ^{−1}(x − µ)]^{−(α+J)/2},

where Γ is the Gamma function, Σ is a J × J symmetric positive definite matrix, µ is a J × 1 location vector, and α is a positive scalar denoting the degrees-of-freedom parameter.
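The normal/chi-square construction described above gives an immediate sampling sketch (our code; parameter values are illustrative). For α > 2 the covariance of X is Σ · α/(α − 2), which provides a simple check.

```python
import numpy as np

rng = np.random.default_rng(2)

def multivariate_t(n, mu, Sigma, alpha):
    """Draw n vectors X = mu + Y / sqrt(Q/alpha), with Y ~ N(0, Sigma) and
    Q ~ chi^2_alpha independent, per the construction above (sketch)."""
    Y = rng.multivariate_normal(np.zeros(len(mu)), Sigma, size=n)
    Q = rng.chisquare(alpha, size=n)
    return mu + Y / np.sqrt(Q / alpha)[:, None]

mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.6], [0.6, 1.0]])
alpha = 5.0
X = multivariate_t(200_000, mu, Sigma, alpha)
# For alpha > 2, Cov(X) = Sigma * alpha / (alpha - 2); undo the factor:
print(np.round(np.cov(X.T) * (alpha - 2) / alpha, 2))  # ~ Sigma
```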

Elliptical Stable distribution

The multivariate stable distribution is a generalization of the univariate α-stable distribution. A J-dimensional random vector X has a multivariate stable distribution, denoted X ∼ S_α(Γ, µ), if the joint characteristic function of X is

E[exp(iu′X)] = exp{ −∫_{s∈S^{J−1}} (|u′s|^α + iν(u, s)) Γ(ds) + iu′µ },

where

ν(u, s) = −sign(u′s) tan(πα/2) |u′s|^α if α ≠ 1, and ν(u, s) = (2/π) sign(u′s) |u′s| ln|u′s| if α = 1.

Thus, the stable random vector is characterized by the tail index α ∈ (0, 2], a location vector µ ∈ ℝ^J, and Γ, a finite spectral measure on the unit sphere S^{J−1} = {u ∈ ℝ^J : ‖u‖₂ = 1}. Note that if α = 2 the distribution is equivalent to the multivariate normal distribution.

The Elliptical Stable distribution is a special symmetric case of the multivariate stable distribution. If the random vector X has an elliptically contoured multivariate stable distribution, then the joint characteristic function is given by

E[exp(iu′X)] = exp{ −(u′Σu)^{α/2} + iu′µ },

for some positive definite matrix Σ and location vector µ ∈ ℝ^J. Note the link to the characteristic function of the multivariate normal distribution when α = 2:

E[exp(iu′X)] = exp{ −(u′Σu) + iu′µ }.

All elliptically contoured multivariate stable distributions are scale mixtures of multivariate normal distributions (Samorodnitsky and Taqqu (1994)). Let G ∼ N(0, Σ) be a J-dimensional multivariate random vector and A ∼ S_{α/2}(σ, 1, 0) an independent univariate positive (α/2)-stable random variable with α ∈ (0, 2). Then X = A^{1/2}G has an elliptically contoured multivariate stable distribution.


4 Order Statistics and Quantiles

For N random variables X_1, …, X_N, the i-th order statistic X_(i) denotes the i-th smallest value in X_1, …, X_N. The smallest order statistic is the minimum of the sample, X_(1) = min{X_1, …, X_N}, and similarly the largest order statistic is the maximum of the sample, X_(N) = max{X_1, …, X_N}.

Now let τ ∈ [0, 1]. The τ-th quantile of a distribution F is denoted by q_τ and defined by

q_τ = inf{x ∈ ℝ : F(x) ≥ τ}.

The map τ ↦ q_τ is sometimes called the quantile function or inverse distribution function.

The τ-th sample quantile is usually written in terms of order statistics. The τ-th sample quantile from a sample of size N can be defined as

q̂_τ = (1 − γ)X_(ℓ) + γX_(ℓ+1),

where γ ∈ [0, 1] and ℓ is an integer such that ℓ/N ≤ τ < (ℓ + 1)/N. One of the simplest definitions of a quantile using order statistics is

q̂_τ = X_(ℓ),

where ℓ = [Nτ] is the nearest integer to Nτ. Other definitions are based on whether Nτ is an integer or not. If it is not an integer, one takes the smallest integer greater than or equal to Nτ; if it is an integer, one can simply take as quantile the order statistic X_(Nτ), or compute the quantile by the formula (X_(Nτ) + X_(Nτ+1))/2, for example.

The asymptotic normality of a sample quantile is stated in the following way (Cramér (1946)):

Theorem 4.1. Let X_1, …, X_N be N i.i.d. draws from a cumulative distribution function F. Let τ ∈ (0, 1). Suppose that F has a density function f in a neighbourhood of q_τ and that f is positive and continuous at q_τ. Then

√N (q̂_τ − q_τ) →_d N(0, τ(1 − τ)/[f(q_τ)]²).
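A quick Monte Carlo experiment (ours; sample sizes chosen ad hoc) illustrates Theorem 4.1 for the standard exponential distribution at τ = 0.5: there q_τ = ln 2 and f(q_τ) = 1/2, so the asymptotic variance τ(1 − τ)/[f(q_τ)]² equals 0.25/0.25 = 1.

```python
import numpy as np

rng = np.random.default_rng(3)

# Monte Carlo check of Theorem 4.1: exponential(1), tau = 0.5,
# q_tau = ln 2, f(q_tau) = 1/2  =>  asymptotic variance = 1.
tau, N, reps = 0.5, 2_000, 5_000
q = np.log(2.0)

samples = rng.exponential(size=(reps, N))
q_hat = np.quantile(samples, tau, axis=1)
var_mc = np.var(np.sqrt(N) * (q_hat - q))
print(f"Monte Carlo variance = {var_mc:.3f} (theory: 1)")
```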

5 Estimation Methods

We review here the classical and most used estimation methods.

5.1 Maximum Likelihood Estimation

Maximum Likelihood estimation


Let X be a random variable whose probability density function depends on θ. For an i.i.d. sample, the joint density function of all the realizations is

f(x_1, x_2, …, x_N | θ) = f(x_1|θ) × f(x_2|θ) × ⋯ × f(x_N|θ) = ∏_{i=1}^{N} f(x_i|θ).

In a second step, instead of evaluating the density of x_1, …, x_N for a fixed parameter θ, the setting can be reversed: the density is regarded as a function of θ with fixed realizations x_1, …, x_N. This function is called the likelihood function L of θ:

L(θ | x_1, …, x_N) = f(x_1, x_2, …, x_N | θ) = ∏_{i=1}^{N} f(x_i|θ).

The maximum likelihood estimator is the most plausible parameter value θ for the realizations x_1, …, x_N of the random variable X, i.e. the one yielding the highest probability. This is an optimization problem. The maximization of the likelihood function is carried out by computing the first derivative with respect to θ and setting it equal to zero; the second derivative with respect to θ is then checked to verify that the solution is indeed a maximum. For practical reasons the log-likelihood function is preferred: it is easier to handle because products become sums, and, due to the monotonicity of the logarithm, its maximum is at the same position as that of the non-logarithmic function. The log-likelihood is given by

ℓ(θ | x_1, …, x_N) = log L(θ | x_1, …, x_N) = ∑_{i=1}^{N} log f(x_i|θ).

As the aim is to find the value of θ that maximizes ℓ, the MLE of θ is obtained by

θ̂_MLE = arg max_{θ∈Θ} ℓ(θ | x_1, …, x_N).

In order to apply the MLE method, two conditions have to be satisfied: (i) one must know the distribution of the random variable under consideration, and (ii) the likelihood function must be tractable, so that it can be evaluated for every admissible parameter vector θ.
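When both conditions hold, the maximization is routinely done numerically. The sketch below (ours; sample size and optimizer choice are illustrative) estimates the tail index α (degrees of freedom) of a Student-t sample by maximizing the log-likelihood with `scipy.optimize.minimize_scalar`.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import t as student_t

rng = np.random.default_rng(4)

# Simulated data with known tail index alpha = 4.
x = student_t.rvs(df=4.0, size=20_000, random_state=rng)

def neg_loglik(alpha):
    """Negative log-likelihood of the Student-t sample as a function of alpha."""
    return -np.sum(student_t.logpdf(x, df=alpha))

res = minimize_scalar(neg_loglik, bounds=(0.5, 50.0), method="bounded")
print(f"alpha_hat = {res.x:.2f}")  # close to the true value 4
```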

Quasi-Maximum Likelihood estimation


Maximum Simulated Likelihood

Since the papers of Gouriéroux and Monfort (1996) and Hajivassiliou and Ruud (1994), Maximum Simulated Likelihood (MSL) estimation has been used in a large and growing number of studies. MSL works by simulating a likelihood and then averaging over the simulated likelihoods. One needs a probability density function f̃ to be a simulator of the density function f; MLE is then computed on f̃. In more statistical terms:

θ̂_MSL = arg max_{θ∈Θ} ℓ̃(θ | x_1, …, x_N),

where the simulated log-likelihood function is

ℓ̃(θ | x_1, …, x_N) = ∑_{i=1}^{N} log f̃(x_i, ω_i|θ)

for some given simulation sequence ω_i.
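A toy sketch of this idea (ours; all names, the kernel simulator and the bandwidth are illustrative assumptions, not the thesis's method): pretend the N(θ, 1) density is intractable and replace it by a kernel density simulator f̃ built from a fixed simulation sequence ω_s, held constant across θ so that the objective is smooth.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(7)

# "Observed" data from the model N(theta_true, 1).
x = rng.normal(loc=0.7, size=2_000)
omega = rng.normal(size=500)   # fixed simulation sequence omega_s
h = 0.3                        # kernel bandwidth (ad hoc)

def f_tilde(xi, theta):
    """Gaussian-kernel simulator of the density f(.|theta), built from theta + omega."""
    z = (np.asarray(xi)[:, None] - (theta + omega)[None, :]) / h
    return np.exp(-0.5 * z**2).mean(axis=1) / (h * np.sqrt(2 * np.pi))

def neg_sim_loglik(theta):
    return -np.sum(np.log(f_tilde(x, theta)))

res = minimize_scalar(neg_sim_loglik, bounds=(-3.0, 3.0), method="bounded")
print(f"theta_MSL = {res.x:.3f}")  # near the true location 0.7
```

Keeping ω fixed across evaluations (common random numbers) is what makes the simulated objective a deterministic, optimizable function of θ.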

5.2 Generalized Method of Moments

Generalized Method of Moments

Generalized Method of Moments (GMM) was developed by Lars Peter Hansen in 1982 as a generalization of the Method of Moments.⁴ It is a very general estimation method for econometric models. GMM is usually applied in the context of semi-parametric models, where the parameter of interest is finite-dimensional whereas the full shape of the distribution function of the data may not be known, and where therefore MLE is not applicable.

Suppose we have N observations X_i, i = 1, …, N, where each observation X_i is a J-dimensional multivariate random variable. The method requires that a certain number of moment conditions be specified for the model. These moment conditions are functions of the parameters and the data, such that their expectation is zero at the true values of the parameters; i.e. we need to know a vector-valued function g(X, θ) such that m(θ) ≡ E[g(X_i, θ)] = 0. The essence of GMM is to replace the theoretical moments of the population by empirical ones,

m̂(θ) ≡ (1/N) ∑_{i=1}^{N} g(X_i, θ),

and to minimize the norm of this expression with respect to θ. The theory of GMM considers an entire family of norms, defined as ‖m̂(θ)‖²_W = m̂(θ)′ W m̂(θ), where W is a positive-definite weighting matrix. In practice, the weighting matrix is computed from the available data and denoted Ŵ. Thus, the GMM estimator can be written as

θ̂_GMM = arg min_{θ∈Θ} [(1/N) ∑_{i=1}^{N} g(X_i, θ)]′ Ŵ [(1/N) ∑_{i=1}^{N} g(X_i, θ)].

4The Method of Moments consists in retrieving the parameters of interest by matching the theoretical moments with their empirical counterparts. MLE may itself be interpreted as a method of moments procedure, with the derivative of the log-likelihood function, the score vector, providing the moment conditions.
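As a hedged toy illustration of the GMM recipe above (not an example from the text), the sketch below estimates θ = (µ, σ²) from the exactly identified moment conditions g(X, θ) = (X − µ, (X − µ)² − σ²); the weighting matrix, the grid search and all names are assumptions of the sketch.

```python
import numpy as np

# Hedged GMM sketch: two moment conditions whose expectation is zero at the truth.
rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=3.0, size=5000)

W = np.eye(2)  # weighting matrix; the identity, for simplicity

def gmm_objective(mu, s2):
    # Empirical moments m_hat(theta) = (1/N) sum g(X_i, theta)
    m_hat = np.array([np.mean(x - mu), np.mean((x - mu) ** 2 - s2)])
    return float(m_hat @ W @ m_hat)

# Grid search over theta as a stand-in for a numerical optimizer
mus = np.linspace(1.0, 3.0, 81)
s2s = np.linspace(5.0, 13.0, 81)
_, mu_hat, s2_hat = min((gmm_objective(m, s), m, s) for m in mus for s in s2s)
```

In this exactly identified case the choice of W is irrelevant; with more moment conditions than parameters, an estimated optimal $\hat W$ would matter.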

Method of Simulated Moments

The Method of Simulated Moments (MSM) is an estimation procedure introduced by Daniel McFadden (1989). It extends GMM to situations where the theoretical moment functions cannot be evaluated directly. The principle stays the same as for GMM, except that it is based on simulated moment conditions: one measures the difference between empirical moments computed on the observed random variables and their simulated counterparts. GMM is a semi-parametric approach, whereas MSM is not, because the distribution must be fully specified for simulation purposes.
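A minimal hedged sketch of this principle, on a toy model of my own choosing (a Gaussian location model, with `msm_objective` and the fixed draws `omega` being illustrative assumptions):

```python
import numpy as np

# Hedged MSM sketch: the theoretical mean is replaced by the mean of data
# simulated under theta, with the draws omega held fixed across theta.
rng = np.random.default_rng(2)
x = rng.normal(0.7, 1.0, size=4000)          # observed data
omega = rng.standard_normal(40000)           # fixed simulation draws

def msm_objective(theta):
    m_data = x.mean()                        # empirical moment
    m_sim = (theta + omega).mean()           # simulated counterpart under theta
    return (m_data - m_sim) ** 2

grid = np.linspace(-1.0, 2.0, 601)
theta_msm = grid[np.argmin([msm_objective(t) for t in grid])]
```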

Efficient Method of Moments

The Efficient Method of Moments (EMM), introduced by Gallant and Tauchen (1996), is a specific MSM approach. The idea is to achieve the efficiency of the ML estimator while maintaining the flexibility of GMM. It is useful in situations where the likelihood is intractable and thus likelihood-based methods are not feasible. Instead, an auxiliary parametric model with auxiliary parameters is introduced, which can be estimated for instance by QMLE. The score of the auxiliary model is referred to as the score generator for EMM. Once a score generator is available, given a parameter setting for the model of interest, simulations enable one to evaluate the expected value of the score. Thus, in the EMM approach the moment equations are based on the score vector of an auxiliary model.
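A hedged toy sketch of the EMM idea (the structural model, the Gaussian auxiliary model and all names are assumptions of the sketch, not content from the text): the auxiliary model N(b, 1) has score (1/n)Σ(y − b) in b, and θ is chosen so that this score, evaluated at the auxiliary QMLE but on data simulated under θ, vanishes.

```python
import numpy as np

# Hedged EMM sketch with a Gaussian score generator (toy assumption).
rng = np.random.default_rng(3)
x = rng.normal(1.2, 1.0, size=3000)          # observed data
b_hat = x.mean()                             # auxiliary QMLE on the real data

eps = rng.standard_normal(30000)             # fixed structural shocks

def emm_objective(theta):
    score = np.mean(theta + eps - b_hat)     # auxiliary score on simulated data
    return score ** 2

grid = np.linspace(0.0, 2.0, 401)
theta_emm = grid[np.argmin([emm_objective(t) for t in grid])]
```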

5.3 Indirect Inference

Indirect Inference is a simulation-based estimation method introduced by Gouriéroux et al. (1993), based on Smith (1993). It shows that correct inference can be based on an incorrect criterion. Its principle relies on an auxiliary model, easy to estimate, that captures the salient aspects of the data. The main steps of Indirect Inference are the following. First, consider an auxiliary model that does not need to be an accurate description of the data generating process, but is easy to estimate, together with real data and data simulated from the model of interest. Second, estimate the auxiliary model on both data sets. Then match the two vectors of estimated auxiliary parameters and, finally, minimize the distance between them. The last step finds the value of the parameters of the model of interest that maximizes the similarity between the observed data and the simulated data from the point of view of the chosen auxiliary model. The key requirement of Indirect Inference is that simulation from the model of interest be feasible.
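The steps above can be sketched on a toy problem (a hedged illustration of my own, not an example from the text): the model of interest is a scaled Student-t(5), the auxiliary model is a Gaussian, and its fitted standard deviation plays the role of the auxiliary parameter.

```python
import numpy as np

# Hedged Indirect Inference sketch: match the auxiliary (Gaussian) estimate on
# real data with the one on data simulated from the model of interest.
rng = np.random.default_rng(4)
df, scale_true = 5, 2.0
x = scale_true * rng.standard_t(df, size=5000)      # real data

base = rng.standard_t(df, size=50000)               # fixed simulation randomness

beta_data = x.std()                                 # auxiliary estimate, real data

def ii_objective(scale):
    beta_sim = (scale * base).std()                 # auxiliary estimate, simulated data
    return (beta_data - beta_sim) ** 2              # match the two

grid = np.linspace(0.5, 4.0, 701)
scale_ii = grid[np.argmin([ii_objective(s) for s in grid])]
```

The auxiliary Gaussian model is of course a misspecified description of Student-t data, yet the matching still recovers the structural scale: correct inference from an incorrect criterion.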


5.4 Minimum Distance Estimation

Minimum Distance Estimation (MDE) is a statistical method for fitting an econometric model to data; usually the empirical distribution is used. Assume an i.i.d. random sample from a population with distribution $F(x, \theta)$, where $\theta \in \Theta$ and $\Theta \subseteq \mathbb{R}^p$, $p \geq 1$. Let $F_N(x)$ be the empirical distribution function based on the sample and let $\hat\theta$ be an estimator of $\theta$. Then $F(x, \hat\theta)$ is an estimator of $F(x, \theta)$. Let $d$ be a functional calculating some measure of distance between its two arguments. If there exists a $\hat\theta \in \Theta$ such that
\[
d(F(x, \hat\theta), F_N(x)) = \inf\{d(F(x, \theta), F_N(x)); \theta \in \Theta\},
\]
then $\hat\theta$ is called the minimum distance estimate of $\theta$.

Besides the empirical distribution, MDE is also often used with the empirical characteristic function, mainly for heavy-tailed models. Some heavy-tailed distributions have no closed-form expression for the probability density function but possess a closed form for the characteristic function. The main idea then consists in minimizing the distance between the characteristic function and the empirical characteristic function in an appropriate norm. Denote by $\phi$ the characteristic function and by $\hat\phi$ the empirical characteristic function, defined by
\[
\hat\phi(t) = \frac{1}{N}\sum_{l=1}^{N} e^{itX_l}.
\]
The procedure finds a minimum distance estimate $\hat\theta$ of $\theta$ by
\[
\hat\theta = \arg\min_{\theta \in \Theta} \|\phi(t) - \hat\phi(t)\|.
\]

Moreover, the quantile approaches introduced by Fama and Roll (1971) and McCulloch (1986) can be seen as MDE methods.5 The idea behind those techniques is to find the parameters that match the theoretical quantiles with their empirical counterparts, which amounts to minimizing the distance between the two quantities.
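A hedged sketch of CF-based MDE on a toy Cauchy case (chosen because its CF exp(iµt − γ|t|) is closed-form while its moments do not exist; the frequency grid, the squared-modulus norm and all names are illustrative assumptions):

```python
import numpy as np

# Hedged MDE sketch via the empirical characteristic function (toy Cauchy case).
rng = np.random.default_rng(5)
mu_true, gamma_true = 1.0, 2.0
x = mu_true + gamma_true * rng.standard_cauchy(5000)

t = np.linspace(0.1, 1.0, 10)                       # frequencies to match (assumed grid)
phi_emp = np.mean(np.exp(1j * t[:, None] * x[None, :]), axis=1)

def cf_distance(mu, gamma):
    phi = np.exp(1j * mu * t - gamma * np.abs(t))   # theoretical Cauchy CF
    return float(np.sum(np.abs(phi - phi_emp) ** 2))

mus = np.linspace(0.0, 2.0, 41)
gammas = np.linspace(1.0, 3.0, 41)
_, mu_mde, gamma_mde = min((cf_distance(m, g), m, g) for m in mus for g in gammas)
```

As the text notes, the choice of the grid of frequencies t is a genuine tuning issue for this class of methods.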

5.5 Estimation of the tail index α

One way to estimate the tail index α of a heavy-tailed distribution is to apply MLE to the Pareto distribution. However, the most widely used method for estimating the tail index α is the Hill estimator, proposed by Bruce Hill in his seminal 1975 paper. It is a semi-parametric estimator that does not require knowledge of the entire distribution function, but only of the tail behaviour. The Hill estimator is very popular, mostly due to its simplicity.

Let $X_1, \dots, X_N$ be a sequence of i.i.d. positive random variables with a distribution function $F$ of Pareto type with tail index α, so that the tail function satisfies $\bar F(x) \sim Cx^{-\alpha}$ as $x \to \infty$, where $C$ is a constant. Let $X_{(l)}$ denote the $l$-th order statistic of the sample $X_1, \dots, X_N$, i.e. $X_{(1)} \leq \dots \leq X_{(N)}$. The Hill estimator for the tail index α is defined as
\[
\hat\alpha_{k,N} = \left( \frac{1}{k} \sum_{j=1}^{k} \left( \log X_{(N-j+1)} - \log X_{(N-k)} \right) \right)^{-1} \quad \text{for } 1 \leq k \leq N-1.
\]

5McCulloch's method is briefly described in Chapter 1, as we base our quantile-based estimation method on his approach.

For a review on the univariate Hill estimator, we refer the reader to section 2 of Chapter 3 of this work.
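The formula above translates directly into code. The Pareto data and the choice k = 500 below are illustrative assumptions of the sketch:

```python
import numpy as np

# Hill estimator, implementing the formula above on toy Pareto data.
rng = np.random.default_rng(6)
alpha_true = 1.5
x = rng.pareto(alpha_true, size=10000) + 1.0   # classical Pareto: F_bar(x) = x^{-alpha}

def hill(sample, k):
    xs = np.sort(sample)
    # log X_(N-j+1) - log X_(N-k), j = 1, ..., k  (the k largest observations)
    logs = np.log(xs[-k:]) - np.log(xs[-k - 1])
    return 1.0 / np.mean(logs)

alpha_hat = hill(x, k=500)
```

In practice the delicate part is the choice of k, which trades off variance (small k) against bias from departures of the tail from an exact power law (large k).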

6 Structure of the Thesis

This thesis is divided into four chapters (besides this introduction). The first two chapters introduce a parametric quantile-based estimation method for univariate heavy-tailed distributions and for elliptical distributions, respectively. For those interested in estimating the tail index α without imposing a parametric form on the entire distribution function, but only on the tail behaviour, we propose a multivariate Hill estimator for elliptical distributions in Chapter 3. In the first three chapters we assume an i.i.d. setting; as a first step towards a dependent setting, using quantiles, we prove in the last chapter the asymptotic normality of marginal sample quantiles for stationary processes under the S-mixing condition. Chapters 1, 2 and 4 are published papers, and Chapter 3 is a submitted paper.6

The first chapter introduces a quantile- and simulation-based estimation method, which we call the Method of Simulated Quantiles, or simply MSQ. Since it is based on quantiles, it is a moment-free approach. And since it is based on simulations, we do not need closed form expressions of any function that represents the probability law of the process. Thus, it is useful in case the probability density function has no closed form and/or moments do not exist. Also, due to the robustness of quantiles, MSQ is appropriate when data shows unusually large observations that do not follow the same process as the rest of the observations. It is based on a vector of functions of quantiles. The principle consists in matching functions of theoretical quantiles, which depend on the parameters of the assumed probability law, with those of empirical quantiles, which depend on the data. Since the theoretical functions of quantiles may not have a closed form expression, we rely on simulations. MSQ can be seen as an application of the Indirect Inference method introduced by Gouriéroux et al. (1993), where the vector of functions of quantiles stands for the vector of auxiliary parameters, or as a kind of MDE based on functions of quantiles. We prove that the proposed method provides asymptotically normal estimators of the parameters of interest. Throughout this chapter, we exemplify the proposed method with the α-stable distribution. Although MSQ is based on quantiles, it is broader than Fama and Roll (1971) and McCulloch (1986), which were designed for the estimation of the α-stable distribution. MSQ is very general as it applies to any model and distribution. Moreover, it does not make any assumptions on the functional forms of the functions of quantiles. A second advantage is that the method is not based on tabulations but on simulations.

This allows a larger flexibility and accuracy. This chapter is published as Yves Dominicy and David Veredas, The Method of Simulated Quantiles, Journal of Econometrics 172(2), 208-221, 2013.

6Let us remark that, since each journal has its own requirements concerning the presentation of the material,

The second chapter deals with the estimation of the parameters of elliptical distributions by means of a multivariate extension of MSQ. While the statistical properties of the elliptical family of distributions are well known (see, among others, Kelker (1970), Cambanis et al. (1981), Fang et al. (1990) and Frahm (2004)), estimation in vast dimensions is still an almost unexplored area, in particular for heavy-tailed distributions. For moderate dimensions and for thin-tailed distributions, such as the Gaussian, standard estimation methods, namely MLE and GMM, are straightforward. For heavy-tailed distributions, they may fail because of intractability of the probability density function and/or lack of existence of moments. This is the case of, for instance, the Elliptical Stable, the Cauchy and the Student-t distributions. In this chapter we propose inference (i.e. estimation and testing) for vast dimensional elliptical distributions. Estimation is based on quantiles, which always exist regardless of the thickness of the tails, and testing is based on the geometry of the elliptical family. The multivariate extension of MSQ faces the difficulty of constructing a function of quantiles that is informative about the covariation parameters. More precisely, the contribution is threefold. First, we introduce a quantile-based function that is informative about the co-dispersions. We show that the interquartile range of a projection of pairwise random variables onto the 45 degree line is very informative about the covariation. Second, we propose a fast method for the estimation of the parameters that i) does not require tractability of the density function or existence of moments, and ii) does not suffer from the curse of dimensionality. Last, we propose simple testing procedures for the null hypothesis of correct specification of one or several probability level contours. MSQ provides the asymptotic theory for the estimators. Furthermore, due to the properties of elliptical distributions, we find a way to avoid almost all the optimizations. This renders the method extremely fast and suitable for vast dimensions. This chapter is published as Yves Dominicy, Hiroaki Ogata and David Veredas, Inference for Vast Dimensional Elliptical Distributions, Computational Statistics 28(4), 1853-1880, 2013.

The third chapter constructs a multivariate tail estimator. As already mentioned, heavy-tailed models are common in various areas of application, and the interest often lies in the so-called tail index α. In the univariate case, the most popular estimator of the tail exponent α is the Hill estimator introduced by Bruce Hill in 1975. The popularity of the Hill estimator is not surprising, since it provides a simple and semi-parametric approach to draw inference about the tail behaviour of regularly varying distributions. The aim of this chapter is to propose an estimator of the tail index α in a multivariate context; more precisely, in the case of regularly varying elliptical distributions. Since, for univariate random variables, our estimator boils down to the Hill estimator, we name it after Bruce Hill. Our estimator is based on the distance between an elliptical probability contour and the exceedance observations; the Minimum Covariance Determinant method provides the probability ellipsoid. Our estimator is as easy to compute as the univariate one. A simulation study and an empirical illustration with financial market indexes show that our elliptical multivariate Hill estimator works well in practice. This chapter is submitted.


The fourth chapter establishes the asymptotic normality of the empirical quantile vector $(q^{(1)}_{\tau_1,n}, \dots, q^{(J)}_{\tau_J,n})'$ for given $\tau_1, \dots, \tau_J \in [0, 1]$. The classical approach to obtain limiting distributions for statistics of weakly dependent processes is to impose mixing conditions (i.e. α-, φ-, ψ- and β-mixing). Those classical mixing conditions lead to correct results. However, they are often not only difficult to verify but also require strong smoothness of the process, so their range of application in a time series context is somewhat limited. We assume instead that the processes are S-mixing, a recently introduced and widely applicable notion of dependence (Berkes et al. (2009)). S-mixing is attractive since its verification is almost immediate and it nests a large number of well-known econometric models, such as linear processes (i.e. ARMA models), GARCH models and their extensions, and stochastic volatility models, among others. A remarkable property of S-mixing is that no higher-order moment assumptions are needed to verify it. Since we are interested in quantiles and in processes that are possibly heavy-tailed, this is of particular interest. This chapter is published as Yves Dominicy, Siegfried Hörmann, Hiroaki Ogata and David Veredas, On Sample Marginal Quantiles for Stationary Processes, Statistics and Probability Letters 83(1), 28-36, 2013.

Bibliography

[1] I. Berkes, S. Hörmann, and J. Schauer. Asymptotic results for the empirical process of stationary sequences. Stochastic Processes and their Applications, 119, 1298-1324, 2009.

[2] S. Cambanis, S. Huang and G. Simons. On the theory of elliptically contoured distributions. Journal of Multivariate Analysis, 11, 368-385, 1981.

[3] V.P. Chistyakov. A theorem on sums of independent positive random variables and its applications to branching random processes. Theory of Probability and its Applications, 9, 640-648, 1964.

[4] H. Cramér. Mathematical Methods of Statistics. Princeton Mathematical Series, vol. 9. Princeton University Press, Princeton, N.J., 1946.

[5] D. Duffie and K.J. Singleton. Simulated Moments Estimation of Markov Models of Asset Prices. Econometrica, 61, 929-952, 1993.

[6] P. Embrechts, C. Klüppelberg and T. Mikosch. Modelling Extremal Events for Insurance and Finance. Springer-Verlag, Berlin, 1997.

[7] E.F. Fama. The behavior of stock market prices. Journal of Business, 38, 34-105, 1965.

[8] E.F. Fama and R. Roll. Parameter estimates for symmetric stable distributions. Journal of the American Statistical Association, 66, 331-338, 1971.

[9] K. Fang, S. Kotz and K. Ng. Symmetric multivariate and related distributions. Chapman and Hall, New York, 1990.

[10] R.A. Fisher. On an absolute criterion for fitting frequency curves. Messenger of Mathematics, 41, 155-160, 1912.

[11] R.A. Fisher. Applications of Student’s Distribution. Metron, 5, 3-17, 1925.


[13] A.R. Gallant and G. Tauchen. Which moments to match? Econometric Theory, 12, 657-681, 1996.

[14] C. Gouriéroux and A. Monfort. Simulation-Based Econometric Methods. Oxford University Press, 1996.

[15] C. Gouriéroux, A. Monfort and E. Renault. Indirect Inference. Journal of Applied Econometrics, 8, 85-118, 1993.

[16] V. Hajivassiliou and P. Ruud. Classical estimation methods for LDV models using simulation. Handbook of Econometrics, 4, 2383-2441, 1994.

[17] L.P. Hansen. Large Sample Properties of Generalized Method of Moments Estimators. Econometrica, 50, 1029-1054, 1982.

[18] B. Hill. A simple general approach to inference about the tail of a distribution. Annals of Statistics, 3, 1163-1174, 1975.

[19] H. Hult and F. Lindskog. Multivariate extremes, aggregation and dependence in elliptical distributions. Ad-vances in Applied Probability, 34, 587-608, 2002.

[20] D. Kelker. Distribution theory of spherical distributions and a location-scale parameter generalization. Sankhya, A32, 419-430, 1970.

[21] P. Lévy. Calcul des Probabilités. Gauthier-Villars, Paris, 1925.

[22] B. Mandelbrot. The Variation of Certain Speculative Prices. Journal of Business, 36, 394-419, 1963.

[23] J.H. McCulloch. Simple consistent estimators of stable distribution parameters. Communications in Statistics - Simulation and Computation, 15(4), 1109-1136, 1986.

[24] J.H. McCulloch. Financial applications of stable distributions. In Handbook of Statistics, edited by G.S. Maddala and C.R. Rao, 393-425, Amsterdam: Elsevier, 1996.

[25] D. McFadden. A Method of Simulated Moments for Estimation of Discrete Response Models without Numerical Integration. Econometrica, 57, 995-1026, 1989.

[26] A. Pakes and D. Pollard. The Asymptotics of Simulation Estimators. Econometrica, 57, 1027-1058, 1989.

[27] V. Pareto. Cours d'économie politique. Nouvelle édition par G.-H. Bousquet et G. Busino, Librairie Droz, Geneva, pages 299-345, 1964.

[28] S. Rachev and S. Mittnik. Stable Paretian Models in Finance. Chichester: Wiley, 2000.

[29] G. Samorodnitsky and M.S. Taqqu. Stable non-Gaussian random processes: Stochastic models with infinite variance. Chapman & Hall/CRC, Stochastic Modeling, 1994.

[30] G. Shorack. Probability for Statisticians. Springer, New York, 2000.

[31] A.A. Smith. Estimating Nonlinear Time-series Models Using Simulated Vector Autoregressions. Journal of Applied Econometrics, 8, S63-S84, 1993.


Chapter I

The Method of Simulated Quantiles1

Abstract

We introduce the Method of Simulated Quantiles, or MSQ, an indirect inference method based on quantile matching that is useful for situations where the density function does not have a closed form and/or moments do not exist. Functions of theoretical quantiles, which depend on the parameters of the assumed probability law, are matched with the sample counterparts, which depend on the observations. Since the theoretical quantiles may not be available analytically, the optimization is based on simulations. We illustrate the method with the estimation of α-stable distributions. A thorough Monte Carlo study and an illustration to 22 financial indexes show the usefulness of MSQ.

Keywords: Quantiles; Simulation; Matching; Inference.

1This chapter is joint work with David Veredas (Université libre de Bruxelles), and the paper is published in the Journal of Econometrics, 172(2), 208-221, 2013.


1 Introduction

Estimation of the parameters of an econometric or economic parametric model is a first-order concern. If we know the probability law that governs the random variables, Maximum Likelihood (ML henceforth) is the benchmark technique. If we relax the assumption of knowledge of the distribution but still have knowledge of the moments, the Generalized Method of Moments (GMM henceforth) becomes the benchmark technique. However, there are models that cannot be easily estimated with ML or GMM: stochastic volatility models, models with stochastic regime switches, or models involving expected utilities, to name a few. For those models the likelihood function may not be available analytically (or can be difficult to estimate), and/or the moments may not exist.

To circumvent these estimation difficulties, numerous estimation methods based on simulations have been developed. Gouriéroux and Monfort (1996) and Hajivassiliou and Ruud (1994) introduce Simulated ML (SML), similar to ML except that simulated probabilities are used instead of the exact probabilities. McFadden (1989), Pakes and Pollard (1989), and Duffie and Singleton (1993) independently introduced the Method of Simulated Moments (MSM), which is based on matching sample moments and theoretical moments that are generated by simulations. Gouriéroux et al. (1993) propose Indirect Inference (IndInf), a method based on estimating indirectly the parameters of the model of interest through matching the parameters of an auxiliary model. The Efficient Method of Moments (EMM) of Gallant and Tauchen (1996) is based on the same idea.

In this article we introduce the Method of Simulated Quantiles (MSQ henceforth). Since it is based on quantiles, it is a moment-free method. And since it is based on simulations, we do not need closed form expressions of any function that represents the probability law of the process. Also, due to the robustness of quantiles, MSQ is appropriate when data shows unusually large observations that do not follow the same process as the rest of the observations. In a nutshell, MSQ is based on a vector of functions of quantiles. These functions can be either computed from data (the sample functions) or from the distribution (the theoretical functions). The estimated parameters are those that minimize a quadratic distance between both. Since the theoretical functions of quantiles may not have a closed form expression, we rely on simulations. MSQ is therefore an application of the IndInf principle of Gouriéroux et al. (1993), where the vector of functions of quantiles stands for the vector of auxiliary parameters. Throughout the article, we exemplify the method with the α-stable distribution. To make a distinction between the theory and the example, the example ends with a white square.

Example

Let X be a random variable distributed following an α-stable distribution, represented as $X \sim S_\alpha(\sigma, \beta, \mu)$. The parameter $\alpha \in (0, 2]$, often denoted the tail index, measures the thickness of the tails. [...] for $k = 1, \dots, K$, then $\sum_{k=1}^{K} \omega_k X_k \sim S_\alpha(\sigma, \beta, \mu)$.

The probability density function (pdf) of the α-stable distribution does not have a closed form. Since it is a complicated integral, even difficult to evaluate numerically, estimation by ML has often not been considered in applied work (though the theoretical properties of the ML estimator exist, DuMouchel, 1973, and the actual estimation has been performed by Nolan, 2001). However, the characteristic function (CF hereafter) has a manageable closed form:2
\[
\mathrm{E}[\exp\{itX\}] =
\begin{cases}
\exp\{-\sigma^\alpha |t|^\alpha (1 - i\beta\,(\operatorname{sign} t) \tan\frac{\pi\alpha}{2}\,(|\sigma t|^{1-\alpha} - 1)) + i\mu t\} & \text{if } \alpha \neq 1 \\
\exp\{-\sigma |t| (1 + i\beta\frac{2}{\pi}(\operatorname{sign} t) \ln(\sigma|t|)) + i\mu t\} & \text{if } \alpha = 1.
\end{cases}
\]

All the methods based on the CF match the theoretical and sample counterparts, but in different ways.3 A problem inherent to these methods is the choice of the grid of frequencies at which to evaluate the CF. While Fielitz and Rozelle (1981) recommend, on the basis of Monte Carlo results, matching only a few frequencies, others, like Feuerverger and McDunnough (1981), recommend using as many frequencies as possible. However, in the latter case, Carrasco and Florens (2002) have shown that, even asymptotically, matching a continuum of moment conditions introduces a fundamental singularity problem.

An alternative is the use of simulation-based methods. From a Bayesian perspective, Buckle (1995), Qiou and Ravishanker (1998), and Lombardi (2007) use Markov Chain Monte Carlo methods. From a frequentist perspective, and since random numbers from α-stable distributions can be obtained straightforwardly, IndInf and EMM are appealing, as has been shown by Garcia et al. (2011) and Lombardi and Calzolari (2008). They both use a skewed-t distribution as auxiliary model.

Finally, Fama and Roll (1971) and McCulloch (1986) propose using functions of quantiles. Four specific functions of quantiles are constructed to capture the same features as those captured by α, β, σ and µ. Since the pdf does not have a closed form, neither do the cumulative distribution function and the quantiles. Estimation has to be done either by simulation or by tabulation; they opt for the latter. Fama and Roll (1971) and McCulloch (1986) estimate the parameters by calibrating the value of the sample functions of quantiles with tabulated values of the theoretical quantiles. This is a fast way to estimate the parameters, since it avoids optimization, but the theoretical properties remain unclear and the extension to the case of linear combinations of α-stable random variables

2A different parametrization, among others, is given by
\[
\mathrm{E}[\exp\{itX\}] =
\begin{cases}
\exp\{-\sigma^\alpha |t|^\alpha (1 - i\beta\,(\operatorname{sign} t) \tan\frac{\pi\alpha}{2}) + i\mu t\} & \text{if } \alpha \neq 1 \\
\exp\{-\sigma |t| (1 + i\beta\frac{2}{\pi}(\operatorname{sign} t) \ln |t|) + i\mu t\} & \text{if } \alpha = 1.
\end{cases}
\]
The one shown in the main body of the text is commonly used in numerical computations. For instance, Chambers et al. (1976) use it for simulation.

3Since the sample CF is a random variable with complex values, one can think about (i) matching moments

is not possible (since the tail index has to be the same for all the random variables, estimation has to be done jointly). □

MSQ combines the simulation-based and the quantile-based methods. It is broader than Fama and Roll (1971) and McCulloch (1986), which were designed for the estimation of the α-stable distribution. In fact, MSQ is very general as it applies to any model and distribution. Moreover, it does not make any assumptions on the functional forms for the functions of quantiles. A second advantage is that the method is not based on tabulations but on simulations. This allows a larger flexibility and accuracy. Indeed, tabulation requires interpolation if the sample functions of quantiles are not exactly equal to the tabulated theoretical functions of quantiles. Third, we provide an asymptotic theory that shows the consistency, asymptotic normality and the asymptotic variance-covariance matrix of the estimated parameters.

Estimation via quantiles is a natural alternative to moment-based methods and traces back to Aitchison and Brown (1957). They estimate a three-parameter log-normal distribution by matching quantiles. A similar result is also found in Bury (1975). Quantiles can also be used to construct functions that measure aspects of the probability distribution. Let $q_\tau$ denote the τ-th quantile of $X_t$ for $\tau \in (0, 1)$. The median, $q_{0.50}$, is often used as an estimator of the location. The interquartile range, $q_{0.75} - q_{0.25}$, is a natural measure of dispersion. Bowley (1920) and Hinkley (1975) proposed the quartile skewness (known as the Bowley coefficient):
\[
BC = \frac{(q_\tau - q_{0.5}) - (q_{0.5} - q_{1-\tau})}{q_\tau - q_{1-\tau}}.
\]
The smaller τ, the less sensitive to outliers, but the less information from the tails it uses. This measure of asymmetry is dispersion and location invariant, i.e. $\gamma(aX_t + b) = \gamma(X_t)$, where γ denotes the above measures.

As far as measures of tail thickness are concerned, Moors (1988) proposed
\[
Mo = \frac{(q_{0.875} - q_{0.625}) + (q_{0.375} - q_{0.125})}{q_{0.750} - q_{0.250}}.
\]
The two terms in the numerator are large if little probability mass is concentrated in the neighbourhood of the first and third quartiles. It is standardized by the interquartile range to guarantee invariance under linear transformations.4
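The two measures above translate directly into code on sample quantiles; the Gaussian sample below is only a sanity check of my own (for a symmetric law BC is approximately 0, and for the Gaussian Mo is approximately 1.23).

```python
import numpy as np

# Bowley's quartile skewness and Moors' tail-thickness measure, as defined above.
def bowley(x, tau=0.75):
    q_low, q_med, q_high = np.quantile(x, [1 - tau, 0.5, tau])
    return ((q_high - q_med) - (q_med - q_low)) / (q_high - q_low)

def moors(x):
    q = np.quantile(x, [0.125, 0.25, 0.375, 0.625, 0.75, 0.875])
    return ((q[5] - q[3]) + (q[2] - q[0])) / (q[4] - q[1])

rng = np.random.default_rng(7)
z = rng.standard_normal(100000)
bc, mo = bowley(z), moors(z)
# Location-dispersion invariance: bowley(a*z + b) equals bowley(z) exactly.
```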

The rest of the paper is organized as follows. In Section 2 we first introduce notation, followed by MSQ. Each step of the presentation of the method is illustrated with our example. We also show the assumptions and the asymptotic distribution of the estimators. In Section 3 we report the results of a Monte Carlo study based on our example. We consider univariate and multidimensional estimation of α-stable distributions. Multidimensional should not be mistaken for multivariate: by multidimensional we mean joint estimation of univariate distributions that share the same tail index. For the univariate case, our method is compared with McCulloch (1986). In Section 4 we show an illustration with 22 world-wide market indexes, assumed to be distributed according to α-stable distributions. We first estimate the parameters independently. Then we estimate them jointly assuming a common tail index, which is needed for the construction of linear combinations, as we show in the last part of the section. Section 5 concludes. Proofs and other technicalities are relegated to the Appendix.

2 The Method of Simulated Quantiles

Consider a random variable X that follows a distribution $D(\theta)$, where θ denotes the vector of unknown parameters, lying in an interior point of the compact parameter set $\Theta \subset \mathbb{R}^p$. Let $x = (x_1, \dots, x_i, \dots, x_N)^T$ be the vector of N realizations and $\hat q_{\tau_k}$ be its $\tau_k$-th sample quantile. Denote by $\hat q = (\hat q_{\tau_1}, \dots, \hat q_{\tau_s})^T \in \mathbb{R}^s$ and $\hat q^* = (\hat q^*_{\tau_1}, \dots, \hat q^*_{\tau_b})^T \in \mathbb{R}^b$ two $s \times 1$ and $b \times 1$ vectors of sample quantiles. Let $h(\hat q)$ and $g(\hat q^*)$ be two $M \times 1$ vectors of functions $\mathbb{R}^s \to \mathbb{R}^M$ and $\mathbb{R}^b \to \mathbb{R}^M$. Identification requires that $M \geq p$. Consider their Hadamard (element-by-element) product $\hat\Phi = h(\hat q) \odot g(\hat q^*)$.

Likewise, denote by $q_\theta = (q_{\tau_1,\theta}, \dots, q_{\tau_s,\theta})^T \in \mathbb{R}^s$ and $q^*_\theta = (q^*_{\tau_1,\theta}, \dots, q^*_{\tau_b,\theta})^T \in \mathbb{R}^b$ two $s \times 1$ and $b \times 1$ vectors of theoretical quantiles corresponding to $D(\theta)$. That is, $q_{\tau_k,\theta}$ denotes the $\tau_k$-th theoretical quantile of X. These quantiles may not be available analytically but can be computed through simulations. Let $h(q_\theta)$ and $g(q^*_\theta)$ be two $M \times 1$ vectors of functions $\mathbb{R}^s \to \mathbb{R}^M$ and $\mathbb{R}^b \to \mathbb{R}^M$, continuously differentiable with respect to θ. Consider their Hadamard product $\Phi_\theta = h(q_\theta) \odot g(q^*_\theta)$.

Example (cont.)

McCulloch (1986) defines four functions of quantiles that represent the four parameters of the α-stable distribution. Let $\hat q = (\hat q_{0.95}, \hat q_{0.75}, \hat q_{0.50}, \hat q_{0.25}, \hat q_{0.05})^T$ and $\hat q^* = (\hat q_{0.95}, \hat q_{0.75}, \hat q_{0.25}, \hat q_{0.05})^T$. The functions $h(\hat q)$ and $g(\hat q^*)$ are both $4 \times 1$:
\[
h(\hat q) =
\begin{pmatrix}
\hat q_{0.95} - \hat q_{0.05} \\
(\hat q_{0.95} - \hat q_{0.50}) + (\hat q_{0.05} - \hat q_{0.50}) \\
\hat q_{0.75} - \hat q_{0.25} \\
\hat q_{0.50}
\end{pmatrix}
\quad \text{and} \quad
g(\hat q^*) =
\begin{pmatrix}
(\hat q_{0.75} - \hat q_{0.25})^{-1} \\
(\hat q_{0.95} - \hat q_{0.05})^{-1} \\
1 \\
1
\end{pmatrix}.
\]
And the vector of functions of quantiles $\hat\Phi$ is
\[
\hat\Phi =
\begin{pmatrix}
\frac{\hat q_{0.95} - \hat q_{0.05}}{\hat q_{0.75} - \hat q_{0.25}} \\
\frac{(\hat q_{0.95} - \hat q_{0.50}) + (\hat q_{0.05} - \hat q_{0.50})}{\hat q_{0.95} - \hat q_{0.05}} \\
\hat q_{0.75} - \hat q_{0.25} \\
\hat q_{0.50}
\end{pmatrix}.
\]
The last two are the interquartile range and the median.

The two upper elements in $\hat\Phi$ are dispersion and location invariant, meaning that they are insensitive to µ and σ. This is why McCulloch (1986) standardizes the sample. So if the process truly has unit dispersion, the sample interquartile range $\hat q_{0.75} - \hat q_{0.25}$ is a consistent estimator of the dispersion; otherwise, it is re-scaled by the scale parameter σ. Similarly, if the process truly has location zero and unit dispersion, the values of the theoretical median $q_{0.50,\theta}$ and the sample median $\hat q_{0.50}$ should be close. Otherwise, the theoretical median is re-scaled and re-located by σ and µ respectively. McCulloch (1986) notices that $q_{0.50,\theta}$ has a double singularity as α crosses 1 when $\beta \neq 0$. This makes interpolation meaningless between $\alpha = 0.9$ and $\alpha = 1.1$.5 To circumvent this problem, he uses as location parameter the one introduced by Zolotarev (1957),
\[
\zeta =
\begin{cases}
\mu + \beta \sigma \tan\frac{\pi\alpha}{2} & \text{for } \alpha \neq 1 \\
\mu & \text{for } \alpha = 1.
\end{cases}
\]
Putting all these elements together, the vector of theoretical functions of quantiles equals:
\[
\Phi_\theta =
\begin{pmatrix}
\frac{q_{0.95,\theta} - q_{0.05,\theta}}{q_{0.75,\theta} - q_{0.25,\theta}} \\
\frac{(q_{0.95,\theta} - q_{0.50,\theta}) + (q_{0.05,\theta} - q_{0.50,\theta})}{q_{0.95,\theta} - q_{0.05,\theta}} \\
(q_{0.75,\theta} - q_{0.25,\theta})\,\sigma \\
\mu + \sigma q_{0.50,\theta}
\end{pmatrix}. \qquad \square
\]

The vector $\Phi_\theta$ does not have an explicit relation with θ, but it can be obtained as the sample quantile from simulated observations, which we denote by $\tilde\Phi_\theta$. Moreover, we can draw R simulated paths and compute the average vector $\tilde\Phi^R_\theta = \frac{1}{R}\sum_{r=1}^{R} \tilde\Phi_{\theta,r}$. The principle of MSQ is to find the value of the parameters that best match the sample and theoretical functions of quantiles. This is done by minimizing the quadratic distance between $\hat\Phi$ and $\tilde\Phi^R_\theta$:
\[
\hat\theta = \arg\min_{\theta \in \Theta} (\hat\Phi - \tilde\Phi^R_\theta)^T W_\theta (\hat\Phi - \tilde\Phi^R_\theta), \tag{2.1}
\]

where $W_\theta$ is an $M \times M$ symmetric positive definite weighting matrix defining the metric. Three particular cases are nested in (2.1). The first is when no simulations are needed: if $\tilde\Phi^R_\theta$ can be computed explicitly, then (2.1) can be solved by standard optimization techniques. An example is the generalized Tukey lambda distribution (Ramberg and Schmeiser, 1974), which is defined only in terms of its quantiles:
\[
q_{\tau_k,\theta} = \theta_1 + \frac{\tau_k^{\theta_3} - (1-\tau_k)^{\theta_4}}{\theta_2},
\]

where $\theta_1$ and $\theta_2$ are location and dispersion parameters respectively, and $\theta_3$ and $\theta_4$ are shape parameters. The second particular case is when $M = p$. Then $W_\theta$ is irrelevant and the problem boils down to solving the system $\hat\Phi - \tilde\Phi^R_\theta = 0$. This is the case of the $\alpha$-stable distribution, where there are four parameters and four functions of quantiles. The third particular case combines the first and second: if $\tilde\Phi^R_\theta$ can be computed explicitly and $M = p$, the problem reduces to finding the $\theta$ such that $\hat\Phi = \Phi_\theta$. This is the case of the Tukey lambda distribution, where there is only one parameter that can be estimated with just one function of quantiles (which could be the $\tau_k$-th quantile itself, i.e. matching $\hat q_{\tau_k}$ with $q_{\tau_k,\theta}$), or its generalized version with four parameters and four functions of quantiles.

⁵ When α = 1 the scale of the sample mean is the same as the original scale. This implies either that every
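The first particular case can be made concrete with the generalized Tukey lambda distribution: its quantile function is available in closed form, so no simulation is required and the matching can use the quantiles directly. A minimal sketch (the function and parameter names are illustrative):

```python
def gtl_quantile(tau, t1, t2, t3, t4):
    """Quantile function of the generalized Tukey lambda distribution
    (Ramberg and Schmeiser, 1974): q(tau) = t1 + (tau**t3 - (1-tau)**t4) / t2."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must be in (0, 1)")
    return t1 + (tau ** t3 - (1.0 - tau) ** t4) / t2
```

With $M = p = 4$ functions of quantiles, estimation then reduces to solving $\hat\Phi = \Phi_\theta$ with a standard root finder; no simulation enters.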

Example (cont.) Since $\tilde\Phi^R_\theta$ and $\hat\Phi$ are $4 \times 1$ vectors, optimization boils down to solving the system $\hat\Phi - \tilde\Phi^R_\theta = 0$. Because the functions of quantiles for $\alpha$ and $\beta$ are location and dispersion invariant, they can be matched independently from those for $\sigma$ and $\mu$. Let $\theta^\dagger = (\alpha, \beta)$,
\[
\tilde\Phi^\dagger_{\theta,r} = \begin{pmatrix}
\dfrac{q^r_{0.95,\theta^\dagger} - q^r_{0.05,\theta^\dagger}}{q^r_{0.75,\theta^\dagger} - q^r_{0.25,\theta^\dagger}} \\[2mm]
\dfrac{(q^r_{0.95,\theta^\dagger} - q^r_{0.50,\theta^\dagger}) + (q^r_{0.05,\theta^\dagger} - q^r_{0.50,\theta^\dagger})}{q^r_{0.95,\theta^\dagger} - q^r_{0.05,\theta^\dagger}}
\end{pmatrix}
\quad\text{and}\quad
\hat\Phi^\dagger = \begin{pmatrix}
\dfrac{\hat q_{0.95} - \hat q_{0.05}}{\hat q_{0.75} - \hat q_{0.25}} \\[2mm]
\dfrac{(\hat q_{0.95} - \hat q_{0.50}) + (\hat q_{0.05} - \hat q_{0.50})}{\hat q_{0.95} - \hat q_{0.05}}
\end{pmatrix},
\]
where $q^r_{\tau,\theta^\dagger}$ denotes the $\tau$-th sample quantile for the $r$-th simulated path. Once $\hat\theta^\dagger = (\hat\alpha, \hat\beta)$ is obtained, estimates for $\sigma$ and $\mu$ are straightforward. For $\sigma$:
\[
\hat\sigma = \frac{\hat q_{0.75} - \hat q_{0.25}}{\frac{1}{R}\sum_{r=1}^{R}\big(q^r_{0.75,\hat\theta^\dagger} - q^r_{0.25,\hat\theta^\dagger}\big)}, \tag{2.2}
\]
where $q^r_{0.75,\hat\theta^\dagger}$ and $q^r_{0.25,\hat\theta^\dagger}$ are the theoretical 0.75-th and 0.25-th quantiles of a standardized $\alpha$-stable distribution evaluated at $\hat\alpha$ and $\hat\beta$. For $\mu$:
\[
\hat\mu = \begin{cases}
\hat q_{0.50} + \hat\sigma\Big(\hat\beta \tan\frac{\pi\hat\alpha}{2} - \frac{1}{R}\sum_{r=1}^{R} q^r_{0.50,\hat\theta^\dagger}\Big) & \text{for } \alpha \neq 1,\\[2mm]
\hat q_{0.50} - \hat\sigma\,\frac{1}{R}\sum_{r=1}^{R} q^r_{0.50,\hat\theta^\dagger} & \text{for } \alpha = 1,
\end{cases} \tag{2.3}
\]
where $q^r_{0.50,\hat\theta^\dagger}$ is the theoretical 0.50-th quantile of a standardized $\alpha$-stable distribution evaluated at $\hat\alpha$ and $\hat\beta$. $\square$
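Once $\hat\alpha$ and $\hat\beta$ are in hand, the second stage (2.2)-(2.3) is a pure plug-in computation. The sketch below assumes the quantiles of the $R$ standardized simulated paths have already been averaged into a dict; the numeric inputs used to exercise it are illustrative, not real $\alpha$-stable quantiles.

```python
import math

def second_stage(q_hat, q_sim_bar, alpha_hat, beta_hat):
    """Plug-in estimates of sigma and mu in the spirit of (2.2)-(2.3).
    q_hat: sample quantiles of the data, keys 0.25, 0.50, 0.75.
    q_sim_bar: quantiles of the standardized simulated paths, averaged
               over the R paths, same keys."""
    sigma = (q_hat[0.75] - q_hat[0.25]) / (q_sim_bar[0.75] - q_sim_bar[0.25])
    if alpha_hat != 1.0:
        mu = q_hat[0.50] + sigma * (beta_hat * math.tan(math.pi * alpha_hat / 2.0)
                                    - q_sim_bar[0.50])
    else:
        mu = q_hat[0.50] - sigma * q_sim_bar[0.50]
    return sigma, mu
```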

Optimization (2.1) works as follows. First, the sample functions of quantiles $\hat\Phi$ are estimated from the observations. Second, given some initial values of the parameters, we simulate $R$ samples of size $N^*$ from the probability law that generates the process.⁶ The simulated samples are used to compute $\tilde\Phi^R_\theta$ (as the average sample quantiles of the simulated samples). Third, an iterative process starts to find the $\theta$ that minimizes (2.1). The simulation and the calculation of $\tilde\Phi^R_\theta$ are repeated at each iteration of the algorithm (always using the same seed). The iterative process continues until the convergence criterion is achieved. Note that $N^*$ can be larger than $N$. In fact, the larger $N^*$, the smaller the randomness induced by the simulation. An alternative is to replace the $R$ simulated samples of size $N^*$ with one sample of size $RN^*$.⁷

⁶ In the Monte Carlo study and the illustration, we use McCulloch's estimates as initial values for the parameters.
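This three-step loop can be sketched end to end. The toy below deliberately swaps in simpler ingredients: a Laplace location-scale model (easy to simulate with standard tools) instead of the $\alpha$-stable law, a coarse grid search instead of a numerical optimizer, and the median and IQR as the two functions of quantiles, so with $M = p = 2$ the weighting matrix is irrelevant. The same seed is reused at every candidate $\theta$, as described above.

```python
import random

def quantile(xs, tau):
    """tau-th sample quantile by linear interpolation of order statistics."""
    xs = sorted(xs)
    h = (len(xs) - 1) * tau
    lo = int(h)
    return xs[lo] + (h - lo) * (xs[min(lo + 1, len(xs) - 1)] - xs[lo])

def phi(xs):
    """Functions of quantiles: median (location) and IQR (dispersion)."""
    return (quantile(xs, 0.50), quantile(xs, 0.75) - quantile(xs, 0.25))

def simulate_laplace(mu, b, n, rng):
    """Laplace(mu, b) draws: the difference of two unit exponentials is
    standard Laplace, then re-located and re-scaled."""
    return [mu + b * (rng.expovariate(1.0) - rng.expovariate(1.0))
            for _ in range(n)]

def msq_laplace(data, R=5, n_sim=500, seed=42):
    """MSQ by grid search: at each candidate theta, draw R simulated paths
    with the same seed, average their functions of quantiles, and minimize
    the squared distance to the sample ones (W = I, since M = p)."""
    target = phi(data)
    best, best_loss = None, float("inf")
    for mu in [i * 0.5 for i in range(-4, 5)]:       # location grid
        for b in [i * 0.5 for i in range(1, 6)]:     # scale grid, b > 0
            rng = random.Random(seed)                # same seed at every theta
            med = iqr = 0.0
            for _ in range(R):
                m, q = phi(simulate_laplace(mu, b, n_sim, rng))
                med += m / R
                iqr += q / R
            loss = (target[0] - med) ** 2 + (target[1] - iqr) ** 2
            if loss < best_loss:
                best, best_loss = (mu, b), loss
    return best
```

With data simulated from Laplace(1.0, 1.5) and a moderate sample size, the grid point at the truth should minimize the distance; a real application would of course use a proper optimizer over a continuous parameter space.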

Several remarks are in order.

First, optimization (2.1) depends on the weighting matrix $W_\theta$, which in turn depends on the parameters. This is a problem similar to GMM and IndInf, so we proceed similarly: we optimize (2.1) with $W_\theta = I$, an $M \times M$ identity matrix. The estimated parameters, $\check\theta_N$, albeit inefficient, are consistent. Then we replace $\theta$ by $\check\theta_N$ in $W_\theta$, and we optimize $(\hat\Phi - \tilde\Phi^R_\theta)^{T} \hat W_{\check\theta} (\hat\Phi - \tilde\Phi^R_\theta)$. The optimal choice of $W_\theta$ and its estimator are discussed in the next sub-section.

Second, in many situations there are constraints between the parameters to be estimated (e.g. equality or proportionality). A first thought would be to optimize (2.1) subject to the constraints. This leads to a complicated constrained optimization problem that may involve Lagrange multipliers and Kuhn-Tucker conditions. In our simulation-based framework there is no need for such a constrained optimization: the constraints between the parameters can easily be imposed in the simulation step.

Third, notwithstanding the previous appealing feature, there are also constraints on the parameter spaces that the optimization has to handle. For instance, for the $\alpha$-stable distribution $\alpha \in (0, 2]$, $\beta \in [-1, 1]$ and $\sigma \in \mathbb{R}^{+}$. An appropriate re-parametrization avoids this extra complexity.

Fourth, the choice of the functions of quantiles does not affect the consistency of the estimators, but it does affect the asymptotic variance (cf. next sub-section). We could match a small number of empirical and theoretical quantiles, avoiding the functions $h(\cdot)$ and $g(\cdot)$. But in practice the choice of the functions of quantiles plays an important role. As mentioned in the introduction, these functions should be informative about the parameters: a function of quantiles that is uninformative about one of the parameters will make estimation fail, as explained below in the Monte Carlo study. Alternatively, we could consider a thin grid of quantiles (i.e. a vector of quantiles over a thin grid of $\tau$'s) instead of a small number of functions of quantiles. This approach would produce consistent estimators, but (i) the grid of $\tau$'s needs to be thin enough to contain information about all the parameters, which entails a significant increase in the complexity of the optimization (both in time and computationally), and (ii) when the grid becomes sufficiently thin, efficient estimation is no longer possible, since the quantiles become collinear as the number of $\tau$'s goes to infinity (this was pointed out by Carrasco and Florens (2002) in the GMM context). As a result, the inverse of the weighting function is not continuous and needs to be stabilized by a regularization parameter.

⁷ Gouriéroux and Monfort (1996), Czellar and Zivot (2008) and Gouriéroux et al. (2010), among others, compare
