Some variations on the extremal index

(1)

HAL Id: hal-03250576

https://hal.archives-ouvertes.fr/hal-03250576

Preprint submitted on 8 Jun 2021

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Some variations on the extremal index

Gloria Buriticá, Nicolas Meyer, Thomas Mikosch, Olivier Wintenberger

To cite this version:

Gloria Buriticá, Nicolas Meyer, Thomas Mikosch, Olivier Wintenberger. Some variations on the

extremal index. 2021. �hal-03250576�

(2)

SOME VARIATIONS ON THE EXTREMAL INDEX

GLORIA BURITIC ´A, NICOLAS MEYER, THOMAS MIKOSCH, AND OLIVIER WINTENBERGER

Abstract. We re-consider Leadbetter’s extremal index for stationary sequences. It has interpretation as reciprocal of the expected size of an extremal cluster above high thresholds. We focus on heavy-tailed time series, in particular on regularly varying stationary sequences, and discuss recent research in extreme value theory for these models. A regularly varying time series has multivariate regularly varying finite- dimensional distributions. Thanks to results by Basrak and Segers [2]

we have explicit representations of the limiting cluster structure of extremes, leading to explicit expressions of the limiting point process of exceedances and the extremal index as a summary measure of extremal clustering. The extremal index appears in various situations which do not seem to be directly related, like the convergence of maxima and point processes. We consider different representations of the extremal index which arise from the considered context. We discuss the theory and apply it to a regularly varying AR(1) process and the solution to an affine stochastic recurrence equation.

Some personal words by Thomas Mikosch. During my PhD studies at the University of Leningrad 1981–1984 I met Yasha Nikitin at seminars and workshops at LOMI (now POMI) and the university. I remember him as a professor who had a lot of humor and a balanced personality. Later, since the 1990s, Yasha Nikitin became a representative of Russian Probability Theory and Mathematical Statistics with a high international reputation. I appreciated his cosmopolitan attitude. Whenever one needed constructive advice as regards some international scientific event (such as the European Meeting of Statisticians) or editorial issues, he would help. He supported the Bernoulli Society actively, as an organization which embraces all European probabilists and statisticians.

I met Yasha at the Vilnius Conference in 2018 at the last time and, as always, I enjoyed his warm-hearted personality. He went from us too early.

His achievements for Probability Theory and Statistics at the University of St. Petersburg and worldwide remain alive.

2020 Mathematics Subject Classification. Primary 60G70 Secondary 60G55 60F99 60J10 62M10 62G32.

Key words and phrases. Extremal index, cluster Poisson process, extremal cluster, regularly varying time series, affine stochastic recurrence equation, autoregressive process.

Thomas Mikosch’s research is partially supported by Danmarks Frie Forskningsfond Grant No 9040-00086B.

1

(3)

1. Leadbetter’s approach to modeling the extremes of a stationary sequence

The paper by Leadbetter [22] and the book of Leadbetter, Lindgren and Rootz´ en [23] provided a first systematic approach to the extreme value theory of dependent stationary sequences. In particular, Leadbetter introduced mixing and anti-clustering conditions, the conditions D and D

⁰

, which are tailored for the analysis of dependent extremal events. Moreover, [23] prop- agated the use of the extremal index as a measure for extremal clustering.

The idea of an extremal index originates from [25, 24, 27] who discovered that the maxima

M

_n

= max

t=1,...,n

X

_t

, n > 1 ,

of numerous examples of dependent stationary sequences (X

_t

) with common distribution F share the property that

P (M

_n

6 u

_n

) ≈

P (X 6 u

_n

)

n θX

= (F (u

_n

))

ⁿ

θX

, n → ∞ ,

for some number θ

_X

∈ [0, 1] provided (u

_n

) is a sequence of high thresholds converging sufficiently fast to the right endpoint x

F

of F . Leadbetter [22] made this notion precise as the expected size of an extremal cluster of exceedances above high-level thresholds. Since (F (u

n

))

ⁿ

is the distribution function of the maximum of n iid random variables with common distribution F at the threshold u

n

, the quantity θ

X

describes the shrinking effect that the appearance of dependent extremes may have on the distribution of M

_n

compared to (F (u

_n

))

ⁿ

.

Leadbetter defined the extremal index θ

X

as follows: assume that for every τ ∈ (0, ∞) there exists a sequence (u

_n

(τ )) such that

n F (u

n

(τ )) = n (1 − F(u

n

(τ ))) → τ, and there exists a number θ

_X

such that

P (M

_n

6 u

_n

(τ )) → e

^{−τ θ}^X

, n → ∞ .

If such a number θ

_X

exists it belongs to the interval [0, 1] and is independent of the choice of the sequences (u

n

).

An immediate application is to the convergence in distribution of the sequence (M

_n

). Assume that (X

_t

) belongs to the maximum domain of at- traction of an extreme value distribution H, i.e., for iid copies ( X e

t

) of X

1

, M f

n

= max( X e

1

, . . . , X e

n

), there exist constants c

n

> 0, d

n

∈ R such that c

⁻¹_n

( M f

n

− d

n

) →

^d

ξ as n → ∞ and ξ has distribution H. Then if (X

t

) has an extremal index θ

_X

we have

n F (c

_n

x + d

_n

| {z }

=:un(τ)

) → − log H(x)

| {z }

=:τ

, n → ∞ , x ∈ suppH . and

P c

⁻¹_n

(M

_n

− d

_n

) 6 x

→ H

^θ^X

(x) , n → ∞ , x ∈ suppH .

(4)

In the case of an iid sequence it is easily seen that n F (u

n

(τ )) → τ holds if and only if P (M

_n

6 u

_n

(τ )) → e

^−τ

. Hence θ

_X

= 1. The extremal index 1 is not exclusive to iid sequences. Indeed, in the book [23] various examples of strictly stationary sequences are considered for which θ

_X

= 1. For example, if (X

_t

) is a Gaussian stationary sequence whose autocovariance function satisfies cov(X

0

, X

_h

) = o(1/ log h) as h → ∞, then θ

X

= 1.

2. Sufficient conditions for the existence of the extremal index

The extremal index is often interpreted as the reciprocal of the expected size of an extremal cluster for a stationary sequence (X

t

). We will give some justification for this interpretation.

2.1. The method of block maxima. The key is the definition of an extremal cluster in the sample X

1

, . . . , X

n

: split the sample into k

n

= [n/r

n

] blocks of equal length r

_n

:

X

1

, . . . , X

rn

| {z }

Block 1

, X

rn+1

, . . . , X

2rn

| {z }

Block 2

, . . . , X

_(k_n−1)rn+1

, . . . , X

knrn

| {z }

Blockkn

,

we ignore the last block of length less than r

n

, and we simply call a block an extremal cluster relative to a high threshold u = u

_n

(this means that u

_n

↑ x

_F

as n → ∞) if there is at least one exceedance of this threshold in this block. For an asymptotic theory it will be important that r = r

n

→ ∞ such that r

_n

is small compared to n, i.e., k

_n

→ ∞.

In view of the stationarity of (X

t

) the expected cluster size of a block is given by

E h X

^rⁿ

t=1

11(X

t

> u

n

)

M

rn

> u

n

i

=

rn

X

t=1

P (X

t

> u

n

, M

rn

> u

n

) P (M

_r_n

> u

_n

)

=

rn

X

t=1

P(X

t

> u

n

) P (M

rn

> u

n

)

= r

_n

P (X > u

_n

) P(M

rn

> u

n

) =: 1

θ

n

.

Obviously, θ

_n

is a number in [0, 1]. Under mild regularity conditions the limit θ = lim

n→∞

θ

n

exists, assumes values in (0, 1] and coincides with Leadbet- ter’s extremal index θ

_X

; see Theorem 2.3 below. For this reason, the extremal index θ

X

is often referred to as the reciprocal of the expected extremal cluster size above high thresholds.

2.2. Approximation of θ

_X

by θ

_n

. The following result can be found in slightly different forms in [9], proof of Lemma 2.8, [34, 2].

Theorem 2.3. Consider the following conditions:

(1) (X

_t

) is a real-valued stationary sequence whose marginal distribution F

(5)

0 20 40 60 80 100

−20246

Time

Time series

Figure 2.1.

Visualization of the max-moving averageX_t= max(Z_t, Z_t+1, Z_t+2), t = 1, . . . ,100, (blue) for iid student noise Zt, t = 1, . . . ,102, with α= 4 degrees of freedom (red dots). The values ofX_t typically appear in clusters of size 3. The process (|Xt|) has extremal indexθ_|X|= 1/3.

−0.4−0.3−0.2−0.10.00.10.2

Time

Bit Coin USD log−returns

2015 2016 2017 2018 2019 2020 2021

Figure 2.2.

The daily log-return series of the Bit Coin USD stock prices from 17 September 2014 until 8 January 2021. We only show the returns below -0.04 or above 0.04 which we interpret as extreme values. These limits roughly correspond to the 10% and 90% quantiles of the data. The extremes typically appear in clusters.

does not have an atom at the right endpoint x

_F

.

(2) For a sequence u

n

↑ x

F

and an integer sequence r = r

n

→ ∞ such that k

_n

= [n/r

_n

] → ∞ the following anti-clustering condition is satisfied:

k→∞

lim lim sup

n→∞

P M

_k,r_n

> u

_n

| X

₀

> u

_n

= 0 . (2.1)

Here M

_a,b

= max

_i=a,...,b

X

_i

for a 6 b such that M

_b

= M

_a,b

with a = 1.

(3) A mixing condition holds:

P (M

_n

6 u

_n

) − P (M

_r_n

6 u

_n

)

kn

→ 0 , n → ∞ , (2.2)

where (u

_n

), (k

_n

) and (r

_n

) are as in (2).

(4) For all positive τ there exists a sequence (u

n

) = (u

n

(τ )) such that n F (u

n

) → τ and (2), (3) are satisfied for these sequences (u

n

).

Then the following statements hold:

(6)

1. If (1) and (2) are satisfied then

k→∞

lim lim sup

n→∞

θ

n

− P M

_k

6 u

n

| X

0

> u

n

= 0 , (2.3)

and lim inf

n→∞

θ

_n

> 0.

2. If (1) and (4) are satisfied and θ = lim

n→∞

θ

n

exists, then θ

X

∈ (0, 1]

exists and coincides with θ.

Condition (2.2) is satisfied for strongly mixing (X

t

) with mixing rate (α

_h

) if one can find integer sequences (`

n

) and (r

n

) such that `

n

/r

n

→ 0, r

_n

/n → 0 and k

_n

α

_`_n

→ 0 as n → ∞. Anti-clustering conditions are common in extreme value theory since Leadbetter introduced the D

⁰

condition which is much stronger than (2.1) but also easily verified on examples. The goal of such a condition is to avoid that the stationary sequence stays above a high threshold for too long.

Relation (2.3) is in agreement with O ’Brien’s [28] characterization of the extremal index of (X

t

) as the limit

θ

_X

= lim

n→∞

P (M

_`_n

6 u

_n

| X

₀

> u

_n

), (2.4)

for a sequence (`

_n

) with `

_n

/n → 0, thresholds u

_n

↑ x

_F

such that n F (u

_n

) → 1 as n → ∞, provided a mixing condition holds. O’Brien’s condition (2.4) has the advantage that it avoids the definition of an extremal cluster.

Remark 2.4. Relation (2.3) provides a constructive way of calculating θ

X

: if we know that the limits f (k) := lim

n→∞

P M

_k

6 u

_n

| X

₀

> u

_n

exist for every k > 1 then we can try to derive θ

X

= lim

k→∞

f(k). In Section 3 we will follow this approach in the case of a regularly varying sequence.

3. Regularly varying sequences

3.1. Definition and examples. As a matter of fact, clusters of extremes are more prominent in stationary sequences with heavy-tailed marginal distribution. To illustrate this fact, consider a stationary causal AR(1) process which solves the difference equation X

_t

= ϕ X

t−1

+Z

_t

, t ∈ Z , for an iid noise sequence (Z

_t

). Necessarily, ϕ ∈ (−1, 1) and, if (Z

_t

) is iid standard normal then (|X

_t

|) has extremal index θ

_|X|

= 1 (see [23]), while for iid student noise (Z

t

) with α degrees of freedom we have θ

_|X|

= 1 − |ϕ|

^α

; see Example 3.4 below. Thus, the smaller α (the heavier the tail) for given ϕ the closer θ

_|X_|

to zero.

An AR(1) process with student noise is an example of a regularly varying

time series. This class of heavy-tailed processes has been studied rather

extensively in the last 15 years; see [31] for some basics about multivari-

ate regular variation, and [21] for a recent textbook treatment. This class

was considered in full generality first by [9]: they required that the finite-

dimensional distributions of the process satisfy a multivariate regular varia-

tion condition; see [30, 31] for the definition of this notion. It is an extension

(7)

of power-law tail behavior from the univariate to the multivariate case defined via the vague convergence of tail measures with infinite limit measures which have the homogeneity property.

Here we will follow an alternative approach by [2] tailored for stationary sequences, avoiding the vague convergence concept. They proved that a real-valued stationary sequence (X

t

) is regularly varying with index α > 0 in the sense of [9] if and only if there exists a sequence (Θ

_t

) and a Pareto(α) distributed random variable Y , i.e., P(Y > y) = y

^−α

, y > 1, such that (Θ

t

) and Y are independent and, for all h > 0,

P x

⁻¹

(X

t

)

|t|6h

∈ ·

|X

₀

| > x

w

→ P Y (Θ

t

)

|t|6h

∈ ·

, x → ∞ .

In the latter relation x can be replaced by any sequence (a

n

) such that n P (|X| > a

n

) → 1 as n → ∞. Moreover, by definition, |Θ

₀

| = 1 a.s. The sequence (Θ

_t

) is the spectral tail process of the regularly varying process (X

t

); it describes the propagation of a value |X

₀

| > x for large x through the stationary sequence (X

_t

) into its past and future.

Example 3.1. We consider a stationary AR(1) process given as the causal solution to the difference equation X

t

= ϕ X

t−1

+ Z

t

, t ∈ Z, where (Z

t

) is iid regularly varying with index α (e.g. Pareto(α) or student(α)). This means that a generic element Z satisfies lim

x→∞

P(±Z > x)/P(|Z | > x) = p

±

for non-negative values p

±

such that p

+

+ p

−

= 1, and P (|Z| > x) = L(x)x

^−α

, x > 0, for some slowly varying function L. Then a generic element X inherits the regularly varying tail behavior from Z (see [10]):

P (±X > x) P(|Z | > x) ∼

∞

X

j=0

p

±

(ϕ

^j

)

^α_±

+ p

∓

(ϕ

^j

)

^α_∓

= P (Θ

₀

= ±1)(1 − |ϕ|

^α

) . But even more is true: (X

t

) is a regularly varying time series with spectral tail process

Θ

_t

= Θ

_Z

sign(ϕ

^J^+t

) |ϕ|

^t

11(J + t > 0) = Θ

₀

ϕ

^t

11(J + t > 0) , t ∈ Z , (3.1)

where P (Θ

_Z

= ±1) = p

±

, Θ

_Z

is independent of J which has distribution P (J = j) = (1 − |ϕ|

^α

) |ϕ|

^{j α}

, j > 0 .

In particular, the forward spectral tail process is given by Θ

_t

= Θ

₀

ϕ

^t

, t > 0.

Example 3.2. We consider the unique causal solution to the affine stochastic recurrence equation X

t

= A

t

X

t−1

+ B

t

, t ∈ Z, for an iid sequence ((A

_t

, B

_t

))

t∈Z

with generic element (A, B) ∈ R

²+

. We assume that the distribution of (A, B) satisfies the conditions of the Kesten-Goldie theory; see [20, 13], cf. [6] for a textbook treatment. The most important condition in this context is the existence of a unique solution α > 0 to the equation E [A

^α

] = 1 which yields the tail index α. Under these conditions for a generic element X, there exists a positive constant c

₊

such that

P (X > x) ∼ c

₊

x

^−α

, x → ∞ .

(8)

The forward spectral process is then given by

(Θ

t

)

_t>0

= (Π

t

)

_t>0

, where Π

t

= A

1

· · · A

t

,

while the backward spectral tail process (Θ

_t

)

_t6−1

has a rather complicated structure.

Writing S

t

= log Π

t

= P

t

i=1

log A

i

, t > 1, we observe that (S

t

) constitutes a random walk with a negative drift. Indeed, by Jensen’s inequality we have E [log(A

^α

)] < log( E [A

^α

]) = 0.

3.2. The extremal index. Following Remark 2.4, we will derive the extremal index θ

_X

of a stationary non-negative regularly varying sequence (X

_t

) in terms of its spectral tail process. First, we observe that by virtue of the continuous mapping theorem, as n → ∞ for k > 1,

P a

⁻¹_n

M

_k

6 1

X

₀

> a

_n

→ P Y max

16t6k

Θ

_t

6 1

= P max

16t6k

Θ

^α_t

6 Y

^−α

= E

1 − max

16t6k

Θ

^α_t

+

= E

06t6k

max Θ

^α_t

− max

16t6k

Θ

^α_t

.

Here we used the fact that Y

^−α

is U (0, 1) uniformly distributed and Θ

₀

= 1 a.s. Using dominated convergence as k → ∞, we proved under the anti- clustering condition (2.1) that

n→∞

lim θ

_n

= lim

k→∞

lim

n→∞

P a

⁻¹_n

M

_k

6 1

X

₀

> a

_n

= lim

k→∞

E

06t6k

max Θ

^α_t

− max

16t6k

Θ

^α_t

= E

1 − max

t>1

Θ

^α_t

+

.

From Theorem 2.3 we obtain the following result in [2].

Corollary 3.3. Consider a non-negative stationary regularly varying process (X

t

) with index α > 0. Then the following statements hold:

1. If the anti-clustering condition (2.1) holds for (u

n

) = (x a

n

) and some x > 0 then the limit θ = lim

n→∞

θ

_n

exists, is positive and has the representations

θ = P Y sup

t>1

Θ

t

6 1

= E

1 − sup

t>1

Θ

^α_t

+

= E sup

t>0

Θ

^α_t

− sup

t>1

Θ

^α_t

. (3.2)

2. If (2.1) and the mixing condition (2.2) hold for (u

n

) = (x a

n

) and all x > 0 then the extremal index θ

_X

exists and coincides with θ.

The representations of θ given in (3.2) only depend on the forward spectral

process (Θ

_t

)

_t>0

. In Proposition 3.10 below we provide representations of the

extremal index θ

|X|

depending on the whole spectral tail process (Θ

t

)

t∈Z

.

Example 3.4. We consider the regularly varying AR(1) process from Exam-

ple 3.1. It can be shown to satisfy the anti-clustering and mixing conditions

(9)

of Theorem 2.3. We conclude from Corollary 3.3 and the form of the spectral tail process given in (3.1) that

θ

|X|

= E

1 − max

t>1

Θ

^α_t

+

= 1 − max

t>1

|ϕ|

^{α t}

= 1 − |ϕ|

^α

.

This formula was already achieved in [10] in the wider context of linear processes.

Example 3.5. We consider the regularly varying solution of an affine stochastic recurrence equation under the conditions and with the notation of Example 3.2. It can be shown to satisfy the anti-clustering and mixing conditions of Theorem 2.3; see [6]. We conclude from this result that (X

_t

) has extremal index

θ

_X

= E

1 − max

t>1

Π

_t

+

= E

1 − exp max

t>1

S

_t

+

, where S

_t

= P

t

i=1

log A

_i

, t > 1, is a random walk with a negative drift. This value of θ

X

was derived in [16]. In that paper a Monte Carlo simulation procedure for the evaluation of θ

_X

was proposed. Direct calculation of θ

_X

is difficult; see Example 3.12 for an exception.

3.3. The extremal index and point process convergence toward a cluster Poisson process.

3.3.1. A useful auxiliary result.

Lemma 3.6. Consider a non-negative stationary regularly varying sequence (X

_t

) with index α > 0 and assume that (2.1) holds for (u

_n

) = (x a

_n

) and all x > 0. Then

kΘk

^α_α

:= X

j∈Z

Θ

^α_j

< ∞ a.s.

In particular, Θ

t

→ 0 a.s. as |t| → ∞, and the time T

^∗

of the largest record of (Θ

_t

) is finite, i.e., |T

^∗

| is the smallest integer such that

Θ

_T^∗

= max

t∈Z

Θ

_t

.

Proof. Write (Y

t

) = Y (Θ

t

) where the Pareto(α) variable Y and the spectral tail process (Θ

_t

) are independent. We start by showing

Y

ta.s.

→ 0 , t → ∞ . (3.3)

Since (X

_t

) is regularly varying we have for all x > 0 and integers k > 1,

h→∞

lim lim

n→∞

P M

k,k+h

> x a

n

| X

0

> a

n

= lim

h→∞

P max

k6t6k+h

Y

t

> x

= P max

t>k

Y

_t

> x

.

(10)

On the other hand, using the anti-clustering condition (2.1) for all x ∈ (0, 1], we have for fixed k, h > 1,

n→∞

lim P M

_k,k+h

> x a

_n

| X

₀

> a

_n

6 lim sup

n→∞

P M

_k,r_n

> x a

_n

| X

₀

> x a

_n

P (X > x a

_n

) P (X > a

_n

)

= x

^−α

lim sup

n→∞

P M

_k,r_n

> x a

n

| X

0

> x a

n

= x

^−α

ε

_k

,

and the right-hand side term ε

k

vanishes for large k. Hence, letting h → ∞, we obtain for all x > 0,

P max

t>k

Y

_t

> x

6 x

^−α

ε

_k

, and therefore

k→∞

lim P max

t>k

Y

t

> x 6 lim

k→∞

x

^−α

ε

_k

= 0 ,

implying max

_t>k

Y

_t

→

^P

0 as k → ∞. Since (Y

_t

) = Y (Θ

_t

) a.s. and Y > 0 is independent of (Θ

_t

) this is only possible if max

_t>k

Θ

_t

→

^P

0 as k → ∞ but the latter relation is equivalent to Θ

_t^a.s.

→ 0 as t → ∞, implying (3.3).

Next we show that

Y

−ta.s.

→ 0 , t → ∞ . Since Y

t

a.s.

→ 0 as t → ∞ and Y

0

> 1 a.s. the following relation holds P

[

i>0

Y

_i

> 1 > max

t>i

Y

_t

= X

i>0

P Y

_i

> 1 > max

t>i

Y

_t

= 1 . Suppose that P ( P

j60

11(Y

_j

> ε) = ∞) > 0 for some ε > 0. Then there exists some i > 0 such that

P X

j60

11(Y

_j

> ε) = ∞, Y

_i

> 1 > max

t>i

Y

_t

> 0 . We recall the time-change formula from [2]:

P((Θ

−h

, . . . , Θ

h

) ∈ · | Θ

−t

6= 0) = E Θ

^α_t

E

Θ

^α_t

11 (Θ

t−h

, . . . , Θ

t+h

) Θ

t

∈ · . (3.4)

In particular, P (Θ

_t

6= 0) = E [Θ

^α_t

] = 1 if and only if for all h > 0, P((Θ

−h

, . . . , Θ

h

) ∈ ·) = E Θ

^α_t

E

Θ

^α_t

11 (Θ

t−h

, . . . , Θ

t+h

) Θ

t

∈ ·

.

(11)

Therefore

∞ = E X

j60

11(Y

_j

> ε) 11 Y

_i

> 1 > max

t>i

Y

_t

= X

j60

P Y

_j

> ε, Y

_i

> 1 > max

t>i

Y

_t

= X

j60

Z

∞ 1

E

11 y Θ

j

> ε , y Θ

i

> 1 > y max

t>i

Θ

t

d − y

^−α

= X

j60

Z

∞ 1

E

Θ

^α_−j

11 y > ε Θ

−j

, y Θ

i−j

Θ

−j

> 1 > y max

t>i−j

Θ

_t

Θ

−j

d − y

^−α

6 ε

^−α

X

j60

E Z

∞

1

11 z > 1 , z Θ

i−j

> ε

⁻¹

> z max

t>i−j

Θ

_t

d − z

^−α

= ε

^−α

X

j60

P Y

i−j

> ε

⁻¹

> max

t>i−j

Y

_t

= ε

^−α

X

k>i

P Y

_k

> ε

⁻¹

> max

t>k

Y

_t

6 ε

^−α

.

In the last step we used the fact that the events {Y

_k

> ε

⁻¹

> max

_t>k

Y

_t

}, k > i, are disjoint. Thus we got a contradiction. This proves that for all ε > 0 there exist only finitely many j 6 0 such that Y

_j

> ε, hence Y

_t ^a.s.

→ 0 and also Θ

ta.s.

→ 0 as t → −∞, as desired.

In particular, the time T

^∗

of the largest record of the sequence (Θ

_t

) is finite a.s.

Now suppose that P ( P

j∈Z

Θ

^α_j

= ∞) > 0. Then there exists an i ∈ Z such that

P X

j∈Z

Θ

^α_j

= ∞ , T

^∗

= i

> 0 ,

and an application of the time-change formula (3.4) yields

∞ = E h X

j∈Z

Θ

^α_j

11(T

^∗

= i) i

= X

j∈Z

E

Θ

^α_j

11(T

^∗

= i)

= X

j∈Z

P (T

^∗

= i − j) = 1 , leading to a contradiction. Thus P

j∈Z

Θ

^α_j

< ∞ a.s. This proves the lemma.

3.3.2. Point process convergence toward cluster Poisson processes. The following point process result was proved in [9] and re-proved in [2] by using the terminology of the spectral tail process.

We adapt the mixing condition in [9] tailored for point process conver-

gence. It is expressed in terms of the Laplace functionals of point processes.

(12)

Recall that a point process N with state space E = R

0

= R \{0} has Laplace functional

Ψ

_N

(g) = E

exp

− Z

E

g dN

for g ∈ C

⁺_K

,

where the set C

⁺_K

consists of the continuous functions on E with compact support. Since 0 is excluded from E this means that g ∈ C

⁺_K

vanishes in some neighborhood of the origin. Moreover, we have the weak convergence of point processes N

_n

→

^d

N on E if and only if Ψ

_N_n

→ Ψ

_N

pointwise; see [30, 31].

Mixing condition A(a

_n

) Consider integer sequences r

_n

→ ∞ and k

_n

= [n/r

n

] → ∞ and the point processes with state space E = R

0

,

N

_n

=

n

X

i=1

ε

_a⁻¹

n Xi

and N e

_r_n

=

rn

X

i=1

ε

_a⁻¹

n Xi

, n > 1 ,

where ε

_x

denotes Dirac measure at x. The stationary regularly varying sequence (X

t

) satisfies A(a

_n

) if there exist (r

n

) and (k

n

) such that

Ψ

Nn

(g) − Ψ

_N_e

rn

(g)

kn

→ 0 , n → ∞ , g ∈ C

⁺_K

. (3.5)

Remark 3.7. This condition is satisfied for a strongly mixing sequence (X

_t

) with mixing rate (α

_h

) if one can find integer sequences (`

_n

) and (r

_n

) such that `

n

/r

n

→ 0, r

n

/n → 0 and k

n

α

_`_n

→ 0. This is a very mild condition indeed. Relation (3.5) ensures that, if N

n d

→ N on the state space E, then also P

kn

i=1

N e

r⁽ⁱ⁾n

→

d

N where ( N e

r⁽ⁱ⁾n

)

i=1,...,kn

are iid copies of N e

rn

. This fact ensures that the limit processes considered are infinitely divisible; cf. [19].

Theorem 3.8. Consider a stationary regularly varying sequence (X

_t

) with index α > 0. We assume the following conditions:

(1) The mixing condition A(a

_n

) for integer sequences r

_n

→ ∞ such that k

n

= [n/r

n

] → ∞ as n → ∞.

(2) The anti-clustering condition (2.2) for the same sequence (r

_n

) . Then we have the point process convergence on the state space R

0

N

_n

=

n

X

i=1

ε

_a⁻¹

n Xi

→

d

N =

∞

X

i=1

∞

X

j=−∞

ε

Γ^−1/α_i Qij

, (3.6)

where

• P

∞

j=−∞

ε

_Q_ij

, i = 1, 2, . . ., is an iid sequence of point processes with state space R . A generic element Q = (Q

j

) of the sequence Q

⁽ⁱ⁾

= (Q

ij

)

j∈Z

, i = 1, 2, . . ., has the distribution of the spectral cluster process

Q = Θ

_t

kΘk

_α

t∈Z

.

• (Γ

i

) are the points of a unit rate homogeneous Poisson process on (0, ∞).

• (Γ

_i

) and (Q

⁽ⁱ⁾

)

_i=1,2,...

are independent.

(13)

Remark 3.9. In view of Lemma 3.6 we know that kΘk

_α

< ∞ a.s. Hence the spectral cluster process Q is well defined.

Since the Poisson points (Γ

^−1/α_i

) and the sequence of iid point processes P

j∈Z

ε

Qij

are independent it is not difficult to calculate the Laplace functional of the limit process N :

Ψ

_N

(g) = exp

− Z

∞

0

E

1 − e

⁻^P^j∈^Z^{g(y Q}^j⁾

d(−y

^−α

)

, g ∈ C

⁺_K

.

Now we apply the change of variables z = y |Q

_T^∗

| in Ψ

N

(g) where

|Q

_T^∗

| = |Θ

_T^∗

|

kΘk

_α

= max

t∈Z

|Θ

_t

| P

j∈Z

|Θ

_j

|

^α

1/α

. Then we obtain for g ∈ C

⁺_K

,

Ψ

N

(g) = exp

− E[|Q

_T^∗

|

^α

]

× Z

∞

0

E

h |Q

_T^∗

|

^α

E [|Q

_T^∗

|

^α

] 1 − e

⁻

P

j∈Zg(z Qj/|Q_T∗|)

i

d(−z

^−α

)

. According to Proposition 3.10 below, θ

_|X|

= E [|Q

_T^∗

|

^α

]. Now, changing the measure with the density |Q

_T^∗

|

^α

/E[|Q

_T^∗

|

^α

] and writing Q e = ( Q e

j

)

j∈Z

for the sequence Q/|Q

_T^∗

| under the new measure, we arrive at

Ψ

_N

(g) = exp

− Z

∞

0

E

1 − e

⁻^P^j∈^Z^g(z^Q^e^j^|)

d − (z/θ

_|X|^1/α

−α

. However, this alternative expression of the Laplace functional Ψ

_N

corre- sponds to another representation of the point process N :

N =

∞

X

i=1

∞

X

j=−∞

ε

_(Γ

i/θ_|X|)^−1/αQeij

, (3.7)

where the Poisson points (Γ

^−1/α_i

) are independent of the sequence P

j∈Z

ε

Qeij

of iid copies of P

j∈Z

ε

Qej

.

We observe that | Q e

_j

| 6 1 a.s. and | Q e

_T^∗

| = 1 a.s. The extremal index θ

_|X_|

plays an important role in representation (3.7). Each Poisson point (Γ

i

/θ

|X|

)

^−1/α

stands for the radius of a circle around the origin, and the points ( Q e

_ij

)

j∈Z

are inside or on this circle. In this sense, each Poisson point (Γ

i

/θ

|X|

)

^−1/α

creates an extremal cluster. Therefore we refer to the process N as a cluster Poisson process.

3.3.3. Equivalent expressions for the extremal index. Based on the results

in the previous subsection we can derive equivalent expressions of θ

_|X|

in

terms of Q

_T^∗

and T

^∗

.

(14)

Proposition 3.10. Assume the conditions of Theorem 3.8. Then the extremal index θ

_|X|

of (|X

_t

|) coincides with the following quantities:

E[|Q

_T^∗

|

^α

] = P(Y |Q

_T^∗

| > 1) = P(T

^∗

= 0) . (3.8)

Here Y is a Pareto(α) random variable independent of Q

T^∗

and T

^∗

is the time of the largest record of (|Θ

_t

|).

Remark 3.11. We observe that E [|Q

_T^∗

|

^α

] = E

h max

t∈Z

|Θ

_t

|

^α

P

j∈Z

|Θ

_j

|

^α

i

= θ

_|X_|

.

Since θ

_|X|

= P (T

^∗

= 0) the extremal index θ

_|X|

has the intuitive interpretation as the probability that (|Θ

_t

|) assumes its largest value at time zero.

Example 3.12. We consider the regularly varying solution of an affine stochastic recurrence equation under the conditions and with the notation of Example 3.2. An exception where the extremal index has an explicit solution is the case log A

_t

= N

_t

− 0.5 for an iid standard normal sequence (N

t

). Then E [A

t

] = 1 and the theory mentioned in Example 3.2 yields regular variation of (X

_t

) with index 1. Using the expression P (T

^∗

= 0) and applying some random walk theory (such as the results in [8]), one obtains an exact expression for θ

_X

in terms of the Riemann zeta function ζ; see Example 3.13. A first order approximation to this formula is given by

θ

_X

≈ 1

2 exp ζ(0.5)

√ 2π ≈ 1

2 exp(−0.5826) ≈ 0.2792.

(3.9)

Example 3.13. Let B

⁽ⁱ⁾

= (B

t

)

t∈R

be iid standard Brownian motions independent of Γ

₁

< Γ

₂

< · · · which are the points of a unit-rate Poisson process on (0, ∞). We consider the stationary max-stable Brown-Resnick [4] process

X

t

= sup

i≥1

Γ

⁻¹_i

e

√

2B_t⁽ⁱ⁾−|t|

, t ∈ R .

It has unit Fr´ echet marginals P (X

_t

6 x) = Φ

₁

(x) = e

^−x⁻¹

, x > 0. Any discretization X

^(δ)

= (X

δ t

)

t∈Z

for δ > 0 is regularly varying with index 1 and spectral tail process Θ

^(δ)_t

= e

√

2Bδ t−δ|t|

, t ∈ Z. Direct calculation of

−x log P (n

⁻¹

max

_16t6n

X

_{δ t}

6 x), x > 0, yields the extremal index of X

^(δ)

as the limit

(3.10) θ

^(δ)_X

= lim

n→∞

n

⁻¹

E h

sup

06t6n

e

√

2Bδ t−δt

i .

We use the expression θ

_X^(δ)

= P (T

^∗(δ)

= 0) where T

^∗(δ)

is the first record time of (Θ

^(δ)_t

)

t∈Z

; see (3.8). We consider the first ladder height epoch τ

₊

(δ) = inf{t > 1 : √

2 B

_{δ t}

+ δt < 0}. Using the symmetry of the Gaussian distribution, (Θ

^(δ)_t

)

_t>1

= (1/Θ

^d ^(δ)_−t

)

_t>1

, we obtain θ

_X^(δ)

= P(T

^∗(δ)

= 0) = P(τ

+

(δ) =

∞)

²

. Combining this with the classical identity P (τ

₊

(δ) = ∞) = 1/ E [τ

−

(δ)]

(15)

for τ

−

(δ) = inf{t > 1 : √

2 B

δ t

− δt 6 0}, from random walk theory (see [1]) we get

θ

^(δ)_X

= 1 E [τ

−

(δ)]

2

= E [B

_δ

− δ]

E [ √

2B

_τ₋_(δ)

− τ

−

(δ)]

2

= δ

²

( E [ √

2 B

_τ₊_(δ)

+ τ

₊

(δ)])

⁻²

, where we used Wald’s lemma and the symmetry of the Gaussian distribution.

To be able to apply Theorem 1.1 in [8] we standardize the increments of the random walk √

2B

_{δ t}

dividing them by √

2δ, turning the drift into p δ/2, and we get

E [ √

2 B

_τ₊_(δ)

+ τ

₊

(δ)] = √ δ exp

−

√ δ 2 √

π

∞

X

n=0

ζ (1/2 − n) n!(2n + 1)

− δ 4

n

.

This implies that

θ

_X^(δ)

= δ exp

r δ π

∞

X

n=0

ζ (1/2 − n) n!(2n + 1)

− δ 4

n

.

We recover the Pickands constant of the Brown-Resnick process (see [29]) as the limit lim

δ↓0

δ

⁻¹

θ

^(δ)_X

:

H

⁽⁰⁾_X

= lim

T→∞

1 T E

h sup

06t6T

e

√2Bt−t

i

= 1.

Proof of Proposition 3.10. Consider the supremum of all points of the limit process N in Theorem 3.8:

M = sup

i>1

Γ

^−1/α_i

sup

j∈Z

|Q

_ij

| .

The sequences (Γ

_i

) and (Q

⁽ⁱ⁾

) are independent and M = sup

_i>1

Γ

^−1/α_i

V

_i

for the iid sequence V

i

:= sup

_j∈_Z

|Q

_ij

|, i = 1, 2, . . . , whose generic element V has the property E[V

^α

] < ∞. Indeed, V 6 1 a.s. by construction. The points (Γ

^−1/α_i

, V

i

) constitute a marked Poisson process N

Γ,V

with state space E = (0, ∞) × [0, ∞) and mean measure given by µ((x, ∞) × [0, y]) = x

^−α

F

_V

(y), x > 0, y > 0, where F

V

is the distribution function of V . For x > 0 we consider B

_x

= {(y, v) ∈ E : y v > x}. We observe that

µ(B

x

) = Z

∞

v=0

Z

∞ y=x/v

α y

^−α−1

F

V

(dv) = Z

∞

0

(x/v)

^−α

F

V

(dv) = x

^−α

E[V

^α

] . Therefore we have for x > 0,

P(M 6 x) = P Γ

^−1/α_i

V

i

6 x , i > 1

= P (N

_Γ,V

(B

_x

) = 0)

= e

^−µ(B^x⁾

= e

^−x^−α^E^[V^α^]

.

Thus M is a scaled version of the standard Fr´ echet distribution, Φ

_α

(x) = e

^−x^−α

, x > 0:

P (M 6 x) = Φ

^E[V_α ^α^]

(x) , x > 0 .

(16)

On the other hand, Theorem 3.8 and an application of the continuous mapping theorem yield as n → ∞,

P a

⁻¹_n

M

_n

6 x

= P N

_n

(x, ∞) = 0

→ P N (x, ∞) = 0

= P (M 6 x) , x > 0 .

In view of the definition of the extremal index of the sequence (|X

_t

|) we can identify

E [V

^α

] = E h

sup

j∈Z

|Q

_j

|

^α

i

= E [|Q

_T^∗

|

^α

].

as the value θ

|X|

. This proves the first part of (3.8). The identity E [|Q

_T^∗

|

^α

] = P (Y |Q

_T^∗

| > 1) = P (|Q

_T^∗

|

^α

> Y

^−α

).

is immediate since Q and Y are independent, and Y

^−α

is U (0, 1) distributed.

Applying the time-change formula (3.4), shifting k to zero, we obtain θ

_|X_|

= E [|Q

_T^∗

|

^α

]

= X

k∈Z

E

h |Θ

_k

|

^α

P

j∈Z

|Θ

_j

|

^α

11(T

^∗

= k) i

= X

k∈Z

E

h |Θ

−k

|

^α

P

j∈Z

|Θ

_j−k

|

^α

11(T

^∗

= 0) i

= P(T

^∗

= 0) .

This proves the last identity in (3.8).

4. Estimation of the extremal index - a short review and a new estimator based on the spectral cluster process First approaches to the estimation of the extremal index are due to [17, 38]. Estimators based on exceedences of a threshold were proposed in [12, 35, 36, 33]. A modern approach to the maxima method was started in [26];

improvements and asymptotic limit theory can be found in [3, 5].

We will consider some standard estimators of θ

X

. For the sake of argu- ment we assume that (X

_t

) is a non-negative stationary process with marginal distribution F, k

n

= n/r

n

is an integer sequence such that r

n

→ ∞, k

_n

→ ∞, and (u

_n

) is a threshold sequence satisfying u

_n

↑ x

_F

.

4.1. Blocks estimator. Recall that θ

_X

has interpretation as the reciprocal

of the expected size of extremal clusters. This idea is the basis for inference

procedures from the early 1990s (see [37, 11]). Clusters are identified as

blocks of length r = r

n

with at least one exceedance of a high threshold

u = u

_n

. A blocks estimator θ b

^bl

is given by the ratio of the number K

_n

(u)

(17)

of such clusters and the total number of exceedences N

n

(u):

θ b

_u^bl

(r) = K

n

(u) N

_n

(u) :=

P

kn

t=1

11(M

(t−1)r+1,t r

> u) P

_n

t=1

11(X

_t

> u) . (4.11)

This method requires the choice of block length r and threshold level u satisfying r

_n

F (u

_n

) → 0; if r

_n

→ ∞ does not hold at the prescribed rate θ b

^bl

is biased. Estimators using clusters of extreme exceedences were also considered in [17].

A slight modification of the blocks estimator is the disjoint blocks estimator of [38]:

θ b

^dbl

= log(1 − K

_n

(u)/k

_n

) r log(1 − N

_n

(u)/n) .

Assuming some weak dependence condition on (X

_t

), the heuristic idea be- hind the estimator is the approximations

P (M

r

6 u

n

)

kn

≈ P (M

n

6 u

n

) ≈ F

^θ^Xⁿ

(u

n

),

for a suitable sequence (u

n

). Then, taking logarithms and replacing F(u

n

) and P (M

_n

> u

_n

) by their empirical estimators N

_n

(u)/n and K

_n

(u)/k

_n

, respectively, we obtain

θ

X

≈ log P(M

n

6 u

n

)

n log F (u

n

) = log(1 − P(M

n

> u

n

)) n log(1 − F (u

_n

))

≈ log(1 − K

n

(u)/k

n

)

r

_n

log(1 − N

_n

(u)/n) = b θ

^dbl

.

Assuming that both K

_n

(u)/k

_n

and N

_n

(u)/n converge to zero, a Taylor ex- pansion of log(1 + x) = x(1 + o(1)) as x → 0 shows that θ b

^bl

≈ θ b

^dbl

. [38]

showed that θ b

^dbl

has a smaller asymptotic variance than θ b

^bl

. [33] proposed a sliding blocks version of θ b

^dbl

with an even smaller asymptotic variance.

θ b

^slbl

(u, r) =

− log

1

n−r+1

P

n−r+1

t=1

11(M

t,t+r

6 u)

N

_n

(u)/k

_n

.

(4.12)

4.2. Runs and intervals estimator. [38] proposed the alternative runs estimator. It is based on the limit relation (2.4): the probability P(M

`n

6 u

_n

| X

₀

> u

_n

) is replaced by a sample version for some sequence l = l

_n

→

∞:

θ b

^runs_u

(l) = 1 N

n

(u)

n−l

X

i=1

11(X

_i

> u

_n

, M

_i+1,i+l

6 u

_n

) . (4.13)

Clusters are considered distinct if they are separated by at least l obser-

vations not exceeding u. In [12] a complete study of the runs estimator

and the inter-exceedence times is given. The thresholds (u

_n

) need to satisfy

r

_n

F (u

_n

) → 1, and l

_n

6 r

_n

.