• Aucun résultat trouvé

E. Di Bernardino

N/A
N/A
Protected

Academic year: 2022

Partager "E. Di Bernardino"

Copied!
11
0
0

Texte intégral

(1)

Received: 29 July 2015, Revised: 8 January 2016, Accepted: 11 January 2016, Published online in Wiley Online Library: 8 February 2016

(wileyonlinelibrary.com) DOI: 10.1002/env.2385

Estimation of extreme quantiles conditioning on multivariate critical layers

E. Di Bernardino

a

and F. Palacios-Rodríguez

b*

LetTi WD ŒXijX 2 @L.˛/, fori D 1; : : : ; d, where X D .X1; : : : ; Xd/is a risk vector and@L.˛/is the associated multivariate critical layer at level˛ 2 .0; 1/. The aim of this work is to propose a non-parametric extreme estimation procedure for the.1pn/-quantile ofTi for a fixed˛and whenpn!0, as the sample sizen! C1. An extrapolation method is developed under the Archimedean copula assumption for the dependence structure of X and the von Mises condition for marginalXi. The main result is the central limit theorem for our estimator forp D pn ! 0, whenn tends towards infinity. A set of simulations illustrates the finite-sample performance of the proposed estimator. We finally illustrate how the proposed estimation procedure can help in the evaluation of extreme multivariate hydrological risks.

Copyright © 2016 John Wiley & Sons, Ltd.

Keywords: multivariate risk measures; return levels; critical layers; extreme quantile

1. INTRODUCTION

1.1. Multivariate extreme events and multivariate critical layers

In hydrology, phenomena are usually characterized by extreme events, and quantification of the risk of the occurrence of a specific extreme event is gaining attention in environmental sciences (e.g. Hochrainer-Stigler and Pflug, 2012). The classic univariate measure in environmen- tal sciences is that of thereturn level. This quantity represents the magnitude of an event that occurs at a given time and at a given site. More precisely, the return level is the quantilexp, which expresses the magnitude of the event that is exceeded with a probability equal top, with pD1=T (T is called thereturn period). Environmental risks frequently involve several variables that are often correlated. For instance, a flood can be described by the volume, the peak and the duration (Chebana and Ouarda, 2011a; , 2011b). It is therefore crucial to identify extreme risks in a multivariate setting.

The notion of multivariate return level is not univalent (Vandenbergheet al., 2012), and several definitions can be found in the recent literature. Since different combinations of probabilities may produce the same return period, a multivariate return level is inherently ambigu- ous. Events that have equal probability of exceedance define iso-hyper-surfaces, otherwise known ascritical layers. Salvadoriet al.(2011) provide the following definition.

Definition1.1 LetXD.X1; : : : ; Xd/be a random risk vector with joint distribution functionF. For˛2 .0; 1/andd >2, the critical layer@L.˛/at level˛is defined as@L.˛/D fx2Rd WF .x/D˛g:

Definition 1.1 provides a partition composed of three probability regions:@L<.˛/D fx 2 Rd WF .x/ < ˛g(the sub-critical region);

@L.˛/(the critical region where all the events have a constantF); and@L>.˛/D fx2RdWF .x/ > ˛g(the super-critical region).

In practice, at any occurrence of the phenomenon, only these three mutually exclusive events may occur (Belzunceet al., 2007). The multivariate return level can be defined with respect to one of the aforementioned three areas. For instance, in hydrology, the sub-critical region may be of interest if droughts are to be investigated, while the study of floods may require the use of super-critical regions (Salvadori et al., 2012). Chebana and Ouarda (2011b) propose a parametric estimator for the critical layer and apply it to a bivariate flood real data set.

An estimation for the bivariate critical layers@L.˛n/by assuming˛n!1, asn! 1, is presented in de Haan and Huang (1995). In the last decade, Embrechts and Puccetti (2006), Nappo and Spizzichino (2009), Prékopa (2012), and Cousin and Di Bernardino (2013) have used the notion of critical layers (i.e. multivariate quantile curves) to generalize the notion of univariate return level in the multivariate setting.

* Correspondence to: Fátima Palacios-Rodríguez, Departamento de Estadística e Investigación Operativa, Facultad de Matemáticas, Universidad de Sevilla, Calle Tarfia sin número, Sevilla 41012, Spain. E-mail: fpalacios2@us.es.

a Département IMATH, Laboratoire Cédric EA4629, CNAM, Paris, France

b Departamento de Estadística e Investigación Operativa, Facultad de Matemáticas, Universidad de Sevilla, Sevilla, Spain

158

(2)

1.2. Conditional distribution on critical layers and considered multivariate extreme return levels We now introduce the conditional random variableXion the critical layer@L.˛/foriD1; : : : ; d, that is

TiWDŒXijX2@L.˛/;for˛2.0; 1/; (1)

where@L.˛/is as in Definition 1.1, and the associated conditional distribution function is given byFi.xj˛/DPŒXi 6xjX2@L.˛/:

One can interpret the random variable Ti in Equation (1) as the contribution (or the responsibility) of the marginal risk Xi in the case where the whole risk vectorXbelongs to the multivariate stress scenario represented by the critical layer@L.˛/, for some suitable level

˛ 2 .0; 1/. It should be borne in mind that Equation (1) can also be written by using the multivariate probability integral transformation (MPIT)Z WDF .X1; : : : ; Xd/. Indeed, under regularity conditions onF, one can write Ti WDŒXijZD ˛; for˛2.0; 1/(e.g. Cousin and Di Bernardino, 2013; Nelsen, 2006; Genest and Rivest, 2001).

Using the aforementioned notation, in this paper, we consider the multivariate return level, based on the critical layers, that was recently proposed by Di Bernardinoet al.(2015). This risk measure is defined by the.1p/-quantile of the random variableTiin (1), that is xipWDUTi

1 p

; forp2.0; 1/; (2)

whereUTi.t /WDFi .11t j˛/fort > 1, andFi . j˛/denotes the left-continuous inverse ofFi. j˛/.

1.3. Proposal of the present work

The goal of this paper is to estimate the multivariate extreme return levelsxpin defined by Equation (2). To this end, two problematic points can be identified:

(1) Di Bernardinoet al.(2015) analyse this measure and introduce a semi-parametric estimation procedure. However, the aforementioned semi-parametric estimation and empirical quantile estimators perform well only if the threshold is not too high. These methods cannot handle extreme events, that is,pn1=n, which are specifically required for hydrological and environmental risk measures.

(2) As pointed out before, the considered conditional random variableTirelies on the latent MPITZ, which is not observed. Therefore, in order to apply a quantile estimation procedure,Zhas to be previously estimated. This type of plug-in procedure increases the variance of the final estimation and introduces statistical difficulties (e.g. (Di Bernardinoet al., 2013)).

In order to overcome the drawback outlined in item (1), in the present work, we provide an estimator ofxpinin Equation (2), for a fixed˛ and whenpn!0, asn! C1, by using Extreme Value Theory (EVT). In order to model the dependence structure of the multivariate risk vectorX, we consider Archimedean copulas. Recall that Archimedean copulas can be written as

C.u1; : : : ; ud/D1..u1/C: : :C.ud//; for all .u1; : : : ; ud/2Œ0; 1d, where the functionis called the generator of the copula, andis a continuous, strictly decreasing function fromŒ0; 1toŒ0;1such that.1/D0and.0/D 1(e.g. Definition 2 in (McNeil and Nešlehová, 2009)). Furthermore, the generator of an Archimedean copula also satisfies several additionald-monotony properties ( for further details, see Theorem 2.2 in (McNeil and Nešlehová, 2009)). The rationale for employing Archimedean copulas is motivated by the fact that under this assumption, the distribution ofTi and its tail index can be easily obtained (see Proposition 2.1). Furthermore, in this framework, one can avoid having to previously estimate the latent variableZ(see item (2)). Indeed, the proposed estimator procedure is only based on quantities that can be directly estimated by using the observedd-dimensional independent and identically distributed (i.i.d.) sample of.Xj/, forj D1; : : : ; n(see Equation (8)).

We remark that Archimedean copulas play a central role in the understanding of dependencies of multivariate random vectors (see Nelsen, 2006; McNeil and Nešlehová, 2009; Durante and Salvadori, 2010).

Frequently, hydrological phenomena are characterized by upper tail dependence described by Gumbel-Logistic models (e.g. Fawcett and Walshaw, 2012; Chebana and Ouarda, 2011a ; de Waalet al., 2007).

Following these considerations, under a regular variation condition for the generatorand the von Mises condition for the marginalXi, we develop an extreme extrapolation technique in order to estimatexpin (e.g. Caiet al., 2015).

1.4. Organization of the paper

In Section 2.1, we derive the tail index of the distribution ofTi. Under suitable assumptions, a non-parametric estimation procedure forxpin is obtained, for a fixed level˛and whenpn!0, asn! C1(Section 2.2). The main result is the asymptotic convergence of our estimator withp D pn ! 0, asn ! 1(Section 3). In Section 4, the performance of the estimatorbxipn is illustrated on simulated data. Finally, Section 5 concludes with an application to a three-dimensional rainfall data set in order to illustrate how the proposed estimation procedure can help in the evaluation of multivariate extreme return levels.

2. PROPOSED EXTREME ESTIMATOR FOR THE MULTIVARIATE RETURN LEVEL x

pi

2.1. Tail index forTi

In this section, we aim to study the tail behaviour ofTi, foriD1; : : : ; d. We assume the existence of the limit inŒ1;1of D lim

s"1

.1s/0.s/

.s/ : (3)

159

(3)

Equation (3) is equivalent to regular variation ofat 1 with index, that is,2 RV.1/(see Charpentier and Segers (2009) for details).

Furthermore,>1due to the convexity of. When > 1, the upper tail of the copula exhibits asymptotic dependence, while ifD1, then the upper tail exhibits asymptotic independence. Under condition (3) for the generator, we now study the maximum domain of attraction (MDA) ofTi, foriD1; : : : ; d.

Proposition2.1 (The von Mises condition forTi) Let.X1; : : : ; Xd/be a random vector with Archimedean copula with twice differentiable generator. Assume that2RV.1/, with2Œ1;C1. Leti2 f1; : : : ; dgandFibe the twice differentiable distribution function ofXi. Assume thatFiverifies the von Mises condition with indexi2R. LetTibe as in (1) with distribution functionFi.j˛/.

(i). If2Œ1;C1/, thenFi.j˛/verifies the von Mises condition with tail indexTi D i. Specifically,Ti2MDA Ti

. (ii). IfD C1, thenFi.j˛/verifies the von Mises condition with tail indexTi D0. In particular,Ti2MDA .0/.

The proof and illustrations of Proposition 2.1 are given in the Supporting Information. Furthermore, the von Mises condition and the explicit form of the distributionFi.j˛/can also be found in the Supporting Information.

Remark Note thatTi depends on neither the risk level˛nor on the dimensiond. However,Ti depends on the domain of attraction of the respective marginXiand on the regularly varying indexof the generator of the Archimedean copula considered. It should be borne in mind that the assumptions of Proposition 2.1 can be easily satisfied. Indeed, in Table 1 in Charpentier and Segers (2009), various copula models with associatedindex can be found, and the von Mises condition is verified for a large class of marginal distributionsFi (see illustrations in the Supporting Information).

The relationship between the quantile functionsUTi andUXi is established in the following result. The proof of Proposition 2.2 is given in the Supporting Information.

Proposition2.2 (Relation between UTi and UXi) Let .X1; : : : ; Xd/be a random vector with Archimedean copula with generator. Assume that 2 RV.1/, with 2Œ1;C1. LetTi be as in (1) with distribution functionFi.j˛/. Letk Dk.n/ ! 1,k=n!0, as n! 1, and

kU.n/WDn (

11

"

1

1k.n/

n

1=.d1/! .˛/

#)

: (4)

Therefore,

(i) kU.n/is an intermediate sequence, that is,kU.n/! 1,kU=n!0asn! 1.

(ii) UTin

k

DUXi

n kU

;whereUXiis the marginal quantile function, that is,UXi.t /WDFi .11=t /.

2.2. Proposed estimator using an extrapolation method

Henceforth, we will focus on the case:i > 0and 2 Œ1;C1/(in particular, this impliesTi > 0). This choice is motivated by our applications in hydrology and in particular in real rainfall data sets. Indeed, in these real-life applications, we can easily observe heavy-tailed distributions (see, for instance, Pavlopouloset al.(2008) and Papalexiouet al.(2013)). For the marginal distributionXi, we therefore assume that there existsi > 0such that for allx > 0

t!1lim UXi.tx/

UXi.t / Dxi: (5)

In this case, Propositions 2.1 and 2.2 yield, asn! 1, xipn DUTi

1 pn

UXi

n kU

k n pn

i=

DUXi

n kU

k n pn

Ti

(6) wherekU is as in Equation (4) andkDk.n/! 1,k.n/=n!0, asn! 1.

Let.X1; : : : ; Xd/be ad-dimensional random vector with continuous distribution functionF and Archimedean copula with generator . The goal is to estimatexipn in (6) based ond-dimensional i.i.d. observations,.Xj/, forj D 1; : : : ; n, fromF, wherepn ! 0, as n! C1. LetXnbki

Uc;nbe the.nbkUc/-th order statistic of

X1i; : : : ; Xni

. Therefore, the natural estimator ofUXi

n kU

is its empirical counterpart, that is,Xnbki

Uc;n(e.g., see de Haan and Ferreira, 2006).

From Equation (6), in order to define the estimator ofxipn , it thus remains to estimate i and . We estimate i with the Hill estimator (see Hill, 1975):

biD 1 k1

kX11

jD0

logXnj;ni logXnki

1;n; (7)

160

(4)

wherek1is an integer sequence such thatk1.n/! 1,k1=n!0,n! 1, and thatXnki

1;nis the intermediate order statistic at level nk1. In addition, the regularly varying indexis estimated by taking into account the estimator of the upper tail dependence coefficient proposed by Schmidt and Stadtmüller (2006) (for details, see the Supporting Information).

LetbTi WDbi

b. We can therefore estimatexpin in (6) by bxipnDXnbki

Uc;n

k n pn

bTi

: (8)

Remark Notice that the proposed estimator in Equation (8) does not rely on the latent MPITZWDF .X1; : : : ; Xd/, which is not directly observed. Under the assumptions of Proposition 2.1, the application of the proposed extrapolation technique precludes the necessity to previously estimateZ. Indeed, the estimatorbxipn in Equation (8) is only based on quantities that can be directly estimated by using the observedd-dimensional i.i.d. sample.Xj/, forj D1; : : : ; n. In Section 4, we provide a comparison with an empirical quantile estimation ofTiconstructed by using the empirical multivariate distribution functionFn.Xj/(see Equation (10)).

3. ASYMPTOTIC NORMALITY

In order to prove the asymptotic normality ofbxipn, we need to quantify the rate of convergence in Equation (5). We therefore assume the following second-order regularity condition.

Assumption3.1 (2RV condition onUXi) There existi < 0and an eventually positive or negative functionAisuch that, ast ! 1, Ai.t x/=Ai.t /!xi for allx > 0and supx>1ˇˇˇxi UUXi.t x/

Xi.t / 1ˇˇˇDO.Ai.t //;(see Condition (3.2.4) in de Haan and Ferreira (2006)).

For the sake of brevity, the auxiliary results necessary to obtain Theorem 3.1 are presented in the Supporting Information. In the following, our main result is presented: the asymptotic normality for the estimatorbxipn in Equation (8).

Theorem 3.1 (Asymptotic normality ofbxipn in the upper tail dependence case, > 1) Let.X1; : : : ; Xd/be a random vector with Archimedean copula with twice differentiable generator. Assume that2RV.1/, with2.1;C1/. Leti 2 f1; : : : ; dgandFibe the twice differentiable distribution function ofXi. Assume thatFiverifies the von Mises condition with indexi> 0. LetTibe as in (1) with distribution functionFi.j˛/. Assume

1. For.Xi; Xj/, withi ¤j, the upper tail copulaƒU exists, has continuous partial derivatives and satisfies the second-order condition in Equation (10) in the Supporting Information with auxiliary functionA./.

2. UXi satisfies Assumption 3.1 with auxiliary functionAi./,i> 0andi< 0.

3. kDk.n/! 1,k=n!0,n! 1such that Theorem 2.1 in the Supporting Information is satisfied.

4. k1Dk1.n/! 1,k1=n!0, andp

k1Ai.n=k1/!,n! 1.

5. k2Dk2.n/! 1,k2=n!0, andp

k2A.n=k2/!0,n! 1.

Letr Dlimn!C1 pk1.n/

pk2.n/,r0Dlimn!C1

pkUp.n/log.dn/

k1.n/ andr00Dlimn!C1

pkUp.n/log.dn/

k2.n/ withr; r0andr002Œ0;1.

Hence, asn! 1, ifr 61and limn!C1plog.dn/

k1.n/ D0, then

min p

kU; pk1

log.dn/

! bxipn xpin 1

!

!d

BCr0.‚1Cr‚2/; r061I

1

r0BC‚1Cr‚2; r0> 1;

and, ifr > 1and limn!C1plog.dn/

k2.n/ D0, then

min p

kU; pk2

log.dn/

! bxipn xpin 1

!

!d

BCr00.1r1C‚2/; r0061I

1

r00BC1r1C‚2; r00> 1;

where dn WD k=.n pn/, B N.0; i2/, ‚1 N.=i; 1/ with D =.1 i/ and ‚2 N

0; 2=2

, with 2 D

2 U

log.2/

.2U/log2.2U/

2

, U WDƒ.1; 1/the upper tail dependence coefficient and U2 DU C

@

@xƒU.1; 1/2

C

@

@yƒU.1; 1/2

C 2U

@

@xƒU.1; 1/1 @y@ƒU.1; 1/1 1

:

The proof of Theorem 3.1 is presented in the Supporting Information.

Remark(Asymptotic consistency ofbxipn in the upper tail independence case, D 1) Notice that if D 1(i.e. tail copulaƒU 0), then the asymptotic variance U2 in Theorem 3.1 vanishes (see Supporting Information). However, in the upper tail independence case, the

161

(5)

consistency of the proposed estimatorbxipn can be obtained. To be precise, if 2 RV1.1/and the second, third and fourth conditions of Theorem 3.1 hold, thenbxipn

xipn

!P 1; forn! 1:

4. SIMULATION STUDY

The aim of this section is to evaluate the performance ofbxipn in finite-size samples. Although we restrict ourselves to a three-dimensional case in this study, these illustrations could be adaptable in any dimensiond. The performance of our extreme estimatorbxipnis also compared with a pseudo-empirical estimator (denotedbxpseud opn ), an empirical estimator (bxe mppn ) and a semi-parametric empirical estimator (bxpn). The construction of these three competitor estimators is now described. In order to attainbxpseud opn , it is assumed that the distribution function ofTiis known (see Lemma 2.1 in the Supporting Information). We can then sample from the random variableTi by using the fact that Ti d

DFi1n 1h

1U1=.d1/

.˛/io

, whereUis a uniform random variable. Finally, the pseudo-empirical estimatorbxpseud opn can be defined as the.n bn pnc/-th order statistic of the sample obtained fromTi,

bxpseud opn DTnbn pi

nc;n: (9)

On the other hand, an empirical estimator (bxe mppn ) can be proposed without the need for any information aboutTi. To this end, we sample from the latent random variableTi D ŒXijF .X/D˛by using the empirical multivariate distribution function. Let.Xj/,j D 1; : : : ; n;

be ad-dimensional i.i.d. sample ofX. For allt 2 Rd, thed-dimensional empirical distribution function of Xis defined asFn.t / WD

1 n

Pn

iD11fXj6tg:TQiis then obtained by collecting the points.Xij/, forj D1; : : : ; n;such thatFn.Xj/ 2Œ˛h; ˛Chfor a positive sufficiently small valueh. The quantityhis adjusted to each considered model and each sample size. The competitor estimatorbxe mppn is given by

bxe mppn D QTnbn pi

nc;n: (10)

Finally, from Definition 5.1 in Di Bernardinoet al.(2015), a semi-parametric empirical competitor estimator is presented, denoted as bxpn. LetBQi DXbsc;ni be thes-th order statistic ofXiwiths Dn 1

bn

.Sibn.˛///, whereSiis a random variable withBet a.1; d1/

distribution. Bear in mind thatbn is the semi-parametric estimator of the generator of the copulaobtained by considering the maximum pseudo-likelihood estimator of the parameter associated to (e.g., see Genestet al., 1995). The competitor semi-parametric empirical estimator is given by

bxpnD QBnbn pi

nc;n: (11)

We now consider the following three-dimensional distributional models:

1. Joe copula and Fréchet margins:Fi.t /Dexpftˇg,i D1; 2; 3, andC.u1; u2; u3/D1Œ1expflog.1.1u1//Clog.1.1 u2//Clog.1.1u3//g1=. In this section, we take the dependence copula parameter D3and the marginal parameterˇD3.

Bear in mind that the assumptions of Theorem 3.1 are satisfied. Indeed,2RV3.1/andUXisatisfy Assumption 3.1 withiD1=3and iD 1, fori D1; 2; 3. In addition, the associated tail-logistic model is given byƒU.x; y/DxCy.x3Cy3/1=3, which satisfies the second-order condition in Equation (10) in the Supporting Information, andU D0:74.

2. Independence copula with Fréchet margins:Fi.t /Dexpn tˇo

,iD1; 2; 3, andC.u1; u2; u3/Du1u2u3. In this case,2RV1.1/

and the upper tail copula is given byƒU DU D0. However, using Remark in Section 3, the consistency ofbTi andbxipnis illustrated in this section.

Figure 1. Boxplots of the ratioOTi=Ti: for Joe copula with D3and Fréchet margins withˇ D3(left panel); for Independence copula and Fréchet margins withˇD3(centre panel); and for Gumbel copula withD2and Pareto margins withı1D1andı2D2. We considernD150,nD500and

nD2000and 500 Monte Carlo simulations

162

(6)

3. Gumbel copula with Pareto margins: Fi.t / D 1.ı1=.tCı1//ı2, i D 1; 2; 3, and C.u1; u2; u3/ D expn

.log.u1//C .log.u2//C.log.u3//1=

. In the simulation study, we take the dependence copula parameter D2and the marginal param- etersı1D1andı2 D2. Bear in mind that the assumptions of Theorem 3.1 are satisfied. Indeed,2RV2.1/andUXi verify the 2RV condition withi D1=2and D 1=2, fori D 1; 2; 3. Furthermore,ƒU.x; y/DxCy.x2Cy2/1=2verifies the second-order condition in Equation (10) in the Supporting Information, andU D0:59.

Various sample sizes are taken, and we considerpn D 1=nand pn D 1=2n, the critical layer level˛ D 0:9and 500 Monte Carlo simulations. Note that specific values for auxiliary sequences of our procedure (k1,k2andk) are chosen for each sample size as indicated in

Figure 2. Independence copula and Fréchet margins withˇ D3. Boxplots for the ratioxOpin=xipn withpn D1=n,pnD1=2nandnD150(left panel),nD1000(right panel). Boxplots for the competitor empirical estimators withpnD1=nare also displayed. We consider˛D0:9and 500 Monte

Carlo simulations

Figure 3. First row: trivariate Joe copula with D3and Fréchet margins withˇ D3. Second row: trivariate Gumbel copula with D2and Pareto margins withı1 D1andı2D2. Boxplots for the ratioxOpin=xipnwithpn D1=n; 1=2nfornD150,nD500andnD2000. Boxplots for the

competitor empirical estimators withpnD1=nare also displayed. We consider˛D0:9and 500 Monte Carlo simulations

163

(7)

the figures. In order to choosek1, the estimator ofiis plotted against various values ofk1. By balancing the potential estimation bias and variance, a common practice is to choosek1from the first stable region of the plots (e.g. Caiet al., 2015). Finally, in order to gain stability in the estimates, the obtained values are averaged in this region. A similar procedure is developed for the auxiliary sequencek2(for the estimation of). The sequencekis selected by observing the stability of the final ratiobxipn=xipn.

4.1. Boxplots of ratiobTi=Ti

In Figure 1, we present the boxplots of ratiobTi=Ti for the three distributional models considered, and for different sample sizes.

4.2. Boxplots of ratiobxipn=xipn

Using Remark in Section 3, an illustration of the consistency of the proposed estimator is provided in the independent copula case. In Figure 2, the obtained box plots for the ratiobxipn=xipn are presented forpn D 1=nand pn D 1=2nforn D 150(left panel) andn D 1000 (right panel). ForpnD1=n, we also provide the boxplots of the ratiosbxpseud opn =xpin,bxe mppn =xpin andbxpn=xpin, with empirical competitor estimators previously defined in Equations (9), (10) and (11), respectively.

In Figure 3, the obtained boxplots for the ratiobxipn=xipn are presented forpnD1=nandpnD1=2n, in the Joe (first row) and Gumbel copula model (second row). ForpnD1=n, the comparison with the empirical competitor estimators is also provided.

In Figures 2 and 3, it can be noticed that the empirical (bxe mppn ), pseudo-empirical (bxpseud opn ) and semi-parametric empirical (bxpn) com- petitor estimators underestimate the conditional quantilexpin and are consistently outperformed by the proposed EVT estimatorbxipn. In addition, the empirical estimators are not applicable forp < 1=n. Furthermore, the behaviour of the EVT estimatorbxipn remains stable whenpnchanges from1=nto1=2n.

4.3. Asymptotic normality

Finally, the asymptotic normality in Theorem 3.1 is illustrated in Figure 4 for the Joe copula model. The Q–Q plots in Figure 4 gather the sample quantiles of minp

kU;

pk1

log.dn/;

pk2

log.dn/ bxipn

xipn 1

versus the theoretical normal quantiles for various sample sizes withpn D

−3 −2 −1 0 1 2

−0.50.00.51.0

Joe copula theta=3 with Fréchet margins beta=3 n=150, k1=20, k2=50,kn=75

Theoretical Quantiles

Sample Quantiles −0.4−0.20.00.20.40.6

Joe copula theta=3 with Fréchet margins beta=3, n=500, k1=50, k2=180, kn=100

Sample Quantiles

−3 −2 −1 0 1 2 3

−0.4−0.20.00.20.4

Joe copula theta=3 with Fréchet margins beta=3, n=2000, k1=150, k2=600, kn=350

Theoretical Quantiles

Sample Quantiles

3 −3 −2 −1 0 1 2 3

Theoretical Quantiles

Figure 4. Joe copula with parameter D3and Fréchet margins withˇ D3. Q–Q plots for minp kU;

p k1 log.dn/;

p k2 log.dn/

O xipn xipn1

forpnD1=n.

We takenD150,nD500andnD2000. We consider˛D0:9and 500 Monte Carlo simulations

Joe copula theta=4 with Fréchet margins beta=4 n=150, k1=25, k2=55, kn= 35

alpha =0.85 alpha =0.90 alpha =0.95 alpha =0.98 alpha =0.99

4

alpha =0.85 alpha =0.90 alpha =0.95 alpha =0.98 alpha =0.99

3201 01234

Joe copula theta=4 with Fréchet margins beta=4 n=500, k1=55, k2=190, kn= 120

Figure 5. Trivariate Joe copula withD4and Fréchet margins withˇD4. Boxplots for the ratioxOpin=xipnwithpnD1=n,nD150(left panel) and nD500(right panel). Different values of˛are taken and 500 Monte Carlo simulations

164

(8)

1=n. Since the scatterplots line up on the line in Figure 4, this indicates that the sample quantiles coincide largely with the theoretical quantiles from the asymptotic distribution. Hence, Theorem 3.1 provides an adequate approximation for finite sample sizes.

4.4. Behaviour of ratiobxipnin terms of˛

In Figure 5, the boxplots of ratiobxipn=xpin are presented for a Joe copula with D 4and Fréchet margins withˇ D 4by considering different values of the critical layer level˛. Note that the convergence ratekU in Proposition 2.2 (see Equation (4)) satisfies @kU.˛/ 60, for fixed values of sample sizenand dimensiond. Therefore, as can be observed in Figure 5, the performance of the proposed estimators decreases when˛increases.

5. MULTIVARIATE AND UNIVARIATE EXTREME RETURN LEVELS: AN ILLUSTRATION FOR HYDROLOGICAL DATA SET

In this section, we focus on estimating the risk of a flood in theBièvreregion in the south of Paris (France) by using both the proposed multivariate extreme return level (see Equation (2)) and the classic univariate return level (see the Introduction section).

5.1. Presentation of the hydrological data set

The data set contains the monthly mean of the rainfall measurements recorded in three different stations of theBièvreregion, from 2003 to 2013. The unit of measurement ismm. The size of the data set isnD125. The localization of the three stations is presented in Figure 6, and the data set is represented in Figure 7 (left panel). LetXidenote the temporal series of the monthly mean of the rainfall measurements for stationi, fori D1; 2; 3. Station 1 is calledGeneste(denotedX1), station 2Loup Pendu(X2) and station 3Trou salé(X3). The data set considered was provided by theSyndicat Intercommunal pour l’Assainissement de la Valle de la Bièvre(SIAVB, see http://www.siavb.fr/).

Figure 6. Localization of the three stations in theBièvreregion (in the south of Paris, France)

Geneste

Loup Pendu 20 40 60 80

20406080

0 20 40 60 80 0 20 40 60 80

20406080020406080

0 20 40 60 80

020406080

Trou salé

Figure 7. Left panel: scatterplot of the three-dimensional hydrological data set considered. Right panel: associated critical layers@L.˛/, for˛D0:75

and 0.9

165

(9)

Di Bernardino and Prieur (2014) discussed the plausibility of the temporal independence assumption for these three-dimensional monthly rainfall measurements.

For the sake of completeness, a test of exchangeability is developed (e.g., see Genestet al., 2012) for the three pairs (X1,X2), (X1,X3) and (X2,X3): we obtainp-values of 0.511, 0.206 and 0.181, respectively. The test is performed with the functionexchTestof theR packagecopulaand suggests exchangeability for all pairs (Figure 7, left panel). Furthermore, by using a goodness-of-fit test for various parametrical multivariate distributions, Di Bernardino and Prieur (2014) proposed for this data set a three-dimensional Gumbel copula with dependence parameter D3:93. The critical layers@L.˛/of this data set for different values of˛are displayed in Figure 7 (right panel).

5.2. Univariate versus multivariate return levels

Given the temporal seriesXiof the monthly mean of the rainfall measurements for stationi, one can define the classic univariate return level with associated probabilitypas the quantity

xi;univp DUXi

1 p

; foriD1; 2; 3;

wherepD1=T andT is called thereturn period. As proposed by Salvadoriet al.(2011), the return level associated to the three stations can be obtained by considering the vector!xunivp WD

x1;univp ; xp2;univ; xp3;univ

, that is, the aggregation of univariate quantiles.

However,!xunivp does not take into account the dependence structure between the three temporal series. As discussed in the Introduction section, while the return level in the univariate setting is usually identified without ambiguity (see, for instance, Corbella and Stretch (2012) and Salvadoriet al.(2011)), in the multivariate setting, it is a troublesome task (Vandenbergheet al., 2012; Gräleret al., 2013).

In this present paper, a possible procedure is proposed for the identification of the contribution of the margins to the global (regional) multivariate risk. As mentioned in the Introduction section, the information concerning the dependence structure of the three considered rainfall measurements is integrated in order to calculate the associated multivariate return levels. We considerxpi WDUTi

1 p

, fori D 1; 2; 3, whereTiWDŒXij.X1; X2; X3/2@L.˛/, with˛2.0; 1/(see Equation (2)), and we define our multivariate return level as!xp WD xp1; xp2; x3p

. In this case,xpi represents the return level associated to thei-th station conditioned to the fact that the three-dimensional rainfall data set belongs to the iso-surface@L.˛/.

5.3. Estimation procedure and obtained results

In the following, the return levelsxpi;univandxpi on the considered rainfall data are estimated, fori D1; 2; 3. We consider here˛D0:9.

To estimatexpi, firstly, we deal with the estimation ofbTi for each station. In Figure 8 (left panel), the Hill estimatorbiis presented versus k1for each station;k1 2Œ7; 27is chosen because this window is the first stable region of this plot. Similarly, the stable region chosen for the consideredbcorresponds tok22Œ25; 50(Figure 8, centre panel). Furthermore, the adaptive sequencebkU.n/is estimated as described in the Supporting Information. In addition, the stable region chosen forbxipn isk2Œ38; 80(Figure 8, right panel). Finally, to gain stability, the estimationsbi,bandbxipnare averaged in the chosen stable regions (see also Caiet al., 2015).

The obtained extreme estimation for!xp is presented in Table 1 for different probability levelsp. In Table 1, the empirical estimator bxi;e mp1=n given in Equation (10) is also included. Unsurprisingly,bxi;e mp1=n underestimates the risk (see also the simulation study in Section 4).

Furthermore, using the extreme quantile estimator proposed in Theorem 4.3.8 in de Haan and Ferreira (2006), the univariate return level xi;univp is estimated for different probability levelsp(see Table 2).

Note that in Table 1,bi> 0(i.e. the monthly mean of the rainfall measurements for each stationibelongs to the Fréchet MDA) andb > 1 (i.e. upper tail dependency). Values gathered in Table 1 (resp. in Table 2) represent the estimated multivariate return levels (resp. univariate return levels) inmmwith an associated return period of around 10 years (forpD1=n), 20 years (forpD1=2n), 52 years (forpD1=5n) and 105 years (forpD1=10n).

In Tables 1 and 2, a major contribution of the second station (i.e.X2) can be observed. One can interpret that the manager of theBièvre region needs to pay more attention to this station because it contributes towards producing a flood in the region to a greater degree, both in the univariate and multivariate return level cases.

k1

Hill estimator of γi

Geneste Loup Pendu Trou salé

k2

Estimator of ρ

0 20 40 60 80 100 120

0.20.40.60.81.01.21.41.6

0 20 40 60 80 100 120

020406080

0 20 40 60 80 100 120

5060708090100

k Estimates of x pn, pn=1/2n

Figure 8. Hill estimatorsOiversusk1, foriD1; 2; 3(left panel). EstimatorObased on the subvector (X1,X3) versusk2(centre panel). EstimatesxOpin against various values of the intermediate sequencek, foriD1; 2; 3andpnD1=2n(right panel).

166

(10)

Table 1. Estimated extreme multivariate return levelxOpin, for different values ofpn. Hill estimator O

iis calculated by taking the average fork1 2Œ7; 27;Ois obtained by taking the average fork22 Œ25; 50;xOpin are calculated by taking the average fork2Œ38; 80. The empirical estimatorxOi;e mp1=n in Equation (10) is also displayed

i Station bi b bxi;e mp1=n bxi1=n bxi1=2n bxi1=5n bxi1=10n 1 Geneste 0.227 4.852 52.33912 53.82129 55.59621 58.03267 59.94646 2 Loup Pendu 0.239 4.852 58.55755 59.98556 62.07613 64.95194 67.21560 3 Trou salé 0.222 4.852 41.49933 54.04171 55.78875 58.18519 60.06618

Table 2. Estimated extreme univariate return levelxOi;univp , for different values ofpn

i Station bxi;univ1=n bxi;univ1=2n bxi;univ1=5n bxi;univ1=10n 1 Geneste 71.63235 83.84691 103.24705 120.85247 2 Loup Pendu 81.12761 95.79985 119.34466 140.92860 3 Trou Salé 72.41996 84.51017 103.64423 120.94719

5.4. Conclusions

The proposed multivariate return level approach in the present paper has the advantage of using a mathematically consistent way of defining the multivariate probability of dangerous events by relying on the iso-curves@L.˛/. However, there is no universal choice of an appropriate approach to all real-world problems (see also Introduction section). It is necessary to address the problem from a probabilistic point of view and to be aware of the practical implications of the approach chosen.

It is also evident in our hydrological study, but not necessarily the case, that the more variables/information included, the smaller the design quantiles become (see multivariate and univariate return levels in Tables 1 and 2). Indeed, marginal components of the multivariate levels (i.e.bxip) are considerably lower than the corresponding univariate return levels (i.e.bxi;univp ) (see Tables 1 and 2). This fact can be intuitively interpreted: the probability of an extreme event that simultaneously exceeds a return level in all margins is liable to be much lower than the probability of an event that exceeds the same level in any one of the margins considered alone. Therefore, the univariate levelsxpi;univshould be much higher to obtain the same small exceedance probabilityp. Salvadori and De Michele (2013) discuss this dimensionality paradoxand provide a theoretical explanation. The interested reader is also referred to Salvadoriet al.(2011) and Gräleret al.(2013) for analogous considerations.

In particular, in our study, the large discrepancy between the estimatedbxip andbxi;univp depends on the considered parameter setting (˛D0:9,pnD n1;2n1 ;5n1;10n1 ;withnD125) and on the theoretical properties of the considered multivariate return levelxpi. For further details about the properties of this risk measure, the interested reader is referred to Propositions 2.3–2.5 and Corollary 4.4 in Di Bernardino et al.(2015).

Furthermore, one should also be aware of the fact that ourTi-quantile approach (see Equation (2)) is only applied to variables that are positively associated and with a focus on extremes in terms of large values. In all other cases, adaptations should be made in order to operate in the correct area of the copula (such as thedirectionalmultivariate return levels proposed by Torreset al.(2015)).

From a practical perspective, it is impossible to provide a general suggestion for an appropriate approach to estimate multivariate design events applicable to a vast set of design exercises. Hitherto, many applications have been based on the concept of univariate return level, since the concept of multivariate return level has a different meaning and is potentially less conservative (as can be observed by comparing Tables 1 and 2).

On the other hand, when the analyst estimates the extension of flood inundation, a joint return period approach could prove appealing.

Indeed, an ensemble of equally rare scenarios (i.e. those with the same return probabilityp) could be used to assess the variability of the flood maps obtained due to the selection of a single design event.

Acknowledgements

The authors wish to thank the reviewers for their numerous and very useful comments and suggestions. The authors would like to thank Latouche A. for fruitful discussions. This work was partly supported by a grant from Junta de Andalucía (Spain) for research group (FQM- 328) and by a pre-doctoral contract (Palacios Rodríguez, F.) from the ‘V Plan Propio de Investigación’ of the University of Seville.

167

(11)

REFERENCES

Belzunce F, Castaño A, Olvera-Cervantes A, Suárez-Llorens A. 2007. Quantile curves and dependence structure for bivariate distributions.Computational Statistics & Data Analysis51:5112–5129.

Cai JJ, Einmahl JHJ, de Haan L, Zhou C. 2015. Estimation of the marginal expected shortfall: the mean when a related variable is extreme.Journals of the Royal Statistical Society. Series B. Statistical Methodology77(2):417–442.

Charpentier A, Segers J. 2009. Tails of multivariate Archimedean copulas.Journal of Multivariate Analysis100(7):1521–1537.

Chebana F, Ouarda TBMJ. 2011. Multivariate extreme value identification using depth functions.Environmetrics22:441–455.

Chebana F, Ouarda TBMJ. 2011. Multivariate quantiles in hydrological frequency analysis.Environmetrics22:63–78.

Corbella S, Stretch DD. 2012. Multivariate return periods of sea storms for coastal erosion risk assessment.Natural Hazards and Earth System Sciences 12(8):2699–2708.

Cousin A, Di Bernardino E. 2013. On multivariate extensions of Value-at-Risk.Journal of Multivariate Analysis119:32–46.

de Haan L, Ferreira A. 2006.Extreme Value Theory. An Introduction. Springer Series in Operations Research and Financial Engineering. Springer: New York.

de Haan L, Huang X. 1995. Large quantile estimation in a multivariate setting.Journal of Multivariate Analysis53(2):247–263.

de Waal DJ, van Gelder PHAJM, Nel A. 2007. Estimating joint tail probabilities of river discharges through the logistic copula.Environmetrics18:621–631.

Di Bernardino E, Fernández-Ponce J, Palacios-Rodríguez F, Rodríguez-Griñolo M. 2015. On Multivariate Extensions of the Conditional Value-at-Risk measure.Insurance: Mathematics and Economics61:1–16.

Di Bernardino E, Laloë T, Maume-Deschamps V, Prieur C. 2013. Plug-in estimation of level sets in a non-compact setting with applications in multivariable risk theory.ESAIM: Probability and Statistics17:236–256.

Di Bernardino E, Prieur C. 2014. Estimation of multivariate conditional-tail-expectation using Kendall’s process.Journal of Nonparametric Statistics 26(2):241–267.

Durante F, Salvadori G. 2010. On the construction of multivariate extreme value models via copulas.Environmetrics21:143–161.

Embrechts P, Puccetti G. 2006. Bounds for functions of multivariate risks.Journal of Multivariate Analysis97(2):526–547.

Fawcett L, Walshaw D. 2012. Estimating return levels from serially dependent extremes.Environmetrics23:272–283.

Genest C, Ghoudi K, Rivest LP. 1995. A semiparametric estimation procedure of dependence parameters in multivariate families of distributions.Biometrika 82(3):543–552.

Genest C, Nešlehová J, Quessy JF. 2012. Tests of symmetry for bivariate copulas.Annals of the Institute of Statistical Mathematics64:811–834.

Genest C, Rivest LP. 2001. On the multivariate probability integral transformation.Statistics & Probability Letters53:391–399.

Gräler B, van den Berg MJ, Vandenberghe S, Petroselli A, Grimaldi S, De Baets B, Verhoest NEC. 2013. Multivariate return periods in hydrology: a critical and practical review focusing on synthetic design hydrograph estimation.Hydrology and Earth System Sciences17(4):1281–1296.

Hill BM. 1975. A simple general approach to inference about the tail of a distribution.The Annals of Statistics3(5):1163–1174.

Hochrainer-Stigler S, Pflug G. 2012. Risk management against extremes in a changing environment: a risk-layer approach using copulas.Environmetrics 23:663–672.

McNeil A, Nešlehová J. 2009. Multivariate Archimedean copulas, d-monotone functions andl1norm symmetric distributions.The Annals of Statistics 37(5B):3059–3097.

Nappo G, Spizzichino F. 2009. Kendall distributions and level sets in bivariate exchangeable survival models.Information Sciences179:2878–2890.

Nelsen RB. 2006.An Introduction to Copulas.Springer Series in Statistics. Springer: New York.

Papalexiou SM, Koutsoyiannis D, Makropoulos C. 2013. How extreme is extreme? An assessment of daily rainfall distribution tails.Hydrology and Earth System Sciences17:851–862.

Pavlopoulos H, Picek J, Jureˇcková J. 2008. Heavy tailed durations of regional rainfall.Applications of Mathematics53(3):249–265.

Prékopa A. 2012. Multivariate Value-at-Risk and related topics.Annals of Operation Research193:49–69.

Salvadori G, De Michele C. 2013. Multivariate Extreme Value Methods. In AghaKouchak A, Easterling D, Hsu K, Schubert S, Sorooshian S (eds),Extremes in a Changing Climate. Springer: Dordrecht.

Salvadori G, De Michele C, Durante F. 2011. On the return period and design in a multivariate framework.Hydrology and Earth System Sciences15:

3293–3305.

Salvadori G, Durante F, Perrone E. 2012. Semi-parametric approximation of Kendall’s distribution function and multivariate return periods.Journal de la Société Française de Statistique153(3):151–173.

Schmidt R, Stadtmüller U. 2006. Non-parametric estimation of tail dependence.Scandinavian Journal of Statistics. Theory and Applications33(2):307–335.

Torres R, Lillo RE, Laniado H. 2015. A directional multivariate value at risk.Insurance: Mathematics and Economics65:111–123.

Vandenberghe S, van den Berg MJ, Gräler B, Petroselli A, Grimaldi S, De Baets B, Verhoest NEC. 2012. Joint return periods in hydrology: a critical and practical review focusing on synthetic design hydrograph estimation.Hydrology and Earth System Sciences Discussions9:6781–6828.

SUPPORTING INFORMATION

Additional Supporting Information may be found in the online version of this article at the publisher’s web site.

168

Références

Documents relatifs

Then, the ratio between the mechanization benefits and the climatic benefits must be less than 0.3 (Figure 4): with this value both viticulture (on the bottom of the valley and on

If the rejection sampler is replaced by the Hamiltonian Monte Carlo method for truncated normal vectors described in Pakman and Paninski (2014) and implemented in the R package

The goal of the article is the use of the hypercomplex number in a three-dimensional space to facilitate the computation of the amplitude and the probability of the superposition

Using the separator algebra inside a paver will allow us to get an inner and an outer approximation of the solution set in a much simpler way than using any other interval approach..

The full cascade radiation (all generations) computed with the Monte Carlo code (black solid lines) and the pri- mary injected gamma-ray source (isotropic, dotted line) are shown for

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des

appearing in the numerator of (1.1). Approaches based on the Nadaraya-Watson kernel estimator have been proposed by Da Veiga and Gamboa [10] or Sol´ıs [29]. We propose here a

1, fracture line; 2, anterior part of the frontal bone depressed fracture penetrating the endocranial volume; 3, irregular shape of virtual endocranial surface indicating brain