
Estimating the conditional tail index with an integrated conditional log-quantile estimator in the random covariate case


Academic year: 2021


HAL Id: hal-01074694

https://hal.archives-ouvertes.fr/hal-01074694

Preprint submitted on 15 Oct 2014

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or



Laurent Gardes, Gilles Stupfler

To cite this version:

Laurent Gardes, Gilles Stupfler. Estimating the conditional tail index with an integrated conditional log-quantile estimator in the random covariate case. 2014. ⟨hal-01074694⟩


Laurent Gardes (1) & Gilles Stupfler (2)

(1) Université de Strasbourg & CNRS, IRMA, UMR 7501, 7 rue René Descartes, 67084 Strasbourg Cedex, France

(2) Aix Marseille Université, CNRS, EHESS, Centrale Marseille, GREQAM UMR 7316, 13002 Marseille, France

Abstract. It is well known that the tail behavior of a heavy-tailed distribution is controlled by a parameter called the tail index. Such a parameter is therefore of primary interest in extreme value analysis, particularly to estimate extreme quantiles. In various applications, the random variable of interest can be linked to a finite-dimensional random covariate. In such a situation, the tail index is a function of the covariate and is referred to as the conditional tail index. The goal of this paper is to provide a class of estimators of this quantity. The pointwise weak consistency and asymptotic normality of these estimators are established. We illustrate the finite sample performance of our technique on a simulation study and on a real hurricane data set.

AMS Subject Classifications: 62G05, 62G20, 62G30, 62G32.

Keywords: Heavy-tailed distribution, tail index, random covariate, consistency, asymptotic normality.

1 Introduction

Studying extreme events is relevant in numerous fields of statistical applications. In hydrology for example, it is of interest to estimate the maximum level reached by seawater along a coast over a given period, or to study extreme rainfall at a given location; in actuarial science, a major problem for an insurance firm is to estimate the probability that a claim so large that it represents a threat to its solvency is filed. A particular branch of extreme value analysis focuses on the study of heavy-tailed random variables, that is, those random variables whose distribution function $F$ is such that, for all $\lambda > 0$, $(1 - F(\lambda x))/(1 - F(x)) \to \lambda^{-1/\gamma}$ as $x$ goes to infinity, where $\gamma > 0$ is the so-called tail index. The parameter $\gamma$ drives the asymptotic behavior of $F$ in its right tail, which makes its estimation important. The estimation of the tail index has therefore been extensively studied in the literature. Recent overviews on univariate tail index estimation can be found in Beirlant et al. [2] and de Haan and Ferreira [22].
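The defining limit of a heavy tail can be checked numerically on a simple example. The sketch below is only an illustration: the Burr-type survival function $(1 + x^c)^{-k}$, whose tail index is $\gamma = 1/(ck)$, and the names `burr_survival` and `tail_ratio` are assumptions made for this example and do not come from the paper.

```python
def burr_survival(x, c=2.0, k=1.0):
    """Survival function 1 - F(x) of a Burr-type law: (1 + x**c) ** (-k)."""
    return (1.0 + x ** c) ** (-k)


def tail_ratio(x, lam, c=2.0, k=1.0):
    """Ratio (1 - F(lam * x)) / (1 - F(x)) appearing in the heavy-tail definition."""
    return burr_survival(lam * x, c, k) / burr_survival(x, c, k)


c, k = 2.0, 1.0
gamma = 1.0 / (c * k)            # tail index of this particular law
lam = 3.0
limit = lam ** (-1.0 / gamma)    # theoretical limit lam ** (-1 / gamma)

# The ratio should approach the limit as x grows.
ratios = [tail_ratio(x, lam, c, k) for x in (1.0, 10.0, 100.0, 1000.0)]
errors = [abs(r - limit) for r in ratios]
for x, r in zip((1.0, 10.0, 100.0, 1000.0), ratios):
    print(f"x = {x:7.1f}  ratio = {r:.8f}  limit = {limit:.8f}")
```

The slower this ratio settles, the harder it is to estimate $\gamma$ from a finite sample, which is the issue that second-order conditions on the tail quantify.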

In practical applications, the variable of interest $Y$ can often be linked to a covariate $X$. For instance, the value of rainfall at a given location depends on its geographical coordinates; in actuarial science, the claim size depends on the sum insured by the policy. In this situation, the tail index of the random variable $Y$ given $X = x$ is a function of $x$ to which we shall refer as the conditional tail index. Its estimation has first been considered in the fixed design case, namely when the covariates are nonrandom. Smith [30] and Davison and Smith [12] considered a regression model while Hall and Tajvidi [23] used a semi-parametric approach to estimate the conditional tail index. Fully nonparametric methods have been developed using splines (see Chavez-Demoulin and Davison [7]), local polynomials (see Davison and Ramesh [11]), a moving window approach (see Gardes and Girard [15]), a nearest neighbor approach (see Gardes and Girard [16]), and a conditional quantile-based technique (see Gardes et al. [18]), among others.

Despite the great interest in practice, the study of the random covariate case has been initiated only recently. We refer to the works of Wang and Tsai [32], based on a maximum likelihood approach, Daouia et al. [9] who used a fixed number of nonparametric conditional quantile estimators to estimate the conditional tail index, later generalized in Daouia et al. [10] to a regression context with conditional response distributions belonging to the general max-domain of attraction, Gardes and Girard [17] who introduced a local generalized Pickands-type estimator (see Pickands [27]), Goegebeur et al. [20] who studied a nonparametric regression estimator whose strong uniform properties are examined in Goegebeur et al. [21], Stupfler [31] who introduced a generalization of the popular moment estimator of Dekkers et al. [13] and Gardes and Stupfler [19] who worked on a smoothed local Hill estimator (see Hill [24]) related to the work of Resnick and Stărică [28].

The aim of this paper is to introduce an estimator of the conditional tail index based on the integration of a conditional log-quantile estimator. This type of estimator is similar to the one of Gardes and Girard [15]; our aim is to prove its consistency and asymptotic normality when the covariates are random, as well as to examine its applicability on numerical examples and on real data. Our paper is organized as follows: we define our conditional tail index estimator in Section 2, its asymptotic properties are stated in Section 3, a simulation study is provided in Section 4 and we showcase our estimator on a set of real hurricane data in Section 5. We offer a couple of concluding remarks in Section 6. All the auxiliary results and proofs are deferred to the Appendix.


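We let $(X_1, Y_1), \ldots, (X_n, Y_n)$ be $n$ independent copies of a random pair $(X, Y) \in E \times \mathbb{R}_+$, where $(E, d)$ is a metric space. We assume that for any $x \in E$, the conditional distribution function $y \mapsto F(y|x) := P(Y \le y \mid X = x)$ of $Y$ given $X = x$ is such that the survival function $1 - F(\cdot|x)$ belongs to the set $\mathcal{RV}_{-1/\gamma(x)}$ of regularly varying functions (at infinity) of index $-1/\gamma(x) < 0$. Recall that a function $G \in \mathcal{RV}_a$, $a \in \mathbb{R}$, if $G$ is nonnegative and for all $\lambda > 0$, $G(\lambda y)/G(y) \to \lambda^a$ as $y$ goes to infinity. This is the adaptation of the standard extreme-value framework to the case when there is a covariate. An equivalent assumption (see Bingham et al. [5, Proposition 1.5.15]) is:

(M1) For any $x \in E$, the conditional quantile function $\alpha \mapsto q(\alpha|x) := F^{\leftarrow}(1 - \alpha|x) = \inf\{y \in \mathbb{R} \mid F(y|x) \ge 1 - \alpha\}$ belongs to $\mathcal{RV}_{-\gamma(x)}$.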

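Our goal is to estimate the conditional tail index $\gamma$ at a point $x \in E$. Remark first that, under (M1), for $u \in (0,1)$ small enough and $\alpha \in (0, u)$, $\log(q(\alpha|x)/q(u|x)) \approx \gamma(x) \log(u/\alpha)$. Hence, for any measurable function $\Psi(\cdot|x, u)$ on $(0, u)$ such that

\[ \int_0^u \Psi(\alpha|x, u) \log(u/\alpha) \, d\alpha = 1, \tag{1} \]

one has

\[ \int_0^u \Psi(\alpha|x, u) \log \frac{q(\alpha|x)}{q(u|x)} \, d\alpha \approx \gamma(x). \tag{2} \]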

We propose to estimate $\gamma(x)$ by replacing in the previous approximation the conditional quantile function $q(\cdot|x)$ by a consistent estimator of this quantity. To this end, let $I\{\cdot\}$ denote the indicator function and, for any $h > 0$, let $B(x, h) := \{x' \in E \mid d(x, x') \le h\}$ denote the closed ball in $E$ with center $x$ and radius $h$. The total number of covariates belonging to the ball $B(x, h)$ is given by

\[ M(x, h) = \sum_{i=1}^n I\{X_i \in B(x, h)\}. \]

The conditional distribution function $F(\cdot|x)$ is estimated by:

\[ \widehat{F}_n(y|x, h_x) = \frac{1}{M(x, h_x)} \sum_{i=1}^n I\{Y_i \le y\} \, I\{X_i \in B(x, h_x)\}, \]

where $h_x = h_x(n)$ is a positive sequence converging to 0. The associated estimator of the conditional quantile function $q(\cdot|x)$ is then, for $\alpha \in (0,1)$,

\[ \widehat{q}_n(\alpha|x, h_x) = \widehat{F}_n^{\leftarrow}(1 - \alpha|x, h_x) = \inf\{y \in \mathbb{R} \mid \widehat{F}_n(y|x, h_x) \ge 1 - \alpha\}. \]
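The local empirical distribution function and its generalized inverse can be sketched in a few lines. This is a minimal illustration under stated assumptions: the covariate space is taken to be $E = \mathbb{R}$ with $d(x, x') = |x - x'|$, and the function names `local_responses`, `cond_cdf` and `cond_quantile` are hypothetical helpers, not notation from the paper.

```python
import math


def local_responses(xs, ys, x, h):
    """Sorted responses Y_i whose covariate X_i lies in the closed ball B(x, h).

    Assumes a real-valued covariate with d(x, x') = |x - x'|.
    """
    return sorted(y for xi, y in zip(xs, ys) if abs(xi - x) <= h)


def cond_cdf(y, local):
    """Local empirical distribution function: share of local responses <= y."""
    return sum(1 for v in local if v <= y) / len(local)


def cond_quantile(alpha, local):
    """Generalized inverse inf{y : cond_cdf(y) >= 1 - alpha}, for alpha in (0, 1)."""
    m = len(local)
    # Smallest rank j with j / m >= 1 - alpha; the small tolerance guards
    # against floating-point error when m * (1 - alpha) is an integer.
    j = max(1, math.ceil(m * (1.0 - alpha) - 1e-12))
    return local[j - 1]


# Toy data: the last observation is outside the ball B(0.5, 0.25).
xs = [0.25, 0.375, 0.5, 0.625, 0.75, 1.0]
ys = [5.0, 1.0, 4.0, 2.0, 3.0, 100.0]
local = local_responses(xs, ys, x=0.5, h=0.25)
print(len(local), cond_cdf(3.0, local), cond_quantile(0.2, local))
```

With $M$ local observations, `cond_quantile((i - 1) / M, local)` returns the $i$-th largest local response, which is what makes the integral estimator reduce to sums over order statistics.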

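Replacing $q(\cdot|x)$ by $\widehat{q}_n(\cdot|x, h_x)$ in (2), our class of estimators of $\gamma(x)$ is given for a $(0,1)$-valued measurable function $u_x$ converging to 0 at infinity by:

\[ \widehat{\gamma}(x, u_x, h_x) = \int_0^{U_x} \Psi(\alpha|x, U_x) \log \frac{\widehat{q}_n(\alpha|x, h_x)}{\widehat{q}_n(U_x|x, h_x)} \, d\alpha, \tag{3} \]

in which $U_x = u_x(M(x, h_x))$ and $\Psi(\cdot|x, u)$ is an integrable function on $(0, u)$ satisfying (1). The estimator $\widehat{\gamma}(x, u_x, h_x)$ is thus a weighted integral of an estimator of the conditional log-quantile function.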

We conclude this section by pointing out that particular choices of the function $\Psi(\cdot|x, u)$ actually yield generalizations of some well-known tail index estimators to the conditional framework. Let $k_x := U_x M(x, h_x)$. The choice $\Psi(\cdot|x, u) = u^{-1}$ yields:

\[ \widehat{\gamma}_H(x, u_x, h_x) = \frac{1}{k_x} \sum_{i=1}^{\lfloor k_x \rfloor} \log \frac{\widehat{q}_n((i-1)/M(x, h_x)|x, h_x)}{\widehat{q}_n(k_x/M(x, h_x)|x, h_x)}, \tag{4} \]

which is the straightforward adaptation of the classical Hill estimator (see Hill [24]). Similarly, letting $\Psi(\cdot|x, u) = u^{-1}(\log(u/\cdot) - 1)$ entails, after some algebra:

\[ \widehat{\gamma}_Z(x, u_x, h_x) = \frac{1}{k_x} \sum_{i=1}^{\lfloor k_x \rfloor} \log\left(\frac{k_x}{i}\right) i \log \frac{\widehat{q}_n((i-1)/M(x, h_x)|x, h_x)}{\widehat{q}_n(i/M(x, h_x)|x, h_x)}. \]

This estimator can be seen as a generalization of the Zipf estimator (see Kratz and Resnick [26], Schultze and Steinebach [29]).
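Both special cases can be computed as weighted sums of log-spacings of the largest local responses. The sketch below is an illustration under stated assumptions: $k_x$ is taken to be an integer $k$ smaller than the local sample size, the Hill-type sum is rewritten by summation by parts as $(1/k) \sum_{i=1}^k i \log(Y_{(M-i+1)}/Y_{(M-i)})$, and the names `weighted_tail_index`, `hill_weight` and `zipf_weight` are hypothetical. The simulated Pareto sample stands in for the responses whose covariates fall in the ball; it is not data from the paper.

```python
import math
import random


def weighted_tail_index(local_sorted, k, weight):
    """Compute (1/k) * sum_{i=1}^{k} i * weight(i/k) * log(Y_(m-i+1) / Y_(m-i)).

    local_sorted: positive local responses in increasing order (length m > k).
    weight: function on (0, 1]; 1 gives the Hill form, log(1/t) the Zipf form.
    """
    m = len(local_sorted)
    total = 0.0
    for i in range(1, k + 1):
        upper = local_sorted[m - i]       # i-th largest response
        lower = local_sorted[m - i - 1]   # (i+1)-th largest response
        total += i * weight(i / k) * math.log(upper / lower)
    return total / k


def hill_weight(t):
    return 1.0


def zipf_weight(t):
    return math.log(1.0 / t)


# Illustration on simulated data: a Pareto sample with tail index 0.5.
random.seed(0)
gamma_true = 0.5
sample = sorted(random.paretovariate(1.0 / gamma_true) for _ in range(2000))
hill = weighted_tail_index(sample, 200, hill_weight)
zipf = weighted_tail_index(sample, 200, zipf_weight)
print(f"Hill-type: {hill:.3f}  Zipf-type: {zipf:.3f}  target: {gamma_true}")
```

Passing the weight as a function keeps the two estimators in one code path, so other weights can be tried without touching the summation.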

3 Asymptotic properties

3.1 Main results

We start by stating the weak consistency of the estimator (3). To this end, an additional hypothesis is required.

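(A1) The function $\Psi(\cdot|x, u)$ satisfies:

\[ \limsup_{u \downarrow 0} \int_0^u |\Psi(\alpha|x, u)| \, d\alpha < \infty, \]

and for all $u \in (0,1)$ and $\beta \in (0, u]$,

\[ \frac{u}{\beta} \int_0^\beta \Psi(\alpha|x, u) \, d\alpha = \Phi(\beta/u|x), \]

where $\Phi(\cdot|x)$ is a square-integrable nonincreasing probability density function on $(0,1)$.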

Note that condition (A1) is satisfied by the two functions $\Psi(\cdot|x, u) = u^{-1}$ and $\Psi(\cdot|x, u) = u^{-1}(\log(u/\cdot) - 1)$ with $\Phi(\cdot|x) = 1$ and $\Phi(\cdot|x) = -\log(\cdot)$ respectively. We also assume in all that follows that $q(\cdot|x)$ is continuous and decreasing. Particular consequences of this condition include that $F(q(\alpha|x)|x) = 1 - \alpha$ for any $\alpha \in (0,1)$ and that given $X = x$, $Y$ has an absolutely continuous distribution with probability density function $f(\cdot|x)$. For $0 < \alpha_1 < \alpha_2 < 1$, we finally introduce the quantity:

\[ \omega(\alpha_1, \alpha_2, x, h_x) = \sup_{\alpha \in [\alpha_1, \alpha_2]} \; \sup_{x' \in B(x, h_x)} \left| \log \frac{q(\alpha|x')}{q(\alpha|x)} \right|, \]

which is the uniform oscillation of the log-quantile function in its second argument. Such a quantity is also studied in Gardes and Stupfler [19], for instance. Letting $m_x(h_x) = n P(X \in B(x, h_x))$ be the average number of covariates which belong to $B(x, h_x)$, the weak consistency of our family of estimators is established in the following theorem.

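Theorem 1. Assume that conditions (M1) and (A1) are satisfied. Assume further that $m_x(h_x) \to \infty$ as $n \to \infty$ and that $u_x \in \mathcal{RV}_{-a(x)}$ with $a(x) \in (0,1)$. If, for some $\delta > 0$,

\[ \omega\left([m_x(h_x)]^{-1-\delta}, \, 1 - [m_x(h_x)]^{-1-\delta}, \, x, \, h_x\right) \to 0, \tag{5} \]

then it holds that $\widehat{\gamma}(x, u_x, h_x) \xrightarrow{P} \gamma(x)$ as $n \to \infty$.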

Note that $u_x(m_x(h_x)) \, m_x(h_x) \to \infty$ is the average number of observations used to compute our estimator of $\gamma(x)$. The conditions in Theorem 1 are thus analogues of the classical hypotheses in the estimation of the tail index. Besides, condition (5) ensures that the distribution of $Y$ given $X = x'$ is close enough to that of $Y$ given $X = x$ when $x'$ is in a sufficiently small neighborhood of $x$.

Our aim is now to establish an asymptotic normality result. First, recall that under (M1), the conditional quantile function may be written as follows:

\[ \forall t > 1, \quad q(t^{-1}|x) = c(t|x) \exp\left( \int_1^t \frac{\Delta(v|x) + \gamma(x)}{v} \, dv \right), \]

where $c(\cdot|x)$ is a positive function converging to a positive constant at infinity and $\Delta(\cdot|x)$ is a measurable function converging to 0 at infinity, see Bingham et al. [5, Theorem 1.3.1]. We introduce the following classical second-order condition:

(M2) Condition (M1) holds, $c(\cdot|x)$ is a constant function equal to $c(x) > 0$, the function $\Delta(\cdot|x)$ has ultimately constant sign at infinity and $|\Delta(\cdot|x)| \in \mathcal{RV}_{\rho(x)}$, with $\rho(x) < 0$.

In condition (M2), $\rho(x)$ is called the conditional second-order parameter of the distribution. This condition is commonly used when studying tail index estimators and makes it possible to control the asymptotic bias of the estimator $\widehat{\gamma}(x, u_x, h_x)$. We also introduce a further assumption on the weighting function $\Phi(\cdot|x)$, which is similar in spirit to a condition introduced in Beirlant et al. [1]. To write down this condition, we note that if (A1) holds then

\[ \forall \beta \in (0,1), \quad 0 \le \beta \Phi(\beta|x) \le \int_0^{\beta/2} |\Psi(\alpha|x, 1/2)| \, d\alpha \]

and the right-hand side converges to 0 as $\beta \to 0$, so that we may extend the definition of the map $t \mapsto t\Phi(t|x)$ by saying it is 0 at $t = 0$.

(A2) Condition (A1) holds, there is $\kappa > 0$ such that $\Phi^{2+\kappa}(\cdot|x)$ is integrable on $(0,1)$ and there exists a positive function $g(\cdot|x)$, which is either continuous on $[0,1]$ or nonincreasing on $(0,1)$, such that for any $k > 1$ and $i \in [1, k)$,

\[ |i \Phi(i/k|x) - (i-1) \Phi((i-1)/k|x)| \le g(i/k|x), \]

where the function $g(\cdot|x) \max(\log(1/\cdot), 1)$ is integrable on $(0,1)$.

Note that condition (A2) is satisfied for instance by the functions $\Psi(\cdot|x, u) = u^{-1}$ and $\Psi(\cdot|x, u) = u^{-1}(\log(u/\cdot) - 1)$ mentioned at the end of Section 2 with $g(\cdot|x) = 1$ for the first one and, for the second one, $g(\cdot|x) = -\log(\cdot) + 1$. Our asymptotic normality result is the following:

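Theorem 2. Assume that conditions (M2) and (A2) are satisfied. Assume further that $m_x(h_x) \to \infty$ as $n \to \infty$, that $u_x \in \mathcal{RV}_{-a(x)}$ with $a(x) \in (0,1)$ and $(z u_x(z))^{1/2} \Delta(1/u_x(z)|x) \to \lambda(x) \in \mathbb{R}$ as $z \to \infty$. If for some $\delta > 0$,

\[ v_x^{1/2} \, \omega\left([m_x(h_x)]^{-1-\delta}, \, 1 - [m_x(h_x)]^{-1-\delta}, \, x, \, h_x\right) \to 0 \tag{6} \]

where $v_x = m_x(h_x) u_x(m_x(h_x))$, then it holds that

\[ v_x^{1/2} \left( \widehat{\gamma}(x, u_x, h_x) - \gamma(x) \right) \xrightarrow{d} \mathcal{N}\left( \lambda(x) AB_x(\Phi, \rho(x)), \; \gamma^2(x) AV_x(\Phi) \right) \]

as $n \to \infty$, with

\[ AB_x(\Phi, \rho(x)) = \int_0^1 \Phi(\alpha|x) \alpha^{-\rho(x)} \, d\alpha \quad \text{and} \quad AV_x(\Phi) = \int_0^1 \Phi^2(\alpha|x) \, d\alpha. \]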
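The two asymptotic constants are easy to evaluate for the weights $\Phi = 1$ (Hill-type) and $\Phi = -\log(\cdot)$ (Zipf-type). A direct computation gives $AB = 1/(1-\rho)$ and $AV = 1$ in the first case, and $AB = 1/(1-\rho)^2$ and $AV = 2$ in the second. The sketch below checks these values numerically; the midpoint quadrature and the function names are assumptions made for this illustration, not part of the paper.

```python
import math


def midpoint_integral(func, n_steps=100_000):
    """Midpoint rule on (0, 1); never evaluates func at 0, where Phi may blow up."""
    step = 1.0 / n_steps
    return step * sum(func((j + 0.5) * step) for j in range(n_steps))


def asymptotic_bias(phi, rho):
    """Numerical value of the integral of phi(a) * a ** (-rho) over (0, 1)."""
    return midpoint_integral(lambda a: phi(a) * a ** (-rho))


def asymptotic_variance(phi):
    """Numerical value of the integral of phi(a) ** 2 over (0, 1)."""
    return midpoint_integral(lambda a: phi(a) ** 2)


def hill_phi(t):
    return 1.0


def zipf_phi(t):
    return -math.log(t)


rho = -1.0  # example value of the second-order parameter (an assumption)
for name, phi in (("Hill", hill_phi), ("Zipf", zipf_phi)):
    ab = asymptotic_bias(phi, rho)
    av = asymptotic_variance(phi)
    print(f"{name}: AB = {ab:.4f}, AV = {av:.4f}")
```

For this value of $\rho$, the Zipf-type weight halves the bias constant relative to the Hill-type weight while doubling the variance constant.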

Our asymptotic normality result thus holds under generalizations of the common hypotheses on the model and on $u_x$ and $h_x$, provided the conditional distributions of $Y$ at two neighboring points are sufficiently close.

We conclude this paragraph by noting that these results are similar in spirit to results obtained in the literature for other conditional tail index or conditional extreme-value index estimators, see e.g. Gardes and Stupfler [19] and Stupfler [31]. The main disadvantage of formulating the hypotheses in terms of the uniform oscillation $\omega$ is that they cannot immediately be translated in terms of conditions on $u_x$ and $h_x$. In our next paragraph, we give alternative, simple conditions for our main results to hold.

3.2 Discussion of the hypotheses

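As a starting point, we note that if $X$ has a probability density function $f$ with respect to the Lebesgue measure on $E = \mathbb{R}^d$ equipped with the Euclidean norm $\|\cdot\|$ then sufficient conditions for $m_x(h_x) \to \infty$ are that $h_x \to 0$, $n h_x^d \to \infty$, $f(x) > 0$ and $f$ is continuous at $x$. Indeed, in this case, if $V$ denotes the volume of the unit ball of $\mathbb{R}^d$, a change of variables entails:

\[ m_x(h_x) = n \int_{B(x, h_x)} f(s) \, ds = n h_x^d f(x) \left( V + \int_{\|v\| \le 1} \left[ \frac{f(x + h_x v)}{f(x)} - 1 \right] dv \right). \]

Since $f$ is continuous at $x$, we get $m_x(h_x) = n h_x^d V f(x)(1 + o(1)) \to \infty$. Furthermore, we point out that if the functions $\gamma$, $\log c(t|\cdot)$ and $\Delta(t|\cdot)$ satisfy a Hölder condition, namely:

\[ \sup_{x' \in B(x, h_x)} |\gamma(x') - \gamma(x)| = O(h_x^\beta), \qquad \sup_{t^{-1} \in K_{x,\delta}(h_x)} \; \sup_{x' \in B(x, h_x)} |\log c(t|x') - \log c(t|x)| = O(h_x^\beta) \]

and

\[ \sup_{t^{-1} \in K_{x,\delta}(h_x)} \; \sup_{x' \in B(x, h_x)} |\Delta(t|x') - \Delta(t|x)| = O(h_x^\beta), \]

where $\beta > 0$ and $K_{x,\delta}(h_x)$ is the interval $[(m_x(h_x))^{-1-\delta}, \, 1 - (m_x(h_x))^{-1-\delta}]$, then (5) is a consequence of the convergence $h_x^\beta \log m_x(h_x) \to 0$. In the aforementioned context when $X$ has a probability density function, this condition becomes $h_x^\beta \log n \to 0$ as $n \to \infty$. Such conditions were already considered in Stupfler [31].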

As an illustration, we now compute the optimal rate of convergence of our estimator when $E = \mathbb{R}^d$ and $X$ has a probability density function. Let $a(x) \in (0,1)$ and $b(x) \in (0, 1/d)$. We take $\log(h_x) = -b(x) \log(n)$ and $\log(n u_x(n)) = (1 - a(x)) \log(n)$. In this context, the rate of convergence of the estimator is essentially $(m_x(h_x) u_x(m_x(h_x)))^{1/2} = n^{(1 - d b(x))(1 - a(x))/2}$. Besides, since $\Delta(\cdot|x)$ is regularly varying with index $\rho(x) < 0$, the conditions for Theorem 2 to hold are then essentially:

\[ 1 - a(x) + 2 a(x) \rho(x) \le 0 \quad \text{and} \quad (1 - d b(x))(1 - a(x)) - 2 \beta b(x) \le 0. \]

The problem thus amounts to maximizing the function $(a, b) \mapsto (1 - db)(1 - a)$ under these conditions. The solution is:

\[ (a^*(x), b^*(x)) = \left( \frac{1}{1 - 2\rho(x)}, \; \frac{\rho(x)}{d\rho(x) + \beta(2\rho(x) - 1)} \right), \]

which yields the optimal rate of convergence $n^{\beta\rho(x)/(d\rho(x) + \beta(2\rho(x) - 1))}$. Note that setting $d = 0$, i.e. considering the case when there is no covariate, we recover the optimal rate of convergence of the Hill estimator, see e.g. de Haan and Ferreira [22].
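The closed-form solution and the resulting rate exponent can be evaluated for given values of the second-order parameter, the Hölder exponent and the covariate dimension. The sketch below is illustrative only: the function name `optimal_exponents` and the example values $\rho = -1$, $\beta = 1$, $d = 1$ are assumptions. It also checks that the stated exponent agrees with $(1 - db)(1 - a)/2$ evaluated at the solution.

```python
def optimal_exponents(rho, beta, d):
    """Return (a, b, rate_exponent) from the closed-form solution in the text.

    rho  : conditional second-order parameter, rho < 0
    beta : Hölder exponent, beta > 0
    d    : dimension of the covariate (d = 0 means no covariate)
    """
    denom = d * rho + beta * (2.0 * rho - 1.0)
    a = 1.0 / (1.0 - 2.0 * rho)
    b = rho / denom
    rate_exponent = beta * rho / denom
    return a, b, rate_exponent


rho, beta, d = -1.0, 1.0, 1
a, b, rate = optimal_exponents(rho, beta, d)

# The rate exponent should agree with (1 - d*b) * (1 - a) / 2.
from_rate_formula = (1.0 - d * b) * (1.0 - a) / 2.0
# The bias constraint 1 - a + 2*a*rho <= 0 should be active at the optimum.
bias_constraint = 1.0 - a + 2.0 * a * rho

# With d = 0, the exponent reduces to rho / (2*rho - 1).
_, _, rate_no_covariate = optimal_exponents(rho, beta, 0)

print(f"a = {a:.4f}, b = {b:.4f}, rate exponent = {rate:.4f}")
print(f"no-covariate exponent = {rate_no_covariate:.4f}")
```

Comparing `rate` with `rate_no_covariate` shows how much the rate deteriorates as soon as a covariate is present, for a given smoothness $\beta$.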

4 Simulation study

We examine the behavior of our estimator on several finite-sample situations. To make it easier to showcase our results, we focus on the case $E = [0,1]$
