HAL Id: hal-01074694
https://hal.archives-ouvertes.fr/hal-01074694
Preprint submitted on 15 Oct 2014
Estimating the conditional tail index with an integrated conditional log-quantile estimator in the random
covariate case
Laurent Gardes, Gilles Stupfler
To cite this version:
Laurent Gardes, Gilles Stupfler. Estimating the conditional tail index with an integrated conditional log-quantile estimator in the random covariate case. 2014. ⟨hal-01074694⟩
Laurent Gardes (1) & Gilles Stupfler (2)
(1) Université de Strasbourg & CNRS, IRMA, UMR 7501, 7 rue René Descartes, 67084 Strasbourg Cedex, France
(2) Aix Marseille Université, CNRS, EHESS, Centrale Marseille, GREQAM UMR 7316, 13002 Marseille, France
Abstract. It is well known that the tail behavior of a heavy-tailed distribution is controlled by a parameter called the tail index. Such a parameter is therefore of primary interest in extreme value analysis, particularly to estimate extreme quantiles. In various applications, the random variable of interest can be linked to a finite-dimensional random covariate. In such a situation, the tail index is a function of the covariate and is referred to as the conditional tail index. The goal of this paper is to provide a class of estimators of this quantity. The pointwise weak consistency and asymptotic normality of these estimators are established. We illustrate the finite sample performance of our technique on a simulation study and on a real hurricane data set.
AMS Subject Classifications: 62G05, 62G20, 62G30, 62G32.
Keywords: Heavy-tailed distribution, tail index, random covariate, consistency, asymptotic normality.
1 Introduction
Studying extreme events is relevant in numerous fields of statistical applications. In hydrology for example, it is of interest to estimate the maximum level reached by seawater along a coast over a given period, or to study extreme rainfall at a given location; in actuarial science, a major problem for an insurance firm is to estimate the probability that a claim so large that it represents a threat to its solvency is filed. A particular branch of extreme value analysis focuses on the study of heavy-tailed random variables, that is, those random variables whose distribution function $F$ is such that, for all $\lambda > 0$, $(1-F(\lambda x))/(1-F(x)) \to \lambda^{-1/\gamma}$ as $x$ goes to infinity, where $\gamma > 0$ is the so-called tail index. The parameter $\gamma$ drives the asymptotic behavior of $F$ in its right tail, and its estimation has therefore been extensively studied in the literature. Recent overviews on univariate tail index estimation can be found in Beirlant et al. [2] and de Haan and Ferreira [22].
In practical applications, the variable of interest $Y$ can often be linked to a covariate $X$. For instance, the value of rainfall at a given location depends on its geographical coordinates; in actuarial science, the claim size depends on the sum insured by the policy. In this situation, the tail index of the random variable $Y$ given $X = x$ is a function of $x$ to which we shall refer as the conditional tail index. Its estimation was first considered in the fixed design case, namely when the covariates are nonrandom. Smith [30] and Davison and Smith [12] considered a regression model while Hall and Tajvidi [23] used a semi-parametric approach to estimate the conditional tail index. Fully nonparametric methods have been developed using splines (see Chavez-Demoulin and Davison [7]), local polynomials (see Davison and Ramesh [11]), a moving window approach (see Gardes and Girard [15]), a nearest neighbor approach (see Gardes and Girard [16]), and a conditional quantile-based technique (see Gardes et al. [18]), among others.
Despite its great interest in practice, the study of the random covariate case has been initiated only recently. We refer to the works of Wang and Tsai [32], based on a maximum likelihood approach, Daouia et al. [9] who used a fixed number of nonparametric conditional quantile estimators to estimate the conditional tail index, later generalized in Daouia et al. [10] to a regression context with conditional response distributions belonging to the general max-domain of attraction, Gardes and Girard [17] who introduced a local generalized Pickands-type estimator (see Pickands [27]), Goegebeur et al. [20] who studied a nonparametric regression estimator whose strong uniform properties are examined in Goegebeur et al. [21], Stupfler [31] who introduced a generalization of the popular moment estimator of Dekkers et al. [13] and Gardes and Stupfler [19] who worked on a smoothed local Hill estimator (see Hill [24]) related to the work of Resnick and Stărică [28].
The aim of this paper is to introduce an estimator of the conditional tail index based on the integration of a conditional log-quantile estimator. This type of estimator is similar to the one of Gardes and Girard [15]; our aim is to prove its consistency and asymptotic normality when the covariates are random, as well as to examine its applicability on numerical examples and on real data. Our paper is organized as follows: we define our conditional tail index estimator in Section 2, its asymptotic properties are stated in Section 3, a simulation study is provided in Section 4 and we showcase our estimator on a set of real hurricane data in Section 5. We offer a couple of concluding remarks in Section 6. All the auxiliary results and proofs are deferred to the Appendix.
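We let $(X_1, Y_1), \ldots, (X_n, Y_n)$ be $n$ independent copies of a random pair $(X, Y) \in E \times \mathbb{R}_+$, where $(E, d)$ is a metric space. Let $y \mapsto F(y|x) := P(Y \le y \,|\, X = x)$ denote the conditional distribution function of $Y$ given $X = x$. We assume that for any $x \in E$, the conditional survival function $1 - F(\cdot|x)$ belongs to the set $\mathcal{RV}_{-1/\gamma(x)}$ of regularly varying functions (at infinity) of index $-1/\gamma(x) < 0$. Recall that a function $G$ belongs to $\mathcal{RV}_a$, $a \in \mathbb{R}$, if $G$ is nonnegative and for all $\lambda > 0$, $G(\lambda y)/G(y) \to \lambda^a$ as $y$ goes to infinity. This is the adaptation of the standard extreme-value framework to the case when there is a covariate. An equivalent assumption (see Bingham et al. [5, Proposition 1.5.15]) is:

(M1) For any $x \in E$, the conditional quantile function $\alpha \mapsto q(\alpha|x) := F^{\leftarrow}(1-\alpha|x) = \inf\{y \in \mathbb{R} \,|\, F(y|x) \ge 1-\alpha\}$ belongs to $\mathcal{RV}_{-\gamma(x)}$, the regular variation being understood as $\alpha$ goes to 0.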
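As a quick numerical illustration of (M1), the following Python sketch checks that the ratio $q(\lambda\alpha|x)/q(\alpha|x)$ approaches $\lambda^{-\gamma(x)}$ as $\alpha$ goes to 0. The tail index function `gamma_fn` and the Pareto-type form chosen for `q` are hypothetical and serve only as an example; they do not come from the paper.

```python
import math

def gamma_fn(x):
    """Hypothetical conditional tail index, chosen for illustration only."""
    return 0.5 + 0.25 * math.sin(2.0 * math.pi * x)

def q(alpha, x):
    """Hypothetical conditional quantile function: the Pareto quantile
    alpha**(-gamma(x)) times a factor tending to 1 as alpha goes to 0.
    It is decreasing in alpha on (0, 1)."""
    return alpha ** (-gamma_fn(x)) * (1.0 - alpha / 2.0)

def rv_ratio(lam, alpha, x):
    """Ratio q(lam * alpha | x) / q(alpha | x); under (M1) it tends to
    lam**(-gamma(x)) as alpha goes to 0."""
    return q(lam * alpha, x) / q(alpha, x)

if __name__ == "__main__":
    x, lam = 0.3, 2.0
    limit = lam ** (-gamma_fn(x))
    for alpha in (1e-1, 1e-3, 1e-6):
        print(alpha, rv_ratio(lam, alpha, x), limit)
```

The printed ratios get closer to the limit as `alpha` decreases, which is the behavior that the approximation used in the next paragraph relies on.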
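Our goal is to estimate the conditional tail index $\gamma$ at a point $x \in E$. Remark first that, under (M1), for $u \in (0,1)$ small enough and $\alpha \in (0, u)$, $\log[q(\alpha|x)/q(u|x)] \approx \gamma(x) \log(u/\alpha)$. Hence, for any measurable function $\Psi(\cdot|x, u)$ on $(0, u)$ such that
\[
\int_0^u \Psi(\alpha|x, u) \log(u/\alpha) \, d\alpha = 1, \tag{1}
\]
one has
\[
\int_0^u \Psi(\alpha|x, u) \log \frac{q(\alpha|x)}{q(u|x)} \, d\alpha \approx \gamma(x). \tag{2}
\]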
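The normalization (1) and the approximation (2) can be checked numerically. The sketch below does so for the two weight functions $\Psi(\alpha|x,u) = u^{-1}$ and $\Psi(\alpha|x,u) = u^{-1}(\log(u/\alpha) - 1)$, using a midpoint rule and the hypothetical exact Pareto quantile $q(\alpha|x) = \alpha^{-\gamma}$, for which (2) holds with equality. The function names, the integration scheme and the choice of quantile function are assumptions made for this illustration.

```python
import math

def integrate(func, lower, upper, steps=100_000):
    """Midpoint rule on (lower, upper); the endpoints are never evaluated,
    which avoids the logarithmic singularity at alpha = 0."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (j + 0.5) * width) for j in range(steps))

def psi_hill(alpha, u):
    """Weight function Psi(alpha | x, u) = 1 / u."""
    return 1.0 / u

def psi_zipf(alpha, u):
    """Weight function Psi(alpha | x, u) = (log(u / alpha) - 1) / u."""
    return (math.log(u / alpha) - 1.0) / u

def normalization(psi, u):
    """Left-hand side of (1); it should be close to 1."""
    return integrate(lambda a: psi(a, u) * math.log(u / a), 0.0, u)

def weighted_log_quantile(psi, u, gamma):
    """Left-hand side of (2) for the hypothetical quantile function
    q(alpha) = alpha**(-gamma), so that log(q(alpha)/q(u)) = gamma*log(u/alpha)."""
    return integrate(lambda a: psi(a, u) * gamma * math.log(u / a), 0.0, u)

if __name__ == "__main__":
    for psi in (psi_hill, psi_zipf):
        print(psi.__name__, normalization(psi, 0.05),
              weighted_log_quantile(psi, 0.05, 0.5))
```

Both weight functions integrate to approximately 1 against $\log(u/\alpha)$, and both weighted integrals return approximately the value of `gamma` passed in.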
We propose to estimate $\gamma(x)$ by replacing in the previous approximation the conditional quantile function $q(\cdot|x)$ by a consistent estimator of this quantity. To this end, let $I\{\cdot\}$ denote the indicator function and, for any $h > 0$, let $B(x, h) := \{x' \in E \,|\, d(x, x') \le h\}$ denote the closed ball in $E$ with center $x$ and radius $h$. The total number of covariates belonging to the ball $B(x, h)$ is given by
\[
M(x, h) = \sum_{i=1}^n I\{X_i \in B(x, h)\}.
\]
The conditional distribution function $F(\cdot|x)$ is estimated by:
\[
\widehat{F}_n(y|x, h_x) = \frac{1}{M(x, h_x)} \sum_{i=1}^n I\{Y_i \le y\} \, I\{X_i \in B(x, h_x)\},
\]
where $h_x = h_x(n)$ is a positive sequence converging to 0. The associated estimator of the conditional quantile function $q(\cdot|x)$ is then, for $\alpha \in (0,1)$,
\[
\widehat{q}_n(\alpha|x, h_x) = \widehat{F}_n^{\leftarrow}(1-\alpha|x, h_x) = \inf\{y \in \mathbb{R} \,|\, \widehat{F}_n(y|x, h_x) \ge 1-\alpha\}.
\]
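A minimal Python sketch of these two estimators is given below for a real-valued covariate with $d(x, x') = |x - x'|$. The function names, the default distance and the small round-off guard are assumptions of this sketch, and the ball is assumed to contain at least one observation.

```python
import bisect
import math

def responses_in_ball(xs, ys, x, h, dist=lambda s, t: abs(s - t)):
    """Sorted responses Y_i whose covariate X_i lies in the closed ball B(x, h)."""
    return sorted(y for xi, y in zip(xs, ys) if dist(xi, x) <= h)

def cdf_hat(sorted_ys, y):
    """Empirical conditional distribution function: proportion of the
    M(x, h) responses in the ball that are <= y."""
    return bisect.bisect_right(sorted_ys, y) / len(sorted_ys)

def quantile_hat(sorted_ys, alpha):
    """Generalized inverse inf{y : cdf_hat(y) >= 1 - alpha} for alpha in (0, 1),
    i.e. the order statistic of rank ceil(M * (1 - alpha))."""
    m = len(sorted_ys)
    rank = math.ceil(m * (1.0 - alpha) - 1e-12)  # guard against round-off
    return sorted_ys[max(rank, 1) - 1]

if __name__ == "__main__":
    xs = [0.10, 0.45, 0.50, 0.55, 0.90, 0.52]
    ys = [10.0, 3.0, 1.0, 4.0, 20.0, 2.0]
    local = responses_in_ball(xs, ys, x=0.5, h=0.1)
    print(local, cdf_hat(local, 2.5), quantile_hat(local, 0.25))
```

Sorting the responses once makes each evaluation of the estimated distribution function a binary search and each quantile a direct lookup.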
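Replacing $q(\cdot|x)$ by $\widehat{q}_n(\cdot|x, h_x)$ in (2), our class of estimators of $\gamma(x)$ is given, for a $(0,1)$-valued measurable function $u_x$ converging to 0 at infinity, by:
\[
\widehat{\gamma}(x, u_x, h_x) = \int_0^{U_x} \Psi(\alpha|x, U_x) \log \frac{\widehat{q}_n(\alpha|x, h_x)}{\widehat{q}_n(U_x|x, h_x)} \, d\alpha, \tag{3}
\]
in which $U_x = u_x(M(x, h_x))$ and $\Psi(\cdot|x, u)$ is an integrable function on $(0, u)$ satisfying (1). The estimator $\widehat{\gamma}(x, u_x, h_x)$ is thus a weighted integral of an estimator of the conditional log-quantile function.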
We conclude this section by pointing out that particular choices of the function $\Psi(\cdot|x, u)$ actually yield generalizations of some well-known tail index estimators to the conditional framework. Let $k_x := U_x M(x, h_x)$. The choice $\Psi(\cdot|x, u) = u^{-1}$ yields:
\[
\widehat{\gamma}_H(x, u_x, h_x) = \frac{1}{k_x} \sum_{i=1}^{\lfloor k_x \rfloor} \log \frac{\widehat{q}_n((i-1)/M(x, h_x)|x, h_x)}{\widehat{q}_n(k_x/M(x, h_x)|x, h_x)}, \tag{4}
\]
which is the straightforward adaptation of the classical Hill estimator (see Hill [24]). Similarly, letting $\Psi(\cdot|x, u) = u^{-1}(\log(u/\cdot) - 1)$ entails, after some algebra:
\[
\widehat{\gamma}_Z(x, u_x, h_x) = \frac{1}{k_x} \sum_{i=1}^{\lfloor k_x \rfloor} \log\left(\frac{k_x}{i}\right) \, i \, \log \frac{\widehat{q}_n((i-1)/M(x, h_x)|x, h_x)}{\widehat{q}_n(i/M(x, h_x)|x, h_x)}.
\]
This estimator can be seen as a generalization of the Zipf estimator (see Kratz and Resnick [26], Schultze and Steinebach [29]).
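The two estimators above can be sketched in Python as follows, under the simplifying assumption that $k_x$ is an integer with $1 \le k_x \le M(x, h_x) - 1$. The code uses the fact that, for an integer $j$, the empirical quantile at level $j/M(x, h_x)$ is the $(M(x, h_x) - j)$-th smallest response in the ball. The function names and the deterministic Pareto-type grid used in place of a random sample are assumptions of this sketch.

```python
import math

def q_hat(sorted_ys, j):
    """Empirical conditional quantile at level j / M (j = 0, ..., M - 1):
    the (M - j)-th smallest of the M responses found in the ball."""
    return sorted_ys[len(sorted_ys) - 1 - j]

def hill(sorted_ys, k):
    """Conditional Hill-type estimator (4) with an integer k."""
    threshold = q_hat(sorted_ys, k)
    return sum(math.log(q_hat(sorted_ys, i - 1) / threshold)
               for i in range(1, k + 1)) / k

def zipf(sorted_ys, k):
    """Conditional Zipf-type estimator with an integer k."""
    return sum(math.log(k / i) * i
               * math.log(q_hat(sorted_ys, i - 1) / q_hat(sorted_ys, i))
               for i in range(1, k + 1)) / k

def pareto_grid(m, gamma):
    """Deterministic stand-in for a sorted Pareto sample with tail index gamma:
    quantiles alpha**(-gamma) on a regular grid of levels (ascending order)."""
    return [((m - j + 0.5) / m) ** (-gamma) for j in range(1, m + 1)]

if __name__ == "__main__":
    local = pareto_grid(2000, 0.5)   # plays the role of the responses in B(x, h_x)
    print(hill(local, 200), zipf(local, 200))
```

On this idealized grid both estimates are close to the value 0.5 used to build it; with real data, `sorted_ys` would be the sorted responses whose covariates fall in $B(x, h_x)$.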
3 Asymptotic properties
3.1 Main results
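We start by stating the weak consistency of the estimator (3). To this end, an additional hypothesis is required.

(A1) The function $\Psi(\cdot|x, u)$ satisfies:
\[
\limsup_{u \downarrow 0} \int_0^u |\Psi(\alpha|x, u)| \, d\alpha < \infty,
\]
and for all $u \in (0,1)$ and $\beta \in (0, u]$,
\[
\frac{u}{\beta} \int_0^\beta \Psi(\alpha|x, u) \, d\alpha = \Phi(\beta/u|x),
\]
where $\Phi(\cdot|x)$ is a square-integrable nonincreasing probability density function on $(0,1)$.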
Note that condition (A1) is satisfied by the two functions $\Psi(\cdot|x, u) = u^{-1}$ and $\Psi(\cdot|x, u) = u^{-1}(\log(u/\cdot) - 1)$, with $\Phi(\cdot|x) = 1$ and $\Phi(\cdot|x) = -\log(\cdot)$ respectively.

We also assume in all that follows that $q(\cdot|x)$ is continuous and decreasing. Particular consequences of this condition include that $F(q(\alpha|x)|x) = 1 - \alpha$ for any $\alpha \in (0,1)$ and that given $X = x$, $Y$ has an absolutely continuous distribution with probability density function $f(\cdot|x)$. For $0 < \alpha_1 < \alpha_2 < 1$, we finally introduce the quantity:
\[
\omega(\alpha_1, \alpha_2, x, h_x) = \sup_{\alpha \in [\alpha_1, \alpha_2]} \, \sup_{x' \in B(x, h_x)} \left| \log \frac{q(\alpha|x')}{q(\alpha|x)} \right|,
\]
which is the uniform oscillation of the log-quantile function in its second argument. Such a quantity is also studied in Gardes and Stupfler [19], for instance. Letting $m_x(h_x) = n P(X \in B(x, h_x))$ be the average number of covariates which belong to $B(x, h_x)$, the weak consistency of our family of estimators is established in the following theorem.
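Theorem 1. Assume that conditions (M1) and (A1) are satisfied. Assume further that $m_x(h_x) \to \infty$ as $n \to \infty$ and that $u_x \in \mathcal{RV}_{-a(x)}$ with $a(x) \in (0,1)$. If, for some $\delta > 0$,
\[
\omega\left([m_x(h_x)]^{-1-\delta}, \, 1 - [m_x(h_x)]^{-1-\delta}, \, x, \, h_x\right) \to 0, \tag{5}
\]
then it holds that $\widehat{\gamma}(x, u_x, h_x) \xrightarrow{P} \gamma(x)$ as $n \to \infty$.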
Note that $u_x(m_x(h_x)) \, m_x(h_x) \to \infty$ is the average number of observations used to compute our estimator of $\gamma(x)$. The conditions in Theorem 1 are thus analogues of the classical hypotheses in the estimation of the tail index. Besides, condition (5) ensures that the distribution of $Y$ given $X = x'$ is close enough to that of $Y$ given $X = x$ when $x'$ is in a sufficiently small neighborhood of $x$.
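Our aim is now to establish an asymptotic normality result. First, recall that under (M1), the function $t \mapsto q(t^{-1}|x)$ is regularly varying at infinity with index $\gamma(x)$, so that the conditional quantile function may be written as follows:
\[
\forall t > 1, \quad q(t^{-1}|x) = c(t|x) \exp\left( \int_1^t \frac{\Delta(v|x) + \gamma(x)}{v} \, dv \right),
\]
where $c(\cdot|x)$ is a positive function converging to a positive constant at infinity and $\Delta(\cdot|x)$ is a measurable function converging to 0 at infinity, see Bingham et al. [5, Theorem 1.3.1]. We introduce the following classical second-order condition:

(M2) Condition (M1) holds, $c(\cdot|x)$ is a constant function equal to $c(x) > 0$, the function $\Delta(\cdot|x)$ has ultimately constant sign at infinity and $|\Delta(\cdot|x)| \in \mathcal{RV}_{\rho(x)}$, with $\rho(x) < 0$.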
In condition (M2), $\rho(x)$ is called the conditional second-order parameter of the distribution. This condition is commonly used when studying tail index estimators and makes it possible to control the asymptotic bias of the estimator $\widehat{\gamma}(x, u_x, h_x)$. We also introduce a further assumption on the weighting function $\Phi(\cdot|x)$, which is similar in spirit to a condition introduced in Beirlant et al. [1]. To write down this condition, we note that if (A1) holds then
\[
\forall \beta \in (0,1), \quad 0 \le \beta \Phi(\beta|x) \le \int_0^{\beta/2} |\Psi(\alpha|x, 1/2)| \, d\alpha
\]
and the right-hand side converges to 0 as $\beta \downarrow 0$, so that we may extend the definition of the map $t \mapsto t\Phi(t|x)$ by saying it is 0 at $t = 0$.
(A2) Condition (A1) holds, there is $\kappa > 0$ such that $\Phi^{2+\kappa}(\cdot|x)$ is integrable on $(0,1)$ and there exists a positive function $g(\cdot|x)$, which is either continuous on $[0,1]$ or nonincreasing on $(0,1)$, such that for any $k > 1$ and $i \in [1, k)$,
\[
|i \, \Phi(i/k|x) - (i-1) \, \Phi((i-1)/k|x)| \le g(i/k|x),
\]
where the function $g(\cdot|x) \max(\log(1/\cdot), 1)$ is integrable on $(0,1)$.
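The inequality in (A2) can be explored numerically. The Python sketch below evaluates the gap $g(i/k|x) - |i\Phi(i/k|x) - (i-1)\Phi((i-1)/k|x)|$ on a finite set of values of $k$ and $i$, for the pairs $(\Phi, g) = (1, 1)$ and $(\Phi, g) = (-\log(\cdot), 1 - \log(\cdot))$. The function names and the grid of values are assumptions of this sketch, and a finite grid check is of course only an illustration, not a proof.

```python
import math

def phi_hill(t):
    return 1.0

def phi_zipf(t):
    return -math.log(t)

def g_hill(t):
    return 1.0

def g_zipf(t):
    return 1.0 - math.log(t)

def t_phi(phi, t):
    """Map t -> t * Phi(t), extended by the value 0 at t = 0."""
    return 0.0 if t == 0.0 else t * phi(t)

def a2_gap(phi, g, k, i):
    """g(i/k) - |i Phi(i/k) - (i-1) Phi((i-1)/k)|; nonnegative when the
    inequality of (A2) holds at (k, i)."""
    increment = k * (t_phi(phi, i / k) - t_phi(phi, (i - 1) / k))
    return g(i / k) - abs(increment)

def min_a2_gap(phi, g, ks=(2, 10, 100, 1000)):
    """Smallest gap over integer and half-integer i in [1, k)."""
    gaps = []
    for k in ks:
        for twice_i in range(2, 2 * k):
            gaps.append(a2_gap(phi, g, k, twice_i / 2.0))
    return min(gaps)

if __name__ == "__main__":
    print(min_a2_gap(phi_hill, g_hill), min_a2_gap(phi_zipf, g_zipf))
```

For the constant weight the gap is zero up to round-off, and for the logarithmic weight it stays positive on the grid.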
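Note that condition (A2) is satisfied for instance by the functions $\Psi(\cdot|x, u) = u^{-1}$ and $\Psi(\cdot|x, u) = u^{-1}(\log(u/\cdot) - 1)$ mentioned at the end of Section 2, with $g(\cdot|x) = 1$ for the first one and, for the second one, $g(\cdot|x) = -\log(\cdot) + 1$. Our asymptotic normality result is the following:

Theorem 2. Assume that conditions (M2) and (A2) are satisfied. Assume further that $m_x(h_x) \to \infty$ as $n \to \infty$, that $u_x \in \mathcal{RV}_{-a(x)}$ with $a(x) \in (0,1)$ and $(z u_x(z))^{1/2} \Delta(1/u_x(z)|x) \to \lambda(x) \in \mathbb{R}$ as $z \to \infty$. If for some $\delta > 0$,
\[
v_x^{1/2} \, \omega\left([m_x(h_x)]^{-1-\delta}, \, 1 - [m_x(h_x)]^{-1-\delta}, \, x, \, h_x\right) \to 0 \tag{6}
\]
where $v_x = m_x(h_x) \, u_x(m_x(h_x))$, then it holds that
\[
v_x^{1/2} \left( \widehat{\gamma}(x, u_x, h_x) - \gamma(x) \right) \xrightarrow{d} \mathcal{N}\left( \lambda(x) \, AB_x(\Phi, \rho(x)), \; \gamma^2(x) \, AV_x(\Phi) \right)
\]
as $n \to \infty$, with
\[
AB_x(\Phi, \rho(x)) = \int_0^1 \Phi(\alpha|x) \, \alpha^{-\rho(x)} \, d\alpha \quad \text{and} \quad AV_x(\Phi) = \int_0^1 \Phi^2(\alpha|x) \, d\alpha.
\]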
Our asymptotic normality result thus holds under generalizations of the common hypotheses on the model and on $u_x$ and $h_x$, provided the conditional distributions of $Y$ at two neighboring points are sufficiently close.
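To make the constants of Theorem 2 concrete, the Python sketch below approximates $AB_x(\Phi, \rho(x))$ and $AV_x(\Phi)$ by a midpoint rule for $\Phi(\cdot|x) = 1$ and $\Phi(\cdot|x) = -\log(\cdot)$. Direct integration gives $AB_x = 1/(1-\rho(x))$ and $AV_x = 1$ in the first case, and $AB_x = 1/(1-\rho(x))^2$ and $AV_x = 2$ in the second. The function names, the integration scheme and the value of $\rho$ are assumptions of this sketch.

```python
import math

def integrate(func, steps=100_000):
    """Midpoint rule on (0, 1); the endpoints are never evaluated."""
    width = 1.0 / steps
    return width * sum(func((j + 0.5) * width) for j in range(steps))

def phi_hill(t):
    return 1.0

def phi_zipf(t):
    return -math.log(t)

def asymptotic_bias(phi, rho):
    """AB(Phi, rho): integral over (0, 1) of Phi(alpha) * alpha**(-rho)."""
    return integrate(lambda a: phi(a) * a ** (-rho))

def asymptotic_variance(phi):
    """AV(Phi): integral over (0, 1) of Phi(alpha)**2."""
    return integrate(lambda a: phi(a) ** 2)

if __name__ == "__main__":
    rho = -1.0
    for phi in (phi_hill, phi_zipf):
        print(phi.__name__, asymptotic_bias(phi, rho), asymptotic_variance(phi))
```

With these two weights, the logarithmic choice has twice the asymptotic variance factor of the constant one but a smaller bias factor, since $1/(1-\rho(x))^2 < 1/(1-\rho(x))$ when $\rho(x) < 0$.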
We conclude this paragraph by noting that these results are similar in spirit to results obtained in the literature for other conditional tail index or conditional extreme-value index estimators, see e.g. Gardes and Stupfler [19] and Stupfler [31]. The main disadvantage of formulating the hypotheses in terms of the uniform oscillation $\omega$ is that they cannot immediately be translated in terms of conditions on $u_x$ and $h_x$. In our next paragraph, we give alternative, simple conditions for our main results to hold.
3.2 Discussion of the hypotheses
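As a starting point, we note that if $X$ has a probability density function $f$ with respect to the Lebesgue measure on $E = \mathbb{R}^d$ equipped with the Euclidean norm $\|\cdot\|$, then sufficient conditions for $m_x(h_x) \to \infty$ are that $h_x \to 0$, $n h_x^d \to \infty$, $f(x) > 0$ and $f$ is continuous at $x$. Indeed, in this case, if $V$ denotes the volume of the unit ball of $\mathbb{R}^d$, a change of variables entails:
\[
m_x(h_x) = n \int_{B(x, h_x)} f(s) \, ds = n h_x^d f(x) \left( V + \int_{\|v\| \le 1} \left[ \frac{f(x + h_x v)}{f(x)} - 1 \right] dv \right).
\]
Since $f$ is continuous at $x$, we get $m_x(h_x) = n h_x^d V f(x)(1 + o(1)) \to \infty$. Furthermore, we point out that if the functions $\gamma$, $\log c(t|\cdot)$ and $\Delta(t|\cdot)$ satisfy a Hölder condition, namely:
\[
\sup_{x' \in B(x, h_x)} |\gamma(x') - \gamma(x)| = O(h_x^\beta), \qquad \sup_{t^{-1} \in K_{x,\delta}(h_x)} \, \sup_{x' \in B(x, h_x)} |\log c(t|x') - \log c(t|x)| = O(h_x^\beta)
\]
and
\[
\sup_{t^{-1} \in K_{x,\delta}(h_x)} \, \sup_{x' \in B(x, h_x)} |\Delta(t|x') - \Delta(t|x)| = O(h_x^\beta),
\]
where $\beta > 0$ and $K_{x,\delta}(h_x)$ is the interval $[(m_x(h_x))^{-1-\delta}, 1 - (m_x(h_x))^{-1-\delta}]$, then (5) is a consequence of the convergence $h_x^\beta \log m_x(h_x) \to 0$. In the aforementioned context when $X$ has a probability density function, this condition becomes $h_x^\beta \log n \to 0$ as $n \to \infty$. Such conditions were already considered in Stupfler [31].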
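The first-order approximation $m_x(h_x) \approx n h_x^d V f(x)$ and the condition $h_x^\beta \log n \to 0$ can be illustrated with a short Python sketch in dimension $d = 1$, where $V = 2$. The density $f(s) = 3s^2$ on $[0,1]$, the bandwidth $h_x = n^{-b}$ and the function names are hypothetical choices made for this example.

```python
import math

def expected_count(n, x, h):
    """m_x(h) = n * P(X in B(x, h)) for the hypothetical density f(s) = 3 s**2
    on [0, 1], assuming 0 <= x - h and x + h <= 1."""
    return n * ((x + h) ** 3 - (x - h) ** 3)

def first_order_count(n, x, h):
    """Approximation n * h**d * V * f(x) with d = 1 and V = 2."""
    return n * h * 2.0 * (3.0 * x ** 2)

def holder_term(n, b, beta):
    """h_x**beta * log(n) for the bandwidth h_x = n**(-b)."""
    return n ** (-b * beta) * math.log(n)

if __name__ == "__main__":
    x, b, beta = 0.5, 0.25, 1.0
    for n in (10 ** 4, 10 ** 8, 10 ** 12):
        h = n ** (-b)
        ratio = expected_count(n, x, h) / first_order_count(n, x, h)
        print(n, expected_count(n, x, h), ratio, holder_term(n, b, beta))
```

As $n$ grows, the expected number of covariates in the ball increases, the ratio to its first-order approximation approaches 1, and the Hölder term decreases towards 0.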
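As an illustration, we now compute the optimal rate of convergence of our estimator when $E = \mathbb{R}^d$ and $X$ has a probability density function. Let $a(x) \in (0,1)$ and $b(x) \in (0, 1/d)$. We take $\log(h_x) = -b(x) \log(n)$ and $\log(n u_x(n)) = (1 - a(x)) \log(n)$. In this context, the rate of convergence of the estimator is essentially $(m_x(h_x) \, u_x(m_x(h_x)))^{1/2} = n^{(1 - d b(x))(1 - a(x))/2}$. Besides, since $\Delta(\cdot|x)$ is regularly varying with index $\rho(x) < 0$, the conditions for Theorem 2 to hold are then essentially:
\[
1 - a(x) + 2 a(x) \rho(x) \le 0 \quad \text{and} \quad (1 - d b(x))(1 - a(x)) - 2 \beta b(x) \le 0,
\]
the first one controlling the bias term $(z u_x(z))^{1/2} \Delta(1/u_x(z)|x)$ and the second one corresponding to (6) under the Hölder conditions above. The problem thus amounts to maximizing the function $(a, b) \mapsto (1 - db)(1 - a)$ under these conditions. The solution is:
\[
(a^*(x), b^*(x)) = \left( \frac{1}{1 - 2\rho(x)}, \; \frac{\rho(x)}{d\rho(x) + \beta(2\rho(x) - 1)} \right),
\]
which yields the optimal rate of convergence $n^{\beta\rho(x)/(d\rho(x) + \beta(2\rho(x) - 1))}$. Note that setting $d = 0$, i.e. considering the case when there is no covariate, we recover the optimal rate of convergence of the Hill estimator, see e.g. de Haan and Ferreira [22].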
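The optimization above can be checked numerically. The Python sketch below takes the two constraints in the form $1 - a + 2a\rho \le 0$ and $(1 - db)(1 - a) - 2\beta b \le 0$, which is the form under which the stated solution makes both constraints active; this form, the values of $\rho$, $\beta$ and $d$, the function names and the brute-force grid search are all assumptions of this sketch.

```python
def optimal_exponents(rho, beta, d):
    """Closed-form pair (a*, b*) stated in the text."""
    a_star = 1.0 / (1.0 - 2.0 * rho)
    b_star = rho / (d * rho + beta * (2.0 * rho - 1.0))
    return a_star, b_star

def objective(a, b, d):
    """(1 - d b)(1 - a): twice the exponent of n in the rate of convergence."""
    return (1.0 - d * b) * (1.0 - a)

def is_feasible(a, b, rho, beta, d):
    """Bias constraint and proximity constraint (forms assumed in this sketch)."""
    bias_ok = 1.0 - a + 2.0 * a * rho <= 0.0
    proximity_ok = (1.0 - d * b) * (1.0 - a) - 2.0 * beta * b <= 0.0
    return bias_ok and proximity_ok

def grid_maximum(rho, beta, d, size=400):
    """Brute-force maximum of the objective over feasible grid points."""
    best = 0.0
    for i in range(1, size):
        a = i / size
        for j in range(1, size):
            b = j / size
            if d * b < 1.0 and is_feasible(a, b, rho, beta, d):
                best = max(best, objective(a, b, d))
    return best

if __name__ == "__main__":
    rho, beta, d = -1.0, 1.0, 1
    a_star, b_star = optimal_exponents(rho, beta, d)
    print(a_star, b_star, objective(a_star, b_star, d) / 2.0)
    print(beta * rho / (d * rho + beta * (2.0 * rho - 1.0)))
    print(grid_maximum(rho, beta, d))
```

For these values the closed-form solution is $(a^*, b^*) = (1/3, 1/4)$, the rate exponent is $1/4$, and the grid search does not find a feasible point with a larger objective.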
4 Simulation study
We examine the behavior of our estimator on several finite-sample situations. To make it easier to showcase our results, we focus on the case $E = [0,1]$