Extreme Lp-quantile kernel regression


HAL Id: hal-03182032

https://hal.archives-ouvertes.fr/hal-03182032

Submitted on 26 Mar 2021

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


Extreme Lp-quantile kernel regression

Stéphane Girard, Gilles Stupfler, Antoine Usseglio-Carleve

To cite this version:

Stéphane Girard, Gilles Stupfler, Antoine Usseglio-Carleve. Extreme Lp-quantile kernel regression. Advances in Contemporary Statistics and Econometrics, Springer, pp. 197-219, 2021, ⟨10.1007/978-3-030-73249-3_11⟩. ⟨hal-03182032⟩


Extreme $L^p$-quantile kernel regression

Stéphane Girard, Gilles Stupfler and Antoine Usseglio-Carleve

Abstract Quantiles are recognized tools for risk management and can be seen as minimizers of an $L^1$-loss function, but do not define coherent risk measures in general. Expectiles, meanwhile, are minimizers of an $L^2$-loss function and define coherent risk measures; they have started to be considered as good alternatives to quantiles in insurance and finance. Quantiles and expectiles belong to the wider family of $L^p$-quantiles. We propose here to construct kernel estimators of extreme conditional $L^p$-quantiles. We study their asymptotic properties in the context of conditional heavy-tailed distributions and we show through a simulation study that taking $p\in(1,2)$ may make it possible to recover extreme conditional quantiles and expectiles accurately. Our estimators are also showcased on a real insurance data set.

1 Introduction

The quantile, also called Value-at-Risk in actuarial and financial areas, is a widespread tool for risk measurement, due to its simplicity and interpretability: if $Y$ is a random variable with cumulative distribution function $F$, the quantile at level $\alpha\in(0,1)$ is defined as $q(\alpha)=\inf\{y\in\mathbb{R}\mid F(y)\ge\alpha\}$. As pointed out in [19], quantiles may also be seen as a solution of the following minimization problem:

$$q(\alpha)=\arg\min_{t\in\mathbb{R}}\mathbb{E}\left[\rho^{(1)}_\alpha(Y-t)-\rho^{(1)}_\alpha(Y)\right], \tag{1}$$

where $\rho^{(1)}_\alpha(y)=|\alpha-\mathbb{1}\{y\le0\}|\,|y|$ is the quantile check function. However, the quantile is not subadditive in general and so is not a coherent risk measure in the sense of [1].

An alternative risk measure gaining popularity is the expectile, introduced in [20]. This is the solution of (1), with the new loss function $\rho^{(2)}_\alpha(y)=|\alpha-\mathbb{1}\{y\le0\}|\,y^2$ in place of $\rho^{(1)}_\alpha$. Expectiles larger than the mean are coherent risk measures, and have started to be used in actuarial and financial practice (see for instance [3]). A pioneering paper for the estimation of extreme expectiles in heavy-tailed settings is [7].

Stéphane Girard, Univ. Grenoble-Alpes, Inria, CNRS, Grenoble INP, LJK, 38000 Grenoble, France, e-mail: [email protected]

Gilles Stupfler, Univ Rennes, Ensai, CNRS, CREST - UMR 9194, F-35000 Rennes, France, e-mail: [email protected]

Antoine Usseglio-Carleve, Univ. Grenoble-Alpes, Inria, CNRS, Grenoble INP, LJK, 38000 Grenoble, France, e-mail: [email protected]

Quantiles and expectiles may be generalized by considering the family of $L^p$-quantiles. Introduced in [4], this class of risk measures is defined, for all $p\ge1$, by:

$$q^{(p)}(\alpha)=\arg\min_{t\in\mathbb{R}}\mathbb{E}\left[\rho^{(p)}_\alpha(Y-t)-\rho^{(p)}_\alpha(Y)\right], \tag{2}$$

where $\rho^{(p)}_\alpha(y)=|\alpha-\mathbb{1}\{y\le0\}|\,|y|^p$ is the $L^p$-quantile loss function; the case $p=1$ leads to the quantile and $p=2$ gives the expectile. Note that, for $p>1$, using the formulation (2) and through the subtraction of the (at first sight unimportant) term $\rho^{(p)}_\alpha(Y)$, it is a straightforward consequence of the mean value theorem applied to the function $\rho^{(p)}_\alpha$ that the $L^p$-quantile $q^{(p)}(\alpha)$ is well-defined as soon as $\mathbb{E}(|Y|^{p-1})<\infty$. While the expectile is the only coherent $L^p$-quantile (see [2]), [8] showed that for extreme levels of quantiles or expectiles ($\alpha\to1$), it may be better to estimate $L^p$-quantiles first (where typically $p$ is between 1 and 2) and exploit an asymptotic proportionality relationship to estimate quantiles or expectiles. An overview of the potential applications of this kind of statistical assessment of extreme risk may for instance be found in [14].
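To make definition (2) concrete, here is a minimal sketch of an empirical, unconditional $L^p$-quantile for $p>1$. It is an illustration only: the function name `lp_quantile`, the bisection scheme and the tolerance are assumptions of this sketch, not part of the paper. It solves the first-order condition of the empirical version of (2), whose left-hand side is increasing in $t$.

```python
def lp_quantile(sample, alpha, p, tol=1e-10):
    """Empirical L^p-quantile of level alpha (p > 1), found by bisection.

    Hypothetical helper: solves the first-order condition of the empirical
    counterpart of (2),
        (1 - alpha) * sum_{y <= t} (t - y)^(p-1) = alpha * sum_{y > t} (y - t)^(p-1).
    """
    if not (0.0 < alpha < 1.0) or p <= 1.0:
        raise ValueError("need 0 < alpha < 1 and p > 1")

    def gradient(t):
        # Derivative of the empirical loss (up to the factor p); increasing in t.
        below = sum((t - y) ** (p - 1) for y in sample if y <= t)
        above = sum((y - t) ** (p - 1) for y in sample if y > t)
        return (1.0 - alpha) * below - alpha * above

    lo, hi = min(sample), max(sample)
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if gradient(mid) < 0.0:
            lo = mid  # the root lies to the right of mid
        else:
            hi = mid  # the root lies to the left of (or at) mid
    return 0.5 * (lo + hi)
```

With $p=2$ and $\alpha=1/2$ the condition reduces to $\sum_i (t-y_i)=0$, so the sketch returns the sample mean, in line with the expectile interpretation of the case $p=2$.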

The contribution of this work is to propose a methodology to estimate extreme $L^p$-quantiles of $Y|X=x$, where the random covariate vector $X\in\mathbb{R}^d$ is recorded alongside $Y$. In this context, the case $p=1$ (quantile) has been considered in [6] and [5], and the case $p=2$ (expectile) has recently been studied in [17]. For the general case $p\ge1$, only [22] proposes an estimation procedure under the strong assumption that the vector $(X,Y)$ is elliptically distributed. The present paper avoids this modeling assumption by constructing a kernel estimator.

The paper is organized as follows. Section 2 introduces an estimator of conditional $L^p$-quantiles. Section 3 gives the asymptotic properties of this estimator at extreme levels. Section 4 proposes a simulation study in order to assess the accuracy of our estimator, which is then showcased on a real insurance data set in Section 5. Proofs are postponed to the Appendix.

2 $L^p$-quantile kernel regression

Let $(X_i,Y_i)$, $i=1,\dots,n$, be independent realizations of a random vector $(X,Y)\in\mathbb{R}^d\times\mathbb{R}$. For the sake of simplicity we assume that $Y\ge0$ with probability 1. We denote by $g$ the density function of $X$ and let, in the sequel, $x$ be a fixed point in $\mathbb{R}^d$ such that $g(x)>0$. We denote by $\bar F^{(1)}(y|x)=\mathbb{P}(Y>y|X=x)$ the conditional survival function of $Y$ given $X=x$ and assume that this survival function is continuous and regularly varying with index $-1/\gamma(x)$:

$$\forall t>0,\quad \lim_{y\to\infty}\frac{\bar F^{(1)}(ty|x)}{\bar F^{(1)}(y|x)}=t^{-1/\gamma(x)}. \tag{3}$$

Such a distribution belongs to the Fréchet maximum domain of attraction [11]. Note that for any $k<1/\gamma(x)$, $\mathbb{E}\left[Y^k|X=x\right]<\infty$. Since the definition of $L^p$-quantiles in (2) requires $\mathbb{E}\left[|Y|^{p-1}|X=x\right]<\infty$, our minimal assumption will be that $p-1<1/\gamma(x)$. From Equation (2), $L^p$-quantiles of level $\alpha\in(0,1)$ of $Y$ given $X=x$ may also be seen as the solution of the following equation:

$$\frac{\mathbb{E}\left[|Y-y|^{p-1}\mathbb{1}\{Y>y\}|X=x\right]}{\mathbb{E}\left[|Y-y|^{p-1}|X=x\right]}=1-\alpha.$$
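The passage from (2) to this equation can be sketched as follows for $p>1$, assuming that differentiation under the expectation is permitted (a step not spelled out in the text). Writing the conditional loss and setting its derivative in $t$ to zero at $t=y$ gives:

```latex
\begin{aligned}
\frac{\partial}{\partial t}\,\mathbb{E}\left[\rho^{(p)}_\alpha(Y-t)\,\middle|\,X=x\right]
 &= p\,(1-\alpha)\,\mathbb{E}\left[|Y-t|^{p-1}\mathbb{1}\{Y\le t\}\,\middle|\,X=x\right]
  - p\,\alpha\,\mathbb{E}\left[|Y-t|^{p-1}\mathbb{1}\{Y>t\}\,\middle|\,X=x\right], \\
0 &= (1-\alpha)\,\mathbb{E}\left[|Y-y|^{p-1}\,\middle|\,X=x\right]
  - \mathbb{E}\left[|Y-y|^{p-1}\mathbb{1}\{Y>y\}\,\middle|\,X=x\right],
\end{aligned}
```

where the second line uses $\mathbb{1}\{Y\le y\}=1-\mathbb{1}\{Y>y\}$. Dividing by $\mathbb{E}\left[|Y-y|^{p-1}|X=x\right]$ yields the stated ratio equal to $1-\alpha$.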

In other terms, as noticed in [18], (conditional) $L^p$-quantiles can be equivalently defined as quantiles

$$q^{(p)}(\alpha|x)=\inf\left\{y\in\mathbb{R}\mid \bar F^{(p)}(y|x)\le1-\alpha\right\}$$

of the distribution associated with the survival function

$$\bar F^{(p)}(y|x)=\frac{\phi^{(p-1)}(y|x)}{m^{(p-1)}(y|x)},$$

where, for all $k\ge0$,

$$m^{(k)}(y|x)=\mathbb{E}\left[|Y-y|^k|X=x\right]g(x)\quad\text{and}\quad\phi^{(k)}(y|x)=\mathbb{E}\left[|Y-y|^k\mathbb{1}\{Y>y\}|X=x\right]g(x).$$

Obviously, if $p=1$, we get the survival function introduced above. The case $p=2$ leads to the function introduced in [18] and used in [17]. To estimate $\bar F^{(p)}(y|x)$, we let $K$ be a probability density function on $\mathbb{R}^d$ and we introduce the kernel estimators

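$$\hat m^{(k)}_n(y|x)=\frac{1}{nh_n^d}\sum_{i=1}^n|Y_i-y|^kK\left(\frac{x-X_i}{h_n}\right),\qquad \hat\phi^{(k)}_n(y|x)=\frac{1}{nh_n^d}\sum_{i=1}^n|Y_i-y|^kK\left(\frac{x-X_i}{h_n}\right)\mathbb{1}\{Y_i>y\}.$$

The corresponding estimators of $\bar F^{(p)}(y|x)$ and $q^{(p)}(\alpha|x)$ are

$$\hat{\bar F}^{(p)}_n(y|x)=\frac{\hat\phi^{(p-1)}_n(y|x)}{\hat m^{(p-1)}_n(y|x)},\qquad \hat q^{(p)}_n(\alpha|x)=\inf\left\{y\in\mathbb{R}\mid\hat{\bar F}^{(p)}_n(y|x)\le1-\alpha\right\}. \tag{4}$$

The case $p=1$ gives the kernel quantile estimator introduced in [5], while $p=2$ leads to the conditional expectile estimator of [17]. We study here the asymptotic properties of $\hat q^{(p)}_n(\alpha|x)$ for an arbitrary $p\ge1$, when $\alpha=\alpha_n\to1$.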
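Estimator (4) can be sketched in a few lines for a one-dimensional covariate. This is a minimal illustration under stated assumptions: $d=1$, the Epanechnikov kernel, and a bisection search for the infimum; the names `epanechnikov` and `conditional_lp_quantile` are hypothetical. The normalisation $nh_n^d$ cancels in the ratio $\hat\phi/\hat m$, so it is omitted.

```python
def epanechnikov(u):
    """Epanechnikov kernel on [-1, 1] (one possible choice of K)."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def conditional_lp_quantile(alpha, x, xs, ys, h, p, tol=1e-9):
    """Sketch of the kernel L^p-quantile estimator (4) for d = 1 and p >= 1."""
    # Keep only the observations that receive a positive kernel weight.
    local = [(epanechnikov((x - xi) / h), yi) for xi, yi in zip(xs, ys)]
    local = [(w, yi) for w, yi in local if w > 0.0]
    if not local:
        raise ValueError("no observation within bandwidth h of x")

    def survival(y):
        # Ratio phi_hat^(p-1)(y|x) / m_hat^(p-1)(y|x).
        num = den = 0.0
        for w, yi in local:
            moment = w * abs(yi - y) ** (p - 1)
            den += moment
            if yi > y:
                num += moment
        return num / den if den > 0.0 else 0.0

    lo = min(yi for _, yi in local)
    hi = max(yi for _, yi in local)  # survival(hi) = 0 <= 1 - alpha
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if survival(mid) <= 1.0 - alpha:
            hi = mid
        else:
            lo = mid
    return hi
```

When all local weights are equal, $p=2$ and $\alpha=1/2$, the estimated survival function equals $1/2$ at the local sample mean, which gives a quick sanity check of the sketch.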

3 Main results

We first make a standard assumption on the kernel. We fix a norm $\|\cdot\|$ on $\mathbb{R}^d$.

(K) The density function $K$ is bounded and its support $S$ is contained in the unit ball.

To be able to analyze extreme conditional $L^p$-quantiles in a reasonably simple way, we make a standard second-order regular variation assumption (for a survey of those conditions, see Section 2 in [11]).

$C_2(\gamma(x),\rho(x),A(\cdot|x))$ There exist $\gamma(x)>0$, $\rho(x)\le0$ and a positive or negative function $A(\cdot|x)$ converging to 0 such that:

$$\forall t>0,\quad\lim_{y\to\infty}\frac{1}{A(y|x)}\left(\frac{q^{(1)}(1-1/(ty)|x)}{q^{(1)}(1-1/y|x)}-t^{\gamma(x)}\right)=\begin{cases}t^{\gamma(x)}\dfrac{t^{\rho(x)}-1}{\rho(x)} & \text{if }\rho(x)<0,\\[2mm] t^{\gamma(x)}\log(t) & \text{if }\rho(x)=0.\end{cases}$$

Our last assumption is a local Lipschitz condition which may be found for instance in [5, 12]. We denote by $B(x,r)$ the ball with center $x$ and radius $r$.

(L) We have $g(x)>0$ and there exist $c,r>0$ such that

$$\forall x'\in B(x,r),\quad|g(x)-g(x')|\le c\,\|x-x'\|.$$

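To be able to control the local oscillations of $(x,y)\mapsto\bar F^{(1)}(y|x)$, we let, for any nonnegative $y_n\to\infty$,

$$\omega^{(1)}_{h_n}(y_n|x)=\sup_{x'\in B(x,h_n)}\sup_{z\ge y_n}\frac{1}{\log(z)}\left|\log\frac{\bar F^{(1)}(z|x')}{\bar F^{(1)}(z|x)}\right|,$$

$$\omega^{(2)}_{h_n}(y_n|x)=\sup_{x'\in B(x,h_n)}\sup_{0<y\le y_n}\left|\bar F^{(1)}(y|x')-\bar F^{(1)}(y|x)\right|,$$

$$\text{and}\quad\omega^{(3)}_{h_n}(y_n|x)=\sup_{x'\in B(x,h_n)}\sup_{\lambda\ge1}\sup_{b_n,b'_n\to0}\left|\frac{\bar F^{(1)}(\lambda y_n(1+b_n)|x')}{\bar F^{(1)}(\lambda y_n(1+b'_n)|x')}-1\right|.$$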

The quantity $\omega^{(1)}_{h_n}(y_n|x)$, discussed for instance in [17], controls the oscillation of the conditional survival function with respect to $x$ in its right tail, while $\omega^{(2)}_{h_n}(y_n|x)$ and $\omega^{(3)}_{h_n}(y_n|x)$ are introduced to be able to deal with the case $p\notin\{1,2\}$ specifically. Let us highlight that $\omega^{(3)}_{h_n}(y_n|x)$ is again geared towards controlling an oscillation of the right tail of the conditional distribution; however, $\omega^{(2)}_{h_n}(y_n|x)$ focuses on the oscillation of the center of the conditional distribution with respect to $x$. For $p>1$, the introduction of a quantity such as $\omega^{(2)}_{h_n}(y_n|x)$ is in some sense natural, since we will have to deal with the local oscillation of the conditional moment $m^{(p-1)}(y|x)$, appearing in the denominator of $\bar F^{(p)}(y|x)$, and this conditional moment indeed depends on the whole of the conditional distribution rather than merely on its right tail. Typically $\omega^{(1)}_{h_n}(y_n|x)=O(h_n)$, $\omega^{(2)}_{h_n}(y_n|x)=O(h_n)$ and $\omega^{(3)}_{h_n}(y_n|x)=o(1)$ under reasonable assumptions; we give examples below.

Remark 1 Assume that $Y|X=x$ has a Pareto distribution with tail index $\gamma(x)>0$:

$$\forall y\ge1,\quad\bar F^{(1)}(y|x)=y^{-1/\gamma(x)}.$$

If $\gamma$ is locally Lipschitz continuous, we clearly have $\omega^{(1)}_{h_n}(y_n|x)=O(h_n)$. Furthermore, for any $y\ge1$, the mean value theorem yields

$$\left|\bar F^{(1)}(y|x')-\bar F^{(1)}(y|x)\right|\le\left|\frac{1}{\gamma(x')}-\frac{1}{\gamma(x)}\right|\times y^{-1/[\gamma(x)\vee\gamma(x')]}\log y.$$

(Here and below $\vee$ denotes the maximum operator.) Under this same local Lipschitz assumption, one then finds $\omega^{(2)}_{h_n}(y_n|x)=O(h_n)$ as well. Finally, for any $y,y'>1$,

$$\left|\frac{\bar F^{(1)}(y'|x')}{\bar F^{(1)}(y|x')}-1\right|=\left|\left(\frac{y}{y'}\right)^{1/\gamma(x')}-1\right|\le\frac{|y-y'|}{y'}\times\frac{1+(y/y')^{1/\gamma(x')-1}}{\gamma(x')}$$

by the mean value theorem again. This inequality yields $\omega^{(3)}_{h_n}(y_n|x)=o(1)$. The same arguments, and asymptotic bounds on $\omega^{(1)}_{h_n}(y_n|x)$, $\omega^{(2)}_{h_n}(y_n|x)$ and $\omega^{(3)}_{h_n}(y_n|x)$, apply to the conditional Fréchet model

$$\forall y>0,\quad\bar F^{(1)}(y|x)=1-\exp\left(-y^{-1/\gamma(x)}\right).$$

Analogous results are easily obtained for the conditional Burr model

$$\forall y>0,\quad\bar F^{(1)}(y|x)=\left(1+y^{-\rho(x)/\gamma(x)}\right)^{1/\rho(x)}$$

when $\rho<0$ is assumed to be locally Lipschitz continuous, and the conditional mixture Pareto model

$$\forall y\ge1,\quad\bar F^{(1)}(y|x)=y^{-1/\gamma(x)}\left[c(x)+(1-c(x))\,y^{\rho(x)/\gamma(x)}\right]$$

when $\rho<0$ and $c\in(0,1)$ are assumed to be locally Lipschitz continuous.

3.1 Intermediate $L^p$-quantile regression

In this paragraph, we assume that $\sigma_n^{-2}=nh_n^d(1-\alpha_n)\to\infty$. Such an assumption means that the $L^p$-quantile level $\alpha_n$ tends to 1 slowly (by extreme value standards), hence the denominations intermediate sequence and intermediate $L^p$-quantiles. This assumption is widespread in the literature of risk measure regression: see, among others, [5, 6, 12, 17]. Throughout, we let $\|K\|_2^2=\int_SK(u)^2\,du$ be the squared $L^2$-norm of $K$, $\Psi(\cdot)$ denote the digamma function and $IB(t,x,y)=\int_0^tu^{x-1}(1-u)^{y-1}\,du$ be the incomplete Beta function. Note that $IB(1,x,y)=B(x,y)$ is the standard Beta function.

We now give our first result on the joint asymptotic normality of a finite number $J$ of empirical conditional quantiles with an empirical conditional $L^p$-quantile ($p>1$).

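Theorem 1 Assume that (K), (L) and $C_2(\gamma(x),\rho(x),A(\cdot|x))$ hold. Let $\alpha_n\to1$, $h_n\to0$ and $a_n=1-\tau(1-\alpha_n)(1+o(1))$, where $\tau>0$. Assume further that $\sigma_n^{-2}=nh_n^d(1-\alpha_n)\to\infty$, $nh_n^{d+2}(1-\alpha_n)\to0$, $\sigma_n^{-1}A\left((1-\alpha_n)^{-1}|x\right)=O(1)$, $\omega^{(3)}_{h_n}(q^{(1)}(\alpha_n|x)|x)\to0$ and there exists $\delta\in(0,1)$ such that

$$\sigma_n^{-1}\,\omega^{(1)}_{h_n}\left((1-\delta)(\theta\wedge1)\,q^{(1)}(\alpha_n|x)\,\middle|\,x\right)\log(1-\alpha_n)\to0, \tag{5}$$

where $\theta=\left(\dfrac{\tau\gamma(x)}{B\left(p,\gamma(x)^{-1}-p+1\right)}\right)^{-\gamma(x)}$. Let further $\alpha_{n,j}=1-\tau_j(1-\alpha_n)$, for $0<\tau_1<\tau_2<\dots<\tau_J\le1$ such that

$$\sigma_n^{-1}\,\omega^{(2)}_{h_n}\left((1+\delta)\left(\theta\vee\tau_1^{-\gamma(x)}\right)q^{(1)}(\alpha_n|x)\,\middle|\,x\right)\to0. \tag{6}$$

Then, for all $p\in(1,\gamma(x)^{-1}/2+1)$, one has

$$\sigma_n^{-1}\left\{\left(\frac{\hat q^{(1)}_n(\alpha_{n,j}|x)}{q^{(1)}(\alpha_{n,j}|x)}-1\right)_{1\le j\le J},\left(\frac{\hat q^{(p)}_n(a_n|x)}{q^{(p)}(a_n|x)}-1\right)\right\}\xrightarrow{d}\mathcal{N}\left(0_{J+1},\frac{\|K\|_2^2}{g(x)}\gamma(x)^2\Sigma(x)\right), \tag{7}$$

where $\Sigma(x)$ is the symmetric matrix having entries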













$$\begin{cases}\Sigma_{j,\ell}(x)=\left(\tau_j\vee\tau_\ell\right)^{-1},\\[2mm] \Sigma_{j,J+1}(x)=\tau_j^{-1}\left[\dfrac{\gamma(x)\,(p-1)\,IB\left(\left(1\vee\dfrac{\tau_j^{-\gamma(x)}}{\theta}\right)^{-1},\gamma(x)^{-1}-p+1,\,p-1\right)}{B\left(p,\gamma(x)^{-1}-p+1\right)}+\left(1\vee\dfrac{\tau_j^{-\gamma(x)}}{\theta}-1\right)^{p-1}\right],\\[4mm] \Sigma_{J+1,J+1}(x)=\dfrac{B\left(2p-1,\gamma(x)^{-1}-2p+2\right)}{\tau\,B\left(p,\gamma(x)^{-1}-p+1\right)}.\end{cases} \tag{8}$$

Theorem 1, which will be useful to introduce estimators of the tail index $\gamma(x)$ as part of our extrapolation methodology, generalizes and adapts to the conditional setup several results already found in the literature: see Theorem 1 in [5], Theorem 1 in [8] and Theorem 3 in [10]. Note however that, although they are in some sense related, Theorem 1 does not imply Theorem 1 of [17], because the latter is stated under weaker regularity conditions warranted by the specific context $p=2$ of extreme conditional expectile estimation. On the technical side, assumptions (5) and (6) ensure that the bias introduced by smoothing in the $x$ direction is negligible compared to the standard deviation $\sigma_n$ of the estimator. The aim of the next paragraph is now to extrapolate our intermediate estimators to properly extreme levels.

3.2 Extreme $L^p$-quantile regression

We consider here a level $\beta_n\to1$ such that $nh_n^d(1-\beta_n)\to c<\infty$. The estimators previously introduced no longer work at such an extreme level. In order to overcome this problem, we first recall a result of [8] (see also Lemma 5 below):

$$\forall p\ge1,\quad\lim_{\alpha\to1}\frac{q^{(p)}(\alpha|x)}{q^{(1)}(\alpha|x)}=\left(\frac{\gamma(x)}{B\left(p,\gamma(x)^{-1}-p+1\right)}\right)^{-\gamma(x)}. \tag{9}$$
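The limiting constant in (9) is easy to evaluate numerically. The sketch below is an illustration under assumptions: the helper names `beta_fn`, `g_p` and `lp_to_quantile_ratio` are hypothetical, and the Beta function is computed from the standard identity $B(a,b)=\Gamma(a)\Gamma(b)/\Gamma(a+b)$ using the standard library's log-gamma function.

```python
import math


def beta_fn(a, b):
    """Beta function B(a, b) for a, b > 0, via log-gamma for numerical stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def g_p(gamma, p):
    """The quantity gamma / B(p, 1/gamma - p + 1) appearing inside (9)."""
    b = 1.0 / gamma - p + 1.0
    if gamma <= 0.0 or b <= 0.0:
        raise ValueError("need gamma > 0 and p - 1 < 1/gamma")
    return gamma / beta_fn(p, b)


def lp_to_quantile_ratio(gamma, p):
    """Limit of q^(p)(alpha|x) / q^(1)(alpha|x) as alpha -> 1, as stated in (9)."""
    return g_p(gamma, p) ** (-gamma)
```

For $p=1$ the ratio equals 1, and for $p=2$ it reduces to $(\gamma^{-1}-1)^{-\gamma}$ since $B(2,\gamma^{-1}-1)=\gamma^2/(1-\gamma)$.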

In the sequel we shall use the notation $g_p(\gamma)=\gamma/B\left(p,\gamma^{-1}-p+1\right)$. A first consequence of this result is that the $L^p$-quantile function is regularly varying, i.e.:

$$\forall t>0,\quad\lim_{y\to\infty}\frac{q^{(p)}(1-1/(ty)|x)}{q^{(p)}(1-1/y|x)}=t^{\gamma(x)}. \tag{10}$$

This suggests then that, by considering an intermediate sequence $(\alpha_n)$, our conditional extreme $L^p$-quantile may be approximated (and estimated) as follows:

$$q^{(p)}(\beta_n|x)\approx\left(\frac{1-\alpha_n}{1-\beta_n}\right)^{\gamma(x)}q^{(p)}(\alpha_n|x),\quad\text{estimated by}\quad\tilde q^{(p)}_{n,\alpha_n}(\beta_n|x)=\left(\frac{1-\alpha_n}{1-\beta_n}\right)^{\hat\gamma_{\alpha_n}(x)}\hat q^{(p)}_n(\alpha_n|x).$$

Here, $\hat q^{(p)}_n(\alpha_n|x)$ is the kernel estimator introduced in Equation (4) and $\hat\gamma_{\alpha_n}(x)$ is a consistent estimator of the conditional tail index $\gamma(x)$. This is a class of Weissman-type estimators (see [23]) of which we give the asymptotic properties.
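The extrapolation step itself is a one-line computation once an intermediate estimate and a tail index estimate are available. The following sketch (function name `weissman_extrapolate` is hypothetical) takes both as plain numbers, however they were obtained.

```python
def weissman_extrapolate(q_intermediate, alpha_n, beta_n, gamma_hat):
    """Weissman-type extrapolation from level alpha_n to the extreme level beta_n.

    q_intermediate : estimate of the L^p-quantile at the intermediate level alpha_n
    gamma_hat      : estimate of the conditional tail index gamma(x)
    """
    if not (0.0 < alpha_n < beta_n < 1.0):
        raise ValueError("need 0 < alpha_n < beta_n < 1")
    return ((1.0 - alpha_n) / (1.0 - beta_n)) ** gamma_hat * q_intermediate
```

In the exact Pareto model $q^{(1)}(\alpha|x)=(1-\alpha)^{-\gamma(x)}$, the extrapolation with the true $\gamma(x)$ and the true intermediate quantile recovers the extreme quantile exactly, which is a convenient sanity check.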

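Theorem 2 Assume that (K), (L) and $C_2(\gamma(x),\rho(x),A(\cdot|x))$ hold with $\rho(x)<0$. Let $\alpha_n,\beta_n\to1$, $h_n\to0$ be such that $\sigma_n^{-2}=nh_n^d(1-\alpha_n)\to\infty$ and $nh_n^d(1-\beta_n)\to c<\infty$. Assume further that $nh_n^{d+2}(1-\alpha_n)\to0$, $\omega^{(3)}_{h_n}(q^{(1)}(\alpha_n|x)|x)\to0$ and

i) $\sigma_n^{-1}A\left((1-\alpha_n)^{-1}|x\right)=O(1)$, $\sigma_n^{-1}(1-\alpha_n)=O(1)$ and $\sigma_n^{-1}\,\mathbb{E}\left[Y\mathbb{1}\{0<Y<q^{(1)}(\alpha_n|x)\}\,|\,X=x\right]q^{(1)}(\alpha_n|x)^{-1}=O(1)$,

ii) For some $\delta\in(0,1)$, $\sigma_n^{-1}\,\omega^{(1)}_{h_n}\left((1-\delta)[g_p(\gamma(x))]^{-\gamma(x)}q^{(1)}(\alpha_n|x)\,|\,x\right)\log(1-\alpha_n)\to0$ and $\sigma_n^{-1}\,\omega^{(2)}_{h_n}\left((1+\delta)\,q^{(1)}(\alpha_n|x)\,|\,x\right)\to0$,

iii) $\sigma_n^{-1}/\log((1-\alpha_n)/(1-\beta_n))\to\infty$.

Take $p\in(1,\gamma(x)^{-1}/2+1)$. If in addition $\sigma_n^{-1}(\hat\gamma_{\alpha_n}(x)-\gamma(x))\xrightarrow{d}\Gamma$, then

$$\frac{\sigma_n^{-1}}{\log((1-\alpha_n)/(1-\beta_n))}\left(\frac{\tilde q^{(p)}_{n,\alpha_n}(\beta_n|x)}{q^{(p)}(\beta_n|x)}-1\right)\xrightarrow{d}\Gamma.$$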

We notice, as is classical in the analysis of heavy tails, that the asymptotic distribution of the extrapolated estimator $\tilde q^{(p)}_{n,\alpha_n}(\beta_n|x)$ is exactly that of the purely empirical estimator $\hat\gamma_{\alpha_n}(x)$ with a slightly slower rate of convergence. Technically speaking, assumption (i) controls the bias due to the asymptotic approximation (9), while assumption (ii) is used to deal with the bias due to smoothing.

Our aim is now to propose some estimators of $\gamma(x)$ solely based on intermediate $L^p$-quantiles, in order to carry out the extrapolation step.


3.3 $L^p$-quantile based estimation of the conditional tail index

The aim of this paragraph is to discuss the estimation of the conditional tail index $\gamma(x)$. A local Pickands estimator is studied in [5, 6]. This estimator however has a large variance, which is why [6] proposes a simplified, conditional and local version of the Hill estimator:

$$\hat\gamma^{(H)}_{\alpha_n}(x)=\frac{1}{\log(J!)}\sum_{j=1}^J\log\left(\frac{\hat q_n\left(\frac{j-1+\alpha_n}{j}\,\middle|\,x\right)}{\hat q_n(\alpha_n|x)}\right). \tag{11}$$
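Formula (11) only needs a way to evaluate the estimated conditional quantile at the $J$ levels $(j-1+\alpha_n)/j=1-(1-\alpha_n)/j$. The sketch below passes that quantile function in as a callable; the name `local_hill` and this interface are assumptions made for illustration.

```python
import math


def local_hill(quantile, alpha_n, J=9):
    """Hill-type estimator (11) built from a quantile function at fixed x.

    quantile : callable mapping a level in (0, 1) to the (estimated)
               conditional quantile at the point x of interest
    J        : number of levels used (must be at least 2, since log(1!) = 0)
    """
    if J < 2:
        raise ValueError("J must be at least 2")
    base = quantile(alpha_n)
    total = sum(
        math.log(quantile((j - 1 + alpha_n) / j) / base) for j in range(1, J + 1)
    )
    return total / math.log(math.factorial(J))
```

For an exact Pareto quantile function $\alpha\mapsto(1-\alpha)^{-\gamma}$, each log-ratio equals $\gamma\log j$, so the sum is $\gamma\log(J!)$ and the sketch returns $\gamma$.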

They also mentioned that taking $J=9$ is an optimal choice, and leads to an asymptotic variance close to $1.25\,\|K\|_2^2\,\gamma(x)^2/g(x)$. Recently, [9, 17] have shown that replacing the quantile by the expectile in tail index estimators can lead to a significant variance reduction. Our idea here is to propose an estimator based on $L^p$-quantiles rather than quantiles. In this context, we propose to follow the approach of [16] and exploit the asymptotic relationship (9) by introducing the following estimator, valid for all $1<p<\gamma(x)^{-1}+1$:

$$\hat\gamma^{(p)}_{\alpha_n}(x)=\inf\left\{\gamma>0:\ g_p(\gamma)\le\frac{\hat{\bar F}^{(1)}_n\left(\hat q^{(p)}_n(\alpha_n|x)\,|\,x\right)}{1-\alpha_n}\right\}. \tag{12}$$

This class of estimators is introduced in [16] in an unconditional setting, and the (explicit) estimator $\hat\gamma^{(2)}_{\alpha_n}(x)$ is introduced in [17]. Using the results previously obtained, we can give the asymptotic distribution of $\hat\gamma^{(p)}_{\alpha_n}(x)$ for all $1<p<\gamma(x)^{-1}/2+1$.
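Numerically, (12) amounts to inverting $\gamma\mapsto g_p(\gamma)$ at the observed ratio $\hat{\bar F}^{(1)}_n(\hat q^{(p)}_n(\alpha_n|x)|x)/(1-\alpha_n)$. The sketch below does this by bisection, relying on $g_p$ being continuous and decreasing on $(0,1/(p-1))$; the function names, the search interval end points and the tolerance are assumptions of this illustration.

```python
import math


def g_p(gamma, p):
    """g_p(gamma) = gamma / B(p, 1/gamma - p + 1), computed via log-gamma."""
    b = 1.0 / gamma - p + 1.0
    log_beta = math.lgamma(p) + math.lgamma(b) - math.lgamma(p + b)
    return gamma * math.exp(-log_beta)


def tail_index_from_ratio(ratio, p, tol=1e-12):
    """Smallest gamma with g_p(gamma) <= ratio, for p > 1 (sketch of (12)).

    ratio : estimated survival probability at the L^p-quantile, divided by
            (1 - alpha_n)
    """
    if p <= 1.0 or ratio <= 0.0:
        raise ValueError("need p > 1 and ratio > 0")
    lo, hi = 1e-6, (1.0 - 1e-9) / (p - 1.0)  # search interval inside (0, 1/(p-1))
    if g_p(lo, p) <= ratio:
        return lo
    if g_p(hi, p) > ratio:
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g_p(mid, p) <= ratio:
            hi = mid
        else:
            lo = mid
    return hi
```

For $p=2$ one has $g_2(\gamma)=(1-\gamma)/\gamma$, so the inversion is explicit, $\gamma=1/(1+\text{ratio})$, which matches the remark that $\hat\gamma^{(2)}_{\alpha_n}(x)$ is an explicit estimator.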

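Theorem 3 Assume that (K), (L) and $C_2(\gamma(x),\rho(x),A(\cdot|x))$ hold with $\gamma(x)<1$. Let $\alpha_n\to1$ and $h_n\to0$. Assume further that $\sigma_n^{-2}=nh_n^d(1-\alpha_n)\to\infty$, $nh_n^{d+2}(1-\alpha_n)\to0$, $\omega^{(3)}_{h_n}(q^{(1)}(\alpha_n|x)|x)\to0$ and

i) $\sigma_n^{-1}A\left((1-\alpha_n)^{-1}|x\right)\to0$,

ii) $\sigma_n^{-1}\,q^{(1)}(\alpha_n|x)^{-1}\to\lambda\in\mathbb{R}$,

iii) For some $\delta\in(0,1)$, $\sigma_n^{-1}\,\omega^{(1)}_{h_n}\left((1-\delta)\,g_p(\gamma(x))^{-\gamma(x)}q^{(1)}(\alpha_n|x)\,|\,x\right)\log(1-\alpha_n)\to0$ and $\sigma_n^{-1}\,\omega^{(2)}_{h_n}\left((1+\delta)\,q^{(1)}(\alpha_n|x)\,|\,x\right)\to0$.

Then, for all $p\in(1,\gamma(x)^{-1}/2+1)$, one has

$$\sigma_n^{-1}\left(\hat\gamma^{(p)}_{\alpha_n}(x)-\gamma(x),\ \frac{\hat q^{(p)}_n(\alpha_n|x)}{q^{(p)}(\alpha_n|x)}-1\right)\xrightarrow{d}\Theta, \tag{13}$$

where $\Theta$ is a bivariate Gaussian distribution with mean vector $(b_p(x),0)$ and covariance matrix $\|K\|_2^2\,\gamma(x)^2\,g(x)^{-1}\,\Omega(x)$ such that: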

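$$\begin{cases}b_p(x)=\dfrac{(1-p)\,\gamma(x)\,g_p(\gamma(x))^{\gamma(x)}\,\mathbb{E}[Y|X=x]}{1-\frac{1}{\gamma(x)}\left(\Psi(\gamma(x)^{-1}+1)-\Psi(\gamma(x)^{-1}-p+1)\right)}\,\lambda,\\[4mm] \Omega_{11}(x)=\dfrac{B(p,\gamma(x)^{-1}-p+1)}{\left[1-\frac{1}{\gamma(x)}\left(\Psi(\gamma(x)^{-1}+1)-\Psi(\gamma(x)^{-1}-p+1)\right)\right]^2}\left(\dfrac{B(2p-1,\gamma(x)^{-1}-2p+2)}{B(p,\gamma(x)^{-1}-p+1)^2}-\dfrac{1}{\gamma(x)}\right),\\[4mm] \Omega_{12}(x)=\dfrac{B(p,\gamma(x)^{-1}-p+1)}{1-\frac{1}{\gamma(x)}\left(\Psi(\gamma(x)^{-1}+1)-\Psi(\gamma(x)^{-1}-p+1)\right)}\left(\dfrac{1}{\gamma(x)}-\dfrac{B(2p-1,\gamma(x)^{-1}-2p+2)}{B(p,\gamma(x)^{-1}-p+1)^2}\right),\\[4mm] \Omega_{22}(x)=\dfrac{B(2p-1,\gamma(x)^{-1}-2p+2)}{B(p,\gamma(x)^{-1}-p+1)}.\end{cases} \tag{14}$$

Let us remark here that although Theorem 3 can be seen as a version of Theorem 4 of [17], the latter is stated under weaker regularity assumptions and applies to further examples of estimators developed specifically in the conditional expectile setup.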

Note that the condition $\gamma(x)<1$ entails $\mathbb{E}[Y|X=x]<\infty$ and leads to a simple expression of the bias term $b_p(x)$. A result dropping this assumption is available in the unconditional setting in [16]; here, our motivation for this condition is that we shall use extreme regression $L^p$-quantiles as a way to estimate extreme regression expectiles, for the existence of which a natural condition is that $\mathbb{E}[|Y|\,|X=x]<\infty$. The bias term $b_p(x)$ is related to $\gamma(x)$, $q^{(1)}(\alpha_n|x)$ and $\mathbb{E}[Y|X=x]$. All these quantities may be easily estimated (the latter two by kernel regression estimators) to construct a bias-reduced conditional tail index estimator as follows:

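$$\tilde\gamma^{(p)}_{\alpha_n}(x)=\hat\gamma^{(p)}_{\alpha_n}(x)\left(1+\frac{(p-1)\left(\dfrac{\sum_{i=1}^nY_iK\left(\frac{x-X_i}{h_n}\right)}{\sum_{i=1}^nK\left(\frac{x-X_i}{h_n}\right)}\right)\hat q^{(p)}_n(\alpha_n|x)^{-1}}{1+\dfrac{1}{\hat\gamma^{(p)}_{\alpha_n}(x)}\left[\Psi\left(1/\hat\gamma^{(p)}_{\alpha_n}(x)-p+1\right)-\Psi\left(1/\hat\gamma^{(p)}_{\alpha_n}(x)+1\right)\right]}\right).$$

Under the conditions of Theorem 3, it is clear that $\sigma_n^{-1}(\tilde\gamma^{(p)}_{\alpha_n}(x)-\gamma(x))\xrightarrow{d}\mathcal{N}(0,\Omega_{11}(x))$ where $\Omega_{11}(x)$ is given in Equation (14). This bias reduction improves significantly the numerical results, and is used in the finite-sample study below.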
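A possible implementation of this correction is sketched below, reading the display as $\tilde\gamma=\hat\gamma\,(1+(p-1)\,\bar Y(x)\,\hat q^{(p)}_n(\alpha_n|x)^{-1}/D)$ with $D=1+\hat\gamma^{-1}[\Psi(1/\hat\gamma-p+1)-\Psi(1/\hat\gamma+1)]$ and $\bar Y(x)$ the kernel regression estimate of $\mathbb{E}[Y|X=x]$. The function names are hypothetical, and the digamma function is approximated by the usual recurrence plus asymptotic series because the Python standard library does not provide it.

```python
import math


def digamma(x):
    """Digamma function for x > 0: upward recurrence, then asymptotic series."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # Psi(x) = Psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def bias_reduced_tail_index(gamma_hat, q_p_hat, local_mean, p):
    """Bias-reduced tail index, following the display above (sketch).

    gamma_hat  : raw estimate from (12)
    q_p_hat    : intermediate L^p-quantile estimate at level alpha_n
    local_mean : kernel regression estimate of E[Y | X = x]
    """
    inv = 1.0 / gamma_hat
    denom = 1.0 + inv * (digamma(inv - p + 1.0) - digamma(inv + 1.0))
    return gamma_hat * (1.0 + (p - 1.0) * local_mean / (q_p_hat * denom))
```

For $p=2$ the denominator simplifies to $-1/(1-\hat\gamma)$, so the correction becomes $\hat\gamma\,(1-(1-\hat\gamma)\,\bar Y(x)/\hat q^{(2)}_n(\alpha_n|x))$; this special case is used as a check of the sketch.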

Even though $L^p$-quantiles with $1<p<2$ are more widely estimable than expectiles and take into account the whole tail information, they are neither easy to interpret nor coherent as risk measures. Recent work in [8] has shown that extreme $L^p$-quantiles can be used as vehicles for extreme quantile and expectile estimation; see also [15] for an analogous study of the estimation of (a compromise between) Median Shortfall and Conditional Tail Expectation at extreme levels, using tail $L^p$-medians. Our focus in the following finite-sample study is to analyse the potential of extreme regression $L^p$-quantiles for the estimation of extreme regression quantiles and expectiles.


4 Simulation study

We consider here a one-dimensional covariate ($d=1$), uniformly distributed on $[0,1]$, and a Burr-type distribution for $Y$ given $X=x$:

$$\bar F^{(1)}(y|x)=\left(1+y^{-\rho(x)/\gamma(x)}\right)^{1/\rho(x)},\quad\gamma(x)=\frac{4+\sin(2\pi x)}{10}\quad\text{and}\quad\rho(x)\equiv-1.$$

Such a distribution fulfills Assumption $C_2(\gamma(x),\rho(x),A(\cdot|x))$ with auxiliary function $A(y|x)=\gamma(x)\,y^{\rho(x)}$. We simulate $N=500$ samples of $n=1{,}000$ independent replications of $(X,Y)$, and propose to estimate the conditional quantiles and expectiles of level $\beta_n=1-1/n=0.999$ using our extreme regression $L^p$-quantile estimators. Note that the quantiles may be calculated explicitly:

$$q(\alpha|x)=\left[(1-\alpha)^{\rho(x)}-1\right]^{-\gamma(x)/\rho(x)}.$$
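The simulation design can be reproduced with inverse transform sampling, since the conditional quantile function is explicit. The sketch below is one way to do so; the function names and the use of Python's `random.Random` generator are assumptions of this illustration, not the authors' code.

```python
import math
import random

RHO = -1.0  # second-order parameter rho(x), constant in this design


def gamma_fn(x):
    """Conditional tail index gamma(x) = (4 + sin(2 pi x)) / 10."""
    return (4.0 + math.sin(2.0 * math.pi * x)) / 10.0


def burr_survival(y, x):
    """Conditional survival function of the Burr-type model, for y > 0."""
    return (1.0 + y ** (-RHO / gamma_fn(x))) ** (1.0 / RHO)


def burr_quantile(alpha, x):
    """Explicit conditional quantile q(alpha | x) of the Burr-type model."""
    return ((1.0 - alpha) ** RHO - 1.0) ** (-gamma_fn(x) / RHO)


def simulate(n, rng):
    """Draw n pairs (X, Y): X uniform on [0, 1], Y | X by inverse transform."""
    xs = [rng.random() for _ in range(n)]
    ys = [burr_quantile(rng.random(), x) for x in xs]
    return xs, ys
```

A call such as `simulate(1000, random.Random(0))` produces one sample of the size used in the study; the seed is arbitrary.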

Expectiles have to be approximated numerically, since they do not have a simple closed form. In order to estimate these two quantities, we propose to compare different approaches (called either direct or indirect):

(i) Use the conditional Weissman-type estimators respectively based on empirical quantiles and the estimator $\hat\gamma^{(H)}_{\alpha_n}(x)$ (direct quantile estimator) and on empirical expectiles and $\tilde\gamma^{(2)}_{\alpha_n}(x)$ (direct expectile estimator), i.e.

$$\left(\frac{1-\alpha_n}{1-\beta_n}\right)^{\hat\gamma^{(H)}_{\alpha_n}(x)}\hat q^{(1)}_n(\alpha_n|x),\qquad\left(\frac{1-\alpha_n}{1-\beta_n}\right)^{\tilde\gamma^{(2)}_{\alpha_n}(x)}\hat q^{(2)}_n(\alpha_n|x).$$

(ii) Indirect quantile estimator: estimate first the conditional $L^p$-quantile using estimator (4), and exploit the asymptotic relationship (9) to recover the extreme conditional quantile,

$$\left(\frac{1-\alpha_n}{1-\beta_n}\right)^{\tilde\gamma^{(p)}_{\alpha_n}(x)}\hat q^{(p)}_n(\alpha_n|x)\left(\frac{\tilde\gamma^{(p)}_{\alpha_n}(x)}{B\left(p,\tilde\gamma^{(p)}_{\alpha_n}(x)^{-1}-p+1\right)}\right)^{\tilde\gamma^{(p)}_{\alpha_n}(x)}.$$

(iii) Indirect expectile estimator: use Equation (9) to get a connection between $L^p$-quantile and quantile, and quantile and expectile, resulting in the extreme conditional expectile estimator

$$\left(\frac{1-\alpha_n}{1-\beta_n}\right)^{\tilde\gamma^{(p)}_{\alpha_n}(x)}\hat q^{(p)}_n(\alpha_n|x)\left(\frac{B\left(2,\tilde\gamma^{(p)}_{\alpha_n}(x)^{-1}-1\right)}{B\left(p,\tilde\gamma^{(p)}_{\alpha_n}(x)^{-1}-p+1\right)}\right)^{\tilde\gamma^{(p)}_{\alpha_n}(x)}.$$
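The indirect estimators (ii) and (iii) combine the Weissman extrapolation with the proportionality constants from (9). A minimal sketch, taking the intermediate $L^p$-quantile estimate and the tail index estimate as inputs, could look as follows; the function names are hypothetical and the Beta function is again computed from log-gamma values.

```python
import math


def beta_fn(a, b):
    """Beta function B(a, b) for a, b > 0."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def indirect_quantile(q_p_hat, gamma_tilde, alpha_n, beta_n, p):
    """Indirect extreme quantile estimator (ii)."""
    extrapolated = ((1.0 - alpha_n) / (1.0 - beta_n)) ** gamma_tilde * q_p_hat
    constant = gamma_tilde / beta_fn(p, 1.0 / gamma_tilde - p + 1.0)
    return extrapolated * constant ** gamma_tilde


def indirect_expectile(q_p_hat, gamma_tilde, alpha_n, beta_n, p):
    """Indirect extreme expectile estimator (iii); requires gamma_tilde < 1."""
    extrapolated = ((1.0 - alpha_n) / (1.0 - beta_n)) ** gamma_tilde * q_p_hat
    constant = beta_fn(2.0, 1.0 / gamma_tilde - 1.0) / beta_fn(
        p, 1.0 / gamma_tilde - p + 1.0
    )
    return extrapolated * constant ** gamma_tilde
```

Two consistency checks follow from the formulas: with $p=2$ the constant in (iii) equals 1, so only the extrapolation remains, and with $\tilde\gamma=1/2$ one has $B(2,1)=1/2$, so (ii) and (iii) coincide. The second Beta function argument $\tilde\gamma^{-1}-1$ approaches 0 as $\tilde\gamma$ approaches 1, which is where the estimated constant in (iii) becomes unstable.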

The choice of $p$ is discussed in [16] using the MSE of (the unconditional version of) $\tilde\gamma^{(p)}_{\alpha_n}(x)$ as a criterion. Cross-validation choices of the bandwidth $h_n$ and intermediate quantile level $\alpha_n$, meanwhile, are discussed in [5, 17]. For the sake of simplicity, we choose here common parameters $p=1.7$ (following the guidelines of [16]), $h_n=0.15$ and $\alpha_n=1-1/\sqrt{n}\approx0.968$ across all replications, and $K$ is the Epanechnikov kernel defined by $K(t)=0.75\,(1-t^2)\,\mathbb{1}\{|t|<1\}$. Results are shown in Figure 1.

Fig. 1 Left: Boxplots of 500 estimates of $q^{(1)}(\beta_n|x)$ with the direct (green) and indirect (blue) quantile estimators. Right: Boxplots of 500 estimates of $q^{(2)}(\beta_n|x)$ with the direct (green) and indirect (blue) expectile estimators. True values are in red.

We can notice that an indirect estimation of extreme quantiles or expectiles with an $L^p$-quantile (with $p$ between 1 and 2) leads to a trade-off between bias and variance: the indirect $L^p$-estimator of an extreme regression quantile is less variable than the direct estimator but slightly more biased, and the indirect $L^p$-estimator of an extreme regression expectile is more variable than the direct estimator but less biased. For conditional quantiles, an explanation is that using the asymptotic approximation (9) in the construction of the indirect estimator adds a source of bias, while the reduced variance stems from the use of $p=1.7$ in the estimator $\tilde\gamma^{(p)}_{\alpha_n}(x)$, providing an estimator with lower variance compared to the simple Hill estimator in our case (see [16]). The case of conditional expectiles is less clear, although the increased variability observed for $x\in[0,0.5]$ seems to originate in the use of the estimated constant $B(2,\tilde\gamma^{(p)}_{\alpha_n}(x)^{-1}-1)/B(p,\tilde\gamma^{(p)}_{\alpha_n}(x)^{-1}-p+1)$: when $\tilde\gamma^{(p)}_{\alpha_n}(x)$ gets close to 1, which is sometimes the case in this zone where $\gamma(x)\in[0.4,0.5]$, this estimated constant tends to explode, while the direct estimator is less affected. A similar observation, in the context of extreme Wang distortion risk measure estimation, is made by [13].

5 Real data example

We study here a data set on motorcycle insurance, collected from the former Swedish insurance provider Wasa. Our data is on motorcycle insurance policies and claims over the period 1994-1998; it is available from www.math.su.se/GLMbook or the R packages insuranceData and CASdatasets, and is analyzed in [21]. We concentrate here on the relationship between the claim severity $Y$ (defined as the ratio of claim cost to number of claims for each given policyholder) in Swedish kroner (SEK), and the number of years $X$ of exposure of a policyholder. Data for $X>3$ are very sparse, so we restrict our attention to the case $Y>0$ and $X\in[0,3]$, resulting in $n=593$ pairs $(X_i,Y_i)$.

Our goal in this section is to estimate extreme conditional quantiles and expectiles of $Y$ given $X$, at a level $\beta_n=1-3/n\approx0.9949$. This level is slightly less extreme than the more standard $\beta_n=1-1/n\approx0.9983$, but is an appropriately extreme level in this conditional context where less data are available locally for the estimation.

A preliminary diagnostic using a local version of the Hill estimator (which we do not show here) suggests that the data is indeed heavy-tailed with $\gamma(x)\in[0.25,0.6]$.

Following again the guidelines in [16], we choose $p=1.7$ for our indirect extreme conditional quantile and expectile estimators. These are respectively compared to:

• the estimator $\widehat{q}^{\,W}_n(\beta_n|x)$ of [17], calculated as in Section 5 therein, and our direct quantile estimator presented in Section 4 (i),

• the estimator $\widehat{e}^{\,W,BR}_n(\beta_n|x)$ of [17], calculated as in Section 5 therein, and our direct expectile estimator presented in Section 4 (i).

For the direct and indirect estimators presented in Section 4 (ii)-(iii), the parameters $\alpha_n$ and $h_n$ are chosen by a cross-validation procedure analogous to that of [17]. The Epanechnikov kernel is adopted. Results are given in Figure 2. In each case, all three estimators reassuringly point to roughly the same results, with slight differences; in particular, for quantile estimation and when data is scarce, the direct estimator in Section 4 (i) appears to be more sensitive to the local shape of the tail than the indirect, $L^p$-quantile based estimator in Section 4 (ii), resulting in less stable estimates.

6 Appendix

6.1 Preliminary results

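Lemma 1 Assume that (L) and $C_2(\gamma(x),\rho(x),A(\cdot|x))$ hold, and let $y_n\to\infty$ and $h_n\to0$ be such that $\omega^{(1)}_{h_n}(y_n|x)\log(y_n)\to0$ and $\omega^{(2)}_{h_n}(y_n|x)\to0$. Then for all $0\le k<\gamma(x)^{-1}$ we have, uniformly in $x'\in B(x,h_n)$,

$$m^{(k)}(y_n|x')=m^{(k)}(y_n|x)\left(1+O(h_n)+o\left(\omega^{(1)}_{h_n}(y_n|x)\right)+O\left(\omega^{(2)}_{h_n}(y_n|x)\right)\right).$$

In particular $m^{(k)}(y_n|x')=y_n^k\,g(x)\,(1+o(1))$ uniformly in $x'\in B(x,h_n)$.

Proof Let us first write

$$m^{(k)}(y_n|x)=\mathbb{E}\left[(Y-y_n)^k\mathbb{1}\{Y>y_n\}|X=x\right]g(x)+\mathbb{E}\left[(y_n-Y)^k\mathbb{1}\{Y\le y_n\}|X=x\right]g(x).$$

By the arguments of the proof of Lemma 3 in [17],

$$\frac{\mathbb{E}\left[(Y-y_n)^k\mathbb{1}\{Y>y_n\}|X=x'\right]g(x')}{\mathbb{E}\left[(Y-y_n)^k\mathbb{1}\{Y>y_n\}|X=x\right]g(x)}=1+O(h_n)+O\left(\omega^{(1)}_{h_n}(y_n|x)\log(y_n)\right).$$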

Fig. 2 Swedish motorcycle insurance data. Left panel: extreme conditional quantile estimation, black curve: estimator $\widehat{q}^{\,W}_n(\beta_n|x)$ of [17], blue curve: direct quantile estimator (i) of Section 4, red curve: indirect quantile estimator (ii) of Section 4. Right panel: extreme conditional expectile estimation, black curve: estimator $\widehat{e}^{\,W,BR}_n(\beta_n|x)$ of [17], blue curve: direct expectile estimator (i) of Section 4, red curve: indirect expectile estimator (iii) of Section 4. In each panel, x-axis: number of years of exposure of policyholder, y-axis: claim severity.

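Besides, an integration by parts yields

$$\mathbb{E}\left[(y_n-Y)^k\mathbb{1}\{Y\le y_n\}|X=x\right]=\int_0^{y_n}k\,t^{k-1}F^{(1)}(y_n-t|x)\,dt.$$

It clearly follows that

$$\left|\mathbb{E}\left[(y_n-Y)^k\mathbb{1}\{Y\le y_n\}|X=x'\right]-\mathbb{E}\left[(y_n-Y)^k\mathbb{1}\{Y\le y_n\}|X=x\right]\right|\le y_n^k\,\omega^{(2)}_{h_n}(y_n|x).$$

Now

$$\mathbb{E}\left[(y_n-Y)^k\mathbb{1}\{Y\le y_n\}|X=x\right]=y_n^k\,\mathbb{E}\left[\left(1-\frac{Y}{y_n}\right)^k\mathbb{1}\{Y\le y_n\}\,\middle|\,X=x\right]=y_n^k\,(1+o(1))$$

by the dominated convergence theorem, and

$$\mathbb{E}\left[(Y-y_n)^k\mathbb{1}\{Y>y_n\}|X=x\right]g(x)=g(x)\,\frac{B\left(k+1,\gamma(x)^{-1}-k\right)}{\gamma(x)}\,y_n^k\,\bar F^{(1)}(y_n|x)\,(1+o(1)), \tag{15}$$

see for instance Lemma 1(i) in [8]. The result follows from direct calculations.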

Lemma 2 Assume that (K), (L) and $C_2(\gamma(x),\rho(x),A(\cdot|x))$ hold, and let $y_n\to\infty$ and $h_n\to0$ be such that $nh_n^d\to\infty$, $\omega^{(1)}_{h_n}(y_n|x)\log(y_n)\to0$ and $\omega^{(2)}_{h_n}(y_n|x)\to0$. Then for all $0\le k<\gamma(x)^{-1}/2$,

$$\mathbb{E}\left[\hat m^{(k)}_n(y_n|x)\right]=m^{(k)}(y_n|x)\left(1+O(h_n)+o\left(\omega^{(1)}_{h_n}(y_n|x)\right)+O\left(\omega^{(2)}_{h_n}(y_n|x)\right)\right)$$

and

$$\mathrm{Var}\left[\hat m^{(k)}_n(y_n|x)\right]=\frac{\|K\|_2^2}{nh_n^d}\,g(x)\,y_n^{2k}\,(1+o(1)).$$

Proof Note that $\mathbb{E}\left[\hat m^{(k)}_n(y_n|x)\right]=\int_Sm^{(k)}(y_n|x-uh_n)\,K(u)\,du$ by Assumption (K) and a change of variables, and use Lemma 1 to get the first result. The second result is obtained through similar calculations.

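Lemma 3 Assume that (K), (L) and $C_2(\gamma(x),\rho(x),A(\cdot|x))$ hold. Let $y_n\to\infty$, $h_n\to0$ be such that $nh_n^d\to\infty$ and $\omega^{(1)}_{h_n}(y_n|x)\log(y_n)\to0$. Then for all $0\le k<\gamma(x)^{-1}/2$,

$$\begin{cases}\mathbb{E}\left[\hat\phi^{(k)}_n(y_n|x)\right]=\phi^{(k)}(y_n|x)\left(1+O(h_n)+O\left(\omega^{(1)}_{h_n}(y_n|x)\log(y_n)\right)\right),\\[3mm] \mathrm{Var}\left[\hat\phi^{(k)}_n(y_n|x)\right]=\|K\|_2^2\,g(x)\,\dfrac{B\left(2k+1,\gamma(x)^{-1}-2k\right)}{\gamma(x)}\,\dfrac{y_n^{2k}\,\bar F^{(1)}(y_n|x)}{nh_n^d}\,(1+o(1)).\end{cases}$$

Proof See Lemma 5 of [17].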

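Lemma 4 Assume that $C_2(\gamma(x),\rho(x),A(\cdot|x))$ holds. Let $\lambda\ge1$, $y_n\to\infty$, $y'_n=\lambda y_n(1+o(1))$ and $0<k<\gamma(x)^{-1}$.

(i) Then the following asymptotic relationship holds:

$$\mathbb{E}\left[|Y-y_n|^k\mathbb{1}\{Y>y'_n\}|X=x\right]=y_n^k\,\bar F^{(1)}(y_n|x)\left[k\,IB\left(\lambda^{-1},\gamma(x)^{-1}-k,k\right)+(\lambda-1)^k\lambda^{-1/\gamma(x)}\right](1+o(1)).$$

(ii) Assume further that $\omega^{(1)}_{h_n}(y_n\wedge y'_n|x)\log(y_n)\to0$ and $\omega^{(3)}_{h_n}(y_n|x)\to0$. Then, uniformly in $x'\in B(x,h_n)$,

$$\mathbb{E}\left[|Y-y_n|^k\mathbb{1}\{Y>y'_n\}|X=x'\right]=\mathbb{E}\left[|Y-y_n|^k\mathbb{1}\{Y>y'_n\}|X=x\right](1+o(1)).$$

Proof (i) Straightforward calculations entail

$$\mathbb{E}\left[|Y-y_n|^k\mathbb{1}\{Y>y'_n\}|X=x\right]=y_n^k\,\mathbb{E}\left[\left(\left(\frac{Y}{y_n}-1\right)^k-(\lambda-1)^k\right)\mathbb{1}\{Y>\lambda y_n\}\,\middle|\,X=x\right](1+o(1))+y_n^k(\lambda-1)^k\,\bar F^{(1)}(\lambda y_n|x)\,(1+o(1)),$$

with $y'_n=\lambda y_n(1+o(1))$. The result then comes directly from the regular variation property of $\bar F^{(1)}(\cdot|x)$ and Lemma 1 in [8] with $H(t)=(t-1)^k$ and $b=\lambda$.

(ii) Note first that for $n$ large enough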
