HAL Id: hal-00008881
https://hal.archives-ouvertes.fr/hal-00008881
Submitted on 20 Sep 2005
Bias-reduced estimators of the Weibull tail-coefficient
Jean Diebolt, Laurent Gardes, Stéphane Girard, Armelle Guillou
To cite this version:
Jean Diebolt, Laurent Gardes, Stéphane Girard, Armelle Guillou. Bias-reduced estimators of the Weibull tail-coefficient. Test, Spanish Society of Statistics and Operations Research/Springer, 2008, 17 (2), pp. 311-331. 10.1007/s11749-006-0034-6. hal-00008881
BIAS-REDUCED ESTIMATORS OF THE WEIBULL TAIL-COEFFICIENT

Jean Diebolt (1), Laurent Gardes (2), Stéphane Girard (3,∗) and Armelle Guillou (4)

(1) CNRS, Université de Marne-la-Vallée, Équipe d'Analyse et de Mathématiques Appliquées, 5 boulevard Descartes, Bâtiment Copernic, Champs-sur-Marne, 77454 Marne-la-Vallée Cedex 2
(2) Université Grenoble 2, LabSAD, 1251 Avenue centrale, B.P. 47, 38040 Grenoble Cedex 9
(3,∗) Corresponding author. Université Grenoble 1, LMC-IMAG, 51 rue des Mathématiques, B.P. 53, 38041 Grenoble Cedex 9. Phone: +(33) 4 76 51 45 53, Fax: +(33) 4 76 63 12 63, Stephane.Girard@imag.fr
(4) Université Paris VI, Laboratoire de Statistique Théorique et Appliquée, Boîte 158, 175 rue du Chevaleret, 75013 Paris
Abstract. In this paper, we consider the problem of estimating a Weibull tail-coefficient θ. In particular, we propose a regression model, from which we derive a bias-reduced estimator of θ based on a least-squares approach. The asymptotic normality of this estimator is also established, and a small simulation study illustrates its efficiency.
Key words and phrases. Weibull tail-coefficient, bias reduction, least-squares approach, asymptotic normality.
AMS Subject classifications. 62G05, 62G20, 62G30.
1. Introduction.
Let $X_1, \ldots, X_n$ be a sequence of independent and identically distributed random variables with distribution function $F$, and let $X_{1,n} \leq \ldots \leq X_{n,n}$ denote the order statistics associated to this sample.
In the present paper, we address the problem of estimating the Weibull tail-coefficient $\theta > 0$ defined as
$$1 - F(x) = \exp(-H(x)) \quad \text{with} \quad H^{-1}(x) := \inf\{t : H(t) \geq x\} = x^{\theta} \ell(x), \qquad (1)$$
where $\ell$ is a slowly varying function at infinity satisfying
$$\frac{\ell(\lambda x)}{\ell(x)} \longrightarrow 1, \text{ as } x \to \infty, \text{ for all } \lambda > 0. \qquad (2)$$
Girard (2004) investigated this estimation problem and proposed the following estimator of $\theta$:
$$\tilde{\theta}_n = \frac{\displaystyle\sum_{i=1}^{k_n} \left( \log X_{n-i+1,n} - \log X_{n-k_n+1,n} \right)}{\displaystyle\sum_{i=1}^{k_n} \left( \log\log\frac{n}{i} - \log\log\frac{n}{k_n} \right)}, \qquad (3)$$
where $k_n$ is an intermediate sequence, i.e. a sequence such that $k_n \to \infty$ and $k_n/n \to 0$ as $n \to \infty$.
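Estimator (3) is a ratio of empirical log-excesses to deterministic log-log spacings and can be computed in a few lines. A minimal sketch (the function name and the Weibull simulation are ours, not the paper's):

```python
import math
import random

def girard_estimator(sample, k):
    """Estimator (3) of the Weibull tail-coefficient theta (Girard, 2004).

    sample -- positive observations X_1, ..., X_n
    k      -- number of upper order statistics (the intermediate sequence k_n)
    """
    n = len(sample)
    x = sorted(sample)  # x[0] <= ... <= x[n-1], so X_{n-i+1,n} = x[n-i]
    num = sum(math.log(x[n - i]) - math.log(x[n - k]) for i in range(1, k + 1))
    den = sum(math.log(math.log(n / i)) - math.log(math.log(n / k))
              for i in range(1, k + 1))
    return num / den

# Sanity check on a Weibull sample: if 1 - F(x) = exp(-x**4), then
# H(x) = x**4, H^{-1}(x) = x**(1/4) and theta = 1/4.
random.seed(1)
sample = [random.expovariate(1.0) ** 0.25 for _ in range(500)]
print(girard_estimator(sample, 100))  # close to 0.25
```

Note that the deterministic denominator requires $k_n < n/e$, which is automatic for an intermediate sequence.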
We refer to Beirlant et al. (1995) and Broniatowski (1993) for other propositions and to Beirlant et al. (2005) for Local Asymptotic Normality (LAN) results. Estimator (3) is close in spirit to the Hill estimator (see Hill, 1975) in the case of Pareto-type distributions. In Girard (2004), the asymptotic normality of $\tilde{\theta}_n$ is established under suitable assumptions. To prove such a result, a second-order condition is required in order to specify the bias term. This assumption can be expressed in terms of the slowly varying function $\ell$ as follows:
Assumption $(R_\ell(b,\rho))$. There exist a constant $\rho \leq 0$ and a rate function $b$ satisfying $b(x) \to 0$ as $x \to \infty$, such that for all $\varepsilon > 0$ and $1 < A < \infty$, we have
$$\sup_{\lambda \in [1,A]} \left| \frac{\log\big( \ell(\lambda x)/\ell(x) \big)}{b(x) K_\rho(\lambda)} - 1 \right| \leq \varepsilon, \quad \text{for } x \text{ sufficiently large, with } K_\rho(\lambda) = \int_1^{\lambda} t^{\rho-1}\,dt.$$
It can be shown that necessarily $|b|$ is regularly varying with index $\rho$ (see e.g. Geluk and de Haan, 1987). Moreover, we focus on the case where the convergence (2) is slow, and thus when the bias term in $\tilde{\theta}_n$ is large. This situation is described by the following assumption:
$$x\, b(x) \to \infty \text{ as } x \to \infty. \qquad (4)$$
Let us note that this condition implies $\rho \geq -1$. Gamma and Gaussian distributions fulfill (4), whereas Weibull distributions do not (see Table 1) since, in this case, the bias term vanishes.
Using this framework, we will establish rigorously in Section 2 the following approximation for the log-spacings of upper order statistics:
$$Z_j := j \log\frac{n}{j} \left( \log X_{n-j+1,n} - \log X_{n-j,n} \right) \approx \left( \theta + b\left( \log\frac{n}{k_n} \right) \left( \frac{\log(n/j)}{\log(n/k_n)} \right)^{\rho} \right) f_j, \qquad (5)$$
for $1 \leq j \leq k_n$, where $(f_1, \ldots, f_{k_n})$ is a vector of independent and standard exponentially distributed random variables.
This exponential regression model is similar to the ones proposed by Beirlant et al.
(1999, 2002) and Feuerverger and Hall (1999) in the case of Pareto-type distributions.
Ignoring $b\left( \log\frac{n}{k_n} \right) \left( \frac{\log(n/j)}{\log(n/k_n)} \right)^{\rho}$ in (5) leads to the maximum likelihood estimator
$$\check{\theta}_n = \frac{1}{k_n} \sum_{j=1}^{k_n} Z_j,$$
which turns out to be an alternative to $\tilde{\theta}_n$.
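The rescaled log-spacings $Z_j$ and the resulting mean $\check{\theta}_n$ are computed directly from the order statistics. A minimal sketch (function names are ours):

```python
import math
import random

def log_spacings(sample, k):
    """Z_j = j log(n/j) (log X_{n-j+1,n} - log X_{n-j,n}), j = 1, ..., k."""
    n = len(sample)
    x = sorted(sample)
    return [j * math.log(n / j) * (math.log(x[n - j]) - math.log(x[n - j - 1]))
            for j in range(1, k + 1)]

def theta_check(sample, k):
    """Maximum likelihood estimator obtained by ignoring the bias term in (5)."""
    z = log_spacings(sample, k)
    return sum(z) / k

# On a Weibull sample (theta = 1/4 and vanishing bias term b = 0) the mean
# of the Z_j should be close to theta.
random.seed(2)
sample = [random.expovariate(1.0) ** 0.25 for _ in range(500)]
print(theta_check(sample, 100))
```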
The full model (5) allows us to generate bias-corrected estimates $\hat{\theta}_n$ of $\theta$ through maximum likelihood estimation of $\theta$, $b(\log n/k_n)$ and $\rho$ for each $1 \leq k_n \leq n-1$. An alternative to this approach consists in using a canonical choice for $\rho$ and estimating the two other parameters by a least-squares (LS) method. For the canonical choice of $\rho$ we can use, for instance, the value $-1$, which coincides with the one proposed by Feuerverger and Hall (1999) for the regression model in the case of Pareto-type distributions. The asymptotic normality of the resulting LS-estimator is established in Section 3. In order to illustrate the usefulness of the bias term, we provide a small simulation study in Section 4. The proofs of our results are postponed to Section 6.
2. Exponential regression model
In this section, we formalize (5). First, remark that
$$F^{-1}(x) = \left[ -\log(1-x) \right]^{\theta} \ell\left( -\log(1-x) \right).$$
Since $X_{n-j+1,n} \stackrel{d}{=} F^{-1}(U_{n-j+1,n})$, $1 \leq j \leq k_n$, where $U_{j,n}$ denotes the $j$-th order statistic of a uniform sample of size $n$, we have
$$X_{n-j+1,n} \stackrel{d}{=} \left[ -\log(1-U_{n-j+1,n}) \right]^{\theta} \ell\left( -\log(1-U_{n-j+1,n}) \right),$$
which implies that
$$\log X_{n-j+1,n} \stackrel{d}{=} \theta \log\left[ -\log(1-U_{n-j+1,n}) \right] + \log\left[ \ell\left( -\log(1-U_{n-j+1,n}) \right) \right].$$
Moreover, considering the order statistics from an independent standard exponential sample, $E_{n-j+1,n} \stackrel{d}{=} -\log(1-U_{n-j+1,n})$. Therefore
$$\log X_{n-j+1,n} \stackrel{d}{=} \theta \log(E_{n-j+1,n}) + \log\left[ \ell(E_{n-j+1,n}) \right] =: A_n(j) + B_n(j).$$
Our basic result now reads as follows.
Theorem 1. Suppose (1) holds together with $(R_\ell(b,\rho))$ and (4). Then, if $k_n \to \infty$ and $\log k_n / \log n \to 0$, we have
$$\sup_{1 \leq j \leq k_n} \left| j \log\frac{n}{j} \left( \log X_{n-j+1,n} - \log X_{n-j,n} \right) - \left( \theta + b\left( \log\frac{n}{k_n} \right) \left( \frac{\log(n/j)}{\log(n/k_n)} \right)^{\rho} \right) f_j \right| = o_P\left( b\left( \log\frac{n}{k_n} \right) \right), \qquad (6)$$
where $(f_1, \ldots, f_{k_n})$ is a vector of independent and standard exponentially distributed random variables.
The proof of this theorem is based on the following two lemmas.

Lemma 1. Suppose (1) holds together with $(R_\ell(b,\rho))$ and (4). Then, if $k_n \to \infty$ and $k_n/n \to 0$, we have
$$\sup_{1 \leq j \leq k_n} \left| j \log\frac{n}{j} \left[ A_n(j) - A_n(j+1) \right] - \theta f_j \right| = o_P\left( b\left( \log\frac{n}{k_n} \right) \right), \qquad (7)$$
and

Lemma 2. Suppose (1) holds together with $(R_\ell(b,\rho))$. Then, if $k_n \to \infty$ and $\log k_n / \log n \to 0$, we have
$$\sup_{1 \leq j \leq k_n} \left| j \log\frac{n}{j} \left[ B_n(j) - B_n(j+1) \right] - b\left( \log\frac{n}{k_n} \right) \left( \frac{\log(n/j)}{\log(n/k_n)} \right)^{\rho} f_j \right| = o_P\left( b\left( \log\frac{n}{k_n} \right) \right). \qquad (8)$$
The proof of these lemmas is postponed to Section 6.
Remark 1. Under the assumptions of Theorem 1, we also have
$$\sup_{1 \leq j \leq k_n} \left| j \log\frac{n}{j} \left( \log X_{n-j+1,n} - \log X_{n-j,n} \right) - \left( \theta + b\left( \log\frac{n}{k_n} \right) \left( \frac{\log(n/j)}{\log(n/k_n)} \right)^{-1} \right) f_j \right| = o_P\left( b\left( \log\frac{n}{k_n} \right) \right),$$
where $(f_1, \ldots, f_{k_n})$ is a vector of independent and standard exponentially distributed random variables.
This implies that one can plug the canonical choice $\rho = -1$ in the regression model (6) without perturbing the approximation. From model (6) we can easily deduce the asymptotic normality of the estimator $\check{\theta}_n$, given in the next theorem.
Theorem 2. Suppose (1) holds together with $(R_\ell(b,\rho))$ and (4). Then, if $k_n \to \infty$, $\sqrt{k_n}\, b(\log(n/k_n)) \to \lambda \in \mathbb{R}$ and, if $\lambda = 0$, $\log k_n / \log n \to 0$, we have
$$\sqrt{k_n} \left( \check{\theta}_n - \theta - b\left( \log\frac{n}{k_n} \right) \frac{1}{k_n} \sum_{j=1}^{k_n} \left( \frac{\log(n/j)}{\log(n/k_n)} \right)^{\rho} \right) \stackrel{d}{\longrightarrow} \mathcal{N}(0, \theta^2).$$
Model (6) now plays a central role in the remainder of this paper: it allows us to generate bias-corrected estimates of θ, as we show in the next section.
3. Bias-reduced estimates of θ
In order to reduce the bias of the estimator $\check{\theta}_n$, we can either estimate simultaneously $\theta$, $b(\log n/k_n)$ and $\rho$ by a maximum likelihood method, or estimate $\theta$ and $b$ by a least-squares approach after substituting a canonical choice for $\rho$. In fact, this second-order parameter is difficult to estimate in practice, and simulations readily show that fixing its value does not greatly influence the result. This problem has already been discussed in Beirlant et al. (1999, 2002) and Feuerverger and Hall (1999), where similar observations were made in the case of Pareto-type distributions. The canonical choice $\rho = -1$ is often used, although other choices could be motivated by performing a model selection.
In all the sequel, we estimate $\theta$ and $b(\log(n/k_n))$ by the LS-method after substituting $\rho$ with the value $-1$. In that case, we find the following LS-estimators:
$$\hat{\theta}_n = \overline{Z}_{k_n} - \hat{b}\left( \log\frac{n}{k_n} \right) \overline{x}_{k_n}, \qquad \hat{b}\left( \log\frac{n}{k_n} \right) = \frac{\sum_{j=1}^{k_n} (x_j - \overline{x}_{k_n})\, Z_j}{\sum_{j=1}^{k_n} (x_j - \overline{x}_{k_n})^2},$$
where
$$x_j = \left( \frac{\log(n/j)}{\log(n/k_n)} \right)^{-1}, \qquad \overline{x}_{k_n} = \frac{1}{k_n} \sum_{j=1}^{k_n} x_j \qquad \text{and} \qquad \overline{Z}_{k_n} = \frac{1}{k_n} \sum_{j=1}^{k_n} Z_j.$$
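These two formulas are an ordinary least-squares fit of the $Z_j$ against the design points $x_j$: $\hat{b}$ is the slope, $\hat{\theta}_n$ the intercept. A minimal sketch (function names ours):

```python
import math
import random

def ls_estimators(sample, k):
    """LS-estimators (theta_hat, b_hat) in model (5) with the canonical
    choice rho = -1, i.e. design points x_j = (log(n/j)/log(n/k))^{-1}."""
    n = len(sample)
    x = sorted(sample)
    # rescaled log-spacings Z_j of Section 1
    z = [j * math.log(n / j) * (math.log(x[n - j]) - math.log(x[n - j - 1]))
         for j in range(1, k + 1)]
    xs = [math.log(n / k) / math.log(n / j) for j in range(1, k + 1)]
    xbar, zbar = sum(xs) / k, sum(z) / k
    b_hat = (sum((u - xbar) * v for u, v in zip(xs, z))
             / sum((u - xbar) ** 2 for u in xs))   # slope
    return zbar - b_hat * xbar, b_hat              # intercept, slope

# Gamma(0.25, 1) has theta = 1 and a non-vanishing bias term.
random.seed(3)
sample = [random.gammavariate(0.25, 1.0) for _ in range(500)]
theta_hat, b_hat = ls_estimators(sample, 150)
print(theta_hat, b_hat)
```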
Our next goal is to establish, under suitable assumptions, the asymptotic normality of $\hat{\theta}_n$. This is done in the following theorem.
Theorem 3. Suppose (1) holds together with $(R_\ell(b,\rho))$ and (4). Then, if $k_n$ is such that $k_n \to \infty$,
$$\frac{\sqrt{k_n}}{\log(n/k_n)}\, b\left( \log\frac{n}{k_n} \right) \to \Lambda \in \mathbb{R} \quad \text{and, if } \Lambda = 0, \quad \frac{\log^2 k_n}{\log(n/k_n)} \to 0 \text{ and } \frac{\sqrt{k_n}}{\log(n/k_n)} \to \infty, \qquad (9)$$
we have
$$\frac{\sqrt{k_n}}{\log(n/k_n)} \left( \hat{\theta}_n - \theta \right) \stackrel{d}{\longrightarrow} \mathcal{N}(0, \theta^2).$$
Remark that the rate of convergence of $\check{\theta}_n$ is the same as that of $\hat{\theta}_n$ in the cases where both $\lambda$ and $\Lambda$ differ from 0.
The proof of this theorem is postponed to Section 6.
In order to illustrate the usefulness of the bias-term in the model (6), we will provide a small simulation study in the next section.
4. A small simulation study
The finite-sample performances of the estimators $\hat{\theta}_n$, $\tilde{\theta}_n$ and $\check{\theta}_n$ are investigated on 5 different distributions: $\Gamma(0.25, 1)$, $\Gamma(4, 1)$, $\mathcal{N}(1.1, 1)$, $\mathcal{W}(0.25, 0.25)$ and $\mathcal{W}(4, 4)$. We limit ourselves to these three estimators, since it is shown in Girard (2004) that $\tilde{\theta}_n$ gives better results than the other approaches (Beirlant et al., 1995; Broniatowski, 1993). In each case, $N = 100$ samples $(X_{n,i})_{i=1,\ldots,N}$ of size $n = 500$ were simulated. On each sample $(X_{n,i})$, the estimates $\hat{\theta}_{n,i}(k_n)$, $\tilde{\theta}_{n,i}(k_n)$ and $\check{\theta}_{n,i}(k_n)$ were computed for $k_n = 2, \ldots, 360$. Finally, the Hill-type plots were built by drawing the points
$$\left( k_n,\ \frac{1}{N} \sum_{i=1}^{N} \hat{\theta}_{n,i}(k_n) \right), \quad \left( k_n,\ \frac{1}{N} \sum_{i=1}^{N} \tilde{\theta}_{n,i}(k_n) \right) \quad \text{and} \quad \left( k_n,\ \frac{1}{N} \sum_{i=1}^{N} \check{\theta}_{n,i}(k_n) \right).$$
We also present the associated MSE (mean square error) plots obtained by plotting the points
$$\left( k_n,\ \frac{1}{N} \sum_{i=1}^{N} \left( \hat{\theta}_{n,i}(k_n) - \theta \right)^2 \right), \quad \left( k_n,\ \frac{1}{N} \sum_{i=1}^{N} \left( \tilde{\theta}_{n,i}(k_n) - \theta \right)^2 \right) \quad \text{and} \quad \left( k_n,\ \frac{1}{N} \sum_{i=1}^{N} \left( \check{\theta}_{n,i}(k_n) - \theta \right)^2 \right).$$
The results are presented in Figures 1–5. In all the plots, the graphs associated with $\tilde{\theta}_n$ and $\check{\theta}_n$ are similar, with a slightly better behaviour of $\check{\theta}_n$. The bias-corrected estimator $\hat{\theta}_n$ always yields a smaller bias than the two previous ones, leading to better results for Gamma and Gaussian distributions (Figures 1–3). On Weibull distributions (Figures 4–5), it presents a larger variance.
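A reduced version of this study (smaller $N$, a single value of $k_n$, and our own function names) can be sketched as follows, comparing $\check{\theta}_n$ and the bias-corrected $\hat{\theta}_n$ on $\Gamma(4,1)$, for which $\theta = 1$:

```python
import math
import random

def estimators(sample, k):
    """Return (theta_check, theta_hat) computed from one sample."""
    n = len(sample)
    x = sorted(sample)
    z = [j * math.log(n / j) * (math.log(x[n - j]) - math.log(x[n - j - 1]))
         for j in range(1, k + 1)]
    xs = [math.log(n / k) / math.log(n / j) for j in range(1, k + 1)]
    xbar, zbar = sum(xs) / k, sum(z) / k
    b_hat = (sum((u - xbar) * v for u, v in zip(xs, z))
             / sum((u - xbar) ** 2 for u in xs))
    return zbar, zbar - b_hat * xbar

random.seed(4)
theta, n, k, N = 1.0, 500, 100, 50
check, hat = [], []
for _ in range(N):
    sample = [random.gammavariate(4.0, 1.0) for _ in range(n)]
    c, h = estimators(sample, k)
    check.append(c)
    hat.append(h)

mse = lambda est: sum((e - theta) ** 2 for e in est) / N
print("mean/MSE of theta_check:", sum(check) / N, mse(check))
print("mean/MSE of theta_hat:  ", sum(hat) / N, mse(hat))
```

Looping over a grid of $k_n$ values and plotting the two summaries reproduces the Hill-type and MSE plots described above.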
5. Concluding remarks
In this paper, we introduce a regression model from which we derive a bias-reduced estimator of the Weibull tail-coefficient θ. Its asymptotic normality is established and its efficiency is illustrated in a small simulation study. However, in many cases of practical interest, the problem of estimating a quantile $x_{p_n} = F^{-1}(1-p_n)$, with $p_n < 1/n$, is much more important. Such a problem has already been studied in Gardes and Girard (2005), where the following Weissman-type estimator has been introduced:
$$\tilde{x}_{p_n} = X_{n-k_n+1,n} \left( \frac{\log(1/p_n)}{\log(n/(k_n+1))} \right)^{\tilde{\theta}_n}.$$
It is, however, desirable to refine $\tilde{x}_{p_n}$ with the additional information about the slowly varying function $\ell$ that is provided by the LS-estimates of $\theta$ and $b$. To this aim, condition $(R_\ell(b,\rho))$ is used to approximate the ratio $F^{-1}(1-p_n)/X_{n-k_n+1,n}$. Noting that $X_{n-k_n+1,n} \stackrel{d}{=} F^{-1}(U_{n-k_n+1,n})$, with $U_{1,n} \leq \ldots \leq U_{n,n}$ the order statistics of a uniform $(0,1)$ sample of size $n$,
$$\frac{x_{p_n}}{X_{n-k_n+1,n}} \stackrel{d}{=} \frac{F^{-1}(1-p_n)}{F^{-1}(U_{n-k_n+1,n})} \stackrel{d}{=} \frac{(-\log p_n)^{\theta}}{\left( -\log(1-U_{n-k_n+1,n}) \right)^{\theta}} \cdot \frac{\ell(-\log p_n)}{\ell\left( -\log(1-U_{n-k_n+1,n}) \right)} \simeq \left( \frac{\log(1/p_n)}{\log(n/(k_n+1))} \right)^{\theta} \exp\left[ b\left( \log\frac{n}{k_n} \right) \frac{\left( \frac{\log(1/p_n)}{\log(n/(k_n+1))} \right)^{\rho} - 1}{\rho} \right].$$
The last step follows from replacing $U_{k_n+1,n}$ (resp. $E_{n-k_n+1,n}$) by $(k_n+1)/n$ (resp. $\log(n/(k_n+1))$). Hence, we arrive at the following estimator of extreme quantiles:
$$\hat{x}_{p_n} = X_{n-k_n+1,n} \left( \frac{\log(1/p_n)}{\log(n/(k_n+1))} \right)^{\hat{\theta}_n} \exp\left[ \hat{b}\left( \log\frac{n}{k_n} \right) \frac{\left( \frac{\log(1/p_n)}{\log(n/(k_n+1))} \right)^{\hat{\rho}} - 1}{\hat{\rho}} \right].$$
Here, the LS-estimators of θ and b can be used after substituting ρ by the canonical choice −1. The study of the asymptotic properties of such an estimator is beyond the scope of the present paper and is left for further investigation.
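Both quantile estimators share the same structure, the refined one simply multiplying by the exponential correction factor. A minimal sketch (function names ours; the refined call would take the LS-estimates and the canonical $\hat{\rho} = -1$):

```python
import math
import random

def weissman_quantile(sample, k, p, theta, b=None, rho=-1.0):
    """Weissman-type estimator of x_p = F^{-1}(1-p) for p < 1/n.

    With b=None this is the estimator of Gardes and Girard (2005);
    otherwise the multiplicative exp-correction derived above is applied."""
    n = len(sample)
    x = sorted(sample)
    ratio = math.log(1.0 / p) / math.log(n / (k + 1))
    q = x[n - k] * ratio ** theta        # X_{n-k+1,n} * ratio^theta
    if b is not None:
        q *= math.exp(b * (ratio ** rho - 1.0) / rho)
    return q

# Exact check on the unit exponential: theta = 1, ell = 1, b = 0, and the
# true quantile is x_p = log(1/p).
random.seed(5)
sample = [random.expovariate(1.0) for _ in range(1000)]
print(weissman_quantile(sample, 100, 0.001, theta=1.0))  # true value is log(1000) = 6.91
```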
6. Proofs of our results
6.1 Preliminary lemmas
Lemma 3. For all $1 \leq j \leq k_n$ such that $k_n \to \infty$ and $k_n/n \to 0$, we have
$$\frac{E_{n-j,n}}{\log(n/j)} = 1 + O_P\left( \frac{1}{\log(n/k_n)} \right) \quad \text{uniformly in } j.$$
Proof of Lemma 3. According to Rényi's representation, we have
$$E_{n-j,n} \stackrel{d}{=} \sum_{\ell=1}^{n-j} \frac{f_{n-\ell-j+1}}{\ell+j-1}, \quad \text{where the } f_i \text{ are i.i.d. } \mathrm{Exp}(1).$$
Since
$$\mathrm{Var}\left( \sum_{\ell=1}^{n-j} \frac{f_{n-\ell-j+1}}{\ell+j-1} \right) = O(1),$$
denoting
$$T_{j,n} := \sum_{\ell=1}^{n-j} \left[ \frac{f_{n-\ell-j+1}}{\ell+j-1} - \mathbb{E}\left( \frac{f_{n-\ell-j+1}}{\ell+j-1} \right) \right],$$
we have, using Kolmogorov's inequality (see e.g. Shorack and Wellner, 1986, p. 843), that
$$\mathbb{P}\left( \max_{1 \leq j \leq k_n} |T_{j,n}| \geq \lambda \right) \leq \frac{\mathrm{Var}(T_{1,n})}{\lambda^2}, \qquad \lambda > 0.$$
This implies that $T_{j,n} = O_P(1)$ uniformly in $j$. Taking into account the fact that
$$\left| \sum_{\ell=j}^{n} \frac{1}{\ell} - \log\frac{n}{j} \right| = O(1) \quad \text{uniformly in } j,\ 1 \leq j \leq k_n,$$
it is easy to deduce Lemma 3. □
Let us introduce the $E_m$-function defined by the integral
$$E_m(x) := \int_1^{\infty} \frac{e^{-xt}}{t^m}\,dt$$
for a positive integer $m$. The asymptotic expansion of this integral is given in the following lemma.

Lemma 4. As $x \to \infty$, for any fixed positive integers $m$ and $p$, we have
$$E_m(x) = \frac{e^{-x}}{x} \left\{ 1 - \frac{m}{x} + \frac{m(m+1)}{x^2} + \ldots + (-1)^p\, \frac{m(m+1)\cdots(m+p-1)}{x^p} + O\left( \frac{1}{x^{p+1}} \right) \right\}.$$
The proof of this lemma is straightforward from Abramowitz and Stegun (1972, pp. 227–233), and the $O$-term can be obtained by a Taylor expansion with an integral remainder.
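Lemma 4 is easy to check numerically by comparing a brute-force quadrature of $E_m(x)$ with the truncated expansion. A rough sketch (helper names ours):

```python
import math

def E_m(m, x, steps=100000):
    """Composite-trapezoid value of E_m(x) = int_1^inf e^{-xt} / t^m dt.
    The integrand is negligible beyond t = 1 + 40/x for the x used here."""
    a, b = 1.0, 1.0 + 40.0 / x
    h = (b - a) / steps
    s = 0.5 * (math.exp(-x * a) / a ** m + math.exp(-x * b) / b ** m)
    for i in range(1, steps):
        t = a + i * h
        s += math.exp(-x * t) / t ** m
    return s * h

def E_m_asymptotic(m, x, p):
    """Expansion of Lemma 4 truncated after the term of order x^{-p}."""
    s, term = 1.0, 1.0
    for i in range(1, p + 1):
        term *= -(m + i - 1) / x   # builds (-1)^i m(m+1)...(m+i-1) / x^i
        s += term
    return math.exp(-x) / x * s

# At x = 30, m = 2, p = 2 the relative error is of order m(m+1)(m+2)/x^3 ~ 1e-3.
print(E_m(2, 30.0), E_m_asymptotic(2, 30.0, 2))
```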
Denote
$$\mu_p := \frac{1}{k_n} \sum_{j=1}^{k_n} \left( x_j - \overline{x}_{k_n} \right)^p, \qquad p \in \mathbb{N}^*.$$
The next lemma provides a first-order expansion of this Riemann sum.

Lemma 5. If $k_n \to \infty$, $k_n/n \to 0$, $k_n/\log(n/k_n) \to \infty$ and $\log^2 k_n/\log(n/k_n) \to 0$, then
$$\mu_p \sim C_p \left( \log\frac{n}{k_n} \right)^{-p} \text{ as } n \to \infty, \qquad \text{where } C_p = \int_0^1 \left( \log x + 1 \right)^p dx < \infty.$$

Proof of Lemma 5. Denote $\alpha_n = 1/\log(n/k_n)$. Then $\overline{x}_{k_n}$ can be rewritten as
$$\overline{x}_{k_n} = \frac{1}{k_n} + \left( \frac{1}{k_n} \sum_{j=1}^{k_n-1} f_n(j/k_n) - \int_0^1 f_n(x)\,dx \right) + \int_0^1 f_n(x)\,dx =: \frac{1}{k_n} + T_1 + T_2,$$
where $f_n(x) = (1 - \alpha_n \log x)^{-1}$, $x \in [0,1]$.
Denoting by $f_n^{(i)}$, $i \in \{1,2\}$, the $i$-th derivative of $f_n$, we infer that
$$T_1 = \sum_{j=1}^{k_n-1} \int_{j/k_n}^{(j+1)/k_n} \left( \frac{j}{k_n} - t \right) f_n^{(1)}\left( \frac{j}{k_n} \right) dt + \sum_{j=1}^{k_n-1} \int_{j/k_n}^{(j+1)/k_n} \int_{j/k_n}^{t} (x-t)\, f_n^{(2)}(x)\,dx\,dt - \int_0^{1/k_n} f_n(x)\,dx =: T_3 + T_4 + T_5.$$
Remark that
$$T_3 = -\frac{1}{2k_n} \left( \frac{1}{k_n} \sum_{j=1}^{k_n-1} f_n^{(1)}\left( \frac{j}{k_n} \right) - \int_{1/k_n}^{1} f_n^{(1)}(t)\,dt \right) - \frac{1}{2k_n} \int_{1/k_n}^{1} f_n^{(1)}(t)\,dt =: -\frac{1}{2k_n} T_6 + T_7.$$
Since $f_n^{(1)}$ is positive and decreasing on $[1/k_n, 1]$ for $n$ sufficiently large, we can prove that
$$|T_4| \leq \frac{1}{2k_n^2} \left| f_n^{(1)}\left( \frac{1}{k_n} \right) - f_n^{(1)}(1) \right| = o\left( \frac{1}{k_n} \right), \qquad T_5 = O\left( \frac{1}{k_n} \right),$$
$$|T_6| \leq \frac{1}{k_n} \left| f_n^{(1)}\left( \frac{1}{k_n} \right) - f_n^{(1)}(1) \right| = o(1), \qquad T_7 = -\frac{1}{2k_n} \left( f_n(1) - f_n\left( \frac{1}{k_n} \right) \right) = o\left( \frac{1}{k_n} \right),$$
and consequently $T_1 = O(1/k_n)$. Besides, a direct application of Lemma 4 provides $T_2 = 1 - \alpha_n + O(\alpha_n^2)$. Therefore
$$\overline{x}_{k_n} = 1 - \alpha_n + O\left( \frac{1}{k_n} \right) + O(\alpha_n^2).$$
Now, we can check that
$$\mu_p = \alpha_n^p \left\{ \frac{1}{k_n} \sum_{j=1}^{k_n} \left( \log\frac{j}{k_n} + 1 \right)^p + R_n \right\},$$
where
$$R_n = \frac{1}{k_n} \sum_{j=1}^{k_n-1} \left\{ \left( \log\frac{j}{k_n} + 1 + \varepsilon_n \right)^p - \left( \log\frac{j}{k_n} + 1 \right)^p \right\}$$
with $\varepsilon_n = O(\alpha_n \log^2 k_n) + O\left( \frac{1}{k_n \alpha_n} \right)$, which tends to 0 by assumption. Since
$$\frac{1}{C_p}\, \frac{1}{k_n} \sum_{j=1}^{k_n} \left( \log\frac{j}{k_n} + 1 \right)^p \to 1,$$
in order to conclude the proof of Lemma 5, we only have to remark that $R_n \to 0$. □
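The constants $C_p$ admit a closed form, since $\int_0^1 (\log x)^i\,dx = (-1)^i i!$; in particular $C_2 = 1$ and $C_4 = 9$, the two values that enter the proof of Theorem 3 below. A quick check (helper names ours):

```python
import math

def C(p):
    """C_p = int_0^1 (log x + 1)^p dx = sum_i binom(p, i) (-1)^i i!."""
    return sum(math.comb(p, i) * (-1) ** i * math.factorial(i)
               for i in range(p + 1))

def C_numeric(p, steps=500000):
    """Midpoint-rule value of int_0^1 (log x + 1)^p dx."""
    h = 1.0 / steps
    return sum((math.log((i + 0.5) * h) + 1.0) ** p for i in range(steps)) * h

print([C(p) for p in range(1, 5)])  # [0, 1, -2, 9]
```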
6.2 Proof of Lemma 1
Remark that
$$\alpha_{j,n} := j \log\frac{n}{j} \left[ A_n(j) - A_n(j+1) \right] = \theta\, j \log\frac{n}{j}\, \log\frac{E_{n-j+1,n}}{E_{n-j,n}} = \theta \log\frac{n}{j}\, \frac{j\left( E_{n-j+1,n} - E_{n-j,n} \right)}{E^*_{n-j,n}} \stackrel{d}{=} \theta f_j\, \frac{\log(n/j)}{E^*_{n-j,n}},$$
where $E^*_{n-j,n} \in \left[ E_{n-j,n},\ E_{n-j+1,n} \right]$. Consequently, from Lemma 3,
$$\alpha_{j,n} = \theta f_j + O_P\left( \frac{1}{\log(n/k_n)} \right) = \theta f_j + o_P\left( b\left( \log\frac{n}{k_n} \right) \right), \qquad (10)$$
by the assumption $x\, b(x) \to \infty$ as $x \to \infty$, with a $o_P$-term which is uniform in $j$. Lemma 1 is therefore proved. □
6.3 Proof of Lemma 2
We consider
$$\beta_{j,n} := j \log\frac{n}{j} \left[ B_n(j) - B_n(j+1) \right].$$
In order to study this term, we will use the notations $\lambda_{1j} = E_{n-j+1,n}/E_{n-k_n+1,n}$, $\lambda_{2j} = E_{n-j,n}/E_{n-k_n+1,n}$ and $y_{k_n} = E_{n-k_n+1,n}$, and we rewrite $\beta_{j,n}$ as
$$\beta_{j,n} = j \log\frac{n}{j} \left\{ \log \ell\left( \frac{\lambda_{1j}}{\lambda_{2j}}\, \lambda_{2j}\, y_{k_n} \right) - \log \ell\left( \lambda_{2j}\, y_{k_n} \right) \right\}.$$
It is clear that $1 \leq \lambda_{1j}/\lambda_{2j} \stackrel{P}{\longrightarrow} 1$ uniformly in $j$ by Lemma 3, and therefore, for $n \geq N_0$, $\lambda_{1j}/\lambda_{2j} \in [1,2]$ in probability. Under our assumption $(R_\ell(b,\rho))$ on the slowly varying function, we deduce that
$$\beta_{j,n} = j \log\frac{n}{j} \left\{ b(\lambda_{2j}\, y_{k_n})\, K_\rho\left( \frac{\lambda_{1j}}{\lambda_{2j}} \right) (1 + o_P(1)) \right\}.$$
Now, since $\lambda_{2j} \stackrel{P}{\longrightarrow} 1$ uniformly in $j$ and $b(\cdot)$ is regularly varying with index $\rho$,
$$b(\lambda_{2j}\, y_{k_n}) = \lambda_{2j}^{\rho}\, b(y_{k_n}) (1 + o_P(1))$$
with a $o_P(1)$-term uniform in $j$. Therefore
$$\beta_{j,n} = j \log\frac{n}{j}\, b(y_{k_n}) \left\{ \lambda_{2j}^{\rho}\, K_\rho\left( \frac{\lambda_{1j}}{\lambda_{2j}} \right) (1 + o_P(1)) \right\}.$$
Again, uniformly in $j$,
$$K_\rho\left( \frac{\lambda_{1j}}{\lambda_{2j}} \right) = \left( \frac{\lambda_{1j}}{\lambda_{2j}} - 1 \right) (1 + o_P(1)),$$
which implies that $\beta_{j,n}$ can be rewritten as follows:
$$\beta_{j,n} = -j \log\frac{n}{j}\, b(y_{k_n})\, (\lambda_{2j} - \lambda_{1j})\, \lambda_{2j}^{\rho-1} (1 + o_P(1)).$$
Therefore, we have
$$\beta_{j,n} = f_j \left( \frac{\log(n/j)}{\log(n/k_n)} \right)^{\rho} b(y_{k_n}) (1 + o_P(1)),$$
with a $o_P(1)$-term which is uniform in $j$. This achieves the proof of Lemma 2. □

Remark that, since $\log(n/j)/\log(n/k_n) \to 1$ uniformly in $j$, one also has
$$\beta_{j,n} = f_j \left( \frac{\log(n/j)}{\log(n/k_n)} \right)^{-1} b(y_{k_n}) (1 + o_P(1)),$$
with a $o_P(1)$-term which is uniform in $j$, and this proves Remark 1.
6.4 Proof of Theorem 2
From model (6), we infer that
$$\sqrt{k_n} \left( \check{\theta}_n - \theta - b\left( \log\frac{n}{k_n} \right) \frac{1}{k_n} \sum_{j=1}^{k_n} \left( \frac{\log(n/j)}{\log(n/k_n)} \right)^{\rho} \right) = \sqrt{k_n}\, \theta\, \frac{1}{k_n} \sum_{j=1}^{k_n} (f_j - 1) + \sqrt{k_n}\, b\left( \log\frac{n}{k_n} \right) \frac{1}{k_n} \sum_{j=1}^{k_n} \left( \frac{\log(n/j)}{\log(n/k_n)} \right)^{\rho} (f_j - 1) + o_P\left( \sqrt{k_n}\, b\left( \log\frac{n}{k_n} \right) \right).$$
Now, an application of Chebyshev's inequality gives that
$$\frac{1}{k_n} \sum_{j=1}^{k_n} \left( \frac{\log(n/j)}{\log(n/k_n)} \right)^{\rho} (f_j - 1) = o_P(1).$$
Then, under our assumptions, Theorem 2 follows by an application of the Central Limit Theorem. □
6.5 Proof of Theorem 3
From Remark 1, we have
$$\frac{\sqrt{k_n}}{\log(n/k_n)} \left( \hat{\theta}_n - \theta \right) = \frac{\sqrt{k_n}}{\log(n/k_n)}\, \frac{1}{k_n} \sum_{j=1}^{k_n} \left( \theta + b\left( \log\frac{n}{k_n} \right) x_j \right) \left( 1 - \frac{x_j - \overline{x}_{k_n}}{\mu_2}\, \overline{x}_{k_n} \right) (f_j - 1) + o_P\left( \frac{\sqrt{k_n}}{\log(n/k_n)}\, b\left( \log\frac{n}{k_n} \right) \right).$$
Since we have (9), the $o_P$-term is negligible. The first term can be viewed as a weighted mean of independent and identically distributed variables. Now, using Lyapunov's theorem, we only have to show that
$$\lim_{n \to \infty} \frac{1}{s_{k_n}^4} \sum_{j=1}^{k_n} \mathbb{E}\, X_j^4 = 0,$$
where
$$X_j = \left( \theta + b\left( \log\frac{n}{k_n} \right) x_j \right) \left( 1 - \frac{x_j - \overline{x}_{k_n}}{\mu_2}\, \overline{x}_{k_n} \right) (f_j - 1), \quad j = 1, \ldots, k_n, \qquad \text{and} \qquad s_{k_n}^2 = \sum_{j=1}^{k_n} \mathrm{Var}\, X_j.$$
We remark that
$$s_{k_n}^2 \sim \theta^2 \sum_{j=1}^{k_n} \left( 1 - \frac{x_j - \overline{x}_{k_n}}{\mu_2}\, \overline{x}_{k_n} \right)^2 \quad \text{and} \quad \sum_{j=1}^{k_n} \mathbb{E}\, X_j^4 \sim 9\, \theta^4 \sum_{j=1}^{k_n} \left( 1 - \frac{x_j - \overline{x}_{k_n}}{\mu_2}\, \overline{x}_{k_n} \right)^4 \quad \text{as } n \to \infty,$$
from which we deduce by direct computations that
$$\frac{1}{s_{k_n}^4} \sum_{j=1}^{k_n} \mathbb{E}\, X_j^4 \sim \frac{9}{k_n}\, \frac{\mu_2^4 + 6\, \overline{x}_{k_n}^2\, \mu_2^3 - 4\, \overline{x}_{k_n}^3\, \mu_2\, \mu_3 + \overline{x}_{k_n}^4\, \mu_4}{\left[ \mu_2^2 + \overline{x}_{k_n}^2\, \mu_2 \right]^2} \sim \frac{9\, C_4}{k_n}$$
by Lemma 5. Our Theorem 3 now follows from the fact that $s_{k_n}^2 \sim \theta^2 k_n \log^2(n/k_n)$. □
References
[1] Abramowitz, M., Stegun, I., (1972), Handbook of Mathematical Functions, Dover.
[2] Beirlant, J., Bouquiaux, C., Werker, B., (2005), Semiparametric lower bounds for tail index estimation, Journal of Statistical Planning and Inference, to appear.
[3] Beirlant, J., Broniatowski, M., Teugels, J.L., Vynckier, P., (1995), The mean residual life function at great age: Applications to tail estimation, Journal of Statistical Planning and Inference, 45, 21–48.
[4] Beirlant, J., Dierckx, G., Goegebeur, Y., Matthys, G., (1999), Tail index estimation and an exponential regression model, Extremes, 2, 177–200.
[5] Beirlant, J., Dierckx, G., Guillou, A., Starica, C., (2002), On exponential representa- tions of log-spacings of extreme order statistics, Extremes, 5 (2), 157–180.
[6] Broniatowski, M., (1993), On the estimation of the Weibull tail coefficient, Journal of Statistical Planning and Inference, 35, 349–366.
[7] Feuerverger, A., Hall, P., (1999), Estimating a Tail Exponent by Modelling Depar- ture from a Pareto Distribution, Annals of Statistics, 27, 760–781.
[8] Gardes, L., Girard, S., (2005), Estimating extreme quantiles of Weibull tail-distributions, Communications in Statistics - Theory and Methods, 34, 1065–1080.
[9] Geluk, J.L., de Haan, L., (1987), Regular Variation, Extensions and Tauberian Theorems, Math Centre Tracts, 40, Centre for Mathematics and Computer Science, Amsterdam.
[10] Girard, S., (2004), A Hill type estimate of the Weibull tail-coefficient, Communications in Statistics - Theory and Methods, 33(2), 205–234.
[11] Hill, B.M., (1975), A simple general approach to inference about the tail of a distribution, Annals of Statistics, 3, 1163–1174.
[12] Shorack, G.R., Wellner, J.A., (1986), Empirical Processes with Applications to Statistics, Wiley, New York.
Distribution           θ      b(x)                ρ
Gaussian N(µ, σ²)      1/2    (1/4) (log x)/x     −1
Gamma Γ(α, β)          1      (1 − α) (log x)/x   −1
Weibull W(α, λ)        1/α    0                   −∞

Table 1: Parameters θ, ρ and the function b(x) associated to some usual distributions
[Figures 1–5 about here: Hill-type plots for the five distributions of Section 4, Γ(0.25, 1), Γ(4, 1), N(1.1, 1), W(0.25, 0.25) and W(4, 4). Each figure displays, for the three estimators, (a) the mean of the estimates and (b) the mean square error, as functions of k_n.]