
CONVERGENCE RATES IN EMPIRICAL BAYES PROBLEMS WITH A WEIGHTED SQUARED-ERROR LOSS. THE PARETO DISTRIBUTION CASE



VASILE PREDA and ROXANA CIUMARA

We study the problem of estimating the scale parameter $\theta$ of a Pareto distribution under a weighted squared-error loss through the empirical Bayes approach. An empirical Bayes estimator is proposed and some asymptotic optimality properties are given. Also, under certain conditions, the proposed empirical Bayes estimator is asymptotically optimal with rate of convergence of order $n^{-2/3}$.

AMS 2000 Subject Classification: 62P05.

Key words: empirical Bayes, weighted squared-error loss, asymptotic optimality, rate of convergence.

1. INTRODUCTION

Robbins [12] argued that, for some estimation problems, the information obtained at each step could be used to improve the decision at the next step. These procedures, known as empirical Bayes or adaptive methods, were studied by many authors; among them we recall the work of Johns [4], Samuel [13], Berger and Berliner [2], Preda [9, 11], Tiwari and Zalkikar [16], Liang [5], and Singh [15].

The usefulness of empirical Bayes estimation in practical statistical applications depends on the rate at which the overall risk converges to the optimal risk.

The problem of convergence rates for empirical Bayes estimates was studied by Lin [6], Preda [8], Tiwari and Zalkikar [16] and Liang [5], considering a squared-error loss.

Tiwari and Zalkikar [16] found that, under certain conditions, the empirical Bayes estimator for the scale parameter in the Pareto distribution is asymptotically optimal and the rate of convergence is of order $n^{-1/2}$. Liang [5] used the same squared-error loss, but relaxed the conditions stated in Tiwari and Zalkikar [16] and proved that the proposed empirical Bayes estimator of the same parameter of the Pareto distribution is asymptotically optimal with associated rate of convergence of order $n^{-2/3}$.

REV. ROUMAINE MATH. PURES APPL., 52 (2007), 6, 673–682


In this paper, we consider a weighted squared-error loss and propose an empirical Bayes estimator for the scale parameter of the Pareto distribution.

We assume that the weights are given by a function which satisfies certain properties.

In Section 2, we describe the Pareto distribution with a known shape parameter α and unknown scale parameter θ. Furthermore, the conditions that have to be satisfied in order to obtain the results from Sections 3 and 4 are stated. We define the Bayes risk for a weighted squared-error loss and the overall Bayes risk for a sequence of empirical Bayes estimators. Next, asymptotic optimality and rate of convergence for a sequence of empirical Bayes estimators are defined.

In Section 3, we consider the conditions imposed in Section 2 and propose an empirical Bayes estimator for the unknown scale parameter of the Pareto distribution for a class of prior distributions. A useful study of the Pareto distribution can be found in Arnold [1] and Preda [10].

In Section 4, we study asymptotic optimality and prove that, under the conditions assumed, the rate of convergence is of order $n^{-2/3}$.

2. SOME PRELIMINARIES

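Let $X$ be a random variable having a Pareto distribution with probability density function

$$f(x \mid \theta) = \frac{\alpha\theta^{\alpha}}{x^{\alpha+1}},$$

where $x > \theta$, $\alpha > 0$ and $\theta > 0$. The shape parameter $\alpha$ is known and the scale parameter $\theta$ is unknown. We suppose that the parameter $\theta$ represents a value of a random variable $\Theta$, which has a prior distribution function $G : (0,\infty) \to [0,1]$. In this case, the marginal density of $X$ is given by

$$f(x) = \int_0^{\min(x,m)} f(x \mid \theta)\, dG(\theta) = \int_0^{\min(x,m)} f(x \mid \theta)\, g(\theta)\, d\theta,$$

where $dG(\theta) = g(\theta)\, d\theta$.

Denoting $f(x \mid \theta) = \varphi(\theta)u(x)$, where $\varphi(\theta) = \alpha\theta^{\alpha}$ and $u(x) = \frac{1}{x^{\alpha+1}}$, we get

$$f(x) = u(x)\int_0^{\min(x,m)} \alpha\theta^{\alpha}\, dG(\theta)$$

or

$$x^{\alpha+1}f(x) = \int_0^{\min(x,m)} \alpha\theta^{\alpha}\, dG(\theta).$$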

As for the prior distribution $G$, we impose the conditions below.

Conditions on $G$ (Liang [5]):

(A1) $G(m) = 1$ for some known positive real number $m$.

(A2) If $a = \sup\{\theta \mid G(\theta) = 0\}$, then $f$ is a decreasing function in $x$ on $(a, m]$.

We consider the problem of estimating the parameter $\theta$ under a weighted squared-error loss, $L : \mathbb{R}_+^2 \to \mathbb{R}_+$, defined as

$$L(x, \theta) = w(\theta)\,(x - \theta)^2,$$

with a weight function $w : \mathbb{R}_+ \to \mathbb{R}_+$ that is continuous and differentiable. The robustness of loss functions of this type was studied by Makov [7].

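Next, we suppose that $w$ satisfies the conditions below.

Conditions on $w$:

(A3) $\exists\, c_1 \in \mathbb{R}_+$ such that $w(\theta) \le c_1$, $\forall\, \theta \in \mathbb{R}_+$.

(A4) $\exists\, c_2 \in \mathbb{R}_+$ such that $0 \le w(\theta) + \theta w'(\theta) \le c_2$, $\forall\, \theta \in \mathbb{R}_+$, and $\exists\, \varepsilon > 0$ such that $w(\theta) + \theta w'(\theta) > \varepsilon$ on $(0, m]$.

(A5) $\exists\, \varepsilon_0 > 0$ such that $\varepsilon_0 < q(x) = E(w(\Theta) \mid X = x)$, $\forall\, x \in (0, m]$.

Example 2.1. If $w : \mathbb{R}_+ \to \mathbb{R}_+$, $w(\theta) = \frac{1}{1+\theta}$, we get $0 \le w(\theta) \le 1$ and $0 \le w(\theta) + \theta w'(\theta) \le 1$; thus, conditions (A3) and (A4) are satisfied. If we consider a uniform prior distribution on $[0,1]$ and $m = 1$, then condition (A5) also holds since $\frac{1}{2} < q(x) \le 1$.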
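The claims in Example 2.1 can be checked numerically. The sketch below is illustrative only: the finite grid standing in for $\mathbb{R}_+$, the value `alpha = 2.0`, and the helper names `w`, `w_prime`, `trapezoid` and `q` are assumptions that do not appear in the paper. It relies on the fact that, for a uniform prior on $[0,1]$, the posterior density of $\Theta$ given $X = x$ is proportional to $\theta^{\alpha}$ on $(0, \min(x,1))$.

```python
import numpy as np

def w(theta):
    """Weight function of Example 2.1."""
    return 1.0 / (1.0 + theta)

def w_prime(theta):
    """Derivative of the weight function."""
    return -1.0 / (1.0 + theta) ** 2

def trapezoid(y, t):
    """Composite trapezoidal rule on the grid t."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0)

def q(x, alpha, m=1.0, n_grid=20001):
    """q(x) = E(w(Theta) | X = x) under a uniform prior on [0, m]."""
    t = np.linspace(0.0, min(x, m), n_grid)
    weights = t ** alpha                    # unnormalised posterior density
    return trapezoid(w(t) * weights, t) / trapezoid(weights, t)

theta = np.linspace(0.0, 50.0, 5001)        # finite grid standing in for R_+
combo = w(theta) + theta * w_prime(theta)   # equals 1 / (1 + theta)^2

a3_holds = bool(np.all((w(theta) >= 0.0) & (w(theta) <= 1.0)))
a4_holds = bool(np.all((combo >= 0.0) & (combo <= 1.0)))
in_0_m = (theta > 0.0) & (theta <= 1.0)
a4_lower = float(combo[in_0_m].min())       # close to 1/4, reached near theta = 1

alpha = 2.0                                 # illustrative shape parameter
q_values = [q(x, alpha) for x in (0.1, 0.5, 1.0)]
a5_holds = all(0.5 < v <= 1.0 for v in q_values)
```

A grid check of this kind does not prove the conditions; it only guards against algebra slips such as a missing derivative in (A4).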

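The Bayes estimator of $\theta$ given $X = x$ is

$$\varphi_G(x) = \arg\min E(L(X,\Theta) \mid X = x) = \frac{E(\Theta w(\Theta) \mid X = x)}{E(w(\Theta) \mid X = x)}, \tag{2.1}$$

assuming that all posterior expectations involved in the above expression exist and $E(w(\Theta) \mid X = x) \ne 0$.

The Bayes risk of $\varphi_G$ is

$$R(G, \varphi_G) = E(L(\varphi_G(X), \Theta)) = E\left[w(\Theta)\,(\varphi_G(X) - \Theta)^2\right],$$

where the expectation is taken with respect to $(X, \Theta)$.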

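Let $X_1, X_2, \dots, X_n$ be the past data, independent and identically distributed random variables with probability density function $f(x)$. Denote by $\mathbf{X}_n = (X_1, X_2, \dots, X_n)$ the vector of past data and by $\varphi_n(X) = \varphi_n(X, \mathbf{X}_n)$ the empirical Bayes estimator of the parameter $\theta$ based on the past data $\mathbf{X}_n$ and the present observation $X$.

The conditional Bayes risk of $\varphi_n$ given $\mathbf{X}_n$ is

$$R(G, \varphi_n \mid \mathbf{X}_n) = E\left[w(\Theta)\,(\varphi_n(X) - \Theta)^2 \mid \mathbf{X}_n\right]$$

and

$$R(G, \varphi_n) = E(R(G, \varphi_n \mid \mathbf{X}_n))$$

is the overall Bayes risk of $\varphi_n$. Here the expectation is taken with respect to $\mathbf{X}_n$.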

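We note that because $\varphi_G$ is the Bayes estimator, that is, $\varphi_G(x) = \arg\min E(L(X,\Theta) \mid X = x)$, we have

$$R(G, \varphi_G) \le R(G, \varphi_n \mid \mathbf{X}_n)$$

for every vector of past data $\mathbf{X}_n$ and every $n \in \mathbb{N}$. Moreover,

$$R(G, \varphi_G) \le R(G, \varphi_n) \quad \forall\, n \in \mathbb{N}. \tag{2.2}$$

Thus, $R(G, \varphi_n) - R(G, \varphi_G)$ is nonnegative and can be used as a measure of performance of the empirical Bayes estimator $\varphi_n$.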

Definition 2.1 (Robbins [12], Preda [11], Liang [5]). A sequence $(\varphi_n)_{n \ge 1}$ of empirical Bayes estimators is said to be asymptotically optimal if

$$R(G, \varphi_n) - R(G, \varphi_G) \xrightarrow[n \to \infty]{} 0.$$

Moreover, if $R(G, \varphi_n) - R(G, \varphi_G) = O(\alpha_n)$, where $(\alpha_n)_{n \ge 1}$ is a sequence of real numbers with $\alpha_n > 0$ and $\alpha_n \xrightarrow[n \to \infty]{} 0$, then $(\varphi_n)_{n \ge 1}$ is said to be asymptotically optimal with convergence rate of order $\alpha_n$.

3. THE EMPIRICAL BAYES ESTIMATOR

In order to propose an empirical Bayes estimator, we first have to derive the Bayes estimator.

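Theorem 3.1. Under conditions (A1)–(A5), the Bayes estimator of the scale parameter for the Pareto distribution is given by

$$\varphi_G(x) = \begin{cases} x\bar{w}(x) - \dfrac{\bar{M}(x)}{x^{\alpha+1}f(x)} & \text{if } 0 < x \le m, \\[2ex] m\bar{w}(m) - \dfrac{\bar{M}(m)}{m^{\alpha+1}f(m)} & \text{if } x > m, \end{cases} \tag{3.1}$$

where $\bar{w}(x) = \frac{w(x)}{q(x)}$, $\bar{M}(x) = \frac{M(x)}{q(x)}$ and $M(x) = \int_0^x \theta^{\alpha+1}\left(w(\theta) + \theta w'(\theta)\right) dF(\theta)$.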

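Proof. On account of (2.1), we evaluate the numerator $E(\Theta w(\Theta) \mid X = x)$ in the expression of the Bayes estimator.

For $0 < x \le m$ we have

$$E(\Theta w(\Theta) \mid X = x) = x w(x)\,\frac{f(x)}{f(x)} - \frac{1}{x^{\alpha+1}f(x)} \int_0^x \theta^{\alpha+1}\left(w(\theta) + \theta w'(\theta)\right) dF(\theta),$$

that is,

$$E(\Theta w(\Theta) \mid X = x) = x w(x) - \frac{1}{x^{\alpha+1}f(x)}\, M(x), \tag{3.2}$$

where $M(x) = \int_0^x \theta^{\alpha+1}\left(w(\theta) + \theta w'(\theta)\right) dF(\theta)$. It follows from condition (A4) that $M(x) \ge 0$. Since $E(\Theta w(\Theta) \mid X = x) \ge 0$, we get

$$\frac{1}{x^{\alpha+1}f(x)}\, M(x) \le x w(x) \tag{3.3}$$

and

$$E(\Theta w(\Theta) \mid X = x) \le x w(x). \tag{3.4}$$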
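The step leading to (3.2) is stated without detail. One way to fill it in, offered here as a sketch rather than as the authors' own argument, is integration by parts. The auxiliary function $H$ is introduced only for this illustration, and the computation assumes that $w$ is continuously differentiable and that $F$ has density $f$.

```latex
% H is a hypothetical helper, not notation from the paper.
% For 0 < \theta \le m, the identity x^{\alpha+1} f(x) = \int_0^{x} \alpha\theta^{\alpha}\,dG(\theta) gives
H(\theta) := \theta^{\alpha+1} f(\theta) = \int_0^{\theta} \alpha t^{\alpha}\, dG(t),
\qquad dH(\theta) = \alpha\theta^{\alpha}\, dG(\theta), \qquad H(0) = 0 .

% Posterior expectation written against dH:
E(\Theta w(\Theta) \mid X = x)
  = \frac{1}{f(x)} \int_0^{x} \theta w(\theta)\, \frac{\alpha\theta^{\alpha}}{x^{\alpha+1}}\, dG(\theta)
  = \frac{1}{x^{\alpha+1} f(x)} \int_0^{x} \theta w(\theta)\, dH(\theta).

% Integration by parts, using (\theta w(\theta))' = w(\theta) + \theta w'(\theta):
\int_0^{x} \theta w(\theta)\, dH(\theta)
  = x w(x) H(x) - \int_0^{x} H(\theta)\bigl(w(\theta) + \theta w'(\theta)\bigr)\, d\theta
  = x^{\alpha+2} w(x) f(x) - \int_0^{x} \theta^{\alpha+1}\bigl(w(\theta) + \theta w'(\theta)\bigr) f(\theta)\, d\theta .

% Since f(\theta)\,d\theta = dF(\theta), the last integral is M(x); dividing by x^{\alpha+1} f(x):
E(\Theta w(\Theta) \mid X = x) = x w(x) - \frac{M(x)}{x^{\alpha+1} f(x)} .
```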

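For $x > m$ we have

$$E(\Theta w(\Theta) \mid X = x) = \frac{m^{\alpha+2} w(m)}{x^{\alpha+1}f(x)}\, f(m) - \frac{M(m)}{x^{\alpha+1}f(x)}.$$

Since

$$x^{\alpha+1}f(x) = m^{\alpha+1}f(m) \tag{3.5}$$

for $x > m$, we get

$$E(\Theta w(\Theta) \mid X = x) = m w(m) - \frac{M(m)}{m^{\alpha+1}f(m)} \tag{3.6}$$

for all $x > m$.

Because $E(w(\Theta) \mid X = x) = q(x) > 0$, $\bar{w}(x) = \frac{w(x)}{q(x)}$ and $\bar{M}(x) = \frac{M(x)}{q(x)}$, for $0 < x \le m$ we obviously have

$$\varphi_G(x) = x\bar{w}(x) - \frac{\bar{M}(x)}{x^{\alpha+1}f(x)}.$$

Since $q(x) = q(m)$ for $x > m$, we have

$$\varphi_G(x) = m\bar{w}(m) - \frac{\bar{M}(m)}{m^{\alpha+1}f(m)}.$$

Thus, for $x > m$,

$$\varphi_G(x) = \varphi_G(m).$$

We proved before that $\frac{1}{x^{\alpha+1}f(x)} M(x) \le x w(x)$ for $0 < x \le m$. Since we imposed condition (A5) and $q(x) > 0$, the previous inequality implies

$$\frac{\bar{M}(x)}{x^{\alpha+1}f(x)} \le x\bar{w}(x). \tag{3.7}$$

Now, we can express the Bayes estimator of $\theta$ as

$$\varphi_G(x) = \begin{cases} x\bar{w}(x) - \dfrac{\bar{M}(x)}{x^{\alpha+1}f(x)} & \text{if } 0 < x \le m, \\[2ex] m\bar{w}(m) - \dfrac{\bar{M}(m)}{m^{\alpha+1}f(m)} & \text{if } x > m. \end{cases}$$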
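As a sanity check on Theorem 3.1, the ratio (2.1) and the piecewise expression (3.1) can be compared numerically in the setting of Example 2.1 (weight $1/(1+\theta)$, uniform prior on $[0,1]$, $m = 1$). This is only a sketch: the value `ALPHA = 2.0`, the quadrature grid and all function names are assumptions made for the illustration.

```python
import numpy as np

ALPHA = 2.0      # illustrative shape parameter (assumption)
M_BOUND = 1.0    # m in (A1); the prior is uniform on [0, 1]

def w(t):
    return 1.0 / (1.0 + t)

def w_prime(t):
    return -1.0 / (1.0 + t) ** 2

def trapezoid(y, t):
    """Composite trapezoidal rule on the grid t."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0)

def marginal_f(x):
    """Marginal density of X for the uniform prior on [0, 1]."""
    return ALPHA / (ALPHA + 1.0) * np.maximum(x, M_BOUND) ** (-(ALPHA + 1.0))

def posterior_means(x, n_grid=20001):
    """Return E(Theta w(Theta) | X = x) and q(x) = E(w(Theta) | X = x)."""
    t = np.linspace(0.0, min(x, M_BOUND), n_grid)
    dens = t ** ALPHA                      # unnormalised posterior density
    norm = trapezoid(dens, t)
    return trapezoid(t * w(t) * dens, t) / norm, trapezoid(w(t) * dens, t) / norm

def bayes_direct(x):
    """Bayes estimator computed from the ratio (2.1)."""
    num, q = posterior_means(x)
    return num / q

def bayes_closed_form(x, n_grid=20001):
    """Bayes estimator computed from the piecewise expression (3.1)."""
    x0 = min(x, M_BOUND)                   # for x > m the value at m is used
    t = np.linspace(0.0, x0, n_grid)
    big_m = trapezoid(t ** (ALPHA + 1.0) * (w(t) + t * w_prime(t)) * marginal_f(t), t)
    _, q = posterior_means(x0)
    return (x0 * w(x0) - big_m / (x0 ** (ALPHA + 1.0) * marginal_f(x0))) / q

# Absolute gaps between the two expressions at a few test points
gaps = [abs(bayes_direct(x) - bayes_closed_form(x)) for x in (0.3, 0.7, 1.0, 2.5)]
```

The two routes should agree up to quadrature error, and `bayes_closed_form` returns the same value for every `x` above `M_BOUND`, mirroring $\varphi_G(x) = \varphi_G(m)$.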

Let $(b_n)_{n \ge 1}$ be a sequence of strictly positive real numbers such that $b_n \to 0$ and $n b_n \to \infty$ as $n \to \infty$. We define

$$f_n(x) = \frac{F_n(x + b_n) - F_n(x)}{b_n},$$

where $F_n(x)$ is the empirical distribution function based on $X_1, X_2, \dots, X_n$. We note that $f_n(x)$ can be expressed as

$$f_n(x) = \frac{1}{n b_n} \sum_{j=1}^{n} I_{(x,\, x+b_n]}(X_j). \tag{3.8}$$

Moreover, $E(f_n(x)) = \frac{F(x + b_n) - F(x)}{b_n} \xrightarrow[n \to \infty]{} f(x)$. Thus, $f_n(x)$ is a consistent estimator of $f(x)$ (Iosifescu, Mihoc and Theodorescu [3]).
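The two equivalent forms of $f_n$ can be sketched in a few lines. The function names and the toy sample are assumptions chosen for illustration; only the formulas come from the text.

```python
import numpy as np

def f_n(x, sample, b_n):
    """Density estimate (3.8): share of points in (x, x + b_n], divided by b_n."""
    s = np.asarray(sample, dtype=float)
    in_window = (s > x) & (s <= x + b_n)    # indicator of (x, x + b_n]
    return float(np.count_nonzero(in_window) / (s.size * b_n))

def f_n_via_ecdf(x, sample, b_n):
    """Same estimate written as a difference of empirical distribution functions."""
    s = np.asarray(sample, dtype=float)

    def ecdf(t):
        return np.count_nonzero(s <= t) / s.size

    return float((ecdf(x + b_n) - ecdf(x)) / b_n)

sample = [1.0, 1.2, 1.5, 2.0]               # toy data, not from the paper
value = f_n(1.0, sample, b_n=0.5)           # two of four points fall in (1.0, 1.5]
```

Note the half-open window: a point equal to `x` is excluded and a point equal to `x + b_n` is included, which is what makes the two forms coincide.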

Next, define

$$M_n(x) = \frac{1}{n} \sum_{j=1}^{n} X_j^{\alpha+1}\left(w(X_j) + X_j w'(X_j)\right) I_{(0,x)}(X_j). \tag{3.9}$$

We can easily see that $E(M_n(x)) = M(x)$ since $X_1, X_2, \dots, X_n$ are independent and identically distributed random variables:

$$E(M_n(x)) = \frac{1}{n}\, n\, E\left[X_j^{\alpha+1}\left(w(X_j) + X_j w'(X_j)\right) I_{(0,x)}(X_j)\right] = \int_0^x \theta^{\alpha+1}\left(w(\theta) + \theta w'(\theta)\right) dF(\theta) = M(x).$$

Thus, $M_n(x)$ is a consistent estimator of $M(x)$ (Iosifescu, Mihoc and Theodorescu [3]).
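The estimator $M_n$ is a plain sample average and can be sketched directly. In the snippet below the weight function and its derivative are passed in as callables; the function name `M_n`, the toy sample and the constant weight used in the call are assumptions for illustration.

```python
import numpy as np

def M_n(x, sample, alpha, w, w_prime):
    """Sample average (3.9) of X^(alpha+1) (w(X) + X w'(X)) over points in (0, x)."""
    s = np.asarray(sample, dtype=float)
    terms = s ** (alpha + 1.0) * (w(s) + s * w_prime(s))
    inside = (s > 0.0) & (s < x)            # indicator of the open interval (0, x)
    return float(np.sum(terms[inside]) / s.size)

sample = [0.5, 1.0, 2.0]                    # toy data, not from the paper
const_w = lambda t: 1.0 + 0.0 * t           # w = 1, the unweighted case of Remark 4.1
zero_dw = lambda t: 0.0 * t                 # its derivative
value = M_n(1.5, sample, alpha=1.0, w=const_w, w_prime=zero_dw)   # (0.25 + 1.0) / 3
```

The divisor is the full sample size `n`, not the number of points below `x`, which is what makes the average unbiased for $M(x)$.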

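The empirical Bayes estimator for the scale parameter $\theta$ that we propose is given by

$$\varphi_n(X) = \left[\left(X\bar{w}(X) - \frac{\bar{M}_n(X)}{X^{\alpha+1}f_n(X)}\right) I_{(0,m]}(X) \vee 0\right] + \left[\left(m\bar{w}(m) - \frac{\bar{M}_n(m)}{m^{\alpha+1}f_n(m)}\right) I_{(m,\infty)}(X) \vee 0\right], \tag{3.10}$$

where $\bar{M}_n = \frac{M_n}{q}$ and $a \vee b = \max(a, b)$.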
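Putting (3.8), (3.9) and (3.10) together gives the following sketch of the estimator at a single present observation. Several choices here are assumptions rather than statements from the paper: the function name `phi_n`, passing $q$ as a callable (the text defines $\bar{M}_n = M_n/q$ but does not say how $q$ is obtained in practice), and the convention of returning 0 when $f_n$ vanishes and the ratio in (3.10) is not defined.

```python
import numpy as np

def phi_n(x, sample, b_n, alpha, m, w, w_prime, q):
    """Sketch of the empirical Bayes estimate (3.10) at the present observation x."""
    s = np.asarray(sample, dtype=float)
    x0 = min(x, m)                                    # values x > m are evaluated at m
    # Density estimate (3.8) at x0
    f_hat = np.count_nonzero((s > x0) & (s <= x0 + b_n)) / (s.size * b_n)
    # Sample average (3.9) at x0
    terms = s ** (alpha + 1.0) * (w(s) + s * w_prime(s))
    m_hat = np.sum(terms[(s > 0.0) & (s < x0)]) / s.size
    if f_hat == 0.0:
        return 0.0      # convention chosen here (assumption), not taken from the paper
    value = (x0 * w(x0) - m_hat / (x0 ** (alpha + 1.0) * f_hat)) / q(x0)
    return float(max(value, 0.0))                     # a v 0 = max(a, 0)

sample = [0.5, 1.0, 2.0]                              # toy data, not from the paper
const_w = lambda t: 1.0 + 0.0 * t                     # w = 1
zero_dw = lambda t: 0.0 * t                           # w' = 0
unit_q = lambda t: 1.0                                # q = 1 when w = 1

inside = phi_n(1.5, sample, b_n=1.0, alpha=1.0, m=3.0,
               w=const_w, w_prime=zero_dw, q=unit_q)  # 1.5 - (1.25/3) / 0.75
beyond = phi_n(10.0, sample, b_n=1.0, alpha=1.0, m=3.0,
               w=const_w, w_prime=zero_dw, q=unit_q)  # no data in (3, 4]
```

Because the indicator in each bracket of (3.10) is either 0 or 1, taking the maximum with 0 after selecting the branch, as done here, gives the same value as the bracketed form.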

4. ASYMPTOTIC OPTIMALITY OF THE EMPIRICAL BAYES ESTIMATOR PROPOSED

In this section we study the asymptotic optimality of the empirical Bayes estimator. Our analysis is based on conditions (A1)–(A5). The main result is as follows.

Theorem 4.1. If $(b_n)_{n \ge 1}$ is a sequence of strictly positive real numbers such that $b_n \to 0$ and $n b_n \to \infty$ as $n \to \infty$, $(\varphi_n)_n$ is the sequence of empirical Bayes estimators (3.10) and $\varphi_G$ is the Bayes estimator (3.1), then

$$R(G, \varphi_n) - R(G, \varphi_G) = O\left(\frac{1}{n}\right) + O\left(\frac{1}{n b_n}\right) + O\left(b_n^2\right).$$

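Proof. Since $R(G, \varphi_G) \le R(G, \varphi_n)$ and condition (A3) holds, we have

$$0 \le R(G, \varphi_n) - R(G, \varphi_G) = E(R(G, \varphi_n \mid \mathbf{X}_n)) - E\left[w(\Theta)\,(\varphi_G(X) - \Theta)^2\right] \le c_1 E\left[(\varphi_n(X) - \varphi_G(X))^2\right].$$

Moreover,

$$0 \le R(G, \varphi_n) - R(G, \varphi_G) \le c_1 E\left[(\varphi_n(X) - \varphi_G(X))^2\right] = c_1\left[\int_0^m E\left[(\varphi_n(x) - \varphi_G(x))^2\right] f(x)\, dx + \int_m^{\infty} E\left[(\varphi_n(x) - \varphi_G(x))^2\right] f(x)\, dx\right].$$

For $x > m$ we have $\varphi_G(x) = \varphi_G(m)$ and $\varphi_n(x) = \varphi_n(m)$. Therefore, $\varphi_n(x) - \varphi_G(x) = \varphi_n(m) - \varphi_G(m)$ and, consequently,

$$\int_m^{\infty} E\left[(\varphi_n(x) - \varphi_G(x))^2\right] f(x)\, dx = E\left[(\varphi_n(m) - \varphi_G(m))^2\right] \cdot (1 - F(m)).$$

Assume now that $0 < x \le m$. In this case, on account of condition (A4), we have $0 \le \varphi_G(x) \le x\bar{w}(x) = \frac{x w(x)}{q(x)}$ and $0 \le \varphi_n(x) \le x\bar{w}(x) = \frac{x w(x)}{q(x)}$ because $M_n(x) \ge 0$. We thus obtain

$$|\varphi_n(x) - \varphi_G(x)| \le x\bar{w}(x) = \frac{x w(x)}{q(x)}.$$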

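So, considering the expressions of $\varphi_n(x)$ and $\varphi_G(x)$ and following the same reasoning as in Singh [14], we get

$$\begin{aligned} E\left[(\varphi_n(x) - \varphi_G(x))^2\right] &= E\left[\left(\frac{M_n(x)}{q(x) \cdot x^{\alpha+1}f_n(x)} - \frac{M(x)}{q(x) \cdot x^{\alpha+1}f(x)}\right)^2\right] \\ &\le \frac{8}{f^2(x)q^2(x)}\, E\left[\left(\frac{M_n(x) - M(x)}{x^{\alpha+1}}\right)^2\right] \\ &\quad + \frac{8}{f^2(x)q^2(x)} \left[\left(\frac{M(x)}{x^{\alpha+1}f(x)}\right)^2 + \frac{x^2 w^2(x)}{2}\right] E\left[(f_n(x) - f(x))^2\right]. \end{aligned}$$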

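Since $E(M_n(x)) = M(x)$, we have

$$E\left[\left(\frac{M_n(x) - M(x)}{x^{\alpha+1}}\right)^2\right] = \operatorname{Var}\left(\frac{M_n(x)}{x^{\alpha+1}}\right) = \operatorname{Var}\left(\frac{1}{n}\sum_{j=1}^{n} \frac{X_j^{\alpha+1}}{x^{\alpha+1}}\left(w(X_j) + X_j w'(X_j)\right) I_{(0,x)}(X_j)\right) \le \frac{1}{n}\, c_2^2$$

from condition (A4). Moreover,

$$E\left[(f_n(x) - f(x))^2\right] = \operatorname{Var}(f_n(x)) + \left(E(f_n(x)) - f(x)\right)^2$$

with

$$\operatorname{Var}(f_n(x)) = \operatorname{Var}\left(\frac{1}{n b_n}\sum_{j=1}^{n} I_{(x,\, x+b_n]}(X_j)\right) \le \frac{1}{n b_n^2}\int_x^{x+b_n} f(y)\, dy \le \frac{f(x)}{n b_n},$$

where the last inequality holds because of condition (A2) while, from Liang [5],

$$\left(E(f_n(x)) - f(x)\right)^2 \le \frac{f^2(x)(\alpha+1)^2 b_n^2}{4x^2}.$$

Finally,

$$\left(\frac{M(x)}{x^{\alpha+1}f(x)}\right)^2 + \frac{x^2 w^2(x)}{2} \le x^2 w^2(x) + \frac{x^2 w^2(x)}{2} \le 2x^2 w^2(x).$$

Now, on account of the above expressions, for $0 < x \le m$ we have

$$E\left[(\varphi_n(x) - \varphi_G(x))^2\right] \le \frac{8c_2^2}{n f^2(x) q^2(x)} + \frac{16x^2 c_1^2}{n b_n f(x) q^2(x)} + \frac{4c_1^2(\alpha+1)^2 b_n^2}{q^2(x)}$$

and

$$E\left[(\varphi_n(m) - \varphi_G(m))^2\right] \le \frac{8c_2^2}{n f^2(m) q^2(m)} + \frac{16m^2 c_1^2}{n b_n f(m) q^2(m)} + \frac{4c_1^2(\alpha+1)^2 b_n^2}{q^2(m)} = O\left(\frac{1}{n}\right) + O\left(\frac{1}{n b_n}\right) + O\left(b_n^2\right).$$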

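Then, since (A2) and (A5) hold, we get

$$\int_0^m E\left[(\varphi_n(x) - \varphi_G(x))^2\right] f(x)\, dx \le \frac{1}{n} \cdot \frac{8m c_2^2}{\varepsilon_0^2 f(m)} + \frac{1}{n b_n} \cdot \frac{16m^3 c_1^2}{\varepsilon_0^2} + b_n^2 \cdot \frac{4c_1^2(\alpha+1)^2}{\varepsilon_0^2} = O\left(\frac{1}{n}\right) + O\left(\frac{1}{n b_n}\right) + O\left(b_n^2\right).$$

Summarizing the results obtained so far, we have

$$0 \le R(G, \varphi_n) - R(G, \varphi_G) = O\left(\frac{1}{n}\right) + O\left(\frac{1}{n b_n}\right) + O\left(b_n^2\right).$$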
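The bound above does not by itself fix the bandwidth sequence. As a worked illustration (the specific choice $b_n = n^{-1/3}$ is an assumption made here, not a statement taken from the proof), balancing the two bandwidth-dependent terms recovers the order $n^{-2/3}$ announced in the introduction.

```latex
% Balance the two bandwidth-dependent terms: 1/(n b_n) = b_n^2  <=>  b_n^3 = 1/n.
b_n = n^{-1/3}
\;\Longrightarrow\;
\frac{1}{n b_n} = n^{-2/3}, \qquad b_n^{2} = n^{-2/3}, \qquad \frac{1}{n} = o\bigl(n^{-2/3}\bigr),
% and b_n -> 0, n b_n = n^{2/3} -> infinity, as Theorem 4.1 requires. Hence
R(G, \varphi_n) - R(G, \varphi_G) = O\bigl(n^{-2/3}\bigr).
```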

Remark 4.1. Under the conditions of Theorem 3.1, if $w(\theta) = 1$, then we get the Bayes estimator and, respectively, the empirical Bayes estimator from Liang [5].

REFERENCES

[1] B.C. Arnold, Pareto Distributions. International Co-operative Publishing House, Fairland, MD, 1983.

[2] J. Berger and L.M. Berliner, Robust Bayes and empirical Bayes analysis with ε-contaminated priors. Ann. Statist. 14 (1986), 2, 461–486.

[3] M. Iosifescu, G. Mihoc and R. Theodorescu, Teoria probabilităților și statistica matematică. Ed. Tehnică, București, 1966.

[4] M.V. Johns, Jr., Nonparametric empirical Bayes procedures. Ann. Math. Statist. 28 (1957), 649–669.

[5] T.C. Liang, Convergence rates for empirical Bayes estimation of the scale parameter in a Pareto distribution. Comput. Statist. Data Anal. 16 (1993), 35–45.

[6] P.E. Lin, Rates of convergence in empirical Bayes estimation problems: Continuous case. Ann. Statist. 3 (1975), 155–164.

[7] U. Makov, Loss robustness via Fisher-weighted squared-error loss function. Insurance Math. Econom. 16 (1995), 1–6.

[8] V. Preda, Entropia ponderată și problema de selecție neparametrică. Stud. Cerc. Mat. 34 (1982), 169–181.

[9] V. Preda and V. Craiu, Probleme de decizie multiplă. Tipografia Univ. București, 1980.

[10] V. Preda, Informational characterizing of Pareto and power distributions. Bull. Math. Soc. Sci. Math. R.S. Roumanie (N.S.) 28(76) (1984), 77–79.

[11] V. Preda, Teoria deciziilor statistice. Ed. Academiei Române, 1992.

[12] H. Robbins, An empirical Bayes approach to statistics. In: Proc. Third Berkeley Sympos. Math. Statist. Probab. 1 (1956), 157–163.

[13] E. Samuel, An empirical Bayes approach to the testing of certain parametric hypotheses. Ann. Math. Statist. 34 (1963), 1370–1385.

[14] R.S. Singh, Applications of estimators of a density and its derivatives to certain statistical problems. J. Roy. Statist. Soc. Ser. B 39 (1977), 357–363.

[15] R.S. Singh, Empirical Bayes estimation in Lebesgue-exponential families with rates near the best possible rate. Ann. Statist. 7 (1979), 890–902.

[16] R.C. Tiwari and J.N. Zalkikar, Empirical Bayes estimation of the scale parameter in a Pareto distribution. Comput. Statist. Data Anal. 10 (1990), 261–270.

Received 11 December 2006

University of Bucharest
Faculty of Mathematics and Computer Science
Str. Academiei 14
010014 Bucharest, Romania
[email protected]

and

Academy of Economic Studies
Department of Mathematics
Calea Dorobantilor 15-17
010552 Bucharest, Romania
roxana [email protected]
