
HAL Id: hal-02523922

https://hal.archives-ouvertes.fr/hal-02523922

Submitted on 2 Apr 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


A study on the estimation of the Transmuted Generalized Uniform Distribution


Issa Cherif Geraldo

Department of Mathematics, Faculty of Sciences, Université de Lomé, 1 B.P. 1515 Lomé 1, Togo.

Abstract

In this paper, we consider the maximum likelihood (ML) estimation of the parameters of a new probability distribution recently developed and called the transmuted generalized uniform distribution (TGUD). Because of the complicated form of its log-likelihood function, this estimation can only be done using numerical optimization algorithms, but this problem has not been studied yet. We address this lack through a comprehensive simulation study in the R software using some of the best optimization algorithms (Newton, quasi-Newton and Nelder-Mead algorithms). It is found that the Nelder-Mead algorithm is the best of all the selected algorithms.

Keywords – Numerical optimization, iterative method, maximum likelihood, parameter estimation, transmuted distribution.

1. Introduction

In parametric statistics, significant efforts are continuously made to develop new distributions with the aim of better modelling data in fields such as quality and reliability control, environmental sciences, insurance, public health, medicine, biology, physics, industry, computer science, communications, engineering, lifetime testing and many others [1]. In this context, the last twenty years or so have seen the flourishing of new families of distributions obtained by adding new parameters to the classical distributions (see [1] for a detailed survey).

Subramanian and Rather [2] have developed the transmuted generalized uniform distribution (TGUD) by applying the quadratic rank transmutation map to the generalized uniform distribution [3]. The cumulative distribution function (CDF) of the TGUD is given by:

$$F(x) = \left(\frac{x}{\beta}\right)^{\alpha+1}\left[1 + \lambda - \lambda\left(\frac{x}{\beta}\right)^{\alpha+1}\right], \qquad 0 < x < \beta,\ \alpha > -1 \tag{1}$$

and its probability density function (PDF) is given by:

$$f(x) = \frac{\alpha+1}{\beta}\left(\frac{x}{\beta}\right)^{\alpha}\left[1 + \lambda - 2\lambda\left(\frac{x}{\beta}\right)^{\alpha+1}\right]. \tag{2}$$


The TGUD has a parameter vector θ = (α, β, λ), where α > −1 is the shape parameter, β > 0 is the scale parameter and λ is the transmutation parameter such that |λ| ≤ 1.
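As a quick illustration, the PDF and CDF above can be coded in a few lines of R; the names dtgud and ptgud below are our own (following the usual d*/p* naming convention) and do not come from any published package.

```r
## Minimal R sketch of the TGUD density (2) and CDF (1); the names dtgud and
## ptgud are ours, not from a package. Valid for 0 < x < beta, alpha > -1,
## |lambda| <= 1.
dtgud <- function(x, alpha, beta, lambda) {
  y <- (x / beta)^(alpha + 1)
  (alpha + 1) / beta * (x / beta)^alpha * (1 + lambda - 2 * lambda * y)
}

ptgud <- function(x, alpha, beta, lambda) {
  y <- (x / beta)^(alpha + 1)
  y * (1 + lambda - lambda * y)
}
```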

Subramanian and Rather [2] provided a comprehensive study of its statistical properties (moments, survival function, failure rate, reverse hazard rate, distribution of order statistics) and derived the maximum likelihood (ML) method. Because of the complicated form of the log-likelihood, the maximum likelihood estimates of the parameters cannot be obtained in closed form. Numerical algorithms such as the Newton algorithm and quasi-Newton algorithms have been proposed by the authors for the ML estimation. However, this numerical optimization problem has not been studied.

In this paper, we address this lack by making a comprehensive numerical study of the ML estimation of the parameters of the TGUD in the R software [4] based on simulated samples. For the numerical maximization of the log-likelihood, we compare three of the most widely used optimization algorithms: the Newton-Raphson algorithm, the quasi-Newton Broyden-Fletcher-Goldfarb-Shanno algorithm and the Nelder-Mead algorithm. The choice of these three algorithms is motivated by two reasons: (a) these algorithms use different strategies to find the solution, so that if one fails, it is reasonable to hope that the others can give the solution; (b) choosing several algorithms also makes it possible to compare their performances and to determine which one is the most efficient for the ML estimation problem considered in this paper.

The rest of the paper is organized as follows. In Section 2, the ML estimation of the parameters of the TGUD is presented. Afterwards, the numerical optimization algorithms selected for the numerical ML estimation are described in Section 3. In Section 4, we describe the quantile function of the TGUD because it plays an important role in the random generation of the samples used in the simulation study. Section 5 presents the main results of our simulation study and Section 6 gives some concluding remarks.

2. Estimation of the parameters of the TGUD via the maximum likelihood method

Let x_1, . . . , x_n be a random sample of size n from the TGUD with parameters θ = (α, β, λ), where α > −1, β > 0 and |λ| ≤ 1. The log-likelihood function is given by

$$\ell(\theta) = n\log(\alpha+1) - n(\alpha+1)\log\beta + \alpha\sum_{i=1}^{n}\log x_i + \sum_{i=1}^{n}\log\left[1 + \lambda - 2\lambda\left(\frac{x_i}{\beta}\right)^{\alpha+1}\right]. \tag{3}$$

Therefore, the MLE of θ is a solution to the following non-linear system of equations:

$$\frac{\partial \ell}{\partial \alpha} = \frac{n}{\alpha+1} - n\log\beta + \sum_{i=1}^{n}\log x_i - \sum_{i=1}^{n}\frac{2\lambda\left(\frac{x_i}{\beta}\right)^{\alpha+1}\log\left(\frac{x_i}{\beta}\right)}{1 + \lambda - 2\lambda\left(\frac{x_i}{\beta}\right)^{\alpha+1}} = 0 \tag{4}$$

$$\frac{\partial \ell}{\partial \beta} = -\frac{n(\alpha+1)}{\beta} + \frac{2\lambda(\alpha+1)}{\beta}\sum_{i=1}^{n}\frac{\left(\frac{x_i}{\beta}\right)^{\alpha+1}}{1 + \lambda - 2\lambda\left(\frac{x_i}{\beta}\right)^{\alpha+1}} = 0 \tag{5}$$

$$\frac{\partial \ell}{\partial \lambda} = \sum_{i=1}^{n}\frac{1 - 2\left(\frac{x_i}{\beta}\right)^{\alpha+1}}{1 + \lambda - 2\lambda\left(\frac{x_i}{\beta}\right)^{\alpha+1}} = 0 \tag{6}$$

Subramanian and Rather [2] noted that Equations (4), (5) and (6) have a complicated form, so it is very difficult to obtain closed-form expressions for their solutions. They proposed the use of non-linear optimization algorithms such as the Newton-Raphson algorithm or quasi-Newton algorithms to maximize the log-likelihood function (3). However, this numerical optimization problem has not been studied.
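In practice, the objective to be maximized can be coded directly from (3). Below is a minimal R sketch (the function name loglik_tgud and the penalty value are our own choices); it returns a large negative value outside the parameter space so that the optimizers described in the next section can be applied without explicit constraints.

```r
## Log-likelihood (3) of the TGUD for a sample x, with theta = c(alpha, beta, lambda).
## A sketch: the name loglik_tgud and the -1e10 penalty are our own choices.
loglik_tgud <- function(theta, x) {
  alpha <- theta[1]; beta <- theta[2]; lambda <- theta[3]
  # Penalize points outside the parameter space (recall that 0 < x < beta)
  if (alpha <= -1 || beta <= max(x) || abs(lambda) > 1) return(-1e10)
  n <- length(x)
  n * log(alpha + 1) - n * (alpha + 1) * log(beta) +
    alpha * sum(log(x)) +
    sum(log(1 + lambda - 2 * lambda * (x / beta)^(alpha + 1)))
}
```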

In this paper, we address this lack by making a comprehensive numerical study of this numerical optimization problem in the R software [4] based on simulated samples. For the numerical maximization of the log-likelihood, we use the Newton-Raphson algorithm, the quasi-Newton Broyden-Fletcher-Goldfarb-Shanno algorithm and the Nelder-Mead algorithm. These three algorithms are described below.

3. Numerical algorithms for ML estimation

3.1. Newton-Raphson’s algorithm

It is the very first algorithm that comes to mind when dealing with a numerical optimization problem. This algorithm starts with an initial estimate θ^(0) given by the user and computes successive iterates as

$$\theta^{(k+1)} = \theta^{(k)} - \left[\nabla^2 \ell(\theta^{(k)})\right]^{-1} \nabla \ell(\theta^{(k)}) \tag{7}$$

where ∇` and ∇2` respectively denote the gradient vector and the Hessian matrix.

The Newton-Raphson (NR) algorithm converges quickly to the solution if θ^(0) is close enough to the unknown value of the parameter to be estimated [5]. However, it can diverge violently when θ^(0) is far from the unknown solution [6], and it cannot be implemented if, at some step k, ∇²ℓ(θ^(k)) is singular.
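For concreteness, iteration (7) can be sketched in a few lines of R, with the gradient and Hessian obtained numerically via the numDeriv package. This is only a simplified, hypothetical stand-in for the newton function of the Bhat package used in Section 5; the names newton_raphson and ell are ours.

```r
## Bare-bones Newton-Raphson iteration (7) for maximizing a log-likelihood.
## ell must be a function of the parameter vector only, e.g.
##   ell <- function(theta) loglik_tgud(theta, x)
library(numDeriv)  # provides grad() and hessian()

newton_raphson <- function(ell, theta0, tol = 1e-8, maxit = 100) {
  theta <- theta0
  for (k in seq_len(maxit)) {
    g <- grad(ell, theta)            # gradient of ell at the current iterate
    if (sqrt(sum(g^2)) < tol) break  # stop when the gradient is near zero
    H <- hessian(ell, theta)         # Hessian of ell at the current iterate
    step <- tryCatch(solve(H, g), error = function(e) NULL)
    if (is.null(step)) stop("singular Hessian: the NR iteration cannot proceed")
    theta <- theta - step            # update (7)
  }
  theta
}
```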

3.2. Quasi-Newton BFGS algorithm

Quasi-Newton algorithms are inspired by the NR algorithm (7) but differ from it in that they compute approximations of the inverse of the Hessian matrix using first derivatives only, and these approximations must be positive definite in order to ensure that the log-likelihood increases with the iterations, i.e. ℓ(θ^(k+1)) > ℓ(θ^(k)). One of the most famous and effective quasi-Newton algorithms is the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm [7, chapter 6]. At each iteration, the BFGS algorithm updates the approximation of the inverse of the Hessian matrix using the formula

$$H_{k+1} = \left(I - \rho_k s_k y_k^{T}\right) H_k \left(I - \rho_k y_k s_k^{T}\right) + \rho_k s_k s_k^{T} \tag{8}$$

where I is the identity matrix, $s_k = \theta^{(k+1)} - \theta^{(k)}$, $y_k = \nabla\ell(\theta^{(k+1)}) - \nabla\ell(\theta^{(k)})$, $H_k$ is the approximate inverse of the Hessian matrix at step k and $\rho_k = 1/(y_k^{T} s_k)$.
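Update (8) itself involves only a few matrix products. As an illustration, one update step could be written in R as follows (the helper name bfgs_update is ours; real BFGS implementations also include a line search and safeguards on ρ_k):

```r
## One application of the BFGS inverse-Hessian update (8).
## H: current approximate inverse Hessian; s = theta_new - theta_old;
## y = grad_new - grad_old (both plain numeric vectors).
bfgs_update <- function(H, s, y) {
  rho <- 1 / sum(y * s)  # rho_k = 1 / (y_k' s_k)
  I <- diag(length(s))
  (I - rho * s %*% t(y)) %*% H %*% (I - rho * y %*% t(s)) + rho * s %*% t(s)
}
```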


3.3. Nelder-Mead’s algorithm

The Nelder-Mead (NM) algorithm [8] is a very famous derivative-free algorithm, i.e. it does not evaluate derivatives of the function ℓ and computes the successive iterates only from the values of ℓ on a finite set of points.

Each iteration of the NM algorithm is based on a simplex whose vertices are sorted in increasing order of ℓ; the vertex with the lowest value of ℓ (called the worst point) is replaced by a new vertex obtained through operations on the centroid of the other vertices (all vertices except the worst one). The iterations are repeated until the values of ℓ at the vertices are sufficiently close. For more details about the NM algorithm, we refer the reader to [9].

Because of the complexity of the mathematical analysis of the NM algorithm, there does not exist in the literature any general convergence result [10]. However, the NM algorithm remains very popular because of its simplicity and is widely used in many scientific and engineering applications [10].
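For example, the NM algorithm is available in base R through the optim function. The following sketch (the wrapper name fit_nm and the starting point are our own illustrative choices; our actual study used constrOptim.nl, as described in Section 5) maximizes the log-likelihood sketch of Section 2:

```r
## Fitting the TGUD by Nelder-Mead with base R's optim(); a sketch, not the
## exact setup of Section 5. loglik_tgud is the sketch given in Section 2.
fit_nm <- function(x, theta0 = c(1, 1.01 * max(x), 0)) {
  optim(theta0, loglik_tgud, x = x,
        method  = "Nelder-Mead",
        control = list(fnscale = -1))  # fnscale = -1 makes optim maximize
}
```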

4. Quantile function and random number generation

We are particularly interested in the quantile function because it plays a central role in the generation of TGUD samples. For any u ∈ [0, 1], the quantile of order u of the TGUD(α, β, λ) is the solution to the equation F(x) = u, where F is the CDF defined by (1). It therefore comes down to finding x such that

$$(1+\lambda)\left(\frac{x}{\beta}\right)^{\alpha+1} - \lambda\left(\frac{x}{\beta}\right)^{2\alpha+2} = u \quad\text{or, equivalently,}\quad \lambda y^{2} - (1+\lambda)\,y + u = 0, \quad\text{where } y = (x/\beta)^{\alpha+1}.$$

The case λ = 0 being obvious (x = βu^{1/(α+1)}), it is assumed in the rest of this section that λ ≠ 0. The discriminant Δ satisfies

$$\Delta = (1+\lambda)^2 - 4\lambda u = \left[\lambda + (1-2u)\right]^2 + 4u(1-u) \geq 0 \quad\text{because } u \in [0,1].$$

We have two solutions, which are

$$y_1 = \frac{(1+\lambda) - \sqrt{(1+\lambda)^2 - 4\lambda u}}{2\lambda} \quad\text{and}\quad y_2 = \frac{(1+\lambda) + \sqrt{(1+\lambda)^2 - 4\lambda u}}{2\lambda}.$$

The condition 0 < x < β is equivalent to 0 < y < 1, so it is sufficient to find which of the real numbers y1 and y2 belongs to the interval (0, 1). We distinguish the following two cases:

(a) If 0 < λ ≤ 1, then both y1 and y2 are positive because y2 > 0 and the product y1 y2 equals u/λ, which is positive. Moreover,

$$y_1 < 1 \iff (1+\lambda) - \sqrt{(1+\lambda)^2 - 4\lambda u} < 2\lambda \iff \sqrt{(1+\lambda)^2 - 4\lambda u} > 1 - \lambda \iff 4\lambda(1-u) > 0$$

and

$$y_2 < 1 \iff (1+\lambda) + \sqrt{(1+\lambda)^2 - 4\lambda u} < 2\lambda \iff \sqrt{(1+\lambda)^2 - 4\lambda u} < \lambda - 1.$$

The inequality y1 < 1 always holds (because λ > 0 and 0 < u < 1) while the inequality y2 < 1 never holds (because λ − 1 ≤ 0 and $\sqrt{(1+\lambda)^2 - 4\lambda u}$ cannot be negative).

(b) If −1 ≤ λ < 0, then y2 < 0 (because its numerator is positive and its denominator is negative). On the one hand, we have

$$(1+\lambda)^2 - 4\lambda u > (1+\lambda)^2$$

or, equivalently, $\sqrt{(1+\lambda)^2 - 4\lambda u} > 1+\lambda$. So, the numerator and the denominator of y1 are both negative and, consequently, y1 > 0. On the other hand,

$$y_1 < 1 \iff (1+\lambda) - \sqrt{(1+\lambda)^2 - 4\lambda u} > 2\lambda \iff \sqrt{(1+\lambda)^2 - 4\lambda u} < 1-\lambda \iff (1+\lambda)^2 - 4\lambda u < (1-\lambda)^2 \iff 4\lambda(1-u) < 0,$$

which is true because λ < 0 and u < 1. Thus, we also have y1 < 1.

We have thus demonstrated the following theorem:

Theorem 1. Let α > −1, β > 0, λ be such that |λ| ≤ 1 and u ∈ [0, 1]. Then, the quantile of order u of the TGUD(α, β, λ) is given by

$$x = \begin{cases} \beta\left(\dfrac{(1+\lambda) - \sqrt{(1+\lambda)^2 - 4\lambda u}}{2\lambda}\right)^{\frac{1}{\alpha+1}} & \text{if } \lambda \neq 0,\\[3ex] \beta\, u^{1/(\alpha+1)} & \text{if } \lambda = 0. \end{cases} \tag{9}$$

In our study, the samples from the TGUD are generated by the inversion method, that is, using formula (9) where u ∼ U (0, 1).
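A direct R transcription of Theorem 1 gives the quantile function and the corresponding inversion sampler; the names qtgud and rtgud below are ours, following the usual q*/r* convention.

```r
## Quantile function (9) of the TGUD and the corresponding inversion sampler.
qtgud <- function(u, alpha, beta, lambda) {
  y <- if (lambda == 0) u
       else ((1 + lambda) - sqrt((1 + lambda)^2 - 4 * lambda * u)) / (2 * lambda)
  beta * y^(1 / (alpha + 1))
}

rtgud <- function(n, alpha, beta, lambda) {
  qtgud(runif(n), alpha, beta, lambda)  # inversion method with u ~ U(0, 1)
}
```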

5. Simulation study

We study the ML estimation of the parameter vector θ of the TGUD using the R software [4]. Instead of selecting only one of the many existing optimization algorithms, we have chosen three of the best-performing ones and we compare their performances. The selected algorithms are NR, BFGS and NM. The NR algorithm is implemented using the newton function of the R package Bhat [11] and the BFGS and NM algorithms are implemented using the constrOptim.nl function of the package alabama [12]. Our main criterion for evaluating the selected algorithms is the mean squared error (MSE) defined as

$$\mathrm{MSE}(\hat{\theta}) = \frac{1}{3}\,\|\hat{\theta} - \theta\|^2 = \frac{1}{3}\left[(\hat{\alpha} - \alpha)^2 + (\hat{\beta} - \beta)^2 + (\hat{\lambda} - \lambda)^2\right]$$

where θ = (α, β, λ) is the true value of the parameter vector and θ̂ = (α̂, β̂, λ̂) is its estimate.
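Putting the earlier sketches together, one cell of the simulation (one true parameter value, one sample size, one algorithm) can be emulated as below. The seed, the sample size and the use of the NM wrapper are arbitrary illustrative choices; this skeleton is not meant to reproduce the exact figures of Tables 1 and 2.

```r
## Skeleton of one simulation cell: 1000 replications of sampling + NM fitting,
## reporting the average MSE. rtgud and fit_nm are the sketches given above.
set.seed(123)                  # arbitrary seed, for reproducibility of the illustration
theta_true <- c(1, 0.7, 0.3)   # (alpha, beta, lambda)
mse <- replicate(1000, {
  x   <- rtgud(100, theta_true[1], theta_true[2], theta_true[3])  # n = 100
  fit <- fit_nm(x)
  mean((fit$par - theta_true)^2)  # (1/3) * ||theta_hat - theta||^2
})
mean(mse)  # Monte-Carlo average of the MSE over the 1000 replications
```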


Table 1: Estimation results for (α, β, λ) = (1, 0.7, 0.3) over 1000 replications. The values in brackets are the standard deviations.

                                    NR               BFGS             NM
n = 25     α̂                       0.834 (0.796)    0.880 (0.563)    0.982 (0.525)
           β̂                       0.683 (0.016)    0.681 (0.018)    0.681 (0.018)
           λ̂                      -0.126 (0.638)   -0.013 (0.485)    0.093 (0.403)
           MSE                      0.409            0.222            0.160
           Number of convergences   48               1000             1000
n = 50     α̂                       0.647 (0.605)    0.863 (0.469)    0.940 (0.408)
           β̂                       0.691 (0.008)    0.691 (0.009)    0.691 (0.009)
           λ̂                      -0.222 (0.62)     0.058 (0.461)    0.143 (0.373)
           MSE                      0.375            0.170            0.111
           Number of convergences   37               1000             1000
n = 100    α̂                       0.742 (0.496)    0.871 (0.422)    0.975 (0.307)
           β̂                       0.695 (0.005)    0.695 (0.005)    0.695 (0.005)
           λ̂                      -0.040 (0.545)    0.109 (0.446)    0.226 (0.294)
           MSE                      0.238            0.143            0.062
           Number of convergences   48               1000             1000
n = 500    α̂                       1.037 (0.514)    0.850 (0.403)    0.999 (0.142)
           β̂                       0.718 (0.020)    0.699 (0.002)    0.699 (0.001)
           λ̂                       0.450 (0.661)    0.117 (0.453)    0.287 (0.133)
           MSE                      0.238            0.141            0.013
           Number of convergences   54               1000             1000
n = 1000   α̂                       1.308 (0.185)    0.88 (0.353)     0.995 (0.111)
           β̂                       0.729 (0.017)    0.699 (0.001)    0.699 (0.001)
           λ̂                       0.823 (0.311)    0.157 (0.401)    0.289 (0.111)
           MSE                      0.166            0.107            0.008
           Number of convergences   52               1000             1000
n = 5000   α̂                       1.393 (0.031)    0.908 (0.338)    0.996 (0.065)
           β̂                       0.740 (0.001)    0.702 (0.042)    0.700 (0.000)
           λ̂                       1.000 (0.000)    0.190 (0.367)    0.295 (0.068)
           MSE                      0.216            0.090            0.003
           Number of convergences   52               1000             1000


Table 2: Estimation results for (α, β, λ) = (2, 1, 0.5) over 1000 replications. The values in brackets are the standard deviations.

                                    NR               BFGS             NM
n = 25     α̂                       1.725 (0.939)    1.796 (0.803)    1.954 (0.774)
           β̂                       0.980 (0.020)    0.975 (0.023)    0.976 (0.023)
           λ̂                       0.093 (0.584)    0.127 (0.474)    0.248 (0.403)
           MSE                      0.483            0.350            0.276
           Number of convergences   81               1000             1000
n = 50     α̂                       1.584 (0.806)    1.792 (0.666)    1.953 (0.587)
           β̂                       0.99 (0.012)     0.988 (0.020)    0.988 (0.012)
           λ̂                       0.084 (0.623)    0.214 (0.470)    0.339 (0.370)
           MSE                      0.457            0.263            0.170
           Number of convergences   79               1000             1000
n = 100    α̂                       1.567 (0.906)    1.848 (0.550)    1.982 (0.453)
           β̂                       0.994 (0.006)    0.993 (0.007)    0.994 (0.007)
           λ̂                       0.049 (0.665)    0.320 (0.401)    0.430 (0.298)
           MSE                      0.546            0.173            0.100
           Number of convergences   84               1000             1000
n = 500    α̂                       1.882 (0.711)    1.874 (0.467)    2.05 (0.224)
           β̂                       1.008 (0.015)    0.999 (0.002)    0.999 (0.003)
           λ̂                       0.438 (0.628)    0.389 (0.379)    0.535 (0.165)
           MSE                      0.303            0.130            0.027
           Number of convergences   80               1000             1000
n = 1000   α̂                       2.161 (0.348)    1.842 (0.516)    2.055 (0.225)
           β̂                       1.017 (0.017)    0.999 (0.001)    1.000 (0.003)
           λ̂                       0.739 (0.357)    0.362 (0.426)    0.539 (0.176)
           MSE                      0.109            0.164            0.029
           Number of convergences   50               1000             1000
n = 5000   α̂                       2.332 (0.116)    1.844 (0.510)    2.061 (0.176)
           β̂                       1.029 (0.009)    1.000 (0.000)    1.000 (0.002)
           λ̂                       0.954 (0.147)    0.369 (0.424)    0.552 (0.150)
           MSE                      0.117            0.160            0.020
           Number of convergences   76               1000             1000

Three important remarks can be made:

(a) The NR algorithm has a very low convergence rate. Out of twelve thousand (12000) replications, it converged only 741 times, i.e. NR had a convergence rate of 6.175%. The main reasons for the failures of the NR algorithm are singular Hessian matrices and the maximum number of iterations being exceeded.

(b) For all algorithms, the MSE decreases as the sample size increases.

(c) For every combination of true parameter value and sample size, the NM algorithm converged in all the replications and achieved the smallest MSE of the three algorithms.

6. Concluding remarks

In this paper, we studied the maximum likelihood (ML) estimation of the parameters of the transmuted generalized uniform distribution (TGUD), a problem that had not been studied before. Because of the complex expression of the log-likelihood function, numerical optimization algorithms are required. We studied, via intensive simulation experiments, three well-known algorithms (Newton-Raphson, quasi-Newton BFGS and Nelder-Mead) for the numerical ML estimation of the parameters. Of these three algorithms, the NM algorithm appears to be the best because it has a convergence rate of 100% and the smallest mean squared errors.

References

[1] Zubair Ahmad, G. G. Hamedani, and Nadeem Shafique Butt. Recent developments in distribution theory: a brief survey and some new generalized classes of distributions. Pakistan Journal of Statistics and Operations Research, 15(1):87–110, 2019.

[2] C. Subramanian and A. A. Rather. Transmuted generalized uniform distribution. International Journal of Scientific Research in Mathematical and Statistical Sciences, 5:25–32, 2018.

[3] Chang-Soo Lee. Estimations in a generalized uniform distribution. Journal of the Korean Data and Information Science Society, 11(2):319–325, 2000.

[4] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2019.

[5] I. Griva, S. G. Nash, and A. Sofer. Linear and Nonlinear Optimization: Second Edition. Society for Industrial and Applied Mathematics, 2009.

[6] J. E. Dennis, Jr and Robert B. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. SIAM’s Classics in Applied Mathematics, 1996.

[7] Jorge Nocedal and Stephen J. Wright. Numerical Optimization. Springer, second edition, 2006.

[8] J. A. Nelder and R. Mead. A simplex algorithm for function minimization. Computer Journal, 7(4):308–313, 1965.

[9] Jeffrey C. Lagarias, James A. Reeds, Margaret H. Wright, and Paul E. Wright. Convergence properties of the Nelder–Mead simplex method in low dimensions. SIAM Journal on Optimization, 9(1):112–147, 1998.

[10] Jeffrey C. Lagarias, Bjorn Poonen, and Margaret H. Wright. Convergence of the restricted Nelder–Mead algorithm in two dimensions. SIAM Journal on Optimization, 22(2):501–532, 2012.

[11] Georg Luebeck and Rafael Meza. Bhat: General likelihood exploration, 2013. R package version 0.9-10.
