• Aucun résultat trouvé

ROBUST SEMIPARAMETRIC JOINT ESTIMATORS OF LOCATION AND SCATTER IN ELLIPTICAL DISTRIBUTIONS

N/A
N/A
Protected

Academic year: 2021

Partager "ROBUST SEMIPARAMETRIC JOINT ESTIMATORS OF LOCATION AND SCATTER IN ELLIPTICAL DISTRIBUTIONS"

Copied!
7
0
0

Texte intégral

(1)

HAL Id: hal-02976889

https://hal.archives-ouvertes.fr/hal-02976889

Submitted on 23 Oct 2020

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

ROBUST SEMIPARAMETRIC JOINT ESTIMATORS

OF LOCATION AND SCATTER IN ELLIPTICAL

DISTRIBUTIONS

Stefano Fortunati, Alexandre Renaux, Frédéric Pascal

To cite this version:

Stefano Fortunati, Alexandre Renaux, Frédéric Pascal. ROBUST SEMIPARAMETRIC JOINT

ES-TIMATORS OF LOCATION AND SCATTER IN ELLIPTICAL DISTRIBUTIONS. 2020 IEEE 30th

International Workshop on Machine Learning for Signal Processing (MLSP), Sep 2020, Espoo,

Fin-land. pp.1-6, �10.1109/MLSP49062.2020.9231865�. �hal-02976889�

(2)

2020 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, SEPT. 21–24, 2020, ESPOO, FINLAND

ROBUST SEMIPARAMETRIC JOINT ESTIMATORS OF LOCATION AND SCATTER IN

ELLIPTICAL DISTRIBUTIONS

Stefano Fortunati, Alexandre Renaux, Fr´ed´eric Pascal

Universit´e Paris-Saclay, CNRS, CentraleSupel´ec, Laboratoire des signaux et syst`emes,

91190, Gif-sur-Yvette, France.

e-mails: {stefano.fortunati, alexandre.renaux, frederic.pascal}@centralesupelec.fr

ABSTRACT

This paper focuses on the joint estimation of the location vec-tor and the shape matrix of a set of Complex Elliptically Sym-metric (CES) distributed observations. This well-known esti-mation problem is framed in the original context of semipara-metric models allowing us to handle the (generally unknown) density generator as an infinite-dimensional nuisance param-eter. A joint estimator, relying on the Tyler’sM -estimator of location and on a newR-estimator of shape matrix, is pro-posed and its Mean Squared Error (MSE) performance com-pared with the Semiparametric Cram´er-Rao Bound (CSCRB). Index Terms— Elliptical distribution, robust estimator, covariance estimation, semiparametric model, efficiency.

1. INTRODUCTION

Among the wide range of statistical tools exploited in Ma-chine Learning (ML), covariance matrix estimators are cer-tainly of primary importance [1]. In fact, popular ML infer-ence methods, e.g. image segmentation, clustering, dimension reduction and distance learning, rely on covariance matrix es-timation as a first preliminary step. The concept of covari-ance matrix implies the existence of a statistical model able to describe the data behaviour and consequently, the possi-bility to derive inference algorithms based on it. A common approach in many ML inference problems is the model-based one that consists in considering the available data as sampled from a known probability density function (pdf) with possi-bly unknown fixed parameters. More formally, let {zl}Ll=1

be a set ofL independent and identically distributed (i.i.d.) N -dimensional observations sharing the same pdf, i.e. CN 3

zl ∼ pZ, ∀l. In model-based inference, pZ is assumed to

belong to a parametric model Pθ , {pZ|pZ(zl; θ), θ ∈ Θ }

that represents a family of pdfs parametrized by a finite di-mensional vector θ ∈ Θ ⊆ Cq. In multivariate

Gaussian-based inference, for example, θ is set up by the mean vec-tor µ and by the vecvec-torized version of the covariance/scatter

The work of S. Fortunati, A. Renaux and F. Pascal has been partially supported by DGA under grant ANR-17-ASTR-0015.

matrixvec(Σ),1 i.e. θ

, (µT, µH, vec(Σ)T)T. Note that,

since we are considering the general case of complex param-eter vectors, the use of Wirtinger calculus is required [2]. This is the reason why, in θ the entries of the mean vector and of the (Hermitian) covariance/scatter matrix have to be listed to-gether with theirs complex conjugates.

Due to its simplicity, the Gaussian assumption has been of mainstream use for decades. However, extensive statistical analysis, performed in a large number of applications, have shown the ubiquitous non-Gaussian nature of the data.

Different statistical models, aiming at combining the “conciseness” of the Gaussian one with the heavy-tailed be-haviour of the observed data, have been developed in statis-tic and ML literature. Among others, a non-Gaussian data model, that has been recognized to provide a reliable charac-terization of the data behaviour in many applications [3, 4], is the set of Complex Elliptically Symmetric (CES) distri-butions [5]. Each pdf pZ belonging to the CES model is

parametrized by a finite-dimensional parameter vector of in-terest θ ∈Θ ⊆ Cq and by an infinite-dimensional parameter,

usually called density generator, G 3 h : R+

→ R+ that

characterizes the data non-Gaussianity and that is generally unknown. In particular, as in the Gaussian case, θ accounts for a location parameter µ ∈ CNand the covariance structure of the data. Due to the well-known scale ambiguity affect-ing the CES model, the covariance/scatter matrix Σ is not identifiable and only scaled versions, usually called shape matrices, V , Σ/s(Σ) can be estimated [5]. The rigor-ous definition of the assumptions that a scale functionals(·) has to satisfy can be found in [6–8]. Here, following [8], among all the possible s(·), we choose the one that forces the shape matrix to have the first top-left entry equal to 1, i.e. s(Σ) = [Σ]1,1. According to the notation introduced in [8],

the resulting (Hermitian and positive-definite) shape matrix will be indicated as V1, Σ/[Σ]1,1. By relying again on the

Wirtinger calculus, the finite dimensional parameter vector θ

1The standard vectorization operator vec maps column-wise the entry of

an N × N matrix A in an N2-dimensional column vector vec (A).

(3)

that uniquely parametrizes the CES model, is given by:2 θ , (µT, µH, vec(V1)T)T ∈ Θ ⊆ Cq, (1)

whereq = N (N + 2) − 1 (= 2N + N2− 1). Note that the

term “2N ” is due to the location µ ∈ CN and to its conjugate,

the term “N2” represents the total number of entries of the

shape matrix V1, while the “−1” is due to the constraint on

its first top-left element. The infinite-dimensional parameter h, characterizing the CES pdf pZ, belongs to the set G =

h : R+→ R+ R∞ 0 t N −1h(t)dt < ∞,R p Z = 1 [5].

Due to the “mixed” (both finite and infinite) dimensional-ity of its parameter space, the CES model can then be framed in the context of semiparametric models [9]. In particular, as already shown in [6, 7] and in [10, 11], the CES model is the set of pdfpZ ≡ CESN(µ, V1, h) such that:

Pθ,h=pZ|pZ(z|θ, h) = |V1|−1×

h (z − µ)HV−11 (z − µ) ; θ ∈ Θ, h ∈ G , (2) Inference problems in CES models then involve the es-timation of θ ∈ Θ in the presence of a functional nuisance parameter represented by the density generatorh ∈ G.

Previous papers from both statistic [7] and signal process-ing communities [8], have already investigated the possibility to derive robust and semiparametric efficient estimators for the shape matrix V1. Specifically, in the recent work [8], a

theoretical and simulative analysis of the “finite-sample” per-formance of a newR-estimator of V1is provided. Our aim

here is then to complete the analysis already developed in [8] by investigating the joint estimation problem of the location parameter µ and the shape matrix V1 in the presence of an

unknown density generatorh.

Algebraic notation: For the sake of clarity, in the rest of this paper, we adopt the same notation already introduced in [8]. In addition to the list of symbols detailed in [8], we will make extensive use of some specific matrices whose def-initions are collected below. In particular:

LV1 , P



V−T /21 ⊗ V−1/21 Π⊥vec(IN), (3)

where V1 is the shape matrix previously introduced, P ,

[e2|e3| · · · |eN2]T where eiis thei-th vector of the canonical

basis of RN2andΠ⊥

vec(IN), IN2− N

−1vec(I

N)vec(IN)T.

CES-related notation: CES distributions have been the subject of extensive theoretical and applicative-oriented re-search papers and tutorial presentations. Here, we will col-lected the very basic facts and notation that will be used in the rest of this work. Let θ0 , (µT0, µH0, vec(V1,0)T)T

be the “true” parameter vector to be estimated and let h0

be the actual (and unknown) density generator. Let CN 3 z ∼p0(z) ≡ pZ(z; θ0, h0) ≡ CESN(µ0, V1,0, h0) a

CES-distributed random vector characterized by a location vector

2The operator vec(A) defines the N2− 1-dimensional vector obtained

from vec (A) by deleting its first element, i.e. vec (A), [a11, vec(A)T]T.

µ0, a shape matrix V1,0, Σ0/[Σ0]1,1where Σ0represents

the relevant scatter matrix and a density generatorh0 ∈ G.

Then, z satisfies the following stochastic representation: z=dµ0+

QΣ1/20 u, (4)

where u ∼ U(CSN) is a complex random vector uniformly

distributed on the unitN -sphere and =dstands for “has the

same distribution as”. The 2nd-order modular variate Q is independent from u and such that:

Q =d (z − µ0)HΣ−10 (z − µ0) , Q, (5)

Moreover, Q is distributed according to the following pdf: pQ,0(q) = πNΓ(N )−1qN −1h0(q), (6)

whereΓ(·) is the Gamma function. For any h ∈ G, the func-tionψ is defined as

ψ(t) , d ln h(t)/dt. (7) Finally, the expectation operator of any measurable func-tion f with respect to p0(z) is indicated as E0{f (z)} ,

R f(z)pZ(z; θ0, h0)dz.

2. ON THE SEMIPARAMETRIC JOINT ESTIMATION OF LOCATION AND SHAPE Let {zl}Ll=1 be a set of CES distributed vectors such that

CN 3 zl∼ p0≡ CESN(µ0, V1,0, h0), ∀l. Our goal then is

to investigate the problem of jointly estimating µ0and V1,0

in the presence of an unknown density generatorh0.

The two fundamental questions that we are going to ad-dress in this section are the following:

1. What is the impact of not knowingh0on the joint

esti-mation of(µ0, V1,0)?

2. What is the (asymptotic) impact that the lack of knowl-edge of µ0 has on the estimation of V1,0 and vice

versa?

To answer these two questions, we need to introduce the semiparametric efficient score vector¯sθ0 and the

semipara-matric Fisher Information Martix(SFIM) ¯I(θ0|h0). As

dis-cussed in the relevant statistical literature for a generic semi-parametric model [9] and recently investigated for the specific CES model [10, 11], the semiparametric efficient score vector for the joint estimation of location and shape matrix of a set of CES distributed data is given by:

¯sθ0= [¯s T µ0, ¯s T µ∗ 0, ¯s T vec(V1,0)] T = s θ0− Π(sθ0|Th0), (8)

where sθ0is the “classical” score vector defined, by means of

the Wirtinger derivatives, as [11]: [sθ0]i, ∂ ln pZ(z; θ, h0)/∂θ

(4)

and θ is given in (1) and q = N (N + 2) − 1. The term Π(sθ0|Th0) indicates the orthogonal projection of sθ0 on

the nuisance tangent space Th0 of the CES model Pθ,h in

(2) evaluated at the true density generatorh0. Specifically,

Π(sθ0|Th0) tells us the loss of information on the

estima-tion of θ0 due to the lack of knowledge of h0. As shown

in [11, eq. (23)], the orthogonal projection operator of¯sT µ0

and¯sT µ∗

0 onto Th0 is equal to zero and consequently, the lack

of knowledge ofh0does not have any impact on the

(asymp-totic) estimation of the location parameter µ0. On the other

hand, not knowingh0does have an impact on the estimation

of the shape matrix V1,0, since as proved in [11, eq. (24)],

Π(svec(V1,0)|Th0) 6= 0. This answers the first question.

To address the second point about the (asymptotic) cross-information between µ0and V1,0, we need to check the

struc-ture of the SFIM ¯I(θ0|h0). The SFIM for the joint estimation

of µ0and V1,0in the CES semiparametric model Pθ,hin (2)

has been evaluated in [10, 11] as: ¯I(θ0|h0) , E0{¯sθ0(zl)¯s H θ0(zl)} =  ¯I(µ 0|h0) 02N ×(N2−1) 0(N2−1)×2N ¯I(V1,0|h0)  , (10) ¯I(µ0|h0) = E{Qψ0(Q)2} N  Σ−10 0N ×N 0N ×N Σ−∗0  , (11) ¯I(vec(V1,0)|h0) = E{Q2ψ 0(Q)2} N (N + 1) LV1,0L H V1,0, (12)

where LV1,0 is defined in (3) and the function ψ0 is given

in (7). As we can clearly see from (10), the efficient SFIM ¯I(θ0|h0) is a block-diagonal matrix. This implies that the

cross-information terms between the location µ0 and the

shape matrix V1,0 are equal to zero. Consequently, the

rel-evant estimation problems are (asymptotically) decorrelated and can be considered as two separate estimation problems. In particular, in estimating the shape matrix V1,0, the true

(and generally unknown) location vector µ0 can be

substi-tuted by any of its √L-consistent estimators without any impact on the (asymptotic) performance of the estimator of V1,0. Of course, the vice versa holds true as well, i.e. any

L-consistent estimator of V1,0can be used in place of the

true shape matrix without any (asymptotic) impact on the estimation of µ0. In the next Section, we exploit this result to

implement a robust, semiparametric efficient joint estimator for the location and shape matrix in CES distributed data.

3. A ROBUST SEMIPARAMETRIC EFFICIENT JOINT ESTIMATOR OF LOCATION AND SHAPE Robust estimation of location and shape in elliptical distribu-tions is a well-known topic in statistics and signal processing since the seminal paper of Maronna [12]. Let us first recall the framework we are interested in. As before, let {zl}Ll=1be

a set of CES-distributed vectors such that CN 3 zl ∼ p0 ≡

CESN(µ0, V1,0, h0), ∀l. In [12], a general class of joint

M -estimators of µ0and V1,0(in the presence of an unknown

density generatorh0) has been introduced as the “fixed-point”

solution of the following system of equations: XL l=1u1( ˆQ 1/2 l )(zl−µ) = 0,ˆ (13) L−1XL l=1u2( ˆQl)(zl−µ)(zˆ l−µ)ˆ H = bV 1, (14)

where, according to (5), ˆQl= (zl−µ)ˆ HVb−11 (zl−µ), whileˆ

the functionsu1andu2have to satisfy a given set of

assump-tions that guarantees the existence and the uniqueness of the solution of (13) and (14) (see [12] for the real case and [5] for the extension to the complex one).

3.1. Tyler’s jointM -estimator of µ0and V1,0

Among different possible choices foru1andu2, Tyler in [13]

(see also [3] and [4]) showed that the functions u1(Q) =

Q−1/2 andu

2(Q) = N Q−1 lead to the “minimax robust”

M -estimator of the location and shape. Specifically, by defin-ing

ˆ

Q(k)l = (zl−µˆ(k))H[ bV1(k)] −1(z

l−µˆ(k)), (15)

where k indicates the iteration number, we have that the Tyler’s jointM -estimator of location and shape, i.e. ( ˆµT y, bV1,T y),

can be obtained as the convergence points (k → ∞) of the following iterations: ˆ µ(k+1)T y = " L X l=1 [ ˆQ(k)l ]−1/2 #−1 L X l=1  ˆQ(k) l −1/2 zl, (16) ( b V(k+1)T y =NLPL l=1[ ˆQ (k) l ]−1(zl−µˆ (k) T y)(zl−µˆ(k)T y)H. b V(k+1)1,T y ,Vb (k+1) T y /[ bV(k+1)T y ]1,1. (17) Note that, even if a formal proof of the joint convergence of (16) and (17) is still an open problem, these iterative algo-rithm has been shown to provide reliable estimates in most of the scenarios of possible interest in practical applications. We refer to [3] where jointM -estimators of the form (13)-(14) have been exploited in hyperspectral anomaly detection problems and to [4] where jointM -estimators have been de-rived as part of a general Expectation-Maximization (EM) al-gorithm for clustering applications.

The estimatorsµˆT yand bV1,T yhave the remarkable

prop-erty of being √L-consistent under any (unknown) density generatorh ∈ G [13]. Consistency, however, is only one of the properties that good robust estimators should have. An-other important property is the (semiparametric) efficiency. 3.2. The Semiparametric Cram´er-Rao Bound (SCRB) With semiparametric efficiency we indicate the ability of a robust estimator to achieve the Semiparametric Cram´er-Rao

(5)

Bound (SCRB) [9] as the number of available observationsL goes to infinity. The SCRB for the specific estimation prob-lem at hand has been recently derived in [10,11] as the inverse of the SFIM in (10). As discussed before, since the efficient score vectors for the location, i.e.¯sT

µ0and¯s

T µ∗

0, are orthogonal

to the nuisance tangent space Th0, the SCRB on the

estima-tion of µ0is equal to the “classical” CRB and it is given by

CRB(µ0|h0) , ¯I(µ0|h0)−1 = N E{Qψ0(Q)2}  Σ0 0N ×N 0N ×N Σ∗0  . (18)

On the other hand, since Π(svec(V1,0)|Th0) 6= 0 [11, eq.

(24)], the SCRB on the estimation of the shape matrix V1,0

is tighter than the “classical” CRB (that is obtained for a perfectly knownh0) and is given by:

SCRB(vec(V1,0)|h0) , ¯I(vec(V1,0)|h0)−1 = N (N + 1) E{Q2ψ 0(Q)2} h LV1,0L H V1,0 i−1 . (19)

In [10, 11], it has been shown that robustM -estimators of the shape matrix are not semiparametric efficient. This effi-ciency issue can be overcome by exploiting the results pro-posed by Hallin, Oja and Paindaveine in [7] for the real case and extended to the complex case in [8]. By combining the Le Cam’s theory on “one-step” estimators and the robustness property of rank-based inference procedure, in [7] a robust semiparametric efficient R-estimator of V1,0 has been

de-rived and its “finite-sample” performance investigated in [8] for the case of zero-mean CES observations. Here, an exten-sion of the previously derivedR-estimator of shape matrix is provided to CES data with unknown mean, while Sec. 4 will focus on a simulative analysis of its estimation performance. Note that, due to the orthogonality between the efficient score vectors of µ0and the nuisance tangent space Th0, the

estima-tion of the locaestima-tion parameter does not require any “one-step” correction `a la Le Cam.

3.3. AnR-estimator of V1,0

In their seminal works [7] (see [8] for a tutorial discussion and for the complex extension of these results), Hallin, Oja and Paindaveine showed that a robust and semiparametric ef-ficient estimator of the shape matrix can be obtained by ap-plying a linear, “one-step”, correction to any√L-consistent preliminary estimator bV?1of V1,0, obtained from the centred

CES dataset {zl−µˆ?}Ll=1, whereµˆ?is any

L-consistent preliminary estimator of the location µ0. Due to their

prop-erties of minimax robustness and√L-consistency under any density generatorh ∈ G, the Tyler’s estimators previously in-troduced may be good candidates for this role, i.e.µˆ?≡µˆT y

and bV?1 ≡ bV1,T y. Consequently, following the results

ob-tained in [8], anR-estimator of V1,0is given by:3

vec( bV1,R) = vec( bV1,T y) + 1 Lˆα h L b V1,T yL H b V1,T y i−1 ×L b V1,T y XL l=1KvdW  r? l L + 1  vec(ˆu?l(ˆu?l)H), (20) where {r?

l}Ll=1are the ranks4of the continuous random

vari-ables { ˆQ?l}Ll=1such that:

ˆ

Q?l , (zl−µˆT y)H[ bV1,T y]−1(zl−µˆT y), (21)

whileuˆ?l are random vectors obtained as: ˆ

u?l , ( ˆQ?l)−1/2[ bV1,T y]−1/2(zl−µˆT y). (22)

Moreover, the data-dependent termα is defined in [8, Sec.ˆ IV.B], while the van der Waerden scoreKvdW: (0, 1) → R+

is a function defined asKvdW(u) , Φ−1G (u) where Φ −1 G

indi-cates the inverse function of the cdf of a Gamma-distributed random variable with parameters (N, 1). It is important to note that theR-estimator in (20) depends only on the ranks r? l

and on the vectors u?l whose statistics are invariant with re-spect to the true (and generally unknown) data pdf [7,8]. This is not the case for the Maronna’s and Tyler’sM -estimators that directly rely on the collected data (and consequently on their unknown pdf). Interestingly, theR-estimator in (20) is in that sense more robust than M -estimators. To conclude, the pseudocode for the implementation of theR-estimator in (20) is provided in the following.

4. NUMERICAL RESULTS

In this last Section we finally assess, through numerical sim-ulations, the semiparametric efficiency of the joint estimator ( ˆµT y, bV1,R), where ˆµT y is the Tyler’s estimator in (16) of

the location vector µ0, while bV1,Ris theR-estimator in (20)

of the shape matrix V1,0exploiting the Tyler’s joint

estima-tor( ˆµT y, bV1,T y) as preliminary

L-consistent estimator. As basis of comparison, we also report the performance of the joint “sample” estimator( ˆµSM, bV1,SCM), defined as:

ˆ µSM , L−1 XL l=1zl, (23) ( b ΣSCM, L−1PLl=1(zl−µˆSM)(zl−µˆSM)H b V1,SCM ,ΣbSCM/[ bΣSCM]1,1. (24) We generate the set of non-zero mean data {zl}Ll=1

ac-cording to a Generalized Gaussian (GG) distribution [14], such that CN 3 zl∼ p0, ∀l where:

p0(zl) = |Σ0|−1h0 (zl− µ0)HΣ−10 (zl− µ0) , (25)

3Matlab code at https://github.com/StefanoFor

4Let Ω, {ω

l}Ll=1be a set of continuous random variables and let Ωo,

L(1)< ωL(2)< . . . < ωL(L)} be a set containing the same elements of

Ω ordered in an ascending way. Then, the rank rlof ωl∈ Ω is its position

(6)

Algorithm 1R-estimator for V1,0

Input: z1, . . . , zL.

Output: bV1,R.

1: Obtain the preliminary Tyler’s joint estimators: ˆ µT y← limk→∞µˆ(k)T yin (16), b V1,T y ← limk→∞Vb (k+1) 1,T y in (17), 2: Data centring: {zl}Ll=1← {zl−µˆT y}Ll=1, 3: forl = l to L do 4: Qˆ?l ← zHl Vb−11,T yzl, 5: uˆ?l ← ( ˆQ? l)−1/2Vb −1/2 1,T yzl, 6: end for

7: Evaluate the ranks {r?1, . . . , rL?} of { ˆQ?1, . . . , ˆQ?L},

8: L b V1,T y ← P( bV −T /2 1,T y ⊗ bV −1/2 1,T y)Π⊥vec(IN),

9: Evaluateα as in [8, Sec. IV.B].ˆ

10: Υ ←b αLˆ b V1,T yL H b V1,T y, 11: ∆ b V1,T y ← LVb1,T y PL l=1KvdW  r? l L+1  vec(ˆu?l(ˆu?l)H), 12: vec( bV1,R) ← vec( bV1,T y) + L−1/2Υb−1∆ b V1,T y,

13: Reshapevec( bV1,R) as an N × N matrix,

14: return bV1,R

while the relevant density generator is given by: h0(t) , sΓ(N )b−N/s πNΓ(N/s) exp  −t s b  , t ∈ R+. (26) We chose the GG distribution to assess the performance of the proposed joint estimator because of its flexibility in character-izing the data “heavy-tailness” with respect to the Gaussian one. In fact, according to the value of the shape parameter s > 0, the GG density generator in (26) is able to define a distribution with both heavier tails (0 < s < 1) and lighter tails (s > 1) compared to the Gaussian one (s = 1).

The parameters adopted in our simulations are:

• Σ0is a Toeplitz Hermitian matrix whose first column is

given by[1, ρ, . . . , ρN −1]T;ρ = 0.8ej2π/5andN = 8.

• Shape matrix: V1,0, Σ0/[Σ0]1,1.

• Location vector:[µ0]n, 0.5ej1π/7(n−1),n = 1, . . . , N .

• Scale parameter: b = [σ2

XN Γ(N/s)/Γ((N + 1)/s)]s

in (26) andσ2

X= E{Q}/N = 4.

• Numbers of observations: L = 5N . This clearly de-fines a “finite-sample” regime.

The performance assessment will be performed in terms of the following Mean Squared Error (MSE) indices:

%γ , ||E{( ˆµaγ− µa0)( ˆµaγ− µa0)H}||F, (27)

where, for a given x ∈ CN, xa , (xT, xH)T ∈ C2N,

ςγ , ||E{vec( bV1,γ− V1,0)vec( bV1,γ− V1,0)H}||F, (28)

and γ indicates the relevant estimator at hand. As lower bound indices, we use εCRB,µ0 , ||CRB(µ0|h0)||F and

εSCRB,V1,0 , ||SCRB(vec(V1,0)|h0)||F, where CRB(µ0|h0)

is given in (18) andSCRB(vec(V1,0)|h0) in (19).

0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Shape parameter: s MSE indices & Bound %SM %Ty εCRB,µ0

Fig. 1: Estimation of the location vector: MSE indices and CRB vss (L = 5N and 106Monte Carlo runs).

0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 0.6 0.8 1 1.2 1.4 Shape parameter: s MSE indices & Bound ςSCM ςT y ςR εSCRB,V1,0

Fig. 2: Estimation of the shape matrix: MSE indices and SCRB vss (L = 5N and 106Monte Carlo runs).

Fig. 1 shows the performance of the sample meanµˆSM

estimator in (23) and of the Tyler’s estimator µˆT y in (16)

compared to the lover bound in (18). As wee can see,µˆT yis

almost efficient with respect toCRB(µ0|h0) in heavy-tailed

data (0 < s < 1) and outperforms ˆµSM that it is known to be

non robust. On the other hand,µˆSM is efficient in the

Gaus-sian case (s = 1), and tends to have better performance than ˆ

µT yfors > 1. However, in this light-tails scenario, the MSE

ofµˆT ydoes not explode and remains close to theµˆSM’s one.

As far it concern the shape matrix estimation, the simula-tion results are shown in Fig. 2. The main fact here is that the R-estimator bV1,R in (20) outperforms the Tyler’s estimator

b

V1,T yin (17) for every values ofs, i.e. for both heavy-tailed

and light-tailed data. Moreover, as expected, bV1,R greatly

outperforms the sample covariance matrix bV1,SCMin (24) in

the presence of heavy-tailed data (0 < s < 1), while their MSE is of the same order for s > 1. Finally, a comment

(7)

on the efficiency of the above-mentioned estimator is in or-der. As we can see, there is a gap between the MSE indices of bV1,SCM, bV1,T y and bV1,R. However, it is worth to

un-derline that our aim here is to compare the performance of shape matrix estimators in a “finite-sample” regime, i.e. with a number of observations equal toL = 5N that represents a reasonable value in many practical applications. Of course, by lettingL → ∞, it can be shown that bV1,Rachieves the

boundSCRB(vec(V1,0)|h0) in (19) as predicted by

theoreti-cal considerations [7, 8].

In summary, previous simulations highlights the benefits that the proposed robustR-estimator can bring. Specifically, it always outperforms the Tyler’s estimator in both heavy- and light-tails scenarios. Moreover its estimation performance is way better that the SCM one in heavy-tailed data while it is almost similar in light-tailed scenarios: high gain, very small loss. These very promising results promote the use of the R-estimator to other problems, as the structured shape esti-mation discussed in [15].

5. CONCLUSIONS

The joint estimation of the location vector µ0and the shape

matrix V1of a set of i.i.d. CES-distributed, multivariate

ob-servations has been addressed. Building upon the asymptotic decorrelation of the location and shape estimation problems, a joint estimator that relies on the Tyler’sM -estimator ˆµT yfor

µ0and on a recently proposedR-estimator bV1,Rfor V1has

been discussed and its MSE performance assessed and com-pared with the relevant Semiparametric Cram´er-Rao Bound. Our simulation results, obtained for GG-distributed data, have shown that joint estimator( ˆµT y, bV1,R) of location and shape

represents a good alternative to the classical Maronna’s joint M -estimators. In particular, in terms of shape matrix esti-mation, the proposed joint estimator outperforms the joint Tyler’s estimator in both heavy-tailed and light-tailed data. Future works will investigate the application of the proposed estimator in robust clustering and distance learning problems.

6. REFERENCES

[1] C. M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics), 1st ed. Springer, 2007.

[2] R. Remmert, Theory of Complex Functions. New York: Springer, 1991.

[3] J. Frontera-Pons, M. A. Veganzones, F. Pascal, and J. Ovarlez, “Hyperspectral anomaly detectors using ro-bust estimators,” IEEE Journal of Selected Topics in Ap-plied Earth Observations and Remote Sensing, vol. 9, no. 2, pp. 720–731, 2016.

[4] V. Roizman, M. Jonckheere, and F. Pascal, “A flexible EM-like clustering algorithm for noisy data.” [Online]. Available: https://arxiv.org/abs/1907.01660

[5] E. Ollila, D. E. Tyler, V. Koivunen, and H. V. Poor, “Complex elliptically symmetric distributions: Survey, new results and applications,” IEEE Transactions on Signal Processing, vol. 60, no. 11, pp. 5597–5625, 2012. [6] M. Hallin and D. Paindaveine, “Parametric and semi-parametric inference for shape: the role of the scale functional,” Statistics & Decisions, vol. 24, no. 3, pp. 327–350, 2009.

[7] M. Hallin, H. Oja, and D. Paindaveine, “Semiparametri-cally efficient rank-based inference for shape II. Optimal R-estimation of shape,” The Annals of Statistics, vol. 34, no. 6, pp. 2757–2789, 2006.

[8] S. Fortunati, A. Renaux, and F. Pascal, “Ro-bust semiparametric efficient estimators in ellip-tical distributions,” Submitted to IEEE Transac-tions on Signal Processing. [Online]. Available: https://arxiv.org/abs/2002.02239

[9] P. Bickel, C. Klaassen, Y. Ritov, and J. Wellner, Effi-cient and Adaptive Estimation for Semiparametric Mod-els. Johns Hopkins University Press, 1993.

[10] S. Fortunati, F. Gini, M. S. Greco, A. M. Zoubir, and M. Rangaswamy, “Semiparametric inference and lower bounds for real elliptically symmetric distributions,” IEEE Transactions on Signal Processing, vol. 67, no. 1, pp. 164–177, Jan 2019.

[11] S. Fortunati, F. Gini, M. S. Greco, A. M. Zoubir, and M. Rangaswamy, “Semiparametric CRB and Slepian-Bangs formulas for complex elliptically symmetric dis-tributions,” IEEE Transactions on Signal Processing, vol. 67, no. 20, pp. 5352–5364, Oct 2019.

[12] R. A. Maronna, “RobustM -estimators of multivariate location and scatter,” Ann. Statist., vol. 4, no. 1, pp. 51– 67, 01 1976.

[13] D. E. Tyler, “A distribution-free M-estimator of multi-variate scatter,” The Annals of Statistics, vol. 15, no. 1, pp. 234–251, 1987.

[14] F. Pascal, L. Bombrun, J. Tourneret, and Y. Berthoumieu, “Parameter estimation for multi-variate generalized gaussian distributions,” IEEE Transactions on Signal Processing, vol. 61, no. 23, pp. 5960–5971, 2013.

[15] B. M´eriaux, C. Ren, M. N. El Korso, A. Breloy, and P. Forster, “Robust estimation of structured scatter ma-trices in (mis)matched models,” Signal Processing, vol. 165, pp. 163 – 174, 2019.

Figure

Fig. 1: Estimation of the location vector: MSE indices and CRB vs s (L = 5N and 10 6 Monte Carlo runs).

Références

Documents relatifs

Concrétiser mon sublime rêve A toujours été mon grand rêve Dans mon sommeil ou en éveil J’y cogite tout le temps sans trêve J’ai passé des nuits à penser Plusieurs années

We could extend the dynamical range of star formation rates and stellar masses of SFGs with observationally constrained molecular gas contents below SFR &lt; 40 M  yr −1 and M ∗

In this chapter we propose inference (i.e. estimation and testing) for vast dimensional elliptical distributions. Estimation is based on quantiles, which always exist regardless of

Joint event location again produces a better estimate of the fracture spacing than individual location does, by correctly handling the correlation among different events induced

However, the results on the extreme value index, in this case, are stated with a restrictive assumption on the ultimate probability of non-censoring in the tail and there is no

We find that joint measurability of binary POVMs is equivalent to the inclusion of the matrix diamond into the free spectrahedron defined by the effects under study7. This

Abstract This paper presents new approaches for the estimation of the ex- treme value index in the framework of randomly censored (from the right) samples, based on the ideas