
Bounds for Estimation of Covariance Matrices From Heterogeneous Samples

Olivier Besson, Senior Member, IEEE, Stéphanie Bidon, Student Member, IEEE, and Jean-Yves Tourneret, Member, IEEE

Abstract—This correspondence derives lower bounds on the mean-square error (MSE) for the estimation of a covariance matrix $M_p$, using samples $Z_k$, $k = 1,\dots,K$, whose covariance matrices $M_k$ are randomly distributed around $M_p$. This framework can be encountered, e.g., in a radar system operating in a nonhomogeneous environment, when it is desired to estimate the covariance matrix of a range cell under test using training samples from adjacent cells, and the noise is nonhomogeneous between the cells. We consider two different assumptions for $M_p$. First, we assume that $M_p$ is a deterministic and unknown matrix, and we derive the Cramér–Rao bound for its estimation. In a second step, we assume that $M_p$ is a random matrix with some prior distribution, and we derive the Bayesian bound under this hypothesis.

Index Terms—Bayesian bound, covariance matrix estimation, Cramér–Rao bound, heterogeneous environment.

I. PROBLEM STATEMENT AND DATA MODEL

Estimating the covariance matrix of an observation vector is fundamental in many array processing applications, notably in adaptive radar detection where it is desired to estimate the noise statistics of a vector under test, so as to implement an adaptive detection scheme [1]. In an ideal situation, this task is performed using independent and identically distributed (i.i.d.) training samples, which share the same covariance matrix as the vector under test. In such a case, and under the assumption that all vectors are Gaussian, the sample covariance matrix (SCM) estimator is the maximum-likelihood estimator (MLE). However, heterogeneous environments are very frequently encountered [2], [3], and therefore the assumption of i.i.d. samples is often violated. More precisely, the training samples do not have the same covariance matrix as the vector under test, and they may even not share a common covariance matrix. In an attempt to take this fact into account, we proposed in [4] a model for heterogeneous environments; see also [5], where we discuss the rationale and relevance of such a model along with adaptive detection schemes related to it. More precisely, we assumed that the set of training samples can be divided into $K$ groups. The $k$th group contains $L_k$ snapshots $\{z_{k,\ell}\}_{\ell=1}^{L_k}$ sharing the same covariance matrix $M_k \neq M_p$. When $K = 1$, all training samples have a common covariance matrix, which is however different from $M_p$. When $L_k = 1$ for $k = 1,\dots,K$, all training samples have a different covariance matrix. The snapshots $z_{k,\ell}$ are assumed independent and Gaussian distributed, with covariance matrix $M_k$; i.e., the distribution of $Z_k = [\,z_{k,1}\ \cdots\ z_{k,L_k}\,]$, conditionally to $M_k$, is

$$f(Z_k \mid M_k) = \pi^{-mL_k}\,|M_k|^{-L_k}\,\mathrm{etr}\left\{-M_k^{-1}Z_kZ_k^H\right\} \tag{1}$$

Manuscript received July 16, 2007; revised December 13, 2007. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Petr Tichavsky. This work was supported by the Délégation Générale pour l'Armement (DGA) and by Thales Systèmes Aéroportés.

O. Besson and S. Bidon are with the Department of Electronics, Optronics and Signal, ISAE, University of Toulouse, 31055 Toulouse, France (e-mail: besson@isae.fr; sbidon@isae.fr).

J.-Y. Tourneret is with IRIT/ENSEEIHT, 31071 Toulouse, France (e-mail: jean-yves.tourneret@enseeiht.fr).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSP.2008.917341

where $|\cdot|$ and $\mathrm{etr}\{\cdot\}$ stand for the determinant and the exponential of the trace of a matrix, respectively, and $m$ is the size of the observation vector. The matrices $M_k$ are assumed to be independent conditionally to $M_p$, and distributed according to an inverse Wishart distribution with mean $M_p$ and $\nu_k$ degrees of freedom, i.e., [6]

$$f(M_k \mid M_p) = \frac{\left|(\nu_k-m)M_p\right|^{\nu_k}}{\tilde{\Gamma}_m(\nu_k)}\,|M_k|^{-(\nu_k+m)}\,\mathrm{etr}\left\{-(\nu_k-m)M_k^{-1}M_p\right\} \tag{2}$$
where
$$\tilde{\Gamma}_m(p) = \pi^{m(m-1)/2}\prod_{k=1}^{m}\Gamma(p-k+1). \tag{3}$$

The scalar $\nu_k$ allows one to adjust the distance between $M_k$ and $M_p$: the larger $\nu_k$, the closer $M_k$ to $M_p$ [6]. To summarize, the model for the training samples is given by

$$Z_k \mid M_k \sim \tilde{\mathcal{N}}_{m,L_k}\left(0,\,M_k,\,I_{L_k}\right) \tag{4}$$
$$M_k \mid M_p \sim \tilde{\mathcal{W}}_m^{-1}\left((\nu_k-m)M_p,\,\nu_k\right) \tag{5}$$
for $k = 1,\dots,K$, where $\tilde{\mathcal{N}}_{m,L_k}(0, M_k, I_{L_k})$ and $\tilde{\mathcal{W}}_m^{-1}((\nu_k-m)M_p, \nu_k)$ denote the complex normal distribution and the complex inverse Wishart distribution, respectively. This correspondence considers two assumptions for $M_p$, namely $M_p$ is deterministic or $M_p$ is a random matrix whose prior distribution is Wishart, with mean $\bar{M}_p$ and $\mu$ degrees of freedom, i.e.,

$$f(M_p) = \frac{1}{\tilde{\Gamma}_m(\mu)\left|\mu^{-1}\bar{M}_p\right|^{\mu}}\,|M_p|^{\mu-m}\,\mathrm{etr}\left\{-\mu M_p\bar{M}_p^{-1}\right\}. \tag{6}$$
We denote this distribution as $M_p \mid \bar{M}_p \sim \tilde{\mathcal{W}}_m(\mu^{-1}\bar{M}_p, \mu)$. Note that the distance between $M_p$ and $\bar{M}_p$ decreases as $\mu$ increases [6].
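Before proceeding, it may help to see the full hierarchy (4)–(6) in code. The sketch below is our illustration, not material from the paper: the helper names, seed, and parameter values are ours, and the draws use the standard sum-of-outer-products constructions for the complex Wishart and inverse Wishart distributions.

```python
import numpy as np

rng = np.random.default_rng(0)

def complex_gaussian(rng, cov, n):
    """n i.i.d. zero-mean circular complex Gaussian vectors with covariance cov."""
    m = cov.shape[0]
    C = np.linalg.cholesky(cov)
    W = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    return C @ W

def sample_wishart(rng, mean, mu):
    """M_p ~ W~_m(mu^{-1} Mbar, mu) as in (6): sum of mu outer products, E{M_p} = Mbar."""
    X = complex_gaussian(rng, mean / mu, mu)
    return X @ X.conj().T

def sample_inv_wishart(rng, mean, nu):
    """M_k ~ W~_m^{-1}((nu - m) M_p, nu) as in (2)/(5), with E{M_k} = M_p (requires nu > m)."""
    G = complex_gaussian(rng, np.linalg.inv((nu - mean.shape[0]) * mean), nu)
    return np.linalg.inv(G @ G.conj().T)

m, K, Lk, nu, mu = 8, 10, 1, 10, 20                       # illustrative sizes
Mbar = 0.9 ** np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
Mp = sample_wishart(rng, Mbar, mu)                        # prior (6)
Ms = [sample_inv_wishart(rng, Mp, nu) for _ in range(K)]  # heterogeneity (5)
Zs = [complex_gaussian(rng, Mk, Lk) for Mk in Ms]         # snapshots (4)
```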

In [4], we proposed strategies for estimating $M_p$ under this framework. The aim of this correspondence is to derive lower bounds for the MSE of estimators of $M_p$. More precisely, we first assume that $M_p$ is deterministic and derive its Cramér–Rao bound (CRB). Next, assuming that $M_p$ is drawn from (6), we derive the Bayesian bound (BB) for its estimation. The counterpart of estimation, namely detection, is beyond the scope of the present correspondence and is not addressed here. Note also that, while the bounds enable one to measure the performance of estimators, they cannot always prejudge their performance in detection.

II. CRAMÉR–RAO BOUND (DETERMINISTIC $M_p$)

We first derive the Cramér–Rao bound for estimation of $M_p$, assuming the latter is a deterministic and unknown matrix. Let $Z = [\,Z_1\ \cdots\ Z_K\,]$ and first note that $f(Z \mid M_p)$ is given by

$$f(Z \mid M_p) = \prod_{k=1}^{K} f(Z_k \mid M_p) = \prod_{k=1}^{K}\int f(Z_k \mid M_k)\,f(M_k \mid M_p)\,dM_k$$
$$= \prod_{k=1}^{K}\frac{\left|(\nu_k-m)M_p\right|^{\nu_k}}{\pi^{mL_k}\,\tilde{\Gamma}_m(\nu_k)}\int |M_k|^{-(\nu_k+L_k+m)}\,\mathrm{etr}\left\{-M_k^{-1}\left[(\nu_k-m)M_p + Z_kZ_k^H\right]\right\}dM_k$$
$$= \prod_{k=1}^{K}\frac{\tilde{\Gamma}_m(\nu_k+L_k)}{\pi^{mL_k}\,\tilde{\Gamma}_m(\nu_k)}\,\left|(\nu_k-m)M_p\right|^{\nu_k}\left|(\nu_k-m)M_p + Z_kZ_k^H\right|^{-(\nu_k+L_k)}$$
$$= \prod_{k=1}^{K}\frac{\tilde{\Gamma}_{L_k}(\nu_k+L_k)}{\pi^{mL_k}\,\tilde{\Gamma}_{L_k}(\nu_k+L_k-m)}\,\left|(\nu_k-m)M_p\right|^{-L_k}\left|I_{L_k} + (\nu_k-m)^{-1}Z_k^HM_p^{-1}Z_k\right|^{-(\nu_k+L_k)}. \tag{7}$$
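Numerically, (7) is best evaluated in log form. The following sketch (ours; the variable names are illustrative) computes $\ln f(Z \mid M_p)$ up to terms that do not depend on $M_p$, i.e., the log-likelihood (8) introduced next.

```python
import numpy as np

def log_likelihood(Mp, Zs, nus):
    """ln f(Z | M_p) from (7), dropping all terms that do not depend on M_p."""
    m = Mp.shape[0]
    _, logdet_Mp = np.linalg.slogdet(Mp)
    val = 0.0
    for Zk, nuk in zip(Zs, nus):
        Lk = Zk.shape[1]
        _, logdet = np.linalg.slogdet((nuk - m) * Mp + Zk @ Zk.conj().T)
        val += nuk * logdet_Mp - (nuk + Lk) * logdet
    return val
```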

We observe that $Z_k$ is distributed according to a generalized complex multivariate $t$ distribution, with $\nu_k + L_k - m$ degrees of freedom [7]. The log-likelihood function is thus

$$\Lambda(Z \mid M_p) = \mathrm{const.} + \sum_{k=1}^{K}\nu_k\ln|M_p| - \sum_{k=1}^{K}(\nu_k+L_k)\ln\left|(\nu_k-m)M_p + Z_kZ_k^H\right|. \tag{8}$$
Let $m_p = \mathrm{vec}(M_p)$ be the vector obtained by stacking the columns of $M_p$ on top of each other. Accordingly, let $\tilde{m}_p \in \mathbb{R}^{m^2\times 1}$ be the real-valued vector that consists of the elements along the diagonal of $M_p$ and the real and imaginary parts of its elements under the diagonal. In order to obtain the CRB, we need to derive the Fisher information matrix (FIM), which is defined as [8]

$$\tilde{F}(M_p) = E_{Z\mid M_p}\left\{-\frac{\partial^2\Lambda(Z \mid M_p)}{\partial\tilde{m}_p\,\partial\tilde{m}_p^T}\right\}. \tag{9}$$

Observe that $\tilde{m}_p = Jm_p$ with $J$ the (invertible) Jacobian matrix. It is straightforward to show that

$$F(M_p) = E_{Z\mid M_p}\left\{-\frac{\partial^2\Lambda(Z \mid M_p)}{\partial m_p\,\partial m_p^H}\right\} = J^H\tilde{F}(M_p)J. \tag{10}$$
For mathematical convenience, we will derive the matrix $F(M_p)$ in (10) and, with a slight abuse of language, refer to it as the FIM in the sequel. Herein, we define the derivative with respect to a complex scalar $x = x_R + ix_I$ as $\partial/\partial x \triangleq (1/2)\left[\partial/\partial x_R + i\,\partial/\partial x_I\right]$. Differentiating $\Lambda(Z \mid M_p)$ with respect to $M_p$ yields the following result:

$$\frac{\partial\Lambda(Z \mid M_p)}{\partial M_p} = \sum_{k=1}^{K}\nu_kM_p^{-1} - \sum_{k=1}^{K}(\nu_k+L_k)(\nu_k-m)\left[(\nu_k-m)M_p + Z_kZ_k^H\right]^{-1}. \tag{11}$$
In order to differentiate (11), we use the fact that

$$\frac{\partial M_p^{-1}}{\partial M_p^*(k,\ell)} = -M_p^{-1}\,\frac{\partial M_p}{\partial M_p^*(k,\ell)}\,M_p^{-1}.$$

Accordingly, since $M_p$ is Hermitian, for any two matrices $A$ and $B$

$$\left[A\,\frac{\partial M_p}{\partial M_p^*(k,\ell)}\,B\right]_{i,j} = \sum_{p,q=1}^{m}A_{i,p}\left[\frac{\partial M_p}{\partial M_p^*(k,\ell)}\right]_{p,q}B_{q,j} = A_{i,k}B_{\ell,j} = \left[B^T\otimes A\right]_{i+(j-1)m,\,k+(\ell-1)m}$$

where $\otimes$ stands for the Kronecker product [9]. Using these results, it is straightforward to show that

$$\frac{\partial^2\Lambda(Z \mid M_p)}{\partial m_p\,\partial m_p^H} = -\sum_{k=1}^{K}\nu_k\,M_p^{-T}\otimes M_p^{-1} + \sum_{k=1}^{K}(\nu_k+L_k)(\nu_k-m)^2\left[(\nu_k-m)M_p + Z_kZ_k^H\right]^{-T}\otimes\left[(\nu_k-m)M_p + Z_kZ_k^H\right]^{-1}. \tag{12}$$
For the sake of notational convenience, let us introduce

$$\tilde{Z}_k = (\nu_k-m)^{-1/2}M_p^{-1/2}Z_k \tag{13}$$
$$\tilde{B}_k = \left(I_m + \tilde{Z}_k\tilde{Z}_k^H\right)^{-1} \tag{14}$$
and note that

$$\left[(\nu_k-m)M_p + Z_kZ_k^H\right]^{-1} = (\nu_k-m)^{-1}M_p^{-1/2}\tilde{B}_kM_p^{-1/2}. \tag{15}$$
Therefore, we can write

$$\frac{\partial^2\Lambda(Z \mid M_p)}{\partial m_p\,\partial m_p^H} = -\sum_{k=1}^{K}\nu_k\,M_p^{-T}\otimes M_p^{-1} + \sum_{k=1}^{K}(\nu_k+L_k)\left(M_p^{-T/2}\otimes M_p^{-1/2}\right)\left(\tilde{B}_k^T\otimes\tilde{B}_k\right)\left(M_p^{-T/2}\otimes M_p^{-1/2}\right). \tag{16}$$
In order to derive the FIM, we need to evaluate the statistical mean of $\tilde{B}_k^T\otimes\tilde{B}_k$. Towards this end, we first note that $\tilde{Z}_k$ has a complex multivariate $t$ distribution with $\nu_k + L_k - m$ degrees of freedom [7], i.e.,

$$f(\tilde{Z}_k \mid M_p) = \frac{\tilde{\Gamma}_{L_k}(\nu_k+L_k)}{\pi^{mL_k}\,\tilde{\Gamma}_{L_k}(\nu_k+L_k-m)}\left|I_{L_k} + \tilde{Z}_k^H\tilde{Z}_k\right|^{-(\nu_k+L_k)}. \tag{17}$$
It follows that $\tilde{B}_k$, conditionally to $M_p$, has a multivariate beta distribution with $(\nu_k, L_k)$ degrees of freedom [7], [10], i.e., $\tilde{B}_k \mid M_p \sim \tilde{\mathcal{B}}_m(\nu_k, L_k)$. Now, we make use of the following result. Let $B$ be distributed as $B \sim \tilde{\mathcal{B}}_r(p,q)$ with $p + q \geq r$. Then, for any matrices $A_1$ and $A_2$ [11]–[13]

$$E\left\{\mathrm{Tr}\{A_1BA_2B\}\right\} = \frac{p}{p+q}\left[\frac{p(p+q)-1}{(p+q)^2-1}\,\mathrm{Tr}\{A_1A_2\} + \frac{q}{(p+q)^2-1}\,\mathrm{Tr}\{A_1\}\,\mathrm{Tr}\{A_2\}\right]. \tag{18}$$
Let $e_i$ denote the vector whose elements are all zero, except the $i$th element which equals 1. Accordingly, let us note $E_{ij} = e_ie_j^T$. Then, using (18), one can obtain the $(i+(j-1)m,\,n+(\ell-1)m)$ element of $E\{\tilde{B}_k^T\otimes\tilde{B}_k\}$ as

$$E\left\{\tilde{B}_k(\ell,j)\,\tilde{B}_k(i,n)\right\} = E\left\{\left(e_\ell^T\tilde{B}_ke_j\right)\left(e_i^T\tilde{B}_ke_n\right)\right\} = E\left\{\mathrm{Tr}\left\{E_{n\ell}\tilde{B}_kE_{ji}\tilde{B}_k\right\}\right\}$$
$$= \frac{\nu_k}{\nu_k+L_k}\left[\frac{\nu_k(\nu_k+L_k)-1}{(\nu_k+L_k)^2-1}\,\mathrm{Tr}\{E_{n\ell}E_{ji}\} + \frac{L_k}{(\nu_k+L_k)^2-1}\,\mathrm{Tr}\{E_{n\ell}\}\,\mathrm{Tr}\{E_{ji}\}\right]$$
$$= \frac{\nu_k}{\nu_k+L_k}\left[\frac{\nu_k(\nu_k+L_k)-1}{(\nu_k+L_k)^2-1}\,\delta_{\ell,j}\delta_{i,n} + \frac{L_k}{(\nu_k+L_k)^2-1}\,\delta_{i,j}\delta_{\ell,n}\right]. \tag{19}$$
It follows that

$$E\left\{\tilde{B}_k^T\otimes\tilde{B}_k\right\} = \frac{\nu_k}{\nu_k+L_k}\left[\frac{\nu_k(\nu_k+L_k)-1}{(\nu_k+L_k)^2-1}\,I + \frac{L_k}{(\nu_k+L_k)^2-1}\,ee^T\right] \tag{20}$$
where $e = [\,e_1^T\ \cdots\ e_m^T\,]^T = \mathrm{vec}(I_m)$. Consequently, the FIM can be expressed as
$$F(M_p) = \left(M_p^{-T/2}\otimes M_p^{-1/2}\right)\left[\alpha I + \beta ee^T\right]\left(M_p^{-T/2}\otimes M_p^{-1/2}\right) \tag{21}$$
with
$$\alpha = \sum_{k=1}^{K}\nu_k\left[1 - \frac{\nu_k(\nu_k+L_k)-1}{(\nu_k+L_k)^2-1}\right] = \sum_{k=1}^{K}\frac{\nu_kL_k(\nu_k+L_k)}{(\nu_k+L_k)^2-1} \tag{22}$$
$$\beta = -\sum_{k=1}^{K}\frac{\nu_kL_k}{(\nu_k+L_k)^2-1}. \tag{23}$$

It ensues that the Cramér–Rao bound can be written as
$$\mathrm{CRB} = \left(M_p^{T/2}\otimes M_p^{1/2}\right)\left[\alpha I + \beta ee^T\right]^{-1}\left(M_p^{T/2}\otimes M_p^{1/2}\right) = \alpha^{-1}\left(M_p^{T/2}\otimes M_p^{1/2}\right)\left[I - \frac{\beta}{\alpha+m\beta}\,ee^T\right]\left(M_p^{T/2}\otimes M_p^{1/2}\right)$$
$$= \alpha^{-1}\left[M_p^T\otimes M_p - \frac{\beta}{\alpha+m\beta}\,\mathrm{vec}(M_p)\,\mathrm{vec}\left(M_p^T\right)^T\right] \tag{24}$$
where we have used the fact that $(A\otimes B)e = (A\otimes B)\,\mathrm{vec}(I) = \mathrm{vec}(BA^T)$ [9]. The MSE of any estimate $\hat{M}_p$ of $M_p$, $E_{Z\mid M_p}\{\|\hat{M}_p - M_p\|^2\}$, is thus lower-bounded by
$$\mathrm{Tr}\left\{F(M_p)^{-1}\right\} = \alpha^{-1}\left[\mathrm{Tr}\{M_p\}^2 - \frac{\beta}{\alpha+m\beta}\,\mathrm{Tr}\{M_p^2\}\right]. \tag{25}$$

Equation (25) provides a lower bound for the MSE of any estimator of $M_p$, when $M_p$ is a deterministic matrix. Some insights into the properties of the CRB can be gained by considering special cases.

1) Consider first the case $K = 1$ and, for the sake of convenience, let us note $L = L_1$ and $\nu = \nu_1$. In this case, there are $L$ snapshots, all sharing the same covariance matrix $M_s = M_1$, and the latter has an inverse Wishart prior, centered around $M_p$, with $\nu$ degrees of freedom. Under this framework, it is straightforward to show that (25) reduces to
$$\mathrm{Tr}\left\{F(M_p)^{-1}\right\} = \frac{(\nu+L)^2-1}{\nu(\nu+L)L}\left[\mathrm{Tr}\{M_p\}^2 + (\nu+L-m)^{-1}\,\mathrm{Tr}\{M_p^2\}\right]$$
$$\simeq \nu^{-1}\,\mathrm{Tr}\{M_p\}^2 \quad\text{when } L\to\infty$$
$$\simeq \frac{1}{L}\,\mathrm{Tr}\{M_p\}^2 \quad\text{when } \nu\to\infty.$$

Two important observations can be made. First, note that, for finite $\nu$, the lower bound does not go to zero but instead converges to $\nu^{-1}\mathrm{Tr}\{M_p\}^2$. Therefore, consistent estimation of $M_p$ is not possible within this framework. This phenomenon can be explained as follows. The snapshots $Z$ provide information about $M_s$, and we can expect them to provide accurate estimates of this matrix. However, $M_s$ is randomly distributed "around" $M_p$ and [6]
$$E\left\{\|M_s - M_p\|^2\right\} = \frac{(\nu-m)\,\mathrm{Tr}\{M_p\}^2 + \mathrm{Tr}\{M_p^2\}}{(\nu-m+1)(\nu-m-1)} \simeq \nu^{-1}\,\mathrm{Tr}\{M_p\}^2\left[1 + \frac{m(\nu-m)+1}{(\nu-m+1)(\nu-m-1)}\right].$$
Therefore, $\nu^{-1}\mathrm{Tr}\{M_p\}^2$ corresponds to the minimum distance between $M_s$ and $M_p$, and hence the "least" uncertainty that we can obtain when estimating $M_p$ from $Z$. The second point to be noted is that, when $\nu$ increases, the lower bound is inversely proportional to $L$. We recover here the well-known fact that, in a homogeneous environment, the CRB is inversely proportional to the number of snapshots.

2) Let us now consider the case of most interest to us, namely $L_k = 1$, i.e., there are $K$ snapshots with $K$ different covariance matrices. For the sake of simplicity, let us assume that $\nu_k = \nu$, $\forall k = 1,\dots,K$. Then, the trace of the CRB becomes
$$\mathrm{Tr}\left\{F(M_p)^{-1}\right\} = \frac{\nu+2}{(\nu+1)K}\left[\mathrm{Tr}\{M_p\}^2 + (\nu+1-m)^{-1}\,\mathrm{Tr}\{M_p^2\}\right] \xrightarrow{K\to\infty} 0$$
$$\simeq \frac{1}{K}\,\mathrm{Tr}\{M_p\}^2 \quad\text{when } \nu\to\infty.$$

An important observation follows from this result: in contrast to the preceding case, the CRB now goes to zero as the number of snapshots goes to infinity; therefore, consistent estimation of $M_p$ is possible, even for finite $\nu$. This can be explained by the "diversity" effect. Indeed, when all snapshots have the same covariance matrix $M_s$, they more or less provide the same "view" of $M_p$ (we can think of $M_s$ as a given point in the space of $m\times m$ Hermitian matrices, around $M_p$). In contrast, when $L_k = 1$, each snapshot provides a different point of view of $M_p$, and this diversity can be exploited advantageously to yield consistent estimation of $M_p$. Therefore, for a given number of snapshots, the case $L_k = 1$ is a more favorable situation than the case $K = 1$. For large $\nu$, however, the same CRB is obtained.
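The contrast between the two cases is easy to reproduce from (22), (23), and (25). The sketch below is our illustration (not the authors' code): it evaluates the CRB trace for the two configurations at the same total number of snapshots and exhibits the saturation for $K = 1$ versus the roughly $1/N$ decay for $L_k = 1$.

```python
import numpy as np

def crb_trace(Mp, Ls, nus):
    """Tr{CRB} from (22), (23), and (25) for group sizes Ls and dofs nus."""
    m = Mp.shape[0]
    Ls, nus = np.asarray(Ls, float), np.asarray(nus, float)
    alpha = np.sum(nus * Ls * (nus + Ls) / ((nus + Ls) ** 2 - 1))  # (22)
    beta = -np.sum(nus * Ls / ((nus + Ls) ** 2 - 1))               # (23)
    t1, t2 = np.trace(Mp).real ** 2, np.trace(Mp @ Mp).real
    return (t1 - beta / (alpha + m * beta) * t2) / alpha           # (25)

m, nu = 8, 10
Mp = 0.9 ** np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
for N in [20, 100, 1000]:
    print(N, crb_trace(Mp, [N], [nu]),          # K = 1: saturates near Tr{Mp}^2 / nu
             crb_trace(Mp, [1] * N, [nu] * N))  # L_k = 1: decays roughly as 1/N
```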

III. BAYESIAN BOUND (RANDOM $M_p$)

We now assume that $M_p$ is distributed according to a Wishart distribution with mean $\bar{M}_p$ and $\mu$ degrees of freedom; see (6). The Bayesian bound is obtained as the inverse of the information matrix, which is given by [8]
$$F_B = E_{Z,M_p}\left\{-\frac{\partial^2\Lambda(Z,M_p)}{\partial m_p\,\partial m_p^H}\right\} = E_{Z,M_p}\left\{-\frac{\partial^2\Lambda(Z \mid M_p)}{\partial m_p\,\partial m_p^H} - \frac{\partial^2\Lambda(M_p)}{\partial m_p\,\partial m_p^H}\right\}$$
$$= E_{M_p}\left\{E_{Z\mid M_p}\left\{-\frac{\partial^2\Lambda(Z \mid M_p)}{\partial m_p\,\partial m_p^H}\right\} - \frac{\partial^2\Lambda(M_p)}{\partial m_p\,\partial m_p^H}\right\} = E_{M_p}\left\{F(M_p) + (\mu-m)\,M_p^{-T}\otimes M_p^{-1}\right\} \tag{26}$$
since, from (6), we have

$$\frac{\partial\Lambda(M_p)}{\partial M_p} = (\mu-m)M_p^{-1} - \mu\bar{M}_p^{-1} \tag{27a}$$
$$\frac{\partial^2\Lambda(M_p)}{\partial m_p\,\partial m_p^H} = -(\mu-m)\,M_p^{-T}\otimes M_p^{-1}. \tag{27b}$$
The information matrix is thus the average value, with respect to the prior distribution $f(M_p)$, of
$$F_0(M_p) = F(M_p) + (\mu-m)\,M_p^{-T}\otimes M_p^{-1} = \left(M_p^{-T/2}\otimes M_p^{-1/2}\right)\left[\alpha' I + \beta ee^T\right]\left(M_p^{-T/2}\otimes M_p^{-1/2}\right)$$
$$= \alpha'\,M_p^{-T}\otimes M_p^{-1} + \beta\,\mathrm{vec}\left(M_p^{-1}\right)\mathrm{vec}\left(M_p^{-T}\right)^T \tag{28}$$
with $\alpha' = \alpha + \mu - m$. Let us now evaluate the average value of each term in the previous equation. The $(i+(j-1)m,\,k+(\ell-1)m)$ element of $E_{M_p}\{M_p^{-T}\otimes M_p^{-1}\}$ is [6]
$$E\left\{M_p^{-1}(\ell,j)\,M_p^{-1}(i,k)\right\} = E\left\{\mathrm{Tr}\left\{E_{ji}M_p^{-1}E_{k\ell}M_p^{-1}\right\}\right\}$$
$$= \frac{\mu^2(\mu-m)\,\mathrm{Tr}\left\{E_{ji}\bar{M}_p^{-1}E_{k\ell}\bar{M}_p^{-1}\right\} + \mu^2\,\mathrm{Tr}\left\{E_{ji}\bar{M}_p^{-1}\right\}\mathrm{Tr}\left\{E_{k\ell}\bar{M}_p^{-1}\right\}}{(\mu-m+1)(\mu-m)(\mu-m-1)}$$
$$= \frac{\mu^2(\mu-m)\,\bar{M}_p^{-1}(i,k)\,\bar{M}_p^{-1}(\ell,j) + \mu^2\,\bar{M}_p^{-1}(i,j)\,\bar{M}_p^{-1}(\ell,k)}{(\mu-m+1)(\mu-m)(\mu-m-1)}. \tag{29}$$

Observing that the $(i+(j-1)m,\,k+(\ell-1)m)$ elements of $A\otimes B$ and $\mathrm{vec}(A)\,\mathrm{vec}(B)^T$ are $A(j,\ell)B(i,k)$ and $A(i,j)B(k,\ell)$, it follows that
$$E_{M_p}\left\{M_p^{-T}\otimes M_p^{-1}\right\} = \frac{\mu^2(\mu-m)\,\bar{M}_p^{-T}\otimes\bar{M}_p^{-1} + \mu^2\,\mathrm{vec}\left(\bar{M}_p^{-1}\right)\mathrm{vec}\left(\bar{M}_p^{-T}\right)^T}{(\mu-m+1)(\mu-m)(\mu-m-1)}$$
$$= \frac{\mu^2\left(\bar{M}_p^{-T/2}\otimes\bar{M}_p^{-1/2}\right)\left[(\mu-m)I + ee^T\right]\left(\bar{M}_p^{-T/2}\otimes\bar{M}_p^{-1/2}\right)}{(\mu-m+1)(\mu-m)(\mu-m-1)}. \tag{30}$$
Using similar arguments, it can be shown that

$$E_{M_p}\left\{\mathrm{vec}\left(M_p^{-1}\right)\mathrm{vec}\left(M_p^{-T}\right)^T\right\} = \frac{\mu^2\left(\bar{M}_p^{-T/2}\otimes\bar{M}_p^{-1/2}\right)\left[I + (\mu-m)\,ee^T\right]\left(\bar{M}_p^{-T/2}\otimes\bar{M}_p^{-1/2}\right)}{(\mu-m+1)(\mu-m)(\mu-m-1)}. \tag{31}$$
Gathering the previous results, we end up with the following expression:
$$F_B = \left(\bar{M}_p^{-T/2}\otimes\bar{M}_p^{-1/2}\right)\left[\alpha'' I + \beta'' ee^T\right]\left(\bar{M}_p^{-T/2}\otimes\bar{M}_p^{-1/2}\right) \tag{32}$$
where
$$\alpha'' = \frac{\mu^2\left[\alpha'(\mu-m) + \beta\right]}{(\mu-m+1)(\mu-m)(\mu-m-1)}, \qquad \beta'' = \frac{\mu^2\left[\alpha' + (\mu-m)\beta\right]}{(\mu-m+1)(\mu-m)(\mu-m-1)}. \tag{33}$$
The Bayesian bound is obtained as the inverse of $F_B$, which yields

$$\mathrm{BB} = \alpha''^{-1}\left[\bar{M}_p^T\otimes\bar{M}_p - \frac{\beta''}{\alpha''+m\beta''}\,\mathrm{vec}\left(\bar{M}_p\right)\mathrm{vec}\left(\bar{M}_p^T\right)^T\right]. \tag{34}$$
Finally, under the assumption that $M_p$ has a Wishart prior, the MSE of any estimator of $M_p$ is lower-bounded by the following BB trace:

$$\mathrm{Tr}\left\{F_B^{-1}\right\} = \alpha''^{-1}\left[\mathrm{Tr}\{\bar{M}_p\}^2 - \frac{\beta''}{\alpha''+m\beta''}\,\mathrm{Tr}\{\bar{M}_p^2\}\right]$$
$$= \frac{(\mu-m+1)(\mu-m)(\mu-m-1)}{\mu^2\left[(\mu-m)^2 + \alpha(\mu-m) + \beta\right]}\left[\mathrm{Tr}\{\bar{M}_p\}^2 - \frac{\alpha + (\mu-m)(1+\beta)}{(\mu-m)^2 + (\mu-m)\left[\alpha + m(1+\beta)\right] + m\alpha + \beta}\,\mathrm{Tr}\{\bar{M}_p^2\}\right]. \tag{35}$$
The BB of $M_p$ depends on $\mu$ and $\bar{M}_p$, as expected. However, one can observe the similarity between (25) and (35). Note also that the lower bound in (35) depends on $\bar{M}_p$ only through $\mathrm{Tr}\{\bar{M}_p\}^2$ and $\mathrm{Tr}\{\bar{M}_p^2\}$.
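As with the CRB, (33)–(35) translate directly into a few lines of code. The sketch below is our illustration (function and variable names are ours) and returns the BB trace of (35).

```python
import numpy as np

def bb_trace(Mbar, Ls, nus, mu):
    """Tr{BB} from (33)-(35): Wishart prior with mean Mbar and mu dofs (mu > m + 1)."""
    m = Mbar.shape[0]
    Ls, nus = np.asarray(Ls, float), np.asarray(nus, float)
    alpha = np.sum(nus * Ls * (nus + Ls) / ((nus + Ls) ** 2 - 1))  # (22)
    beta = -np.sum(nus * Ls / ((nus + Ls) ** 2 - 1))               # (23)
    d = (mu - m + 1) * (mu - m) * (mu - m - 1)
    a2 = mu**2 * ((alpha + mu - m) * (mu - m) + beta) / d          # alpha'' in (33)
    b2 = mu**2 * ((alpha + mu - m) + (mu - m) * beta) / d          # beta''  in (33)
    t1, t2 = np.trace(Mbar).real ** 2, np.trace(Mbar @ Mbar).real
    return (t1 - b2 / (a2 + m * b2) * t2) / a2                     # (35)
```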


Fig. 1. Cramér–Rao bound versus number of snapshots.

Fig. 2. Cramér–Rao bound and MSE of the MLE versus number of snapshots ($L_k = 1$).

IV. NUMERICAL ILLUSTRATIONS

In this section, we provide numerical illustrations of the CRB and BB properties. First, we contrast the behavior of the CRB in the two opposite cases, namely $K = 1$ and $L_k = 1$. For the sake of simplicity, when $L_k = 1$, we assume that all $\nu_k$'s are equal to a common value denoted as $\nu$. Whatever the case, $N = \sum_{k=1}^{K}L_k$ denotes the total number of snapshots. In all simulations, the size of the observation space is $m = 8$. When considering the CRB, the true covariance matrix is given by $M_p(k,\ell) = 0.9^{|k-\ell|}$, while $\bar{M}_p(k,\ell) = 0.9^{|k-\ell|}$ when $M_p$ is assumed to be random. The matrices $M_k$ were generated according to the inverse Wishart distribution of (2). In practice, the $M_k$ are generated as $M_k = (G_kG_k^H)^{-1}$, where $G_k \in \mathbb{C}^{m\times\nu_k}$ is drawn from a zero-mean multivariate Gaussian distribution with covariance matrix $(\nu_k-m)^{-1}M_p^{-1}$.
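The construction just described can be checked by Monte Carlo; the sketch below (ours; the seed and sample count are arbitrary) verifies that the generated $M_k$ are centered on $M_p$, as required by (2).

```python
import numpy as np

rng = np.random.default_rng(2)
m, nu = 8, 10
Mp = 0.9 ** np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
C = np.linalg.cholesky(np.linalg.inv((nu - m) * Mp))  # column covariance of G_k

def draw_Mk():
    G = C @ (rng.standard_normal((m, nu)) + 1j * rng.standard_normal((m, nu))) / np.sqrt(2)
    return np.linalg.inv(G @ G.conj().T)              # M_k = (G_k G_k^H)^{-1}

# The inverse Wishart (2) has mean M_p, so the empirical mean should approach M_p.
mean_Mk = sum(draw_Mk() for _ in range(20000)) / 20000
print(np.abs(mean_Mk - Mp).max())                     # shrinks as the draw count grows
```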

Fig. 3. Cramér–Rao bound and MSE of the MLE versus $\nu$ ($L_k = 1$).

Fig. 4. Bayesian bound and MSE of the MMSE estimator versus $\mu$ ($N = 20$ and $\nu = 20$).

In Fig. 1, we display the CRB versus the total number of snapshots $N$, for two different values of $\nu$, namely $\nu = 10$ and $\nu = 20$. This figure confirms the observations made previously. When $L_k = 1$, the CRB decreases nearly linearly with the number of snapshots, while for $K = 1$ we can observe a threshold effect, i.e., the CRB no longer decreases when the number of snapshots increases. It can also be seen that the CRB decreases when $\nu$ increases, i.e., as the environment is more homogeneous. However, this improvement is more pronounced when $K = 1$ than when $L_k = 1$, which seems logical.

Next, we compare the performance of the MLE derived in [4] with the CRB, in the case $L_k = 1$. Figs. 2 and 3 consider the influence of the number of snapshots and $\nu$, respectively. From inspection of these figures, it can be seen that the MLE has a performance quite close to the CRB. The difference between the two is smaller as either $K$ or $\nu$ increases.

Finally, we provide illustrations of the BB properties. In Fig. 4, we contrast the trace of the BB for the two cases $K = 1$ and $L_k = 1$, and we study the influence of $\mu$, which rules the degree of a priori knowledge about $\bar{M}_p$. In this figure, we also display the MSE of the MMSE estimator derived in [4]. The total number of snapshots is $N = 20$ and $\nu = 20$. As can be observed, for a given number of snapshots, the BB is smaller when $L_k = 1$ than when $K = 1$, which confirms the previous observations made on the CRB. Also, as could be expected, the BB decreases as $\mu$ increases, i.e., as the prior is more and more informative. Finally, we note that the MMSE estimator has an MSE close to the BB only for large values of $\mu$.
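A Fig. 4-style sweep can be reproduced with the `bb_trace` sketch given after (35); the snippet below (our illustration, with arbitrary grid values) prints the BB trace versus $\mu$ for the two configurations.

```python
import numpy as np

# Assumes bb_trace from the sketch following (35) is in scope.
m, nu, N = 8, 20, 20
Mbar = 0.9 ** np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
for mu in [10, 15, 20, 30, 50, 100]:
    print(mu,
          bb_trace(Mbar, [N], [nu], mu),          # K = 1
          bb_trace(Mbar, [1] * N, [nu] * N, mu))  # L_k = 1
```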

V. CONCLUDING REMARKS

This correspondence derived lower bounds on the MSE for the estimation of a covariance matrix $M_p$ using heterogeneous samples $Z_k$, $k = 1,\dots,K$, which have covariance matrices $M_k$ different from $M_p$. When $M_p$ is deterministic, we showed that consistent estimation of $M_p$ is not feasible when all samples share the same covariance matrix, i.e., when $K = 1$. Indeed, the CRB does not converge to zero as the number of training samples increases. In contrast, if all snapshots have different covariance matrices, randomly distributed around $M_p$ (i.e., $L_k = 1$ for $k = 1,\dots,K$), the CRB goes to zero when the number of training samples increases. The correspondence also derived the Bayesian bound associated with a random covariance matrix $M_p$. The bounds derived herein enable one to quantify the degradation induced by heterogeneity, and can serve as references for any estimator of the covariance matrix $M_p$.

ACKNOWLEDGMENT

The authors would like to thank Prof. G. Letac for enthusiastically sharing his expert knowledge on multivariate Wishart and beta distributions.

REFERENCES

[1] L. L. Scharf, Statistical Signal Processing: Detection, Estimation and Time Series Analysis. Reading, MA: Addison-Wesley, 1991.
[2] W. L. Melvin, "Space-time adaptive radar performance in heterogeneous clutter," IEEE Trans. Aerosp. Electron. Syst., vol. 36, no. 2, pp. 621–633, Apr. 2000.
[3] W. L. Melvin, "A STAP overview," IEEE Aerosp. Electron. Syst. Mag., vol. 19, no. 1, pt. 2, pp. 19–35, Jan. 2004.
[4] O. Besson, S. Bidon, and J.-Y. Tourneret, "Covariance matrix estimation with heterogeneous samples," IEEE Trans. Signal Process., vol. 56, no. 3, pp. 909–920, Mar. 2008.
[5] S. Bidon, O. Besson, and J.-Y. Tourneret, "A Bayesian approach to adaptive detection in non-homogeneous environments," IEEE Trans. Signal Process., vol. 56, no. 1, pp. 205–217, Jan. 2008.
[6] J. A. Tague and C. I. Caldwell, "Expectations of useful complex Wishart forms," Multidimen. Syst. Signal Process., vol. 5, pp. 263–279, 1994.
[7] C. G. Khatri and C. R. Rao, "Effects of estimated noise covariance matrix in optimal signal detection," IEEE Trans. Acoust., Speech, Signal Process., vol. 35, no. 5, pp. 671–679, May 1987.
[8] H. L. Van Trees, Optimum Array Processing. New York: Wiley, 2002.
[9] H. Lütkepohl, Handbook of Matrices. Chichester, U.K.: Wiley, 1996.
[10] C. G. Khatri, "Classical statistical analysis based on a certain multivariate complex Gaussian distribution," Ann. Math. Stat., vol. 36, no. 1, pp. 98–114, Feb. 1965.
[11] M. Capitaine and M. Casalis, "Asymptotic freeness by generalized moments for Gaussian and Wishart matrices," Indiana Univ. Math. J., vol. 53, no. 2, pp. 397–431, 2004.
[12] M. Capitaine and M. Casalis, "Cumulants for random matrices as convolutions on the symmetric group," Probab. Theory Relat. Fields, vol. 136, no. 1, pp. 19–36, Sep. 2006.
[13] G. Letac, "Expectation of Z Z for a matrix beta law," private communication, 2007.

