Author's version.
https://oatao.univ-toulouse.fr/25983
https://doi.org/10.1016/j.sigpro.2019.107300
Besson, Olivier and Vincent, François. Properties of the partial Cholesky factorization and application to reduced-rank adaptive beamforming. (2020) Signal Processing, 167, 107300–107309. ISSN 0165-1684
Properties of the partial Cholesky factorization and application to reduced-rank adaptive beamforming

Olivier Besson∗, François Vincent
ISAE-SUPAERO, 10 Avenue Edouard Belin, Toulouse 31055, France
Keywords: Adaptive beamforming; Cholesky factorization; Reduced rank; Wishart matrices
Abstract

Reduced-rank adaptive beamforming is a well established and efficient methodology, notably for disturbance covariance matrices which are the sum of a strong low-rank component (interference) and a scaled identity matrix (thermal noise). Eigenvalue or singular value decomposition is often used to achieve rank reduction. In this paper, we study and analyze an alternative, namely a partial Cholesky factorization, as a means to retrieve the interference subspace and to compute reduced-rank beamformers. First, we study the angles between the true subspace and that obtained from partial Cholesky factorization of the covariance matrix. Then, a statistical analysis is carried out in finite samples. Using properties of partitioned Wishart matrices, we provide a stochastic representation of the beamformer based on partial Cholesky factorization and of the corresponding signal to interference and noise ratio loss. We show that the latter follows approximately a beta distribution, similarly to the beamformer based on eigenvalue decomposition. Finally, numerical simulations are presented which indicate that a reduced-rank adaptive beamformer based on partial Cholesky factorization incurs almost no loss, and can even perform better in some scenarios than its eigenvalue or singular value-based counterpart.
1. Motivation of the work
Enhancing retrieval of a signal of interest (SoI) buried in noise and interference by means of adaptive filtering is a very widespread problem in many engineering applications, including radar, sonar and communications [26], as well as in finance with the selection of a mean-variance efficient portfolio [23]. The most widely used approach consists in designing a linear filter w, which preserves the SoI through some constraints, and minimizes the output power, hence tends to cancel or at least attenuate interference. In other words, one tries to minimize w^H Σ w under a unit-gain constraint w^H v = 1, where Σ stands for the p × p covariance matrix of the measurements and v denotes the SoI signature. The solution is w_opt = Σ^{-1} v / (v^H Σ^{-1} v), which provides the optimal signal to interference and noise ratio (SINR), which we denote as SINR_opt = SINR(w_opt). In array processing applications, when Σ = R_N contains noise only (i.e., thermal noise and interference), this filter is referred to as the minimum variance distortionless response (MVDR) beamformer. If the SoI is present in the measurement, in which case Σ = R_{S+N}, one speaks of the minimum power distortionless response (MPDR) beamformer [26]. When Σ is known, there is no difference between the two filters. However, in practice Σ is unknown and must be estimated from a set of n independent and identically distributed samples x_i, with (supposedly common) covariance matrix Σ = E{x_i x_i^H}. Use of the sample covariance matrix Σ̂ in lieu of the true one makes a big difference between the MVDR and MPDR scenarios. Indeed, while only 2p − 3 samples are needed to achieve an average SINR loss equal to −3 dB in the MVDR case, this figure rises to about (p + 1)(1 + SINR_opt) in the MPDR case, which can be much larger. In most practical cases, the number of available snapshots is much lower, which results in a significant degradation in the MPDR context.

∗ Corresponding author. E-mail address: olivier.besson@isae-supaero.fr (O. Besson).
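As a concrete illustration of these definitions, the short NumPy sketch below builds w_opt = Σ^{-1}v/(v^H Σ^{-1}v) and checks the unit-gain constraint and output-power optimality. It is our own toy example: the array size, covariance model and variable names are assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 8

# Toy disturbance covariance: one strong (rank-one) interferer plus white noise
u = rng.standard_normal(p) + 1j * rng.standard_normal(p)
Sigma = 100.0 * np.outer(u, u.conj()) + np.eye(p)

# SoI signature (hypothetical steering vector)
v = np.ones(p, dtype=complex)

# w_opt = Sigma^{-1} v / (v^H Sigma^{-1} v): minimum output power under w^H v = 1
Si_v = np.linalg.solve(Sigma, v)
w_opt = Si_v / (v.conj() @ Si_v)

# Unit gain holds, and the output power beats the conventional filter v / (v^H v)
gain = abs(w_opt.conj() @ v)                       # = 1 up to rounding
p_opt = (w_opt.conj() @ Sigma @ w_opt).real
p_conv = ((v / p).conj() @ Sigma @ (v / p)).real
print(gain, p_opt, p_conv)
```

The comparison against the conventional (non-adaptive) filter makes the "minimum output power" property of w_opt visible without any claim about the paper's own simulations.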
In some cases one might even face a situation where n < p, which makes Σ̂ singular, and therefore its inverse does not exist. In order to address this situation and to improve the rate of convergence, two main approaches have emerged in the literature. The first is based on diagonal loading [1,5,6], a simple technique which was shown to achieve, provided that the loading level is properly chosen, a fast convergence rate, typically of the order of twice the number of interfering signals [2,6,7,10]. The second main category is that of reduced-rank adaptive filters where the weight vector w is constrained to lie in a low-dimensional subspace, see e.g., [11,14,15,17,22]. This approach is particularly suitable when the interference plus noise covariance matrix is the sum of a low-rank term and a scaled identity matrix, i.e., when R_N = U Λ U^H + σ² I, where U is a p × r matrix of the eigenvectors of R_N and Λ is the diagonal matrix of eigenvalues. Actually, for large interference to noise ratio (INR), w_opt is approximately, up to a scaling factor, the projection of v on the subspace orthogonal to U, i.e., w_opt ∝ (I − U U^H) v. In practice, an estimate of U is made available from the eigenvalue decomposition (EVD) of Σ̂ or the singular value decomposition (SVD) of the data matrix X = [x_1 x_2 ... x_n]. Analysis of the SINR loss was conducted in [21] where it was stated that it approximately follows a Beta distribution. In [16], the SINR loss is analyzed starting from the asymptotic properties of eigenvalues and eigenvectors. While the proof is rigorous, it holds only for n → ∞, while reduced-rank adaptive filtering is especially interesting in low sample support.
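The EVD-based reduced-rank filter just described can be sketched as follows. This is a minimal simulation under assumed values (p = 16, r = 2, n = 12, a half-wavelength ULA and an INR of our choosing); the helper names are ours.

```python
import numpy as np

rng = np.random.default_rng(1)
p, r, n = 16, 2, 12

# Steering vectors of a half-wavelength ULA (assumed geometry, for illustration)
k = np.arange(p)
steer = lambda deg: np.exp(1j * np.pi * k * np.sin(np.deg2rad(deg)))

A = np.column_stack([steer(-10.0), steer(10.0)])   # interference signatures
Rn = 1000.0 * (A @ A.conj().T) + np.eye(p)         # strong low-rank term + sigma^2 I

# n interference-plus-noise snapshots with covariance Rn
L = np.linalg.cholesky(Rn)
X = L @ (rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))) / np.sqrt(2)

# Interference subspace estimated from the r principal eigenvectors of the SCM
eigval, V = np.linalg.eigh(X @ X.conj().T / n)     # eigenvalues in ascending order
U_hat = V[:, -r:]

# Reduced-rank weight: w ~ (I - U U^H) v, with U replaced by its estimate
v = steer(0.0)
w = v - U_hat @ (U_hat.conj().T @ v)

print(np.abs(w.conj() @ A), abs(w.conj() @ v))
```

The interference signatures end up strongly attenuated while the broadside SoI gain is essentially preserved, which is the qualitative behaviour the text describes.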
While EVD or SVD are the usual tools for retrieval of the principal subspace, one might investigate alternative, possibly simpler factorizations that could yield a similar performance in terms of adaptive beamforming: this is the objective of the present paper. Our goal is not to propose a new reduced-rank adaptive beamformer, rather to suggest another implementation and to validate its performance through a theoretical analysis. Towards this end we propose to use a partial (truncated) Cholesky factorization, that is, we suggest to use a conventional (with no pivoting) Cholesky factorization algorithm, applied to Σ̂, and to stop it after r iterations. We are particularly interested in the case where Σ̂ is estimated from n < p snapshots and is thus of rank n. Also, we will focus on matrices Σ = U Λ U^H + σ² I (MVDR case) or Σ = P v v^H + U Λ U^H + σ² I (MPDR case), where U forms a basis for the interference subspace.
The paper is organized as follows. In Section 2, we will define the partial Cholesky factorization and will show that it is a rather accurate method to retrieve the principal subspace of Σ when the latter is the sum of a low-rank matrix and a scaled identity matrix. In Section 3 we will introduce a reduced-rank adaptive beamformer based on the partial Cholesky factorization and derive its statistical properties. The simulations of Section 4 will compare this beamformer to its counterpart using SVD and finally conclusions will be drawn in Section 5.
2. Partial Cholesky factorization
In this section, we introduce the partial Cholesky factorization and we study the angles between the subspace obtained from the partial Cholesky factorization of Σ and the interference subspace, in the MVDR as well as in the MPDR case.
2.1. Definition
Let us start from the (full) Cholesky factorization [12,19] of a positive definite matrix A, which yields a p × p lower triangular matrix L, with positive diagonal elements, such that A = L L^H. The principle of the partial Cholesky factorization is simply to stop the factorization after r steps (columns), yielding a p × r lower triangular matrix, whose properties will be studied below. If A is full-rank, and no pivoting is used, then stopping after r iterations provides the same result as selecting the first r columns of a full Cholesky factorization of A. If A has rank n < p, then the full Cholesky factorization does not exist, and stopping after r ≤ n steps yields a low-rank approximation, which will be used to obtain a basis of the interference subspace. Note that, when A has rank n, symmetric pivoting can be used to produce a p × n lower triangular matrix L with positive diagonal elements, and a permutation matrix Π such that Π A Π^T = L L^H, see e.g., [12], Algorithm 4.2.4. Now, with a positive semi-definite matrix A, Cholesky factorization with pivoting can also be stopped after r steps, where r ≤ rank(A), to produce a low-rank approximation of A. Such a technique has been used in machine learning where it serves the purpose of finding a low-rank approximation of the kernel matrix, see e.g., [3,9], and differences among methods concern the strategy for pivoting. In [18] convergence results were obtained which prove the effectiveness of such a method.
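The r-step truncated factorization described above can be written in a few lines as a pivotless outer-product Cholesky; the function name `pchol` and the test matrix below are our own illustration, not code from the paper.

```python
import numpy as np

def pchol(A, r):
    """First r steps of a conventional (pivotless) outer-product Cholesky.

    Returns a p x r matrix G, lower triangular in its leading r x r block,
    such that the leading r x r block of A - G G^H is zero.
    """
    A = np.array(A, dtype=complex)        # work on a copy
    p = A.shape[0]
    G = np.zeros((p, r), dtype=complex)
    for j in range(r):
        G[j:, j] = A[j:, j] / np.sqrt(A[j, j].real)
        # Standard Cholesky step: rank-one downdate of the trailing block
        A[j:, j:] -= np.outer(G[j:, j], G[j:, j].conj())
    return G

# For a full-rank matrix this matches the first r columns of the full factor
rng = np.random.default_rng(2)
B = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
A = B @ B.conj().T + 6.0 * np.eye(6)
G = pchol(A, 3)
Lfull = np.linalg.cholesky(A)
print(np.allclose(G, Lfull[:, :3]))
```

When A is rank deficient (e.g., a sample covariance built from n < p snapshots), the same loop still runs as long as r ≤ rank(A), which is exactly the regime exploited later in the paper.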
The partial Cholesky factor of Σ, which will be denoted as pchol(Σ, r), is thus the p × r (with r ≤ rank(Σ)) lower triangular matrix with positive diagonal elements G = (G_1^T, G_2^T)^T, where G_1 is a r × r lower triangular matrix with positive diagonal elements and G_2 is a (p − r) × r matrix, defined from

\[
\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}
= \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} \end{pmatrix}
+ \begin{pmatrix} 0 & 0 \\ 0 & \Sigma_{2.1} \end{pmatrix}
= \begin{pmatrix} G_1 \\ G_2 \end{pmatrix}\begin{pmatrix} G_1^H & G_2^H \end{pmatrix}
+ \begin{pmatrix} 0 & 0 \\ 0 & \Sigma_{2.1} \end{pmatrix} \tag{1}
\]

where Σ_{2.1} = Σ_{22} − Σ_{21}Σ_{11}^{-1}Σ_{12}, G_1G_1^H = Σ_{11} and G_2G_1^H = Σ_{21}. From a practical point of view, G = pchol(Σ, r) can be obtained, e.g., by using only r steps of Algorithm 4.2.2 of Golub and Van Loan [12].

2.2. Angles with the interference subspace
Let us now examine the ability of the partial Cholesky factorization to retrieve the principal subspace of Σ when the latter is of the form Σ = U Λ U^H + σ² I, where Λ = diag(λ_1, ..., λ_r) is the diagonal matrix of eigenvalues and U is the matrix of eigenvectors. Towards this end, let us study the angles between the subspace spanned by pchol(Σ, r) and the subspace spanned by U. Let us partition U as U = (U_1^T, U_2^T)^T where U_1 is r × r. From (1), one has Σ_{11} = U_1ΛU_1^H + σ²I_r and Σ_{21} = U_2ΛU_1^H, so that

\[
\operatorname{range}(G)
= \operatorname{range}\begin{pmatrix} \Sigma_{11} \\ \Sigma_{21} \end{pmatrix}
= \operatorname{range}\begin{pmatrix} U_1\Lambda U_1^H + \sigma^2 I_r \\ U_2\Lambda U_1^H \end{pmatrix}
= \operatorname{range}\left[ \begin{pmatrix} U_1 \\ U_2 \end{pmatrix}\Lambda U_1^H + \begin{pmatrix} \sigma^2 I_r \\ 0 \end{pmatrix} \right]
= \operatorname{range}\left[ \begin{pmatrix} U_1 \\ U_2 \end{pmatrix} + \begin{pmatrix} \sigma^2 U_1^{-H}\Lambda^{-1} \\ 0 \end{pmatrix} \right]
= \operatorname{range}\begin{pmatrix} U_1 + \Delta_1 \\ U_2 \end{pmatrix} \tag{2}
\]

with Δ_1 = σ² U_1^{-H}Λ^{-1}.
For large INR, i.e., for λ_k ≫ σ², Δ_1 is a matrix with small elements, hence range(G) should be close to range(U). The angles between the two subspaces are obtained from the singular values of the following matrix [12] (here s = p − r):

\[
\begin{aligned}
M &= U^H \begin{pmatrix} U_1+\Delta_1 \\ U_2 \end{pmatrix}
\left[ (U_1+\Delta_1)^H (U_1+\Delta_1) + U_2^H U_2 \right]^{-H/2} \\
&= \left( I_r + \sigma^2 \Lambda^{-1} \right)
\left[ I_r + \Delta_1^H U_1 + U_1^H \Delta_1 + \Delta_1^H \Delta_1 \right]^{-H/2} \\
&= \left( I_r + \sigma^2 \Lambda^{-1} \right)
\left[ I_r + 2\sigma^2 \Lambda^{-1} + \sigma^4 \Lambda^{-1} \left( U_1^H U_1 \right)^{-1} \Lambda^{-1} \right]^{-H/2} \\
&\simeq \left( I_r + \sigma^2 \Lambda^{-1} \right)
\left[ I_r - \sigma^2 \Lambda^{-1} - \tfrac{1}{2}\sigma^4 \Lambda^{-1} \left( U_1^H U_1 \right)^{-1} \Lambda^{-1} + \tfrac{3}{2}\sigma^4 \Lambda^{-2} \right] \\
&\simeq I_r - \tfrac{\sigma^4}{2} \Lambda^{-1} \left( U_1^H U_1 \right)^{-1} \Lambda^{-1} + \tfrac{1}{2}\sigma^4 \Lambda^{-2} \\
&= I_r - \tfrac{\sigma^4}{2} \Lambda^{-1} U_2^H \left( I_s - U_2 U_2^H \right)^{-1} U_2 \Lambda^{-1}.
\end{aligned} \tag{3}
\]

Note that one should go up to second order in the expansion. If we let θ_k, k = 1, ..., r, denote the angles between the subspaces, one has Tr{MM^H} = Σ_{k=1}^r cos²θ_k. Now, the square distance between the two subspaces is d² = Σ_{k=1}^r θ_k² ≈ Σ_{k=1}^r sin²θ_k and, for large INR,

\[
\sum_{k=1}^{r} \sin^2\theta_k \simeq \sigma^4 \operatorname{Tr}\left\{ \Lambda^{-1} U_2^H \left( I_s - U_2 U_2^H \right)^{-1} U_2 \Lambda^{-1} \right\}. \tag{4}
\]

Fig. 1. Distance between range(pchol(UΛU^H + σ²I, r)) and range(U) using the partial Cholesky decomposition. r = 2 interference with varying INR in the field of view of a p = 16 element uniform linear array.
Although this is not zero, as would be the case with the singular value decomposition whose r principal left singular vectors share the same subspace as U, the distance between range(G) and range(U) goes to zero as INR^{-1} when the INR grows large.

This is illustrated in Fig. 1 where we display the distance between range(G) and range(U) in the case of a p = 16 element uniform linear array with inter-element spacing of half a wavelength. Two narrowband interference signals are present in the field of view with respective directions of arrival −10° and 10°, and a varying INR. As can be observed, the distance goes to zero as both INRs grow large. Note that this figure concerns the asymptotic case where Σ is known. Of more practical interest is the case where Σ is estimated from n snapshots, with n possibly smaller than p. In this case, a different analysis should be conducted. In the next section, we provide some results about the statistical properties of pchol(Σ̂, r), while numerical results about its application to adaptive beamforming are the subject of Section 4.
Let us now study what happens when Σ = P v v^H + U Λ U^H + σ² I, i.e., when the SoI is present in the measurements. With no loss of generality, one can assume that v = e_p where e_p = (0 0 ... 0 1)^T. Then, only the (p, p) element of Σ is modified, compared to the MVDR case where Σ = U Λ U^H + σ² I. Since this affects only Σ_{22}, it means that the partial Cholesky factorization is left unchanged, that is

\[
\operatorname{pchol}\left( P v v^H + U \Lambda U^H + \sigma^2 I,\, r \right) = \operatorname{pchol}\left( U \Lambda U^H + \sigma^2 I,\, r \right) \tag{5}
\]

which implies that the angles between range(pchol(P v v^H + U Λ U^H + σ² I, r)) and range(U) are still given by the analysis above. Note that this holds true provided that the first r components of v are zero. In contrast, the angles between range(U) and the first r eigenvectors of P v v^H + U Λ U^H + σ² I are no longer zero. Indeed, it is only known that the first r + 1 principal eigenvectors of P v v^H + U Λ U^H + σ² I share the same subspace as [U v]. Accordingly, the subspace spanned by the r principal eigenvectors contains a contribution from v and one cannot recover exactly the interference subspace, in contrast with the MVDR case. Therefore, when one wants to retrieve the interference subspace in a MPDR context, a partial Cholesky factorization, stopped after r steps with r the number of interfering signals, is a meaningful approach, because it brings the SoI component in a part of the matrix which is not used in the partial Cholesky factorization.

Fig. 2. Distance between range(eig(P v v^H + U Λ U^H + σ² I, r)) and range(U). The distance between range(pchol(P v v^H + U Λ U^H + σ² I, r)) and range(U) is given by Fig. 1. r = 2 interference with varying INR in the field of view of a p = 16 element uniform linear array. Signal to noise ratio is 0 dB.
Fig. 2 displays the angles obtained with an EVD in the same scenario as before but with the SoI present. Note that the angles obtained with a partial Cholesky factorization are still given by Fig. 1, due to (5). Comparing the two figures, it appears that the difference in terms of subspace proximity is not very important. Therefore, from the point of view of retrieving the interference subspace in a MPDR context, a partial Cholesky factorization possesses interesting properties.
3. Reduced-rank adaptive beamforming using the partial Cholesky factorization
In this section, we consider the use of a partial Cholesky factorization for reduced-rank adaptive beamforming purposes. Let us assume that n independent and identically distributed complex Gaussian vectors x_i with zero mean and covariance matrix Σ are available and gathered in the matrix X = [x_1 x_2 ... x_n]. We denote the distribution of X as X =_d CN_{p,n}(0, Σ, I_n). We are especially interested in the case where the number of snapshots is less than the size of the observation space, i.e., n < p. Let S = X X^H be the sample covariance matrix and let G = pchol(S, r) with r ≤ min(p, n) denote its partial Cholesky factorization. As said above, G constitutes a useful tool to retrieve the interference subspace, which leads naturally to consider the following reduced-rank adaptive beamformer

\[
w = P_G^{\perp} v \tag{6}
\]

where P_G^{⊥} = I − G (G^H G)^{-1} G^H is the orthogonal projector on the subspace orthogonal to range(G). In the sequel, we provide a statistical analysis of the SINR loss of the adaptive beamformer in (6). Prior to that, we also provide some statistical properties of G. The main results are stated below while technical proofs are deferred to the appendices. In what follows CW_p(n, Σ) denotes the complex Wishart distribution with n degrees of freedom and parameter matrix Σ.

Fig. 3. Distribution of the SINR loss using either partial Cholesky decomposition of Σ̂ or SVD of X. p = 16, r = 2 interference at −10°, 10° with INR = 20 dB, 15 dB. Wishart and pseudo-Wishart conditions.

Fig. 4. Distribution of the SNR loss using either partial Cholesky decomposition of Σ̂ or SVD of X. p = 16, n = 8, r = 4 interference at −25°, −10°, 10° and 18°.
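A minimal sketch of the beamformer (6) in the pseudo-Wishart regime n < p is given below. The scenario values and helper names are our own; `pchol` re-implements the r-step pivotless factorization of Section 2.1.

```python
import numpy as np

def pchol(S, r):
    # r steps of a pivotless outer-product Cholesky factorization (Section 2.1)
    S = np.array(S, dtype=complex)
    G = np.zeros((S.shape[0], r), dtype=complex)
    for j in range(r):
        G[j:, j] = S[j:, j] / np.sqrt(S[j, j].real)
        S[j:, j:] -= np.outer(G[j:, j], G[j:, j].conj())
    return G

rng = np.random.default_rng(3)
p, n, r = 16, 8, 2            # n < p: S = X X^H is singular (pseudo-Wishart case)

k = np.arange(p)
steer = lambda deg: np.exp(1j * np.pi * k * np.sin(np.deg2rad(deg)))
A = np.column_stack([steer(-10.0), steer(10.0)])
Sigma = 100.0 * (A @ A.conj().T) + np.eye(p)

L = np.linalg.cholesky(Sigma)
X = L @ (rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))) / np.sqrt(2)

S = X @ X.conj().T
G = pchol(S, r)

# w = P_G^perp v: SoI signature projected on the complement of range(G), eq. (6)
v = steer(0.0)
w = v - G @ np.linalg.solve(G.conj().T @ G, G.conj().T @ v)

# w is orthogonal to range(G); the interference signatures should be attenuated
print(np.abs(w.conj() @ A), np.abs(v.conj() @ A))
```

Note that the truncated factorization runs without difficulty on the rank-deficient S, which is precisely why it is attractive here.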
Proposition 1. Let us partition G as G = (G_1^T, G_2^T)^T where G_1 is a r × r lower triangular matrix with positive real diagonal elements. Then G_1 and G_2 − Σ_{21}Σ_{11}^{-1}G_1 are independent and distributed as

\[
p(G_1) \propto \prod_{k=1}^{r} G_{1,kk}^{1+2(n-k)} \operatorname{etr}\left\{ -G_1^H \Sigma_{11}^{-1} G_1 \right\} \tag{7a}
\]
\[
G_2 - \Sigma_{21}\Sigma_{11}^{-1} G_1 \stackrel{d}{=} CN_{p-r,r}\left( 0, \Sigma_{2.1}, I_r \right) \tag{7b}
\]

and G admits the following stochastic representation

\[
G = \operatorname{pchol}\left( CW_p(n, \Sigma), r \right) \stackrel{d}{=} \operatorname{chol}(\Sigma) \times \operatorname{pchol}\left( CW_p(n, I_p), r \right) \tag{8}
\]

where chol(Σ) is the Cholesky factor of Σ.
Proof. See Appendix A.
We now consider the SINR loss associated with the beamformer of (6). For any beamformer w, the SINR loss is defined as

\[
\rho(w) = \frac{\operatorname{SINR}(w)}{\operatorname{SINR}(w_{\mathrm{opt}})}
= \frac{\left| w^H v \right|^2 \left( w_{\mathrm{opt}}^H R_N w_{\mathrm{opt}} \right)}{\left( w^H R_N w \right) \left| w_{\mathrm{opt}}^H v \right|^2}. \tag{9}
\]

We let CB_{r,n−r+1}(0) denote the complex beta distribution with density p(b) = [Γ(n+1)/(Γ(r)Γ(n−r+1))] b^{n−r}(1−b)^{r−1}.

Proposition 2. The SINR loss of the reduced-rank adaptive beamformer of (6) based on the partial Cholesky factorization of the sample covariance matrix follows approximately a scaled Beta distribution, i.e.,

\[
\rho\left( P_G^{\perp} v \right) \stackrel{d}{=} a \times CB_{r,n-r+1}(0) \tag{10}
\]

where the scaling factor a is defined in (B.14). The exact stochastic distribution of ρ(P_G^{⊥}v) is given by (B.10).

Fig. 5. MVDR case. SINR using either partial Cholesky decomposition of Σ̂ or SVD of X versus r. p = 16, J = 4 interference at −25°, −10°, 10° and 18° with INR = 30, 15, 20, 25 dB respectively. SNR = 0 dB.

Fig. 6. MPDR case. SINR using either partial Cholesky decomposition of Σ̂ or SVD of X versus r. p = 16, J = 4 interference at −25°, −10°, 10° and 18° with INR = 30, 15, 20, 25 dB respectively. SNR = 0 dB.
For illustration purposes, we plot in Fig. 3 the distribution of the SINR loss when either a partial Cholesky factorization of the sample covariance matrix Σ̂ or a SVD of the data matrix X is used. Wishart (n ≥ p) as well as pseudo-Wishart (n < p) conditions are considered. As can be observed, there is no difference between the partial Cholesky and the SVD. Moreover, the approximate scaled beta distribution is shown to predict accurately the actual distribution.

Next, we consider a more complicated scenario in Fig. 4 with r = 4 interfering signals with directions of arrival −25°, −10°, 10°, 18° and varying INR. As can be seen here, we still have that partial Cholesky performs as well as SVD. However, one can observe the limitation of the approximation made above, since the scaled beta distribution does not predict well the actual distribution.
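The kind of comparison shown in Fig. 3 can be reproduced with a short Monte-Carlo experiment. This is our own sketch (trial count, seed and INR values are arbitrary choices); the loss is computed from eq. (9) with w_opt ∝ R_N^{-1}v.

```python
import numpy as np

rng = np.random.default_rng(4)
p, n, r, trials = 16, 8, 2, 200

k = np.arange(p)
steer = lambda deg: np.exp(1j * np.pi * k * np.sin(np.deg2rad(deg)))
A = np.column_stack([steer(-10.0), steer(10.0)])
Rn = (A * np.array([100.0, 31.6])) @ A.conj().T + np.eye(p)   # INR = 20 dB and 15 dB
v = steer(0.0)
L = np.linalg.cholesky(Rn)
sinr_opt = (v.conj() @ np.linalg.solve(Rn, v)).real           # v^H R_N^{-1} v

def pchol(S, rr):
    S = np.array(S, dtype=complex)
    G = np.zeros((S.shape[0], rr), dtype=complex)
    for j in range(rr):
        G[j:, j] = S[j:, j] / np.sqrt(S[j, j].real)
        S[j:, j:] -= np.outer(G[j:, j], G[j:, j].conj())
    return G

def sinr_loss(w):
    # rho(w) = |w^H v|^2 / ((w^H R_N w)(v^H R_N^{-1} v)), cf. eq. (9)
    return abs(w.conj() @ v) ** 2 / ((w.conj() @ Rn @ w).real * sinr_opt)

loss_chol, loss_svd = [], []
for _ in range(trials):
    X = L @ (rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))) / np.sqrt(2)
    G = pchol(X @ X.conj().T, r)
    U = np.linalg.svd(X, full_matrices=False)[0][:, :r]
    loss_chol.append(sinr_loss(v - G @ np.linalg.solve(G.conj().T @ G, G.conj().T @ v)))
    loss_svd.append(sinr_loss(v - U @ (U.conj().T @ v)))

print(np.median(loss_chol), np.median(loss_svd))
```

In runs of this sketch the two empirical loss distributions sit close to each other, consistent with the observation made above for Fig. 3.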
4. Numerical simulations
In this section, we use numerical simulations to compare the reduced-rank beamformer P_G^{⊥}v obtained from a partial Cholesky decomposition of Σ̂ to its counterpart using SVD of the data matrix. As before, a p = 16 element uniform linear array with half-wavelength spacing is considered, and interfering signals are located at −25°, −10°, 10°, 18° with INR = [30, 15, 20, 25] dB. When present, the SoI impinges from broadside (0°).

First, we study the influence of r, the order of the low-rank approximation. In Fig. 5, we consider an MVDR scenario. A first important observation is that the partial Cholesky factorization incurs almost no loss compared to SVD when the rank of the transformation coincides with the number of interfering signals, say J. Additionally, it appears to degrade slightly more gracefully when r > J. For the MPDR case, which is considered in Fig. 6, it is also observed that for r = J, the partial Cholesky-based beamformer performs as well as its SVD counterpart. However, degradation for r > J is significantly more pronounced for the SVD, which means that partial Cholesky factorization is more robust to an over-estimation of J in MPDR scenarios. Finally, we study the influence of the number of snapshots n in Figs. 7 and 8. The conclusion is that, in MVDR or in MPDR scenarios with low SNR, all methods have similar performance. However, when SNR = 10 dB, one can observe improvement of the Cholesky-based method over the SVD method.

Fig. 7. MVDR case. SINR using either partial Cholesky decomposition of Σ̂ or SVD of X versus n. p = 16, J = 4 interference at −25°, −10°, 10° and 18° with INR = 30, 15, 20, 25 dB respectively, r = 4. SNR = 0 dB.

Fig. 8. MPDR case. SINR using either partial Cholesky decomposition of Σ̂ or SVD of X versus n. p = 16, J = 4 interference at −25°, −10°, 10° and 18° with INR = 30, 15, 20, 25 dB respectively, r = 4. SNR = 0 dB or SNR = 10 dB.
To summarize this numerical study, we showed that a partial Cholesky factorization performs as well as SVD in most situations, and is more robust in MPDR scenarios or when the order r of the low-rank approximation is over-estimated.
5. Conclusions
In this paper, we showed that a partial Cholesky factorization is a worthy alternative to SVD or EVD to implement a reduced-rank adaptive beamformer in the case of strong low-rank interference. While, for the true interference plus noise covariance matrix, it does not provide exactly the interference subspace (in contrast to SVD), we showed that when the signal of interest is present, it enables one to recover this subspace rather well. We conducted a statistical analysis of the partial Cholesky-based filter and showed theoretically that the SINR loss obtained is very close to that of a SVD-based filter. Finally, numerical simulations confirmed this result and even showed situations where partial Cholesky provides improvement.
Declaration of Competing Interest

We acknowledge that there is no conflict of interest regarding the paper entitled “Properties of the partial Cholesky factorization and application to reduced-rank adaptive beamforming”.
Appendix A. Proof of Proposition 1
In this appendix we derive the distribution of G = pchol(S, r), with r ≤ min(p, n), from the distribution of S. The latter follows a complex Wishart distribution (when n ≥ p) or a complex pseudo-Wishart distribution [4,8,13,20,24,25] (when n < p), whose probability density function (p.d.f.) is given by

\[
p(S) \propto \left| S_+ \right|^{n-p} \left| \Sigma \right|^{-n} \operatorname{etr}\left\{ -\Sigma^{-1} S \right\} \tag{A.1}
\]

where ∝ means proportional to, |·| and etr{·} stand for the determinant and exponential trace, and S_+ = S(1:q, 1:q) where q = min(p, n). We will note S =_d CW_p(n, Σ). Let S be partitioned as

\[
S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix} \tag{A.2}
\]

where S_{11} is r × r and S_{22} is s × s, with s = p − r. Let us partition G as G = (G_1^T, G_2^T)^T where G_1 is r × r. As said before, when n ≥ p, S is full-rank with probability one, and G is simply a sub-matrix of the full Cholesky factor of S. However, the latter does not exist when n < p and rank(S) = n. In this case, we need to resort to results about arbitrary partitioning of pseudo-Wishart matrices [4]. In fact, whether in the Wishart case [24] or in the pseudo-Wishart case [4], the joint distribution of (S_{11}, S_{21}) is given by

\[
p(S_{11}, S_{21}) \propto \left| S_{11} \right|^{n-p} \operatorname{etr}\left\{ -\Sigma_{11}^{-1} S_{11} \right\}
\times \operatorname{etr}\left\{ -\left( S_{21} - \Sigma_{21}\Sigma_{11}^{-1} S_{11} \right) S_{11}^{-1} \left( S_{21} - \Sigma_{21}\Sigma_{11}^{-1} S_{11} \right)^H \Sigma_{2.1}^{-1} \right\}. \tag{A.3}
\]

Using a proof similar to that of Khatri [20] for the Bartlett decomposition, it can be shown that the Jacobian is J(S_{11}, S_{21} → G_1, G_2) = 2^r ∏_{k=1}^r G_{kk}^{1+2(p−k)}. Next, observing that S_{11} = G_1G_1^H, S_{21} = G_2G_1^H and |S_{11}| = ∏_{k=1}^r G_{kk}², it follows that

\[
p(G_1, G_2) \propto \prod_{k=1}^{r} G_{1,kk}^{1+2(n-k)} \operatorname{etr}\left\{ -G_1^H \Sigma_{11}^{-1} G_1 \right\}
\times \operatorname{etr}\left\{ -\left( G_2 - \Sigma_{21}\Sigma_{11}^{-1} G_1 \right)^H \Sigma_{2.1}^{-1} \left( G_2 - \Sigma_{21}\Sigma_{11}^{-1} G_1 \right) \right\} \tag{A.4}
\]

which shows that G_1 and G_2 − Σ_{21}Σ_{11}^{-1}G_1 are independent and distributed as

\[
p(G_1) \propto \prod_{k=1}^{r} G_{1,kk}^{1+2(n-k)} \operatorname{etr}\left\{ -G_1^H \Sigma_{11}^{-1} G_1 \right\} \tag{A.5a}
\]
\[
G_2 - \Sigma_{21}\Sigma_{11}^{-1} G_1 \stackrel{d}{=} CN_{s,r}\left( 0, \Sigma_{2.1}, I_r \right). \tag{A.5b}
\]

From (A.5), one can obtain the following stochastic representation of G:

\[
G = \operatorname{pchol}\left( CW_p(n, \Sigma), r \right) \stackrel{d}{=} \operatorname{chol}(\Sigma) \times \operatorname{pchol}\left( CW_p(n, I_p), r \right) \tag{A.6}
\]

where chol(Σ) is the Cholesky factor of Σ. This property is obtained by making the change of variables Ḡ = L^{-1}G with L = chol(Σ). Ḡ is a lower triangular matrix with positive diagonal elements and the Jacobian is J(G → Ḡ) = |L|^{2r}. Next, observing that

\[
\begin{pmatrix} G_1 \\ G_2 \end{pmatrix}
= \begin{pmatrix} L_{11} & 0 \\ L_{21} & L_{22} \end{pmatrix}
\begin{pmatrix} \bar{G}_1 \\ \bar{G}_2 \end{pmatrix}
= \begin{pmatrix} L_{11}\bar{G}_1 \\ L_{21}\bar{G}_1 + L_{22}\bar{G}_2 \end{pmatrix} \tag{A.7}
\]

together with Σ_{11} = L_{11}L_{11}^H and Σ_{21} = L_{21}L_{11}^H, it ensues that

\[
G_2 - \Sigma_{21}\Sigma_{11}^{-1} G_1 = L_{22}\bar{G}_2, \qquad \Sigma_{2.1} = L_{22}L_{22}^H, \qquad G_{kk} = L_{kk}\bar{G}_{kk}. \tag{A.8}
\]

Therefore, the joint p.d.f. of (Ḡ_1, Ḡ_2) writes

\[
p(\bar{G}_1, \bar{G}_2) \propto \prod_{k=1}^{r} \bar{G}_{1,kk}^{1+2(n-k)} \operatorname{etr}\left\{ -\bar{G}_1^H \bar{G}_1 \right\} \operatorname{etr}\left\{ -\bar{G}_2^H \bar{G}_2 \right\} \tag{A.9}
\]

which is recognized as the distribution of the partial Cholesky factor of S̄ =_d CW_p(n, I_p).

Eq. (A.6) can be rewritten in another form as

\[
G = L\bar{G}
= \begin{pmatrix} L_{11} & 0 \\ L_{21} & L_{22} \end{pmatrix}\begin{pmatrix} \bar{G}_1 \\ \bar{G}_2 \end{pmatrix}
= \begin{pmatrix} L_{11} \\ L_{21} \end{pmatrix}\bar{G}_1 + \begin{pmatrix} 0 \\ L_{22}\bar{G}_2 \end{pmatrix}
= \operatorname{pchol}(\Sigma, r)\,\bar{G}_1 + \begin{pmatrix} 0 \\ L_{22}\bar{G}_2 \end{pmatrix} \tag{A.10}
\]

which enables one to relate pchol(S, r) to pchol(Σ, r) and Ḡ.
Appendix B. Proof of Proposition 2
In this appendix, we analyze the SINR loss of the weight vector in (6). Since it depends on P_G^{⊥}, we first obtain a stochastic representation of this projection matrix. The range space of G is given by

\[
\operatorname{range}(G) = \operatorname{range}\left( L\bar{G} \right) = \operatorname{range}\left( L\bar{G}\bar{G}_1^{-1} \right)
= \operatorname{range}\left[ L \begin{pmatrix} I_r \\ \bar{T} \end{pmatrix} \right]
= \operatorname{range}\begin{pmatrix} L_{11} \\ L_{21} + L_{22}\bar{T} \end{pmatrix}
= \operatorname{range}\left[ \operatorname{pchol}(\Sigma, r) + \begin{pmatrix} 0 \\ L_{22}\bar{T} \end{pmatrix} \right] \tag{B.1}
\]

where T̄ = Ḡ_2Ḡ_1^{-1} = S̄_{21}S̄_{11}^{-1}. The matrix T̄ follows a complex multivariate Student distribution and T̄ =_d N̄W̄^{-1/2} where N̄ =_d CN_{s,r}(0, I_s, I_r) and W̄ =_d CW_r(n, I_r). We will note T̄ =_d CT(0, n, I_s, I_r). It follows that the orthogonal complement of range(G) is spanned by the columns of the matrix

\[
F = L^{-H}\begin{pmatrix} \bar{T}^H \\ -I_s \end{pmatrix}
\]

so that the projector on the orthogonal complement of range(G) writes

\[
P_G^{\perp} = F\left( F^H F \right)^{-1} F^H
= L^{-H}\begin{pmatrix} \bar{T}^H \\ -I_s \end{pmatrix}
\left[ \begin{pmatrix} \bar{T} & -I_s \end{pmatrix} L^{-1} L^{-H} \begin{pmatrix} \bar{T}^H \\ -I_s \end{pmatrix} \right]^{-1}
\begin{pmatrix} \bar{T} & -I_s \end{pmatrix} L^{-1}. \tag{B.2}
\]

Let A^{1/2} denote the Hermitian square-root of A. Since Σ_{2.1} = L_{22}L_{22}^H, Σ_{11} = L_{11}L_{11}^H and Σ_{21} = L_{21}L_{11}^H, it follows that L_{22} = Σ_{2.1}^{1/2}Q_2, L_{11} = Σ_{11}^{1/2}Q_1 and L_{21} = Σ_{21}Σ_{11}^{-1/2}Q_1 for some unitary matrices Q_1 and Q_2. Hence, we get

\[
\begin{pmatrix} \bar{T} & -I_s \end{pmatrix} L^{-1}
= \begin{pmatrix} \bar{T} & -I_s \end{pmatrix}
\begin{pmatrix} L_{11}^{-1} & 0 \\ -L_{22}^{-1}L_{21}L_{11}^{-1} & L_{22}^{-1} \end{pmatrix}
= \begin{pmatrix} \bar{T}L_{11}^{-1} + L_{22}^{-1}L_{21}L_{11}^{-1} & -L_{22}^{-1} \end{pmatrix}
= L_{22}^{-1}\begin{pmatrix} T & -I_s \end{pmatrix} \tag{B.3}
\]

where

\[
T = L_{22}\bar{T}L_{11}^{-1} + L_{21}L_{11}^{-1}
= \Sigma_{21}\Sigma_{11}^{-1} + \Sigma_{2.1}^{1/2} Q_2 \bar{T} Q_1^H \Sigma_{11}^{-1/2}
\stackrel{d}{=} \Sigma_{21}\Sigma_{11}^{-1} + \Sigma_{2.1}^{1/2} \bar{T} \Sigma_{11}^{-1/2}
\stackrel{d}{=} CT\left( \Sigma_{21}\Sigma_{11}^{-1},\, n,\, \Sigma_{2.1},\, \Sigma_{11} \right) \tag{B.4}
\]

and where we used the fact that T̄ has the same distribution as Q_2T̄Q_1^H. Therefore, we obtain the following stochastic representation for the projection matrix

\[
P_G^{\perp} \stackrel{d}{=} \begin{pmatrix} T^H \\ -I_s \end{pmatrix}\left( I_s + TT^H \right)^{-1}\begin{pmatrix} T & -I_s \end{pmatrix}. \tag{B.5}
\]

As for the weight vector, assuming with no loss of generality that v = (0^T, v_2^T)^T, we can write w as

\[
w = P_G^{\perp} v = -\begin{pmatrix} T^H \\ -I_s \end{pmatrix}\left( I_s + TT^H \right)^{-1} v_2 \tag{B.6}
\]

so that

\[
w^H v \stackrel{d}{=} v_2^H \left( I_s + TT^H \right)^{-1} v_2. \tag{B.7}
\]

Now, we have that

\[
P_G^{\perp}\, \Sigma\, P_G^{\perp}
= F\left( F^H F \right)^{-1} F^H\, \Sigma\, F\left( F^H F \right)^{-1} F^H
= F\left( F^H F \right)^{-1}\left( I_s + \bar{T}\bar{T}^H \right)\left( F^H F \right)^{-1} F^H \tag{B.8}
\]

which implies that

\[
w^H \Sigma w = v^H P_G^{\perp} \Sigma P_G^{\perp} v
= v_2^H \left( I_s + TT^H \right)^{-1} \Sigma_{2.1}^{1/2} \left( I_s + \bar{T}\bar{T}^H \right) \Sigma_{2.1}^{1/2} \left( I_s + TT^H \right)^{-1} v_2. \tag{B.9}
\]

Observing that v^H Σ^{-1} v = v_2^H Σ_{2.1}^{-1} v_2, we end up with

\[
\rho\left( P_G^{\perp} v \right) \stackrel{d}{=}
\frac{\left| v_2^H \left( I_s + TT^H \right)^{-1} v_2 \right|^2}
{\left( v_2^H \Sigma_{2.1}^{-1} v_2 \right)\; v_2^H \left( I_s + TT^H \right)^{-1} \Sigma_{2.1}^{1/2} \left( I_s + \bar{T}\bar{T}^H \right) \Sigma_{2.1}^{1/2} \left( I_s + TT^H \right)^{-1} v_2 }. \tag{B.10}
\]

Eq. (B.10) provides a stochastic representation of the SINR loss, using matrices with known distribution. At this stage, it does not seem possible to derive the exact distribution of ρ(P_G^{⊥}v) and one needs to resort to some approximation, as was the case in [16,21]. If we consider that Σ is approximately of rank r, then Σ_{2.1}, which bears the deviation from a rank-r approximation of Σ, should be “small”. We translate this property by the fact that the random deviations of T from its mean M_T = Σ_{21}Σ_{11}^{-1} are negligible. In other words, we make the approximation

\[
I_s + TT^H = I_s + \left( M_T + \Sigma_{2.1}^{1/2}\bar{T}\Sigma_{11}^{-1/2} \right)\left( M_T + \Sigma_{2.1}^{1/2}\bar{T}\Sigma_{11}^{-1/2} \right)^H
\simeq I_s + M_T M_T^H. \tag{B.11}
\]

For notational convenience, let us define temporarily z = Σ_{2.1}^{1/2}(I_s + M_TM_T^H)^{-1}v_2 and y = Σ_{2.1}^{-1/2}v_2. With this approximation, we need now to derive the distribution of

\[
\rho = \frac{\left| z^H y \right|^2}{\left( y^H y \right)\; z^H \left( I_s + \bar{T}\bar{T}^H \right) z}. \tag{B.12}
\]

Using the fact that T̄ =_d N̄W̄^{-1/2}, where N̄ =_d CN_{s,r}(0, I_s, I_r) and W̄ =_d CW_r(n, I_r), along with well-known results on quadratic forms in Wishart matrices, one can show that

\[
\left[ z^H \left( I_s + \bar{T}\bar{T}^H \right) z \right]^{-1}
\stackrel{d}{=} \left( z^H z \right)^{-1}\left[ 1 + \frac{C\chi_r^2(0)}{C\chi_{n-r+1}^2(0)} \right]^{-1}
\stackrel{d}{=} \left( z^H z \right)^{-1} CB_{r,n-r+1}(0) \tag{B.13}
\]

where the p.d.f. of CB_{r,n−r+1}(0) is [Γ(n+1)/(Γ(r)Γ(n−r+1))] b^{n−r}(1−b)^{r−1}. Therefore, we end up with the following representation of the SINR loss

\[
\rho \stackrel{d}{=} \frac{\left| z^H y \right|^2}{\left( y^H y \right)\left( z^H z \right)} \times CB_{r,n-r+1}(0) \tag{B.14}
\]

which corresponds to a scaled beta distribution. Note that, in [21], the SINR loss was found to have a CB_{r,n−r+1}(0) distribution, that is, without the scaling factor.
References
[1] Y.I. Abramovich, Controlled method for adaptive optimization of filters using the criterion of maximum SNR, Radio Eng. Electron. Phys. 26 (1981) 87–95.
[2] Y.I. Abramovich, A.I. Nevrev, An analysis of effectiveness of adaptive maximization of the signal to noise ratio which utilizes the inversion of the estimated covariance matrix, Radio Eng. Electron. Phys. 26 (1981) 67–74.
[3] F.R. Bach, M.I. Jordan, Kernel independent component analysis, J. Mach. Learn. Res. 3 (2002) 1–48.
[4] T. Bodnar, Y. Okhrin, Properties of the singular, inverse and generalized inverse partitioned Wishart distributions, J. Multivariate Anal. 99 (2008) 2389–2405.
[5] B.D. Carlson, Covariance matrix estimation errors and diagonal loading in adaptive arrays, IEEE Trans. Aerosp. Electron. Syst. 24 (4) (1988) 397–401.
[6] O.P. Cheremisin, Efficiency of adaptive algorithms with regularised sample covariance matrix, Radio Eng. Electron. Phys. 27 (10) (1982) 69–77.
[7] R.L. Dilsavor, R.L. Moses, Analysis of modified SMI method for adaptive array weight control, IEEE Trans. Signal Process. 41 (2) (1993) 721–726.
[8] M.L. Eaton, Chapter 8: The Wishart Distribution, vol. 53 of Lecture Notes–Monograph Series, Institute of Mathematical Statistics, Beachwood, Ohio, USA, pp. 302–333.
[9] S. Fine, K. Scheinberg, Efficient SVM training using low-rank kernel representations, J. Mach. Learn. Res. 2 (2001) 243–264.
[10] M.W. Ganz, R.L. Moses, S.L. Wilson, Convergence of the SMI and the diagonally loaded SMI algorithms with weak interference, 38 (3) (1990) 394–399.
[11] J.S. Goldstein, I.S. Reed, Reduced-rank adaptive filtering, IEEE Trans. Signal Process. 45 (2) (1997) 492–496.
[12] G. Golub, C.F. Van Loan, Matrix Computations, 3rd ed., Johns Hopkins University Press, Baltimore, 1996.
[13] N.R. Goodman, Statistical analysis based on a certain multivariate complex Gaussian distribution (an introduction), Ann. Math. Stat. 34 (1) (1963) 152–177.
[14] J.R. Guerci, J.S. Goldstein, I.S. Reed, Optimal and adaptive reduced-rank STAP, IEEE Trans. Aerosp. Electron. Syst. 36 (2) (2000) 647–663.
[15] A.M. Haimovich, The eigencanceler: adaptive radar by eigenanalysis methods, IEEE Trans. Aerosp. Electron. Syst. 32 (2) (1996) 532–542.
[16] A.M. Haimovich, Asymptotic distribution of the conditional signal to noise ratio in an eigenanalysis-based adaptive array, IEEE Trans. Aerosp. Electron. Syst. 33 (3) (1997) 988–997.
[17] A.M. Haimovich, Y. Bar-Ness, An eigenanalysis interference canceler, IEEE Trans. Acoust. Speech Signal Process. 39 (1) (1991) 76–84.
[18] H. Harbrecht, M. Peters, R. Schneider, On the low-rank approximation by the pivoted Cholesky decomposition, Appl. Numer. Math. 62 (2012) 428–440.
[19] N.J. Higham, Cholesky factorization, Wiley Interdiscip. Rev. 1 (2) (2009).
[20] C.G. Khatri, Classical statistical analysis based on a certain multivariate complex Gaussian distribution, Ann. Math. Stat. 36 (1) (1965) 98–114.
[21] I.P. Kirsteins, D.W. Tufts, Rapidly adaptive nulling of interference, in: M. Bouvet, G. Bienvenu (Eds.), High-Resolution Methods in Underwater Acoustics, Springer Berlin Heidelberg, Berlin, Heidelberg, 1991, pp. 217–249.
[22] I.P. Kirsteins, D.W. Tufts, Adaptive detection using low rank approximation to a data matrix, IEEE Trans. Aerosp. Electron. Syst. 30 (1) (1994) 55–67.
[23] H. Markowitz, Portfolio selection, J. Finance 7 (1) (1952) 77–91, doi: 10.1111/j.1540-6261.1952.tb01525.x.
[24] M.S. Srivastava, Singular Wishart and multivariate beta distributions, Ann. Stat. 31 (5) (2003) 1537–1560.
[25] H. Uhlig, On singular Wishart and singular multivariate beta distributions, Ann. Stat. 22 (1994) 395–405.