Contents lists available atScienceDirect
Linear Algebra and its Applications
www.elsevier.com/locate/laa
Perturbation analysis for matrix joint block diagonalization
Yunfeng Caia,1,Ren-Cang Lib,∗,2
aCognitiveComputingLab,BaiduResearch,Beijing100193,PRChina
bDepartmentofMathematics,UniversityofTexasatArlington,P.O.Box19408, Arlington,TX76019-0408,USA
a r t i c l e i n f o a b s t r a c t
Articlehistory:
Received23January2019 Accepted8July2019 Availableonline17July2019 SubmittedbyV.Mehrmann MSC:
65F99 49Q12 15A23 15A69 Keywords:
Matrixjointblock-diagonalization Perturbationanalysis
Modulusofuniqueness Modulusofnon-divisibility MICA
Thematrix joint block-diagonalization problem (jbdp)of a givenmatrixsetA={Ai}mi=1isaboutfindinganonsingular matrix W such that all WTAiW are block-diagonal. It includes the matrix joint diagonalization problem (jdp) as a specialcase for whichall WTAiW arerequireddiagonal.
Generically, such a matrix W may not exist, but there are practicalapplicationssuch asmultidimensional independent componentanalysis(MICA)forwhichitdoesexistunderthe idealsituation,i.e.,nonoiseispresented.However,inpractice noisesdogetinand,asaconsequence,thematrixsetisonly approximatelyblock-diagonalizable,i.e., onecan only make all WTAiW nearly block-diagonal at best, whereW is an approximationtoW,obtainedusuallybycomputation.The maingoalofthispaperistodevelopaperturbationtheoryfor jbdptoaddress,amongothers,thequestion:howaccuratethis Wis.Previouslysuchatheoryforjdphasbeendiscussed,but noefforthadbeenattemptedforjbdpbecause,inlargepart, thereisnoquantitative wayto describesolutionuniqueness of jbdpuntil2017 whenCaiandLiu(2017)[9] successfully obtained a necessary and sufficient uniqueness condition.
Based on the condition, in this article, we will establish a perturbationtheoryforjbdp.Ourmaincontributionsinclude an error bound for the approximate block-diagonalizer W
* Correspondingauthor.
E-mailaddresses:yfcai1116@gmail.com(Y. Cai),rcli@uta.edu(R.-C. Li).
1 SupportedinpartbyNNSFCgrants11671023,11301013,and11421101.
2 SupportedinpartbyNSFgrantsCCF-1527104andDMS-1719620.
https://doi.org/10.1016/j.laa.2019.07.012
0024-3795/©2019ElsevierInc. Allrightsreserved.
andabackwarderroranalysisforjbdp.Numericaltestsare presentedtovalidatethetheoreticalresults.
©2019ElsevierInc.Allrightsreserved.
1. Introduction
The matrix joint block-diagonalization problem (jbdp) is about jointly block- diagonalizing a set of matrices. In recent years, it has found many applications in independent subspace analysis, also known as multidimensional independent compo- nent analysis (MICA)(see, e.g.,[1–4]) andsemidefiniteprogramming (see,e.g., [5–8]).
Tremendous effortshavebeendevoted tosolvingjbdp and,as aresult,several numer- ical methodshave been proposed. Thepurpose of this paper, however,is to develop a perturbationtheoryforjbdp.Forthisreason,wewillnotdelveintonumericalmethods, butrefertheinterestedreaderto[9–12] andreferencestherein.Thematlabtoolboxfor tensor computation–tensorlab[13] canalsobeused forthepurpose.
Intherestofthissection,wewillformallyintroducejbdpandformulateitsassociated perturbationproblem,alongwithsomenotationsanddefinitions.Throughacasestudy onthebasicMICAmodel,werationalizeourformulationsandprovide ourmotivations for current study. Previously, there are only a handful papers in the literature that studied the perturbation analysis of the matrix joint diagonalization problem (jdp).
Briefly,wewillreviewtheseexistingworksandtheirlimitations.Finally,weexplainour contributionandtheorganizationofthispaper.
1.1. Jointblockdiagonalization(jbd)
A partition ofpositiveintegern:
τn= (n1, . . . , nt) (1.1) means thatn1,n2,. . . ,ntare allpositiveintegersand theirsumisn,i.e., t
i=1ni =n.
Theintegertiscalled thecardinality ofthepartitionτn,denotedbyt= card(τn).
Givenapartitionτnasin(1.1) andamatrixX ∈Rn×n(thesetofn×nrealmatrices), we partitionX as
X =
⎡
⎢⎢
⎢⎢
⎣
n1 n2 ··· nt n1 X11 X12 · · · X1t
n2 X21 X22 · · · X2t ... ... ... ...
nt Xt1 Xt2 · · · Xtt
⎤
⎥⎥
⎥⎥
⎦ (1.2)
and defineitsτn-block-diagonalpart andτn-off-block-diagonal part as
Bdiagτn(X) = diag(X11, . . . , Xtt), OffBdiagτn(X) =X−Bdiagτn(X).
ThematrixX isreferredtoasaτn-block-diagonalmatrixifOffBdiagτn(X)= 0.Theset ofallτn-block-diagonalmatricesisdenotedbyDτn.
TheJointBlockDiagonalizationProblem(jbdp).LetA={Ai}mi=1bethesetofmma- trices,whereeachAi∈Rn×n.ThejbdpforAwithrespecttoτn istofindanonsingular matrixW ∈Rn×n suchthatallWTAiW areτn-block-diagonal,i.e.,
WTAiW = diag(A(11)i , . . . , A(tt)i ) for i= 1,2, . . . , m, (1.3) where A(jj)i ∈Rnj×nj. When(1.3) holds, we saythatA is τn-block-diagonalizable and W isa τn-block-diagonalizer of A. If W isalso required to be orthogonal,this jbdp is referredto asanorthogonal jbdp (o-jbdp).
Byconvention,ifτn = (1,1,. . . ,1),theword“τn-block” isdroppedfrom allrelevant terms.Forexample,“τn-block-diagonal”isreducedtojust“diagonal”.Correspondingly, theletter “B”is dropped from allabbreviations. For example,“jbdp” becomes“jdp”.
Thisconventionisadoptedthroughoutthisarticle.
Generically, jbdp often has no solution for m ≥ 3 and nj not so unevenly dis- tributed, simply by counting the number of equations implied by (1.3) and the num- ber of unknowns. For example, when m = 3 and n1 = n2 = n3 = n/3, there are m(n2−t
i=1n2i) = 2n2 equations but only n2 unknowns in W. However, in certain practicalapplicationssuchasMICA withoutnoises,solvablejbdpdoarise.
Definition1.1.ApermutationmatrixΠ∈Rn×n iscalledτn-block-diagonalpreserving if ΠTDΠ∈Dτn foranyD∈Dτn.Thesetof allτn-block-diagonalpreservingpermutation matricesisdenotedbyPτn.
Evidentally, any permutation matrix Π ∈ Dτn is inPτn. This is because such a Π canbeexpressed asΠ= diag(Π1,. . . ,Πt), whereΠj isannj×nj permutationmatrix.
But not all Π ∈ Pτn also belong to Dτn. For example, for n = 4 and τ4 = (2,2), Π =
0 I2 I2 0
∈ Pτ4 but Π ∈/ Dτ4. In particular, any permutation matrix Π ∈ Rn×n is in Pτn when τn = (1,1,. . . ,1). It can be proved that for given Π ∈ Pτn, there is a permutationπof{1,2,. . . ,t}suchthat
ΠTDΠ∈Dτn= diag(ΠT1Dπ(1)Π1,ΠT2Dπ(2)Π2, . . . ,ΠTtDπ(t)Πt)
forany D= diag(D1,D2,. . . ,Dt)∈Dτn. Specifically,thesubblocksof Π,ifpartitioned asin(1.2),areall0 blocks,exceptthoseattheblockpositions(π(j),j),whicharenj×nj
permutationmatricesΠj. Asaconsequence,nj =nπ(j)forall1≤j≤t.
It is not hard to verify thatif W is a τn-block-diagonalizer of A, then so is W DΠ for any given D ∈ Dτn and Π ∈ Pτn. In view of this, τn-block-diagonalizers, if exist, arenotuniquebecauseanydiagonalizerbringsoutaclassofequivalent diagonalizersin
the form of W DΠ. For this reason, we introduce the following definition for uniquely block-diagonalizable jbdp.
Definition 1.2.Twoτn-block-diagonalizers W and W of A are equivalent if there exist a nonsingularmatrixD ∈Dτn and Π∈Pτn suchthatW =W DΠ. The jbdpfor Ais said uniquely τn-block-diagonalizable ifit hasa τn-block-diagonalizer and ifany two of itsτn-block-diagonalizersareequivalent.
To further reducefreedoms for the sakeof comparing two diagonalizers, we restrict ourconsiderationsofblock-diagonalizerstothematrixset:
Wτn :={W ∈Rn×n : W is nonsingular and Bdiagτn(WTW) =In}. (1.4) Thisdoesn’tlossany generalitybecauseW[Bdiagτn(WTW)]−1/2∈Wτn foranynonsin- gular W ∈Rn×n.
1.2. Perturbationproblem for jbdp Let A = Ai
m
i=1 = {Ai+ ΔAi}mi=1, where ΔAi is a perturbation to Ai. Assume A = {Ai}mi=1 is τn-block-diagonalizable and W ∈ Wτn is a τn-block-diagonalizer and (1.3) holds. Let W ∈ Wτn be an approximate τn-block-diagonalizer of Ain the sense that all WTAiW are approximately τn-block-diagonal. How much does W differ from theblock-diagonalizerW ofA?
There aretwo importantaspects that needclarification regardingthis perturbation problem. First, Amay or maynot be τn-block-diagonalizable. Although allowing this counters the common sense that one can only gauge the difference between diagonal- izers thatexist, it is for agood reasonand importantpractically to allow this. As we argued above,a genericjbdp is usually not block-diagonalizable, and thus even if the jbdpforAhasadiagonalizer,itsarbitrarilyperturbedproblemispotentiallynotblock- diagonalizablenomatterhowtinytheperturbationmaybe.Thisleadstoanimpossible task: to compare the block-diagonalizer W of the unperturbed A, that does exist, to a diagonalizer W of the perturbed matrix set A, that may not exist. We get around this dilemmabytalking aboutanapproximatediagonalizer forA,thatalwaysexist.It turns out this workaroundis exactlywhatsome practicalapplications call for because mostpracticaljbdpcomefromblock-diagonalizablejbdpbutcontaminatedwithnoises to becomeapproximatelyblock-diagonalizableand anapproximatediagonalizer forthe noisyjbdpgetscomputednumerically.Insuchascenario,itisimportanttogetasense as how farthecomputed diagonalizer isfrom theexactdiagonalizer of theclean albeit unknown jbdp,hadthenoisesnotpresented.
The second aspect is about what metric to use in order to measure the difference betweentwoblock-diagonalizers,giventhattheyarenotunique.InviewofDefinition1.2 and thediscussionintheparagraphimmediatelyproceedingit,weproposetouse
dist(W,W) := min
D∈Dτn,DTD=I Π∈Pτn
W−W DΠ (1.5)
forthepurpose,where·issomematrixnorm.Usuallywhichnormtouseisdetermined by the convenience of any particular analysis,but for all practical purpose,any norm is just as good as another. In our theoretical analysis below, we use both · 2, the matrix spectral norm, and · F, the matrix Frobenius norm [14], butuse only · F
in our numerical tests because then (1.5) is computable. Additionally, in using (1.5), we usually restrict W and W to Wτn. In fact, we can show that dist(·,·) is a metric over Wτn for any unitary invariant norm · ui as follows: first, the non-negativity dist(W,W)≥0 isobvious;second,dist(W,W)= 0 ifandonlyifWandWareequivalent;
third, dist(W,W) = dist(W ,W) holds since · ui is unitary invariant and D, Π are unitary;fourth,foranyW1,W2,W3∈Wτn, let
(D12,Π12) = argmin
D,Π
dist(W1, W2), (D23,Π23) = argmin
D,Π
dist(W2, W3).
ItcanbeseenthatD =D23Π23D12ΠT23∈Dτn andisalsoorthogonal,andΠ = Π 23Π12∈ Pτn,andfinally
dist(W1, W3)≤ W1−W3DΠui
≤ W1−W2D12Π12ui+W2D12Π12−W3DΠui
= dist(W1, W2) + dist(W2, W3).
1.3. A casestudy:MICA
MICA [1,15,4] aims at separating linearlymixed unknown sources into statistically independentgroupsofsignals.AbasicMICAmodelcanbe statedas
x=M s+v, (1.6)
wherex∈Rn istheobservedmixture, M∈Rn×n isanonsingular matrix(oftencalled themixingmatrix),s∈Rn isthesourcesignal,andv∈Rn is thenoisevector.
We would like to recover the source s from the observed mixture x. Let s = sT1,. . . ,sTtT
withsj ∈Rnj forj= 1,2,. . . ,t,andv= [ν1,. . . ,νn]T.Assumethatallsj areindependentofeachother,andeachsjhasmean0 andcontainsnolower-dimensional independent component, and among allsj, there exists at most oneGaussian compo- nent.Assumefurtherthatthenoisesν1,. . . ,νnarerealstationarywhiterandomsignals, mutually uncorrelated with the same variance σ2, and independent of the sources. To recoverthesourcesignals,itsufficestofindM oritsinversefromtheobservedmixture x. NoticethatifM isasolution,then sois M DΠ,where D isablock-diagonalscaling matrixandΠ isablock-wisepermutationmatrix.Inthissense,therearecertaindegree
of freedoms in thedetermination of M. Such indeterminacy of thesolution is natural, and doesnotmatterinapplications.Wehavethefollowingstatements.
(a) ThecovariancematrixRxxofxsatisfies
Rxx=E(xxT) =ME(ssT)MT+E(vvT) =M RssMT+σ2I, (1.7) where E(·) stands for the mathematical expectation, and Rss is the covariance matrix of s. Bythe aboveassumptions, we know thatRss ∈ Dτn. Often σ canbe verywell estimated:σ≈σ.ˆ Then wehave
Rxx−σˆ2I≈M RssMT. (1.8) Inparticular,intheabsenceofnoises,i.e.,σ= 0, (1.8) becomesanequality.
(b) Thekurtosis3Cx4ofxisatensorofdimensionn×n×n×n.Fixingtwoindices,say thefirsttwo,and varyingthelasttwo,wehave
Cx4(i1, i2,:,:) =MCs4(i1, i2,:,:)MT, (1.9) where Cs4 isthekurtosisof sanditcanbe shownthatC4s(i1,i2,:,:)∈Dτn.
Together, theyresultinajbdpfor
A={Rxx−ˆσI} ∪ {Cx4(i1, i2,:,:)}ni1,i2=1,
for which W := M−T is an exact τn-block-diagonalizer when no noise is presented.
When we attempt to numerically block-diagonalize A, what we do is to calculate an approximationWofM−TDΠ forsomeD∈DτnandΠ∈Pτn,whichcorrespondstothe indeterminacyof MICA(eveninthecasewhenthereis nonoise).
Thepoint wetrytomakefrom thiscasestudy isthat,inpracticalapplications,due to measurement errors, we only get to work with A={Ai}that are, in general,only approximately block-diagonalizableand, intheend,anapproximate block-diagonalizer W of A gets computed. In other words, we usually don’t have A which is known block-diagonalizable in theory but what we do have is A which may or may not be block-diagonalizableandforwhichwehaveanapproximateblock-diagonalizerW.Then how far this W is from the exactdiagonalizer W of A becomes acentral question, in order to gaugethe quality of W. This is whatwe set out to do inthis paper. Our re- sult isan upperbound onthemeasure in(1.5). Suchanupper boundwill alsohelpus understandwhataretheinherentfactorsthataffectthesensitivityof jbdp.
3 Othercumulantscanalsobeconsidered.
1.4. Related works
Thoughtremendouseffortshavegonetosolvejdp/jbdp,theirperturbationproblems had receivedlittle or no attention inthe past. Infact, today there are only ahandful articleswrittenontheperturbationsofjdponly.Foro-jdp,Cardoso[1] presentedafirst orderperturbationboundforasetofcommutingmatrices,andtheresultwaslatergener- alizedbyRusso[16].Forgeneraljdp,usinggradientflows,Afsari[17] studiedsensitivity viacostfunctionsandobtainedfirstorderperturbationboundsforthediagonalizer.Shi andCai[18] investigatedanormalizedjdpthroughaconstrainedoptimizationproblem, and obtained an upper bound on certain distance between an approximate diagonal- izerof aperturbed optimizationproblem andanexactdiagonalizer oftheunperturbed optimizationproblem.
jbdpcanalsoberegardedasaparticularcaseoftheblockterm decomposition(BTD) ofthirdordertensors[19–22].Theuniquenessconditionsoftensordecompositions,which is stronglyconnected to the sensitivity oftensor decompositions, receivedmuch atten- tion recently (see, e.g., [20,23–30]). However,as to the perturbation theory for tensor decompositions,despite itsimportance,few resultsexist.Recently in[31] and[32], the condition numbers for the so-called canonical polyadic decomposition (CPD) and join decompositionproblem (includingtheWaringdecomposition,andsomespecifictypesof BTD,etc.) are investigated. Nonetheless, morestudies arein needinthe perturbation theoryforvarioustypesoftensordecompositions.
1.5. Ourcontributionand theorganizationof this paper
A biggest reasonas to whyno available perturbation analysis forjbdp is, perhaps, dueto lackingperfectwaysto uniquelydescribe block-diagonalizers,nottomentionno availableuniquenesscondition to nailthem down, unlikemany othermatrix perturba- tion problems surveyed in [33]. Quite recently, inthe sense of Definition 1.2, Caiand Liu[9] establishednecessary andsufficient conditionsfor ajbdpto be uniquelyblock- diagonalizable.Theseconditionsarethecornerstoneforourcurrentinvestigationinthis paper.Unliketheresultsinexistingliteratures,theresultinthispaperdoesnotinvolve any cost function, which makes it widely applicable to any approximate diagonalizer computed from min/maximizing a cost function. The result also reveals the inherent factorsthataffect thesensitivity of jbdp.
Therestofthispaperisorganizedasfollows. Insection2,wediscuss propertiesofa uniquelyblock-diagonalizablejbdpandintroducetheconceptsofthemoduliofunique- nessandnon-divisibilitythatplaykeyrolesinourlaterdevelopment.Ourmainresultis presentedinsection3,alongwithdetaileddiscussionsonitsnumerousimplications.The proofofthemainresultisratherlongandtechnicalandthusisdeferredtosection4.We validate ourtheoretical contributions bynumerical tests reportedinsection5. Finally, concludingremarksaregiveninsection6.
Notation.Rm×n is theset ofall m×nreal matricesand Rm=Rm×1. In isthen×n identitymatrix,and0m×nisthem-by-nzeromatrix.Whentheirsizesareclearfromthe context, we maysimply write I and 0. The symbol ⊗denotes the Kronecker product.
Theoperationvec(X) turnsamatrixX intoacolumnvectorformedbythefirstcolumn of X followed by its second column and then its third column and so on. Inversely, reshape(x,m,n) turns themn-by-1 vectorxinto anm-by-nmatrixinsuchaway that reshape(vec(X),m,n)=X for anyX ∈Rm×n.Thespectralnorm andFrobeniusnorm of amatrixaredenotedby· 2and · F,respectively.Forasquare matrixA,eig(A) is the set ofall eigenvaluesof A, counting algebraic multiplicities.For convenience, we will agreethatanymatrixA∈Rm×n hasnsingularvaluesand σmin(A) isthesmallest oneamongthem.κ2(A)= σA2
min(A) denotesthematrixspectralconditionnumber.
2. Uniquelyblock-diagonalizable jbdp
We begin by fixing two universal notations, throughout the rest of this paper, for the sets ofmatrices of interest.Let A={Ai}mi=1 be the set ofm matrices, where each Ai∈Rn×n. WhenAisτn-blockdiagonalizable,i.e., (1.3) holds,wedefinematrixsets
Aj ={A(jj)i }mi=1 forj= 1,2, . . . , t= card(τn). (2.1) In[9],aclassificationof jbdpisproposed.Among allandbesidestheoneinsubsec- tion 1.1, there is the so-calledgeneral jbdp(gjbdp)for A for which apartition τn is notgiven butinsteaditasksfor findingapartitionτn withthelargestcardinality such thatA isτn-block-diagonalizableand at thesametimeaτn-block-diagonalizer.Via an algebraicapproach,necessaryandsufficientconditions[9,Theorem2.5] areobtainedfor theuniquenessof(equivalent)block-diagonalizersofthegjbdpforA.Asacorollary,we havethefollowingresult.
Theorem2.1([9]).Given partitionτn ofn,supposethat thejbdpof Aisτn-blockdiag- onalizableandW isitsτn-block-diagonalizersatisfying (1.3),andassumethatevery Aj
cannotbefurtherblockdiagonalized,4i.e.,foranypartitionτnj ofnj withcard(τnj)≥2, Aj isnotτnj-block-diagonalizable.ThenthejbdpofAisuniquelyτn-block-diagonalizable if andonly ifthematrix
Mjk= m i=1
I
nk⊗
(A(jj)i )TA(jj)i +A(jj)i (A(jj)i )T
A(kk)i ⊗A(jj)i +(A(kk)i )T⊗(A(jj)i )T A(kk)i ⊗A(jj)i +(A(kk)i )T⊗(A(jj)i )T
(A(kk)i )TA(kk)i +A(kk)i (A(kk)i )T
⊗Inj
(2.2)
is nonsingularforall1≤j < k≤t.
4 FortheMICAmodel,thisassumptionisequivalenttosaythateachcomponentsj hasnolowerdimen- sionalindependentcomponent.
ThefollowingsubspaceofRn×n N (A) :=
Z∈Rn×n : AiZ−ZTAi= 0 for 1≤i≤m
(2.3) hasplayedanimportantroleintheproofof[9,Theorem2.5],anditwillalsocontribute toourperturbationanalysislaterinabigway.
Next,letusexaminesomefundamentalpropertiesofZ ∈N (A) with
Ai= diag(A(11)i , . . . , A(tt)i ) for 1≤i≤m (2.4) already.AnyZ∈N (A) satisfies
diag(A(11)i , . . . , A(tt)i )Z−ZTdiag(A(11)i , . . . , A(tt)i ) = 0 for 1≤i≤m. (2.5) Partition Z conformally as Z = [Zjk], where Zjk ∈ Rnj×nk. Blockwise, (2.5) can be rewrittenas
A(jj)i Zjk−ZkjTA(kk)i = 0 for 1≤i≤m, 1≤j, k≤t. (2.6) Theseequationscanbe decoupledintotwosetsofmatrixequations.Thefirstset is,for 1≤j ≤t,
A(jj)i Zjj−ZjjTA(jj)i = 0 for 1≤i≤m, (2.7a) andthesecondset is,for1≤j < k≤t,
A(jj)i Zjk−ZkjTA(kk)i = 0, A(kk)i Zkj−ZjkTA(jj)i = 0 for 1≤i≤m. (2.7b) Considerfirst (2.7b) for1≤j < k ≤t. Withthe helpof the Kronecker product (see, e.g.,[34]),theyareequivalentto
Gjk
vec(Zjk)
−vec(ZkjT)
= 0, (2.8a)
where
Gjk=
⎡
⎢⎢
⎢⎢
⎢⎢
⎢⎢
⎣
Ink⊗A(jj)1 (A(kk)1 )T⊗Inj
Ink⊗(A(jj)1 )T A(kk)1 ⊗Inj
... ...
Ink⊗A(jj)m (A(kk)m )T⊗Inj Ink⊗(A(jj)m )T A(kk)m ⊗Inj
⎤
⎥⎥
⎥⎥
⎥⎥
⎥⎥
⎦
. (2.8b)
Notice that Mjk defined in (2.2) simply equals to GTjkGjk. Thus, according to Theo- rem 2.1,Ais uniquelyτn-block-diagonalizable ifand onlyifthesmallestsingularvalue σmin(Gjk)>0,providedallAj cannotbefurtherblockdiagonalized.
Next,wenote that(2.7a) isequivalentto
Gjjvec(Zjj) = 0, (2.9a)
where
Gjj=
⎡
⎢⎣
Inj ⊗A(jj)1 −
(A(jj)1 )T⊗Inj
Πj
... Inj ⊗A(jj)m −
(A(jj)m )T⊗Inj
Πj
⎤
⎥⎦, (2.9b)
and Πj ∈ Rn2j is the perfect shuffle permutation matrix [35, Subsection 1.2.11] that enablesΠjvec(ZjjT)= vec(Zjj).
Theorem 2.2.Suppose A is already in the jbd form with respect to τn = (n1,. . . ,nt), i.e., Ai aregivenby (2.4).Thefollowingstatements hold.
(a) Gjjvec(Inj)= 0,i.e.,Gjj isrank-deficient;
(b) Aj cannot be further block-diagonalized if and only if for any Zjj ∈ N (Aj), its eigenvalues areeitherasinglerealnumber orasinglepairoftwocomplexconjugate numbers.
(c) If dimN (Aj)= 1 whichmeanseither nj= 1 orthesecond smallestsingularvalue of Gjj ispositive, thenAj cannotbe furtherblock-diagonalized.
Proof. Item(a)holds becauseZ =Inj clearlysatisfies(2.7a).
Foritem (b),wewillprovebothsufficiencyandnecessitybycontradiction.
(⇒) Suppose there exists a Zjj ∈ N (Aj) such that its eigenvalues are neither a single real numbernor asingle pairof two complex conjugate numbers.Then Zjj can be decomposed into Zjj =Wjdiag(D1(j),D(j)2 )Wj−1, where Wj, D(j)1 , D(j)2 are all real matricesandeig(D1(j))∩eig(D(j)2 )=∅.Thensubstitutingthedecompositioninto(2.7a), we can conclude that WjTA(jj)i Wj for i = 1,2,. . . ,m are all block-diagonal matrices, contradicting thatAj cannotbefurtherblockdiagonalized.
(⇐) Assume, to the contrary, that Aj canbe furtherblock-diagonalized, i.e., there exists a nonsingular Wj such that WjTA(jj)i Wj = diag(Bi(j1),Bi(j2)), where Bij1, Bi(j2) are ofordernj1 andnj2,respectively.Then
Zjj=Wjdiag(γ1Inj1, γ2Inj2)Wj−1∈N (Aj),
where γ1, γ2 are arbitrary real numbers. That is that some Zjj ∈ N (Aj) can have distinct realeigenvalues,acontradiction.
Lastlyforitem(c),assume,tothecontrary,thatAjcanbefurtherblock-diagonalized.
Withoutloss ofgenerality,wemayassumethatthere existsanonsingularmatrixWj∈ Rnj×nj such that WjTA(jj)i Wj = diag(A(jj1)i ,A(jj2)i ) for i = 1,2,. . . ,m, where A(jj1)i and A(jj2)i are respectively of order nj1 and nj2. Then (2.7a) has at least two linearly independentsolutionsWjdiag(Inj1,0)Wj−1, Wjdiag(0,Inj2)Wj−1. Therefore,(2.9a) has twolinearlyindependentsolutions,whichimpliesthatthesecondsmallestsingularvalue ofthecoefficientmatrixGjj mustbe 0,acontradiction. 2
In view of Theorems 2.1 and 2.2, we introduce the moduli of uniqueness and non- divisibilityforτn-block-diagonalizableA.
Definition2.3.SupposethatAisτn-block-diagonalizableandletW ∈Wτnbeaτn-block- diagonalizerofAsuchthat(1.3) holds.
(a) The modulus of uniqueness of the jbdp for A with respective to the τn-block- diagonalizerW isdefinedby
ωuq≡ωuq(A;W) = min
1≤j<k≤tσmin(Gjk), (2.10) whereGjk isgivenby(2.8b).
(b) Suppose that none of Aj can be further block-diagonalized. The modulus of non- divisibility ωnd ≡ ωnd(A;W) of the jbdp for A with respective to the τn-block- diagonalizerW isdefinedbyωnd=∞ifτn = (1,1,. . . ,1) and
ωnd= min
nj>1{the smallest nonzero singular value ofGjj} (2.11) otherwise,where Gjj isgivenby(2.9b).
Notethenotionofthemodulusofnon-divisibilityisdefinedundertheconditionthat noneofAj canbefurtherblock-diagonalized.Itisneededbecauseinorderfor(2.11) to be well-defined,we needto makesure thatGjj has atleast onenonzerosingularvalue inthecasewhennj >1.Indeed,Gjj= 0 whenevernj>1,ifnoneofAj canbefurther block-diagonalized.To see this, we note Gjj = 0 impliesthat any matrix Zjj of order nj isasolutionto (2.7a) and thus A(jj)i for 1≤i≤m arediagonal, whichmeansthat Ajcanbefurther(block)diagonalized.ThiscontradictstheassumptionthatnoneofAj
canbe furtherblock-diagonalized.
ThepropositionbelowpartiallyjustifiesDefinition2.3.
Proposition2.4.SupposethatAisτn-block-diagonalizableandletW ∈Wτnbeaτn-block- diagonalizer of Asuch that (1.3)holds.Suppose dimN (Aj)= 1 forall1≤j ≤t,and letσ(j)−2 bethesecondsmallestsingularvalueofGjj whenevernj >1.Thenthefollowing statementholds.
(a) Ais uniquelyτn-block-diagonalizableifωuq(A;W)>0.
(b) None ofAj can befurtherblock-diagonalizedand ωnd≡ωnd(A;W) = min
nj>1σ(j)−2>0.
Remark2.5.A fewcommentsareinorder.
(a) Thedefinitionofωuqisanaturalgenerationofthemodulusofuniquenessin[18] for jdp(i.e.,when τn= (1,1,. . . ,1)).
(b) By Theorem 2.2(a), we know the smallest singularvalue of Gjj is always 0.Thus it seemsnatural thatindefining ωnd in(2.11),one wouldexpect using thesecond smallestsingularvalueofGjj.ItturnsoutthatthereareexamplesforwhichAjcan- notbe furtherblock-diagonalized andyetdimN(Aj)= 2,i.e.,thesecond smallest singularvalueofGjjisstill0.Asanexample,weconsiderAi=
αi βi
βi −αi
∈R2×2for i= 1,2,. . . ,m,whereβi = 0 foralliandαi/βi’sarenotaconstant.ThenAcannot be simultaneouslydiagonalizedbut N(A)= span{I2,
0 1
−1 0
},i.e.,dimN (A)= 2.
The moduli ωuq and ωnd, as defined in Definition 2.3, depend on the choiceof the diagonalizer W. But, as the following theorem shows, inthe case when A is uniquely τn-block-diagonalizable,theirdependencyondiagonalizerW ∈Wτn canbe removed.
Theorem 2.6. If A isuniquely τn-block-diagonalizable, then ωuq andωnd are both inde- pendent of thechoiceof diagonalizersW ∈Wτn.
Proof. Let W ∈ Wτn be a τn-block-diagonalizer of A. Then all possible τn-block- diagonalizers ofAfromWτn taketheform W =W DΠ forsomeD∈Dτn and Π∈Pτn. Wewill showthatωuq(A;W)=ωuq(A;W) andωnd(A;W)=ωnd(A;W).
WecanwriteD= diag(D1,. . . ,Dt),whereDj ∈Rnj×nj.AllDjareorthogonalsince W,W∈Wτn.Wehave
WTAiW = ΠTdiag(DT1A(11)i D1, . . . , DTtA(tt)i Dt)Π
= diag(ΠT1DT1A(i 11)D1Π1, . . . ,ΠTtDTtA(i tt)DtΠt),
where{1,2,. . . ,t}isapermutationof{1,2,. . . ,t},andΠj isapermutationmatrixof order nj.Denote byA(jj)i = ΠTjDTjA(i jj)DjΠj,anddefine Gjk,accordinglyasGjk in (2.8b),butintermsofA(jj)i andA(kk)i ,Gjj,accordinglyas Gjj in(2.9b),butinterms of A(jj)i .Thenbycalculations,wehave
Gjk =
I2m⊗(ΠkDk)T⊗(ΠjDj)T) Gjk
I2⊗(ΠkDk)⊗(ΠjDj)) , Gjj =
Im⊗(ΠjDj)T⊗(ΠjDj)T) Gjj
(ΠjDj)⊗(ΠjDj) ,
whichimply thatthesingularvaluesof Gjk andGjj arethesameas those ofGjk and Gjj,respectively.Theconclusionfollows. 2
3. Mainperturbationresults
Inthis section,we presentourmain theorem,along withsomeillustratingexamples anddiscussionsonitsimplications.Wedeferitslengthy prooftosection4.
3.1. Set upthestage
In what follows, we will set up the groundwork for our perturbation analysis and explainsomeof ourassumptions.
RecallA={Ai}mi=1 which istheunperturbed matrixset,where allAi ∈Rn×n, and τn = (n1,. . . ,nt) isapartitionofnwitht= card(τn)≥2.Weassumethat
Aisτn-block-diagonalizable,W ∈Wτnisitsτn-block-diagonalizer
suchthat(1.3) holds,and dimN (Aj)= 1 forallj. (3.1) The assumption that dimN(Aj) = 1 implies that Aj cannot be further block- diagonalizedbyTheorem2.2(c).
SupposethatAisperturbedto A={Ai}mi=1≡ {Ai+ ΔAi}mi=1, andlet AF:=
m
i=1
Ai2F
1/2
, δA:=
m
i=1
ΔAi2F
1/2
. (3.2)
Previously, we commented on that, more often than not, a generic jbdp may not be τn-block-diagonalizable for m ≥ 3. This means that A may not be τn-block- diagonalizableregardlesshow tinyδA maybe.Forthis reason,wewillnotassumethat Aisτn-block-diagonalizable, but,instead, it hasan approximateτn-block-diagonalizer W ∈Wτn inthesensethat
allWTAiWare nearlyτn-block-diagonal. (3.3) Doing so has two advantages. Firstly, it serves all practical purposes well, because in anylikelypracticalsituationsweusuallyendupwithAwhichisclosetosomeτn-block- diagonalizableA, which is not actually availabledue to unavoidable noises such as in MICA,and,atthesametime,anapproximateτn-block-diagonalizercanbemadeavail- ablebycomputation.Secondly,itisgeneralenoughtocoverthecasewhenthejbdpfor Aisactuallyτn-block-diagonalizable.
Wehaveto quantifythe statement(3.3) inorder toproceed. Tothis end,wepicka diagonal matrix Γ = diag(γ1In1,. . . ,γtInt), where γ1,. . . ,γt are distinct real numbers withmaxj|γj|= 1,anddefine theτn-block-diagonalzablilityresiduals
Ri=WTAiWΓ−ΓWTAiW fori= 1,2, . . . , m. (3.4) NoticeBdiagτn(Ri)= 0 alwaysnomatterwhatΓ is.Therationalebehinddefiningthese residualsisinthefollowingproposition.
Proposition 3.1.WTAiW is τn-block-diagonal, i.e., OffBdiagτn(WTAiW) = 0 if and only if Ri= 0.
As farasthis propositionisconcerned,anydiagonal Γ withdistinctdiagonalentries suffices.Butlater,wewillseethatourupperbounddependsonΓ,whichmakesuswonder whatthe bestΓ isfor thebestpossible bound. Unfortunately,this isnot atrivialtask and wouldbeaninteresting subjectfor futurestudies.Laterinournumericaltests, we useafewrandomΓ alongwith thefollowing one
Γ = diag(−In1,(−1 + 2
t−1)In2,(−1 + 4
t−1)In3, . . . , Int). (3.5) Nonetheless, westillkeepΓ asaparameterto chooseinourmain resultinhopethatit maycometohelpincertaincircumstance.Werestrictγitorealnumbersforconsistency considerationsinceAandAareassumedreal.Alldevelopmentsbelowworkequallywell even iftheyarecomplex.Set
g= min
j=k|γj−γk|, ˜r= m
i=1
Ri2F
1/2
. (3.6)
The quantityr˜willbe used tomeasure how goodW is inapproximately diagonalizing A. Infact,itcanbeverifiedthat
g m
i=1
OffBdiagτn(WTAiW)2F
12
≤r˜
≤2 m
i=1
OffBdiagτn(WTAiW)2F
12 , (3.7)
whichimpliesthat˜riscomparabletom
i=1OffBdiagτn(WTAiW)2F
12
,providedgis nottootiny.Forthechoice(3.5),g= 2/(t−1).
The next proposition is due to an anonymous referee and it improves our earlier version.
Proposition3.2. Wisanexactτn-block-diagonalizerofthematrixset{Ai+Ei}mi=1 with relative backwarderror