Perturbation analysis for matrix joint block diagonalization

(1)

Contents lists available atScienceDirect

Linear Algebra and its Applications

www.elsevier.com/locate/laa

Perturbation analysis for matrix joint block diagonalization

Yunfeng Cai^a,1,Ren-Cang Li^b,∗,2

aCognitiveComputingLab,BaiduResearch,Beijing100193,PRChina

bDepartmentofMathematics,UniversityofTexasatArlington,P.O.Box19408, Arlington,TX76019-0408,USA

a r t i c l e i n f o a b s t r a c t

Articlehistory:

Received23January2019 Accepted8July2019 Availableonline17July2019 SubmittedbyV.Mehrmann MSC:

65F99 49Q12 15A23 15A69 Keywords:

Matrixjointblock-diagonalization Perturbationanalysis

Modulusofuniqueness Modulusofnon-divisibility MICA

Thematrix joint block-diagonalization problem (jbdp)of a givenmatrixsetA={Ai}^mi=1isaboutﬁndinganonsingular matrix W such that all W^TAiW are block-diagonal. It includes the matrix joint diagonalization problem (jdp) as a specialcase for whichall W^TAiW arerequireddiagonal.

Generically, such a matrix W may not exist, but there are practicalapplicationssuch asmultidimensional independent componentanalysis(MICA)forwhichitdoesexistunderthe idealsituation,i.e.,nonoiseispresented.However,inpractice noisesdogetinand,asaconsequence,thematrixsetisonly approximatelyblock-diagonalizable,i.e., onecan only make all W^TA_iW nearly block-diagonal at best, whereW is an approximationtoW,obtainedusuallybycomputation.The maingoalofthispaperistodevelopaperturbationtheoryfor jbdptoaddress,amongothers,thequestion:howaccuratethis Wis.Previouslysuchatheoryforjdphasbeendiscussed,but noeﬀorthadbeenattemptedforjbdpbecause,inlargepart, thereisnoquantitative wayto describesolutionuniqueness of jbdpuntil2017 whenCaiandLiu(2017)[9] successfully obtained a necessary and suﬃcient uniqueness condition.

Based on the condition, in this article, we will establish a perturbationtheoryforjbdp.Ourmaincontributionsinclude an error bound for the approximate block-diagonalizer W

* Correspondingauthor.

E-mailaddresses:yfcai1116@gmail.com(Y. Cai),rcli@uta.edu(R.-C. Li).

1 SupportedinpartbyNNSFCgrants11671023,11301013,and11421101.

2 SupportedinpartbyNSFgrantsCCF-1527104andDMS-1719620.

https://doi.org/10.1016/j.laa.2019.07.012

(2)

andabackwarderroranalysisforjbdp.Numericaltestsare presentedtovalidatethetheoreticalresults.

1. Introduction

The matrix joint block-diagonalization problem (jbdp) is about jointly block- diagonalizing a set of matrices. In recent years, it has found many applications in independent subspace analysis, also known as multidimensional independent compo- nent analysis (MICA)(see, e.g.,[1–4]) andsemideﬁniteprogramming (see,e.g., [5–8]).

Tremendous eﬀortshavebeendevoted tosolvingjbdp and,as aresult,several numerical methodshave been proposed. Thepurpose of this paper, however,is to develop a perturbationtheoryforjbdp.Forthisreason,wewillnotdelveintonumericalmethods, butrefertheinterestedreaderto[9–12] andreferencestherein.Thematlabtoolboxfor tensor computation–tensorlab[13] canalsobeused forthepurpose.

Intherestofthissection,wewillformallyintroducejbdpandformulateitsassociated perturbationproblem,alongwithsomenotationsanddeﬁnitions.Throughacasestudy onthebasicMICAmodel,werationalizeourformulationsandprovide ourmotivations for current study. Previously, there are only a handful papers in the literature that studied the perturbation analysis of the matrix joint diagonalization problem (jdp).

Brieﬂy,wewillreviewtheseexistingworksandtheirlimitations.Finally,weexplainour contributionandtheorganizationofthispaper.

1.1. Jointblockdiagonalization(jbd)

A partition ofpositiveintegern:

τn= (n1, . . . , nt) (1.1) means thatn₁,n₂,. . . ,n_tare allpositiveintegersand theirsumisn,i.e., t

i=1n_i =n.

Theintegertiscalled thecardinality ofthepartitionτ_n,denotedbyt= card(τ_n).

Givenapartitionτ_nasin(1.1) andamatrixX ∈Rⁿ^×ⁿ(thesetofn×nrealmatrices), we partitionX as

X =

⎡

⎢⎢

⎣

n1 n2 ··· nt n1 X11 X12 · · · X1t

n2 X₂₁ X₂₂ · · · X_2t ... ... ... ...

nt Xt1 Xt2 · · · Xtt

⎤

⎥⎥

⎦ (1.2)

and deﬁneitsτn-block-diagonalpart andτn-oﬀ-block-diagonal part as

(3)

Bdiag_τ_n(X) = diag(X11, . . . , Xtt), OﬀBdiag_τ_n(X) =X−Bdiag_τ_n(X).

ThematrixX isreferredtoasaτ_n-block-diagonalmatrixifOﬀBdiag_τ_n(X)= 0.Theset ofallτ_n-block-diagonalmatricesisdenotedbyD_τ_n.

TheJointBlockDiagonalizationProblem(jbdp).LetA={Ai}^mi=1bethesetofmma- trices,whereeachAi∈R^n×n.ThejbdpforAwithrespecttoτn istoﬁndanonsingular matrixW ∈R^n×n suchthatallW^TAiW areτn-block-diagonal,i.e.,

W^TA_iW = diag(A⁽¹¹⁾_i , . . . , A^(tt)_i ) for i= 1,2, . . . , m, (1.3) where A^(jj)_i ∈Rⁿ^j^×n^j. When(1.3) holds, we saythatA is τn-block-diagonalizable and W isa τn-block-diagonalizer of A. If W isalso required to be orthogonal,this jbdp is referredto asanorthogonal jbdp (o-jbdp).

Byconvention,ifτ_n = (1,1,. . . ,1),theword“τ_n-block” isdroppedfrom allrelevant terms.Forexample,“τn-block-diagonal”isreducedtojust“diagonal”.Correspondingly, theletter “B”is dropped from allabbreviations. For example,“jbdp” becomes“jdp”.

Thisconventionisadoptedthroughoutthisarticle.

Generically, jbdp often has no solution for m ≥ 3 and nj not so unevenly dis- tributed, simply by counting the number of equations implied by (1.3) and the number of unknowns. For example, when m = 3 and n₁ = n₂ = n₃ = n/3, there are m(n²−t

i=1n²_i) = 2n² equations but only n² unknowns in W. However, in certain practicalapplicationssuchasMICA withoutnoises,solvablejbdpdoarise.

Deﬁnition1.1.ApermutationmatrixΠ∈Rⁿ^×ⁿ iscalledτ_n-block-diagonalpreserving if Π^TDΠ∈Dτn foranyD∈Dτn.Thesetof allτn-block-diagonalpreservingpermutation matricesisdenotedbyPτn.

Evidentally, any permutation matrix Π ∈ Dτn is inPτn. This is because such a Π canbeexpressed asΠ= diag(Π1,. . . ,Πt), whereΠj isannj×nj permutationmatrix.

But not all Π ∈ Pτn also belong to Dτn. For example, for n = 4 and τ4 = (2,2), Π =

0 I₂ I₂ 0

∈ Pτ4 but Π ∈/ Dτ4. In particular, any permutation matrix Π ∈ R^n×n is in Pτn when τn = (1,1,. . . ,1). It can be proved that for given Π ∈ Pτn, there is a permutationπof{1,2,. . . ,t}suchthat

Π^TDΠ∈D_τ_n= diag(Π^T₁D_π(1)Π₁,Π^T₂D_π(2)Π₂, . . . ,Π^T_tD_π(t)Π_t)

forany D= diag(D1,D2,. . . ,Dt)∈Dτn. Speciﬁcally,thesubblocksof Π,ifpartitioned asin(1.2),areall0 blocks,exceptthoseattheblockpositions(π(j),j),whicharenj×nj

permutationmatricesΠ_j. Asaconsequence,n_j =n_π(j)forall1≤j≤t.

It is not hard to verify thatif W is a τ_n-block-diagonalizer of A, then so is W DΠ for any given D ∈ Dτn and Π ∈ Pτn. In view of this, τn-block-diagonalizers, if exist, arenotuniquebecauseanydiagonalizerbringsoutaclassofequivalent diagonalizersin

(4)

the form of W DΠ. For this reason, we introduce the following deﬁnition for uniquely block-diagonalizable jbdp.

Deﬁnition 1.2.Twoτn-block-diagonalizers W and W of A are equivalent if there exist a nonsingularmatrixD ∈D_τ_n and Π∈P_τ_n suchthatW =W DΠ. The jbdpfor Ais said uniquely τ_n-block-diagonalizable ifit hasa τ_n-block-diagonalizer and ifany two of itsτn-block-diagonalizersareequivalent.

To further reducefreedoms for the sakeof comparing two diagonalizers, we restrict ourconsiderationsofblock-diagonalizerstothematrixset:

Wτn :={W ∈Rⁿ^×ⁿ : W is nonsingular and Bdiag_τ_n(W^TW) =I_n}. (1.4) Thisdoesn’tlossany generalitybecauseW[Bdiag_τ_n(W^TW)]⁻^1/2∈Wτn foranynonsin- gular W ∈R^n×n.

1.2. Perturbationproblem for jbdp Let A = Ai

m

i=1 = {Ai+ ΔAi}^mi=1, where ΔAi is a perturbation to Ai. Assume A = {Ai}^mi=1 is τn-block-diagonalizable and W ∈ Wτn is a τn-block-diagonalizer and (1.3) holds. Let W ∈ Wτn be an approximate τn-block-diagonalizer of Ain the sense that all W^TA_iW are approximately τ_n-block-diagonal. How much does W diﬀer from theblock-diagonalizerW ofA?

There aretwo importantaspects that needclariﬁcation regardingthis perturbation problem. First, Amay or maynot be τn-block-diagonalizable. Although allowing this counters the common sense that one can only gauge the diﬀerence between diagonalizers thatexist, it is for agood reasonand importantpractically to allow this. As we argued above,a genericjbdp is usually not block-diagonalizable, and thus even if the jbdpforAhasadiagonalizer,itsarbitrarilyperturbedproblemispotentiallynotblock- diagonalizablenomatterhowtinytheperturbationmaybe.Thisleadstoanimpossible task: to compare the block-diagonalizer W of the unperturbed A, that does exist, to a diagonalizer W of the perturbed matrix set A, that may not exist. We get around this dilemmabytalking aboutanapproximatediagonalizer forA,thatalwaysexist.It turns out this workaroundis exactlywhatsome practicalapplications call for because mostpracticaljbdpcomefromblock-diagonalizablejbdpbutcontaminatedwithnoises to becomeapproximatelyblock-diagonalizableand anapproximatediagonalizer forthe noisyjbdpgetscomputednumerically.Insuchascenario,itisimportanttogetasense as how farthecomputed diagonalizer isfrom theexactdiagonalizer of theclean albeit unknown jbdp,hadthenoisesnotpresented.

The second aspect is about what metric to use in order to measure the diﬀerence betweentwoblock-diagonalizers,giventhattheyarenotunique.InviewofDeﬁnition1.2 and thediscussionintheparagraphimmediatelyproceedingit,weproposetouse

(5)

dist(W,W) := min

D∈D_τn,D^TD=I Π∈P_τn

W−W DΠ (1.5)

forthepurpose,where·issomematrixnorm.Usuallywhichnormtouseisdetermined by the convenience of any particular analysis,but for all practical purpose,any norm is just as good as another. In our theoretical analysis below, we use both · 2, the matrix spectral norm, and · F, the matrix Frobenius norm [14], butuse only · F

in our numerical tests because then (1.5) is computable. Additionally, in using (1.5), we usually restrict W and W to W_τ_n. In fact, we can show that dist(·,·) is a metric over W_τ_n for any unitary invariant norm · ui as follows: ﬁrst, the non-negativity dist(W,W)≥0 isobvious;second,dist(W,W)= 0 ifandonlyifWandWareequivalent;

third, dist(W,W) = dist(W ,W) holds since · ui is unitary invariant and D, Π are unitary;fourth,foranyW1,W2,W3∈Wτn, let

(D₁₂,Π₁₂) = argmin

D,Π

dist(W₁, W₂), (D₂₃,Π₂₃) = argmin

D,Π

dist(W₂, W₃).

ItcanbeseenthatD =D₂₃Π₂₃D₁₂Π^T₂₃∈D_τ_n andisalsoorthogonal,andΠ = Π ₂₃Π₁₂∈ P_τ_n,andﬁnally

dist(W1, W3)≤ W1−W3DΠui

≤ W1−W2D12Π12ui+W2D12Π12−W3DΠui

= dist(W₁, W₂) + dist(W₂, W₃).

1.3. A casestudy:MICA

MICA [1,15,4] aims at separating linearlymixed unknown sources into statistically independentgroupsofsignals.AbasicMICAmodelcanbe statedas

x=M s+v, (1.6)

wherex∈Rⁿ istheobservedmixture, M∈Rⁿ^×ⁿ isanonsingular matrix(oftencalled themixingmatrix),s∈Rⁿ isthesourcesignal,andv∈Rⁿ is thenoisevector.

We would like to recover the source s from the observed mixture x. Let s = s^T₁,. . . ,s^T_tT

withs_j ∈Rⁿ^j forj= 1,2,. . . ,t,andv= [ν₁,. . . ,ν_n]^T.Assumethatalls_j areindependentofeachother,andeachsjhasmean0 andcontainsnolower-dimensional independent component, and among allsj, there exists at most oneGaussian component.Assumefurtherthatthenoisesν1,. . . ,νnarerealstationarywhiterandomsignals, mutually uncorrelated with the same variance σ², and independent of the sources. To recoverthesourcesignals,itsuﬃcestoﬁndM oritsinversefromtheobservedmixture x. NoticethatifM isasolution,then sois M DΠ,where D isablock-diagonalscaling matrixandΠ isablock-wisepermutationmatrix.Inthissense,therearecertaindegree

(6)

of freedoms in thedetermination of M. Such indeterminacy of thesolution is natural, and doesnotmatterinapplications.Wehavethefollowingstatements.

(a) ThecovariancematrixR_xxofxsatisﬁes

R_xx=E(xx^T) =ME(ss^T)M^T+E(vv^T) =M R_ssM^T+σ²I, (1.7) where E(·) stands for the mathematical expectation, and Rss is the covariance matrix of s. Bythe aboveassumptions, we know thatR_ss ∈ D_τ_n. Often σ canbe verywell estimated:σ≈σ.ˆ Then wehave

R_xx−σˆ²I≈M R_ssM^T. (1.8) Inparticular,intheabsenceofnoises,i.e.,σ= 0, (1.8) becomesanequality.

(b) Thekurtosis³Cx⁴ofxisatensorofdimensionn×n×n×n.Fixingtwoindices,say theﬁrsttwo,and varyingthelasttwo,wehave

Cx⁴(i1, i2,:,:) =MCs⁴(i1, i2,:,:)M^T, (1.9) where Cs⁴ isthekurtosisof sanditcanbe shownthatC⁴s(i₁,i₂,:,:)∈D_τ_n.

Together, theyresultinajbdpfor

A={Rxx−ˆσI} ∪ {Cx⁴(i1, i2,:,:)}ⁿi1,i2=1,

for which W := M⁻^T is an exact τ_n-block-diagonalizer when no noise is presented.

When we attempt to numerically block-diagonalize A, what we do is to calculate an approximationWofM⁻^TDΠ forsomeD∈DτnandΠ∈Pτn,whichcorrespondstothe indeterminacyof MICA(eveninthecasewhenthereis nonoise).

Thepoint wetrytomakefrom thiscasestudy isthat,inpracticalapplications,due to measurement errors, we only get to work with A={Ai}that are, in general,only approximately block-diagonalizableand, intheend,anapproximate block-diagonalizer W of A gets computed. In other words, we usually don’t have A which is known block-diagonalizable in theory but what we do have is A which may or may not be block-diagonalizableandforwhichwehaveanapproximateblock-diagonalizerW.Then how far this W is from the exactdiagonalizer W of A becomes acentral question, in order to gaugethe quality of W. This is whatwe set out to do inthis paper. Our result isan upperbound onthemeasure in(1.5). Suchanupper boundwill alsohelpus understandwhataretheinherentfactorsthataﬀectthesensitivityof jbdp.

3 Othercumulantscanalsobeconsidered.

(7)

1.4. Related works

Thoughtremendouseffortshavegonetosolvejdp/jbdp,theirperturbationproblems had receivedlittle or no attention inthe past. Infact, today there are only ahandful articleswrittenontheperturbationsofjdponly.Foro-jdp,Cardoso[1] presentedafirst orderperturbationboundforasetofcommutingmatrices,andtheresultwaslatergener- alizedbyRusso[16].Forgeneraljdp,usinggradientflows,Afsari[17] studiedsensitivity viacostfunctionsandobtainedfirstorderperturbationboundsforthediagonalizer.Shi andCai[18] investigatedanormalizedjdpthroughaconstrainedoptimizationproblem, and obtained an upper bound on certain distance between an approximate diagonal- izerof aperturbed optimizationproblem andanexactdiagonalizer oftheunperturbed optimizationproblem.

jbdpcanalsoberegardedasaparticularcaseoftheblockterm decomposition(BTD) ofthirdordertensors[19–22].Theuniquenessconditionsoftensordecompositions,which is stronglyconnected to the sensitivity oftensor decompositions, receivedmuch attention recently (see, e.g., [20,23–30]). However,as to the perturbation theory for tensor decompositions,despite itsimportance,few resultsexist.Recently in[31] and[32], the condition numbers for the so-called canonical polyadic decomposition (CPD) and join decompositionproblem (includingtheWaringdecomposition,andsomespeciﬁctypesof BTD,etc.) are investigated. Nonetheless, morestudies arein needinthe perturbation theoryforvarioustypesoftensordecompositions.

1.5. Ourcontributionand theorganizationof this paper

A biggest reasonas to whyno available perturbation analysis forjbdp is, perhaps, dueto lackingperfectwaysto uniquelydescribe block-diagonalizers,nottomentionno availableuniquenesscondition to nailthem down, unlikemany othermatrix perturbation problems surveyed in [33]. Quite recently, inthe sense of Definition 1.2, Caiand Liu[9] establishednecessary andsufficient conditionsfor ajbdpto be uniquelyblock- diagonalizable.Theseconditionsarethecornerstoneforourcurrentinvestigationinthis paper.Unliketheresultsinexistingliteratures,theresultinthispaperdoesnotinvolve any cost function, which makes it widely applicable to any approximate diagonalizer computed from min/maximizing a cost function. The result also reveals the inherent factorsthataffect thesensitivity of jbdp.

Therestofthispaperisorganizedasfollows. Insection2,wediscuss propertiesofa uniquelyblock-diagonalizablejbdpandintroducetheconceptsofthemoduliofunique- nessandnon-divisibilitythatplaykeyrolesinourlaterdevelopment.Ourmainresultis presentedinsection3,alongwithdetaileddiscussionsonitsnumerousimplications.The proofofthemainresultisratherlongandtechnicalandthusisdeferredtosection4.We validate ourtheoretical contributions bynumerical tests reportedinsection5. Finally, concludingremarksaregiveninsection6.

(8)

Notation.R^m^×ⁿ is theset ofall m×nreal matricesand R^m=R^m^×¹. In isthen×n identitymatrix,and0m×nisthem-by-nzeromatrix.Whentheirsizesareclearfromthe context, we maysimply write I and 0. The symbol ⊗denotes the Kronecker product.

Theoperationvec(X) turnsamatrixX intoacolumnvectorformedbytheﬁrstcolumn of X followed by its second column and then its third column and so on. Inversely, reshape(x,m,n) turns themn-by-1 vectorxinto anm-by-nmatrixinsuchaway that reshape(vec(X),m,n)=X for anyX ∈R^m^×ⁿ.Thespectralnorm andFrobeniusnorm of amatrixaredenotedby· 2and · F,respectively.Forasquare matrixA,eig(A) is the set ofall eigenvaluesof A, counting algebraic multiplicities.For convenience, we will agreethatanymatrixA∈R^m^×ⁿ hasnsingularvaluesand σ_min(A) isthesmallest oneamongthem.κ2(A)= _σ^A²

min(A) denotesthematrixspectralconditionnumber.

2. Uniquelyblock-diagonalizable jbdp

We begin by ﬁxing two universal notations, throughout the rest of this paper, for the sets ofmatrices of interest.Let A={A_i}^mi=1 be the set ofm matrices, where each Ai∈Rⁿ^×ⁿ. WhenAisτn-blockdiagonalizable,i.e., (1.3) holds,wedeﬁnematrixsets

Aj ={A^(jj)_i }^mi=1 forj= 1,2, . . . , t= card(τn). (2.1) In[9],aclassificationof jbdpisproposed.Among allandbesidestheoneinsubsec- tion 1.1, there is the so-calledgeneral jbdp(gjbdp)for A for which apartition τ_n is notgiven butinsteaditasksfor findingapartitionτn withthelargestcardinality such thatA isτn-block-diagonalizableand at thesametimeaτn-block-diagonalizer.Via an algebraicapproach,necessaryandsufficientconditions[9,Theorem2.5] areobtainedfor theuniquenessof(equivalent)block-diagonalizersofthegjbdpforA.Asacorollary,we havethefollowingresult.

Theorem2.1([9]).Given partitionτ_n ofn,supposethat thejbdpof Aisτ_n-blockdiag- onalizableandW isitsτn-block-diagonalizersatisfying (1.3),andassumethatevery Aj

cannotbefurtherblockdiagonalized,⁴i.e.,foranypartitionτnj ofnj withcard(τnj)≥2, Aj isnotτ_n_j-block-diagonalizable.ThenthejbdpofAisuniquelyτ_n-block-diagonalizable if andonly ifthematrix

M_jk= m i=1

_I

nk⊗

(A^(jj)_i )^TA^(jj)_i +A^(jj)_i (A^(jj)_i )^T

A^(kk)_i ⊗A^(jj)_i +(A^(kk)_i )^T⊗(A^(jj)_i )^T A^(kk)_i ⊗A^(jj)_i +(A^(kk)_i )^T⊗(A^(jj)_i )^T

(A^(kk)_i )^TA^(kk)_i +A^(kk)_i (A^(kk)_i )^T

⊗I_nj

(2.2)

is nonsingularforall1≤j < k≤t.

4 FortheMICAmodel,thisassumptionisequivalenttosaythateachcomponentsj hasnolowerdimen- sionalindependentcomponent.

(9)

ThefollowingsubspaceofRⁿ^×ⁿ N (A) :=

Z∈Rⁿ^×ⁿ : A_iZ−Z^TA_i= 0 for 1≤i≤m

(2.3) hasplayedanimportantroleintheproofof[9,Theorem2.5],anditwillalsocontribute toourperturbationanalysislaterinabigway.

Next,letusexaminesomefundamentalpropertiesofZ ∈N (A) with

A_i= diag(A⁽¹¹⁾_i , . . . , A^(tt)_i ) for 1≤i≤m (2.4) already.AnyZ∈N (A) satisﬁes

diag(A⁽¹¹⁾_i , . . . , A^(tt)_i )Z−Z^Tdiag(A⁽¹¹⁾_i , . . . , A^(tt)_i ) = 0 for 1≤i≤m. (2.5) Partition Z conformally as Z = [Z_jk], where Z_jk ∈ Rⁿ^j^×ⁿ^k. Blockwise, (2.5) can be rewrittenas

A^(jj)_i Z_jk−Z_kj^TA^(kk)_i = 0 for 1≤i≤m, 1≤j, k≤t. (2.6) Theseequationscanbe decoupledintotwosetsofmatrixequations.Theﬁrstset is,for 1≤j ≤t,

A^(jj)_i Zjj−Z_jj^TA^(jj)_i = 0 for 1≤i≤m, (2.7a) andthesecondset is,for1≤j < k≤t,

A^(jj)_i Z_jk−Z_kj^TA^(kk)_i = 0, A^(kk)_i Z_kj−Z_jk^TA^(jj)_i = 0 for 1≤i≤m. (2.7b) Considerﬁrst (2.7b) for1≤j < k ≤t. Withthe helpof the Kronecker product (see, e.g.,[34]),theyareequivalentto

Gjk

vec(Zjk)

−vec(Z_kj^T)

= 0, (2.8a)

where

G_jk=

⎡

⎢⎢

⎣

Ink⊗A^(jj)₁ (A^(kk)₁ )^T⊗Inj

Ink⊗(A^(jj)₁ )^T A^(kk)₁ ⊗Inj

... ...

I_n_k⊗A^(jj)m (A^(kk)m )^T⊗I_n_j Ink⊗(A^(jj)m )^T A^(kk)m ⊗Inj

⎤

⎥⎥

⎦

. (2.8b)

(10)

Notice that Mjk deﬁned in (2.2) simply equals to G^T_jkGjk. Thus, according to Theo- rem 2.1,Ais uniquelyτn-block-diagonalizable ifand onlyifthesmallestsingularvalue σmin(Gjk)>0,providedallAj cannotbefurtherblockdiagonalized.

Next,wenote that(2.7a) isequivalentto

G_jjvec(Z_jj) = 0, (2.9a)

where

G_jj=

⎡

⎢⎣

Inj ⊗A^(jj)₁ −

(A^(jj)₁ )^T⊗Inj

Πj

... Inj ⊗A^(jj)m −

(A^(jj)m )^T⊗Inj

Πj

⎤

⎥⎦, (2.9b)

and Π_j ∈ Rⁿ²^j is the perfect shuﬄe permutation matrix [35, Subsection 1.2.11] that enablesΠjvec(Z_jj^T)= vec(Zjj).

Theorem 2.2.Suppose A is already in the jbd form with respect to τn = (n1,. . . ,nt), i.e., Ai aregivenby (2.4).Thefollowingstatements hold.

(a) G_jjvec(I_n_j)= 0,i.e.,G_jj isrank-deﬁcient;

(b) Aj cannot be further block-diagonalized if and only if for any Zjj ∈ N (Aj), its eigenvalues areeitherasinglerealnumber orasinglepairoftwocomplexconjugate numbers.

(c) If dim_N (Aj)= 1 whichmeanseither n_j= 1 orthesecond smallestsingularvalue of G_jj ispositive, thenAj cannotbe furtherblock-diagonalized.

Proof. Item(a)holds becauseZ =Inj clearlysatisﬁes(2.7a).

Foritem (b),wewillprovebothsuﬃciencyandnecessitybycontradiction.

(⇒) Suppose there exists a Z_jj ∈ N (Aj) such that its eigenvalues are neither a single real numbernor asingle pairof two complex conjugate numbers.Then Z_jj can be decomposed into Z_jj =W_jdiag(D₁^(j),D^(j)₂ )W_j⁻¹, where W_j, D^(j)₁ , D^(j)₂ are all real matricesandeig(D₁^(j))∩eig(D^(j)₂ )=∅.Thensubstitutingthedecompositioninto(2.7a), we can conclude that W_j^TA^(jj)_i Wj for i = 1,2,. . . ,m are all block-diagonal matrices, contradicting thatAj cannotbefurtherblockdiagonalized.

(⇐) Assume, to the contrary, that Aj canbe furtherblock-diagonalized, i.e., there exists a nonsingular Wj such that W_j^TA^(jj)_i Wj = diag(B_i^(j1),B_i^(j2)), where B_i^j1, B_i^(j2) are ofordernj1 andnj2,respectively.Then

Z_jj=W_jdiag(γ₁I_n_j1, γ₂I_n_j2)W_j⁻¹∈N (Aj),

where γ₁, γ₂ are arbitrary real numbers. That is that some Z_jj ∈ N (Aj) can have distinct realeigenvalues,acontradiction.

(11)

Lastlyforitem(c),assume,tothecontrary,thatAjcanbefurtherblock-diagonalized.

Withoutloss ofgenerality,wemayassumethatthere existsanonsingularmatrixWj∈ Rⁿ^j^×n^j such that W_j^TA^(jj)_i Wj = diag(A^(jj1)_i ,A^(jj2)_i ) for i = 1,2,. . . ,m, where A^(jj1)_i and A^(jj2)_i are respectively of order nj1 and nj2. Then (2.7a) has at least two linearly independentsolutionsW_jdiag(I_n_j1,0)W_j⁻¹, W_jdiag(0,I_n_j2)W_j⁻¹. Therefore,(2.9a) has twolinearlyindependentsolutions,whichimpliesthatthesecondsmallestsingularvalue ofthecoeﬃcientmatrixG_jj mustbe 0,acontradiction. 2

In view of Theorems 2.1 and 2.2, we introduce the moduli of uniqueness and non- divisibilityforτn-block-diagonalizableA.

Deﬁnition2.3.SupposethatAisτn-block-diagonalizableandletW ∈Wτnbeaτn-block- diagonalizerofAsuchthat(1.3) holds.

(a) The modulus of uniqueness of the jbdp for A with respective to the τ_n-block- diagonalizerW isdeﬁnedby

ω_uq≡ω_uq(A;W) = min

1≤j<k≤tσ_min(G_jk), (2.10) whereGjk isgivenby(2.8b).

(b) Suppose that none of Aj can be further block-diagonalized. The modulus of non- divisibility ω_nd ≡ ω_nd(A;W) of the jbdp for A with respective to the τ_n-block- diagonalizerW isdeﬁnedbyω_nd=∞ifτ_n = (1,1,. . . ,1) and

ω_nd= min

nj>1{the smallest nonzero singular value ofG_jj} (2.11) otherwise,where G_jj isgivenby(2.9b).

Notethenotionofthemodulusofnon-divisibilityisdeﬁnedundertheconditionthat noneofAj canbefurtherblock-diagonalized.Itisneededbecauseinorderfor(2.11) to be well-deﬁned,we needto makesure thatG_jj has atleast onenonzerosingularvalue inthecasewhenn_j >1.Indeed,G_jj= 0 whenevern_j>1,ifnoneofAj canbefurther block-diagonalized.To see this, we note G_jj = 0 impliesthat any matrix Z_jj of order nj isasolutionto (2.7a) and thus A^(jj)_i for 1≤i≤m arediagonal, whichmeansthat Ajcanbefurther(block)diagonalized.ThiscontradictstheassumptionthatnoneofAj

canbe furtherblock-diagonalized.

ThepropositionbelowpartiallyjustiﬁesDeﬁnition2.3.

Proposition2.4.SupposethatAisτ_n-block-diagonalizableandletW ∈W_τ_nbeaτ_n-block- diagonalizer of Asuch that (1.3)holds.Suppose dim_N (Aj)= 1 forall1≤j ≤t,and letσ^(j)₋₂ bethesecondsmallestsingularvalueofG_jj whenevern_j >1.Thenthefollowing statementholds.

(12)

(a) Ais uniquelyτn-block-diagonalizableifωuq(A;W)>0.

(b) None ofAj can befurtherblock-diagonalizedand ω_nd≡ω_nd(A;W) = min

nj>1σ^(j)₋₂>0.

Remark2.5.A fewcommentsareinorder.

(a) Thedeﬁnitionofωuqisanaturalgenerationofthemodulusofuniquenessin[18] for jdp(i.e.,when τn= (1,1,. . . ,1)).

(b) By Theorem 2.2(a), we know the smallest singularvalue of Gjj is always 0.Thus it seemsnatural thatindeﬁning ω_nd in(2.11),one wouldexpect using thesecond smallestsingularvalueofG_jj.ItturnsoutthatthereareexamplesforwhichAjcan- notbe furtherblock-diagonalized andyetdimN(Aj)= 2,i.e.,thesecond smallest singularvalueofGjjisstill0.Asanexample,weconsiderAi=

αi βi

βi −αi

∈R²^×²for i= 1,2,. . . ,m,whereβ_i = 0 foralliandα_i/β_i’sarenotaconstant.ThenAcannot be simultaneouslydiagonalizedbut N(A)= span{I2,

0 1

−1 0

},i.e.,dimN (A)= 2.

The moduli ωuq and ωnd, as deﬁned in Deﬁnition 2.3, depend on the choiceof the diagonalizer W. But, as the following theorem shows, inthe case when A is uniquely τn-block-diagonalizable,theirdependencyondiagonalizerW ∈Wτn canbe removed.

Theorem 2.6. If A isuniquely τn-block-diagonalizable, then ωuq andωnd are both inde- pendent of thechoiceof diagonalizersW ∈Wτn.

Proof. Let W ∈ Wτn be a τn-block-diagonalizer of A. Then all possible τn-block- diagonalizers ofAfromW_τ_n taketheform W =W DΠ forsomeD∈D_τ_n and Π∈P_τ_n. Wewill showthatω_uq(A;W)=ω_uq(A;W) andω_nd(A;W)=ω_nd(A;W).

WecanwriteD= diag(D1,. . . ,Dt),whereDj ∈Rⁿ^j^×ⁿ^j.AllDjareorthogonalsince W,W∈Wτn.Wehave

W^TA_iW = Π^Tdiag(D^T₁A⁽¹¹⁾_i D₁, . . . , D^T_tA^(tt)_i D_t)Π

= diag(Π^T₁D^T₁A⁽_i ¹¹⁾D1Π1, . . . ,Π^T_tD^T_tA⁽_i ^t^t⁾DtΠt),

where{1,2,. . . ,t}isapermutationof{1,2,. . . ,t},andΠj isapermutationmatrixof order nj.Denote byA^(jj)_i = Π^T_jD^T_jA⁽_i ^j^j⁾DjΠj,anddeﬁne Gjk,accordinglyasGjk in (2.8b),butintermsofA^(jj)_i andA^(kk)_i ,Gjj,accordinglyas Gjj in(2.9b),butinterms of A^(jj)_i .Thenbycalculations,wehave

Gjk =

I2m⊗(ΠkDk)^T⊗(ΠjDj)^T) Gjk

I2⊗(ΠkDk)⊗(ΠjDj)) , G_jj =

I_m⊗(Π_jD_j)^T⊗(Π_jD_j)^T) G_jj

(Π_jD_j)⊗(Π_jD_j) ,

(13)

whichimply thatthesingularvaluesof Gjk andGjj arethesameas those ofGjk and Gjj,respectively.Theconclusionfollows. 2

3. Mainperturbationresults

Inthis section,we presentourmain theorem,along withsomeillustratingexamples anddiscussionsonitsimplications.Wedeferitslengthy prooftosection4.

3.1. Set upthestage

In what follows, we will set up the groundwork for our perturbation analysis and explainsomeof ourassumptions.

RecallA={Ai}^mi=1 which istheunperturbed matrixset,where allAi ∈Rⁿ^×ⁿ, and τn = (n1,. . . ,nt) isapartitionofnwitht= card(τn)≥2.Weassumethat

Aisτn-block-diagonalizable,W ∈Wτnisitsτn-block-diagonalizer

suchthat(1.3) holds,and dimN (Aj)= 1 forallj. (3.1) The assumption that dimN(Aj) = 1 implies that Aj cannot be further block- diagonalizedbyTheorem2.2(c).

SupposethatAisperturbedto A={Ai}^mi=1≡ {Ai+ ΔAi}^mi=1, andlet AF:=

_m

i=1

Ai²F

1/2

, δ_A:=

_m

i=1

ΔAi²F

1/2

. (3.2)

Previously, we commented on that, more often than not, a generic jbdp may not be τn-block-diagonalizable for m ≥ 3. This means that A may not be τn-block- diagonalizableregardlesshow tinyδ_A maybe.Forthis reason,wewillnotassumethat Aisτ_n-block-diagonalizable, but,instead, it hasan approximateτ_n-block-diagonalizer W ∈W_τ_n inthesensethat

allW^TAiWare nearlyτn-block-diagonal. (3.3) Doing so has two advantages. Firstly, it serves all practical purposes well, because in anylikelypracticalsituationsweusuallyendupwithAwhichisclosetosomeτn-block- diagonalizableA, which is not actually availabledue to unavoidable noises such as in MICA,and,atthesametime,anapproximateτ_n-block-diagonalizercanbemadeavail- ablebycomputation.Secondly,itisgeneralenoughtocoverthecasewhenthejbdpfor Aisactuallyτ_n-block-diagonalizable.

Wehaveto quantifythe statement(3.3) inorder toproceed. Tothis end,wepicka diagonal matrix Γ = diag(γ1In1,. . . ,γtInt), where γ1,. . . ,γt are distinct real numbers withmaxj|γj|= 1,anddeﬁne theτn-block-diagonalzablilityresiduals

(14)

Ri=W^TAiWΓ−ΓW^TAiW fori= 1,2, . . . , m. (3.4) NoticeBdiag_τ_n(R_i)= 0 alwaysnomatterwhatΓ is.Therationalebehinddeﬁningthese residualsisinthefollowingproposition.

Proposition 3.1.W^TA_iW is τ_n-block-diagonal, i.e., OﬀBdiag_τ_n(W^TA_iW) = 0 if and only if Ri= 0.

As farasthis propositionisconcerned,anydiagonal Γ withdistinctdiagonalentries suﬃces.Butlater,wewillseethatourupperbounddependsonΓ,whichmakesuswonder whatthe bestΓ isfor thebestpossible bound. Unfortunately,this isnot atrivialtask and wouldbeaninteresting subjectfor futurestudies.Laterinournumericaltests, we useafewrandomΓ alongwith thefollowing one

Γ = diag(−In1,(−1 + 2

t−1)I_n₂,(−1 + 4

t−1)I_n₃, . . . , Int). (3.5) Nonetheless, westillkeepΓ asaparameterto chooseinourmain resultinhopethatit maycometohelpincertaincircumstance.Werestrictγitorealnumbersforconsistency considerationsinceAandAareassumedreal.Alldevelopmentsbelowworkequallywell even iftheyarecomplex.Set

g= min

j=k|γ_j−γ_k|, ˜r= _m

i=1

R_i²F

1/2

. (3.6)

The quantityr˜willbe used tomeasure how goodW is inapproximately diagonalizing A. Infact,itcanbeveriﬁedthat

g _m

i=1

OﬀBdiag_τ_n(W^TAiW)²F

¹₂

≤r˜

≤2 _m

i=1

OﬀBdiag_τ_n(W^TA_iW)²F

¹₂ , (3.7)

whichimpliesthat˜riscomparabletom

i=1OﬀBdiag_τ_n(W^TAiW)²F

¹₂

,providedgis nottootiny.Forthechoice(3.5),g= 2/(t−1).

The next proposition is due to an anonymous referee and it improves our earlier version.

Proposition3.2. Wisanexactτ_n-block-diagonalizerofthematrixset{A_i+E_i}^mi=1 with relative backwarderror