HAL Id: hal-01862241
https://hal.archives-ouvertes.fr/hal-01862241
Submitted on 27 Aug 2018
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
ON JEVD OF SEMI-DEFINITE POSITIVE
MATRICES AND CPD OF NONNEGATIVE
TENSORS
Rémi André, Laurent Albera, Xavier Luciani, Eric Moreau
To cite this version:
Rémi André, Laurent Albera, Xavier Luciani, Eric Moreau. ON JEVD OF SEMI-DEFINITE
POSI-TIVE MATRICES AND CPD OF NONNEGAPOSI-TIVE TENSORS. IEEE SAM 2016, Jul 2016, Rio de
Janeiro, Brazil. pp.7569738. �hal-01862241�
ON JEVD OF SEMI-DEFINITE POSITIVE MATRICES AND
CPD OF NONNEGATIVE TENSORS
Rémi André
(1,2), Laurent Albera
(3,4), Xavier Luciani
(1,2)and Eric Moreau
(1,2)(1)
Aix Marseille Université, CNRS, ENSAM, LSIS, UMR 7296, 13397 Marseille, France.
(2)Université de Toulon, CNRS, LSIS, UMR 7296, 83957 La Garde, France.
(3)
Inserm, U1099, Rennes F-35000, France.
(4)
Université de Rennes 1, LTSI, Rennes F-35000, France.
ABSTRACT
In this paper, we mainly address the problem of Joint Eigen-Value Decomposition (JEVD) subject to nonnegative con-straints on the eigenvalues of the matrices to be diagonal-ized. An efficient method based on the Alternating Direc-tion Method of Multipliers (ADMM) is designed. ADMM provides an elegant approach for handling nonnegativity constraints, while taking advantage of the structure of the objective function. Numerical tests on simulated matrices show the interest of the proposed method for low Signal-to-Noise Ratio (SNR) values when the similarity transformation matrix is ill-conditioned. The ADMM was recently used for the Canonical Polyadic Decomposition (CPD) of nonneg-ative tensors leading to the ADMoM algorithm. We show through computer results that DIAG+, a semi-algebraic CPD
method using our ADMM-based JEVD+algorithm, will give
a better estimation of factors than ADMoM in the presence of swamps. DIAG+also appears to be less time-consuming
than ADMoM when low-rank tensors of high dimensions are considered.
1. INTRODUCTION
The Canonical Polyadic Decomposition (CPD) of multi-way arrays plays an important role in various application areas such as medical diagnosis [1]. A large number of CPD meth-ods were proposed [2–6]. Among them, semi-algebraic (or di-rect) methods showed a very good performance (e.g. fast con-vergence speed, robustness with respect to initialization and overfactoring) [7]. Semi-algebraic techniques aim at refor-mulating the CPD as a matrix factorization problem of lower dimension. For instance, the CFS (Closed Form Solution) [8] and DIAG (DIrect AlGorithm) [7] algorithms formulate the CPD as a JEVD (Joint EVD) problem while the SSD-CP (CP decomposition based on Simultaneous Schur Decomposition) method formulates it as a problem of joint upper triangular-ization by an orthogonal similarity [9].
In some applications such as fluorescence spectroscopy [7], the multi-way array to be decomposed and its loading
matrices are nonnegative. Hence the need for efficient non-negative CPD methods. Note that the use of nonnegativity was shown to ensure the existence of optimal solutions to the CPD approximation problem [10]. Then we may wonder whether the semi-algebraic CPD methods can be modified in order to take into account the nonnegativity of the data. If tensors are nonnegative, then it appears that the correspond-ing matrices to be jointly diagonalized or triangularized have nonnegative eigenvalues. In other words, modifying the CFS and DIAG techniques in order to take into account nonnega-tivity requires to solve the following JEVD+problem:
Problem 1. Given K semi-definite positive matrices M(k)in RR×R, factorize them asM(k)= AD(k)A−1
subject to:
∀k ∈ {1, . . . , K}, D(k)∈ D+
whereD+is the subset of diagonal matrices ofRR×R such
thatDiag(D+) = [0, +∞[R.
In order to solve problem 1, we proposed an approach based on the Alternating Direction Method of Multipliers (ADMM) [11]. ADMM provides an elegant approach for handling nonnegativity constraints, while taking advantage of the structure of the objective function. Numerical tests on simulated matrices show the interest of the proposed method for low Signal-to-Noise Ratio (SNR) values when the simi-larity transformation matrix is ill-conditioned. The ADMM was recently used for the Canonical Polyadic Decomposition (CPD) of nonnegative tensors leading to the ADMoM algo-rithm [12]. We show through computer results that DIAG+, a
semi-algebraic CPD method using our ADMM-based JEVD+
algorithm, will give a better estimation of factors than AD-MoM in the presence of swamps. DIAG+also appears to be
less time-consuming than ADMoM when low-rank tensors of high dimensions are considered.
2. ALGORITHMS
The purpose of this section is first to show the interest in solv-ing the JEVD+problem for the CPD of nonnegative tensors.
Secondly, an efficient ADMM-based JEVD+solution is
pro-posed.
2.1. The CPD of nonnegative tensors
The Nonnegative CPD (NCPD) of a three-way array T ∈ [0, +∞[N1×N2×N3 is given by: T = R X r=1 f(1)r ◦ f(2)r ◦ f(3)r (1) where R = rank(T ) denotes the rank of T and where:
F(1) = [f(1)1 , . . . , f(1)R ] (2) F(2) = [f(2)1 , . . . , f(2)R ] (3) F(3) = [f(3)1 , . . . , f(3)R ] (4)
are the so-called nonnegative loading matrices of T .
The first unfolding matrix T(1)of size (N1× N2N3) can
be expressed as: T(1)= F(1)F(3) F(2) T (5) with: F(3) F(2) = hf(3) 1 ⊗ f (2) 1 , . . . , f (3) P ⊗ f (2) P i (6) = hΦ(1)F(2)T, . . . ,Φ(N3)F(2)Ti T (7)
according to [7] where Φ(n)is a (R×R) diagonal matrix built
from the n-th row of matrix F(3). The Singular Value Decom-position (SVD) truncated at order R of T(1)is given by:
T(1) = V Σ WT (8)
Then there is a non-singular matrix A of size (R × R) such that: V Σ = F(1)A−1 (9) WT = AF(3) F(2) T (10)
As shown in [7], W has the following matrix block structure:
W =hΓ(1), . . . , Γ(N3)i T
(11)
where the N3matrices Γ(n)of size (R×N2) are given by:
Γ(n)= A Φ(n)F(2)T (12)
Then we can define N(3)(N(3) − 1) matrices M(n1,n2) as
follows:
M(n1,n2) = Γ(n1)Γ(n2)]
= AΦ(n1)F(2)T(F(2)T)](Φ(n2))−1A−1
M(n1,n2) = AD(n1,n2)A−1 (13)
where for each couple (n1, n2), the (R × R) matrix D(n1,n2)
given by:
D(n1,n2)= Φ(n1)(Φ(n2))−1 (14)
is diagonal. Consequently, the DIAG algorithm [7] consists in reformulating the CPD problem (1) as the JEVD problem given by (13), which appears to be an optimization problem of smaller size. Once matrix A is computed, the matrices F(1)and F(3) F(2)can be derived from the SVD of T(1) using equations (9) and (10). As explained in [7], the column vectors of F(3)and F(2)can be derived from the columns of F(3) F(2)using the HOSVD [13].
By considering nonnegative loading matrices, it clearly appears from (14) that the N(3)(N(3)− 1) matrices M(n1,n2)
to be jointly diagonalized have nonnegative eigenvalues. Un-fortunately, the JEVD methods [7, 14–16], such as the ef-ficient Jacobi-like JDTM (Joint Diagonalization algorithm based on Targeting hyperbolic Matrices) technique based on the polar matrix factorization, did not take into account such a property. Consequently, let’s see in the following subsection how to overcome this drawback by proposing a nonnegative JEVD method.
2.2. An ADMM-based JEVD+method
The JEVD+problem (see problem 1 defined in introduction)
can be reformulated as the following minimization problem:
minimize
A,B,C Ψ(A, B, C) subject to C = ˜C (15)
with: Ψ(A, B, C) = ||X(1)− A (C B)T||2 F +α2||ABT− I R||2F+ ιC+( ˜C) (16)
where X(1) = [M(1), · · · , M(K)] is the first unfolding ma-trix of a third order tensor X for which A, B and C are the three loading matrices, where the k-th row of C is the di-agonal vector of D(k) and where B = A−T
with M(k) = AD(k)A−1 the k-th matrix to be jointly diagonalized (see problem 1). ιC+ is the indicator function of set of (K × R)
matrices with nonnegative components. The indicator of a set S of a Hilbert space H is defined as (∀x ∈ H) ιS(x) = 0 if
x ∈ S, and +∞ otherwise.
The augmented Lagrangian function associated with (15) is given by Lρ(A, B, C, ˜C, Λ) = Ψ(A, B, C) +||Λ (C − ˜C))||2F +ρ 2||C − ˜C|| 2 F (17)
where the (K × R) matrix Λ is the matrix Lagrangian mul-tiplier, where is Hadamard product operator and where the variable ρ ∈]0, +∞[ is a penalty parameter for the linear equality constraints.
An ADMM-like algorithm for solving (15) is obtained by successively i) minimizing the augmented Lagrangian func-tion (17) with respect to A, B, C and ˜C, one variable at a time while setting the others to their most recent values, and ii) updating the multiplier Λ using an ascent gradient rule. By using (16) and (17), it turns out that the alternating minimiza-tions of (17) have the following closed form soluminimiza-tions:
A = X(1)(C B) + αB (C B)T(C B) + αBTB−1 B = X(2)(C A) + αA (C A)T(C A) + αATA−1 C = X(3)(B A) + ρ ˜C − Λ (B A)T(B A) + ρI R −1 ˜ C = PC+C + 1 ρΛ
where X(2)and X(3)are the second and the third unfolding matrix of X , respectively, and where PS denote the
projec-tion onto a convex set S. The gradient of the augmented La-grangian function Lρwith respect to the multiplier Λ is given
by ρ(C − ˜C). Regarding the penalty parameter ρ, it is com-puted as explained in [12]: ρit+1= τ(1)ρ it if||Pit||F > µ||Dit||F ρit/τ(2) if||Dit||F > µ||Pit||F ρit otherwise. (18)
where Pit = Cit− ˜Cit and Dit = ρ( ˜Cit− ˜Cit−1) with
µ > 1, τ(1) > 1 and τ(2) > 1. As fas as α is concerned, we
initialize it to 1 and we divide it by 2 every 10 iterations.
3. NUMERICAL RESULTS
Now, we investigate the behavior of the proposed JEVD+
method and its use in the DIAG algorithm leading to the DIAG+technique.
To build simulated data sets, we firstly randomly gener-ated L = 100 diagonal matrices D(k) and a non-singular matrix A of size (R × R) with R = 3 by using a uniform distribution over [0, 1] and a standard normal distribution, re-spectively. Then we built L sets of K matrices M(k) to be jointly diagonalized such that:
∀k, M(k)= AD (k)A−1 ||AD(k)A−1|| F + σ N (k) ||N(k)|| F (19)
where N(k) models a Gaussian random noise and σ allows us to adjust the noise power to the desired SNR. We vary the SNR value from 0 to 70 dB. The condition number of A was adjusted by means of the following β parameter:
∀r > 1, ar= βa1+ (1 − β)u[0,1] (20)
Fig. 1.Median value of the mCcriterion at the output of the JDTM
and JEVD+methods as a function of SNR.
where ar denotes the r-th column vector of A and where
u[0,1] is a R-dimensional vector generated using a uniform
distribution over [0, 1]. The β parameter was fixed to 0.65. Secondly, we randomly generated L = 100 loading ma-trices F(1), F(2)and F(3)of size (N(1)×R), (N(2)×R) and
(N(3)× R), respectively, by using a uniform distribution over
[0, 1] with R = 2 and N(1)= N(2) = N(3)= 100. The
con-dition number was also adjusted using the strategy described by (20). Two β values were considered: β = 0 and β = 0.45. Then we built L sets of third order tensors T to be canoni-cally decomposed such that their first loading matrix T(1) is defined by : T(1) = F (1) F(3) F(2)T ||F(1) F(3) F(2)T ||F + σ Q (k) ||Q(k)||F (21)
where Q(k)models a Gaussian random noise and σ allows us to adjust the noise power to the desired SNR. The SNR value was fixed to 100 dB.
Regarding the first performance criterion, we used an er-ror between the calculated and actual matrices. More partic-ularly, we computed the normalized Frobenius norm mG of
the difference between the actual matrix G and its estimate bG. Note that the indetermination up to scale and permutation was corrected before the computation of mG. The second
perfor-mance criterion was the CPU time. It is noteworthy that all experiments were run with Matlab R2014a on Intel® Xeon® 4-Core 2.8GHz, equipped with 32GB RAM.
Figures 1 and 2 display the median of the mC and mA
criteria, respectively, at the output of the JDTM and JEVD+
methods as a function of the SNR when the similarity trans-formation matrix is ill-conditioned. Thus figures 1 and 2 clearly show that both the eigenvalues and the eigenvectors are better estimated for low SNR values by taking into ac-count the nonnegativity of the eigenvalues during the opti-mization procedure. More particularly, our ADMM-based JEVD+method gives better results than JDTM for SNR
Fig. 2.Median value of the mAcriterion at the output of the JDTM
and JEVD+methods as a function of SNR.
Table 1 compares the DIAG+ method based on our
ADMM-based JEVD+technique with ADMoM [12] in terms
of performance for low-rank tensors of high dimensions. Note that the same stopping criteria were used for both ADMM procedures, say a threshold of 10−6 and a maximum num-ber of iterations of 1000. When the three loading matrices are well-conditioned, both methods give a similar estimation accuracy while DIAG+ is ten times faster than ADMoM.
When the three loading matrices are ill-conditioned, not only DIAG+ is still faster than ADMoM, but it estimates the
loading matrices with a better accuracy. In fact, the DIAG procedure allows us to algebraically reformulate the original CPD problem as an optimization problem of smaller size, hence the gain of time. But it also allows us to reformulate the original CPD problem as a well-conditioned problem, hence the better robustness with respect to ill-conditioned loading matrices.
methods factor collinearity Error CPU time index (β) (10−3) (s)
ADMoM 0 0.0003 3.72
DIAG+ 0 0.0004 0.29
ADMoM 0.45 0.181 13
DIAG+ 0.45 0.0017 0.85
Table 1. Performance of ADMoM versus DIAG+.
4. CONCLUSION
In this paper, we showed how the ADMM algorithm could be used to compute the solution of the JEVD+problem and
to handle the nonnegativity constraint of the eigenvalues of semi-definite positive matrices. Numerical tests have shown the good behavior of the proposed ADMM-based JEVD+
ap-proach for low SNR values when the similarity transforma-tion matrix is ill-conditransforma-tioned. The ADMM was recently used for the CPD of nonnegative tensors leading to the ADMoM algorithm [12]. We showed through computer results that
DIAG+, a semi-algebraic CPD method using our
ADMM-based JEVD+ algorithm, gave a better estimation of factors
than ADMoM in the presence of swamps. DIAG+ also
ap-peared to be less time-consuming than ADMoM when low-rank tensors of high dimensions were considered.
5. REFERENCES
[1] H. BECKER, L. ALBERA, P. COMON, R. GRIBON-VAL, F. WENDLING, and I. MERLET, “Brain source imaging: from sparse to tensor models,” IEEE Signal Processing Magazine, special issue on Brain-Computer Interfaces, vol. 32, no. 6, pp. 100–112, November 2015.
[2] L. DE LATHAUWER, B. DE MOOR, and J. VANDE-WALLE, “Computation of the canonical decomposition by means of a simultaneous generalized schur decom-position,” SIAM Journal on Matrix Analysis and Appli-cations, vol. 26, no. 2, pp. 295–327, 2004.
[3] T. G. KOLDA and B. W. BADER, “Tensor decomposi-tions and applicadecomposi-tions,” SIAM Review, vol. 51, no. 3, pp. 455–500, September 2009.
[4] P. COMON, X. LUCIANI, and A. L. F. DE ALMEIDA, “Tensor decompositions, alternating least squares and other tales,” Journal of Chemometrics, vol. 23, April 2009.
[5] P. COMON, “Tensors: a brief introduction,” IEEE Sig-nal Processing Magazine, vol. 31, no. 3, pp. 44–53, May 2014.
[6] A.-H. PHAN, P. TICHAVSKY, and A. CICHOCKI, “CANDECOMP/PARAFAC decomposition of high-order tensors through tensor reshaping,” IEEE Transac-tions On Signal Processing, vol. 61, no. 19, pp. 4847– 4860, 2013.
[7] X. LUCIANI and L. ALBERA, “Canonical polyadic de-composition based on joint eigenvalue dede-composition,” Chemometrics and Intelligent Laboratory Systems, vol. 132, pp. 152–167, March 2014.
[8] F. ROEMER and M. HAARDT, “A semi-algebraic framework for approximate CP decompositions via si-multaneous matrix diagonalization (SECSI),” Elsevier Signal Processing, vol. 93, pp. 2462–2473, 2013.
[9] S. HAJIPOUR SARDOUIE, L. ALBERA, M. B. SHAMSOLLAHI, and I. MERLET, “Canonical polyadic decomposition of complex-valued multi-ways arrays based on simultaneous schur decomposition,” in ICASSP 13, 2013 IEEE International Conference on Acoustics Speech and Signal Processing, Vancouver, Canada, May 26-31 2013, pp. 4178–4182.
[10] L.-H. LIM and P. COMON, “Nonnegative approxima-tions of nonnegative tensors,” Journal of chemometrics, pp. 432–441, 2009.
[11] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Founda-tions and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.
[12] A. P. LIAVAS and N. D. SIDIROPOULOS, “Parallel algorithms for constrained tensor factorization via alter-nating direction method of multipliers,” IEEE Transac-tions on Signal Processing, vol. 63, no. 20, pp. 5450– 5463, 2015.
[13] L. DE LATHAUWER, B. DE MOOR, and J. VANDE-WALLE, “A multilinear singular value decomposition,” SIAM Journal on Matrix Analysis and Applications, vol. 21, no. 4, pp. 1253–1278, 2000.
[14] T. FU and X. GAO, “Simultaneous diagonalization with similarity transformation for non-defective matri-ces,” vol. 4, pp. 1137–1140, May 2006.
[15] R. IFERROUDJENE, K. ABED MERAIM, and A. BE-LOUCHRANI, “A new jacobi-like method for joint di-agonalization of arbitrary non-defective matrices,” Ap-plied Mathematics and Computation, vol. 2, pp. 363– 373, 2009.
[16] X. LUCIANI and L. ALBERA, “Joint eigenvalue de-composition of non-defective matrices based on the lu factorization with application to ICA,” IEEE Transac-tions On Signal Processing, vol. 63, no. 17, pp. 4594– 4608, September 2015.