SEMIDEFINITE RELAXATIONS FOR BEST RANK-1 TENSOR APPROXIMATIONS

JIAWANG NIE AND LI WANG

Abstract. This paper studies the problem of finding best rank-1 approximations for both symmetric and nonsymmetric tensors. For symmetric tensors, this is equivalent to optimizing homogeneous polynomials over unit spheres; for nonsymmetric tensors, this is equivalent to optimizing multiquadratic forms over multispheres. We propose semidefinite relaxations, based on sum of squares representations, to solve these polynomial optimization problems. Their special properties and structures are studied. In applications, the resulting semidefinite programs are often large scale. The recent Newton-CG augmented Lagrangian method by Zhao, Sun, and Toh [SIAM J. Optim., 20 (2010), pp. 1737–1765] is suitable for solving these semidefinite relaxations. Extensive numerical experiments are presented to show that this approach is efficient in getting best rank-1 approximations.

Key words. form, polynomial, relaxation, rank-1 approximation, semidefinite program, sum of squares, tensor

AMS subject classifications. 15A18, 15A69, 90C22

DOI. 10.1137/130935112

1. Introduction. Let $m$ and $n_1, \dots, n_m$ be positive integers. A tensor of order $m$ and dimension $(n_1, \dots, n_m)$ is an array $\mathcal{F}$ that is indexed by integer tuples $(i_1, \dots, i_m)$ with $1 \le i_j \le n_j$ ($j = 1, \dots, m$), i.e.,

$$\mathcal{F} = (\mathcal{F}_{i_1,\dots,i_m})_{1 \le i_1 \le n_1, \dots, 1 \le i_m \le n_m}.$$

The space of all such tensors with real (resp., complex) entries is denoted as $\mathbb{R}^{n_1 \times \cdots \times n_m}$ (resp., $\mathbb{C}^{n_1 \times \cdots \times n_m}$). Tensors of order $m$ are called $m$-tensors. When $m$ equals 1 or 2, they are regular vectors or matrices. When $m = 3$ (resp., 4), they are called cubic (resp., quartic) tensors. A tensor $\mathcal{F} \in \mathbb{R}^{n_1 \times \cdots \times n_m}$ is symmetric if $n_1 = \cdots = n_m$ and

$$\mathcal{F}_{i_1,\dots,i_m} = \mathcal{F}_{j_1,\dots,j_m} \quad \text{whenever } (i_1, \dots, i_m) \sim (j_1, \dots, j_m),$$

where $\sim$ means that $(i_1, \dots, i_m)$ is a permutation of $(j_1, \dots, j_m)$. We define the norm of a tensor $\mathcal{F}$ as

$$(1.1) \qquad \|\mathcal{F}\| = \left( \sum_{i_1=1}^{n_1} \cdots \sum_{i_m=1}^{n_m} |\mathcal{F}_{i_1,\dots,i_m}|^2 \right)^{1/2}.$$

For $m = 1$, $\|\mathcal{F}\|$ is the vector 2-norm, and for $m = 2$, $\|\mathcal{F}\|$ is the matrix Frobenius norm.

Every tensor can be expressed as a linear combination of outer products of vectors.

For vectors $u^1 \in \mathbb{C}^{n_1}, \dots, u^m \in \mathbb{C}^{n_m}$, their outer product $u^1 \otimes \cdots \otimes u^m$ is the tensor in $\mathbb{C}^{n_1 \times \cdots \times n_m}$ such that for all $1 \le i_j \le n_j$ ($j = 1, \dots, m$)

$$(u^1 \otimes \cdots \otimes u^m)_{i_1,\dots,i_m} = (u^1)_{i_1} \cdots (u^m)_{i_m}.$$
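As a small illustration (a sketch assuming NumPy; the vectors below are arbitrary example data), the outer product and the norm (1.1) can be computed directly:

```python
import numpy as np

# Outer product u1 ⊗ u2 ⊗ u3: the (i1, i2, i3) entry is (u1)_{i1}(u2)_{i2}(u3)_{i3}.
u1 = np.array([1.0, 2.0])
u2 = np.array([0.0, 1.0, -1.0])
u3 = np.array([3.0, 4.0])
T = np.einsum('i,j,k->ijk', u1, u2, u3)        # a rank-1 tensor in R^{2x3x2}

# The norm (1.1) is just the 2-norm of the flattened array.
print(np.linalg.norm(T.ravel()))
# For a rank-1 tensor it factors: ||u1 ⊗ u2 ⊗ u3|| = ||u1|| ||u2|| ||u3||.
print(np.linalg.norm(u1) * np.linalg.norm(u2) * np.linalg.norm(u3))
```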

Received by the editors September 3, 2013; accepted for publication (in revised form) by T. Kolda May 23, 2014; published electronically August 28, 2014. The research was partially supported by NSF grant DMS-0844775.

http://www.siam.org/journals/simax/35-3/93511.html

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093 (njw@math.ucsd.edu, liw022@math.ucsd.edu).


For every tensor $\mathcal{F}$ of order $m$, there exist tuples $(u^{i,1}, \dots, u^{i,m})$ ($i = 1, \dots, r$), with each $u^{i,j} \in \mathbb{C}^{n_j}$, such that

$$(1.2) \qquad \mathcal{F} = \sum_{i=1}^{r} u^{i,1} \otimes \cdots \otimes u^{i,m}.$$

The smallest $r$ in the above is called the rank of $\mathcal{F}$ and is denoted as $\operatorname{rank} \mathcal{F}$. When $\operatorname{rank} \mathcal{F} = r$, (1.2) is called a rank decomposition of $\mathcal{F}$, and we say that $\mathcal{F}$ is a rank-$r$ tensor.

Tensor problems have wide applications in chemometrics, signal processing, and high order statistics [6]. For the theory and applications of tensors, we refer to Comon et al. [7], Kolda and Bader [15], and Landsberg [17]. When $m \ge 3$, determining ranks of tensors and computing rank decompositions are NP-hard (cf. Hillar and Lim [11]). We refer to Brachat et al. [1], Bernardi et al. [2], and Oeding and Ottaviani [25] for tensor decompositions. When $r > 1$, the problem of finding best rank-$r$ approximations may be ill-posed, because a best rank-$r$ approximation might not exist, as discovered by De Silva and Lim [10]. However, a best rank-1 approximation always exists. It is also NP-hard to compute best rank-1 approximations. This paper studies best real rank-1 approximations for real tensors (i.e., tensor entries are real numbers). We begin with some reviews on this subject.

1.1. Nonsymmetric tensor approximations. Given a tensor $\mathcal{F} \in \mathbb{R}^{n_1 \times \cdots \times n_m}$, we say that a tensor $\mathcal{B}$ is a best rank-1 approximation of $\mathcal{F}$ if it is a minimizer of the least squares problem

$$(1.3) \qquad \min_{\mathcal{X} \in \mathbb{R}^{n_1 \times \cdots \times n_m},\ \operatorname{rank} \mathcal{X} = 1} \ \|\mathcal{F} - \mathcal{X}\|^2.$$

This is equivalent to a homogeneous polynomial optimization problem, as shown in [9]. For convenience of description, we define the homogeneous polynomial

$$(1.4) \qquad F(x^1, \dots, x^m) := \sum_{1 \le i_1 \le n_1, \dots, 1 \le i_m \le n_m} \mathcal{F}_{i_1,\dots,i_m} (x^1)_{i_1} \cdots (x^m)_{i_m},$$

which is in $x^1 \in \mathbb{R}^{n_1}, \dots, x^m \in \mathbb{R}^{n_m}$. Note that $F(x^1, \dots, x^m)$ is a multilinear form (a form is a homogeneous polynomial), since it is linear in each $x^j$. De Lathauwer, De Moor, and Vandewalle [9] proved the following result.

Theorem 1.1 (see [9, Theorem 3.1]). For a tensor $\mathcal{F} \in \mathbb{R}^{n_1 \times \cdots \times n_m}$, the rank-1 approximation problem (1.3) is equivalent to the maximization problem

$$(1.5) \qquad \max_{x^1 \in \mathbb{R}^{n_1}, \dots, x^m \in \mathbb{R}^{n_m}} \ |F(x^1, \dots, x^m)| \quad \text{s.t.} \quad \|x^1\| = \cdots = \|x^m\| = 1,$$

that is, a rank-1 tensor $\lambda \cdot (u^1 \otimes \cdots \otimes u^m)$, with $\lambda \in \mathbb{R}$ and each $\|u^j\| = 1$, is a best rank-1 approximation for $\mathcal{F}$ if and only if $(u^1, \dots, u^m)$ is a global maximizer of (1.5) and $\lambda = F(u^1, \dots, u^m)$. Moreover, it also holds that

$$\|\mathcal{F} - \lambda \cdot (u^1 \otimes \cdots \otimes u^m)\|^2 = \|\mathcal{F}\|^2 - \lambda^2.$$
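The last identity holds for any unit vectors once $\lambda = F(u^1, \dots, u^m)$, not only at the optimum; optimality only makes the residual as small as possible. A quick numeric check (a sketch assuming NumPy, with a random tensor as stand-in data):

```python
import numpy as np

rng = np.random.default_rng(0)
F = rng.standard_normal((3, 4, 5))                       # a 3-tensor
u = [v / np.linalg.norm(v)                               # random unit vectors
     for v in (rng.standard_normal(n) for n in F.shape)]

lam = np.einsum('ijk,i,j,k->', F, *u)                    # lambda = F(u1, u2, u3)
R = F - lam * np.einsum('i,j,k->ijk', *u)                # residual tensor
print(np.linalg.norm(R)**2)                              # ||F - lam u1⊗u2⊗u3||^2
print(np.linalg.norm(F)**2 - lam**2)                     # equals ||F||^2 - lam^2
```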

By Theorem 1.1, to find a best rank-1 approximation, it is enough to find a global maximizer of the multilinear optimization problem (1.5). There exist methods for finding rank-1 approximations, like the alternating least squares (ALS), truncated higher order singular value decomposition (HOSVD), higher order power method (HOPM), and quasi-Newton methods. We refer to Zhang and Golub [33], De Lathauwer, De Moor, and Vandewalle [8, 9], Savas and Lim [29], and the references therein. An advantage of these methods is that they can be easily implemented. These kinds of methods typically generate a sequence that converges to a locally optimal rank-1 approximation or even just a stationary point. Even in the lucky cases where they get globally optimal rank-1 approximations, it is usually very difficult to verify the global optimality by these methods.

1.2. Symmetric tensor approximations. Let $S^m(\mathbb{R}^n)$ be the space of real symmetric tensors of order $m$ and in dimension $n$. Given $\mathcal{F} \in S^m(\mathbb{R}^n)$, we say that $\mathcal{B}$ is a best rank-1 approximation of $\mathcal{F}$ if it is a minimizer of the optimization problem

$$(1.6) \qquad \min_{\mathcal{X} \in \mathbb{R}^{n \times \cdots \times n},\ \operatorname{rank} \mathcal{X} = 1} \ \|\mathcal{F} - \mathcal{X}\|^2.$$

When $\mathcal{F}$ is symmetric, Zhang, Ling, and Qi [35] showed that (1.6) has a global minimizer that belongs to $S^m(\mathbb{R}^n)$, i.e., (1.6) has an optimizer that is a symmetric tensor. A best rank-1 approximation of a symmetric tensor might not itself be symmetric, but there is always at least one global minimizer of (1.6) that is a symmetric rank-1 tensor. Therefore, for convenience, by best rank-1 approximation for symmetric tensors, we mean best symmetric rank-1 approximation.

A symmetric tensor in $S^m(\mathbb{R}^n)$ is rank-1 if and only if it equals $\lambda \cdot (u \otimes \cdots \otimes u)$ for some $\lambda \in \mathbb{R}$ and $u \in \mathbb{R}^n$. For convenience, denote $u^{\otimes m} := u \otimes \cdots \otimes u$ ($u$ is repeated $m$ times). In the spirit of Theorem 1.1 and the work [35], (1.6) is equivalent to the optimization problem

$$(1.7) \qquad \max_{x \in \mathbb{R}^n} \ |f(x)| \quad \text{s.t.} \quad \|x\| = 1,$$

where $f(x) := F(x, \dots, x)$. Therefore, if $u$ is a global maximizer of (1.7) and $\lambda = f(u)$, then $\lambda \cdot u^{\otimes m}$ is a best rank-1 approximation of $\mathcal{F}$. Clearly, to solve (1.7), we need to solve two maximization problems:

$$(1.8) \qquad \text{(I)} \ \max_{x \in S^{n-1}} f(x), \qquad \text{(II)} \ \max_{x \in S^{n-1}} -f(x),$$

where $S^{n-1} := \{x \in \mathbb{R}^n : \|x\| = 1\}$ is the $(n-1)$-dimensional unit sphere. Suppose $u^+, u^-$ are global maximizers of (I), (II) in (1.8), respectively. By Theorem 1.1, if $|f(u^+)| \ge |f(u^-)|$, then $f(u^+) \cdot (u^+)^{\otimes m}$ is the best rank-1 approximation; otherwise, $f(u^-) \cdot (u^-)^{\otimes m}$ is the best.
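Concretely, for a symmetric 3-tensor, evaluating $f(x) = F(x, x, x)$ and selecting between the two subproblems of (1.8) is only a few lines (a sketch assuming NumPy; `u_plus` and `u_minus` stand for computed maximizers of (I) and (II)):

```python
import numpy as np

def f(F, x):
    # f(x) = F(x, x, x) for a symmetric 3-tensor F.
    return np.einsum('ijk,i,j,k->', F, x, x, x)

def best_of(F, u_plus, u_minus):
    # Pick the subproblem winner as described after (1.8).
    lp, lm = f(F, u_plus), f(F, u_minus)
    return (lp, u_plus) if abs(lp) >= abs(lm) else (lm, u_minus)
```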

For an introduction to symmetric tensors, we refer to Comon et al. [7]. Finding best rank-1 approximations for symmetric tensors is also NP-hard when $m \ge 3$. There exist methods for computing rank-1 approximations for symmetric tensors. When HOPM is directly applied, it is often unreliable for attaining a good symmetric rank-1 approximation, as pointed out in [9]. To get good symmetric rank-1 approximations, Kofidis and Regalia [14] proposed a symmetric higher order power method (SHOPM), and Zhang, Ling, and Qi [35] proposed a modified power method. These methods can be easily implemented. As for nonsymmetric tensors, they often generate a locally optimal rank-1 approximation or even just a stationary point. Even in the lucky cases in which a globally optimal rank-1 approximation is found, these methods typically have difficulty verifying its global optimality. The problem (1.7) is related to extreme Z-eigenvalues of symmetric tensors. Recently, Hu, Huang, and Qi [12] proposed a method for computing extreme Z-eigenvalues for symmetric tensors of even orders. It solves a sequence of semidefinite relaxations based on sum of squares representations.


1.3. Contributions of the paper. In this paper, we propose a new approach for computing best rank-1 tensor approximations, i.e., using semidefinite program (SDP) relaxations. As we have seen, for nonsymmetric tensors, the problem is equivalent to optimizing a multilinear form over multispheres, i.e., (1.5); for symmetric tensors, it is equivalent to optimizing a homogeneous polynomial over the unit sphere, i.e., (1.7). Recently, there has been extensive work on solving polynomial optimization problems by using semidefinite relaxations and sum of squares representations, e.g., Lasserre [18], Parrilo and Sturmfels [26], and Nie and Wang [24]. These relaxations are often tight, and generally they are able to get global optimizers, which can be verified mathematically. We refer to Lasserre's book [19] and Laurent's survey [20] for an overview of the work in this area.

For a nonsymmetric tensor $\mathcal{F}$, to get a best rank-1 approximation is equivalent to solving the multilinear optimization problem (1.5). When $\mathcal{F}$ is symmetric, to get a best rank-1 approximation for $\mathcal{F}$, we need to solve the homogeneous optimization problem (1.7). We solve these polynomial optimization problems by using semidefinite relaxations, based on sum of squares representations. In applications, the resulting semidefinite programs are often large scale. The traditional interior point methods for semidefinite programs are generally too expensive for solving them. The recent Newton-CG augmented Lagrangian method by Zhao, Sun, and Toh [32] is very efficient for solving such big semidefinite programs, as shown in [24]. In this paper, we use this method to compute best rank-1 approximations for tensors.

The paper is organized as follows. In section 2, we show how to find best rank-1 approximations by using semidefinite relaxations and we propose numerical algorithms based on them. Their special structures and properties are also studied. In section 3, we present extensive numerical experiments to show the efficiency of these semidefinite relaxations. In section 4, we discuss this approach and future work.

Notation. The symbol $\mathbb{N}$ (resp., $\mathbb{R}$, $\mathbb{C}$) denotes the set of nonnegative integers (resp., real numbers, complex numbers). For two tensors $\mathcal{X}, \mathcal{Y} \in \mathbb{R}^{n_1 \times \cdots \times n_m}$, define their inner product as

$$\langle \mathcal{X}, \mathcal{Y} \rangle := \sum_{1 \le i_1 \le n_1, \dots, 1 \le i_m \le n_m} \mathcal{X}_{i_1,\dots,i_m} \mathcal{Y}_{i_1,\dots,i_m}.$$

For $t \in \mathbb{R}$, $\lceil t \rceil$ denotes the smallest integer that is not smaller than $t$. For an integer $n > 0$, $[n]$ denotes the set $\{1, \dots, n\}$. For $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}^n$, $|\alpha| := \alpha_1 + \cdots + \alpha_n$. Denote $\mathbb{N}^n_m = \{\alpha \in \mathbb{N}^n : |\alpha| = m\}$. For $x = (x_1, \dots, x_n) \in \mathbb{R}^n$, $x^\alpha$ denotes $x_1^{\alpha_1} \cdots x_n^{\alpha_n}$. Let $\mathbb{R}[x]$ be the ring of polynomials with real coefficients in $x$. A polynomial $p$ is said to be a sum of squares (SOS) if $p = p_1^2 + \cdots + p_k^2$ for some $p_1, \dots, p_k \in \mathbb{R}[x]$. The symbol $\Sigma_{n,m}$ denotes the cone of SOS forms in $(x_1, \dots, x_n)$ of degree $m$. For a finite set $T$, $|T|$ denotes its cardinality. The symbol $e_i$ denotes the $i$th unit vector, i.e., $e_i$ is the vector whose $i$th entry equals one and all others equal zero. For a matrix $A$, $A^T$ denotes its transpose. For a symmetric matrix $W$, $W \succeq 0$ (resp., $\succ 0$) means that $W$ is positive semidefinite (resp., definite). For any vector $u \in \mathbb{R}^N$, $\|u\| = \sqrt{u^T u}$ denotes the standard Euclidean 2-norm.

2. Semidefinite relaxations and algorithms. To find best rank-1 tensor approximations is equivalent to solving some homogeneous polynomial optimization problems with sphere constraints. In this section, we show how to solve them by using semidefinite relaxations based on sum of squares representations, and we study their properties.


2.1. Symmetric tensors of even orders. Let $\mathcal{F} \in S^m(\mathbb{R}^n)$ with $m = 2d$ even. To get a best rank-1 approximation of $\mathcal{F}$, we need to solve (1.7), i.e., to maximize $|f(x)|$ over the unit sphere. For this purpose, we need to find the maximum and minimum of $f(x)$ over $S^{n-1}$.

First, we consider the maximization problem:

$$(2.1) \qquad f_{\max} := \max \ f(x) \quad \text{s.t.} \quad x^T x = 1.$$

The form $f$ is in $x := (x_1, \dots, x_n)$, determined by the tensor $\mathcal{F}$ as

$$(2.2) \qquad f(x) = \sum_{1 \le i_1, \dots, i_m \le n} \mathcal{F}_{i_1,\dots,i_m} x_{i_1} \cdots x_{i_m}.$$

Let $[x^d]$ be the monomial vector:

$$[x^d] := \begin{pmatrix} x_1^d & x_1^{d-1} x_2 & \cdots & x_1^{d-1} x_n & \cdots & x_n^d \end{pmatrix}^T.$$

Its length is $\binom{n+d-1}{d}$. The outer product $[x^d][x^d]^T$ is a symmetric matrix with entries being monomials of degree $m$. Let $A_\alpha$ be symmetric matrices such that

$$[x^d][x^d]^T = \sum_{\alpha \in \mathbb{N}^n_m} A_\alpha x^\alpha.$$

For $y \in \mathbb{R}^{\mathbb{N}^n_m}$, define the matrix-valued function $M(y)$ as

$$M(y) := \sum_{\alpha \in \mathbb{N}^n_m} A_\alpha y_\alpha.$$

Then $M(y)$ is a linear pencil in $y$ (i.e., $M(y)$ is a linear matrix-valued function in $y$).
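As an illustration of this construction (a sketch assuming NumPy; small sizes chosen for readability), the following builds $M(y)$ directly from its defining property $M(y)_{ij} = y_{\alpha_i + \alpha_j}$, where $\alpha_i$ is the exponent of the $i$th monomial in $[x^d]$:

```python
import numpy as np
from itertools import combinations_with_replacement

n, d = 2, 2                                                # so m = 2d = 4
mons = list(combinations_with_replacement(range(n), d))    # degree-d monomials

def expo(mon):
    # Exponent vector in N^n of a monomial given by its variable indices.
    a = np.zeros(n, dtype=int)
    for i in mon:
        a[i] += 1
    return a

def M(y):
    # y: dict mapping exponent tuples alpha (with |alpha| = m) to moments.
    N = len(mons)
    out = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            out[i, j] = y[tuple(expo(mons[i]) + expo(mons[j]))]
    return out

# For the moments of a point, y = [u^m], M(y) = [u^d][u^d]^T has rank one.
u = np.array([0.6, 0.8])
y = {tuple(expo(a) + expo(b)): float(np.prod(u ** (expo(a) + expo(b))))
     for a in mons for b in mons}
print(np.linalg.matrix_rank(M(y)))                         # prints 1
```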

Let $f_\alpha, g_\alpha$ be the coefficients such that

$$f = \sum_{\alpha \in \mathbb{N}^n_m} f_\alpha x^\alpha, \qquad g := (x^T x)^d = \sum_{\alpha \in \mathbb{N}^n_m} g_\alpha x^\alpha.$$

For $y \in \mathbb{R}^{\mathbb{N}^n_m}$, define

$$\langle f, y \rangle := \sum_{\alpha \in \mathbb{N}^n_m} f_\alpha y_\alpha, \qquad \langle g, y \rangle := \sum_{\alpha \in \mathbb{N}^n_m} g_\alpha y_\alpha.$$

A semidefinite relaxation of (2.1) is (cf. [23, 24])

$$(2.3) \qquad f^{sdp}_{\max} := \max_{y \in \mathbb{R}^{\mathbb{N}^n_m}} \ \langle f, y \rangle \quad \text{s.t.} \quad M(y) \succeq 0, \ \langle g, y \rangle = 1.$$

It can be shown that the dual of the above is

$$(2.4) \qquad \min \ \gamma \quad \text{s.t.} \quad \gamma g - f \in \Sigma_{n,m}.$$

In the above, $\Sigma_{n,m}$ denotes the cone of SOS forms of degree $m$ in the variables $x_1, \dots, x_n$. Clearly, $f^{sdp}_{\max} \ge f_{\max}$. When $f^{sdp}_{\max} = f_{\max}$, we say that the semidefinite relaxation (2.3) is tight.
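For intuition, here is a hedged sketch of (2.3) for $n = 2$, $m = 4$ ($d = 2$), assuming the cvxpy package with an SDP-capable solver (e.g., SCS); the coefficients $f_\alpha$ below are arbitrary example data, not from the paper. With $[x^2] = (x_1^2, x_1 x_2, x_2^2)$, the moments are indexed by the exponents $(4,0), (3,1), (2,2), (1,3), (0,4)$, and $g = (x_1^2 + x_2^2)^2$ gives $\langle g, y \rangle = y_{(4,0)} + 2 y_{(2,2)} + y_{(0,4)}$.

```python
import cvxpy as cp

# Example coefficients f_alpha of a quartic form f (placeholder data).
f = {(4, 0): 1.0, (3, 1): -2.0, (2, 2): 0.5, (1, 3): 0.0, (0, 4): 1.5}

y = {a: cp.Variable() for a in f}             # moment variables y_alpha
M = cp.bmat([[y[(4, 0)], y[(3, 1)], y[(2, 2)]],
             [y[(3, 1)], y[(2, 2)], y[(1, 3)]],
             [y[(2, 2)], y[(1, 3)], y[(0, 4)]]])

constraints = [M >> 0,                                       # M(y) PSD
               y[(4, 0)] + 2 * y[(2, 2)] + y[(0, 4)] == 1]   # <g, y> = 1
prob = cp.Problem(cp.Maximize(sum(f[a] * y[a] for a in f)), constraints)
prob.solve()
print(prob.value)      # f_max^sdp, an upper bound for max f(x) on ||x|| = 1
```

If the optimal moment matrix has rank one, the optimizer of (2.1) can be read off as in (2.5) below.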


The dual problem (2.4) is a linear optimization problem with SOS type constraints. Recently, Hu, Huang, and Qi [12] proposed an SOS relaxation method for computing extreme Z-eigenvalues for symmetric tensors of even orders. Since not every nonnegative form is SOS, they consider a sequence of nested SOS relaxations. The problem (2.4) is equivalent to the lowest order relaxation in [12, section 4]. In practice, the relaxation (2.4) is often tight, and it is frequently used because of its simplicity. The SOS relaxation (2.4) was also proposed in [23] for optimizing forms over unit spheres. Its approximation quality was also analyzed there.

The feasible set of (2.3) is compact, because $\operatorname{Trace}(M(y)) \le \langle g, y \rangle = 1$. So (2.3) always has a maximizer, say, $y^*$. If $\operatorname{rank} M(y^*) = 1$, then (2.3) is a tight relaxation. In such a case, there exists $v^+ \in S^{n-1}$ such that $y^* = [(v^+)^m]$, because of the structure of $M(y)$. Then, it holds that

$$f(v^+) = \langle f, y^* \rangle = f^{sdp}_{\max} \ge f_{\max}.$$

This implies that $v^+$ is a global maximizer of (2.1). The vector $v^+$ can be chosen numerically as follows. Let $s \in [n]$ be the index such that

$$y^*_{2d e_s} = \max_{1 \le i \le n} \ y^*_{2d e_i}.$$

Then choose $v^+$ as the normalization:

$$(2.5) \qquad \hat{u} = \left( y^*_{(2d-1)e_s + e_1}, \ \dots, \ y^*_{(2d-1)e_s + e_n} \right), \qquad v^+ = \hat{u} / \|\hat{u}\|.$$

If $\operatorname{rank} M(y^*) > 1$ but $M(y^*)$ satisfies a further rank condition, then (2.3) is also tight (cf. [24, section 2.2]). In computations, no matter whether (2.3) is tight or not, the vector $v^+$ selected as in (2.5) can be used as an approximation for a maximizer of (2.1).
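In code, the selection rule (2.5) only reads off $n + 1$ moments (a sketch assuming NumPy, with the moments stored in a dict keyed by exponent tuples, as in the earlier sketch):

```python
import numpy as np

def extract_v_plus(y, n, d):
    # Implements (2.5): pick s maximizing y_{2d e_s}, then normalize the
    # vector of moments y_{(2d-1) e_s + e_i}, i = 1, ..., n.
    m = 2 * d
    def e(i, k=1):
        a = [0] * n
        a[i] = k
        return np.array(a)
    s = max(range(n), key=lambda i: y[tuple(e(i, m))])
    u = np.array([y[tuple(e(s, m - 1) + e(i))] for i in range(n)])
    return u / np.linalg.norm(u)
```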

Second, we consider the minimization problem:

$$(2.6) \qquad f_{\min} := \min \ f(x) \quad \text{s.t.} \quad x^T x = 1.$$

Similarly, a semidefinite relaxation of (2.6) is

$$(2.7) \qquad f^{sdp}_{\min} := \min_{z \in \mathbb{R}^{\mathbb{N}^n_m}} \ \langle f, z \rangle \quad \text{s.t.} \quad M(z) \succeq 0, \ \langle g, z \rangle = 1.$$

Its dual optimization problem can be shown to be

$$(2.8) \qquad \max \ \eta \quad \text{s.t.} \quad f - \eta g \in \Sigma_{n,m}.$$

Let $z^*$ be a minimizer of (2.7). Similarly, if $\operatorname{rank} M(z^*) = 1$, then (2.7) is a tight relaxation. In such a case, a global minimizer $v^-$ can be found as follows: let $t \in [n]$ be the index such that

$$z^*_{2d e_t} = \max_{1 \le i \le n} \ z^*_{2d e_i},$$

then choose $v^-$ as the normalization:

$$(2.9) \qquad \tilde{u} = \left( z^*_{(2d-1)e_t + e_1}, \ \dots, \ z^*_{(2d-1)e_t + e_n} \right), \qquad v^- = \tilde{u} / \|\tilde{u}\|.$$


When $\operatorname{rank} M(z^*) > 1$, (2.7) might not be a tight relaxation. In computations, the vector $v^-$ can be used as an approximation for a minimizer of (2.6).

Combining the above and inspired by Theorem 1.1, we get the following algorithm.

Algorithm 2.1. Rank-1 approximations for even symmetric tensors.

Input: A symmetric tensor $\mathcal{F} \in S^m(\mathbb{R}^n)$ with an even order $m = 2d$.

Output: A rank-1 tensor $\lambda \cdot u^{\otimes m}$ with $\lambda \in \mathbb{R}$ and $u \in S^{n-1}$.

Procedure:

Step 1. Solve the semidefinite relaxation (2.3) and get an optimizer $y^*$.

Step 2. Choose $v^+$ as in (2.5). If $\operatorname{rank} M(y^*) = 1$, let $u^+ = v^+$; otherwise, apply a nonlinear optimization method to get a better solution $u^+$ of (2.1), by using $v^+$ as a starting point. Let $\lambda^+ = f(u^+)$.

Step 3. Solve the semidefinite relaxation (2.7) and get an optimizer $z^*$.

Step 4. Choose $v^-$ as in (2.9). If $\operatorname{rank} M(z^*) = 1$, let $u^- = v^-$; otherwise, apply a nonlinear optimization method to get a better solution $u^-$ of (2.6), by using $v^-$ as a starting point. Let $\lambda^- = f(u^-)$.

Step 5. If $|\lambda^+| \ge |\lambda^-|$, output $(\lambda, u) := (\lambda^+, u^+)$; otherwise, output $(\lambda, u) := (\lambda^-, u^-)$.
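Steps 2 and 4 call a local nonlinear solver (the paper uses MATLAB's fmincon in section 3). As a hedged stand-in, a sphere-constrained polish with SciPy might look as follows, where `f` is a callable evaluating the form:

```python
import numpy as np
from scipy.optimize import minimize

def polish(f, v, sign=1.0):
    # Locally maximize sign*f(x) subject to x^T x = 1, starting from v
    # (SLSQP is SciPy's default for equality-constrained problems).
    con = {'type': 'eq', 'fun': lambda x: x @ x - 1.0}
    res = minimize(lambda x: -sign * f(x), v, constraints=[con])
    return res.x / np.linalg.norm(res.x)

# Usage: u_plus = polish(f, v_plus) for (2.1);
#        u_minus = polish(f, v_minus, sign=-1.0) for (2.6).
```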

Remark. In Algorithm 2.1, if $\operatorname{rank} M(y^*) = \operatorname{rank} M(z^*) = 1$, then the output $\lambda \cdot u^{\otimes m}$ is a best rank-1 approximation of $\mathcal{F}$. If this rank condition is not satisfied, then $\lambda \cdot u^{\otimes m}$ might not be best. The approximation qualities of the semidefinite relaxations (2.3) and (2.7) are analyzed in [23, section 2].

When $n = 2$ or $(n, m) = (3, 4)$, the semidefinite relaxations (2.3) and (2.7) are always tight. This is because every bivariate, or ternary quartic, nonnegative form is always SOS (cf. [28]). For other cases of $(n, m)$, this is not necessarily true. However, failure of tightness does not occur very often in applications. In our numerical experiments, the semidefinite relaxations (2.3) and (2.7) are often tight.

We want to know when the ranks of $M(y^*)$ and $M(z^*)$ are equal to one. Clearly, when they have rank one, the relaxations (2.3) and (2.7) must be tight. Interestingly, the reverse is often true, although it might fail in very few cases. This fact has been observed in the field of polynomial optimization; however, we are not able to find suitable references explaining it. Here, for lack of suitable references, we give a natural geometric interpretation of this phenomenon. Let $\partial \Sigma_{n,m}$ be the boundary of the cone $\Sigma_{n,m}$.

Theorem 2.2. Let $f_{\max}$, $f^{sdp}_{\max}$, $f_{\min}$, $f^{sdp}_{\min}$, $y^*$, $z^*$ be as above.

(i) Suppose $f_{\max} = f^{sdp}_{\max}$. If $f_{\max} \cdot g - f$ is a smooth point of $\partial \Sigma_{n,m}$, then $\operatorname{rank} M(y^*) = 1$.

(ii) Suppose $f_{\min} = f^{sdp}_{\min}$. If $f - f_{\min} \cdot g$ is a smooth point of $\partial \Sigma_{n,m}$, then $\operatorname{rank} M(z^*) = 1$.

Proof. (i) Let $\mu$ be the uniform probability measure on the unit sphere $S^{n-1}$, and let $y_\mu := \int [x^m] \, d\mu \in \mathbb{R}^{\mathbb{N}^n_m}$. Then, we can show that $M(y_\mu) \succ 0$ and $\langle g, y_\mu \rangle = 1$. This shows that $y_\mu$ is an interior point of (2.3). So, strong duality holds, that is, the optimal values of (2.3) and (2.4) are equal, and (2.4) achieves the optimal value $f^{sdp}_{\max}$. The form $\sigma := f_{\max} \cdot g - f$ belongs to $\Sigma_{n,m}$, because $f_{\max} = f^{sdp}_{\max}$. Let $u$ be a maximizer of $f$ on $S^{n-1}$. Then, $\sigma(u) = 0$. So, $\sigma$ lies on the boundary $\partial \Sigma_{n,m}$. Let $\hat{y} = [u^m]$. Denote by $\mathbb{R}[x]_m$ the space of all forms in $x$ of degree $m$. Then

$$H_{\hat{y}} := \{ p \in \mathbb{R}[x]_m : \langle p, \hat{y} \rangle = 0 \}$$

is a supporting hyperplane of $\Sigma_{n,m}$ through $\sigma$, because

$$\langle p, \hat{y} \rangle = p(u) \ge 0 \quad \forall \, p \in \Sigma_{n,m}$$

and $\langle \sigma, \hat{y} \rangle = \sigma(u) = 0$. Because $M(y^*) \succeq 0$ and $\langle p, y^* \rangle \ge 0$ for all $p \in \Sigma_{n,m}$, the hyperplane

$$H_{y^*} := \{ p \in \mathbb{R}[x]_m : \langle p, y^* \rangle = 0 \}$$

also supports $\Sigma_{n,m}$ through $\sigma$. Since $\sigma$ is a smooth point of $\partial \Sigma_{n,m}$, there is a unique supporting hyperplane $H$ through $\sigma$. So, $y^* = \hat{y} = [u^m]$. This implies that $M(y^*) = M([u^m]) = [u^d][u^d]^T$ has rank one, where $d = m/2$.

(ii) The proof is the same as for (i).

By Theorem 2.2, when the semidefinite relaxations (2.3) and (2.7) are tight, we often have $\operatorname{rank} M(y^*) = \operatorname{rank} M(z^*) = 1$. This fact was observed in our numerical experiments. The reason is that the set of nonsmooth points of the boundary $\partial \Sigma_{n,m}$ has a strictly smaller dimension than $\partial \Sigma_{n,m}$.

2.2. Symmetric tensors of odd orders. Let $\mathcal{F} \in S^m(\mathbb{R}^n)$ with $m = 2d - 1$ odd. To get a best rank-1 approximation of $\mathcal{F}$, we need to solve the optimization problem (1.7). Since the form $f$, defined as in (2.2), has the odd degree $m$, (1.7) is equivalent to (2.1). We cannot directly apply a semidefinite relaxation to solve (2.1). For this purpose, we use a trick that is introduced in [23, section 4.2].

Let $f_{\max}, f_{\min}$ be as in (2.1), (2.6), respectively. Since $f$ is an odd form,

$$f_{\max} = -f_{\min} \ge 0.$$

Let $x_{n+1}$ be a new variable, in addition to $x = (x_1, \dots, x_n)$. Let

$$\tilde{x} := (x_1, \dots, x_n, x_{n+1}), \qquad \tilde{f}(\tilde{x}) := f(x) \, x_{n+1}.$$

Then $\tilde{f}(\tilde{x})$ is a form of even degree $2d$. Consider the optimization problem:

$$(2.10) \qquad \tilde{f}_{\max} := \max_{\tilde{x} \in \mathbb{R}^{n+1}} \ \tilde{f}(\tilde{x}) \quad \text{s.t.} \quad \|\tilde{x}\| = 1.$$

As shown in [23, section 4.2], it holds that

$$f_{\max} = \sqrt{2d-1} \, \left( 1 - \frac{1}{2d} \right)^{-d} \tilde{f}_{\max}.$$
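To see where this constant comes from, write $\tilde{x} = (t\,u, s)$ with $\|u\| = 1$ and $t^2 + s^2 = 1$. Since $f$ is homogeneous of degree $2d - 1$, we have $\tilde{f}(\tilde{x}) = t^{2d-1} f(u) \, s$, so (a short verification, consistent with [23, section 4.2])

$$\tilde{f}_{\max} = f_{\max} \cdot \max_{t^2 + s^2 = 1,\ t, s \ge 0} t^{2d-1} s = f_{\max} \left( \frac{2d-1}{2d} \right)^{\frac{2d-1}{2}} \left( \frac{1}{2d} \right)^{\frac{1}{2}},$$

with the inner maximum attained at $t^2 = \frac{2d-1}{2d}$, $s^2 = \frac{1}{2d}$; solving for $f_{\max}$ gives the displayed relation.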

Since $\tilde{f}$ is an even form, the semidefinite relaxation for (2.10) is

$$(2.11) \qquad \tilde{f}^{sdp}_{\max} := \max_{y \in \mathbb{R}^{\mathbb{N}^{n+1}_{2d}}} \ \langle \tilde{f}, y \rangle \quad \text{s.t.} \quad M(y) \succeq 0, \ \langle g, y \rangle = 1.$$

The vector $y$ is indexed by $(n+1)$-dimensional integer vectors.

Let $y^*$ be a maximizer of (2.11), which always exists because the feasible set of (2.11) is compact. If $\operatorname{rank} M(y^*) = 1$, then (2.11) is a tight relaxation, and a global maximizer $v^+$ of (2.10) can be chosen as in (2.5). (Note that the $n$ in (2.5) should be replaced by $n+1$.) Write $v^+$ as

$$v^+ = (\hat{v}, \hat{t}\,), \qquad \|\hat{v}\|^2 + \hat{t}^2 = 1.$$

If $\hat{v} = 0$ or $\hat{t} = 0$, then $\tilde{f}_{\max} = f_{\max} = 0$ and the zero tensor is the best rank-1 approximation of $\mathcal{F}$. So, we consider the general case $0 < |\hat{t}| < 1$. Note that $\operatorname{sign}(\hat{t}\,) \cdot \hat{v}$ is a global maximizer of $f$ on the sphere $\|x\|^2 = 1 - \hat{t}^2$. Let

$$(2.12) \qquad \hat{u} = \operatorname{sign}(\hat{t}\,) \cdot \hat{v} \Big/ \sqrt{1 - \hat{t}^2}.$$


Then $\hat{u}$ is a global maximizer of $f$ on $S^{n-1}$. When $\operatorname{rank} M(y^*) > 1$, the above $\hat{u}$ might not be a global maximizer, but it can be used as an approximation for a maximizer.

Combining the above, we get the following algorithm.

Algorithm 2.3. Rank-1 approximations for odd symmetric tensors.

Input: A symmetric tensor $\mathcal{F} \in S^m(\mathbb{R}^n)$ with an odd order $m = 2d - 1$.

Output: A rank-1 symmetric tensor $\lambda \cdot u^{\otimes m}$ with $\lambda \in \mathbb{R}$ and $u \in S^{n-1}$.

Procedure:

Step 1. Solve the semidefinite relaxation (2.11) and get an optimizer $y^*$.

Step 2. Choose $v^+$ as in (2.5) and $\hat{u}$ as in (2.12). (The $n$ in (2.5) should be replaced by $n+1$.)

Step 3. If $\operatorname{rank} M(y^*) = 1$, let $u = \hat{u}$; otherwise, apply a nonlinear optimization method to get a better solution $u$ of (2.1), by using $\hat{u}$ as a starting point. Let $\lambda = f(u)$. Output $(\lambda, u)$.

Remark. In Algorithm 2.3, if $\operatorname{rank} M(y^*) = 1$, then the output $\lambda \cdot u^{\otimes m}$ is a best rank-1 approximation of the tensor $\mathcal{F}$. If $\operatorname{rank} M(y^*) > 1$, then $\lambda \cdot u^{\otimes m}$ is not necessarily the best. The approximation quality of the semidefinite relaxation (2.11) is analyzed in [23, section 4]. In our numerical experiments, we often have $\operatorname{rank} M(y^*) = 1$. A version of Theorem 2.2 also holds for Algorithm 2.3; we omit it for simplicity.

2.3. Nonsymmetric tensors. Let $\mathcal{F} \in \mathbb{R}^{n_1 \times \cdots \times n_m}$ be a nonsymmetric tensor of order $m$. Let $F(x^1, \dots, x^m)$ be the multilinear form defined as in (1.4). To get a best rank-1 approximation of $\mathcal{F}$ is equivalent to solving the multilinear optimization problem (1.5). Here we show how to solve it by using semidefinite relaxations.

Without loss of generality, assume $n_m = \max_j n_j$. Since $F(x^1, \dots, x^m)$ is linear in $x^m$, we can write it as

$$F(x^1, \dots, x^m) = \sum_{j=1}^{n_m} (x^m)_j \, F_j(x^1, \dots, x^{m-1}),$$

where each $F_j(x^1, \dots, x^{m-1})$ is also a multilinear form. Let $m' = m - 1$, and

$$F^{sq} := \sum_{j=1}^{n_m} F_j(x^1, \dots, x^{m'})^2.$$

By the Cauchy–Schwarz inequality, it holds that

$$|F(x^1, \dots, x^m)| \le F^{sq}(x^1, \dots, x^{m'})^{1/2} \, \|x^m\|.$$

Equality occurs in the above if and only if $x^m$ is proportional to

$$\left( F_1(x^1, \dots, x^{m'}), \ \dots, \ F_{n_m}(x^1, \dots, x^{m'}) \right).$$

Therefore, (1.5) is equivalent to

$$(2.13) \qquad F_{\max} := \max_{x^1, \dots, x^{m'}} \ F^{sq}(x^1, \dots, x^{m'}) \quad \text{s.t.} \quad \|x^1\| = \cdots = \|x^{m'}\| = 1.$$
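Concretely (a sketch assuming NumPy; the tensor is random stand-in data), for $m = 3$, $m' = 2$, the slice forms $F_j$ and the equality case of the Cauchy–Schwarz bound look as follows:

```python
import numpy as np

rng = np.random.default_rng(1)
F = rng.standard_normal((3, 4, 5))                 # m = 3, so m' = 2
x1 = rng.standard_normal(3)
x1 /= np.linalg.norm(x1)
x2 = rng.standard_normal(4)
x2 /= np.linalg.norm(x2)

Fj = np.einsum('ijk,i,j->k', F, x1, x2)            # (F_1(x1,x2), ..., F_5(x1,x2))
print(np.sqrt(Fj @ Fj))                            # Fsq(x1, x2)^(1/2)

# The bound is attained at x3 proportional to (F_1, ..., F_5):
x3 = Fj / np.linalg.norm(Fj)
print(abs(np.einsum('ijk,i,j,k->', F, x1, x2, x3)))   # the same value
```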

Now we present the semidefinite relaxations for solving (2.13). The outer product

$$K(x) := x^1 \otimes \cdots \otimes x^{m'}$$

is a vector of length $n_1 \cdots n_{m'}$. Denote

$$\Omega := \{ (\imath, \jmath) : \imath, \jmath \in [n_1] \times \cdots \times [n_{m'}] \}.$$

Expand the outer product of $K(x)$ as

$$K(x) K(x)^T = \sum_{(\imath, \jmath) \in \Omega} B_{\imath,\jmath} \, (x^1)_{\imath_1} (x^1)_{\jmath_1} \cdots (x^{m'})_{\imath_{m'}} (x^{m'})_{\jmath_{m'}},$$

where each $B_{\imath,\jmath}$ is a constant symmetric matrix. For $w \in \mathbb{R}^\Omega$, define

$$K(w) := \sum_{(\imath, \jmath) \in \Omega} B_{\imath,\jmath} \, w_{\imath,\jmath}.$$

Clearly, $K(w)$ is a linear pencil in $w \in \mathbb{R}^\Omega$. Write

$$F^{sq} = \sum_{(\imath, \jmath) \in \Omega} G_{\imath,\jmath} \, (x^1)_{\imath_1} (x^1)_{\jmath_1} \cdots (x^{m'})_{\imath_{m'}} (x^{m'})_{\jmath_{m'}},$$

$$h := \|x^1\|^2 \cdots \|x^{m'}\|^2 = \sum_{(\imath, \jmath) \in \Omega} h_{\imath,\jmath} \, (x^1)_{\imath_1} (x^1)_{\jmath_1} \cdots (x^{m'})_{\imath_{m'}} (x^{m'})_{\jmath_{m'}}.$$

For $w \in \mathbb{R}^\Omega$, we denote

$$\langle F^{sq}, w \rangle := \sum_{(\imath, \jmath) \in \Omega} G_{\imath,\jmath} \, w_{\imath,\jmath}, \qquad \langle h, w \rangle := \sum_{(\imath, \jmath) \in \Omega} h_{\imath,\jmath} \, w_{\imath,\jmath}.$$

A semidefinite relaxation of (2.13) is

$$(2.14) \qquad F^{sdp}_{\max} := \max \ \langle F^{sq}, w \rangle \quad \text{s.t.} \quad K(w) \succeq 0, \ \langle h, w \rangle = 1.$$

Define $\Sigma_{n_1,\dots,n_{m'}}$ to be the cone

$$\Sigma_{n_1,\dots,n_{m'}} = \left\{ L \ \middle| \ L = L_1^2 + \cdots + L_k^2, \ \text{where each } L_i \text{ is a multilinear form in } (x^1, \dots, x^{m'}) \right\}.$$

It can be shown that the dual problem of (2.14) is

$$(2.15) \qquad \min \ \gamma \quad \text{s.t.} \quad \gamma h - F^{sq} \in \Sigma_{n_1,\dots,n_{m'}}.$$

Clearly, we always have $F^{sdp}_{\max} \ge F_{\max}$. When equality occurs, we say that (2.14) is a tight relaxation.

The feasible set of (2.14) is compact, because $\operatorname{Trace}(K(w)) = \langle h, w \rangle = 1$. Let $w^*$ be a maximizer of (2.14). As in the case of symmetric tensors, if $\operatorname{rank} K(w^*) = 1$, then (2.14) is tight, and there exist vectors $v^1, \dots, v^{m'}$ of unit length such that $w^* = (v^1 (v^1)^T) \otimes \cdots \otimes (v^{m'} (v^{m'})^T)$ and $(v^1, \dots, v^{m'})$ is a maximizer of (2.13). They can be constructed as follows. Let $\ell \in [n_1] \times \cdots \times [n_{m'}]$ be the index such that

$$w^*_{\ell,\ell} = \max_{(\imath, \imath) \in \Omega} \ w^*_{\imath,\imath}.$$

Then choose $v^j$ ($j = 1, \dots, m'$) as

$$(2.16) \qquad \hat{v}^j = \left( w^*_{\hat{1}, \ell}, \ w^*_{\hat{2}, \ell}, \ \dots, \ w^*_{\widehat{n_j}, \ell} \right), \qquad v^j = \hat{v}^j / \|\hat{v}^j\|,$$

where $\hat{k} = \ell + (k - \ell_j) \cdot e_j$ for each $k \in [n_j]$. When $\operatorname{rank} K(w^*) > 1$, the tuple $(v^1, \dots, v^{m'})$ as in (2.16) might not be a global maximizer of (2.13). But it can be used as an approximation for a maximizer of (2.13).
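For $m' = 2$ the indexing in (2.16) is easy to spell out (a sketch assuming NumPy, with the moments stored as a 4-way array `w[i1, i2, j1, j2]` $= w_{(i_1,i_2),(j_1,j_2)}$, 0-based):

```python
import numpy as np

def extract_pair(w, n1, n2):
    # Pick the multi-index l = (l1, l2) maximizing the diagonal moment
    # w_{l,l}, then read off each v^j by varying the jth slot, as in (2.16).
    diag = np.array([[w[i, j, i, j] for j in range(n2)] for i in range(n1)])
    l1, l2 = np.unravel_index(np.argmax(diag), diag.shape)
    v1 = np.array([w[k, l2, l1, l2] for k in range(n1)])
    v2 = np.array([w[l1, k, l1, l2] for k in range(n2)])
    return v1 / np.linalg.norm(v1), v2 / np.linalg.norm(v2)
```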

Combining the above, we get the following algorithm.

Algorithm 2.4. Rank-1 approximations for nonsymmetric tensors.

Input: A nonsymmetric tensor $\mathcal{F} \in \mathbb{R}^{n_1 \times \cdots \times n_m}$.

Output: A rank-1 tensor $\lambda \cdot (u^1 \otimes \cdots \otimes u^m)$ with $\lambda \in \mathbb{R}$ and each $u^j \in S^{n_j - 1}$.

Procedure:

Step 1. Solve the semidefinite relaxation (2.14) and get a maximizer $w^*$.

Step 2. Choose $(v^1, \dots, v^{m'})$ as in (2.16). Then let

$$\hat{v}^m := \left( F_1(v^1, \dots, v^{m'}), \ \dots, \ F_{n_m}(v^1, \dots, v^{m'}) \right), \qquad v^m := \hat{v}^m / \|\hat{v}^m\|.$$

Step 3. If $\operatorname{rank} K(w^*) = 1$, let $u^i = v^i$ for $i = 1, \dots, m$; otherwise, apply a nonlinear optimization method to get a better solution $(u^1, \dots, u^m)$ of (1.5), by using $(v^1, \dots, v^m)$ as a starting point.

Step 4. Let $\lambda = F(u^1, \dots, u^m)$, and output $(\lambda, u^1, \dots, u^m)$.

Remark. In Algorithm 2.4, if $\operatorname{rank} K(w^*) = 1$, then the output $\lambda \cdot u^1 \otimes \cdots \otimes u^m$ is a best rank-1 approximation of $\mathcal{F}$. If $\operatorname{rank} K(w^*) > 1$, then $\lambda \cdot u^1 \otimes \cdots \otimes u^m$ is not necessarily the best. The approximation quality of the semidefinite relaxation (2.14) is analyzed in [23, section 3].

We want to know when $\operatorname{rank} K(w^*) = 1$. Clearly, for this to be true, the relaxation (2.14) must be tight, i.e., $F_{\max} = F^{sdp}_{\max}$. As in the symmetric case, the reverse is also often true, as shown in the following theorem. Let $\partial \Sigma_{n_1,\dots,n_{m'}}$ be the boundary of the cone $\Sigma_{n_1,\dots,n_{m'}}$.

Theorem 2.5. Let $F_{\max}$, $F^{sdp}_{\max}$, $w^*$ be as above. Suppose $F_{\max} = F^{sdp}_{\max}$. If $F_{\max} \cdot h - F^{sq}$ is a smooth point of $\partial \Sigma_{n_1,\dots,n_{m'}}$, then $\operatorname{rank} K(w^*) = 1$.

Proof. This can be proved in the same way as for Theorem 2.2. Let $(u^1, \dots, u^{m'})$ be a global maximizer of (2.13). Let $\hat{w} \in \mathbb{R}^\Omega$ be the vector such that

$$\hat{w}_{\imath,\jmath} = (u^1)_{\imath_1} (u^1)_{\jmath_1} \cdots (u^{m'})_{\imath_{m'}} (u^{m'})_{\jmath_{m'}} \quad \forall \, (\imath, \jmath) \in \Omega.$$

The key point is the observation that $\langle p, \hat{w} \rangle = 0$ defines the unique supporting hyperplane of $\Sigma_{n_1,\dots,n_{m'}}$ through $F_{\max} \cdot h - F^{sq}$, when the latter is a smooth point of the boundary $\partial \Sigma_{n_1,\dots,n_{m'}}$. The proof proceeds as for Theorem 2.2.

3. Numerical experiments. In this section, we report numerical experiments on using semidefinite relaxations to find best rank-1 tensor approximations. The computations are implemented in MATLAB 7.10 on a Dell Linux desktop with 8 GB memory and an Intel 2.8 GHz CPU. In applications, the resulting semidefinite programs are often large scale. Interior point methods are not very suitable for solving such big semidefinite programs. We use the software SDPNAL [31] by Zhao, Sun, and Toh, which is based on the Newton-CG augmented Lagrangian method [32]. In our computations, the default values of the parameters in SDPNAL are used. In Algorithms 2.1, 2.3, and 2.4, if the matrices $M(y^*)$, $M(z^*)$, $K(w^*)$ do not have rank one, we apply the nonlinear program solver fmincon in the MATLAB Optimization Toolbox to improve the rank-1 approximations obtained from the semidefinite relaxations. Our numerical experiments show that these algorithms are often able to get best rank-1 approximations and that SDPNAL is efficient in solving such large-scale semidefinite relaxations.

We report the consumed computer time in the format hr:mn:sc, with hr (resp., mn, sc) standing for the consumed hours (resp., minutes, seconds). In the presented tables, min (resp., med, max) stands for the minimum (resp., median, maximum) of quantities like time and errors.

In our computations, the rank of a matrix $A$ is numerically measured as follows: if the singular values of $A$ are $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_t > 0$, then $\operatorname{rank}(A)$ is set to be the smallest $r$ such that $\sigma_{r+1}/\sigma_r < 10^{-6}$. In the display of our computational results, only four decimal digits are shown.
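This rule is straightforward to implement (a sketch assuming NumPy):

```python
import numpy as np

def numerical_rank(A, tol=1e-6):
    # Singular values s[0] >= s[1] >= ...; the rank is the smallest r
    # with s[r]/s[r-1] < tol, i.e., sigma_{r+1}/sigma_r < tol (1-based).
    s = np.linalg.svd(A, compute_uv=False)
    for r in range(1, len(s)):
        if s[r] / s[r - 1] < tol:
            return r
    return len(s)
```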

3.1. Symmetric tensor examples. In this subsection, we report numerical experiments for symmetric tensors. We apply Algorithm 2.1 for even symmetric tensors, and we apply Algorithm 2.3 for odd symmetric tensors. In Algorithm 2.1, if

$$(3.1) \qquad \operatorname{rank} M(y^*) = \operatorname{rank} M(z^*) = 1,$$

then the output tensor $\lambda \cdot u^{\otimes m}$ is a best rank-1 approximation. If $\operatorname{rank} M(y^*) > 1$ or $\operatorname{rank} M(z^*) > 1$, $\lambda \cdot u^{\otimes m}$ is not guaranteed to be a best rank-1 approximation. However, the quantity

$$f_{ubd} := \max \{ |f^{sdp}_{\max}|, \ |f^{sdp}_{\min}| \}$$

is always an upper bound of $|f(x)|$ on $S^{n-1}$. Whether or not (3.1) holds, the error

$$(3.2) \qquad \mathtt{aprxerr} := \left( |f(u)| - f_{ubd} \right) / \max \{ 1, f_{ubd} \}$$

is a measure of the approximation quality of $\lambda \cdot u^{\otimes m}$. When Algorithm 2.3 is applied to odd symmetric tensors, $f_{ubd}$ and $\mathtt{aprxerr}$ are defined similarly. As in Qi [27], we define the best rank-1 approximation ratio of a tensor $\mathcal{F} \in S^m(\mathbb{R}^n)$ as

$$(3.3) \qquad \rho(\mathcal{F}) := \max_{\mathcal{X} \in S^m(\mathbb{R}^n),\ \operatorname{rank} \mathcal{X} = 1} \ \frac{|\langle \mathcal{F}, \mathcal{X} \rangle|}{\|\mathcal{F}\| \, \|\mathcal{X}\|}.$$

If $\lambda \cdot u^{\otimes m}$, with $\|u\| = 1$ and $\lambda = f(u)$, is a best rank-1 approximation of $\mathcal{F}$, then $\rho(\mathcal{F}) = |\lambda| / \|\mathcal{F}\|$. Estimates for $\rho(\mathcal{F})$ are given in Qi [27].

Example 3.1 (see [9, Example 2]). Consider the tensor $\mathcal{F} \in S^3(\mathbb{R}^2)$ with entries

$$\mathcal{F}_{111} = 1.5578, \quad \mathcal{F}_{222} = 1.1226, \quad \mathcal{F}_{112} = -2.4443, \quad \mathcal{F}_{221} = -1.0982.$$

When Algorithm 2.3 is applied, we get the rank-1 tensor $\lambda \cdot u^{\otimes 3}$ with

$$\lambda = 3.1155, \qquad u = (0.9264, -0.3764).$$

It takes about 0.2 second. The computed matrix $M(y^*)$ has rank one, so we know $\lambda \cdot u^{\otimes 3}$ is a best rank-1 approximation. The error $\mathtt{aprxerr} = 7.3\mathrm{e}{-9}$, the ratio $\rho(\mathcal{F}) = 0.6203$, the residual $\|\mathcal{F} - \lambda \cdot u^{\otimes 3}\| = 3.9399$, and $\|\mathcal{F}\| = 5.0228$.
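This small example is easy to double check numerically (a sketch assuming NumPy; the tensor is filled in by symmetrizing the four given entries):

```python
import numpy as np

F = np.zeros((2, 2, 2))
F[0, 0, 0], F[1, 1, 1] = 1.5578, 1.1226
for p in [(0, 0, 1), (0, 1, 0), (1, 0, 0)]:
    F[p] = -2.4443                      # F_112 and its permutations
for p in [(1, 1, 0), (1, 0, 1), (0, 1, 1)]:
    F[p] = -1.0982                      # F_221 and its permutations

u = np.array([0.9264, -0.3764])
lam = np.einsum('ijk,i,j,k->', F, u, u, u)
print(lam)                                         # ~ 3.1155
print(abs(lam) / np.linalg.norm(F))                # rho(F) ~ 0.6203
print(np.sqrt(np.linalg.norm(F)**2 - lam**2))      # residual ~ 3.9399
```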

Example 3.2 (see [16, Example 3.6], [35, Example 4.2]). Consider the tensor $\mathcal{F} \in S^3(\mathbb{R}^3)$ with entries

$$\mathcal{F}_{111} = -0.1281, \quad \mathcal{F}_{112} = 0.0516, \quad \mathcal{F}_{113} = -0.0954, \quad \mathcal{F}_{122} = -0.1958, \quad \mathcal{F}_{123} = -0.1790,$$
$$\mathcal{F}_{133} = -0.2676, \quad \mathcal{F}_{222} = 0.3251, \quad \mathcal{F}_{223} = 0.2513, \quad \mathcal{F}_{233} = 0.1773, \quad \mathcal{F}_{333} = 0.0338.$$

When Algorithm 2.3 is applied, we get the rank-1 tensor $\lambda \cdot u^{\otimes 3}$ with

$$\lambda = 0.8730, \qquad u = (-0.3921, 0.7249, 0.5664).$$

It takes about 0.2 second. The computed matrix $M(y^*)$ has rank one, so $\lambda \cdot u^{\otimes 3}$ is a best rank-1 approximation. The error $\mathtt{aprxerr} = -1.2\mathrm{e}{-7}$, the ratio $\rho(\mathcal{F}) = 0.8890$, the residual $\|\mathcal{F} - \lambda \cdot u^{\otimes 3}\| = 0.4498$, and $\|\mathcal{F}\| = 0.9820$.

Example 3.3 (see [27, Example 2]). Consider the tensor $\mathcal{F} \in S^3(\mathbb{R}^3)$ with entries

$$\mathcal{F}_{111} = 0.0517, \quad \mathcal{F}_{112} = 0.3579, \quad \mathcal{F}_{113} = 0.5298, \quad \mathcal{F}_{122} = 0.7544, \quad \mathcal{F}_{123} = 0.2156,$$
$$\mathcal{F}_{133} = 0.3612, \quad \mathcal{F}_{222} = 0.3943, \quad \mathcal{F}_{223} = 0.0146, \quad \mathcal{F}_{233} = 0.6718, \quad \mathcal{F}_{333} = 0.9723.$$

When Algorithm 2.3 is applied, we get the rank-1 tensor $\lambda \cdot u^{\otimes 3}$ with

$$\lambda = 2.1110, \qquad u = (0.5204, 0.5113, 0.6839).$$

It takes about 0.2 second. Since the computed matrix $M(y^*)$ has rank one, $\lambda \cdot u^{\otimes 3}$ is a best rank-1 approximation. The error $\mathtt{aprxerr} = -6.9\mathrm{e}{-8}$, the ratio $\rho(\mathcal{F}) = 0.8574$, the residual $\|\mathcal{F} - \lambda \cdot u^{\otimes 3}\| = 1.2672$, and $\|\mathcal{F}\| = 2.4621$.

Example 3.4 (see [16, Example 3.5], [35, Example 4.1]). Consider the tensor $\mathcal{F} \in S^4(\mathbb{R}^3)$ with entries

$$\mathcal{F}_{1111} = 0.2883, \quad \mathcal{F}_{1112} = -0.0031, \quad \mathcal{F}_{1113} = 0.1973, \quad \mathcal{F}_{1122} = -0.2485, \quad \mathcal{F}_{1123} = -0.2939,$$
$$\mathcal{F}_{1133} = 0.3847, \quad \mathcal{F}_{1222} = 0.2972, \quad \mathcal{F}_{1223} = 0.1862, \quad \mathcal{F}_{1233} = 0.0919, \quad \mathcal{F}_{1333} = -0.3619,$$
$$\mathcal{F}_{2222} = 0.1241, \quad \mathcal{F}_{2223} = -0.3420, \quad \mathcal{F}_{2233} = 0.2127, \quad \mathcal{F}_{2333} = 0.2727, \quad \mathcal{F}_{3333} = -0.3054.$$

Applying Algorithm 2.1, we get $\lambda^+ \cdot (u^+)^{\otimes m}$ and $\lambda^- \cdot (u^-)^{\otimes m}$ with

$$\lambda^+ = 0.8893, \quad u^+ = (0.6672, 0.2470, -0.7027), \qquad \lambda^- = -1.0954, \quad u^- = (0.5915, -0.7467, -0.3043).$$

It takes about 0.3 second. Since $|\lambda^+| < |\lambda^-|$, the output rank-1 tensor is $\lambda \cdot u^{\otimes 4}$ with $\lambda = \lambda^-$, $u = u^-$. The computed matrices $M(y^*)$, $M(z^*)$ both have rank one, so $\lambda \cdot u^{\otimes 4}$ is a best rank-1 approximation. The error $\mathtt{aprxerr} = -2.8\mathrm{e}{-7}$, the ratio $\rho(\mathcal{F}) = 0.4863$, the residual $\|\mathcal{F} - \lambda \cdot u^{\otimes 4}\| = 1.9683$, and $\|\mathcal{F}\| = 2.2525$.

Example 3.5. Consider the tensor $\mathcal{F} \in S^3(\mathbb{R}^n)$ with entries

$$(\mathcal{F})_{i_1,i_2,i_3} = \frac{(-1)^{i_1}}{i_1} + \frac{(-1)^{i_2}}{i_2} + \frac{(-1)^{i_3}}{i_3}.$$

For $n = 5$, we apply Algorithm 2.3 and get the rank-1 tensor $\lambda \cdot u^{\otimes 3}$ with

$$\lambda = 9.9779, \qquad u = (-0.7313, -0.1375, -0.4674, -0.2365, -0.4146).$$

It takes about 0.3 second. The error $\mathtt{aprxerr} = -1.4\mathrm{e}{-7}$. Since the computed matrix $M(y^*)$ has rank one, we know $\lambda \cdot u^{\otimes 3}$ is a best rank-1 approximation. The ratio $\rho(\mathcal{F}) = 0.8813$, the residual $\|\mathcal{F} - \lambda \cdot u^{\otimes 3}\| = 5.3498$, and $\|\mathcal{F}\| = 11.3216$.
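The structure of this tensor makes a quick check easy (a sketch assuming NumPy; indices are 1-based in the formula and 0-based in the code):

```python
import numpy as np

n = 5
a = np.array([(-1) ** i / i for i in range(1, n + 1)])
F = a[:, None, None] + a[None, :, None] + a[None, None, :]

u = np.array([-0.7313, -0.1375, -0.4674, -0.2365, -0.4146])
print(np.einsum('ijk,i,j,k->', F, u, u, u))   # ~ 9.9779 = lambda
print(np.linalg.norm(F))                      # ~ 11.3216
```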

For a range of values of $n$ from 10 to 55, we apply Algorithm 2.3 to find rank-1 approximations. All the computed matrices $M(y^*)$ have rank one, so we get best rank-1 approximations for all of them. The computational results are listed in Table 1. For $n = 5, 10, 15, 20, 25, 30$, Algorithm 2.3 takes less than 1 minute; for $n = 35, 40, 45$,
