SEMIDEFINITE RELAXATIONS FOR BEST RANK-1 TENSOR APPROXIMATIONS

JIAWANG NIE AND LI WANG

Abstract. This paper studies the problem of finding best rank-1 approximations for both symmetric and nonsymmetric tensors. For symmetric tensors, this is equivalent to optimizing homogeneous polynomials over unit spheres; for nonsymmetric tensors, this is equivalent to optimizing multiquadratic forms over multispheres. We propose semidefinite relaxations, based on sum of squares representations, to solve these polynomial optimization problems. Their special properties and structures are studied. In applications, the resulting semidefinite programs are often large scale. The recent Newton-CG augmented Lagrangian method by Zhao, Sun, and Toh [SIAM J. Optim., 20 (2010), pp. 1737–1765] is suitable for solving these semidefinite relaxations. Extensive numerical experiments are presented to show that this approach is efficient in getting best rank-1 approximations.

Key words. form, polynomial, relaxation, rank-1 approximation, semidefinite program, sum of squares, tensor

AMS subject classifications. 15A18, 15A69, 90C22

DOI. 10.1137/130935112

1. Introduction. Let $m$ and $n_1, \dots, n_m$ be positive integers. A tensor of order $m$ and dimension $(n_1, \dots, n_m)$ is an array $\mathcal{F}$ that is indexed by integer tuples $(i_1, \dots, i_m)$ with $1 \le i_j \le n_j$ ($j = 1, \dots, m$), i.e.,

$$\mathcal{F} = (\mathcal{F}_{i_1,\dots,i_m})_{1 \le i_1 \le n_1, \dots, 1 \le i_m \le n_m}.$$

The space of all such tensors with real (resp., complex) entries is denoted as $\mathbb{R}^{n_1 \times \cdots \times n_m}$ (resp., $\mathbb{C}^{n_1 \times \cdots \times n_m}$). Tensors of order $m$ are called $m$-tensors. When $m$ equals 1 or 2, they are regular vectors or matrices. When $m = 3$ (resp., 4), they are called cubic (resp., quartic) tensors. A tensor $\mathcal{F} \in \mathbb{R}^{n_1 \times \cdots \times n_m}$ is symmetric if $n_1 = \cdots = n_m$ and

$$\mathcal{F}_{i_1,\dots,i_m} = \mathcal{F}_{j_1,\dots,j_m} \quad \text{whenever } (i_1, \dots, i_m) \sim (j_1, \dots, j_m),$$

where $\sim$ means that $(i_1, \dots, i_m)$ is a permutation of $(j_1, \dots, j_m)$. We define the norm of a tensor $\mathcal{F}$ as

$$(1.1) \qquad \|\mathcal{F}\| = \left( \sum_{i_1=1}^{n_1} \cdots \sum_{i_m=1}^{n_m} |\mathcal{F}_{i_1,\dots,i_m}|^2 \right)^{1/2}.$$

For $m = 1$, $\|\mathcal{F}\|$ is the vector 2-norm, and for $m = 2$, $\|\mathcal{F}\|$ is the matrix Frobenius norm.

Every tensor can be expressed as a linear combination of outer products of vectors.

For vectors $u^1 \in \mathbb{C}^{n_1}, \dots, u^m \in \mathbb{C}^{n_m}$, their outer product $u^1 \otimes \cdots \otimes u^m$ is the tensor in $\mathbb{C}^{n_1 \times \cdots \times n_m}$ such that for all $1 \le i_j \le n_j$ ($j = 1, \dots, m$)

$$(u^1 \otimes \cdots \otimes u^m)_{i_1,\dots,i_m} = (u^1)_{i_1} \cdots (u^m)_{i_m}.$$
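As a small illustration (a sketch assuming NumPy; the vectors below are arbitrary example data), the outer product and the norm (1.1) can be computed directly:

```python
import numpy as np

# Outer product u1 ⊗ u2 ⊗ u3: the (i1, i2, i3) entry is (u1)_{i1}(u2)_{i2}(u3)_{i3}.
u1 = np.array([1.0, 2.0])
u2 = np.array([0.0, 1.0, -1.0])
u3 = np.array([3.0, 4.0])
T = np.einsum('i,j,k->ijk', u1, u2, u3)        # a rank-1 tensor in R^{2x3x2}

# The norm (1.1) is just the 2-norm of the flattened array.
print(np.linalg.norm(T.ravel()))
# For a rank-1 tensor it factors: ||u1 ⊗ u2 ⊗ u3|| = ||u1|| ||u2|| ||u3||.
print(np.linalg.norm(u1) * np.linalg.norm(u2) * np.linalg.norm(u3))
```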

Received by the editors September 3, 2013; accepted for publication (in revised form) by T. Kolda May 23, 2014; published electronically August 28, 2014. The research was partially supported by NSF grant DMS-0844775.

http://www.siam.org/journals/simax/35-3/93511.html

Department of Mathematics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093 (njw@math.ucsd.edu, liw022@math.ucsd.edu).


For every tensor $\mathcal{F}$ of order $m$, there exist tuples $(u^{i,1}, \dots, u^{i,m})$ ($i = 1, \dots, r$), with each $u^{i,j} \in \mathbb{C}^{n_j}$, such that

$$(1.2) \qquad \mathcal{F} = \sum_{i=1}^{r} u^{i,1} \otimes \cdots \otimes u^{i,m}.$$

The smallest $r$ in the above is called the rank of $\mathcal{F}$ and is denoted as $\operatorname{rank} \mathcal{F}$. When $\operatorname{rank} \mathcal{F} = r$, (1.2) is called a rank decomposition of $\mathcal{F}$, and we say that $\mathcal{F}$ is a rank-$r$ tensor.

Tensor problems have wide applications in chemometrics, signal processing, and high order statistics [6]. For the theory and applications of tensors, we refer to Comon et al. [7], Kolda and Bader [15], and Landsberg [17]. When $m \ge 3$, determining ranks of tensors and computing rank decompositions are NP-hard (cf. Hillar and Lim [11]). We refer to Brachat et al. [1], Bernardi et al. [2], and Oeding and Ottaviani [25] for tensor decompositions. When $r > 1$, the problem of finding best rank-$r$ approximations may be ill-posed, because a best rank-$r$ approximation might not exist, as discovered by De Silva and Lim [10]. However, a best rank-1 approximation always exists. It is also NP-hard to compute best rank-1 approximations. This paper studies best real rank-1 approximations for real tensors (i.e., tensor entries are real numbers). We begin with some reviews on this subject.

1.1. Nonsymmetric tensor approximations. Given a tensor $\mathcal{F} \in \mathbb{R}^{n_1 \times \cdots \times n_m}$, we say that a tensor $\mathcal{B}$ is a best rank-1 approximation of $\mathcal{F}$ if it is a minimizer of the least squares problem

$$(1.3) \qquad \min_{\mathcal{X} \in \mathbb{R}^{n_1 \times \cdots \times n_m},\ \operatorname{rank} \mathcal{X} = 1} \ \|\mathcal{F} - \mathcal{X}\|^2.$$

This is equivalent to a homogeneous polynomial optimization problem, as shown in [9]. For convenience of description, we define the homogeneous polynomial

$$(1.4) \qquad F(x^1, \dots, x^m) := \sum_{1 \le i_1 \le n_1, \dots, 1 \le i_m \le n_m} \mathcal{F}_{i_1,\dots,i_m} (x^1)_{i_1} \cdots (x^m)_{i_m},$$

which is in $x^1 \in \mathbb{R}^{n_1}, \dots, x^m \in \mathbb{R}^{n_m}$. Note that $F(x^1, \dots, x^m)$ is a multilinear form (a form is a homogeneous polynomial), since it is linear in each $x^j$. De Lathauwer, De Moor, and Vandewalle [9] proved the following result.

Theorem 1.1 (see [9, Theorem 3.1]). For a tensor $\mathcal{F} \in \mathbb{R}^{n_1 \times \cdots \times n_m}$, the rank-1 approximation problem (1.3) is equivalent to the maximization problem

$$(1.5) \qquad \max_{x^1 \in \mathbb{R}^{n_1}, \dots, x^m \in \mathbb{R}^{n_m}} \ |F(x^1, \dots, x^m)| \quad \text{s.t.} \quad \|x^1\| = \cdots = \|x^m\| = 1,$$

that is, a rank-1 tensor $\lambda \cdot (u^1 \otimes \cdots \otimes u^m)$, with $\lambda \in \mathbb{R}$ and each $\|u^j\| = 1$, is a best rank-1 approximation for $\mathcal{F}$ if and only if $(u^1, \dots, u^m)$ is a global maximizer of (1.5) and $\lambda = F(u^1, \dots, u^m)$. Moreover, it also holds that

$$\|\mathcal{F} - \lambda \cdot (u^1 \otimes \cdots \otimes u^m)\|^2 = \|\mathcal{F}\|^2 - \lambda^2.$$
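The last identity holds for any unit vectors once $\lambda = F(u^1, \dots, u^m)$, not only at the optimum; optimality only makes the residual as small as possible. A quick numeric check (a sketch assuming NumPy, with a random tensor as stand-in data):

```python
import numpy as np

rng = np.random.default_rng(0)
F = rng.standard_normal((3, 4, 5))                       # a 3-tensor
u = [v / np.linalg.norm(v)                               # random unit vectors
     for v in (rng.standard_normal(n) for n in F.shape)]

lam = np.einsum('ijk,i,j,k->', F, *u)                    # lambda = F(u1, u2, u3)
R = F - lam * np.einsum('i,j,k->ijk', *u)                # residual tensor
print(np.linalg.norm(R)**2)                              # ||F - lam u1⊗u2⊗u3||^2
print(np.linalg.norm(F)**2 - lam**2)                     # equals ||F||^2 - lam^2
```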

By Theorem 1.1, to find a best rank-1 approximation, it is enough to find a global maximizer of the multilinear optimization problem (1.5). There exist methods for finding rank-1 approximations, like the alternating least squares (ALS), truncated higher order singular value decomposition (HOSVD), higher order power method (HOPM), and quasi-Newton methods. We refer to Zhang and Golub [33], De Lathauwer, De Moor, and Vandewalle [8, 9], Savas and Lim [29], and the references therein. An advantage of these methods is that they can be easily implemented. These kinds of methods typically generate a sequence that converges to a locally optimal rank-1 approximation or even just a stationary point. Even in the lucky cases where they get globally optimal rank-1 approximations, it is usually very difficult to verify the global optimality by these methods.

1.2. Symmetric tensor approximations. Let $S^m(\mathbb{R}^n)$ be the space of real symmetric tensors of order $m$ and in dimension $n$. Given $\mathcal{F} \in S^m(\mathbb{R}^n)$, we say that $\mathcal{B}$ is a best rank-1 approximation of $\mathcal{F}$ if it is a minimizer of the optimization problem

$$(1.6) \qquad \min_{\mathcal{X} \in \mathbb{R}^{n \times \cdots \times n},\ \operatorname{rank} \mathcal{X} = 1} \ \|\mathcal{F} - \mathcal{X}\|^2.$$

When $\mathcal{F}$ is symmetric, Zhang, Ling, and Qi [35] showed that (1.6) has a global minimizer that belongs to $S^m(\mathbb{R}^n)$, i.e., (1.6) has an optimizer that is a symmetric tensor. A best rank-1 approximation of a symmetric tensor might not itself be symmetric, but there is always at least one global minimizer of (1.6) that is a symmetric rank-1 tensor. Therefore, for convenience, by best rank-1 approximation for symmetric tensors, we mean best symmetric rank-1 approximation.

A symmetric tensor in $S^m(\mathbb{R}^n)$ is rank-1 if and only if it equals $\lambda \cdot (u \otimes \cdots \otimes u)$ for some $\lambda \in \mathbb{R}$ and $u \in \mathbb{R}^n$. For convenience, denote $u^{\otimes m} := u \otimes \cdots \otimes u$ ($u$ is repeated $m$ times). In the spirit of Theorem 1.1 and the work [35], (1.6) is equivalent to the optimization problem

$$(1.7) \qquad \max_{x \in \mathbb{R}^n} \ |f(x)| \quad \text{s.t.} \quad \|x\| = 1,$$

where $f(x) := F(x, \dots, x)$. Therefore, if $u$ is a global maximizer of (1.7) and $\lambda = f(u)$, then $\lambda \cdot u^{\otimes m}$ is a best rank-1 approximation of $\mathcal{F}$. Clearly, to solve (1.7), we need to solve two maximization problems:

$$(1.8) \qquad \text{(I)} \ \max_{x \in S^{n-1}} f(x), \qquad \text{(II)} \ \max_{x \in S^{n-1}} -f(x),$$

where $S^{n-1} := \{x \in \mathbb{R}^n : \|x\| = 1\}$ is the $(n-1)$-dimensional unit sphere. Suppose $u^+, u^-$ are global maximizers of (I), (II) in (1.8), respectively. By Theorem 1.1, if $|f(u^+)| \ge |f(u^-)|$, then $f(u^+) \cdot (u^+)^{\otimes m}$ is the best rank-1 approximation; otherwise, $f(u^-) \cdot (u^-)^{\otimes m}$ is the best.
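Concretely, for a symmetric 3-tensor, evaluating $f(x) = F(x, x, x)$ and selecting between the two subproblems of (1.8) is only a few lines (a sketch assuming NumPy; `u_plus` and `u_minus` stand for computed maximizers of (I) and (II)):

```python
import numpy as np

def f(F, x):
    # f(x) = F(x, x, x) for a symmetric 3-tensor F.
    return np.einsum('ijk,i,j,k->', F, x, x, x)

def best_of(F, u_plus, u_minus):
    # Pick the subproblem winner as described after (1.8).
    lp, lm = f(F, u_plus), f(F, u_minus)
    return (lp, u_plus) if abs(lp) >= abs(lm) else (lm, u_minus)
```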

For an introduction to symmetric tensors, we refer to Comon et al. [7]. Finding best rank-1 approximations for symmetric tensors is also NP-hard when $m \ge 3$. There exist methods for computing rank-1 approximations for symmetric tensors. When HOPM is directly applied, it is often unreliable for attaining a good symmetric rank-1 approximation, as pointed out in [9]. To get good symmetric rank-1 approximations, Kofidis and Regalia [14] proposed a symmetric higher order power method (SHOPM), and Zhang, Ling, and Qi [35] proposed a modified power method. These methods can be easily implemented. As for nonsymmetric tensors, they often generate a locally optimal rank-1 approximation or even just a stationary point. Even in the lucky cases in which a globally optimal rank-1 approximation is found, these methods typically have difficulty verifying its global optimality. The problem (1.7) is related to extreme Z-eigenvalues of symmetric tensors. Recently, Hu, Huang, and Qi [12] proposed a method for computing extreme Z-eigenvalues for symmetric tensors of even orders. It solves a sequence of semidefinite relaxations based on sum of squares representations.


1.3. Contributions of the paper. In this paper, we propose a new approach for computing best rank-1 tensor approximations, i.e., using semidefinite program (SDP) relaxations. As we have seen, for nonsymmetric tensors, the problem is equivalent to optimizing a multilinear form over multispheres, i.e., (1.5); for symmetric tensors, it is equivalent to optimizing a homogeneous polynomial over the unit sphere, i.e., (1.7). Recently, there has been extensive work on solving polynomial optimization problems by using semidefinite relaxations and sum of squares representations, e.g., Lasserre [18], Parrilo and Sturmfels [26], and Nie and Wang [24]. These relaxations are often tight, and generally they are able to get global optimizers, which can be verified mathematically. We refer to Lasserre's book [19] and Laurent's survey [20] for an overview of the work in this area.

For a nonsymmetric tensor $\mathcal{F}$, to get a best rank-1 approximation is equivalent to solving the multilinear optimization problem (1.5). When $\mathcal{F}$ is symmetric, to get a best rank-1 approximation for $\mathcal{F}$, we need to solve the homogeneous optimization problem (1.7). We solve these polynomial optimization problems by using semidefinite relaxations, based on sum of squares representations. In applications, the resulting semidefinite programs are often large scale. The traditional interior point methods for semidefinite programs are generally too expensive for solving them. The recent Newton-CG augmented Lagrangian method by Zhao, Sun, and Toh [32] is very efficient for solving such big semidefinite programs, as shown in [24]. In this paper, we use this method to compute best rank-1 approximations for tensors.

The paper is organized as follows. In section 2, we show how to find best rank-1 approximations by using semidefinite relaxations and we propose numerical algorithms based on them. Their special structures and properties are also studied. In section 3, we present extensive numerical experiments to show the efficiency of these semidefinite relaxations. In section 4, we discuss this approach and future work.

Notation. The symbol $\mathbb{N}$ (resp., $\mathbb{R}$, $\mathbb{C}$) denotes the set of nonnegative integers (resp., real numbers, complex numbers). For two tensors $\mathcal{X}, \mathcal{Y} \in \mathbb{R}^{n_1 \times \cdots \times n_m}$, define their inner product as

$$\langle \mathcal{X}, \mathcal{Y} \rangle := \sum_{1 \le i_1 \le n_1, \dots, 1 \le i_m \le n_m} \mathcal{X}_{i_1,\dots,i_m} \mathcal{Y}_{i_1,\dots,i_m}.$$

For $t \in \mathbb{R}$, $\lceil t \rceil$ denotes the smallest integer that is not smaller than $t$. For an integer $n > 0$, $[n]$ denotes the set $\{1, \dots, n\}$. For $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}^n$, $|\alpha| := \alpha_1 + \cdots + \alpha_n$. Denote $\mathbb{N}^n_m = \{\alpha \in \mathbb{N}^n : |\alpha| = m\}$. For $x = (x_1, \dots, x_n) \in \mathbb{R}^n$, $x^\alpha$ denotes $x_1^{\alpha_1} \cdots x_n^{\alpha_n}$. Let $\mathbb{R}[x]$ be the ring of polynomials with real coefficients in $x$. A polynomial $p$ is said to be a sum of squares (SOS) if $p = p_1^2 + \cdots + p_k^2$ for some $p_1, \dots, p_k \in \mathbb{R}[x]$. The symbol $\Sigma_{n,m}$ denotes the cone of SOS forms in $(x_1, \dots, x_n)$ of degree $m$. For a finite set $T$, $|T|$ denotes its cardinality. The symbol $e_i$ denotes the $i$th unit vector, i.e., $e_i$ is the vector whose $i$th entry equals one and all others equal zero. For a matrix $A$, $A^T$ denotes its transpose. For a symmetric matrix $W$, $W \succeq 0$ (resp., $\succ 0$) means that $W$ is positive semidefinite (resp., definite). For any vector $u \in \mathbb{R}^N$, $\|u\| = \sqrt{u^T u}$ denotes the standard Euclidean 2-norm.

2. Semidefinite relaxations and algorithms. To find best rank-1 tensor approximations is equivalent to solving some homogeneous polynomial optimization problems with sphere constraints. In this section, we show how to solve them by using semidefinite relaxations based on sum of squares representations, and we study their properties.


2.1. Symmetric tensors of even orders. Let $\mathcal{F} \in S^m(\mathbb{R}^n)$ with $m = 2d$ even. To get a best rank-1 approximation of $\mathcal{F}$, we need to solve (1.7), i.e., to maximize $|f(x)|$ over the unit sphere. For this purpose, we need to find the maximum and minimum of $f(x)$ over $S^{n-1}$.

First, we consider the maximization problem:

$$(2.1) \qquad f_{\max} := \max \ f(x) \quad \text{s.t.} \quad x^T x = 1.$$

The form $f$ is in $x := (x_1, \dots, x_n)$, determined by the tensor $\mathcal{F}$ as

$$(2.2) \qquad f(x) = \sum_{1 \le i_1, \dots, i_m \le n} \mathcal{F}_{i_1,\dots,i_m} x_{i_1} \cdots x_{i_m}.$$

Let $[x^d]$ be the monomial vector:

$$[x^d] := \begin{pmatrix} x_1^d & x_1^{d-1} x_2 & \cdots & x_1^{d-1} x_n & \cdots & x_n^d \end{pmatrix}^T.$$

Its length is $\binom{n+d-1}{d}$. The outer product $[x^d][x^d]^T$ is a symmetric matrix with entries being monomials of degree $m$. Let $A_\alpha$ be symmetric matrices such that

$$[x^d][x^d]^T = \sum_{\alpha \in \mathbb{N}^n_m} A_\alpha x^\alpha.$$

For $y \in \mathbb{R}^{\mathbb{N}^n_m}$, define the matrix-valued function $M(y)$ as

$$M(y) := \sum_{\alpha \in \mathbb{N}^n_m} A_\alpha y_\alpha.$$

Then $M(y)$ is a linear pencil in $y$ (i.e., $M(y)$ is a linear matrix-valued function in $y$).
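As an illustration of this construction (a sketch assuming NumPy; small sizes chosen for readability), the following builds $M(y)$ directly from its defining property $M(y)_{ij} = y_{\alpha_i + \alpha_j}$, where $\alpha_i$ is the exponent of the $i$th monomial in $[x^d]$:

```python
import numpy as np
from itertools import combinations_with_replacement

n, d = 2, 2                                                # so m = 2d = 4
mons = list(combinations_with_replacement(range(n), d))    # degree-d monomials

def expo(mon):
    # Exponent vector in N^n of a monomial given by its variable indices.
    a = np.zeros(n, dtype=int)
    for i in mon:
        a[i] += 1
    return a

def M(y):
    # y: dict mapping exponent tuples alpha (with |alpha| = m) to moments.
    N = len(mons)
    out = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            out[i, j] = y[tuple(expo(mons[i]) + expo(mons[j]))]
    return out

# For the moments of a point, y = [u^m], M(y) = [u^d][u^d]^T has rank one.
u = np.array([0.6, 0.8])
y = {tuple(expo(a) + expo(b)): float(np.prod(u ** (expo(a) + expo(b))))
     for a in mons for b in mons}
print(np.linalg.matrix_rank(M(y)))                         # prints 1
```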

Let $f_\alpha, g_\alpha$ be the coefficients such that

$$f = \sum_{\alpha \in \mathbb{N}^n_m} f_\alpha x^\alpha, \qquad g := (x^T x)^d = \sum_{\alpha \in \mathbb{N}^n_m} g_\alpha x^\alpha.$$

For $y \in \mathbb{R}^{\mathbb{N}^n_m}$, define

$$\langle f, y \rangle := \sum_{\alpha \in \mathbb{N}^n_m} f_\alpha y_\alpha, \qquad \langle g, y \rangle := \sum_{\alpha \in \mathbb{N}^n_m} g_\alpha y_\alpha.$$

A semidefinite relaxation of (2.1) is (cf. [23, 24])

$$(2.3) \qquad f^{sdp}_{\max} := \max_{y \in \mathbb{R}^{\mathbb{N}^n_m}} \ \langle f, y \rangle \quad \text{s.t.} \quad M(y) \succeq 0, \ \langle g, y \rangle = 1.$$

It can be shown that the dual of the above is

$$(2.4) \qquad \min \ \gamma \quad \text{s.t.} \quad \gamma g - f \in \Sigma_{n,m}.$$

In the above, $\Sigma_{n,m}$ denotes the cone of SOS forms of degree $m$ in the variables $x_1, \dots, x_n$. Clearly, $f^{sdp}_{\max} \ge f_{\max}$. When $f^{sdp}_{\max} = f_{\max}$, we say that the semidefinite relaxation (2.3) is tight.
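For intuition, here is a hedged sketch of (2.3) for $n = 2$, $m = 4$ ($d = 2$), assuming the cvxpy package with an SDP-capable solver (e.g., SCS); the coefficients $f_\alpha$ below are arbitrary example data, not from the paper. With $[x^2] = (x_1^2, x_1 x_2, x_2^2)$, the moments are indexed by the exponents $(4,0), (3,1), (2,2), (1,3), (0,4)$, and $g = (x_1^2 + x_2^2)^2$ gives $\langle g, y \rangle = y_{(4,0)} + 2 y_{(2,2)} + y_{(0,4)}$.

```python
import cvxpy as cp

# Example coefficients f_alpha of a quartic form f (placeholder data).
f = {(4, 0): 1.0, (3, 1): -2.0, (2, 2): 0.5, (1, 3): 0.0, (0, 4): 1.5}

y = {a: cp.Variable() for a in f}             # moment variables y_alpha
M = cp.bmat([[y[(4, 0)], y[(3, 1)], y[(2, 2)]],
             [y[(3, 1)], y[(2, 2)], y[(1, 3)]],
             [y[(2, 2)], y[(1, 3)], y[(0, 4)]]])

constraints = [M >> 0,                                       # M(y) PSD
               y[(4, 0)] + 2 * y[(2, 2)] + y[(0, 4)] == 1]   # <g, y> = 1
prob = cp.Problem(cp.Maximize(sum(f[a] * y[a] for a in f)), constraints)
prob.solve()
print(prob.value)      # f_max^sdp, an upper bound for max f(x) on ||x|| = 1
```

If the optimal moment matrix has rank one, the optimizer of (2.1) can be read off as in (2.5) below.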


The dual problem (2.4) is a linear optimization problem with SOS type constraints. Recently, Hu, Huang, and Qi [12] proposed an SOS relaxation method for computing extreme Z-eigenvalues for symmetric tensors of even orders. Since not every nonnegative form is SOS, they consider a sequence of nested SOS relaxations. The problem (2.4) is equivalent to the lowest order relaxation in [12, section 4]. In practice, the relaxation (2.4) is often tight, and it is frequently used because of its simplicity. The SOS relaxation (2.4) was also proposed in [23] for optimizing forms over unit spheres. Its approximation quality was also analyzed there.

The feasible set of (2.3) is compact, because $\operatorname{Trace}(M(y)) \le \langle g, y \rangle = 1$. So (2.3) always has a maximizer, say, $y^*$. If $\operatorname{rank} M(y^*) = 1$, then (2.3) is a tight relaxation. In such a case, there exists $v^+ \in S^{n-1}$ such that $y^* = [(v^+)^m]$, because of the structure of $M(y)$. Then, it holds that

$$f(v^+) = \langle f, y^* \rangle = f^{sdp}_{\max} \ge f_{\max}.$$

This implies that $v^+$ is a global maximizer of (2.1). The vector $v^+$ can be chosen numerically as follows. Let $s \in [n]$ be the index such that

$$y^*_{2d e_s} = \max_{1 \le i \le n} \ y^*_{2d e_i}.$$

Then choose $v^+$ as the normalization:

$$(2.5) \qquad \hat{u} = \left( y^*_{(2d-1)e_s + e_1}, \ \dots, \ y^*_{(2d-1)e_s + e_n} \right), \qquad v^+ = \hat{u} / \|\hat{u}\|.$$

If $\operatorname{rank} M(y^*) > 1$ but $M(y^*)$ satisfies a further rank condition, then (2.3) is also tight (cf. [24, section 2.2]). In computations, no matter whether (2.3) is tight or not, the vector $v^+$ selected as in (2.5) can be used as an approximation for a maximizer of (2.1).
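In code, the selection rule (2.5) only reads off $n + 1$ moments (a sketch assuming NumPy, with the moments stored in a dict keyed by exponent tuples, as in the earlier sketch):

```python
import numpy as np

def extract_v_plus(y, n, d):
    # Implements (2.5): pick s maximizing y_{2d e_s}, then normalize the
    # vector of moments y_{(2d-1) e_s + e_i}, i = 1, ..., n.
    m = 2 * d
    def e(i, k=1):
        a = [0] * n
        a[i] = k
        return np.array(a)
    s = max(range(n), key=lambda i: y[tuple(e(i, m))])
    u = np.array([y[tuple(e(s, m - 1) + e(i))] for i in range(n)])
    return u / np.linalg.norm(u)
```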

Second, we consider the minimization problem:

$$(2.6) \qquad f_{\min} := \min \ f(x) \quad \text{s.t.} \quad x^T x = 1.$$

Similarly, a semidefinite relaxation of (2.6) is

$$(2.7) \qquad f^{sdp}_{\min} := \min_{z \in \mathbb{R}^{\mathbb{N}^n_m}} \ \langle f, z \rangle \quad \text{s.t.} \quad M(z) \succeq 0, \ \langle g, z \rangle = 1.$$

Its dual optimization problem can be shown to be

$$(2.8) \qquad \max \ \eta \quad \text{s.t.} \quad f - \eta g \in \Sigma_{n,m}.$$

Let $z^*$ be a minimizer of (2.7). Similarly, if $\operatorname{rank} M(z^*) = 1$, then (2.7) is a tight relaxation. In such a case, a global minimizer $v^-$ can be found as follows: let $t \in [n]$ be the index such that

$$z^*_{2d e_t} = \max_{1 \le i \le n} \ z^*_{2d e_i},$$

then choose $v^-$ as the normalization:

$$(2.9) \qquad \tilde{u} = \left( z^*_{(2d-1)e_t + e_1}, \ \dots, \ z^*_{(2d-1)e_t + e_n} \right), \qquad v^- = \tilde{u} / \|\tilde{u}\|.$$


When $\operatorname{rank} M(z^*) > 1$, (2.7) might not be a tight relaxation. In computations, the vector $v^-$ can be used as an approximation for a minimizer of (2.6).

Combining the above and inspired by Theorem 1.1, we get the following algorithm.

Algorithm 2.1. Rank-1 approximations for even symmetric tensors.

Input: A symmetric tensor $\mathcal{F} \in S^m(\mathbb{R}^n)$ with an even order $m = 2d$.

Output: A rank-1 tensor $\lambda \cdot u^{\otimes m}$ with $\lambda \in \mathbb{R}$ and $u \in S^{n-1}$.

Procedure:

Step 1. Solve the semidefinite relaxation (2.3) and get an optimizer $y^*$.

Step 2. Choose $v^+$ as in (2.5). If $\operatorname{rank} M(y^*) = 1$, let $u^+ = v^+$; otherwise, apply a nonlinear optimization method to get a better solution $u^+$ of (2.1), by using $v^+$ as a starting point. Let $\lambda^+ = f(u^+)$.

Step 3. Solve the semidefinite relaxation (2.7) and get an optimizer $z^*$.

Step 4. Choose $v^-$ as in (2.9). If $\operatorname{rank} M(z^*) = 1$, let $u^- = v^-$; otherwise, apply a nonlinear optimization method to get a better solution $u^-$ of (2.6), by using $v^-$ as a starting point. Let $\lambda^- = f(u^-)$.

Step 5. If $|\lambda^+| \ge |\lambda^-|$, output $(\lambda, u) := (\lambda^+, u^+)$; otherwise, output $(\lambda, u) := (\lambda^-, u^-)$.
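Steps 2 and 4 call a local nonlinear solver (the paper uses MATLAB's fmincon in section 3). As a hedged stand-in, a sphere-constrained polish with SciPy might look as follows, where `f` is a callable evaluating the form:

```python
import numpy as np
from scipy.optimize import minimize

def polish(f, v, sign=1.0):
    # Locally maximize sign*f(x) subject to x^T x = 1, starting from v
    # (SLSQP is SciPy's default for equality-constrained problems).
    con = {'type': 'eq', 'fun': lambda x: x @ x - 1.0}
    res = minimize(lambda x: -sign * f(x), v, constraints=[con])
    return res.x / np.linalg.norm(res.x)

# Usage: u_plus = polish(f, v_plus) for (2.1);
#        u_minus = polish(f, v_minus, sign=-1.0) for (2.6).
```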

Remark. In Algorithm 2.1, if $\operatorname{rank} M(y^*) = \operatorname{rank} M(z^*) = 1$, then the output $\lambda \cdot u^{\otimes m}$ is a best rank-1 approximation of $\mathcal{F}$. If this rank condition is not satisfied, then $\lambda \cdot u^{\otimes m}$ might not be best. The approximation qualities of the semidefinite relaxations (2.3) and (2.7) are analyzed in [23, section 2].

When $n = 2$ or $(n, m) = (3, 4)$, the semidefinite relaxations (2.3) and (2.7) are always tight. This is because every bivariate, or ternary quartic, nonnegative form is always SOS (cf. [28]). For other cases of $(n, m)$, this is not necessarily true. However, failure of tightness does not occur very often in applications. In our numerical experiments, the semidefinite relaxations (2.3) and (2.7) are often tight.

We want to know when the ranks of $M(y^*)$ and $M(z^*)$ are equal to one. Clearly, when they have rank one, the relaxations (2.3) and (2.7) must be tight. Interestingly, the reverse is often true, although it might fail in very few cases. This fact has been observed in the field of polynomial optimization; however, we are not able to find suitable references explaining it. Here, for lack of suitable references, we give a natural geometric interpretation of this phenomenon. Let $\partial \Sigma_{n,m}$ be the boundary of the cone $\Sigma_{n,m}$.

Theorem 2.2. Let $f_{\max}$, $f^{sdp}_{\max}$, $f_{\min}$, $f^{sdp}_{\min}$, $y^*$, $z^*$ be as above.

(i) Suppose $f_{\max} = f^{sdp}_{\max}$. If $f_{\max} \cdot g - f$ is a smooth point of $\partial \Sigma_{n,m}$, then $\operatorname{rank} M(y^*) = 1$.

(ii) Suppose $f_{\min} = f^{sdp}_{\min}$. If $f - f_{\min} \cdot g$ is a smooth point of $\partial \Sigma_{n,m}$, then $\operatorname{rank} M(z^*) = 1$.

Proof. (i) Let $\mu$ be the uniform probability measure on the unit sphere $S^{n-1}$, and let $y_\mu := \int [x^m] \, d\mu \in \mathbb{R}^{\mathbb{N}^n_m}$. Then, we can show that $M(y_\mu) \succ 0$ and $\langle g, y_\mu \rangle = 1$. This shows that $y_\mu$ is an interior point of (2.3). So, strong duality holds, that is, the optimal values of (2.3) and (2.4) are equal, and (2.4) achieves the optimal value $f^{sdp}_{\max}$. The form $\sigma := f_{\max} \cdot g - f$ belongs to $\Sigma_{n,m}$, because $f_{\max} = f^{sdp}_{\max}$. Let $u$ be a maximizer of $f$ on $S^{n-1}$. Then, $\sigma(u) = 0$. So, $\sigma$ lies on the boundary $\partial \Sigma_{n,m}$. Let $\hat{y} = [u^m]$. Denote by $\mathbb{R}[x]_m$ the space of all forms in $x$ of degree $m$. Then

$$H_{\hat{y}} := \{ p \in \mathbb{R}[x]_m : \langle p, \hat{y} \rangle = 0 \}$$

is a supporting hyperplane of $\Sigma_{n,m}$ through $\sigma$, because

$$\langle p, \hat{y} \rangle = p(u) \ge 0 \quad \forall \, p \in \Sigma_{n,m}$$

and $\langle \sigma, \hat{y} \rangle = \sigma(u) = 0$. Because $M(y^*) \succeq 0$ and $\langle p, y^* \rangle \ge 0$ for all $p \in \Sigma_{n,m}$, the hyperplane

$$H_{y^*} := \{ p \in \mathbb{R}[x]_m : \langle p, y^* \rangle = 0 \}$$

also supports $\Sigma_{n,m}$ through $\sigma$. Since $\sigma$ is a smooth point of $\partial \Sigma_{n,m}$, there is a unique supporting hyperplane $H$ through $\sigma$. So, $y^* = \hat{y} = [u^m]$. This implies that $M(y^*) = M([u^m]) = [u^d][u^d]^T$ has rank one, where $d = m/2$.

(ii) The proof is the same as for (i).

By Theorem 2.2, when the semidefinite relaxations (2.3) and (2.7) are tight, we often have $\operatorname{rank} M(y^*) = \operatorname{rank} M(z^*) = 1$. This fact was observed in our numerical experiments. The reason is that the set of nonsmooth points of the boundary $\partial \Sigma_{n,m}$ has a strictly smaller dimension than $\partial \Sigma_{n,m}$.

2.2. Symmetric tensors of odd orders. Let $\mathcal{F} \in S^m(\mathbb{R}^n)$ with $m = 2d - 1$ odd. To get a best rank-1 approximation of $\mathcal{F}$, we need to solve the optimization problem (1.7). Since the form $f$, defined as in (2.2), has the odd degree $m$, (1.7) is equivalent to (2.1). We cannot directly apply a semidefinite relaxation to solve (2.1). For this purpose, we use a trick that is introduced in [23, section 4.2].

Let $f_{\max}, f_{\min}$ be as in (2.1), (2.6), respectively. Since $f$ is an odd form,

$$f_{\max} = -f_{\min} \ge 0.$$

Let $x_{n+1}$ be a new variable, in addition to $x = (x_1, \dots, x_n)$. Let

$$\tilde{x} := (x_1, \dots, x_n, x_{n+1}), \qquad \tilde{f}(\tilde{x}) := f(x) \, x_{n+1}.$$

Then $\tilde{f}(\tilde{x})$ is a form of even degree $2d$. Consider the optimization problem:

$$(2.10) \qquad \tilde{f}_{\max} := \max_{\tilde{x} \in \mathbb{R}^{n+1}} \ \tilde{f}(\tilde{x}) \quad \text{s.t.} \quad \|\tilde{x}\| = 1.$$

As shown in [23, section 4.2], it holds that

$$f_{\max} = \sqrt{2d-1} \, \left( 1 - \frac{1}{2d} \right)^{-d} \tilde{f}_{\max}.$$
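To see where this constant comes from, write $\tilde{x} = (t\,u, s)$ with $\|u\| = 1$ and $t^2 + s^2 = 1$. Since $f$ is homogeneous of degree $2d - 1$, we have $\tilde{f}(\tilde{x}) = t^{2d-1} f(u) \, s$, so (a short verification, consistent with [23, section 4.2])

$$\tilde{f}_{\max} = f_{\max} \cdot \max_{t^2 + s^2 = 1,\ t, s \ge 0} t^{2d-1} s = f_{\max} \left( \frac{2d-1}{2d} \right)^{\frac{2d-1}{2}} \left( \frac{1}{2d} \right)^{\frac{1}{2}},$$

with the inner maximum attained at $t^2 = \frac{2d-1}{2d}$, $s^2 = \frac{1}{2d}$; solving for $f_{\max}$ gives the displayed relation.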

Since $\tilde{f}$ is an even form, the semidefinite relaxation for (2.10) is

$$(2.11) \qquad \tilde{f}^{sdp}_{\max} := \max_{y \in \mathbb{R}^{\mathbb{N}^{n+1}_{2d}}} \ \langle \tilde{f}, y \rangle \quad \text{s.t.} \quad M(y) \succeq 0, \ \langle g, y \rangle = 1.$$

The vector $y$ is indexed by $(n+1)$-dimensional integer vectors.

Let $y^*$ be a maximizer of (2.11), which always exists because the feasible set of (2.11) is compact. If $\operatorname{rank} M(y^*) = 1$, then (2.11) is a tight relaxation, and a global maximizer $v^+$ of (2.10) can be chosen as in (2.5). (Note that the $n$ in (2.5) should be replaced by $n+1$.) Write $v^+$ as

$$v^+ = (\hat{v}, \hat{t}\,), \qquad \|\hat{v}\|^2 + \hat{t}^2 = 1.$$

If $\hat{v} = 0$ or $\hat{t} = 0$, then $\tilde{f}_{\max} = f_{\max} = 0$ and the zero tensor is the best rank-1 approximation of $\mathcal{F}$. So, we consider the general case $0 < |\hat{t}| < 1$. Note that $\operatorname{sign}(\hat{t}\,) \cdot \hat{v}$ is a global maximizer of $f$ on the sphere $\|x\|^2 = 1 - \hat{t}^2$. Let

$$(2.12) \qquad \hat{u} = \operatorname{sign}(\hat{t}\,) \cdot \hat{v} \Big/ \sqrt{1 - \hat{t}^2}.$$


Then $\hat{u}$ is a global maximizer of $f$ on $S^{n-1}$. When $\operatorname{rank} M(y^*) > 1$, the above $\hat{u}$ might not be a global maximizer, but it can be used as an approximation for a maximizer.

Combining the above, we get the following algorithm.

Algorithm 2.3. Rank-1 approximations for odd symmetric tensors.

Input: A symmetric tensor $\mathcal{F} \in S^m(\mathbb{R}^n)$ with an odd order $m = 2d - 1$.

Output: A rank-1 symmetric tensor $\lambda \cdot u^{\otimes m}$ with $\lambda \in \mathbb{R}$ and $u \in S^{n-1}$.

Procedure:

Step 1. Solve the semidefinite relaxation (2.11) and get an optimizer $y^*$.

Step 2. Choose $v^+$ as in (2.5) and $\hat{u}$ as in (2.12). (The $n$ in (2.5) should be replaced by $n+1$.)

Step 3. If $\operatorname{rank} M(y^*) = 1$, let $u = \hat{u}$; otherwise, apply a nonlinear optimization method to get a better solution $u$ of (2.1), by using $\hat{u}$ as a starting point. Let $\lambda = f(u)$. Output $(\lambda, u)$.

Remark. In Algorithm 2.3, if $\operatorname{rank} M(y^*) = 1$, then the output $\lambda \cdot u^{\otimes m}$ is a best rank-1 approximation of the tensor $\mathcal{F}$. If $\operatorname{rank} M(y^*) > 1$, then $\lambda \cdot u^{\otimes m}$ is not necessarily the best. The approximation quality of the semidefinite relaxation (2.11) is analyzed in [23, section 4]. In our numerical experiments, we often have $\operatorname{rank} M(y^*) = 1$. A version of Theorem 2.2 also holds for Algorithm 2.3; we omit it for simplicity.

2.3. Nonsymmetric tensors. Let $\mathcal{F} \in \mathbb{R}^{n_1 \times \cdots \times n_m}$ be a nonsymmetric tensor of order $m$. Let $F(x^1, \dots, x^m)$ be the multilinear form defined as in (1.4). To get a best rank-1 approximation of $\mathcal{F}$ is equivalent to solving the multilinear optimization problem (1.5). Here we show how to solve it by using semidefinite relaxations.

Without loss of generality, assume $n_m = \max_j n_j$. Since $F(x^1, \dots, x^m)$ is linear in $x^m$, we can write it as

$$F(x^1, \dots, x^m) = \sum_{j=1}^{n_m} (x^m)_j \, F_j(x^1, \dots, x^{m-1}),$$

where each $F_j(x^1, \dots, x^{m-1})$ is also a multilinear form. Let $m' = m - 1$, and

$$F^{sq} := \sum_{j=1}^{n_m} F_j(x^1, \dots, x^{m'})^2.$$

By the Cauchy–Schwarz inequality, it holds that

$$|F(x^1, \dots, x^m)| \le F^{sq}(x^1, \dots, x^{m'})^{1/2} \, \|x^m\|.$$

Equality occurs in the above if and only if $x^m$ is proportional to

$$\left( F_1(x^1, \dots, x^{m'}), \ \dots, \ F_{n_m}(x^1, \dots, x^{m'}) \right).$$

Therefore, (1.5) is equivalent to

$$(2.13) \qquad F_{\max} := \max_{x^1, \dots, x^{m'}} \ F^{sq}(x^1, \dots, x^{m'}) \quad \text{s.t.} \quad \|x^1\| = \cdots = \|x^{m'}\| = 1.$$
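Concretely (a sketch assuming NumPy; the tensor is random stand-in data), for $m = 3$, $m' = 2$, the slice forms $F_j$ and the equality case of the Cauchy–Schwarz bound look as follows:

```python
import numpy as np

rng = np.random.default_rng(1)
F = rng.standard_normal((3, 4, 5))                 # m = 3, so m' = 2
x1 = rng.standard_normal(3)
x1 /= np.linalg.norm(x1)
x2 = rng.standard_normal(4)
x2 /= np.linalg.norm(x2)

Fj = np.einsum('ijk,i,j->k', F, x1, x2)            # (F_1(x1,x2), ..., F_5(x1,x2))
print(np.sqrt(Fj @ Fj))                            # Fsq(x1, x2)^(1/2)

# The bound is attained at x3 proportional to (F_1, ..., F_5):
x3 = Fj / np.linalg.norm(Fj)
print(abs(np.einsum('ijk,i,j,k->', F, x1, x2, x3)))   # the same value
```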

Now we present the semidefinite relaxations for solving (2.13). The outer product

$$K(x) := x^1 \otimes \cdots \otimes x^{m'}$$

is a vector of length $n_1 \cdots n_{m'}$. Denote

$$\Omega := \{ (\imath, \jmath) : \imath, \jmath \in [n_1] \times \cdots \times [n_{m'}] \}.$$

Expand the outer product of $K(x)$ as

$$K(x) K(x)^T = \sum_{(\imath, \jmath) \in \Omega} B_{\imath,\jmath} \, (x^1)_{\imath_1} (x^1)_{\jmath_1} \cdots (x^{m'})_{\imath_{m'}} (x^{m'})_{\jmath_{m'}},$$

where each $B_{\imath,\jmath}$ is a constant symmetric matrix. For $w \in \mathbb{R}^\Omega$, define

$$K(w) := \sum_{(\imath, \jmath) \in \Omega} B_{\imath,\jmath} \, w_{\imath,\jmath}.$$

Clearly, $K(w)$ is a linear pencil in $w \in \mathbb{R}^\Omega$. Write

$$F^{sq} = \sum_{(\imath, \jmath) \in \Omega} G_{\imath,\jmath} \, (x^1)_{\imath_1} (x^1)_{\jmath_1} \cdots (x^{m'})_{\imath_{m'}} (x^{m'})_{\jmath_{m'}},$$

$$h := \|x^1\|^2 \cdots \|x^{m'}\|^2 = \sum_{(\imath, \jmath) \in \Omega} h_{\imath,\jmath} \, (x^1)_{\imath_1} (x^1)_{\jmath_1} \cdots (x^{m'})_{\imath_{m'}} (x^{m'})_{\jmath_{m'}}.$$

For $w \in \mathbb{R}^\Omega$, we denote

$$\langle F^{sq}, w \rangle := \sum_{(\imath, \jmath) \in \Omega} G_{\imath,\jmath} \, w_{\imath,\jmath}, \qquad \langle h, w \rangle := \sum_{(\imath, \jmath) \in \Omega} h_{\imath,\jmath} \, w_{\imath,\jmath}.$$

A semidefinite relaxation of (2.13) is

$$(2.14) \qquad F^{sdp}_{\max} := \max \ \langle F^{sq}, w \rangle \quad \text{s.t.} \quad K(w) \succeq 0, \ \langle h, w \rangle = 1.$$

Define $\Sigma_{n_1,\dots,n_{m'}}$ to be the cone

$$\Sigma_{n_1,\dots,n_{m'}} = \left\{ L \ \middle| \ L = L_1^2 + \cdots + L_k^2, \ \text{where each } L_i \text{ is a multilinear form in } (x^1, \dots, x^{m'}) \right\}.$$

It can be shown that the dual problem of (2.14) is

$$(2.15) \qquad \min \ \gamma \quad \text{s.t.} \quad \gamma h - F^{sq} \in \Sigma_{n_1,\dots,n_{m'}}.$$

Clearly, we always have $F^{sdp}_{\max} \ge F_{\max}$. When equality occurs, we say that (2.14) is a tight relaxation.

The feasible set of (2.14) is compact, because $\operatorname{Trace}(K(w)) = \langle h, w \rangle = 1$. Let $w^*$ be a maximizer of (2.14). As in the case of symmetric tensors, if $\operatorname{rank} K(w^*) = 1$, then (2.14) is tight, and there exist vectors $v^1, \dots, v^{m'}$ of unit length such that $w^* = (v^1 (v^1)^T) \otimes \cdots \otimes (v^{m'} (v^{m'})^T)$ and $(v^1, \dots, v^{m'})$ is a maximizer of (2.13). They can be constructed as follows. Let $\ell \in [n_1] \times \cdots \times [n_{m'}]$ be the index such that

$$w^*_{\ell,\ell} = \max_{(\imath, \imath) \in \Omega} \ w^*_{\imath,\imath}.$$

Then choose $v^j$ ($j = 1, \dots, m'$) as

$$(2.16) \qquad \hat{v}^j = \left( w^*_{\hat{1}, \ell}, \ w^*_{\hat{2}, \ell}, \ \dots, \ w^*_{\widehat{n_j}, \ell} \right), \qquad v^j = \hat{v}^j / \|\hat{v}^j\|,$$

where $\hat{k} = \ell + (k - \ell_j) \cdot e_j$ for each $k \in [n_j]$. When $\operatorname{rank} K(w^*) > 1$, the tuple $(v^1, \dots, v^{m'})$ as in (2.16) might not be a global maximizer of (2.13). But it can be used as an approximation for a maximizer of (2.13).
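For $m' = 2$ the indexing in (2.16) is easy to spell out (a sketch assuming NumPy, with the moments stored as a 4-way array `w[i1, i2, j1, j2]` $= w_{(i_1,i_2),(j_1,j_2)}$, 0-based):

```python
import numpy as np

def extract_pair(w, n1, n2):
    # Pick the multi-index l = (l1, l2) maximizing the diagonal moment
    # w_{l,l}, then read off each v^j by varying the jth slot, as in (2.16).
    diag = np.array([[w[i, j, i, j] for j in range(n2)] for i in range(n1)])
    l1, l2 = np.unravel_index(np.argmax(diag), diag.shape)
    v1 = np.array([w[k, l2, l1, l2] for k in range(n1)])
    v2 = np.array([w[l1, k, l1, l2] for k in range(n2)])
    return v1 / np.linalg.norm(v1), v2 / np.linalg.norm(v2)
```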

Combining the above, we get the following algorithm.

Algorithm 2.4. Rank-1 approximations for nonsymmetric tensors.

Input: A nonsymmetric tensor $\mathcal{F} \in \mathbb{R}^{n_1 \times \cdots \times n_m}$.

Output: A rank-1 tensor $\lambda \cdot (u^1 \otimes \cdots \otimes u^m)$ with $\lambda \in \mathbb{R}$ and each $u^j \in S^{n_j - 1}$.

Procedure:

Step 1. Solve the semidefinite relaxation (2.14) and get a maximizer $w^*$.

Step 2. Choose $(v^1, \dots, v^{m'})$ as in (2.16). Then let

$$\hat{v}^m := \left( F_1(v^1, \dots, v^{m'}), \ \dots, \ F_{n_m}(v^1, \dots, v^{m'}) \right), \qquad v^m := \hat{v}^m / \|\hat{v}^m\|.$$

Step 3. If $\operatorname{rank} K(w^*) = 1$, let $u^i = v^i$ for $i = 1, \dots, m$; otherwise, apply a nonlinear optimization method to get a better solution $(u^1, \dots, u^m)$ of (1.5), by using $(v^1, \dots, v^m)$ as a starting point.

Step 4. Let $\lambda = F(u^1, \dots, u^m)$, and output $(\lambda, u^1, \dots, u^m)$.

Remark. In Algorithm 2.4, if $\operatorname{rank} K(w^*) = 1$, then the output $\lambda \cdot u^1 \otimes \cdots \otimes u^m$ is a best rank-1 approximation of $\mathcal{F}$. If $\operatorname{rank} K(w^*) > 1$, then $\lambda \cdot u^1 \otimes \cdots \otimes u^m$ is not necessarily the best. The approximation quality of the semidefinite relaxation (2.14) is analyzed in [23, section 3].

We want to know when $\operatorname{rank} K(w^*) = 1$. Clearly, for this to be true, the relaxation (2.14) must be tight, i.e., $F_{\max} = F^{sdp}_{\max}$. As in the symmetric case, the reverse is also often true, as shown in the following theorem. Let $\partial \Sigma_{n_1,\dots,n_{m'}}$ be the boundary of the cone $\Sigma_{n_1,\dots,n_{m'}}$.

Theorem 2.5. Let $F_{\max}$, $F^{sdp}_{\max}$, $w^*$ be as above. Suppose $F_{\max} = F^{sdp}_{\max}$. If $F_{\max} \cdot h - F^{sq}$ is a smooth point of $\partial \Sigma_{n_1,\dots,n_{m'}}$, then $\operatorname{rank} K(w^*) = 1$.

Proof. This can be proved in the same way as for Theorem 2.2. Let $(u^1, \dots, u^{m'})$ be a global maximizer of (2.13). Let $\hat{w} \in \mathbb{R}^\Omega$ be the vector such that

$$\hat{w}_{\imath,\jmath} = (u^1)_{\imath_1} (u^1)_{\jmath_1} \cdots (u^{m'})_{\imath_{m'}} (u^{m'})_{\jmath_{m'}} \quad \forall \, (\imath, \jmath) \in \Omega.$$

The key point is the observation that $\langle p, \hat{w} \rangle = 0$ defines the unique supporting hyperplane of $\Sigma_{n_1,\dots,n_{m'}}$ through $F_{\max} \cdot h - F^{sq}$, when the latter is a smooth point of the boundary $\partial \Sigma_{n_1,\dots,n_{m'}}$. The proof proceeds as for Theorem 2.2.

3. Numerical experiments. In this section, we report numerical experiments on using semidefinite relaxations to find best rank-1 tensor approximations. The computations are implemented in MATLAB 7.10 on a Dell Linux desktop with 8 GB memory and an Intel 2.8 GHz CPU. In applications, the resulting semidefinite programs are often large scale. Interior point methods are not very suitable for solving such big semidefinite programs. We use the software SDPNAL [31] by Zhao, Sun, and Toh, which is based on the Newton-CG augmented Lagrangian method [32]. In our computations, the default values of the parameters in SDPNAL are used. In Algorithms 2.1, 2.3, and 2.4, if the matrices $M(y^*)$, $M(z^*)$, $K(w^*)$ do not have rank one, we apply the nonlinear program solver fmincon in the MATLAB Optimization Toolbox to improve the rank-1 approximations obtained from the semidefinite relaxations. Our numerical experiments show that these algorithms are often able to get best rank-1 approximations and that SDPNAL is efficient in solving such large-scale semidefinite relaxations.

We report the consumed computer time in the format hr:mn:sc, with hr (resp., mn, sc) standing for the consumed hours (resp., minutes, seconds). In the presented tables, min (resp., med, max) stands for the minimum (resp., median, maximum) of quantities like time and errors.

In our computations, the rank of a matrix $A$ is numerically measured as follows: if the singular values of $A$ are $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_t > 0$, then $\operatorname{rank}(A)$ is set to be the smallest $r$ such that $\sigma_{r+1}/\sigma_r < 10^{-6}$. In the display of our computational results, only four decimal digits are shown.
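This rule is straightforward to implement (a sketch assuming NumPy):

```python
import numpy as np

def numerical_rank(A, tol=1e-6):
    # Singular values s[0] >= s[1] >= ...; the rank is the smallest r
    # with s[r]/s[r-1] < tol, i.e., sigma_{r+1}/sigma_r < tol (1-based).
    s = np.linalg.svd(A, compute_uv=False)
    for r in range(1, len(s)):
        if s[r] / s[r - 1] < tol:
            return r
    return len(s)
```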

3.1. Symmetric tensor examples. In this subsection, we report numerical experiments for symmetric tensors. We apply Algorithm 2.1 for even symmetric tensors, and we apply Algorithm 2.3 for odd symmetric tensors. In Algorithm 2.1, if

$$(3.1) \qquad \operatorname{rank} M(y^*) = \operatorname{rank} M(z^*) = 1,$$

then the output tensor $\lambda \cdot u^{\otimes m}$ is a best rank-1 approximation. If $\operatorname{rank} M(y^*) > 1$ or $\operatorname{rank} M(z^*) > 1$, $\lambda \cdot u^{\otimes m}$ is not guaranteed to be a best rank-1 approximation. However, the quantity

$$f_{ubd} := \max \{ |f^{sdp}_{\max}|, \ |f^{sdp}_{\min}| \}$$

is always an upper bound of $|f(x)|$ on $S^{n-1}$. Whether or not (3.1) holds, the error

$$(3.2) \qquad \mathtt{aprxerr} := \left( |f(u)| - f_{ubd} \right) / \max \{ 1, f_{ubd} \}$$

is a measure of the approximation quality of $\lambda \cdot u^{\otimes m}$. When Algorithm 2.3 is applied to odd symmetric tensors, $f_{ubd}$ and $\mathtt{aprxerr}$ are defined similarly. As in Qi [27], we define the best rank-1 approximation ratio of a tensor $\mathcal{F} \in S^m(\mathbb{R}^n)$ as

$$(3.3) \qquad \rho(\mathcal{F}) := \max_{\mathcal{X} \in S^m(\mathbb{R}^n),\ \operatorname{rank} \mathcal{X} = 1} \ \frac{|\langle \mathcal{F}, \mathcal{X} \rangle|}{\|\mathcal{F}\| \, \|\mathcal{X}\|}.$$

If $\lambda \cdot u^{\otimes m}$, with $\|u\| = 1$ and $\lambda = f(u)$, is a best rank-1 approximation of $\mathcal{F}$, then $\rho(\mathcal{F}) = |\lambda| / \|\mathcal{F}\|$. Estimates for $\rho(\mathcal{F})$ are given in Qi [27].

Example 3.1 (see [9, Example 2]). Consider the tensor $\mathcal{F} \in S^3(\mathbb{R}^2)$ with entries

$$\mathcal{F}_{111} = 1.5578, \quad \mathcal{F}_{222} = 1.1226, \quad \mathcal{F}_{112} = -2.4443, \quad \mathcal{F}_{221} = -1.0982.$$

When Algorithm 2.3 is applied, we get the rank-1 tensor $\lambda \cdot u^{\otimes 3}$ with

$$\lambda = 3.1155, \qquad u = (0.9264, -0.3764).$$

It takes about 0.2 second. The computed matrix $M(y^*)$ has rank one, so we know $\lambda \cdot u^{\otimes 3}$ is a best rank-1 approximation. The error $\mathtt{aprxerr} = 7.3\mathrm{e}{-9}$, the ratio $\rho(\mathcal{F}) = 0.6203$, the residual $\|\mathcal{F} - \lambda \cdot u^{\otimes 3}\| = 3.9399$, and $\|\mathcal{F}\| = 5.0228$.
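This small example is easy to double check numerically (a sketch assuming NumPy; the tensor is filled in by symmetrizing the four given entries):

```python
import numpy as np

F = np.zeros((2, 2, 2))
F[0, 0, 0], F[1, 1, 1] = 1.5578, 1.1226
for p in [(0, 0, 1), (0, 1, 0), (1, 0, 0)]:
    F[p] = -2.4443                      # F_112 and its permutations
for p in [(1, 1, 0), (1, 0, 1), (0, 1, 1)]:
    F[p] = -1.0982                      # F_221 and its permutations

u = np.array([0.9264, -0.3764])
lam = np.einsum('ijk,i,j,k->', F, u, u, u)
print(lam)                                         # ~ 3.1155
print(abs(lam) / np.linalg.norm(F))                # rho(F) ~ 0.6203
print(np.sqrt(np.linalg.norm(F)**2 - lam**2))      # residual ~ 3.9399
```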

Example 3.2 (see [16, Example 3.6], [35, Example 4.2]). Consider the tensor $\mathcal{F} \in S^3(\mathbb{R}^3)$ with entries

$$\mathcal{F}_{111} = -0.1281, \quad \mathcal{F}_{112} = 0.0516, \quad \mathcal{F}_{113} = -0.0954, \quad \mathcal{F}_{122} = -0.1958, \quad \mathcal{F}_{123} = -0.1790,$$
$$\mathcal{F}_{133} = -0.2676, \quad \mathcal{F}_{222} = 0.3251, \quad \mathcal{F}_{223} = 0.2513, \quad \mathcal{F}_{233} = 0.1773, \quad \mathcal{F}_{333} = 0.0338.$$

When Algorithm 2.3 is applied, we get the rank-1 tensor $\lambda \cdot u^{\otimes 3}$ with

$$\lambda = 0.8730, \qquad u = (-0.3921, 0.7249, 0.5664).$$

It takes about 0.2 second. The computed matrix $M(y^*)$ has rank one, so $\lambda \cdot u^{\otimes 3}$ is a best rank-1 approximation. The error $\mathtt{aprxerr} = -1.2\mathrm{e}{-7}$, the ratio $\rho(\mathcal{F}) = 0.8890$, the residual $\|\mathcal{F} - \lambda \cdot u^{\otimes 3}\| = 0.4498$, and $\|\mathcal{F}\| = 0.9820$.

Example 3.3 (see [27, Example 2]). Consider the tensor $\mathcal{F} \in S^3(\mathbb{R}^3)$ with entries

$$\mathcal{F}_{111} = 0.0517, \quad \mathcal{F}_{112} = 0.3579, \quad \mathcal{F}_{113} = 0.5298, \quad \mathcal{F}_{122} = 0.7544, \quad \mathcal{F}_{123} = 0.2156,$$
$$\mathcal{F}_{133} = 0.3612, \quad \mathcal{F}_{222} = 0.3943, \quad \mathcal{F}_{223} = 0.0146, \quad \mathcal{F}_{233} = 0.6718, \quad \mathcal{F}_{333} = 0.9723.$$

When Algorithm 2.3 is applied, we get the rank-1 tensor $\lambda \cdot u^{\otimes 3}$ with

$$\lambda = 2.1110, \qquad u = (0.5204, 0.5113, 0.6839).$$

It takes about 0.2 second. Since the computed matrix $M(y^*)$ has rank one, $\lambda \cdot u^{\otimes 3}$ is a best rank-1 approximation. The error $\mathtt{aprxerr} = -6.9\mathrm{e}{-8}$, the ratio $\rho(\mathcal{F}) = 0.8574$, the residual $\|\mathcal{F} - \lambda \cdot u^{\otimes 3}\| = 1.2672$, and $\|\mathcal{F}\| = 2.4621$.

Example 3.4 (see [16, Example 3.5], [35, Example 4.1]). Consider the tensor $\mathcal{F} \in S^4(\mathbb{R}^3)$ with entries

$$\mathcal{F}_{1111} = 0.2883, \quad \mathcal{F}_{1112} = -0.0031, \quad \mathcal{F}_{1113} = 0.1973, \quad \mathcal{F}_{1122} = -0.2485, \quad \mathcal{F}_{1123} = -0.2939,$$
$$\mathcal{F}_{1133} = 0.3847, \quad \mathcal{F}_{1222} = 0.2972, \quad \mathcal{F}_{1223} = 0.1862, \quad \mathcal{F}_{1233} = 0.0919, \quad \mathcal{F}_{1333} = -0.3619,$$
$$\mathcal{F}_{2222} = 0.1241, \quad \mathcal{F}_{2223} = -0.3420, \quad \mathcal{F}_{2233} = 0.2127, \quad \mathcal{F}_{2333} = 0.2727, \quad \mathcal{F}_{3333} = -0.3054.$$

Applying Algorithm 2.1, we get $\lambda^+ \cdot (u^+)^{\otimes m}$ and $\lambda^- \cdot (u^-)^{\otimes m}$ with

$$\lambda^+ = 0.8893, \quad u^+ = (0.6672, 0.2470, -0.7027), \qquad \lambda^- = -1.0954, \quad u^- = (0.5915, -0.7467, -0.3043).$$

It takes about 0.3 second. Since $|\lambda^+| < |\lambda^-|$, the output rank-1 tensor is $\lambda \cdot u^{\otimes 4}$ with $\lambda = \lambda^-$, $u = u^-$. The computed matrices $M(y^*)$, $M(z^*)$ both have rank one, so $\lambda \cdot u^{\otimes 4}$ is a best rank-1 approximation. The error $\mathtt{aprxerr} = -2.8\mathrm{e}{-7}$, the ratio $\rho(\mathcal{F}) = 0.4863$, the residual $\|\mathcal{F} - \lambda \cdot u^{\otimes 4}\| = 1.9683$, and $\|\mathcal{F}\| = 2.2525$.

Example 3.5. Consider the tensor $\mathcal{F} \in S^3(\mathbb{R}^n)$ with entries

$$(\mathcal{F})_{i_1,i_2,i_3} = \frac{(-1)^{i_1}}{i_1} + \frac{(-1)^{i_2}}{i_2} + \frac{(-1)^{i_3}}{i_3}.$$

For $n = 5$, we apply Algorithm 2.3 and get the rank-1 tensor $\lambda \cdot u^{\otimes 3}$ with

$$\lambda = 9.9779, \qquad u = (-0.7313, -0.1375, -0.4674, -0.2365, -0.4146).$$

It takes about 0.3 second. The error $\mathtt{aprxerr} = -1.4\mathrm{e}{-7}$. Since the computed matrix $M(y^*)$ has rank one, we know $\lambda \cdot u^{\otimes 3}$ is a best rank-1 approximation. The ratio $\rho(\mathcal{F}) = 0.8813$, the residual $\|\mathcal{F} - \lambda \cdot u^{\otimes 3}\| = 5.3498$, and $\|\mathcal{F}\| = 11.3216$.
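The structure of this tensor makes a quick check easy (a sketch assuming NumPy; indices are 1-based in the formula and 0-based in the code):

```python
import numpy as np

n = 5
a = np.array([(-1) ** i / i for i in range(1, n + 1)])
F = a[:, None, None] + a[None, :, None] + a[None, None, :]

u = np.array([-0.7313, -0.1375, -0.4674, -0.2365, -0.4146])
print(np.einsum('ijk,i,j,k->', F, u, u, u))   # ~ 9.9779 = lambda
print(np.linalg.norm(F))                      # ~ 11.3216
```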

For a range of values of $n$ from 10 to 55, we apply Algorithm 2.3 to find rank-1 approximations. All the computed matrices $M(y^*)$ have rank one, so we get best rank-1 approximations for all of them. The computational results are listed in Table 1. For $n = 5, 10, 15, 20, 25, 30$, Algorithm 2.3 takes less than 1 minute; for $n = 35, 40, 45$,
