A POSITIVITY PRESERVING INVERSE ITERATION FOR FINDING THE PERRON PAIR OF AN IRREDUCIBLE NONNEGATIVE THIRD ORDER TENSOR

(1)

FINDING THE PERRON PAIR OF AN IRREDUCIBLE NONNEGATIVE THIRD ORDER TENSOR

CHING-SUNG LIU^∗, CHUN-HUA GUO^†, AND WEN-WEI LIN^‡

Abstract. We propose an inverse iterative method for computing the Perron pair of an irreducible nonnegative third order tensor. The method involves the selection of a parameterθkin the kth iteration. For every positive starting vector, the method converges quadratically and is positivity preserving in the sense that the vectors approximating the Perron vector are strictly positive in each iteration. It is also shown thatθ_k= 1 near convergence. The computational work for each iteration of the proposed method is less than four times (three times if the tensor is symmetric in modes two and three, and twice if we also take the parameter to be 1 directly) that for each iteration of the Ng–Qi–Zhou algorithm, which is linearly convergent for essentially positive tensors.

Key words. inverse iteration, nonnegative tensor,M-matrix, nonnegative matrix, positivity preserving, quadratic convergence, Perron vector, Perron root

AMS subject classifications.65F15, 65F50

1. Introduction. A real-valuedmth-order n-dimensional tensor A consists of n^mentries in R, and takes the form

A= (Ai1i2...im), Ai1i2...im ∈R, 1≤i1, i2, . . . , im≤n.

A tensor A is called nonnegative (positive) if Ai1i2...im ≥ 0 (Ai1i2...im > 0) for all i1, . . . , im. For various applications of tensors, nonnegative tensors in particular, see [13].

For an n-dimensional column vector x = [x1, x2, . . . , xn]^T ∈ Rⁿ, we define an n-dimensional column vector:

Ax^m⁻¹:=



 Xn i2,...,im=1

Ai i2...imxi2. . . xim





1≤i≤n

. (1.1)

Definition 1.1 ([3, 16]). Let Abe anmth-ordern-dimensional tensor andCbe the set of all complex numbers. Assume thatAx^m−¹ is not identically zero. We say that (λ,x)∈C×(Cⁿ\{0})is an eigenpair (eigenvalue-eigenvector) ofAif

Ax^m−¹=λx^[m−^1], (1.2)

wherex^[m⁻^1]= [x^m−₁ ¹, x^m−₂ ¹, . . . , x^m_n⁻¹]^T.

For an irreducible nonnegative tensorA, the Perron–Frobenuis theorem [3, The- orem 1.4] states that there are scalarλ_∗ >0 and unit vectorx_∗ >0 satisfying (1.2), and that |λ| ≤ λ_∗ for every eigenvalue λ of A. The number λ_∗ is then called the

∗Department of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan ([email protected]). This author was supported in part by the Ministry of Science and Technology.

†Department of Mathematics and Statistics, University of Regina, Regina, SK S4S 0A2, Canada ([email protected]). This author was supported in part by an NSERC Discovery Grant and ST Yau Center at Chiao-Da in Taiwan.

‡Department of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan ([email protected]). This author was supported in part by the Ministry of Science and Technology, the National Center for Theoretical Sciences, and ST Yau Center at Chiao-Da in Taiwan.

1

(2)

spectral radius of A, denoted by ρ(A), and is also called the Perron root ofA. The corresponding positive unit vectorx_∗ is unique [3] and is called the Perron vector of A. The Perron pair (ρ(A),x_∗) is needed in several applications. In particular, it is related to measuring higher order connectivity in hypergraphs [7, 8], and determines the probability distribution of higher-order Markov chains [16]. By computingρ(A) for a suitable irreducible nonnegative tensor A, we can also determine whether an irreducibleZ-tensor is an M-tensor [21]. The problem of computing the Perron pair has attracted much attention in recent years.

In 2009, Ng, Qi, and Zhou [16] presented a power method for computingρ(A) for a nonnegative tensor. Later on, Chang, Pearson, and Zhang [2] proved its convergence for primitive tensors. Linear convergence of the algorithm was then proved in [20]

for essentially positive tensors, with a particular starting vector. Without giving the detailed definitions, we simply mention that essentially positive tensors are primitive tensors, and primitive tensors are irreducible nonnegative tensors. Liu, Zhou, and Ibrahim [14] noted that for any irreducible nonnegative tensor A, one can apply the Ng–Qi–Zhou (NQZ) algorithm to B =A+sI, where sis a positive scalar and I is the unit tensor. Convergence of the algorithm is then guaranteed by [2] since B is primitive. For a primitive tensor, starting with any positive vector, the NQZ algorithm also produces approximations to the Perron vector that are positive vectors. So the algorithm is positivity preserving. But it is noted in [20] that the rate of convergence could be worse than linear if the tensor is primitive, but not essentially positive.

The Perron pair can also be found by Newton’s method [15], which has local quadratic convergence, but is not positivity preserving. Global convergence may be achieved through a line search [15], but this requires additional assumptions and the resulting algorithm is more complicated and is still not positivity preserving. The positivity of approximations is important in applications; if the approximations lose positivity then they may be meaningless and could not be interpreted.

For irreducible nonnegative second order tensors (i.e., matrices), there are fast- converging and positivity preserving methods [17, 5, 9] for computing the Perron pair. Our goal in this paper is to propose a fast-converging and positivity preserving algorithm for computing the Perron pair of an irreducible nonnegative third order tensorA, with a detailed convergence analysis. Third order tensors, as the immediate generalization of matrices, have received special attention [11, 12, 18].

In 1971, Noda [17] introduced a positivity preserving method for the nonnegative matrix eigenvalue problem, which has quadratic convergence [5] and is now called the Noda iteration. In this paper, we propose a positivity preserving iteration for nonnegative third order tensors by combining the idea of Newton’s method with the idea of the Noda iteration. We, therefore, call the iteration a Newton–Noda iteration (NNI). NNI is an inverse iterative method with variable shifts, and naturally preserves the strict positivity of the Perron vector in its approximations at all iterations for a positive starting vector. The major advantage of NNI is that, for any positive initial vector, it converges quadratically and computes the desired eigenpair correctly. Fur- thermore, NNI always generates a monotonically decreasing sequence of approximate eigenvalues, converging quadratically toρ(A), and the computational work (in terms of flop counts) each iteration is less than four times (and sometimes just twice) that for each iteration of the NQZ algorithm.

The paper is organized as follows. In section 2, we introduce some preliminaries and motivation. In section 3, we present a Newton–Noda iteration, and prove some basic properties for it. In section 4, we establish its convergence theory, and derive

(3)

the asymptotic convergence rate precisely. Finally, in section 5 we present some numerical examples illustrating the convergence theory and the effectiveness of NNI, and we make some concluding remarks in section 6.

2. Preliminaries, notation and motivation. A real matrixA= [Aij]∈Rⁿ^×^k is called nonnegative (positive) ifAij ≥0 (Aij>0) for alliand j. For real matrices A and B of the same size, if A−B is nonnegative (positive), we write A ≥ B (A > B). A real square matrixAis called aZ-matrix if all its off-diagonal elements are nonpositive. AnyZ-matrixAcan be written assI−B withB ≥0; it is called a nonsingularM-matrix ifs > ρ(B), and a singularM-matrix ifs=ρ(B), whereρ(·) is the spectral radius.

In addition, we denote|A|= [|A_ij|], and the superscriptT denotes the transpose of a vector or matrix. From now on we use v⁽ⁱ⁾ (instead of vi) to represent the ith element of a vectorv, since the notationvi may be confused with a vector sequence v_i. Throughout the paper, we use the 2-norm for vectors and matrices. All vectors are realn-vectors and all matrices are realn×nmatrices, unless specified otherwise.

The following result is well known (see [1, Theorems 6.2.3 and 6.2.7] for example).

Theorem 2.1. For aZ-matrixA, the following are equivalent:

(i) A is a nonsingularM-matrix.

(ii) A⁻¹≥0.

(iii) Av>0 for some vectorv>0.

An irreducible Z-matrix is a nonsingularM-matrix if and only if for some v>0 the vectorAv is nonnegative and nonzero.

The irreducibility of a tensor is a natural generalization of the irreducibility of a matrix.

Definition 2.2 ([3, 16]). An mth-order n-dimensional tensor A is called reducible if there exists a nonempty proper index subset S⊂ {1,2, . . . , n} such that

Ai1i2...im = 0, ∀ i1∈S, ∀ i2, . . . , im∈/S.

If Ais not reducible, then we callA irreducible.

For vectors v = [v⁽¹⁾,v⁽²⁾, . . . ,v⁽ⁿ⁾]^T and w = [w⁽¹⁾,w⁽²⁾, . . . ,w⁽ⁿ⁾]^T, with v⁽ⁱ⁾ 6= 0 for all i, we define ^w_v to be the n-vector whose ith element is ^w_v(i)⁽ⁱ⁾, and then define

maxw v

= max

i

w⁽ⁱ⁾ v⁽ⁱ⁾

, minw v

= min

i

w⁽ⁱ⁾ v⁽ⁱ⁾

.

Theorem 2.3 ([3, Theorems 1.4 and 4.2]). Let A be an irreducible nonnegative tensor of ordermand dimensionn. Then there existλ_∗>0and a unit vectorx_∗>0 such that

Ax^m_∗⁻¹=λ_∗x^[m_∗ ⁻^1].

If λis an eigenvalue of A, then |λ| ≤λ_∗. Denoteλ_∗ by ρ(A). If λ is an eigenvalue with a nonnegative unit eigenvectorx, thenλ=ρ(A)andx=x_∗. Moreover, for any v>0

min

Av^m⁻¹ v^[m⁻^1]

≤ρ(A)≤max

Av^m⁻¹ v^[m⁻^1]

.

(4)

For a third ordern-dimensional tensor, (1.1) can be written as

Ax²:=





x^TA1x ... x^TAnx



, (2.1)

where then×nmatricesAiare given byA(i,:,:), using the Matlab multi-dimensional array notation. From (2.1), the nonnegative tensor eigenvalue problem (1.2) can be written as a nonlinear eigenvalue problem, i.e.,

A(x)x=λx, where

A(x) =D(x)⁻¹



 x^TA1

... x^TAn



, D(x) =



 x⁽¹⁾

. ..

x⁽ⁿ⁾



. (2.2)

We will also need

B(x) =D(x)⁻¹





x^T A1+A^T₁ ... x^T An+A^T_n



. (2.3)

Note that for eachx>0,A(x) andB(x) are nonnegative and Ax²

x^[2] = A(x)x

x , B(x)x= 2A(x)x. (2.4)

Lemma 2.4. Let v be a positive vector and A be an irreducible nonnegative third order n-dimensional tensor. Then A(v) and B(v) are irreducible nonnegative matrices.

Proof. IfA(v) is a reducible matrix, then there exists a nonempty proper index subsetS⊂ {1,2, . . . , n}such that

(A(v))_ij = 0, ∀i∈S, ∀j /∈S. (2.5) Becausevis a positive vector, from (2.2) and (2.5), it follows that

(D(v)A(v))_ij =



 v^TA1

... v^TAn





ij

= 0, ∀i∈S, ∀j /∈S. (2.6)

On the other hand,

(D(v)A(v))_ij = Xn k=1

Aikjv^(k). (2.7)

SinceA ≥0 andv>0, by combining (2.6) and (2.7), it follows that Aikj= 0, ∀i∈S, ∀j /∈S, k= 1, . . . , n,

(5)

which contradicts the fact thatAis irreducible. Hence,A(v) is an irreducible matrix.

SinceB(v)≥A(v),B(v) is also an irreducible matrix.

Theorem 2.5. For any positive unit vector v, let λ= max

Av² v^[2]

. If v 6=x_∗ (wherex_∗is the Perron vector ofA) , thenλI−A(v)and2λI−B(v)are nonsingular M-matrices. If v = x_∗, then λI−A(v) and 2λI−B(v) are singular M-matrices, i.e., ρ(A(x_∗)) =ρ(A)and ρ(B(x_∗)) = 2ρ(A). Moreover, if(2ρ(A)I−B(x_∗))q≥0 for a unit vectorq, then q=±x_∗.

Proof. We have by (2.4)

λ= max Av²

v^[2]

= max

A(v)v v

≥ρ(A(v)).

Moreover, λ = max_A(v)v

v

= ρ(A(v)) if and only if A(v)v = ρ(A(v))v (see [1, Exercise 2.2.12]), i.e., Av² = ρ(A(v))v^[2], which holds if and only if v = x_∗ by Theorem 2.3. This proves the statements aboutA(v).

Similarly, we have by (2.4) 2λ= max

2Av² v^[2]

= max

B(v)v v

≥ρ(B(v)), and 2λ= max

B(v)v v

=ρ(B(v)) if and only ifv=x_∗.

For the irreducible singularM-matrixM = 2ρ(A)I−B(x_∗), we havev>0 such that Mv= 0. Given Mq≥0 for a unit vectorq. SupposeMq6= 0. Then fors >0 large enough, w = sv+q > 0 is such that Mw ≥ 0 and Mw 6= 0. Thus M is a nonsingular M-matrix by Theorem 2.1, a contradiction. Therefore, Mq = 0 and thus B(x_∗)q = 2ρ(A)q. SinceB(x_∗)x_∗ = 2ρ(A)x_∗, it follows from by the Perron–

Frobenius theorem thatq=±x_∗.

2.1. Motivation. We define two vector valued functionsr:Rⁿ⁺¹+ →Rⁿ and f :Rⁿ⁺¹+ →Rⁿ⁺¹ as follows:

r(x,λ) =λx^[2]− Ax², f(x, λ) =

−r(x,λ)

1

2 1−x^Tx

. (2.8)

Then the Jacobian off(x, λ) is given by Jf(x, λ) =−

J_xr(x,λ) x^[2]

x^T 0

. (2.9)

HereJxr(x,λ) is the matrix of partial derivatives ofr(x, λ) with respect tox, i.e., Jxr(x,λ) = 2λD(x)−G(x), (2.10) whereD(x) is defined by (2.2) and

G(x) =





x^T A1+A^T₁ ... x^T An+A^T_n



 (2.11)

withAi=A(i,:,:).

(6)

Note thatAx²= ¹₂G(x)x. It follows from (2.10) that r(x, λ) =λx^[2]− Ax²= 1

2Jxr(x, λ)x. (2.12) We now consider using Newton’s method to solvef(x,λ) = 0. Given an approximation (bx_k,bλk), Newton’s method produces the next approximation (bx_k+1,bλk+1) as follows:

Jf(xb_k,bλk) d_k

δk

=

λbkxb^[2]_k − Axb²_k

1

2 xb^T_kbx_k−1

, (2.13)

b

x_k+1=bx_k +d_k, (2.14)

bλk+1=bλk+δk. (2.15)

Using elimination in (2.13), we find

δk=

1

2 bx^T_kbx_k−1

−xb^T_k

J_xr(bx_k,bλk)−1

λbkxb^[2]_k − Axb²_k b

x^T_k

J_xr(bx_k,bλk)₋1

b x^[2]_k

. (2.16)

By (2.12) we can simplify (2.16) to

δk= −1

2xb^T_k

Jxr(bx_k,bλk)−1

b x^[2]_k

. (2.17)

From the first equation of (2.13) we have, using (2.8), (2.9), (2.14) and (2.12), 0 =Jxr(bx_k,λbk) (xb_k+1−xb_k) +r(bx_k,bλk) +δkbx^[2]_k

=Jxr(bx_k,λbk)bx_k+1−1

2Jxr(xb_k,bλk)xb_k+δkxb^[2]_k

=Jxr(bx_k,λbk)

b x_k+1−1

2xb_k

+δkxb^[2]_k . Hence, we have the following linear system

Jxr(bx_k,bλk)wb_k =bx^[2]_k , (2.18) where

b

w_k = −1 δk

b

x_k+1−1 2bx_k

, i.e., xb_k+1= 1

2bx_k−δkwb_k. (2.19) This means that bx_k+1 is a linear combination of bx_k and wb_k. Suppose we already have bx_k > 0. We would like to guarantee bx_k+1 > 0. What is needed here is that Jxr(bx_k,bλk) is a nonsingularM-matrix. In this case,wb_k>0 by (2.18) andδk <0 by (2.17), and thusxb_k+1>0.

Whenbx_k >0, the matrixJxr(bx_k,bλk) is an irreducibleZ-matrix by Lemma 2.4.

By (2.12) and Theorem 2.1, it is a nonsingularM-matrix ifbλkbx^[2]_k −Abx²_kis nonnegative and nonzero. This suggests takingbλk = max

Abx²_k b x^[2]_k

,which is precisely the idea of

(7)

the Noda iteration [17]. Newton’s method does not determine λbk in this way, and it is unlikely that Jxr(bx_k,bλk) will be a nonsingularM-matrix when (xb_k,bλk) is close to (x_∗, ρ(A)) since Jx(x_∗, ρ(A)) is a singular M-matrix. Indeed, we have examples showing that the sequence{bx_k}produced by Newton’s method can fail to be positive.

We are thus motivated to present a new algorithm that combines the idea of Newton’s method with the idea of the Noda iteration.

3. The Newton–Noda iteration and some basic properties. In this section, we will propose a Newton–Noda iteration (NNI) for computing the spectral radiusρ(A) and the associated eigenvector of an irreducible nonnegative third order tensorA, and then we prove a number of basic properties of NNI, which will be used to establish its convergence theory in section 4.

3.1. Newton–Noda iteration. Based on (2.18)–(2.19) and the Noda iteration, we propose a Newton–Noda iteration (NNI) which is an inverse iteration, and each iteration consists of four steps

Jxr(xk, λk)wk=x^[2]_k , y_k =w_k/kwkk, (3.1) e

x_k+1=x_k +θky_k, (3.2)

xk+1=xek+1/kexk+1k, (3.3) λk+1= max Ax²_k+1

x^[2]_k+1

!

, (3.4)

whereθk >0 is to be defined later by (3.11).

The following lemma shows that the parameterθk >0 in (3.2) naturally preserves the strict positivity ofx_k at all iterations.

Lemma 3.1. LetAbe an irreducible nonnegative third order tensor. Given a unit vectorx_k >0, ifx_k 6=x_∗ andθk >0, theny_k,x_k+1>0and

λk+1=λk−min h_k(θk) e x^[2]_k+1

!

, (3.5)

where

h_k(θ) = θx^[2]_k

kwkk +θ²r y_k,λk

+r x_k,λk

. (3.6)

Proof. By (2.10),Jxr(xk,λk) = 2λkD(xk)−G(xk) =D(xk) 2λkI−B(xk) . So the vectorw_k in (3.1) satisfies

2λkI−B(x_k)

w_k=x_k. (3.7)

Sinceλk= max

Ax²_k x^[2]_k

andx_k6=x_∗, we know by Theorem 2.5 that 2λkI−B(xk) is a nonsingularM-matrix. Thus

w_k = 2λkI−B(x_k)−1

x_k>0.

Theny_k >0, andx_k+1 >0 since θk>0.

(8)

By (3.1),

J_xr(x_k,λk)y_k= 2λkD(x_k)−G(x_k)

y_k= x^[2]_k

kwkk. (3.8)

Therefore,

2λkD(xk)yk− x^[2]_k kw_kk

=G(xk)yk =





x^T_k A1+A^T₁ y_k ...

x^T_k An+A^T_n y_k



=





x^T_kA1y_k ... x^T_kAny_k



+





y^T_kA1x_k ... y^T_kAnx_k



. (3.9)

From (3.2) and (3.9), we have

Axe²_k+1=



 e

x^T_k+1A1xe_k+1 ... e

x^T_k+1Anex_k+1



=Ax²_k+θk





x^T_kA1y_k ... x^T_kAny_k



+θk





y^T_kA1x_k ... y^T_kAnx_k



+θ_k²Ay²_k

=Ax²_k+ 2λkθkD(xk)yk−θkx^[2]_k

kwkk +θ²_kAy²_k

=

λkx^[2]_k + 2λkθkD(xk)yk+λkθ²_ky_k^[2]

−θkx^[2]_k kwkk +θ_k²Ay²_k−λkθ_k²y^[2]_k +Ax²_k−λkx^[2]_k

=λkex^[2]_k+1−θkx^[2]_k kwkk +θ_k²

Ay²_k−λky^[2]_k

+Ax²_k−λkx^[2]_k . Therefore,

Aex²_k+1=λkex^[2]_k+1−h_k(θk), (3.10) where

h_k(θ) = θx^[2]_k

kwkk +θ²r y_k,λk

+r x_k,λk

.

From (3.10), it follows that λk+1= max Ax²_k+1

x^[2]_k+1

!

= max λkxe^[2]_k+1−h_k(θk) e

x^[2]_k+1

!

=λk−min h_k(θk) e x^[2]_k+1

! .

We next show that{λk}is strictly decreasing for suitableθk, unlessx_k =x_∗ for somek, in which case NNI terminates withλk =ρ(A).

Theorem 3.2. Let Abe an irreducible nonnegative third order tensor andη >0 be a fixed constant. Given a unit vector x_k >0, suppose x_k 6= x_∗ and θk in (3.2) satisfies

θk = (

1 ifh_k(1)≥ ^x

[2]

k

(1+η)kw_kk;

ηk otherwise, (3.11)

(9)

where for each kwith h_k(1)< ^x

[2]

k

(1+η)kwkk,

ηk = η

(1 +η)kw_kk µk−λk

min x^[2]_k y^[2]_k

!

with µk = max Ay²_k

y^[2]_k

. Then 0 < ηk <1 whenever it is defined,x_k+1 >0 in (3.3), and

λk > λk+1≥ρ(A). (3.12)

Proof. By Lemma 3.1, we have

λk+1=λk−min h_k(θk) e x^[2]_k+1

! . We need to proveh_k(θk)>0.

From (3.6), we have h_k(θ) = θx^[2]_k

kw_kk +θ²r y_k,λk

+r x_k,λk

= θx^[2]_k

kw_kk +θ²r(yk,µk) +r x_k,λk

+θ² λk−µk

y^[2]_k (3.13)

= θx^[2]_k

(1 +η)kwkk+ θηx^[2]_k (1 +η)kwkk +θ²r(yk,µk) +r x_k,λk

+θ² λk−µk

y^[2]_k , (3.14)

whereµk = max

Ay_k² y^[2]_k

.Whenµk≤λk,

h_k(1)≥ x^[2]_k

kw_kk+r(y_k,µk) +r x_k,λk

(by (3.13))

≥ x^[2]_k

(1 +η)kw_kk >0.

Wheneverh_k(1)≥ ^x

[2]

k

(1+η)kw_kk, we haveλk+1< λk withθk = 1. Ifh_k(1)< ^x

[2]

k

(1+η)kw_kk, thenµk > λk, and ηk>0 is defined. Suppose ηk ≥1. Then

η

(1 +η)kwkkmin x^[2]_k y^[2]_k

!

≥ µk−λk

,

and thus

ηx^[2]_k

(1 +η)kwkk+ λk−µk

y^[2]_k ≥0.

(10)

It follows from (3.14) thath_k(1) ≥ _(1+η)^x^[2]^k_k_w

kk, a contradiction. Soηk <1. We now have

θk =ηk = η

(1 +η)kw_kk µk−λk

min x^[2]_k y^[2]_k

!

, (3.15)

which ensures the inequality θkηx^[2]_k

(1 +η)kwkk ≥θ²_k µk−λk

y^[2]_k . (3.16)

Substituting (3.16) into (3.14), we obtain h_k(θk) =

"

θkx^[2]_k

(1 +η)kw_kk +θ²_kr(yk,µk) +r x_k,λk# +

"

θkηx^[2]_k

(1 +η)kwkk +θ²_k λk−µk

y^[2]_k

#

≥ θkx^[2]_k

(1 +η)kwkk +θ²_kr(yk,µk) +r x_k,λk

.

Therefore,

h_k(θk)≥ θkx^[2]_k

(1 +η)kwkk >0 (3.17)

and then

λk+1 =λk−min h_k(θk) e x^[2]_k+1

!

< λk,

By Theorem 2.3 we haveλk+1≥ρ(A).

Algorithm 3.1Newton–Noda iteration (NNI) 1. Givenx₀>0 withkx0k= 1, λ0= max

Ax²₀ x^[2]₀

,η >0, andtol>0.

2. fork= 0,1,2, . . .

3. Solve the linear systemJxr(xk, λk)wk=x^[2]_k . 4. Normalize the vectorw_k: y_k =w_k/kwkk.

5. Compute the scalarθk satisfying (3.11).

6. Compute the vectorxe_k+1=x_k +θky_k.

7. Normalize the vectorex_k+1: x_k+1 =xe_k+1/kex_k+1k.

8. Computeλk+1= max Ax²_k+1

x^[2]_k+1

andλ_k+1= min

Ax²_k+1 x^[2]_k+1

. 9. untilconvergence: |λk+1−λ_k+1|/λk+1<tol.

Based on (3.1)–(3.4) and (3.11), we can present NNI as Algorithm 3.1. The main computational work in each iteration is in lines 3,5, and 8. The computational work in line 8 is the same as that for one iteration of the NQZ algorithm, which is 2n³flops,

(11)

since the main computational work for one iteration of the NQZ algorithm is just one evaluation of Av² for a positive vectorv. If θk needs to be determined in step 5, then an additional 2n³flops are needed. But we will see later in this section that we always have θk = 1 near convergence. Forming the linear system in step 3 requires 2n³ flops in general. But if the tensorAis symmetric in modes two and three [13], i.e.,Ai =A^T_i for alli= 1, . . . , n, then forming the linear system only requiresO(n²) flops. Solving the linear system in step 3 by the GTH algorithm [6] will require ⁴₃n³ flops. Therefore, the computational work (in terms of flop counts) in each iteration of NNI is less than four times (and sometimes just twice) that for each iteration of the NQZ algorithm.

The vector w_k can be computed by the GTH algorithm accurately even near convergence, and is guaranteed to be positive. Therefore, Algorithm 3.1 generates a positive vector sequence {xk}, so it is a positivity preserving algorithm. In what follows we will prove some properties ofθk,x_k andy_k.These properties will help us to establish the global and quadratic convergence of NNI.

3.2. Some basic properties. Lemma 2.4 shows that B(x_∗) is an irreducible nonnegative matrix. Recall that B(x_∗)x_∗ = ρ(B(x_∗))x_∗. Then for any orthogonal matrix

x_∗ V

direct computation gives x^T_∗

V^T

B(x_∗)

x_∗ V

=

ρ(B(x_∗)) c^T

0 L

.

Similarly, for an irreducible nonnegative matrix B(xk), let u_k be the unit length positive eigenvector corresponding to ρ(B(xk)). Then for any orthogonal matrix u_k Vk

it holds that u^T_k

V_k^T

B(xk)

u_k Vk

=

ρ(B(x_k)) c^T_k

0 Lk

. (3.18)

WhenB(xk)→B(x_∗),we haveu_k→x_∗ and chooseVk such thatVk →V,and then we havec_k →candLk→L.

Now define

εk =λk−ρ(A), Bk = 2λkI−B(xk), τk = 2λk−ρ(B(xk)). (3.19) Then from (3.18) we have

u^T_k V_k^T

Bk

u_k Vk

=

τk −c^T_k 0 Lk

,

whereLk = 2λkI−Lk. For 2λk6=ρ(B(xk)), it is easy to verify that u^T_k

V_k^T

B_k⁻¹

u_k Vk

=

" ₁

τk b^T_k 0 L⁻_k¹

#

withb^T_k =^c^T^k^L

−1 k

τk . (3.20) Lemma 3.3. Assume that the sequence

λk,x_k,y_k is generated by Algorithm 3.1.

For any subsequence

x_k_j ⊆ {x_k}, we have the following results:

(i) If x_k_j →vasj → ∞,thenv>0.

(ii) If x_k_j →x_∗ asj→ ∞, theny_k_j →x_∗ asj→ ∞.

(12)

Proof. (i). If limj→∞x_k_j =v, thenv≥0. LetS be the set of all indices t such that lim_j→∞x^(t)_k

j =v^(t) = 0. Sincex_k_j= 1, S is a proper subset of {1,2, . . . , n}.

SupposeS is nonempty. Then by the definition ofλk,

λ0≥λkj = max



Ax²_k_j x^[2]_k

j



≥ x^T_k

jAtx_k_j

x^(t)_k

j

2 for allt= 1,2, . . . , n.

Since limj→∞x^(t)_k

j = 0 for t ∈ S, it holds that limj→∞x^T_k

jAtx_k_j = v^TAtv = 0 for t ∈ S. Thus, Atpq = 0 for all t ∈ S and for all p, q /∈ S, which contradicts the irreducibility ofA. Therefore,S is empty and thusv>0.

(ii). Sincex_k_j →x_∗, we haveB(xkj)→B(x_∗).Then we haveu_k_j →x_∗,εkj →0, andτkj →0, where we have used ρ(B(x_∗)) = 2ρ(A). From (3.7) and (3.20), we get

τkjw_k_j =τkjB_k⁻_j¹x_k_j = u_k_ju^T_k

j +u_k_jc^T_k

jL⁻_k_j¹V_k^T_j +τkjVkjL⁻_k_j¹V_k^T_j x_k_j. SinceL⁻_k_j¹→[ρ(B(x_∗))I−L]⁻¹ andτkj →0, we haveτkjVkjL⁻_k_j¹V_k^T_j →0. Note that V_k^T_ju_k_j = 0 andV^Tx_∗= lim_k→∞V_k^T_ju_k_j = 0. We then get

jlim→∞u_k_jc^T_k_jL⁻_k_j¹V_k^T_jx_k_j =x_∗c^T(ρ(B(x_∗))I−L)⁻¹V^Tx_∗= 0.

A combination of the above relations shows that

j→∞lim τkjw_k_j = lim

j→∞

u_k_ju^T_k_j +u_k_jc^T_k_jL⁻_k_j¹V_k^T_j +τkjVkjL⁻_k_j¹V_k^T_j

x_k_j =x_∗. Hence,

jlim→∞y_k_j = lim

j→∞

τkjw_k_j

τkjw_k_j =x_∗.

Lemma 3.4. Assume that the sequence

Then the sequence{kwkk kyk−x_kk}is bounded, that is, there exists a constantM1>

0 such that

kwkk kyk−x_kk ≤M1 for allk.

Proof. From (2.12),

Jxr(xk,λk)yk−Jxr(xk,λk) (yk−x_k) =Jxr(xk,λk)xk = 2r(xk,λk)≥0.

This means

Jxr(xk,λk)yk≥Jxr(xk,λk) (yk−x_k). (3.21) We may assumex_k6=y_k. From (3.8) and (3.21), we have

x^[2]_k ≥ kwkk kyk−x_kkJxr(xk,λk)pk, (3.22)

(13)

where p_k = (yk−x_k)/kyk−x_kk with kpkk = 1. Because x_k,y_k > 0 and kxkk = kykk= 1,we havep_k0 andp_k 0 for allk.

Suppose{kwkk kyk−x_kk}is not bounded. Since {xk} and {pk} are bounded, we have for some subsequence{kj}

jlim→∞

w_k_jy_k_j −x_k_j=∞, lim

j→∞x_k_j =:v, lim

j→∞p_k_j =:p. (3.23) Sincew_k_jy_k_j −x_k_j≤2w_k_j, we also have lim_j→∞w_k_j=∞. By Lemma 3.3 we havev>0. We now provev=x_∗.

Since the sequence

λk is monotonically decreasing and bounded below byρ(A), lim_j→∞2λkj = 2α exists. By Theorem 2.5, 2λkjI −B(x_k_j) are nonsingular M- matrices. Thus 2αI −B(v) is an M-matrix, so 2α ≥ ρ(B(v)). If 2α > ρ(B(v)), then

jlim→∞w_k_j = lim

j→∞ 2λkjI−B(xkj)−1

x_k_j = (2αI−B(v))⁻¹v=:w>0, contradictory to limj→∞

w_k_j=∞. Thus 2α=ρ(B(v)). Now

ρ(B(v)) = lim

j→∞2λkj = lim

j→∞2 max



Ax²_k_j x^[2]_k

j



= 2 max Av²

v^[2]

= max

B(v)v v

.

It follows that B(v)v =ρ(B(v))v and then Av² = ¹₂ρ(B(v))v^[2]. Thus v =x_∗ by Theorem 2.3.

Now limj→∞x_k_j =x_∗and limj→∞λkj =α= ¹₂ρ(B(x_∗)) =ρ(A) by Theorem 2.5.

It follows from (3.22) and (3.23) that J_xr(x_∗,ρ(A))p= lim

j→∞J_xr(x_k_j,λkj)p_k_j ≤ lim

j→∞

x^[2]_k j

w_k_jy_k_j −x_k_j = 0.

Thus (2ρ(A)I−B(x_∗))p≤0. Then we havep=±x_∗ by Theorem 2.5. But by the definition ofp_k,pis neither positive nor negative. The contradiction shows that the sequence{kw_kk ky_k−x_kk}is bounded.

Lemma 3.5. Assume that the sequence

Then h_k(θk)can be expressed in the form h_k(θk) =2θkx^[2]_k

kwkk +R(θky_k,x_k, λk), (3.24) whereR(x,x_k,λk)≤M2kx−x_kk² for some constant M2.

Proof. To prove this, we use Taylor’s theorem for the functionr(x,λk) around the pointx_k:

r(x,λk) =r(xk,λk) +J_xr(xk,λk)(x−x_k) +R(x,x_k,λk), (3.25) whereR(x,x_k,λk)≤M2kx−x_kk²for some constantM2,noting thatλk≤λ0 for allk.

Therefore, from (3.25), we have

r(θky_k,λk) =r(xk,λk) +Jx(xk,λk)(θky_k−x_k) +R(θky_k,x_k,λk).

(14)

Sincer(xk,λk) =¹₂Jxr(xk,λk)xk by (2.12), we get

r(θky_k,λk) =−r(x_k,λk) +Jxr(xk,λk)(θky_k) +R(θky_k,x_k,λk)

=−r(xk,λk) +θkx^[2]_k

kw_kk +R(θky_k,x_k,λk).

Hence, noting thatr(θky_k,λk) =θ²_kr y_k,λk

,

h_k(θk) = θkx^[2]_k

kwkk +θ²_kr y_k,λk

+r x_k,λk

=2θkx^[2]_k

kwkk +R(θky_k,x_k, λk), which completes the proof.

Lemma 3.6. For NNI, we have (i) θk = 1ifkyk−x_kk ≤ ^min(x

[2]

k )

M1M2 ,whereM1 andM2 are as in Lemmas 3.4 and 3.5.

(ii) {θk} is bounded below by some constant ξ > 0, assuming that x_k 6= x_∗ for each k.

Proof. (i). From Lemma 3.4 and assumption, we have

x^[2]_k

M2kw_kk ky_k−x_kk ≥ min x^[2]_k

M2kw_kk ky_k−x_kke

≥ min(x^[2]_k ) M1M2

e≥ kyk−x_kke, (3.26) wheree= [1, . . . ,1]^T.Then, from (3.26) and Lemma 3.5,

x^[2]_k

kw_kk≥M2kyk−x_kk²e≥R(y_k,x_k,λk). (3.27) Substituting (3.27) into (3.24), we obtain

h_k(1) = 2x^[2]_k

kw_kk +R(yk,x_k,λk)≥ x^[2]_k

kw_kk > x^[2]_k (1 +η)kw_kk, which meansθk= 1.

(ii). From (3.11), we recall that θk=

(

1 ifh_k(1)≥ ^x

[2]

k

(1+η)kw_kk; ηk otherwise,

where ηk = ^η

(1+η)kwkk(^µ^k−λk)min

x^[2]_k y^[2]_k

<1 withµk = max

Ay_k² y^[2]_k

> λk. Suppose θk is not bounded below byξ >0. Since x_k is bounded, we can find a subsequence {kj}such that

jlim→∞θkj = 0, lim

j→∞x_k_j =:v. (3.28)

(15)

Note thatv>0 by Lemma 3.3.

As in the proof of Lemma 3.4, we have limj→∞2λkj = 2α≥ρ(B(v)).

If 2α > ρ(B(v)), then limj→∞w_k_j =:w >0, as in the proof of Lemma 3.4. In this case,

jlim→∞y_k_j = lim

j→∞

w_k_j

w_k_j =:y>0.

Ifηk is defined only on a finite subset of{k_j}, thenθkj = 1 except for a finite number ofj values, contradicting lim_j→∞θkj = 0. If ηk is defined on an infinite subset{k_j_i} of{k_j}, then

ilim→∞ηk_ji = η

(1 +η)kwk(µ−α)min v

y 2

>0,

where limi→∞µk_ji =µ > α sinceηk_ji <1. This is contradictory to limj→∞θkj = 0.

Thus 2α=ρ(B(v)), andv=x_∗ as in the proof of Lemma 3.4. Hence, lim_j→∞x_k_j = x_∗ and by Lemma 3.3 lim_j→∞y_k_j =x_∗, which meansx_k_j and y_k_j are close enough forjsufficiently large. Therefore, from (i),θkj = 1 forjlarge enough, a contradiction to lim_j→∞θkj = 0.

This Lemma shows that ifx_k andy_k are close enough, then the parameterθk in (3.2) can easily be determined, i.e.,θk = 1.

Corollary 3.7. If x_k=y_k for anyk≥0in Algorithm 3.1, then x_k =x_∗. Proof. If x_k =y_k, then θk = 1 by Lemma 3.6, and it is easily seen from Algo- rithm 3.1 thatλk+1=λk. By Theorem 3.2, we havex_k =x_∗.

4. Convergence analysis. In this section, we prove that the convergence of NNI is global and quadratic, assuming thatx_k 6=x_∗ for eachk.

4.1. Global convergence of NNI. Theorem 3.2 shows that the sequence λk

is strictly decreasing and bounded below byρ(A),and hence converges. We now show that the limit ofλk is preciselyρ(A).

Theorem 4.1. Let A be an irreducible nonnegative third order tensor and the sequence

λk is generated by Algorithm 3.1. Then the monotonically decreasing sequence

λk converges toρ(A),and{x_k}from Algorithm 3.1 converges to the positive eigenvector x_∗ corresponding toρ(A).

Proof. From (3.5), (3.17) and Lemma 3.6, we have

λk−λk+1= min h_k(θk) e x^[2]_k+1

!

≥min θkx^[2]_k (1 +η)kw_kkxe^[2]_k+1

!

≥min ξx^[2]_k (1 +η)kwkkex^[2]_k+1

!

. (4.1)

Sinceθk≤1 by construction, we havekexk+1k=kxk+θky_kk ≤2. It follows from (4.1) that lim_k→∞kw_kk⁻¹min(x^[2]_k ) = 0.

Suppose min(x^[2]_k ) is not bounded below by a positive constant. Then there exists a subsequence{kj}such that limj→∞min(x^[2]_k_j) = 0. Sincekxkjk= 1, we may assume that limj→∞x_k_j = v exists. Then limj→∞min(x^[2]_k

j) = min(v^[2]) = 0. This is a contradiction sincev>0 by Lemma 3.3. Therefore, min(x^[2]_k ) is bounded below by a positive constant, and thus limk→∞kwkk⁻¹= 0.