
STABILITY ANALYSIS OF THE TWO-LEVEL ORTHOGONAL ARNOLDI PROCEDURE

DING LU, YANGFENG SU, AND ZHAOJUN BAI

Abstract. The second-order Arnoldi (SOAR) procedure is an algorithm for computing an orthonormal basis of the second-order Krylov subspace. It has found applications in solving quadratic eigenvalue problems and model order reduction of second-order dynamical systems among others.

Unfortunately, the SOAR procedure can be numerically unstable. The two-level orthogonal Arnoldi (TOAR) procedure has been proposed as an alternative to SOAR to cure the numerical instability.

In this paper, we provide a rigorous stability analysis of the TOAR procedure. We prove that under mild assumptions, the TOAR procedure is backward stable in computing an orthonormal basis of the associated linear Krylov subspace. The benefit of the backward stability of TOAR is demonstrated by its high accuracy in structure-preserving model order reduction of second-order dynamical systems.

Key words. second-order Krylov subspace, second-order Arnoldi procedure, backward stability, model order reduction, dynamical systems

AMS subject classifications. 65F15, 65F30, 65P99

DOI. 10.1137/151005142

1. Introduction. The second-order Krylov subspace induced by a pair of matrices and one or two vectors is a generalization of the well-known (linear) Krylov subspace based on a matrix and a vector. An orthonormal basis matrix $Q_k$ of the second-order Krylov subspace can be generated by a second-order Arnoldi (SOAR) procedure [3]. The SOAR procedure has found applications in solving quadratic eigenvalue problems [33, 38], model order reduction of second-order dynamical systems [2, 6], and structural acoustic analysis [25, 26]. It is implemented in Omega3P for electromagnetic modeling of particle accelerators [18, 19], and in MOR4ANSYS for the model order reduction of ANSYS engineering models [28].

It has been known that SOAR is prone to numerical instability due to the fact that it involves solving potentially ill-conditioned triangular linear systems and implicitly generates a nonorthonormal basis matrix $V_k$ of an associated (linear) Krylov subspace (see examples in section 5). The instability issue has drawn the attention of researchers since the SOAR procedure was proposed. For quadratic eigenvalue problems, Zhu [40] exploited the relations between the second-order Krylov subspace and the associated linear Krylov subspace described in [3] and proposed to represent the basis matrix $V_k$ of the linear Krylov subspace by the product of two orthonormal matrices $Q_k$ and $U_k$, where $Q_k$ is an orthonormal basis of the second-order Krylov subspace and $U_k$ is computed to maintain the orthonormality of $V_k$. The same idea

Received by the editors January 23, 2015; accepted for publication (in revised form) by T. Stykel December 11, 2015; published electronically February 16, 2016.

http://www.siam.org/journals/simax/37-1/100514.html

School of Mathematical Sciences, Fudan University, Shanghai 200433, China (dinglu@fudan.edu.cn, yfsu@fudan.edu.cn). Part of this work was done while the first author was visiting the University of California, Davis, supported by China Scholarship Council. The research of the second author was supported in part by the Innovation Program of Shanghai Municipal Education Commission 13zz007, E-Institutes of Shanghai Municipal Education Commission N.E303004, and NSFC key project 91330201.

Department of Computer Science and Department of Mathematics, University of California, Davis, CA 95616, USA (zbai@ucdavis.edu). The research of this author was supported in part by the NSF grants DMS-1522697 and CCF-1527091.


was also independently studied at about the same time and presented in [34] for the purposes of curing the numerical instability and finding a memory-efficient representation of the basis matrix $V_k$ in the context of using ARPACK [20] to solve high-order polynomial eigenvalue problems. The term "two-level orthogonal Arnoldi procedure", TOAR in short, was coined in [34]. Recently, the notion of memory-efficient representations of the orthonormal basis matrices of the second-order and higher-order Krylov subspaces was generalized to the model order reduction of time-delay systems [39], solving the linearized eigenvalue problem of matrix polynomials in the Chebyshev basis [17], and implementing a rational Krylov method for solving nonlinear eigenvalue problems [35].

Unlike SOAR, TOAR maintains an orthonormal basis matrix $V_k$ of the associated linear Krylov subspace. Therefore, it is generally believed to be numerically stable. This belief has been adopted in the literature [39, 17, 35]. The motivation of this paper is to provide a rigorous stability analysis of TOAR. We prove that under mild assumptions, TOAR with partial reorthogonalization is backward stable in computing the basis matrix $V_k$ of the associated linear Krylov subspace. In addition, in this paper, we provide a different derivation of the TOAR procedure compared to the ones in [40, 34] and remove unnecessary normalization steps. A comprehensive TOAR procedure, including the partial reorthogonalization and the treatments of deflation and breakdown, is presented. The advantages of the TOAR procedure are illustrated by an application to the structure-preserving model order reduction of second-order dynamical systems.

The rest of this paper is organized as follows. In section 2, we review the definition and essential properties of the second-order Krylov subspace. In section 3, we derive the TOAR procedure. The backward stability of the TOAR procedure is proven in section 4. Numerical examples for the application of the TOAR procedure in the model order reduction of second-order dynamical systems are presented in section 5.

Concluding remarks are in section 6.

Notations. Throughout the paper, we use upper case letters for matrices and lower case letters for vectors; in particular, $I$ denotes the identity matrix with $e_i$ being its $i$th column, and the dimensions of these matrices and vectors conform with the dimensions used in the context. Following the convention of matrix analysis, we use $\cdot^T$ for transpose, $\cdot^\dagger$ for the pseudoinverse, $|\cdot|$ for elementwise absolute value, $\|\cdot\|_2$ and $\|\cdot\|_F$ for the 2-norm and Frobenius norm, respectively, and $\mathrm{span}\{U, v\}$ for the subspace spanned by the columns of the matrix $U$ and the vector $v$. We also use MATLAB conventions: $v(i:j)$ for the $i$th to $j$th entries of the vector $v$, and $A(i:j,\, k:\ell)$ for the submatrix of $A$ given by the intersection of rows $i$ to $j$ and columns $k$ to $\ell$. Other notations will be explained as used.

2. Second-order Krylov subspace. Let $A$ and $B$ be $n \times n$ matrices and $r_{-1}$ and $r_0$ be length-$n$ vectors such that $[r_{-1}, r_0] \neq 0$. Then the sequence $r_{-1}, r_0, r_1, r_2, \ldots$ with

(2.1) $r_j = A r_{j-1} + B r_{j-2}$ for $j \geq 1$

is called a second-order Krylov sequence based on $A$, $B$, $r_{-1}$, and $r_0$. The subspace

(2.2) $\mathcal{G}_k(A, B; r_{-1}, r_0) \equiv \mathrm{span}\{r_{-1}, r_0, r_1, \ldots, r_{k-1}\}$

is called a $k$th second-order Krylov subspace. If the vector $r_j$ lies in the subspace spanned by the vectors $r_{-1}, \ldots, r_{j-1}$, i.e., $\mathcal{G}_{j+1}(A, B; r_{-1}, r_0) = \mathcal{G}_j(A, B; r_{-1}, r_0)$, then a deflation occurs. We should stress that a deflation does not necessarily imply that $r_{j+1}$ and all the following vectors still lie in $\mathcal{G}_j(A, B; r_{-1}, r_0)$. The latter case is referred to as a breakdown of the second-order Krylov sequence.
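To make the recurrence concrete, here is a small NumPy sketch (our own illustration, not from the paper) that generates the sequence (2.1) and uses a rank computation as a crude deflation indicator:

```python
import numpy as np

def second_order_krylov(A, B, r_m1, r0, k):
    """Return [r_{-1}, r_0, ..., r_{k-1}] generated by the recurrence (2.1)."""
    vecs = [r_m1, r0]
    for _ in range(k - 1):                 # r_1, ..., r_{k-1}
        vecs.append(A @ vecs[-1] + B @ vecs[-2])
    return np.column_stack(vecs)

rng = np.random.default_rng(0)
n, k = 50, 10
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
R = second_order_krylov(A, B, rng.standard_normal(n), rng.standard_normal(n), k)
# eta_k = dim G_k(A, B; r_{-1}, r_0); a value below k + 1 signals a deflation
print(np.linalg.matrix_rank(R))
```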

The second-order Krylov subspace $\mathcal{G}_k(A, B; r_{-1}, r_0)$ can be embedded in the linear Krylov subspace

(2.3) $\mathcal{K}_k(L, v_0) \equiv \mathrm{span}\{v_0, Lv_0, L^2 v_0, \ldots, L^{k-1} v_0\} = \mathrm{span}\left\{ \begin{bmatrix} r_0 \\ r_{-1} \end{bmatrix}, \begin{bmatrix} r_1 \\ r_0 \end{bmatrix}, \begin{bmatrix} r_2 \\ r_1 \end{bmatrix}, \ldots, \begin{bmatrix} r_{k-1} \\ r_{k-2} \end{bmatrix} \right\}$,

where

$L = \begin{bmatrix} A & B \\ I & 0 \end{bmatrix} \in \mathbb{R}^{2n \times 2n}$ and $v_0 = \begin{bmatrix} r_0 \\ r_{-1} \end{bmatrix} \in \mathbb{R}^{2n}$,

and $I$ is an identity matrix of size $n$. Specifically, let $Q_k$ and $V_k$ be basis matrices of the subspaces $\mathcal{G}_k(A, B; r_{-1}, r_0)$ and $\mathcal{K}_k(L, v_0)$, respectively. Then by (2.3), we know that

(2.4) $\mathrm{span}\{V_k(1:n, :)\} = \mathrm{span}\{r_0, r_1, \ldots, r_{k-1}\}$,

(2.5) $\mathrm{span}\{V_k(n{+}1:2n, :)\} = \mathrm{span}\{r_{-1}, r_0, \ldots, r_{k-2}\}$,

and

(2.6) $\mathrm{span}\{Q_k\} = \mathrm{span}\{V_k(1:n, :),\, V_k(n{+}1:2n, :)\}$.

By (2.4)–(2.6), the basis matrix $V_k$ of the linear Krylov subspace $\mathcal{K}_k(L, v_0)$ can be written in terms of the basis matrix $Q_k$ of the second-order Krylov subspace $\mathcal{G}_k(A, B; r_{-1}, r_0)$:

(2.7) $V_k = \begin{bmatrix} V_k(1:n, :) \\ V_k(n{+}1:2n, :) \end{bmatrix} = \begin{bmatrix} Q_k U_{k,1} \\ Q_k U_{k,2} \end{bmatrix} = \begin{bmatrix} Q_k & \\ & Q_k \end{bmatrix} \begin{bmatrix} U_{k,1} \\ U_{k,2} \end{bmatrix} \equiv Q_{[k]} U_k.$

Without loss of generality, we assume the linear Krylov subspace $\mathcal{K}_k(L, v_0)$ is not reduced, i.e., $V_k$ is of dimension $2n \times k$. Otherwise, $\mathcal{K}_k(L, v_0) = \mathcal{K}_j(L, v_0)$ for some $j < k$ with $\mathcal{K}_j(L, v_0)$ being unreduced, and $V_k = V_j$. Note that $Q_k$ is $n \times \eta_k$, and $U_{k,1}$ and $U_{k,2}$ are $\eta_k \times k$, where $\eta_k$ is the dimension of $\mathcal{G}_k(A, B; r_{-1}, r_0)$ and $\eta_k \leq k + 1$ due to possible deflations. Equation (2.7) indicates a memory-efficient compact representation of the basis matrix $V_k$, since the memory size $2nk$ required to store $V_k$ is reduced to $(n + 2k)\eta_k$ for $Q_k$ and $U_k$.
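The relations (2.4)–(2.7) are easy to check numerically with the naive two-stage scheme described in the next paragraph (Arnoldi on $L$, then orthogonalization of the stacked blocks). The following sketch is our own illustration; it assumes no deflation occurs and uses a plain QR in place of a rank-revealing one:

```python
import numpy as np

def arnoldi(L, v0, k):
    """Orthonormal V_k with span{V_k} = K_k(L, v0), by Gram-Schmidt Arnoldi."""
    V = np.zeros((L.shape[0], k))
    V[:, 0] = v0 / np.linalg.norm(v0)
    for j in range(k - 1):
        w = L @ V[:, j]
        w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)   # orthogonalize against v_1..v_{j+1}
        V[:, j + 1] = w / np.linalg.norm(w)
    return V

rng = np.random.default_rng(1)
n, k = 50, 8
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
r_m1, r0 = rng.standard_normal(n), rng.standard_normal(n)
L = np.block([[A, B], [np.eye(n), np.zeros((n, n))]])
V = arnoldi(L, np.concatenate([r0, r_m1]), k)

# Extract an orthonormal Q spanning [V(1:n,:), V(n+1:2n,:)] as in (2.6), then the
# compact factors of (2.7): V_k = diag(Q, Q) [U_{k,1}; U_{k,2}].
Q, _ = np.linalg.qr(np.hstack([V[:n, :], V[n:, :]]))
U1, U2 = Q.T @ V[:n, :], Q.T @ V[n:, :]
print(np.linalg.norm(V - np.vstack([Q @ U1, Q @ U2])))   # ~ machine precision
```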

Equation (2.6) provides a means to compute an orthonormal basis matrix $Q_k$ of $\mathcal{G}_k(A, B; r_{-1}, r_0)$. One can first generate a basis matrix $V_k$ of $\mathcal{K}_k(L, v_0)$, say by the Arnoldi procedure [31, Chap. 5], and then extract $Q_k$ by a rank-revealing orthogonalization procedure applied to the columns of $[V_k(1:n, :), V_k(n{+}1:2n, :)]$. However, this scheme is too expensive. A computationally efficient algorithm exploits the compact representation (2.7) to directly compute $Q_k$ without first generating $V_k$ explicitly.

The SOAR procedure [3] is one such algorithm. In SOAR, and similarly in SOAR-like procedures [23, 5, 15], to compute the orthonormal basis matrix $Q_k$, one imposes $U_{k,1} = I$ (assuming no deflation for simplicity) and requires $U_{k,2}$ to be a strictly upper triangular matrix. At each iteration, the SOAR procedure needs to solve potentially ill-conditioned triangular linear systems; see [3, Algorithm 4] for details. Moreover, due to this choice of $U_{k,1}$ and $U_{k,2}$, the corresponding basis matrix $V_k$ of the associated linear Krylov subspace $\mathcal{K}_k(L, v_0)$ is nonorthonormal. Consequently, the SOAR procedure can be numerically unstable. The instability of SOAR has been observed in several practical applications and will be illustrated numerically in section 5.


3. TOAR Procedure. To cure the potential instability of the SOAR procedure, in this section we study a procedure that computes an orthonormal basis matrix $Q_k$ of the second-order Krylov subspace $\mathcal{G}_k(A, B; r_{-1}, r_0)$ and, meanwhile, maintains the basis matrix $V_k$ of the associated linear Krylov subspace $\mathcal{K}_k(L, v_0)$ as orthonormal. The procedure is called TOAR.

Recall that the Arnoldi procedure for computing an orthonormal basis of the $k$th Krylov subspace $\mathcal{K}_k(L, v_0)$ is governed by the following Arnoldi decomposition of order $k-1$:

(3.1) $L V_{k-1} = V_k H_k$,

where $V_{k-1}$ consists of the first $k-1$ columns of $V_k$, and $H_k$ is a $k \times (k-1)$ unreduced upper Hessenberg matrix; see, for example, [31, Chap. 5]. By the compact representation (2.7), we have the compact Arnoldi decomposition of order $k-1$,

(3.2) $\begin{bmatrix} A & B \\ I & 0 \end{bmatrix} \begin{bmatrix} Q_{k-1} U_{k-1,1} \\ Q_{k-1} U_{k-1,2} \end{bmatrix} = \begin{bmatrix} Q_k U_{k,1} \\ Q_k U_{k,2} \end{bmatrix} H_k.$

We now show how to compute $Q_k$, $U_{k,1}$, and $U_{k,2}$ without explicitly generating $V_k$. We note that the orthonormality of $Q_k$ and $V_k$ implies that $U_k = [U_{k,1}^T, U_{k,2}^T]^T$ is also orthonormal. Let us begin with the following lemma describing the relations between $Q_{k-1}$ and $Q_k$, and between $U_{k-1,i}$ and $U_{k,i}$ for $i = 1, 2$.

Lemma 3.1. For the compact Arnoldi decomposition (3.2),

(3.3) $\mathrm{span}\{Q_k\} = \mathrm{span}\{Q_{k-1}, r\}$,

where $r = A Q_{k-1} U_{k-1,1}(:, k{-}1) + B Q_{k-1} U_{k-1,2}(:, k{-}1)$. Furthermore, (a) if $r \in \mathrm{span}\{Q_{k-1}\}$, then there exist vectors $x_k$ and $y_k$ such that

(3.4) $Q_k = Q_{k-1}$, $U_{k,1} = \begin{bmatrix} U_{k-1,1} & x_k \end{bmatrix}$, and $U_{k,2} = \begin{bmatrix} U_{k-1,2} & y_k \end{bmatrix}$;

(b) otherwise, there exist vectors $x_k$ and $y_k$ and a scalar $\beta_k \neq 0$ such that

(3.5) $Q_k = \begin{bmatrix} Q_{k-1} & q_k \end{bmatrix}$, $U_{k,1} = \begin{bmatrix} U_{k-1,1} & x_k \\ 0 & \beta_k \end{bmatrix}$, and $U_{k,2} = \begin{bmatrix} U_{k-1,2} & y_k \\ 0 & 0 \end{bmatrix}$,

where $q_k = (I - Q_{k-1} Q_{k-1}^T) r / \alpha$ and $\alpha$ is a normalization factor such that $\|q_k\|_2 = 1$.

Proof. Note that the $(k-1)$st column of (3.1) is

$L v_{k-1} = V_{k-1} H_k(1:k{-}1, k{-}1) + H_k(k, k{-}1) \cdot v_k$,

where $v_{k-1}$ and $v_k$ are the last columns of $V_{k-1}$ and $V_k$, respectively. Therefore, given $V_{k-1}$ we have

$\mathrm{span}\{V_{k-1}, L v_{k-1}\} = \mathrm{span}\{V_{k-1}, v_k\} = \mathrm{span}\{V_k\} \equiv \mathcal{K}_k(L, v_0)$.

Since

$\begin{bmatrix} V_{k-1} & L v_{k-1} \end{bmatrix} = \begin{bmatrix} Q_{k-1} U_{k-1,1} & r \\ Q_{k-1} U_{k-1,2} & Q_{k-1} U_{k-1,1}(:, k{-}1) \end{bmatrix}$,

by applying the subspace relation (2.6), we have

$\mathrm{span}\{Q_k\} = \mathrm{span}\{Q_{k-1} U_{k-1,1}, r, Q_{k-1} U_{k-1,2}\} = \mathrm{span}\{Q_{k-1}, r\}$.

Hence the relation (3.3) is proven.

To prove (a) and (b), we first note that by (2.4), the top $n$ elements of the $k$th column $v_k$ of $V_k$ satisfy

(3.6a) $v_k(1:n) \in \mathrm{span}\{r_0, r_1, \ldots, r_{k-1}\} \subset \mathcal{G}_k(A, B; r_{-1}, r_0) = \mathrm{span}\{Q_k\}$,

and by (2.5), the bottom $n$ elements of $v_k$ satisfy

(3.6b) $v_k(n{+}1:2n) \in \mathrm{span}\{r_{-1}, r_0, \ldots, r_{k-2}\} \subset \mathcal{G}_{k-1}(A, B; r_{-1}, r_0) = \mathrm{span}\{Q_{k-1}\}$.

If $r \in \mathrm{span}\{Q_{k-1}\}$, then (3.3) implies $\mathrm{span}\{Q_k\} = \mathrm{span}\{Q_{k-1}\}$. Furthermore, by (3.6), the vector

$v_k = \begin{bmatrix} v_k(1:n) \\ v_k(n{+}1:2n) \end{bmatrix} = \begin{bmatrix} Q_{k-1} x_k \\ Q_{k-1} y_k \end{bmatrix}$

for some vectors $x_k$ and $y_k$. Therefore,

$V_k = \begin{bmatrix} V_{k-1} & v_k \end{bmatrix} = \begin{bmatrix} Q_{k-1} U_{k-1,1} & Q_{k-1} x_k \\ Q_{k-1} U_{k-1,2} & Q_{k-1} y_k \end{bmatrix} = \begin{bmatrix} Q_{k-1} [U_{k-1,1} \;\; x_k] \\ Q_{k-1} [U_{k-1,2} \;\; y_k] \end{bmatrix}$.

Thus the result (a) is proven.

If $r \notin \mathrm{span}\{Q_{k-1}\}$, then by orthogonalizing $r$ against $Q_{k-1}$, we have

$r = Q_{k-1} s + \alpha q_k$,

where $q_k = (I - Q_{k-1} Q_{k-1}^T) r / \alpha$ is a unit vector orthogonal to the columns of $Q_{k-1}$. Consequently, $Q_k = [Q_{k-1} \;\; q_k]$. Furthermore, according to (3.6), the Arnoldi vector

$v_k = \begin{bmatrix} v_k(1:n) \\ v_k(n{+}1:2n) \end{bmatrix} = \begin{bmatrix} Q_{k-1} x_k + \beta_k q_k \\ Q_{k-1} y_k \end{bmatrix}$

for some vectors $x_k$, $y_k$, and scalar $\beta_k$. Therefore, the result (b) follows immediately from

$V_k = \begin{bmatrix} V_{k-1} & v_k \end{bmatrix} = \begin{bmatrix} Q_{k-1} U_{k-1,1} & Q_{k-1} x_k + \beta_k q_k \\ Q_{k-1} U_{k-1,2} & Q_{k-1} y_k \end{bmatrix}$.

Now we derive the TOAR procedure to compute the compact Arnoldi decomposition (3.2). With the initial vectors $r_{-1}$ and $r_0$ such that $v_0 = \begin{bmatrix} r_0 \\ r_{-1} \end{bmatrix} \neq 0$, we apply the rank-revealing QR decomposition of the $n \times 2$ matrix

$\begin{bmatrix} r_{-1} & r_0 \end{bmatrix} = Q_1 X$,

where $Q_1$ is an $n \times \eta_1$ orthonormal matrix and $X$ is an $\eta_1 \times 2$ matrix. Note that $\eta_1 = 2$ if the starting vectors $r_{-1}$ and $r_0$ are linearly independent; otherwise $\eta_1 = 1$. Then it follows that

$V_1 = \frac{1}{\gamma} \begin{bmatrix} r_0 \\ r_{-1} \end{bmatrix} = \frac{1}{\gamma} \begin{bmatrix} Q_1 X(:, 2) \\ Q_1 X(:, 1) \end{bmatrix} = \begin{bmatrix} Q_1 & \\ & Q_1 \end{bmatrix} \begin{bmatrix} U_{1,1} \\ U_{1,2} \end{bmatrix} \equiv Q_{[1]} U_1$,

where $\gamma = \|v_0\|_2$, $U_{1,1} = X(:, 2)/\gamma$, and $U_{1,2} = X(:, 1)/\gamma$. After the initialization, we have $Q_1$, $U_1$, and an empty $1 \times 0$ Hessenberg matrix $H_1 = [\;]$.
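In code, the initialization can use a column-pivoted QR as the rank-revealing QR. A minimal sketch (our own; the tolerance-based rank test and SciPy's pivoted qr are our choices, not prescribed by the paper):

```python
import numpy as np
from scipy.linalg import qr

def toar_init(r_m1, r0, tol=1e-12):
    """TOAR initialization (lines 1-2 of Algorithm 1): returns Q_1, U_{1,1}, U_{1,2}."""
    M = np.column_stack([r_m1, r0])
    gamma = np.linalg.norm(M)                 # gamma = ||v0||_2 = ||[r_{-1}, r0]||_F
    Q, X, piv = qr(M, mode='economic', pivoting=True)
    eta1 = int(np.sum(np.abs(np.diag(X)) > tol * np.abs(X[0, 0])))  # numerical rank
    Xp = np.zeros((eta1, 2))
    Xp[:, piv] = X[:eta1, :]                  # undo pivoting: M = Q[:, :eta1] @ Xp
    return Q[:, :eta1], Xp[:, 1] / gamma, Xp[:, 0] / gamma
```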


Let us assume that the compact Arnoldi decomposition (3.2) has been computed for $k = j$, $j \geq 1$. Next, the TOAR procedure consists of two steps to compute the decomposition (3.2) for $k = j + 1$, namely,

(3.7) $\begin{bmatrix} A & B \\ I & 0 \end{bmatrix} \begin{bmatrix} Q_j U_{j,1} \\ Q_j U_{j,2} \end{bmatrix} = \begin{bmatrix} Q_{j+1} U_{j+1,1} \\ Q_{j+1} U_{j+1,2} \end{bmatrix} H_{j+1}$,

where $Q_j$, $U_{j,1}$, and $U_{j,2}$ and the first $j - 1$ columns of $H_{j+1}$ in (3.7) have been computed.

At the first step, we compute the orthonormal matrix $Q_{j+1}$. According to Lemma 3.1, we have $\mathrm{span}\{Q_{j+1}\} = \mathrm{span}\{Q_j, r\}$, where $r = A Q_j U_{j,1}(:, j) + B Q_j U_{j,2}(:, j)$. Orthogonalizing $r$ against $Q_j$ yields

(3.8) $q_{j+1} = (r - Q_j s)/\alpha$ with $s = Q_j^T r$, $\alpha = \|r - Q_j s\|_2$,

where it is assumed that $\alpha \neq 0$. Subsequently, $Q_{j+1} = [Q_j \;\; q_{j+1}]$ and $\eta_{j+1} = \eta_j + 1$. If $\alpha = 0$, then $r \in \mathrm{span}\{Q_j\}$, $Q_{j+1} = Q_j$, and $\eta_{j+1} = \eta_j$. A deflation occurs and $\mathcal{G}_{j+1}(A, B; r_{-1}, r_0) = \mathcal{G}_j(A, B; r_{-1}, r_0)$.
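A sketch of this first level of orthogonalization (cf. lines 4–9 of Algorithm 1 below); the function name and calling convention are our own:

```python
import numpy as np

def toar_first_level(A, B, Qj, u1, u2):
    """First TOAR step, eq. (3.8): MGS of r = A Q_j u1 + B Q_j u2 against Q_j.

    Returns (q_next, s, alpha); alpha == 0 (numerically tiny) signals deflation."""
    r = A @ (Qj @ u1) + B @ (Qj @ u2)
    s = np.empty(Qj.shape[1])
    for i in range(Qj.shape[1]):       # modified Gram-Schmidt, lines 5-8 of Algorithm 1
        s[i] = Qj[:, i] @ r
        r = r - s[i] * Qj[:, i]
    alpha = np.linalg.norm(r)
    return (r / alpha if alpha > 0 else None), s, alpha
```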

At the second step, we compute $U_{j+1,1}$ and $U_{j+1,2}$ such that $U_{j+1} = \begin{bmatrix} U_{j+1,1} \\ U_{j+1,2} \end{bmatrix}$ is orthonormal. Multiplying (3.7) from the left by $Q_{[j+1]}^T$ and taking the $j$th column, we have

(3.9) $\begin{bmatrix} Q_{j+1}^T r \\ Q_{j+1}^T Q_j U_{j,1}(:, j) \end{bmatrix} = \begin{bmatrix} U_{j+1,1} \\ U_{j+1,2} \end{bmatrix} H_{j+1}(:, j)$.

When there is no deflation in the first step, by Lemma 3.1, $U_{j+1}$ is of the form

$U_{j+1} = \begin{bmatrix} U_{j+1,1} \\ U_{j+1,2} \end{bmatrix} = \begin{bmatrix} U_{j,1} & x_{j+1} \\ 0 & \beta_{j+1} \\ U_{j,2} & y_{j+1} \\ 0 & 0 \end{bmatrix}$

for some vectors $x_{j+1}$ and $y_{j+1}$ and scalar $\beta_{j+1} \neq 0$. Denote (see (3.8))

$\begin{bmatrix} s \\ \alpha \\ u \\ 0 \end{bmatrix} = \begin{bmatrix} Q_{j+1}^T r \\ Q_{j+1}^T Q_j U_{j,1}(:, j) \end{bmatrix}$, with block sizes $\eta_j$, $1$, $\eta_j$, $1$,

where $u \equiv U_{j,1}(:, j)$. Equation (3.9) can be written as

$\begin{bmatrix} s \\ \alpha \\ u \\ 0 \end{bmatrix} = \begin{bmatrix} U_{j,1} \\ 0 \\ U_{j,2} \\ 0 \end{bmatrix} h_j + h_{j+1,j} \begin{bmatrix} x_{j+1} \\ \beta_{j+1} \\ y_{j+1} \\ 0 \end{bmatrix}$,

where $h_j = H_{j+1}(1:j, j)$ and $h_{j+1,j} = H_{j+1}(j{+}1, j)$. Since $U_{j+1}$ is also imposed to be orthonormal, we have

(3.10) $h_j = \begin{bmatrix} U_{j,1} \\ 0 \\ U_{j,2} \\ 0 \end{bmatrix}^T \begin{bmatrix} s \\ \alpha \\ u \\ 0 \end{bmatrix}$ and $h_{j+1,j} = \left\| \begin{bmatrix} s \\ \alpha \\ u \\ 0 \end{bmatrix} - \begin{bmatrix} U_{j,1} \\ 0 \\ U_{j,2} \\ 0 \end{bmatrix} h_j \right\|_2$.

Assuming $h_{j+1,j} \neq 0$, the vectors $x_{j+1}$ and $y_{j+1}$ and the scalar $\beta_{j+1}$, namely, the $(j+1)$st column of $U_{j+1}$, are then given by

(3.11) $\begin{bmatrix} x_{j+1} \\ \beta_{j+1} \\ y_{j+1} \\ 0 \end{bmatrix} = \frac{1}{h_{j+1,j}} \begin{bmatrix} \tilde s \\ \tilde\alpha \\ \tilde u \\ 0 \end{bmatrix}$ with $\begin{bmatrix} \tilde s \\ \tilde\alpha \\ \tilde u \\ 0 \end{bmatrix} := \begin{bmatrix} s \\ \alpha \\ u \\ 0 \end{bmatrix} - \begin{bmatrix} U_{j,1} \\ 0 \\ U_{j,2} \\ 0 \end{bmatrix} h_j$.
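In code, the second level is one MGS sweep on the short vectors $s$ and $u$ followed by the normalization (3.11) (cf. lines 11–14 of Algorithm 1 below); a sketch under the no-deflation assumption, with names of our own choosing:

```python
import numpy as np

def toar_second_level(U1, U2, s, u, alpha):
    """Second TOAR step (no deflation), eqs. (3.10)-(3.11): new column of H and U."""
    j = U1.shape[1]
    h = np.empty(j)
    for i in range(j):                 # MGS on the short vectors, lines 11-14
        h[i] = U1[:, i] @ s + U2[:, i] @ u
        s = s - h[i] * U1[:, i]
        u = u - h[i] * U2[:, i]
    h_next = np.sqrt(alpha**2 + s @ s + u @ u)   # h_{j+1,j}; zero signals breakdown
    return h, h_next, s / h_next, alpha / h_next, u / h_next  # h_j, h_{j+1,j}, x, beta, y
```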

If there is a deflation in the first step, then $Q_{j+1} = Q_j$. By Lemma 3.1, $U_{j+1}$ is of the form

$U_{j+1} = \begin{bmatrix} U_{j,1} & x_{j+1} \\ U_{j,2} & y_{j+1} \end{bmatrix}$

for some vectors $x_{j+1}$ and $y_{j+1}$. Denote

$\begin{bmatrix} s \\ u \end{bmatrix} = \begin{bmatrix} Q_{j+1}^T r \\ Q_{j+1}^T Q_j U_{j,1}(:, j) \end{bmatrix} = \begin{bmatrix} Q_j^T r \\ Q_j^T Q_j U_{j,1}(:, j) \end{bmatrix} = \begin{bmatrix} Q_j^T r \\ U_{j,1}(:, j) \end{bmatrix}$.

Equation (3.9) can be written as

$\begin{bmatrix} s \\ u \end{bmatrix} = \begin{bmatrix} U_{j,1} \\ U_{j,2} \end{bmatrix} h_j + h_{j+1,j} \begin{bmatrix} x_{j+1} \\ y_{j+1} \end{bmatrix}$,

where $h_j = H_{j+1}(1:j, j)$ and $h_{j+1,j} = H_{j+1}(j{+}1, j)$. Since $U_{j+1}$ is imposed to be orthonormal, we have

(3.12) $h_j = \begin{bmatrix} U_{j,1} \\ U_{j,2} \end{bmatrix}^T \begin{bmatrix} s \\ u \end{bmatrix}$, $h_{j+1,j} = \left\| \begin{bmatrix} s \\ u \end{bmatrix} - \begin{bmatrix} U_{j,1} \\ U_{j,2} \end{bmatrix} h_j \right\|_2$.

Assuming $h_{j+1,j} \neq 0$, the vectors $x_{j+1}$ and $y_{j+1}$, namely, the $(j+1)$st column of $U_{j+1}$, are then given by

(3.13) $\begin{bmatrix} x_{j+1} \\ y_{j+1} \end{bmatrix} = \frac{1}{h_{j+1,j}} \begin{bmatrix} \tilde s \\ \tilde u \end{bmatrix}$ with $\begin{bmatrix} \tilde s \\ \tilde u \end{bmatrix} := \begin{bmatrix} s \\ u \end{bmatrix} - \begin{bmatrix} U_{j,1} \\ U_{j,2} \end{bmatrix} h_j$.

Finally, we note that if $h_{j+1,j} = 0$ in (3.10) or (3.12), then by (3.7), we have

(3.14) $\begin{bmatrix} A & B \\ I & 0 \end{bmatrix} \begin{bmatrix} Q_j U_{j,1} \\ Q_j U_{j,2} \end{bmatrix} = \begin{bmatrix} Q_j U_{j,1} \\ Q_j U_{j,2} \end{bmatrix} H_j$,

where $H_j = H_{j+1}(1:j, 1:j)$. This implies that $V_j = Q_{[j]} U_j$ spans an invariant subspace of $L$, i.e., $\mathcal{K}_\ell(L, v_0) = \mathcal{K}_j(L, v_0)$ for $\ell \geq j$. Consequently, by (2.3), $\mathcal{G}_\ell(A, B; r_{-1}, r_0) = \mathcal{G}_j(A, B; r_{-1}, r_0)$ for all $\ell \geq j$. This is the case where a breakdown occurs.

A summary of the TOAR procedure is presented in Algorithm 1, where the modified Gram–Schmidt (MGS) process is applied for the orthogonalization of $Q_j$ (lines 5–8) and $U_j$ (lines 11–14). While MGS is known to be numerically more accurate than the classical Gram–Schmidt process [13], it can still fail to compute a matrix that is orthonormal to machine precision. Specifically, by the error analysis in [7, 8], the $Q_k$ computed by MGS (lines 5–8) satisfies

(3.15) $\|I - Q_k^T Q_k\|_2 \leq c\,\kappa_2(R_k)\,\varepsilon$,


Algorithm 1. Two-level Orthogonal Arnoldi (TOAR) Procedure

Input: Matrices $A$, $B$, initial length-$n$ vectors $r_{-1}$, $r_0$ with $\gamma \equiv \|[r_{-1}, r_0]\|_F \neq 0$, and subspace order $k$.

Output: $Q_k \in \mathbb{R}^{n \times \eta_k}$, $U_{k,1}, U_{k,2} \in \mathbb{R}^{\eta_k \times k}$, and $H_k = \{h_{ij}\} \in \mathbb{R}^{k \times (k-1)}$.

1: Rank-revealing QR: $[r_{-1}, r_0] = QX$ with $\eta_1$ being the rank.
2: Initialize $Q_1 = Q$, $U_{1,1} = X(:, 2)/\gamma$, and $U_{1,2} = X(:, 1)/\gamma$.
3: for $j = 1, 2, \ldots, k-1$ do
4:   $r = A(Q_j U_{j,1}(:, j)) + B(Q_j U_{j,2}(:, j))$
5:   for $i = 1, \ldots, \eta_j$ do
6:     $s_i = q_i^T r$
7:     $r = r - s_i q_i$
8:   end for
9:   $\alpha = \|r\|_2$
10:  Set $s = [s_1, \ldots, s_{\eta_j}]^T$ and $u = U_{j,1}(:, j)$
11:  for $i = 1, \ldots, j$ do
12:    $h_{ij} = U_{j,1}(:, i)^T s + U_{j,2}(:, i)^T u$
13:    $s = s - h_{ij} U_{j,1}(:, i)$;  $u = u - h_{ij} U_{j,2}(:, i)$
14:  end for
15:  $h_{j+1,j} = (\alpha^2 + \|s\|_2^2 + \|u\|_2^2)^{1/2}$
16:  if $h_{j+1,j} = 0$ then
17:    stop (breakdown)
18:  end if
19:  if $\alpha = 0$ then
20:    $\eta_{j+1} = \eta_j$ (deflation)
21:    $Q_{j+1} = Q_j$;  $U_{j+1,1} = [U_{j,1} \;\; s/h_{j+1,j}]$;  $U_{j+1,2} = [U_{j,2} \;\; u/h_{j+1,j}]$
22:  else
23:    $\eta_{j+1} = \eta_j + 1$
24:    $Q_{j+1} = [Q_j \;\; r/\alpha]$;  $U_{j+1,1} = \begin{bmatrix} U_{j,1} & s/h_{j+1,j} \\ 0 & \alpha/h_{j+1,j} \end{bmatrix}$;  $U_{j+1,2} = \begin{bmatrix} U_{j,2} & u/h_{j+1,j} \\ 0 & 0 \end{bmatrix}$
25:  end if
26: end for

where $c$ is a constant depending only on the dimensions of $Q_k$, and $\kappa_2(R_k)$ is the condition number of the matrix $R_k = [r_1, r_2, \ldots, r_k]$ with $r_j$ being the vector $r$ computed at line 4 in the $j$th iteration. Therefore, a loss of orthonormality occurs when $R_k$ is ill-conditioned. Fortunately, this problem can be cured by applying a partial reorthogonalization [11], which runs the MGS twice on selected columns of $R_k$. Specifically, after line 9, if

(3.16) $\alpha \leq \theta\,\alpha_1$,

where $\alpha_1 = \|A Q_j U_{j,1}(:, j) + B Q_j U_{j,2}(:, j)\|_2$ and $\theta \leq 1$ is a threshold parameter, then we orthogonalize the resulting $r$ against $Q_j$ again and update $q_{j+1}$. This is also known as the Kahan–Parlett "twice-is-enough" algorithm [24, page 115], [12]. To incorporate the partial reorthogonalization into Algorithm 1, we insert the following code segment between lines 9 and 10:

if $\alpha \leq \theta\alpha_1$
  for $i = 1, \ldots, \eta_j$
    $\tilde s_i = q_i^T r$
    $r = r - \tilde s_i q_i$
    $s_i = s_i + \tilde s_i$
  end for
  $\alpha = \|r\|_2$
end if

where $\alpha_1 = \|A Q_j U_{j,1}(:, j) + B Q_j U_{j,2}(:, j)\|_2$ can be precomputed at line 4 of Algorithm 1. Analogously, we can apply the partial reorthogonalization to the second MGS, lines 11 to 14 in Algorithm 1, to ensure that the computed $U_j$ is orthonormal to machine precision.
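Putting the pieces together, the following NumPy sketch implements Algorithm 1 with the partial reorthogonalization (3.16) in the first-level orthogonalization. It is our own illustrative code, not the authors' implementation: the first-level sweep is written in vectorized classical Gram–Schmidt form (the reorthogonalization step makes this numerically comparable to MGS), a plain QR stands in for the rank-revealing QR, and the deflation and breakdown tolerances are arbitrary choices.

```python
import numpy as np

def toar(A, B, r_m1, r0, k, theta=1 / np.sqrt(2), tol=1e-12):
    """NumPy sketch of Algorithm 1 with partial reorthogonalization (3.16).

    Returns Q (n x eta_k), U1, U2 (eta_k x k), H (k x (k-1))."""
    n = A.shape[0]
    M = np.column_stack([r_m1, r0])
    gamma = np.linalg.norm(M)                        # gamma = ||v0||_2
    Qf, X = np.linalg.qr(M)                          # stand-in for a rank-revealing QR
    eta = 2 if abs(X[1, 1]) > tol * abs(X[0, 0]) else 1
    Q = np.zeros((n, k + 1)); Q[:, :eta] = Qf[:, :eta]   # eta_k <= k + 1 columns
    U1 = np.zeros((k + 1, k)); U2 = np.zeros((k + 1, k))
    U1[:eta, 0] = X[:eta, 1] / gamma; U2[:eta, 0] = X[:eta, 0] / gamma
    H = np.zeros((k, k - 1))
    for j in range(k - 1):
        r = A @ (Q[:, :eta] @ U1[:eta, j]) + B @ (Q[:, :eta] @ U2[:eta, j])  # line 4
        alpha1 = np.linalg.norm(r)
        s = Q[:, :eta].T @ r                         # lines 5-8 (vectorized Gram-Schmidt)
        r = r - Q[:, :eta] @ s
        alpha = np.linalg.norm(r)                    # line 9
        if alpha <= theta * alpha1:                  # partial reorthogonalization (3.16)
            ds = Q[:, :eta].T @ r
            r = r - Q[:, :eta] @ ds
            s = s + ds
            alpha = np.linalg.norm(r)
        u = U1[:eta, j].copy()                       # line 10
        for i in range(j + 1):                       # lines 11-14 (MGS on s and u)
            H[i, j] = U1[:eta, i] @ s + U2[:eta, i] @ u
            s = s - H[i, j] * U1[:eta, i]
            u = u - H[i, j] * U2[:eta, i]
        H[j + 1, j] = np.sqrt(alpha**2 + s @ s + u @ u)   # line 15
        if H[j + 1, j] == 0:
            raise RuntimeError("breakdown")          # lines 16-18
        if alpha <= tol * max(alpha1, 1.0):          # deflation: alpha = 0 in exact arithmetic
            U1[:eta, j + 1] = s / H[j + 1, j]        # lines 20-21
            U2[:eta, j + 1] = u / H[j + 1, j]
        else:                                        # lines 23-24
            Q[:, eta] = r / alpha
            U1[:eta, j + 1] = s / H[j + 1, j]; U1[eta, j + 1] = alpha / H[j + 1, j]
            U2[:eta, j + 1] = u / H[j + 1, j]
            eta += 1
    return Q[:, :eta], U1[:eta, :], U2[:eta, :], H
```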

To end this section, we note that the idea of using the product of two orthonor- mal matricesQk andUk to represent an orthonormal basis matrixVk has appeared in [40, 34]. In this section, we derived the algorithm based on the well-known Arnoldi decomposition and the relations betweenQk−1, Uk−1 andQk, Uk. The derivation is more straightforward than the one presented in [40]. In addition, we removed some unnecessary normalization steps for computingqjas required in [40]. In section 4, we will discuss the treatments of deflation and breakdown in detail.

4. Backward Error Analysis. In this section we provide a backward error analysis of the TOAR procedure in the presence of finite precision arithmetic. Taking floating point arithmetic errors into account, the compact Arnoldi decomposition computed by the TOAR procedure satisfies

(4.1) $\begin{bmatrix} A & B \\ I & 0 \end{bmatrix} \begin{bmatrix} \widehat Q_{k-1} \widehat U_{k-1,1} \\ \widehat Q_{k-1} \widehat U_{k-1,2} \end{bmatrix} = \begin{bmatrix} \widehat Q_k \widehat U_{k,1} \\ \widehat Q_k \widehat U_{k,2} \end{bmatrix} \widehat H_k + E$,

where $\widehat Q_k$, $\widehat U_{k,1}$, $\widehat U_{k,2}$, and $\widehat H_k$ are the computed counterparts of $Q_k$, $U_{k,1}$, $U_{k,2}$, and $H_k$ of the exact compact Arnoldi decomposition (3.2), and $E$ is the error matrix. Let $\widehat V_k = \begin{bmatrix} \widehat Q_k \widehat U_{k,1} \\ \widehat Q_k \widehat U_{k,2} \end{bmatrix}$ and $\widehat U_k = \begin{bmatrix} \widehat U_{k,1} \\ \widehat U_{k,2} \end{bmatrix}$; then (4.1) can be written as

(4.2) $(L + \Delta L)\,\widehat V_{k-1} = \widehat V_k \widehat H_k$,

where $\Delta L = E \widehat V_{k-1}^\dagger$ and $\widehat V_{k-1}^\dagger = (\widehat V_{k-1}^T \widehat V_{k-1})^{-1} \widehat V_{k-1}^T$ is the pseudoinverse of $\widehat V_{k-1}$. Here we assume that the computed orthonormal matrices $\widehat Q_k$ and $\widehat U_k$ are of full column rank, which implies that $\widehat V_k$ is also of full column rank. Equation (4.2) indicates that $\widehat V_k$ is an exact basis matrix of the Krylov subspace of $L + \Delta L$, a perturbed matrix of $L$. Therefore, the relative backward error $\tau = \|\Delta L\|/\|L\|$ is a measure of the numerical stability of the TOAR procedure. We will show that under mild assumptions, the TOAR procedure is backward stable in the sense that the relative backward error $\tau$ is of the order of the machine precision $\varepsilon$.
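As a numerical illustration of (4.1)–(4.2), one can form the error matrix $E$ from the computed quantities and evaluate $\tau$ directly; a sketch using the toar function above (random test matrices, our own choice):

```python
# Backward error check for the toar sketch above: form E from (4.1) and
# tau = ||Delta L|| / ||L|| with Delta L = E V_{k-1}^+; tau should be O(eps).
import numpy as np

rng = np.random.default_rng(2)
n, k = 100, 20
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
Q, U1, U2, H = toar(A, B, rng.standard_normal(n), rng.standard_normal(n), k)
L = np.block([[A, B], [np.eye(n), np.zeros((n, n))]])
V = np.vstack([Q @ U1, Q @ U2])                 # V_k = Q_[k] U_k, cf. (2.7)
E = L @ V[:, :k - 1] - V @ H                    # error matrix of (4.1)
dL = E @ np.linalg.pinv(V[:, :k - 1])           # Delta L = E V_{k-1}^+
print(np.linalg.norm(dL) / np.linalg.norm(L))   # relative backward error tau
```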

The proof of the backward stability of the TOAR procedure consists of two parts. In the first part, we derive an upper bound on the error matrix $E$. In the second part, we bound the backward error $\Delta L = E \widehat V_{k-1}^\dagger$. For ease of exposition, we assume there is no deflation or breakdown and no partial reorthogonalization; these cases will be addressed later in this section. In the following analysis, we adopt the standard rounding error model for the floating point arithmetic of two real scalars $\alpha$ and $\beta$:

(4.3) $fl(\alpha \;\mathrm{op}\; \beta) = (\alpha \;\mathrm{op}\; \beta)(1 + \delta)$ with $|\delta| \leq \varepsilon$ for $\mathrm{op} = +, -, *, /$,

where $fl(x)$ denotes the computed quantity and $\varepsilon$ denotes the machine precision. In addition, we recall the following results, which will be used repeatedly in this section.

Lemma 4.1.

(a) For $x, y \in \mathbb{R}^n$, $fl(x + y) = x + y + f$, where $\|f\|_2 \leq (\|x\|_2 + \|y\|_2)\,\varepsilon$.

(b) For $X \in \mathbb{R}^{n \times k}$ and $y \in \mathbb{R}^k$, $fl(Xy) = Xy + w$, where $\|w\|_2 \leq k \|X\|_F \|y\|_2\,\varepsilon + O(\varepsilon^2)$.

(c) For $X \in \mathbb{R}^{n \times k}$, $y \in \mathbb{R}^k$, $b \in \mathbb{R}^n$, and $\beta \in \mathbb{R}$, $c \equiv fl((b - Xy)/\beta)$ satisfies

$\beta c = b - Xy + g$, $\|g\|_2 \leq (k + 1) \left\| \begin{bmatrix} X & c \end{bmatrix} \right\|_F \left\| \begin{bmatrix} y \\ \beta \end{bmatrix} \right\|_2 \varepsilon + O(\varepsilon^2)$.

Proof. Result (a) is a direct consequence of the model (4.3) applied elementwise to the vector $x + y$.

For (b), repeatedly applying the model (4.3) leads to $fl(Xy) = Xy + w$ with $|w| \leq k |X| |y| \varepsilon + O(\varepsilon^2)$; see, e.g., [14, (3.12), page 78], where $|\cdot|$ denotes the elementwise absolute value. By taking 2-norms and using $\| |X| \|_2 \leq \|X\|_F$, we prove (b).

Result (c) is a direct consequence of Lemma 8.4 in [14, page 154]. For the sake of completeness, we provide a proof here. We first note that for the scalar operation, we have

(4.4) $fl((\alpha - \gamma)/\beta) = (\alpha - \gamma)/\beta\,(1 + \delta_1)(1 + \delta_2) = (\alpha - \gamma)/\beta + e$,

where $|\delta_i| \leq \varepsilon$ for $i = 1, 2$ and $|e| \leq 2 |fl((\alpha - \gamma)/\beta)|\,\varepsilon + O(\varepsilon^2)$. Applying (4.4) elementwise to the vector $c$ gives

$c = fl((b - Xy)/\beta) = (b - fl(Xy))/\beta + f$, with $\|f\|_2 \leq 2 \|c\|_2\,\varepsilon + O(\varepsilon^2)$.

Multiplying this equation by $\beta$ and using (b), we obtain

$\|g\|_2 \equiv \|\beta f - w\|_2 \leq (k + 1)\big(|\beta| \|c\|_2 + \|X\|_F \|y\|_2\big)\,\varepsilon + O(\varepsilon^2)$,

where we used $\max\{k, 2\} \leq k + 1$ for $k \geq 1$. Then by the Cauchy–Schwarz inequality $ab + cd \leq (a^2 + c^2)^{1/2}(b^2 + d^2)^{1/2}$ we prove (c).
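Lemma 4.1(b) can be probed empirically by treating a double precision product as exact (a rough illustration, our own, with $\varepsilon$ taken as the float32 unit roundoff):

```python
# Empirical probe of Lemma 4.1(b): compute X @ y in single precision, treat the
# double precision product as exact, and compare the observed error with the
# bound k ||X||_F ||y||_2 eps.
import numpy as np

rng = np.random.default_rng(3)
n, k = 200, 30
X = rng.standard_normal((n, k))
y = rng.standard_normal(k)
w = X.astype(np.float32) @ y.astype(np.float32) - X @ y
eps = np.finfo(np.float32).eps / 2
print(np.linalg.norm(w), "<=", k * np.linalg.norm(X, 'fro') * np.linalg.norm(y) * eps)
```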

To derive an upper bound for the error matrix $E$ of (4.1), let us first introduce the following two matrices, representing the floating point errors of the matrix-vector multiplications and of the orthogonalization processes, respectively:

(4.5) $F_{mv} = A(\widehat Q_{k-1} \widehat U_{k-1,1}) + B(\widehat Q_{k-1} \widehat U_{k-1,2}) - \widehat R_{k-1}$,

(4.6) $F = \begin{bmatrix} \widehat R_{k-1} \\ \widehat Q_{k-1} \widehat U_{k-1,1} \end{bmatrix} - \begin{bmatrix} \widehat Q_k \widehat U_{k,1} \\ \widehat Q_k \widehat U_{k,2} \end{bmatrix} \widehat H_k$,

where $\widehat R_{k-1} = [\widehat r_1, \widehat r_2, \ldots, \widehat r_{k-1}]$ and $\widehat r_j = fl\big(A(\widehat Q_j \widehat U_{j,1}(:, j)) + B(\widehat Q_j \widehat U_{j,2}(:, j))\big)$ is the vector computed at the $j$th iteration of Algorithm 1 (line 4). By (4.1), it is easy to verify that

(4.7) $E = \begin{bmatrix} F_{mv} \\ 0 \end{bmatrix} + F$.

The following lemma gives an upper bound on $\|F_{mv}\|_F$.

Lemma 4.2. Let $\widehat Q_{k-1}$ and $\widehat U_{k-1}$ be the computed orthonormal matrices of the TOAR procedure (Algorithm 1). Then

(4.8) $\|F_{mv}\|_F \leq \varphi \|\widehat Q_{k-1}\|_2 \|\widehat U_{k-1}\|_2 \|L\|_F\,\varepsilon + O(\varepsilon^2)$, where $\varphi = 2k(2n + 1)$.


Proof. By the definition (4.5) of $F_{mv}$, the $j$th column of $F_{mv}$ is given by

(4.9) $F_{mv}(:, j) = A(\widehat Q_j \widehat U_{j,1}(:, j)) + B(\widehat Q_j \widehat U_{j,2}(:, j)) - \widehat r_j$,

where $\widehat r_j \equiv fl\big(A(\widehat Q_j \widehat U_{j,1}(:, j)) + B(\widehat Q_j \widehat U_{j,2}(:, j))\big)$. Note that here we use the fact that $\widehat U_{k-1,1}$ and $\widehat U_{k-1,2}$ are upper Hessenberg matrices and $\widehat Q_{k-1} \widehat U_{k-1,i}(:, j) = \widehat Q_j \widehat U_{j,i}(:, j)$ for $i = 1, 2$ and $j \leq k - 1$.

By repeatedly applying Lemmas 4.1(a) and 4.1(b), we have

$\widehat r_j = fl(A \widehat Q_j \widehat U_{j,1}(:, j)) + fl(B \widehat Q_j \widehat U_{j,2}(:, j)) + w_j = A \widehat Q_j \widehat U_{j,1}(:, j) + B \widehat Q_j \widehat U_{j,2}(:, j) + w_j^{(1)} + w_j^{(2)} + w_j$,

where $w_j^{(1)}$, $w_j^{(2)}$, and $w_j$ are floating point error vectors satisfying

$\|w_j^{(1)}\|_2 \leq 2n \|A\|_F \|\widehat Q_j\|_F \|\widehat U_{j,1}(:, j)\|_2\,\varepsilon + O(\varepsilon^2)$,

$\|w_j^{(2)}\|_2 \leq 2n \|B\|_F \|\widehat Q_j\|_F \|\widehat U_{j,2}(:, j)\|_2\,\varepsilon + O(\varepsilon^2)$,

$\|w_j\|_2 \leq \big(\|fl(A \widehat Q_j \widehat U_{j,1}(:, j))\|_2 + \|fl(B \widehat Q_j \widehat U_{j,2}(:, j))\|_2\big)\varepsilon \leq \big(\|A \widehat Q_j \widehat U_{j,1}(:, j)\|_2 + \|B \widehat Q_j \widehat U_{j,2}(:, j)\|_2\big)\varepsilon + O(\varepsilon^2)$.

Therefore, $F_{mv}(:, j) = -(w_j^{(1)} + w_j^{(2)} + w_j)$. By the upper bounds of $w_j^{(1)}$, $w_j^{(2)}$, and $w_j$, we have

$\|F_{mv}(:, j)\|_2 \leq (2n + 1)\big(\|A\|_F \|\widehat Q_j\|_F \|\widehat U_{j,1}(:, j)\|_2 + \|B\|_F \|\widehat Q_j\|_F \|\widehat U_{j,2}(:, j)\|_2\big)\varepsilon + O(\varepsilon^2) \leq 2(2n + 1) \|L\|_F \|\widehat Q_{k-1}\|_F \|\widehat U_j(:, j)\|_2\,\varepsilon + O(\varepsilon^2)$,

where for the second inequality we used the facts that $\max\{\|A\|_F, \|B\|_F\} \leq \|L\|_F$, $\|\widehat Q_j\|_F \leq \|\widehat Q_{k-1}\|_F$, and $\|\widehat U_{j,i}(:, j)\|_2 \leq \|\widehat U_j(:, j)\|_2$ for $i = 1, 2$. In matrix terms, we have

$\|F_{mv}\|_F = \Big( \sum_{j=1}^{k-1} \|F_{mv}(:, j)\|_2^2 \Big)^{1/2} \leq 2(2n + 1) \|L\|_F \|\widehat Q_{k-1}\|_F \|\widehat U_{k-1}\|_F\,\varepsilon + O(\varepsilon^2)$.

Since $\widehat Q_{k-1}$ and $\widehat U_{k-1}$ have at most $k$ columns, $\|\widehat Q_{k-1}\|_F \leq \sqrt{k}\,\|\widehat Q_{k-1}\|_2$ and $\|\widehat U_{k-1}\|_F \leq \sqrt{k}\,\|\widehat U_{k-1}\|_2$, and we obtain the upper bound (4.8).

The following lemma gives an upper bound on $\|F\|_F$.

Lemma 4.3. Let $\widehat Q_k$, $\widehat U_k$, and $\widehat H_k$ be computed by the $k$-step TOAR procedure (Algorithm 1). Then

(4.10) $\|F\|_F \leq \varphi \|\widehat Q_k\|_2 \|\widehat U_k\|_2 \|\widehat H_k\|_F\,\varepsilon + O(\varepsilon^2)$, where $\varphi = (k + 1)(2k + 1)$.

Proof. In the $j$th iteration of Algorithm 1, the computed matrix-vector multiplication (line 4) is

$\widehat r_j = fl\big(A \widehat Q_j \widehat U_{j,1}(:, j) + B \widehat Q_j \widehat U_{j,2}(:, j)\big)$.
