
Journal of Computational and Applied Mathematics


Contents lists available at SciVerse ScienceDirect

Journal homepage: www.elsevier.com/locate/cam

Solving large-scale continuous-time algebraic Riccati equations by doubling

Tiexiang Li (a), Eric King-wah Chu (b,*), Wen-Wei Lin (c), Peter Chang-Yi Weng (b)

(a) Department of Mathematics, Southeast University, Nanjing, 211189, People's Republic of China

(b) School of Mathematical Sciences, Building 28, Monash University, VIC 3800, Australia

(c) Department of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan

Article info

Article history:
Received 21 July 2011
Received in revised form 22 May 2012

MSC: 15A24; 65F50; 93C05

Keywords: Continuous-time algebraic Riccati equation; Doubling algorithm; Krylov subspace; Large-scale problem

Abstract

We consider the solution of large-scale algebraic Riccati equations with numerically low-ranked solutions. For the discrete-time case, the structure-preserving doubling algorithm has been adapted, with the iterates for $A$ not explicitly computed but kept in the recursive form $A_k = A_{k-1}^2 - D_k^{(1)} S_k^{-1} [D_k^{(2)}]^\top$, with $D_k^{(1)}$ and $D_k^{(2)}$ being low-ranked and $S_k^{-1}$ being small in dimension. For the continuous-time case, the algebraic Riccati equation is first treated with the Cayley transform before doubling is applied. With $n$ being the dimension of the algebraic equations, the resulting algorithms are of an efficient $O(n)$ computational complexity per iteration, without the need for any inner iterations, and essentially converge quadratically. Some numerical results will be presented. For instance, in Section 5.2, Example 3, of dimension $n = 20\,209$ with 204 million variables in the solution $X$, was solved using MATLAB on a MacBook Pro within 45 s to a machine accuracy of $O(10^{-16})$.

© 2012 Elsevier B.V. All rights reserved.

1. Large-scale algebraic Riccati equations

Let the system matrix $A$ be large and sparse, possibly with band structures. The discrete-time algebraic Riccati equation (DARE):

$$\mathcal{D}(X) \equiv -X + A^\top X (I + GX)^{-1} A + H = 0, \tag{1a}$$

and the continuous-time algebraic Riccati equation (CARE):

$$\mathcal{C}(X) \equiv A^\top X + XA - XGX + H = 0, \tag{1b}$$

with the low-ranked

$$G = B R^{-1} B^\top, \qquad H = C T^{-1} C^\top, \tag{1c}$$

where $B \in \mathbb{R}^{n \times m}$, $C \in \mathbb{R}^{n \times l}$ and $m, l \ll n$, arise often in linear–quadratic optimal control problems [1,2].

The solution of CAREs and DAREs has been an extremely active area of research; see, e.g., [1–3]. The usual solution methods such as the Schur vector method, symplectic SR methods, the matrix sign function, the matrix disk function or the doubling method have not made (full) use of the sparsity and structure in $A$, $G$ and $H$. Requiring in general $O(n^3)$ flops and workspace of size $O(n^2)$, these methods are obviously inappropriate for the large-scale problems we are interested in here.

* Corresponding author. Tel.: +61 3 99054480; fax: +61 3 99054403.

E-mail addresses: [email protected] (T. Li), [email protected] (E.K.-w. Chu), [email protected] (W.-W. Lin), [email protected] (P.C.-Y. Weng).

0377-0427/$ – see front matter © 2012 Elsevier B.V. All rights reserved.

doi:10.1016/j.cam.2012.06.006

For control problems for parabolic PDEs and the balancing based model order reduction of large linear systems, large-scale CAREs and DAREs have to be solved [4–9]. As stated in [10,11], ‘‘the basic observation on which all methods for solving such kinds of matrix equations are based, is that often the (numerical) rank of the solution is very small compared to its actual dimension and therefore it allows for a good approximation via low rank solution factors’’. Importantly, without solving the corresponding algebraic Riccati equations, alternative solutions to the optimal control problem require the deflating subspace of the corresponding Hamiltonian matrices or (generalized) symplectic pencils, which are prohibitively expensive to compute.

Benner, Fassbender and Saak have done much on large-scale algebraic Riccati equations; see [10–13] and the references therein. They built their methods on (inexact) Newton's methods with inner iterations for the associated Lyapunov and Stein equations. We shall adapt the structure-preserving doubling algorithm (SDA) [14–16], making use of the sparsity in $A$ and the low-ranked structures in $G$ and $H$. For other applications of the SDA, see [17].

2. Structure-preserving doubling algorithm for DAREs

We shall abbreviate the discussion for DAREs; please consult [18] for details.

The structure-preserving doubling algorithm (SDA) [15], assuming $(I + GH)^{-1}$ exists, has the following form:

$$G \leftarrow G + A (I + GH)^{-1} G A^\top, \qquad H \leftarrow H + A^\top H (I + GH)^{-1} A, \qquad A \leftarrow A (I + GH)^{-1} A. \tag{2}$$

We shall apply the Sherman–Morrison–Woodbury formula (SMWF) to $(I + GH)^{-1}$ and make use of the low-ranked forms of $G$ and $H$ in (1c).
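As an illustration of iteration (2) before any low-rank structure is exploited, the following is a minimal dense sketch in Python with NumPy. The function names `sda_dense` and `dare_residual`, the stopping rule and the small tridiagonal test problem are assumptions made for this example only; the cost is $O(n^3)$ per step, which is exactly what the large-scale variant avoids.

```python
import numpy as np

def sda_dense(A, G, H, tol=1e-14, max_iter=50):
    """Dense SDA iteration (2). Returns (H_k, G_k, number of iterations)."""
    n = A.shape[0]
    I = np.eye(n)
    for k in range(max_iter):
        # W = (I + G H)^{-1} [A, G], computed with one linear solve
        W = np.linalg.solve(I + G @ H, np.hstack([A, G]))
        WA, WG = W[:, :n], W[:, n:]
        G_new = G + A @ WG @ A.T        # G <- G + A (I+GH)^{-1} G A^T
        H_new = H + A.T @ H @ WA        # H <- H + A^T H (I+GH)^{-1} A
        A = A @ WA                      # A <- A (I+GH)^{-1} A
        done = np.linalg.norm(H_new - H) <= tol * max(1.0, np.linalg.norm(H_new))
        G, H = G_new, H_new
        if done:
            return H, G, k + 1
    return H, G, max_iter

def dare_residual(X, A, G, H):
    """Residual of the DARE (1a): -X + A^T X (I + G X)^{-1} A + H."""
    n = A.shape[0]
    return -X + A.T @ X @ np.linalg.solve(np.eye(n) + G @ X, A) + H

# Hypothetical small test problem (not from the paper).
n = 8
A = 0.4 * np.eye(n) + 0.2 * np.eye(n, k=1) + 0.1 * np.eye(n, k=-1)
B = np.zeros((n, 1)); B[0, 0] = 1.0
C = np.zeros((n, 1)); C[-1, 0] = 1.0
G, H = B @ B.T, C @ C.T                 # R = T = I in (1c)
X, Y, iters = sda_dense(A, G, H)
res = np.linalg.norm(dare_residual(X, A, G, H)) / np.linalg.norm(X)
```

The limit of $H_k$ is taken as the approximation to $X$, and `res` measures how well it satisfies (1a).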

2.1. Large-scale SDA

At first glance, the iteration for $A$ in the SDA in (2) appears doomed, with $O(n^3)$ operations for the products of full matrices. However, with the low rank form in (1c), we shall organize the SDA into the form (for $k = 1, 2, \ldots$):

$$A_k = A_{k-1}^2 - D_k^{(1)} S_k^{-1} [D_k^{(2)}]^\top, \qquad G_k = B_k R_k^{-1} B_k^\top, \qquad H_k = C_k T_k^{-1} C_k^\top. \tag{3}$$

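The application of the SMWF on $(I_n + G_k H_k)^{-1}$ yields

$$\begin{aligned}
A_{k+1} &= A_k (I_n + G_k H_k)^{-1} A_k \\
&= A_k \left[ I_n - G_k C_k T_k^{-1} \left( I_{l_k} + C_k^\top G_k C_k T_k^{-1} \right)^{-1} C_k^\top \right] A_k \\
&= A_k \left[ I_n - B_k \left( I_{m_k} + R_k^{-1} B_k^\top H_k B_k \right)^{-1} R_k^{-1} B_k^\top H_k \right] A_k,
\end{aligned}$$

where $C_k$ and $B_k$ have respectively $l_k$ and $m_k$ columns. It will be obvious that it is more convenient to work with $S_k^{-1}$, $R_k^{-1}$ and $T_k^{-1}$, and we retain the inverse notation only for historical reasons, although there is no actual inversion involved.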
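The two SMWF forms of $(I_n + G_k H_k)^{-1}$ can be checked numerically on a small random instance. The sketch below is only a sanity check under assumed data (random factors, diagonal $R^{-1}$ and $T^{-1}$); it forms the $n \times n$ matrices explicitly for comparison, which one would not do at large scale.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, l = 50, 3, 2
# Hypothetical low-rank factors, scaled to keep I + G H well conditioned.
B = rng.standard_normal((n, m)) / np.sqrt(n)
C = rng.standard_normal((n, l)) / np.sqrt(n)
Rinv = np.diag([1.0, 2.0, 0.5])
Tinv = np.diag([1.0, 3.0])
G = B @ Rinv @ B.T
H = C @ Tinv @ C.T

dense = np.linalg.inv(np.eye(n) + G @ H)

# Size-l form: I - G C T^{-1} (I_l + C^T G C T^{-1})^{-1} C^T
GC = G @ C
core_l = np.linalg.inv(np.eye(l) + C.T @ GC @ Tinv)
via_l = np.eye(n) - GC @ Tinv @ core_l @ C.T

# Size-m form: I - B (I_m + R^{-1} B^T H B)^{-1} R^{-1} B^T H
core_m = np.linalg.inv(np.eye(m) + Rinv @ B.T @ H @ B)
via_m = np.eye(n) - B @ core_m @ Rinv @ B.T @ H

err_l = np.linalg.norm(dense - via_l) / np.linalg.norm(dense)
err_m = np.linalg.norm(dense - via_m) / np.linalg.norm(dense)
```

Only the $l \times l$ or $m \times m$ matrices `core_l` and `core_m` are inverted in the two low-rank forms.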

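Consequently, with $C_k \in \mathbb{R}^{n \times l_k}$ and $B_k \in \mathbb{R}^{n \times m_k}$, we have

$$A_{k+1} = A_k^2 - D_{k+1}^{(1)} S_{k+1}^{-1} [D_{k+1}^{(2)}]^\top, \tag{4}$$

with the update of ‘‘size’’ $l_k$ defined by

$$D_{k+1}^{(1)} = A_k G_k C_k, \qquad D_{k+1}^{(2)} = A_k^\top C_k, \qquad S_{k+1}^{-1} = T_k^{-1} \left( I_{l_k} + C_k^\top G_k C_k T_k^{-1} \right)^{-1} \in \mathbb{R}^{l_k \times l_k}, \tag{5a}$$

or the update of ‘‘size’’ $m_k$ defined by

$$D_{k+1}^{(1)} = A_k B_k, \qquad D_{k+1}^{(2)} = A_k^\top H_k B_k, \qquad S_{k+1}^{-1} = \left( I_{m_k} + R_k^{-1} B_k^\top H_k B_k \right)^{-1} R_k^{-1} \in \mathbb{R}^{m_k \times m_k}, \tag{5b}$$

all involving $O(n^3)$ operations for a dense $A$. The operation counts will be reduced to $O(n)$ with the assumption that the maximum number of nonzero components in any row or column of $A$ is much less than $n$ (see Table 2 in Section 4.2). The trick is not to form $A_k$ explicitly. Note that we have to store all the $B_i$, $C_i$, $R_i^{-1}$ and $T_i^{-1}$ for $i = 0, 1, \ldots, k-1$ to facilitate the multiplication of low-ranked matrices by $A_k$ or $A_k^\top$.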
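The idea of never forming $A_k$ can be sketched as a recursion on (4): a product $A_k V$ is reduced to two products with $A_{k-1}$ plus a low-rank correction. In the Python sketch below, the helper name `apply_Ak` and the list `factors` of stored triples $(D_j^{(1)}, S_j^{-1}, D_j^{(2)})$ are hypothetical, and the triples are random placeholders rather than quantities produced by (5a) or (5b); the example checks only the recursion itself against an explicitly formed $A_k$.

```python
import numpy as np

def apply_Ak(A, factors, k, V, transpose=False):
    """Return A_k @ V (or A_k.T @ V) without forming A_k, using
    A_j = A_{j-1}^2 - D1_j Sinv_j D2_j^T with factors[j-1] = (D1_j, Sinv_j, D2_j)."""
    if k == 0:
        return (A.T if transpose else A) @ V
    D1, Sinv, D2 = factors[k - 1]
    inner = apply_Ak(A, factors, k - 1, V, transpose)
    W = apply_Ak(A, factors, k - 1, inner, transpose)   # A_{k-1}^2 V, or its transpose analogue
    if transpose:
        return W - D2 @ (Sinv.T @ (D1.T @ V))
    return W - D1 @ (Sinv @ (D2.T @ V))

# Hypothetical data: a banded A_0 (stored densely here) and random low-rank triples.
rng = np.random.default_rng(0)
n, p, levels = 30, 2, 3
A = 0.4 * np.eye(n) + 0.2 * np.eye(n, k=1) + 0.1 * np.eye(n, k=-1)
factors = [(0.1 * rng.standard_normal((n, p)),
            rng.standard_normal((p, p)),
            0.1 * rng.standard_normal((n, p))) for _ in range(levels)]
V = rng.standard_normal((n, 3))

# Dense reference, formed explicitly only to check the recursion.
Ak = A.copy()
for D1, Sinv, D2 in factors:
    Ak = Ak @ Ak - D1 @ Sinv @ D2.T

err = np.linalg.norm(apply_Ak(A, factors, levels, V) - Ak @ V)
err_t = np.linalg.norm(apply_Ak(A, factors, levels, V, transpose=True) - Ak.T @ V)
```

Each call at level $k$ triggers $2^k$ products with $A_0$, so the number of doubling steps has to stay small for this to pay off; with a sparse $A_0$ every such product costs $O(n)$.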


We may choose between (5a) and (5b) based on the sizes $l_k$ and $m_k$. Ignoring the small saving in the inversion of smaller matrices, the compression and truncation in the next section produces the leaner $B_k$ and $C_k$, which makes the choice here irrelevant. However, this choice may be important when $G$ or $H$ are not low-ranked.

The induction proof of the general form of $A_k$ in (4)–(5b) can be completed by considering the initial $k = 1$ case, which is trivial.

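For $B_k$, $C_k$ and $R_k$, applying the SMWF to $(I + G_k H_k)^{-1}$ in the SDA, we have

$$\begin{aligned}
G_{k+1} &= G_k + A_k G_k A_k^\top - A_k G_k C_k T_k^{-1} \left( I_{l_k} + C_k^\top G_k C_k T_k^{-1} \right)^{-1} C_k^\top G_k A_k^\top \\
&= G_k + A_k G_k A_k^\top - A_k B_k \left( I_{m_k} + R_k^{-1} B_k^\top H_k B_k \right)^{-1} R_k^{-1} B_k^\top H_k G_k A_k^\top,
\end{aligned} \tag{6}$$

and

$$\begin{aligned}
H_{k+1} &= H_k + A_k^\top H_k A_k - A_k^\top H_k G_k C_k T_k^{-1} \left( I_{l_k} + C_k^\top G_k C_k T_k^{-1} \right)^{-1} C_k^\top A_k \\
&= H_k + A_k^\top H_k A_k - A_k^\top H_k B_k \left( I_{m_k} + R_k^{-1} B_k^\top H_k B_k \right)^{-1} R_k^{-1} B_k^\top H_k A_k.
\end{aligned} \tag{7}$$

These imply that

$$B_{k+1} = [B_k, A_k B_k], \qquad C_{k+1} = [C_k, A_k^\top C_k], \tag{8}$$

$$R_{k+1}^{-1} = R_k^{-1} \oplus \left[ R_k^{-1} - R_k^{-1} B_k^\top C_k T_k^{-1} \left( I_{l_k} + C_k^\top G_k C_k T_k^{-1} \right)^{-1} C_k^\top B_k R_k^{-1} \right] \tag{9a}$$

$$\phantom{R_{k+1}^{-1}} = R_k^{-1} \oplus \left[ R_k^{-1} - \left( I_{m_k} + R_k^{-1} B_k^\top H_k B_k \right)^{-1} R_k^{-1} B_k^\top H_k B_k R_k^{-1} \right], \tag{9b}$$

$$T_{k+1}^{-1} = T_k^{-1} \oplus \left[ T_k^{-1} - T_k^{-1} C_k^\top G_k C_k T_k^{-1} \left( I_{l_k} + C_k^\top G_k C_k T_k^{-1} \right)^{-1} \right] \tag{10a}$$

$$\phantom{T_{k+1}^{-1}} = T_k^{-1} \oplus \left[ T_k^{-1} - T_k^{-1} C_k^\top B_k \left( I_{m_k} + R_k^{-1} B_k^\top H_k B_k \right)^{-1} R_k^{-1} B_k^\top C_k T_k^{-1} \right] \tag{10b}$$

with the initial values

$$A_0 = A, \qquad B_0 = B, \qquad C_0 = C, \qquad R_0 = R, \qquad T_0 = T. \tag{11}$$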
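One doubling step on the factors can be sketched as follows, using the size-$l_k$ choices (5a), (9a) and (10a) together with (8), and reading $\oplus$ as a block-diagonal direct sum. The names `block_diag2` and `sda_ls_step` are hypothetical, and for brevity $A_k$ is held as an explicit dense matrix, which is an assumption of this small demonstration and not how the large-scale algorithm stores it. The result is compared with the dense update (2).

```python
import numpy as np

def block_diag2(X, Y):
    """Direct sum X (+) Y as a block-diagonal matrix."""
    Z = np.zeros((X.shape[0] + Y.shape[0], X.shape[1] + Y.shape[1]))
    Z[:X.shape[0], :X.shape[1]] = X
    Z[X.shape[0]:, X.shape[1]:] = Y
    return Z

def sda_ls_step(Ak, Bk, Rinv, Ck, Tinv):
    """One step of (4), (5a), (8), (9a), (10a) on the factors of G_k and H_k."""
    l = Ck.shape[1]
    BtC = Bk.T @ Ck                              # B_k^T C_k, size m_k x l_k
    K = BtC.T @ Rinv @ BtC                       # C_k^T G_k C_k, size l_k x l_k
    Sinv = Tinv @ np.linalg.inv(np.eye(l) + K @ Tinv)        # S_{k+1}^{-1} in (5a)
    D1 = Ak @ (Bk @ (Rinv @ BtC))                # A_k G_k C_k
    D2 = Ak.T @ Ck                               # A_k^T C_k
    R_new = block_diag2(Rinv, Rinv - Rinv @ BtC @ Sinv @ BtC.T @ Rinv)   # (9a)
    T_new = block_diag2(Tinv, Tinv - Tinv @ K @ Sinv)                    # (10a)
    B_new = np.hstack([Bk, Ak @ Bk])             # (8)
    C_new = np.hstack([Ck, D2])                  # (8)
    A_new = Ak @ Ak - D1 @ Sinv @ D2.T           # (4)
    return A_new, B_new, R_new, C_new, T_new

# Hypothetical test data; compare against the dense SDA step (2).
rng = np.random.default_rng(0)
n = 20
A = 0.4 * np.eye(n) + 0.2 * np.eye(n, k=1) + 0.1 * np.eye(n, k=-1)
B = rng.standard_normal((n, 2)) / np.sqrt(n)
C = rng.standard_normal((n, 1)) / np.sqrt(n)
Rinv, Tinv = np.diag([2.0, 1.0]), np.array([[0.5]])
G, H = B @ Rinv @ B.T, C @ Tinv @ C.T

W = np.linalg.inv(np.eye(n) + G @ H)
A1, B1, R1, C1, T1 = sda_ls_step(A, B, Rinv, C, Tinv)
err_A = np.linalg.norm(A1 - A @ W @ A)
err_G = np.linalg.norm(B1 @ R1 @ B1.T - (G + A @ W @ G @ A.T))
err_H = np.linalg.norm(C1 @ T1 @ C1.T - (H + A.T @ H @ W @ A))
```

The factors double in width at every step, which is why the compression discussed in Section 2.2 is needed.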

We have shown that the SDA can be organized into the form (3). The existence of $R_k^{-1}$, $T_k^{-1}$ and $(I_n + G_k H_k)^{-1}$ guarantees the same for other inverses in (9a)–(10b). Note that $R_k^{-1}$, $S_k^{-1}$ and $T_k^{-1}$ are symmetric for all $k$. Again, the choice in (9a)–(10b) may be relevant when $G$ or $H$ are not low-ranked.

For well-behaved DAREs [14,15], we have $H_k = C_k T_k^{-1} C_k^\top \to X$ and $G_k = B_k R_k^{-1} B_k^\top \to Y$ (solution of the dual DARE) as $k \to \infty$.

Note that $X$ and $Y$ have been observed to be numerically low-ranked. Under suitable assumptions [14,15], the convergence of the SDA implies the convergence of $A_k = O(|\lambda|^{2^k}) \to 0$, for some $|\lambda| < 1$. Together with (8)–(10b), we see that $B_{k+1}$ and $C_{k+1}$ consist, respectively, of $B_k$ and $C_k$ augmented by the diminishing components $A_k B_k$ and $A_k^\top C_k$. This explains the observed low numerical ranks of $X$ and $Y$.

2.2. Compression and truncation of $B_k$ and $C_k$

Now we shall consider an important aspect of the SDA for large-scale DAREs (SDA_ls): the growth of $B_k$ and $C_k$. Obviously, as the SDA converges, increasingly smaller components are added to $B_k$ and $C_k$. As is apparent from (8), the growth in the sizes and ranks of these iterates is potentially exponential. Let the computational complexity of the SDA_ls be $O(n) = \alpha n + O(1)$. If the convergence is slow relative to the growth in $B_k$ and $C_k$, the algorithm will fail, with $\alpha$ growing exponentially (see Table 2 in Section 4.2). In such cases, $X$ is obviously no longer numerically low-ranked, with respect to some given truncation tolerance (see $\tau_1, \tau_2$ in (12) and (13)). It will then be extremely challenging to approximate $X$ in $O(n)$ computational complexity to high accuracy, by any method. One possibility will be to accept approximations to $X$ to lower accuracies with a higher truncation tolerance, thus lowering the corresponding numerical rank of $X$.

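To reduce the dimensions of $B_k$, $C_k$, $D_k^{(1)}$ and $D_k^{(2)}$, we shall compress their columns by orthogonalization. Consider the QR decompositions with column pivoting:

$$B_k = Q_{1k} M_{1k} + \tilde{Q}_{1k} \tilde{M}_{1k}, \qquad C_k = Q_{2k} M_{2k} + \tilde{Q}_{2k} \tilde{M}_{2k}, \qquad \text{with } \|\tilde{M}_{1k}\| \le \tau_1, \quad \|\tilde{M}_{2k}\| \le \tau_2,$$

where $\tau_i$ ($i = 1, 2$) are some small tolerances controlling the compression and truncation process, $m_k$ and $l_k$ are respectively the numbers of columns in $B_k$ and $C_k$, bounded from above by some corresponding $m_{\max}$ and $l_{\max}$,

$$r_{1k} = \operatorname{rank} B_k \le m_k \le m_{\max} \ll n, \qquad r_{2k} = \operatorname{rank} C_k \le l_k \le l_{\max} \ll n,$$

and for $i = 1, 2$, $Q_{ik} \in \mathbb{R}^{n \times r_{ik}}$ have orthonormal columns and $M_{ik} \in \mathbb{R}^{r_{ik} \times n_{ik}}$ (with $n_{1k} = m_k$ and $n_{2k} = l_k$) are full-ranked and upper triangular. We then have

$$B_k R_k^{-1} B_k^\top = Q_{1k} \left[ M_{1k} R_k^{-1} M_{1k}^\top \right] Q_{1k}^\top + O(\tau_1), \tag{12}$$

$$C_k T_k^{-1} C_k^\top = Q_{2k} \left[ M_{2k} T_k^{-1} M_{2k}^\top \right] Q_{2k}^\top + O(\tau_2), \tag{13}$$

and we should replace $B_k$ and $R_k^{-1}$ (or, $C_k$ and $T_k^{-1}$) respectively by the leaner $Q_{1k}$ and $M_{1k} R_k^{-1} M_{1k}^\top$ (or, $Q_{2k}$ and $M_{2k} T_k^{-1} M_{2k}^\top$).

We may skip compressing and truncating $D_k^{(1)}$ and $D_k^{(2)}$ after compressing and truncating $B_k$ and $C_k$. As a result, we ignore the $O(\tau_i)$ terms and control the growth of $r_{ik}$ while sacrificing a hopefully negligible bit of accuracy.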
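The replacement in (12) can be illustrated with a short Python sketch. Note the substitution: NumPy has no column-pivoted QR, so the sketch below swaps in a truncated SVD, which produces an orthonormal $Q$ and a factor $M$ that is full-ranked but not upper triangular. The function name `compress`, the tolerance and the test matrix are assumptions for illustration.

```python
import numpy as np

def compress(B, Rinv, tau):
    """Replace (B, Rinv) by a leaner pair (Q, M Rinv M^T) so that
    B Rinv B^T ~= Q (M Rinv M^T) Q^T, dropping singular values <= tau.
    A truncated SVD is used here in place of the pivoted QR of the text."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    r = max(1, int(np.sum(s > tau)))     # numerical rank w.r.t. tau
    Q = U[:, :r]                         # orthonormal columns
    M = s[:r, None] * Vt[:r]             # r x (number of columns of B)
    return Q, M @ Rinv @ M.T

# Hypothetical factor with 6 columns but numerical rank 3.
rng = np.random.default_rng(0)
n = 40
B0 = rng.standard_normal((n, 3))
B = np.hstack([B0, B0 @ rng.standard_normal((3, 3)) + 1e-12 * rng.standard_normal((n, 3))])
Rinv = np.diag([1.0, 2.0, 3.0, 1.0, 2.0, 3.0])

Q, Rc = compress(B, Rinv, tau=1e-8)
full = B @ Rinv @ B.T
err = np.linalg.norm(full - Q @ Rc @ Q.T) / np.linalg.norm(full)
```

The compressed pair represents the same $G_k$ up to an error controlled by the tolerance, with half as many columns in this example.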

Interestingly, we need only $R$, $T$ and $I + G_k H_k$ to be invertible (which imply the invertibility of $R_k$ and $T_k$ for all $k$), opening up the possibility of dealing with DAREs with indefinite $R$s and $T$s [19].

Eqs. (4) (used recursively but not explicitly), (5a) (or (5b)), (8), (9a) (or (9b)), (10a) (or (10b)), (12) and (13), together with the corresponding initial values in (11), constitute the SDA_ls.

2.3. SDA and Krylov subspaces

There is an interesting relationship between the SDA_ls and Krylov subspaces. Define the Krylov subspaces

$$\mathcal{K}_k(A, B) \equiv \begin{cases} \operatorname{span}\{B\} & (k = 0), \\ \operatorname{span}\{B, AB, A^2 B, \ldots, A^{2^k - 1} B\} & (k > 0). \end{cases}$$

From (4) and (8), we can see that

$$B_0 = B \in \mathcal{K}_0(A, B), \qquad B_1 = [B, AB] \in \mathcal{K}_1(A, B)$$

and, for some low-ranked $F$,

$$B_2 = [B_1, A_1 B_1] = [B, AB, (A^2 - ABF)(B, AB)] \in \mathcal{K}_2(A, B).$$

(We have abused notations, with $V \in \mathcal{K}_k(A, B)$ meaning $\operatorname{span}\{V\} \subseteq \mathcal{K}_k(A, B)$.) Similarly, it is easy to show that

$$B_k \in \mathcal{K}_k(A, B), \qquad C_k \in \mathcal{K}_k(A^\top, C).$$
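The containment $B_2 \in \mathcal{K}_2(A, B)$ can be checked numerically on a small example. In the sketch below (Python with NumPy), the helper `krylov_basis` and the test matrices are hypothetical; $A_1$ is formed densely from (2), and the part of $B_2$ lying outside an orthonormal basis of $\mathcal{K}_2(A, B)$ is measured.

```python
import numpy as np

def krylov_basis(A, B, k):
    """Columns [B, AB, ..., A^(2^k - 1) B], spanning K_k(A, B) as defined above."""
    blocks = [B]
    for _ in range(2 ** k - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

# Hypothetical small DARE data with R = T = I.
n = 12
A = np.diag(np.linspace(0.1, 0.9, n)) + 0.2 * np.eye(n, k=1) + 0.1 * np.eye(n, k=-1)
rng = np.random.default_rng(1)
B = rng.standard_normal((n, 1))
C = rng.standard_normal((n, 1))
G, H = B @ B.T, C @ C.T

A1 = A @ np.linalg.solve(np.eye(n) + G @ H, A)   # A_1 = A_0 (I + G_0 H_0)^{-1} A_0
B1 = np.hstack([B, A @ B])                       # (8) with k = 0
B2 = np.hstack([B1, A1 @ B1])                    # (8) with k = 1

Q, _ = np.linalg.qr(krylov_basis(A, B, 2))       # orthonormal basis of K_2(A, B)
outside = np.linalg.norm(B2 - Q @ (Q.T @ B2)) / np.linalg.norm(B2)
```

A value of `outside` at round-off level indicates that every column of $B_2$ lies in $\mathcal{K}_2(A, B)$ for this instance.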

In other words, the general SDA is closely related to approximating the solutions $X$ and $Y$ using Krylov subspaces, with additional components vanishing quadratically. However, for problems of small size $n$, $B_k$ and $C_k$ become full-ranked after a few iterations.

The Krylov subspaces $\mathcal{K}_k(A, B)$ play a vital part in the fast convergence of the SDA, which comes from two sources. Apart from the diminishing $A_k$ contributing in (2) in the updating of $G$ and $H$, the power of approximation of the corresponding Krylov subspaces also contributes, creating cancellations in $G_{k+1}$ and $H_{k+1}$ in (6) and (7). This phenomenon has been confirmed in some extreme examples, with some eigenvalue $\lambda$ of the symplectic matrix pencil associated with the DARE nearly on the unit circle [16]. The SDA then requires significantly fewer iterations for convergence than the number predicted purely from $\lambda$.

2.4. Errors of SDA_ls

The SDA_ls can be interpreted as a Galerkin method, or directly from (2). With

$$\delta_k \equiv \max\{\|\delta G_k\|, \|\delta H_k\|, \|\delta A_k\|\},$$

where $\delta G_k$, $\delta H_k$ and $\delta A_k$ are respectively the truncation/round-off errors in $G_k$, $H_k$ and $A_k$, we can show

$$\delta_{k+1} \le (1 + c_k)\,\delta_k + O(\delta_k^2), \tag{14}$$

with $c_k \to 0$ as $k \to \infty$. A more detailed discussion can be found in [18, Section 2.5]. Essentially, we limit the rank of the approximation to $X$, trading off the accuracy in $X$ with the efficiency of the SDA_ls. Assume that the compression and truncation in (12) and (13) create errors of $O(\tau_i)$ ($i = 1, 2$) in $G_k$ and $H_k$, respectively. It is easy to see from (14) that errors of the same magnitude will propagate through to $A_{k+1}$, $G_{k+1}$ and $H_{k+1}$. The fact that $A_k \to 0$ implies $c_k \to 0$ and contributes towards diminishing these errors. From our numerical experience, the trade-off between the ranks of $G_k$ and $H_k$ and the accuracy of the approximate solutions to $X$ and $Y$ is the key to the success of our computation. If these ranks grow out of control, unnecessary and insignificant small additions to the iterates overwhelm the computation in terms of flop counts and memory requirement. Limiting the ranks will obviously reduce the accuracy of the approximate solution. We found we do not have to experiment much with the tolerances for the compression/truncation and convergence while trying to achieve a balance between accuracy and the feasibility/efficiency of the SDA.

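Table 1
Krylov subspaces for solution $X$ and adjoint solution $Y$.

| Equation | $X$ | $Y$ |
|---|---|---|
| DARE, Stein equation | $\mathcal{K}_k(A^\top, C)$ | $\mathcal{K}_k(A, B)$ |
| CARE, Lyapunov equation | $\mathcal{K}_k(A_\gamma^{-\top}, A_\gamma^{-\top} C)$ | $\mathcal{K}_k(A_\gamma^{-1}, A_\gamma^{-1} B)$ |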

3. CAREs

One possible approach for large-scale CAREs is to transform them to DAREs using Cayley transforms.

3.1. SDA after Cayley transform

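From [14], the matrices $A$, $G$ and $H$ in the CARE (1b) are first treated with the Cayley transform:

$$A_0 = I + 2\gamma \left( A_\gamma + G A_\gamma^{-\top} H \right)^{-1}, \tag{15}$$

$$G_0 = 2\gamma A_\gamma^{-1} G \left( A_\gamma^\top + H A_\gamma^{-1} G \right)^{-1}, \tag{16}$$

$$H_0 = 2\gamma \left( A_\gamma^\top + H A_\gamma^{-1} G \right)^{-1} H A_\gamma^{-1}, \tag{17}$$

with $A_\gamma \equiv A - \gamma I$ and a suitable $\gamma > 0$ chosen to optimize the condition of various matrix inversions. A simple application of the SMWF implies

$$\left( A_\gamma + G A_\gamma^{-\top} H \right)^{-1} = A_\gamma^{-1} - A_\gamma^{-1} G A_\gamma^{-\top} C \cdot T^{-1} \left( I_l + C^\top A_\gamma^{-1} G A_\gamma^{-\top} C T^{-1} \right)^{-1} \cdot C^\top A_\gamma^{-1} \tag{18a}$$

$$\phantom{\left( A_\gamma + G A_\gamma^{-\top} H \right)^{-1}} = A_\gamma^{-1} - A_\gamma^{-1} B \cdot \left( I_m + R^{-1} B^\top A_\gamma^{-\top} H A_\gamma^{-1} B \right)^{-1} R^{-1} \cdot B^\top A_\gamma^{-\top} H A_\gamma^{-1}. \tag{18b}$$

It is not hard to see, with the above initial $A_0$, $G_0$ and $H_0$, that the SDA_ls still works, again with exactly the same forms and updating formulae for $A_k$, $B_k$, $C_k$, $D_k^{(1)}$, $D_k^{(2)}$ and the inverses of $R_k$, $S_k$ and $T_k$. One relevant difference for CAREs is that $A_0 \neq A$ but satisfies, from (15), (18a) and (18b),

$$A_0 = I_n + 2\gamma A_\gamma^{-1} - D_0^{(1)} S_0^{-1} [D_0^{(2)}]^\top \tag{19}$$

with

$$B_0 = A_\gamma^{-1} B, \qquad C_0 = A_\gamma^{-\top} C. \tag{20}$$

The corresponding size-$l$ and size-$m$ perturbed updates have the forms, respectively,

$$D_0^{(2)} = C_0, \qquad D_0^{(1)} = A_\gamma^{-1} G C_0, \qquad S_0^{-1} = 2\gamma \left( I_l + T^{-1} C_0^\top G C_0 \right)^{-1} T^{-1}; \tag{21a}$$

$$D_0^{(1)} = B_0, \qquad D_0^{(2)} = A_\gamma^{-\top} H B_0, \qquad S_0^{-1} = 2\gamma \left( I_m + R^{-1} B_0^\top H B_0 \right)^{-1} R^{-1}. \tag{21b}$$

Note that all computations can be realized in $O(n)$ operations, assuming that the operations $A_\gamma^{-1} B$ and $A_\gamma^{-\top} C$ are achievable in $O(n)$ flops; see [20, Section 9.1] for a banded $A$.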
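To see the Cayley transform (15)–(17) and the doubling iteration (2) working together, the following dense Python sketch transforms a small CARE, runs a fixed number of doubling steps and evaluates the residual of (1b). The function names, the fixed iteration count, the shift $\gamma = 2$ and the test problem are assumptions for illustration; all matrices are dense here, so the cost is $O(n^3)$.

```python
import numpy as np

def cayley_transform(A, G, H, gamma):
    """Dense evaluation of (15)-(17) with A_gamma = A - gamma I."""
    n = A.shape[0]
    Ag = A - gamma * np.eye(n)
    E = np.linalg.inv(Ag + G @ np.linalg.solve(Ag.T, H))   # (A_g + G A_g^{-T} H)^{-1}
    F = np.linalg.inv(Ag.T + H @ np.linalg.solve(Ag, G))   # (A_g^T + H A_g^{-1} G)^{-1}
    A0 = np.eye(n) + 2.0 * gamma * E
    G0 = 2.0 * gamma * np.linalg.solve(Ag, G) @ F
    H0 = 2.0 * gamma * F @ H @ np.linalg.inv(Ag)
    return A0, G0, H0

def sda_fixed(A, G, H, steps=15):
    """Dense doubling iteration (2) for a fixed number of steps; returns H_k."""
    n = A.shape[0]
    for _ in range(steps):
        W = np.linalg.solve(np.eye(n) + G @ H, np.hstack([A, G]))
        WA, WG = W[:, :n], W[:, n:]
        G, H, A = G + A @ WG @ A.T, H + A.T @ H @ WA, A @ WA
    return H

# Hypothetical stable test problem.
rng = np.random.default_rng(0)
n = 10
A = -3.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = rng.standard_normal((n, 2))
C = rng.standard_normal((n, 2))
G, H = B @ B.T, C @ C.T

A0, G0, H0 = cayley_transform(A, G, H, gamma=2.0)
X = sda_fixed(A0, G0, H0)
care_res = np.linalg.norm(A.T @ X + X @ A - X @ G @ X + H) / np.linalg.norm(H)
```

A small `care_res` indicates that the limit of $H_k$ for the transformed problem also solves the original CARE (1b).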

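Similarly, we have

$$R_0^{-1} = 2\gamma \left[ R^{-1} - R^{-1} B^\top C_0 \cdot \left( I_l + T^{-1} C_0^\top G C_0 \right)^{-1} T^{-1} \cdot C_0^\top B R^{-1} \right] \tag{22a}$$

$$\phantom{R_0^{-1}} = 2\gamma \left[ R^{-1} - R^{-1} B_0^\top H B_0 \left( I_m + R^{-1} B_0^\top H B_0 \right)^{-1} R^{-1} \right], \tag{22b}$$

and

$$T_0^{-1} = 2\gamma \left[ T^{-1} - T^{-1} \left( I_l + C_0^\top G C_0 T^{-1} \right)^{-1} C_0^\top G C_0 T^{-1} \right] \tag{23a}$$

$$\phantom{T_0^{-1}} = 2\gamma \left[ T^{-1} - T^{-1} C^\top B_0 \cdot R^{-1} \left( I_m + B_0^\top H B_0 R^{-1} \right)^{-1} \cdot B_0^\top C T^{-1} \right]. \tag{23b}$$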
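The factored initial values (19), (20), (21a), (22a) and (23a) can be compared with the dense Cayley transform (15)–(17) on a small example. In the Python sketch below, the function name `cayley_lowrank_init` and the test data are hypothetical, and $A_\gamma^{-1}$ is formed by a dense inverse purely for brevity; at large scale it would be applied through a sparse or banded solve.

```python
import numpy as np

def cayley_lowrank_init(A, B, Rinv, C, Tinv, gamma):
    """Factored initial values for the CARE, following (19)-(23a)."""
    n, l = A.shape[0], C.shape[1]
    Ag_inv = np.linalg.inv(A - gamma * np.eye(n))    # demo only: dense inverse
    B0 = Ag_inv @ B                                  # (20)
    C0 = Ag_inv.T @ C                                # (20)
    BtC0 = B.T @ C0
    K = BtC0.T @ Rinv @ BtC0                         # C0^T G C0
    S0inv = 2.0 * gamma * np.linalg.solve(np.eye(l) + Tinv @ K, Tinv)   # (21a)
    D1 = B0 @ (Rinv @ BtC0)                          # A_gamma^{-1} G C0
    D2 = C0
    R0inv = 2.0 * gamma * Rinv - Rinv @ BtC0 @ S0inv @ BtC0.T @ Rinv    # (22a)
    T0inv = 2.0 * gamma * Tinv - Tinv @ K @ S0inv    # (23a), rearranged
    A0 = np.eye(n) + 2.0 * gamma * Ag_inv - D1 @ S0inv @ D2.T           # (19)
    return A0, B0, R0inv, C0, T0inv

# Hypothetical test data.
rng = np.random.default_rng(0)
n, gamma = 15, 2.0
A = -3.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = rng.standard_normal((n, 2))
C = rng.standard_normal((n, 3))
Rinv = np.diag([1.0, 0.5])
Tinv = np.diag([1.0, 2.0, 0.25])
G, H = B @ Rinv @ B.T, C @ Tinv @ C.T

# Dense reference from (15)-(17).
Ag = A - gamma * np.eye(n)
F = np.linalg.inv(Ag.T + H @ np.linalg.solve(Ag, G))
A0_ref = np.eye(n) + 2.0 * gamma * np.linalg.inv(Ag + G @ np.linalg.solve(Ag.T, H))
G0_ref = 2.0 * gamma * np.linalg.solve(Ag, G) @ F
H0_ref = 2.0 * gamma * F @ H @ np.linalg.inv(Ag)

A0, B0, R0inv, C0, T0inv = cayley_lowrank_init(A, B, Rinv, C, Tinv, gamma)
err_A = np.linalg.norm(A0 - A0_ref)
err_G = np.linalg.norm(B0 @ R0inv @ B0.T - G0_ref)
err_H = np.linalg.norm(C0 @ T0inv @ C0.T - H0_ref)
```

Only $l \times l$ systems are solved besides the products with $A_\gamma^{-1}$, which is where the $O(n)$ cost for banded $A$ comes from.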

For CAREs, we have

$$B_k \in \mathcal{K}_k(A_\gamma^{-1}, A_\gamma^{-1} B), \qquad C_k \in \mathcal{K}_k(A_\gamma^{-\top}, A_\gamma^{-\top} C). \tag{24}$$

Note that the Krylov subspaces $\mathcal{K}_k(A^{\pm 1}, B)$ and $\mathcal{K}_k(A^{\pm\top}, C)$ have been used in the solution of CAREs and Lyapunov equations in [21–26], quite different from the subspaces associated with the SDA here. This difference may explain the superiority of our methods. From (24) and [18,27], we can see clearly the appropriate choices of Krylov subspaces for DAREs and CAREs, as well as the corresponding Stein and Lyapunov equations. A summary is contained in Table 1.


We summarize the algorithm below, with the particular choice of (4), (5a), (8), (9a), (10b), (12) and (13). We would like to emphasize that care has to be exercised in Algorithm 1 below, with the multiplications by $A_{k+1}$ and $A_{k+1}^\top$ carried out recursively using (4) and (5a) or (5b). Otherwise, computations cannot be carried out in $O(n)$ complexity. Similar care has to be taken in the computation of residuals (used in Algorithm 1 below) or differences of iterates (as an alternative convergence control), as discussed in Section 4.2 later.

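Algorithm 1 (SDA_ls)

Input: $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $R^{-1} = R^{-\top} \in \mathbb{R}^{m \times m}$, $C \in \mathbb{R}^{n \times l}$, $T^{-1} = T^{-\top} \in \mathbb{R}^{l \times l}$, shift $\gamma > 0$, positive tolerances $\tau_1$, $\tau_2$ and $\epsilon$, and $m_{\max}$, $l_{\max}$;

Output: $B_\epsilon \in \mathbb{R}^{n \times m_\epsilon}$, $R_\epsilon^{-1} = R_\epsilon^{-\top} \in \mathbb{R}^{m_\epsilon \times m_\epsilon}$, $C_\epsilon \in \mathbb{R}^{n \times l_\epsilon}$ and $T_\epsilon^{-1} = T_\epsilon^{-\top} \in \mathbb{R}^{l_\epsilon \times l_\epsilon}$, with $C_\epsilon T_\epsilon^{-1} C_\epsilon^\top$ and $B_\epsilon R_\epsilon^{-1} B_\epsilon^\top$ approximating, respectively, the solutions $X$ and $Y$ to the large-scale CARE (1b) and its adjoint;

Compute $A_\gamma = A - \gamma I$;
Set $k = 0$, $\tilde{r}_0 = 2\epsilon$;
    $B_0 = A_\gamma^{-1} B$, $C_0 = A_\gamma^{-\top} C$;
    $R_0^{-1} = 2\gamma \left[ R^{-1} - R^{-1} B^\top C_0 \cdot \left( I_l + T^{-1} C_0^\top G C_0 \right)^{-1} T^{-1} \cdot C_0^\top B R^{-1} \right]$,
    $T_0^{-1} = 2\gamma \left[ T^{-1} - T^{-1} C^\top B_0 \cdot R^{-1} \left( I_m + B_0^\top H B_0 R^{-1} \right)^{-1} \cdot B_0^\top C T^{-1} \right]$;
    $D_0^{(2)} = C_0$, $D_0^{(1)} = A_\gamma^{-1} G C_0$, $S_0^{-1} = 2\gamma \left( I_l + T^{-1} C_0^\top G C_0 \right)^{-1} T^{-1}$,
    $A_0 = I_n + 2\gamma A_\gamma^{-1} - D_0^{(1)} S_0^{-1} [D_0^{(2)}]^\top$;
Compute $h = \|H_0\| = \|C_0 T_0^{-1} C_0^\top\|$;
Do until convergence:
    If the relative residual $\tilde{r}_k = |d_k / (h_k + m_k + h)| < \epsilon$,
        Set $B_\epsilon = B_k$, $R_\epsilon^{-1} = R_k^{-1}$, $C_\epsilon = C_k$ and $T_\epsilon^{-1} = T_k^{-1}$;
        Exit
    End If
    Compute $B_{k+1} = [B_k, A_k B_k]$, $C_{k+1} = [C_k, A_k^\top C_k]$;
        $R_{k+1}^{-1} = R_k^{-1} \oplus \left[ R_k^{-1} - R_k^{-1} B_k^\top C_k T_k^{-1} \left( I_{l_k} + C_k^\top G_k C_k T_k^{-1} \right)^{-1} C_k^\top B_k R_k^{-1} \right]$,
        $T_{k+1}^{-1} = T_k^{-1} \oplus \left[ T_k^{-1} - T_k^{-1} C_k^\top B_k \left( I_{m_k} + R_k^{-1} B_k^\top H_k B_k \right)^{-1} R_k^{-1} B_k^\top C_k T_k^{-1} \right]$;
        with $A_{k+1} = A_k^2 - D_{k+1}^{(1)} S_{k+1}^{-1} [D_{k+1}^{(2)}]^\top$,
        $D_{k+1}^{(1)} = A_k G_k C_k$, $D_{k+1}^{(2)} = A_k^\top C_k$, $S_{k+1}^{-1} = T_k^{-1} \left( I_{l_k} + C_k^\top G_k C_k T_k^{-1} \right)^{-1}$;
    Compress $B_{k+1}$ and $C_{k+1}$, using the tolerances $\tau_1$ and $\tau_2$, and modify $R_{k+1}^{-1}$ and $T_{k+1}^{-1}$, as in (12) and (13);
    Compute $k \leftarrow k + 1$, $d_k = \|\mathcal{D}(H_k)\|$, $h_k = \|H_k\|$ and $m_k = \|M_k\|$, as in Section 4.2;
End Do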
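The overall flow of Algorithm 1 can be sketched compactly in Python for a small problem. This is a simplified demonstration under several stated assumptions, not the authors' implementation: $A_k$ is kept as an explicit dense matrix instead of the recursive form, $A_\gamma^{-1}$ is a dense inverse, compression uses a truncated SVD with a relative tolerance in place of pivoted QR, (10a) and (23a) are used for $T_k^{-1}$, and convergence is monitored by the relative change in $H_k$ rather than the relative residual $\tilde{r}_k$. All function names are hypothetical.

```python
import numpy as np

def bdiag(X, Y):
    """Direct sum of two square blocks."""
    Z = np.zeros((X.shape[0] + Y.shape[0], X.shape[1] + Y.shape[1]))
    Z[:X.shape[0], :X.shape[1]] = X
    Z[X.shape[0]:, X.shape[1]:] = Y
    return Z

def compress(F, Minv, tau):
    """Truncated-SVD stand-in for (12)/(13): F Minv F^T ~= Q (M Minv M^T) Q^T."""
    U, s, Vt = np.linalg.svd(F, full_matrices=False)
    r = max(1, int(np.sum(s > tau * s[0])))
    M = s[:r, None] * Vt[:r]
    return U[:, :r], M @ Minv @ M.T

def sda_ls_demo(A, B, Rinv, C, Tinv, gamma, tau=1e-14, eps=1e-12, max_iter=30):
    n = A.shape[0]
    Ag_inv = np.linalg.inv(A - gamma * np.eye(n))      # dense, for the demo only
    Bk, Ck = Ag_inv @ B, Ag_inv.T @ C                  # (20)
    BtC = B.T @ Ck
    K = BtC.T @ Rinv @ BtC
    S = 2 * gamma * np.linalg.solve(np.eye(K.shape[0]) + Tinv @ K, Tinv)   # (21a)
    Ak = np.eye(n) + 2 * gamma * Ag_inv - (Bk @ (Rinv @ BtC)) @ S @ Ck.T   # (19)
    Rk = 2 * gamma * Rinv - Rinv @ BtC @ S @ BtC.T @ Rinv                  # (22a)
    Tk = 2 * gamma * Tinv - Tinv @ K @ S                                   # (23a)
    Hk = Ck @ Tk @ Ck.T
    for it in range(1, max_iter + 1):
        BtC = Bk.T @ Ck
        K = BtC.T @ Rk @ BtC                           # C_k^T G_k C_k
        S = np.linalg.solve(np.eye(K.shape[0]) + Tk @ K, Tk)   # S_{k+1}^{-1}, (5a)
        D1, D2 = Ak @ (Bk @ (Rk @ BtC)), Ak.T @ Ck
        R_new = bdiag(Rk, Rk - Rk @ BtC @ S @ BtC.T @ Rk)      # (9a)
        T_new = bdiag(Tk, Tk - Tk @ K @ S)                     # (10a)
        B_new, C_new = np.hstack([Bk, Ak @ Bk]), np.hstack([Ck, D2])   # (8)
        Ak = Ak @ Ak - D1 @ S @ D2.T                           # (4), dense here
        Bk, Rk = compress(B_new, R_new, tau)
        Ck, Tk = compress(C_new, T_new, tau)
        H_new = Ck @ Tk @ Ck.T                         # dense n x n, demo only
        change = np.linalg.norm(H_new - Hk) / np.linalg.norm(H_new)
        Hk = H_new
        if change < eps:
            break
    return Bk, Rk, Ck, Tk, it

# Hypothetical test problem.
rng = np.random.default_rng(0)
n = 20
A = -3.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B, C = rng.standard_normal((n, 1)), rng.standard_normal((n, 1))
Rinv, Tinv = np.eye(1), np.eye(1)
G, H = B @ Rinv @ B.T, C @ Tinv @ C.T

Be, Re, Ce, Te, its = sda_ls_demo(A, B, Rinv, C, Tinv, gamma=2.0)
X = Ce @ Te @ Ce.T
care_res = np.linalg.norm(A.T @ X + X @ A - X @ G @ X + H) / np.linalg.norm(H)
```

The returned factors play the roles of $B_\epsilon$, $R_\epsilon^{-1}$, $C_\epsilon$ and $T_\epsilon^{-1}$, and `care_res` is the relative residual of (1b) for $X = C_\epsilon T_\epsilon^{-1} C_\epsilon^\top$.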

4. Computational issues

4.1. Residuals and convergence control

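Consider the difference of successive iterates:

$$dG_k \equiv B_k R_k^{-1} B_k^\top - B_{k+1} R_{k+1}^{-1} B_{k+1}^\top = \hat{B}_{k+1} \hat{R}_{k+1}^{-1} \hat{B}_{k+1}^\top,$$

where we have

$$\hat{B}_{k+1} \equiv [B_k, B_{k+1}], \qquad \hat{R}_{k+1}^{-1} \equiv R_k^{-1} \oplus \left( -R_{k+1}^{-1} \right).$$

Similarly, with $dH_k \equiv C_k T_k^{-1} C_k^\top - C_{k+1} T_{k+1}^{-1} C_{k+1}^\top$, we have

$$dH_k = \hat{C}_{k+1} \hat{T}_{k+1}^{-1} \hat{C}_{k+1}^\top \quad \text{with} \quad \hat{C}_{k+1} \equiv [C_k, C_{k+1}], \qquad \hat{T}_{k+1}^{-1} \equiv T_k^{-1} \oplus \left( -T_{k+1}^{-1} \right).$$

Alternatively, (6) and (7) imply similar results.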
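The factored form of $dG_k$ allows its norm to be evaluated without forming any $n \times n$ matrix. One way to do this, sketched below in Python, is to take a reduced QR factorization of the stacked factor and compute the norm of a small matrix; this particular route, the function name `lowrank_diff_norm` and the random test factors are assumptions for illustration, and the paper's own procedure may differ.

```python
import numpy as np

def lowrank_diff_norm(Bk, Rk_inv, Bk1, Rk1_inv):
    """Frobenius norm of Bk Rk_inv Bk^T - Bk1 Rk1_inv Bk1^T using only
    an n x (p+q) QR factorization and (p+q) x (p+q) products."""
    Bhat = np.hstack([Bk, Bk1])                  # [B_k, B_{k+1}]
    p, q = Rk_inv.shape[0], Rk1_inv.shape[0]
    Rhat = np.zeros((p + q, p + q))              # R_k^{-1} (+) (-R_{k+1}^{-1})
    Rhat[:p, :p] = Rk_inv
    Rhat[p:, p:] = -Rk1_inv
    _, M = np.linalg.qr(Bhat)                    # reduced QR: Bhat = Q M
    # Q has orthonormal columns, so ||Q X Q^T||_F = ||X||_F.
    return np.linalg.norm(M @ Rhat @ M.T)

# Hypothetical factors standing in for two successive iterates.
rng = np.random.default_rng(0)
n = 200
Bk, Bk1 = rng.standard_normal((n, 3)), rng.standard_normal((n, 6))
Sk = rng.standard_normal((3, 3)); Rk_inv = Sk + Sk.T
Sk1 = rng.standard_normal((6, 6)); Rk1_inv = Sk1 + Sk1.T

fast = lowrank_diff_norm(Bk, Rk_inv, Bk1, Rk1_inv)
dense = np.linalg.norm(Bk @ Rk_inv @ Bk.T - Bk1 @ Rk1_inv @ Bk1.T)
rel_err = abs(fast - dense) / dense
```

The same routine applies to $dH_k$ with $(C_k, T_k^{-1})$ and $(C_{k+1}, T_{k+1}^{-1})$ in place of the $B$ and $R$ factors.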
