Algorithms for Structured Linear Systems Solving and their Implementation

(1) Algorithms for Structured Linear Systems Solving and their Implementation. S. G. Hyun, Éric Schost (University of Waterloo), Romain Lebreton (Université de Montpellier). Séminaire Aromath, Inria Sophia Antipolis, April 25th 2018.

(2) Motivation. 2/34.

Hermite-Padé approximants. Given r_1, ..., r_s ∈ k[x], compute f_1, ..., f_s ∈ k[x] such that

  f_1 r_1 + ... + f_s r_s = 0 mod x^σ,  plus degree conditions.

Example. If

  r_0 = 8x^4 − 8x^2 + 1,  r_1 = 16x^5 − 20x^3 + 5x,  r_2 = 32x^6 − 48x^4 + 18x^2 − 1

(Chebyshev polynomials), then we find the relation r_0 − 2x·r_1 + r_2 = 0 mod x^3.
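As a sanity check, the Chebyshev relation above can be verified directly. This is a minimal Python sketch (not the talk's C++/NTL code); polynomials are plain coefficient lists, lowest degree first:

```python
# Polynomials as coefficient lists, lowest degree first.
def poly_add(p, q):
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_mul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

# Chebyshev polynomials T4, T5, T6 from the slide.
r0 = [1, 0, -8, 0, 8]            # 8x^4 - 8x^2 + 1
r1 = [0, 5, 0, -20, 0, 16]       # 16x^5 - 20x^3 + 5x
r2 = [-1, 0, 18, 0, -48, 0, 32]  # 32x^6 - 48x^4 + 18x^2 - 1

# f0 = 1, f1 = -2x, f2 = 1: the approximant r0 - 2x*r1 + r2
rel = poly_add(poly_add(r0, poly_mul([0, -2], r1)), r2)
print(all(c == 0 for c in rel))  # True: the relation even holds identically
```

Here the relation happens to hold exactly (it is the Chebyshev recurrence T_6 = 2x·T_5 − T_4), not only mod x^3.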

(3) Motivation (continued). 2/34.

Hermite-Padé approximants are solved by order bases or by structured linear algebra: stacking the coefficients of the f_i into one vector, the condition f_1 r_1 + ... + f_s r_s = 0 mod x^σ becomes a homogeneous linear system

  [ T_1 | T_2 | ... | T_s ] · (f_1,0, ..., f_2,0, ..., f_s,0, ...)^t = 0,

where each block T_i is the lower triangular Toeplitz matrix whose first column holds the coefficients r_{i,0}, r_{i,1}, ... of r_i; the whole matrix is block Toeplitz.

Special cases: algebraic approximants (r_{i+1} = r^i) and differential approximants (r_{i+1} = d^i r/dx^i).

(4) Outline of the talk. 3/34.

1. Structured matrices
   a. Toeplitz, Hankel, Vandermonde, Cauchy
   b. Displacement operators
   c. Pan's motto: compress, operate, decompress
   d. Structured matrix multiplication
2. Structured matrix inversion
   a. Cauchy-like inversion: iterative & divide-and-conquer algorithms
   b. Toeplitz-like inversion
3. Implementations

(5) Toeplitz matrices. 4/34.

T = (t_{i−j})_{0≤i,j<n}: constant along diagonals, with entries t_{−(n−1)}, ..., t_{−1}, t_0, t_1, ..., t_{n−1}.

Remarks: T is described by O(n) elements, and the Hermite-Padé matrix was block Toeplitz.

Fast algorithms for linear system solving (arithmetic cost, K = F_q):
1. O(n^2)  [Levinson '47], [Trench '64]
2. Õ(n)   [Brent, Gustavson, Yun '80], [Morf '80], [Bitmead, Anderson '80]

Fast matrix-vector product: (T·b)_i = b_0 t_i + b_1 t_{i−1} + ... + b_{n−1} t_{i−(n−1)}, which is the coefficient of degree n−1+i in P_T(x)·P_b(x), where P_T(x) = Σ_i t_{−(n−1)+i} x^i and P_b(x) = Σ_i b_i x^i. Hence T·b is computed in time M(n).
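The reduction of T·b to one polynomial product can be sketched as follows (a Python stand-in for the NTL-based code; the schoolbook product is used here, an FFT-based product would give the stated M(n) bound):

```python
def toeplitz_matvec(t, b):
    """T*b where T[i][j] = t_{i-j}, with t_k stored as t[k + n - 1]."""
    n = len(b)
    # P_T(x) = sum_i t_{-(n-1)+i} x^i, P_b(x) = sum_j b_j x^j
    prod = [0] * (3 * n - 2)
    for i, ti in enumerate(t):
        for j, bj in enumerate(b):
            prod[i + j] += ti * bj      # schoolbook; FFT would give M(n)
    # (T*b)_i is the coefficient of degree n-1+i of P_T * P_b
    return prod[n - 1: 2 * n - 1]

# Example: n = 3, entries t_{-2..2} = 1..5, so T = [[3,2,1],[4,3,2],[5,4,3]]
t = [1, 2, 3, 4, 5]
b = [1, 1, 1]
print(toeplitz_matvec(t, b))  # -> [6, 9, 12], the row sums of T
```

The same coefficients can be checked against the naive quadratic product.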

(6) Hankel, Vandermonde and Cauchy matrices. 5/34.

Hankel: H = (h_{i+j})_{0≤i,j<n}, constant along anti-diagonals, with entries h_0, h_1, ..., h_{2n−2}.

Vandermonde: V = (u_i^j)_{1≤i≤n, 0≤j≤n−1}, with rows (1, u_i, u_i^2, ..., u_i^{n−1}).

Cauchy: C = (1/(u_i − v_j))_{1≤i,j≤n}.

(7) Structured matrix-vector product. 6/34.

Hankel: (H·b)_i = b_{n−1} h_i + ... + b_0 h_{i+n−1}, a polynomial multiplication, in M(n).

Vandermonde: V·b is the multipoint evaluation of Σ_j b_j X^j at u_1, ..., u_n, in O(M(n) log(n)).

Cauchy: (C·b)_i = b_1/(u_i − v_1) + ... + b_n/(u_i − v_n), the multipoint evaluation of Σ_j b_j/(X − v_j) at u_1, ..., u_n, in O(M(n) log(n)).

Note: the cost drops from O(M(n) log(n)) to O(M(n)) if (u_i) and (v_j) are geometric sequences.
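The Vandermonde case makes the evaluation viewpoint concrete. A minimal Python sketch: this uses Horner's scheme per point (O(n^2) overall); the O(M(n) log n) bound requires a subproduct-tree multipoint evaluation, not shown here:

```python
def horner(coeffs, x):
    """Evaluate sum_j coeffs[j] * x^j by Horner's scheme."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def vandermonde_matvec(u, b):
    # V*b with V[i][j] = u[i]**j is exactly the multipoint evaluation
    # of the polynomial sum_j b_j X^j at the points u_1, ..., u_n.
    return [horner(b, ui) for ui in u]

u = [2, 3, 5]
b = [1, 0, 1]                    # the polynomial 1 + X^2
print(vandermonde_matvec(u, b))  # -> [5, 10, 26]
```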

(8) What is a structured matrix? 7/34.

The Toeplitz, Hankel, Cauchy and Vandermonde frameworks have similarities. Can we find one unified framework?

What we could want from this framework:
1. Incorporate block Toeplitz matrices.
2. Meet the requirements of a divide-and-conquer inversion algorithm in the style of [Strassen '69]:
   a. stability under sum and product,
   b. stability under inverse,
   c. quasi-linear product of structured matrices.

Strassen's formula: with A = [A_00, A_01; A_10, A_11] and Schur complement K = A_11 − A_10 A_00^{−1} A_01,

  A^{−1} = [Id, −A_00^{−1} A_01; 0, Id] · [A_00^{−1}, 0; 0, K^{−1}] · [Id, 0; −A_10 A_00^{−1}, Id].

(9) Displacement rank. 8/34.  [Kailath, Kung, Morf '79]

Example (Toeplitz). Let Z be the n×n lower shift matrix, with ones on the subdiagonal and zeros elsewhere. Then Z·T = (T shifted down) and T·Z = (T shifted right), so consider

  ∇_{Z,Z}(T) = Z·T − T·Z = (T↓) − (T→).

This difference is supported on the first row and last column only: the first row is (−t_{−1}, −t_{−2}, ..., −t_{−(n−1)}, 0) and the last column is (0, t_{−(n−1)}, ..., t_{−1})^t. So T has ∇_{Z,Z}-displacement rank ≤ 2.

Definition of displacement rank [Kailath, Kung, Morf '79]. A displacement operator is a map ∇: A ∈ K^{n×n} ↦ ∇(A) ∈ K^{n×n}. The ∇-displacement rank of A is rank(∇(A)).

(10) Displacement rank (continued). 9/34.

Example (Hankel). Consider ∇_{Z,Z^t}(H) = Z·H − H·Z^t = (H↓) − (H→). The two shifted copies agree everywhere except on the first row and first column, so ∇_{Z,Z^t}(H) is supported there, with entries ±h_0, ..., ±h_{n−2}. Hence H has ∇_{Z,Z^t}-displacement rank ≤ 2.

(11) Displacement rank (continued). 9/34.

Example (Vandermonde). Consider ∇_{Du,Z}(V) = Du·V − V·Z with Du = Diag(u_1, ..., u_n). Row i of Du·V is (u_i, u_i^2, ..., u_i^n) and row i of V·Z is (u_i, u_i^2, ..., u_i^{n−1}, 0), so only the last column of ∇_{Du,Z}(V) is nonzero, equal to (u_1^n, ..., u_n^n)^t. Hence V has ∇_{Du,Z}-displacement rank 1.

(12) Displacement rank (continued). 9/34.

Example (Cauchy). Consider ∇_{Du,Dv}(C) = Du·C − C·Dv. Entry (i,j) equals (u_i − v_j)/(u_i − v_j) = 1, so ∇_{Du,Dv}(C) is the all-ones matrix and C has ∇_{Du,Dv}-displacement rank 1.

Historical note: initially the displacement rank approach had a more restricted scope; [KKM '79] introduced the concept as a measure of how "close" to Toeplitz a given matrix is.
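The Cauchy computation is a one-liner to verify exactly. A Python sketch with rational arithmetic:

```python
from fractions import Fraction

u = [Fraction(k) for k in (2, 3, 5)]
v = [Fraction(k) for k in (7, 11, 13)]
n = len(u)

C = [[1 / (u[i] - v[j]) for j in range(n)] for i in range(n)]
# Sylvester operator with diagonal M = Du, N = Dv:
# (Du*C - C*Dv)[i][j] = (u_i - v_j) * C[i][j] = 1 for all i, j.
disp = [[u[i] * C[i][j] - C[i][j] * v[j] for j in range(n)] for i in range(n)]
print(all(x == 1 for row in disp for x in row))  # True: all-ones, rank 1
```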

(13) Displacement operators. 10/34.

Two main families of displacement operators:
1. Sylvester: ∇_{M,N}(A) = M·A − A·N
2. Stein: Δ_{M,N}(A) = A − M·A·N

In this talk, M, N ∈ {Du, Z, Z^t}. There is a more general theory with block companion M, N, which applies to the Hermite-Padé generalization f_1 R_{i,1} + ... + f_s R_{i,s} = 0 mod P_i [Olshevsky, Shokrollahi '00], [Bostan, Jeannerod, Mouilleron, Schost '17].

Informally, A is structured for ∇_{M,N} if α := rank(∇_{M,N}(A)) ≪ n. ∇_{Z,Z}: Toeplitz-like; ∇_{Z,Z^t}: Hankel-like; ∇_{Du,Dv}: Cauchy-like; ∇_{Du,Z}: Vandermonde-like.

Pan's motto: compress, operate, decompress.
1. Compression: store A with ≪ n^2 elements.
2. Operate: sum and product of structured matrices, matrix-vector product, inversion...
3. Decompression: recover A.

(14) Pan's motto: compress, decompress. 11/34.

Compression. If ∇(A) has rank α, store A as ∇(A) = G·H^t with G, H ∈ K^{n×α}. Size: 2αn ≪ n^2. Cost: O(α^{ω−2} n^2) to echelonize ∇(A) (e.g. PLUQ) and read off G·H^t.

Decompression — invertibility. Is ∇_{M,N} always invertible? No: with ∇_{Z,Z}(A) = Z·A − A·Z, we have ∇_{Z,Z}(Z^i) = 0 for all i.

Theorem. If Spec(M) ∩ Spec(N) = ∅, then ∇_{M,N} is invertible.

Decompression — reconstruction formula (Cauchy). If A = (a_{i,j})_{i,j}, then ∇_{Du,Dv}(A) = (a_{i,j}·(u_i − v_j))_{i,j}, so A is recovered entrywise from G·H^t.

(15) Pan's motto: compress, decompress (continued). 12/34.

Decompression — reconstruction formula (Vandermonde). If M is invertible and N is nilpotent:

  M^n A = M^{n−1}(M A − A N) + M^{n−1} A N
        = M^{n−1}(M A − A N) + M^{n−2}(M A − A N) N + M^{n−2} A N^2
        = Σ_{i=1}^{n} M^{i−1}(M A − A N) N^{n−i} + A N^n     (N^n = 0)
        = Σ_{i=1}^{n} M^{i−1} G H^t N^{n−i}
        = [G | M G | ... | M^{n−1} G] · [H | N^t H | ... | (N^t)^{n−1} H]^t.

Set Krylov(M, G) = [G | M G | ... | M^{n−1} G]; then

  A = M^{−1} · Krylov(M^{−1}, G) · Krylov(N^t, H)^t.

This applies to Vandermonde-like matrices, since M = Du is invertible and N = Z is nilpotent.

(16) Pan's motto: compress, decompress (continued). 13/34.

Decompression — reconstruction formulae (Toeplitz, Hankel). Tricks for ∇_{Z,Z}:

• Consider ∇_{Z_1,Z} with Z_1 = Z + e_{1,n} (cyclic shift). ∇_{Z_1,Z} is invertible, since Z_1 is invertible and Z is nilpotent; moreover |rank ∇_{Z_1,Z}(A) − rank ∇_{Z,Z}(A)| ≤ 1, since ∇_{Z_1,Z}(A) − ∇_{Z,Z}(A) = (Z_1 − Z)·A = e_{1,n}·A has rank ≤ 1.

• Or consider the Stein operator Δ_{Z^t,Z}(A) = A − Z^t·A·Z, which is invertible.

(17) Stability of structured matrices. 14/34.

Let α = rank(∇_{M,N}(A)) and β = rank(∇_{M,N}(B)) (for the product below, β = rank(∇_{N,P}(B))).

1. Addition: ∇_{M,N}(A + B) = ∇_{M,N}(A) + ∇_{M,N}(B), so rank ≤ α + β.
2. Transpose: ∇_{N^t,M^t}(A^t) = −(M A − A N)^t = −(∇_{M,N}(A))^t, so rank = α.
3. Inverse: A^{−1} ∇_{M,N}(A) A^{−1} = A^{−1} M − N A^{−1} = −∇_{N,M}(A^{−1}), so rank = α.
4. Multiplication: ∇_{M,P}(A·B) = (M A − A N)·B + A·(N B − B P) = ∇_{M,N}(A)·B + A·∇_{N,P}(B), so rank ≤ α + β. Note: this requires (structured matrix) × (dense matrix) products.
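The multiplication identity is purely algebraic, so it holds for any M, N, P, A, B; a quick exact check in Python (random rational matrices, no structure assumed):

```python
from fractions import Fraction
import random

random.seed(1)
n = 4
rmat = lambda: [[Fraction(random.randint(-5, 5)) for _ in range(n)]
                for _ in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def matsub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def disp(M, N, A):
    """Sylvester displacement operator M*A - A*N."""
    return matsub(matmul(M, A), matmul(A, N))

M, N, P, A, B = rmat(), rmat(), rmat(), rmat(), rmat()
lhs = disp(M, P, matmul(A, B))
rhs = matadd(matmul(disp(M, N, A), B), matmul(A, disp(N, P, B)))
print(lhs == rhs)  # True: the rank <= alpha + beta identity holds
```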


(21) Structured matrix × vector product. 15/34.

Do we still have quasi-linear structured matrix × vector multiplication?

Remark: A = ∇^{−1}(G·H^t) = ∇^{−1}(G_1·H_1^t) + ... + ∇^{−1}(G_α·H_α^t), where G_i, H_i are the i-th columns, so A is the sum of α structured matrices of displacement rank 1.

Cauchy-like, α = 1: if ∇_{Du,Dv}(A) = (g_i h_j)_{i,j}, then A = (g_i h_j/(u_i − v_j))_{i,j} = Dg · C_{u,v} · Dh, so the Cauchy-like matrix-vector product reduces to a Cauchy matrix-vector product.

Cauchy cost: O(M(n) log(n)). Cauchy-like cost: O(α M(n) log(n)).

(22) Structured matrix × vector product (continued). 15/34.

Cauchy-like, α = 1 — trick for geometric sequences with the same ratio. Suppose u_i = ρ^{i−1} u_1 and v_i = ρ^{i−1} v_1.

Classical approach: interpolation of Σ_j b_j/(X − v_j) on a geometric sequence (2 M(n)), then evaluation on the geometric sequence u_1, ..., u_n (M(n)): 3 M(n) in total.

Approach with the trick:

  [1/(u_i − v_j)] = [1/(ρ^{i−1} u_1 − ρ^{j−1} v_1)] = Diag(1, ρ^{−1}, ρ^{−2}, ...) · [1/(u_1 − ρ^{j−i} v_1)],

and the right-hand matrix is Toeplitz (its entries depend only on j − i). So we reduce to a Toeplitz matrix-vector product: M(n) (vs. 3 M(n)).
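The diagonal-times-Toeplitz factorization above can be checked exactly. A Python sketch with arbitrary sample values for ρ, u_1, v_1 (chosen so that no denominator vanishes):

```python
from fractions import Fraction

n = 4
rho = Fraction(2)
u1, v1 = Fraction(1), Fraction(1, 3)
u = [u1 * rho**i for i in range(n)]   # geometric sequences, same ratio
v = [v1 * rho**i for i in range(n)]

C = [[1 / (u[i] - v[j]) for j in range(n)] for i in range(n)]

# Claimed factorization: C = Diag(1, rho^-1, ...) * T, where
# T[i][j] = 1 / (u1 - rho^(j-i) * v1) is Toeplitz (depends on j - i only).
T = [[1 / (u1 - rho**(j - i) * v1) for j in range(n)] for i in range(n)]
D = [rho**(-i) for i in range(n)]
print(all(C[i][j] == D[i] * T[i][j]
          for i in range(n) for j in range(n)))  # True
```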

(23) Structured matrix × vector product (continued). 16/34.

Toeplitz-like, α = 1. When ∇_{Z_1,Z}(A) = g·h^t has rank 1, the reconstruction formula gives

  A = Z_1^{−1} · Krylov(Z_1^{−1}, g) · Krylov(Z^t, h)^t,

where Krylov(Z_1^{−1}, g) is a circulant-type matrix built from g and Krylov(Z^t, h)^t is a triangular Toeplitz matrix built from h. So we reduce to Toeplitz matrix-vector products.

Cost: 2 M(n). Toeplitz-like cost: 2 α M(n). The same holds for Hankel-like and Vandermonde-like matrices.

(24) Structured matrix × dense matrix product. 17/34.

Structured matrix × dense n×α matrix product:
• Algorithm 1: decompose into α structured matrix × vector products, each of cost Õ(αn): total Õ(α^2 n).
• Algorithm 2 [Bostan, Jeannerod, Schost '08], [Bostan, Jeannerod, Mouilleron, Schost '17]: total Õ(α^{ω−1} n) instead of Õ(α^2 n).

Consequence for the product of structured matrices: if ∇_{M,N}(A) = G·H^t and ∇_{N,P}(B) = Y·Z^t with G, H, Y, Z ∈ K^{n×α}, then

  ∇_{M,P}(A·B) = ∇_{M,N}(A)·B + A·∇_{N,P}(B) = G·(B^t H)^t + (A·Y)·Z^t.

A·Y and B^t·H are (structured matrix) × (α vectors) products. Cost: O(α^{ω−1} M(n)) or O(α^{ω−1} M(n) log(n)).

(25) Outline. 18/34.

1. Structured matrices
   a. Toeplitz, Hankel, Vandermonde, Cauchy
   b. Displacement operators
   c. Pan's motto: compress, operate, decompress
   d. Structured matrix multiplication
2. Structured matrix inversion
   a. Cauchy-like inversion: iterative & divide-and-conquer algorithms
   b. Toeplitz-like inversion
3. Implementations

(26) Cauchy-like inversion — the matrices S(i). 19/34.  [Cardinal '99], [Jeannerod, Mouilleron '10]

Intermediate inverse matrices S(i). For 0 ≤ i ≤ n, cut A = [A_00, A_01; A_10, A_11] so that A_00 ∈ K^{i×i}, and define

  S(i) = [A_11 − A_10 A_00^{−1} A_01,  A_10 A_00^{−1};  −A_00^{−1} A_01,  A_00^{−1}].

Note: S(0) = A, S(n) = A^{−1}, and A_11 − A_10 A_00^{−1} A_01 is the Schur complement.

Where do these intermediate inverse matrices come from?
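A key fact behind S(i) (also used in the appendix slide) is that the inverse of the Schur complement is the bottom-right block of A^{−1}. A Python sketch over exact rationals, with an arbitrary 4×4 example and cut i = 2:

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def inverse(A):
    """Gauss-Jordan inverse over the rationals."""
    k = len(A)
    M = [[Fraction(x) for x in row] +
         [Fraction(int(r == c)) for c in range(k)]
         for r, row in enumerate(A)]
    for c in range(k):
        piv = next(r for r in range(c, k) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(k):
            if r != c and M[r][c] != 0:
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [row[k:] for row in M]

n, i = 4, 2
A = [[Fraction(x) for x in row]
     for row in [[2, 3, 1, 5], [1, 4, 2, 6], [3, 1, 7, 2], [2, 5, 1, 9]]]
A00 = [row[:i] for row in A[:i]]; A01 = [row[i:] for row in A[:i]]
A10 = [row[:i] for row in A[i:]]; A11 = [row[i:] for row in A[i:]]

# Schur complement A11 - A10 * A00^{-1} * A01: top-left block of S(i)
tmp = matmul(A10, matmul(inverse(A00), A01))
schur = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A11, tmp)]

# Its inverse equals the bottom-right block of A^{-1}.
Ainv = inverse(A)
bottom_right = [row[i:] for row in Ainv[i:]]
print(inverse(schur) == bottom_right)  # True
```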

(27) Cauchy-like inversion — the matrices S(i) (continued). 20/34.

Where do these intermediate inverse matrices come from? Starting from R(0) = [A; Id], perform column elimination to transform A into Id; you end with R(n) = [Id; A^{−1}].

Partial column elimination: ColRed_i is the column reduction transforming [A_00, A_01] into [Id_i, 0], using A_00 as pivot:

  R(i) := ColRed_i([A_00, A_01; A_10, A_11; Id_i, 0; 0, Id_{n−i}])
        = [Id_i, 0; A_10 A_00^{−1}, A_11 − A_10 A_00^{−1} A_01; A_00^{−1}, −A_00^{−1} A_01; 0, Id_{n−i}].

S(i) appears in R(i). Shift columns so that we always reduce the top-left corner: ShColRed_i.

(28) Cauchy-like inversion — the matrices S(i) (continued). 21/34.

Iterative construction:

  S(0) →(ShColRed_1) S(1) →(ShColRed_1) S(2) → ... → S(i) → ... → S(n),

and ShColRed_j(S(i)) = (ShColRed_1 ∘ ... ∘ ShColRed_1)(S(i)) = S(i+j).

(29) Cauchy-like inversion — generators of S(i). 22/34.

How do we compute ShColRed_i with structured matrices?

  ShColRed_i: [A_00, A_01; A_10, A_11] ↦ [A_11 − A_10 A_00^{−1} A_01, A_10 A_00^{−1}; −A_00^{−1} A_01, A_00^{−1}].

Cauchy magic — submatrices. If ∇_{u,v}(A) = G·H^t (rank α), then ∇_{u_i,v_j}(A_ij) = G_i·H_j^t (rank ≤ α), where u = [u_0, u_1], v = [v_0, v_1], G = [G_0; G_1], H = [H_0; H_1], with u_0, v_0 ∈ K^i and G_0, H_0 ∈ K^{i×α}.

Cardinal's formula — S(i) has displacement rank ≤ α:

  ∇_{[u_1,v_0],[v_1,u_0]}(S(i)) = [−A_10 G_0' + G_1; G_0'] · [−A_01^t H_0' + H_1; H_0']^t,

where G_0', H_0' are ∇_{v_0,u_0}-generators of A_00^{−1}. (More on Cardinal's formula in the appendix slide.)

(30) Cauchy-like divide-and-conquer inversion. 23/34.

Algorithm Struct-inverse-DAC.
Input: G, H, u, v such that ∇_{u,v}(A) = G·H^t.
Output: Y, Z such that ∇_{v,u}(A^{−1}) = Y·Z^t.

1. If n ≤ threshold, return Struct-inverse-iter(G, H, u, v).
2. Cut G, H, u, v in two halves (i = ⌈n/2⌉);
   G_0', H_0' = Struct-inverse-DAC(G_0, H_0, u_0, v_0);
   G = [−A_10 G_0' + G_1; G_0'],  H = [−A_01^t H_0' + H_1; H_0'],  u = [u_1, v_0],  v = [v_1, u_0].
3. Cut the new G, H, u, v in two halves (i = n − ⌈n/2⌉) and apply the same step again.
4. Return G, H.

Idea: S(0) = A → S(⌈n/2⌉) → S(n) = A^{−1}.

(31) Cauchy-like divide-and-conquer inversion (continued). 23/34.

History. Need for compression: [Morf '80], [Bitmead, Anderson '80], [Kaltofen '94]. Compressed formulae: [Cardinal '99], [Pan '00], [Jeannerod, Mouilleron '10].

(32) Cauchy-like divide-and-conquer inversion (continued). 23/34.

Cost analysis. The non-recursive cost is dominated by the products A_10·G_0' and A_01^t·H_0', i.e. Cauchy-like matrix × α vectors. Total cost: O((non-recursive cost) · log(n)) = O(α^{ω−1} M(n) log^2(n)).

(33) Cauchy-like divide-and-conquer inversion (continued). 23/34.

Contributions of [Hyun, L., Schost '17]: compute r = rank(A) and invert only the leading principal minor; check the generic rank profile property.

(34) Cauchy-like iterative inversion. 24/34.

Algorithm Struct-inverse-iter.
Input: G, H, u, v such that ∇_{u,v}(A) = G·H^t, and a step size β (1 ≤ β ≤ α).
Output: Y, Z such that ∇_{v,u}(A^{−1}) = Y·Z^t.

1. For (i = 0; i < n; i = i + β):
   a. Cut G, H, u, v in sizes β and n − β.
   b. Decompress A_00 = ∇_{u_0,v_0}^{−1}(G_0·H_0^t).
   c. Invert A_00 (dense inversion).
   d. Compress A_00^{−1} into G_0', H_0'.
   e. Decompress A_01 = ∇_{u_0,v_1}^{−1}(G_0·H_1^t) and A_10 = ∇_{u_1,v_0}^{−1}(G_1·H_0^t).
   f. Compute A_10·G_0' and A_01^t·H_0' (dense matrix multiplication).
   g. G = [−A_10 G_0' + G_1; G_0'],  H = [−A_01^t H_0' + H_1; H_0'],  u = [u_1, v_0],  v = [v_1, u_0].
2. Return G, H.

Idea: S(0) = A → S(β) → S(2β) → ... → S(kβ) → ... → S(n) = A^{−1}.

(35) Cauchy-like iterative inversion (continued). 24/34.

Why dense matrices? The blocks A_00, A_10, A_01 handled at each step have at least one dimension of size β, too small to take advantage of structured algorithms. So what is the benefit of dealing with structured matrices? We only ever compute the compressed form of the large matrix A_11 − A_10 A_00^{−1} A_01.

(36) Cauchy-like iterative inversion (continued). 24/34.

Cost analysis for one iteration when β = α:
1. Decompress, invert A_00 and compress A_00^{−1}: O(α^ω).
2. Decompress A_01, A_10 (≈ computing G_0·H_1^t) and multiply A_10·G_0': O(α^{ω−1} n).

Total cost: n/α iterations, hence O(α^{ω−2} n^2).

(37) Cauchy-like iterative inversion (continued). 24/34.

History. β = 1: cost O(α n^2) [Levinson '47], [Durbin '60], [Trench '64], [Kailath '99], [Mouilleron '08]. β = α: cost O(α^{ω−2} n^2) [Hyun, L., Schost '17].

(38) Toeplitz-like inversion. 25/34.

Original motivation — Hermite-Padé approximants: find a non-zero vector in the kernel of a Toeplitz-like matrix T.

[Pan '90]: reduce questions about Toeplitz-like matrices to questions about Cauchy-like matrices.

Advantages:
1. Benefit from efficient Cauchy-like inversion.
2. The generated C will have generic rank profile.
3. We can choose u, v to be geometric sequences with the same ratio: gain of a factor 3.

Note: [Jeannerod, Mouilleron '10] extend Cardinal's compressed formulae to the Toeplitz case.

(39) Toeplitz-like to Cauchy-like. 26/34.

Transformation. If T is Toeplitz-like of displacement rank α, then C = Vu·T·Wv is Cauchy-like of displacement rank ≤ α + 2.

Why?
1. ∇_{M,P}(A·B) = ∇_{M,N}(A)·B + A·∇_{N,P}(B);
2. Vu is Vandermonde, so ∇_{Du,Z}(Vu) has rank 1;
3. ∇_{Z,Z}(T) has rank α;
4. Wv is a reversed Vandermonde matrix, so ∇_{Z,Dv}(Wv) has rank 1.

Moderate cost overhead: O(α M(n) log(n)).

Regularization [Pan '01]: a Cauchy-like A is regularized as A' = Dx·C_{a,u}·A·C_{v,b}·Dy.

[Hyun, L., Schost '17]: A = Vu·T·Wv has generic rank profile for generic u, v in K (see the sketch of the proof in the appendix slide).

(40) Implementations. 27/34.

Implementation details:
• C++ implementation based on Shoup's NTL, which provides efficient polynomial arithmetic and matrix multiplication over small prime fields.
• Available at https://github.com/romainlebreton/structured_linear_system_solving

Timings:
1. Iterative Cauchy-like inversion: step size β = 1 vs. β = α.
2. Cauchy-like matrix multiplication: O(α^2 M(n)) vs. O(α^{ω−1} M(n)) [BJMS '17].
3. DAC Cauchy-like inversion: dense vs. structured algorithms.

(41) Implementation — iterative Cauchy-like inversion. 28/34.

Iterative inversion: O(α n^2) (step size β = 1) vs. O(α^{ω−2} n^2) (β = α).

Notes:
• For moderate α, ω = 3: both variants have the same complexity, but the β = α variant has a better constant.
• Crossover around α ≈ 10.

[Figure: time ratio (α n^2)/(α^{ω−2} n^2) for α = 30, p = 65537, n up to 1000.]

(42) Implementation — Cauchy-like matrix multiplication. 29/34.

Cauchy-like matrix multiplication: O(α^2 M(n)) vs. O(α^{ω−1} M(n)) [BJMS '17].

Notes:
• For moderate α, ω = 3: same complexity, but a better constant for [BJMS '17].
• Crossover for 30 ≤ α ≤ 55.
• u, v are geometric sequences with the same ratio: gain of a factor 3 in the O(α^2 M(n)) variant.

[Figure: time ratio (α^2 M(n))/(α^{ω−1} M(n)) for α = 60, p = 65537, n up to 5000.]

(43) Implementation — DAC Cauchy-like inversion. 30/34.

DAC algorithm: dense O(n^ω) vs. structured O(α^{ω−1} M(n) log(n)), for α = 5 and α = 50.

Note: the structured algorithms win as soon as α ≤ 0.2 n.

[Figure: running times in seconds vs. n (up to 4500) for α = 5, α = 50 and the dense algorithm.]

For Padé-Hermite and Toeplitz-like inversion: overhead of 5%.

(44) Conclusion. 31/34.

Summary:
• Implemented many different algorithms using efficient libraries.
• Improvements to the iterative algorithm, with practical benefits.
• A new class of Cauchy-like matrices with a faster matrix-vector product.
• A new regularization process.

Other aspects of [Hyun, L., Schost '17] not covered here:
• The same problems over Q (instead of Fp).
• DAC and Newton lifting algorithms.
• Exploiting the additional structure of algebraic approximants.

(45) Thank you for your attention ;-).

(46) Cardinal's formula magic. 33/34.

Recall S(i) = [A_11 − A_10 A_00^{−1} A_01, A_10 A_00^{−1}; −A_00^{−1} A_01, A_00^{−1}].

It is surprising that S(i) has displacement rank ≤ α: a priori, A_00^{−1} A_01 could have rank 2α, A_11 − A_10 A_00^{−1} A_01 rank 4α, and S(i) rank 6α.

Beginning of an explanation:
1. A_00^{−1} A_01 is a submatrix of [A_00^{−1}, −A_00^{−1} A_01; 0, Id_{n−i}] = [A_00, A_01; 0, Id_{n−i}]^{−1} ⇒ small displacement rank.
2. (A_11 − A_10 A_00^{−1} A_01)^{−1} = (A^{−1})_{11}, the bottom-right block of A^{−1} ⇒ small displacement rank.
3. Idea for S(i): use generators of A_00^{−1}, A_11^{−1} and

  [A_11 − A_10 A_00^{−1} A_01, A_10 A_00^{−1}; −A_00^{−1} A_01, A_00^{−1}] = [A_11^{−1}, −A_11^{−1} A_10; A_01 A_11^{−1}, A_00 − A_01 A_11^{−1} A_10]^{−1}.

(47) Toeplitz-to-Cauchy regularization. 34/34.

Theorem ([Hyun, L., Schost '17]). A = Vu·T·Wv has generic rank profile for generic u, v in K.

Sketch of proof. If the entries of u, v are indeterminates, let us show that A has generic rank profile. By the Cauchy-Binet formula, with I = {1, ..., i} for i ≤ rank(A),

  det(Vu·T·Wv)_{I,I} = Σ_{|J|=i, |K|=i} det(Vu)_{I,J} · det(T)_{J,K} · det(Wv)_{K,I}.

We show that this is not the zero polynomial. The leading monomial of det(Vu)_{I,J}·det(Wv)_{K,I} is uniquely determined by (J, K) (for a suitable monomial ordering). Take J*, K* such that det(Vu)_{I,J*}·det(Wv)_{K*,I} is maximal among {(J, K) s.t. det(T)_{J,K} ≠ 0}. Then

  LM(det(Vu·T·Wv)_{I,I}) = LM(det(Vu)_{I,J*} · det(Wv)_{K*,I})

and the polynomial is not zero.

Note: there is no proof when u, v are geometric sequences.

