Algorithms for Structured Linear Systems Solving and their Implementation

Academic year: 2022

(1) Algorithms for Structured Linear Systems Solving and their Implementation. Seung Gyu Hyun, Éric Schost (University of Waterloo); Romain Lebreton (Université de Montpellier). Séminaire Eco-Escape, LIRMM, June 6th 2018.

(2) Motivation. 2/37.

Hermite-Padé approximants. Given r1, ..., rs ∈ k[x], compute f1, ..., fs ∈ k[x] such that

  f1 r1 + ... + fs rs = 0 mod x^σ,  plus degree conditions.

Example. If
  r0 = 8x^4 - 8x^2 + 1
  r1 = 16x^5 - 20x^3 + 5x
  r2 = 32x^6 - 48x^4 + 18x^2 - 1
then we find the relation r0 - 2x r1 + r2 = 0 mod x^3 (Chebyshev polynomials).

(3) Motivation. 2/37.

(Same Hermite-Padé problem as above.) Solved by order basis or structured linear algebra: the coefficients fi,j of the approximants satisfy a block Toeplitz homogeneous system

  [ T1 | T2 | ... | Ts ] · (f1,0, f1,1, ..., f2,0, ..., fs,0, ...)^t = 0,

where each block Ti is lower-triangular Toeplitz with first column (ri,0, ri,1, ..., ri,σ-1)^t.

Special cases:
  algebraic approximants: ri+1 = r^i
  differential approximants: ri+1 = d^i r / dx^i
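As an illustrative check (a throwaway sketch, not the talk's code), the Chebyshev example above can be verified with naive polynomial arithmetic:

```python
# Verify the Hermite-Padé relation r0 - 2x*r1 + r2 = 0 for Chebyshev polynomials.
# Polynomials are coefficient lists, index = degree.

def poly_mul(a, b):
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def poly_add(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]

r0 = [1, 0, -8, 0, 8]              # 8x^4 - 8x^2 + 1   (Chebyshev T4)
r1 = [0, 5, 0, -20, 0, 16]         # 16x^5 - 20x^3 + 5x  (T5)
r2 = [-1, 0, 18, 0, -48, 0, 32]    # 32x^6 - 48x^4 + 18x^2 - 1  (T6)

# Here the three-term Chebyshev recurrence makes r0 - 2x*r1 + r2 vanish
# identically, so in particular it is 0 mod x^3 as required.
rel = poly_add(poly_add(r0, poly_mul([0, -2], r1)), r2)
print(all(c == 0 for c in rel))    # True
```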

(4) Outline of the talk. 3/37.

Goal. Design an implementation and study its practical performance in symbolic computation.

Outline.
1. Structured matrices: basic operations
2. Structured matrices: inversion over Fp
   a. Cauchy-like inversion: iterative & divide-and-conquer
   b. Application to Hermite-Padé approximants
   c. Implementations
3. Structured matrices: inversion over Q
   a. Modular techniques
   b. Special case of algebraic approximants

(5) Classical structured matrices. 4/37.

Toeplitz: T = (t_{i-j})_{1≤i,j≤n}, constant along diagonals, determined by t_{-(n-1)}, ..., t_{n-1}.
Hankel: H = (h_{i+j-2})_{1≤i,j≤n}, constant along anti-diagonals, determined by h_0, ..., h_{2n-2}.
Vandermonde: V = (u_i^{j-1})_{1≤i,j≤n} for points u_1, ..., u_n.
Cauchy: C = (1/(u_i - v_j))_{1≤i,j≤n} for points u_1, ..., u_n and v_1, ..., v_n.

Remark: the Hermite-Padé matrix was block Toeplitz.

Matrix × vector product:
1. Toeplitz, Hankel: cost M(n) arithmetic operations.
2. Vandermonde, Cauchy: cost O(M(n) log(n)), or O(M(n)) if u, v are geometric sequences.
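The M(n) bound for the Toeplitz case comes from encoding the product as one polynomial multiplication. A minimal sketch, with a naive O(n²) polynomial product standing in for the FFT-based one (function names are mine):

```python
# Toeplitz matrix-vector product T·v via polynomial multiplication.
# T is n×n with entries T[i][j] = t_{i-j}, stored as the list
# t = [t_{-(n-1)}, ..., t_0, ..., t_{n-1}] of length 2n-1.

def toeplitz_matvec(t, v):
    n = len(v)
    assert len(t) == 2 * n - 1
    # With a(x) = sum_m t[m] x^m and b(x) = sum_j v[j] x^j,
    # (T v)_i is the coefficient of x^(i+n-1) in a(x)*b(x).
    c = [0] * (len(t) + n - 1)
    for m, tm in enumerate(t):          # naive product; an FFT gives M(n)
        for j, vj in enumerate(v):
            c[m + j] += tm * vj
    return [c[i + n - 1] for i in range(n)]

# Cross-check against the direct quadratic formula.
t = [5, -1, 2, 0, 3, 7, 1]              # n = 4, entries t_{-3}, ..., t_3
v = [1, 2, 0, -1]
n = len(v)
direct = [sum(t[(i - j) + n - 1] * v[j] for j in range(n)) for i in range(n)]
print(toeplitz_matvec(t, v) == direct)  # True
```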

(6) Displacement rank. 5/37.

Example — Toeplitz. Let Z be the lower shift matrix (ones on the subdiagonal, zeros elsewhere). Notice that Z T shifts the rows of T down and T Z shifts its columns to the left. So consider

  ∇_{Z,Z}(T) = Z T - T Z = (T↓) - (T←).

For T = (t_{i-j})_{i,j}, the shifted copies cancel everywhere except in the first row and last column, so

  rank(∇_{Z,Z}(T)) ≤ 2.

We say that T has ∇_{Z,Z}-displacement rank ≤ 2.

Definition of displacement rank [Kailath, Kung, Morf '79].
Displacement operator: ∇ : A ∈ K^{n×n} ↦ ∇(A) ∈ K^{n×n}.
∇-displacement rank of A: rank(∇(A)).
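A quick sanity check of the claim (an illustrative sketch): for any Toeplitz T, the displacement Z T − T Z is supported on the first row and last column only, hence has rank at most 2.

```python
# For Toeplitz T (T[i][j] = t_{i-j}), verify that ZT - TZ is nonzero only
# in the first row and last column, so rank(ZT - TZ) <= 2.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

n = 5
t = list(range(-(n - 1), n))                     # t_{-(n-1)}, ..., t_{n-1}
T = [[t[(i - j) + n - 1] for j in range(n)] for i in range(n)]
Z = [[1 if i == j + 1 else 0 for j in range(n)] for i in range(n)]

ZT, TZ = matmul(Z, T), matmul(T, Z)
D = [[ZT[i][j] - TZ[i][j] for j in range(n)] for i in range(n)]

# All entries outside row 0 and column n-1 must vanish.
interior_zero = all(D[i][j] == 0 for i in range(1, n) for j in range(n - 1))
print(interior_zero)   # True
```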

(7) General structured matrices. 6/37. [Kailath, Kung, Morf '79]

Definition of displacement rank.
Sylvester displacement operator: ∇_{M,N}(A) = M A - A N.
∇_{M,N}-displacement rank of A: α := rank(∇_{M,N}(A)).

Let D_u = Diag(u_1, ..., u_n), let Z_φ be the unit φ-circulant matrix (ones on the subdiagonal, φ in the top-right corner), and Z = Z_0.

Informally, A is structured for ∇_{M,N} if the displacement rank satisfies α ≪ n.

Examples: ∇_{Z,Z}: Toeplitz-like; ∇_{Z,Z^t}: Hankel-like; ∇_{D_u,D_v}: Cauchy-like; ∇_{D_u,Z}: Vandermonde-like.

In this talk, M, N ∈ {D_u, Z, Z^t}. There exists a more general theory with block companion M, N, which applies to Hermite-Padé [Olshevsky, Shokrollahi '00] and to the generalization f1 R_{i,1} + ... + fs R_{i,s} = 0 mod Pi [Bostan, Jeannerod, Mouilleron, Schost '17].

(8) Pan's motto: Compress, Operate, Decompress. 7/37.

1. Compression: store A with far fewer than n² elements
2. Operate: sum, product, inversion, ...
3. Decompression: recover A

Compression. If ∇_{M,N}(A) has rank α, then A is stored through a generator pair:

  ∇_{M,N}(A) = G · H^t  where G, H ∈ K^{n×α}.

Compressed size: 2 α n ≪ n².

(9) Pan's motto: Decompress. 8/37.

Decompression — invertibility. Is ∇_{M,N} always invertible? No: consider ∇_{Z,Z}(A) = Z A - A Z; then ∇_{Z,Z}(Z^i) = 0 for all i.

Theorem. ∇_{M,N} is invertible if and only if Spec(M) ∩ Spec(N) = ∅.

Decompression — Cauchy. Invertibility: if {u_i} ∩ {v_j} = Spec(D_u) ∩ Spec(D_v) = ∅, then ∇_{D_u,D_v} is invertible. Reconstruction formula: if ∇_{D_u,D_v}(A) = (s_{i,j}), then A = (s_{i,j}/(u_i - v_j)).
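The reconstruction formula gives a direct way to decompress a Cauchy-like generator pair. A small sketch over exact rationals (illustrative code, not the talk's implementation):

```python
from fractions import Fraction

# Decompress a Cauchy-like matrix from generators G, H (n×alpha) and points u, v:
# if Du·A - A·Dv = G·H^t, then A[i][j] = (sum_k G[i][k]*H[j][k]) / (u_i - v_j).

def cauchy_like_decompress(G, H, u, v):
    n, alpha = len(G), len(G[0])
    return [[sum(G[i][k] * H[j][k] for k in range(alpha)) / Fraction(u[i] - v[j])
             for j in range(n)] for i in range(n)]

u, v = [1, 2, 3], [4, 5, 6]                    # disjoint spectra
G = [[1, 0], [2, 1], [0, 3]]
H = [[1, 1], [0, 2], [1, 0]]
A = cauchy_like_decompress(G, H, u, v)

# Check that Du·A - A·Dv indeed equals G·H^t.
ok = all(u[i] * A[i][j] - A[i][j] * v[j] == sum(G[i][k] * H[j][k] for k in range(2))
         for i in range(3) for j in range(3))
print(ok)   # True
```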

(10) Pan's motto: Decompress. 9/37.

Decompression — reconstruction formulae.
Theorem [Gohberg, Olshevsky '92], [Wood '93], [Pan et al. '02]. If M is invertible, N is nilpotent and ∇_{M,N}(A) = G · H^t, then ∇_{M,N} is invertible and

  A = M^{-1} · Krylov(M^{-1}, G) · Krylov(N^t, H)^t,
  where Krylov(M, G) = [ M^{n-1} G | ... | M² G | M G | G ].

Decompression — Vandermonde. Since M = D_u is invertible and N = Z is nilpotent, apply the theorem.

Decompression — Toeplitz, Hankel. Consider ∇_{Z_1,Z} instead of ∇_{Z,Z}: ∇_{Z_1,Z} is invertible, since Z_1 is invertible and Z is nilpotent. Moreover

  ∇_{Z_1,Z}(A) - ∇_{Z,Z}(A) = (Z_1 - Z) A  has rank ≤ 1,

so the two displacement ranks differ by at most 1.

(11) Stability of basic operations. 10/37.

Let rank(∇_{M,N}(A)) = α and rank(∇_{M,N}(B)) = β.

1. Addition: ∇_{M,N}(A + B) = ∇_{M,N}(A) + ∇_{M,N}(B) — rank ≤ α + β
2. Transpose: ∇_{N^t,M^t}(A^t) = -(M A - A N)^t = -(∇_{M,N}(A))^t — rank = α
3. Inverse: A^{-1} ∇_{M,N}(A) A^{-1} = A^{-1} M - N A^{-1} = -∇_{N,M}(A^{-1}) — rank = α
4. Multiplication: ∇_{M,P}(A B) = (M A - A N) B + A (N B - B P) = ∇_{M,N}(A) B + A ∇_{N,P}(B) — rank ≤ α + β
   Note: requires a (structured matrix) × (dense matrix) product.

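These identities are purely algebraic and easy to check numerically; a throwaway sketch verifying the multiplication rule on random small integer matrices (illustrative only):

```python
import random

# Check ∇_{M,P}(AB) = ∇_{M,N}(A)·B + A·∇_{N,P}(B) on random integer matrices.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def matadd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def matsub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def nabla(M, N, A):                      # Sylvester displacement M·A - A·N
    return matsub(matmul(M, A), matmul(A, N))

n = 4
rnd = lambda: [[random.randint(-5, 5) for _ in range(n)] for _ in range(n)]
M, N, P, A, B = rnd(), rnd(), rnd(), rnd(), rnd()

lhs = nabla(M, P, matmul(A, B))
rhs = matadd(matmul(nabla(M, N, A), B), matmul(A, nabla(N, P, B)))
print(lhs == rhs)   # True: the displacement of a product splits as claimed
```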

(15) Structured matrix × vector product. 11/37.

[Diagram: reduction chain between matrix × vector products — Toeplitz, Toeplitz-like (α = 1), Toeplitz-like; Cauchy, Cauchy-like (α = 1), Cauchy-like; also Cauchy-like SMatrix × SMatrix and SMatrix × α vectors.]

Reminder:
1. A Toeplitz matrix × vector product costs M(n).
2. A Cauchy matrix × vector product costs O(M(n) log(n)) (or O(M(n)) in a special case).

How do we perform Toeplitz/Cauchy-like matrix × vector multiplication?

(16) Structured matrix × vector product. 12/37.

[Same reduction diagram as above.]

Remark. A is a sum of displacement-rank-1 matrices. Decompose

  G · H^t = G_1 · H_1^t + ... + G_α · H_α^t,

where G_i, H_i are the i-th columns of G and H (each term has rank 1). Then

  A = ∇^{-1}(G · H^t) = ∇^{-1}(G_1 · H_1^t) + ... + ∇^{-1}(G_α · H_α^t),

a sum of α matrices of displacement rank 1.

(17) Structured matrix × vector product. 13/37.

[Same reduction diagram as above.]

Toeplitz-like, α = 1. Since α = 1, write ∇_{Z_1,Z}(A) = g · h^t = (g_i h_j)_{i,j}. Then

  A = Z_1^{-1} · Krylov(Z_1^{-1}, g) · Krylov(Z^t, h)^t

is a product of two Toeplitz matrices: a circulant-type matrix built from g and an upper-triangular Toeplitz matrix built from h. So the computation reduces to Toeplitz matrix-vector products. Cost: 2 M(n).

(18) Structured matrix × vector product. 14/37.

[Same reduction diagram as above.]

Cauchy-like, α = 1. If ∇_{D_u,D_v}(A) = (g_i h_j)_{i,j}, then

  A = (g_i h_j / (u_i - v_j))_{i,j} = D_g · C_{u,v} · D_h.

So the computation reduces to a Cauchy matrix-vector product. Cost for a matrix-vector product: O(M(n) log(n)).

(19) Structured matrix × vector product. 15/37.

[Same reduction diagram as above.]

Cauchy-like, α = 1 — trick for geometric sequences with the same ratio. If u_i = τ^{i-1} u_1 and v_j = τ^{j-1} v_1, then

  C = [1/(u_i - v_j)] = [1/(τ^{i-1} u_1 - τ^{j-1} v_1)] = [τ^{-(i-1)}/(u_1 - τ^{j-i} v_1)] = D_{1,τ^{-1},...} · [1/(u_1 - τ^{j-i} v_1)],

where the right-hand factor depends only on j - i, i.e. is Toeplitz. So we reduce to a Toeplitz matrix-vector product. Cost for a matrix-vector product: M(n) (vs. 3 M(n) for the classical approach).
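The factorization behind the geometric-ratio trick is easy to check exactly; a small sketch with Fractions (illustrative, not the talk's code):

```python
from fractions import Fraction

# For u_i = tau^(i-1)·u1 and v_j = tau^(j-1)·v1 (0-based below), check that
# C = [1/(u_i - v_j)] factors as Diag(tau^-i) times the matrix
# T[i][j] = 1/(u1 - tau^(j-i)·v1), which depends only on j - i (Toeplitz).

n, tau, u1, v1 = 4, Fraction(2), Fraction(1), Fraction(3)
u = [u1 * tau**i for i in range(n)]
v = [v1 * tau**j for j in range(n)]

C = [[1 / (u[i] - v[j]) for j in range(n)] for i in range(n)]
T = [[1 / (u1 - tau**(j - i) * v1) for j in range(n)] for i in range(n)]

factored = [[tau**(-i) * T[i][j] for j in range(n)] for i in range(n)]
toeplitz = all(T[i][j] == T[i + 1][j + 1] for i in range(n - 1) for j in range(n - 1))
print(C == factored and toeplitz)   # True
```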

(20) Structured matrix × vector product. 16/37.

[Same reduction diagram as above.]

Consequence for the product of structured matrices. If ∇_{M,N}(A) = G · H^t and ∇_{N,P}(B) = Y · Z^t with G, H, Y, Z ∈ K^{n×α}, then

  ∇_{M,P}(A B) = ∇_{M,N}(A) B + A ∇_{N,P}(B) = G · (B^t H)^t + (A Y) · Z^t.

A Y and B^t H are (structured matrix) × (α vectors) products.

Solution 1: decompose into α (structured matrix) × (vector) products, each of cost Õ(α n). Total cost: Õ(α² n).

(21) Structured matrix × vector product. 17/37.

[Same reduction diagram as above.]

(Same setting as above.) Solution 2 [Bostan, Jeannerod, Schost '08], [Bostan, Jeannerod, Mouilleron, Schost '17]: one (structured matrix) × (α vectors) product in cost Õ(α^{ω-1} n). Total cost: Õ(α^{ω-1} n).

(22) Outline. 18/37.

1. Structured matrices: basic operations
2. Structured matrices: inversion over Fp
   a. Cauchy-like inversion: iterative & divide-and-conquer
   b. Application to Hermite-Padé approximants
   c. Implementations
3. Structured matrices: inversion over Q
   a. Modular techniques
   b. Special case of algebraic approximants

(23) Cauchy-like inversion — S(i). 19/37. [Cardinal '99], [Jeannerod, Mouilleron '10]

Intermediate inverse matrices S(i). For 0 ≤ i ≤ n, cut A = [ A00 A01 ; A10 A11 ] so that A00 ∈ K^{i×i}. Define

  S(i) = PartialInversion_i(A) = [ A11 - A10 A00^{-1} A01 ,  A10 A00^{-1} ;  -A00^{-1} A01 ,  A00^{-1} ].

Facts.
1. S(0) = A and S(n) = A^{-1}.
2. PartialInversion_i = PartialInversion_1 ∘ ... ∘ PartialInversion_1 (i times), which yields the chain

  S(0) = A → S(β) → S(2β) → ... → S(kβ)

under repeated application of PartialInversion_β.
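The definition of S(i) can be checked directly with exact arithmetic. An illustrative dense sketch (Gauss-Jordan inverse over Fractions; the talk of course never materializes these dense blocks) verifying S(0) = A, S(n) = A^{-1}, and that the bottom-right block of S(i) is the inverse of the leading minor:

```python
from fractions import Fraction

# S(i) = [[A11 - A10·A00⁻¹·A01, A10·A00⁻¹], [-A00⁻¹·A01, A00⁻¹]].

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse(A):                              # Gauss-Jordan over Fractions
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def partial_inversion(A, i):
    n = len(A)
    if i == 0:
        return [row[:] for row in A]
    if i == n:
        return inverse(A)
    A00 = [row[:i] for row in A[:i]];  A01 = [row[i:] for row in A[:i]]
    A10 = [row[:i] for row in A[i:]];  A11 = [row[i:] for row in A[i:]]
    I00 = inverse(A00)
    TR = matmul(A10, I00)                    # A10·A00⁻¹
    P = matmul(TR, A01)                      # A10·A00⁻¹·A01
    TL = [[A11[r][c] - P[r][c] for c in range(n - i)] for r in range(n - i)]
    BL = [[-x for x in row] for row in matmul(I00, A01)]
    return [tl + tr for tl, tr in zip(TL, TR)] + [bl + br for bl, br in zip(BL, I00)]

A = [[Fraction(x) for x in row] for row in
     [[2, 1, 0, 1], [1, 3, 1, 0], [0, 1, 2, 1], [1, 0, 1, 4]]]
iden = [[Fraction(int(i == j)) for j in range(4)] for i in range(4)]
S2 = partial_inversion(A, 2)
ok = (partial_inversion(A, 0) == A
      and matmul(A, partial_inversion(A, 4)) == iden
      and [row[2:] for row in S2[2:]] == inverse([row[:2] for row in A[:2]]))
print(ok)   # True
```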

(24) Cauchy-like inversion — generators of S(i). 20/37.

How can we compute PartialInversion with structured matrices?

  S(i) = PartialInversion_i([ A00 A01 ; A10 A11 ]) = [ A11 - A10 A00^{-1} A01 , A10 A00^{-1} ; -A00^{-1} A01 , A00^{-1} ].

Cauchy magic — submatrices are structured. For instance, if G = [G0 ; G1] and H = [H0 ; H1], then G0, H0 are generators of A00 (rank ≤ α).

Cardinal's formula — S(i) has displacement rank ≤ α:

  ∇_{[u1,v0],[v1,u0]}(S(i)) = [ -A10 G'0 + G1 ; G'0 ] · [ -A01^t H'0 + H1 ; H'0 ]^t,

where G'0, H'0 are ∇_{v0,u0}-generators of A00^{-1}.

(25) Partial inversion using structured operations. 21/37.

Algorithm PartialInversion_i-Structured
Input: 0 ≤ i ≤ n; G, H, u, v s.t. ∇_{u,v}(A) = G · H^t; G'0, H'0, u0, v0 s.t. ∇_{v0,u0}(A00^{-1}) = G'0 · H'0^t, where A00 ∈ K^{i×i}
Output: Y, Z, u', v' such that ∇_{u',v'}(S(i)) = Y · Z^t

1. Cut G, H, u, v at size i
2. Compute A10 G'0 and A01^t H'0 using (structured matrix) × (α vectors) products
3. return Y = [ -A10 G'0 + G1 ; G'0 ], Z = [ -A01^t H'0 + H1 ; H'0 ], u' = [u1, v0], v' = [v1, u0]

Cost: one (structured matrix) × (α vectors) product in O(α^{ω-1} M(n) log(n)). Total cost: O(α^{ω-1} M(n) log(n)).

(26) Cauchy-like divide-and-conquer inversion. 22/37.

Algorithm Struct-inverse-DAC
Input: G, H, u, v such that ∇_{u,v}(A) = G · H^t
Output: Y, Z such that ∇_{v,u}(A^{-1}) = Y · Z^t

1. if n ≤ threshold then return Struct-inverse-iter(G, H, u, v)
2. S(0) = A → S(n/2) (PartialInversion_{n/2}):
   a. Invert the leading minor of S(0) using Struct-inverse-DAC
   b. Compute generators Y, Z of S(n/2) using PartialInversion-Structured
3. S(n/2) → S(n) = A^{-1} (PartialInversion_{n/2}):
   a. Invert the leading minor of S(n/2) using Struct-inverse-DAC
   b. return generators Y, Z of S(n) using PartialInversion-Structured

Total cost: S(n) = 2 S(n/2) + O(α^{ω-1} M(n) log(n)) ⟹ S(n) = O(α^{ω-1} M(n) log²(n)).

(27) Cauchy-like divide-and-conquer inversion. 22/37.

(Same algorithm as on the previous slide.)

History.
  Need compression: [Morf/Bitmead-Anderson '80], [Kaltofen '94]
  Compressed formulae: [Cardinal '99], [Pan '00], [Jeannerod, Mouilleron '10]

(28) Cauchy-like divide-and-conquer inversion. 22/37.

(Same algorithm as on the previous slide.)

Contributions in [Hyun, L., Schost '17]:
  Check the required generic rank profile property
  Compute r = rank(A) and invert only the leading principal minor

(29) Partial inversion using dense operations. 23/37.

Algorithm PartialInversion_β-Dense
Input: 0 ≤ β ≤ n and G, H, u, v s.t. ∇_{u,v}(A) = G · H^t
Output: Y, Z, u', v' such that ∇_{u',v'}(S(β)) = Y · Z^t

1. Cut G, H, u, v at size β
2. Decompress A00, A01 and A10
3. Invert A00 (dense inversion)
4. Compress A00^{-1} into G'0, H'0
5. Compute A10 G'0 and A01^t H'0 (dense matrix multiplication)
6. return Y = [ -A10 G'0 + G1 ; G'0 ], Z = [ -A01^t H'0 + H1 ; H'0 ], u' = [u1, v0], v' = [v1, u0]

Why dense matrices? A00, A10, A01 have a dimension of size β, too small to take advantage of structured algorithms.

Cost analysis for one iteration when β = α: O(α^{ω-1} n).

(30) Cauchy-like iterative inversion. 24/37.

Algorithm Struct-inverse-iter
Input: G, H, u, v such that ∇_{u,v}(A) = G · H^t, and a step size β (1 ≤ β ≤ n)
Output: Y, Z such that ∇_{v,u}(A^{-1}) = Y · Z^t

1. for (i = 0; i < n; i = i + β):
   a. S(i) → S(i + β): compute generators Y, Z of S(i + β) using PartialInversion-Dense
2. return Y, Z

Idea: S(0) = A → S(β) → S(2β) → ... → S(n) = A^{-1}.

Total cost when β = α: (n/α iterations) × O(α^{ω-1} n) per call to PartialInversion-Dense = O(α^{ω-2} n²).

(31) Cauchy-like iterative inversion. 24/37.

(Same algorithm as on the previous slide.)

History.
  β = 1: cost O(α n²) — [Levinson '47], [Durbin '60], [Trench '64], [Kailath '99], [Mouilleron '08]
  β = α: cost O(α^{ω-2} n²) — [Hyun, L., Schost '17]

(32) Toeplitz-like inversion. 25/37.

Original motivation — Hermite-Padé approximants: find a non-zero vector in the kernel of a Toeplitz-like T.

Transformation between classes of structured matrices [Pan '90], [Heinig '94], [Gohberg et al. '95]: we can reduce questions about Toeplitz-like matrices to questions about Cauchy-like matrices.

Advantages:
1. Benefit from efficient Cauchy-like inversion
2. Can choose u, v geometric sequences with the same ratio — gain of a factor 3
3. The resulting C should have generic rank profile

Note: [Jeannerod, Mouilleron '10] extend Cardinal's compressed formulae to Toeplitz.

(33) Toeplitz-like to Cauchy-like. 26/37.

Transformation. If T is Toeplitz-like of displacement rank α, then C = V_u T W_v is Cauchy-like of displacement rank ≤ α + 2. Why? V_u is ∇_{D_u,Z}-structured, T is ∇_{Z,Z}-structured, and W_v is ∇_{Z,D_v}-structured. Moderate cost overhead: O(α M(n) log(n)).

Regularization [Pan '01]: regularization of a Cauchy-like A as A' = D_x C_{a,u} A C_{v,b} D_y.

[Hyun, L., Schost '17]: A = V_u T W_v has generic rank profile for generic u, v in K.

(34) Implementations. 27/37.

Implementation details:
  C++ implementation based on Shoup's NTL. NTL provides efficient polynomial arithmetic and matrix multiplication over small prime fields.
  Available at https://github.com/romainlebreton/structured_linear_system_solving

(35) Implementation — Iterative Cauchy-like inversion. 28/37.

Iterative inversion: O(α n²) (step size β = 1) vs. O(α^{ω-2} n²) (β = α).
Note: the O(α^{ω-2} n²) algorithm is faster as soon as α > 10.

[Figure: time ratio of the direct O(α n²) algorithm over the block O(α^{ω-2} n²) algorithm, for α = 30, p = 65537, n up to 1000.]

(36) Implementation — DAC Cauchy-like inversion. 29/37.

DAC algorithm: dense O(n^ω) vs. structured O(α^{ω-1} M(n) log(n)).
Note: structured algorithms are faster as soon as α ≤ 0.2 n.

[Figure: running times in seconds for α = 5, α = 50 and dense inversion, n up to 4500.]

For Padé-Hermite and Toeplitz-like inversion: overhead of 5%.

(37) Outline. 30/37.

1. Structured matrices: basic operations
2. Structured matrices: inversion over Fp
   a. Cauchy-like inversion: iterative & divide-and-conquer
   b. Application to Hermite-Padé approximants
   c. Implementations
3. Structured matrices: inversion over Q
   a. Modular techniques
   b. Special case of algebraic approximants

(38) Algorithms over the rationals. 31/37.

Initial goal: Padé-Hermite approximants over Q, so find x s.t. T x = 0 (T Toeplitz-like).

Reduce the problem:
1. Cauchy matrices: C = V_u T W_v and C y = 0 ⟺ T (W_v y) = 0
2. Kernel to linear system solving: ker C is spanned by the columns of [ -C00^{-1} C01 ; I_{n-r} ], where C00 is a maximal (r × r, invertible) minor
3. Modular techniques: solve modulo N, then apply rational reconstruction

Current goal — modular Cauchy system solving: given an invertible Cauchy-like matrix C, solve C x = b mod N.

Cost model: from now on, the cost model is bit complexity. Notation: I(t) = cost of multiplying t-bit integers.
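Step 3 relies on rational number reconstruction: from x mod N one recovers p/q with |p|, |q| on the order of √N by a truncated extended Euclidean algorithm. A minimal sketch (illustrative; function name is mine, and the prime modulus is an assumption so that `pow(q, -1, N)` exists):

```python
from math import isqrt

# Rational reconstruction: given a ≡ p/q (mod N), recover (p, q) with
# |p|, |q| <= sqrt(N/2), by running the extended Euclidean algorithm on
# (N, a) and stopping once the remainder drops below sqrt(N/2).

def rational_reconstruction(a, N):
    bound = isqrt(N // 2)
    r0, r1 = N, a % N
    t0, t1 = 0, 1
    while r1 > bound:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1       # invariant: r_i ≡ t_i·a (mod N)
        t0, t1 = t1, t0 - q * t1
    if t1 == 0 or abs(t1) > bound:
        return None                     # no admissible p/q
    return (r1, t1)                     # a ≡ r1/t1 (mod N)

N = 10**9 + 7                           # prime (assumption, for the modular inverse)
p, q = 355, 113
a = p * pow(q, -1, N) % N               # image of 355/113 modulo N
num, den = rational_reconstruction(a, N)
if den < 0:
    num, den = -num, -den
print((num, den))
```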

(39) Modular techniques. 32/37.

Newton iteration [Pan '92] (N = p^t):
  Compute generators of C^{-1} modulo p, p², p⁴, ..., p^t
  Bit complexity: Õ( α^{ω-1} M(n) I(t) [lifting] + α^{ω-1} M(n) log(n) [C^{-1} mod p] )

Divide-and-conquer [Hoeven '02], [L. '12] (N = p^t):
  Recursively compute the solution at half the precision and cancel the residue
  Particular case of a faster online algorithm
  Bit complexity: Õ( α M(n) I(t) log(t) [lifting] + α^{ω-1} M(n) log(n) [C^{-1} mod p] )

Chinese Remainder Theorem (N = p1 ⋯ pt):
  Solve C x = b modulo t primes p1, ..., pt of the same magnitude
  Bit complexity: Õ( α^{ω-1} M(n) log(n) t [C^{-1} mod p1, ..., pt] + n I(t) log(t) [Chinese remaindering] )
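The core of the Newton approach is the classical quadratic lifting of an inverse, X ← X(2I − CX), which doubles the precision p^k → p^{2k} at each step. A dense illustrative sketch (the talk lifts generators of C^{-1}, never the dense inverse; this only shows the iteration itself, with a precomputed inverse mod p as input):

```python
# Hensel/Newton lifting of a matrix inverse from mod p to mod p^t.

def matmul_mod(A, B, m):
    return [[sum(a * b for a, b in zip(row, col)) % m for col in zip(*B)] for row in A]

def newton_lift_inverse(C, X, p, t):
    """Given X = C^-1 mod p, return C^-1 mod p^t via X <- X(2I - CX)."""
    n = len(C)
    k = 1
    while k < t:
        k = min(2 * k, t)               # precision doubles at each step
        m = p ** k
        CX = matmul_mod(C, X, m)
        R = [[(2 * (i == j) - CX[i][j]) % m for j in range(n)] for i in range(n)]
        X = matmul_mod(X, R, m)
    return X

p, t = 7, 8
C = [[2, 1], [1, 3]]                    # det = 5, invertible mod 7
X0 = [[2, 4], [4, 6]]                   # C^-1 mod 7, precomputed
X = newton_lift_inverse(C, X0, p, t)
iden = [[int(i == j) for j in range(2)] for i in range(2)]
print(matmul_mod(C, X, p ** t) == iden)  # True: C·X ≡ I mod p^t
```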

(40) Implementation — Padé-Hermite over Q. 33/37.

[Figure: Padé-Hermite approximants over Q. Left: degree bounds (n, n, n, n, n) (5 blocks with n columns), timings of dac, newton and crt up to size 5500. Right: degree bounds (n, ..., n) (n blocks with n columns), timings of dac and crt up to size 9000.]

(41) Algebraic approximants. 34/37.

The bottleneck in DAC solving is the matrix-vector products.

Special case of algebraic approximants. Find f1, ..., fs s.t. f1 + f2 r + f3 r² + ... + fs r^{s-1} = 0 mod x^σ and deg fi < d.

Cost C_T of a matrix-vector product by T:
1. Structured algorithms: C_T = O(α M(n)) = Õ(s² d), since α = s and n = O(d s).
2. Bivariate modular composition [Brent, Kung '78], [Nüsken, Ziegler '04]: compute F(x, r(x)) mod x^σ for F(x, y) = Σ fi y^{i-1}; C_T = Õ(s^{(ω+1)/2} d).

Note: the same trick applies neither to Newton iteration nor to Chinese remaindering.
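The matrix-vector product by T is exactly the truncated evaluation F(x, r(x)) mod x^σ. A straightforward Horner-in-y sketch (illustrative only; the Õ(s^{(ω+1)/2} d) bound requires Brent-Kung baby-steps/giant-steps, which this does not implement):

```python
# Evaluate F(x, r(x)) mod x^sigma by Horner's rule in y, where
# F(x, y) = f1(x) + f2(x)·y + ... + fs(x)·y^(s-1).
# Polynomials in x are coefficient lists truncated to length sigma.

def mul_trunc(a, b, sigma):
    c = [0] * sigma
    for i, ai in enumerate(a[:sigma]):
        for j, bj in enumerate(b[:sigma - i]):
            c[i + j] += ai * bj
    return c

def add_trunc(a, b, sigma):
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(sigma)]

def eval_bivariate(fs, r, sigma):
    acc = [0] * sigma
    for f in reversed(fs):               # Horner: acc = acc*r + f_i
        acc = add_trunc(mul_trunc(acc, r, sigma), f, sigma)
    return acc

# Example: F(x, y) = 1 + (-x)·y with r(x) = x gives 1 - x^2.
sigma = 4
fs = [[1], [0, -1]]
r = [0, 1]
print(eval_bivariate(fs, r, sigma))      # [1, 0, -1, 0]
```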

(42) Implementation — Algebraic approximants over Q. 35/37.

[Figure: algebraic approximants over Q, comparing bmc (bivariate modular composition) and naive matrix-vector products. Left: degree bounds (n, n, n, n, n) (5 blocks with n columns); right: degree bounds (n, ..., n) (n blocks with n columns); sizes up to 10000.]

(43) Conclusion. 36/37.

Summary:
  Implemented many different algorithms using efficient libraries.
  Improvements over Fp with significant practical benefits:
    faster iterative algorithm;
    new class of Cauchy-like matrices with faster matrix-vector product;
    simpler regularization process.
  Improvements over Q with significant practical benefits:
    DAC is the most efficient in practice;
    exploited the additional structure of algebraic approximants.

Perspective:
  Similar improvements over Q for the differential case (r, r', r'', ...); bivariate modular composition in this context [Bostan, Schost '09].

(44) Thank you for your attention ;-).
