• Aucun résultat trouvé

Integer and polynomial multiplication

N/A
N/A
Protected

Academic year: 2022

Partager "Integer and polynomial multiplication"

Copied!
99
0
0

Texte intégral

(1)Integer and polynomial multiplication Romain Lebreton – Équipe ECO December 20th, 2018. 1/33.

(2) Context of today’s talk. In 1 second, we can multiply integers of 30 000 000 digits and polynomials of degree 500 000.. É Mini-course (L2, M2 & bonus) É Theoretical and practical aspects É Presentation of team research topics É Link with current research É Relation with implementations (LinBox). 2/33.

(3) Application / Motivation É Exact computation: É Many operations reduced to multiplication: exponentiation, division, pgcd, factorization, . . . É Mathematical software: GMP, Sage, Matlab, Maple, . . .. É Other domains using exact computation: É Cryptography: [Discrete logarithm computation in a 180-digit prime field, 2014] É Combinatorics, Number Theory, . . .. É Numerical computation: É Trillions of digits of π. [YEE, KONDO ’11]. É Robotics: Equilibrium of cable driven parallel robots. 3/33.

(4) From polynomial to integer multiplication To multiply. (794x2 + 983x + 523) Evaluate at x = 107. ×. (564x2 + 637x + 185). (794 · (107 )2 + 983 · 107 + 523) 79400009830000523. × (564 · (107 )2 + 637 · 107 + 185) × 56400006370000185 = 4478161060190106803305150060096755. "Interpolate" at x = 107 447816x4 + 1060190x3 + 1068033x2 + 515006x + 96755 Remarks: É Technique called Kronecker substitution (1882) É 107 is the minimal power of 10 greater than all coefficients 4/33.

(5) From integer to polynomial multiplication To multiply 794983523 "Interpolation" at x = 103. ×. 564637185. (794x2 + 983x + 523). × (564x2 + 637x + 185) = 447816x4 + 1060190x3 + 1068033x2 + 515006x + 96755. Evaluate at x = 103 447816 · 1012 + 1060190 · 109 + 1068033 · 106 + 515006 · 103 + 96755 = 448877258548102755 Remarks: É Addition with carry É Base 264 instead of base 10 5/33.

(6) Outline. Polynomial multiplication algorithms: 1. 2. 3. 4.. Karatsuba Toom-Cook Fast Fourier Tranform (FFT) Truncated FFT (TFT). 6/33.

(7) Polynomial data structures É Dense polynomials = list of all coefficients. x11 + 5x10 + 9x8 + 4x7 + 7x6 + x2 + 8 1. 5. 0. 9. 4. 7. 0. 0. 0. 1. 0. 8. 7/33.

(8) Polynomial data structures É Dense polynomials = list of all coefficients É Sparse polynomials = list of all non-zero coefficients. x29 + 9x12 + 4x11 + 2x2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 4 0 0 0 0 0 0 0 0 2 0 0 29 12 11. 2. 9. 2. 1. 4. 7/33.

(9) Polynomial data structures É Dense polynomials = list of all coefficients É Sparse polynomials = list of all non-zero coefficients É Straight-line programs. X 4 +4X 3 Y +6X 2 Y 2 +4XY 3 +Y 4 +X 2 Z+2XYZ+Y 2 Z+X 2 +2XY +Y 2 +Z2 +2Z+1. 7/33.

(10) Polynomial data structures É Dense polynomials = list of all coefficients É Sparse polynomials = list of all non-zero coefficients É Straight-line programs. ((X + Y )2 )2 + (X + Y )2 · (Z + 1) + (Z + 1)2 X. Y. Z. 1. 7/33.

(11) Polynomial data structures É Dense polynomials = list of all coefficients É Sparse polynomials = list of all non-zero coefficients É Straight-line programs. 7/33.

(12) Naive polynomial multiplication Example: + a0 + a1 x + a2 x2 + a3 x3 + × = + b0 + b1 x + b2 x2 + b3 x3 + + +. ( ( ( ( ( ( (. a0 b3 a0 b2 a0 b1 a0 b0. + + +. a1 b3 a1 b2 a1 b1 a1 b0. + + +. a2 b3 a2 b2 a2 b1 a2 b0. + + +. a3 b3 a3 b2 a3 b1 a3 b0. ) ) ) ) ) ) ). x6 x5 x4 x3 x2 x1 x0. 8/33.

(13) Naive polynomial multiplication Example: + a0 + a1 x + a2 x2 + a3 x3 + × = + b0 + b1 x + b2 x2 + b3 x3 + + +. In general:. ( ( ( ( ( ( (. a0 b3 a0 b2 a0 b1 a0 b0. + + +. a1 b3 a1 b2 a1 b1 a1 b0. + + +. a2 b3 a2 b2 a2 b1 a2 b0. + + +. a3 b3 a3 b2 a3 b1 a3 b0. ) ) ) ) ) ) ). x6 x5 x4 x3 x2 x1 x0. Multiplication of degree n polynomials in O(n2 ) arithmetic operations (+, −, ×). 8/33.

(14) Naive polynomial multiplication Example: + a0 + a1 x + a2 x2 + a3 x3 + × = + b0 + b1 x + b2 x2 + b3 x3 + + +. In general:. ( ( ( ( ( ( (. a0 b3 a0 b2 a0 b1 a0 b0. + + +. a1 b3 a1 b2 a1 b1 a1 b0. + + +. a2 b3 a2 b2 a2 b1 a2 b0. + + +. a3 b3 a3 b2 a3 b1 a3 b0. ) ) ) ) ) ) ). x6 x5 x4 x3 x2 x1 x0. Multiplication of degree n polynomials in O(n2 ) arithmetic operations (+, −, ×). Lower bound: The multiplication costs at least n arith. operations. 8/33.

(15) Naive polynomial multiplication Example: + a0 + a1 x + a2 x2 + a3 x3 + × = + b0 + b1 x + b2 x2 + b3 x3 + + +. In general:. ( ( ( ( ( ( (. a0 b3 a0 b2 a0 b1 a0 b0. + + +. a1 b3 a1 b2 a1 b1 a1 b0. + + +. a2 b3 a2 b2 a2 b1 a2 b0. + + +. a3 b3 a3 b2 a3 b1 a3 b0. ) ) ) ) ) ) ). x6 x5 x4 x3 x2 x1 x0. Multiplication of degree n polynomials in O(n2 ) arithmetic operations (+, −, ×). Lower bound: The multiplication costs at least n log n arith. operations. 8/33.

(16) Naive polynomial multiplication Example: + a0 + a1 x + a2 x2 + a3 x3 + × = + b0 + b1 x + b2 x2 + b3 x3 + + +. In general:. ( ( ( ( ( ( (. a0 b3 a0 b2 a0 b1 a0 b0. + + +. a1 b3 a1 b2 a1 b1 a1 b0. + + +. a2 b3 a2 b2 a2 b1 a2 b0. + + +. a3 b3 a3 b2 a3 b1 a3 b0. ) ) ) ) ) ) ). x6 x5 x4 x3 x2 x1 x0. Multiplication of degree n polynomials in O(n2 ) arithmetic operations (+, −, ×). Lower bound: The multiplication costs at least n log n arith. operations What is the best complexity of multiplication ? 8/33.

(17) Outline. Polynomial multiplication algorithms: 1. 2. 3. 4.. Karatsuba Toom-Cook Fast Fourier Tranform (FFT) Truncated FFT (TFT). 9/33.

(18) Karatsuba. [KARATSUBA, OFMAN ’68]. (a0 + a1 x) · (b0 + b1 x) = c0 + c1 x + c2 x2 Naive algorithm: 4 multiplications   c0 = a0 b0 c1 = a0 b1 + a1 b0  c =a b 2 1 1 Karatsuba: 3 multiplications by writing   c0 = a0 b0 c1 = (a0 + a1 ) · (b0 + b1 ) − a0 b0 − a1 b1  c =a b 2 1 1. 10/33.

(19) Karatsuba - Divide and conquer algorithm What about multiplication a(x) · b(x) of polynomials of degree n?. 11/33.

(20) Karatsuba - Divide and conquer algorithm What about multiplication a(x) · b(x) of polynomials of degree n?. Recursive multiplication algorithm: P 1. a(x) = 0≤i<n ai xi. Split in 2 parts. 11/33.

(21) Karatsuba - Divide and conquer algorithm What about multiplication a(x) · b(x) of polynomials of degree n?. Recursive multiplication algorithm: P P 1. a(x) = 0≤i<n/2 ai xi + xn/2 0≤i<n/2 ai+n/2 xi. Split in 2 parts. 11/33.

(22) Karatsuba - Divide and conquer algorithm What about multiplication a(x) · b(x) of polynomials of degree n?. Recursive multiplication algorithm: 1. a(x) = al (x) + xn/2 ah (x). Split in 2 parts. 11/33.

(23) Karatsuba - Divide and conquer algorithm What about multiplication a(x) · b(x) of polynomials of degree n?. Recursive multiplication algorithm: 1. a(x) = al (x) + xn/2 ah (x). 2. b(x) = bl (x) + x. n/2. Split in 2 parts. bh (x). 11/33.

(24) Karatsuba - Divide and conquer algorithm What about multiplication a(x) · b(x) of polynomials of degree n?. Recursive multiplication algorithm: 1. a(x) = al (x) + xn/2 ah (x) 2. 3. 4. 5.. b(x) = bl (x) + x bh (x) cl (x) = al (x) · bl (x) cm (x) = (al + ah ) · (bl + bh ) ch (x) = ah (x) · bh (x). Split in 2 parts. n/2. Recursive call of size n/2 Recursive call of size n/2 Recursive call of size n/2. 11/33.

(25) Karatsuba - Divide and conquer algorithm What about multiplication a(x) · b(x) of polynomials of degree n?. Recursive multiplication algorithm: 1. a(x) = al (x) + xn/2 ah (x) 2. 3. 4. 5. 6.. Split in 2 parts. b(x) = bl (x) + x bh (x) cl (x) = al (x) · bl (x) Recursive call of size n/2 cm (x) = (al + ah ) · (bl + bh ) Recursive call of size n/2 ch (x) = ah (x) · bh (x) Recursive call of size n/2 return c(x) = cl (x) + (cm (x) − cl (x) − ch (x)) xn/2 + ch (x)xn n/2. 11/33.

(26) Karatsuba - Divide and conquer algorithm What about multiplication a(x) · b(x) of polynomials of degree n?. Recursive multiplication algorithm: 1. a(x) = al (x) + xn/2 ah (x) 2. 3. 4. 5. 6.. Split in 2 parts. b(x) = bl (x) + x bh (x) cl (x) = al (x) · bl (x) Recursive call of size n/2 cm (x) = (al + ah ) · (bl + bh ) Recursive call of size n/2 ch (x) = ah (x) · bh (x) Recursive call of size n/2 return c(x) = cl (x) + (cm (x) − cl (x) − ch (x)) xn/2 + ch (x)xn n/2. Remarks: É É É É. Complexity: K (n) = 3K (n/2) + O(n) = O(nlog2 (3) ) = O(n1,59 ) Karatsuba K (n)  O(n2 ) naive In practice, hybrid Karatsuba / naive algorithm Need careful memory management (one memory allocation, in-place algorithms) 11/33.

(27) Outline. Polynomial multiplication algorithms: 1. 2. 3. 4.. Karatsuba Toom-Cook Fast Fourier Tranform (FFT) Truncated FFT (TFT). 12/33.

(28) Karatsuba revisited. Karatsuba: ¨ a0 + a1 x b0 + b1 x   yMult.. c(x) = a(x) · b(x). 13/33.

(29) Karatsuba revisited. Karatsuba: ¨ a0 + a1 x b0 + b1 x   yMult.. Evaluation. −−−−−→. ¨. a(0), a(1), a(∞). b(0), b(1), b(∞). c(x) = a(x) · b(x). 13/33.

(30) Karatsuba revisited. Karatsuba: ¨ a0 + a1 x b0 + b1 x   yMult.. c(x) = a(x) · b(x). Evaluation. −−−−−→. ¨. a(0), a(1), a(∞). b(0), b(1), b(∞)  Pointwise y mult..   c(0) = a(0) · b(0) c(1) = a(1) · b(1)  c(∞) = a(∞) · b(∞). 13/33.

(31) Karatsuba revisited. Karatsuba: ¨ a0 + a1 x b0 + b1 x   yMult.. c(x) = a(x) · b(x). Evaluation. −−−−−→. Interpolation. ←−−−−−−−. ¨. a(0), a(1), a(∞). b(0), b(1), b(∞)  Pointwise y mult..   c(0) = a(0) · b(0) c(1) = a(1) · b(1)  c(∞) = a(∞) · b(∞). 13/33.

(32) Interpolation Interpolation: Recover the polynomial from evaluations of it.. 14/33.

(33) Interpolation Interpolation: Recover the polynomial from evaluations of it.. 14/33.

(34) Interpolation Interpolation: Recover the polynomial from evaluations of it.. 14/33.

(35) Interpolation Interpolation: Recover the polynomial from evaluations of it.. 14/33.

(36) Evaluation/interpolation algorithms Karatsuba: [KARATSUBA, OFMAN 1963] Polynomials a, b with 2 coefficients Evaluation in 3 points, e.g. 0, 1, ∞ Complexity: K (n) = 3K (dn/2e) + O(n) = O(nlog2 (3) ) = O(n1,59 ). Toom-3: [TOOM ’63] Polynomials a, b with 3 coefficients Evaluation in 5 points, e.g. −1, 0, 1, 2, ∞ Complexity: T (n) = 5K (dn/3e) + O(n) = O(nlog3 (5) ) = O(n1,47 ) Toom-Cook:. [COOK ’66]. É Generalization that reaches complexity O(n1+" ) for all " > 0 É Less and less pratical (big constant in O). 15/33.

(37) Outline. Polynomial multiplication algorithms: 1. 2. 3. 4.. Karatsuba Toom-Cook Fast Fourier Tranform (FFT) Truncated FFT (TFT). 16/33.

(38) Evaluation / Interpolation algorithms. Monomial representation ¨ a(x) degree n b(x) degree n   yMult. c(x) = a(x) · b(x) of degree 2n. Evaluation. −−−−−→. Interpolation. ←−−−−−−−. Evaluation representation ¨ a(0), a(1), . . . , a(2n + 1) b(0), b(1), . . . , b(2n + 1)  Pointwise mult. y.   c(0) = a(0) · b(0) ...  c(2n + 1) = a(2n + 1) · b(2n + 1). 17/33.

(39) Evaluation / Interpolation algorithms Karatsuba, Toom-3, Toom-Cook: Fixed size eval./interp. scheme + recursion Fast Fourier Transform: Full size eval./interp. scheme + no recursion. Monomial representation ¨ a(x) degree n b(x) degree n   yMult. c(x) = a(x) · b(x) of degree 2n. Evaluation. −−−−−→. Interpolation. ←−−−−−−−. Evaluation representation ¨ a(0), a(1), . . . , a(2n + 1) b(0), b(1), . . . , b(2n + 1)  Pointwise mult. y.   c(0) = a(0) · b(0) ...  c(2n + 1) = a(2n + 1) · b(2n + 1). 17/33.

(40) Evaluation / Interpolation algorithms Karatsuba, Toom-3, Toom-Cook: Fixed size eval./interp. scheme + recursion Fast Fourier Transform: Full size eval./interp. scheme + no recursion. Monomial representation ¨ a(x) degree n b(x) degree n   yMult. c(x) = a(x) · b(x) of degree 2n. Evaluation. −−−−−→. Interpolation. ←−−−−−−−. Evaluation representation ¨ a(0), a(1), . . . , a(2n + 1) b(0), b(1), . . . , b(2n + 1)  Pointwise mult. yCost : O(n).   c(0) = a(0) · b(0) ...  c(2n + 1) = a(2n + 1) · b(2n + 1). 17/33.

(41) Evaluation / Interpolation algorithms Karatsuba, Toom-3, Toom-Cook: Fixed size eval./interp. scheme + recursion Fast Fourier Transform: Full size eval./interp. scheme + no recursion. Monomial representation ¨ a(x) degree n b(x) degree n   yMult. c(x) = a(x) · b(x) of degree 2n. Evaluation. −−−−−→ Cost?. Interpolation. ←−−−−−−− Cost?. Evaluation representation ¨ a(0), a(1), . . . , a(2n + 1) b(0), b(1), . . . , b(2n + 1)  Pointwise mult. yCost : O(n).   c(0) = a(0) · b(0) ...  c(2n + 1) = a(2n + 1) · b(2n + 1). From now on, we will focus on the cost of evaluation / interpolation. 17/33.

(42) Evaluation / Interpolation algorithms Karatsuba, Toom-3, Toom-Cook: Fixed size eval./interp. scheme + recursion Fast Fourier Transform: Full size eval./interp. scheme + no recursion. Monomial representation ¨ a(x) degree n b(x) degree n   yMult. c(x) = a(x) · b(x) of degree 2n. Evaluation. −−−−−→ Cost?. Interpolation. ←−−−−−−− Cost?. Evaluation representation ¨ a(0), a(1), . . . , a(2n + 1) b(0), b(1), . . . , b(2n + 1)  Pointwise mult. yCost : O(n).   c(0) = a(0) · b(0) ...  c(2n + 1) = a(2n + 1) · b(2n + 1). From now on, we will focus on the cost of evaluation / interpolation. Note: Interpolation with errors. 17/33.

(43) Discrete Fourier Transform (DFT). Evaluation / interpolation is generally costly. But if evaluation points are specific, it can be very efficient: É Evaluate at ξ0 , ξ1 , ξ2 , . . . , ξn−1 É where ξ is a primitive root of unity. Discrete Fourier Transform = Evaluation on roots of unity DFTξ (a(x)) := (a(ξ0 ), . . . , a(ξn−1 )) where ξ is a n-th primitive root of unity and deg a(x) < n.. 18/33.

(44) Complex root of unity y 2iπ. ξ5 ξ. ξ. 4. ξ = e 16 ∈ C is a 16-th primitive root of unity: É ξ16 = 1. ξ3 ξ2. 6. ξ7. ξ1 2π 16. ξ8 ξ9. ξ0 ξ15. ξ10. ξ14 ξ. 11. ξ12. ξ. 13. x. É ξi 6= 1 for 0 < i < 16 Remark: É 1 = ξ0 = ξ16 = ξ32 = . . . É ξ16/2 = −1. Pros & Cons: Fast floating point arithmetic, but precision issues. 19/33.

(45) Modular root of unity Modular integers: Z/pZ if p prime (a = a + p = a + 2p = . . . modulo p) Example Z/17Z: 13. 5. ξ4. ξ5. 15. 10 ξ3 ξ2. ξ6 11. 16 ξ. 3. ξ1. ξ7. Z/17Z. 8. ξ. ξ9. 0. 1. ξ15. 14. 6 ξ. ξ = 3 ∈ Z/17Z is a 16-th primitive root of unity: É ξ16 = 1. 9. 10. ξ. 8. ξ11 7. ξ12 4. 14. 2. ξ13 12. É ξi 6= 1 for 0 < i < 16 Remark: É 1 = ξ0 = ξ16 = ξ32 = . . . É ξ16/2 = −1 20/33.

(46) Fast Fourier Transform Goal: Given a(x), compute (a(ξ0 ), . . . , a(ξn−1 )) (when n = 2k ) If a(x) = al (x) + xn/2 ah (x) then a(ξj ) = al (ξj ) + (ξj )n/2 ah (ξj ). 21/33.

(47) Fast Fourier Transform Goal: Given a(x), compute (a(ξ0 ), . . . , a(ξn−1 )) (when n = 2k ) If a(x) = al (x) + xn/2 ah (x) then a(ξj ) = al (ξj ) + (ξn/2 )j ah (ξj ). 21/33.

(48) Fast Fourier Transform Goal: Given a(x), compute (a(ξ0 ), . . . , a(ξn−1 )) (when n = 2k ) If a(x) = al (x) + xn/2 ah (x) then a(ξj ) = al (ξj ) + (−1)j ah (ξj ). 21/33.

(49) Fast Fourier Transform Goal: Given a(x), compute (a(ξ0 ), . . . , a(ξn−1 )) (when n = 2k ) If a(x) = al (x) + xn/2 ah (x) then a(ξj ) = al (ξj ) + (−1)j ah (ξj ) ⇒. a(ξ2i. §. a(ξ. 2i+1. ) = al (ξ2i ) + ah (ξ2i ) ) = al (ξ2i+1 ) − ah (ξ2i+1 ). 21/33.

(50) Fast Fourier Transform Goal: Given a(x), compute (a(ξ0 ), . . . , a(ξn−1 )) (when n = 2k ) If a(x) = al (x) + xn/2 ah (x) then a(ξj ) = al (ξj ) + (−1)j ah (ξj ) ⇒. a(ξ2i. §. a(ξ. 2i+1. ) = al (ξ2i ) + ah (ξ2i ) ) = al (ξ2i+1 ) − ah (ξ2i+1 ). Define r̄(x) = al (x) + ah (x), r0 (x) = al (x) − ah (x). ¯. 21/33.

(51) Fast Fourier Transform Goal: Given a(x), compute (a(ξ0 ), . . . , a(ξn−1 )) (when n = 2k ) If a(x) = al (x) + xn/2 ah (x) then a(ξj ) = al (ξj ) + (−1)j ah (ξj ) ⇒. a(ξ2i. §. a(ξ. 2i+1. ) = al (ξ2i ) + ah (ξ2i ) = r̄(ξ2i ) ) = al (ξ2i+1 ) − ah (ξ2i+1 ) = r0 (ξ2i+1 ) ¯. Define r̄(x) = al (x) + ah (x), r0 (x) = al (x) − ah (x). ¯. 21/33.

(52) Fast Fourier Transform Goal: Given a(x), compute (a(ξ0 ), . . . , a(ξn−1 )) (when n = 2k ) If a(x) = al (x) + xn/2 ah (x) then a(ξj ) = al (ξj ) + (−1)j ah (ξj ) ⇒. a(ξ2i. §. a(ξ. 2i+1. ) = al (ξ2i ) + ah (ξ2i ) = r̄(ξ2i ) ) = al (ξ2i+1 ) − ah (ξ2i+1 ) = r0 (ξ2i+1 ) ¯. Define r̄(x) = al (x) + ah (x), r0 (x) = al (x) − ah (x) and r(x) = r0 (ξx). ¯ ¯ ¯. 21/33.

(53) Fast Fourier Transform Goal: Given a(x), compute (a(ξ0 ), . . . , a(ξn−1 )) (when n = 2k ) If a(x) = al (x) + xn/2 ah (x) then a(ξj ) = al (ξj ) + (−1)j ah (ξj ) ⇒. a(ξ2i. §. a(ξ. 2i+1. ) = al (ξ2i ) + ah (ξ2i ) = r̄(ξ2i ) ) = al (ξ2i+1 ) − ah (ξ2i+1 ) = r(ξ2i ) ¯. Define r̄(x) = al (x) + ah (x), r0 (x) = al (x) − ah (x) and r(x) = r0 (ξx). ¯ ¯ ¯. 21/33.

(54) Fast Fourier Transform Goal: Given a(x), compute (a(ξ0 ), . . . , a(ξn−1 )) (when n = 2k ) If a(x) = al (x) + xn/2 ah (x) then a(ξj ) = al (ξj ) + (−1)j ah (ξj ) ⇒. a(ξ2i. §. a(ξ. 2i+1. ) = al (ξ2i ) + ah (ξ2i ) = r̄(ξ2i ) ) = al (ξ2i+1 ) − ah (ξ2i+1 ) = r(ξ2i ) ¯. Define r̄(x) = al (x) + ah (x), r0 (x) = al (x) − ah (x) and r(x) = r0 (ξx). ¯ ¯ ¯ Finally (a(ξ0 ), a(ξ1 ), a(ξ2 ), a(ξ3 ), . . . ) = (r̄(ξ0 ), r(ξ0 ), r̄(ξ2 ), r(ξ2 ), . . . ) ¯ ¯. 21/33.

(55) Fast Fourier Transform Goal: Given a(x), compute (a(ξ0 ), . . . , a(ξn−1 )) (when n = 2k ) If a(x) = al (x) + xn/2 ah (x) then a(ξj ) = al (ξj ) + (−1)j ah (ξj ) ⇒. a(ξ2i. §. a(ξ. 2i+1. ) = al (ξ2i ) + ah (ξ2i ) = r̄(ξ2i ) ) = al (ξ2i+1 ) − ah (ξ2i+1 ) = r(ξ2i ) ¯. Define r̄(x) = al (x) + ah (x), r0 (x) = al (x) − ah (x) and r(x) = r0 (ξx). ¯ ¯ ¯ Finally (a(ξ0 ), a(ξ1 ), a(ξ2 ), a(ξ3 ), . . . ) = (r̄(ξ0 ), r(ξ0 ), r̄(ξ2 ), r(ξ2 ), . . . ) ¯ ¯ FFT Algorithm: 1. 2. 3. 4. 5. 6. 7.. [COOLEY, TUKEY ’65]. Write a(x) = al (x) + xn/2 ah (x) Split in 2 parts Compute r̄(x) = al (x) + ah (x) Compute r0 (x) = al (x) − ah (x) Compute ¯r(x) = r0 (ξx) Evaluate r̄¯(ξ0 ), r̄¯ (ξ2 ), r̄(ξ4 ), . . . Recursive call in size n/2 0 2 4 Evaluate r(ξ ), r(ξ ), r(ξ ), . . . Recursive call in size n/2 return r̄(¯ξ0 ), r(¯ξ0 ), r̄(¯ξ2 ), r(ξ2 ), r̄(ξ4 ), r(ξ4 ), . . . ¯ ¯ ¯ 21/33.

(56) FFT butterflies a0 a1 a2 a3 al (x) a4 a5 a6 a7 a8 a9 a10 a11 ah (x) a12 a13 a14 a15. + + + + + + + + − − − − − − − −. ξ0. × 1 ξ × 2 ξ × 3 ξ × 4 ξ × 5 ξ × 6 ξ × 7 ξ ×. r̄0 r̄1 r̄2 r̄3 r̄(x) r̄4 r̄5 r̄6 r̄7 r0 ¯ r1 ¯ r2 ¯ r3 ¯ r(x) r4 ¯ ¯ r5 ¯ r6 ¯ r7 ¯. É r̄(x) = al (x) + ah (x) É r0 (x) = al (x) − ah (x) ¯ É r(x) = r0 (ξx) ¯ ¯. 22/33.

(57) FFT butterflies a0 a1 a2 a3 al (x) a4 a5 a6 a7 a8 a9 a10 a11 a12 a13 a14 a15. r̄0 r̄1 r̄2 r̄3 r̄4 r̄5 r̄6 r̄7 r0 ¯ r1 ¯ r2 ¯ r3 ¯ r4 ¯ r5 ¯ r6 ¯ r7 ¯ 22/33.

(58) FFT butterflies a0 a1 a2 a3 al (x) a4 a5 a6 a7 a8 a9 a10 a11 a12 a13 a14 a15. a(ξ0 ) a(ξ8 ) a(ξ4 ) a(ξ12 ) a(ξ2 ) a(ξ10 ) a(ξ6 ) a(ξ14 ) a(ξ1 ) a(ξ9 ) a(ξ5 ) a(ξ13 ) a(ξ3 ) a(ξ11 ) a(ξ7 ) a(ξ15 ) 22/33.

(59) FFT butterflies a00002 a00012 a00102 a00112 a01002 a01012 a01102 a01112 a10002 a10012 a10102 a10112 a11002 a11012 a11102 a11112. 2. a(ξ00002 ) a(ξ10002 ) a(ξ01002 ) a(ξ11002 ) a(ξ00102 ) a(ξ10102 ) a(ξ01102 ) a(ξ11102 ) a(ξ00012 ) a(ξ10012 ) a(ξ01012 ) a(ξ11012 ) a(ξ00112 ) a(ξ10112 ) a(ξ01112 ) a(ξ1111 ) 22/33.

(60) FFT timings É Evaluation or interpolation in 3/2n log n arithmetic operations É Multiplication in ∼ 9n log n É But only for degree n = 2k , pad with zeroes otherwise, loose factor 2 600. FFT. Multiplication time. 500. 400. 300. 200. 100. 0 2000. 4000. 6000. 8000 10000 Polynomial degree. 12000. 14000. 16000. Figure 1: Fast Fourier Transform timings 23/33.

(61) Outline. Polynomial multiplication algorithms: 1. 2. 3. 4.. Karatsuba Toom-Cook Fast Fourier Tranform (FFT) Truncated FFT (TFT). 24/33.

(62) Truncated Fourier Transform. [HOEVEN ’04]. Goal: Save computations when n = deg a(x) is not 2k : É compute only the first n evaluates of a(x) É get a cost ∼ 23 n log n for all degrees n. É save up to a factor 2. (instead of ∼ 32 2k log 2k ). 25/33.

(63) Truncated Fourier Transform a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 0 0 0 0 0. a(ξ0 ) a(ξ8 ) a(ξ4 ) a(ξ12 ) a(ξ2 ) a(ξ10 ) a(ξ6 ) a(ξ14 ) a(ξ1 ) a(ξ9 ) a(ξ5 ) a(ξ13 ) a(ξ3 ) a(ξ11 ) a(ξ7 ) a(ξ15 ) 26/33.

(64) Truncated Fourier Transform a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 0 0 0 0 0. a(ξ0 ) a(ξ8 ) a(ξ4 ) a(ξ12 ) a(ξ2 ) a(ξ10 ) a(ξ6 ) a(ξ14 ) a(ξ1 ) a(ξ9 ) a(ξ5 ) a(ξ13 ) a(ξ3 ) a(ξ11 ) a(ξ7 ) a(ξ15 ) 26/33.

(65) Truncated Fourier Transform a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 0 0 0 0 0. a(ξ0 ) a(ξ8 ) a(ξ4 ) a(ξ12 ) a(ξ2 ) a(ξ10 ) a(ξ6 ) a(ξ14 ) a(ξ1 ) a(ξ9 ) a(ξ5 ) a(ξ13 ) a(ξ3 ) a(ξ11 ) a(ξ7 ) a(ξ15 ) 26/33.

(66) Truncated Fourier Transform a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 0 0 0 0 0. a(ξ0 ) a(ξ8 ) a(ξ4 ) a(ξ12 ) a(ξ2 ) a(ξ10 ) a(ξ6 ) a(ξ14 ) a(ξ1 ) a(ξ9 ) a(ξ5 ) a(ξ13 ) a(ξ3 ) a(ξ11 ) a(ξ7 ) a(ξ15 ) 26/33.

(67) Truncated Fourier Transform a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 0 0 0 0 0. a(ξ0 ) a(ξ8 ) a(ξ4 ) a(ξ12 ) a(ξ2 ) a(ξ10 ) a(ξ6 ) a(ξ14 ) a(ξ1 ) a(ξ9 ) a(ξ5 ) a(ξ13 ) a(ξ3 ) a(ξ11 ) a(ξ7 ) a(ξ15 ) 26/33.

(68) Inverse Truncated Fourier Transform Goal: É É É É. recover the polynomial a(x) from only its first n evaluates knowing that deg a(x) < n (instead of deg a(x) < 2k ) 3 get a cost ∼ 2 n log n (instead of ∼ 32 2k log 2k ) save up to a factor 2. Operations required: É FFT butterfly:. −→. É Inverse FFT butterfly:. −→. É Crossed butterfly:. −→. 27/33.

(69) Inverse TFT - Recursive Algorithm : Case 1. 28/33.

(70) Inverse TFT - Recursive Algorithm : Case 1. 28/33.

(71) Inverse TFT - Recursive Algorithm : Case 1. 28/33.

(72) Inverse TFT - Recursive Algorithm : Case 1. 28/33.

(73) Inverse TFT - Recursive Algorithm : Case 1. 28/33.

(74) Inverse TFT - Recursive Algorithm : Case 1. 28/33.

(75) Inverse TFT - Recursive Algorithm : Case 1. 28/33.

(76) Inverse TFT - Recursive Algorithm : Case 1. 28/33.

(77) Inverse TFT - Recursive Algorithm : Case 1. 28/33.

(78) Inverse TFT - Recursive Algorithm : Case 1. 28/33.

(79) Inverse TFT - Recursive Algorithm : Case 2. 29/33.

(80) Inverse TFT - Recursive Algorithm : Case 2. 29/33.

(81) Inverse TFT - Recursive Algorithm : Case 2. 29/33.

(82) Inverse TFT - Recursive Algorithm : Case 2. 29/33.

(83) Inverse TFT - Recursive Algorithm : Case 2. 29/33.

(84) Inverse TFT - Recursive Algorithm : Case 2. 29/33.

(85) Inverse TFT - Recursive Algorithm : Case 2. 29/33.

(86) Inverse TFT - Recursive Algorithm : Case 2. 29/33.

(87) Complexity and timings Evaluation or interpolation in 3/2n log n for all degrees n 600. FFT TFT. Multiplication time. 500. 400. 300. 200. 100. 0 2000. 4000. 6000. 8000 10000 Polynomial degree. 12000. 14000. 16000. Figure 2: Fast Fourier Transform vs Truncated FFT timings. 30/33.

(88) Link with team research É Implementation of polynomial matrix multiplication in LinBox É FFT for lattice and symmetric polynomials. [HOEVEN, L., SCHOST ’14]. 31/33.

(89) Link with team research É Implementation of polynomial matrix multiplication in LinBox É FFT for lattice and symmetric polynomials. [HOEVEN, L., SCHOST ’14]. 31/33.

(90) Conclusion É Open questions on multiplication of polynomials of degree n:. 32/33.

(91) Conclusion É Open questions on multiplication of polynomials of degree n: É Complexity O(n log n) when coefficients in Z/pZ but n < p. 32/33.

(92) Conclusion É Open questions on multiplication of polynomials of degree n: É Complexity O(n log n) when coefficients in Z/pZ but n < p É Complexity O(n log n log log n) in general [SCHÖNHAGE, STRASSEN ’71]. 32/33.

(93) Conclusion É Open questions on multiplication of polynomials of degree n: É Complexity O(n log n) when coefficients in Z/pZ but n < p É Complexity O(n log n log log n) in general [SCHÖNHAGE, STRASSEN ’71] ∗ É Complexity O(n log n8log n ) when coefficients in Z/pZ [HARVEY, HOEVEN, LECERF ’17]. 32/33.

(94) Conclusion É Open questions on multiplication of polynomials of degree n: É Complexity O(n log n) when coefficients in Z/pZ but n < p É Complexity O(n log n log log n) in general [SCHÖNHAGE, STRASSEN ’71] ∗ É Complexity O(n log n8log n ) when coefficients in Z/pZ [HARVEY, HOEVEN, LECERF ’17]. É In practice :. 32/33.

(95) Conclusion É Open questions on multiplication of polynomials of degree n: É Complexity O(n log n) when coefficients in Z/pZ but n < p É Complexity O(n log n log log n) in general [SCHÖNHAGE, STRASSEN ’71] ∗ É Complexity O(n log n8log n ) when coefficients in Z/pZ [HARVEY, HOEVEN, LECERF ’17]. É In practice : É Complementary algorithms: Naive 0. FFT / TFT. Karatsuba 20. 100. n. 32/33.

(96) Conclusion É Open questions on multiplication of polynomials of degree n: É Complexity O(n log n) when coefficients in Z/pZ but n < p É Complexity O(n log n log log n) in general [SCHÖNHAGE, STRASSEN ’71] ∗ É Complexity O(n log n8log n ) when coefficients in Z/pZ [HARVEY, HOEVEN, LECERF ’17]. É In practice : É Complementary algorithms: Naive 0. FFT / TFT. Karatsuba 20. 100. n. É Will [HHL ’17] become practical in the future ?. 32/33.

(97) Conclusion É Open questions on multiplication of polynomials of degree n: É Complexity O(n log n) when coefficients in Z/pZ but n < p É Complexity O(n log n log log n) in general [SCHÖNHAGE, STRASSEN ’71] ∗ É Complexity O(n log n8log n ) when coefficients in Z/pZ [HARVEY, HOEVEN, LECERF ’17]. É In practice : É Complementary algorithms: Naive 0. FFT / TFT. Karatsuba 20. 100. n. É Will [HHL ’17] become practical in the future ? É Technical aspects not discussed: SIMD, cache, . . .. 32/33.

(98) Conclusion É Open questions on multiplication of polynomials of degree n: É Complexity O(n log n) when coefficients in Z/pZ but n < p É Complexity O(n log n log log n) in general [SCHÖNHAGE, STRASSEN ’71] ∗ É Complexity O(n log n8log n ) when coefficients in Z/pZ [HARVEY, HOEVEN, LECERF ’17]. É In practice : É Complementary algorithms: Naive 0. FFT / TFT. Karatsuba 20. 100. n. É Will [HHL ’17] become practical in the future ? É Technical aspects not discussed: SIMD, cache, . . .. É Thank you for your attention !. 32/33.

(99) Different Fourier transforms Classical Fourier transform É Decomposition in the frequency domain É Integral formula: Z f̂ (ξ) =. f (x)e−2iπxξ dx. É Multiplicativity: ĥ(ξ) = f̂ (ξ) · ĝ(ξ) when h = Convolution(f , g). Discrete Fourier transform É Discrete formula:. f̂k =. X n. fn e−. 2iπ N nk. P 2iπ É Link with evaluation: p̂k = P(e− N k ) where P(x) = n pn xn É Multiplicativity: Let c(x) = a(x) · b(x) then ĉk = âk · b̂k 33/33.

(100)

Références

Documents relatifs

In itself, the above observation does not imply the existence of a fast algorithm for matrix multiplication, because the problem of multiplying two n × n matrices can not directly

As long as the input data and the pre-computed constants t into the level 3 cache of the processor (in our case, 8 MB), we may simply unfold Algorithm 1: we do not transpose- copy

U NIVARIATE INTERPOLATION USING CYCLIC EXTENSIONS One general approach for sparse interpolation of univariate polynomials over gen- eral base fields K was initiated by Garg and

The same kind of algorithm should work over the integers too, but the results might not be as good (the fact that two polynomials with n coefficients multiply to form a product with

In Section 1, we present the no- tion of evaluation tree and function scheme that unifies and extends state of the art algorithms for polynomial evaluation, such as the Hörner

On propose ici d’écrire des procédures de multiplication de polynômes. Chacune de ces procédures prendra comme entrée les deux polynômes sous forme de listes et rendra le

We split the Karatsuba recursion of depth s into two main steps: a first step which performs component polynomial formations and recursive products and a second step which performs

By contraries, the modified Leblond’s model assumes a nonlinear relation between applied stress and transformation plastic strain only when the applied stress is larger than a half