Modular composition via complex roots


HAL Id: hal-01455731

https://hal.archives-ouvertes.fr/hal-01455731v2

Preprint submitted on 27 Mar 2017

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


Joris van der Hoeven, Grégoire Lecerf

To cite this version:

Joris van der Hoeven, Grégoire Lecerf. Modular composition via complex roots. 2017. ⟨hal-01455731v2⟩


Modular composition via complex roots

JORIS VAN DER HOEVEN (a), GRÉGOIRE LECERF (b)

Laboratoire d'informatique de l'École polytechnique, CNRS UMR 7161
École polytechnique
91128 Palaiseau Cedex, France
a. Email: vdhoeven@lix.polytechnique.fr
b. Email: lecerf@lix.polytechnique.fr

Preprint version, March 27, 2017

Modular composition is the problem of computing the composition of two univariate polynomials modulo a third one. For polynomials with coefficients in a finite field, Kedlaya and Umans proved in 2008 that the theoretical complexity for performing this task could be made arbitrarily close to linear. Unfortunately, beyond its major theoretical impact, this result has not led to practically faster implementations yet. In this paper, we explore the particular case when the ground field is the field of computable complex numbers. Ultimately, when the precision becomes sufficiently large, we show that modular compositions may be performed in softly linear time.

1. INTRODUCTION

Let 𝕂 be an effective field, and let f, g, h be polynomials in 𝕂[x]. The problem of modular composition is to compute g ∘ f modulo h. Modular composition is an important problem in complexity theory because of its applications to polynomial factorization [14, 15, 16]. It also occurs very naturally whenever one wishes to perform polynomial computations over 𝕂 inside an algebraic extension of 𝕂. In addition, given two different representations 𝕂[x]/(h(x)) ≅ 𝕂[x̃]/(h̃(x̃)) of an algebraic extension of 𝕂, the implementation of an explicit isomorphism actually boils down to modular composition.

Denote by M(n) the number of operations in 𝕂 required to multiply two polynomials of degree < n in 𝕂[x]. Let f, g and h be polynomials in 𝕂[x] of respective degrees < n, < n and n.

The naive modular composition algorithm takes O(n M(n)) operations in 𝕂. In 1978, Brent and Kung [3] gave an algorithm with cost O(√n M(n) + n^2). It uses the baby-step giant-step technique due to Paterson and Stockmeyer [21], and even yields a sub-quadratic cost O(n^ϖ + √n M(n)) when using fast linear algebra (see [13, p. 185]). The constant ϖ > 1.5 is such that a √n × √n matrix over 𝕂 can be multiplied with another √n × n rectangular matrix in time O(n^ϖ). The best current bound ϖ < 1.6667 is due to Huang and Pan [12, Theorem 10.1].

A major breakthrough has been achieved by Kedlaya and Umans [15, 16] in the case when 𝕂 is the finite field 𝔽_q. For any positive ε > 0, they showed that the composition g ∘ f modulo h could be computed with bit complexity O((n log q)^{1+ε}). Unfortunately, it remains a major open problem to turn this theoretical complexity bound into practically useful implementations.

Quite surprisingly, the existing literature on modular composition does not exploit the simple observation that composition modulo a separable polynomial h ∈ 𝕂[x] that splits over 𝕂 can be reduced to the well known problems of multi-point evaluation and interpolation [6, Chapter 10].

More precisely, assume that h = (x − σ_1) ⋯ (x − σ_n) is separable, which means that gcd(h, h′) = 1. If f, g ∈ 𝕂[x] are of degree < n, then g ∘ f mod h can be computed by evaluating f at σ_1, …, σ_n, by evaluating g at f(σ_1), …, f(σ_n), and by interpolating the evaluations (g ∘ f)(σ_1), …, (g ∘ f)(σ_n) to yield g ∘ f mod h.



Whenever 𝕂 is algebraically closed and a factorization of h is known, the latter observation leads to a softly-optimal algorithm for composition modulo h. More generally, if the computation of a factorization of h has a negligible or acceptable cost, then this approach leads to an efficient method for modular composition. In this paper, we prove a precise complexity result in the case when 𝕂 is the field of computable complex numbers. In a separate paper [11], we also consider the case when 𝕂 is a finite field and h has composite degree; in that case, h can be factored over suitable field extensions, and similar ideas lead to improved complexity bounds.

In the special case of power series composition (i.e. composition modulo h = x^n), our approach is similar in spirit to the analytic algorithm designed by Ritzmann [22]; see also [8]. In order to keep the exposition as simple as possible in this paper, we only study composition modulo separable polynomials. By handling multiplicities with Ritzmann's algorithm, we expect our algorithm to extend to the general case.

The organization of the present paper is as follows. In section 2, we specify the complexity model to be used, and various standard notations. In section 3, we give a detailed version of the modular composition algorithm that we sketched above for a separable modulus that splits over 𝕂. In order to instantiate this algorithm for the field ℂ^com of computable complex numbers, we need additional concepts. In section 4, we recall basic results about ball arithmetic [9]. In section 5, we recall the computation model of straight-line programs [4]. In section 6, we introduce a new ultimate complexity model that is convenient for proving complexity results at a "sufficiently large precision". This model has the advantage that complexity results over an abstract effective field 𝕂 can naturally be turned into ultimate complexity results over ℂ^com. In section 7, we apply this transfer principle to the modular composition algorithm from section 3; we expect our framework to be useful in many other situations.

One disadvantage of ultimate complexity analysis is that it does not provide us with any information about the precision from which the ultimate complexity is reached. In practical applications, the input polynomials f, g and h often admit integer or rational coefficients. In these cases, the required bit precision is expected to be of order n(l + n) in the worst case, where n = deg h and l is the largest bit size of the coefficients: in fact, this precision allows one to compute all the complex roots of h efficiently using algorithms from [18, 19, 24]. This precision should also be sufficient to perform the multi-point polynomial evaluations of g and f by asymptotically fast algorithms. We intend to work out more such detailed bit complexity bounds for this situation in a forthcoming paper.

2. PRELIMINARIES

In the sequel, we consider both the algebraic and bit complexity models for analyzing the costs of our algorithms. The algebraic complexity model expresses the running time in terms of the number of operations in some abstract ground ring or field [4, Chapter 4]. The bit complexity model relies on Turing machines with a sufficient number of tapes [20].

Fundamental algebraic complexity bounds. Let 𝕂 be an effective field. We write M : ℕ → ℝ^> for a function that bounds the total cost of a polynomial product algorithm in terms of the number of operations in 𝕂. In other words, two polynomials of degrees ⩽ n in 𝕂[x] can be multiplied using at most M(n) arithmetic operations in 𝕂. The schoolbook algorithm allows us to take M(n) = O(n^2). The fastest currently known algorithm [5] yields M(n) = O(n log n log log n) = Õ(n). Here, the soft-Oh notation f(n) ∈ Õ(g(n)) means that f(n) = g(n) log^{O(1)} g(n) (we refer the reader to [6, Chapter 25, Section 7] for technical details). In order to simplify the cost analysis of our algorithms we make the customary assumption that M(n_1)/n_1 ⩽ M(n_2)/n_2 for all 0 < n_1 ⩽ n_2. Notice that this assumption implies the super-additivity of M, namely M(n_1) + M(n_2) ⩽ M(n_1 + n_2) for all n_1 ⩾ 0 and n_2 ⩾ 0.


Fundamental bit complexity bounds. For bit complexity analyses, we consider Turing machines with sufficiently many tapes. We write I(n) for a function that bounds the bit cost of an algorithm which multiplies two integers of bit sizes at most n, for the usual binary representation. The best known bound [7] for I(n) is O(n log n 8^{log* n}) = Õ(n). Again, we make the customary assumption that I(n)/n is nondecreasing.

Multipoint evaluation and interpolation. Let 𝕂 again be an effective field. The remainder (resp. quotient) of the Euclidean division of g by h in 𝕂[x] is denoted by g rem h (resp. by g quo h). It may be computed using O(M(n)) operations in 𝕂, if g and h have degrees ⩽ n.

We recall that the gcd of two polynomials of degrees at most n over 𝕂 can be computed using O(M(n) log n) operations in 𝕂 [6, Algorithm 11.4]. Given polynomials f and g_1, …, g_l over 𝕂 with deg f = n and deg g_1 + ⋯ + deg g_l = O(n), all the remainders f rem g_i may be computed simultaneously in cost O(M(n) log l) using a subproduct tree [6, Chapter 10]. The inverse problem, called Chinese remaindering, can be solved with a similar cost O(M(n) log l), assuming that the g_i are pairwise coprime. The fastest known algorithms for these tasks can be found in [1, 2, 10].
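The subproduct tree idea can be sketched as follows in exact rational arithmetic. This illustrates only the recursive structure, not the O(M(n) log l) bound, since naive polynomial arithmetic is used throughout; all function names are ours.

```python
from fractions import Fraction

def poly_mul(a, b):
    # naive product of coefficient lists (low degree first)
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_rem(f, g):
    # remainder of f by g (naive Euclidean division)
    r = list(f)
    dg = len(g) - 1
    while len(r) - 1 >= dg:
        c = r[-1] / g[-1]
        shift = len(r) - len(g)
        for j, gj in enumerate(g):
            r[shift + j] -= c * gj
        r.pop()  # the leading coefficient is now exactly zero
    return r

def remainders(f, moduli):
    # simultaneous f rem g_i: reduce by subproducts while descending the tree
    if len(moduli) == 1:
        return [poly_rem(f, moduli[0])]
    mid = len(moduli) // 2
    left, right = moduli[:mid], moduli[mid:]
    def prod(ms):
        return ms[0] if len(ms) == 1 else poly_mul(prod(ms[:len(ms)//2]), prod(ms[len(ms)//2:]))
    return remainders(poly_rem(f, prod(left)), left) + \
           remainders(poly_rem(f, prod(right)), right)
```

With moduli of the form x − σ_i, the remainders are exactly the values f(σ_i), which is the multi-point evaluation used below.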

3. ABSTRACT MODULAR COMPOSITION IN THE SEPARABLE CASE

For any field 𝕂 and n ∈ ℕ, we denote

𝕂[x]_{<n} ≔ {P ∈ 𝕂[x] : deg P < n}.

In this section, 𝕂 represents an abstract algebraically closed field of constants. Let h = x^n + h_{n−1} x^{n−1} + ⋯ + h_0 ∈ 𝕂[x] be a separable monic polynomial, so h admits n pairwise distinct roots σ_1, …, σ_n in 𝕂. Then we may use the following algorithm for composition modulo h:

Algorithm 1

Input. Polynomials f, g ∈ 𝕂[x]_{<n} and pairwise distinct σ_1, …, σ_n ∈ 𝕂.
Output. g ∘ f rem h, where h = (x − σ_1) ⋯ (x − σ_n).

1. Compute v_1 = f(σ_1), …, v_n = f(σ_n) using fast multi-point evaluation.

2. Compute w_1 = g(v_1), …, w_n = g(v_n) using fast multi-point evaluation.

3. Retrieve ϱ ∈ 𝕂[x]_{<n} with ϱ(σ_1) = w_1, …, ϱ(σ_n) = w_n using fast interpolation.

4. Return ϱ.

THEOREM 1. Algorithm 1 is correct and requires O(M(n) log n) operations in 𝕂.

Proof. By construction, ϱ(σ_i) = (g ∘ f)(σ_i) = (g ∘ f rem h)(σ_i) for i = 1, …, n. Since deg ϱ < n and the σ_i are pairwise distinct, it follows that ϱ = g ∘ f rem h. This proves the correctness of the algorithm. The complexity bound follows from the fact that steps 1, 2 and 3 take O(M(n) log n) operations in 𝕂. □
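The three steps of Algorithm 1 can be sketched in exact rational arithmetic as follows. For simplicity, naive O(n^2) evaluation and Lagrange interpolation replace the fast subproduct tree routines, so only the correctness, not the O(M(n) log n) cost, is illustrated; all names are ours.

```python
from fractions import Fraction

def horner(p, x):
    # evaluate the coefficient list p (low degree first) at x
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc

def mul_linear(p, a):
    # multiply the polynomial p by (x - a)
    out = [Fraction(0)] + list(p)          # x * p
    for k, c in enumerate(p):
        out[k] -= a * c                    # minus a * p
    return out

def interpolate(xs, ys):
    # Lagrange interpolation: the unique polynomial of degree < n through (xs, ys)
    n = len(xs)
    out = [Fraction(0)] * n
    for i in range(n):
        basis, denom = [Fraction(1)], Fraction(1)
        for j in range(n):
            if j != i:
                basis = mul_linear(basis, xs[j])
                denom *= xs[i] - xs[j]
        scale = ys[i] / denom
        for k, c in enumerate(basis):
            out[k] += scale * c
    return out

def compose_mod(f, g, sigmas):
    # Algorithm 1: g o f rem h, with h = prod(x - sigma_i)
    v = [horner(f, s) for s in sigmas]     # step 1: evaluate f at the roots
    w = [horner(g, vi) for vi in v]        # step 2: evaluate g at the f(sigma_i)
    return interpolate(sigmas, w)          # step 3: interpolate back
```

For instance, with f = 1 + x, g = x^2 and roots 1, 2, 3, the result is (1 + x)^2, which already has degree < 3.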

We wish to apply the theorem in the case when 𝕂 = ℂ. Of course, on a Turing machine, we can only approximate complex numbers with arbitrarily high precision, and likewise for the field operations in ℂ. For given numbers x and y, approximations at precision p for x + y, x − y, x × y and x / y (whenever y ≠ 0) can all be computed in time O(I(p)). In view of Theorem 1, it is therefore natural to ask whether p-bit approximations of the coefficients of g ∘ f rem h may be computed in time O(I(p) M(n) log n).

In the remainder of this paper we give a positive answer to a carefully formulated version of this question. Our first task is to make the concept of "approximations at precision p" more precise and to understand the way errors accumulate when performing a sequence of computations at precision p. We rely on "fixed point ball arithmetic" for this matter, as described in the next section. At a second stage, we prove a complexity bound for modular composition that holds for a fixed modulus h with known roots σ_1, …, σ_n and for sufficiently large working precisions p.



The assumption that the roots σ_1, …, σ_n of h are known is actually quite harmless in this context for the following reason: as soon as approximations for σ_1, …, σ_n are known at a sufficiently high precision, the computation of even better approximations can be done fast using Newton's method combined with multi-point evaluation. Since we are only interested in the complexity for "sufficiently large working precisions", the computation of the initial approximations of σ_1, …, σ_n can therefore be regarded as a precomputation of negligible cost.

4. BALL ARITHMETIC AND STRAIGHT-LINE PROGRAMS

4.1. Fixed point numbers

Let a be a real number. We write ⌊a⌋ for the largest integer less than or equal to a, and ⌊a⌉ ≔ ⌊a + 1/2⌋ for the integer closest to a.

Given a precision p ∈ ℕ, we denote by 𝔻_p = ℤ 2^{−p} the set of fixed point numbers with p binary digits after the dot. This set 𝔻_p is clearly stable under addition and subtraction. We can also define an approximate multiplication ×_p on 𝔻_p using x ×_p y = ⌊2^p x y⌉ 2^{−p}, so |x ×_p y − x y| ⩽ 2^{−p−1} for all x, y ∈ 𝔻_p.

For any fixed constant K > 0 and x, y ∈ 𝔻_p ∩ [−K, K], we notice that x + y and x − y can be computed in time O(p), whereas x ×_p y can be computed in time I(p) + O(p). Similarly, one may define an approximate inversion ι_p on 𝔻_p ∖ {0} by ι_p(x) = ⌊2^p x^{−1}⌉ 2^{−p}. For any fixed constant K > 0 and x ∈ (𝔻_p ∖ {0}) ∩ [−K, K], we may compute ι_p(x) in time O(I(p)).
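In code, these fixed point operations reduce to integer mantissa manipulations. The following sketch (names ours) represents x ∈ 𝔻_p by the integer mantissa m with x = m · 2^{−p}:

```python
import math
from fractions import Fraction

def round_nearest(q):
    # floor(q + 1/2), i.e. the rounding bracket used in the text, for a rational q
    return math.floor(q + Fraction(1, 2))

def to_fixed(x, p):
    # nearest element of D_p = Z * 2^{-p}, returned as the integer mantissa
    return round_nearest(Fraction(x) * 2**p)

def mul_p(xm, ym, p):
    # x ×_p y = round(2^p * x * y) * 2^{-p} on mantissas; >> floors, so adding
    # 2^{p-1} first implements round-to-nearest (also for negative products)
    return (xm * ym + (1 << (p - 1))) >> p

def inv_p(xm, p):
    # iota_p(x) = round(2^p * x^{-1}) * 2^{-p} on mantissas (x != 0)
    return round_nearest(Fraction(2 ** (2 * p), xm))
```

Both approximate operations satisfy the stated error bound 2^{−p−1} by construction.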

4.2. Fixed point ball arithmetic

Ball arithmetic is used for providing reliable error bounds for approximate computations. A ball is a set ℬ(c, r) = {z ∈ ℝ : |z − c| ⩽ r} with c ∈ ℝ and r ⩾ 0. From the computational point of view, we represent such balls by their centers c and radii r. We denote by 𝔹_p the set of balls with centers in 𝔻_p and radii in 𝔻_p. Given vectors x = (x_1, …, x_n) ∈ ℝ^n and 𝒙 = (𝒙_1, …, 𝒙_n) = (ℬ(c_1, r_1), …, ℬ(c_n, r_n)) ∈ 𝔹_p^n, we write x ∈ 𝒙 to mean x_1 ∈ 𝒙_1 ∧ ⋯ ∧ x_n ∈ 𝒙_n, and we also set rad(𝒙) ≔ max(r_1, …, r_n).

Let D be an open subset of ℝ^n. We say that 𝑫_p ⊆ 𝔹_p^n is a domain lift of D at precision p if 𝒙 ⊆ D for all 𝒙 ∈ 𝑫_p. The maximal such lift is given by 𝑫_p = {𝒙 ∈ 𝔹_p^n : 𝒙 ⊆ D}. Given a function f : D → ℝ^m, a ball lift of f at precision p is a function 𝒇_p : 𝑫_p → 𝔹_p^m, where 𝑫_p = dom 𝒇_p is a domain lift of D at precision p, that satisfies the inclusion property: for any 𝒙 = (𝒙_1, …, 𝒙_n) ∈ 𝑫_p and x = (x_1, …, x_n) ∈ ℝ^n, we have

x ∈ 𝒙 ⟹ f(x) ∈ 𝒇_p(𝒙).

A ball lift 𝒇 of f is a computable sequence (𝒇_p)_{p∈ℕ} of ball lifts at every precision such that for any sequence (𝒙_p)_{p∈ℕ} with 𝒙_p ∈ dom 𝒇_p, we have

lim_{p→∞} rad(𝒙_p) = 0 ∧ ⋂_{p∈ℕ} 𝒙_p ≠ ∅ ⟹ lim_{p→∞} rad(𝒇_p(𝒙_p)) = 0.

This condition implies the following:

lim_{p→∞} rad(𝒙_p) = 0 ∧ ⋂_{p∈ℕ} 𝒙_p = {x} ⟹ ⋂_{p∈ℕ} 𝒇_p(𝒙_p) = {f(x)}.

We say that 𝒇 is maximal if dom 𝒇_p is the maximal domain lift for each p. Notice that a function f must be continuous in order to admit a maximal ball lift.

The following formulas define maximal ball lifts ⊕_p, ⊖_p and ⊗_p at precision p for the ring operations +, − and ×:

ℬ(a, r) ⊕_p ℬ(b, s) ≔ ℬ(a + b, r + s)
ℬ(a, r) ⊖_p ℬ(b, s) ≔ ℬ(a − b, r + s)
ℬ(a, r) ⊗_p ℬ(b, s) ≔ ℬ(a ×_p b, (|a| + r) ×_p s + |b| ×_p r + 2^{1−p}).


The extra 2^{1−p} in the formula for multiplication is needed in order to counter the effect of rounding errors that might occur in the three multiplications a ×_p b, (|a| + r) ×_p s and |b| ×_p r. For ℬ(a, r) ∈ 𝔹_p with r < |a|, the following formula also defines a maximal ball lift 𝜾_p at precision p for the inversion:

𝜾_p(ℬ(a, r)) ≔ ℬ(ι_p(a), ι_p(|a| − r) − ι_p(|a|) + 2^{1−p}).

For any fixed constant K > 0 and a, r, b, s ∈ 𝔻_p ∩ [−K, K], we notice that ℬ(a, r) ⊕_p ℬ(b, s), ℬ(a, r) ⊖_p ℬ(b, s), ℬ(a, r) ⊗_p ℬ(b, s) and 𝜾_p(ℬ(a, r)) can be computed in time O(I(p)).
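The ball product can be sketched on integer mantissas as follows: round-to-nearest for the center, upward rounding for the radius, plus the 2^{1−p} slack. All names are ours.

```python
from fractions import Fraction

def mul_nearest(am, bm, p):
    # round-to-nearest fixed point product on mantissas
    return (am * bm + (1 << (p - 1))) >> p

def mul_up(am, bm, p):
    # product rounded upward, as required for radius computations
    return -((-(am * bm)) >> p)

def ball_mul(a, r, b, s, p):
    # B(a,r) (x)_p B(b,s): center a x_p b,
    # radius (|a|+r) x_p s + |b| x_p r + 2^{1-p}
    c = mul_nearest(a, b, p)
    rad = mul_up(abs(a) + r, s, p) + mul_up(abs(b), r, p) + 2  # 2 units of 2^{-p}
    return c, rad
```

The inclusion property can be checked directly: any reals enclosed by the input balls have their true product enclosed by the output ball.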

Let 𝒇 be the ball lift of a function f : D → ℝ^m with D ⊆ ℝ^n. Consider a second ball lift 𝒈 of a function g : E → ℝ^l with f(D) ⊆ E ⊆ ℝ^m. Then we may define a ball lift 𝒈 ∘ 𝒇 of the composition g ∘ f : D → ℝ^l as follows. For each precision p, we take (𝒈 ∘ 𝒇)_p = 𝒈_p ∘ (𝒇_p)_{|D_p}, where (𝒇_p)_{|D_p} is the restriction of 𝒇_p to the set D_p = {𝒙 ∈ dom 𝒇_p : 𝒇_p(𝒙) ∈ dom 𝒈_p}.

We shall use ball arithmetic for the computation of complex functions ℂ^n → ℂ^m simply through the consideration of real and imaginary parts. This point of view is sufficient for the asymptotic complexity purposes of the present paper. Of course, it would be more efficient to directly compute with complex balls (i.e. balls with a complex center and a real radius), but this would involve approximate square roots and ensuing technicalities.

4.3. The Lipschitz property

Assume that we are given the ball lift 𝒇 of a function f : D → ℝ^m with D ⊆ ℝ^n. Given a subset U ⊆ D and constants λ ⩾ 0, μ ⩾ 0, we say that the ball lift 𝒇 is (λ, μ)-Lipschitz on U if

∃ p_0 ∈ ℕ, ∃ ϱ > 0, ∀ p ⩾ p_0, ∀ 𝒙 ∈ 𝔹_p^n,
𝒙 ⊆ U ∧ rad(𝒙) ⩽ ϱ ⟹ 𝒙 ∈ dom 𝒇_p ∧ rad(𝒇_p(𝒙)) ⩽ λ rad(𝒙) + μ 2^{−p}.

For instance, the ball lifts ⊕ and ⊖ of addition and subtraction are (2, 0)-Lipschitz on ℝ^2. Similarly, the ball lift ⊗ of multiplication is (3λ, 3)-Lipschitz on U = {(x, y) ∈ ℝ^2 : |x| ⩽ λ, |y| ⩽ λ} (by taking ϱ = λ), whereas the ball lift 𝜾 of ι is (λ, 3)-Lipschitz on U = {x ∈ ℝ : λ^{−1/2} ⩽ |x|}.

Given 𝒇 and λ > 0, μ ⩾ 0 as above, we say that 𝒇 is locally (λ, μ)-Lipschitz on U if 𝒇 is (λ, μ)-Lipschitz on each compact subset of U. We define 𝒇 to be λ-Lipschitz (resp. locally λ-Lipschitz) on U if there exists a constant μ > 0 for which 𝒇 is (λ, μ)-Lipschitz (resp. locally (λ, μ)-Lipschitz). If 𝒇 is locally λ-Lipschitz on U, then it is not hard to see that f is necessarily locally Lipschitz on U, with Lipschitz constant λ. That is,

∀ x ∈ U, ∃ η > 0, ∀ a, b ∈ ℬ(x, η) ∩ U, ‖f(b) − f(a)‖ ⩽ λ ‖b − a‖.

In fact, the requirement that a computable ball lift 𝒇 is λ-Lipschitz implies that we have a means to compute high quality error bounds. We finally define 𝒇 to be Lipschitz (resp. locally Lipschitz) on U if there exists a constant λ > 0 for which 𝒇 is λ-Lipschitz (resp. locally λ-Lipschitz).

LEMMA 2. Let 𝒇 be a locally (λ, μ)-Lipschitz ball lift of f : D → ℝ^m on an open set U. Let 𝒈 be a locally (λ′, μ′)-Lipschitz ball lift of g : E → ℝ^l on an open set V. If f(D) ⊆ E and f(U) ⊆ V, then 𝒈 ∘ 𝒇 is a locally (λ λ′, μ λ′ + μ′)-Lipschitz ball lift of g ∘ f on U.

Proof. Consider a compact subset C ⊆ U. Since this implies f(C) to be a compact subset of f(U) ⊆ V, it follows that there exists an ε > 0 such that f(C) + ℬ(0, ε) ⊆ V. Let p_0 ∈ ℕ, 0 < ϱ < (ε − μ 2^{−p_0})/λ and 0 < ϱ′ be such that for any p ⩾ p_0, 𝒙 ∈ 𝔹_p^n and 𝒚 ∈ 𝔹_p^m, we have

𝒙 ⊆ C ∧ rad(𝒙) ⩽ ϱ ⟹ 𝒙 ∈ dom 𝒇_p ∧ rad(𝒇_p(𝒙)) ⩽ λ rad(𝒙) + μ 2^{−p} < ε
(𝒚 ⊆ f(C) + ℬ(0, ε)) ∧ rad(𝒚) ⩽ ϱ′ ⟹ 𝒚 ∈ dom 𝒈_p ∧ rad(𝒈_p(𝒚)) ⩽ λ′ rad(𝒚) + μ′ 2^{−p}.



Given 𝒙 ∈ 𝔹_p^n with 𝒙 ⊆ C and rad(𝒙) ⩽ ϱ, it follows that 𝒚 ≔ 𝒇_p(𝒙) satisfies rad(𝒚) < ε, whence 𝒚 ⊆ f(C) + ℬ(0, ε). If we also assume that rad(𝒙) ⩽ (ϱ′ − μ 2^{−p})/λ, then it also follows that rad(𝒚) ⩽ ϱ′, whence 𝒚 ∈ dom 𝒈_p and rad(𝒈_p(𝒚)) ⩽ λ′ (λ rad(𝒙) + μ 2^{−p}) + μ′ 2^{−p} = λ λ′ rad(𝒙) + (μ λ′ + μ′) 2^{−p}. In other words, if 𝒙 ⊆ C and rad(𝒙) ⩽ min(ϱ, (ϱ′ − μ 2^{−p})/λ), then 𝒙 ∈ dom(𝒈_p ∘ 𝒇_p) and rad((𝒈_p ∘ 𝒇_p)(𝒙)) ⩽ λ λ′ rad(𝒙) + (μ λ′ + μ′) 2^{−p}. □

5. STRAIGHT-LINE PROGRAMS

A signature is a finite or countable set ℱ of function symbols together with an arity r_f ∈ ℕ for each f ∈ ℱ. A model for ℱ is a set K together with a function f_K : U_{f_K} → K with U_{f_K} ⊆ K^{r_f} for each f ∈ ℱ. If K is a topological space, then U_{f_K} is required to be an open subset of K^{r_f}. Let 𝒱 be a countable and ordered set of variable symbols.

A straight-line program Γ with signature ℱ is a sequence Γ_1, …, Γ_ℓ of instructions of the form

Γ_k ≡ X_k ≔ f_k(Y_{k,1}, …, Y_{k,r_{f_k}}),

where f_k ∈ ℱ and X_k, Y_{k,1}, …, Y_{k,r_{f_k}} ∈ 𝒱, together with a subset 𝒪_Γ ⊆ {X_1, …, X_ℓ} of output variables. Variables that appear for the first time in the sequence in the right-hand side of an instruction are called input variables. We denote by ℐ_Γ the set of input variables. The number ℓ is called the length of Γ.

There exist unique sequences I_1 < ⋯ < I_n and O_1 < ⋯ < O_m with ℐ_Γ = {I_1, …, I_n} and 𝒪_Γ = {O_1, …, O_m}. Given a model K of ℱ we can run Γ for inputs in K, provided that the arguments Y_{k,1}, …, Y_{k,r_{f_k}} are always in the domain of f_k when executing the instruction Γ_k. Let D_{Γ,K} be the set of tuples I = (I_1, …, I_n) ∈ K^n on which Γ can be run. Given I ∈ D_{Γ,K}, let Γ_K(I) ∈ K^m denote the value of (O_1, …, O_m) at the end of the program. Hence Γ gives rise to a function Γ_K : D_{Γ,K} → K^m.

Now assume that (ℝ, (f_ℝ)_{f∈ℱ}) is a model for ℱ and that we are given a ball lift 𝒇 of f_ℝ for each f ∈ ℱ. Then 𝔹_p is also a model for ℱ at each precision p, by taking f_{𝔹_p} = 𝒇_p for each f ∈ ℱ. Consequently, any SLP Γ as above gives rise to both a function Γ_ℝ : D_{Γ,ℝ} → ℝ^m and a ball lift Γ_{𝔹_p} : D_{Γ,𝔹_p} → 𝔹_p^m at each precision p. The sequence (Γ_{𝔹_p})_p thus provides us with a ball lift 𝚪 for Γ_ℝ.

PROPOSITION 3. If the ball lift 𝒇 of f_ℝ is Lipschitz for each f ∈ ℱ, then 𝚪 is again Lipschitz.

Proof. For each model K of ℱ, for each variable v ∈ 𝒱 and each input I = (I_1, …, I_n) ∈ D_{Γ,K}, let v_{K,k}(I) denote the value of v after step k. We may regard v_{K,k} as a function from D_{Γ,K} to K. In particular, we obtain a computable sequence of functions v_{𝔹_p,k} that give rise to a ball lift 𝒗^{(k)} of v_{ℝ,k}. Let us show by induction over k that 𝒗^{(k)} is Lipschitz for every v ∈ 𝒱. This is clear for k = 0, so let k > 0. If v ≠ X_k, then we have 𝒗^{(k)} = 𝒗^{(k−1)}; otherwise, we have

𝒗^{(k)} = 𝒇_k(𝒀_{k,1}^{(k−1)}, …, 𝒀_{k,r_{f_k}}^{(k−1)}).

In both cases, it follows from Lemma 2 that 𝒗^{(k)} is again a Lipschitz ball lift. We conclude by noticing that 𝚪 = (𝑶_1^{(ℓ)}, …, 𝑶_m^{(ℓ)}). □
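The way a single SLP induces a function over every model of its signature can be illustrated with a toy interpreter (names ours). Here plain floats and exact rationals play the role of two different models; a ball model would plug in the lifts ⊕_p, ⊗_p of section 4 for the operations.

```python
from fractions import Fraction

def run_slp(program, inputs, ops):
    # program: list of instructions (dest, op, args); ops: the model's operations
    env = dict(inputs)
    for dest, op, args in program:
        env[dest] = ops[op](*(env[a] for a in args))
    return env

# Gamma computes x^2 + 1 via the two instructions  y := x * x ;  o := y + one
gamma = [("y", "mul", ("x", "x")), ("o", "add", ("y", "one"))]
ops = {"mul": lambda a, b: a * b, "add": lambda a, b: a + b}
```

The same program text is run unchanged over either model; only the operation table (and the input values) changes.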

6. COMPUTABLE NUMBERS AND ULTIMATE COMPLEXITY

A real number x ∈ ℝ is said to be computable if there exists an approximation algorithm x̌ that takes p ∈ ℕ on input and produces x̌(p) ∈ 𝔻_p on output with |x − x̌(p)| ⩽ 2^{−p} (we say that x̌(p) is a 2^{−p}-approximation of x). We denote by ℝ^com the field of computable real numbers.

Let T(p) be a nondecreasing function. We say that a computable real number x ∈ ℝ^com has ultimate complexity T(p) if it admits an approximation algorithm x̌ that computes x̌(p) in time T(p + δ) for some fixed constant δ ∈ ℕ. The fact that we allow x̌(p) to be computed in time T(p + δ) and not T(p) is justified by the observation that the position of the "binary dot" is somewhat arbitrary in the approximation process of a computable number.
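For instance, √2 is computable: the following approximation algorithm (our notation) returns the mantissa of an element of 𝔻_p within 2^{−p} of √2, using only integer arithmetic.

```python
from fractions import Fraction
from math import isqrt

def sqrt2_approx(p):
    # mantissa m of m * 2^{-p} in D_p with |sqrt(2) - m * 2^{-p}| <= 2^{-p}:
    # m = floor(sqrt(2 * 4^p)) = floor(2^p * sqrt(2))
    return isqrt(2 << (2 * p))
```

Since m = ⌊2^p √2⌋, the error |√2 − m 2^{−p}| is strictly below 2^{−p}, as required.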


The notion of approximation algorithm generalizes to vectors with real coefficients: given v ∈ (ℝ^com)^n, an approximation algorithm for v as a whole is an algorithm v̌ that takes p ∈ ℕ on input and returns v̌(p) ∈ 𝔻_p^n on output with |v̌(p)_i − v_i| ⩽ 2^{−p} for i = 1, …, n. This definition naturally extends to any other mathematical objects that can be encoded by vectors of real numbers: complex numbers (by their real and imaginary parts), polynomials and matrices (by their vectors of coefficients), etc. The notion of ultimate complexity also extends to any of these objects.

A ball lift 𝒇 is said to be computable if there exists an algorithm for computing 𝒇_p for all p ∈ ℕ. A computable ball lift 𝒇 of a function f : D → ℝ^m with D ⊆ ℝ^n allows us to compute the restriction of f to D ∩ (ℝ^com)^n: given x ∈ D ∩ (ℝ^com)^n with approximation algorithm x̌, by taking 𝒙_p = ℬ(x̌(p), 2^{−p}) ∈ 𝔹_p^n, we have

⋂_{p∈ℕ} 𝒙_p = {x}, ⋂_{p∈ℕ} 𝒇_p(𝒙_p) = {f(x)}, and lim_{p→∞} rad(𝒇_p(𝒙_p)) = 0.

Let F be a nondecreasing function and assume that D is open. We say that 𝒇 has ultimate complexity F(p) if for every compact set C ⊆ D, there exist constants p_0 ∈ ℕ, ϱ > 0 and δ ∈ ℕ such that for any p ⩾ p_0 and 𝒙_p ∈ dom 𝒇_p with 𝒙_p ⊆ C and rad(𝒙_p) ⩽ ϱ, we can compute 𝒇_p(𝒙_p) in time F(p + δ). For instance, ⊕ and ⊖ have ultimate complexity O(p), whereas ⊗ and 𝜾 have ultimate complexity O(I(p)).

PROPOSITION 4. Assume that 𝒇 is locally Lipschitz. If 𝒇 has ultimate complexity F(p) and x ∈ D ∩ (ℝ^com)^n has ultimate complexity T(p), then f(x) has ultimate complexity T(p) + F(p).

Proof. Let x̌ be an approximation algorithm for x of complexity T(p + δ), where δ ∈ ℕ. There exist p_0 ∈ ℕ and a compact ball C around x with C ⊆ dom f and such that 𝒙_p = ℬ(x̌(p), 2^{−p}) ∈ 𝔹_p^n is included in C for all p ⩾ p_0. There also exists a constant δ′ ∈ ℕ such that 𝒇_p(𝒙_p) can be computed in time F(p + δ′) for all p ⩾ p_0. Since 𝒇 is locally Lipschitz, there exists yet another constant δ″ ∈ ℕ such that rad(𝒇_p(𝒙_p)) ⩽ 2^{δ″−p} for p ⩾ p_0. For q = p − δ″ ⩾ max(p_0 − δ″, 0) and δ‴ = max(δ, δ′) + δ″, this shows that we may compute a 2^{−q}-approximation of f(x) in time T(q + δ‴) + F(q + δ‴). □

PROPOSITION 5. Assume that 𝒇 and 𝒈 are two locally Lipschitz ball lifts of f and g that can be composed. If 𝒇 and 𝒈 have respective ultimate complexities F(p) and G(p), then 𝒈 ∘ 𝒇 has ultimate complexity F(p) + G(p).

Proof. In a similar way as in the proof of Lemma 2, the evaluation of (𝒈 ∘ 𝒇)_p(𝒙_p) for 𝒙_p ∈ dom 𝒇_p with 𝒙_p ⊆ C and rad(𝒙_p) ⩽ ϱ boils down to the evaluation of 𝒇_p at 𝒙_p and the evaluation of 𝒈_p at 𝒚_p ≔ 𝒇_p(𝒙_p) ⊆ C′ ≔ f(C) + ℬ(0, ε) with rad(𝒚_p) ⩽ ϱ′. Modulo a further lowering of ϱ and ϱ′ if necessary, these evaluations can be done in time F(p + δ) and G(p + δ′) for suitable δ, δ′ ∈ ℕ and sufficiently large p. □

THEOREM 6. Assume that ℝ is a model for the function symbols ℱ, and that we are given a computable ball lift 𝒇 of f_ℝ for each f ∈ ℱ. For each f ∈ ℱ, assume in addition that 𝒇 is locally Lipschitz, and let F_f be a nondecreasing function such that 𝒇 has ultimate complexity F_f(p). Let Γ = Γ_1, …, Γ_ℓ be an SLP over ℱ whose k-th instruction Γ_k writes X_k ≔ f_k(Y_{k,1}, …, Y_{k,r_{f_k}}). Then, the ball lift 𝚪 of Γ_ℝ has ultimate complexity

F_Γ(p) ≔ F_{f_1}(p) + ⋯ + F_{f_ℓ}(p).

Proof. This is a direct consequence of Proposition 5. □

COROLLARY 7. Let Γ be an SLP of length ℓ over ℱ = {0, 1, +, −, ×, ι} (where 0 and 1 are naturally seen as constant functions of arity zero). Then, there exists a ball lift 𝚪 of Γ_ℝ with ultimate complexity O(I(p) ℓ).

Proof. We use the ball lifts of section 4 for each f ∈ {+, −, ×, ι}: they are locally Lipschitz and computable with ultimate complexity O(I(p)). We may thus apply Theorem 6 to obtain F_Γ(p) = O(I(p) ℓ). □



7. ULTIMATE MODULAR COMPOSITION FOR SEPARABLE MODULI

LEMMA 8. There exists a constant κ > 0 such that the following assertion holds. Let f, g ∈ ℂ^com[x]_{<n}, let σ_1, …, σ_n be pairwise distinct elements of ℂ^com, and let h = (x − σ_1) ⋯ (x − σ_n). Assume that (f_0, …, f_{n−1}, g_0, …, g_{n−1}, σ_1, …, σ_n) has ultimate complexity T(n, p). Then ϱ = g ∘ f rem h has ultimate complexity T(n, p) + κ I(p) M(n) log n.

Proof. The algorithm for fast multi-point evaluation of a polynomial P = ∑_{i=0}^{n−1} P_i x^i ∈ 𝕂[x]_{<n} at ξ_1, …, ξ_n ∈ 𝕂 can be regarded as an SLP over ℱ = {0, 1, +, −, ×, ι} of length O(M(n) log n) that takes (P_0, …, P_{n−1}, ξ_1, …, ξ_n) ∈ 𝕂^{2n} on input and that produces (P(ξ_1), …, P(ξ_n)) ∈ 𝕂^n on output. Similarly, the algorithm for interpolation can be regarded as an SLP over ℱ of length O(M(n) log n) that takes (ξ_1, …, ξ_n, v_1, …, v_n) ∈ 𝕂^{2n} on input and that produces (P_0, …, P_{n−1}) ∈ 𝕂^n on output with v_1 = P(ξ_1), …, v_n = P(ξ_n). Altogether, we may regard the entire Algorithm 1 as an SLP Γ over ℱ of length O(M(n) log n) that takes (f_0, …, f_{n−1}, g_0, …, g_{n−1}, σ_1, …, σ_n) ∈ 𝕂^{3n} on input and that produces (ϱ_0, …, ϱ_{n−1}) ∈ 𝕂^n on output with ϱ = g ∘ f rem h = ∑_{i=0}^{n−1} ϱ_i x^i ∈ 𝕂[x]_{<n}. It follows from Corollary 7 that Γ admits a ball lift 𝚪 of ultimate complexity O(I(p) M(n) log n). The conclusion now follows from Proposition 4. □

According to the above lemma, we notice that the actual time complexity for computing ϱ = g ∘ f rem h is T(n, p + δ) + κ I(p + δ) M(n) log n for some constant δ that depends on n, f, g, and the σ_i.

LEMMA 9. There exists a constant κ > 0 such that the following assertion holds. Let h ∈ ℂ^com[x] be separable and monic of degree n, and denote the roots of h by σ = (σ_1, …, σ_n). If h has ultimate complexity T(n, p), then σ has ultimate complexity T(n, p) + κ I(p) M(n) log n.

Proof. There are many algorithms for the certified computation of the roots of a separable complex polynomial. We may use any of these algorithms as a "fall-back" algorithm in the case that we only need a 2^{−p}-approximation of σ at a low precision p determined by h only.

For general precisions p, we use the following strategy in order to compute a ball 𝝈 ∈ 𝔹_p^n with σ ∈ 𝝈 and rad(𝝈) ⩽ 2^{−αp} for some fixed threshold 1/2 < α < 1. For some suitable p_0 ∈ ℕ and p ⩽ p_0, we use the fall-back algorithm. For p > p_0 and for a second fixed constant 1/2 < β < 1, we first compute a ball enclosure 𝝉 ∈ 𝔹_q^n at the lower precision q = ⌈β p⌉ using a recursive application of the method. We next compute 𝝈 using a ball version of the Newton iteration, as explained below.

If this yields a ball 𝝈 with acceptable radius rad(𝝈) ⩽ 2^{−αp}, then we are done. Otherwise, we resort to our fall-back method. Such calls of the fall-back method only occur if the default threshold precision p_0 was chosen too low. Nevertheless, we will show that there exists a threshold p_1 such that the 𝝈 computed by the Newton iteration always satisfies rad(𝝈) ⩽ 2^{−αp} for p ⩾ p_1.

Let us detail how we perform our ball version of the Newton iteration. Recall that 𝝉 ∈ 𝔹_q^n with σ ∈ 𝝉 and rad(𝝉) ⩽ 2^{−αβp} is given. We also assume that we computed once and for all a 2^{−p}-approximation of h, in the form of a ball polynomial 𝒉_p ∈ 𝔹_p[i][x] of radius 2^{−p} that contains h. Now we evaluate 𝒉_p and 𝒉′_p at each of the points 𝝉_1, …, 𝝉_n using fast multi-point evaluation. Let us denote the results by 𝒗 = 𝒉_p(𝝉) and 𝒘 = 𝒉′_p(𝝉). Let τ, v and w denote the balls with radius zero whose centers are the same as those of 𝝉, 𝒗 and 𝒘. Using vector notation, the Newton iteration now becomes:

𝝈 = (τ ⊖_p 𝜾_p(w) ⊗_p v) ⊕_p (1 ⊖_p 𝜾_p(w) ⊗_p 𝒘) ⊗_p (𝝉 ⊖_p τ).

If σ ∈ 𝝉, then it is well-known [17, 23] that σ ∈ 𝝈. Since rad(𝝉) ⩽ 2^{−αβp}, the fact that multi-point ball evaluation (used for 𝒉_p and 𝒉′_p) is locally Lipschitz implies the existence of a constant δ > 0 with rad(𝒗) ⩽ 2^{δ−αβp} and rad(𝒘) ⩽ 2^{δ−αβp}. Since h′(σ_i) ≠ 0 for i = 1, …, n, there also exists a constant δ′ > 0 with 1 − 𝜾_p(w) 𝒘 ⊆ ℬ(0, 2^{δ′−αβp}). Altogether, this means that there exists a constant δ″ > 0 with rad(𝝈) ⩽ 2^{δ″−2αβp}. Let p_1 = ⌈δ″/(α (2β − 1))⌉. Then for any p ⩾ p_1, the Newton iteration provides us with a 𝝈 with rad(𝝈) ⩽ 2^{−αp}.
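A scalar analogue of this ball Newton step can be sketched with exact interval arithmetic. This is only an illustration under simplifying assumptions (our names; the true algorithm works at fixed point precision p on all roots at once via multi-point evaluation, and we assume here that the center evaluation w of the derivative is positive):

```python
from fractions import Fraction

def horner(p, x):
    # evaluate the coefficient list p (low degree first) at x
    acc = Fraction(0)
    for c in reversed(p):
        acc = acc * x + c
    return acc

def interval_horner(p, lo, hi):
    # crude interval evaluation of p on [lo, hi]
    alo = ahi = Fraction(0)
    for c in reversed(p):
        prods = [alo * lo, alo * hi, ahi * lo, ahi * hi]
        alo, ahi = min(prods) + c, max(prods) + c
    return alo, ahi

def ball_newton_step(h, dh, c, r):
    # One Krawczyk-style step for a simple root inside [c - r, c + r]:
    # sigma' = (c - v/w) + (1 - W/w) * [-r, r], with v = h(c), w = h'(c) > 0,
    # and W an interval enclosure of h' on the ball
    v, w = horner(h, c), horner(dh, c)
    Wlo, Whi = interval_horner(dh, c - r, c + r)
    m = max(abs(1 - Whi / w), abs(1 - Wlo / w))  # max |1 - W/w|
    return c - v / w, m * r                       # new center and radius
```

When the enclosure is tight and the root is simple, the radius shrinks roughly quadratically, which is the behavior the proof exploits.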


Let us now analyze the ultimate complexity C(n, p) of our algorithm. For large p ⩾ p_1, the algorithm essentially performs two multi-point evaluations of ultimate cost κ′ I(p) M(n) log n for some constant κ′ that does not depend on p, and a recursive call at precision ⌈β p⌉. Consequently,

C(n, p) ⩽ κ′ I(p) M(n) log n + C(n, ⌈β p⌉).

We finally obtain another constant κ ⩾ κ′ such that

C(n, p) ⩽ κ I(p) M(n) log n,

by summing up the geometric progression and using the fact that I(p)/p is nondecreasing. The conclusion now follows from Proposition 4. □

Remark 10. A remarkable feature of the above proof is that the precision p_1 at which the Newton iteration can safely be used does not need to be known in advance. In particular, the proof does not require any a priori knowledge about the Lipschitz constants.

THEOREM 11. There exists a constant κ > 0 such that the following assertion holds. Let f, g ∈ ℂ^com[x]_{<n} and let h ∈ ℂ^com[x] be separable and monic of degree n. Assume that (f, g, h) has ultimate complexity T(n, p). Then ϱ = g ∘ f rem h has ultimate complexity T(n, p) + κ I(p) M(n) log n.

Proof. This is an immediate consequence of the combination of the two above lemmas. □

8. CONCLUSION AND FINAL REMARKS

With some more work, we expect that all above bounds of the form O(I(p) M(n) log n) can be lowered to O(I(n p) log n). Notice that I(n p) log n = O(I(p) n log n) for p ⩾ n, when taking I(p) = Θ(p log p 8^{log* p}) [7]. In order to prove this stronger bound using our framework, one might add an auxiliary operation ×_{[n]} for the product of two polynomials of degrees < n to the signature ℱ. Polynomial products of this kind can be implemented for coefficients in 𝔻_p[i] with p ⩾ n using Kronecker substitution. For bounded coefficients, this technique allows for the computation of one such product in time O(I(n p)). By using Theorem 6, a standard complexity analysis should show that multi-point evaluation and interpolation have ultimate complexity O(I(n p) log n).
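Kronecker substitution packs the coefficients of a polynomial into a single large integer, so that one integer product yields the whole polynomial product. A sketch for nonnegative integer coefficients (names ours; the slot width k must exceed the bit size of every coefficient of the product, so that no carries leak between slots):

```python
def kronecker_mul(a, b, k):
    # pack the coefficient lists a, b (low degree first) into k-bit slots
    A = sum(c << (i * k) for i, c in enumerate(a))
    B = sum(c << (i * k) for i, c in enumerate(b))
    C = A * B                                   # one big integer multiplication
    mask = (1 << k) - 1
    # unpack: slot i of C holds the i-th coefficient of the product
    return [(C >> (i * k)) & mask for i in range(len(a) + len(b) - 1)]
```

For example, (1 + 2x + 3x^2)(4 + 5x) = 4 + 13x + 22x^2 + 15x^3, recovered from a single integer product.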

By Theorem 11, the actual bit complexity of modular composition is of the form T(n, p + δ) + κ I(p + δ) M(n) log n for some value of δ that depends on f, g, h (hence on n). An interesting problem is to get a better grip on this value δ, which mainly depends on the geometric proximity of the roots of h.

If f, g, h belong to ℚ[x], then T(n, p) = O(n I(p)) and we may wish to bound δ as a function of n and the maximum bit size l of the coefficients of f, g and h. This would involve bit complexity results for root isolation [18, 19, 24], for multi-point evaluation, and for interpolation. The overall complexity should then be compared with the maximal size of the output, namely g ∘ f rem h, which is in general much larger than the input size.

If h is not separable, but if a separable decomposition is known, then the techniques developed in this paper could be combined with Ritzmann's algorithm for the composition of formal power series [22]. If such a separable decomposition is not known, then it is an interesting problem to obtain a general algorithm for modular composition with a similar complexity (but this seems far beyond the scope of this paper).

BIBLIOGRAPHY

[1] D. Bernstein. Scaled remainder trees. Available from https://cr.yp.to/arith/scaledmod-20040820.pdf, 2004.



[2] A. Bostan, G. Lecerf, and É. Schost. Tellegen's principle into practice. In Hoon Hong, editor, Proceedings of the 2003 International Symposium on Symbolic and Algebraic Computation, ISSAC '03, pages 37–44, New York, NY, USA, 2003. ACM.

[3] R. P. Brent and H. T. Kung. Fast algorithms for manipulating formal power series. J. ACM, 25(4):581–595, 1978.

[4] P. Bürgisser, M. Clausen, and M. A. Shokrollahi. Algebraic complexity theory, volume 315 of Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, 1997.

[5] D. G. Cantor and E. Kaltofen. On fast multiplication of polynomials over arbitrary algebras. Acta Infor., 28(7):693–701, 1991.

[6] J. von zur Gathen and J. Gerhard.Modern computer algebra. Cambridge University Press, New York, 2 edition, 2003.

[7] D. Harvey, J. van der Hoeven, and G. Lecerf. Even faster integer multiplication.J. Complexity, 36:1–30, 2016.

[8] J. van der Hoeven. Fast composition of numeric power series. Technical Report 2008-09, Université Paris-Sud, Orsay, France, 2008.

[9] J. van der Hoeven. Ball arithmetic. Technical report, CNRS & École polytechnique, 2011. https://hal.archives- ouvertes.fr/hal-00432152/.

[10] J. van der Hoeven. Faster Chinese remaindering. Technical report, CNRS & École polytechnique, 2016. http://

hal.archives-ouvertes.fr/hal-01403810.

[11] J. van der Hoeven and G. Lecerf. Modular composition via factorization. Technical report, CNRS & École polytechnique, 2017.http://hal.archives-ouvertes.fr/hal-01457074.

[12] Xiaohan Huang and V. Y. Pan. Fast rectangular matrix multiplication and applications. J. Complexity, 14(2):257–299, 1998.

[13] E. Kaltofen and V. Shoup. Fast polynomial factorization over high algebraic extensions of finite fields. In Proceedings of the 1997 International Symposium on Symbolic and Algebraic Computation, ISSAC '97, pages 184–188, New York, NY, USA, 1997. ACM.

[14] E. Kaltofen and V. Shoup. Subquadratic-time factoring of polynomials over finite fields. Math. Comp., 67(223):1179–1197, 1998.

[15] K. S. Kedlaya and C. Umans. Fast modular composition in any characteristic. InFOCS'08: IEEE Conference on Foundations of Computer Science, pages 146–155, Washington, DC, USA, 2008. IEEE Computer Society.

[16] K. S. Kedlaya and C. Umans. Fast polynomial factorization and modular composition. SIAM J. Comput., 40(6):1767–1802, 2011.

[17] R. Krawczyk. Newton-Algorithmen zur Bestimmung von Nullstellen mit Fehler-schranken. Computing, 4:187–201, 1969.

[18] C. A. Neff and J. H. Reif. An efficient algorithm for the complex roots problem. J. Complexity, 12(2):81–115, 1996.

[19] V. Y. Pan. Univariate polynomials: nearly optimal algorithms for numerical factorization and root-finding. J.

Symbolic Comput., 33(5):701–733, 2002.

[20] C. H. Papadimitriou.Computational Complexity. Addison-Wesley, 1994.

[21] M. S. Paterson and L. J. Stockmeyer. On the number of nonscalar multiplications necessary to evaluate polyno- mials.SIAM J.Comput., 2(1):60–66, 1973.

[22] P. Ritzmann. A fast numerical algorithm for the composition of power series with complex coefficients.Theoret.

Comput. Sci., 44:1–16, 1986.

[23] S. M. Rump.Kleine Fehlerschranken bei Matrixproblemen.PhD thesis, Universität Karlsruhe, 1980.

[24] A. Schönhage. The fundamental theorem of algebra in terms of computational complexity. Technical report, Preliminary Report of Mathematisches Institut der Universität Tübingen, Germany, 1982.
