
Sparse polynomial interpolation. Exploring fast heuristic algorithms over finite fields




HAL Id: hal-02382117

https://hal.archives-ouvertes.fr/hal-02382117

Preprint submitted on 27 Nov 2019


To cite this version:

Joris van der Hoeven, Grégoire Lecerf. Sparse polynomial interpolation. Exploring fast heuristic algorithms over finite fields. 2019. ⟨hal-02382117⟩


Sparse polynomial interpolation. Exploring fast heuristic algorithms over finite fields

JORIS VAN DER HOEVEN^{a,b,c}, GRÉGOIRE LECERF^{b,d}

a. CNRS (UMI 3069, PIMS), Department of Mathematics, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia V5A 1S6, Canada

b. CNRS, École polytechnique, Institut Polytechnique de Paris, Laboratoire d'informatique de l'École polytechnique (LIX, UMR 7161), 1, rue Honoré d'Estienne d'Orves, Bâtiment Alan Turing, CS35003, 91120 Palaiseau, France

c. Email: vdhoeven@lix.polytechnique.fr
d. Email: lecerf@lix.polytechnique.fr

Work in progress, draft version of November 27, 2019

Consider a multivariate polynomial f ∈ K[x_1, . . . , x_n] over a field K, which is given through a black box capable of evaluating f at points in K^n, or possibly at points in A^n for any K-algebra A. The problem of sparse interpolation is to express f in its usual form with respect to the monomial basis. We analyze the complexity of various old and new algorithms for this task in terms of bounds D and T for the total degree of f and its number of terms. We mainly focus on the case when K is a finite field and explore possible speed-ups under suitable heuristic assumptions.

1. INTRODUCTION

Consider a polynomial function f : K^n → K over a field K given through a black box capable of evaluating f at points in K^n. The problem of sparse interpolation is to recover the representation of f ∈ K[x_1, . . . , x_n] in its usual form, as a linear combination

    f = ∑_{i_1, . . . , i_n} f_{i_1, . . . , i_n} x_1^{i_1} ⋯ x_n^{i_n}    (1)

of monomials. The aim of this paper is to analyze various approaches for solving this problem, with our primary focus on the case when K is a finite field. We will survey and synthesize known algorithms, but we will also present a few new algorithms, together with improved complexity bounds for some important special cases.

We explore various methods under heuristic conditions that we expect to fairly reflect average behavior in practice. We preferred a relaxed and intuitive style of exposition to mathematically precise theorems with rigorous proofs.



Efficient algorithms for the task of sparse interpolation go back as far as the eighteenth century and the work of Prony [41]. The first modern version of the algorithm is due to Ben-Or and Tiwari [8]. This method was swiftly embraced in computer algebra [11, 28, 32, 34, 36, 37, 39]; for early implementations, we refer to [13, 14]. There has been renewed interest in the problem during the last decade, both from a theoretical perspective [2, 3, 4, 15, 16, 29, 30, 33] and from the practical point of view [25, 26, 27, 31, 35]. We also mention the survey paper [43] by Roche on the more general topic of computations with sparse polynomials.

1.1. Complexity considerations

Throughout this paper, d will stand for the total degree of f and t for the number of non-zero terms in (1). Whenever available, the uppercase characters D and T represent upper bounds for d and t. We will also write L for the number of ring or field operations in K that are required in order to evaluate f.

The complexity analysis of sparse interpolation has to be carried out with a lot of care, due to the large variety of cases that can occur:

• What kind of complexity/evaluation model do we use?

  ∘ Do we count the number of operations in K or the number of bit operations?

  ∘ Are we interested in theoretic (asymptotic) or practical complexity?

  ∘ Are divisions allowed for the evaluation of f and how do we count them?

  ∘ Are we only allowed to evaluate f at points in K^n, or also at points in K̂^n for certain extension rings or fields K̂?

• What kind of coefficient field K do we use?

  ∘ A field from analysis such as ℂ.

  ∘ A discrete field such as ℚ or a finite field 𝔽_q.

  ∘ Fields with roots of unity 𝜔 of large smooth order in K.

• The univariate case (n = 1) versus the multivariate case (n > 1).

• Informally speaking, there are three levels of “sparsity”:

  ∘ Weakly sparse: total degrees d of the order O(log t).

  ∘ Normally sparse: total degrees d of the order t^{O(1)}.

  ∘ Super sparse: total degrees d with log t = o(log d).

We also notice that almost all general algorithms for sparse interpolation are probabilistic of Monte Carlo type. Indeed, without further a priori knowledge about f, such as its support or its number of terms, the mere knowledge of a finite number of evaluations of f only allows us to guess plausible expressions for f.

In this paper, we will be mostly interested in the practical bit complexity of sparse interpolation over finite fields K = 𝔽_q. Sparse interpolation over the rational numbers can often be reduced to this case as well, in which case q = p is a well chosen prime number that fits into 32 or 64 bits and such that p − 1 admits a large smooth divisor; see section 6.5.

We analyze the complexities of specializations of existing algorithms to the finite field case and also present a few new algorithms and tricks for this specific setting. Due to the large number of cases that can occur, we will not prove detailed complexity bounds for every single case, but rather outline how various ideas may be used and combined to reduce the practical complexity.

(4)

From our practical perspective, it is important to take into account logarithmic factors in complexity bounds, but it will be convenient to ignore sublogarithmic factors. For this reason, we use the ad hoc notation

    𝜑 = O(𝜓) ⟺_def 𝜑 = O(𝜓 (log 𝜓)^{o(1)} (log log(TDq))^{O(1)})

for any functions 𝜑, 𝜓.

We will also write M_K(d) for the bit cost of multiplying two polynomials of degree d over K, and we abbreviate M_q(d) ≔ M_{𝔽_q}(d). For instance, the naive multiplication algorithm yields M_q(d) = O(d^2 log^2 q). For our complexity analyses, we will give priority to the asymptotic complexity point of view and use the well known [20, 44] bound M_q(d) = O(d log d log q).

1.2. Overview of the paper

Many of the challenges concerning sparse interpolation already arise in the univariate case when n = 1. As we will see in section 7.1, the multivariate case can actually be reduced to the univariate one using the technique called “Kronecker segmentation”, even though other approaches may be more efficient. For this reason, a large part of the paper is devoted to methods for interpolating a univariate black box function f(x).

We distinguish three major types of algorithms:

• Cyclic extension methods (section 4).

• Geometric progression methods (section 5).

• FFT based methods (section 6).

For the first two types of methods, we mostly review existing algorithms, although we do propose some new variants and optimizations. The third, FFT based, method is new, as far as we are aware. For each of the three methods, an important leitmotif is to evaluate f(x) modulo x^r − 1 for one or more suitable orders r, after which we reconstruct f(x) from these modular projections.

Cyclic extension methods directly evaluate f over the cyclic extension ring K[x]/(x^r − 1). This has the advantage that r can be freely chosen in a suitable range. However, the evaluation of f over such a large cyclic extension induces a non-trivial overhead in the dependency of the complexity on L.
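To make this concrete, here is a minimal Python sketch of such a cyclic evaluation (all names are ours and purely illustrative): a black box built from ring operations is simply run on the image of x in 𝔽_p[x]/(x^r − 1), which directly produces f rem (x^r − 1). A real implementation would use FFT based multiplication to reach the cost M_q(r) per product, rather than the naive O(r^2) convolution below.

```python
# Minimal sketch: evaluating a black-box polynomial over F_p[x]/(x^r - 1).
# Illustrative only; naive O(r^2) cyclic convolution instead of FFTs.

class Cyclic:
    """Elements of F_p[x]/(x^r - 1), stored as length-r coefficient lists."""
    def __init__(self, coeffs, p, r):
        self.p, self.r = p, r
        self.c = [(coeffs[i] if i < len(coeffs) else 0) % p for i in range(r)]

    def __add__(self, other):
        return Cyclic([a + b for a, b in zip(self.c, other.c)], self.p, self.r)

    def __mul__(self, other):
        prod = [0] * self.r                      # exponents wrap modulo r
        for i, a in enumerate(self.c):
            if a:
                for j, b in enumerate(other.c):
                    prod[(i + j) % self.r] = (prod[(i + j) % self.r] + a * b) % self.p
        return Cyclic(prod, self.p, self.r)

    def __pow__(self, e):                        # square and multiply
        result, base = Cyclic([1], self.p, self.r), self
        while e:
            if e & 1:
                result = result * base
            base, e = base * base, e >> 1
        return result

def f_blackbox(x):                               # example black box for
    one = Cyclic([1], x.p, x.r)                  # f = x^95 + 3 x^7 + 2
    return x**95 + x**7 + x**7 + x**7 + one + one

p, r = 101, 10
x = Cyclic([0, 1], p, r)                         # the image of x
print(f_blackbox(x).c)                           # f rem (x^10 - 1) = 3x^7 + x^5 + 2
```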

Geometric progression methods rather evaluate f at a sequence 1, 𝜔, . . . , 𝜔^{2T−1} of pairwise distinct elements in K (or inside an extension of K of modest degree s). If K = 𝔽_q is a finite field, then 𝜔 necessarily has an order r that divides q − 1 (or q^s − 1 when working over an extension of degree s). Although the evaluations of f become more efficient using this approach, the recovery of f(x) modulo x^r − 1 from f(1), f(𝜔), . . . , f(𝜔^{2T−1}) requires extra work. The cost of this extra work highly depends on the kind of orders r that can be taken as divisors of q − 1 (or q^s − 1 for small s). Theoretically speaking, the existence of suitable orders r is a delicate issue; in practice, they always tend to exist as long as D = T^{O(1)}; see sections 2, 6.3 and 6.4 for some empirical evidence.

Geometric progression methods allow us to take r much larger than T, but they involve a non-trivial cost for recovering f(x) modulo x^r − 1 from f(1), f(𝜔), . . . , f(𝜔^{2T−1}). If L = o((log T)^3), then this cost may even dominate the cost of the evaluations of f. In such situations, an alternative approach is to evaluate f at 1, 𝜔, . . . , 𝜔^{r−1} and to recover f(x) modulo x^r − 1 using one inverse DFT of length r. However, this puts an even larger strain on the choice of r, since it is important that r = O(T) for this approach to be efficient.


Method                  Complexity bound                                Conditions
Cyclic extension        O(L T log D log(qT))                            H1 and log D = O(T)
Geometric progression   O((L + (log T)^3) T (log D/log T) log(qT))      H1, H2, and D = T^{O(1)}
FFT                     O((L + log T) T (log D/log T) log(qT))          H1, H2, and D = T^{O(1)}

Table 1. Heuristic complexity bounds for the three main approaches to sparse interpolation.

Moreover, the recovery of f(x) from its reductions modulo x^r − 1 for several orders r of this type is more delicate and based on a probabilistic analysis. Yet, suitable orders r again tend to exist in practice as long as D = T^{O(1)}.

The expected complexities of the best versions of the three approaches are summarized in Table 1. These bounds rely on two types of heuristics:

H1. For r ⩽ d, the exponents of f are essentially randomly distributed modulo r.

H2. For s = O(log D/log T), the number q^s − 1 admits a large smooth divisor.

We will present a more precise version of H1 in section 4. The heuristic H2 will be made precise in sections 5.6 and 6.2, and numeric evidence is provided in sections 2, 6.3 and 6.4.

The last section 7 is dedicated to the interpolation of multivariate polynomials. We start with the well known strategies based on Kronecker segmentation (section 7.1) and prime factorization (section 7.2). For sparse polynomials in many variables, but with a modest total degree d, we also recall the inductive approach by packets of coordinates in section 7.3. If T < q, then geometric progression and FFT based methods should be particularly favorable in combination with this inductive approach, since one can often avoid working over extensions of K in this case. We conclude section 7 with a few algorithms for special situations.

2. PRELIMINARIES ON FINITE FIELDS

One remarkable feature of the finite field 𝔽_q with q elements is that every a ∈ 𝔽_q satisfies the equation a^q = a. In particular, for any sparse polynomial f as in (1) with coefficients in 𝔽_q, the polynomial f takes the same values as

    ∑_{i_1, . . . , i_n} f_{i_1, . . . , i_n} x_1^{i_1 rem (q−1)} ⋯ x_n^{i_n rem (q−1)}

for x_1, . . . , x_n ∈ 𝔽_q, where “rem” stands for the remainder of a Euclidean division. In other words, the exponents of f are only determined modulo q − 1, so we may assume without loss of generality that they all lie in {0, . . . , q − 2} and that the total degree d of f satisfies d ⩽ n (q − 2).

On the other hand, in the case that our black box function f can be evaluated not only over 𝔽_q, but also over field extensions 𝔽_{q^s} (this is typically the case if f is given by an expression or a directed acyclic graph (dag)), then the exponents in the expression (1) can be general non-negative integers, but the above remark shows that we will crucially need to evaluate over extension fields 𝔽_{q^s} with s > 1 in order to recover exponents that exceed q.

More generally, if we choose to evaluate f only at points (a_1, . . . , a_n) ∈ 𝔽_{q^s}^n such that a_1, . . . , a_n are r-th roots of unity, then we can only hope to determine the exponents modulo r in the expansion (1). In that case, r must divide the order q^s − 1 of the multiplicative group of 𝔽_{q^s}.


s    2^s − 1                                  3^s − 1                                          5^s − 1
1    1                                        2                                                2^2
2    3                                        2^3                                              2^3 ⋅ 3
3    7                                        2 ⋅ 13                                           2^2 ⋅ 31
4    3 ⋅ 5                                    2^4 ⋅ 5                                          2^4 ⋅ 3 ⋅ 13
6    3^2 ⋅ 7                                  2^3 ⋅ 7 ⋅ 13                                     2^3 ⋅ 3^2 ⋅ 7 ⋅ 31
8    3 ⋅ 5 ⋅ 17                               2^5 ⋅ 5 ⋅ 41                                     2^5 ⋅ 3 ⋅ 13 ⋅ 313
12   3^2 ⋅ 5 ⋅ 7 ⋅ 13                         2^4 ⋅ 5 ⋅ 7 ⋅ 13 ⋅ 73                            2^4 ⋅ 3^2 ⋅ 7 ⋅ 13 ⋅ 31 ⋅ 601
16   3 ⋅ 5 ⋅ 17 ⋅ 257                         2^6 ⋅ 5 ⋅ 17 ⋅ 41 ⋅ 193                          2^6 ⋅ 3 ⋅ 13 ⋅ 17 ⋅ 313 ⋅ 11489
24   3^2 ⋅ 5 ⋅ 7 ⋅ 13 ⋅ 17 ⋅ 241              2^5 ⋅ 5 ⋅ 7 ⋅ 13 ⋅ 41 ⋅ 73 ⋅ 6481               2^5 ⋅ 3^2 ⋅ 7 ⋅ 13 ⋅ 31 ⋅ 313 ⋅ 601 ⋅ 390001
30   3^2 ⋅ 7 ⋅ 11 ⋅ 31 ⋅ 151 ⋅ 331            2^3 ⋅ 7 ⋅ 11^2 ⋅ 13 ⋅ 31 ⋅ 61 ⋅ 271 ⋅ 4561      2^3 ⋅ 3^2 ⋅ 7 ⋅ 11 ⋅ 31 ⋅ 61 ⋅ 71 ⋅ 181 ⋅ 521 ⋯
36   3^3 ⋅ 5 ⋅ 7 ⋅ 13 ⋅ 19 ⋅ 37 ⋅ 73 ⋅ 109    2^4 ⋅ 5 ⋅ 7 ⋅ 13 ⋅ 19 ⋅ 37 ⋅ 73 ⋅ 757 ⋅ 530713  2^4 ⋅ 3^3 ⋅ 7 ⋅ 13 ⋅ 19 ⋅ 31 ⋅ 37 ⋅ 601 ⋅ 829 ⋯

Table 2. Prime factorizations of 2^s − 1, 3^s − 1, and 5^s − 1 for various small smooth values of s.

s    1299743^s − 1                                                                                          a(s)
1    2 ⋅ 649871                                                                                             2
2    2^6 ⋅ 3^2 ⋅ 4513 ⋅ 649871                                                                              2^3 ⋅ 3
3    2 ⋅ 7 ⋅ 13 ⋅ 397 ⋅ 649871 ⋅ 6680137                                                                    2
4    2^7 ⋅ 3^2 ⋅ 5^2 ⋅ 193 ⋅ 4349 ⋅ 4513 ⋅ 40253 ⋅ 649871                                                   2^4 ⋅ 3 ⋅ 5
6    2^6 ⋅ 3^3 ⋅ 7^2 ⋅ 13 ⋅ 31 ⋅ 397 ⋅ 4513 ⋅ 649871 ⋅ 6680137 ⋅ 18164844799                                2^3 ⋅ 3^2 ⋅ 7
8    2^8 ⋅ 3^2 ⋅ 5^2 ⋅ 17^2 ⋅ 73 ⋅ 193 ⋅ 241 ⋅ 4349 ⋅ 4513 ⋅ 40253 ⋅ 649871 ⋅ 298090889 ⋅ 941485217         2^5 ⋅ 3 ⋅ 5
12   2^7 ⋅ 3^3 ⋅ 7^2 ⋅ 13 ⋅ 31 ⋅ 193 ⋅ 397 ⋅ 4349 ⋅ 4513 ⋅ 40253 ⋅ 649871 ⋅ 6680137 ⋅ 387205657 ⋯           2^4 ⋅ 3^2 ⋅ 5 ⋅ 7 ⋅ 13
16   2^9 ⋅ 3^2 ⋅ 5^2 ⋅ 17^2 ⋅ 73 ⋅ 193 ⋅ 241 ⋅ 4349 ⋅ 4513 ⋅ 40253 ⋅ 649871 ⋅ 3955153 ⋅ 298090889 ⋯         2^6 ⋅ 3 ⋅ 5 ⋅ 17
24   2^8 ⋅ 3^3 ⋅ 5^2 ⋅ 7^2 ⋅ 13 ⋅ 17^2 ⋅ 31 ⋅ 73 ⋅ 193 ⋅ 241 ⋅ 397 ⋅ 4349 ⋅ 4513 ⋅ 40253 ⋅ 649871 ⋯         2^5 ⋅ 3^2 ⋅ 5 ⋅ 7 ⋅ 13
30   2^6 ⋅ 3^3 ⋅ 7^2 ⋅ 11 ⋅ 13 ⋅ 31 ⋅ 61 ⋅ 71 ⋅ 271 ⋅ 397 ⋅ 701 ⋅ 881 ⋅ 1171 ⋅ 2411 ⋅ 4513 ⋅ 649871 ⋯       2^3 ⋅ 3^2 ⋅ 7 ⋅ 11 ⋅ 31
36   2^7 ⋅ 3^4 ⋅ 5^2 ⋅ 7^2 ⋅ 13 ⋅ 31 ⋅ 37 ⋅ 109 ⋅ 193 ⋅ 397 ⋅ 757 ⋅ 1657 ⋅ 4349 ⋅ 4513 ⋅ 40253 ⋅ 649871 ⋯   2^4 ⋅ 3^3 ⋅ 5 ⋅ 7 ⋅ 13 ⋅ 19 ⋅ 37

Table 3. Prime factorizations of q^s − 1 and a(s) for q = 1299743 and various values of s.

In addition, as we will recall in sections 5.1 and 5.2 below, several important tools such as polynomial root finding and discrete logarithms admit faster implementations if we can take r of the form r = r_1 r_2, with r_1 = O(T) and where r_2 is smooth. Sometimes, primitive roots of unity of such orders r already exist in 𝔽_q. If not, then we need to search for them in extension fields 𝔽_{q^s} with s > 1 as small as possible.

Let us briefly investigate the prime factorization of q^s − 1 for various q and small s. As observed in [21], the number q^s − 1 typically admits many small prime divisors when s is itself smooth. This phenomenon is illustrated in Table 2 for small values of q. For practical purposes, given T, it is easy in practice to find a small s and divisors r = r_1 r_2 | (q^s − 1) such that r_1 ⩾ T and r_2 is smooth.
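This search is easy to mechanize. The following Python sketch (our own illustrative helper, not part of the original algorithms) extracts the smooth part of q^s − 1 by trial division, for the smooth extension degrees s of Table 2, and stops as soon as it is comfortably larger than T:

```python
# Sketch: given q and T, look for a small smooth s such that q^s - 1 has a
# smooth divisor r2 with plenty of room above T. Trial division by small
# primes isolates the smooth part; no full factorization is needed.
from sympy import primerange

def smooth_part(n, bound=50):
    """Largest divisor of n whose prime factors are all <= bound."""
    part = 1
    for p in primerange(2, bound + 1):
        while n % p == 0:
            n //= p
            part *= p
    return part

def find_extension_degree(q, T, bound=50, degrees=(1, 2, 3, 4, 6, 8, 12)):
    for s in degrees:                      # smooth values of s, as in Table 2
        r2 = smooth_part(q**s - 1, bound)
        if r2 >= 2 * T:                    # smooth room for the fast tools
            return s, r2
    return None

print(find_extension_degree(1299743, 1000))   # a modest s suffices (cf. Table 3)
```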

For larger q, we may need to resort to larger extension degrees s in order to find appropriate ordersr. A natural question is whetherqs−1is guaranteed to have a non- trivial smooth divisor for largeqand a fixed value ofs. This leads us to introduce the following guaranteed lower bound:

    a(s) ≔ lim_{q_0 → ∞} gcd(q^s − 1 : q ⩾ q_0).    (2)

In Table 3, we have shown the prime factorizations of q^s − 1 and a(s) for q = 1299743 and various small smooth values of s. Here q = 1299743 was chosen such that (q − 1)/2 is also prime. For the practical applications in this paper, the table suggests that it remains likely that suitable orders r can still be found whenever needed, and that a(s) is usually quite pessimistic, even for large values of q. Let us finally mention that the sequence a(s) coincides with Sloane's integer sequence A079612; see https://oeis.org/A079612.
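The limit (2) is also easy to explore empirically. The small Python experiment below (ours, for illustration) approximates a(s) by accumulating gcds of q^s − 1 over a few hundred primes q beyond a threshold; the values quickly stabilize to the entries of A079612, such as a(2) = 24 and a(6) = 504 from Table 3:

```python
# Sketch: estimate a(s) = lim_{q0 -> oo} gcd(q^s - 1 : q >= q0) by taking
# gcds over a few hundred primes q past a large threshold.
from math import gcd
from sympy import nextprime

def a_estimate(s, q0=10**6, samples=200):
    g, q = 0, q0
    for _ in range(samples):
        q = nextprime(q)
        g = gcd(g, q**s - 1)          # gcd(0, n) = n starts the accumulation
    return g

for s in (1, 2, 4, 6):
    print(s, a_estimate(s))            # expect 2, 24, 240, 504 (A079612)
```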


3. GENERAL OBSERVATIONS

As already mentioned in the introduction, most algorithms for sparse interpolation are probabilistic of Monte Carlo type. We notice that it is easy to check (with high probability) whether a candidate sparse interpolation f̃ of f is correct. Indeed, it suffices to evaluate f̃ − f at random sample points and check whether the result vanishes. Deterministic algorithms exist, but with higher complexities; see for instance [22, 42].

Many algorithms for sparse interpolation require extra information, such as bounds T ⩾ t and D ⩾ d for the number of terms and the total degree of f. Furthermore, several algorithms are only able to guess some of the terms of f with high probability, but not all of them. In this section, using ideas from [3], we show how to turn such “partial” algorithms for sparse interpolation into full-blown ones. Provided that the characteristic of K is zero or sufficiently large, we also show how to upgrade an interpolation method modulo (x_1^r − 1, . . . , x_n^r − 1) into a general algorithm, following [30].

3.1. From partial to full interpolation

Assume that we have an algorithm for “partial” sparse interpolation that takes a black box for f as input, together with bounds T and D for t and d. The algorithm should always return a sparse polynomial f̃ of total degree at most D and with at most T terms. Moreover, for some fixed constant 0 < 𝜆 < 1, if t ⩽ T and d ⩽ D, then f − f̃ should contain at most 𝜆 t terms, with high probability. If t > T or d > D, then we allow f̃ to be essentially arbitrary, under the above constraint that f̃ has at most T terms of degree ⩽ D. Then we may use the following strategy for arbitrary sparse interpolations:

Algorithm 1

Input: a polynomial black box function f(x_1, . . . , x_n)
Output: the sparse interpolation f̃ of f

1. Let f̃ ≔ 0 be an initial approximation of f.

2. Set initial bounds T ≔ 1 and D ≔ 1 for the number of terms and total degree.

3. Determine the approximate sparse interpolation 𝛿̃ of 𝛿 ≔ f − f̃, using T and D as bounds for the number of terms and the total degree of 𝛿.

4. Set f̃ ≔ f̃ + 𝛿̃.

5. If f̃ = f with high probability, then return f̃.

6. Whenever appropriate, increase T and/or D, and reset f̃ ≔ 0.

7. Return to step 3.

In step 6, the precise policy for increasing T and D may depend on the application. We typically double T when t is suspected to exceed T, and we double log D when d is suspected to exceed D. In this way, the bounds T and log D are at most twice as large as the actual values t and log d, and the running time is essentially a constant times the running time of the approximate sparse interpolation with bounds T and D.

However, for this “meta complexity bound” to hold, it is crucial in step 3 that the sparse approximation f̃ can be evaluated efficiently at the sample points used during the sparse interpolation (the naive evaluation of a polynomial with t terms at Θ(t) points would take time Θ(t^2), which is too much). Fortunately, this is the case for all sparse interpolation strategies that will be considered in this paper.


When do we suspect T or D to be too low in step 6? In the case of T, a natural idea is to test whether the number of terms in f̃ or 𝛿̃ exceeds a fixed constant portion of T. This strategy assumes that 𝛿̃ is essentially random when T is too small (if the number of terms of 𝛿̃ is much smaller than T whenever t > T, then we might need more than Θ(log t) iterations before the algorithm converges).

The handling of exponents and degree bounds is more involved. When interpolating over a finite field, all non-zero evaluation points are roots of unity, so the exponents can only be determined modulo a certain integer r (or even modulo a submodule of ℤ^n). If the characteristic of K is sufficiently large, then the exponents can be computed directly: see the next subsection. Otherwise, we need to recombine reductions with respect to several moduli: see section 4 below. This also provides a natural test in order to check whether d ⩽ D. Indeed, it suffices to compute the sparse interpolation of f for one or more extra moduli and to check whether the results agree with our candidate interpolation.
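The outer loop of Algorithm 1, together with the doubling policy and the sparsity test just described, may be sketched as follows in Python (the partial interpolator and the probabilistic equality test are abstract parameters here; realizing them is the subject of sections 4 to 6):

```python
# Sketch of Algorithm 1 with the doubling policy of step 6. The functions
# partial_interp(blackbox, T, D) -> {exponent: coefficient} and
# probably_equal(blackbox, candidate) are supplied by a concrete method.

def eval_sparse(poly, x):
    return sum(c * x**e for e, c in poly.items())

def sparse_interpolate(f, partial_interp, probably_equal, max_rounds=64):
    T, D = 1, 2
    f_tilde = {}                          # current approximation of f
    for _ in range(max_rounds):
        # Step 3: interpolate the residual delta = f - f_tilde.
        delta = partial_interp(lambda x: f(x) - eval_sparse(f_tilde, x), T, D)
        for e, c in delta.items():        # step 4: f_tilde += delta
            c = f_tilde.get(e, 0) + c
            if c:
                f_tilde[e] = c
            else:
                f_tilde.pop(e, None)
        if probably_equal(f, f_tilde):    # step 5
            return f_tilde
        if len(delta) > T // 2:           # t suspected to exceed T
            T, f_tilde = 2 * T, {}
        else:                             # d suspected to exceed D
            D, f_tilde = D * D, {}        # doubling log D squares D
    raise RuntimeError("no convergence; bounds grew past max_rounds")
```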

3.2. Supersparse interpolation in large characteristic

Assume that we have an algorithm that allows us to compute the sparse interpolation of f modulo I_r = (x_1^r − 1, . . . , x_n^r − 1) for given moduli r. Assume also that we have access to the program that computes f, typically in terms of a straight-line program. If char K = 0 or char K > max(deg_{x_1} f, . . . , deg_{x_n} f), then let us show how to derive an algorithm for the sparse interpolation of f.

It is well known that Baur–Strassen's technique of “automatic differentiation” [7] allows us to evaluate the gradient (∂f/∂x_1, . . . , ∂f/∂x_n) using at most 4L operations in K. Using 5L + n operations, this provides an algorithm for the simultaneous evaluation of f, g_1, . . . , g_n with g_i = x_i (∂f/∂x_i) for i = 1, . . . , n. With a small overhead, this next allows us to jointly compute the sparse interpolations of f, g_1, . . . , g_n modulo I_r.

Now assume that c x_1^{e_1} ⋯ x_n^{e_n} is a term of f such that for any other term c′ x_1^{e_1′} ⋯ x_n^{e_n′} of f, we have (e_1 mod r, . . . , e_n mod r) ≠ (e_1′ mod r, . . . , e_n′ mod r); we say that this term “does not collide” modulo r. Then the sparse interpolation of f modulo I_r contains the non-zero term c x_1^{e_1 mod r} ⋯ x_n^{e_n mod r}. Moreover, given i ∈ {1, . . . , n} with e_i ≠ 0, the sparse interpolation of g_i modulo I_r also contains the non-zero term e_i c x_1^{e_1 mod r} ⋯ x_n^{e_n mod r}. This allows us to retrieve e_i through one simple division (e_i c)/c.

Furthermore, if the modulus r was picked at random with r > t, then there is a high probability that a fixed non-zero proportion of the terms of f do not collide modulo r. Combined with Algorithm 1, this yields an algorithm for obtaining the sparse interpolation of f. This strategy for sparse interpolation was first exploited by Huang [30].
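The following toy Python computation (characteristic zero, one variable, with explicit sparse representations standing in for the modular interpolations of f and g = x ∂f/∂x that would be produced jointly by the black box) illustrates the recovery of a huge exponent from one division:

```python
# Toy illustration of section 3.2: a term c x^e of f that does not collide
# modulo r appears in f rem (x^r - 1) with coefficient c and in
# g rem (x^r - 1) with coefficient e*c, so e = (e*c)/c.

f = {123456789: 3, 987: 5, 0: 7}           # f = 3 x^123456789 + 5 x^987 + 7
g = {e: e * c for e, c in f.items()}        # g = x * df/dx

def reduce_cyclic(poly, r):                 # poly rem (x^r - 1)
    out = {}
    for e, c in poly.items():
        out[e % r] = out.get(e % r, 0) + c
    return out

r = 10007                                   # random modulus with r > t
fr, gr = reduce_cyclic(f, r), reduce_cyclic(g, r)
for e_mod, c in fr.items():
    print("coefficient", c, "has exponent", gr.get(e_mod, 0) // c)
```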

Remark. For simplicity, we consider sparse interpolation of polynomials over fields K in this paper. In fact, the algorithms also work for vectors of polynomials in K[x_1, . . . , x_n]^𝜈, by considering them as polynomials with coefficients in K^𝜈. We implicitly used this fact above when saying that we “jointly” compute the sparse interpolations of f, g_1, . . . , g_n modulo I_r.

3.3. Conclusion

In summary, we have shown how to reduce the general problem of sparse interpolation to the case when

1. we have bounds for the number of terms and the total degree, and

2. we only require an approximate sparse interpolation (in the sense of section 3.1).


4. UNIVARIATE INTERPOLATION USING CYCLIC EXTENSIONS

One general approach for sparse interpolation of univariate polynomials over general base fields K was initiated by Garg and Schost [15]. It assumes that the black box function f can be evaluated over any cyclic extension of the form K[x]/(x^r − 1). The evaluation of

    f = c_1 x^{e_1} + ⋯ + c_t x^{e_t}    (3)

atxinside such an extension simply yields

    f^{[r]} = c_1 x^{e_1 rem r} + ⋯ + c_t x^{e_t rem r}.

In the absence of “collisions” e_i = e_j modulo r for i ≠ j, this yields both the coefficients of f and its exponents modulo r. By combining the evaluations for various moduli, it is possible to reconstruct the actual exponents using Chinese remaindering.

Throughout this section, we assume that we are given bounds D and T for the degree d and the number of terms t of f. Garg and Schost's original algorithm was deterministic under these assumptions. However, their algorithm was not designed to be efficient in practice. In the past decade, many variants have been proposed. Roughly speaking, they all follow the same recipe, which we summarize in Algorithm 2 below. The variants mainly differ in the precise way recombinations are done in step 3.

Algorithm 2

Input: a black box polynomial f(x), a degree bound D, a sparsity bound T
Output: a partial sparse interpolation f̃ of f

1. Determine suitable moduli r_1, . . . , r_l > T with lcm(r_1, . . . , r_l) > D.

2. Evaluate f at x in K[x]/(x^{r_i} − 1) for i = 1, . . . , l.

3. Determine matching terms in the expansions of f^{[r_1]}, . . . , f^{[r_l]} that are likely to be the reductions of the same term c_i x^{e_i} in the expansion of f.

4. Return the sum f̃ of all terms c_i x^{e_i} as above.

4.1. Complexity analysis

For all matching strategies that have been considered so far, the cost of steps 1 and 3 is dominated by the cost of step 2. If the evaluation of f only involves ring operations, then the running time of Algorithm 2 is therefore bounded by O((M_K(r_1) + ⋯ + M_K(r_l)) L).

The moduli rk are usually all of the same order of magnitude rkT𝜈 for some small 𝜈 ⩾ 1that depends on the matching strategy. Then we may takel=O(logD/logT), so the cost simplifies toO(LMK(T𝜈) logD/logT). For finite fieldsK= 𝔽q, this cost becomes O(LT𝜈logDlogq). For the design of matching strategies, it is therefore important that we can take𝜈as small as possible.

Remark. The above analysis can be refined by maintaining separate counts L_add, L_mul, and L_div for the numbers of additions (or subtractions), multiplications, and divisions that are necessary for one evaluation of f. Then the cost of Algorithm 2 over 𝔽_q becomes O(T^𝜈 log q (L_add + L_mul log T + L_div (log T)^2)).


Remark. The complexity analysis may need to be adjusted somewhat if D is so large that we run out of suitable moduli r_i. If our matching strategy requires prime numbers of the order of T^c, then this happens when log D exceeds approximately the same order T^c. In that case, we need to replace T by an appropriate power of log D in our complexity bounds. Alternatively, if the characteristic of K is sufficiently large, then we may fall back on the strategy from section 3.2.

4.2. Survey of existing variants based on cyclic extensions

4.2.1. Determining the exponents using Chinese remaindering

Garg and Schost's original algorithm from [15] uses prime numbers p for the moduli r_k. Assuming that f^{[p]} admits t terms, the algorithm is based on the observation that the projection of the polynomial (z − e_1) ⋯ (z − e_t) modulo p coincides with (z − (e_1 rem p)) ⋯ (z − (e_t rem p)). This allows for the recovery of E = (z − e_1) ⋯ (z − e_t) through Chinese remaindering, by working with a sufficient number of primes. It then suffices to determine the zeros e_1, . . . , e_t of E and to recover c_i as the coefficient of x^{e_i rem p} in f^{[p]} for i = 1, . . . , t.

However, this strategy is very sensitive to collisions, and it requires p ≍ T^2 in order to ensure with high probability that f^{[p]} admits exactly t terms. In other words, it forces us to take 𝜈 ⩾ 2 in the complexity analysis. Garg and Schost's original algorithm is actually deterministic and uses O˜(L T^4 (log D)^2) operations in K. The derandomization is achieved by using Θ(T^2 log D) different primes p.

4.2.2. Composite moduli

Another matching strategy for step 3 of Algorithm 2 has been proposed by Arnold, Giesbrecht, and Roche [3]. The idea is to pick r_k = p_0 p_k for k = 1, . . . , l, where p_0, . . . , p_l are primes with p_0 ≍ T and p_1, . . . , p_l ≍ T^𝜖 for some 𝜖 > 0 (so that 𝜈 = 1 + 𝜖). For i = 1, . . . , t and k = 1, . . . , l, there then exists a fixed non-zero probability such that the term c_i x^{e_i rem p_0} of f^{[p_0]} matches a term c_i x^{e_i rem r_k} of f^{[r_k]}. Let 𝒦_i be the set of indices k for which we have a match. For some fixed constant c > 0, we then have lcm(r_k : k ∈ 𝒦_i) > T^{c 𝜖 l} with high probability. By taking l > log D/(c 𝜖 log T) in step 1, this implies lcm(r_k : k ∈ 𝒦_i) > D. With high probability, this allows us to reconstruct those terms c_i x^{e_i} such that e_i ≠ e_j modulo p_0 for all j ≠ i. The sum of these terms gives the desired approximation f̃ of f, for which a fixed non-zero proportion of terms are likely to be correct.

4.2.3. Diversification

Giesbrecht and Roche proposed yet another matching strategy [16], which is based on the concept of “diversification”. The polynomial f is said to be diversified if its coefficients c_i are pairwise distinct. Assuming that K is sufficiently large, it is shown in [16] that the polynomial f(𝛼x) is diversified with high probability for a random choice of 𝛼 ∈ K. Without loss of generality, this allows us to assume that f is diversified.

In step 3 of Algorithm 2, we then match a term c x^e of f^{[r_k]} with a term c′ x^{e′} of f^{[r_{k′}]} if and only if c = c′. Giesbrecht and Roche's original algorithm uses l = O(log D) moduli r_1, . . . , r_l of size O˜(T^2 log D). Consequently, their algorithm for sparse interpolation uses O˜(L T^2 (log D)^2) operations in 𝔽_q. As we will see below, their probabilistic analysis is actually quite pessimistic: in practice, it is possible to take r_1, . . . , r_l = O˜(T) as long as log D = O(T).

(11)

4.3. An optimized probabilistic algorithm based on diversification

Let us now focus on the design and analysis of a probabilistic algorithm that exploits the idea of diversification even more than Giesbrecht and Roche's original method from [16].

Given a diversified polynomial f, together with bounds D and T for its degree d and its number of terms t, our aim is to compute Θ(t) terms of f, with high probability. Our algorithm uses the following parameters:

• A constant 𝜏 ≍ 1.

• Orders r_1 < ⋯ < r_l that are pairwise coprime, with r_k ≈ 𝜏 T for k = 1, . . . , l.

• The minimal number 𝜈 ∈ {1, . . . , l} such that lcm(r_1, . . . , r_𝜈) = r_1 ⋯ r_𝜈 > D.

The precise choice of 𝜏 and l will be detailed below; the parameter 𝜏 and the ratio l/𝜈 should be sufficiently small for the algorithm to be efficient, but 𝜏 and l/𝜈 should also be sufficiently large for our algorithm to succeed with high probability.

We now use Algorithm 3 below in order to compute an approximate sparse interpolation of f. It is a variant of Algorithm 2, whose matching strategy is detailed in steps 2, 3, and 4. Each individual term c x^e of f is reconstructed from only a subset of its reductions modulo x^{r_1} − 1, . . . , x^{r_l} − 1.

Algorithm 3

Input: a diversified black box polynomial f(x) and parameters as above
Output: an approximate sparse interpolation f̃ of f

1. Compute f^{[r_k]} = f rem (x^{r_k} − 1) for k = 1, . . . , l.

2. Let f̃ ≔ 0.

3. Let C_k be the set of all coefficients that occur once in f^{[r_k]}, for k = 1, . . . , l.

4. For each c ∈ C_1 ∪ ⋯ ∪ C_l do:

   If 𝒦 ≔ {k : c ∈ C_k} is such that lcm(r_k : k ∈ 𝒦) > D, then:

      determine the unique exponent e < D such that c x^{e rem r_k} occurs in f^{[r_k]} for every k ∈ 𝒦, and set f̃ ≔ f̃ + c x^e.

5. Return f̃.
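The matching steps 3 and 4 are straightforward to express in code. The following Python sketch (ours, illustrative, with exact arithmetic and without the evaluation phase of step 1) recombines the exponents by Chinese remaindering:

```python
# Sketch of steps 3-4 of Algorithm 3: the reductions f^[r_k] are given as
# dictionaries {exponent mod r_k: coefficient}; coefficients that occur
# exactly once are matched across moduli and their exponents recombined.
from math import lcm
from sympy.ntheory.modular import crt

def match_diversified(reductions, moduli, D):
    unique = [{c: e for e, c in fk.items()                 # step 3: C_k
               if list(fk.values()).count(c) == 1}
              for fk in reductions]
    f_tilde = {}
    for c in set().union(*(set(u) for u in unique)):
        ks = [k for k, u in enumerate(unique) if c in u]
        if lcm(*(moduli[k] for k in ks)) > D:              # step 4 test
            e, _ = crt([moduli[k] for k in ks],
                       [unique[k][c] for k in ks])         # unique e < lcm
            f_tilde[int(e)] = c
    return f_tilde

# Tiny example: f = 4 x^1009 + 9 x^88 + 11 with pairwise coprime moduli.
moduli, D = [101, 103, 107], 10**6
f = {1009: 4, 88: 9, 0: 11}
reductions = [{e % r: c for e, c in f.items()} for r in moduli]
print(match_diversified(reductions, moduli, D))   # {1009: 4, 88: 9, 0: 11}
```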

4.4. Analysis of the expected number of correct terms

How do we ensure that a non-zero portion of the terms of f̃ can be expected to be correct?

In order to answer this question, we make the following heuristic hypothesis:

Hred. For k = 1, . . . , l, the modular reductions of the exponents, e_i rem r_k for i = 1, . . . , t, are uniformly distributed in {0, . . . , r_k − 1}. The distribution associated to r_k is independent of the one associated to r_{k′} whenever k ≠ k′.

Such a heuristic is customary in computer science, typically when using hash tables.

According to Hred, the probability that a fixed term c x^e does not collide with another term modulo r_k is

    (1 − r_k^{−1})^{T−1} ≈ (1 − (𝜏T)^{−1})^{T−1} ⩾ (1 − (𝜏T)^{−1})^T = e^{T log(1 − (𝜏T)^{−1})} ≈ e^{−1/𝜏}.

Setting 𝜖 ≔ 1 − e^{−1/𝜏}, this probability tends to 1 − 𝜖 for large T. The probability that c x^e does not collide with another term modulo r_k for exactly 𝜅 values of k is therefore bounded by

    \binom{l}{𝜅} 𝜖^{l−𝜅} (1 − 𝜖)^𝜅


and this bound is sharp for large T. Consequently, the probability that we cannot recover a term c_i x^{e_i} in step 4 from its reductions modulo x^{r_k} − 1 for k = 1, . . . , l is at most

    P(𝜈; l, 1−𝜖) = ∑_{i<𝜈} \binom{l}{i} 𝜖^{l−i} (1−𝜖)^i,    (4)

and this bound is sharp for large T.

The probability (4) has been well studied; see [5] for a survey. Whenever a ≔ 𝜈/l < 1 − 𝜖, Chernoff's inequality [5, Theorem 1] gives us

    P(𝜈; l, 1−𝜖) ⩽ exp(−l (a log(a/(1−𝜖)) + (1−a) log((1−a)/𝜖))).

Let 𝜂 < 1 be a positive real parameter. In order to ensure P(𝜈; l, 1−𝜖) < 𝜂, it suffices to have

    l (a log(a/(1−𝜖)) + (1−a) log((1−a)/𝜖)) > log(1/𝜂).

Now thanks to [40, Lemma 2(a)] we have

    a log(a/(1−𝜖)) + (1−a) log((1−a)/𝜖) ⩾ 2 (a − (1−𝜖))^2,

so it suffices to ensure that

    (𝜈 − l(1−𝜖))^2 > (log(1/𝜂)/2) l.    (5)

Now let us take 𝜈 = c(1−𝜖)l with c < 1, after which (5) is equivalent to

    (c−1)^2 > log(1/𝜂) / (2 l (1−𝜖)^2).    (6)

For fixed 𝜂 and large l (i.e. for large D), it follows that we may take c arbitrarily close to 1.

Summarizing, we may thus take l ≈ 𝜈/(1−𝜖) in order to recover an average of at least (1−𝜂) t correct terms, where 𝜂 can be taken arbitrarily close to 0:

PROPOSITION 1. Assume Hred and let 𝜏, 𝜖, l, r_1, . . . , r_l, and 𝜈 be parameters as above. Given 𝜂 < 1, assume that 𝜈 = c(1−𝜖)l, where c satisfies (6). Then Algorithm 3 returns at least (1−𝜂) t correct terms of f on average.
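Numerically, an admissible l is easy to determine from (5) and (6); the short Python helper below (ours, for illustration) starts near l ≈ 𝜈/(1−𝜖) and increases l until (6) holds:

```python
# Sketch: choose the number of moduli l following (5)-(6), for a target
# failure rate eta per term.
from math import exp, log, ceil

def choose_l(nu, tau=1.0, eta=0.05):
    eps = 1.0 - exp(-1.0 / tau)       # per-modulus collision probability
    l = ceil(nu / (1.0 - eps))        # start near the asymptotic optimum
    while True:
        c = nu / (l * (1.0 - eps))
        if c < 1 and (c - 1)**2 > log(1.0 / eta) / (2 * l * (1.0 - eps)**2):
            return l
        l += 1

# With nu = log D / log T = 8 and tau = 1; the Pinsker-type bound used in
# (6) is not tight, so the returned l overshoots nu/(1 - eps) ~ 22 somewhat.
print(choose_l(8))
```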

4.5. Probabilistic complexity analysis

Let us now analyze the running time of Algorithm 3. Taking l ≈ 𝜈/(1−𝜖), the cost of step 1 is proportional to

    𝜏/(1−𝜖) = 𝜏 e^{1/𝜏},

and 𝜏 e^{1/𝜏} reaches its minimum value e at 𝜏 = 1. This means that the total complexity is best when 𝜏 is close to 1. In other words, this prompts us to take 𝜏 = 1, r_1 ≈ ⋯ ≈ r_l ≈ T, 𝜈 ≈ log D/log T, and l ≈ 𝜈/(1 − e^{−1}). For this choice of parameters, we obtain the following heuristic complexity bound:

PROPOSITION 2. Assume Hred and log D = O(T). Given 0 < 𝜂 < 1 and a diversified polynomial f of degree d ⩽ D and with t ⩽ T terms, there exists a Monte Carlo probabilistic algorithm which computes at least (1−𝜂) t terms of f in time

    O(L T log D log q).


Proof. We take r_i to be the i-th smallest prime number that is larger than T, so that 𝜈 = O(log D/log T) is the smallest number with r_1 ⋯ r_𝜈 > D. We also take 𝜏 = 1, 𝜖 = 1 − 1/e, and we let l = O(log D/log T) be smallest such that (6) is satisfied for c = 𝜈/(l(1−𝜖)). Combining [6] and [1], we may compute r_1, . . . , r_l in time O(T log D).

Now the running time of step 1 of Algorithm 3 is O((M_q(r_1) + ⋯ + M_q(r_l)) L). With l = O(log D/log T), this cost simplifies to

    O(L T log(qT) log D/log T) = O(L T log q log D).

Step 3 may be performed by sorting the coefficients of the f^{[r_k]}, in time O(l T log T log q) = O(T log D log q). Using fast Chinese remaindering, step 4 takes time T ⋅ O˜(log 𝜈 log D) = O(T log D). □

Remark. If q ≫ T^2, then f(𝛼x) is diversified with high probability for a random choice of 𝛼 ∈ 𝔽_q ∖ {0}: see [16]. In the range where q = Θ(T) and q = O(T^2), it is possible to work with a slightly weaker assumption: we say that f is weakly diversified if {c_1, . . . , c_t} is of size Θ(t). If q = Θ(T), then the polynomial f(𝛼x) is still weakly diversified, for a random choice of 𝛼 ∈ 𝔽_q ∖ {0}. If f is only weakly diversified and t′ ≔ #{1 ⩽ i ⩽ t : c_i ≠ c_j for all j ≠ i}, then our analysis can be adapted to show that Algorithm 3 returns about (1−𝜂) t′ correct terms of f on average. Finally, in the range where q = o(T), we need to work over a field extension 𝔽_{q^s} with q^s ⩾ T, which leads to an additional arithmetic overhead of O(log T/log q).

Remark. Let us show that with high probability, the polynomial f̃ returned by Algorithm 3 only contains correct terms of f (although it does not necessarily contain all terms). For this, we make the additional hypothesis that the coefficients of f are essentially random non-zero values in 𝔽_q (which is typically the case after a change of variables f(x) ↝ f(𝛼x), where 𝛼 ∈ 𝔽_q ∖ {0} is random).

Now assume that some coefficient c in step 4 gives rise to a term c x^e that is not in f. Then for every k ∈ 𝒦, there should be at least two terms in f that collide modulo r_k and for which the sum of the corresponding coefficients equals c. The probability that this happens for a fixed k ∈ 𝒦 is bounded by (q−1)^{−1}, and the probability that this happens for all k ∈ 𝒦 is bounded by (q−1)^{−𝜈′}, where 𝜈′ ≈ 𝜈 is minimal with r_{l−𝜈′+1} ⋯ r_l > D.

4.6. Estimating the number of terms t

For the algorithms in this section, we assumed that a bound T for t was given. It turns out that a variant of our probabilistic analysis can also be used for the efficient computation of a rough estimate for t. This yields an interesting alternative to the naive doubling strategy described in section 3.1.

Let us still assume that Hred holds. We will also assume that colliding terms rarely cancel out (which holds with high probability if q is sufficiently large). This time, we compute f(x) rem (x^B − 1) for B ≔ 𝛼 t, where 𝛼 < 1 is to be determined, and we let N be the number of terms in this remainder. When randomly distributing t balls over B boxes, the probability that none of the balls lands in a particular box is (1 − 1/B)^t. Consequently, the expected number of boxes with no ball is (1 − 1/B)^t B, whence

    B − N ≈ (1 − 1/B)^t B ≈ e^{−1/𝛼} B.


It follows that

    1/𝛼 ≈ log(B/(B − N)),

and thus

    t = B/𝛼 ≈ B log(B/(B − N)).    (7)

By doubling B until B − N ⩾ √B, we may then use the approximation (7) as a good candidate for t. Notice that we have B − N ≈ √B when B ≈ 2t/log t.
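In code, the resulting estimator looks as follows (a Python sketch of ours; the callable returns the support of f rem (x^B − 1), and cancellations are ignored, in line with the assumption above):

```python
# Sketch of the estimator (7): double B until enough "boxes" stay empty,
# then return B * log(B / (B - N)) as an estimate for t.
from math import log, isqrt
import random

def estimate_terms(support_mod, B0=4, Bmax=2**30):
    B = B0
    while B < Bmax:
        N = len(support_mod(B))          # terms of f rem (x^B - 1)
        if B - N >= isqrt(B):            # enough empty boxes for (7)
            return round(B * log(B / (B - N)))
        B *= 2
    raise RuntimeError("no reliable estimate below Bmax")

# Example: a secret random support with t = 5000 huge exponents.
support = random.sample(range(10**9), 5000)
print(estimate_terms(lambda B: {e % B for e in support}))  # close to 5000
```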

4.7. Conclusion

Cyclic extension methods for sparse interpolation are attractive due to their generality and the possibility to derive deterministic complexity bounds. However, even their most efficient probabilistic versions suffer from the overhead of arithmetic operations in cyclic extension algebras K[x]/(x^r − 1).

The matching strategy based on diversification leads to the best practical complexity bounds, as shown in section 4.5. Assuming Hred, log D = O(T), and q = Θ(T), we have given a Monte Carlo algorithm for sparse interpolation of complexity O(L T log D log T). The case when q = o(T) can be reduced to this case using a field extension of degree O(log T/log q). Assuming only Hred and log D = O(T), we thus obtain a probabilistic algorithm that runs in time

    O(L T log D log(qT)).    (8)

5. UNIVARIATE INTERPOLATION USING GEOMETRIC PROGRESSIONS

Prony's method is one of the oldest and most celebrated algorithms for sparse interpolation of univariate polynomials. It is based on the evaluation of f at points in a geometric progression. Since there are many variants, we keep our presentation as general as possible. As in the previous section, assume that

    f = c_1 x^{e_1} + ⋯ + c_t x^{e_t}    (9)

and that we know bounds D and T for the degree and the number of terms of f.

Algorithm 4

Input: a black box polynomial f(x), a degree bound D, a sparsity bound T
Output: the sparse interpolation f̃ of f

1. Find a suitable element 𝜔 ∈ K with multiplicative order r ⩾ max(D, 2T), while replacing K by an extension if necessary.

2. Evaluate f(1), f(𝜔), f(𝜔^2), . . . , f(𝜔^{2T−1}).

3. Compute a minimal t ⩽ T, a monic Λ ∈ K[z] of degree t, and an N ∈ K[z] of degree < t, such that the following identity holds modulo O(z^{2T}):

    ∑_{k ∈ ℕ} f(𝜔^k) z^k = ∑_{1 ⩽ i ⩽ t} c_i / (1 − 𝜔^{e_i} z) = N(z)/Λ(z).    (10)

4. Find the roots 𝜔^{−e_i} of Λ = z^t + Λ_{t−1} z^{t−1} + ⋯ + Λ_0 ∈ K[z].

5. Compute the discrete logarithms of the roots to base 𝜔 to discover the exponents e_1, . . . , e_t of f as in (9).

6. Compute c_1, . . . , c_t from f(1), . . . , f(𝜔^{t−1}) and e_1, . . . , e_t, using linear algebra.
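For concreteness, here is a self-contained toy version of Algorithm 4 over a small prime field (our own illustration; the conventions differ slightly in that we compute the characteristic polynomial of the sequence, whose roots are 𝜔^{e_i} rather than 𝜔^{−e_i}). Step 3 is realized by the Berlekamp–Massey algorithm, which is equivalent to the fast Padé approximation mentioned below; steps 4 and 5 are merged into one brute-force scan over the powers of 𝜔, and step 6 uses naive Gaussian elimination. The fast alternatives are the subject of sections 5.1 and 5.2.

```python
# Toy version of Algorithm 4 over F_p with p = 257, r = p - 1 = 2^8, w = 3
# (3 is a primitive root modulo 257). Brute force is used for steps 4-6.

p, r, w = 257, 256, 3

def f(x):                              # black box for f = 7 x^200 + 3 x^42 + 1
    return (7 * pow(x, 200, p) + 3 * pow(x, 42, p) + 1) % p

T = 4                                  # sparsity bound (here t = 3 <= T)
a = [f(pow(w, k, p)) for k in range(2 * T)]          # step 2

def berlekamp_massey(s):               # step 3: minimal recurrence for s
    n = len(s)
    C, B = [0] * n, [0] * n
    C[0] = B[0] = 1
    L, m, b = 0, 0, 1
    for i in range(n):
        m += 1
        d = s[i] % p
        for j in range(1, L + 1):
            d = (d + C[j] * s[i - j]) % p
        if d == 0:
            continue
        Tmp = C[:]
        coef = d * pow(b, p - 2, p) % p
        for j in range(m, n):
            C[j] = (C[j] - coef * B[j - m]) % p
        if 2 * L <= i:
            L, B, b, m = i + 1 - L, Tmp, d, 0
    return [(p - x) % p for x in C[1:L + 1]]   # a[i] = sum rec[j] a[i-1-j]

rec = berlekamp_massey(a)
t = len(rec)

def chi(z):                            # characteristic polynomial of the
    v = pow(z, t, p)                   # recurrence; its roots are w^{e_i}
    for j in range(t):
        v = (v - rec[j] * pow(z, t - 1 - j, p)) % p
    return v

exps = [e for e in range(r) if chi(pow(w, e, p)) == 0]   # steps 4-5, brute force

M = [[pow(w, e * k, p) for e in exps] + [a[k]] for k in range(t)]
for col in range(t):                   # step 6: Gaussian elimination mod p
    piv = next(i for i in range(col, t) if M[i][col])
    M[col], M[piv] = M[piv], M[col]
    inv = pow(M[col][col], p - 2, p)
    M[col] = [x * inv % p for x in M[col]]
    for i in range(t):
        if i != col and M[i][col]:
            M[i] = [(x - M[i][col] * y) % p for x, y in zip(M[i], M[col])]

print({e: row[t] for e, row in zip(exps, M)})   # -> {0: 1, 42: 3, 200: 7}
```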


It is well known that steps 3 and 6 can be performed in time O(M_K(T) log T), through fast Padé approximation [10, 38] in the case of step 3, and using a transposed version of fast multipoint interpolation [9, 11] for step 6. If K = 𝔽_q, then this bound becomes O(T (log T)^2 log q). The efficiency of steps 4 and 5 highly depends on the coefficient field K. In the remainder of this section, we will discuss this issue in detail in the case when K = 𝔽_q is a finite field.

5.1. Root finding

Finding the roots of a univariate polynomial over a finite field is a well-known and highly studied problem in computer algebra. The most efficient general purpose algorithm for this task is due to Cantor–Zassenhaus [12]. It is probabilistic and computes the roots of Λ in time O(T (log T)^2 (log q)^2).

In [17, 18], several alternative methods were designed for the case when r = r_1 r_2 with r_1 = O(T) and r_2 ⩽ T^{O(1)} smooth (in the sense that 𝜋 = O(1) for each prime factor 𝜋 of r_2). The idea is to proceed in three steps:

1. We first compute the r_2-th Graeffe transform G_{r_2}(Ω) of Ω, whose roots are the r_2-th powers of the roots of Ω. This step can be done in time O(T (log T)^2 log q), by [17, Proposition 5].

2. We next compute the roots of G_{r_2}(Ω) through an exhaustive evaluation at all r_1-th roots of unity. This step takes time O(T log T log q).

3. We finally lift these roots back up to the roots of Ω. This can be done in time O(T (log T)^2 log q log r_2) = O(T (log T)^3 log q), using gcd computations.

Altogether, this yields a sparse interpolation method of cost O(T (log T)^3 log q).
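For r_2 = 2^k, the Graeffe transform is particularly simple: one iterates the order-2 transform, which can be read off from the product P(z) P(−z). A small Python sketch over 𝔽_257 (ours, with naive multiplication; [17] performs this step in softly linear time):

```python
# Sketch: order-2 Graeffe transforms over F_p. If P has roots a_i, then
# G2(P) has roots a_i^2, so k iterations raise all roots to the power 2^k.

p = 257

def polmul(P, Q):
    R = [0] * (len(P) + len(Q) - 1)
    for i, a in enumerate(P):
        for j, b in enumerate(Q):
            R[i + j] = (R[i + j] + a * b) % p
    return R

def graeffe2(P):
    """G2(P), defined by G2(P)(z^2) = +-P(z)P(-z); roots get squared."""
    Pm = [c if i % 2 == 0 else (-c) % p for i, c in enumerate(P)]  # P(-z)
    prod = polmul(P, Pm)            # an even polynomial in z
    Q = prod[::2]                   # keep coefficients of z^0, z^2, z^4, ...
    if (len(P) - 1) % 2:            # sign fix so that Q = prod (y - a_i^2)
        Q = [(-c) % p for c in Q]
    return Q

# Example: P = (z - 3)(z - 5) over F_257; two Graeffe steps square twice.
P = [15, (-8) % p, 1]               # z^2 - 8z + 15, low degree first
Q = graeffe2(graeffe2(P))
for a in (3, 5):
    root = pow(a, 4, p)             # roots of Q are 3^4 = 81 and 5^4 = 111
    assert sum(c * pow(root, i, p) for i, c in enumerate(Q)) % p == 0
print(Q)
```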

The backlifting of single roots can be accelerated using so-called “tangent Graeffe transforms”. The idea is to work over the ring R ≔ K[𝜖]/(𝜖^2) instead of K. Then 𝛼 ∈ K is a root of a polynomial P(z) ∈ K[z] if and only if 𝛼 + 𝜖 ∈ R is a root of the polynomial P(z − 𝜖) ∈ R[z]. Now if we know a single root (𝛼 + 𝜖)^{r_2} = 𝛼^{r_2} + r_2 𝛼^{r_2−1} 𝜖 of G_{r_2}(Ω(z − 𝜖)), then we may retrieve 𝛼 using one division of 𝛼^{r_2} by r_2 𝛼^{r_2−1} and one multiplication by r_2 (note that r_2 is invertible in 𝔽_q since r_2 divides q − 1). In other words, the backlifting step can be done in time O(T log q), using O(T) operations in 𝔽_q.

However, this method only works for single roots 𝛼. When replacing Ω(z) by Ω(z − 𝜆) for a randomly chosen 𝜆 ∈ 𝔽_q, the polynomial G_{r_2}(Ω(z − 𝜆)) can be forced to admit a non-trivial proportion of single roots with high probability. However, these roots are no longer powers of 𝜔, unless we took r = q − 1. Assuming that r = q − 1 and using several shifts 𝜆, it can be shown [17, Proposition 12] that the tangent Graeffe method yields a sparse interpolation method of complexity O(T log T (log q)^2).

5.2. Discrete logarithms

The discrete logarithm problem in abelian groups is a well-known problem in computational number theory. If r is smooth, then Pohlig–Hellman's algorithm provides an efficient solution; it allows step 5 of Algorithm 4 to be performed in time O(T log r log q).

Under the assumption that we may take r = T^{O(1)}, this bound reduces to O(T log T log q).

Again, the same bound still holds if r = r_1 r_2 with r_1 = O(T) and r_2 smooth. Indeed, in that case, we may tabulate the powers 𝜔^0, 𝜔^{r_2}, 𝜔^{2 r_2}, . . . , 𝜔^{(r_1−1) r_2}. This allows us to efficiently determine the discrete logarithms of 𝜔^{e_1 r_2}, . . . , 𝜔^{e_t r_2} with respect to 𝜔^{r_2}, which yields the exponents e_1, . . . , e_t modulo r_1. We next use Pohlig–Hellman's algorithm to compute e_1, . . . , e_t.
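A compact Python version of Pohlig–Hellman for smooth r (ours, with brute force in the prime-order subgroups, which is adequate when every prime factor of r is O(1)):

```python
# Sketch of Pohlig-Hellman: reduce a discrete logarithm in <w>, of smooth
# order r, to discrete logs in subgroups of prime order, then recombine
# the partial results by Chinese remaindering.
from sympy import factorint
from sympy.ntheory.modular import crt

def pohlig_hellman(h, w, r, p):
    """Return e with w^e = h (mod p), assuming w has smooth order r mod p."""
    residues, moduli = [], []
    for prime, k in factorint(r).items():
        pk = prime**k
        wq = pow(w, r // pk, p)               # element of order prime^k
        hq = pow(h, r // pk, p)
        gen = pow(wq, prime**(k - 1), p)      # generates the order-prime subgroup
        e, base = 0, 1
        for j in range(k):                    # recover one base-prime digit at a time
            target = pow(hq * pow(wq, pk - e, p) % p, prime**(k - 1 - j), p)
            digit = next(d for d in range(prime) if pow(gen, d, p) == target)
            e += digit * base
            base *= prime
        residues.append(e)
        moduli.append(pk)
    return int(crt(moduli, residues)[0])

p, r, w = 257, 256, 3                         # 3 has multiplicative order 256 mod 257
e = 123
assert pohlig_hellman(pow(w, e, p), w, r, p) == e
print("recovered exponent:", e)
```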
