Finite Field Arithmetic

2.3 Binary field arithmetic

2.3.6 Inversion and division

In this subsection, we simplify the notation and denote binary polynomials a(z) by a.

Recall that the inverse of a nonzero element a ∈ F_{2^m} is the unique element g ∈ F_{2^m} such that ag = 1 in F_{2^m}, that is, ag ≡ 1 (mod f). This inverse element is denoted a⁻¹ mod f, or simply a⁻¹ if the reduction polynomial f is understood from context. Inverses can be efficiently computed by the extended Euclidean algorithm for polynomials.
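To make the defining relation ag ≡ 1 (mod f) concrete, the following minimal sketch checks it for a single element. It assumes an encoding the text does not prescribe: a binary polynomial is stored as a Python integer whose bit i is the coefficient of z^i. The helper names `deg` and `mulmod` and the choice f(z) = z^4 + z + 1 are illustrative assumptions.

```python
def deg(p: int) -> int:
    """Degree of a binary polynomial stored as an int (bit i = coefficient of z^i)."""
    return p.bit_length() - 1  # deg(0) comes out as -1


def mulmod(a: int, b: int, f: int) -> int:
    """Return a*b mod f by shift-and-add; assumes deg(a) < deg(f)."""
    m = deg(f)
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition of binary polynomials is XOR
        b >>= 1
        a <<= 1                  # multiply a by z
        if (a >> m) & 1:         # degree reached m: reduce by adding f
            a ^= f
    return result


F = 0b10011   # f(z) = z^4 + z + 1 (assumed example field F_{2^4})
a = 0b0010    # a(z) = z
g = 0b1001    # g(z) = z^3 + 1

print(mulmod(a, g, F))   # 1, so g is the inverse of a modulo f
```

Here z·(z^3 + 1) = z^4 + z, and z^4 ≡ z + 1 (mod f), so the product reduces to 1.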

The extended Euclidean algorithm for polynomials

Let a and b be binary polynomials, not both 0. The greatest common divisor (gcd) of a and b, denoted gcd(a, b), is the binary polynomial d of highest degree that divides both a and b. Efficient algorithms for computing gcd(a, b) exploit the following polynomial analogue of Theorem 2.18.

Theorem 2.46 Let a and b be binary polynomials. Then gcd(a, b) = gcd(b − ca, a) for all binary polynomials c.

In the classical Euclidean algorithm for computing the gcd of binary polynomials a and b, where deg(b) ≥ deg(a), b is divided by a to obtain a quotient q and a remainder r satisfying b = qa + r and deg(r) < deg(a). By Theorem 2.46, gcd(a, b) = gcd(r, a). Thus, the problem of determining gcd(a, b) is reduced to that of computing gcd(r, a), where the arguments (r, a) have lower degrees than the original arguments (a, b). This process is repeated until one of the arguments is zero; the result is then immediately obtained since gcd(0, d) = d. The algorithm must terminate since the degrees of the remainders are strictly decreasing. Moreover, it is efficient because the number of (long) divisions is at most k, where k = deg(a).
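The classical procedure just described can be sketched as follows. This is an illustration under an assumed encoding (a binary polynomial is a Python integer whose bit i is the coefficient of z^i); the names `polydivmod` and `gcd_classical` are hypothetical and do not come from the text.

```python
def deg(p: int) -> int:
    """Degree of a binary polynomial stored as an int (bit i = coefficient of z^i)."""
    return p.bit_length() - 1  # deg(0) comes out as -1


def polydivmod(b: int, a: int) -> tuple[int, int]:
    """Long division: return (q, r) with b = q*a + r and deg(r) < deg(a); a != 0."""
    q, r = 0, b
    while deg(r) >= deg(a):
        shift = deg(r) - deg(a)
        q ^= 1 << shift      # add z^shift to the quotient
        r ^= a << shift      # cancel the leading term of the remainder
    return q, r


def gcd_classical(a: int, b: int) -> int:
    """gcd(a, b) by repeated long division; a and b must not both be zero."""
    while a:
        _, r = polydivmod(b, a)
        a, b = r, a          # gcd(a, b) = gcd(r, a) by Theorem 2.46
    return b                 # gcd(0, d) = d


# a = z^2 + 1 = (z + 1)^2 and b = z^3 + 1 = (z + 1)(z^2 + z + 1)
print(bin(gcd_classical(0b101, 0b1001)))   # 0b11, i.e. z + 1
```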

In a variant of the classical Euclidean algorithm, only one step of each long division is performed. That is, if deg(b) ≥ deg(a) and j = deg(b) − deg(a), then one computes r = b + z^j a. By Theorem 2.46, gcd(a, b) = gcd(r, a). This process is repeated until a zero remainder is encountered. Since deg(r) < deg(b), the number of (partial) division steps is at most 2k, where k = max{deg(a), deg(b)}.
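A minimal sketch of this one-step variant, again assuming the integer encoding (bit i is the coefficient of z^i). The function name `gcd_partial` and the returned step counter are illustrative additions, included only so that the 2k bound can be observed on an example.

```python
def deg(p: int) -> int:
    """Degree of a binary polynomial stored as an int (bit i = coefficient of z^i)."""
    return p.bit_length() - 1  # deg(0) comes out as -1


def gcd_partial(a: int, b: int) -> tuple[int, int]:
    """Return (gcd(a, b), number of partial division steps); not both inputs zero."""
    steps = 0
    while a and b:
        if deg(b) < deg(a):
            a, b = b, a                  # keep deg(b) >= deg(a)
        b ^= a << (deg(b) - deg(a))      # r = b + z^j * a, with deg(r) < deg(b)
        steps += 1
    return a | b, steps                  # exactly one of a, b is zero here


d, steps = gcd_partial(0b101, 0b1001)    # a = z^2 + 1, b = z^3 + 1
print(bin(d), steps)                     # 0b11 3
```

In this example k = max{deg(a), deg(b)} = 3, and the three steps taken are within the bound 2k = 6.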

The Euclidean algorithm can be extended to find binary polynomials g and h satisfying ag + bh = d, where d = gcd(a, b). Algorithm 2.47 maintains the invariants

ag₁ + bh₁ = u
ag₂ + bh₂ = v.

The algorithm terminates when u = 0, in which case v = gcd(a, b) and ag₂ + bh₂ = d.

Algorithm 2.47 Extended Euclidean algorithm for binary polynomials

INPUT: Nonzero binary polynomials a and b with deg(a) ≤ deg(b).

OUTPUT: d = gcd(a, b) and binary polynomials g, h satisfying ag + bh = d.

1. u ← a, v ← b.

2. g₁ ← 1, g₂ ← 0, h₁ ← 0, h₂ ← 1.

3. While u ≠ 0 do

3.1 j ← deg(u) − deg(v).

3.2 If j < 0 then: u ↔ v, g₁ ↔ g₂, h₁ ↔ h₂, j ← −j.

3.3 u ← u + z^j v.

3.4 g₁ ← g₁ + z^j g₂, h₁ ← h₁ + z^j h₂.

4. d ← v, g ← g₂, h ← h₂.

5. Return (d, g, h).
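The steps of Algorithm 2.47 translate almost line for line into code. The sketch below assumes the integer encoding (bit i is the coefficient of z^i); `ext_euclid` and the carry-less multiplication helper `clmul`, used only to check the identity ag + bh = d, are hypothetical names.

```python
def deg(p: int) -> int:
    """Degree of a binary polynomial stored as an int (bit i = coefficient of z^i)."""
    return p.bit_length() - 1


def clmul(a: int, b: int) -> int:
    """Product of two binary polynomials, with no modular reduction."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def ext_euclid(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, g, h) with d = gcd(a, b) and a*g + b*h = d; a, b nonzero."""
    u, v = a, b                          # step 1
    g1, g2, h1, h2 = 1, 0, 0, 1          # step 2
    while u != 0:                        # step 3
        j = deg(u) - deg(v)              # step 3.1
        if j < 0:                        # step 3.2: swap the two rows
            u, v = v, u
            g1, g2 = g2, g1
            h1, h2 = h2, h1
            j = -j
        u ^= v << j                      # step 3.3
        g1 ^= g2 << j                    # step 3.4
        h1 ^= h2 << j
    return v, g2, h2                     # steps 4 and 5


a, b = 0b101, 0b1001                     # z^2 + 1 and z^3 + 1
d, g, h = ext_euclid(a, b)
print(d, g, h)                           # 3 2 1, i.e. d = z + 1, g = z, h = 1
print(clmul(a, g) ^ clmul(b, h) == d)    # True
```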

Suppose now that f is an irreducible binary polynomial of degree m and the nonzero polynomial a has degree at most m − 1 (hence gcd(a, f) = 1). If Algorithm 2.47 is executed with inputs a and f, the last nonzero u encountered in step 3.3 is u = 1. After this occurrence, the polynomials g₁ and h₁, as updated in step 3.4, satisfy ag₁ + fh₁ = 1. Hence ag₁ ≡ 1 (mod f) and so a⁻¹ = g₁. Note that h₁ and h₂ are not needed for the determination of g₁. These observations lead to Algorithm 2.48 for inversion in F_{2^m}.

Algorithm 2.48 Inversion in F_{2^m} using the extended Euclidean algorithm

INPUT: A nonzero binary polynomial a of degree at most m − 1.

OUTPUT: a⁻¹ mod f.

1. u ← a, v ← f.

2. g₁ ← 1, g₂ ← 0.

3. While u ≠ 1 do

3.1 j ← deg(u) − deg(v).

3.2 If j < 0 then: u ↔ v, g₁ ↔ g₂, j ← −j.

3.3 u ← u + z^j v.

3.4 g₁ ← g₁ + z^j g₂.

4. Return (g₁).
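A sketch of Algorithm 2.48 under the same assumed integer encoding (bit i is the coefficient of z^i). The names `inv_euclid` and `mulmod`, the zero-input guard, and the example field F_{2^4} with f(z) = z^4 + z + 1 are choices made for this illustration; `mulmod` is present only to verify the results.

```python
def deg(p: int) -> int:
    """Degree of a binary polynomial stored as an int (bit i = coefficient of z^i)."""
    return p.bit_length() - 1


def mulmod(a: int, b: int, f: int) -> int:
    """Return a*b mod f by shift-and-add; assumes deg(a) < deg(f)."""
    m, result = deg(f), 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= f
    return result


def inv_euclid(a: int, f: int) -> int:
    """Return a^(-1) mod f for nonzero a with deg(a) < deg(f), f irreducible."""
    if a == 0:
        raise ValueError("zero has no inverse")   # guard added in this sketch
    u, v = a, f                       # step 1
    g1, g2 = 1, 0                     # step 2
    while u != 1:                     # step 3
        j = deg(u) - deg(v)           # step 3.1
        if j < 0:                     # step 3.2
            u, v = v, u
            g1, g2 = g2, g1
            j = -j
        u ^= v << j                   # step 3.3
        g1 ^= g2 << j                 # step 3.4
    return g1                         # step 4


F = 0b10011                           # f(z) = z^4 + z + 1
print(bin(inv_euclid(0b0011, F)))     # 0b1110: (z + 1)^(-1) = z^3 + z^2 + z
print(all(mulmod(a, inv_euclid(a, F), F) == 1 for a in range(1, 16)))   # True
```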

Binary inversion algorithm

Algorithm 2.49 is the polynomial analogue of the binary algorithm for inversion in F_p (Algorithm 2.22). In contrast to Algorithm 2.48, where the bits of u and v are cleared from left to right (high degree terms to low degree terms), the bits of u and v in Algorithm 2.49 are cleared from right to left.

Algorithm 2.49 Binary algorithm for inversion in F_{2^m}

INPUT: A nonzero binary polynomial a of degree at most m − 1.

OUTPUT: a⁻¹ mod f.

1. u ← a, v ← f.

2. g₁ ← 1, g₂ ← 0.

3. While (u ≠ 1 and v ≠ 1) do

3.1 While z divides u do

    u ← u/z.

    If z divides g₁ then g₁ ← g₁/z; else g₁ ← (g₁ + f)/z.

3.2 While z divides v do

    v ← v/z.

    If z divides g₂ then g₂ ← g₂/z; else g₂ ← (g₂ + f)/z.

3.3 If deg(u) > deg(v) then: u ← u + v, g₁ ← g₁ + g₂;

    Else: v ← v + u, g₂ ← g₂ + g₁.

4. If u = 1 then return (g₁); else return (g₂).

The expression involving degree calculations in step 3.3 may be replaced by a simpler comparison on the binary representations of the polynomials. This differs from Algorithm 2.48, where explicit degree calculations are required in step 3.1.
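Algorithm 2.49 can be sketched as below, assuming the integer encoding in which bit i is the coefficient of z^i, so that "z divides u" becomes a test of the lowest bit and division by z becomes a right shift. The name `inv_binary` and the zero-input guard are additions of this sketch.

```python
def deg(p: int) -> int:
    """Degree of a binary polynomial stored as an int (bit i = coefficient of z^i)."""
    return p.bit_length() - 1


def inv_binary(a: int, f: int) -> int:
    """Return a^(-1) mod f for nonzero a with deg(a) < deg(f), f irreducible."""
    if a == 0:
        raise ValueError("zero has no inverse")   # guard added in this sketch
    u, v = a, f                                   # step 1
    g1, g2 = 1, 0                                 # step 2
    while u != 1 and v != 1:                      # step 3
        while (u & 1) == 0:                       # step 3.1: z divides u
            u >>= 1
            g1 = g1 >> 1 if (g1 & 1) == 0 else (g1 ^ f) >> 1
        while (v & 1) == 0:                       # step 3.2: z divides v
            v >>= 1
            g2 = g2 >> 1 if (g2 & 1) == 0 else (g2 ^ f) >> 1
        if deg(u) > deg(v):                       # step 3.3
            u ^= v
            g1 ^= g2
        else:
            v ^= u
            g2 ^= g1
    return g1 if u == 1 else g2                   # step 4


F = 0b10011                                       # f(z) = z^4 + z + 1
print(bin(inv_binary(0b0010, F)))                 # 0b1001: z^(-1) = z^3 + 1
print(bin(inv_binary(0b0011, F)))                 # 0b1110: (z + 1)^(-1) = z^3 + z^2 + z
```

Step 3.3 is written with explicit degrees to mirror the algorithm. In this integer encoding, replacing `deg(u) > deg(v)` by the plain comparison `u > v` looks like one way to realize the simpler comparison mentioned above, since `u > v` implies deg(u) ≥ deg(v); that substitution is this sketch's reading, not something the text spells out.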

Almost inverse algorithm

The almost inverse algorithm (Algorithm 2.50) is a modification of the binary inversion algorithm (Algorithm 2.49) in which a polynomial g and a positive integer k are first computed satisfying

ag ≡ z^k (mod f).

A reduction is then applied to obtain

a⁻¹ = z^(−k) g mod f.

The invariants maintained are

ag₁ + fh₁ = z^k u
ag₂ + fh₂ = z^k v

for some h₁, h₂ that are not explicitly calculated.

Algorithm 2.50 Almost Inverse Algorithm for inversion in F_{2^m}

INPUT: A nonzero binary polynomial a of degree at most m − 1.

OUTPUT: a⁻¹ mod f.

1. u ← a, v ← f.

2. g₁ ← 1, g₂ ← 0, k ← 0.

3. While (u ≠ 1 and v ≠ 1) do

3.1 While z divides u do

    u ← u/z, g₂ ← z·g₂, k ← k + 1.

3.2 While z divides v do

    v ← v/z, g₁ ← z·g₁, k ← k + 1.

3.3 If deg(u) > deg(v) then: u ← u + v, g₁ ← g₁ + g₂.

    Else: v ← v + u, g₂ ← g₂ + g₁.

4. If u = 1 then g ← g₁; else g ← g₂.

5. Return (z^(−k) g mod f).
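The following sketch separates Algorithm 2.50 into its two phases, assuming the integer encoding (bit i is the coefficient of z^i). The names `almost_inverse` and `inv_almost` are hypothetical. As a simplifying assumption, step 5 is carried out here by removing one factor of z at a time, using the same divide-by-z rule as Algorithm 2.49.

```python
def deg(p: int) -> int:
    """Degree of a binary polynomial stored as an int (bit i = coefficient of z^i)."""
    return p.bit_length() - 1


def almost_inverse(a: int, f: int) -> tuple[int, int]:
    """Steps 1 to 4: return (g, k) with a*g = z^k (mod f); a nonzero, deg(a) < deg(f)."""
    if a == 0:
        raise ValueError("zero has no inverse")   # guard added in this sketch
    u, v = a, f
    g1, g2, k = 1, 0, 0
    while u != 1 and v != 1:
        while (u & 1) == 0:           # step 3.1
            u >>= 1
            g2 <<= 1
            k += 1
        while (v & 1) == 0:           # step 3.2
            v >>= 1
            g1 <<= 1
            k += 1
        if deg(u) > deg(v):           # step 3.3
            u ^= v
            g1 ^= g2
        else:
            v ^= u
            g2 ^= g1
    return (g1 if u == 1 else g2), k  # step 4


def inv_almost(a: int, f: int) -> int:
    """Step 5, done naively: divide g by z modulo f, k times."""
    g, k = almost_inverse(a, f)
    for _ in range(k):
        g = g >> 1 if (g & 1) == 0 else (g ^ f) >> 1
    return g


F = 0b10011                           # f(z) = z^4 + z + 1
print(almost_inverse(0b0011, F))      # (1, 4): (z + 1) * 1 = z^4 (mod f)
print(bin(inv_almost(0b0011, F)))     # 0b1110
```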

The reduction in step 5 can be performed as follows. Let l = min{i ≥ 1 | f_i = 1}, where f(z) = f_m z^m + ··· + f₁z + f₀. Let S be the polynomial formed by the l rightmost bits of g. Then Sf + g is divisible by z^l and T = (Sf + g)/z^l has degree less than m; thus T = g z^(−l) mod f. This process can be repeated to finally obtain g z^(−k) mod f. The reduction polynomial f is said to be suitable if l is above some threshold (which may depend on the implementation; e.g., l ≥ W is desirable with W-bit words), since then less effort is required in the reduction step.
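The reduction just described can be sketched as follows, assuming the integer encoding (bit i is the coefficient of z^i). The names `clmul` and `div_zk_mod` are hypothetical, and handling a final pass of fewer than l bits with t = min(l, k) is this sketch's choice; it relies on f ≡ 1 (mod z^t) for every t ≤ l. The example uses f(z) = z^7 + z^3 + 1, for which l = 3.

```python
def clmul(a: int, b: int) -> int:
    """Product of two binary polynomials, with no modular reduction."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def div_zk_mod(g: int, k: int, f: int) -> int:
    """Return g * z^(-k) mod f, assuming f_0 = 1 and deg(g) < deg(f)."""
    l = 1
    while ((f >> l) & 1) == 0:         # l = min{i >= 1 : f_i = 1}
        l += 1
    while k > 0:
        t = min(l, k)                  # clear up to l low bits per pass
        s = g & ((1 << t) - 1)         # S: the t rightmost bits of g
        g = (clmul(s, f) ^ g) >> t     # (S*f + g) / z^t, exact by construction
        k -= t
    return g


F = 0b10001001                         # f(z) = z^7 + z^3 + 1, so l = 3
print(bin(div_zk_mod(1, 3, F)))        # 0b10001: z^(-3) = z^4 + 1
print(bin(div_zk_mod(1, 5, F)))        # 0b100110: z^(-5) = z^5 + z^2 + z
```

The first result can be checked by hand: z^3·(z^4 + 1) = z^7 + z^3 ≡ 1 (mod f).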

Steps 3.1–3.2 are simpler than those in Algorithm 2.49. In addition, g₁ and g₂ grow more slowly in the almost inverse algorithm than in the binary algorithm. Thus one can expect Algorithm 2.50 to outperform Algorithm 2.49 if the reduction polynomial is suitable, and Algorithm 2.49 to outperform Algorithm 2.50 otherwise. As with the binary algorithm, the conditional involving degree calculations in step 3.3 may be replaced with a simpler comparison.

Division

The binary inversion algorithm (Algorithm 2.49) can be easily modified to perform division b/a = ba⁻¹. In cases where the ratio I/M of inversion to multiplication costs is small, this could be especially significant in elliptic curve schemes, since an elliptic curve point operation in affine coordinates (see §3.1.2) could use division rather than an inversion and multiplication.

Division based on the binary algorithm

To obtain b/a, Algorithm 2.49 is modified at step 2, replacing g₁ ← 1 with g₁ ← b. The associated invariants are

ag₁ + fh₁ = ub
ag₂ + fh₂ = vb.

On termination with u = 1, it follows that g₁ = ba⁻¹. The division algorithm is expected to have the same running time as the binary algorithm, since g₁ in Algorithm 2.49 goes to full-length in a few iterations at step 3.1 (i.e., the difference in initialization of g₁ does not contribute significantly to the time for division versus inversion).
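The modification amounts to a one-line change in the initialization. A minimal sketch, assuming the integer encoding (bit i is the coefficient of z^i) and the hypothetical name `div_binary`:

```python
def deg(p: int) -> int:
    """Degree of a binary polynomial stored as an int (bit i = coefficient of z^i)."""
    return p.bit_length() - 1


def div_binary(b: int, a: int, f: int) -> int:
    """Return b/a = b * a^(-1) mod f; a nonzero, deg(a) and deg(b) below deg(f)."""
    if a == 0:
        raise ValueError("division by zero")      # guard added in this sketch
    u, v = a, f
    g1, g2 = b, 0                                 # the only change: g1 <- b, not 1
    while u != 1 and v != 1:
        while (u & 1) == 0:
            u >>= 1
            g1 = g1 >> 1 if (g1 & 1) == 0 else (g1 ^ f) >> 1
        while (v & 1) == 0:
            v >>= 1
            g2 = g2 >> 1 if (g2 & 1) == 0 else (g2 ^ f) >> 1
        if deg(u) > deg(v):
            u ^= v
            g1 ^= g2
        else:
            v ^= u
            g2 ^= g1
    return g1 if u == 1 else g2


F = 0b10011                                       # f(z) = z^4 + z + 1
print(bin(div_binary(0b0110, 0b0011, F)))         # 0b10: (z^2 + z)/(z + 1) = z
```

Calling `div_binary(1, a, f)` reduces to plain inversion, which is one way to sanity-check the variant against Algorithm 2.49.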

If the binary algorithm is the inversion method of choice, then affine point operations would benefit from use of division, since the cost of a point double or addition changes from I + 2M to I + M. (I and M denote the time to perform an inversion and a multiplication, respectively.) If I/M is small, then this represents a significant improvement. For example, if I/M is 3, then use of a division algorithm variant of Algorithm 2.49 provides a 20% reduction in the time to perform an affine point double or addition (from 5M to 4M). However, if I/M > 7, then the savings is less than 12%. Unless I/M is very small, it is likely that schemes are used which reduce the number of inversions required (e.g., halving and projective coordinates), so that point multiplication involves relatively few field inversions, diluting any savings from use of a division algorithm.
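The percentages quoted here follow from comparing I + M with I + 2M. As a small worked check (the function name `division_savings` is hypothetical), the relative saving for a ratio r = I/M is 1 − (r + 1)/(r + 2) = 1/(r + 2):

```python
from fractions import Fraction


def division_savings(ratio) -> Fraction:
    """Fraction of time saved when the cost I + 2M becomes I + M, with ratio = I/M."""
    r = Fraction(ratio)
    return 1 - (r + 1) / (r + 2)


print(division_savings(3))   # 1/5, the 20% reduction quoted for I/M = 3
print(division_savings(7))   # 1/9, about 11.1%, already below 12%
```

Since 1/(r + 2) decreases as r grows, every ratio above 7 saves less than 1/9 of the time.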

Division based on the extended Euclidean algorithm

Algorithm 2.48 can be transformed to a division algorithm in a similar fashion. However, the change in the initialization step may have significant impact on implementation of a division algorithm variant. There are two performance issues: tracking of the lengths of variables, and implementing the addition to g₁ at step 3.4.

In Algorithm 2.48, it is relatively easy to track the lengths of u and v efficiently (the lengths shrink), and, moreover, it is also possible to track the lengths of g₁ and g₂. However, the change in initialization for division means that g₁ goes to full-length immediately, and optimizations based on shorter lengths disappear.

The second performance issue concerns the addition to g₁ at step 3.4. An implementation may assume that ordinary polynomial addition with no reduction may be performed; that is, the degrees of g₁ and g₂ never exceed m − 1. In adapting for division, step 3.4 may be less efficiently implemented, since g₁ is full-length on initialization.

Division based on the almost inverse algorithm

Although Algorithm 2.50 is similar to the binary algorithm, the ability to efficiently track the lengths of g₁ and g₂ (in addition to the lengths of u and v) may be an implementation advantage of Algorithm 2.50 over Algorithm 2.49 (provided that the reduction polynomial f is suitable). As with Algorithm 2.48, this advantage is lost in a division algorithm variant.

It should be noted that efficient tracking of the lengths of g₁ and g₂ (in addition to the lengths of u and v) in Algorithm 2.50 may involve significant code expansion (perhaps t² fragments rather than the t fragments in the binary algorithm). If the expansion cannot be tolerated (because of application constraints or platform characteristics), then almost inverse may not be preferable to the other inversion algorithms (even if the reduction polynomial is suitable).

From the document Guide to Elliptic Curve Cryptography (pages 78-83)