Inverse of Fermat Number Transform Using the Sliding Technique


PAPER


Hamzé Haidar ALAEDDINE†a), El Houssaïn BAGHIOUS†, Nonmembers, and Gilles BUREL†, Member

SUMMARY This paper presents a new efficient method for implementing convolvers and correlators using the Fermat Number Transform (FNT) and its inverse (IFNT). The latter offers advantages over the Inverse Fast Fourier Transform (IFFT). An efficient state-space method for implementing the IFNT over rectangular windows is proposed for cases where there is a large overlap between consecutive input signals. This method, called the Inverse Generalized Sliding Fermat Number Transform (IGSFNT), is useful for reducing the computational complexity of finite ring convolvers and correlators. The algorithm combines the Generalized Sliding technique with matrix calculation in the Galois field. The computational complexity of this method is compared with that of the standard IFNT.

key words: sliding fast Fourier transform, Fermat number transform, sliding Fermat number transform

1. Introduction

The Fermat Number Transform (FNT) was developed for fast, error-free computation of finite digital convolutions and correlations [1], [2]. Such transforms present the following advantages compared to Fast Fourier Transforms (FFT) [3].

• They require few or no multiplications.

• They suppress the use of floating point complex numbers and allow error-free computation.

• All calculations are executed on a finite ring of integers, which is interesting for implementation in VLSI (Very-Large-Scale Integration). Current technology allows placing a complete convolver or correlator based on the Fermat number transform (FNT) [2] onto a fraction of the area available on an ASIC [4].

Hence, using the FNT reduces the processing delay by minimizing the computational complexity.

An efficient state-space method for implementing the FFT over rectangular windows has been proposed for cases where there is a large overlap among consecutive input signals. It is called the Generalized Sliding FFT (GSFFT) [5], [6].

The Generalized Sliding FNT (GSFNT) [7], inspired by the GSFFT, is proposed with the purpose of reducing the complexity of FNT-based convolvers and correlators and thus enlarging the application area for the FNT. This method, based on the Butterfly structure, stores some intermediate results to avoid recomputing them at each sliding of the input window. The intermediate results of the GSFNT structure which can be used in the next iterations are thus preserved. In this contribution, an algorithm for the Inverse Generalized Sliding Fermat Number Transform (IGSFNT) is proposed. This algorithm combines the Generalized Sliding technique with matrix calculation in the Galois field. This method of calculation reduces the computational complexity of IFNT-based convolvers and correlators, which in turn widens the application area for the FNT.

Manuscript received February 4, 2011.

Manuscript revised April 12, 2011.

†The authors are with Université de Brest; CNRS, UMR 3192 Lab-STICC, 6 avenue Victor Le Gorgeu, CS 93837, 29238 Brest cedex 3 – France.

a) E-mail: hamze.alaeddine@univ-brest.fr

DOI: 10.1587/transfun.E94.A.1656

The proposed algorithm is obtained by proper manipulation of the GSFFT.

The paper is organized as follows. The principles of FNT and GSFNT are presented in Sect. 2. In Sect. 3, we will introduce the principle of IGSFNT and the new algorithm of this transform; the computational complexity of both transforms (IFNT and IGSFNT) is presented. Finally, Sect. 4 contains the concluding remarks.

2. Generalized Sliding Fermat Number Transform (GSFNT)

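2.1 Principle of the Fermat Number Transform (FNT)

Discrete transforms based on the FNT concept have been developed for efficient and error-free computation of finite convolutions [1]. The FNT of a discrete-time signal $\chi$ and its inverse are given respectively by:

$$X(j) = \left\langle \sum_{n=0}^{M-1} x(n)\,\alpha^{nj} \right\rangle_{F_t} \tag{1}$$

$$x(n) = \left\langle M^{-1} \sum_{j=0}^{M-1} X(j)\,\alpha^{-nj} \right\rangle_{F_t} \tag{2}$$

where $x(n)$ are the samples of the signal $\chi$ and $M$ is the transform length. $M$ being coprime to $F_t$ ($M$ is a power of 2 and $F_t$ is odd), there exists an integer $M^{-1}$ such that $\langle M \cdot M^{-1} \rangle_{F_t} = 1$. $F_t = 2^{2^t} + 1$ is the $t$-th Fermat number, with $t \in \mathbb{N}$, $\langle \cdot \rangle_{F_t}$ stands for residue reduction modulo $F_t$, and $\alpha$ is a root of unity of order $M$ modulo $F_t$.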
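Equations (1) and (2) can be illustrated with a minimal sketch that evaluates both sums directly, assuming the small parameters $F_2 = 17$, $\alpha = 2$, $M = 8$. The function names (`fnt`, `ifnt`, `cyclic_convolution`) and the test signals are illustrative only and do not come from the paper; the direct summation costs $O(M^2)$ operations and is not the fast butterfly structure discussed later.

```python
# Direct evaluation of Eqs. (1)-(2); all names here are illustrative assumptions.

def fnt(x, alpha, F):
    """X(j) = sum_n x(n) * alpha^(n*j) mod F."""
    M = len(x)
    return [sum(x[n] * pow(alpha, n * j, F) for n in range(M)) % F
            for j in range(M)]

def ifnt(X, alpha, F):
    """x(n) = M^-1 * sum_j X(j) * alpha^(-n*j) mod F."""
    M = len(X)
    m_inv = pow(M, -1, F)        # modular inverse of M (Python 3.8+)
    a_inv = pow(alpha, -1, F)    # modular inverse of alpha
    return [m_inv * sum(X[j] * pow(a_inv, n * j, F) for j in range(M)) % F
            for n in range(M)]

def cyclic_convolution(x, h, alpha, F):
    """Pointwise product in the transform domain, then inverse transform."""
    X, H = fnt(x, alpha, F), fnt(h, alpha, F)
    return ifnt([a * b % F for a, b in zip(X, H)], alpha, F)

F2, ALPHA = 17, 2                      # F_2 = 2^4 + 1, alpha = 2, M = 8
x = [1, 2, 0, 1, 0, 0, 0, 0]
h = [1, 1, 0, 0, 0, 0, 0, 0]
y = cyclic_convolution(x, h, ALPHA, F2)
print(y)                               # [1, 3, 2, 1, 1, 0, 0, 0]
```

The result is exact as long as every true convolution value stays below the modulus, which is why such a small $F_t$ is only suitable for demonstration.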

For an efficient implementation of the FNT on a processor, the choice of parameters is important. If possible, the values $M$ and $\alpha$ are chosen as powers of 2 to allow replacement of multiplications by bit shifts. The particular modulus equal to a Fermat number, $F_t = 2^{2^t} + 1$ with $t \in \mathbb{N}$, offers numerous possibilities for the length $M$ of the transform. The values of $M$ and $\alpha$ associated with a Fermat Number Transform are given by $M = 2^{t+1-i}$ and $\alpha = \langle 2^{2^i} \rangle_{F_t}$ with $0 \le i < t$ [3] (see Table 1).

Copyright © 2011 The Institute of Electronics, Information and Communication Engineers
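The parameter rule above can be checked with a short sketch that enumerates the pairs $(M, \alpha)$ for a given $t$ and verifies by brute force that $\alpha$ has multiplicative order $M$ modulo $F_t$. The helper names `fnt_parameters` and `multiplicative_order` are hypothetical, and the listing is not a reproduction of Table 1.

```python
def fnt_parameters(t):
    """Return F_t and the triples (i, M, alpha) with
    M = 2^(t+1-i) and alpha = 2^(2^i) mod F_t for 0 <= i < t."""
    F = 2 ** (2 ** t) + 1
    return F, [(i, 2 ** (t + 1 - i), pow(2, 2 ** i, F)) for i in range(t)]

def multiplicative_order(a, F):
    """Smallest e > 0 with a^e = 1 mod F (brute force, fine for small F)."""
    e, v = 1, a % F
    while v != 1:
        v = v * a % F
        e += 1
    return e

for t in (2, 3):
    F, rows = fnt_parameters(t)
    for i, M, alpha in rows:
        # alpha is a root of unity of order exactly M
        print(t, F, i, M, alpha, multiplicative_order(alpha, F) == M)
```

For $t = 2$ this yields $(M, \alpha) = (8, 2)$ and $(4, 4)$ modulo $F_2 = 17$, the first of which is the configuration used in the examples of this paper.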

Some tests have shown that an FNT-based convolution reduces the computation time by a factor of 3 to 5 compared to the FFT implementation [2].

2.2 Algorithm of the GSFNT

The aim of the GSFNT is to reduce the number of Butterflies which must be calculated in the FNT for the cases when there is a large overlap among the consecutive input signals.

An example of overlap is illustrated in Fig. 1. It shows the consecutive input signals $\chi_k$ of length $M = 8$, which differ by $N = 2$ samples from time $k$ to time $k+1$.

In a similar way to the GSFFT [5], [6], Fig. 2 shows an example of the Butterfly implementation by the Generalized Sliding technique applied to the FNT. In this example, the input sequence at times $k = 0$ and $k = 1$ has length $M = 8$, which corresponds to $F_2 = 2^4 + 1$ for $\alpha = 2$ (Table 1). Two new data samples ($N = 2$) have been collected between $k = 0$ and $k = 1$.

Table 1 Possible combinations of FNT parameters.

$F_2$ is used here for illustrative purposes only. In this figure, only the Butterflies marked by • must be computed. This example, where $N = 2$, shows that the Generalized Sliding technique can significantly reduce the computational complexity. It is therefore necessary to define an algorithm for this technique.

Fig. 1 Principle of overlap for M = 8, N = 2.

Fig. 2 Generalized Sliding FNT for M = 8, N = 2 at time k = 0 and k = 1.

The algorithm of the GSFNT, proposed in [7], is based on computing only those variables of the FNT structure that were not computed in previous iterations. It refers to the computation of the FNT as the sequence slides over a time-limited rectangular window, $N$ samples at a time. The main advantage is the reduction of complexity.

In the following, $N = 2^n$ and $M = N \cdot L = 2^{n+p}$ are the number of new data samples and the input signal length, respectively.

We define the GSFNT input signal $\chi_k$ as:

$$\chi_k = [x(kN)\;\; x(kN-1)\;\; \dots\;\; x(kN-M+1)]^T = [Y_k^T\;\; Y_{k-1}^T\;\; \dots\;\; Y_{k-L+1}^T]^T \tag{3}$$

where the time index is $k \in [0, L-1]$ and $Y_k = [x(kN)\;\; x(kN-1)\;\; \dots\;\; x(kN-2^n+1)]^T$ is the block of the new samples.
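The window structure of Eq. (3) can be made concrete with a small sketch. It assumes a hypothetical sample stream `x` and treats samples before time 0 as zero (an assumption made here only so that early windows are defined); the helper names `window` and `blocks` are illustrative.

```python
def window(x, k, N, M):
    """chi_k = [x(kN), x(kN-1), ..., x(kN-M+1)], newest sample first.
    Samples outside the recorded stream are taken as 0 (assumption)."""
    return [x[k * N - m] if 0 <= k * N - m < len(x) else 0 for m in range(M)]

def blocks(chi, N):
    """Split chi_k into L = M/N blocks Y_k, Y_{k-1}, ..., Y_{k-L+1}."""
    return [chi[b * N:(b + 1) * N] for b in range(len(chi) // N)]

x = list(range(100, 120))        # hypothetical samples x(0), x(1), ...
N, M = 2, 8                      # L = M / N = 4 blocks per window
chi2 = window(x, 2, N, M)
chi3 = window(x, 3, N, M)

print(blocks(chi3, N)[0])        # Y_3, the only new block: [106, 105]
# All other blocks of chi_3 are already present in chi_2:
print(blocks(chi3, N)[1:] == blocks(chi2, N)[:-1])   # True
```

Only the leading block changes from one window to the next, which is the overlap that the sliding technique exploits.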

The GSFNT is based on the fact that in FNT a large part of the calculations is available from the previous iterations.

To determine an iterative calculation in the algorithm of the GSFNT, we define a state vector $S_k$ which represents the different nodes in the structure of the Butterfly. The vector is defined as [5], [7]:

$$S_k = [S_{0,k}^T\;\; S_{1,k}^T\;\; \dots\;\; S_{n+p,k}^T]^T \tag{4}$$

where $S_{0,k} = Y_k$ and $S_{i,k}$ is a $2^{n+i}$-dimensional vector for $0 \le i \le p-1$. In Fig. 3, the variables at the marked nodes of the GSFNT structure are equivalent to the components of the state vector in the state-space equations. The state vector can be expressed in iterative form, when $0 \le i \le p-1$, as:

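$$S_{i+1,k} = \left\langle (H_{2^{i+1}} \otimes I_{2^n}) \cdot \begin{bmatrix} I_{2^{n+i}} & I_{2^{n+i}} \\ I_{2^{n+i}} & -I_{2^{n+i}} \end{bmatrix} \begin{bmatrix} S_{i,k} \\ V_{i,n}\, S_{i,k-2^{p-i-1}} \end{bmatrix} \right\rangle_{F_t} \tag{5}$$

$H_{2^{i+1}} = [e_1\;\; e_{1+2^i}\;\; e_2\;\; e_{2+2^i}\;\; \dots\;\; e_{2^{i+1}}]$ is a $2^{i+1}$ by $2^{i+1}$ permutation matrix, $e_r$ is the $r$-th column of the $2^{i+1}$ by $2^{i+1}$ identity matrix, and the operator $\otimes$ denotes the Kronecker product.

Fig. 3 Generalized Sliding FNT for M = 8, N = 2 at time k.

The matrix $V_{i,n}$ appearing in Eq. (5) is given by:

$$V_{i,n} = \mathrm{diag}\left(1, \alpha^{\sigma(1)}, \dots, \alpha^{\sigma(2^i-1)}\right) \otimes I_{2^n} \tag{6}$$

with the basis $\alpha$ equal to a power of 2; $\sigma(r)$ is the bit reverse of $r$. Also, when $p \le i \le n+p-1$: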

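$$S_{i,k} = [S_{i,k}(0)\;\; S_{i,k}(1)\;\; \dots\;\; S_{i,k}(M-1)]^T,$$

$$\begin{bmatrix} S_{i+1,k}(c) \\ S_{i+1,k}(c+2^{n+p-i-1}) \end{bmatrix} = \left\langle \begin{bmatrix} 1 & V_{i,c} \\ 1 & -V_{i,c} \end{bmatrix} \begin{bmatrix} S_{i,k}(c) \\ S_{i,k}(c+2^{n+p-i-1}) \end{bmatrix} \right\rangle_{F_t} \tag{7}$$

where $V_{i,c} = \alpha^{\sigma_i(c)}$, $\sigma_i(c)$ is the bit reverse of $\langle c-1 \rangle_{M/2^i}$, and $c$ is such that $0 \le \langle c \rangle_{M/2^i} < M/2^{i+1}$ and $0 \le c < M$.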

The GSFNT of a discrete-time signal $\chi_k$ is given by the vector $S_{i+1,k}$ for $i = n+p-1$:

$$X_k = \mathrm{GSFNT}(\chi_k) = \mathrm{FNT}(\chi_k) = \text{reverse carry}\,(S_{n+p,k}) \tag{8}$$

The difference between the two transforms (FNT and GSFNT) lies in the complexity, which is lower in the case of the GSFNT [5].
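The butterfly of Eq. (7) and the reordering of Eq. (8) can be illustrated with an ordinary, non-sliding radix-2 FNT. The sketch below assumes a standard in-place network with natural-order input, twiddle exponents visited in bit-reversed order, and bit-reversed output; it does not implement the state-vector reuse of the GSFNT, and the exact indexing convention of the paper's figures may differ. All function names are illustrative.

```python
def bit_reverse(v, bits):
    """Reverse the lowest `bits` bits of v."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (v & 1)
        v >>= 1
    return r

def fnt_butterflies(x, alpha, F):
    """Radix-2 FNT: natural-order input, bit-reversed-order output."""
    a = list(x)
    M = len(a)
    span, stage = M // 2, 0
    while span >= 1:
        for block in range(M // (2 * span)):
            # One twiddle per block, exponent taken in bit-reversed order
            v = pow(alpha, span * bit_reverse(block, stage), F)
            start = 2 * span * block
            for c in range(start, start + span):
                t = v * a[c + span] % F
                # Butterfly [[1, v], [1, -v]] applied to (a[c], a[c + span])
                a[c], a[c + span] = (a[c] + t) % F, (a[c] - t) % F
        span //= 2
        stage += 1
    return a

def fnt_direct(x, alpha, F):
    """Reference: direct evaluation of Eq. (1)."""
    M = len(x)
    return [sum(x[n] * pow(alpha, n * j, F) for n in range(M)) % F
            for j in range(M)]

x = [3, 1, 4, 1, 5, 9, 2, 6]
S = fnt_butterflies(x, 2, 17)
X = [S[bit_reverse(j, 3)] for j in range(8)]   # undo the bit-reversed ordering
print(X == fnt_direct(x, 2, 17))               # True
```

Because $\alpha$ is a power of 2, each multiplication by `v` could be replaced by a bit shift followed by reduction modulo $F_t$; the sketch uses `pow` and `%` only for brevity.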

3. Inverse GSFNT (IGSFNT)

3.1 Principle

The principle of the IGSFNT is inspired by the IGSFFT [5], [6] and adapted to computations modulo Fermat numbers. The approach is shown in Fig. 4 for $M = 8$, $F_2 = 2^4 + 1$, $\alpha = 2$ and $N = 2$ samples to be calculated. The output vector from the IGSFNT is staggered in a manner similar to the forward transform, so that only $N$ new output samples are computed per iteration of the IGSFNT.

Fig. 4 Inverse of GSFNT over M = 8, N = 2 at time k = 1.

Fig. 5 Number of butterflies as a function of N for M = 64.

The complexity of this procedure depends on the values of $N$ and $M$. Although $N$ can take any value from 1 to $M$, the most practical choice for most applications is to let $N$ be a power of 2. Under these conditions, the computational complexity of performing the IGSFNT after every $N$ new data samples, in terms of the number of butterflies to be calculated, is [6]:

$$C_{IGSFNT} = \frac{M}{2}\left(\log_2 N + 2\right) - N \tag{9}$$
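One way to see where the count in Eq. (9) comes from is to trace, stage by stage, which butterflies of an inverse radix-2 network are actually required when only $N$ output samples are wanted. The sketch below assumes a network whose last inverse stage has span $M/2$ and in which the wanted samples occupy outputs $0, \dots, N-1$; this layout and the function names are assumptions made for illustration.

```python
def butterflies_needed(M, N):
    """Count the butterflies required to produce outputs 0..N-1 only."""
    needed = set(range(N))
    count = 0
    span = M // 2                    # last inverse stage pairs c with c + M/2
    while span >= 1:
        # Upper leg of every butterfly touching a needed node
        tops = {c & ~span for c in needed}
        count += len(tops)
        # Both inputs of those butterflies are needed one stage earlier
        needed = tops | {c | span for c in tops}
        span //= 2
    return count

def c_igsfnt(M, N):
    """Closed form of Eq. (9) for N a power of 2."""
    log2_n = N.bit_length() - 1
    return M // 2 * (log2_n + 2) - N

for N in (1, 2, 4, 8, 16, 32, 64):
    print(N, butterflies_needed(64, N), c_igsfnt(64, N))
```

Under these assumptions the traced count agrees with Eq. (9) for every power-of-2 value of $N$, ranging from $M - 1$ at $N = 1$ to $\frac{M}{2}\log_2 M$ at $N = M$.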

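If the root of unity $\alpha$ is simple, i.e., $\alpha = \langle \sqrt{2}^{\,m} \rangle_{F_t} = \langle 2^{m/2} \rangle_{F_t}$, the butterfly structure requires two additions and one bit shift, but if $\alpha$ is complex, i.e., $\alpha = \langle \sqrt{2}^{\,m} \rangle_{F_t} = \langle 2^{(m-1)/2} \cdot 2^{2^{t-2}} (2^{2^{t-1}} - 1) \rangle_{F_t}$, the structure requires three additions and two bit shifts.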
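The extra cost of the complex case comes from the fact that $\sqrt{2}$ modulo $F_t$ is a difference of two powers of 2 rather than a single one. The following sketch, with hypothetical helper names, checks the standard identity $\langle \sqrt{2} \rangle_{F_t} = \langle 2^{2^{t-2}}(2^{2^{t-1}} - 1) \rangle_{F_t}$ for $t \ge 2$ and shows a multiplication by $\sqrt{2}$ written as two shifts and one subtraction.

```python
def sqrt2_mod_fermat(t):
    """Return (F_t, r) with r = 2^(2^(t-2)) * (2^(2^(t-1)) - 1) mod F_t, t >= 2."""
    F = 2 ** (2 ** t) + 1
    r = (1 << 2 ** (t - 2)) * ((1 << 2 ** (t - 1)) - 1) % F
    return F, r

def mul_sqrt2(x, t):
    """x * sqrt(2) mod F_t using two shifts and one subtraction."""
    F = 2 ** (2 ** t) + 1
    q = 2 ** (t - 2)
    return ((x << (3 * q)) - (x << q)) % F   # 2^(3q) - 2^q = 2^q * (2^(2q) - 1)

for t in (2, 3, 4):
    F, r = sqrt2_mod_fermat(t)
    print(t, F, r, r * r % F)    # the last column is always 2
```

For $t = 2$ this gives $r = 6$ and $6^2 = 36 \equiv 2 \pmod{17}$. The two shifts and the subtraction in `mul_sqrt2`, added to the two additions of the butterfly itself, are consistent with the operation count quoted above.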

When $N = 1$, the IGSFNT reduces to the ISFNT, whereas the case $N = M$ corresponds to the ordinary IFNT ($C_{IGSFNT}$ is seen to vary from $M - 1$ for $N = 1$ to $\frac{M}{2}\log_2 M$ for $N = M$). In Fig. 5, the computational complexity, in terms of the number of butterflies, of both transforms (IFNT and IGSFNT) is plotted and compared as a function of $N$ for length $M = 64$ and $N$ a power of 2.

This figure shows that a reduction in the number of butterflies is observed in the case where $1 \le N < \frac{M}{2}$.

3.2 The Proposed Algorithm of the IGSFNT

This subsection presents a new algorithm to calculate the IFNT when its inputs hop over successive overlapping blocks of a data sequence. It is inspired by [5], [6] and adapted to computations modulo Fermat numbers.

This algorithm is called the Inverse Generalized Sliding FNT (IGSFNT) and is obtained by proper manipulation of the GSFFT. With the IGSFNT, it is not necessary to calculate all the elements of the IFNT, which reduces the computational complexity. Equation (7) implies, for $i = n+p-1, \dots, p$:

$$\begin{bmatrix} S_{i,k}(c) \\ S_{i,k}(c+2^{n+p-i-1}) \end{bmatrix} = \left\langle \begin{bmatrix} 1 & V_{i,c} \\ 1 & -V_{i,c} \end{bmatrix}^{-1} \begin{bmatrix} S_{i+1,k}(c) \\ S_{i+1,k}(c+2^{n+p-i-1}) \end{bmatrix} \right\rangle_{F_t} \tag{10}$$

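where $[\,\cdot\,]^{-1}$ denotes the matrix inverse. The inverse, modulo $F_t$, of the matrix $A = \begin{bmatrix} 1 & V_{i,c} \\ 1 & -V_{i,c} \end{bmatrix}$, where $\langle \det(A) \rangle_{F_t} \neq 0$, is calculated by [8]:

$$A^{-1} = \left\langle \Delta^{-1}\, \mathrm{adj}(A) \right\rangle_{F_t} \tag{11}$$

where $\Delta = \langle \det(A) \rangle_{F_t}$, $\Delta^{-1}$ is the inverse of $\Delta$ modulo $F_t$ and $\mathrm{adj}(A)$ is the adjoint of $A$. We note that, since $F_t$ is a prime number, $\Delta^{-1}$ always exists.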
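Equation (11) applied to the butterfly matrix $A$ can be sketched numerically. The example assumes $F_2 = 17$ and a twiddle value $V_{i,c} = 4$ chosen only for illustration; the explicit adjugate $\mathrm{adj}(A) = \begin{bmatrix} -V & -V \\ -1 & 1 \end{bmatrix}$ and the determinant $\det(A) = -2V$ follow from the usual $2 \times 2$ formulas, and the function names are hypothetical.

```python
def inverse_butterfly_matrix(v, F):
    """A^-1 mod F for A = [[1, v], [1, -v]], computed as Delta^-1 * adj(A)."""
    delta = (-2 * v) % F               # det(A) = -2v
    delta_inv = pow(delta, -1, F)      # exists whenever gcd(delta, F) = 1
    adj = [[-v, -v], [-1, 1]]          # adjugate of A
    return [[delta_inv * e % F for e in row] for row in adj]

def matmul_mod(P, Q, F):
    """Matrix product reduced modulo F."""
    return [[sum(P[r][m] * Q[m][c] for m in range(len(Q))) % F
             for c in range(len(Q[0]))] for r in range(len(P))]

F2, v = 17, 4
A = [[1, v], [1, (-v) % F2]]
A_inv = inverse_butterfly_matrix(v, F2)
print(A_inv)                           # [[9, 9], [15, 2]]
print(matmul_mod(A, A_inv, F2))        # [[1, 0], [0, 1]]
```

Written out, the inverse butterfly maps $(u, w)$ to $\left((u + w)/2,\ (u - w)/(2V)\right)$ modulo $F_t$; when $V$ is a power of 2, both divisions are again multiplications by powers of 2.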

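In our case, $A^{-1}$ is given by:

$$A^{-1} = \left\langle \Delta^{-1} \cdot (-2V_{i,c}) \cdot \begin{bmatrix} 1 & V_{i,c} \\ 1 & -V_{i,c} \end{bmatrix}^{-1} \right\rangle_{F_t}$$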

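and Eq. (10) becomes:

$$\begin{bmatrix} S_{i,k}(c) \\ S_{i,k}(c+2^{n+p-i-1}) \end{bmatrix} = \left\langle \left( \Delta^{-1} \cdot (-2V_{i,c}) \cdot \begin{bmatrix} 1 & V_{i,c} \\ 1 & -V_{i,c} \end{bmatrix}^{-1} \right) \cdot \begin{bmatrix} S_{i+1,k}(c) \\ S_{i+1,k}(c+2^{n+p-i-1}) \end{bmatrix} \right\rangle_{F_t} \tag{12}$$

Also, Eq. (5) implies, for $i = p-1, \dots, 0$:

$$\begin{bmatrix} S_{i,k} \\ V_{i,n}\, S_{i,k-2^{p-i-1}} \end{bmatrix} = \left\langle (Q) \cdot (H) \cdot S_{i+1,k} \right\rangle_{F_t} \tag{13}$$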

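with

$$Q = \left\langle \begin{bmatrix} I_{2^{n+i}} & I_{2^{n+i}} \\ I_{2^{n+i}} & -I_{2^{n+i}} \end{bmatrix}^{-1} \right\rangle_{F_t} = \left\langle \det \begin{bmatrix} I_{2^{n+i}} & I_{2^{n+i}} \\ I_{2^{n+i}} & -I_{2^{n+i}} \end{bmatrix}^{-1} \right\rangle_{F_t} \cdot \det \begin{bmatrix} I_{2^{n+i}} & I_{2^{n+i}} \\ I_{2^{n+i}} & -I_{2^{n+i}} \end{bmatrix} \cdot \left( \begin{bmatrix} I_{2^{n+i}} & I_{2^{n+i}} \\ I_{2^{n+i}} & -I_{2^{n+i}} \end{bmatrix}^{-1} \right) \tag{14}$$

$$Q = \Delta^{-1} \cdot 2^{(2^{(n+i)}-1)} \cdot \begin{bmatrix} I_{2^{n+i}} & I_{2^{n+i}} \\ I_{2^{n+i}} & -I_{2^{n+i}} \end{bmatrix}$$

and

$$H = \left\langle (H_{2^{i+1}} \otimes I_{2^n})^{-1} \right\rangle_{F_t} = (H_{2^{i+1}} \otimes I_{2^n}) = \begin{bmatrix} e_{1,1} I_{2^n} & e_{1,1+2^i} I_{2^n} & \cdots & e_{1,2^{i+1}} I_{2^n} \\ e_{1+2^i,1} I_{2^n} & \cdots & \cdots & e_{1+2^i,2^{i+1}} I_{2^n} \\ \vdots & \cdots & e_{p,s} I_{2^n} & \vdots \\ e_{2^{i+1},1} I_{2^n} & e_{2^{i+1},1+2^i} I_{2^n} & \cdots & e_{2^{i+1},2^{i+1}} I_{2^n} \end{bmatrix} \tag{15}$$

where $e_{p,s}$ indicate the elements of the matrix $H_{2^{i+1}}$, calculated by:

$$\begin{cases} e_{p,s} = 1 & \text{if } p = s \\ e_{p,s} = 0 & \text{if } p \neq s \end{cases}$$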

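We replace $Q$ and $H$ by their values in Eq. (13), and we obtain:

$$\begin{bmatrix} S_{i,k} \\ V_{i,n}\, S_{i,k-2^{p-i-1}} \end{bmatrix} = \left\langle \Delta^{-1} \cdot 2^{(2^{(n+i)}-1)} \cdot \begin{bmatrix} I_{2^{n+i}} & I_{2^{n+i}} \\ I_{2^{n+i}} & -I_{2^{n+i}} \end{bmatrix} \cdot (H_{2^{i+1}} \otimes I_{2^n}) \cdot S_{i+1,k} \right\rangle_{F_t} \tag{16}$$

With reference to Eq. (16), we can calculate the vector $S_{i,k}$ in order to find the block $Y_k$ of the new samples, by:

$$S_{i,k} = \left\langle \left( (\Delta^{-1}) \cdot (2^{(2^{(n+i)}-1)}) \cdot [I_{2^{n+i}}\;\; I_{2^{n+i}}] \right) \cdot (H_{2^{i+1}} \otimes I_{2^n}) \cdot (S_{i+1,k}) \right\rangle_{F_t} \tag{17}$$

The block $Y_k$ of the new samples is calculated for $i = 0$ by:

$$Y_k = S_{0,k} \tag{18}$$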
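The scalar factor $\Delta^{-1} \cdot 2^{(2^{(n+i)}-1)}$ appearing in Eqs. (16) and (17) can be checked numerically. The sketch below builds the block matrix $B = \begin{bmatrix} I & I \\ I & -I \end{bmatrix}$ with identity blocks of an even size $2^{n+i}$, assumes $\Delta = \langle \det(B) \rangle_{F_t} = \langle 2^{2^{n+i}} \rangle_{F_t}$ (which holds for even block sizes), and verifies that the resulting $Q$ satisfies $\langle Q \cdot B \rangle_{F_t} = I$. Names and parameter values are illustrative.

```python
def block_matrix(size):
    """B = [[I, I], [I, -I]] with identity blocks of the given size."""
    n = 2 * size
    B = [[0] * n for _ in range(n)]
    for r in range(size):
        B[r][r] = B[r][r + size] = B[r + size][r] = 1
        B[r + size][r + size] = -1
    return B

def q_matrix(size, F):
    """Q = Delta^-1 * 2^(size-1) * B mod F, assuming det(B) = 2^size (size even)."""
    delta_inv = pow(pow(2, size, F), -1, F)
    scale = delta_inv * pow(2, size - 1, F) % F     # reduces to 2^-1 mod F
    return [[scale * e % F for e in row] for row in block_matrix(size)]

def matmul_mod(P, Q, F):
    """Matrix product reduced modulo F."""
    return [[sum(P[r][m] * Q[m][c] for m in range(len(Q))) % F
             for c in range(len(Q[0]))] for r in range(len(P))]

F2, size = 17, 2                     # size = 2^(n+i) with n + i = 1
Q = q_matrix(size, F2)
product = matmul_mod(Q, block_matrix(size), F2)
print(product == [[int(r == c) for c in range(4)] for r in range(4)])   # True
```

The factor collapses to $2^{-1}$ modulo $F_t$ because $B \cdot B = 2I$, so each inverse stage of this form amounts to additions, subtractions and a halving modulo $F_t$.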

Finally, the IGSFNT of a discrete-time signal $X_k$ is given by:

$$\chi_k = \mathrm{IGSFNT}(X_k) = \mathrm{IFNT}(X_k) = [S_{0,k}^T\;\; S_{0,k-1}^T\;\; \dots\;\; S_{0,k-L+1}^T]^T = [Y_k^T\;\; Y_{k-1}^T\;\; \dots\;\; Y_{k-L+1}^T]^T \tag{19}$$

To highlight, schematically, the efficiency of the new algorithm, we present a block diagram in Fig. 6.

This block diagram shows that the new IGSFNT algorithm reduces the computational complexity by recovering only the new samples (step 4).
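The idea of recovering only the new samples can be sketched with a pruned inverse butterfly network. The code below is not the authors' state-space algorithm: it assumes a transform stored in bit-reversed order, undoes only the butterflies on which the first $N$ time samples depend, and counts them. The parameters $F_2 = 17$, $\alpha = 2$, $M = 8$, $N = 2$ follow the running example; all function names are hypothetical.

```python
def bit_reverse(v, bits):
    """Reverse the lowest `bits` bits of v."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (v & 1)
        v >>= 1
    return r

def fnt_bit_reversed(x, alpha, F):
    """Direct FNT (Eq. (1)) with outputs stored in bit-reversed order."""
    M = len(x)
    bits = M.bit_length() - 1
    X = [sum(x[n] * pow(alpha, n * j, F) for n in range(M)) % F
         for j in range(M)]
    return [X[bit_reverse(pos, bits)] for pos in range(M)]

def newest_samples(S, alpha, F, N):
    """Undo only the butterflies needed for time samples 0..N-1."""
    a, M = list(S), len(S)
    needed, plan, span = set(range(N)), [], M // 2
    while span >= 1:                     # trace dependencies back from the outputs
        tops = sorted({c & ~span for c in needed})
        plan.append((span, tops))
        needed = set(tops) | {c | span for c in tops}
        span //= 2
    half, used = pow(2, -1, F), 0
    for span, tops in reversed(plan):    # execute from span 1 up to span M/2
        stage = (M // (2 * span)).bit_length() - 1
        for c in tops:
            v = pow(alpha, span * bit_reverse(c // (2 * span), stage), F)
            u, w = a[c], a[c + span]
            a[c] = (u + w) * half % F                         # (u + w) / 2
            a[c + span] = (u - w) * half * pow(v, -1, F) % F  # (u - w) / (2v)
            used += 1
    return a[:N], used

x = [3, 1, 4, 1, 5, 9, 2, 6]
S = fnt_bit_reversed(x, 2, 17)
print(newest_samples(S, 2, 17, 2))       # ([3, 1], 10)
```

With $N = 2$ the sketch evaluates 10 butterflies instead of the 12 required by a full length-8 inverse, which matches $C_{IGSFNT}$ for $M = 8$, $N = 2$.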

Fig. 6 Block diagram of GSFNT and IGSFNT algorithms.

The proposed algorithm, using matrix calculation in the Galois field, yields an interesting result: because the input signal $\chi_k$ is deduced from $S_{0,k}$, for $i = 0$, the complexity is reduced by $R = \frac{M}{2}\log_2\frac{M}{N} - (M - N)$ butterflies compared to the standard IFNT. This reduction in complexity is obtained as follows:

$$R = C_{IFNT} - C_{IGSFNT} = \left(\frac{M}{2}\log_2 M\right) - \left(\frac{M}{2}\left(\log_2 N + 2\right) - N\right) = \frac{M}{2}\log_2\frac{M}{N} - (M - N)$$
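As a worked instance of this expression, with $M = 64$ and $N = 8$ chosen here only for illustration:

$$C_{IFNT} = \frac{64}{2}\log_2 64 = 192, \qquad C_{IGSFNT} = \frac{64}{2}\left(\log_2 8 + 2\right) - 8 = 152,$$

$$R = 192 - 152 = \frac{64}{2}\log_2\frac{64}{8} - (64 - 8) = 96 - 56 = 40.$$

For these values the IGSFNT would therefore save 40 of the 192 butterflies of the standard IFNT.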

It should be noted that the sliding technique may be extended to many other transforms, such as the Discrete Cosine Transform (DCT) and the Discrete Hartley Transform (DHT), which are widely used in transform-domain adaptive filters.

These transforms are closely related to the Fourier transform and admit FFT-like fast algorithms. The main distinction from the FFT is that they transform real inputs to real outputs.

4. Conclusion

This paper has proposed an algorithm called the Inverse Generalized Sliding Fermat Number Transform (IGSFNT). In cases where there is a large overlap among consecutive input signals, this algorithm, which uses matrix calculation in the Galois field, recovers only the new samples of the input signal $\chi_k$ at time $k$. This method of calculation reduces the computational complexity of finite ring convolver and correlator implementations in Digital Signal Processing (DSP), compared to the Inverse Fermat Number Transform (IFNT) and the Inverse Fast Fourier Transform (IFFT). The algorithm proposed here can also be applied to other FNTs, when the transform length $M$ is a power of 2.

References

[1] R.C. Agarwal and C.S. Burrus, “Number theoretic transform to implement fast digital convolution,” Proc. IEEE, vol.63, pp.550–560, April 1975.

[2] R.C. Agarwal and C.S. Burrus, “Fast convolution using Fermat number transform with application to digital filtering,” IEEE Trans. Acoust. Speech Signal Process., vol.ASSP-22, no.2, pp.87–97, 1974.

[3] G.A. Jullien, Number Theoretic Techniques in Digital Signal Processing, vol.80, Chap. 2, pp.69–163, Academic Press, 1991.

[4] S. Gudvangen and A. Patel, “Rapid synthesis of a macro-pipelined CMOS ASIC for the Fermat number transform,” Proc. NORSIG, pp.143–148, Sept. 1995.

[5] S. Gazor and B. Farhang-Boroujeny, “A state space approach for efficient implementation of block LMS adaptive filters,” Proc. Singapore Int. Conf. Commun. Syst. ICCS/ISITA’92, (Singapore), pp.808–812, Nov. 1992.

[6] B. Farhang-Boroujeny and S. Gazor, “Generalized sliding FFT and its application to implementation of block LMS adaptive filters,” IEEE Trans. Signal Process., vol.42, no.3, pp.532–538, March 1994.

[7] H. Alaeddine, E.-H. Baghious, G. Madre, and G. Burel, “Realization block robust adaptive filters using generalized sliding Fermat number transform,” 14th European Signal Processing Conference (EUSIPCO), Florence (Italy), Sept. 2006.

[8] K.H. Rosen, Elementary Number Theory and Its Applications, 5th ed., Chapter 4, Addison Wesley, 2005.


Hamzé Haidar Alaeddine was born in Lebanon in 1980. He received the B.Sc. degree in electronics from the Lebanese University, Lebanon, in 2002, and the M.Sc. degree from the University of Brest, France, in 2003. In 2007, he received the Ph.D. degree from the University of Brest, France. His research interests include signal processing for telecommunications, adaptive filtering, echo cancellation, and number theoretic transforms.

El Houssaïn Baghious was born in 1955 in Morocco. He received the Ph.D. degree from the University of Brest, France, in 1991. Since 1992, he has been an Associate Professor and a researcher at the Lab-STICC Laboratory (UMR CNRS 3192), University of Brest, France. His research interests are in the areas of Number Theoretic Transform and its applications to signal processing for digital communications.

Gilles Burel was born in 1964. He received the M.Sc. degree from Ecole Supérieure d’Electricité, Gif Sur Yvette, France, in 1988 and the Ph.D. degree from the University of Brest, France, in 1991. From 1988 to 1997 he was a member of the technical staff of Thomson CSF, then Thomson Multimedia, Rennes, France, where he worked on image processing and pattern recognition applications as project manager. Since 1997, he has been Professor of Digital Communications, Image and Signal Processing at the University of Brest. He is Director of the Doctoral School SICMA and Associate Director of the Lab-STICC Laboratory (UMR CNRS 3192). He is the author of 19 patents, one book and more than one hundred scientific papers. His present research interests are in signal processing for digital communications with emphasis on MIMO systems and interception of communications.