HAL Id: hal-02545612
https://hal.archives-ouvertes.fr/hal-02545612
Submitted on 17 Apr 2020
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
High Radix BKM algorithm with Selection by Rounding
Laurent-Stephane Didier, Fabien Rico
To cite this version:
Laurent-Stephane Didier, Fabien Rico. High Radix BKM algorithm with Selection by Rounding.
[Research Report] lip6.2002.009, LIP6. 2002. ⟨hal-02545612⟩
Laurent-Stephane Didier, Fabien Rico
Laboratoire d'Informatique de Paris 6
January 21, 2002
Abstract
We present in this paper a high radix implementation of the BKM algorithm. This is a shift-and-add CORDIC-like algorithm that allows fast computation of the complex exponential and logarithm. The improvement lies in fewer iterations for a given precision and in the reduction of the size of lookup tables for high radices.
Keywords: Elementary function, CORDIC algorithm, Computer Arithmetic, High Radix
1 Introduction
Several classes of algorithms for computing the complex exponential and logarithm have been proposed. A first class is composed of algorithms using a polynomial approximation of the function to be computed [Tan89, Smi89, Mul97, DMF00]. Generally, few iterations have to be done by this kind of algorithm, but multiplications are required. Next, multipartite table methods, which can be implemented for low precision, require very few arithmetic operators [SM95, HT95, dDT00]. In contrast, quadratic convergence algorithms allow better precision, but are complex and hard to implement [Bre76, BB84]. Such algorithms have the best asymptotic complexity, but become really efficient only for thousands of bits. Finally, shift-and-add algorithms, the most famous of which is CORDIC [Vol59, Wal71, DM96], allow fairly good precision (up to hundreds of bits) with low-cost iterations [Mul97]. Similarly to the CORDIC algorithm, the BKM algorithm allows the evaluation of many elementary functions [BKM94, BI99]. In order to reduce the number of iterations for a given precision, high radix implementations have been proposed for CORDIC [ALB00a, ALB00b, Lew99].
This paper presents a high radix implementation of the BKM algorithm for computing the complex logarithm and exponential without scaling factor. We show a method for choosing the digits and a convergence domain for the BKM algorithm in high radix. A first method using the radix $\beta = 10$ has been presented in [IMR00], but this method uses a lot of memory. Indeed, such an algorithm needs a number of digits of the order of magnitude of $\beta^2$. As each iteration needs to store one complex number in a table per digit, it will use an $O(N\beta^2)$-entry table (where $N$ is the final number of iterations), while the CORDIC algorithm only uses an $O(N\beta)$-entry table (see [ALB00a] and [ALB00b]).
The paper is organized as follows. We first briefly present the initial version of the BKM algorithm. Next, we propose a modification of the iteration used in this algorithm, designed to solve the principal problem of high-radix BKM, which is the memory cost. Thereafter, we examine the two modes of BKM: E-mode (section 3) for computing the complex exponential and L-mode (section 4) for the complex logarithm. Each section presents the digit selection method, the domain of convergence and an example of argument reduction to this domain. Finally, in section 5, we compare our algorithm with the other high radix CORDIC implementations and conclude.
1.1 Notations
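Let us consider a complex number $z$. We note its real part $z^x$ and its imaginary part $z^y$. Thus
$$z = z^x + i z^y.$$
We call Gauss numbers the complex numbers whose real and imaginary parts are integers:
$$d = d^x + i d^y \quad \text{with } d^x, d^y \in \mathbb{Z}.$$
We note $\lceil z \rfloor_p$ the value of $z$ whose real and imaginary parts are rounded to the nearest with $p$ fractional digits. For instance, we note $\lceil z \rfloor_0$ the Gauss number closest to $z$.
We note $\langle z \rangle_p$ the value of $z$ whose real and imaginary parts are truncated down with $p$ fractional digits.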
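The two notations can be illustrated with a minimal Python sketch. The function names `round_nearest` and `truncate` are hypothetical, the digits are assumed to be counted in a radix `beta` passed as a parameter, ties are assumed to round upward, and "truncated down" is read as rounding toward minus infinity; none of these choices is fixed by the text above.

```python
import math

def round_nearest(z: complex, p: int = 0, beta: int = 2) -> complex:
    """Round both parts of z to the nearest multiple of beta**-p (ties go up)."""
    scale = beta ** p
    return complex(math.floor(z.real * scale + 0.5) / scale,
                   math.floor(z.imag * scale + 0.5) / scale)

def truncate(z: complex, p: int, beta: int = 2) -> complex:
    """Truncate both parts of z downward, keeping p fractional radix-beta digits."""
    scale = beta ** p
    return complex(math.floor(z.real * scale) / scale,
                   math.floor(z.imag * scale) / scale)

z = 2.6 - 1.4j
gauss = round_nearest(z)        # closest Gauss number to z
kept = truncate(z, 2)           # two fractional binary digits kept
```

With `p = 0` the first function returns the Gauss number closest to its argument, which is the case used for digit selection later on.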
1.2 The bkm algorithm
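The BKM algorithm presented in [BKM94] computes the following iterations:
$$E_{n+1} = E_n\,(1 + d_n\beta^{-n}), \qquad L_{n+1} = L_n - \ln(1 + d_n\beta^{-n}),$$
where $\beta$ is an integer and the values $d_n$ are chosen so that:
$$d_n = d^x_n + i d^y_n, \qquad d^x_n, d^y_n \in \{-a, -a+1, \ldots, 0, \ldots, a-1, a\} \quad \text{with } a = \frac{\beta}{2} + 1. \tag{1}$$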
In this paper, the $d_n$ values are called digits and the integer $\beta$ is the radix of the algorithm.
In order to compute one iteration, the values $\ln(1 + d_n\beta^{-n})$ are stored in a lookup table and are accessed by $d_n$. The size of this table directly depends on the number of possible digits. Thus, each iteration only needs a shift, an addition and a multiplication by a digit $d_n$.
It is shown in [Mul97] that these iterations preserve the following equalities:
$$\frac{E_n}{E_1} = \frac{\exp(L_1)}{\exp(L_n)} \quad\text{and}\quad L_n - L_1 = \ln(E_1) - \ln(E_n).$$
As a consequence, it is possible to use this algorithm through two different modes:
- E-mode: if we choose $d_n$ so that $L_n$ converges to 0, then $E_n$ will converge to $E_1 e^{L_1}$, making possible the computation of the complex exponential;
- L-mode: similarly, if we choose $d_n$ so that $E_n$ converges to 1, then $L_n$ will converge to $L_1 + \ln(E_1)$, which permits the computation of the complex logarithm.
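The preserved quantity behind both modes is the product $E_n \exp(L_n)$. The following Python sketch checks this numerically for a few arbitrary digits; the helper name `bkm_step`, the radix 16, the starting values and the digits are all illustrative assumptions, and `cmath.log` stands in for the lookup table of a real implementation.

```python
import cmath

def bkm_step(E: complex, L: complex, d: complex, n: int, beta: int):
    """One BKM iteration: multiply E by (1 + d*beta**-n), subtract its log from L."""
    factor = 1 + d * beta ** -n
    return E * factor, L - cmath.log(factor)

beta = 16                                  # illustrative radix
E1, L1 = 1.2 + 0.3j, 0.4 - 0.1j            # arbitrary starting values
E, L = E1, L1
for n, d in enumerate([3 - 2j, -5 + 1j, 2 + 7j], start=1):
    E, L = bkm_step(E, L, d, n, beta)

invariant_before = E1 * cmath.exp(L1)
invariant_after = E * cmath.exp(L)         # equal up to rounding errors
```

Whatever digits are chosen, the product stays constant, so driving $L_n$ to 0 leaves $E_n$ equal to $E_1 e^{L_1}$, and driving $E_n$ to 1 leaves $L_n$ equal to $L_1 + \ln(E_1)$.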
Initially, this algorithm was developed in radix 2, which simplifies the choice of the digits $d_n$. However, this algorithm gives one digit of the result at each iteration. For instance, a radix 2 algorithm gives only one bit of the result at each step. As a consequence, a high radix implementation of the BKM algorithm would need fewer iterations for the same precision.
Unfortunately, three problems appear: the complexity of the digit selection, the numerical instability of the first iterations and the size of the tables for high radix implementations.
It has been shown in [IMR00] that for radix 10 the BKM algorithm looks for digits in a set of one hundred elements. This means that each iteration requires a lookup table having one hundred entries, making this implementation inefficient for higher radices. Therefore, we will show a method for reducing the size of the tables.
2 Reducing the size of lookup tables
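In order to minimize the set of digits, we propose, similarly to [Lew99], to split each iteration into two half-iterations. The first half-iteration is designed to reduce the imaginary part. This iteration is close to the CORDIC iteration, see [Mul97]:
$$\tilde E_n = E_n\,(1 + i d^y_n\beta^{-n}), \qquad \tilde L_n = L_n - \ln(1 + i d^y_n\beta^{-n}), \tag{2}$$
where the tilde marks the value obtained after the first half-iteration. The second half-iteration has to reduce the real part. This computation is close to the BKM iteration [Mul97]:
$$E_{n+1} = \tilde E_n\,(1 + d^x_n\beta^{-n}), \qquad L_{n+1} = \tilde L_n - \ln(1 + d^x_n\beta^{-n}). \tag{3}$$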
Consequently, we can deduce from these iterations that for the E-mode (respectively the L-mode):
- the digits $d^y_n$ are chosen in order to only minimize $|L^y_n|$ (respectively $|E^y_n|$),
- the digits $d^x_n$ are chosen in order to only minimize $|L^x_n|$ (respectively $E^x_n$).
In this way, the number of iterations is doubled compared to the initial algorithm, but the number of digits is reduced by one order of magnitude because the digits are chosen from a smaller set:
$$\{-a, -a+1, \ldots, 0, \ldots, a-1, a\} \cup \{-ia, -i(a-1), \ldots, 0, \ldots, i(a-1), ia\} \quad\text{with } a = \frac{\beta}{2}+1.$$
In other words, there are only $O(\beta)$ possible digits instead of $O(\beta^2)$.
Now, we will show that on a small convergence domain, it is possible to choose $d^x_n$ or $d^y_n$ only by rounding the value $E_n$ (see 4.1) or $L_n$ (see 3.1), which is more efficient than choosing from a table of values as in [Lew99]. But in order to have a functional algorithm, it is necessary to extend the convergence domain by examining closely the two first iterations of our algorithm. To this end, we will study separately the E-mode and the L-mode.
3 The computation of complex exponential : E-mode
In this mode, with regard to $L_1$, we construct a sequence of $d^x_n$, $d^y_n$ such that the sequence $L_n$ converges to 0. This section outlines a method for choosing the digits $d^x_n$ and $d^y_n$. This choice is critical for the convergence and the performance of our algorithm. Indeed, a complex choice will lead to a rapid convergence on a wide domain but will be costly not only in terms of computing time but also in terms of hardware resources.
Because the set of possible digits is finite, we cannot always take the best value for $d^x_n$ and $d^y_n$, and we have to select an approximation which is relatively close to the best value. Unfortunately, this will not ensure the convergence of the algorithm for the whole complex plane, but only for a small domain.
First, we present a method for choosing the digits $d^x_n$ and $d^y_n$. Next, we will prove that, for any value taken from a defined domain, an iteration will keep the result in this domain. Thereafter, we will show an argument reduction from any part of the complex plane to this domain, allowing our algorithm to converge for any input.
3.1 Digit selection
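Separating the real and imaginary parts in equations 2 and 3 gives the two following iterations, where $\tilde L_n$ is the value after the first half-iteration:
$$\begin{cases}\tilde L^x_n = L^x_n - \frac12\ln\left(1 + (d^y_n)^2\beta^{-2n}\right)\\ \tilde L^y_n = L^y_n - \arctan\left(d^y_n\beta^{-n}\right)\end{cases} \quad\text{and}\quad \begin{cases} L^x_{n+1} = \tilde L^x_n - \ln\left(1 + d^x_n\beta^{-n}\right)\\ L^y_{n+1} = \tilde L^y_n\end{cases} \tag{4}$$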
The exact values of $d^x_n$ and $d^y_n$ that minimise $L^x_{n+1}$ and $L^y_{n+1}$ are:
$$d^x_n = \left(\exp(\tilde L^x_n) - 1\right)\beta^n \quad\text{and}\quad d^y_n = \tan(L^y_n)\,\beta^n. \tag{5}$$
With these digits, $L^x_{n+1} = L^y_{n+1} = 0$. But these are not integer values except if $\tilde L^x_n = 0$ or $L^y_n = 0$. So we have to choose an integer close to these values. As $L^x_n$ and $L^y_n$ tend to 0, we can use an approximation of the exact functions in (5). Indeed:
$$\left(\exp(\tilde L^x_n) - 1\right)\beta^n \simeq \tilde L^x_n\,\beta^n \quad\text{and}\quad \tan(L^y_n)\,\beta^n \simeq L^y_n\,\beta^n. \tag{6}$$
Let us note
$$T_n = L_n\,\beta^n \quad\text{and}\quad \tilde T_n = \tilde L_n\,\beta^n. \tag{7}$$
Thus, $d^x_n$ and $d^y_n$ should be close to $\tilde T^x_n$ and $T^y_n$ respectively. In order to obtain the integers $d^x_n$ and $d^y_n$, we should round $\tilde T^x_n$ and $T^y_n$ to the nearest integer. This exact rounding implies exact comparisons which may be inefficient if, for instance, $\tilde T^x_n$ and $T^y_n$ are coded in a redundant number representation. Therefore, we will round the truncation of $\tilde T^x_n$ and $T^y_n$ keeping only 2 fractional digits:
$$d^y_n = \lceil\langle T^y_n\rangle_2\rfloor_0, \qquad d^x_n = \lceil\langle \tilde T^x_n\rangle_2\rfloor_0. \tag{8}$$
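Selection by rounding, as in (8), can be sketched in a few lines of Python. The name `select_digit` is hypothetical; the sketch assumes that the 2 fractional digits are radix-$\beta$ digits, that truncation goes toward minus infinity and that ties round upward. It works on floating-point values, whereas a hardware implementation would only inspect the leading digits of a possibly redundant representation.

```python
import math

def select_digit(t: float, beta: int, p: int = 2) -> int:
    """Truncate t to p fractional radix-beta digits, then round to the nearest integer."""
    scale = beta ** p
    truncated = math.floor(t * scale) / scale
    return math.floor(truncated + 0.5)

beta = 16
samples = [k / 37 - 5 for k in range(371)]          # values between -5 and 5
worst_gap = max(abs(select_digit(t, beta) - t) for t in samples)
```

On these samples the distance between the selected digit and the selected value never exceeds $\frac12 + \beta^{-2}$, which is the bound the convergence study relies on.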
Such a choice is the result of several approximations and we must study the convergence of the algorithm. For this purpose, we study the evolution of the sequence $T_n$. Clearly, if it is possible to prove that $T_n$ is bounded, then $L_n$ will tend to 0 and each iteration will reduce the error by a factor of $\beta$.
Suppose that the sequence $T_n$ is bounded by a value $A$, that is $|T^x_n|, |T^y_n| \le A$. Then $|L^x_n|, |L^y_n| \le A\beta^{-n}$. This means that each iteration should reduce the value of the sequences $L^x_n$ and $L^y_n$ by a factor $\beta$. We will show that if we use our digit selection method, this property is true except for the two first iterations.
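First, we have to bound the value of $T_{n+1}$ by a value depending on $T_n$. Let $a = \lceil |T^x_n| \rceil$ and $b = \lceil |T^y_n| \rceil$. We have:
$$|T^x_n| \le a, \qquad |T^y_n| \le b.$$
With the choice of $d^x_n$ and $d^y_n$ we obtain:
$$\left|\langle \tilde T^x_n\rangle_2 - \tilde T^x_n\right| \le \beta^{-2}, \qquad \left|\langle T^y_n\rangle_2 - T^y_n\right| \le \beta^{-2}$$
and
$$\left|d^x_n - \langle \tilde T^x_n\rangle_2\right| \le \frac12, \qquad \left|d^y_n - \langle T^y_n\rangle_2\right| \le \frac12,$$
then
$$\left|d^x_n - \tilde T^x_n\right| \le \frac12 + \beta^{-2}, \qquad \left|d^y_n - T^y_n\right| \le \frac12 + \beta^{-2}.$$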
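It is well known that the functions $\ln$ and $\arctan$ have the following properties:
$$\text{if } x \in \left[0, \tfrac12\right] \text{ then } \begin{cases} x - \dfrac{x^2}{2} \le \ln(1+x) \le x\\[4pt] x - \dfrac{x^3}{3} \le \arctan(x) \le x,\end{cases}$$
$$\text{if } x \in \left[-\tfrac12, 0\right] \text{ then } \begin{cases} x - x^2 \le \ln(1+x) \le x\\[4pt] x \le \arctan(x) \le x - \dfrac{x^3}{3}.\end{cases}$$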
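These four inequalities are easy to spot-check numerically. The sketch below samples the interval $[-\frac12, \frac12]$ on an arbitrary grid of step $10^{-3}$; the function names are hypothetical and a floating-point check of this kind is only a sanity test, not a proof.

```python
import math

def ln_bounds_hold(x: float) -> bool:
    """Check the bounds on ln(1+x) for x in [-1/2, 1/2]."""
    lower = x - x * x / 2 if x >= 0 else x - x * x
    return lower <= math.log1p(x) <= x

def arctan_bounds_hold(x: float) -> bool:
    """Check the bounds on arctan(x) for x in [-1/2, 1/2]."""
    if x >= 0:
        return x - x ** 3 / 3 <= math.atan(x) <= x
    return x <= math.atan(x) <= x - x ** 3 / 3

samples = [k / 1000 for k in range(-500, 501)]
all_hold = all(ln_bounds_hold(x) and arctan_bounds_hold(x) for x in samples)
```

In the algorithm, $x$ takes the values $d^x_n\beta^{-n}$ and $d^y_n\beta^{-n}$, which fall in this interval once the digits are bounded as in the following paragraphs.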
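By definition of $d^y_n$ and $b$, $|d^y_n| \le b$. The first half-iteration can be written
$$\begin{cases}\tilde T^x_n = T^x_n - \frac12\ln\left(1 + (d^y_n)^2\beta^{-2n}\right)\beta^n\\[4pt] \tilde T^y_n = T^y_n - d^y_n + d^y_n - \arctan\left(d^y_n\beta^{-n}\right)\beta^n,\end{cases}$$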
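which gives the following inequalities:
$$-a - \frac{b^2}{2\beta^n} \le \tilde T^x_n \le a$$
$$-\frac12 - \beta^{-2} - \frac{b^3}{3\beta^{2n}} \le \tilde T^y_n \le \frac12 + \beta^{-2} + \frac{b^3}{3\beta^{2n}}$$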
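By definition of $d^x_n$ and $a$ (which is an integer), we have: $-a - \frac{b^2}{2\beta^n} - \frac12 \le d^x_n \le a$. The second half-iteration is:
$$\begin{cases}T^x_{n+1} = \left(\tilde T^x_n - d^x_n + d^x_n - \ln\left(1 + d^x_n\beta^{-n}\right)\beta^n\right)\beta\\[4pt] T^y_{n+1} = \tilde T^y_n\,\beta,\end{cases}$$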
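and gives the following bound:
$$-\frac{\beta}{2} - \beta^{-1} \le T^x_{n+1} \le \frac{\beta}{2} + \beta^{-1} + \left(a + \frac{b^2}{2\beta^n} + \frac12\right)^2\beta^{-n+1}$$
and
$$-\frac{\beta}{2} - \beta^{-1} - \frac{b^3}{3}\beta^{-2n+1} \le T^y_{n+1} \le \frac{\beta}{2} + \beta^{-1} + \frac{b^3}{3}\beta^{-2n+1} \tag{9}$$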
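Now, let us simplify the bounds of $T_n$. If $\beta \ge 10$ and:
- if $n = 2$, $a = 2$ and $b = \beta$, then $|T^x_3|$ and $|T^y_3| \le \frac{\beta}{2} + 1$,
- if $n \ge 3$, $a = b = \frac{\beta}{2} + 1$, then $|T^x_{n+1}|$ and $|T^y_{n+1}| \le \frac{\beta}{2} + 1$.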
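Both cases can be checked by evaluating the upper bounds of (9) in exact rational arithmetic. The sketch below is a verification aid under the reading of (9) used here, with a hypothetical function name; it is not part of the algorithm.

```python
from fractions import Fraction

def upper_bounds(beta: int, n: int, a, b):
    """Upper bounds of (9) on T^x_{n+1} and T^y_{n+1}, as exact fractions."""
    beta_f, a_f, b_f = Fraction(beta), Fraction(a), Fraction(b)
    base = beta_f / 2 + 1 / beta_f
    x_up = base + (a_f + b_f ** 2 / (2 * beta_f ** n) + Fraction(1, 2)) ** 2 * beta_f ** (1 - n)
    y_up = base + b_f ** 3 / 3 * beta_f ** (1 - 2 * n)
    return x_up, y_up

def cases_hold(beta: int, last_n: int = 8) -> bool:
    """Check the case n = 2 and the cases 3 <= n <= last_n for one radix."""
    target = Fraction(beta, 2) + 1
    ok = all(u <= target for u in upper_bounds(beta, 2, 2, beta))
    for n in range(3, last_n + 1):
        ok = ok and all(u <= target for u in upper_bounds(beta, n, target, target))
    return ok

results = {beta: cases_hold(beta) for beta in (8, 10, 16, 64, 1024)}
```

For $n = 2$ the bound on $T^x_3$ reduces to $\frac{\beta}{2} + \frac{10}{\beta}$, which is at most $\frac{\beta}{2} + 1$ exactly when $\beta \ge 10$; this is why radix 8 fails the check while radix 10 meets it with equality.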
By induction, this proves that the algorithm converges if $n \ge 2$ on the following domain:
$$P_E = \left[-2\beta^{-2}, 2\beta^{-2}\right] + i\left[-\beta^{-1}, \beta^{-1}\right] \tag{10}$$
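The convergence claim can be exercised with a small floating-point simulation of the split iterations. This is only a sketch under several assumptions: it starts at $n = 2$ from a value with $|T^x_2| \le 2$ and $|T^y_2| \le \beta$, it uses radix 16, it calls `cmath.log` where a real implementation would read a lookup table, and the names `select_digit` and `e_mode` are hypothetical.

```python
import cmath
import math

def select_digit(t: float, beta: int, p: int = 2) -> int:
    """Round the truncation of t (p fractional radix-beta digits) to an integer."""
    scale = beta ** p
    return math.floor(math.floor(t * scale) / scale + 0.5)

def e_mode(L: complex, E: complex = 1 + 0j, beta: int = 16,
           first_n: int = 2, steps: int = 8):
    """Run the two half-iterations for n = first_n, ..., first_n + steps - 1."""
    for n in range(first_n, first_n + steps):
        weight = float(beta) ** -n
        # First half-iteration: the imaginary digit reduces the imaginary part of L.
        d_y = select_digit(L.imag / weight, beta)
        factor = 1 + 1j * d_y * weight
        E, L = E * factor, L - cmath.log(factor)
        # Second half-iteration: the real digit reduces the real part of L.
        d_x = select_digit(L.real / weight, beta)
        factor = 1 + d_x * weight
        E, L = E * factor, L - cmath.log(factor)
    return E, L

L_start = complex(0.9 * 2 * 16 ** -2, 0.9 / 16)
E_end, L_end = e_mode(L_start)
error = abs(E_end - cmath.exp(L_start))
```

After eight iterations in radix 16 the residual `L_end` is far below $16^{-8}$, and `E_end` agrees with `cmath.exp(L_start)` to within the rounding noise of double precision plus that residual.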
3.3 Argument reduction
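Because of the method for constructing $d^x_n$ and $d^y_n$, this domain remains very small. Indeed, the best choices for these numbers (see section 3.1) are the numbers minimizing the values:
$$\left|\tilde T^x_n - \beta^n\ln\left(1 + d^x_n\beta^{-n}\right)\right| \quad\text{and}\quad \left|T^y_n - \beta^n\arctan\left(d^y_n\beta^{-n}\right)\right| \tag{11}$$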
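The proposed choice (eq. 8) is efficient only if (see 6):
$$\left(\exp(\tilde L^x_n) - 1\right)\beta^n \simeq \tilde L^x_n\,\beta^n \quad\text{and}\quad \tan(L^y_n)\,\beta^n \simeq L^y_n\,\beta^n$$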
Unfortunately, this is not true for the two first iterations. In other words, if $n = 1$ or $n = 2$ then $T^x_{n+1}$ and $T^y_{n+1}$ may be greater than $\frac{\beta}{2} + 1$. So, these iterations have to be replaced. Several methods can be used for the first iterations:
- It is possible to choose the values $d^x_n$ and $d^y_n$ from a lookup table indexed by the most significant bits of $T^x_n$ and $T^y_n$. This allows a better choice for the digits and will improve the reduction.
- It is possible to double an iteration, which will also improve the reduction but will need more computations.
- It is possible to change the radix of the algorithm, computing an iteration with radix $4\beta$ for instance. This is equivalent to coding the fractional part of $d^x_n$ and $d^y_n$ with 2 bits. The number of possible digits increases, but the choice and the reduction are better.
- It is possible to enlarge the set of possible digits for the next iteration.
- It is also possible to simulate the two first iterations by several iterations of the radix 2 BKM algorithm.
Example: Suppose that $\beta \ge 10$. The first iterations can be computed as follows.
First, note that we will start from the following domain:
$$D = [\ln(2), 2\ln(2)] + i\left[-\frac{\pi}{4}, \frac{\pi}{4}\right]$$
Reducing to this domain is well known, as shown in [Mul97]. We have to improve the reduction from $D$ to the domain $P_E$ (see 10). This second reduction will only use two iterations of BKM, but those iterations are performed differently. The two first iterations are computed as follows:
- For the first iteration, $d^y_n$ and $d^x_n$ are chosen from two lookup tables indexed by $T^y_n$ and $\tilde T^x_n$ respectively, rounded with 1 bit for the fractional part.
- The set of digits for the second iteration is extended to $\{-\beta, \ldots, \beta\}$. Furthermore, this iteration is performed twice.
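More precisely, we define $\varphi^x(n)$ and $\varphi^y(n)$ as two lookup tables such that:
- $\varphi^x(n)$ is the integer $d$ minimizing the value $\left|\frac{n}{2} - \beta\ln\left(1 + d\beta^{-1}\right)\right|$ for $0 \le n \le \lceil 4\beta\ln(2)\rceil$.
- $\varphi^y(n)$ is the integer $d$ minimizing the value $\left|\frac{n}{2} - \beta\arctan\left(d\beta^{-1}\right)\right|$ for $-2\beta \le n \le 2\beta$.
These tables verify:
$$\left|\frac{n}{2} - \beta\ln\left(1 + \varphi^x(n)\beta^{-1}\right)\right| \le \frac12 \tag{12a}$$
$$\left|\frac{n}{2} - \beta\arctan\left(\varphi^y(n)\beta^{-1}\right)\right| \le \frac12 \tag{12b}$$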
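The two tables can be generated by brute force, which also gives a numerical check of (12a) and (12b). In this Python sketch the function names and the candidate ranges ($0 \le d \le 3\beta + 1$ for the real table, $|d| \le 2\beta$ for the imaginary one) are assumptions chosen so that the search covers every index; they are not taken from the text.

```python
import math

def build_tables(beta: int):
    """Brute-force the two first-iteration tables, indexed by the integer n."""
    def nearest_digit(target, func, candidates):
        return min(candidates, key=lambda d: abs(target - beta * func(d / beta)))

    n_max = math.ceil(4 * beta * math.log(2))
    table_x = {n: nearest_digit(n / 2, math.log1p, range(0, 3 * beta + 2))
               for n in range(0, n_max + 1)}
    table_y = {n: nearest_digit(n / 2, math.atan, range(-2 * beta, 2 * beta + 1))
               for n in range(-2 * beta, 2 * beta + 1)}
    return table_x, table_y

def worst_errors(beta: int):
    """Largest left-hand sides of (12a) and (12b) over all table entries."""
    table_x, table_y = build_tables(beta)
    err_x = max(abs(n / 2 - beta * math.log1p(d / beta)) for n, d in table_x.items())
    err_y = max(abs(n / 2 - beta * math.atan(d / beta)) for n, d in table_y.items())
    return err_x, err_y

err_x, err_y = worst_errors(16)
```

For radix 16 both maxima stay below $\frac12$. The two tables together hold $\lceil 4\beta\ln(2)\rceil + 4\beta + 2$ entries, so their size grows linearly with the radix.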
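We can do the iterations given in table 1.

Step 1:        $d^y_1 = \varphi^y\left(\lceil 2\langle T^y_1\rangle_2\rfloor_0\right)$,   $\tilde L_1 = L_1 - \ln\left(1 + i d^y_1\beta^{-1}\right)$
Step 1 + 1/2:  $d^x_1 = \varphi^x\left(\lceil 2\langle \tilde T^x_1\rangle_2\rfloor_0\right)$,   $L_2 = \tilde L_1 - \ln\left(1 + d^x_1\beta^{-1}\right)$
Step 2:        $d^y_2 = \lceil\langle T^y_2\rangle_2\rfloor_0$,   $\tilde L_2 = L_2 - \ln\left(1 + i d^y_2\beta^{-2}\right)$
Step 2 + 1/2:  $d^x_2 = \lceil\langle \tilde T^x_2\rangle_2\rfloor_0$,   $L = \tilde L_2 - \ln\left(1 + d^x_2\beta^{-2}\right)$

Table 1: The 4 half-iterations used by the E-mode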
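Considering $L_1 \in D$, we have:
$$\ln(2) \le L^x_1 \le 2\ln(2), \qquad -\frac{\pi}{4} \le L^y_1 \le \frac{\pi}{4}$$
By definition of $T^y_1$ (see 7), $|T^y_1| < \beta$, then $\left|\lceil 2\langle T^y_1\rangle_2\rfloor_0\right| \le 2\beta$ and the value $d^y_1 = \varphi^y\left(\lceil 2\langle T^y_1\rangle_2\rfloor_0\right)$ exists in the table.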
We do the following iteration:
$$\tilde L^x_1 = L^x_1 - \frac12\ln\left(1 + (d^y_1\beta^{-1})^2\right) \tag{13a}$$
$$\tilde L^y_1 = L^y_1 - \arctan\left(d^y_1\beta^{-1}\right) \tag{13b}$$
By definition of $d^y_1$, we have $|d^y_1| \le \beta$, and with equation (13a) this easily gives a bound for $\tilde L^x_1$:
$$\ln(2) - \frac12\ln(2) \le \tilde L^x_1 \le 2\ln(2)$$
$$0 \le \tilde L^x_1 \le 2\ln(2) \tag{14}$$
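Let us note:
$$C = L^y_1 - \frac{\lceil 2\langle T^y_1\rangle_2\rfloor_0}{2\beta} \quad\text{and}\quad D = \frac{\lceil 2\langle T^y_1\rangle_2\rfloor_0}{2\beta} - \arctan\left(d^y_1\beta^{-1}\right)$$
Equation (13b) becomes
$$\tilde L^y_1 = C + D \tag{15}$$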
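By definition of $C$ and $T^y_1$ (see 7):
$$|C| \le \left|L^y_1 - \frac{\langle T^y_1\rangle_2}{\beta}\right| + \left|\frac{2\langle T^y_1\rangle_2 - \lceil 2\langle T^y_1\rangle_2\rfloor_0}{2\beta}\right| \le \beta^{-3} + \frac{1}{4\beta}$$
$$|C| \le \frac{1}{2\beta} \tag{16}$$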
Furthermore, by definition of $D$ and (12b):
$$|D| \le \frac{1}{2\beta} \tag{17}$$