High-level synthesis and arithmetic optimizations


HAL Id: hal-02131970

https://hal.inria.fr/hal-02131970

Submitted on 16 May 2019



Yohann Uguen, Florent de Dinechin, Steven Derrien

To cite this version:

Yohann Uguen, Florent de Dinechin, Steven Derrien. High-level synthesis and arithmetic optimizations. Compas’2016, Jul 2016, Lorient, France. ⟨hal-02131970⟩


CONTEXT

• Computing with real numbers

• Design a hardware accelerator

• Targeting FPGAs

• Trade-off between:
  1. performance
  2. accuracy
  3. resource usage
  4. ease of use

MERGING APPLICATION-SPECIFIC ARITHMETIC AND HLS

$\sum_i x_i \times y_i$

FloPoCo [1]

flopoco FPLargeAcc wex=8 wfx=23 msba=17 lsba=-50 maxmsbx=17

float sumOfProduct(float in1[N], float in2[N]) {
    float sum = 0;
    #pragma FPacc VAR=sum MaxAcc=100000 epsilon=1E-15
    for (int i=0; i<N; i++) {
        sum+=in2[i+1];
        sum+=in1[i]*in2[i-1];
    }
    return sum;
}

[Figure: three design flows compared]

FloPoCo operator → VHDL
⊕ Low resource
⊕ Low latency
⊕ Accuracy control
Difficult to use

C → HLS (Vivado) → VHDL
Moderate resource
Moderate latency
No accuracy control
⊕ Ease of use

C → source-to-source transformations using GeCoS [2] → C → HLS (Vivado) → VHDL
⊕ Low resource
⊕ Low latency
⊕ Accuracy control
⊕ Ease of use

ACCUMULATOR

Based on Kulisch and Snyder accumulator [3]

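[Figure: accumulator architecture in three stages. Float-to-fixed: the input (sign, exponent on we bits, mantissa on wf bits) passes through a shifter controlled by a shift value, then a negate block. Sum: the fixed-point sum is held in registers. Fixed-to-float: an LZC + shifter produces the output sign, exponent (we bits) and mantissa (wf bits). Width labels in the figure: MaxMSBX, MaxMSBX−LSBA+1, wA.]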
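The three stages named in the figure (float-to-fixed, sum, fixed-to-float) can be modelled in software. The sketch below is only a behavioural illustration, not the FloPoCo operator: it assumes a 64-bit accumulator and an LSB weight of 2^-40 (the poster uses lsba=-50, which would not fit in 64 bits together with msba=17), and all function and macro names are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Assumed parameters for this sketch only (not the poster's values). */
#define LSB_A     (-40)  /* weight of the accumulator's least significant bit */
#define MAX_MSB_X 17     /* inputs must satisfy |x| < 2^(MAX_MSB_X + 1) */

/* Float-to-fixed: align the input on the accumulator grid.
 * Scaling by a power of two is exact in double; bits below 2^LSB_A
 * are truncated toward zero by the integer conversion. */
static int64_t float_to_fixed(float x)
{
    assert(fabsf(x) < ldexpf(1.0f, MAX_MSB_X + 1));
    return (int64_t)ldexp((double)x, -LSB_A);
}

/* Sum: plain integer addition, exact as long as the accumulator
 * does not overflow. */
static int64_t accumulate(int64_t acc, float x)
{
    return acc + float_to_fixed(x);
}

/* Fixed-to-float: convert back once, after all additions.
 * The conversion to double is exact while |acc| < 2^53,
 * followed by one rounding to float. */
static float fixed_to_float(int64_t acc)
{
    return (float)ldexp((double)acc, LSB_A);
}
```

For example, adding 2^-30 to 1.0f a total of 1024 times leaves a float accumulator at 1.0, because each addend is below half an ulp of the sum, whereas this model returns 1 + 2^-20.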

EXACT MULTIPLIER

[Figure: exact multiplier. Inputs: two floating-point numbers, each with a sign (1 bit), an exponent (we bits) and a mantissa (wf bits). Output: sign (1 bit), exponent (we + 1 bits), mantissa (2 × wf + 2 bits).]
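The output widths (1, we + 1 and 2 × wf + 2 bits) follow from multiplying without rounding: the two significands, each wf + 1 bits with the implicit leading one, give a product of 2 × wf + 2 bits, and the exponents are simply added. The sketch below illustrates this for binary32 inputs (we = 8, wf = 23). It is a software model under the assumption of normal, finite inputs, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

#define WF   23   /* binary32 fraction bits */
#define BIAS 127  /* binary32 exponent bias */

typedef struct {
    uint32_t sign;      /* 1 bit */
    int32_t  exponent;  /* unbiased sum of both input exponents */
    uint64_t mantissa;  /* exact 2*WF+2 = 48-bit product of significands */
} exact_product;

/* Multiply two normal binary32 numbers without any rounding. */
static exact_product exact_mul(float a, float b)
{
    uint32_t ua, ub;
    memcpy(&ua, &a, sizeof ua);
    memcpy(&ub, &b, sizeof ub);

    uint32_t ea = (ua >> WF) & 0xFFu;
    uint32_t eb = (ub >> WF) & 0xFFu;
    /* Zero, subnormal, infinity and NaN are outside this sketch. */
    assert(ea != 0 && ea != 0xFFu && eb != 0 && eb != 0xFFu);

    /* Significands with the implicit leading one restored. */
    uint64_t ma = (ua & 0x7FFFFFu) | 0x800000u;
    uint64_t mb = (ub & 0x7FFFFFu) | 0x800000u;

    exact_product p;
    p.sign     = (ua >> 31) ^ (ub >> 31);
    p.exponent = ((int32_t)ea - BIAS) + ((int32_t)eb - BIAS);
    p.mantissa = ma * mb;
    return p;
}

/* Value of the product: (-1)^sign * mantissa * 2^(exponent - 2*WF).
 * A 48-bit mantissa is exactly representable in a double. */
static double exact_to_double(exact_product p)
{
    double v = ldexp((double)p.mantissa, p.exponent - 2 * WF);
    return p.sign ? -v : v;
}
```

Squaring 1 + 2^-23 shows the difference from a rounded multiplication: the exact mantissa is 2^46 + 2^24 + 1, and the trailing 1 (the 2^-46 term of the product) would be lost when rounding to binary32.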

SOURCE-TO-SOURCE TRANSFORMATIONS USING GECOS

switch Node do
    case +
        Launch recursively on incoming nodes
    end
    case ×
        Replace with exact multiplier combined with tuned Float-to-fixed operator
    end
    case Accumulation variable
        Ignore
    end
    otherwise
        Insert Float-to-fixed node
    end
endsw
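The case analysis above can be sketched as a recursive walk over a small expression graph. The node type below is a hypothetical stand-in, not the GeCoS intermediate representation, and instead of rewriting the graph the sketch only records which action would be applied to each node.

```c
#include <stddef.h>

/* Hypothetical node kinds, mirroring the four cases of the pseudocode. */
typedef enum { NODE_ADD, NODE_MUL, NODE_ACC_VAR, NODE_OTHER } node_kind;

/* Action recorded on a node by the walk. */
typedef enum {
    ACT_NONE,                 /* left untouched */
    ACT_EXACT_MUL_TUNED_F2F,  /* exact multiplier + tuned float-to-fixed */
    ACT_FLOAT_TO_FIXED        /* plain float-to-fixed node inserted */
} action;

typedef struct node {
    node_kind    kind;
    struct node *in[2];  /* incoming nodes, NULL if absent */
    action       act;
} node;

/* Walk the accumulation expression starting from its root. */
static void transform(node *n)
{
    if (n == NULL) {
        return;
    }
    switch (n->kind) {
    case NODE_ADD:      /* launch recursively on incoming nodes */
        transform(n->in[0]);
        transform(n->in[1]);
        break;
    case NODE_MUL:      /* inputs stay floating-point, so no recursion */
        n->act = ACT_EXACT_MUL_TUNED_F2F;
        break;
    case NODE_ACC_VAR:  /* the accumulation variable is ignored */
        break;
    default:            /* any other node gets a float-to-fixed */
        n->act = ACT_FLOAT_TO_FIXED;
        break;
    }
}
```

Applied to the loop body sum + in2[i+1] + in1[i]*in2[i-1], the walk marks the product for the exact multiplier and the in2[i+1] load for a float-to-fixed conversion, and leaves sum and the additions unmarked.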

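[Figure: data-flow graph of the loop body, before and after the transformation. Before: in1[i] and in2[i−1] feed a × node; its result and in2[i+1] are added (+) to sum. After: the × node becomes an exact multiplier followed by a tuned float-to-fixed operator, and a float-to-fixed node is inserted after in2[i+1]; the + nodes feeding sum remain.]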

RESULTS

Input values in [0, 1]

100K accumulations

                     FloPoCo’s operator   Naive code   Transformed code
Accumulator width    67                   24           67
LUTs                 693                  313          868
DSPs                 2                    5            2
Latency              100K                 1000K        100K
Accuracy             24 bits              17 bits      24 bits

REFERENCES

[1] de Dinechin, Florent et al.: An FPGA-specific Approach to Floating-Point Accumulation and Sum-of-Products, FPT 2008

[2] Floc’h, Antoine et al.: GeCoS: A framework for prototyping custom hardware design flows, SCAM 2013

[3] Kulisch, Ulrich and Snyder, Van: The Exact Dot Product As Basic Tool for Long Interval Arithmetic, Computing 2011
