• Aucun résultat trouvé

Synthesis of certified programs in fixed-point arithmetic, and its application to linear algebra basic blocks

N/A
N/A
Protected

Academic year: 2021

Partager "Synthesis of certified programs in fixed-point arithmetic, and its application to linear algebra basic blocks"

Copied!
81
0
0

Texte intégral

(1)

HAL Id: lirmm-01277374

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01277374

Submitted on 22 Feb 2016

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

Synthesis of certified programs in fixed-point arithmetic,

and its application to linear algebra basic blocks

Mohamed Amine Najahi

To cite this version:

Mohamed Amine Najahi. Synthesis of certified programs in fixed-point arithmetic, and its application

to linear algebra basic blocks. RAIM: Rencontres Arithmétiques de l’Informatique Mathématique,

Apr 2015, Rennes, France. 7ème Rencontres Arithmétiques de l’Informatique Mathématique, 2015.

�lirmm-01277374�

(2)

7ème Rencontres Arithmétiques de l’Informatique Mathématique (RAIM2015)

Rennes, 7-9 april 2015

Synthesis of certified programs in fixed-point

arithmetic, and its application to

linear algebra basic blocks

Amine Najahi

Univ. Perpignan Via Domitia, DALI project-team

Univ. Montpellier 2, LIRMM, UMR 5506

CNRS, LIRMM, UMR 5506

D

(3)

Which arithmetic for computational tasks?

Floating-point computations

©

Easy and fast to implement

©

Easily portable

[IEEE754]

§

Requires dedicated hardware

§

Slow if emulated in software

Fixed-point computations

§

Tedious and time consuming to implement

• > 50

%

of design time

[Wil98]

©

Relies only on integer instructions

©

Efficient

Embedded systems targets

µ

-controllers

DSPs

FPGAs

have efficient integer instructions

Fixed-point arithmetic is well suited for embedded systems

But, how to make it

easy

,

fast

, and

numerically safe

to use by non-expert programmers?

(4)

Which arithmetic for computational tasks?

Floating-point computations

©

Easy and fast to implement

©

Easily portable

[IEEE754]

§

Requires dedicated hardware

§

Slow if emulated in software

Fixed-point computations

§

Tedious and time consuming to implement

• > 50

%

of design time

[Wil98]

©

Relies only on integer instructions

©

Efficient

Embedded systems targets

µ

-controllers

DSPs

FPGAs

have efficient integer instructions

Fixed-point arithmetic is well suited for embedded systems

(5)

Which arithmetic for computational tasks?

Floating-point computations

©

Easy and fast to implement

©

Easily portable

[IEEE754]

§

Requires dedicated hardware

§

Slow if emulated in software

Fixed-point computations

§

Tedious and time consuming to implement

• > 50

%

of design time

[Wil98]

©

Relies only on integer instructions

©

Efficient

Embedded systems targets

µ

-controllers

DSPs

FPGAs

have efficient integer instructions

Fixed-point arithmetic is well suited for embedded systems

But, how to make it

easy

,

fast

, and

numerically safe

to use by non-expert programmers?

(6)

Which arithmetic for computational tasks?

Floating-point computations

©

Easy and fast to implement

©

Easily portable

[IEEE754]

§

Requires dedicated hardware

§

Slow if emulated in software

Fixed-point computations

§

Tedious and time consuming to implement

• > 50

%

of design time

[Wil98]

©

Relies only on integer instructions

©

Efficient

Embedded systems targets

µ

-controllers

DSPs

FPGAs

have efficient integer instructions

Fixed-point arithmetic is well suited for embedded systems

(7)

Which arithmetic for computational tasks?

Floating-point computations

©

Easy and fast to implement

©

Easily portable

[IEEE754]

§

Requires dedicated hardware

§

Slow if emulated in software

Fixed-point computations

§

Tedious and time consuming to implement

• > 50

%

of design time

[Wil98]

©

Relies only on integer instructions

©

Efficient

Embedded systems targets

µ

-controllers

DSPs

FPGAs

have efficient integer instructions

Fixed-point arithmetic is well suited for embedded systems

But, how to make it

easy

,

fast

, and

numerically safe

to use by non-expert programmers?

(8)

The DEFIS approach

DEFIS (ANR, 2011-2015)

Goal:

develop techniques and tools to automate

fixed-point programming

Combines conversion and IP block synthesis

Ï

Ménard et al. (

CAIRN, Univ. Rennes

)

[MCCS02]

:

automatic float-to-fix conversion

Ï

Didier et al. (

PEQUAN, Univ. Paris

)

[LHD14]

:

code generation for the linear filter IP block

Ï

Our approach (

DALI, Univ. Perpignan

):

certified fixed-point synthesis for:

• Fine grained IP blocks:

dot-products,

polynomials, ...

• High level IP blocks:

matrix multiplication,

triangular matrix inversion, Cholesky decomposition

Long term objective:

code synthesis for matrix inversion

Im p lem en ta tio n to o ls In fras tru ct u re fo r t h e d es ig n o f f ix ed -p o in t s ys tem s Al go rit h m le ve l op tim iz at ion IWL Determination Dynamic Range evaluation FWL Determination Back-end S2S transfor-mation Application description Specific block generation Floating-point C code Accuracy evaluation B1 B5 B4 B3 B6 B2 System level optimization Accuracy constraint High level Synthesis Compiler Architecture Fixed-point C code Architecture model Validation & Optimization Parameterized IP blocks

(9)

The DEFIS approach

DEFIS (ANR, 2011-2015)

Goal:

develop techniques and tools to automate

fixed-point programming

Combines conversion and IP block synthesis

Ï

Ménard et al. (

CAIRN, Univ. Rennes

)

[MCCS02]

:

automatic float-to-fix conversion

Ï

Didier et al. (

PEQUAN, Univ. Paris

)

[LHD14]

:

code generation for the linear filter IP block

Ï

Our approach (

DALI, Univ. Perpignan

):

certified fixed-point synthesis for:

• Fine grained IP blocks:

dot-products,

polynomials, ...

• High level IP blocks:

matrix multiplication,

triangular matrix inversion, Cholesky decomposition

Long term objective:

code synthesis for matrix inversion

Im p lem en ta tio n to o ls In fras tru ct u re fo r t h e d es ig n o f f ix ed -p o in t s ys tem s Al go rit h m le ve l op tim iz at ion IWL Determination Dynamic Range evaluation FWL Determination Back-end S2S transfor-mation Application description Specific block generation Floating-point C code Accuracy evaluation B1 B5 B4 B3 B6 B2 System level optimization Accuracy constraint High level Synthesis Compiler Architecture Fixed-point C code Architecture model Validation & Optimization Parameterized IP blocks

(10)

The DEFIS approach

DEFIS (ANR, 2011-2015)

Goal:

develop techniques and tools to automate

fixed-point programming

Combines conversion and IP block synthesis

Ï

Ménard et al. (

CAIRN, Univ. Rennes

)

[MCCS02]

:

automatic float-to-fix conversion

Ï

Didier et al. (

PEQUAN, Univ. Paris

)

[LHD14]

:

code generation for the linear filter IP block

Ï

Our approach (

DALI, Univ. Perpignan

):

certified fixed-point synthesis for:

• Fine grained IP blocks:

dot-products,

polynomials, ...

• High level IP blocks:

matrix multiplication,

triangular matrix inversion, Cholesky decomposition

Long term objective:

code synthesis for matrix inversion

Im p lem en ta tio n to o ls In fras tru ct u re fo r t h e d es ig n o f f ix ed -p o in t s ys tem s Al go rit h m le ve l op tim iz at ion IWL Determination Dynamic Range evaluation FWL Determination Back-end S2S transfor-mation Application description Specific block generation Floating-point C code Accuracy evaluation B1 B5 B4 B3 B6 B2 System level optimization Accuracy constraint High level Synthesis Compiler Fixed-point C code Architecture model Validation & Optimization Parameterized IP blocks

(11)

The DEFIS approach

DEFIS (ANR, 2011-2015)

Goal:

develop techniques and tools to automate

fixed-point programming

Combines conversion and IP block synthesis

Ï

Ménard et al. (

CAIRN, Univ. Rennes

)

[MCCS02]

:

automatic float-to-fix conversion

Ï

Didier et al. (

PEQUAN, Univ. Paris

)

[LHD14]

:

code generation for the linear filter IP block

Ï

Our approach (

DALI, Univ. Perpignan

):

certified fixed-point synthesis for:

• Fine grained IP blocks:

dot-products,

polynomials, ...

• High level IP blocks:

matrix multiplication,

triangular matrix inversion, Cholesky decomposition

Long term objective:

code synthesis for matrix inversion

Im p lem en ta tio n to o ls In fras tru ct u re fo r t h e d es ig n o f f ix ed -p o in t s ys tem s Al go rit h m le ve l op tim iz at ion IWL Determination Dynamic Range evaluation FWL Determination Back-end S2S transfor-mation Application description Specific block generation Floating-point C code Accuracy evaluation B1 B5 B4 B3 B6 B2 System level optimization Accuracy constraint High level Synthesis Compiler Architecture Fixed-point C code Architecture model Validation & Optimization Parameterized IP blocks

(12)

Our road-map

How to generate certified fixed-point code for matrix inversion?

1.

Specify an arithmetic model

Ï

Contributions:

formalization of p and

/

2.

Build a synthesis tool, CGPE, for fine grained IP

blocks:

Ï

it adheres to the arithmetic model

Ï

Contributions:

implementation of the arithmetic model

3.

Build a second synthesis tool, FPLA, for algorithmic

IP blocks:

Ï

it generates code using CGPE

Ï

Contributions:

trade-off implementations for matrix multiplication

code synthesis for Cholesky decomposition and

triangular matrix inversion

(13)

Our road-map

How to generate certified fixed-point code for matrix inversion?

1.

Specify an arithmetic model

Ï

Contributions:

formalization of p and

/

2.

Build a synthesis tool, CGPE, for fine grained IP

blocks:

Ï

it adheres to the arithmetic model

Ï

Contributions:

implementation of the arithmetic model

3.

Build a second synthesis tool, FPLA, for algorithmic

IP blocks:

Ï

it generates code using CGPE

Ï

Contributions:

trade-off implementations for matrix multiplication

code synthesis for Cholesky decomposition and

triangular matrix inversion

Arithmetic

model

(14)

Our road-map

How to generate certified fixed-point code for matrix inversion?

1.

Specify an arithmetic model

Ï

Contributions:

formalization of p and

/

2.

Build a synthesis tool, CGPE, for fine grained IP

blocks:

Ï

it adheres to the arithmetic model

Ï

Contributions:

implementation of the arithmetic model

3.

Build a second synthesis tool, FPLA, for algorithmic

IP blocks:

Ï

it generates code using CGPE

Ï

Contributions:

trade-off implementations for matrix multiplication

code synthesis for Cholesky decomposition and

triangular matrix inversion

Arithmetic

model

F

ix

ed

-po

int synthe

sis

to

ol

CGPE

(15)

Our road-map

How to generate certified fixed-point code for matrix inversion?

1.

Specify an arithmetic model

Ï

Contributions:

formalization of p and

/

2.

Build a synthesis tool, CGPE, for fine grained IP

blocks:

Ï

it adheres to the arithmetic model

Ï

Contributions:

implementation of the arithmetic model

3.

Build a second synthesis tool, FPLA, for algorithmic

IP blocks:

Ï

it generates code using CGPE

Ï

Contributions:

trade-off implementations for matrix multiplication

code synthesis for Cholesky decomposition and

triangular matrix inversion

Arithmetic

model

F

ix

ed

-po

int synthe

sis

to

ol

CGPE

Alg

orithmic level to

ol

FPLA

(16)

Outline of the talk

1. An arithmetic model for fixed-point code synthesis

2. An implementation of the arithmetic model: the CGPE tool

(17)

An arithmetic model for fixed-point code synthesis

Outline of the talk

1. An arithmetic model for fixed-point code synthesis

2. An implementation of the arithmetic model: the CGPE tool

3. Fixed-point code synthesis for linear algebra basic blocks

(18)

An arithmetic model for fixed-point code synthesis

Fixed-point arithmetic numbers

A fixed-point number x is defined by two integers:

.

X

the k-bit integer representation of x

.

f

the

implicit

scaling factor of x

The value of x is given by x =

2

X

f

=

k

−1−f

X

`=−f

X

`+f

· 2

`

X

7

X

6

X

5

X

4

X

3

X

2

X

1

X

0

k

= 8

i

= 3

f

= 5

Notation

A fixed-point number with

i

bits

of integer part and

f

bits

of fraction part is in the Q

i

.f

format

Example:

Ï

x

in Q

3

.

5

and X =

(1001 1000)

2

=

(152)

10

−→

x

=

(100.11000)

2

=

(4.75)

10

(19)

An arithmetic model for fixed-point code synthesis

Fixed-point arithmetic numbers

A fixed-point number x is defined by two integers:

.

X

the k-bit integer representation of x

.

f

the

implicit

scaling factor of x

The value of x is given by x =

2

X

f

=

k

−1−f

X

`=−f

X

`+f

· 2

`

X

7

X

6

X

5

X

4

X

3

X

2

X

1

X

0

k

= 8

i

= 3

f

= 5

Notation

A fixed-point number with

i

bits

of integer part and

f

bits

of fraction part is in the Q

i

.f

format

Example:

Ï

x

in Q

3

.

5

and X =

(1001 1000)

2

=

(152)

10

−→

x

=

(100.11000)

2

=

(4.75)

10

How to compute with fixed-point numbers?

(20)

An arithmetic model for fixed-point code synthesis

Fixed-point arithmetic numbers

A fixed-point number x is defined by two integers:

.

X

the k-bit integer representation of x

.

f

the

implicit

scaling factor of x

The value of x is given by x =

2

X

f

=

k

−1−f

X

`=−f

X

`+f

· 2

`

1

0

0

1

1

0

0

0

k

= 8

i

= 3

f

= 5

Notation

A fixed-point number with

i

bits

of integer part and

f

bits

of fraction part is in the Q

i

.f

format

Example:

Ï

x

in Q

3

.

5

and X =

(1001 1000)

2

=

(152)

10

−→

x

=

(100.11000)

2

=

(4.75)

10

(21)

An arithmetic model for fixed-point code synthesis

Fixed-point arithmetic numbers

A fixed-point number x is defined by two integers:

.

X

the k-bit integer representation of x

.

f

the

implicit

scaling factor of x

The value of x is given by x =

2

X

f

=

k

−1−f

X

`=−f

X

`+f

· 2

`

1

0

0

1

1

0

0

0

k

= 8

i

= 3

f

= 5

Notation

A fixed-point number with

i

bits

of integer part and

f

bits

of fraction part is in the Q

i

.f

format

Example:

Ï

x

in Q

3

.

5

and X =

(1001 1000)

2

=

(152)

10

−→

x

=

(100.11000)

2

=

(4.75)

10

How to compute with fixed-point numbers?

(22)

An arithmetic model for fixed-point code synthesis

An interval arithmetic based model

For each coefficient or variable v, we keep track of 2 intervals Val

(

v

)

and Err

(

v

)

Our model assumes a fixed word-length k

Val

(

v

)

is the range of v

the format Q

i

.

f

of v is deduced from

Val

(

v

)

=

£v,v¤

Ï

i

=

l

log

2

(

max

(

¯

¯

v

¯

¯

,

¯

¯

v

¯

¯

))

m

+ α

Ï

f

= k − i

α =

(1, if mod ¡log

2

(

v

)

,

1¢ 6= 0,

2, otherwise

Err

(

v

)

encloses the

rounding error of

computing v

a bound

² on rounding

errors is deduced from

Err

(

v

)

=

£e,e¤

Ï

² = max¡¯¯e¯¯,¯¯e¯¯¢

¦ ¦ ¦ a0 ¦ a1 ¦ ¦ a2 ¦ a3 ¦

x

¦ ¦ ¦ a4 ¦ a5

¦

.

.

Val

(

v

)

= g

¦

¡Val

(

v

1

)

, Val

(

v

2

)

, Err

(

v

1

)

, Err

(

v

2

)

¢

Err

(

v

)

= h

¦

¡Val

(

v

1

)

, Val

(

v

2

)

, Err

(

v

1

)

, Err

(

v

2

)

¢

Val

(

v

1

)

Err

(

v

1

)

Val

(

v

2

)

Err

(

v

2

)

How to propagate Val

(

v

)

and Err

(

v

)

for ¦ ∈ ©+,−,×,¿,À,p,

/

ª

(23)

An arithmetic model for fixed-point code synthesis

An interval arithmetic based model

For each coefficient or variable v, we keep track of 2 intervals Val

(

v

)

and Err

(

v

)

Our model assumes a fixed word-length k

Val

(

v

)

is the range of v

the format Q

i

.

f

of v is deduced from

Val

(

v

)

=

£v,v¤

Ï

i

=

l

log

2

(

max

(

¯

¯

v

¯

¯

,

¯

¯

v

¯

¯

))

m

+ α

Ï

f

= k − i

α =

(1, if mod ¡log

2

(

v

)

,

1¢ 6= 0,

2, otherwise

Err

(

v

)

encloses the

rounding error of

computing v

a bound

² on rounding

errors is deduced from

Err

(

v

)

=

£e,e

¤

Ï

² = max¡¯¯e¯¯,¯¯e¯¯¢

¦ ¦ ¦ a0 ¦ a1 ¦ ¦ a2 ¦ a3 ¦

x

¦ ¦ ¦ a4 ¦ a5

¦

.

.

Val

(

v

)

= g

¦

¡Val

(

v

1

)

, Val

(

v

2

)

, Err

(

v

1

)

, Err

(

v

2

)

¢

Err

(

v

)

= h

¦

¡Val

(

v

1

)

, Val

(

v

2

)

, Err

(

v

1

)

, Err

(

v

2

)

¢

Val

(

v

1

)

Err

(

v

1

)

Val

(

v

2

)

Err

(

v

2

)

How to propagate Val

(

v

)

and Err

(

v

)

for ¦ ∈ ©+,−,×,¿,À,p,

/

ª

?

(24)

An arithmetic model for fixed-point code synthesis

An interval arithmetic based model

For each coefficient or variable v, we keep track of 2 intervals Val

(

v

)

and Err

(

v

)

Our model assumes a fixed word-length k

Val

(

v

)

is the range of v

the format Q

i

.

f

of v is deduced from

Val

(

v

)

=

£v,v¤

Ï

i

=

l

log

2

(

max

(

¯

¯

v

¯

¯

,

¯

¯

v

¯

¯

))

m

+ α

Ï

f

= k − i

α =

(1, if mod ¡log

2

(

v

)

,

1¢ 6= 0,

2, otherwise

Err

(

v

)

encloses the

rounding error of

computing v

a bound

² on rounding

errors is deduced from

Err

(

v

)

=

£e,e

¤

Ï

² = max¡¯¯e¯¯,¯¯e¯¯¢

¦ ¦ ¦ a0 ¦ a1 ¦ ¦ a2 ¦ a3 ¦

x

¦ ¦ ¦ a4 ¦ a5

¦

.

.

Val

(

v

)

= g

¦

¡Val

(

v

1

)

, Val

(

v

2

)

, Err

(

v

1

)

, Err

(

v

2

)

¢

Err

(

v

)

= h

¦

¡Val

(

v

1

)

, Val

(

v

2

)

, Err

(

v

1

)

, Err

(

v

2

)

¢

Val

(

v

1

)

Err

(

v

1

)

Val

(

v

2

)

Err

(

v

2

)

How to propagate Val

(

v

)

and Err

(

v

)

for ¦ ∈ ©+,−,×,¿,À,p,

/

ª

(25)

An arithmetic model for fixed-point code synthesis

An interval arithmetic based model

For each coefficient or variable v, we keep track of 2 intervals Val

(

v

)

and Err

(

v

)

Our model assumes a fixed word-length k

Val

(

v

)

is the range of v

the format Q

i

.

f

of v is deduced from

Val

(

v

)

=

£v,v¤

Ï

i

=

l

log

2

(

max

(

¯

¯

v

¯

¯

,

¯

¯

v

¯

¯

))

m

+ α

Ï

f

= k − i

α =

(1, if mod ¡log

2

(

v

)

,

1¢ 6= 0,

2, otherwise

Err

(

v

)

encloses the

rounding error of

computing v

a bound

² on rounding

errors is deduced from

Err

(

v

)

=

£e,e

¤

Ï

² = max¡¯¯e¯¯,¯¯e¯¯¢

¦ ¦ ¦ a0 ¦ a1 ¦ ¦ a2 ¦ a3 ¦

x

¦ ¦ ¦ a4 ¦ a5

¦

.

.

Val

(

v

)

= g

¦

¡Val

(

v

1

)

, Val

(

v

2

)

, Err

(

v

1

)

, Err

(

v

2

)

¢

Err

(

v

)

= h

¦

¡Val

(

v

1

)

, Val

(

v

2

)

, Err

(

v

1

)

, Err

(

v

2

)

¢

Val

(

v

1

)

Err

(

v

1

)

Val

(

v

2

)

Err

(

v

2

)

How to propagate Val

(

v

)

and Err

(

v

)

for ¦ ∈ ©+,−,×,¿,À,p,

/

ª

?

(26)

An arithmetic model for fixed-point code synthesis

Fixed-point multiplication

The output format of a Q

i1

.

f1

× Q

i2

.

f2

is Q

i1 + i2

.

f1 + f2

But, doubling the word-length is costly

i1

f1

i2

f2

i1 + i2

f1 + f2

fr

• • • • • • •

• • • • • • •

×

• • • • • • •

• • • • • • •

Discarded bits

×

. .

Val

(v ) =

Val

(v1)×

Val

(v2)

Err

(v ) =

Val

(v1)×

Err

(v2)

+

Val

(v2)×

Err

(v1)

+

Err

(v1)×

Err

(v2)

Val

(v ) =

Val

(v1)×

Val

(v2)−

Err

×

Err

(v ) =

Err

×

+

Val

(v1)×

Err

(v2)

+

Val

(v2)×

Err

(v1)

+

Err

(v1)×

Err

(v2)

Val(v1) Err(v1) Val(v2) Err(v2)

Err

×

=

h

0,2

−f

r

− 2

(

f

1

+f

2

)

i

This multiplication is available on integer processors and DSPs

i n t 3 2 _ t

mul

(

i n t 3 2 _ t

v1

,

i n t 3 2 _ t

v2

) {

i n t 6 4 _ t

prod

= ((

i n t 6 4 _ t

)

v1

) * ((

i n t 6 4 _ t

)

v2

) ;

retu rn

(

i n t 3 2 _ t

) (

prod

>> 32) ;

(27)

An arithmetic model for fixed-point code synthesis

Fixed-point multiplication

The output format of a Q

i1

.

f1

× Q

i2

.

f2

is Q

i1 + i2

.

f1 + f2

But, doubling the word-length is costly

i1

f1

i2

f2

i1 + i2

f1 + f2

fr

• • • • • • •

• • • • • • •

×

• • • • • • •

• • • • • • •

Discarded bits

×

. .

Val

(v ) =

Val

(v1)×

Val

(v2)

Err

(v ) =

Val

(v1)×

Err

(v2)

+

Val

(v2)×

Err

(v1)

+

Err

(v1)×

Err

(v2)

Val

(v ) =

Val

(v1)×

Val

(v2)−

Err

×

Err

(v ) =

Err

×

+

Val

(v1)×

Err

(v2)

+

Val

(v2)×

Err

(v1)

+

Err

(v1)×

Err

(v2)

Val(v1) Err(v1) Val(v2) Err(v2)

Err

×

=

h

0,2

−f

r

− 2

(

f

1

+f

2

)

i

This multiplication is available on integer processors and DSPs

i n t 3 2 _ t

mul

(

i n t 3 2 _ t

v1

,

i n t 3 2 _ t

v2

) {

i n t 6 4 _ t

prod

= ((

i n t 6 4 _ t

)

v1

) * ((

i n t 6 4 _ t

)

v2

) ;

retu rn

(

i n t 3 2 _ t

) (

prod

>> 32) ;

}

(28)

An arithmetic model for fixed-point code synthesis

Fixed-point multiplication

The output format of a Q

i1

.

f1

× Q

i2

.

f2

is Q

i1 + i2

.

f1 + f2

But, doubling the word-length is costly

i1

f1

i2

f2

i1 + i2

f1 + f2

fr

• • • • • • •

• • • • • • •

×

• • • • • • •

• • • • • • •

Discarded bits

×

. .

Val

(v ) =

Val

(v1)×

Val

(v2)

Err

(v ) =

Val

(v1)×

Err

(v2)

+

Val

(v2)×

Err

(v1)

+

Err

(v1)×

Err

(v2)

Val

(v ) =

Val

(v1)×

Val

(v2)−

Err

×

Err

(v ) =

Err

×

+

Val

(v1)×

Err

(v2)

+

Val

(v2)×

Err

(v1)

+

Err

(v1)×

Err

(v2)

Val(v1) Err(v1) Val(v2) Err(v2)

Err

×

=

h

0,2

−f

r

− 2

(

f

1

+f

2

)

i

This multiplication is available on integer processors and DSPs

i n t 3 2 _ t

mul

(

i n t 3 2 _ t

v1

,

i n t 3 2 _ t

v2

) {

i n t 6 4 _ t

prod

= ((

i n t 6 4 _ t

)

v1

) * ((

i n t 6 4 _ t

)

v2

) ;

retu rn

(

i n t 3 2 _ t

) (

prod

>> 32) ;

(29)

An arithmetic model for fixed-point code synthesis

Fixed-point multiplication

The output format of a Q

i1

.

f1

× Q

i2

.

f2

is Q

i1 + i2

.

f1 + f2

But, doubling the word-length is costly

i1

f1

i2

f2

i1 + i2

f1 + f2

fr

• • • • • • •

• • • • • • •

×

• • • • • • •

• • • • • • •

Discarded bits

×

. .

Val

(v ) =

Val

(v1)×

Val

(v2)

Err

(v ) =

Val

(v1)×

Err

(v2)

+

Val

(v2)×

Err

(v1)

+

Err

(v1)×

Err

(v2)

Val

(v ) =

Val

(v1)×

Val

(v2)−

Err

×

Err

(v ) =

Err

×

+

Val

(v1)×

Err

(v2)

+

Val

(v2)×

Err

(v1)

+

Err

(v1)×

Err

(v2)

Val(v1) Err(v1) Val(v2) Err(v2)

Err

×

=

h

0,2

−f

r

− 2

(

f

1

+f

2

)

i

This multiplication is available on integer processors and DSPs

i n t 3 2 _ t

mul

(

i n t 3 2 _ t

v1

,

i n t 3 2 _ t

v2

) {

i n t 6 4 _ t

prod

= ((

i n t 6 4 _ t

)

v1

) * ((

i n t 6 4 _ t

)

v2

) ;

retu rn

(

i n t 3 2 _ t

) (

prod

>> 32) ;

}

(30)

An arithmetic model for fixed-point code synthesis

Our new fixed-point division

The output integer part of Q

i1

.

f1

/

Q

i2

.

f2

may be as large as i

1

+ f

2

But, doubling the word-length is costly

How to obtain sharper a error bounds on Err

/

?

i1

f1

i2

f2

i1

f1 + k

i2

f2

• • • • • • •

• • • • • • •

/

• • • • • • •

0 0 0 0 0 0 0 0

• • • • • • •

÷

×2

k

i1 + f2

i2 + f1

• • • • • • •

• • • • • • •

i1 + f2

fr

• • • • • • •

• • • • • • •

ir

fr

• • •

• • • • • • •

• • •

How to decide of the output format of division?

A large integer part

3

prevents overflow

7

loose error bounds and loss of

precision

A small integer part

7

may cause overflow

3

sharp error bounds and more

(31)

An arithmetic model for fixed-point code synthesis

Our new fixed-point division

The output integer part of Q

i1

.

f1

/

Q

i2

.

f2

may be as large as i

1

+ f

2

But, doubling the word-length is costly

How to obtain sharper a error bounds on Err

/

?

i1

f1

i2

f2

i1

f1 + k

i2

f2

• • • • • • •

• • • • • • •

/

• • • • • • •

0 0 0 0 0 0 0 0

• • • • • • •

÷

×2

k

i1 + f2

i2 + f1

• • • • • • •

• • • • • • •

i1 + f2

fr

• • • • • • •

• • • • • • •

ir

fr

• • •

• • • • • • •

• • •

Err

/

=

£−2

i

2

+f

1

,2

i

2

+f

1

¤

How to decide of the output format of division?

A large integer part

3

prevents overflow

7

loose error bounds and loss of

precision

A small integer part

7

may cause overflow

3

sharp error bounds and more

accurate computations

(32)

An arithmetic model for fixed-point code synthesis

Our new fixed-point division

The output integer part of Q

i1

.

f1

/

Q

i2

.

f2

may be as large as i

1

+ f

2

But, doubling the word-length is costly

How to obtain sharper a error bounds on Err

/

?

i1

f1

i2

f2

i1

f1 + k

i2

f2

• • • • • • •

• • • • • • •

/

• • • • • • •

0 0 0 0 0 0 0 0

• • • • • • •

÷

×2

k

i1 + f2

i2 + f1

• • • • • • •

• • • • • • •

i1 + f2

fr

• • • • • • •

• • • • • • •

ir

fr

• • •

• • • • • • •

• • •

Err

/

=

£−2

f

r

,2

f

r

¤

How to decide of the output format of division?

A large integer part

3

prevents overflow

7

loose error bounds and loss of

precision

A small integer part

7

may cause overflow

3

sharp error bounds and more

(33)

An arithmetic model for fixed-point code synthesis

Our new fixed-point division

The output integer part of Q

i1

.

f1

/

Q

i2

.

f2

may be as large as i

1

+ f

2

But, doubling the word-length is costly

How to obtain sharper a error bounds on Err

/

?

i1

f1

i2

f2

i1

f1 + k

i2

f2

• • • • • • •

• • • • • • •

/

• • • • • • •

0 0 0 0 0 0 0 0

• • • • • • •

÷

×2

k

i1 + f2

i2 + f1

• • • • • • •

• • • • • • •

i1 + f2

fr

• • • • • • •

• • • • • • •

ir

fr

• • •

• • • • • • •

• • •

Err

/

=

£−2

f

r

,2

f

r

¤

©

sharper bound

§

risk of overflow at run-time

How to decide of the output format of division?

A large integer part

3

prevents overflow

7

loose error bounds and loss of

precision

A small integer part

7

may cause overflow

3

sharp error bounds and more

accurate computations

(34)

An arithmetic model for fixed-point code synthesis

Our new fixed-point division

The output integer part of Q

i1

.

f1

/

Q

i2

.

f2

may be as large as i

1

+ f

2

But, doubling the word-length is costly

How to obtain sharper a error bounds on Err

/

?

i1

f1

i2

f2

i1

f1 + k

i2

f2

• • • • • • •

• • • • • • •

/

• • • • • • •

0 0 0 0 0 0 0 0

• • • • • • •

÷

×2

k

i1 + f2

i2 + f1

• • • • • • •

• • • • • • •

i1 + f2

fr

• • • • • • •

• • • • • • •

ir

fr

• • •

• • • • • • •

• • •

Err

/

=

£−2

f

r

,2

f

r

¤

©

sharper bound

§

risk of overflow at run-time

How to decide of the output format of division?

A large integer part

3

prevents overflow

7

loose error bounds and loss of

precision

A small integer part

7

may cause overflow

3

sharp error bounds and more

(35)

An arithmetic model for fixed-point code synthesis

The propagation rule and implementation of division

Once the output format decided Q

ir

.

fr

/

. .

Val

(v ) = Range(Qir

.

fr ) = [−2

ir −1,2ir −1 −2fr ].

Err

(v ) =

Val(v2)·

á

Err

(v1)−Val(v1)·

Err

(v2)

á

Val(v2)·

³

Val(v2)+

á

Err

(v2)

´

+

Err

/

Val(v1) Err(v1) Val(v2) Err(v2)

à

Val

(

v

2

)

=

Val

(

v

1

)

â

Val

(

v

)

+ Err/

∩ Val

(

v

2

)

and â

Val

(

v

)

=

[

−2

i

r

−1

, −2

−f

r

]

[

2

−f

r

,2

i

r

−1

− 2

f

r

]

Additional code to check for run-time overflows

(36)

An arithmetic model for fixed-point code synthesis

The propagation rule and implementation of division

Once the output format decided Q

ir

.

fr

/

. .

Val

(v ) = Range(Qir

.

fr ) = [−2

ir −1,2ir −1 −2fr ].

Err

(v ) =

Val(v2)·

á

Err

(v1)−Val(v1)·

Err

(v2)

á

Val(v2)·

³

Val(v2)+

á

Err

(v2)

´

+

Err

/

Val(v1) Err(v1) Val(v2) Err(v2)

à

Val

(

v

2

)

=

Val

(

v

1

)

â

Val

(

v

)

+ Err/

∩ Val

(

v

2

)

and â

Val

(

v

)

=

[

−2

i

r

−1

, −2

−f

r

]

[

2

−f

r

,2

i

r

−1

− 2

f

r

]

i n t 3 2 _ t

div

(

i n t 3 2 _ t

V1

,

i n t 3 2 _ t

V2

,

u i n t 1 6 _ t

eta

)

{

i n t 6 4 _ t

t1

= ((

i n t 6 4 _ t

)

V1

) <<

eta

;

i n t 6 4 _ t

V

=

t1

/

V2

;

CGPE_ASSERT

CGPE_ASSERT

ret urn

(

i n t 3 2 _ t

)

V

;

}

(37)

An arithmetic model for fixed-point code synthesis

The propagation rule and implementation of division

Once the output format decided Q

ir

.

fr

/

. .

Val

(v ) = Range(Qir

.

fr ) = [−2

ir −1,2ir −1 −2fr ].

Err

(v ) =

Val(v2)·

á

Err

(v1)−Val(v1)·

Err

(v2)

á

Val(v2)·

³

Val(v2)+

á

Err

(v2)

´

+

Err

/

Val(v1) Err(v1) Val(v2) Err(v2)

à

Val

(

v

2

)

=

Val

(

v

1

)

â

Val

(

v

)

+ Err/

∩ Val

(

v

2

)

and â

Val

(

v

)

=

[

−2

i

r

−1

, −2

−f

r

]

[

2

−f

r

,2

i

r

−1

− 2

f

r

]

i n t 3 2 _ t

div

(

i n t 3 2 _ t

V1

,

i n t 3 2 _ t

V2

,

u i n t 1 6 _ t

eta

)

{

i n t 6 4 _ t

t1

= ((

i n t 6 4 _ t

)

V1

) <<

eta

;

i n t 6 4 _ t

V

=

t1

/

V2

;

CGPE_ASSERT((((V & 0xFFFFFFFF80000000ll) == 0xFFFFFFFF80000000ll)

|| ((V & 0xFFFFFFFF80000000ll) == 0)));

ret urn

(

i n t 3 2 _ t

)

V

;

}

Additional code to check for run-time overflows

(38)

An arithmetic model for fixed-point code synthesis

The division format trade-off: case of inverting 2 × 2 matrices

Consider A =

Ã

a

b

c

d

!

with a,b,c,d ∈

[

−1, 1

]

in the format Q

2

.

30

Cramer’s rule: if

∆ = ad − bc 6= 0 then A

−1

=

Ã

d

−b

−c

a

!

/

d

×

a

d

×

b

c

[

−1, 1

]

[

−1, 1

]

[

−1, 1

]

Q

2

.

30

[

−2, 2

]

Q

3

.

29

?

Q −10 .42 Q −8.40 Q−6.38 Q−4.36 Q−2.34 Q0.32 Q 2.30 Q 4.28 Q 6.26 Q 8.24 Q10 .22

2

−36

2

−31

2

−26

2

−21

2

−16

2

−11

D

IVISION OUTPUT FORMAT

Maxim

um

exper

imental

error

0%

20%

40%

60%

80%

100%

Maximum error

(39)

An arithmetic model for fixed-point code synthesis

The division format trade-off: case of inverting 2 × 2 matrices

Consider A =

Ã

a

b

c

d

!

with a,b,c,d ∈

[

−1, 1

]

in the format Q

2

.

30

Cramer’s rule: if

∆ = ad − bc 6= 0 then A

−1

=

Ã

d

−b

−c

a

!

/

d

×

a

d

×

b

c

[

−1, 1

]

[

−1, 1

]

[

−1, 1

]

Q

2

.

30

[

−2, 2

]

Q

3

.

29

?

Q −10 .42 Q −8.40 Q−6.38 Q−4.36 Q−2.34 Q0.32 Q 2.30 Q 4.28 Q 6.26 Q 8.24 Q10 .22

2

−36

2

−31

2

−26

2

−21

2

−16

2

−11

D

IVISION OUTPUT FORMAT

Maxim

um

exper

imental

error

0%

20%

40%

60%

80%

100%

Maximum error

(40)

An arithmetic model for fixed-point code synthesis

The division format trade-off: case of inverting 2 × 2 matrices

Consider A =

Ã

a

b

c

d

!

with a,b,c,d ∈

[

−1, 1

]

in the format Q

2

.

30

Cramer’s rule: if

∆ = ad − bc 6= 0 then A

−1

=

Ã

d

−b

−c

a

!

/

d

×

a

d

×

b

c

[

−1, 1

]

[

−1, 1

]

[

−1, 1

]

Q

2

.

30

[

−2, 2

]

Q

3

.

29

?

Q −10 .42 Q −8.40 Q−6.38 Q−4.36 Q−2.34 Q0.32 Q 2.30 Q 4.28 Q 6.26 Q 8.24 Q10 .22

2

−36

2

−31

2

−26

2

−21

2

−16

2

−11

D

IVISION OUTPUT FORMAT

Maxim

um

exper

imental

error

0%

20%

40%

60%

80%

100%

Maximum error

(41)

An arithmetic model for fixed-point code synthesis

The division format trade-off: case of inverting 2 × 2 matrices

Consider A =

Ã

a

b

c

d

!

with a,b,c,d ∈

[

−1, 1

]

in the format Q

2

.

30

Cramer’s rule: if

∆ = ad − bc 6= 0 then A

−1

=

Ã

d

−b

−c

a

!

/

d

×

a

d

×

b

c

[

−1, 1

]

[

−1, 1

]

[

−1, 1

]

Q

2

.

30

[

−2, 2

]

Q

3

.

29

?

Q −10 .42 Q −8.40 Q−6.38 Q−4.36 Q−2.34 Q0.32 Q 2.30 Q 4.28 Q 6.26 Q 8.24 Q10 .22

2

−36

2

−31

2

−26

2

−21

2

−16

2

−11

D

IVISION OUTPUT FORMAT

Maxim

um

exper

imental

error

0%

20%

40%

60%

80%

100%

Maximum error

(42)

An arithmetic model for fixed-point code synthesis

The division format trade-off: case of inverting 2 × 2 matrices

Consider A =

Ã

a

b

c

d

!

with a,b,c,d ∈

[

−1, 1

]

in the format Q

2

.

30

Cramer’s rule: if

∆ = ad − bc 6= 0 then A

−1

=

Ã

d

−b

−c

a

!

/

d

×

a

d

×

b

c

[

−1, 1

]

[

−1, 1

]

[

−1, 1

]

Q

2

.

30

[

−2, 2

]

Q

3

.

29

?

Q −10 .42 Q −8.40 Q−6.38 Q−4.36 Q−2.34 Q0.32 Q 2.30 Q 4.28 Q 6.26 Q 8.24 Q10 .22

2

−36

2

−31

2

−26

2

−21

2

−16

2

−11

D

IVISION OUTPUT FORMAT

Maxim

um

exper

imental

error

0%

20%

40%

60%

80%

100%

Maximum error

(43)

An arithmetic model for fixed-point code synthesis

The division format trade-off: case of inverting 2 × 2 matrices

Consider A =

Ã

a

b

c

d

!

with a,b,c,d ∈

[

−1, 1

]

in the format Q

2

.

30

Cramer’s rule: if

∆ = ad − bc 6= 0 then A

−1

=

Ã

d

−b

−c

a

!

/

d

×

a

d

×

b

c

[

−1, 1

]

[

−1, 1

]

[

−1, 1

]

Q

2

.

30

[

−2, 2

]

Q

3

.

29

?

Q −10 .42 Q −8.40 Q−6.38 Q−4.36 Q−2.34 Q0.32 Q 2.30 Q 4.28 Q 6.26 Q 8.24 Q10 .22

2

−36

2

−31

2

−26

2

−21

2

−16

2

−11

D

IVISION OUTPUT FORMAT

Maxim

um

exper

imental

error

0%

20%

40%

60%

80%

100%

Maximum error

(44)

An arithmetic model for fixed-point code synthesis

The division format trade-off: case of inverting 2 × 2 matrices

Consider A =

Ã

a

b

c

d

!

with a,b,c,d ∈

[

−1, 1

]

in the format Q

2

.

30

Cramer’s rule: if

∆ = ad − bc 6= 0 then A

−1

=

Ã

d

−b

−c

a

!

/

d

×

a

d

×

b

c

[

−1, 1

]

[

−1, 1

]

[

−1, 1

]

Q

2

.

30

[

−2, 2

]

Q

3

.

29

?

Q −10 .42 Q −8.40 Q−6.38 Q−4.36 Q−2.34 Q0.32 Q 2.30 Q 4.28 Q 6.26 Q 8.24 Q10 .22

2

−36

2

−31

2

−26

2

−21

2

−16

2

−11

Maxim

um

exper

imental

error

0%

20%

40%

60%

80%

100%

Maximum error

Overflow rate

(45)

An arithmetic model for fixed-point code synthesis

The division format trade-off: case of inverting 2 × 2 matrices

Consider A =

Ã

a

b

c

d

!

with a,b,c,d ∈

[

−1, 1

]

in the format Q

2

.

30

Cramer’s rule: if

∆ = ad − bc 6= 0 then A

−1

=

Ã

d

−b

−c

a

!

/

d

×

a

d

×

b

c

[

−1, 1

]

[

−1, 1

]

[

−1, 1

]

Q

2

.

30

[

−2, 2

]

Q

3

.

29

?

Q −10 .42 Q −8.40 Q−6.38 Q−4.36 Q−2.34 Q0.32 Q 2.30 Q 4.28 Q 6.26 Q 8.24 Q10 .22

2

−36

2

−31

2

−26

2

−21

2

−16

2

−11

D

IVISION OUTPUT FORMAT

Maxim

um

exper

imental

error

0%

20%

40%

60%

80%

100%

Maximum error

Overflow rate

(46)

An arithmetic model for fixed-point code synthesis

The division format trade-off: case of inverting 2 × 2 matrices

Consider A =

Ã

a

b

c

d

!

with a,b,c,d ∈

[

−1, 1

]

in the format Q

2

.

30

Cramer’s rule: if

∆ = ad − bc 6= 0 then A

−1

=

Ã

d

−b

−c

a

!

/

d

×

a

d

×

b

c

[

−1, 1

]

[

−1, 1

]

[

−1, 1

]

Q

2

.

30

[

−2, 2

]

Q

3

.

29

?

Q −10 .42 Q −8.40 Q−6.38 Q−4.36 Q−2.34 Q0.32 Q 2.30 Q 4.28 Q 6.26 Q 8.24 Q10 .22

2

−36

2

−31

2

−26

2

−21

2

−16

2

−11

Maxim

um

exper

imental

error

0%

20%

40%

60%

80%

100%

Maximum error

Overflow rate

(47)

An arithmetic model for fixed-point code synthesis

The division format trade-off: case of inverting 2 × 2 matrices

Consider A =

Ã

a

b

c

d

!

with a,b,c,d ∈

[

−1, 1

]

in the format Q

2

.

30

Cramer’s rule: if

∆ = ad − bc 6= 0 then A

−1

=

Ã

d

−b

−c

a

!

/

d

×

a

d

×

b

c

[

−1, 1

]

[

−1, 1

]

[

−1, 1

]

Q

2

.

30

[

−2, 2

]

Q

3

.

29

?

Q −10 .42 Q −8.40 Q−6.38 Q−4.36 Q−2.34 Q0.32 Q 2.30 Q 4.28 Q 6.26 Q 8.24 Q10 .22

2

−36

2

−31

2

−26

2

−21

2

−16

2

−11

D

IVISION OUTPUT FORMAT

Maxim

um

exper

imental

error

0%

20%

40%

60%

80%

100%

Maximum error

Overflow rate

Références

Documents relatifs

Supplementary Materials: The following are available online at http://www.mdpi.com/2072-6694/12/9/2604/s1 , Figure S1: CONSORT diagram of investigational and control cohorts, Figure

Hence, the surface position results from the interplay of the three involved physical mechanisms: surface tension and probe/liquid interaction forces act directly, whereas

It is interesting to remark that, while the undecidability of the Diophantine decision and similar problems can be proved by reducing the Turing machine (or universal register

tion and rehydration in the TTL, LACM reconstructs total water mixing ratios at the top of the TTL, close to the cold point, that are always in substantially better agreement

Cette section combine les contributions pr´ ec´ edentes en un proc´ ed´ e de reconnaissance d’op´ erations d’alg` ebre lin´ eaire en tant que sous-calcul d’un programme

These algorithms search recursively in the APEG where a symmetric binary operator is repeated (we referred at these parts as homogeneous parts). As the propagation algorithms tend

Taking into account both accuracy and execution time simultane- ously is a difficult problem because these two criteria do not cohabit well: improving accuracy may be

The iterative Lloyd’s algorithm [12] is used in our case study. It is applied to bidimensional sets of vectors in order to have easier display and interpretation of the results.