Synthesis of certified programs in fixed-point arithmetic, and its application to linear algebra basic blocks


HAL Id: lirmm-01277374

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01277374

Submitted on 22 Feb 2016


Mohamed Amine Najahi

To cite this version:

Mohamed Amine Najahi. Synthesis of certified programs in fixed-point arithmetic, and its application to linear algebra basic blocks. RAIM: Rencontres Arithmétiques de l’Informatique Mathématique, Apr 2015, Rennes, France. 7ème Rencontres Arithmétiques de l’Informatique Mathématique, 2015. ⟨lirmm-01277374⟩

7ème Rencontres Arithmétiques de l’Informatique Mathématique (RAIM 2015)
Rennes, 7-9 April 2015

Amine Najahi
Univ. Perpignan Via Domitia, DALI project-team
Univ. Montpellier 2, LIRMM, UMR 5506
CNRS, LIRMM, UMR 5506

Which arithmetic for computational tasks?

Floating-point computations
  + Easy and fast to implement
  + Easily portable [IEEE754]
  − Requires dedicated hardware
  − Slow if emulated in software

Fixed-point computations
  − Tedious and time-consuming to implement
      • > 50% of design time [Wil98]
  + Relies only on integer instructions
  + Efficient

Embedded systems targets (µ-controllers, DSPs, FPGAs) → have efficient integer instructions

Fixed-point arithmetic is well suited for embedded systems.

But how can we make it easy, fast, and numerically safe to use by non-expert programmers?


The DEFIS approach

DEFIS (ANR, 2011-2015)

Goal: develop techniques and tools to automate fixed-point programming

Combines conversion and IP block synthesis
  ▸ Ménard et al. (CAIRN, Univ. Rennes) [MCCS02]:
      • automatic float-to-fix conversion
  ▸ Didier et al. (PEQUAN, Univ. Paris) [LHD14]:
      • code generation for the linear filter IP block
  ▸ Our approach (DALI, Univ. Perpignan):
      • certified fixed-point synthesis for:
          • Fine-grained IP blocks: dot-products, polynomials, ...
          • High-level IP blocks: matrix multiplication, triangular matrix inversion, Cholesky decomposition

Long-term objective: code synthesis for matrix inversion

[Figure: infrastructure for the design of fixed-point systems. Labeled elements: application description, floating-point C code, accuracy constraint, algorithm-level optimization, system-level optimization, dynamic range evaluation, IWL determination, FWL determination, accuracy evaluation, specific block generation, parameterized IP blocks, back-end S2S transformation, fixed-point C code, architecture model, validation & optimization, implementation tools (high-level synthesis, compiler, architecture); blocks numbered B1 to B6.]


Our road-map

How to generate certified fixed-point code for matrix inversion?

1. Specify an arithmetic model
   ▸ Contributions:
       • formalization of √ and /
2. Build a synthesis tool, CGPE, for fine-grained IP blocks:
   ▸ it adheres to the arithmetic model
   ▸ Contributions:
       • implementation of the arithmetic model
3. Build a second synthesis tool, FPLA, for algorithmic IP blocks:
   ▸ it generates code using CGPE
   ▸ Contributions:
       • trade-off implementations for matrix multiplication
       • code synthesis for Cholesky decomposition and triangular matrix inversion


Outline of the talk

1. An arithmetic model for fixed-point code synthesis
2. An implementation of the arithmetic model: the CGPE tool
3. Fixed-point code synthesis for linear algebra basic blocks

An arithmetic model for fixed-point code synthesis


Fixed-point arithmetic numbers

A fixed-point number $x$ is defined by two integers:
  • $X$, the $k$-bit integer representation of $x$
  • $f$, the implicit scaling factor of $x$

The value of $x$ is given by

  $x = \dfrac{X}{2^f} = \sum_{\ell=-f}^{k-1-f} X_{\ell+f} \cdot 2^{\ell}$

Bit layout for $k = 8$, $i = 3$, $f = 5$:

  $X_7\,X_6\,X_5 \;.\; X_4\,X_3\,X_2\,X_1\,X_0$   (3 integer bits, 5 fraction bits)

Notation: a fixed-point number with $i$ bits of integer part and $f$ bits of fraction part is in the $Q_{i.f}$ format.

Example:
  ▸ $x$ in $Q_{3.5}$ and $X = (1001\,1000)_2 = (152)_{10}$ → $x = (100.11000)_2 = (4.75)_{10}$

How to compute with fixed-point numbers?
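The definition above can be checked on the slide's own example. The following is a minimal sketch in C, not code from the talk: the helper names `fx_value` and `fx_value_bitwise` are hypothetical, and the representation $X$ is treated as unsigned, as in the $Q_{3.5}$ example.

```c
#include <assert.h>
#include <stdint.h>

/* Value of a fixed-point number from its integer representation X and its
 * implicit scaling factor f: x = X / 2^f. (Hypothetical helper.) */
double fx_value(uint32_t X, unsigned f) {
    assert(f < 64);                          /* keep 2^f representable */
    return (double)X / (double)(1ULL << f);
}

/* Same value, computed bit by bit:
 * x = sum over l = -f .. k-1-f of X_{l+f} * 2^l. (Hypothetical helper.) */
double fx_value_bitwise(uint32_t X, unsigned k, unsigned f) {
    assert(k <= 32 && f < 64);
    double x = 0.0;
    for (unsigned j = 0; j < k; j++) {       /* j = l + f is the bit index */
        if ((X >> j) & 1u) {
            /* weight of bit j is 2^(j - f) */
            double w = (j >= f) ? (double)(1ULL << (j - f))
                                : 1.0 / (double)(1ULL << (f - j));
            x += w;
        }
    }
    return x;
}
```

With $X = 152$ and $f = 5$, both helpers give 4.75: bits 7, 4 and 3 are set, contributing $2^2 + 2^{-1} + 2^{-2}$.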


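An interval arithmetic based model

For each coefficient or variable $v$, we keep track of two intervals, $\mathrm{Val}(v)$ and $\mathrm{Err}(v)$. Our model assumes a fixed word-length $k$.

$\mathrm{Val}(v)$ is the range of $v$. The format $Q_{i.f}$ of $v$ is deduced from $\mathrm{Val}(v) = [\underline{v}, \overline{v}]$:
  ▸ $i = \lceil \log_2(\max(|\underline{v}|, |\overline{v}|)) \rceil + \alpha$
  ▸ $f = k - i$
  where $\alpha = 1$ if $\mathrm{mod}(\log_2(\overline{v}), 1) \neq 0$, and $\alpha = 2$ otherwise.

$\mathrm{Err}(v)$ encloses the rounding error of computing $v$. A bound $\epsilon$ on rounding errors is deduced from $\mathrm{Err}(v) = [\underline{e}, \overline{e}]$:
  ▸ $\epsilon = \max(|\underline{e}|, |\overline{e}|)$

[Figure: expression tree with operator nodes $\diamond$, leaves $a_0, \dots, a_5$ and $x$.]

At a node $v$ with operands $v_1$ and $v_2$, given $\mathrm{Val}(v_1)$, $\mathrm{Err}(v_1)$, $\mathrm{Val}(v_2)$ and $\mathrm{Err}(v_2)$:

  $\mathrm{Val}(v) = g_\diamond(\mathrm{Val}(v_1), \mathrm{Val}(v_2), \mathrm{Err}(v_1), \mathrm{Err}(v_2))$
  $\mathrm{Err}(v) = h_\diamond(\mathrm{Val}(v_1), \mathrm{Val}(v_2), \mathrm{Err}(v_1), \mathrm{Err}(v_2))$

How to propagate $\mathrm{Val}(v)$ and $\mathrm{Err}(v)$ for $\diamond \in \{+, -, \times, \ll, \gg, \sqrt{\ }, /\}$?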
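The format deduction rule and the error bound can be sketched in C as follows. This is an illustration under stated assumptions, not CGPE code: the type and function names (`interval`, `qformat`, `ceil_log2`, `format_of`, `error_bound`) are hypothetical, the logarithm is computed by scaling a power of two so that no math library is needed, and the choice $\alpha = 1$ when the upper bound is not positive is an assumption, since the rule tests $\log_2$ of the upper bound only.

```c
#include <assert.h>

typedef struct { double lo, hi; } interval;  /* closed interval [lo, hi] */
typedef struct { int i, f; } qformat;        /* format Q i.f, with i + f = k */

/* ceil(log2(m)) for m > 0, obtained by scaling a power of two.
 * *is_pow2 is set to 1 when m is exactly a power of two. */
int ceil_log2(double m, int *is_pow2) {
    assert(m > 0.0);
    double p = 1.0;
    int e = 0;
    while (p < m)        { p *= 2.0; e++; }  /* grow until 2^e >= m */
    while (p / 2.0 >= m) { p /= 2.0; e--; }  /* shrink while 2^(e-1) >= m */
    *is_pow2 = (p == m);
    return e;
}

double magnitude(double x) { return x < 0.0 ? -x : x; }

/* Format deduced from Val(v) = [lo, hi] for a word-length k.
 * The range must not be reduced to {0}. */
qformat format_of(interval val, int k) {
    double lo = magnitude(val.lo), hi = magnitude(val.hi);
    int ignored, hi_is_pow2 = 0;
    int i = ceil_log2(lo > hi ? lo : hi, &ignored);
    /* alpha = 2 when log2 of the upper bound is an integer, 1 otherwise.
     * Assumption: alpha = 1 when the upper bound is not positive. */
    if (val.hi > 0.0) ceil_log2(val.hi, &hi_is_pow2);
    int alpha = hi_is_pow2 ? 2 : 1;
    qformat q = { i + alpha, k - (i + alpha) };
    return q;
}

/* epsilon = max(|e_lo|, |e_hi|) from Err(v) = [e_lo, e_hi]. */
double error_bound(interval err) {
    double lo = magnitude(err.lo), hi = magnitude(err.hi);
    return lo > hi ? lo : hi;
}
```

For $k = 32$, the range $[-1, 1]$ yields Q2.30 and the range $[-2, 2]$ yields Q3.29, which matches the formats used later in the talk for the 2 × 2 inversion example.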


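Fixed-point multiplication

The output format of a $Q_{i_1.f_1} \times Q_{i_2.f_2}$ product is $Q_{i_1+i_2\,.\,f_1+f_2}$. But doubling the word-length is costly.

[Figure: the full product has $i_1 + i_2$ integer bits and $f_1 + f_2$ fraction bits; only $f_r$ fraction bits are kept and the remaining low-order bits are discarded.]

Propagation rule when all $f_1 + f_2$ fraction bits are kept:

  $\mathrm{Val}(v) = \mathrm{Val}(v_1) \times \mathrm{Val}(v_2)$
  $\mathrm{Err}(v) = \mathrm{Val}(v_1) \times \mathrm{Err}(v_2) + \mathrm{Val}(v_2) \times \mathrm{Err}(v_1) + \mathrm{Err}(v_1) \times \mathrm{Err}(v_2)$

Propagation rule when only $f_r$ fraction bits are kept:

  $\mathrm{Val}(v) = \mathrm{Val}(v_1) \times \mathrm{Val}(v_2) - \mathrm{Err}_\times$
  $\mathrm{Err}(v) = \mathrm{Err}_\times + \mathrm{Val}(v_1) \times \mathrm{Err}(v_2) + \mathrm{Val}(v_2) \times \mathrm{Err}(v_1) + \mathrm{Err}(v_1) \times \mathrm{Err}(v_2)$

  with $\mathrm{Err}_\times = [0,\; 2^{-f_r} - 2^{-(f_1+f_2)}]$

This multiplication is available on integer processors and DSPs:

    #include <stdint.h>

    int32_t mul(int32_t v1, int32_t v2) {
        /* full 64-bit product of the two 32-bit representations */
        int64_t prod = ((int64_t) v1) * ((int64_t) v2);
        /* keep the upper 32 bits, discard the lower 32 */
        return (int32_t) (prod >> 32);
    }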
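The propagation rule for the truncated product follows from $(v_1 + e_1)(v_2 + e_2) - v_1 v_2 = v_1 e_2 + v_2 e_1 + e_1 e_2$, to which the truncation error $\mathrm{Err}_\times$ is added. The sketch below applies it with a naive interval type. It is an illustration only: the names (`interval`, `tracked`, `mul_truncated`, `two_pow`) are hypothetical, and plain `double` arithmetic without outward rounding does not give a certified enclosure, which a real tool would need.

```c
#include <assert.h>

typedef struct { double lo, hi; } interval;     /* closed interval [lo, hi] */
typedef struct { interval val, err; } tracked;  /* Val(v) and Err(v) */

/* 2^e for a small integer exponent, without the math library. */
double two_pow(int e) {
    double p = 1.0;
    while (e > 0) { p *= 2.0; e--; }
    while (e < 0) { p /= 2.0; e++; }
    return p;
}

interval iv_add(interval a, interval b) {
    return (interval){ a.lo + b.lo, a.hi + b.hi };
}

interval iv_sub(interval a, interval b) {
    return (interval){ a.lo - b.hi, a.hi - b.lo };
}

interval iv_mul(interval a, interval b) {
    double c[4] = { a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi };
    interval r = { c[0], c[0] };
    for (int n = 1; n < 4; n++) {
        if (c[n] < r.lo) r.lo = c[n];
        if (c[n] > r.hi) r.hi = c[n];
    }
    return r;
}

/* Rule for a product of operands with f1 and f2 fraction bits,
 * of which only fr fraction bits are kept. */
tracked mul_truncated(tracked v1, int f1, tracked v2, int f2, int fr) {
    assert(fr <= f1 + f2);
    /* Err_x = [0, 2^-fr - 2^-(f1+f2)] */
    interval err_x = { 0.0, two_pow(-fr) - two_pow(-(f1 + f2)) };
    tracked v;
    v.val = iv_sub(iv_mul(v1.val, v2.val), err_x);
    v.err = iv_add(err_x,
            iv_add(iv_mul(v1.val, v2.err),
            iv_add(iv_mul(v2.val, v1.err),
                   iv_mul(v1.err, v2.err))));
    return v;
}
```

For two exact Q2.30 operands in $[-1, 1]$ multiplied with the `mul` routine of the slide (upper 32 bits of a Q4.60 product, hence $f_r = 28$), the sketch gives an error interval of $[0,\; 2^{-28} - 2^{-60}]$.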



Our new fixed-point division

The output integer part of $Q_{i_1.f_1} / Q_{i_2.f_2}$ may be as large as $i_1 + f_2$. But doubling the word-length is costly.

How to obtain sharper error bounds on $\mathrm{Err}_/$?

[Figure: the dividend ($i_1$ integer bits, $f_1$ fraction bits) is scaled by $2^k$, giving $i_1$ integer bits and $f_1 + k$ fraction bits, then divided by the divisor ($i_2$ integer bits, $f_2$ fraction bits). The quotient has $i_1 + f_2$ integer bits and $i_2 + f_1$ fraction bits. The $k$-bit result keeps either $i_1 + f_2$ integer bits and $f_r$ fraction bits, or a chosen $i_r$ integer bits and $f_r$ fraction bits.]

  $\mathrm{Err}_/ = [-2^{-f_r},\; 2^{-f_r}]$
  + sharper bound
  − risk of overflow at run-time

How to decide on the output format of division?

A large integer part
  ✓ prevents overflow
  ✗ loose error bounds and loss of precision

A small integer part
  ✗ may cause overflow
  ✓ sharp error bounds and more accurate computations
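As a worked instance of this trade-off, consider a word-length $k = 32$ and two candidate output formats. The numbers below are an illustration, not results from the talk. They assume a signed format $Q_{i_r.f_r}$ covering $[-2^{i_r-1},\, 2^{i_r-1} - 2^{-f_r}]$ and a division error bound read as $\mathrm{Err}_/ = [-2^{-f_r}, 2^{-f_r}]$.

```latex
\begin{align*}
Q_{12.20}:\quad & \text{representable quotients } [-2^{11},\; 2^{11} - 2^{-20}],
  & \mathrm{Err}_/ &= [-2^{-20},\; 2^{-20}] \\
Q_{3.29}:\quad  & \text{representable quotients } [-2^{2},\; 2^{2} - 2^{-29}],
  & \mathrm{Err}_/ &= [-2^{-29},\; 2^{-29}]
\end{align*}
```

Moving from $Q_{12.20}$ to $Q_{3.29}$ tightens the bound by a factor of $2^{9}$, but any quotient of magnitude 4 or more then overflows.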



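The propagation rule and implementation of division

Once the output format $Q_{i_r.f_r}$ is decided:

  $\mathrm{Val}(v) = \mathrm{Range}(Q_{i_r.f_r}) = [-2^{i_r-1},\; 2^{i_r-1} - 2^{-f_r}]$

  $\mathrm{Err}(v) = \dfrac{\widetilde{\mathrm{Val}}(v_2) \cdot \mathrm{Err}(v_1) - \mathrm{Val}(v_1) \cdot \mathrm{Err}(v_2)}{\widetilde{\mathrm{Val}}(v_2) \cdot \left(\widetilde{\mathrm{Val}}(v_2) + \mathrm{Err}(v_2)\right)} + \mathrm{Err}_/$

where

  $\widetilde{\mathrm{Val}}(v_2) = \dfrac{\mathrm{Val}(v_1)}{\widehat{\mathrm{Val}}(v) + \mathrm{Err}_/} \cap \mathrm{Val}(v_2)$  and  $\widehat{\mathrm{Val}}(v) = [-2^{i_r-1},\; -2^{-f_r}] \cup [2^{-f_r},\; 2^{i_r-1} - 2^{-f_r}]$

Implementation:

    #include <stdint.h>

    int32_t div(int32_t V1, int32_t V2, uint16_t eta)
    {
        /* scale the dividend by 2^eta in double width */
        int64_t t1 = ((int64_t) V1) << eta;
        int64_t V = t1 / V2;
        /* the quotient must fit in 32 bits:
           its upper 33 bits are all ones or all zeros */
        CGPE_ASSERT((((V & 0xFFFFFFFF80000000ll) == 0xFFFFFFFF80000000ll)
                     || ((V & 0xFFFFFFFF80000000ll) == 0)));
        return (int32_t) V;
    }

The CGPE_ASSERT line is additional code to check for run-time overflows.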
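Two points of this routine can be illustrated with a short sketch: how the shift `eta` relates to the formats, and what the mask test checks. The code below is a sketch under assumptions, not the CGPE implementation. The helpers `div_shift`, `fits_int32`, `fits_int32_mask` and `fx_div` are hypothetical, the relation $f_r = f_1 + \eta - f_2$ is derived from the integer division $(V_1 \cdot 2^{\eta}) / V_2$, and overflow is reported through a return value because the behavior of `CGPE_ASSERT` is not specified in the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shift such that (V1 * 2^eta) / V2 carries fr fraction bits when V1 has
 * f1 fraction bits and V2 has f2 fraction bits: fr = f1 + eta - f2. */
int div_shift(int f1, int f2, int fr) {
    int eta = fr - f1 + f2;
    assert(eta >= 0 && eta <= 31);  /* keeps V1 * 2^eta within 64 bits */
    return eta;
}

/* Range form of the overflow test. */
bool fits_int32(int64_t V) {
    return V >= INT32_MIN && V <= INT32_MAX;
}

/* Mask form, as in the slide: the upper 33 bits are all ones or all zeros. */
bool fits_int32_mask(int64_t V) {
    uint64_t top = (uint64_t)V & 0xFFFFFFFF80000000ull;
    return top == 0xFFFFFFFF80000000ull || top == 0;
}

/* Division with an explicit overflow report instead of an assertion. */
bool fx_div(int32_t V1, int32_t V2, int eta, int32_t *out) {
    if (V2 == 0) return false;
    /* multiply rather than shift: left-shifting a negative value is
     * undefined behavior in C */
    int64_t t1 = (int64_t)V1 * ((int64_t)1 << eta);
    int64_t V = t1 / V2;
    if (!fits_int32(V)) return false;
    *out = (int32_t)V;
    return true;
}
```

For example, dividing 0.5 by 0.25, both in Q2.30, into Q3.29 needs `eta = 29` and returns $2^{30}$, the representation of 2.0 with 29 fraction bits. Dividing 1.0 by $2^{-30}$ with the same `eta` does not fit in 32 bits and is reported as an overflow.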

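The division format trade-off: case of inverting 2 × 2 matrices

Consider $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with $a, b, c, d \in [-1, 1]$ in the format $Q_{2.30}$.

Cramer’s rule: if $\Delta = ad - bc \neq 0$ then $A^{-1} = \begin{pmatrix} \frac{d}{\Delta} & \frac{-b}{\Delta} \\ \frac{-c}{\Delta} & \frac{a}{\Delta} \end{pmatrix}$

[Figure: expression tree for $d / (a \times d - b \times c)$. The operands and the two products have range $[-1, 1]$ and format $Q_{2.30}$; the difference $ad - bc$ has range $[-2, 2]$ and format $Q_{3.29}$; the output format of the division is marked "?".]

[Figure: plot of the maximum experimental error (left axis, $2^{-36}$ to $2^{-11}$) and of the overflow rate (right axis, 0% to 100%) against the division output format, from $Q_{-10.42}$ to $Q_{10.22}$ in steps of two integer bits. Legend: maximum error, overflow rate.]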
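The computation of one entry of $A^{-1}$ can be sketched in C as follows, to make the role of the division output format concrete. This is a minimal sketch under assumptions, not the code generated by CGPE or FPLA: the function names are hypothetical, the products are kept in Q2.30 and the determinant in Q3.29 as in the expression tree, right shifts of negative values are assumed to be arithmetic (as on mainstream compilers), and the number of fraction bits `fr` of the quotient is a free parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ONE_Q2_30 ((int32_t)1 << 30)   /* 1.0 in Q2.30 */

/* Q2.30 x Q2.30 -> Q2.30, valid for operands in [-1, 1]. */
int32_t mul_q2_30(int32_t x, int32_t y) {
    assert(x >= -ONE_Q2_30 && x <= ONE_Q2_30);
    assert(y >= -ONE_Q2_30 && y <= ONE_Q2_30);
    int64_t prod = (int64_t)x * (int64_t)y;  /* Q4.60 */
    return (int32_t)(prod >> 30);            /* drop 30 fraction bits */
}

/* Determinant ad - bc in Q3.29: each product loses one fraction bit
 * before the subtraction, so the difference in [-2, 2] cannot overflow. */
int32_t det_q3_29(int32_t a, int32_t b, int32_t c, int32_t d) {
    int32_t ad = mul_q2_30(a, d);
    int32_t bc = mul_q2_30(b, c);
    return (ad >> 1) - (bc >> 1);
}

/* num / det with num in Q2.30 and det in Q3.29; the result has fr fraction
 * bits. Returns false when det is zero or the quotient does not fit in
 * 32 bits. */
bool inv_entry(int32_t num, int32_t det, int fr, int32_t *out) {
    int eta = fr - 30 + 29;                  /* fr = f1 + eta - f2 */
    assert(eta >= 0 && eta <= 31);
    if (det == 0) return false;
    int64_t t1 = (int64_t)num * ((int64_t)1 << eta);
    int64_t V = t1 / det;
    if (V < INT32_MIN || V > INT32_MAX) return false;
    *out = (int32_t)V;
    return true;
}
```

With $a = d = 2^{-10}$ and $b = c = 0$, the entry $d / \Delta$ equals $2^{10}$. Under these assumptions it overflows an output with 24 fraction bits (Q8.24) but fits one with 20 fraction bits (Q12.20), at the price of a coarser quantization step.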
