Towards a constraint system for round-off error analysis of floating-point computation
DP-CP18
Rémy Garcia, Claude Michel, Marie Pelleau, Michel Rueher
Université Côte d’Azur, CNRS, I3S, France
Outline
Motivation
Floating-point numbers
Context
How to represent errors?
Dedicated filtering for errors
Experimentation
Conclusion
Motivation
Programs over $\mathbb{F}$ are written with the semantics of $\mathbb{R}$, but $\mathbb{F} \neq \mathbb{R}$
Computations over $\mathbb{F}$ produce errors → program verification
Approximation tools (Fluctuat, PRECiSA, ...) compute an over-approximation of the error
Our solver computes an over-approximation of the error and finds input values that reach an actual error
→ enables reasoning over errors
Floats – definition
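$\mathbb{F}$ is a finite subset of $\mathbb{R}$
[Figure: number line from $-\infty$ to $+\infty$ through 0; $\mathbb{R}$ is the horizontal line, $\mathbb{F}$ the vertical lines]
IEEE 754 floats are represented by a sign, a mantissa, and an exponent
Example: $-0.542 \times 10^{-2}$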
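As an illustration of the sign / mantissa / exponent representation, the sketch below splits an IEEE 754 binary32 value into its three fields. The helper name `decode_float32` is hypothetical, the decomposition is binary (the slide's example is written in decimal), and only normal numbers are handled.

```python
import struct

def decode_float32(x):
    """Split a normal binary32 value into (sign, unbiased exponent, mantissa)."""
    # Reinterpret the 4 bytes of the float as an unsigned 32-bit integer.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                          # 1 sign bit
    exponent = ((bits >> 23) & 0xFF) - 127     # 8 exponent bits, bias 127
    mantissa = 1 + (bits & 0x7FFFFF) / 2**23   # 23 stored bits + implicit leading 1
    return sign, exponent, mantissa

# -0.15625 = -1.25 * 2**-3
print(decode_float32(-0.15625))
```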
Floats – rounding
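$x, y \in \mathbb{F} \rightarrow x \odot y \notin \mathbb{F}$ in general, where $\odot$ is an operation on $\mathbb{F}$
Rounding of the result to the closest float
Loss of precision: $\circ(x) \neq x$
Required on all operations: $\circ(\circ(x \odot y) \odot z)$
Rounding accumulation
Increases the divergence between $\mathbb{R}$ and $\mathbb{F}$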
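A minimal sketch of rounding and its accumulation, assuming binary32 with rounding to nearest. The helper `f32` is hypothetical: it emulates single precision by packing a Python float into 4 bytes. Errors are measured exactly with rationals, as real value minus float value.

```python
import struct
from fractions import Fraction

def f32(v):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", v))[0]

# Rounding a single value: o(x) != x for x = 1/10.
x = f32(0.1)
rounding_error = Fraction(1, 10) - Fraction(x)

# Rounding after every operation: sum x ten times in binary32.
total = 0.0
for _ in range(10):
    total = f32(total + x)

# Deviation between the real sum of the float inputs and the float sum.
accumulated_error = 10 * Fraction(x) - Fraction(total)

print(rounding_error, total, accumulated_error)
```

The accumulated deviation is larger in magnitude than ten times the initial rounding error, because each intermediate addition is rounded again.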
Floats – main issues with rounding
Absorption: addition of two numbers of different magnitude
$10^{8} + 10^{-1} = 10^{8}$ → loss of precision
Cancellation: subtraction of two numbers of similar magnitude
$(1.0 - 10^{-5}) - 1.0 = -9.99999999995449 \cdot 10^{-6}$ → loss of the most significant digits
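Both effects can be reproduced in a few lines. This is a sketch under two assumptions: absorption is shown in binary32 (emulated by the hypothetical helper `f32`), and cancellation is shown in binary64, which is what Python floats use.

```python
import struct
from fractions import Fraction

def f32(v):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", v))[0]

# Absorption: the spacing of binary32 floats near 1e8 is 8, so adding 0.1 changes nothing.
big, small = f32(1e8), f32(0.1)
absorbed = f32(big + small)

# Cancellation: the leading digits cancel and the rounding of (1.0 - 1e-5) becomes visible.
diff = (1.0 - 1e-5) - 1.0
cancellation_error = Fraction(-1, 100000) - Fraction(diff)

print(absorbed == big)
print(diff, float(cancellation_error))
```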
Motivating example
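CSP
$x = 11.34\mathrm{f} \wedge y \in [1.3\mathrm{f}, 3.4\mathrm{f}] \wedge o = 2.43\mathrm{f} \wedge k \in [2.0\mathrm{f}, 3.0\mathrm{f}] \wedge$ (1)
$\mathrm{err}(y) = 0 \wedge \mathrm{err}(k) = 0 \wedge$ (2)
$w = x + y \wedge u = o + k \wedge z = w - u$ (3)
Goal
Finding input values such that $\mathrm{err}(z) > 0$, or such that $\mathrm{err}(z) = 0$
Method
Filtering + Search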
Motivating example: err(z) > 0
CSP
$x = 11.34\mathrm{f} \wedge y \in [1.3\mathrm{f}, 3.4\mathrm{f}] \wedge o = 2.43\mathrm{f} \wedge k \in [2.0\mathrm{f}, 3.0\mathrm{f}] \wedge$
$\mathrm{err}(y) = 0 \wedge \mathrm{err}(k) = 0 \wedge$
$w = x + y \wedge u = o + k \wedge z = w - u$
Results
y = 3.40000009536743164062e+00
k = 3.00000000000000000000e+00
Variable  Value                        Error
w         1.47399997711181640625e+01    4.76837158203125000000e-07
u         5.43000030517578125000e+00   -2.38418579101562500000e-07
z         9.30999946594238281250e+00    7.15255737304687500000e-07
Motivating example: err(z) = 0
CSP
$x = 11.34\mathrm{f} \wedge y \in [1.3\mathrm{f}, 3.4\mathrm{f}] \wedge o = 2.43\mathrm{f} \wedge k \in [2.0\mathrm{f}, 3.0\mathrm{f}] \wedge$
$\mathrm{err}(y) = 0 \wedge \mathrm{err}(k) = 0 \wedge$
$w = x + y \wedge u = o + k \wedge z = w - u$
Results
y = 3.40000009536743164062e+00
k = 2.99999880790710449219e+00
Variable  Value                        Error
w         1.47399997711181640625e+01    4.76837158203125000000e-07
u         5.42999887466430664062e+00    0.00000000000000000000e+00
z         9.31000137329101562500e+00    0.00000000000000000000e+00
Error compensation: $e_z = e_w - e_u + e_\ominus$, thanks to conservation of sign
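The two result tables of the motivating example can be re-derived with a short script. This is only a sketch, not the solver: it assumes binary32 with rounding to nearest (emulated by the hypothetical helper `f32`), takes the error as real value minus float value, and treats the constants `x` and `o` as error-free floats. The function name `run` is also hypothetical.

```python
import struct
from fractions import Fraction

def f32(v):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", v))[0]

def run(y, k):
    """Evaluate w = x + y, u = o + k, z = w - u in binary32 and return exact errors."""
    x, o = f32(11.34), f32(2.43)
    w = f32(x + y)
    u = f32(o + k)
    z = f32(w - u)
    e_w = Fraction(x) + Fraction(y) - Fraction(w)    # rounding error of the first addition
    e_u = Fraction(o) + Fraction(k) - Fraction(u)    # rounding error of the second addition
    e_sub = Fraction(w) - Fraction(u) - Fraction(z)  # rounding error of the subtraction itself
    e_z = e_w - e_u + e_sub                          # compensation formula from the slide
    return z, e_w, e_u, e_z

for k in (3.0, f32(2.99999880790710449219)):
    z, e_w, e_u, e_z = run(f32(3.4), k)
    print(f"k={k!r} z={z!r} e_w={float(e_w):e} e_u={float(e_u):e} e_z={float(e_z):e}")
```

With `k = 3.0` the errors on `w` and `u` have opposite signs and add up in `z`; with the second value of `k` the subtraction's own rounding error cancels the error inherited from `w`.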
Context
Program verification over $\mathbb{F}$
Programs over $\mathbb{F}$ are written with the semantics of $\mathbb{R}$
Method: Constraint Programming
Representing possible values of the error
Dedicated filtering for rounding errors over $\mathbb{F}$
Focus on basic arithmetic operations
Why analyze the error?
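Computation deviation between $\mathbb{R}$ and $\mathbb{F}$
Absolute error
$|(x \cdot y) - (x_{\mathbb{F}} \odot y_{\mathbb{F}})|$
$x, y \in \mathbb{R}$ and $x_{\mathbb{F}}, y_{\mathbb{F}} \in \mathbb{F}$, with $x = x_{\mathbb{F}} + e_x$
$\cdot$ is an operation on $\mathbb{R}$ and $\odot$ the corresponding operation on $\mathbb{F}$
Change in the expected behavior of a program
Execution over $\mathbb{F} \neq$ execution over $\mathbb{R}$
Error on each elementary operation
Impact on following computations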
Which error?
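Computation of deviation
On classical arithmetic operations: $+, -, \times, \div$ → computable on $\mathbb{Q}$
Observational error not taken into account
Signed errors → possible compensation of errors
Computed error: $(x \cdot y) - (x_{\mathbb{F}} \odot y_{\mathbb{F}})$
$x, y \in \mathbb{R}$ and $x_{\mathbb{F}}, y_{\mathbb{F}} \in \mathbb{F}$, with $x = x_{\mathbb{F}} + e_x$
$\cdot$ is an operation on $\mathbb{R}$ and $\odot$ the corresponding operation on $\mathbb{F}$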
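For reference, the computed error of each basic operation can be expanded from the definitions above. The block below is a worked derivation, not a quotation from the slides: it assumes $y = y_{\mathbb{F}} + e_y$ by analogy with $x$, writes $z_{\mathbb{F}} = x_{\mathbb{F}} \odot y_{\mathbb{F}}$, and introduces $e_\odot$ for the rounding error of the operation alone.

```latex
\begin{align*}
e_\odot &= (x_{\mathbb{F}} \cdot y_{\mathbb{F}}) - (x_{\mathbb{F}} \odot y_{\mathbb{F}}),
\qquad e_z = (x \cdot y) - z_{\mathbb{F}} \\
z = x + y:\quad e_z &= e_x + e_y + e_\oplus \\
z = x - y:\quad e_z &= e_x - e_y + e_\ominus \\
z = x \times y:\quad e_z &= x_{\mathbb{F}}\, e_y + y_{\mathbb{F}}\, e_x + e_x e_y + e_\otimes \\
z = x \div y:\quad e_z &= \frac{y_{\mathbb{F}}\, e_x - x_{\mathbb{F}}\, e_y}{y_{\mathbb{F}}\,(y_{\mathbb{F}} + e_y)} + e_\oslash
\end{align*}
```

Each line follows by substituting $x = x_{\mathbb{F}} + e_x$ and $y = y_{\mathbb{F}} + e_y$ into $x \cdot y$ and subtracting $z_{\mathbb{F}}$; every right-hand side is a rational expression of rational quantities.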
Deviation example
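Rump's polynomial
$333.75\,b^6 + a^2\,(11a^2b^2 - b^6 - 121b^4 - 2) + 5.5\,b^8 + \frac{a}{2b}$
Results
For $a = 77617$ and $b = 33096$:
over $\mathbb{R}$: $-\frac{54767}{66192} \approx -0.827396060$
over $\mathbb{F}$: $\approx 6.3382530011411 \times 10^{29}$ (single-precision float with rounding to nearest)
⟹ Change of sign and huge change of magnitude
Error
Error $= -\frac{41954164265153480271934889285124096}{66192}$ (in $\mathbb{Q}$) $\approx -6.3382530011411 \times 10^{29}$ (in $\mathbb{F}$)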
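The exact value over the reals can be checked with rational arithmetic. The sketch below evaluates the same expression twice: once with `fractions.Fraction` (exact) and once with Python floats. Note the assumption: Python floats are binary64, not the single precision used on the slide, so the float result differs from the slide's value, but it is still far from the exact one. The function name `rump` and its `num` parameter are hypothetical.

```python
from fractions import Fraction

def rump(num):
    """Evaluate Rump's polynomial at a = 77617, b = 33096 using the number type `num`."""
    a, b = num(77617), num(33096)
    return (num("333.75") * b**6
            + a**2 * (11 * a**2 * b**2 - b**6 - 121 * b**4 - 2)
            + num("5.5") * b**8
            + a / (2 * b))

exact = rump(Fraction)   # exact value over Q
approx = rump(float)     # binary64 evaluation, rounded after each operation

print(exact, float(exact))
print(approx)
```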
Domain for errors over $\mathbb{Q}$
For $x$ a float variable of a CSP
Domain of values $D_x$
Interval of $\mathbb{F}$
Cannot represent the associated error: error $\notin \mathbb{F}$
→ New domain over $\mathbb{Q}$
Domain of errors $D_{e_x}$
Interval of $\mathbb{Q}$
→ Exact representation of the error for $+, -, \times, \div$
[Figure: variable $x$ with its two domains, $D_x$ and $D_{e_x}$]
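A small example of why the error needs its own domain over the rationals: the error of a binary32 division is generally not a float at all. The sketch assumes rounding to nearest; `f32` and `is_dyadic` are hypothetical helper names.

```python
import struct
from fractions import Fraction

def f32(v):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", v))[0]

def is_dyadic(q):
    """True if q = n / 2**k, a necessary condition for being a binary float."""
    d = q.denominator
    return d & (d - 1) == 0

z = f32(1.0 / 3.0)                   # binary32 result of 1 / 3
error = Fraction(1, 3) - Fraction(z) # exact error, real value minus float value

print(error, is_dyadic(error))
```

The error has a factor 3 in its denominator, so no binary floating-point format can store it exactly, whereas a rational type can.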
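Relation between $D_v$ and $D_{e_v}$
Consider $z = x \odot y$
IEEE 754: the operations $\oplus, \ominus, \otimes, \oslash$ are correctly rounded:
$(x \odot y) - \frac{1}{2}\mathrm{ulp}(x \odot y) \leq x \cdot y \leq (x \odot y) + \frac{1}{2}\mathrm{ulp}(x \odot y)$
ulp: distance between two consecutive floats
Error on the operation:
$-\frac{1}{2}\mathrm{ulp}(x \odot y) \leq e_\odot \leq \frac{1}{2}\mathrm{ulp}(x \odot y)$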
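The half-ulp bound can be checked on the first addition of the motivating example. This is a sketch assuming binary32 and rounding to nearest; `f32` and `ulp32` are hypothetical helpers, and `ulp32` only handles normal numbers.

```python
import math
import struct
from fractions import Fraction

def f32(v):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", v))[0]

def ulp32(x):
    """Distance between consecutive binary32 floats in the binade of a normal x."""
    _, e = math.frexp(x)             # x = m * 2**e with 0.5 <= |m| < 1
    return Fraction(2) ** (e - 24)   # 24-bit significand

x, y = f32(11.34), f32(3.4)
z = f32(x + y)
e_add = Fraction(x) + Fraction(y) - Fraction(z)  # error of the operation alone
half = ulp32(z) / 2

print(float(ulp32(z)), float(e_add), -half <= e_add <= half)
```

On these inputs the real sum falls exactly halfway between two floats, so the error reaches the bound.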
Projection function
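Based on deviation computation between $\mathbb{R}$ and $\mathbb{F}$
For which constraints?
Basic arithmetic constraints: $+, -, \times, \div$
Assignment: propagation of the error
Projection functions for $z = x - y$
$e_z \leftarrow e_z \cap (e_x - e_y + e_\ominus)$
$e_x \leftarrow e_x \cap (e_z + e_y - e_\ominus)$
$e_y \leftarrow e_y \cap (e_x - e_z + e_\ominus)$
$e_\ominus \leftarrow e_\ominus \cap (e_z - e_x + e_y)$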
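The four projections can be sketched with closed intervals of rationals. This is an illustrative toy, not the solver's implementation: intervals are `(lo, hi)` tuples, and all function names (`add`, `sub`, `meet`, `filter_sub`) are hypothetical. The usage example takes the error domains of `z = w - u` from the motivating example and asks for `err(z) = 0`.

```python
from fractions import Fraction as Q

def add(a, b):
    """Interval addition."""
    return (a[0] + b[0], a[1] + b[1])

def sub(a, b):
    """Interval subtraction."""
    return (a[0] - b[1], a[1] - b[0])

def meet(a, b):
    """Interval intersection; an empty result means the constraint is inconsistent."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    if lo > hi:
        raise ValueError("empty error domain")
    return (lo, hi)

def filter_sub(ez, ex, ey, eop):
    """One pass of the four projection functions for z = x - y."""
    ez = meet(ez, add(sub(ex, ey), eop))
    ex = meet(ex, sub(add(ez, ey), eop))
    ey = meet(ey, add(sub(ex, ez), eop))
    eop = meet(eop, add(sub(ez, ex), ey))
    return ez, ex, ey, eop

h = Q(1, 2**21)                      # half an ulp of z in binary32
result = filter_sub(ez=(Q(0), Q(0)),     # goal: err(z) = 0
                    ex=(h, h),           # err(w) from the example
                    ey=(-h / 2, Q(0)),   # possible errors on u
                    eop=(-h, h))         # rounding error of the subtraction
print(result)
```

Filtering alone deduces that `err(u)` must be 0 and that the subtraction must round by exactly minus half an ulp, which matches the second result table of the motivating example.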
Solve cubic function
Computing real roots of the expression: $x^3 + ax^2 + bx + c = 0$
Implementation from the GSL (GNU Scientific Library)
int gsl_poly_solve_cubic(double a, double b, double c, ...) {
    double q = (a * a - 3 * b);
    double r = (2 * a * a * a - 9 * a * b + 27 * c);
    double Q = q / 9;
    double R = r / 54;
    double Q3 = Q * Q * Q;
    double R2 = R * R;
    double CR2 = 729 * r * r;
    double CQ3 = 2916 * q * q * q;
    if (R == 0 && Q == 0) {
        ...
    } else {
        ...
    }
}
Solve cubic – problem
Question
Are there input values such that the then-branch is taken with an error?
Computation framework
Solver over $\mathbb{F}$: Objective-CP (L. Michel and P. Van Hentenryck) + FPCS (C. Michel)
Domain of errors: $\mathbb{Q}$ from GMP
Filtering dedicated to errors
Classical search strategy over $\mathbb{F}$
Solve cubic – CSP
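CSP reaching the then-branch
$a \in [14.0, 16.0] \wedge b \in [-200, 200] \wedge c \in [-200, 200] \wedge$
$\mathrm{err}(a) = 0 \wedge \mathrm{err}(b) = 0 \wedge \mathrm{err}(c) = 0 \wedge$
$q = (a \times a - 3 \times b) \wedge$
$r = (2 \times a \times a \times a - 9 \times a \times b + 27 \times c) \wedge$
$Q = q \div 9 \wedge$
$R = r \div 54 \wedge$
$R == 0 \wedge Q == 0$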
Solve cubic – results
Exact computation: $\mathrm{err}(Q) = 0 \wedge \mathrm{err}(R) = 0$
a = 15, b = 75, c = 125, err(Q) = 0, err(R) = 0
Computation with errors: $\mathrm{err}(Q) > 0 \wedge \mathrm{err}(R) > 0$
a = 1.51000000e+01, b = 7.60033333e+01, c = 1.27516704e+02, err(Q) ≈ 2.10794345e-15, err(R) ≈ 1.26633847e-14
Computation with errors
Produces a computation deviation between $\mathbb{F}$ and $\mathbb{R}$
Condition true over $\mathbb{F}$ but false over $\mathbb{R}$
FPBench – solving time
Benchmark    Gappa   Fluctuat  Real2Float  FPTaylor  PRECiSA  Objective-CP
carbonGas    0.152   0.025      0.815       1.209      3.830  0.060
verhulst     0.034   0.043      0.465       0.812      0.789  0.032
predPrey     0.052   0.031      0.735       0.916      0.477  0.050
rigidBody1   0.086   0.029      0.494       0.877      0.653  0.069
rigidBody2   0.112   0.024      0.287       1.115      0.565  0.094
doppler1     0.057   0.025      5.998       3.026    107.696  3.480
doppler2     0.069   0.029      5.993       3.008     26.520  2.519
doppler3     0.063   0.029      5.970      21.927     45.875  2.546
turbine1     0.165   0.028     67.960       2.906    110.272  0.232
turbine2     0.100   0.026      3.972       1.939      7.145  0.225
turbine3     0.130   0.026     67.460       3.430    351.022  0.295
sqroot       0.281   0.024      0.712       1.157      0.343  0.064
sine         0.145   0.025      0.948       1.296      6.023  2.445
sineOrder3   0.114   0.026      0.304       0.847      1.616  0.033
Solving times in seconds; the slide highlights the best, second best, and worst time for each benchmark
Reasonable solving times for Objective-CP (filtering + search), with computation over $\mathbb{Q}$
Conclusion
Contributions:
Domain of errors → representing possible values of errors
Projection functions → filtering of domains of errors
Constraints over errors → reasoning on errors
Next step:
Find input values which maximise a reachable error