Characteristics

This package provides extended-precision versions of Float64, Float32, and Float16.


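| type name | significand | exponent | base type | base significand | base exponent |
|-----------|-------------|----------|-----------|------------------|---------------|
| Double64  | 106 bits    | 11 bits  | Float64   | 53 bits          | 11 bits       |
| Double32  | 48 bits     | 8 bits   | Float32   | 24 bits          | 8 bits        |
| Double16  | 22 bits     | 5 bits   | Float16   | 11 bits          | 5 bits        |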

Representation

Double64 is a magnitude-ordered, nonoverlapping pair of Float64 values.

Double32 is a magnitude-ordered, nonoverlapping pair of Float32 values.

Double16 is a magnitude-ordered, nonoverlapping pair of Float16 values.

  • (+, -, *) are error-free, (/, sqrt) are least-error
  • elementary functions are quite accurate
    • often better than C "double-double" libraries
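The "error-free" property can be illustrated with the classic error-free transformations that double-double arithmetic is conventionally built on: Knuth's TwoSum and Dekker's TwoProduct. The sketch below is written in Python on ordinary 64-bit floats purely for illustration. It is not this package's source code, and the function names (`two_sum`, `split`, `two_prod`) are hypothetical. Each function returns a (hi, lo) pair whose exact sum equals the exact mathematical result, which is the kind of magnitude-ordered, nonoverlapping pair described above.

```python
from fractions import Fraction

def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and s + e == a + b exactly."""
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    e = (a - a_virtual) + (b - b_virtual)
    return s, e

def split(a):
    """Veltkamp split: return (hi, lo) with hi + lo == a, each half short enough
    that products of halves are exact in float64."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    lo = a - hi
    return hi, lo

def two_prod(a, b):
    """Dekker's TwoProduct: return (p, e) with p = fl(a * b) and p + e == a * b exactly
    (assuming no overflow or underflow)."""
    p = a * b
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    e = (((a_hi * b_hi - p) + a_hi * b_lo) + a_lo * b_hi) + a_lo * b_lo
    return p, e

# The low part captures what a plain float addition would discard.
print(two_sum(1.0, 1e-20))  # (1.0, 1e-20)

# Exactness check using rational arithmetic: hi + lo equals the true result.
s, e = two_sum(0.1, 0.2)
print(Fraction(s) + Fraction(e) == Fraction(0.1) + Fraction(0.2))

p, e = two_prod(0.1, 0.3)
print(Fraction(p) + Fraction(e) == Fraction(0.1) * Fraction(0.3))
```

Division and square root have no such exact two-term representation in general, which is consistent with their being described as least-error rather than error-free.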

ComplexDF64 is a (real, imag) pair of Double64 values.

ComplexDF32 is a (real, imag) pair of Double32 values.

ComplexDF16 is a (real, imag) pair of Double16 values.

  • elementary functions are quite accurate
    • functions and their inverses round-trip well

Accuracy

For Double64 arguments within 0.0..2.0:

  • expect the absolute error (abserr) of elementary functions to be 1e-30 or better
  • expect the relative error (relerr) of elementary functions to be 1e-28 or better
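For readers unfamiliar with the two measures, the sketch below shows how abserr and relerr are conventionally computed against a higher-precision reference. It is an illustration only: it measures Python's ordinary 64-bit `math.exp`, not a Double64 function, and the helper name `exp_errors` and the 60-digit working precision are assumptions chosen for the example. The source does not describe how its own figures were measured.

```python
from decimal import Decimal, getcontext
import math

# Work with 60 significant digits so the reference is far more precise
# than the float64 value being measured.
getcontext().prec = 60

def exp_errors(x):
    """Return (abserr, relerr) of math.exp(x) against a 60-digit reference."""
    approx = Decimal(math.exp(x))   # exact decimal value of the float result
    exact = Decimal(x).exp()        # high-precision reference value
    abserr = abs(approx - exact)
    relerr = abserr / abs(exact)
    return abserr, relerr

abserr, relerr = exp_errors(0.5)
print(f"abserr = {abserr:.3e}, relerr = {relerr:.3e}")
```

abserr is the distance from the reference value, and relerr scales that distance by the magnitude of the reference, so the two diverge when the result is far from 1 in magnitude.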

When used with reasonably sized values, expect each successive DoubleFloat operation to add no more than 10⋅𝘂² to the cumulative relative error, where 𝘂 is the relative rounding unit (usually 𝘂 = eps(x)/2). Relative error can accrue steadily: after 100,000 DoubleFloat operations with reasonably sized values, the relerr could approach 100,000 ⋅ 10⋅𝘂². In practice these functions are considerably more resilient: our algorithms come from seminal papers and extensive numerical investigation.
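To put numbers on this bound, the sketch below works the arithmetic for Double64. It assumes 𝘂 is the unit roundoff of the underlying Float64, 2^-53, so that 𝘂² = 2^-106 lines up with the 106-bit significand. That reading of 𝘂 is an assumption, and the function name `worst_case_relerr` is hypothetical. Python is used only as a calculator here.

```python
# Assumption: u is the Float64 unit roundoff, eps(Float64)/2 = 2**-53.
u = 2.0 ** -53

def worst_case_relerr(n_ops, per_op_factor=10.0):
    """Upper bound on accumulated relative error after n_ops operations,
    assuming each adds at most per_op_factor * u**2."""
    return n_ops * per_op_factor * u * u

per_op = worst_case_relerr(1)          # roughly 1.2e-31 per operation
after_100k = worst_case_relerr(100_000)  # roughly 1.2e-26 after 100,000 operations

print(f"per operation:   {per_op:.3e}")
print(f"after 100,000:   {after_100k:.3e}")
```

Under these assumptions, even the worst-case bound after 100,000 operations stays about ten orders of magnitude below the Float64 rounding unit.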


Absolute and Relative Error

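Results for f(x), x in 0..1

| function | abserr  | relerr  |
|----------|---------|---------|
| exp      | 1.0e-31 | 1.0e-31 |
| log      | 1.0e-31 | 1.0e-31 |
| sin      | 1.0e-31 | 1.0e-31 |
| cos      | 1.0e-31 | 1.0e-31 |
| tan      | 1.0e-31 | 1.0e-31 |
| asin     | 1.0e-31 | 1.0e-31 |
| acos     | 1.0e-31 | 1.0e-31 |
| atan     | 1.0e-31 | 1.0e-31 |
| sinh     | 1.0e-31 | 1.0e-29 |
| cosh     | 1.0e-31 | 1.0e-31 |
| tanh     | 1.0e-31 | 1.0e-29 |
| asinh    | 1.0e-31 | 1.0e-29 |
| atanh    | 1.0e-31 | 1.0e-30 |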

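Results for f(x), x in 1..2

| function | abserr  | relerr  |
|----------|---------|---------|
| exp      | 1.0e-30 | 1.0e-31 |
| log      | 1.0e-31 | 1.0e-31 |
| sin      | 1.0e-31 | 1.0e-31 |
| cos      | 1.0e-31 | 1.0e-28 |
| tan      | 1.0e-30 | 1.0e-30 |
| atan     | 1.0e-31 | 1.0e-31 |
| sinh     | 1.0e-30 | 1.0e-31 |
| cosh     | 1.0e-30 | 1.0e-31 |
| tanh     | 1.0e-31 | 1.0e-31 |
| asinh    | 1.0e-31 | 1.0e-31 |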