Characteristics

This package provides extended-precision versions of Float64, Float32, and Float16.

type name	significand	exponent	◊	base type	significand	exponent
`Double64`	106 bits	11 bits	◊	`Float64`	53 bits	11 bits
`Double32`	48 bits	8 bits	◊	`Float32`	24 bits	8 bits
`Double16`	22 bits	5 bits	◊	`Float16`	11 bits	5 bits

Representation

Double64 is a magnitude ordered, nonoverlapping pair of Float64
Double32 is a magnitude ordered, nonoverlapping pair of Float32
Double16 is a magnitude ordered, nonoverlapping pair of Float16
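
The idea of a magnitude ordered, nonoverlapping pair can be illustrated outside the package. The following is a minimal Python sketch, not the package's code: the names `target`, `hi`, and `lo` are illustrative, and exact rationals stand in for the wider arithmetic. It splits 1/3 into a leading float and a trailing float that holds what the leading one missed.

```python
import math
from fractions import Fraction

# Exact target value that a single 64-bit float cannot represent.
target = Fraction(1, 3)

# hi: the float nearest to the target.
# lo: the float nearest to the part of the target that hi missed.
hi = float(target)
lo = float(target - Fraction(hi))

# Magnitude ordered and nonoverlapping: lo lies entirely below hi's last bit.
assert abs(lo) <= math.ulp(hi) / 2

# Exact error of hi alone versus the (hi, lo) pair.
err_single = abs(target - Fraction(hi))
err_pair = abs(target - (Fraction(hi) + Fraction(lo)))
print(float(err_single), float(err_pair))
```

Because `lo` is smaller than half a unit in the last place of `hi`, adding the two as ordinary floats returns `hi` unchanged. The extra precision exists only while the pair is kept as two separate values.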

(+, -, *) are error-free, (/, sqrt) are least-error
elementary functions are quite accurate
- often better than C "double-double" libraries
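
Error-free arithmetic on pairs is usually built from error-free transformations. The sketch below shows Knuth's TwoSum in Python as one such building block. It is an assumption that the package uses this particular variant internally. The function name `two_sum` and its local variables are illustrative.

```python
from fractions import Fraction


def two_sum(a: float, b: float) -> tuple[float, float]:
    """Return (s, e) where s is the rounded sum and s + e equals a + b exactly."""
    s = a + b
    b_virtual = s - a          # the part of b that made it into s
    a_virtual = s - b_virtual  # the part of a that made it into s
    e = (a - a_virtual) + (b - b_virtual)  # what rounding discarded
    return s, e


s, e = two_sum(0.1, 0.2)
exact = Fraction(0.1) + Fraction(0.2)
# Compare against exact rational arithmetic on the same float inputs.
print(s, e, Fraction(s) + Fraction(e) == exact)
```

The returned pair `(s, e)` is itself magnitude ordered and nonoverlapping, which is why this transformation is a natural fit for a pair-of-floats representation.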

ComplexDF64 is a (real, imag) pair of Double64
ComplexDF32 is a (real, imag) pair of Double32
ComplexDF16 is a (real, imag) pair of Double16

elementary functions are quite accurate
- functions and their inverses round-trip well

Accuracy

For Double64 arguments within 0.0..2.0:

expect the absolute error (abserr) of elementary functions to be 1e-30 or better
expect the relative error (relerr) of elementary functions to be 1e-28 or better

When used with reasonably sized values, expect each successive DoubleFloat op to add no more than 10⋅𝘂² to the cumulative relative error (𝘂 is the relative rounding unit, usually 𝘂 = eps(x)/2). Relative error can accrue steadily. After 100,000 DoubleFloat ops with reasonably sized values, the relerr could approach 100,000 * 10⋅𝘂². In practice these functions are considerably more resilient: our algorithms come from seminal papers and extensive numeric investigation.
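
To put a number on that worst-case bound, here is a small Python sketch. It assumes that 𝘂 refers to the rounding unit of the Float64 base type (2⁻⁵³), so that 𝘂² is on the order of Double64 precision. That reading is an interpretation, and the function name `worst_case_relerr` is illustrative.

```python
import sys

# Assumption: u is the Float64 relative rounding unit, eps/2 = 2**-53.
u = sys.float_info.epsilon / 2

# Stated per-operation contribution to the cumulative relative error.
per_op = 10 * u * u


def worst_case_relerr(n_ops: int) -> float:
    """Linear worst-case growth of relative error after n_ops operations."""
    return n_ops * per_op


print(per_op, worst_case_relerr(100_000))
```

Under this assumption, the bound after 100,000 operations is still many orders of magnitude below Float64's own rounding unit.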

Absolute and Relative Error

results for f(x), x in 0..1

function	abserr	relerr
exp	1.0e-31	1.0e-31
log	1.0e-31	1.0e-31

sin	1.0e-31	1.0e-31
cos	1.0e-31	1.0e-31
tan	1.0e-31	1.0e-31

asin	1.0e-31	1.0e-31
acos	1.0e-31	1.0e-31
atan	1.0e-31	1.0e-31

sinh	1.0e-31	1.0e-29
cosh	1.0e-31	1.0e-31
tanh	1.0e-31	1.0e-29

asinh	1.0e-31	1.0e-29
atanh	1.0e-31	1.0e-30

results for f(x), x in 1..2

function	abserr	relerr
exp	1.0e-30	1.0e-31
log	1.0e-31	1.0e-31

sin	1.0e-31	1.0e-31
cos	1.0e-31	1.0e-28
tan	1.0e-30	1.0e-30

atan	1.0e-31	1.0e-31

sinh	1.0e-30	1.0e-31
cosh	1.0e-30	1.0e-31
tanh	1.0e-31	1.0e-31

asinh	1.0e-31	1.0e-31
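
For readers unfamiliar with the two columns, the Python sketch below shows how absolute and relative error can be measured against a higher-precision reference. It is not the procedure used to produce the tables above, which this document does not specify. It uses an ordinary 64-bit float `exp` as the function under test and the standard `decimal` module as the reference. The helper name `abs_rel_error` is illustrative.

```python
import math
from decimal import Decimal, localcontext


def abs_rel_error(approx: float, reference: Decimal) -> tuple[Decimal, Decimal]:
    """Return (abserr, relerr) of approx measured against reference."""
    with localcontext() as ctx:
        ctx.prec = 60  # work well beyond the precision being measured
        abserr = abs(Decimal(approx) - reference)
        relerr = abserr / abs(reference)
    return abserr, relerr


x = 0.5
with localcontext() as ctx:
    ctx.prec = 60
    reference = Decimal(x).exp()  # high-precision value of exp(x)

abserr, relerr = abs_rel_error(math.exp(x), reference)
print(f"abserr={abserr:.3E} relerr={relerr:.3E}")
```

The two measures differ by the factor |f(x)|, so they diverge when the result is far from 1 in magnitude.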