tango.math.IEEE

Low-level Mathematical Functions which take advantage of the IEEE754 ABI.

License:

BSD style: see license.txt, Digital Mars.

Authors:

Don Clugston, Walter Bright, Sean Kelly
struct IeeeFlags(T) [public]
IEEE exception status flags
These flags indicate that an exceptional floating-point condition has occurred. They indicate that a NaN or an infinity has been generated, that a result is inexact, or that a signalling NaN has been encountered. The return values of the properties should be treated as booleans, although each is returned as an int, for speed.

Example:

real a=3.5;
// Set all the flags to zero
resetIeeeFlags();
assert(!ieeeFlags.divByZero);
// Perform a division by zero.
a/=0.0L;
assert(a==real.infinity);
assert(ieeeFlags.divByZero);
// Create a NaN
a*=0.0L;
assert(ieeeFlags.invalid);
assert(isNaN(a));

// Check that calling func() has no effect on the
// status flags.
IeeeFlags f = ieeeFlags;
func();
assert(ieeeFlags == f);
int inexact() [@property]
The result cannot be represented exactly, so rounding occurred. (example: x = sin(0.1);)
int underflow() [@property]
A zero was generated by underflow (example: x = real.min*real.epsilon/2;)
int overflow() [@property]
An infinity was generated by overflow (example: x = real.max*2;)
int divByZero() [@property]
An infinity was generated by division by zero (example: x = 3/0.0; )
int invalid() [@property]
A machine NaN was generated. (example: x = real.infinity * 0.0; )
IeeeFlags ieeeFlags() [@property]
Return a snapshot of the current state of the floating-point status flags.
void resetIeeeFlags() [public]
Set all of the floating-point status flags to false.
enum RoundingMode : short [public]
IEEE rounding modes. The default mode is ROUNDTONEAREST.
RoundingMode setIeeeRounding(RoundingMode roundingmode) [public]
Change the rounding mode used for all floating-point operations.
Returns the old rounding mode.

When changing the rounding mode, it is almost always necessary to restore it at the end of the function. Typical usage:
auto oldrounding = setIeeeRounding(RoundingMode.ROUNDDOWN);
scope (exit) setIeeeRounding(oldrounding);
RoundingMode getIeeeRounding() [public]
Get the IEEE rounding mode which is in use.
PrecisionControl reduceRealPrecision(PrecisionControl prec) [public]
Set the number of bits of precision used by 'real'.

Returns:

the old precision. This is not supported on all platforms.
real frexp(real value, out int exp) [public]
Separate floating point value into significand and exponent.

Returns:

Calculates and returns x and exp such that value = x * 2^exp and 0.5 <= |x| < 1.0. x has the same sign as value.

Special Values
value returns exp
±0.0 ±0.0 0
+∞ +∞ int.max
-∞ -∞ int.min
±NAN ±NAN int.min
real ldexp(real n, int exp) [public]
Compute n * 2^exp

References:

frexp
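
The relationship between frexp and ldexp can be sketched as a round trip. This is a minimal illustration, assuming the module is importable as tango.math.IEEE; the helper name roundTrip is hypothetical and not part of the library.

```d
import tango.math.IEEE;

// Hypothetical helper (not part of the library): splits a value with frexp
// and rebuilds it with ldexp.
real roundTrip(real value)
{
    int exp;
    real x = frexp(value, exp);   // 0.5 <= |x| < 1.0 for finite, non-zero input
    return ldexp(x, exp);         // x * 2^exp
}
```

For example, frexp applied to 8.0 yields x = 0.5 and exp = 4, since 8 = 0.5 * 2^4, and ldexp(0.5, 4) gives 8.0 again.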
int ilogb(real x) [public]
Extracts the exponent of x as a signed integral value.
If x is not a special value, the result is the same as cast(int)logb(x).

Remarks:

This function is consistent with IEEE754R, but differs from the C function of the same name in the value returned for infinity (in C, ilogb(real.infinity) == int.max). Note that the special return values may all be equal.

Special Values
x ilogb(x) Invalid?
0 FP_ILOGB0 yes
±∞ FP_ILOGBINFINITY yes
NAN FP_ILOGBNAN yes
real logb(real x) [public]
Extracts the exponent of x as a signed integral value.
If x is subnormal, it is treated as if it were normalized. For a positive, finite x:

1 <= x * FLT_RADIX^(-logb(x)) < FLT_RADIX

Special Values
x logb(x) divide by 0?
±∞ +∞ no
±0.0 -∞ yes
real scalbn(real x, int n) [public]
Efficiently calculates x * 2^n.
scalbn handles underflow and overflow in the same fashion as the basic arithmetic operators.

Special Values
x scalbn(x, n)
±∞ ±∞
±0.0 ±0.0
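
As an illustration of how ilogb and scalbn fit together, the sketch below rescales a value so that its binary exponent becomes zero. It assumes the module is importable as tango.math.IEEE and that the argument is finite and non-zero; the helper name normalizeToUnitBinade is hypothetical.

```d
import tango.math.IEEE;

// Hypothetical helper (not part of the library): rescales x by a power of
// two so that the result lies in [1, 2) for positive, finite, non-zero x.
real normalizeToUnitBinade(real x)
{
    // ilogb gives the unbiased binary exponent; scalbn divides it back out.
    return scalbn(x, -ilogb(x));
}
```

With this sketch, 12.0 (exponent 3) maps to 1.5, and 0.75 (exponent -1) also maps to 1.5.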
real fdim(real x, real y) [public]
Returns the positive difference between x and y.
If either of x or y is NAN, it will be returned.

Returns:

Special Values
Arguments fdim(x, y)
x > y x - y
x <= y +0.0
real fabs(real x) [public]
Returns |x|
Special Values
x fabs(x)
±0.0 +0.0
±∞ +∞
real fma(float x, float y, float z) [public]
Returns (x * y) + z, rounding only once according to the current rounding mode.

Bugs:

Not currently implemented - rounds twice.
creal expi(real y) [public]
Calculate cos(y) + i sin(y).
On x86 CPUs, this is a very efficient operation; almost twice as fast as calculating sin(y) and cos(y) separately, and is the preferred method when both are required.
int isNaN(real x) [public]
Returns !=0 if x is a NaN.
int isNormal(X)(X x) [public]
Returns !=0 if x is normalized.
(Need one for each format because subnormal floats might be converted to normal reals)
bool isIdentical(real x, real y) [public]
bool isIdentical(ireal x, ireal y) [public]
bool isIdentical(creal x, creal y) [public]
Is the binary representation of x identical to y?
Same as ==, except that positive and negative zero are not identical, and two NANs are identical if they have the same 'payload'.
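
The two cases where isIdentical and == disagree can be checked directly. This is a small sketch, assuming the module is importable as tango.math.IEEE; the helper name differsFromEquality is hypothetical.

```d
import tango.math.IEEE;

// Hypothetical helper (not part of the library): true when == and
// isIdentical give different answers for x and y.
bool differsFromEquality(real x, real y)
{
    return (x == y) != isIdentical(x, y);
}
```

The helper returns true for (0.0, -0.0), which compare equal but have different bit patterns, and for (real.nan, real.nan), which compare unequal but share the same bits. It returns false for ordinary values such as (1.0, 1.0).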
int isSubnormal(float f) [public]
int isSubnormal(double d) [public]
int isSubnormal(real x) [public]
Is number subnormal? (Also called "denormal".) Subnormals have a 0 exponent and a 0 most significant significand bit, but are non-zero.
int isZero(real x) [public]
Return !=0 if x is ±0.
Does not affect any floating-point flags.
int isInfinity(real x) [public]
Return !=0 if x is ±∞.
real nextUp(real x) [public]
double nextDoubleUp(double x) [public]
float nextFloatUp(float x) [public]
Calculate the next largest floating point value after x.
Return the least number greater than x that is representable as a real; thus, it gives the next point on the IEEE number line.

Special Values
x nextUp(x)
-∞ -real.max
±0.0 real.min*real.epsilon
real.max +∞
+∞ +∞
NAN NAN

Remarks:

This function is included in the IEEE 754-2008 standard. nextDoubleUp and nextFloatUp are the corresponding functions for the IEEE double and IEEE float number lines.
X splitSignificand(X)(ref X x) [package]
Reduces the magnitude of x, so the bits in the lower half of its significand are all zero. Returns the amount which needs to be added to x to restore its initial value; this amount will also have zeros in all bits in the lower half of its significand.
real nextDown(real x) [public]
double nextDoubleDown(double x) [public]
float nextFloatDown(float x) [public]
Calculate the next smallest floating point value before x.
Return the greatest number less than x that is representable as a real; thus, it gives the previous point on the IEEE number line.

Special Values
x nextDown(x)
+∞ real.max
±0.0 -real.min*real.epsilon
-real.max -∞
-∞ -∞
NAN NAN

Remarks:

This function is included in the IEEE 754-2008 standard. nextDoubleDown and nextFloatDown are the corresponding functions for the IEEE double and IEEE float number lines.
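
Stepping along the IEEE number line with nextUp and nextDown can be sketched as follows, assuming the module is importable as tango.math.IEEE. The helper name gapAbove is hypothetical.

```d
import tango.math.IEEE;

// Hypothetical helper (not part of the library): width of the gap between
// x and the next representable real above it.
real gapAbove(real x)
{
    return nextUp(x) - x;
}
```

At 1.0 the gap is real.epsilon by definition, and it doubles at 2.0. For finite x, nextDown(nextUp(x)) is x again.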
real nextafter(real x, real y) [public]
Calculates the next representable value after x in the direction of y.
If y > x, the result will be the next largest floating-point value; if y < x, the result will be the next smallest value. If x == y, the result is y.

Remarks:

This function is not generally very useful; it's almost always better to use the faster functions nextUp() or nextDown() instead.

IEEE 754 requirements not implemented: The FE_INEXACT and FE_OVERFLOW exceptions will be raised if x is finite and the function result is infinite. The FE_INEXACT and FE_UNDERFLOW exceptions will be raised if the function value is subnormal, and x is not equal to y.
int feqrel(X)(X x, X y) [public]
To what precision is x equal to y?

Returns:

The number of significand bits which are equal in x and y. For example, 0x1.F8p+60 and 0x1.F1p+60 are equal to 5 bits of precision.

Special Values
x y feqrel(x, y)
x x typeof(x).mant_dig
x >= 2*x 0
x <= x/2 0
NAN any 0
any NAN 0

Remarks:

This is a very fast operation, suitable for use in speed-critical code.
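
One possible use of feqrel is a precision-based comparison in tests. The sketch below assumes the module is importable as tango.math.IEEE; the helper name agreesTo and its bits parameter are hypothetical.

```d
import tango.math.IEEE;

// Hypothetical helper (not part of the library): accepts `actual` if it
// matches `expected` to at least `bits` significand bits.
bool agreesTo(real actual, real expected, int bits)
{
    return feqrel(actual, expected) >= bits;
}
```

Identical values agree to real.mant_dig bits. Values a factor of 2 apart, or any comparison involving a NAN, agree to 0 bits, so they fail for any positive bits threshold.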
int signbit(real x) [public]
Return 1 if sign bit of x is set, 0 if not.
real copysign(real to, real from) [public]
Return a value composed of to with from's sign bit.
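
Because copysign and signbit look at the sign bit itself, they can tell -0.0 from +0.0, which a comparison against zero cannot. A minimal sketch, assuming the module is importable as tango.math.IEEE; the helper name signOf is hypothetical.

```d
import tango.math.IEEE;

// Hypothetical helper (not part of the library): returns +1.0 or -1.0
// according to the sign bit of x, so -0.0 maps to -1.0.
real signOf(real x)
{
    return copysign(1.0L, x);   // magnitude 1, sign bit taken from x
}
```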
T ieeeMean(T)(T x, T y) [public]
Return the value that lies halfway between x and y on the IEEE number line.
Formally, the result is the arithmetic mean of the binary significands of x and y, multiplied by the geometric mean of the binary exponents of x and y. x and y must have the same sign, and must not be NaN.

Note:

this function is useful for ensuring O(log n) behaviour in algorithms involving a 'binary chop'.

Special cases: If x and y are within a factor of 2 (i.e., feqrel(x, y) > 0), the return value is the arithmetic mean (x + y) / 2. If x and y are even powers of 2, the return value is the geometric mean, ieeeMean(x, y) = sqrt(x * y).
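
The 'binary chop' use case can be sketched as a bisection that halves the number of representable values between the bounds on every step, however far apart they start. This is an illustration only, assuming the module is importable as tango.math.IEEE; the helper name sqrtByChop and its preconditions are hypothetical.

```d
import tango.math.IEEE;

// Hypothetical helper (not part of the library): bisects [lo, hi] on the
// IEEE number line to find the largest real whose square is <= target.
// Assumes 0 < lo, lo*lo <= target and hi*hi > target.
real sqrtByChop(real target, real lo, real hi)
{
    for (;;)
    {
        real mid = ieeeMean(lo, hi);
        if (mid == lo || mid == hi)
            return lo;            // no representable value strictly in between
        if (mid * mid <= target)
            lo = mid;             // keep the invariant lo*lo <= target
        else
            hi = mid;             // keep the invariant hi*hi > target
    }
}
```

Using the arithmetic mean instead would take far more steps when the bounds differ by many orders of magnitude, because it only halves the numeric interval rather than the count of representable values.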
real NaN(ulong payload) [public]
Create a NAN, storing an integer inside the payload.
For 80-bit or 128-bit reals, the largest possible payload is 0x3FFF_FFFF_FFFF_FFFF. For doubles, it is 0x3_FFFF_FFFF_FFFF. For floats, it is 0x3F_FFFF.
ulong getNaNPayload(real x) [public]
Extract an integral payload from a NAN.

Returns:

the integer payload as a ulong.

For 80-bit or 128-bit reals, the largest possible payload is 0x3FFF_FFFF_FFFF_FFFF. For doubles, it is 0x3_FFFF_FFFF_FFFF. For floats, it is 0x3F_FFFF.
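
A payload can be used to record why a value is missing. The sketch below assumes the module is importable as tango.math.IEEE; the helper name missing and the idea of an error code are hypothetical.

```d
import tango.math.IEEE;

// Hypothetical helper (not part of the library): builds a NAN that carries
// an integer error code in its payload.
real missing(ulong code)
{
    return NaN(code);
}
```

The result is still a NAN as far as isNaN is concerned, and getNaNPayload recovers the code, provided it does not exceed the largest payload for the format.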