# Module type `Theory.Float`

The theory of floating-point numbers modulo transcendental functions.

This signature includes several functions that can be expressed in terms of a finite sequence of the operations of addition, multiplication, and root extraction. Despite the fact, that those operation are defined in the Basic Theory of Floating points (`Fbasic)`, operations in this signature couldn't be expressed in terms of `Fbasic` because the set of floating points is finite. Therefore the operations in this signature are defined in terms of real numbers arithmetic.

For example, the following expression is not true for all `x,y,m`:

`fmul m x (fmul m x x) = pown m x (succ (succ one))`,

E.g., when `fmul m x x` is not denoted with `x^2`, but the number that is closest to `x^2` with respect to the rounding mode `m`.

`include Fbasic`
```val float : ('r, 's) format Float.t Value.sort -> 's bitv -> ('r, 's) format float```

`float s x` interprets `x` as a floating-point number.

`val fbits : ('r, 's) format float -> 's bitv`

`fbits x` is a bitvector representation of the floating-point number `x`.

`val is_finite : 'f float -> bool`

`is_finite x` holds if `x` represents a finite number.

A floating-point number is finite if it represents a number from the set of real numbers `R`.

The predicate always holds for formats in which only finite floating-point numbers are representable.

`val is_nan : 'f float -> bool`

`is_nan x` holds if `x` represents a not-a-number.

A floating-point value is not-a-number if it is neither finite nor infinite number.

The predicated never holds for formats that represent only numbers.

`val is_inf : 'f float -> bool`

`is_inf x` holds if `x` represents an infinite number.

Never holds for formats in which infinite numbers are not representable.

`val is_fzero : 'f float -> bool`

`is_fzero x` holds if `x` represents a zero.

`val is_fpos : 'f float -> bool`

`is_fpos x` holds if `x` represents a positive number.

The denotation is not defined if `x` represents zero.

`val is_fneg : 'f float -> bool`

`is_fneg x` hold if `x` represents a negative number.

The denotation is not defined if `x` represents zero.

#### Rounding modes

Many operations in the Theory of Floating-Point numbers are defined using the rounding mode parameter.

The rounding mode gives a precise meaning to the phrase "the closest floating-point number to `x`", where `x` is a real number. When `x` is not representable by the given format, some other number `x'` is selected based on rules of the rounding mode.

`val rne : rmode`

rounding to nearest, ties to even.

The denotation is the floating-point number nearest to the denoted real number. If the two nearest numbers are equally close, then the one with an even least significant digit shall be selected. The denotation is not defined, if both numbers have an even least significant digit.

`val rna : rmode`

rounding to nearest, ties away.

The denotation is the floating-point number nearest to the denoted real number. If the two nearest numbers are equally close, then the one with larger magnitude shall be selected.

`val rtp : rmode`

rounding towards positive.

The denotation is the floating-point number that is nearest but no less than the denoted real number.

`val rtn : rmode`

rounding towards negative.

The denotation is the floating-point number that is nearest but not greater than the denoted real number.

`val rtz : rmode`

rounding towards zero.

The denotation is the floating-point number that is nearest but not greater in magnitude than the denoted real number.

`val requal : rmode -> rmode -> bool`

`requal x y` holds if rounding modes are equal.

`val cast_float : 'f Float.t Value.sort -> rmode -> 'a bitv -> 'f float`

`cast_float s m x` is the closest to `x` floating number of sort `s`.

The bitvector `x` is interpreted as an unsigned integer in the two-complement form.

`val cast_sfloat : 'f Float.t Value.sort -> rmode -> 'a bitv -> 'f float`

`cast_sfloat s rm x` is the closest to `x` floating-point number of sort `x`.

The bitvector `x` is interpreted as a signed integer in the two-complement form.

`val cast_int : 'a Bitv.t Value.sort -> rmode -> 'f float -> 'a bitv`

`cast_int s rm x` returns an integer closest to `x`.

The resulting bitvector should be interpreted as an unsigned two-complement integer.

`val cast_sint : 'a Bitv.t Value.sort -> rmode -> 'f float -> 'a bitv`

`cast_sint s rm x` returns an integer closest to `x`.

The resulting bitvector should be interpreted as a signed two-complement integer.

`val fneg : 'f float -> 'f float`

`fneg x` is `-x`

`val fabs : 'f float -> 'f float`

`fabs x` the absolute value of `x`.

`val fadd : rmode -> 'f float -> 'f float -> 'f float`

`fadd m x y` is the floating-point number closest to `x+y`.

`val fsub : rmode -> 'f float -> 'f float -> 'f float`

`fsub m x y` is the floating-point number closest to `x-y`.

`val fmul : rmode -> 'f float -> 'f float -> 'f float`

`fmul m x y` is the floating-point number closest to `x*y`.

`val fdiv : rmode -> 'f float -> 'f float -> 'f float`

`fdiv m x y` is the floating-point number closest to `x/y`.

`val fsqrt : rmode -> 'f float -> 'f float`

`fsqrt m x` returns the closest floating-point number to `r`, where `r` is such number that `r*r` is equal to `x`.

If `x` is a negative finite non-zero number, or is `nan`, or is the negative infinity, then `sqrt x` is `nan`. If `x` is the positive infinity then `fsqrt x` is the positive infinity.

`val fmodulo : rmode -> 'f float -> 'f float -> 'f float`

`fdiv m x y` is the floating-point number closest to the remainder of `x/y`.

`val fmad : rmode -> 'f float -> 'f float -> 'f float -> 'f float`

`fmad m x y z` is the floating-point number closest to `x * y + z`.

`val fround : rmode -> 'f float -> 'f float`

`fround m x` is the floating-point number closest to `x` rounded to an integral, using the rounding mode `m`.

`val fconvert : 'f Float.t Value.sort -> rmode -> _ float -> 'f float`

`fconvert f r x` is the closest to `x` floating number in format `f`.

`val fsucc : 'f float -> 'f float`

`fsucc m x` is the least floating-point number representable in (sort x) that is greater than `x`.

`val fpred : 'f float -> 'f float`

`fsucc m x` is the greatest floating-point number representable in (sort x) that is less than `x`.

`val forder : 'f float -> 'f float -> bool`

`forder x y` holds if floating-point number `x` is less than `y`.

The denotation is not defined if either of arguments do not represent a floating-point number.

`val pow : rmode -> 'f float -> 'f float -> 'f float`

`pow m b a` is a floating-point number closest to `b^a`.

Where `b^a` is `b` raised to the power of `a`.

Values, such as `0^0`, as well as `1^infinity` and `infinity^1` in formats that have a representation for infinity, are not defined.

`val compound : rmode -> 'f float -> 'a bitv -> 'f float`

`compound m x n` is the floating-point number closest to `(1+x)^n`.

Where `b^a` is `b` raised to the power of `a`.

The denotation is not defined if `x` is less than `-1`, or if `x` is `n` represent zeros, or if `x` doesn't represent a finite floating-point number.

`val rootn : rmode -> 'f float -> 'a bitv -> 'f float`

`rootn m x n` is the floating-point number closest to `x^(1/n)`.

Where `b^a` is `b` raised to the power of `a`.

The denotation is not defined if:

• `n` is zero;
• `x` is zero and n is less than zero;
• `x` is not a finite floating-point number;
`val pown : rmode -> 'f float -> 'a bitv -> 'f float`

`pown m x n` is the floating-point number closest to `x^n`.

Where `b^a` is `b` raised to the power of `a`.

The denotation is not defined if `x` and `n` both represent zero or if `x` doesn't represent a finite floating-point number.

`val rsqrt : rmode -> 'f float -> 'f float`

`rsqrt m x` is the closest floating-point number to `1 / sqrt x`.

The denotation is not defined if `x` is less than or equal to zero or doesn't represent a finite floating-point number.

`val hypot : rmode -> 'f float -> 'f float -> 'f float`

`hypot m x y` is the closest floating-point number to `sqrt(x^2 + y^2)`.

The denotation is not defined if `x` or `y` do not represent finite floating-point numbers.