PROPOSAL FOR TWO, OPTIONAL IEEE 754, BINARY
FLOATING-POINT WORD SETS
version 0.5.3
16-Aug-09
drafted by dnw, with several c.l.f. contributors, especially
Andrew, Anton, Ed, and Marcel (alphabetical order)
TABLE of CONTENTS
1 INTRODUCTION
2 TERMINOLOGY AND NOTATION
3 BINARY FLOATING-POINT FORMATS
4 IMPLEMENTATION
4.1 Requirements
4.2 Default format
4.3 Accuracy
4.4 Rounding
4.5 IEEE Exceptions
5 DATA TYPES
6 ENVIRONMENTAL QUERIES
7 TEXT INPUT
7.1 Constants
7.2 Decimal input
7.3 Hexadecimal input
8 IEEE FLOATING-POINT WORD SET
8.1 Conversion
8.2 Output
8.3 Comparison
8.4 Classification
8.5 Arithmetic
8.6 Math functions
8.7 Sign bit operations
8.8 Nearest integer functions
8.9 Data manipulation
9 IEEE FLOATING-POINT ENVIRONMENT WORD SET
9.1 Status flags
9.2 Rounding modes
10 REFERENCES
A.3 BINARY FLOATING-POINT FORMATS
A.7.1 NaN signs and loads
A.8.3 Comparison (informative rationale)
1 INTRODUCTION
This is a proposal for two, optional Forth 200x word sets,
called the "IEEE floating-point word set" and the "IEEE
floating-point environment word set", that support the binary
part of the IEEE 754-2008 standard for floating-point arithmetic
[1]. The most recent, freely available, but less comprehensive
version is IEEE 754 draft 1.2.9, January 27, 2007 [2]. There is
also a Wikipedia summary [3].
The standard [1] is hereafter referred to as "IEEE 754-2008",
with section numbers written after that name, as in "IEEE 754-2008 5.4.2".
This specification requires that ISO Forth [4,5] floating-point
and floating-point extension words in the optional floating-
point word set, when present with the IEEE floating-point word
set, satisfy additional IEEE 754-2008 requirements. Words in
that word set and in this one that correspond to mathematical,
including logical, operations or functions in IEEE 754-2008
adopt the behavior required or recommended there by reference,
as far as that is possible and sensible, unless otherwise
stated.
The specification is compatible with, rather than conformant to,
IEEE 754-2008, because it includes only a subset of the IEEE
requirements. It also aims to make many of the remaining
requirements expressible in Forth.
Reference [4], the final draft of "ANSI X3.215-1994, American
National Standard for Information Systems--Programming
Languages--Forth", is hereafter referred to as "DPANS94". It is
believed to be the same as the published version, ISO/IEC
15145:1997 [5]. This document adopts the official terminology
of DPANS94 unless otherwise stated. Section numbers in that
document are written after the name, as in "DPANS94 12.3.7".
The meaning of "optional" for these word sets is defined by
the following two paragraphs from DPANS94 A.1.3.1:
The basic requirement is that if the implementor claims to
have a particular optional word set the entire required
portion of that word set must be available. If the
implementor wishes to offer only part of an optional word set,
it is acceptable to say, for example, "This system offers
portions of the [named] word set", particularly if the
selected or excluded words are itemized clearly.
and
Optional word sets may be offered in source form or otherwise
factored so that the user may selectively load them.
The current C99 standard [6-8], ISO/IEC 9899:1999, has a
comprehensive treatment of IEEE 754-1985, which offers a route
to implementation for those Forth systems that can call C
libraries. Reference [7] is believed to faithfully reflect the
current C99 standard. Section numbers from reference [7] are
written after the name, as in "C99:WG14/N1256 7.6".
[**Bracketed statements like this are for editorial questions
and comments, eventually to be removed.]
2 TERMINOLOGY AND NOTATION
"fp": Short for "floating point". The Forth floating-point
stack is called the "fp stack". In this document, synonymous
with "binary floating point".
"IEEE special datum", or an "IEEE special": Signed zero, a
quiet or signaling signed nan, or signed infinity.
"full IEEE set": For an IEEE binary format, the set of normal
and subnormal numbers plus special data that it represents.
"IEEE datum": Any member of a full IEEE set.
"IEEE arithmetic": Arithmetic defined by IEEE 754-2008 for IEEE
data.
"affinely extended reals": Finite real numbers and +/-infinity,
with -infinity < {every finite number} < +infinity.
"nan load" or "nan payload": The value of the fractional bits
in the binary format of a nan, excluding the quiet bit,
considered as a positive integer. The smallest signaling load
is unity, and the smallest quiet load is zero.
"qnan", resp., "snan": A quiet or signaling nan, respectively,
of any sign or load.
"single": In the context of Forth fp, an IEEE 754-2008 32-bit
interchange format.
"double": In the context of Forth fp, an IEEE 754-2008 64-bit
interchange format.
"default": In the context of Forth fp, the float format for
data that can appear on the fp stack.
"fp exception": Unless otherwise stated, never used in this
document in the sense of Forth CATCH and THROW, but always in
the sense of IEEE 754-2008:
2.1.18 exception: An event that occurs when an operation on
some particular operands has no outcome suitable for every
reasonable application. That operation might signal one or
more exceptions by invoking the default or, if explicitly
requested, a language-defined alternate handling. Note that
"event", "exception", and "signal" are defined in diverse
ways in different programming environments.
"fp status flag": C99:WG14/N1256 7.6:
A floating-point status flag is a system variable whose
value is set (but never cleared) when a floating-point
exception is raised, which occurs as a side effect of
exceptional floating-point arithmetic to provide auxiliary
information.
"fp control mode": C99:WG14/N1256 7.6:
A floating-point control mode is a system variable whose
value may be set by the user to affect the subsequent
behavior of floating-point arithmetic.
Only rounding control modes are addressed in this document.
In particular, alternate exception handling modes are not
included.
"correct rounding": Conversion of an infinitely precise result
to a floating-point number or infinity according to the
current rounding mode.
3 BINARY FLOATING-POINT FORMATS
Each IEEE binary fp format has two fixed parameters, p > 0
(precision) and emax > 0 (maximum exponent), and defines emin =
1 - emax (minimum exponent). Each such format represents all
real numbers of the form
r = (-1)^s * 2^e * b_0.b_1 ... b_{p-1}
where
s = 0 or 1, emin <= e <= emax,
b_i = 0 or 1, p = #significand bits.
See Section A.3 for more information about IEEE binary fp formats.
The binary fp formats in this document are to be regarded as
logical formats defined only by p and emax, with unspecified
encoding or layout in memory or on the fp stack. In particular,
neither the presence or absence of an explicit integer bit, b_0,
nor endianness, is specified.
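As an informal illustration, the derived quantities of a logical format can be computed from p and emax alone. The sketch below is in Python rather than Forth; the function name format_limits and the use of exact rationals are assumptions made for this example, not part of the proposal.

```python
from fractions import Fraction

def format_limits(p, emax):
    """Derive limits of the logical binary format with parameters p and emax."""
    emin = 1 - emax
    two = Fraction(2)
    return {
        "emin": emin,
        # largest finite number: significand 1.11...1 (p bits) times 2^emax
        "max_finite": (2 - two ** (1 - p)) * two ** emax,
        # smallest positive normal number: 1.00...0 times 2^emin
        "min_normal": two ** emin,
        # smallest positive subnormal number: 0.00...1 (p bits) times 2^emin
        "min_subnormal": two ** (emin - (p - 1)),
    }

# Parameters of the IEEE 754-2008 binary64 interchange format.
binary64 = format_limits(p=53, emax=1023)
```

Because only p and emax enter the calculation, the same function covers binary32 (p = 24, emax = 127) and the extended formats without reference to any memory layout.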
4 IMPLEMENTATION
4.1 Requirements
----------------
The DPANS94 floating-point and floating-point extensions word
sets are optional word sets, and so are the word sets described
by this document.
The word "shall" in the remainder of this document states a
requirement when the environmental query for IEEE-FP or
IEEE-FP-ENV returns true. "Should" means "strongly
recommended". The special terminology "nominally shall" may be
applied to the current rounding mode, as explained in Section 4.4.
4.2 Default format
------------------
Default fp data, i.e., data that can appear on the fp stack,
shall correspond to one of the IEEE basic or extended, full
logical binary formats.
Data stored in memory by F! shall have the default logical
format.
4.3 Accuracy
------------
Unless otherwise stated, the accuracy of floating-point
conversion or calculation is left to the implementation. This
is inevitable because of practical limitations on the
computational state of the art.
4.4 Rounding
------------
IEEE 754-2008 specifies the following settable rounding modes:
roundTiesToEven
roundTowardPositive
roundTowardNegative
roundTowardZero
roundTiesToAway
The last of these seems to be new, and is not addressed in this
document. The first four are commonly implemented in hardware
that claims IEEE compliance.
IEEE 754-2008 requires roundTiesToEven as the default mode for
binary rounding, and recommends that for decimal rounding. This
document adopts the same; roundTiesToEven shall be the default
for binary rounding and should be the default for decimal
rounding.
In DPANS94, a few words use "round to nearest" in the same sense
as roundTiesToEven. On the other hand, fp decimal to binary
conversion by the text interpreter, the printing words, and all
of the arithmetic and math function words use implementation-
defined rounding.
[**DPANS94 12.3.1.2:
...
Any rounding or truncation of floating-point numbers is
implementation defined.
]
Many IEEE floating-point operations specify that the current
rounding mode shall be used. This document adopts that where
relevant, but often makes it a "nominal shall", because of the
correct rounding issue mentioned below.
Using the current rounding mode affects text interpretation,
arithmetic operations, math functions except nearest integer
functions, and the following words, listed according to DPANS94
rounding:
REPRESENT DF! DF@ SF! SF@ round to nearest
>FLOAT F. FE. FS. implementation defined
In this specification, the words in the round to nearest list
remain compatible with DPANS94 programs when the current
rounding mode is the default mode.
The rounding mode can be changed from the default mode only by
words in the optional IEEE Floating-Point Environment word set.
Ideally, mathematical functions ought to be "correctly rounded",
i.e., computed with infinite accuracy and then rounded in the
current mode. Because accuracy is implementation defined,
correct rounding in this document is a "should", not a "shall",
except for the arithmetic functions in Section 8.5.
Because of the interaction between accuracy and rounding, when
use of the current rounding mode is required it may be specified
as a "nominal shall", which means that when accuracy is
sufficient for the rounding mode to make a difference, it is to
be regarded as a "shall", otherwise as a "should".
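The notion of correct rounding can be made concrete with exact rational arithmetic. The following Python sketch rounds an infinitely precise value to p significant bits under the four rounding modes addressed in this document. It assumes an unbounded exponent range (no overflow or subnormals), and the function name round_to_precision is hypothetical.

```python
from fractions import Fraction

def round_to_precision(x, p, mode="roundTiesToEven"):
    """Round the exact rational x to p significant bits (exponent unbounded)."""
    x = Fraction(x)
    if x == 0:
        return x
    negative = x < 0
    mag = abs(x)
    # Find e such that 2^e <= mag < 2^(e+1).
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    ulp = Fraction(2) ** (e - p + 1)    # spacing of p-bit numbers near mag
    steps, rest = divmod(mag, ulp)      # mag = steps*ulp + rest, 0 <= rest < ulp
    if mode == "roundTiesToEven":
        up = rest * 2 > ulp or (rest * 2 == ulp and steps % 2 == 1)
    elif mode == "roundTowardZero":
        up = False
    elif mode == "roundTowardPositive":
        up = rest > 0 and not negative
    elif mode == "roundTowardNegative":
        up = rest > 0 and negative
    else:
        raise ValueError("unsupported rounding mode: " + mode)
    result = (steps + 1 if up else steps) * ulp
    return -result if negative else result
```

For example, rounding 1/10 to 53 bits gives two different neighbors under roundTowardZero and roundTowardPositive, one unit in the last place apart, which is the situation in which a "nominal shall" becomes a "shall".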
4.5 IEEE Exceptions
-------------------
IEEE floating-point exceptions have two aspects, signaling and
flags.
Signaling simply means handling the exception.
There are flags for five exceptions, all commonly implemented
in hardware that claims IEEE compliance: invalid, divideByZero,
overflow, underflow, and inexact.
Flags can be raised by an IEEE compliant system or by a user
program. The act of raising a flag does not cause any other
action. Reading a flag does not change it. A flag can be
lowered only by explicit user request.
IEEE specifies the fp operations and conditions that signal
exceptions. According to IEEE 754-2008 7.1, "Overview:
exceptions and flags", the default handling is nearly always to
produce a default datum, raise the corresponding flag, and keep
going.
IEEE recommends that a language standard should define
alternate, nondefault handling for fp exceptions, with
guidelines that allow stopping program execution and reporting
an error [**IIUC].
This document specifies the default to be IEEE default handling,
invoked under the conditions required by IEEE 754-2008 unless
otherwise stated, with the aim of restricting alternate handling
as little as possible. An interface for alternate handling is
not specified, but would have to interact with the system at a
low level to change the system default handling of exceptions in
general fp operations such as F+, F*, etc.
The exception flags can be accessed only by the optional IEEE
Floating-Point Environment word set. When those words are not
present, the IEEE requirements for setting the flags are moot,
because the flags have no side effects. Then the only
noticeable effect of exceptions is to produce IEEE special data.
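The effect of default handling without flag access can be observed in any environment that exposes hardware arithmetic. The Python lines below are a sketch that assumes CPython on IEEE 754 hardware with binary64 floats. Python offers no portable access to the flags, so only the default data are visible; note also that Python itself raises ZeroDivisionError for float division by zero, which is a form of alternate handling and is therefore not shown.

```python
import math

largest = 1.7976931348623157e308   # largest finite binary64 number
overflowed = largest * 2.0         # overflow: the default datum is +infinity
invalid = math.inf - math.inf      # invalid operation: the default datum is a quiet nan
underflowed = 5e-324 / 2.0         # underflow and inexact: the result rounds to +0
```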
5 DATA TYPES
For the purpose of this document, the DPANS94 r type is extended
to include all IEEE data for the default fp format.
6 ENVIRONMENTAL QUERIES
[**CHANGES
The entitlements of the two queries are revised to move the
MAX-FLOAT entitlement to the IEEE-FP query, and an IEEE-FP-ENV
query is added. The following additonal specification is
proposed, so that when an application receives true from one of
the queries, it can simply drop the "known" flag returned by
specified others.
]
If either of the first two queries returns true for the "known"
flag, then so shall the other. If the third query returns true,
then so shall the other two.
                 Value
String           Data Type  Constant?  Meaning
------------------------------------------------------------------
IEEE-FP          flag       no         IEEE and DPANS94 floating-point word
                                       sets present
IEEE-FP-FORMAT   d          no         in usual stack notation, the default
                                       format has IEEE parameters ( emax p )
IEEE-FP-ENV      flag       no         IEEE floating-point environment word
                                       set present
------------------------------------------------------------------
A true result for the IEEE-FP environmental query shall mean
that any words that are present from the DPANS94 floating-point
word set, the DPANS94 floating-point extensions word set, or the
IEEE floating-point word set obey the specifications of this
document, and that generic specifications not related to the
presence or absence of particular words are satisfied.
It shall also mean that the DPANS94 MAX-FLOAT query shall return
true and the largest, finite number in the default format.
The data value for the IEEE-FP query is true if and only if all
words in the DPANS94 and IEEE floating-point word sets are
present.
Nothing in this document depends on the encoding of the logical
format corresponding to the values of emax and p returned as
data by the IEEE-FP-FORMAT query.
The data value for the IEEE-FP-ENV query is true if and only if
all words in the IEEE floating-point environment word set are
present.
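For comparison, a host language often exposes the same two parameters. The Python sketch below shows what the data of an IEEE-FP-FORMAT query and a MAX-FLOAT query would be for the host float format; the function names are hypothetical, and the host format is assumed to be binary64.

```python
import sys

def ieee_fp_format():
    """Return ( emax p ) for the host float format."""
    # sys.float_info.max_exp follows the C convention of DBL_MAX_EXP,
    # which is emax + 1 in the notation of Section 3.
    return sys.float_info.max_exp - 1, sys.float_info.mant_dig

def max_float():
    """Return the largest finite number in the host float format."""
    return sys.float_info.max
```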
7 TEXT INPUT
7.1 Constants
-------------
+INF ( f: -- +Inf )
-INF ( f: -- -Inf )
+NAN ( f: -- +NaN )
-NAN ( f: -- -NaN )
These words return, respectively, IEEE signed infinity and the quiet
signed nan with zero load, in the default format.
See Section A.7.1 for more information about the encoding of NaN.
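To make the terms sign, quiet bit, and load concrete, the Python sketch below builds and inspects binary64 nans. It assumes the encoding recommended by IEEE 754-2008 6.2.1, where the quiet bit is the most significant fraction bit and is set for quiet nans; the helper names are hypothetical.

```python
import struct

QUIET_BIT = 1 << 51          # most significant fraction bit of binary64
LOAD_MASK = QUIET_BIT - 1    # the remaining 51 fraction bits hold the load

def make_nan(negative=False, quiet=True, load=0):
    """Build a binary64 nan from its sign, quiet bit, and load."""
    if not quiet and load == 0:
        raise ValueError("a signaling nan needs a nonzero load")
    bits = (0x7FF << 52) | (QUIET_BIT if quiet else 0) | (load & LOAD_MASK)
    if negative:
        bits |= 1 << 63
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

def nan_fields(x):
    """Return (negative, quiet, load) for a binary64 nan."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return bool(bits >> 63), bool(bits & QUIET_BIT), bits & LOAD_MASK

plus_nan = make_nan()                 # counterpart of +NAN: quiet, zero load
minus_nan = make_nan(negative=True)   # counterpart of -NAN
```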
7.2 Decimal input
-----------------
IEEE requires that conversion between text and binary fp formats
shall include signed zero, signed infinity, and signed nans,
with and without loads. See IEEE 754-2008 5.4.2, "Conversion
operations for floating-point formats and decimal character
sequences", and 5.12, "Details of conversion between
floating-point data and external character sequences".
Conversion of nan loads is not included in this specification.
Signed infinity and signed, quiet, unloaded nans are covered by
the constants defined in Section 7.1. Signed zero is already
included in the syntax specification in DPANS94 12.3.7, "Text
input number conversion".
DPANS94 12.3.7 specifies that the number-conversion algorithm
used by the text interpreter is to be extended to recognize
floating-point numbers when the base is decimal, with an
implication that the behavior on failure is to be governed by
the introductory paragraphs in DPANS94 3.4, "The Forth
interpreter". This document slightly modifies the syntax
specification, and is more explicit about failure.
When IEEE-FP is present, the syntax specification in DPANS94
12.3.7 shall be replaced by:
Convertible string := <significand><exponent>
<significand> := [<sign>]<digits>[.<digits0>]
[**Note that a leading digit before any
decimal point is required, as in DPANS94.]
<exponent>    := <e-char>[<sign>]<digits0>
[**Note that a sign with no digits is still
recognized, as in DPANS94.]
<digits>      := <digit><digits0>
<digits0>     := <digit>*
<sign>        := { + | - }
<e-char>      := { E | e }
[**Note the extra "e" option.]
<digit>       := { 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 }
The only change in syntax is the additional "e" option in
<e-char>.
Interpretation shall convert to a finite, real number in the
default floating-point format, and nominally shall use the
current rounding mode. An ambiguous condition exists if
conversion overflows, or if the syntax is not satisfied and
causes unsuccessful text interpretation, in which case behavior
shall be governed by DPANS94 3.4.4, "Possible actions on an
ambiguous condition".
[**QUOTE
3.4.4 Possible actions on an ambiguous condition
When an ambiguous condition exists, a system may take one or
more of the following actions:
* ignore and continue;
* display a message;
* execute a particular word;
* set interpretation state and begin text interpretation;
* take other implementation-defined actions;
* take implementation-dependent actions.
The response to a particular ambiguous condition need not be the
same under all circumstances.
]
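A recognizer for the text interpreter syntax of Section 7.2 can be sketched with a regular expression. The Python code below assumes the DPANS94 12.3.7 syntax with the extra "e" option: a mandatory leading digit, a mandatory exponent marker, and optional exponent digits. The name interpret_float is hypothetical. Python's float() rounds to nearest with ties to even, which corresponds to the default rounding mode only, and it returns infinity on overflow, whereas this document makes overflow an ambiguous condition.

```python
import re

# [<sign>]<digits>[.<digits0>] followed by { E | e }[<sign>]<digits0>
_FLOAT_TEXT = re.compile(r"([+-]?[0-9]+(?:\.[0-9]*)?)[Ee]([+-]?)([0-9]*)")

def interpret_float(text):
    """Return the float for a convertible string, or None if the syntax fails."""
    m = _FLOAT_TEXT.fullmatch(text)
    if m is None:
        return None
    significand, exp_sign, exp_digits = m.groups()
    # An empty exponent digit string means exponent zero, e.g. "1E" or "1E-".
    return float(significand + "e" + exp_sign + (exp_digits or "0"))
```

Strings such as "1E" and "-0e" are accepted, while ".5E0" (no leading digit) and "1.5" (no exponent marker) are rejected.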
7.3 Hexadecimal input
---------------------
IEEE requires a text format for numbers with a hexadecimal
significand, and decimal radix two exponent, with exact
conversion to and from binary fp formats where possible. See
IEEE 5.12.3, "External hexadecimal-significand character
sequences representing finite numbers".
Conversion of that format is not included in this specification.
8 IEEE FLOATING-POINT WORD SET
This is an optional word set.
Unless otherwise stated, all fp words that do computations or
comparisons shall obey the requirements and recommendations of
IEEE 754-2008 5 and 6, for binary formats.
8.1 Conversion
--------------
D>F ( d -- f: r )
The DPANS94 specification is amended to require that when d
cannot be represented precisely in the default fp format, r
shall be rounded according to the current rounding
mode.
[**IEEE and C99 background:
IEEE does not seem to specify an equivalent. C99:WG14/N1256
6.3.1.4, paragraph 2 says:
When a value of integer type is converted to a real floating
type, if the value being converted can be represented exactly
in the new type, it is unchanged. If the value being
converted is in the range of values that can be represented
but cannot be represented exactly, the result is either the
nearest higher or nearest lower representable value, chosen in
an implementation-defined manner. If the value being
converted is outside the range of values that can be
represented, the behavior is undefined.
Note that d is always in range for any of the formats
binary32, binary64, binary80, and binary128.
]
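The rounding requirement for D>F matters only for integers wider than the precision p. As a sketch, CPython's integer to float conversion rounds to nearest with ties to even, so it models D>F under the default rounding mode for a binary64 default format. The name d_to_f is hypothetical.

```python
def d_to_f(d):
    """Convert a signed integer to binary64 (CPython rounds ties to even)."""
    return float(d)

exact = d_to_f(2**53)           # representable, so unchanged
tie_down = d_to_f(2**53 + 1)    # halfway between 2^53 and 2^53 + 2, goes to 2^53
tie_up = d_to_f(2**53 + 3)      # halfway between 2^53 + 2 and 2^53 + 4, goes to 2^53 + 4
```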
F>D ( f: r -- s: d )
The DPANS94 specification is amended to state that an
ambiguous condition exists, not only when the integer part of
r is not representable by a signed, double number, but also
when r is a nan or infinity.
[**RATIONALE
The DPANS94 version corresponds to convertToIntegerTowardZero();
see IEEE 754-2008 5.8, "Details of conversions from
floating-point to integer formats". IEEE requires that the
invalid operation exception be signaled when r is a nan or
infinity or out of range of the destination format.
IEEE also requires
convertToIntegerTiesToEven()
convertToIntegerTowardPositive()
convertToIntegerTowardNegative()
convertToIntegerTiesToAway()
plus versions of the five conversions that signal inexact when
appropriate. Note that, except for convertToIntegerTiesToAway,
the nonsignaling conversions are equivalent to Forth phrases
such as "FCEIL F>D".
]
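The amended F>D behavior can be modeled as truncation toward zero preceded by checks for the ambiguous conditions. In the Python sketch below, raising ValueError stands in for the ambiguous condition; that choice, the name f_to_d, and the 64-bit cell size are assumptions for illustration, since a Forth system may respond in any of the ways listed in DPANS94 3.4.4.

```python
import math

def f_to_d(r, cell_bits=64):
    """Truncate r toward zero to a signed double-cell integer."""
    if math.isnan(r) or math.isinf(r):
        raise ValueError("ambiguous condition: r is a nan or infinity")
    d = math.trunc(r)                       # convertToIntegerTowardZero
    limit = 1 << (2 * cell_bits - 1)        # a double number is 2 cells wide
    if not -limit <= d < limit:
        raise ValueError("ambiguous condition: integer part not representable")
    return d
```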
>FLOAT ( c-addr u -- [r: r s: true]|[false] )
The phrase "the string represents a valid floating-point
number" in DPANS94 12.6.1.0558 shall be interpreted to
mean that it represents a finite number in the range of the
default format.
>IEEEFLOAT ( c-addr u -- [r: r s: true]|[false] )
This word extends the functionality of >FLOAT to include IEEE
special data, with modified syntax.
The decimal to binary conversion nominally shall use the
current rounding mode.
Overflow, underflow, and inexact exceptions shall be treated
according to IEEE 754-2008 7.4, "Overflow", 7.5 "Underflow",
and 7.6 "Inexact".
If the string represents a real number in the syntax below,
the result of conversion to the default format (signed
infinity on overflow) and true are returned.
If the string represents an IEEE special datum in the syntax
below, the datum and true are returned, with no change in the
fp exception status. A string of blanks [**or the empty
string?] shall be treated as +0.
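Syntax of a convertible string := { <significand>[<exponent>]
                                  | <special> }
<significand> := [<sign>]{ <digits>[.<digits0>] | .<digits> }
<exponent>    := <marker><digits0>
<marker>      := { <e-form> | <sign-form> }
<e-form>      := <e-char>[<sign-form>]
<sign-form>   := { + | - }
<e-char>      := { D | d | E | e }
<special>     := [<sign>]{ <inf> | <nan> }
<inf>         := { Inf | inf | INF | infinity | Infinity }
<nan>         := { NaN | nan | NAN }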
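One reading of the syntax above can be sketched in Python. The code assumes the DPANS94 >FLOAT grammar (significand with optional exponent, where the exponent marker may be an e-char, a sign, or both) extended by the signed Inf and NaN spellings. The name to_ieee_float is hypothetical, treating the empty string like a string of blanks is an assumption, and float() supplies only the default rounding mode.

```python
import math
import re

_NUMBER = re.compile(
    r"(?P<sig>[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+))"   # significand
    r"(?P<exp>(?:[DdEe][+-]?|[+-])[0-9]*)?"             # optional exponent
)
_SPECIAL = re.compile(
    r"(?P<sign>[+-]?)(?P<name>Inf|inf|INF|infinity|Infinity|NaN|nan|NAN)"
)

def to_ieee_float(text):
    """Return (r, True) on success or (None, False) on failure."""
    if text.strip(" ") == "":
        return 0.0, True     # blanks give +0; "" is treated the same (assumption)
    m = _SPECIAL.fullmatch(text)
    if m:
        value = math.nan if m["name"].lower() == "nan" else math.inf
        if m["sign"] == "-":
            value = math.copysign(value, -1.0)
        return value, True
    m = _NUMBER.fullmatch(text)
    if m is None:
        return None, False
    exponent = (m["exp"] or "").lstrip("DdEe")
    if exponent in ("", "+", "-"):
        exponent += "0"      # missing exponent digits mean exponent zero
    # float() rounds ties to even and returns signed infinity on overflow.
    return float(m["sig"] + "e" + exponent), True
```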
REPRESENT ( f: r s: c-addr u -- n flag1 flag2 )
The DPANS94 specification is extended to include IEEE
specials. Instead of "round to nearest", the decimal mantissa
string nominally shall use the current rounding mode.
If flag2 is true, r was a finite number; if false,
either r was infinity and n = 0, or r was a nan and n is a
nonzero, implementation-defined number. If flag1 is true,
then the sign bit of r was one; otherwise the sign bit was
zero.
[**RATIONALE
The IEEE extension of REPRESENT aims to change the DPANS94
specification as little as possible. When the current
rounding mode is the default, roundTiesToEven, and flag2 is
true, all results nominally coincide with DPANS94. When flag2
is false, DPANS94 makes n and flag1 implementation defined.
The implementation-defined value of n for nans could be used
in a future proposal to encode information about the signaling
bit and load.]
[**DPANS94 does not embed the sign in the mantissa string,
contrary to impressions in c.l.f.
There was a request to have REPRESENT produce strings for
infinity and nan, but I couldn't see a good way to do that,
given that u might be less than three. The presence of either
is easy to detect.]
SF! ( f: r s: sf-addr -- )
SF@ ( sf-addr -- f: r )
DF! ( f: r s: df-addr -- )
DF@ ( df-addr -- f: r )
The specification for these DPANS94 words is amended to
explicitly require conversion to or from the respective IEEE
754-2008 binary32 or binary64 logical interchange formats,
with implementation-defined memory layout. The conversion
shall be exact in either direction for signed zero, signed
infinity, and real numbers to a wider format, and shall use
the current rounding mode for conversion of real numbers to a
narrower format (see IEEE 754-2008 5.4.2, formatOf-convertFormat).
The conversion of nans is implementation defined, but should
not signal an exception, should preserve the sign bit, and
should treat payloads according to IEEE 754-2008 6.2.3, "NaN
propagation".
[**CHANGES: current rounding, and nan conversion
implementation defined. IIUC these nan "shoulds" would be
automatic for the intel FLD and FST ops.]
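The narrowing and widening performed by SF! followed by SF@ can be imitated in Python with the struct module, assuming a binary64 default format and round to nearest, ties to even. The name to_binary32 is hypothetical. Because struct.pack raises OverflowError where IEEE rounding would deliver infinity, the sketch handles that case explicitly.

```python
import math
import struct

# Finite values at or beyond this magnitude round to infinity in binary32:
# the largest binary32 number plus half a unit in its last place.
_OVERFLOW_THRESHOLD = 2.0 ** 128 - 2.0 ** 103

def to_binary32(r):
    """Round a binary64 value to binary32 and widen it back exactly."""
    if math.isfinite(r) and abs(r) >= _OVERFLOW_THRESHOLD:
        return math.copysign(math.inf, r)
    return struct.unpack("<f", struct.pack("<f", r))[0]
```

Signed zero and signed infinity pass through unchanged, and widening back to binary64 is always exact.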
8.2 Output
----------
F. ( f: r -- )
FE. ( f: r -- )
FS. ( f: r -- )
The DPANS94 specification is extended to include IEEE
specials, with output text of the appropriate form below, with
implementation-dependent case sensitivity:
[<sign>]0{ E | e }
[<sign>]{ Inf | INF | inf }
[<sign>]{ NaN | NAN | nan }
The current rounding mode nominally shall be used.
[**CHANGE: current rounding mode. In DPANS94 it's
implementation defined.]
8.3 Comparison
--------------
IEEE has twenty-two required comparisons which apply to the full
set of IEEE data. Twelve of these are quiet, and ten are
signaling. See IEEE 754-2008 5.6.1, "Comparisons", and 5.11,
"Details of comparison predicates".
This proposal requires only quiet comparisons, which do not
signal exceptions, and of those, only a subset of five, which is
sufficient for expressing all twelve.
See Section A.8 for rationale and more information about the
remaining comparisons, with high-level implementation examples.
IEEE identifies four fundamental, mutually exclusive
comparisons: less than ("<"), equal ("="), greater than (">"),
and unordered (see rule 3 below). Each of these is true iff
each of the others is false.
The basic rules are the following:
1. The sign of zero is ignored.
2. The sign of infinity is not ignored, and is treated in the
natural way for the "ordinary" comparisons with real numbers
or infinity, namely <, =, and >. In particular, either
signed infinity is equal to itself.
3. The unordered comparison is true iff at least one of its two
arguments is a nan. That implies that any of the other
three, "ordinary" comparisons involving a nan is false.
The five required comparisons are "<", ">", "=", "<=", and ">=",
where "<=" and ">=" stand for the usual phrases "less than or
equal" and "greater than or equal". Note that familiar
identities for real numbers are generally not satisfied by IEEE
comparisons. For example, the negation of "<" is not the same
as ">=". See Section A.8.
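The three rules and the failure of the familiar identities can be checked in any language whose comparison operators are the IEEE quiet predicates. Python floats behave this way; the function names below are hypothetical.

```python
import math

def unordered(r1, r2):
    """IEEE unordered predicate: true iff at least one argument is a nan."""
    return math.isnan(r1) or math.isnan(r2)

def relation(r1, r2):
    """Return which of the four mutually exclusive relations holds."""
    if unordered(r1, r2):
        return "unordered"
    if r1 < r2:
        return "<"
    if r1 > r2:
        return ">"
    return "="

nan = math.nan
less = nan < 1.0            # False
not_less = not (nan < 1.0)  # True
greater_equal = nan >= 1.0  # False, so "not <" differs from ">="
```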
F< ( f: r1 r2 -- s: [r1<r2]? )
F= ( f: r1 r2 -- s: [r1=r2]? )
F> ( f: r1 r2 -- s: [r1>r2]? )
F<= ( f: r1 r2 -- s: [r1<=r2]? )
F>= ( f: r1 r2 -- s: [r1>=r2]? )
F0< ( f: r -- s: [r<0]? )
F0= ( f: r -- s: [r=0]? )
F0> ( f: r -- s: [r>0]? )
F0<= ( f: r -- s: [r<=0]? )
F0>= ( f: r -- s: [r>=0]? )
The data stack outputs are DPANS94 flags corresponding to the
indicated IEEE predicates. In particular, the specifications
for the existing DPANS94 words F<, F0<, and F0= are extended
to include IEEE specials.
F~ ( f: r1 r2 r3 -- s: flag )
If r3 has positive sign and is neither a nan nor zero, flag is
true iff the absolute value of r1 minus r2 is less than r3,
taking into account IEEE arithmetic and comparison rules.
If r3 is signed zero, flag is true iff r1 and r2 have
identical encodings.
If r3 has negative sign and is neither a nan nor zero, flag is
true iff the absolute value of r1 minus r2 is less than the
absolute value of r3 times the sum of the absolute values of
r1 and r2, taking into account IEEE arithmetic and comparison
rules.
If r3 is a nan, flag is false.
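The four cases of F~ translate directly into code. The Python sketch below assumes a binary64 default format and reads the signed-zero case as a bit-for-bit comparison of the two encodings, following DPANS94; the name f_proximate is hypothetical.

```python
import math
import struct

def f_proximate(r1, r2, r3):
    """Model of F~ ( f: r1 r2 r3 -- s: flag )."""
    if math.isnan(r3):
        return False
    if r3 == 0.0:
        # Signed zero: exact comparison of the encodings, so +0 and -0 differ.
        return struct.pack(">d", r1) == struct.pack(">d", r2)
    if math.copysign(1.0, r3) > 0:
        # Positive r3: absolute tolerance.
        return abs(r1 - r2) < r3
    # Negative r3: tolerance relative to |r1| + |r2|.
    return abs(r1 - r2) < abs(r3) * (abs(r1) + abs(r2))
```

Since the comparisons are IEEE comparisons, any nan among r1 and r2 makes the flag false in the two tolerance cases.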
8.4 Classification
------------------
IEEE 754-2008 5.7.2, "General operations", requires a large
number of classification operations. This document defines
only those corresponding to:
isSignMinus
isNormal
isFinite
isZero
isSubnormal
isInfinite
isNaN
Actually isSignMinus corresponds to FSIGNBIT, and isZero
corresponds to F0=, which leaves the following:
FINITE? ( r: r -- s: [normal|subnormal]? )
FNORMAL? ( r: r -- s: normal? )
FSUBNORMAL? ( r: r -- s: subnormal? )
FINFINITE? ( r: r -- s: [+|-]Inf? )
FNAN? ( r: r -- s: nan? )
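The classification predicates can be sketched in Python for a binary64 default format. The function names are hypothetical. Note that the IEEE isFinite operation, used here for f_is_finite, is also true for signed zero.

```python
import math
import sys

MIN_NORMAL = sys.float_info.min    # smallest positive normal binary64 number

def f_is_finite(r):
    """IEEE isFinite: zero, subnormal, or normal."""
    return math.isfinite(r)

def f_is_normal(r):
    return math.isfinite(r) and abs(r) >= MIN_NORMAL

def f_is_subnormal(r):
    # Both comparisons are false for a nan, so a nan is never subnormal.
    return r != 0.0 and abs(r) < MIN_NORMAL

def f_is_infinite(r):
    return math.isinf(r)

def f_is_nan(r):
    return math.isnan(r)
```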
8.5 Arithmetic
--------------
See IEEE 5.4.1, "Arithmetic operations".
F* ( f: r1 r2 -- r1*r2 )
F*+ ( f: r1 r2 r3 -- [r2*r3]+r1 )
F+ ( f: r1 r2 -- r1+r2 )
F- ( f: r1 r2 -- r1-r2 )
F/ ( f: r1 r2 -- r1/r2 )
FSQRT ( f: r -- sqrt[r] )
The DPANS94 specification is extended to IEEE arithmetic.
These operations shall be correctly rounded. See IEEE
754-2008 5.1, "Overview", for precision, rounding, special
data treatment, and exceptions and 5.4.1, "Arithmetic
operations", for the arithmetic words.
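The difference between F*+ and the phrase F* F+ is that the fused operation rounds only once. The Python sketch below models this with exact rational arithmetic for finite binary64 operands under the default rounding mode. The name f_mul_add is hypothetical, and the sketch does not reproduce the IEEE rules for the sign of an exact zero result or for overflow.

```python
from fractions import Fraction
import math

def f_mul_add(r1, r2, r3):
    """Model of F*+ ( f: r1 r2 r3 -- [r2*r3]+r1 ) with a single rounding."""
    if not all(math.isfinite(r) for r in (r1, r2, r3)):
        return r2 * r3 + r1       # specials: fall back to ordinary arithmetic
    exact = Fraction(r2) * Fraction(r3) + Fraction(r1)
    # Conversion to float is one correctly rounded step (ties to even);
    # it raises OverflowError instead of returning infinity.
    return float(exact)
```

With r2 = 1 + 2^-30 and r3 = 1 - 2^-30, the separately rounded product is exactly 1, so adding -1 gives 0, while the fused result keeps the term -2^-60.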
8.6 Math functions
------------------
The Forth words FABS, FMAX, FMIN, and FSQRT are covered
elsewhere.
The DPANS94 specification for the following words is extended to
adopt the corresponding IEEE behavior. These words should be
correctly rounded, and nominally shall use current rounding.
See IEEE 754-2008 9.2, "Recommended correctly rounded
functions", and 9.2.1, "Special values".
F** FACOS FACOSH FALOG FASIN FASINH FATAN FATAN2
FATANH FCOS FCOSH FEXP FEXPM1 FLN FLNP1 FLOG FSIN
FSINCOS FSINH FTAN FTANH
8.7 Sign bit operations
-----------------------
FSIGNBIT ( f: r -- s: minus? )
This word corresponds to isSignMinus in IEEE 754-2008 5.7.2,
"General operations". The name is based on C99.
The following are all required by IEEE. See IEEE 5.5.1, "Sign
bit operations". The IEEE copy() function is superfluous in
Forth [**IIUC].
FNEGATE ( f: r -- -r )
FABS ( f: r -- |r| )
The DPANS94 specification is extended to IEEE specials.
FCOPYSIGN ( f: r1 r2 -- r3 )
The output r3 is r1 with its sign bit replaced by that of r2.
8.8 Nearest integer functions
-----------------------------
All of the words in this section correspond to C99 functions
with similar names, including the already existing FLOOR and
FROUND from DPANS94 and FTRUNC from Forth 200x, except that the
C99 round() does roundToIntegralTiesToAway instead of
roundToIntegralTiesToEven.
FCEIL ( f: r1 -- r2 )
FLOOR ( f: r1 -- r2 )
FROUND ( f: r1 -- r2 )
FTRUNC ( f: r1 -- r2 )
These words correspond to the respective IEEE required
operations:
roundToIntegralTowardPositive
roundToIntegralTowardNegative
roundToIntegralTiesToEven
roundToIntegralTowardZero
See IEEE 754-2008 5.3.1, "General operations" and 5.9,
"Details of operations to round a floating-point datum to
integral value". No word is defined for IEEE
roundToIntegralTiesToAway.
FNEARBYINT ( f: r1 -- r2 )
This word corresponds to the IEEE required operation:
roundToIntegralExact
It performs the function of whichever of the other four
corresponds to the current rounding mode.
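The four fixed-direction words can be modeled in Python as follows. The sketch keeps IEEE specials unchanged and preserves the sign of a zero result, as the IEEE roundToIntegral operations do; the function names are hypothetical. Python's built-in round() breaks ties to even, unlike the C99 round() function.

```python
import math

def _integral(r, pick):
    """Apply an integer-valued rounding, keeping specials and the sign of zero."""
    if not math.isfinite(r) or r == 0.0:
        return r
    return math.copysign(float(pick(r)), r)

def f_ceil(r):
    return _integral(r, math.ceil)     # roundToIntegralTowardPositive

def f_floor(r):
    return _integral(r, math.floor)    # roundToIntegralTowardNegative

def f_round(r):
    return _integral(r, round)         # roundToIntegralTiesToEven

def f_trunc(r):
    return _integral(r, math.trunc)    # roundToIntegralTowardZero
```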
8.9 Data manipulation
---------------------
FMAX ( f: r1 r2 -- r3 )
FMIN ( f: r1 r2 -- r3 )
The DPANS94 specification is extended to IEEE specials. See
minNum and maxNum in IEEE 754-2008 5.3.1, "General operations"
and 6.2, "Operations with NaNs".
FNEXTUP ( f: r1 -- r2 )
FNEXTDOWN ( f: r1 -- r2 )
When r1 is a nonzero real number, FNEXTUP returns the next
affinely extended real in the default format that compares
larger than r1, and FNEXTDOWN returns the next one that
compares less than r1. See IEEE 754-2008 5.3.1, "General
operations" for the behavior when r1 is an IEEE special.
[**RATIONALE
FNEXTDOWN can be defined as:
: FNEXTDOWN ( f: r1 -- r2 ) FNEGATE FNEXTUP FNEGATE ;
For accuracy control, it is useful to have efficient
implementations of both words.
]
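The same pair can be sketched in Python, assuming Python 3.9 or later for math.nextafter and a binary64 default format. The function names are hypothetical, and f_nextdown uses the negation identity from the rationale.

```python
import math

def f_nextup(r):
    """Next binary64 value above r; infinity stays infinity."""
    return math.nextafter(r, math.inf)

def f_nextdown(r):
    """Next binary64 value below r, by the identity in the rationale."""
    return -f_nextup(-r)
```

Bracketing a computed result between f_nextdown and f_nextup of itself is one simple way to express the accuracy control mentioned above.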
FSCALBN ( f: r s: n -- f: r*2^n )
The output is r, efficiently scaled by 2^n. See IEEE 754-2008 5.3.3,
"logBFormat operations".
FLOGB ( f: r -- e )
Leave the radix-two exponent e of the fp representation as an
fp integer. If r is subnormal, the exponent is computed as if
r were normalized, with e < emin. See IEEE 754-2008 5.3.3,
"logBFormat operations" for treatment of IEEE specials.
FREMAINDER ( f: x y -- r q )
When y is not 0, the remainder r = fremainder(x, y) is defined
for finite x and y regardless of the current rounding mode by
the exact mathematical relation r = x - y * q, where q is the
integer nearest the exact number x/y, with roundToIntegralTiesToEven.
If r = 0, its sign shall be that of x, and fremainder(x, inf)
is x for finite x. See IEEE 754-2008 5.3.1, "General
operations".
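The last three words of this section can be sketched in Python for a binary64 default format. The function names are hypothetical. The sketch of f_remainder covers only finite x and finite nonzero y, computes q exactly with rational arithmetic, and relies on math.remainder (Python 3.7 or later) for r.

```python
from fractions import Fraction
import math

def f_scalbn(r, n):
    """FSCALBN: r * 2^n by exponent adjustment."""
    # math.ldexp raises OverflowError where IEEE would return infinity.
    return math.ldexp(r, n)

def f_logb(r):
    """FLOGB: radix-two exponent of r as an fp integer."""
    if math.isnan(r):
        return r
    if math.isinf(r):
        return math.inf
    if r == 0.0:
        return -math.inf      # IEEE also signals divideByZero here
    # frexp normalizes subnormals too: r = m * 2^e with 0.5 <= |m| < 1.
    return float(math.frexp(r)[1] - 1)

def f_remainder(x, y):
    """FREMAINDER for finite x and finite nonzero y: return (r, q)."""
    if not (math.isfinite(x) and math.isfinite(y)) or y == 0.0:
        raise ValueError("sketch covers finite x and finite nonzero y only")
    q = round(Fraction(x) / Fraction(y))   # exact quotient, ties to even
    r = math.remainder(x, y)               # IEEE remainder, exact
    return r, float(q)                     # float(q) can overflow for huge q
```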
9 IEEE FLOATING-POINT ENVIRONMENT WORD SET
This is an optional word set.
9.1 Status flags
----------------
A floating-point status flag is raised, but never lowered, as a
side effect of an fp operation. The default side effect also
provides a default fp datum and does not interrupt program flow.
Floating-point status flags can be raised or lowered by
SET-FSTATUS, and read by GET-FSTATUS. Both words have normal
interpretation and compilation semantics.
In the following, "fmask" stands for "floating-point status
mask", and "fflags" stands for the bit-wise OR of flag data
corresponding to an fmask.
The fp status flags correspond to the following distinct,
nonzero fp flag masks:
FDIVBYZERO FINEXACT FINVALID FOVERFLOW FUNDERFLOW
Masks may be OR'd to form fmask's.
[**RATIONALE
The mask names correspond to the C99 macros FE_DIVBYZERO,
FE_INEXACT, FE_INVALID, FE_OVERFLOW, and FE_UNDERFLOW, based on
C99:WG14/N1256 7.6.2.
The masks have nothing to do with the corresponding cpu masks in
intel processors, which select the default or an alternate IEEE
fp exception handler.
]
GET-FSTATUS ( fmask -- fflags )
Return the bit-wise OR of the current fp status flags selected
by the nonzero bits of fmask, without changing the status. It
is an ambiguous condition if fmask is not the OR of a subset
of the five, named masks, or if the underlying system does not
support all five flags.
SET-FSTATUS ( fflags fmask -- )
Set the values of the fp status flags selected by fmask to the
values of the corresponding bits in fflags, without side
effects. It is an ambiguous condition if fmask is not the OR
of a subset of the five, named masks, or if the underlying
system does not support all five flags.
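The intended semantics of the two words can be illustrated with a toy model in which the five flags are sticky bits of one integer. The Python sketch below is not tied to any hardware; the mask values and the class name FpStatus are hypothetical, since the proposal only requires the masks to be distinct and nonzero.

```python
# Hypothetical mask values; only distinctness and nonzero values matter.
FDIVBYZERO, FINEXACT, FINVALID, FOVERFLOW, FUNDERFLOW = 1, 2, 4, 8, 16
ALL_FLAGS = FDIVBYZERO | FINEXACT | FINVALID | FOVERFLOW | FUNDERFLOW

class FpStatus:
    """Toy model of the fp status flags as one integer of sticky bits."""

    def __init__(self):
        self._flags = 0

    def raise_flags(self, fmask):
        """Side effect of an fp operation: flags are raised, never lowered."""
        self._flags |= fmask & ALL_FLAGS

    def get_fstatus(self, fmask):
        """GET-FSTATUS ( fmask -- fflags ): read without changing the status."""
        return self._flags & fmask

    def set_fstatus(self, fflags, fmask):
        """SET-FSTATUS ( fflags fmask -- ): copy the selected bits of fflags."""
        self._flags = (self._flags & ~fmask) | (fflags & fmask)
```

A typical use is to lower a flag with set_fstatus before a calculation and test it with get_fstatus afterwards, leaving the unselected flags untouched.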
[**MORE RATIONALE
The spirit of IEEE 754-2008 exceptions is that processing
continues without interrupting program flow. A default fp datum
is returned, and an exception flag is set, along with other
possible status information. Applications can query such
results and react to them. This scheme provides great
flexibility.
IEEE 754-2008 also allows the implementation of nondefault
exception handlers, which may interrupt program flow. C99 has
provisions for switching to such a mode, but does not require
it, and seems vague about how it's to be done.
The proposed words implement the simplest model, where the only
information provided about the fp exception state is the flag
values. The C99 functions fegetexceptflag() and
fesetexceptflag() are models for GET-FSTATUS and SET-FSTATUS.
Extra words that allow more information could be proposed in the
future, if needed.
The following quote from the C99 standard describes the intended
programming style, both for fp status flags and rounding modes.
C99:WG14/N1256 7.6, "Floating-point environment <fenv.h>",
paragraph 2:
Certain programming conventions support the intended model of
use for the floating-point environment:175)
- a function call does not alter its caller's floating-point
control modes, clear its caller's floating-point status
flags, nor depend on the state of its caller's
floating-point status flags unless the function is so
documented;
- a function call is assumed to require default floating-point
control modes, unless its documentation promises otherwise;
- a function call is assumed to have the potential for raising
floating-point exceptions, unless its documentation promises
otherwise.
175) With these conventions, a programmer can safely assume
default floating-point control modes (or be unaware of
them). The responsibilities associated with accessing the
floating-point environment fall on the programmer or program
that does so explicitly.
]
9.2 Rounding modes
------------------
[**UNDER CONSTRUCTION
This section is currently under discussion in comp.lang.forth.
See IEEE 9.3, "Operations on dynamic modes for attributes". Only
words corresponding to 9.3.1, "Operations on individual dynamic
modes", are expected to be implemented, and among those,
roundTiesToAway is not expected to be implemented.
]
10 REFERENCES
[1] "IEEE Standard for Floating-Point Arithmetic", approved June
12, 2008 as IEEE Std 754-2008:
http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4610935
[2] "DRAFT Standard for Floating-Point Arithmetic P754", IEEE
754 draft 1.2.9, January 27, 2007:
http://www.validlab.com/754R/nonabelian.com/754/comments/Q754.129.pdf
[3] Wikipedia, "IEEE 754-2008":
http://en.wikipedia.org/wiki/IEEE_754
[4] ANSI X3.215-1994 final draft:
http://www.taygeta.com/forth/dpans.html
[5] ISO/IEC 15145:1997:
http://webstore.ansi.org/RecordDetail.aspx?sku=ISO%2fIEC+15145%3a1997
http://www.iso.org/iso/catalogue_detail.htm?csnumber=26479
[6] ISO/IEC 9899:1999 (December 1, 1999),
ISO/IEC 9899:1999 Cor. 1:2001(E),
ISO/IEC 9899:1999 Cor. 2:2004(E),
ISO/IEC 9899:1999 Cor. 3:2007(E):
http://www.open-std.org/jtc1/sc22/wg14/www/standards.html#9899
[7] C99, TC1, TC2, and TC3 are included in the freely available
WG14/N1256, September 7, 2007 [**thanks to David Thompson]:
http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf
[8] Single UNIX 3, AFAICS duplicates the C99 library spec, with
some things pinned down more tightly [**thanks to David
Thompson]:
http://www.unix.org/single_unix_specification/
A.3 BINARY FLOATING-POINT FORMATS
IEEE 754-2008 defines three basic binary fp formats, binary32,
binary64, and binary128, plus three corresponding extended
binary formats, whose parameters are shown in Tables 1 and 2
below. It also defines the four binary interchange formats
shown in Table 3, plus those with storage widths of more than
128 bits that are a multiple of 32 bits.
Table 1: Parameters for IEEE 754-2008
         basic binary formats.
         binary32   binary64  binary128
---------------------------------------
p    =         24         53        113
emax =        127       1023      16383
Table 2: Parameters for IEEE 754-2008
         extended binary formats
         (columns name the basic
         format that is extended).
         binary32   binary64  binary128
---------------------------------------
p    >=        32         64        128
emax >=      1023      16383      65535
Table 3: Parameters for IEEE 754-2008
         binary interchange formats
         (k is the storage width in
         bits).
          binary16   binary32   binary64  binary128
---------------------------------------------------
k    =          16         32         64        128
p    =          11         24         53        113
emax =          15        127       1023      16383
Note that the intel 80-bit format corresponds to one of the
extended binary64 formats, with p = 64 and emax = 16383. Its
precision is greater than that of basic binary64 and less than
that of basic binary128, with exponent range the same as basic
binary128. Although it is not defined as a basic IEEE binary
format, it may be called the "binary80" basic format. Its
implementation normally differs from that of the other basic
formats by having an explicit leading bit for normal and
subnormal numbers.
Binary interchange formats are logical formats, with unspecified
memory layout. They all have an implicit leading bit for normal
and subnormal numbers.
Note that the binary128 interchange format is the only one in
Table 3 that can contain the binary80 basic format. IEEE does
not define a binary80 interchange format.
A.7.1 NAN SIGNS AND LOADS
IEEE allows the load for nan results like 0E 0E F/ to be
anything, so the following does not necessarily give a zero
load:
0E 0E F/ FABS FCONSTANT +NAN
Aside from the ambiguous load, the FABS (extended to nan) is
necessary here, because not only does IEEE not specify the sign,
both pfe and gforth actually give opposite signs for 0E 0E F/
under ppc (+) vs. intel (-) Mac OS X. They do both give quiet
nans with zero load.
As a matter of fact, the intel QNaN "floating-point indefinite"
is the qnan with zero load and negative sign, according to
"Intel(R) 64 and IA-32 Architectures Software Developer's
Manual, Volume 1: Basic Architecture", Table 4-3:
http://www.intel.com/Assets/PDF/manual/253665.pdf
A.8.3 COMPARISON (INFORMATIVE RATIONALE)
[**Any high-level definitions in this section assume a separate
floating-point stack.]
The twelve IEEE required comparisons are the following, where
"N" means logical negation, "?" stands for "unordered", and "<?"
and ">?" stand for "less than or unordered" and "greater than or
unordered":
<  =  >  ?  N<  N=  N>  N?  <=  >=  <?  >?
Unfortunately, the common notation "?" for the unordered
predicate clashes with Forth practice, where "?" usually
[**always?] indicates a flag or test. The ? notation is used
here as a convenience for IEEE predicates, and does not appear
in any corresponding Forth names.
The <= and >= comparisons are no longer simple negations of >
and <, but are rather the AND's of those negations with N?. It
can be shown that <? is N(>=), and >? is N(<=). See IEEE Table
5.3, "Required unordered-quiet predicates and negations".
It is possible to implement all of the IEEE comparisons via
high-level definitions in terms of a few low-level words, even
fewer than the five required for this word set. The following
remarks are offered as a guide to possibilities for the choice
of low-level words.
DPANS94 has only F< and F~. Since IEEE "=" is semantically
different from 0E F~, low-level implementation of F= seems
inevitable. The two words F< and F= are probably a minimum set
for high-level implementation of the rest.
For example, IEEE ">" is not expressible in terms of "<" and "="
plus logical operations on operands in the same order, but with
the operands swapped, F> can be defined as:
: F> ( f: r1 r2 -- s: [r1>r2]? ) FSWAP F< ;
FUNORDERED is not required by this document; in particular, the
name is not reserved; but it can be defined as
: FUNORDERED ( f: r1 r2 -- s: [r1?r2]? )
FDUP F= FDUP F= AND 0= ;
The logical negations N<, N=, N>, and N? can be expressed with
the Forth phrases "F< 0=", etc.
Forth words for the <= and >= predicates can be defined as
: F2DUP ( f: r1 r2 -- r1 r2 r1 r2 ) FOVER FOVER ;
: F<= ( f: r1 r2 -- s: [r1<=r2]? ) F2DUP F< F= OR ;
: F>= ( f: r1 r2 -- s: [r1>=r2]? ) F2DUP F> F= OR ;
The <? and >? predicates can be expressed by the Forth
phrases "F>= 0=" and "F<= 0=".
Thus all twelve IEEE predicates can be expressed with the five
required words, and as few as two.
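The difference between <= and the simple negation of > shows up
only for unordered operands. The following sketch checks that,
assuming a separate fp stack, IEEE-conforming low-level F< and
F=, and a system where 0E 0E F/ returns a nan without
interrupting program flow. The names SOME-NAN and SHOW-UNORDERED
are hypothetical.

```forth
\ High-level comparisons built from F< and F= only, as in the text.
: F>    ( f: r1 r2 -- s: [r1>r2]? )      FSWAP F< ;
: F2DUP ( f: r1 r2 -- r1 r2 r1 r2 )      FOVER FOVER ;
: F<=   ( f: r1 r2 -- s: [r1<=r2]? )     F2DUP F< F= OR ;

0E 0E F/ FCONSTANT SOME-NAN    \ a nan; its sign and load are unspecified

: SHOW-UNORDERED ( -- )
   SOME-NAN 1E F<=     ." <= : " . CR    \ false: unordered is never <=
   SOME-NAN 1E F> 0=   ." N> : " . CR    \ true: N> includes unordered
   SOME-NAN 1E F<= 0=  ." >? : " . CR ;  \ true: greater than or unordered

SHOW-UNORDERED
```

With a nan operand, <= and N> give different flags, while N> and
>? agree, since neither > nor <= holds when the operands are
unordered.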
[**Treat the following as a footnote.]
Exactly one of the four fundamental IEEE comparison predicates
holds for any pair of operands, so their closure under AND, OR,
and negation consists of 2^4 = sixteen distinct relations,
including the twelve that IEEE
requires. Two additional elements are the trivial,
identically true and false relations, and the other two are
"less than or greater than" and its negation, "unordered or
equal". The five words of this word set, plus their five
negations, implement the only nontrivial, transitive relations
among the sixteen.
On the other hand, several current cpu's have efficient
operations for all four of the fundamental IEEE comparisons, <,
>, =, and ?. Low-level implementations of at least F<, F>, and
F=, would be natural for such systems.
All of the words in the F0< family can be defined by analogy to
: F0< ( f: r -- s: [r<0]? ) 0E F< ;
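As a sketch of that analogy, and assuming that the family
mirrors the five binary comparisons F<, F=, F>, F<=, and F>=
with a separate fp stack, the remaining members might read:

```forth
\ Each word compares r with zero using the matching binary comparison.
\ Assumes F=, F>, F<=, and F>= are already available on the system.
: F0=  ( f: r -- s: [r=0]? )   0E F=  ;
: F0>  ( f: r -- s: [r>0]? )   0E F>  ;
: F0<= ( f: r -- s: [r<=0]? )  0E F<= ;
: F0>= ( f: r -- s: [r>=0]? )  0E F>= ;
```

With IEEE comparison semantics, all of these return false for a
nan, and a negative zero compares equal to 0E, so it satisfies
F0=, F0<=, and F0>= but not F0< or F0>.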