Improve the documentation comments around CSE's uses (i.e., exploitations) of NaNs #505

spahrenk · 2024-09-09T17:56:10Z

I (@nealkruis) am going to start documenting things here as a draft while I wrap my head around all the different 32-bit representations.

In CSE, most record data members (especially those set through the user input language) are stored as 32-bit values.
CSE exploits the IEEE 754 definition of NaN to encode payload information about record members to indicate:

Whether the value has been set by the user
Whether the value is supposed to be autosized
Whether the value is defined by the user as an expression (and which expression it corresponds to)
Whether the value is a choice input (and which choice value it contains)

Here's a good primer on floating point bit patterns. The 32 bits are divided into:

a sign (bit 1),
the exponent (bits 2-9), and
the mantissa/fraction (bits 10-32).
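
As a rough illustration of that layout, here is a minimal sketch that pulls the three fields out of a 32-bit float. The names `FloatFields`, `float_bits`, and `split_fields` are hypothetical helpers made up for this comment, not CSE code, and the sketch assumes `float` is a 32-bit IEEE 754 type.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper type for illustration only (not part of CSE).
struct FloatFields {
    std::uint32_t sign;      // bit 1 (most significant bit)
    std::uint32_t exponent;  // bits 2-9
    std::uint32_t fraction;  // bits 10-32
};

// Copy the float's bytes into an integer. memcpy avoids the undefined
// behavior of reading a float through a uint32_t pointer.
inline std::uint32_t float_bits(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Split a raw 32-bit pattern into sign, exponent, and fraction.
inline FloatFields split_fields(std::uint32_t bits) {
    return FloatFields{bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}
```

For example, `split_fields(float_bits(-2.0f))` gives a sign of 1, an exponent field of 128, and a fraction of 0.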

This exploit relies on relatively consistent implementations across compilers. However, per cppreference.com:

In IEEE 754, the most common binary representation of floating-point numbers, any value with all bits of the exponent set and at least one bit of the fraction set represents a NaN. It is implementation-defined which values of the fraction represent quiet or signaling NaNs, and whether the sign bit is meaningful.

The main risk in this approach is that a floating point operation yields a signaling or quiet NaN and CSE then interprets its payload as a meaning that was never intended. To prevent this, payload interpretations should be limited to bit patterns that common compiler implementations do not produce as signaling or quiet NaNs.

Nomenclature:

0 bit must be zero
1 bit must be one
X bit may be either zero or one
Z the group of bits must contain at least one zero
N the group of bits must contain at least one one
B the group of bits must contain at least one zero and at least one one

Here are the rules (as far as I can tell):

0 00000000 00000000000000000000000: 0
1 00000000 00000000000000000000000: -0
0 11111111 00000000000000000000000: inf
1 11111111 00000000000000000000000: -inf
X ZZZZZZZZ XXXXXXXXXXXXXXXXXXXXXXX: Finite floating point number (this pattern also covers the two zeros above and subnormals)
0 11111111 10000000000000000000000: std::numeric_limits<float>::quiet_NaN()
0 11111111 01000000000000000000000: std::numeric_limits<float>::signaling_NaN()
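
To make the rules above easier to check, here is a sketch of a classifier over raw 32-bit patterns. `BitClass`, `classify_bits`, and `local_quiet_nan_bits` are hypothetical names for this comment only. The two "std" NaN patterns are simply the ones listed above; since they are implementation-defined, the last helper reads the quiet NaN that the local standard library actually returns so it can be compared on a given compiler.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical categories mirroring the rule list above (not CSE code).
enum class BitClass { Zero, Infinity, Finite, StdQuietNaN, StdSignalingNaN, OtherNaN };

inline BitClass classify_bits(std::uint32_t bits) {
    const std::uint32_t sign = bits >> 31;
    const std::uint32_t exponent = (bits >> 23) & 0xFFu;
    const std::uint32_t fraction = bits & 0x7FFFFFu;

    if (exponent != 0xFFu) {
        // Exponent has at least one zero bit: zero, subnormal, or normal.
        return (bits & 0x7FFFFFFFu) == 0u ? BitClass::Zero : BitClass::Finite;
    }
    if (fraction == 0u) return BitClass::Infinity;  // +inf or -inf
    // Patterns taken from the list above; assumed, not guaranteed, to match
    // what a particular standard library returns.
    if (sign == 0u && fraction == 0x400000u) return BitClass::StdQuietNaN;
    if (sign == 0u && fraction == 0x200000u) return BitClass::StdSignalingNaN;
    return BitClass::OtherNaN;  // candidates for NANDLEs
}

// Bit pattern of the quiet NaN returned by the local standard library.
inline std::uint32_t local_quiet_nan_bits() {
    const float q = std::numeric_limits<float>::quiet_NaN();
    std::uint32_t bits;
    std::memcpy(&bits, &q, sizeof bits);
    return bits;
}
```

Calling `classify_bits(local_quiet_nan_bits())` on a target compiler is a quick way to see whether its quiet NaN matches the pattern listed above or falls into the `OtherNaN` range.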

This leaves the following bit sets for CSE's "NANDLES":

X 11111111 1NNNNNNNNNNNNNNNNNNNNNN: Quiet NaNs
X 11111111 0XNNNNNNNNNNNNNNNNNNNNN: Signaling NaNs

NANDLES (current):

1 11111111 00000000000000000000000: Unset (Note: this is also -inf)
1 11111111 00000001111111111111111: Autosizing
1 11111111 0000000BBBBBBBBBBBBBBBB: Expressions (bottom 16 bits = expression index)
0 11111111 XXXXXXXXXXXXXXXXXXXXXXX: Choices (top 7 bits = choice index; Note: this overlaps with inf, the std quiet NaN, and the std signaling NaN)
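
The current scheme can be sketched as a decoder over the raw bits. This is my reading of the four patterns above, not a copy of CSE's actual macros; `NandleKind`, `classify_nandle`, `expression_index`, and `choice_index` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical categories for the current NANDLE patterns (not CSE code).
enum class NandleKind { NotNandle, Unset, Autosize, Expression, Choice };

inline NandleKind classify_nandle(std::uint32_t bits) {
    const std::uint32_t sign = bits >> 31;
    const std::uint32_t exponent = (bits >> 23) & 0xFFu;
    const std::uint32_t fraction = bits & 0x7FFFFFu;

    if (exponent != 0xFFu) return NandleKind::NotNandle;  // ordinary number

    // Sign 0 with all exponent bits set is treated as a choice. This also
    // catches +inf and the std NaNs, which is the overlap noted above.
    if (sign == 0u) return NandleKind::Choice;

    // Sign 1: the top 7 fraction bits must be zero for the other NANDLEs.
    if ((fraction >> 16) != 0u) return NandleKind::NotNandle;

    const std::uint32_t low16 = fraction & 0xFFFFu;
    if (low16 == 0x0000u) return NandleKind::Unset;     // same bits as -inf
    if (low16 == 0xFFFFu) return NandleKind::Autosize;
    return NandleKind::Expression;                      // index in low 16 bits
}

// Bottom 16 bits = expression index.
inline std::uint32_t expression_index(std::uint32_t bits) { return bits & 0xFFFFu; }

// Top 7 bits of the fraction = choice index.
inline std::uint32_t choice_index(std::uint32_t bits) { return (bits >> 16) & 0x7Fu; }
```

Note that `classify_nandle(0x7F800000)` (+inf) returns `Choice` and `classify_nandle(0xFF800000)` (-inf) returns `Unset`, which are exactly the collisions flagged in the notes.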


nealkruis · 2024-09-23T15:53:30Z

Unset should be changed to be different from -inf.

Choice needs to be modified to be either:

0 11111111 1NNNNNNNNNNNNNNNNNNNNNN
0 11111111 0XNNNNNNNNNNNNNNNNNNNNN
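
A sketch of a check for the two proposed choice patterns, using the nomenclature from the issue description (N = at least one one in the group). The function name `is_proposed_choice_pattern` is hypothetical, and accepting either pattern is just one way to read the proposal.

```cpp
#include <cstdint>

// Returns true if the bits match either proposed choice pattern:
//   0 11111111 1NNNNNNNNNNNNNNNNNNNNNN  (quiet bit set, low 22 bits not all zero)
//   0 11111111 0XNNNNNNNNNNNNNNNNNNNNN  (quiet bit clear, low 21 bits not all zero)
// Hypothetical helper for illustration only (not CSE code).
inline bool is_proposed_choice_pattern(std::uint32_t bits) {
    const std::uint32_t sign = bits >> 31;
    const std::uint32_t exponent = (bits >> 23) & 0xFFu;
    const std::uint32_t fraction = bits & 0x7FFFFFu;

    if (sign != 0u || exponent != 0xFFu) return false;

    const bool quiet_bit = (fraction & 0x400000u) != 0u;
    if (quiet_bit) {
        return (fraction & 0x3FFFFFu) != 0u;  // excludes 0x7FC00000
    }
    return (fraction & 0x1FFFFFu) != 0u;      // excludes inf and 0x7FA00000
}
```

With this check, +inf (`0x7F800000`), the listed std quiet NaN (`0x7FC00000`), and the listed std signaling NaN (`0x7FA00000`) are all rejected, while patterns such as `0x7FC00001` and `0x7F800001` are accepted.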
