
More compatibility improvements in Accuracy and Precision. Adding tests. #769

Merged
merged 8 commits into master from more_on_accuracy_and_prec
Jan 30, 2023

Conversation

mmatera
Contributor

@mmatera mmatera commented Jan 30, 2023

This PR is another fix coming from issues found when I was working on #766. In particular, the way in which Mathics handles Precision and Accuracy for real numbers with 0. nominal value. Also, a bug that prevents parsing Real numbers with fixed accuracy was fixed.
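For context, the two notations mentioned here relate precision and accuracy through the base-10 logarithm of the absolute value. A minimal sketch of what the fixed-accuracy notation denotes (illustrative Python only, not the Mathics parser code):

```python
import math

def precision_from_accuracy(value: float, accuracy: float) -> float:
    """WMA relation for nonzero reals: Precision = Accuracy + Log10[Abs[value]].

    Illustration of what a fixed-accuracy literal denotes;
    not the actual Mathics implementation.
    """
    return accuracy + math.log10(abs(value))

# 1.5``10 denotes the value 1.5 carrying 10 accurate digits, so its precision is
p = precision_from_accuracy(1.5, 10.0)
print(round(p, 3))  # 10.176
```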

@rocky
Member

rocky commented Jan 30, 2023

The thrust or idea behind this is good. Update: the more I look at this code the more I like it.

However, I would like to spend some time understanding things in detail. I want to make sure I understand things - that may take a little while.

CHANGES.rst Outdated
@@ -92,9 +94,10 @@ Bugs

#. Units and Quantities were sometimes failing. Also they were omitted from documentation.
#. Better handling of ``Infinite`` quantities.
#. Fix ``Precision`` compatibility with WMA.
#. Fix ``Precision`` and ``Accuracy`` compatibility with WMA. In particular, ``Precision[0.]`` and ``Accuracy[0.]``
#. Accuracy in numbers usint the notation ``` n.nnn``acc ``` now is properly handled.
Member

usint -> using

Member

A small thing about using the word "Fix". Does it mean that all the problems with Precision have really been corrected, or that the previous problems have been reduced, that is, that Precision has been improved?

The section on "Precision" says

This is rather a proof-of-concept than a full implementation

Is that still the case after this PR?

The section on "Accuracy" says:

Notice that the value is not exactly equal to the [value] obtained in WMA

Is this still accurate?

Contributor Author

In all the cases I have tested, I get the same result as in WMA (up to small numeric differences due to the mpmath implementation).

Member

My style is to not exaggerate. This is then "more properly handled", as the next change item says, or "improved" rather than "fixed".

Contributor Author

The differences I found are of this order:

Accuracy[0.32`30*^30]

WMA: 0.4948500216800894
Mathics: 0.49485002168009373

For Complex, there could be more differences:

Precision[12`20*^30+13`15*^24]
WMA: 19.9553
Mathics: 20.

Accuracy[12`20*^30+13`15*^24]
WMA: -11.1239
Mathics: -11.0792

Probably the normalization is different in this case. In any case, the differences are at most of order ~0.1. So all the currently detected issues are fixed. But OK, I can rephrase it.
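The first number above can be reproduced from the precision relation directly; a sketch under that relation (not Mathics code), useful for documenting where the value comes from:

```python
import math

# Accuracy[0.32`30*^30]: nominal value 0.32*10^30 carrying precision 30.
# For nonzero reals: Accuracy = Precision - Log10[Abs[value]]
value, precision = 0.32e30, 30.0
accuracy = precision - math.log10(abs(value))
print(accuracy)  # ≈ 0.49485, in line with both the WMA and Mathics results above
```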

Member

Thanks for performing the experiments. It is nice to know this. I think we should document this somewhere so we don't lose this useful information.

>> Accuracy[0.``2]
= 2.

For 0.`, the accuracy is the
Member

is the -> is:

Member

Colon at the end after "is" please.

>> Precision[0.]
= MachinePrecision

On the other hand, for a Precision Real with fixed accuracy,
Member

Add a backslash after "fixed accuracy,"

= True

The case of `0.` values is special. Following WMA, in a Machine Real representation, the precision is
set to $MachinePrecision:
Member

$MachinePrecision -> '$MachinePrecision'

Member

@rocky rocky Jan 30, 2023

Please add "\" to the end of the line ("the precision is").

@@ -948,6 +934,23 @@ class Precision(Builtin):
>> Precision[{{1, 1.`},{1.`5, 1.`10}}]
= 5.

For non-zero Real values, it holds in general
Member

general -> general:

MAX_MACHINE_NUMBER = float_info.max
ZERO_MACHINE_ACCURACY = -mpmath.log(MIN_MACHINE_NUMBER, 10.0) + MACHINE_PRECISION_VALUE

# backward compatibility
Member

Is this necessary? This release we are bumping the major number. If it is necessary, I suggest noting in the comment when we can remove it. (I would find it hard to believe that anyone other than ourselves is making use of this.)

Contributor Author

I left this comment because that variable appears in several modules. I can make the change, but then this PR would become larger. Also, I wondered whether it was worth making the change. Thoughts?

Member

I think it is worth making the change. If you want to do it in another PR, that's okay.

Contributor Author

Then I will do that in a follow-up PR.
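For reference, the constants in the snippet quoted earlier in this thread can be derived from sys.float_info alone; a sketch of how the quoted values arise, assuming IEEE-754 binary64 doubles (my derivation, not the exact Mathics code):

```python
import math
import sys

# Smallest normalized positive double and machine precision in decimal digits
MIN_MACHINE_NUMBER = sys.float_info.min                            # ≈ 2.2250738585e-308
MACHINE_PRECISION_VALUE = sys.float_info.mant_dig * math.log10(2)  # ≈ 15.9546

# Accuracy assigned to a machine-real 0., as in the quoted snippet:
ZERO_MACHINE_ACCURACY = -math.log10(MIN_MACHINE_NUMBER) + MACHINE_PRECISION_VALUE
print(round(ZERO_MACHINE_ACCURACY, 3))  # 323.607 on IEEE-754 doubles
```

This is also where the "323.607" expected values in the pytest cases further down come from.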

as a normalized machine number in the system.
</dl>

MachinePrecision minus the Log base 10 of this number is the
Member

MachinePrecision -> 'MachinePrecision'

Member

I am not seeing quotes around 'MachinePrecision' on line 623. Maybe a commit that wasn't pushed?

# ("Accuracy[-0.000000000000000000000000000000000000]", "36."),
("1.0000000000000000 // Accuracy", "15."),
("1.00000000000000000 // Accuracy", "17."),
@pytest.mark.parametrize(
Member

@rocky rocky Jan 30, 2023

I think Accuracy is based on computer hardware characteristics, right? If so we should add a pytest.mark.skipif when those hardware characteristics are not met.

Contributor Author

I used the native sys.float_info, so it adjusts to each architecture. If we discover an architecture where it fails, we can handle that then.

Member

Doesn't accuracy depend on MACHINE_PRECISION_VALUE and MIN_MACHINE_NUMBER? And don't those depend on mpmath values, which you say can vary?

I am okay with leaving out pytest.mark.skipif provided there is an assertion on these underlying values.

And then that way we will know that things have changed.

Recall we had a recent situation with reading bytes from a file that changed due to the endian-ness of the machine.

I would feel differently if we were testing regularly using different CPU architectures and 32-bit machines versus 64-bit OS setups and so on. But we are not.

Member

One other thing here. There should be unit tests precisely for the underlying assumptions that are in effect.

One purpose of unit tests is exactly this. (There are other kinds of tests, like functional and integration tests.) Our tests tend to be on the heavier side, integration and functional, rather than the smaller, lighter, more precise unit tests.

Contributor Author

> Doesn't accuracy depend on MACHINE_PRECISION_VALUE and MIN_MACHINE_NUMBER? and those depend of mpmath values and you say that mpmath varies?

MACHINE_PRECISION_VALUE and MIN_MACHINE_NUMBER depend on numbers coming from the sys module, so this would work on any platform that runs Python. What should we catch?

> I am okay with leaving out pytest.mark.skipif provided there is an assertion on these underlying values.
>
> And then that way we will know that things have changed.
>
> Recall we had a recent situation with reading bytes from a file that changed due to the endian-ness of the machine.
>
> I would feel differently if we were testing regularly using different CPU architectures and 32-bit machines versus 64-bit OS setups and so on. But we are not.

Contributor Author

Ah! Now I understand. What I think would be better is to change the tests in a way that reinforces the connection with these numbers.

Member

@rocky rocky Jan 30, 2023

Yes, that is even better. And if you are up for it, go for it!

Maybe you can find a 32-bit OS around and see what values you get, or try a Python built with a 32-bit C compiler. Or, if there are 128-bit architectures or 128-bit C compilers, see what those do.

There might be GitHub workflows CI images that have these properties of 32-bitness (or something other than 64-bitness), and we could run a test using such an image.

Member

And the same thing along the lines of different mpc versions.

Member

It occurs to me that if you feel confident you know how MACHINE_EPSILON, etc. change things, then in a test you could artificially set these to different values and run the Precision and Accuracy tests.

It would, however, be nice to know what alternate values these can possibly take on.

Contributor Author

I think that with the new changes, the tests are robust across different platforms. The only thing that could go wrong is that some platform does not support floating-point numbers, but I guess in such a case everything breaks anyway.
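Along the lines suggested above, the underlying assumptions could be pinned down by a small unit test. A hypothetical sketch (the test name and the exact assertions are my invention, not code from this PR):

```python
import math
import sys

def test_float_model_assumptions():
    """Pin the IEEE-754 binary64 facts the Accuracy/Precision tests rely on.

    If any of these fail on a new platform, the expected test values
    (e.g. the 323.607 accuracy of 0.00`) need revisiting first.
    """
    assert sys.float_info.mant_dig == 53        # 53-bit significand
    assert sys.float_info.min_exp == -1021      # normalized double exponent range
    assert math.isclose(sys.float_info.epsilon, 2.0 ** -52)

test_float_model_assumptions()
```

When such an assumption breaks, this fails with a pinpointed low-level message instead of an opaque doctest mismatch.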

name = "$MinMachineNumber"
summary_text = "smallest normalized positive machine number"

def evaluate(self, evaluation) -> MachineReal:
Member

In new code I have been adding a type annotation for the evaluation parameter: evaluation: Evaluation.

@rocky
Member

rocky commented Jan 30, 2023

@mmatera Branch merged with master because I can't run Django using this and the current master.


$$Accuracy[z] == Precision[z] + Log[z]$$
Accuracy[$z$] == Precision[$z$] + Log[$z$]
Member

Single quotes around 'Accuracy', 'Precision', and 'Log' please.

@@ -183,11 +187,17 @@ class Accuracy(Builtin):
= Infinity

>> Accuracy[F[1.3, Pi, A]]
= 14.8861
= 15.8406
Member

If this can vary, then please let's use '...' and make precise tests as pytests, where we also have assertions or checks on the underlying factors that make this so.

Member

As you know with the MLMath stuff, I am not a fan of fragile tests because they fail unexpectedly at a time when you really don't want to start dealing with the failure.

Contributor Author

In this case, the "fragility" of the test allows checking the behavior. But maybe I could wrap it in Round.

Member

You could, but that is the wrong philosophy: the primary purpose of these tests is to enlighten users about what this Builtin does; testing is secondary. So complicating an example just to make testing easier is not the right approach.

If you want a test, then add this as a pytest, where you can get as elaborate as you want, e.g. agreement to within so many digits. But even better here is to also add checks for the fundamental properties that this calculation is based on. That way, when an assumption breaks for whatever reason, we have pinpointed the problem at a low level, rather than having to recall, or look around in the code for, what Accuracy assumes that has been violated.

@@ -927,8 +914,8 @@ class Precision(Builtin):
<dt>'Precision[$expr$]'
<dd>examines the number of significant digits of $expr$.
</dl>

<i>This is rather a proof-of-concept than a full implementation.</i>
<i>Notice that the result could be slighly different than the obtained
Member

Backslash after "obtained", please. For myself, I go into Django and look at the rendered output; it can help catch things like this.

@mmatera mmatera force-pushed the more_on_accuracy_and_prec branch 2 times, most recently from 31ae1b1 to 93a7703 Compare January 30, 2023 18:04
more doc adjustments
@mmatera mmatera force-pushed the more_on_accuracy_and_prec branch from 93a7703 to f88e6e5 Compare January 30, 2023 18:05
@@ -195,7 +195,7 @@ class Accuracy(Builtin):
>> Accuracy[0.``2]
= 2.

For 0.`, the accuracy is
For 0.`, the accuracy satisfies
Member

OK, but please add a colon after "satisfies"; this is what we do everywhere (and it happens to be the correct punctuation).

("0.00`", "323.607"),
("0.00`2", "323.607"),
("0.00`20", "323.607"),
("0.", ZERO_MACHINE_ACCURACY_STR),
Member

This is actually more readable. Thanks.

(" 0.4 + 2.4 I", "15.5744"),
# For some reason, the following test
# would fail in WMA
("1. I", "Accuracy[1.]"),
Member

Again, this and the ones below are more understandable. Thanks. (And again, this shows how pytests are very different from tests that happen to get run in documentation.)

@rocky
Member

rocky commented Jan 30, 2023

LGTM and seems useful. Thanks. There may be some small lingering punctuation things, but those can be addressed any time.

@mmatera mmatera merged commit 40a446d into master Jan 30, 2023
@mmatera mmatera deleted the more_on_accuracy_and_prec branch January 30, 2023 19:38