More compatibility improvements in Accuracy and Precision. Adding tests. #769
Conversation
The thrust or idea behind this is good. Update: the more I look at this code, the more I like it. However, I would like to spend some time understanding things in detail - that may take a little while.
CHANGES.rst
Outdated
@@ -92,9 +94,10 @@ Bugs
#. Units and Quantities were sometimes failing. Also they were omitted from documentation.
#. Better handling of ``Infinite`` quantities.
#. Fix ``Precision`` compatibility with WMA.
#. Fix ``Precision`` and ``Accuracy``compatibility with WMA. In particular, ``Precision[0.]`` and ``Accuracy[0.]``
#. Accuracy in numbers usint the notation ``` n.nnn``acc ``` now is properly handled.
usint -> using
A small thing about using the word "Fix". Does it mean that all the problems with Precision have really been corrected or is it that the previous problems have been reduced, that is, that Precision has been improved?
The section on "Precision" says
This is rather a proof-of-concept than a full implementation
Is that still the case after this PR?
The section on "Accuracy" says:
Notice that the value is not exactly equal to the [value] obtained in WMA
Is this still accurate?
In all the cases I have tested, I get the same result as in WMA (up to small numeric differences due to the mpmath implementation).
My style is to not exaggerate. This would then be "more properly handled", as the next change says, or "improved", rather than "fixed".
The differences I found are of this order:
Accuracy[0.32`30*^30]
WMA: 0.4948500216800894
Mathics: 0.49485002168009373
For Complex, there could be more differences:
Precision[12`20*^30+13`15*^24]
WMA: 19.9553
Mathics: 20.
Accuracy[12`20*^30+13`15*^24]
WMA: -11.1239
Mathics: -11.0792
Probably in this case the normalization is different. In any case, the differences are at most ~O(1/10). So all the currently detected issues are fixed. But OK, I can rephrase it.
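For reference, the first discrepancy above is consistent with the usual relation (an assumption here, not a statement about either implementation's internals) that accuracy equals precision minus the base-10 log of the magnitude:

```python
import math

# Hedged sketch: assumes Accuracy[z] == Precision[z] - Log10[Abs[z]].
# This reproduces, to ~1e-13, the value quoted above for 0.32`30*^30.
def accuracy(value: float, precision: float) -> float:
    return precision - math.log10(abs(value))

print(accuracy(0.32e30, 30))  # ≈ 0.49485, matching both results above
```

The remaining difference between WMA and Mathics is then down to rounding in the underlying log computation, which fits the "small numeric differences due to the mpmath implementation" noted earlier.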
Thanks for performing the experiments. It is nice to know this. I think we should document this somewhere so we don't lose this useful information.
mathics/builtin/atomic/numbers.py
Outdated
>> Accuracy[0.``2]
= 2.

For 0.`, the accuracy is the
is the -> is:
Colon at the end after "is" please.
mathics/builtin/atomic/numbers.py
Outdated
>> Precision[0.]
= MachinePrecision

On the other hand, for a Precision Real with fixed accuracy,
Add a backslash after "fixed accuracy,"
mathics/builtin/atomic/numbers.py
Outdated
= True

The case of `0.` values is special. Following WMA, in a Machine Real representation, the precision is
set to $MachinePrecision:
$MachinePrecision -> '$MachinePrecision'
Please add "\" to the end of the line ("the precision is").
mathics/builtin/atomic/numbers.py
Outdated
@@ -948,6 +934,23 @@ class Precision(Builtin):
>> Precision[{{1, 1.`},{1.`5, 1.`10}}]
= 5.

For non-zero Real values, it holds in general
general -> general:
MAX_MACHINE_NUMBER = float_info.max
ZERO_MACHINE_ACCURACY = -mpmath.log(MIN_MACHINE_NUMBER, 10.0) + MACHINE_PRECISION_VALUE

# backward compatibility
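As a sanity check on these constants, here is a stdlib-only sketch (assuming 64-bit IEEE-754 doubles, and assuming MACHINE_PRECISION_VALUE is the decimal precision of a 53-bit mantissa, i.e. 53·Log10[2]) showing how the zero-value accuracy works out to the 323.607 that appears in the tests below:

```python
import math
import sys

# Assumptions: 64-bit IEEE-754 doubles; MACHINE_PRECISION_VALUE taken as
# mant_dig * log10(2), the decimal digits of a 53-bit mantissa (~15.9546).
MACHINE_PRECISION_VALUE = sys.float_info.mant_dig * math.log10(2.0)
MIN_MACHINE_NUMBER = sys.float_info.min  # smallest normalized positive double

# Accuracy assigned to a machine-real zero: as many decimal digits past
# the point as the smallest normalized number allows, plus machine precision.
ZERO_MACHINE_ACCURACY = -math.log10(MIN_MACHINE_NUMBER) + MACHINE_PRECISION_VALUE

print(round(ZERO_MACHINE_ACCURACY, 3))  # 323.607 on x86-64
```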
Is this necessary? This release we are bumping the major version number. If it is necessary, I suggest noting in the comment when we can remove this. (I would find it hard to believe that anyone other than ourselves is making use of this.)
I left this comment because that variable appears in several modules. I can make the change, but then this PR would become larger. Also, I wondered whether it was worth making the change. Thoughts?
I think it is worth making the change. If you want to do it in another PR, that's okay.
Then, I will do that in the following PR.
mathics/builtin/numbers/constants.py
Outdated
as a normalized machine number in the system.
</dl>

MachinePrecision minus the Log base 10 of this number is the
MachinePrecision -> 'MachinePrecision'
I am not seeing quotes around 'MachinePrecision' on line 623. Maybe a commit that wasn't pushed?
# ("Accuracy[-0.000000000000000000000000000000000000]", "36."),
("1.0000000000000000 // Accuracy", "15."),
("1.00000000000000000 // Accuracy", "17."),
@pytest.mark.parametrize(
I think Accuracy is based on computer hardware characteristics, right? If so, we should add a pytest.mark.skipif for when those hardware characteristics are not met.
I used the native sys.float_info, so it would adjust to each architecture. If we discover an architecture where it fails, then we can handle it.
Doesn't accuracy depend on MACHINE_PRECISION_VALUE and MIN_MACHINE_NUMBER? And don't those depend on mpmath values, which you say vary?
I am okay with leaving out pytest.mark.skipif provided there is an assertion on these underlying values. That way we will know when things have changed.
Recall we had a recent situation with reading bytes from a file that changed due to the endian-ness of the machine.
I would feel differently if we were testing regularly using different CPU architectures and 32-bit machines versus 64-bit OS setups and so on. But we are not.
One other thing here. There should be unit tests precisely for the underlying assumptions that are in effect.
One purpose of unit tests is exactly this. (There are other kinds of tests, like functional and integration tests.) Our tests tend to be on the heavier side - integration and functional rather than the smaller, lighter, more precise unit tests.
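Such a unit test could be as small as pinning the sys.float_info values the code relies on (a sketch; the exact constants assume 64-bit IEEE-754 doubles, and the test name is illustrative):

```python
import sys

def test_float_info_assumptions():
    # These hold for 64-bit IEEE-754 doubles. If a platform violates them,
    # this test fails first and pinpoints the broken assumption, instead of
    # leaving us to puzzle over a wrong Accuracy/Precision result downstream.
    assert sys.float_info.mant_dig == 53
    assert sys.float_info.min == 2.2250738585072014e-308
    assert sys.float_info.max == 1.7976931348623157e+308

test_float_info_assumptions()
```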
Doesn't accuracy depend on MACHINE_PRECISION_VALUE and MIN_MACHINE_NUMBER? and those depend of mpmath values and you say that mpmath varies?
MACHINE_PRECISION_VALUE and MIN_MACHINE_NUMBER depend on numbers coming from the sys module, so it should work on any platform that runs Python. What should we catch?
I am okay with leaving out
pytest.mark.skipif
provided there is an assertion on these underlying values.And then that way we will know that things have changed.
Recall we had a recent situation with reading bytes from a file that changed due to the endian-ness of the machine.
I would feel differently if we were testing regularly using different CPU architectures and 32-bit machines versus 64-bit OS setups and so on. But we are not.
Ah! Now I understand. What I think would be better is to change the tests in a way that reinforces the connection with these numbers.
Yes, that is even better. And if you are up for it, go for it!
Maybe you can find a 32-bit OS around and see what values you get, or try a Python built with a 32-bit C compiler. Or, if there are 128-bit architectures or 128-bit C compilers, see what those do.
There might be GitHub workflows CI images that have these properties (32-bit, or otherwise other than 64-bit), and we could run a test using such an image.
And the same thing along the lines of different mpc versions.
It occurs to me that if you feel confident knowing how MACHINE_EPSILON, etc. change things, then in a test you could artificially set these to different values and run the Precision and Accuracy tests.
It would, however, be nice to know what alternate values these can possibly take on.
I think that with the new changes, the tests are robust across different platforms. The only thing that could go wrong is that a certain platform does not support floating-point numbers, but I guess in that case everything gets broken anyway.
mathics/builtin/numbers/constants.py
Outdated
name = "$MinMachineNumber"
summary_text = "smallest normalized positive machine number"

def evaluate(self, evaluation) -> MachineReal:
In new code I have been adding a type annotation for evaluation: evaluation: Evaluation.
@mmatera Branch merged with master because I can't run Django using this and the current master.
mathics/builtin/atomic/numbers.py
Outdated

$$Accuracy[z] == Precision[z] + Log[z]$$
Accuracy[$z$] == Precision[$z$] + Log[$z$]
Single quotes around 'Accuracy', 'Precision', and 'Log' please.
mathics/builtin/atomic/numbers.py
Outdated
@@ -183,11 +187,17 @@ class Accuracy(Builtin):
= Infinity

>> Accuracy[F[1.3, Pi, A]]
= 14.8861
= 15.8406
If this can vary, then please let's use '...' and make precise tests as pytests, where we also have assertions or checks on the underlying factors that make this so.
As you know from the MLMath stuff, I am not a fan of fragile tests, because they fail unexpectedly at a time when you really don't want to start dealing with the failure.
In this case, the "fragility" of the test allows for checking the behavior. But maybe I could surround it with Round.
You could, but this is the wrong philosophy: the primary purpose of these tests is to enlighten users as to what this Builtin does; testing is secondary. So complicating an example to ease testing is not the right approach.
If you want a test, then add this as a pytest, where you can get as elaborate as you want, like checking to within so many digits. But better here is to also add checks for the fundamental properties that this calculation is based on. That way, when an assumption breaks for whatever reason, we have pinpointed at a low level what the problem is, rather than having to recall or look around the code for what Accuracy assumes has been violated.
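A tolerant pytest along those lines might look like this (a sketch: evaluate is a hypothetical stand-in for running an expression through a Mathics session, and the tolerance is arbitrary):

```python
from math import isclose

def evaluate(expr: str) -> float:
    # Hypothetical helper: a real test would run `expr` through a
    # Mathics session and return the numeric result. Here we just
    # return the value quoted in the diff above, for illustration.
    return 15.8406

def test_accuracy_within_digits():
    # Compare to within ~3 decimal digits instead of matching an
    # exact doctest string, so small numeric drift does not break CI.
    assert isclose(evaluate("Accuracy[F[1.3, Pi, A]]"), 15.8406, abs_tol=1e-3)

test_accuracy_within_digits()
```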
mathics/builtin/atomic/numbers.py
Outdated
@@ -927,8 +914,8 @@ class Precision(Builtin):
<dt>'Precision[$expr$]'
<dd>examines the number of significant digits of $expr$.
</dl>

<i>This is rather a proof-of-concept than a full implementation.</i>
<i>Notice that the result could be slighly different than the obtained
backslash after "obtained" please. For myself, I go into Django and look at the rendered output; it can help catch things like this.
31ae1b1 to 93a7703
more doc adjustments
93a7703 to f88e6e5
mathics/builtin/atomic/numbers.py
Outdated
@@ -195,7 +195,7 @@ class Accuracy(Builtin):
>> Accuracy[0.``2]
= 2.

For 0.`, the accuracy is
For 0.`, the accuracy satisfies
OK, but please add a colon after "satisfies"; this is what we do everywhere (and it happens to be the correct punctuation).
("0.00`", "323.607"),
("0.00`2", "323.607"),
("0.00`20", "323.607"),
("0.", ZERO_MACHINE_ACCURACY_STR),
This is actually more readable. Thanks.
(" 0.4 + 2.4 I", "15.5744"),
# For some reason, the following test
# would fail in WMA
("1. I", "Accuracy[1.]"),
Again, this and the below are more understandable. Thanks. (And again, this shows how pytests are very different from tests that happen to get run in documentation.)
LGTM and seems useful. Thanks. There may be some small lingering punctuation things, but those can be addressed any time.
This PR is another fix coming from issues found while I was working on #766. In particular, it changes the way in which Mathics handles Precision and Accuracy for real numbers with 0. nominal value. Also, a bug that prevented parsing Real numbers with fixed accuracy was fixed.