-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add ak.approx_equal
#2198
Conversation
Codecov Report
Additional details and impacted files
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works, but @ivirshup brought up the point that the functions in np.testing are all assertion functions, which this one is not. Since the motivation for introducing the ak.testing
submodule was to follow the NumPy interface, this would have to be an assert_*
type function. See #1105 (comment) and following.
The main benefit of those functions over boolean-returning functions (in my opinion) is that they can give helpful error messages. A failed assert
statement (in pytest) would try to print out the two arrays, put an ellipsis in the middle because their str
representations are huge, and then underline the differences, which might not even be differences in the part of the str
that's not hidden.
A boolean-returning "approximately equal to" function can be top-level operations (through ak.operations
), but then we'd want to make it an extension of an existing boolean-returning function, such as np.allclose
. What would an Awkward extension of np.allclose
look like?
np.allclose
does not care about dtypes:
>>> np.allclose(np.array([[1, 2], [3, 4]]), np.array([[1.00001, 2.00001], [3.00001, 4.00001]]))
True
But it does care about shapes, in a raising-exceptions kind of way (not a returning-False kind of way):
>>> np.allclose(np.array([1, 2, 3, 4]), np.array([[1.00001, 2.00001], [3.00001, 4.00001]]))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<__array_function__ internals>", line 180, in allclose
File "/home/jpivarski/mambaforge/lib/python3.9/site-packages/numpy/core/numeric.py", line 2265, in allclose
res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
File "<__array_function__ internals>", line 180, in isclose
File "/home/jpivarski/mambaforge/lib/python3.9/site-packages/numpy/core/numeric.py", line 2375, in isclose
return within_tol(x, y, atol, rtol)
File "/home/jpivarski/mambaforge/lib/python3.9/site-packages/numpy/core/numeric.py", line 2356, in within_tol
return less_equal(abs(x-y), atol + rtol * abs(y))
ValueError: operands could not be broadcast together with shapes (4,) (2,2)
A generalization of that could be to raise exceptions if the Awkward types differ but return True or False depending only on the values.
The distance metric must be using abs
because comparisons with complex numbers do what we'd expect:
>>> np.allclose(np.array([[1, 2], [3, 4]]), np.array([[1+0.00001j, 2+0.00001j], [3+0.00001j, 4+0.00001j]]))
True
but they seem to have not even implemented datetimes or timedeltas. I would think these should be considered separate types, but yet be comparable.
>>> np.allclose(np.array([1, 2, 3, 4], "M8"), np.array([1, 2, 3, 4], "M8"))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: Converting an integer to a NumPy datetime requires a specified unit
and
>>> np.allclose(np.array([1, 2, 3, 4], "m8"), np.array([1, 2, 3, 4], "m8"))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<__array_function__ internals>", line 180, in allclose
File "/home/jpivarski/mambaforge/lib/python3.9/site-packages/numpy/core/numeric.py", line 2265, in allclose
res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
File "<__array_function__ internals>", line 180, in isclose
File "/home/jpivarski/mambaforge/lib/python3.9/site-packages/numpy/core/numeric.py", line 2375, in isclose
return within_tol(x, y, atol, rtol)
File "/home/jpivarski/mambaforge/lib/python3.9/site-packages/numpy/core/numeric.py", line 2356, in within_tol
return less_equal(abs(x-y), atol + rtol * abs(y))
numpy.core._exceptions._UFuncBinaryResolutionError: ufunc 'add' cannot use operands with types dtype('float64') and dtype('<m8')
That last one is surely a NumPy bug; I'll report it. numpy/numpy#23157
It considers booleans to be numerical, which is not typical of NumPy.
>>> np.allclose(np.array([False, True, False, True]), np.array([0.00000001, 1.00000001, 0.00000001, 1.00000001]))
True
(From the other discussion)
Right, to be precise it's implemented mostly in terms of ufuncs, and it broadcasts.
Well, this is a generic broadcasting error. I'd not put that in the "this is a noisy function box", otherwise we'd have to do the same for all ufuncs, for example.
So if we generalised NumPy, we'd only error on structural differences rather than type alone; if it broadcasts, then it's allowed.
We definitely could do something rich-like and make it easier to spot why an assert failed, sure. However, I think our L1 users are most likely to want a boolean, not an assert. My view is that we have different functions here. We can implement an Awkward-specific |
Okay, our need is different enough here (types matter more to Awkward than to NumPy) that we can introduce a new function with a new name. But perhaps it should be in What would you think about making it a regular operation? |
Just want to acknowledge that I appreciate the note, but that there's no need to apologise for this kind of stuff - we're all allowed to change our mind based upon new information!
Yes, if you're happy that we have our own not-like-NumPy operation, let's do that. |
Yeah, let's just do it as a regular operation: Someday, we might need |
f8e39b8
to
5066df9
Compare
@jpivarski how do you feel about |
ak.testing.arrays_approx_equal
ak.approx_equal
That's a good name. For all of the operations, it's implicit that some argument is going to be an array or array-like. |
73eb5b9
to
8e01d77
Compare
…kward into agoose77/feat-arrays-equal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thanks!
Fixes #1105