Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: allow python scalars in binary elementwise functions #100

Merged
merged 3 commits into from
Jan 24, 2025

Conversation

ev-br
Copy link
Contributor

@ev-br ev-br commented Nov 22, 2024

towards #96

@ev-br ev-br marked this pull request as draft November 22, 2024 09:17
@ev-br ev-br marked this pull request as ready for review November 22, 2024 12:46
@ev-br
Copy link
Contributor Author

ev-br commented Nov 22, 2024

TODO:

What's here now is likely too simple because this way python scalars can upcast arrays. So depending on what ends up in the standard this might need a tweak for "weak scalars" in the NEP 50 sense.

@ev-br
Copy link
Contributor Author

ev-br commented Nov 25, 2024

cross-ref data-apis/array-api#862

@asmeurer
Copy link
Member

We need to make sure this properly disallows cross-kind combinations that aren't supported. That will presumably match the text here https://data-apis.org/array-api/latest/API_specification/type_promotion.html#mixing-arrays-with-python-scalars (if data-apis/array-api#841 is implemented, real -> complex will also be allowed).

@ev-br ev-br marked this pull request as draft November 28, 2024 08:12
@ev-br
Copy link
Contributor Author

ev-br commented Nov 28, 2024

The whole binop/binary elementwise functions is probably worth a larger rework. ATM there is duplication of logic in matching pairs, like xp.add and Array.__add__, and the duplicate logic is subtly different, and scalars specifically, array methods seem to be ahead.

Maybe the right thing to do is to transfer the logic from array methods to elementwise functions and delegate from functions to array methods. There's a precedent in torch dynamo, which just registers ufuncs as ndarray methods: https://github.com/pytorch/pytorch/blob/main/torch/_numpy/_ndarray.py#L197.

This nicely avoids duplication and slight incompatibilities.

@ev-br
Copy link
Contributor Author

ev-br commented Nov 29, 2024

In the last commit,

  1. Generate all binary "ufuncs" in a uniform way, with a decorator
  2. Make binary "ufuncs" follow the same logic of the binary operators
  3. Reuse the test loop of Array.__binop__ for binary "ufuncs" to make sure the type promotion etc is consistent between pairs of __mul__ and xp.multiply and their ilk.

@ev-br ev-br marked this pull request as ready for review November 29, 2024 21:54
array_api_strict/_helpers.py Outdated Show resolved Hide resolved
@ev-br ev-br force-pushed the scalars branch 4 times, most recently from f173afc to 5ddba5b Compare December 1, 2024 08:08
@ev-br
Copy link
Contributor Author

ev-br commented Dec 1, 2024

The CI failure is unrelated: data-apis/array-api-tests#280, hopefully worked around by data-apis/array-api-tests#325

@ev-br ev-br requested a review from asmeurer December 1, 2024 13:33
@ev-br ev-br force-pushed the scalars branch 2 times, most recently from 7a33199 to 17d8018 Compare December 2, 2024 14:28
@@ -234,6 +234,8 @@ def _check_device(self, other):
elif isinstance(other, Array):
if self.device != other.device:
raise ValueError(f"Arrays from two different devices ({self.device} and {other.device}) can not be combined.")
else:
raise TypeError(f"Cannot combine an Array with {type(other)}.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When does it happen that we get here with other being neither a scalar nor an array? My expectation was somehow that this wouldn't happen

Maybe worth adding a comment given that it isn't immediately obvious to me now and probably also not obvious to readers from the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thank you! This is interference from #103.
Should probably decide on that PR one way or another and rebase this one.

Comment on lines +167 to +168
"__truediv__": "floating-point",
"__xor__": "integer or boolean",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my education, is there a reason for renaming these other than aesthetics? In particular the integer_or_boolean makes me wonder if there was a reason why it was created without spaces in it

Copy link
Contributor Author

@ev-br ev-br Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally not aesthetics! This is to match the keys from https://github.com/data-apis/array-api-strict/blob/main/array_api_strict/_dtypes.py#L117 , which are "integer or boolean" without underscores. Now that _check_op_array_scalar from test_array_object.py is reused in test_elementwise_functions.py, and there is looping over the keys of the _dtype_categories dict, it saves a bit of gymnastics to map different spellings onto each other.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha! That sounds totally sensible, less 🤸

@betatim
Copy link
Member

betatim commented Dec 3, 2024

I'm a little undecided on the changes to _elementwise_functions.py. It reduces the amount of content in the file and the amount of "dumb" search&replace that needs doing when we want to edit it. This is nice. However the current main is super explicit and straight-forward, which is nice if you need to special case something and for understanding what is what. So yeah.

@ev-br
Copy link
Contributor Author

ev-br commented Dec 3, 2024

Re copy-paste vs generate functions: while I'm definitely not too attached to the current way (it's borderline too meta to my tastes TBH), what I'd be worried about is consistency. Too easy for different functions to drift apart slightly, and we're trying to be super-consistent here, so I'd guess this way will be a little less maintenance going forward. Even at an expense of an initial "what on earth is this" kind of first reaction.

@betatim
Copy link
Member

betatim commented Dec 4, 2024

Let's see what others think. My main worry with the code generation approach is that we forget to do something which leads to some subtle breakage somewhere in a way that will be tricky to debug. A bit similar to how you should use functools.wraps for decorators even though at (very) first glance it seems like it is trivial to do yourself, except it isn't :D

Let's see what opinions and feelings others have.

@ev-br ev-br closed this Dec 4, 2024
@ev-br ev-br reopened this Dec 4, 2024
@lucascolley
Copy link
Member

Re copy-paste vs generate functions: while I'm definitely not too attached to the current way (it's borderline too meta to my tastes TBH), what I'd be worried about is consistency. Too easy for different functions to drift apart slightly, and we're trying to be super-consistent here, so I'd guess this way will be a little less maintenance going forward. Even at an expense of an initial "what on earth is this" kind of first reaction.

Let's see what others think. My main worry with the code generation approach is that we forget to do something which leads to some subtle breakage somewhere in a way that will be tricky to debug. A bit similar to how you should use functools.wraps for decorators even though at (very) first glance it seems like it is trivial to do yourself, except it isn't :D

Let's see what opinions and feelings others have.

My inclination is towards leaving as is with the straight-forward approach. I think the "what on earth is this" might be a net-negative on the maintenance burden, but of course that depends on how many different people will have to maintain the file. Even if it did increase maintainability slightly I think I agree with Tim that it could lead to bugs which are tricky to debug.

But I don't feel strongly, happy for it to go the other way if people feel so.

@rgommers
Copy link
Member

rgommers commented Jan 7, 2025

I'd go for merging this change. One may be able to argue that there's slightly easier to understand code patterns and the refactor could have been done a bit differently, but it's hard to argue against deduplicating many of the same 5 lines. With this, adding a validation change like _maybe_normalize_py_scalars can be done in a single place, and adding a function that is added to the standard is a 1-2 line change. Seems like a win overall; if a bug shows up it shouldn't be very disruptive and probably easily fixable.

@betatim
Copy link
Member

betatim commented Jan 13, 2025

I'm fine with the decision to use this PR and merge it.

ev-br added 3 commits January 24, 2025 12:21
Allow func(array, scalar) and func(scalar, array), raise on
func(scalar, scalar) if API_VERSION>=2024.12

cross-ref data-apis/array-api#807

To make sure it is all uniform,

1. Generate all binary "ufuncs" in a uniform way, with a decorator
2. Make binary "ufuncs" follow the same logic of the binary operators
3. Reuse the test loop of Array.__binop__ for binary "ufuncs"
4. (minor) in tests, reuse canonical names for dtype categories
   ("integer or boolean" vs "integer_or_boolean")
@ev-br ev-br merged commit 4847400 into data-apis:main Jan 24, 2025
21 checks passed
@ev-br
Copy link
Contributor Author

ev-br commented Jan 24, 2025

data-apis/array-api#862 has landed, this PR has been okay'd, so lets land this one, too. Thanks to all the reviewers!

@ev-br ev-br mentioned this pull request Jan 25, 2025
11 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants