argmin and argmax #70

Closed
jpivarski opened this issue Jan 14, 2020 · 2 comments · Fixed by #165

@jpivarski
Member

Like the NumPy functions of the same name, these operations return an integer array of the positions of the minimum and maximum values per subarray. (By the way, this is technically not a reducer in the sense of issue #69 and it would have an independent implementation.)

>>> import numpy
>>> nparray = numpy.array([[1.1, 2.2, 3.3], [4.4, 5.5, 6.6], [7.7, 8.8, 9.9]])
>>> nparray.argmin(axis=1).tolist()
[0, 0, 0]
>>> nparray.argmax(axis=1).tolist()
[2, 2, 2]

The old Awkward Array had a different return type for this function:

>>> import awkward
>>> akarray = awkward.fromiter([[1.1, 2.2, 3.3], [4.4, 5.5, 6.6], [7.7, 8.8, 9.9]])
>>> akarray.argmin().tolist()
[[0], [0], [0]]
>>> akarray.argmax().tolist()
[[2], [2], [2]]

It returned a jagged array of positions so that it could handle sublists without a minimum or maximum, which are possible because jagged arrays can contain empty lists.

>>> akarray = awkward.fromiter([[1.1, 2.2, 3.3], [], [4.4, 5.5]])
>>> akarray.argmin().tolist()
[[0], [], [0]]
>>> akarray.argmax().tolist()
[[2], [], [1]]

These jagged integer arrays could then be used as slices, so it was a useful deviation from NumPy.
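To make the old convention concrete, here is a minimal pure-Python sketch of it. The helpers `argmin_jagged` and `take_jagged` are hypothetical names invented for illustration; they are not part of the old Awkward Array and only mimic the behavior shown above on plain lists.

```python
def argmin_jagged(lists):
    """Hypothetical helper: old-style argmin, giving a list of 0 or 1 positions per sublist."""
    # An empty sublist has no minimum, so it maps to an empty list of positions.
    return [[min(range(len(x)), key=x.__getitem__)] if x else [] for x in lists]


def take_jagged(lists, positions):
    """Hypothetical helper: use a jagged list of positions as a per-sublist slice."""
    return [[x[i] for i in idx] for x, idx in zip(lists, positions)]


data = [[1.1, 2.2, 3.3], [], [4.4, 5.5]]
positions = argmin_jagged(data)        # [[0], [], [0]]
minima = take_jagged(data, positions)  # [[1.1], [], [4.4]]
```

Because the positions keep the same nesting depth as the data, they can be applied directly as a slice, which is the convenience described above.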

However, deviating at all from NumPy is bad, especially if the function has the same name and mostly the same meaning. Instead of what was done in the old Awkward, the above should return

>>> ak.tolist(ak.argmin(akarray))
[0, None, 0]
>>> ak.tolist(ak.argmax(akarray))
[2, None, 1]

For regular arrays, this behavior would be identical to NumPy's; for irregular ones, users are introduced to OptionType arrays (most likely IndexedOptionArray, which has already been implemented, but possibly ByteMaskedArray or BitMaskedArray).
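As a rough sketch of the proposed semantics (plain Python, not the library's implementation), the result drops one level of nesting and uses `None` where a sublist is empty. The second half illustrates the general idea behind an IndexedOptionArray-style encoding, in which a negative index marks a missing value. All function names here are hypothetical.

```python
def argmin_option(lists):
    """Hypothetical helper: one position per sublist, or None if the sublist is empty."""
    return [min(range(len(x)), key=x.__getitem__) if x else None for x in lists]


def argmax_option(lists):
    """Hypothetical helper: same as argmin_option, but for the maximum."""
    return [max(range(len(x)), key=x.__getitem__) if x else None for x in lists]


def to_indexed_option(values):
    """Encode a list with None entries as (index, content); index -1 means missing."""
    index, content = [], []
    for v in values:
        if v is None:
            index.append(-1)
        else:
            index.append(len(content))
            content.append(v)
    return index, content


def from_indexed_option(index, content):
    """Decode (index, content) back into a list with None entries."""
    return [None if i < 0 else content[i] for i in index]


data = [[1.1, 2.2, 3.3], [], [4.4, 5.5]]
mins = argmin_option(data)                # [0, None, 0]
maxs = argmax_option(data)                # [2, None, 1]
index, content = to_indexed_option(mins)  # ([0, -1, 1], [0, 0])
```

On a regular (non-jagged) input no sublist is empty, so no `None` appears and the result matches what NumPy would return along the last axis.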

I'll be allowing these arrays as slices in #67 (following the behavior established by pyarrow), so they'll also be useful.

@nsmith- might want to comment on this change in behavior.

@jpivarski added the feature (New feature or request) label Jan 14, 2020
@nsmith-
Member

nsmith- commented Jan 16, 2020

I think for argmax indeed it makes much more sense to reduce the dimension by 1, reflecting the behavior of max.

@jpivarski assigned jpivarski and unassigned ianna Mar 13, 2020
@jpivarski linked a pull request Mar 16, 2020 that will close this issue
@jpivarski
Member Author

Also relevant: the new ak.singletons function in #198. This is needed to turn the ak.argmax output into something that can select one object from each list.

Maybe there will also need to be a convenient syntax for it, since the examples in the tutorial are looking like:

#  select <- to singletons <- argmax <- the cut
events.pions[ak.singletons(ak.argmax(abs(events.pions.vtx), axis=1))]
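For readers following along without the library, the pipeline in that one-liner can be sketched on plain Python lists. The `singletons` function below is a hypothetical stand-in that mimics the described behavior (each value becomes a one-element list, each `None` becomes an empty list), and `pions_vtx` is made-up data, not anything from the tutorial.

```python
def singletons(values):
    """Hypothetical stand-in: wrap each value in a list; None becomes an empty list."""
    return [[] if v is None else [v] for v in values]


# Made-up per-event vertex values; the second event has no pions.
pions_vtx = [[0.1, -2.5, 1.0], [], [0.3, 0.2]]

# argmax of abs(vtx) per event, with None for the empty event
best = [max(range(len(x)), key=lambda i: abs(x[i])) if x else None
        for x in pions_vtx]            # [1, None, 0]

# to singletons, so the positions can select within each list
picks = singletons(best)               # [[1], [], [0]]

# select one object (or none) from each event
selected = [[x[i] for i in idx] for x, idx in zip(pions_vtx, picks)]
# [[-2.5], [], [0.3]]
```

The singletons step restores the nesting level that the reduction removed, which is why the selection keeps one list per event.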
