[BUG] GroupBy Transform return incorrect results #11067
Comments
The issue may simply be that we discard the index during a groupby transform, while pandas maintains it:
In [13]: df
Out[13]:
a b
7 2 1
6 4 2
5 1 3
4 1 4
3 2 5
2 3 6
1 1 7
In [14]: df.groupby('a').transform('max')
Out[14]:
b
0 5
1 2
2 7
3 7
4 5
5 6
6 7
In [15]: df.to_pandas().groupby('a').transform('max')
Out[15]:
b
7 5
6 2
5 7
4 7
3 5
2 6
1 7
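The comparison above can be expressed as a small index check. The sketch below uses pandas only, so it runs without a GPU. `same_index` is a hypothetical helper name (not part of cuDF or pandas), and the "dropped index" result is simulated with `reset_index(drop=True)` rather than produced by cuDF itself.

```python
import pandas as pd

# Same data as In [13]: a non-consecutive, descending index.
df = pd.DataFrame(
    {"a": [2, 4, 1, 1, 2, 3, 1], "b": [1, 2, 3, 4, 5, 6, 7]},
    index=[7, 6, 5, 4, 3, 2, 1],
)


def same_index(result, source):
    """Hypothetical helper: True if a transform result kept the source's index."""
    return result.index.equals(source.index)


# pandas keeps the input index on the transform result (as in Out[15]).
pandas_result = df.groupby("a").transform("max")

# Assumption: mimic Out[14] by keeping the values but resetting to 0..n-1.
simulated_dropped = pandas_result.reset_index(drop=True)

print(same_index(pandas_result, df))
print(same_index(simulated_dropped, df))
```

A check of this shape compares indexes as well as values, which is what a values-only comparison would miss here.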
@shwina I think we may be susceptible to this class of error (dropping an input’s index or name when creating the output) in other places too. I caught a similar problem in #10715. If you can determine whether this is (1) not being tested or (2) not being caught by tests, that would be helpful. (edit: this problem is slightly different from the class of error I had in mind -- nothing special is needed here. We're probably fine.)
I verified that #11068 fixes this issue.
I believe this should close #11067, but I'm unable to reproduce the original bug locally. Will report back here once I'm able to do that.
Edit: it does.
Authors:
- Ashwin Srinath (https://github.com/shwina)
Approvers:
- GALI PREM SAGAR (https://github.com/galipremsagar)
- Bradley Dice (https://github.com/bdice)
URL: #11068
Describe the bug
cuDF's groupby transform returns incorrect results
Steps/Code to reproduce bug
I have noticed that when the index contains non-consecutive numbers, cuDF groupby transform returns incorrect results.
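One way a dropped index could surface as wrong values is through index alignment on assignment. The following is a pandas-only sketch of that mechanism, not the code from the linked notebook: the data is taken from the comment thread, and the cuDF output is simulated by resetting the index (an assumption).

```python
import pandas as pd

# Non-consecutive index, as in the thread's example.
df = pd.DataFrame(
    {"a": [2, 4, 1, 1, 2, 3, 1], "b": [1, 2, 3, 4, 5, 6, 7]},
    index=[7, 6, 5, 4, 3, 2, 1],
)

# Result with the original index preserved (pandas behaviour).
kept = df.groupby("a").transform("max")

# Assumption: simulate a result whose index was replaced by 0..n-1.
dropped = kept.reset_index(drop=True)

# Column assignment aligns on index labels, not on row position.
good = df.assign(b_max=kept["b"])
bad = df.assign(b_max=dropped["b"])

print(good)
print(bad)  # label 7 has no match, and labels 1..6 pick up other rows' values
```

With the index preserved, every row receives its own group maximum. With the reset index, rows are matched by label, so values land on the wrong rows and the unmatched label becomes NaN.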
Expected behavior
cuDF groupby transform should return the same results as pandas, with the original index preserved.
Environment overview (please complete the following information)
cuDF version: 22.04.00
Additional context
Here is a Jupyter notebook that reproduces the error:
https://github.com/cdeotte/RAPIDS-development/blob/master/bugs/bug005.ipynb