
pandas grouped summarize fails if group_keys is set to false #457

Closed
machow opened this issue Nov 15, 2022 · 1 comment

machow commented Nov 15, 2022

Note that siuba puts grouping columns back on a summarize result by resetting the index. However, when group_keys is set to False, the grouping columns never reach the index, so resetting the index fails to restore them.

from siuba.data import mtcars
from siuba import _, summarize

mtcars.groupby("cyl", group_keys=False).apply(lambda df: summarize(df, res=_.mpg.mean()))
# note no cyl on index
         res
0  26.663636
0  19.742857
0  15.100000
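
The same index behavior can be reproduced with pandas alone. This is a minimal sketch, assuming a toy frame with made-up values in place of mtcars and a hypothetical helper one_row_mean standing in for the ungrouped summarize call; it is not siuba code.

```python
import pandas as pd

# Toy stand-in for mtcars (made-up values, not the real data)
df = pd.DataFrame({"g": ["a", "a", "b"], "x": [1.0, 3.0, 10.0]})


def one_row_mean(d):
    # Hypothetical helper: returns a one-row frame per group,
    # similar in shape to what an ungrouped summarize returns
    return pd.DataFrame({"res": [d["x"].mean()]})


# Selecting [["x"]] keeps the grouping column out of the frames passed to apply
unkeyed = df.groupby("g", group_keys=False)[["x"]].apply(one_row_mean)
keyed = df.groupby("g", group_keys=True)[["x"]].apply(one_row_mean)

# group_keys=False: the per-group results are concatenated as-is, no "g" on the index
print(unkeyed.index.nlevels, list(unkeyed.index.names))

# group_keys=True: "g" becomes the outer index level
print(keyed.index.nlevels, list(keyed.index.names))

# Only the keyed result lets reset_index recover the grouping column
restored = keyed.reset_index(level="g").reset_index(drop=True)
print(restored)
```

With group_keys=False there is nothing on the index for reset_index to move back into a column, which matches the output above.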

We should just have grouped summarize set group_keys to True. We should also check whether any result column overrides a grouping column, and ensure that doesn't raise an error.

For example, this works in dplyr:

mtcars %>% group_by(cyl) %>% summarize(cyl = mean(mpg))
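
A rough pandas sketch of both ideas, for illustration only. The function grouped_summarize, its signature, and the toy data are assumptions made here, not siuba's implementation or the change made in the fix. It always groups with group_keys=True, and when a summary column has the same name as a grouping column it discards that index level instead of trying to insert a duplicate column. Letting the summary column win is also an assumption of this sketch.

```python
import pandas as pd


def grouped_summarize(df, by, **summaries):
    """Hypothetical helper: one row per group, one column per summary function."""
    value_cols = [c for c in df.columns if c not in by]

    def summarize_one(sub):
        return pd.DataFrame({name: [fn(sub)] for name, fn in summaries.items()})

    # Always use group_keys=True so the keys land on the index of the result
    out = df.groupby(by, group_keys=True)[value_cols].apply(summarize_one)

    # Drop the inner per-group row index, leaving only the key levels
    out = out.droplevel(-1)

    overridden = [key for key in by if key in out.columns]
    kept = [key for key in by if key not in overridden]

    # A plain reset_index would raise here, because the column already exists,
    # so overridden key levels are discarded rather than inserted
    if overridden:
        out = out.reset_index(level=overridden, drop=True)

    # Remaining key levels (if any) become ordinary columns
    return out.reset_index() if kept else out.reset_index(drop=True)


# Toy data with made-up values
cars = pd.DataFrame({"cyl": [4, 4, 6, 8], "mpg": [30.0, 20.0, 21.0, 15.0]})

print(grouped_summarize(cars, ["cyl"], res=lambda d: d["mpg"].mean()))
print(grouped_summarize(cars, ["cyl"], cyl=lambda d: d["mpg"].mean()))
```

The second call mirrors the dplyr example in that a summary named cyl does not raise.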

machow changed the title from "pandas grouped summarize fails is group_keys is set to false" to "pandas grouped summarize fails if group_keys is set to false" on Nov 15, 2022

machow commented Nov 17, 2022

Fixed in #459

machow closed this as completed on Nov 17, 2022