pandas grouped summarize fails if group_keys is set to false #457

machow · 2022-11-15T18:54:10Z

Note that siuba expects to put grouping columns on a summarize result by resetting the index. However, when group_keys is set to false, resetting the index fails.

from siuba.data import mtcars
from siuba import _, summarize

# summarize each group via apply, with group keys kept off the result index
mtcars.groupby("cyl", group_keys=False).apply(lambda df: summarize(df, res=_.mpg.mean()))

# note no cyl on index
         res
0  26.663636
0  19.742857
0  15.100000

We should just have a grouped summarize set group_keys to True. It seems like we should also check whether any result columns override grouping columns, and ensure that case doesn't raise an error.
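The group_keys behavior can be seen in plain pandas, without siuba. The sketch below is illustrative only: the small frame stands in for mtcars (the values are made up), and `summarize_mpg` is a hypothetical helper that mimics a one-row summarize result. It is not the change made in siuba.

```python
import pandas as pd

# Small stand-in for mtcars (illustrative values, not the real dataset).
df = pd.DataFrame({"cyl": [4, 4, 6, 6, 8], "mpg": [30.0, 20.0, 21.0, 19.0, 15.0]})


def summarize_mpg(group):
    # Hypothetical helper: returns a one-row frame, like an ungrouped summarize.
    return pd.DataFrame({"res": [group["mpg"].mean()]})


# group_keys=False: the group labels are not added to the result index,
# so there is no cyl level to bring back with reset_index.
without_keys = df.groupby("cyl", group_keys=False)[["mpg"]].apply(summarize_mpg)

# group_keys=True: cyl becomes the outer index level of the result.
with_keys = df.groupby("cyl", group_keys=True)[["mpg"]].apply(summarize_mpg)

# Move cyl back into a column, then discard the leftover per-group row index.
restored = with_keys.reset_index(level="cyl").reset_index(drop=True)

print(without_keys)
print(restored)
```

With group_keys=True, resetting the `cyl` level gives one row per group with the grouping column restored.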

For example, this works in dplyr:

mtcars %>% group_by(cyl) %>% summarize(cyl = mean(mpg))
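For reference, here is a plain-pandas sketch of why the overridden-column case can raise: `reset_index` refuses to insert an index level whose name already exists as a column. The frame is a made-up stand-in for mtcars, and dropping the conflicting level is only one possible way to handle it, not necessarily what siuba does.

```python
import pandas as pd

# Small stand-in for mtcars (illustrative values, not the real dataset).
df = pd.DataFrame({"cyl": [4, 4, 6, 6, 8], "mpg": [30.0, 20.0, 21.0, 19.0, 15.0]})

# A summarized result whose only column is named cyl, indexed by the cyl group key.
res = df.groupby("cyl")["mpg"].mean().to_frame(name="cyl")

# reset_index cannot insert the cyl level because a cyl column already exists.
try:
    res.reset_index()
    raised = False
except ValueError:
    raised = True

# One possible handling (an assumption): drop index levels that the result overrides.
overridden = [name for name in res.index.names if name in res.columns]
safe = res.reset_index(level=overridden, drop=True)

print(raised)
print(safe)
```

Here the result column wins over the grouping column, so the output has a single `cyl` column holding the per-group means.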

machow · 2022-11-17T04:21:38Z

Fixed in #459

machow changed the title ~~pandas grouped summarize fails is group_keys is set to false~~ pandas grouped summarize fails if group_keys is set to false Nov 15, 2022

machow closed this as completed Nov 17, 2022
