Indexing of a GroupedDataFrame by group #1693
Comments
The syntax you propose fits more to DataFramesMeta.jl than DataFrames.jl. I guess what you propose should be rather handled by a custom |
I've thought many times that |
@nalimilan. So is your proposal the following:
|
Agreed.
No, I don't think using puns with this syntax is a good idea. Standard indexing should be OK (with two dimensions).
When would that be useful? |
So how do you want the user to select some specific group from a
You select group |
of course assuming that you know |
Having thought about it a bit, maybe a better strategy would be to leave The only drawback would be that now you can create |
The decision on this should also mean what we return with |
I imagine we could support something like
Yes but is that an actual use case? I mean, that sounds reasonable, but in what concrete operations does one need this? Anyway, I agree using
Sorry, I don't follow. What kind of index would that be? |
We could do it (but this is breaking, so we should do it sooner rather than later). However, I think that instead of
In interactive use it is not very useful. But if you write a script that iterates groups it could be useful to get information about the values of grouping variables, something like:
instead of what you currently have to do:
Looking at this, what we have now is not that bad, but it is maybe not an obvious pattern. I am not sure it is worth adding. |
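For readers on a recent DataFrames.jl release, here is a minimal sketch of iterating over groups together with the values of their grouping variables. It is not the snippet that was lost from the comment above. The column names, the data, and the `totals` dictionary are made up for illustration, and it assumes a version where `pairs(gd)` is available for a `GroupedDataFrame`.

```julia
using DataFrames

# Illustrative data; the column names are assumptions, not taken from the thread.
df = DataFrame(a = [1, 1, 2], x = [10, 20, 30])
gd = groupby(df, :a)

totals = Dict{Int, Int}()
for (key, sdf) in pairs(gd)
    # key is a GroupKey; key.a is the value of grouping column :a for this group,
    # and sdf is the SubDataFrame holding that group's rows.
    totals[key.a] = sum(sdf.x)
end
```

The key carries the grouping values, so the loop body does not need to read them back out of the first row of each group.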
The same kind of index that e.g. JuliaDB uses for indexing. |
Why is it breaking? AFAICT that's an error now.
OK, makes sense.
So what you propose is to have |
This is true. I mean that changing normal indexing into
Yes. Actually you could add this information "in place" to the same This is something I think was mentioned when discussing H2O benchmarks that I do not want to say that this is a must, but I just wanted to explore the possibilities before we make a decision. |
I still don't understand. What's the "normal indexing"?
Yeah, that's interesting. A related thing that we could do is to support primary keys and/or keeping track of whether a OTOH it's not very typical of Julia to discreetly mutate the argument in a function which doesn't end with |
With "normal indexing" I mean that Regarding the index issue - I agree that an explicit |
Ah, yes, that's the main (or only) conflict with the |
Yes. In general (of course no pressure and no rush 😄) it would be best at some point to write down the contract around these issues related to |
Now we support indexing by values. |
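As a hedged illustration of what indexing by values looks like, the sketch below assumes a recent DataFrames.jl release (1.x) and uses made-up columns `a`, `b`, and `x`. A group is looked up by the values of its grouping columns, given either as a `Tuple` or as a `NamedTuple`, or by a `GroupKey` obtained from `keys(gd)`.

```julia
using DataFrames

# Illustrative data; not taken from the thread.
df = DataFrame(a = [1, 1, 2, 2], b = [1, 2, 1, 2], x = [1, 2, 3, 4])
gd = groupby(df, [:a, :b])

# Index by the values of the grouping columns, given positionally as a Tuple ...
g_tuple = gd[(1, 2)]
# ... or as a NamedTuple whose names match the grouping columns.
g_named = gd[(a = 1, b = 2)]

# keys(gd) holds one GroupKey per group; a GroupKey can also be used as an index.
k = first(keys(gd))
g_first = gd[k]
```

Each of these lookups returns a single group as a `SubDataFrame`. Asking for a combination of values that is not present raises an error rather than returning an empty group.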
Should we allow indexing of a grouped DataFrame by the grouping variables?
gd(:a == 1 && :b == 2)
which would return a grouped dataframe of all the groups matching those values (say there is a third grouping variable, :c).
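Note that in plain Julia `:a == 1` compares a `Symbol` with an integer and is simply `false`, so the expression above could only work inside a macro. As a rough sketch of how the same selection can be written with ordinary functions, the example below assumes a recent DataFrames.jl release (1.x) and invented data. It finds the positions of the groups whose `:a` and `:b` values match, whatever the value of `:c`, and indexes the `GroupedDataFrame` with those positions.

```julia
using DataFrames

# Illustrative data with three grouping columns; not taken from the issue.
df = DataFrame(a = [1, 1, 1, 2], b = [2, 2, 3, 2], c = [1, 2, 1, 1], x = [1, 2, 3, 4])
gd = groupby(df, [:a, :b, :c])

# Positions of the groups whose key has a == 1 and b == 2, ignoring c.
idx = findall(k -> k.a == 1 && k.b == 2, keys(gd))

# Indexing with a vector of integers returns a GroupedDataFrame
# containing only the selected groups.
sub = gd[idx]
```

Here `sub` is still a `GroupedDataFrame`, so it can be iterated or passed to `combine` like the original.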