-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support polars non-strict expand_selector #368
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #368 +/- ##
==========================================
+ Coverage 85.90% 86.11% +0.21%
==========================================
Files 41 41
Lines 4328 4337 +9
==========================================
+ Hits 3718 3735 +17
+ Misses 610 602 -8 ☔ View full report in Codecov by Sentry. |
@jrycw if you have a min, do you mind taking a look? Hopefully this rounds out our polars column selection work! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@machow it's amazing how quickly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@machow Overall, it looks good. We might need to update the documentation after merging this PR.
great_tables/_tbl_data.py
Outdated
@@ -313,38 +329,64 @@ def _(data: PlDataFrame, expr: Union[list[str], _selector_proxy_], strict: bool | |||
from operator import or_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these legacy imports?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm referring to the lines from functools import reduce
and from operator import or_
.
great_tables/_tbl_data.py
Outdated
all_selectors = [ | ||
cs.by_name(x) if isinstance(x, str) else cs.by_index(x) if isinstance(x, int) else x | ||
for x in expr | ||
] | ||
|
||
_validate_selector_list(all_selectors) | ||
# validate all entries ---- | ||
_validate_selector_list(all_selectors, strict=False if expand_opts else True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I would call _validate_selector_list(all_selectors, **expand_opts)
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, good catch!
This PR adds support for polars expressions to be used in column selection in addition to polars selectors. It requires polars v0.20.30, but falls back to the old behavior (or requiring a selector object) otherwise.
Note that this is the result of a chain of things:
Polars
#360Which resulted in polars super quickly adding broader support for accepting Expressions and selectors that only result in column selection (and not e.g. transformation, or aliasing; PR pola-rs/polars#16479).
I tried to give reasonably helpful error messages for people who are not on v0.20.3 that use Expr, but also suspect people will upgrade polaras patch versions fairly quickly.
Example
Note both that a list is supported along with the use of expressions (e.g.
pl.col("fctr")
)