Implement pl.DataFrame.collect similarly to pl.LazyFrame.lazy #6846
Comments
Maybe your use of |
I wouldn't call it nonsensical, it's the exact functionality I want. It's for testing correctness and it's fast enough that I'd prefer to (maybe) have to do a recalculation as opposed to make conditional paths for DataFrame versus LazyFrame. I feel like the argument is the exact same as the argument for
Another example could be that I wanted to plot some intermediate subresults:
def plot(df: pl.LazyFrame | pl.DataFrame):
    data = df.select('value').filter(pl.col('value') > 100).collect()
    my_plot(data)

def calculate_and_plot_thing(df: pl.LazyFrame | pl.DataFrame):
    first_df = df.with_columns(...)
    plot(first_df)
    second_df = first_df.with_columns(...)
    plot(second_df)
    return second_df.with_columns(...)
If I rewrote |
Sorry for being rude. I didn't even know about |
|
Ah yes, that works. Thanks! Given the existence of a |
Closing in favor of #7882 |
Problem description
In the same way pl.LazyFrame has .lazy(), it would be nice if pl.DataFrame had .collect(), so it's easier to write code that works for both (as the docstring for pl.LazyFrame.lazy notes).
My specific motivation is writing some generic test assertions where I now have to do:
If pl.DataFrame.collect was implemented, this could be written as:
If you think it makes sense, I can make a PR.