Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add backend argument to lazy #1895

Merged
merged 18 commits into from
Feb 1, 2025

Conversation

raisadz
Copy link
Contributor

@raisadz raisadz commented Jan 29, 2025

What type of PR is this? (check all applicable)

  • πŸ’Ύ Refactor
  • ✨ Feature
  • πŸ› Bug Fix
  • πŸ”§ Optimization
  • πŸ“ Documentation
  • βœ… Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

@raisadz raisadz marked this pull request as ready for review January 29, 2025 18:56
@FBruzzesi FBruzzesi added the enhancement New feature or request label Jan 29, 2025
Copy link
Member

@FBruzzesi FBruzzesi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @raisadz thanks a ton! this is a great start πŸš€

I have some general concerns on the API/design itself, for pandas and pyarrow specifically, and a suggestion for the docstring - all the comments in the code

Comment on lines 550 to 554
supported_backends = (
Implementation.DASK,
Implementation.DUCKDB,
Implementation.POLARS,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RIP Pyspark πŸ˜‚ Happy to add it later no worries, I am just kidding!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I will try to look into adding PySpark in a follow-up PR

Comment on lines 145 to 146
def lazy(self: Self, *, backend: Implementation | None = None) -> Self:
return self
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LazyFrame does not have backend option anyway.

If it would, I feel like I rather raise if the passed backend here was different than None or Implementation.DUCKDB

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the review, @FBruzzesi!

I added the backend argument to be able to run the following code:

import narwhals.stable.v1 as nw_v1
nw_v1.from_native(duckdb_rel).lazy()

which otherwise made the tests fail for V1. I also now added ValueError if backend is not None for DuckDB.

Copy link
Member

@FBruzzesi FBruzzesi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @raisadz , I left a tiny suggestion for DuckDB only.

In the meanwhile, I adjusted .collect to allow backend to take the following types: str | ModuleType | Implementation | None - see #1734

We might want to allow the same here? It can be a follow up PR anyway, but I just wanted to make you aware of it.

# backwards compatibility because in `narwhals.stable.v1`
# function `.from_native()` will return a DataFrame for DuckDB.

if backend is not None: # pragma: no cover
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think duckdb should be allowed?

Suggested change
if backend is not None: # pragma: no cover
if backend not in (None, Implementation.DUCKDB): # pragma: no cover

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think it needs to be - if this path only exists for v1 backwards-compatibility, then #1895 (comment) is something somebody could have written, but they wouldn't have passed a backend, so raising for backend is not None wouldn't break anyone's code

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @raisadz ! and @FBruzzesi for reviewing

@MarcoGorelli MarcoGorelli merged commit eb4eff0 into narwhals-dev:main Feb 1, 2025
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants