lance dataset that overrides Dataset.scanner and Dataset.head #158

Merged
changhiskhan merged 2 commits into main from changhiskhan/lance-dataset
Sep 12, 2022

Conversation

changhiskhan
Contributor

closes #141

@changhiskhan changhiskhan requested a review from eddyxu September 12, 2022 19:12
Contributor

@eddyxu eddyxu left a comment

Minor nits

Dataset,
FileFormat,
FileWriteOptions,
CDataset,
Contributor

This indentation looks off.

Contributor Author

Fixed. Not sure why PyCharm is doing that.

        scanner = self.scanner(limit=n, offset=offset)
        return scanner.to_table()

    def scanner(self, *args, **kwargs):
Contributor

Are we going to deprecate lance.scanner()? We could just remove it in this PR.

Contributor Author

Makes sense. Callers can just do lance.dataset(...).scanner(...) instead.

Contributor Author

Fixed.
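
Editor's sketch of the pattern discussed in this thread: head() delegates to the dataset's own scanner() with limit and offset, so a single overridden scanner() serves every read path. This is not the code from the PR. ToyScanner and ToyDataset are hypothetical stand-ins written in plain Python (no lance or pyarrow), and the assumption that offset is applied before limit is the editor's.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


@dataclass
class ToyScanner:
    """Hypothetical stand-in for a scanner: skips `offset` rows, then keeps `limit` rows."""

    rows: List[Row]
    limit: Optional[int] = None
    offset: int = 0

    def to_table(self) -> List[Row]:
        # Assumption: offset is applied first, then limit.
        end = None if self.limit is None else self.offset + self.limit
        return self.rows[self.offset:end]


class ToyDataset:
    """Hypothetical dataset whose head() is built on top of scanner()."""

    def __init__(self, rows: List[Row]):
        self._rows = list(rows)

    def scanner(self, *args, **kwargs) -> ToyScanner:
        # Single entry point for reads; a subclass only needs to override this.
        return ToyScanner(self._rows, *args, **kwargs)

    def head(self, n: int, offset: int = 0) -> List[Row]:
        # Same shape as the two lines quoted from the diff above.
        scanner = self.scanner(limit=n, offset=offset)
        return scanner.to_table()


ds = ToyDataset([{"id": i} for i in range(10)])
first_page = ds.head(3)
second_page = ds.head(3, offset=3)
last_page = ds.head(5, offset=8)  # fewer than 5 rows remain
```

Because head() never touches the rows directly, replacing scanner() in a subclass changes what head() returns without any further overrides.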

@@ -36,6 +36,16 @@ def test_simple_round_trips(tmp_path: Path):
assert table == actual


def test_head(tmp_path: Path):
Contributor

Can we add a test to make sure DuckDB works with this dataset?

Contributor Author

There is an existing unit test that runs a DuckDB query (that is how I discovered the problem originally).

@changhiskhan changhiskhan merged commit cc3e442 into main Sep 12, 2022
@changhiskhan changhiskhan deleted the changhiskhan/lance-dataset branch September 12, 2022 21:14
Development

Successfully merging this pull request may close these issues.

Add convenience function for limit/offset
2 participants