Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for constant expression evaluation in limit #9790

Merged
merged 5 commits into from
Mar 27, 2024

Conversation

mustafasrepo
Copy link
Contributor

@mustafasrepo mustafasrepo commented Mar 25, 2024

Which issue does this PR close?

Closes #9506.

Rationale for this change

Add support for queries in following kind:

select * from '1.parquet' limit 1 + 1;

Unfortunately, I couldn't apply the suggestion because of the library dependencies.

What changes are included in this PR?

Are these changes tested?

Yes

Are there any user-facing changes?

@github-actions github-actions bot added sql SQL Planner sqllogictest SQL Logic Tests (.slt) labels Mar 25, 2024
/// * `Result<i64>` - An `Ok` variant containing the constant result if evaluation is successful,
/// or an `Err` variant containing an error message if evaluation fails.
fn get_constant_result(expr: &Expr, arg_name: &str) -> Result<i64> {
match expr {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use ConstEvaluator and see if it is evaluated to a constant value

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To use this helper. I have imported datafusion-optimizer to datafusion-sql crate. This dependency caused an error in test_deps test. Hence I am retracting these changes for now.

Copy link
Member

@jonahgao jonahgao Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we change the Limit logical plan to support arbitrary expressions?

pub struct Limit {
    pub skip: Expr,
    pub fetch: Option<Expr>,
    pub input: Arc<LogicalPlan>,
}

The SimplifyExpressions rule can automatically optimize them into constants. Some optimization rules such as PushDownLimit only run when the limit expression is a constant. We may need to add a cast for the limit expression when planning, only checking if it is a constant of type u64.

When creating the LimitExec physical plan, convert the limit expression into PhysicalExpr and evaluate it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we change the Limit logical plan to support arbitrary expressions?

I agree this would be the "correct" way to support arbitrarily simplifiable expressions in the limit clause

However, I suspect it might be a major change

So I guess it comes down to "how important is the + / - case -- if it is really important then we can proceed with this PR, but if the real use case is arbitrary expressions, it probably makes sense to do the more substantial change as suggested by @jonahgao

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have experimented with the keeping Expr in the limit state. However, it is not a trivial change. Also, it is not clear when to do error checking, when to ignore errors while converting skip and fetch to their corresponding u64 versions. I suggest, we first merge this PR. Then, we can have a support for arbitrary expressions in another PR (Also I think, we can postpone this feature until a use case hits.).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed #9821 to track

@github-actions github-actions bot added the optimizer Optimizer rules label Mar 25, 2024
@github-actions github-actions bot removed the optimizer Optimizer rules label Mar 25, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @mustafasrepo

I agree this PR makes the situation better (error message is better and more cases are supported). Thank you very much @mustafasrepo

We can improve the situation more in the future and I filed #9821 to track that . Thank you for the suggestions @jonahgao and @Dandandan

🚀

datafusion/sql/src/query.rs Show resolved Hide resolved
datafusion/sqllogictest/test_files/select.slt Show resolved Hide resolved
@mustafasrepo mustafasrepo merged commit 1b6ae8f into apache:main Mar 27, 2024
23 checks passed
Lordworms pushed a commit to Lordworms/arrow-datafusion that referenced this pull request Apr 1, 2024
* Add support for constant expression evaluation in limit

* Use existing const evaluator

* Revert "Use existing const evaluator"

This reverts commit 99b7a55.

* Update datafusion/sql/src/query.rs

Co-authored-by: Andrew Lamb <[email protected]>

* Add negative tests

---------

Co-authored-by: Andrew Lamb <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sql SQL Planner sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Got "LIMIT must not be negative" error as long as it is not a literal
4 participants