Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(datafusion): Support pushdown more datafusion exprs to Iceberg #649

Merged
merged 3 commits into from
Oct 12, 2024

Conversation

FANNG1
Copy link
Contributor

@FANNG1 FANNG1 commented Sep 26, 2024

fixes: #644

DataFusion expressions and Iceberg predicates do not have a one-to-one correspondence.

  1. Columns and literals are represented as Expr in DataFusion, whereas they are merely components of predicates in Iceberg.
  2. And and Or are operators in DataFusion, but they are treated as predicates in Iceberg.

This disparity complicates the transformation process when using TreeNode. To address this, I also refactored the code to simplify it.

@FANNG1 FANNG1 marked this pull request as draft September 26, 2024 01:47
@FANNG1 FANNG1 changed the title Support pushdown more datafusion exprs to Iceberg [SIP] feat: Support pushdown more datafusion exprs to Iceberg Sep 26, 2024
@FANNG1 FANNG1 changed the title [SIP] feat: Support pushdown more datafusion exprs to Iceberg feat: Support pushdown more datafusion exprs to Iceberg Sep 28, 2024
@FANNG1 FANNG1 marked this pull request as ready for review September 29, 2024 00:09
@FANNG1
Copy link
Contributor Author

FANNG1 commented Sep 29, 2024

@liurenjie1024 @Xuanwo @sdd @a-agmon please help to review when you are free, thanks

@FANNG1 FANNG1 changed the title feat: Support pushdown more datafusion exprs to Iceberg feat(datafusion): Support pushdown more datafusion exprs to Iceberg Sep 29, 2024
Copy link
Contributor

@a-agmon a-agmon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @FANNG1
Thats a great refactor. Make so much more sense.
Left one small comment on typo, and table.rs is behind #650 . It Will not conflict directly, but might be better merge. (It also makes it easier to test)

TransformedResult::Predicate(Predicate::Binary(BinaryExpression::new(op, r, d)))
}

fn reserve_predicate_operator(op: PredicateOperator) -> PredicateOperator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you meant reverse in the function name rather than reserve?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, thanks for pointing, updated

Copy link
Contributor

@sdd sdd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @FANNG1 - this looks like a really elegant solution to a tricky problem. Thanks for the PR! Approved :-)

Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @FANNG1 for this and thank you @sdd for the review, let's move.

@Xuanwo Xuanwo merged commit 026b436 into apache:main Oct 12, 2024
16 checks passed
@FANNG1
Copy link
Contributor Author

FANNG1 commented Oct 12, 2024

thanks @sdd @a-agmon @Xuanwo for the reviewing.

shaeqahmed pushed a commit to matanolabs/iceberg-rust that referenced this pull request Dec 9, 2024
…pache#649)

* suppport more datafusion predict

* add it

* reverse op
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support pushdown more datafusion exprs to Iceberg
4 participants