Implement TPCH substrait integration test, support tpch_4 and tpch_5 #11311

Merged
merged 5 commits into from
Jul 8, 2024

Conversation

Lordworms
Contributor

Which issue does this PR close?

part of #10710

Closes #.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@Lordworms
Contributor Author

Not much to change for these two queries.

@alamb changed the title from "Implement TPCH substrait integration teset, support tpch_4 and tpch_5" to "Implement TPCH substrait integration test, support tpch_4 and tpch_5" on Jul 7, 2024
Contributor

@alamb alamb left a comment

Thank you @Lordworms -- I had one small suggestion, but I don't think it is necessary

FYI @Blizzara

\n Sort: FILENAME_PLACEHOLDER_0.o_orderpriority ASC NULLS LAST\
\n Aggregate: groupBy=[[FILENAME_PLACEHOLDER_0.o_orderpriority]], aggr=[[count(Int64(1))]]\
\n Projection: FILENAME_PLACEHOLDER_0.o_orderpriority\
\n Filter: FILENAME_PLACEHOLDER_0.o_orderdate >= CAST(Utf8(\"1993-07-01\") AS Date32) AND FILENAME_PLACEHOLDER_0.o_orderdate < CAST(Utf8(\"1993-10-01\") AS Date32) AND EXISTS (<subquery>)\
Contributor

👍

@@ -1297,6 +1297,32 @@ pub async fn from_substrait_rex(
outer_ref_columns,
})))
}
SubqueryType::SetPredicate(predicate) => {
match predicate.predicate_op {
Contributor

It looks like we could use https://docs.rs/substrait/0.35.0/substrait/proto/expression/subquery/struct.SetPredicate.html#method.predicate_op to match on PredicateOp rather than a constant

So like:

match predicate.predicate_op() { 
  PredicateOp::Exists => ...
  other_type => ...
}
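
As a rough illustration of why the accessor is nicer than comparing against a constant, here is a minimal, self-contained sketch. It does not use the `substrait` crate: `PredicateOp`, `SetPredicate`, `from_i32`, and `describe` below are hypothetical stand-ins that only mimic the shape of generated protobuf code (a raw `i32` field plus a typed accessor), and the numeric values are assumptions.

```rust
// Stand-in for a generated protobuf enum (NOT the real substrait type).
// The discriminant values here are assumed for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PredicateOp {
    Unspecified = 0,
    Exists = 1,
    Unique = 2,
}

impl PredicateOp {
    // Convert a raw wire value into a typed variant, if it is known.
    fn from_i32(value: i32) -> Option<Self> {
        match value {
            0 => Some(PredicateOp::Unspecified),
            1 => Some(PredicateOp::Exists),
            2 => Some(PredicateOp::Unique),
            _ => None,
        }
    }
}

// Stand-in for the generated message: the field itself is a raw i32.
struct SetPredicate {
    predicate_op: i32,
}

impl SetPredicate {
    // Typed accessor: unknown raw values fall back to the default variant.
    fn predicate_op(&self) -> PredicateOp {
        PredicateOp::from_i32(self.predicate_op).unwrap_or(PredicateOp::Unspecified)
    }
}

// Matching on the enum names each case instead of relying on magic numbers,
// and the catch-all arm can report the variant by name.
fn describe(predicate: &SetPredicate) -> Result<&'static str, String> {
    match predicate.predicate_op() {
        PredicateOp::Exists => Ok("exists"),
        other_type => Err(format!(
            "unimplemented type {:?} for set predicate",
            other_type
        )),
    }
}
```

With this shape, `describe(&SetPredicate { predicate_op: 1 })` takes the `Exists` arm, while any other value ends up in the catch-all arm with a readable variant name in the error message.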

Contributor Author

Sure, I'll fix it.

Contributor

@Blizzara Blizzara left a comment

Thanks!

match predicate.predicate_op() {
// exist
PredicateOp::Exists => {
let relations = &predicate.tuples;
Contributor

super nit:

Suggested change
- let relations = &predicate.tuples;
+ let relation = &predicate.tuples;

Contributor

in 489b96c

Comment on lines 1321 to 1324
other_type => Err(DataFusionError::Substrait(format!(
"unimplemented type {:?} for set predicate",
other_type
))),
Contributor

would

Suggested change
- other_type => Err(DataFusionError::Substrait(format!(
- "unimplemented type {:?} for set predicate",
- other_type
- ))),
+ other_type => substrait_datafusion_err!(
+ "unimplemented type {:?} for set predicate",
+ other_type
+ ),

work here as well?

Contributor

in f1ae84c

async fn create_context_tpch4() -> Result<SessionContext> {
let ctx = SessionContext::new();

let registrations = vec![
Contributor

This doesn't need to be part of this PR, but how about having a general create_context_tpch(registrations: Vec<(String, String)>) and then writing the vec in the test functions?
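
A minimal sketch of what such a shared helper could look like, under loud assumptions: `MockContext` and its `register_csv` are hypothetical stand-ins for the real session context so the example stays self-contained, the helper is synchronous rather than `async`, and the file paths are made up. Only the idea (one helper, with each test supplying its own registration list) comes from the comment above.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the real session context: it only records
// which table name maps to which file path.
#[derive(Default)]
struct MockContext {
    tables: HashMap<String, String>,
}

impl MockContext {
    // Record a table registration; reject duplicate table names.
    fn register_csv(&mut self, name: &str, path: &str) -> Result<(), String> {
        if self.tables.contains_key(name) {
            return Err(format!("table {name} is already registered"));
        }
        self.tables.insert(name.to_string(), path.to_string());
        Ok(())
    }
}

// One shared helper instead of one create_context_tpchN function per query:
// every (table name, file path) pair is registered in order, and the first
// failure is propagated to the caller.
fn create_context_tpch(registrations: Vec<(&str, &str)>) -> Result<MockContext, String> {
    let mut ctx = MockContext::default();
    for (table_name, file_path) in registrations {
        ctx.register_csv(table_name, file_path)?;
    }
    Ok(ctx)
}
```

Each test would then build its own list, for example `create_context_tpch(vec![("FILENAME_PLACEHOLDER_0", "orders.csv")])`, where the path is a placeholder rather than a real test-data location.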

Contributor

@alamb alamb left a comment

Thanks @Lordworms and @Blizzara -- I implemented @Blizzara's suggestions other than https://github.com/apache/datafusion/pull/11311/files#r1668128566, which I agree would be nicer, though we can keep it for a separate PR

@alamb alamb merged commit e4b54f6 into apache:main Jul 8, 2024
23 checks passed
@alamb
Contributor

alamb commented Jul 8, 2024

🚀

findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
…pache#11311)

* Implement TPCH substrait integration teset, support tpch_4 and tpch_5

* optimize code

* rename variable

* Use error macro

---------

Co-authored-by: Andrew Lamb <[email protected]>
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jul 18, 2024
…pache#11311)

* Implement TPCH substrait integration teset, support tpch_4 and tpch_5

* optimize code

* rename variable

* Use error macro

---------

Co-authored-by: Andrew Lamb <[email protected]>
3 participants