Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLUTEN-7362][VL] Add test for 'IN' and 'OR' filter in Scan #7363

Merged
merged 1 commit into from
Nov 14, 2024

Conversation

zml1206
Copy link
Contributor

@zml1206 zml1206 commented Sep 26, 2024

What changes were proposed in this pull request?

Fixes: #7362
Resolved by #6754, add test for it.

How was this patch tested?

UT

@github-actions github-actions bot added the VELOX label Sep 26, 2024
Copy link

#7362

@FelixYBW
Copy link
Contributor

Is this a Gluten issue?

@zml1206
Copy link
Contributor Author

zml1206 commented Sep 27, 2024

Is this a Gluten issue?

Yes, the first singularOrList in OR needs to check in range.

@zml1206 zml1206 mentioned this pull request Sep 27, 2024
@zml1206
Copy link
Contributor Author

zml1206 commented Sep 27, 2024

cc @rui-mo Thank you.

@FelixYBW FelixYBW requested a review from rui-mo September 27, 2024 15:34
Copy link
Contributor

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. These are my thoughts.

runQueryAndCompare(
"select count(1) from lineitem " +
"where (l_shipmode in ('TRUCK', 'MAIL') or l_shipmode in ('AIR', 'FOB')) " +
"and l_shipmode in ('RAIL','SHIP')") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this issue is in the condition (A or B) and C, when C is already pushed down, the (A or B) cannot be pushed down.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but only for in, which is singularOrList.

@@ -1669,7 +1669,8 @@ bool SubstraitToVeloxPlanConverter::canPushdownOr(
}
Copy link
Contributor

@rui-mo rui-mo Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I suggest we add a setOrRange function in RangeRecorder which returns false when other conditions already exist for a field. We could return false if it returns false at L1665 in the canPushdownOr function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your suggestion, updated.

if (!canPushdownSingularOrList(singularOrList, true)) {
return false;
}
uint32_t fieldIdx = getColumnIndexFromSingularOrList(singularOrList);
// Disable IN pushdown for int-like types.
if (!rangeRecorders.at(fieldIdx).setInRange(true /*forOrRelation*/)) {
if (!rangeRecorders.at(fieldIdx).setInRange(sign /*forOrRelation*/)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this change as this logic is checking the nested expressions in an or condition, while here the issue is or cannot be pushed down totally.

Copy link
Contributor

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@@ -286,6 +286,16 @@ class SubstraitToVeloxPlanConverter {
return true;
}

/// Set the existence of or-range and returns whether it can coexist with
/// existing conditions for this field.
bool setOrRange() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry but I just notice 'setMultiRange' is for the 'or' condition, so we can just use it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setMultiRange seems to conflict with setCertainRangeForFunction of 1681L.

@@ -239,6 +239,13 @@ class MiscOperatorSuite extends VeloxWholeStageTransformerSuite with AdaptiveSpa
"select l_orderkey from lineitem " +
"where l_partkey in (1552, 674) or l_partkey in (1552) and l_orderkey > 1") { _ => }
checkLengthAndPlan(df, 73)

runQueryAndCompare(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have merged d6326f0. Would you like to try if this issue has been fixed? Perhaps we can merge this test to ensure the functionality. Thanks for your patience.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I will try it locally later.

@zml1206 zml1206 changed the title [GLUTEN-7362][VL] Fix push down or with singularOrList [GLUTEN-7362][VL] Add test for push down in and or Nov 14, 2024
@zml1206
Copy link
Contributor Author

zml1206 commented Nov 14, 2024

Local test passed, I modified pr to only add ut. @rui-mo

@zml1206 zml1206 changed the title [GLUTEN-7362][VL] Add test for push down in and or [GLUTEN-7362][VL] Add unit test for push down in and or Nov 14, 2024
@zml1206 zml1206 changed the title [GLUTEN-7362][VL] Add unit test for push down in and or [GLUTEN-7362][VL] Add test for push down in and or to scan Nov 14, 2024
@zml1206 zml1206 requested a review from rui-mo November 14, 2024 08:47
@rui-mo rui-mo changed the title [GLUTEN-7362][VL] Add test for push down in and or to scan [GLUTEN-7362][VL] Add test for 'IN' and 'OR' filter in Scan Nov 14, 2024
Copy link
Contributor

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@rui-mo rui-mo merged commit 21b4e65 into apache:main Nov 14, 2024
49 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[VL] Diff of in_or_and
3 participants