-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NestedLoopJoin will panic when right child contains RepartitionExec #5022
Comments
@ygf11 Do you find the root cause why the panic happened?
From your example, the left side is partitioned and the right side is required single-partitioned. The output partition should be consistent with left side. The right single-partition will be executed one more time, is it not allowed to be called multiple times? Maybe it is the point why it caused panic |
@liukun4515, that is the cause. We can see in If |
Yes, we got the same reason, but I don't know why the |
Other physicalexec have the same behavior? |
The
the state will shared between multiple runtime/coroutine context, if one coroutine execute the |
I don't know if there is other physical plan has the same behavior.
The bug of example is the case of I think maybe it is an another cause. @liukun4515 @alamb |
Let me explain more. The main difference is the way of iteration. // Given a partition number - x
// execute right-partition-x
for batch in right-partition-x {
join(batch, left-data-(single partition))
} For // Given a partition number - x
// execute left-partition-x
for batch in left-partition-x {
join(batch, right-data(single partition))
} If we use // Given a partition number - x
// Execute right-partition-0 --- right partition is a single partition and execute many times
for batch in right-partition-0 {
join(batch, left-data(partition-x))
} |
I think the In the NLJ, we need the outer-side/outer table and inner-side/inner table.
The inner table will be traveled many times in the most basic implementation. In the current implementation, the left table is the outer table, the right table is the inner table. For the
I think this implementation of iter is matched with the algorithm of the NLJ. |
I think we can change the function
to
temporarily If we can not support executing multi times for one physical executor for the same partition. It will not block your fix for the issue #4866 cc @ygf11 |
Thanks for response @liukun4515. I think I find the way to fix this bug.
IMOP, The relationship between
The algorithm will be like:
we can see left table will be visited one more time, and right table will be visited only once.
Yes, visiting a relation many time is a common operation. Currently the implementation will always load/cache( I think the queries will success after that. |
Describe the bug
A clear and concise description of what the bug is.
When I work on #4866, I find some
NestedLoopJoin
s will panic.This sql will panic:
To Reproduce
See above.
Expected behavior
The join should work.
It seems the partition of right child may execute one more time in
NestedLoopJoin
, if it has multiple output partitions. It is ok in most time, but when meetsRepartitionExec
it will panic.Additional context
Add any other context about the problem here.
The text was updated successfully, but these errors were encountered: