Projects require unique expressions names error in substrait producer/consumer #10815
Comments
Isn't the repro trying to alias different column names (PS_PARTKEY, PS_SUPPKEY) to the same alias (K1)? Why would you want to do that? 😅
Ahh...that was my mistake. One of those should be k2. I was trying to get a simpler repro of a much larger query with multiple joins. However...now that I have a more proper query, I am running into a different issue. This is the query I have now:
And this is the substrait error from that:
So the original issue I was hitting was DataFusion trying to run a Substrait plan generated by DuckDB. The error from that is the same error I put in the description.
I added to the substrait support epic: #5173
Hmm not sure how you got the NotImplemented error - maybe somehow running a quite old DataFusion? However with the query
I do get the same error as you originally:
This is because Substrait includes aliases for neither tables nor columns. I'm trying to see if I can add that to Substrait; it'd make these things easier to support: substrait-io/substrait#648
Ahhh yea...I was on an older version.
Given that names don't matter in Substrait (the final names are provided), is the problem solvable within the Substrait consumer for DataFusion? Shouldn't the consumer be able to rename the columns to whatever it wants? Stepping further back, I wonder if the check is needed at all here -- is it trying to prevent extra work, or is it trying to prevent confusion on its part later on? It may be designed for the case where the fields are named the same but come from different sources, which isn't happening here. Perhaps the check needs to be made more precise?
As discussed on the Substrait ticket, yes it can be solved, but not in a nice way.
It can; however, given that the user has named the columns/tables one way in the original plan, it can be quite confusing to the user if they are named very differently in the plan that is actually executed.
This plan results in a cross join, so the fields do refer to different sources (or the same table but different sides of the join), so they are different columns.
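To make the "consumer renames the columns" idea from the discussion above concrete, here is a minimal sketch of what such a renaming pass could look like. It is not DataFusion's or Substrait's actual code: `dedupe_names` and the `__N` suffix scheme are hypothetical, chosen only to illustrate how duplicate expression names in a projection could be made unique before the uniqueness check runs.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not a DataFusion API): returns the input names with
/// repeats made unique by appending a `__N` suffix.
fn dedupe_names(names: &[&str]) -> Vec<String> {
    // Every name already emitted, including generated ones.
    let mut used: HashSet<String> = HashSet::new();
    let mut out = Vec::with_capacity(names.len());
    for name in names {
        let mut candidate = name.to_string();
        let mut suffix = 1;
        // `insert` returns false when the candidate is already taken,
        // so keep trying new suffixes until one is free.
        while !used.insert(candidate.clone()) {
            candidate = format!("{}__{}", name, suffix);
            suffix += 1;
        }
        out.push(candidate);
    }
    out
}
```

Under this scheme, a projection that yields `ps_partkey` twice (for example, one from each side of a join) would come out as `ps_partkey` and `ps_partkey__1`. As noted above, the cost is that the executed plan no longer carries the names the user originally wrote.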
Describe the bug
The DataFusion Substrait producer/consumer is unable to produce/consume a Substrait plan that uses the same column names with different aliases.
To Reproduce
Error:
Expected behavior
No response
Additional context
No response