-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
move the type coercion out of the optimizer and refactor the optimizer #3582
Comments
I agree that type coercion doesn't really belong in an "optimizer" as it actually changes the meaning of the exprs (on purpose) where the other optimzier passes are supposed to keep the meaning of the plan the same, but make it faster in some way. I recommend changing the code to explicitly run the type coercion logic immediately prior to running any other optimization passes. I also think the type coercion logic should not depend on any other simplification (such as constant folding) -- it should depend only on the types. Once type coercion is done, then we can run the expr simplifier pass to clean up / simplify the expressions |
I agree - the type coercion should be a mandatory pass that is done after planning. |
You got it. |
Spark has an Analysis phase that runs before the optimizer. Maybe we can learn from their model. Within the analysis phase, they have a number of type coercion rules that they run:
|
Yes, I think we can follow other database or sql system. But It's long way to go and refactor. My plan is to do some refactor, and make good base for the next big refactor like the spark or other system. |
Sounds like a great plan to me |
Here is a PR that contributes to this goal: #3728 |
After this pr merged I will do other Expr for type coercion |
after supporting all expr for type coercion, we can move this rule out of the optimizer, and use it as a separate module |
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
When I do this pr https://github.com/apache/arrow-datafusion/pull/3396/files#
I am stuck by many issue about the type in the optimizer framework.
In other SQL or database system, the data type coercion should be done before any optimization.
But we do it in the physical phase, after the @andygrove work and add the
TypeCoercion
rule in the optimizer. It can be done in the logical phase.After the https://github.com/apache/arrow-datafusion/pull/3396/files# merge, we need to do the refactor.
If the data type is not right, all the operation and optimization will meet some wired issue.
cc @andygrove @alamb
Describe the solution you'd like
A clear and concise description of what you want to happen.
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.
The text was updated successfully, but these errors were encountered: