PlanNode Expression-to-RowExpression Migration Plan #12546

Closed
8 tasks done
highker opened this issue Mar 28, 2019 · 1 comment

highker commented Mar 28, 2019

Background

Detaching the AST (Node) from the IR (PlanNode) has been discussed for years. In particular, we will replace Expression references in PlanNode with RowExpression. This is one of several IR cleanup efforts. Other major refactorings include moving out TableLayout/Handle (cc: @hellium01), Symbol, ConnectorId, etc., and adding data properties, traits, and a subquery node (cc: @oerling). A clean IR benefits multiple long-term projects that are on the roadmap:

Current Status and Goal

The current lifecycle of a plan (before it is compiled into operators) is:

  1. building AST
  2. building raw plan
  3. plan optimization
  4. plan sanity check
  5. plan cost computation
  6. building subplans
  7. distributing subplan (over the wire)
  8. compiling subplan locally

Expression-to-RowExpression translation happens at step 8 as of today. We are moving it to step 3 (and, in the future, to step 2). The translation goes to step 3 rather than step 2 for now because the optimizers reference Expression heavily. Once we finish cleaning up the optimizers as well, we can build PlanNode with RowExpression in step 2.
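
To make the stage numbering concrete, here is a minimal sketch that models the eight steps as an enum and asks which stages run after the translation point. The names `PlanStage`, `step`, and `runsAfter` are hypothetical and exist only for illustration; nothing like this is assumed to exist in the codebase.

```java
// Hypothetical model of the plan lifecycle listed above; illustration only.
enum PlanStage {
    BUILD_AST,
    BUILD_RAW_PLAN,
    OPTIMIZE,
    SANITY_CHECK,
    COMPUTE_COST,
    BUILD_SUBPLANS,
    DISTRIBUTE_SUBPLANS,
    COMPILE_LOCALLY;

    // 1-based step number, matching the numbered list in the issue.
    int step() {
        return ordinal() + 1;
    }

    // True if this stage runs strictly after the stage where
    // Expression-to-RowExpression translation happens, so it would
    // operate on already translated expressions.
    boolean runsAfter(PlanStage translationStage) {
        return ordinal() > translationStage.ordinal();
    }
}
```

With translation at `COMPILE_LOCALLY` (step 8), no stage runs after it, so every earlier stage still works on Expression. With translation at `OPTIMIZE` (step 3), stages 4 through 8 run after it, which is consistent with the cost estimation rules needing a rewrite.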

Solution in a Nutshell

There is no shortcut here. Most of the utilities (in sql.planner) need functionally equivalent RowExpression counterparts during the migration. Ultimately, the utilities for Expression will go away. Examples include the interpreter, equivalence, domain translator, etc. The stats rules for cost estimation also need to be rewritten.

We also need a new class called RawExpression : RowExpression to wrap an Expression during the migration (7ca44b1#diff-ea10e702a57ffd7f6cafb50fdd029617R91). This way, no optimizers need to change: they can keep working on Expression as if nothing had changed.
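
The wrapper idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from the linked commit: `Expression`, `RowExpression`, and `FilterNode` are reduced to bare stand-ins, and `SqlText` and `Migration` are hypothetical helpers introduced only for this example.

```java
// Hypothetical, simplified stand-ins; these are NOT the real class definitions.
abstract class Expression {              // AST-side expression
    abstract String toSql();
}

abstract class RowExpression {}          // IR-side expression

// Hypothetical leaf AST expression used only to have something to wrap.
final class SqlText extends Expression {
    private final String sql;

    SqlText(String sql) {
        this.sql = sql;
    }

    @Override
    String toSql() {
        return sql;
    }
}

// Lets a PlanNode field be typed as RowExpression while it still
// carries an untranslated AST Expression.
final class RawExpression extends RowExpression {
    private final Expression expression;

    RawExpression(Expression expression) {
        if (expression == null) {
            throw new NullPointerException("expression is null");
        }
        this.expression = expression;
    }

    Expression getExpression() {
        return expression;
    }
}

// Simplified plan node whose predicate is already typed as RowExpression.
final class FilterNode {
    private final RowExpression predicate;

    FilterNode(RowExpression predicate) {
        this.predicate = predicate;
    }

    RowExpression getPredicate() {
        return predicate;
    }
}

// Hypothetical helpers an unmigrated optimizer could call at its boundary.
final class Migration {
    private Migration() {}

    static RowExpression wrap(Expression expression) {
        return new RawExpression(expression);
    }

    static boolean isRaw(RowExpression rowExpression) {
        return rowExpression instanceof RawExpression;
    }

    static Expression unwrap(RowExpression rowExpression) {
        if (!(rowExpression instanceof RawExpression)) {
            throw new IllegalArgumentException("not a wrapped Expression");
        }
        return ((RawExpression) rowExpression).getExpression();
    }
}
```

An unmigrated optimizer would call `Migration.unwrap(node.getPredicate())`, rewrite the Expression as before, and `wrap` the result again, so the PlanNode signature can change before the optimizer logic does.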

PlanNode::getOutputSymbols may also need to be enhanced to carry layout information (i.e., channel mapping). In the long term, Symbol will go away as well.
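
One way to read "layout information" is a mapping from each output symbol to its channel, meaning its position in the node's output. The sketch below assumes exactly that interpretation; `OutputLayout` and `channelsOf` are hypothetical names, and symbols are represented as plain strings for brevity.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: derives a symbol-to-channel mapping from the
// ordered output symbols of a plan node. Illustration only.
final class OutputLayout {
    private OutputLayout() {}

    static Map<String, Integer> channelsOf(List<String> outputSymbols) {
        Map<String, Integer> channels = new LinkedHashMap<>();
        for (int channel = 0; channel < outputSymbols.size(); channel++) {
            String symbol = outputSymbols.get(channel);
            // A symbol appearing twice would make the mapping ambiguous.
            if (channels.put(symbol, channel) != null) {
                throw new IllegalArgumentException("duplicate output symbol: " + symbol);
            }
        }
        return channels;
    }
}
```

For output symbols `[orderkey, totalprice]`, this assigns `orderkey` to channel 0 and `totalprice` to channel 1, which is the kind of information an expression needs once it refers to inputs by position instead of by Symbol.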

Plan

We divide and conquer, migrating each PlanNode and Symbol separately. A corresponding utility will be migrated as well if it blocks the migration of a PlanNode. These PlanNodes are:


highker commented Apr 3, 2019

Detailed migration plan:
IR.pdf

cc: @mbasmanova @hellium01 @rongrong @cemcayiroglu @oerling

cc: @zhenxiao for optimizer participating query optimization.
