-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce scope to semantic analyzer and update correlation in Apply node #5302
Conversation
5220e36
to
111dd05
Compare
111dd05
to
752c4a4
Compare
{ | ||
// expressions at this point can not have sub queries so deny all access checks | ||
// in the future, we will need a full access controller here to verify access to functions | ||
ExpressionAnalyzer analyzer = create(new Analysis(), session, metadata, sqlParser, new DenyAllAccessControl(), false); | ||
Node root = Iterables.size(expressions) > 0 ? ExpressionUtils.and(expressions) : TRUE_LITERAL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we create a boolean expression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When somebody calls Analysis.getScope(node)
and such node
has not have mapping to scope. Then to find its scope a visitor is used, it finds the nearest parent node of this node
which has a scope set and returns it as a node
scope. This search is started from the root node.
Here I wanted to create such artificial
root for all the expressions. But now, when you have asked I realized that is not needed and what is worse it introduces unnecessary confusion. So I removed that. Instead I create multiple roots, one per each expression.
d52ea1f
to
afda16c
Compare
@martint Comments applied and rebased. |
afda16c
to
836af32
Compare
private final Optional<Scope> parent; | ||
|
||
private RelationType relation; | ||
private Optional<Scope> lateralScope = Optional.empty(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to call out the lateral scope explicitly? Unless I'm mistaken, when lateral is used, the lateral derived table is behaves almost like a subquery, and the relation on the left of it acts as the "parent" scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your suggestion has worked. I used the "parent" scope instead.
836af32
to
0a090df
Compare
@martint All the comments applied and rebased. Does spliting pull requests help you reviewing? If so I can extract query rewites as a separate pull request. |
private final Session session; | ||
private final SqlParser sqlParser; | ||
|
||
public Visitor( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put args on the same line
0a090df
to
091608d
Compare
@martint rebased, comments applied, ping :) |
// Input fields == Output fields | ||
analysis.setOutputExpressions(node, descriptorToFields(queryBodyScope)); | ||
|
||
queryScope = Scope.builder().withParent(withScope).withRelationType(queryBodyScope.getRelationType()).withApproximate(approximate).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put each .withXXX
in a separate line for readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
I extracted |
091608d
to
320fc53
Compare
@martint Rebased and applied all the comments. Also, please notice that I extracted some refactoring commits as a separate review. |
Travis errored with same failures as for master branch. |
@kokosing, I think there are some comments/questions that still need to be addressed. Can you check? Also, this doesn't rebase cleanly after I merged the EXISTS and statement rewriter PRs. |
320fc53
to
d0a5482
Compare
Inner and outer `with` queries had the same query body. So it was possible that test would pass while having wrong query executed (false positive test result).
Scope is an object which contains information about entities available for current SQL clauses. It contains current RelationType as well as information about outer queries RelationTypes and available named queries. Semantic analyzer was changed in a way that each visit method gets and returns Scope objects. The input scope gives information about the context of given node. Contains information about outer relation types and accessible RelationType. Returned Scope contains information about produced entities (RelationType). Scopes are stored in Analysis per AST nodes. Each to find a Scope for a AST node (expression) we find the closest parent which has Scope set in Analysis.
All the column references of subquery plan which are not resolved (do not come from subquery relations) are collected. SubqueryPlanner check if it is able to rewrite such references to symbol produced by its relation, if it is able then add replace column reference with that symbol and that this symbol to the correlation list of Apply node. If SubqueryPlanner is not able to rewrite that column reference, then it is assumed the outer SubqueryPlanner execution will do that. If there is no outer SubqueryPlanner then SemanticException should be thrown.
When symbols produced by subquery in ApplyNode are not consumed by any of the outer nodes then such subquery and hence ApplyNode can be optimized out. In that case ApplyNode is replaced by its input.
I just removed tests whose the main purpose was to check if logical plan assertions are working. Instead I left test whose the main purpose is to check if logical planner return correct plan. Logical plan assertions are tested anyway, I do think we need special tests for that.
In order to that some modification to logical plan assertions had to be applied: - Adding CorrelationMatcher - refactoring SymbolAliases to ExpressionAliases - extending ExpressionVerifier to support more expressions
From: ``` anyTree PlanMatchPattern { PlanMatchPattern { PlanNodeMatcher{nodeClass=class com.facebook.presto.sql.planner.plan.FilterNode} FilterMatcher{predicate=(3 IN ("c"))} PlanMatchPattern { PlanNodeMatcher{nodeClass=class com.facebook.presto.sql.planner.plan.ApplyNode} CorrelationMatcher{correlation=[C, O]} PlanMatchPattern { PlanNodeMatcher{nodeClass=class com.facebook.presto.sql.planner.plan.ProjectNode} PlanMatchPattern { PlanNodeMatcher{nodeClass=class com.facebook.presto.sql.planner.plan.TableScanNode} TableScanMatcher{expectedTableName=orders} SymbolMatcher{alias=O, pattern=orderkey} SymbolMatcher{alias=C, pattern=custkey} } } anyTree PlanMatchPattern { PlanMatchPattern { PlanNodeMatcher{nodeClass=class com.facebook.presto.sql.planner.plan.ApplyNode} CorrelationMatcher{correlation=[L]} PlanMatchPattern { PlanNodeMatcher{nodeClass=class com.facebook.presto.sql.planner.plan.TableScanNode} TableScanMatcher{expectedTableName=lineitem} SymbolMatcher{alias=L, pattern=orderkey} } PlanMatchPattern { PlanNodeMatcher{nodeClass=class com.facebook.presto.sql.planner.plan.EnforceSingleRowNode} PlanMatchPattern { PlanNodeMatcher{nodeClass=class com.facebook.presto.sql.planner.plan.ProjectNode} PlanMatchPattern { PlanNodeMatcher{nodeClass=class com.facebook.presto.sql.planner.plan.ValuesNode} } } } } } } } } ``` to: ``` anyTree node(FilterNode) FilterMatcher{predicate=(3 IN ("c"))} node(ApplyNode) CorrelationMatcher{correlation=[C, O]} node(ProjectNode) node(TableScanNode) TableScanMatcher{expectedTableName=orders} SymbolMatcher{alias=O, pattern=orderkey} SymbolMatcher{alias=C, pattern=custkey} anyTree node(ApplyNode) CorrelationMatcher{correlation=[L]} node(TableScanNode) TableScanMatcher{expectedTableName=lineitem} SymbolMatcher{alias=L, pattern=orderkey} node(EnforceSingleRowNode) node(ProjectNode) node(ValuesNode) ```
d0a5482
to
f7b02b0
Compare
@martint Sorry about missing comments. I had to do something wrong with the git and I lost some changes. Please notice that I restored the This pull request still need some work. I need to add unit tests for correlated Anyway, it would be good to merge some of the commits as this pull request is getting bigger and bigger. I have heard that you are going on vacation so I don't if it is possible in near future. |
Why can't we model it as:
Semantically, UNNEST implies "CROSS JOIN LATERAL", so I don't see why we'd treat it any different from common LATERAL. |
According to my understanding, what you have proposed is the semantic of that SQL constructs. It it is the only way it can be modeled to make this construct work correctly. Moreover, in this PR it scopes are implemented that way, but maybe it is not obvious because of the usage of I will try to remove |
No description provided.