[SPARK-24865] Remove AnalysisBarrier #21822
Conversation
Test build #93306 has finished for PR 21822 at commit
Test build #93307 has finished for PR 21822 at commit
Test build #93308 has finished for PR 21822 at commit
// Lookup WindowSpecDefinitions. This rule works with unresolved children.
case WithWindowDefinition(windowDefinitions, child) =>
  child.transform {
    // TODO(rxin): Check with Herman whether the next line is OK.
cc @hvanhovell
It is good. The earlier resolveOperators makes sure we don't overwrite a window spec with a similarly named one defined higher up the tree. BTW I don't think we have a test that covers this (it is pretty rare).
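A hypothetical illustration of that rare case (the table `t`, its columns, and an active SparkSession named `spark` are assumptions, not taken from an existing test): both query blocks define a window named `w`, and the inner definition must not be clobbered by the outer one.

```scala
// Hypothetical example: nested query blocks that each define a window named `w`.
// The inner AVG should resolve against the inner definition, not the outer one.
spark.sql("""
  SELECT id, SUM(v) OVER w AS s
  FROM (
    SELECT id, v, AVG(v) OVER w AS a
    FROM t
    WINDOW w AS (PARTITION BY id ORDER BY v)
  )
  WINDOW w AS (PARTITION BY id)
""")
```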
  dedupAttr(a, attributeRewrites)
case s: SubqueryExpression =>
  s.withNewPlan(dedupOuterReferencesInSubquery(s.plan, attributeRewrites))
// TODO(rxin): Why do we need transformUp here?
cc @hvanhovell @cloud-fan Why do we need transformUp here?
@@ -533,7 +537,8 @@ trait CheckAnalysis extends PredicateHelper {
    // Simplify the predicates before validating any unsupported correlation patterns
    // in the plan.
    BooleanSimplification(sub).foreachUp {
      // TODO(rxin): Why did this need to call BooleanSimplification???
cc @dilipbiswal @cloud-fan @gatorsmile
Why did we need BooleanSimplification here?
@rxin From what I remember, Reynold, most of this logic was housed in the Analyzer before and we moved it to the optimizer. In the old code we used to walk the plan after simplifying the predicates. The comment used to read "Simplify the predicates before pulling them out.". I just retained those semantics.
Thanks. I'm going to add it back.
Yeah, I added boolean simplification here. I didn't quite like it back then, and I still don't like it. I was hoping this was happening in the Optimizer now.
@hvanhovell Hi Herman, as you said, we do the actual pulling up of the predicates in the optimizer, in PullupCorrelatedPredicates in subquery.scala. We also do a BooleanSimplification there before traversing the plan. Here we are doing the error reporting, and I thought it would be better to keep the traversal the same way. Basically, previously we did both the error reporting and the rewriting in the Analyzer; now we do the error reporting in checkAnalysis and the rewriting in the Optimizer. Just to refresh your memory so you can help make the right call here :-)
Well, tests fail without it, so we don't really have a choice here. For a second I thought we could also create some utils class, but that would just mean moving the code in BooleanSimplification into that class just for aesthetics.
@hvanhovell Yeah. I agree.
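As a rough sketch of the pattern under discussion (assumed helper names; this is not the actual CheckAnalysis code), the subquery plan's predicates are simplified first and the plan is then walked bottom-up, so the pattern checks see the simplified predicates:

```scala
import org.apache.spark.sql.catalyst.optimizer.BooleanSimplification
import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan}

// Hypothetical placeholder for the real correlation-pattern checks.
def isSupportedCorrelatedFilter(f: Filter): Boolean = true

def checkSubqueryPlan(sub: LogicalPlan, failAnalysis: String => Unit): Unit = {
  // Simplify first (e.g. `p OR false` collapses to `p`), then validate bottom-up.
  BooleanSimplification(sub).foreachUp {
    case f: Filter if !isSupportedCorrelatedFilter(f) =>
      failAnalysis(s"Unsupported correlated predicate pattern in:\n$f")
    case _ => // other operators are checked elsewhere in the real code
  }
}
```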
Test build #93310 has finished for PR 21822 at commit
Test build #93320 has finished for PR 21822 at commit
retest this please
Test build #93325 has finished for PR 21822 at commit
@@ -787,6 +782,7 @@ class Analyzer(
          right
        case Some((oldRelation, newRelation)) =>
          val attributeRewrites = AttributeMap(oldRelation.output.zip(newRelation.output))
          // TODO(rxin): Why do we need transformUp here?
cc @cloud-fan why do we need transformUp here?
we still need to transform the resolved plan here to resolve self-joins. Imagine:
val df = ...
df.as("a").join(df.as("b"), ...)
We need to look into the resolved plan to replace the conflicting attributes.
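A more concrete, hypothetical version of that snippet (assuming a local SparkSession) would be:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("self-join").getOrCreate()
import spark.implicits._

// Both sides of the join come from the same analyzed plan, so their attribute IDs
// initially conflict; the analyzer rewrites one side's attributes (the
// `attributeRewrites` map above) before resolving the join condition.
val df = Seq((1, "x"), (2, "y")).toDF("id", "v")
val joined = df.as("a").join(df.as("b"), $"a.id" === $"b.id")
joined.show()
```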
Test build #93351 has finished for PR 21822 at commit
case SubqueryAlias(_, child) => child
// This is actually called in the beginning of the optimization phase, and as a result
// is using transformUp rather than resolveOperators. This is also often called in the
//
note: finish comment
object LogicalPlan {

  private val resolveOperatorDepth = new ThreadLocal[Int] {
todo: explain what this is
 *
 * @param rule the function used to transform this node's children
 */
def resolveOperators(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan = {
todo: add unit tests
retest this please
Test build #93366 has finished for PR 21822 at commit
retest this please
Test build #93370 has finished for PR 21822 at commit
Test build #93375 has finished for PR 21822 at commit
retest this please
Test build #93380 has finished for PR 21822 at commit
Test build #93420 has finished for PR 21822 at commit
retest this please
@rxin, I think this PR could have some performance impact, judging from the latest test run above (https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93470/) and from a rough scan of other builds:
vs https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7/4699/consoleFull
There could of course be other factors, such as the machine's status, but I thought it was worth noting while I take a look into it.
Yeah, the extra check in test cases might've contributed to the longer test time. Let me think about how to reduce it.
Test build #93470 has finished for PR 21822 at commit
I changed the way we do the checks in tests to use a thread local rather than inspecting the stack trace, so they should run faster now. I also added test cases for the various new methods and moved the relevant code into AnalysisHelper for better code structure. This should be ready now if tests pass.
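A minimal sketch of that kind of check (assumed names; not the actual AnalysisHelper code): a thread-local depth counter is incremented while the analyzer is running, and the test-only assertion simply reads it instead of walking the stack trace.

```scala
// Sketch only: a thread-local depth counter instead of stack-trace inspection for
// the test-only "don't call transform* inside the analyzer" check.
object ResolveGuard {
  private val depth = new ThreadLocal[Int] {
    override def initialValue(): Int = 0
  }

  // Wrap analyzer entry points in this so nested invocations are counted correctly.
  def markInAnalyzer[T](body: => T): T = {
    depth.set(depth.get + 1)
    try body finally depth.set(depth.get - 1)
  }

  // Called from the transform* methods when test assertions are enabled.
  def assertNotCalledFromAnalyzer(): Unit = {
    assert(depth.get == 0,
      "transform* should not be called in the analyzer; use resolveOperators* instead")
  }
}
```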
@@ -751,7 +751,8 @@ object TypeCoercion {
   */
  case class ConcatCoercion(conf: SQLConf) extends TypeCoercionRule {

    override protected def coerceTypes(plan: LogicalPlan): LogicalPlan = plan transform { case p =>
    override protected def coerceTypes(
        plan: LogicalPlan): LogicalPlan = plan resolveOperatorsDown { case p =>
I'm using a weird wrapping here to minimize the diff.
Test build #93524 has finished for PR 21822 at commit
retest this please
Test build #93527 has finished for PR 21822 at commit
Test build #93526 has finished for PR 21822 at commit
Test build #93533 has finished for PR 21822 at commit
Test build #93630 has finished for PR 21822 at commit
retest this please
Test build #93648 has finished for PR 21822 at commit
LGTM, merging to master!
…enkins build (from 300m to 340m)
What changes were proposed in this pull request? Currently, it looks like we hit the time limit from time to time, so it seems better to increase the time a bit; for instance, please see apache#21822. For clarification, the current Jenkins timeout is 400m; this PR just proposes to fix the test script to increase it correspondingly. This PR does not target to change the build configuration.
How was this patch tested? Jenkins tests.
Closes apache#21845 from HyukjinKwon/SPARK-24886.
Authored-by: hyukjinkwon <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>
What changes were proposed in this pull request?
AnalysisBarrier was introduced in SPARK-20392 to improve analysis speed (don't re-analyze nodes that have already been analyzed).
Before AnalysisBarrier, we already had some infrastructure in place, with analysis-specific functions (resolveOperators and resolveExpressions). These functions do not recursively traverse into subplans that are already analyzed (tracked by a mutable boolean flag, _analyzed). The issue with the old system was that developers started using transformDown, which does a top-down traversal of the plan tree, because there was no top-down resolution function, and as a result analyzer performance became pretty bad.
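For context, a highly simplified sketch of that pre-barrier mechanism (assumed names; not the real Spark classes) looks roughly like this: each node carries a mutable analyzed flag, and the resolve* functions short-circuit on it, whereas transform* would blindly re-visit the whole tree.

```scala
// Simplified sketch of the pre-AnalysisBarrier approach (not the real classes):
// an `analyzed` flag on each plan node lets resolve* skip finished subtrees.
abstract class Plan {
  var analyzed: Boolean = false
  def mapChildren(f: Plan => Plan): Plan

  def resolveOperatorsUp(rule: PartialFunction[Plan, Plan]): Plan = {
    if (analyzed) {
      this // already analyzed: do not traverse into this subtree again
    } else {
      val withNewChildren = mapChildren(_.resolveOperatorsUp(rule))
      rule.applyOrElse(withNewChildren, identity[Plan])
    }
  }
}
```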
To fix the issue, SPARK-20392 introduced AnalysisBarrier as a special node that transform/transformUp/transformDown do not traverse into. However, the introduction of this special node caused a lot more trouble than it solved. The implicit node breaks assumptions and code in a few places, and it is hard to know when an analysis barrier will exist and when it won't. A simple search for AnalysisBarrier in PR discussions demonstrates it is a source of bugs and additional complexity.
Instead, this pull request removes AnalysisBarrier and reverts to the old approach. We added test infrastructure that fails explicitly if transform methods are used in the analyzer.
How was this patch tested?
Added a test suite AnalysisHelperSuite for testing the resolve* methods and transform* methods.