-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-17751] [SQL] Remove spark.sql.eagerAnalysis and Output the Plan if Existed in AnalysisException #15316
Conversation
Test build #66189 has finished for PR 15316 at commit
|
Huh, when did we remove eager analysis? |
@hvanhovell It was removed when we migrated DataFrame to Dataset. For details, see the PR: #11443 |
Hmmmm. I did kinda like this feature. @liancheng why did we remove this? |
Shouldn't we bring it back? This is an important feature when there is a bug. |
To @hvanhovell @rxin @liancheng listed the reason in the original PR.
However, there is a bug in AnslysisException output. It does not output the plan. I just fixed it. See the updated PR description. |
Do we need a test for this fix? |
Test build #66203 has finished for PR 15316 at commit
|
Test build #66222 has finished for PR 15316 at commit
|
Test build #66221 has finished for PR 15316 at commit
|
@@ -43,6 +43,11 @@ class AnalysisException protected[sql] ( | |||
} | |||
|
|||
override def getMessage: String = { | |||
val planAnnotation = plan.map(p => s";\n$p").getOrElse("") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a separate method here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When hitting an AnalysisException
, we called getMessage
in SQLQueryTestSuite.scala
. That means, we also output the logical plan in the .sql.out
when hitting an AnalysisException
. The logical plan contains expression IDs. Thus, when we doing the text compare between the query results with the contents in .sql.out
, it might fail due to the mismatch of expression IDs. Thus, I added an extra function here. Maybe I need to add a comment to say this is for testing only
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that makes sense.
Test build #66668 has finished for PR 15316 at commit
|
retest this please |
Test build #66687 has finished for PR 15316 at commit
|
cc @hvanhovell @rxin Any more comment about this PR? I assume Spark 2.0.2 needs it. Recently, when we analyzing the JIRA https://issues.apache.org/jira/browse/SPARK-17709, we are unable to see the plan due the analyzer failure. The users have to manually rebuild it with this fix. Then, we can see the failed analyzed plan. |
Also cc @cloud-fan and @liancheng |
LGTM, cc @hvanhovell @rxin for final sign-off |
LGTM |
Test build #67058 has finished for PR 15316 at commit
|
LGTM. Merging to master. Thanks! |
@gatorsmile I cannot merge this 2.0. Can you open a backport for this? |
Sure, will do it soon. Thanks! |
…utput the Plan if Existed in AnalysisException ### What changes were proposed in this pull request? This PR is to backport the fix #15316 to 2.0. Dataset always does eager analysis now. Thus, `spark.sql.eagerAnalysis` is not used any more. Thus, we need to remove it. This PR also outputs the plan. Without the fix, the analysis error is like ``` cannot resolve '`k1`' given input columns: [k, v]; line 1 pos 12 ``` After the fix, the analysis error becomes: ``` org.apache.spark.sql.AnalysisException: cannot resolve '`k1`' given input columns: [k, v]; line 1 pos 12; 'Project [unresolvedalias(CASE WHEN ('k1 = 2) THEN 22 WHEN ('k1 = 4) THEN 44 ELSE 0 END, None), v#6] +- SubqueryAlias t +- Project [_1#2 AS k#5, _2#3 AS v#6] +- LocalRelation [_1#2, _2#3] ``` ### How was this patch tested? N/A Author: gatorsmile <[email protected]> Closes #15529 from gatorsmile/eagerAnalysis20.
… if Existed in AnalysisException ### What changes were proposed in this pull request? Dataset always does eager analysis now. Thus, `spark.sql.eagerAnalysis` is not used any more. Thus, we need to remove it. This PR also outputs the plan. Without the fix, the analysis error is like ``` cannot resolve '`k1`' given input columns: [k, v]; line 1 pos 12 ``` After the fix, the analysis error becomes: ``` org.apache.spark.sql.AnalysisException: cannot resolve '`k1`' given input columns: [k, v]; line 1 pos 12; 'Project [unresolvedalias(CASE WHEN ('k1 = 2) THEN 22 WHEN ('k1 = 4) THEN 44 ELSE 0 END, None), v#6] +- SubqueryAlias t +- Project [_1#2 AS k#5, _2#3 AS v#6] +- LocalRelation [_1#2, _2#3] ``` ### How was this patch tested? N/A Author: gatorsmile <[email protected]> Closes apache#15316 from gatorsmile/eagerAnalysis.
… if Existed in AnalysisException ### What changes were proposed in this pull request? Dataset always does eager analysis now. Thus, `spark.sql.eagerAnalysis` is not used any more. Thus, we need to remove it. This PR also outputs the plan. Without the fix, the analysis error is like ``` cannot resolve '`k1`' given input columns: [k, v]; line 1 pos 12 ``` After the fix, the analysis error becomes: ``` org.apache.spark.sql.AnalysisException: cannot resolve '`k1`' given input columns: [k, v]; line 1 pos 12; 'Project [unresolvedalias(CASE WHEN ('k1 = 2) THEN 22 WHEN ('k1 = 4) THEN 44 ELSE 0 END, None), v#6] +- SubqueryAlias t +- Project [_1#2 AS k#5, _2#3 AS v#6] +- LocalRelation [_1#2, _2#3] ``` ### How was this patch tested? N/A Author: gatorsmile <[email protected]> Closes apache#15316 from gatorsmile/eagerAnalysis.
What changes were proposed in this pull request?
Dataset always does eager analysis now. Thus,
spark.sql.eagerAnalysis
is not used any more. Thus, we need to remove it.This PR also outputs the plan. Without the fix, the analysis error is like
After the fix, the analysis error becomes:
How was this patch tested?
N/A