Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-5649][SQL] added a rule to check datatypes cast #4425

Closed
wants to merge 6 commits into from

Conversation

scwf
Copy link
Contributor

@scwf scwf commented Feb 6, 2015

Throw an exception when datatypes cast is illegitimate, this patch info user the cast issue in the sqls.
Before this patch, we only get an attribute unresolved exception which is really hard to understand the problems in users' sql

@SparkQA
Copy link

SparkQA commented Feb 6, 2015

Test build #26908 has started for PR 4425 at commit 9fb60ce.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Feb 6, 2015

Test build #26908 has finished for PR 4425 at commit 9fb60ce.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26908/
Test FAILed.

@scwf scwf changed the title [SPARK-5649][SQL] Throw exception when can not apply datatype cast [SPARK-5649][SQL] added a rule to check datatypes cast Feb 6, 2015
@SparkQA
Copy link

SparkQA commented Feb 6, 2015

Test build #26910 has started for PR 4425 at commit 172e4dd.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Feb 6, 2015

Test build #26911 has started for PR 4425 at commit 97bb90b.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Feb 6, 2015

Test build #26910 has finished for PR 4425 at commit 172e4dd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26910/
Test PASSed.

@SparkQA
Copy link

SparkQA commented Feb 6, 2015

Test build #26911 has finished for PR 4425 at commit 97bb90b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26911/
Test PASSed.

@chenghao-intel
Copy link
Contributor

It should be either a bug of the Analyzer or the bad semantic of the user. As I think all of the children should be resolved before the checking, can you also give an example or unit test that will throw exception?

@scwf
Copy link
Contributor Author

scwf commented Feb 6, 2015

unit test added

@SparkQA
Copy link

SparkQA commented Feb 6, 2015

Test build #26917 has started for PR 4425 at commit bd8d9f8.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Feb 6, 2015

Test build #26917 has finished for PR 4425 at commit bd8d9f8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26917/
Test PASSed.

@scwf
Copy link
Contributor Author

scwf commented Feb 7, 2015

we can throw an AnalysisException not treenodeexception after #4439 has in

@SparkQA
Copy link

SparkQA commented Feb 11, 2015

Test build #27260 has started for PR 4425 at commit db86dc4.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Feb 11, 2015

Test build #27260 has finished for PR 4425 at commit db86dc4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27260/
Test FAILed.

@scwf
Copy link
Contributor Author

scwf commented Feb 11, 2015

Retest this please

@SparkQA
Copy link

SparkQA commented Feb 11, 2015

Test build #27266 has started for PR 4425 at commit db86dc4.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Feb 11, 2015

Test build #27266 has finished for PR 4425 at commit db86dc4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27266/
Test FAILed.

@scwf
Copy link
Contributor Author

scwf commented Feb 11, 2015

retest this please

@SparkQA
Copy link

SparkQA commented Feb 11, 2015

Test build #27272 has started for PR 4425 at commit db86dc4.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Feb 11, 2015

Test build #27272 has finished for PR 4425 at commit db86dc4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27272/
Test PASSed.

@scwf
Copy link
Contributor Author

scwf commented Feb 11, 2015

Updated, @marmbrus any comment here?

@@ -69,6 +69,7 @@ class Analyzer(catalog: Catalog,
typeCoercionRules ++
extendedRules : _*),
Batch("Check Analysis", Once,
CheckCast ::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to check casts after we check resolution. Otherwise casts of invalid attributes will still give confusing TreeNodeExceptions.

scala> sql("SELECT CAST(x AS STRING) FROM src").collect()
org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to dataType on unresolved object, tree: 'x
    at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute.dataType(unresolved.scala:48)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we check resolution first,
1 check cast will be invalid since legitimate cast expression must be not resolved.
2 select cast(key as binary) from src will give a scala.NotImplementedError

how about not change the order but catch exception of child.dataType to make it pass this rule? then your case will go into check resolution

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scala.NotImplementedError should be a bug, filed PR #4552 for this

@marmbrus
Copy link
Contributor

Thanks for bringing this up! I rolled the fix into #4558.

@scwf scwf closed this Feb 13, 2015
@scwf scwf deleted the cast branch February 13, 2015 02:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants