[SPARK-17680] [SQL] [TEST] Added a Testcase for Verifying Unicode Character Support for Column Names and Comments #15255

Closed
wants to merge 4 commits into from

gatorsmile
Member

@gatorsmile gatorsmile commented Sep 27, 2016

What changes were proposed in this pull request?

Spark SQL supports Unicode characters in column names when they are specified within backticks (`). When Hive support is enabled, the Hive metastore version must be higher than 0.12; the Hive metastore has supported Unicode characters in column names since 0.13. See the JIRA: https://issues.apache.org/jira/browse/HIVE-6013

In Spark SQL, table comments and view comments always allow Unicode characters, with no backticks needed.
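To make the quoting rule concrete, here is a minimal sketch that builds a Hive-style `CREATE TABLE` statement with a backtick-quoted Unicode column name and Unicode comments. It is not the test added by this PR: the object `UnicodeDdlSketch`, its methods, the table name `tab1`, and the column name `列1` are hypothetical names chosen for illustration, and the sketch assumes the comment text contains no single quotes. It only produces the SQL string; whether a particular Spark or Hive metastore version accepts it is not verified here.

```scala
// Hypothetical helper for illustration only; not part of this PR.
object UnicodeDdlSketch {
  // Wrap an identifier in backticks, doubling any embedded backtick.
  def quote(name: String): String = "`" + name.replace("`", "``") + "`"

  // Build a CREATE TABLE statement whose column name is backtick-quoted.
  // Assumes `comment` contains no single quotes, so it is inlined as-is.
  def createTable(table: String, column: String, comment: String): String =
    s"CREATE TABLE ${quote(table)} (${quote(column)} INT COMMENT '$comment') COMMENT '$comment'"
}
```

For example, `UnicodeDdlSketch.createTable("tab1", "列1", "庙")` yields a statement in which only the column name relies on the backticks, while the comments carry the Unicode text inside ordinary string literals. The table name is kept ASCII because, as noted in this description, Unicode is not supported for database and table names.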

BTW, a separate PR has been submitted for database and table name validation because we do not support Unicode characters in these two cases.

How was this patch tested?

N/A

@gatorsmile
Member Author

Also need to add a test case for data source tables. Will do it later

@gatorsmile
Member Author

... Hit a bug in the write path... Need to fix it first...

@gatorsmile
Member Author

... Hit one more bug in the write path...

@SparkQA

SparkQA commented Sep 27, 2016

Test build #65946 has finished for PR 15255 at commit 741d59c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

retest this please

@SparkQA

SparkQA commented Nov 7, 2016

Test build #68263 has finished for PR 15255 at commit 741d59c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

retest this please

@SparkQA

SparkQA commented Nov 27, 2016

Test build #69197 has finished for PR 15255 at commit 741d59c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

cc @cloud-fan Could you review this PR? Thanks!

val comment = "庙"
// scalastyle:on
withTable(tabName) {
// non ascii characters are not allowed in the source code, so we disable the scalastyle.
Contributor

remove this line?

@cloud-fan
Contributor

LGTM

@SparkQA

SparkQA commented Nov 30, 2016

Test build #69357 has finished for PR 15255 at commit 57817a1.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

retest this please

@SparkQA

SparkQA commented Nov 30, 2016

Test build #69379 has finished for PR 15255 at commit 57817a1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Nov 30, 2016
…cter Support for Column Names and Comments

Author: gatorsmile <[email protected]>

Closes #15255 from gatorsmile/unicodeSupport.

(cherry picked from commit a1d9138)
Signed-off-by: Wenchen Fan <[email protected]>
@cloud-fan
Contributor

thanks, merging to master/2.1!

@asfgit asfgit closed this in a1d9138 Nov 30, 2016
robert3005 pushed a commit to palantir/spark that referenced this pull request Dec 2, 2016
…cter Support for Column Names and Comments

robert3005 pushed a commit to palantir/spark that referenced this pull request Dec 15, 2016
…cter Support for Column Names and Comments

uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…cter Support for Column Names and Comments
