Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-18076][CORE][SQL] Fix default Locale used in DateFormat, NumberFormat to Locale.US #15610

Closed
wants to merge 2 commits into from

Conversation

srowen
Copy link
Member

@srowen srowen commented Oct 24, 2016

What changes were proposed in this pull request?

Fix Locale.US for all usages of DateFormat, NumberFormat

How was this patch tested?

Existing tests.

@srowen
Copy link
Member Author

srowen commented Oct 24, 2016

CC @HyukjinKwon

@SparkQA
Copy link

SparkQA commented Oct 24, 2016

Test build #67453 has finished for PR 15610 at commit 4260ba0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

I support this PR in general. Let also me try to help verify this with some comments within a couple of coming days (I have been testing related ones too).

@HyukjinKwon
Copy link
Member

@srowen, It seems we should fix FastDateFormat.getInstance too. It seems that method also takes locale as the second argument.

.../sql/catalyst/json/JSONOptions.scala:    FastDateFormat.getInstance(parameters.getOrElse("dateFormat", "yyyy-MM-dd"))
.../sql/catalyst/json/JSONOptions.scala:    FastDateFormat.getInstance(
.../sql/execution/datasources/csv/CSVOptions.scala:    FastDateFormat.getInstance(parameters.getOrElse("dateFormat", "yyyy-MM-dd"))
.../sql/execution/datasources/csv/CSVOptions.scala:    FastDateFormat.getInstance(
.../sql/execution/datasources/csv/CSVSuite.scala:      val iso8501 = FastDateFormat.getInstance("yyyy-MM-dd'T'HH:mm:ss.SSSZZ")
.../sql/execution/datasources/csv/CSVSuite.scala:      val iso8501 = FastDateFormat.getInstance("yyyy-MM-dd")

Also, IMHO, it seems okay to change the failed test

test("Float and Double Types are cast correctly with Locale") {
  val originalLocale = Locale.getDefault
  try {
    val locale : Locale = new Locale("fr", "FR")
    Locale.setDefault(locale)
    assert(CSVTypeCast.castTo("1,00", FloatType) == 1.0)
    assert(CSVTypeCast.castTo("1,00", DoubleType) == 1.0)
  } finally {
    Locale.setDefault(originalLocale)
  }
}

to

test("Float and Double Types are cast correctly with US Locale") {
  val originalLocale = Locale.getDefault
  try {
    val locale : Locale = new Locale("fr", "FR")
    Locale.setDefault(locale)
    assert(CSVTypeCast.castTo("1,00", FloatType) == 100.0)
    assert(CSVTypeCast.castTo("1,00", DoubleType) == 100.0)
  } finally {
    Locale.setDefault(originalLocale)
  }
}

in order to test if setting other locales does not affect the converted results as this PR proposes the fixed locale.

One concern about this is to not consider the default locale in JVM for parsing CSV and JSON but I think we might be able to introduce another option to set the locale if this is problematic (with default value Locale.US) in the future.

@SparkQA
Copy link

SparkQA commented Oct 25, 2016

Test build #67508 has finished for PR 15610 at commit 7a909c8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member Author

srowen commented Oct 29, 2016

I wonder if @zsxwing or @davies have an opinion on something like this? looks good to us but it conceivably results in a behavior change -- for the better, but a change in some cases.

@srowen
Copy link
Member Author

srowen commented Nov 2, 2016

Merged to master/2.1. I think this is important as a fix though bears noting the change in release notes. I tagged the JIRA.

@asfgit asfgit closed this in 9c8deef Nov 2, 2016
asfgit pushed a commit that referenced this pull request Nov 2, 2016
…rFormat to Locale.US

## What changes were proposed in this pull request?

Fix `Locale.US` for all usages of `DateFormat`, `NumberFormat`
## How was this patch tested?

Existing tests.

Author: Sean Owen <[email protected]>

Closes #15610 from srowen/SPARK-18076.

(cherry picked from commit 9c8deef)
Signed-off-by: Sean Owen <[email protected]>
@srowen srowen deleted the SPARK-18076 branch November 2, 2016 09:44
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…rFormat to Locale.US

## What changes were proposed in this pull request?

Fix `Locale.US` for all usages of `DateFormat`, `NumberFormat`
## How was this patch tested?

Existing tests.

Author: Sean Owen <[email protected]>

Closes apache#15610 from srowen/SPARK-18076.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants