-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-3032][Shuffle] Fix key comparison integer overflow introduced sorting exception #2514
Conversation
@@ -152,7 +152,7 @@ private[spark] class ExternalSorter[K, V, C]( | |||
override def compare(a: K, b: K): Int = { | |||
val h1 = if (a == null) 0 else a.hashCode() | |||
val h2 = if (b == null) 0 else b.hashCode() | |||
h1 - h2 | |||
Integer.compare(h1, h2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method does not exist in Java 6. There is an equivalent in Guava or you can just write the comparisons directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks Sean.
QA tests have started for PR 2514 at commit
|
QA tests have started for PR 2514 at commit
|
@@ -152,7 +152,7 @@ private[spark] class ExternalSorter[K, V, C]( | |||
override def compare(a: K, b: K): Int = { | |||
val h1 = if (a == null) 0 else a.hashCode() | |||
val h2 = if (b == null) 0 else b.hashCode() | |||
h1 - h2 | |||
if (h1 < h2) -1 else if (h1 == h2) 0 else 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use this, use Integer.compare instead. Reason is Java will likely optimize the latter one. You can fix it in ExternalAppendOnlyMap too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what about Java 6 compatibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mateiz per my comment, that would no longer run in Java 6 as Integer.compare
doesn't exist before Java 7.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, we can keep it as is then.
QA tests have finished for PR 2514 at commit
|
Test PASSed. |
QA tests have finished for PR 2514 at commit
|
Test PASSed. |
QA tests have started for PR 2514 at commit
|
Hi Matei, I modify the unit test to reproduce the exception. Seems it is difficult to reproduce this exception with small dataset manually, as this exception is unreliable. Here is the reason searched from stack overflow
So here I generate 1m random integer values to reproduce the exception. Seems in my local test with above 1000 rounds of test, this exception can always be produced. But it cannot be logically proved and still have chance to not throw exception. Also I tested with 1k random integer plus some large data like Integer.MaxValue and Integer.MinValue, hard to reproduce this exception. And with 10k dataset, 1/3 of chance to get the exception. I think unless someone familiar with TimSort can manually create effectively small dataset, potentially this unit test may fail. So would you give me some suggestions? Thanks a lot. |
QA tests have finished for PR 2514 at commit
|
Test FAILed. |
If you have a test that reproduces it in 1000 runs, it's fine. I'd improve it by seeding a random generator with a fixed seed that you saw produce the problem. TimSort is deterministic, so it will always throw it in that case (though this may break if we change the implementation). |
sc = new SparkContext("local-cluster[1,1,512]", "test", conf) | ||
|
||
// Using wrongHashOrdering to show integer overflow introduced exception. | ||
val rand = new Random |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just seed this at a given value that you see causes the exception to occur.
QA tests have started for PR 2514 at commit
|
Hi Matei, thanks a lot for your suggestions. I've updated the code with fixed seed. Would you mind taking a look at this? Thanks a lot. |
QA tests have finished for PR 2514 at commit
|
Test FAILed. |
Jenkins, retest this please. |
QA tests have started for PR 2514 at commit
|
QA tests have finished for PR 2514 at commit
|
Test PASSed. |
…sorting exception Previous key comparison in `ExternalSorter` will get wrong sorting result or exception when key comparison overflows, details can be seen in [SPARK-3032](https://issues.apache.org/jira/browse/SPARK-3032). Here fix this and add a unit test to prove it. Author: jerryshao <[email protected]> Closes #2514 from jerryshao/SPARK-3032 and squashes the following commits: 6f3c302 [jerryshao] Improve the unit test according to comments 01911e6 [jerryshao] Change the test to show the contract violate exception 83acb38 [jerryshao] Minor changes according to comments fa2a08f [jerryshao] Fix key comparison integer overflow introduced sorting exception (cherry picked from commit dab1b0a) Signed-off-by: Matei Zaharia <[email protected]>
Thanks Saisai, this looks good. I've merged it. |
Previous key comparison in
ExternalSorter
will get wrong sorting result or exception when key comparison overflows, details can be seen in SPARK-3032. Here fix this and add a unit test to prove it.