[SPARK-2856] Decrease initial buffer size for Kryo to 64KB. #1780
Conversation
QA tests have started for PR 1780. This patch merges cleanly.
LGTM
@@ -47,7 +47,9 @@ class KryoSerializer(conf: SparkConf)
   with Logging
   with Serializable {

-  private val bufferSize = conf.getInt("spark.kryoserializer.buffer.mb", 2) * 1024 * 1024
+  private val bufferSize =
+    (conf.getDouble("spark.kryoserializer.buffer.mb", 0.064) * 1024 * 1024).toInt
maybe add a comment // 64KB?
It's pretty obvious, isn't it?
Well, for lazy people like me it's nice if we don't have to spend extra cycles in our heads.
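For reference, a minimal standalone sketch (not Spark code) of the arithmetic behind the new default: 0.064 MB comes out to the 64 KB in the PR title, down from the previous 2 MB default.

```scala
// Illustrative only: the new default of 0.064 MB expressed in bytes.
val defaultBufferMb = 0.064
val bufferSize = (defaultBufferMb * 1024 * 1024).toInt
assert(bufferSize == 65536)            // 64 KB
assert(2 * 1024 * 1024 == 2097152)     // the old 2 MB default, for comparison
```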
QA results for PR 1780:
Thanks. Merging into master. @andrewor14 if you feel strongly about it, I can push a commit to add a one-line comment.
IIRC, if Kryo can't fit the entire serialized object in the buffer, it throws up: we saw issues with it even at 256 KB for some of our jobs, though we were using a different API (and it was a slightly more complicated use case).
Nah, that's fine.
@mridulm our Kryo configuration has changed from the version you are running: this value is only the initial size of the buffer, not the maximum size. Check out the master docs at https://github.com/apache/spark/blob/master/docs/configuration.md and look for "spark.kryoserializer.buffer.max.mb".
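To illustrate the distinction between the two knobs mentioned above, here is a minimal sketch of setting both explicitly on a SparkConf; the values are illustrative, not recommendations.

```scala
import org.apache.spark.SparkConf

// Sketch: configure the initial buffer and the ceiling it may grow to.
val conf = new SparkConf()
  .set("spark.kryoserializer.buffer.mb", "0.064")   // initial buffer: 64 KB (the new default)
  .set("spark.kryoserializer.buffer.max.mb", "64")  // maximum size the buffer may grow to
```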
Hi @pwendell, my observation about buffer size was not in the context of Spark: we saw issues that "looked like" buffer overflow when the serialized object graph was large, and Kryo was not handling the buffer growth properly. If there are issues with buffer growth and we lower the limit, a lot of jobs will start failing on release. Given some of the past bugs we have fixed, @pwendell (the flush issue comes to mind, for example!), I am very wary of Kryo: when it works, it is great; the rest is suspicious :-)
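For readers unfamiliar with the growth behaviour being discussed, here is a minimal sketch using the Kryo library directly (class names are from Kryo itself; the payload and sizes are purely illustrative). An `Output` created with an initial and a maximum buffer size grows on demand; it only fails with a "buffer overflow" KryoException once the maximum is exceeded, which is the failure mode described above.

```scala
import com.esotericsoftware.kryo.Kryo
import com.esotericsoftware.kryo.io.Output

object KryoGrowthSketch {
  def main(args: Array[String]): Unit = {
    val kryo = new Kryo()
    // Start with a small 64 KB buffer but allow growth up to 64 MB.
    val output = new Output(64 * 1024, 64 * 1024 * 1024)
    // ~1 MB payload, larger than the initial buffer, so the buffer must grow.
    val bigPayload = Array.fill(1 << 20)(42.toByte)
    kryo.writeObject(output, bigPayload)  // grows the buffer as needed
    println(s"serialized ${output.total()} bytes")
    output.close()
    // Had the max been smaller than the serialized size, writeObject would
    // have thrown a KryoException ("Buffer overflow") instead of growing.
  }
}
```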
Author: Reynold Xin <[email protected]>

Closes #1780 from rxin/kryo-init-size and squashes the following commits:

551b935 [Reynold Xin] [SPARK-2856] Decrease initial buffer size for Kryo to 64KB.

(cherry picked from commit 184048f)
Signed-off-by: Reynold Xin <[email protected]>
@mridulm ah, I misunderstood! Anyways, thanks for filling it in; I agree we need to be defensive and cautious about our use here :)
Author: Reynold Xin <[email protected]>

Closes apache#1780 from rxin/kryo-init-size and squashes the following commits:

551b935 [Reynold Xin] [SPARK-2856] Decrease initial buffer size for Kryo to 64KB.