[SPARK-18256] Improve the performance of event log replay in HistoryServer #15756
@@ -540,7 +544,8 @@ private[spark] object JsonProtocol {

   def taskStartFromJson(json: JValue): SparkListenerTaskStart = {
     val stageId = (json \ "Stage ID").extract[Int]
-    val stageAttemptId = (json \ "Stage Attempt ID").extractOpt[Int].getOrElse(0)
+    val stageAttemptId =
+      Utils.jsonOption(json \ "Stage Attempt ID").map(_.extract[Int]).getOrElse(0)
what's the difference? Is `extractOpt` really slow?
if so maybe we should add a scalastyle to ban it
oh wait you already did 😄
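For context on the question above: the PR description attributes the cost of `extractOpt` to exceptions being used for control flow. The sketch below illustrates that difference without depending on json4s. `JValue`, `JNothing`, `JInt`, `JObject`, `extractInt` and `extractOptSlow` are hypothetical stand-ins invented for this example, and `jsonOption` is only assumed to resemble Spark's `Utils.jsonOption` in that it detects a missing field with a pattern match instead of catching an exception.

```scala
import scala.util.Try

object JsonOptionSketch {
  // Hypothetical miniature JSON model; json4s has its own, richer JValue hierarchy.
  sealed trait JValue
  case object JNothing extends JValue
  final case class JInt(num: BigInt) extends JValue
  final case class JObject(fields: Map[String, JValue]) extends JValue {
    // Field lookup: a missing field yields JNothing rather than throwing.
    def \(name: String): JValue = fields.getOrElse(name, JNothing)
  }

  // Strict extraction: throws when the value is not an integer.
  def extractInt(json: JValue): Int = json match {
    case JInt(n) => n.toInt
    case other   => throw new IllegalArgumentException(s"not an int: $other")
  }

  // Exception-driven optional extraction (the pattern the patch avoids):
  // every missing field constructs and catches an exception.
  def extractOptSlow(json: JValue): Option[Int] = Try(extractInt(json)).toOption

  // Pattern-match-based alternative: no exception on the missing-field path.
  def jsonOption(json: JValue): Option[JValue] = json match {
    case JNothing => None
    case value    => Some(value)
  }

  // Mirrors the shape of the changed line in the diff above.
  def stageAttemptId(event: JObject): Int =
    jsonOption(event \ "Stage Attempt ID").map(extractInt).getOrElse(0)
}
```

Both variants return the same result for a missing field; the difference is only in how much work the missing-field path does, which matters when that path is hit for many events in a long log.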
LGTM. That's a massive amount of time spent in
Test build #68077 has finished for PR 15756 at commit
retest this please
Test build #68149 has finished for PR 15756 at commit
LGTM
Cool. Merging to master!
…erver

## What changes were proposed in this pull request?

This patch significantly improves the performance of event log replay in the HistoryServer via two simple changes:

- **Don't use `extractOpt`**: it turns out that `json4s`'s `extractOpt` method uses exceptions for control flow, causing huge performance bottlenecks due to the overhead of initializing exceptions. To avoid this overhead, we can simply use our own `Utils.jsonOption` method. This patch replaces all uses of `extractOpt` with `Utils.jsonOption` and adds a style checker rule to ban the use of the slow `extractOpt` method.
- **Don't call `Utils.getFormattedClassName` for every event**: the old code called `Utils.getFormattedClassName` dozens of times per replayed event in order to match up class names in events with SparkListener event names. By simply storing the results of these calls in constants rather than recomputing them, we're able to eliminate a huge performance hotspot by removing thousands of expensive `Class.getSimpleName` calls.

## How was this patch tested?

Tested by profiling the replay of a long event log using YourKit. For an event log containing 1000+ jobs, each of which had thousands of tasks, the changes in this patch cut the replay time in half.

Prior to this patch's changes, the two slowest methods in log replay were internal exceptions thrown by `Json4S` and calls to `Class.getSimpleName()`.

After this patch, these hotspots are completely eliminated.

Author: Josh Rosen <[email protected]>

Closes apache#15756 from JoshRosen/speed-up-jsonprotocol.
What changes were proposed in this pull request?

This patch significantly improves the performance of event log replay in the HistoryServer via two simple changes:

- Don't use `extractOpt`: it turns out that `json4s`'s `extractOpt` method uses exceptions for control flow, causing huge performance bottlenecks due to the overhead of initializing exceptions. To avoid this overhead, we can simply use our own `Utils.jsonOption` method. This patch replaces all uses of `extractOpt` with `Utils.jsonOption` and adds a style checker rule to ban the use of the slow `extractOpt` method.
- Don't call `Utils.getFormattedClassName` for every event: the old code called `Utils.getFormattedClassName` dozens of times per replayed event in order to match up class names in events with SparkListener event names. By simply storing the results of these calls in constants rather than recomputing them, we're able to eliminate a huge performance hotspot by removing thousands of expensive `Class.getSimpleName` calls.

How was this patch tested?

Tested by profiling the replay of a long event log using YourKit. For an event log containing 1000+ jobs, each of which had thousands of tasks, the changes in this patch cut the replay time in half.

Prior to this patch's changes, the two slowest methods in log replay were internal exceptions thrown by `Json4S` and calls to `Class.getSimpleName()`.

After this patch, these hotspots are completely eliminated.
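The second change, storing formatted class names in constants instead of recomputing them per event, can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch's actual code: `TaskStart`, `TaskEnd`, `EventNames`, `formattedClassName` and `dispatch` are hypothetical names, and `formattedClassName` is only assumed to behave like `Utils.getFormattedClassName` in deriving a name from `Class.getSimpleName`.

```scala
object ClassNameCacheSketch {
  // Hypothetical stand-ins for listener event types.
  final case class TaskStart(stageId: Int)
  final case class TaskEnd(stageId: Int)

  // Assumed to approximate Utils.getFormattedClassName: derives a name
  // via the comparatively expensive Class.getSimpleName call.
  def formattedClassName(cls: Class[_]): String =
    cls.getSimpleName.replace("$", "")

  // Each name is computed once, when this object is first initialized,
  // rather than once per replayed event.
  object EventNames {
    val taskStart: String = formattedClassName(classOf[TaskStart])
    val taskEnd: String = formattedClassName(classOf[TaskEnd])
  }

  // Per-event dispatch compares against the cached constants only.
  def dispatch(eventName: String): String = {
    import EventNames._
    eventName match {
      case `taskStart` => "start"
      case `taskEnd`   => "end"
      case other       => s"unknown: $other"
    }
  }
}
```

The backticked patterns match against the existing `val`s rather than binding new variables, so each call to `dispatch` performs string comparisons only and never touches `getSimpleName`.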