[SPARK-3741] Make ConnectionManager propagate errors properly and add more logs to avoid Executors swallowing errors #2593
QA tests have started for PR 2593 at commit
QA tests have finished for PR 2593 at commit
Test FAILed.
Retest this please. This test passes on my machine.
QA tests have started for PR 2593 at commit
Test PASSed.
    callback(this, e)
  } catch {
    case NonFatal(e) => {
      logWarning("Ignore error", e)
Could you change this to be more descriptive? How about something like "Ignored error in onExceptionCallback"?
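A minimal sketch of the pattern under review, assuming a plain list of exception callbacks. The `CallbackSketch` object, the `fireCallbacks` helper, and the returned warning list (standing in for `logWarning`) are hypothetical and are not taken from the PR; only the `NonFatal` guard and the suggested message text come from the thread above.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.control.NonFatal

// Hypothetical illustration, not Spark code.
object CallbackSketch {
  // Invokes every callback with the error `e`. A callback that throws a
  // non-fatal error is recorded as a warning instead of aborting the loop,
  // so the remaining callbacks still run.
  def fireCallbacks(callbacks: Seq[Exception => Unit], e: Exception): Seq[String] = {
    val warnings = ArrayBuffer.empty[String]
    callbacks.foreach { callback =>
      try {
        callback(e)
      } catch {
        case NonFatal(err) =>
          // Stand-in for logWarning("Ignored error in onExceptionCallback", err)
          warnings += s"Ignored error in onExceptionCallback: ${err.getMessage}"
      }
    }
    warnings.toSeq
  }
}
```

A descriptive message matters here because the warning is the only trace left once the callback's own exception has been caught.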
Hi @zsxwing,

Thanks for submitting this PR (and sorry for the delayed review)! These changes will be very helpful in debugging certain types of connection manager issues that we've encountered. I like the careful attention to error-handling cases that we missed before, especially the use of …

I left a few comments, mostly regarding naming. If you fix up the merge conflicts, I think this looks ready to merge. Thanks!
Conflicts:
core/src/main/scala/org/apache/spark/network/nio/Connection.scala
core/src/main/scala/org/apache/spark/network/nio/ConnectionManager.scala
override def afterExecute(r: Runnable, t: Throwable): Unit = {
  super.afterExecute(r, t)
  if (t != null && NonFatal(t)) {
I added `NonFatal(t)` to avoid logging fatal exceptions. They are not expected to be handled here.
Nice change.
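For readers following the thread, here is a rough sketch of the `afterExecute` idea on a plain `java.util.concurrent.ThreadPoolExecutor`. The `AfterExecuteSketch` object and its `runAndCapture` helper are hypothetical, and the captured `Option` stands in for a log call; the actual executors in `ConnectionManager` are set up differently.

```scala
import java.util.concurrent.{LinkedBlockingQueue, ThreadPoolExecutor, TimeUnit}
import java.util.concurrent.atomic.AtomicReference
import scala.util.control.NonFatal

// Hypothetical illustration, not Spark code.
object AfterExecuteSketch {
  // Runs `task` on a single-thread pool and returns the non-fatal error
  // that afterExecute observed, if any.
  def runAndCapture(task: Runnable): Option[Throwable] = {
    val seen = new AtomicReference[Option[Throwable]](None)
    val pool = new ThreadPoolExecutor(
      1, 1, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue[Runnable]()) {
      override def afterExecute(r: Runnable, t: Throwable): Unit = {
        super.afterExecute(r, t)
        // Fatal errors (e.g. OutOfMemoryError) are deliberately skipped.
        if (t != null && NonFatal(t)) {
          seen.set(Some(t)) // real code would log the error here
        }
      }
    }
    pool.execute(task) // execute(), not submit(): see the note below
    pool.shutdown()
    pool.awaitTermination(5, TimeUnit.SECONDS)
    seen.get
  }
}
```

Note that `t` is only non-null for tasks started with `execute()`. Tasks started with `submit()` are wrapped in a `FutureTask`, which stores the exception in the returned `Future` and passes `null` to `afterExecute`.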
Merged and updated the naming.
QA tests have started for PR 2593 at commit
QA tests have finished for PR 2593 at commit
Test PASSed.
This looks good to me, so I'm going to merge it. Thanks a bunch; this will really help with debugging!
Sorry. I found that I forgot to add `afterExecute` for `handleConnectExecutor` in #2593.

Author: zsxwing

Closes #2794 from zsxwing/SPARK-3741 and squashes the following commits:

a0bc4dd [zsxwing] Add afterExecute for handleConnectExecutor
This PR made the following changes:
… `Connection` so that the error will be propagated properly.
… `Promise` doesn't allow `success`/`failure` to be called more than once.
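The `Promise` constraint mentioned above can be seen with the standard `scala.concurrent.Promise` API. The sketch below is illustrative only: the `PromiseOnceSketch` object is hypothetical, and guarding with `trySuccess`/`tryFailure` is shown as one possible approach, not necessarily what the PR does.

```scala
import scala.concurrent.Promise
import scala.util.Try

// Hypothetical illustration, not Spark code.
object PromiseOnceSketch {
  // Completes a promise once, then attempts to complete it again.
  // Returns (second success() threw, trySuccess result, tryFailure result).
  def completeTwice(): (Boolean, Boolean, Boolean) = {
    val p = Promise[Int]()
    p.success(1)
    // A second success() on a completed promise throws IllegalStateException.
    val secondThrows = Try(p.success(2)).isFailure
    // The try* variants report whether they completed the promise.
    val tried = p.trySuccess(3)
    val triedFailure = p.tryFailure(new RuntimeException("late error"))
    (secondThrows, tried, triedFailure)
  }
}
```

When both a normal completion path and an error path can fire for the same message, the `try*` variants let whichever arrives second become a no-op instead of raising.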