Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPARK-2701: ConnectionManager throws out of "Could not find reference for received ack message xxx" exception. #1603

Closed
wants to merge 1 commit into from

Conversation

witgo
Copy link
Contributor

@witgo witgo commented Jul 26, 2014

No description provided.

@SparkQA
Copy link

SparkQA commented Jul 26, 2014

QA tests have started for PR 1603. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17221/consoleFull

@SparkQA
Copy link

SparkQA commented Jul 26, 2014

QA results for PR 1603:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17221/consoleFull

@SparkQA
Copy link

SparkQA commented Jul 26, 2014

QA tests have started for PR 1603. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17223/consoleFull

@aarondav
Copy link
Contributor

Does the movement of the status modification into the synchronized block change anything? It seems the only effective change here is downgrading an exception to a log message.

Because we remove the status from the map within the synchronized block, it would seem that it's safe to do our subsequent modification of the status without (though it does seem to be more consistent with the rest of the code to do it within).

@witgo
Copy link
Contributor Author

witgo commented Jul 26, 2014

Throw an exception here cause System.exit(ExecutorExitCode.UNCAUGHT_EXCEPTION) is called.
This is not necessary.

@SparkQA
Copy link

SparkQA commented Jul 26, 2014

QA results for PR 1603:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17223/consoleFull

@mridulm
Copy link
Contributor

mridulm commented Jul 26, 2014

I have not analyzed in detail, but this looks like something which might cause deadlocks ..
Changing to logging instead of throwing exception is a good idea though : earlier we were not exit'ing in case of uncaught exception - which was introduced in some subsequent release - so this should be done now (logError).

@witgo witgo changed the title ConnectionManager throws out of "Could not find reference for received ack message xxx" exception. SPARK-2701: ConnectionManager throws out of "Could not find reference for received ack message xxx" exception. Jul 28, 2014
@JoshRosen
Copy link
Contributor

How did you trigger this exception? I wonder whether this points to a synchronization error somewhere, or a bug in how the ACK'er handles message ids.

@witgo
Copy link
Contributor Author

witgo commented Aug 5, 2014

This appeared in a production environment. I don't know how to reproduce it.

@SparkQA
Copy link

SparkQA commented Aug 7, 2014

QA tests have started for PR 1603. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18092/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 7, 2014

QA results for PR 1603:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18092/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 7, 2014

QA tests have started for PR 1603. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18093/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 7, 2014

QA results for PR 1603:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18093/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 7, 2014

QA tests have started for PR 1603. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18098/consoleFull

@SparkQA
Copy link

SparkQA commented Aug 7, 2014

QA results for PR 1603:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18098/consoleFull

@witgo witgo closed this Aug 17, 2014
@witgo witgo deleted the SPARK-2701 branch August 17, 2014 00:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants