-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Zen2] Logging improvements in CoordinatorTests #33991
[Zen2] Logging improvements in CoordinatorTests #33991
Conversation
Today, we know that CoordinatorTests sometimes fail to stabilise due to an election collision. This change improves the logging that occurs when an election collision occurs so it will be easier to see if this is happening when analysing a test failure. We also wrap the call to masterService.submitStateUpdateTask() in a context that logs the node on which it runs. We also introduce the InitialJoinAccumulator instead of using a placeholder CandidateJoinAccumulator at startup, which reduces the cases to consider in CandidateJoinAccumulator.close() and tightens up the assertions we can make here.
Pinging @elastic/es-distributed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left two minor (optional) comments.
logger.debug("handleStartJoin: failed election, discarding {}", joinVotes); | ||
} else if (startJoinRequest.getSourceNode().equals(localNode) == false) { | ||
logger.debug("handleStartJoin: standing down as leader, discarding {}", joinVotes); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we also log here when bumping the term as leader? I think all the cases where there are existing votes are interesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
public void onFailure(String source, Exception e) { | ||
logger.debug(() -> new ParameterizedMessage("failed to publish: [{}]", source), e); | ||
public String toString() { | ||
return "submitStateUpdateTask: new value [" + value + "]"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this toString() here? We run it right away and don't log it anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left over from earlier when I was enqueueing it, good point.
Today, we know that CoordinatorTests sometimes fail to stabilise due to an
election collision. This change improves the logging that occurs when an
election collision occurs so it will be easier to see if this is happening when
analysing a test failure. We also wrap the call to
masterService.submitStateUpdateTask() in a context that logs the node on which
it runs.
We also introduce the InitialJoinAccumulator instead of using a placeholder
CandidateJoinAccumulator at startup, which reduces the cases to consider in
CandidateJoinAccumulator.close() and tightens up the assertions we can make
here.