-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IndexShard.routingEntry
should only be updated once all internal state is ready
#26776
IndexShard.routingEntry
should only be updated once all internal state is ready
#26776
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left one question.
@@ -524,6 +523,8 @@ public void onFailure(Exception e) { | |||
latch.countDown(); | |||
} | |||
} | |||
// set this last, once we finished updating all internal state. | |||
this.shardRouting = newRouting; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that in the case of a primary this should be published before we countdown the latch that lets the operations that are run under the permit block proceed. At that point we will have published the new primary term but not the new routing entry, and this shard will be used for example in the resync operations. Am I missing something here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming green CI, LGTM.
Thx @s1monw , @jasontedor |
…ate is ready (#26776) The routing entry is used by external components to check whether the shard is ready to perform as primary. Most notably, the peer recovery source handler delays recoveries until the shard routing entry says the shard is ready. When a shard is promoted to primary, we currently update the shard's routing entry before we finish all the work relating to the promotion. This can cause recoveries to fail later on because the `GlobalCheckpointTracker` isn't set (yet) to primary mode. This commit fixes this issue by updating the routing entry last.
…ate is ready (#26776) The routing entry is used by external components to check whether the shard is ready to perform as primary. Most notably, the peer recovery source handler delays recoveries until the shard routing entry says the shard is ready. When a shard is promoted to primary, we currently update the shard's routing entry before we finish all the work relating to the promotion. This can cause recoveries to fail later on because the `GlobalCheckpointTracker` isn't set (yet) to primary mode. This commit fixes this issue by updating the routing entry last.
Pinging @elastic/es-core-infra |
The routing entry is used by external components to check whether the shard is ready to perform as primary. Most notably, the peer recovery source handler delays recoveries until the shard routing entry says the shard is ready.
When a shard is promoted to primary, we currently update the shard's routing entry before we finish all the work relating to the promotion. This can cause recoveries to fail later on because the
GlobalCheckpointTracker
isn't set (yet) to primary mode.This commit fixes this issue by updating the routing entry last.