[Backport 2.x] Limit RW separation to remote store enabled clusters and update recovery flow #16999
…ery flow (opensearch-project#16760)

* Update search only replica recovery flow

This PR includes multiple changes to search replica recovery.
1. Change search only replica copies to recover as empty store instead of PEER. This will run a store recovery that syncs segments from remote store directly and eliminate any primary communication.
2. Remove search replicas from the in-sync allocation ID set and update routing table to exclude them from allAllocationIds. This ensures primaries aren't tracking or validating the routing table for any search replica's presence.
3. Change search replica validation to require remote store. There are versions of the above changes that are still possible with primary based node-node replication, but I don't think they are worth making at this time.

Signed-off-by: Marc Handalian <[email protected]>

* more coverage

Signed-off-by: Marc Handalian <[email protected]>

* add changelog entry

Signed-off-by: Marc Handalian <[email protected]>

* add assertions that Search Replicas are not in the in-sync id set nor the AllAllocationIds set in the routing table

Signed-off-by: Marc Handalian <[email protected]>

* update async task to only run if the FF is enabled and we are a remote store cluster. This check had previously only checked for segrep

Signed-off-by: Marc Handalian <[email protected]>

* clean up max shards logic

Signed-off-by: Marc Handalian <[email protected]>

* remove search replicas from check during renewPeerRecoveryRetentionLeases

Signed-off-by: Marc Handalian <[email protected]>

* Revert "update async task to only run if the FF is enabled and we are a remote store cluster."

reverting this, we already check for remote store earlier. This reverts commit 48ca1a3.

Signed-off-by: Marc Handalian <[email protected]>

* Add more tests for failover case

Signed-off-by: Marc Handalian <[email protected]>

* Update remotestore restore logic and add test ensuring we can restore only writers when red

Signed-off-by: Marc Handalian <[email protected]>

* Fix Search replicas to honor node level recovery limits

Signed-off-by: Marc Handalian <[email protected]>

* Fix translog UUID mismatch on existing store recovery. This commit adds PR feedback and recovery tests post node restart.

Signed-off-by: Marc Handalian <[email protected]>

* Fix spotless

Signed-off-by: Marc Handalian <[email protected]>

* Fix bug with remote restore and add more tests

Signed-off-by: Marc Handalian <[email protected]>

---------

Signed-off-by: Marc Handalian <[email protected]>
(cherry picked from commit 8191de8)
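The three numbered changes in the commit message can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `SearchReplicaSketch`, `ShardCopy`, and all three method names are hypothetical and are not OpenSearch classes or APIs; the strings `"EMPTY_STORE"` and `"PEER"` simply mirror the wording of the commit message.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the rules described in the commit message.
// None of these types or methods exist in OpenSearch under these names.
class SearchReplicaSketch {

    /** A non-primary shard copy: its allocation ID and whether it is search-only. */
    static final class ShardCopy {
        final String allocationId;
        final boolean searchOnly;

        ShardCopy(String allocationId, boolean searchOnly) {
            this.allocationId = allocationId;
            this.searchOnly = searchOnly;
        }
    }

    /** Change 1: search-only copies recover as an empty store; other replicas recover from a peer. */
    static String replicaRecoverySource(ShardCopy copy) {
        return copy.searchOnly ? "EMPTY_STORE" : "PEER";
    }

    /** Change 2: the in-sync allocation ID set leaves out search-only copies. */
    static Set<String> inSyncAllocationIds(List<ShardCopy> copies) {
        Set<String> ids = new LinkedHashSet<>();
        for (ShardCopy copy : copies) {
            if (!copy.searchOnly) {
                ids.add(copy.allocationId);
            }
        }
        return ids;
    }

    /** Change 3: a non-zero search replica count is rejected unless remote store is enabled. */
    static void validateSearchReplicaCount(int searchReplicas, boolean remoteStoreEnabled) {
        if (searchReplicas > 0 && !remoteStoreEnabled) {
            throw new IllegalArgumentException(
                "search replicas require a remote store enabled cluster");
        }
    }
}
```

In this model a primary never sees a search-only copy in its in-sync set, which corresponds to the commit's stated goal that primaries do not track or validate search replicas.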
❕ Gradle check result for 3b04475: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##                2.x   #16999      +/-   ##
============================================
+ Coverage     71.79%   71.92%   +0.12%
- Complexity    65454    65599     +145
============================================
  Files          5316     5318       +2
  Lines        305529   305739     +210
  Branches      44509    44580      +71
============================================
+ Hits         219359   219902     +543
+ Misses        67902    67504     -398
- Partials      18268    18333      +65
☔ View full report in Codecov by Sentry.
Description
Manual backport of #16760. The only conflict was in IndexShard.java, because TranslogManager does not exist on 2.x.
Related Issues
Resolves #15952
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.