Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce retention leases versioning #37951

Merged
merged 31 commits into from
Feb 1, 2019

Conversation

jasontedor
Copy link
Member

Because concurrent sync requests from a primary to its replicas could be in flight, it can be the case that an older retention leases collection arrives and is processed on the replica after a newer retention leases collection has arrived and been processed. Without a defense, in this case the replica would overwrite the newer retention leases with the older retention leases. This commit addresses this issue by introducing a versioning scheme to retention leases. This versioning scheme is used to resolve out-of-order processing on the replica. We persist this version into Lucene and restore it on recovery. The encoding of retention leases is starting to get a little ugly. We can consider addressing this in a follow-up.

Relates #37165

Because concurrent sync requests from a primary to its replicas could be
in flight, it can be the case that an older retention leases collection
arrives and is processed on the replica after a newer retention leases
collection has arrived and been processed. Without a defense, in this
case the replica would overwrite the newer retention leases with the
older retention leases. This commit addresses this issue by introducing
a versioning scheme to retention leases. This versioning scheme is used
to resolve out-of-order processing on the replica. We persist this
version into Lucene and restore it on recovery. The encoding of
retention leases is starting to get a little ugly. We can consider
addressing this in a follow-up.
@jasontedor jasontedor added >enhancement v7.0.0 :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. v6.7.0 labels Jan 29, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jasontedor for a quick response :). I left a comment to discuss.

assert primaryMode == false;
this.retentionLeases.clear();
this.retentionLeases.putAll(retentionLeases.stream().collect(Collectors.toMap(RetentionLease::id, Function.identity())));
if (retentionLeases.version() > version) {
Copy link
Member

@dnhatn dnhatn Jan 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might have an issue with the primary fail-over here. Suppose the primary is syncing two new leases requests L1 and L2, both of them are processed on replica-1, but the primary crashes before any of them arrives on the replica-2. If the replica-2 is promoted, then we add two new leases: L3 and L4. Unfortunately, replica-1 will ignore these two new requests, and its leases (L1 and L2) are different from the leases on the new primary (L3 and L4). I think we can use the primary-term with the leases version to have a total ordering pair <<term, version>>. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we could find a simpler way, but I can't. +1 from me on the suggestion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also looked for a simpler way, and @ywelsch did too. I could not find anything, and neither could @ywelsch. We synced earlier and agreed that this looks like the only way forward. We need to do a refactoring to the replication tracker and index shard to push the operation primary term down into the tracker, and then we will proceed with updating this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks. great catch @dnhatn .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dnhatn is going to work on adding the retention lease infrastructure to index level replication test case infrastructure so that we can test these scenarios.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dnhatn I have updated this PR to include the primary term here. I want to add some additional testing, but this PR is ready for another review round from you.

* master: (100 commits)
  Push primary term to replication tracker (elastic#38044)
  Introduce ability to minimize round-trips in CCS (elastic#37828)
  Don't Assert Ack on when Publish Timeout is 0 in Test (elastic#38077)
  Reduce object creation in Rounding class (elastic#38061)
  Treat put-mapping calls with `_doc` as a top-level key as typed calls. (elastic#38032)
  Fix test bug when testing the merging of mappings and templates. (elastic#38021)
  spelling: java script -- not JavaScript (elastic#37057)
  Enable SSL in reindex with security QA tests (elastic#37600)
  Disable BWC tests during backport (elastic#38074)
  SQL: Added SSL configuration options tests (elastic#37875)
  Minor fixes in the release notes script. (elastic#37967)
  Fix typo in docs. (elastic#38018)
  Update Lucene repo for 7.0.0-alpha2 (elastic#37985)
  Fix size of rolling-upgrade bootstrap config (elastic#38031)
  fix DateIndexNameProcessorTests offset pattern (elastic#38069)
  Speed up converting of temporal accessor to zoned date time (elastic#37915)
  Work around JDK8 timezone bug in tests (elastic#37968)
  Correct arg names when update mapping/settings from leader (elastic#38063)
  Introduce ssl settings to reindex from remote (elastic#37527)
  Mute testRetentionLeasesSyncOnExpiration
  ...
…ersion

* elastic/master:
  Do not set up NodeAndClusterIdStateListener in test (elastic#38110)
  ML: better handle task state race condition (elastic#38040)
  Soft-deletes policy should always fetch latest leases (elastic#37940)
  Handle scheduler exceptions (elastic#38014)
  Minor logging improvements (elastic#38084)
  Fix Painless void return bug (elastic#38046)
  Update PutFollowAction serialization post-backport (elastic#37989)
  fix a few versionAdded values in ElasticsearchExceptions (elastic#37877)
  Reenable BWC tests after backport of elastic#37899 (elastic#38093)
  Mute failing test
  Mute failing test
  Fail start on obsolete indices documentation (elastic#37786)
  SQL: Implement FIRST/LAST aggregate functions (elastic#37936)
  Un-mute NoMasterNodeIT.testNoMasterActionsWriteMasterBlock
  remove unused parser fields in RemoteResponseParsers
@jasontedor jasontedor requested a review from dnhatn January 31, 2019 18:24
Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some minor comments. This PR is really good already. Thanks @jasontedor.

@jasontedor
Copy link
Member Author

@elasticmachine test this please

@jasontedor
Copy link
Member Author

@elasticmachine test this please

* master: (36 commits)
  Ensure joda compatibility in custom date formats (elastic#38171)
  Do not compute cardinality if the `terms` execution mode does not use `global_ordinals` (elastic#38169)
  Do not set timeout for IndexRequests in GatewayIndexStateIT (elastic#38147)
  Zen2ify testMasterFailoverDuringIndexingWithMappingChanges (elastic#38178)
  SQL: [Docs] Add limitation for aggregate functions on scalars (elastic#38186)
  Add elasticsearch-node detach-cluster command (elastic#37979)
  Add tests for fractional epoch parsing (elastic#38162)
  Enable bw tests for elastic#37871 and elastic#38032. (elastic#38167)
  Clear send behavior rule in CloseWhileRelocatingShardsIT (elastic#38159)
  Fix testCorruptedIndex (elastic#38161)
  Add finalReduce flag to SearchRequest (elastic#38104)
  Forbid negative field boosts in analyzed queries (elastic#37930)
  Remove AtomiFieldData#getLegacyFieldValues (elastic#38087)
  Universal cluster bootstrap method for tests with autoMinMasterNodes=false (elastic#38038)
  Fix FullClusterRestartIT.testHistoryUUIDIsAdded (elastic#38098)
  Replace joda time in ingest-common module (elastic#38088)
  Fix eclipse config for ssl-config (elastic#38096)
  Don't load global ordinals with the `map` execution_hint (elastic#37833)
  Relax fault detector in some disruption tests (elastic#38101)
  Fix java time epoch date formatters (elastic#37829)
  ...
* master:
  Replace awaitBusy with assertBusy in atLeastDocsIndexed (elastic#38190)
  Adjust SearchRequest version checks (elastic#38181)
  AwaitsFix testClientSucceedsWithVerificationDisabled (elastic#38213)
  Zen2ify RareClusterStateIT (elastic#38184)
  ML: Fix error race condition on stop _all datafeeds and close _all jobs (elastic#38113)
  AwaitsFix PUT mapping with _doc on an index that has types (elastic#38204)
  Allow built-in monitoring_user role to call GET _xpack API (elastic#38060)
  Update geo_shape docs to include unsupported features (elastic#38138)
  [ML] Remove "8" prefixes from file structure finder timestamp formats (elastic#38016)
  Disable bwc tests while backporting elastic#38104 (elastic#38182)
  Enable TLSv1.3 by default for JDKs with support (elastic#38103)
  Fix _host based require filters (elastic#38173)
  RestoreService should update primary terms when restoring shards of existing indices (elastic#38177)
  Throw if two inner_hits have the same name (elastic#37645)
…ersion

* elastic/master:
  SnapshotShardsService Simplifications (elastic#38025)
  Default include_type_name to false in the yml test harness. (elastic#38058)
  Disable bwc preparing to backport of#37977, elastic#37857 and elastic#37872 (elastic#38126)
  Adding ml_settings entry to HLRC and Docs for deprecation_info (elastic#38118)
@jasontedor jasontedor merged commit f181e17 into elastic:master Feb 1, 2019
@jasontedor jasontedor deleted the retention-leases-version branch February 1, 2019 22:19
jasontedor added a commit to bleskes/elasticsearch that referenced this pull request Feb 1, 2019
* elastic/master:
  Introduce retention leases versioning (elastic#37951)
  Correctly disable tests for FIPS JVMs (elastic#38214)
jasontedor added a commit to gwbrown/elasticsearch that referenced this pull request Feb 1, 2019
* elastic/master: (54 commits)
  Introduce retention leases versioning (elastic#37951)
  Correctly disable tests for FIPS JVMs (elastic#38214)
  AwaitsFix testAbortedSnapshotDuringInitDoesNotStart (elastic#38227)
  Preserve ILM operation mode when creating new lifecycles (elastic#38134)
  Enable trace log in FollowerFailOverIT (elastic#38148)
  SnapshotShardsService Simplifications (elastic#38025)
  Default include_type_name to false in the yml test harness. (elastic#38058)
  Disable bwc preparing to backport of#37977, elastic#37857 and elastic#37872 (elastic#38126)
  Adding ml_settings entry to HLRC and Docs for deprecation_info (elastic#38118)
  Replace awaitBusy with assertBusy in atLeastDocsIndexed (elastic#38190)
  Adjust SearchRequest version checks (elastic#38181)
  AwaitsFix testClientSucceedsWithVerificationDisabled (elastic#38213)
  Zen2ify RareClusterStateIT (elastic#38184)
  ML: Fix error race condition on stop _all datafeeds and close _all jobs (elastic#38113)
  AwaitsFix PUT mapping with _doc on an index that has types (elastic#38204)
  Allow built-in monitoring_user role to call GET _xpack API (elastic#38060)
  Update geo_shape docs to include unsupported features (elastic#38138)
  [ML] Remove "8" prefixes from file structure finder timestamp formats (elastic#38016)
  Disable bwc tests while backporting elastic#38104 (elastic#38182)
  Enable TLSv1.3 by default for JDKs with support (elastic#38103)
  ...
dnhatn added a commit that referenced this pull request Feb 2, 2019
If the innerLength is 0, the version won't be increased; then there will
be two RetentionLeases with the same term and version, but their leases
are different.

Relates #37951
Closes #38245
dnhatn added a commit that referenced this pull request Feb 2, 2019
We should increase primary term before renewing leases; otherwise, the
term of the latest RetentionLeases will be lower than the current term.

Relates #37951
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Feb 2, 2019
If the innerLength is 0, the version won't be increased; then there will
be two RetentionLeases with the same term and version, but their leases
are different.

Relates elastic#37951
Closes elastic#38245
dnhatn added a commit that referenced this pull request Feb 2, 2019
If the innerLength is 0, the version won't be increased; then there will
be two RetentionLeases with the same term and version, but their leases
are different.

Relates #37951
Closes #38245
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. >enhancement v6.7.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants