Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Zen2] PersistedState interface implementation #35819

Merged
merged 32 commits into from
Nov 27, 2018

Conversation

andrershov
Copy link
Contributor

Today GatewayMetaState is capable of atomically storing MetaData to
disk. We've also moved fields that are needed to be persisted in Zen2
from ClusterState to ClusterState.MetaData.CoordinationMetaData.

This PR implements PersistedState interface.

  1. version and currentTerm are persisted as a part of Mainfest.
  2. GatewayMetaState now have two descendants Zen1GatewayMetaState
    and Zen2GatewayMetaState (which implements PersistedState).
  3. GatewayMetaState now constructs previousClusterState (including
    MetaData) and previousManifest inside the constructor, so that all
    PersistedState methods are usable as soon as GatewayMetaState
    instance is constructed. Also loadMetaData is renamed to
    getMetaData, because it just returns
    previousClusterState.metaData().
  4. Currently when deciding whether to write IndexMetaData to disk,
    we're comparing current IndexMetaData version and received
    IndexMetaData version. This is not safe in Zen2 if term has changed.
    So updateClusterState now accepts writeWithoutComparingVersion
    method parameter. When it's set to true, we always write
    IndexMetaData to disk.
  5. Things that are not covered by GatewayMetaStateTests are covered
    by Zen2GatewayMetaStateTests.

This PR does not inject Zen2GatewayMetaState in place of
Zen1GatewayMetaState and in place of InMemoryPersistedState, this
is a work to be done in a follow-up PR.

@andrershov andrershov added >enhancement :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. labels Nov 22, 2018
@andrershov andrershov self-assigned this Nov 22, 2018
@andrershov andrershov requested a review from ywelsch November 22, 2018 11:03
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some comments, the main one being about initialization where the localNode object is not available yet. I'm also not sure about the class names Zen1GatewayMetaState and Zen2GatewayMetaState, but we can address that in a follow-up.

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really like to see an ESIntegTestCase that forms a Zen2 cluster and performs a full cluster restart here. Is that possible, or is the lack of changes to state recovery an obstruction? That kind of test would catch things like this
https://github.com/elastic/elasticsearch/pull/35819/files#r235791731 for instance.

@andrershov
Copy link
Contributor Author

andrershov commented Nov 26, 2018

@ywelsch @DaveCTurner, I've made further changes, could you please take a look?

  1. I've merged Zen1GatewayMetaState and Zen2GatewayMetaState into GatewayMetaState to make sure we can use GatewayMetaState for both Zen1 and Zen2 discovery.
  2. GatewayService is no longer responsible for registering GatewayMetaState as ClusterStateApplier, now ZenDiscovery is responsible for that.
  3. ZenDiscovery accepts ClusterApplier, not ClusterApplierService. And it turned out to be not easy to make it accept ClusterApplierService and make tests working. That's why I've moved addLowPriorityApplier to ClusterApplier interface, which is not elegant.
  4. My first attempt was to change TestZenDiscovery (in zen2 mode) to use GatewayMetaState instead of InMemoryPersistedState, after this change 10 integration tests were failing, because we need to implement proper state recovery, which is a subject of separate PR. That's why I've implemented another option to use GatewayMetaState in place of InMemoryPersistedState with InMemoryPersistedState being default. IfInMemoryPeristedState is used, we're currently registering addLowPriorityApplier with GatewayMetaState, to make sure it receives all cluster state updates (otherwise 13 integration tests and one unit tests are failing).
  5. There is one test that uses Zen2 with GatewayMetaState, see PersistedStateIT. It tests that cluster persistent settings work properly, it does not pass with InMemoryPersistentState. I was trying to make a test with IndexMetaData, but it requires proper recovery implementation.
  6. Also, I've merged VotingTombstone class from zen2 branch and enhanced Zen2GatewayMetaStateTests.

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some more smaller comments. Looking good already.

@andrershov
Copy link
Contributor Author

@ywelsch I'm done with the changes, ready for the next pass.

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@andrershov andrershov merged commit 0e283f9 into elastic:zen2 Nov 27, 2018
@andrershov andrershov mentioned this pull request Nov 28, 2018
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >enhancement v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants