Add voting-only master node #43410

Merged
merged 31 commits into from
Jun 25, 2019
31 commits
0567b49
Voting-only nodes
DaveCTurner Jun 18, 2019
7dd80ce
Randomise good-quorum calculation in CoordinationStateTests
DaveCTurner Jun 18, 2019
c298508
State transfer only
ywelsch Jun 19, 2019
0d3750d
Move to isElectionQuorum
ywelsch Jun 19, 2019
ee52a89
use JoinVoteCollection
ywelsch Jun 19, 2019
1a50872
fix test
ywelsch Jun 19, 2019
2cb2010
fix build and use transport intercepter
ywelsch Jun 19, 2019
8383823
move tests
ywelsch Jun 19, 2019
9ee9fdc
move tests
ywelsch Jun 19, 2019
77c252f
rest test
ywelsch Jun 19, 2019
9de65ea
Add x-pack feature set
ywelsch Jun 19, 2019
03d249f
add docs
ywelsch Jun 19, 2019
b170c9d
fixup
ywelsch Jun 19, 2019
c11b2e6
fix docs tests
ywelsch Jun 19, 2019
d109686
Register usage action
ywelsch Jun 20, 2019
e410121
more fixups
ywelsch Jun 20, 2019
88f081d
Merge remote-tracking branch 'elastic/master' into state-transfer-only
ywelsch Jun 20, 2019
5e260d5
more fixes
ywelsch Jun 20, 2019
65fda75
More fixups
ywelsch Jun 20, 2019
1b30c71
Merge remote-tracking branch 'elastic/master' into state-transfer-only
ywelsch Jun 20, 2019
fbba2c3
fix docs tests on OSS distrib
ywelsch Jun 20, 2019
e7c325e
Fold JoinVoteCollection into VoteCollection
ywelsch Jun 21, 2019
1caa1b6
s/election type/election strategy/
ywelsch Jun 21, 2019
6907d06
test adjustment
ywelsch Jun 21, 2019
e284426
Move ElectionStrategy from interface to class
ywelsch Jun 21, 2019
bcd6ec2
Have VotingOnlyNodePlugin always enabled
ywelsch Jun 21, 2019
264e5a3
Ryan feedback
ywelsch Jun 21, 2019
d8a0b9a
checkstyle
ywelsch Jun 21, 2019
ccdb483
doc changes
ywelsch Jun 21, 2019
28efcf0
Add note about voting-only in default distrib
ywelsch Jun 24, 2019
7bb6fa2
Reword docs
DaveCTurner Jun 25, 2019
18 changes: 12 additions & 6 deletions docs/reference/cluster.asciidoc
@@ -22,12 +22,14 @@ one of the following:
* an IP address or hostname, to add all matching nodes to the subset.
* a pattern, using `*` wildcards, which adds all nodes to the subset
whose name, address or hostname matches the pattern.
* `master:true`, `data:true`, `ingest:true` or `coordinating_only:true`, which
respectively add to the subset all master-eligible nodes, all data nodes,
all ingest nodes, and all coordinating-only nodes.
* `master:false`, `data:false`, `ingest:false` or `coordinating_only:false`,
which respectively remove from the subset all master-eligible nodes, all data
nodes, all ingest nodes, and all coordinating-only nodes.
* `master:true`, `data:true`, `ingest:true`, `voting_only:true` or
`coordinating_only:true`, which respectively add to the subset all
master-eligible nodes, all data nodes, all ingest nodes, all voting-only
nodes, and all coordinating-only nodes.
* `master:false`, `data:false`, `ingest:false`, `voting_only:false`, or
`coordinating_only:false`, which respectively remove from the subset all
master-eligible nodes, all data nodes, all ingest nodes, all voting-only
nodes and all coordinating-only nodes.
Member

I think the docs should clearly delineate that voting_only requires x-pack, we can wrap it in conditionals and add x-pack annotations so that it doesn't show in the docs if someone builds the OSS-only docs, and has x-pack designations in the docs when the full docs are published.

Contributor Author

I didn't know that OSS-only docs were even a thing. Are we publishing those somewhere? How do you build those? You will have to spell out the details on how to set up the conditionals, I'm not aware of any such infrastructure.

Contributor

While we do have [x-pack] macros, I'm not aware of an OSS-only docs build functionality.

@debadair @lcawl Are you aware of an OSS-only docs build?

Member

We don’t build them but it’s available for users that want to build OSS-only docs.

@lcawl Can you help @ywelsch add the appropriate x-pack annotations here?

Contributor

Content that requires the default distribution should be tagged with the [role="xpack"] directive. Now that all of the doc source is in the public repo we no longer maintain two versions of the index.asciidoc file, so conditional statements based on the include_xpack attribute have no effect.

Inline references like this are tricky. If using the voting_only attribute throws an error in the OSS distro, I'd be inclined to add a note to that effect. Something like:

NOTE: Designating nodes as voting_only and using voting_only in node filters requires the default distribution of Elasticsearch.

@lcawl can correct me if I'm wrong, but I don't think there's (currently) any way to attach the xpack bug to an admonition block.

* a pair of patterns, using `*` wildcards, of the form `attrname:attrvalue`,
which adds to the subset all nodes with a custom node attribute whose name
and value match the respective patterns. Custom node attributes are
@@ -46,6 +48,9 @@ means that filters such as `master:false` which remove nodes from the chosen
subset are only useful if they come after some other filters. When used on its
own, `master:false` selects no nodes.

NOTE: The `voting_only` role requires the {default-dist} of Elasticsearch and
is not supported in the {oss-dist}.

Here are some examples of the use of node filters with the
<<cluster-nodes-info,Nodes Info>> APIs.

@@ -69,6 +74,7 @@ GET /_nodes/10.0.0.*
GET /_nodes/_all,master:false
GET /_nodes/data:true,ingest:true
GET /_nodes/coordinating_only:true
GET /_nodes/master:true,voting_only:false
# Select nodes by custom attribute (e.g. with something like `node.attr.rack: 2` in the configuration file)
GET /_nodes/rack:2
GET /_nodes/ra*:2
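
The order-dependent behaviour of these filters can be modelled in a few lines. The following is a minimal sketch, not Elasticsearch's actual resolver: the class `NodeFilterSketch`, its `resolve` method, and the three-node sample cluster are all hypothetical, and only `_all` and `role:true` / `role:false` filters are handled.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class NodeFilterSketch {

    /**
     * Applies comma-separated filters left to right against a map of node name to roles.
     * "_all" adds every node, "role:true" adds nodes with that role, and
     * "role:false" removes nodes with that role from whatever has been selected so far.
     */
    static Set<String> resolve(String expression, Map<String, Set<String>> rolesByNode) {
        Set<String> selected = new LinkedHashSet<>();
        for (String filter : expression.split(",")) {
            if (filter.equals("_all")) {
                selected.addAll(rolesByNode.keySet());
                continue;
            }
            String[] parts = filter.split(":", 2);
            boolean add = Boolean.parseBoolean(parts[1]);
            for (Map.Entry<String, Set<String>> node : rolesByNode.entrySet()) {
                if (node.getValue().contains(parts[0])) {
                    if (add) {
                        selected.add(node.getKey());
                    } else {
                        selected.remove(node.getKey());
                    }
                }
            }
        }
        return selected;
    }

    /** A hypothetical three-node cluster used only for illustration. */
    static Map<String, Set<String>> sampleCluster() {
        Map<String, Set<String>> nodes = new LinkedHashMap<>();
        nodes.put("m1", Set.of("master"));
        nodes.put("m2", Set.of("master", "voting_only"));
        nodes.put("d1", Set.of("data"));
        return nodes;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> cluster = sampleCluster();
        System.out.println(resolve("master:true,voting_only:false", cluster)); // [m1]
        System.out.println(resolve("master:false", cluster));                  // []
        System.out.println(resolve("_all,master:false", cluster));             // [d1]
    }
}
```

In this model `master:false` on its own selects nothing because there is nothing to remove from yet, which matches the documented behaviour of removal filters.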
9 changes: 7 additions & 2 deletions docs/reference/cluster/stats.asciidoc
@@ -109,7 +109,8 @@ Will return, for example:
"data": 1,
"coordinating_only": 0,
"master": 1,
"ingest": 1
"ingest": 1,
"voting_only": 0
},
"versions": [
"{version}"
@@ -207,6 +208,7 @@ Will return, for example:
// TESTRESPONSE[s/"plugins": \[[^\]]*\]/"plugins": $body.$_path/]
// TESTRESPONSE[s/"network_types": \{[^\}]*\}/"network_types": $body.$_path/]
// TESTRESPONSE[s/"discovery_types": \{[^\}]*\}/"discovery_types": $body.$_path/]
// TESTRESPONSE[s/"count": \{[^\}]*\}/"count": $body.$_path/]
// TESTRESPONSE[s/"packaging_types": \[[^\]]*\]/"packaging_types": $body.$_path/]
// TESTRESPONSE[s/: true|false/: $body.$_path/]
// TESTRESPONSE[s/: (\-)?[0-9]+/: $body.$_path/]
@@ -217,7 +219,10 @@ Will return, for example:
// see an exhaustive list anyway.
// 2. Similarly, ignore the contents of `network_types`, `discovery_types`, and
// `packaging_types`.
// 3. All of the numbers and strings on the right hand side of *every* field in
// 3. Ignore the contents of the (nodes) count object, as what's shown here
// depends on the license. For example, voting-only nodes are only shown when
// this test runs with a basic license.
// 4. All of the numbers and strings on the right hand side of *every* field in
// the response are ignored. So we're really only asserting things about the
// shape of this response, not the values in it.
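
To see what the new `"count"` substitution does to a response, the same pattern can be tried with `java.util.regex`. This is only an illustrative sketch: the class `CountSubstitutionSketch` and method `maskCount` are hypothetical, and the docs test harness may apply the substitution differently from a plain `replaceAll`.

```java
import java.util.regex.Matcher;

final class CountSubstitutionSketch {

    /**
     * Replaces the body of any flat "count" object with the literal placeholder
     * $body.$_path, mirroring the s/"count": \{[^\}]*\}/"count": $body.$_path/ rule.
     * quoteReplacement stops Java from treating '$' as a group reference.
     */
    static String maskCount(String json) {
        return json.replaceAll("\"count\": \\{[^\\}]*\\}",
            Matcher.quoteReplacement("\"count\": $body.$_path"));
    }

    public static void main(String[] args) {
        String response = "{\"nodes\": {\"count\": {\"total\": 1, \"voting_only\": 0}, \"versions\": []}}";
        System.out.println(maskCount(response));
    }
}
```

Because `[^\}]*` stops at the first closing brace, the pattern only works while the count object stays flat, and numeric `"count"` fields are left untouched.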

46 changes: 44 additions & 2 deletions docs/reference/modules/node.asciidoc
@@ -84,8 +84,9 @@ creating or deleting an index, tracking which nodes are part of the cluster,
and deciding which shards to allocate to which nodes. It is important for
cluster health to have a stable master node.

Any master-eligible node (all nodes by default) may be elected to become the
master node by the <<modules-discovery,master election process>>.
Any master-eligible node that is not a <<voting-only-node,voting-only node>> may
be elected to become the master node by the <<modules-discovery,master election
process>>.

IMPORTANT: Master nodes must have access to the `data/` directory (just like
`data` nodes) as this is where the cluster state is persisted between node restarts.
@@ -134,6 +135,47 @@ cluster.remote.connect: false <4>
<3> Disable the `node.ingest` role (enabled by default).
<4> Disable {ccs} (enabled by default).

[float]
[[voting-only-node]]
==== Voting-only master-eligible node

A voting-only master-eligible node is a node that participates in
<<modules-discovery,master elections>> but which will not act as the cluster's
elected master node. In particular, a voting-only node can serve as a tiebreaker
in elections.

It may seem confusing to use the term "master-eligible" to describe a
voting-only node since such a node is not actually eligible to become the master
at all. This terminology is an unfortunate consequence of history:
master-eligible nodes are those nodes that participate in elections and perform
certain tasks during cluster state publications, and voting-only nodes have the
same responsibilities even if they can never become the elected master.

To configure a master-eligible node as a voting-only node, set the following
setting:

[source,yaml]
-------------------
node.voting_only: true <1>
-------------------
<1> The default for `node.voting_only` is `false`.

IMPORTANT: The `voting_only` role requires the {default-dist} of Elasticsearch
and is not supported in the {oss-dist}. If you use the {oss-dist} and set
`node.voting_only` then the node will fail to start. Also note that only
master-eligible nodes can be marked as voting-only.

High availability (HA) clusters require at least three master-eligible nodes, at
least two of which are not voting-only nodes. Such a cluster will be able to
elect a master node even if one of the nodes fails.

Since voting-only nodes never act as the cluster's elected master, they may
require less heap and a less powerful CPU than the true master nodes.
However, all master-eligible nodes, including voting-only nodes, require
reasonably fast persistent storage and a reliable and low-latency network
connection to the rest of the cluster, since they are on the critical path for
<<cluster-state-publishing,publishing cluster state updates>>.
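
The high-availability rule above (at least three master-eligible nodes, at least two of them not voting-only) can be checked with a deliberately simplified model. This sketch assumes an election succeeds when the surviving nodes form a strict majority and at least one of them is not voting-only; the real election logic involves terms and voting configurations that are ignored here, and every name in the block is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class VotingOnlyAvailabilitySketch {

    /** Simplified master-eligible node: voting-only nodes vote but cannot be elected. */
    static final class MasterEligibleNode {
        final boolean votingOnly;

        MasterEligibleNode(boolean votingOnly) {
            this.votingOnly = votingOnly;
        }
    }

    /** Builds a cluster with one master-eligible node per flag. */
    static List<MasterEligibleNode> cluster(boolean... votingOnlyFlags) {
        List<MasterEligibleNode> nodes = new ArrayList<>();
        for (boolean votingOnly : votingOnlyFlags) {
            nodes.add(new MasterEligibleNode(votingOnly));
        }
        return nodes;
    }

    /** True if the survivors are a strict majority and include an electable node. */
    static boolean canElectMaster(List<MasterEligibleNode> all, List<MasterEligibleNode> alive) {
        boolean majority = alive.size() * 2 > all.size();
        boolean hasCandidate = alive.stream().anyMatch(node -> !node.votingOnly);
        return majority && hasCandidate;
    }

    /** True if a master can still be elected after any single node fails. */
    static boolean toleratesAnySingleFailure(List<MasterEligibleNode> all) {
        for (MasterEligibleNode failed : all) {
            List<MasterEligibleNode> alive = new ArrayList<>(all);
            alive.remove(failed);
            if (!canElectMaster(all, alive)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(toleratesAnySingleFailure(cluster(false, false, true))); // true
        System.out.println(toleratesAnySingleFailure(cluster(false, true, true)));  // false
        System.out.println(toleratesAnySingleFailure(cluster(false, false)));       // false
    }
}
```

With only one full master-eligible node, losing that node leaves a majority of voters but nobody who can be elected, which is why two non-voting-only nodes are needed.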

[float]
[[data-node]]
=== Data Node
4 changes: 4 additions & 0 deletions docs/reference/rest-api/info.asciidoc
@@ -107,6 +107,10 @@ Example response:
"available" : true,
"enabled" : true
},
"voting_only" : {
"available" : true,
"enabled" : true
},
"watcher" : {
"available" : true,
"enabled" : true

@@ -121,14 +121,16 @@ static class ClusterFormationState {
private final List<TransportAddress> resolvedAddresses;
private final List<DiscoveryNode> foundPeers;
private final long currentTerm;
private final ElectionStrategy electionStrategy;

ClusterFormationState(Settings settings, ClusterState clusterState, List<TransportAddress> resolvedAddresses,
List<DiscoveryNode> foundPeers, long currentTerm) {
List<DiscoveryNode> foundPeers, long currentTerm, ElectionStrategy electionStrategy) {
this.settings = settings;
this.clusterState = clusterState;
this.resolvedAddresses = resolvedAddresses;
this.foundPeers = foundPeers;
this.currentTerm = currentTerm;
this.electionStrategy = electionStrategy;
}

String getDescription() {
@@ -185,7 +187,9 @@ String getDescription() {
final VoteCollection voteCollection = new VoteCollection();
foundPeers.forEach(voteCollection::addVote);
final String isQuorumOrNot
= CoordinationState.isElectionQuorum(voteCollection, clusterState) ? "is a quorum" : "is not a quorum";
= electionStrategy.isElectionQuorum(clusterState.nodes().getLocalNode(), currentTerm, clusterState.term(),
clusterState.version(), clusterState.getLastCommittedConfiguration(), clusterState.getLastAcceptedConfiguration(),
voteCollection) ? "is a quorum" : "is not a quorum";

return String.format(Locale.ROOT,
"master not discovered or elected yet, an election requires %s, have discovered %s which %s; %s",

@@ -24,13 +24,14 @@
import org.elasticsearch.cluster.coordination.CoordinationMetaData.VotingConfiguration;
import org.elasticsearch.cluster.metadata.MetaData;
import org.elasticsearch.cluster.node.DiscoveryNode;
import org.elasticsearch.common.settings.Settings;

import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/**
* The core class of the cluster state coordination algorithm, directly implementing the
@@ -42,6 +43,8 @@ public class CoordinationState {

private final DiscoveryNode localNode;

private final ElectionStrategy electionStrategy;

// persisted state
private final PersistedState persistedState;

@@ -53,11 +56,12 @@ public class CoordinationState {
private VotingConfiguration lastPublishedConfiguration;
private VoteCollection publishVotes;

public CoordinationState(Settings settings, DiscoveryNode localNode, PersistedState persistedState) {
public CoordinationState(DiscoveryNode localNode, PersistedState persistedState, ElectionStrategy electionStrategy) {
this.localNode = localNode;

// persisted state
this.persistedState = persistedState;
this.electionStrategy = electionStrategy;

// transient state
this.joinVotes = new VoteCollection();
@@ -100,13 +104,9 @@ public boolean electionWon() {
return electionWon;
}

public boolean isElectionQuorum(VoteCollection votes) {
return isElectionQuorum(votes, getLastAcceptedState());
}

static boolean isElectionQuorum(VoteCollection votes, ClusterState lastAcceptedState) {
return votes.isQuorum(lastAcceptedState.getLastCommittedConfiguration())
&& votes.isQuorum(lastAcceptedState.getLastAcceptedConfiguration());
public boolean isElectionQuorum(VoteCollection joinVotes) {
return electionStrategy.isElectionQuorum(localNode, getCurrentTerm(), getLastAcceptedTerm(), getLastAcceptedVersion(),
getLastCommittedConfiguration(), getLastAcceptedConfiguration(), joinVotes);
}
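
The removed static method required a quorum in both the last-committed and the last-accepted configuration; the replacement hands the same inputs to an `ElectionStrategy`. A rough sketch of that shape follows, assuming a quorum means a strict majority of a configuration's node ids. The class `ElectionQuorumSketch` and the `extraConstraint` parameter are hypothetical stand-ins, not the actual `ElectionStrategy` API.

```java
import java.util.Set;
import java.util.function.Predicate;

final class ElectionQuorumSketch {

    /** True if the votes contain a strict majority of the node ids in the configuration. */
    static boolean isQuorum(Set<String> votes, Set<String> configuration) {
        long matching = configuration.stream().filter(votes::contains).count();
        return matching * 2 > configuration.size();
    }

    /**
     * Majority check against both configurations, followed by a pluggable extra rule.
     * The extraConstraint parameter is a hypothetical stand-in for strategy-specific checks.
     */
    static boolean isElectionQuorum(Set<String> votes, Set<String> lastCommitted, Set<String> lastAccepted,
                                    Predicate<Set<String>> extraConstraint) {
        return isQuorum(votes, lastCommitted) && isQuorum(votes, lastAccepted) && extraConstraint.test(votes);
    }

    public static void main(String[] args) {
        Set<String> votes = Set.of("a", "b");
        Predicate<Set<String>> none = v -> true;
        System.out.println(isElectionQuorum(votes, Set.of("a", "b", "c"), Set.of("a", "b", "c"), none)); // true
        System.out.println(isElectionQuorum(votes, Set.of("a", "b", "c"), Set.of("c", "d", "e"), none)); // false
        // An extra rule can veto a set of votes that is otherwise a valid majority.
        System.out.println(isElectionQuorum(votes, Set.of("a", "b", "c"), Set.of("a", "b", "c"),
            v -> v.contains("c"))); // false
    }
}
```

Keeping the majority checks fixed and making only the extra rule pluggable means a strategy can tighten the definition of an election quorum but never loosen it.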

public boolean isPublishQuorum(VoteCollection votes) {
@@ -117,6 +117,11 @@ public boolean containsJoinVoteFor(DiscoveryNode node) {
return joinVotes.containsVoteFor(node);
}

// used for tests
boolean containsJoin(Join join) {
return joinVotes.getJoins().contains(join);
}

public boolean joinVotesHaveQuorumFor(VotingConfiguration votingConfiguration) {
return joinVotes.isQuorum(votingConfiguration);
}
@@ -243,7 +248,7 @@ public boolean handleJoin(Join join) {
throw new CoordinationStateRejectedException("rejecting join since this node has not received its initial configuration yet");
}

boolean added = joinVotes.addVote(join.getSourceNode());
boolean added = joinVotes.addJoinVote(join);
boolean prevElectionWon = electionWon;
electionWon = isElectionQuorum(joinVotes);
assert !prevElectionWon || electionWon; // we cannot go from won to not won
@@ -489,18 +494,28 @@ default void markLastAcceptedStateAsCommitted() {
}

/**
* A collection of votes, used to calculate quorums.
* A collection of votes, used to calculate quorums. Optionally records the Joins as well.
*/
public static class VoteCollection {

private final Map<String, DiscoveryNode> nodes;
private final Set<Join> joins;

public boolean addVote(DiscoveryNode sourceNode) {
return nodes.put(sourceNode.getId(), sourceNode) == null;
}

public boolean addJoinVote(Join join) {
final boolean added = addVote(join.getSourceNode());
if (added) {
joins.add(join);
}
return added;
}

public VoteCollection() {
nodes = new HashMap<>();
joins = new HashSet<>();
}

public boolean isQuorum(VotingConfiguration configuration) {
@@ -519,24 +534,31 @@ public Collection<DiscoveryNode> nodes() {
return Collections.unmodifiableCollection(nodes.values());
}

public Set<Join> getJoins() {
return Collections.unmodifiableSet(joins);
}

@Override
public String toString() {
return "VoteCollection{" + String.join(",", nodes.keySet()) + "}";
return "VoteCollection{votes=" + nodes.keySet() + ", joins=" + joins + "}";
}

@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
if (!(o instanceof VoteCollection)) return false;

VoteCollection that = (VoteCollection) o;

return nodes.equals(that.nodes);
if (!nodes.equals(that.nodes)) return false;
return joins.equals(that.joins);
}

@Override
public int hashCode() {
return nodes.hashCode();
int result = nodes.hashCode();
result = 31 * result + joins.hashCode();
return result;
}
}
}
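
The interaction between `addVote` and `addJoinVote` is easy to miss in the diff: a join is recorded only when it contributes a new vote. The following stripped-down sketch shows that behaviour, using plain node-id strings and a hypothetical two-field `Join` in place of the real `DiscoveryNode` and `Join` classes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class VoteCollectionSketch {

    /** Hypothetical stand-in for a Join: the voter's id and the term it voted in. */
    static final class Join {
        final String sourceNodeId;
        final long term;

        Join(String sourceNodeId, long term) {
            this.sourceNodeId = sourceNodeId;
            this.term = term;
        }
    }

    private final Map<String, String> nodes = new HashMap<>();
    private final Set<Join> joins = new HashSet<>();

    /** Returns true only the first time a given node id votes. */
    boolean addVote(String nodeId) {
        return nodes.put(nodeId, nodeId) == null;
    }

    /** Records the join only if it contributed a new vote. */
    boolean addJoinVote(Join join) {
        boolean added = addVote(join.sourceNodeId);
        if (added) {
            joins.add(join);
        }
        return added;
    }

    Set<String> votes() {
        return Collections.unmodifiableSet(nodes.keySet());
    }

    Set<Join> getJoins() {
        return Collections.unmodifiableSet(joins);
    }

    public static void main(String[] args) {
        VoteCollectionSketch votes = new VoteCollectionSketch();
        System.out.println(votes.addJoinVote(new Join("n1", 1))); // true
        System.out.println(votes.addJoinVote(new Join("n1", 2))); // false, n1 already voted
        System.out.println(votes.addVote("n2"));                  // true, but no join is recorded
        System.out.println(votes.votes().size() + " votes, " + votes.getJoins().size() + " joins");
    }
}
```

A plain `addVote` therefore leaves the join set untouched, so the set of joins can be smaller than the set of votes.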