-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Zen2] Introduce gossip-like discovery of master nodes #32246
Conversation
This commit introduces the `PeerFinder` which can be used to collect the identities of the master-eligible nodes in a masterless cluster, based on the `UnicastHostsProvider`, the nodes in the `ClusterState`, and nodes that other nodes have discovered.
Pinging @elastic/es-distributed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done an initial round. I've left mostly smaller comments, the general flow and the testing looks good.
|
||
public abstract class PeerFinder extends AbstractLifecycleComponent { | ||
|
||
public static final String REQUEST_PEERS_ACTION_NAME = "internal:discovery/zen2/requestpeers"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we remove the zen2
part here and just have this under discovery
? Also maybe request_peers
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, done
|
||
// the time between attempts to find all peers | ||
public static final Setting<TimeValue> CONSENSUS_FIND_PEERS_DELAY_SETTING = | ||
Setting.timeSetting("discovery.zen2.find_peers_delay", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove zen2
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, done
public static final String REQUEST_PEERS_ACTION_NAME = "internal:discovery/zen2/requestpeers"; | ||
|
||
// the time between attempts to find all peers | ||
public static final Setting<TimeValue> CONSENSUS_FIND_PEERS_DELAY_SETTING = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consensus? 🗡
just FIND_PEERS_DELAY_SETTING
I also wonder if we should call this FIND_PEERS_INTERVAL_SETTING
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, register the setting please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, done
return getFoundPeersSet(); | ||
} | ||
|
||
public boolean foundQuorumFrom(VotingConfiguration votingConfiguration) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is not used anywhere (also not in tests?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, fixed
this.executorServiceFactory = executorServiceFactory; | ||
} | ||
|
||
public Iterable<DiscoveryNode> getFoundPeers() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method is not used anywhere (also not in tests?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, fixed
|
||
private class ActivePeerFinder { | ||
private boolean running; | ||
private final Map<DiscoveryNode, FoundPeer> foundPeers; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you make this package-visible? It's accessed by PeerFinder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, fixed. Is there a tool to tell you this? Also running
.
} | ||
|
||
logger.trace("startProbe({}) found disconnected {}, probing again", transportAddress, cachedNode); | ||
connectedNodes.remove(transportAddress, cachedNode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we also clean-up foundPeers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think not, but added a TODO for further discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
foundPeers
is no longer a thing - we just use connectedNodes
throughout.
import java.util.Optional; | ||
|
||
public class PeersResponse extends TransportResponse { | ||
private final Optional<DiscoveryNode> masterNode; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it would be nicer to have this as a boolean (isActiveMaster
) stating whether the current node providing the peer response is an active master node, and have the list of discoveryNodes just called masterNodes
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it is now, if you contact a follower the next step is just to contact the follower's leader. With a flag you'd go ahead and contact all the follower's peers before discovering that one of them is the leader.
In a properly-configured cluster it doesn't seem to make much difference, but if badly configured (e.g. the leader is not in all the unicast hosts lists, and there are lots of master-eligible nodes) then there's less traffic this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure I follow (no pun intended). A follower would return isActiveMaster = false + the active master as singleton in the list of masterEligibleNodes, which would lead to exact same behavior as what's currently in the PR if I understand this correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed on Zoom. There's no strong argument either way. I slightly prefer that as it is now there's a difference in the responses between a follower and a candidate that knows of a single node, although we don't use this difference anywhere. I also see that moving to a single list (a singleton from followers) or a boolean would remove a couple of lines of code that deal with the master node as a special case. We'll leave it as is.
|
||
public class PeersRequest extends TransportRequest { | ||
private final DiscoveryNode sourceNode; | ||
private final List<DiscoveryNode> discoveryNodes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we call this masterNodes
(or masterEligibleNodes
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to candidateNodes
to reflect that they're only there if there isn't a known leader.
/** | ||
* Label a <code>Runnable</code>, overriding its <code>toString()</code> method. | ||
*/ | ||
public static Runnable labelRunnable(final Runnable runnable, final String label) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could be dangerous if an abstract runnable is passed in
add an assertion that it's not an AbstractRunnable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, this is bad. I just removed it.
…st pass it in at activation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved things to the discovery
package.
* under the License. | ||
*/ | ||
|
||
package org.elasticsearch.cluster.coordination; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, moved things around in 2013f10
@@ -180,7 +180,7 @@ private void handleWakeUp() { | |||
} | |||
|
|||
if (active == false) { | |||
logger.trace("ActivePeerFinder#handleWakeUp(): not running"); | |||
logger.trace("PeerFinder: not running"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we even need to say that this is PeerFinder? We have a logger for PeerFinder that already spits out the class name (same thing for other logging occurrences in this class).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course. I pushed 7948268.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
active = false; | ||
handleWakeUp(); | ||
} | ||
active = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert active == false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests deactivate the PeerFinder
at the end, regardless of whether it's active or not, and there's no particular reason to avoid double-deactivation or to make sure that each test ends with an active PeerFinder
.
Also although I am pretty sure that we can't change the leader without being active, this is not something we guarantee, nor do I think we need to.
The `PeerFinder`, introduced in elastic#32246, obtains the collection of seed addresses configured by the user from a `ConfiguredHostsResolver`. In reality this collection comes from the `UnicastHostsProvider` via a slightly complicated threading model that performs the resolution of hostnames to addresses using a dedicated `ExecutorService`. This commit introduces an adapter to allow the `PeerFinder` to obtain its seed addresses in this manner.
The `PeerFinder`, introduced in elastic#32246, needs to be able to identify, and connect to, a remote master node using only its `TransportAddress`. This can be done by opening a single-channel connection to the address, performing a handshake, and only then forming a full-blown connection to the node. This change implements this logic.
The `PeerFinder`, introduced in #32246, needs to be able to identify, and connect to, a remote master node using only its `TransportAddress`. This can be done by opening a single-channel connection to the address, performing a handshake, and only then forming a full-blown connection to the node. This change implements this logic.
The `PeerFinder`, introduced in #32246, obtains the collection of seed addresses configured by the user from a `ConfiguredHostsResolver`. In reality this collection comes from the `UnicastHostsProvider` via a slightly complicated threading model that performs the resolution of hostnames to addresses using a dedicated `ExecutorService`. This commit introduces an adapter to allow the `PeerFinder` to obtain its seed addresses in this manner.
Today, CapturingTransport#createCapturingTransportService creates a transport service with a connection manager with reasonable default behaviours, but overriding this behaviour in a consumer is a litle tricky. Additionally, the default behaviour for opening a connection duplicates the content of the CapturingTransport#openConnection() method. This change removes this duplication by delegating to openConnection() and introduces overridable nodeConnected() and onSendRequest() methods so that consumers can alter this behaviour more easily. Relates elastic#32246 in which we test the mechanisms for opening connections to unknown (and possibly unreachable) nodes.
Today, CapturingTransport#createCapturingTransportService creates a transport service with a connection manager with reasonable default behaviours, but overriding this behaviour in a consumer is a litle tricky. Additionally, the default behaviour for opening a connection duplicates the content of the CapturingTransport#openConnection() method. This change removes this duplication by delegating to openConnection() and introduces overridable nodeConnected() and onSendRequest() methods so that consumers can alter this behaviour more easily. Relates #32246 in which we test the mechanisms for opening connections to unknown (and possibly unreachable) nodes.
Today, CapturingTransport#createCapturingTransportService creates a transport service with a connection manager with reasonable default behaviours, but overriding this behaviour in a consumer is a litle tricky. Additionally, the default behaviour for opening a connection duplicates the content of the CapturingTransport#openConnection() method. This change removes this duplication by delegating to openConnection() and introduces overridable nodeConnected() and onSendRequest() methods so that consumers can alter this behaviour more easily. Relates #32246 in which we test the mechanisms for opening connections to unknown (and possibly unreachable) nodes.
This commit introduces the
PeerFinder
which can be used to collect theidentities of the master-eligible nodes in a masterless cluster, based on the
UnicastHostsProvider
, the nodes in theClusterState
, and nodes that othernodes have discovered.