Make node.master a dynamic setting #10793

Closed
TwP opened this issue Apr 24, 2015 · 23 comments
Labels
:Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >feature

Comments

@TwP

TwP commented Apr 24, 2015

It would be nice to add dedicated master nodes to an existing cluster without requiring a full restart of each node in the cluster. To accomplish this, the node.master setting would need to be dynamically configurable.

Changing the node.master setting at runtime would not force a new master election; it would only apply to the cluster going forward. To force a master election, the current master node would need to be restarted, as is the case today.

@s1monw
Contributor

s1monw commented Apr 24, 2015

@TwP we will discuss this internally, but it might take a while to get to it. Just a heads-up.

@clintongormley clintongormley added the :Core/Infra/Settings Settings infrastructure and APIs label Apr 26, 2015
@clintongormley
Contributor

I like the idea. That said:

It would be nice to add dedicated master nodes to an existing cluster without requiring a full restart of each node in the cluster.

You currently only need to restart the node you're promoting to master, unless I misunderstand you?

@dakrone
Member

dakrone commented Apr 26, 2015

You currently only need to restart the node you're promoting to master, unless I misunderstand you?

Let's say you start with a 9-node cluster with minimum_master_nodes set to 5, then you decide at some point you want to move to dedicated master nodes. If the node.master setting were dynamic, you could add 3 nodes with node.master: true and update node.master: false on the other 9 nodes (also updating minimum_master_nodes to 2). Then you could (optionally) bounce the current master node and have one of the dedicated master nodes take over.

Right now you would have to restart each of the 9 data nodes (which stinks if you have a lot of data) in order to mark them all as non-master-eligible, because the node.master setting can't be dynamically changed.
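
Purely to illustrate the idea, here is a hypothetical sketch of the kind of per-node call this would imply (there is no node-level settings API today, so the endpoint and payload below are made up):

PUT /_nodes/data_node_1/settings
{
  "node.master" : "false # hypothetical - no node-level settings API exists today"
}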

@clintongormley
Contributor

OK. I can see one issue here. If you change the setting dynamically and you don't update the config file, and the node reboots (expectedly or unexpectedly), it'll pick up the node.master setting from the config file.

Currently, all node-specific (as opposed to cluster-wide) settings are set at node startup only. We have no API to set node settings, and nowhere to persist them.

@clintongormley
Contributor

Isn't the better solution for this problem fixing the slow restarts?

@dakrone
Member

dakrone commented Apr 27, 2015

Isn't the better solution for this problem fixing the slow restarts?

I don't think so (don't get me wrong, fixing slow restarts would be fantastic), but I still think it would be nice to be able to change node.master without requiring a restart.

@rjernst
Member

rjernst commented Apr 27, 2015

A real-world scenario where you may want this is when transitioning to a new set of nodes in a cloud environment. You want to transition gracefully to the new nodes, but you need to force the system to move off of the old nodes. While killing the old nodes should work, that makes the exceptional case (i.e. the master dies) the normal path, which shouldn't be necessary. There should be a clean way to transition from one set of master nodes to another.

@s1monw
Contributor

s1monw commented Apr 28, 2015

This kind of scenario is something that has bothered me for a while now. You have a setup where you realize it's not optimal, i.e. you want to move to dedicated master nodes, etc. But you have to change this node-level setting, bounce processes, do recoveries, set minimum master nodes, what have you. (I bet folks regularly miss at least one important step here.) I wonder if we should have a dedicated API that bakes the master nodes into the cluster. I.e. if you want to move from a 9-node cluster to 3 dedicated master nodes, you would ideally have a simple API call like this:

PUT /_cluster/bake
{
  "master_nodes" : [ "master_1", "master_2", "master_3"],
  "set_minimum_master_nodes" : "true|false #optional true by default", 
  "force" : "true|false #false by default to barf if something is not safe (ie. only one master)"
}

With this call we basically move away from everything being dynamic and ignore other master nodes even if they are master-eligible. It would force a new election if the current master is not in the list. This might also help us with some safety mechanisms, or allow us to harden master election, since we then know the nodes and the list is no longer dynamic. I am not an expert on this but I wanted to throw the idea out here...

@clintongormley
Contributor

You still have to deal with what happens if any nodes reboot unexpectedly. Do they take their local elasticsearch.yml into account, or just use whatever setting is current in the cluster? If the latter, then the settings applied via the API should always override the local settings (which is confusing if you see settings from the yaml not being applied). What happens if three ex-masters reboot, and can't see the rest of the cluster initially? They'd use their local settings and form their own cluster.

@TwP
Author

TwP commented Apr 29, 2015

@clintongormley you bring up a very valid point. However, that problem is not isolated to this proposed change to node.master eligibility: all dynamically configurable settings have the same problem, although with master eligibility the consequences could be much more severe, as you pointed out.

The first step is making the setting dynamically configurable. Changing master eligibility via the API is one route for setting this value. The other route is to update the elasticsearch.yml configuration file and then signal the running process to reload settings from the configuration file: static settings are ignored, and dynamic settings are updated accordingly.

How dynamic settings are handled across restarts is an orthogonal problem. It should not prevent this feature from being implemented. However, taking the broader view will definitely highlight problems that can be introduced by making master eligibility configurable at runtime.

@mlorch-ai

I just ran into the situation of having to restart a full production cluster to enable this setting, and hence would have loved it if we could change it dynamically.
Please keep in mind that if we change master eligibility (e.g. going from a cluster where each node can be a master to a cluster with a smaller number of dedicated masters), we also need to change/adapt other settings (such as minimum_master_nodes) dynamically.

I don't see an issue with the fact that the config file may have outdated information. We need to keep the config file updated with other settings as well as we evolve our cluster to accommodate node failures/restarts. Being able to change this dynamically would, however, avoid the "yellow" cluster state and reduced redundancy that happens when a node (or all nodes) needs to be restarted while indexing is going on, as its indices need to be brought back up to date.
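
For what it's worth, minimum_master_nodes can already be adjusted at runtime through the cluster settings API (the value here is just an example for a 3-dedicated-master setup):

PUT /_cluster/settings
{
  "persistent" : {
    "discovery.zen.minimum_master_nodes" : 2
  }
}

It is only node.master itself that still requires the restart.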

@rjsm

rjsm commented Nov 2, 2015

I would assume any cluster where this would be an issue is under config management. You can update the elasticsearch.yml on the node, and then tell it not to be master-eligible. If it unexpectedly restarts, it'll pick up the intended configuration from the file. I'm transitioning my cluster to dedicated masters shortly.

@yehosef

yehosef commented Dec 14, 2015

+1 - this would be great. I can change the elasticsearch.yml file behind the scenes to handle a reboot, but I shouldn't have to restart the node to do it.

So I assume that if I set node.master: false on the machine that's currently the master, the cluster would start a new master election? I think it's important to be able to manually force a master, or migrate the master when I know a machine is going down. Even though the cluster will recover, there is no reason to enter a failover state, with whatever risks that involves, when I don't need to.

@charlesmims

Would a possible solution to this problem be, rather than making node.master a dynamic setting, to provide a way to force election of a particular master-eligible node as the master? Perhaps via a transient cluster API setting?
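
As a sketch only (the setting name is invented and does not exist today, it is just to show the shape of the idea):

PUT /_cluster/settings
{
  "transient" : {
    "cluster.preferred_master" : "node_3 # hypothetical setting, does not exist today"
  }
}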

@pkusnail

I would really love it if a non-master node in a cluster could be dynamically upgraded to a master-eligible node. To prevent potential problems such as split brain, we set only one master-eligible node in the cluster, which is therefore guaranteed to be the master, and we add a supervisor process that can restart the master node within 20 seconds. But if the node is physically damaged or powered off, it would be better if we could dynamically promote a node (such as a client node) to be the master of the cluster.

@bleskes
Contributor

bleskes commented Jul 22, 2016

we set only one master-eligible node in the cluster, which is therefore guaranteed to be the master
if the node is physically damaged or powered off, it would be better if we could dynamically promote a node

The right solution for this problem is to have at least 3 master-eligible nodes and properly configure minimum_master_nodes to 2.

This issue is requesting a master transition that doesn't require a 3-second period with no master, which is different.

@rjernst rjernst added :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. and removed :Core/Infra/Settings Settings infrastructure and APIs labels Mar 14, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@DaveCTurner DaveCTurner added :Distributed Coordination/Discovery-Plugins Anything related to our integration plugins with EC2, GCP and Azure and removed :Distributed Indexing/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. labels Mar 15, 2018
@DaveCTurner DaveCTurner added :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. and removed :Distributed Coordination/Discovery-Plugins Anything related to our integration plugins with EC2, GCP and Azure labels Mar 27, 2018
@DaveCTurner
Contributor

I found a trap:

public boolean mustAck(DiscoveryNode discoveryNode) {
// repository was created on both master and data nodes
return discoveryNode.isMasterNode() || discoveryNode.isDataNode();
}

This is checked twice in MasterService.AckCountDownListener: once in the constructor and once in onNodeAck(). We rely on the return value not changing in between those two calls.

This is, of course, surmountable - I just thought it wise to write it down here for future reference.

@jasontedor
Member

@andrershov @DaveCTurner @ywelsch What impact does the work on Zen 2 have on an issue such as this one?

@ywelsch
Contributor

ywelsch commented Nov 9, 2018

@jasontedor there are a number of different scenarios and asks brought up on this issue. Zen2 in its current form already addresses some, but not all, of them. The main change with Zen2 is that it makes the notion of "master-eligible" a bit more dynamic, by giving the flexibility to assign voting rights to only a subset of the master-eligible nodes, and allowing these voting rights to be dynamically shifted to another subset of the master-eligible nodes. This enables a clean transition from one set of master nodes to another (the situation @rjernst mentioned).

There are other scenarios described here (e.g. dynamically making a mixed master/data node a master-only node) which have consequences that go beyond Zen (moving the shard data off these nodes before allowing them to become master-only nodes).

Finally, there is also another dimension to this problem, namely the capability to make the property of a node to "act as elected master" more dynamic (relates e.g. to #14340). This is not dynamically adaptable in Zen2 yet, which means that every master-eligible node (whether it has voting rights or not) can become the elected master. We will be looking at some of these remaining use cases in more detail post 7.0. I'm convinced that the Zen2 way of managing voting configurations and the leader election mechanism will facilitate implementing solutions for them.
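
If I recall the API shape correctly, shifting voting rights away from a node looks roughly like this (the node name is just an example):

POST /_cluster/voting_config_exclusions/old_master_1

and then, once the new set of masters holds the voting configuration:

DELETE /_cluster/voting_config_exclusions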

@andrershov
Contributor

@ywelsch what do you think about closing this issue, because it discusses too many different things, and opening new dedicated issues instead?
I'm not convinced that dynamically making a mixed master/data node master-only is worth the effort; we generally recommend running dedicated master nodes, and if someone wants to convert a mixed node into a master-only node, it can be stopped, repurposed, and started again.
So probably leaving just #14340 open is enough.

@DaveCTurner
Contributor

I am +1 on closing this issue.

We have addressed this valid use-case in #37802: the master now gracefully abdicates when removed from the voting configuration. We have also addressed some other comments about needing to update minimum_master_nodes by deprecating that setting in #37868.

It would be nice to add dedicated master nodes to an existing cluster without requiring a full restart of each node in the cluster.

I think it is not too onerous to restart a node when repurposing it from a mixed master/data node to a data-only node. It would be nice, but I do not think it gains us very much. The interplay between static node settings and dynamic cluster-wide settings concerns me, as do all the other things you need to do when node.master changes.

@ywelsch
Contributor

ywelsch commented Mar 13, 2019

+1 as well for closing this issue. Restarting nodes is something that will naturally have to happen in a cluster anyway (e.g. rolling upgrades) and repurposing nodes should be a rather rare event. In addition to what @andrershov and @DaveCTurner have said, I also want to point out that we have taken steps to address the following (quoting myself):

There are other scenarios described here (e.g. dynamically making a mixed master/data node a master-only node), which have consequences that go beyond Zen (moving the shard data off these nodes before allowing them to become master-only nodes).

In #37748 and #37347 we have come up with stricter rules for repurposing nodes and in #39403 we are adding tooling to support this model.

@ywelsch ywelsch closed this as completed Mar 13, 2019