
Do not try to exclude a master node that never existed #1986

Closed
barkbay opened this issue Oct 14, 2019 · 9 comments · Fixed by #4078
Labels
>bug Something isn't working

Comments

@barkbay
Contributor

barkbay commented Oct 14, 2019

I ran out of resources on a K8s cluster while upscaling a set of MDI nodes.

I'm now in a situation where the nodeSet can't be downscaled because the operator keeps trying to exclude a master node that has never existed:

pod/es-apm-sample-es-1-0                    1/1     Running   0          33m    10.56.3.18    gke-michael-dev-cluster-default-pool-8a982915-3msd   <none>           <none>
pod/es-apm-sample-es-1-1                    0/1     Pending   0          26m    <none>        <none>
2019-10-14T07:30:50.277Z	ERROR	controller-runtime.controller	Reconciler error
{
	"ver": "1.0.0-beta1-bc11-c8bb5e5b",
	"controller": "elasticsearch-controller",
	"request": "default/es-apm-sample",
	"error": "unable to add to voting_config_exclusions: 400 Bad Request: add voting config exclusions request for [es-apm-sample-es-1-1] matched no master-eligible nodes",
	"errorCauses": [{
		"error": "unable to add to voting_config_exclusions: 400 Bad Request: unknown",
		"errorVerbose": "400 Bad Request: unknown
unable to add to voting_config_exclusions
github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/client.(*clientV7).AddVotingConfigExclusions
	/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/client/v7.go:41
github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/version/zen2.AddToVotingConfigExclusions
	/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/version/zen2/voting_exclusions.go:34
github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/driver.updateZenSettingsForDownscale
	/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/driver/downscale.go:237
github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/driver.doDownscale
	/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/driver/downscale.go:198
github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/driver.attemptDownscale
	/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/driver/downscale.go:129
github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/driver.HandleDownscale
	/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/driver/downscale.go:54
github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/driver.(*defaultDriver).reconcileNodeSpecs
	/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/driver/nodes.go:112
github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/driver.(*defaultDriver).Reconcile
	/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/driver/driver.go:234
github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch.(*ReconcileElasticsearch).internalReconcile
	/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/elasticsearch_controller.go:284
github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch.(*ReconcileElasticsearch).Reconcile
	/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/elasticsearch_controller.go:219
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:216
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:192
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:171
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1357"
	}]
}
@barkbay barkbay added the >bug Something isn't working label Oct 14, 2019
@sebgl
Contributor

sebgl commented Oct 14, 2019

This is an interesting one.

Some ideas:

  1. Ignore the error when it happens and move on with removing the master node.
  2. Check if the master node is part of the cluster before we add it to voting_config_exclusions. Don't make the voting_config_exclusions call if the node is not part of the cluster.
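
To illustrate option 2, here is a rough, standalone Go sketch (not the operator's actual code) of checking cluster membership through the nodes info API before calling the voting exclusions API; the URL, node name and absence of TLS/auth are placeholder assumptions:

```go
// Hypothetical sketch of option 2: only call the voting exclusions API for a
// master node that the cluster actually knows about, according to GET /_nodes.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// nodesInfo mirrors the small part of the GET /_nodes response we need:
// a map of node ID to node information, from which we read the node name.
type nodesInfo struct {
	Nodes map[string]struct {
		Name string `json:"name"`
	} `json:"nodes"`
}

// isClusterMember returns true if a node with the given name is currently
// part of the cluster.
func isClusterMember(esURL, nodeName string) (bool, error) {
	resp, err := http.Get(esURL + "/_nodes")
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	var info nodesInfo
	if err := json.NewDecoder(resp.Body).Decode(&info); err != nil {
		return false, err
	}
	for _, node := range info.Nodes {
		if node.Name == nodeName {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	member, err := isClusterMember("http://localhost:9200", "es-apm-sample-es-1-1")
	if err != nil {
		panic(err)
	}
	fmt.Println("node is part of the cluster:", member)
	// Only add the node to voting_config_exclusions when member is true.
}
```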

In both cases, there's a race condition:

  • we don't add the master node to voting_config_exclusions (either from 1. or 2.)
  • the node joins the cluster <-- the operator does not notice yet
  • we remove the node, but it was not excluded from voting

The current code retries over and over again, hitting the same error until the node finally joins the cluster. But this may never happen if the node stays Pending or keeps bootlooping forever.
We can detect a Pending or bootlooping node, but we would still end up with the same race condition as above.

Note that we only remove one master node at a time, which mitigates the risk introduced by the race condition above.

I think we have the same sort of problem when setting allocation excludes in the cluster settings to migrate shards away from a data node before removing it. We have an easy way out there, though: it is possible to exclude a node that is not part of the cluster, and the corresponding HTTP call does not fail.
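
For comparison, a minimal standalone Go sketch of that allocation-exclude call (again with a placeholder URL and node name, and no auth), which per the above succeeds even for a node that never joined the cluster:

```go
// Minimal sketch of the shard allocation exclusion mentioned above.
// Real deployments would need TLS and authentication.
package main

import (
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// Exclude the node from shard allocation so its shards are migrated away.
	// Unlike voting_config_exclusions, this call does not fail if
	// "es-apm-sample-es-1-1" is not part of the cluster.
	body := strings.NewReader(`{
	  "transient": {
	    "cluster.routing.allocation.exclude._name": "es-apm-sample-es-1-1"
	  }
	}`)
	req, err := http.NewRequest(http.MethodPut, "http://localhost:9200/_cluster/settings", body)
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("cluster settings update:", resp.Status)
}
```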

@ywelsch @DaveCTurner I would appreciate your thoughts on this.

@DaveCTurner

Ugh yes this is tricky.

Unfortunately it's necessary to know the node ID (not just its name) before we can exclude it from the voting configuration. If it's not in the cluster we don't know its node ID so we cannot exclude it, hence the exception.

Naively, if a node is not running then you don't need to play with the voting configuration to get rid of it safely. If the cluster is alive then the node in question wasn't needed for its votes, and if the cluster is dead then it's already too late. The main thing that worries me is that this node is still showing as Pending which suggests to me that it might come to life at some point in the future. If we knew it would certainly never start then life would be easier. Is that possible?

Unfortunately, "will not run in future" isn't quite enough. Nodes that are not running cannot join a cluster, but they could remain in a cluster for a short while after their deaths. I think that after stopping the node from running we need to ensure it is certainly out of the cluster. I don't think we provide an API to do this today.

I wonder if we should strengthen the voting config exclusions API to accept an unknown node name.

@DaveCTurner

I opened elastic/elasticsearch#47990

@DaveCTurner

OK, the change to Elasticsearch is now merged to master and 7.x: we have replaced POST /_cluster/voting_config_exclusions/... with POST /_cluster/voting_config_exclusions?node_names=.... The existing API will be supported throughout the rest of 7.x but will result in deprecation warnings when used in ≥7.8.0.

It will shortly be removed in master but I will hold off on doing that for at least a week from now to give you some time to adapt to the new API without breaking your master builds.
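
For illustration only (not the operator's implementation), the switch boils down to calling the query-parameter form of the API; placeholder URL and node name, no TLS/auth:

```go
// Sketch of the API change described above.
// Old form (deprecated in 7.x, to be removed in master):
//   POST /_cluster/voting_config_exclusions/es-apm-sample-es-1-1
//   -> 400 Bad Request if no master-eligible node matches the name
// New form (per elastic/elasticsearch#47990), which should tolerate names of
// nodes that never joined the cluster:
//   POST /_cluster/voting_config_exclusions?node_names=es-apm-sample-es-1-1
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

func addVotingConfigExclusion(esURL, nodeName string) error {
	u := fmt.Sprintf("%s/_cluster/voting_config_exclusions?node_names=%s", esURL, url.QueryEscape(nodeName))
	resp, err := http.Post(u, "application/json", nil)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("voting config exclusion failed: %s", resp.Status)
	}
	return nil
}

func main() {
	if err := addVotingConfigExclusion("http://localhost:9200", "es-apm-sample-es-1-1"); err != nil {
		panic(err)
	}
	fmt.Println("exclusion added")
}
```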

@sebgl
Contributor

sebgl commented Apr 24, 2020

Thanks for the heads up @DaveCTurner!

I suggest we keep this issue open for pre-8.0 clusters (we may decide to do nothing about it, though), and create a new one to track the necessary changes for 8.0.0: #2951.

@sebgl
Contributor

sebgl commented Nov 16, 2020

I just realized that, thanks to elastic/elasticsearch#50836, we could already fix this for Elasticsearch 7.8+ by changing our call from /_cluster/voting_config_exclusions/node1,node2 to /_cluster/voting_config_exclusions?node_names=node1,node2, which should properly ignore non-existent nodes. @DaveCTurner pointed this out already in his comment above, not sure how we missed it 😞.

Raising priority on this issue.

@pebrc pebrc assigned pebrc and unassigned pebrc Nov 17, 2020
@pebrc
Collaborator

pebrc commented Nov 17, 2020

We have #2951 for the more focused fix of using the new query parameter.

@barkbay
Contributor Author

barkbay commented Nov 19, 2020

To work around this situation when running Elasticsearch < 7.8, it's possible to edit the StatefulSet and manually scale down the number of replicas:

  1. Find the StatefulSet:
> kubectl get sts -l elasticsearch.k8s.elastic.co/cluster-name=<cluster-name>
NAME                          READY   AGE
<cluster-name>-es-<nodeset>   m/n     44h
  2. Adjust the number of replicas:
> kubectl scale --replicas=m sts/<cluster-name>-es-<nodeset>

@pebrc
Collaborator

pebrc commented Nov 30, 2020

So are we going to add this workaround to our troubleshooting docs for <7.8 and close this issue?
