
Need a way to cleanly shut down nodes #2052

Closed
blalor opened this issue Dec 1, 2016 · 21 comments · Fixed by #16827

@blalor
Contributor

blalor commented Dec 1, 2016

Nomad v0.5.0

There doesn't appear to be a way to cleanly shut down a client node such that allocations are moved to other nodes and the data in sticky ephemeral disks is migrated. I wrote a script to help my systemd service delay stopping until allocations have been moved, but there doesn't appear to be a way to monitor the status of the migrated data. If the data isn't moved quickly, it could be lost when the node shuts down.

Something like nomad shutdown that blocks until the agent is completely idle would be ideal.
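In the meantime, a pre-stop hook can approximate this. The sketch below assumes a Nomad version with `nomad node drain` (0.8+); `NOMAD_BIN` and `DEADLINE` are illustrative knobs for this script, not official settings:

```shell
#!/usr/bin/env bash
# Sketch of a pre-shutdown drain helper, e.g. for a systemd ExecStop= hook.
# Assumes Nomad 0.8+; NOMAD_BIN and DEADLINE are illustrative, not official knobs.
set -euo pipefail

NOMAD_BIN=${NOMAD_BIN:-nomad}
DEADLINE=${DEADLINE:-10m}   # upper bound on how long migrations may take

drain_self() {
  # -enable starts the drain; without -detach the CLI blocks until it completes,
  # so the service manager will not proceed with the stop until the node is empty.
  "$NOMAD_BIN" node drain -enable -self -yes -deadline "$DEADLINE" &&
    echo "node drained; safe to stop the agent"
}

# Invoke directly from the stop hook, e.g.: drain_self
```

This still doesn't expose migration status for sticky ephemeral disks; it only blocks until the drain itself reports completion.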

@groggemans
Contributor

Consul has the leave command. Would be nice to have a similar command in nomad, which would trigger a node drain, wait for it to complete, and then gracefully leave the cluster.

@preetapan
Contributor

preetapan commented Jun 4, 2018

@groggemans Nomad 0.8 added advanced node draining features. Some useful links:

  • Node drain command
  • Blog post that explains node draining features

@groggemans
Contributor

I know, and it solves the draining part, but the node should still gracefully leave the cluster afterward. I think the only way to do this now is by stopping or interrupting the service (with leave_on_terminate = true or leave_on_interrupt = true).

Setting leave_on_interrupt or leave_on_terminate to true isn't always desirable, but it should still be possible to do a graceful leave from the CLI even when both options are false (the default).

For servers there's the force-leave option, but for clients there's no command to do a graceful leave. A universal leave that works for both servers and clients, and that also triggers a node drain, seems to be missing.
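For reference, these are the agent-level options being discussed, shown here as a minimal config sketch (both default to false):

```hcl
# Agent configuration fragment (sketch).
# leave_on_interrupt: gracefully leave the cluster on SIGINT
# leave_on_terminate: gracefully leave the cluster on SIGTERM
leave_on_interrupt = true
leave_on_terminate = true
```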

@onlyjob
Contributor

onlyjob commented Aug 15, 2018

A relevant discussion happened in #4305. @insanejudge, @schmichael.

Indeed, draining the node on shutdown is the best approach, and the service file could be adjusted to do that. However, a graceful restart cannot be implemented in a systemd service, because systemd cannot distinguish a shutdown from a restart. Regardless, KillMode=control-group (the default) is better than KillMode=process, because the latter does not guarantee cleanup. It is important to leave no unmanaged processes behind.
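For illustration, the relevant service-file setting looks like this (a sketch; the unit path and binary locations are assumptions):

```ini
# /etc/systemd/system/nomad.service (fragment)
[Service]
ExecStart=/usr/local/bin/nomad agent -config=/etc/nomad.d
# Default; kills every process in the unit's cgroup on stop,
# so no unmanaged child processes are left behind.
KillMode=control-group
```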

@dcparker88

Not sure if this is relevant or related, but even when I have leave_on_terminate set in my config, the node doesn't seem to fully leave. I've been testing with the approaches above, and I can see in my logs that the node is cleanly shutting down:

nomad: ==> Caught signal: terminated
nomad: ==> Gracefully shutting down agent...
nomad: 2018/09/12 16:11:52.306399 [INFO] agent: requesting shutdown
nomad: 2018/09/12 16:11:52.306468 [INFO] client: shutting down
nomad: 2018/09/12 16:11:52.320998 [INFO] agent: shutdown complete

but when I check nomad node status, the node still shows as down:

$ nomad node status
ID        DC    Name    Class   Drain  Eligibility  Status
4995dacd  east  agent1  <none>  false  ineligible   down

is that expected behavior? I would expect once the node leaves the cluster it doesn't appear in the status anymore.

@schmichael
Member

@onlyjob Does systemd allow configuring different signals for reloads, restarts, and shutdowns? If so we could use SIGHUP, SIGINT, and SIGTERM respectively to separate the shutdown behaviors. Adding APIs+CLI commands would also be useful. This is definitely something we're hoping to do, but I don't know if it will make it into 0.9.0.
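systemd does let a unit separate reloads from stops: ExecReload= can send SIGHUP while KillSignal= controls the stop signal. A sketch (unit path and binary locations are assumptions):

```ini
# nomad.service fragment (sketch): distinct signals for reload vs. stop.
[Service]
ExecStart=/usr/local/bin/nomad agent -config=/etc/nomad.d
# Reload (e.g. re-read config) via SIGHUP:
ExecReload=/bin/kill -HUP $MAINPID
# Graceful stop via SIGTERM:
KillSignal=SIGTERM
```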

@dcparker88 Unfortunately leave_on_terminate is not implemented for clients, so yes, that is expected.

@onlyjob
Contributor

onlyjob commented Sep 12, 2018

No, it doesn't... There is ExecStop= but no ExecRestart=... Anyway, IMHO it is wrong to distinguish. The node should be drained on restart as well, because that is the only safe approach. If the updated executable fails to start, the system will end up with dangling, unaccounted-for services.

@schmichael
Member

@onlyjob Nomad will continue to support in-place upgrades (restarting without draining) for at least a couple of reasons:

  1. Some jobs are expensive to restart/migrate (QEMU VMs)
  2. We do not want to tie the lifetime/stability of the Nomad client agent to all of the tasks it runs. We try to isolate defects in our code from affecting user services.

That being said, we've definitely come close to dropping support for in-place upgrades. I could see it happening someday, but for now we intend to support restarts that don't affect tasks.

@onlyjob
Contributor

onlyjob commented Sep 13, 2018

It is OK if you are committed to supporting restarts without draining. However, this is unsafe and should therefore be configurable. Moreover, draining the node on restart must be the default behaviour. It is not OK to leave dangling VMs; they are not cheap to restart.
It is a classic "speed over safety" dilemma.

Betting on the perfect stability of the Nomad client is a strategy for a perfect world, like saying that defects in your code will never happen.
One day something unforeseen will happen on an architecture your CI does not cover, and the client will fail to start for whatever reason (a low-memory condition, for example). How do you know there will be enough memory available to restart Nomad if it does not terminate its jobs?

http://thecodelesscode.com/case/96

@dcparker88

@schmichael ah thanks - that makes sense then. Do you know if that's a planned feature, or should I just continue to use GC to clean out down nodes?

@schmichael
Member

@dcparker88: Do you know if that's a planned feature, or should I just continue to use GC to clean out down nodes?

We hoped shutdown improvements would land in 0.9.0, but some larger features (e.g. plugins) take priority, so you may want to continue using GC for the time being. If they don't make it into 0.9.0, hopefully we'll get them out in a patch release.

@onlyjob: ...therefore should be configurable. Moreover draining node on restart must be default behaviour.

This is the plan!

@onlyjob: Betting on perfect stability of the Nomad client is a strategy for the perfect world, like saying that defects in your code (will) never happen.

This is precisely reason 2 I gave above for supporting in-place upgrades. A guiding principle in Nomad's design is: in the face of errors, do not stop user services! Nomad downtime should prevent further scheduling, but it should avoid causing service downtime as much as possible.

@onlyjob
Contributor

onlyjob commented Sep 15, 2018

Thanks. :) I think there is a flaw in this reasoning... We need to separate two issues: avoiding stopping services during normal operation, and the case where the Nomad client itself is restarting.
It violates the principles of integrity and common sense to leave scheduled jobs running after the Nomad client has exited...
Service downtime is necessary when the manager/dispatcher is restarting, because that is the only safe mode of operation.

What if the updated Nomad disagrees with the running Docker daemon about the API version?

@rmlsun

rmlsun commented Jul 13, 2020

Basically what we want is: if Nomad itself runs into unexpected issues, leave the task runtimes alone and confine the problem to Nomad as much as possible (the smallest blast radius possible). On the other hand, if it is an intentional shutdown of the Nomad client, provide a way to trigger a clean shutdown of the task runtimes.

I think there might be a fine line here @schmichael

Ideally, if the Nomad client crashes or shuts down for reasons that are not operator-initiated, it should not trigger task shutdown. Only an operator-initiated shutdown should trigger (and wait for the completion of) a clean shutdown of all tasks.

Quoting the earlier suggestion:

> Would a client.drain_shutdown = true agent configuration parameter fit your use case? The idea being that when the Nomad client received the signal to shut down, it would block exiting until it had drained all running allocations?

So would a signal be a good way to indicate an intentional shutdown? Instead of having client.drain_shutdown = true, how about client.drain_shutdown_signal = SIGINT, something along those lines?
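As a sketch of that proposal (option names hypothetical, not implemented at the time of this comment):

```hcl
client {
  # Hypothetical: only a shutdown requested via this signal blocks
  # agent exit until all running allocations have been drained.
  drain_shutdown_signal = "SIGINT"
}
```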

@mwild1

mwild1 commented Apr 20, 2021

I understand the reasoning of the many people in this thread wanting this feature; it is surely a safer option in many environments. However, a few comments implied that the current behaviour is always undesirable, which is not the case.

There are some workloads where it absolutely makes sense to keep allocations running during a restart (or crash) of the nomad client. Assuming of course that those allocations can be re-adopted by a new nomad process.

In-place upgrades and the general ease of upgrades with zero or minimal disruption in Nomad are one of the big features over Kubernetes for me.

So yes, by all means a way to combine drain+shutdown/restart would be great, but not because it's the only way that makes sense.

@ketzacoatl
Contributor

@tgross what discussion is needed to figure out next steps here? I would love to help move this along!

@ketzacoatl
Contributor

@tgross ping

@tgross
Member

tgross commented Sep 16, 2021

Hey @ketzacoatl, given that you already cross-linked this somewhere the Nomad team was asking for input, I'm sure they'll have some thoughts for you at some point. But I haven't been at HashiCorp for a while now, and there aren't any non-HashiCorp maintainers, so pinging me probably won't help move things along. 😁

That being said, if it were up to me (and it's not!), I'd say there's not much to this issue:

  • If you're already shutting down a client intentionally, scripting a drain doesn't seem like a huge additional effort.
  • If a node is shutting down unintentionally (i.e. it crashes), the node can't participate in telling the server to drain it, so you need to rely on something like stop_after_client_disconnect anyway.

I'm sure the Nomad team would be open to a patch that provides client configuration that causes the node to drain on graceful shutdown.
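For the crash case, the group-level jobspec option mentioned above looks like this (values illustrative):

```hcl
job "web" {
  group "app" {
    # If the client node is disconnected for this long, the server
    # stops these allocations and reschedules them elsewhere.
    stop_after_client_disconnect = "5m"
    # ...
  }
}
```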

@ketzacoatl
Contributor

@tgross apologies for the ping!

@mikenomitch
Contributor

There was a suggestion to use systemd inhibitor locks to achieve this. Noting here in case it is helpful if this gets picked up.
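For illustration, an inhibitor-lock approach might wrap the drain in systemd-inhibit, so systemd delays shutdown until the drain finishes. A sketch; `INHIBIT_BIN` and `NOMAD_BIN` are test knobs, not official settings:

```shell
#!/usr/bin/env bash
# Sketch: hold a shutdown inhibitor lock while draining the local node,
# so systemd delays poweroff until allocations have migrated.
set -euo pipefail

inhibited_drain() {
  "${INHIBIT_BIN:-systemd-inhibit}" \
    --what=shutdown \
    --why="draining Nomad allocations" \
    "${NOMAD_BIN:-nomad}" node drain -enable -self -yes
}

# Invoke from a shutdown hook, e.g.: inhibited_drain
```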

@tgross
Member

tgross commented Apr 14, 2023

Implemented (finally!) in https://github.com/hashicorp/nomad/pull/16827, which will ship in the next release of Nomad.


I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 12, 2025