-
-
Notifications
You must be signed in to change notification settings - Fork 14.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
systemd: Interactions of restartIfChanged, reloadIfChanged, stopIfChanged
is fragile
#49528
Comments
Any comments of the maintainer? I would love to have this fixed |
Cc @edolstra as he has written most of the activation logic. I still very much would like to see it fixed. Also happy to help with implementing. Just need feedback whether I am on the right track and not missing any edge cases. |
This issue has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/design-choice-of-switch-to-configuration/6016/2 |
I had some discussion with @aanderse about reload behaviour and also digged into the As for the 3 booleans, they might still be required, but maybe the names are misleading:
Even when a unit provides a way to be reloaded, there's still cases where a restart is necessary - like when the underlying binary is exchanged due to a (security) update, or some of the other directives in the systemd unit changed (like the user a service should run as, sandboxing options etc.). This is currently a big bug IMHO, and we should only reload when its safe, and in all other cases restart. I'll tinker around with the semantics a bit, but my current idea could entirely deprecate the need for a |
So, to be a bit more specific here, the idea is to do the following:
Then, we could sketch a safer reloading mechanism: Services that provide a way to reload their configuration while running, and that want to provide the reload possibility need to indirect their configuration file via During activation of the new configuration, the activation script will notice only the monitored config files changed, and trigger a reload. I think these semantics make reloading services much safer and more predictable. |
This issue has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/design-choice-of-switch-to-configuration/6016/4 |
Part of this effort would also be to fix our restart behaviour of systemd's
This requires some coorperation of the managed service, but is also something we should make use of, as it allows us to |
I marked this as stale due to inactivity. → More info |
Definitely still relevant. I guess @flokli and I should chat about this again at some point... |
I came here from #49415.
This is a case where
but kresd has no My suggestion: what if we make all services without |
For the race condition where a service stopped before systemctl deamon-reload can still be started by a timer, a possibility is to use systemd-run:
we run
This tells systemd to ensure earlyoom is still down when running daemon-reload. This also extends to daemon-reexec, but we have to check ourselves that the command was successful, because systemd fails to keep track of this across reexec:
|
Since version 5.2.0 there's non-empty stop phase: ExecStopPost=/usr/bin/env rm -f "/run/knot-resolver/control/%i" but it's perfectly OK to run that from a different version (and typically it's no-op anyway). Real-life example where this helps: #49528 (comment)
I marked this as stale due to inactivity. → More info |
definitely not stale |
I don't like the unsavory things I've had to do to work around this. Is anyone interested in an online and informal hackathon to tackle this? |
I'm quite busy at the moment but I can report on my failed experiment using systemd-run like in #49528 (comment) (from what I can recall, I tried this several months ago) when a service foo.service must be stopped, But I learned recently that one can create impure services by writing somewhere in /run instead of /etc/systemd. |
I marked this as stale due to inactivity. → More info |
Is this solved by #157329? |
#157329 introduces more stuff, but doesn't deprecate using |
Agreed. Would people here agree the problem is somewhat mitigated when |
This seems to require a serious focus group to fix this issue. |
No, see #49415 (comment) NixOS configuration switching is fundamentally broken because forget about That violates the fundamental promise of NixOS, congruent configuration management: "The system runs the declared services, not some old leftover ones". I've re-opened #49415 (comment) for that, so then this issue can be about the And I've created a PR that will hopefully help: #307562 Please help test! |
…ixes NixOS#49415 See the added comment for a detail description. See also the related NixOS#49528.
Issue description
The following three options describe how a unit must be reloaded on
switch-to-configuration
:Their behaviour tightly depends on eachother. E.g.
restartIfChanged
behaves differently than documented ifstopIfChanged
istrue
, but it'strue
by default, sorestartIfChanged
never behaves as documented.But they are not documented in the same place (the options are not next to eachother in
man configuration.nix
). Also,stopIfChanged
has a bit of a deceptive name. This makes these options hard to discover, and they can act a bit surprisingly.Solution:
Lets instead replace the three options with one option:
The text was updated successfully, but these errors were encountered: