[BUG] systemd offline when restarting service on OpenSUSE MicroOS #62311

Closed
DennisGlindhart opened this issue Jul 8, 2022 · 8 comments
Labels
Bug broken, incorrect, or confusing behavior needs-triage

Comments

@DennisGlindhart

Description
Modifying a config file and restarting a systemd service whenever that file changes (ip6tables in this case) fails with a systemd "offline" error as soon as the service restart is triggered.

Setup
Both the salt-master and the minion run on openSUSE MicroOS (immutable).

Steps to Reproduce the behavior

Statefile

ip6tablesconfig:
  file.managed:
    - name: /etc/ip6tables-rules
    - source: salt://ip6tables-rules
    - template: jinja
    - user: root
    - group: root
    - mode: 0644

ip6tables:
  service.running:
    - reload: False
    - watch:
      - file: ip6tablesconfig

systemd-service

# cat /etc/systemd/system/ip6tables.service
[Unit]
Before=network-pre.target
Wants=network-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/sbin/ip6tables-restore --wait=10 /etc/ip6tables-rules

[Install]
WantedBy=multi-user.target

Output/Error when running state (and config-file being changed)

# salt --output-diff -v 'server-data0' state.apply 
Executing job with jid 20220708105951843591
-------------------------------------------

server-data0:
----------
          ID: ip6tablesconfig
    Function: file.managed
        Name: /etc/ip6tables-rules
      Result: True
     Comment: File /etc/ip6tables-rules updated
     Started: 12:59:56.360664
    Duration: 90.73 ms
     Changes:   
              ----------
              diff:
                  --- 
                  +++ 
                  @@ -58,4 +58,3 @@
                   
                   -A OUTPUT -j LOGREJECT
                   COMMIT
                  -
----------
          ID: ip6tables
    Function: service.running
      Result: False
     Comment: An exception occurred in this state: Traceback (most recent call last):
                File "/usr/lib/python3.10/site-packages/salt/state.py", line 2195, in call
                  ret = self.states[cdata["full"]](
                File "/usr/lib/python3.10/site-packages/salt/loader/lazy.py", line 149, in __call__
                  return self.loader.run(run_func, *args, **kwargs)
                File "/usr/lib/python3.10/site-packages/salt/loader/lazy.py", line 1203, in run
                  return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
                File "/usr/lib/python3.10/site-packages/salt/loader/lazy.py", line 1218, in _run_as
                  return _func_or_method(*args, **kwargs)
                File "/usr/lib/python3.10/site-packages/salt/loader/lazy.py", line 1251, in wrapper
                  return f(*args, **kwargs)
                File "/usr/lib/python3.10/site-packages/salt/states/service.py", line 1019, in mod_watch
                  if __salt__["service.status"](name, sig, **status_kwargs):
                File "/usr/lib/python3.10/site-packages/salt/loader/lazy.py", line 149, in __call__
                  return self.loader.run(run_func, *args, **kwargs)
                File "/usr/lib/python3.10/site-packages/salt/loader/lazy.py", line 1203, in run
                  return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
                File "/usr/lib/python3.10/site-packages/salt/loader/lazy.py", line 1218, in _run_as
                  return _func_or_method(*args, **kwargs)
                File "/usr/lib/python3.10/site-packages/salt/modules/systemd_service.py", line 1116, in status
                  _check_for_unit_changes(service)
                File "/usr/lib/python3.10/site-packages/salt/modules/systemd_service.py", line 143, in _check_for_unit_changes
                  if _untracked_custom_unit_found(name) or _unit_file_changed(name):
                File "/usr/lib/python3.10/site-packages/salt/modules/systemd_service.py", line 395, in _untracked_custom_unit_found
                  return os.access(unit_path, os.R_OK) and not _check_available(name)
                File "/usr/lib/python3.10/site-packages/salt/modules/systemd_service.py", line 105, in _check_available
                  raise CommandExecutionError(
              salt.exceptions.CommandExecutionError: Cannot run in offline mode. Failed to get information on unit 'ip6tables'
     Started: 12:59:58.255107
    Duration: 5.047 ms
     Changes:   

Summary for server-data0
------------
Succeeded: 1 (changed=1)
Failed:    1
------------
Total states run:     2
Total run time:  95.777 ms

Expected behavior
Service being restarted

Versions Report

salt --versions-report
Salt Version:
          Salt: 3004
 
Dependency Versions:
          cffi: 1.15.0
      cherrypy: Not Installed
      dateutil: Not Installed
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.0.3
       libgit2: Not Installed
      M2Crypto: 0.38.0
          Mako: Not Installed
       msgpack: 1.0.4
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: Not Installed
        pygit2: Not Installed
        Python: 3.10.5 (main, Jun 06 2022, 22:34:44) [GCC]
  python-gnupg: Not Installed
        PyYAML: 6.0
         PyZMQ: 22.3.0
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.4
 
System Versions:
          dist: opensuse-microos 20220629 
        locale: utf-8
       machine: x86_64
       release: 5.18.6-1-default
        system: Linux
       version: openSUSE MicroOS 20220629

Additional context

I think I'm hitting this error:

https://github.com/saltstack/salt/blob/master/salt/modules/systemd_service.py#L104

which, according to [1], should be equal to running

# salt 'server-data0' service.offline
server-data0:
    False

If I understand correctly, the check somehow returns True when it is triggered by the watch/file change, which results in the error.

I suspect it has something to do with running on openSUSE MicroOS (immutable, although /etc should be writable). Judging by the source history, the offline check is new in version 3004, whose release notes [1] mention MicroOS explicitly, but only in connection with transactional-update and reboots.

# salt 'server-data0' cmd.which systemctl
server-data0:
    /usr/bin/systemctl
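
For anyone who wants to check by hand what systemd reports in a given context (running system versus inside a transaction), here is a minimal sketch. It assumes that an "offline" verdict corresponds to `systemctl is-system-running` printing `offline`; that is an assumption about what the check looks at, not a copy of Salt's code. The function names `is_offline_state` and `systemd_offline` are made up for this example.

```python
import subprocess


def is_offline_state(output: str) -> bool:
    """Return True if the text printed by `systemctl is-system-running` is 'offline'."""
    return output.strip() == "offline"


def systemd_offline() -> bool:
    """Ask systemctl for the system state and report whether it is 'offline'.

    Assumption for this sketch: if systemctl cannot be executed at all,
    we report False instead of guessing.
    """
    try:
        proc = subprocess.run(
            ["systemctl", "is-system-running"],
            check=False,  # non-zero exit codes are normal for states other than "running"
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
    except OSError:
        return False
    return is_offline_state(proc.stdout)
```

Calling `systemd_offline()` once on the running system and once inside a transactional-update shell should show whether the two contexts disagree.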

[0]

def offline():

[1] https://docs.saltproject.io/en/latest/topics/releases/3004.html#release-3004


@DennisGlindhart
Author

It works when removing the transactional_update module executor and using direct_call only (--module-executors='[direct_call]').

The list of modules routed through the transactional_update executor does not contain service/system, but it does contain cmd, which the systemd module uses to check whether systemd is online or offline.

@SchoolGuy

@meaksh @agraul @vzhestkov Maybe you can chime in here too? I stumbled across this by accident too.

@meaksh
Contributor

meaksh commented Oct 5, 2022

In "transactional systems" such as openSUSE MicroOS, the "transactional_update" executor is enabled by default in the minion configuration. This means that all Salt actions are processed by this executor by default (unless you pass --module-executors='[direct_call]' to your call to override the default settings).

This "transactional update" executor implements a map of Salt functions (module.name), and also of whole execution modules, that have to be executed through a special wrapper function (implemented in the transactional update module) so that the execution happens in a separate transaction (not in the running system, which is read-only).

The important thing to notice is that state.apply is one of the functions whose execution is delegated to the "transactional update" module, so it runs in a separate transaction by calling salt-call inside a new transaction via the "transactional update" CLI. Once your states have been applied in the new transaction, you can reboot the system to activate your changes.
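
The routing described above can be pictured with a small sketch. It is only an illustration: the two sets contain nothing but the entries mentioned in this thread (state.apply and the cmd module), and the names `DELEGATED_FUNCTIONS`, `DELEGATED_MODULES` and `pick_executor` are hypothetical, not the identifiers used in Salt.

```python
# Hypothetical data: only the entries named in this thread are listed here.
DELEGATED_FUNCTIONS = {"state.apply"}  # single functions run inside a new transaction
DELEGATED_MODULES = {"cmd"}            # whole modules run inside a new transaction


def pick_executor(fun: str) -> str:
    """Return which executor would handle `fun`, given as 'module.function'."""
    module = fun.split(".", 1)[0]
    if fun in DELEGATED_FUNCTIONS or module in DELEGATED_MODULES:
        return "transactional_update"
    return "direct_call"
```

Under these assumptions, `pick_executor("state.apply")` and `pick_executor("cmd.run")` return "transactional_update", while `pick_executor("service.status")` returns "direct_call". Because the whole state run already happens inside the transaction, the service states called from it see the transaction's environment as well.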

That said, some Salt state modules and execution modules are not expected to work in this context (when running inside a transaction) because, for example, DBus is not available inside the new transaction.

In particular, looking at your SLS, the service.running state you are calling is expected to "start" the service if it is stopped, and that cannot happen inside a transaction. Using service.enabled might help you here...

In any case, if you want your SLS to run on the running system (not in a transaction), then I guess you should call state.apply passing "direct_call" as the executor:

salt MINION --module-executors='[direct_call]' state.apply 

Hth!

@DennisGlindhart
Author

Thanks @meaksh - I agree, it actually does make sense when you consider /etc, configuration, service-state etc. as part of the "Immutability". I can't speak to whether that is the "mainstream" perception among MicroOS users, but I'll bite.

@tacerus

tacerus commented Oct 4, 2023

Hi,

service.enabled does not work inside a transactional state.apply:

          ID: enable_salt-minion
    Function: service.enabled
        Name: salt-minion
      Result: False
     Comment: Cannot run in offline mode. Failed to get information on unit 'salt-minion'
     Started: 16:05:26.997894
    Duration: 12.309 ms
     Changes:

And unfortunately, service.running with enable: true is just being skipped:

          ID: salt-minion
    Function: service.running
      Result: True
     Comment: Running in OFFLINE mode. Nothing to do
     Started: 16:05:27.011925
    Duration: 2.236 ms
     Changes:

It makes perfect sense that services cannot be started or stopped inside the transactional update; however, enabling/disabling should be possible, given that it merely translates to placing/removing symlinks in /etc.

Is this a bug / missing feature or am I missing some option?

Ugly workaround:

enable_salt-minion:
  file.symlink:
    - name: /etc/systemd/system/multi-user.target.wants/salt-minion.service
    - target: /usr/lib/systemd/system/salt-minion.service
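
To make the "it is only a symlink" point concrete, here is a minimal Python sketch of the same operation. It assumes the unit is wanted by multi-user.target and lives in /usr/lib/systemd/system, as in the workaround above; it does not read the unit's [Install] section the way systemctl does. The function name `enable_unit` and the `root` parameter (used so the sketch can be tried in a scratch directory) are made up for this example.

```python
import os


def enable_unit(root, unit, target="multi-user.target",
                unit_dir="usr/lib/systemd/system"):
    """Create <root>/etc/systemd/system/<target>.wants/<unit> as a symlink.

    Returns True if the link was created or corrected, False if it already
    pointed at the expected unit file.
    """
    wants_dir = os.path.join(root, "etc/systemd/system", target + ".wants")
    os.makedirs(wants_dir, exist_ok=True)
    link = os.path.join(wants_dir, unit)
    dest = os.path.join("/", unit_dir, unit)
    if os.path.islink(link):
        if os.readlink(link) == dest:
            return False  # already enabled, nothing to change
        os.remove(link)   # stale link pointing somewhere else
    os.symlink(dest, link)
    return True
```

With `root="/"` this touches the same path as the file.symlink state; pointing `root` at a temporary directory allows a dry run.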

@meaksh
Contributor

meaksh commented Oct 5, 2023

Hi @tacerus,
Yeah, that is a known limitation. The problem is that the service.enabled and service.disabled state functions use DBus to gather the status of the systemd service before actually enabling/disabling it, and we cannot connect to DBus inside the transaction.

Another workaround could be:

enable_salt-minion:
  cmd.run:
    - name: systemctl enable salt-minion

Hth!

@tacerus

tacerus commented Oct 5, 2023

Hi @meaksh, thanks for getting back! Your workaround sounds like a good alternative as well.

I understand the DBus limitation, but it's unfortunate that this does not scale, as many existing Salt states (for example, ones shipped with the popular formulas) do not implement such special handling for transactional systems and require custom patching.

Would it make sense to implement such special handling inside the execution and state modules instead? For example, in addition to an _offline check, having something like _transactional, which would then only execute "dumb" systemctl enable/disable shell commands (or handle the symlinks - not sure about this, the systemctl commands look more appropriate, but the symlinks don't require subshells).
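
A rough sketch of what such handling could look like, purely as an illustration of the proposal and not existing Salt code: when the system is offline or inside a transaction, skip the status query and run a plain `systemctl enable`. The function `enable_service`, its `offline` flag and the injected `run` callable are all hypothetical.

```python
from typing import Callable, List

# A runner takes an argv list and returns the command's exit code.
Runner = Callable[[List[str]], int]


def enable_service(name: str, offline: bool, run: Runner) -> bool:
    """Return True if the unit should be enabled once this call finishes."""
    if not offline:
        # Online: ask systemd first; `is-enabled` exits 0 when already enabled.
        if run(["systemctl", "is-enabled", "--quiet", name]) == 0:
            return True
    # Offline/transactional, or not enabled yet: issue a plain enable command
    # without querying the unit's status beforehand.
    return run(["systemctl", "enable", name]) == 0
```

Passing the runner in keeps the decision logic testable without touching a real system; in practice it could be a thin wrapper around `subprocess.call`.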

tacerus added a commit to openSUSE/heroes-salt that referenced this issue Oct 30, 2023
This allows the Salt Minion to be enabled on transactional systems.
Should be moved to the infrastructure.salt formula if needed on more
ems.

saltstack/salt#62311 (comment)

Signed-off-by: Georg Pfuetzenreuter <[email protected]>
Bajzathd added a commit to hortonworks/cloudbreak-images that referenced this issue Jan 18, 2024
Salt 3004+ introduced additional systemctl checks for its service module, which are not compatible with the systemctl we have on YARN images. As a workaround, these were replaced with equivalent cmd.run commands for YARN images.
Similar issue with the suggested workaround: saltstack/salt#62311 (comment)
Bajzathd added a commit to hortonworks/cloudbreak-images that referenced this issue Jan 19, 2024
Salt 3004+ introduced additional systemctl checks for its service module, which are not compatible with the systemctl we have on YARN images. As a workaround, these were replaced with equivalent cmd.run commands for YARN images.
Similar issue with the suggested workaround: saltstack/salt#62311 (comment)
Also had to bump Salt version to 3006.5: saltstack/salt#65114 (comment)
Bajzathd added a commit to hortonworks/cloudbreak-images that referenced this issue Jan 24, 2024
Salt 3004+ introduced additional systemctl checks for its service module, which are not compatible with the systemctl we have on YARN images. As a workaround, these were replaced with equivalent cmd.run commands for YARN images.
Similar issue with the suggested workaround: saltstack/salt#62311 (comment)
Also had to bump Salt version to 3006.5: saltstack/salt#65114 (comment)