Does upsmon -c cmd delete daemon's PID file? #1728

Closed
jimklimov opened this issue Dec 2, 2022 · 2 comments

@jimklimov
Member

Originally posted by @ullix in #1721 (comment)

...Back to upsmon. Why has it lost the pid on the second call?

~$ sudo systemctl restart nut-monitor.service 

~$ sudo upsmon -c reload
Network UPS Tools upsmon 2.8.0

~$ sudo upsmon -c reload
Network UPS Tools upsmon 2.8.0
fopen /run/nut/upsmon.pid: No such file or directory

~$ sudo upsmon -c reload
Network UPS Tools upsmon 2.8.0
fopen /run/nut/upsmon.pid: No such file or directory

~$ sudo systemctl restart nut-monitor.service 

~$ sudo upsmon -c reload
Network UPS Tools upsmon 2.8.0

~$ sudo upsmon -c reload
Network UPS Tools upsmon 2.8.0
fopen /run/nut/upsmon.pid: No such file or directory

MAINTAINER NOTE: My guess is that the program registers an exit handler to clean up the PID file, and that this handler also kicks in for -c cmd mode, even though that mode never called writepid() in the first place.

TODO: Check other daemon/tool programs (upsd -c cmd, any others?) for similar misbehavior.
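
The exit-handler guess above can be illustrated with a small standalone sketch. This is not NUT code: pidfile_path, cleanup_pidfile() and pidfile_survives_client() are hypothetical names made up for the illustration, and the forked child only plays the role of a short-lived "-c cmd" client. It shows that if such a client registers an atexit() cleanup before it exits, a PID file it never wrote gets removed anyway, while a client that skips the registration leaves the file alone.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical global holding the PID file path; not a NUT variable. */
static const char *pidfile_path = NULL;

/* Exit handler: removes the PID file if a path was recorded. */
static void cleanup_pidfile(void)
{
    if (pidfile_path != NULL)
        unlink(pidfile_path);
}

/*
 * Creates a PID file as the "daemon" would, then forks a child that plays
 * the short-lived "-c cmd" client. Returns 1 if the PID file still exists
 * after the client exits, 0 if it was removed, -1 on setup errors.
 */
static int pidfile_survives_client(const char *path, int register_handler_early)
{
    FILE *f;
    pid_t child;
    int status, survives;

    assert(path != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    fprintf(f, "%ld\n", (long)getpid());
    fclose(f);

    fflush(NULL);   /* keep buffered output from being flushed twice */
    child = fork();
    if (child < 0) {
        unlink(path);
        return -1;
    }

    if (child == 0) {
        pidfile_path = path;
        if (register_handler_early)
            atexit(cleanup_pidfile);   /* the suspected ordering */
        /* A real client would signal the running daemon here... */
        exit(EXIT_SUCCESS);            /* ...and exit() runs atexit handlers */
    }

    if (waitpid(child, &status, 0) < 0)
        return -1;
    survives = (access(path, F_OK) == 0);
    unlink(path);   /* tidy up the scratch file */
    return survives;
}
```

Calling pidfile_survives_client() with a scratch path and register_handler_early set to 1 models the suspected misbehavior; passing 0 models a client that exits before any cleanup handler is registered.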

@jimklimov jimklimov added bug service/daemon start/stop General subject for starting and stopping NUT daemons (drivers, server, monitor); also BG/FG/Debug labels Dec 2, 2022
@jimklimov jimklimov added this to the 2.8.1 milestone Dec 2, 2022
@jimklimov
Member Author

In upsd, the atexit(upsd_cleanup) handler, which deletes the PID file (if its file name string is not null), is registered only after signaling/probing a previous instance (with the program aborting in case of errors, or exiting after sending a signal). So a upsd -c cmd run should not remove the PID file merely by virtue of having run.

The upsmon.c source file does not contain atexit nor unlink...

@jimklimov
Member Author

jimklimov commented Dec 2, 2022

So far I am struggling to reproduce the originally reported issue (although without systemd involved on the testbed); upsmon mostly behaves as expected:

  • Initial start:
$ ps -ef | grep ups
jim      14027   163  0 15:42 pts/2    00:00:00 grep --color=auto ups

$  sudo ./clients/upsmon
Network UPS Tools upsmon 2.8.0-Windows-188-gc78e889c4
fopen /run/upsmon.pid: No such file or directory
Could not find PID file to see if previous upsmon instance is already running!
UPS: x (monitoring only)
Warning: no shutdown command defined!

$ ps -ef | grep ups
root     14043   162  0 15:42 ?        00:00:00 /home/jim/nut/clients/.libs/upsmon
nobody   14044 14043  0 15:42 ?        00:00:00 /home/jim/nut/clients/.libs/upsmon
nobody   14048 14044  0 15:42 ?        00:00:00 [upsmon] <defunct>
jim      14052   163  0 15:42 pts/2    00:00:00 grep --color=auto ups

$ cat /run/upsmon.pid
14044

... so far so good - PID file created, process running

  • Try to make a conflict:
$  sudo ./clients/upsmon
Network UPS Tools upsmon 2.8.0-Windows-188-gc78e889c4
Fatal error: A previous upsmon instance is already running!
Either stop the previous instance first, or use the 'reload' command.

$ cat /run/upsmon.pid
14044

...also still good; same PID recorded

  • Does reload cause problems?
$  sudo ./clients/upsmon -c reload
Network UPS Tools upsmon 2.8.0-Windows-188-gc78e889c4

$ cat /run/upsmon.pid
14044

$ ps -ef | grep ups
root     14043   162  0 15:42 ?        00:00:00 /home/jim/nut/clients/.libs/upsmon
nobody   14044 14043  0 15:42 ?        00:00:00 /home/jim/nut/clients/.libs/upsmon
jim      14091   163  0 15:43 pts/2    00:00:00 grep --color=auto ups

...nope, same PID recorded and running

  • Do many reloads cause problems?
$  sudo ./clients/upsmon -c reload
Network UPS Tools upsmon 2.8.0-Windows-188-gc78e889c4

$  sudo ./clients/upsmon -c reload
Network UPS Tools upsmon 2.8.0-Windows-188-gc78e889c4

$  sudo ./clients/upsmon -c reload
Network UPS Tools upsmon 2.8.0-Windows-188-gc78e889c4

$ ps -ef | grep ups
root     14043   162  0 15:42 ?        00:00:00 /home/jim/nut/clients/.libs/upsmon
nobody   14044 14043  0 15:42 ?        00:00:00 /home/jim/nut/clients/.libs/upsmon
jim      14138   163  0 15:43 pts/2    00:00:00 grep --color=auto ups

$ cat /run/upsmon.pid
14044

... still okay

  • How about a stop command?
$  sudo ./clients/upsmon -c stop
Network UPS Tools upsmon 2.8.0-Windows-188-gc78e889c4

$ cat /run/upsmon.pid
14044

$ ps -ef | grep ups
jim      14177   163  0 15:45 pts/2    00:00:00 grep --color=auto ups

...processed well, with one minor issue: the PID file remains (consistent with the lack of unlink in the easily visible codebase)

  • Subsequent signal commands fail (as expected):
$  sudo ./clients/upsmon -c reload
Network UPS Tools upsmon 2.8.0-Windows-188-gc78e889c4
kill: No such process
Failed to signal the currently running daemon (if any)
Try 'systemctl reload nut-monitor.service' or add '-P $PID' argument

$ cat /run/upsmon.pid
14044

...but do not wipe the PID file

  • And a subsequent start just starts (after not finding the competitor):
$  sudo ./clients/upsmon
Network UPS Tools upsmon 2.8.0-Windows-188-gc78e889c4
kill: No such process
UPS: x (monitoring only)
Warning: no shutdown command defined!

$ cat /run/upsmon.pid
14210

$ ps -ef | grep ups
root     14209   162  0 15:52 ?        00:00:00 /home/jim/nut/clients/.libs/upsmon
nobody   14210 14209  0 15:52 ?        00:00:00 /home/jim/nut/clients/.libs/upsmon
nobody   14215 14210  0 15:52 ?        00:00:00 [upsmon] <defunct>
jim      14219   163  0 15:52 pts/2    00:00:00 grep --color=auto ups

$  sudo ./clients/upsmon -DDD
Network UPS Tools upsmon 2.8.0-Windows-188-gc78e889c4
kill: No such process
   0.000000     [D1] Just failed to send signal, no daemon was running
   0.000032     UPS: x (monitoring only)
   0.000050     [D1] debug level is '3'
Warning: no shutdown command defined!
   0.000062     [D1] debug level is '3'
   0.000264     [D1] Saving PID 14265 into /run/upsmon.pid
   0.000464     [D1] Succeeded to become_user(nobody): now UID=65534 GID=65534
   0.001298     [D1] Trying to connect to UPS [x]
...

Notably, the systemd unit nut-monitor starts upsmon -F, but the original problem was also not reproduced from the plain command line. So this might be an issue with systemd (e.g. could it wipe the PID file for a grandchild process?)
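
For reference, the three situations seen in this thread (no PID file at all, a stale PID file left behind after the stop command, and a PID file pointing at a running daemon) can be told apart with a probe like the one below. This is only a sketch under assumptions, not the way NUT implements its signaling: enum pid_state, write_own_pidfile() and probe_pidfile() are hypothetical names, and the check relies on the POSIX convention that kill() with signal 0 tests for a process without delivering anything.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for the probe; not part of NUT. */
enum pid_state { PID_NO_FILE, PID_BAD_CONTENT, PID_STALE, PID_ALIVE };

/* Records the calling process's PID in path; returns 0 on success. */
static int write_own_pidfile(const char *path)
{
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    fprintf(f, "%ld\n", (long)getpid());
    return fclose(f) == 0 ? 0 : -1;
}

/* Reads a PID from path and checks whether that process exists. */
static enum pid_state probe_pidfile(const char *path)
{
    FILE *f;
    long pid = 0;
    int fields;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return PID_NO_FILE;        /* the "fopen ...: No such file" case */

    fields = fscanf(f, "%ld", &pid);
    fclose(f);
    /* Reject 0 and negatives: kill() would target whole process groups. */
    if (fields != 1 || pid <= 0)
        return PID_BAD_CONTENT;

    /* Signal 0 performs only the existence and permission checks. */
    if (kill((pid_t)pid, 0) == 0)
        return PID_ALIVE;

    /* EPERM means the process exists but belongs to another user. */
    return (errno == EPERM) ? PID_ALIVE : PID_STALE;
}
```

A PID_STALE result corresponds to the "kill: No such process" case in the transcripts above, while PID_NO_FILE corresponds to the fopen error from the original report. Note that a recycled PID can make a stale file look alive, so this probe is a heuristic at best.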
