Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

upsdrvctl can not run in the foreground #1874

Closed
c1emon opened this issue Mar 20, 2023 · 21 comments
Closed

upsdrvctl can not run in the foreground #1874

c1emon opened this issue Mar 20, 2023 · 21 comments
Labels
bug service/daemon start/stop General subject for starting and stopping NUT daemons (drivers, server, monitor); also BG/FG/Debug
Milestone

Comments

@c1emon
Copy link

c1emon commented Mar 20, 2023

Hello.
I just want to run single driver in the foreground. Just add -F option did not work.
The upsdrvctl is compiled in ubuntu 22.04 x64 with follow cmds:

apt -y install git python3 curl build-essential automake libtool m4 autoconf libmodbus-dev
./autogen.sh
./configure --with-serial=yes --with-modbus=yes --with-systemdsystemunitdir=no
make

So... debug and find some problem with waitpid()

ret = waitpid(pid, &wstat, 0);

Every time startup, waitpid() return -1 and now errno is EINTR.
Follow code just ignore EINTR and fix this bug(?) for me.

while ((ret = waitpid(pid, &wstat, 0)) != pid) {
          if( EINTR == errno) {
                  upslogx(LOG_WARNING, "ignore errno: %d", errno);
                  continue;
          }
          break;
}

Did this a bug or something else?

@c1emon
Copy link
Author

c1emon commented Mar 20, 2023

Oh, forget this.
Even upsdrvctl exited for errno EINTR, the driver process still runs background.

@jimklimov
Copy link
Member

jimklimov commented Mar 20, 2023

Seems related to issue #1759 and PR #1806

@jimklimov jimklimov added the service/daemon start/stop General subject for starting and stopping NUT daemons (drivers, server, monitor); also BG/FG/Debug label Mar 20, 2023
@jimklimov
Copy link
Member

jimklimov commented Mar 20, 2023

Thinking about it, the bigger context is (I think) that "normally" - per older NUT behavior - drivers were always backgrounding except when debugged (then they were foregrounded unconditionally). Relatively recently explicit options were added to ensure fore-/back-grounding of drivers and other daemons. Also #1287 was posted to pass debugging level request from upsdrvctl to an actual driver - which (per #1759) backfired by keeping drivers foregrounded because debug was enabled; they never detached from the parent process and timed out on service starts IIRC (and made managing multiple drivers in one command like upsdrvctl start problematic). Finally #1806 was added to also allow explicitly requesting that the actual driver started via upsdrvctl is backgrounded even if with debug, or foregrounded even if quiet.

@jimklimov
Copy link
Member

Just to be sure - are you building current/recent Git HEAD of NUT? Not an older release tag like 2.7.4 or 2.8.0? (That PR was merged this January).

Also, what command are you starting the driver with? Including the device name on command line, or not?

@c1emon
Copy link
Author

c1emon commented Mar 21, 2023

Thank you for answer. I just want to make this run in docker with my Arm device and one container per driver. I indeed compile recent Git HEAD of NUT for -F option and run /nut/upsdrvctl -F -u nut start dummy for test. Tested with both dummy driver and huawei-ups2000, both same problem. And seems -t option make upsdrvctl skip forkexec, so itself won't fork and exit.

@jimklimov
Copy link
Member

jimklimov commented Mar 21, 2023

Still can't wrap my head around why people want containers for NUT. It has many purposes, but usually the main one is to shut down the system, not sure if containers may command the host environment that much. (Well, with a bind-mounted host procfs you can echo into sysrq handler or some such).

Likewise, on the system managing the UPSes it should run a copy of the driver late in shutdown routine (with FSes unmounted etc.) after all clients have disconnected, to tell the UPS to cut power, where possible. I suppose no containers are runnable at that point...

But please, feel free to enlighten me, I do have a feeling people know something I don't :D

@jimklimov
Copy link
Member

At least, after some trial and error I can indeed confirm some sort of bug here, thanks.

It seems the fix for foreground mode did not account in forkexec() for not-waiting for the child process to fork away. The upsdrvctl originally waited for maxstartdelay (UPS or global setting), and since there was no exit-code from waitpid() the driver "apparently" did not initialize in time and fork off, so it is restarted and retried... 3 times and bust.

@jimklimov
Copy link
Member

On a side note, these retries with a dummy-ups driver left me with 3 copies actually running. "Real" drivers, at least some of them, should kill off older copies (if their PID file is found). Not sure OTOH if dummy-ups does that; but if others do not - it is a bug (although a separate one). For more context on this bit see issue #1423 / PR #1424

@jimklimov jimklimov added the bug label Mar 21, 2023
@jimklimov jimklimov added this to the 2.8.1 milestone Mar 21, 2023
@c1emon
Copy link
Author

c1emon commented Mar 21, 2023

The upsd and driver run on an arm device (like Respberry Pi), which I don't care force shutdown. Other devices run nut client and connect to upsd(on arm device via network). For convenience, I make docker image for nut to run on such device(special linux release without package manager like apt). Don't known it is a correct way and I just use a huawei ups2000 for my home lab.

@c1emon
Copy link
Author

c1emon commented Mar 21, 2023

For this bug. I think when need run in the foreground, invoke fork() and execv() in a for loop and finally use waitpid(-1, &wstat, 0)) block here until all drivers exit.
For background, make drivers become deamon process???
Just some little thought. Maybe unusable...

@jimklimov
Copy link
Member

Got a holistic fix in progress...

@jimklimov
Copy link
Member

Regarding runs "on ARM" - ideally you do care to shut it down, at least to have filesystems safe. On a Raspberry with MMC/SD card you might also want to look into reducing writes (to minimize flash wear-out), e.g. running the host OS mounted read-only mostly and containers with virtual file systems (tmpfs...) if you truly do not care about their data.

Still, if you want your home lab to come back up automagically in most cases, you should look into telling the UPS to cut power when things begin a shutdown routine. This should ensure that whenever wall power appears (and maybe when batteries become sufficiently charged), all your useful load will power on - even if you are not home to push power buttons. With reasonable UPSes this should also cover the case that power returned too early - while you were shutting down (by Murphy's law, this does happen and too often): it turns off, discovers wall power, turns on, acting as a big reset for everything it feeds.

@jimklimov
Copy link
Member

Would you like to check in practice if that PR's source branch works better for your use-case, before I merge it?

@c1emon
Copy link
Author

c1emon commented Mar 22, 2023

PR #1424 was tested on huawei ups2000, fore/background mode both work very well!

And thank you very much for the elegant solution #1874 (comment), I will try later~~

@jimklimov
Copy link
Member

I meant to test PR #1875 - git branch at https://github.com/jimklimov/nut/tree/issue-1874

Rough start could be like:

:; git clone -b issue-1874 https://github.com/jimklimov/nut nut-issue-1874
:; cd nut-issue-1874 && ./ci_build.sh

more suggestions in https://github.com/networkupstools/nut/wiki/Building-NUT-for-in%E2%80%90place-upgrades-or-non%E2%80%90disruptive-tests

@c1emon
Copy link
Author

c1emon commented Mar 23, 2023

Seems a little bug here
https://github.com/jimklimov/nut/blob/d80478d5db05b4cd8014a927d14d9d78c935ef12/drivers/upsdrvctl.c#L252-L269
At this point, check if the driver really shutdown?
But if driver has killed before, there is no pid file available and sendsignalfn(pidfn, 0) will return -3.
The log show

fopen /var/state/ups/dummy-ups-ups0.pid: No such file or directory
Sending signal to /var/state/ups/dummy-ups-ups0.pid failed, driver is finally down or wrongly owned

@jimklimov
Copy link
Member

Well, sending "signal 0" is the OS-independent way of checking if the other process exists. Fiddling with e.g. /proc/$PID directories is not portable, to the extent that many OSes do not serve those, and running ps programs is quite expensive and they also are not ubiquitously available nor have same sort of output patterns to parse (something I did often have as an option in admin scripts).

However this can be compromised by lack of permission to send the signal (unprivileged user owning upsdrvctl may not signal a driver running as another unprivileged user), or in this case - that the driver had probably shut down cleanly and removed the PID file so we could not parse its PID from that to signal it. For the method, the verdict is a singluar failure state = "could not signal, don't know why" although the logs tell the detail; to humans.

This probably could be refactored (return more states than zero/nonzero, parse the PID file once and error out if it does not exist; use the same PID number in later retries and avoid extra disk I/O and string-to-number parsing), but so far it was not the focus of the exercise.

@jimklimov
Copy link
Member

Other than the messages, did it work functionally? :)

Also, in this test case, you seem to have checked upsdrvctl stop (drivername) (not the case of running upsdrvctl start -F ... and then breaking with Ctrl+C which would stop the foregrounded drivers) - right?

@c1emon
Copy link
Author

c1emon commented Mar 23, 2023

Yes, driver can run in foreground with -F and then stopped by Ctrl+C. And upsdrvctl stop can indeed stop backgrounded drivers.
Last but not least, learned a lot from your explanation, thanks~

@jimklimov
Copy link
Member

Well, thanks to you too :)

Feel free to hang around, find and maybe even fix some more issues, scratch some itches :)

@jimklimov
Copy link
Member

PR merged, closing issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug service/daemon start/stop General subject for starting and stopping NUT daemons (drivers, server, monitor); also BG/FG/Debug
Projects
None yet
Development

No branches or pull requests

2 participants