Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unhandled python exceptions do not result in non-zero exit codes in Win32Services #898

Closed
darad opened this issue Jan 26, 2021 · 11 comments · Fixed by #1783
Closed

Unhandled python exceptions do not result in non-zero exit codes in Win32Services #898

darad opened this issue Jan 26, 2021 · 11 comments · Fixed by #1783

Comments

@darad
Copy link

darad commented Jan 26, 2021

We have a python program cx-frozen into a Win32Service. The service must be running all the time and needs to recover from crashes automatically.

In face of unhandled python exceptions (ie fatal error), the running service seems to exit with a zero exit code leading the service state to STOPPED state. This render any of the Windows Service Recovery strategies ineffective as they only respond to non-zero ExitCodes.

As a workaround, we have resorted to a secondary service that's always checking the state of the frozen service. If it's in STOPPED state, it restarts it. That's an extra layer of management we need to maintain.

Can the service api change in a way that the exit codes reflects more closely the outcome of python program's execution?

@marcelotduarte
Copy link
Owner

I think so. You can put more details like a sample? Do your service is based on the sample in the repository? Do you use the initialize, stop, run and state_chaged in the Handler class?

@TechnicalPirate
Copy link
Contributor

TechnicalPirate commented Feb 8, 2021

Hi Marcel,

I spent some time today replicating this.

Using latest release ( 6.3 ) + the service sample i did the following.

  • Replaced Run with:
def Run(self):
        raise RuntimeError("Doomsday")
        self.stopEvent.set()
  • Built the service as usual + installed it via admin cmd prompt
  • I set the recovery options of the service
    image
  • I start the service by right clicking > start
  • The service immediately stops
  • Checking the log there is the one error
[15008] 2021/02/09 10:29:45.078 starting logging at level ERROR
[15008] 2021/02/09 10:29:45.143 Python exception encountered:
[15008] 2021/02/09 10:29:45.143     Internal Message: exception running service
[15008] 2021/02/09 10:29:45.144     Type => <class 'RuntimeError'>
[15008] 2021/02/09 10:29:45.144     Value => Doomsday
[15008] 2021/02/09 10:29:45.144     Traceback (most recent call last):

[15008] 2021/02/09 10:29:45.144       File "C:\King\repos\git\WinServiceTesting\cx_sample\service\ServiceHandler.py", line 27, in Run
    raise RuntimeError("Doomsday")

[15008] 2021/02/09 10:29:45.146     RuntimeError: Doomsday

[04308] 2021/02/09 10:29:45.147 ending logging
  • Windows makes no attempt to restart the service

In the scenario of an unhandled exception being raised in any of the Services methods ( Install / Start / Stop / Run / state/session change ) i would expect the process to exit with a None-Zero exit code so the windows recovery options can react to the services death.

Having a quick look at the underlying implementation it looks like it will always return 0....

I hope that all makes sense / please say if i we can do anything to help out!
And obligatory - thank you so much for your work on this awesome package!

@marcelotduarte
Copy link
Owner

it always returns 0

Yes, when the service run is ok.
The second line (and following lines) of the log, are output of:

return LogPythonException("exception running service");

And LogPythonException returns -1
https://github.com/anthony-tuininga/cx_Logging/blob/master/src/cx_Logging.c#L1529-L1546

to

// run the service
if (Service_Run(&info) < 0) {
Service_SetStatus(&info, SERVICE_STOPPED);
return;
}

Then you should use state_changed to catch this, and in Config.py it should be set to true (defaults to false)
SESSION_CHANGES = False

@TechnicalPirate
Copy link
Contributor

TechnicalPirate commented Feb 10, 2021

Hi Marcel

I tried ( i think! ) your suggestions but am still finding it always exits cleanly and doesn't trigger any recovery actions.

(btw When searching this git project i can't find any references to state_changed anywhere....)

I added a logger to the Handler class to try your suggestion, please find the modified service sample below :)

With this + SESSION_CHANGES set to True - i never see any logs from StateChanged / state_changed or / SessionChanged*( in this sample - in our main product SessionChanged works as expected for logging events like user logins etc )

import threading
import katlog
import os

def _get_logger():
    _log_name = "cx_service_test"
    logger_obj = katlog.get_template_kat_logger(
        _log_name,
        output_directory=katlog.get_kat_tool_logs_subdir(_log_name),
        file_name=_log_name,
        log_file_prefix="pid{}".format(os.getpid()),
    )
    logger_obj.info("LOGGER Created '{}'".format(_log_name))
    return logger_obj


class Handler(object):

    LOGGER = None

    def __init__(self):
        self.stopEvent = threading.Event()
        self.stopRequestedEvent = threading.Event()

    def Initialize(self, configFileName):

        Handler.LOGGER = _get_logger()
        Handler.LOGGER.debug("WinService.Initialize: Loading Virtual Config: {}".format(configFileName))
        Handler.LOGGER.debug("WinService.Initialize: Creating Instance")

    def Run(self):
        Handler.LOGGER.debug("WinService.Run: About To Explode")
        raise RuntimeError("Doomsday")
        Handler.LOGGER.debug("WinService.Run: After Explosion")
        self.stopEvent.set()

    def Stop(self):
        Handler.LOGGER.debug("WinService.Stop: About To Stop")
        self.stopRequestedEvent.set()
        self.stopEvent.wait()

    def StateChanged(self, *args, **kwargs):
        Handler.LOGGER.debug("WinService.StateChanged: State Changed")
        Handler.LOGGER.debug("{}, {}".format(args, kwargs))

    def state_changed(self, *args, **kwargs):
        Handler.LOGGER.debug("WinService.state_changed: State Changed")
        Handler.LOGGER.debug("{}, {}".format(args, kwargs))

    def SessionChanged(self, *args, **kwargs):
        Handler.LOGGER.debug("WinService.SessionChanged: {} {}".format(args, kwargs))

Log Output...

2021-02-10 10:51:50,149: INFO: katLogging: katLogging.py: MainThread: cx_service_test: get_template_kat_logger: 59: Created new Logger cx_service_test
2021-02-10 10:51:50,150: INFO: ServiceHandler: ServiceHandler.py: MainThread: cx_service_test: _get_logger: 22: LOGGER Created 'cx_service_test'
2021-02-10 10:51:50,150: INFO: __init__: __init__.py: MainThread: cx_service_test: enable_api_logger: 25: Connected katapi logger.
2021-02-10 10:51:50,150: DEBUG: ServiceHandler: ServiceHandler.py: MainThread: cx_service_test: Initialize: 40: WinService.Initialize: Loading Virtual Config: C:\King\repos\git\WinServiceTesting\cx_sample\service\build\exe.win32-3.5\cx_FreezeSampleService.ini
2021-02-10 10:51:50,150: DEBUG: ServiceHandler: ServiceHandler.py: MainThread: cx_service_test: Initialize: 41: WinService.Initialize: Creating Instance
2021-02-10 10:51:50,151: DEBUG: ServiceHandler: ServiceHandler.py: MainThread: cx_service_test: Run: 50: WinService.Run: About To Explode

Side note.... Re:

 // run the service 
 if (Service_Run(&info) < 0) { 
     Service_SetStatus(&info, SERVICE_STOPPED); 
     return; 
 } 

Wouldn't this return 0/None/Null?

@johan-ronnkvist
Copy link
Contributor

From a brief reading, it seems that it's possible to set the exit code from a stopped service in the call to Service_SetStatus().

According to the documentation for windows services status, it's possible to use the param dwWin32ExitCode to indicate what the service should report when stopping. Currently, this value is always set to 0.

If the function Service_SetStatus() was given another parameter to indicate the intended exit code, this could likely be fixed. At the moment I'm not able to test, verify and offer a concrete PR for this change - but hopefully, this will make the process easier for the person who's looking into this issue.

@marcelotduarte
Copy link
Owner

marcelotduarte commented Feb 19, 2023

@darad @TechnicalPirate Can you test the @johron PR?

To install the latest development build:
pip install --upgrade --pre --extra-index-url https://marcelotduarte.github.io/packages/ cx_Freeze

@TechnicalPirate
Copy link
Contributor

Hey Marcel - thanks for flagging it - I will put it on my TODO list to test.

Thank you for taking this Johron! :)

@TechnicalPirate
Copy link
Contributor

Hi @marcelotduarte - I spent some time with Johron this week :)

  • We reproduce the issue again on the latest version to confirm nothing had changed in the interim
  • We then restored his PR / brought it up to the latest master
  • Found some issues with it -> which we've now addressed
  • I can now confirm that the PR is fresh & addresses the issue - the expected behavior is now being performed.

@marcelotduarte
Copy link
Owner

@TechnicalPirate Did you test today's last commit?

@TechnicalPirate
Copy link
Contributor

TechnicalPirate commented May 24, 2023

@marcelotduarte - Yep, we actually ended up on a call to go through and ensure we were both observing the same behaviour prior to making the change -> then confirming service recovery works as expected.

There is one small nuance that's worth commenting on here - even with the latest commit - there is an implicit assumption that the service has to enter a running state for recovery measures to kick-in if the service process encounters an unhandled exception during execution. ( Something we discovered during testing ).

But I can confirm that as of the latest commit a9e0df3 on that PR - everything is tested and working 👍

@marcelotduarte marcelotduarte linked a pull request May 24, 2023 that will close this issue
@marcelotduarte
Copy link
Owner

Release 6.15.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants