Have SIGTERM print a traceback #12856
Added a signal handler for SIGTERM. The plan is to switch the integration builds to send SIGTERM when we run out of time and then get a traceback. This will allow us to differentiate a crash from a timeout from the log file and exit code.
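As a rough illustration of the idea (not the code in this pull request, which lives in the C++ FWCore/Services package), here is a minimal Python sketch of a SIGTERM handler that writes a traceback to a log stream. The names `install_sigterm_traceback` and `demo` are hypothetical and exist only for this example; the real handler may also terminate the job after printing, which this sketch does not do.

```python
import io
import os
import signal
import traceback


def install_sigterm_traceback(stream):
    """Install a SIGTERM handler that writes the interrupted Python stack to `stream`.

    Returns the previously installed handler so the caller can restore it.
    (Hypothetical helper for illustration; not part of CMSSW.)
    """
    def handler(signum, frame):
        stream.write("Caught signal %d (SIGTERM); traceback follows\n" % signum)
        # `frame` is the frame that was executing when the signal arrived.
        traceback.print_stack(frame, file=stream)
        stream.flush()

    return signal.signal(signal.SIGTERM, handler)


def demo():
    """Send SIGTERM to this process and return what the handler logged."""
    log = io.StringIO()
    previous = install_sigterm_traceback(log)
    try:
        os.kill(os.getpid(), signal.SIGTERM)
        # A little Python-level work gives the interpreter a chance
        # to run the pending handler before we restore the old one.
        for _ in range(1000):
            pass
    finally:
        # signal.signal() returns None if the old handler was not set from Python.
        signal.signal(signal.SIGTERM,
                      previous if previous is not None else signal.SIG_DFL)
    return log.getvalue()
```

Calling `demo()` returns the text the handler wrote, which includes the stack of the code that was running when the signal arrived. Signal handlers in Python can only be installed from the main thread, so this sketch assumes it runs there.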
A new Pull Request was created by @Dr15Jones (Chris Jones) for CMSSW_8_0_X.
It involves the following packages:
FWCore/Services
@cmsbuild, @smuzaffar, @Dr15Jones, @davidlange6 can you please review it and eventually sign? Thanks.
please test
+1
The tests are being triggered in Jenkins.
This pull request is fully signed and it will be integrated in one of the next CMSSW_8_0_X IBs after it passes the integration tests. This pull request requires discussion in the ORP meeting before it's merged. @slava77, @davidlange6, @Degano, @smuzaffar
@smuzaffar Once this gets added to CMSSW_8_0_X we should change the IB RelVals to send a SIGTERM when the jobs reach their time limit. This will allow us to easily distinguish between a timeout and a segmentation fault.
@smuzaffar How hard would it be to change the IB RelVals of just CMSSW_8_0 to have the timeout send SIGTERM instead of SIGSEGV?
@Dr15Jones, should be trivial. I will update cms-bot to use SIGTERM for 80X and SIGSEGV for the rest.
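To make the timeout-versus-crash distinction concrete, here is a minimal Python sketch of a job runner that sends SIGTERM when the time limit is reached and then classifies the outcome from the exit status. This is not the cms-bot implementation, which is not shown in this thread; `run_with_timeout`, its parameters, and the `grace_s` fallback to SIGKILL are assumptions made for illustration. It relies on the POSIX convention, as exposed by Python's `subprocess`, that a child killed by signal N reports a return code of -N.

```python
import signal
import subprocess
import sys


def run_with_timeout(cmd, timeout_s, grace_s=5.0):
    """Run `cmd`; on timeout send SIGTERM so the job can log a traceback.

    Returns (status, returncode), where status is one of
    "ok", "failed", "crash", or "timeout".
    (Hypothetical helper for illustration; not the cms-bot code.)
    """
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Time limit reached: ask the job to stop with SIGTERM, not SIGSEGV.
        proc.send_signal(signal.SIGTERM)
        try:
            proc.wait(timeout=grace_s)
        except subprocess.TimeoutExpired:
            # The job ignored SIGTERM; fall back to SIGKILL.
            proc.kill()
            proc.wait()
        return "timeout", proc.returncode

    # The job ended on its own: a SIGSEGV exit now always means a real crash.
    if proc.returncode == -signal.SIGSEGV:
        return "crash", proc.returncode
    return ("ok" if proc.returncode == 0 else "failed"), proc.returncode


if __name__ == "__main__":
    # A job that sleeps past a short limit is reported as a timeout.
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(sleeper, timeout_s=1.0))
```

Because the runner itself records that it hit the limit, a job that exits with SIGSEGV can no longer be confused with one that was stopped for running too long.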