Shell Jobs randomly getting 'stuck' #1483
Comments
We are seeing the same thing. We were running 3.2.6 previously, and we're going to roll back to that version.
Having the same issue with 3.2.7.
Rolling back to 3.2.6 fixed it for us.
I tried the latest Dkron version (4.0.2), hoping it would fix the issue. It doesn't. The bug fills up our busy tab, so we're stuck on the old version. I put together a Docker Compose stack to demonstrate the bug, based on the redis command provided by @mladoi. Here is a repository with the demonstration: dkron-busy-bug. I think it's clear that this bug was introduced in 3.2.7.
Thanks @morgan-atproperties for reporting this and preparing the demo repo, we're on it.
Describe the bug
Some shell jobs are randomly getting stuck in the 'busy' state, and don't clear until we restart the dkron agent.
The job processes launched by dkron agent are finishing as expected and exiting cleanly. Dkron doesn't notice, so the job stays in the 'busy' status, and further executions don't happen (we prevent concurrency).
We recently upgraded from 3.2.0 to 3.2.7, and this problem is now happening randomly with some of our jobs.
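For anyone who wants to spot affected jobs without watching the busy tab, here is a rough sketch of a check. It assumes the list of currently running executions is available as JSON (for example, exported from the agent's REST API) and that each entry carries `job_name` and `started_at` fields. The field names, the sample payload, and the `find_stuck` helper are all assumptions for illustration, not something taken from this report.

```python
import json
from datetime import datetime, timedelta, timezone


def find_stuck(busy_json: str, now: datetime, max_age: timedelta) -> list:
    """Return the names of jobs that have been 'busy' for longer than max_age.

    Assumes each entry has 'job_name' and an RFC 3339 'started_at' timestamp
    (hypothetical field names; adjust to whatever your agent returns).
    """
    stuck = []
    for execution in json.loads(busy_json):
        # Replace a trailing 'Z' so older Python versions can parse the timestamp.
        started = datetime.fromisoformat(execution["started_at"].replace("Z", "+00:00"))
        if now - started > max_age:
            stuck.append(execution["job_name"])
    return stuck


# Hypothetical payload: one long-running entry and one that just started.
SAMPLE = """[
  {"job_name": "slow_job", "started_at": "2024-01-01T10:00:00Z"},
  {"job_name": "quick_job", "started_at": "2024-01-01T10:59:30Z"}
]"""

now = datetime(2024, 1, 1, 11, 0, 0, tzinfo=timezone.utc)
print(find_stuck(SAMPLE, now, timedelta(minutes=5)))  # ['slow_job']
```

Pick `max_age` well above the longest normal runtime of your jobs, so that only executions the agent failed to clear are flagged.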
Job Description
Logs
The job took about 15 seconds to execute, according to the logs that my dkron_run script generates.
Logs for a good job that ran at around the same time. This other job took less than a second to execute.
Specifications