Job stuck in queued state and never picked up #1590
Comments
That's really weird! It sounds like it's getting into some kind of deadlock state (implied by you having to kill it). That it's happening in development only makes me think that constant autoloading/reloading is involved.

Would you be able to share what the

And do you have any initializers/configuration of GoodJob (or anything else in your code paths) that might be touching an autoloaded constant (a model or job class, for example)?
Sure, not a lot actually:

    def sync
      Accounts::SyncCalendarsJob.set(wait: 2.seconds).perform_later(self)
    end

The job then pulls the user's Google calendars and either removes them locally if they were removed upstream or just updates them:

    class Accounts::SyncCalendarsJob < UrgentJob
      def perform(account)
        upstream_calendars = account.calendar_service.list_calendar_lists
        # Delete calendars that are no longer in the list
        account.calendars.where.not(google_id: upstream_calendars.items.map(&:id)).map(&:enqueue_for_removal)
        upstream_calendars.items.map do |item|
          Calendar.from_google(item:, account:)
        ensure
          account.user.synced!
        end
      end
    end
Not a lot going on in the initializers or config folder, just configuring some gems like pay-rails, high-voltage and rails_icons. I do have Sentry but it is behind a

However, there is something I'm now thinking about: a calendar is linked to a webhook (called a watch channel), and if the user decides to stop listening to the webhook, I need to delete the job that is enqueued to renew it:

    class WatchChannel < ApplicationRecord
      before_destroy :dequeue_renewal_job, if: -> { !expired? }
      ...
      def dequeue_renewal_job
        GoodJob::Job.scheduled.find_by(id: renewal_job_id)&.discard_job(
          "Watch channel stopped earlier"
        )
      end
That deletion is not my favorite (generally not safe to reach directly into the Jobs) but I don't think it would be the cause of this (though maybe!)

I think you could remove it by having the renewal job do

I still don't think that it would lock up the entire process though; the process heartbeats happen on their own thread/scheduler and should be independent of jobs. It still looks to me like a Ruby VM deadlock. 🤔
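For illustration, one general way to avoid discarding a scheduled job from the outside is to let the job decide at run time whether it still has work to do, using a guard clause at the top of `perform`. This is only a sketch with plain-Ruby stand-ins, and it may not be the exact approach suggested above: `FakeWatchChannel`, `RenewWatchChannelJob`, and the `stopped` flag are hypothetical names, not part of the app in this thread or of GoodJob.

```ruby
# Stand-in for the WatchChannel record (hypothetical; no Rails needed to run this).
FakeWatchChannel = Struct.new(:id, :stopped, :renewed) do
  def stopped?
    stopped
  end

  def renew!
    self.renewed = true
  end
end

# Hypothetical renewal job: it checks at run time whether there is still work to do.
class RenewWatchChannelJob
  def perform(channel)
    # Guard clause: if the channel is gone or was stopped, finish without doing anything.
    return :skipped if channel.nil? || channel.stopped?

    channel.renew!
    :renewed
  end
end

active  = FakeWatchChannel.new(1, false, false)
stopped = FakeWatchChannel.new(2, true, false)

job = RenewWatchChannelJob.new
puts job.perform(active)
puts job.perform(stopped)
puts job.perform(nil)
```

With a guard like this, a `before_destroy` callback would not need to touch `GoodJob::Job` at all; the scheduled job would simply run and return early.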
Here's the best way to debug this:
I have a script here with some helpers if you wanted to see what else you might do, but it's that
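As a generic illustration (not necessarily the steps meant above), a common way to see where a stuck Ruby process is blocked is to print every live thread's backtrace. The helper name `dump_thread_backtraces` is made up for this sketch; it relies only on `Thread.list`, `Thread#status`, and `Thread#backtrace` from Ruby core.

```ruby
require "stringio"

# Hypothetical helper: writes the status and backtrace of every live thread to io.
def dump_thread_backtraces(io = $stderr)
  Thread.list.each do |thread|
    io.puts "Thread #{thread.object_id} status=#{thread.status.inspect} name=#{thread.name.inspect}"
    # backtrace is nil for dead threads, hence the fallback to an empty array.
    (thread.backtrace || []).each { |line| io.puts "  #{line}" }
    io.puts
  end
  io
end

# Demo: a worker parked in sleep, similar to a thread waiting on a lock.
worker = Thread.new { sleep }
worker.name = "parked-worker"
sleep 0.05 until worker.status == "sleep"

output = dump_thread_backtraces(StringIO.new).string
puts output
worker.kill
```

In a real app the helper would have to be invoked from inside the stuck process (for example from a signal handler or a debugger session), which is outside the scope of this sketch. Threads that all report a `sleep` status while waiting on a lock or on class loading would be consistent with the deadlock theory.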
Unfortunately it is not; it is triggered by the calendar, so the calendar is the argument.

Unrelated, but now I see that "renew" is the wrong term here. There is no renewal: once a webhook expires, you create a new one rather than renewing it. I will add

Thanks a lot once more, Ben!
Hey 👋🏻
Every now and then, I see a weird issue, just in development: a job gets scheduled but never picked up.
I have an endpoint that schedules a job when it gets a callback:
Sometimes, without any specific trigger or event, the job gets queued but doesn't get picked up:
This one is sitting in the queue for 5 minutes.
I also noticed that when this happens, it's like the process enters a deadlock state or something. I say this because it stops pinging:
and when I stop the process, it actually has to get killed
while when everything is fine, it just exits
I'm not sure if this rings a bell, but I'm seeing this on two different machines, in dev only.
Here is my config:
Running Rails from the main branch and the latest Good Job (4.8.2 as of now). Both machines are Macs: one is an M1 and the other an M2 Pro. I start my dev env with Overmind:
and my Procfile looks like this:
I also don't know exactly what triggers this. I'm trying to reproduce it and find something, but I just wanted to put this out there in case you know something.