Add index to good_jobs
to improve querying candidate jobs
#726
@mitchellhenke thank you! This is great:
It's perfect 😄 Naming indexes is one of my least favorite activities. |
Thank you for your work in making GoodJob! I did also experiment with adding |
@mitchellhenke another suggestion, if you're still up for profiling, is seeing if Postgres will use multiple indexes with a bitmap union: https://www.postgresql.org/docs/15/indexes-bitmap-scans.html
I am somewhat skeptical because I have yet to see a real query plan that implements this strategy, but I'm imagining that maybe three indexes could be unioned by Postgres:
add_index :good_jobs, [:priority, :created_at], order: { priority: "DESC NULLS LAST", created_at: :asc }, where: "finished_at IS NULL", name: :index_good_jobs_jobs_on_priority_created_at_when_unfinished
add_index :good_jobs, :queue_name, where: "finished_at IS NULL", name: :index_good_jobs_jobs_on_queue_name
# this index already exists, and I haven't ever seen it used in a bitmap union
add_index :good_jobs, :scheduled_at, where: "finished_at IS NULL", name: :index_good_jobs_jobs_on_scheduled_at |
I haven’t gotten a chance to try those indexes, but one thing did occur to me. Would it be possible to always set |
I've been reluctant to change the meaning of the An alternative is querying/indexing using a COALESCE statement, which I think would look something like this:
ORDER BY priority DESC NULLS LAST, COALESCE(scheduled_at, created_at) ASC, created_at ASC
I dunno if it's much better than the status quo. That's because ordering by I've also been looking at Postgres 11+ feature of index |
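To make the proposed COALESCE ordering concrete, here is a small plain-Ruby sketch that sorts in-memory records the way that `ORDER BY` clause would. The `Job` struct, the `candidate_order` helper, and the sample data are hypothetical and are not GoodJob code; timestamps are plain integers for brevity, and `created_at` is assumed to never be NULL.

```ruby
# Hypothetical stand-in for a row in good_jobs (not the real model).
Job = Struct.new(:id, :priority, :scheduled_at, :created_at)

# Mimics:
#   ORDER BY priority DESC NULLS LAST,
#            COALESCE(scheduled_at, created_at) ASC,
#            created_at ASC
def candidate_order(jobs)
  jobs.sort_by do |job|
    [
      job.priority.nil? ? 1 : 0,          # NULLS LAST: rows without a priority sort after the rest
      -(job.priority || 0),               # priority DESC
      job.scheduled_at || job.created_at, # COALESCE(scheduled_at, created_at) ASC
      job.created_at                      # created_at ASC as the final tie-breaker
    ]
  end
end

JOBS = [
  Job.new(:a, nil, nil, 10), # no priority, so it sorts last
  Job.new(:b, 5, 30, 5),     # scheduled for t=30
  Job.new(:c, 5, nil, 20),   # unscheduled, falls back to created_at=20
  Job.new(:d, 1, nil, 1)     # lower priority value than b and c
].freeze

ORDERED_IDS = candidate_order(JOBS).map(&:id)
```

With this data, job `:c` comes ahead of job `:b` even though `:b` was created earlier, because `:b` is scheduled for a later time than `:c` was created.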
Adding the index should be done in a separate migration, right? Otherwise people updating good_job from a previous version won't get them |
@coorasse yep, you'll need to update to the latest version of GoodJob and run |
Thanks! I missed this part of the docs. |
fyi, as of #928, all jobs records will have |
The preexisting index (introduced in bensheldon#726) gave a direct answer to the materialized subquery used to find candidate jobs, subject to the ordering rules that were in place at that time. In bensheldon#883, `GoodJob` deprecated that ordering in favour of a lower-is-more-important scheme, aligning with Active Job. When there are a substantial number of completed jobs, the `priority desc` index does not allow for a straight index scan, and may not actually be used by the planner. Aligning the sort orders here allows the subquery to be satisfied directly. `ASC NULLS LAST` is the default (as observable in the generated schema), but it seems appropriate to be explicit in the actual index declaration, mostly for consistency with the previous version.
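The reasoning above can be illustrated with a rough plain-Ruby model of when a B-tree index can return rows already in the requested order: the `ORDER BY` has to match the index order read forward, or its exact mirror image read backward. The `SortKey` struct, the helper names, and the assumed query ordering (`priority ASC NULLS LAST, created_at ASC`) are illustrative assumptions; this is not GoodJob code and not how the Postgres planner is implemented.

```ruby
# One sort key: column name, direction (:asc / :desc), NULL placement (:first / :last).
SortKey = Struct.new(:column, :direction, :nulls)

# Order produced by scanning the same index backward:
# every direction and every NULL placement is flipped.
def reversed(keys)
  keys.map do |k|
    SortKey.new(k.column,
                k.direction == :asc ? :desc : :asc,
                k.nulls == :first ? :last : :first)
  end
end

# True when the ORDER BY is a prefix of the index order, read forward or backward.
def index_satisfies_order?(index_keys, order_by)
  [index_keys, reversed(index_keys)].any? do |keys|
    keys.first(order_by.length) == order_by
  end
end

# Index from #726: priority DESC NULLS LAST, created_at ASC.
OLD_INDEX = [SortKey.new(:priority, :desc, :last),
             SortKey.new(:created_at, :asc, :last)].freeze

# Index with aligned sort orders: priority ASC NULLS LAST, created_at ASC.
NEW_INDEX = [SortKey.new(:priority, :asc, :last),
             SortKey.new(:created_at, :asc, :last)].freeze

# Assumed lower-is-more-important candidate ordering after #883.
QUERY_ORDER = [SortKey.new(:priority, :asc, :last),
               SortKey.new(:created_at, :asc, :last)].freeze

OLD_INDEX_USABLE = index_satisfies_order?(OLD_INDEX, QUERY_ORDER)
NEW_INDEX_USABLE = index_satisfies_order?(NEW_INDEX, QUERY_ORDER)
```

In this model the old index fails in both scan directions: read forward it yields priority descending, and read backward it yields priority ascending but with NULLs first and `created_at` descending. The aligned index matches when read forward.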
I think we might be having the same issue as described in #720 and the index appears to help quite a bit.
I tried to follow the index naming scheme, but the name is a bit wordy, so I wasn't sure 🙂
SQL Explain on a `good_jobs` table with hundreds of thousands of unfinished jobs