Optimise render-time link expansion for GraphQL (all editions, except ministers index page) #3102

Merged (3 commits) on Jan 30, 2025

@brucebolt brucebolt commented Jan 28, 2025

This makes two changes to optimise the performance of render-time link expansion in GraphQL for all editions (except the ministers index page).

Together, these changes reduce the total response time for a GraphQL query on the prime minister's page from ~4,500ms (ActiveRecord: ~360ms) to ~3,200ms (ActiveRecord: ~250ms) when run locally in govuk-docker. When queries are made against Publishing API in the integration environment, the total request duration falls from ~1,000ms to ~800ms.

1. Reduction in number of queries for link expansion

This halves the number of database queries needed to retrieve links when doing render-time link expansion on GraphQL queries, by combining the retrieval of link set and edition links into a single database query.

For example, this reduces the queries for rendering the prime minister page from 162 to 81.
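
As a rough illustration of this first change, the sketch below contrasts fetching edition links and link set links separately with fetching both in one pass using an OR condition. It is plain Ruby over an in-memory array, not the actual `LinkedToEditionSource` code: `FakeLinksTable`, its `where` method, the helper names and the sample rows are all hypothetical. Only the ids 14331776 and 36866 and the link type are taken from the query quoted in this description.

```ruby
# Hypothetical row shape; the real links table has more columns.
Link = Struct.new(:edition_id, :link_set_id, :link_type, :position, :target_content_id, keyword_init: true)

# Hypothetical stand-in for the links table that counts the "queries" it serves.
class FakeLinksTable
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  # Each call stands in for one SELECT against the database.
  def where(&condition)
    @query_count += 1
    @rows.select(&condition).sort_by(&:position)
  end
end

# Made-up sample data: one edition link and two link set links.
LINK_ROWS = [
  Link.new(edition_id: 14331776, link_set_id: nil, link_type: "ordered_parent_organisations", position: 0, target_content_id: "org-a"),
  Link.new(edition_id: nil, link_set_id: 36866, link_type: "ordered_parent_organisations", position: 1, target_content_id: "org-b"),
  Link.new(edition_id: nil, link_set_id: 36866, link_type: "ordered_roles", position: 0, target_content_id: "role-a"),
].freeze

# Before: one query for edition links and another for link set links.
def links_with_two_queries(table, edition_id:, link_set_id:, link_type:)
  edition_links = table.where { |l| l.edition_id == edition_id && l.link_type == link_type }
  link_set_links = table.where { |l| l.link_set_id == link_set_id && l.link_type == link_type }
  (edition_links + link_set_links).sort_by(&:position)
end

# After: a single query with an OR across the two owner columns.
def links_with_one_query(table, edition_id:, link_set_id:, link_type:)
  table.where do |l|
    (l.edition_id == edition_id || l.link_set_id == link_set_id) && l.link_type == link_type
  end
end
```

Both helpers return the same links for a given edition, link set and link type, but the second calls `where` once instead of twice. If every lookup previously issued two queries, that is consistent with the drop from 162 to 81 reported above.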

2. Adding an index to optimise edition links

We are currently querying on edition_id and link_type in GraphQL's LinkedToEditionSource dataloader.

At the moment, we are only using an index for the edition_id part of the query, not the link_type part as well.

This adds a composite index on edition_id and link_type, so that a single index can handle both parts of this query.
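
To illustrate why a composite index helps, here is a minimal sketch in plain Ruby that models an index as a hash. It is not how PostgreSQL stores its indexes, and `build_index`, the lookup helpers and the generated rows are made up for the example. The figure of 133 rows is borrowed from the planner's row estimate for `index_links_on_edition_id` in the query explanation in this description, purely to give the example a realistic scale.

```ruby
# Hypothetical rows: one edition with 133 links, only two of the type we want.
Row = Struct.new(:id, :edition_id, :link_type)

EDITION_ROWS = (1..133).map do |i|
  Row.new(i, 14331776, i <= 2 ? "ordered_parent_organisations" : "other_type_#{i}")
end.freeze

# Model an index as a hash from the indexed column values to the matching rows.
def build_index(rows, *columns)
  rows.group_by { |row| columns.map { |column| row[column] } }
end

# Index on edition_id only: fetch every row for the edition, then filter by link_type.
# Returns [number of rows examined, rows kept].
def lookup_with_single_column_index(index, edition_id, link_type)
  candidates = index.fetch([edition_id], [])
  [candidates.size, candidates.select { |row| row.link_type == link_type }]
end

# Index on (edition_id, link_type): the key already includes link_type, so nothing is filtered.
# Returns [number of rows examined, rows kept].
def lookup_with_composite_index(index, edition_id, link_type)
  matches = index.fetch([edition_id, link_type], [])
  [matches.size, matches]
end

SINGLE_INDEX = build_index(EDITION_ROWS, :edition_id)
COMPOSITE_INDEX = build_index(EDITION_ROWS, :edition_id, :link_type)

examined, kept = lookup_with_single_column_index(SINGLE_INDEX, 14331776, "ordered_parent_organisations")
puts "edition_id index: examined #{examined} rows, kept #{kept.size}"

examined, kept = lookup_with_composite_index(COMPOSITE_INDEX, 14331776, "ordered_parent_organisations")
puts "edition_id + link_type index: examined #{examined} rows, kept #{kept.size}"
```

In a Rails migration, an index like this is typically created with something along the lines of `add_index :links, %i[edition_id link_type]`, which by default would be named `index_links_on_edition_id_and_link_type`. The exact migration used in this PR is not reproduced here.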

Given the query (as generated by ActiveRecord):

SELECT "links"."id" AS t0_r0, "links"."link_set_id" AS t0_r1, "links"."target_content_id" AS t0_r2, "links"."link_type" AS t0_r3, "links"."created_at" AS t0_r4, "links"."updated_at" AS t0_r5, "links"."position" AS t0_r6, "links"."edition_id" AS t0_r7, "target_documents"."id" AS t1_r0, "target_documents"."content_id" AS t1_r1, "target_documents"."locale" AS t1_r2, "target_documents"."stale_lock_version" AS t1_r3, "target_documents"."created_at" AS t1_r4, "target_documents"."updated_at" AS t1_r5, "target_documents"."owning_document_id" AS t1_r6, "editions"."id" AS t2_r0, "editions"."title" AS t2_r1, "editions"."public_updated_at" AS t2_r2, "editions"."publishing_app" AS t2_r3, "editions"."rendering_app" AS t2_r4, "editions"."update_type" AS t2_r5, "editions"."phase" AS t2_r6, "editions"."analytics_identifier" AS t2_r7, "editions"."created_at" AS t2_r8, "editions"."updated_at" AS t2_r9, "editions"."document_type" AS t2_r10, "editions"."schema_name" AS t2_r11, "editions"."first_published_at" AS t2_r12, "editions"."last_edited_at" AS t2_r13, "editions"."state" AS t2_r14, "editions"."user_facing_version" AS t2_r15, "editions"."base_path" AS t2_r16, "editions"."content_store" AS t2_r17, "editions"."document_id" AS t2_r18, "editions"."description" AS t2_r19, "editions"."publishing_request_id" AS t2_r20, "editions"."major_published_at" AS t2_r21, "editions"."published_at" AS t2_r22, "editions"."publishing_api_first_published_at" AS t2_r23, "editions"."publishing_api_last_edited_at" AS t2_r24, "editions"."auth_bypass_ids" AS t2_r25, "editions"."details" AS t2_r26, "editions"."routes" AS t2_r27, "editions"."redirects" AS t2_r28, "editions"."last_edited_by_editor_id" AS t2_r29 FROM "links" LEFT OUTER JOIN "documents" "target_documents" ON "target_documents"."content_id" = "links"."target_content_id" LEFT OUTER JOIN "editions" ON "editions"."content_store" = 'live' AND "editions"."document_id" = "target_documents"."id" WHERE ("links"."edition_id" = 14331776 OR 
"links"."link_set_id" = 36866) AND "links"."link_type" = 'ordered_parent_organisations' AND "target_documents"."locale" = 'en' ORDER BY "links"."link_type" ASC, "links"."position" ASC

Query explanation before the new index:

Sort  (cost=549.70..549.70 rows=1 width=901) (actual time=0.486..0.487 rows=2 loops=1)
  Sort Key: links."position"
  Sort Method: quicksort  Memory: 26kB
  ->  Nested Loop Left Join  (cost=10.87..549.69 rows=1 width=901) (actual time=0.348..0.449 rows=2 loops=1)
        ->  Nested Loop  (cost=10.44..546.17 rows=1 width=114) (actual time=0.270..0.333 rows=2 loops=1)
              ->  Bitmap Heap Scan on links  (cost=10.01..537.72 rows=1 width=67) (actual time=0.200..0.201 rows=2 loops=1)
                    Recheck Cond: ((edition_id = 14331776) OR ((link_set_id = 36866) AND ((link_type)::text = 'ordered_parent_organisations'::text)))
                    Filter: ((link_type)::text = 'ordered_parent_organisations'::text)
                    Heap Blocks: exact=1
                    ->  BitmapOr  (cost=10.01..10.01 rows=133 width=0) (actual time=0.153..0.154 rows=0 loops=1)
                          ->  Bitmap Index Scan on index_links_on_edition_id  (cost=0.00..5.44 rows=133 width=0) (actual time=0.080..0.080 rows=0 loops=1)
                                Index Cond: (edition_id = 14331776)
                          ->  Bitmap Index Scan on index_links_on_link_set_id_and_link_type  (cost=0.00..4.58 rows=1 width=0) (actual time=0.067..0.067 rows=2 loops=1)
                                Index Cond: ((link_set_id = 36866) AND ((link_type)::text = 'ordered_parent_organisations'::text))
              ->  Index Scan using index_documents_on_content_id_and_locale on documents target_documents  (cost=0.43..8.45 rows=1 width=47) (actual time=0.064..0.064 rows=1 loops=2)
                    Index Cond: ((content_id = links.target_content_id) AND ((locale)::text = 'en'::text))
        ->  Index Scan using index_editions_on_document_id_and_document_type_live on editions  (cost=0.42..28.25 rows=9 width=787) (actual time=0.056..0.056 rows=1 loops=2)
              Index Cond: (document_id = target_documents.id)
Planning Time: 5.328 ms
Execution Time: 0.590 ms

Query explanation after new index:

Sort  (cost=25.14..25.15 rows=1 width=901) (actual time=0.092..0.094 rows=2 loops=1)
  Sort Key: links."position"
  Sort Method: quicksort  Memory: 26kB
  ->  Nested Loop Left Join  (cost=10.00..25.13 rows=1 width=901) (actual time=0.061..0.080 rows=2 loops=1)
        ->  Nested Loop  (cost=9.58..21.62 rows=1 width=114) (actual time=0.048..0.059 rows=2 loops=1)
              ->  Bitmap Heap Scan on links  (cost=9.15..13.17 rows=1 width=67) (actual time=0.033..0.034 rows=2 loops=1)
                    Recheck Cond: (((edition_id = 14331776) AND ((link_type)::text = 'ordered_parent_organisations'::text)) OR ((link_set_id = 36866) AND ((link_type)::text = 'ordered_parent_organisations'::text)))
                    Heap Blocks: exact=1
                    ->  BitmapOr  (cost=9.15..9.15 rows=1 width=0) (actual time=0.026..0.027 rows=0 loops=1)
                          ->  Bitmap Index Scan on index_links_on_edition_id_and_link_type  (cost=0.00..4.58 rows=1 width=0) (actual time=0.013..0.013 rows=0 loops=1)
                                Index Cond: ((edition_id = 14331776) AND ((link_type)::text = 'ordered_parent_organisations'::text))
                          ->  Bitmap Index Scan on index_links_on_link_set_id_and_link_type  (cost=0.00..4.58 rows=1 width=0) (actual time=0.012..0.012 rows=2 loops=1)
                                Index Cond: ((link_set_id = 36866) AND ((link_type)::text = 'ordered_parent_organisations'::text))
              ->  Index Scan using index_documents_on_content_id_and_locale on documents target_documents  (cost=0.43..8.45 rows=1 width=47) (actual time=0.009..0.009 rows=1 loops=2)
                    Index Cond: ((content_id = links.target_content_id) AND ((locale)::text = 'en'::text))
        ->  Index Scan using index_editions_on_document_id_and_document_type_live on editions  (cost=0.42..28.25 rows=9 width=787) (actual time=0.008..0.008 rows=1 loops=2)
              Index Cond: (document_id = target_documents.id)
Planning Time: 1.005 ms
Execution Time: 0.155 ms
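
For a quick sense of scale, the headline figures from the two explanations above can be compared directly. The snippet below is only arithmetic on numbers copied from this description, and `improvement_factors` is a helper named for the example. These are single local runs (planning time in particular can vary between runs), so the ratios are indicative rather than a benchmark.

```ruby
# Figures copied from the two query explanations above (top-level Sort cost,
# planning time and execution time).
PLAN_BEFORE = { planning_ms: 5.328, execution_ms: 0.590, total_cost: 549.70 }.freeze
PLAN_AFTER = { planning_ms: 1.005, execution_ms: 0.155, total_cost: 25.15 }.freeze

# For each metric, how many times smaller the "after" figure is than the "before" one.
def improvement_factors(before, after)
  before.to_h { |metric, value| [metric, (value / after.fetch(metric)).round(1)] }
end

improvement_factors(PLAN_BEFORE, PLAN_AFTER).each do |metric, factor|
  puts "#{metric}: #{factor}x lower"
end
```

On these figures, planning time is roughly 5.3x lower, execution time roughly 3.8x lower, and the planner's estimated total cost roughly 21.9x lower.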

Trello card

@brucebolt brucebolt force-pushed the reduce-graphql-queries branch 3 times, most recently from cff4cb6 to 4880fcd Compare January 28, 2025 10:56
@brucebolt brucebolt changed the base branch from main to add-graphql-optimisation-index January 28, 2025 10:56
@brucebolt brucebolt changed the title Reduce database queries for GraphQL link expansion Reduce database queries for GraphQL link expansion (all non-ministers index page edition) Jan 28, 2025
@brucebolt brucebolt changed the title Reduce database queries for GraphQL link expansion (all non-ministers index page edition) Optimise render-time link expansion for GraphQL (all non-ministers index page edition) Jan 28, 2025
@brucebolt brucebolt changed the title Optimise render-time link expansion for GraphQL (all non-ministers index page edition) Optimise render-time link expansion for GraphQL (all editions) Jan 28, 2025
@brucebolt brucebolt changed the title Optimise render-time link expansion for GraphQL (all editions) Optimise render-time link expansion for GraphQL (all editions, except ministers index page) Jan 28, 2025
@brucebolt brucebolt marked this pull request as ready for review January 28, 2025 11:55
@brucebolt brucebolt force-pushed the add-graphql-optimisation-index branch from b0f3e69 to e1b1955 Compare January 29, 2025 14:33
This halves the number of database queries needed to retrieve links when
doing render-time link expansion on GraphQL queries, by combining the
retrieval of link set and edition links into a single query.

For example, this reduces the queries for rendering the prime minister
page from 162 to 81.
This tests that a mixture of edition and link set links are returned
when an edition has both.
We are currently querying on `edition_id` and `link_type` in GraphQL's
`LinkedToEditionSource` dataloader.

At the moment, we are only using an index for the `edition_id` part of
the query, not the `link_type` part as well.

Adding a composite index to allow one index to handle both parts of this
query.
@brucebolt brucebolt force-pushed the reduce-graphql-queries branch from 4880fcd to 0bca135 Compare January 29, 2025 14:34

@mike3985 mike3985 left a comment

Nice one

Base automatically changed from add-graphql-optimisation-index to main January 30, 2025 09:57

@richardTowers richardTowers left a comment

Nice work!

@brucebolt brucebolt merged commit 6eb106f into main Jan 30, 2025
12 checks passed
@brucebolt brucebolt deleted the reduce-graphql-queries branch January 30, 2025 10:16