Perf improvement to getting auth chains #17169

Open
erikjohnston wants to merge 11 commits into develop from erikj/faster_auth_chains

Conversation

erikjohnston (Member) commented May 8, 2024

Let's see if this helps.

--

Regression introduced in #17044

Fixes: #17129

Replaces: #17154

erikjohnston force-pushed the erikj/faster_auth_chains branch from 5b3845d to a2adc9c on May 8, 2024 15:33
erikjohnston marked this pull request as ready for review on May 8, 2024 15:33
erikjohnston requested a review from a team as a code owner on May 8, 2024 15:33

csett86 commented May 9, 2024

@erikjohnston I applied this patch on 1.106.0 and unfortunately it does not really improve things. Very long get_auth_chain_difference_chains transactions are back:

[Screenshot, 2024-05-09 at 16:39:14]

erikjohnston force-pushed the erikj/faster_auth_chains branch from 46a0c20 to ca79b4d on May 13, 2024 12:10

heftig commented May 15, 2024

The old code also removed from chains_to_fetch the chains that _materialize adds to the chains. Is this behavior still covered?
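
For readers unfamiliar with the behavior being asked about, here is a minimal sketch of the idea. It is not Synapse's actual code: the function names `materialize_reachable` and `prune_chains_to_fetch`, the `(origin_seq, target_chain, target_seq)` link layout, and the pruning rule are all assumptions made for illustration. It shows a graph traversal that records which chains are reachable, followed by dropping those chains from the set still to be fetched.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical link layout: position `origin_seq` in the origin chain can
# reach position `target_seq` in `target_chain`.
Link = Tuple[int, int, int]  # (origin_seq, target_chain, target_seq)


def materialize_reachable(
    origin_chain: int,
    origin_seq: int,
    links: Dict[int, List[Link]],
    reachable: Dict[int, int],
) -> None:
    """Update `reachable` (chain ID -> highest reachable sequence number)
    with every chain reachable from (origin_chain, origin_seq)."""
    stack = [(origin_chain, origin_seq)]
    while stack:
        chain_id, seq = stack.pop()
        for link_origin_seq, target_chain, target_seq in links.get(chain_id, []):
            # A link only applies if it starts at or before our position.
            if link_origin_seq > seq:
                continue
            # Only follow the link if it reaches further than what we know.
            if reachable.get(target_chain, 0) < target_seq:
                reachable[target_chain] = target_seq
                stack.append((target_chain, target_seq))


def prune_chains_to_fetch(
    chains_to_fetch: Set[int], reachable: Dict[int, int]
) -> Set[int]:
    """Assumed pruning rule: chains already materialized need no fetch."""
    return chains_to_fetch.difference(reachable)


links = {1: [(2, 2, 5)], 2: [(3, 3, 1), (9, 4, 1)]}
reachable: Dict[int, int] = {}
materialize_reachable(1, 3, links, reachable)
remaining = prune_chains_to_fetch({2, 3, 4, 5}, reachable)
```

In this toy data, chains 2 and 3 become reachable from chain 1 at sequence number 3, so only chains 4 and 5 would remain to be fetched.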

@@ -581,7 +592,7 @@ def fetch_chain_info(events_to_fetch: Collection[str]) -> None:
# are reachable from any event.

# (We need to take a copy of `seen_chains` as the function mutates it)

This comment no longer applies.

reivilibre removed the request for review from a team on May 29, 2024 11:48

strutztm commented Sep 4, 2024

We deployed this on top of v1.114.0 yesterday and it seems to fix the issue. On the most affected server, our transaction times for get_auth_chain_difference_chains are between 10 seconds and 2 minutes instead of quickly increasing to hours.

Is that a normal range? That's about the same as we measured after rolling back from v1.106.0 to v1.104.0, but all other transaction times are in the milliseconds range.

jaywink (Member) commented Oct 3, 2024

We've been testing this in EMS. There were encouraging results for 3 hosts suffering badly from #17129 that benefited from this patch. However, further testing with more hosts was not as successful. Several hosts seem to be performing worse, in one case grinding to a halt with the RDS instance totally pinning itself on CPU. Removing the patch immediately resolved the RDS CPU being hammered.

[Screenshot: Screenshot-20241003114819-1751x529]

Host names in the linked internal issue.

Development

Successfully merging this pull request may close these issues.

Very long query times for get_auth_chain_difference_chains
5 participants