-
Notifications
You must be signed in to change notification settings - Fork 238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sliding Sync: Speed up incremental sync by avoiding extra work #17665
Sliding Sync: Speed up incremental sync by avoiding extra work #17665
Conversation
@@ -744,26 +745,75 @@ async def get_room_sync_data( | |||
# indicate to the client that a state reset happened. Perhaps we should indicate | |||
# this by setting `initial: True` and empty `required_state`. | |||
|
|||
# Check whether the room has a name set | |||
name_state_ids = await self.get_current_state_ids_at( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We now get to avoid this costly get_current_state_ids_at(...)
on incremental sync and instead look through the current state deltas to see if the room name changed (get_current_state_deltas_for_room(...)
).
Since we need to get_current_state_deltas_for_room(...)
anyway, doing it here first instead is just free lunch.
Before:
room_membership_summary: Optional[Mapping[str, MemberSummary]] = None | ||
empty_membership_summary = MemberSummary([], 0) | ||
if room_membership_for_user_at_to_token.membership in ( | ||
Membership.LEAVE, | ||
Membership.BAN, | ||
): | ||
# TODO: Figure out how to get the membership summary for left/banned rooms | ||
room_membership_summary = {} | ||
else: | ||
room_membership_summary = await self.store.get_room_summary(room_id) | ||
# TODO: Reverse/rewind back to the `to_token` | ||
# We need the room summary for: | ||
# - Always for initial syncs (or the first time we send down the room) | ||
# - When the room has no name, we need `heroes` | ||
# - When the membership has changed so we need to give updated `heroes` and | ||
# `joined_count`/`invited_count`. | ||
# | ||
# Ideally, instead of just looking at `name_changed`, we'd check if the room | ||
# name is not set but this is a good enough approximation that saves us from | ||
# having to pull out the full event. This just means, we're generating the | ||
# summary whenever the room name changes instead of only when it changes to | ||
# `None`. | ||
if initial or name_changed or membership_changed: | ||
# We can't trace the function directly because it's cached and the `@cached` | ||
# decorator doesn't mix with `@trace` yet. | ||
with start_active_span("get_room_summary"): | ||
if room_membership_for_user_at_to_token.membership in ( | ||
Membership.LEAVE, | ||
Membership.BAN, | ||
): | ||
# TODO: Figure out how to get the membership summary for left/banned rooms | ||
room_membership_summary = {} | ||
else: | ||
room_membership_summary = await self.store.get_room_summary(room_id) | ||
# TODO: Reverse/rewind back to the `to_token` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We now also conditionally generate the room_membership_summary
based on when it's needed.
This wasn't being measured before in a trace but probably some savings since that was it's own database transaction involving multiple queries.
events = await self.store.get_events( | ||
[d.event_id for d in deltas if d.event_id] | ||
state_filter.filter_state(room_state_delta_id_map).values() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra savings here because we now filter the room state to what we need before fetching the full events.
joined_count: Optional[int] = None | ||
if initial or membership_changed: | ||
assert room_membership_summary is not None | ||
joined_count = room_membership_summary.get( | ||
Membership.JOIN, empty_membership_summary | ||
).count | ||
|
||
invited_count: Optional[int] = None | ||
if initial or membership_changed: | ||
assert room_membership_summary is not None | ||
invited_count = room_membership_summary.get( | ||
Membership.INVITE, empty_membership_summary | ||
).count |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to return the joined_count
/invited_count
if they haven't changed
hero_membership_state = await self.get_current_state_at( | ||
room_id=room_id, | ||
room_membership_for_user_at_to_token=room_membership_for_user_at_to_token, | ||
state_filter=StateFilter.from_types(hero_room_state), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could subtract the membership events we already have from the delta changes but this is bound to only be one event and makes the logic more complex. Most likely, since the heroes are the first of the memberships, the new membership won't be involved anyway.
if initial or name_changed: | ||
meta_room_state.append((EventTypes.Name, "")) | ||
if initial or avatar_changed: | ||
meta_room_state.append((EventTypes.RoomAvatar, "")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We get to avoid fetching extra state if the name/avatar hasn't changed.
# heroes from the state | ||
if hero_user_ids: | ||
hero_membership_state = await self.get_current_state_at( | ||
room_id=room_id, | ||
room_membership_for_user_at_to_token=room_membership_for_user_at_to_token, | ||
state_filter=StateFilter.from_types(hero_room_state), | ||
to_token=to_token, | ||
) | ||
room_state.update(hero_membership_state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, on incremental sync, heroes
was bugged because it only showed the heroes
that were in the membership changes.
Now we get the full hero list out but only when necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
Thanks for the review and merge @erikjohnston 🐦 |
Speed up incremental sync by avoiding extra work. We first look at the state delta changes and only fetch and calculate further derived things if they have changed.
This isn't a very good trace comparison because these are just from some synthetic local tests that have too little data and run so fast that there is a lot of variation in the span durations. The main thing to notice is that the
get_current_state_ids_at(...)
chunk isn't happening the after trace.Trace before:
aded9b5d1d517d83.json
Trace after:
48dee2667089b630.json
The real-world trace before (
0672eb7bbbc4d50e.json
) illustrates the problem a little better.Dev notes
Enable Jaeger tracing in Twisted Trial tests:
Run Jaeger locally with Docker: https://www.jaegertracing.io/docs/1.6/getting-started/#all-in-one-docker-image
Add the following to
tests/unittest.py
insetUp(...)
:Add the following to
tests/rest/client/sliding_sync/test_sliding_sync.py
indefault_config(...)
Add a sleep to the end of the test you're running so the traces have a chance to flow to Jaeger before the process exits.
Run your tests.
Visit http://localhost:16686/ and choose the
test master
service and find the traces you're interesting in.Pull Request Checklist
EventStore
toEventWorkerStore
.".code blocks
.(run the linters)