Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MSC4171 Omit service members from room summary #17866
MSC4171 Omit service members from room summary #17866
Changes from 13 commits
ec91bea
027a86c
587b687
d9c52b2
656996a
21c4677
7e6ae4f
b13a032
b84e949
990e2ce
9567c14
0ab5dae
d5530ff
0a8d85b
1da2f61
5b8cbf4
e60d4cf
5a3cdf2
6c17ee1
7521b78
429381a
ee13e3f
2c181a7
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be careful about there being too many
exclude_members
and bailing to prevent DoS.Alternatively, we could just fetch N extra members like we do in case one of them is the calling user and do the exclusion outside of the SQL. But we should probably have a limit on that as well. Perhaps something to clarify in the spec and then we can bail if the list is longer than the specc'ed max. Perhaps we just rely on max length of an event?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to rely on the max length of an event, which I think even if you had a long list of very short userIds would still only be up to about 9k or so (once you've included the usual event padding).
I'm not quite sure on the performance cost here, but I'd assume that a 9k string list filter in postgres isn't terrible as it's not going to impact IO.
Alternatively we could do as you say and set a sensible max number of users in the spec (say 100 or so). I'm generally a bit allergic to limitations in the spec, as someones probably going to come up with a use case of 101 members however it might be justified in the case of performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, and
get_room_summary
has big fat cache on it so it's probably okay to do a slightly more expensive call here. Admittedly this does impact the hot path of sync, but I think the operation of pulling out excluded users is fairly fast.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tend to limit
make_in_list_sql_clause(...)
to at-most 1000 (see usages ofbatch_iter(..., 1000)
) but it's a bit tough to do here with thenegative=True
condition.We should at-least add a comment here that we are assuming that there should be no more than 9.3k members to exclude based on the max length of an event (65535 bytes).
@m:h
) maximally@m:h.io
)@mm:h.io
)We could have a valid MXID check and add them to a
Set
to deduplicate but I don't think it's worth the extra computation.Perhaps we should just have a practical limit to re-evaluate if someone hits it. Have a 1k check and set
exclude_members = []
in that case with a log warning (warning instead of assert because we don't want to break the whole/sync
response). This can always be increased in the future when someone has a practical use case but avoids rooms where there only goal is performance abuse.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use
f-strings
, or the more modern.format(..)
.The old
%
is a bit of a footgun, as e.g. its meant to take a collection like a tuple, but you've actually just passed it a plain string (since you forgot a comma), which is valid as a string is a collection of strings. This does work, as python does magic to detect the case, but its a bit ugh.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems a bit dubious to change the behavior of
/sync
(m.heroes
) without it being under an unstable endpoint or key in the response. It is under a feature flag (msc4171_enabled
) 🤔There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, unsure how best to do this one. I felt it would be better for the homeserver to uniformly provide the same result across all users / endpoints based on the configuration, than adding query parameters to each endpoint and filtering. It's a reasonable request though, we could do it that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, I didn't manage to add a feature flag check here in the original PR. Have updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably would be prudent to separate this behavior, either via a new key in the
/sync
response asio.element.msc4171.heroes
or conditional based on a new query parameter (any good prior art?)New endpoint(s) is also a valid option but not worth how cumbersome that would be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Query parameter would be probably best I think. I'll just need to let the EX crew know that this needs adding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the response. I do think we need some review on the MSC as it's a naturally ripe-for-abuse feature. So from my view, this feature primarily helps with DM rooms where no room name nor alias is set and you will have around 3 participants (Alice, Bob, and a bot).
We can do this, but unfortunately really only helps with larger groups with no other metadata to gather a name from.
We can do this, but it does present a couple of problems. First, it means we need to fetch state as part of the sync loop which either means a small delay or a flicker as room names adjust. I am unsure how palatable that will be, but it's an option.
I think this would be my preference. This avoids changing behaviors for existing clients, and gives clients the opportunity to opt out. It also prevents the flicker problem noted above which is my primary concern with the client approach. I'll proceed with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really follow this? Clients using sync v2 will get the service state event alongside the heroes, and for sliding sync the client can add it to the list of requested state events.
We can also server side annotate the hero membership events in sync v2 to indicate that they are service users (and do something similar in sliding sync), but given the client already has that information it feels a bit redundant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the point of m.heroes was to calculate a room name when you don't have the full state locally? Admittedly I didn't know that you could ask for specific state events to appear alongside. That seems like a more sensible approach to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, we always send down all the state to the clients, except for membership events if they have enabled "lazy-loading" for sync v2. In sliding sync the client can specify what state event types it wants the server to return (and gives an option for lazy loading members).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, that makes things much easier. I think the optimal path here is probably to scrap this PR and have the clients pull the correct state and filter then.