MSC4171 Omit service members from room summary #17866

Closed
wants to merge 23 commits into from
Changes from 13 commits
1 change: 1 addition & 0 deletions changelog.d/17866.feature
@@ -0,0 +1 @@
Add support for filtering out "service members" from room summary responses, as described in MSC4171.
2 changes: 2 additions & 0 deletions synapse/api/constants.py
@@ -135,6 +135,8 @@ class EventTypes:

PollStart: Final = "m.poll.start"

MSC4171FunctionalMembers: Final = "io.element.functional_members"


class ToDeviceEventTypes:
RoomKeyRequest: Final = "m.room_key_request"
3 changes: 3 additions & 0 deletions synapse/config/experimental.py
@@ -448,5 +448,8 @@ def read_config(self, config: JsonDict, **kwargs: Any) -> None:
# MSC4151: Report room API (Client-Server API)
self.msc4151_enabled: bool = experimental.get("msc4151_enabled", False)

# MSC4171: Service members
self.msc4171_enabled: bool = experimental.get("msc4171_enabled", False)

# MSC4210: Remove legacy mentions
self.msc4210_enabled: bool = experimental.get("msc4210_enabled", False)
38 changes: 34 additions & 4 deletions synapse/storage/databases/main/roommember.py
@@ -306,7 +306,7 @@ async def get_room_summary(self, room_id: str) -> Mapping[str, MemberSummary]:
"""

def _get_room_summary_txn(
txn: LoggingTransaction,
txn: LoggingTransaction, exclude_members: List[str]
) -> Dict[str, MemberSummary]:
# first get counts.
# We do this all in one transaction to keep the cache small.
@@ -318,6 +318,10 @@ def _get_room_summary_txn(
for membership, count in counts.items():
res.setdefault(membership, MemberSummary([], count))

exclude_users_clause, args = make_in_list_sql_clause(
self.database_engine, "state_key", exclude_members, negative=True
)
Comment on lines +324 to +326
Contributor

We should be careful about there being too many exclude_members, and bail out in that case to prevent DoS.

Alternatively, we could just fetch N extra members, like we do in case one of them is the calling user, and do the exclusion outside of the SQL. But we should probably have a limit on that as well. Perhaps this is something to clarify in the spec, and then we can bail if the list is longer than the specced maximum. Perhaps we just rely on the max length of an event?
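
The second option above (over-fetch, then exclude outside the SQL) could look roughly like the following. This is only a sketch using the standard library `sqlite3` module rather than Synapse's storage layer; the `members` table, the `MAX_EXCLUDED` cap, and the function name `first_members_excluding` are all hypothetical.

```python
import sqlite3
from typing import List, Sequence

MAX_EXCLUDED = 100  # hypothetical cap; the MSC does not define one


def first_members_excluding(
    conn: sqlite3.Connection, room_id: str, exclude: Sequence[str], limit: int
) -> List[str]:
    """Return the first `limit` members of a room, skipping excluded users."""
    if len(exclude) > MAX_EXCLUDED:
        # Bail out rather than over-fetch an unbounded number of rows.
        exclude = []
    excluded = set(exclude)
    # Fetch enough extra rows to cover the case where every excluded user
    # appears among the first members.
    rows = conn.execute(
        "SELECT state_key FROM members WHERE room_id = ? "
        "ORDER BY stream_ordering ASC LIMIT ?",
        (room_id, limit + len(excluded)),
    ).fetchall()
    kept = [user_id for (user_id,) in rows if user_id not in excluded]
    return kept[:limit]
```

The number of bound SQL parameters stays constant here, at the cost of fetching up to `len(exclude)` extra rows.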

Member Author

I was going to rely on the max length of an event, which I think, even if you had a long list of very short user IDs, would still only be up to about 9k entries or so (once you've included the usual event padding).

I'm not quite sure of the performance cost here, but I'd assume that a 9k string list filter in Postgres isn't terrible, as it's not going to impact IO.

Alternatively, we could do as you say and set a sensible max number of users in the spec (say 100 or so). I'm generally a bit allergic to limitations in the spec, as someone's probably going to come up with a use case for 101 members; however, it might be justified in the case of performance.

Member Author

Oh, and get_room_summary has a big fat cache on it, so it's probably okay to do a slightly more expensive call here. Admittedly this does impact the hot path of sync, but I think the operation of pulling out excluded users is fairly fast.

Contributor

@MadLittleMods, Oct 31, 2024

We tend to limit make_in_list_sql_clause(...) to at most 1000 items (see usages of batch_iter(..., 1000)), but that's a bit tough to do here with the negative=True condition.

We should at least add a comment here that we are assuming there should be no more than 9.3k members to exclude, based on the max length of an event (65535 bytes).

  • There is nothing stopping someone from just using a single-letter string over and over in the list, so there could practically be ~65k entries in the list.
  • With actual MXIDs: 16.4k = 65535 / 4 (@m:h) maximally
  • In the most likely maximal scenario with public federation: 10.9k = 65535 / 6 (@m:h.io)
  • With enough unique combos in localpart: 9.3k = 65535 / 7 (@mm:h.io)
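
The bounds in the list above come from dividing the maximum event size by the shortest possible entry. A quick check, ignoring JSON quotes, commas, and the other event fields (so these are upper bounds, not exact counts), is sketched below. Counting the bytes of each example string directly gives somewhat lower figures for the last two entries than the divisors 6 and 7 quoted above, since `@m:h.io` is 7 bytes and `@mm:h.io` is 8; the order of magnitude is the same either way.

```python
MAX_EVENT_BYTES = 65535  # maximum event size quoted in the comment above


def max_entries(entry: str) -> int:
    """Upper bound on how many copies of `entry` fit in one event."""
    return MAX_EVENT_BYTES // len(entry.encode("utf-8"))


bounds = {
    entry: max_entries(entry) for entry in ("a", "@m:h", "@m:h.io", "@mm:h.io")
}
print(bounds)
```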

We could check for valid MXIDs and add them to a set to deduplicate, but I don't think it's worth the extra computation.

Perhaps we should just have a practical limit and re-evaluate if someone hits it. Have a 1k check and set exclude_members = [] in that case with a log warning (a warning instead of an assert because we don't want to break the whole /sync response). This can always be increased in the future when someone has a practical use case, but it avoids rooms whose only goal is performance abuse.
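
A minimal sketch of the practical limit suggested above, assuming a standalone helper. The name `cap_exclude_members` and the constant `MAX_EXCLUDED_MEMBERS` are hypothetical and not part of this PR.

```python
import logging
from typing import List

logger = logging.getLogger(__name__)

# Hypothetical practical limit, matching the "1k check" suggested above.
MAX_EXCLUDED_MEMBERS = 1000


def cap_exclude_members(room_id: str, exclude_members: List[str]) -> List[str]:
    """Ignore the exclusion list entirely if it is implausibly long."""
    if len(exclude_members) > MAX_EXCLUDED_MEMBERS:
        # Warn instead of asserting so that the /sync response still succeeds.
        logger.warning(
            "Ignoring %d service members in %s (limit is %d)",
            len(exclude_members),
            room_id,
            MAX_EXCLUDED_MEMBERS,
        )
        return []
    return exclude_members
```

Falling back to an empty list means an oversized event behaves as if the room had no service members, which degrades the room name but not the response.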


# Order by membership (joins -> invites -> leave (former insiders) ->
# everything else (outsiders like bans/knocks), then by `stream_ordering` so
# the first members in the room show up first and to make the sort stable
@@ -330,16 +334,18 @@ def _get_room_summary_txn(
FROM current_state_events
WHERE type = 'm.room.member' AND room_id = ?
AND membership IS NOT NULL
AND %s
ORDER BY
CASE membership WHEN ? THEN 1 WHEN ? THEN 2 WHEN ? THEN 3 ELSE 4 END ASC,
event_stream_ordering ASC
LIMIT ?
"""
""" % (exclude_users_clause)
Member

Please use f-strings, or .format(..), which is more modern than %.

The old % is a bit of a footgun: it's meant to take a collection like a tuple, but you've actually just passed it a plain string (since you forgot a comma), which is valid as a string is a collection of strings. This does work, as Python does magic to detect the case, but it's a bit ugh.
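
The difference can be seen in isolation with plain Python. The `clause` string below is a made-up stand-in for the generated SQL fragment, not the actual output of make_in_list_sql_clause.

```python
clause = "state_key NOT IN (?, ?)"  # illustrative placeholder fragment

# Parentheses without a trailing comma do not create a tuple.
assert (clause) == clause
assert isinstance((clause,), tuple)

# Both spellings happen to produce the same text for a single string...
percent_plain = "AND %s" % (clause)
percent_tuple = "AND %s" % (clause,)

# ...but the plain form breaks as soon as the value is itself a tuple.
value = ("a", "b")
try:
    "AND %s" % value
    broke = False
except TypeError:
    broke = True

# An f-string avoids the ambiguity altogether.
f_string = f"AND {clause}"
```

Only the clause containing `?` placeholders is interpolated here; the user IDs themselves would still be passed as query parameters.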


txn.execute(
sql,
(
room_id,
*args,
# Sort order
Membership.JOIN,
Membership.INVITE,
@@ -357,8 +363,31 @@ def _get_room_summary_txn(

return res

functional_members_event_id = await self.db_pool.simple_select_one_onecol(
table="current_state_events",
keyvalues={
"room_id": room_id,
"type": EventTypes.MSC4171FunctionalMembers,
"state_key": "",
},
retcol="event_id",
allow_none=True,
)

exclude_members = []
if functional_members_event_id:
functional_members_event = await self.get_event(functional_members_event_id)
functional_members_data = functional_members_event.content.get(
"service_members"
)
# ONLY use this value if this looks like a valid list of strings. Otherwise, ignore.
if isinstance(functional_members_data, list) and all(
isinstance(item, str) for item in functional_members_data
):
exclude_members = functional_members_data
Contributor

Seems a bit dubious to change the behavior of /sync (m.heroes) without it being under an unstable endpoint or key in the response. It is under a feature flag (msc4171_enabled) 🤔

Member Author

Yeah, unsure how best to do this one. I felt it would be better for the homeserver to uniformly provide the same result across all users / endpoints based on the configuration, rather than adding query parameters to each endpoint and filtering. It's a reasonable request though; we could do it that way.

Member Author

Of course, I didn't manage to add a feature flag check here in the original PR. Have updated.

Contributor

@MadLittleMods, Oct 31, 2024

It would probably be prudent to separate this behavior, either via a new key in the /sync response such as io.element.msc4171.heroes, or conditionally based on a new query parameter (any good prior art?)

New endpoint(s) are also a valid option, but not worth how cumbersome that would be.

Member Author

A query parameter would probably be best, I think. I'll just need to let the EX crew know that this needs adding.

Member Author

Thanks for the response. I do think we need some review on the MSC as it's a naturally ripe-for-abuse feature. So from my view, this feature primarily helps with DM rooms where no room name nor alias is set and you will have around 3 participants (Alice, Bob, and a bot).

> Deprioritise service users to the end of the join list (this doesn't need to be behind a flag)

We can do this, but unfortunately it really only helps with larger groups that have no other metadata to gather a name from.

> Have clients remove service users from heroes manually

We can do this, but it does present a couple of problems. First, it means we need to fetch state as part of the sync loop which either means a small delay or a flicker as room names adjust. I am unsure how palatable that will be, but it's an option.

> Otherwise, I'm happy adding a new query param flag to enable filtering of m.heroes.

I think this would be my preference. This avoids changing behaviors for existing clients, and gives clients the opportunity to opt out. It also prevents the flicker problem noted above which is my primary concern with the client approach. I'll proceed with this.

Member

> We can do this, but it does present a couple of problems. First, it means we need to fetch state as part of the sync loop which either means a small delay or a flicker as room names adjust. I am unsure how palatable that will be, but it's an option.

I don't really follow this? Clients using sync v2 will get the service state event alongside the heroes, and for sliding sync the client can add it to the list of requested state events.

We can also server side annotate the hero membership events in sync v2 to indicate that they are service users (and do something similar in sliding sync), but given the client already has that information it feels a bit redundant?

Member Author

I thought the point of m.heroes was to calculate a room name when you don't have the full state locally? Admittedly I didn't know that you could ask for specific state events to appear alongside. That seems like a more sensible approach to me.

Member

Ah right, we always send down all the state to the clients, except for membership events if they have enabled "lazy-loading" for sync v2. In sliding sync the client can specify what state event types it wants the server to return (and gives an option for lazy loading members).
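
As a rough illustration of the sliding sync case, a client could ask for the functional members event alongside lazily loaded members. The request shape below is an assumption based on the sliding sync proposals (MSC3575 / MSC4186), and the list name `all_rooms` and the range are made up. It is a sketch, not a verified request.

```python
import json

FUNCTIONAL_MEMBERS = "io.element.functional_members"

# Assumed request body shape; check the sliding sync MSC before relying on it.
request_body = {
    "lists": {
        "all_rooms": {  # hypothetical list name
            "ranges": [[0, 19]],
            "timeline_limit": 1,
            "required_state": [
                # [event type, state key] pairs the server should return.
                [FUNCTIONAL_MEMBERS, ""],
                # Assumed lazy-loading marker for membership events.
                ["m.room.member", "$LAZY"],
            ],
        }
    }
}

print(json.dumps(request_body, indent=2))
```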

Member Author

Okay, that makes things much easier. I think the optimal path here is probably to scrap this PR and have the clients pull the correct state and filter then.
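
The client-side approach settled on here could look something like the sketch below, which applies the same "list of strings or ignore" validation as the server code in this PR. The function names are hypothetical, and a real client would take `heroes` and the event content from its own sync state.

```python
from typing import Any, List, Mapping


def service_members_from_content(content: Mapping[str, Any]) -> List[str]:
    """Return `service_members` only if it is a valid list of strings."""
    members = content.get("service_members")
    if isinstance(members, list) and all(isinstance(m, str) for m in members):
        return members
    # Malformed or missing: behave as if there are no service members.
    return []


def filter_heroes(heroes: List[str], content: Mapping[str, Any]) -> List[str]:
    """Drop service members from the heroes list, preserving order."""
    excluded = set(service_members_from_content(content))
    return [user_id for user_id in heroes if user_id not in excluded]
```

For a DM with a bot, `filter_heroes(["@alice:h", "@bot:h"], {"service_members": ["@bot:h"]})` leaves only `@alice:h` to build the room name from.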


return await self.db_pool.runInteraction(
"get_room_summary", _get_room_summary_txn
"get_room_summary", _get_room_summary_txn, exclude_members
)

@cached()
@@ -1754,7 +1783,8 @@ def __init__(


def extract_heroes_from_room_summary(
details: Mapping[str, MemberSummary], me: str
details: Mapping[str, MemberSummary],
me: str,
) -> List[str]:
"""Determine the users that represent a room, from the perspective of the `me` user.

50 changes: 50 additions & 0 deletions tests/storage/test_roommember.py
@@ -636,6 +636,56 @@ def test_extract_heroes_from_room_summary_first_five_joins(self) -> None:
hero_user_ids, [user1_id, user2_id, user3_id, user4_id, user5_id]
)

def test_extract_heroes_from_room_summary_exclude_service_members(self) -> None:
"""
Test that `extract_heroes_from_room_summary(...)` returns the first 5 joins who are
not mentioned in the functional members state event.
"""
user1_id = self.register_user("user1", "pass")
user1_tok = self.login(user1_id, "pass")
user2_id = self.register_user("user2", "pass")
user2_tok = self.login(user2_id, "pass")
user3_id = self.register_user("user3", "pass")
user3_tok = self.login(user3_id, "pass")
user4_id = self.register_user("user4", "pass")
user4_tok = self.login(user4_id, "pass")
user5_id = self.register_user("user5", "pass")
user5_tok = self.login(user5_id, "pass")
user6_id = self.register_user("user6", "pass")
user6_tok = self.login(user6_id, "pass")
user7_id = self.register_user("user7", "pass")
user7_tok = self.login(user7_id, "pass")

# Setup the room (user1 is the creator and is joined to the room)
room_id = self.helper.create_room_as(user1_id, tok=user1_tok)

# Exclude some users
self.helper.send_state(
room_id,
event_type=EventTypes.MSC4171FunctionalMembers,
body={"service_members": [user2_id, user3_id]},
tok=user1_tok,
)

# User2 -> User7 joins
self.helper.join(room_id, user2_id, tok=user2_tok)
self.helper.join(room_id, user3_id, tok=user3_tok)
self.helper.join(room_id, user4_id, tok=user4_tok)
self.helper.join(room_id, user5_id, tok=user5_tok)
self.helper.join(room_id, user6_id, tok=user6_tok)
self.helper.join(room_id, user7_id, tok=user7_tok)

room_membership_summary = self.get_success(self.store.get_room_summary(room_id))

hero_user_ids = extract_heroes_from_room_summary(
room_membership_summary, me="@fakuser"
)

# First 5 users to join the room, excluding service members.
self.assertListEqual(
hero_user_ids, [user1_id, user4_id, user5_id, user6_id, user7_id]
)

def test_extract_heroes_from_room_summary_membership_order(self) -> None:
"""
Test that `extract_heroes_from_room_summary(...)` prefers joins/invites over