-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Handle outliers in /federation/v1/event
#12332
Handle outliers in /federation/v1/event
#12332
Conversation
93c23e2
to
156b4c6
Compare
The intention here is to avoid doing state lookups for outliers in `/_matrix/federation/v1/event`. Unfortunately that's expanded into something of a rewrite of `filter_events_for_server`, which ended up trying to do that operation in a couple of places.
156b4c6
to
8701d49
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was tricky to follow. I think the rewrite looks sane, and excluding the outliers seems to be a one line change (+377?).
WIth that said:
synapse/visibility.py
Outdated
visibility_ids = { | ||
vis_event_id | ||
for vis_event_id in ( | ||
state_ids.get((EventTypes.RoomHistoryVisibility, "")) | ||
for state_ids in event_to_state_ids.values() | ||
) | ||
if vis_event_id | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found the nested comprehensions tricky to read, but I don't think I can see a neater way to write it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. We could build the set incrementally:
visibility_ids = set()
for state_ids in event_to_state_ids.values():
vis_event_id = state_ids.get(_HISTORY_VIS_KEY)
if vis_event_id:
visibility_ids.add(vis_event_id)
... but I'm assuming (without checking) that the comprehension is faster.
Co-authored-by: David Robertson <[email protected]>
@DMRobertson: Thanks for ploughing through this. Certainly it's not easy to follow from the diff. I've tried to answer your questions, though it looks like you've already grokked it pretty well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from #12332 (comment) LGTM.
Is it worth rebasing #12191 on top of the small changes to this? Either way I can take a look at that one next.
Synapse 1.57.0 (2022-04-19) =========================== This version includes a [change](matrix-org/synapse#12209) to the way transaction IDs are managed for application services. If your deployment uses a dedicated worker for application service traffic, **it must be stopped** when the database is upgraded (which normally happens when the main process is upgraded), to ensure the change is made safely without any risk of reusing transaction IDs. See the [upgrade notes](https://github.com/matrix-org/synapse/blob/v1.57.0rc1/docs/upgrade.md#upgrading-to-v1570) for more details. No significant changes since 1.57.0rc1. Synapse 1.57.0rc1 (2022-04-12) ============================== Features -------- - Send device list changes to application services as specified by [MSC3202](matrix-org/matrix-spec-proposals#3202), using unstable prefixes. The `msc3202_transaction_extensions` experimental homeserver config option must be enabled and `org.matrix.msc3202: true` must be present in the application service registration file for device list changes to be sent. The "left" field is currently always empty. ([\#11881](matrix-org/synapse#11881)) - Optimise fetching large quantities of missing room state over federation. ([\#12040](matrix-org/synapse#12040)) - Offload the `update_client_ip` background job from the main process to the background worker, when using Redis-based replication. ([\#12251](matrix-org/synapse#12251)) - Move `update_client_ip` background job from the main process to the background worker. ([\#12252](matrix-org/synapse#12252)) - Add a module callback to react to new 3PID (email address, phone number) associations. ([\#12302](matrix-org/synapse#12302)) - Add a configuration option to remove a specific set of rooms from sync responses. ([\#12310](matrix-org/synapse#12310)) - Add a module callback to react to account data changes. ([\#12327](matrix-org/synapse#12327)) - Allow setting user admin status using the module API. Contributed by Famedly. ([\#12341](matrix-org/synapse#12341)) - Reduce overhead of restarting synchrotrons. ([\#12367](matrix-org/synapse#12367), [\#12372](matrix-org/synapse#12372)) - Update `/messages` to use historic pagination tokens if no `from` query parameter is given. ([\#12370](matrix-org/synapse#12370)) - Add a module API for reading and writing global account data. ([\#12391](matrix-org/synapse#12391)) - Support the stable `v1` endpoint for `/relations`, per [MSC2675](matrix-org/matrix-spec-proposals#2675). ([\#12403](matrix-org/synapse#12403)) - Include bundled aggregations in search results ([MSC3666](matrix-org/matrix-spec-proposals#3666)). ([\#12436](matrix-org/synapse#12436)) Bugfixes -------- - Fix a long-standing bug where updates to the server notices user profile (display name/avatar URL) in the configuration would not be applied to pre-existing rooms. Contributed by Jorge Florian. ([\#12115](matrix-org/synapse#12115)) - Fix a long-standing bug where events from ignored users were still considered for bundled aggregations. ([\#12235](matrix-org/synapse#12235), [\#12338](matrix-org/synapse#12338)) - Fix non-member state events not resolving for historical events when used in [MSC2716](matrix-org/matrix-spec-proposals#2716) `/batch_send` `state_events_at_start`. ([\#12329](matrix-org/synapse#12329)) - Fix a long-standing bug affecting URL previews that would generate a 500 response instead of a 403 if the previewed URL includes a port that isn't allowed by the relevant blacklist. ([\#12333](matrix-org/synapse#12333)) - Default to `private` room visibility rather than `public` when a client does not specify one, according to spec. ([\#12350](matrix-org/synapse#12350)) - Fix a spec compliance issue where requests to the `/publicRooms` federation API would specify `limit` as a string. ([\#12364](matrix-org/synapse#12364), [\#12410](matrix-org/synapse#12410)) - Fix a bug introduced in Synapse 1.49.0 which caused the `synapse_event_persisted_position` metric to have invalid values. ([\#12390](matrix-org/synapse#12390)) Updates to the Docker image --------------------------- - Bundle locked versions of dependencies into the Docker image. ([\#12385](matrix-org/synapse#12385), [\#12439](matrix-org/synapse#12439)) - Fix up healthcheck generation for workers docker image. ([\#12405](matrix-org/synapse#12405)) Improved Documentation ---------------------- - Clarify documentation for running SyTest against Synapse, including use of Postgres and worker mode. ([\#12271](matrix-org/synapse#12271)) - Document the behaviour of `LoggingTransaction.call_after` and `LoggingTransaction.call_on_exception` methods when transactions are retried. ([\#12315](matrix-org/synapse#12315)) - Update dead links in `check-newsfragment.sh` to point to the correct documentation URL. ([\#12331](matrix-org/synapse#12331)) - Upgrade the version of `mdbook` in CI to 0.4.17. ([\#12339](matrix-org/synapse#12339)) - Updates to the Room DAG concepts development document to clarify that we mark events as outliers because we don't have any state for them. ([\#12345](matrix-org/synapse#12345)) - Update the link to Redis pub/sub documentation in the workers documentation. ([\#12369](matrix-org/synapse#12369)) - Remove documentation for converting a legacy structured logging configuration to the new format. ([\#12392](matrix-org/synapse#12392)) Deprecations and Removals ------------------------- - Remove the unused and unstable `/aggregations` endpoint which was removed from [MSC2675](matrix-org/matrix-spec-proposals#2675). ([\#12293](matrix-org/synapse#12293)) Internal Changes ---------------- - Remove lingering unstable references to MSC2403 (knocking). ([\#12165](matrix-org/synapse#12165)) - Avoid trying to calculate the state at outlier events. ([\#12191](matrix-org/synapse#12191), [\#12316](matrix-org/synapse#12316), [\#12330](matrix-org/synapse#12330), [\#12332](matrix-org/synapse#12332), [\#12409](matrix-org/synapse#12409)) - Omit sending "offline" presence updates to application services after they are initially configured. ([\#12193](matrix-org/synapse#12193)) - Switch to using a sequence to generate AS transaction IDs. Contributed by Nick @ Beeper. If running synapse with a dedicated appservice worker, this MUST be stopped before upgrading the main process and database. ([\#12209](matrix-org/synapse#12209)) - Add missing type hints for storage. ([\#12267](matrix-org/synapse#12267)) - Add missing type definitions for scripts in docker folder. Contributed by Jorge Florian. ([\#12280](matrix-org/synapse#12280)) - Move [MSC2654](matrix-org/matrix-spec-proposals#2654) support behind an experimental configuration flag. ([\#12295](matrix-org/synapse#12295)) - Update docstrings to explain how to decipher live and historic pagination tokens. ([\#12317](matrix-org/synapse#12317)) - Add ground work for speeding up device list updates for users in large numbers of rooms. ([\#12321](matrix-org/synapse#12321)) - Fix typechecker problems exposed by signedjson 1.1.2. ([\#12326](matrix-org/synapse#12326)) - Remove the `tox` packaging job: it will be redundant once #11537 lands. ([\#12334](matrix-org/synapse#12334)) - Ignore `.envrc` for `direnv` users. ([\#12335](matrix-org/synapse#12335)) - Remove the (broadly unused, dev-only) dockerfile for pg tests. ([\#12336](matrix-org/synapse#12336)) - Remove redundant `get_success` calls in test code. ([\#12346](matrix-org/synapse#12346)) - Add type annotations for `tests/unittest.py`. ([\#12347](matrix-org/synapse#12347)) - Move single-use methods out of `TestCase`. ([\#12348](matrix-org/synapse#12348)) - Remove broken and unused development scripts. ([\#12349](matrix-org/synapse#12349), [\#12351](matrix-org/synapse#12351), [\#12355](matrix-org/synapse#12355)) - Convert `Linearizer` tests from `inlineCallbacks` to async. ([\#12353](matrix-org/synapse#12353)) - Update docstrings for `ReadWriteLock` tests. ([\#12354](matrix-org/synapse#12354)) - Refactor `Linearizer`, convert methods to async and use an async context manager. ([\#12357](matrix-org/synapse#12357)) - Fix a long-standing bug where `Linearizer`s could get stuck if a cancellation were to happen at the wrong time. ([\#12358](matrix-org/synapse#12358)) - Make `StreamToken.from_string` and `RoomStreamToken.parse` propagate cancellations instead of replacing them with `SynapseError`s. ([\#12366](matrix-org/synapse#12366)) - Add type hints to tests files. ([\#12371](matrix-org/synapse#12371)) - Allow specifying the Postgres database's port when running unit tests with Postgres. ([\#12376](matrix-org/synapse#12376)) - Remove temporary pin of signedjson<=1.1.1 that was added in Synapse 1.56.0. ([\#12379](matrix-org/synapse#12379)) - Add opentracing spans to calls to external cache. ([\#12380](matrix-org/synapse#12380)) - Lay groundwork for using `poetry` to manage Synapse's dependencies. ([\#12381](matrix-org/synapse#12381), [\#12407](matrix-org/synapse#12407), [\#12412](matrix-org/synapse#12412), [\#12418](matrix-org/synapse#12418)) - Make missing `importlib_metadata` dependency explicit. ([\#12384](matrix-org/synapse#12384), [\#12400](matrix-org/synapse#12400)) - Update type annotations for compatiblity with prometheus_client 0.14. ([\#12389](matrix-org/synapse#12389)) - Remove support for the unstable identifiers specified in [MSC3288](matrix-org/matrix-spec-proposals#3288). ([\#12398](matrix-org/synapse#12398)) - Add missing type hints to configuration classes. ([\#12402](matrix-org/synapse#12402)) - Add files used to build the Docker image used for complement testing into the Synapse repository. ([\#12404](matrix-org/synapse#12404)) - Do not include groups in the sync response when disabled. ([\#12408](matrix-org/synapse#12408)) - Improve type hints related to HTTP query parameters. ([\#12415](matrix-org/synapse#12415)) - Stop maintaining a list of lint targets. ([\#12420](matrix-org/synapse#12420)) - Make `synapse._scripts` pass type checks. ([\#12421](matrix-org/synapse#12421), [\#12422](matrix-org/synapse#12422)) - Add some type hints to datastore. ([\#12423](matrix-org/synapse#12423)) - Enable certificate checking during complement tests. ([\#12435](matrix-org/synapse#12435)) - Explicitly specify the `tls` extra for Twisted dependency. ([\#12444](matrix-org/synapse#12444))
Synapse 1.57.0 (2022-04-19) =========================== This version includes a [change](matrix-org#12209) to the way transaction IDs are managed for application services. If your deployment uses a dedicated worker for application service traffic, **it must be stopped** when the database is upgraded (which normally happens when the main process is upgraded), to ensure the change is made safely without any risk of reusing transaction IDs. See the [upgrade notes](https://github.com/matrix-org/synapse/blob/v1.57.0rc1/docs/upgrade.md#upgrading-to-v1570) for more details. No significant changes since 1.57.0rc1. Synapse 1.57.0rc1 (2022-04-12) ============================== Features -------- - Send device list changes to application services as specified by [MSC3202](matrix-org/matrix-spec-proposals#3202), using unstable prefixes. The `msc3202_transaction_extensions` experimental homeserver config option must be enabled and `org.matrix.msc3202: true` must be present in the application service registration file for device list changes to be sent. The "left" field is currently always empty. ([\matrix-org#11881](matrix-org#11881)) - Optimise fetching large quantities of missing room state over federation. ([\matrix-org#12040](matrix-org#12040)) - Offload the `update_client_ip` background job from the main process to the background worker, when using Redis-based replication. ([\matrix-org#12251](matrix-org#12251)) - Move `update_client_ip` background job from the main process to the background worker. ([\matrix-org#12252](matrix-org#12252)) - Add a module callback to react to new 3PID (email address, phone number) associations. ([\matrix-org#12302](matrix-org#12302)) - Add a configuration option to remove a specific set of rooms from sync responses. ([\matrix-org#12310](matrix-org#12310)) - Add a module callback to react to account data changes. ([\matrix-org#12327](matrix-org#12327)) - Allow setting user admin status using the module API. Contributed by Famedly. ([\matrix-org#12341](matrix-org#12341)) - Reduce overhead of restarting synchrotrons. ([\matrix-org#12367](matrix-org#12367), [\matrix-org#12372](matrix-org#12372)) - Update `/messages` to use historic pagination tokens if no `from` query parameter is given. ([\matrix-org#12370](matrix-org#12370)) - Add a module API for reading and writing global account data. ([\matrix-org#12391](matrix-org#12391)) - Support the stable `v1` endpoint for `/relations`, per [MSC2675](matrix-org/matrix-spec-proposals#2675). ([\matrix-org#12403](matrix-org#12403)) - Include bundled aggregations in search results ([MSC3666](matrix-org/matrix-spec-proposals#3666)). ([\matrix-org#12436](matrix-org#12436)) Bugfixes -------- - Fix a long-standing bug where updates to the server notices user profile (display name/avatar URL) in the configuration would not be applied to pre-existing rooms. Contributed by Jorge Florian. ([\matrix-org#12115](matrix-org#12115)) - Fix a long-standing bug where events from ignored users were still considered for bundled aggregations. ([\matrix-org#12235](matrix-org#12235), [\matrix-org#12338](matrix-org#12338)) - Fix non-member state events not resolving for historical events when used in [MSC2716](matrix-org/matrix-spec-proposals#2716) `/batch_send` `state_events_at_start`. ([\matrix-org#12329](matrix-org#12329)) - Fix a long-standing bug affecting URL previews that would generate a 500 response instead of a 403 if the previewed URL includes a port that isn't allowed by the relevant blacklist. ([\matrix-org#12333](matrix-org#12333)) - Default to `private` room visibility rather than `public` when a client does not specify one, according to spec. ([\matrix-org#12350](matrix-org#12350)) - Fix a spec compliance issue where requests to the `/publicRooms` federation API would specify `limit` as a string. ([\matrix-org#12364](matrix-org#12364), [\matrix-org#12410](matrix-org#12410)) - Fix a bug introduced in Synapse 1.49.0 which caused the `synapse_event_persisted_position` metric to have invalid values. ([\matrix-org#12390](matrix-org#12390)) Updates to the Docker image --------------------------- - Bundle locked versions of dependencies into the Docker image. ([\matrix-org#12385](matrix-org#12385), [\matrix-org#12439](matrix-org#12439)) - Fix up healthcheck generation for workers docker image. ([\matrix-org#12405](matrix-org#12405)) Improved Documentation ---------------------- - Clarify documentation for running SyTest against Synapse, including use of Postgres and worker mode. ([\matrix-org#12271](matrix-org#12271)) - Document the behaviour of `LoggingTransaction.call_after` and `LoggingTransaction.call_on_exception` methods when transactions are retried. ([\matrix-org#12315](matrix-org#12315)) - Update dead links in `check-newsfragment.sh` to point to the correct documentation URL. ([\matrix-org#12331](matrix-org#12331)) - Upgrade the version of `mdbook` in CI to 0.4.17. ([\matrix-org#12339](matrix-org#12339)) - Updates to the Room DAG concepts development document to clarify that we mark events as outliers because we don't have any state for them. ([\matrix-org#12345](matrix-org#12345)) - Update the link to Redis pub/sub documentation in the workers documentation. ([\matrix-org#12369](matrix-org#12369)) - Remove documentation for converting a legacy structured logging configuration to the new format. ([\matrix-org#12392](matrix-org#12392)) Deprecations and Removals ------------------------- - Remove the unused and unstable `/aggregations` endpoint which was removed from [MSC2675](matrix-org/matrix-spec-proposals#2675). ([\matrix-org#12293](matrix-org#12293)) Internal Changes ---------------- - Remove lingering unstable references to MSC2403 (knocking). ([\matrix-org#12165](matrix-org#12165)) - Avoid trying to calculate the state at outlier events. ([\matrix-org#12191](matrix-org#12191), [\matrix-org#12316](matrix-org#12316), [\matrix-org#12330](matrix-org#12330), [\matrix-org#12332](matrix-org#12332), [\matrix-org#12409](matrix-org#12409)) - Omit sending "offline" presence updates to application services after they are initially configured. ([\matrix-org#12193](matrix-org#12193)) - Switch to using a sequence to generate AS transaction IDs. Contributed by Nick @ Beeper. If running synapse with a dedicated appservice worker, this MUST be stopped before upgrading the main process and database. ([\matrix-org#12209](matrix-org#12209)) - Add missing type hints for storage. ([\matrix-org#12267](matrix-org#12267)) - Add missing type definitions for scripts in docker folder. Contributed by Jorge Florian. ([\matrix-org#12280](matrix-org#12280)) - Move [MSC2654](matrix-org/matrix-spec-proposals#2654) support behind an experimental configuration flag. ([\matrix-org#12295](matrix-org#12295)) - Update docstrings to explain how to decipher live and historic pagination tokens. ([\matrix-org#12317](matrix-org#12317)) - Add ground work for speeding up device list updates for users in large numbers of rooms. ([\matrix-org#12321](matrix-org#12321)) - Fix typechecker problems exposed by signedjson 1.1.2. ([\matrix-org#12326](matrix-org#12326)) - Remove the `tox` packaging job: it will be redundant once matrix-org#11537 lands. ([\matrix-org#12334](matrix-org#12334)) - Ignore `.envrc` for `direnv` users. ([\matrix-org#12335](matrix-org#12335)) - Remove the (broadly unused, dev-only) dockerfile for pg tests. ([\matrix-org#12336](matrix-org#12336)) - Remove redundant `get_success` calls in test code. ([\matrix-org#12346](matrix-org#12346)) - Add type annotations for `tests/unittest.py`. ([\matrix-org#12347](matrix-org#12347)) - Move single-use methods out of `TestCase`. ([\matrix-org#12348](matrix-org#12348)) - Remove broken and unused development scripts. ([\matrix-org#12349](matrix-org#12349), [\matrix-org#12351](matrix-org#12351), [\matrix-org#12355](matrix-org#12355)) - Convert `Linearizer` tests from `inlineCallbacks` to async. ([\matrix-org#12353](matrix-org#12353)) - Update docstrings for `ReadWriteLock` tests. ([\matrix-org#12354](matrix-org#12354)) - Refactor `Linearizer`, convert methods to async and use an async context manager. ([\matrix-org#12357](matrix-org#12357)) - Fix a long-standing bug where `Linearizer`s could get stuck if a cancellation were to happen at the wrong time. ([\matrix-org#12358](matrix-org#12358)) - Make `StreamToken.from_string` and `RoomStreamToken.parse` propagate cancellations instead of replacing them with `SynapseError`s. ([\matrix-org#12366](matrix-org#12366)) - Add type hints to tests files. ([\matrix-org#12371](matrix-org#12371)) - Allow specifying the Postgres database's port when running unit tests with Postgres. ([\matrix-org#12376](matrix-org#12376)) - Remove temporary pin of signedjson<=1.1.1 that was added in Synapse 1.56.0. ([\matrix-org#12379](matrix-org#12379)) - Add opentracing spans to calls to external cache. ([\matrix-org#12380](matrix-org#12380)) - Lay groundwork for using `poetry` to manage Synapse's dependencies. ([\matrix-org#12381](matrix-org#12381), [\matrix-org#12407](matrix-org#12407), [\matrix-org#12412](matrix-org#12412), [\matrix-org#12418](matrix-org#12418)) - Make missing `importlib_metadata` dependency explicit. ([\matrix-org#12384](matrix-org#12384), [\matrix-org#12400](matrix-org#12400)) - Update type annotations for compatiblity with prometheus_client 0.14. ([\matrix-org#12389](matrix-org#12389)) - Remove support for the unstable identifiers specified in [MSC3288](matrix-org/matrix-spec-proposals#3288). ([\matrix-org#12398](matrix-org#12398)) - Add missing type hints to configuration classes. ([\matrix-org#12402](matrix-org#12402)) - Add files used to build the Docker image used for complement testing into the Synapse repository. ([\matrix-org#12404](matrix-org#12404)) - Do not include groups in the sync response when disabled. ([\matrix-org#12408](matrix-org#12408)) - Improve type hints related to HTTP query parameters. ([\matrix-org#12415](matrix-org#12415)) - Stop maintaining a list of lint targets. ([\matrix-org#12420](matrix-org#12420)) - Make `synapse._scripts` pass type checks. ([\matrix-org#12421](matrix-org#12421), [\matrix-org#12422](matrix-org#12422)) - Add some type hints to datastore. ([\matrix-org#12423](matrix-org#12423)) - Enable certificate checking during complement tests. ([\matrix-org#12435](matrix-org#12435)) - Explicitly specify the `tls` extra for Twisted dependency. ([\matrix-org#12444](matrix-org#12444))
The intention here is to avoid doing state lookups for outliers in
/_matrix/federation/v1/event
. Unfortunately that's expanded into something of a rewrite offilter_events_for_server
, which ended up trying to do that operation in a couple of places.First of all, note that there is a long-standing fudge here, where we return outliers via
/_matrix/federation/v1/event
to any server that is currently in the room, even though we don't know thehistory_visibility
in force at the time of the event, nor if the server was in the room at that time (which are normally prerequisites for allowing this). This has to be the case, because the federation protocol relies on being able to pullauth_events
for a given event in this manner (and we have no way to show that the server should not be able to see those events).It currently works somewhat via the back door, because
get_state_ids_for_events
returns an empty set for outliers; that equates to a missingm.room.history_visibility
event, which equates toshared
, ie "visible to any server which is now in the room".filter_events_for_server
does two state lookups: first of all it just looks up the history_visibility events. If those are all unrestricted, there is a fast path for it to bail out. Otherwise, it does a second state lookup, including (currently) both history_visibility events and membership events.Rather than apply the same special-casing for outliers twice, I've refactored the code so that, rather than having a separate fast path, I've changed the "slow" path so that it only looks up membership events for events where it has already decided that there is a restricted history visibility. That therefore side-steps the outlier issue for that second state lookup.