-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Switch the JSON byte producer from a pull to a push producer #8116
Conversation
63924b9
to
02cace6
Compare
https://twisted.readthedocs.io/en/twisted-20.3.0/core/howto/producers.html had some relevant information that was useful. In particular it refers to an example: https://github.com/twisted/twisted/blob/trunk/docs/core/examples/streaming.py |
self._paused = False | ||
|
||
# Write until there's backpressure telling us to stop. | ||
while not self._paused: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definitely looks like an infinite loop, but I don't think it is (see a previous discussion about this description here). I think when we call write
we give up control to the reactor which might call pauseProducing
before control is returned to resumeProducing
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you call write()
you go through the Request
-> HTTPChannel
-> iocpreactor.tcp.Connection
(this is HTTPChannel.transport
) -> iocpreactor.abstract.FileHandle
, which might cause producer.pauseProducing
(which goes back through the chain in the opposite direction, essentially).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does mean that if we write slower than the TCP buffer drains we will write everything in one go. I don't know whether we want to put some limit to the amount written in one reactor tick? Maybe something like 64KB (or even 512KB)? A reactor tick generally takes less than 1ms, so those values would mean maxing out at ~64MB/s (or 512MB/s).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, but FileHandle
has a write buffer of 64KB and doesn't actually pull from the buffer until a reactor tick, so we don't need to worry about it. The flow becomes: 1) write data, 2) pause producing is called if over 64kb of data has been written, 3) we exit the loop until the FileHandle
buffer has been drained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my understanding as well! 👍 This is some fun spelunking into Twisted.
I believe the commit I just pushed will fix the tests, but I'm not 100% sure the changes are "correct". |
self._paused = False | ||
|
||
# Write until there's backpressure telling us to stop. | ||
while not self._paused: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does mean that if we write slower than the TCP buffer drains we will write everything in one go. I don't know whether we want to put some limit to the amount written in one reactor tick? Maybe something like 64KB (or even 512KB)? A reactor tick generally takes less than 1ms, so those values would mean maxing out at ~64MB/s (or 512MB/s).
try: | ||
data = next(self._iterator) | ||
buffer.append(data) | ||
buffered_bytes += len(data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the code this is basically what iocpreactor.abstract.FileHandle
is doing, so I think we can just skip all this and call self._request.write(data)
for each chunk?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that the http.Request
calls toChunk
so each time we call write
it will get turned into an HTTP chunk.
The goal of this code is to avoid b"{"
being emitted from the JSONProducer
becoming the HTTP chunk of 1\r\n{\r\n
, which is a large explosion in the streamed bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh gah I hate that http.Request
gets in the way :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment to that effect please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expanded the comment above this section in 2afe301.
self._paused = False | ||
|
||
# Write until there's backpressure telling us to stop. | ||
while not self._paused: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, but FileHandle
has a write buffer of 64KB and doesn't actually pull from the buffer until a reactor tick, so we don't need to worry about it. The flow becomes: 1) write data, 2) pause producing is called if over 64kb of data has been written, 3) we exit the loop until the FileHandle
buffer has been drained.
Synapse 1.20.0rc1 (2020-09-08) ============================== Removal warning --------------- Some older clients used a [disallowed character](https://matrix.org/docs/spec/client_server/r0.6.1#post-matrix-client-r0-register-email-requesttoken) (`:`) in the `client_secret` parameter of various endpoints. The incorrect behaviour was allowed for backwards compatibility, but is now being removed from Synapse as most users have updated their client. Further context can be found at [\#6766](#6766). Features -------- - Add an endpoint to query your shared rooms with another user as an implementation of [MSC2666](matrix-org/matrix-spec-proposals#2666). ([\#7785](#7785)) - Iteratively encode JSON to avoid blocking the reactor. ([\#8013](#8013), [\#8116](#8116)) - Add support for shadow-banning users (ignoring any message send requests). ([\#8034](#8034), [\#8092](#8092), [\#8095](#8095), [\#8142](#8142), [\#8152](#8152), [\#8157](#8157), [\#8158](#8158), [\#8176](#8176)) - Use the default template file when its equivalent is not found in a custom template directory. ([\#8037](#8037), [\#8107](#8107), [\#8252](#8252)) - Add unread messages count to sync responses, as specified in [MSC2654](matrix-org/matrix-spec-proposals#2654). ([\#8059](#8059), [\#8254](#8254), [\#8270](#8270), [\#8274](#8274)) - Optimise `/federation/v1/user/devices/` API by only returning devices with encryption keys. ([\#8198](#8198)) Bugfixes -------- - Fix a memory leak by limiting the length of time that messages will be queued for a remote server that has been unreachable. ([\#7864](#7864)) - Fix `Re-starting finished log context PUT-nnnn` warning when event persistence failed. ([\#8081](#8081)) - Synapse now correctly enforces the valid characters in the `client_secret` parameter used in various endpoints. ([\#8101](#8101)) - Fix a bug introduced in v1.7.2 impacting message retention policies that would allow federated homeservers to dictate a retention period that's lower than the configured minimum allowed duration in the configuration file. ([\#8104](#8104)) - Fix a long-standing bug where invalid JSON would be accepted by Synapse. ([\#8106](#8106)) - Fix a bug introduced in Synapse v1.12.0 which could cause `/sync` requests to fail with a 404 if you had a very old outstanding room invite. ([\#8110](#8110)) - Return a proper error code when the rooms of an invalid group are requested. ([\#8129](#8129)) - Fix a bug which could cause a leaked postgres connection if synapse was set to daemonize. ([\#8131](#8131)) - Clarify the error code if a user tries to register with a numeric ID. This bug was introduced in v1.15.0. ([\#8135](#8135)) - Fix a bug where appservices with ratelimiting disabled would still be ratelimited when joining rooms. This bug was introduced in v1.19.0. ([\#8139](#8139)) - Fix logging in via OpenID Connect with a provider that uses integer user IDs. ([\#8190](#8190)) - Fix a longstanding bug where user directory updates could break when unexpected profile data was included in events. ([\#8223](#8223)) - Fix a longstanding bug where stats updates could break when unexpected profile data was included in events. ([\#8226](#8226)) - Fix slow start times for large servers by removing a table scan of the `users` table from startup code. ([\#8271](#8271)) Updates to the Docker image --------------------------- - Fix builds of the Docker image on non-x86 platforms. ([\#8144](#8144)) - Added curl for healthcheck support and readme updates for the change. Contributed by @maquis196. ([\#8147](#8147)) Improved Documentation ---------------------- - Link to matrix-synapse-rest-password-provider in the password provider documentation. ([\#8111](#8111)) - Updated documentation to note that Synapse does not follow `HTTP 308` redirects due to an upstream library not supporting them. Contributed by Ryan Cole. ([\#8120](#8120)) - Explain better what GDPR-erased means when deactivating a user. ([\#8189](#8189)) Internal Changes ---------------- - Add filter `name` to the `/users` admin API, which filters by user ID or displayname. Contributed by Awesome Technologies Innovationslabor GmbH. ([\#7377](#7377), [\#8163](#8163)) - Reduce run times of some unit tests by advancing the reactor a fewer number of times. ([\#7757](#7757)) - Don't fail `/submit_token` requests on incorrect session ID if `request_token_inhibit_3pid_errors` is turned on. ([\#7991](#7991)) - Convert various parts of the codebase to async/await. ([\#8071](#8071), [\#8072](#8072), [\#8074](#8074), [\#8075](#8075), [\#8076](#8076), [\#8087](#8087), [\#8100](#8100), [\#8119](#8119), [\#8121](#8121), [\#8133](#8133), [\#8156](#8156), [\#8162](#8162), [\#8166](#8166), [\#8168](#8168), [\#8173](#8173), [\#8191](#8191), [\#8192](#8192), [\#8193](#8193), [\#8194](#8194), [\#8195](#8195), [\#8197](#8197), [\#8199](#8199), [\#8200](#8200), [\#8201](#8201), [\#8202](#8202), [\#8207](#8207), [\#8213](#8213), [\#8214](#8214)) - Remove some unused database functions. ([\#8085](#8085)) - Add type hints to various parts of the codebase. ([\#8090](#8090), [\#8127](#8127), [\#8187](#8187), [\#8241](#8241), [\#8140](#8140), [\#8183](#8183), [\#8232](#8232), [\#8235](#8235), [\#8237](#8237), [\#8244](#8244)) - Return the previous stream token if a non-member event is a duplicate. ([\#8093](#8093), [\#8112](#8112)) - Separate `get_current_token` into two since there are two different use cases for it. ([\#8113](#8113)) - Remove `ChainedIdGenerator`. ([\#8123](#8123)) - Reduce the amount of whitespace in JSON stored and sent in responses. ([\#8124](#8124)) - Update the test federation client to handle streaming responses. ([\#8130](#8130)) - Micro-optimisations to `get_auth_chain_ids`. ([\#8132](#8132)) - Refactor `StreamIdGenerator` and `MultiWriterIdGenerator` to have the same interface. ([\#8161](#8161)) - Add functions to `MultiWriterIdGen` used by events stream. ([\#8164](#8164), [\#8179](#8179)) - Fix tests that were broken due to the merge of 1.19.1. ([\#8167](#8167)) - Make `SlavedIdTracker.advance` have the same interface as `MultiWriterIDGenerator`. ([\#8171](#8171)) - Remove unused `is_guest` parameter from, and add safeguard to, `MessageHandler.get_room_data`. ([\#8174](#8174), [\#8181](#8181)) - Standardize the mypy configuration. ([\#8175](#8175)) - Refactor some of `LoginRestServlet`'s helper methods, and move them to `AuthHandler` for easier reuse. ([\#8182](#8182)) - Fix `wait_for_stream_position` to allow multiple waiters on same stream ID. ([\#8196](#8196)) - Make `MultiWriterIDGenerator` work for streams that use negative values. ([\#8203](#8203)) - Refactor queries for device keys and cross-signatures. ([\#8204](#8204), [\#8205](#8205), [\#8222](#8222), [\#8224](#8224), [\#8225](#8225), [\#8231](#8231), [\#8233](#8233), [\#8234](#8234)) - Fix type hints for functions decorated with `@cached`. ([\#8240](#8240)) - Remove obsolete `order` field from federation send queues. ([\#8245](#8245)) - Stop sub-classing from object. ([\#8249](#8249)) - Add more logging to debug slow startup. ([\#8264](#8264)) - Do not attempt to upgrade database schema on worker processes. ([\#8266](#8266), [\#8276](#8276))
Synapse 1.20.0 (2020-09-22) =========================== No significant changes since v1.20.0rc5. Removal warning --------------- Historically, the [Synapse Admin API](https://github.com/matrix-org/synapse/tree/master/docs) has been accessible under the `/_matrix/client/api/v1/admin`, `/_matrix/client/unstable/admin`, `/_matrix/client/r0/admin` and `/_synapse/admin` prefixes. In a future release, we will be dropping support for accessing Synapse's Admin API using the `/_matrix/client/*` prefixes. This makes it easier for homeserver admins to lock down external access to the Admin API endpoints. Synapse 1.20.0rc5 (2020-09-18) ============================== In addition to the below, Synapse 1.20.0rc5 also includes the bug fix that was included in 1.19.3. Features -------- - Add flags to the `/versions` endpoint for whether new rooms default to using E2EE. ([\#8343](matrix-org/synapse#8343)) Bugfixes -------- - Fix rate limiting of federation `/send` requests. ([\#8342](matrix-org/synapse#8342)) - Fix a longstanding bug where back pagination over federation could get stuck if it failed to handle a received event. ([\#8349](matrix-org/synapse#8349)) Internal Changes ---------------- - Blacklist [MSC2753](matrix-org/matrix-spec-proposals#2753) SyTests until it is implemented. ([\#8285](matrix-org/synapse#8285)) Synapse 1.20.0rc4 (2020-09-16) ============================== Synapse 1.20.0rc4 is identical to 1.20.0rc3, with the addition of the security fix that was included in 1.19.2. Synapse 1.20.0rc3 (2020-09-11) ============================== Bugfixes -------- - Fix a bug introduced in v1.20.0rc1 where the wrong exception was raised when invalid JSON data is encountered. ([\#8291](matrix-org/synapse#8291)) Synapse 1.20.0rc2 (2020-09-09) ============================== Bugfixes -------- - Fix a bug introduced in v1.20.0rc1 causing some features related to notifications to misbehave following the implementation of unread counts. ([\#8280](matrix-org/synapse#8280)) Synapse 1.20.0rc1 (2020-09-08) ============================== Removal warning --------------- Some older clients used a [disallowed character](https://matrix.org/docs/spec/client_server/r0.6.1#post-matrix-client-r0-register-email-requesttoken) (`:`) in the `client_secret` parameter of various endpoints. The incorrect behaviour was allowed for backwards compatibility, but is now being removed from Synapse as most users have updated their client. Further context can be found at [\#6766](matrix-org/synapse#6766). Features -------- - Add an endpoint to query your shared rooms with another user as an implementation of [MSC2666](matrix-org/matrix-spec-proposals#2666). ([\#7785](matrix-org/synapse#7785)) - Iteratively encode JSON to avoid blocking the reactor. ([\#8013](matrix-org/synapse#8013), [\#8116](matrix-org/synapse#8116)) - Add support for shadow-banning users (ignoring any message send requests). ([\#8034](matrix-org/synapse#8034), [\#8092](matrix-org/synapse#8092), [\#8095](matrix-org/synapse#8095), [\#8142](matrix-org/synapse#8142), [\#8152](matrix-org/synapse#8152), [\#8157](matrix-org/synapse#8157), [\#8158](matrix-org/synapse#8158), [\#8176](matrix-org/synapse#8176)) - Use the default template file when its equivalent is not found in a custom template directory. ([\#8037](matrix-org/synapse#8037), [\#8107](matrix-org/synapse#8107), [\#8252](matrix-org/synapse#8252)) - Add unread messages count to sync responses, as specified in [MSC2654](matrix-org/matrix-spec-proposals#2654). ([\#8059](matrix-org/synapse#8059), [\#8254](matrix-org/synapse#8254), [\#8270](matrix-org/synapse#8270), [\#8274](matrix-org/synapse#8274)) - Optimise `/federation/v1/user/devices/` API by only returning devices with encryption keys. ([\#8198](matrix-org/synapse#8198)) Bugfixes -------- - Fix a memory leak by limiting the length of time that messages will be queued for a remote server that has been unreachable. ([\#7864](matrix-org/synapse#7864)) - Fix `Re-starting finished log context PUT-nnnn` warning when event persistence failed. ([\#8081](matrix-org/synapse#8081)) - Synapse now correctly enforces the valid characters in the `client_secret` parameter used in various endpoints. ([\#8101](matrix-org/synapse#8101)) - Fix a bug introduced in v1.7.2 impacting message retention policies that would allow federated homeservers to dictate a retention period that's lower than the configured minimum allowed duration in the configuration file. ([\#8104](matrix-org/synapse#8104)) - Fix a long-standing bug where invalid JSON would be accepted by Synapse. ([\#8106](matrix-org/synapse#8106)) - Fix a bug introduced in Synapse v1.12.0 which could cause `/sync` requests to fail with a 404 if you had a very old outstanding room invite. ([\#8110](matrix-org/synapse#8110)) - Return a proper error code when the rooms of an invalid group are requested. ([\#8129](matrix-org/synapse#8129)) - Fix a bug which could cause a leaked postgres connection if synapse was set to daemonize. ([\#8131](matrix-org/synapse#8131)) - Clarify the error code if a user tries to register with a numeric ID. This bug was introduced in v1.15.0. ([\#8135](matrix-org/synapse#8135)) - Fix a bug where appservices with ratelimiting disabled would still be ratelimited when joining rooms. This bug was introduced in v1.19.0. ([\#8139](matrix-org/synapse#8139)) - Fix logging in via OpenID Connect with a provider that uses integer user IDs. ([\#8190](matrix-org/synapse#8190)) - Fix a longstanding bug where user directory updates could break when unexpected profile data was included in events. ([\#8223](matrix-org/synapse#8223)) - Fix a longstanding bug where stats updates could break when unexpected profile data was included in events. ([\#8226](matrix-org/synapse#8226)) - Fix slow start times for large servers by removing a table scan of the `users` table from startup code. ([\#8271](matrix-org/synapse#8271)) Updates to the Docker image --------------------------- - Fix builds of the Docker image on non-x86 platforms. ([\#8144](matrix-org/synapse#8144)) - Added curl for healthcheck support and readme updates for the change. Contributed by @maquis196. ([\#8147](matrix-org/synapse#8147)) Improved Documentation ---------------------- - Link to matrix-synapse-rest-password-provider in the password provider documentation. ([\#8111](matrix-org/synapse#8111)) - Updated documentation to note that Synapse does not follow `HTTP 308` redirects due to an upstream library not supporting them. Contributed by Ryan Cole. ([\#8120](matrix-org/synapse#8120)) - Explain better what GDPR-erased means when deactivating a user. ([\#8189](matrix-org/synapse#8189)) Internal Changes ---------------- - Add filter `name` to the `/users` admin API, which filters by user ID or displayname. Contributed by Awesome Technologies Innovationslabor GmbH. ([\#7377](matrix-org/synapse#7377), [\#8163](matrix-org/synapse#8163)) - Reduce run times of some unit tests by advancing the reactor a fewer number of times. ([\#7757](matrix-org/synapse#7757)) - Don't fail `/submit_token` requests on incorrect session ID if `request_token_inhibit_3pid_errors` is turned on. ([\#7991](matrix-org/synapse#7991)) - Convert various parts of the codebase to async/await. ([\#8071](matrix-org/synapse#8071), [\#8072](matrix-org/synapse#8072), [\#8074](matrix-org/synapse#8074), [\#8075](matrix-org/synapse#8075), [\#8076](matrix-org/synapse#8076), [\#8087](matrix-org/synapse#8087), [\#8100](matrix-org/synapse#8100), [\#8119](matrix-org/synapse#8119), [\#8121](matrix-org/synapse#8121), [\#8133](matrix-org/synapse#8133), [\#8156](matrix-org/synapse#8156), [\#8162](matrix-org/synapse#8162), [\#8166](matrix-org/synapse#8166), [\#8168](matrix-org/synapse#8168), [\#8173](matrix-org/synapse#8173), [\#8191](matrix-org/synapse#8191), [\#8192](matrix-org/synapse#8192), [\#8193](matrix-org/synapse#8193), [\#8194](matrix-org/synapse#8194), [\#8195](matrix-org/synapse#8195), [\#8197](matrix-org/synapse#8197), [\#8199](matrix-org/synapse#8199), [\#8200](matrix-org/synapse#8200), [\#8201](matrix-org/synapse#8201), [\#8202](matrix-org/synapse#8202), [\#8207](matrix-org/synapse#8207), [\#8213](matrix-org/synapse#8213), [\#8214](matrix-org/synapse#8214)) - Remove some unused database functions. ([\#8085](matrix-org/synapse#8085)) - Add type hints to various parts of the codebase. ([\#8090](matrix-org/synapse#8090), [\#8127](matrix-org/synapse#8127), [\#8187](matrix-org/synapse#8187), [\#8241](matrix-org/synapse#8241), [\#8140](matrix-org/synapse#8140), [\#8183](matrix-org/synapse#8183), [\#8232](matrix-org/synapse#8232), [\#8235](matrix-org/synapse#8235), [\#8237](matrix-org/synapse#8237), [\#8244](matrix-org/synapse#8244)) - Return the previous stream token if a non-member event is a duplicate. ([\#8093](matrix-org/synapse#8093), [\#8112](matrix-org/synapse#8112)) - Separate `get_current_token` into two since there are two different use cases for it. ([\#8113](matrix-org/synapse#8113)) - Remove `ChainedIdGenerator`. ([\#8123](matrix-org/synapse#8123)) - Reduce the amount of whitespace in JSON stored and sent in responses. ([\#8124](matrix-org/synapse#8124)) - Update the test federation client to handle streaming responses. ([\#8130](matrix-org/synapse#8130)) - Micro-optimisations to `get_auth_chain_ids`. ([\#8132](matrix-org/synapse#8132)) - Refactor `StreamIdGenerator` and `MultiWriterIdGenerator` to have the same interface. ([\#8161](matrix-org/synapse#8161)) - Add functions to `MultiWriterIdGen` used by events stream. ([\#8164](matrix-org/synapse#8164), [\#8179](matrix-org/synapse#8179)) - Fix tests that were broken due to the merge of 1.19.1. ([\#8167](matrix-org/synapse#8167)) - Make `SlavedIdTracker.advance` have the same interface as `MultiWriterIDGenerator`. ([\#8171](matrix-org/synapse#8171)) - Remove unused `is_guest` parameter from, and add safeguard to, `MessageHandler.get_room_data`. ([\#8174](matrix-org/synapse#8174), [\#8181](matrix-org/synapse#8181)) - Standardize the mypy configuration. ([\#8175](matrix-org/synapse#8175)) - Refactor some of `LoginRestServlet`'s helper methods, and move them to `AuthHandler` for easier reuse. ([\#8182](matrix-org/synapse#8182)) - Fix `wait_for_stream_position` to allow multiple waiters on same stream ID. ([\#8196](matrix-org/synapse#8196)) - Make `MultiWriterIDGenerator` work for streams that use negative values. ([\#8203](matrix-org/synapse#8203)) - Refactor queries for device keys and cross-signatures. ([\#8204](matrix-org/synapse#8204), [\#8205](matrix-org/synapse#8205), [\#8222](matrix-org/synapse#8222), [\#8224](matrix-org/synapse#8224), [\#8225](matrix-org/synapse#8225), [\#8231](matrix-org/synapse#8231), [\#8233](matrix-org/synapse#8233), [\#8234](matrix-org/synapse#8234)) - Fix type hints for functions decorated with `@cached`. ([\#8240](matrix-org/synapse#8240)) - Remove obsolete `order` field from federation send queues. ([\#8245](matrix-org/synapse#8245)) - Stop sub-classing from object. ([\#8249](matrix-org/synapse#8249)) - Add more logging to debug slow startup. ([\#8264](matrix-org/synapse#8264)) - Do not attempt to upgrade database schema on worker processes. ([\#8266](matrix-org/synapse#8266), [\#8276](matrix-org/synapse#8276))
* commit 'c9c544cda': Remove `ChainedIdGenerator`. (#8123) Switch the JSON byte producer from a pull to a push producer. (#8116) Updated docs: Added note about missing 308 redirect support. (#8120) Be stricter about JSON that is accepted by Synapse (#8106) Convert runWithConnection to async. (#8121) Remove the unused inlineCallbacks code-paths in the caching code (#8119) Separate `get_current_token` into two. (#8113) Convert events worker database to async/await. (#8071) Add a link to the matrix-synapse-rest-password-provider. (#8111)
This appears to be a follow-on to #8013. |
Per some conversation with @erikjohnston and @richvdh this might be more efficient as we can hopefully fill more of the TCP buffer in the kernel before giving up control of the reactor.
The
_ByteProducer
will create HTTP chunks of 1 KB, but now it should continue writing chunks until told to stop.