diff --git a/proposals/3202-encrypted-appservices.md b/proposals/3202-encrypted-appservices.md new file mode 100644 index 00000000000..dac11cdb333 --- /dev/null +++ b/proposals/3202-encrypted-appservices.md @@ -0,0 +1,158 @@ +# MSC3202: Encrypted Appservices + +Presently, appservices in Matrix are capable of attaching themselves to a homeserver for high-traffic +bot-like usecases, such as bridging and operationally expensive bots. Traditionally, these appservices +only work in unencrypted rooms due to not having enough context on the encryption state to actually +function properly. + +This MSC targets the missing bits to support encryption at the appservice level: other MSCs, such as +[MSC2409](https://github.com/matrix-org/matrix-doc/pull/2409) and [MSC2778](https://github.com/matrix-org/matrix-doc/pull/2778) +give appservices foundational pieces to get device IDs and to-device messages, as required by encryption. + +## Proposal + +This proposal takes inspiration from [MSC2409](https://github.com/matrix-org/matrix-doc/pull/2409) by +defining a new set of keys on the appservice `/transactions` endpoint, similar to sync: + +```json5 +{ + "events": [ + // as defined today + ], + "ephemeral": [ + // MSC2409 + ], + "to_device": [ + // MSC2409 + ], + "device_lists": { + "changed": ["@alice:example.org"], + "left": ["@bob:example.com"] + }, + "device_one_time_keys_count": { + "@_irc_bob:example.org": { + "DEVICEID": { + "curve25519": 10, + "signed_curve25519": 20 + } + } + }, + "device_unused_fallback_key_types": { + "@_irc_bob:example.org": { + "DEVICEID": ["signed_curve25519"] + } + } +} +``` + +These fields are heavily inspired by [the extensions to /sync](https://matrix.org/docs/spec/client_server/r0.6.1#id84) +in the client-server API. + +All the new fields can be omitted if there are no changes for the appservice to handle. For +`device_one_time_keys_count` and `device_unused_fallback_key_types`, the format is slightly different +from the client-server API to better map the appservice's user namespace users to the counts. Users +in the namespace without keys or which have unchanged keys since the last transaction can be omitted +(more details on this later on). Note that fallback keys are described in +[MSC2732](https://github.com/matrix-org/matrix-doc/pull/2732) as of writing. + +Like MSC2409, any user the appservice would be considered "interested" in (user in the appservice's +namespace, or sharing a room with an appservice user/namespaced room) would qualify for the device +list changes section. + +Note that it's typical for clients to pause sync loops when processing device list changes to avoid +a scenario where they are unable to decrypt/encrypt a message from/to a particular device. Appservices +are expected to mirror this by ensuring the transaction request does not complete until processing +is complete. In the worst case, the server will time out the request and retry it verbatim, so +appservices might wish to track which device list changes in which transaction they already processed +or keep processing transactions in the background while retries are attempted. + +In order to allow the appservice to masquerade as its users, an extension to the existing +[identity assertion](https://matrix.org/docs/spec/application_service/r0.1.2#identity-assertion) +ability is proposed. To compliment the (optional) `user_id` when using an `as_token` as an access +token, a similarly optional `device_id` query parameter is proposed. When provided, the server asserts +that the device ID is valid for the user, and that the appservice is able to masquerade as that user. +If valid, that device ID should be assumed as being used for that request. For many requests, this +means updating the "last seen IP" and "last seen timestamp" for the device, however for some endpoints +it means interacting with that device (such as when uploading keys). + +### Optimization: when to send OTKs/fallback keys + +As mentioned above, in order to keep the transaction byte size down the server can (and should) exclude +OTK counts and unused fallback keys when they haven't changed since the last transaction. Appservices +however should be tolerable of the server over-communicating the counts as a quick/cheap approach would +be to *always* include the OTK counts/unused fallback keys for all known users rather than trying to +detect changes. + +As a middle ground, servers might be interested in an algorithm which doesn't detect changes between +transactions but does attempt to reduce traffic. If the appservice is about to receive an event or +message typically associated with encryption, the counts for the affected users could be included. This +would result in the following rules: +* If an `m.room.encrypted` event is being included in the transaction's `events`, include OTK counts and + unused fallback key types for all appservice users which reside in that room. +* If an appservice user is receiving a to-device message in the transaction's `to_device` array, include + OTK counts and unused fallback key types for that user. + +This approach has the advantage of typically minimal changes to the internals of the homeserver, works +similar to `/sync`, and reduces noisy traffic in the transaction sending. This additionally still honours +the "when they change, send the counts" requirement to a reasonable degree: typically a use of an OTK will +be followed by a to-device message. It is theoretically possible for an appservice to run out of OTKs if +a remote user claims all OTKs without actually using them. Implementations may be interested in +[MSC2732: Fallback keys](https://github.com/matrix-org/matrix-doc/pull/2732) which will avoid a scenario +where the appservice can no longer decrypt messages. + +However, as mentioned, servers are free to include this information as little or often as they'd like, +provided they send it at least as often as when it changes. + +### Optimization: Don't encrypt as often + +Appservices theoretically do not need to establish Olm sessions with other appservice users as the appservice +will typically be managing the devices in one place. In short, this means that a room with 10k appservice +users and only 1 non-appservice user can be sped up by only encrypting from the appservice's users to the +non-appservice user. The appservice would not need to set up 10k * 10k Olm sessions given the encryption +and decryption all happens in the same place. As an added bonus, this improves performance of the appservice +as it doesn't have to handle to-device messages sent to itself. + +Some implementations might not be able to support this sort of optimization though, so it is still permitted +to establish sessions and such between appservice users. + +## Potential issues + +Servers would have to track and send this information to appservices, however this is still perceived +to be more performant than appservices using potentionally thousands of `/sync` streams. + +Appservices additionally cannot opt-in (or out) of this functionality unlike with MSC2409. It is +expected that servers will optimize for not including/calculating the fields if the appservice has +no interest in the information. Specifically, appservices which don't have any keys under their user +namespace can be assumed to not need device list changes and thus can be optimized out. + +## Alternatives + +An endpoint for appservices to poll could work, though this is extra work for the appservice and would +likely need pagination and such, which is all heavyweight for the server. Instead, having the server +batch up updates and send them to the appservice is likely faster. + +## Security considerations + +None relevant - this is the same information the appservice would get if it spawned `/sync` streams for +all the users in its namespace. + +## Unstable prefix + +While this MSC is not considered stable for implementation, implementations should use `org.matrix.msc3202.` +as a prefix to the fields on the `/transactions` endpoint. For example: +* `device_lists` becomes `org.matrix.msc3202.device_lists` +* `device_one_time_keys_count` becomes `org.matrix.msc3202.device_one_time_keys_count` +* `device_unused_fallback_key_types` becomes `org.matrix.msc3202.device_unused_fallback_key_types` + +Appservices which support encryption but never see these fields (ie: server is not implementing this in an +unstable capacity) should be fine, though encryption might not function properly for them. It is the +responsibility of the appservice to try and recover safely and sanely, if desired, when the server is not +implementing this in an unstable capacity. This is not a concern once the MSC becomes stable in a released +version of the specification, as servers will be required to implement it. + +For servers wishing to force appservices to opt-in to this behaviour, they may use `org.matrix.msc3202: true` +in the registration file. Servers will be able to check for "opt-in" behaviour once this MSC is stable by +seeing whether or not the appservice has an encryption-capable device recorded in its users namespaces. + +To use device ID masquerading, implementations should use `org.matrix.msc3202.device_id` instead of `device_id` +in the query string while this MSC is considered unstable.