diff --git a/proposals/1219-storing-megolm-keys-serverside.md b/proposals/1219-storing-megolm-keys-serverside.md new file mode 100644 index 00000000000..95430fe21f8 --- /dev/null +++ b/proposals/1219-storing-megolm-keys-serverside.md @@ -0,0 +1,562 @@ +Storing megolm keys serverside +============================== + +Background +---------- + +A user who uses end-to-end encryption will usually have many inbound group session +keys. Users who log into new devices and want to read old messages will need a +convenient way to transfer the session keys from one device to another. While +users can currently export their keys from one device and import them to +another, this is involves several steps and may be cumbersome for many users. +Users can also share keys from one device to another, but this has several +limitations, such as the fact that key shares only share one key at a time, and +require another logged-in device to be active. + +To help resolve this, we *optionally* let clients store an encrypted copy of +their megolm inbound session keys on the homeserver. Clients can keep the +backup up to date, so that users will always have the keys needed to decrypt +their conversations. The backup could be used not just for new logins, but +also to support clients with limited local storage for keys (clients can store +old keys to the backup, and remove their local copy, retrieving the key from +the backup when needed). + +To recover keys from the backup, a user will need to enter a recovery key to +decrypt the backup. The backup will be encrypted using public key +cryptography, so that any of a user's devices can back up keys without needing +the user to enter the recovery key until they need to read from the backup. + +See also: + +* https://github.com/matrix-org/matrix-doc/issues/1219 +* https://github.com/vector-im/riot-web/issues/3661 +* https://github.com/vector-im/riot-web/issues/5675 +* https://docs.google.com/document/d/1MOoIA9qEKIhUQ3UmKZG-loqA8e0BzgWKKlKRUGMynVc/edit# + (old version of proposal) + +Proposal +-------- + +This proposal creates new APIs to allow clients to back up room decryption keys +on the server. Room decryption keys are encrypted (using public key crypto) +before being sent to the server along with some unencrypted metadata to allow +the server to manage the backups. If a key for a new megolm session is +uploaded, it is added to the current backup. If a key is uploaded for a megolm +session is that is already present in the backup, the server will use the +metadata to determine which version of the key is "better". The way in which +the server determines which key is "better" is described in the [Storing +Keys](#storing-keys) section. The user is given a private recovery key in +order to recover the keys from the backup in the future. + +Clients can create new key backups (sometimes also referred to in the API as +backup versions) to replace the current backup. Aside from the initial backup +creation, a client might start a new a backup when, for example, a user loses a +device and wants to ensure that that device does not get any new decryption +keys. In this case, the client will then create a new backup using a new key +that the device does not have access to. + +Once one client has created a backup, other clients can fetch the public part +of the recovery key from the server and add keys to the backup, if they trust +that the backup was not created by a malicious device. + +### Possible UX for interactive clients + +This section gives an example of how a client might handle key backups. Clients +may behave differently. + +On receipt of encryption keys (1st time): + +1. client checks if there is an existing backup: `GET /room_keys/version` + 1. if not, ask if the user wants to back up keys + 1. if yes: + 1. generate new curve25519 key pair, which will be the recovery key + 2. create new backup: `POST /room_keys/version` + 3. display private key for user to save (see below for the + [format of the recovery key](#recovery-key)) + 2. if no, exit and remember decision (user can change their mind later) + 3. while prompting, continue to poll `GET /room_keys/versions`, as + another device may have created a backup. If so, go to 1.2. + 2. if yes, either get the public part of the recovery key and check that it + is signed by the master cross-signing key, or prompt user to enter the + private part of the recovery key (which can derive the public part). + 1. User can also decide to create a new backup, in which case, go to 1.1. +2. send key to backup: `PUT /room_keys/keys/${roomId}/${sessionId}?version=$v` +3. continue backing up keys as we receive them (may receive a + `M_WRONG_ROOM_KEYS_VERSION` error if a new backup has been created: + see below) + +On `M_WRONG_ROOM_KEYS_VERSION` error when trying to `PUT` keys: + +1. get the current version +2. notify the user that there is a new backup, and display relevant information +3. confirm with user that they want to use the backup (user may want use the + backup, to stop backing up keys, or to create a new backup) +4. ensure the public part of the recovery key is signed by the user's master + key, or prompt the user to enter the private part of the recovery key + +On receipt of undecryptable message: + +1. ask user if they want to restore backup (ask whether to get individual key, + room keys, or all keys). (This can be done in the same place as asking if + the user wants to request keys from other devices.) +2. if yes, prompt for private key, and get keys: `GET /room_keys/keys` + +Users can also set up, disable, or rotate backups, or restore from backup via user +settings. + +### Recovery key + +The recovery key can be saved by the user directly, stored encrypted on the +server (using the method proposed in +[MSC1946](https://github.com/matrix-org/matrix-doc/issues/1946)), or both. If +the key is saved directly by the user, then the code is constructed as follows: + +1. The 256-bit curve25519 private key is prepended by the bytes `0x8B` and + `0x01` +2. All the bytes in the string above, including the two header bytes, are XORed together to form a parity + byte. This parity byte is appended to the byte string. +3. The byte string is encoded using base58, using the same mapping as is used + for Bitcoin addresses. + +This 58-character string is presented to the user to save. Implementations may +add whitespace to the recovery key; adding a space every 4th character is +recommended. + +When reading in a recovery key, clients must disregard whitespace. Clients +must base58-decode the code, ensure that the first two bytes of the decoded +string are `0x8B` and `0x01`, ensure that XOR-ing all the bytes together +results in 0, and ensure that the total length of the decoded string +is 35 bytes. Clients must then remove the first two bytes and the last byte, +and use the resulting string as the private key to decrypt backups. + +#### Encoding the recovery key for server-side storage via MSC1946 + +If MSC1946 is used to store the key on the server, it must be stored using the +`account_data` type `m.megolm_backup.v1`. + +As a special case, if the recovery key is the same as the curve25519 key used +for storing the key, then the contents of the `m.megolm_backup.v1` +`account_data` for that key will be an object with a `passthrough` property +whose value is `true`. For example, if `m.megolm_backup.v1` is set to: + +```json +{ + "encrypted": { + "key_id": { + "passthrough": true + } + } +} +``` + +means that the recovery key for the backup is the same as the private key for +the key with ID `key_id`. (This is mostly intended to provide a migration path +for for backups that were created using an earlier draft that stored the +recovery information in the `auth_data`.) + +### API + +#### Backup versions + +##### `POST /room_keys/version` + +Create a new backup version. + +Body parameters: + +- `algorithm` (string): Required. The algorithm used for storing backups. + Currently, only `m.megolm_backup.v1.curve25519-aes-sha2` is defined. +- `auth_data` (object): Required. algorithm-dependent data. For + `m.megolm_backup.v1.curve25519-aes-sha2`, see below for the [definition of + this property](#auth_data-backup-versions). + +Example: + +```javascript +{ + "algorithm": "m.megolm_backup.v1.curve25519-aes-sha2", + "auth_data": { + "public_key": "abcdefg", + "signatures": { + "something": { + "ed25519:something": "hijklmnop" + } + } + } +} +``` + +On success, returns a JSON object with keys: + +- `version` (string): the backup version + +##### `GET /room_keys/version/{version}` + +Get information about the given version, or the current version if `/{version}` +is omitted. + +On success, returns a JSON object with keys: + +- `algorithm` (string): Required. Same as in the body parameters for `POST + /room_keys/version`. +- `auth_data` (object): Required. Same as in the body parameters for + `POST /room_keys/version`. +- `version` (string): Required. The backup version. +- `etag` (string): Required. The etag value which is an opaque string + representing stored keys in the backup. Clients can compare it with the + `etag` value they received in the response of their last key storage request. + If not equal, another client has pushed new keys to the backup. +- `count` (number): Required. The number of keys stored in the backup. + +Error codes: + +- `M_NOT_FOUND`: No backup version has been created. (with HTTP status code 404) + +##### `PUT /room_keys/version/{version}` + +Update information about the given version. Only `auth_data` can be updated. + +Body parameters: + +- `algorithm` (string): Required. Must be the same as in the body parameters for `GET + /room_keys/version`. +- `auth_data` (object): Required. algorithm-dependent data. For + `m.megolm_backup.v1.curve25519-aes-sha2`, see below for the [definition of + this property](#auth_data-backup-versions). +- `version` (string): Optional. The backup version. If present, must be the same as the path parameter. + +Example: + +```javascript +{ + "algorithm": "m.megolm_backup.v1.curve25519-aes-sha2", + "auth_data": { + "public_key": "abcdefg", + "signatures": { + "something": { + "ed25519:something": "hijklmnop" + "ed25519:anotherthing": "abcdef" + } + } + }, + "version": "42" +} +``` + +On success, returns the empty JSON object. + +Error codes: + +- `M_NOT_FOUND`: This backup version was not found. (with HTTP status code 404) + +#### Storing keys + +##### `PUT /room_keys/keys/${roomId}/${sessionId}?version=$v` + +Store the key for the given session in the given room, using the given backup +version. + +If the server already has a backup in the backup version for the given session +and room, then it will keep the "better" one. To determine which one is +"better", keys are compared: + +- first by the `is_verified` flag (`true` is better than `false`), +- then, if `is_verified` is equal, by the `first_message_index` (a lower number is better), +- and finally, is `is_verified` and `first_message_index` are equal, by + `forwarded_count` (a lower number is better). + +If neither key is better than the other (that is, if all three fields are +equal), then the server should keep the existing key. + +Body parameters: + +- `first_message_index` (integer): Required. The index of the first message + in the session that the key can decrypt. +- `forwarded_count` (integer): Required. The number of times this key has been + forwarded. +- `is_verified` (boolean): Required. Whether the device backing up the key has + verified the device that the key is from. +- `session_data` (object): Required. Algorithm-dependent data. For + `m.megolm_backup.v1.curve25519-aes-sha2`, see below for the [definition of + this property](#auth_data-backup-versions). + +On success, returns a JSON object with keys: + +- `etag` (string): Required. The new etag value representing stored keys. See + `GET /room_keys/version/{version}` for more details. +- `count` (number): Required. The new count of keys stored in the backup. + +Error codes: + +- `M_WRONG_ROOM_KEYS_VERSION`: the version specified does not match the current + backup version (with HTTP status code 403). The current backup version will + be included in the `current_version` field of the HTTP result. + +Example: + +`PUT /room_keys/keys/!room_id:example.com/sessionid?version=1` + +```javascript +{ + "first_message_index": 1, + "forwarded_count": 0, + "is_verified": true, + "session_data": { + "ephemeral": "base64+ephemeral+key", + "ciphertext": "base64+ciphertext+of+JSON+data", + "mac": "base64+mac+of+ciphertext" + } +} +``` + +Result: + +```javascript +{ + "etag": "abcdefghi", + "count": 10 +} +``` + +##### `PUT /room_keys/keys/${roomId}?version=$v` + +Store several keys for the given room, using the given backup version. + +Behaves the same way as if the keys were added individually using `PUT +/room_keys/keys/${roomId}/${sessionId}?version=$v`. + +Body parameters: +- `sessions` (object): an object where the keys are the session IDs, and the + values are objects of the same form as the body in `PUT + /room_keys/keys/${roomId}/${sessionId}?version=$v`. + +Returns the same as `PUT +/room_keys/keys/${roomId}/${sessionId}?version=$v`. + +Example: + +`PUT /room_keys/keys/!room_id:example.com?version=1` + +```javascript +{ + "sessions": { + "sessionid": { + "first_message_index": 1, + "forwarded_count": 0, + "is_verified": true, + "session_data": { + "ephemeral": "base64+ephemeral+key", + "ciphertext": "base64+ciphertext+of+JSON+data", + "mac": "base64+mac+of+ciphertext" + } + } + } +} +``` + +Result: + +```javascript +{ + "etag": "abcdefghi", + "count": 10 +} +``` + +##### `PUT /room_keys/keys?version=$v` + +Store several keys, using the given backup version. + +Behaves the same way as if the keys were added individually using `PUT +/room_keys/keys/${roomId}/${sessionId}?version=$v`. + +Body parameters: +- `rooms` (object): an object where the keys are the room IDs, and the values + are objects of the same form as the body in `PUT + /room_keys/keys/${roomId}/?version=$v`. + +Returns the same as `PUT +/room_keys/keys/${roomId}/${sessionId}?version=$v` + +Example: + +`PUT /room_keys/keys/!room_id:example.com?version=1` + +```javascript +{ + "rooms": { + "!room_id:example.com": { + "sessions": { + "sessionid": { + "first_message_index": 1, + "forwarded_count": 0, + "is_verified": true, + "session_data": { + "ephemeral": "base64+ephemeral+key", + "ciphertext": "base64+ciphertext+of+JSON+data", + "mac": "base64+mac+of+ciphertext" + } + } + } + } + } +} +``` + +Result: + +```javascript +{ + "etag": "abcdefghi", + "count": 10 +} +``` + +#### Retrieving keys + +When retrieving keys, the `version` parameter is optional, and defaults to +retrieving keys from the latest backup version. + +##### `GET /room_keys/keys/${roomId}/${sessionId}?version=$v` + +Retrieve the key for the given session in the given room from the backup. + +On success, returns a JSON object in the same form as the request body of `PUT +/room_keys/keys/${roomId}/${sessionId}?version=$v`. + +Error codes: + +- M_NOT_FOUND: The session is not present in the backup, or the requested + backup version does not exist. (with HTTP status code 404) + +##### `GET /room_keys/keys/${roomId}?version=$v` + +Retrieve the all the keys for the given room from the backup. + +On success, returns a JSON object in the same form as the request body of `PUT +/room_keys/keys/${roomId}?version=$v`. + +If the backup version exists but no keys are found, then this endpoint returns +a successful response with body: + +``` +{ + "sessions": {} +} +``` + +Error codes: + +- `M_NOT_FOUND`: The requested backup version does not exist. (with HTTP status code 404) + +##### `GET /room_keys/keys?version=$v` + +Retrieve all the keys from the backup. + +On success, returns a JSON object in the same form as the request body of `PUT +/room_keys/keys?version=$v`. + +If the backup version exists but no keys are found, then this endpoint returns +a successful response with body: + +``` +{ + "rooms": {} +} +``` + +Error codes: + +- `M_NOT_FOUND`: The requested backup version does not exist. (with HTTP status code 404) + +#### Deleting keys + +##### `DELETE /room_keys/keys/${roomId}/${sessionId}?version=$v` +##### `DELETE /room_keys/keys/${roomId}?version=$v` +##### `DELETE /room_keys/keys/?version=$v` + +Deletes keys from the backup. + +Returns the same as `PUT +/room_keys/keys/${roomId}/${sessionId}?version=$v`. + +#### `m.megolm_backup.v1.curve25519-aes-sha2` definitions + +##### `auth_data` for backup versions + +The `auth_data` property for the backup versions endpoints for +`m.megolm_backup.v1.curve25519-aes-sha2` is a [signed +json](https://matrix.org/docs/spec/appendices#signing-json) object with the +following keys: + +- `public_key` (string): the curve25519 public key used to encrypt the backups +- `signatures` (object): signatures of the `auth_data`. + +The `auth_data` should be signed by the user's [master cross-signing +key](https://github.com/matrix-org/matrix-doc/pull/1756), and may also be +signed by the user's device key. This allows clients to ensure that the public +key is valid, and prevents an attacker from being able to change the backup to +use a public key that they have the private key for. + +##### `session_data` for key backups + +The `session_data` field in the backups is constructed as follows: + +1. Encode the session key to be backed up as a JSON object with the properties: + - `algorithm` (string): `m.megolm.v1.aes-sha2` + - `sender_key` (string): base64-encoded device curve25519 key + - `sender_claimed_keys` (object): object containing the identity keys for the + sending device + - `forwarding_curve25519_key_chain` (array): zero or more curve25519 keys + for devices who forwarded the session key + - `session_key` (string): base64-encoded (unpadded) session key in + [session-sharing + format](https://gitlab.matrix.org/matrix-org/olm/blob/master/docs/megolm.md#session-sharing-format) +2. Generate an ephemeral curve25519 key, and perform an ECDH with the ephemeral + key and the backup's public key to generate a shared secret. The public + half of the ephemeral key, encoded using base64, becomes the `ephemeral` + property of the `session_data`. +3. Using the shared secret, generate 80 bytes by performing an HKDF using + SHA-256 as the hash, with a salt of 32 bytes of 0, and with the empty string + as the info. The first 32 bytes are used as the AES key, the next 32 bytes + are used as the MAC key, and the last 16 bytes are used as the AES + initialization vector. +4. Stringify the JSON object, and encrypt it using AES-CBC-256 with PKCS#7 + padding. This encrypted data, encoded using base64, becomes the + `ciphertext` property of the `session_data`. +5. Pass the raw encrypted data (prior to base64 encoding) through HMAC-SHA-256 + using the MAC key generated above. The first 8 bytes of the resulting MAC + are base64-encoded, and become the `mac` property of the `session_data`. + +(The key HKDF, AES, and HMAC steps are the same as what are used for encryption +in olm and megolm.) + +Security Considerations +----------------------- + +An attacker who gains access to a user's account can delete or corrupt their +key backup. This proposal does not attempt to protect against that. + +An attacker who gains access to a user's account can create a new backup +version using a key that they control. For this reason, clients SHOULD confirm +with users before sending keys to a new backup version or verify that it was +created by a trusted device by checking the signature. Alternatively, if the +signature cannot be verified, the backup can be validated by prompting the user +to enter the recovery key, and confirming that the backup's public key +corresponds to the recovery key. + +Other Issues +------------ + +Since many clients will receive encryption keys at around the same time, they +will all want to back up their copies of the keys at around the same time, +which may increase load on the server if this happens in a big room. (TODO: +how much of an issue is this?) For this reason, clients should offset their +backup requests randomly. + +Conclusion +---------- + +This proposal allows users to securely and conveniently back up and restore +their decryption keys so that users logging into a new device can decrypt old +messages.