Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update ADR 020 to use Any #6111

Merged
merged 17 commits into from
May 13, 2020
Merged
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
264 changes: 158 additions & 106 deletions docs/architecture/adr-020-protobuf-transaction-encoding.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
- 2020 March 06: Initial Draft
- 2020 March 12: API Updates
- 2020 April 13: Added details on interface `oneof` handling
- 2020 April 30: Switch to `Any`

## Status

Expand All @@ -24,76 +25,185 @@ With this in mind, we will tackle the migration path via two main areas, txs and
querying. However, this ADR solely focuses on transactions. Querying should be
addressed in a future ADR, but it should build off of these proposals.

Based on detailed discussions ([\#6030](https://github.com/cosmos/cosmos-sdk/issues/6030)
and [\#6078](https://github.com/cosmos/cosmos-sdk/issues/6078)), the original
design for transactions was changed substantially from an `oneof` /JSON-signing
approach to the approach described below.

## Decision

### Transactions

Since the messages that an application is known and allowed to handle are specific
to the application itself, so must the transactions be specific to the application
itself. Similar to how we described in [ADR 019](./adr-019-protobuf-state-encoding.md),
the concrete types will be defined at the application level via Protobuf `oneof`.
Since interface values are encoded with `google.protobuf.Any` in state (see [ADR 019](adr-019-protobuf-state-encoding.md)),
`sdk.Msg`s are encoding with `Any` in transactions.

One of the main goals of using `Any` to encode interface values is to have a
core set of types which is reused by apps so that
clients can safely be compatible with as many chains as possible.

It is one of the goals of this specification to provide a flexible cross-chain transaction
format that can serve a wide variety of use cases without breaking client
compatibility.

In order to facilitate signing, transactions are separated into `TxBody`,
which will be re-used by `SignDoc` below, and `signatures`:
aaronc marked this conversation as resolved.
Show resolved Hide resolved

```proto
// types/types.proto
package cosmos_sdk.v1;

message Tx {
TxBody body = 1;
repeated bytes signatures = 2;
}

message TxBody {
repeated google.protobuf.Any messages = 1;
repeated SignerInfo signer_info = 2;
aaronc marked this conversation as resolved.
Show resolved Hide resolved
Fee fee = 3;
string memo = 4;
int64 timeout_height = 5;
aaronc marked this conversation as resolved.
Show resolved Hide resolved
repeated google.protobuf.Any extension_options = 6;
Copy link
Member

@webmaster128 webmaster128 May 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about using number 100 to keep this always the last field?

}

message SignerInfo {
alexanderbez marked this conversation as resolved.
Show resolved Hide resolved
PublicKey pub_key = 1;
SignMode mode = 2;
}

enum SignMode {
SIGN_MODE_DEFAULT = 0;
SIGN_MODE_LEGACY_AMINO = -1;
SIGN_MODE_EXTENDED = 1;
aaronc marked this conversation as resolved.
Show resolved Hide resolved
}
aaronc marked this conversation as resolved.
Show resolved Hide resolved
```

As will be discussed below, in order to include as much of the `Tx` as possible
in the `SignDoc`, `SignerInfo` is separated from signatures so that only the
raw signatures themselves live outside of `TxBody`.

The application will define a single canonical `Message` Protobuf message
with a single `oneof` that implements the SDK's `Msg` interface.
Because we are aiming for be a flexible, extensible cross-chain transaction
aaronc marked this conversation as resolved.
Show resolved Hide resolved
format, new transaction processing options should be added to `TxBody` as soon
those use cases are discovered, even if they can't be implemented yet.

Example:
Because there is coordination overhead in this, `TxBody` includes an
`extension_options` field which can be used for any transaction processing
options that are not already covered. App developers should, nevertheless,
attempt to upstream important improvements to `Tx`.

```protobuf
// app/codec/codec.proto
### Signing

message Message {
option (cosmos_proto.interface_type) = "github.com/cosmos/cosmos-sdk/types.Msg";
Signatures are structured using the `SignDoc` below which reuses `TxBody` and only
adds the fields which are needed for signatures but not present on `TxBody`:

oneof sum {
cosmos_sdk.x.bank.v1.MsgSend msg_send = 1;
cosmos_sdk.x.bank.v1.MsgMultiSend msg_multi_send = 2;
cosmos_sdk.x.crisis.v1.MsgVerifyInvariant msg_verify_invariant = 3;
// ...
}
```proto
// types/types.proto
message SignDoc {
TxBody body = 1;
alexanderbez marked this conversation as resolved.
Show resolved Hide resolved
string chain_id = 2;
alexanderbez marked this conversation as resolved.
Show resolved Hide resolved
uint64 account_number = 3;
uint64 account_sequence = 4;
}
```

Because an application needs to define it's unique `Message` Protobuf message, it
will by proxy have to define a `Transaction` Protobuf message that encapsulates this
`Message` type. The `Transaction` message type must implement the SDK's `Tx` interface.
#### `SIGN_MODE_DEFAULT`

Example:
The default signing behavior is to sign the raw `TxBody` bytes as broadcast over
the wire. This has the advantages of:

```protobuf
// app/codec/codec.proto
* requiring the minimum additional client capabilities beyond a standard protocol
buffers implementation
* leaving effectively zero holes for transaction malleability (i.e. there are no
subtle differences between the signing and encoding formats which could
potentially be exploited by an attacker)

message Transaction {
cosmos_sdk.x.auth.v1.StdTxBase base = 1;
repeated Message msgs = 2;
In order to sign in the default mode, clients take the following steps:

1. Encode `TxBody`
2. Sign `SignDocRaw`

The raw encoded `TxBody` bytes are encoded into `SignDocRaw` below so that the
encoded body bytes exactly match the signed body bytes with no need for
alexanderbez marked this conversation as resolved.
Show resolved Hide resolved
["canonicalization"](https://github.com/regen-network/canonical-proto3) of that
message.
aaronc marked this conversation as resolved.
Show resolved Hide resolved

```proto
// types/types.proto
message SignDocRaw {
bytes body_bytes = 1;
string chain_id = 2;
uint64 account_number = 3;
uint64 account_sequence = 4;
}
```

Note, the `Transaction` type includes `StdTxBase` which will be defined by the SDK
and includes all the core field members that are common across all transaction types.
Developers do not have to include `StdTxBase` if they wish, so it is meant to be
used as an auxiliary type.
The only requirements are that the client _must_ encode `SignDocRaw` itself canonically.
This means that:

* all of the fields must be encoded in order
* default values (i.e. empty/zero values) must be omitted

### Signing
If a protobuf implementation does not by default encode `SignDocRaw` canonically,
the client _must_ manually encode `SignDocRaw` following the guidelines above.

Signing of a `Transaction` must be canonical across clients and binaries. In order
to provide canonical representation of a `Transaction` to sign over, clients must
obey the following rules:
Again, it does not matter if `TxBody` was encoded canonically or not.

- Encode `SignDoc` (see below) via [Protobuf's canonical JSON encoding](https://developers.google.com/protocol-buffers/docs/proto3#json).
- Default must be stripped from the output!
- JSON keys adhere to their Proto-defined field names.
- Generate canonical JSON to sign via the [JSON Canonical Form Spec](https://gibson042.github.io/canonicaljson-spec/).
- This spec should be trivial to interpret and implement in any language.
Note that in order to decode `SignDocRaw` for displaying contents to users,
the regular `SignDoc` type should be used.

```Protobuf
// app/codec/codec.proto
3. Broadcast `TxRaw`

message SignDoc {
StdSignDocBase base = 1;
repeated Message msgs = 2;
In order to make sure that the signed body bytes exactly match the encoded body
aaronc marked this conversation as resolved.
Show resolved Hide resolved
bytes, clients should encode and broadcast `TxRaw` with the same body bytes used
in `SignDocRaw`.

```proto
// types/types.proto
message TxRaw {
bytes body_bytes = 1;
repeated bytes signatures = 2;
}
```

Signature verifiers should verify signatures by decoding `TxRaw` and then
encoding `SignDocRaw` with the raw body bytes.

The standard `Tx` type (which is byte compatible with `TxRaw`) can be used to
decode transactions for all other use cases.

#### `SIGN_MODE_LEGACY_AMINO`

In order to support legacy wallets and exchanges, Amino JSON will be emporarily
supported transaction signing. Once wallets and exchanges have had a
aaronc marked this conversation as resolved.
Show resolved Hide resolved
chance to upgrade to protobuf based signing, this option will be disabled. In
the meantime, it is foreseen that disabling the current Amino signing would cause
too much breakage to be feasible.
aaronc marked this conversation as resolved.
Show resolved Hide resolved

Legacy clients will be able to sign a transaction using the current Amino
JSON format and have it encoded to protobuf using the REST `/tx/encode`
endpoint before broadcasting.
aaronc marked this conversation as resolved.
Show resolved Hide resolved

#### `SIGN_MODE_EXTENDED`

As was discussed extensively in [\#6078](https://github.com/cosmos/cosmos-sdk/issues/6078),
there is a desire for a human-readable signing encoding, especially for hardware
wallets like the [Ledger](https://www.ledger.com) which display
transaction contents to users before signing. JSON was an attempt at this but
falls short of the ideal.

`SIGN_MODE_EXTENDED` is intended as a placeholder for a human-readable
encoding which will replace Amino JSON. This new encoding should be even more
focused on readability than JSON, possibly based on formatting strings like
[MessageFormat](http://userguide.icu-project.org/formatparse/messages).

In order to ensure that the new human-readable format does not suffer from
transaction malleability issues, `SIGN_MODE_EXTENDED`
requires that the _human-readable bytes are concatenated with the raw `SignDoc`_
to generate sign bytes.
aaronc marked this conversation as resolved.
Show resolved Hide resolved

Multiple human-readable formats (maybe even localized messages) may be supported
by `SIGN_MODE_EXTENDED` when it is implemented.

### CLI & REST

Currently, the REST and CLI handlers encode and decode types and txs via Amino
Expand Down Expand Up @@ -135,66 +245,11 @@ Then, each module's client handler will at the minimum accept a `Marshaler` inst
of a concrete Amino codec and a `Generator` along with an `AccountRetriever` so
that account fields can be retrieved for signing.

#### Interface `oneof` Handling

If the module needs to work with any `sdk.Msg`s that use interface types, that
`sdk.Msg` should be implemented as an interface with getters and setters on the
module level and a no-arg constructor function should be passed around to
required CLI and REST client commands.

For example, in `x/gov`, `Content` is an interface type, so `MsgSubmitProposalI`
should also be an interface and implement setter methods:

```go
// x/gov/types/msgs.go
type MsgSubmitProposalI interface {
sdk.Msg

GetContent() Content
// SetContent returns an error if the underlying oneof does not support
// the concrete Content passed in
SetContent(Content) error

GetInitialDeposit() sdk.Coins
SetInitialDeposit(sdk.Coins)

GetProposer() sdk.AccAddress
SetProposer(sdk.AccAddress)
}
```

Note that the implementation of `MsgSubmitProposalI` can be simplified by
using an embedded base struct which implements most of that interface - in this
case `MsgSubmitProposalBase`.

A parameter `ctr func() MsgSubmitProposalI` would then be passed to CLI client
methods in order to construct a concrete instance.

## Future Improvements

Requiring application developers to have to redefine their `Message` Protobuf types
can be extremely tedious and may increase the surface area of bugs by potentially
missing one or more messages in the `oneof`.

To circumvent this, an optional strategy can be taken that has each module define
it's own `oneof` and then the application-level `Message` simply imports each module's
`oneof`. However, this requires additional tooling and the use of reflection.

Example:

```protobuf
// app/codec/codec.proto

message Message {
option (cosmos_proto.interface_type) = "github.com/cosmos/cosmos-sdk/types.Msg";

oneof sum {
bank.Msg = 1;
staking.Msg = 2;
// ...
}
}
```
A concrete implementation of `SIGN_MODE_EXTENDED` is intended as a near-term
future improvement.

## Consequences

Expand All @@ -206,11 +261,8 @@ message Message {

### Negative

- Learning curve required to understand and implement Protobuf messages.
- Less flexibility in cross-module type registration. We now need to define types
at the application-level.
- Client business logic and tx generation become a bit more complex as developers
have to define more types and implement more interfaces.
- `google.protobuf.Any` type URLs increase transaction size although the effect
may be negligible or compression may be able to mitigate it.

### Neutral

Expand Down