Finalize protobuf encodings #275

Closed
cwgoes opened this issue Sep 20, 2019 · 5 comments · Fixed by #284

cwgoes commented Sep 20, 2019

Finalize the protobuf encodings so that all encoded data in the provable store has a canonical encoding defined by the spec.

@cwgoes cwgoes added the tao Transport, authentication, & ordering layer. label Sep 20, 2019
@cwgoes cwgoes added this to the IBC 1.0.0-rc5 milestone Sep 20, 2019
@cwgoes cwgoes self-assigned this Sep 20, 2019

mossid commented Sep 20, 2019

Some opinions:

  • If a key-value element has a primitive type, it does not need to be encoded as a message (so no tags).
  • For numbers, fixed{32, 64} should be preferred (unsigned, non-varint, little-endian). Varints should be avoided.
  • bytes (ASCII) should be preferred over string (UTF-8).
  • bytes are prepended with a varint-encoded length, which is unnecessary for our use case. We should prefer to encode without a length prefix whenever possible.
  • bools are safe to use.

In conclusion, the preferred set of types is {fixed64, bytes, bool}, without tags if possible.
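
To make the preferences above concrete, here is a minimal Python sketch of what tagless encodings of these types could look like. The helper names are hypothetical and not taken from the spec or any library; the varint encoder is included only to contrast its value-dependent length with the constant 8 bytes of fixed64.

```python
import struct


def encode_fixed64(n: int) -> bytes:
    """Unsigned 64-bit little-endian payload, always 8 bytes, no tag."""
    return struct.pack("<Q", n)


def encode_varint(n: int) -> bytes:
    """Protobuf base-128 varint, for comparison: length depends on the value."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)         # last byte, continuation bit clear
            return bytes(out)


def encode_bool(b: bool) -> bytes:
    """A single byte: 0x01 for true, 0x00 for false."""
    return b"\x01" if b else b"\x00"


def encode_bytes(raw: bytes) -> bytes:
    """A lone bytes value stored as-is: no tag and no length prefix."""
    return bytes(raw)


if __name__ == "__main__":
    print(encode_fixed64(1).hex())   # 0100000000000000
    print(encode_varint(1).hex())    # 01
    print(encode_varint(300).hex())  # ac02
```

With fixed64 every stored number has the same width regardless of its value, whereas the varint width changes with the magnitude of the value.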

cwgoes commented Oct 14, 2019

"We should prefer to encode without length prefix whenever possible."

Why? Merkle proofs, for example, often won't be constant-length.

@cwgoes cwgoes mentioned this issue Oct 14, 2019

mossid commented Oct 14, 2019

Only when there is no ambiguity. For example, a simple bytes value can be encoded without a prefix when it is not inside a message (as it cannot be ambiguous), but it should be encoded with a prefix when it is inside a message.

For example:

message M {
  bytes F1 = 1;
  bytes F2 = 2;
}

F1 and F2 should be encoded with a length prefix (there are two bytes fields, and omitting the prefix would make the boundary between them ambiguous). But when the value is a single bytes value rather than a message, it does not need to be defined as a message, so the length prefix can be omitted.
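
The ambiguity can be shown with a short Python sketch. The function names are hypothetical; `encode_message` hand-rolls the standard protobuf layout for length-delimited fields (tag, varint length, payload) and, as a simplification, always emits both fields, whereas real proto3 serializers skip empty ones.

```python
def encode_varint(n: int) -> bytes:
    """Protobuf base-128 varint for non-negative integers."""
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)


def concat_unprefixed(f1: bytes, f2: bytes) -> bytes:
    """Naive encoding of M without length prefixes: field boundaries are lost."""
    return f1 + f2


def encode_message(f1: bytes, f2: bytes) -> bytes:
    """Encode M {F1 = 1; F2 = 2} with tags and length prefixes (wire type 2)."""
    out = bytearray()
    for field_number, value in ((1, f1), (2, f2)):
        out += encode_varint((field_number << 3) | 2)  # tag byte
        out += encode_varint(len(value))               # length prefix
        out += value
    return bytes(out)


if __name__ == "__main__":
    # Two different messages collapse to the same bytes without prefixes.
    print(concat_unprefixed(b"ab", b"c") == concat_unprefixed(b"a", b"bc"))  # True
    # With prefixes the two messages stay distinguishable.
    print(encode_message(b"ab", b"c").hex())  # 0a026162120163
    print(encode_message(b"a", b"bc").hex())  # 0a016112026263
```

A single standalone bytes value has no second field to be confused with, which is why the prefix can be dropped in that case only.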

cwgoes commented Oct 14, 2019

I see what you mean; in the single-bytes case I agree that we should not length prefix.

cwgoes commented Oct 15, 2019

Adding a document describing the canonical encoding to the spec.
