-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Start account and sequence numbers from 1 #6949
Conversation
Codecov Report
@@ Coverage Diff @@
## master #6949 +/- ##
==========================================
+ Coverage 61.82% 67.31% +5.48%
==========================================
Files 520 151 -369
Lines 32129 10343 -21786
==========================================
- Hits 19865 6962 -12903
+ Misses 10656 2836 -7820
+ Partials 1608 545 -1063 |
Thank you for bringing this up @aaronc. I'm still a strong supporter of the change because the handling of zero values and protobuf leads to a lot of problems in interface design. Look at this random current example of {
"height": "0",
"result": {
"type": "cosmos-sdk/BaseAccount",
"value": {
"address": "cosmos1pkptre7fdkl6gfrzlesjjvhxhlc3r4gmmk8rs6",
"account_number": "1"
}
}
} Why are I understand that this might be a bit of work in the state machine and migrations, but many people using very different technologies will have to deal with the signing interface.
I just briefly looked at the use case – makes sense. I would just not consider it a "zero value" but a "special value". It is a value that has to be treated differently then the normal numbers, but it is no extrapolation of the regular values to zero. You just need to map to any place in the uint64 range that avoids conflicts. One solution is to reserve a range for special values (like position = str.find(str2);
if (position != string::npos)
cout << "first 'needle' found at: " << int(found) << endl; where the constant |
Thanks for chiming in @webmaster128 ! So one change I would like to make with proto json is to emit default values in client facing endpoints. There is a flag we can use to enable this across the board. I think that would help with the json issue you shared above. I would also like to hear thoughts from people in the ecosystem about this change to understand if there are potential negative side effects we should know about or not. /cc @ebuchman @zmanian @alexanderbez @alessio @jackzampolin I do want to make sure the client UX is as good as possible and also make sure we are doing that wisely and with all the information. |
Do we honestly have any alternatives @aaronc? From what I understand, this is mainly important with respect to signing msgs and txs, not the client-facing APIs and data. Is that correct? Might help to quickly outline why |
Let me try: protobuf seralization is not unique (i.e. there exist an unlimited number of valid binary representations for a protobuf document). For signature verification, signer and verifier need to agree on the same serialization of a SignDoc. This is a relatively simple document with a TxBody embedded as bytes, chain ID, account number and sequence. In order to avoid the need for specifying and implementing a sophisticated protobuf normalization algorithm it is valuable to avoid the existence of empty values in SignDoc. Then the serialization rules are as simple as: 1. no extra fields and 2. fields ordered by field number. It is very likely that those rules are followed by default in common protobuf encoders. |
As I have noted in the past, writing a hand generated serializer for If we took that approach, zero assumptions need to be made about the underlying proto implementation even field ordering. So I wouldn't consider the complexity of hand-encoding |
@aaronc you have my support to take that route and close this PR then. |
Just a quick question - will a custom serialisation function cause issues for hardware signers (e.g. the Ledger)? We should check to see if this will require an app update and let the Ledger developers know if so. |
@cwgoes for one thing, the ledger will continue to use amino json for now and then be replaced with |
This sounds like the most hacky of all possible options. It is not only 10-15 lines of code. We then need a spec specific to that one document and test vectors as well. This creates almost all of the work but cannot be reused. Encoding protobuf is not rocket science, but not trivial either. If you are not willing to work around the empty value problem of protobuf, we should have a properly specified canonicalization, like canonical-proto3 without the JSON part. This then can be reused if needed. |
With all due respect, I think this is simply not true. Let's all take a deep breath and avoid knee jerk reactions. Here is some pseudocode that encodes to a buffer that has two functions if( !signDoc.body_bytes.empty() ) {
buf.WriteUVarInt64(0xA) // wire type and field number for body_bytes
buf.WriteUVarInt64(signDoc.body_bytes.length())
buf.WriteBytes(signDoc.body_bytes)
}
if( !signDoc.auth_info.empty()) {
buf.WriteUVarInt64(0x12) // wire type and field number for auth_info
buf.WriteUVarInt64(signDoc.auth_info.length())
buf.WriteBytes(signDoc.auth_info)
}
if( !signDoc.chain_id.empty()) {
buf.WriteUVarInt64(0x1a) // wire type and field number for chain_id
buf.WriteUVarInt64(signDoc.chain_id.length())
buf.WriteBytes(signDoc.chain_id)
}
if signDoc.account_number != 0 {
buf.WriteUVarInt64(0x20) // wire type and field number for account_number
buf.WriteUVarInt(signDoc.account_number)
}
if signDoc.account_sequence != 0 {
buf.WriteUVarInt64(0x28) // wire type and field number for account_sequence
buf.WriteUVarInt(signDoc.account_sequence)
} So I stand corrected 18 lines of significant code if you check for empty values for body_bytes, etc. which I think you're assuming will never be empty (so it would be 15 without those). We can argue about what is more hacky, but one could argue that it's hacky to make assumptions about all protobuf implementations encoding fields in order or changing account/sequence numbers to accomodate encoding. Let's just assess the situation and make the best decision! In order to actually change acount/sequence numbers I actually need green lights from other folks too (ping @ebuchman). By the way, I am aware that weave uses pseudocode to describe transaction signing: https://weave.iov.one/docs/weave/weave-api-spec/tx-sign-spec#signing-transactions. I don't necessarily see how this is much different and we do actually have a spec - we're basically saying encode a protobuf document in field order with no defaults. If you have a compliant implementation, just use it. If not, then hand code it... which is still true even with acc num/seq starting from 1 because encoding fields in order is another assumption, just saying... |
My primary concern is another case of do what the Go code does, which you need to reverse engineer in order to write a compliant implementation. If we want a good interface, I think it is necessary to think about the interface first, not how it is implemented in any technology. Given your code snippet above, I assume you favour the encoding rules
Am I missing something? If not, this is a relatively simple unique encoding schema that can be documented and reused. If it gets a nice name (I suggest "regencoding"), everyone can easily refer to the spec. It is only one more rule as in the avoid empty approach. It is likely you get this encoding for free in many cases. If not, you can implement it easily by either configuring an existing encoder or writing your own. In my previous comment I confused canonicalization with unique encoding, but we e.g. don't need to be able to tell if a serialization is canonical or not. Sorry for that. I'm not promoting Weave, any decision we made in Weave or the Weave documentation. Weave is unmaintained for quite some time and its original inventor did not work on it for well over a year now. I think we should aim higher in terms of accessibility by more people using more different technologies. But the first difference is that there is a complete spec above any code snippet and the code you see there only backs up what was defined before. The second difference is that the encoding is completely custom and not an existing encoding with extra rules. This makes things like numbers significatly simpler as we you can just used fixed length encodings.
If that is what we want, good. I just don't see where it is specified in ADR-0020. What we have right now is in the verification: Create a SignDoc and serialize it. Due to the simplicity of the type it is expected that this matches the serialization used by the signer. |
By the way, the simplest way to follow rule number 3. in protobufjs is to encode like this: const bytes = SignDoc.encode({
// encoder sorts fields by field number
accountNumber: accountNumber || null, // normalize 0 to unset
accountSequence: accountSequence || null, // normalize 0 to unset
authInfoBytes: authInfoBytes,
chainId: chainId,
bodyBytes: bodyBytes,
}).finish(); |
This is not what you want to have people to be forced to implement in each language, especially without a spec. I think the only viable options are either:
btw, I think account numbers do start with 1, it is just sequences. Adding yet more custom go code leads us towards the same snakepit that was go-amino. Not that it was a bad codec, but it was unportable. Ensuring portability should be the first goal (or second after correctness). |
For the record this has absolutely nothing to do with golang specifics. Gogo proto just happens to do the logical thing. In the intial version of ADR 020 I explained that 1) fields need to be serialized in order and 2) that default values need to be omitted. This is just the logical choice for attempting to canonicalize a protobuf message and it's documented here: https://github.com/regen-network/canonical-proto3. So nothing to do with golang. That documentation was linked in earlier versions but since we've tried to simplify signing to the utmost it's not even referenced, but we probably should reference it or even include the relevant parts in ADR 020. There is no way to avoid some special encoding rules for Like I said either a) you have protobuf implementation that happens to follow these rules or b) you hand encode None of this is to say that we shouldn't consider avoiding 0 but so far the biggest reason seems to be to accommodate how protobuf.js does things which I would note is also tying a spec to a language. And so far I haven't heard any feedback on potential downsides or lack thereof from anyone else here and I can't proceed without that... |
Agreed. And at that point it probably does not really matter if we have 2 or 3 encoding rules on top of protobuf. I'd inline the encoding rules because most of canonical-proto3 is not needed here.
For me it is not about better supporting one language or library but about the question whether or not we can assume serializations match by accident. protobuf.js is just one example that shows we either need to specify zero handling or work around it. |
I'm pretty strongly opposed to something that will require so much user education. This is computers! Sequences start with "0" |
Turns out account numbers start with 0 as well, you just hardly see it. This is the account of the delegator address or our validator in local simapp setup:
|
Given protobufjs/protobuf.js#1357 (comment), I think it is pretty clear that serializing default values for numbers is disallowed by the spec and a bug in the encoder. This invalidates the original motivation for this change and I think we can close this PR. |
ref: #6213
In ADR 020 (#6111) we decided to start account and sequence numbers from
1
to avoid needing to deal with differences in howprotobuf implementations encode
0
.This PR makes in-state account numbers start from
2
and uses1
as the "zero" value that is used for genesis transactions (and potentially fee grant transactions in the future https://github.com/cosmos/cosmos-sdk/pull/5768/files#r465244086)I do want to note that I have reservations about merging this. It has been noted that there could be unintended consequences to this change. For instance, we would likely need to increment all existing account numbers. So I think before we move forward with this we should really think through what all the consequences of this are and if it worth it.
Before we can merge this PR, please make sure that all the following items have been
checked off. If any of the checklist items are not applicable, please leave them but
write a little note why.
docs/
) or specification (x/<module>/spec/
)godoc
comments.Unreleased
section inCHANGELOG.md
Files changed
in the Github PR explorerCodecov Report
in the comment section below once CI passes