Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to tm version 23.0. #1927

Merged
merged 18 commits into from
Aug 12, 2018
Merged

Upgrade to tm version 23.0. #1927

merged 18 commits into from
Aug 12, 2018

Conversation

ValarDragon
Copy link
Contributor

@ValarDragon ValarDragon commented Aug 6, 2018

This is Ready for review! This upgrade meant we had to maintain a map from addr to pubkey. This also removes crypto.Signature, and changes abci time from int64 to time.Time.

  • Updated all relevant documentation (docs/)
  • Updated all relevant code comments - (I think so?)
  • Wrote tests - n/a?
  • Added entries in PENDING.md that include links to the relevant issue or PR that most accurately describes the change.

For Admin Use:

  • Added appropriate labels to PR (ex. wip, ready-for-review, docs)
  • Reviewers Assigned
  • Squashed all commits, uses message "Merge pull request #XYZ: [title]" (coding standards)

@ValarDragon
Copy link
Contributor Author

There are the hanging issues, which I have no idea whats causing them. Then theres also these two issues:

--- FAIL: TestUnbondingPeriod (0.23s)
	Error Trace:	handler_test.go:528
	Error:      	Should be true
	Test:       	TestUnbondingPeriod
	Messages:   	expected an error
--- FAIL: TestRedelegationPeriod (0.24s)
	Error Trace:	handler_test.go:587
	Error:      	Should be true
	Test:       	TestRedelegationPeriod
	Messages:   	expected an error

Those two are probably caused by the time change.

@ValarDragon
Copy link
Contributor Author

ValarDragon commented Aug 7, 2018

Hanging has now been fixed, its just halting now.

Found the cause of the halting. In slashing/tick.go we have a handleValidatorSignature which takes in the public key. However we changed the Signing Validator struct to just return the address. The application is meant to maintain an address to pubkey mapping. So I guess we need to do this in slashing. (I am confused about what the merits of maintaining this map are though / but thats a separate discussion: tendermint/tendermint#2177 (comment))

/cc @cwgoes @rigelrozanski

@ebuchman ebuchman mentioned this pull request Aug 8, 2018
6 tasks
@ValarDragon ValarDragon changed the title WIP: Upgrade to tm version 23.0. Upgrade to tm version 23.0. Aug 8, 2018
@ValarDragon ValarDragon force-pushed the dev/upgrade_tm_version branch from 0f11891 to 56d8eac Compare August 8, 2018 04:55
@ValarDragon ValarDragon force-pushed the dev/upgrade_tm_version branch from 56d8eac to 2f85fff Compare August 8, 2018 04:56
@ValarDragon
Copy link
Contributor Author

ValarDragon commented Aug 8, 2018

Figuring out why the LCD tests were failing is quite difficult from our logs. We really need to reduce the log output there, I have no idea what the error is after looking at all of those messages. (I don't even know which of the tests is failing) I had to find this out via manually testing each test individually. (It was a rather simple fix, just finding the error was the hardest part)

Currently this doesn't delete from the address to pubkey map. I'm not sure what the best way to delete from this map is, since it has to be deleted when a "CompleteUnbonding" transaction goes through. (As we can still receive slashing evidence until that point) For purposes of getting the next gaia release, I think we can punt on figuring this part out. (We may deem that its unnecessary after tendermint/tendermint#2177)

@codecov
Copy link

codecov bot commented Aug 8, 2018

Codecov Report

Merging #1927 into develop will decrease coverage by <.01%.
The diff coverage is 65.21%.

@@             Coverage Diff             @@
##           develop    #1927      +/-   ##
===========================================
- Coverage    64.86%   64.86%   -0.01%     
===========================================
  Files          114      115       +1     
  Lines         6837     6862      +25     
===========================================
+ Hits          4435     4451      +16     
- Misses        2120     2127       +7     
- Partials       282      284       +2

@cwgoes
Copy link
Contributor

cwgoes commented Aug 8, 2018

The application is meant to maintain an address to pubkey mapping. So I guess we need to do this in slashing.

I don't think we should maintain such a map unless absolutely necessary, I think we should only use the validator address and treat it as an opaque key.

@@ -412,11 +413,7 @@ func (app *BaseApp) CheckTx(txBytes []byte) (res abci.ResponseCheckTx) {
Log: result.Log,
GasWanted: result.GasWanted,
GasUsed: result.GasUsed,
Fee: cmn.KI64Pair{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we getting rid of this? Presumably Tendermint needs to know about the fee?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fees were removed.
from the changelog:
[abci] Removed Fee from ResponseDeliverTx and ResponseCheckTx

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, should we add priority then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ValarDragon thoughts on the above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason to add priority right now. We're not ready to support that at the mempool level, and we won't be for awhile. We discussed that priority is something we want to add as a way to test protobuf upgradability on a test net, not right now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha.

Copy link
Contributor

@cwgoes cwgoes Aug 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think it's not necessary to delay ABCI updates because Tendermint won't use the value yet, but nbd.

// Make sure we have the expected length.
if size != -1 {
require.Equal(t, size, len(bz), "Amino binary size mismatch")
assert.Equal(t, size, len(bz), "Amino binary size mismatch")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why assert instead of require? Wouldn't we want this to fail immediately?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was changing this to match whats in tendermint/crypto right now. I don't really have a preference here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's most important to be consistent -- it seems in SDK-land, the defacto call is require.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its to use require unless there is any reason not to. There is good reason to not use it here if we had more pubkey types, but I guess we can change it when that happens.

Copy link
Contributor

@alexanderbez alexanderbez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @ValarDragon. Left a few minor remarks 👍

^--- PENDING wasn't purged on sdk v0.23.0 release.

BREAKING CHANGES
* Update to tendermint v0.23.0. This involves removing crypto.Pubkey,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add the PR reference here (preferably with a link por favor)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why the PR is helpful. The tendermint changelog is what is helpful, I can link that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the SDK changed and that's important to note where/how it happened.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is a downside of our pending system, the commit with all the changes can't be obtained from the file. I'll link the PR then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd just would like to see a CHANGELOG and be able to go to the issue and/or PR that made the relevant changes.

"github.com/cosmos/cosmos-sdk/x/stake/types"
)

// InitGenesis initializes the keeper's address to pubkey map
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add punctuation here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to enforce this stuff, we should have it as output from a linter. I'll change this, but #postlaunch we should make the linter say this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree 👍 .

}
}

// TODO: Make a method to remove the pubkey from the map when a validator is unbonded.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is pretty trivial, no? Why not just add the call to Keeper#unbondValidator?

Copy link
Contributor Author

@ValarDragon ValarDragon Aug 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual method on the slashing keeper is trivial. Making this work when a validator is unbonded is non-trivial. We need to have a way for the staking keeper to talk to the slashing keeper. The method here isn't added, since we should hold off until we figure that out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the discussion in tendermint/tendermint#2177, we may just want to not do this at all, so no point figuring it out. (We can live with one hashmap entry per every validator on the next testnet)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see I see.

}

func (k Keeper) getPubkey(address crypto.Address) (crypto.PubKey, error) {
addr := new([tmhash.Size]byte)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we allocate a pointer instead of just:

var addr [tmhash.Size]byte
copy(addr[:], address)
// ...

Copy link
Contributor Author

@ValarDragon ValarDragon Aug 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var addr [tmhash.Size]byte is equivalent to what new does. It just has more indirection. I don't know what you mean by "allocating a pointer", since that is used for var as well.

EDIT: Nevermind, your right. Var seems like the right way to go :), (forgot that var didn't make a pointer)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new returns a memory address -- that way you don't have to do *addr but just addr. No big deal though.

@ValarDragon
Copy link
Contributor Author

#1927 (comment)

Lets continue discussion of that in tendermint/tendermint#2177. Bucky was suggesting that its important to maintain referrence to the pubkey, and I think we should only use the pubkey.

@ValarDragon
Copy link
Contributor Author

Currently the addr-> pubkey map isn't persisted in state. (Thanks for pointing that out chris), that needs to be finished before this can be merged.

Copy link
Contributor

@alexanderbez alexanderbez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @ValarDragon 👍

@ValarDragon
Copy link
Contributor Author

ValarDragon commented Aug 9, 2018

This should be good to merge now :) (pending final review)

for i := 0; i < len(vals); i++ {
val := vals[i]
pubkey, err := tmtypes.PB2TM.PubKey(val.PubKey)
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we panic here?

}

// TODO: Make a method to remove the pubkey from the map when a validator is unbonded.
func (k Keeper) addPubkey(ctx sdk.Context, pubkey crypto.PubKey, del bool) {
Copy link
Contributor

@cwgoes cwgoes Aug 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a function called addPubkey which takes a delete boolean is a bit confusing, why not just have addPubkey and deletePubkey?

At minimum change the function name, maybe to addOrRemovePubkey

func (k Keeper) getPubkey(ctx sdk.Context, address crypto.Address) (crypto.PubKey, error) {
store := ctx.KVStore(k.storeKey)
var pubkey crypto.PubKey
err := k.cdc.UnmarshalBinary(store.Get(getAddrPubkeyRelationKey(address)), &pubkey)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MustUnmarshalBinary

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(unless you wanted the custom error, in which case disregard)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think a custom error message is preferrable

Copy link
Contributor

@cwgoes cwgoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM mostly, a few minor nits.

Copy link
Contributor

@cwgoes cwgoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK, we should definitely test any Tendermint-23 relevant changes on an internal testnet.

@ValarDragon
Copy link
Contributor Author

ValarDragon commented Aug 10, 2018

That basically means testing liveness slashing, and unbonding /rebonding due to time changes. (I'm not worried about the crypto.Signature change breaking stuff)

Also yeah, was in a hurry when writing that last commit, forgot to test

Copy link
Member

@ebuchman ebuchman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Dev.

Note gaiadebug binary got accidentally checked in

We might also want our own Time function that returns the .UTC for us automatically and so we don't have to pass in nanoseconds.

Also I wonder when is it better to use a new type for []byte, eg. type Signature []byte ?

}

func getAddrPubkeyRelationKey(address []byte) []byte {
return append([]byte{0x03}, address...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

magic number

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -15,15 +16,15 @@ func TestGetSetValidatorSigningInfo(t *testing.T) {
newInfo := ValidatorSigningInfo{
StartHeight: int64(4),
IndexOffset: int64(3),
JailedUntil: int64(2),
JailedUntil: time.Unix(2, 0),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.UTC() ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The utc was in some places due to time defaulting to the 0 value. (and not everywhere else)

It is kind of sub ideal though, but i thought just using utc was preferable to going through a separate initializer for those tests when initializing validators.


sdk "github.com/cosmos/cosmos-sdk/types"
)

// defaultUnbondingTime reflects three weeks in seconds as the default
// unbonding time.
const defaultUnbondingTime int64 = 60 * 60 * 24 * 3
const defaultUnbondingTime time.Duration = 60 * 60 * 24 * 3 * time.Second
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note it's currently 3 days not 3 weeks

@ebuchman
Copy link
Member

We also need to update the spec about the addr->pubkey map since it's being persisted.

Though looks like we might revert the need for this depending on what happens with tendermint/tendermint#2177

@ebuchman ebuchman merged commit b2a4aec into develop Aug 12, 2018
@ebuchman ebuchman deleted the dev/upgrade_tm_version branch August 12, 2018 07:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants