Highly-available & fault-tolerant validators #17189
outofforest started this conversation in Ideas
Highly-available & fault-tolerant validators
This document describes my findings on a possible implementation of highly available, fault-tolerant validators.
The purpose of this issue is to discuss the topic with the Cosmos SDK team and other interested people, check whether this functionality is desired (I believe it is!), provide more details for unclear parts, and discover possible dependencies in the code that I missed. Feel free to comment and give suggestions.
Once the discussion is finished, if I see that implementing this is possible, I will invest some time to do it.
Motivation
Let's say you run a validator, so you have a server or virtual machine where the validator runs.
Now you need a disaster recovery plan in case the server goes down or needs maintenance, because you don't want to be slashed.
For this, you need at least two servers and a procedure to switch the validator from one server to the other. It might be done manually or automatically, using some heartbeat software.
The problem is that, no matter which option you choose, currently the same private key of the validator must be moved between the machines or coexist on both of them. This leads to some problems:
I think we have all seen cases where even professional companies had their validator tombstoned because of some mistake.
The source of the issue is that Cosmos SDK was never designed with those scenarios in mind.
So I started thinking about how it could be fixed, and I've found a possible fix.
If we assume that the private key of the validator can never leave the machine where it was generated, then the obvious conclusion is that each server (main and backup) must hold its own private key.
But then a problem arises, because a validator in Cosmos SDK may sign blocks and proposals with only a single private key.
This led me to the conclusion that I need to break this single-key assumption.
After thinking more about it, I developed the idea of this framework:
I see three possible conditions for switching the public key:
`authz` to do it on behalf of the operator
This functionality could be implemented as an extension to the current `staking` module, eliminating the huge problem validator operators experience when maintaining the validators.
Mechanics of the HA node switching
On the CometBFT side things are simple. There is a set of validators, each represented by a public key and voting power.
At the moment there is a 1 to 1 relationship between a validator in CometBFT and a validator in Cosmos SDK.
By implementing this proposal I want one CometBFT validator to be represented by n possible nodes (public keys) in Cosmos SDK, grouped under a common operator address. At any time exactly one public key in Cosmos SDK is active for each operator, so the 1 to 1 relationship between CometBFT and Cosmos SDK is still maintained. The only difference is that the set of CometBFT validators is "more dynamic".
In practice, it means that whenever the active public key of a Cosmos SDK validator changes, it must replace the old one in CometBFT by issuing a validator update in the end blocker of the `staking` module, providing the new active public key with the same voting power.
As a result, CometBFT "knows" only the active public keys constituting the set of active validators.
Terminology
The problem with terminology arises because the word `validator` may have many meanings now:
In the spec below I use `HA node` for the third meaning. Good wording for the first and second meanings is welcome.
End blocker
As mentioned earlier, whenever the set of active HA nodes changes, the `staking` module must prepare the set of validator updates to be passed to CometBFT.
The old HA node must be removed by setting the voting power of its public key to 0, and the new active one must be added in its place by setting the voting power for its public key.
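A minimal sketch of the update pair described above, assuming a simplified `ValidatorUpdate` record (public key plus voting power). The type and the `SwitchUpdates` helper are hypothetical names used only for illustration, not the actual `staking` or CometBFT types.

```go
package main

import "bytes"

// ValidatorUpdate is a simplified, hypothetical stand-in for the record
// passed to CometBFT: a consensus public key and its voting power.
type ValidatorUpdate struct {
	PubKey []byte
	Power  int64
}

// SwitchUpdates returns the updates needed to move the voting power from
// the old active HA node key to the new one. A power of 0 removes the old
// key from the CometBFT validator set. If the key does not change, no
// update is needed.
func SwitchUpdates(oldKey, newKey []byte, power int64) []ValidatorUpdate {
	if bytes.Equal(oldKey, newKey) {
		return nil
	}
	return []ValidatorUpdate{
		{PubKey: oldKey, Power: 0},     // remove the old HA node
		{PubKey: newKey, Power: power}, // add the new one with the same power
	}
}
```

Both updates are emitted in the same block, so the total voting power of the validator set stays unchanged by the switch.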
Create validator tx
When a validator is created, it is identified by the operator address, which might be treated as the ID of the validator.
Cosmos SDK already enforces that only one validator may be run by each operator, so it's already unique.
When a validator is created, its public key is passed as an independent field, meaning we may create many HA nodes, each using a different public key.
There is a check verifying that the public key is not used by any other validator. I must do the same to check that each public key is unique across all the HA nodes.
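The uniqueness check could look roughly like the sketch below. It assumes an in-memory index from public key to operator address; `NodeRegistry` and `AddNode` are hypothetical names, and a real implementation would use the module's store instead of a map.

```go
package main

import "fmt"

// NodeRegistry is a hypothetical index from a consensus public key
// (in any canonical string form) to the operator owning that HA node.
type NodeRegistry struct {
	ownerByKey map[string]string
}

// NewNodeRegistry returns an empty registry.
func NewNodeRegistry() *NodeRegistry {
	return &NodeRegistry{ownerByKey: map[string]string{}}
}

// AddNode registers pubKey as an HA node of the given operator. It fails
// if the key is already used by any HA node, no matter which operator
// owns it.
func (r *NodeRegistry) AddNode(operator, pubKey string) error {
	if owner, ok := r.ownerByKey[pubKey]; ok {
		return fmt.Errorf("public key already used by an HA node of operator %s", owner)
	}
	r.ownerByKey[pubKey] = operator
	return nil
}
```

One operator may register many keys, but a given key can belong to only one HA node in the whole system.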
Relations to other modules
At the end, the `AfterValidatorCreated` hook is called. The `slashing` and `distribution` modules subscribe to this hook:
`distribution`: fields related to rewards and commissions are initialized. All the operations there use only the operator's address, so my changes don't affect the logic there. Nothing needs to be modified.
`slashing`: the consensus address -> public key relation is stored by the hook. That mapping is used only by the `evidence` module to check that the consensus address reported in the evidence exists in the system. I believe this is not needed. Anyway, the consensus address is derived from the public key, so it is a 1 to 1 relationship for each HA node, meaning I may just add the mapping for each node.
To do this, I need new hooks:
to maintain the mapping inside the `slashing` module.
Managing relations between validator and its HA nodes
The proto of the `staking` module defines the `Validator` message containing the `consensus_pubkey` field, storing the public key of the validator. It must be converted into a slice to store the public keys of all the HA nodes.
There is a `ConsPubKey() (cryptotypes.PubKey, error)` method used in a couple of places to get that key. As there is no single key for the validator anymore, it must be converted to one of the options:
It hasn't been identified yet which solution fits the purpose of each call.
The `Validator` message should be extended by adding an `active_consensus_pubkey` field indicating which HA node is active at the moment.
There is a `ValidatorSigningInfo` map mapping the consensus address to some metrics and information. That structure contains fields related to the validator itself (not a particular HA node), except the consensus address itself. The `Address` field is not used anywhere, so maybe it might be simply removed. Then, the operator's address should be used as the key in that map, because this is the value uniquely identifying the validator, not the consensus address.
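To make the data model changes easier to discuss, here is a rough sketch of how they might look. Everything here is an assumption for illustration: the Go struct is a simplified stand-in for the proto message, keys are plain strings, `ActiveConsPubKey` is only one possible replacement for `ConsPubKey()`, and `SigningInfo` lists just two of the existing fields.

```go
package main

import "errors"

// Validator is a simplified, hypothetical model of the extended message.
type Validator struct {
	OperatorAddress       string
	ConsensusPubKeys      []string // slice replacing the single consensus_pubkey
	ActiveConsensusPubKey string   // the proposed active_consensus_pubkey field
}

// ActiveConsPubKey returns the key of the currently active HA node.
// It fails if the active key is not one of the validator's HA node keys.
func (v Validator) ActiveConsPubKey() (string, error) {
	for _, k := range v.ConsensusPubKeys {
		if k == v.ActiveConsensusPubKey {
			return k, nil
		}
	}
	return "", errors.New("active key is not among the validator's HA node keys")
}

// SigningInfo holds per-validator metrics (hypothetical subset of fields).
type SigningInfo struct {
	MissedBlocksCounter int64
	Tombstoned          bool
}

// SigningInfos is keyed by operator address instead of consensus address,
// so the metrics survive a switch of the active HA node.
type SigningInfos map[string]SigningInfo
```

With the operator address as the key, switching the active HA node does not require moving or re-creating the signing info entry.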
HA node states and active node switching
When a validator is created, the provided public key constitutes the first HA node. This node is automatically set to the `active` state.
There are 3 possible states for an HA node:
`active` - this HA node signs and proposes blocks; only one HA node per validator may be in this state
`enabled` - an HA node in this state does not sign anything until it is set to `active`
`disabled` - an HA node in this state does not sign anything and cannot be set to `active`; it must be set to `enabled` first
The difference between `enabled` and `disabled` is that the operator may grant someone else (using `authz`) permission to change the active HA node (move it from `enabled` to `active`), but at the same time the operator might decide that there are some HA nodes (the disabled ones) which cannot be activated, e.g. because the servers are under maintenance or intentionally turned off.
This means that a hypothetical heartbeat application might exist, monitoring the status of the servers and switching the active HA node automatically if the current one is dead. The application should use its own private key (not the one belonging to the operator), and that private key should be permitted (with `authz`) to broadcast a transaction selecting the `active` HA node from the set of `enabled` ones.
At the same time, this private key should not be allowed to enable a disabled HA node.
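The state rules above can be sketched as a small state machine. This is only an illustration under several assumptions: the `authz` grant is modelled as a plain set of grantee addresses, the previously active node falls back to `enabled` after a switch (the text above does not specify this), and all type and method names are hypothetical.

```go
package main

import "errors"

// NodeState is the state of a single HA node.
type NodeState int

const (
	Disabled NodeState = iota
	Enabled
	Active
)

// HAValidator is a hypothetical model: HA node public key -> state.
type HAValidator struct {
	Operator string
	Grantees map[string]bool // senders allowed to switch the active node
	Nodes    map[string]NodeState
}

// Activate switches the active HA node to target. The operator or a
// grantee may do it, and only an enabled node can become active.
func (v *HAValidator) Activate(sender, target string) error {
	if sender != v.Operator && !v.Grantees[sender] {
		return errors.New("sender is not allowed to switch the active HA node")
	}
	state, ok := v.Nodes[target]
	if !ok {
		return errors.New("unknown HA node")
	}
	if state == Active {
		return nil // already active, nothing to do
	}
	if state != Enabled {
		return errors.New("only an enabled HA node can be activated")
	}
	for key, s := range v.Nodes {
		if s == Active {
			v.Nodes[key] = Enabled // assumption: old active node stays enabled
		}
	}
	v.Nodes[target] = Active
	return nil
}

// SetEnabled enables or disables a non-active HA node. Only the operator
// may do it, so a grantee cannot enable a disabled node.
func (v *HAValidator) SetEnabled(sender, target string, enabled bool) error {
	if sender != v.Operator {
		return errors.New("only the operator may enable or disable HA nodes")
	}
	state, ok := v.Nodes[target]
	if !ok {
		return errors.New("unknown HA node")
	}
	if state == Active {
		return errors.New("the active HA node must be switched first")
	}
	if enabled {
		v.Nodes[target] = Enabled
	} else {
		v.Nodes[target] = Disabled
	}
	return nil
}
```

In this model a heartbeat application would be listed in `Grantees`: it can call `Activate` for any enabled node, while `SetEnabled` stays reserved for the operator.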
New transactions
New transactions need to be added to the `staking` module for:
The `CreateValidator` tx must be modified accordingly to immediately create and activate the first HA node for the validator.
It looks like the structure of the message does not need to be changed.
New queries
To do in next steps
The next step after implementing this proposal would be adding an option to switch the active HA node automatically by the chain
if the current active one missed the opportunity to propose the block. This failover mechanism would eliminate the need
for having the heartbeat application described above because its role would be taken by the chain itself.