Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for "clique" based consensus #1457

Closed
Tracked by #72
pipermerriam opened this issue Nov 9, 2018 · 8 comments
Closed
Tracked by #72

Add support for "clique" based consensus #1457

pipermerriam opened this issue Nov 9, 2018 · 8 comments

Comments

@pipermerriam
Copy link
Member

What is wrong?

The clique POA has gained some consensus across more clients. We should add support for it.

How can it be fixed

Implement EIP225 based consensus in py-evm and trinity.

@lithp
Copy link
Contributor

lithp commented Feb 20, 2019

This would enable support for both goerli and rinkeby

@lithp
Copy link
Contributor

lithp commented Feb 20, 2019

Here's the unmerged draft EIP: ethereum/EIPs#1570

@pipermerriam
Copy link
Member Author

This seems like a good grant candidate. Anyone want to try to ballpark implementation complexity, assign a DAI amount and submit it to the grants request page.

@soc1c
Copy link

soc1c commented Mar 9, 2019

Goerli initiative is willing to fund this with 2500 DAI if anyone wants to pick this up. (Ideally with ethereum/trinity#72 as bonus.)

If anyone is interested, reach out here, here ethereum/trinity#72 or here eth-clients/goerli#16 before allocating a bounty.

@veox
Copy link
Contributor

veox commented Apr 16, 2019

Anyone want to try to ballpark implementation complexity (...)

FTR, I've started looking into this. Nothing serious yet; branch add-goerli-testnet.

Random notes follow.


EIP-225 has been merged (currently still Draft).


Appearances of eth.consensus:

% g -A1 -R -Pzo 'from eth.consensus.*import \((\n|.)*?\)' *
eth/vm/base.py:from eth.consensus.pow import (
    check_pow,
)--
eth/tools/mining.py:from eth.consensus import (
    pow,
)--
tests/core/consensus/test_pow_mining.py:from eth.consensus.pow import (
    CACHE_MAX_ITEMS,
    EPOCH_LENGTH,
    check_pow,
    get_cache,
)

check_pow() is currently done deep in the VM class hierarchy, in VM.validate_seal(). Not sure if this is a problem - still reading up on how Clique works, exactly.

VM.validate_header() does check extra_data length (<=32), so seems like VM does need to know if it's running Clique. :/

Ironic, since (from EIP-225):

Note, changing the length of a header field is a non invasive operation as all code (such as RLP encoding, hashing) is agnostic to that, so clients wouldn’t need custom logic.


Another complication seems to be that (from EIP-225, highlight mine):

To authorize a block for the network, the signer needs to sign the block’s hash containing everything except the signature itself. The means that the hash contains every field of the header (nonce and mixDigest included), and also the extraData with the exception of the 65 byte signature suffix. The fields are hashed in the order of their definition in the yellow paper.

Guess that for this reason, the naive genesis hash-matching test fails: the BlockHeader.hash property RLP-encodes the entirety of the header, where IIUC it should be skipping some in Clique. PEBKAC, test passes.

Still, might need to

#CLIQUE_BLOCK_SIGNATURE_LENGTH = 65
BlockHeader.copy(extra_data=block_header_before_signing.extra_data[:-65])

(or something) when checking non-genesis block signature, and that might mean BlockHeader, too, would need to know whether it and/or Chain/VM is/are running Clique. X_X

@veox
Copy link
Contributor

veox commented Apr 17, 2019

Another day, more notes.


There are two rule sets for header fields:

  • one for epoch transition blocks (including genesis), also called "checkpoints" for containing full lists of authorised signers (authorities);
  • another for all blocks in between epoch transitions.

There is overlap in rule sets; main differences (explicitly stated in EIP-225) are for beneficiary, nonce, extra_data.


Possibly an issue is the statefulness of the protocol. This may become an issue downstream, in trinity. Or here. Or not at all.

Every epoch transition (...) acts as a stateless checkpoint, from which capable clients should be able to sync without requiring any previous state. [On epoch transition] all non settled votes are discarded, and tallying starts from scratch.

For all non-epoch transition blocks:

  • Signers may cast one vote per own block to propose a change to the authorization list.
  • (...)
  • Votes are tallied live as the chain progresses (...).
  • Proposals reaching majority consensus SIGNER_LIMIT come into effect immediately.

As things stand, check_pow is stateless, and therefore can be meaningfully contained in BlockHeader.validate_seal().

check_clique_poa() would need to get passed a list - which may change on any (non-epoch?..) block. So, PoA validation seems a higher-level abstraction.

The need to maintain this authority list, plus the current vote tally, means there is no short-cut for nodes that do not wish to participate in block sealing. They, too, must maintain this rolling state, block-by-block, starting from latest epoch transition. "Come into effect immediately" is what causes the complication.

(OT bit: Had it been "get approved to start on next epoch transition", validating a header would be possible by using the static list from the start of current epoch.)


Anyway, seems that authority validation must involve the VM level.

  • Currently for PoW: VM.validate_header(...) -> VM.validate_seal(header) -> eth.consensus.pow.check_pow(<header_elements>)
  • Maybe for Clique PoA: VM.validate_header(...) -> VM.validate_seal(header) -> eth.consensus.clique.check_poa(**header_elements, VM.authorities)

Similar to parent_hash, authority data refers to previous blocks, so is "out of scope" at the BlockHeader level. Different from parent_hash, verification can't be done meaningfully if it's not available.

Also differently from parent_hash, verification can't be done meaningfully if there's been a gap in verifications leading up to the header-at-question. This is the part which may become problematic in trinity.


BaseChain and BaseVM are subclassed from Configurable... Could that be leveraged?

GOERLI_VM_CONFIGURATION = (
    # Note: Frontier through Constantinople are excluded since this chain starts at Petersburg.                        
    (PETERSBURG_GOERLI_BLOCK, PetersburgVM, CliqueConsensus),
)

@veox
Copy link
Contributor

veox commented Oct 1, 2019

FTR, @cburgdorf has started work on this (both py-evm and trinity). See gitter for progress / issue description / git branches.


Issue has been identified: tx fee going to the wrong address. EIP-225 doesn't seem to state anywhere (explicitly) that the fee should go to the signers.

@cburgdorf
Copy link
Contributor

I just merged #1855 which adds most of Clique Consensus. It's not 100 % implemented at this point which means it won't be able to spin up a Py-EVM based node as an actual Görli validator but it's possible to sync the Görli network just fine.

I still need to polish the Trinity side of this but it should soon land in a next release I'm sure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants