Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define a consensus track and spec Clique (EIP-225). #1570

Merged
merged 4 commits into from
Mar 6, 2019

Conversation

karalabe
Copy link
Member

@karalabe karalabe commented Nov 9, 2018

By popular demand, the Clique PoA EIP.

@Arachnid I'm unsure who is maintaining the EIP website, but this PR actually attempts to add (please fix if incorrect) a new spec category called consensus. I really don't think a new consensus protocol fits into any other category and if we expect long term to be a few more proposals (e.g. aura), then it might make sense to add it now.

cc @5chdn

@karalabe karalabe requested a review from Arachnid November 9, 2018 08:36
EIPS/eip-225.md Outdated Show resolved Hide resolved
@jpitts
Copy link
Member

jpitts commented Nov 9, 2018

AFAIK, the source of the categories / navigation in the EIPs website is the set of "EIP Types" defined in EIP-1:

https://github.com/ethereum/EIPs/blob/master/EIPS/eip-1.md

Copy link
Contributor

@nicksavers nicksavers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@karalabe Great to see the PR here. Just a few comments. For the rest I'd say this is good enough for a Draft. During implementations in other clients hopefully in the next few weeks, we could see some minor clarifications and have it moved to LAST CALL status.

EIPS/eip-225.md Outdated Show resolved Hide resolved
EIPS/eip-225.md Outdated Show resolved Hide resolved
EIPS/eip-225.md Outdated Show resolved Hide resolved
* **`DIFF_NOTURN`**: Block score (difficulty) for blocks containing out-of-turn signatures.
* Suggested `1` since it just needs to be an arbitrary baseline constant.
* **`DIFF_INTURN`**: Block score (difficulty) for blocks containing in-turn signatures.
* Suggested `2` to show a slight preference over out-of-turn signatures.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this on Discord already, I would strongly suggest having the DIFF_INTURN set to 3.

Short summary: because it is more likely that you have two ouf-of-turn blocks with total score of 2, than three out-of-turn blocks with total score of 3. An in-turn block can help the network to reorganize even if there were two out-of-turn blocks. We had multiple of such occasions that caused Görli network to halt.

* If a signer is allowed to sign a block (is on the authorized list and didn't sign recently).
* Calculate the optimal signing time of the next block (parent + **`BLOCK_PERIOD`**).
* If the signer is in-turn, wait for the exact time to arrive, sign and broadcast immediately.
* If the signer is out-of-turn, delay signing by `rand(SIGNER_COUNT * 500ms)`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Observations on Görli show that we have way too many reorganizations due to the out-of-turn delay often being lower than the network latency. I would strongly encourage to introduce something like a MIN_WAIT of 5000ms, so that the out-of-turn delay is something around MIN_WAIT + rand(SIGNER_COUNT * 500ms).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, with just the random delay out of turn blocks may be published at essentially the same time as the in-turn block. I'd highly recommend the minimum wait time and the 500ms multiplier should both be configurable like block period tends to be. The best values for them are highly affected by network latency and the chosen block period.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will argue that NO_TURN blocks should probably wait for at least half period , to give person inturn an chance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not affecting consensus, so clients could just go ahead and implement it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@5chdn I will make the change in parity for this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any delay will make block times jump around so instead of 15 seconds, some blocks will take 22 seconds. That seems overly excessive to me.

Also, if a signer is offline, there will be more than 1 gap, (the signer that siges out-of-turn next has a high probability of missing its turn because it cannot yet sign).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree, in an POA environment, the node are much more stable, i would argue that the distribution of block time is not Normial distributed or Bi Modal distributed, let's say some comptatnt node can generate inturn block and distributed at an relaibility of 99.9% (or 99% if super unlucky) , the NOTURN block should really just be 0.1%.

Suppose we put NOTURN and INTURN block both together with no delay, then 99.9% of the time the network will reorg to INTURN block, which isn't ideal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any delay will make block times jump around so instead of 15 seconds, some blocks will take 22 seconds. That seems overly excessive to me.

Just like on mainnet 😃 -- I don't see why this should be an issue, especially when this will drastically improve network stability.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any delay will make block times jump around so instead of 15 seconds, some blocks will take 22 seconds. That seems overly excessive to me.

Just like on mainnet 😃 -- I don't see why this should be an issue, especially when this will drastically improve network stability.

@5chdn you are saying you agree that more "potential" delay is better than frequent 1block reorg, righ?

* **`SIGNER_COUNT`**: Number of authorized signers valid at a particular instance in the chain.
* **`SIGNER_INDEX`**: Index of the block signer in the sorted list of current authorized signers.
* **`SIGNER_LIMIT`**: Number of consecutive blocks out of which a signer may only sign one.
* Must be `floor(SIGNER_COUNT / 2) + 1` to enforce majority consensus on a chain.
Copy link
Contributor

@5chdn 5chdn Dec 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why + 1? Wouldn't that allow for more blocks sealed by a single miner than half of the allowed of authorities? edit, misread the sentence.


* If a signer is allowed to sign a block (is on the authorized list and didn't sign recently).
* Calculate the optimal signing time of the next block (parent + **`BLOCK_PERIOD`**).
* If the signer is in-turn, wait for the exact time to arrive, sign and broadcast immediately.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add that in-turn signers should continue to produce their block even if they receive an out-of-turn block for that height before they publish their in-turn one. Otherwise clock drift and the lack of a minimum delay for out of turn blocks can cause an online in-turn validator from publishing their block and cause problems for the network. There's a more detailed explanation of how this happens at https://docs.google.com/document/d/1tmsr66sAPJmIZfSy5zck1uxnLzaXYxvzTNL5CF1pJQ0/edit#heading=h.ic4qz2ybx51

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this graunteed by default because NOTURN block has diffcuilty = 1 ,while inturn block has diffculity = 2, thus locally it will think the pending block is the best block to work on? same as ethash engine, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That depends on the client's algorithm for mining. It's not unreasonable, and arguably preferable, for a client to always ensure they are building on the best chain head rather than just ensuring the block they're mining would become the new chain head. Doing so would reduce the number of uncle blocks generated and potentially give the mined block a better chance of remaining on the canonical chain (because building a longer chain should give you an even higher total difficulty)

@Arachnid
Copy link
Contributor

Arachnid commented Dec 5, 2018

@karalabe I really think this fits just fine in the existing 'core' category. I don't think we need a new category.

* Arbitrary values are permitted nonetheless (even meaningless ones such as voting out non signers) to avoid extra complexity in implementations around voting mechanics.
* **Must** be filled with zeroes on checkpoint (i.e. epoch transition) blocks.
* **`nonce`**: Signer proposal regarding the account defined by the `beneficiary` field.
* Should be **`NONCE_DROP`** to propose deauthorizing `beneficiary` as a existing signer.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From discussions on Pantheon github - should the NONCE_DROP be defined as the default value of the nonce field when not voting?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well that is ture already.

@nicksavers nicksavers merged commit a545451 into ethereum:master Mar 6, 2019
@nicksavers
Copy link
Contributor

Great work Péter!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.