Allow tortoise to verify layer n+1 before verifying layer n #2403

Closed
lrettig opened this issue May 1, 2021 · 5 comments

Comments

@lrettig
Member

lrettig commented May 1, 2021

Description

Right now, if the verifying tortoise fails to verify layer N, it stops trying to verify any later layer.

If the verifying tortoise fails to verify a layer N, it should continue trying to verify later layers. Verification of each layer is totally independent and does not rely upon verification of earlier layers. E.g., if the Hare fails for layer N but succeeds for layer N+1, it should be possible to verify layer N+1 even before the status of layer N is resolved.
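As a rough illustration of the proposed behavior, the sketch below scans a range of layers and records a result for each one instead of stopping at the first failure. `LayerID`, `verifyRange`, and the `verify` callback are hypothetical names used only for this example; they are not taken from `tortoise/verifying_tortoise.go`.

```go
package main

import "fmt"

// LayerID is a hypothetical stand-in for the node's layer identifier type.
type LayerID uint32

// verifyRange tries every layer in [first, last] and records the outcome for
// each one. A failure on one layer does not stop the scan. The verify
// callback is an assumption standing in for the tortoise's per-layer check.
func verifyRange(first, last LayerID, verify func(LayerID) bool) map[LayerID]bool {
	verified := make(map[LayerID]bool)
	for lid := first; lid <= last; lid++ {
		verified[lid] = verify(lid)
	}
	return verified
}

func main() {
	// Pretend the Hare failed for layer 11 only.
	hareSucceeded := func(lid LayerID) bool { return lid != 11 }

	res := verifyRange(10, 13, hareSucceeded)
	for lid := LayerID(10); lid <= 13; lid++ {
		fmt.Printf("layer %d verified: %v\n", lid, res[lid])
	}
}
```

In this sketch, layers 12 and 13 are reported as verified even though layer 11 is not. Whether their state may be applied is a separate question.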

The catch here is that we have to be very careful with how we apply the state of layer N+1, if we haven't yet applied the state of layer N. We can conservatively "project" elements of the state of layer N+1 that do not depend upon unresolved state transitions in layer N.

See spacemeshos/SMIPS#46 for more.

Affected code

  • Tortoise
  • State

This issue appears in commit hash: 9d96750

Related files (optionally with line numbers): tortoise/verifying_tortoise.go

@noamnelke
Member

> The catch here is that we have to be very careful with how we apply the state of layer N+1, if we haven't yet applied the state of layer N. We can conservatively "project" elements of the state of layer N+1 that do not depend upon unresolved state transitions in layer N.

At this point we shouldn't apply the state of non-contiguous layers, IMO, as it greatly complicates everything (think of rollbacks for example). If a layer is stuck, transactions are stuck at that point until it's verified (or given up on by verifying it as empty at some point).

This is not to say we shouldn't verify subsequent layers (without applying their state changes). This is obviously needed. We should decouple, as much as possible, the contextual block validation from the state transition.
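One way to picture this decoupling is a small tracker that records verification per layer but only applies state along the contiguous verified prefix. This is a minimal sketch under assumptions: `Tracker`, `NewTracker`, `MarkVerified`, and the `apply` callback are hypothetical names, not the node's actual API.

```go
package main

import "fmt"

// LayerID is a hypothetical stand-in for the node's layer identifier type.
type LayerID uint32

// Tracker keeps "verified" and "applied" as separate facts about a layer.
type Tracker struct {
	verified    map[LayerID]bool
	lastApplied LayerID // highest layer whose state has been applied
}

// NewTracker starts from a layer whose state is already applied.
func NewTracker(lastApplied LayerID) *Tracker {
	return &Tracker{verified: make(map[LayerID]bool), lastApplied: lastApplied}
}

// MarkVerified records consensus on a layer, then applies state for as many
// contiguous verified layers as possible. It returns the layers applied by
// this call, which is empty while an earlier layer is still stuck.
func (t *Tracker) MarkVerified(lid LayerID, apply func(LayerID)) []LayerID {
	t.verified[lid] = true
	var applied []LayerID
	for t.verified[t.lastApplied+1] {
		t.lastApplied++
		apply(t.lastApplied)
		applied = append(applied, t.lastApplied)
	}
	return applied
}

func main() {
	tr := NewTracker(9)
	apply := func(lid LayerID) { fmt.Println("applying state of layer", lid) }

	fmt.Println(tr.MarkVerified(10, apply)) // contiguous with layer 9
	fmt.Println(tr.MarkVerified(12, apply)) // verified, but layer 11 is stuck
	fmt.Println(tr.MarkVerified(11, apply)) // unblocks layers 11 and 12
}
```

Here layer 12 is verified as soon as consensus on it is reached, but its state is applied only once layer 11 is resolved, so a rollback never has to deal with non-contiguous state.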

Since applying or partially applying the state transition, as you suggested, is a local operation that has little impact on consensus, we can quite easily change this in a future release with no need for a consensus fork.

@lrettig
Member Author

lrettig commented May 2, 2021

I agree. Right now, verifying a layer and applying its state are tightly coupled: either we do both or neither. So another way of looking at this is that we need to decouple the two. It should be possible for a layer to be verified, and for us to have consensus on its contents, but for its state transitions to not yet be applied. There are of course downstream UX implications: how do you differentiate between these in smapp or in the explorer?

If layer N is verified, and layer N+1 is not verified, and Alice sends Bob some coins in layer N+2, and then layer N+2 is verified but its state not yet applied, then Bob should be happy to give Alice the coffee she bought with those coins, secure in knowing that the transaction will eventually be applied and finalized. This has implications for splitting the validator role from the miner role, and the "conservative balances" that may entail.

@noamnelke
Member

> If layer N is verified, and layer N+1 is not verified, and Alice sends Bob some coins in layer N+2, and then layer N+2 is verified but its state not yet applied, then Bob should be happy to give Alice the coffee she bought with those coins, secure in knowing that the transaction will eventually be applied and finalized.

This is not true. What if a transaction draining Alice's account is later added to layer N+1? Then the transaction from Alice to Bob in layer N+2 will NOT be applied. That's exactly why I say that we can't apply the state in a layer if not all layers before it were verified.
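The scenario can be made concrete with a toy ledger. The sketch below is an assumption-laden simplification: the `Tx` type and `applyLayers` function are hypothetical, and fees, nonces, and signatures are ignored. It applies layers in order and skips any transaction the sender can no longer cover.

```go
package main

import "fmt"

// Tx is a hypothetical, simplified transfer: no fees, nonces, or signatures.
type Tx struct {
	From, To string
	Amount   uint64
}

// applyLayers applies transactions layer by layer, in order, and skips any
// transaction whose sender has insufficient balance at that point. It
// returns the skipped transactions.
func applyLayers(balances map[string]uint64, layers [][]Tx) []Tx {
	var skipped []Tx
	for _, layer := range layers {
		for _, tx := range layer {
			if balances[tx.From] < tx.Amount {
				skipped = append(skipped, tx)
				continue
			}
			balances[tx.From] -= tx.Amount
			balances[tx.To] += tx.Amount
		}
	}
	return skipped
}

func main() {
	coffee := Tx{From: "alice", To: "bob", Amount: 5}
	drain := Tx{From: "alice", To: "carol", Amount: 10}

	// Layer N+1 turns out empty: the coffee payment in N+2 goes through.
	b1 := map[string]uint64{"alice": 10}
	s1 := applyLayers(b1, [][]Tx{{}, {}, {coffee}})
	fmt.Println("bob:", b1["bob"], "skipped:", len(s1))

	// Layer N+1 later gains a transaction draining Alice: coffee is skipped.
	b2 := map[string]uint64{"alice": 10}
	s2 := applyLayers(b2, [][]Tx{{}, {drain}, {coffee}})
	fmt.Println("bob:", b2["bob"], "skipped:", len(s2))
}
```

The same verified layer N+2 yields different outcomes for Bob depending on what layer N+1 ends up containing, which is why its state cannot safely be treated as final while N+1 is unresolved.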

I actually think that if layers are not verified immediately, this creates a whole slew of complicated edge cases. We're talking about penalizing block producers in the Hare if they picked bad transactions (bad mempool management). What if they can't make smart decisions about transactions because there's no consensus about previous layers? Even if we want to exempt block producers from penalties when there's no consensus, what if consensus is reached after they've produced their blocks? What if it was reached earlier, but we only realized it was reached after receiving their blocks? What if it was reached sooner and we knew about it on time, but the block producer didn't?

I don't have a good solution for this, but I think we should treat delayed consensus as the terrible disaster that it is, not as a normal condition that we handle gracefully before moving on. No immediate consensus? Stop the world until it's reached. If this creates a bad user experience, we should put all our efforts into making sure consensus is reached immediately, not into better handling the case where it isn't.

@lrettig
Member Author

lrettig commented Jul 28, 2021

> What if a transaction draining Alice's account is later added to layer N+1? Then the transaction from Alice to Bob in layer N+2 will NOT be applied.

This can be accommodated by using "conservative balances" that factor in transactions that haven't been finalized. But that's complex and, for the foreseeable future, I agree with you that we should stop the world whenever and for as long as consensus is stuck.
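The thread does not define "conservative balances", so the following is only one possible reading, sketched under that assumption: take the last applied balance and subtract every outgoing amount the node has seen but not yet finalized, while ignoring unfinalized incoming funds. The function name `conservativeBalance` is hypothetical.

```go
package main

import "fmt"

// conservativeBalance returns a lower bound on what an account can spend:
// the applied balance minus all known, not-yet-finalized outgoing amounts.
// Pending incoming amounts are deliberately ignored. This is an illustrative
// interpretation, not the project's definition.
func conservativeBalance(applied uint64, pendingOutgoing []uint64) uint64 {
	bal := applied
	for _, amt := range pendingOutgoing {
		if amt >= bal {
			return 0 // pending spends could exhaust the account
		}
		bal -= amt
	}
	return bal
}

func main() {
	// Alice has 10 applied, with a pending payment of 5 to Bob.
	fmt.Println(conservativeBalance(10, []uint64{5}))
	// A pending transaction draining the account is also known.
	fmt.Println(conservativeBalance(10, []uint64{10, 5}))
}
```

Note that this only accounts for transactions the node already knows about. A draining transaction that surfaces later in an unresolved layer is not covered, which is part of why the approach is complex.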

@moshababo
Contributor

@dshulyak still relevant?

@dshulyak dshulyak closed this as completed Aug 6, 2023
5 participants