Handle rollbacks I #184

ch1bo · 2022-01-30T17:22:53Z

What & Why

Currently, rollbacks are not handled by hydra-node. If a rollback occurs in Layer 1 and there are open Hydra Heads after the rolled-back fork point, hydra-node will crash. A basic ("dirt-road") solution ensuring that hydra-node remains operational in such a scenario is needed in order to run hydra-node on a public testnet.

Full support for rollbacks will be implemented at a later stage ("high-way" solution), see #185.

Requirements

The hydra-node does not crash when a rollback occurs
A ServerOutput is emitted to the client when a rollback occurs
When a not-yet-open head is rolled back, clients can continue opening (i.e. commit, collect or abort) the head after the rollback occurred (safety)
When an already-open head is rolled back "crossing the opening point", it is acceptable that the off-chain communication state is reset and we could continue or contest (no liveness)

Tasks

KtorZ · 2022-02-25T12:17:46Z

The solution we discussed for handling rollbacks is a 3-step approach, of which the first step was completed as part of #228 and #221.

Step 1

As a first step, we want to untangle the low-level construction and observation of on-chain transactions. We've been carrying around an OnChainHeadState with internal details needed to construct and observe those transactions. However, it'd be preferable to keep those details opaque to the higher levels of the chain stack, such that a state merely becomes an opaque blob of data that needs to be stored, with no way for upper layers to inspect it.

To achieve this, we create a State interface to sit between the higher level chain component and the low level transaction construction and observation. That state contains information needed to construct and observe transactions and encode the on-chain transitions of the Hydra protocol. The chain component has no way to peek into the state and can only modify it via the provided interface which is opaque by construction.

With this, the tracking of the on-chain head state within the chain component becomes trivial: some data is stored in a TVar.
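A minimal sketch of the idea, not the actual hydra-node code: all types and names below (`Tx`, `ChainState`, `mkState`, `handleTx`, the `Internal` representation) are hypothetical stand-ins. The state exposes only a function to observe a transaction, its internals live in a closure, and the chain component keeps it as a blob in a `TVar` (from the `stm` package).

```haskell
import Control.Concurrent.STM (TVar, atomically, newTVarIO, readTVar, writeTVar)

-- Hypothetical, heavily simplified stand-ins for the real types.
data Tx = InitTx | CommitTx | CollectComTx deriving (Eq, Show)

data OnChainTx = OnInitTx | OnCommitTx | OnCollectComTx deriving (Eq, Show)

-- | All the chain component gets to see: a way to observe a transaction,
-- yielding an event and the successor state. Internals stay in the closure.
newtype ChainState = ChainState
  { observeTx :: Tx -> Maybe (OnChainTx, ChainState)
  }

-- Internal representation (assumed); a real module would not export it.
data Internal = Idle | Initialized Int | Open

mkState :: Internal -> ChainState
mkState st = ChainState $ \tx -> case (st, tx) of
  (Idle, InitTx) -> Just (OnInitTx, mkState (Initialized 0))
  (Initialized n, CommitTx) -> Just (OnCommitTx, mkState (Initialized (n + 1)))
  (Initialized _, CollectComTx) -> Just (OnCollectComTx, mkState Open)
  _ -> Nothing

initialState :: ChainState
initialState = mkState Idle

-- | Chain component side: read the blob, observe, store the successor.
handleTx :: TVar ChainState -> Tx -> IO (Maybe OnChainTx)
handleTx var tx = atomically $ do
  st <- readTVar var
  case observeTx st tx of
    Nothing -> pure Nothing
    Just (event, st') -> writeTVar var st' >> pure (Just event)
```

A caller would create the variable with `newTVarIO initialState` and feed every transaction seen on chain to `handleTx`; it never pattern matches on the state itself.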

Step 2

After step 1, storing the on-chain head state within the chain component itself has become unnecessary. The chain component revolves around inputs it receives from the head logic (PostChainTx) and communicates back to the head logic via events (OnChainTx) yielded through a callback.

Thus, it is possible to make the entire approach more stateless by moving the content of the TVar to PostChainTx and OnChainTx. However, this requires a bit of rework of the various observation functions in the Direct.Tx module. In particular, most functions currently require some stateful information to ease the discovery of inputs and scripts, but in practice this information is unnecessary:

For observeInitTx this is "trivial" because there's no previous state anyway. So InitTx is only yielding stuff up.
observeCommitTx currently relies on the initials :: [TxIn] to find, in the inputs list, the input that was distributed during the initTx. This is unnecessary since (a) we statically know the initial script and (b) we have access to all transaction witnesses and redeemers. So we could figure this out entirely from the transaction itself.
observeAbortTx weirdly takes a UTXO for resolving inputs based on the head script address, in order to find which input of the transaction is the state-machine input! Instead, it could just look at the witnesses, find the appropriate redeemer for the head script and assert from there.
observeCollectComTx does the same weird thing as observeAbortTx with that passed UTXO.
observeCloseTx same.
observeFanoutTx same.
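To illustrate the redeemer-based lookup suggested in the list above, here is a rough sketch over a deliberately simplified transaction model. None of these types are the real cardano-api or hydra-node ones: `WitnessedInput`, `findHeadInput` and the one-argument `observeAbortTx` are assumptions made for illustration only.

```haskell
import Data.List (find)

-- Hypothetical, simplified model of a transaction.
newtype TxIn = TxIn Int deriving (Eq, Show)

newtype ScriptHash = ScriptHash String deriving (Eq, Show)

data HeadRedeemer = CollectCom | Abort | Close | Fanout deriving (Eq, Show)

-- | A spent input with the script witnessing it and its redeemer, if any.
data WitnessedInput = WitnessedInput
  { txIn :: TxIn
  , witness :: Maybe (ScriptHash, HeadRedeemer)
  }

newtype Tx = Tx {inputs :: [WitnessedInput]}

-- The head script is statically known (placeholder value here).
headScript :: ScriptHash
headScript = ScriptHash "head"

-- | Find the state-machine input from the transaction alone: no UTxO passed in.
findHeadInput :: Tx -> Maybe (TxIn, HeadRedeemer)
findHeadInput tx = do
  i <- find spendsHead (inputs tx)
  (_, redeemer) <- witness i
  pure (txIn i, redeemer)
 where
  spendsHead i = fmap fst (witness i) == Just headScript

-- | Observe an abort purely from the redeemer attached to the head input.
observeAbortTx :: Tx -> Maybe TxIn
observeAbortTx tx = case findHeadInput tx of
  Just (i, Abort) -> Just i
  _ -> Nothing
```

The same `findHeadInput` helper could serve the collect, close and fanout observations by matching on a different redeemer.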

So, what we ultimately need is to get the middle Direct.State layer to yield events that contain a new version of the on-chain head state, which remains opaque but can now be stored directly in the head logic and then be provided back for constructing transactions in PostChainTx.
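One possible shape for this, sketched with made-up types: the names echo the discussion, but the constructors and fields below are assumptions, not the hydra-node definitions. Events carry the successor chain state, the head logic stores it, and requests to post a transaction hand it back.

```haskell
-- | Opaque to the head logic in a real module (constructor not exported).
newtype ChainState = ChainState Int deriving (Eq, Show)

data OnChainTx = OnInitTx | OnCommitTx deriving (Eq, Show)

-- | An observation yields the event and the next chain state (assumed shape).
data ChainEvent = ChainEvent
  { observed :: OnChainTx
  , nextState :: ChainState
  }
  deriving (Eq, Show)

-- | Requests to post carry the chain state they should be built from.
data PostChainTx = PostCommitTx ChainState | PostAbortTx ChainState
  deriving (Eq, Show)

-- | The head logic stores the latest blob next to its own state.
data HeadState = HeadState
  { chainState :: ChainState
  , history :: [OnChainTx]
  }
  deriving (Eq, Show)

initialHeadState :: HeadState
initialHeadState = HeadState (ChainState 0) []

-- | Store whatever state the chain layer yielded, without inspecting it.
update :: HeadState -> ChainEvent -> HeadState
update hs ev = hs {chainState = nextState ev, history = observed ev : history hs}

-- | Hand the stored blob back when asking the chain layer to post.
requestAbort :: HeadState -> PostChainTx
requestAbort = PostAbortTx . chainState
```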

Step 3

Finally, once we've made the chain component stateless and centralized all state representation in the head logic state, we can move the head logic from an aggregated representation to a representation in the form of a sequence of events. Done naively, we can always reconstruct the required state by folding over the sequence of events.

After that, handling rollbacks becomes quite straightforward, as it only requires dropping the events that happened beyond the point of rollback and then replaying the application state from there.
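As a rough sketch of this event-sourced view, assuming a chain point can be modelled as a slot number and that the log is kept in chronological order: the `Event` and `HeadState` constructors and the functions `apply`, `replay` and `rollbackTo` are hypothetical, for illustration only.

```haskell
import Data.List (foldl')

-- Hypothetical: a chain point is modelled as a slot number.
type Slot = Int

data Event = Initialized | Committed Int | Opened | Closed
  deriving (Eq, Show)

data HeadState = Idle | Initializing Int | Open Int | ClosedState
  deriving (Eq, Show)

-- | Transition function; events that do not fit the current state are ignored.
apply :: HeadState -> Event -> HeadState
apply Idle Initialized = Initializing 0
apply (Initializing n) (Committed k) = Initializing (n + k)
apply (Initializing n) Opened = Open n
apply (Open _) Closed = ClosedState
apply st _ = st

-- | Events in chronological order, tagged with the slot they were observed at.
type Log = [(Slot, Event)]

-- | The aggregated state is just a fold over the events.
replay :: Log -> HeadState
replay = foldl' apply Idle . map snd

-- | Drop every event observed after the rollback point.
rollbackTo :: Slot -> Log -> Log
rollbackTo point = takeWhile ((<= point) . fst)
```

Handling a rollback to slot `p` then amounts to `replay (rollbackTo p eventLog)`.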

ch1bo · 2022-03-23T10:29:12Z

What does "remaining operational" actually mean? If we are fine with losing off-chain state, but always want to stay "in control" of the Head, we could do the following, for example (as a very simple dirt-road):

We record the "point" when a Head was initialized (or just before that)
We detect rollbacks and communicate that one happened to the client
We provide a way to start a hydra-node given a "point", or do that automatically whenever we see a rollback with the last known pre-initialization point
As a consequence, we will always re-synchronize the on-chain head state (assuming this is fast enough) from the init point, but may lose all the off-chain state on rollbacks.
Note that you would not be able to contest an already closed Head after this, even when the rollback did not impact the Head closing transactions, but we don't have contestation until Contest logic & OCV #192 anyway.
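The steps above could be sketched roughly as follows. This is an illustration under simplifying assumptions, not the hydra-node implementation: a point is a plain slot number, and `NodeState`, `Action` and `onRollback` are hypothetical names.

```haskell
-- Hypothetical: a chain point is modelled as a slot number.
type Point = Int

data NodeState = NodeState
  { initPoint :: Maybe Point -- recorded when (or just before) the Head was initialized
  , offChain :: [String] -- stand-in for the off-chain head state
  }
  deriving (Eq, Show)

data Action
  = NotifyClient String -- stand-in for an output sent to the client
  | ResyncFrom Point -- restart following the chain from this point
  deriving (Eq, Show)

-- | On rollback: always tell the client; if a Head is known, forget the
-- off-chain state and re-synchronize from the recorded init point.
onRollback :: Point -> NodeState -> (NodeState, [Action])
onRollback rolledBackTo st = case initPoint st of
  Nothing -> (st, [notify])
  Just p -> (st {offChain = []}, [notify, ResyncFrom p])
 where
  notify = NotifyClient ("rolled back to " ++ show rolledBackTo)
```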

ch1bo · 2022-03-23T10:32:55Z

An (easy) extension to ☝️ is to keep track of two points, an "init" point and an "open" point. That way, we could distinguish whether a rollback impacts the opening of a Head or not. Only when the opening of the head is impacted would we (suggest to) throw away the off-chain head state.
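A small sketch of that decision, assuming points are slot numbers and that a rollback to a point undoes everything strictly after it; `Points`, `Decision` and `decide` are hypothetical names.

```haskell
-- Hypothetical: a chain point is modelled as a slot number.
type Point = Int

data Points = Points
  { initPoint :: Point
  , openPoint :: Maybe Point -- Nothing while the Head is not yet open
  }

data Decision = KeepOffChainState | DiscardOffChainState
  deriving (Eq, Show)

-- | The off-chain state is only discarded when the rollback crosses the
-- point at which the Head was opened.
decide :: Point -> Points -> Decision
decide rolledBackTo pts = case openPoint pts of
  Just o | rolledBackTo < o -> DiscardOffChainState
  _ -> KeepOffChainState
```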

ch1bo · 2022-04-19T07:54:18Z

Updated the requirements to reflect our current approach.

ch1bo · 2022-04-19T09:58:15Z

Done as we also merged #310

ch1bo added the 💬 feature A feature on our roadmap label Jan 30, 2022

ch1bo added this to the Testnet maturity milestone Jan 30, 2022

ch1bo moved this to Todo in Hydra Head Roadmap Feb 2, 2022

ch1bo added green 💚 Low complexity or well understood feature L1 Affects the on-chain protocol of Hydra and removed L1 Affects the on-chain protocol of Hydra labels Feb 3, 2022

ch1bo removed this from the Testnet maturity milestone Mar 8, 2022

ch1bo mentioned this issue Mar 11, 2022

Create an ADR to get rid of the TVar in the Direct module #257

Closed

ch1bo added this to the 0.5.0 milestone Mar 23, 2022

KtorZ mentioned this issue Apr 6, 2022

Spike: see how we to test rollbacks and their consequences on the head lifecycle. #296

Closed

ch1bo closed this as completed Apr 19, 2022

Repository owner moved this from In Planning to Done in Hydra Head Roadmap Apr 19, 2022

pgrange mentioned this issue Apr 25, 2023

Commit vs rollbacks #827

Merged

4 tasks
