Change remaining contestation period to deadline #483
Conversation
This should solve all our FIXMEs but has a bit of a rat's tail:
- API and clients need to change
- Need to Delay "until" the deadline
- Either change Delay effect semantics (or introduce a new one), or bite the bullet and observe "blockchain time"
This makes it trivial to compare the contestation deadline with the latest observed time (from the chain layer).
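A minimal sketch of the idea (types and names are illustrative, not the actual hydra-node code): once the logic layer receives `Tick` events carrying the chain layer's view of time, deciding whether the deadline has passed is a plain comparison.

```haskell
import Data.Time (UTCTime)

-- Illustrative event type; the real HeadLogic has many more constructors.
newtype ChainEvent = Tick UTCTime

-- Hypothetical helper: with the contestation deadline stored in the closed
-- state, a 'Tick' makes the fanout check a pure comparison.
deadlinePassed :: UTCTime -> ChainEvent -> Bool
deadlinePassed contestationDeadline (Tick now) = now > contestationDeadline
```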
Usually we would reach for a `with` pattern now, but the model-based tests can't deal with that scoping. So let's start with a thread handle and add some cleanup functions next.
This required adding some information to the state about whether it was already sent (readyToFanoutSent). Furthermore, to be able to re-use the previous state value in that head logic step, factoring out a ClosedState was necessary.
Seems like this is failing now as IOSim cannot detect a deadlock with the new tick thread. Unless we find a way to do this, we need to drop this test now.
This is necessary as we cannot detect deadlock (impossible progress) anymore within our IOSim-based simulation of the chain. See the currently failing sanity test in BehaviorSpec.
This is not possible anymore as the hydra node is now continuously fed with Tick events and may respond eventually. So there is no deadlock to detect and we cannot exit the simulation early. Unfortunate, as it makes tests slower, but it is a more accurate simulation.
Also updated the e2e scenario to reflect the semantics accordingly. This is something which should also go into the API -> how can we not duplicate our docs?
This is necessary as we do not have the right constraints for re-querying things from the node in the actual handling of chain sync events. Besides.. when should we re-query things like protocol parameters? Likely not on each block?
This simplifies a couple of things:
* No need to inspect the (chain) state when posting the fanout, as the deadline is already in the HeadLogic / FanoutTx
* Using UTCTime does not lose precision when converting back and forth with slots (using the EpochInfo / History APIs)
* We do not need to query protocol parameters for the conversion, as we only work with protocol versions that do not require the time conversion "backport" (hack).
These two tests do not make any assertions on whether `Tick` events are yielded or not, so we simply ignore them in the `callback`.
The property tests highlight that `POSIXTime -> UTCTime -> POSIXTime` is lossless, while `UTCTime -> POSIXTime -> UTCTime` loses precision.
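The asymmetry comes from resolution: Plutus `POSIXTime` is in milliseconds while `UTCTime` resolves to picoseconds, so every millisecond value survives the round trip, but sub-millisecond fractions of a `UTCTime` do not. A QuickCheck-style sketch (`posixToUTCTime` and `posixFromUTCTime` are assumed conversion helpers, not necessarily the actual names):

```haskell
-- Sketch only; assumes conversion helpers with these (hypothetical) names.
prop_posixRoundtrip :: POSIXTime -> Property
prop_posixRoundtrip t =
  posixFromUTCTime (posixToUTCTime t) === t -- lossless: ms fits into ps

prop_utcRoundtrip :: UTCTime -> Property
prop_utcRoundtrip t =
  posixToUTCTime (posixFromUTCTime t) === t -- expected to FAIL below 1 ms
```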
We do not care about the exact number here, and it is different now because of the `Tick` events.
This is required to compute the remaining time as before, but this time in the client.
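For illustration, the client-side computation could look like this sketch (the function name is made up; `diffUTCTime` is from `Data.Time`):

```haskell
import Data.Time (NominalDiffTime, UTCTime, diffUTCTime)

-- Clients derive the remaining contestation period themselves from the
-- deadline now included in the server output.
remainingTime :: UTCTime -> UTCTime -> NominalDiffTime
remainingTime contestationDeadline now = contestationDeadline `diffUTCTime` now
```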
Although this is only for informational purposes, it makes the instrumentation in the e2e integration scenarios much simpler and no computation on when to expect the `ReadyToFanout` is required.
This makes it possible to only render the "[f]anout" option when it is actually possible, and makes it clearer that the deadline has passed but it's not yet ready.
The implementation of it will be in this same PR, so if approved it can be considered accepted.
This is now possible as the only usage for this was the 'ShouldPostFanout'.
Transaction Costs
Sizes and execution budgets for Hydra protocol transactions. Note that unlisted parameters are currently using
Cost of Init Transaction
Cost of Commit Transaction
Currently only one UTxO per commit is allowed (this is about to change soon)
Cost of CollectCom Transaction
Cost of Close Transaction
Cost of Contest Transaction
Cost of Abort Transaction
Some variation because of random mixture of still initial and already committed outputs.
Cost of FanOut Transaction
Involves spending head output and burning head tokens. Uses ada-only UTxO for better comparability.
TODOs on master: Total (153)
TODOs on branch: Total (146)
Unit Test Results: 251 tests +3, 245 ✔️ +3, 14m 33s ⏱️ +53s. Results for commit 719b0cd. ± Comparison against base commit 93b6daf. This pull request removes 7 and adds 10 tests. Note that renamed tests count towards both.
Performing the Model in ModelSpec still does, but that will change on another branch as well, so holding off on this.
This trades a TODO for an XXX as the generators are getting more and more messy.
+ Clients can now rely on `ReadyToFanout`, such that sending a `Fanout` input after seeing this output will never be "too early".
+ The `HeadIsClosed` server output now contains the deadline instead of the remaining time.
+ See `hydra-tui` for an example of how to use the `contestationDeadline` and `ReadyToFanout`.
+ See [ADR20](./docs/adr/2022-08-02_020-handling-time.md) for details and the rationale.
👍
simulateTicks nodes = forever $ do
  threadDelay blockTime
  now <- getCurrentTime
  readTVarIO nodes >>= \ns -> mapM_ (`handleChainEvent` Tick now) ns
👍
NewState
  -- XXX: Requires -Wno-incomplete-record-updates. Should refactor
  -- 'HeadState' to hold individual 'ClosedState' etc. types
  (cst{readyToFanoutSent = True})
Or simply write: `ClosedState{contestationDeadline, readyToFanoutSent = True}` 😅 ?
Unfortunately it's some more fields
| ClosedState
{ parameters :: HeadParameters
, confirmedSnapshot :: ConfirmedSnapshot tx
, previousRecoverableState :: HeadState tx
, contestationDeadline :: UTCTime
, -- | Tracks whether we have informed clients already about being
-- 'ReadyToFanout'.
readyToFanoutSent :: Bool
}
and I chose to disable the warning instead of the field noise
{ currentPointInTime = do
    currentSlotNo <- left show $ utcTimeToSlot now
    pt <- slotToUTCTime currentSlotNo
    pure (currentSlotNo, pt)
Looking at this now, I find this function's name confusing. The `current` prefix suggests that this is an effectful function that will get the current time, but it merely spits back whatever time it was when the handle was created.
Perhaps something like `lastKnownPointInTime`?
slotToUTCTime slot =
  -- NOTE: We are not using the Ledger.slotToPOSIXTime as we do not need the
  -- workaround for past protocol versions. Hence, we also do not need the
  -- protocol parameters for this conversion.
👍
-- time. We tried passing the current time from the caller but given the current machinery
-- around `observeSomeTx` this is actually not straightforward and quite ugly.
let event = OnCloseTx{snapshotNumber, remainingContestationPeriod = 0}
let ClosedThreadOutput{closedContestationDeadline} = threadOutput
🎉
| ReceivedTxs {onChainTxs :: [OnChainTx Tx], receivedTxs :: [Ledger.TxId StandardCrypto]}
| RolledForward {point :: SomePoint}
Mixed feelings about that addition, especially if we start using the chain-sync protocol more / as a replacement of the state-query. We don't want to be dumping 1000 log messages per second while fast syncing.
We are logging all `ReceivedTxs` on 0.7.0! We have been trimming that down to only log `TxId` on master, and I thought I'd add this as we are not logging chain points anywhere and do not always log `ReceivedTxs`. So I hoped that `RolledForward` would be a consistent point of reference in the logs and helpful for finding values for `--start-chain-from` :)
(CC @ffakenz we just spoke about where to find those.. here would also be a way)
@@ -196,8 +196,15 @@ withDirectChain tracer networkId iocp socketPath keyPair party cardanoKeys point
        (submitTx queue)
    )
    ( handle onIOException $ do
        -- NOTE: We can't re-query the time handle while the
        -- 'chainSyncHandler' is running due to constraints. So this will
        -- always use these initial parameters (as queried) for time conversions.
NOTE: We can't re-query the time handle while the 'chainSyncHandler' is running due to constraints.
What 😅 ? Can you clarify what you mean by that?
Querying things from the node involves quite some `IO` functions. However, the whole purpose of the `ChainSyncHandler` abstraction was to decouple the code for testability and not tie ourselves to `IO`, but allow running the handlers in `IOSim`. Hence we only have `MonadDelay`, `MonadTime`, etc. constraints on them. Also, I don't think that `MonadIO IOSim` is a thing.
Then, thinking about this.. I don't think we would want to re-query stuff from the cardano-node on each received block?
- `SystemStart` will never change -> ez
- `EraHistory` does change / extend in-between hard-forks? -> I have no idea.
Long story short, we query on start and this likely means we won't be able to cross hard-forks with changing slot lengths because of this.. but YAGNI?
EraHistory does change / extend in-between hard-forks? -> I have no idea.
It does, every era adds a new portion to it. But more importantly, the last era in the history has an upper bound which is at k blocks after the tip. Which means that it's already obsolete by the moment you've fetched it. As time goes by, your visible future shortens (how poetic) until the moment when the upper bound of your era history is in the past and any slot conversion will fail.
There are two ways around it.
- We can `unsafeExtendSafeZone` to infinity, which is basically the same as saying "we assume that all future eras will have the same protocol parameters w.r.t. slot arithmetic". Possibly okay-ish if we also allow restarting a hydra-node without losing a head (such that, in case the parameters in a next hard-fork are bound to change, we can ship an update of the hydra-node which can accommodate this).
- We fetch a new `EraHistory` every time we need to calculate time. I imagine that defining a `MonadEraHistory` wouldn't be too complicated, with an `IOSim` instance that returns a history with no upper bound for testing, and runs in plain IO for production.
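A rough sketch of what such a class could look like (the names are assumptions for illustration, not existing hydra-node API):

```haskell
-- Hypothetical sketch; 'EraHistory' stands in for the cardano-api type.
class Monad m => MonadEraHistory m where
  -- | Fetch an up-to-date era history for slot/time conversions.
  queryEraHistory :: m EraHistory

-- A production instance would query the cardano-node on each call, while
-- an IOSim instance could return a history with no upper bound so that
-- slot conversions never fail in simulation.
```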
That would mean that the current interpreter will fail after being used for longer than the safe zone.. we should validate that, but a `MonadTimeHandle` which could be used to mock `queryTimeHandle` out for IOSim is a good idea.
When this happens, we don't expect that chain sync handler to continue to work. In fact, the whole process might be ill-fated and requires a restart. So this is really an exceptional case (and can't be handled right now).
So, good for me if and only if we record the work left to be done in an issue and it gets addressed promptly.
🍁 TLDR: Addresses a couple of `FIXME`s in the realm of knowing "when we can fanout". We took short-cuts when computing a remaining contestation period and also used the `Delay` to know when to send a `ReadyToFanout` message to our clients. Unfortunately this output was not very reliable, as ADR20 cares to explain.

🍁 Avoid workarounds of setting `remainingContestationPeriod` to 0 in `Direct.State`, but then filling it in the `Direct.Handlers` modules by "just use the `contestationDeadline`". This ripples through the system as we change the `OnCloseTx` event etc.

🍁 Instead of continuing the workaround via `Delay` effects, this PR bites the bullet and introduces a `Tick` event to communicate time from the `Chain` to the `Logic` layer. This made `Delay` redundant and we can remove that wart as well.

🍁 Reworked the `TimeHandle` functions as I was starting to use `UTCTime` everywhere. This results in much less `import Plutus.V2.Ledger.Api` all over the place.

🍁 With these regular `Tick` events, we can send `ReadyToFanout` when we really know that the deadline has passed and a `Fanout` input will be able to construct the `fanoutTx`.

🍁 Update clients and fix test suites to use the `contestationDeadline` to wait more robustly on `ReadyToFanout`.

🍁 Also needed to work around some issues with `IOSim` not bailing on us anymore (which we even relied on in the `ModelSpec`).

🍁 Re-generated golden files and updated JSON specs round off the picture and make this a HUGE PR.. I'm sorry, but it was fun and I think this is needed 😅