-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Less dumb heartbeat #33
Conversation
This does not yet change the way we send Pings and handle them in the node.
We want to know which _Party_ is connected/disconnected which might not necessarily be the _Host_ we are directly connected to. Using Party identifier to send Heartbeat and track which ones are Connected/Disconnected seems more robust should the topology of the network evolve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Liveness should be determined using something >> than the hearbeat delay. Maybe double?
I have addressed comments from the review, and also tried to make tests and code clearer, the latter by separating threads for sending heartbeats and checking peers. |
@@ -70,7 +69,8 @@ instance Tx tx => SignableRepresentation (Snapshot tx) where | |||
getSignableRepresentation = encodeUtf8 . show @Text | |||
|
|||
data ClientResponse tx | |||
= PeerConnected Host | |||
= PeerConnected Party | |||
| PeerDisconnected Party |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would say PartyConnected
and PartyDisconnected
is more correct as this is really based on the hydra-node (with a configured Party
) sending us pings?
|
||
shouldSendHeartbeat :: Time -> HeartbeatState -> Bool | ||
shouldSendHeartbeat now HeartbeatState{lastSent} = | ||
maybe True (checkTimeout id now) lastSent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be quite a bit clearer to me if this would be
maybe True (checkTimeout id now) lastSent | |
now > lastSent + livenessDelay |
updateSuspected heartbeatState now = | ||
atomically $ do | ||
aliveParties <- alive <$> readTVar heartbeatState | ||
let timedOutParties = Map.filter (checkTimeout (2 *) now) aliveParties |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Analogous to shouldSendHeartbeat
, a function with something like
checkTimedOut now lastSeen = now > lastSeen + 2 * livenesDelay
Would be easier to read and understand IMO (instead of passing a modifier to a re-used checkTimeout
function.
let sentHeartbeats = runSimOrThrow $ do | ||
sentMessages <- newTVarIO ([] :: [HydraMessage SimpleTx]) | ||
sentMessages <- newTVarIO ([] :: [Heartbeat (HydraMessage Integer)]) | ||
|
||
withHeartbeat testHost (dummyNetwork sentMessages) noop $ \_ -> do | ||
withHeartbeat 1 (captureOutgoing sentMessages) noop $ \_ -> | ||
threadDelay 1.1 | ||
|
||
readTVarIO sentMessages | ||
|
||
sentHeartbeats `shouldBe` replicate 2 (Ping testHost) | ||
sentHeartbeats `shouldBe` [Ping 1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having the assertions in the runSimOrThrow
would be quite a bit easier to read, i.e
runSimOrThrow $ do
sentMessages <- newTVarIO ([] :: [Heartbeat (HydraMessage Integer)])
withHeartbeat 1 (captureOutgoing sentMessages) noop $ \_ ->
threadDelay 1.1
sentHeartbeats <- readTVarIO sentMessages
sentHeartbeats `shouldBe` [Ping 1]
Where the newTVarIO
and readTVarIO
could be also moved into (typed) boilerplate functions with drafted usage:
runSimOrThrow $ do
(capture, getSentHeartbeats) <- captureOutgoing
withHeartbeat 1 capture noop $ \_ ->
threadDelay 1.1
getSentHeartbeats `shouldReturn` [Ping 1]
let component incoming action = | ||
race_ | ||
(action (Network noop)) | ||
(incoming (Ping 2) >> threadDelay 4 >> incoming (Ping 2) >> threadDelay 7) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having some interpretations to these numbers would've been great, e.g. slightlyMore
and moreThanDouble
, where these variables would also be parameterized by the actual livenessDelay
This is a possibly more useful implementation of a Heartbeat or rather a Failure detector.