Whenever a node leaves the cluster and then rejoins, the node that stayed in the cluster logs failed acks and `handler queue full` warnings.
Is there any clean-up I need to do to clear the queue or the acks on rejoin? Joining works fine until a node leaves and then tries to rejoin.
Scenario:
1. Node A starts as the first node in the cluster.
2. Node B joins the cluster; the connection is fine.
3. Node B is killed; after 3 failed acks it is marked as dead.
4. Node B is restarted, and the following messages are seen:
Node A:
A node has joined: m-127.0.0.1:8888
## NODE B KILLED
2020/11/19 12:14:33 [DEBUG] memberlist: Failed ping: m-127.0.0.1:8888 (timeout reached)
2020/11/19 12:14:34 [INFO] memberlist: Suspect m-127.0.0.1:8888 has failed, no acks received
2020/11/19 12:14:34 [INFO] memberlist: Suspect m-127.0.0.1:8888 has failed, no acks received
2020/11/19 12:14:36 [INFO] memberlist: Suspect m-127.0.0.1:8888 has failed, no acks received
2020/11/19 12:14:37 [INFO] memberlist: Suspect m-127.0.0.1:8888 has failed, no acks received
2020/11/19 12:14:37 [INFO] memberlist: Marking m-127.0.0.1:8888 as failed, suspect timeout reached (0 peer confirmations)
A node has left: m-127.0.0.1:8888
## NODE B RESTARTED
2020/11/19 12:14:39 [DEBUG] memberlist: Stream connection from=127.0.0.1:53980
2020/11/19 12:14:42 [WARN] memberlist: handler queue full, dropping message (3) from=127.0.0.1:8888
2020/11/19 12:14:43 [WARN] memberlist: handler queue full, dropping message (3) from=127.0.0.1:8888
Changing the port and name doesn't seem to make a difference: the behavior is the same whether or not the node rejoins with the same name.
Node B (on rejoin):
2020/11/19 12:14:39 [DEBUG] memberlist: Initiating push/pull sync with: 127.0.0.1:4444
2020/11/19 12:14:39 [WARN] memberlist: Refuting a suspect message (from: m-127.0.0.1:8888)
A node has joined: m-127.0.0.1:4444
2020/11/19 12:14:40 [INFO] memberlist: Suspect m-127.0.0.1:4444 has failed, no acks received
2020/11/19 12:14:42 [INFO] memberlist: Suspect m-127.0.0.1:4444 has failed, no acks received
2020/11/19 12:14:43 [INFO] memberlist: Marking m-127.0.0.1:4444 as failed, suspect timeout reached (0 peer confirmations)
A node has left: m-127.0.0.1:4444
2020/11/19 12:14:43 [INFO] memberlist: Suspect m-127.0.0.1:4444 has failed, no acks received
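For anyone comparing the two logs above, it can help to tally the entries by level so the `[WARN]` drops stand out from the `[INFO]` suspect messages. The following is a minimal sketch using only the Go standard library. It assumes the line layout seen in the logs above (date, time, `[LEVEL]`, then `memberlist:` and the message). The names `parseLine` and `countByLevel` are hypothetical helpers for illustration and are not part of memberlist.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// logLine matches lines shaped like the memberlist output in this issue, e.g.
// 2020/11/19 12:14:42 [WARN] memberlist: handler queue full, dropping message (3) from=127.0.0.1:8888
var logLine = regexp.MustCompile(`^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(\w+)\] memberlist: (.*)$`)

// entry holds the three parts extracted from one log line.
type entry struct {
	Timestamp string
	Level     string
	Message   string
}

// parseLine returns the parsed entry, or false for lines that do not match
// the assumed layout (for example the "A node has joined" event lines).
func parseLine(line string) (entry, bool) {
	m := logLine.FindStringSubmatch(strings.TrimSpace(line))
	if m == nil {
		return entry{}, false
	}
	return entry{Timestamp: m[1], Level: m[2], Message: m[3]}, true
}

// countByLevel counts the matching lines in a multi-line log per level.
func countByLevel(log string) map[string]int {
	counts := make(map[string]int)
	for _, line := range strings.Split(log, "\n") {
		if e, ok := parseLine(line); ok {
			counts[e.Level]++
		}
	}
	return counts
}

func main() {
	// A few lines copied from the Node A log above.
	sample := `A node has left: m-127.0.0.1:8888
2020/11/19 12:14:39 [DEBUG] memberlist: Stream connection from=127.0.0.1:53980
2020/11/19 12:14:42 [WARN] memberlist: handler queue full, dropping message (3) from=127.0.0.1:8888
2020/11/19 12:14:43 [WARN] memberlist: handler queue full, dropping message (3) from=127.0.0.1:8888`

	counts := countByLevel(sample)

	// Sort the levels so the printed order is stable.
	levels := make([]string, 0, len(counts))
	for level := range counts {
		levels = append(levels, level)
	}
	sort.Strings(levels)
	for _, level := range levels {
		fmt.Printf("%s: %d\n", level, counts[level])
	}
}
```

Lines that do not follow the assumed layout, such as the join and leave notifications printed by the application itself, are skipped rather than counted.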