Our current ping system uses what is known as the heartbeat approach. It's robust but quite overloaded. Basic SWIM has the following properties:
an arbitrary node will choose a random ping target (M_i) at each ping period.
if M_i answers successfully, the period ends.
if not, the node will choose K random ambassadors for indirect probing. It will send ping-req(M_i) to these K ambassadors. If all of them return a negative response, indicating that M_i is down, M_i will be kicked.
if M_i is detected as down, this message will be multicast to all other nodes.
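The steps above can be sketched roughly as follows. This is only an illustration of one probe period, not our implementation: the `Transport` interface, `pickRandom`, and `probeRound` are hypothetical names, and timeouts are assumed to be handled inside the transport.

```typescript
// Hypothetical types: none of these names come from the existing codebase.
type NodeId = string;

interface Transport {
  // Resolves true if `target` acknowledged a direct ping before the timeout.
  ping(target: NodeId): Promise<boolean>;
  // Asks `ambassador` to probe `target` on our behalf; true if an ack came back.
  pingReq(ambassador: NodeId, target: NodeId): Promise<boolean>;
  // Announces to every other member that `target` is down.
  multicastDown(target: NodeId): Promise<void>;
}

// Pick up to `count` distinct random elements (partial Fisher-Yates shuffle).
function pickRandom<T>(
  items: readonly T[],
  count: number,
  rand: () => number = Math.random,
): T[] {
  const pool = [...items];
  const n = Math.min(count, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(rand() * (pool.length - i));
    const tmp = pool[i] as T;
    pool[i] = pool[j] as T;
    pool[j] = tmp;
  }
  return pool.slice(0, n);
}

// One ping period. Returns the target if it was declared down, otherwise null.
async function probeRound(
  self: NodeId,
  members: readonly NodeId[],
  k: number,
  transport: Transport,
  rand: () => number = Math.random,
): Promise<NodeId | null> {
  const others = members.filter((m) => m !== self);
  const target = pickRandom(others, 1, rand)[0];
  if (target === undefined) return null; // nobody else to probe

  // Direct probe: a successful answer ends the period.
  if (await transport.ping(target)) return null;

  // Indirect probe through up to K ambassadors (fewer if the cluster is small).
  const ambassadors = pickRandom(others.filter((m) => m !== target), k, rand);
  const results = await Promise.all(
    ambassadors.map((a) => transport.pingReq(a, target).catch(() => false)),
  );
  if (results.some((ok) => ok)) return null; // at least one ambassador reached it

  // The direct probe and every indirect probe failed: announce the failure.
  await transport.multicastDown(target);
  return target;
}
```

Passing `rand` explicitly is only there to make the sketch testable; a real scheduler would call `probeRound` once per `pingInterval`.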
An important note: because the failure detection phase is quite strict (1 + K nodes must fail to probe a node before it is marked as down), this phase is based on trust. Every node in the system trusts this multicast message and applies it without any further investigation.
At first sight, our current pingInterval and onPingReceive are enough for this approach. One big drawback is that our current ping messages carry a huge weight (serializedPathTree + nodes). One major step toward being more efficient is to separate these distinct concerns. SWIM emphasizes separating failure detection from update dissemination; in our terminology, that means separating failure detection, update dissemination, and service dissemination.
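One way to picture this separation is to give each concern its own message kinds. The shapes below are assumptions for illustration only: `serializedPathTree` and `nodes` are the names from our current ping payload, but their types, the `kind` tags, `seq`, and `concernOf` are hypothetical.

```typescript
// Failure detection messages stay tiny: no path tree, no node list.
type FailureDetectionMessage =
  | { kind: "ping"; seq: number }
  | { kind: "ack"; seq: number }
  | { kind: "ping-req"; seq: number; target: string };

// Assumed shapes: the real types of `nodes` and `serializedPathTree` may differ.
type UpdateMessage = { kind: "membership-update"; nodes: string[] };
type ServiceMessage = { kind: "service-update"; serializedPathTree: string };

type Message = FailureDetectionMessage | UpdateMessage | ServiceMessage;

type Concern =
  | "failure-detection"
  | "update-dissemination"
  | "service-dissemination";

// Maps a message to the concern that should handle it.
function concernOf(msg: Message): Concern {
  switch (msg.kind) {
    case "ping":
    case "ack":
    case "ping-req":
      return "failure-detection";
    case "membership-update":
      return "update-dissemination";
    case "service-update":
      return "service-dissemination";
  }
}
```

With a split like this, the heavy payloads only travel when membership or services actually change, not on every ping period.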
To take this even further, it's fair to say that a single UDP packet, or at most a TCP packet, is enough for the failure detection phase. Hence HTTP is a complete waste of resources here, and I MUST start implementing the TCP/UDP transport layer.
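To give a feel for how small a failure detection probe can be, here is a minimal sketch of a binary encoding. The wire format (1 byte for the message type plus a 4 byte big-endian sequence number) and the names `PING`, `ACK`, `encodeProbe`, and `decodeProbe` are assumptions made up for this example, not an agreed protocol.

```typescript
// Hypothetical message type codes.
const PING = 1;
const ACK = 2;

// Encodes a probe as 5 bytes: [type][seq as big-endian uint32].
function encodeProbe(type: number, seq: number): Uint8Array {
  const buf = new Uint8Array(5);
  const view = new DataView(buf.buffer);
  view.setUint8(0, type);
  view.setUint32(1, seq); // DataView writes big-endian by default
  return buf;
}

// Decodes a 5 byte probe; throws if the length does not match the format.
function decodeProbe(bytes: Uint8Array): { type: number; seq: number } {
  if (bytes.byteLength !== 5) throw new Error("unexpected probe length");
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return { type: view.getUint8(0), seq: view.getUint32(1) };
}
```

A 5 byte payload like this fits comfortably in one datagram, compared with an HTTP request that carries headers plus serializedPathTree and nodes. In Node.js, a `Uint8Array` can be handed directly to a `dgram` socket's `send`.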