Leader reelections triggered by broken follower in healthy cluster #9563
The isolated follower may hit its election timeout and become a candidate, triggering an election. But the vote messages from the isolated follower never reach the other peers, so the active leader won't step down (which would happen, e.g., on receiving a message from a node with a higher term) while packet dropping is active. For Scenario 1, do you observe leader election after For Scenario 2, do you observe leader election on step 5 (server logs from the current leader and the new leader would also help to confirm)? Note: Similar scenarios are already covered by our functional testing, but the exact patterns above should be added to make sure.
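A minimal sketch of the isolation phase described above, assuming a simplified Raft model without pre-vote. This is not etcd's code: `Node`, `election_timeout`, `deliver`, the node name `etcd0`, and the starting term 2 are hypothetical and chosen only for illustration. It shows that every election timeout raises the isolated follower's term while the leader, which never receives the dropped vote requests, keeps its term and role.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    term: int
    role: str  # "leader", "follower", or "candidate"


def election_timeout(node):
    """No heartbeat arrived in time: bump the term and start campaigning."""
    node.term += 1
    node.role = "candidate"
    return {"type": "vote_request", "term": node.term, "sender": node.name}


def deliver(msg, node, dropped):
    """Deliver msg to node unless the packet filter dropped it."""
    if dropped:
        return  # models the DROP rules on port 2380
    if msg["term"] > node.term:
        # Standard Raft rule: a higher term forces a step-down.
        node.term = msg["term"]
        node.role = "follower"


leader = Node("etcd0", term=2, role="leader")      # hypothetical name and term
isolated = Node("etcd1", term=2, role="follower")  # the blocked follower

# Three election timeouts pass while all peer traffic is dropped.
for _ in range(3):
    vote_request = election_timeout(isolated)
    deliver(vote_request, leader, dropped=True)

print(isolated)
print(leader)
```

In this model the isolated node ends up three terms ahead of a leader that is still serving the rest of the cluster, which is the state the cluster is in just before the traffic is unblocked.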
Yes - there is a single election after 5.
After unblocking, I start observing re-elections. I can reproduce those 100% of the time - I can't do it now, but I can send you some logs tomorrow morning.
To provide some logs: these are from Scenario 1. As I wrote, I can reproduce it 100% of the time with the instructions above. etcd1.log: the follower on which I blocked traffic and then unblocked outgoing traffic. Let me know if that's useful and whether you need any other information.
@gyuho - did you have a chance to look into that?
@wojtek-t Sorry for the delay. Will look into it this week.
@wojtek-t I looked at your server logs and was also able to reproduce:
Here's what happens in etcd server and Raft
Steps 6 and 7 are necessary to prevent an isolated node from being stuck, so this disruptive election is inevitable (even in 3.3 and the master branch, even with pre-vote enabled).
Just to clarify
Whether there were writes or not, the election would still happen as soon as the isolated follower regains its connectivity.
OK - so what you're saying is that this is "by-design" and there are no plans to change that behavior, right?
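The point that writes do not matter can be illustrated with a simplified version of the Raft term rule. This is a hedged sketch, not etcd's implementation: `handle_message`, the tuple layout, and the concrete terms and log lengths are assumptions made for the example. The step-down decision compares terms only, so the leader's log length (a stand-in for "were there writes?") plays no role.

```python
def handle_message(state, msg_term):
    """Apply the Raft term rule to a node receiving a message.

    state is (term, role, log_len); log_len is carried along only to
    show that it never influences the step-down decision.
    """
    term, role, log_len = state
    if msg_term > term:
        # Any message from a higher term makes the receiver a follower.
        return (msg_term, "follower", log_len)
    return (term, role, log_len)


# Hypothetical leader states at term 2: one idle, one that accepted writes.
idle_leader = (2, "leader", 10)
busy_leader = (2, "leader", 500)

# After unblocking, a message from the rejoining node (term 5) arrives.
print(handle_message(idle_leader, 5))
print(handle_message(busy_leader, 5))

# A message at the leader's own term changes nothing.
print(handle_message(idle_leader, 2))
```

Both leaders step down in this model, with or without writes; a new election at the higher term then decides who leads next.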
Etcd version: 3.1.11 or 3.1.13 (both behave the same)
Scenario 1
I've done the following experiment:
-A INPUT -i eth0 -p tcp -m tcp --dport 2380 -j DROP
-A OUTPUT -o eth0 -p tcp -m tcp --dport 2380 -j DROP
This triggered a leader re-election, even though the rest of the cluster (the leader and the second follower) was healthy the whole time.
Is that expected behavior?
Scenario 2
The second experiment I've done was similar:
-A INPUT -i eth0 -p tcp -m tcp --dport 2380 -j DROP
-A OUTPUT -o eth0 -p tcp -m tcp --dport 2380 -j DROP
-A INPUT -i eth0 -p tcp -m tcp --dport 2380 -j DROP
-A OUTPUT -o eth0 -p tcp -m tcp --dport 2380 -m statistic --mode random --probability 0.5 -j DROP
This one triggers a stream of leader re-elections - a re-election happens every few seconds.
[This may be a somewhat artificial experiment, but it sounds like a bug to me.]
Question
Can you please clarify what (from your perspective) is the "by-design" outcome in those situations?
[In an ideal world, I would expect that no leader re-elections happen in a healthy cluster, but maybe that's not the case in Raft.]
@gyuho @xiang90 @jpbetz @mborsz