Infinite re-election loop on leader node rejoin #2837
Labels: theme/internal-cleanup, type/bug
consul version for Server

consul info for Server
Server:
Operating system and Environment details
```
docker run \
  --memory=128m \
  --cpu-shares=128 \
  --net=custom-net-1 \
  -e GOMAXPROCS=4 \
  --publish=8500:8500 \
  --publish=10.2.2.1:8400:8400 \
  --publish=10.2.2.1:8300-8302:8300-8302 \
  --publish=10.2.2.1:8300-8302:8300-8302/udp \
  custom-consul-image:0.7.0 \
  # Consul server parameters ...
```
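For completeness, the user-defined network referenced by --net=custom-net-1 would have been created along these lines (a sketch only; the network name is taken from the command above and the subnet follows the ${ipsubnet} values listed in the reproduction steps):

```
# Sketch: create the user-defined bridge network the container attaches to.
# The subnet per node follows the ${ipsubnet} values below.
docker network create --subnet=10.2.2.0/24 custom-net-1
```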
Note: the Docker container does not mount /var/consul/data, so leaving the cluster purges the Raft database and all other data. The agent configuration is loaded from /etc/consul.d/config.json and the health-check configuration from /etc/consul.d/node.json (contents have been modified), respectively.
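A minimal sketch of what such a server configuration might contain for this setup is shown below. The original file contents are not reproduced here, so every value is illustrative (addresses follow the node with IP 10.2.2.1); only config.json is sketched, not the health-check file:

```
{
  "server": true,
  "bootstrap_expect": 3,
  "datacenter": "dc1",
  "data_dir": "/var/consul/data",
  "bind_addr": "10.2.2.1",
  "client_addr": "0.0.0.0",
  "retry_join": ["10.2.1.1", "10.2.3.1"]
}
```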
Description of the Issue (and unexpected/desired result)
After the Consul leader gracefully leaves the cluster and later rejoins with the same IP address, replicated log entries sent to that server keep getting rejected. The expected result is that replicated logs are committed on the rejoined member of the cluster.
Reproduction steps
1. Create a three-node cluster with an identical configuration on each node, where:
   - ${ipsubnet} takes one of 10.2.1.0/24, 10.2.2.0/24, 10.2.3.0/24;
   - ${ipaddress} takes one of 10.2.1.1, 10.2.2.1, 10.2.3.1, respectively.
2. Wait until the cluster finishes the bootstrapping stage (in my case the node with IP 10.2.2.1 takes the leadership of the cluster and starts the replication routines).
3. Gracefully remove the leader node from the Consul cluster (see the command sketch after these steps).
   Logs from the former Consul leader:
4. Wait until a new leader is re-elected.
5. Join the detached member back to the Consul cluster, with the same IP address and --bootstrap-expect set to 3.
6. Notice that, due to heart-beat timeout expiration caused by the accumulation of uncommitted logs on the rejoined member, leader re-election never ends:
```
$ grep -i 'log not found' 10-2-1-1-consul-after.log
2017/03/28 11:06:06 [WARN] raft: Failed to get previous log: 247 log not found (last: 0)
2017/03/28 11:06:48 [WARN] raft: Failed to get previous log: 261 log not found (last: 260)
2017/03/28 11:07:48 [WARN] raft: Failed to get previous log: 269 log not found (last: 268)
2017/03/28 11:08:48 [WARN] raft: Failed to get previous log: 278 log not found (last: 277)
...
2017/03/28 11:14:07 [WARN] raft: Failed to get previous log: 331 log not found (last: 330)
2017/03/28 11:15:07 [WARN] raft: Failed to get previous log: 342 log not found (last: 341)
2017/03/28 11:16:07 [WARN] raft: Failed to get previous log: 353 log not found (last: 352)
...
2017/03/28 11:28:44 [WARN] raft: Failed to get previous log: 472 log not found (last: 465)
2017/03/28 11:28:44 [WARN] raft: Failed to get previous log: 476 log not found (last: 475)
```
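A rough sketch of the commands around steps 2-5; the container name is hypothetical and the exact invocations from the original report are not reproduced, only the standard Consul commands and HTTP status endpoint are assumed:

```
# Steps 2 and 4: check which node currently holds leadership
# (standard Consul HTTP status endpoint).
curl -s http://10.2.2.1:8500/v1/status/leader

# Step 3: gracefully remove the current leader from the cluster.
# The container name is hypothetical; "consul leave" performs a graceful leave.
docker exec consul-server-2 consul leave

# Step 5: start the same container again with the same IP address and
# --bootstrap-expect=3, then join it back to the remaining members.
docker exec consul-server-2 consul join 10.2.1.1 10.2.3.1
```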
According to my limited understanding, only the event of log commitment updates the leader heart-beat timer on the follower nodes; the expired timer then causes leader re-election over and over.
Log Fragments
Logs from the server before it left the cluster: 10-2-1-1-consul-before.log.
Logs from the server after the rejoin to the cluster: 10-2-1-1-consul-after.log.
Logs from another server node: 10-2-3-1-consul.log.
Note: I validated this scenario on versions 0.6.4 and 0.7.1 as well and was always able to reproduce it.