Overview of the Issue
We have a Consul v1.19.2 cluster of 5 servers in 2 datacenters connected by VPN. The overall stability of the cluster is good, but when the connection between the datacenters is lost and then restored, we temporarily lose service discovery at the moment the connection comes back. The following record appears in the Consul log:
<133>1 2024-11-28T00:09:46.231877+01:00 consul6.cloud.local consul 12196 - - 2024-11-28T00:09:46.231+0100 [ERROR] agent.http: Request error: method=GET url="/v1/health/service/bderp?filter=%28not+%28Checks.Status%3D%3Dcritical%29+and+%28Checks.CheckID%21%3DserfHealth%29%29" from=10.192.8.140:38318 error="No cluster leader"
As I understand it, the sequence is as follows:
The cluster works correctly, and the leader is on the 'main' side (where 3 Consul servers are installed);
The connection to the 'secondary' datacenter is lost, the 2 servers in the 'secondary' datacenter are removed from the configuration on the 'main' side, and everything keeps working correctly on the 'main' side;
The connection is restored, the two servers on the 'secondary' side reconnect to the cluster, the current leader steps down, and an election starts;
At this moment a client tries to discover a service by sending a "/v1/health/service/..." request to a server, and the request fails because the cluster has no leader;
Everything returns to normal once a new leader is elected.
Maybe this behavior is 'by design' and we need to tweak our configuration to avoid failures in service discovery during leader election. Any advice is welcome.
Reproduction Steps
Install a Consul cluster with at least 3 servers
Add at least one service
Cut the network connection of one server
Restore the network connection of the disconnected server
Immediately run curl to get the list of healthy services
Operating system and Environment details
Consul v1.19.2 on FreeBSD 14.0 x64
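For reference, here is a rough client-side sketch of what we are considering as a workaround while the election is in progress. It is not a confirmed fix. It relies on two ideas: Consul's documented `stale` consistency mode, which lets any server answer a read without forwarding it to the leader, and a short retry with backoff when the response still says "No cluster leader". The helper names (`health_url`, `fetch_with_retry`, `http_fetch`), the retry counts, and the address `http://127.0.0.1:8500` are assumptions made for illustration only. Stale reads may return slightly outdated results, so this only fits clients that can tolerate that.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request


def health_url(base, service, stale=True):
    """Build a /v1/health/service/<name> URL, optionally in stale mode."""
    url = f"{base}/v1/health/service/{urllib.parse.quote(service)}"
    # `stale` is a Consul consistency mode: any server may answer the read.
    return url + "?stale" if stale else url


def http_fetch(url):
    """Fetch and decode JSON; surface Consul's error body as RuntimeError."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        # Consul reports "No cluster leader" in the body of an error response.
        raise RuntimeError(err.read().decode(errors="replace")) from err


def fetch_with_retry(url, fetch=http_fetch, attempts=5, delay=0.5, sleep=time.sleep):
    """Retry only while the error says the cluster has no leader."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except RuntimeError as err:
            if "No cluster leader" not in str(err):
                raise  # unrelated failure: do not hide it behind retries
            last_error = err
            sleep(delay * 2 ** attempt)  # exponential backoff (assumed values)
    raise last_error


if __name__ == "__main__":
    # Assumed local agent address and the service name from the log above.
    url = health_url("http://127.0.0.1:8500", "bderp")
    print(fetch_with_retry(url))
```

The `fetch` and `sleep` parameters are injectable so that the retry logic can be exercised without a running cluster. Whether stale reads fully cover the election window in our setup is something we still need to verify.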