Consul's Kubernetes integration currently deploys a sidecar that re-registers the pod's service with the local agent every 10 seconds.
This is sadly necessary for now because the client agent DaemonSet is stateless and so will lose API-driven registrations on a restart. Alternative solutions to prevent that currently have prohibitive costs associated with them.
We came up with this mechanism under the shared assumption that the local Consul agent will just ignore the repeated attempts to register the same service, and anti-entropy won't be invoked because "nothing changed", so the behaviour won't impact Consul servers. We even encoded that assumption in the code comments here: https://github.com/hashicorp/consul-k8s/blob/1518dcc6991b1b5e80198dac6a5a28102f4fd38a/subcommand/lifecycle-sidecar/command.go#L93-L97
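For illustration only, the re-registration pattern amounts to a loop like the sketch below, written against the public `github.com/hashicorp/consul/api` client with a made-up service. The real lifecycle sidecar is implemented differently, but the effect on the local agent is the same.

```go
package main

import (
	"log"
	"time"

	"github.com/hashicorp/consul/api"
)

func main() {
	// Talk to the local client agent (defaults to 127.0.0.1:8500).
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Hypothetical pod service; the real sidecar gets this from the
	// injected pod configuration.
	reg := &api.AgentServiceRegistration{
		ID:      "web-abc123",
		Name:    "web",
		Address: "10.0.0.12",
		Port:    8080,
	}

	// Re-register every 10 seconds so the registration survives a
	// client agent restart.
	for range time.Tick(10 * time.Second) {
		if err := client.Agent().ServiceRegister(reg); err != nil {
			log.Printf("re-register failed: %s", err)
		}
	}
}
```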
It turns out this isn't actually how Consul works! I labelled this an enhancement since it technically never worked this way and we never documented that it did, but since many of us assumed it did we could consider it a bug!
The reason for the confusion is that we have two other mechanisms like this:
- In agent anti-entropy, our periodic syncs actively diff local state against the servers and only re-register a service if there is drift.
- The servers themselves check inside the FSM whether the registration request actually changes the service or check definition. If it doesn't, it becomes a no-op to avoid pointlessly waking up potentially thousands of watchers to deliver a message with no new data in it. But the server has already processed the RPC and made a full trip through raft by that point, so this only optimizes away the blocking-query broadcast, not the actual cost of the write on the servers (a simplified sketch follows below).
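To make that second point concrete, here is a heavily simplified sketch of the server-side check. The types and function names here are hypothetical stand-ins, not Consul's real FSM or state store; the point is only that the duplicate is detected after the raft replication has already happened.

```go
package fsmsketch

// Hypothetical, simplified stand-ins for Consul's state store and FSM
// types; real names and shapes differ.

type Service struct {
	ID, Name, Address string
	Port              int
}

// IsSame reports whether two registrations carry identical data.
func (s *Service) IsSame(o *Service) bool {
	return s.ID == o.ID && s.Name == o.Name && s.Address == o.Address && s.Port == o.Port
}

type stateStore struct {
	services map[string]*Service
	index    uint64 // blocking queries watch this index
}

// applyRegister runs inside the FSM, i.e. *after* the request has
// already been replicated through raft, so that cost is already paid.
func (s *stateStore) applyRegister(raftIndex uint64, svc *Service) {
	if existing, ok := s.services[svc.ID]; ok && existing.IsSame(svc) {
		// No-op: leave the index alone so thousands of blocked
		// watchers aren't woken up for data that hasn't changed.
		return
	}
	s.services[svc.ID] = svc
	s.index = raftIndex // wakes watchers on this service
}
```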
It would be relatively easy in the `local` package, where we call `AddService`, to use the same `IsSame` method we use inside the FSM to ensure that we only update the local state if there is actually a change in the registration. We'd probably need to do this for both services and checks. If there is no change we can either no-op and just return success, or continue to make the update but set `InSync` to true so that we avoid triggering a registration RPC to the servers.
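A rough sketch of what that client-side short-circuit could look like, again with simplified, hypothetical types rather than the real `agent/local` package:

```go
package localsketch

// Hypothetical, simplified stand-ins for the agent's local state; the
// real agent/local package differs in detail.

type NodeService struct {
	ID, Service, Address string
	Port                 int
}

func (s *NodeService) IsSame(o *NodeService) bool {
	return s.ID == o.ID && s.Service == o.Service && s.Address == o.Address && s.Port == o.Port
}

type serviceState struct {
	Service *NodeService
	InSync  bool // true: no registration RPC pending for this service
}

type State struct {
	services map[string]*serviceState
}

// AddService records a service in the local state. With the proposed
// change, it first checks whether anything actually changed.
func (l *State) AddService(svc *NodeService) {
	if existing, ok := l.services[svc.ID]; ok && existing.Service.IsSame(svc) {
		// Nothing changed: either return right here (pure no-op), or
		// store the new struct but mark it in sync so anti-entropy
		// never sends a registration RPC to the servers.
		existing.Service = svc
		existing.InSync = true
		return
	}
	// New or genuinely changed: record it out of sync so the next
	// anti-entropy run pushes the registration to the servers.
	l.services[svc.ID] = &serviceState{Service: svc, InSync: false}
}
```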
The problem with leaving it as it is: on a modestly sized Kubernetes cluster (100 nodes) with, say, 5000 pods each re-registering every 10 seconds, the servers are handling 5000 / 10 = 500 raft writes per second that are all no-ops! In one experiment we saw Consul servers sustaining 12k writes per second. They were perfectly healthy and handling that load just fine, but we did not expect the workload to be anywhere near that heavy until we tracked this issue down as the root cause.
There is another issue about making the Kube component itself more efficient; however, this seems like an easy win for better efficiency anywhere some automation might be making idempotent registrations without expecting high load from them.