Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Agent: don't send RPCs to Re-register service if it hasn't changed. #9266

Closed
banks opened this issue Nov 24, 2020 · 0 comments · Fixed by #9284
Closed

Agent: don't send RPCs to Re-register service if it hasn't changed. #9266

banks opened this issue Nov 24, 2020 · 0 comments · Fixed by #9284
Assignees
Labels
type/enhancement Proposed improvement or new feature

Comments

@banks
Copy link
Member

banks commented Nov 24, 2020

Consul's kubernetes integration currently deploys a sidecar that will re-register the pod's service with the local agent every 10 seconds.

This is sadly necessary for now because the client agent daemon set it stateless and so will loose API-driven registrations on a restart. Alternative solutions to prevent that currently have prohibitive costs associated with them.

We came up with this mechanism under the shared assumption that the local consul agent will just ignore the repeated attempts to register the same service and anti-entropy won't be invoked because "nothing changed" so the behaviour won't impact Consul servers. We even encoded that assumption in the code comments here: https://github.com/hashicorp/consul-k8s/blob/1518dcc6991b1b5e80198dac6a5a28102f4fd38a/subcommand/lifecycle-sidecar/command.go#L93-L97

It turns out this isn't actually how Consul works! I labelled this an enhancement since it technically never worked this way and never documented that it did, but since many of us assumed it did we could consider it a bug!

The reason for the confusion is that we have two other mechanisms like this:

  1. In agent anti-entropy, our periodic syncs will actively diff state against servers and only re-register a service if there is drift
  2. Servers themselves will check inside the FSM that the registration request is actually changing the service or check definition. If it doesn't it becomes a no-op to avoid pointlessly waking up potentially thousands of watchers to deliver them a message that has no new data in. But the server has already processed the RPC and made a full trip through raft by this point so it's only optimizing away the blocking query broadcast not the actual cost of the write on the servers.

It would be relatively easy in the local package where we call AddService to use the same "IsSame" method we use inside the FSM to ensure that we only update the local state if there is actually a change in the registration. We'd probably need to do this for services and for checks. If there is no change we can either no-op and just return success, or we can continue to make the update but just set InSync true so that we avoid triggering a registration RPC to the servers.

The problem with leaving it like it is is that on a modest sized kube cluster (100 nodes) with say 5000 pods, the servers are now handling 500 raft writes per second that are all no-ops! In one experiment we saw Consul servers sustaining 12k writes per second. They were perfectly healthy and handling that load just fine, but we did not expect the workload to be anywhere near as hard as that until we tracked down this issue to be the root cause.

There is another issue for the Kube component to be more efficient however this seems like an easy win for better efficiency anywhere some automation might be making idempotent registrations and not expecting high load from them.

@banks banks added the type/enhancement Proposed improvement or new feature label Nov 24, 2020
@dnephin dnephin self-assigned this Nov 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/enhancement Proposed improvement or new feature
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants