Unnecessary rendering of all templates when template.wait is set #1140

Open
marqc opened this issue Sep 17, 2018 · 14 comments

marqc commented Sep 17, 2018

Hi, I encountered a bug / place for improvement in the template wait timer. Its purpose is to prevent flooding the disk / notified program with changes when they occur too often in Consul.

The current behavior is that a template with wait defined is added to the quiescenceMap, and on every tick this template is rendered in memory, even though nothing has changed in Consul.

There is room to improve this (in runner.go): add a new quiescence entry to the quiescenceMap only when a Consul event/change is received, and remove the entry from the map once the template gets rendered. Also, a Run triggered by a quiescence tick should only render the templates currently present in the quiescenceMap.
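
To make the proposal concrete, here is a minimal sketch of the idea. The Runner/quiescence types and the method names (onDependencyChange, onQuiescenceTick) are hypothetical stand-ins used for illustration; this is not the actual consul-template runner code.

// Sketch of the proposed quiescence handling: only templates with pending
// changes live in the map, so idle templates never tick.
package runner

import "time"

type quiescence struct {
	timer    *time.Timer
	deadline time.Time
	min, max time.Duration
}

// Runner keeps quiescence state only for templates with pending changes.
type Runner struct {
	quiescenceMap map[string]*quiescence
}

func NewRunner() *Runner {
	return &Runner{quiescenceMap: make(map[string]*quiescence)}
}

// onDependencyChange runs when a watched Consul value changes: the template
// is added to the map (or its existing timer is extended).
func (r *Runner) onDependencyChange(templateID string) {
	q, ok := r.quiescenceMap[templateID]
	if !ok {
		q = &quiescence{min: 2 * time.Second, max: 10 * time.Second}
		q.deadline = time.Now().Add(q.max)
		q.timer = time.NewTimer(q.min)
		r.quiescenceMap[templateID] = q
		return
	}
	// Later changes push the timer out by min, but never past the max deadline.
	wait := q.min
	if remaining := time.Until(q.deadline); remaining < wait {
		wait = remaining
	}
	q.timer.Reset(wait)
}

// onQuiescenceTick runs when a template's wait timer fires: render just that
// template and drop it from the map, so subsequent ticks have nothing to do.
func (r *Runner) onQuiescenceTick(templateID string, render func(string)) {
	if _, ok := r.quiescenceMap[templateID]; !ok {
		return // no pending change for this template
	}
	delete(r.quiescenceMap, templateID)
	render(templateID)
}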

Consul Template version

0.19.4

Configuration

template {
  source = "/tpls/conf.tpl"
  destination = "/output-file.tpl"
  command = "/bin/sh -c '/usr/bin/killall -SIGHUP myapp'"
  backup = false
  wait {
    min = "2s"
    max = "10s"
  }
}
{{range services}}{{$name := .Name}}
{{- range service $name}}
{{.Address}} {{.Name}} {{.Name}}.{{ env "DOMAIN"}} {{.Address | replaceAll "." "-"}}.{{.Name}}  {{.Address | replaceAll "." "-"}}.{{.Name}}.{{ env "DOMAIN"}}
{{- end }}
{{- end }}

Command

/usr/local/bin/consul-template -config=/ctconfig -consul-addr=$CONSUL_ADDRESS -log-level=debug

Debug output

...
2018/09/17 12:54:51.783305 [DEBUG] (runner) received template "5eca8f5577257878e8b533b6f7b6a5ca" from quiescence
2018/09/17 12:54:51.783343 [INFO] (runner) initiating run
2018/09/17 12:54:51.783358 [DEBUG] (runner) checking template 5eca8f5577257878e8b533b6f7b6a5ca
2018/09/17 12:54:51.801431 [DEBUG] (runner) rendering "/tpls/conf.tpl" => "/etc/dnsmasq.d/ns.default.conf"
2018/09/17 12:54:51.801543 [DEBUG] (runner) checking template 664e869a6fc27466226da186f9bfd4cf
2018/09/17 12:54:51.810525 [DEBUG] (runner) diffing and updating dependencies
2018/09/17 12:54:51.812603 [DEBUG] (runner) enabling template-specific quiescence for "5eca8f5577257878e8b533b6f7b6a5ca"
2018/09/17 12:54:51.812613 [WARN] (runner) watching 290 dependencies - watching this many dependencies could DDoS your consul cluster
2018/09/17 12:54:53.810727 [DEBUG] (runner) received template "664e869a6fc27466226da186f9bfd4cf" from quiescence
2018/09/17 12:54:53.810783 [INFO] (runner) initiating run
2018/09/17 12:54:53.810814 [DEBUG] (runner) checking template 5eca8f5577257878e8b533b6f7b6a5ca
2018/09/17 12:54:53.834651 [DEBUG] (runner) checking template 664e869a6fc27466226da186f9bfd4cf
2018/09/17 12:54:53.843978 [DEBUG] (runner) rendering "/tpls/clusterhosts.tpl" => "/etc/hostsdir/hosts.default"
2018/09/17 12:54:53.844074 [DEBUG] (runner) diffing and updating dependencies
2018/09/17 12:54:53.845802 [DEBUG] (runner) enabling template-specific quiescence for "664e869a6fc27466226da186f9bfd4cf"
2018/09/17 12:54:53.845809 [WARN] (runner) watching 290 dependencies - watching this many dependencies could DDoS your consul cluster
2018/09/17 12:54:55.834815 [DEBUG] (runner) received template "5eca8f5577257878e8b533b6f7b6a5ca" from quiescence
2018/09/17 12:54:55.834858 [INFO] (runner) initiating run
2018/09/17 12:54:55.834883 [DEBUG] (runner) checking template 5eca8f5577257878e8b533b6f7b6a5ca
2018/09/17 12:54:55.855534 [DEBUG] (runner) rendering "/tpls/conf.tpl" => "/etc/dnsmasq.d/ns.default.conf"
2018/09/17 12:54:55.855635 [DEBUG] (runner) checking template 664e869a6fc27466226da186f9bfd4cf
2018/09/17 12:54:55.868029 [DEBUG] (runner) diffing and updating dependencies
2018/09/17 12:54:55.873420 [DEBUG] (runner) enabling template-specific quiescence for "5eca8f5577257878e8b533b6f7b6a5ca"
2018/09/17 12:54:55.873428 [WARN] (runner) watching 290 dependencies - watching this many dependencies could DDoS your consul cluster

Expected behavior

Timer-driven rendering only happens from the moment a Consul watch detects a change until the resulting change is written to the file. When nothing changes in the cluster, the ticks do nothing and the CPU rests.

Actual behavior

Template is rendered (in memory) every 2 secs, even when nothing changes in consul. CPU is working.

Steps to reproduce

Every time, with the above config.

References


sodabrew commented Nov 2, 2018

I suspect this is related to the work in #988, which I tracked down as the source of the issues in #1043 (comment).

@skyrocknroll

@marqc Do you have a working patch?

@pierresouchay

@skyrocknroll http://github.com/criteo/consul-templaterb/ has those features too (but working, hopefully) :-)

@pearkes pearkes added this to the v0.20.0 milestone Feb 6, 2019
@pearkes pearkes self-assigned this Feb 6, 2019
Contributor

pearkes commented Feb 9, 2019

I talk a bit about this in #1174. I believe this is how the runner loop is currently designed -- fortunately it isn't writing to disk at this interval but it is causing a good amount of seemingly unnecessary work.

I believe there are good reasons it is set up this way -- notably the way dependencies are resolved -- so I don't want to suggest this isn't the right trade-off currently.

I think as we continue to evaluate consul-template's design in the longer term we might want to optimize for not causing this. I'm also open to hearing more feedback or suggestions here if there is a better way!

@pearkes pearkes removed this from the v0.20.0 milestone Feb 9, 2019

sodabrew commented Feb 9, 2019

@pearkes would it make sense to try to dig into this issue now, after you’ve landed 1174, or would you be able to cut a 0.19.6 release soon and leave this issue for a subsequent micro?

Contributor

pearkes commented Mar 8, 2019

Some good discussion in #1195, which was a duplicate of this.

@sodabrew I think it'd be great to fix this, but as I noted above I fear it won't be straightforward given how the system is currently set up. It seems like we could definitely make better use of blocking queries to do this more efficiently.
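
For reference, here is a minimal sketch of a Consul blocking query using the official github.com/hashicorp/consul/api client. The service name ("myapp") and the wait time are placeholders; this only illustrates the pattern and is not consul-template's watcher code.

// Blocking-query sketch: the call only returns when the watched data changes
// (or WaitTime expires), so an idle cluster costs essentially no CPU.
package main

import (
	"log"
	"time"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	var lastIndex uint64
	for {
		entries, meta, err := client.Health().Service("myapp", "", true, &api.QueryOptions{
			WaitIndex: lastIndex,
			WaitTime:  5 * time.Minute,
		})
		if err != nil {
			log.Println(err)
			time.Sleep(time.Second)
			continue
		}
		if meta.LastIndex == lastIndex {
			continue // wait timed out with no change; re-issue the query
		}
		lastIndex = meta.LastIndex
		log.Printf("service changed; %d healthy instances", len(entries))
		// ...this is where a render would be triggered...
	}
}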

Contributor

eikenb commented Jun 14, 2019

Note that this is also a duplicate of #998, which describes the same behavior; in addition, the template there included timestamps, which triggered a file write every time.

Contributor

eikenb commented Jul 8, 2019

References in other issues that might be of interest or relevant while working on this...

#1043 (comment)
#1196 (comment)

Contributor

eikenb commented Jul 8, 2019

I cannot reproduce this using the provided config/template. So far the only way I can replicate this is to use 2 templates like in #998 and #1195.

Contributor

eikenb commented Jul 8, 2019

Just noticed that the debug output shows 2 templates while the provided config only has 1. So it's the same issue; the provided config is just incomplete.

Contributor

eikenb commented Jul 8, 2019

After spending most of the day tracing through the runner code, I'm almost convinced that this is going to have to wait for a later release, after some refactoring gets the code into a state more amenable to the work.

Contributor

vaLski commented Jan 3, 2025

We had an issue where templates with frequent updates were flooding the disk any time a change occurred, instead of being aggregated and rendered on the min/max interval as expected.

Upon further investigation, we discovered that the quiescence min/max intervals were parsed only after r.allTemplatesRendered, but our setup never reached that code, or only reached it hours into consul-template's runtime.

We moved the wait interval parser to before the allTemplatesRendered check, tested it, and the wait aggregation intervals now work just as expected.
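
For illustration, a rough sketch of that reordering is below. The type and method names are stand-ins, not the actual PR-2014 diff.

// Hypothetical before/after sketch of moving the quiescence setup ahead of
// the allTemplatesRendered check.
package runner

// Runner is an illustrative stand-in for the consul-template runner.
type Runner struct{}

func (r *Runner) allTemplatesRendered() bool { return false } // stand-in
func (r *Runner) setupQuiescence()           {}               // stand-in: parse wait{min,max}, arm per-template timers

func (r *Runner) finishRun() {
	// Before the fix (roughly): the quiescence timers were only set up inside
	// the allTemplatesRendered branch, so a setup that rarely or never reached
	// that point never got wait aggregation.
	//
	//   if r.allTemplatesRendered() {
	//       r.setupQuiescence()
	//       ...
	//   }

	// After the fix: arm the per-template quiescence timers first, so
	// aggregation works even while some templates are still pending.
	r.setupQuiescence()

	if r.allTemplatesRendered() {
		// ...existing "all templates rendered" handling...
	}
}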

The change has been verified to fix our initial problem and has since been gradually rolled out to more than 1k prod servers. We have been running the patched version for a month now with no side effects observed so far.

Kindly check whether PR-2014 solves the issues you are experiencing and, if appropriate, consider merging it upstream.

Oh, and by the way, happy new 2025. Keep rocking 🎉

@pierresouchay

@vaLski so glad someone finally tackled this, congrats!

Contributor

vaLski commented Jan 3, 2025

@vaLski so glad someone finally tackled this, congrats!

Honestly, I don't know if it actually fixes the initially reported in-memory rendering overhead, as I did not have the time to test it in detail, but I hope so:

Template is rendered (in memory) every 2 secs, even when nothing changes in consul. CPU is working.

However, it definitely fixed our case of on-disk rendering floods and the flood of post-render events/execs that one might have, as in our case. The template we are rendering triggers a chain of bash-isms, so to speak, leading to a thundering herd of execs, service reloads, etc., and without working aggregation it quickly becomes a nightmare for the host resources and the services running on top.

As far as I can remember, working aggregation was one of consul-templaterb's main advantages. I considered migrating to it numerous times, but since we don't have Ruby anywhere on the stack, and I personally didn't want to start managing it, I ditched the idea every time it popped up.

Nevertheless, both consul-template and consul-templaterb are good tools as long as they solve one's problems. So congrats to you too, @pierresouchay, for your efforts on consul-templaterb and the multiple fixes/features you have submitted to the consul/template ecosystem over the years.
