Infinite template re-render loop when using {{ timestamp }} in two different templates of the same task #20618
Comments
Hi @faryon93, thanks for reporting this. I was able to reproduce it, and not only with the podman driver but also with the docker driver. Indeed there's something fishy happening with the taskrunner here, and we'll look into it.
The title of this issue may need to be updated, as it is not just timestamps that cause this to happen: anything that causes the source data to change and trigger a re-render will do it. This only happens if you have more than a single template. Another example:
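As an illustration of that point only, here is a sketch of two `template` blocks that contain no timestamp function at all, just Consul keys; the key paths and destinations are assumptions, not the commenter's original example:

```hcl
# Illustrative sketch only -- not the original example from this comment.
# Per the comment above, two templates plus any change to the watched
# source data is enough to trigger the loop; no timestamp is needed.
template {
  data        = "first: {{ key \"config/first\" }}\n"   # hypothetical key
  destination = "local/first.txt"
}

template {
  data        = "second: {{ key \"config/second\" }}\n" # hypothetical key
  destination = "local/second.txt"
}
```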
The reason we found this bug is that it causes extremely high Nomad CPU usage when running a lot of containers. It also looks like this may be related to these issues: hashicorp/consul-template#1427. A simple fix would be if we could just disable the quiescence timers in Nomad, but I have tried everything to pass null to the wait times and nothing works. I think it's because of this code, which always assigns a default: https://github.com/hashicorp/nomad/blob/v1.7.7/client/config/config.go#L422-L425. Looking into the consul-template code and docs, if you pass null to the Wait config it will disable the quiescence timers.
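For reference, a minimal sketch of the Nomad agent (client) configuration block where these quiescence wait times live; this is the setting the linked config.go code falls back to defaults for. The values are illustrative; the 5s `min` matches the default mentioned later in this thread:

```hcl
# Nomad agent (client) configuration -- sketch with illustrative values.
client {
  template {
    # Quiescence wait applied by consul-template before re-rendering.
    wait {
      min = "5s" # lower bound; 5s is the default referenced in this thread
      max = "4m" # upper bound; illustrative value
    }
  }
}
```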
I cannot confirm that. I just modified my example job file to use the […] Even when using a […] The problem when using […] In the meantime I upgraded my Nomad cluster to 1.8.2, so these tests were conducted with that version. @pkazmierczak, is there any news on this topic?
Hey @faryon93, sorry this is taking a while. The issue is on our board, but sadly I can't give you an estimate or commit to a timeline. We'll try to fix it soon.
I tested this again on the latest version I can get running in local dev, v1.8.3:
I created the […]
@pkazmierczak, when you do the test, can you make sure you are getting DEBUG logs?
@mismithhisler see also #24638
I've investigated and it definitely seems to originate from consul-template. Those logs are generated with a local Nomad environment; all HCL files are provided here: #24638 (comment)

Working case: 1 template (this shows only once, a few seconds after updating the variable):
Bogus case: 2 templates (this is where the problem appears; the min wait has been changed from 5s to 15s in the Nomad client config):
The issue seems to originate from the fact that, in point 1, it receives each template independently from quiescence. It therefore starts 2 independent quiescence timers, where each timer evaluates both templates, even though only 1 is dynamic and contains a variable that changed.
@valeriansaliou You are absolutely correct: this does originate from consul-template and has to do with the quiescence timers, which are set by default in Nomad clients. We've identified the origin of this issue in consul-template and are working on possible fixes.
Nomad version
Operating system and Environment details
Issue
When I use the `{{ timestamp "unix" }}` function in two different `template` blocks of the same task, and one of the templates uses other functions that can cause re-rendering (like `{{ key }}`, `{{ service }}`, etc.), then as soon as a re-render is triggered, the template gets re-rendered in an infinite loop. The problem does not exist when the task has only one template.

Side note: when the task is stuck in the render loop, a proper shutdown is not possible.
Reproduction steps

1. Create the key `foo/bar` in Consul with some random content
2. Run the job (job file below)
3. Change `foo/bar` to another value
4. Check the `Recent Events` table of the task for re-occurring `Template re-rendered` messages

Expected Result
The template should be re-rendered exactly once.
Actual Result
The template gets re-rendered in an infinite loop every few seconds.
Job file (if appropriate)
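A minimal sketch of a job matching the description above, with two templates in one task, one using `{{ timestamp "unix" }}` and one watching the Consul key `foo/bar`. This is not the reporter's original file; the job name, driver, image, and destination paths are assumptions:

```hcl
# Sketch only -- not the original job file attached to this issue.
job "template-loop-repro" {
  datacenters = ["dc1"]

  group "app" {
    task "app" {
      driver = "docker" # reproduced with both docker and podman per the comments

      config {
        image   = "busybox:latest"
        command = "sleep"
        args    = ["infinity"]
      }

      # Template 1: output changes on every render because of the timestamp.
      template {
        data        = <<-EOT
          rendered at {{ timestamp "unix" }}
        EOT
        destination = "local/timestamp.txt"
      }

      # Template 2: watches the Consul key from the reproduction steps;
      # changing foo/bar triggers a re-render of the task's templates.
      template {
        data        = <<-EOT
          value: {{ key "foo/bar" }}
        EOT
        destination = "local/key.txt"
      }
    }
  }
}
```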
Nomad Server / Client logs