-
Notifications
You must be signed in to change notification settings - Fork 9.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Chef: Gracefully Handle RFC062 Exit Codes #19155
Conversation
f4b0b8e
to
b732ee3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, but I do have one suggestion you might have a look at...
I noticed this is an old PR, are you still willing to work with me to get the PR ready for merging? In that case please have a look at my suggestion and also please rebase the PR so that the test can (hopefully) run successfully. Lastly it would be great if you can sign the contributor license agreement. Thanks! |
@svanharmelen sure, I've also gotten quite a bit better with Go since. I'll give this another whirl. |
b732ee3
to
32495c3
Compare
@stevenoneill-jmfamily and others, would we want to attempt a retry here or is the pending reboot notification good enough? Putting this in a retry loop if we get one of those exit codes should be doable, maybe with a configurable sleep. Thoughts for configurables:
@svanharmelen If this is good to go, I'd say merge as MVC support for this. Don't have a timeframe for being able to support those configurables, write the tests, etc. |
@bdwyertech thanks for the heads up on this, appreciate it! That sounds reasonable to me, I like that the defaults result in nothing (keep existing behavior). My two bits....from a Chef user pov, I feel like I'd be expecting behavior and names like the current test-kitchen experience. eg, utilizing these variable names:
It'd be cool to have the consistency across products, if possible. If not, i'm still good with it because this feature would be really frickin useful :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is indeed a much better solution, but it will needs some tweaks to make it work as you intended 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @bdwyertech,
Thanks for this, I just ran into the same issue.. Needed a small fix but works great:
...
null_resource.default (chef): Recipe: test::windows
null_resource.default (chef): * reboot[now] action reboot_now[2020-04-29T23:30:36+02:00] WARN: Rebooting system immediately, requested by 'now'
...
null_resource.default (chef): Chef Infra Client finished, 1/6 resources updated in 19 seconds
null_resource.default (chef): [2020-04-29T23:30:39+02:00] WARN: Rebooting server at a recipe's request. Details: {:delay_mins=>0, :reason=>"Reboot by Chef Infra Client", :timestamp=>2020-04-29 23:30:36 +0200, :requested_by=>"now"}
...
null_resource.default (chef): Chef Infra Client failed. 1 resources updated in 20 seconds
null_resource.default (chef): [2020-04-29T23:30:39+02:00] FATAL: Stacktrace dumped to C:/chef/cache/chef-stacktrace.out
null_resource.default (chef): [2020-04-29T23:30:39+02:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
null_resource.default (chef): [2020-04-29T23:30:39+02:00] FATAL: Chef::Exceptions::Reboot: Rebooting server at a recipe's request. Details: {:delay_mins=>0, :reason=>"Reboot by Chef Infra Client", :timestamp=>2020-04-29 23:30:36 +0200, :requested_by=>"now"}
null_resource.default (chef): Reboot has been scheduled in the run state
null_resource.default: Creation complete after 54s [id=6131578846470282695]
Apply complete! Resources: 1 added, 0 changed, 1 destroyed.
32495c3
to
5a9ebb8
Compare
Codecov Report
|
5a9ebb8
to
6dc94e5
Compare
Alright, so I have nothing better to do then clean up the laundry list... @shoekstra @svanharmelen @stevenoneill-jmfamily what do you think of this? I haven't the slightest idea how you would stub something like this in a test... Would love to see how you'd do it. If you want to give this a spin, use a cookbook that triggers a reboot, and set the codes you want in your provisioner block. retry_on_exit_code = [35, 213]
wait_for_retry = 30 // Default (seconds) : time to wait after receiving exit code before retrying connection & provisioning process
max_retries = 1 // Default : number of times to retry |
Signed-off-by: Brian Dwyer <[email protected]>
6dc94e5
to
4701ed2
Compare
This looks great, but it is also a considerably bigger change than your initial PR and my concern is it'll be harder to get merged in without adding tests. It would be a shame for this PR to stagnate again as a lot of Chef users would benefit from your fix. May I suggest creating a new PR that adds the retry functionality so it doesn't hold up merging your initial change? I think would have a better chance of being merged as is because the proposed change is much smaller. |
Signed-off-by: Brian Dwyer <[email protected]>
c812261
to
2f4067c
Compare
Even if MaxRetries is 0, we should still execute the loop one time in order to run the Chef-Client at least once. Also waiting only makes sense when we have `attempts` left. And last but not least we want to exit immediately when the exit code is not in the retry list. So this PR fixes three small issues to make everything work as expected.
This example doesn't really show how these values should be used. The default of retry_on_exit_code is now already when most people want, so this line is not needed in most cases. I think the docs describe the new options just fine, so lets leave this out...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good now 👍
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
Allows exit code 35 and 37 which do not imply Chef failure.
https://github.com/chef/chef-rfc/blob/master/rfc062-exit-status.md
Should help with #15134 and #16551