[oximeter] Don't stop refreshing producers forever if one attempt fails #7191

Merged: 2 commits, Dec 2, 2024

jgallagher
Contributor

#7126 introduced a return; inside refresh_producer_list to avoid clobbering our producer list when a request to Nexus fails. However, refresh_producer_list had two loops and so played two roles: "do one refresh" (from which returning is correct) and "periodically refresh" (from which returning is incorrect, because it means we never refresh again).

This PR splits refresh_producer_list into refresh_producer_list_{task,once}. I strongly recommend looking at the diff with whitespace ignored, as it's basically a no-op other than this split. The split makes the return correct: it now terminates only a single refresh, not all future ones.
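
For readers who don't have the diff open, here is a rough, std-only sketch of the shape of the bug and of the split. It is not the oximeter code: fetch_producers, refresh_combined, refresh_once, refresh_task, and the Producer alias are hypothetical stand-ins, the Nexus call is simulated, and the periodic timer is replaced by a fixed number of ticks so the behavior can be checked deterministically.

```rust
// Hypothetical stand-in for a producer record; not the real oximeter type.
type Producer = String;

/// Simulated Nexus call: fails on every attempt listed in `failing`.
fn fetch_producers(attempt: usize, failing: &[usize]) -> Result<Vec<Producer>, String> {
    if failing.contains(&attempt) {
        Err(format!("attempt {}: could not reach Nexus", attempt))
    } else {
        Ok(vec![format!("producer-from-attempt-{}", attempt)])
    }
}

/// Buggy shape: one function both ticks and refreshes, so the early
/// `return` on error also ends the periodic loop. Returns the number of
/// refresh attempts that were made.
fn refresh_combined(ticks: usize, failing: &[usize], current: &mut Vec<Producer>) -> usize {
    let mut attempts = 0;
    for attempt in 0..ticks {
        attempts += 1;
        match fetch_producers(attempt, failing) {
            Ok(list) => *current = list,
            // Meant to skip this refresh, but it leaves the whole function.
            Err(_) => return attempts,
        }
    }
    attempts
}

/// Fixed shape, part 1: a single refresh. Returning early here abandons
/// only this attempt and leaves the existing list untouched.
fn refresh_once(attempt: usize, failing: &[usize], current: &mut Vec<Producer>) {
    let list = match fetch_producers(attempt, failing) {
        Ok(list) => list,
        Err(_) => return,
    };
    *current = list;
}

/// Fixed shape, part 2: the periodic driver. It has no early return of its
/// own, so a failed refresh cannot stop later ones.
fn refresh_task(ticks: usize, failing: &[usize], current: &mut Vec<Producer>) -> usize {
    let mut attempts = 0;
    for attempt in 0..ticks {
        attempts += 1;
        refresh_once(attempt, failing, current);
    }
    attempts
}
```

With four ticks and a failure on the second attempt, refresh_combined stops after two attempts and keeps the stale list forever, while refresh_task makes all four attempts and ends with the list from the last successful one.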

I also added some InlineErrorChain bits to try to get more info from logged errors.
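
The PR text doesn't show how InlineErrorChain is used, so the following is only an illustration of the general idea, assuming the goal is to log an error together with its chain of underlying causes rather than just the top-level message. It uses only std::error::Error::source; inline_chain, ListError, and Timeout are hypothetical names invented for this sketch and are not the real InlineErrorChain API.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical low-level cause.
#[derive(Debug)]
struct Timeout;

impl fmt::Display for Timeout {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "connection timed out")
    }
}

impl Error for Timeout {}

// Hypothetical high-level error that wraps the cause.
#[derive(Debug)]
struct ListError {
    source: Timeout,
}

impl fmt::Display for ListError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to list producers")
    }
}

impl Error for ListError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Render an error and all of its sources on one line, separated by ": ".
fn inline_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut cause = err.source();
    while let Some(e) = cause {
        out.push_str(": ");
        out.push_str(&e.to_string());
        cause = e.source();
    }
    out
}
```

Logging only the Display output of ListError would hide the timeout; walking the source chain keeps the underlying cause in the same log line.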

Fixes #7190.

@jgallagher jgallagher requested a review from bnaecker December 2, 2024 16:02
Collaborator

@bnaecker bnaecker left a comment

Nice catch, thanks!

@jgallagher jgallagher merged commit a9df1f8 into main Dec 2, 2024
17 checks passed
@jgallagher jgallagher deleted the john/fix-oximeter-refresh-stops-on-failure branch December 2, 2024 18:26
Successfully merging this pull request may close these issues.

Propolis zone metrics not captured for some newly created instances/disks
2 participants