Upgrade kube-prometheus-stack chart to v66.1.1 #2341
Unassigned @elastisys/goto-scripts due to policy update #2347
I tried this out in a dev cluster. Grafana, metrics, etc. look good, but (as expected?) Thanos has trouble.
Should this wait for a Thanos update?
Hmm, maybe. I will test and check it again. I know I had issues with the newer Alertmanager version and Thanos, but that should have been fixed by keeping the Alertmanager version we used previously... Are you getting any particular alerts for Thanos?
Many of the Thanos pods were crashlooping. I'll have to take another look; it could be a misconfiguration of my cluster.
1c0fca2 to 3a9e990
Hmm, when I tried upgrading to the new version I did not have any issues with Thanos.
Just tried a clean install and Thanos seems to be running fine. Did you manage to investigate why the pods were crashing for you, @Zash?
I will close this PR in favor of #2381 to upgrade Prometheus to v3.
Warning
This is a public repository, ensure not to disclose:
What kind of PR is this?
Required: Mark one of the following that is applicable:
Optional: Mark one or more of the following that are applicable:
Important
Breaking changes should be marked kind/admin-change or kind/dev-change depending on type.
Critical security fixes should be marked with kind/security.
What does this PR do / why do we need this PR?
Noticed that kube-prometheus-stack was falling behind a bit, so this PR upgrades the Helm chart to v66.1.1. This fixes some ARP metrics and a related log issue in the node-exporter (this is mentioned in the linked issue).
Alertmanager in the Management cluster is not upgraded. Instead, the image version is pinned to the previous v0.26.0, because v0.27.0 removes the deprecated v1 API endpoint, which is still used by Thanos. Once we upgrade Thanos to v0.35 or higher, the v2 endpoint will be the default (see related upstream issue).
Information to reviewers
Since this PR changes a lot of files, I recommend simply testing out the migration script in a dev environment, i.e. deploy the previous version of kube-prometheus-stack and then run the migration. Afterwards, run tests and check that the metrics look fine.
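The version constraint between Alertmanager and Thanos described above can be expressed as a small check. This is only an illustrative sketch, not part of the PR: the function names are hypothetical, the two thresholds (v0.27.0 and v0.35) are taken from the description, and it assumes Thanos relies on its default Alertmanager API version rather than an explicitly configured one. Pre-release suffixes such as `-rc.1` are not handled.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn 'v0.26.0' or '0.35' into a comparable tuple such as (0, 26, 0)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


# Thresholds as stated in the PR description (assumptions of this sketch):
# first Alertmanager version flagged as incompatible with v1 API clients,
# and first Thanos version where the v2 endpoint is the default.
ALERTMANAGER_V1_CUTOFF = parse_version("v0.27.0")
THANOS_DEFAULTS_TO_V2 = parse_version("v0.35")


def pairing_is_safe(alertmanager: str, thanos: str) -> bool:
    """Return True if this Alertmanager/Thanos pair avoids the v1 API problem."""
    if parse_version(alertmanager) < ALERTMANAGER_V1_CUTOFF:
        # Older Alertmanager still serves v1, so any Thanos version works.
        return True
    # Newer Alertmanager needs a Thanos that defaults to the v2 endpoint.
    return parse_version(thanos) >= THANOS_DEFAULTS_TO_V2
```

For example, `pairing_is_safe("v0.26.0", "v0.34.1")` holds, which matches the pin chosen here, while pairing Alertmanager v0.27.0 with a Thanos release older than v0.35 does not.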
Checklist
NetworkPolicy Dashboard