Upgrade kube-prometheus-stack chart to v66.1.1 #2341

anders-elastisys · 2024-11-13T11:57:13Z

Warning

This is a public repository, ensure not to disclose:

personal data beyond what is necessary for interacting with this pull request, nor
business confidential information, such as customer names.

What kind of PR is this?

Required: Mark one of the following that is applicable:

Optional: Mark one or more of the following that are applicable:

Important

Breaking changes should be marked kind/admin-change or kind/dev-change depending on type
Critical security fixes should be marked with kind/security

kind/admin-change
kind/dev-change
kind/security
kind/adr

What does this PR do / why do we need this PR?

The kube-prometheus-stack chart had fallen behind upstream, so this PR upgrades the Helm chart to v66.1.1.
This fixes some ARP metrics in the node-exporter and a log issue they caused (see the linked issue).

Alertmanager in the Management cluster is not upgraded. Instead, its image version is pinned to the previous v0.26.0, because v0.27.0 removes the deprecated v1 API endpoints, which Thanos still uses. Once we upgrade Thanos to v0.35 or higher, the v2 endpoint will be the default (see the related upstream issue).
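
As a rough sketch of the condition above, the decision to keep or drop the pin could be gated on the deployed Thanos version. The helper name `version_ge` and the example version string are illustrative and not part of this PR, and the sketch assumes a `sort` that supports `-V` (GNU coreutils or similar):

```shell
# Hypothetical helper, not part of this PR.
# Succeeds (exit 0) when version $1 is greater than or equal to version $2.
# A leading "v" is stripped; relies on `sort -V` for version ordering.
version_ge() {
  a="${1#v}"
  b="${2#v}"
  # If the smaller of the two versions is $b, then $a >= $b.
  [ "$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)" = "$b" ]
}

# Example: decide whether the Alertmanager pin is still needed.
thanos_version="v0.34.1"  # assumed value, for illustration only
if version_ge "$thanos_version" "v0.35.0"; then
  echo "Thanos defaults to the v2 endpoint, the pin can likely be dropped"
else
  echo "keep Alertmanager pinned to v0.26.0"
fi
```

Pre-release tags are not handled specially here, so treat the result as a hint rather than a guarantee.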

Fixes Upgrade Kube-prometheus-stack-60.0.0 #2166

Information to reviewers

Since this PR changes a lot of files, I recommend simply testing the migration script in a dev environment, i.e. deploy the previous version of kube-prometheus-stack and then run:

CK8S_CLUSTER=both ./migration/v0.43/apply/11-kube-prometheus-stack.sh execute

Then run the tests and check that the metrics look fine.
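
One quick extra check after the migration, sketched under assumptions: filter the pod listing for anything stuck in CrashLoopBackOff. The helper name `crashlooping_pods` and the namespace names in the usage comments are hypothetical and should be adjusted to the environment:

```shell
# Hypothetical helper, not part of this PR.
# Reads `kubectl get pods` table output on stdin (single namespace, so the
# columns are NAME READY STATUS RESTARTS AGE) and prints the names of pods
# whose STATUS is CrashLoopBackOff.
crashlooping_pods() {
  awk 'NR > 1 && $3 == "CrashLoopBackOff" { print $1 }'
}

# Usage (namespace names are assumptions):
#   kubectl get pods -n monitoring | crashlooping_pods
#   kubectl get pods -n thanos | crashlooping_pods
```

Empty output means no pod in that namespace is currently crashlooping. It does not replace running the test suite.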

Checklist

simonklb · 2024-11-20T08:46:39Z

Unassigned @elastisys/goto-scripts due to policy update #2347

Zash

I tried this out in a dev cluster. Grafana, metrics, etc. look good, but (as expected?) Thanos has trouble.

Should this wait for a Thanos update?

anders-elastisys · 2024-12-04T13:47:34Z

> I tried this out in a dev cluster. Grafana, metrics, etc. look good, but (as expected?) Thanos has trouble.
>
> Should this wait for a Thanos update?

Hmm, maybe. I will test and check it again. I know I had issues with the newer Alertmanager version and Thanos, but that should have been fixed by keeping the Alertmanager version we used previously. Are you getting any particular alerts for Thanos?

Zash · 2024-12-04T14:07:08Z

> Are you getting any particular alerts for Thanos?

Many of the Thanos pods were crashlooping. I'll have to take another look; it could be a misconfiguration of my cluster.

anders-elastisys · 2024-12-17T15:28:03Z

> Many of the Thanos pods were crashlooping. I'll have to take another look; it could be a misconfiguration of my cluster.

Hmm, when I tried upgrading to the new version I did not have any issues with Thanos.
I will try to do a clean install of this version as well and test later.

anders-elastisys · 2024-12-23T16:21:40Z

I just tried a clean install and Thanos seems to be running fine. Did you manage to investigate why the pods were crashing for you, @Zash?

anders-elastisys · 2024-12-30T08:40:32Z

I will close this PR in favor of #2381 to upgrade Prometheus to v3.

anders-elastisys marked this pull request as ready for review November 19, 2024 15:31

anders-elastisys requested review from a team as code owners November 19, 2024 15:31

simonklb removed the request for review from a team November 20, 2024 08:46

anders-elastisys requested a review from AlbinB97 November 27, 2024 09:01

Zash reviewed Dec 4, 2024

anders-elastisys added 3 commits December 17, 2024 16:23

apps: upgrade kube-prometheus-stack to v66.1.1

399db96

release: add kube-prometheus-stack migration script

0e16d6e

apps sc: pin alertmanager version

3a9e990

anders-elastisys force-pushed the anders-elastisys/upgrade-kube-prometheus-stack branch from 1c0fca2 to 3a9e990 Compare December 17, 2024 15:26

anders-elastisys requested a review from a team December 17, 2024 15:26

anders-elastisys closed this Dec 30, 2024
