This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Create metric for flux manifest errors #2199

Closed
mpashka opened this issue Jun 27, 2019 · 3 comments
Labels
blocked-needs-validation (Issue is waiting to be validated before we can proceed), enhancement

Comments

@mpashka
Contributor

mpashka commented Jun 27, 2019

Describe the feature
Add a metric to inform users about invalid manifests, or about flux being unable to apply a manifest to k8s for some reason.

Sometimes a user makes a mistake and pushes an invalid manifest to git. In other cases the manifest is correct but flux can't apply it. A common example is a big configmap: flux uses "apply" to push the configmap to kubernetes, which implies saving the configmap content into an annotation, and kubernetes limits the size of annotations. So if the configmap is bigger than that limit (something like 256 KB), kubectl apply reports an error and the manifest can't be applied. There are flux events that can be used to detect this, but in a standalone flux installation it's much more convenient to have metrics with information about manifest errors, since metrics can easily be checked at any time. Those metrics can also be used to raise prometheus alerts, and prometheus is the de facto standard for k8s installations.
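To make the size problem concrete, here is a minimal sketch (not flux code) that estimates whether a manifest would still fit once kubectl apply stores it in the `kubectl.kubernetes.io/last-applied-configuration` annotation. It assumes the 256 KiB total annotation limit and approximates the stored value with a compact JSON encoding. The helper names `last_applied_size`, `fits_in_annotation` and `make_configmap` are hypothetical.

```python
import json

# Kubernetes caps the combined size of all annotations on one object at 256 KiB.
ANNOTATION_LIMIT_BYTES = 256 * 1024


def last_applied_size(manifest: dict) -> int:
    """Approximate the byte size of the last-applied-configuration annotation.

    Assumption: the annotation holds a JSON serialization of the manifest.
    The exact bytes kubectl writes may differ slightly from this encoding.
    """
    return len(json.dumps(manifest, separators=(",", ":")).encode("utf-8"))


def fits_in_annotation(manifest: dict) -> bool:
    """Return True if the serialized manifest stays within the annotation limit."""
    return last_applied_size(manifest) <= ANNOTATION_LIMIT_BYTES


def make_configmap(payload_bytes: int) -> dict:
    """Hypothetical helper: build a ConfigMap with one data key of the given size."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "example", "namespace": "default"},
        "data": {"blob": "x" * payload_bytes},
    }


if __name__ == "__main__":
    for size in (1024, 300 * 1024):
        print(size, fits_in_annotation(make_configmap(size)))
```

A check like this only predicts the failure. The request here is about surfacing the failure after it happens, which is what a metric would do.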

Expected behavior
I expect a new metric with the number of invalid manifest files in the flux git repository, and a metric with the number of errors that occurred while applying manifests to kubernetes. Possibly also a metric with the total number of manifests, but that one is only for reference.
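The three numbers described above could be gathered and exposed roughly as follows. This is a stdlib-only sketch under stated assumptions, not how flux implements it. Manifests are parsed as JSON to avoid a YAML dependency, the `apply` callable stands in for whatever applies a manifest to the cluster, and the metric names `example_flux_manifests`, `example_flux_invalid_manifests` and `example_flux_apply_errors` are made up for illustration. The output follows the Prometheus text exposition format.

```python
import json
from dataclasses import dataclass


@dataclass
class SyncStats:
    total_manifests: int = 0
    invalid_manifests: int = 0
    apply_errors: int = 0


def collect_stats(manifests, apply):
    """Count parse failures and apply failures for one sync.

    manifests: mapping of file path -> raw manifest text (JSON in this sketch).
    apply: hypothetical callable that raises an exception when applying fails.
    """
    stats = SyncStats()
    for _path, text in manifests.items():
        stats.total_manifests += 1
        try:
            obj = json.loads(text)
        except json.JSONDecodeError:
            stats.invalid_manifests += 1
            continue  # an unparsable manifest is never applied
        try:
            apply(obj)
        except Exception:
            stats.apply_errors += 1
    return stats


def render(stats):
    """Render the stats as gauges in the Prometheus text exposition format."""
    metrics = [
        ("example_flux_manifests", "Manifest files seen in the last sync.",
         stats.total_manifests),
        ("example_flux_invalid_manifests", "Manifest files that failed to parse.",
         stats.invalid_manifests),
        ("example_flux_apply_errors", "Manifests that failed to apply.",
         stats.apply_errors),
    ]
    lines = []
    for name, help_text, value in metrics:
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"
```

Gauges are used here because each value describes the state of the most recent sync. An alert could then fire whenever either error gauge is above zero.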

It could also be a good idea to create a flux prometheus dashboard for checking flux health, and prometheus alerts for flux that inform users about invalid manifests.

mpashka added the blocked-needs-validation and enhancement labels Jun 27, 2019
@ksaritek

@stefanprodan & @mpashka, is this a difficult ticket for a newbie? Is anybody working on it?

mpashka added a commit to pulsepointinc/flux that referenced this issue Oct 21, 2019
mpashka added a commit to pulsepointinc/flux that referenced this issue Oct 21, 2019
mpashka added a commit to pulsepointinc/flux that referenced this issue Oct 21, 2019
mpashka added a commit to pulsepointinc/flux that referenced this issue Oct 24, 2019
mpashka added a commit to pulsepointinc/flux that referenced this issue Oct 24, 2019
mpashka added a commit to pulsepointinc/flux that referenced this issue Oct 24, 2019
2opremio pushed a commit to 2opremio/flux that referenced this issue Oct 31, 2019
@2opremio
Contributor

2opremio commented Nov 12, 2019

I think we can close this, since it was addressed by #2535

@ArthurSens

Is it possible to add git-repository, git-path, and file to this metric?
If I'm using one Prometheus server to monitor multiple clusters/repositories, I'd like to create an alert that tells me exactly where the error causing the failed sync is.
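For illustration, a labelled series of the kind asked for here might look like the output of the sketch below. It is only an assumption about one possible shape, not the metric flux exposes. The metric name `example_flux_manifest_error` and the label names `git_repository`, `git_path` and `file` are hypothetical. Label values are escaped as the Prometheus text exposition format requires.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline for a Prometheus label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_failures(failures) -> str:
    """Render one gauge sample per failing manifest file.

    failures: iterable of (git_repository, git_path, file) string tuples.
    """
    name = "example_flux_manifest_error"  # hypothetical metric name
    lines = [
        f"# HELP {name} Set to 1 for each manifest file that failed in the last sync.",
        f"# TYPE {name} gauge",
    ]
    for repo, path, file in sorted(set(failures)):
        pairs = (("git_repository", repo), ("git_path", path), ("file", file))
        labels = ",".join(f'{k}="{escape_label_value(v)}"' for k, v in pairs)
        lines.append(f"{name}{{{labels}}} 1")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_failures([
        ("git@example.com:org/repo.git", "deploy", "deploy/configmap.yaml"),
    ]))
```

One trade-off to keep in mind: a per-file label creates one time series per failing file, so cardinality grows with the number of broken manifests. That is usually acceptable for an error metric, but it would be costly if every manifest got its own series.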


4 participants