Report errors at kubernetes level #2695
Comments
I don't know how hard it is for a pod to change its own status (or whether it's possible at all), but it seems like abusing Kubernetes' status mechanism, since the problem most likely comes from the environment rather than from the pod itself. I think that Prometheus metrics are a much better solution (e.g. #2535).
I have been using Kubefed lately and I really loved the events being reported at the Kubernetes level. The events were not logged on the Kubefed pods themselves but on each federated resource that was created. When I think of it, it's similar to how the Helm operator works. Maybe the status of each deployment could be added to the events of the HelmReleases (though this is not the correct repo to debate the subject)? For Flux itself, maybe it could report events at the namespace level?
Flux doesn't own the pods or the namespaces, so issuing events on those objects is not an option. Kubernetes does event compaction, so relying on events for critical info is not the best idea. I don't see any advantages of running
Ok, let's forget about events IN the events section then. We can all read events from the logs, but there are quite a lot of things going on in there... In my personal experience, I often think that something's wrong and I go through the logs and find nothing. In general it's because nothing is wrong and it's just coming from elsewhere, but if I had access to I don't represent huge corporations with hundreds or more Flux instances. But I do have around I think that having quick feedback on what each Flux is doing has some value. But maybe it's just me ;-)
Examples of valid Field one: A second field would have the current commit hash. Finally, I would love a section IDK maybe this is just my dream. But it doesn't hurt to ask!
Thank you for the suggestion. It cannot be implemented in Flux v1, for reasons I think were explained in the thread.

This feedback has been incorporated into the next version of Flux, though. We do not write Flux status errors into pod events, since that is still not possible, but the design has been rebased on an API of CRDs for Flux v2, also known as the GitOps Toolkit. The failure states that can be encountered (like "failed to get latest commit from git", "failed to apply manifests to cluster", ...) are now observable as Kubernetes Events on the CRDs.

Flux v1 is in maintenance mode now, and is not adding any new features unless they are critical. As Flux contrib efforts have been focused on Flux v2, the Flux project has moved to a new repo, fluxcd/flux2.

In the interest of reducing the number of open issues not directly related to supporting Flux v1 in maintenance mode, and respecting that you may have moved on already, I will go ahead and close out this issue for now.

If you have a use case for Flux that isn't covered well in the new Flux v2 (which is a total rewrite), we want to hear about it. If you've been following our development efforts then of course we hope you are able to upgrade. Here's more info on how to find support with that: https://fluxcd.io/support/
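
For anyone who wants the "quick feedback" view described earlier in the thread, here is a minimal Python sketch of how events on the Flux v2 CRDs could be filtered. It assumes the input is the `items` list from `kubectl get events -o json`, that failures surface as events of type `Warning`, and that the kinds of interest are `GitRepository`, `Kustomization` and `HelmRelease`. The function name `flux_warnings` is hypothetical, not part of Flux.

```python
# Flux v2 (GitOps Toolkit) kinds to report on; an assumed selection, extend as needed.
FLUX_KINDS = {"GitRepository", "Kustomization", "HelmRelease"}


def flux_warnings(events):
    """Return (kind/name, reason, message) for Warning events on Flux v2 objects.

    `events` is a list of dicts shaped like core/v1 Event objects,
    e.g. the "items" list of `kubectl get events -o json`.
    """
    found = []
    for ev in events:
        obj = ev.get("involvedObject", {})
        if ev.get("type") == "Warning" and obj.get("kind") in FLUX_KINDS:
            target = f"{obj['kind']}/{obj.get('name', '')}"
            found.append((target, ev.get("reason", ""), ev.get("message", "")))
    return found
```

Events of type `Normal`, and events on objects Flux does not own (pods, namespaces), are skipped, so the result only lists failures reported on the Flux resources themselves.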
Describe the feature
Actual state
When a fatal error appears in Flux it gets logged on the output of the container.
(By fatal errors I mean things like cloning or YAML syntax errors.)
The only way to access those errors and know the state of Flux is to read the logs using
kubectl logs flux
Requested feature
An enhancement would be to have fatal errors reported directly inside the events of the pod.
Then we could do a simple
kubectl describe pod flux
and have a general idea of what is going on. If this is too complicated, maybe just set an annotation on the pod to advertise the status (either Ok or Error state).
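
To make the annotation idea concrete, here is a minimal Python sketch, not something Flux implements. It derives an Ok/Error state from log lines and builds a JSON merge patch that would set it as a pod annotation. The annotation key `example.com/flux-status`, the marker strings, and both function names are assumptions made up for illustration.

```python
import json

# Hypothetical annotation key; Flux does not define one.
STATUS_ANNOTATION = "example.com/flux-status"

# Assumed substrings for the "fatal" errors named above (cloning, YAML syntax).
FATAL_MARKERS = ("clone", "yaml")


def derive_status(log_lines):
    """Return "Error" if any error line mentions a fatal marker, else "Ok"."""
    for line in log_lines:
        lowered = line.lower()
        if "error" in lowered and any(m in lowered for m in FATAL_MARKERS):
            return "Error"
    return "Ok"


def annotation_patch(status):
    """Build a JSON merge patch body that sets the status annotation."""
    return json.dumps({"metadata": {"annotations": {STATUS_ANNOTATION: status}}})
```

The resulting string could be applied with something like `kubectl patch pod <pod-name> --type merge -p '<patch>'`, after which the state would show up under Annotations in `kubectl describe pod`.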
What would the new user story look like?
When the last commit does not seem to have been deployed in time and you have to investigate, you can check events on the namespace or on the Flux pod directly without having to dig into the logs.
This allows everyone to debug Flux problems, not only people used to reading its logs.
What would the new interaction with Flux look like? E.g.
Expected behavior
Errors (or at least the Flux state) should appear at the Kubernetes level.