Allow automated image updates to partly fail #2619
Comments
I think it would be better to make this behaviour possible by adding an additional flag to the This would make the behaviour and exit status code predictable based on the configured flags.
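To illustrate what a predictable exit status could look like, here is a minimal Go sketch. The `allow-partial` flag name and the `exitCode` helper are hypothetical and not part of fluxctl; the sketch only shows one possible mapping from the outcome of a release to an exit status.

```go
package main

import (
	"flag"
	"fmt"
)

// exitCode maps the outcome of a release to a process exit status.
// allowPartial is a hypothetical option, not an existing fluxctl flag.
func exitCode(succeeded, failed int, allowPartial bool) int {
	switch {
	case failed == 0:
		return 0 // everything succeeded
	case allowPartial && succeeded > 0:
		return 0 // some updates failed, but partial failure was allowed
	default:
		return 1 // all-or-nothing policy, or nothing succeeded at all
	}
}

func main() {
	// A private FlagSet with fixed arguments keeps the sketch self-contained.
	fs := flag.NewFlagSet("release", flag.ContinueOnError)
	allowPartial := fs.Bool("allow-partial", false, "hypothetical: tolerate individual update failures")
	if err := fs.Parse([]string{"--allow-partial"}); err != nil {
		fmt.Println("flag error:", err)
		return
	}
	// Simulated outcome: one update succeeded, one failed.
	fmt.Println("exit status:", exitCode(1, 1, *allowPartial))
}
```

In this sketch a run where every update failed still returns a non-zero status even with the flag set. That is a design choice of the example, not something decided in this thread.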
fluxctl release --update-all-images
to partly fail
Cool, thanks for your quick response.
@squaremo given the above, is there a reason we currently have a 'succeed all or fail' policy for automated image updates?
I would grant that it's not always -- or even usually -- necessary to make updates atomic (all-or-nothing). Given the nature of Kubernetes, no-one should be relying on atomic changes. Even if committed to git, and applied to the API at the same time, they won't take effect at the same time. But inconsistencies due to partial updates would persist indefinitely, and probably need human intervention to resolve. So at the least, the failing parts should be made quite visible.
We could technically just add them to the commit note as 'attempted but failed', to make it observable outside the
Yes, and yes. And perhaps introduce (adapt?) a metric for the number of automated updates succeeding/failing.
Describe the bug
If one automated workload fails due to an error like #2618, all automated workloads are skipped.
To Reproduce
Steps to reproduce the behaviour:
fluxcd - helmoperator
1.1 one should have an error like fluxcd failed to automate helmrelease chart-image #2618
The log line
ts=2019-11-15T04:46:45.190432924Z caller=releaser.go:59 component=sync-loop jobID=cbcffd5d-5f97-de26-2303-f46eed9d7492 type=release updates=2
is misleading, as neither of the two images will be updated through Flux.
Only once the failing chart-image was locked was Flux able to automate the other image again.
Expected behavior
Flux should log and skip failed automated releases, so that an error in one release does not affect all the other automated releases.
Logs
If applicable, please provide logs of fluxd or the helm-operator. In a standard stand-alone installation of Flux, you'd get this by running kubectl logs -n default deploy/flux.
See above.