This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Flux aborts synchronization on manifest syntax errors #2861

Closed
suvl opened this issue Feb 20, 2020 · 3 comments

@suvl
Contributor

suvl commented Feb 20, 2020

We noticed today that fluxd stopped applying our commits for over an hour. Flabbergasted by the lack of errors posted to our #bots Slack channel, we found these logs being output repeatedly:

flux-6b7f6f6d4c-nk4m8 flux ts=2020-02-20T17:05:35.504746867Z caller=images.go:23 component=sync-loop error="getting unlocked automated resources: scanning multidoc from \"some-folder/some-file.yaml\": yaml: line 2214: found a tab character where an indentation space is expected"
flux-6b7f6f6d4c-nk4m8 flux ts=2020-02-20T17:06:00.421201205Z caller=images.go:23 component=sync-loop error="getting unlocked automated resources: scanning multidoc from \"some-folder/some-file.yaml\": yaml: line 2214: found a tab character where an indentation space is expected"
flux-6b7f6f6d4c-nk4m8 flux ts=2020-02-20T17:06:18.321171399Z caller=images.go:23 component=sync-loop error="getting unlocked automated resources: scanning multidoc from \"some-folder/some-file.yaml\": yaml: line 2214: found a tab character where an indentation space is expected"
flux-6b7f6f6d4c-nk4m8 flux ts=2020-02-20T17:06:44.28888356Z caller=images.go:23 component=sync-loop error="getting unlocked automated resources: scanning multidoc from \"some-folder/some-file.yaml\": yaml: line 2214: found a tab character where an indentation space is expected"

Indeed, there was a problem with that single file. However, this is a repo with 34 folders totalling 267 YAML files, and it would be nice if fluxd just ignored the failing file and kept going. We had other commits that were not applied to the k8s cluster, and people had no clue, other than from the flux-sync tag not changing, that something was going on.
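
For readers who want to catch this class of error before it reaches the sync loop, here is a minimal sketch of a pre-merge check. It is not part of Flux, and the function name `find_tab_indentation` is made up for illustration. It is only a heuristic: it flags any non-blank line whose leading whitespace contains a tab, so it may also flag tabs that are legal inside block scalars, and it does not replace a real YAML parser.

```python
from pathlib import Path


def find_tab_indentation(root):
    """Return (path, line_number) pairs for YAML lines indented with a tab.

    Heuristic only: this looks at leading whitespace and does not parse YAML.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only look at regular files with a YAML extension.
        if path.suffix not in (".yaml", ".yml") or not path.is_file():
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if not line.strip():
                    continue  # whitespace-only lines are ignored
                stripped = line.lstrip(" \t")
                indent = line[: len(line) - len(stripped)]
                if "\t" in indent:
                    hits.append((str(path), number))
    return hits
```

Run against a checkout of the manifests repo in CI, a non-empty result could fail the pipeline so that the offending commit never reaches the branch fluxd is watching.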

Additional context

  • Flux version: 1.18.0
  • Fluxcloud version: v0.3.9-c2695df5
  • Kubernetes version: 1.16.6
  • Git provider: gitlab on-prem
  • Container registry provider: containerd
@suvl suvl added blocked-needs-validation Issue is waiting to be validated before we can proceed bug labels Feb 20, 2020
@2opremio
Contributor

2opremio commented Feb 20, 2020

Thanks for reporting the problem!

I understand your frustration, but the fact that this was caused by a tab and the number of manifests you are dealing with are both incidental.

Flux takes a no-harm approach. Finding an error in a manifest and blindly applying the rest can be disastrous in a production environment (e.g. ignoring a syntax error in an updated config map that is needed for the correct functioning of a critical workload which was also updated accordingly).
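
The all-or-nothing policy described above can be sketched as follows. This is an illustration of the idea only, not Flux's actual implementation: `plan_sync`, `reject_tabs`, and the file names are hypothetical stand-ins, and the toy parser only rejects lines that start with a tab.

```python
def reject_tabs(text):
    """Toy parser (assumption): raise ValueError if a line starts with a tab."""
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith("\t"):
            raise ValueError(f"line {number}: tab in indentation")
    return text


def plan_sync(manifests, parse):
    """Parse every manifest first; hand back results only if all of them parse.

    `manifests` maps a file path to its text; `parse` is any callable that
    raises ValueError on a syntax error.
    """
    parsed, errors = {}, {}
    for path, text in manifests.items():
        try:
            parsed[path] = parse(text)
        except ValueError as exc:
            errors[path] = str(exc)
    if errors:
        # One bad file aborts the whole run, so nothing is applied.
        return None, errors
    return parsed, {}


# A valid config map next to a broken deployment: the plan is rejected as a whole.
plan, errors = plan_sync(
    {"configmap.yaml": "data:\n  key: value\n", "deployment.yaml": "spec:\n\treplicas: 2\n"},
    reject_tabs,
)
```

Here `plan` is `None` and `errors` names only `deployment.yaml`, which mirrors the trade-off in the comment: the valid config map is held back too, because applying it without the matching workload update could be worse than applying nothing.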

Now, users should be made aware of reconciliation being aborted as soon as possible to avoid situations like the one described in this issue. Flux already offers an (unfortunately as of yet undocumented) event API (used by e.g. Weave Cloud and fluxcloud) and some Prometheus metrics.

Please let us know if those two methods are not enough for you. If you are already using the events API through your bots (which you didn't clarify in the description), that should be fixed.

We understand that Flux's observability could be improved, and we are keeping track of that at #2812. We will happily accept your contributions improving the situation. This is open source after all!

@2opremio 2opremio changed the title A tab character cannot compromise the whole git repo Flux aborts synchronization on manifest syntax errors Feb 20, 2020
@2opremio 2opremio added question and removed blocked-needs-validation Issue is waiting to be validated before we can proceed bug labels Feb 20, 2020
@2opremio
Contributor

Related, and probably solvable by the same means as #2535

@kingdonb
Member

Flux v2 resolves this with a model that lets you shard your resources across Kustomizations that reconcile independently. It also comes with a great deal of new documentation around the new Notification API, which can natively forward events from the other GitOps Toolkit APIs to various notification providers.

https://toolkit.fluxcd.io/guides/notifications/
https://toolkit.fluxcd.io/guides/monitoring/

Flux v1 is in maintenance mode now, and is not adding any new features unless they are critical.

As Flux contribution efforts have been focused on Flux v2, the Flux project has moved to a new repo, fluxcd/flux2.

In the interest of reducing the number of open issues not directly related to supporting Flux v1 in maintenance mode, and respecting you may have moved on already, I will go ahead and close out this issue for now. Thanks for using Flux!
