Stuck in Reconciling and SSH_AUTH_SOCKET not specified #565
Comments
Can you share a pseudo copy of the |
Hi Hidde, it's as follows
|
Could you please try to reproduce this on an upgraded source-controller from the latest Flux release? There are some changes in source-controller. Also, could you try setting the If it still reproduces on the latest version, we will of course still want to follow up this issue, even if you are un-blocked! Another suggestion: you might want to try setting One more thing to clarify: are there any submodules in the test repo? |
I am having the same issue. This is a brand new/fresh cluster on DigitalOcean following steps with brew install and fluxctl bootstrap. It seems to fail with both go-git and libgit2, but libgit2 produces this error instead
Here is my GitRepository
I would be happy to provide information on your outlined steps, but need some help. How do I get the Secret? How do I upgrade to the latest source-controller? |
The solution is that I needed to add a secret to Kubernetes. I did not like the idea of committing a secret to git, so I created a secrets YAML that I applied directly (be sure to base64 encode both the username and password). Use kubectl explain gitrepository.spec to get information on the secret format for https vs ssh.
I do not know if there is a better way to do GitOps with secrets than applying the secret manually, but it felt bad to commit it alongside the flux configurations. In any case, I think the Flux getting started guides should call out this requirement to add the secret up front. It may be obvious, but I think I and others missed it.
apiVersion: v1
apiVersion: source.toolkit.fluxcd.io/v1beta1
EDIT: It seems that Flux removes the manually applied secret by design. So we must put it in the repository. Is this right? |
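For readers following along, here is a minimal sketch of what such a Secret and the GitRepository that references it might look like for HTTPS basic auth. These are not the manifests from the comment above, which were not preserved. Every name, the URL, the branch, and the credential values are hypothetical placeholders. The sketch uses `stringData`, which takes plain-text values; if you use `data` instead, the values must be base64 encoded as the comment notes.

```yaml
# Sketch only: all names, the URL, the branch and the credentials are hypothetical.
apiVersion: v1
kind: Secret
metadata:
  name: example-git-auth        # hypothetical secret name
  namespace: flux-system
type: Opaque
stringData:                     # plain text; Kubernetes stores it base64 encoded under `data`
  username: example-user        # placeholder
  password: example-token       # placeholder
---
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: example-repo            # hypothetical
  namespace: flux-system
spec:
  interval: 1m
  url: https://example.com/example/examplerepo.git   # placeholder URL
  ref:
    branch: main                # assumed branch name
  secretRef:
    name: example-git-auth      # must match the Secret name above, in the same namespace
```

For an `ssh://` URL the Secret carries different keys (a private key and known hosts rather than a username and password); `kubectl explain gitrepository.spec.secretRef` describes the expected fields for the installed version.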
@ZeroDeltaAlpha did you get a chance to try Kingdon's recommendation for your main issue? As for the timeouts not being honoured, our current fix for this is to use the Libgit2 Managed Transport we recently released. Here's how to enable and use it: #636 (comment) |
Hi @pjbgf, we moved on to ArgoCD, as it basically deadlocked our deployment pipeline. Even after using flux uninstall to remove state, git repos would immediately get into a state where they could never fetch commits and would also never time out, even if a timeout was set. We tried to replicate this on a sandbox cluster, and it worked fine there but not on our production cluster. |
@ZeroDeltaAlpha thank you for getting back to me on this. The first issue was fixed by #740, so The second issue is caused by using password-protected SSH keys without providing its password to Flux. The error message is really not user friendly, so I created a new issue to tackle that (#802). |
Closing this in favour of #802 to track the outstanding work. But happy to re-open in case of similar occurrences in the future. |
Hi All,
We are running the following flux components in our EKS cluster:
We are observing the following issues:
GitRepo stuck in an eternal reconciling in progress status, even when a timeout is set.
GitRepo reconcile failing immediately with the message: unable to clone 'ssh://[email protected]/example/examplerepo': error creating SSH agent: "SSH agent requested but SSH_AUTH_SOCK not-specified"
We have tried a full uninstall and re-bootstrapping without success, and we ran the same GitRepos in a sandbox cluster. We also tried applying https://github.com/fluxcd/source-controller/blob/main/config/testdata/git/large-repo.yaml to ensure that it wasn't an authentication issue; this workload also gets stuck in reconcile. Rolling back the source controller does not resolve the issue either.
Setting the source controller to trace level also doesn't reveal any issue in the logs.
Please let me know if we can provide anything further.