Image build process Freezes on Taking snapshot of full filesystem...
#1333
Comments
Can you please elaborate on which filesystem the snapshot is being taken of while building the image, so that we can check whether filesystem size is causing this issue? We are using kaniko to build images in GitLab CI/CD, and the runner is deployed on Kubernetes using the Helm chart.
@abhi1git can you try the newer snapshot modes?
@abhi1git please switch to using …
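For anyone following along, this is a minimal sketch of passing one of the newer snapshot modes to kaniko from a GitLab CI job, following the usual kaniko-in-GitLab pattern; the job name, destination, and predefined CI variables are illustrative assumptions, not values taken from this issue:

```yaml
# .gitlab-ci.yml (sketch): run the kaniko debug image and select the
# "redo" snapshot mode instead of the default "full" mode.
build-image:
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    - /kaniko/executor
      --context "$CI_PROJECT_DIR"
      --dockerfile "$CI_PROJECT_DIR/Dockerfile"
      --destination "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
      --snapshotMode=redo
```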
@Kiddinglife Can you provide your Dockerfile or some stats on the number of files in your repo?
I am experiencing this problem while building an image of less than a GB. Interestingly, it fails silently: the GitLab CI job is marked as successful but no image is actually pushed. We are using kaniko for several other projects, but this error only happens on two of them. Both are monorepos and use lerna for extending yarn commands to sub-packages. I must say it was working at some point, and it does work normally when using docker to build the image. Here is a snippet of the build logs:
Interesting to note that neither …
Same issue; nothing changed except the version of kaniko.
I'm hitting the same problem. I tried --snapshotMode=redo, but it does not always help.
Adding a data point: I was initially observing the build-process freeze when I did not set any memory/CPU requests or limits.
version: gcr.io/kaniko-project/executor:v1.6.0-debug
I guess the root cause is actually insufficient memory, but when we do not allocate enough memory it freezes on Taking snapshot of full filesystem...
Edit: my guess was wrong. I reverted to kaniko:1.3.0-debug and added sufficient memory requests and limits, but I'm still observing the image build freezing from time to time.
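To make the memory angle concrete, here is an illustrative fragment of what explicit resource requests/limits on the kaniko build container could look like; the sizes are guesses that need tuning per project, not values reported in this thread:

```yaml
# Pod-spec fragment (sketch): give the kaniko container explicit memory,
# since snapshotting and layer caching can use several GB of RAM.
containers:
- name: kaniko
  image: gcr.io/kaniko-project/executor:v1.6.0-debug
  resources:
    requests:
      cpu: "1"
      memory: 2Gi
    limits:
      memory: 4Gi
```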
Hi @abhi1git, did you find a solution for your issue? I am facing the same one.
The issue is still present for me too. Any updates?
Same issue here; the system has enough memory (not hitting any memory limits), …
For us, after investigation, we found that the WAF in front of our GitLab was blocking the requests. After whitelisting it, everything works fine.
Still an issue, can you reopen @tejal29? Building an image like this shouldn't be getting OOMKilled or using GBs of RAM - seems like a clear-cut bug to me.
What kind of whitelisting was required for this? Can you help me clarify how to set it up?
If you have a WAF in front of GitLab, it would be good to check your logs first and confirm what kind of requests it is blocking.
Has anyone tried version 1.7.0?
v1.7.0 is about 4 months old, had some showstopper auth issues, and :latest currently points to :v1.6.0, so I would guess that not many folks are using :v1.7.0. Instead, while we wait for v1.8.0 (#1871), you can try a commit-tagged image, the latest of which is currently …
Thanks @imjasonh, I'm going to try gcr.io/kaniko-project/executor:09e70e44d9e9a3fecfcf70cb809a654445837631-debug
I've tried …
I guess the only solution is waiting for another commit-tagged image or for 1.8.0 to be released.
It sounds like whatever bug is causing that is still present, so it won't be fixed by releasing the latest image as v1.8.0. We just need someone to figure out why it gets stuck and fix it. Unfortunately Kaniko is not really actively staffed at the moment, so it's probably going to fall to you or me or some other kind soul reading this to investigate and get us back on track to solving this. Any takers?
Hold on a second, maybe I spoke too early! My pipeline currently builds multiple images in parallel. The images that actually get stuck are basically the same Postgres image built with different … I consequently removed this parallelism and built these Postgres images in sequence. So from my tests it looks like, when image builds happen in parallel against the same registry mirror used as a cache, if one image is taking snapshots in parallel with another, it gets stuck. It may be a coincidence, maybe not.
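A different workaround than fully serializing the builds might be to give each parallel build its own cache repository via kaniko's --cache-repo flag, so concurrent jobs do not contend on the same cache; this is untested against this issue, and the repository names below are placeholders:

```yaml
# Args fragment (sketch): per-image cache repository for parallel builds.
- args:
  - --dockerfile=/workspace/Dockerfile
  - --context=dir:///workspace/
  - --destination=registry.example.com/postgres/variant-a:latest
  - --cache=true
  - --cache-repo=registry.example.com/cache/postgres-variant-a
```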
Same issue:

```yaml
containers:
- args:
  - --dockerfile=/workspace/Dockerfile
  - --context=dir:///workspace/
  - --destination=xxxx/xxx/xxx:1.0.0
  - --skip-tls-verify
  - --verbosity=debug
  - --build-arg="http_proxy='http://xxxx'"
  - --build-arg="https_proxy='http://xxxx'"
  - --build-arg="HTTP_PROXY='http://xxxx'"
  - --build-arg="HTTPS_PROXY='http://xxxx'"
  image: gcr.io/kaniko-project/executor:v1.7.0
  imagePullPolicy: IfNotPresent
  name: kaniko
  volumeMounts:
  - mountPath: /kaniko/.docker
    name: secret
  - mountPath: /workspace
    name: code
```

Here are some logs, maybe useful:
Same issue here:
Docker works fine (though it requires privileged mode).
Hello everyone! I found a solution here: https://stackoverflow.com/questions/67748472/can-kaniko-take-snapshots-by-each-stage-not-each-run-or-copy-operation - adding the --single-snapshot option to /kaniko/executor.
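In the pod-spec style pasted earlier in this thread, that suggestion amounts to adding the flag to the executor's args; paths and tags below are placeholders. With --single-snapshot, kaniko takes one snapshot at the end of the build instead of one per RUN/COPY instruction:

```yaml
# Args fragment (sketch): take a single snapshot at the end of the build.
- args:
  - --dockerfile=/workspace/Dockerfile
  - --context=dir:///workspace/
  - --destination=registry.example.com/myapp:1.0.0
  - --single-snapshot
```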
Same for me too
If it doesn't work, then you may try adding --use-new-run and --snapshot-mode=redo.
I have the same issue. Is it a disk size issue?
I see this answer getting lost in this thread, but it fixed the issue for me. Just pass this flag to Kaniko: --compressed-caching=false
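A sketch of passing that flag in the same args-list style used earlier in this thread; it requires a kaniko release that supports --compressed-caching, and the destination and paths are placeholders:

```yaml
# Args fragment (sketch): keep layer caching but disable compression of
# cached layers; the comment above reports this avoided the freeze.
- args:
  - --dockerfile=/workspace/Dockerfile
  - --context=dir:///workspace/
  - --destination=registry.example.com/myapp:1.0.0
  - --cache=true
  - --compressed-caching=false
```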
It is not available in the Skaffold schema for Kaniko, so I am trying to understand the root cause of this issue.
Same here.
Adding …
I had this problem when trying to install Terraform in an Alpine Linux image with the recommendations from this page: https://www.hashicorp.com/blog/installing-hashicorp-tools-in-alpine-linux-containers However, the …
It seems that GitLab Kubernetes Runner pods may appear stuck when the … So the issue for us is not about performance at all, but rather about memory usage.
Actual behavior
While building an image using gcr.io/kaniko-project/executor:debug in a GitLab CI runner hosted on Kubernetes (deployed with the Helm chart), the image build process freezes on Taking snapshot of full filesystem... until the runner times out (1 hr).
This behaviour is intermittent: for the same project, the image build stage sometimes works.
The issue arises with multi-stage as well as single-stage Dockerfiles.
Expected behavior
The image build should not freeze at Taking snapshot of full filesystem... and should succeed every time.
To Reproduce
As the behaviour is intermittent, I am not sure how it can be reproduced.
--cache flag
@tejal29