[question] buildctl prune seems to leak some disk space #1198

Closed
d-theus opened this issue Oct 11, 2019 · 8 comments

d-theus commented Oct 11, 2019

Hello,
I'm having some trouble getting buildctl prune --all --keep-storage N to actually keep disk usage around N megabytes build after build; runc-overlayfs usually just grows indefinitely.

I also understand that the way I use buildkit is probably not the way it's meant to be used. It looks like this:

It runs inside a Docker container with a volume mounted like so:

docker run -it --rm -v /data/app-xyz/build-data/buildkit:/var/lib/buildkit --privileged --entrypoint run.sh moby/buildkit

The script it runs (actually a snippet from a template for that script):

# get code from git, put it into .

buildkitd --debug &
# wait for the daemon to be ready before issuing any builds
until buildctl debug workers >/dev/null 2>&1; do sleep 1s; done

buildctl \
  build \
  --frontend dockerfile.v0 \
  --local context=. \
  --local dockerfile=. \
  --export-cache type=registry,ref=<%= img %>:buildcache \
  --output type=image,name=<%= img %>:<%= hash %>,push=true

buildctl \
  prune \
  --all \
  --keep-storage 8000

pkill buildkitd

After some iterations the cache grows far beyond 8 gigabytes; runc-overlayfs took more than 20 gigabytes when I last checked. The output of the prune command says something like:

ID									RECLAIMABLE	SIZE	LAST ACCESSED
f6iidtphi9hajzs079ll63nd8*                                             	true       	4.17kB
...
sha256:badfbcebf7f868b2dc0e4b1aa21db05bcd5cb2be0afb314ac196a0f51f7b04ed	true       	148.95MB
Total:	6.41GB

so I assume it does collect something.

But if I manually run

buildctl prune --all

it reclaims some space, and I would expect it to clean up all snapshots and content, but:

du -sh /data/app-xyz/build-data/buildkit/runc-overlayfs/*
368K	/data/app-xyz/build-data/buildkit/runc-overlayfs/containerdmeta.db
32K	/data/app-xyz/build-data/buildkit/runc-overlayfs/content
4.0K	/data/app-xyz/build-data/buildkit/runc-overlayfs/executor
5.9M	/data/app-xyz/build-data/buildkit/runc-overlayfs/metadata.db
2.8G	/data/app-xyz/build-data/buildkit/runc-overlayfs/snapshots   <<<<<
4.0K	/data/app-xyz/build-data/buildkit/runc-overlayfs/workerid

buildctl du always (well, after some builds) shows far less disk usage than there actually is.

What am I missing? I also tried running sync after pkill buildkitd and waiting before exit, as it seemed like some .db files weren't flushed to disk, but apparently that is not the case.

Thank you!

tonistiigi (Member) commented

Can you put together a reproducer that shows this behavior? It is possible this is addressed by #1176, but I would need a reproducer to be sure.

d-theus commented Oct 12, 2019

d-theus commented Oct 15, 2019

My mistake. prune is sort of an async command: it exits before anything actually gets collected. So I should wait before killing the daemon.
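
Something along these lines works for me at the end of the script above (the polling loop is just a sketch; comparing the last line of the buildctl du output is an assumption about its format, and a plain fixed sleep would do too):

buildctl prune --all --keep-storage 8000

# sketch: wait until the reported usage stops changing before stopping the daemon
prev=""
while true; do
  cur=$(buildctl du 2>/dev/null | tail -n 1)
  [ "$cur" = "$prev" ] && break
  prev="$cur"
  sleep 5
done

pkill buildkitd
wait   # give buildkitd time to shut down and flush its .db files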

d-theus closed this as completed Oct 15, 2019

d-theus commented Oct 29, 2019

@tonistiigi
I've been able to reproduce some of the leak. It happens when an image export is involved. E.g.:

          buildctl \
            build \
            --frontend dockerfile.v0 \
            --local context=. \
            --local dockerfile=. \
            --output type=docker,name=build > /var/ctr

as well as with type=registry.

I mean, when I later run buildctl prune --all, there are some snapshots left on disk while buildctl du shows 0.

At the same time, this does not leak:

          buildctl \
            build \
            --frontend dockerfile.v0 \
            --local context=. \
            --local dockerfile=. 

No (dangling) snapshots left in this case.

This occurs with any Dockerfile and context.
I'm using the Docker image moby/buildkit:v0.6.2 (tried master, same result) with the OCI worker.
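
In case it helps, a rough standalone reproducer along these lines shows it for me (the Dockerfile contents, image name and loop count here are placeholders, not something I ran verbatim):

buildkitd --debug &
until buildctl debug workers >/dev/null 2>&1; do sleep 1s; done

# a throwaway context with a layer big enough to make the leak visible
mkdir -p /tmp/ctx
cat > /tmp/ctx/Dockerfile <<'EOF'
FROM alpine
RUN dd if=/dev/urandom of=/blob bs=1M count=100
EOF

for i in 1 2 3; do
  buildctl build \
    --frontend dockerfile.v0 \
    --local context=/tmp/ctx \
    --local dockerfile=/tmp/ctx \
    --output type=docker,name=repro > /dev/null
  buildctl prune --all
  sleep 10   # give the async prune time to finish
done

buildctl du                                         # reports (almost) nothing
du -sh /var/lib/buildkit/runc-overlayfs/snapshots   # keeps growing when the export leaks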

d-theus reopened this Oct 29, 2019

d-theus commented Oct 31, 2019

Looks like it was fixed by one of the latest commits.
Although, to actually collect the snapshots I need to restart the daemon; then it collects the old snapshots.
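
Roughly, what I have to do looks like this (same commands as the script earlier in the thread, just restarting the daemon before pruning again):

pkill buildkitd
wait

buildkitd --debug &
until buildctl debug workers >/dev/null 2>&1; do sleep 1s; done

# only after the restart does this pick up the old snapshots
buildctl prune --all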

nalbury-handy commented

We're seeing this behavior as well, even after restarting the daemon on v0.6.3. Happy to provide more information, but it's almost exactly the situation described above:

  • df -h |grep /var/lib/buildkit shows 32GB used
  • buildctl prune --all
  • buildctl du shows 0GB
  • After about 30min df -h |grep /var/lib/buildkit shows 23GB used

It does drop about 9GB asynchronously after running the prune command, but it seems to get stuck on something and stops pruning. After running du on the filesystem, it looks like about 22GB of that usage is in /var/lib/buildkit/runc-overlayfs/snapshots.
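
For reference, the numbers above come from roughly this (paths assume the default runc-overlayfs worker root):

df -h | grep /var/lib/buildkit                      # filesystem-level usage (32GB, later 23GB)
buildctl du                                         # what buildkit thinks it is using (~0GB)
du -sh /var/lib/buildkit/runc-overlayfs/snapshots   # where the leftover ~22GB actually lives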

Seems related to #1174 as well, but there it was solved by using the --all flag, which doesn't help in our case.

For context, we're running the daemon(s) in a StatefulSet in Kubernetes, with Rook-managed RBD mounts for /var/lib/buildkit.

Curious if this is expected behavior and we just need to clean that up out of band, or if we're missing something?
Thanks in advance!

nalbury-handy commented

Quick follow-up: we've started using the "master" image tag from Docker Hub and the issue appears to be resolved, so it looks like this was patched somewhere between 0.6.3 and master. Any chance there will be a new release in the near future?

Thanks

jsravn commented Jul 20, 2020

This is fixed in 0.7.x afaict. I had a similar issue open (#1385).

d-theus closed this as completed Jul 20, 2020