
Unused unpacked snapshot left in content store after nerdctl system prune #2372

Closed
ginglis13 opened this issue Jul 13, 2023 · 11 comments · Fixed by #2374
Labels
kind/unconfirmed-bug-claim Unconfirmed bug claim

Comments


ginglis13 commented Jul 13, 2023

Description

With the changes introduced in finch#461, we will default to building images with the type=image format rather than the type=docker format. This change has exposed recurring test failures in finch image save and finch image load in both the finch and finch-core projects; the same issue reproduces with nerdctl image save and nerdctl image load. It occurs when an image (A) is built with no tags and with type=image, the image and builder cache are pruned, and we then attempt to pull and save an image (B) which is (A)'s base image. While the content for (A) has effectively been removed, the unpacked snapshot of (A)'s base layer remains. When we pull (B), only its manifest, config, and index are fetched; the actual layer content is not, resulting in

$ finch save ...
FATA[0000] failed to get reader: content digest sha256:XXX: not found
FATA[0000] exit status 1

Steps to reproduce the issue

Note: these images are arm64. Digests will vary on other platforms, but the process below should reproduce the issue the same way.

  1. Build an image without tags

nerdctl build --no-cache -f Dockerfile.with-build-arg --progress=plain --build-arg VERSION=3.13 .

# Dockerfile.with-build-arg
ARG VERSION=latest
FROM public.ecr.aws/docker/library/alpine:${VERSION}

In the output, see the following sha that represents the base layer of the image:

#5 sha256:25f523f0e93b2b5fa676c15d91b90f08ee4de7a160874e6c52ea452929d5a7cc 2.72MB / 2.72MB 0.3s done

We also see the output:

#4 exporting to image
#4 exporting layers done
#4 exporting manifest sha256:0f5d034dfccaf2b8cf5d1901a356f836945526bec6b5a33c055567896dc23ee1 done
#4 exporting config sha256:e2730a754813a28b0f90c47d888aafc6c53ec1bb87da60881ee7fc4e4a99e801 done
#4 naming to <none>@sha256:0f5d034dfccaf2b8cf5d1901a356f836945526bec6b5a33c055567896dc23ee1 done
#4 unpacking to <none>@sha256:0f5d034dfccaf2b8cf5d1901a356f836945526bec6b5a33c055567896dc23

We can see our image looks... weird, since we didn't tag it:

$ finch images
REPOSITORY    TAG       IMAGE ID        CREATED          PLATFORM       SIZE       BLOB SIZE
<none>        <none>    0f5d034dfcca    3 minutes ago    linux/arm64    5.7 MiB    2.6 MiB
  2. Use ctr to inspect content
$ sudo ctr content ls | grep sha256:0f5d034d # <- manifest of image we just built
sha256:0f5d034dfccaf2b8cf5d1901a356f836945526bec6b5a33c055567896dc23ee1 502B    3 hours         containerd.io/gc.ref.content.0=sha256:e2730a754813a28b0f90c47d888aafc6c53ec1bb87da60881ee7fc4e4a99e801
$ sudo ctr content ls | grep sha256:e2730a75
sha256:0f5d034dfccaf2b8cf5d1901a356f836945526bec6b5a33c055567896dc23ee1 502B    3 hours         containerd.io/gc.ref.content.0=sha256:e2730a754813a28b0f90c47d888aafc6c53ec1bb87da60881ee7fc4e4a99e801
sha256:e2730a754813a28b0f90c47d888aafc6c53ec1bb87da60881ee7fc4e4a99e801 907B    3 hours         containerd.io/gc.ref.snapshot.overlayfs=sha256:de51348d431b23f0be552f83fe8efd4504db8a384d5d6efc9e01550958e09fd5
$ sudo ctr content ls | grep sha256:469b
sha256:469b6e04ee185740477efa44ed5bdd64a07bbdd6c7e5f5d169e540889597b911 1.638kB 7 minutes       containerd.io/distribution.source.public.ecr.aws=docker/library/alpine
$ sudo ctr content ls | grep sha256:25f523f0e # <- THE LAYER CONTENT
sha256:25f523f0e93b2b5fa676c15d91b90f08ee4de7a160874e6c52ea452929d5a7cc 2.722MB 7 minutes       buildkit.io/blob/annotation.containerd.io/uncompressed=sha256:de51348d431b23f0be552f83fe8efd4504db8a384d5d6efc9e01550958e09fd5,buildkit.io/blob/mediatype=application/vnd.docker.image.rootfs.diff.tar.gzip,containerd.io/gc.ref.content.blob-sha256:25f523f0e93b2b5fa676c15d91b90f08ee4de7a160874e6c52ea452929d5a7cc=sha256:25f523f0e93b2b5fa676c15d91b90f08ee4de7a160874e6c52ea452929d5a7cc

Remove the image you just built and inspect content again

nerdctl rmi 0f5d034dfcca # <- image id from step 1

Now repeat step 2 and inspect content: all content is still there. This is because buildkit cached the content, and the unmounted snapshot remains.
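This survival can be modeled as reachability-based garbage collection: containerd only sweeps content that is unreachable from a root, and the build cache keeps its blobs alive. A minimal sketch with shortened digests — the "buildkit-cache" root standing in for the cache hold is an assumption of this model, not containerd's exact mechanism:

```python
# Minimal model of label-driven GC: content is swept only if unreachable
# from a root. Node names are shortened digests; the "buildkit-cache" root
# standing in for buildkit's cache hold is an assumption of this sketch.

def reachable(roots, refs):
    """Walk gc.ref-style edges from the roots; return every live node."""
    live, stack = set(), list(roots)
    while stack:
        node = stack.pop()
        if node not in live:
            live.add(node)
            stack.extend(refs.get(node, []))
    return live

all_blobs = {"manifest-0f5d034d", "config-e2730a75",
             "layer-25f523f0", "snapshot-de51348d"}
refs = {
    "buildkit-cache": sorted(all_blobs),       # cache references everything
    "manifest-0f5d034d": ["config-e2730a75"],  # gc.ref.content.0
    "config-e2730a75": ["snapshot-de51348d"],  # gc.ref.snapshot.overlayfs
}

# After `rmi` the image root is gone, but the build-cache root remains:
live = reachable({"buildkit-cache"}, refs)
print(sorted(all_blobs - live))  # -> []  (nothing is collected)
```

This is why `rmi` alone leaves everything behind: removing the image only removes one root, and every blob is still reachable through the cache.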

  3. Prune “everything”

$ nerdctl system prune --all -f

This results in

Deleted build cache objects:
x764neru5picibs7cb5yc4f3x
shizatx98cbbpy4ar4gojbpfc
kz6wnfl0pssu7dbwghx11owe6

Untagged: public.ecr.aws/docker/library/alpine:3.13
deleted: sha256:25f523f0e93b2b5fa676c15d91b90f08ee4de7a160874e6c52ea452929d5a7cc

and inspect content again:

$ sudo ctr content ls | grep sha256:25f523f0e # <- the actual base layer: gone
$  sudo ctr content ls | grep sha256:469b # <- the ref sha: gone
$ sudo ctr content ls | grep sha256:e2730a75 # <- the built config sha...
sha256:0f5d034dfccaf2b8cf5d1901a356f836945526bec6b5a33c055567896dc23ee1 502B    3 hours         containerd.io/gc.ref.content.0=sha256:e2730a754813a28b0f90c47d888aafc6c53ec1bb87da60881ee7fc4e4a99e801
sha256:e2730a754813a28b0f90c47d888aafc6c53ec1bb87da60881ee7fc4e4a99e801 907B    3 hours         containerd.io/gc.ref.snapshot.overlayfs=sha256:de51348d431b23f0be552f83fe8efd4504db8a384d5d6efc9e01550958e09fd5
$ sudo ctr content ls | grep sha256:0f5d034d # <- the built manifest sha...
sha256:0f5d034dfccaf2b8cf5d1901a356f836945526bec6b5a33c055567896dc23ee1 502B    3 hours         containerd.io/gc.ref.content.0=sha256:e2730a754813a28b0f90c47d888aafc6c53ec1bb87da60881ee7fc4e4a99e801

Check out the remaining content:

$ cat /var/lib/containerd/io.containerd.content.v1.content/blobs/sha256/0f5d034dfccaf2b8cf5d1901a356f836945526bec6b5a33c055567896dc23ee1
{
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "schemaVersion": 2,
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "digest": "sha256:e2730a754813a28b0f90c47d888aafc6c53ec1bb87da60881ee7fc4e4a99e801",
    "size": 907
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "digest": "sha256:25f523f0e93b2b5fa676c15d91b90f08ee4de7a160874e6c52ea452929d5a7cc",
      "size": 2722126
    }
  ]
}
$ cat /var/lib/containerd/io.containerd.content.v1.content/blobs/sha256/e2730a754813a28b0f90c47d888aafc6c53ec1bb87da60881ee7fc4e4a99e801
{"architecture":"arm64","config":{"Env":["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],"Cmd":["/bin/sh"],"OnBuild":null},"created":"2022-11-10T20:39:56.601468255Z","history":[{"created":"2022-11-10T20:39:56.523308612Z","created_by":"/bin/sh -c #(nop) ADD file:f23c059b4312458fbf0fc018d4695f36157a3eb6e5a83167912a39f9a738f4eb in / "},{"created":"2022-11-10T20:39:56.601468255Z","created_by":"/bin/sh -c #(nop)  CMD [\"/bin/sh\"]","empty_layer":true}],"moby.buildkit.buildinfo.v1":"eyJmcm9udGVuZCI6ImRvY2tlcmZpbGUudjAiLCJzb3VyY2VzIjpbeyJ0eXBlIjoiZG9ja2VyLWltYWdlIiwicmVmIjoicHVibGljLmVjci5hd3MvZG9ja2VyL2xpYnJhcnkvYWxwaW5lOjMuMTMiLCJwaW4iOiJzaGEyNTY6NDY5YjZlMDRlZTE4NTc0MDQ3N2VmYTQ0ZWQ1YmRkNjRhMDdiYmRkNmM3ZTVmNWQxNjllNTQwODg5NTk3YjkxMSJ9XX0=","os":"linux","rootfs":{"type":"layers","diff_ids":["sha256:de51348d431b23f0be552f83fe8efd4504db8a384d5d6efc9e01550958e09fd5"]},"variant":"v8"}
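The leftover manifest can also be checked against the blob store to see exactly which referenced blobs are missing. A small diagnostic sketch — the digests are shortened and the temp-dir blob store is a hypothetical stand-in for /var/lib/containerd/io.containerd.content.v1.content/blobs/sha256, set up to mirror the state above (config present, layer blob gone):

```python
# Sketch: parse an image manifest and list which referenced blobs are absent
# from a blob directory. Digests and the temp-dir store are hypothetical
# stand-ins mirroring the observed state (config present, layer gone).
import os
import tempfile

def missing_blobs(manifest, blob_dir):
    digests = [manifest["config"]["digest"]]
    digests += [layer["digest"] for layer in manifest.get("layers", [])]
    return [d for d in digests
            if not os.path.exists(os.path.join(blob_dir, d.split(":", 1)[1]))]

manifest = {
    "config": {"digest": "sha256:e2730a75"},
    "layers": [{"digest": "sha256:25f523f0"}],
}

blob_dir = tempfile.mkdtemp()
open(os.path.join(blob_dir, "e2730a75"), "w").close()  # config blob survives

missing = missing_blobs(manifest, blob_dir)
print(missing)  # -> ['sha256:25f523f0'], the digest `save` later fails on
```

On a real host, pointing `blob_dir` at the containerd blob directory would flag the same layer digest that the save error reports.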

Note sha256:de51348d43, the SHA of the unpacked layer for the alpine:3.13 image:

...
"rootfs":{"type":"layers","diff_ids":["sha256:de51348d431b23f0be552f83fe8efd4504db8a384d5d6efc9e01550958e09fd5"]},"variant":"v8"}
...

We can still find that in sudo ctr snapshot ls:

...
de51348d431b23f0be552f83fe8efd4504db8a384d5d6efc9e01550958e09fd5
...
  4. Pull public.ecr.aws/docker/library/alpine:3.13, try to save it
$ nerdctl pull public.ecr.aws/docker/library/alpine:3.13
public.ecr.aws/docker/library/alpine:3.13:                                        resolved       |++++++++++++++++++++++++++++++++++++++|
index-sha256:469b6e04ee185740477efa44ed5bdd64a07bbdd6c7e5f5d169e540889597b911:    done           |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:448028a5480dcea5eae6ed1442fb85a44921c41972a405987883aee3abdaf410: done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:1384a14f8577009b729eb1ef6aabe2729ae5a35e07d4d6b84a6dd9b841a818e3:   done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 5.2 s 
$ nerdctl images
REPOSITORY                              TAG     IMAGE ID        CREATED           PLATFORM          SIZE       BLOB SIZE
public.ecr.aws/docker/library/alpine    3.13    469b6e04ee18    10 minutes ago    linux/arm64/v8    5.7 MiB    2.6 MiB
$ nerdctl save -o myfake.tar public.ecr.aws/docker/library/alpine:3.13
FATA[0000] failed to get reader: content digest sha256:25f523f0e93b2b5fa676c15d91b90f08ee4de7a160874e6c52ea452929d5a7cc: not found
FATA[0000] exit status 1

You can see that even though the actual base layer no longer exists in the content store (only its unpacked snapshot remains), pulling the image does not fetch that layer back into the containerd content store; we only pull the index, manifest, and config. Why? Because the snapshot for that layer, sha256:de51348d43, has already been unpacked and committed, nerdctl/containerd considers the layer content present.
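The pull behavior described above can be sketched as a simplified model — this is not containerd's actual pull code, and the digests are shortened — where metadata blobs are always fetched but a layer download is skipped when a committed snapshot for its diff ID already exists:

```python
# Simplified model of the observed pull behavior (not containerd's actual
# logic): metadata is always fetched, but a layer download is skipped when
# a committed snapshot for its diff ID already exists -- which is exactly
# how a pull can succeed while the layer blob stays missing.

def pull(image, committed_snapshots, content_store):
    for blob in ("index", "manifest", "config"):
        content_store.add(image[blob])
    for layer in image["layers"]:
        if layer["diff_id"] in committed_snapshots:
            continue  # snapshot already unpacked: blob is never re-fetched
        content_store.add(layer["blob"])

store = set()
alpine = {
    "index": "sha256:469b6e04", "manifest": "sha256:448028a5",
    "config": "sha256:1384a14f",
    "layers": [{"blob": "sha256:25f523f0", "diff_id": "sha256:de51348d"}],
}

# The snapshot left behind by the pruned build short-circuits the download:
pull(alpine, committed_snapshots={"sha256:de51348d"}, content_store=store)
print("sha256:25f523f0" in store)  # -> False: save will fail on this digest
```

Under this model, `nerdctl save` then fails because export reads layer blobs from the content store, and the blob was never restored.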

Describe the results you received and expected

I expected either

  1. nerdctl system prune --all -f to remove the snapshot sha256:de51348d43 that was unpacked during build

Or

  2. pulling an image with a missing layer to still pull that layer or recreate it in the content store, even if the contents of the layer have already been unpacked as a committed containerd snapshot

What version of nerdctl are you using?

v1.4.0

Are you using a variant of nerdctl? (e.g., Rancher Desktop)

Finch/Lima, buildkit

Host information

Finch VM https://github.com/runfinch/finch

@ginglis13 ginglis13 added the kind/unconfirmed-bug-claim Unconfirmed bug claim label Jul 13, 2023
ginglis13 commented Jul 13, 2023

It looks to me like the issue is that when we nerdctl system prune --all -f and then check the content of the image we built in this manner (untagged, with type=image), the index and manifest content for that image are for some reason not removed. From above:

and inspect content again:

$ sudo ctr content ls | grep sha256:25f523f0e # <- the actual base layer: gone
$  sudo ctr content ls | grep sha256:469b # <- the ref sha: gone
$ sudo ctr content ls | grep sha256:e2730a75 # <- the built config sha...
sha256:0f5d034dfccaf2b8cf5d1901a356f836945526bec6b5a33c055567896dc23ee1 502B    3 hours         containerd.io/gc.ref.content.0=sha256:e2730a754813a28b0f90c47d888aafc6c53ec1bb87da60881ee7fc4e4a99e801
sha256:e2730a754813a28b0f90c47d888aafc6c53ec1bb87da60881ee7fc4e4a99e801 907B    3 hours         containerd.io/gc.ref.snapshot.overlayfs=sha256:de51348d431b23f0be552f83fe8efd4504db8a384d5d6efc9e01550958e09fd5
$ sudo ctr content ls | grep sha256:0f5d034d # <- the built manifest sha...
sha256:0f5d034dfccaf2b8cf5d1901a356f836945526bec6b5a33c055567896dc23ee1 502B    3 hours         containerd.io/gc.ref.content.0=sha256:e2730a754813a28b0f90c47d888aafc6c53ec1bb87da60881ee7fc4e4a99e801

The image config carries the garbage-collection label for the snapshot (containerd.io/gc.ref.snapshot.overlayfs=sha256:de5134), but the config content itself does not get removed on nerdctl system prune --all -f.

Additionally, the manifest for the image we just built is missing the containerd.io/gc.ref.content.config label.
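The label chain described above can be illustrated with a tiny mark-and-sweep sketch (shortened, hypothetical node names; this models the label graph, not containerd itself): as long as the leftover config carries the snapshot reference, the snapshot stays live; only once the manifest and config are actually removed would the snapshot become collectable:

```python
# Sketch of the gc.ref chain left behind after prune. Node names are
# shortened digests; this models the label graph, not containerd itself.

def sweep(nodes, roots, refs):
    """Return the nodes a mark-and-sweep GC would delete."""
    live, stack = set(), [r for r in roots if r in nodes]
    while stack:
        n = stack.pop()
        if n not in live:
            live.add(n)
            stack.extend(c for c in refs.get(n, ()) if c in nodes)
    return nodes - live

refs = {"manifest-0f5d034d": ["config-e2730a75"],   # gc.ref.content.0
        "config-e2730a75": ["snapshot-de51348d"]}   # gc.ref.snapshot.overlayfs

# Observed state: the manifest lingers after prune (treated as a root here),
# so the snapshot it transitively points at survives every GC pass.
leftover = sweep({"manifest-0f5d034d", "config-e2730a75", "snapshot-de51348d"},
                 roots={"manifest-0f5d034d"}, refs=refs)
print(leftover)  # -> set()

# Had prune removed the manifest and config, the snapshot would be orphaned:
orphaned = sweep({"snapshot-de51348d"}, roots=set(), refs=refs)
print(orphaned)  # -> {'snapshot-de51348d'}
```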

ginglis13 commented Jul 13, 2023

This is a bug in buildkit v0.11.x. It has been patched in moby/buildkit#3972, which shipped in buildkit v0.12.0, released just yesterday: https://github.com/moby/buildkit/releases/tag/v0.12.0

@rptaylor

Is there any convenient workaround for this?

@apostasie

Is there any convenient workaround for this?

If your specific manifestation of the issue comes from buildkit, as documented by @ginglis13, then you should be fine with buildkit >= v0.12.

On the other hand, the same symptom (content digest not found) can stem from a variety of sources, probably all tied to an unaddressed containerd issue.

We have patched a number of cases, and we now forcefully ensure content is there on specific operations (save, commit, tag), but there certainly are more cases we have not covered and the underlying problem is still very much here.

#3513 has some context

Unless you think your issue is exactly the same scenario as reported in this ticket here, I suggest you open a new ticket with clear steps to reproduce and we can look into it.

Hope that helps.


rptaylor commented Jan 31, 2025

Thanks for the extra info @apostasie !
The symptoms appear exactly the same:

$ sudo /usr/local/bin/nerdctl -n k8s.io image save -o /tmp/quay.io_coreos_etcd_v3.5.10.tar quay.io/coreos/etcd:v3.5.10
FATA[0000] failed to get reader: content digest sha256:63b450eae87c42ba59c0fa815ad0e5b8cb6fb76a039cc341dbff6e744fa77a77: not found

$  sudo /usr/local/bin/ctr -n k8s.io image export test.tar  quay.io/coreos/etcd:v3.5.10 
ctr: failed to get reader: content digest sha256:63b450eae87c42ba59c0fa815ad0e5b8cb6fb76a039cc341dbff6e744fa77a77: not found

So far it has happened reproducibly on every Kubespray cluster I try to upgrade (to v2.23.3), specifically with this etcd v3.5.10 image. I tried pruning images and deleting and re-pulling this one, but it didn't help. Maybe the issue is linked to the etcd v3.5.6 image currently in use; I'm not sure how the images are built. To work around it, I pull the required image on a different system and then import it, so the upgrade can proceed.

containerd version: v1.7.13 7c3aca7a610df76212171d200ca3811ff6096eb8

Just mentioning it in case this would be a useful debugging scenario; otherwise I'll just do the workaround on all our clusters and continue upgrading them.

@apostasie

@rptaylor which version of nerdctl are you using (output of nerdctl version and nerdctl info)?


rptaylor commented Jan 31, 2025

$ sudo /usr/local/bin/nerdctl version
WARN[0000] unable to determine buildctl version: exec: "buildctl": executable file not found in $PATH 
WARN[0000] unable to determine runc version: exec: "runc": executable file not found in $PATH 
Client:
 Version:	v1.4.0
 OS/Arch:	linux/amd64
 Git commit:	7e8114a82da342cdbec9a518c5c6a1cce58105e9
 buildctl:
  Version:	

Server:
 containerd:
  Version:	v1.7.13
  GitCommit:	7c3aca7a610df76212171d200ca3811ff6096eb8
 runc:
  Version:	

$ sudo /usr/local/bin/nerdctl info
Client:
 Namespace:	k8s.io
 Debug Mode:	false

Server:
 Server Version: v1.7.13
 Storage Driver: overlayfs
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 1
 Plugins:
  Log: fluentd journald json-file syslog
  Storage: native overlayfs
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.18.0-553.27.1.el8_10.x86_64
 Operating System: AlmaLinux 8.10 (Cerulean Leopard)
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 5.798GiB
 Name: cluster-dev-k8s-node-p03
 ID: 7dadd0f6-7251-461b-9501-f65432a460f8

@apostasie

nerdctl v1.4.0 is quite old, and unsupported.

Fixes for the problems discussed have been shipped with v2.0.

Is it possible for you to update to nerdctl v2.0.3 and try again?

@apostasie

The kernel is really old too (it's from 2018, right?). Is that an Oracle-maintained kernel? Alma is the new CentOS, right?

It should not be a problem (and definitely orthogonal to the issues here), but I wanted to point out that we do not have integration testing for that.

@rptaylor

No, this issue blocks the upgrade to a newer cluster version (including containerd, nerdctl versions, all integrated together in the Kubespray version) because it prevents getting the newer etcd image. After I do the workaround the upgrade can proceed to newer versions.

This kernel is from October 2024, just a few months ago: https://almalinux.pkgs.org/8/almalinux-baseos-x86_64/kernel-4.18.0-553.27.1.el8_10.x86_64.rpm.html

Anyway that's okay, thanks for looking into it!

@apostasie

No, this issue blocks the upgrade to a newer cluster version (including containerd, nerdctl versions, all integrated together in the Kubespray version) because it prevents getting the newer etcd image. After I do the workaround the upgrade can proceed to newer versions.

I see.

Well, the likely silver lining for you is that nerdctl v2 should have a patch for your issue with save (#3435), so future you should not have the same problem...

This kernel is from October 2024, just a few months ago: https://almalinux.pkgs.org/8/almalinux-baseos-x86_64/kernel-4.18.0-553.27.1.el8_10.x86_64.rpm.html

Thanks for the info.
Reading about Alma.
Yeah, looks like this kernel is a long-lived branch from a heavily modified 4.18.

Off the top of my head, you may have issues with recursive read-only bind mounts, but I guess for the most part it should be fine?

Keep us posted if you get a chance once you have completed your upgrade.

Cheers.
