ERROR: content digest sha256:***: not found #3809

Closed
pjonsson opened this issue Apr 20, 2023 · 9 comments · Fixed by #4210

Comments

@pjonsson

Using Docker 23.0.4 packages installed on a stock Ubuntu 22.04. docker buildx ls shows the builder as running BuildKit 0.11.5.

After switching from the default builder to the docker-container builder and adding attestation information, the CI machines consistently fail in this way:

...
#22 1.792 [0001]  WARN unable to access path="/run/src/core/sbom/usr/share/locale/zh_TW/LC_MESSAGES/coreutils.mo": lstat /run/src/core/sbom/usr/share/locale/zh_TW/LC_MESSAGES/coreutils.mo: no such file or directory
#22 DONE 5.2s
#23 preparing layers for inline cache
#23 DONE 18.7s
#24 exporting to image
#24 exporting layers done
#24 exporting manifest sha256:9aa27b18a4c0ab200c6b440592d71551ac1b7a8d5ed5af7051fe9126b826c297
#24 exporting manifest sha256:9aa27b18a4c0ab200c6b440592d71551ac1b7a8d5ed5af7051fe9126b826c297 0.4s done
#24 exporting config sha256:dd807f62da826d0286521aa5702d438e933b47042ecf1f0f51986ad81e04eff7 0.1s done
#24 ERROR: content digest sha256:2ab09b027e7f3a0c2e8bb1944ac46de38cebab7145f0bd6effebfe5492c818b6: not found
------
 > exporting to image:
------
ERROR: failed to solve: content digest sha256:2ab09b027e7f3a0c2e8bb1944ac46de38cebab7145f0bd6effebfe5492c818b6: not found
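To compare failures across builds, the missing digest can be pulled out of a saved log. The following is a minimal sketch, assuming the --progress plain output was captured to a file; the function name missing_digests and the file name build.log are illustrative and not part of any tool.

```shell
# Print each distinct digest that a BuildKit log (read from stdin)
# reports as "content digest sha256:...: not found".
missing_digests() {
  grep -oE 'content digest sha256:[0-9a-f]{64}: not found' \
    | grep -oE 'sha256:[0-9a-f]{64}' \
    | sort -u
}

# Usage, assuming the build output was saved to build.log (hypothetical name):
#   missing_digests < build.log
```

Running it over the logs of two failing images shows quickly whether both are missing the same blob.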

This happens for multiple images, but here are the commands for the fastest one:

#!/bin/bash -e

export CI_RO_USER=ro-user
export CI_RO_PW=ro-pw
export CI_RW_USER=rw-user
export CI_RW_PW=rw-pw
export CI_REGISTRY=registry.name
export CI_REGISTRY_PATH=basepath
export IMAGE_NAME=postgres

WORK_DIRECTORY=$(mktemp -d)
echo "Using $WORK_DIRECTORY as directory for attempt"

cd $WORK_DIRECTORY
git clone --depth 1 --filter=blob:none --no-checkout https://github.com/docker-library/postgres
cd postgres
git sparse-checkout set 13/bullseye
git checkout master
perl -pi -e 's#apt-get update;#export DEBIAN_FRONTEND=noninteractive; apt-get update; apt-get -y upgrade;#g' 13/bullseye/Dockerfile
perl -pi -e 's#apt/ bullseye-pgdg main#apt/ jammy-pgdg main#g' 13/bullseye/Dockerfile
perl -pi -e 's#ENV PG_VERSION (.+)$#ENV PG_VERSION 13\.10-1\.pgdg22\.04\+1#g' 13/bullseye/Dockerfile
cd 13/bullseye
echo $CI_RO_PW | docker --config .docker_temp_config login $CI_REGISTRY -u $CI_RO_USER --password-stdin
docker --config .docker_temp_config buildx create --name repro-builder --driver docker-container --bootstrap
echo $CI_RW_PW | docker --config .docker_temp_config login $CI_REGISTRY -u $CI_RW_USER --password-stdin
docker --config .docker_temp_config build --push --progress plain --build-context debian:bullseye-slim=docker-image://ubuntu:22.04 --pull --no-cache --builder repro-builder --attest type=provenance,mode=max --attest type=sbom --build-arg BUILDKIT_INLINE_CACHE=1 --cache-from $CI_REGISTRY/$CI_REGISTRY_PATH/$IMAGE_NAME:13-bullseye --tag $CI_REGISTRY/$CI_REGISTRY_PATH/$IMAGE_NAME:13-bullseye --file Dockerfile .

The build is driven from a Makefile invoked with make -j2, so several images are built concurrently using the same docker-container builder. Calling make without -j still reproduces the issue in the CI. The same images seem to fail each time, while other images are built and pushed to the registry with the same docker command line (modulo image names and revisions).

I cannot use the read-write-user on my local machine, but with just the read-only user the problem does not reproduce on my local machine (also Ubuntu 22.04 with Docker 23.0.4 and BuildKit 0.11.5 in the docker-container). My local machine is quite a lot faster than the CI machines though.

My understanding is that #3566 should be a part of 0.11.5, so I'm not sure what problem I'm seeing.

The issue is definitely more common for small/slim images; most of the big images are pushed to the registry. This could just be a consequence of the Makefile structure, since the big targets are listed before the slim targets in the dependencies.

@crazy-max
Member

> Using Docker 23.0.4 packages installed on a stock Ubuntu 22.04. docker buildx ls shows the builder as running BuildKit 0.11.5.

Docker 23 embeds BuildKit v0.10 atm. Can you show the output of docker buildx ls?

@pjonsson
Author

The reason we switched to using docker-container was for the attestation (which isn't a part of 0.10 according to my understanding), and that is also why the reproducer script passes an explicit --builder parameter.

NAME/NODE        DRIVER/ENDPOINT             STATUS  BUILDKIT PLATFORMS
repro-builder    docker-container                             
  repro-builder0 unix:///var/run/docker.sock running v0.11.5  linux/amd64, linux/amd64/v2, linux/386
default *        docker                                       
  default        default                     running 23.0.4   linux/amd64, linux/amd64/v2, linux/386
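When checking which BuildKit version a given builder actually runs, the listing above can be reduced to node/version pairs. This is a rough sketch that assumes the column layout shown in this thread (node rows are indented and carry the status in the third column); other buildx releases may format the table differently, and the function name buildkit_versions is made up for illustration.

```shell
# Print "<node> <buildkit-version>" for every running node found in
# `docker buildx ls` output read from stdin.
# Assumes node rows are indented and laid out as:
#   NODE  ENDPOINT  STATUS  BUILDKIT  PLATFORMS...
buildkit_versions() {
  awk '/^[ \t]/ && $3 == "running" { print $1, $4 }'
}

# Usage:
#   docker buildx ls | buildkit_versions
```

For the listing in this comment it separates the docker-container node (v0.11.5) from the default docker driver, whose version column reads 23.0.4.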

The reproducer script is cut and pasted from the commands run by the CI; I've only replaced paths and registry information with variables.

@pjonsson
Author

pjonsson commented Apr 20, 2023

I just discovered that the postgres reproducer in the ticket fails on the same sha256 hash that the node:16-slim image fails on (2ab09b027e7f3a0c2e8bb1944ac46de38cebab7145f0bd6effebfe5492c818b6).

The only commonality between the Dockerfiles for postgres and node:16-slim that I can see is the line FROM debian:bullseye-slim, and that is overridden by the --build-context debian:bullseye-slim=docker-image://ubuntu:22.04 in the docker build command.

The big images that do not fail also have a build context, though; it's just a different one (--build-context buildpack-deps:bullseye=docker-image://buildpack-deps:22.04).

Here's the clone part for node; the docker build command is identical to the command for postgres.

git clone --depth 1 --filter=blob:none --no-checkout https://github.com/nodejs/docker-node
cd docker-node
git sparse-checkout set 16/bullseye 16/bullseye-slim
git checkout main
cd 16/bullseye
<same docker build>

Edit: all images build if I change --build-context debian:bullseye-slim=docker-image://ubuntu:22.04 to --build-context debian:bullseye-slim=docker-image://buildpack-deps:22.04. I also got the error on a separate machine, and the same sha256 (2ab09b027e7f3a0c2e8bb1944ac46de38cebab7145f0bd6effebfe5492c818b6) is missing there. Hoping to learn something from that digest, I tried changing --push to -o type=oci,dest=/tmp/image-debug.tar, but the file ends up as 0 bytes since the content digest cannot be found.

@tonistiigi
Member

I don't understand this reproducer as afaik there is no way to have multiple user accounts for the same registry in Docker config.

Looks like it may be somewhat related to how tokens are handled in the registry for the images that are accessed. Can you provide a runnable reproducer that doesn't involve the custom registry setup?

@pjonsson
Author

The reproducer was a cut and paste of what the CI runs; I did not want to omit commands since I have no idea what could be causing the issue.

I had to add the big node image build, but here's a reproducer without a registry that triggers the issue every time I run it. Stock Ubuntu 22.04, Docker 23.0.4 installed with the packages from docker.com, and BuildKit 0.11.5 inside the docker-container builder.

#!/bin/bash -ex

WORK_DIRECTORY=$(mktemp -d)
echo "Using $WORK_DIRECTORY as directory for attempt"

cd $WORK_DIRECTORY
git clone --depth 1 --filter=blob:none --no-checkout https://github.com/docker-library/postgres
cd postgres
git sparse-checkout set 13/bullseye
git checkout master
perl -pi -e 's#apt-get update;#export DEBIAN_FRONTEND=noninteractive; apt-get update; apt-get -y upgrade;#g' 13/bullseye/Dockerfile
perl -pi -e 's#apt/ bullseye-pgdg main#apt/ jammy-pgdg main#g' 13/bullseye/Dockerfile
perl -pi -e 's#ENV PG_VERSION (.+)$#ENV PG_VERSION 13\.10-1\.pgdg22\.04\+1#g' 13/bullseye/Dockerfile
cd 13/bullseye
docker --config $WORK_DIRECTORY/.docker_temp_config buildx create --name repro-builder --driver docker-container --bootstrap
docker --config $WORK_DIRECTORY/.docker_temp_config build --output type=image,name=test-postgres --progress plain --build-context debian:bullseye-slim=docker-image://ubuntu:22.04 --pull --no-cache --builder repro-builder --attest type=provenance,mode=max --attest type=sbom --build-arg BUILDKIT_INLINE_CACHE=1 --file Dockerfile .
cd ../../..
git clone --depth 1 --filter=blob:none --no-checkout https://github.com/nodejs/docker-node
cd docker-node
git sparse-checkout set 16/bullseye 16/bullseye-slim
git checkout main
cd 16/bullseye
docker --config $WORK_DIRECTORY/.docker_temp_config build --output type=image,name=test-node-big --progress plain --build-context buildpack-deps:bullseye=docker-image://buildpack-deps:22.04 --pull --no-cache --builder repro-builder --attest type=provenance,mode=max --attest type=sbom --build-arg BUILDKIT_INLINE_CACHE=1 --file Dockerfile .
cd ../bullseye-slim
docker --config $WORK_DIRECTORY/.docker_temp_config build --output type=image,name=test-node --progress plain --build-context debian:bullseye-slim=docker-image://ubuntu:22.04 --pull --no-cache --builder repro-builder --attest type=provenance,mode=max --attest type=sbom --build-arg BUILDKIT_INLINE_CACHE=1 --file Dockerfile .

@scott-kausler

I have encountered this error as well when building from apache/airflow:2.5.3-python3.10.

@hezhizhen
Contributor

Any updates?

@tonistiigi
Member

tonistiigi commented Sep 6, 2023

I could repro this once but now I can't get past the first postgres build anymore:

10.12 Building dependency tree...
10.20 Reading state information...
10.20 Package postgresql-13 is not available, but is referred to by another package.
10.20 This may mean that the package is missing, has been obsoleted, or
10.20 is only available from another source
10.20
10.20 E: Version '13.10-1.pgdg22.04+1' for 'postgresql-13' was not found

Looks like the Debian package state has changed since these perl patches in the repro were written, so the pinned version no longer exists.

edit: patch in #4210

@pjonsson
Author

pjonsson commented Sep 25, 2023

I have upgraded my Docker to 24.0.6 since the ticket was filed, but here is an updated reproducer that triggers the problem with 24.0.6 on my computer:

#!/bin/bash -ex

WORK_DIRECTORY=$(mktemp -d)
echo "Using $WORK_DIRECTORY as directory for attempt"

cd $WORK_DIRECTORY
git clone --depth 1 --filter=blob:none --no-checkout https://github.com/docker-library/postgres
cd postgres
git sparse-checkout set 13/bullseye
git checkout master
perl -pi -e 's#apt-get update;#export DEBIAN_FRONTEND=noninteractive; apt-get update; apt-get -y upgrade;#g' 13/bullseye/Dockerfile
perl -pi -e 's#apt/ bullseye-pgdg main#apt/ jammy-pgdg main#g' 13/bullseye/Dockerfile
perl -pi -e 's#ENV PG_VERSION 13\.([0-9\-]+)(.+)$#ENV PG_VERSION 13\.\1\.pgdg22\.04\+1#g' 13/bullseye/Dockerfile
cd 13/bullseye
docker --config $WORK_DIRECTORY/.docker_temp_config buildx create --name repro-builder --driver docker-container --bootstrap
docker --config $WORK_DIRECTORY/.docker_temp_config build --output type=image,name=test-postgres --progress plain --build-context debian:bullseye-slim=docker-image://ubuntu:22.04 --pull --no-cache --builder repro-builder --attest type=provenance,mode=max --attest type=sbom --build-arg BUILDKIT_INLINE_CACHE=1 --file Dockerfile .
cd ../../..
git clone --depth 1 --filter=blob:none --no-checkout https://github.com/nodejs/docker-node
cd docker-node
git sparse-checkout set 18/bullseye 18/bullseye-slim
git checkout main
cd 18/bullseye
docker --config $WORK_DIRECTORY/.docker_temp_config build --output type=image,name=test-node-big --progress plain --build-context buildpack-deps:bullseye=docker-image://buildpack-deps:22.04 --pull --no-cache --builder repro-builder --attest type=provenance,mode=max --attest type=sbom --build-arg BUILDKIT_INLINE_CACHE=1 --file Dockerfile .
cd ../bullseye-slim
docker --config $WORK_DIRECTORY/.docker_temp_config build --output type=image,name=test-node --progress plain --build-context debian:bullseye-slim=docker-image://ubuntu:22.04 --pull --no-cache --builder repro-builder --attest type=provenance,mode=max --attest type=sbom --build-arg BUILDKIT_INLINE_CACHE=1 --file Dockerfile .

The Docker upgrade also pulled in new versions of other components, but that doesn't seem to improve the situation:

$ docker --config /tmp/tmp.g3CbueWjA4/.docker_temp_config/ buildx  ls
NAME/NODE        DRIVER/ENDPOINT             STATUS  BUILDKIT             PLATFORMS
repro-builder    docker-container                                         
  repro-builder0 unix:///var/run/docker.sock running v0.12.2              linux/amd64, linux/amd64/v2, linux/386

Edit: was the PR stuck waiting for a reproducer, or is it touching code that is difficult to review?
