Get error: content digest sha256: ***: not found when exporting image #2631
Comments
Hey @sunchunming, I am experiencing the same issue. I'm using buildx with a BuildKit daemon on k8s, with an SSD PV/PVC for /var/lib/buildkit. I tried removing part of the cache manually; I really don't want to delete the entire cache, but I can't seem to come up with a better solution. I also discussed this with @tonistiigi on Slack, but unfortunately we couldn't figure it out. Can you share your complete setup? Are you using the BuildKit daemon on Docker or k8s? How are you trying to export the image? The more details the better.
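As an aside, here is a rough sketch of the kind of partial pruning mentioned above, assuming buildctl can reach the daemon; exact flags may vary between BuildKit versions.

```shell
# Inspect the cache and prune it down to a size budget instead of wiping it.
buildctl du --verbose                 # list cache records and their sizes
buildctl prune --keep-storage 20000   # prune until roughly 20 GB (value in MB) remains
# buildctl prune --all                # the "delete everything" option being avoided here
```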
I have some interesting updates. I use an API key for nvcr.io as it provides better rate limits. The API key is valid, and I checked it on both my local machine and another machine. When I remove it, I see:
Now, when I use it (which is what I have done up until now), I see a very strange log entry in buildkitd:
There's only one entry like that when I build after deleting the cache manually, as mentioned in my previous comment:
I believe this might be related to something like google/go-containerregistry#728. I saw that this is used here: lines 667 to 669 in 6fa5a92.
What do you think? EDIT: The 401 Unauthorized happens because I have multiple auths in my Docker config.
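A hedged sketch of one way to rule out the multiple-auths problem: point the build at a throwaway Docker config that only contains the nvcr.io login. NVCR_API_KEY and the final build command are placeholders; nvcr.io normally authenticates with the literal username $oauthtoken and the API key as the password.

```shell
# Use an isolated Docker config dir so only the nvcr.io credentials are present.
export DOCKER_CONFIG="$(mktemp -d)"
echo "$NVCR_API_KEY" | docker login nvcr.io -u '$oauthtoken' --password-stdin
docker buildx build --pull .   # placeholder for the actual build command
```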
I kept investigating, and maybe this will help figure something out. I'm building this image:
Using this command:
I also tried without it. Steps taken:
+1, I'm also hitting the same issue.
Experiencing the same issue. Any workaround here?
@okgolove which version of buildkit and buildx are you using?
@Shaked I'm using buildctl instead of buildx.
Related to rancher/kim#74, I think. If this is indeed the same behavior (assuming you are using the containerd worker), the workarounds are:
Hi @sunchunming @Shaked @lugeng @okgolove @dweomer, we are hitting the same issue in our BuildKit production environment (buildkitd v0.10.3); roughly 1% of builds fail with this error. The issue can be fixed by this patch; we have tested it with 5k image builds over several days. We have observed that the problem always occurs during the push-layers phase, and that it is a base image layer that is not found. By adding some logs we traced it to https://github.com/imeoer/buildkit/blob/26c11880022774bc6eca6376aef5e698ecf629c5/cache/refs.go#L276. This issue seems difficult to reproduce; perhaps we can take a deeper look. cc @tonistiigi @sipsma
Thank you very much for your contribution. I have tested ~3k image builds over two weeks in our build cluster and the issue didn't reproduce.
Same issue here. If the patch helps, why not merge it?
@jbguerraz Will try to submit a PR.
Hey! Any news? :)
Looks like I am experiencing the same issue here. With a version built from #3447, this all goes away. I'm running in GitHub Actions. My workflow is derived from the "test your image before pushing it" example (https://docs.docker.com/build/ci/github-actions/examples/#test-your-image-before-pushing-it):

```yaml
steps:
  - uses: actions/checkout@v3
  - name: Set up Docker Buildx
    uses: docker/setup-buildx-action@v2
  - name: Create Buildx local cache folder
    run: |
      BUILDX_CACHE_FOLDER=$(mktemp -d -q)
      echo "BUILDX_CACHE_FOLDER=${BUILDX_CACHE_FOLDER}" >> $GITHUB_ENV
  - name: Set Buildx remote cache locations
    run: |
      BASE="type=s3,region=us-west-1,bucket=xxx,prefix=${{ matrix.architecture.runner }}/,access_key_id=${{ env.AWS_ACCESS_KEY_ID }},secret_access_key=${{ env.AWS_SECRET_ACCESS_KEY }},session_token=${{ env.AWS_SESSION_TOKEN }}"
      case ${GITHUB_EVENT_NAME} in
        pull_request)
          echo 'BUILDX_CACHE_FROM<<EOF' >> $GITHUB_ENV
          echo "${BASE},name=${GITHUB_HEAD_REF}" >> $GITHUB_ENV
          echo "${BASE},name=${GITHUB_BASE_REF}" >> $GITHUB_ENV
          echo 'EOF' >> $GITHUB_ENV
          echo "BUILDX_CACHE_TO=${BASE},name=${GITHUB_HEAD_REF}" >> $GITHUB_ENV
          ;;
        push | workflow_dispatch)
          BUILDX_CACHE="${BASE},name=${GITHUB_REF_NAME#deploy/}"
          echo "BUILDX_CACHE_FROM=${BUILDX_CACHE}" >> $GITHUB_ENV
          echo "BUILDX_CACHE_TO=${BUILDX_CACHE}" >> $GITHUB_ENV
          ;;
        *)
          echo "Event not supported"
          exit 1
          ;;
      esac
  - name: Build and export image to Docker
    uses: docker/build-push-action@v3
    with:
      context: .
      file: docker/Dockerfile
      load: true
      tags: ${{ github.repository }}:test
      cache-from: ${{ env.BUILDX_CACHE_FROM }}
      cache-to: type=local,dest=${{ env.BUILDX_CACHE_FOLDER }},mode=max
  # ...
  # some testing depending on docker --load
  # ...
  - name: Build and push image to Amazon ECR
    uses: docker/build-push-action@v3
    with:
      context: .
      file: docker/Dockerfile
      push: ${{ startsWith(github.ref_name, 'deploy/') }}
      tags: ${{ steps.metadata.outputs.tags }}
      labels: ${{ steps.metadata.outputs.labels }}
      cache-from: type=local,src=${{ env.BUILDX_CACHE_FOLDER }}
      cache-to: ${{ env.BUILDX_CACHE_TO }}
```

Failures have been very consistent. It started around the 0.11.2 release, so I first thought of a regression. Pinning version 0.11.1 did not change the outcome, oddly (see the pinning sketch below). Versions used are GitHub Actions defaults:
I'd really appreciate some help. All I can offer is a bunch of testing; I'm not a Go expert here...
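For anyone reproducing this, here is a sketch of one way to pin the BuildKit release a builder runs (e.g. to rule out a 0.11.1 to 0.11.2 regression); I believe setup-buildx-action exposes the same thing through its driver-opts input. The image tag and builder name are examples.

```shell
# Create a builder pinned to a specific BuildKit release and confirm what runs.
docker buildx create --use --name pinned --driver docker-container \
  --driver-opt image=moby/buildkit:v0.11.1
docker buildx inspect --bootstrap   # output shows which buildkitd version is actually running
```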
@ohmer How often do you see it? Do you have a public workflow that we could use for debugging? Edit: also, did you test with vanilla 0.11.2 as well? There were some other patches in 0.11.
@tonistiigi Over the past 2 days, the workflow failure rate was around 95%. The sample is around 100 runs (we are a small team). It's a private repository and really can't be made public, but I'm happy to share anything not related to our app code. I did not have the buildkit version pinned until the workflow started to fail, so I was running 0.11 until 0.12 was published (4 days ago?). I only switched to my fork late today (from NZ, 6:45 pm here).
@ohmer Could you check what this patch does for you: https://github.com/moby/buildkit/compare/master...tonistiigi:buildkit:blobonly-debug?expand=1 |
@tonistiigi I'll try this tomorrow on my private repo. In the meantime, I extracted the workflow to a public repo and can reproduce: https://github.com/sharesight/buildkit-debug/actions/runs/4062589459/jobs/6993829704. The bucket is public if you want to have a look at the cache state on S3.
FYI, I had the same issue. I compiled buildkit with the change from #3447 and didn't get this issue again. However, I can't confirm it works in every case.
@tonistiigi I was not able to work on this today, but I did not forget about it.
@tonistiigi I was able to test your patch. Here is the method and result.
I built buildkit from your branch. Is there anything else I should try? Does it help?
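A small, hedged sketch of how the buildkitd logs could be pulled after a run with the patched builder, to see whether the extra debug output from the patch shows up. "mybuilder" is a placeholder; the container name assumes buildx's usual buildx_buildkit_<builder-name>0 convention for the docker-container driver.

```shell
# List builders, then grep buildkitd's logs from the builder container.
docker buildx ls
docker logs buildx_buildkit_mybuilder0 2>&1 | grep -i "not found"
```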
It should have been fixed in #3566. |
Could someone help check the issue below and advise if there is any update? Thank you.
When exporting an image, I got the error below:
Then I checked the directory ~/.local/share/buildkit/runc-native/content/blobs/sha256: the file 97bac3dab075a8e745a60a2e05e9f678053d6bca7ad1d109867220704b154443 is missing, but it can be found in one of the image manifest files.
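The check described above, spelled out as a sketch; the path and digest are the ones from this report.

```shell
# Is the blob still in the content store, and is it still referenced by a manifest?
DIGEST=97bac3dab075a8e745a60a2e05e9f678053d6bca7ad1d109867220704b154443
STORE=~/.local/share/buildkit/runc-native/content/blobs/sha256
ls -l "$STORE/$DIGEST"        # fails here: the blob file is gone
grep -rl "$DIGEST" "$STORE"   # manifest blobs that still reference the digest
```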
BuildKit version: 0.9.0. I didn't configure gc in the buildkitd config file. I suspect there is an issue where the default GC deletes a cache record that is still shared.
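If the default GC is indeed the culprit, one possible mitigation while debugging is to disable it (or raise its threshold) for the OCI worker via buildkitd.toml. This is only a sketch; the config path below is the daemon's default and may differ for a rootless setup like the one above.

```shell
# Turn off automatic GC for the OCI worker, or give it more headroom.
cat > /etc/buildkit/buildkitd.toml <<'EOF'
[worker.oci]
  gc = false
  # or keep GC but raise the threshold, e.g.:
  # gckeepstorage = 20000
EOF
buildkitd --config /etc/buildkit/buildkitd.toml
```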
Some debug logs:

```
time="2022-02-23T05:59:48Z" level=debug msg=push
time="2022-02-23T05:59:48Z" level=debug msg="fetch response received" response.header.accept-ranges=bytes response.header.cache-control="max-age=31536000" response.header.connection=keep-alive response.header.content-length=306 response.header.content-type=application/octet-stream response.header.date="Wed, 23 Feb 2022 05:59:48 GMT" response.header.docker-content-digest="sha256:32678decbeb81d3211ddd542bd383f7ff304d63af7a78321e7b01b4021f65614" response.header.docker-distribution-api-version=registry/2.0 response.header.etag=""sha256:32678decbeb81d3211ddd542bd383f7ff304d63af7a78321e7b01b4021f65614"" response.header.server=nginx response.header.set-cookie="sid=6bd09de24143b204fc15afffd55b05a0; Path=/; HttpOnly" response.status="200 OK"
time="2022-02-23T05:59:48Z" level=debug msg="checking and pushing to" url="http://harbor.jd.com/v2/jpipe-test/prod/jimidatalhwebservices/blobs/sha256:5e44ff2aeae6efe1449c178bf8edd1e7ced6fa5510fda8a3edc8d99c5fd64cc0"
time="2022-02-23T05:59:48Z" level=debug msg="do request" request.header.content-type=application/octet-stream request.header.user-agent=containerd/1.6.0-beta.1+unknown request.method=PUT
time="2022-02-23T05:59:48Z" level=debug msg=push
time="2022-02-23T05:59:48Z" level=debug msg="do request" request.header.accept="application/vnd.docker.image.rootfs.diff.tar.gzip, /" request.header.user-agent=containerd/1.6.0-beta.1+unknown request.method=HEAD
time="2022-02-23T05:59:48Z" level=debug msg="checking and pushing to" url="http://harbor.jd.com/v2/jpipe-test/prod/jimidatalhwebservices/blobs/sha256:d9a2c8ccae4221b6b060b63bf56e53dad2295c1ce9d2cf1eb047ebd4eba1b297"
time="2022-02-23T05:59:48Z" level=warning msg="failed to update distribution source for layer sha256:32678decbeb81d3211ddd542bd383f7ff304d63af7a78321e7b01b4021f65614: content digest sha256:32678decbeb81d3211ddd542bd383f7ff304d63af7a78321e7b01b4021f65614: not found"
......
time="2022-02-23T06:08:09Z" level=debug msg="checking and pushing to" url="http://harbor.jd.com/v2/jpipe-test/prod/dbbakmasterlb/blobs/sha256:ce68c6bbd0a17f0742f673fb01cf19b88b575861ea0b45533ddbd82a068d1246"
time="2022-02-23T06:08:09Z" level=debug msg="do request" request.header.accept="application/vnd.docker.image.rootfs.diff.tar.gzip, /" request.header.user-agent=containerd/1.6.0-beta.1+unknown request.method=HEAD
time="2022-02-23T06:08:09Z" level=debug msg="fetch response received" response.header.connection=keep-alive response.header.content-length=0 response.header.content-type="text/plain; charset=utf-8" response.header.date="Wed, 23 Feb 2022 06:08:09 GMT" response.header.docker-distribution-api-version=registry/2.0 response.header.docker-upload-uuid=94aee7cd-b8fa-4b04-a973-767fa2e87226 response.header.location="http://harbor.jd.com/v2/jpipe-test/prod/dbbakmasterlb/blobs/uploads/94aee7cd-b8fa-4b04-a973-767fa2e87226?_state=O1cQSYNegKctbLxK1DNYrkatUhKemITsws5DR0VRnpt7Ik5hbWUiOiJqcGlwZS10ZXN0L3Byb2QvZGJiYWttYXN0ZXJsYiIsIlVVSUQiOiI5NGFlZTdjZC1iOGZhLTRiMDQtYTk3My03NjdmYTJlODcyMjYiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMDItMjNUMDY6MDg6MDkuNDc0NTc4NDkxWiJ9" response.header.range=0-0 response.header.server=nginx response.header.set-cookie="sid=d63550dbee60e074afbebdfa4c37ba99; Path=/; HttpOnly" response.status="202 Accepted"
time="2022-02-23T06:08:09Z" level=debug msg="do request" request.header.content-type=application/octet-stream request.header.user-agent=containerd/1.6.0-beta.1+unknown request.method=PUT
time="2022-02-23T06:08:09Z" level=error msg="/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = content digest sha256:32678decbeb81d3211ddd542bd383f7ff304d63af7a78321e7b01b4021f65614: not found\n"
time="2022-02-23T06:08:09Z" level=debug msg="session finished: "