Possible incompatibility with AWS ECR #826

Closed
jfagoagas opened this issue Mar 2, 2023 · 20 comments

Comments

@jfagoagas

jfagoagas commented Mar 2, 2023

Troubleshooting

Before submitting a bug report please read the Troubleshooting doc.

Behaviour

Steps to reproduce this issue

Assuming you have the right permissions to push to an AWS ECR (in our case we do)

  1. Create the following action:
name: build-push

on:
  push:
    branches:
      - "test-branch"

env:
  TAG: test
  GH_TOKEN: ${{ github.token }}
  DOCKERFILE_PATH: ./Dockerfile
  IMAGE_NAME: test
  AWS_REGION: eu-west-1
  IAM_ROLE_ARN: arn:aws:iam::111111111111:role/ecr-role
  ECR: 111111111111.dkr.ecr.eu-west-1.amazonaws.com

jobs:
  build-push:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-region: ${{ env.AWS_REGION }}
          role-to-assume: ${{ env.IAM_ROLE_ARN }}
          role-session-name: test

      - name: Login to ECR
        uses: docker/login-action@v2
        with:
          registry: ${{ env.ECR }}

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Build and push container image
        if: github.event_name == 'push'
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          file: ${{ env.DOCKERFILE_PATH }}
          tags: ${{ env.ECR }}/${{ env.IMAGE_NAME }}:${{ env.TAG }}
          # outputs: type=docker
    
      # The step below works with the configured credentials if the "outputs: type=docker" is uncommented
      - name: push manually
        run: docker push ${{ env.ECR }}/${{ env.IMAGE_NAME }}:${{ env.TAG }}
  2. Run the action.
  3. The above action returns the following error:
ERROR: failed to solve: failed to push 111111111111.dkr.ecr.eu-west-1.amazonaws.com/test:test: unexpected status: 403 Forbidden
Error: buildx failed with: ERROR: failed to solve: failed to push 111111111111.dkr.ecr.eu-west-1.amazonaws.com/test:test: unexpected status: 403 Forbidden

Expected behaviour

The docker/build-push-action@v4 should be able to upload the container image to the container registry.

Actual behaviour

The docker/build-push-action@v4 returns a 403 Forbidden error even though the workflow has the right credentials to push to the repository. We know the credentials are valid because setting push: false (with outputs: type=docker) and running a separate docker push uploads the image to the container registry correctly.

Configuration

  • Repository URL (if public): Not public.
  • Build URL (if public): Not public.

Logs

Download the log file of your build and attach it to this issue.

@crazy-max
Member

crazy-max commented Mar 2, 2023

Do you have a link to your repo? Can you also post the BuildKit logs please (see https://docs.docker.com/build/ci/github-actions/configure-builder/#buildkit-container-logs)?

@jfagoagas
Author

jfagoagas commented Mar 2, 2023

Do you have a link to your repo? Can you also post the BuildKit logs please (see https://docs.docker.com/build/ci/github-actions/configure-builder/#buildkit-container-logs)?

The repo is private; it's a fresh one created just for testing this action.

I can send you the logs later today.

Thanks!

@jfagoagas
Author

Hi @crazy-max, here are the logs setting buildkitd-flags: --debug.

test-logs.txt

Thank you!

@crazy-max
Member

Hum, I wonder if this is related to an issue with ECR and provenance. Can you try with provenance disabled?

      - name: Build and push container image
        if: github.event_name == 'push'
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          file: ${{ env.DOCKERFILE_PATH }}
          tags: ${{ env.ECR }}/${{ env.IMAGE_NAME }}:${{ env.TAG }}
          provenance: false

@jfagoagas
Author

The error persists setting provenance: false.

test-logs-without-provenance.txt

@crazy-max
Member

Hum that's odd.

  time="2023-03-02T22:37:24Z" level=debug msg="checking and pushing to" digest="sha256:8131fb7cb0b47f423c719cc9bf9296a0a85a28391b25bd64c160079c260a2ed4" mediatype=application/vnd.docker.distribution.manifest.v2+json size=3048 spanID=f423bff6834ca9a8 traceID=0ea281446ce5a8e4681a38d32ff4bc92 url="https://111111111111.dkr.ecr.eu-west-1.amazonaws.com/v2/test/manifests/latest"
  time="2023-03-02T22:37:24Z" level=debug msg="do request" digest="sha256:8131fb7cb0b47f423c719cc9bf9296a0a85a28391b25bd64c160079c260a2ed4" mediatype=application/vnd.docker.distribution.manifest.v2+json request.header.accept="application/vnd.docker.distribution.manifest.v2+json, */*" request.header.user-agent=buildkit/v0.11 request.method=HEAD size=3048 spanID=f423bff6834ca9a8 traceID=0ea281446ce5a8e4681a38d32ff4bc92 url="https://111111111111.dkr.ecr.eu-west-1.amazonaws.com/v2/test/manifests/latest"
  time="2023-03-02T22:37:24Z" level=debug msg="fetch response received" digest="sha256:8131fb7cb0b47f423c719cc9bf9296a0a85a28391b25bd64c160079c260a2ed4" mediatype=application/vnd.docker.distribution.manifest.v2+json response.header.content-length=325 response.header.content-type="application/json; charset=utf-8" response.header.date="Thu, 02 Mar 2023 22:37:24 GMT" response.header.docker-distribution-api-version=registry/2.0 response.header.sizes= response.status="403 Forbidden" size=3048 spanID=f423bff6834ca9a8 traceID=0ea281446ce5a8e4681a38d32ff4bc92 url="https://111111111111.dkr.ecr.eu-west-1.amazonaws.com/v2/test/manifests/latest"
  time="2023-03-02T22:37:24Z" level=debug msg="unexpected response" body= digest="sha256:8131fb7cb0b47f423c719cc9bf9296a0a85a28391b25bd64c160079c260a2ed4" mediatype=application/vnd.docker.distribution.manifest.v2+json resp="&{403 Forbidden 403 HTTP/1.1 1 1 map[Content-Length:[325] Content-Type:[application/json; charset=utf-8] Date:[Thu, 02 Mar 2023 22:37:24 GMT] Docker-Distribution-Api-Version:[registry/2.0] Sizes:[]] {0xc0009e8640} 325 [] false false map[] 0xc000f97d00 0xc00009d550}" size=3048 spanID=f423bff6834ca9a8 traceID=0ea281446ce5a8e4681a38d32ff4bc92
  time="2023-03-02T22:37:24Z" level=error msg="/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = failed to push 111111111111.dkr.ecr.eu-west-1.amazonaws.com/test:latest: unexpected status: 403 Forbidden"
  failed to push 111111111111.dkr.ecr.eu-west-1.amazonaws.com/test:latest: unexpected status: 403 Forbidden

I wonder if this is linked to the Configure AWS Credentials step. We are testing the build and push action to push to ECR on our side but only using the login-action:

- name: AWS ECR
  registry: 175142243308.dkr.ecr.us-east-2.amazonaws.com
  slug: 175142243308.dkr.ecr.us-east-2.amazonaws.com/sandbox/test-docker-action
  username_secret: AWS_ACCESS_KEY_ID
  password_secret: AWS_SECRET_ACCESS_KEY
  type: remote

And looks good: https://github.com/docker/build-push-action/actions/runs/4312754710/jobs/7523704245#step:11:318

I see you're using the GitHub OIDC provider as shown in https://github.com/aws-actions/configure-aws-credentials#credentials. Can you replace the Login to ECR step with:

      - name: Login to Amazon ECR
        uses: aws-actions/amazon-ecr-login@v1

I see this workflow using this action and looks to work for them: https://github.com/nhost/hasura-auth/actions/runs/4205620537/jobs/7298746334#step:12:913

Also 403 Forbidden does not seem to be an authentication issue but something else. It seems to be the same as docker/buildx#1528 for another registry.
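
For readers following along, the debug lines quoted above are in logfmt (key=value) form, so the failing request can be extracted mechanically. The Python sketch below is an illustration only: parse_logfmt and summarize are hypothetical helper names, and SAMPLE holds abbreviated copies of the "do request" and "fetch response received" entries from the log, keeping just the fields used here. It shows that the 403 comes back on a HEAD request against the manifest URL, which is a read that happens during the "checking and pushing to" step.

```python
import shlex

# Abbreviated copies of two entries from the BuildKit debug log above
# (only the fields used below are kept; this trimming is an assumption
# made for readability, the values themselves are from the thread).
SAMPLE = [
    'time="2023-03-02T22:37:24Z" level=debug msg="do request" '
    'request.method=HEAD '
    'url="https://111111111111.dkr.ecr.eu-west-1.amazonaws.com/v2/test/manifests/latest"',
    'time="2023-03-02T22:37:24Z" level=debug msg="fetch response received" '
    'response.status="403 Forbidden" '
    'url="https://111111111111.dkr.ecr.eu-west-1.amazonaws.com/v2/test/manifests/latest"',
]


def parse_logfmt(line: str) -> dict:
    """Split a logfmt line (key=value, values optionally quoted) into a dict."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore bare words that carry no '='
            fields[key] = value
    return fields


def summarize(lines):
    """Pair each logged request method with the status logged for the same URL."""
    methods, results = {}, []
    for fields in map(parse_logfmt, lines):
        url = fields.get("url")
        if "request.method" in fields:
            methods[url] = fields["request.method"]
        elif "response.status" in fields:
            results.append((methods.get(url, "?"), url, fields["response.status"]))
    return results


if __name__ == "__main__":
    for method, url, status in summarize(SAMPLE):
        print(method, url, "->", status)
```

A HEAD on a manifest is a read operation, which may be relevant when a role has only been granted the permissions needed for uploading.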

@jfagoagas
Author

jfagoagas commented Mar 3, 2023

That's weird... we tried yesterday with the aws-actions/amazon-ecr-login@v1 and it didn't work either.

The thing is that this workflow was executed 2 weeks ago, maybe there is something broken now.

I'm going to try this:

      - name: Configure AWS
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-region: ${{ env.AWS_REGION }}
          role-to-assume: ${{ env.IAM_ROLE_ARN }}
          role-session-name: test
      - name: Login to Amazon ECR
        uses: aws-actions/amazon-ecr-login@v1
      - name: Build and publish to Docker Hub and AWS ECR
        uses: docker/build-push-action@v3
        with:
          context: .
          file: ${{ env.DOCKERFILE_PATH }}
          tags: ${{ env.ECR }}/${{ env.IMAGE_NAME }}:${{ env.TAG }}
          push: true

@jfagoagas
Author

jfagoagas commented Mar 3, 2023

Ok, the following is super weird.

If I use the following I get a 401 Unauthorized:

      - name: Configure AWS
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-region: ${{ env.AWS_REGION }}
          role-to-assume: ${{ env.IAM_ROLE_ARN }}
          role-session-name: test

      - name: Login to Amazon ECR
        uses: aws-actions/amazon-ecr-login@v1

      - name: Build and publish to Docker Hub and AWS ECR
        uses: docker/build-push-action@v3
        with:
          context: .
          file: ${{ env.DOCKERFILE_PATH }}
          tags: ${{ env.ECR }}/${{ env.IMAGE_NAME }}:${{ env.TAG }}
          push: true

But with the following I can push the images without any problem:

jobs:
  build-push-connector-container:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-region: ${{ env.AWS_REGION }}
          role-to-assume: ${{ env.IAM_ROLE_ARN }}
          role-session-name: test

      - name: Login to ECR
        uses: docker/login-action@v2
        with:
          registry: ${{ env.ECR }}

      # - name: Login to Amazon ECR
      #   id: login-ecr
      #   uses: aws-actions/amazon-ecr-login@v1

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          buildkitd-flags: --debug

      - name: Build and push container latest image
        uses: docker/build-push-action@v4
        with:
          push: false
          file: ${{ env.DOCKERFILE_PATH }}
          tags: ${{ env.ECR }}/${{ env.IMAGE_NAME }}:${{ env.TAG }}
          outputs: type=docker

      - name: Push manually
        run: docker push ${{ env.ECR }}/${{ env.IMAGE_NAME }}:${{ env.TAG }}

Do you have any idea of what to test? Definitely it is not a permissions issue whatsoever.

@korenyoni

korenyoni commented Mar 16, 2023

Ok, the following is super weird.

If I use the following I get a 401 Unauthorized:

(omitting your snippet for brevity)

Do you have any idea of what to test? Definitely it is not a permissions issue whatsoever.

I am experiencing the same thing with OCI-compliant (provenance:true) multiplatform images on self-hosted runners. And interestingly, not on all images (only really large ones?). Technically I can work around it by using type=oci, writing it out to a file, and in another step doing a docker import and docker push. But this will strip away CMD and ENTRYPOINT.

So for now type=docker and a subsequent push is the best workaround.

Here's a gist I made for context.

What fails (same issue as yours)

The type=oci + import + push workaround

Mind you this isn't even using the action. So it seems to be a buildx bug (or maybe some shortcoming on the ECR side?), rather than a build-push-action bug.

EDIT:

Your workaround won't work for multiarch OCI-compliant images (see: docker/roadmap#371)

...docker exporter does not currently support exporting manifest lists

SECOND EDIT (24/03/23): wanted to follow up on this:

A good workaround is to do outputs: type=image (which exports the image, including multiplatform OCI-compliant images to the buildx driver) and then push that image however you want. For me, podman manifest push works in all cases. But probably docker manifest push will work fine as well.

@jairov4

jairov4 commented Mar 18, 2023

In my case it started working after I restarted the buildx builder container and logged in to ECR again.

docker buildx ls
docker restart <buildx-container-name>

Logs of the worker

time="2023-03-18T12:44:52Z" level=warning msg="forcibly turning on oci-mediatype mode for attestations"
time="2023-03-18T12:44:53Z" level=error msg="/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = failed to push <my-registry-goes-here>: unexpected status: 403 Forbidden"
time="2023-03-18T12:46:57Z" level=info msg="stopping server"
buildkitd: context canceled
<here I restarted and relogin>
time="2023-03-18T12:46:58Z" level=info msg="auto snapshotter: using overlayfs"
time="2023-03-18T12:46:58Z" level=warning msg="using host network as the default"
time="2023-03-18T12:46:58Z" level=info msg="found worker \"nzrh304dxzr9bvafx52j4ano0\", labels=map[org.mobyproject.buildkit.worker.executor:oci org.mobyproject.buildkit.worker.hostname:45f94c3682e5 org.mobyproject.buildkit.worker.network:host org.mobyproject.buildkit.worker.oci.process-mode:sandbox org.mobyproject.buildkit.worker.selinux.enabled:false org.mobyproject.buildkit.worker.snapshotter:overlayfs], platforms=[linux/amd64 linux/amd64/v2 linux/amd64/v3 linux/arm64 linux/riscv64 linux/ppc64le linux/s390x linux/386 linux/mips64le linux/mips64 linux/arm/v7 linux/arm/v6]"
time="2023-03-18T12:46:58Z" level=warning msg="skipping containerd worker, as \"/run/containerd/containerd.sock\" does not exist"
time="2023-03-18T12:46:58Z" level=info msg="found 1 workers, default=\"nzrh304dxzr9bvafx52j4ano0\""
time="2023-03-18T12:46:58Z" level=warning msg="currently, only the default worker can be used."
time="2023-03-18T12:46:58Z" level=info msg="running server on /run/buildkit/buildkitd.sock"
time="2023-03-18T12:47:20Z" level=warning msg="forcibly turning on oci-mediatype mode for attestations"

@insider89

Have the same issue: thx for workaround #826 (comment)

@jggatter

jggatter commented Mar 28, 2023

Maybe there is something more going on for other people, but for me it turns out I was simply absent-minded and hadn't referenced the ECR as a resource on the IAM policy. I would triple-check your policies on the IAM role/user that you are using for the action to make sure it's pointed at the correct ECR and that it has all the policies necessary to log into and push to it.

Anyway my solution ended up looking like:

  • actions/checkout@v3 (README said this is not necessary but I found I needed it)
  • docker/login-action@v2 with IAM user credentials (role-assume is better practice)
  • docker/setup-buildx-action@v2
  • docker/build-push-action@v4

I didn't need anything further such as the AWS credentials configure or the ECR login actions.
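
One way to double-check the point above about the policy referencing the right ECR is to derive the repository ARN from the image reference and compare it with the policy's Resource patterns. The following Python sketch does that under stated assumptions: repository_arn and allows are hypothetical helpers, the partition is assumed to be aws, wildcard matching is approximated with fnmatch, and Deny statements, NotAction/NotResource, and conditions are ignored. It is not a substitute for the IAM policy simulator.

```python
from fnmatch import fnmatchcase


def repository_arn(image_ref: str, partition: str = "aws") -> str:
    """Derive the ECR repository ARN from a private ECR image reference,
    e.g. 111111111111.dkr.ecr.eu-west-1.amazonaws.com/test:test."""
    host, _, path = image_ref.partition("/")
    parts = host.split(".")
    if len(parts) < 6 or parts[1:3] != ["dkr", "ecr"] or not path:
        raise ValueError(f"not a private ECR image reference: {image_ref}")
    account, region = parts[0], parts[3]
    # Drop an optional digest (@sha256:...) and an optional tag (:tag).
    name = path.split("@", 1)[0].split(":", 1)[0]
    return f"arn:{partition}:ecr:{region}:{account}:repository/{name}"


def _as_list(value):
    """IAM allows a single string/object where a list is expected."""
    return [value] if isinstance(value, (str, dict)) else list(value)


def allows(policy: dict, action: str, arn: str) -> bool:
    """Rough check: does any Allow statement cover both the action and the ARN?
    Assumption: simple '*'/'?' wildcards only; no Deny or Condition handling."""
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        # IAM action names are case-insensitive; ARNs are compared as-is.
        action_ok = any(fnmatchcase(action.lower(), a.lower()) for a in actions)
        resource_ok = any(fnmatchcase(arn, r) for r in resources)
        if action_ok and resource_ok:
            return True
    return False


# Hypothetical policy whose Resource points at a different region than the
# registry used in the workflow from this issue.
WRONG_REGION_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["ecr:PutImage"],
            "Resource": ["arn:aws:ecr:us-east-1:111111111111:repository/*"],
        }
    ],
}

if __name__ == "__main__":
    arn = repository_arn("111111111111.dkr.ecr.eu-west-1.amazonaws.com/test:test")
    print(arn)
    print(allows(WRONG_REGION_POLICY, "ecr:PutImage", arn))
```

With the sample values, the derived ARN is in eu-west-1 while the policy only names us-east-1, so the check reports that the push is not covered.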

@crazy-max
Member

crazy-max commented Apr 7, 2023

But with the following I can push the images without any problem:
Do you have any idea of what to test? Definitely it is not a permissions issue whatsoever.

Looking at the workflow working for you, you're loading the image to Docker (outputs: type=docker). You could just do:

      - name: Build and push container latest image
        uses: docker/build-push-action@v4
        with:
          load: true
          file: ${{ env.DOCKERFILE_PATH }}
          tags: ${{ env.ECR }}/${{ env.IMAGE_NAME }}:${{ env.TAG }}

load: true is a shorthand for --output=type=docker.

And you're pushing manually in the last step. What's your use case for pushing manually btw?

If it works when pushing manually, it should work just fine with the action:

jobs:
  build-push-connector-container:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-region: ${{ env.AWS_REGION }}
          role-to-assume: ${{ env.IAM_ROLE_ARN }}
          role-session-name: test

      - name: Login to ECR
        uses: docker/login-action@v2
        with:
          registry: ${{ env.ECR }}

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          buildkitd-flags: --debug

      - name: Build and push container latest image
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          file: ${{ env.DOCKERFILE_PATH }}
          tags: ${{ env.ECR }}/${{ env.IMAGE_NAME }}:${{ env.TAG }}

What's odd is our e2e tests when pushing to ECR or ECR Public look good: https://github.com/docker/build-push-action/actions/runs/4637273377/jobs/8206003788#step:11:306

@jfagoagas
Author

But with the following I can push the images without any problem:
Do you have any idea of what to test? Definitely it is not a permissions issue whatsoever.

Looking at the workflow working for you, you're loading the image to Docker (outputs: type=docker). You could just do:

      - name: Build and push container latest image
        uses: docker/build-push-action@v4
        with:
          load: true
          file: ${{ env.DOCKERFILE_PATH }}
          tags: ${{ env.ECR }}/${{ env.IMAGE_NAME }}:${{ env.TAG }}

load: true is a shorthand for --output=type=docker.

And you're pushing manually in the last step. What's your use case for pushing manually btw?

We're pushing manually because using push: true within the action returns the 403 error mentioned above.

If it works when pushing manually, it should work just fine with the action:

jobs:
  build-push-connector-container:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-region: ${{ env.AWS_REGION }}
          role-to-assume: ${{ env.IAM_ROLE_ARN }}
          role-session-name: test

      - name: Login to ECR
        uses: docker/login-action@v2
        with:
          registry: ${{ env.ECR }}

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          buildkitd-flags: --debug

      - name: Build and push container latest image
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          file: ${{ env.DOCKERFILE_PATH }}
          tags: ${{ env.ECR }}/${{ env.IMAGE_NAME }}:${{ env.TAG }}

What's odd is our e2e tests when pushing to ECR or ECR Public look good: https://github.com/docker/build-push-action/actions/runs/4637273377/jobs/8206003788#step:11:306

That's what we've been discussing throughout this thread: there is some issue with the action, since we receive a 403 Forbidden if we set push: true, but it works perfectly if we push manually.

@halostatue

We’re not currently using OIDC, but I was able to get my Docker Publish to ECR working last night:

jobs:
  publish:
    name: Publish
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - id: aws-credentials
        uses: aws-actions/[email protected]
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - id: login-ecr
        uses: aws-actions/[email protected]

      - id: meta
        uses: docker/[email protected]
        with:
          images: |
            ${{ steps.login-ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}
          tags: |
            type=raw,value=${{ env.RELEASE_TAG }},enable=true
            type=ref,event=tag

      - uses: docker/[email protected]

      - name: Build image and push to ECR
        id: docker_build
        uses: docker/[email protected]
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          provenance: false
          cache-from: type=gha
          cache-to: type=gha,mode=max

At first I couldn’t get it working with either this workflow or the manual push, but that boiled down to a typo in the value of env.ECR_REPOSITORY, which meant I was trying to push to a repository that doesn’t exist. I know this probably increases the noise on this issue, but I have been following it in part because I was running into various problems that seemed similar.

@jfagoagas
Author

Thanks for that @halostatue, I'll test it again pushing using the action.

Thank you too @crazy-max.

@CarlosBorroto

I'm still experiencing this issue, with all actions on their latest versions. Commenting to highlight that @halostatue is using user credentials (access key and secret key), not OIDC (role-to-assume). Maybe that's why it is working for them.

@blockmar

blockmar commented May 2, 2023

While doing additional testing on the AWS setup, I noticed that things started working for me once I updated the policies to allow the OIDC role to perform ecr:*.

We finally got it working using OIDC with this policy (some redactions made):

{
    "Statement": [
        {
            "Action": [
                "ecr:BatchGetImage",
                "ecr:BatchCheckLayerAvailability",
                "ecr:CompleteLayerUpload",
                "ecr:GetDownloadUrlForLayer",
                "ecr:InitiateLayerUpload",
                "ecr:PutImage",
                "ecr:UploadLayerPart"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:ecr:us-east-1:XXXXXXXXXXXX:repository/*"
            ],
            "Sid": "ecr"
        },
        {
            "Action": "ecr:GetAuthorizationToken",
            "Effect": "Allow",
            "Resource": "*",
            "Sid": "token"
        }
    ],
    "Version": "2012-10-17"
}

Note that we added the following compared to a policy that already worked for regular docker push.

"ecr:BatchGetImage",
"ecr:GetDownloadUrlForLayer",

@CarlosBorroto

I can confirm adding these two actions to the policy attached to the role we are assuming fixed the issue for us.
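
The fix reported in the last two comments can be checked offline before touching AWS. This Python sketch compares a policy document against the actions that appear in the working policy posted above. THREAD_ACTIONS is simply copied from that comment and is not an official list of required permissions, missing_actions is a hypothetical helper, and only Allow statements with simple wildcards are considered.

```python
from fnmatch import fnmatchcase

# Actions copied from the working policy posted in this thread
# (assumption: treated here as the reference set, not an official list).
THREAD_ACTIONS = [
    "ecr:BatchGetImage",
    "ecr:BatchCheckLayerAvailability",
    "ecr:CompleteLayerUpload",
    "ecr:GetDownloadUrlForLayer",
    "ecr:InitiateLayerUpload",
    "ecr:PutImage",
    "ecr:UploadLayerPart",
    "ecr:GetAuthorizationToken",
]


def _as_list(value):
    """IAM allows a single string/object where a list is expected."""
    return [value] if isinstance(value, (str, dict)) else list(value)


def allowed_patterns(policy: dict) -> list:
    """Collect the (lower-cased) Action patterns of all Allow statements."""
    patterns = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") == "Allow":
            patterns.extend(a.lower() for a in _as_list(stmt.get("Action", [])))
    return patterns


def missing_actions(policy: dict, required=THREAD_ACTIONS) -> list:
    """Return the required actions that no Allow pattern matches,
    in the order they appear in `required`."""
    patterns = allowed_patterns(policy)
    return [a for a in required
            if not any(fnmatchcase(a.lower(), p) for p in patterns)]


# Hypothetical "worked for a regular docker push" policy, i.e. the policy
# from the thread minus the two actions that were added later.
PUSH_ONLY_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ecr",
            "Effect": "Allow",
            "Action": [
                "ecr:BatchCheckLayerAvailability",
                "ecr:CompleteLayerUpload",
                "ecr:InitiateLayerUpload",
                "ecr:PutImage",
                "ecr:UploadLayerPart",
            ],
            "Resource": ["arn:aws:ecr:us-east-1:XXXXXXXXXXXX:repository/*"],
        },
        {
            "Sid": "token",
            "Effect": "Allow",
            "Action": "ecr:GetAuthorizationToken",
            "Resource": "*",
        },
    ],
}

if __name__ == "__main__":
    print(missing_actions(PUSH_ONLY_POLICY))
```

For the sample policy this reports ecr:BatchGetImage and ecr:GetDownloadUrlForLayer, the two read actions that the comments above describe adding.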

@vinko-rx

vinko-rx commented Dec 4, 2024

In my case it was a full ECR. After I removed a number of images from ECR, it started to work again. Then I added a lifecycle policy to the registry :)
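
For the "full ECR" case, a lifecycle policy lets the registry expire old images on its own. The Python sketch below builds one as JSON. The two rules (expire untagged images after 14 days, keep at most 100 images overall) and the helper names are illustrative assumptions, not the policy used by the commenter.

```python
import json


def expire_untagged_after(days: int, priority: int = 1) -> dict:
    """Rule: expire untagged images pushed more than `days` days ago."""
    return {
        "rulePriority": priority,
        "description": f"Expire untagged images older than {days} days",
        "selection": {
            "tagStatus": "untagged",
            "countType": "sinceImagePushed",
            "countUnit": "days",
            "countNumber": days,
        },
        "action": {"type": "expire"},
    }


def keep_last(count: int, priority: int = 2) -> dict:
    """Rule: once more than `count` images exist, expire the oldest ones.
    A rule with tagStatus "any" is given the highest priority number so
    that it is evaluated last."""
    return {
        "rulePriority": priority,
        "description": f"Keep only the {count} most recent images",
        "selection": {
            "tagStatus": "any",
            "countType": "imageCountMoreThan",
            "countNumber": count,
        },
        "action": {"type": "expire"},
    }


def lifecycle_policy(*rules: dict) -> str:
    """Serialize the rules into ECR lifecycle policy text."""
    return json.dumps({"rules": list(rules)}, indent=2)


if __name__ == "__main__":
    # 14 and 100 are placeholder thresholds chosen for this example.
    print(lifecycle_policy(expire_untagged_after(14), keep_last(100)))
```

The resulting text could then be saved to a file and applied with, for example, aws ecr put-lifecycle-policy --repository-name <repo> --lifecycle-policy-text file://policy.json.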
