cdk-cli deploy: fail: docker push to ecr unexpected status from PUT request 400 Bad Request #33264

Closed
1 task
james-g-stream opened this issue Feb 1, 2025 · 13 comments · Fixed by #33415
Labels
@aws-cdk/aws-ecr Related to Amazon Elastic Container Registry bug This issue is a bug. effort/medium Medium work item – several days of effort p3

Comments

@james-g-stream

Describe the bug

When I run cdk deploy with an ecs.ContainerImage.fromAsset() image, the deploy fails with "fail: docker push to ecr unexpected status from PUT request 400 Bad Request". I've uninstalled and reinstalled Docker and deleted the image asset it references from ECR, but nothing has worked. CDK version: 2.177.0 (build b396961).

Regression Issue

  • Select this option if this issue appears to be a regression.

Last Known Working CDK Version

No response

Expected Behavior

cdk successfully pushes the Docker image to ECR.

Current Behavior

The docker push returns 400 Bad Request, which fails cdk deploy.

Reproduction Steps

yarn cdk deploy

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.177.0 (build b396961)

Framework Version

No response

Node.js Version

v20.15.0

OS

macOS 14.5

Language

TypeScript

Language Version

No response

Other information

No response

@james-g-stream james-g-stream added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Feb 1, 2025
@github-actions github-actions bot added the @aws-cdk/aws-ecr Related to Amazon Elastic Container Registry label Feb 1, 2025
@pahud
Contributor

pahud commented Feb 1, 2025

Was it working prior to 2.177.0 and only failing in 2.177.0?

A 400 Bad Request generally means the registry rejected the request as malformed, and it can have many different causes.

Are you able to push a local image to your ECR repository from your machine using your current AWS identity?

What CDK does is essentially use $CDK_DOCKER or the docker executable to docker build and push to the ECR assets repository, handling the required get-login-password step for you. The image asset upload could also fail because of a network issue.

Let me know if you are able to manually push a random image to ECR.

@pahud pahud added effort/medium Medium work item – several days of effort p3 and removed needs-triage This issue or PR still needs to be triaged. labels Feb 1, 2025
@james-g-stream
Author

Hi @pahud, thanks for the response. I reverted to version 2.173.1 and it still fails with 400 Bad Request on the ECR push.

I was able to log in with aws ecr get-login-password --region YOUR_AWS_REGION | docker login --username AWS --password-stdin YOUR_AWS_ACCOUNT_ID.dkr.ecr.YOUR_AWS_REGION.amazonaws.com
and then docker build, tag, and push to my ECR account.

I then tried cdk deploy again and it failed with 400 Bad Request.

What I did notice was this: I checked my Docker AWS credentials with docker-credential-desktop get <<< "https://{AWS_ACCOUNT}.dkr.ecr.us-east-1.amazonaws.com"

and pasted the JWT from the "Secret" property of that credential JSON into jwt.io. After a manual ECR login, the JWT was valid in jwt.io. But the JWT that gets set in "Secret" when I run cdk deploy comes back as invalid in jwt.io, as it ends in "==".
The JWT from the manual login does not end in "==".

Not sure if this may be causing issues.

@james-g-stream
Author

james-g-stream commented Feb 4, 2025

I ended up deleting Docker Desktop and installing docker and colima, and it's working fine with colima with BuildKit both enabled and disabled.

@wbeardall
Copy link

wbeardall commented Feb 4, 2025

+1 to this with Docker Desktop on macOS 15.2, CDK 2.177.0 (build b396961). The same error comes up with ecs.AssetImage.fromTarball.

@pahud
Contributor

pahud commented Feb 5, 2025

Thank you for the report, @james-g-stream and @wbeardall. While we investigate this issue, can you help us check whether it only happens in 2.177.0? What about 2.176 or even earlier versions?

@james-g-stream
Author

@pahud I tried it on 2.173.1 and it's still happening, but only with Docker Desktop.

@kaizencc
Contributor

kaizencc commented Feb 5, 2025

Possibly related to these issues, I'm looking into it: #30258 (comment)

The workaround is to turn off containerd or set --provenance=false. The CDK fix will likely make that the default.
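As a rough illustration of the --provenance=false workaround, the sketch below assembles an argument list for docker buildx build with provenance attestations turned off. The helper name buildxArgs and the example tag are hypothetical and are not taken from CDK; only the --provenance and --tag options of docker buildx build are assumed from the Docker CLI.

```typescript
// Hypothetical helper (not part of CDK): builds the arguments for
// `docker buildx build` with provenance attestations disabled.
function buildxArgs(tag: string, contextDir: string): string[] {
  return [
    "buildx",
    "build",
    "--provenance=false", // do not attach a provenance attestation to the image
    "--tag",
    tag,
    contextDir,
  ];
}

// Example: the resulting array could be passed to a process spawner as the
// arguments of the `docker` executable.
const exampleArgs = buildxArgs("my-app:local", ".");
console.log(["docker", ...exampleArgs].join(" "));
```

This only helps for builds you run yourself; it does not change how cdk deploy invokes docker.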

@tmokmss
Contributor

tmokmss commented Feb 6, 2025

@kaizencc Yes, they are related. In our case, we get the 400 error for each image asset when cdk tries to push both attestations and images, which sometimes results in 0 MB images. (The behavior seems nondeterministic: sometimes the attestation wins and sometimes the actual image wins. They share the same image tag, so the first one is pushed but the second one throws an error, apparently because of ECR tag immutability.)

Then if we retry cdk deploy, it continues to deploy with those invalid images, because cdk does not have to push any images thanks to the cache, which finally results in ECS or Lambda deployment errors.
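The failure mode described here can be sketched with a toy model. Everything below is hypothetical (the ImmutableTagRepo class, the publishIfMissing function, and the sizes) and is not cdk-assets or ECR code; it only assumes what the comment states: two uploads share one tag, the repository refuses to overwrite a tag, and a retry skips tags that already exist.

```typescript
// Toy model of a repository with tag immutability (hypothetical, not ECR itself).
type Manifest = { kind: "image" | "attestation"; sizeBytes: number };
type PushResult = "ok" | "rejected" | "skipped";

class ImmutableTagRepo {
  private readonly tags = new Map<string, Manifest>();

  // The first push to a tag wins; any later push to the same tag is rejected.
  push(tag: string, manifest: Manifest): PushResult {
    if (this.tags.has(tag)) return "rejected";
    this.tags.set(tag, manifest);
    return "ok";
  }

  has(tag: string): boolean {
    return this.tags.has(tag);
  }

  get(tag: string): Manifest | undefined {
    return this.tags.get(tag);
  }
}

// Uploads in the given order unless the tag already exists, in which case
// nothing is pushed at all (the "cache" behaviour on a retried deploy).
function publishIfMissing(repo: ImmutableTagRepo, tag: string, uploads: Manifest[]): PushResult[] {
  if (repo.has(tag)) return ["skipped"];
  return uploads.map((m) => repo.push(tag, m));
}

const repo = new ImmutableTagRepo();
const uploads: Manifest[] = [
  { kind: "attestation", sizeBytes: 0 }, // here the attestation happens to go first
  { kind: "image", sizeBytes: 50_000_000 },
];
console.log(publishIfMissing(repo, "abc123", uploads)); // first deploy
console.log(publishIfMissing(repo, "abc123", uploads)); // retry
```

In this model the tag ends up pointing at the attestation, and the retry never replaces it, which matches the reported pattern of a failed push followed by deployments that use an unusable image.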

@StarfieldDesigns

StarfieldDesigns commented Feb 10, 2025

Hello all, I had this issue recently too and solved it by following steps found in another thread related to this one:

Key note here: Changing my Docker BuildX and attestation settings solved the ECR PUT 400 status code problem, but deleting the 0 byte images from my CDK's ECR repository is what got my deployments to actually work again (I was seeing deploy errors for Lambda functions that did not stabilize).

It sounds like the AWS CDK team is aware of this issue and has it on their radar, currently as a P2 issue sitting in the backlog. It would be great to see this bumped up to P1 and patched, but it does seem to affect only a smallish subset of people, and there is a workaround for it. Hope this helps anyone having this issue!

@kaizencc
Contributor

Game plan is for cdklabs/cdk-assets#342 to be merged and released, with the new cdk-assets version included in this week's cdk release. The 0 byte images either need to be deleted, or your docker assets need to be updated in some way so that the hash changes. The 0 byte images are squatting on your valid hashes :(.
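For anyone cleaning up by hand, a minimal sketch of picking out the suspect images from a listing follows. The ImageDetailLike shape is an assumption that mirrors the imageDigest, imageTags, and imageSizeInBytes fields reported by ECR's DescribeImages; the function name and the size threshold are hypothetical, and the reported size of an attestation-only entry may not be exactly zero, so adjust maxBytes to what you actually see.

```typescript
// Assumed shape: a subset of the per-image fields returned by ECR DescribeImages.
interface ImageDetailLike {
  imageDigest?: string;
  imageTags?: string[];
  imageSizeInBytes?: number;
}

// Returns the digests of images whose reported size is at or below maxBytes.
// Entries without a digest or without a reported size are left alone.
function findSuspectDigests(details: ImageDetailLike[], maxBytes = 0): string[] {
  return details
    .filter(
      (d) =>
        d.imageDigest !== undefined &&
        d.imageSizeInBytes !== undefined &&
        d.imageSizeInBytes <= maxBytes,
    )
    .map((d) => d.imageDigest as string);
}

// Example with made-up data.
const listing: ImageDetailLike[] = [
  { imageDigest: "sha256:aaa", imageTags: ["hash-1"], imageSizeInBytes: 0 },
  { imageDigest: "sha256:bbb", imageTags: ["hash-2"], imageSizeInBytes: 48_000_000 },
];
console.log(findSuspectDigests(listing));
```

The returned digests are only candidates for deletion; review them before removing anything from the repository.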

@ChristopherAUGSCM

I fixed this by disabling containerd under Settings -> General in Docker Desktop.

@kaizencc Thank you for your help!

github-merge-queue bot pushed a commit to cdklabs/cdk-assets that referenced this issue Feb 11, 2025
There are various issues in cdk that can be traced back to attestations
in docker:

aws/aws-cdk#30258
aws/aws-cdk#31549
aws/aws-cdk#33264

cdk-assets cannot work with docker containerd because it will attempt to
upload multiple files to the same hash in ECR, and our ECR repository is
immutable (by requirement). docker recently changed their default to
turn on containerd which causes this issue to skyrocket.

the hotfix here is to add an environment variable when calling `docker`
so that the attestation file is not added to the manifest. we can later
look into adding support for including
[provenance](https://docs.docker.com/build/metadata/attestations/slsa-provenance/)
attestations if there is need for it.

i've chosen to fix this via environment variable instead of as a command
option `--provenance=false` because we must keep docker replacements in
mind, and at least finch [does
not](https://runfinch.com/docs/cli-reference/finch_build/) have a
`provenance` property at the moment.

in addition to this unit test that shows the env variable exists when
`docker build` is called, i have also ensured that this solves the issue
in my local setup + symlinked `cdk-assets`.
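The commit message above does not name the environment variable. Docker Buildx documents `BUILDX_NO_DEFAULT_ATTESTATIONS` as a way to turn off default provenance attestations, so the sketch below assumes that is the variable meant; the helper name is hypothetical and this is not the cdk-assets implementation.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns a copy of the given environment with Buildx's
// default attestations switched off. Assumption: BUILDX_NO_DEFAULT_ATTESTATIONS
// is the variable the fix relies on; the source does not say.
function withAttestationsDisabled(base: Env): Env {
  return { ...base, BUILDX_NO_DEFAULT_ATTESTATIONS: "1" };
}

// Example: the result could be passed as the `env` option when spawning
// `docker build`, so the setting applies to that one invocation only.
const childEnv = withAttestationsDisabled({ PATH: "/usr/local/bin:/usr/bin" });
console.log(childEnv);
```

Setting the variable per invocation, rather than passing a build flag, fits the reasoning in the commit message: a docker replacement that does not understand a given variable can simply ignore it, whereas an unknown flag would be rejected.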
aws-cdk-automation pushed a commit to cdklabs/cdk-assets that referenced this issue Feb 11, 2025

(cherry picked from commit 8bdea13)

# Conflicts:
#	lib/private/docker.ts
#	test/private/docker.test.ts
@mergify mergify bot closed this as completed in #33415 Feb 12, 2025
@mergify mergify bot closed this as completed in ccd5f38 Feb 12, 2025

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.


@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 12, 2025
yashkh-amzn pushed a commit to yashkh-amzn/aws-cdk that referenced this issue Feb 21, 2025
The latest cdk-assets is required in cdk to mitigate an ECR upload issue. It includes the following fix: cdklabs/cdk-assets#342. The following issues are related to this:

aws#30258
aws#31549
aws#33264

I am keeping aws#31549 open as it is still true. This [feature request](cdklabs/cdk-assets#348) tracks the work to make cdk-assets compatible with containerd.

Closes aws#30258 and closes aws#33264

7 participants