
Docker-compose cache_from (using buildkit) builds to wrong step #1711

Closed
jamesholmes-linktree opened this issue Oct 2, 2020 · 9 comments

@jamesholmes-linktree commented Oct 2, 2020

I have a single-stage Dockerfile. If I build it with docker-compose from scratch, everything works well. In this build I have BUILDKIT_INLINE_CACHE=1 set and I'm pushing to AWS ECR. When, for subsequent builds, I use the previously built image as a cache with cache_from, the subsequent build is built to the wrong step (as evidenced by missing files in the image), despite every step showing up in the output as CACHED.

This is making it impossible to rely on the cache.
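
For reference, a minimal sketch of the setup being described; the service name, image URI, and tag are hypothetical placeholders:

# docker-compose.yml (hypothetical names, illustrating cache_from plus inline cache)
version: "3.8"
services:
  app:
    image: 0123456789.dkr.ecr.us-east-1.amazonaws.com/app:latest
    build:
      context: .
      cache_from:
        - 0123456789.dkr.ecr.us-east-1.amazonaws.com/app:latest
      args:
        BUILDKIT_INLINE_CACHE: 1

Built with BuildKit enabled, e.g. COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose build.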

@tonistiigi (Member)

Reproducer?

@jamesholmes-linktree (Author)

> Reproducer?

Unfortunately it's a private repo. I might be able to privately share at least the Dockerfile, the docker-compose YAML, and the CI server output. Let me know how I can do that, if it's useful.

@jamesholmes-linktree (Author)

In an effort to be slightly more helpful, I've redacted the Dockerfile a bit:

# -----------------------
# 🏡 Base
# -----------------------
FROM node:14.5.0-alpine3.10

# OS Dependencies
ENV APK_ADD="libc6-compat"
ENV PATH=$PATH:/root/.pulumi/bin

# Install missing APK dependencies.
RUN apk add libc6-compat && \
    apk add --virtual native-deps \
    python \
    make \
    g++ \
    chromium \
    unzip \
    curl && \
    # Install node-prune
    curl -sfL https://install.goreleaser.com/github.com/tj/node-prune.sh | sh -s -- -b /usr/local/bin && \
    # Install Pulumi CLI
    curl -fsSL https://get.pulumi.com/ | sh && \
    # Install Kubectl (can this be removed?)
    curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl && \
    chmod +x ./kubectl && mv ./kubectl /usr/local/bin && \
    # Install AWS CLI
    curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "awscli-bundle.zip" && \
    unzip awscli-bundle.zip && \
    ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws

# -----------------------
# 🥜 Provision
# -----------------------
COPY package.json yarn.lock tsconfig.json lerna.json babel.config.js .prettierrc /home/node/app/
COPY packages/pkg1/package.json  /home/node/app/packages/pkg1/
COPY packages/pkg2/package.json /home/node/app/packages/pkg2/
COPY packages/pkg3/package.json /home/node/app/packages/pkg3/
COPY packages/pkg4/package.json /home/node/app/packages/pkg4/
COPY packages/pkg5/package.json /home/node/app/packages/pkg5/
COPY services/svc1/package.json /home/node/app/services/svc1/
RUN touch /tmp/provision.txt

# -----------------------
# 📦 Dependencies
# -----------------------
WORKDIR /home/node/app

RUN yarn --pure-lockfile --frozen-lockfile && \
    rm -rf /usr/local/share/.cache && \
    # Remove unnecessary node_modules
    rm -rf \
    /home/node/app/node_modules/@sentry \
    /home/node/app/node_modules/gatsby \
    /home/node/app/node_modules/jsc-android \
    /home/node/app/node_modules/react-docgen-typescript-loader \
    /home/node/app/node_modules/react-docgen-typescript-plugin \
    /home/node/app/node_modules/react-native \
    /home/node/app/node_modules/tslint-react && \
    # Remove unnecessary files (@todo - do via profiles.Dockerignore)
    node-prune /home/node/app/services/svc1/node_modules && \
    node-prune /home/node/app/packages/pkg1/node_modules && \
    node-prune /home/node/app/packages/pkg2/node_modules && \
    node-prune /home/node/app/packages/pkg3/node_modules && \
    node-prune /home/node/app/packages/pkg4/node_modules

RUN touch /tmp/dependencies.txt

# -----------------------
# 🚧 Prepare
# -----------------------

COPY services/svc1 services/svc1
COPY packages/pkg1 packages/pkg1
COPY packages/pkg2 packages/pkg2
COPY packages/pkg3 packages/pkg3
COPY packages/pkg4 packages/pkg4
COPY packages/pkg5 packages/pkg5
COPY .buildkite/scripts /home/node/app/.buildkite/scripts

RUN yarn lerna run prepare
RUN touch /tmp/prepare.txt

# -----------------------
# 👟 Build
# -----------------------

WORKDIR /home/node/app/services/svc1
RUN chown -R node:node /home/node/app/services
USER node
ENV STAGE=${STAGE}
ENV CDN_DISTRIBUTION_URL=${CDN_DISTRIBUTION_URL}

RUN rm -rf .env* mock-server kubernetes infrastructure && yarn build
# @todo - Remove all APK packages that are not needed for the build step here.
# @todo - Remove all unused node_modules here once the binary is built.

EXPOSE 3000
RUN touch /tmp/build.txt

Using the above without a cache results in a working container with each touched file where it should be. A subsequent build of exactly the same git checkout, with no changes, produces build output showing every layer as CACHED (although that output is not in step order, which I assume is normal for BuildKit), and pushing the image shows that all of the layers already existed. However, pulling the rebuilt image back down on my local machine and running it shows that only /tmp/dependencies.txt and /tmp/provision.txt now exist (and ECR shows the image is now smaller by a few MB).
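
For reference, the check amounts to something like the following; the image URI is a hypothetical placeholder:

# Pull the cached rebuild and list the marker files created by each "RUN touch" step.
docker pull 0123456789.dkr.ecr.us-east-1.amazonaws.com/app:latest
docker run --rm 0123456789.dkr.ecr.us-east-1.amazonaws.com/app:latest ls /tmp
# Expected: build.txt  dependencies.txt  prepare.txt  provision.txt
# Observed after the cached rebuild: dependencies.txt  provision.txt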

@jamesholmes-linktree (Author) commented Oct 4, 2020

Also of interest: when the image is built from scratch, the push involves these layers:


7f90ab648cad: Pushed
bee1a629afc4: Pushed
d15ee18fe985: Pushed
5f70bf18a086: Pushed
cfca9043b972: Pushed
024bd99960b9: Pushed
1af1ac72f575: Pushed
fdf7ea01c9aa: Pushed
c91e58e38689: Pushed
a45a76d5c6da: Pushed
9142355cfba4: Pushed
c17699636e5c: Pushed
40320245b213: Pushed
5cf05b445559: Pushed
4ed5dd3e8ce8: Pushed
15f036b03c87: Pushed
39e2df666b1a: Pushed
26fc9642e637: Pushed
5623b84a2b2e: Pushed
5dc6b48239c7: Pushed
19c04113d79e: Pushed
1a88e2f7c825: Pushed
668ef74a77d1: Pushed
1dabdaa014ec: Pushed
68d82e861e44: Layer already exists
44ebb6228ede: Layer already exists
258b9c5c3c29: Layer already exists
1b3ee35aacca: Layer already exists

However, when that is used as the cache for the next build, fewer layers are in the push:


40320245b213: Layer already exists
5cf05b445559: Layer already exists
4ed5dd3e8ce8: Layer already exists
5f70bf18a086: Layer already exists
15f036b03c87: Layer already exists
39e2df666b1a: Layer already exists
26fc9642e637: Layer already exists
5623b84a2b2e: Layer already exists
5dc6b48239c7: Layer already exists
19c04113d79e: Layer already exists
1a88e2f7c825: Layer already exists
668ef74a77d1: Layer already exists
1dabdaa014ec: Layer already exists
68d82e861e44: Layer already exists
44ebb6228ede: Layer already exists
258b9c5c3c29: Layer already exists
1b3ee35aacca: Layer already exists

@jamesholmes-linktree (Author) commented Oct 4, 2020

Digging around in the built image some more, the result of COPY services/svc1 services/svc1 was included, but the subsequent steps were not. I'm testing whether the chown step messes up the cache for subsequent builds.

@tonistiigi (Member)

You can also test #1568 (19.03.13)

moby/moby#41219

@jamesholmes-linktree (Author)

Thanks, I'll give this a go right now (confirmed we're on 19.03.2).
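
For reference, the daemon version can be checked with the standard docker CLI:

docker version --format '{{.Server.Version}}'
# 19.03.2 -> affected; 19.03.13 includes the fix referenced in moby/moby#41219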

@jamesholmes-linktree (Author)

I can confirm that 19.03.13 fixes this issue.

@ypadlyak commented Apr 9, 2021

@tonistiigi What if we are using an older Docker version on Google Cloud Build, but specify the frontend version with # syntax=docker/dockerfile:1.2.1? Should it be fixed as well? We are experiencing missing data :(
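
For reference, such a directive goes on the first line of the Dockerfile and pins the BuildKit Dockerfile frontend used to parse it:

# syntax=docker/dockerfile:1.2.1
FROM node:14.5.0-alpine3.10
# ... rest of the Dockerfile unchanged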
