Issue with sharing cache across jobs #156
Comments
You can still use a multi-stage Dockerfile with two separate runs, where the first run builds the `a` target and the second run builds the `b` target (which depends on `a`). Instead of doing, e.g. …
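A minimal sketch of that two-run pattern, assuming a hypothetical Dockerfile with stages named `a` and `b` (the file name, stage contents and tags below are illustrative, not from the repo in question):

```sh
cat > Dockerfile.two-stage <<'EOF'
FROM alpine:3.19 AS a
RUN echo "long build for a" > /a.txt

FROM a AS b
RUN echo "build for b, on top of a" > /b.txt
EOF

# Run 1 (job 1): build only stage "a" and export its layer cache to disk
docker buildx build -f Dockerfile.two-stage . --target=a \
  --cache-to=type=local,mode=max,dest=./cache-a

# Run 2 (job 2): build stage "b"; the cached layers of "a" are reused, not rebuilt
docker buildx build -f Dockerfile.two-stage . --target=b \
  --cache-from=type=local,src=./cache-a --load -t example/b:dev
```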
For example, we use the same pattern in buildkit CI parallelization. The first task builds …
Thank you very much Tonis! Aloys
Cache mounts are a different concept from the instruction cache. They are local persistent directories that don't have a deterministic state. Theoretically, you can use the build itself to read files into or out of these directories, but I don't think it would give you a performance increase. The point of the instruction cache is to determine that the sources are still valid for the build and skip over the instructions, while cache mounts provide an incremental boost while the RUN command is running. For big Dockerfiles, there have been discussions about supporting "include" to split files apart, but nobody has made a prototype yet.
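Not from the thread, just a minimal illustration of the distinction (image, file names and commands are placeholders): the first RUN is skipped entirely on an unchanged rebuild thanks to the instruction cache, while the second always executes but whatever it writes into the cache mount persists inside the builder between builds.

```sh
cat > Dockerfile.cache-demo <<'EOF'
FROM alpine:3.19
# instruction cache: this whole step is skipped on an unchanged rebuild
RUN echo "expensive one-off setup"
# cache mount: this step re-runs whenever its inputs change, but anything
# written under /root/.ccache survives in the builder across builds
RUN --mount=type=cache,target=/root/.ccache \
    echo "compile here with CCACHE_DIR=/root/.ccache"
EOF
docker buildx build -f Dockerfile.cache-demo .
```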
Thanks Tonis.
The only way to do this atm is with the build request itself. E.g. you can build a stage that loads files into a cache mount, or a stage that just returns the files in the cache mount. I don't recommend doing this, though, unless you can clearly measure that it improves your performance.
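A hedged sketch of what those two helper stages could look like (file, stage and path names are invented for illustration; it assumes a `cache-seed` directory exists in the build context):

```sh
cat > Dockerfile.cache-io <<'EOF'
FROM alpine:3.19 AS seed
# copy a bind-mounted directory from the build context into the cache mount
RUN --mount=type=cache,target=/cache \
    --mount=type=bind,source=cache-seed,target=/seed \
    cp -a /seed/. /cache/

FROM alpine:3.19 AS dump
# copy the cache mount's current contents into a normal layer
RUN --mount=type=cache,target=/cache cp -a /cache /out

FROM scratch AS cache-export
COPY --from=dump /out /
EOF

# Export whatever is currently in the cache mount to ./cache-dump on the host:
docker buildx build -f Dockerfile.cache-io . \
  --target=cache-export -o type=local,dest=./cache-dump
```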
Thanks again Tonis!
@aloysbaillet for the same named target, the cache mount will be shared between jobs in the same builder. For the instruction cache, you can push the intermediary layers to a network-close registry and access that from multiple builders.
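For the instruction-cache half, a hedged example of the registry-backed cache described above (the registry reference is a placeholder):

```sh
# Job 1: build, push the image and a full (mode=max) cache to a nearby registry
docker buildx build . \
  -t registry.example.com/myapp:latest --push \
  --cache-to=type=registry,ref=registry.example.com/myapp:buildcache,mode=max

# Job 2 (possibly on another builder): start from that cache
docker buildx build . \
  -t registry.example.com/myapp:latest \
  --cache-from=type=registry,ref=registry.example.com/myapp:buildcache
```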
Thanks Fernando! That's great info, and I updated my test repo with all the useful knowledge gathered here.
So I went back to my test project and tried to use both the instruction cache and the mounted cache as a fallback, but I can't find a way to use both at the same time! I'm stuck between the instruction cache (really fast when not a single character of the RUN command has changed, but requiring a complete rebuild when anything changes) and the fallback of the mounted cache, which works really well to speed up builds when the instruction cache doesn't match (in my case a 3 hour build becomes a 10 minute build).

The main problem with trying to use both caches is that I need to somehow inject the content of the mounted cache into the build without invalidating the instruction cache (and ideally without requiring an rsync server as explained here: http://erouault.blogspot.com/2019/05/incremental-docker-builds-using-ccache.html ). Using a bind mount to expose the previous ccache content cannot work: the first build will have an empty cache to mount, and that is recorded in the instruction cache; the second build will use the first build's cache but find a non-empty ccache, which invalidates the instruction cache from the first build.

See this file for an example: https://github.com/aloysbaillet/buildx-testing/blob/master/Dockerfile#L15

FROM n0madic/alpine-gcc:8.3.0 as buildx-testing-image-a-builder
RUN --mount=type=cache,target=/tmp/ccache \
--mount=type=cache,target=/tmp/downloads \
--mount=type=bind,source=ccache,target=/tmp/ccache_from \
export CCACHE_DIR=/tmp/ccache && \
export DOWNLOADS_DIR=/tmp/downloads && \
if [ -f /tmp/ccache_from/ccache.tar.gz ] ; then cd /tmp/ccache && tar xf /tmp/ccache_from/ccache.tar.gz && cd - ; fi && \
if [ ! -f $DOWNLOADS_DIR/Python-3.7.3.tgz ] ; then curl --location https://www.python.org/ftp/python/3.7.3/Python-3.7.3.tgz -o $DOWNLOADS_DIR/Python-3.7.3.tgz ; fi && \
tar xf $DOWNLOADS_DIR/Python-3.7.3.tgz && \
cd Python-3.7.3 && \
./configure \
--prefix=/usr/local \
--enable-shared && \
make -j4 && \
make install && \
ccache --show-stats && \
tar cfz /tmp/ccache.tar.gz /tmp/ccache
FROM scratch as buildx-testing-image-a-ccache
COPY --from=buildx-testing-image-a-builder /tmp/ccache.tar.gz /ccache/ccache.tar.gz
which is used there: https://github.com/aloysbaillet/buildx-testing/blob/master/.github/workflows/dockerimage.yml#L30

tar xf ccache/ccache.tar.gz
# buildx-testing-image-a
docker buildx build \
. \
-t aloysbaillet/buildx-testing-image-a:0 \
--target=buildx-testing-image-a \
--cache-from=type=local,src=docker-cache \
--cache-to=type=local,mode=max,dest=docker-cache-a \
--load
# buildx-testing-image-a-ccache
docker buildx build \
. \
--target=buildx-testing-image-a-ccache \
--platform=local \
-o .

It really feels like, to make this work, we would need a new … What do you think?
Do keep in mind that GitHub nodes currently don't support any form of cache and each job runs from a new node, so any host cache is lost between runs and jobs.
Indeed, I'm actually maintaining a set of docker images that get built on Azure Pipelines (which has caching available in preview; I believe caching is coming to GitHub Actions very soon...), so I'm emulating this feature by doing a curl of the previous build's artifact.
After some introspection into the buildx container I found this hack, which properly moves the whole buildkit cache between nodes:

backup:

docker buildx create --name cacheable --use
docker buildx build ... # no --cache-to or --cache-from
docker run --rm \
--volumes-from buildx_buildkit_cacheable0 \
-v $(pwd)/buildkit-cache-a:/backup \
alpine /bin/sh -c "cd / && tar cf /backup/backup.tar.gz /var/lib/buildkit" restore: docker buildx create --name cacheable --use
docker buildx inspect --bootstrap
docker buildx stop
docker run --rm \
--volumes-from buildx_buildkit_cacheable0 \
-v $(pwd)/buildkit-cache-a:/backup \
alpine /bin/sh -c "cd / && tar xf /backup/backup.tar.gz" Obviously this is a bit brittle as it assumes the naming convention of the docker container created by |
You can create cache images to disk if you don't want to use a registry.
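For completeness, a hedged example of that on-disk variant (paths are illustrative); note that, as discussed above, this exports the instruction cache only, not the contents of cache mounts:

```sh
# Export the layer/instruction cache to a directory that CI can archive:
docker buildx build . --cache-to=type=local,mode=max,dest=./buildx-cache

# Later, after restoring ./buildx-cache on another node:
docker buildx build . --cache-from=type=local,src=./buildx-cache
```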
Unfortunately …
You mean host cache?
Cache mounts support …
Thanks Tonis, I don't think this …

"mounted cache": the content of the cache defined by `--mount=type=cache,target=...`

"instruction cache": the buildkit image cache, keyed by each RUN line.
And you only update …
But this means that one has to choose in advance between "mounted cache" and "instruction cache", and it is impossible to know in advance which one will be valid. I need both caches to be available at all times. Here's a timeline of build events:

Build 1: create empty … -> everything builds

Build 2: get previous … -> everything builds again!

If …
I'm also struggling with this. I just installed buildx into my Gitlab docker:dind pipeline in the hope that my mount caches would be exported along with the layers, but a job retry did not show any signs of the cache being exported. And then I stumbled across this issue. It would be really great if we could control inclusion of mount caches in the cache export!
## Build containers in parallel

The `docker_build` job used in the `kind_integration.yml`, `cloud_integration.yml` and `release.yml` workflows relied on running `bin/docker-build`, which builds all the containers in sequence. Now each container is built in parallel using `strategy.matrix`.

## New caching strategy

CI now uses `docker buildx` for building the container images, which allows using an external cache source for builds, a location in the filesystem in this case. That location gets cached using actions/cache, using the key `{{ runner.os }}-buildx-${{ matrix.target }}-${{ env.TAG }}` and the restore key `${{ runner.os }}-buildx-${{ matrix.target }}-`. For example, when building the `web` container, its image and all the intermediary layers get cached under the key `Linux-buildx-web-git-abc0123`. When that has been cached in the `main` branch, that cache will be available to all the child branches, including forks. If a new branch in a fork asks for a key like `Linux-buildx-web-git-def456`, the key won't be found during the first CI run, but the system falls back to the key `Linux-buildx-web-git-abc0123` from `main`, and so the build will start with a warm cache (more info about how keys are matched in the [actions/cache docs](https://docs.github.com/en/actions/configuring-and-managing-workflows/caching-dependencies-to-speed-up-workflows#matching-a-cache-key)).

## Packet host no longer needed

To benefit from the warm caches both in non-forks and forks as just explained, we're ditching doing the builds in Packet, and now everything runs in the GitHub runner VMs. The build performance for non-forks remains similar when using warm caches, in part due to the new parallel strategy. E.g. before, the docker builds (all sequential) were taking a total time of around 2 minutes in Packet, and now the longest parallel build (`cni-plugin`) takes around the same time. This also means the workflow yamls were vastly simplified, no longer having to have separate logic for non-forks and forks.

## Local builds

You are still able to run `bin/docker-build` or any of the `docker-build.*` scripts. To make use of buildx, run those same scripts after having set the env var `DOCKER_BUILDKIT=1`. Using buildx supposes you have installed it, as instructed [here](https://github.com/docker/buildx).

## Other

- A new script `bin/docker-cache-prune` is used to remove unused images from the cache. Without that the cache grows constantly and we can rapidly hit the 5GB limit (when the limit is reached the oldest entries get evicted).
- The `go-deps` dockerfile base image was changed from `golang:1.14.2` (ubuntu based) to `golang:1.14.2-alpine`, also to conserve cache space.

## Known issues

- Most dockerfiles rely on the `go-deps` base image at a hard-coded tag, which they retrieve from the gcr registry. Whenever that base image changes, it gets rebuilt prior to building the other images. Now we're using the docker-container driver for buildx, and it can't use the local cache like that (see docker/buildx#156). So changes to `go-deps` will break the build. This will be addressed in a separate PR.
## Motivation

- Improve build times in forks, especially when rerunning builds because of some flaky test.
- Start using `docker buildx` to pave the way for multiplatform builds.

## Performance improvements

These timings were taken for the `kind_integration.yml` workflow when we merged and reran the lodash bump PR (#4762).

Before these improvements:
- when merging: `24:18`
- when rerunning after merge (docker cache warm): `19:00`
- when running the same changes in a fork (no docker cache): `32:15`

After these improvements:
- when merging: `25:38`
- when rerunning after merge (docker cache warm): `19:25`
- when running the same changes in a fork (docker cache warm): `19:25`

As explained below, non-forks and forks now use the same cache, so the important takeaway is that forks will always start with a warm cache and we'll no longer see long build times like the `32:15` above. The downside is a slight increase in the build times for non-forks (up to a little more than a minute, depending on the case).

## Build containers in parallel

The `docker_build` job in the `kind_integration.yml`, `cloud_integration.yml` and `release.yml` workflows relied on running `bin/docker-build`, which builds all the containers in sequence. Now each container is built in parallel using a matrix strategy.

## New caching strategy

CI now uses `docker buildx` for building the container images, which allows using an external cache source for builds, a location in the filesystem in this case. That location gets cached using actions/cache, using the key `{{ runner.os }}-buildx-${{ matrix.target }}-${{ env.TAG }}` and the restore key `${{ runner.os }}-buildx-${{ matrix.target }}-`. For example, when building the `web` container, its image and all the intermediary layers get cached under the key `Linux-buildx-web-git-abc0123`. When that has been cached in the `main` branch, that cache will be available to all the child branches, including forks. If a new branch in a fork asks for a key like `Linux-buildx-web-git-def456`, the key won't be found during the first CI run, but the system falls back to the key `Linux-buildx-web-git-abc0123` from `main`, and so the build will start with a warm cache (more info about how keys are matched in the [actions/cache docs](https://docs.github.com/en/actions/configuring-and-managing-workflows/caching-dependencies-to-speed-up-workflows#matching-a-cache-key)).

## Packet host no longer needed

To benefit from the warm caches both in non-forks and forks as just explained, we're required to ditch doing the builds in Packet, and now everything runs in the GitHub runner VMs. As a result there's no longer separate logic for non-forks and forks in the workflow files; `kind_integration.yml` was greatly simplified, but `cloud_integration.yml` and `release.yml` got a little bigger in order to use the actions artifacts as a repository for the images built. This bloat will be fixed when support for [composite actions](https://github.com/actions/runner/blob/users/ethanchewy/compositeADR/docs/adrs/0549-composite-run-steps.md) lands in GitHub.

## Local builds

You are still able to run `bin/docker-build` or any of the `docker-build.*` scripts. And to make use of buildx, run those same scripts after having set the env var `DOCKER_BUILDKIT=1`. Using buildx supposes you have installed it, as instructed [here](https://github.com/docker/buildx).

## Other

- A new script `bin/docker-cache-prune` is used to remove unused images from the cache. Without that the cache grows constantly and we can rapidly hit the 5GB limit (when the limit is reached the oldest entries get evicted).
- The `go-deps` dockerfile base image was changed from `golang:1.14.2` (ubuntu based) to `golang:1.14.2-alpine`, also to conserve cache space.

## Known issues to be addressed in a followup PR

- Most dockerfiles rely on the `go-deps` base image at a hard-coded tag, which they retrieve from the gcr registry. Whenever that base image changes, it gets rebuilt prior to building the other images. Now that we're using the docker-container driver for buildx, it can't use the local cache for retrieving the `go-deps` image just built (see docker/buildx#156). So changes to `go-deps` will break the build.
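A hedged sketch of the buildx half of that caching strategy (the target name and paths are illustrative; the surrounding actions/cache save and restore steps are omitted):

```sh
# actions/cache restores ./buildx-cache using the keys described above,
# then the build both reads from it and writes a refreshed copy:
docker buildx build . \
  --target=web \
  --cache-from=type=local,src=./buildx-cache \
  --cache-to=type=local,mode=max,dest=./buildx-cache-new \
  --load

# Swapping in the freshly written cache keeps the cached directory from
# accumulating stale entries between runs:
rm -rf ./buildx-cache && mv ./buildx-cache-new ./buildx-cache
```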
It's been 3 years since the last activity in this issue. Can anyone confirm whether the state of remote sharing of the mount cache remains the same, or whether there have been any new developments / workarounds in this area?
Going to close this. The initial report is about reusing a build result as an image in another build, which has been resolved with https://www.docker.com/blog/dockerfiles-now-support-multiple-build-contexts/. But then it goes into various unrelated cache mount topics, eventually how to transfer … For mount cache persistence, follow moby/buildkit#1512 or experiment with copying cache mounts into the instruction cache as in #156 (comment).
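For reference, a hedged sketch of the multiple-build-contexts approach from that blog post, using the image name from this issue (it assumes a reasonably recent buildx; how image A's artifact is passed between CI jobs is an assumption, not something described in the thread):

```sh
# Job A: export its result as an OCI layout artifact instead of pushing to a registry
docker buildx build . --target=buildx-testing-image-a -o type=oci,dest=image-a.tar
mkdir image-a && tar -xf image-a.tar -C image-a   # upload image-a/ as a CI artifact

# Job B: after downloading the artifact, override the FROM reference so it
# resolves to the local OCI layout rather than a registry or the docker daemon
# (older buildx versions may require pinning the layout by tag or digest)
docker buildx build . --target=buildx-testing-image-b \
  --build-context aloysbaillet/buildx-testing-image-a:0=oci-layout://./image-a \
  --load
```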
Hi,
I am trying to reuse a "local" cache across multiple CI jobs without using a registry. The main reason to avoid the registry is to nicely associate a CI run with the generated artifacts, and to not add more images to the registry when the CI system already supports artifacts tied to the builds.
I made an example repo with 2 images: imageA builds Python from source using ccache (https://github.com/aloysbaillet/buildx-testing/blob/master/imageA/Dockerfile) and the second image just uses the first image. The build commands are in https://github.com/aloysbaillet/buildx-testing/blob/master/.github/workflows/dockerimage.yml but here's a summary:
- build A (uses a previous local cache and saves a new cache for future runs)
- build B (trying to use cache A to avoid pulling the non-existing image `aloysbaillet/buildx-testing-image-a:0`)

Here is the sequence of builds:
https://github.com/aloysbaillet/buildx-testing/runs/241659655
The main issue I'm facing is how to make job B believe that `aloysbaillet/buildx-testing-image-a:0` is a valid docker image. Running `docker load` is ignored by buildx when using the `docker-container` driver, which is necessary for `--cache-from` and `--cache-to` to function.

Is there a way to populate the buildkit image from docker? I thought using the cache from job A would have been enough, as image A is tagged and is in cache A...
N.B. it would seem obvious to use a multi-stage build for images A and B; unfortunately, in my real-world scenario image A takes around 2 hours to build, and one of my image Bs takes around 7 hours, which times out my free CI system... Hence the need to split the jobs and use ccache.
Thanks in advance for any help!
Cheers,
Aloys