Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-10798: [CI] Made certain workflows be less triggered on changes of dockerfiles #8824

Closed
wants to merge 3 commits into from
Closed

Conversation

jorgecarleitao
Copy link
Member

@jorgecarleitao jorgecarleitao commented Dec 3, 2020

This PR restricts some CI workflows from getting triggered when any dockerfile changes, thereby reducing the workload on our queue when someone modifies a dockerfile.

This is particularly important when testing the CI itself, as devs will trigger multiple runs of the CI for single change of a single dockerfile, which triggers the execution of a large number of workflows.

@github-actions
Copy link

github-actions bot commented Dec 3, 2020

@jorgecarleitao jorgecarleitao changed the title ARROW-10798: [CI] Made workflows be less triggered. ARROW-10798: [CI] Made certain workflows be less triggered. Dec 3, 2020
@jorgecarleitao jorgecarleitao changed the title ARROW-10798: [CI] Made certain workflows be less triggered. ARROW-10798: [CI] Made certain workflows be less triggered on changes of dockerfiles Dec 3, 2020
@jorgecarleitao
Copy link
Member Author

ccing @pitrou , since I have now added cpp to the mix.

@pitrou pitrou requested a review from kszucs December 3, 2020 17:55
@pitrou
Copy link
Member

pitrou commented Dec 3, 2020

@kszucs Can you take a look here?

Copy link
Member

@nealrichardson nealrichardson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this. It's a little tricky, which is why I punted on doing this before, but here's one note to watch out for.

@@ -21,7 +21,7 @@ on:
push:
paths:
- '.github/workflows/cpp.yml'
- 'ci/docker/**'
- 'ci/docker/*cpp*'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to add this one (or similar) to the R, Python, and Ruby jobs too. Their dockerfiles generally build on the cpp images (you can trace through the base parameter in docker-compose.yml, e.g. https://github.com/apache/arrow/blob/master/docker-compose.yml#L430)

Copy link
Member Author

@jorgecarleitao jorgecarleitao Dec 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it took me so long to go through this comment. I actually think that this PR is correct, but it surfaces a bigger problem present in master.

Let's say that we have an image X (typically build via archery build X), that depends on a base image (typically build via archery build base).

Let's analyze what happens in the following 4 cases:

  • push to PR with no change to base image's dockerfile
  • push to master with no change to base image's dockerfile
  • push to PR with a change to base image's dockerfile
  • push to master with a change to base image's dockerfile

Let's also call the hash of the current base image in the registry as base_v1.

push to PR with no change to base image's dockerfile

When a PR is open, image X is built using the dockerfile on the PR for that image, and it uses base_v1 from the registry to base its build from.

push to master with no change to base image's dockerfile

Same as before.

push to PR with a change to base image's dockerfile

In this situation, the PR triggers two CI jobs: the build of image X and the build image base. Since these two jobs are independent workflows, image X will pick base_v1, while the base image job will build base_v2.

This happens because docker-compose run X does not know that X depends on the compose job base to create the image base, and instead has that image available from the remote docker registry ${REPO}.

Because PRs do not push images to the registry, a new push to the PR will not alter this behavior: the image X will continue to be built based on base_v1. Thus, any error derived from building image X from the new image base_v2 will not surface.

push to master with a change to base image's dockerfile (e.g. PR is merged)

In this situation, there is a race condition (undefined behavior): image X will be built using base_v1 if it starts running before the flow build image base finishes (the likely scenario), or it picks base_v2 if the other flow has finished.

Thus, atm, whenever a PR changes both dockerfiles, X and base, the image X on the registry is based of base_v1, but base:latest points to base_v2.

Thus, even if the new dockerfile is incompatible with base_v2, the CI will actually pass in master because it picks base_v1. On the next push to master that triggers X to be built, it will pick base_v2, and the error surfaces 🤯

Summary

In both cases where the dockerfile of base changes, that change is not propagated to the build of X. Therefore, IMO this PR is correct: changes to the dockerfile of image base do not require to trigger a change to the build of the image X, because there are no constraints in the jobs to build X to wait for the build of image base.

This issue seems to indicate that our builds based of docker images that depend on other docker images that are build using docker-compose build may be too complex as there is hidden state through the registry.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's correct but maybe @kszucs can weigh in definitively. My understanding of how docker works (which may be wrong) is that if image Y is based dockerfile y, and y is changed, then Y gets rebuilt locally in docker-compose build Y. If y is based on image X (and therefore dockerfile x), and x is changed, building Y will first build X and then proceed using y to build Y.

I see a few options:

  1. Empirically determine which is correct
  2. Defensively, make the change I suggested (add cpp dockerfile triggers to ruby, python, and r workflows) and we're safe either way
  3. Make no additional change, as you propose, and risk merging changes to C++ dockerfiles that break downstream jobs that we'll only see after merging.

The risks entailed by #3 are probably low--these files change infrequently, and unless we're doing a wholesale refactor of docker images, they're probably low risk changes. At the same time, the cost of doing #2 is also very low and prevents the surprise failures that #3 could later cause. Given the low costs (and my hesitance to make definitive statements about how docker/docker-compose handles dependent dockerfiles), I still advocate for option #2.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion: I agree with you and I do think that we should use 2 regardless, I side-tracked into a potential issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did a quick check to verify this, and AFAI can tell my hypothesis holds:

services:
  s1:
    image: t1
    build:
      context: .
      dockerfile: docker1.dockerfile

  s2:
    image: t2
    build:
      context: .
      dockerfile: docker2.dockerfile
# docker1.dockerfile
FROM python:3.7.4

CMD ls
# docker2.dockerfile
FROM t1

CMD ls
docker-compose build s2

Building s2
Step 1/3 : FROM t1
ERROR: Service 's2' failed to build : pull access denied for t1, repository does not exist or may require 'docker login': denied: requested access to the resource is denied

My understanding is that docker has no idea about docker-compose and how images get built; it only knows about where images are located, via registries. To declare such a dependency of image builds, Docker uses Multi-stage builds. Furthermore, docker-compose also does not declare image dependencies, only service dependencies (i.e. containers, not images).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, neither does docker nor docker-compose have an idea about the image hierarchy but archery does.

archery build and archery run commands are idempotent, meaning that the produced docker images are going to be the same for the same input configuration regardless of the state in the docker registry.

See the commands called underneath the archery docker run command:

❯ archery docker --dry-run run ubuntu-docs
COMPOSE_DOCKER_CLI_BUILD=1 docker-compose pull --ignore-pull-failures ubuntu-cpp
COMPOSE_DOCKER_CLI_BUILD=1 docker-compose pull --ignore-pull-failures ubuntu-python
COMPOSE_DOCKER_CLI_BUILD=1 docker-compose pull --ignore-pull-failures ubuntu-docs
COMPOSE_DOCKER_CLI_BUILD=1 docker-compose build --build-arg BUILDKIT_INLINE_CACHE=1 ubuntu-cpp
COMPOSE_DOCKER_CLI_BUILD=1 docker-compose build --build-arg BUILDKIT_INLINE_CACHE=1 ubuntu-python
COMPOSE_DOCKER_CLI_BUILD=1 docker-compose build --build-arg BUILDKIT_INLINE_CACHE=1 ubuntu-docs
COMPOSE_DOCKER_CLI_BUILD=1 docker-compose run --rm ubuntu-docs

@@ -21,7 +21,7 @@ on:
push:
paths:
- '.github/workflows/ruby.yml'
- 'ci/docker/**'
- 'ci/docker/linux-apt-ruby.dockerfile'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would need to list all dockerfiles used in the parent images, in this case:

ci/docker/ubuntu-18.04-cpp.dockerfile
ci/docker/linux-apt-c-glib.dockerfile
ci/docker/linux-apt-ruby.dockerfile

though if we wan't to run UBUNTU=20.04 archery docker run ubuntu-ruby then:

ci/docker/ubuntu-20.04-cpp.dockerfile
ci/docker/linux-apt-c-glib.dockerfile
ci/docker/linux-apt-ruby.dockerfile

This hierarchy is encoded in the docker-compose.yml depending on the env variables as well.

@kszucs
Copy link
Member

kszucs commented Jun 30, 2021

I think the safest to keep the current configuration with docker/** so we won't forget to update the workflow files if we change the image dependencies. A better approach could be to generate the workflow files from the docker-compose.yml but currently it may not worth the effort.

I'm closing this PR for now, feel free to reopen if there is anything I missed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants