-
Notifications
You must be signed in to change notification settings - Fork 12.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[workflows] Port buildkite Windows config to GitHub actions #82093
base: main
Are you sure you want to change the base?
Conversation
This reuses most of the generate-buildkite-pipeline-premerge script which determines which projects to build and which check targets to use. This new workflow only tests clang, llvm, and lld due to resource contraints on the GitHub runners.
@llvm/pr-subscribers-github-workflow Author: Tom Stellard (tstellar) ChangesThis reuses most of the generate-buildkite-pipeline-premerge script which determines which projects to build and which check targets to use. The workflow file is also based on the linux workflow from PR#80951. Unlike buildkite, which tested everything, this new workflow only tests clang, llvm, and lld due to resource constraints on the GitHub runners. Full diff: https://github.com/llvm/llvm-project/pull/82093.diff 3 Files Affected:
diff --git a/.github/workflows/compute-projects-to-test/action.yml b/.github/workflows/compute-projects-to-test/action.yml
new file mode 100644
index 00000000000000..37df06c8c301c5
--- /dev/null
+++ b/.github/workflows/compute-projects-to-test/action.yml
@@ -0,0 +1,21 @@
+name: 'Compute Projects To Test'
+inputs:
+ projects:
+ required: false
+ type: 'string'
+
+outputs:
+ check-targets:
+ description: "A space delimited list of check-targets to pass to ninja."
+ value: ${{ steps.compute-projects.outputs.check-targets }}
+
+ projects:
+ description: "A semi-colon delimited list of projects to pass to -DLLVM_ENABLE_PROJECTS."
+ value: ${{ steps.compute-projects.outputs.projects }}
+
+runs:
+ using: "composite"
+ steps:
+ - id: compute-projects
+ run: .github/workflows/compute-projects-to-test/compute-projects-to-test.sh ${{ inputs.projects }}
+ shell: bash
diff --git a/.github/workflows/compute-projects-to-test/compute-projects-to-test.sh b/.github/workflows/compute-projects-to-test/compute-projects-to-test.sh
new file mode 100755
index 00000000000000..807142668f618b
--- /dev/null
+++ b/.github/workflows/compute-projects-to-test/compute-projects-to-test.sh
@@ -0,0 +1,220 @@
+#!/usr/bin/env bash
+#===----------------------------------------------------------------------===##
+#
+# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+#
+#===----------------------------------------------------------------------===##
+
+#
+# This file generates a Buildkite pipeline that triggers the various CI jobs for
+# the LLVM project during pre-commit CI.
+#
+# See https://buildkite.com/docs/agent/v3/cli-pipeline#pipeline-format.
+#
+# As this outputs a yaml file, it's possible to log messages to stderr or
+# prefix with "#".
+
+
+set -eu
+set -o pipefail
+
+# Environment variables script works with:
+
+# Set by GitHub
+: ${GITHUB_OUTPUT:=}
+: ${RUNNER_OS:=}
+
+# Allow users to specify which projects to build.
+all_projects="bolt clang clang-tools-extra compiler-rt cross-project-tests flang libc libclc lld lldb llvm mlir openmp polly pstl"
+if [ "$#" -ne 0 ]; then
+ wanted_projects="${@}"
+else
+ wanted_projects="${all_projects}"
+fi
+
+# List of files affected by this commit
+: ${MODIFIED_FILES:=$(git diff --name-only HEAD~1...HEAD)}
+
+echo "Files modified:" >&2
+echo "$MODIFIED_FILES" >&2
+modified_dirs=$(echo "$MODIFIED_FILES" | cut -d'/' -f1 | sort -u)
+echo "Directories modified:" >&2
+echo "$modified_dirs" >&2
+echo "wanted_projects: $wanted_projects"
+
+function remove-unwanted-projects() {
+ projects=${@}
+ for project in ${projects}; do
+ if echo "$wanted_projects" | tr ' ' '\n' | grep -q -E "^${project}$"; then
+ echo "${project}"
+ fi
+ done
+}
+
+function compute-projects-to-test() {
+ projects=${@}
+ for project in ${projects}; do
+ echo "${project}"
+ case ${project} in
+ lld)
+ for p in bolt cross-project-tests; do
+ echo $p
+ done
+ ;;
+ llvm)
+ for p in bolt clang clang-tools-extra flang lld lldb mlir polly; do
+ echo $p
+ done
+ ;;
+ clang)
+ for p in clang-tools-extra compiler-rt flang libc lldb openmp cross-project-tests; do
+ echo $p
+ done
+ ;;
+ clang-tools-extra)
+ echo libc
+ ;;
+ mlir)
+ echo flang
+ ;;
+ *)
+ # Nothing to do
+ ;;
+ esac
+ done
+}
+
+function add-dependencies() {
+ projects=${@}
+ for project in ${projects}; do
+ echo "${project}"
+ case ${project} in
+ bolt)
+ for p in lld llvm; do
+ echo $p
+ done
+ ;;
+ cross-project-tests)
+ for p in lld clang; do
+ echo $p
+ done
+ ;;
+ clang-tools-extra)
+ for p in llvm clang; do
+ echo $p
+ done
+ ;;
+ compiler-rt|libc|openmp)
+ echo clang lld
+ ;;
+ flang|lldb)
+ for p in llvm clang; do
+ echo $p
+ done
+ ;;
+ lld|mlir|polly)
+ echo llvm
+ ;;
+ *)
+ # Nothing to do
+ ;;
+ esac
+ done
+}
+
+function exclude-linux() {
+ projects=${@}
+ for project in ${projects}; do
+ case ${project} in
+ cross-project-tests) ;; # tests failing
+ lldb) ;; # tests failing
+ openmp) ;; # https://github.com/google/llvm-premerge-checks/issues/410
+ *)
+ echo "${project}"
+ ;;
+ esac
+ done
+}
+
+function exclude-windows() {
+ projects=${@}
+ for project in ${projects}; do
+ case ${project} in
+ cross-project-tests) ;; # tests failing
+ compiler-rt) ;; # tests taking too long
+ openmp) ;; # TODO: having trouble with the Perl installation
+ libc) ;; # no Windows support
+ lldb) ;; # tests failing
+ bolt) ;; # tests are not supported yet
+ *)
+ echo "${project}"
+ ;;
+ esac
+ done
+}
+
+# Prints only projects that are both present in $modified_dirs and the passed
+# list.
+function keep-modified-projects() {
+ projects=${@}
+ for project in ${projects}; do
+ if echo "$modified_dirs" | grep -q -E "^${project}$"; then
+ echo "${project}"
+ fi
+ done
+}
+
+function check-targets() {
+ projects=${@}
+ for project in ${projects}; do
+ case ${project} in
+ clang-tools-extra)
+ echo "check-clang-tools"
+ ;;
+ compiler-rt)
+ echo "check-all"
+ ;;
+ cross-project-tests)
+ echo "check-cross-project"
+ ;;
+ lldb)
+ echo "check-all" # TODO: check-lldb may not include all the LLDB tests?
+ ;;
+ pstl)
+ echo "check-all"
+ ;;
+ libclc)
+ echo "check-all"
+ ;;
+ *)
+ echo "check-${project}"
+ ;;
+ esac
+ done
+}
+
+# Generic pipeline for projects that have not defined custom steps.
+#
+# Individual projects should instead define the pre-commit CI tests that suits their
+# needs while letting them run on the infrastructure provided by LLVM.
+
+# Figure out which projects need to be built on each platform
+modified_projects="$(keep-modified-projects ${all_projects})"
+echo "modified_projects: $modified_projects"
+
+if [ "${RUNNER_OS}" = "Linux" ]; then
+ projects_to_test=$(exclude-linux $(compute-projects-to-test ${modified_projects}))
+elif [ "${RUNNER_OS}" = "Windows" ]; then
+ projects_to_test=$(exclude-windows $(compute-projects-to-test ${modified_projects}))
+else
+ echo "Unknown runner OS: $RUNNER_OS"
+fi
+check_targets=$(check-targets $(remove-unwanted-projects ${projects_to_test}) | sort | uniq)
+projects=$(remove-unwanted-projects $(add-dependencies ${projects_to_test}) | sort | uniq)
+
+echo "check-targets=$(echo ${check_targets} | tr ' ' ' ')" >> $GITHUB_OUTPUT
+echo "projects=$(echo ${projects} | tr ' ' ';')" >> $GITHUB_OUTPUT
+
+cat $GITHUB_OUTPUT
diff --git a/.github/workflows/precommit-windows.yml b/.github/workflows/precommit-windows.yml
new file mode 100644
index 00000000000000..819f592b5e4d86
--- /dev/null
+++ b/.github/workflows/precommit-windows.yml
@@ -0,0 +1,61 @@
+name: "Windows Precommit Tests"
+
+permissions:
+ contents: read
+
+on:
+ pull_request:
+ branches:
+ - main
+
+concurrency:
+ # Skip intermediate builds: always.
+ # Cancel intermediate builds: only if it is a pull request build.
+ group: ${{ github.workflow }}-${{ github.ref }}
+ cancel-in-progress: ${{ startsWith(github.ref, 'refs/pull/') }}
+
+jobs:
+ build-llvm-linux:
+ name: "Build and test LLVM (Windows)"
+ runs-on: windows-2022
+ steps:
+ - name: Setup Windows
+ uses: llvm/actions/setup-windows@main
+ with:
+ arch: amd64
+ - name: Fetch LLVM sources
+ uses: actions/checkout@v4
+ with:
+ fetch-depth: 2
+ - name: Setup ccache
+ uses: hendrikmuhs/ccache-action@v1
+ with:
+ max-size: 500M
+ variant: sccache
+ key: precommit-windows
+ - name: Compute projects to test
+ id: compute-projects
+ uses: ./.github/workflows/compute-projects-to-test
+ with:
+ projects: clang llvm lld
+
+ - name: Configure LLVM
+ shell: bash
+ if: ${{ steps.compute-projects.outputs.check-targets }}
+ run: |
+ cmake -B build -GNinja \
+ -DCMAKE_BUILD_TYPE=Release \
+ -DLLVM_ENABLE_PROJECTS="${{ steps.compute-projects.outputs.projects }}" \
+ -DCMAKE_C_COMPILER_LAUNCHER=sccache \
+ -DCMAKE_CXX_COMPILER_LAUNCHER=sccache \
+ -DLLVM_ENABLE_ASSERTIONS=ON \
+ -DLLVM_LIT_ARGS="-v --no-progress-bar" \
+ -S llvm
+ - name: Build LLVM
+ if: ${{ steps.compute-projects.outputs.check-targets }}
+ run: |
+ ninja -C build test-depends
+ - name: Check LLVM
+ if: ${{ steps.compute-projects.outputs.check-targets }}
+ run: |
+ ninja -C build "${{ steps.compute-projects.outputs.check-targets }}"
|
I'm still testing to see if it's feasible to add more sub-projects. |
d854cfb
to
3093657
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlike buildkite, which tested everything, this new workflow only tests clang, llvm, and lld due to resource constraints on the GitHub runners.
I do have very strong concerns about this: it does not seem acceptable to me to make steps in a direction where we're reduce the previous support and make it seems that there are two classes of support in the project between subprojects.
The current version of this workflow isn't meant to replace buildkite, I see it as a backup plan until either the existing Windows job on buildkite is fixed or we get more powerful Windows runners on GitHub. I'm still testing to see how many sub-projects we can realistically enable. |
Thank you for working on this Tom!
It has been mentioned multiple times that there are technical limitations that need to be taken into account. I'd prefer myself if powerful Windows machines on BuildKite that have capacity for every project have been working properly, but it's not the case. |
Please point me to the "multiple times" where this has been detailed, I think I followed the discussion, yet I don't know what you mean.
I don't quite see this as answering my concerns right now. |
Tom pointed out to this almost two weeks ago in https://discourse.llvm.org/t/rfc-future-of-windows-pre-commit-ci/76840/4
Back then you also suggested that we can "just enable" all projects, and I agreed, as long as we "don’t spend another two weeks on this". Here we are today, with Windows CI completely disabled, and you still pushing against at least some kind of replacement. |
elif [ "${RUNNER_OS}" = "Windows" ]; then | ||
projects_to_test=$(exclude-windows $(compute-projects-to-test ${modified_projects})) | ||
else | ||
echo "Unknown runner OS: $RUNNER_OS" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should exit with an error here:
echo "Unknown runner OS: $RUNNER_OS" | |
echo "Unknown runner OS: $RUNNER_OS" | |
exit 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping
I ran this job with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth noting that there are a few variables here:
- Which path in the repo this triggers on
- The mapping from a path to a given project
- The dependent projects that are kicked in based on the affected projects from the path mapping.
That is: minimally speaking it is very possible to enable this for every single project, while turning of the dependent projects calculation. For example a PR touching lld
should only build and test lld
: this should be cheap overall.
Your problem (as far as I understand) is rather: "when project X is touched: how many other dependent project can we bundle here" ; when X is llvm
it becomes very large but when X is MLIR (for example) then it is just adding Flang on top of MLIR.
If the above is correct, then a good starting point could be to start with a very limited set of dependent projects, based on the coupling ("how much are tests of project Y likely to provide extra coverage to project X?"). LLVM can break potentially all the other projects in theory, but probably not with the same likeliness.
If the goal is to re-establish some amount of Windows coverage ASAP, then the most straightforward way to me seems like: "test only the project that is touched by a PR". From there we can observe breakages and increate the number of dependent projects based on actually impact.
fi | ||
|
||
# List of files affected by this commit | ||
: ${MODIFIED_FILES:=$(git diff --name-only HEAD~1...HEAD)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this work, when a branch contains multiple commits (e.g. an original one and multiple corrections/tweaks) - essentially we should check the diff of the whole branch, right?
But when doing that, we also need to make sure that we ignore any changes from merge commits, like in llvm/llvm-admin#8, otherwise we will essentially end up testing every single subproject as soon as an MR contains a merge commit. IIRC git diff <base>...HEAD
is the syntax for that, but it's worth testing and looking into.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The HEAD commit for a PR is a merge of the PR branch into main. It's not the HEAD for the PR target branch. In my testing using HEAD~1...HEAD worked for multiple commits. I did not test the case where the PR branch had a merge commit in it already. I can test git diff <base>...HEAD
More Benchmarks:
|
Going beyond 5 hours by design sounds risky to me, as I don't expect GitHub runners to have a very consistent performance. On top of that, we also should accommodate growth of the projects. As https://llvm-compile-time-tracker.com/graphs.php shows, over time it takes more effort to build Clang.
|
if [ "${RUNNER_OS}" = "Linux" ]; then | ||
projects_to_test=$(exclude-linux $(compute-projects-to-test ${modified_projects})) | ||
elif [ "${RUNNER_OS}" = "Windows" ]; then | ||
projects_to_test=$(exclude-windows $(compute-projects-to-test ${modified_projects})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
projects_to_test=$(exclude-windows $(compute-projects-to-test ${modified_projects})) | |
# Don't run dependent projects on Windows for now. | |
projects_to_test=$(exclude-windows ${modified_projects}) |
with: | ||
projects: clang llvm lld |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with: | |
projects: clang llvm lld |
- name: Build LLVM | ||
if: ${{ steps.compute-projects.outputs.check-targets }} | ||
run: | | ||
ninja -C build test-depends |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This step will overbuild: if a PR only touches clang and not LLVM, it should only run ninja check-clang
and not build the LLVM unit-tests and other things not needed by clang for example.
(same for the other projects)
- name: Build LLVM | |
if: ${{ steps.compute-projects.outputs.check-targets }} | |
run: | | |
ninja -C build test-depends |
- name: Check LLVM | ||
if: ${{ steps.compute-projects.outputs.check-targets }} | ||
run: | | ||
ninja -C build "${{ steps.compute-projects.outputs.check-targets }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ninja -C build "${{ steps.compute-projects.outputs.check-targets }}" | |
ninja -C build ${{ steps.compute-projects.outputs.check-targets }} |
Need to remove the quotes here because there can be multiple targets
* Split the build into 3 jobs to avoid timeouts. * Maintiain an scacche cache that is unique for each PR.
I've done a major rework of this with the following changes:
I'm still debugging and addressing some of the review comments, but I think this new approach will be better. |
key: precommit-windows | ||
|
||
- name: Restore sccache from previous PR run | ||
uses: ./.github/workflows/pr-sccache-restore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a cache for the initial run? If not that seems like quite a pessimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a global cache that is shared by all PRs which is used for the first run. However, given how many PRs there and the limits on the size of the cache. I think the benefit from the global cache may be small.
if echo "${{ needs.compute-projects.outputs.check-targets }}" | grep clang; then | ||
targets="$targets clang" | ||
fi | ||
ninja -C build $targets |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow this step's logic here? Is llc a universal dependency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to pick a target that would build most of llvm/, that seemed like the best one to use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This just seems like a bad setup to me: why would we build all of llvm when we don’t necessarily need it?
shell: bash | ||
run: | | ||
ls | ||
ninja -C build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not make sense: why would we build it all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a better build target to use here to build "everything else"? We had test-depends
before, but that was building too much (e.g. llvm unit tests, when we only care about check-flang). Maybe we need to add -DLLVM_INCLUDE_TESTS=OFF
when configuring so they don't get built when not needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by “everything else”?
Why does this step exists in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason we have this step is to work around the GitHub timeouts. If we split the build into multiple jobs, then each job has its own 6 hour timeout and it's possible to fully replicate what we have with buildkite. Currently, this patch splits the build into 3 parts:
- Build LLVM (and clang if necessary).
ninja llc clang
- Build the rest of the enabled projects and runtimes (i.e "everything else"):
ninja
- Run the tests.
ninja $CHECK_TARGETS
We can split up the jobs anyway want, but I think we need to have at least two.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think about this more, I think I have a better idea for working around the timeouts. I'm going to try something else.
tar --zstd -xf llvm-project.tar.zst | ||
rm llvm-project.tar.zst | ||
|
||
- name: Check LLVM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- name: Check LLVM | |
- name: Check LLVM |
This checks more than LLVM, doesn't it?
This simplifies the workflow a lot and will enable us to support builds that take more than 6 hours.
I re-worked the job now so that it automatically saves the build after 5 hours and 30 minutes and then continues it in a new job to avoid the GitHub timeouts. This simplifies the workflow a lot, and let's us just run I did some benchmarking and for full build of all the projects and running all the check targets (i.e. what happens if you make a change in llvm/) it takes 8 Hr 10 Minutes with no cache and 1Hr 45 Minutes with 100% cache usage. There is a global sccache and also a local one for each PR. I would expect in most cases, the first run for a PR will take the max time (8Hr 10M), and subsequent runs will take about 2 Hrs. This is for PRs that touch llvm/ PRs that tough other directories will take less time to test. |
I believe I've addressed all the review comments now. I'm not sure what the current status is on Buildkite and if there are still issues with long queue times, but this job is ready to be used as a drop-in replacement if needed. |
Is there a way to deprioritize these jobs relative to others (like the issue comments/PR subscribers jobs) so that the quick running jobs can run in between? Also, it seems like the total time it takes for this workflow for a change to a project like LLVM will be many hours? I don't think that's going to be feasible given our 60 runner limit. When I last did the math, we would be reasonably comfortable with a turn-around time of 1-hour, and could maybe push a couple hours, but six hours from my previous estimates would definitely run into throughput limitations. Those numbers are somewhat old, but if anything, usage has gone up from what I've seen. |
Shall we actually setup a proof of concept custom GCP Windows runner? Then we can just invite Google to actually migrate the buildkite runners to GitHub instead. I seem to recall Microsoft ready to provide some machine as well? (@AaronBallman knows maybe?). |
Migrating the current infrastructure over to Github Actions is already on Google's roadmap (@gchatelet @Keenuts), but not in the short-term from what I understand. I remember Aaron saying that Intel would be willing to provide machines, but if I'm remembering that correctly, it was just hardware that would still need to be plugged in somewhere, and I don't believe anyone is willing to take in hardware and run it currently. |
I have two IceLake servers that are in the process of being provisioned at Intel (the servers are available, folks are getting to the point of needing to install an OS and configure the machines). As I understand things, Intel can host the servers but we need someone from community to actually set up and maintain the CI bits on them. It's somewhat unclear as to who would be responsible for things like updating the OS, but I figure details like that can be worked out once we have someone from the LLVM community who is willing to be the point of contact for getting CI working on the servers. I'm happy to be involved in the process, but I don't know enough about CI or have enough bandwidth to own the servers myself; is someone willing to be that point of contact? |
@AaronBallman If you want to use them as GitHub Action self-hosted runners, I can help set that up. |
@lnihlen in case you missed this PR. |
Thanks! FWIW, we're willing to use them in whatever way the community will get the most value from, so we can be somewhat flexible on how to move forward. |
This reuses most of the generate-buildkite-pipeline-premerge script which determines which projects to build and which check targets to use. The workflow file is also based on the linux workflow from PR#80951.
Unlike buildkite, which tested everything, this new workflow only tests clang, llvm, and lld due to resource constraints on the GitHub runners.