Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[workflows] Port buildkite Windows config to GitHub actions #82093

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

tstellar
Copy link
Collaborator

This reuses most of the generate-buildkite-pipeline-premerge script which determines which projects to build and which check targets to use. The workflow file is also based on the linux workflow from PR#80951.

Unlike buildkite, which tested everything, this new workflow only tests clang, llvm, and lld due to resource constraints on the GitHub runners.

This reuses most of the generate-buildkite-pipeline-premerge script
which determines which projects to build and which check targets to
use.

This new workflow only tests clang, llvm, and lld due to resource
contraints on the GitHub runners.
@llvmbot
Copy link
Member

llvmbot commented Feb 17, 2024

@llvm/pr-subscribers-github-workflow

Author: Tom Stellard (tstellar)

Changes

This reuses most of the generate-buildkite-pipeline-premerge script which determines which projects to build and which check targets to use. The workflow file is also based on the linux workflow from PR#80951.

Unlike buildkite, which tested everything, this new workflow only tests clang, llvm, and lld due to resource constraints on the GitHub runners.


Full diff: https://github.com/llvm/llvm-project/pull/82093.diff

3 Files Affected:

  • (added) .github/workflows/compute-projects-to-test/action.yml (+21)
  • (added) .github/workflows/compute-projects-to-test/compute-projects-to-test.sh (+220)
  • (added) .github/workflows/precommit-windows.yml (+61)
diff --git a/.github/workflows/compute-projects-to-test/action.yml b/.github/workflows/compute-projects-to-test/action.yml
new file mode 100644
index 00000000000000..37df06c8c301c5
--- /dev/null
+++ b/.github/workflows/compute-projects-to-test/action.yml
@@ -0,0 +1,21 @@
+name: 'Compute Projects To Test'
+inputs:
+  projects:
+    required: false
+    type: 'string'
+
+outputs:
+  check-targets:
+    description: "A space delimited list of check-targets to pass to ninja."
+    value: ${{ steps.compute-projects.outputs.check-targets }}
+
+  projects:
+    description: "A semi-colon delimited list of projects to pass to -DLLVM_ENABLE_PROJECTS."
+    value: ${{ steps.compute-projects.outputs.projects }}
+
+runs:
+  using: "composite"
+  steps:
+    - id: compute-projects
+      run: .github/workflows/compute-projects-to-test/compute-projects-to-test.sh ${{ inputs.projects }}
+      shell: bash
diff --git a/.github/workflows/compute-projects-to-test/compute-projects-to-test.sh b/.github/workflows/compute-projects-to-test/compute-projects-to-test.sh
new file mode 100755
index 00000000000000..807142668f618b
--- /dev/null
+++ b/.github/workflows/compute-projects-to-test/compute-projects-to-test.sh
@@ -0,0 +1,220 @@
+#!/usr/bin/env bash
+#===----------------------------------------------------------------------===##
+#
+# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+#
+#===----------------------------------------------------------------------===##
+
+#
+# This file generates a Buildkite pipeline that triggers the various CI jobs for
+# the LLVM project during pre-commit CI.
+#
+# See https://buildkite.com/docs/agent/v3/cli-pipeline#pipeline-format.
+#
+# As this outputs a yaml file, it's possible to log messages to stderr or
+# prefix with "#".
+
+
+set -eu
+set -o pipefail
+
+# Environment variables script works with:
+
+# Set by GitHub
+: ${GITHUB_OUTPUT:=}
+: ${RUNNER_OS:=}
+
+# Allow users to specify which projects to build.
+all_projects="bolt clang clang-tools-extra compiler-rt cross-project-tests flang libc libclc lld lldb llvm mlir openmp polly pstl"
+if [ "$#" -ne 0 ]; then
+  wanted_projects="${@}"
+else
+  wanted_projects="${all_projects}"
+fi
+
+# List of files affected by this commit
+: ${MODIFIED_FILES:=$(git diff --name-only HEAD~1...HEAD)}
+
+echo "Files modified:" >&2
+echo "$MODIFIED_FILES" >&2
+modified_dirs=$(echo "$MODIFIED_FILES" | cut -d'/' -f1 | sort -u)
+echo "Directories modified:" >&2
+echo "$modified_dirs" >&2
+echo "wanted_projects: $wanted_projects"
+
+function remove-unwanted-projects() {
+  projects=${@}
+  for project in ${projects}; do
+    if echo "$wanted_projects" | tr ' ' '\n' | grep -q -E "^${project}$"; then
+      echo "${project}"
+    fi
+  done
+}
+
+function compute-projects-to-test() {
+  projects=${@}
+  for project in ${projects}; do
+    echo "${project}"
+    case ${project} in
+    lld)
+      for p in bolt cross-project-tests; do
+        echo $p
+      done
+    ;;
+    llvm)
+      for p in bolt clang clang-tools-extra flang lld lldb mlir polly; do
+        echo $p
+      done
+    ;;
+    clang)
+      for p in clang-tools-extra compiler-rt flang libc lldb openmp cross-project-tests; do
+        echo $p
+      done
+    ;;
+    clang-tools-extra)
+      echo libc
+    ;;
+    mlir)
+      echo flang
+    ;;
+    *)
+      # Nothing to do
+    ;;
+    esac
+  done
+}
+
+function add-dependencies() {
+  projects=${@}
+  for project in ${projects}; do
+    echo "${project}"
+    case ${project} in
+    bolt)
+      for p in lld llvm; do
+        echo $p
+      done
+    ;;
+    cross-project-tests)
+      for p in lld clang; do
+        echo $p
+      done
+    ;;
+    clang-tools-extra)
+      for p in llvm clang; do
+        echo $p
+      done
+    ;;
+    compiler-rt|libc|openmp)
+      echo clang lld
+    ;;
+    flang|lldb)
+      for p in llvm clang; do
+        echo $p
+      done
+    ;;
+    lld|mlir|polly)
+      echo llvm
+    ;;
+    *)
+      # Nothing to do
+    ;;
+    esac
+  done
+}
+
+function exclude-linux() {
+  projects=${@}
+  for project in ${projects}; do
+    case ${project} in
+    cross-project-tests) ;; # tests failing
+    lldb)                ;; # tests failing
+    openmp)              ;; # https://github.com/google/llvm-premerge-checks/issues/410
+    *)
+      echo "${project}"
+    ;;
+    esac
+  done
+}
+
+function exclude-windows() {
+  projects=${@}
+  for project in ${projects}; do
+    case ${project} in
+    cross-project-tests) ;; # tests failing
+    compiler-rt)         ;; # tests taking too long
+    openmp)              ;; # TODO: having trouble with the Perl installation
+    libc)                ;; # no Windows support
+    lldb)                ;; # tests failing
+    bolt)                ;; # tests are not supported yet
+    *)
+      echo "${project}"
+    ;;
+    esac
+  done
+}
+
+# Prints only projects that are both present in $modified_dirs and the passed
+# list.
+function keep-modified-projects() {
+  projects=${@}
+  for project in ${projects}; do
+    if echo "$modified_dirs" | grep -q -E "^${project}$"; then
+      echo "${project}"
+    fi
+  done
+}
+
+function check-targets() {
+  projects=${@}
+  for project in ${projects}; do
+    case ${project} in
+    clang-tools-extra)
+      echo "check-clang-tools"
+    ;;
+    compiler-rt)
+      echo "check-all"
+    ;;
+    cross-project-tests)
+      echo "check-cross-project"
+    ;;
+    lldb)
+      echo "check-all" # TODO: check-lldb may not include all the LLDB tests?
+    ;;
+    pstl)
+      echo "check-all"
+    ;;
+    libclc)
+      echo "check-all"
+    ;;
+    *)
+      echo "check-${project}"
+    ;;
+    esac
+  done
+}
+
+# Generic pipeline for projects that have not defined custom steps.
+#
+# Individual projects should instead define the pre-commit CI tests that suits their
+# needs while letting them run on the infrastructure provided by LLVM.
+
+# Figure out which projects need to be built on each platform
+modified_projects="$(keep-modified-projects ${all_projects})"
+echo "modified_projects: $modified_projects"
+
+if [ "${RUNNER_OS}" = "Linux" ]; then
+  projects_to_test=$(exclude-linux $(compute-projects-to-test ${modified_projects}))
+elif [ "${RUNNER_OS}" = "Windows" ]; then
+  projects_to_test=$(exclude-windows $(compute-projects-to-test ${modified_projects}))
+else
+  echo "Unknown runner OS: $RUNNER_OS"
+fi
+check_targets=$(check-targets $(remove-unwanted-projects ${projects_to_test}) | sort | uniq)
+projects=$(remove-unwanted-projects $(add-dependencies ${projects_to_test}) | sort | uniq)
+
+echo "check-targets=$(echo ${check_targets} | tr ' ' ' ')" >> $GITHUB_OUTPUT
+echo "projects=$(echo ${projects} | tr ' ' ';')" >> $GITHUB_OUTPUT
+
+cat $GITHUB_OUTPUT
diff --git a/.github/workflows/precommit-windows.yml b/.github/workflows/precommit-windows.yml
new file mode 100644
index 00000000000000..819f592b5e4d86
--- /dev/null
+++ b/.github/workflows/precommit-windows.yml
@@ -0,0 +1,61 @@
+name: "Windows Precommit Tests"
+
+permissions:
+  contents: read
+
+on:
+  pull_request:
+    branches:
+      - main
+
+concurrency:
+  # Skip intermediate builds: always.
+  # Cancel intermediate builds: only if it is a pull request build.
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: ${{ startsWith(github.ref, 'refs/pull/') }}
+
+jobs:
+  build-llvm-linux:
+    name: "Build and test LLVM (Windows)"
+    runs-on: windows-2022
+    steps:
+      - name: Setup Windows
+        uses: llvm/actions/setup-windows@main
+        with:
+          arch: amd64
+      - name: Fetch LLVM sources
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 2
+      - name: Setup ccache
+        uses: hendrikmuhs/ccache-action@v1
+        with:
+          max-size: 500M
+          variant: sccache
+          key: precommit-windows
+      - name: Compute projects to test
+        id: compute-projects
+        uses: ./.github/workflows/compute-projects-to-test
+        with:
+          projects: clang llvm lld
+
+      - name: Configure LLVM
+        shell: bash
+        if: ${{ steps.compute-projects.outputs.check-targets }}
+        run: |
+          cmake -B build -GNinja \
+            -DCMAKE_BUILD_TYPE=Release \
+            -DLLVM_ENABLE_PROJECTS="${{ steps.compute-projects.outputs.projects }}" \
+            -DCMAKE_C_COMPILER_LAUNCHER=sccache \
+            -DCMAKE_CXX_COMPILER_LAUNCHER=sccache \
+            -DLLVM_ENABLE_ASSERTIONS=ON \
+            -DLLVM_LIT_ARGS="-v --no-progress-bar" \
+            -S llvm
+      - name: Build LLVM
+        if: ${{ steps.compute-projects.outputs.check-targets }}
+        run: |
+          ninja -C build test-depends
+      - name: Check LLVM
+        if: ${{ steps.compute-projects.outputs.check-targets }}
+        run: |
+          ninja -C build "${{ steps.compute-projects.outputs.check-targets }}"

@tstellar
Copy link
Collaborator Author

I'm still testing to see if it's feasible to add more sub-projects.

Copy link
Collaborator

@joker-eph joker-eph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike buildkite, which tested everything, this new workflow only tests clang, llvm, and lld due to resource constraints on the GitHub runners.

I do have very strong concerns about this: it does not seem acceptable to me to make steps in a direction where we're reduce the previous support and make it seems that there are two classes of support in the project between subprojects.

@tstellar
Copy link
Collaborator Author

I do have very strong concerns about this: it does not seem acceptable to me to make steps in a direction where we're reduce the previous support and make it seems that there are two classes of support in the project between subprojects.

The current version of this workflow isn't meant to replace buildkite, I see it as a backup plan until either the existing Windows job on buildkite is fixed or we get more powerful Windows runners on GitHub.

I'm still testing to see how many sub-projects we can realistically enable.

@Endilll
Copy link
Contributor

Endilll commented Feb 17, 2024

Thank you for working on this Tom!

I do have very strong concerns about this: it does not seem acceptable to me to make steps in a direction where we're reduce the previous support and make it seems that there are two classes of support in the project between subprojects.

It has been mentioned multiple times that there are technical limitations that need to be taken into account. I'd prefer myself if powerful Windows machines on BuildKite that have capacity for every project have been working properly, but it's not the case.

@joker-eph
Copy link
Collaborator

It has been mentioned multiple times that there are technical limitations that need to be taken into account.

Please point me to the "multiple times" where this has been detailed, I think I followed the discussion, yet I don't know what you mean.

I'd prefer myself if powerful Windows machines on BuildKite that have capacity for every project have been working properly, but it's not the case.

I don't quite see this as answering my concerns right now.

@Endilll
Copy link
Contributor

Endilll commented Feb 17, 2024

Please point me to the "multiple times" where this has been detailed, I think I followed the discussion, yet I don't know what you mean.

Tom pointed out to this almost two weeks ago in https://discourse.llvm.org/t/rfc-future-of-windows-pre-commit-ci/76840/4

We chose these projects for testing on the release branch, because at the time, these were the most the we could build without running into the disk space limitations or the 6 hour build timeout.

Back then you also suggested that we can "just enable" all projects, and I agreed, as long as we "don’t spend another two weeks on this". Here we are today, with Windows CI completely disabled, and you still pushing against at least some kind of replacement.

elif [ "${RUNNER_OS}" = "Windows" ]; then
projects_to_test=$(exclude-windows $(compute-projects-to-test ${modified_projects}))
else
echo "Unknown runner OS: $RUNNER_OS"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should exit with an error here:

Suggested change
echo "Unknown runner OS: $RUNNER_OS"
echo "Unknown runner OS: $RUNNER_OS"
exit 1

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

@tstellar
Copy link
Collaborator Author

I ran this job with -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;flang;lld;llvm;mlir;polly" which matches what buildkite does when there is a change made to llvm and just the build took 5h and 59m, so there was no time left to run the tests (GitHub timeout is 6 hours). I'm going to try to do some benchmarks with the other combinations of projects.

Copy link
Collaborator

@joker-eph joker-eph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth noting that there are a few variables here:

  • Which path in the repo this triggers on
  • The mapping from a path to a given project
  • The dependent projects that are kicked in based on the affected projects from the path mapping.

That is: minimally speaking it is very possible to enable this for every single project, while turning of the dependent projects calculation. For example a PR touching lld should only build and test lld: this should be cheap overall.

Your problem (as far as I understand) is rather: "when project X is touched: how many other dependent project can we bundle here" ; when X is llvm it becomes very large but when X is MLIR (for example) then it is just adding Flang on top of MLIR.

If the above is correct, then a good starting point could be to start with a very limited set of dependent projects, based on the coupling ("how much are tests of project Y likely to provide extra coverage to project X?"). LLVM can break potentially all the other projects in theory, but probably not with the same likeliness.

If the goal is to re-establish some amount of Windows coverage ASAP, then the most straightforward way to me seems like: "test only the project that is touched by a PR". From there we can observe breakages and increate the number of dependent projects based on actually impact.

fi

# List of files affected by this commit
: ${MODIFIED_FILES:=$(git diff --name-only HEAD~1...HEAD)}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work, when a branch contains multiple commits (e.g. an original one and multiple corrections/tweaks) - essentially we should check the diff of the whole branch, right?

But when doing that, we also need to make sure that we ignore any changes from merge commits, like in llvm/llvm-admin#8, otherwise we will essentially end up testing every single subproject as soon as an MR contains a merge commit. IIRC git diff <base>...HEAD is the syntax for that, but it's worth testing and looking into.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The HEAD commit for a PR is a merge of the PR branch into main. It's not the HEAD for the PR target branch. In my testing using HEAD~1...HEAD worked for multiple commits. I did not test the case where the PR branch had a merge commit in it already. I can test git diff <base>...HEAD

@tstellar
Copy link
Collaborator Author

I ran this job with -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;flang;lld;llvm;mlir;polly" which matches what buildkite does when there is a change made to llvm and just the build took 5h and 59m, so there was no time left to run the tests (GitHub timeout is 6 hours). I'm going to try to do some benchmarks with the other combinations of projects.

More Benchmarks:

Projects Build Test Total Result
clang;clang-tools-extra;flang;lld;llvm;mlir;polly 359m 1m 360m Timeout
clang;clang-tools-extra;lld;llvm;mlir;polly" 282m 72m 354m Pass
clang;clang-tools-extra;lld;llvm;mlir 271m 72m 343m Pass
clang;lld;llvm;mlir 234m 67m 301m Pass
clang;lld;llvm 183m 67m 250m Pass

@Endilll
Copy link
Contributor

Endilll commented Feb 19, 2024

Going beyond 5 hours by design sounds risky to me, as I don't expect GitHub runners to have a very consistent performance. On top of that, we also should accommodate growth of the projects. As https://llvm-compile-time-tracker.com/graphs.php shows, over time it takes more effort to build Clang.

What the testing environment was? Were you using Clang with PGO trained on building LLVM or Clang? Never mind, I realized we're speaking about Windows here.

if [ "${RUNNER_OS}" = "Linux" ]; then
projects_to_test=$(exclude-linux $(compute-projects-to-test ${modified_projects}))
elif [ "${RUNNER_OS}" = "Windows" ]; then
projects_to_test=$(exclude-windows $(compute-projects-to-test ${modified_projects}))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
projects_to_test=$(exclude-windows $(compute-projects-to-test ${modified_projects}))
# Don't run dependent projects on Windows for now.
projects_to_test=$(exclude-windows ${modified_projects})

Comment on lines 39 to 40
with:
projects: clang llvm lld
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
with:
projects: clang llvm lld

Comment on lines 54 to 57
- name: Build LLVM
if: ${{ steps.compute-projects.outputs.check-targets }}
run: |
ninja -C build test-depends
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This step will overbuild: if a PR only touches clang and not LLVM, it should only run ninja check-clang and not build the LLVM unit-tests and other things not needed by clang for example.
(same for the other projects)

Suggested change
- name: Build LLVM
if: ${{ steps.compute-projects.outputs.check-targets }}
run: |
ninja -C build test-depends

- name: Check LLVM
if: ${{ steps.compute-projects.outputs.check-targets }}
run: |
ninja -C build "${{ steps.compute-projects.outputs.check-targets }}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ninja -C build "${{ steps.compute-projects.outputs.check-targets }}"
ninja -C build ${{ steps.compute-projects.outputs.check-targets }}

Need to remove the quotes here because there can be multiple targets

* Split the build into 3 jobs to avoid timeouts.
* Maintiain an scacche cache that is unique for each PR.
@tstellar
Copy link
Collaborator Author

I've done a major rework of this with the following changes:

  • Split the build into 3 jobs to avoid timeouts.
  • Added a sccache cache the is specific to each PR. This should help make testing PR updates faster.

I'm still debugging and addressing some of the review comments, but I think this new approach will be better.

key: precommit-windows

- name: Restore sccache from previous PR run
uses: ./.github/workflows/pr-sccache-restore
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a cache for the initial run? If not that seems like quite a pessimization.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a global cache that is shared by all PRs which is used for the first run. However, given how many PRs there and the limits on the size of the cache. I think the benefit from the global cache may be small.

if echo "${{ needs.compute-projects.outputs.check-targets }}" | grep clang; then
targets="$targets clang"
fi
ninja -C build $targets
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow this step's logic here? Is llc a universal dependency?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to pick a target that would build most of llvm/, that seemed like the best one to use.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just seems like a bad setup to me: why would we build all of llvm when we don’t necessarily need it?

shell: bash
run: |
ls
ninja -C build
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not make sense: why would we build it all?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a better build target to use here to build "everything else"? We had test-depends before, but that was building too much (e.g. llvm unit tests, when we only care about check-flang). Maybe we need to add -DLLVM_INCLUDE_TESTS=OFF when configuring so they don't get built when not needed?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by “everything else”?
Why does this step exists in the first place?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason we have this step is to work around the GitHub timeouts. If we split the build into multiple jobs, then each job has its own 6 hour timeout and it's possible to fully replicate what we have with buildkite. Currently, this patch splits the build into 3 parts:

  1. Build LLVM (and clang if necessary). ninja llc clang
  2. Build the rest of the enabled projects and runtimes (i.e "everything else"): ninja
  3. Run the tests. ninja $CHECK_TARGETS

We can split up the jobs anyway want, but I think we need to have at least two.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think about this more, I think I have a better idea for working around the timeouts. I'm going to try something else.

tar --zstd -xf llvm-project.tar.zst
rm llvm-project.tar.zst

- name: Check LLVM
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- name: Check LLVM
- name: Check LLVM

This checks more than LLVM, doesn't it?

This simplifies the workflow a lot and will enable us to support builds
that take more than 6 hours.
@tstellar
Copy link
Collaborator Author

tstellar commented May 2, 2024

I re-worked the job now so that it automatically saves the build after 5 hours and 30 minutes and then continues it in a new job to avoid the GitHub timeouts. This simplifies the workflow a lot, and let's us just run ninja $check-targets directly with no intermediate steps. This job is now more or less equivalent to what is run on the buildkite builders.

I did some benchmarking and for full build of all the projects and running all the check targets (i.e. what happens if you make a change in llvm/) it takes 8 Hr 10 Minutes with no cache and 1Hr 45 Minutes with 100% cache usage. There is a global sccache and also a local one for each PR. I would expect in most cases, the first run for a PR will take the max time (8Hr 10M), and subsequent runs will take about 2 Hrs. This is for PRs that touch llvm/ PRs that tough other directories will take less time to test.

@tstellar
Copy link
Collaborator Author

tstellar commented May 4, 2024

I believe I've addressed all the review comments now. I'm not sure what the current status is on Buildkite and if there are still issues with long queue times, but this job is ready to be used as a drop-in replacement if needed.

@boomanaiden154
Copy link
Contributor

Is there a way to deprioritize these jobs relative to others (like the issue comments/PR subscribers jobs) so that the quick running jobs can run in between?

Also, it seems like the total time it takes for this workflow for a change to a project like LLVM will be many hours? I don't think that's going to be feasible given our 60 runner limit. When I last did the math, we would be reasonably comfortable with a turn-around time of 1-hour, and could maybe push a couple hours, but six hours from my previous estimates would definitely run into throughput limitations. Those numbers are somewhat old, but if anything, usage has gone up from what I've seen.

@joker-eph
Copy link
Collaborator

Shall we actually setup a proof of concept custom GCP Windows runner? Then we can just invite Google to actually migrate the buildkite runners to GitHub instead. I seem to recall Microsoft ready to provide some machine as well? (@AaronBallman knows maybe?).
Having more beefy machines (+ having sizable local persistent sccache as well). may just be what we need here.

@boomanaiden154
Copy link
Contributor

Shall we actually setup a proof of concept custom GCP Windows runner? Then we can just invite Google to actually migrate the buildkite runners to GitHub instead. I seem to recall Microsoft ready to provide some machine as well? (@AaronBallman knows maybe?).
Having more beefy machines (+ having sizable local persistent sccache as well). may just be what we need here.

Migrating the current infrastructure over to Github Actions is already on Google's roadmap (@gchatelet @Keenuts), but not in the short-term from what I understand.

I remember Aaron saying that Intel would be willing to provide machines, but if I'm remembering that correctly, it was just hardware that would still need to be plugged in somewhere, and I don't believe anyone is willing to take in hardware and run it currently.

@AaronBallman
Copy link
Collaborator

Shall we actually setup a proof of concept custom GCP Windows runner? Then we can just invite Google to actually migrate the buildkite runners to GitHub instead. I seem to recall Microsoft ready to provide some machine as well? (@AaronBallman knows maybe?). Having more beefy machines (+ having sizable local persistent sccache as well). may just be what we need here.

I have two IceLake servers that are in the process of being provisioned at Intel (the servers are available, folks are getting to the point of needing to install an OS and configure the machines). As I understand things, Intel can host the servers but we need someone from community to actually set up and maintain the CI bits on them. It's somewhat unclear as to who would be responsible for things like updating the OS, but I figure details like that can be worked out once we have someone from the LLVM community who is willing to be the point of contact for getting CI working on the servers. I'm happy to be involved in the process, but I don't know enough about CI or have enough bandwidth to own the servers myself; is someone willing to be that point of contact?

@tstellar
Copy link
Collaborator Author

tstellar commented May 7, 2024

@AaronBallman If you want to use them as GitHub Action self-hosted runners, I can help set that up.

@Keenuts
Copy link
Contributor

Keenuts commented May 7, 2024

Shall we actually setup a proof of concept custom GCP Windows runner? Then we can just invite Google to actually migrate the buildkite runners to GitHub instead. I seem to recall Microsoft ready to provide some machine as well? (@AaronBallman knows maybe?).
Having more beefy machines (+ having sizable local persistent sccache as well). may just be what we need here.

Migrating the current infrastructure over to Github Actions is already on Google's roadmap (@gchatelet @Keenuts), but not in the short-term from what I understand.

I remember Aaron saying that Intel would be willing to provide machines, but if I'm remembering that correctly, it was just hardware that would still need to be plugged in somewhere, and I don't believe anyone is willing to take in hardware and run it currently.

@lnihlen in case you missed this PR.
Guillaume and I are looking into how to move the old llvm presubmit to github actions. Right now, we are evaluating the capacity we use/require, and are still at the early stages of this process.

@AaronBallman
Copy link
Collaborator

@AaronBallman If you want to use them as GitHub Action self-hosted runners, I can help set that up.

Thanks! FWIW, we're willing to use them in whatever way the community will get the most value from, so we can be somewhat flexible on how to move forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants