[CI] Fix ccache cache restoration to improve build times #5202
Changes from all commits
ac9adea
97a544b
c49ef99
5c54bcf
77e5969
ead0add
6248b4b
060bae0
9fe6699
a0e2544
1c63625
3b745cf
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -21,10 +21,12 @@ concurrency: | |
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }} | ||
permissions: read-all | ||
env: | ||
TRITON_BUILD_WITH_CCACHE: "true" | ||
TRITON_BUILD_WITH_CLANG_LLD: "TRUE" | ||
TRITON_USE_ASSERT_ENABLED_LLVM: "TRUE" | ||
TRITON_DISABLE_LINE_INFO: 1 | ||
PROTON_SKIP_PC_SAMPLING_TEST: 1 | ||
CCACHE_COMPRESS: "true" | ||
jobs: | ||
Runner-Preparation: | ||
runs-on: ubuntu-latest | ||
|
@@ -154,6 +156,8 @@ jobs: | |
strategy: | ||
matrix: | ||
runner: ${{fromJson(needs.Runner-Preparation.outputs.matrix-CUDA)}} | ||
env: | ||
RUNNER_TYPE: ${{ matrix.runner[0] }} | ||
steps: | ||
- name: Checkout | ||
uses: actions/checkout@v4 | ||
|
@@ -199,22 +203,30 @@ jobs: | |
# "restore" step. This is to prevent the caches from accumulating stale | ||
# files over time. | ||
name: Restore cache of ccache and Triton compilation artifacts | ||
if: github.event_name != 'push' | ||
id: restore-build-cache | ||
if: github.ref != 'refs/heads/main' | ||
uses: actions/cache/restore@v4 | ||
with: | ||
path: | | ||
~/.triton/cache | ||
~/.cache/ccache | ||
~/.ccache | ||
# Restore the most recent cache entry. | ||
restore-keys: triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ runner.name }}-llvm-${{ steps.cache-key.outputs.llvm }}- | ||
restore-keys: | | ||
triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ env.RUNNER_TYPE }}-llvm-${{ steps.cache-key.outputs.llvm }}- | ||
triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ env.RUNNER_TYPE }}- | ||
# We expect this cache key never to hit and for us to fall back | ||
# unconditionally to the restore-key, so it doesn't actually matter | ||
# what we put here (so long as it doesn't hit an existing key). | ||
key: triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ runner.name }}-llvm-${{ steps.cache-key.outputs.llvm }}-${{ steps.cache-key.outputs.datetime }} | ||
- name: Inspect cache directory | ||
key: triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ env.RUNNER_TYPE }}-llvm-${{ steps.cache-key.outputs.llvm }}-${{ steps.cache-key.outputs.datetime }} | ||
Review comment: Changing |
||
- name: Inspect cache directories | ||
run: | | ||
mkdir -p ~/.triton | ||
ls -alh ~/.triton | ||
du -sh ~/.triton/** | ||
|
||
mkdir -p ~/.ccache | ||
Review comment: The default cache directory if
Review comment: You meant other platforms use
Review comment: I'm referring to the ccache directory. Linux defaults to
Review comment: I'm confused, you're using
Review comment: Yes, ccache has multiple defaults which have an order of precedence.
Review comment: OK, I get it now. Thanks for the reference |
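For readers following the thread above, here is a rough shell sketch of the precedence being discussed. It is an assumption based on my reading of the ccache 4.x manual, not something stated in this PR: `CCACHE_DIR` wins, then the legacy `~/.ccache` directory if it already exists, then `$XDG_CACHE_HOME/ccache`, then `~/.cache/ccache`. The function name `resolve_ccache_dir` is hypothetical, and the sketch ignores a `cache_dir` set in a ccache config file as well as the macOS-specific default.

```shell
# Hypothetical helper: approximates how ccache 4.x picks its cache
# directory on Linux. Assumption: precedence as described in the
# ccache manual; config-file overrides are deliberately not handled.
resolve_ccache_dir() {
  if [ -n "${CCACHE_DIR:-}" ]; then
    # An explicit environment override takes priority.
    echo "$CCACHE_DIR"
  elif [ -d "$HOME/.ccache" ]; then
    # The legacy directory is used whenever it already exists.
    echo "$HOME/.ccache"
  elif [ -n "${XDG_CACHE_HOME:-}" ]; then
    echo "$XDG_CACHE_HOME/ccache"
  else
    echo "$HOME/.cache/ccache"
  fi
}

echo "ccache would likely use: $(resolve_ccache_dir)"
```

If that reading is right, the `mkdir -p ~/.ccache` in the inspect step also has the side effect of making `~/.ccache` the directory ccache selects, which matches the path the cache steps restore and save.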
||
ls -alh ~/.ccache | ||
du -sh ~/.ccache | ||
- name: Update PATH | ||
run: | | ||
echo "$HOME/.local/bin" >> $GITHUB_PATH | ||
|
@@ -224,12 +236,14 @@ jobs: | |
python3 -m pip install cython setuptools wheel cmake==3.24 ninja pytest-forked pytest-xdist lit | ||
- name: Install Triton | ||
env: | ||
TRITON_BUILD_WITH_CCACHE: "true" | ||
CUDA_HOME: "/usr/local/cuda" | ||
run: | | ||
echo "PATH is '$PATH'" | ||
cd python | ||
python3 -m pip install '.[tests]' | ||
ccache --zero-stats | ||
python3 -m pip install -v '.[tests]' | ||
- name: CCache Stats | ||
run: ccache --print-stats | ||
Review comment: Just some debug output to confirm we're getting cache hits. |
||
- name: Run lit tests | ||
run: | | ||
cd python | ||
|
@@ -278,6 +292,15 @@ jobs: | |
cd third_party/proton/test | ||
python3 -m pytest -s . | ||
cd .. | ||
- name: Inspect cache directories | ||
run: | | ||
mkdir -p ~/.triton | ||
ls -alh ~/.triton | ||
du -sh ~/.triton/** | ||
|
||
mkdir -p ~/.ccache | ||
ls -alh ~/.ccache | ||
du -sh ~/.ccache | ||
- # If we're on branch `main`, save the ccache Triton compilation artifacts | ||
# to the cache so they can be used by other (non-main) CI runs. | ||
# | ||
|
@@ -287,22 +310,17 @@ jobs: | |
if: github.ref == 'refs/heads/main' | ||
uses: actions/cache/save@v4 | ||
with: | ||
path: ~/.triton/cache ~/.cache/ccache | ||
key: triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ runner.name }}-llvm-${{ steps.cache-key.outputs.llvm }}-${{ steps.cache-key.outputs.datetime }} | ||
- name: Inspect cache directories | ||
run: | | ||
mkdir -p ~/.triton | ||
ls -alh ~/.triton | ||
du -sh ~/.triton/** | ||
|
||
mkdir -p ~/.cache/ccache | ||
ls -alh ~/.cache/ccache | ||
du -sh ~/.cache/ccache | ||
path: | | ||
~/.triton/cache | ||
~/.ccache | ||
key: ${{ steps.restore-build-cache.outputs.cache-primary-key }} | ||
Review comment: This just saves you having to make sure the keys match, and enforces it programmatically. |
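As a minimal illustration of the comment above, the fragment below shows a save step reusing the primary key computed by an earlier restore step through its `cache-primary-key` output. The step names, the `example-` key prefix, and the single cached path are placeholders for illustration, not the workflow's real values.

```yaml
# Sketch only: names and keys here are hypothetical.
steps:
  - name: Restore build cache
    id: restore-build-cache
    uses: actions/cache/restore@v4
    with:
      path: ~/.ccache
      # Unique per run, so the lookup falls through to restore-keys.
      key: example-${{ runner.os }}-${{ github.run_id }}
      restore-keys: |
        example-${{ runner.os }}-
  - name: Build
    run: echo "build goes here"
  - name: Save build cache
    uses: actions/cache/save@v4
    with:
      path: ~/.ccache
      # Reuse the exact key string the restore step was given.
      key: ${{ steps.restore-build-cache.outputs.cache-primary-key }}
```

One caveat worth keeping in mind: a step that is skipped by its `if:` condition produces no outputs, so this pattern only yields a non-empty key in runs where the restore step actually executed.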
||
Integration-Tests-AMD: | ||
needs: Runner-Preparation | ||
if: needs.Runner-Preparation.outputs.matrix-HIP != '' | ||
runs-on: ${{ matrix.runner }} | ||
timeout-minutes: 30 | ||
env: | ||
RUNNER_TYPE: ${{ matrix.runner[1] }} | ||
strategy: | ||
matrix: | ||
runner: ${{fromJson(needs.Runner-Preparation.outputs.matrix-HIP)}} | ||
|
@@ -355,40 +373,55 @@ jobs: | |
# "restore" step. This is to prevent the caches from accumulating stale | ||
# files over time. | ||
name: Restore cache of ccache and Triton compilation artifacts | ||
if: github.event_name != 'push' | ||
id: restore-build-cache | ||
if: github.ref != 'refs/heads/main' | ||
uses: actions/cache/restore@v4 | ||
with: | ||
path: | | ||
~/.triton/cache | ||
~/.cache/ccache | ||
~/.ccache | ||
# Restore the most recent cache entry. | ||
restore-keys: triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ runner.name }}-llvm-${{ steps.cache-key.outputs.llvm }}- | ||
restore-keys: | | ||
triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ env.RUNNER_TYPE }}-llvm-${{ steps.cache-key.outputs.llvm }}- | ||
triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ env.RUNNER_TYPE }}- | ||
# We expect this cache key never to hit and for us to fall back | ||
# unconditionally to the restore-key, so it doesn't actually matter | ||
# what we put here (so long as it doesn't hit an existing key). | ||
key: triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ runner.name }}-llvm-${{ steps.cache-key.outputs.llvm }}-${{ steps.cache-key.outputs.datetime }} | ||
- name: Inspect cache directory | ||
key: triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ env.RUNNER_TYPE }}-llvm-${{ steps.cache-key.outputs.llvm }}-${{ steps.cache-key.outputs.datetime }} | ||
- name: Inspect cache directories | ||
run: | | ||
mkdir -p ~/.triton | ||
ls -alh ~/.triton | ||
du -sh ~/.triton/** | ||
|
||
mkdir -p ~/.ccache | ||
ls -alh ~/.ccache | ||
du -sh ~/.ccache | ||
- name: Update PATH | ||
run: | | ||
echo "/opt/rocm/llvm/bin" >> $GITHUB_PATH | ||
- name: Install pip dependencies | ||
run: | | ||
python3 -m pip install --upgrade pip | ||
python3 -m pip install lit | ||
- name: Install apt dependencies | ||
run: | | ||
apt update | ||
apt install ccache | ||
|
||
- name: Install Triton | ||
id: amd-install-triton | ||
run: | | ||
echo "PATH is '$PATH'" | ||
pip uninstall -y triton | ||
cd python | ||
ccache --zero-stats | ||
pip install -v -e '.[tests]' | ||
- name: Clean up after an unsuccessful build | ||
if: ${{ !success() && steps.amd-install-triton.outcome != 'success' }} | ||
run: | | ||
rm -rf ~/.triton | ||
- name: CCache Stats | ||
run: ccache --print-stats | ||
- name: Run lit tests | ||
run: | | ||
cd python | ||
|
@@ -431,6 +464,15 @@ jobs: | |
cd python | ||
cd "build/$(ls build | grep -i cmake)" | ||
ctest -j32 | ||
- name: Inspect cache directories | ||
run: | | ||
mkdir -p ~/.triton | ||
ls -alh ~/.triton | ||
du -sh ~/.triton/** | ||
|
||
mkdir -p ~/.ccache | ||
ls -alh ~/.ccache | ||
du -sh ~/.ccache | ||
- # If we're on branch `main`, save the ccache Triton compilation artifacts | ||
# to the cache so they can be used by other (non-main) CI runs. | ||
# | ||
|
@@ -440,17 +482,10 @@ jobs: | |
if: github.ref == 'refs/heads/main' | ||
uses: actions/cache/save@v4 | ||
with: | ||
path: ~/.triton/cache ~/.cache/ccache | ||
key: triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ runner.name }}-llvm-${{ steps.cache-key.outputs.llvm }}-${{ steps.cache-key.outputs.datetime }} | ||
- name: Inspect cache directories | ||
run: | | ||
mkdir -p ~/.triton | ||
ls -alh ~/.triton | ||
du -sh ~/.triton/** | ||
|
||
mkdir -p ~/.cache/ccache | ||
ls -alh ~/.cache/ccache | ||
du -sh ~/.cache/ccache | ||
path: | | ||
~/.triton/cache | ||
~/.ccache | ||
key: ${{ steps.restore-build-cache.outputs.cache-primary-key }} | ||
- name: Clean up caches | ||
run: | | ||
rm -rf ~/.triton/cache | ||
|
@@ -462,6 +497,8 @@ jobs: | |
strategy: | ||
matrix: | ||
runner: ${{fromJson(needs.Runner-Preparation.outputs.matrix-MACOS)}} | ||
env: | ||
RUNNER_TYPE: ${{ matrix.runner[0] }} | ||
steps: | ||
- name: Checkout | ||
uses: actions/checkout@v4 | ||
|
@@ -470,7 +507,7 @@ jobs: | |
- name: Install brew dependencies | ||
run: | | ||
brew update | ||
brew install ccache llvm@19 lld | ||
brew install ccache llvm@19 lld coreutils | ||
|
||
- name: Compute cache keys | ||
id: cache-key | ||
run: | | ||
|
@@ -511,22 +548,30 @@ jobs: | |
# "restore" step. This is to prevent the caches from accumulating stale | ||
# files over time. | ||
name: Restore cache of ccache and Triton compilation artifacts | ||
if: github.event_name != 'push' | ||
id: restore-build-cache | ||
if: github.ref != 'refs/heads/main' | ||
uses: actions/cache/restore@v4 | ||
with: | ||
path: | | ||
~/.triton/cache | ||
~/.cache/ccache | ||
~/.ccache | ||
# Restore the most recent cache entry. | ||
restore-keys: triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ runner.name }}-llvm-${{ steps.cache-key.outputs.llvm }}- | ||
restore-keys: | | ||
triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ env.RUNNER_TYPE }}-llvm-${{ steps.cache-key.outputs.llvm }}- | ||
triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ env.RUNNER_TYPE }}- | ||
# We expect this cache key never to hit and for us to fall back | ||
# unconditionally to the restore-key, so it doesn't actually matter | ||
# what we put here (so long as it doesn't hit an existing key). | ||
key: triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ runner.name }}-llvm-${{ steps.cache-key.outputs.llvm }}-${{ steps.cache-key.outputs.datetime }} | ||
- name: Inspect cache directory | ||
key: triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ env.RUNNER_TYPE }}-llvm-${{ steps.cache-key.outputs.llvm }}-${{ steps.cache-key.outputs.datetime }} | ||
- name: Inspect cache directories | ||
run: | | ||
mkdir -p ~/.triton | ||
ls -alh ~/.triton | ||
du -sh ~/.triton/** | ||
|
||
mkdir -p ~/.ccache | ||
ls -alh ~/.ccache | ||
du -sh ~/.ccache | ||
- name: Update PATH | ||
run: | | ||
echo "$HOME/.local/bin" >> $GITHUB_PATH | ||
|
@@ -539,7 +584,6 @@ jobs: | |
python3 -m pip install cython setuptools wheel cmake==3.24 ninja pytest-xdist lit pybind11 | ||
- name: Install Triton | ||
env: | ||
TRITON_BUILD_WITH_CCACHE: "true" | ||
TRITON_BUILD_WITH_O1: "true" | ||
# macos-latest has 3 vcpus and 7GB DRAM, to save memory we limit the number of jobs to 3 | ||
# https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories | ||
|
@@ -548,7 +592,19 @@ jobs: | |
source ~/.venv/bin/activate | ||
echo "PATH is '$PATH'" | ||
cd python | ||
python3 -m pip install --no-build-isolation . | ||
ccache --zero-stats | ||
python3 -m pip install -v --no-build-isolation . | ||
- name: CCache Stats | ||
run: ccache --print-stats | ||
- name: Inspect cache directories | ||
run: | | ||
mkdir -p ~/.triton | ||
ls -alh ~/.triton | ||
du -sh ~/.triton/** | ||
|
||
mkdir -p ~/.ccache | ||
ls -alh ~/.ccache | ||
du -sh ~/.ccache | ||
- # If we're on branch `main`, save the ccache Triton compilation artifacts | ||
# to the cache so they can be used by other (non-main) CI runs. | ||
# | ||
|
@@ -558,14 +614,7 @@ jobs: | |
if: github.ref == 'refs/heads/main' | ||
uses: actions/cache/save@v4 | ||
with: | ||
path: ~/.triton/cache ~/.cache/ccache | ||
key: triton-artifacts-${{ runner.os }}-${{ runner.arch }}-${{ runner.name }}-llvm-${{ steps.cache-key.outputs.llvm }}-${{ steps.cache-key.outputs.datetime }} | ||
- name: Inspect cache directories | ||
run: | | ||
mkdir -p ~/.triton | ||
ls -alh ~/.triton | ||
du -sh ~/.triton/** | ||
|
||
mkdir -p ~/.cache/ccache | ||
ls -alh ~/.cache/ccache | ||
du -sh ~/.cache/ccache | ||
path: | | ||
~/.triton/cache | ||
~/.ccache | ||
key: ${{ steps.restore-build-cache.outputs.cache-primary-key }} |
Review comment: Oh and this condition was disabling the cache from being restored on PRs