Commit

Refs/heads/eliasj42/llama runner migration (#938)
Moving llama workflows to MI300X machines

Moved PkgCI - shark-ai (this workflow was already on the OSSCI cluster
but was using an outdated runner name), CI - sharktank perplexity short,
CI - sharktank perplexity, and Llama Benchmarking 8B Tests to the new
OSSCI cluster ARC runners on MI300X machines.
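All four renames follow the same pattern: a job pinned to an old self-hosted runner label is repointed at the new OSSCI ARC label. A minimal before/after sketch (the job name and step here are illustrative, not from the changed workflows):

```yaml
jobs:
  llama-quick-tests:            # illustrative job name
    # Before the migration, the job was pinned to the old label:
    # runs-on: llama-mi300x-1
    # After the migration, it targets the OSSCI cluster ARC runner:
    runs-on: linux-mi300-1gpu-ossci
    steps:
      - uses: actions/checkout@v4
      - run: echo "running on an MI300X ARC runner"
```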

---------

Signed-off-by: Elias Joseph <[email protected]>
Signed-off-by: root <[email protected]>
Co-authored-by: Elias Joseph <[email protected]>
Co-authored-by: root <[email protected]>
3 people authored Feb 8, 2025
1 parent 2c61420 commit 3ad85b3
Showing 4 changed files with 8 additions and 8 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci-llama-quick-tests.yaml
@@ -28,7 +28,7 @@ jobs:
matrix:
version: [3.11]
fail-fast: false
- runs-on: llama-mi300x-1
+ runs-on: linux-mi300-1gpu-ossci
defaults:
run:
shell: bash
8 changes: 4 additions & 4 deletions .github/workflows/ci_eval.yaml
@@ -28,7 +28,7 @@ jobs:
strategy:
matrix:
version: [3.11]
- runs-on: [llama-mi300x-3]
+ runs-on: [linux-mi300-1gpu-ossci]
fail-fast: false
runs-on: ${{matrix.runs-on}}
defaults:
@@ -65,7 +65,7 @@ jobs:
- name: Run perplexity test with IREE
run: |
source ${VENV_DIR}/bin/activate
- pytest -n 8 -v -s sharktank/tests/evaluate/perplexity_iree_test.py --run-nightly-llama-tests --bs=100 --iree-device=hip://0 --iree-hip-target=gfx942 --iree-hal-target-device=hip --llama3-8b-f16-model-path=/data/llama3.1/weights/8b/fp16/llama3.1_8b_fp16_instruct.irpa --llama3-8b-tokenizer-path=/data/llama3.1/weights/8b/fp16/tokenizer_config.json --html=out/llm/llama/perplexity/iree_perplexity/index.html --log-cli-level=INFO
+ pytest -n 8 -v -s sharktank/tests/evaluate/perplexity_iree_test.py --run-nightly-llama-tests --bs=100 --iree-device=hip://0 --iree-hip-target=gfx942 --iree-hal-target-device=hip --llama3-8b-f16-model-path=/shark-dev/data/llama3.1/weights/8b/fp16/llama3.1_8b_fp16_instruct.irpa --llama3-8b-tokenizer-path=/shark-dev/data/llama3.1/weights/8b/fp16/tokenizer_config.json --html=out/llm/llama/perplexity/iree_perplexity/index.html --log-cli-level=INFO
ls -lha ${{ github.workspace }}/perplexity_ci_artifacts
@@ -84,7 +84,7 @@ jobs:
strategy:
matrix:
version: [3.11]
- runs-on: [llama-mi300x-3]
+ runs-on: [linux-mi300-1gpu-ossci]
fail-fast: false
runs-on: ${{matrix.runs-on}}
defaults:
@@ -121,7 +121,7 @@ jobs:
- name: Run perplexity test with Torch
run: |
source ${VENV_DIR}/bin/activate
- pytest -n 8 -v -s sharktank/tests/evaluate/perplexity_torch_test.py --longrun --llama3-8b-f16-model-path=/data/llama3.1/weights/8b/fp16/llama3.1_8b_fp16_instruct.irpa --llama3-8b-tokenizer-path=/data/llama3.1/weights/8b/fp16/tokenizer_config.json --html=out/llm/llama/perplexity/torch_perplexity/index.html
+ pytest -n 8 -v -s sharktank/tests/evaluate/perplexity_torch_test.py --longrun --llama3-8b-f16-model-path=/shark-dev/data/llama3.1/weights/8b/fp16/llama3.1_8b_fp16_instruct.irpa --llama3-8b-tokenizer-path=/shark-dev/data/llama3.1/weights/8b/fp16/tokenizer_config.json --html=out/llm/llama/perplexity/torch_perplexity/index.html
- name: Deploy to GitHub Pages
uses: peaceiris/actions-gh-pages@4f9cc6602d3f66b9c108549d475ec49e8ef4d45e # v4.0.0
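A note on the `runs-on: ${{matrix.runs-on}}` indirection that ci_eval.yaml uses: the runner label lives in a single-element matrix list, so the job can later fan out across several runner types by extending that list, and the rename only has to touch one line. A sketch of the pattern (the job name is illustrative):

```yaml
jobs:
  test_perplexity_iree:                     # illustrative job name
    strategy:
      matrix:
        version: [3.11]
        runs-on: [linux-mi300-1gpu-ossci]   # add more labels here to fan out
      fail-fast: false
    runs-on: ${{matrix.runs-on}}            # each matrix entry picks its own runner
```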
4 changes: 2 additions & 2 deletions .github/workflows/ci_eval_short.yaml
@@ -27,7 +27,7 @@ jobs:
strategy:
matrix:
version: [3.11]
- runs-on: [llama-mi300x-3]
+ runs-on: [linux-mi300-1gpu-ossci]
fail-fast: false
runs-on: ${{matrix.runs-on}}
defaults:
@@ -64,5 +64,5 @@ jobs:
- name: Run perplexity test with vmfb
run: |
source ${VENV_DIR}/bin/activate
- pytest -n 8 -v -s sharktank/tests/evaluate/perplexity_iree_test.py --bs=5 --iree-device=hip://0 --iree-hip-target=gfx942 --iree-hal-target-device=hip --llama3-8b-f16-model-path=/data/llama3.1/weights/8b/fp16/llama3.1_8b_fp16_instruct.irpa --llama3-8b-tokenizer-path=/data/llama3.1/weights/8b/fp16/tokenizer_config.json --log-cli-level=INFO
+ pytest -n 8 -v -s sharktank/tests/evaluate/perplexity_iree_test.py --bs=5 --iree-device=hip://0 --iree-hip-target=gfx942 --iree-hal-target-device=hip --llama3-8b-f16-model-path=/shark-dev/data/llama3.1/weights/8b/fp16/llama3.1_8b_fp16_instruct.irpa --llama3-8b-tokenizer-path=/shark-dev/data/llama3.1/weights/8b/fp16/tokenizer_config.json --log-cli-level=INFO
ls -lha ${{ github.workspace }}/perplexity_ci_artifacts
2 changes: 1 addition & 1 deletion .github/workflows/pkgci_shark_ai.yml
@@ -32,7 +32,7 @@ jobs:
test_device: cpu
python-version: 3.11
- name: amdgpu_rocm_mi300_gfx942
- runs-on: linux-mi300-gpu-1
+ runs-on: linux-mi300-1gpu-ossci
test_device: gfx942
python-version: 3.11
# temporarily disable mi250 because the cluster is unstable & slow
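Unlike the other workflows, pkgci_shark_ai.yml builds its matrix from `include` entries, each pairing a `test_device` with a runner label, so only the gfx942 entry needed the new label. A hedged sketch of that shape (the CPU runner label below is an assumption, not from the diff):

```yaml
strategy:
  matrix:
    include:
      - name: cpu
        runs-on: ubuntu-24.04             # assumed hosted-runner label for the CPU entry
        test_device: cpu
        python-version: 3.11
      - name: amdgpu_rocm_mi300_gfx942
        runs-on: linux-mi300-1gpu-ossci   # the new OSSCI ARC runner label
        test_device: gfx942
        python-version: 3.11
```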
